Implement strtok in C


To implement the C function ‘strtok’, one doesn’t need to allocate extra memory for the input string before modifying it, since the caller function is supposed to make copy of the input string and deallocate the original input string. The function only needs to define static variables to store the starting of the string.

The caller function provides a string as the input during the first call, and ‘null’ for the following function calls. For details about how this function works, please refer to http://www.cplusplus.com/reference/cstring/strtok/.

char* sp = NULL; /* the start position of the string */

char* strtok1(char* str, const char* delimiters) {

	int i = 0;
	int len = strlen(delimiters);

	/* check in the delimiters */
	if(len == 0)
		printf("delimiters are empty\n");

	/* if the original string has nothing left */
	if(!str && !sp)
		return NULL;

	/* initialize the sp during the first call */
	if(str && !sp)
		sp = str;

	/* find the start of the substring, skip delimiters */
	char* p_start = sp;
	while(true) {
		for(i = 0; i < len; i ++) {
			if(*p_start == delimiters[i]) {
				p_start ++;
				break;
			}
		}

		if(i == len) {
		       sp = p_start;
		       break;
		}
	}

	/* return NULL if nothing left */
	if(*sp == '\0') {
		sp = NULL;
		return sp;
	}

	/* find the end of the substring, and
        replace the delimiter with null */
	while(*sp != '\0') {
		for(i = 0; i < len; i ++) {
			if(*sp == delimiters[i]) {
				*sp = '\0';
				break;
			}
		}

		sp ++;
		if (i < len)
			break;
	}

	return p_start;
}

This following is the test code from cplusplus.com.

#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok1(str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok1(NULL, " ,.-");
  }
  return 0;
}

————————-
As pointed out by Ian in the following comments [9.25.2014], there is a much simpler solution to this problem. Feel free to test it and let us know if it has any bugs.

char* strtok1(char *str, const char* delim) {
	static char* _buffer;
	if(str != NULL) _buffer = str;
	if(_buffer[0] == '\0') return NULL;

	char *ret = _buffer, *b;
	const char *d;

	for(b = _buffer; *b !='\0'; b++) {
		for(d = delim; *d != '\0'; d++) {
			if(*b == *d) {
				*b = '\0';
				_buffer = b+1;

				// skip the beginning delimiters
				if(b == ret) { 
					ret++; 
					continue; 
				}
				return ret;
			}
		}
	}

	return ret;
}

Advertisements

10 thoughts on “Implement strtok in C

  1. Ian

    I may be over simplifying this but it seems this approach is over complicated. Here is my strtok implementation. I have done no edge case analysis but has served me well for a few years.

    4 char* strtok(char *str, const char* delim) {
    5 static char* _buffer;
    6 if(str != NULL) _buffer = str;
    7 if(_buffer[0] == ”) return NULL;
    8
    9 char *ret = _buffer, *b;
    10 const char *d;
    11
    12 for(b = _buffer; *b != ”; b++) {
    13 for(d = delim; *d != ”; d++) {
    14 if(*b == *d) {
    15 *b = ”;
    16 _buffer = b+1;
    17 return ret;
    18 }
    19 }
    20 }
    21
    22 return ret;
    23 }

    Reply
    1. fli10 Post author

      Ian, thanks for sharing your solution. It looks very neat. But there is some difference when comparing your output and mine using the above mentioned test case. Your output is:
      Splitting string “- This, a sample string.” into tokens:

      This

      a
      sample
      string

      while mine is
      Splitting string “- This, a sample string.” into tokens:
      This
      a
      sample
      string

      You will immediately notice that your solution returns multiple ” sequences. Like you said, you didn’t deal with edge cases such as two consecutive delimiters or strings beginning with delimiters. Under some situations, maybe it is not a problem.

      I tested both functions on ‘http://www.compileonline.com/compile_cpp_online.php’ and replaced ” with ‘\ 0’.

      Reply
      1. Ian

        Great! Thank you for taking the time to share this with me. Looks like i will have to do a bit of tweaking on my solution.

  2. Pingback: Write a strtok function | juliachenonsoftware

  3. Ben

    The ‘much simpler solution’ has a problem when the string ends with a new line

    char str[] =”- This, a sample string.\n”;

    gets stuck in an infinite loop.

    Reply
  4. srivatsan

    Same infinite loop problem with the original solution when the string ends with one of the delimiters and/or has a series of delimiters before last token.

    May be there needs
    _buffer = b+1 before the second return as well.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s