.\" Copyright (C) 2005, 2013 Michael Kerrisk <mtk.manpages@gmail.com>
.\" a few fragments from an earlier (1996) version by
.\" Andries Brouwer (aeb@cwi.nl) remain.
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\" Rewritten old page, 960210, aeb@cwi.nl
.\" Updated, added strtok_r. 2000-02-13 Nicolás Lichtmaier <nick@debian.org>
.\" 2005-11-17, mtk: Substantial parts rewritten
.\" 2013-05-19, mtk: added much further detail on the operation of strtok()
.\"
.TH STRTOK 3 2021-03-22 "GNU" "Linux Programmer's Manual"
.SH NAME
strtok, strtok_r \- extract tokens from strings
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <string.h>
.PP
.BI "char *strtok(char *restrict " str ", const char *restrict " delim );
.BI "char *strtok_r(char *restrict " str ", const char *restrict " delim ,
.BI "               char **restrict " saveptr );
.fi
.PP
.RS -4
Feature Test Macro Requirements for glibc (see
.BR feature_test_macros (7)):
.RE
.PP
.BR strtok_r ():
.nf
    _POSIX_C_SOURCE
        || /* Glibc <= 2.19: */ _BSD_SOURCE || _SVID_SOURCE
.fi
.SH DESCRIPTION
The
.BR strtok ()
function breaks a string into a sequence of zero or more nonempty tokens.
On the first call to
.BR strtok (),
the string to be parsed should be
specified in
.IR str .
In each subsequent call that should parse the same string,
.I str
must be NULL.
.PP
The
.I delim
argument specifies a set of bytes that
delimit the tokens in the parsed string.
The caller may specify different strings in
.I delim
in successive
calls that parse the same string.
.PP
Each call to
.BR strtok ()
returns a pointer to a
null-terminated string containing the next token.
This string does not include the delimiting byte.
If no more tokens are found,
.BR strtok ()
returns NULL.
.PP
A sequence of calls to
.BR strtok ()
that operate on the same string maintains a pointer
that determines the point from which to start searching for the next token.
The first call to
.BR strtok ()
sets this pointer to point to the first byte of the string.
The start of the next token is determined by scanning forward
for the next nondelimiter byte in
.IR str .
If such a byte is found, it is taken as the start of the next token.
If no such byte is found,
then there are no more tokens, and
.BR strtok ()
returns NULL.
(A string that is empty or that contains only delimiters
will thus cause
.BR strtok ()
to return NULL on the first call.)
.PP
The end of each token is found by scanning forward until either
the next delimiter byte is found or until the
terminating null byte (\(aq\e0\(aq) is encountered.
If a delimiter byte is found, it is overwritten with
a null byte to terminate the current token, and
.BR strtok ()
saves a pointer to the following byte;
that pointer will be used as the starting point
when searching for the next token.
In this case,
.BR strtok ()
returns a pointer to the start of the found token.
.PP
From the above description,
it follows that a sequence of two or more contiguous delimiter bytes in
the parsed string is considered to be a single delimiter, and that
delimiter bytes at the start or end of the string are ignored.
Put another way: the tokens returned by
.BR strtok ()
are always nonempty strings.
Thus, for example, given the string "\fIaaa;;bbb,\fP",
successive calls to
.BR strtok ()
that specify the delimiter string "\fI;,\fP"
would return the strings "\fIaaa\fP" and "\fIbbb\fP",
and then a null pointer.
.PP
The
.BR strtok_r ()
function is a reentrant version of
.BR strtok ().
The
.I saveptr
argument is a pointer to a
.IR "char\ *"
variable that is used internally by
.BR strtok_r ()
in order to maintain context between successive calls that parse the
same string.
.PP
On the first call to
.BR strtok_r (),
.I str
should point to the string to be parsed, and the value of
.I *saveptr
is ignored (but see NOTES).
In subsequent calls,
.I str
should be NULL, and
.I saveptr
(and the buffer that it points to)
should be unchanged since the previous call.
.PP
Different strings may be parsed concurrently using sequences of calls to
.BR strtok_r ()
that specify different
.I saveptr
arguments.
.SH RETURN VALUE
The
.BR strtok ()
and
.BR strtok_r ()
functions return a pointer to
the next token, or NULL if there are no more tokens.
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
.ad l
.nh
.TS
allbox;
lbx lb lb
l l l.
Interface	Attribute	Value
T{
.BR strtok ()
T}	Thread safety	MT-Unsafe race:strtok
T{
.BR strtok_r ()
T}	Thread safety	MT-Safe
.TE
.hy
.ad
.sp 1
.SH CONFORMING TO
.TP
.BR strtok ()
POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
.TP
.BR strtok_r ()
POSIX.1-2001, POSIX.1-2008.
.SH NOTES
On some implementations,
.\" Tru64, according to its manual page
.I *saveptr
is required to be NULL on the first call to
.BR strtok_r ()
that is being used to parse
.IR str .
.SH BUGS
Be cautious when using these functions.
If you do use them, note that:
.IP * 2
These functions modify their first argument.
.IP *
These functions cannot be used on constant strings.
.IP *
The identity of the delimiting byte is lost.
.IP *
The
.BR strtok ()
function uses a static buffer while parsing, so it's not thread safe.
Use
.BR strtok_r ()
if this matters to you.
.SH EXAMPLES
The program below uses nested loops that employ
.BR strtok_r ()
to break a string into a two-level hierarchy of tokens.
The first command-line argument specifies the string to be parsed.
The second argument specifies the delimiter byte(s)
to be used to separate that string into "major" tokens.
The third argument specifies the delimiter byte(s)
to be used to separate the "major" tokens into subtokens.
.PP
An example of the output produced by this program is the following:
.PP
.in +4n
.EX
.RB "$" " ./a.out \(aqa/bbb///cc;xxx:yyy:\(aq \(aq:;\(aq \(aq/\(aq"
1: a/bbb///cc
         \-\-> a
         \-\-> bbb
         \-\-> cc
2: xxx
         \-\-> xxx
3: yyy
         \-\-> yyy
.EE
.in
.SS Program source
\&
.EX
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    char *str1, *str2, *token, *subtoken;
    char *saveptr1, *saveptr2;
    int j;

    if (argc != 4) {
        fprintf(stderr, "Usage: %s string delim subdelim\en",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    for (j = 1, str1 = argv[1]; ; j++, str1 = NULL) {
        token = strtok_r(str1, argv[2], &saveptr1);
        if (token == NULL)
            break;
        printf("%d: %s\en", j, token);

        for (str2 = token; ; str2 = NULL) {
            subtoken = strtok_r(str2, argv[3], &saveptr2);
            if (subtoken == NULL)
                break;
            printf("\et \-\-> %s\en", subtoken);
        }
    }

    exit(EXIT_SUCCESS);
}
.EE
.PP
Another example program using
.BR strtok ()
can be found in
.BR getaddrinfo_a (3).
.SH SEE ALSO
.BR index (3),
.BR memchr (3),
.BR rindex (3),
.BR strchr (3),
.BR string (3),
.BR strpbrk (3),
.BR strsep (3),
.BR strspn (3),
.BR strstr (3),
.BR wcstok (3)