]>
Commit | Line | Data |
---|---|---|
40a55d20 | 1 | @node Searching and Sorting, Pattern Matching, Message Translation, Top |
7a68c94a | 2 | @c %MENU% General searching and sorting functions |
f65fd747 | 3 | @chapter Searching and Sorting |
28f540f4 RM |
4 | |
5 | This chapter describes functions for searching and sorting arrays of | |
6 | arbitrary objects. You pass the appropriate comparison function to be | |
7 | applied as an argument, along with the size of the objects in the array | |
8 | and the total number of elements. | |
9 | ||
10 | @menu | |
11 | * Comparison Functions:: Defining how to compare two objects. | |
12 | Since the sort and search facilities | |
13 | are general, you have to specify the | |
f65fd747 | 14 | ordering. |
28f540f4 RM |
15 | * Array Search Function:: The @code{bsearch} function. |
16 | * Array Sort Function:: The @code{qsort} function. | |
17 | * Search/Sort Example:: An example program. | |
2604afb1 UD |
18 | * Hash Search Function:: The @code{hsearch} function. |
19 | * Tree Search Function:: The @code{tsearch} function. | |
28f540f4 RM |
20 | @end menu |
21 | ||
2604afb1 | 22 | @node Comparison Functions |
28f540f4 RM |
23 | @section Defining the Comparison Function |
24 | @cindex Comparison Function | |
25 | ||
26 | In order to use the sorted array library functions, you have to describe | |
27 | how to compare the elements of the array. | |
28 | ||
29 | To do this, you supply a comparison function to compare two elements of | |
30 | the array. The library will call this function, passing as arguments | |
31 | pointers to two array elements to be compared. Your comparison function | |
32 | should return a value the way @code{strcmp} (@pxref{String/Array | |
33 | Comparison}) does: negative if the first argument is ``less'' than the | |
34 | second, zero if they are ``equal'', and positive if the first argument | |
35 | is ``greater''. | |
36 | ||
37 | Here is an example of a comparison function which works with an array of | |
e7b90e6e | 38 | numbers of type @code{long int}: |
28f540f4 RM |
39 | |
40 | @smallexample | |
41 | int | |
e7b90e6e | 42 | compare_long_ints (const void *a, const void *b) |
28f540f4 | 43 | @{ |
e7b90e6e PE |
44 | const long int *la = a; |
45 | const long int *lb = b; | |
68ef28ed | 46 | |
e7b90e6e | 47 | return (*la > *lb) - (*la < *lb); |
28f540f4 RM |
48 | @} |
49 | @end smallexample | |
50 | ||
e7b90e6e PE |
51 | (The code would have to be more complicated for an array of @code{double}, |
52 | to handle NaNs correctly.) | |
53 | ||
28f540f4 RM |
54 | The header file @file{stdlib.h} defines a name for the data type of |
55 | comparison functions. This type is a GNU extension. | |
56 | ||
57 | @comment stdlib.h | |
58 | @comment GNU | |
59 | @tindex comparison_fn_t | |
60 | @smallexample | |
61 | int comparison_fn_t (const void *, const void *); | |
62 | @end smallexample | |
63 | ||
2604afb1 | 64 | @node Array Search Function |
28f540f4 RM |
65 | @section Array Search Function |
66 | @cindex search function (for arrays) | |
67 | @cindex binary search function (for arrays) | |
68 | @cindex array search function | |
69 | ||
2604afb1 | 70 | Generally searching for a specific element in an array means that |
1f77f049 | 71 | potentially all elements must be checked. @Theglibc{} contains |
2604afb1 UD |
72 | functions to perform linear search. The prototypes for the following |
73 | two functions can be found in @file{search.h}. | |
74 | ||
8ded91fb | 75 | @deftypefun {void *} lfind (const void *@var{key}, const void *@var{base}, size_t *@var{nmemb}, size_t @var{size}, comparison_fn_t @var{compar}) |
d08a7e4c | 76 | @standards{SVID, search.h} |
433c45a2 | 77 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
2604afb1 UD |
78 | The @code{lfind} function searches in the array with @code{*@var{nmemb}} |
79 | elements of @var{size} bytes pointed to by @var{base} for an element | |
80 | which matches the one pointed to by @var{key}. The function pointed to | |
8ed0d867 | 81 | by @var{compar} is used to decide whether two elements match. |
2604afb1 UD |
82 | |
83 | The return value is a pointer to the matching element in the array | |
84 | starting at @var{base} if it is found. If no matching element is | |
85 | available @code{NULL} is returned. | |
86 | ||
57581acd PE |
87 | The mean runtime of this function is proportional to @code{*@var{nmemb}/2}, |
88 | assuming random elements of the array are searched for. This | |
89 | function should be used only if elements often get added to or deleted from | |
2604afb1 UD |
90 | the array in which case it might not be useful to sort the array before |
91 | searching. | |
92 | @end deftypefun | |
93 | ||
2604afb1 | 94 | @deftypefun {void *} lsearch (const void *@var{key}, void *@var{base}, size_t *@var{nmemb}, size_t @var{size}, comparison_fn_t @var{compar}) |
d08a7e4c | 95 | @standards{SVID, search.h} |
433c45a2 AO |
96 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
97 | @c A signal handler that interrupted an insertion and performed an | |
98 | @c insertion itself would leave the array in a corrupt state (e.g. one | |
99 | @c new element initialized twice, with parts of both initializations | |
100 | @c prevailing, and another uninitialized element), but this is just a | |
101 | @c special case of races on user-controlled objects, that have to be | |
102 | @c avoided by users. | |
103 | ||
104 | @c In case of cancellation, we know the array won't be left in a corrupt | |
105 | @c state; the new element is initialized before the element count is | |
106 | @c incremented, and the compiler can't reorder these operations because | |
107 | @c it can't know that they don't alias. So, we'll either cancel after | |
108 | @c the increment and the initialization are both complete, or the | |
109 | @c increment won't have taken place, and so how far the initialization | |
110 | @c got doesn't matter. | |
2604afb1 UD |
111 | The @code{lsearch} function is similar to the @code{lfind} function. It |
112 | searches the given array for an element and returns it if found. The | |
113 | difference is that if no matching element is found the @code{lsearch} | |
114 | function adds the object pointed to by @var{key} (with a size of | |
115 | @var{size} bytes) at the end of the array and it increments the value of | |
116 | @code{*@var{nmemb}} to reflect this addition. | |
117 | ||
118 | This means for the caller that if it is not sure that the array contains | |
119 | the element one is searching for the memory allocated for the array | |
120 | starting at @var{base} must have room for at least @var{size} more | |
121 | bytes. If one is sure the element is in the array it is better to use | |
122 | @code{lfind} so having more room in the array is always necessary when | |
123 | calling @code{lsearch}. | |
124 | @end deftypefun | |
125 | ||
57581acd PE |
126 | To search a sorted or partially sorted array for an element matching the key, |
127 | use the @code{bsearch} function. The prototype for this function is in | |
28f540f4 RM |
128 | the header file @file{stdlib.h}. |
129 | @pindex stdlib.h | |
130 | ||
28f540f4 | 131 | @deftypefun {void *} bsearch (const void *@var{key}, const void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) |
d08a7e4c | 132 | @standards{ISO, stdlib.h} |
433c45a2 | 133 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
57581acd | 134 | The @code{bsearch} function searches @var{array} for an element |
28f540f4 | 135 | that is equivalent to @var{key}. The array contains @var{count} elements, |
f65fd747 | 136 | each of which is of size @var{size} bytes. |
28f540f4 RM |
137 | |
138 | The @var{compare} function is used to perform the comparison. This | |
57581acd PE |
139 | function is called with arguments that point to the key and to an |
140 | array element, in that order, and should return an | |
28f540f4 | 141 | integer less than, equal to, or greater than zero corresponding to |
57581acd PE |
142 | whether the key is considered less than, equal to, or greater than |
143 | the array element. The function should not alter the array's contents, | |
144 | and the same array element should always compare the same way with the key. | |
145 | ||
146 | Although the array need not be completely sorted, it should be | |
147 | partially sorted with respect to @var{key}. That is, the array should | |
148 | begin with elements that compare less than @var{key}, followed by | |
149 | elements that compare equal to @var{key}, and ending with elements | |
150 | that compare greater than @var{key}. Any or all of these element | |
151 | sequences can be empty. | |
152 | ||
153 | The return value is a pointer to a matching array element, or a null | |
28f540f4 RM |
154 | pointer if no match is found. If the array contains more than one element |
155 | that matches, the one that is returned is unspecified. | |
156 | ||
157 | This function derives its name from the fact that it is implemented | |
158 | using the binary search algorithm. | |
159 | @end deftypefun | |
160 | ||
2604afb1 | 161 | @node Array Sort Function |
28f540f4 RM |
162 | @section Array Sort Function |
163 | @cindex sort function (for arrays) | |
164 | @cindex quick sort function (for arrays) | |
165 | @cindex array sort function | |
166 | ||
167 | To sort an array using an arbitrary comparison function, use the | |
168 | @code{qsort} function. The prototype for this function is in | |
169 | @file{stdlib.h}. | |
170 | @pindex stdlib.h | |
171 | ||
28f540f4 | 172 | @deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) |
d08a7e4c | 173 | @standards{ISO, stdlib.h} |
709fbd3e | 174 | @safety{@prelim{}@mtsafe{}@assafe{}@acunsafe{@acucorrupt{}}} |
98fe149e AK |
175 | The @code{qsort} function sorts the array @var{array}. The array |
176 | contains @var{count} elements, each of which is of size @var{size}. | |
28f540f4 RM |
177 | |
178 | The @var{compare} function is used to perform the comparison on the | |
179 | array elements. This function is called with two pointer arguments and | |
180 | should return an integer less than, equal to, or greater than zero | |
181 | corresponding to whether its first argument is considered less than, | |
182 | equal to, or greater than its second argument. | |
57581acd PE |
183 | The function must not alter the array's contents, and must define a |
184 | total ordering on the array elements, including any unusual values | |
185 | such as floating-point NaN (@pxref{Infinity and NaN}). | |
186 | Because the sorting process can move elements, | |
187 | the function's return value must not depend on the element addresses | |
188 | or the relative positions of elements within the array, | |
189 | as these are meaningless while @code{qsort} is running. | |
28f540f4 RM |
190 | |
191 | @cindex stable sorting | |
57581acd | 192 | @strong{Warning:} If two elements compare equal, their order after |
28f540f4 RM |
193 | sorting is unpredictable. That is to say, the sorting is not stable. |
194 | This can make a difference when the comparison considers only part of | |
57581acd PE |
195 | the elements and two elements that compare equal may differ in other |
196 | respects. To ensure a stable sort in this situation, you can augment | |
197 | each element with an appropriate tie-breaking value, such as its | |
198 | original array index. | |
28f540f4 | 199 | |
e7b90e6e | 200 | Here is a simple example of sorting an array of @code{long int} in numerical |
28f540f4 RM |
201 | order, using the comparison function defined above (@pxref{Comparison |
202 | Functions}): | |
203 | ||
204 | @smallexample | |
205 | @{ | |
e7b90e6e PE |
206 | long int *array; |
207 | size_t nmemb; | |
28f540f4 | 208 | @dots{} |
e7b90e6e | 209 | qsort (array, nmemb, sizeof *array, compare_long_ints); |
28f540f4 RM |
210 | @} |
211 | @end smallexample | |
212 | ||
213 | The @code{qsort} function derives its name from the fact that it was | |
214 | originally implemented using the ``quick sort'' algorithm. | |
21ad6b26 | 215 | |
57581acd | 216 | The implementation of @code{qsort} attempts to allocate auxiliary memory |
709fbd3e AZ |
217 | and use the merge sort algorithm, without violating C standard requirement |
218 | that arguments passed to the comparison function point within the array. | |
57581acd | 219 | If the memory allocation fails, @code{qsort} resorts to a slower algorithm. |
28f540f4 RM |
220 | @end deftypefun |
221 | ||
2604afb1 | 222 | @node Search/Sort Example |
28f540f4 RM |
223 | @section Searching and Sorting Example |
224 | ||
225 | Here is an example showing the use of @code{qsort} and @code{bsearch} | |
57581acd | 226 | with an array of structures. The elements of the array are sorted |
28f540f4 | 227 | by comparing their @code{name} fields with the @code{strcmp} function. |
57581acd | 228 | Then, we can look up individual elements based on their names. |
28f540f4 RM |
229 | |
230 | @comment This example is dedicated to the memory of Jim Henson. RIP. | |
231 | @smallexample | |
232 | @include search.c.texi | |
233 | @end smallexample | |
234 | ||
235 | @cindex Kermit the frog | |
236 | The output from this program looks like: | |
237 | ||
238 | @smallexample | |
239 | Kermit, the frog | |
240 | Piggy, the pig | |
241 | Gonzo, the whatever | |
242 | Fozzie, the bear | |
243 | Sam, the eagle | |
244 | Robin, the frog | |
245 | Animal, the animal | |
246 | Camilla, the chicken | |
247 | Sweetums, the monster | |
248 | Dr. Strangepork, the pig | |
249 | Link Hogthrob, the pig | |
250 | Zoot, the human | |
251 | Dr. Bunsen Honeydew, the human | |
252 | Beaker, the human | |
253 | Swedish Chef, the human | |
254 | ||
255 | Animal, the animal | |
256 | Beaker, the human | |
257 | Camilla, the chicken | |
258 | Dr. Bunsen Honeydew, the human | |
259 | Dr. Strangepork, the pig | |
260 | Fozzie, the bear | |
261 | Gonzo, the whatever | |
262 | Kermit, the frog | |
263 | Link Hogthrob, the pig | |
264 | Piggy, the pig | |
265 | Robin, the frog | |
266 | Sam, the eagle | |
267 | Swedish Chef, the human | |
268 | Sweetums, the monster | |
269 | Zoot, the human | |
270 | ||
271 | Kermit, the frog | |
272 | Gonzo, the whatever | |
273 | Couldn't find Janice. | |
274 | @end smallexample | |
2604afb1 UD |
275 | |
276 | ||
277 | @node Hash Search Function | |
278 | @section The @code{hsearch} function. | |
279 | ||
11bf311e | 280 | The functions mentioned so far in this chapter are for searching in a sorted |
2604afb1 UD |
281 | or unsorted array. There are other methods to organize information |
282 | which later should be searched. The costs of insert, delete and search | |
283 | differ. One possible implementation is using hashing tables. | |
11bf311e | 284 | The following functions are declared in the header file @file{search.h}. |
2604afb1 | 285 | |
2604afb1 | 286 | @deftypefun int hcreate (size_t @var{nel}) |
d08a7e4c | 287 | @standards{SVID, search.h} |
433c45a2 AO |
288 | @safety{@prelim{}@mtunsafe{@mtasurace{:hsearch}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
289 | @c hcreate @mtasurace:hsearch @ascuheap @acucorrupt @acsmem | |
290 | @c hcreate_r dup @mtsrace:htab @ascuheap @acucorrupt @acsmem | |
2604afb1 UD |
291 | The @code{hcreate} function creates a hashing table which can contain at |
292 | least @var{nel} elements. There is no possibility to grow this table so | |
11bf311e UD |
293 | it is necessary to choose the value for @var{nel} wisely. The method |
294 | used to implement this function might make it necessary to make the | |
2604afb1 | 295 | number of elements in the hashing table larger than the expected maximal |
11bf311e | 296 | number of elements. Hashing tables usually work inefficiently if they are |
2604afb1 UD |
297 | filled 80% or more. The constant access time guaranteed by hashing can |
298 | only be achieved if few collisions exist. See Knuth's ``The Art of | |
299 | Computer Programming, Part 3: Searching and Sorting'' for more | |
300 | information. | |
301 | ||
302 | The weakest aspect of this function is that there can be at most one | |
f2ea0f5b | 303 | hashing table used through the whole program. The table is allocated |
1f77f049 | 304 | in local memory out of control of the programmer. As an extension @theglibc{} |
9dcc8f11 | 305 | provides an additional set of functions with a reentrant |
8ed0d867 | 306 | interface which provides a similar interface but which allows keeping |
49c091e5 | 307 | arbitrarily many hashing tables. |
2604afb1 UD |
308 | |
309 | It is possible to use more than one hashing table in the program run if | |
310 | the former table is first destroyed by a call to @code{hdestroy}. | |
311 | ||
8ed0d867 | 312 | The function returns a non-zero value if successful. If it returns zero, |
2604afb1 | 313 | something went wrong. This could either mean there is already a hashing |
8ed0d867 | 314 | table in use or the program ran out of memory. |
2604afb1 UD |
315 | @end deftypefun |
316 | ||
2604afb1 | 317 | @deftypefun void hdestroy (void) |
d08a7e4c | 318 | @standards{SVID, search.h} |
433c45a2 AO |
319 | @safety{@prelim{}@mtunsafe{@mtasurace{:hsearch}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
320 | @c hdestroy @mtasurace:hsearch @ascuheap @acucorrupt @acsmem | |
321 | @c hdestroy_r dup @mtsrace:htab @ascuheap @acucorrupt @acsmem | |
2604afb1 UD |
322 | The @code{hdestroy} function can be used to free all the resources |
323 | allocated in a previous call of @code{hcreate}. After a call to this | |
324 | function it is again possible to call @code{hcreate} and allocate a new | |
325 | table with possibly different size. | |
326 | ||
327 | It is important to remember that the elements contained in the hashing | |
328 | table at the time @code{hdestroy} is called are @emph{not} freed by this | |
329 | function. It is the responsibility of the program code to free those | |
f2ea0f5b | 330 | strings (if necessary at all). Freeing all the element memory is not |
2604afb1 UD |
331 | possible without extra, separately kept information since there is no |
332 | function to iterate through all available elements in the hashing table. | |
333 | If it is really necessary to free a table and all elements the | |
334 | programmer has to keep a list of all table elements and before calling | |
335 | @code{hdestroy} s/he has to free all element's data using this list. | |
f2ea0f5b | 336 | This is a very unpleasant mechanism and it also shows that this kind of |
8ed0d867 | 337 | hashing table is mainly meant for tables which are created once and |
2604afb1 UD |
338 | used until the end of the program run. |
339 | @end deftypefun | |
340 | ||
341 | Entries of the hashing table and keys for the search are defined using | |
342 | this type: | |
343 | ||
2d8a22cd | 344 | @deftp {Data type} ENTRY |
2604afb1 UD |
345 | @table @code |
346 | @item char *key | |
347 | Pointer to a zero-terminated string of characters describing the key for | |
348 | the search or the element in the hashing table. | |
2d8a22cd FW |
349 | |
350 | This is a limiting restriction of the functionality of the | |
351 | @code{hsearch} functions: They can only be used for data sets which | |
352 | use the NUL character always and solely to terminate keys. It is not | |
353 | possible to handle general binary data for keys. | |
354 | ||
355 | @item void *data | |
356 | Generic pointer for use by the application. The hashing table | |
357 | implementation preserves this pointer in entries, but does not use it | |
358 | in any way otherwise. | |
2604afb1 UD |
359 | @end table |
360 | @end deftp | |
361 | ||
2d8a22cd FW |
362 | @deftp {Data type} {struct entry} |
363 | The underlying type of @code{ENTRY}. | |
364 | @end deftp | |
365 | ||
2604afb1 | 366 | @deftypefun {ENTRY *} hsearch (ENTRY @var{item}, ACTION @var{action}) |
d08a7e4c | 367 | @standards{SVID, search.h} |
433c45a2 AO |
368 | @safety{@prelim{}@mtunsafe{@mtasurace{:hsearch}}@asunsafe{}@acunsafe{@acucorrupt{/action==ENTER}}} |
369 | @c hsearch @mtasurace:hsearch @acucorrupt/action==ENTER | |
370 | @c hsearch_r dup @mtsrace:htab @acucorrupt/action==ENTER | |
2604afb1 | 371 | To search in a hashing table created using @code{hcreate} the |
8ed0d867 RJ |
372 | @code{hsearch} function must be used. This function can perform a simple |
373 | search for an element (if @var{action} has the value @code{FIND}) or it can | |
afdef815 UD |
374 | alternatively insert the key element into the hashing table. Entries |
375 | are never replaced. | |
2604afb1 UD |
376 | |
377 | The key is denoted by a pointer to an object of type @code{ENTRY}. For | |
378 | locating the corresponding position in the hashing table only the | |
379 | @code{key} element of the structure is used. | |
380 | ||
8ed0d867 | 381 | If an entry with a matching key is found the @var{action} parameter is |
afdef815 UD |
382 | irrelevant. The found entry is returned. If no matching entry is found |
383 | and the @var{action} parameter has the value @code{FIND} the function | |
384 | returns a @code{NULL} pointer. If no entry is found and the | |
385 | @var{action} parameter has the value @code{ENTER} a new entry is added | |
386 | to the hashing table which is initialized with the parameter @var{item}. | |
387 | A pointer to the newly added entry is returned. | |
2604afb1 UD |
388 | @end deftypefun |
389 | ||
8ed0d867 | 390 | As mentioned before, the hashing table used by the functions described so |
2604afb1 UD |
391 | far is global and there can be at any time at most one hashing table in |
392 | the program. A solution is to use the following functions which are a | |
393 | GNU extension. All have in common that they operate on a hashing table | |
394 | which is described by the content of an object of the type @code{struct | |
395 | hsearch_data}. This type should be treated as opaque, none of its | |
396 | members should be changed directly. | |
397 | ||
2604afb1 | 398 | @deftypefun int hcreate_r (size_t @var{nel}, struct hsearch_data *@var{htab}) |
d08a7e4c | 399 | @standards{GNU, search.h} |
433c45a2 AO |
400 | @safety{@prelim{}@mtsafe{@mtsrace{:htab}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
401 | @c Unlike the lsearch array, the htab is (at least in part) opaque, so | |
402 | @c let's make it absolutely clear that ensuring exclusive access is a | |
403 | @c caller responsibility. | |
404 | ||
405 | @c Cancellation is unlikely to leave the htab in a corrupt state: the | |
406 | @c last field to be initialized is the one that tells whether the entire | |
407 | @c data structure was initialized, and there's a function call (calloc) | |
408 | @c in between that will often ensure all other fields are written before | |
409 | @c the table. However, should this call be inlined (say with LTO), this | |
410 | @c assumption may not hold. The calloc call doesn't cross our library | |
411 | @c interface barrier, so let's consider this could happen and mark this | |
412 | @c with @acucorrupt. It's no safety loss, since we already have | |
413 | @c @ascuheap anyway... | |
414 | ||
415 | @c hcreate_r @mtsrace:htab @ascuheap @acucorrupt @acsmem | |
416 | @c isprime ok | |
417 | @c calloc dup @ascuheap @acsmem | |
f2ea0f5b | 418 | The @code{hcreate_r} function initializes the object pointed to by |
2604afb1 UD |
419 | @var{htab} to contain a hashing table with at least @var{nel} elements. |
420 | So this function is equivalent to the @code{hcreate} function except | |
421 | that the initialized data structure is controlled by the user. | |
422 | ||
0d1ef15f UD |
423 | This allows having more than one hashing table at one time. The memory |
424 | necessary for the @code{struct hsearch_data} object can be allocated | |
425 | dynamically. It must be initialized with zero before calling this | |
426 | function. | |
2604afb1 | 427 | |
11bf311e UD |
428 | The return value is non-zero if the operation was successful. If the |
429 | return value is zero, something went wrong, which probably means the | |
8ed0d867 | 430 | program ran out of memory. |
2604afb1 UD |
431 | @end deftypefun |
432 | ||
2604afb1 | 433 | @deftypefun void hdestroy_r (struct hsearch_data *@var{htab}) |
d08a7e4c | 434 | @standards{GNU, search.h} |
433c45a2 AO |
435 | @safety{@prelim{}@mtsafe{@mtsrace{:htab}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
436 | @c The table is released while the table pointer still points to it. | |
437 | @c Async cancellation is thus unsafe, but it already was because we call | |
438 | @c free(). Using the table in a handler while it's being released would | |
439 | @c also be dangerous, but calling free() already makes it unsafe, and | |
440 | @c the requirement on the caller to ensure exclusive access already | |
441 | @c guarantees this doesn't happen, so we don't get @asucorrupt. | |
442 | ||
443 | @c hdestroy_r @mtsrace:htab @ascuheap @acucorrupt @acsmem | |
444 | @c free dup @ascuheap @acsmem | |
2604afb1 UD |
445 | The @code{hdestroy_r} function frees all resources allocated by the |
446 | @code{hcreate_r} function for this very same object @var{htab}. As for | |
8ed0d867 | 447 | @code{hdestroy} it is the program's responsibility to free the strings |
2604afb1 UD |
448 | for the elements of the table. |
449 | @end deftypefun | |
450 | ||
2604afb1 | 451 | @deftypefun int hsearch_r (ENTRY @var{item}, ACTION @var{action}, ENTRY **@var{retval}, struct hsearch_data *@var{htab}) |
d08a7e4c | 452 | @standards{GNU, search.h} |
433c45a2 AO |
453 | @safety{@prelim{}@mtsafe{@mtsrace{:htab}}@assafe{}@acunsafe{@acucorrupt{/action==ENTER}}} |
454 | @c Callers have to ensure mutual exclusion; insertion, if cancelled, | |
455 | @c leaves the table in a corrupt state. | |
456 | ||
457 | @c hsearch_r @mtsrace:htab @acucorrupt/action==ENTER | |
458 | @c strlen dup ok | |
459 | @c strcmp dup ok | |
2604afb1 UD |
460 | The @code{hsearch_r} function is equivalent to @code{hsearch}. The |
461 | meaning of the first two arguments is identical. But instead of | |
f2ea0f5b | 462 | operating on a single global hashing table the function works on the |
2604afb1 UD |
463 | table described by the object pointed to by @var{htab} (which is |
464 | initialized by a call to @code{hcreate_r}). | |
465 | ||
466 | Another difference to @code{hcreate} is that the pointer to the found | |
8ed0d867 RJ |
467 | entry in the table is not the return value of the function. It is |
468 | returned by storing it in a pointer variable pointed to by the | |
2604afb1 UD |
469 | @var{retval} parameter. The return value of the function is an integer |
470 | value indicating success if it is non-zero and failure if it is zero. | |
010fe231 | 471 | In the latter case the global variable @code{errno} signals the reason for |
2604afb1 UD |
472 | the failure. |
473 | ||
474 | @table @code | |
475 | @item ENOMEM | |
9dcc8f11 | 476 | The table is filled and @code{hsearch_r} was called with a so far |
2604afb1 UD |
477 | unknown key and @var{action} set to @code{ENTER}. |
478 | @item ESRCH | |
479 | The @var{action} parameter is @code{FIND} and no corresponding element | |
480 | is found in the table. | |
481 | @end table | |
482 | @end deftypefun | |
483 | ||
484 | ||
485 | @node Tree Search Function | |
486 | @section The @code{tsearch} function. | |
487 | ||
488 | Another common form to organize data for efficient search is to use | |
489 | trees. The @code{tsearch} function family provides a nice interface to | |
490 | functions to organize possibly large amounts of data by providing a mean | |
491 | access time proportional to the logarithm of the number of elements. | |
1f77f049 | 492 | @Theglibc{} implementation even guarantees that this bound is |
2604afb1 UD |
493 | never exceeded even for input data which cause problems for simple |
494 | binary tree implementations. | |
495 | ||
f2ea0f5b | 496 | The functions described in the chapter are all described in the @w{System |
2604afb1 UD |
497 | V} and X/Open specifications and are therefore quite portable. |
498 | ||
499 | In contrast to the @code{hsearch} functions the @code{tsearch} functions | |
500 | can be used with arbitrary data and not only zero-terminated strings. | |
501 | ||
502 | The @code{tsearch} functions have the advantage that no function to | |
503 | initialize data structures is necessary. A simple pointer of type | |
504 | @code{void *} initialized to @code{NULL} is a valid tree and can be | |
ce460d04 RM |
505 | extended or searched. The prototypes for these functions can be found |
506 | in the header file @file{search.h}. | |
2604afb1 | 507 | |
2604afb1 | 508 | @deftypefun {void *} tsearch (const void *@var{key}, void **@var{rootp}, comparison_fn_t @var{compar}) |
d08a7e4c | 509 | @standards{SVID, search.h} |
433c45a2 AO |
510 | @safety{@prelim{}@mtsafe{@mtsrace{:rootp}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
511 | @c The tree is not modified in a thread-safe manner, and rotations may | |
512 | @c leave the tree in an inconsistent state that could be observed in an | |
513 | @c asynchronous signal handler (except for the caller-synchronization | |
514 | @c requirement) or after asynchronous cancellation of the thread | |
515 | @c performing the rotation or the insertion. | |
2604afb1 UD |
516 | The @code{tsearch} function searches in the tree pointed to by |
517 | @code{*@var{rootp}} for an element matching @var{key}. The function | |
f2ea0f5b | 518 | pointed to by @var{compar} is used to determine whether two elements |
8b7fb588 | 519 | match. @xref{Comparison Functions}, for a specification of the functions |
2604afb1 UD |
520 | which can be used for the @var{compar} parameter. |
521 | ||
522 | If the tree does not contain a matching entry the @var{key} value will | |
523 | be added to the tree. @code{tsearch} does not make a copy of the object | |
524 | pointed to by @var{key} (how could it since the size is unknown). | |
525 | Instead it adds a reference to this object which means the object must | |
526 | be available as long as the tree data structure is used. | |
527 | ||
528 | The tree is represented by a pointer to a pointer since it is sometimes | |
529 | necessary to change the root node of the tree. So it must not be | |
530 | assumed that the variable pointed to by @var{rootp} has the same value | |
531 | after the call. This also shows that it is not safe to call the | |
532 | @code{tsearch} function more than once at the same time using the same | |
533 | tree. It is no problem to run it more than once at a time on different | |
534 | trees. | |
535 | ||
536 | The return value is a pointer to the matching element in the tree. If a | |
537 | new element was created the pointer points to the new data (which is in | |
538 | fact @var{key}). If an entry had to be created and the program ran out | |
539 | of space @code{NULL} is returned. | |
540 | @end deftypefun | |
541 | ||
2604afb1 | 542 | @deftypefun {void *} tfind (const void *@var{key}, void *const *@var{rootp}, comparison_fn_t @var{compar}) |
d08a7e4c | 543 | @standards{SVID, search.h} |
433c45a2 | 544 | @safety{@prelim{}@mtsafe{@mtsrace{:rootp}}@assafe{}@acsafe{}} |
2604afb1 UD |
545 | The @code{tfind} function is similar to the @code{tsearch} function. It |
546 | locates an element matching the one pointed to by @var{key} and returns | |
547 | a pointer to this element. But if no matching element is available no | |
548 | new element is entered (note that the @var{rootp} parameter points to a | |
549 | constant pointer). Instead the function returns @code{NULL}. | |
550 | @end deftypefun | |
551 | ||
8ed0d867 | 552 | Another advantage of the @code{tsearch} functions in contrast to the |
2604afb1 UD |
553 | @code{hsearch} functions is that there is an easy way to remove |
554 | elements. | |
555 | ||
2604afb1 | 556 | @deftypefun {void *} tdelete (const void *@var{key}, void **@var{rootp}, comparison_fn_t @var{compar}) |
d08a7e4c | 557 | @standards{SVID, search.h} |
433c45a2 | 558 | @safety{@prelim{}@mtsafe{@mtsrace{:rootp}}@asunsafe{@ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} |
2604afb1 UD |
559 | To remove a specific element matching @var{key} from the tree |
560 | @code{tdelete} can be used. It locates the matching element using the | |
561 | same method as @code{tfind}. The corresponding element is then removed | |
2864e767 UD |
562 | and a pointer to the parent of the deleted node is returned by the |
563 | function. If there is no matching entry in the tree nothing can be | |
564 | deleted and the function returns @code{NULL}. If the root of the tree | |
565 | is deleted @code{tdelete} returns some unspecified value not equal to | |
566 | @code{NULL}. | |
2604afb1 UD |
567 | @end deftypefun |
568 | ||
2604afb1 | 569 | @deftypefun void tdestroy (void *@var{vroot}, __free_fn_t @var{freefct}) |
d08a7e4c | 570 | @standards{GNU, search.h} |
433c45a2 | 571 | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
2604afb1 UD |
572 | If the complete search tree has to be removed one can use |
573 | @code{tdestroy}. It frees all resources allocated by the @code{tsearch} | |
8ed0d867 | 574 | functions to generate the tree pointed to by @var{vroot}. |
2604afb1 UD |
575 | |
576 | For the data in each tree node the function @var{freefct} is called. | |
577 | The pointer to the data is passed as the argument to the function. If | |
578 | no such work is necessary @var{freefct} must point to a function doing | |
579 | nothing. It is called in any case. | |
580 | ||
581 | This function is a GNU extension and not covered by the @w{System V} or | |
582 | X/Open specifications. | |
583 | @end deftypefun | |
584 | ||
8ed0d867 | 585 | In addition to the functions to create and destroy the tree data |
e8b1163e AJ |
586 | structure, there is another function which allows you to apply a |
587 | function to all elements of the tree. The function must have this type: | |
2604afb1 UD |
588 | |
589 | @smallexample | |
8c479619 | 590 | void __action_fn_t (const void *nodep, VISIT value, int level); |
2604afb1 UD |
591 | @end smallexample |
592 | ||
8c479619 | 593 | The @var{nodep} is the data value of the current node (once given as the |
2604afb1 UD |
594 | @var{key} argument to @code{tsearch}). @var{level} is a numeric value |
595 | which corresponds to the depth of the current node in the tree. The | |
596 | root node has the depth @math{0} and its children have a depth of | |
597 | @math{1} and so on. The @code{VISIT} type is an enumeration type. | |
598 | ||
599 | @deftp {Data Type} VISIT | |
600 | The @code{VISIT} value indicates the status of the current node in the | |
601 | tree and how the function is called. The status of a node is either | |
602 | `leaf' or `internal node'. For each leaf node the function is called | |
603 | exactly once, for each internal node it is called three times: before | |
604 | the first child is processed, after the first child is processed and | |
f2ea0f5b | 605 | after both children are processed. This makes it possible to handle all |
2604afb1 UD |
606 | three methods of tree traversal (or even a combination of them). |
607 | ||
2fe82ca6 | 608 | @vtable @code |
2604afb1 UD |
609 | @item preorder |
610 | The current node is an internal node and the function is called before | |
611 | the first child was processed. | |
da2a3ca6 | 612 | @item postorder |
2604afb1 UD |
613 | The current node is an internal node and the function is called after |
614 | the first child was processed. | |
da2a3ca6 | 615 | @item endorder |
2604afb1 UD |
616 | The current node is an internal node and the function is called after |
617 | the second child was processed. | |
618 | @item leaf | |
619 | The current node is a leaf. | |
2fe82ca6 | 620 | @end vtable |
2604afb1 UD |
621 | @end deftp |
622 | ||
2604afb1 | 623 | @deftypefun void twalk (const void *@var{root}, __action_fn_t @var{action}) |
d08a7e4c | 624 | @standards{SVID, search.h} |
433c45a2 | 625 | @safety{@prelim{}@mtsafe{@mtsrace{:root}}@assafe{}@acsafe{}} |
da2a3ca6 | 626 | For each node in the tree with a node pointed to by @var{root}, the |
2604afb1 UD |
627 | @code{twalk} function calls the function provided by the parameter |
628 | @var{action}. For leaf nodes the function is called exactly once with | |
629 | @var{value} set to @code{leaf}. For internal nodes the function is | |
630 | called three times, setting the @var{value} parameter or @var{action} to | |
631 | the appropriate value. The @var{level} argument for the @var{action} | |
8ed0d867 RJ |
632 | function is computed while descending the tree by increasing the value |
633 | by one for each descent to a child, starting with the value @math{0} for | |
2604afb1 UD |
634 | the root node. |
635 | ||
636 | Since the functions used for the @var{action} parameter to @code{twalk} | |
da2a3ca6 UD |
637 | must not modify the tree data, it is safe to run @code{twalk} in more |
638 | than one thread at the same time, working on the same tree. It is also | |
2604afb1 | 639 | safe to call @code{tfind} in parallel. Functions which modify the tree |
7b807a35 FW |
640 | must not be used, otherwise the behavior is undefined. However, it is |
641 | difficult to pass data external to the tree to the callback function | |
642 | without resorting to global variables (and thread safety issues), so | |
643 | see the @code{twalk_r} function below. | |
644 | @end deftypefun | |
645 | ||
646 | @deftypefun void twalk_r (const void *@var{root}, void (*@var{action}) (const void *@var{key}, VISIT @var{which}, void *@var{closure}), void *@var{closure}) | |
647 | @standards{GNU, search.h} | |
648 | @safety{@prelim{}@mtsafe{@mtsrace{:root}}@assafe{}@acsafe{}} | |
649 | For each node in the tree with a node pointed to by @var{root}, the | |
650 | @code{twalk_r} function calls the function provided by the parameter | |
651 | @var{action}. For leaf nodes the function is called exactly once with | |
6807f47b CD |
652 | @var{which} set to @code{leaf}. For internal nodes the function is |
653 | called three times, setting the @var{which} parameter of @var{action} to | |
7b807a35 FW |
654 | the appropriate value. The @var{closure} parameter is passed down to |
655 | each call of the @var{action} function, unmodified. | |
656 | ||
657 | It is possible to implement the @code{twalk} function on top of the | |
658 | @code{twalk_r} function, which is why there is no separate level | |
659 | parameter. | |
660 | ||
661 | @smallexample | |
662 | @include twalk.c.texi | |
663 | @end smallexample | |
2604afb1 | 664 | @end deftypefun |