@node Character Handling, String and Array Utilities, Memory, Top
@c %MENU% Character testing and conversion functions
@chapter Character Handling

Programs that work with characters and strings often need to classify a
character---is it alphabetic, is it a digit, is it whitespace, and so
on---and perform case conversion operations on characters. The
functions in the header file @file{ctype.h} are provided for this
purpose.
@pindex ctype.h

Since the choice of locale and character set can alter the
classifications of particular character codes, all of these functions
are affected by the current locale. (More precisely, they are affected
by the locale currently selected for character classification---the
@code{LC_CTYPE} category; see @ref{Locale Categories}.)

The @w{ISO C} standard specifies two different sets of functions. One
set works on @code{char} type characters, the other on
@code{wchar_t} wide characters (@pxref{Extended Char Intro}).

@menu
* Classification of Characters::       Testing whether characters are
                                        letters, digits, punctuation, etc.

* Case Conversion::                    Case mapping, and the like.
* Classification of Wide Characters::  Character class determination for
                                        wide characters.
* Using Wide Char Classes::            Notes on using the wide character
                                        classes.
* Wide Character Case Conversion::     Mapping of wide characters.
@end menu

@node Classification of Characters, Case Conversion, , Character Handling
@section Classification of Characters
@cindex character testing
@cindex classification of characters
@cindex predicates on characters
@cindex character predicates

This section explains the library functions for classifying characters.
For example, @code{isalpha} is the function to test for an alphabetic
character. It takes one argument, the character to test, and returns a
nonzero integer if the character is alphabetic, and zero otherwise. You
would use it like this:

@smallexample
if (isalpha (c))
  printf ("The character `%c' is alphabetic.\n", c);
@end smallexample

Each of the functions in this section tests for membership in a
particular class of characters; each has a name starting with @samp{is}.
Each of them takes one argument, which is a character to test, and
returns an @code{int} which is treated as a boolean value. The
character argument is passed as an @code{int}, and it may be the
constant value @code{EOF} instead of a real character.

The attributes of any given character can vary between locales.
@xref{Locales}, for more information on locales.@refill

These functions are declared in the header file @file{ctype.h}.
@pindex ctype.h

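As a minimal sketch of how these predicates are typically applied to the
elements of a string, the helper below counts the alphabetic characters
in a string. The name @code{count_alpha} is hypothetical and not part
of the library. The cast to @code{unsigned char} is there because the
argument must be either @code{EOF} or a value representable as an
@code{unsigned char}, and a plain @code{char} may be negative.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the characters of S that isalpha accepts in the current
   LC_CTYPE locale.  count_alpha is a hypothetical helper name.  */
size_t
count_alpha (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; ++s)
    /* Convert through unsigned char so that a negative char value
       is never passed to isalpha.  */
    if (isalpha ((unsigned char) *s))
      ++n;
  return n;
}
```

In the standard @code{"C"} locale, calling @code{count_alpha ("ab1 c!")}
counts the three letters and ignores the digit, space and punctuation.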
@cindex lower-case character
@comment ctype.h
@comment ISO
@deftypefun int islower (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The is* macros call __ctype_b_loc to get the ctype array from the
@c current locale, and then index it by c. __ctype_b_loc reads from
@c thread-local memory the (indirect) pointer to the ctype array, which
@c may involve one word access to the global locale object, if that's
@c the active locale for the thread, and the array, being part of the
@c locale data, is undeletable, so there's no thread-safety issue. We
@c might want to mark these with @mtslocale to flag to callers that
@c changing locales might affect them, even if not these simpler
@c functions.
Returns true if @var{c} is a lower-case letter. The letter need not be
from the Latin alphabet; any alphabet representable is valid.
@end deftypefun

@cindex upper-case character
@comment ctype.h
@comment ISO
@deftypefun int isupper (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an upper-case letter. The letter need not be
from the Latin alphabet; any alphabet representable is valid.
@end deftypefun

@cindex alphabetic character
@comment ctype.h
@comment ISO
@deftypefun int isalpha (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an alphabetic character (a letter). If
@code{islower} or @code{isupper} is true of a character, then
@code{isalpha} is also true.

In some locales, there may be additional characters for which
@code{isalpha} is true---letters which are neither upper case nor lower
case. But in the standard @code{"C"} locale, there are no such
additional characters.
@end deftypefun

@cindex digit character
@cindex decimal digit character
@comment ctype.h
@comment ISO
@deftypefun int isdigit (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
@end deftypefun

@cindex alphanumeric character
@comment ctype.h
@comment ISO
@deftypefun int isalnum (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an alphanumeric character (a letter or
digit); in other words, if either @code{isalpha} or @code{isdigit} is
true of a character, then @code{isalnum} is also true.
@end deftypefun

@cindex hexadecimal digit character
@comment ctype.h
@comment ISO
@deftypefun int isxdigit (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits @samp{0} through
@samp{9} and the letters @samp{A} through @samp{F} and
@samp{a} through @samp{f}.
@end deftypefun

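The digit predicates are often combined when parsing numbers. The
following sketch converts one hexadecimal digit to its numeric value.
The name @code{hex_digit_value} is hypothetical, and the code assumes
that the letters @samp{a} through @samp{f} have consecutive codes, which
holds for ASCII but is not guaranteed by @w{ISO C} (only the decimal
digits are guaranteed to be consecutive).

```c
#include <ctype.h>

/* Return the numeric value of the hexadecimal digit C, or -1 if C is
   not a hexadecimal digit.  C must be EOF or an unsigned char value.
   hex_digit_value is a hypothetical helper name.  */
int
hex_digit_value (int c)
{
  if (isdigit (c))
    return c - '0';
  if (isxdigit (c))
    /* Assumption: 'a' through 'f' are consecutive, as in ASCII.  */
    return tolower (c) - 'a' + 10;
  return -1;
}
```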
@cindex punctuation character
@comment ctype.h
@comment ISO
@deftypefun int ispunct (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.
@end deftypefun

@cindex whitespace character
@comment ctype.h
@comment ISO
@deftypefun int isspace (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a @dfn{whitespace} character. In the standard
@code{"C"} locale, @code{isspace} returns true for only the standard
whitespace characters:

@table @code
@item ' '
space

@item '\f'
formfeed

@item '\n'
newline

@item '\r'
carriage return

@item '\t'
horizontal tab

@item '\v'
vertical tab
@end table
@end deftypefun

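A common use of @code{isspace} is to skip leading whitespace before
parsing a token. The following is a minimal sketch; the name
@code{skip_whitespace} is hypothetical.

```c
#include <ctype.h>

/* Return a pointer to the first character of S that isspace rejects.
   The terminating null character is not whitespace, so the loop
   always stops at or before the end of the string.
   skip_whitespace is a hypothetical helper name.  */
const char *
skip_whitespace (const char *s)
{
  while (isspace ((unsigned char) *s))
    ++s;
  return s;
}
```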
@cindex blank character
@comment ctype.h
@comment ISO
@deftypefun int isblank (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in @w{ISO C99}.
@end deftypefun

@cindex graphic character
@comment ctype.h
@comment ISO
@deftypefun int isgraph (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a graphic character; that is, a character
that has a glyph associated with it. The whitespace characters are not
considered graphic.
@end deftypefun

@cindex printing character
@comment ctype.h
@comment ISO
@deftypefun int isprint (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a printing character. Printing characters
include all the graphic characters, plus the space (@samp{ }) character.
@end deftypefun

@cindex control character
@comment ctype.h
@comment ISO
@deftypefun int iscntrl (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a control character (that is, a character that
is not a printing character).
@end deftypefun

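The distinction between printing and non-printing characters matters
when text of unknown origin is displayed. As an illustrative sketch,
the hypothetical helper @code{mask_unprintable} replaces every character
that @code{isprint} rejects with a question mark; the choice of
replacement character is arbitrary.

```c
#include <ctype.h>

/* Overwrite each character of S that is not a printing character in
   the current locale with '?'.  mask_unprintable is a hypothetical
   helper name.  */
void
mask_unprintable (char *s)
{
  for (; *s != '\0'; ++s)
    if (!isprint ((unsigned char) *s))
      *s = '?';
}
```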
@cindex ASCII character
@comment ctype.h
@comment SVID, BSD
@deftypefun int isascii (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
into the US/UK ASCII character set. This function is a BSD extension
and is also an SVID extension.
@end deftypefun

@node Case Conversion, Classification of Wide Characters, Classification of Characters, Character Handling
@section Case Conversion
@cindex character case conversion
@cindex case conversion of characters
@cindex converting case of characters

This section explains the library functions for performing conversions
such as case mappings on characters. For example, @code{toupper}
converts any character to upper case if possible. If the character
can't be converted, @code{toupper} returns it unchanged.

These functions take one argument of type @code{int}, which is the
character to convert, and return the converted character as an
@code{int}. If the conversion is not applicable to the argument given,
the argument is returned unchanged.

@strong{Compatibility Note:} In pre-@w{ISO C} dialects, instead of
returning the argument unchanged, these functions may fail when the
argument is not suitable for the conversion. Thus for portability, you
may need to write @code{islower(c) ? toupper(c) : c} rather than just
@code{toupper(c)}.

These functions are declared in the header file @file{ctype.h}.
@pindex ctype.h

@comment ctype.h
@comment ISO
@deftypefun int tolower (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The to* macros/functions call different functions that use different
@c arrays than those of __ctype_b_loc, but the access patterns and
@c thus safety guarantees are the same.
If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
lower-case letter. If @var{c} is not an upper-case letter,
@var{c} is returned unchanged.
@end deftypefun

@comment ctype.h
@comment ISO
@deftypefun int toupper (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
If @var{c} is a lower-case letter, @code{toupper} returns the corresponding
upper-case letter. Otherwise @var{c} is returned unchanged.
@end deftypefun

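Because @code{toupper} returns characters it cannot convert unchanged,
it can be applied to every element of a string without testing each one
first. The following sketch converts a string in place; the name
@code{upcase_string} is hypothetical.

```c
#include <ctype.h>

/* Convert every lower-case letter in S to upper case, in place,
   according to the current locale.  Other characters are left as
   they are.  upcase_string is a hypothetical helper name.  */
void
upcase_string (char *s)
{
  for (; *s != '\0'; ++s)
    *s = (char) toupper ((unsigned char) *s);
}
```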
@comment ctype.h
@comment SVID, BSD
@deftypefun int toascii (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function converts @var{c} to a 7-bit @code{unsigned char} value
that fits into the US/UK ASCII character set, by clearing the high-order
bits. This function is a BSD extension and is also an SVID extension.
@end deftypefun

@comment ctype.h
@comment SVID
@deftypefun int _tolower (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This is identical to @code{tolower}, and is provided for compatibility
with the SVID. @xref{SVID}.@refill
@end deftypefun

@comment ctype.h
@comment SVID
@deftypefun int _toupper (int @var{c})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This is identical to @code{toupper}, and is provided for compatibility
with the SVID.
@end deftypefun


@node Classification of Wide Characters, Using Wide Char Classes, Case Conversion, Character Handling
@section Character class determination for wide characters

@w{Amendment 1} to @w{ISO C90} defines functions to classify wide
characters. Although the original @w{ISO C90} standard already defined
the type @code{wchar_t}, no functions operating on it were defined.

The design of the classification functions for wide characters
is more general than that of the @code{char} functions. It allows
extensions to the set of available classifications, beyond those which
are always available. The POSIX standard specifies how extensions can
be made, and this is already implemented in the @glibcadj{}
implementation of the @code{localedef} program.

The character class functions are normally implemented with bitsets,
with a bitset per character. For a given character, the appropriate
bitset is read from a table and a test is performed as to whether a
certain bit is set. Which bit is tested for is determined by the
class.

For the wide character classification functions this is made visible.
There is a classification type defined, a function to retrieve a value
of this type for a given class, and a function to test whether a given
character is in this class, using the classification value. On top of
this the normal character classification functions as used for
@code{char} objects can be defined.

@comment wctype.h
@comment ISO
@deftp {Data type} wctype_t
The @code{wctype_t} type can hold a value which represents a character
class. The only defined way to generate such a value is by using the
@code{wctype} function.

@pindex wctype.h
This type is defined in @file{wctype.h}.
@end deftp

@comment wctype.h
@comment ISO
@deftypefun wctype_t wctype (const char *@var{property})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Although the source code of wctype contains multiple references to
@c the locale, that could each reference different locale_data objects
@c should the global locale object change while active, the compiler can
@c and does combine them all into a single dereference that resolves
@c once to the LCTYPE locale object used throughout the function, so it
@c is safe in (optimized) practice, if not in theory, even when the
@c locale changes. Ideally we'd explicitly save the resolved
@c locale_data object to make it visibly safe instead of safe only under
@c compiler optimizations, but given the decision that setlocale is
@c MT-Unsafe, all this would afford us would be the ability to not mark
@c this function with @mtslocale.
The @code{wctype} function returns a value representing a class of wide
characters which is identified by the string @var{property}. Besides
some standard properties, each locale can define its own. If
no property with the given name is known for the current locale
selected for the @code{LC_CTYPE} category, the function returns zero.

@noindent
The properties known in every locale are:

@multitable @columnfractions .25 .25 .25 .25
@item
@code{"alnum"} @tab @code{"alpha"} @tab @code{"cntrl"} @tab @code{"digit"}
@item
@code{"graph"} @tab @code{"lower"} @tab @code{"print"} @tab @code{"punct"}
@item
@code{"space"} @tab @code{"upper"} @tab @code{"xdigit"}
@end multitable

@pindex wctype.h
This function is declared in @file{wctype.h}.
@end deftypefun

To test whether a character belongs to one of the non-standard classes,
the @w{ISO C} standard defines a completely new function.

@comment wctype.h
@comment ISO
@deftypefun int iswctype (wint_t @var{wc}, wctype_t @var{desc})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The compressed lookup table returned by wctype is read-only.
This function returns a nonzero value if @var{wc} is in the character
class specified by @var{desc}. @var{desc} must have been returned
by a previous successful call to @code{wctype}.

@pindex wctype.h
This function is declared in @file{wctype.h}.
@end deftypefun

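The two functions are meant to be used together: @code{wctype} looks up
the class once, and @code{iswctype} tests characters against it. The
following is a minimal sketch; the name @code{in_named_class} is
hypothetical, and treating an unknown class as ``no character is a
member'' is a design choice of this example, not something the library
prescribes.

```c
#include <wctype.h>

/* Return nonzero if WC belongs to the class named PROPERTY in the
   current LC_CTYPE locale.  Return zero if it does not, or if the
   locale does not know a class of that name.
   in_named_class is a hypothetical helper name.  */
int
in_named_class (wint_t wc, const char *property)
{
  wctype_t desc = wctype (property);

  if (!desc)
    /* wctype returns zero for an unknown property name.  */
    return 0;
  return iswctype (wc, desc);
}
```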
To make it easier to use the commonly-used classification functions,
they are defined in the C library. There is no need to use
@code{wctype} if the property string is one of the known character
classes. In some situations it is desirable to construct the property
strings, and then it is important that @code{wctype} can also handle the
standard classes.

@cindex alphanumeric character
@comment wctype.h
@comment ISO
@deftypefun int iswalnum (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c The implicit wctype call in the isw* functions is actually an
@c optimized version because the category has a known offset, but the
@c wctype is equally safe when optimized, unsafe with changing locales
@c if not optimized (thus @mtslocale). Since it's not a macro, we
@c always optimize, and the locale can't change in any MT-Safe way, it's
@c fine. The test whether wc is ASCII to use the non-wide is*
@c macro/function doesn't bring any other safety issues: the test does
@c not depend on the locale, and each path after the decision resolves
@c the locale object only once.
This function returns a nonzero value if @var{wc} is an alphanumeric
character (a letter or digit); in other words, if either @code{iswalpha}
or @code{iswdigit} is true of a character, then @code{iswalnum} is also
true.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("alnum"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex alphabetic character
@comment wctype.h
@comment ISO
@deftypefun int iswalpha (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is an alphabetic character (a letter). If
@code{iswlower} or @code{iswupper} is true of a character, then
@code{iswalpha} is also true.

In some locales, there may be additional characters for which
@code{iswalpha} is true---letters which are neither upper case nor lower
case. But in the standard @code{"C"} locale, there are no such
additional characters.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("alpha"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex control character
@comment wctype.h
@comment ISO
@deftypefun int iswcntrl (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a control character (that is, a character that
is not a printing character).

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("cntrl"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex digit character
@comment wctype.h
@comment ISO
@deftypefun int iswdigit (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a digit (e.g., @samp{0} through @samp{9}).
Please note that this function does not only return a nonzero value for
@emph{decimal} digits, but for all kinds of digits. A consequence is
that code like the following will @strong{not} work unconditionally for
wide characters:

@smallexample
n = 0;
while (iswdigit (*wc))
  @{
    n *= 10;
    n += *wc++ - L'0';
  @}
@end smallexample

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("digit"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

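If only the digits @samp{0} through @samp{9} should be accepted, one
option is to test for that range directly instead of relying on
@code{iswdigit}. The following sketch does this; the name
@code{parse_decimal} is hypothetical, the code assumes that
@code{L'0'} through @code{L'9'} have consecutive values, and it
deliberately omits overflow checking to stay short.

```c
#include <wchar.h>

/* Parse the run of characters L'0' .. L'9' at the start of WC and
   return its value.  Parsing stops at the first other character.
   Assumption: the ten wide digits have consecutive codes.
   No overflow checking is done.  parse_decimal is a hypothetical
   helper name.  */
unsigned long
parse_decimal (const wchar_t *wc)
{
  unsigned long n = 0;

  while (*wc >= L'0' && *wc <= L'9')
    {
      n = n * 10 + (unsigned long) (*wc - L'0');
      ++wc;
    }
  return n;
}
```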
@cindex graphic character
@comment wctype.h
@comment ISO
@deftypefun int iswgraph (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a graphic character; that is, a character
that has a glyph associated with it. The whitespace characters are not
considered graphic.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("graph"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex lower-case character
@comment wctype.h
@comment ISO
@deftypefun int iswlower (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a lower-case letter. The letter need not be
from the Latin alphabet; any alphabet representable is valid.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("lower"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex printing character
@comment wctype.h
@comment ISO
@deftypefun int iswprint (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a printing character. Printing characters
include all the graphic characters, plus the space (@samp{ }) character.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("print"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex punctuation character
@comment wctype.h
@comment ISO
@deftypefun int iswpunct (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("punct"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex whitespace character
@comment wctype.h
@comment ISO
@deftypefun int iswspace (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a @dfn{whitespace} character. In the standard
@code{"C"} locale, @code{iswspace} returns true for only the standard
whitespace characters:

@table @code
@item L' '
space

@item L'\f'
formfeed

@item L'\n'
newline

@item L'\r'
carriage return

@item L'\t'
horizontal tab

@item L'\v'
vertical tab
@end table

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("space"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex upper-case character
@comment wctype.h
@comment ISO
@deftypefun int iswupper (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is an upper-case letter. The letter need not be
from the Latin alphabet; any alphabet representable is valid.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("upper"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex hexadecimal digit character
@comment wctype.h
@comment ISO
@deftypefun int iswxdigit (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits @samp{0} through
@samp{9} and the letters @samp{A} through @samp{F} and
@samp{a} through @samp{f}.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("xdigit"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@Theglibc{} also provides a function which was not part of the
@w{ISO C} standard before @w{ISO C99} but which is available as a
version for single byte characters as well.

@cindex blank character
@comment wctype.h
@comment ISO
@deftypefun int iswblank (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in @w{ISO C99}.
It is declared in @file{wctype.h}.
@end deftypefun

@node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling
@section Notes on using the wide character classes

The first note is probably not astonishing but still occasionally a
cause of problems. The @code{isw@var{XXX}} functions can be implemented
using macros and in fact, @theglibc{} does this. They are still
available as real functions but when the @file{wctype.h} header is
included the macros will be used. This is the same as for the
@code{char} type versions of these functions.

The second note covers something new. It can be best illustrated by a
(real-world) example. The first piece of code is an excerpt from the
original code. It is truncated a bit but the intention should be clear.

@smallexample
int
is_in_class (int c, const char *class)
@{
  if (strcmp (class, "alnum") == 0)
    return isalnum (c);
  if (strcmp (class, "alpha") == 0)
    return isalpha (c);
  if (strcmp (class, "cntrl") == 0)
    return iscntrl (c);
  @dots{}
  return 0;
@}
@end smallexample

Now, with the @code{wctype} and @code{iswctype} functions you can avoid
the @code{if} cascades, but rewriting the code as follows is wrong:

@smallexample
int
is_in_class (int c, const char *class)
@{
  wctype_t desc = wctype (class);
  return desc ? iswctype ((wint_t) c, desc) : 0;
@}
@end smallexample

The problem is that it is not guaranteed that the wide character
representation of a single-byte character can be found using casting.
In fact, usually this fails miserably. The correct solution to this
problem is to write the code as follows:

@smallexample
int
is_in_class (int c, const char *class)
@{
  wctype_t desc = wctype (class);
  return desc ? iswctype (btowc (c), desc) : 0;
@}
@end smallexample

@xref{Converting a Character}, for more information on @code{btowc}.
Note that this change probably does not improve the performance
of the program a lot since the @code{wctype} function still has to make
the string comparisons. It gets really interesting if the
@code{is_in_class} function is called more than once for the
same class name. In this case the variable @var{desc} could be computed
once and reused for all the calls. Therefore the above form of the
function is probably not the final one.


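One way to compute the descriptor only once is to move the loop over
the characters into the function that calls @code{wctype}. The
following sketch illustrates the idea under that assumption; the name
@code{count_in_class} is hypothetical, and the code handles only
single-byte characters because it converts each byte with
@code{btowc}.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Count the single-byte characters of S that belong to the class
   named CLASS_NAME in the current locale.  The class descriptor is
   looked up once and reused for every character.  Return 0 if the
   class is unknown.  count_in_class is a hypothetical helper name.  */
size_t
count_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t n = 0;

  if (!desc)
    return 0;
  for (; *s != '\0'; ++s)
    {
      /* btowc yields WEOF for a byte that is not a complete
         single-byte character in the current locale.  */
      wint_t wc = btowc ((unsigned char) *s);
      if (wc != WEOF && iswctype (wc, desc))
        ++n;
    }
  return n;
}
```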
@node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling
@section Mapping of wide characters

The mapping functions are also generalized by the @w{ISO C}
standard. Instead of just allowing the two standard mappings, a
locale can contain others. Again, the @code{localedef} program
already supports generating such locale data files.

@comment wctype.h
@comment ISO
@deftp {Data Type} wctrans_t
This data type is defined as a scalar type which can hold a value
representing the locale-dependent character mapping. There is no way to
construct such a value apart from using the return value of the
@code{wctrans} function.

@pindex wctype.h
@noindent
This type is defined in @file{wctype.h}.
@end deftp

@comment wctype.h
@comment ISO
@deftypefun wctrans_t wctrans (const char *@var{property})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Similar implementation, same caveats as wctype.
The @code{wctrans} function has to be used to find out whether a named
mapping is defined in the current locale selected for the
@code{LC_CTYPE} category. If the returned value is nonzero, you can use
it afterwards in calls to @code{towctrans}. If the return value is
zero, no such mapping is known in the current locale.

Besides locale-specific mappings there are two mappings which are
guaranteed to be available in every locale:

@multitable @columnfractions .5 .5
@item
@code{"tolower"} @tab @code{"toupper"}
@end multitable

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

@comment wctype.h
@comment ISO
@deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c Same caveats as iswctype.
@code{towctrans} maps the input character @var{wc}
according to the rules of the mapping for which @var{desc} is a
descriptor, and returns the value it finds. @var{desc} must have been
obtained by a successful call to @code{wctrans}.

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

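As with the classification functions, the descriptor can be looked up
once and then applied to many characters. The following sketch maps a
whole wide string in place; the name @code{map_wide_string} and its
return convention are hypothetical choices made for this example.

```c
#include <wchar.h>
#include <wctype.h>

/* Apply the mapping named NAME to every wide character of S, in
   place.  Return 0 on success, or -1 if the current locale does not
   know a mapping of that name, in which case S is left unchanged.
   map_wide_string is a hypothetical helper name.  */
int
map_wide_string (wchar_t *s, const char *name)
{
  wctrans_t desc = wctrans (name);

  if (!desc)
    return -1;
  for (; *s != L'\0'; ++s)
    *s = (wchar_t) towctrans ((wint_t) *s, desc);
  return 0;
}
```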
For the generally available mappings, the @w{ISO C} standard defines
convenient shortcuts so that it is not necessary to call @code{wctrans}
for them.

@comment wctype.h
@comment ISO
@deftypefun wint_t towlower (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Same caveats as iswalnum, just using a wctrans rather than a wctype
@c table.
If @var{wc} is an upper-case letter, @code{towlower} returns the corresponding
lower-case letter. If @var{wc} is not an upper-case letter,
@var{wc} is returned unchanged.

@noindent
@code{towlower} can be implemented using

@smallexample
towctrans (wc, wctrans ("tolower"))
@end smallexample

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

@comment wctype.h
@comment ISO
@deftypefun wint_t towupper (wint_t @var{wc})
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
If @var{wc} is a lower-case letter, @code{towupper} returns the corresponding
upper-case letter. Otherwise @var{wc} is returned unchanged.

@noindent
@code{towupper} can be implemented using

@smallexample
towctrans (wc, wctrans ("toupper"))
@end smallexample

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

The same warnings given in the last section for the use of the wide
character classification functions apply here. It is not possible to
simply cast a @code{char} type value to a @code{wint_t} and use it as an
argument to @code{towctrans} calls.
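
A minimal sketch of the conversion step, assuming the input is a
single-byte character, converts with @code{btowc} before mapping. The
name @code{upcase_byte_as_wide} is hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/* Convert the single-byte character C to a wide character and return
   its upper-case form.  Return WEOF if C is EOF or is not a valid
   single-byte character in the current locale.
   upcase_byte_as_wide is a hypothetical helper name.  */
wint_t
upcase_byte_as_wide (int c)
{
  wint_t wc = btowc (c);

  if (wc == WEOF)
    return WEOF;
  return towupper (wc);
}
```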