Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Copyright (c) Bruno Haible <haible@clisp.cons.org> |
2 | .\" | |
89e3ffe9 | 3 | .\" %%%LICENSE_START(GPLv2+_DOC_ONEPARA) |
fea681da MK |
4 | .\" This is free documentation; you can redistribute it and/or |
5 | .\" modify it under the terms of the GNU General Public License as | |
6 | .\" published by the Free Software Foundation; either version 2 of | |
7 | .\" the License, or (at your option) any later version. | |
fe382ebf | 8 | .\" %%%LICENSE_END |
fea681da MK |
9 | .\" |
10 | .\" References consulted: | |
11 | .\" GNU glibc-2 source code and manual | |
008f1ecc | 12 | .\" OpenGroup's Single UNIX specification |
3876c0e5 | 13 | .\" http://www.UNIX-systems.org/online.html |
fea681da MK |
14 | .\" |
15 | .\" 2000-06-30 correction by Yuichi SATO <sato@complex.eng.hokudai.ac.jp> | |
16 | .\" 2000-11-15 aeb, fixed prototype | |
17 | .\" | |
4b8c67d9 | 18 | .TH ICONV 3 2017-09-15 "GNU" "Linux Programmer's Manual" |
fea681da MK |
19 | .SH NAME |
20 | iconv \- perform character set conversion | |
21 | .SH SYNOPSIS | |
22 | .nf | |
23 | .B #include <iconv.h> | |
68e4db0a | 24 | .PP |
fea681da | 25 | .BI "size_t iconv(iconv_t " cd , |
b9f02710 MK |
26 | .BI " char **" inbuf ", size_t *" inbytesleft , |
27 | .BI " char **" outbuf ", size_t *" outbytesleft ); | |
fea681da MK |
28 | .fi |
29 | .SH DESCRIPTION | |
f9b75bd4 MK |
30 | The |
31 | .BR iconv () | |
32 | function converts a sequence of characters in one character encoding | |
33 | to a sequence of characters in another character encoding. | |
34 | The | |
35 | .I cd | |
36 | argument is a conversion descriptor, | |
37 | previously created by a call to | |
38 | .BR iconv_open (3); | |
39 | the conversion descriptor defines the character encodings that | |
40 | .BR iconv () | |
41 | uses for the conversion. | |
42 | The | |
43 | .I inbuf | |
44 | argument is the address of a variable that points to | |
45 | the first character of the input sequence; | |
46 | .I inbytesleft | |
47 | indicates the number of bytes in that buffer. | |
48 | The | |
49 | .I outbuf | |
50 | argument is the address of a variable that points to | |
51 | the first byte available in the output buffer; | |
52 | .I outbytesleft | |
53 | indicates the number of bytes available in the output buffer. | |
fea681da MK |
54 | .PP |
55 | The main case is when \fIinbuf\fP is not NULL and \fI*inbuf\fP is not NULL. | |
60a90ecd MK |
56 | In this case, the |
57 | .BR iconv () | |
58 | function converts the multibyte sequence | |
fea681da MK |
59 | starting at \fI*inbuf\fP to a multibyte sequence starting at \fI*outbuf\fP. |
60 | At most \fI*inbytesleft\fP bytes, starting at \fI*inbuf\fP, will be read. | |
61 | At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written. | |
62 | .PP | |
60a90ecd MK |
63 | The |
64 | .BR iconv () | |
65 | function converts one multibyte character at a time, and for | |
fea681da MK |
66 | each character conversion it increments \fI*inbuf\fP and decrements |
67 | \fI*inbytesleft\fP by the number of converted input bytes, it increments | |
68 | \fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of converted | |
69 | output bytes, and it updates the conversion state contained in \fIcd\fP. | |
037273a6 MK |
70 | If the character encoding of the input is stateful, the |
71 | .BR iconv () | |
72 | function can also convert a sequence of input bytes | |
73 | to an update to the conversion state without producing any output bytes; | |
74 | such input is called a \fIshift sequence\fP. | |
fea681da | 75 | The conversion can stop for four reasons: |
f9b75bd4 MK |
76 | .IP 1. 3 |
77 | An invalid multibyte sequence is encountered in the input. | |
cfa65d73 | 78 | In this case, |
7d2cb9d5 | 79 | it sets \fIerrno\fP to \fBEILSEQ\fP and returns |
009df872 | 80 | .IR (size_t)\ \-1 . |
7d2cb9d5 | 81 | \fI*inbuf\fP |
fea681da | 82 | is left pointing to the beginning of the invalid multibyte sequence. |
f9b75bd4 MK |
83 | .IP 2. |
84 | The input byte sequence has been entirely converted, | |
75b94dc3 | 85 | that is, \fI*inbytesleft\fP has gone down to 0. |
cfa65d73 | 86 | In this case, |
60a90ecd MK |
87 | .BR iconv () |
88 | returns the number of | |
24b74457 | 89 | nonreversible conversions performed during this call. |
f9b75bd4 MK |
90 | .IP 3. |
91 | An incomplete multibyte sequence is encountered in the input, and the | |
c13182ef | 92 | input byte sequence terminates after it. |
fa51e6a4 | 93 | In this case, it sets \fIerrno\fP to |
7d2cb9d5 | 94 | \fBEINVAL\fP and returns |
009df872 | 95 | .IR (size_t)\ \-1 . |
7d2cb9d5 | 96 | \fI*inbuf\fP is left pointing to the |
fea681da | 97 | beginning of the incomplete multibyte sequence. |
f9b75bd4 MK |
98 | .IP 4. |
99 | The output buffer has no more room for the next converted character. | |
fa51e6a4 | 100 | In this case, it sets \fIerrno\fP to \fBE2BIG\fP and returns |
009df872 | 101 | .IR (size_t)\ \-1 . |
fea681da MK |
102 | .PP |
103 | A different case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, but | |
c13182ef MK |
104 | \fIoutbuf\fP is not NULL and \fI*outbuf\fP is not NULL. |
105 | In this case, the | |
60a90ecd MK |
106 | .BR iconv () |
107 | function attempts to set \fIcd\fP's conversion state to the | |
fea681da MK |
108 | initial state and store a corresponding shift sequence at \fI*outbuf\fP. |
109 | At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written. | |
110 | If the output buffer has no more room for this reset sequence, it sets | |
7d2cb9d5 | 111 | \fIerrno\fP to \fBE2BIG\fP and returns |
009df872 | 112 | .IR (size_t)\ \-1 . |
2b9b829d | 113 | Otherwise, it increments |
fea681da MK |
114 | \fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of bytes |
115 | written. | |
116 | .PP | |
117 | A third case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, and | |
c13182ef | 118 | \fIoutbuf\fP is NULL or \fI*outbuf\fP is NULL. |
60a90ecd MK |
119 | In this case, the |
120 | .BR iconv () | |
fea681da | 121 | function sets \fIcd\fP's conversion state to the initial state. |
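The two NULL-input cases above can be wrapped in small helpers. This is only a sketch; the names emit_reset_sequence and reset_descriptor are hypothetical and not part of any library.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper for the second case: return cd to its initial
   state and store the corresponding shift sequence, if any, at *outp.
   *outp and *outleft are advanced by iconv().  Returns 0 on success,
   or -1 with errno set (for example to E2BIG). */
int emit_reset_sequence(iconv_t cd, char **outp, size_t *outleft)
{
    return iconv(cd, NULL, NULL, outp, outleft) == (size_t) -1 ? -1 : 0;
}

/* Hypothetical helper for the third case: return cd to its initial
   state without writing any output. */
int reset_descriptor(iconv_t cd)
{
    return iconv(cd, NULL, NULL, NULL, NULL) == (size_t) -1 ? -1 : 0;
}
```

For a stateless target encoding, emit_reset_sequence() would be expected to write nothing and leave *outleft unchanged.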
47297adb | 122 | .SH RETURN VALUE |
60a90ecd MK |
123 | The |
124 | .BR iconv () | |
125 | function returns the number of characters converted in a | |
24b74457 | 126 | nonreversible way during this call; reversible conversions are not counted. |
7d2cb9d5 | 127 | In case of error, it sets \fIerrno\fP and returns |
009df872 | 128 | .IR (size_t)\ \-1 . |
fea681da MK |
129 | .SH ERRORS |
130 | The following errors can occur, among others: | |
131 | .TP | |
132 | .B E2BIG | |
133 | There is not sufficient room at \fI*outbuf\fP. | |
134 | .TP | |
135 | .B EILSEQ | |
136 | An invalid multibyte sequence has been encountered in the input. | |
137 | .TP | |
138 | .B EINVAL | |
139 | An incomplete multibyte sequence has been encountered in the input. | |
3fd4929b MK |
140 | .SH VERSIONS |
141 | This function is available in glibc since version 2.1. | |
e5ea2ae7 | 142 | .SH ATTRIBUTES |
fe3ac773 MK |
143 | For an explanation of the terms used in this section, see |
144 | .BR attributes (7). | |
145 | .TS | |
146 | allbox; | |
147 | lb lb lb | |
148 | l l l. | |
149 | Interface Attribute Value | |
150 | T{ | |
e5ea2ae7 | 151 | .BR iconv () |
a353619d | 152 | T} Thread safety MT-Safe race:cd |
fe3ac773 | 153 | .TE |
a353619d PH |
154 | .PP |
155 | The | |
156 | .BR iconv () | |
157 | function is MT-Safe, as long as callers arrange for | |
158 | mutual exclusion on the | |
159 | .I cd | |
160 | argument. | |
47297adb | 161 | .SH CONFORMING TO |
21151fe5 | 162 | POSIX.1-2001, POSIX.1-2008. |
2d17a61d | 163 | .SH NOTES |
c89ca436 AB |
164 | In each series of calls to |
165 | .BR iconv (), | |
166 | the last should be one with \fIinbuf\fP or \fI*inbuf\fP equal to NULL, | |
167 | in order to flush out any partially converted input. | |
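A series of calls that ends with such a flushing call might look like the sketch below. The helper name convert_all, the initial buffer size, and the doubling growth policy are assumptions chosen for illustration; other buffer strategies work equally well.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper: convert `inlen` bytes at `in` from encoding
   `from` to encoding `to`.  Returns a malloc()ed buffer and stores its
   length in *outlen, or returns NULL with errno set. */
char *convert_all(const char *to, const char *from,
                  const char *in, size_t inlen, size_t *outlen)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t) -1)
        return NULL;

    size_t cap = inlen + 16;            /* arbitrary starting size */
    char *out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        errno = ENOMEM;
        return NULL;
    }

    char *inp = (char *) in;            /* iconv() only reads the input */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = cap;
    int flushing = 0;

    for (;;) {
        /* Once the input is consumed, make one last call with a NULL
           inbuf so that any pending shift sequence is written. */
        size_t rc = flushing
            ? iconv(cd, NULL, NULL, &outp, &outleft)
            : iconv(cd, &inp, &inleft, &outp, &outleft);

        if (rc != (size_t) -1) {
            if (flushing)
                break;
            flushing = 1;
            continue;
        }
        if (errno != E2BIG)
            goto fail;                  /* EILSEQ, EINVAL, ... */

        /* Output buffer full: double it and retry from where we stopped. */
        size_t used = (size_t) (outp - out);
        char *bigger = realloc(out, cap * 2);
        if (bigger == NULL)
            goto fail;
        out = bigger;
        cap *= 2;
        outp = out + used;
        outleft = cap - used;
    }

    iconv_close(cd);
    *outlen = (size_t) (outp - out);
    return out;

fail:
    {
        int saved = errno;              /* keep the original error code */
        free(out);
        iconv_close(cd);
        errno = saved;
    }
    return NULL;
}
```

The caller frees the returned buffer with free(). Because the result may contain zero bytes, its length is reported through *outlen rather than by a terminating null byte.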
847e0d88 | 168 | .PP |
2d17a61d MK |
169 | Although |
170 | .I inbuf | |
171 | and | |
172 | .I outbuf | |
173 | are typed as | |
174 | .IR "char\ **" , | |
175 | this does not mean that the objects they point to can be interpreted | |
176 | as C strings or as arrays of characters: | |
177 | the interpretation of character byte sequences is | |
178 | handled internally by the conversion functions. | |
179 | In some encodings, a zero byte may be a valid part of a multibyte character. | |
847e0d88 | 180 | .PP |
2d17a61d MK |
181 | The caller of |
182 | .BR iconv () | |
183 | must ensure that the pointers passed to the function are suitable | |
184 | for accessing characters in the appropriate character set. | |
185 | This includes ensuring correct alignment on platforms that have | |
186 | tight restrictions on alignment. | |
47297adb | 187 | .SH SEE ALSO |
fea681da | 188 | .BR iconv_close (3), |
054f9fb2 MK |
189 | .BR iconv_open (3), |
190 | .BR iconvconfig (8) |