Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Copyright (c) Bruno Haible <haible@clisp.cons.org> |
2 | .\" | |
3 | .\" This is free documentation; you can redistribute it and/or | |
4 | .\" modify it under the terms of the GNU General Public License as | |
5 | .\" published by the Free Software Foundation; either version 2 of | |
6 | .\" the License, or (at your option) any later version. | |
7 | .\" | |
8 | .\" References consulted: | |
9 | .\" GNU glibc-2 source code and manual | |
008f1ecc | 10 | .\" OpenGroup's Single UNIX specification |
3876c0e5 | 11 | .\" http://www.UNIX-systems.org/online.html |
fea681da MK |
12 | .\" |
13 | .\" 2000-06-30 correction by Yuichi SATO <sato@complex.eng.hokudai.ac.jp> | |
14 | .\" 2000-11-15 aeb, fixed prototype | |
15 | .\" | |
eae2dfce | 16 | .TH ICONV 3 2012-05-10 "GNU" "Linux Programmer's Manual" |
fea681da MK |
17 | .SH NAME |
18 | iconv \- perform character set conversion | |
19 | .SH SYNOPSIS | |
20 | .nf | |
21 | .B #include <iconv.h> | |
22 | .sp | |
23 | .BI "size_t iconv(iconv_t " cd , | |
b9f02710 MK |
24 | .BI " char **" inbuf ", size_t *" inbytesleft , |
25 | .BI " char **" outbuf ", size_t *" outbytesleft ); | |
fea681da MK |
26 | .fi |
27 | .SH DESCRIPTION | |
f9b75bd4 MK |
28 | The |
29 | .BR iconv () | |
30 | function converts a sequence of characters in one character encoding | |
31 | to a sequence of characters in another character encoding. | |
32 | The | |
33 | .I cd | |
34 | argument is a conversion descriptor, | |
35 | previously created by a call to | |
36 | .BR iconv_open (3); | |
37 | the conversion descriptor defines the character encodings that | |
38 | .BR iconv () | |
39 | uses for the conversion. | |
40 | The | |
41 | .I inbuf | |
42 | argument is the address of a variable that points to | |
43 | the first character of the input sequence; | |
44 | .I inbytesleft | |
45 | indicates the number of bytes in that buffer. | |
46 | The | |
47 | .I outbuf | |
48 | argument is the address of a variable that points to | |
49 | the first byte available in the output buffer; | |
50 | .I outbytesleft | |
51 | indicates the number of bytes available in the output buffer. | |
fea681da MK |
52 | .PP |
53 | The main case is when \fIinbuf\fP is not NULL and \fI*inbuf\fP is not NULL. | |
60a90ecd MK |
54 | In this case, the |
55 | .BR iconv () | |
56 | function converts the multibyte sequence | |
fea681da MK |
57 | starting at \fI*inbuf\fP to a multibyte sequence starting at \fI*outbuf\fP. |
58 | At most \fI*inbytesleft\fP bytes, starting at \fI*inbuf\fP, will be read. | |
59 | At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written. | |
60 | .PP | |
60a90ecd MK |
61 | The |
62 | .BR iconv () | |
63 | function converts one multibyte character at a time, and for | |
fea681da MK |
64 | each character conversion it increments \fI*inbuf\fP and decrements |
65 | \fI*inbytesleft\fP by the number of converted input bytes, it increments | |
66 | \fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of converted | |
67 | output bytes, and it updates the conversion state contained in \fIcd\fP. | |
037273a6 MK |
68 | If the character encoding of the input is stateful, the |
69 | .BR iconv () | |
70 | function can also convert a sequence of input bytes | |
71 | to an update to the conversion state without producing any output bytes; | |
72 | such input is called a \fIshift sequence\fP. | |
fea681da | 73 | The conversion can stop for four reasons: |
f9b75bd4 MK |
74 | .IP 1. 3 |
75 | An invalid multibyte sequence is encountered in the input. | |
c13182ef | 76 | In this case |
7d2cb9d5 | 77 | it sets \fIerrno\fP to \fBEILSEQ\fP and returns |
009df872 | 78 | .IR (size_t)\ \-1 . |
7d2cb9d5 | 79 | \fI*inbuf\fP |
fea681da | 80 | is left pointing to the beginning of the invalid multibyte sequence. |
f9b75bd4 MK |
81 | .IP 2. |
82 | The input byte sequence has been entirely converted, | |
75b94dc3 | 83 | that is, \fI*inbytesleft\fP has gone down to 0. |
60a90ecd MK |
84 | In this case |
85 | .BR iconv () | |
86 | returns the number of | |
24b74457 | 87 | nonreversible conversions performed during this call. |
f9b75bd4 MK |
88 | .IP 3. |
89 | An incomplete multibyte sequence is encountered in the input, and the | |
c13182ef MK |
90 | input byte sequence terminates after it. |
91 | In this case it sets \fIerrno\fP to | |
7d2cb9d5 | 92 | \fBEINVAL\fP and returns |
009df872 | 93 | .IR (size_t)\ \-1 . |
7d2cb9d5 | 94 | \fI*inbuf\fP is left pointing to the |
fea681da | 95 | beginning of the incomplete multibyte sequence. |
f9b75bd4 MK |
96 | .IP 4. |
97 | The output buffer has no more room for the next converted character. | |
7d2cb9d5 | 98 | In this case it sets \fIerrno\fP to \fBE2BIG\fP and returns |
009df872 | 99 | .IR (size_t)\ \-1 . |
fea681da MK |
100 | .PP |
101 | A different case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, but | |
c13182ef MK |
102 | \fIoutbuf\fP is not NULL and \fI*outbuf\fP is not NULL. |
103 | In this case, the | |
60a90ecd MK |
104 | .BR iconv () |
105 | function attempts to set \fIcd\fP's conversion state to the | |
fea681da MK |
106 | initial state and store a corresponding shift sequence at \fI*outbuf\fP. |
107 | At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written. | |
108 | If the output buffer has no more room for this reset sequence, it sets | |
7d2cb9d5 | 109 | \fIerrno\fP to \fBE2BIG\fP and returns |
009df872 | 110 | .IR (size_t)\ \-1 . |
c13182ef | 111 | Otherwise it increments |
fea681da MK |
112 | \fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of bytes |
113 | written. | |
114 | .PP | |
115 | A third case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, and | |
c13182ef | 116 | \fIoutbuf\fP is NULL or \fI*outbuf\fP is NULL. |
60a90ecd MK |
117 | In this case, the |
118 | .BR iconv () | |
fea681da | 119 | function sets \fIcd\fP's conversion state to the initial state. |
47297adb | 120 | .SH RETURN VALUE |
60a90ecd MK |
121 | The |
122 | .BR iconv () | |
123 | function returns the number of characters converted in a | |
24b74457 | 124 | nonreversible way during this call; reversible conversions are not counted. |
7d2cb9d5 | 125 | In case of error, it sets \fIerrno\fP and returns |
009df872 | 126 | .IR (size_t)\ \-1 . |
fea681da MK |
127 | .SH ERRORS |
128 | The following errors can occur, among others: | |
129 | .TP | |
130 | .B E2BIG | |
131 | There is not sufficient room at \fI*outbuf\fP. | |
132 | .TP | |
133 | .B EILSEQ | |
134 | An invalid multibyte sequence has been encountered in the input. | |
135 | .TP | |
136 | .B EINVAL | |
137 | An incomplete multibyte sequence has been encountered in the input. | |
3fd4929b MK |
138 | .SH VERSIONS |
139 | This function is available in glibc since version 2.1. | |
47297adb | 140 | .SH CONFORMING TO |
3876c0e5 | 141 | POSIX.1-2001. |
2d17a61d MK |
142 | .SH NOTES |
143 | Although | |
144 | .I inbuf | |
145 | and | |
146 | .I outbuf | |
147 | are typed as | |
148 | .IR "char\ **" , | |
149 | this does not mean that the objects they point to can be interpreted | |
150 | as C strings or as arrays of characters: | |
151 | the interpretation of character byte sequences is | |
152 | handled internally by the conversion functions. | |
153 | In some encodings, a zero byte may be a valid part of a multibyte character. | |
154 | ||
155 | The caller of | |
156 | .BR iconv () | |
157 | must ensure that the pointers passed to the function are suitable | |
158 | for accessing characters in the appropriate character set. | |
159 | This includes ensuring correct alignment on platforms that have | |
160 | tight restrictions on alignment. | |
47297adb | 161 | .SH SEE ALSO |
fea681da MK |
162 | .BR iconv_close (3), |
163 | .BR iconv_open (3) |
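
The calling sequence described under DESCRIPTION can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name `convert_buffer` and its parameters are hypothetical and not part of the iconv API. It performs one conversion call (the main case), then a call with a NULL `inbuf` to store any shift sequence needed to return to the initial state. The `const` cast assumes that `iconv()` only reads the input bytes.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert inlen bytes at in from encoding `from`
 * to encoding `to`, writing at most outcap bytes at out.
 * Returns the number of bytes written, or -1 with errno set
 * (E2BIG, EILSEQ, EINVAL, or an error from iconv_open()).
 */
long convert_buffer(const char *to, const char *from,
                    const char *in, size_t inlen,
                    char *out, size_t outcap)
{
    assert(in != NULL && out != NULL);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t) -1)
        return -1;

    /* iconv() takes char **; this sketch assumes the input is only read. */
    char *inp = (char *) in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    /* Main case: iconv() advances inp/outp and decrements the counts. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);

    /* Second case: NULL inbuf, non-NULL outbuf. Resets the conversion
       state and stores the corresponding shift sequence, if any. */
    if (rc != (size_t) -1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);

    /* Preserve errno across iconv_close(). */
    int saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;
        return -1;
    }
    return (long) (outcap - outleft);
}
```

The input length is passed explicitly rather than computed with `strlen()`, since a zero byte may be a valid part of a multibyte character in some encodings. A caller that receives -1 with `errno` set to `E2BIG` would typically retry with a larger output buffer; this sketch leaves that policy to the caller.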