.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.\" References consulted:
.\" GNU glibc-2 source code and manual
.\" OpenGroup's Single UNIX specification
.\" http://www.UNIX-systems.org/online.html
.\"
.\" 2000-06-30 correction by Yuichi SATO <sato@complex.eng.hokudai.ac.jp>
.\" 2000-11-15 aeb, fixed prototype
.\"
.TH iconv 3 (date) "Linux man-pages (unreleased)"
.SH NAME
iconv \- perform character set conversion
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <iconv.h>
.PP
.BI "size_t iconv(iconv_t " cd ,
.BI "             char **restrict " inbuf ", size_t *restrict " inbytesleft ,
.BI "             char **restrict " outbuf ", size_t *restrict " outbytesleft );
.fi
.SH DESCRIPTION
The
.BR iconv ()
function converts a sequence of characters in one character encoding
to a sequence of characters in another character encoding.
The
.I cd
argument is a conversion descriptor,
previously created by a call to
.BR iconv_open (3);
the conversion descriptor defines the character encodings that
.BR iconv ()
uses for the conversion.
The
.I inbuf
argument is the address of a variable that points to
the first character of the input sequence;
.I inbytesleft
indicates the number of bytes in that buffer.
The
.I outbuf
argument is the address of a variable that points to
the first byte available in the output buffer;
.I outbytesleft
indicates the number of bytes available in the output buffer.
.PP
The main case is when \fIinbuf\fP is not NULL and \fI*inbuf\fP is not NULL.
In this case, the
.BR iconv ()
function converts the multibyte sequence
starting at \fI*inbuf\fP to a multibyte sequence starting at \fI*outbuf\fP.
At most \fI*inbytesleft\fP bytes, starting at \fI*inbuf\fP, will be read.
At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written.
.PP
The
.BR iconv ()
function converts one multibyte character at a time.
For each character conversion, it increments \fI*inbuf\fP and decrements
\fI*inbytesleft\fP by the number of converted input bytes,
increments \fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number
of converted output bytes,
and updates the conversion state contained in \fIcd\fP.
If the character encoding of the input is stateful, the
.BR iconv ()
function can also convert a sequence of input bytes
to an update to the conversion state without producing any output bytes;
such input is called a \fIshift sequence\fP.
The conversion can stop for four reasons:
.IP \(bu 3
An invalid multibyte sequence is encountered in the input.
In this case,
it sets \fIerrno\fP to \fBEILSEQ\fP and returns
.IR (size_t)\ \-1 .
\fI*inbuf\fP
is left pointing to the beginning of the invalid multibyte sequence.
.IP \(bu
The input byte sequence has been entirely converted,
that is, \fI*inbytesleft\fP has gone down to 0.
In this case,
.BR iconv ()
returns the number of
nonreversible conversions performed during this call.
.IP \(bu
An incomplete multibyte sequence is encountered in the input, and the
input byte sequence terminates after it.
In this case, it sets \fIerrno\fP to
\fBEINVAL\fP and returns
.IR (size_t)\ \-1 .
\fI*inbuf\fP is left pointing to the
beginning of the incomplete multibyte sequence.
.IP \(bu
The output buffer has no more room for the next converted character.
In this case, it sets \fIerrno\fP to \fBE2BIG\fP and returns
.IR (size_t)\ \-1 .
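.PP
As a rough illustration of the four stop conditions listed above,
the following sketch wraps a single
.BR iconv ()
call and maps its outcome onto an enumeration.
The names
.I stop_reason
and
.I convert_step
are hypothetical and not part of any library;
only
.BR iconv ()
and the
.I errno
values are taken from this page.
.PP
.EX
```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical labels for the ways a conversion call can stop. */
enum stop_reason {
    STOP_DONE,          /* input entirely converted */
    STOP_INVALID,       /* EILSEQ: invalid multibyte sequence */
    STOP_INCOMPLETE,    /* EINVAL: input ends inside a sequence */
    STOP_NO_ROOM,       /* E2BIG: output buffer is full */
    STOP_OTHER          /* any other error */
};

/* Call iconv() once and report why it stopped.
   iconv() itself advances *in and *out and decrements the counts. */
static enum stop_reason
convert_step(iconv_t cd, char **in, size_t *inleft,
             char **out, size_t *outleft)
{
    if (iconv(cd, in, inleft, out, outleft) != (size_t) -1)
        return STOP_DONE;

    switch (errno) {
    case EILSEQ:
        return STOP_INVALID;
    case EINVAL:
        return STOP_INCOMPLETE;
    case E2BIG:
        return STOP_NO_ROOM;
    default:
        return STOP_OTHER;
    }
}
```
.EE
.PP
A caller might, for example, supply a larger output buffer on
.B STOP_NO_ROOM
and append more input on
.BR STOP_INCOMPLETE .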
.PP
A different case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, but
\fIoutbuf\fP is not NULL and \fI*outbuf\fP is not NULL.
In this case, the
.BR iconv ()
function attempts to set \fIcd\fP's conversion state to the
initial state and store a corresponding shift sequence at \fI*outbuf\fP.
At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written.
If the output buffer has no more room for this reset sequence, it sets
\fIerrno\fP to \fBE2BIG\fP and returns
.IR (size_t)\ \-1 .
Otherwise, it increments
\fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of bytes
written.
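.PP
The flushing call described above could be wrapped roughly as follows.
This is a minimal sketch;
.I flush_shift_state
is a hypothetical helper name.
For a stateless output encoding, the call is expected to write nothing,
so the helper would report zero bytes.
.PP
.EX
```c
#include <iconv.h>
#include <stddef.h>

/* Ask iconv() to write the shift sequence that returns cd to its
   initial state.  Returns the number of bytes written at *out, or
   (size_t) -1 on error, with errno as set by iconv(). */
static size_t
flush_shift_state(iconv_t cd, char **out, size_t *outleft)
{
    size_t before = *outleft;

    /* A NULL inbuf selects the "reset and emit shift sequence" case. */
    if (iconv(cd, NULL, NULL, out, outleft) == (size_t) -1)
        return (size_t) -1;

    return before - *outleft;
}
```
.EE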
.PP
A third case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, and
\fIoutbuf\fP is NULL or \fI*outbuf\fP is NULL.
In this case, the
.BR iconv ()
function sets \fIcd\fP's conversion state to the initial state.
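.PP
One possible use of this third case is to reuse a conversion descriptor
for unrelated inputs, for instance after an earlier conversion failed
partway through.
The sketch below assumes that use;
.I reset_conversion
and
.I convert_fresh
are hypothetical names.
.PP
.EX
```c
#include <iconv.h>
#include <stddef.h>

/* Return cd to its initial state without producing any output. */
static void
reset_conversion(iconv_t cd)
{
    (void) iconv(cd, NULL, NULL, NULL, NULL);
}

/* Convert one self-contained input, resetting cd first so that state
   left over from an earlier conversion is not carried along.
   Returns the number of output bytes, or (size_t) -1 on error. */
static size_t
convert_fresh(iconv_t cd, char *in, size_t inlen,
              char *out, size_t outcap)
{
    size_t outleft = outcap;

    reset_conversion(cd);
    if (iconv(cd, &in, &inlen, &out, &outleft) == (size_t) -1)
        return (size_t) -1;

    return outcap - outleft;
}
```
.EE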
.SH RETURN VALUE
The
.BR iconv ()
function returns the number of characters converted in a
nonreversible way during this call; reversible conversions are not counted.
In case of error,
.BR iconv ()
returns
.I (size_t)\ \-1
and sets
.I errno
to indicate the error.
.SH ERRORS
The following errors can occur, among others:
.TP
.B E2BIG
There is not sufficient room at \fI*outbuf\fP.
.TP
.B EILSEQ
An invalid multibyte sequence has been encountered in the input.
.TP
.B EINVAL
An incomplete multibyte sequence has been encountered in the input.
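.PP
One common way to react to
.B E2BIG
is to enlarge the output buffer and call
.BR iconv ()
again, since the updated pointers and counts record how far it got.
The following is only a sketch of that idea:
.I convert_alloc
is a hypothetical helper, and the initial capacity of 16 bytes and the
doubling strategy are arbitrary choices.
Other errors are passed back to the caller unchanged.
.PP
.EX
```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Convert inlen bytes at in into a malloc()ed buffer that is doubled
   whenever iconv() reports E2BIG.  On success, returns the buffer and
   stores its length in *outlen; on failure, returns NULL with errno
   set by iconv() or by the allocator. */
static char *
convert_alloc(iconv_t cd, char *in, size_t inlen, size_t *outlen)
{
    size_t cap = 16, used = 0;
    int flushing = 0;   /* 0: converting input; 1: writing reset sequence */
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        char *out = buf + used;
        size_t outleft = cap - used;
        size_t r = flushing ? iconv(cd, NULL, NULL, &out, &outleft)
                            : iconv(cd, &in, &inlen, &out, &outleft);

        used = cap - outleft;       /* keep whatever was written */
        if (r != (size_t) -1) {
            if (flushing)
                break;              /* input and reset sequence done */
            flushing = 1;
            continue;
        }

        if (errno == E2BIG) {       /* grow the buffer and retry */
            char *tmp = realloc(buf, cap * 2);
            if (tmp != NULL) {
                buf = tmp;
                cap *= 2;
                continue;
            }
        }

        /* EILSEQ, EINVAL, allocation failure, or another error. */
        int saved = errno;
        free(buf);
        errno = saved;
        return NULL;
    }

    *outlen = used;
    return buf;
}
```
.EE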
.SH VERSIONS
This function is available in glibc since version 2.1.
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
.ad l
.nh
.TS
allbox;
lbx lb lb
l l l.
Interface	Attribute	Value
T{
.BR iconv ()
T}	Thread safety	MT-Safe race:cd
.TE
.hy
.ad
.sp 1
.PP
The
.BR iconv ()
function is MT-Safe, as long as callers arrange for
mutual exclusion on the
.I cd
argument.
.SH STANDARDS
POSIX.1-2001, POSIX.1-2008.
.SH NOTES
In each series of calls to
.BR iconv (),
the last should be one with \fIinbuf\fP or \fI*inbuf\fP equal to NULL,
in order to flush out any partially converted input.
.PP
Although
.I inbuf
and
.I outbuf
are typed as
.IR "char\ **" ,
this does not mean that the objects they point to can be interpreted
as C strings or as arrays of characters:
the interpretation of character byte sequences is
handled internally by the conversion functions.
In some encodings, a zero byte may be a valid part of a multibyte character.
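.PP
To make the point about zero bytes concrete, the sketch below converts
UTF-8 text to UTF-16LE and derives the output length from the change in
.I outbytesleft
rather than from
.BR strlen (3).
The helper name
.I utf8_to_utf16le
is hypothetical, and the sketch assumes that the implementation provides
the encoding names "UTF-8" and "UTF-16LE".
.PP
.EX
```c
#include <iconv.h>
#include <stddef.h>

/* Convert UTF-8 input to UTF-16LE and return the number of output
   bytes, or (size_t) -1 on error.  The length is computed from
   outbytesleft because UTF-16 output routinely contains zero bytes,
   so the result cannot be measured as a C string. */
static size_t
utf8_to_utf16le(char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
    size_t outleft = outcap;
    size_t r;

    if (cd == (iconv_t) -1)
        return (size_t) -1;

    r = iconv(cd, &in, &inlen, &out, &outleft);
    iconv_close(cd);

    return r == (size_t) -1 ? (size_t) -1 : outcap - outleft;
}
```
.EE
.PP
With the ASCII input "AB", the second output byte is zero,
so treating the result as a C string would see only one byte.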
.PP
The caller of
.BR iconv ()
must ensure that the pointers passed to the function are suitable
for accessing characters in the appropriate character set.
This includes ensuring correct alignment on platforms that have
tight restrictions on alignment.
.SH SEE ALSO
.BR iconv_close (3),
.BR iconv_open (3),
.BR iconvconfig (8)