.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" References consulted:
.\" GNU glibc-2 source code and manual
.\" OpenGroup's Single UNIX specification
.\" http://www.UNIX-systems.org/online.html
.\"
.\" 2000-06-30 correction by Yuichi SATO <sato@complex.eng.hokudai.ac.jp>
.\" 2000-11-15 aeb, fixed prototype
.\"
.TH ICONV 3 2012-05-10 "GNU" "Linux Programmer's Manual"
.SH NAME
iconv \- perform character set conversion
.SH SYNOPSIS
.nf
.B #include <iconv.h>
.sp
.BI "size_t iconv(iconv_t " cd ,
.BI "             char **" inbuf ", size_t *" inbytesleft ,
.BI "             char **" outbuf ", size_t *" outbytesleft );
.fi
.SH DESCRIPTION
The
.BR iconv ()
function converts a sequence of characters in one character encoding
to a sequence of characters in another character encoding.
The
.I cd
argument is a conversion descriptor,
previously created by a call to
.BR iconv_open (3);
the conversion descriptor defines the character encodings that
.BR iconv ()
uses for the conversion.
The
.I inbuf
argument is the address of a variable that points to
the first character of the input sequence;
.I inbytesleft
indicates the number of bytes in that buffer.
The
.I outbuf
argument is the address of a variable that points to
the first byte available in the output buffer;
.I outbytesleft
indicates the number of bytes available in the output buffer.
.PP
The main case is when \fIinbuf\fP is not NULL and \fI*inbuf\fP is not NULL.
In this case, the
.BR iconv ()
function converts the multibyte sequence
starting at \fI*inbuf\fP to a multibyte sequence starting at \fI*outbuf\fP.
At most \fI*inbytesleft\fP bytes, starting at \fI*inbuf\fP, will be read.
At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written.
.PP
The
.BR iconv ()
function converts one multibyte character at a time, and for
each character conversion it increments \fI*inbuf\fP and decrements
\fI*inbytesleft\fP by the number of converted input bytes, it increments
\fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of converted
output bytes, and it updates the conversion state contained in \fIcd\fP.
If the character encoding of the input is stateful, the
.BR iconv ()
function can also convert a sequence of input bytes
to an update to the conversion state without producing any output bytes;
such input is called a \fIshift sequence\fP.
The conversion can stop for four reasons:
.IP 1. 3
An invalid multibyte sequence is encountered in the input.
In this case
it sets \fIerrno\fP to \fBEILSEQ\fP and returns
.IR (size_t)\ \-1 .
\fI*inbuf\fP
is left pointing to the beginning of the invalid multibyte sequence.
.IP 2.
The input byte sequence has been entirely converted,
that is, \fI*inbytesleft\fP has gone down to 0.
In this case
.BR iconv ()
returns the number of
nonreversible conversions performed during this call.
.IP 3.
An incomplete multibyte sequence is encountered in the input, and the
input byte sequence terminates after it.
In this case it sets \fIerrno\fP to
\fBEINVAL\fP and returns
.IR (size_t)\ \-1 .
\fI*inbuf\fP is left pointing to the
beginning of the incomplete multibyte sequence.
.IP 4.
The output buffer has no more room for the next converted character.
In this case it sets \fIerrno\fP to \fBE2BIG\fP and returns
.IR (size_t)\ \-1 .
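.PP
The four stop conditions listed above can be told apart by checking the
return value and then \fIerrno\fP.
The following is a minimal sketch, not part of the iconv API:
the type
.I enum stop_reason
and the helper
.I classify_stop
are hypothetical names introduced here for illustration,
and the 64-byte scratch buffer is an arbitrary choice.
.nf

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of <iconv.h>. */
enum stop_reason {
    STOP_DONE,        /* input entirely converted */
    STOP_ILSEQ,       /* EILSEQ: invalid multibyte sequence */
    STOP_INCOMPLETE,  /* EINVAL: input ends inside a multibyte sequence */
    STOP_NOROOM,      /* E2BIG: no room left in the output buffer */
    STOP_OTHER        /* any other failure, including iconv_open() */
};

/*
 * Run one iconv() call over inlen bytes of in, using a scratch output
 * buffer limited to outcap bytes (at most 64), and report why the
 * conversion stopped.
 */
enum stop_reason
classify_stop(const char *tocode, const char *fromcode,
              const char *in, size_t inlen, size_t outcap)
{
    char out[64];
    char *inp = (char *) in;   /* iconv() does not modify the input bytes */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap < sizeof(out) ? outcap : sizeof(out);
    enum stop_reason why = STOP_DONE;

    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return STOP_OTHER;

    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t) -1) {
        switch (errno) {
        case EILSEQ: why = STOP_ILSEQ;      break;
        case EINVAL: why = STOP_INCOMPLETE; break;
        case E2BIG:  why = STOP_NOROOM;     break;
        default:     why = STOP_OTHER;      break;
        }
    }

    iconv_close(cd);
    return why;
}
```

.fi
.PP
A real caller would typically react to each reason differently:
enlarge or drain the output buffer on \fBE2BIG\fP,
read more input on \fBEINVAL\fP,
and skip or reject the offending bytes on \fBEILSEQ\fP.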
.PP
A different case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, but
\fIoutbuf\fP is not NULL and \fI*outbuf\fP is not NULL.
In this case, the
.BR iconv ()
function attempts to set \fIcd\fP's conversion state to the
initial state and store a corresponding shift sequence at \fI*outbuf\fP.
At most \fI*outbytesleft\fP bytes, starting at \fI*outbuf\fP, will be written.
If the output buffer has no more room for this reset sequence, it sets
\fIerrno\fP to \fBE2BIG\fP and returns
.IR (size_t)\ \-1 .
Otherwise it increments
\fI*outbuf\fP and decrements \fI*outbytesleft\fP by the number of bytes
written.
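.PP
As a sketch of this case, a caller that has converted all of its input
can finish the output by calling
.BR iconv ()
once more with a NULL
.IR inbuf .
The helper
.I convert_and_finish
below is a hypothetical name used only for illustration.
For a stateless output encoding the final call is expected to write nothing,
but making it keeps the code correct if a stateful encoding is used.
.nf

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert inlen bytes of in, then append the shift
 * sequence that returns the output to the initial state.
 * Returns the number of bytes stored in out, or (size_t) -1 with errno
 * set by iconv_open() or iconv().
 */
size_t
convert_and_finish(const char *tocode, const char *fromcode,
                   const char *in, size_t inlen,
                   char *out, size_t outcap)
{
    char *inp = (char *) in;   /* iconv() does not modify the input bytes */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;
    size_t rc;
    int saved_errno;

    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    /* Main case: convert the input. */
    rc = iconv(cd, &inp, &inleft, &outp, &outleft);

    /* NULL inbuf, non-NULL outbuf: emit the reset sequence, if any. */
    if (rc != (size_t) -1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);

    saved_errno = errno;       /* iconv_close() may change errno */
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;
        return (size_t) -1;
    }
    return outcap - outleft;
}
```

.fi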
.PP
A third case is when \fIinbuf\fP is NULL or \fI*inbuf\fP is NULL, and
\fIoutbuf\fP is NULL or \fI*outbuf\fP is NULL.
In this case, the
.BR iconv ()
function sets \fIcd\fP's conversion state to the initial state.
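.PP
One plausible use of this third case is to reuse a conversion descriptor
for unrelated input after an earlier conversion was abandoned.
The sketch below assumes that usage;
.I reuse_after_error
is a hypothetical helper, and the 16-byte scratch buffer is an arbitrary choice.
.nf

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: attempt to convert bad (the result is ignored),
 * reset the descriptor, then convert good into out.
 * Returns the number of bytes stored in out by the second conversion,
 * or (size_t) -1 if iconv_open() or the second conversion fails.
 */
size_t
reuse_after_error(const char *tocode, const char *fromcode,
                  const char *bad, size_t badlen,
                  const char *good, size_t goodlen,
                  char *out, size_t outcap)
{
    char scratch[16];
    char *inp, *outp;
    size_t inleft, outleft, rc;

    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    /* First conversion: may fail and leave cd in an arbitrary state. */
    inp = (char *) bad;
    inleft = badlen;
    outp = scratch;
    outleft = sizeof(scratch);
    (void) iconv(cd, &inp, &inleft, &outp, &outleft);

    /* All-NULL call: set the conversion state back to the initial state. */
    (void) iconv(cd, NULL, NULL, NULL, NULL);

    /* Second conversion starts from the initial state. */
    inp = (char *) good;
    inleft = goodlen;
    outp = out;
    outleft = outcap;
    rc = iconv(cd, &inp, &inleft, &outp, &outleft);

    iconv_close(cd);
    return rc == (size_t) -1 ? (size_t) -1 : outcap - outleft;
}
```

.fi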
.SH RETURN VALUE
The
.BR iconv ()
function returns the number of characters converted in a
nonreversible way during this call; reversible conversions are not counted.
In case of error, it sets \fIerrno\fP and returns
.IR (size_t)\ \-1 .
.SH ERRORS
The following errors can occur, among others:
.TP
.B E2BIG
There is not sufficient room at \fI*outbuf\fP.
.TP
.B EILSEQ
An invalid multibyte sequence has been encountered in the input.
.TP
.B EINVAL
An incomplete multibyte sequence has been encountered in the input.
.SH VERSIONS
This function is available in glibc since version 2.1.
.SH CONFORMING TO
POSIX.1-2001.
.SH NOTES
Although
.I inbuf
and
.I outbuf
are typed as
.IR "char\ **" ,
this does not mean that the objects they point to can be interpreted
as C strings or as arrays of characters:
the interpretation of character byte sequences is
handled internally by the conversion functions.
In some encodings, a zero byte may be a valid part of a multibyte character.
.PP
The caller of
.BR iconv ()
must ensure that the pointers passed to the function are suitable
for accessing characters in the appropriate character set.
This includes ensuring correct alignment on platforms that have
tight restrictions on alignment.
.SH SEE ALSO
.BR iconv_close (3),
.BR iconv_open (3)