The old 10 bytes is likely enough, but let's make it safer based on AI's
recommendation:

While the 4–8 byte rule covers most common encodings, ISO-2022 variants
(like ISO-2022-JP) are the primary reason you might need a slightly larger
buffer. Because these encodings use multi-byte "escape sequences" to switch
between character sets, iconv() may stop mid-sequence.

For standard ISO-2022 variants, a buffer of 10 to 16 bytes is generally
considered the absolute "safe" maximum for unconverted bytes.

Why 16 Bytes?

While individual characters or escape sequences rarely exceed
4–6 bytes, choosing 16 bytes provides a power-of-two alignment that safely
handles even the most obscure registered ISO-IR sequences and provides a
margin for implementation-specific behavior.

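
For illustration only, here is a minimal sketch of how iconv() leaves
unconverted bytes behind when the input ends mid-sequence, which is the case
the pending buffer exists for. It is not the code touched by this patch:
the helper name convert_and_save_pending() and the PENDING_BUF_SIZE constant
are hypothetical, and it uses UTF-8 input rather than ISO-2022-JP only because
UTF-8 converters are available practically everywhere. It assumes a
glibc-style iconv() prototype (char ** input argument).

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CHARSET_MAX_PENDING_BUF_SIZE. */
#define PENDING_BUF_SIZE 16

/* Convert len bytes of UTF-8 input and copy any trailing incomplete
   sequence into pending. Returns the number of bytes saved to pending
   (0 if the input ended on a character boundary), or (size_t)-1 on any
   other failure. The converted output itself is discarded. */
static size_t convert_and_save_pending(const char *src, size_t len,
				       char pending[PENDING_BUF_SIZE])
{
	char outbuf[256];
	char *in = (char *)src;
	char *out = outbuf;
	size_t inleft = len, outleft = sizeof(outbuf);
	size_t saved = 0;

	iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
	if (cd == (iconv_t)-1)
		return (size_t)-1;

	if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1) {
		/* EINVAL means the input ended in the middle of a multibyte
		   sequence; in then points at the first unconverted byte. */
		if (errno != EINVAL || inleft > PENDING_BUF_SIZE) {
			iconv_close(cd);
			return (size_t)-1;
		}
		memcpy(pending, in, inleft);
		saved = inleft;
	}
	iconv_close(cd);
	return saved;
}
```

A caller would prepend the saved bytes to the next chunk of input before
calling iconv() again. The pending buffer only has to be as large as the
longest incomplete sequence iconv() can leave behind.
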
/* Max number of bytes that iconv can require for a single character.
UTF-8 takes max 6 bytes per character. Not sure about others, but I'd think
10 is more than enough for everyone.. */
-#define CHARSET_MAX_PENDING_BUF_SIZE 10
+#define CHARSET_MAX_PENDING_BUF_SIZE 16
struct charset_translation;
memcpy(nextbuf, "+AOQ-", 5);
size = sizeof(nextbuf);
test_assert(charset_to_utf8(trans, nextbuf, &size, str) == CHARSET_RET_OK);
- test_assert(strcmp(str_c(str), "a\xC3\xA4???????????") == 0);
+ test_assert_strcmp(str_c(str), "a\xC3\xA4?????????????????");
charset_to_utf8_end(&trans);
test_end();
}