]> git.ipfire.org Git - thirdparty/vim.git/commit
patch 9.1.2124: blob2str() does not handle UTF-16 encoding v9.1.2124
authorYasuhiro Matsumoto <mattn.jp@gmail.com>
Sat, 31 Jan 2026 15:53:26 +0000 (15:53 +0000)
committerChristian Brabandt <cb@256bit.org>
Sat, 31 Jan 2026 15:59:01 +0000 (15:59 +0000)
commit2b184d4b97c370179a33aeb832cd77e737bbc480
tree9beab44be23af9618be55337a8a927133dc16eb2
parentecc3faef61cb02f28028f27184ccd2aadcee7c24
patch 9.1.2124: blob2str() does not handle UTF-16 encoding

Problem:  blob2str() does not handle UTF-16 encoding
          (Hirohito Higashi)
Solution: Refactor the code and fix remaining issues, see below
          (Yasuhiro Matsumoto).

blob2str() function did not properly handle UTF-16/UCS-2/UTF-32/UCS-4
encodings with endianness suffixes (e.g., utf-16le, utf-16be, ucs-2le).
The encoding name was canonicalized too aggressively, losing the
endianness information needed by iconv.

This change include few fixes:

- Preserve the raw encoding name with endianness suffix for iconv calls
- Normalize encoding names properly: "ucs2be" → "ucs-2be", "utf16le" →
  "utf-16le"
- For multi-byte encodings (UTF-16/32, UCS-2/4), convert the entire blob
  first, then split by newlines

convert_string() cannot handle UTF-16 because it uses string_convert()
which expects NUL-terminated strings. UTF-16 contains 0x00 bytes within
characters (e.g., "H" = 0x48 0x00), causing premature termination.
Therefore, for UTF-16/32 encodings, the fix uses string_convert_ext()
with an explicit input length to convert the entire blob at once.

The code appends two NUL bytes (ga_append(&blob_ga, NUL) twice) because
UTF-16 requires a 2-byte NUL terminator (0x00 0x00), not a single-byte
NUL.

- src/strings.c: Add from_encoding_raw to preserve endianness, special
  handling for UTF-16/32 and UCS-2/4
- src/mbyte.c: Fix convert_setup_ext() to use == ENC_UNICODE instead of
  & ENC_UNICODE. The bitwise AND was incorrectly treating UTF-16/UCS-2
  (which have ENC_UNICODE + ENC_2BYTE etc.) as UTF-8, causing iconv
  setup to be skipped.

fixes:  #19198
closes: #19246

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Yasuhiro Matsumoto <mattn.jp@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
runtime/doc/builtin.txt
src/strings.c
src/testdir/test_blob.vim
src/testdir/test_functions.vim
src/version.c