git.ipfire.org Git - thirdparty/rspamd.git/commit
[Fix] Use UTF-8 buffer for HTML URL rewriting
author Vsevolod Stakhov <vsevolod@rspamd.com>
Mon, 13 Oct 2025 09:22:52 +0000 (10:22 +0100)
committer Vsevolod Stakhov <vsevolod@rspamd.com>
Mon, 13 Oct 2025 09:23:18 +0000 (10:23 +0100)
commit 220cfd85b12e9d94d95ceeabfa9b22e8ec4ec1e1
tree 35f8d52ed58d42d03112c376c01989f1f54c0ee0
parent 777cc810a8cb6358a04a0944d0e5da8c9d863e1b

The HTML parser calculates attribute value offsets from the UTF-8
buffer (utf_raw_content), but URL rewriting was incorrectly applying
patches to the MIME-decoded buffer (parsed). When charset conversion
occurs (e.g., from ISO-8859-1 to UTF-8), the same character can occupy
a different number of bytes in each buffer, so offsets computed for
one buffer point at the wrong bytes in the other.
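
To illustrate the offset drift described above, here is a minimal
sketch in C++. It is not rspamd code: latin1_to_utf8 is a hypothetical
helper written for this example, and the sample markup is made up. It
only shows that a byte offset found in an ISO-8859-1 buffer no longer
matches once the same text is converted to UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of rspamd): convert ISO-8859-1 bytes to
// UTF-8. Every byte >= 0x80 expands to two bytes, so any offset located
// after such a byte shifts by one in the converted buffer.
std::string latin1_to_utf8(const std::string &in)
{
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        }
        else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For the input "caf\xE9 <a href=\"http://x.test/\">", the URL starts at
byte 14 in the ISO-8859-1 buffer but at byte 15 after conversion,
because the single non-ASCII character takes two bytes in UTF-8.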

This commit ensures all URL rewriting operations use the UTF-8 buffer
consistently, preventing corruption with non-ASCII characters.
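
The effect of choosing the right buffer can be sketched as follows.
This is an illustration under stated assumptions, not the code changed
by this commit: url_patch and apply_patches are hypothetical names, and
the patch layout (byte offset, length, replacement) is assumed for the
example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical patch record: replace `len` bytes at `offset`.
struct url_patch {
    std::size_t offset;
    std::size_t len;
    std::string replacement;
};

// Apply patches to `buf` and return the rewritten copy. Offsets are only
// meaningful for the buffer they were computed against.
std::string apply_patches(const std::string &buf, std::vector<url_patch> patches)
{
    std::sort(patches.begin(), patches.end(),
              [](const url_patch &a, const url_patch &b) {
                  return a.offset < b.offset;
              });

    std::string out;
    std::size_t pos = 0;
    for (const auto &p : patches) {
        // Skip patches that overlap a previous one or fall outside the buffer.
        if (p.offset < pos || p.offset > buf.size() ||
            p.len > buf.size() - p.offset) {
            continue;
        }
        out.append(buf, pos, p.offset - pos); // untouched bytes before the patch
        out.append(p.replacement);
        pos = p.offset + p.len;
    }
    out.append(buf, pos, std::string::npos); // remaining tail
    return out;
}
```

With an offset computed against the UTF-8 text, applying the patch to
that same UTF-8 buffer replaces exactly the URL. Applying it to the
ISO-8859-1 bytes instead lands one byte late, leaving a stray "h"
before the new URL and swallowing the closing quote.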
src/libserver/html/html_tag.hxx
src/libserver/html/html_url_rewrite.hxx
src/lua/lua_task.c