From: Vsevolod Stakhov <vsevolod@rspamd.com>
Date: Wed, 20 May 2026 10:39:51 +0000 (+0100)
Subject: [Fix] html: prevent buffer overflow in entity decoding
X-Git-Tag: 4.1.0~46
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=04fc040b5ce534c796878f802d531049e044dfc5;p=thirdparty%2Frspamd.git

[Fix] html: prevent buffer overflow in entity decoding

decode_html_entitles_inplace works in place, relying on the
replacement never being longer than the source entity text. That
assumption does not hold for some short entity names that expand to
multi-codepoint replacements (e.g. nGt, nLt, nvap): when such an
entity sits at the very end of the buffer, the named-entity memcpy
writes a few bytes past the end.

Bounds-check the replacement against the remaining buffer before
copying, matching the existing numeric-entity path, and drop the
entity when it does not fit.
---
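Note: the check can be exercised in isolation with a minimal sketch
like the one below. It is not code from the rspamd tree: the helper
name write_if_fits and its signature are hypothetical, and it only
mirrors the "copy if the replacement fits, otherwise drop it" logic
described in the commit message.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of rspamd): copy `replacement` to the
// write cursor `t` only when it fits before `end`. Returns the advanced
// cursor, or `t` unchanged when the replacement is dropped.
// Assumes t <= end, so the pointer difference is non-negative.
static char *write_if_fits(char *t, const char *end, const char *replacement)
{
	std::size_t l = std::strlen(replacement);

	if (static_cast<std::size_t>(end - t) >= l) {
		std::memcpy(t, replacement, l);
		t += l;
	}

	return t;
}
```

With a 5-byte buffer and a 6-byte replacement, the helper leaves both
the buffer and the cursor untouched, which is the "drop the entity"
behaviour this patch aims for.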

diff --git a/src/libserver/html/html_entities.cxx b/src/libserver/html/html_entities.cxx
index d7c709f2da..5e18cf7a30 100644
--- a/src/libserver/html/html_entities.cxx
+++ b/src/libserver/html/html_entities.cxx
@@ -2260,8 +2260,17 @@ decode_html_entitles_inplace(char *s, std::size_t len, bool norm_spaces)
 
 		auto replace_entity = [&]() -> void {
 			auto l = strlen(entity_def->replacement);
-			memcpy(t, entity_def->replacement, l);
-			t += l;
+			/*
+			 * The decoder works in place, so the replacement may only be
+			 * written while it fits the remaining buffer. Some short entity
+			 * names expand to longer multi-codepoint replacements, which
+			 * would otherwise overflow when the entity sits at the very end
+			 * of the buffer. Drop such a truncated entity instead.
+			 */
+			if (end - t >= (decltype(end - t)) l) {
+				memcpy(t, entity_def->replacement, l);
+				t += l;
+			}
 		};
 
 		if (entity_def) {