From: E.Smith <31170571+azlm8t@users.noreply.github.com>
Date: Wed, 10 Jun 2020 21:30:14 +0000 (+0100)
Subject: Handle bad UTF-8 in xmltv (#5909)
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=f0b21875cf5f3c6ccc735d9c9613122946188628;p=thirdparty%2Ftvheadend.git

Handle bad UTF-8 in xmltv (#5909)

We had a string where we had a rogue byte (0x8a) which was not part
of a UTF-8 string. This then caused some downstream parsers to abort
processing the document; other parsers ignored the bad character.

As an interim fix, we now parse the individual characters and filter
out invalid characters. We replace such characters with a space
character (instead of a U+FFFD replacement character) since this is
typically user presentable data on a "10ft interface". An alternative
would be to completely discard the character, but the examples we had
would then have words combined where the invalid character used to be.
---

diff --git a/src/htsbuf.c b/src/htsbuf.c
index 50f22a500..c7c9b10f8 100644
--- a/src/htsbuf.c
+++ b/src/htsbuf.c
@@ -353,15 +353,50 @@ htsbuf_append_and_escape_xml(htsbuf_queue_t *hq, const char *s)
 {
   const char *c = s;
   const char *e = s + strlen(s);
-  const char *esc;
+  const char *esc = 0;
   int h;
 
   if(e == s)
     return;
 
-  while(1) {
+  while(c': esc = ">"; break;