parser: Handle even more exotically broken headers
An archive of the Ubuntu kernel team mailing list contains a
fascinating email that causes the following parse error:
email.errors.HeaderParseError: header value appears to contain an embedded header:
'4Mf^tnii7k\\_EnR5aobBm6Di[DZ9@AX1wJ"okBdX-UoJ>:SRn]c6DDU"qUIwfs98vF>...
The broken bit seem related to a UTF-8 quoted-printable encoded
section and to be from an internal attempt to break it over multiple
lines: here's a snippet from the error message:
'\n\t=?utf-8?q?Tnf?=\n'
but interesting the header itself does not contain the new lines, so
clearly something quite weird is happening behind the scenes!
This only throws on header.encode(): it actually makes it through
sanitise_header and into find_headers before throwing the assertion.
So, try to encode in sanitize_header as a final step.
Also, fix a hilarious* python bug that this exposes: whitespace-only
headers cause an index error!
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Daniel Axtens <dja@axtens.net>