Per https://www.w3.org/TR/xml/#sec-entity-decl (and MS references)
there is always some whitespace between '<!ENTITY' and the name, and
between the name and whatever is next. Also, it is valid XML to have
newlines inside entity declarations, like this:
<!ENTITY
bubble
"*S-1-5-113"
>
We used to create such files, so we should allow them.
There is a kind of entity that has '%' before the name, and there are
non-ascii names, which we continue not to support.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15829
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Jennifer Sutton <jennifersutton@catalyst.net.nz>
(cherry picked from commit
6107656ebc8d092b2c1907940b2486ab0265aad9)
entities_content = entities_file.read()
# Do a basic regex test of the entities file format
- if re.match(r'(\s*<!ENTITY\s*[a-zA-Z0-9_]+\s*.*?>)+\s*\Z',
- entities_content, flags=re.MULTILINE) is None:
+ if re.match(r'(\s*<!ENTITY\s+[a-zA-Z0-9_]+\s+.*?>)+\s*\Z',
+ entities_content, flags=re.MULTILINE|re.DOTALL) is None:
raise CommandError("Entities file does not appear to "
"conform to format\n"
'e.g. <!ENTITY entity "value">')