File.read without an explicit encoding uses Ruby's default external
encoding, which depends on the system locale. On systems where it
resolves to US-ASCII (e.g. some Debian configurations), reading
translated man page files with non-ASCII content (such as Arabic)
fails with "source is either binary or contains invalid Unicode data".
Pass encoding: 'UTF-8' to File.read explicitly so the string is
tagged as UTF-8 regardless of locale.
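
The behaviour described above can be reproduced with a minimal sketch.
This is an illustration only, not part of the patch: the helper name
read_both_ways, the temporary file, and the sample Arabic text are
hypothetical, and the US-ASCII locale is simulated by temporarily
overriding Encoding.default_external.

```ruby
require 'tempfile'

# Hypothetical helper: read the same file twice while the default external
# encoding is forced to US-ASCII (simulating the affected locale), once
# relying on the default and once with an explicit encoding.
def read_both_ways(path)
  saved = Encoding.default_external
  Encoding.default_external = Encoding::US_ASCII
  [File.read(path), File.read(path, encoding: 'UTF-8')]
ensure
  # Always restore the original default so later code is unaffected.
  Encoding.default_external = saved
end

# Write a short Arabic sample (stand-in for a translated man page) as raw
# UTF-8 bytes, then read it back both ways.
implicit, explicit = Tempfile.create('manpage') do |f|
  f.binmode
  f.write("\u0645\u0631\u062D\u0628\u0627")
  f.flush
  read_both_ways(f.path)
end

puts "implicit: #{implicit.encoding}, valid=#{implicit.valid_encoding?}"
puts "explicit: #{explicit.encoding}, valid=#{explicit.valid_encoding?}"
```

Without the explicit encoding the string is tagged US-ASCII and fails
valid_encoding?, which is the kind of string that later triggers the
error quoted above. With encoding: 'UTF-8' the same bytes are tagged
UTF-8 and validate.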
Fixes: https://github.com/util-linux/util-linux/issues/4409
Signed-off-by: Karel Zak <kzak@redhat.com>
docdir = doc.attributes["docdir"]
path = target
file = File.expand_path path, docdir
- data = File.read file
+ data = File.read file, encoding: 'UTF-8'
reader.push_include data, file, path, 1, attributes
doc.attributes["include_dependencies"] << file
reader