C1 control bytes are more complicated than that. They're represented as
two bytes in UTF-8.
Commit 19d725da has issues: it rejects otherwise valid UTF-8 multi-byte
characters.
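
As an illustration only, here is a minimal sketch of the byte-wise range
check that is being reverted. The helper name has_c1_byte() is
hypothetical and is not part of the shadow source. It shows that the
range 0x80-0x9F overlaps with UTF-8 continuation bytes, so ordinary
letters can trip the check.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns true if any byte of s lies in 0x80-0x9F,
 * which is what the reverted check tested for.
 */
static bool
has_c1_byte(const char *s)
{
	for (; *s != '\0'; s++) {
		unsigned char  c = (unsigned char) *s;

		if (c >= 0x80 && c <= 0x9F)
			return true;
	}
	return false;
}
```

Under this sketch, "Mikul\xC4\x97nas" (U+0117 is encoded as 0xC4 0x97) is
flagged because of its continuation byte 0x97, while "\xC3\xA9" (U+00E9)
is not. A real C1 control such as U+0085, encoded as 0xC2 0x85, is also
flagged, but a byte-wise test cannot tell these cases apart.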
We could in theory parse UTF-8 correctly, either by decoding the
multi-byte sequences ourselves or by translating to wchar_t. However,
that would complicate the source code well beyond what I'd be
comfortable with.
Instead, let's revert this, and claim no intention to support UTF-8.
If an admin uses a UTF-8 locale while reading /etc/passwd, that's their
own fault.
Reverts: 19d725da (2026-03-13; "strchriscntrl: reject C1 control bytes (0x80-0x9F)")
Fixes: 19d725da (2026-03-13; "strchriscntrl: reject C1 control bytes (0x80-0x9F)")
Closes: <https://github.com/shadow-maint/shadow/issues/1598>
Reported-by: Mantas Mikulėnas <grawity@gmail.com>
Cc: KhaelK-Praetorian <khael.kugler@praetorian.com>
Cc: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
// string character is [:cntrl:]
-// Return true if any control character is found in the string.
-// Check both C0 (0x00-0x1F, 0x7F) via iscntrl(3) and C1 (0x80-0x9F)
-// explicitly, since glibc's iscntrl() does not classify C1 bytes as
-// control characters in any locale.
+// Return true if any iscntrl(3) character is found in the string.
inline bool
strchriscntrl(const char *s)
{
if (iscntrl(c))
return true;
- if (c >= 0x80 && c <= 0x9F)
- return true;
}
return false;