From: Alex Rousskov
Date: Thu, 23 Oct 2025 08:05:28 +0000 (+0000)
Subject: Bug 5520: ERR_INVALID_URL for CONNECT host with leading digit (#2283)
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=85b9d6d92057c1185c1ab883c550a01347b1c02e;p=thirdparty%2Fsquid.git

Bug 5520: ERR_INVALID_URL for CONNECT host with leading digit (#2283)

Squid 7.2 commit b8337359 added validation of host names following RFC
1035 requirements. But those requirements were outdated by RFC 1123:

    One aspect of host name syntax is hereby changed: the restriction on
    the first character is relaxed to allow either a letter or a digit.
    Host software MUST support this more liberal syntax.

The commit treated CONNECT host names that start with a decimal digit as
invalid IPv4 addresses and rejected the corresponding requests,
resulting in HTTP 404 errors. Undo that change.

We have considered preserving code that detects valid IPv4 addresses (as
opposed to treating all non-IPv6 input as an "IPv4 address or reg-name"
without disambiguating the two cases) because its pieces may be reused,
but that essentially unused code has non-trivial performance penalty and
final code may look quite different after we complete our "non-CONNECT
uri-host parsing code" migration TODO.

Polished source code comments aside, this change reverts 2025 commit
b8337359 and restores 2023 AnyP::Uri::parseHost() implementation (commit
963ff143).
---

diff --git a/src/anyp/Uri.cc b/src/anyp/Uri.cc
index 0c80f10212..5263ba5752 100644
--- a/src/anyp/Uri.cc
+++ b/src/anyp/Uri.cc
@@ -668,23 +668,11 @@ AnyP::Uri::parseHost(Parser::Tokenizer &tok) const
 
     // no brackets implies we are looking at IPv4address or reg-name
 
-    static const CharacterSet IPv4chars = CharacterSet("period", ".") + CharacterSet::DIGIT;
-    SBuf ipv4ish; // IPv4address-ish
-    if (tok.prefix(ipv4ish, IPv4chars)) {
-        // This rejects non-IP addresses that our caller would have
-        // otherwise mistaken for a domain name (e.g., '127.0.0' or '1234.5').
-        Ip::Address ipCheck;
-        if (!ipCheck.fromHost(ipv4ish.c_str()))
-            throw TextException("malformed IP address in uri-host", Here());
-
-        return ipv4ish;
-    }
-
-    // XXX: This code does not detect/reject some bad host values (e.g. "!#$%&").
+    // XXX: This code does not detect/reject some bad host values (e.g. `!#$%&`).
     // TODO: Add more checks here, after migrating the
     // non-CONNECT uri-host parsing code to use us.
 
-    SBuf otherHost; // IPv4address-ish or reg-name-ish;
+    SBuf otherHost; // IPv4address-ish or reg-name-ish
     // ":" is not in TCHAR so we will stop before any port specification
     if (tok.prefix(otherHost, CharacterSet::TCHAR))
         return otherHost;
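
Illustration only, not part of the patch: a standalone sketch of the behavior
that the restored code path appears to implement, namely taking the longest
prefix of token characters and stopping before ":". The names isTchar() and
parseHostSketch() are hypothetical, std::string_view stands in for Squid's
Parser::Tokenizer, and the sketch assumes that CharacterSet::TCHAR matches
the HTTP "tchar" set. The exception thrown for an empty prefix is also an
assumption, since the diff does not show what follows the final "if".

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: true for HTTP "tchar" characters
// (ASCII letters, digits, and !#$%&'*+-.^_`|~).
inline bool isTchar(const char c)
{
    const auto uc = static_cast<unsigned char>(c);
    if (uc > 127)
        return false; // non-ASCII bytes are never token characters
    if (std::isalnum(uc))
        return true;
    return std::string_view("!#$%&'*+-.^_`|~").find(c) != std::string_view::npos;
}

// Hypothetical stand-in for the non-bracketed branch of parseHost():
// no IPv4 versus reg-name disambiguation, just a token-character prefix.
inline std::string parseHostSketch(const std::string_view input)
{
    std::size_t len = 0;
    while (len < input.size() && isTchar(input[len]))
        ++len; // ":" is not a tchar, so we stop before any port
    if (len == 0)
        throw std::invalid_argument("no host characters found"); // assumed failure mode
    return std::string(input.substr(0, len));
}

// Small self-check showing the cases discussed in the commit message.
inline void sketchSelfCheck()
{
    assert(parseHostSketch("1example.com:443") == "1example.com"); // leading digit accepted
    assert(parseHostSketch("127.0.0:80") == "127.0.0");            // no longer rejected as a bad IPv4 address
    assert(parseHostSketch("!#$%&:443") == "!#$%&");               // the XXX case: still not rejected
}
```

The last assertion mirrors the XXX comment in the diff: a prefix-only check
accepts some values that are not valid host names, which is why the TODO
about adding more checks remains.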