From: Alex Rousskov Date: Fri, 11 Feb 2022 16:32:27 +0000 (+0000) Subject: Do not skip problematic regexes in ACLs (#979) X-Git-Tag: SQUID_6_0_1~235 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=e1c92a7cfd7d9d81079aae8cb22b4300e6625e82;p=thirdparty%2Fsquid.git Do not skip problematic regexes in ACLs (#979) This change has two partially overlapping parts: * Reject configurations with ACLs containing failed-to-compile regexes. * Do not skip ACL regexes with BUFSIZ or more characters. Invalid or long ACL regexes were skipped (with an ERROR message), resulting in a dangerous mismatch between admin (mis)interpretation of their squid.conf and the actual configuration of the running instance. Given the volume of ERRORs busy Squids are often reporting, including transaction errors that admins may consider irrelevant, admins are not paying enough attention to non-fatal configuration ERRORs, especially reconfiguration ERRORs. Squid should reject erroneous configs instead. And until we find (and this time document!) a specific reason to artificially restrict individual regex length, let the Squid configuration tokenizer and the regex library limit RE lengths. KISS. Also deliver what commit 0fa036e promised but failed to do: "Squid no longer reports REs _optimization_ failure when it is an individual RE that is broken (and reported as such)". In that commit, I overlooked the fact that Squid still ignored broken REs at the time, fooling the higher-level code into thinking that they were OK (and triggering unnecessary reporting of misleading optimization errors). Besides rejection of invalid REs, there are two known side effects: 1. Squid may silently start using previously skipped long REs[^1]. 2. Squid may fail/succeed/change regex optimization because the previously skipped long REs now participate in optimization attempts. In nearly all cases, this does not affect the ACL matching outcome. [^1]: The regex library is unlikely to reject regexes based on their length alone because libraries ought to accept regexes much longer than longest Squid configuration tokens. For example, glibc accepts 4MB-long regexes on 32-bit systems AFAICT based on a quick scan of regcomp code. --- diff --git a/src/acl/RegexData.cc b/src/acl/RegexData.cc index 0d0f949ec5..3978df489e 100644 --- a/src/acl/RegexData.cc +++ b/src/acl/RegexData.cc @@ -209,14 +209,7 @@ compileUnoptimisedREs(std::list &curlist, const SBufList &sl, cons } else if (configurationLineWord == plus_i) { flags &= ~REG_ICASE; } else { - try { - compileRE(curlist, configurationLineWord, flags); - } catch (...) { - // TODO: Make these configuration failures fatal (by default). - debugs(28, DBG_CRITICAL, "ERROR: Skipping regular expression: '" << configurationLineWord << "'" << - Debug::Extra << "configuration: " << cfg_filename << " line " << config_lineno << ": " << config_input_line << - Debug::Extra << "regex compilation failure: " << CurrentException); - } + compileRE(curlist, configurationLineWord, flags); } } } @@ -233,13 +226,8 @@ ACLRegexData::parse() SBufList sl; while (char *t = ConfigParser::RegexStrtokFile()) { const char *clean = removeUnnecessaryWildcards(t); - if (strlen(clean) > BUFSIZ-1) { - debugs(28, DBG_CRITICAL, cfg_filename << " line " << config_lineno << ": " << config_input_line); - debugs(28, DBG_CRITICAL, "ERROR: Skipping regular expression. Larger than " << BUFSIZ-1 << " characters: '" << clean << "'"); - } else { - debugs(28, 3, "buffering RE '" << clean << "'"); - sl.emplace_back(clean); - } + debugs(28, 3, "buffering RE '" << clean << "'"); + sl.emplace_back(clean); } try { diff --git a/test-suite/squidconf/regex b/test-suite/squidconf/regex index 95c82a8d16..b1f3b8b620 100644 --- a/test-suite/squidconf/regex +++ b/test-suite/squidconf/regex @@ -19,6 +19,5 @@ acl G dstdom_regex \.g...le\.com$ acl B browser ^Mozilla acl B browser ^Java/[0-9]+(\.[0-9]+)? -# invalid pattern - this should ERROR -acl foo browser * - +# TODO: Support testing invalid configurations, like this one: +# acl foo browser *