From: Colin Vidal Date: Fri, 30 Jan 2026 16:09:18 +0000 (+0100) Subject: fetch loop detection improvements X-Git-Tag: v9.21.19~29^2~1 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=f623ab1fb31ee8cf9a04d2063ac15986c6790ea0;p=thirdparty%2Fbind9.git fetch loop detection improvements The fetch loop detection occured in two places: when `dns_resolver_createfetch()` is invoked (looking up through the parent fetches chain and stops the fetch if a parent fetch is the same qname and qtype) and right after calling `dns_adb_findname()` in the resolver (stops the fetch if the current fetch is the same name from the ADB lookup, and ADB lookup needs to fetch it). Regarding fetch loop detection at the `dns_resulver_createfetch()` entry, there are case where both qname and qtype are similar but the zonecut is different. This will then query different name servers and get different responses. For instance, the following delegation parent-side (both for `foo.example.` and `dnshost.example.`): foo.example. 3600 NS ns.dnshost.example. dnshost.example. 3600 NS ns.dnshost.example. ns.dnshost.example. 3600 A 1.2.3.4 Then the child-side of `dnshost.example.`: dnshost.example. 300 NS ns.dnshost.example. ns.dnshost.example. 300 A 1.2.3.4 Then the child-side of `foo.example.`: foo.example 3600 NS ns.dnshost.example. a.foo.example 300 A 5.6.7.8 Obviously, there is a misconfiguration between the parent-side and the child-side of `dnshost.example` (the mismatch of the TTL), but, this happens... Because the resolver is currently child-centric, the parent-side delegation's glue of `dnshost.example.` will be overriden by the child-side of the delegation. Once both A records will expires, the resolver will attempt to find out the A RRs but will start from the `foo.example.` zonecut, as the delegation itself is still valid. Then the resolver will attempt to resolve `ns.dnshost.example.`, still using the `foo.example.` zonecut, which will immediately trigger another attempt to resolve `ns.foo.example.` (because the A RR is expired). This is, however _not_ a loop, because the second attempt will have `dnshost.example.` zonecut. And this changes everything, because the resolver detects the A name is in-domain, and pass a flag to ADB so `dns_view_find()` won't use the cache. As a result, the zonecut will be `.`, and the hints (root servers) will be queried instead. From that point, they'll return the parent-side delegation, which includes the glue for `ns.dnshost.example/A`, and the resolution can continue. Previously, this wouldn't be possible because a loop would be detected from the second attempt to looking `ns.foo.example/A` and would result in a SERVFAIL. Now, the loop detection is relaxed as the loop is detected if the qname, qtype _and_ zonecut are equals. This commit also changes the way the loop detection post `dns_adb_createfind()` works. From the same example above, there would be two ADB fetches with the same name, but with two different ADB flags (the first one without DNS_ADB_STARTATZONE, the second one with that flag). It means that there will be two fetches out of those two ADB lookups, both legit, and not a loop (i.e. it won't be stuck). To differenciate between a find which has a pending fetch (which could be from another find the current find has been attached to), a new find option `DNS_ADBFIND_STARTEDFETCH` is introduced, which tells that the current has did started a fetch. That way, if a find doesn't have `DNS_ADBFIND_STARTEDFETCH` option but has pending fetches, we know this is a find attached to a similar find so this is a loop. Otherwise, with `DNS_ADBFIND_STARTEDFETCH`, we know that even if there is a pending fetch, this is not a loop as the fetch has just been started --- diff --git a/lib/dns/adb.c b/lib/dns/adb.c index 8721d8f6ff7..2e61c755341 100644 --- a/lib/dns/adb.c +++ b/lib/dns/adb.c @@ -2028,6 +2028,10 @@ post_copy: find->cbarg = cbarg; } + if (wanted_fetches) { + find->options |= DNS_ADBFIND_STARTEDFETCH; + } + *findp = find; UNLOCK(&adbname->lock); diff --git a/lib/dns/include/dns/adb.h b/lib/dns/include/dns/adb.h index dfa52236dcc..399e665ee8b 100644 --- a/lib/dns/include/dns/adb.h +++ b/lib/dns/include/dns/adb.h @@ -194,6 +194,10 @@ struct dns_adbfind { */ #define DNS_ADBFIND_STATICSTUB 0x00001000 #define DNS_ADBFIND_NOVALIDATE 0x00002000 +/*% + * This specific find created a fetch + */ +#define DNS_ADBFIND_STARTEDFETCH 0x00010000 /*% * The answers to queries come back as a list of these. diff --git a/lib/dns/resolver.c b/lib/dns/resolver.c index f475af5ffd2..f09b502ba67 100644 --- a/lib/dns/resolver.c +++ b/lib/dns/resolver.c @@ -1087,6 +1087,9 @@ set_stats(dns_resolver_t *res, isc_statscounter_t counter, uint64_t val) { } } +static bool +waiting_for_fetch(fetchctx_t *fctx, const dns_name_t *name, + dns_rdatatype_t type, const dns_name_t *domain); static void valcreate(fetchctx_t *fctx, dns_message_t *message, dns_adbaddrinfo_t *addrinfo, dns_name_t *name, dns_rdatatype_t type, dns_rdataset_t *rdataset, @@ -3340,9 +3343,10 @@ sort_finds(dns_adbfindlist_t *findlist, unsigned int bias) { } /* - * Return true iff the ADB find has a pending fetch for 'type'. This is - * used to find out whether we're in a loop, where a fetch is waiting for a - * find which is waiting for that same fetch. + * Return true iff the ADB find has an already pending fetch for 'type'. This + * is used to find out whether we're in a loop, where a fetch is waiting for a + * find which is waiting for that same fetch. So if the current find actually + * started the fetch, we know it can't be a loop, so we returns false. * * Note: This could be done with either an equivalence check (e.g., * query_pending == DNS_ADBFIND_INET) or with a bit check, as below. If @@ -3361,7 +3365,11 @@ sort_finds(dns_adbfindlist_t *findlist, unsigned int bias) { * evil. */ static bool -waiting_for(dns_adbfind_t *find, dns_rdatatype_t type) { +already_waiting_for(dns_adbfind_t *find, dns_rdatatype_t type) { + if ((find->options & DNS_ADBFIND_STARTEDFETCH) != 0) { + return false; + } + switch (type) { case dns_rdatatype_a: return (find->query_pending & DNS_ADBFIND_INET) != 0; @@ -3471,22 +3479,25 @@ findname(fetchctx_t *fctx, const dns_name_t *name, in_port_t port, * We don't know any of the addresses for this name. * * The find may be waiting on a resolver fetch for a server - * address. We need to make sure it isn't waiting on *this* + * address. We need to make sure it isn't waiting before *this* * fetch, because if it is, we won't be answering it and it * won't be answering us. */ - if (waiting_for(find, fctx->type) && dns_name_equal(name, fctx->name)) { - fctx->adberr++; + if (already_waiting_for(find, fctx->type) && + dns_name_equal(name, fctx->name)) + { isc_log_write(DNS_LOGCATEGORY_RESOLVER, DNS_LOGMODULE_RESOLVER, ISC_LOG_INFO, "loop detected resolving '%s'", fctx->info); + fctx->adberr++; if ((find->options & DNS_ADBFIND_WANTEVENT) != 0) { dns_adb_cancelfind(find); } else { dns_adb_destroyfind(&find); fetchctx_detach(&fctx); } + return; } @@ -10126,12 +10137,27 @@ get_attached_fctx(dns_resolver_t *res, isc_loop_t *loop, const dns_name_t *name, return ISC_R_SUCCESS; } +static bool +is_samedomain(const dns_name_t *domain1, const dns_name_t *domain2) { + if (domain1 == NULL && domain2 == NULL) { + return true; + } + + if (domain1 == NULL || domain2 == NULL) { + return false; + } + + return !dns_name_compare(domain1, domain2); +} + static bool waiting_for_fetch(fetchctx_t *fctx, const dns_name_t *name, - dns_rdatatype_t type) { + dns_rdatatype_t type, const dns_name_t *domain) { while (fctx != NULL) { if (type == fctx->type && !dns_name_compare(name, fctx->name)) { - return true; + if (is_samedomain(domain, fctx->domain)) { + return true; + } } fctx = fctx->parent; } @@ -10181,18 +10207,30 @@ dns_resolver_createfetch(dns_resolver_t *res, const dns_name_t *name, log_fetch(name, type); - if (waiting_for_fetch(parent, name, type)) { + /* + * This fetch loop detection enable to guard against loop scenarios + * where the DNSSEC is involved. See + * `4d307ac67a0e3f9831c9a4e66ac481e2f9ceebb5`. This is a complementary + * detection with the ADB lookup loop detection (in `findname()`). + */ + if (waiting_for_fetch(parent, name, type, domain)) { if (isc_log_wouldlog(ISC_LOG_INFO)) { char namebuf[DNS_NAME_FORMATSIZE + 1]; char typebuf[DNS_RDATATYPE_FORMATSIZE]; + char domainbuf[DNS_NAME_FORMATSIZE + 1] = { 0 }; dns_name_format(name, namebuf, sizeof(namebuf)); dns_rdatatype_format(type, typebuf, sizeof(typebuf)); + if (domain != NULL) { + dns_name_format(domain, domainbuf, + sizeof(domainbuf)); + } isc_log_write(DNS_LOGCATEGORY_RESOLVER, DNS_LOGMODULE_RESOLVER, ISC_LOG_DEBUG(2), - "fetch loop detected resolving '%s/%s'", - namebuf, typebuf); + "fetch loop detected resolving '%s/%s " + "(in '%s'?)", + namebuf, typebuf, domainbuf); } return DNS_R_LOOPDETECTED; }