the tcp system test uses the 'packet.pl' test tool to send a packet
thousands of times. this took a long time because the tool was waiting
for replies and parsing them; however, for that particular test the
replies aren't relevant.
this commit uses non-blocking sockets and moves the reply parsing
outside the send loop, which speeds the system test up substantially.
Michał Kępień [Thu, 3 Sep 2020 08:35:37 +0000 (10:35 +0200)]
Include BIND documentation in Windows zips
As generated documentation files are no longer stored in the BIND Git
repository, put a copy of the PDF version of the BIND ARM generated by
the "docs" GitLab CI job into the Windows zips to make it easily
available to the end users on that platform.
Make sure Windows zips also contain certain documentation files included
in source tarballs to make the contents of each release more consistent
across different platforms.
Mark Andrews [Mon, 31 Aug 2020 12:09:38 +0000 (22:09 +1000)]
Increase zone load timeout in the "rndc" test
The "huge.zone" zone can take longer than 100 seconds to load when
running under a sanitizer. Increase the relevant zone load timeout to
prevent intermittent failures of the "rndc" system test.
Matthijs Mekking [Thu, 27 Aug 2020 12:24:50 +0000 (14:24 +0200)]
Fix CDS (non-)publication
The CDS/CDNSKEY record will be published when the DS is in the
rumoured state. However, with the introduction of the rndc '-checkds'
command, the logic in the keymgr was changed to prevent the DS
state to go in RUMOURED unless the specific command was given. Hence,
the CDS was never published before it was seen in the parent.
Initially I thought this was a policy approval rule, however it is
actually a DNSSEC timing rule. Remove the restriction from
'keymgr_policy_approval' and update the 'keymgr_transition_time'
function. When looking to move the DS state to OMNIPRESENT it will
no longer calculate the state from its last change, but from when
the DS was seen in the parent, "DS Publish". If the time was not set,
default to next key event of an hour.
Similarly for moving the DS state to HIDDEN, the time to wait will
be derived from the "DS Delete" time, not from when the DS state
last changed.
Matthijs Mekking [Thu, 27 Aug 2020 12:11:23 +0000 (14:11 +0200)]
Update rndc_checkds test util
The 'rndc_checkds' utility now allows "now" as the time when the DS
has been seen in/seen removed from the parent.
Also it uses "KEYX" as the key argument, rather than key id.
The 'rndc_checkds' will retrieve the key from the "KEYX" string. This
makes the call a bit more readable.
Matthijs Mekking [Thu, 27 Aug 2020 11:18:10 +0000 (13:18 +0200)]
Improve kasp test readability
This commit has a lot of updates on comments, mainly to make the
system test more readable.
Also remove some redundant signing policy checks (check_keys,
check_dnssecstatus, check_keytimes).
Finally, move key time checks and expected key time settings above
'rndc_checkds' calls (with the new way of testing next key event
times there is no need to do them after 'rndc_checkds', and moving
them above 'rndc_checkds' makes the flow of testing easier to follow.
Matthijs Mekking [Thu, 27 Aug 2020 10:32:41 +0000 (12:32 +0200)]
Add '-P ds' and '-D ds' to dnssec-settime
Add two more arguments to the dnssec-settime tool. '-P ds' sets the
time that the DS was published in the parent, '-D ds' sets the time
that the DS was removed from the parent (these times are not accurate,
but rely on the user to use them appropriately, and as long as the
time is not before actual publication/withdrawal, it is fine).
These new arguments are needed for the kasp system test. We want to
test when the next key event is once a DS is published, and now
that 'parent-registration-delay' is obsoleted, we need a different
approach to reliable test the timings.
Michal Nowak [Mon, 31 Aug 2020 15:46:45 +0000 (17:46 +0200)]
Drop gperftools-profiler configure switch
This switch is believed to be unnecessary. The possibility to use
gperftools CPU profiler was kept, one needs to set 'CFLAGS' and
'LDFLAGS' accordingly.
Diego Fronza [Fri, 28 Aug 2020 21:49:26 +0000 (18:49 -0300)]
Added test for the proposed fix
The test works as follows:
1. Client wants to resolve unusual ip6.arpa. name:
test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN TXT
2. Query is sent to ns7, a qmin enabled resolver.
3. ns7 do the first stage in query minimization for the name and send a new
query to root (ns1):
_.1.0.0.2.ip6.arpa. IN A
4. ns1 delegates ip6.arpa. to ns2.good.:
;; AUTHORITY SECTION:
;ip6.arpa. 20 IN NS ns2.good.
;; ADDITIONAL SECTION:
;ns2.good. 20 IN A 10.53.0.2
5. ns7 do a second round in minimizing the name and send a new query
to ns2.good. (10.53.0.2):
_.8.2.6.0.1.0.0.2.ip6.arpa. IN A
6. ans2 delegates 8.2.6.0.1.0.0.2.ip6.arpa. to ns3.good.:
;; AUTHORITY SECTION:
;8.2.6.0.1.0.0.2.ip6.arpa. 60 IN NS ns3.good.
;; ADDITIONAL SECTION:
;ns3.good. 60 IN A 10.53.0.3
7. ns7 do a third round in minimizing the name and send a new query to
ns3.good.:
_.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN A
8. ans3 delegates 1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. to ns4.good.:
;; AUTHORITY SECTION:
;1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN NS ns4.good.
;; ADDITIONAL SECTION:
;ns4.good. 60 IN A 10.53.0.4
9. ns7 do fourth round in minimizing the name and send a new query to
ns4.good.:
_.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN A
10. ns4.good. doesn't know such name, but answers stating it is authoritative for
the domai:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 53815
...
;; AUTHORITY SECTION:
1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN SOA ns4.good. ...
11. ns7 do another minimization on name:
_.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
sends to ns4.good. and gets the same SOA response stated in item #10
12. ns7 do another minimization on name:
_.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
sends to ns4.good. and gets the same SOA response stated in item #10.
13. ns7 do the last query minimization name for the ip6.arpa. QNAME.
After all IPv6 labels are exausted the algorithm falls back to the
original QNAME:
test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
ns7 sends a new query with the original QNAME to ans4.
Diego Fronza [Wed, 26 Aug 2020 17:36:14 +0000 (14:36 -0300)]
Fix resolution of unusual ip6.arpa names
Before this commit, BIND was unable to resolve ip6.arpa names like
the one reported in issue #1847 when using query minimization.
As reported in the issue, an attempt to resolve a name like
'rec-test-dom-158937817846788.test123.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.3.4.3.5.4.0.8.2.6.0.1.0.0.2.ip6.arpa'
using default settings would fail.
The reason was that query minimization algorithm in 'fctx_minimize_qname'
would divide any ip6.arpa names in increasing number of labels,
7,11, ... up to 35, thus limiting the destination name (minimized) to a number
of 35 labels.
In case the last query minimization attempt (with 35 labels) would fail with
NXDOMAIN, BIND would attempt the query mininimization again with the exact
same QNAME, limited on the 35 labels, and that in turn would fail again.
This fix avoids this fail loop by considering the extra labels that may appear
in the leftmost part of an ip6.arpa name, those after the IPv6 part.
Multiply 1996-alloc_dnsbuf-crash-test.pkt by 300000 via TCP
The test for assertion failure via large TCP packet needs to be repeated
multiple times (we use 300000). This commit fixes the input file to be
properly hexlified and uses the new packet.pl -r feature to send it
300000 times via TCP.
For some tests, we need to send big data streams (for TCP) or repeated
packets (for UDP), this commits adds `-r` option to packet.pl that sends
the same input <repeats> times using the specified protocol.
is theoretically the most efficent in practice, using
memory_order_acq_rel produces the same code on x86_64 and doesn't
trigger tsan data races (which use a idealistic model) if
isc_refcount_destroy() is not called immediately. In fact
isc_refcount_destroy() could be removed if we didn't want
to check for the count being 0 when isc_refcount_destroy() is
called.
In order to lower the amount of memory allocated at startup by named
instances used in the BIND system test suite, set the default value of
"max-cache-size" for these to 2 megabytes. The purpose of this change
is to prevent named instances (or even entire virtual machines) from
getting killed by the operating system on the test host due to excessive
memory use.
Remove all "max-cache-size" statements from named configuration files
used in system tests ("checkconf" notwithstanding) to prevent confusion
as the "-T maxcachesize=..." command line option takes precedence over
configuration files.
Michał Kępień [Mon, 31 Aug 2020 11:15:33 +0000 (13:15 +0200)]
Add "-T maxcachesize=..." command line option
An implicit default of "max-cache-size 90%;" may cause memory use issues
on hosts which run numerous named instances in parallel (e.g. GitLab CI
runners) due to the cache RBT hash table now being pre-allocated [1] at
startup. Add a new command line option, "-T maxcachesize=...", to allow
the default value of "max-cache-size" to be overridden at runtime. When
this new option is in effect, it overrides any other "max-cache-size"
setting in the configuration, either implicit or explicit. This
approach was chosen because it is arguably the simplest one to
implement.
The following alternative approaches to solving this problem were
considered and ultimately rejected (after it was decided they were not
worth the extra code complexity):
- adding the same command line option, but making explicit
configuration statements have priority over it,
- adding a build-time option that allows the implicit default of
"max-cache-size 90%;" to be overridden.
Ondřej Surý [Wed, 26 Aug 2020 14:31:13 +0000 (16:31 +0200)]
Handle EPROTO errno from recvmsg
It was discovered, that some systems might set EPROTO instead of EACCESS
on recvmsg() call causing spurious syslog messages from the socket
code. This commit returns soft handling of EPROTO errno code to the
socket code. [GL #1928]
Ondřej Surý [Fri, 28 Aug 2020 07:30:29 +0000 (09:30 +0200)]
Fix off-by-one error when calculating new hashtable size
When calculating the new hashtable bitsize, there was an off-by-one
error that would allow the new bitsize to be larger than maximum allowed
causing assertion failure in the rehash() function.
Michal Nowak [Wed, 19 Aug 2020 09:16:11 +0000 (11:16 +0200)]
Print test-suite.log correctly in tarball system test job
Printing test-suite.log on system test failure does not work for system
test run from tarball because the "after_script" step does not honour
directory change from the "before_script" step and fails with:
Running after script...
$ cat bin/tests/system/test-suite.log
cat: bin/tests/system/test-suite.log: No such file or directory
Ondřej Surý [Mon, 24 Aug 2020 10:30:42 +0000 (12:30 +0200)]
Use the Fibonacci Hashing for the RBTDB glue table
The rbtdb version glue_table has been refactored similarly to rbt.c hash
table, so it does use 32-bit hash function return values and apply
Fibonacci Hashing to lookup the index to the hash table instead of
modulo. For more details, see the lib/dns/rbt.c commit log.
Ondřej Surý [Tue, 25 Aug 2020 08:04:11 +0000 (10:04 +0200)]
Add minimized (cmin-tmin-cmin) corpus for dns_message_parse fuzzer
The non-minimized corpus from https://github.com/CZ-NIC/dns-fuzzing was
used as input to afl-cmin, then every case were processed by afl-tmin
and then afl-cmin was used to further minimize the corpus again.
Ondřej Surý [Tue, 25 Aug 2020 07:51:40 +0000 (09:51 +0200)]
Add dns_message_parse() fuzzer
Previously, the bin/system/wire_test.c was optionally used as a fuzzer,
this commit extracts the parts relevant to the fuzzing into a
specialized fuzzer that can be used in oss-fuzz project.
The fuzzer parses the input as UDP DNS message, then prints parsed DNS
message, then renders the DNS message and then prints the rendered DNS
message. No part of the code should cause a assertion failure.
Mark Andrews [Tue, 25 Aug 2020 12:59:35 +0000 (22:59 +1000)]
Cast the original rcode to (dns_ttl_t) when setting extended rcode
Shifting (signed) integer left could trigger undefined behaviour when
the shifted value would overflow into the sign bit (e.g. 2048).
The issue was found when using AFL++ and UBSAN:
message.c:2274:33: runtime error: left shift of 2048 by 20 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior message.c:2274:33 in
Michal Nowak [Tue, 18 Aug 2020 16:27:29 +0000 (18:27 +0200)]
Fix warnings in when build with --enable-buffer-useinline
sockaddr.c:147:49: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness
rdata.c:1780:30: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness