Evan Hunt [Sat, 26 Jan 2019 18:36:47 +0000 (10:36 -0800)]
enable parallel system tests on windows
this moves the creation of "parallel.mk" into a separate shell script
instead of bin/tests/system/Makefile. that shell script can now be
executed by runall.sh, allowing us to make use of the cygwin "make"
command, which supports parallel execution.
Michał Kępień [Fri, 26 Apr 2019 18:38:02 +0000 (20:38 +0200)]
Simplify trailing period handling in system tests
Windows systems do not allow a trailing period in file names while Unix
systems do. When BIND system tests are run, the $TP environment
variable is set to an empty string on Windows systems and to "." on Unix
systems. This environment variable is then used by system test scripts
for handling this discrepancy properly.
In multiple system test scripts, a variable holding a zone name is set
to a string with a trailing period while the names of the zone's
corresponding dlvset-* and/or dsset-* files are determined using
numerous sed invocations like the following one:
In order to improve code readability, use zone names without trailing
periods and replace sed invocations with variable substitutions.
To retain local consistency, also remove the trailing period from
certain other zone names used in system tests that are not subsequently
processed using sed.
in the "refactor tcpquota and pipeline refs" commit, the counting
of active interfaces was tightened in such a way that named could
fail to listen on an interface if there were more interfaces than
tcp-clients. when checking the quota to start accepting on an
interface, if the number of active clients was above zero, then
it was presumed that some other client was able to handle accepting
new connections. this, however, ignored the fact that the current client
could be included in that count, so if the quota was already exceeded
before all the interfaces were listening, some interfaces would never
listen.
we now check whether the current client has been marked active; if so,
then the number of active clients on the interface must be greater
than 1, not 0.
refactor tcpquota and pipeline refs; allow special-case overrun in isc_quota
- if the TCP quota has been exceeded but there are no clients listening
for new connections on the interface, we can now force attachment to the
quota using isc_quota_force(), instead of carrying on with the quota not
attached.
- the TCP client quota is now referenced via a reference-counted
'ns_tcpconn' object, one of which is created whenever a client begins
listening for new connections, and attached to by members of that
client's pipeline group. when the last reference to the tcpconn
object is detached, it is freed and the TCP quota slot is released.
- reduce code duplication by adding mark_tcp_active() function.
- convert counters to atomic.
better tcpquota accounting and client mortality checks
- ensure that tcpactive is cleaned up correctly when accept() fails.
- set 'client->tcpattached' when the client is attached to the tcpquota.
carry this value on to new clients sharing the same pipeline group.
don't call isc_quota_detach() on the tcpquota unless tcpattached is
set. this way clients that were allowed to accept TCP connections
despite being over quota (and therefore, were never attached to the
quota) will not inadvertently detach from it and mess up the
accounting.
- simplify the code for tcpquota disconnection by using a new function
tcpquota_disconnect().
- before deciding whether to reject a new connection due to quota
exhaustion, check to see whether there are at least two active
clients. previously, this was "at least one", but that could be
insufficient if there was one other client in READING state (waiting
for messages on an open connection) but none in READY (listening
for new connections).
- before deciding whether a TCP client object can to go inactive, we
must ensure there are enough other clients to maintain service
afterward -- both accepting new connections and reading/processing new
queries. A TCP client can't shut down unless at least one
client is accepting new connections and (in the case of pipelined
clients) at least one additional client is waiting to read.
Witold Kręcicki [Fri, 4 Jan 2019 11:50:51 +0000 (12:50 +0100)]
tcp-clients could still be exceeded (v2)
the TCP client quota could still be ineffective under some
circumstances. this change:
- improves quota accounting to ensure that TCP clients are
properly limited, while still guaranteeing that at least one client
is always available to serve TCP connections on each interface.
- uses more descriptive names and removes one (ntcptarget) that
was no longer needed
- adds comments
Witold Kręcicki [Thu, 3 Jan 2019 13:17:43 +0000 (14:17 +0100)]
fix enforcement of tcp-clients (v1)
tcp-clients settings could be exceeded in some cases by
creating more and more active TCP clients that are over
the set quota limit, which in the end could lead to a
DoS attack by e.g. exhaustion of file descriptors.
If TCP client we're closing went over the quota (so it's
not attached to a quota) mark it as mortal - so that it
will be destroyed and not set up to listen for new
connections - unless it's the last client for a specific
interface.
Key IDs may accidentally match dig output that is not the key ID (for
example the RRSIG inception or expiration time, the query ID, ...).
Search for key ID + signer name should prevent that, as that is what
only should occur in the RRSIG record, and signer name always follows
the key ID.
Remove sleep calls from test, rely on wait_for_log(). Make
wait_for_log() and dnssec_loadkeys_on() fail the test if the
appropriate log line is not found.
Slightly adjust the echo_i() lines to print only the key ID (not the
key name).
Michał Kępień [Tue, 23 Apr 2019 12:59:05 +0000 (14:59 +0200)]
Wait more than 1 second for NSEC3 chain changes
One second may not be enough for an NSEC3 chain change triggered by an
UPDATE message to complete. Wait up to 10 seconds when checking whether
a given NSEC3 chain change is complete in the "nsupdate" system test.
Michał Kępień [Tue, 23 Apr 2019 12:59:05 +0000 (14:59 +0200)]
Remove redundant sleeps
In the "nsupdate" system test, do not sleep before checking results of
changes which are expected to be processed synchronously, i.e. before
nsupdate returns.
Michał Kępień [Fri, 19 Apr 2019 09:21:43 +0000 (11:21 +0200)]
Update interface lists in ifconfig scripts
Make bin/tests/system/ifconfig.bat also configure addresses ending with
9 and 10, so that the script is in sync with its Unix counterpart.
Update comments listing the interfaces created by ifconfig.{bat,sh} so
that they do not include addresses whose last octet is zero (since an
address like 10.53.1.0/24 is not a valid host address and thus the
aforementioned scripts do not even attempt configuring them).
Michał Kępień [Fri, 19 Apr 2019 09:21:43 +0000 (11:21 +0200)]
Fix the "dnssec" system test on Windows
On Windows, the bin/tests/system/dnssec/signer/example.db.signed file
contains carriage return characters at the end of each line. Remove
them before passing the aforementioned file to the awk script extracting
key IDs so that the latter can work properly.
Michał Kępień [Fri, 19 Apr 2019 09:21:43 +0000 (11:21 +0200)]
Do not wait for lock file cleanup on Windows
As signals are currently not handled by named on Windows, instances
terminated using signals are not able to perform a clean shutdown, which
involves e.g. removing the lock file. Thus, waiting for a given
instance's lock file to be removed beforing assuming it is shut down
is pointless on Windows, so do not even attempt it.
Michał Kępień [Fri, 19 Apr 2019 07:37:51 +0000 (09:37 +0200)]
win32: fix service state reported during shutdown
When a Windows service receives a request to stop, it should not set its
state to SERVICE_STOPPED until it is completely shut down as doing that
allows the operating system to kill that service prematurely, which in
the case of named may e.g. prevent the PID file and/or the lock file
from being cleaned up.
Set service state to SERVICE_STOP_PENDING when named begins its shutdown
and only report the SERVICE_STOPPED state immediately before exiting.
Matthijs Mekking [Tue, 15 Jan 2019 13:12:14 +0000 (14:12 +0100)]
DLV tests unsupported/disabled algorithms
This tests both the cases when the DLV trust anchor is of an
unsupported or disabled algorithm, as well as if the DLV zone
contains a key with an unsupported or disabled algorithm.
* Remove dnskey-sig-validity option (added in 9.12)
* Replace rndccmd, dig_with_opts with export variables
* Remove tests for CDNSKEY and CDS (in 9.11 always signed with ZSK)
Matthijs Mekking [Fri, 22 Mar 2019 14:42:10 +0000 (15:42 +0100)]
With update-check-ksk also consider offline keys
The option `update-check-ksk` will look if both KSK and ZSK are
available before signing records. It will make sure the keys are
active and available. However, for operational practices keys may
be offline. This commit relaxes the update-check-ksk check and will
mark a key that is offline to be available when adding signature
tasks.
Matthijs Mekking [Thu, 14 Mar 2019 08:32:20 +0000 (09:32 +0100)]
Add test for ZSK rollover while KSK offline
This commit adds a lengthy test where the ZSK is rolled but the
KSK is offline (except for when the DNSKEY RRset is changed). The
specific scenario has the `dnskey-kskonly` configuration option set
meaning the DNSKEY RRset should only be signed with the KSK.
A new zone `updatecheck-kskonly.secure` is added to test against,
that can be dynamically updated, and that can be controlled with rndc
to load the DNSSEC keys.
There are some pre-checks for this test to make sure everything is
fine before the ZSK roll, after the new ZSK is published, and after
the old ZSK is deleted. Note there are actually two ZSK rolls in
quick succession.
When the latest added ZSK becomes active and its predecessor becomes
inactive, the KSK is offline. However, the DNSKEY RRset did not
change and it has a good signature that is valid for long enough.
The expected behavior is that the DNSKEY RRset stays signed with
the KSK only (signature does not need to change). However, the
test will fail because after reconfiguring the keys for the zone,
it wants to add re-sign tasks for the new active keys (in sign_apex).
Because the KSK is offline, named determines that the only other
active key, the latest ZSK, will be used to resign the DNSKEY RRset,
in addition to keeping the RRSIG of the KSK.
The question is: Why do we need to resign the DNSKEY RRset
immediately when a new key becomes active? This is not required,
only once the next resign task is triggered the new active key
should replace signatures that are in need of refreshing.
Mark Andrews [Tue, 26 Feb 2019 23:21:33 +0000 (10:21 +1100)]
Add dns_rdata_totext() and dns_rdata_fromtext() to fromwire
Add dns_rdata_totext() and dns_rdata_fromtext() to fromwire for
valid inputs to ensure that what we accept in dns_rdata_fromwire()
can be written out and read back in.