Tobias Brunner [Fri, 17 Jun 2016 09:18:25 +0000 (11:18 +0200)]
testing: Fix race in tnc/tnccs-20-pdp-pt-tls scenario
aacf84d837e7 ("testing: Add expect-connection calls for all tests and
hosts") removed the expect-connection call for the non-existing aaa
connection. However, because the credentials were loaded asynchronously
via start-script the clients might have been connecting when the secrets
were not yet loaded. As `swanctl --load-creds` is a synchronous call
this change avoids that issue without having to add a sleep or a failing
expect-connection call.
Tobias Brunner [Fri, 17 Jun 2016 08:19:37 +0000 (10:19 +0200)]
daemon: Don't hold settings lock while executing start/stop scripts
If a called script interacts with the daemon or one of its plugins
another thread might have to acquire the write lock (e.g. to configure a
fallback or set a value). Holding the read lock prevents that, potentially
resulting in a deadlock.
Tobias Brunner [Thu, 26 Nov 2015 18:06:41 +0000 (19:06 +0100)]
testing: Use TLS 1.2 in RADIUS test cases
This took a while as in the OpenSSL package shipped with Debian and on which
our FIPS-enabled package is based, the function SSL_export_keying_material(),
which is used by FreeRADIUS to derive the MSK, did not use the correct digest
to calculate the result when TLS 1.2 was used. This caused IKE to fail with
"verification of AUTH payload with EAP MSK failed". The fix was only
backported to jessie recently.
Tobias Brunner [Thu, 26 Nov 2015 17:48:43 +0000 (18:48 +0100)]
testing: Update FreeRADIUS to 2.2.8
While this is not the latest 2.x release, it is the latest in /old.
Upgrading to 3.0 might be possible, but it is not clear whether the TNC-FHH
patches could be easily updated. Upgrading to 3.1 will definitely not be
possible directly as that version removes the EAP-TNC module. So we'd first
have to get rid of the TNC-FHH stuff.
Tobias Brunner [Thu, 16 Jun 2016 15:57:37 +0000 (17:57 +0200)]
configure: Cache result of pthread_condattr_setclock() check
Even if not using caching when running the configure script (-C) this
allows pre-defining the result by setting the environment variable
ss_cv_func_pthread_condattr_setclock_monotonic=yes|no|unknown
before/while running the script.
As the check requires running a test program this might be helpful
when cross-compiling to disable using monotonic time if
pthread_condattr_setclock() is defined but not actually usable with
CLOCK_MONOTONIC.
Tobias Brunner [Fri, 17 Jun 2016 08:22:25 +0000 (10:22 +0200)]
load-tester: Fix load-tester on platforms where plain `char` is unsigned
fgetc() returns an int and EOF is usually -1, so when this gets cast to
a char the result depends on whether `char` means `signed char` or
`unsigned char` (the C standard does not specify it). If it is unsigned
then its value is 0xff, so the comparison with EOF will fail as EOF is
implicitly a signed int.
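The effect can be reproduced with a short C sketch. The helper names
(`count_bytes`, `eof_matches_after_unsigned_cast`) are made up for
illustration and are not taken from the load-tester plugin. The point is only
that the return value of fgetc() has to stay in an int until it has been
compared with EOF.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a stream. The result of fgetc()
 * is kept in an int so that EOF stays distinguishable from the byte 0xff. */
static size_t count_bytes(FILE *stream)
{
	size_t count = 0;
	int c;

	while ((c = fgetc(stream)) != EOF)
	{
		count++;
	}
	return count;
}

/* Hypothetical helper: returns 1 if EOF still compares equal to itself after
 * being squeezed into an unsigned char. On the usual platforms, where int is
 * wider than char, the converted value is promoted to a non-negative int for
 * the comparison, so it can never equal the negative EOF and this returns 0. */
static int eof_matches_after_unsigned_cast(void)
{
	unsigned char c = (unsigned char)EOF;

	return c == EOF;
}
```

A loop written as `while ((ch = fgetc(stream)) != EOF)` with `ch` declared as
plain `char` would therefore never terminate where `char` is unsigned.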
Tobias Brunner [Thu, 16 Jun 2016 14:28:51 +0000 (16:28 +0200)]
Merge branch 'testing-jessie'
Updates the default Debian image used for the test environment from wheezy
to jessie. Also adds a script that allows chrooting to an image (base,
root or one of the guests). In pretty much all test scenarios
expect-connection is used to make test runs more reliable.
Tobias Brunner [Wed, 15 Jun 2016 16:19:23 +0000 (18:19 +0200)]
testing: Build hostapd from sources
There is a bug (fix at [1]) in hostapd 2.1-2.3 that causes it to crash when used
with the wired driver. The package in jessie (and sid) is affected, so we
build it from sources (same, older, version as wpa_supplicant).
Tobias Brunner [Tue, 14 Jun 2016 18:41:43 +0000 (20:41 +0200)]
testing: Wait for packets to be processed by tcpdump
Sometimes tcpdump fails to process all packets during the short running
time of a scenario:
  0 packets captured
  18 packets received by filter
  0 packets dropped by kernel
So 18 packets were captured by libpcap but tcpdump did not yet process
and print them.
This tries to use --immediate-mode if supported by tcpdump (the one
currently in jessie or wheezy does not, but the one in jessie-backports
does), which disables the buffering in libpcap.
However, even with immediate mode there are cases where it takes a while
longer for all packets to get processed. And without it we also need a
workaround (even though the version in wheezy actually works fine).
That's why there now is a loop checking for differences in captured vs.
received packets. There are actually cases where these numbers are not
equal but we still captured all packets we're interested in, so we abort
after 1s of retrying. But sometimes it could still happen that packets
we expected got lost somewhere ("packets dropped by kernel" is not
always 0 either).
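The comparison of captured vs. received packets can be illustrated with a
small C sketch. This is not the code used by the test scripts, and
`tcpdump_caught_up` is a hypothetical name. It only assumes that the summary
starts with the "packets captured" line followed by the "packets received by
filter" line, as in the output quoted above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse the first two lines of a tcpdump summary.
 * Returns 1 if all received packets were processed, 0 if tcpdump is still
 * behind, and -1 if the summary does not have the assumed layout. */
static int tcpdump_caught_up(const char *summary)
{
	unsigned int captured, received;
	const char *second_line;

	if (sscanf(summary, "%u packets captured", &captured) != 1)
	{
		return -1;
	}
	second_line = strchr(summary, '\n');
	if (!second_line ||
		sscanf(second_line + 1, "%u packets received", &received) != 1)
	{
		return -1;
	}
	return captured == received;
}
```

A caller would retry while the result is 0 and give up after a bounded time,
since the two numbers are not guaranteed to ever become equal.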
Tobias Brunner [Fri, 20 Nov 2015 16:50:29 +0000 (17:50 +0100)]
testing: Update base image to Debian jessie
Several packages got renamed/updated, libgcrypt was apparently installed
by default previously.
Since most libraries changed we have to completely rebuild all the tools
installed in the root image. We currently don't provide a clean target in
the recipes, and even if we did we'd have to track which base image we
last built for. It's easier to just use a different build directory for
each base image, at the cost of some additional disk space (if not manually
cleaned). However, that's also the case when updating kernel or
software versions.
Tobias Brunner [Tue, 24 Nov 2015 17:32:23 +0000 (18:32 +0100)]
testing: Update Apache config for newer Debian releases
It is still compatible with the current release as the config in
sites-available will be ignored, while conf-enabled does not exist and
is not included in the main config.
Tobias Brunner [Fri, 20 Nov 2015 16:42:55 +0000 (17:42 +0100)]
testing: Don't attempt to stop services when building base image
Unlike `apt-get install` in a chroot, debootstrap does not seem to start
the services, but stopping them might cause problems if they were running
outside the chroot.
Tobias Brunner [Wed, 15 Jun 2016 09:22:04 +0000 (11:22 +0200)]
leak-detective: Make sure to actually call malloc() from calloc() hook
Newer versions of GCC are too "smart" and replace a call to malloc(X)
followed by a call to memset(0,X) with a call to calloc(), which obviously
results in an infinite loop when it does that in our own calloc()
implementation. Using `volatile` for the variable storing the total size
prevents the optimization and we actually call malloc().
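A minimal sketch of the pattern, assuming a replacement allocator named
`zeroed_alloc` (a hypothetical name, not the actual leak-detective hook). The
overflow check is an addition for the sake of a self-contained example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a calloc() replacement that has to end up in
 * malloc(). Keeping the total size in a volatile variable is the measure
 * described above to stop the compiler from merging the malloc()/memset()
 * pair into a single calloc() call. */
static void *zeroed_alloc(size_t nmemb, size_t size)
{
	volatile size_t total;
	void *ptr;

	if (size && nmemb > SIZE_MAX / size)
	{	/* the multiplication below would overflow */
		return NULL;
	}
	total = nmemb * size;
	ptr = malloc(total);
	if (ptr)
	{
		memset(ptr, 0, total);
	}
	return ptr;
}
```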
Tobias Brunner [Wed, 8 Jun 2016 17:43:13 +0000 (19:43 +0200)]
resolve: Add refcounting for installed DNS servers
This fixes DNS server installation if make-before-break reauthentication
is used, as there the new SA and its DNS server are installed before the
old IKE_SA is torn down, at which point the DNS server was removed again.
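The idea can be sketched in C as follows. The table layout and the names
`dns_acquire` and `dns_release` are assumptions made for this example and do
not mirror the resolve plugin's actual data structures.

```c
#include <string.h>

#define MAX_SERVERS 8
#define MAX_ADDR_LEN 64

/* One installed DNS server and the number of SAs referencing it */
typedef struct {
	char addr[MAX_ADDR_LEN];
	unsigned int refcount;
} dns_entry_t;

static dns_entry_t servers[MAX_SERVERS];

/* Returns 1 if the server has to be installed now (first reference), 0 if
 * it is already installed, -1 if the table is full or the address is too
 * long. */
static int dns_acquire(const char *addr)
{
	dns_entry_t *free_slot = NULL;
	int i;

	if (strlen(addr) >= MAX_ADDR_LEN)
	{
		return -1;
	}
	for (i = 0; i < MAX_SERVERS; i++)
	{
		if (servers[i].refcount && strcmp(servers[i].addr, addr) == 0)
		{
			servers[i].refcount++;
			return 0;
		}
		if (!servers[i].refcount && !free_slot)
		{
			free_slot = &servers[i];
		}
	}
	if (!free_slot)
	{
		return -1;
	}
	strcpy(free_slot->addr, addr);
	free_slot->refcount = 1;
	return 1;
}

/* Returns 1 if the last reference was dropped and the server has to be
 * removed, 0 if it is still referenced or was never installed. */
static int dns_release(const char *addr)
{
	int i;

	for (i = 0; i < MAX_SERVERS; i++)
	{
		if (servers[i].refcount && strcmp(servers[i].addr, addr) == 0)
		{
			return --servers[i].refcount == 0;
		}
	}
	return 0;
}
```

With make-before-break the new SA acquires the server while the old one still
holds it, so releasing the old reference leaves the server installed.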
Tobias Brunner [Tue, 7 Jun 2016 13:58:05 +0000 (15:58 +0200)]
resolve: Make sure to clean up if calling resolvconf failed
If running resolvconf fails, handle() fails and release() is not called, which
might leave an interface file on the system (or, depending on which script
called by resolvconf actually failed, even the installed DNS server).
Tobias Brunner [Fri, 10 Jun 2016 16:15:42 +0000 (18:15 +0200)]
Merge branch 'interface-for-routes'
Changes how the interface for routes installed with policies is
determined. In most cases we now use the interface over which we reach the
other peer, not the interface on which the local address (or the source IP) is
installed. However, that might be the same interface depending on the
configuration (i.e. in practice there will often not be a change).
Routes are not installed anymore for drop policies and for policies with
protocol/port selectors.
Tobias Brunner [Thu, 9 Jun 2016 13:38:37 +0000 (15:38 +0200)]
kernel-netlink: Don't install routes for drop policies and if protocol/ports are in the selector
We don't need them for drop policies and they might even mess with other
routes we install. Routes for policies with protocol/ports in the
selector will always be too broad and might conflict with other routes
we install.
Tobias Brunner [Fri, 11 Mar 2016 18:09:54 +0000 (19:09 +0100)]
kernel-netlink: Use interface to next hop for shunt policies
Using the source address to determine the interface is not correct for
net-to-net shunts between two interfaces on which the host has IP addresses
for each subnet.
Tobias Brunner [Wed, 25 May 2016 10:15:38 +0000 (12:15 +0200)]
kernel-netlink: Let only a single thread work on a specific policy
Other threads are free to add/update/delete other policies.
This tries to prevent race conditions caused by releasing the mutex while
sending messages to the kernel. For instance, if break-before-make
reauthentication is used, one thread on the responder might be delayed in
deleting the policies that another thread is concurrently adding for the
new SA. This could have resulted in no policies being installed
eventually.
Tobias Brunner [Wed, 8 Jun 2016 14:06:53 +0000 (16:06 +0200)]
ipsec: Add function to compare two ipsec_sa_cfg_t instances
memeq() is currently used to compare these but if there is padding that
is not initialized the same for two instances the comparison fails.
Using this function ensures the objects are compared correctly.
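A field-by-field comparison avoids the padding problem. The struct below is a
made-up example and not the real ipsec_sa_cfg_t; its member names are
assumptions chosen only so that a typical ABI inserts padding after the first
member.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical configuration struct, NOT the real ipsec_sa_cfg_t. On most
 * ABIs the compiler inserts padding bytes after `mode` to align `reqid`. */
typedef struct {
	uint8_t mode;
	uint32_t reqid;
	uint16_t cpi;
} example_sa_cfg_t;

/* Compares only the named members, so uninitialized padding bytes cannot
 * influence the result the way they can with a bytewise comparison. */
static bool example_sa_cfg_equals(const example_sa_cfg_t *a,
								  const example_sa_cfg_t *b)
{
	return a->mode == b->mode &&
		   a->reqid == b->reqid &&
		   a->cpi == b->cpi;
}
```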
Tobias Brunner [Tue, 24 May 2016 08:26:38 +0000 (10:26 +0200)]
eap-simaka-pseudonym: Properly store mappings
If a pseudonym changed a new entry was added to the table storing
permanent identity objects (that are used as keys in the other table).
However, the old mapping was not removed while replacing the mapping in
the pseudonym table caused the old pseudonym to get destroyed. This
eventually caused crashes when a new pseudonym had the same hash value as
such a defunct entry and keys had to be compared.
Tobias Brunner [Thu, 19 May 2016 09:56:44 +0000 (11:56 +0200)]
child-sa: Use non-static variable to store generated unique mark
If two CHILD_SAs with mark=%unique are created concurrently they could
otherwise end up with either the same mark or different marks in both
directions.
Tobias Brunner [Wed, 25 May 2016 07:42:08 +0000 (09:42 +0200)]
ike: Don't trigger message hook when fragmenting pre-generated messages
This is the case for the IKE_SA_INIT and the initial IKEv1 messages, which
are pre-generated in tasks as at least parts of it are used to generate
the AUTH payload. The IKE_SA_INIT message will never be fragmented, but
the IKEv1 messages might be, so we can't just call generate_message().
Some peers send an INITIAL_CONTACT notify after they received our XAuth
username. The XAuth task waiting for the third XAuth message handles
this incorrectly and closes the IKE_SA as no configuration payloads are
contained in the message. We queue the INFORMATIONAL until the XAuth
exchange is complete to avoid this issue.
Martin Willi [Thu, 19 May 2016 09:13:24 +0000 (11:13 +0200)]
af-alg: Silently skip probing algorithms if AF_ALG is not supported
If the af-alg plugin is enabled, but kernel support is missing, we get
an error line during startup for each probed algorithm. This is way too
verbose, so just skip probing if AF_ALG is unsupported.
Tobias Brunner [Tue, 10 May 2016 10:09:24 +0000 (12:09 +0200)]
vici: Replace dr with dev in version numbers for the Python egg
The versioning scheme used by Python (PEP 440) supports the rcN suffix
but development releases have to be named devN, not drN. The latter is
not supported and such versions are considered legacy versions.
Tobias Brunner [Mon, 2 May 2016 12:21:30 +0000 (14:21 +0200)]
child-sa: Install "outbound" FWD policy with lower priority
This provides a fix if symmetrically overlapping policies are
installed, as is the case, e.g., in the ikev2/ip-two-pools-db scenario:
  carol 10.3.0.1/32 ----- 10.3.0.0/16, 10.4.0.0/16 moon
  alice 10.4.0.1/32 ----- 10.3.0.0/16, 10.4.0.0/16 moon
Among others, the following FWD policies are installed on moon:
  src 10.3.0.1/32 dst 10.4.0.0/16
    ...
    tmpl ...
  src 10.4.0.0/16 dst 10.3.0.1/32
    ...
  src 10.4.0.1/32 dst 10.3.0.0/16
    ...
    tmpl ...
  src 10.3.0.0/16 dst 10.4.0.1/32
    ...
Because the network prefixes are the same for all of these they all have the
same priority. Due to that it depends on the install order which policy gets
used. For instance, a packet from 10.3.0.1 to 10.4.0.1 will match the
first as well as the last policy. However, when handling the inbound
packet we have to use the first one as the packet will otherwise be
dropped due to a template mismatch. And we can't install templates with
the "outbound" FWD policies as that would prevent using different
IPsec modes or e.g. IPComp on only one of multiple SAs.
Instead we install the "outbound" FWD policies with a lower priority
than the "inbound" FWD policies so the latter are preferred. But we use
a higher priority than default drop policies would use (in case they'd
be defined with the same subnets).
Tobias Brunner [Wed, 4 May 2016 13:39:51 +0000 (15:39 +0200)]
kernel-netlink: Check proper watcher state in parallel mode
After adding the read callback the state is WATCHER_QUEUED and it is
switched to WATCHER_RUNNING only later by an asynchronous job. This means
that a thread that sent a Netlink message shortly after registration
might see the state as WATCHER_QUEUED. If it then tries to read the
response and the watcher thread is quicker to actually read the message
from the socket, it could block on recv() while still holding the lock.
And the asynchronous job that actually read the message and tries to queue
it will block while trying to acquire the lock, so we'd end up in a deadlock.
This is probably mostly a problem in the unit tests.
trap-manager: Allow local address to be unspecified
If there is currently no route to reach the other peer we just default
to left=%any. The local address is only really used to resolve
leftsubnet=%dynamic anyway (and perhaps for MIPv6 proxy transport mode).
kernel-netlink: Order routes by prefix before comparing priority/metric
Metrics are basically defined to order routes with equal prefix, so ordering
routes by metric first does not make much sense as that could prefer totally
unspecific routes over very specific ones.
For instance, the previous code broke installation of routes for
passthrough policies with two routes like these in the main routing table:
  default via 192.168.2.1 dev eth0 proto static
  192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.10 metric 1
Because the default route has no metric set (0) it was used, instead of the
more specific other one, to determine src and next hop when installing a route
for a passthrough policy for 192.168.2.0/24. Therefore, the installed route
in table 220 then incorrectly redirected all local traffic to "next hop"
192.168.2.1.
The same issue occurred when determining the source address while
installing trap policies.
Fixes 6b57790270fb ("kernel-netlink: Respect kernel routing priorities for IKE routes").
Fixes #1416.
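The ordering described in this commit can be sketched as a qsort() comparator.
The struct and function names are assumptions for illustration and do not
correspond to the kernel-netlink plugin's types. The sketch also assumes that
all candidate routes already match the destination.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical route record, reduced to the two fields that matter here */
typedef struct {
	uint8_t prefixlen;
	uint32_t metric;
	const char *desc;
} example_route_t;

/* Orders routes so that the most specific prefix comes first. The metric is
 * only consulted to order routes with the same prefix length. */
static int route_cmp(const void *a, const void *b)
{
	const example_route_t *x = a, *y = b;

	if (x->prefixlen != y->prefixlen)
	{	/* longer prefix first */
		return x->prefixlen > y->prefixlen ? -1 : 1;
	}
	if (x->metric != y->metric)
	{	/* lower metric first */
		return x->metric < y->metric ? -1 : 1;
	}
	return 0;
}

static void sort_routes(example_route_t *routes, size_t count)
{
	qsort(routes, count, sizeof(*routes), route_cmp);
}
```

Applied to the two routes quoted above, the /24 route with metric 1 sorts
before the default route with metric 0, whereas sorting by metric first would
pick the default route.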