Tobias Brunner [Tue, 31 May 2016 12:41:19 +0000 (14:41 +0200)]
ike-rekey: Handle undetected collisions also if delete is delayed
If the peer does not detect the rekey collision and deletes the old
IKE_SA and then receives the colliding rekey request it will respond with
TEMPORARY_FAILURE. That notify may arrive before the DELETE does, in
which case we may just conclude the rekeying initiated by the peer.
Also, since the IKE_SA is destroyed in any case when we receive a delete
there is no point in storing the delete task in collide() as process_i()
in the ike-rekey task will never be called.
Tobias Brunner [Tue, 31 May 2016 10:22:32 +0000 (12:22 +0200)]
ike-rekey: Properly handle situation if the peer did not notice the rekey collision
We conclude the rekeying before deleting the IKE_SA. Waiting for the
potential TEMPORARY_FAILURE notify is no good because if that response
does not reach us the peer will not retransmit it upon our retransmits
of the rekey request if it already deleted the IKE_SA after receiving
our response to the delete.
Tobias Brunner [Sat, 28 May 2016 07:34:29 +0000 (09:34 +0200)]
ikev2: Add a new state to track rekeyed IKE_SAs
This allows handling such IKE_SAs more specifically than keeping them
in state IKE_CONNECTING or IKE_ESTABLISHED (which we did when we lost a
collision - even triggering the ike_updown event), or using IKE_REKEYING for
them, which would also be ambiguous.
For instance, we can now reject anything but DELETEs for such SAs.
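A minimal sketch of how a dedicated state can gate incoming requests. The enum values and the function name are made up for illustration and are not the identifiers used in the code base:

```c
#include <stdbool.h>

/* Illustrative subset of IKE_SA states (hypothetical names) */
typedef enum {
    SA_CONNECTING,
    SA_ESTABLISHED,
    SA_REKEYING,
    SA_REKEYED,     /* old SA that has been replaced by a rekeying */
} sa_state_t;

/* Illustrative request kinds (hypothetical names) */
typedef enum {
    REQ_CREATE_CHILD_SA,
    REQ_INFORMATIONAL_DELETE,
    REQ_INFORMATIONAL_OTHER,
} request_kind_t;

/* Returns true if a request of the given kind should be processed */
static bool accept_request(sa_state_t state, request_kind_t kind)
{
    if (state == SA_REKEYED)
    {
        /* a rekeyed SA only waits for its DELETE */
        return kind == REQ_INFORMATIONAL_DELETE;
    }
    return true;
}
```

With a separate state the check is a single comparison. Reusing the established or rekeying state would need extra flags to tell a replaced SA apart from an active one.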
Tobias Brunner [Fri, 27 May 2016 08:17:53 +0000 (10:17 +0200)]
unit-tests: Make sure to flush the IKE_SA manager before destroying the sender
Because the static plugin that creates and destroys the default sender is
not initialized (due to the missing socket), the daemon won't destroy
our sender. Test cases will eventually have to flush the IKE_SA manager to
satisfy the leak detective. However, in case of a test failure and if there
are IKE_SAs in the manager the daemon will flush the SAs when deinitializing,
which will cause deletes to get sent. This crashes if the sender is already
destroyed.
Tobias Brunner [Thu, 26 May 2016 14:57:31 +0000 (16:57 +0200)]
unit-tests: Provide a wrapper around bus_t::add_listener and unregister them during cleanup
If listeners are triggered while cleaning up after a failed test (e.g.
via ike_sa_manager_t::flush), any remaining listeners defined on the stack
would cause a segmentation fault.
Tobias Brunner [Thu, 26 May 2016 13:08:09 +0000 (15:08 +0200)]
ike-rekey: Establish new IKE_SA earlier as responder, but only if no collision
Moving to the new SA only after receiving the DELETE for the old SA was
not ideal as it rendered the new SA unusable (because it simply didn't
exist in the manager) if the DELETE was delayed/got dropped.
Tobias Brunner [Wed, 25 May 2016 17:12:53 +0000 (19:12 +0200)]
child-delete: Check if the deleted CHILD_SA is the redundant SA of a collision
This happens if the peer deletes the redundant SA before we are able to
handle the response. The deleted SA will be in state CHILD_INSTALLED but
we don't want to trigger the child_updown() event for it or recreate it.
Tobias Brunner [Thu, 12 May 2016 10:22:35 +0000 (12:22 +0200)]
child-delete: Remove unnecessary call to destroy_child_sa()
Generally, we will not find the CHILD_SA by searching for it with the
outbound SPI (the initiator of the DELETE sent its inbound SPI) - and if
we found a CHILD_SA it would most likely be the wrong one (one in which
we used the same inbound SPI as the peer used for the one it deletes).
And we don't actually want to destroy the CHILD_SA at this point as we
know we already initiated a DELETE ourselves, which means that task
still has a reference to it and will destroy the CHILD_SA when it
receives the response from the other peer.
Tobias Brunner [Thu, 12 May 2016 11:49:11 +0000 (13:49 +0200)]
unit-tests: Don't unload plugins before calling libcharon_deinit()
libcharon_deinit() already calls all the functions we called manually.
Unloading the plugins will not work if charon->initialize() is called
as charon's static plugin features would already be unloaded before the
destroyed members are accessed in destroy() to flush them.
Tobias Brunner [Fri, 17 Jun 2016 09:18:25 +0000 (11:18 +0200)]
testing: Fix race in tnc/tnccs-20-pdp-pt-tls scenario
aacf84d837e7 ("testing: Add expect-connection calls for all tests and
hosts") removed the expect-connection call for the non-existing aaa
connection. However, because the credentials were loaded asynchronously
via start-script the clients might have been connecting when the secrets
were not yet loaded. As `swanctl --load-creds` is a synchronous call
this change avoids that issue without having to add a sleep or failing
expect-connection call.
Tobias Brunner [Fri, 17 Jun 2016 08:19:37 +0000 (10:19 +0200)]
daemon: Don't hold settings lock while executing start/stop scripts
If a called script interacts with the daemon or one of its plugins
another thread might have to acquire the write lock (e.g. to configure a
fallback or set a value). Holding the read lock prevents that, potentially
resulting in a deadlock.
Tobias Brunner [Thu, 26 Nov 2015 18:06:41 +0000 (19:06 +0100)]
testing: Use TLS 1.2 in RADIUS test cases
This took a while as in the OpenSSL package shipped with Debian and on which
our FIPS-enabled package is based, the function SSL_export_keying_material(),
which is used by FreeRADIUS to derive the MSK, did not use the correct digest
to calculate the result when TLS 1.2 was used. This caused IKE to fail with
"verification of AUTH payload with EAP MSK failed". The fix was only
backported to jessie recently.
Tobias Brunner [Thu, 26 Nov 2015 17:48:43 +0000 (18:48 +0100)]
testing: Update FreeRADIUS to 2.2.8
While this is not the latest 2.x release it is the latest in /old.
Upgrading to 3.0 might be possible, but it is unclear whether the TNC-FHH
patches could be easily updated. Upgrading to 3.1 will definitely not be possible
directly as that version removes the EAP-TNC module. So we'd first have to
get rid of the TNC-FHH stuff.
Tobias Brunner [Thu, 16 Jun 2016 15:57:37 +0000 (17:57 +0200)]
configure: Cache result of pthread_condattr_setclock() check
Even if not using caching when running the configure script (-C) this
allows pre-defining the result by setting the environment variable
ss_cv_func_pthread_condattr_setclock_monotonic=yes|no|unknown
before/while running the script.
As the check requires running a test program this might be helpful
when cross-compiling to disable using monotonic time if
pthread_condattr_setclock() is defined but not actually usable with
CLOCK_MONOTONIC.
Tobias Brunner [Fri, 17 Jun 2016 08:22:25 +0000 (10:22 +0200)]
load-tester: Fix load-tester on platforms where plain `char` is unsigned
fgetc() returns an int and EOF is usually -1, so when this gets cast to
a char the result depends on whether `char` means `signed char` or
`unsigned char` (the C standard leaves this implementation-defined). If it
is unsigned then its value is 0xff, so the comparison with EOF will always
fail as EOF is a negative signed int.
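A short sketch of the portable idiom and of the failing comparison. The function names are made up for illustration and this is not the load-tester code:

```c
#include <stdio.h>

/* Counts the bytes in a stream. The result of fgetc() is kept in an int so
 * that EOF stays distinguishable from every valid byte value, including 0xff. */
static long count_bytes(FILE *f)
{
    long n = 0;
    int c; /* int, not char */

    while ((c = fgetc(f)) != EOF)
    {
        n++;
    }
    return n;
}

/* Shows what the comparison sees on platforms where plain char is unsigned:
 * the stored value is promoted to a non-negative int and never equals EOF. */
static int eof_survives_cast_to_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;

    return c == EOF;
}
```

With a signed `char` the comparison happens to work for EOF, but a legitimate 0xff byte is then mistaken for the end of the file, so storing the result in an int is correct in both cases.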
Tobias Brunner [Thu, 16 Jun 2016 14:28:51 +0000 (16:28 +0200)]
Merge branch 'testing-jessie'
Updates the default Debian image used for the test environment from wheezy
to jessie. Also adds a script that allows chrooting to an image (base,
root or one of the guests). In pretty much all test scenarios
expect-connection is used to make test runs more reliable.
Tobias Brunner [Wed, 15 Jun 2016 16:19:23 +0000 (18:19 +0200)]
testing: Build hostapd from sources
There is a bug (fix at [1]) in hostapd 2.1-2.3 that causes it to crash when
used with the wired driver. The package in jessie (and sid) is affected, so we
build it from sources (the same, older version as wpa_supplicant).
Tobias Brunner [Tue, 14 Jun 2016 18:41:43 +0000 (20:41 +0200)]
testing: Wait for packets to be processed by tcpdump
Sometimes tcpdump fails to process all packets during the short running
time of a scenario:
0 packets captured
18 packets received by filter
0 packets dropped by kernel
So 18 packets were captured by libpcap but tcpdump had not yet processed
and printed them.
This tries to use --immediate-mode if supported by tcpdump (the one
currently in jessie or wheezy does not, but the one in jessie-backports
does), which disables the buffering in libpcap.
However, even with immediate mode there are cases where it takes a while
longer for all packets to get processed. And without it we also need a
workaround (even though the version in wheezy actually works fine).
That's why there now is a loop checking for differences in captured vs.
received packets. There are actually cases where these numbers are not
equal but we still captured all packets we're interested in, so we abort
after 1s of retrying. But sometimes it could still happen that packets
we expected got lost somewhere ("packets dropped by kernel" is not
always 0 either).
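The decision made in each iteration of such a loop can be sketched like this. The actual check lives in the test scripts. This C version, including the function names and the way elapsed time is passed in, is only an illustration of the logic:

```c
#include <stdio.h>
#include <string.h>

/* Extracts the "packets captured" and "packets received by filter" counters
 * from a tcpdump summary. Returns 1 if both were found, 0 otherwise. */
static int parse_tcpdump_stats(const char *summary, long *captured,
                               long *received)
{
    const char *line = summary;
    int have_captured = 0, have_received = 0;
    char word[16];
    long n;

    while (line && *line)
    {
        if (sscanf(line, "%ld packets %15s", &n, word) == 2)
        {
            if (strcmp(word, "captured") == 0)
            {
                *captured = n;
                have_captured = 1;
            }
            else if (strcmp(word, "received") == 0)
            {
                *received = n;
                have_received = 1;
            }
        }
        line = strchr(line, '\n');
        if (line)
        {
            line++;
        }
    }
    return have_captured && have_received;
}

/* Returns 1 if the caller should retry, 0 if it should stop: either all
 * received packets were processed or the 1s budget is used up. */
static int should_keep_waiting(const char *summary, long elapsed_ms)
{
    long captured, received;

    if (elapsed_ms >= 1000)
    {
        return 0;
    }
    if (!parse_tcpdump_stats(summary, &captured, &received))
    {
        return 1; /* no statistics available yet */
    }
    return captured != received;
}
```

The time limit matters because, as noted above, the two counters do not always converge even when all relevant packets were captured.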
Tobias Brunner [Fri, 20 Nov 2015 16:50:29 +0000 (17:50 +0100)]
testing: Update base image to Debian jessie
Several packages got renamed/updated, and libgcrypt was apparently installed
by default previously.
Since most libraries changed we have to completely rebuild all the tools
installed in the root image. We currently don't provide a clean target in
the recipes, and even if we did we'd have to track which base image we
last built for. It's easier to just use a different build directory for
each base image, at the cost of some additional disk space (if not manually
cleaned). However, that's also the case when updating kernel or
software versions.