Martin Willi [Thu, 26 Mar 2015 10:26:51 +0000 (11:26 +0100)]
aesni: Use 4-way parallel AES-NI instructions for CTR en/decryption
CTR can be parallelized, and we do so by queueing instructions to the processor
pipeline. While we have enough registers for 128-bit keys, the register
count is insufficient to hold all variables with larger key sizes. Nonetheless,
4-way parallelism is faster, by ~10% to ~25% depending on key size.
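CTR parallelizes because each keystream block depends only on the key and the
counter value, never on a neighboring block. The following is a minimal sketch
of such a 4-way interleaved loop, not the code from the aesni plugin. All names
are hypothetical, and toy_block() is an insecure stand-in for the AES round
sequence, used only to show the loop structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for encrypting one counter block; NOT AES, NOT secure */
static uint64_t toy_block(uint64_t key, uint64_t ctr)
{
    uint64_t x = ctr ^ key;

    x *= 0x9E3779B97F4A7C15ULL;
    x ^= x >> 29;
    return x;
}

/* Reference loop: one keystream block per iteration */
void ctr_crypt_1way(uint64_t key, uint64_t ctr, const uint64_t *in,
                    uint64_t *out, size_t blocks)
{
    size_t i;

    for (i = 0; i < blocks; i++)
    {
        out[i] = in[i] ^ toy_block(key, ctr + i);
    }
}

/* 4-way loop: the four toy_block() calls have no data dependency on each
 * other, so a pipelined CPU can overlap them */
void ctr_crypt_4way(uint64_t key, uint64_t ctr, const uint64_t *in,
                    uint64_t *out, size_t blocks)
{
    size_t i = 0;

    for (; i + 4 <= blocks; i += 4)
    {
        uint64_t k0 = toy_block(key, ctr + i);
        uint64_t k1 = toy_block(key, ctr + i + 1);
        uint64_t k2 = toy_block(key, ctr + i + 2);
        uint64_t k3 = toy_block(key, ctr + i + 3);

        out[i]     = in[i]     ^ k0;
        out[i + 1] = in[i + 1] ^ k1;
        out[i + 2] = in[i + 2] ^ k2;
        out[i + 3] = in[i + 3] ^ k3;
    }
    /* remaining blocks, fewer than four */
    for (; i < blocks; i++)
    {
        out[i] = in[i] ^ toy_block(key, ctr + i);
    }
}
```

Both loops produce identical output; the 4-way variant only changes the order
in which independent work is issued.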
Martin Willi [Thu, 26 Mar 2015 07:34:00 +0000 (08:34 +0100)]
aesni: Use 4-way parallel AES-NI instructions for CBC decryption
CBC decryption can be parallelized, and we do so by queueing instructions
to the processor pipeline. While we have enough registers for 128-bit
decryption, the register count is insufficient to hold all variables with
larger key sizes. Nonetheless, 4-way parallelism is faster, by roughly 8%.
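CBC decryption can be reordered because plaintext block i is D(C[i]) XOR
C[i-1], and all ciphertext blocks are known up front. Encryption, in contrast,
needs the previous ciphertext before it can start the next block. A minimal
sketch of this, assuming a hypothetical and insecure 64-bit toy cipher
(toy_enc()/toy_dec()) in place of AES; none of the names are from the plugin.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical invertible toy cipher on 64-bit blocks; NOT AES, NOT secure */
static uint64_t toy_enc(uint64_t k, uint64_t x)
{
    x ^= k;
    x = (x << 13) | (x >> 51);
    return x + k;
}

static uint64_t toy_dec(uint64_t k, uint64_t y)
{
    y -= k;
    y = (y >> 13) | (y << 51);
    return y ^ k;
}

/* Encryption is sequential: block i needs ciphertext block i-1 */
void cbc_encrypt(uint64_t k, uint64_t iv, const uint64_t *in,
                 uint64_t *out, size_t blocks)
{
    uint64_t prev = iv;
    size_t i;

    for (i = 0; i < blocks; i++)
    {
        prev = out[i] = toy_enc(k, in[i] ^ prev);
    }
}

/* Decryption, four blocks at a time: the toy_dec() calls are independent.
 * in and out must not overlap in this sketch. */
void cbc_decrypt_4way(uint64_t k, uint64_t iv, const uint64_t *in,
                      uint64_t *out, size_t blocks)
{
    size_t i = 0;

    for (; i + 4 <= blocks; i += 4)
    {
        uint64_t d0 = toy_dec(k, in[i]);
        uint64_t d1 = toy_dec(k, in[i + 1]);
        uint64_t d2 = toy_dec(k, in[i + 2]);
        uint64_t d3 = toy_dec(k, in[i + 3]);

        out[i]     = d0 ^ (i ? in[i - 1] : iv);
        out[i + 1] = d1 ^ in[i];
        out[i + 2] = d2 ^ in[i + 1];
        out[i + 3] = d3 ^ in[i + 2];
    }
    /* remaining blocks, fewer than four */
    for (; i < blocks; i++)
    {
        out[i] = toy_dec(k, in[i]) ^ (i ? in[i - 1] : iv);
    }
}
```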
Martin Willi [Thu, 26 Mar 2015 07:31:00 +0000 (08:31 +0100)]
aesni: Use separate en-/decryption CBC code paths for different key sizes
This allows us to unroll loops and to use local (register) variables for the
key schedule. Performance improves slightly for encryption, and considerably
(>30%) for decryption, which can be reordered.
Martin Willi [Thu, 26 Mar 2015 16:44:46 +0000 (17:44 +0100)]
test-vectors: Define some additional CCM test vectors
We have no vectors where the plaintext or associated data is not a multiple of
the block size, although bugs are likely to show up there. We also lack ICV12
test vectors for 128- and 192-bit key sizes.
Martin Willi [Thu, 26 Mar 2015 10:50:28 +0000 (11:50 +0100)]
crypto-tester: Use the plugin feature key size to benchmark crypters/aeads
We previously didn't pass the key size during algorithm registration, so
benchmarking always used the "default" key size the crypter selects when 0 is
passed as key size.
Martin Willi [Tue, 14 Apr 2015 15:42:53 +0000 (17:42 +0200)]
vici: Relicense libvici.h under MIT
libvici currently relies on libstrongswan, and therefore is bound to the GPLv2.
But to allow non-copyleft reimplementations of the same interface under a
different license, we liberate the header.
Martin Willi [Sat, 11 Apr 2015 12:59:22 +0000 (14:59 +0200)]
scripts: Add a tool that tries to guess MAC/ICV values using validation times
This tool shows that it is trivial to re-construct the value memcmp() compares
against by just measuring the time the non-time-constant memcmp() requires to
fail.
It also shows that even when running without any network latencies it gets
very difficult to reconstruct MAC/ICV values, as the time variances due to the
crypto routines are large enough that it gets difficult to measure the time
that memcmp() actually requires after computing the MAC.
However, the faster and more time-constant an algorithm is, the more likely a
successful attack becomes. When using AES-NI, it is possible to reconstruct
(parts of) a valid MAC with this tool, for example with AES-GCM.
While this is all theoretical, and far more difficult to exploit with network
jitter, it nonetheless shows that we should replace any use of memcmp()/memeq()
with a constant-time alternative in all sensitive places.
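A constant-time comparison typically avoids the early exit of memcmp() by
accumulating the differences of all bytes and inspecting the result only once
at the end. The following is a minimal sketch of that idea; the name
ct_memeq() is hypothetical and this is not necessarily the replacement that
ends up in the tree.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if both buffers are equal, 0 otherwise.
 * The loop always visits all len bytes, so the run time does not depend on
 * the position of the first differing byte. */
int ct_memeq(const void *x, const void *y, size_t len)
{
    /* volatile discourages the compiler from adding a shortcut */
    const volatile unsigned char *a = x;
    const volatile unsigned char *b = y;
    unsigned char diff = 0;
    size_t i;

    for (i = 0; i < len; i++)
    {
        diff |= a[i] ^ b[i];
    }
    return diff == 0;
}
```

Unlike memcmp(), this returns only equal/not equal and gives no ordering, which
is all a MAC/ICV check needs.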
Martin Willi [Mon, 13 Apr 2015 13:18:47 +0000 (15:18 +0200)]
Merge branch 'cpu-features'
Centralize all uses of CPUID in a cpu_feature class, which in theory can also
support optional features of non-x86/x64 architectures using
architecture-specific code.
Martin Willi [Fri, 10 Apr 2015 11:36:58 +0000 (13:36 +0200)]
sqlite: Use our locking mechanism also when sqlite3_threadsafe() returns 0
We previously checked only for older library versions without any locking
support. But newer libraries can be built in single-threading mode as well, in
which case we have to take care of the locking ourselves.
Martin Willi [Thu, 2 Apr 2015 06:50:56 +0000 (08:50 +0200)]
vici: Defer read/write error reporting after connection entry has been released
If a vici client has registered for (control-)log events and a vici read/write
operation fails, a deadlock may result. The attempt to write to the bus
results in a vici log message, which in turn tries to acquire the lock
for the entry currently held.
While a recursive lock could help as well for a single thread, there is still
a risk of inter-thread races if there is more than one thread listening for
events and/or having read/write errors.
We instead log to a local buffer, and write to the bus only after the connection
entry has been released. Additionally, we mark the connection entry as unusable
to avoid writing to the failed socket again, potentially triggering an error
loop.
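The deferred reporting pattern can be sketched as follows. This is a simplified
illustration under assumptions, not the vici code: entry_t, bus_log(),
failing_write() and entry_write() are hypothetical names, and bus_log()
acquires the entry lock only to mimic a log listener that needs the same entry.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical connection entry */
typedef struct {
    pthread_mutex_t lock;
    int unusable;           /* set once the socket has failed */
} entry_t;

int bus_messages = 0;

/* Stand-in for the bus: the log listener needs the entry lock itself, so
 * calling this while holding a non-recursive lock would deadlock */
static void bus_log(entry_t *entry, const char *msg)
{
    pthread_mutex_lock(&entry->lock);
    bus_messages++;
    fprintf(stderr, "%s\n", msg);
    pthread_mutex_unlock(&entry->lock);
}

/* Stand-in for a socket write that always fails */
static int failing_write(void)
{
    return -1;
}

int entry_write(entry_t *entry)
{
    char errbuf[64] = "";
    int ret = 0;

    pthread_mutex_lock(&entry->lock);
    if (entry->unusable)
    {
        ret = -1;           /* don't touch the failed socket again */
    }
    else if (failing_write() < 0)
    {
        /* record the error locally instead of logging under the lock */
        snprintf(errbuf, sizeof(errbuf), "vici write error");
        entry->unusable = 1;
        ret = -1;
    }
    pthread_mutex_unlock(&entry->lock);

    /* the entry is released, reporting to the bus is safe now */
    if (errbuf[0])
    {
        bus_log(entry, errbuf);
    }
    return ret;
}
```

A second entry_write() on the same entry fails without producing another log
message, which is what breaks the potential error loop.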
Martin Willi [Tue, 31 Mar 2015 12:59:12 +0000 (14:59 +0200)]
aead: Create AEAD using traditional transforms with an explicit IV generator
Real AEADs directly provide a suitable IV generator, but traditional crypters
do not. For some (stream) ciphers, we should use sequential IVs, for which
we pass an appropriate generator to the AEAD wrapper.
In 9138f49e we explicitly added the check we remove now, as HMAC_Update()
might crash if HMAC_Init_ex() has not been called yet. To avoid that, we
set and check a flag locally to let any get_mac() call fail if set_key() has
not yet been called.
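The flag check can be sketched as below. This is an illustration with a toy
XOR "MAC" and hypothetical names (toy_mac_t, toy_set_key(), toy_get_mac()); it
does not call OpenSSL and only shows how get_mac() refuses to run before a key
has been set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signer state; a zero-initialized object has no key yet */
typedef struct {
    uint8_t key;        /* toy one-byte key, NOT a real HMAC key */
    bool key_set;       /* set by toy_set_key(), checked by toy_get_mac() */
} toy_mac_t;

bool toy_set_key(toy_mac_t *mac, uint8_t key)
{
    mac->key = key;
    mac->key_set = true;
    return true;
}

/* Fails cleanly instead of touching uninitialized state if no key is set */
bool toy_get_mac(toy_mac_t *mac, const uint8_t *data, size_t len, uint8_t *out)
{
    uint8_t acc;
    size_t i;

    if (!mac->key_set)
    {
        return false;
    }
    acc = mac->key;
    for (i = 0; i < len; i++)
    {
        acc ^= data[i];
    }
    *out = acc;
    return true;
}
```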
sem_init() is deprecated on OS X, and it actually fails with ENOSYS. Using our
wrapped semaphore object is not an option, as it relies on the thread cleanup
that we can't rely on at this stage.
It is unclear why startup synchronization is required, as we can allocate the
thread ID just before creating the pthread. There is a chance that we allocate
a thread ID for a thread that fails to create, but the risk and consequences
are negligible.
Tobias Brunner [Mon, 23 Mar 2015 17:37:48 +0000 (18:37 +0100)]
kernel-netlink: Copy current usage stats to new SA in update_sa()
This is needed to fix usage stats sent via RADIUS Accounting if clients
use MOBIKE or if, e.g., the kernel notifies us about a changed NAT mapping.
The upper layers won't expect the stats to get reset if only the IPs have
changed (and some kernel interfaces might actually allow such updates
without reset).
It also fixes traffic-based lifetimes in such situations.
Tobias Brunner [Tue, 24 Mar 2015 17:36:49 +0000 (18:36 +0100)]
child-sa: Add a new state to track rekeyed IKEv1 CHILD_SAs
This is needed to handle DELETEs properly, which was previously done via
CHILD_REKEYING, which we don't use anymore since 5c6a62ceb6 as it prevents
reauthentication.
As we have no DH group available in the KE payload for IKEv1, the verification
can't work at that stage. Instead, we now verify DH groups in the DH backends,
which works for any IKE version or any other purpose.
Tobias Brunner [Mon, 23 Mar 2015 09:58:30 +0000 (10:58 +0100)]
ikev1: Make sure SPIs in an IKEv1 DELETE payload match the current SA
OpenBSD's isakmpd uses the latest ISAKMP SA to delete other expired SAs.
This caused strongSwan to delete e.g. a rekeyed SA even though isakmpd
meant to delete the old one.
What isakmpd does might not be standards-compliant. As RFC 2408 puts
it:
Deletion which is concerned with an ISAKMP SA will contain a
Protocol-Id of ISAKMP and the SPIs are the initiator and responder
cookies from the ISAKMP Header.
This could either be interpreted as "copy the SPIs from the ISAKMP
header of the current message to the DELETE payload" (which is what
strongSwan assumed, and the direction IKEv2 took it, by not sending SPIs
for IKE), or as clarification that ISAKMP "cookies" are actually the
SPIs meant to be put in the payload (but that any ISAKMP SA may be
deleted).
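Under the first interpretation, the check amounts to comparing the cookie pair
in the DELETE payload against the cookies of the SA the message arrived on. A
minimal sketch, assuming a 16-byte SPI made of the 8-byte initiator and
responder cookies; the type and function names are hypothetical and not taken
from the strongSwan sources.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_LEN 8

/* Hypothetical view of the cookies of the current ISAKMP SA */
typedef struct {
    uint8_t initiator[COOKIE_LEN];
    uint8_t responder[COOKIE_LEN];
} isakmp_cookies_t;

/* Returns true only if the SPI in the DELETE payload is exactly the
 * initiator cookie followed by the responder cookie of the current SA */
bool delete_matches_current_sa(const uint8_t *spi, size_t len,
                               const isakmp_cookies_t *current)
{
    if (len != 2 * COOKIE_LEN)
    {
        return false;
    }
    return memcmp(spi, current->initiator, COOKIE_LEN) == 0 &&
           memcmp(spi + COOKIE_LEN, current->responder, COOKIE_LEN) == 0;
}
```

A DELETE sent over the rekeyed SA but carrying the cookies of the old SA fails
this check, so the current SA is not deleted by mistake.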
Tobias Brunner [Mon, 16 Mar 2015 17:25:22 +0000 (18:25 +0100)]
pki: Use SHA-256 as default for signatures
Since the BLISS private key supports this, we don't do any special
handling anymore (if the user chooses a digest that is not supported,
signing will simply fail later because no signature scheme will be found).
Tobias Brunner [Thu, 12 Mar 2015 10:50:20 +0000 (11:50 +0100)]
trap-manager: Add option to ignore traffic selectors from acquire events
The specific traffic selectors from the acquire events, which are derived
from the triggering packet, are usually prepended to those from the
config. Some implementations might not be able to handle these properly.