openssl: Don't refer to EVP_des_ecb() if OpenSSL is built without DES support
While DES-ECB is not registered by the plugin in this case (so the
function will never actually be called), the compiler still warns
about the implicitly declared function.
Martin Willi [Thu, 16 Apr 2015 14:50:27 +0000 (16:50 +0200)]
Merge branch 'utils-split'
Split up the almighty utils.[ch] into separate files in the utils/utils subfolder.
These are not meant to be included manually, but bring back some order to all
the functionality included through utils.h.
Martin Willi [Thu, 16 Apr 2015 07:38:14 +0000 (09:38 +0200)]
test-vectors: Define test vector symbols as extern
We don't actually define a vector, but only prototype the test vector
implemented in a different file. GCC uses the correct symbol during testing,
but clang correctly complains about duplicated symbols during linking.
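The difference between declaring and defining such a symbol can be sketched
as follows. This is only an illustration: the type and symbol names
(test_vector_sketch_t, aes_cbc_vector_sketch) are hypothetical and not the
identifiers used in the test-vectors plugin, and the definition would normally
live in a different source file.

```c
/* Hypothetical vector type, for illustration only. */
typedef struct {
	int key_size;
	const char *name;
} test_vector_sketch_t;

/* Declaration: announces the symbol without allocating storage. Without
 * "extern" this line would be a tentative definition, so each file
 * containing it could end up defining the symbol itself. */
extern test_vector_sketch_t aes_cbc_vector_sketch;

/* The single definition; this would normally sit in a different file. */
test_vector_sketch_t aes_cbc_vector_sketch = {
	.key_size = 16,
	.name = "aes-cbc sketch",
};

/* Users reference the one defined object through the declaration. */
int vector_key_size_sketch(void)
{
	return aes_cbc_vector_sketch.key_size;
}
```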
Martin Willi [Tue, 14 Apr 2015 07:26:17 +0000 (09:26 +0200)]
unit-tests: Set test verbosity just after test suite loading
We see any plugin startup messages during suite configuration, where
initialization is called once to query plugin features. No need to be verbose
and show these messages once again in the first test.
Martin Willi [Mon, 13 Apr 2015 16:23:58 +0000 (18:23 +0200)]
unit-tests: Use progressive testing of transforms with test vectors
This allows us to show which transform from which plugin failed. Also, we use
the new cleanup handler functionality that allows proper deinitialization on
failure or timeout.
Martin Willi [Tue, 14 Apr 2015 06:59:58 +0000 (08:59 +0200)]
unit-tests: Invoke all registered thread cleanup handlers on test failure
If a test times out or fails, longjmp() is used to restore the thread context
and handle the test failure. However, there might be unreleased resources,
namely locks, which prevent the library from cleaning up properly after
finishing the test.
By using thread cleanup handlers, we can release any test subject internal or
test specific external resources on test failure. We do so by calling all
registered cleanup handlers.
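A minimal single-threaded sketch of the idea, assuming a fixed-size handler
stack. All names below (cleanup_push_sketch, run_test_sketch, and so on) are
hypothetical and are not the functions used by the test runner or the
threading code.

```c
#include <setjmp.h>
#include <stddef.h>

#define MAX_HANDLERS 16

typedef void (*cleanup_fn_t)(void *arg);

static struct {
	cleanup_fn_t fn;
	void *arg;
} handlers[MAX_HANDLERS];
static size_t handler_count;
static jmp_buf failure_env;

/* Register a handler; returns 0 if the fixed-size stack is full. */
int cleanup_push_sketch(cleanup_fn_t fn, void *arg)
{
	if (handler_count == MAX_HANDLERS)
	{
		return 0;
	}
	handlers[handler_count].fn = fn;
	handlers[handler_count].arg = arg;
	handler_count++;
	return 1;
}

/* Invoke all registered handlers, most recently registered first. */
void cleanup_pop_all_sketch(void)
{
	while (handler_count > 0)
	{
		handler_count--;
		handlers[handler_count].fn(handlers[handler_count].arg);
	}
}

/* Abort the running test by jumping back into the runner. */
void test_fail_sketch(void)
{
	longjmp(failure_env, 1);
}

/* Run a test; returns 1 on success, 0 if it failed via test_fail_sketch(). */
int run_test_sketch(void (*test)(void))
{
	if (setjmp(failure_env))
	{
		/* the test never reached its own release code, so do it here */
		cleanup_pop_all_sketch();
		return 0;
	}
	test();
	return 1;
}

/* Example test subject: takes a "lock" and fails before releasing it. */
int fake_lock_held;

void release_fake_lock_sketch(void *arg)
{
	*(int*)arg = 0;
}

void failing_test_sketch(void)
{
	fake_lock_held = 1;
	cleanup_push_sketch(release_fake_lock_sketch, &fake_lock_held);
	test_fail_sketch();
	fake_lock_held = 0; /* never reached */
}
```

Calling run_test_sketch(failing_test_sketch) reports the failure and leaves
fake_lock_held cleared, even though the test itself never got to release it.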
Martin Willi [Mon, 13 Apr 2015 15:12:49 +0000 (17:12 +0200)]
gcrypt: Explicitly initialize RNG backend to allocate static data
The libgcrypt RNG implementation uses static buffer allocation which it does
not free. There is no symbol we can catch in leak-detective, hence we explicitly
initialize the RNG during the whitelisted gcrypt_plugin_create() function.
Martin Willi [Mon, 13 Apr 2015 10:02:07 +0000 (12:02 +0200)]
leak-detective: Whitelist gcrypt_plugin_create()
gcry_check_version() does not free statically allocated resources. However,
we can't whitelist it in some versions, as it is not a resolvable symbol name.
Instead, whitelist our own plugin constructor function.
Martin Willi [Tue, 14 Apr 2015 10:38:18 +0000 (12:38 +0200)]
aesni: Avoid loading AES/GHASH round keys into local variables
The performance impact is not measurable, as the compiler loads these variables
into xmm registers in unrolled loops anyway.
However, we avoid loading these sensitive keys onto the stack. Such spilling
happens for larger key schedules, where the register count is insufficient. If
that key material is not on the stack, we do not have to wipe it explicitly
after crypto operations.
Martin Willi [Tue, 31 Mar 2015 15:28:12 +0000 (17:28 +0200)]
aesni: Align all class instances to 16 byte boundaries
While the required members are aligned in the struct as required, on 32-bit
platforms the allocator aligns the structures itself to 8 bytes only. This
results in non-aligned struct members, and invalid memory accesses.
Martin Willi [Wed, 15 Apr 2015 10:02:45 +0000 (12:02 +0200)]
unit-tests: Pass stringified assertion statement as non-format string argument
If the assertion contains a modulo (%) operation, test_fail_msg() interprets
it as a printf() format specifier. Instead, pass the assertion string as the
argument for an explicit "%s" in the format string.
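The pattern can be sketched like this. fail_msg_sketch() and
ASSERT_MSG_SKETCH() are hypothetical stand-ins written for this example; they
are not the test_fail_msg() implementation or the assertion macros of the
test suite.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a failure reporter taking a printf() format. */
int fail_msg_sketch(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int written;

	va_start(args, fmt);
	written = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return written;
}

/* Stringify the condition, but hand it over as data for "%s" instead of
 * using the stringified text as the format string itself. */
#define ASSERT_MSG_SKETCH(buf, len, cond) \
	fail_msg_sketch(buf, len, "%s", #cond)
```

With this, ASSERT_MSG_SKETCH(buf, sizeof(buf), x % 2 == 0) copies the text of
the condition into buf unchanged, and the "%" is never parsed as a conversion.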
Martin Willi [Tue, 31 Mar 2015 15:25:05 +0000 (17:25 +0200)]
utils: Add malloc/free wrappers returning aligned data
While we could use posix_memalign(3), that is not fully portable. Further, it
might be difficult on some platforms to properly catch it in leak-detective,
which results in invalid free()s when releasing such memory.
We instead use a simple wrapper, which allocates larger data, and saves the
padding size in the allocated header. This requires that memory is released
using a dedicated function.
To reduce the risk of invalid free() when working on corrupted data, we fill up
all the padding with the padding length, and verify it during free_align().
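A minimal sketch of such a wrapper pair, following the description above. The
names malloc_align_sketch() and free_align_sketch() and the int return value
are assumptions made for this example; the actual functions may differ in
signature and error handling.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate "size" bytes aligned to "align" (1..255) bytes. */
void *malloc_align_sketch(size_t size, uint8_t align)
{
	uint8_t pad, *ptr;

	if (align == 0)
	{
		align = 1;
	}
	/* reserve up to "align" extra bytes in front of the data */
	ptr = malloc((size_t)align + size);
	if (!ptr)
	{
		return NULL;
	}
	/* 1..align bytes of padding bring us to the next aligned address */
	pad = align - (uint8_t)((uintptr_t)ptr % align);
	/* every padding byte stores the padding length */
	memset(ptr, pad, pad);
	return ptr + pad;
}

/* Returns 1 if the memory was released, 0 if the padding looked corrupted. */
int free_align_sketch(void *data)
{
	uint8_t pad, *ptr = data;
	size_t i;

	if (!ptr)
	{
		return 0;
	}
	/* the byte just before the data holds the padding length */
	pad = *(ptr - 1);
	if (pad == 0)
	{
		return 0;
	}
	/* all padding bytes must carry the same value */
	for (i = 1; i <= pad; i++)
	{
		if (*(ptr - i) != pad)
		{
			return 0;
		}
	}
	free(ptr - pad);
	return 1;
}
```

Memory obtained from malloc_align_sketch() must only be released with
free_align_sketch(), as the pointer handed out is not the one malloc() returned.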
Martin Willi [Thu, 26 Mar 2015 10:26:51 +0000 (11:26 +0100)]
aesni: Use 4-way parallel AES-NI instructions for CTR en/decryption
CTR can be parallelized, and we do so by queueing instructions to the processor
pipeline. While we have enough registers for 128-bit decryption, the register
count is insufficient to hold all variables with larger key sizes. Nonetheless,
4-way parallelism is faster, by between ~10% and ~25% depending on key size.
Martin Willi [Thu, 26 Mar 2015 07:34:00 +0000 (08:34 +0100)]
aesni: Use 4-way parallel AES-NI instructions for CBC decryption
CBC decryption can be parallelized, and we do so by queueing instructions
to the processor pipeline. While we have enough registers for 128-bit
decryption, the register count is insufficient to hold all variables with
larger key sizes. Nonetheless, 4-way parallelism is faster, by roughly 8%.
Martin Willi [Thu, 26 Mar 2015 07:31:00 +0000 (08:31 +0100)]
aesni: Use separate en-/decryption CBC code paths for different key sizes
This allows us to unroll loops, and use local (register) variables for the
key schedule. This improves performance slightly for encryption, but a lot
for reorderable decryption (>30%).
Martin Willi [Thu, 26 Mar 2015 16:44:46 +0000 (17:44 +0100)]
test-vectors: Define some additional CCM test vectors
We don't have any vectors where the plain or associated data length is not a
multiple of the block size, but such inputs are likely to reveal bugs. Also, we
are missing some ICV12 test vectors using 128- and 192-bit key sizes.
Martin Willi [Thu, 26 Mar 2015 10:50:28 +0000 (11:50 +0100)]
crypto-tester: Use the plugin feature key size to benchmark crypters/aeads
We previously didn't pass the key size during algorithm registration, which
resulted in benchmarking with the "default" key size the crypter uses when
passing 0 as key size.
Martin Willi [Tue, 14 Apr 2015 15:42:53 +0000 (17:42 +0200)]
vici: Relicense libvici.h under MIT
libvici currently relies on libstrongswan, and therefore is bound to the GPLv2.
But to allow alternatively licensed reimplementations without copyleft based
on the same interface, we liberate the header.
Martin Willi [Sat, 11 Apr 2015 12:59:22 +0000 (14:59 +0200)]
scripts: Add a tool that tries to guess MAC/ICV values using validation times
This tool shows that it is trivial to re-construct the value memcmp() compares
against by just measuring the time the non-time-constant memcmp() requires to
fail.
It also shows that even without any network latency it is very difficult to
reconstruct MAC/ICV values, as the timing variance of the crypto routines is
large enough to mask the time memcmp() actually requires after the MAC has
been computed.
However, the faster and more time-constant an algorithm is, the more likely a
successful attack becomes. When using AES-NI, it is possible to reconstruct
(parts of) a valid MAC with this tool, for example with AES-GCM.
While this is all theoretical, and way more difficult to exploit with network
jitter, it nonetheless shows that we should replace any use of memcmp/memeq()
with a constant-time alternative in all sensitive places.
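As an illustration of what such an alternative could look like, here is a
minimal sketch. memeq_const_sketch() is a hypothetical name, and this is not
necessarily how a replacement is implemented in the code base. Whether the
compiled code is truly constant-time also depends on compiler and platform.

```c
#include <stdbool.h>
#include <stddef.h>

/* Compare len bytes without returning early on the first mismatch, so the
 * run time does not depend on the position of the first differing byte. */
bool memeq_const_sketch(const void *x, const void *y, size_t len)
{
	/* volatile discourages the compiler from short-circuiting the loop */
	const volatile unsigned char *a = x, *b = y;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < len; i++)
	{
		/* accumulate differences instead of branching on them */
		diff |= a[i] ^ b[i];
	}
	return diff == 0;
}
```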
Martin Willi [Mon, 13 Apr 2015 13:18:47 +0000 (15:18 +0200)]
Merge branch 'cpu-features'
Centralize all uses of CPUID in a cpu_feature class, which in theory can also
support optional features of non-x86/x64 architectures using
architecture-specific code.