Tobias Brunner [Thu, 15 Oct 2020 14:10:07 +0000 (16:10 +0200)]
kernel-netlink: Update cached address flags
Note that manually adding an IPv6 address without disabling duplicate
address detection (DAD, e.g. via `nodad` when using iproute2) will cause
a roam event due to a flag change after about 1-2 seconds (TENTATIVE is
removed). If this is a problem, we might have to ignore addresses with
TENTATIVE flag when we receive a RTM_NEWADDR message until that flag is
eventually removed.
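
The flag change described above, and the possible mitigation, can be modelled roughly as follows. This is a simplified Python sketch, not the plugin's C code: the cache layout, the `on_newaddr()` helper and its `skip_tentative` switch are hypothetical; only the flag values are taken from `<linux/if_addr.h>`.

```python
IFA_F_TENTATIVE = 0x40  # flag values as defined in <linux/if_addr.h>
IFA_F_PERMANENT = 0x80


def on_newaddr(cache, addr, flags, skip_tentative=False):
    """Model of handling RTM_NEWADDR: update the cached flags for addr and
    return True if the change would trigger a roam event."""
    if skip_tentative and flags & IFA_F_TENTATIVE:
        # Hypothetical mitigation: ignore the address until DAD has completed.
        return False
    changed = cache.get(addr) != flags
    cache[addr] = flags
    return changed


def replay(skip_tentative):
    """Replay the two messages seen when adding an address without `nodad`."""
    cache = {}
    addr = "2001:db8::1"
    return [
        # Address appears, DAD still running.
        on_newaddr(cache, addr, IFA_F_PERMANENT | IFA_F_TENTATIVE, skip_tentative),
        # One to two seconds later: TENTATIVE is removed.
        on_newaddr(cache, addr, IFA_F_PERMANENT, skip_tentative),
    ]


events_default = replay(skip_tentative=False)   # two roam events
events_skipping = replay(skip_tentative=True)   # only one roam event
```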
Tobias Brunner [Thu, 15 Oct 2020 11:45:53 +0000 (13:45 +0200)]
child-create: Update CHILD_SA IP addresses before installation
We create the child_sa_t object when initiating the CREATE_CHILD_SA
request; however, the IP addresses/ports might have changed once we
eventually receive the response (potentially to a retransmit sent to
a different address). So update them before installing the SA and
policies.
If the local address changed too and depending on the kernel
implementation, the temporary SA created to allocate the inbound SPI
might remain as it can't be updated. This could cause issues if e.g.
the address switches back before that SA expired (the updated inbound
SA conflicts with the temporary one), or if that happens close together
and the expire (having to wait for the address update) causes the
updated SA to get deleted.
Tobias Brunner [Thu, 8 Oct 2020 07:40:12 +0000 (09:40 +0200)]
swanctl: Support any key type for decrypted keys
The previous code required explicit support for a particular key type,
of which Ed25519 and Ed448 were missing. While a fallback to `any` would
have been possible (this is already the case for unencrypted keys in the
`private` and `pkcs8` directories, which are not parsed by swanctl), it's
not necessary (as long as swanctl and the daemon are from the same release)
and does not require the daemon to detect the key type again.
Martin Willi [Mon, 10 Aug 2020 16:29:52 +0000 (18:29 +0200)]
revocation: Validate OCSP nonce only if response actually contains a nonce
Commit 27756b081c1b8 (revocation: Check that nonce in OCSP response matches)
introduced strict nonce validation to prevent replay attacks with OCSP
responses having a longer lifetime. However, many commercial CAs (such as
Digicert) do not support nonces in responses, as they reuse once-issued OCSP
responses for the OCSP lifetime. This can be problematic for replay attack
scenarios, but is nothing we can fix at our end.
With the mentioned commit, such OCSP responses become completely unusable,
requiring a fallback to CRL-based revocation. CRLs don't provide any
replay protection either, so nothing is gained security-wise, and the
fallback may require downloading several megabytes of CRL data.
To make use of replay protection where available, but fix OCSP verification
where it is not, do nonce verification only if the response actually contains
a nonce. To be safe against replay attacks, one has to fix the OCSP responder
or use a different CA, but this is not something we can enforce.
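
The resulting rule can be summarised in a few lines. This is an illustrative Python sketch only; the function name and the use of `None` for "no nonce in the response" are assumptions, not the plugin's actual interface.

```python
def nonce_acceptable(request_nonce, response_nonce):
    """Return True if an OCSP response passes the nonce check.

    request_nonce:  bytes sent in the request
    response_nonce: bytes found in the response, or None if it has no nonce
    """
    if response_nonce is None:
        # Responder does not support nonces (e.g. pre-generated responses):
        # there is nothing to compare, so the response is not rejected here.
        return True
    # A nonce is present, so it has to match the one we sent.
    return response_nonce == request_nonce


ok_without_nonce = nonce_acceptable(b"\x01\x02", None)
ok_matching = nonce_acceptable(b"\x01\x02", b"\x01\x02")
ok_mismatch = nonce_acceptable(b"\x01\x02", b"\xff\xff")
```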
openssl: Accept CRLs issued by non-CA certificates with cRLSign keyUsage flag
The x509 plugin has accepted such CRL signers since forever, or to be
precise, since dffb176f2bc0 ("CRLSign keyUsage or CA basicConstraint are
sufficient for CRL validation").
lgtm: Fix building dependencies (in particular tpm2-tss)
This was moved to a separate step with 0ff939585ec7 ("travis: Bump tpm2-tss
to 2.4.1") so packages are installed before these dependencies are built.
However, on LGTM, packages can't be installed explicitly, so `deps` is
a no-op and we still have to list some dependencies in the config.
ike-vendor: Add option to send Cisco FlexVPN vendor ID
A new global option enables sending this vendor ID to prevent Cisco
devices from narrowing the initiator's local traffic selector to the
requested virtual IP, so e.g. 0.0.0.0/0 can be used instead.
This has been tested with a "tunnel mode ipsec ipv4" Cisco template but
should also work for GRE encapsulation.
It's ever so close with strongTNC, sometimes the OOM killer got triggered
and the tests failed, or even worse, the whole guest system got stuck.
This might just be enough for now.
Tobias Brunner [Mon, 24 Aug 2020 14:22:18 +0000 (16:22 +0200)]
testing: Fix dependency issue with strongTNC
Apparently, djangorestframework-camel-case, in the referenced version,
uses `six` but does not itself require/install it (later versions removed
Python 2 support altogether).
On newer systems, the upper hard limit for open file descriptors (see
`ulimit -H -n`) was increased from 4096 to 524288. Due to how python-daemon
closes potentially open file descriptors (basically stores them in a set,
removes those excluded by config, and loops through all of them), the updown
script was either killed immediately (by the OOM killer) or not ready yet
when updown events occurred.
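
To give a feeling for the difference in scale, here is a rough Python model of the approach described above. It is not python-daemon's actual code; `fds_to_close()` and the excluded descriptors are made up for illustration.

```python
def fds_to_close(hard_limit, exclude=()):
    """Build the set of candidate descriptors the way the text describes:
    every possible fd up to the hard limit, minus the excluded ones."""
    candidates = set(range(hard_limit))
    candidates -= set(exclude)
    return candidates


# Old and new hard limits mentioned above, keeping stdin/stdout/stderr open.
old_count = len(fds_to_close(4096, exclude=(0, 1, 2)))
new_count = len(fds_to_close(524288, exclude=(0, 1, 2)))
growth = 524288 // 4096   # factor by which the candidate range grew
```

With the higher limit, the set and the loop over it are 128 times larger, even though almost none of these descriptors are actually open.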
Tobias Brunner [Mon, 24 Aug 2020 13:33:44 +0000 (15:33 +0200)]
testing: Use host's /dev/urandom as /dev/random on guests via VirtIO RNG
Newer versions of systemd etc. seem to require quite a lot of entropy
from /dev/random while booting, which can block and therefore delay the
start of other services (in particular sshd) by more than a minute.
Using the host's /dev/urandom via VirtIO RNG, we can avoid blocking the
guests.
The required kernel options are added for kernel versions 5.4+.
Tobias Brunner [Tue, 18 Aug 2020 11:18:52 +0000 (13:18 +0200)]
imv-scanner: Fix potential buffer overflow
While `pos` was moved to the end, `len` was not adjusted (i.e. set to 0)
so later calls could write beyond the buffer. However, the last port
written might have been incomplete, so instead we just reset the string.
Don't abort the script if the version is reported as UNKNOWN, which happens
on CI hosts where the repository is only cloned with a certain depth (which
may not include the latest tag).
Also, never map VERSION to UNKNOWN.
Fixes: 2e522952c77d ("configure: Optionally use version information obtained from Git in executables")
If it takes a while to start one of the threads, another thread might already
have passed the usleep() call previously used and re-enabled cancelability
so that the loop that checked for it would never terminate.
Tobias Brunner [Tue, 5 May 2020 14:19:09 +0000 (16:19 +0200)]
configure: Optionally use version information obtained from Git in executables
The variable GIT_VERSION is always defined, either obtained from Git or
a file that is embedded in tarballs when they are built. Optionally,
that version is declared as VERSION in config.h so it will be used e.g. in
the daemons when they print the version number.
There is a check that should catch missing tags (i.e. if the version number
in AC_INIT() isn't a prefix of the version obtained via Git).
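
The prefix check amounts to the following (a Python sketch of the idea, not the actual configure code; the version strings are invented examples in the style of `git describe` output):

```python
def tag_is_consistent(ac_init_version, git_version):
    """Return True if the version from AC_INIT() is a prefix of the version
    obtained via Git, i.e. the matching tag appears to be present."""
    return git_version.startswith(ac_init_version)


# Hypothetical values: a few commits after the matching tag ...
with_tag = tag_is_consistent("5.9.1", "5.9.1-12-g0123abc")
# ... versus a checkout where only an older tag is reachable.
missing_tag = tag_is_consistent("5.9.1", "5.9.0-250-g0123abc")
```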
Tobias Brunner [Wed, 20 May 2020 14:50:11 +0000 (16:50 +0200)]
vici: Keep track of all CA certificates in vici_authority_t
This way we only have one reference for each CA certificate, whether it
is loaded in an authority section, a connection or via load-certs() command.
It also avoids enumerating CA certificates multiple times if they are
loaded in different ways.
Tobias Brunner [Wed, 20 May 2020 12:40:51 +0000 (14:40 +0200)]
vici: Directly provide CA certificates in authority sections
With the previous approach, CA certificates that were not re-loaded via
load-cert() (e.g. from tokens or via absolute paths) would not be available
anymore after the clear-creds() command was used. This avoids this
issue, but can cause duplicate CA certificates to get stored and enumerated,
so there might be a scaling factor.
This changes the hashtable implementation so that it maintains insertion
order. This is then used in the vici plugin to store connections in a
hash table instead of a linked list, which makes managing them quite a
bit faster if there are lots of connections.
The old implementation is extracted into a new class (hashlist_t), which
optionally supports sorting keys and provides the previous get_match()
function.
This reduces the clustering problem (primary clustering) but is not
completely free of it (secondary clustering). It still reduces the maximum
and average probing lengths.
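
The description matches quadratic probing. As a sketch of the idea, assuming a power-of-two table size and triangular-number increments (both are assumptions, the text does not say which probe function is used), the probe sequence could look like this in Python:

```python
def probe_sequence(start, size):
    """Yield the cells probed for a key hashing to `start` in a table with
    `size` cells (size must be a power of two). The distance from the start
    grows by 1, 2, 3, ... (triangular numbers)."""
    mask = size - 1
    row = start & mask
    for step in range(1, size + 1):
        yield row
        row = (row + step) & mask


# Keys hashing to neighbouring cells diverge quickly, which limits
# primary clustering.
seq_from_0 = list(probe_sequence(0, 8))
seq_from_1 = list(probe_sequence(1, 8))

# Keys hashing to the same cell still follow the identical sequence,
# which is the remaining secondary clustering.
same_start = list(probe_sequence(0, 8)) == seq_from_0
```

With these increments every cell is visited exactly once before the sequence repeats, so a free cell is always found as long as the table is not full.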
hashtable: Maintain insertion order when enumerating
With the previous approach we'd require at least an additional pointer
per item to store them in a list (15-18% increase in the overhead per
item). Instead we switch from handling collisions with overflow lists to
an open addressing scheme and store the actual table as variable-sized
indices pointing into an array of all inserted items in their original
order.
This can reduce the memory overhead even compared to the previous
implementation (especially for smaller tables), but because the array for
items is preallocated whenever the table is resized, it can be worse for
certain numbers of items. However, avoiding all the allocations required
by the previous design is actually a big advantage.
Depending on the usage pattern, the performance can improve quite a bit (in
particular when inserting many items). The raw lookup performance is a bit
slower as probing lengths increase with open addressing, but there are some
caching benefits due to the compact storage. So for general usage the
performance should be better. For instance, one test I did was counting the
occurrences of words in a list of 1'000'000 randomly selected words from a
dictionary of ~58'000 words (i.e. using a counter stored under each word as
key). The new implementation was ~8% faster on average while requiring
10% less memory.
Since we can't remove items from the array (would change the indices of all
items that follow it) we just mark them as removed and remove them once the
hash table is resized/rehashed (the cells in the hash table for these may
be reused). Due to this the latter may also happen if the number of stored
items does not increase e.g. after a series of remove/put operations (each
insertion requires storage in the array, no matter if items were removed).
So if the capacity is exhausted, the table is resized/rehashed (after lots
of removals the size may even be reduced) and all items marked as removed
are simply skipped.
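
The layout and the removal handling described so far can be modelled as follows. This is a toy Python sketch under stated assumptions, not the C implementation: the class and method names, the initial size of 8, the 2/3 capacity and the probe function are all chosen for illustration.

```python
class OrderedTable:
    """Open-addressing table whose cells hold indices into an array of
    items kept in insertion order."""

    EMPTY = -1

    def __init__(self):
        self.size = 8                          # number of cells, power of two
        self.cells = [self.EMPTY] * self.size  # indices into self.items
        self.items = []                        # [key, value], None = removed
        self.count = 0                         # number of live items

    def _capacity(self):
        return self.size * 2 // 3              # assumed load factor

    def _find(self, key):
        """Return (cell, item index); the index is EMPTY if key is absent."""
        mask = self.size - 1
        row, step, reusable = hash(key) & mask, 0, None
        while self.cells[row] != self.EMPTY:
            item = self.items[self.cells[row]]
            if item is None:
                if reusable is None:
                    reusable = row             # cell of a removed item
            elif item[0] == key:
                return row, self.cells[row]
            step += 1
            row = (row + step) & mask
        return (row if reusable is None else reusable), self.EMPTY

    def _rehash(self):
        """Rebuild the table, skipping items marked as removed."""
        live = [item for item in self.items if item is not None]
        size = 8
        while size * 2 // 3 <= len(live):
            size *= 2
        self.size, self.cells, self.items = size, [self.EMPTY] * size, []
        for item in live:
            row, _ = self._find(item[0])
            self.cells[row] = len(self.items)
            self.items.append(item)

    def put(self, key, value):
        row, idx = self._find(key)
        if idx != self.EMPTY:
            self.items[idx][1] = value         # replace existing value
            return
        if len(self.items) >= self._capacity():
            self._rehash()                     # array exhausted
            row, _ = self._find(key)
        self.cells[row] = len(self.items)
        self.items.append([key, value])
        self.count += 1

    def remove(self, key):
        _, idx = self._find(key)
        if idx == self.EMPTY:
            return None
        value = self.items[idx][1]
        self.items[idx] = None                 # only mark, indices stay valid
        self.count -= 1
        return value

    def keys(self):
        return [item[0] for item in self.items if item is not None]


table = OrderedTable()
for word in ["c", "a", "b"]:
    table.put(word, len(word))
for _ in range(10):                            # remove/put without growth
    table.remove("a")
    table.put("a", 1)
```

After the loop the keys enumerate as `c`, `b`, `a`: the table was rehashed several times because every insertion consumed a slot in the array, but its size never had to grow.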
Compared to the previous implementation the load factor/capacity is
lowered to reduce chances of collisions and to avoid primary clustering to
some degree. However, the latter in particular, but the open addressing
scheme in general, make this implementation completely unsuited for the
get_match() functionality (purposefully hashing to the same value and,
therefore, increasing the probing length and clustering). And keeping the
keys optionally sorted would complicate the code significantly. So we just
keep the existing hashlist_t implementation without adding code to maintain
the overall insertion order (we could add that feature optionally later, but
with the mentioned overhead for one or two pointers).
The maximum size is currently not changed. With the new implementation
this translates to a hard limit for the maximum number of items that can be
held in the table (=CAPACITY(MAX_SIZE)). Since this equals 715'827'882
items with the current settings, this shouldn't be a problem in practice;
the table alone would require 20 GiB in memory for that many items. The
hashlist_t implementation doesn't have that limitation due to the overflow
lists (it can store beyond its capacity) but it itself would require over
29 GiB of memory to hold that many items.
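
The item limit is consistent with a maximum table size of 2^30 cells and a capacity of 2/3 of the table size. Neither value is stated above, so the following Python arithmetic is a plausibility check under these assumptions. The per-item and per-index sizes are likewise just one combination that reproduces the 20 GiB figure.

```python
MAX_SIZE = 1 << 30        # assumed maximum number of cells


def capacity(size):
    """Assumed load factor of 2/3."""
    return size * 2 // 3


max_items = capacity(MAX_SIZE)

ITEM_BYTES = 24           # assumed size of one entry in the item array
INDEX_BYTES = 4           # assumed size of one index in the table
total_bytes = max_items * ITEM_BYTES + MAX_SIZE * INDEX_BYTES
total_gib = total_bytes / 2**30
```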
hashlist: Move get_match() and sorting into a separate class
The main intention here is that we can change the hashtable_t
implementation without being impeded by the special requirements imposed
by get_match() and sorting the keys/items in buckets.
hashtable: Optionally sort keys/items in buckets in a specific way
This can improve negative lookups, but is mostly intended to be used
with get_match() so keys/items can be matched/enumerated in a specific
order. It's like storing sorted linked lists under a shared key but
with less memory overhead.
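
Why sorting helps negative lookups can be seen in a small Python model of a single bucket (illustrative only; the real buckets are linked lists in C and use a configurable comparison function):

```python
def find_sorted(bucket, key):
    """Look up key in a bucket kept in ascending key order.
    Returns (value or None, number of key comparisons)."""
    comparisons = 0
    for k, value in bucket:
        comparisons += 1
        if k == key:
            return value, comparisons
        if k > key:
            break          # all remaining keys are larger, stop early
    return None, comparisons


bucket = [(10, "a"), (20, "b"), (30, "c"), (40, "d")]
hit = find_sorted(bucket, 30)     # found after three comparisons
miss = find_sorted(bucket, 15)    # gives up after two instead of four
```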
kernel-netlink: Ignore preference for temporary addresses for IPv6 VIPs
They are not marked as temporary addresses so make sure we always return
them whether temporary addresses are preferred as source addresses or not
as we need to enumerate them when searching for addresses in traffic selectors
to install routes.
Fixes: 9f12b8a61c47 ("kernel-netlink: Enumerate temporary IPv6 addresses according to config")
Tobias Brunner [Mon, 18 May 2020 12:17:24 +0000 (14:17 +0200)]
charon-nm: Set DPD/close action to restart and enable indefinite keying tries
We don't track CHILD_SA down events anymore and rely on NM's initial timeout
to let the user know if the connection failed initially. So we also don't
have to explicitly differentiate between initial connection failures and
later ones like we do on Android. Also, with the default retransmission
settings, there will only be one keying try as NM's timeout is lower than
the combined retransmission timeout of 165s.
There is no visual indicator while the connection is reestablished later.
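
The 165s mentioned above can be reproduced with a short Python calculation, assuming the documented strongswan.conf defaults (retransmit_timeout = 4.0, retransmit_base = 1.8, retransmit_tries = 5), which are not stated in this message:

```python
RETRANSMIT_TIMEOUT = 4.0   # seconds, assumed default
RETRANSMIT_BASE = 1.8      # assumed default
RETRANSMIT_TRIES = 5       # assumed default


def combined_timeout(timeout, base, tries):
    """Sum of the waiting times after the initial transmission and after
    each of the `tries` retransmits (timeout * base^n)."""
    return sum(timeout * base ** n for n in range(tries + 1))


total = combined_timeout(RETRANSMIT_TIMEOUT, RETRANSMIT_BASE, RETRANSMIT_TRIES)
```

This yields roughly 165.06 seconds before the peer is considered unreachable.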