Alexander Færøy [Thu, 4 Feb 2021 23:11:11 +0000 (23:11 +0000)]
Only check for bindable ports if we are unsure if it will fail.
We currently assume that the only way for Tor to listen on ports in the
privileged port range (1 to 1023), on Linux, is if we are granted the
NET_BIND_SERVICE capability. Today on Linux, it's possible to specify
the beginning of the unprivileged port range using a sysctl
configuration option. Docker (and thus the CI service Tor uses) recently
changed this sysctl value to 0, which causes our tests to fail as they
assume that we should NOT be able to bind to a privileged port *without*
the NET_BIND_SERVICE capability.
In this patch, we read the value of the sysctl value via the /proc/sys/
filesystem iff it's present, otherwise we assume the default
unprivileged port range begins at port 1024.
David Goulet [Tue, 17 Aug 2021 16:43:58 +0000 (12:43 -0400)]
dir: Do not flag non-running failing HSDir
When a directory request fails, we flag the relay as non Running so we
don't use it anymore.
This can be problematic with onion services because there are cases
where a tor instance could have a lot of services, ephemeral ones, and
keeps failing to upload descriptors, let say due to a bad network, and
thus flag a lot of nodes as non Running which then in turn can not be
used for circuit building.
This commit makes it that we never flag nodes as non Running on a onion
service directory request (upload or fetch) failure as to keep the
hashring intact and not affect other parts of tor.
Fortunately, the onion service hashring is _not_ selected by looking at
the Running flag but since we do a 3-hop circuit to the HSDir, other
services on the same instance can influence each other by removing nodes
from the consensus for path selection.
This was made apparent with a small network that ran out of nodes to
used due to rapid succession of onion services uploading and failing.
See #40434 for details.
Fixes #40434
Signed-off-by: David Goulet <dgoulet@torproject.org>
As reported by hdevalence our batch verification logic can cause an assert
crash.
The assert happens because when the batch verification of ed25519-donna fails,
the code in `ed25519_checksig_batch()` falls back to doing a single
verification for each signature.
The crash occurs because batch verification failed, but then all signatures
individually verified just fine.
That's because batch verification and single verification use a different
equation which means that there are sigs that can pass single verification
but fail batch verification.
Fixing this would require modding ed25519-donna which is not in scope for
this ticket, and will be soon deprecated in favor of arti and ed25519-dalek, so my branch instead removes batch verification.
Nick Mathewson [Tue, 1 Jun 2021 20:18:23 +0000 (16:18 -0400)]
Use native timegm when available.
Continue having a tor_gmtime_impl() unit test so that we can detect
any problems in our replacement function; add a new test function to
make sure that gmtime<->timegm are a round-trip on now-ish times.
This is a fix for bug #40383, wherein we ran into trouble because
tor_timegm() does not believe that time_t should include a count of
leap seconds, but FreeBSD's gmtime believes that it should. This
disagreement meant that for a certain amount of time each day,
instead of calculating the most recent midnight, our voting-schedule
functions would calculate the second-most-recent midnight, and lead
to an assertion failure.
I am calling this a bugfix on 0.2.0.3-alpha when we first started
calculating our voting schedule in this way.
Nick Mathewson [Mon, 28 Jun 2021 13:08:31 +0000 (09:08 -0400)]
Suppress strict-prototypes warning on NSS pk11pub.h header
We already did this in a couple of places, but there are more that
we didn't get. This is necessary for systems with versions of
NSS that don't do their prototypes properly.
David Goulet [Thu, 3 Jun 2021 13:33:21 +0000 (09:33 -0400)]
TROVE-2021-003: Check layer_hint before half-closed end and resolve cells
This issue was reported by Jann Horn part of Google's Project Zero.
Jann's one-sentence summary: entry/middle relays can spoof RELAY_END cells on
half-closed streams, which can lead to stream confusion between OP and
exit.
Nick Mathewson [Mon, 17 May 2021 12:50:01 +0000 (08:50 -0400)]
Assert on _all_ failures from RAND_bytes().
Previously, we would detect errors from a missing RNG
implementation, but not failures from the RNG code itself.
Fortunately, it appears those failures do not happen in practice
when Tor is using OpenSSL's default RNG implementation. Fixes bug
40390; bugfix on 0.2.8.1-alpha. This issue is also tracked as
TROVE-2021-004. Reported by Jann Horn at Google's Project Zero.
Nick Mathewson [Thu, 27 May 2021 14:49:37 +0000 (10:49 -0400)]
Upgrade and rate-limit compression failure message.
Without this message getting logged at 'WARN', it's hard to
contextualize the messages we get about compression bombs, so this
message should fix #40175.
I'm rate-limiting this, however, since it _could_ get spammy if
somebody on the network starts acting up. (Right now it should be
very quiet; I've asked Sebastian to check it, and he says that he
doesn't hit this message in practice.)
Nick Mathewson [Wed, 26 May 2021 17:02:56 +0000 (13:02 -0400)]
Prefer mmap()ed consensus files over cached_dir_t entries.
Cached_dir_t is a somewhat "legacy" kind of storage when used for
consensus documents, and it appears that there are cases when
changing our settings causes us to stop updating those entries.
This can cause trouble, as @arma found out in #40375, where he
changed his settings around, and consensus diff application got
messed up: consensus diffs were being _requested_ based on the
latest consensus, but were being (incorrectly) applied to a
consensus that was no longer the latest one.
This patch is a minimal fix for backporting purposes: it has Tor do
the same search when applying consensus diffs as we use to request
them. This should be sufficient for correct behavior.
There's a similar case in GETINFO handling; I've fixed that too.