Alexander Færøy [Fri, 5 Feb 2021 17:12:52 +0000 (17:12 +0000)]
Fix Windows build.
While trying to resolve our CI issues, the Windows build broke with an
unused function error:
src/test/test_switch_id.c:37:1: error: ‘unprivileged_port_range_start’
defined but not used [-Werror=unused-function]
We solve this by moving the `#if !defined(_WIN32)` test above the
`unprivileged_port_range_start()` function defintion such that it is
included in its body.
Roger Dingledine [Sun, 24 Oct 2021 00:32:36 +0000 (20:32 -0400)]
don't retry entry guards if they're bridges without descriptors
When we don't yet have a descriptor for one of our bridges, disable
the entry guard retry schedule on that bridge. The entry guard retry
schedule and the bridge descriptor retry schedule can conflict,
e.g. where we mark a bridge as "maybe up" yet we don't try to fetch
its descriptor yet, leading Tor to wait (refusing to do anything)
until it becomes time to fetch the descriptor.
Roger Dingledine [Sat, 23 Oct 2021 09:24:29 +0000 (05:24 -0400)]
do notice-level log when we resume having enough dir info
we do a notice-level log when we decide we *don't* have enough dir
info, but in 0.3.5.1-alpha (see commit eee62e13d97, #14950) we lost our
corresponding notice-level log when things come back.
Roger Dingledine [Fri, 29 Oct 2021 00:39:31 +0000 (20:39 -0400)]
handle other de-sync cases from #40396
Specifically, every time a guard moves into or out of state
GUARD_REACHABLE_MAYBE, it is an opportunity for the guard reachability
state to get out of sync with the have-minimum-dir-info state.
Roger Dingledine [Fri, 22 Oct 2021 06:33:49 +0000 (02:33 -0400)]
reassess minimum-dir-info when a bridge fails
When we try to fetch a bridge descriptor and we fail, we mark
the guard as failed, but we never scheduled a re-compute for
router_have_minimum_dir_info().
So if we had already decided we needed to wait for this new descriptor,
we would just wait forever -- even if, counterintuitively, *losing* the
bridge is just what we need to *resume* using the network, if we had it
in state GUARD_REACHABLE_MAYBE and we were stalling to learn this outcome.
Roger Dingledine [Fri, 29 Oct 2021 00:53:26 +0000 (20:53 -0400)]
only log "new bridge descriptor" if really new
The bridge descriptor fetching codes ends up fetching a lot of duplicate
bridge descriptors, because this is how we learn when the descriptor
changes.
This commit only changes comments plus whether we log that one line.
It moves us back to the old behavior, before the previous commit for
30496, where we would only log that line when the bridge descriptor
we're talking about is better than the one we already had (if any).
Roger Dingledine [Sat, 23 Oct 2021 08:18:00 +0000 (04:18 -0400)]
fetch missing bridge descriptors without delay
Without this change, if we have a working bridge, and we add a new bridge,
we will schedule the fetch attempt for that new bridge descriptor for
three hours(!) in the future.
This change is especially needed because of bug #40396, where if you have
one working bridge and one bridge whose descriptor you haven't fetched
yet, your Tor will stall until you have successfully fetched that new
descriptor -- in this case for hours.
In the old design, we would put off all further bridge descriptor fetches
once we had any working bridge descriptor. In this new design, we make the
decision per bridge based on whether we successfully got *its* descriptor.
To make this work, we need to also call learned_bridge_descriptor() every
time we get a bridge descriptor, not just when it's a novel descriptor.
Fixes bug 40396.
Also happens to fix bug 40495 (redundant descriptor fetches for every
bridge) since now we delay fetches once we succeed.
A side effect of this change is that if we have any configured bridges
that *aren't* working, we will keep trying to fetch their descriptors
on the modern directory retry schedule -- every couple of seconds for
the first half minute, then backing off after that -- which is a lot
faster than before.
Nick Mathewson [Fri, 8 Oct 2021 15:36:04 +0000 (11:36 -0400)]
Add a new consensus method to handle MiddleOnly specially.
When this method is in place, then any relay which is assigned
MiddleOnly has Exit, V2Dir, Guard, and HSDir cleared
(and has BadExit set if appropriate).
Nick Mathewson [Fri, 8 Oct 2021 15:14:53 +0000 (11:14 -0400)]
Implement a MiddleOnly flag for vote generation.
This proposal implements part of Prop335; it's based on a patch
from Neel Chauhan.
When configured to do so, authorities will assign a MiddleOnly flag
to certain relays. Any relay which an authority gives this flag
will not get Exit, V2Dir, Guard, or HSDir, and might get BadExit if
the authority votes for that one.
Nick Mathewson [Tue, 19 Oct 2021 19:16:49 +0000 (15:16 -0400)]
Move "Didn't recognize cell, but circ stops here" into heartbeat.
When we looked, this was the third most frequent message at
PROTOCOL_WARN, and doesn't actually tell us what to do about it.
Now:
* we just log it at info
* we log it only once per circuit
* we report, in the heartbeat, how many times it happens, how many
cells it happens with per circuit, and how long these circuits
have been alive (on average).