Mike Perry [Sun, 2 Apr 2023 21:06:20 +0000 (21:06 +0000)]
Prop#329 streams: Handle stream usage with conflux
This adds utility functions to help stream block decisions, as well as cpath
layer_hint checks for stream cell acceptance, and syncing stream lists
for conflux circuits.
These functions are then called throughout the codebase to properly manage
conflux streams.
Mike Perry [Fri, 20 Jan 2023 19:14:33 +0000 (19:14 +0000)]
Refactor stream blocking due to channel cell queues
Streams can get blocked on a circuit in two ways:
1. When the circuit package window is full
2. When the channel's cell queue is too high
Conflux needs to decouple stream blocking from both of these conditions,
because streams can continue on another circuit, even if the primary circuit
is blocked for either of these cases.
However, both conflux and congestion control need to know if the channel's
cell queue hit the high watermark and is still draining, because this condition
is used by those components, independent of stream state.
Therefore, this commit renames the 'streams_blocked_on_chan' variable to
signify that it refers to the cell queue state, and also refactors the actual
stream blocking bits out, so they can be handled separately if conflux is
present.
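A minimal sketch of the split described above, written in Python rather than tor's C. All names here are hypothetical and do not correspond to tor's structures: the circuit keeps the cell queue state on its own, and the stream-blocking decision is a separate function that only blocks when no leg can carry data. Treating an exhausted package window as `package_window <= 0` is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Circuit:
    """Hypothetical stand-in for a circuit; not tor's circuit_t."""
    package_window: int          # cells we may still package (assumed <= 0 when exhausted)
    chan_queue_above_high: bool  # cell queue hit the high watermark and is still draining


def circuit_is_blocked(circ: Circuit) -> bool:
    """A single circuit cannot take more data in either of the two cases."""
    return circ.package_window <= 0 or circ.chan_queue_above_high


def streams_should_block(legs: List[Circuit]) -> bool:
    """Block streams only if every leg is blocked.

    Without conflux the list has one entry, so this reduces to the
    per-circuit check. An empty list counts as blocked.
    """
    return all(circuit_is_blocked(leg) for leg in legs)
```

The cell queue flag stays readable on each circuit, so other components can consult it regardless of what the stream decision was.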
Mike Perry [Wed, 18 Jan 2023 22:48:43 +0000 (22:48 +0000)]
Prop#329 sendme: Adjust sendme sending and tracking for conflux
Because circuit-level sendmes are sent before relay data cells are processed,
we can safely move this to before the conflux decision. In this way,
regardless of conflux being negotiated, we still send sendmes as soon as data
cells are received. This avoids introducing conflux queue delay into RTT
measurement, which is important for measuring actual circuit capacity.
The circuit-level tracking must happen inside the call to send a data cell,
since that call now chooses a circuit to send on. As it turns out, we were
already doing part of this here, but only for the digest. Now we do both here.
David Goulet [Fri, 3 Mar 2023 19:28:18 +0000 (14:28 -0500)]
Prop#329 OOM: Handle freeing conflux queues on OOM
We use the oldest-circ-first method here, since that seems good for conflux:
queues could briefly spike, but the bad case is if they are maliciously
bloated to stick around for a long time.
The tradeoff here is that it is possible to kill old circuits on a relay
quickly, but that has always been the case with this algorithm choice.
Signed-off-by: David Goulet <dgoulet@torproject.org>
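The oldest-circ-first choice above can be sketched roughly as follows. This is an illustrative Python model, not tor's OOM handler: the `Circ` record, its field names, and the way age is measured are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Circ:
    """Hypothetical circuit record; field names are illustrative only."""
    circ_id: int
    age_seconds: int   # assumed age measure used for ordering
    queued_bytes: int  # bytes held in its queues, conflux queues included


def pick_circuits_to_free(circs: List[Circ], bytes_needed: int) -> List[int]:
    """Return ids of circuits to kill, oldest first, until enough is reclaimed."""
    victims: List[int] = []
    reclaimed = 0
    for circ in sorted(circs, key=lambda c: c.age_seconds, reverse=True):
        if reclaimed >= bytes_needed:
            break
        victims.append(circ.circ_id)
        reclaimed += circ.queued_bytes
    return victims
```

In this model a queue that spiked briefly on a young circuit is spared, while an old circuit is freed early even if its queue is small, which is the tradeoff the message describes.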
metrics: Add HS service side circuit build time metrics.
This adds 2 histogram metrics for hidden services:
* `tor_hs_rend_circ_build_time` - the rendezvous circuit build time in milliseconds
* `tor_hs_intro_circ_build_time` - the introduction circuit build time in milliseconds
The text representation of the new metrics looks like this:
```
# HELP tor_hs_rend_circ_build_time The rendezvous circuit build time in milliseconds
# TYPE tor_hs_rend_circ_build_time histogram
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="1000.00"} 2
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="5000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="10000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="30000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="60000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="+Inf"} 10
tor_hs_rend_circ_build_time_sum{onion="<elided>"} 10824
tor_hs_rend_circ_build_time_count{onion="<elided>"} 10
# HELP tor_hs_intro_circ_build_time The introduction circuit build time in milliseconds
# TYPE tor_hs_intro_circ_build_time histogram
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="1000.00"} 0
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="5000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="10000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="30000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="60000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="+Inf"} 6
tor_hs_intro_circ_build_time_sum{onion="<elided>"} 9843
tor_hs_intro_circ_build_time_count{onion="<elided>"} 6
```
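As a reading aid for the sample above: Prometheus histogram buckets are cumulative, so per-interval counts come from differencing adjacent `le` buckets, and the mean is `_sum / _count`. The short Python sketch below uses the rendezvous numbers copied from the sample; the helper name is made up for illustration.

```python
# Cumulative bucket counts copied from the rendezvous sample (le -> count).
rend_buckets = [
    (1000.0, 2),
    (5000.0, 10),
    (10000.0, 10),
    (30000.0, 10),
    (60000.0, 10),
    (float("inf"), 10),
]
rend_sum_ms = 10824
rend_count = 10


def per_bucket_counts(buckets):
    """Turn cumulative `le` counts into counts per interval."""
    out = []
    previous = 0
    for upper, cumulative in buckets:
        out.append((upper, cumulative - previous))
        previous = cumulative
    return out


# Mean build time in milliseconds over all observed circuits.
mean_ms = rend_sum_ms / rend_count
```

For this sample, 2 circuits were built within 1000 ms, the other 8 within 5000 ms, and the mean is about 1082 ms.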
metrics: Add a `reason` label to the HS error metrics.
This adds a `reason` label to the `hs_intro_rejected_intro_req_count` and
`hs_rdv_error_count` metrics introduced in #40755.
Metric lookup and initialization are now a bit more involved. This may be
fine for now, but it will become unwieldy if/when we add more labels (and as
such will need to be refactored).
Also, in the future, we may want to introduce finer grained `reason` labels.
For example, the `invalid_introduce2` label actually covers multiple types of
errors that can happen during the processing of an INTRODUCE2 cell (such as
cell parse errors, replays, decryption errors).
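To illustrate why lookup grows more involved with each label, here is a small Python model in which counters are keyed by metric name plus `reason` value. It is a sketch only: the function names are hypothetical, the rendered line is simplified (other labels are omitted), and only the `invalid_introduce2` reason is taken from the message above.

```python
from collections import Counter


def metric_inc(counters: Counter, name: str, reason: str) -> None:
    """Increment the counter identified by (metric name, reason label)."""
    counters[(name, reason)] += 1


def render(counters: Counter) -> str:
    """Render counters in a simplified text form, one line per label value."""
    lines = []
    for (name, reason), value in sorted(counters.items()):
        lines.append(f'{name}{{reason="{reason}"}} {value}')
    return "\n".join(lines)
```

Every additional label widens the key, which is the point at which a flat lookup like this would need refactoring.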
metrics: Add metrics for rendezvous and introduction request failures.
This introduces a couple of new service side metrics:
* `hs_intro_rejected_intro_req_count`, which counts the number of introduction
requests rejected by the hidden service
* `hs_rdv_error_count`, which counts the number of rendezvous errors as seen by
the hidden service (this number includes the number of circuit establishment
failures, failed retries, end-to-end circuit setup failures)
Roger Dingledine [Sun, 12 Feb 2023 20:50:55 +0000 (15:50 -0500)]
vote AuthDirMaxServersPerAddr in consensus params
Directory authorities now include their AuthDirMaxServersPerAddr
config option in the consensus parameter section of their vote. Now
external tools can better predict how they will behave.
In particular, the value should make its way to the
https://consensus-health.torproject.org/#consensusparams page.
Once enough dir auths vote this param, they should also compute a
consensus value for it in the consensus document. Nothing uses this
consensus value yet, but we could imagine having dir auths consult it
in the future.
Nick Mathewson [Fri, 10 Feb 2023 13:11:39 +0000 (08:11 -0500)]
Extend blinding testvec with timeperiod test.
When I copied this to arti, I messed up and thought that the default
time period was 1440 seconds for some weird testing reason. That led
to confusion.
This commit adds a test case that time period 1440 is May 20, 1973:
now arti and c tor match!
Add new liblzma enums (LZMA_SEEK_NEEDED and LZMA_RET_INTERNAL*)
conditional to the API version they arrived in. The first stable
version of liblzma this affects is 5.4.0
Fixes #40741
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
Mike Perry [Tue, 10 Jan 2023 20:47:11 +0000 (20:47 +0000)]
Do not reset our RTT in slow start.
If a circuit only sends a tiny amount of data such that its cwnd is not
full, it won't increase its cwnd above the minimum. Since slow start circuits
should never hit the minimum otherwise, we can just ignore them for RTT reset
to handle this.
Mike Perry [Wed, 21 Dec 2022 17:35:09 +0000 (17:35 +0000)]
Reduce size of congestion control next_*_event fields.
Since these are derived from the number of SENDMEs in a cwnd/cc update,
and a cwnd should not exceed ~10k, there's plenty of room in uint16_t
for them, even if the network gets significantly faster.
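A quick arithmetic check of the headroom claim, as a Python sketch. The cwnd ceiling of 10,000 cells comes from the message above; the SENDME increment of 31 cells is an assumed value used only for illustration.

```python
UINT16_MAX = 2**16 - 1   # 65535, the largest value a uint16_t can hold
CWND_CEILING = 10_000    # rough upper bound on cwnd quoted above, in cells
ASSUMED_SENDME_INC = 31  # assumption: cells acknowledged per SENDME

# Number of SENDMEs that fit in one maximal congestion window.
sendmes_per_cwnd = CWND_CEILING // ASSUMED_SENDME_INC

# How many times that count could grow before overflowing uint16_t.
headroom = UINT16_MAX // sendmes_per_cwnd
```

Under these assumptions the counters would hold a few hundred at most, and even the cwnd ceiling itself fits in 16 bits.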
Mike Perry [Fri, 16 Dec 2022 21:12:50 +0000 (21:12 +0000)]
Avoid increasing the congestion window if it is not full.
Also provides some stickiness, so that once full, the congestion window is
considered still full for the rest of an update cycle, or the entire
congestion window.
In this way, we avoid increasing the congestion window if it is not fully
utilized, but we can still back off in this case. This substantially reduces
queue use in Shadow.
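A toy Python model of the "full and sticky" rule described above. It is not tor's congestion control code: the class, the fixed step, and the definition of full as `inflight >= cwnd` are all assumptions made for the sketch.

```python
class CwndSketch:
    """Toy congestion window; not tor's congestion_control_t."""

    def __init__(self, cwnd: int, cwnd_min: int, step: int) -> None:
        self.cwnd = cwnd
        self.cwnd_min = cwnd_min
        self.step = step
        self.cwnd_full = False  # sticky until the next update

    def on_cell_sent(self, inflight: int) -> None:
        # Assumption: "full" means inflight has reached cwnd.
        # Once set, the flag is not cleared by later, lower inflight values.
        if inflight >= self.cwnd:
            self.cwnd_full = True

    def on_update(self, congested: bool) -> None:
        if congested:
            # Backing off is allowed whether or not the window was full.
            self.cwnd = max(self.cwnd_min, self.cwnd - self.step)
        elif self.cwnd_full:
            # Grow only if the window was full at some point this cycle.
            self.cwnd += self.step
        self.cwnd_full = False  # start the next cycle fresh
```

A circuit that never fills its window keeps its cwnd unchanged across updates, while one that filled it even briefly during the cycle is still allowed to grow.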
David Goulet [Tue, 10 Jan 2023 14:24:09 +0000 (09:24 -0500)]
state: Fix segfault on malformed file
Having no TotalBuildTimes along with a positive CircuitBuildAbandonedCount
led to a segfault. We now check for that condition and BUG + log warn if
that is the case.
It should never happen in theory, but if someone modifies their state
file, it can lead to this problem, so warn instead of segfaulting.
Fixes #40437
Signed-off-by: David Goulet <dgoulet@torproject.org>
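The check described in this commit amounts to rejecting one inconsistent combination of state-file keys. A rough Python sketch follows; the dictionary representation of the state file and the function name are assumptions, and only the two key names come from the message.

```python
import logging
from typing import Mapping

log = logging.getLogger("state_sketch")


def build_time_state_is_usable(state: Mapping[str, object]) -> bool:
    """Return False (and warn) for the malformed combination described above."""
    abandoned = int(state.get("CircuitBuildAbandonedCount", 0))
    has_totals = "TotalBuildTimes" in state
    if abandoned > 0 and not has_totals:
        # Warn and refuse the data rather than dereferencing missing state.
        log.warning(
            "State file has CircuitBuildAbandonedCount=%d but no "
            "TotalBuildTimes; ignoring build time state.",
            abandoned,
        )
        return False
    return True
```

A caller would skip loading build time history when this returns False, instead of continuing with half-initialized data.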