Jordan Whited [Thu, 2 Mar 2023 23:08:28 +0000 (15:08 -0800)]
conn, device, tun: implement vectorized I/O on Linux
Implement TCP offloading via TSO and GRO for the Linux tun.Device, which
is made possible by virtio extensions in the kernel's TUN driver.
Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind.
conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All
platforms now fall under conn.StdNetBind, except for Windows, which
remains in conn.WinRingBind, which still needs to be adjusted to handle
multiple packets.
Also refactor sticky sockets support to eventually be applicable on
platforms other than just Linux. However Linux remains the sole platform
that fully implements it for now.
Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Accept packet vectors for reading and writing in the tun.Device and
conn.Bind interfaces, so that the internal plumbing between these
interfaces now passes a vector of packets. Vectors move untouched
between these interfaces, i.e. if 128 packets are received from
conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is
no internal buffering.
Currently, existing implementations are only adjusted to have vectors
of length one. Subsequent patches will improve that.
Also, as a related fixup, use the unix and windows packages rather than
the syscall package when possible.
Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
tun/netstack: check error returned by SetDeadline()
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: don't wrap deadline error.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This commit fixes all callsites of netip.AddrFromSlice(), which has
changed its signature and now returns two values.
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: remove error handling from AddrFromSlice.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Thomas H. Ptacek [Mon, 31 Jan 2022 22:55:36 +0000 (16:55 -0600)]
tun/netstack: implement ICMP ping
Provide a PacketConn interface for netstack's ICMP endpoint; netstack
currently only provides EchoRequest/EchoResponse ICMP support, so this
code exposes only an interface for doing ping.
Signed-off-by: Thomas Ptacek <thomas@sockpuppet.org>
[Jason: rework structure, match std go interfaces, add example code] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
We missed a function exit point. This was exacerbated by e3134bf
("device: defer state machine transitions until configuration is
complete"), but the bug existed prior. Minus provided the following
useful reproducer script:
#!/usr/bin/env bash
set -eux
make wireguard-go || exit 125
ip netns del test-ns || true
ip netns add test-ns
ip link add test-kernel type wireguard
wg set test-kernel listen-port 0 private-key <(echo "QMCfZcp1KU27kEkpcMCgASEjDnDZDYsfMLHPed7+538=") peer "eDPZJMdfnb8ZcA/VSUnLZvLB2k8HVH12ufCGa7Z7rHI=" allowed-ips 10.51.234.10/32
ip link set test-kernel netns test-ns up
ip -n test-ns addr add 10.51.234.1/24 dev test-kernel
port=$(ip netns exec test-ns wg show test-kernel listen-port)
ip link del test-go || true
./wireguard-go test-go
wg set test-go private-key <(echo "WBM7qimR3vFk1QtWNfH+F4ggy/hmO+5hfIHKxxI4nF4=") peer "+nj9Dkqpl4phsHo2dQliGm5aEiWJJgBtYKbh7XjeNjg=" allowed-ips 0.0.0.0/0 endpoint 127.0.0.1:$port
ip addr add 10.51.234.10/24 dev test-go
ip link set test-go up
ping -c2 -W1 10.51.234.1
Reported-by: minus <minus@mnus.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The deferred RUnlock calls weren't executing until all peers
had been processed. Add an anonymous function so that each
peer may be unlocked as soon as it is completed.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
There are more places where we'll need to add it later, when Go 1.18
comes out with support for it in the "net" package. Also, allowedips
still uses slices internally, which might be suboptimal.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: only propagate roaming value before peer is referenced elsewhere
A peer.endpoint never becomes nil after being not-nil, so creation is
the only time we actually need to set this. This prevents a race from
when the variable is actually used elsewhere, and allows us to avoid an
expensive atomic.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Otherwise recent WDK binaries fail on ARM64, where an exception handler
is used for trapping an illegal instruction when ARMv8.1 atomics are
being tested for functionality.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The reason this was failing before is that dloadsup.h's
DloadObtainSection was doing a linear search of sections to find which
header corresponds with the IMAGE_DELAYLOAD_DESCRIPTOR section, and we
were stupidly overwriting the VirtualSize field, so the linear search
wound up matching the .text section, which then it found to not be
marked writable and failed with FAST_FAIL_DLOAD_PROTECTION_FAILURE.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: remove nodes by peer in O(1) instead of O(n)
Now that we have parent pointers hooked up, we can simply go right to
the node and remove it in place, rather than having to recursively walk
the entire trie.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: remove recursion from insertion and connect parent pointers
This makes the insertion algorithm a bit more efficient, while also now
taking on the additional task of connecting up parent pointers. This
will be handy in the following commit.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
tun: linux: account for interface removal from outside
On Linux we can run `ip link del wg0`, in which case the fd becomes
stale, and we should exit. Since this is an intentional action, don't
treat it as an error.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The -1 protection was removed and the wrong error was returned, causing
us to read from a bogus fd. As well, remove the useless closures that
aren't doing anything, since this is all synchronized anyway.
Fixes: 10533c3 ("all: make conn.Bind.Open return a slice of receive functions") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
In 097af6e ("tun: windows: protect reads from closing") we made sure no
functions are running when End() is called, to avoid a UaF. But we still
need to kick that event somehow, so that Read() is allowed to exit, in
order to release the lock. So this commit calls SetEvent, while moving
the closing boolean to be atomic so it can be modified without locks,
and then moves to a WaitGroup for the RCU-like pattern.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: log all errors received by RoutineReceiveIncoming
When debugging, it's useful to know why a receive func exited.
We were already logging that, but only in the "death spiral" case.
Move the logging up, to capture it always.
Reduce the verbosity, since it is not an error case any more.
Put the receive func name in the log line.
The code previously used the old errors channel for checking, rather
than the simpler boolean, which caused issues on shutdown, since the
errors channel was meaningless. However, looking at this exposed a more
basic problem: Close() and all the other functions that check the closed
boolean can race. So protect with a basic RW lock, to ensure that
Close() waits for all pending operations to complete.
Reported-by: Joshua Sjoding <joshua.sjoding@scjalliance.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>