device: add Device.ListenPort and Device.SetListenPort
This is a sample commit for a possible way to make
a Go API that lives alongside UAPI.
The general idea is to add Device and Peer methods
corresponding to UAPI directives, including a way to
look up a peer from a device based on a public key,
as in UAPI.
The UAPI code then deals with parsing and generating textual
input/output, and calls the Go methods to do the work.
This commit also contains a bug fix for a racy access of device.net.port
I will send an independently commit that fixes those directly in UAPI.
This commit is NOT meant to be merged as-is.
They're called elem in most places.
Rename a few local variables to make it consistent.
This makes it easier to grep the code for things like elem.Drop.
device: use channel close to shut down and drain outbound channel
This is a similar treatment to the handling of the encryption
channel found a few commits ago: Use the closing of the channel
to manage goroutine lifetime and shutdown.
It is considerably simpler because there is only a single writer.
device: use channel close to shut down and drain encryption channel
The new test introduced in this commit used to deadlock about 1% of the time.
I believe that the deadlock occurs as follows:
* The test completes, calling device.Close.
* device.Close closes device.signals.stop.
* RoutineEncryption stops.
* The deferred function in RoutineEncryption drains device.queue.encryption.
* RoutineEncryption exits.
* A peer's RoutineNonce processes an element queued in peer.queue.nonce.
* RoutineNonce puts that element into the outbound and encryption queues.
* RoutineSequentialSender reads that elements from the outbound queue.
* It waits for that element to get Unlocked by RoutineEncryption.
* RoutineEncryption has already exited, so RoutineSequentialSender blocks forever.
* device.RemoveAllPeers calls peer.Stop on all peers.
* peer.Stop waits for peer.routines.stopping, which blocks forever.
Rather than attempt to add even more ordering to the already complex
centralized shutdown orchestration, this commit moves towards a
data-flow-oriented shutdown.
The device.queue.encryption gets closed when there will be no more writes to it.
All device.queue.encryption readers always read until the channel is closed and then exit.
We thus guarantee that any element that enters the encryption queue also exits it.
This removes the need for central control of the lifetime of RoutineEncryption,
removes the need to drain the encryption queue on shutdown, and simplifies RoutineEncryption.
This commit also fixes a data race. When RoutineSequentialSender
drains its queue on shutdown, it needs to lock the elem before operating on it,
just as the main body does.
The new test in this commit passed 50k iterations with the race detector enabled
and 150k iterations with the race detector disabled, with no failures.
Since we already have it packed into a uint64
in a known byte order, write it back out again
the same byte order instead of copying byte by byte.
This should also generate more efficient code,
because the compiler can do a single uint64 write,
instead of eight bounds checks and eight byte writes.
Due to a missed optimization, it actually generates a mishmash
of smaller writes: 1 byte, 4 bytes, 2 bytes, 1 byte.
This is https://golang.org/issue/41663.
The code is still better than before, and will get better yet
once that compiler bug gets fixed.
device: accept any io.Reader in device.IpcSetOperation
Any io.Reader will do, and there are no performance concerns here.
This is technically backwards incompatible,
but it is very unlikely to break any existing code.
It is compatible with the existing uses in wireguard-{windows,android,apple}
and also will allow us to slightly simplify it if desired.
When running many concurrent test processing using
https://godoc.org/golang.org/x/tools/cmd/stress
the processing sometimes cannot complete a ping in under 300ms.
Increase the timeout to 5s to reduce the rate of false positives.
device: prevent spurious errors while closing a device
When closing a device, packets that are in flight
can make it to SendBuffer, which then returns an error.
Those errors add noise but no light;
they do not reflect an actual problem.
Adding the synchronization required to prevent
this from occurring is currently expensive and error-prone.
Instead, quietly drop such packets instead of
returning an error.
In each case, the starting waitgroup did nothing but ensure
that the goroutine has launched.
Nothing downstream depends on the order in which goroutines launch,
and if the Go runtime scheduler is so broken that goroutines
don't get launched reasonably promptly, we have much deeper problems.
Given all that, simplify the code.
Passed a race-enabled stress test 25,000 times without failure.
Picking two free ports to use for a test is difficult.
The free port we selected might no longer be free when we reach
for it a second time.
On my machine, this failure mode led to failures approximately
once per thousand test runs.
Since failures are rare, and threading through and checking for
all possible errors is complicated, fix this with a big hammer:
Retry if either device fails to come up.
Also, if you accidentally pick the same port twice, delightful confusion ensues.
The handshake failures manifest as crypto errors, which look scary.
Again, fix with retries.
To make these retries easier to implement, use testing.T.Cleanup
instead of defer to close devices. This requires Go 1.14.
Update go.mod accordingly. Go 1.13 is no longer supported anyway.
With these fixes, 'go test -race' ran 100,000 times without failure.
Simon Rozman [Wed, 22 Jul 2020 07:15:49 +0000 (09:15 +0200)]
wintun: migrate to wintun.dll API
Rather than having every application using Wintun driver reinvent the
wheel, the Wintun device/adapter/interface management has been moved
from wireguard-go to wintun.dll deployed with Wintun itself.
Tobias Klauser [Tue, 27 Oct 2020 13:39:36 +0000 (14:39 +0100)]
tun: use SockaddrCtl from golang.org/x/sys/unix on macOS
Direct syscalls using unix.Syscall(unix.SYS_*, ...) are discouraged on
macOS and might not be supported in future versions. Switch to use
unix.Connect with unix.SockaddrCtl instead.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tobias Klauser [Tue, 27 Oct 2020 13:39:35 +0000 (14:39 +0100)]
tun: use Ioctl{Get,Set}IfreqMTU from golang.org/x/sys/unix on macOS
Direct syscalls using unix.Syscall(unix.SYS_*, ...) are discouraged on
macOS and might not be supported in future versions. Switch to use
unix.Ioctl{Get,Set}IfreqMTU to get and set an interface's MTU.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tobias Klauser [Tue, 27 Oct 2020 13:39:34 +0000 (14:39 +0100)]
tun: use IoctlCtlInfo from golang.org/x/sys/unix on macOS
Direct syscalls using unix.Syscall(unix.SYS_*, ...) are discouraged on
macOS and might not be supported in future versions. Switch to use
unix.IoctlCtlInfo to get the kernel control info.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tobias Klauser [Tue, 27 Oct 2020 13:39:33 +0000 (14:39 +0100)]
tun: use GetsockoptString in (*NativeTun).Name on macOS
Direct syscalls using unix.Syscall(unix.SYS_*, ...) are discouraged on
macOS and might not be supported in future versions. Instead, use the
existing unix.GetsockoptString wrapper to get the interface name.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: wait for routines to stop before removing peers
Peers are currently removed after Device's goroutines are signaled to stop,
but without waiting for them to actually do so, which is racy.
For example, RoutineHandshake may be in Peer.SendKeepalive
when the corresponding peer is removed, which closes its nonce channel.
This causes a send on a closed channel, as observed in tailscale/tailscale#487.
This patch seems to be the correct synchronizing action:
Peer's goroutines are receivers and handle channel closure gracefully,
so Device's goroutines are the ones that should be fully stopped first.
David Crawshaw [Sat, 2 May 2020 06:28:33 +0000 (16:28 +1000)]
ipc: deduplicate some unix-specific code
Cleans up and splits out UAPIOpen to its own file.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
[zx2c4: changed const to var for socketDirectory] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
device: use atomic access for unlocked keypair.next
Go's GC semantics might not always guarantee the safety of this, and the
race detector gets upset too, so instead we wrap this all in atomic
accessors.
Reported-by: David Anderson <danderson@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Simon Rozman [Sat, 2 May 2020 06:49:35 +0000 (08:49 +0200)]
wintun: make remaining HWID comparisons case insensitive
c85e4a410f27986a2967a49c0155633c716bf3ca introduced preliminary HWID
checking to speed up Wintun adapter enumeration. However, all HWID are
case insensitive by Windows convention.
Furthermore, a device might have multiple HWIDs. When DevInfo's
DeviceRegistryProperty(SPDRP_HARDWAREID) method returns []string, all
strings returned should be checked against given hardware ID.
This issue was discovered when researching Wintun and wireguard-go on
Windows 10 ARM64. The Wintun adapter was created using devcon.exe
utility with "wintun" hardware ID, causing wireguard-go fail to
enumerate the adapter properly.
Avery Pennarun [Wed, 6 Nov 2019 08:28:02 +0000 (00:28 -0800)]
tun: NetlinkListener: don't send EventDown before sending EventUp
This works around a startup race condition when competing with
HackListener, which is trying to do the same job. If HackListener
detects that the tundev is running while there is still an event in the
netlink queue that says it isn't running, then the device receives a
string of events like
EventUp (HackListener)
EventDown (NetlinkListener)
EventUp (NetlinkListener)
Unfortunately, after the first EventDown, the device stops itself,
thinking incorrectly that the administrator has downed its tundev.
The device is ignoring the initial EventDown anyway, so just don't emit
it.
David Anderson [Sun, 1 Mar 2020 08:39:24 +0000 (00:39 -0800)]
device: add test to ensure Peer fields are safe for atomic access on 32-bit
Adds a test that will fail consistently on 32-bit platforms if the
struct ever changes again to violate the rules. This is likely not
needed because unaligned access crashes reliably, but this will reliably
fail even if tests accidentally pass due to lucky alignment.
Signed-Off-By: David Anderson <danderson@tailscale.com>
David Crawshaw [Wed, 19 Feb 2020 15:09:24 +0000 (10:09 -0500)]
rwcancel: no-op builds for windows and darwin
This lets us include the package on those platforms in a
followup commit where we split out a conn package from device.
It also lets us run `go test ./...` when developing on macOS.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
David Crawshaw [Sun, 8 Dec 2019 23:22:31 +0000 (18:22 -0500)]
ratelimiter: use a fake clock in tests and style cleanups
The existing test would occasionally flake out with:
--- FAIL: TestRatelimiter (0.12s)
ratelimiter_test.go:99: Test failed for 127.0.0.1 , on: 7 ( not having refilled enough ) expected: false got: true
FAIL
FAIL golang.zx2c4.com/wireguard/ratelimiter 0.171s
The fake clock also means the tests run much faster, so
testing this package with -count=1000 now takes < 100ms.
While here, several style cleanups. The most significant one
is unembeding the sync.Mutex fields in the rate limiter objects.
Embedded as they were, the lock methods were accessible
outside the ratelimiter package. As they aren't needed externally,
keep them internal to make them easier to reason about.
Passes `go test -race -count=10000 ./ratelimiter`
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Tobias Klauser [Wed, 4 Mar 2020 16:21:54 +0000 (17:21 +0100)]
global: use RTMGRP_* consts from x/sys/unix
Update the golang.org/x/sys/unix dependency and use the newly introduced
RTMGRP_* consts instead of using the corresponding RTNLGRP_* const to
create a mask.