git.ipfire.org Git - thirdparty/wireguard-go.git/log
6 weeks ago  version: bump snapshot  master 0.0.20250522
Jason A. Donenfeld [Wed, 21 May 2025 23:45:02 +0000 (01:45 +0200)] 
version: bump snapshot

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
6 weeks ago  conn: don't enable GRO on Linux < 5.12
Jason A. Donenfeld [Wed, 21 May 2025 23:33:55 +0000 (01:33 +0200)] 
conn: don't enable GRO on Linux < 5.12

Kernels below 5.12 are missing this:

    commit 98184612aca0a9ee42b8eb0262a49900ee9eef0d
    Author: Norman Maurer <norman_maurer@apple.com>
    Date:   Thu Apr 1 08:59:17 2021

        net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...);

        Support for UDP_GRO was added in the past but the implementation for
        getsockopt was missed which did lead to an error when we tried to
        retrieve the setting for UDP_GRO. This patch adds the missing switch
        case for UDP_GRO

    Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
    Signed-off-by: Norman Maurer <norman_maurer@apple.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

That means we can't set the option and then read it back later. Given
how buggy UDP_GRO is in general on odd kernels, just disable it on
older kernels altogether.
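
A minimal sketch, not the actual wireguard-go code, of gating a feature
on the running kernel version (unix.Uname and unix.ByteSliceToString
are from golang.org/x/sys/unix):

    // kernelAtLeast reports whether the running kernel's release is at
    // least major.minor, e.g. kernelAtLeast(5, 12).
    func kernelAtLeast(major, minor int) bool {
        var uts unix.Utsname
        if err := unix.Uname(&uts); err != nil {
            return false
        }
        var maj, min int
        release := unix.ByteSliceToString(uts.Release[:])
        if _, err := fmt.Sscanf(release, "%d.%d", &maj, &min); err != nil {
            return false
        }
        return maj > major || (maj == major && min >= minor)
    }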

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
6 weeks ago  device: optimize message encoding
Alexander Yastrebov [Sat, 17 May 2025 09:34:30 +0000 (11:34 +0200)] 
device: optimize message encoding

Optimize message encoding by eliminating binary.Write (which internally
uses reflection) in favour of hand-rolled encoding.

This is a companion to 9e7529c3d2d0c54f4d5384c01645a9279e4740ae.
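
Hand-rolled encoding means writing each field directly with
encoding/binary's byte-order helpers rather than reflecting over the
struct. A sketch of the shape of such a marshal method (field names
and offsets are illustrative, not the actual wire layout):

    func (msg *MessageInitiation) marshal(b []byte) error {
        if len(b) != MessageInitiationSize {
            return errors.New("incorrect packet size")
        }
        binary.LittleEndian.PutUint32(b[0:4], msg.Type)
        binary.LittleEndian.PutUint32(b[4:8], msg.Sender)
        copy(b[8:], msg.Ephemeral[:]) // fixed-size fields are copied directly
        // ... remaining fields written at their fixed offsets ...
        return nil
    }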

Synthetic benchmark:

    var packetSink []byte
    func BenchmarkMessageInitiationMarshal(b *testing.B) {
        var msg MessageInitiation
        b.Run("binary.Write", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                var buf [MessageInitiationSize]byte
                writer := bytes.NewBuffer(buf[:0])
                _ = binary.Write(writer, binary.LittleEndian, msg)
                packetSink = writer.Bytes()
            }
        })
        b.Run("binary.Encode", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                packet := make([]byte, MessageInitiationSize)
                _, _ = binary.Encode(packet, binary.LittleEndian, msg)
                packetSink = packet
            }
        })
        b.Run("marshal", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                packet := make([]byte, MessageInitiationSize)
                _ = msg.marshal(packet)
                packetSink = packet
            }
        })
    }

Results:
                                             │      -      │
                                             │   sec/op    │
    MessageInitiationMarshal/binary.Write-8    1.337µ ± 0%
    MessageInitiationMarshal/binary.Encode-8   1.242µ ± 0%
    MessageInitiationMarshal/marshal-8         53.05n ± 1%

                                             │     -      │
                                             │    B/op    │
    MessageInitiationMarshal/binary.Write-8    368.0 ± 0%
    MessageInitiationMarshal/binary.Encode-8   160.0 ± 0%
    MessageInitiationMarshal/marshal-8         160.0 ± 0%

                                             │     -      │
                                             │ allocs/op  │
    MessageInitiationMarshal/binary.Write-8    3.000 ± 0%
    MessageInitiationMarshal/binary.Encode-8   1.000 ± 0%
    MessageInitiationMarshal/marshal-8         1.000 ± 0%

Signed-off-by: Alexander Yastrebov <yastrebov.alex@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
6 weeks ago  device: add support for removing allowedips individually
Jason A. Donenfeld [Tue, 20 May 2025 21:03:06 +0000 (23:03 +0200)] 
device: add support for removing allowedips individually

This pairs with the recent change in wireguard-tools.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
7 weeks ago  version: bump snapshot  0.0.20250515
Jason A. Donenfeld [Thu, 15 May 2025 14:54:03 +0000 (16:54 +0200)] 
version: bump snapshot

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
7 weeks ago  device: make unmarshal length checks exact
Jason A. Donenfeld [Thu, 15 May 2025 14:48:14 +0000 (16:48 +0200)] 
device: make unmarshal length checks exact

This is already enforced in receive.go, but if these unmarshalers are
to have error return values anyway, make them as explicit as possible.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
7 weeks ago  device: reduce RoutineHandshake allocations
Alexander Yastrebov [Thu, 26 Dec 2024 19:36:53 +0000 (20:36 +0100)] 
device: reduce RoutineHandshake allocations

Reduce allocations by eliminating the byte reader in favour of
hand-rolled decoding, and by reusing message structs.

Synthetic benchmark:

    var msgSink MessageInitiation
    func BenchmarkMessageInitiationUnmarshal(b *testing.B) {
        packet := make([]byte, MessageInitiationSize)
        reader := bytes.NewReader(packet)
        err := binary.Read(reader, binary.LittleEndian, &msgSink)
        if err != nil {
            b.Fatal(err)
        }
        b.Run("binary.Read", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                reader := bytes.NewReader(packet)
                _ = binary.Read(reader, binary.LittleEndian, &msgSink)
            }
        })
        b.Run("unmarshal", func(b *testing.B) {
            b.ReportAllocs()
            for range b.N {
                _ = msgSink.unmarshal(packet)
            }
        })
    }

Results:
                                                 │      -      │
                                                 │   sec/op    │
    MessageInitiationUnmarshal/binary.Read-8   1.508µ ± 2%
    MessageInitiationUnmarshal/unmarshal-8     12.66n ± 2%

                                                 │      -       │
                                                 │     B/op     │
    MessageInitiationUnmarshal/binary.Read-8   208.0 ± 0%
    MessageInitiationUnmarshal/unmarshal-8     0.000 ± 0%

                                                 │      -       │
                                                 │  allocs/op   │
    MessageInitiationUnmarshal/binary.Read-8   2.000 ± 0%
    MessageInitiationUnmarshal/unmarshal-8     0.000 ± 0%

Signed-off-by: Alexander Yastrebov <yastrebov.alex@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  rwcancel: fix wrong poll event flag on ReadyWrite
Kurnia D Win [Wed, 7 Jun 2023 05:41:02 +0000 (12:41 +0700)] 
rwcancel: fix wrong poll event flag on ReadyWrite

It should be POLLIN, because closeFd is a read-only file.

Signed-off-by: Kurnia D Win <kurnia.d.win@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  device: use rand.NewSource instead of rand.Seed
Tom Holford [Sun, 4 May 2025 16:49:49 +0000 (18:49 +0200)] 
device: use rand.NewSource instead of rand.Seed

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  global: replaced unused function params with _
Tom Holford [Sun, 4 May 2025 16:49:03 +0000 (18:49 +0200)] 
global: replaced unused function params with _

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  tun: darwin: fetch flags and mtu from if_msghdr directly
ruokeqx [Thu, 2 Jan 2025 12:28:33 +0000 (20:28 +0800)] 
tun: darwin: fetch flags and mtu from if_msghdr directly

Signed-off-by: ruokeqx <ruokeqx@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  tun: use add-with-carry in checksumNoFold()
Tu Dinh Ngoc [Thu, 20 Jun 2024 13:28:38 +0000 (13:28 +0000)] 
tun: use add-with-carry in checksumNoFold()

Use parallel summation with native byte order, per RFC 1071. An
add-with-carry operation is used to add 4 words per operation. A byte
swap is performed before and after checksumming for compatibility with
the old `checksumNoFold()`. With this we get a 30-80% speedup in
`checksum()`, depending on packet size.

Add unit tests with comparison to a per-word implementation.

**Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz**

| Size | OldTime | NewTime | Speedup  |
|------|---------|---------|----------|
| 64   | 12.64   | 9.183   | 1.376456 |
| 128  | 18.52   | 12.72   | 1.455975 |
| 256  | 31.01   | 18.13   | 1.710425 |
| 512  | 54.46   | 29.03   | 1.87599  |
| 1024 | 102     | 52.2    | 1.954023 |
| 1500 | 146.8   | 81.36   | 1.804326 |
| 2048 | 196.9   | 102.5   | 1.920976 |
| 4096 | 389.8   | 200.8   | 1.941235 |
| 8192 | 767.3   | 413.3   | 1.856521 |
| 9000 | 851.7   | 448.8   | 1.897727 |
| 9001 | 854.8   | 451.9   | 1.891569 |

**AMD EPYC 7352 24-Core Processor**

| Size | OldTime | NewTime | Speedup  |
|------|---------|---------|----------|
| 64   | 9.159   | 6.949   | 1.318031 |
| 128  | 13.59   | 10.59   | 1.283286 |
| 256  | 22.37   | 14.91   | 1.500335 |
| 512  | 41.42   | 24.22   | 1.710157 |
| 1024 | 81.59   | 45.05   | 1.811099 |
| 1500 | 120.4   | 68.35   | 1.761522 |
| 2048 | 162.8   | 90.14   | 1.806079 |
| 4096 | 321.4   | 180.3   | 1.782585 |
| 8192 | 650.4   | 360.8   | 1.802661 |
| 9000 | 706.3   | 398.1   | 1.774177 |
| 9001 | 712.4   | 398.2   | 1.789051 |

Signed-off-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>
[Jason: simplified and cleaned up unit tests]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  tun/netstack: cleanup network stack at closing time
Jason A. Donenfeld [Mon, 5 May 2025 13:09:09 +0000 (15:09 +0200)] 
tun/netstack: cleanup network stack at closing time

Colin's commit went a step further and protected tun.incomingPacket with
a lock on shutdown, but let's see if the tun.stack.Close() call actually
solves that on its own.

Suggested-by: kshangx <hikeshang@hotmail.com>
Suggested-by: Colin Adler <colin1adler@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  tun/netstack: remove usage of pkt.IsNil()
Jason A. Donenfeld [Sun, 4 May 2025 15:54:57 +0000 (17:54 +0200)] 
tun/netstack: remove usage of pkt.IsNil()

Since 3c75945fd ("netstack: remove PacketBuffer.IsNil()") this has been
invalid. Follow the replacement pattern of that commit.

The old definition inlined to the same code anyway:

    func (pk *PacketBuffer) IsNil() bool {
        return pk == nil
    }

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  mod: bump deps
Jason A. Donenfeld [Sun, 4 May 2025 15:50:41 +0000 (17:50 +0200)] 
mod: bump deps

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  global: bump copyright notice
Jason A. Donenfeld [Sun, 4 May 2025 15:48:53 +0000 (17:48 +0200)] 
global: bump copyright notice

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  device: fix missed return of QueueOutboundElementsContainer to its WaitPool
Jordan Whited [Thu, 27 Jun 2024 16:06:40 +0000 (09:06 -0700)] 
device: fix missed return of QueueOutboundElementsContainer to its WaitPool

Fixes: 3bb8fec ("conn, device, tun: implement vectorized I/O plumbing")
Reviewed-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months ago  device: fix WaitPool sync.Cond usage
Jordan Whited [Thu, 27 Jun 2024 15:43:41 +0000 (08:43 -0700)] 
device: fix WaitPool sync.Cond usage

The sync.Locker used with a sync.Cond must be acquired when changing
the associated condition, otherwise there is a window within
sync.Cond.Wait() where a wake-up may be missed.
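
As an illustrative sketch of the invariant (not the WaitPool code
itself): the Locker passed to sync.NewCond must be held when mutating
the condition, and Wait() must sit in a loop that re-checks it.

    var (
        mu    sync.Mutex
        cond  = sync.NewCond(&mu)
        count int
    )

    func put() {
        mu.Lock()
        count++ // mutate the condition while holding the lock
        mu.Unlock()
        cond.Signal()
    }

    func get() {
        mu.Lock()
        for count == 0 { // re-check the condition after every wake-up
            cond.Wait() // atomically unlocks mu and parks; relocks on wake
        }
        count--
        mu.Unlock()
    }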

Fixes: 4846070 ("device: use a waiting sync.Pool instead of a channel")
Reviewed-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  device: fix possible deadlock in close method
Martin Basovnik [Fri, 10 Nov 2023 10:10:12 +0000 (11:10 +0100)] 
device: fix possible deadlock in close method

There is a possible deadlock in `device.Close()` when you try to close
the device very soon after its start. The problem is that two different
methods acquire the same locks in different order:

1. device.Close()
 - device.ipcMutex.Lock()
 - device.state.Lock()

2. device.changeState(deviceState)
 - device.state.Lock()
 - device.ipcMutex.Lock()

Reproducer:

    func TestDevice_deadlock(t *testing.T) {
        d := randDevice(t)
        d.Close()
    }

Problem:

    $ go clean -testcache && go test -race -timeout 3s -run TestDevice_deadlock ./device | grep -A 10 sync.runtime_SemacquireMutex
    sync.runtime_SemacquireMutex(0xc000117d20?, 0x94?, 0x0?)
            /usr/local/opt/go/libexec/src/runtime/sema.go:77 +0x25
    sync.(*Mutex).lockSlow(0xc000130518)
            /usr/local/opt/go/libexec/src/sync/mutex.go:171 +0x213
    sync.(*Mutex).Lock(0xc000130518)
            /usr/local/opt/go/libexec/src/sync/mutex.go:90 +0x55
    golang.zx2c4.com/wireguard/device.(*Device).Close(0xc000130500)
            /Users/martin.basovnik/git/basovnik/wireguard-go/device/device.go:373 +0xb6
    golang.zx2c4.com/wireguard/device.TestDevice_deadlock(0x0?)
            /Users/martin.basovnik/git/basovnik/wireguard-go/device/device_test.go:480 +0x2c
    testing.tRunner(0xc00014c000, 0x131d7b0)
    --
    sync.runtime_SemacquireMutex(0xc000130564?, 0x60?, 0xc000130548?)
            /usr/local/opt/go/libexec/src/runtime/sema.go:77 +0x25
    sync.(*Mutex).lockSlow(0xc000130750)
            /usr/local/opt/go/libexec/src/sync/mutex.go:171 +0x213
    sync.(*Mutex).Lock(0xc000130750)
            /usr/local/opt/go/libexec/src/sync/mutex.go:90 +0x55
    sync.(*RWMutex).Lock(0xc000130750)
            /usr/local/opt/go/libexec/src/sync/rwmutex.go:147 +0x45
    golang.zx2c4.com/wireguard/device.(*Device).upLocked(0xc000130500)
            /Users/martin.basovnik/git/basovnik/wireguard-go/device/device.go:179 +0x72
    golang.zx2c4.com/wireguard/device.(*Device).changeState(0xc000130500, 0x1)

Signed-off-by: Martin Basovnik <martin.basovnik@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  device: do atomic 64-bit add outside of vector loop
Jason A. Donenfeld [Mon, 11 Dec 2023 15:35:57 +0000 (16:35 +0100)] 
device: do atomic 64-bit add outside of vector loop

Only bother updating the rxBytes counter once we've processed a whole
vector, since additions are atomic.
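
Sketched in isolation (names illustrative, counter assumed to be an
atomic.Uint64):

    func (peer *Peer) receivedVector(packets [][]byte) {
        var n uint64
        for _, pkt := range packets {
            n += uint64(len(pkt))
            // ... per-packet processing ...
        }
        peer.rxBytes.Add(n) // one atomic add per vector, not per packet
    }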

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  device: reduce redundant per-packet overhead in RX path
Jordan Whited [Tue, 7 Nov 2023 23:24:21 +0000 (15:24 -0800)] 
device: reduce redundant per-packet overhead in RX path

Peer.RoutineSequentialReceiver() deals with packet vectors and does not
need to perform timer and endpoint operations for every packet in a
given vector. Changing these per-packet operations to per-vector
improves throughput by as much as 10% in some environments.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  device: change Peer.endpoint locking to reduce contention
Jordan Whited [Tue, 21 Nov 2023 00:49:06 +0000 (16:49 -0800)] 
device: change Peer.endpoint locking to reduce contention

Access to Peer.endpoint was previously synchronized by Peer.RWMutex.
This has now moved to Peer.endpoint.Mutex. Peer.SendBuffers() is now the
sole caller of Endpoint.ClearSrc(), which is signaled via a new bool,
Peer.endpoint.clearSrcOnTx. Previous callers of Endpoint.ClearSrc() now
set this bool, primarily via peer.markEndpointSrcForClearing().
Peer.SetEndpointFromPacket() clears Peer.endpoint.clearSrcOnTx when an
updated conn.Endpoint is stored. This maintains the same event order as
before, i.e. a conn.Endpoint received after peer.endpoint.clearSrcOnTx
is set, but before the next Peer.SendBuffers() call, results in the
latest conn.Endpoint source being used for the next packet transmission.
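
The rough shape of the new synchronization, as an illustrative sketch
(the actual field layout may differ):

    type Peer struct {
        // ...
        endpoint struct {
            sync.Mutex
            val          conn.Endpoint
            clearSrcOnTx bool // set instead of calling ClearSrc() directly
        }
    }

    func (peer *Peer) markEndpointSrcForClearing() {
        peer.endpoint.Lock()
        defer peer.endpoint.Unlock()
        peer.endpoint.clearSrcOnTx = true
    }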

These changes result in throughput improvements for single flow,
parallel (-P n) flow, and bidirectional (--bidir) flow iperf3 TCP/UDP
tests as measured on both Linux and Windows. Latency under load improves
especially for high throughput Linux scenarios. These improvements are
likely realized on all platforms to some degree, as the changes are not
platform-specific.

Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  tun: implement UDP GSO/GRO for Linux
Jordan Whited [Wed, 1 Nov 2023 02:53:35 +0000 (19:53 -0700)] 
tun: implement UDP GSO/GRO for Linux

Implement UDP GSO and GRO for the Linux tun.Device, which is made
possible by virtio extensions in the kernel's TUN driver starting in
v6.2.

secnetperf, a QUIC benchmark utility from microsoft/msquic@8e1eb1a, is
used to demonstrate the effect of this commit between two Linux
computers with i5-12400 CPUs. There is roughly ~13us of round trip
latency between them. secnetperf was invoked with the following command
line options:
-stats:1 -exec:maxtput -test:tput -download:10000 -timed:1 -encrypt:0

The first result is from commit 2e0774f without UDP GSO/GRO on the TUN.

[conn][0x55739a144980] STATS: EcnCapable=0 RTT=3973 us
SendTotalPackets=55859 SendSuspectedLostPackets=61
SendSpuriousLostPackets=59 SendCongestionCount=27
SendEcnCongestionCount=0 RecvTotalPackets=2779122
RecvReorderedPackets=0 RecvDroppedPackets=0
RecvDuplicatePackets=0 RecvDecryptionFailures=0
Result: 3654977571 bytes @ 2922821 kbps (10003.972 ms).

The second result is with UDP GSO/GRO on the TUN.

[conn][0x56493dfd09a0] STATS: EcnCapable=0 RTT=1216 us
SendTotalPackets=165033 SendSuspectedLostPackets=64
SendSpuriousLostPackets=61 SendCongestionCount=53
SendEcnCongestionCount=0 RecvTotalPackets=11845268
RecvReorderedPackets=25267 RecvDroppedPackets=0
RecvDuplicatePackets=0 RecvDecryptionFailures=0
Result: 15574671184 bytes @ 12458214 kbps (10001.222 ms).

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
18 months ago  tun: fix Device.Read() buf length assumption on Windows
Jordan Whited [Wed, 8 Nov 2023 22:06:20 +0000 (14:06 -0800)] 
tun: fix Device.Read() buf length assumption on Windows

The length of a packet read from the underlying TUN device may exceed
the length of a supplied buffer when MTU exceeds device.MaxMessageSize.

Reviewed-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  device: ratchet up max segment size on android
Jason A. Donenfeld [Sun, 22 Oct 2023 00:12:13 +0000 (02:12 +0200)] 
device: ratchet up max segment size on android

GRO requires big allocations to be efficient. This isn't great, as there
might be Android memory usage issues. So we should revisit this commit.
But at least it gets things working again.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn: set unused OOB to zero length
Jason A. Donenfeld [Sat, 21 Oct 2023 17:32:07 +0000 (19:32 +0200)] 
conn: set unused OOB to zero length

Otherwise, in the event that we're using GSO without sticky sockets, we
pass garbage OOB buffers to sendmmsg(), causing an EINVAL when GSO
doesn't set its header.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn: fix cmsg data padding calculation for gso
Jason A. Donenfeld [Sat, 21 Oct 2023 17:06:38 +0000 (19:06 +0200)] 
conn: fix cmsg data padding calculation for gso

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn: separate gso and sticky control
Jason A. Donenfeld [Sat, 21 Oct 2023 16:41:27 +0000 (18:41 +0200)] 
conn: separate gso and sticky control

Android wants GSO but not sticky.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn: harmonize GOOS checks between "linux" and "android"
Jason A. Donenfeld [Wed, 18 Oct 2023 19:14:13 +0000 (21:14 +0200)] 
conn: harmonize GOOS checks between "linux" and "android"

Otherwise GRO gets enabled on Android, but the conn doesn't use it,
resulting in bundled packets being discarded.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn: simplify supportsUDPOffload
Jason A. Donenfeld [Wed, 18 Oct 2023 19:02:52 +0000 (21:02 +0200)] 
conn: simplify supportsUDPOffload

This allows a kernel to support UDP_GRO while not supporting
UDP_SEGMENT.
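
A sketch of probing each capability independently via getsockopt, in
the spirit of (though not byte-for-byte identical to) the actual
implementation:

    func supportsUDPOffload(conn *net.UDPConn) (txOffload, rxOffload bool) {
        rc, err := conn.SyscallConn()
        if err != nil {
            return
        }
        _ = rc.Control(func(fd uintptr) {
            _, errTx := unix.GetsockoptInt(int(fd), unix.IPPROTO_UDP, unix.UDP_SEGMENT)
            txOffload = errTx == nil // UDP GSO
            _, errRx := unix.GetsockoptInt(int(fd), unix.IPPROTO_UDP, unix.UDP_GRO)
            rxOffload = errRx == nil // UDP GRO, probed separately
        })
        return
    }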

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  go.mod,tun/netstack: bump gvisor
James Tucker [Wed, 27 Sep 2023 23:15:09 +0000 (16:15 -0700)] 
go.mod,tun/netstack: bump gvisor

Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  tun: fix crash when ForceMTU is called after close
James Tucker [Wed, 27 Sep 2023 21:52:21 +0000 (14:52 -0700)] 
tun: fix crash when ForceMTU is called after close

Close closes the events channel, resulting in a panic from sending on a
closed channel.

Reported-By: Brad Fitzpatrick <brad@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  device: move Queue{In,Out}boundElement Mutex to container type
Jordan Whited [Mon, 2 Oct 2023 21:48:28 +0000 (14:48 -0700)] 
device: move Queue{In,Out}boundElement Mutex to container type

Queue{In,Out}boundElement locking can contribute to significant
overhead via sync.Mutex.lockSlow() in some environments. These types
are passed throughout the device package as elements in a slice, so
move the per-element Mutex to a container around the slice.
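
Illustratively (types simplified from the device package):

    // Before: a sync.Mutex embedded in every element.
    // After: one Mutex on a container wrapping the slice of elements.
    type QueueOutboundElementsContainer struct {
        sync.Mutex
        elems []*QueueOutboundElement
    }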

Reviewed-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  tun: reduce redundant checksumming in tcpGRO()
Jordan Whited [Mon, 2 Oct 2023 21:46:13 +0000 (14:46 -0700)] 
tun: reduce redundant checksumming in tcpGRO()

IPv4 header and pseudo header checksums were being computed on every
merge operation. Additionally, virtioNetHdr was being written at the
same time. This delays those operations until after all coalescing has
occurred.

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  tun: unwind summing loop in checksumNoFold()
Jordan Whited [Mon, 2 Oct 2023 21:43:56 +0000 (14:43 -0700)] 
tun: unwind summing loop in checksumNoFold()

$ benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: golang.zx2c4.com/wireguard/tun
cpu: 12th Gen Intel(R) Core(TM) i5-12400
                 │   old.txt    │               new.txt               │
                 │    sec/op    │   sec/op     vs base                │
Checksum/64-12     10.670n ± 2%   4.769n ± 0%  -55.30% (p=0.000 n=10)
Checksum/128-12    19.665n ± 2%   8.032n ± 0%  -59.16% (p=0.000 n=10)
Checksum/256-12     37.68n ± 1%   16.06n ± 0%  -57.37% (p=0.000 n=10)
Checksum/512-12     76.61n ± 3%   32.13n ± 0%  -58.06% (p=0.000 n=10)
Checksum/1024-12   160.55n ± 4%   64.25n ± 0%  -59.98% (p=0.000 n=10)
Checksum/1500-12   231.05n ± 7%   94.12n ± 0%  -59.26% (p=0.000 n=10)
Checksum/2048-12    309.5n ± 3%   128.5n ± 0%  -58.48% (p=0.000 n=10)
Checksum/4096-12    603.8n ± 4%   257.2n ± 0%  -57.41% (p=0.000 n=10)
Checksum/8192-12   1185.0n ± 3%   515.5n ± 0%  -56.50% (p=0.000 n=10)
Checksum/9000-12   1328.5n ± 5%   564.8n ± 0%  -57.49% (p=0.000 n=10)
Checksum/9001-12   1340.5n ± 3%   564.8n ± 0%  -57.87% (p=0.000 n=10)
geomean             185.3n        77.99n       -57.92%

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  device: distribute crypto work as slice of elements
Jordan Whited [Mon, 2 Oct 2023 21:41:04 +0000 (14:41 -0700)] 
device: distribute crypto work as slice of elements

After reducing UDP stack traversal overhead via GSO and GRO,
runtime.chanrecv() began to account for a high percentage (20% in one
environment) of perf samples during a throughput benchmark. The
individual per-packet channel ops with the crypto goroutines were the
primary contributor to this overhead.

Updating these channels to pass vectors, which the device package
already handles at its ends, reduced this overhead substantially, and
improved throughput.
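
In sketch form (element type illustrative), the channels now carry one
slice per operation instead of one element per operation:

    // Previously: chan *QueueOutboundElement, one send/recv per packet.
    queue := make(chan []*QueueOutboundElement, 128)

    for elems := range queue { // one chanrecv amortized over the vector
        for _, elem := range elems {
            encrypt(elem) // per-element work is unchanged
        }
    }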

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is with UDP GSO and GRO, and with single element
channels.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec  232   3.15 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec  232   sender
[  5]   0.00-10.04  sec  12.3 GBytes  10.6 Gbits/sec        receiver

The second result is with channels updated to pass a slice of
elements.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  13.2 GBytes  11.3 Gbits/sec  182   3.15 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  13.2 GBytes  11.3 Gbits/sec  182   sender
[  5]   0.00-10.04  sec  13.2 GBytes  11.3 Gbits/sec        receiver

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
20 months ago  conn, device: use UDP GSO and GRO on Linux
Jordan Whited [Mon, 2 Oct 2023 20:53:07 +0000 (13:53 -0700)] 
conn, device: use UDP GSO and GRO on Linux

StdNetBind probes for UDP GSO and GRO support at runtime. UDP GSO is
dependent on checksum offload support on the egress netdev. UDP GSO
will be disabled in the event sendmmsg() returns EIO, which is a strong
signal that the egress netdev does not support checksum offload.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 052af4a without UDP GSO or GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  9.85 GBytes  8.46 Gbits/sec  1139   3.01 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  9.85 GBytes  8.46 Gbits/sec  1139  sender
[  5]   0.00-10.04  sec  9.85 GBytes  8.42 Gbits/sec        receiver

The second result is with UDP GSO and GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec  232   3.15 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec  232   sender
[  5]   0.00-10.04  sec  12.3 GBytes  10.6 Gbits/sec        receiver

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  netstack: fix typo
Dimitri Papadopoulos Orfanos [Wed, 17 May 2023 07:16:27 +0000 (09:16 +0200)] 
netstack: fix typo

Signed-off-by: Dimitri Papadopoulos Orfanos <3234522+DimitriPapadopoulos@users.noreply.github.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  all: adjust build tags for wasip1/wasm
Brad Fitzpatrick [Sun, 11 Jun 2023 23:10:38 +0000 (16:10 -0700)] 
all: adjust build tags for wasip1/wasm

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: windows: add missing return statement in DstToString AF_INET
springhack [Thu, 15 Jun 2023 06:41:19 +0000 (14:41 +0800)] 
conn: windows: add missing return statement in DstToString AF_INET

Signed-off-by: SpringHack <springhack@live.cn>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: store IP_PKTINFO cmsg in StdNetEndpoint src
James Tucker [Wed, 19 Apr 2023 05:29:55 +0000 (22:29 -0700)] 
conn: store IP_PKTINFO cmsg in StdNetEndpoint src

Replace the src storage inside StdNetEndpoint with a copy of the raw
control message buffer, to reduce allocation and perform less work on a
per-packet basis.

Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  device: wait for and lock ipc operations during close
James Tucker [Fri, 5 May 2023 23:11:38 +0000 (16:11 -0700)] 
device: wait for and lock ipc operations during close

If an IPC operation is in flight while close starts, it is possible for
both processes to deadlock. Prevent this by taking the IPC lock at the
start of close and for the duration.

Signed-off-by: James Tucker <jftucker@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun: use correct IP header comparisons in tcpGRO() and tcpPacketsCanCoalesce()
Jordan Whited [Fri, 24 Mar 2023 23:23:42 +0000 (16:23 -0700)] 
tun: use correct IP header comparisons in tcpGRO() and tcpPacketsCanCoalesce()

tcpGRO() was using an incorrect IPv4 more fragments bit mask.

tcpPacketsCanCoalesce() was not distinguishing tcp6 from tcp4, and TTL
values were not compared. TTL values should be equal at the IP layer,
otherwise the packets should not coalesce. This tracks with the kernel.
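
For reference, the IPv4 more-fragments (MF) bit lives in the
flags/fragment-offset word at header bytes 6-7, so a correct check
looks like:

    flagsFragOff := binary.BigEndian.Uint16(ipv4Hdr[6:8])
    moreFragments := flagsFragOff&0x2000 != 0 // MF is bit 13 of that word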

Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun: disqualify tcp4 packets w/IP options from coalescing
Jordan Whited [Fri, 24 Mar 2023 22:09:47 +0000 (15:09 -0700)] 
tun: disqualify tcp4 packets w/IP options from coalescing

IP options were not being compared prior to coalescing. They are not
commonly used. Disqualification due to nonzero options is in line with
the kernel.

Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: move booleans to bottom of StdNetBind struct
Jason A. Donenfeld [Fri, 24 Mar 2023 15:21:46 +0000 (16:21 +0100)] 
conn: move booleans to bottom of StdNetBind struct

This results in a more compact structure.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: use ipv6 message pool for ipv6 receiving
Jason A. Donenfeld [Fri, 24 Mar 2023 15:20:16 +0000 (16:20 +0100)] 
conn: use ipv6 message pool for ipv6 receiving

Looks like a simple copy&paste error.

Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: fix StdNetEndpoint data race by dynamically allocating endpoints
Jordan Whited [Thu, 23 Mar 2023 23:57:21 +0000 (16:57 -0700)] 
conn: fix StdNetEndpoint data race by dynamically allocating endpoints

In 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux"), the
Linux-specific Bind implementation was collapsed into StdNetBind. This
introduced a race on StdNetEndpoint from getSrcFromControl() and
setSrcControl().

Remove the sync.Pool involved in the race, and simplify StdNetBind's
receive path to allocate StdNetEndpoint on the heap instead, with the
intent for it to be cleaned up by the GC, later. This essentially
reverts ef5c587 ("conn: remove the final alloc per packet receive"),
adding back that allocation, unfortunately.

This does slightly increase resident memory usage in higher throughput
scenarios. StdNetBind is the only Bind implementation that was using
this Endpoint recycling technique prior to this commit.

This is considered a stop-gap solution, and there are plans to replace
the allocation with a better mechanism.

Reported-by: lsc <lsc@lv6.tw>
Link: https://lore.kernel.org/wireguard/ac87f86f-6837-4e0e-ec34-1df35f52540e@lv6.tw/
Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux")
Cc: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: disable sticky sockets on Android
Jason A. Donenfeld [Thu, 23 Mar 2023 17:38:34 +0000 (18:38 +0100)] 
conn: disable sticky sockets on Android

We can't have the netlink listener socket, so it's not possible to
support it. Plus, the Android networking stack's complexity makes it a
bit tricky anyway, so best to leave it disabled.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  global: remove old style build tags
Jason A. Donenfeld [Thu, 23 Mar 2023 17:33:31 +0000 (18:33 +0100)] 
global: remove old style build tags

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun: replace ErrorBatch() with errors.Join()
Jordan Whited [Thu, 16 Mar 2023 20:27:51 +0000 (13:27 -0700)] 
tun: replace ErrorBatch() with errors.Join()

Reviewed-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  go.mod: bump to Go 1.20
Jordan Whited [Thu, 16 Mar 2023 22:40:04 +0000 (15:40 -0700)] 
go.mod: bump to Go 1.20

Reviewed-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: fix getSrcFromControl() iteration
Jordan Whited [Wed, 15 Mar 2023 03:28:07 +0000 (20:28 -0700)] 
conn: fix getSrcFromControl() iteration

We only expect a single control message in the normal case, but this
would loop infinitely if there were more.
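
One loop-safe way to walk ancillary data, shown as a sketch with
golang.org/x/sys/unix rather than the package's own parsing, is to
split the buffer into messages up front:

    msgs, err := unix.ParseSocketControlMessage(control)
    if err != nil {
        return
    }
    for _, m := range msgs { // bounded: one iteration per parsed message
        if m.Header.Level == unix.IPPROTO_IP && m.Header.Type == unix.IP_PKTINFO {
            // extract the source address from m.Data
        }
    }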

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: use CmsgSpace() for ancillary data buf sizing
Jordan Whited [Wed, 15 Mar 2023 03:02:24 +0000 (20:02 -0700)] 
conn: use CmsgSpace() for ancillary data buf sizing

CmsgLen() does not account for data alignment.
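
Illustratively: unix.CmsgSpace() rounds the data length up to the
platform's cmsg alignment and is what buffer allocation needs, while
unix.CmsgLen() is the value stored in a header's Len field:

    // Room for one IP_PKTINFO message, including alignment padding.
    control := make([]byte, unix.CmsgSpace(unix.SizeofInet4Pktinfo))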

Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  global: buff -> buf
Jason A. Donenfeld [Mon, 13 Mar 2023 16:55:05 +0000 (17:55 +0100)] 
global: buff -> buf

This always struck me as kind of weird and non-standard.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: use right cmsghdr len types on 32-bit in sticky test
Jason A. Donenfeld [Fri, 10 Mar 2023 15:18:01 +0000 (16:18 +0100)] 
conn: use right cmsghdr len types on 32-bit in sticky test

Cmsghdr uses uint32 and uint64 on 32-bit and 64-bit respectively for the
Len member, which makes assignments and comparisons slightly more
irksome than usual.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: make StdNetBind.BatchSize() return 1 for non-Linux
Jordan Whited [Thu, 9 Mar 2023 21:02:17 +0000 (13:02 -0800)] 
conn: make StdNetBind.BatchSize() return 1 for non-Linux

This commit updates StdNetBind.BatchSize() to return 1 instead of
IdealBatchSize for non-Linux platforms. Non-Linux platforms do not
yet benefit from values > 1, which only serves to increase memory
consumption.

Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: enable TCP Selective Acknowledgements
Jordan Whited [Thu, 9 Mar 2023 19:06:01 +0000 (11:06 -0800)] 
tun/netstack: enable TCP Selective Acknowledgements

Enable TCP SACK for the gVisor Stack used in tun/netstack. This can
improve throughput by an order of magnitude in the presence of packet
loss.

Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: ensure control message size is respected in StdNetBind
Jordan Whited [Thu, 9 Mar 2023 18:46:12 +0000 (10:46 -0800)] 
conn: ensure control message size is respected in StdNetBind

This commit re-slices received control messages in StdNetBind to the
value the OS reports on a successful read. Previously, the len of this
slice would always be srcControlSize, which could result in control
message values leaking through a sync.Pool round trip. This is
unlikely with the IP_PKTINFO socket option set successfully, but
should be guarded against.
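
In sketch form, using the standard library's ReadMsgUDP for clarity
(udpConn and handlePacket are illustrative):

    n, oobn, _, addr, err := udpConn.ReadMsgUDP(buf, control)
    if err != nil {
        return err
    }
    // Re-slice to the OS-reported lengths so stale bytes from a pooled
    // buffer cannot leak into control message parsing.
    handlePacket(buf[:n], control[:oobn], addr)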

Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: fix StdNetBind fallback on Windows
Jordan Whited [Mon, 6 Mar 2023 23:58:32 +0000 (15:58 -0800)] 
conn: fix StdNetBind fallback on Windows

If RIO is unavailable, NewWinRingBind() falls back to StdNetBind.
StdNetBind uses x/net/ipv{4,6}.PacketConn for sending and receiving
datagrams, specifically via the {Read,Write}Batch methods.
These methods are unimplemented on Windows and will return runtime
errors as a result. Additionally, only Linux benefits from these
x/net types for reading and writing, so we update StdNetBind to fall
back to the standard library net package for all platforms other than
Linux.

Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: inch BatchSize toward being non-dynamic
Jason A. Donenfeld [Sat, 4 Mar 2023 14:25:46 +0000 (15:25 +0100)] 
conn: inch BatchSize toward being non-dynamic

There's not really a use at the moment for making this configurable, and
once bind_windows.go behaves like bind_std.go, we'll be able to use
constants everywhere. So begin that simplification now.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn: set SO_{SND,RCV}BUF to 7MB on the Bind UDP socket
Jordan Whited [Thu, 2 Mar 2023 23:25:19 +0000 (15:25 -0800)] 
conn: set SO_{SND,RCV}BUF to 7MB on the Bind UDP socket

The conn.Bind UDP sockets' send and receive buffers are now being sized
to 7MB, whereas they were previously inheriting the system defaults.
The system defaults are considerably small and can result in dropped
packets on high speed links. By increasing the size of these buffers we
are able to achieve higher throughput in the aforementioned case.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with 32-core Xeon Platinum CPUs @ 2.9GHz. There is
roughly ~125us of round trip latency between them.

The first result is from commit 792b49c which uses the system defaults,
e.g. net.core.{r,w}mem_max = 212992. The TCP retransmits are correlated
with buffer full drops on both sides.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  4.74 GBytes  4.08 Gbits/sec  2742   285 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.74 GBytes  4.08 Gbits/sec  2742   sender
[  5]   0.00-10.04  sec  4.74 GBytes  4.06 Gbits/sec         receiver

The second result is after increasing SO_{SND,RCV}BUF to 7MB, i.e.
applying this commit.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  6.14 GBytes  5.27 Gbits/sec    0   3.15 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.14 GBytes  5.27 Gbits/sec    0    sender
[  5]   0.00-10.04  sec  6.14 GBytes  5.25 Gbits/sec         receiver

The specific value of 7MB is chosen as it is the max supported by a
default configuration of macOS. A value greater than 7MB may further
benefit throughput for environments with higher network latency and
lower CPU clocks, but will also increase latency under load
(bufferbloat). Some platforms will silently clamp the value to other
maximums. On Linux, we use SO_{SND,RCV}BUFFORCE in case 7MB is beyond
net.core.{r,w}mem_max.
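
The Linux path, sketched (constants from golang.org/x/sys/unix; the
FORCE variants require CAP_NET_ADMIN, hence the fallback):

    const socketBufferSize = 7 << 20 // 7MB

    func trySetBuffers(fd int) {
        if err := unix.SetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_RCVBUFFORCE, socketBufferSize); err != nil {
            _ = unix.SetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_RCVBUF, socketBufferSize)
        }
        if err := unix.SetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_SNDBUFFORCE, socketBufferSize); err != nil {
            _ = unix.SetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_SNDBUF, socketBufferSize)
        }
    }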

Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  go.mod: bump deps
Jason A. Donenfeld [Fri, 3 Mar 2023 13:58:10 +0000 (14:58 +0100)] 
go.mod: bump deps

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn, device, tun: implement vectorized I/O on Linux
Jordan Whited [Thu, 2 Mar 2023 23:08:28 +0000 (15:08 -0800)] 
conn, device, tun: implement vectorized I/O on Linux

Implement TCP offloading via TSO and GRO for the Linux tun.Device, which
is made possible by virtio extensions in the kernel's TUN driver.

Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind.
conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All
platforms now fall under conn.StdNetBind, except for Windows, which
remains in conn.WinRingBind, which still needs to be adjusted to handle
multiple packets.

Also refactor sticky sockets support to eventually be applicable on
platforms other than just Linux. However, Linux remains the sole platform
that fully implements it for now.

Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  conn, device, tun: implement vectorized I/O plumbing
Jordan Whited [Thu, 2 Mar 2023 22:48:02 +0000 (14:48 -0800)] 
conn, device, tun: implement vectorized I/O plumbing

Accept packet vectors for reading and writing in the tun.Device and
conn.Bind interfaces, so that the internal plumbing between these
interfaces now passes a vector of packets. Vectors move untouched
between these interfaces, i.e. if 128 packets are received from
conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is
no internal buffering.

Currently, existing implementations are only adjusted to have vectors
of length one. Subsequent patches will improve that.

Also, as a related fixup, use the unix and windows packages rather than
the syscall package when possible.

Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  version: bump snapshot  0.0.20230223
Jason A. Donenfeld [Thu, 23 Feb 2023 18:12:33 +0000 (19:12 +0100)] 
version: bump snapshot

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  device: uniformly check ECDH output for zeros
Jason A. Donenfeld [Thu, 16 Feb 2023 14:51:30 +0000 (15:51 +0100)] 
device: uniformly check ECDH output for zeros

For some reason, this was omitted for response messages.
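
The check itself is a constant-time comparison of the ECDH output
against all zeros, which is what a low-order peer public key produces.
Sketch (using crypto/subtle; names approximate):

    var ss [32]byte // shared secret from the X25519 operation
    var zeros [32]byte
    if subtle.ConstantTimeCompare(ss[:], zeros[:]) == 1 {
        return errors.New("handshake: ECDH output is all zeros")
    }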

Reported-by: z <dzm@unexpl0.red>
Fixes: 8c34c4c ("First set of code review patches")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun: guard Device.Events() against chan writes
Jordan Whited [Wed, 8 Feb 2023 18:42:07 +0000 (10:42 -0800)] 
tun: guard Device.Events() against chan writes

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  global: bump copyright year
Jason A. Donenfeld [Tue, 20 Sep 2022 15:21:32 +0000 (17:21 +0200)] 
global: bump copyright year

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: make http examples communicate with each other
Soren L. Hansen [Wed, 6 Oct 2021 10:40:01 +0000 (10:40 +0000)] 
tun/netstack: make http examples communicate with each other

This seems like a much better demonstration as it removes the need for
external components.

Signed-off-by: Søren L. Hansen <sorenisanerd@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: bump gvisor
Colin Adler [Mon, 6 Feb 2023 22:35:59 +0000 (16:35 -0600)] 
tun/netstack: bump gvisor

Bump gVisor to a recent known-good version.

Signed-off-by: Colin Adler <colin1adler@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  global: bump copyright year
Jason A. Donenfeld [Tue, 20 Sep 2022 15:21:32 +0000 (17:21 +0200)] 
global: bump copyright year

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: ensure `(*netTun).incomingPacket` chan is closed
Colin Adler [Tue, 13 Sep 2022 03:03:55 +0000 (22:03 -0500)] 
tun/netstack: ensure `(*netTun).incomingPacket` chan is closed

Without this, `device.Close()` will deadlock.

Signed-off-by: Colin Adler <colin1adler@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  all: use Go 1.19 and its atomic types
Brad Fitzpatrick [Tue, 30 Aug 2022 14:43:11 +0000 (07:43 -0700)] 
all: use Go 1.19 and its atomic types
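
Illustratively, Go 1.19's sync/atomic types replace the free functions
over plain fields and carry their own alignment guarantees:

    // Before (Go <= 1.18), with manual 64-bit alignment care on 32-bit:
    //     var rxBytes uint64
    //     atomic.AddUint64(&rxBytes, n)
    var rxBytes atomic.Uint64
    rxBytes.Add(n)
    total := rxBytes.Load()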

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: remove separate module
Jason A. Donenfeld [Mon, 29 Aug 2022 16:04:27 +0000 (12:04 -0400)] 
tun/netstack: remove separate module

Now that the gvisor deps aren't insane, we can just do this in the main
module.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 years ago  tun/netstack: bump to latest gvisor
Shengjing Zhu [Thu, 18 Aug 2022 17:27:28 +0000 (01:27 +0800)] 
tun/netstack: bump to latest gvisor

To build with go1.19, gvisor needs
99325baf ("Bump gVisor build tags to go1.19").

However gvisor.dev/gvisor/pkg/tcpip/buffer is no longer available,
so refactor to use gvisor.dev/gvisor/pkg/tcpip/link/channel directly.

Signed-off-by: Shengjing Zhu <i@zhsj.me>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  conn, device, tun: set CLOEXEC on fds
Brad Fitzpatrick [Sat, 2 Jul 2022 04:28:52 +0000 (21:28 -0700)] 
conn, device, tun: set CLOEXEC on fds

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  tun: use ByteSliceToString from golang.org/x/sys/unix
Tobias Klauser [Wed, 1 Jun 2022 09:33:54 +0000 (11:33 +0200)] 
tun: use ByteSliceToString from golang.org/x/sys/unix

Use unix.ByteSliceToString in (*NativeTun).nameSlice to convert the
TUNGETIFF ioctl result []byte to a string.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  conn: remove the final alloc per packet receive
Josh Bleecher Snyder [Tue, 22 Mar 2022 18:23:56 +0000 (11:23 -0700)] 
conn: remove the final alloc per packet receive

This does bind_std only; other platforms remain.

The remaining alloc per iteration in the Throughput benchmark
comes from the tuntest package, and should not appear in regular use.

name           old time/op      new time/op      delta
Latency-10         25.2µs ± 1%      25.0µs ± 0%   -0.58%  (p=0.006 n=10+10)
Throughput-10      2.44µs ± 3%      2.41µs ± 2%     ~     (p=0.140 n=10+8)

name           old alloc/op     new alloc/op     delta
Latency-10           854B ± 5%        741B ± 3%  -13.22%  (p=0.000 n=10+10)
Throughput-10        265B ±34%        267B ±39%     ~     (p=0.670 n=10+10)

name           old allocs/op    new allocs/op    delta
Latency-10           16.0 ± 0%        14.0 ± 0%  -12.50%  (p=0.000 n=10+10)
Throughput-10        2.00 ± 0%        1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

name           old packet-loss  new packet-loss  delta
Throughput-10        0.01 ±82%       0.01 ±282%     ~     (p=0.321 n=9+8)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  conn: use netip for std bind
Jason A. Donenfeld [Fri, 18 Mar 2022 04:23:02 +0000 (22:23 -0600)] 
conn: use netip for std bind

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  version: bump snapshot  0.0.20220316
Jason A. Donenfeld [Thu, 17 Mar 2022 03:32:14 +0000 (21:32 -0600)] 
version: bump snapshot

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  tun/netstack: bump mod
Jason A. Donenfeld [Wed, 16 Mar 2022 23:58:35 +0000 (17:58 -0600)] 
tun/netstack: bump mod

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  mod: bump packages and remove compat netip
Jason A. Donenfeld [Wed, 16 Mar 2022 23:51:47 +0000 (17:51 -0600)] 
mod: bump packages and remove compat netip

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  all: use any in place of interface{}
Josh Bleecher Snyder [Wed, 16 Mar 2022 23:40:24 +0000 (16:40 -0700)] 
all: use any in place of interface{}

Enabled by using Go 1.18. A bit less verbose.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago  all: update to Go 1.18
Josh Bleecher Snyder [Wed, 16 Mar 2022 23:09:48 +0000 (16:09 -0700)] 
all: update to Go 1.18

Bump go.mod and README.

Switch to upstream net/netip.

Use strings.Cut.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago  tun/netstack: check error returned by SetDeadline()
Alexander Neumann [Fri, 4 Mar 2022 09:38:10 +0000 (10:38 +0100)] 
tun/netstack: check error returned by SetDeadline()

Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: don't wrap deadline error.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  tun/netstack: update to latest wireguard-go
Alexander Neumann [Fri, 4 Mar 2022 09:36:15 +0000 (10:36 +0100)] 
tun/netstack: update to latest wireguard-go

This commit fixes all callsites of netip.AddrFromSlice(), which has
changed its signature and now returns two values.

Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: remove error handling from AddrFromSlice.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  tun/netstack: simplify read timeout on ping socket
Jason A. Donenfeld [Wed, 2 Feb 2022 22:30:31 +0000 (23:30 +0100)] 
tun/netstack: simplify read timeout on ping socket

I'm not 100% sure this is correct, but it certainly is a lot simpler.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  tun/netstack: implement ICMP ping
Thomas H. Ptacek [Mon, 31 Jan 2022 22:55:36 +0000 (16:55 -0600)] 
tun/netstack: implement ICMP ping

Provide a PacketConn interface for netstack's ICMP endpoint; netstack
currently only provides EchoRequest/EchoResponse ICMP support, so this
code exposes only an interface for doing ping.

Signed-off-by: Thomas Ptacek <thomas@sockpuppet.org>
[Jason: rework structure, match std go interfaces, add example code]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  version: bump snapshot  0.0.20220117
Jason A. Donenfeld [Mon, 17 Jan 2022 16:37:42 +0000 (17:37 +0100)] 
version: bump snapshot

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  ipc: bsd: try again if kqueue returns EINTR
Jason A. Donenfeld [Fri, 14 Jan 2022 15:10:43 +0000 (16:10 +0100)] 
ipc: bsd: try again if kqueue returns EINTR

Reported-by: J. Michael McAtee <mmcatee@jumptrading.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  global: apply gofumpt
Jason A. Donenfeld [Thu, 9 Dec 2021 16:55:50 +0000 (17:55 +0100)] 
global: apply gofumpt

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: handle peer post config on blank line
Jason A. Donenfeld [Mon, 29 Nov 2021 17:31:54 +0000 (12:31 -0500)] 
device: handle peer post config on blank line

We missed a function exit point. This was exacerbated by e3134bf
("device: defer state machine transitions until configuration is
complete"), but the bug existed prior. Minus provided the following
useful reproducer script:

    #!/usr/bin/env bash

    set -eux

    make wireguard-go || exit 125

    ip netns del test-ns || true
    ip netns add test-ns
    ip link add test-kernel type wireguard
    wg set test-kernel listen-port 0 private-key <(echo "QMCfZcp1KU27kEkpcMCgASEjDnDZDYsfMLHPed7+538=") peer "eDPZJMdfnb8ZcA/VSUnLZvLB2k8HVH12ufCGa7Z7rHI=" allowed-ips 10.51.234.10/32
    ip link set test-kernel netns test-ns up
    ip -n test-ns addr add 10.51.234.1/24 dev test-kernel
    port=$(ip netns exec test-ns wg show test-kernel listen-port)

    ip link del test-go || true
    ./wireguard-go test-go
    wg set test-go private-key <(echo "WBM7qimR3vFk1QtWNfH+F4ggy/hmO+5hfIHKxxI4nF4=") peer "+nj9Dkqpl4phsHo2dQliGm5aEiWJJgBtYKbh7XjeNjg=" allowed-ips 0.0.0.0/0 endpoint 127.0.0.1:$port
    ip addr add 10.51.234.10/24 dev test-go
    ip link set test-go up

    ping -c2 -W1 10.51.234.1

Reported-by: minus <minus@mnus.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: reduce peer lock critical section in UAPI
Josh Bleecher Snyder [Thu, 18 Nov 2021 23:37:24 +0000 (15:37 -0800)] 
device: reduce peer lock critical section in UAPI

The deferred RUnlock calls weren't executing until all peers
had been processed. Add an anonymous function so that each
peer may be unlocked as soon as it is completed.
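
As a sketch of the pattern:

    for _, peer := range peers {
        func() {
            peer.RLock()
            defer peer.RUnlock() // now runs at the end of each iteration
            // ... emit this peer's configuration ...
        }()
    }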

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: remove code using unsafe
Josh Bleecher Snyder [Mon, 8 Nov 2021 19:15:06 +0000 (11:15 -0800)] 
device: remove code using unsafe

There is no performance impact.

name                             old time/op  new time/op  delta
TrieIPv4Peers100Addresses1000-8  78.6ns ± 1%  79.4ns ± 3%    ~     (p=0.604 n=10+9)
TrieIPv4Peers10Addresses10-8     29.1ns ± 2%  28.8ns ± 1%  -1.12%  (p=0.014 n=10+9)
TrieIPv6Peers100Addresses1000-8  78.9ns ± 1%  78.6ns ± 1%    ~     (p=0.492 n=10+10)
TrieIPv6Peers10Addresses10-8     29.3ns ± 2%  28.6ns ± 2%  -2.16%  (p=0.000 n=10+10)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  global: use netip where possible now
Jason A. Donenfeld [Fri, 5 Nov 2021 00:52:54 +0000 (01:52 +0100)] 
global: use netip where possible now

There are more places where we'll need to add it later, when Go 1.18
comes out with support for it in the "net" package. Also, allowedips
still uses slices internally, which might be suboptimal.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: only propagate roaming value before peer is referenced elsewhere
Jason A. Donenfeld [Tue, 16 Nov 2021 20:13:55 +0000 (21:13 +0100)] 
device: only propagate roaming value before peer is referenced elsewhere

A peer.endpoint never becomes nil after being not-nil, so creation is
the only time we actually need to set this. This prevents a race when
the variable is actually used elsewhere, and allows us to avoid an
expensive atomic.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: align 64-bit atomic member in Device
Jason A. Donenfeld [Tue, 16 Nov 2021 20:07:15 +0000 (21:07 +0100)] 
device: align 64-bit atomic member in Device

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: start peers before running handshake test
Jason A. Donenfeld [Tue, 16 Nov 2021 20:04:54 +0000 (21:04 +0100)] 
device: start peers before running handshake test

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  Makefile: don't use test -v because it hides failures in scrollback
Jason A. Donenfeld [Tue, 16 Nov 2021 19:59:40 +0000 (20:59 +0100)] 
Makefile: don't use test -v because it hides failures in scrollback

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years ago  device: fix nil pointer dereference in uapi read
David Anderson [Tue, 16 Nov 2021 19:27:44 +0000 (11:27 -0800)] 
device: fix nil pointer dereference in uapi read

Signed-off-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>