From: Sasha Levin Date: Fri, 10 May 2024 21:35:06 +0000 (-0400) Subject: Fixes for 6.6 X-Git-Tag: v4.19.314~100 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=eaf61346287053fab6a092d85dc4922274335bbe;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 6.6 Signed-off-by: Sasha Levin --- diff --git a/queue-6.6/arm-9381-1-kasan-clear-stale-stack-poison.patch b/queue-6.6/arm-9381-1-kasan-clear-stale-stack-poison.patch new file mode 100644 index 00000000000..3e55b59d14d --- /dev/null +++ b/queue-6.6/arm-9381-1-kasan-clear-stale-stack-poison.patch @@ -0,0 +1,116 @@ +From d8f74a367f3dd9380aa3c562ee78bb4d192cc419 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 15 Apr 2024 05:21:55 +0100 +Subject: ARM: 9381/1: kasan: clear stale stack poison + +From: Boy.Wu + +[ Upstream commit c4238686f9093b98bd6245a348bcf059cdce23af ] + +We found below OOB crash: + +[ 33.452494] ================================================================== +[ 33.453513] BUG: KASAN: stack-out-of-bounds in refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec +[ 33.454660] Write of size 164 at addr c1d03d30 by task swapper/0/0 +[ 33.455515] +[ 33.455767] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.1.25-mainline #1 +[ 33.456880] Hardware name: Generic DT based system +[ 33.457555] unwind_backtrace from show_stack+0x18/0x1c +[ 33.458326] show_stack from dump_stack_lvl+0x40/0x4c +[ 33.459072] dump_stack_lvl from print_report+0x158/0x4a4 +[ 33.459863] print_report from kasan_report+0x9c/0x148 +[ 33.460616] kasan_report from kasan_check_range+0x94/0x1a0 +[ 33.461424] kasan_check_range from memset+0x20/0x3c +[ 33.462157] memset from refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec +[ 33.463064] refresh_cpu_vm_stats.constprop.0 from tick_nohz_idle_stop_tick+0x180/0x53c +[ 33.464181] tick_nohz_idle_stop_tick from do_idle+0x264/0x354 +[ 33.465029] do_idle from cpu_startup_entry+0x20/0x24 +[ 33.465769] cpu_startup_entry from rest_init+0xf0/0xf4 +[ 33.466528] rest_init from arch_post_acpi_subsys_init+0x0/0x18 +[ 
33.467397] +[ 33.467644] The buggy address belongs to stack of task swapper/0/0 +[ 33.468493] and is located at offset 112 in frame: +[ 33.469172] refresh_cpu_vm_stats.constprop.0+0x0/0x2ec +[ 33.469917] +[ 33.470165] This frame has 2 objects: +[ 33.470696] [32, 76) 'global_zone_diff' +[ 33.470729] [112, 276) 'global_node_diff' +[ 33.471294] +[ 33.472095] The buggy address belongs to the physical page: +[ 33.472862] page:3cd72da8 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x41d03 +[ 33.473944] flags: 0x1000(reserved|zone=0) +[ 33.474565] raw: 00001000 ed741470 ed741470 00000000 00000000 00000000 ffffffff 00000001 +[ 33.475656] raw: 00000000 +[ 33.476050] page dumped because: kasan: bad access detected +[ 33.476816] +[ 33.477061] Memory state around the buggy address: +[ 33.477732] c1d03c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 +[ 33.478630] c1d03c80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 +[ 33.479526] >c1d03d00: 00 04 f2 f2 f2 f2 00 00 00 00 00 00 f1 f1 f1 f1 +[ 33.480415] ^ +[ 33.481195] c1d03d80: 00 00 00 00 00 00 00 00 00 00 04 f3 f3 f3 f3 f3 +[ 33.482088] c1d03e00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 +[ 33.482978] ================================================================== + +We find the root cause of this OOB is that arm does not clear stale stack +poison in the case of cpuidle. + +This patch refer to arch/arm64/kernel/sleep.S to resolve this issue. + +From cited commit [1] that explain the problem + +Functions which the compiler has instrumented for KASAN place poison on +the stack shadow upon entry and remove this poison prior to returning. + +In the case of cpuidle, CPUs exit the kernel a number of levels deep in +C code. Any instrumented functions on this critical path will leave +portions of the stack shadow poisoned. 
+ +If CPUs lose context and return to the kernel via a cold path, we +restore a prior context saved in __cpu_suspend_enter are forgotten, and +we never remove the poison they placed in the stack shadow area by +functions calls between this and the actual exit of the kernel. + +Thus, (depending on stackframe layout) subsequent calls to instrumented +functions may hit this stale poison, resulting in (spurious) KASAN +splats to the console. + +To avoid this, clear any stale poison from the idle thread for a CPU +prior to bringing a CPU online. + +From cited commit [2] + +Extend to check for CONFIG_KASAN_STACK + +[1] commit 0d97e6d8024c ("arm64: kasan: clear stale stack poison") +[2] commit d56a9ef84bd0 ("kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK") + +Signed-off-by: Boy Wu +Reviewed-by: Mark Rutland +Acked-by: Andrey Ryabinin +Reviewed-by: Linus Walleij +Fixes: 5615f69bc209 ("ARM: 9016/2: Initialize the mapping of KASan shadow memory") +Signed-off-by: Russell King (Oracle) +Signed-off-by: Sasha Levin +--- + arch/arm/kernel/sleep.S | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S +index a86a1d4f34618..93afd1005b43c 100644 +--- a/arch/arm/kernel/sleep.S ++++ b/arch/arm/kernel/sleep.S +@@ -127,6 +127,10 @@ cpu_resume_after_mmu: + instr_sync + #endif + bl cpu_init @ restore the und/abt/irq banked regs ++#if defined(CONFIG_KASAN) && defined(CONFIG_KASAN_STACK) ++ mov r0, sp ++ bl kasan_unpoison_task_stack_below ++#endif + mov r0, #0 @ return zero on success + ldmfd sp!, {r4 - r11, pc} + ENDPROC(cpu_resume_after_mmu) +-- +2.43.0 + diff --git a/queue-6.6/bluetooth-fix-use-after-free-bugs-caused-by-sco_sock.patch b/queue-6.6/bluetooth-fix-use-after-free-bugs-caused-by-sco_sock.patch new file mode 100644 index 00000000000..5d365e5b857 --- /dev/null +++ b/queue-6.6/bluetooth-fix-use-after-free-bugs-caused-by-sco_sock.patch @@ -0,0 +1,145 @@ +From 280cd1b1c1b6e36dcda0e6dcbe4deff73404acf2 Mon Sep 17 
00:00:00 2001 +From: Sasha Levin +Date: Thu, 25 Apr 2024 22:23:45 +0800 +Subject: Bluetooth: Fix use-after-free bugs caused by sco_sock_timeout + +From: Duoming Zhou + +[ Upstream commit 483bc08181827fc475643272ffb69c533007e546 ] + +When the sco connection is established and then, the sco socket +is releasing, timeout_work will be scheduled to judge whether +the sco disconnection is timeout. The sock will be deallocated +later, but it is dereferenced again in sco_sock_timeout. As a +result, the use-after-free bugs will happen. The root cause is +shown below: + + Cleanup Thread | Worker Thread +sco_sock_release | + sco_sock_close | + __sco_sock_close | + sco_sock_set_timer | + schedule_delayed_work | + sco_sock_kill | (wait a time) + sock_put(sk) //FREE | sco_sock_timeout + | sock_hold(sk) //USE + +The KASAN report triggered by POC is shown below: + +[ 95.890016] ================================================================== +[ 95.890496] BUG: KASAN: slab-use-after-free in sco_sock_timeout+0x5e/0x1c0 +[ 95.890755] Write of size 4 at addr ffff88800c388080 by task kworker/0:0/7 +... +[ 95.890755] Workqueue: events sco_sock_timeout +[ 95.890755] Call Trace: +[ 95.890755] +[ 95.890755] dump_stack_lvl+0x45/0x110 +[ 95.890755] print_address_description+0x78/0x390 +[ 95.890755] print_report+0x11b/0x250 +[ 95.890755] ? __virt_addr_valid+0xbe/0xf0 +[ 95.890755] ? sco_sock_timeout+0x5e/0x1c0 +[ 95.890755] kasan_report+0x139/0x170 +[ 95.890755] ? update_load_avg+0xe5/0x9f0 +[ 95.890755] ? sco_sock_timeout+0x5e/0x1c0 +[ 95.890755] kasan_check_range+0x2c3/0x2e0 +[ 95.890755] sco_sock_timeout+0x5e/0x1c0 +[ 95.890755] process_one_work+0x561/0xc50 +[ 95.890755] worker_thread+0xab2/0x13c0 +[ 95.890755] ? pr_cont_work+0x490/0x490 +[ 95.890755] kthread+0x279/0x300 +[ 95.890755] ? pr_cont_work+0x490/0x490 +[ 95.890755] ? kthread_blkcg+0xa0/0xa0 +[ 95.890755] ret_from_fork+0x34/0x60 +[ 95.890755] ? 
kthread_blkcg+0xa0/0xa0 +[ 95.890755] ret_from_fork_asm+0x11/0x20 +[ 95.890755] +[ 95.890755] +[ 95.890755] Allocated by task 506: +[ 95.890755] kasan_save_track+0x3f/0x70 +[ 95.890755] __kasan_kmalloc+0x86/0x90 +[ 95.890755] __kmalloc+0x17f/0x360 +[ 95.890755] sk_prot_alloc+0xe1/0x1a0 +[ 95.890755] sk_alloc+0x31/0x4e0 +[ 95.890755] bt_sock_alloc+0x2b/0x2a0 +[ 95.890755] sco_sock_create+0xad/0x320 +[ 95.890755] bt_sock_create+0x145/0x320 +[ 95.890755] __sock_create+0x2e1/0x650 +[ 95.890755] __sys_socket+0xd0/0x280 +[ 95.890755] __x64_sys_socket+0x75/0x80 +[ 95.890755] do_syscall_64+0xc4/0x1b0 +[ 95.890755] entry_SYSCALL_64_after_hwframe+0x67/0x6f +[ 95.890755] +[ 95.890755] Freed by task 506: +[ 95.890755] kasan_save_track+0x3f/0x70 +[ 95.890755] kasan_save_free_info+0x40/0x50 +[ 95.890755] poison_slab_object+0x118/0x180 +[ 95.890755] __kasan_slab_free+0x12/0x30 +[ 95.890755] kfree+0xb2/0x240 +[ 95.890755] __sk_destruct+0x317/0x410 +[ 95.890755] sco_sock_release+0x232/0x280 +[ 95.890755] sock_close+0xb2/0x210 +[ 95.890755] __fput+0x37f/0x770 +[ 95.890755] task_work_run+0x1ae/0x210 +[ 95.890755] get_signal+0xe17/0xf70 +[ 95.890755] arch_do_signal_or_restart+0x3f/0x520 +[ 95.890755] syscall_exit_to_user_mode+0x55/0x120 +[ 95.890755] do_syscall_64+0xd1/0x1b0 +[ 95.890755] entry_SYSCALL_64_after_hwframe+0x67/0x6f +[ 95.890755] +[ 95.890755] The buggy address belongs to the object at ffff88800c388000 +[ 95.890755] which belongs to the cache kmalloc-1k of size 1024 +[ 95.890755] The buggy address is located 128 bytes inside of +[ 95.890755] freed 1024-byte region [ffff88800c388000, ffff88800c388400) +[ 95.890755] +[ 95.890755] The buggy address belongs to the physical page: +[ 95.890755] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800c38a800 pfn:0xc388 +[ 95.890755] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 +[ 95.890755] anon flags: 0x100000000000840(slab|head|node=0|zone=1) +[ 95.890755] page_type: 0xffffffff() +[ 95.890755] 
raw: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001 +[ 95.890755] raw: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000 +[ 95.890755] head: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001 +[ 95.890755] head: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000 +[ 95.890755] head: 0100000000000003 ffffea000030e201 ffffea000030e248 00000000ffffffff +[ 95.890755] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000 +[ 95.890755] page dumped because: kasan: bad access detected +[ 95.890755] +[ 95.890755] Memory state around the buggy address: +[ 95.890755] ffff88800c387f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc +[ 95.890755] ffff88800c388000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb +[ 95.890755] >ffff88800c388080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb +[ 95.890755] ^ +[ 95.890755] ffff88800c388100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb +[ 95.890755] ffff88800c388180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb +[ 95.890755] ================================================================== + +Fix this problem by adding a check protected by sco_conn_lock to judget +whether the conn->hcon is null. Because the conn->hcon will be set to null, +when the sock is releasing. 
+ +Fixes: ba316be1b6a0 ("Bluetooth: schedule SCO timeouts with delayed_work") +Signed-off-by: Duoming Zhou +Signed-off-by: Luiz Augusto von Dentz +Signed-off-by: Sasha Levin +--- + net/bluetooth/sco.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c +index 3cc9fab8e8384..ede7391f3aa98 100644 +--- a/net/bluetooth/sco.c ++++ b/net/bluetooth/sco.c +@@ -83,6 +83,10 @@ static void sco_sock_timeout(struct work_struct *work) + struct sock *sk; + + sco_conn_lock(conn); ++ if (!conn->hcon) { ++ sco_conn_unlock(conn); ++ return; ++ } + sk = conn->sk; + if (sk) + sock_hold(sk); +-- +2.43.0 + diff --git a/queue-6.6/bluetooth-hci-fix-potential-null-ptr-deref.patch b/queue-6.6/bluetooth-hci-fix-potential-null-ptr-deref.patch new file mode 100644 index 00000000000..856bc07d194 --- /dev/null +++ b/queue-6.6/bluetooth-hci-fix-potential-null-ptr-deref.patch @@ -0,0 +1,35 @@ +From d8d08a047a623f742ec824ba376191733e17c407 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 2 May 2024 12:09:31 -0400 +Subject: Bluetooth: HCI: Fix potential null-ptr-deref + +From: Sungwoo Kim + +[ Upstream commit d2706004a1b8b526592e823d7e52551b518a7941 ] + +Fix potential null-ptr-deref in hci_le_big_sync_established_evt(). 
+ +Fixes: f777d8827817 (Bluetooth: ISO: Notify user space about failed bis connections) +Signed-off-by: Sungwoo Kim +Signed-off-by: Luiz Augusto von Dentz +Signed-off-by: Sasha Levin +--- + net/bluetooth/hci_event.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c +index 1b4abf8e90f6b..9274d32550493 100644 +--- a/net/bluetooth/hci_event.c ++++ b/net/bluetooth/hci_event.c +@@ -7200,6 +7200,8 @@ static void hci_le_big_sync_established_evt(struct hci_dev *hdev, void *data, + u16 handle = le16_to_cpu(ev->bis[i]); + + bis = hci_conn_hash_lookup_handle(hdev, handle); ++ if (!bis) ++ continue; + + set_bit(HCI_CONN_BIG_SYNC_FAILED, &bis->flags); + hci_connect_cfm(bis, ev->status); +-- +2.43.0 + diff --git a/queue-6.6/bluetooth-l2cap-fix-null-ptr-deref-in-l2cap_chan_tim.patch b/queue-6.6/bluetooth-l2cap-fix-null-ptr-deref-in-l2cap_chan_tim.patch new file mode 100644 index 00000000000..79b302eeef7 --- /dev/null +++ b/queue-6.6/bluetooth-l2cap-fix-null-ptr-deref-in-l2cap_chan_tim.patch @@ -0,0 +1,136 @@ +From f7873cb93c327c6daf33ba3b389f8ddeef601600 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 2 May 2024 20:57:36 +0800 +Subject: Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout + +From: Duoming Zhou + +[ Upstream commit adf0398cee86643b8eacde95f17d073d022f782c ] + +There is a race condition between l2cap_chan_timeout() and +l2cap_chan_del(). When we use l2cap_chan_del() to delete the +channel, the chan->conn will be set to null. But the conn could +be dereferenced again in the mutex_lock() of l2cap_chan_timeout(). +As a result the null pointer dereference bug will happen. 
The +KASAN report triggered by POC is shown below: + +[ 472.074580] ================================================================== +[ 472.075284] BUG: KASAN: null-ptr-deref in mutex_lock+0x68/0xc0 +[ 472.075308] Write of size 8 at addr 0000000000000158 by task kworker/0:0/7 +[ 472.075308] +[ 472.075308] CPU: 0 PID: 7 Comm: kworker/0:0 Not tainted 6.9.0-rc5-00356-g78c0094a146b #36 +[ 472.075308] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu4 +[ 472.075308] Workqueue: events l2cap_chan_timeout +[ 472.075308] Call Trace: +[ 472.075308] +[ 472.075308] dump_stack_lvl+0x137/0x1a0 +[ 472.075308] print_report+0x101/0x250 +[ 472.075308] ? __virt_addr_valid+0x77/0x160 +[ 472.075308] ? mutex_lock+0x68/0xc0 +[ 472.075308] kasan_report+0x139/0x170 +[ 472.075308] ? mutex_lock+0x68/0xc0 +[ 472.075308] kasan_check_range+0x2c3/0x2e0 +[ 472.075308] mutex_lock+0x68/0xc0 +[ 472.075308] l2cap_chan_timeout+0x181/0x300 +[ 472.075308] process_one_work+0x5d2/0xe00 +[ 472.075308] worker_thread+0xe1d/0x1660 +[ 472.075308] ? pr_cont_work+0x5e0/0x5e0 +[ 472.075308] kthread+0x2b7/0x350 +[ 472.075308] ? pr_cont_work+0x5e0/0x5e0 +[ 472.075308] ? kthread_blkcg+0xd0/0xd0 +[ 472.075308] ret_from_fork+0x4d/0x80 +[ 472.075308] ? 
kthread_blkcg+0xd0/0xd0 +[ 472.075308] ret_from_fork_asm+0x11/0x20 +[ 472.075308] +[ 472.075308] ================================================================== +[ 472.094860] Disabling lock debugging due to kernel taint +[ 472.096136] BUG: kernel NULL pointer dereference, address: 0000000000000158 +[ 472.096136] #PF: supervisor write access in kernel mode +[ 472.096136] #PF: error_code(0x0002) - not-present page +[ 472.096136] PGD 0 P4D 0 +[ 472.096136] Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI +[ 472.096136] CPU: 0 PID: 7 Comm: kworker/0:0 Tainted: G B 6.9.0-rc5-00356-g78c0094a146b #36 +[ 472.096136] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu4 +[ 472.096136] Workqueue: events l2cap_chan_timeout +[ 472.096136] RIP: 0010:mutex_lock+0x88/0xc0 +[ 472.096136] Code: be 08 00 00 00 e8 f8 23 1f fd 4c 89 f7 be 08 00 00 00 e8 eb 23 1f fd 42 80 3c 23 00 74 08 48 88 +[ 472.096136] RSP: 0018:ffff88800744fc78 EFLAGS: 00000246 +[ 472.096136] RAX: 0000000000000000 RBX: 1ffff11000e89f8f RCX: ffffffff8457c865 +[ 472.096136] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88800744fc78 +[ 472.096136] RBP: 0000000000000158 R08: ffff88800744fc7f R09: 1ffff11000e89f8f +[ 472.096136] R10: dffffc0000000000 R11: ffffed1000e89f90 R12: dffffc0000000000 +[ 472.096136] R13: 0000000000000158 R14: ffff88800744fc78 R15: ffff888007405a00 +[ 472.096136] FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000 +[ 472.096136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +[ 472.096136] CR2: 0000000000000158 CR3: 000000000da32000 CR4: 00000000000006f0 +[ 472.096136] Call Trace: +[ 472.096136] +[ 472.096136] ? __die_body+0x8d/0xe0 +[ 472.096136] ? page_fault_oops+0x6b8/0x9a0 +[ 472.096136] ? kernelmode_fixup_or_oops+0x20c/0x2a0 +[ 472.096136] ? do_user_addr_fault+0x1027/0x1340 +[ 472.096136] ? _printk+0x7a/0xa0 +[ 472.096136] ? mutex_lock+0x68/0xc0 +[ 472.096136] ? add_taint+0x42/0xd0 +[ 472.096136] ? 
exc_page_fault+0x6a/0x1b0 +[ 472.096136] ? asm_exc_page_fault+0x26/0x30 +[ 472.096136] ? mutex_lock+0x75/0xc0 +[ 472.096136] ? mutex_lock+0x88/0xc0 +[ 472.096136] ? mutex_lock+0x75/0xc0 +[ 472.096136] l2cap_chan_timeout+0x181/0x300 +[ 472.096136] process_one_work+0x5d2/0xe00 +[ 472.096136] worker_thread+0xe1d/0x1660 +[ 472.096136] ? pr_cont_work+0x5e0/0x5e0 +[ 472.096136] kthread+0x2b7/0x350 +[ 472.096136] ? pr_cont_work+0x5e0/0x5e0 +[ 472.096136] ? kthread_blkcg+0xd0/0xd0 +[ 472.096136] ret_from_fork+0x4d/0x80 +[ 472.096136] ? kthread_blkcg+0xd0/0xd0 +[ 472.096136] ret_from_fork_asm+0x11/0x20 +[ 472.096136] +[ 472.096136] Modules linked in: +[ 472.096136] CR2: 0000000000000158 +[ 472.096136] ---[ end trace 0000000000000000 ]--- +[ 472.096136] RIP: 0010:mutex_lock+0x88/0xc0 +[ 472.096136] Code: be 08 00 00 00 e8 f8 23 1f fd 4c 89 f7 be 08 00 00 00 e8 eb 23 1f fd 42 80 3c 23 00 74 08 48 88 +[ 472.096136] RSP: 0018:ffff88800744fc78 EFLAGS: 00000246 +[ 472.096136] RAX: 0000000000000000 RBX: 1ffff11000e89f8f RCX: ffffffff8457c865 +[ 472.096136] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88800744fc78 +[ 472.096136] RBP: 0000000000000158 R08: ffff88800744fc7f R09: 1ffff11000e89f8f +[ 472.132932] R10: dffffc0000000000 R11: ffffed1000e89f90 R12: dffffc0000000000 +[ 472.132932] R13: 0000000000000158 R14: ffff88800744fc78 R15: ffff888007405a00 +[ 472.132932] FS: 0000000000000000(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000 +[ 472.132932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +[ 472.132932] CR2: 0000000000000158 CR3: 000000000da32000 CR4: 00000000000006f0 +[ 472.132932] Kernel panic - not syncing: Fatal exception +[ 472.132932] Kernel Offset: disabled +[ 472.132932] ---[ end Kernel panic - not syncing: Fatal exception ]--- + +Add a check to judge whether the conn is null in l2cap_chan_timeout() +in order to mitigate the bug. 
+ +Fixes: 3df91ea20e74 ("Bluetooth: Revert to mutexes from RCU list") +Signed-off-by: Duoming Zhou +Signed-off-by: Luiz Augusto von Dentz +Signed-off-by: Sasha Levin +--- + net/bluetooth/l2cap_core.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c +index 706d2478ddb33..1e961cfaa07b3 100644 +--- a/net/bluetooth/l2cap_core.c ++++ b/net/bluetooth/l2cap_core.c +@@ -415,6 +415,9 @@ static void l2cap_chan_timeout(struct work_struct *work) + + BT_DBG("chan %p state %s", chan, state_to_string(chan->state)); + ++ if (!conn) ++ return; ++ + mutex_lock(&conn->chan_lock); + /* __set_chan_timer() calls l2cap_chan_hold(chan) while scheduling + * this work. No need to call l2cap_chan_hold(chan) here again. +-- +2.43.0 + diff --git a/queue-6.6/bluetooth-msft-fix-slab-use-after-free-in-msft_do_cl.patch b/queue-6.6/bluetooth-msft-fix-slab-use-after-free-in-msft_do_cl.patch new file mode 100644 index 00000000000..027f6523fb1 --- /dev/null +++ b/queue-6.6/bluetooth-msft-fix-slab-use-after-free-in-msft_do_cl.patch @@ -0,0 +1,101 @@ +From 113dd84b598520e89d6de3ddb7fd219230f087d4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 30 Apr 2024 12:20:51 -0400 +Subject: Bluetooth: msft: fix slab-use-after-free in msft_do_close() + +From: Sungwoo Kim + +[ Upstream commit 10f9f426ac6e752c8d87bf4346930ba347aaabac ] + +Tying the msft->data lifetime to hdev by freeing it in +hci_release_dev() to fix the following case: + +[use] +msft_do_close() + msft = hdev->msft_data; + if (!msft) ...(1) <- passed. + return; + mutex_lock(&msft->filter_lock); ...(4) <- used after freed. + +[free] +msft_unregister() + msft = hdev->msft_data; + hdev->msft_data = NULL; ...(2) + kfree(msft); ...(3) <- msft is freed. 
+ +================================================================== +BUG: KASAN: slab-use-after-free in __mutex_lock_common +kernel/locking/mutex.c:587 [inline] +BUG: KASAN: slab-use-after-free in __mutex_lock+0x8f/0xc30 +kernel/locking/mutex.c:752 +Read of size 8 at addr ffff888106cbbca8 by task kworker/u5:2/309 + +Fixes: bf6a4e30ffbd ("Bluetooth: disable advertisement filters during suspend") +Signed-off-by: Sungwoo Kim +Signed-off-by: Luiz Augusto von Dentz +Signed-off-by: Sasha Levin +--- + net/bluetooth/hci_core.c | 3 +-- + net/bluetooth/msft.c | 2 +- + net/bluetooth/msft.h | 4 ++-- + 3 files changed, 4 insertions(+), 5 deletions(-) + +diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c +index 0592369579ab2..befe645d3f9bf 100644 +--- a/net/bluetooth/hci_core.c ++++ b/net/bluetooth/hci_core.c +@@ -2736,8 +2736,6 @@ void hci_unregister_dev(struct hci_dev *hdev) + + hci_unregister_suspend_notifier(hdev); + +- msft_unregister(hdev); +- + hci_dev_do_close(hdev); + + if (!test_bit(HCI_INIT, &hdev->flags) && +@@ -2791,6 +2789,7 @@ void hci_release_dev(struct hci_dev *hdev) + hci_discovery_filter_clear(hdev); + hci_blocked_keys_clear(hdev); + hci_codec_list_clear(&hdev->local_codecs); ++ msft_release(hdev); + hci_dev_unlock(hdev); + + ida_destroy(&hdev->unset_handle_ida); +diff --git a/net/bluetooth/msft.c b/net/bluetooth/msft.c +index 9612c5d1b13f6..d039683d3bdd4 100644 +--- a/net/bluetooth/msft.c ++++ b/net/bluetooth/msft.c +@@ -769,7 +769,7 @@ void msft_register(struct hci_dev *hdev) + mutex_init(&msft->filter_lock); + } + +-void msft_unregister(struct hci_dev *hdev) ++void msft_release(struct hci_dev *hdev) + { + struct msft_data *msft = hdev->msft_data; + +diff --git a/net/bluetooth/msft.h b/net/bluetooth/msft.h +index 2a63205b377b7..fe538e9c91c01 100644 +--- a/net/bluetooth/msft.h ++++ b/net/bluetooth/msft.h +@@ -14,7 +14,7 @@ + + bool msft_monitor_supported(struct hci_dev *hdev); + void msft_register(struct hci_dev *hdev); +-void 
msft_unregister(struct hci_dev *hdev); ++void msft_release(struct hci_dev *hdev); + void msft_do_open(struct hci_dev *hdev); + void msft_do_close(struct hci_dev *hdev); + void msft_vendor_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb); +@@ -35,7 +35,7 @@ static inline bool msft_monitor_supported(struct hci_dev *hdev) + } + + static inline void msft_register(struct hci_dev *hdev) {} +-static inline void msft_unregister(struct hci_dev *hdev) {} ++static inline void msft_release(struct hci_dev *hdev) {} + static inline void msft_do_open(struct hci_dev *hdev) {} + static inline void msft_do_close(struct hci_dev *hdev) {} + static inline void msft_vendor_evt(struct hci_dev *hdev, void *data, +-- +2.43.0 + diff --git a/queue-6.6/dt-bindings-net-mediatek-remove-wrongly-added-clocks.patch b/queue-6.6/dt-bindings-net-mediatek-remove-wrongly-added-clocks.patch new file mode 100644 index 00000000000..8894032e6a9 --- /dev/null +++ b/queue-6.6/dt-bindings-net-mediatek-remove-wrongly-added-clocks.patch @@ -0,0 +1,88 @@ +From 283ae1731095a4fd1d1f3249a452879332d7cbf8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 13:20:43 +0100 +Subject: dt-bindings: net: mediatek: remove wrongly added clocks and SerDes + +From: Daniel Golle + +[ Upstream commit cc349b0771dccebf0fa9f5e1822ac444aef11448 ] + +Several clocks as well as both sgmiisys phandles were added by mistake +to the Ethernet bindings for MT7988. Also, the total number of clocks +didn't match with the actual number of items listed. + +This happened because the vendor driver which served as a reference uses +a high number of syscon phandles to access various parts of the SoC +which wasn't acceptable upstream. Hence several parts which have never +previously been supported (such SerDes PHY and USXGMII PCS) are going to +be implemented by separate drivers. As a result the device tree will +look much more sane. 
+ +Quickly align the bindings with the upcoming reality of the drivers +actually adding support for the remaining Ethernet-related features of +the MT7988 SoC. + +Fixes: c94a9aabec36 ("dt-bindings: net: mediatek,net: add mt7988-eth binding") +Signed-off-by: Daniel Golle +Acked-by: Krzysztof Kozlowski +Link: https://lore.kernel.org/r/1569290b21cc787a424469ed74456a7e976b102d.1715084326.git.daniel@makrotopia.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../devicetree/bindings/net/mediatek,net.yaml | 22 ++----------------- + 1 file changed, 2 insertions(+), 20 deletions(-) + +diff --git a/Documentation/devicetree/bindings/net/mediatek,net.yaml b/Documentation/devicetree/bindings/net/mediatek,net.yaml +index e74502a0afe86..3202dc7967c5b 100644 +--- a/Documentation/devicetree/bindings/net/mediatek,net.yaml ++++ b/Documentation/devicetree/bindings/net/mediatek,net.yaml +@@ -337,8 +337,8 @@ allOf: + minItems: 4 + + clocks: +- minItems: 34 +- maxItems: 34 ++ minItems: 24 ++ maxItems: 24 + + clock-names: + items: +@@ -351,18 +351,6 @@ allOf: + - const: ethwarp_wocpu1 + - const: ethwarp_wocpu0 + - const: esw +- - const: netsys0 +- - const: netsys1 +- - const: sgmii_tx250m +- - const: sgmii_rx250m +- - const: sgmii2_tx250m +- - const: sgmii2_rx250m +- - const: top_usxgmii0_sel +- - const: top_usxgmii1_sel +- - const: top_sgm0_sel +- - const: top_sgm1_sel +- - const: top_xfi_phy0_xtal_sel +- - const: top_xfi_phy1_xtal_sel + - const: top_eth_gmii_sel + - const: top_eth_refck_50m_sel + - const: top_eth_sys_200m_sel +@@ -375,16 +363,10 @@ allOf: + - const: top_netsys_sync_250m_sel + - const: top_netsys_ppefb_250m_sel + - const: top_netsys_warp_sel +- - const: wocpu1 +- - const: wocpu0 + - const: xgp1 + - const: xgp2 + - const: xgp3 + +- mediatek,sgmiisys: +- minItems: 2 +- maxItems: 2 +- + patternProperties: + "^mac@[0-1]$": + type: object +-- +2.43.0 + diff --git a/queue-6.6/hsr-simplify-code-for-announcing-hsr-nodes-timer-set.patch 
b/queue-6.6/hsr-simplify-code-for-announcing-hsr-nodes-timer-set.patch new file mode 100644 index 00000000000..2dace35bc39 --- /dev/null +++ b/queue-6.6/hsr-simplify-code-for-announcing-hsr-nodes-timer-set.patch @@ -0,0 +1,102 @@ +From 4b3de9cf37e652cbcc45e72b46b7dec27aac9f62 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 13:12:14 +0200 +Subject: hsr: Simplify code for announcing HSR nodes timer setup + +From: Lukasz Majewski + +[ Upstream commit 4893b8b3ef8db2b182d1a1bebf6c7acf91405000 ] + +Up till now the code to start HSR announce timer, which triggers sending +supervisory frames, was assuming that hsr_netdev_notify() would be called +at least twice for hsrX interface. This was required to have different +values for old and current values of network device's operstate. + +This is problematic for a case where hsrX interface is already in the +operational state when hsr_netdev_notify() is called, so timer is not +configured to trigger and as a result the hsrX is not sending supervisory +frames to HSR ring. + +This error has been discovered when hsr_ping.sh script was run. To be +more specific - for the hsr1 and hsr2 the hsr_netdev_notify() was +called at least twice with different IF_OPER_{LOWERDOWN|DOWN|UP} states +assigned in hsr_check_carrier_and_operstate(hsr). As a result there was +no issue with sending supervisory frames. +However, with hsr3, the notify function was called only once with +operstate set to IF_OPER_UP and timer responsible for triggering +supervisory frames was not fired. + +The solution is to use netif_oper_up() and netif_running() helper +functions to assess if network hsrX device is up. +Only then, when the timer is not already pending, it is started. +Otherwise it is deactivated. 
+ +Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") +Signed-off-by: Lukasz Majewski +Reviewed-by: Simon Horman +Link: https://lore.kernel.org/r/20240507111214.3519800-1-lukma@denx.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/hsr/hsr_device.c | 27 ++++++++++++--------------- + 1 file changed, 12 insertions(+), 15 deletions(-) + +diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c +index cd337385e8592..c5f7bd01379ce 100644 +--- a/net/hsr/hsr_device.c ++++ b/net/hsr/hsr_device.c +@@ -71,39 +71,36 @@ static bool hsr_check_carrier(struct hsr_port *master) + return false; + } + +-static void hsr_check_announce(struct net_device *hsr_dev, +- unsigned char old_operstate) ++static void hsr_check_announce(struct net_device *hsr_dev) + { + struct hsr_priv *hsr; + + hsr = netdev_priv(hsr_dev); +- +- if (READ_ONCE(hsr_dev->operstate) == IF_OPER_UP && old_operstate != IF_OPER_UP) { +- /* Went up */ +- hsr->announce_count = 0; +- mod_timer(&hsr->announce_timer, +- jiffies + msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL)); ++ if (netif_running(hsr_dev) && netif_oper_up(hsr_dev)) { ++ /* Enable announce timer and start sending supervisory frames */ ++ if (!timer_pending(&hsr->announce_timer)) { ++ hsr->announce_count = 0; ++ mod_timer(&hsr->announce_timer, jiffies + ++ msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL)); ++ } ++ } else { ++ /* Deactivate the announce timer */ ++ timer_delete(&hsr->announce_timer); + } +- +- if (READ_ONCE(hsr_dev->operstate) != IF_OPER_UP && old_operstate == IF_OPER_UP) +- /* Went down */ +- del_timer(&hsr->announce_timer); + } + + void hsr_check_carrier_and_operstate(struct hsr_priv *hsr) + { + struct hsr_port *master; +- unsigned char old_operstate; + bool has_carrier; + + master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); + /* netif_stacked_transfer_operstate() cannot be used here since + * it doesn't set IF_OPER_LOWERLAYERDOWN (?) 
+ */ +- old_operstate = READ_ONCE(master->dev->operstate); + has_carrier = hsr_check_carrier(master); + hsr_set_operstate(master, has_carrier); +- hsr_check_announce(master->dev, old_operstate); ++ hsr_check_announce(master->dev); + } + + int hsr_get_max_mtu(struct hsr_priv *hsr) +-- +2.43.0 + diff --git a/queue-6.6/hwmon-corsair-cpro-protect-ccp-wait_input_report-wit.patch b/queue-6.6/hwmon-corsair-cpro-protect-ccp-wait_input_report-wit.patch new file mode 100644 index 00000000000..ff38ab6888f --- /dev/null +++ b/queue-6.6/hwmon-corsair-cpro-protect-ccp-wait_input_report-wit.patch @@ -0,0 +1,96 @@ +From f1e95bc50846fc48a85b7fcd2fde8783a106e102 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 4 May 2024 11:25:03 +0200 +Subject: hwmon: (corsair-cpro) Protect ccp->wait_input_report with a spinlock + +From: Aleksa Savic + +[ Upstream commit d02abd57e79469a026213f7f5827a98d909f236a ] + +Through hidraw, userspace can cause a status report to be sent +from the device. The parsing in ccp_raw_event() may happen in +parallel to a send_usb_cmd() call (which resets the completion +for tracking the report) if it's running on a different CPU where +bottom half interrupts are not disabled. + +Add a spinlock around the complete_all() in ccp_raw_event() and +reinit_completion() in send_usb_cmd() to prevent race issues. 
+ +Fixes: 40c3a4454225 ("hwmon: add Corsair Commander Pro driver") +Signed-off-by: Aleksa Savic +Acked-by: Marius Zachmann +Link: https://lore.kernel.org/r/20240504092504.24158-4-savicaleksa83@gmail.com +Signed-off-by: Guenter Roeck +Signed-off-by: Sasha Levin +--- + drivers/hwmon/corsair-cpro.c | 24 +++++++++++++++++++----- + 1 file changed, 19 insertions(+), 5 deletions(-) + +diff --git a/drivers/hwmon/corsair-cpro.c b/drivers/hwmon/corsair-cpro.c +index e65e3825af974..280b90646a873 100644 +--- a/drivers/hwmon/corsair-cpro.c ++++ b/drivers/hwmon/corsair-cpro.c +@@ -16,6 +16,7 @@ + #include + #include + #include ++#include + #include + + #define USB_VENDOR_ID_CORSAIR 0x1b1c +@@ -77,6 +78,8 @@ + struct ccp_device { + struct hid_device *hdev; + struct device *hwmon_dev; ++ /* For reinitializing the completion below */ ++ spinlock_t wait_input_report_lock; + struct completion wait_input_report; + struct mutex mutex; /* whenever buffer is used, lock before send_usb_cmd */ + u8 *cmd_buffer; +@@ -118,7 +121,15 @@ static int send_usb_cmd(struct ccp_device *ccp, u8 command, u8 byte1, u8 byte2, + ccp->cmd_buffer[2] = byte2; + ccp->cmd_buffer[3] = byte3; + ++ /* ++ * Disable raw event parsing for a moment to safely reinitialize the ++ * completion. Reinit is done because hidraw could have triggered ++ * the raw event parsing and marked the ccp->wait_input_report ++ * completion as done. 
++ */ ++ spin_lock_bh(&ccp->wait_input_report_lock); + reinit_completion(&ccp->wait_input_report); ++ spin_unlock_bh(&ccp->wait_input_report_lock); + + ret = hid_hw_output_report(ccp->hdev, ccp->cmd_buffer, OUT_BUFFER_SIZE); + if (ret < 0) +@@ -136,11 +147,12 @@ static int ccp_raw_event(struct hid_device *hdev, struct hid_report *report, u8 + struct ccp_device *ccp = hid_get_drvdata(hdev); + + /* only copy buffer when requested */ +- if (completion_done(&ccp->wait_input_report)) +- return 0; +- +- memcpy(ccp->buffer, data, min(IN_BUFFER_SIZE, size)); +- complete_all(&ccp->wait_input_report); ++ spin_lock(&ccp->wait_input_report_lock); ++ if (!completion_done(&ccp->wait_input_report)) { ++ memcpy(ccp->buffer, data, min(IN_BUFFER_SIZE, size)); ++ complete_all(&ccp->wait_input_report); ++ } ++ spin_unlock(&ccp->wait_input_report_lock); + + return 0; + } +@@ -515,7 +527,9 @@ static int ccp_probe(struct hid_device *hdev, const struct hid_device_id *id) + + ccp->hdev = hdev; + hid_set_drvdata(hdev, ccp); ++ + mutex_init(&ccp->mutex); ++ spin_lock_init(&ccp->wait_input_report_lock); + init_completion(&ccp->wait_input_report); + + hid_device_io_start(hdev); +-- +2.43.0 + diff --git a/queue-6.6/hwmon-corsair-cpro-use-a-separate-buffer-for-sending.patch b/queue-6.6/hwmon-corsair-cpro-use-a-separate-buffer-for-sending.patch new file mode 100644 index 00000000000..6da42450cf8 --- /dev/null +++ b/queue-6.6/hwmon-corsair-cpro-use-a-separate-buffer-for-sending.patch @@ -0,0 +1,78 @@ +From f9f9cbd3b01dfabfef2869292e594a630f8f4e8d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 4 May 2024 11:25:01 +0200 +Subject: hwmon: (corsair-cpro) Use a separate buffer for sending commands + +From: Aleksa Savic + +[ Upstream commit e0cd85dc666cb08e1bd313d560cb4eff4d04219e ] + +Introduce cmd_buffer, a separate buffer for storing only +the command that is sent to the device. 
Before this separation, +the existing buffer was shared for both the command and the +report received in ccp_raw_event(), which was copied into it. + +However, because of hidraw, the raw event parsing may be triggered +in the middle of sending a command, resulting in outputting gibberish +to the device. Using a separate buffer resolves this. + +Fixes: 40c3a4454225 ("hwmon: add Corsair Commander Pro driver") +Signed-off-by: Aleksa Savic +Acked-by: Marius Zachmann +Link: https://lore.kernel.org/r/20240504092504.24158-2-savicaleksa83@gmail.com +Signed-off-by: Guenter Roeck +Signed-off-by: Sasha Levin +--- + drivers/hwmon/corsair-cpro.c | 19 ++++++++++++------- + 1 file changed, 12 insertions(+), 7 deletions(-) + +diff --git a/drivers/hwmon/corsair-cpro.c b/drivers/hwmon/corsair-cpro.c +index 463ab4296ede5..34136d1b04764 100644 +--- a/drivers/hwmon/corsair-cpro.c ++++ b/drivers/hwmon/corsair-cpro.c +@@ -79,6 +79,7 @@ struct ccp_device { + struct device *hwmon_dev; + struct completion wait_input_report; + struct mutex mutex; /* whenever buffer is used, lock before send_usb_cmd */ ++ u8 *cmd_buffer; + u8 *buffer; + int target[6]; + DECLARE_BITMAP(temp_cnct, NUM_TEMP_SENSORS); +@@ -111,15 +112,15 @@ static int send_usb_cmd(struct ccp_device *ccp, u8 command, u8 byte1, u8 byte2, + unsigned long t; + int ret; + +- memset(ccp->buffer, 0x00, OUT_BUFFER_SIZE); +- ccp->buffer[0] = command; +- ccp->buffer[1] = byte1; +- ccp->buffer[2] = byte2; +- ccp->buffer[3] = byte3; ++ memset(ccp->cmd_buffer, 0x00, OUT_BUFFER_SIZE); ++ ccp->cmd_buffer[0] = command; ++ ccp->cmd_buffer[1] = byte1; ++ ccp->cmd_buffer[2] = byte2; ++ ccp->cmd_buffer[3] = byte3; + + reinit_completion(&ccp->wait_input_report); + +- ret = hid_hw_output_report(ccp->hdev, ccp->buffer, OUT_BUFFER_SIZE); ++ ret = hid_hw_output_report(ccp->hdev, ccp->cmd_buffer, OUT_BUFFER_SIZE); + if (ret < 0) + return ret; + +@@ -492,7 +493,11 @@ static int ccp_probe(struct hid_device *hdev, const struct hid_device_id *id) + if (!ccp) 
+ return -ENOMEM; + +- ccp->buffer = devm_kmalloc(&hdev->dev, OUT_BUFFER_SIZE, GFP_KERNEL); ++ ccp->cmd_buffer = devm_kmalloc(&hdev->dev, OUT_BUFFER_SIZE, GFP_KERNEL); ++ if (!ccp->cmd_buffer) ++ return -ENOMEM; ++ ++ ccp->buffer = devm_kmalloc(&hdev->dev, IN_BUFFER_SIZE, GFP_KERNEL); + if (!ccp->buffer) + return -ENOMEM; + +-- +2.43.0 + diff --git a/queue-6.6/hwmon-corsair-cpro-use-complete_all-instead-of-compl.patch b/queue-6.6/hwmon-corsair-cpro-use-complete_all-instead-of-compl.patch new file mode 100644 index 00000000000..c643e0a0c87 --- /dev/null +++ b/queue-6.6/hwmon-corsair-cpro-use-complete_all-instead-of-compl.patch @@ -0,0 +1,41 @@ +From 5aecb0dc016093a3d43e11812deaa32cdeb9cfdf Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 4 May 2024 11:25:02 +0200 +Subject: hwmon: (corsair-cpro) Use complete_all() instead of complete() in + ccp_raw_event() + +From: Aleksa Savic + +[ Upstream commit 3a034a7b0715eb51124a5263890b1ed39978ed3a ] + +In ccp_raw_event(), the ccp->wait_input_report completion is +completed once. Since we're waiting for exactly one report in +send_usb_cmd(), use complete_all() instead of complete() +to mark the completion as spent. 
+ +Fixes: 40c3a4454225 ("hwmon: add Corsair Commander Pro driver") +Signed-off-by: Aleksa Savic +Acked-by: Marius Zachmann +Link: https://lore.kernel.org/r/20240504092504.24158-3-savicaleksa83@gmail.com +Signed-off-by: Guenter Roeck +Signed-off-by: Sasha Levin +--- + drivers/hwmon/corsair-cpro.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/hwmon/corsair-cpro.c b/drivers/hwmon/corsair-cpro.c +index 34136d1b04764..e65e3825af974 100644 +--- a/drivers/hwmon/corsair-cpro.c ++++ b/drivers/hwmon/corsair-cpro.c +@@ -140,7 +140,7 @@ static int ccp_raw_event(struct hid_device *hdev, struct hid_report *report, u8 + return 0; + + memcpy(ccp->buffer, data, min(IN_BUFFER_SIZE, size)); +- complete(&ccp->wait_input_report); ++ complete_all(&ccp->wait_input_report); + + return 0; + } +-- +2.43.0 + diff --git a/queue-6.6/ipv6-annotate-data-races-around-cnf.disable_ipv6.patch b/queue-6.6/ipv6-annotate-data-races-around-cnf.disable_ipv6.patch new file mode 100644 index 00000000000..96360b156ff --- /dev/null +++ b/queue-6.6/ipv6-annotate-data-races-around-cnf.disable_ipv6.patch @@ -0,0 +1,99 @@ +From ec7ec84ba83132644f6e0419c94fd5e9e196a399 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 28 Feb 2024 13:54:26 +0000 +Subject: ipv6: annotate data-races around cnf.disable_ipv6 + +From: Eric Dumazet + +[ Upstream commit d289ab65b89c1d4d88417cb6c03e923f21f95fae ] + +disable_ipv6 is read locklessly, add appropriate READ_ONCE() +and WRITE_ONCE() annotations. + +v2: do not preload net before rtnl_trylock() in + addrconf_disable_ipv6() (Jiri) + +Signed-off-by: Eric Dumazet +Reviewed-by: Jiri Pirko +Signed-off-by: David S. 
Miller +Stable-dep-of: 4db783d68b9b ("ipv6: prevent NULL dereference in ip6_output()") +Signed-off-by: Sasha Levin +--- + net/ipv6/addrconf.c | 9 +++++---- + net/ipv6/ip6_input.c | 4 ++-- + net/ipv6/ip6_output.c | 2 +- + 3 files changed, 8 insertions(+), 7 deletions(-) + +diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c +index 01f4502916a12..9dfbda164e8c1 100644 +--- a/net/ipv6/addrconf.c ++++ b/net/ipv6/addrconf.c +@@ -4160,7 +4160,7 @@ static void addrconf_dad_work(struct work_struct *w) + if (!ipv6_generate_eui64(addr.s6_addr + 8, idev->dev) && + ipv6_addr_equal(&ifp->addr, &addr)) { + /* DAD failed for link-local based on MAC */ +- idev->cnf.disable_ipv6 = 1; ++ WRITE_ONCE(idev->cnf.disable_ipv6, 1); + + pr_info("%s: IPv6 being disabled!\n", + ifp->idev->dev->name); +@@ -6321,7 +6321,8 @@ static void addrconf_disable_change(struct net *net, __s32 newf) + idev = __in6_dev_get(dev); + if (idev) { + int changed = (!idev->cnf.disable_ipv6) ^ (!newf); +- idev->cnf.disable_ipv6 = newf; ++ ++ WRITE_ONCE(idev->cnf.disable_ipv6, newf); + if (changed) + dev_disable_change(idev); + } +@@ -6338,7 +6339,7 @@ static int addrconf_disable_ipv6(struct ctl_table *table, int *p, int newf) + + net = (struct net *)table->extra2; + old = *p; +- *p = newf; ++ WRITE_ONCE(*p, newf); + + if (p == &net->ipv6.devconf_dflt->disable_ipv6) { + rtnl_unlock(); +@@ -6346,7 +6347,7 @@ static int addrconf_disable_ipv6(struct ctl_table *table, int *p, int newf) + } + + if (p == &net->ipv6.devconf_all->disable_ipv6) { +- net->ipv6.devconf_dflt->disable_ipv6 = newf; ++ WRITE_ONCE(net->ipv6.devconf_dflt->disable_ipv6, newf); + addrconf_disable_change(net, newf); + } else if ((!newf) ^ (!old)) + dev_disable_change((struct inet6_dev *)table->extra1); +diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c +index b8378814532ce..1ba97933c74fb 100644 +--- a/net/ipv6/ip6_input.c ++++ b/net/ipv6/ip6_input.c +@@ -168,9 +168,9 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct 
net_device *dev, + + SKB_DR_SET(reason, NOT_SPECIFIED); + if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL || +- !idev || unlikely(idev->cnf.disable_ipv6)) { ++ !idev || unlikely(READ_ONCE(idev->cnf.disable_ipv6))) { + __IP6_INC_STATS(net, idev, IPSTATS_MIB_INDISCARDS); +- if (idev && unlikely(idev->cnf.disable_ipv6)) ++ if (idev && unlikely(READ_ONCE(idev->cnf.disable_ipv6))) + SKB_DR_SET(reason, IPV6DISABLED); + goto drop; + } +diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c +index fba789cbd215c..b6cc557abb942 100644 +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -227,7 +227,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb) + skb->protocol = htons(ETH_P_IPV6); + skb->dev = dev; + +- if (unlikely(idev->cnf.disable_ipv6)) { ++ if (unlikely(READ_ONCE(idev->cnf.disable_ipv6))) { + IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS); + kfree_skb_reason(skb, SKB_DROP_REASON_IPV6DISABLED); + return 0; +-- +2.43.0 + diff --git a/queue-6.6/ipv6-fib6_rules-avoid-possible-null-dereference-in-f.patch b/queue-6.6/ipv6-fib6_rules-avoid-possible-null-dereference-in-f.patch new file mode 100644 index 00000000000..d1454357cf5 --- /dev/null +++ b/queue-6.6/ipv6-fib6_rules-avoid-possible-null-dereference-in-f.patch @@ -0,0 +1,92 @@ +From 07170beb5924f2cf06a64ff3eb22843b66582256 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 16:31:45 +0000 +Subject: ipv6: fib6_rules: avoid possible NULL dereference in + fib6_rule_action() + +From: Eric Dumazet + +[ Upstream commit d101291b2681e5ab938554e3e323f7a7ee33e3aa ] + +syzbot is able to trigger the following crash [1], +caused by unsafe ip6_dst_idev() use. + +Indeed ip6_dst_idev() can return NULL, and must always be checked. 
+ +[1] + +Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI +KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] +CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 + RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline] + RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267 +Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c +RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246 +RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700 +RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760 +RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd +R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000 +R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00 +FS: 00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + + fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317 + fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108 + ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline] + ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649 + ip6_route_output include/net/ip6_route.h:93 [inline] + ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120 + ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250 + sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326 + sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455 + sctp_assoc_add_peer+0x614/0x15c0 
net/sctp/associola.c:662 + sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099 + __sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197 + sctp_connect net/sctp/socket.c:4819 [inline] + sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834 + __sys_connect_file net/socket.c:2048 [inline] + __sys_connect+0x2df/0x310 net/socket.c:2065 + __do_sys_connect net/socket.c:2075 [inline] + __se_sys_connect net/socket.c:2072 [inline] + __x64_sys_connect+0x7a/0x90 net/socket.c:2072 + do_syscall_x64 arch/x86/entry/common.c:52 [inline] + do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 + entry_SYSCALL_64_after_hwframe+0x77/0x7f + +Fixes: 5e5f3f0f8013 ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().") +Signed-off-by: Eric Dumazet +Reviewed-by: Simon Horman +Reviewed-by: David Ahern +Link: https://lore.kernel.org/r/20240507163145.835254-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/ipv6/fib6_rules.c | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c +index be52b18e08a6b..6eeab21512ba9 100644 +--- a/net/ipv6/fib6_rules.c ++++ b/net/ipv6/fib6_rules.c +@@ -233,8 +233,12 @@ static int __fib6_rule_action(struct fib_rule *rule, struct flowi *flp, + rt = pol_lookup_func(lookup, + net, table, flp6, arg->lookup_data, flags); + if (rt != net->ipv6.ip6_null_entry) { ++ struct inet6_dev *idev = ip6_dst_idev(&rt->dst); ++ ++ if (!idev) ++ goto again; + err = fib6_rule_saddr(net, rule, flags, flp6, +- ip6_dst_idev(&rt->dst)->dev); ++ idev->dev); + + if (err == -EAGAIN) + goto again; +-- +2.43.0 + diff --git a/queue-6.6/ipv6-fix-potential-uninit-value-access-in-__ip6_make.patch b/queue-6.6/ipv6-fix-potential-uninit-value-access-in-__ip6_make.patch new file mode 100644 index 00000000000..f1e6a29c0be --- /dev/null +++ b/queue-6.6/ipv6-fix-potential-uninit-value-access-in-__ip6_make.patch @@ -0,0 +1,38 @@ +From b803daae2bc4a6b7207c65307eb347dd2c33b7a6 Mon Sep 17 
00:00:00 2001 +From: Sasha Levin +Date: Mon, 6 May 2024 23:11:29 +0900 +Subject: ipv6: Fix potential uninit-value access in __ip6_make_skb() + +From: Shigeru Yoshida + +[ Upstream commit 4e13d3a9c25b7080f8a619f961e943fe08c2672c ] + +As it was done in commit fc1092f51567 ("ipv4: Fix uninit-value access in +__ip_make_skb()") for IPv4, check FLOWI_FLAG_KNOWN_NH on fl6->flowi6_flags +instead of testing HDRINCL on the socket to avoid a race condition which +causes uninit-value access. + +Fixes: ea30388baebc ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()") +Signed-off-by: Shigeru Yoshida +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + net/ipv6/ip6_output.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c +index 53fe1375b147c..fba789cbd215c 100644 +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -2003,7 +2003,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk, + u8 icmp6_type; + + if (sk->sk_socket->type == SOCK_RAW && +- !inet_test_bit(HDRINCL, sk)) ++ !(fl6->flowi6_flags & FLOWI_FLAG_KNOWN_NH)) + icmp6_type = fl6->fl6_icmp_type; + else + icmp6_type = icmp6_hdr(skb)->icmp6_type; +-- +2.43.0 + diff --git a/queue-6.6/ipv6-prevent-null-dereference-in-ip6_output.patch b/queue-6.6/ipv6-prevent-null-dereference-in-ip6_output.patch new file mode 100644 index 00000000000..c74f389e9d4 --- /dev/null +++ b/queue-6.6/ipv6-prevent-null-dereference-in-ip6_output.patch @@ -0,0 +1,83 @@ +From 0e07369f6ae4b05196d851c78566fb6afc89e7d6 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 16:18:42 +0000 +Subject: ipv6: prevent NULL dereference in ip6_output() + +From: Eric Dumazet + +[ Upstream commit 4db783d68b9b39a411a96096c10828ff5dfada7a ] + +According to syzbot, there is a chance that ip6_dst_idev() +returns NULL in ip6_output(). Most places in IPv6 stack +deal with a NULL idev just fine, but not here. 
+ +syzbot reported: + +general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI +KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7] +CPU: 0 PID: 9775 Comm: syz-executor.4 Not tainted 6.9.0-rc5-syzkaller-00157-g6a30653b604a #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 + RIP: 0010:ip6_output+0x231/0x3f0 net/ipv6/ip6_output.c:237 +Code: 3c 1e 00 49 89 df 74 08 4c 89 ef e8 19 58 db f7 48 8b 44 24 20 49 89 45 00 49 89 c5 48 8d 9d e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 4c 8b 74 24 28 0f 85 61 01 00 00 8b 1b 31 ff +RSP: 0018:ffffc9000927f0d8 EFLAGS: 00010202 +RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000040000 +RDX: ffffc900131f9000 RSI: 0000000000004f47 RDI: 0000000000004f48 +RBP: 0000000000000000 R08: ffffffff8a1f0b9a R09: 1ffffffff1f51fad +R10: dffffc0000000000 R11: fffffbfff1f51fae R12: ffff8880293ec8c0 +R13: ffff88805d7fc000 R14: 1ffff1100527d91a R15: dffffc0000000000 +FS: 00007f135c6856c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000000020000080 CR3: 0000000064096000 CR4: 00000000003506f0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + + NF_HOOK include/linux/netfilter.h:314 [inline] + ip6_xmit+0xefe/0x17f0 net/ipv6/ip6_output.c:358 + sctp_v6_xmit+0x9f2/0x13f0 net/sctp/ipv6.c:248 + sctp_packet_transmit+0x26ad/0x2ca0 net/sctp/output.c:653 + sctp_packet_singleton+0x22c/0x320 net/sctp/outqueue.c:783 + sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline] + sctp_outq_flush+0x6d5/0x3e20 net/sctp/outqueue.c:1212 + sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline] + sctp_do_sm+0x59cc/0x60c0 net/sctp/sm_sideeffect.c:1169 + sctp_primitive_ASSOCIATE+0x95/0xc0 net/sctp/primitive.c:73 + __sctp_connect+0x9cd/0xe30 net/sctp/socket.c:1234 + sctp_connect 
net/sctp/socket.c:4819 [inline] + sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834 + __sys_connect_file net/socket.c:2048 [inline] + __sys_connect+0x2df/0x310 net/socket.c:2065 + __do_sys_connect net/socket.c:2075 [inline] + __se_sys_connect net/socket.c:2072 [inline] + __x64_sys_connect+0x7a/0x90 net/socket.c:2072 + do_syscall_x64 arch/x86/entry/common.c:52 [inline] + do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 + entry_SYSCALL_64_after_hwframe+0x77/0x7f + +Fixes: 778d80be5269 ("ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface.") +Reported-by: syzbot +Signed-off-by: Eric Dumazet +Reviewed-by: Larysa Zaremba +Link: https://lore.kernel.org/r/20240507161842.773961-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/ipv6/ip6_output.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c +index b6cc557abb942..f97cb368e5a81 100644 +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -227,7 +227,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb) + skb->protocol = htons(ETH_P_IPV6); + skb->dev = dev; + +- if (unlikely(READ_ONCE(idev->cnf.disable_ipv6))) { ++ if (unlikely(!idev || READ_ONCE(idev->cnf.disable_ipv6))) { + IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS); + kfree_skb_reason(skb, SKB_DROP_REASON_IPV6DISABLED); + return 0; +-- +2.43.0 + diff --git a/queue-6.6/net-bridge-fix-corrupted-ethernet-header-on-multicas.patch b/queue-6.6/net-bridge-fix-corrupted-ethernet-header-on-multicas.patch new file mode 100644 index 00000000000..c4e0ffbc8aa --- /dev/null +++ b/queue-6.6/net-bridge-fix-corrupted-ethernet-header-on-multicas.patch @@ -0,0 +1,56 @@ +From 5cf825eac226ee669c6357c1c349d2545aba178e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 5 May 2024 20:42:38 +0200 +Subject: net: bridge: fix corrupted ethernet header on multicast-to-unicast + +From: Felix Fietkau + +[ Upstream commit 
86b29d830ad69eecff25b22dc96c14c6573718e6 ] + +The change from skb_copy to pskb_copy unfortunately changed the data +copying to omit the ethernet header, since it was pulled before reaching +this point. Fix this by calling __skb_push/pull around pskb_copy. + +Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO") +Signed-off-by: Felix Fietkau +Acked-by: Nikolay Aleksandrov +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + net/bridge/br_forward.c | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c +index d7c35f55bd69f..d97064d460dc7 100644 +--- a/net/bridge/br_forward.c ++++ b/net/bridge/br_forward.c +@@ -258,6 +258,7 @@ static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb, + { + struct net_device *dev = BR_INPUT_SKB_CB(skb)->brdev; + const unsigned char *src = eth_hdr(skb)->h_source; ++ struct sk_buff *nskb; + + if (!should_deliver(p, skb)) + return; +@@ -266,12 +267,16 @@ static void maybe_deliver_addr(struct net_bridge_port *p, struct sk_buff *skb, + if (skb->dev == p->dev && ether_addr_equal(src, addr)) + return; + +- skb = pskb_copy(skb, GFP_ATOMIC); +- if (!skb) { ++ __skb_push(skb, ETH_HLEN); ++ nskb = pskb_copy(skb, GFP_ATOMIC); ++ __skb_pull(skb, ETH_HLEN); ++ if (!nskb) { + DEV_STATS_INC(dev, tx_dropped); + return; + } + ++ skb = nskb; ++ __skb_pull(skb, ETH_HLEN); + if (!is_broadcast_ether_addr(addr)) + memcpy(eth_hdr(skb)->h_dest, addr, ETH_ALEN); + +-- +2.43.0 + diff --git a/queue-6.6/net-dsa-mv88e6xxx-add-phylink_get_caps-for-the-mv88e.patch b/queue-6.6/net-dsa-mv88e6xxx-add-phylink_get_caps-for-the-mv88e.patch new file mode 100644 index 00000000000..30f4bc0d25b --- /dev/null +++ b/queue-6.6/net-dsa-mv88e6xxx-add-phylink_get_caps-for-the-mv88e.patch @@ -0,0 +1,82 @@ +From 481f521086f6ef8097f922ad827e148436c48d5a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 8 May 2024 09:29:43 +0200 +Subject: net: 
dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Steffen Bätz + +[ Upstream commit f39bf3cf08a49e7d20c44bc8bc8e390fea69959a ] + +As of commit de5c9bf40c45 ("net: phylink: require supported_interfaces to +be filled") +Marvell 88e6320/21 switches fail to be probed: + +... +mv88e6085 30be0000.ethernet-1:00: phylink: error: empty supported_interfaces +error creating PHYLINK: -22 +... + +The problem stems from the use of mv88e6185_phylink_get_caps() to get +the device capabilities. +Since there are serdes only ports 0/1 included, create a new dedicated +phylink_get_caps for the 6320 and 6321 to properly support their +set of capabilities. + +Fixes: de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled") +Signed-off-by: Steffen Bätz +Reviewed-by: Andrew Lunn +Reviewed-by: Fabio Estevam +Link: https://lore.kernel.org/r/20240508072944.54880-2-steffen@innosonix.de +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/dsa/mv88e6xxx/chip.c | 16 ++++++++++++++-- + 1 file changed, 14 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c +index db1d9df7d47fe..e5bac87941f61 100644 +--- a/drivers/net/dsa/mv88e6xxx/chip.c ++++ b/drivers/net/dsa/mv88e6xxx/chip.c +@@ -697,6 +697,18 @@ static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port, + } + } + ++static void mv88e632x_phylink_get_caps(struct mv88e6xxx_chip *chip, int port, ++ struct phylink_config *config) ++{ ++ unsigned long *supported = config->supported_interfaces; ++ ++ /* Translate the default cmode */ ++ mv88e6xxx_translate_cmode(chip->ports[port].cmode, supported); ++ ++ config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 | ++ MAC_1000FD; ++} ++ + static void mv88e6341_phylink_get_caps(struct mv88e6xxx_chip *chip, int port, + struct phylink_config *config) + { +@@ -4976,7 +4988,7 
@@ static const struct mv88e6xxx_ops mv88e6320_ops = { + .gpio_ops = &mv88e6352_gpio_ops, + .avb_ops = &mv88e6352_avb_ops, + .ptp_ops = &mv88e6352_ptp_ops, +- .phylink_get_caps = mv88e6185_phylink_get_caps, ++ .phylink_get_caps = mv88e632x_phylink_get_caps, + }; + + static const struct mv88e6xxx_ops mv88e6321_ops = { +@@ -5022,7 +5034,7 @@ static const struct mv88e6xxx_ops mv88e6321_ops = { + .gpio_ops = &mv88e6352_gpio_ops, + .avb_ops = &mv88e6352_avb_ops, + .ptp_ops = &mv88e6352_ptp_ops, +- .phylink_get_caps = mv88e6185_phylink_get_caps, ++ .phylink_get_caps = mv88e632x_phylink_get_caps, + }; + + static const struct mv88e6xxx_ops mv88e6341_ops = { +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-change-type-of-numa_node_mask-as-nodemask_t.patch b/queue-6.6/net-hns3-change-type-of-numa_node_mask-as-nodemask_t.patch new file mode 100644 index 00000000000..d1f466a3eea --- /dev/null +++ b/queue-6.6/net-hns3-change-type-of-numa_node_mask-as-nodemask_t.patch @@ -0,0 +1,117 @@ +From 4af3b5fbf0066639cff49c06f333a173cce35b9e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:20 +0800 +Subject: net: hns3: change type of numa_node_mask as nodemask_t + +From: Peiyang Wang + +[ Upstream commit 6639a7b953212ac51aa4baa7d7fb855bf736cf56 ] + +It provides nodemask_t to describe the numa node mask in kernel. To +improve transportability, change the type of numa_node_mask as nodemask_t. 
+ +Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework") +Signed-off-by: Peiyang Wang +Signed-off-by: Jijie Shao +Reviewed-by: Simon Horman +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/hns3/hnae3.h | 2 +- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 6 ++++-- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 2 +- + drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 7 ++++--- + drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 2 +- + 5 files changed, 11 insertions(+), 8 deletions(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h +index aaf1f42624a79..57787c380fa07 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h ++++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h +@@ -890,7 +890,7 @@ struct hnae3_handle { + struct hnae3_roce_private_info rinfo; + }; + +- u32 numa_node_mask; /* for multi-chip support */ ++ nodemask_t numa_node_mask; /* for multi-chip support */ + + enum hnae3_port_base_vlan_state port_base_vlan_state; + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index 4398de42c9157..b02b96bd93b7a 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -1758,7 +1758,8 @@ static int hclge_vport_setup(struct hclge_vport *vport, u16 num_tqps) + + nic->pdev = hdev->pdev; + nic->ae_algo = &ae_algo; +- nic->numa_node_mask = hdev->numa_node_mask; ++ bitmap_copy(nic->numa_node_mask.bits, hdev->numa_node_mask.bits, ++ MAX_NUMNODES); + nic->kinfo.io_base = hdev->hw.hw.io_base; + + ret = hclge_knic_setup(vport, num_tqps, +@@ -2450,7 +2451,8 @@ static int hclge_init_roce_base_info(struct hclge_vport *vport) + + roce->pdev = nic->pdev; + roce->ae_algo = nic->ae_algo; +- roce->numa_node_mask = nic->numa_node_mask; ++ 
bitmap_copy(roce->numa_node_mask.bits, nic->numa_node_mask.bits, ++ MAX_NUMNODES); + + return 0; + } +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +index 6a6b41ef08baf..76a5edfe7d2e5 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +@@ -878,7 +878,7 @@ struct hclge_dev { + + u16 fdir_pf_filter_count; /* Num of guaranteed filters for this PF */ + u16 num_alloc_vport; /* Num vports this driver supports */ +- u32 numa_node_mask; ++ nodemask_t numa_node_mask; + u16 rx_buf_len; + u16 num_tx_desc; /* desc num of per tx queue */ + u16 num_rx_desc; /* desc num of per rx queue */ +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +index 0aa9beefd1c7e..b57111252d071 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +@@ -412,7 +412,8 @@ static int hclgevf_set_handle_info(struct hclgevf_dev *hdev) + + nic->ae_algo = &ae_algovf; + nic->pdev = hdev->pdev; +- nic->numa_node_mask = hdev->numa_node_mask; ++ bitmap_copy(nic->numa_node_mask.bits, hdev->numa_node_mask.bits, ++ MAX_NUMNODES); + nic->flags |= HNAE3_SUPPORT_VF; + nic->kinfo.io_base = hdev->hw.hw.io_base; + +@@ -2082,8 +2083,8 @@ static int hclgevf_init_roce_base_info(struct hclgevf_dev *hdev) + + roce->pdev = nic->pdev; + roce->ae_algo = nic->ae_algo; +- roce->numa_node_mask = nic->numa_node_mask; +- ++ bitmap_copy(roce->numa_node_mask.bits, nic->numa_node_mask.bits, ++ MAX_NUMNODES); + return 0; + } + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +index a73f2bf3a56a6..cccef32284616 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h ++++ 
b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +@@ -236,7 +236,7 @@ struct hclgevf_dev { + u16 rss_size_max; /* HW defined max RSS task queue */ + + u16 num_alloc_vport; /* num vports this driver supports */ +- u32 numa_node_mask; ++ nodemask_t numa_node_mask; + u16 rx_buf_len; + u16 num_tx_desc; /* desc num of per tx queue */ + u16 num_rx_desc; /* desc num of per rx queue */ +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-direct-return-when-receive-a-unknown-mailbo.patch b/queue-6.6/net-hns3-direct-return-when-receive-a-unknown-mailbo.patch new file mode 100644 index 00000000000..f8a0d529721 --- /dev/null +++ b/queue-6.6/net-hns3-direct-return-when-receive-a-unknown-mailbo.patch @@ -0,0 +1,47 @@ +From dade656315a1b7c9b285af55917198aa0a08c5ec Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:19 +0800 +Subject: net: hns3: direct return when receive a unknown mailbox message + +From: Jian Shen + +[ Upstream commit 669554c512d2107e2f21616f38e050d40655101f ] + +Currently, the driver didn't return when receive a unknown +mailbox message, and continue checking whether need to +generate a response. It's unnecessary and may be incorrect. 
+ +Fixes: bb5790b71bad ("net: hns3: refactor mailbox response scheme between PF and VF") +Signed-off-by: Jian Shen +Signed-off-by: Jijie Shao +Reviewed-by: Simon Horman +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +index 04ff9bf121853..877feee53804f 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +@@ -1077,12 +1077,13 @@ static void hclge_mbx_request_handling(struct hclge_mbx_ops_param *param) + + hdev = param->vport->back; + cmd_func = hclge_mbx_ops_list[param->req->msg.code]; +- if (cmd_func) +- ret = cmd_func(param); +- else ++ if (!cmd_func) { + dev_err(&hdev->pdev->dev, + "un-supported mailbox message, code = %u\n", + param->req->msg.code); ++ return; ++ } ++ ret = cmd_func(param); + + /* PF driver should not reply IMP */ + if (hnae3_get_bit(param->req->mbx_need_resp, HCLGE_MBX_NEED_RESP_B) && +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-fix-kernel-crash-when-devlink-reload-during.patch b/queue-6.6/net-hns3-fix-kernel-crash-when-devlink-reload-during.patch new file mode 100644 index 00000000000..dc6d6c59181 --- /dev/null +++ b/queue-6.6/net-hns3-fix-kernel-crash-when-devlink-reload-during.patch @@ -0,0 +1,118 @@ +From d4f1c79de77804804a1a3f43ab52f056d8b7f04b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:24 +0800 +Subject: net: hns3: fix kernel crash when devlink reload during initialization + +From: Yonglong Liu + +[ Upstream commit 35d92abfbad88cf947c010baf34b075e40566095 ] + +The devlink reload process will access the hardware resources, +but the register operation is done before the hardware is initialized. +So, processing the devlink reload during initialization may lead to kernel +crash. 
+ +This patch fixes this by registering the devlink after +hardware initialization. + +Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF") +Fixes: 93305b77ffcb ("net: hns3: fix kernel crash when devlink reload during pf initialization") +Signed-off-by: Yonglong Liu +Signed-off-by: Jijie Shao +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 17 +++++------------ + .../hisilicon/hns3/hns3vf/hclgevf_main.c | 10 ++++------ + 2 files changed, 9 insertions(+), 18 deletions(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index 3b74cce46ac65..14713454e0d82 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -11619,16 +11619,10 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + if (ret) + goto out; + +- ret = hclge_devlink_init(hdev); +- if (ret) +- goto err_pci_uninit; +- +- devl_lock(hdev->devlink); +- + /* Firmware command queue initialize */ + ret = hclge_comm_cmd_queue_init(hdev->pdev, &hdev->hw.hw); + if (ret) +- goto err_devlink_uninit; ++ goto err_pci_uninit; + + /* Firmware command initialize */ + ret = hclge_comm_cmd_init(hdev->ae_dev, &hdev->hw.hw, &hdev->fw_version, +@@ -11796,6 +11790,10 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + dev_warn(&pdev->dev, + "failed to wake on lan init, ret = %d\n", ret); + ++ ret = hclge_devlink_init(hdev); ++ if (ret) ++ goto err_ptp_uninit; ++ + hclge_state_init(hdev); + hdev->last_reset_time = jiffies; + +@@ -11803,8 +11801,6 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + HCLGE_DRIVER_NAME); + + hclge_task_schedule(hdev, round_jiffies_relative(HZ)); +- +- devl_unlock(hdev->devlink); + return 0; + + err_ptp_uninit: +@@ -11818,9 +11814,6 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + pci_free_irq_vectors(pdev); + 
err_cmd_uninit: + hclge_comm_cmd_uninit(hdev->ae_dev, &hdev->hw.hw); +-err_devlink_uninit: +- devl_unlock(hdev->devlink); +- hclge_devlink_uninit(hdev); + err_pci_uninit: + pcim_iounmap(pdev, hdev->hw.hw.io_base); + pci_release_regions(pdev); +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +index 08db8e84be4ed..43ee20eb03d1f 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +@@ -2845,10 +2845,6 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) + if (ret) + return ret; + +- ret = hclgevf_devlink_init(hdev); +- if (ret) +- goto err_devlink_init; +- + ret = hclge_comm_cmd_queue_init(hdev->pdev, &hdev->hw.hw); + if (ret) + goto err_cmd_queue_init; +@@ -2941,6 +2937,10 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) + + hclgevf_init_rxd_adv_layout(hdev); + ++ ret = hclgevf_devlink_init(hdev); ++ if (ret) ++ goto err_config; ++ + set_bit(HCLGEVF_STATE_SERVICE_INITED, &hdev->state); + + hdev->last_reset_time = jiffies; +@@ -2960,8 +2960,6 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) + err_cmd_init: + hclge_comm_cmd_uninit(hdev->ae_dev, &hdev->hw.hw); + err_cmd_queue_init: +- hclgevf_devlink_uninit(hdev); +-err_devlink_init: + hclgevf_pci_uninit(hdev); + clear_bit(HCLGEVF_STATE_IRQ_INITED, &hdev->state); + return ret; +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-fix-port-vlan-filter-not-disabled-issue.patch b/queue-6.6/net-hns3-fix-port-vlan-filter-not-disabled-issue.patch new file mode 100644 index 00000000000..8a3306c1aa0 --- /dev/null +++ b/queue-6.6/net-hns3-fix-port-vlan-filter-not-disabled-issue.patch @@ -0,0 +1,63 @@ +From e73fda585e29ee9b8d9b8478150817303669c882 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:23 +0800 +Subject: net: hns3: fix port vlan filter not disabled issue + +From: Yonglong Liu + +[ Upstream commit 
f5db7a3b65c84d723ca5e2bb6e83115180ab6336 ] + +According to hardware limitation, for device support modify +VLAN filter state but not support bypass port VLAN filter, +it should always disable the port VLAN filter. but the driver +enables port VLAN filter when initializing, if there is no +VLAN(except VLAN 0) id added, the driver will disable it +in service task. In most time, it works fine. But there is +a time window before the service task shceduled and net device +being registered. So if user adds VLAN at this time, the driver +will not update the VLAN filter state, and the port VLAN filter +remains enabled. + +To fix the problem, if support modify VLAN filter state but not +support bypass port VLAN filter, set the port vlan filter to "off". + +Fixes: 184cd221a863 ("net: hns3: disable port VLAN filter when support function level VLAN filter control") +Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state") +Signed-off-by: Yonglong Liu +Signed-off-by: Jijie Shao +Reviewed-by: Simon Horman +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index 9858124665aa6..3b74cce46ac65 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -9898,6 +9898,7 @@ static int hclge_set_vlan_protocol_type(struct hclge_dev *hdev) + static int hclge_init_vlan_filter(struct hclge_dev *hdev) + { + struct hclge_vport *vport; ++ bool enable = true; + int ret; + int i; + +@@ -9917,8 +9918,12 @@ static int hclge_init_vlan_filter(struct hclge_dev *hdev) + vport->cur_vlan_fltr_en = true; + } + ++ if (test_bit(HNAE3_DEV_SUPPORT_VLAN_FLTR_MDF_B, hdev->ae_dev->caps) && ++ !test_bit(HNAE3_DEV_SUPPORT_PORT_VLAN_BYPASS_B, hdev->ae_dev->caps)) 
++ enable = false; ++ + return hclge_set_vlan_filter_ctrl(hdev, HCLGE_FILTER_TYPE_PORT, +- HCLGE_FILTER_FE_INGRESS, true, 0); ++ HCLGE_FILTER_FE_INGRESS, enable, 0); + } + + static int hclge_init_vlan_type(struct hclge_dev *hdev) +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-release-ptp-resources-if-pf-initialization-.patch b/queue-6.6/net-hns3-release-ptp-resources-if-pf-initialization-.patch new file mode 100644 index 00000000000..c62a368464a --- /dev/null +++ b/queue-6.6/net-hns3-release-ptp-resources-if-pf-initialization-.patch @@ -0,0 +1,50 @@ +From 2d7bcc2a8f574dab5ead5b3ea79b402cfcaa43a4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:21 +0800 +Subject: net: hns3: release PTP resources if pf initialization failed + +From: Peiyang Wang + +[ Upstream commit 950aa42399893a170d9b57eda0e4a3ff91fd8b70 ] + +During the PF initialization process, hclge_update_port_info may return an +error code for some reason. At this point, the ptp initialization has been +completed. To void memory leaks, the resources that are applied by ptp +should be released. Therefore, when hclge_update_port_info returns an error +code, hclge_ptp_uninit is called to release the corresponding resources. 
+ +Fixes: eaf83ae59e18 ("net: hns3: add querying fec ability from firmware") +Signed-off-by: Peiyang Wang +Signed-off-by: Jijie Shao +Reviewed-by: Hariprasad Kelam +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index b02b96bd93b7a..7f2bb0e708896 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -11752,7 +11752,7 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + + ret = hclge_update_port_info(hdev); + if (ret) +- goto err_mdiobus_unreg; ++ goto err_ptp_uninit; + + INIT_KFIFO(hdev->mac_tnl_log); + +@@ -11803,6 +11803,8 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) + devl_unlock(hdev->devlink); + return 0; + ++err_ptp_uninit: ++ hclge_ptp_uninit(hdev); + err_mdiobus_unreg: + if (hdev->hw.mac.phydev) + mdiobus_unregister(hdev->hw.mac.mdio_bus); +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-use-appropriate-barrier-function-after-sett.patch b/queue-6.6/net-hns3-use-appropriate-barrier-function-after-sett.patch new file mode 100644 index 00000000000..404530f073b --- /dev/null +++ b/queue-6.6/net-hns3-use-appropriate-barrier-function-after-sett.patch @@ -0,0 +1,64 @@ +From 4cb6a010521196e2f0a45434ff81916daa1f2fac Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:22 +0800 +Subject: net: hns3: use appropriate barrier function after setting a bit value + +From: Peiyang Wang + +[ Upstream commit 094c281228529d333458208fd02fcac3b139d93b ] + +There is a memory barrier in followed case. When set the port down, +hclgevf_set_timmer will set DOWN in state. Meanwhile, the service task has +different behaviour based on whether the state is DOWN. 
Thus, to make sure +service task see DOWN, use smp_mb__after_atomic after calling set_bit(). + + CPU0 CPU1 +========================== =================================== +hclgevf_set_timer_task() hclgevf_periodic_service_task() + set_bit(DOWN,state) test_bit(DOWN,state) + +pf also has this issue. + +Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main") +Fixes: 1c6dfe6fc6f7 ("net: hns3: remove mailbox and reset work in hclge_main") +Signed-off-by: Peiyang Wang +Signed-off-by: Jijie Shao +Reviewed-by: Simon Horman +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +-- + drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 3 +-- + 2 files changed, 2 insertions(+), 4 deletions(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index 7f2bb0e708896..9858124665aa6 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -7945,8 +7945,7 @@ static void hclge_set_timer_task(struct hnae3_handle *handle, bool enable) + /* Set the DOWN flag here to disable link updating */ + set_bit(HCLGE_STATE_DOWN, &hdev->state); + +- /* flush memory to make sure DOWN is seen by service task */ +- smp_mb__before_atomic(); ++ smp_mb__after_atomic(); /* flush memory to make sure DOWN is seen by service task */ + hclge_flush_link_update(hdev); + } + } +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +index b57111252d071..08db8e84be4ed 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +@@ -2181,8 +2181,7 @@ static void hclgevf_set_timer_task(struct hnae3_handle *handle, bool enable) + } else { + set_bit(HCLGEVF_STATE_DOWN, &hdev->state); + +- /* flush memory to make sure 
DOWN is seen by service task */ +- smp_mb__before_atomic(); ++ smp_mb__after_atomic(); /* flush memory to make sure DOWN is seen by service task */ + hclgevf_flush_link_update(hdev); + } + } +-- +2.43.0 + diff --git a/queue-6.6/net-hns3-using-user-configure-after-hardware-reset.patch b/queue-6.6/net-hns3-using-user-configure-after-hardware-reset.patch new file mode 100644 index 00000000000..685515d41dd --- /dev/null +++ b/queue-6.6/net-hns3-using-user-configure-after-hardware-reset.patch @@ -0,0 +1,128 @@ +From 7b6941f7507a8e8f8f5fcbd64d72441e403a6ac9 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 21:42:18 +0800 +Subject: net: hns3: using user configure after hardware reset + +From: Peiyang Wang + +[ Upstream commit 05eb60e9648cca0beeebdbcd263b599fb58aee48 ] + +When a reset occurring, it's supposed to recover user's configuration. +Currently, the port info(speed, duplex and autoneg) is stored in hclge_mac +and will be scheduled updated. Consider the case that reset was happened +consecutively. During the first reset, the port info is configured with +a temporary value cause the PHY is reset and looking for best link config. +Second reset start and use pervious configuration which is not the user's. 
+The specific process is as follows: + ++------+ +----+ +----+ +| USER | | PF | | HW | ++---+--+ +-+--+ +-+--+ + | ethtool --reset | | + +------------------->| reset command | + | ethtool --reset +-------------------->| + +------------------->| +---+ + | +---+ | | + | | |reset currently | | HW RESET + | | |and wait to do | | + | |<--+ | | + | | send pervious cfg |<--+ + | | (1000M FULL AN_ON) | + | +-------------------->| + | | read cfg(time task) | + | | (10M HALF AN_OFF) +---+ + | |<--------------------+ | cfg take effect + | | reset command |<--+ + | +-------------------->| + | | +---+ + | | send pervious cfg | | HW RESET + | | (10M HALF AN_OFF) |<--+ + | +-------------------->| + | | read cfg(time task) | + | | (10M HALF AN_OFF) +---+ + | |<--------------------+ | cfg take effect + | | | | + | | read cfg(time task) |<--+ + | | (10M HALF AN_OFF) | + | |<--------------------+ + | | | + v v v + +To avoid aboved situation, this patch introduced req_speed, req_duplex, +req_autoneg to store user's configuration and it only be used after +hardware reset and to recover user's configuration + +Fixes: f5f2b3e4dcc0 ("net: hns3: add support for imp-controlled PHYs") +Signed-off-by: Peiyang Wang +Signed-off-by: Jijie Shao +Reviewed-by: Przemek Kitszel +Reviewed-by: Simon Horman +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 15 +++++++++------ + .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 3 +++ + 2 files changed, 12 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +index dfd0c5f4cb9f5..4398de42c9157 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +@@ -1526,6 +1526,9 @@ static int hclge_configure(struct hclge_dev *hdev) + cfg.default_speed, ret); + return ret; + } ++ hdev->hw.mac.req_speed = hdev->hw.mac.speed; 
++ hdev->hw.mac.req_autoneg = AUTONEG_ENABLE; ++ hdev->hw.mac.req_duplex = DUPLEX_FULL; + + hclge_parse_link_mode(hdev, cfg.speed_ability); + +@@ -3331,9 +3334,9 @@ hclge_set_phy_link_ksettings(struct hnae3_handle *handle, + return ret; + } + +- hdev->hw.mac.autoneg = cmd->base.autoneg; +- hdev->hw.mac.speed = cmd->base.speed; +- hdev->hw.mac.duplex = cmd->base.duplex; ++ hdev->hw.mac.req_autoneg = cmd->base.autoneg; ++ hdev->hw.mac.req_speed = cmd->base.speed; ++ hdev->hw.mac.req_duplex = cmd->base.duplex; + linkmode_copy(hdev->hw.mac.advertising, cmd->link_modes.advertising); + + return 0; +@@ -3366,9 +3369,9 @@ static int hclge_tp_port_init(struct hclge_dev *hdev) + if (!hnae3_dev_phy_imp_supported(hdev)) + return 0; + +- cmd.base.autoneg = hdev->hw.mac.autoneg; +- cmd.base.speed = hdev->hw.mac.speed; +- cmd.base.duplex = hdev->hw.mac.duplex; ++ cmd.base.autoneg = hdev->hw.mac.req_autoneg; ++ cmd.base.speed = hdev->hw.mac.req_speed; ++ cmd.base.duplex = hdev->hw.mac.req_duplex; + linkmode_copy(cmd.link_modes.advertising, hdev->hw.mac.advertising); + + return hclge_set_phy_link_ksettings(&hdev->vport->nic, &cmd); +diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +index 7bc2049b723da..6a6b41ef08baf 100644 +--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h ++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +@@ -263,11 +263,14 @@ struct hclge_mac { + u8 media_type; /* port media type, e.g. fibre/copper/backplane */ + u8 mac_addr[ETH_ALEN]; + u8 autoneg; ++ u8 req_autoneg; + u8 duplex; ++ u8 req_duplex; + u8 support_autoneg; + u8 speed_type; /* 0: sfp speed, 1: active speed */ + u8 lane_num; + u32 speed; ++ u32 req_speed; + u32 max_speed; + u32 speed_ability; /* speed ability supported by current media */ + u32 module_type; /* sub media type, e.g. 
kr/cr/sr/lr */ +-- +2.43.0 + diff --git a/queue-6.6/net-ks8851-queue-rx-packets-in-irq-handler-instead-o.patch b/queue-6.6/net-ks8851-queue-rx-packets-in-irq-handler-instead-o.patch new file mode 100644 index 00000000000..3995ed7babf --- /dev/null +++ b/queue-6.6/net-ks8851-queue-rx-packets-in-irq-handler-instead-o.patch @@ -0,0 +1,107 @@ +From 240a4ce864cb45adfbac392ae45374b9a58d5d33 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 2 May 2024 20:32:59 +0200 +Subject: net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs + +From: Marek Vasut + +[ Upstream commit e0863634bf9f7cf36291ebb5bfa2d16632f79c49 ] + +Currently the driver uses local_bh_disable()/local_bh_enable() in its +IRQ handler to avoid triggering net_rx_action() softirq on exit from +netif_rx(). The net_rx_action() could trigger this driver .start_xmit +callback, which is protected by the same lock as the IRQ handler, so +calling the .start_xmit from netif_rx() from the IRQ handler critical +section protected by the lock could lead to an attempt to claim the +already claimed lock, and a hang. + +The local_bh_disable()/local_bh_enable() approach works only in case +the IRQ handler is protected by a spinlock, but does not work if the +IRQ handler is protected by mutex, i.e. this works for KS8851 with +Parallel bus interface, but not for KS8851 with SPI bus interface. + +Remove the BH manipulation and instead of calling netif_rx() inside +the IRQ handler code protected by the lock, queue all the received +SKBs in the IRQ handler into a queue first, and once the IRQ handler +exits the critical section protected by the lock, dequeue all the +queued SKBs and push them all into netif_rx(). At this point, it is +safe to trigger the net_rx_action() softirq, since the netif_rx() +call is outside of the lock that protects the IRQ handler. 
+ +Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang") +Tested-by: Ronald Wahl # KS8851 SPI +Signed-off-by: Marek Vasut +Reviewed-by: Eric Dumazet +Link: https://lore.kernel.org/r/20240502183436.117117-1-marex@denx.de +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/micrel/ks8851_common.c | 16 ++++++++++------ + 1 file changed, 10 insertions(+), 6 deletions(-) + +diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c +index d4cdf3d4f5525..502518cdb4618 100644 +--- a/drivers/net/ethernet/micrel/ks8851_common.c ++++ b/drivers/net/ethernet/micrel/ks8851_common.c +@@ -234,12 +234,13 @@ static void ks8851_dbg_dumpkkt(struct ks8851_net *ks, u8 *rxpkt) + /** + * ks8851_rx_pkts - receive packets from the host + * @ks: The device information. ++ * @rxq: Queue of packets received in this function. + * + * This is called from the IRQ work queue when the system detects that there + * are packets in the receive queue. Find out how many packets there are and + * read them from the FIFO. 
+ */ +-static void ks8851_rx_pkts(struct ks8851_net *ks) ++static void ks8851_rx_pkts(struct ks8851_net *ks, struct sk_buff_head *rxq) + { + struct sk_buff *skb; + unsigned rxfc; +@@ -299,7 +300,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) + ks8851_dbg_dumpkkt(ks, rxpkt); + + skb->protocol = eth_type_trans(skb, ks->netdev); +- __netif_rx(skb); ++ __skb_queue_tail(rxq, skb); + + ks->netdev->stats.rx_packets++; + ks->netdev->stats.rx_bytes += rxlen; +@@ -326,11 +327,11 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) + static irqreturn_t ks8851_irq(int irq, void *_ks) + { + struct ks8851_net *ks = _ks; ++ struct sk_buff_head rxq; + unsigned handled = 0; + unsigned long flags; + unsigned int status; +- +- local_bh_disable(); ++ struct sk_buff *skb; + + ks8851_lock(ks, &flags); + +@@ -384,7 +385,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks) + * from the device so do not bother masking just the RX + * from the device. */ + +- ks8851_rx_pkts(ks); ++ __skb_queue_head_init(&rxq); ++ ks8851_rx_pkts(ks, &rxq); + } + + /* if something stopped the rx process, probably due to wanting +@@ -408,7 +410,9 @@ static irqreturn_t ks8851_irq(int irq, void *_ks) + if (status & IRQ_LCI) + mii_check_link(&ks->mii); + +- local_bh_enable(); ++ if (status & IRQ_RXI) ++ while ((skb = __skb_dequeue(&rxq))) ++ netif_rx(skb); + + return IRQ_HANDLED; + } +-- +2.43.0 + diff --git a/queue-6.6/net-smc-fix-neighbour-and-rtable-leak-in-smc_ib_find.patch b/queue-6.6/net-smc-fix-neighbour-and-rtable-leak-in-smc_ib_find.patch new file mode 100644 index 00000000000..6552ac76723 --- /dev/null +++ b/queue-6.6/net-smc-fix-neighbour-and-rtable-leak-in-smc_ib_find.patch @@ -0,0 +1,56 @@ +From 932f4c9031631f3904a70c9da994c129e47bf838 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 20:53:31 +0800 +Subject: net/smc: fix neighbour and rtable leak in smc_ib_find_route() + +From: Wen Gu + +[ Upstream commit 2ddc0dd7fec86ee53b8928a5cca5fbddd4fc7c06 ] + +In 
smc_ib_find_route(), the neighbour found by neigh_lookup() and rtable +resolved by ip_route_output_flow() are not released or put before return. +It may cause the refcount leak, so fix it. + +Link: https://lore.kernel.org/r/20240506015439.108739-1-guwen@linux.alibaba.com +Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment") +Signed-off-by: Wen Gu +Link: https://lore.kernel.org/r/20240507125331.2808-1-guwen@linux.alibaba.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/smc/smc_ib.c | 19 ++++++++++++------- + 1 file changed, 12 insertions(+), 7 deletions(-) + +diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c +index 89981dbe46c94..598ac9ead64b7 100644 +--- a/net/smc/smc_ib.c ++++ b/net/smc/smc_ib.c +@@ -209,13 +209,18 @@ int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr, + if (IS_ERR(rt)) + goto out; + if (rt->rt_uses_gateway && rt->rt_gw_family != AF_INET) +- goto out; +- neigh = rt->dst.ops->neigh_lookup(&rt->dst, NULL, &fl4.daddr); +- if (neigh) { +- memcpy(nexthop_mac, neigh->ha, ETH_ALEN); +- *uses_gateway = rt->rt_uses_gateway; +- return 0; +- } ++ goto out_rt; ++ neigh = dst_neigh_lookup(&rt->dst, &fl4.daddr); ++ if (!neigh) ++ goto out_rt; ++ memcpy(nexthop_mac, neigh->ha, ETH_ALEN); ++ *uses_gateway = rt->rt_uses_gateway; ++ neigh_release(neigh); ++ ip_rt_put(rt); ++ return 0; ++ ++out_rt: ++ ip_rt_put(rt); + out: + return -ENOENT; + } +-- +2.43.0 + diff --git a/queue-6.6/net-sysfs-convert-dev-operstate-reads-to-lockless-on.patch b/queue-6.6/net-sysfs-convert-dev-operstate-reads-to-lockless-on.patch new file mode 100644 index 00000000000..defd50461dd --- /dev/null +++ b/queue-6.6/net-sysfs-convert-dev-operstate-reads-to-lockless-on.patch @@ -0,0 +1,155 @@ +From 8e764166b66f5e8f9dbb7d5189cf663defac1536 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 13 Feb 2024 06:32:39 +0000 +Subject: net-sysfs: convert dev->operstate reads to lockless ones + +From: Eric Dumazet + +[ Upstream commit 
004d138364fd10dd5ff8ceb54cfdc2d792a7b338 ] + +operstate_show() can omit dev_base_lock acquisition only +to read dev->operstate. + +Annotate accesses to dev->operstate. + +Writers still acquire dev_base_lock for mutual exclusion. + +Signed-off-by: Eric Dumazet +Signed-off-by: David S. Miller +Stable-dep-of: 4893b8b3ef8d ("hsr: Simplify code for announcing HSR nodes timer setup") +Signed-off-by: Sasha Levin +--- + net/bridge/br_netlink.c | 3 ++- + net/core/link_watch.c | 4 ++-- + net/core/net-sysfs.c | 4 +--- + net/core/rtnetlink.c | 4 ++-- + net/hsr/hsr_device.c | 10 +++++----- + net/ipv6/addrconf.c | 2 +- + 6 files changed, 13 insertions(+), 14 deletions(-) + +diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c +index 65e9ed3851425..4488faf059a36 100644 +--- a/net/bridge/br_netlink.c ++++ b/net/bridge/br_netlink.c +@@ -455,7 +455,8 @@ static int br_fill_ifinfo(struct sk_buff *skb, + u32 filter_mask, const struct net_device *dev, + bool getlink) + { +- u8 operstate = netif_running(dev) ? dev->operstate : IF_OPER_DOWN; ++ u8 operstate = netif_running(dev) ? 
READ_ONCE(dev->operstate) : ++ IF_OPER_DOWN; + struct nlattr *af = NULL; + struct net_bridge *br; + struct ifinfomsg *hdr; +diff --git a/net/core/link_watch.c b/net/core/link_watch.c +index c469d1c4db5d7..cb43f5aebfbcc 100644 +--- a/net/core/link_watch.c ++++ b/net/core/link_watch.c +@@ -67,7 +67,7 @@ static void rfc2863_policy(struct net_device *dev) + { + unsigned char operstate = default_operstate(dev); + +- if (operstate == dev->operstate) ++ if (operstate == READ_ONCE(dev->operstate)) + return; + + write_lock(&dev_base_lock); +@@ -87,7 +87,7 @@ static void rfc2863_policy(struct net_device *dev) + break; + } + +- dev->operstate = operstate; ++ WRITE_ONCE(dev->operstate, operstate); + + write_unlock(&dev_base_lock); + } +diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c +index fccaa5bac0ed0..5a9487af44e00 100644 +--- a/net/core/net-sysfs.c ++++ b/net/core/net-sysfs.c +@@ -307,11 +307,9 @@ static ssize_t operstate_show(struct device *dev, + const struct net_device *netdev = to_net_dev(dev); + unsigned char operstate; + +- read_lock(&dev_base_lock); +- operstate = netdev->operstate; ++ operstate = READ_ONCE(netdev->operstate); + if (!netif_running(netdev)) + operstate = IF_OPER_DOWN; +- read_unlock(&dev_base_lock); + + if (operstate >= ARRAY_SIZE(operstates)) + return -EINVAL; /* should not happen */ +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index 89964270cf27f..7ea66de1442cc 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -880,9 +880,9 @@ static void set_operstate(struct net_device *dev, unsigned char transition) + break; + } + +- if (dev->operstate != operstate) { ++ if (READ_ONCE(dev->operstate) != operstate) { + write_lock(&dev_base_lock); +- dev->operstate = operstate; ++ WRITE_ONCE(dev->operstate, operstate); + write_unlock(&dev_base_lock); + netdev_state_change(dev); + } +diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c +index dd4b5f0aa1318..cd337385e8592 100644 +--- a/net/hsr/hsr_device.c ++++ 
b/net/hsr/hsr_device.c +@@ -31,8 +31,8 @@ static bool is_slave_up(struct net_device *dev) + static void __hsr_set_operstate(struct net_device *dev, int transition) + { + write_lock(&dev_base_lock); +- if (dev->operstate != transition) { +- dev->operstate = transition; ++ if (READ_ONCE(dev->operstate) != transition) { ++ WRITE_ONCE(dev->operstate, transition); + write_unlock(&dev_base_lock); + netdev_state_change(dev); + } else { +@@ -78,14 +78,14 @@ static void hsr_check_announce(struct net_device *hsr_dev, + + hsr = netdev_priv(hsr_dev); + +- if (hsr_dev->operstate == IF_OPER_UP && old_operstate != IF_OPER_UP) { ++ if (READ_ONCE(hsr_dev->operstate) == IF_OPER_UP && old_operstate != IF_OPER_UP) { + /* Went up */ + hsr->announce_count = 0; + mod_timer(&hsr->announce_timer, + jiffies + msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL)); + } + +- if (hsr_dev->operstate != IF_OPER_UP && old_operstate == IF_OPER_UP) ++ if (READ_ONCE(hsr_dev->operstate) != IF_OPER_UP && old_operstate == IF_OPER_UP) + /* Went down */ + del_timer(&hsr->announce_timer); + } +@@ -100,7 +100,7 @@ void hsr_check_carrier_and_operstate(struct hsr_priv *hsr) + /* netif_stacked_transfer_operstate() cannot be used here since + * it doesn't set IF_OPER_LOWERLAYERDOWN (?) + */ +- old_operstate = master->dev->operstate; ++ old_operstate = READ_ONCE(master->dev->operstate); + has_carrier = hsr_check_carrier(master); + hsr_set_operstate(master, has_carrier); + hsr_check_announce(master->dev, old_operstate); +diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c +index d1806eee1687d..01f4502916a12 100644 +--- a/net/ipv6/addrconf.c ++++ b/net/ipv6/addrconf.c +@@ -6011,7 +6011,7 @@ static int inet6_fill_ifinfo(struct sk_buff *skb, struct inet6_dev *idev, + (dev->ifindex != dev_get_iflink(dev) && + nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))) || + nla_put_u8(skb, IFLA_OPERSTATE, +- netif_running(dev) ? dev->operstate : IF_OPER_DOWN)) ++ netif_running(dev) ? 
READ_ONCE(dev->operstate) : IF_OPER_DOWN)) + goto nla_put_failure; + protoinfo = nla_nest_start_noflag(skb, IFLA_PROTINFO); + if (!protoinfo) +-- +2.43.0 + diff --git a/queue-6.6/nfc-nci-fix-kcov-check-in-nci_rx_work.patch b/queue-6.6/nfc-nci-fix-kcov-check-in-nci_rx_work.patch new file mode 100644 index 00000000000..4e50d4d5524 --- /dev/null +++ b/queue-6.6/nfc-nci-fix-kcov-check-in-nci_rx_work.patch @@ -0,0 +1,44 @@ +From e07a45cf8dfcc21b5144d9132a343a7a3fc878fc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 5 May 2024 19:36:49 +0900 +Subject: nfc: nci: Fix kcov check in nci_rx_work() + +From: Tetsuo Handa + +[ Upstream commit 19e35f24750ddf860c51e51c68cf07ea181b4881 ] + +Commit 7e8cdc97148c ("nfc: Add KCOV annotations") added +kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(), +with an assumption that kcov_remote_stop() is called upon continue of +the for loop. But commit d24b03535e5e ("nfc: nci: Fix uninit-value in +nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before +break of the for loop. 
+ +Reported-by: syzbot +Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2 +Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") +Suggested-by: Andrey Konovalov +Signed-off-by: Tetsuo Handa +Reviewed-by: Krzysztof Kozlowski +Link: https://lore.kernel.org/r/6d10f829-5a0c-405a-b39a-d7266f3a1a0b@I-love.SAKURA.ne.jp +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/nfc/nci/core.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c +index 772ddb5824d9e..5d708af0fcfd3 100644 +--- a/net/nfc/nci/core.c ++++ b/net/nfc/nci/core.c +@@ -1518,6 +1518,7 @@ static void nci_rx_work(struct work_struct *work) + + if (!nci_plen(skb->data)) { + kfree_skb(skb); ++ kcov_remote_stop(); + break; + } + +-- +2.43.0 + diff --git a/queue-6.6/phonet-fix-rtm_phonet_notify-skb-allocation.patch b/queue-6.6/phonet-fix-rtm_phonet_notify-skb-allocation.patch new file mode 100644 index 00000000000..55442036a3c --- /dev/null +++ b/queue-6.6/phonet-fix-rtm_phonet_notify-skb-allocation.patch @@ -0,0 +1,50 @@ +From 8f307c4f63fa2328b138f3d46f8367c38c6768f7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 2 May 2024 16:17:00 +0000 +Subject: phonet: fix rtm_phonet_notify() skb allocation +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Eric Dumazet + +[ Upstream commit d8cac8568618dcb8a51af3db1103e8d4cc4aeea7 ] + +fill_route() stores three components in the skb: + +- struct rtmsg +- RTA_DST (u8) +- RTA_OIF (u32) + +Therefore, rtm_phonet_notify() should use + +NLMSG_ALIGN(sizeof(struct rtmsg)) + +nla_total_size(1) + +nla_total_size(4) + +Fixes: f062f41d0657 ("Phonet: routing table Netlink interface") +Signed-off-by: Eric Dumazet +Acked-by: Rémi Denis-Courmont +Link: https://lore.kernel.org/r/20240502161700.1804476-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/phonet/pn_netlink.c | 2 +- + 1 file 
changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/phonet/pn_netlink.c b/net/phonet/pn_netlink.c +index 59aebe2968907..dd4c7e9a634fb 100644 +--- a/net/phonet/pn_netlink.c ++++ b/net/phonet/pn_netlink.c +@@ -193,7 +193,7 @@ void rtm_phonet_notify(int event, struct net_device *dev, u8 dst) + struct sk_buff *skb; + int err = -ENOBUFS; + +- skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct ifaddrmsg)) + ++ skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct rtmsg)) + + nla_total_size(1) + nla_total_size(4), GFP_KERNEL); + if (skb == NULL) + goto errout; +-- +2.43.0 + diff --git a/queue-6.6/qibfs-fix-dentry-leak.patch b/queue-6.6/qibfs-fix-dentry-leak.patch new file mode 100644 index 00000000000..272d596d408 --- /dev/null +++ b/queue-6.6/qibfs-fix-dentry-leak.patch @@ -0,0 +1,38 @@ +From bcd1225d1bc0341816567f3f9d0b8a319a4fb815 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 25 Feb 2024 23:58:42 -0500 +Subject: qibfs: fix dentry leak + +From: Al Viro + +[ Upstream commit aa23317d0268b309bb3f0801ddd0d61813ff5afb ] + +simple_recursive_removal() drops the pinning references to all positives +in subtree. For the cases when its argument has been kept alive by +the pinning alone that's exactly the right thing to do, but here +the argument comes from dcache lookup, that needs to be balanced by +explicit dput(). 
+ +Fixes: e41d237818598 "qib_fs: switch to simple_recursive_removal()" +Fucked-up-by: Al Viro +Signed-off-by: Al Viro +Signed-off-by: Sasha Levin +--- + drivers/infiniband/hw/qib/qib_fs.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c +index ed7d4b02f45a6..11155e0fb8395 100644 +--- a/drivers/infiniband/hw/qib/qib_fs.c ++++ b/drivers/infiniband/hw/qib/qib_fs.c +@@ -439,6 +439,7 @@ static int remove_device_files(struct super_block *sb, + return PTR_ERR(dir); + } + simple_recursive_removal(dir, NULL); ++ dput(dir); + return 0; + } + +-- +2.43.0 + diff --git a/queue-6.6/rtnetlink-correct-nested-ifla_vf_vlan_list-attribute.patch b/queue-6.6/rtnetlink-correct-nested-ifla_vf_vlan_list-attribute.patch new file mode 100644 index 00000000000..9bf4b1e5b88 --- /dev/null +++ b/queue-6.6/rtnetlink-correct-nested-ifla_vf_vlan_list-attribute.patch @@ -0,0 +1,44 @@ +From 36fb6c0b49f496ee4412fdc15a055c0a8f12194b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 2 May 2024 18:57:51 +0300 +Subject: rtnetlink: Correct nested IFLA_VF_VLAN_LIST attribute validation + +From: Roded Zats + +[ Upstream commit 1aec77b2bb2ed1db0f5efc61c4c1ca3813307489 ] + +Each attribute inside a nested IFLA_VF_VLAN_LIST is assumed to be a +struct ifla_vf_vlan_info so the size of such attribute needs to be at least +of sizeof(struct ifla_vf_vlan_info) which is 14 bytes. +The current size validation in do_setvfinfo is against NLA_HDRLEN (4 bytes) +which is less than sizeof(struct ifla_vf_vlan_info) so this validation +is not enough and a too small attribute might be cast to a +struct ifla_vf_vlan_info, this might result in an out of bands +read access when accessing the saved (casted) entry in ivvl. 
+ +Fixes: 79aab093a0b5 ("net: Update API for VF vlan protocol 802.1ad support") +Signed-off-by: Roded Zats +Reviewed-by: Donald Hunter +Link: https://lore.kernel.org/r/20240502155751.75705-1-rzats@paloaltonetworks.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/core/rtnetlink.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c +index e8bf481e80f72..89964270cf27f 100644 +--- a/net/core/rtnetlink.c ++++ b/net/core/rtnetlink.c +@@ -2519,7 +2519,7 @@ static int do_setvfinfo(struct net_device *dev, struct nlattr **tb) + + nla_for_each_nested(attr, tb[IFLA_VF_VLAN_LIST], rem) { + if (nla_type(attr) != IFLA_VF_VLAN_INFO || +- nla_len(attr) < NLA_HDRLEN) { ++ nla_len(attr) < sizeof(struct ifla_vf_vlan_info)) { + return -EINVAL; + } + if (len >= MAX_VLAN_LIST_LEN) +-- +2.43.0 + diff --git a/queue-6.6/rxrpc-fix-congestion-control-algorithm.patch b/queue-6.6/rxrpc-fix-congestion-control-algorithm.patch new file mode 100644 index 00000000000..1a6a9be6964 --- /dev/null +++ b/queue-6.6/rxrpc-fix-congestion-control-algorithm.patch @@ -0,0 +1,88 @@ +From 1201f32065f27e93bd4e9cee21fd61caf8c17481 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 3 May 2024 16:07:39 +0100 +Subject: rxrpc: Fix congestion control algorithm + +From: David Howells + +[ Upstream commit ba4e103848d3a2a28a0445e39f4a9564187efe54 ] + +Make the following fixes to the congestion control algorithm: + + (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since + that's currently held constant - set to the size of a jumbo subpacket + payload so that we can create jumbo packets on the fly. The current + code invariably picks 3 as the starting value. + + Further, the starting cwnd needs to be an even number because we ack + every other packet, so set it to 4. + + (2) Don't cut ssthresh when we see an ACK come from the peer with a + receive window (rwind) less than ssthresh. 
ssthresh keeps track of + characteristics of the connection whereas rwind may be reduced by the + peer for any reason - and may be reduced to 0. + +Fixes: 1fc4fa2ac93d ("rxrpc: Fix congestion management") +Fixes: 0851115090a3 ("rxrpc: Reduce ssthresh to peer's receive window") +Signed-off-by: David Howells +Suggested-by: Simon Wilkinson +cc: Marc Dionne +cc: linux-afs@lists.infradead.org +Reviewed-by: Jeffrey Altman +Link: https://lore.kernel.org/r/20240503150749.1001323-2-dhowells@redhat.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/rxrpc/ar-internal.h | 2 +- + net/rxrpc/call_object.c | 7 +------ + net/rxrpc/input.c | 3 --- + 3 files changed, 2 insertions(+), 10 deletions(-) + +diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h +index bda3f6690b321..d431376bb120a 100644 +--- a/net/rxrpc/ar-internal.h ++++ b/net/rxrpc/ar-internal.h +@@ -688,7 +688,7 @@ struct rxrpc_call { + * packets) rather than bytes. + */ + #define RXRPC_TX_SMSS RXRPC_JUMBO_DATALEN +-#define RXRPC_MIN_CWND (RXRPC_TX_SMSS > 2190 ? 2 : RXRPC_TX_SMSS > 1095 ? 
3 : 4) ++#define RXRPC_MIN_CWND 4 + u8 cong_cwnd; /* Congestion window size */ + u8 cong_extra; /* Extra to send for congestion management */ + u8 cong_ssthresh; /* Slow-start threshold */ +diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c +index 0a50341d920af..29385908099ef 100644 +--- a/net/rxrpc/call_object.c ++++ b/net/rxrpc/call_object.c +@@ -175,12 +175,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp, + call->rx_winsize = rxrpc_rx_window_size; + call->tx_winsize = 16; + +- if (RXRPC_TX_SMSS > 2190) +- call->cong_cwnd = 2; +- else if (RXRPC_TX_SMSS > 1095) +- call->cong_cwnd = 3; +- else +- call->cong_cwnd = 4; ++ call->cong_cwnd = RXRPC_MIN_CWND; + call->cong_ssthresh = RXRPC_TX_MAX_WINDOW; + + call->rxnet = rxnet; +diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c +index 718ffd184ddb6..f7304e06aadca 100644 +--- a/net/rxrpc/input.c ++++ b/net/rxrpc/input.c +@@ -688,9 +688,6 @@ static void rxrpc_input_ack_trailer(struct rxrpc_call *call, struct sk_buff *skb + call->tx_winsize = rwind; + } + +- if (call->cong_ssthresh > rwind) +- call->cong_ssthresh = rwind; +- + mtu = min(ntohl(trailer->maxMTU), ntohl(trailer->ifMTU)); + + peer = call->peer; +-- +2.43.0 + diff --git a/queue-6.6/rxrpc-fix-the-names-of-the-fields-in-the-ack-trailer.patch b/queue-6.6/rxrpc-fix-the-names-of-the-fields-in-the-ack-trailer.patch new file mode 100644 index 00000000000..b06400f2555 --- /dev/null +++ b/queue-6.6/rxrpc-fix-the-names-of-the-fields-in-the-ack-trailer.patch @@ -0,0 +1,213 @@ +From 70d112aa6492c9ce79b0693aeecbe031969f630e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 26 Jan 2024 16:17:03 +0000 +Subject: rxrpc: Fix the names of the fields in the ACK trailer struct + +From: David Howells + +[ Upstream commit 17469ae0582aaacad36e8e858f58b86c369f21ef ] + +From AFS-3.3 a trailer containing extra info was added to the ACK packet +format - but AF_RXRPC has the names of some of the fields mixed up compared +to other AFS 
implementations. + +Rename the struct and the fields to make them match. + +Signed-off-by: David Howells +cc: Marc Dionne +cc: "David S. Miller" +cc: Eric Dumazet +cc: Jakub Kicinski +cc: Paolo Abeni +cc: linux-afs@lists.infradead.org +cc: netdev@vger.kernel.org +Stable-dep-of: ba4e103848d3 ("rxrpc: Fix congestion control algorithm") +Signed-off-by: Sasha Levin +--- + include/trace/events/rxrpc.h | 2 +- + net/rxrpc/conn_event.c | 16 ++++++++-------- + net/rxrpc/input.c | 22 +++++++++++----------- + net/rxrpc/output.c | 14 +++++++------- + net/rxrpc/protocol.h | 6 +++--- + 5 files changed, 30 insertions(+), 30 deletions(-) + +diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h +index 0dd4a21d172da..3322fb93a260b 100644 +--- a/include/trace/events/rxrpc.h ++++ b/include/trace/events/rxrpc.h +@@ -83,7 +83,7 @@ + EM(rxrpc_badmsg_bad_abort, "bad-abort") \ + EM(rxrpc_badmsg_bad_jumbo, "bad-jumbo") \ + EM(rxrpc_badmsg_short_ack, "short-ack") \ +- EM(rxrpc_badmsg_short_ack_info, "short-ack-info") \ ++ EM(rxrpc_badmsg_short_ack_trailer, "short-ack-trailer") \ + EM(rxrpc_badmsg_short_hdr, "short-hdr") \ + EM(rxrpc_badmsg_unsupported_packet, "unsup-pkt") \ + EM(rxrpc_badmsg_zero_call, "zero-call") \ +diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c +index 1f251d758cb9d..598b4ee389fc1 100644 +--- a/net/rxrpc/conn_event.c ++++ b/net/rxrpc/conn_event.c +@@ -88,7 +88,7 @@ void rxrpc_conn_retransmit_call(struct rxrpc_connection *conn, + struct rxrpc_ackpacket ack; + }; + } __attribute__((packed)) pkt; +- struct rxrpc_ackinfo ack_info; ++ struct rxrpc_acktrailer trailer; + size_t len; + int ret, ioc; + u32 serial, mtu, call_id, padding; +@@ -122,8 +122,8 @@ void rxrpc_conn_retransmit_call(struct rxrpc_connection *conn, + iov[0].iov_len = sizeof(pkt.whdr); + iov[1].iov_base = &padding; + iov[1].iov_len = 3; +- iov[2].iov_base = &ack_info; +- iov[2].iov_len = sizeof(ack_info); ++ iov[2].iov_base = &trailer; ++ iov[2].iov_len = sizeof(trailer); + + 
serial = rxrpc_get_next_serial(conn); + +@@ -158,14 +158,14 @@ void rxrpc_conn_retransmit_call(struct rxrpc_connection *conn, + pkt.ack.serial = htonl(skb ? sp->hdr.serial : 0); + pkt.ack.reason = skb ? RXRPC_ACK_DUPLICATE : RXRPC_ACK_IDLE; + pkt.ack.nAcks = 0; +- ack_info.rxMTU = htonl(rxrpc_rx_mtu); +- ack_info.maxMTU = htonl(mtu); +- ack_info.rwind = htonl(rxrpc_rx_window_size); +- ack_info.jumbo_max = htonl(rxrpc_rx_jumbo_max); ++ trailer.maxMTU = htonl(rxrpc_rx_mtu); ++ trailer.ifMTU = htonl(mtu); ++ trailer.rwind = htonl(rxrpc_rx_window_size); ++ trailer.jumbo_max = htonl(rxrpc_rx_jumbo_max); + pkt.whdr.flags |= RXRPC_SLOW_START_OK; + padding = 0; + iov[0].iov_len += sizeof(pkt.ack); +- len += sizeof(pkt.ack) + 3 + sizeof(ack_info); ++ len += sizeof(pkt.ack) + 3 + sizeof(trailer); + ioc = 3; + + trace_rxrpc_tx_ack(chan->call_debug_id, serial, +diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c +index 9691de00ade75..718ffd184ddb6 100644 +--- a/net/rxrpc/input.c ++++ b/net/rxrpc/input.c +@@ -670,14 +670,14 @@ static void rxrpc_complete_rtt_probe(struct rxrpc_call *call, + /* + * Process the extra information that may be appended to an ACK packet + */ +-static void rxrpc_input_ackinfo(struct rxrpc_call *call, struct sk_buff *skb, +- struct rxrpc_ackinfo *ackinfo) ++static void rxrpc_input_ack_trailer(struct rxrpc_call *call, struct sk_buff *skb, ++ struct rxrpc_acktrailer *trailer) + { + struct rxrpc_skb_priv *sp = rxrpc_skb(skb); + struct rxrpc_peer *peer; + unsigned int mtu; + bool wake = false; +- u32 rwind = ntohl(ackinfo->rwind); ++ u32 rwind = ntohl(trailer->rwind); + + if (rwind > RXRPC_TX_MAX_WINDOW) + rwind = RXRPC_TX_MAX_WINDOW; +@@ -691,7 +691,7 @@ static void rxrpc_input_ackinfo(struct rxrpc_call *call, struct sk_buff *skb, + if (call->cong_ssthresh > rwind) + call->cong_ssthresh = rwind; + +- mtu = min(ntohl(ackinfo->rxMTU), ntohl(ackinfo->maxMTU)); ++ mtu = min(ntohl(trailer->maxMTU), ntohl(trailer->ifMTU)); + + peer = call->peer; + if (mtu < 
peer->maxdata) { +@@ -837,7 +837,7 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb) + struct rxrpc_ack_summary summary = { 0 }; + struct rxrpc_ackpacket ack; + struct rxrpc_skb_priv *sp = rxrpc_skb(skb); +- struct rxrpc_ackinfo info; ++ struct rxrpc_acktrailer trailer; + rxrpc_serial_t ack_serial, acked_serial; + rxrpc_seq_t first_soft_ack, hard_ack, prev_pkt, since; + int nr_acks, offset, ioffset; +@@ -917,11 +917,11 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb) + goto send_response; + } + +- info.rxMTU = 0; ++ trailer.maxMTU = 0; + ioffset = offset + nr_acks + 3; +- if (skb->len >= ioffset + sizeof(info) && +- skb_copy_bits(skb, ioffset, &info, sizeof(info)) < 0) +- return rxrpc_proto_abort(call, 0, rxrpc_badmsg_short_ack_info); ++ if (skb->len >= ioffset + sizeof(trailer) && ++ skb_copy_bits(skb, ioffset, &trailer, sizeof(trailer)) < 0) ++ return rxrpc_proto_abort(call, 0, rxrpc_badmsg_short_ack_trailer); + + if (nr_acks > 0) + skb_condense(skb); +@@ -950,8 +950,8 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb) + } + + /* Parse rwind and mtu sizes if provided. 
*/ +- if (info.rxMTU) +- rxrpc_input_ackinfo(call, skb, &info); ++ if (trailer.maxMTU) ++ rxrpc_input_ack_trailer(call, skb, &trailer); + + if (first_soft_ack == 0) + return rxrpc_proto_abort(call, 0, rxrpc_eproto_ackr_zero); +diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c +index 4a292f860ae37..cad6a7d18e040 100644 +--- a/net/rxrpc/output.c ++++ b/net/rxrpc/output.c +@@ -83,7 +83,7 @@ static size_t rxrpc_fill_out_ack(struct rxrpc_connection *conn, + struct rxrpc_txbuf *txb, + u16 *_rwind) + { +- struct rxrpc_ackinfo ackinfo; ++ struct rxrpc_acktrailer trailer; + unsigned int qsize, sack, wrap, to; + rxrpc_seq_t window, wtop; + int rsize; +@@ -126,16 +126,16 @@ static size_t rxrpc_fill_out_ack(struct rxrpc_connection *conn, + qsize = (window - 1) - call->rx_consumed; + rsize = max_t(int, call->rx_winsize - qsize, 0); + *_rwind = rsize; +- ackinfo.rxMTU = htonl(rxrpc_rx_mtu); +- ackinfo.maxMTU = htonl(mtu); +- ackinfo.rwind = htonl(rsize); +- ackinfo.jumbo_max = htonl(jmax); ++ trailer.maxMTU = htonl(rxrpc_rx_mtu); ++ trailer.ifMTU = htonl(mtu); ++ trailer.rwind = htonl(rsize); ++ trailer.jumbo_max = htonl(jmax); + + *ackp++ = 0; + *ackp++ = 0; + *ackp++ = 0; +- memcpy(ackp, &ackinfo, sizeof(ackinfo)); +- return txb->ack.nAcks + 3 + sizeof(ackinfo); ++ memcpy(ackp, &trailer, sizeof(trailer)); ++ return txb->ack.nAcks + 3 + sizeof(trailer); + } + + /* +diff --git a/net/rxrpc/protocol.h b/net/rxrpc/protocol.h +index e8ee4af43ca89..4fe6b4d20ada9 100644 +--- a/net/rxrpc/protocol.h ++++ b/net/rxrpc/protocol.h +@@ -135,9 +135,9 @@ struct rxrpc_ackpacket { + /* + * ACK packets can have a further piece of information tagged on the end + */ +-struct rxrpc_ackinfo { +- __be32 rxMTU; /* maximum Rx MTU size (bytes) [AFS 3.3] */ +- __be32 maxMTU; /* maximum interface MTU size (bytes) [AFS 3.3] */ ++struct rxrpc_acktrailer { ++ __be32 maxMTU; /* maximum Rx MTU size (bytes) [AFS 3.3] */ ++ __be32 ifMTU; /* maximum interface MTU size (bytes) [AFS 3.3] */ + __be32 rwind; /* Rx 
window size (packets) [AFS 3.4] */ + __be32 jumbo_max; /* max packets to stick into a jumbo packet [AFS 3.5] */ + }; +-- +2.43.0 + diff --git a/queue-6.6/rxrpc-only-transmit-one-ack-per-jumbo-packet-receive.patch b/queue-6.6/rxrpc-only-transmit-one-ack-per-jumbo-packet-receive.patch new file mode 100644 index 00000000000..b35a7c38667 --- /dev/null +++ b/queue-6.6/rxrpc-only-transmit-one-ack-per-jumbo-packet-receive.patch @@ -0,0 +1,138 @@ +From 5ab2633fa6a84a095c122aecf694c6e16ab1c3f1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 3 May 2024 16:07:40 +0100 +Subject: rxrpc: Only transmit one ACK per jumbo packet received + +From: David Howells + +[ Upstream commit 012b7206918dcc5a4dcf1432b3e643114c95957e ] + +Only generate one ACK packet for all the subpackets in a jumbo packet. If +we would like to generate more than one ACK, we prioritise them base on +their reason code, in the order, highest first: + + OutOfSeq > NoSpace > ExceedsWin > Duplicate > Requested > Delay > Idle + +For the first four, we reference the lowest offending subpacket; for the +last three, the highest. + +This reduces the number of ACKs we end up transmitting to one per UDP +packet transmitted to reduce network loading and packet parsing. 
+ +Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring") +Signed-off-by: David Howells +cc: Marc Dionne +cc: linux-afs@lists.infradead.org +Reviewed-by: Jeffrey Altman +Link: https://lore.kernel.org/r/20240503150749.1001323-3-dhowells@redhat.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/rxrpc/input.c | 46 +++++++++++++++++++++++++++++++++++----------- + 1 file changed, 35 insertions(+), 11 deletions(-) + +diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c +index f7304e06aadca..5dfda1ac51dda 100644 +--- a/net/rxrpc/input.c ++++ b/net/rxrpc/input.c +@@ -9,6 +9,17 @@ + + #include "ar-internal.h" + ++/* Override priority when generating ACKs for received DATA */ ++static const u8 rxrpc_ack_priority[RXRPC_ACK__INVALID] = { ++ [RXRPC_ACK_IDLE] = 1, ++ [RXRPC_ACK_DELAY] = 2, ++ [RXRPC_ACK_REQUESTED] = 3, ++ [RXRPC_ACK_DUPLICATE] = 4, ++ [RXRPC_ACK_EXCEEDS_WINDOW] = 5, ++ [RXRPC_ACK_NOSPACE] = 6, ++ [RXRPC_ACK_OUT_OF_SEQUENCE] = 7, ++}; ++ + static void rxrpc_proto_abort(struct rxrpc_call *call, rxrpc_seq_t seq, + enum rxrpc_abort_reason why) + { +@@ -366,7 +377,7 @@ static void rxrpc_input_queue_data(struct rxrpc_call *call, struct sk_buff *skb, + * Process a DATA packet. 
+ */ + static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, +- bool *_notify) ++ bool *_notify, rxrpc_serial_t *_ack_serial, int *_ack_reason) + { + struct rxrpc_skb_priv *sp = rxrpc_skb(skb); + struct sk_buff *oos; +@@ -419,8 +430,6 @@ static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, + /* Send an immediate ACK if we fill in a hole */ + else if (!skb_queue_empty(&call->rx_oos_queue)) + ack_reason = RXRPC_ACK_DELAY; +- else +- call->ackr_nr_unacked++; + + window++; + if (after(window, wtop)) { +@@ -498,12 +507,16 @@ static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, + } + + send_ack: +- if (ack_reason >= 0) +- rxrpc_send_ACK(call, ack_reason, serial, +- rxrpc_propose_ack_input_data); +- else +- rxrpc_propose_delay_ACK(call, serial, +- rxrpc_propose_ack_input_data); ++ if (ack_reason >= 0) { ++ if (rxrpc_ack_priority[ack_reason] > rxrpc_ack_priority[*_ack_reason]) { ++ *_ack_serial = serial; ++ *_ack_reason = ack_reason; ++ } else if (rxrpc_ack_priority[ack_reason] == rxrpc_ack_priority[*_ack_reason] && ++ ack_reason == RXRPC_ACK_REQUESTED) { ++ *_ack_serial = serial; ++ *_ack_reason = ack_reason; ++ } ++ } + } + + /* +@@ -514,9 +527,11 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb + struct rxrpc_jumbo_header jhdr; + struct rxrpc_skb_priv *sp = rxrpc_skb(skb), *jsp; + struct sk_buff *jskb; ++ rxrpc_serial_t ack_serial = 0; + unsigned int offset = sizeof(struct rxrpc_wire_header); + unsigned int len = skb->len - offset; + bool notify = false; ++ int ack_reason = 0; + + while (sp->hdr.flags & RXRPC_JUMBO_PACKET) { + if (len < RXRPC_JUMBO_SUBPKTLEN) +@@ -536,7 +551,7 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb + jsp = rxrpc_skb(jskb); + jsp->offset = offset; + jsp->len = RXRPC_JUMBO_DATALEN; +- rxrpc_input_data_one(call, jskb, &notify); ++ rxrpc_input_data_one(call, jskb, &notify, &ack_serial, &ack_reason); + 
rxrpc_free_skb(jskb, rxrpc_skb_put_jumbo_subpacket); + + sp->hdr.flags = jhdr.flags; +@@ -549,7 +564,16 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb + + sp->offset = offset; + sp->len = len; +- rxrpc_input_data_one(call, skb, &notify); ++ rxrpc_input_data_one(call, skb, &notify, &ack_serial, &ack_reason); ++ ++ if (ack_reason > 0) { ++ rxrpc_send_ACK(call, ack_reason, ack_serial, ++ rxrpc_propose_ack_input_data); ++ } else { ++ call->ackr_nr_unacked++; ++ rxrpc_propose_delay_ACK(call, sp->hdr.serial, ++ rxrpc_propose_ack_input_data); ++ } + if (notify) { + trace_rxrpc_notify_socket(call->debug_id, sp->hdr.serial); + rxrpc_notify_socket(call); +-- +2.43.0 + diff --git a/queue-6.6/selftests-net-convert-test_bridge_neigh_suppress.sh-.patch b/queue-6.6/selftests-net-convert-test_bridge_neigh_suppress.sh-.patch new file mode 100644 index 00000000000..8e0d098ca69 --- /dev/null +++ b/queue-6.6/selftests-net-convert-test_bridge_neigh_suppress.sh-.patch @@ -0,0 +1,692 @@ +From 0cda0dd31e8f9340f25dd67be09a83db4d680a09 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 6 Dec 2023 15:07:54 +0800 +Subject: selftests/net: convert test_bridge_neigh_suppress.sh to run it in + unique namespace + +From: Hangbin Liu + +[ Upstream commit 312abe3d93a35f9c486a4703d39cab52457266f0 ] + +Here is the test result after conversion. + +]# ./test_bridge_neigh_suppress.sh + +Per-port ARP suppression - VLAN 10 +---------------------------------- +TEST: arping [ OK ] +TEST: ARP suppression [ OK ] + +... + +TEST: NS suppression (VLAN 20) [ OK ] + +Tests passed: 148 +Tests failed: 0 + +Acked-by: David Ahern +Signed-off-by: Hangbin Liu +Reviewed-by: Ido Schimmel +Tested-by: Ido Schimmel +Signed-off-by: David S. 
Miller +Stable-dep-of: 9a169c267e94 ("selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC") +Signed-off-by: Sasha Levin +--- + .../net/test_bridge_neigh_suppress.sh | 331 +++++++++--------- + 1 file changed, 162 insertions(+), 169 deletions(-) + +diff --git a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh +index d80f2cd87614c..8533393a4f186 100755 +--- a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh ++++ b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh +@@ -45,9 +45,8 @@ + # | sw1 | | sw2 | + # +------------------------------------+ +------------------------------------+ + ++source lib.sh + ret=0 +-# Kselftest framework requirement - SKIP code is 4. +-ksft_skip=4 + + # All tests in this script. Can be overridden with -t option. + TESTS=" +@@ -140,9 +139,6 @@ setup_topo_ns() + { + local ns=$1; shift + +- ip netns add $ns +- ip -n $ns link set dev lo up +- + ip netns exec $ns sysctl -qw net.ipv6.conf.all.keep_addr_on_down=1 + ip netns exec $ns sysctl -qw net.ipv6.conf.default.ignore_routes_with_linkdown=1 + ip netns exec $ns sysctl -qw net.ipv6.conf.all.accept_dad=0 +@@ -153,21 +149,22 @@ setup_topo() + { + local ns + +- for ns in h1 h2 sw1 sw2; do ++ setup_ns h1 h2 sw1 sw2 ++ for ns in $h1 $h2 $sw1 $sw2; do + setup_topo_ns $ns + done + + ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns h1 name eth0 +- ip link set dev veth1 netns sw1 name swp1 ++ ip link set dev veth0 netns $h1 name eth0 ++ ip link set dev veth1 netns $sw1 name swp1 + + ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns sw1 name veth0 +- ip link set dev veth1 netns sw2 name veth0 ++ ip link set dev veth0 netns $sw1 name veth0 ++ ip link set dev veth1 netns $sw2 name veth0 + + ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns h2 name eth0 +- ip link set dev veth1 netns sw2 name swp1 ++ ip link set 
dev veth0 netns $h2 name eth0 ++ ip link set dev veth1 netns $sw2 name swp1 + } + + setup_host_common() +@@ -190,7 +187,7 @@ setup_host_common() + + setup_h1() + { +- local ns=h1 ++ local ns=$h1 + local v4addr1=192.0.2.1/28 + local v4addr2=192.0.2.17/28 + local v6addr1=2001:db8:1::1/64 +@@ -201,7 +198,7 @@ setup_h1() + + setup_h2() + { +- local ns=h2 ++ local ns=$h2 + local v4addr1=192.0.2.2/28 + local v4addr2=192.0.2.18/28 + local v6addr1=2001:db8:1::2/64 +@@ -254,7 +251,7 @@ setup_sw_common() + + setup_sw1() + { +- local ns=sw1 ++ local ns=$sw1 + local local_addr=192.0.2.33 + local remote_addr=192.0.2.34 + local veth_addr=192.0.2.49 +@@ -265,7 +262,7 @@ setup_sw1() + + setup_sw2() + { +- local ns=sw2 ++ local ns=$sw2 + local local_addr=192.0.2.34 + local remote_addr=192.0.2.33 + local veth_addr=192.0.2.50 +@@ -291,11 +288,7 @@ setup() + + cleanup() + { +- local ns +- +- for ns in h1 h2 sw1 sw2; do +- ip netns del $ns &> /dev/null +- done ++ cleanup_ns $h1 $h2 $sw1 $sw2 + } + + ################################################################################ +@@ -312,80 +305,80 @@ neigh_suppress_arp_common() + echo "Per-port ARP suppression - VLAN $vid" + echo "----------------------------------" + +- run_cmd "tc -n sw1 qdisc replace dev vx0 clsact" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip arp_sip $sip arp_op request action pass" ++ run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip arp_sip $sip arp_op request action pass" + + # Initial state - check that ARP requests are not suppressed and that + # ARP replies are received. +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 
0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "ARP suppression" + + # Enable neighbor suppression and check that nothing changes compared + # to the initial state. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "ARP suppression" + + # Install an FDB entry for the remote host and check that nothing + # changes compared to the initial state. +- h2_mac=$(ip -n h2 -j -p link show eth0.$vid | jq -r '.[]["address"]') +- run_cmd "bridge -n sw1 fdb replace $h2_mac dev vx0 master static vlan $vid" ++ h2_mac=$(ip -n $h2 -j -p link show eth0.$vid | jq -r '.[]["address"]') ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac dev vx0 master static vlan $vid" + log_test $? 0 "FDB entry installation" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "ARP suppression" + + # Install a neighbor on the matching SVI interface and check that ARP + # requests are suppressed. +- run_cmd "ip -n sw1 neigh replace $tip lladdr $h2_mac nud permanent dev br0.$vid" ++ run_cmd "ip -n $sw1 neigh replace $tip lladdr $h2_mac nud permanent dev br0.$vid" + log_test $? 
0 "Neighbor entry installation" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "ARP suppression" + + # Take the second host down and check that ARP requests are suppressed + # and that ARP replies are received. +- run_cmd "ip -n h2 link set dev eth0.$vid down" ++ run_cmd "ip -n $h2 link set dev eth0.$vid down" + log_test $? 0 "H2 down" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "ARP suppression" + +- run_cmd "ip -n h2 link set dev eth0.$vid up" ++ run_cmd "ip -n $h2 link set dev eth0.$vid up" + log_test $? 0 "H2 up" + + # Disable neighbor suppression and check that ARP requests are no + # longer suppressed. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 0 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 4 ++ tc_check_packets $sw1 "dev vx0 egress" 101 4 + log_test $? 0 "ARP suppression" + + # Take the second host down and check that ARP requests are not + # suppressed and that ARP replies are not received. +- run_cmd "ip -n h2 link set dev eth0.$vid down" ++ run_cmd "ip -n $h2 link set dev eth0.$vid down" + log_test $? 
0 "H2 down" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip" + log_test $? 1 "arping" +- tc_check_packets sw1 "dev vx0 egress" 101 5 ++ tc_check_packets $sw1 "dev vx0 egress" 101 5 + log_test $? 0 "ARP suppression" + } + +@@ -415,80 +408,80 @@ neigh_suppress_ns_common() + echo "Per-port NS suppression - VLAN $vid" + echo "---------------------------------" + +- run_cmd "tc -n sw1 qdisc replace dev vx0 clsact" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr type 135 code 0 action pass" ++ run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr type 135 code 0 action pass" + + # Initial state - check that NS messages are not suppressed and that ND + # messages are received. +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "NS suppression" + + # Enable neighbor suppression and check that nothing changes compared + # to the initial state. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 
0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "NS suppression" + + # Install an FDB entry for the remote host and check that nothing + # changes compared to the initial state. +- h2_mac=$(ip -n h2 -j -p link show eth0.$vid | jq -r '.[]["address"]') +- run_cmd "bridge -n sw1 fdb replace $h2_mac dev vx0 master static vlan $vid" ++ h2_mac=$(ip -n $h2 -j -p link show eth0.$vid | jq -r '.[]["address"]') ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac dev vx0 master static vlan $vid" + log_test $? 0 "FDB entry installation" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "NS suppression" + + # Install a neighbor on the matching SVI interface and check that NS + # messages are suppressed. +- run_cmd "ip -n sw1 neigh replace $daddr lladdr $h2_mac nud permanent dev br0.$vid" ++ run_cmd "ip -n $sw1 neigh replace $daddr lladdr $h2_mac nud permanent dev br0.$vid" + log_test $? 0 "Neighbor entry installation" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "NS suppression" + + # Take the second host down and check that NS messages are suppressed + # and that ND messages are received. +- run_cmd "ip -n h2 link set dev eth0.$vid down" ++ run_cmd "ip -n $h2 link set dev eth0.$vid down" + log_test $? 0 "H2 down" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 
0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 3 ++ tc_check_packets $sw1 "dev vx0 egress" 101 3 + log_test $? 0 "NS suppression" + +- run_cmd "ip -n h2 link set dev eth0.$vid up" ++ run_cmd "ip -n $h2 link set dev eth0.$vid up" + log_test $? 0 "H2 up" + + # Disable neighbor suppression and check that NS messages are no longer + # suppressed. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 0 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 4 ++ tc_check_packets $sw1 "dev vx0 egress" 101 4 + log_test $? 0 "NS suppression" + + # Take the second host down and check that NS messages are not + # suppressed and that ND messages are not received. +- run_cmd "ip -n h2 link set dev eth0.$vid down" ++ run_cmd "ip -n $h2 link set dev eth0.$vid down" + log_test $? 0 "H2 down" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr -w 5000 $daddr eth0.$vid" + log_test $? 2 "ndisc6" +- tc_check_packets sw1 "dev vx0 egress" 101 5 ++ tc_check_packets $sw1 "dev vx0 egress" 101 5 + log_test $? 
0 "NS suppression" + } + +@@ -524,118 +517,118 @@ neigh_vlan_suppress_arp() + echo "Per-{Port, VLAN} ARP suppression" + echo "--------------------------------" + +- run_cmd "tc -n sw1 qdisc replace dev vx0 clsact" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip1 arp_sip $sip1 arp_op request action pass" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 102 proto 0x0806 flower indev swp1 arp_tip $tip2 arp_sip $sip2 arp_op request action pass" ++ run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip1 arp_sip $sip1 arp_op request action pass" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 102 proto 0x0806 flower indev swp1 arp_tip $tip2 arp_sip $sip2 arp_op request action pass" + +- h2_mac1=$(ip -n h2 -j -p link show eth0.$vid1 | jq -r '.[]["address"]') +- h2_mac2=$(ip -n h2 -j -p link show eth0.$vid2 | jq -r '.[]["address"]') +- run_cmd "bridge -n sw1 fdb replace $h2_mac1 dev vx0 master static vlan $vid1" +- run_cmd "bridge -n sw1 fdb replace $h2_mac2 dev vx0 master static vlan $vid2" +- run_cmd "ip -n sw1 neigh replace $tip1 lladdr $h2_mac1 nud permanent dev br0.$vid1" +- run_cmd "ip -n sw1 neigh replace $tip2 lladdr $h2_mac2 nud permanent dev br0.$vid2" ++ h2_mac1=$(ip -n $h2 -j -p link show eth0.$vid1 | jq -r '.[]["address"]') ++ h2_mac2=$(ip -n $h2 -j -p link show eth0.$vid2 | jq -r '.[]["address"]') ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac1 dev vx0 master static vlan $vid1" ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac2 dev vx0 master static vlan $vid2" ++ run_cmd "ip -n $sw1 neigh replace $tip1 lladdr $h2_mac1 nud permanent dev br0.$vid1" ++ run_cmd "ip -n $sw1 neigh replace $tip2 lladdr $h2_mac2 nud permanent dev br0.$vid2" + + # Enable per-{Port, VLAN} neighbor suppression and check that ARP + # requests are not suppressed and that ARP replies 
are received. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_vlan_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_vlan_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress on\"" + log_test $? 0 "\"neigh_vlan_suppress\" is on" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 1 ++ tc_check_packets $sw1 "dev vx0 egress" 102 1 + log_test $? 0 "ARP suppression (VLAN $vid2)" + + # Enable neighbor suppression on VLAN 10 and check that only on this + # VLAN ARP requests are suppressed. +- run_cmd "bridge -n sw1 vlan set vid $vid1 dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 vlan set vid $vid1 dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on (VLAN $vid1)" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid2 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid2 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off (VLAN $vid2)" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 
0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 2 ++ tc_check_packets $sw1 "dev vx0 egress" 102 2 + log_test $? 0 "ARP suppression (VLAN $vid2)" + + # Enable neighbor suppression on the port and check that it has no + # effect compared to previous state. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 3 ++ tc_check_packets $sw1 "dev vx0 egress" 102 3 + log_test $? 0 "ARP suppression (VLAN $vid2)" + + # Disable neighbor suppression on the port and check that it has no + # effect compared to previous state. 
+- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 4 ++ tc_check_packets $sw1 "dev vx0 egress" 102 4 + log_test $? 0 "ARP suppression (VLAN $vid2)" + + # Disable neighbor suppression on VLAN 10 and check that ARP requests + # are no longer suppressed on this VLAN. +- run_cmd "bridge -n sw1 vlan set vid $vid1 dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 vlan set vid $vid1 dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off (VLAN $vid1)" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 
0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 5 ++ tc_check_packets $sw1 "dev vx0 egress" 102 5 + log_test $? 0 "ARP suppression (VLAN $vid2)" + + # Disable per-{Port, VLAN} neighbor suppression, enable neighbor + # suppression on the port and check that on both VLANs ARP requests are + # suppressed. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_vlan_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_vlan_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress off\"" + log_test $? 0 "\"neigh_vlan_suppress\" is off" + +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip1 -I eth0.$vid1 $tip1" + log_test $? 0 "arping (VLAN $vid1)" +- run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" ++ run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip2 -I eth0.$vid2 $tip2" + log_test $? 0 "arping (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "ARP suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 5 ++ tc_check_packets $sw1 "dev vx0 egress" 102 5 + log_test $? 
0 "ARP suppression (VLAN $vid2)" + } + +@@ -655,118 +648,118 @@ neigh_vlan_suppress_ns() + echo "Per-{Port, VLAN} NS suppression" + echo "-------------------------------" + +- run_cmd "tc -n sw1 qdisc replace dev vx0 clsact" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr1 type 135 code 0 action pass" +- run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 102 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr2 type 135 code 0 action pass" ++ run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr1 type 135 code 0 action pass" ++ run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 102 proto ipv6 flower indev swp1 ip_proto icmpv6 dst_ip $maddr src_ip $saddr2 type 135 code 0 action pass" + +- h2_mac1=$(ip -n h2 -j -p link show eth0.$vid1 | jq -r '.[]["address"]') +- h2_mac2=$(ip -n h2 -j -p link show eth0.$vid2 | jq -r '.[]["address"]') +- run_cmd "bridge -n sw1 fdb replace $h2_mac1 dev vx0 master static vlan $vid1" +- run_cmd "bridge -n sw1 fdb replace $h2_mac2 dev vx0 master static vlan $vid2" +- run_cmd "ip -n sw1 neigh replace $daddr1 lladdr $h2_mac1 nud permanent dev br0.$vid1" +- run_cmd "ip -n sw1 neigh replace $daddr2 lladdr $h2_mac2 nud permanent dev br0.$vid2" ++ h2_mac1=$(ip -n $h2 -j -p link show eth0.$vid1 | jq -r '.[]["address"]') ++ h2_mac2=$(ip -n $h2 -j -p link show eth0.$vid2 | jq -r '.[]["address"]') ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac1 dev vx0 master static vlan $vid1" ++ run_cmd "bridge -n $sw1 fdb replace $h2_mac2 dev vx0 master static vlan $vid2" ++ run_cmd "ip -n $sw1 neigh replace $daddr1 lladdr $h2_mac1 nud permanent dev br0.$vid1" ++ run_cmd "ip -n $sw1 neigh replace $daddr2 lladdr $h2_mac2 nud permanent dev br0.$vid2" + + # Enable per-{Port, VLAN} neighbor 
suppression and check that NS + # messages are not suppressed and that ND messages are received. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_vlan_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_vlan_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress on\"" + log_test $? 0 "\"neigh_vlan_suppress\" is on" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 1 ++ tc_check_packets $sw1 "dev vx0 egress" 102 1 + log_test $? 0 "NS suppression (VLAN $vid2)" + + # Enable neighbor suppression on VLAN 10 and check that only on this + # VLAN NS messages are suppressed. +- run_cmd "bridge -n sw1 vlan set vid $vid1 dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 vlan set vid $vid1 dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on (VLAN $vid1)" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid2 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid2 | grep \"neigh_suppress off\"" + log_test $? 
0 "\"neigh_suppress\" is off (VLAN $vid2)" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 2 ++ tc_check_packets $sw1 "dev vx0 egress" 102 2 + log_test $? 0 "NS suppression (VLAN $vid2)" + + # Enable neighbor suppression on the port and check that it has no + # effect compared to previous state. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 3 ++ tc_check_packets $sw1 "dev vx0 egress" 102 3 + log_test $? 0 "NS suppression (VLAN $vid2)" + + # Disable neighbor suppression on the port and check that it has no + # effect compared to previous state. 
+- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 1 ++ tc_check_packets $sw1 "dev vx0 egress" 101 1 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 4 ++ tc_check_packets $sw1 "dev vx0 egress" 102 4 + log_test $? 0 "NS suppression (VLAN $vid2)" + + # Disable neighbor suppression on VLAN 10 and check that NS messages + # are no longer suppressed on this VLAN. +- run_cmd "bridge -n sw1 vlan set vid $vid1 dev vx0 neigh_suppress off" +- run_cmd "bridge -n sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress off\"" ++ run_cmd "bridge -n $sw1 vlan set vid $vid1 dev vx0 neigh_suppress off" ++ run_cmd "bridge -n $sw1 -d vlan show dev vx0 vid $vid1 | grep \"neigh_suppress off\"" + log_test $? 0 "\"neigh_suppress\" is off (VLAN $vid1)" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 
0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 5 ++ tc_check_packets $sw1 "dev vx0 egress" 102 5 + log_test $? 0 "NS suppression (VLAN $vid2)" + + # Disable per-{Port, VLAN} neighbor suppression, enable neighbor + # suppression on the port and check that on both VLANs NS messages are + # suppressed. +- run_cmd "bridge -n sw1 link set dev vx0 neigh_vlan_suppress off" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress off\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_vlan_suppress off" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_vlan_suppress off\"" + log_test $? 0 "\"neigh_vlan_suppress\" is off" + +- run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on" +- run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" ++ run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on" ++ run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\"" + log_test $? 0 "\"neigh_suppress\" is on" + +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr1 -w 5000 $daddr1 eth0.$vid1" + log_test $? 0 "ndisc6 (VLAN $vid1)" +- run_cmd "ip netns exec h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" ++ run_cmd "ip netns exec $h1 ndisc6 -q -r 1 -s $saddr2 -w 5000 $daddr2 eth0.$vid2" + log_test $? 0 "ndisc6 (VLAN $vid2)" + +- tc_check_packets sw1 "dev vx0 egress" 101 2 ++ tc_check_packets $sw1 "dev vx0 egress" 101 2 + log_test $? 0 "NS suppression (VLAN $vid1)" +- tc_check_packets sw1 "dev vx0 egress" 102 5 ++ tc_check_packets $sw1 "dev vx0 egress" 102 5 + log_test $? 
0 "NS suppression (VLAN $vid2)" + } + +-- +2.43.0 + diff --git a/queue-6.6/selftests-test_bridge_neigh_suppress.sh-fix-failures.patch b/queue-6.6/selftests-test_bridge_neigh_suppress.sh-fix-failures.patch new file mode 100644 index 00000000000..1f59509db8b --- /dev/null +++ b/queue-6.6/selftests-test_bridge_neigh_suppress.sh-fix-failures.patch @@ -0,0 +1,62 @@ +From a1389dc610b281e5a927ca276e099e9f3c9de5c2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 7 May 2024 14:30:33 +0300 +Subject: selftests: test_bridge_neigh_suppress.sh: Fix failures due to + duplicate MAC + +From: Ido Schimmel + +[ Upstream commit 9a169c267e946b0f47f67e8ccc70134708ccf3d4 ] + +When creating the topology for the test, three veth pairs are created in +the initial network namespace before being moved to one of the network +namespaces created by the test. + +On systems where systemd-udev uses MACAddressPolicy=persistent (default +since systemd version 242), this will result in some net devices having +the same MAC address since they were created with the same name in the +initial network namespace. In turn, this leads to arping / ndisc6 +failing since packets are dropped by the bridge's loopback filter. + +Fix by creating each net device in the correct network namespace instead +of moving it there from the initial network namespace. 
+ +Reported-by: Jakub Kicinski +Closes: https://lore.kernel.org/netdev/20240426074015.251854d4@kernel.org/ +Fixes: 7648ac72dcd7 ("selftests: net: Add bridge neighbor suppression test") +Signed-off-by: Ido Schimmel +Link: https://lore.kernel.org/r/20240507113033.1732534-1-idosch@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + .../selftests/net/test_bridge_neigh_suppress.sh | 14 +++----------- + 1 file changed, 3 insertions(+), 11 deletions(-) + +diff --git a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh +index 8533393a4f186..02b986c9c247d 100755 +--- a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh ++++ b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh +@@ -154,17 +154,9 @@ setup_topo() + setup_topo_ns $ns + done + +- ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns $h1 name eth0 +- ip link set dev veth1 netns $sw1 name swp1 +- +- ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns $sw1 name veth0 +- ip link set dev veth1 netns $sw2 name veth0 +- +- ip link add name veth0 type veth peer name veth1 +- ip link set dev veth0 netns $h2 name eth0 +- ip link set dev veth1 netns $sw2 name swp1 ++ ip -n $h1 link add name eth0 type veth peer name swp1 netns $sw1 ++ ip -n $sw1 link add name veth0 type veth peer name veth0 netns $sw2 ++ ip -n $h2 link add name eth0 type veth peer name swp1 netns $sw2 + } + + setup_host_common() +-- +2.43.0 + diff --git a/queue-6.6/series b/queue-6.6/series index ad60531a514..4965ffffc2f 100644 --- a/queue-6.6/series +++ b/queue-6.6/series @@ -172,3 +172,41 @@ drm-radeon-silence-ubsan-warning-v3.patch net-usb-qmi_wwan-support-rolling-modules.patch blk-iocost-do-not-warn-if-iocg-was-already-offlined.patch sunrpc-add-a-missing-rpc_stat-for-tcp-tls.patch +qibfs-fix-dentry-leak.patch +xfrm-preserve-vlan-tags-for-transport-mode-software-.patch 
+arm-9381-1-kasan-clear-stale-stack-poison.patch +tcp-defer-shutdown-send_shutdown-for-tcp_syn_recv-so.patch +tcp-use-refcount_inc_not_zero-in-tcp_twsk_unique.patch +bluetooth-fix-use-after-free-bugs-caused-by-sco_sock.patch +bluetooth-msft-fix-slab-use-after-free-in-msft_do_cl.patch +bluetooth-hci-fix-potential-null-ptr-deref.patch +bluetooth-l2cap-fix-null-ptr-deref-in-l2cap_chan_tim.patch +net-ks8851-queue-rx-packets-in-irq-handler-instead-o.patch +rtnetlink-correct-nested-ifla_vf_vlan_list-attribute.patch +hwmon-corsair-cpro-use-a-separate-buffer-for-sending.patch +hwmon-corsair-cpro-use-complete_all-instead-of-compl.patch +hwmon-corsair-cpro-protect-ccp-wait_input_report-wit.patch +phonet-fix-rtm_phonet_notify-skb-allocation.patch +nfc-nci-fix-kcov-check-in-nci_rx_work.patch +net-bridge-fix-corrupted-ethernet-header-on-multicas.patch +ipv6-fix-potential-uninit-value-access-in-__ip6_make.patch +selftests-net-convert-test_bridge_neigh_suppress.sh-.patch +selftests-test_bridge_neigh_suppress.sh-fix-failures.patch +rxrpc-fix-the-names-of-the-fields-in-the-ack-trailer.patch +rxrpc-fix-congestion-control-algorithm.patch +rxrpc-only-transmit-one-ack-per-jumbo-packet-receive.patch +dt-bindings-net-mediatek-remove-wrongly-added-clocks.patch +ipv6-fib6_rules-avoid-possible-null-dereference-in-f.patch +net-sysfs-convert-dev-operstate-reads-to-lockless-on.patch +hsr-simplify-code-for-announcing-hsr-nodes-timer-set.patch +ipv6-annotate-data-races-around-cnf.disable_ipv6.patch +ipv6-prevent-null-dereference-in-ip6_output.patch +net-smc-fix-neighbour-and-rtable-leak-in-smc_ib_find.patch +net-hns3-using-user-configure-after-hardware-reset.patch +net-hns3-direct-return-when-receive-a-unknown-mailbo.patch +net-hns3-change-type-of-numa_node_mask-as-nodemask_t.patch +net-hns3-release-ptp-resources-if-pf-initialization-.patch +net-hns3-use-appropriate-barrier-function-after-sett.patch +net-hns3-fix-port-vlan-filter-not-disabled-issue.patch 
+net-hns3-fix-kernel-crash-when-devlink-reload-during.patch +net-dsa-mv88e6xxx-add-phylink_get_caps-for-the-mv88e.patch diff --git a/queue-6.6/tcp-defer-shutdown-send_shutdown-for-tcp_syn_recv-so.patch b/queue-6.6/tcp-defer-shutdown-send_shutdown-for-tcp_syn_recv-so.patch new file mode 100644 index 00000000000..8e5c4a738fb --- /dev/null +++ b/queue-6.6/tcp-defer-shutdown-send_shutdown-for-tcp_syn_recv-so.patch @@ -0,0 +1,145 @@ +From 47736274421d290365f34e199df5b0d9b022486b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 1 May 2024 12:54:48 +0000 +Subject: tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets + +From: Eric Dumazet + +[ Upstream commit 94062790aedb505bdda209b10bea47b294d6394f ] + +TCP_SYN_RECV state is really special, it is only used by +cross-syn connections, mostly used by fuzzers. + +In the following crash [1], syzbot managed to trigger a divide +by zero in tcp_rcv_space_adjust() + +A socket makes the following state transitions, +without ever calling tcp_init_transfer(), +meaning tcp_init_buffer_space() is also not called. + + TCP_CLOSE +connect() + TCP_SYN_SENT + TCP_SYN_RECV +shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) + TCP_FIN_WAIT1 + +To fix this issue, change tcp_shutdown() to not +perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, +which makes no sense anyway. + +When tcp_rcv_state_process() later changes socket state +from TCP_SYN_RECV to TCP_ESTABLISH, then look at +sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, +and send a FIN packet from a sane socket state. + +This means tcp_send_fin() can now be called from BH +context, and must use GFP_ATOMIC allocations. 
+ +[1] +divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI +CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 + RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 +Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 +RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 +RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 +RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 +RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 +R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 +R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da +FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 +Call Trace: + + tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 + tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 + inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 + sock_recvmsg_nosec net/socket.c:1046 [inline] + sock_recvmsg+0x109/0x280 net/socket.c:1068 + ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 + ___sys_recvmsg net/socket.c:2845 [inline] + do_recvmmsg+0x474/0xae0 net/socket.c:2939 + __sys_recvmmsg net/socket.c:3018 [inline] + __do_sys_recvmmsg net/socket.c:3041 [inline] + __se_sys_recvmmsg net/socket.c:3034 [inline] + __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 + do_syscall_x64 arch/x86/entry/common.c:52 [inline] + do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 + entry_SYSCALL_64_after_hwframe+0x77/0x7f +RIP: 0033:0x7faeb6363db9 +Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 
ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 +RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b +RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 +RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 +RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c +R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 +R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 + +Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") +Reported-by: syzbot +Signed-off-by: Eric Dumazet +Acked-by: Neal Cardwell +Link: https://lore.kernel.org/r/20240501125448.896529-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/ipv4/tcp.c | 4 ++-- + net/ipv4/tcp_input.c | 2 ++ + net/ipv4/tcp_output.c | 4 +++- + 3 files changed, 7 insertions(+), 3 deletions(-) + +diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c +index f8df35f7352a5..7bf774bdb9386 100644 +--- a/net/ipv4/tcp.c ++++ b/net/ipv4/tcp.c +@@ -2710,7 +2710,7 @@ void tcp_shutdown(struct sock *sk, int how) + /* If we've already sent a FIN, or it's a closed state, skip this. */ + if ((1 << sk->sk_state) & + (TCPF_ESTABLISHED | TCPF_SYN_SENT | +- TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) { ++ TCPF_CLOSE_WAIT)) { + /* Clear out any half completed packets. FIN if needed. */ + if (tcp_close_state(sk)) + tcp_send_fin(sk); +@@ -2819,7 +2819,7 @@ void __tcp_close(struct sock *sk, long timeout) + * machine. State transitions: + * + * TCP_ESTABLISHED -> TCP_FIN_WAIT1 +- * TCP_SYN_RECV -> TCP_FIN_WAIT1 (forget it, it's impossible) ++ * TCP_SYN_RECV -> TCP_FIN_WAIT1 (it is difficult) + * TCP_CLOSE_WAIT -> TCP_LAST_ACK + * + * are legal only when FIN has been sent (i.e. 
in window), +diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c +index e6c4929549428..f938442b202d7 100644 +--- a/net/ipv4/tcp_input.c ++++ b/net/ipv4/tcp_input.c +@@ -6627,6 +6627,8 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) + + tcp_initialize_rcv_mss(sk); + tcp_fast_path_on(tp); ++ if (sk->sk_shutdown & SEND_SHUTDOWN) ++ tcp_shutdown(sk, SEND_SHUTDOWN); + break; + + case TCP_FIN_WAIT1: { +diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c +index ab3b7b4b4429b..5631041ae12cb 100644 +--- a/net/ipv4/tcp_output.c ++++ b/net/ipv4/tcp_output.c +@@ -3533,7 +3533,9 @@ void tcp_send_fin(struct sock *sk) + return; + } + } else { +- skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation); ++ skb = alloc_skb_fclone(MAX_TCP_HEADER, ++ sk_gfp_mask(sk, GFP_ATOMIC | ++ __GFP_NOWARN)); + if (unlikely(!skb)) + return; + +-- +2.43.0 + diff --git a/queue-6.6/tcp-use-refcount_inc_not_zero-in-tcp_twsk_unique.patch b/queue-6.6/tcp-use-refcount_inc_not_zero-in-tcp_twsk_unique.patch new file mode 100644 index 00000000000..ef334b0db19 --- /dev/null +++ b/queue-6.6/tcp-use-refcount_inc_not_zero-in-tcp_twsk_unique.patch @@ -0,0 +1,118 @@ +From a33abf620608b922d8af369143700d9633432e14 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 1 May 2024 14:31:45 -0700 +Subject: tcp: Use refcount_inc_not_zero() in tcp_twsk_unique(). + +From: Kuniyuki Iwashima + +[ Upstream commit f2db7230f73a80dbb179deab78f88a7947f0ab7e ] + +Anderson Nascimento reported a use-after-free splat in tcp_twsk_unique() +with nice analysis. + +Since commit ec94c2696f0b ("tcp/dccp: avoid one atomic operation for +timewait hashdance"), inet_twsk_hashdance() sets TIME-WAIT socket's +sk_refcnt after putting it into ehash and releasing the bucket lock. + +Thus, there is a small race window where other threads could try to +reuse the port during connect() and call sock_hold() in tcp_twsk_unique() +for the TIME-WAIT socket with zero refcnt. 
+ +If that happens, the refcnt taken by tcp_twsk_unique() is overwritten +and sock_put() will cause underflow, triggering a real use-after-free +somewhere else. + +To avoid the use-after-free, we need to use refcount_inc_not_zero() in +tcp_twsk_unique() and give up on reusing the port if it returns false. + +[0]: +refcount_t: addition on 0; use-after-free. +WARNING: CPU: 0 PID: 1039313 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110 +CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1 +Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023 +RIP: 0010:refcount_warn_saturate+0xe5/0x110 +Code: 42 8e ff 0f 0b c3 cc cc cc cc 80 3d aa 13 ea 01 00 0f 85 5e ff ff ff 48 c7 c7 f8 8e b7 82 c6 05 96 13 ea 01 01 e8 7b 42 8e ff <0f> 0b c3 cc cc cc cc 48 c7 c7 50 8f b7 82 c6 05 7a 13 ea 01 01 e8 +RSP: 0018:ffffc90006b43b60 EFLAGS: 00010282 +RAX: 0000000000000000 RBX: ffff888009bb3ef0 RCX: 0000000000000027 +RDX: ffff88807be218c8 RSI: 0000000000000001 RDI: ffff88807be218c0 +RBP: 0000000000069d70 R08: 0000000000000000 R09: ffffc90006b439f0 +R10: ffffc90006b439e8 R11: 0000000000000003 R12: ffff8880029ede84 +R13: 0000000000004e20 R14: ffffffff84356dc0 R15: ffff888009bb3ef0 +FS: 00007f62c10926c0(0000) GS:ffff88807be00000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000000020ccb000 CR3: 000000004628c005 CR4: 0000000000f70ef0 +PKRU: 55555554 +Call Trace: + + ? refcount_warn_saturate+0xe5/0x110 + ? __warn+0x81/0x130 + ? refcount_warn_saturate+0xe5/0x110 + ? report_bug+0x171/0x1a0 + ? refcount_warn_saturate+0xe5/0x110 + ? handle_bug+0x3c/0x80 + ? exc_invalid_op+0x17/0x70 + ? asm_exc_invalid_op+0x1a/0x20 + ? refcount_warn_saturate+0xe5/0x110 + tcp_twsk_unique+0x186/0x190 + __inet_check_established+0x176/0x2d0 + __inet_hash_connect+0x74/0x7d0 + ? 
__pfx___inet_check_established+0x10/0x10 + tcp_v4_connect+0x278/0x530 + __inet_stream_connect+0x10f/0x3d0 + inet_stream_connect+0x3a/0x60 + __sys_connect+0xa8/0xd0 + __x64_sys_connect+0x18/0x20 + do_syscall_64+0x83/0x170 + entry_SYSCALL_64_after_hwframe+0x78/0x80 +RIP: 0033:0x7f62c11a885d +Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48 +RSP: 002b:00007f62c1091e58 EFLAGS: 00000296 ORIG_RAX: 000000000000002a +RAX: ffffffffffffffda RBX: 0000000020ccb004 RCX: 00007f62c11a885d +RDX: 0000000000000010 RSI: 0000000020ccb000 RDI: 0000000000000003 +RBP: 00007f62c1091e90 R08: 0000000000000000 R09: 0000000000000000 +R10: 0000000000000000 R11: 0000000000000296 R12: 00007f62c10926c0 +R13: ffffffffffffff88 R14: 0000000000000000 R15: 00007ffe237885b0 + + +Fixes: ec94c2696f0b ("tcp/dccp: avoid one atomic operation for timewait hashdance") +Reported-by: Anderson Nascimento +Closes: https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/ +Suggested-by: Eric Dumazet +Signed-off-by: Kuniyuki Iwashima +Reviewed-by: Eric Dumazet +Link: https://lore.kernel.org/r/20240501213145.62261-1-kuniyu@amazon.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/ipv4/tcp_ipv4.c | 8 +++++++- + 1 file changed, 7 insertions(+), 1 deletion(-) + +diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c +index c7ffab37a34cd..c464ced7137ee 100644 +--- a/net/ipv4/tcp_ipv4.c ++++ b/net/ipv4/tcp_ipv4.c +@@ -154,6 +154,12 @@ int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp) + if (tcptw->tw_ts_recent_stamp && + (!twp || (reuse && time_after32(ktime_get_seconds(), + tcptw->tw_ts_recent_stamp)))) { ++ /* inet_twsk_hashdance() sets sk_refcnt after putting twsk ++ * and releasing the bucket lock. 
++ */ ++ if (unlikely(!refcount_inc_not_zero(&sktw->sk_refcnt))) ++ return 0; ++ + /* In case of repair and re-using TIME-WAIT sockets we still + * want to be sure that it is safe as above but honor the + * sequence numbers and time stamps set as part of the repair +@@ -174,7 +180,7 @@ int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp) + tp->rx_opt.ts_recent = tcptw->tw_ts_recent; + tp->rx_opt.ts_recent_stamp = tcptw->tw_ts_recent_stamp; + } +- sock_hold(sktw); ++ + return 1; + } + +-- +2.43.0 + diff --git a/queue-6.6/xfrm-preserve-vlan-tags-for-transport-mode-software-.patch b/queue-6.6/xfrm-preserve-vlan-tags-for-transport-mode-software-.patch new file mode 100644 index 00000000000..cade6a6da29 --- /dev/null +++ b/queue-6.6/xfrm-preserve-vlan-tags-for-transport-mode-software-.patch @@ -0,0 +1,153 @@ +From 61cf6047a620ec0b06ad1fa0b55a6e9f97d47f3f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Apr 2024 18:00:24 +1200 +Subject: xfrm: Preserve vlan tags for transport mode software GRO + +From: Paul Davey + +[ Upstream commit 58fbfecab965014b6e3cc956a76b4a96265a1add ] + +The software GRO path for esp transport mode uses skb_mac_header_rebuild +prior to re-injecting the packet via the xfrm_napi_dev. This only +copies skb->mac_len bytes of header which may not be sufficient if the +packet contains 802.1Q tags or other VLAN tags. Worse copying only the +initial header will leave a packet marked as being VLAN tagged but +without the corresponding tag leading to mangling when it is later +untagged. + +The VLAN tags are important when receiving the decrypted esp transport +mode packet after GRO processing to ensure it is received on the correct +interface. + +Therefore record the full mac header length in xfrm*_transport_input for +later use in corresponding xfrm*_transport_finish to copy the entire mac +header when rebuilding the mac header for GRO. 
The skb->data pointer is +left pointing skb->mac_header bytes after the start of the mac header as +is expected by the network stack and network and transport header +offsets reset to this location. + +Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") +Signed-off-by: Paul Davey +Signed-off-by: Steffen Klassert +Signed-off-by: Sasha Levin +--- + include/linux/skbuff.h | 15 +++++++++++++++ + include/net/xfrm.h | 3 +++ + net/ipv4/xfrm4_input.c | 6 +++++- + net/ipv6/xfrm6_input.c | 6 +++++- + net/xfrm/xfrm_input.c | 8 ++++++++ + 5 files changed, 36 insertions(+), 2 deletions(-) + +diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h +index 7d54808a1e8f9..5f11f98733419 100644 +--- a/include/linux/skbuff.h ++++ b/include/linux/skbuff.h +@@ -2962,6 +2962,21 @@ static inline void skb_mac_header_rebuild(struct sk_buff *skb) + } + } + ++/* Move the full mac header up to current network_header. ++ * Leaves skb->data pointing at offset skb->mac_len into the mac_header. ++ * Must be provided the complete mac header length. 
++ */ ++static inline void skb_mac_header_rebuild_full(struct sk_buff *skb, u32 full_mac_len) ++{ ++ if (skb_mac_header_was_set(skb)) { ++ const unsigned char *old_mac = skb_mac_header(skb); ++ ++ skb_set_mac_header(skb, -full_mac_len); ++ memmove(skb_mac_header(skb), old_mac, full_mac_len); ++ __skb_push(skb, full_mac_len - skb->mac_len); ++ } ++} ++ + static inline int skb_checksum_start_offset(const struct sk_buff *skb) + { + return skb->csum_start - skb_headroom(skb); +diff --git a/include/net/xfrm.h b/include/net/xfrm.h +index 363c7d5105542..a3fd2cfed5e33 100644 +--- a/include/net/xfrm.h ++++ b/include/net/xfrm.h +@@ -1047,6 +1047,9 @@ struct xfrm_offload { + #define CRYPTO_INVALID_PACKET_SYNTAX 64 + #define CRYPTO_INVALID_PROTOCOL 128 + ++ /* Used to keep whole l2 header for transport mode GRO */ ++ __u32 orig_mac_len; ++ + __u8 proto; + __u8 inner_ipproto; + }; +diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c +index 183f6dc372429..f6e90ba50b639 100644 +--- a/net/ipv4/xfrm4_input.c ++++ b/net/ipv4/xfrm4_input.c +@@ -61,7 +61,11 @@ int xfrm4_transport_finish(struct sk_buff *skb, int async) + ip_send_check(iph); + + if (xo && (xo->flags & XFRM_GRO)) { +- skb_mac_header_rebuild(skb); ++ /* The full l2 header needs to be preserved so that re-injecting the packet at l2 ++ * works correctly in the presence of vlan tags. 
++ */ ++ skb_mac_header_rebuild_full(skb, xo->orig_mac_len); ++ skb_reset_network_header(skb); + skb_reset_transport_header(skb); + return 0; + } +diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c +index 4156387248e40..8432b50d9ce4c 100644 +--- a/net/ipv6/xfrm6_input.c ++++ b/net/ipv6/xfrm6_input.c +@@ -56,7 +56,11 @@ int xfrm6_transport_finish(struct sk_buff *skb, int async) + skb_postpush_rcsum(skb, skb_network_header(skb), nhlen); + + if (xo && (xo->flags & XFRM_GRO)) { +- skb_mac_header_rebuild(skb); ++ /* The full l2 header needs to be preserved so that re-injecting the packet at l2 ++ * works correctly in the presence of vlan tags. ++ */ ++ skb_mac_header_rebuild_full(skb, xo->orig_mac_len); ++ skb_reset_network_header(skb); + skb_reset_transport_header(skb); + return 0; + } +diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c +index d5ee96789d4bf..0c08bac3ed269 100644 +--- a/net/xfrm/xfrm_input.c ++++ b/net/xfrm/xfrm_input.c +@@ -388,11 +388,15 @@ static int xfrm_prepare_input(struct xfrm_state *x, struct sk_buff *skb) + */ + static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb) + { ++ struct xfrm_offload *xo = xfrm_offload(skb); + int ihl = skb->data - skb_transport_header(skb); + + if (skb->transport_header != skb->network_header) { + memmove(skb_transport_header(skb), + skb_network_header(skb), ihl); ++ if (xo) ++ xo->orig_mac_len = ++ skb_mac_header_was_set(skb) ? 
skb_mac_header_len(skb) : 0; + skb->network_header = skb->transport_header; + } + ip_hdr(skb)->tot_len = htons(skb->len + ihl); +@@ -403,11 +407,15 @@ static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb) + static int xfrm6_transport_input(struct xfrm_state *x, struct sk_buff *skb) + { + #if IS_ENABLED(CONFIG_IPV6) ++ struct xfrm_offload *xo = xfrm_offload(skb); + int ihl = skb->data - skb_transport_header(skb); + + if (skb->transport_header != skb->network_header) { + memmove(skb_transport_header(skb), + skb_network_header(skb), ihl); ++ if (xo) ++ xo->orig_mac_len = ++ skb_mac_header_was_set(skb) ? skb_mac_header_len(skb) : 0; + skb->network_header = skb->transport_header; + } + ipv6_hdr(skb)->payload_len = htons(skb->len + ihl - +-- +2.43.0 +