From 299da1b8157a9d5c11e0e54dd928eb137d2cb6b4 Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Mon, 15 Apr 2024 04:56:17 -0400 Subject: [PATCH] Fixes for 5.15 Signed-off-by: Sasha Levin --- .../af_unix-clear-stale-u-oob_skb.patch | 104 ++++ ...se-atomic-ops-for-unix_sk-sk-infligh.patch | 147 ++++++ ...age-collector-racing-against-connect.patch | 122 +++++ ...s-conn-fix-usdhc-wrong-lpcg-clock-or.patch | 95 ++++ ...der-validation-in-geneve-6-_xmit_skb.patch | 166 ++++++ ...ate-local-memory-for-page-request-qu.patch | 39 ++ ...void-unused-but-set-variable-warning.patch | 51 ++ .../ipv6-fib-hide-unused-pn-variable.patch | 60 +++ ...ndition-between-ipv6_get_ifaddr-and-.patch | 133 +++++ ...rap-link-local-frames-regardless-of-.patch | 495 ++++++++++++++++++ ...x-incorrect-descriptor-free-behavior.patch | 72 +++ ...a-fix-potential-sign-extension-issue.patch | 66 +++ ...g-missing-io-completions-check-order.patch | 108 ++++ ...erly-link-new-fs-rules-into-the-tree.patch | 66 +++ ...fix-unwanted-error-log-on-timeout-po.patch | 60 +++ ...rong-config-being-used-when-reconfig.patch | 47 ++ ...er-complete-validation-of-user-input.patch | 102 ++++ .../nouveau-fix-function-cast-warning.patch | 51 ++ ...tx2-af-fix-nix-sq-mode-and-bp-config.patch | 59 +++ ...vert-drm-qxl-simplify-qxl_fence_wait.patch | 115 ++++ ...-off-by-one-in-qla_edif_app_getstats.patch | 39 ++ queue-5.15/series | 25 + ...ce_record_recursion_size-kconfig-ent.patch | 42 ++ ...ing-hide-unused-ftrace_event_id_fops.patch | 76 +++ ...e-preemption-on-32bit-up-smp-preempt.patch | 164 ++++++ ...r-input-for-xdp_-umem-completion-_fi.patch | 176 +++++++ 26 files changed, 2680 insertions(+) create mode 100644 queue-5.15/af_unix-clear-stale-u-oob_skb.patch create mode 100644 queue-5.15/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch create mode 100644 queue-5.15/af_unix-fix-garbage-collector-racing-against-connect.patch create mode 100644 queue-5.15/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch create 
mode 100644 queue-5.15/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch create mode 100644 queue-5.15/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch create mode 100644 queue-5.15/ipv4-route-avoid-unused-but-set-variable-warning.patch create mode 100644 queue-5.15/ipv6-fib-hide-unused-pn-variable.patch create mode 100644 queue-5.15/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch create mode 100644 queue-5.15/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch create mode 100644 queue-5.15/net-ena-fix-incorrect-descriptor-free-behavior.patch create mode 100644 queue-5.15/net-ena-fix-potential-sign-extension-issue.patch create mode 100644 queue-5.15/net-ena-wrong-missing-io-completions-check-order.patch create mode 100644 queue-5.15/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch create mode 100644 queue-5.15/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch create mode 100644 queue-5.15/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch create mode 100644 queue-5.15/netfilter-complete-validation-of-user-input.patch create mode 100644 queue-5.15/nouveau-fix-function-cast-warning.patch create mode 100644 queue-5.15/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch create mode 100644 queue-5.15/revert-drm-qxl-simplify-qxl_fence_wait.patch create mode 100644 queue-5.15/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch create mode 100644 queue-5.15/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch create mode 100644 queue-5.15/tracing-hide-unused-ftrace_event_id_fops.patch create mode 100644 queue-5.15/u64_stats-disable-preemption-on-32bit-up-smp-preempt.patch create mode 100644 queue-5.15/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch diff --git a/queue-5.15/af_unix-clear-stale-u-oob_skb.patch b/queue-5.15/af_unix-clear-stale-u-oob_skb.patch new file mode 100644 index 00000000000..79ad483689a --- /dev/null +++ b/queue-5.15/af_unix-clear-stale-u-oob_skb.patch @@ -0,0 +1,104 @@ 
+From 2aa09962770120cc9d4553f20ac5a1762ab2dff5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 5 Apr 2024 15:10:57 -0700 +Subject: af_unix: Clear stale u->oob_skb. + +From: Kuniyuki Iwashima + +[ Upstream commit b46f4eaa4f0ec38909fb0072eea3aeddb32f954e ] + +syzkaller started to report deadlock of unix_gc_lock after commit +4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but +it just uncovers the bug that has been there since commit 314001f0bf92 +("af_unix: Add OOB support"). + +The repro basically does the following. + + from socket import * + from array import array + + c1, c2 = socketpair(AF_UNIX, SOCK_STREAM) + c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB) + c2.recv(1) # blocked as no normal data in recv queue + + c2.close() # done async and unblock recv() + c1.close() # done async and trigger GC + +A socket sends its file descriptor to itself as OOB data and tries to +receive normal data, but finally recv() fails due to async close(). + +The problem here is wrong handling of OOB skb in manage_oob(). When +recvmsg() is called without MSG_OOB, manage_oob() is called to check +if the peeked skb is OOB skb. In such a case, manage_oob() pops it +out of the receive queue but does not clear unix_sock(sk)->oob_skb. +This is wrong in terms of uAPI. + +Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB. +The 'o' is handled as OOB data. When recv() is called twice without +MSG_OOB, the OOB data should be lost. + + >>> from socket import * + >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0) + >>> c1.send(b'hello', MSG_OOB) # 'o' is OOB data + 5 + >>> c1.send(b'world') + 5 + >>> c2.recv(5) # OOB data is not received + b'hell' + >>> c2.recv(5) # OOB date is skipped + b'world' + >>> c2.recv(5, MSG_OOB) # This should return an error + b'o' + +In the same situation, TCP actually returns -EINVAL for the last +recv(). 
+ +Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always set +EPOLLPRI even though the data has passed through by previous recv(). + +To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing +it from recv queue. + +The reason why the old GC did not trigger the deadlock is because the +old GC relied on the receive queue to detect the loop. + +When it is triggered, the socket with OOB data is marked as GC candidate +because file refcount == inflight count (1). However, after traversing +all inflight sockets, the socket still has a positive inflight count (1), +thus the socket is excluded from candidates. Then, the old GC lose the +chance to garbage-collect the socket. + +With the old GC, the repro continues to create true garbage that will +never be freed nor detected by kmemleak as it's linked to the global +inflight list. That's why we couldn't even notice the issue. + +Fixes: 314001f0bf92 ("af_unix: Add OOB support") +Reported-by: syzbot+7f7f201cc2668a8fd169@syzkaller.appspotmail.com +Closes: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169 +Signed-off-by: Kuniyuki Iwashima +Reviewed-by: Eric Dumazet +Link: https://lore.kernel.org/r/20240405221057.2406-1-kuniyu@amazon.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/unix/af_unix.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c +index 265dc665c92a2..27a88a738793f 100644 +--- a/net/unix/af_unix.c ++++ b/net/unix/af_unix.c +@@ -2567,7 +2567,9 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk, + } + } else if (!(flags & MSG_PEEK)) { + skb_unlink(skb, &sk->sk_receive_queue); +- consume_skb(skb); ++ WRITE_ONCE(u->oob_skb, NULL); ++ if (!WARN_ON_ONCE(skb_unref(skb))) ++ kfree_skb(skb); + skb = skb_peek(&sk->sk_receive_queue); + } + } +-- +2.43.0 + diff --git a/queue-5.15/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch 
b/queue-5.15/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch new file mode 100644 index 00000000000..acede590e3c --- /dev/null +++ b/queue-5.15/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch @@ -0,0 +1,147 @@ +From 9c487a8468b982c2e53790f5f24fac21882cc1d8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Jan 2024 09:08:53 -0800 +Subject: af_unix: Do not use atomic ops for unix_sk(sk)->inflight. + +From: Kuniyuki Iwashima + +[ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ] + +When touching unix_sk(sk)->inflight, we are always under +spin_lock(&unix_gc_lock). + +Let's convert unix_sk(sk)->inflight to the normal unsigned long. + +Signed-off-by: Kuniyuki Iwashima +Reviewed-by: Simon Horman +Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com +Signed-off-by: Jakub Kicinski +Stable-dep-of: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()") +Signed-off-by: Sasha Levin +--- + include/net/af_unix.h | 2 +- + net/unix/af_unix.c | 4 ++-- + net/unix/garbage.c | 17 ++++++++--------- + net/unix/scm.c | 8 +++++--- + 4 files changed, 16 insertions(+), 15 deletions(-) + +diff --git a/include/net/af_unix.h b/include/net/af_unix.h +index 32d21983c6968..094afdf7dea10 100644 +--- a/include/net/af_unix.h ++++ b/include/net/af_unix.h +@@ -56,7 +56,7 @@ struct unix_sock { + struct mutex iolock, bindlock; + struct sock *peer; + struct list_head link; +- atomic_long_t inflight; ++ unsigned long inflight; + spinlock_t lock; + unsigned long gc_flags; + #define UNIX_GC_CANDIDATE 0 +diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c +index 27a88a738793f..628d97c195a7e 100644 +--- a/net/unix/af_unix.c ++++ b/net/unix/af_unix.c +@@ -877,11 +877,11 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, + sk->sk_write_space = unix_write_space; + sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; + sk->sk_destruct = unix_sock_destructor; +- u = unix_sk(sk); ++ u = unix_sk(sk); 
++ u->inflight = 0; + u->path.dentry = NULL; + u->path.mnt = NULL; + spin_lock_init(&u->lock); +- atomic_long_set(&u->inflight, 0); + INIT_LIST_HEAD(&u->link); + mutex_init(&u->iolock); /* single task reading lock */ + mutex_init(&u->bindlock); /* single task binding lock */ +diff --git a/net/unix/garbage.c b/net/unix/garbage.c +index 9bfffe2a7f020..7b326582d97da 100644 +--- a/net/unix/garbage.c ++++ b/net/unix/garbage.c +@@ -166,17 +166,18 @@ static void scan_children(struct sock *x, void (*func)(struct unix_sock *), + + static void dec_inflight(struct unix_sock *usk) + { +- atomic_long_dec(&usk->inflight); ++ usk->inflight--; + } + + static void inc_inflight(struct unix_sock *usk) + { +- atomic_long_inc(&usk->inflight); ++ usk->inflight++; + } + + static void inc_inflight_move_tail(struct unix_sock *u) + { +- atomic_long_inc(&u->inflight); ++ u->inflight++; ++ + /* If this still might be part of a cycle, move it to the end + * of the list, so that it's checked even if it was already + * passed over +@@ -237,14 +238,12 @@ void unix_gc(void) + */ + list_for_each_entry_safe(u, next, &gc_inflight_list, link) { + long total_refs; +- long inflight_refs; + + total_refs = file_count(u->sk.sk_socket->file); +- inflight_refs = atomic_long_read(&u->inflight); + +- BUG_ON(inflight_refs < 1); +- BUG_ON(total_refs < inflight_refs); +- if (total_refs == inflight_refs) { ++ BUG_ON(!u->inflight); ++ BUG_ON(total_refs < u->inflight); ++ if (total_refs == u->inflight) { + list_move_tail(&u->link, &gc_candidates); + __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags); + __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); +@@ -271,7 +270,7 @@ void unix_gc(void) + /* Move cursor to after the current position. 
*/ + list_move(&cursor, &u->link); + +- if (atomic_long_read(&u->inflight) > 0) { ++ if (u->inflight) { + list_move_tail(&u->link, &not_cycle_list); + __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); + scan_children(&u->sk, inc_inflight_move_tail, NULL); +diff --git a/net/unix/scm.c b/net/unix/scm.c +index d1048b4c2baaf..4eff7da9f6f96 100644 +--- a/net/unix/scm.c ++++ b/net/unix/scm.c +@@ -52,12 +52,13 @@ void unix_inflight(struct user_struct *user, struct file *fp) + if (s) { + struct unix_sock *u = unix_sk(s); + +- if (atomic_long_inc_return(&u->inflight) == 1) { ++ if (!u->inflight) { + BUG_ON(!list_empty(&u->link)); + list_add_tail(&u->link, &gc_inflight_list); + } else { + BUG_ON(list_empty(&u->link)); + } ++ u->inflight++; + /* Paired with READ_ONCE() in wait_for_unix_gc() */ + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + 1); + } +@@ -74,10 +75,11 @@ void unix_notinflight(struct user_struct *user, struct file *fp) + if (s) { + struct unix_sock *u = unix_sk(s); + +- BUG_ON(!atomic_long_read(&u->inflight)); ++ BUG_ON(!u->inflight); + BUG_ON(list_empty(&u->link)); + +- if (atomic_long_dec_and_test(&u->inflight)) ++ u->inflight--; ++ if (!u->inflight) + list_del_init(&u->link); + /* Paired with READ_ONCE() in wait_for_unix_gc() */ + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - 1); +-- +2.43.0 + diff --git a/queue-5.15/af_unix-fix-garbage-collector-racing-against-connect.patch b/queue-5.15/af_unix-fix-garbage-collector-racing-against-connect.patch new file mode 100644 index 00000000000..6f7b2d9fa94 --- /dev/null +++ b/queue-5.15/af_unix-fix-garbage-collector-racing-against-connect.patch @@ -0,0 +1,122 @@ +From d9549d28ba77d1026105b16abd41bc53267d858f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Apr 2024 22:09:39 +0200 +Subject: af_unix: Fix garbage collector racing against connect() + +From: Michal Luczaj + +[ Upstream commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51 ] + +Garbage collector does not take into account the risk of embryo getting 
+enqueued during the garbage collection. If such embryo has a peer that +carries SCM_RIGHTS, two consecutive passes of scan_children() may see a +different set of children. Leading to an incorrectly elevated inflight +count, and then a dangling pointer within the gc_inflight_list. + +sockets are AF_UNIX/SOCK_STREAM +S is an unconnected socket +L is a listening in-flight socket bound to addr, not in fdtable +V's fd will be passed via sendmsg(), gets inflight count bumped + +connect(S, addr) sendmsg(S, [V]); close(V) __unix_gc() +---------------- ------------------------- ----------- + +NS = unix_create1() +skb1 = sock_wmalloc(NS) +L = unix_find_other(addr) +unix_state_lock(L) +unix_peer(S) = NS + // V count=1 inflight=0 + + NS = unix_peer(S) + skb2 = sock_alloc() + skb_queue_tail(NS, skb2[V]) + + // V became in-flight + // V count=2 inflight=1 + + close(V) + + // V count=1 inflight=1 + // GC candidate condition met + + for u in gc_inflight_list: + if (total_refs == inflight_refs) + add u to gc_candidates + + // gc_candidates={L, V} + + for u in gc_candidates: + scan_children(u, dec_inflight) + + // embryo (skb1) was not + // reachable from L yet, so V's + // inflight remains unchanged +__skb_queue_tail(L, skb1) +unix_state_unlock(L) + for u in gc_candidates: + if (u.inflight) + scan_children(u, inc_inflight_move_tail) + + // V count=1 inflight=2 (!) + +If there is a GC-candidate listening socket, lock/unlock its state. This +makes GC wait until the end of any ongoing connect() to that socket. After +flipping the lock, a possibly SCM-laden embryo is already enqueued. And if +there is another embryo coming, it can not possibly carry SCM_RIGHTS. At +this point, unix_inflight() can not happen because unix_gc_lock is already +taken. Inflight graph remains unaffected. 
+ +Fixes: 1fd05ba5a2f2 ("[AF_UNIX]: Rewrite garbage collector, fixes race.") +Signed-off-by: Michal Luczaj +Reviewed-by: Kuniyuki Iwashima +Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/unix/garbage.c | 18 +++++++++++++++++- + 1 file changed, 17 insertions(+), 1 deletion(-) + +diff --git a/net/unix/garbage.c b/net/unix/garbage.c +index 7b326582d97da..85c6f05c0fa3c 100644 +--- a/net/unix/garbage.c ++++ b/net/unix/garbage.c +@@ -235,11 +235,22 @@ void unix_gc(void) + * receive queues. Other, non candidate sockets _can_ be + * added to queue, so we must make sure only to touch + * candidates. ++ * ++ * Embryos, though never candidates themselves, affect which ++ * candidates are reachable by the garbage collector. Before ++ * being added to a listener's queue, an embryo may already ++ * receive data carrying SCM_RIGHTS, potentially making the ++ * passed socket a candidate that is not yet reachable by the ++ * collector. It becomes reachable once the embryo is ++ * enqueued. Therefore, we must ensure that no SCM-laden ++ * embryo appears in a (candidate) listener's queue between ++ * consecutive scan_children() calls. 
+ */ + list_for_each_entry_safe(u, next, &gc_inflight_list, link) { ++ struct sock *sk = &u->sk; + long total_refs; + +- total_refs = file_count(u->sk.sk_socket->file); ++ total_refs = file_count(sk->sk_socket->file); + + BUG_ON(!u->inflight); + BUG_ON(total_refs < u->inflight); +@@ -247,6 +258,11 @@ void unix_gc(void) + list_move_tail(&u->link, &gc_candidates); + __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags); + __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); ++ ++ if (sk->sk_state == TCP_LISTEN) { ++ unix_state_lock(sk); ++ unix_state_unlock(sk); ++ } + } + } + +-- +2.43.0 + diff --git a/queue-5.15/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch b/queue-5.15/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch new file mode 100644 index 00000000000..736bd83b62a --- /dev/null +++ b/queue-5.15/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch @@ -0,0 +1,95 @@ +From 269a8327c3b5a49ca4178df8d059a87dc4dca892 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 22 Mar 2024 12:47:05 -0400 +Subject: arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order + +From: Frank Li + +[ Upstream commit c6ddd6e7b166532a0816825442ff60f70aed9647 ] + +The actual clock show wrong frequency: + + echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control + cat /sys/kernel/debug/mmc0/ios + + clock: 200000000 Hz + actual clock: 166000000 Hz + ^^^^^^^^^ + ..... + +According to + +sdhc0_lpcg: clock-controller@5b200000 { + compatible = "fsl,imx8qxp-lpcg"; + reg = <0x5b200000 0x10000>; + #clock-cells = <1>; + clocks = <&clk IMX_SC_R_SDHC_0 IMX_SC_PM_CLK_PER>, + <&conn_ipg_clk>, <&conn_axi_clk>; + clock-indices = , , + ; + clock-output-names = "sdhc0_lpcg_per_clk", + "sdhc0_lpcg_ipg_clk", + "sdhc0_lpcg_ahb_clk"; + power-domains = <&pd IMX_SC_R_SDHC_0>; + } + +"per_clk" should be IMX_LPCG_CLK_0 instead of IMX_LPCG_CLK_5. 
+ +After correct clocks order: + + echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control + cat /sys/kernel/debug/mmc0/ios + + clock: 200000000 Hz + actual clock: 198000000 Hz + ^^^^^^^^ + ... + +Fixes: 16c4ea7501b1 ("arm64: dts: imx8: switch to new lpcg clock binding") +Signed-off-by: Frank Li +Signed-off-by: Shawn Guo +Signed-off-by: Sasha Levin +--- + arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 12 ++++++------ + 1 file changed, 6 insertions(+), 6 deletions(-) + +diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi +index 639220dbff008..685e9b83d42b1 100644 +--- a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi ++++ b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi +@@ -38,8 +38,8 @@ usdhc1: mmc@5b010000 { + interrupts = ; + reg = <0x5b010000 0x10000>; + clocks = <&sdhc0_lpcg IMX_LPCG_CLK_4>, +- <&sdhc0_lpcg IMX_LPCG_CLK_0>, +- <&sdhc0_lpcg IMX_LPCG_CLK_5>; ++ <&sdhc0_lpcg IMX_LPCG_CLK_5>, ++ <&sdhc0_lpcg IMX_LPCG_CLK_0>; + clock-names = "ipg", "ahb", "per"; + power-domains = <&pd IMX_SC_R_SDHC_0>; + status = "disabled"; +@@ -49,8 +49,8 @@ usdhc2: mmc@5b020000 { + interrupts = ; + reg = <0x5b020000 0x10000>; + clocks = <&sdhc1_lpcg IMX_LPCG_CLK_4>, +- <&sdhc1_lpcg IMX_LPCG_CLK_0>, +- <&sdhc1_lpcg IMX_LPCG_CLK_5>; ++ <&sdhc1_lpcg IMX_LPCG_CLK_5>, ++ <&sdhc1_lpcg IMX_LPCG_CLK_0>; + clock-names = "ipg", "ahb", "per"; + power-domains = <&pd IMX_SC_R_SDHC_1>; + fsl,tuning-start-tap = <20>; +@@ -62,8 +62,8 @@ usdhc3: mmc@5b030000 { + interrupts = ; + reg = <0x5b030000 0x10000>; + clocks = <&sdhc2_lpcg IMX_LPCG_CLK_4>, +- <&sdhc2_lpcg IMX_LPCG_CLK_0>, +- <&sdhc2_lpcg IMX_LPCG_CLK_5>; ++ <&sdhc2_lpcg IMX_LPCG_CLK_5>, ++ <&sdhc2_lpcg IMX_LPCG_CLK_0>; + clock-names = "ipg", "ahb", "per"; + power-domains = <&pd IMX_SC_R_SDHC_2>; + status = "disabled"; +-- +2.43.0 + diff --git a/queue-5.15/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch 
b/queue-5.15/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch new file mode 100644 index 00000000000..e378fc573cd --- /dev/null +++ b/queue-5.15/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch @@ -0,0 +1,166 @@ +From 4ab64e12a0a0ac9957ca084d1e10b59450b6ec1b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 5 Apr 2024 10:30:34 +0000 +Subject: geneve: fix header validation in geneve[6]_xmit_skb + +From: Eric Dumazet + +[ Upstream commit d8a6213d70accb403b82924a1c229e733433a5ef ] + +syzbot is able to trigger an uninit-value in geneve_xmit() [1] + +Problem : While most ip tunnel helpers (like ip_tunnel_get_dsfield()) +uses skb_protocol(skb, true), pskb_inet_may_pull() is only using +skb->protocol. + +If anything else than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol, +pskb_inet_may_pull() does nothing at all. + +If a vlan tag was provided by the caller (af_packet in the syzbot case), +the network header might not point to the correct location, and skb +linear part could be smaller than expected. + +Add skb_vlan_inet_prepare() to perform a complete mac validation. + +Use this in geneve for the moment, I suspect we need to adopt this +more broadly. + +v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest + - Only call __vlan_get_protocol() for vlan types. 
+Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/ + +v2,v3 - Addressed Sabrina comments on v1 and v2 +Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/ + +[1] + +BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline] + BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 + geneve_xmit_skb drivers/net/geneve.c:910 [inline] + geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 + __netdev_start_xmit include/linux/netdevice.h:4903 [inline] + netdev_start_xmit include/linux/netdevice.h:4917 [inline] + xmit_one net/core/dev.c:3531 [inline] + dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547 + __dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335 + dev_queue_xmit include/linux/netdevice.h:3091 [inline] + packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276 + packet_snd net/packet/af_packet.c:3081 [inline] + packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113 + sock_sendmsg_nosec net/socket.c:730 [inline] + __sock_sendmsg+0x30f/0x380 net/socket.c:745 + __sys_sendto+0x685/0x830 net/socket.c:2191 + __do_sys_sendto net/socket.c:2203 [inline] + __se_sys_sendto net/socket.c:2199 [inline] + __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 + do_syscall_64+0xd5/0x1f0 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 + +Uninit was created at: + slab_post_alloc_hook mm/slub.c:3804 [inline] + slab_alloc_node mm/slub.c:3845 [inline] + kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888 + kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577 + __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668 + alloc_skb include/linux/skbuff.h:1318 [inline] + alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504 + sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795 + packet_alloc_skb net/packet/af_packet.c:2930 [inline] + packet_snd net/packet/af_packet.c:3024 [inline] + packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113 + sock_sendmsg_nosec net/socket.c:730 [inline] + __sock_sendmsg+0x30f/0x380 net/socket.c:745 + 
__sys_sendto+0x685/0x830 net/socket.c:2191 + __do_sys_sendto net/socket.c:2203 [inline] + __se_sys_sendto net/socket.c:2199 [inline] + __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 + do_syscall_64+0xd5/0x1f0 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 + +CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 + +Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb") +Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com +Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/ +Signed-off-by: Eric Dumazet +Cc: Phillip Potter +Cc: Sabrina Dubroca +Reviewed-by: Sabrina Dubroca +Reviewed-by: Phillip Potter +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + drivers/net/geneve.c | 4 ++-- + include/net/ip_tunnels.h | 33 +++++++++++++++++++++++++++++++++ + 2 files changed, 35 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c +index 9569b5cc595ec..0e4ea3c0fe829 100644 +--- a/drivers/net/geneve.c ++++ b/drivers/net/geneve.c +@@ -909,7 +909,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, + __be16 sport; + int err; + +- if (!pskb_inet_may_pull(skb)) ++ if (!skb_vlan_inet_prepare(skb)) + return -EINVAL; + + sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); +@@ -1006,7 +1006,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, + __be16 sport; + int err; + +- if (!pskb_inet_may_pull(skb)) ++ if (!skb_vlan_inet_prepare(skb)) + return -EINVAL; + + sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); +diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h +index 17ec652e8f124..eca36edb85570 100644 +--- a/include/net/ip_tunnels.h ++++ b/include/net/ip_tunnels.h +@@ -332,6 +332,39 @@ static inline bool pskb_inet_may_pull(struct 
sk_buff *skb) + return pskb_network_may_pull(skb, nhlen); + } + ++/* Variant of pskb_inet_may_pull(). ++ */ ++static inline bool skb_vlan_inet_prepare(struct sk_buff *skb) ++{ ++ int nhlen = 0, maclen = ETH_HLEN; ++ __be16 type = skb->protocol; ++ ++ /* Essentially this is skb_protocol(skb, true) ++ * And we get MAC len. ++ */ ++ if (eth_type_vlan(type)) ++ type = __vlan_get_protocol(skb, type, &maclen); ++ ++ switch (type) { ++#if IS_ENABLED(CONFIG_IPV6) ++ case htons(ETH_P_IPV6): ++ nhlen = sizeof(struct ipv6hdr); ++ break; ++#endif ++ case htons(ETH_P_IP): ++ nhlen = sizeof(struct iphdr); ++ break; ++ } ++ /* For ETH_P_IPV6/ETH_P_IP we make sure to pull ++ * a base network header in skb->head. ++ */ ++ if (!pskb_may_pull(skb, maclen + nhlen)) ++ return false; ++ ++ skb_set_network_header(skb, maclen); ++ return true; ++} ++ + static inline int ip_encap_hlen(struct ip_tunnel_encap *e) + { + const struct ip_tunnel_encap_ops *ops; +-- +2.43.0 + diff --git a/queue-5.15/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch b/queue-5.15/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch new file mode 100644 index 00000000000..d8148e5f9b0 --- /dev/null +++ b/queue-5.15/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch @@ -0,0 +1,39 @@ +From 0d1f52b4525c1e93d0870474b6561c14212a621d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Apr 2024 11:07:43 +0800 +Subject: iommu/vt-d: Allocate local memory for page request queue + +From: Jacob Pan + +[ Upstream commit a34f3e20ddff02c4f12df2c0635367394e64c63d ] + +The page request queue is per IOMMU, its allocation should be made +NUMA-aware for performance reasons. 
+ +Fixes: a222a7f0bb6c ("iommu/vt-d: Implement page request handling") +Signed-off-by: Jacob Pan +Reviewed-by: Kevin Tian +Link: https://lore.kernel.org/r/20240403214007.985600-1-jacob.jun.pan@linux.intel.com +Signed-off-by: Lu Baolu +Signed-off-by: Joerg Roedel +Signed-off-by: Sasha Levin +--- + drivers/iommu/intel/svm.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c +index 3a9468b1d2c3c..a96c9a15c9fee 100644 +--- a/drivers/iommu/intel/svm.c ++++ b/drivers/iommu/intel/svm.c +@@ -88,7 +88,7 @@ int intel_svm_enable_prq(struct intel_iommu *iommu) + struct page *pages; + int irq, ret; + +- pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, PRQ_ORDER); ++ pages = alloc_pages_node(iommu->node, GFP_KERNEL | __GFP_ZERO, PRQ_ORDER); + if (!pages) { + pr_warn("IOMMU: %s: Failed to allocate page request queue\n", + iommu->name); +-- +2.43.0 + diff --git a/queue-5.15/ipv4-route-avoid-unused-but-set-variable-warning.patch b/queue-5.15/ipv4-route-avoid-unused-but-set-variable-warning.patch new file mode 100644 index 00000000000..b1648552e4c --- /dev/null +++ b/queue-5.15/ipv4-route-avoid-unused-but-set-variable-warning.patch @@ -0,0 +1,51 @@ +From b38c87008269a5d41576b94822a6162cc7fb6807 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Apr 2024 09:42:03 +0200 +Subject: ipv4/route: avoid unused-but-set-variable warning + +From: Arnd Bergmann + +[ Upstream commit cf1b7201df59fb936f40f4a807433fe3f2ce310a ] + +The log_martians variable is only used in an #ifdef, causing a 'make W=1' +warning with gcc: + +net/ipv4/route.c: In function 'ip_rt_send_redirect': +net/ipv4/route.c:880:13: error: variable 'log_martians' set but not used [-Werror=unused-but-set-variable] + +Change the #ifdef to an equivalent IS_ENABLED() to let the compiler +see where the variable is used. 
+ +Fixes: 30038fc61adf ("net: ip_rt_send_redirect() optimization") +Reviewed-by: David Ahern +Signed-off-by: Arnd Bergmann +Reviewed-by: Eric Dumazet +Link: https://lore.kernel.org/r/20240408074219.3030256-2-arnd@kernel.org +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/ipv4/route.c | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +diff --git a/net/ipv4/route.c b/net/ipv4/route.c +index 12c59d700942f..4ff94596f8cd5 100644 +--- a/net/ipv4/route.c ++++ b/net/ipv4/route.c +@@ -933,13 +933,11 @@ void ip_rt_send_redirect(struct sk_buff *skb) + icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw); + peer->rate_last = jiffies; + ++peer->n_redirects; +-#ifdef CONFIG_IP_ROUTE_VERBOSE +- if (log_martians && ++ if (IS_ENABLED(CONFIG_IP_ROUTE_VERBOSE) && log_martians && + peer->n_redirects == ip_rt_redirect_number) + net_warn_ratelimited("host %pI4/if%d ignores redirects for %pI4 to %pI4\n", + &ip_hdr(skb)->saddr, inet_iif(skb), + &ip_hdr(skb)->daddr, &gw); +-#endif + } + out_put_peer: + inet_putpeer(peer); +-- +2.43.0 + diff --git a/queue-5.15/ipv6-fib-hide-unused-pn-variable.patch b/queue-5.15/ipv6-fib-hide-unused-pn-variable.patch new file mode 100644 index 00000000000..c57f3520143 --- /dev/null +++ b/queue-5.15/ipv6-fib-hide-unused-pn-variable.patch @@ -0,0 +1,60 @@ +From c13e862aaf95a8c236f09dba0c1da9de7389a259 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Apr 2024 09:42:02 +0200 +Subject: ipv6: fib: hide unused 'pn' variable + +From: Arnd Bergmann + +[ Upstream commit 74043489fcb5e5ca4074133582b5b8011b67f9e7 ] + +When CONFIG_IPV6_SUBTREES is disabled, the only user is hidden, causing +a 'make W=1' warning: + +net/ipv6/ip6_fib.c: In function 'fib6_add': +net/ipv6/ip6_fib.c:1388:32: error: variable 'pn' set but not used [-Werror=unused-but-set-variable] + +Add another #ifdef around the variable declaration, matching the other +uses in this file. 
+ +Fixes: 66729e18df08 ("[IPV6] ROUTE: Make sure we have fn->leaf when adding a node on subtree.") +Link: https://lore.kernel.org/netdev/20240322131746.904943-1-arnd@kernel.org/ +Reviewed-by: David Ahern +Signed-off-by: Arnd Bergmann +Reviewed-by: Eric Dumazet +Link: https://lore.kernel.org/r/20240408074219.3030256-1-arnd@kernel.org +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/ipv6/ip6_fib.c | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c +index bbb9ed6d1ae6f..c0ff5ee490e7b 100644 +--- a/net/ipv6/ip6_fib.c ++++ b/net/ipv6/ip6_fib.c +@@ -1375,7 +1375,10 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt, + struct nl_info *info, struct netlink_ext_ack *extack) + { + struct fib6_table *table = rt->fib6_table; +- struct fib6_node *fn, *pn = NULL; ++ struct fib6_node *fn; ++#ifdef CONFIG_IPV6_SUBTREES ++ struct fib6_node *pn = NULL; ++#endif + int err = -ENOMEM; + int allow_create = 1; + int replace_required = 0; +@@ -1399,9 +1402,9 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt, + goto out; + } + ++#ifdef CONFIG_IPV6_SUBTREES + pn = fn; + +-#ifdef CONFIG_IPV6_SUBTREES + if (rt->fib6_src.plen) { + struct fib6_node *sn; + +-- +2.43.0 + diff --git a/queue-5.15/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch b/queue-5.15/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch new file mode 100644 index 00000000000..73e4502d770 --- /dev/null +++ b/queue-5.15/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch @@ -0,0 +1,133 @@ +From 0cbe6511e04a51c1fb93cc65f18e17d6e560f3fc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Apr 2024 16:18:21 +0200 +Subject: ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr + +From: Jiri Benc + +[ Upstream commit 7633c4da919ad51164acbf1aa322cc1a3ead6129 ] + +Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it +still means hlist_for_each_entry_rcu can return an item 
that got removed +from the list. The memory itself of such item is not freed thanks to RCU +but nothing guarantees the actual content of the memory is sane. + +In particular, the reference count can be zero. This can happen if +ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry +from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all +references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough +timing, this can happen: + +1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry. + +2. Then, the whole ipv6_del_addr is executed for the given entry. The + reference count drops to zero and kfree_rcu is scheduled. + +3. ipv6_get_ifaddr continues and tries to increment the reference count + (in6_ifa_hold). + +4. The rcu is unlocked and the entry is freed. + +5. The freed entry is returned. + +Prevent increasing of the reference count in such a case. The name +in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe. + +[ 41.506330] refcount_t: addition on 0; use-after-free.
+[ 41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130 +[ 41.507413] Modules linked in: veth bridge stp llc +[ 41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14 +[ 41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) +[ 41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130 +[ 41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff +[ 41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282 +[ 41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000 +[ 41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900 +[ 41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff +[ 41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000 +[ 41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48 +[ 41.514086] FS: 00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000 +[ 41.514726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +[ 41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0 +[ 41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +[ 41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +[ 41.516799] Call Trace: +[ 41.517037] +[ 41.517249] ? __warn+0x7b/0x120 +[ 41.517535] ? refcount_warn_saturate+0xa5/0x130 +[ 41.517923] ? report_bug+0x164/0x190 +[ 41.518240] ? handle_bug+0x3d/0x70 +[ 41.518541] ? exc_invalid_op+0x17/0x70 +[ 41.520972] ? asm_exc_invalid_op+0x1a/0x20 +[ 41.521325] ? refcount_warn_saturate+0xa5/0x130 +[ 41.521708] ipv6_get_ifaddr+0xda/0xe0 +[ 41.522035] inet6_rtm_getaddr+0x342/0x3f0 +[ 41.522376] ? __pfx_inet6_rtm_getaddr+0x10/0x10 +[ 41.522758] rtnetlink_rcv_msg+0x334/0x3d0 +[ 41.523102] ? 
netlink_unicast+0x30f/0x390 +[ 41.523445] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 +[ 41.523832] netlink_rcv_skb+0x53/0x100 +[ 41.524157] netlink_unicast+0x23b/0x390 +[ 41.524484] netlink_sendmsg+0x1f2/0x440 +[ 41.524826] __sys_sendto+0x1d8/0x1f0 +[ 41.525145] __x64_sys_sendto+0x1f/0x30 +[ 41.525467] do_syscall_64+0xa5/0x1b0 +[ 41.525794] entry_SYSCALL_64_after_hwframe+0x72/0x7a +[ 41.526213] RIP: 0033:0x7fbc4cfcea9a +[ 41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 +[ 41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c +[ 41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a +[ 41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005 +[ 41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c +[ 41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 +[ 41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b +[ 41.531573] + +Fixes: 5c578aedcb21d ("IPv6: convert addrconf hash list to RCU") +Reviewed-by: Eric Dumazet +Reviewed-by: David Ahern +Signed-off-by: Jiri Benc +Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + include/net/addrconf.h | 4 ++++ + net/ipv6/addrconf.c | 7 ++++--- + 2 files changed, 8 insertions(+), 3 deletions(-) + +diff --git a/include/net/addrconf.h b/include/net/addrconf.h +index 700a19e0455e6..5cf1a73774078 100644 +--- a/include/net/addrconf.h ++++ b/include/net/addrconf.h +@@ -435,6 +435,10 @@ static inline void in6_ifa_hold(struct inet6_ifaddr *ifp) + refcount_inc(&ifp->refcnt); + } + ++static inline bool in6_ifa_hold_safe(struct inet6_ifaddr *ifp) ++{ ++ return refcount_inc_not_zero(&ifp->refcnt); ++} + + /* + * compute link-local 
solicited-node multicast address +diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c +index 968ca078191cd..a17e1d744b2d0 100644 +--- a/net/ipv6/addrconf.c ++++ b/net/ipv6/addrconf.c +@@ -2054,9 +2054,10 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, const struct in6_addr *add + if (ipv6_addr_equal(&ifp->addr, addr)) { + if (!dev || ifp->idev->dev == dev || + !(ifp->scope&(IFA_LINK|IFA_HOST) || strict)) { +- result = ifp; +- in6_ifa_hold(ifp); +- break; ++ if (in6_ifa_hold_safe(ifp)) { ++ result = ifp; ++ break; ++ } + } + } + } +-- +2.43.0 + diff --git a/queue-5.15/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch b/queue-5.15/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch new file mode 100644 index 00000000000..86c1bd6caa8 --- /dev/null +++ b/queue-5.15/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch @@ -0,0 +1,495 @@ +From d78f216df11c20be572f40591a9e445c92827b28 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Apr 2024 18:01:14 +0300 +Subject: net: dsa: mt7530: trap link-local frames regardless of ST Port State +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Arınç ÜNAL + +[ Upstream commit 17c560113231ddc20088553c7b499b289b664311 ] + +In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer +(DLL) of the Open Systems Interconnection basic reference model (OSI/RM) +are described; the medium access control (MAC) and logical link control +(LLC) sublayers. The MAC sublayer is the one facing the physical layer. + +In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A +Bridge component comprises a MAC Relay Entity for interconnecting the Ports +of the Bridge, at least two Ports, and higher layer entities with at least +a Spanning Tree Protocol Entity included. + +Each Bridge Port also functions as an end station and shall provide the MAC +Service to an LLC Entity. 
Each instance of the MAC Service is provided to a +distinct LLC Entity that supports protocol identification, multiplexing, +and demultiplexing, for protocol data unit (PDU) transmission and reception +by one or more higher layer entities. + +It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC +Entity associated with each Bridge Port is modeled as being directly +connected to the attached Local Area Network (LAN). + +On the switch with CPU port architecture, CPU port functions as Management +Port, and the Management Port functionality is provided by software which +functions as an end station. Software is connected to an IEEE 802 LAN that +is wholly contained within the system that incorporates the Bridge. +Software provides access to the LLC Entity associated with each Bridge Port +by the value of the source port field on the special tag on the frame +received by software. + +We call frames that carry control information to determine the active +topology and current extent of each Virtual Local Area Network (VLAN), +i.e., spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN +Registration Protocol Data Units (MVRPDUs), and frames from other link +constrained protocols, such as Extensible Authentication Protocol over LAN +(EAPOL) and Link Layer Discovery Protocol (LLDP), link-local frames. They +are not forwarded by a Bridge. Permanently configured entries in the +filtering database (FDB) ensure that such frames are discarded by the +Forwarding Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in +detail: + +Each of the reserved MAC addresses specified in Table 8-1 +(01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be +permanently configured in the FDB in C-VLAN components and ERs. + +Each of the reserved MAC addresses specified in Table 8-2 +(01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently +configured in the FDB in S-VLAN components. 
+ +Each of the reserved MAC addresses specified in Table 8-3 +(01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB +in TPMR components. + +The FDB entries for reserved MAC addresses shall specify filtering for all +Bridge Ports and all VIDs. Management shall not provide the capability to +modify or remove entries for reserved MAC addresses. + +The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of +propagation of PDUs within a Bridged Network, as follows: + + The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that + no conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN) + component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward. + PDUs transmitted using this destination address, or any other addresses + that appear in Table 8-1, Table 8-2, and Table 8-3 + (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can + therefore travel no further than those stations that can be reached via a + single individual LAN from the originating station. + + The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an + address that no conformant S-VLAN component, C-VLAN component, or MAC + Bridge can forward; however, this address is relayed by a TPMR component. + PDUs using this destination address, or any of the other addresses that + appear in both Table 8-1 and Table 8-2 but not in Table 8-3 + (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed + by any TPMRs but will propagate no further than the nearest S-VLAN + component, C-VLAN component, or MAC Bridge. + + The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an + address that no conformant C-VLAN component, MAC Bridge can forward; + however, it is relayed by TPMR components and S-VLAN components. 
PDUs + using this destination address, or any of the other addresses that appear + in Table 8-1 but not in either Table 8-2 or Table 8-3 + (01-80-C2-00-00-[00,0B,0C,0D,0F]), will be relayed by TPMR components and + S-VLAN components but will propagate no further than the nearest C-VLAN + component or MAC Bridge. + +Because the LLC Entity associated with each Bridge Port is provided via CPU +port, we must not filter these frames but forward them to CPU port. + +In a Bridge, the transmission Port is majorly decided by ingress and egress +rules, FDB, and spanning tree Port State functions of the Forwarding +Process. For link-local frames, only CPU port should be designated as +destination port in the FDB, and the other functions of the Forwarding +Process must not interfere with the decision of the transmission Port. We +call this process trapping frames to CPU port. + +Therefore, on the switch with CPU port architecture, link-local frames must +be trapped to CPU port, and certain link-local frames received by a Port of +a Bridge comprising a TPMR component or an S-VLAN component must be +excluded from it. + +A Bridge of the switch with CPU port architecture cannot comprise a +Two-Port MAC Relay (TPMR) component as a TPMR component supports only a +subset of the functionality of a MAC Bridge. A Bridge comprising two Ports +(Management Port doesn't count) of this architecture will either function +as a standard MAC Bridge or a standard VLAN Bridge. + +Therefore, a Bridge of this architecture can only comprise S-VLAN +components, C-VLAN components, or MAC Bridge components. Since there's no +TPMR component, we don't need to relay PDUs using the destination addresses +specified on the Nearest non-TPMR section, and the proportion of the +Nearest Customer Bridge section where they must be relayed by TPMR +components. + +One option to trap link-local frames to CPU port is to add static FDB +entries with CPU port designated as destination port. 
However, because that +Independent VLAN Learning (IVL) is being used on every VID, each entry only +applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC +Bridge component or a C-VLAN component, there would have to be 16 times +4096 entries. This switch intellectual property can only hold a maximum of +2048 entries. Using this option, there also isn't a mechanism to prevent +link-local frames from being discarded when the spanning tree Port State of +the reception Port is discarding. + +The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4 +registers. Whilst this applies to every VID, it doesn't contain all of the +reserved MAC addresses without affecting the remaining Standard Group MAC +Addresses. The REV_UN frame tag utilised using the RGAC4 register covers +the remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination +addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF +destination addresses which may be relayed by MAC Bridges or VLAN Bridges. +The latter option provides better but not complete conformance. + +This switch intellectual property also does not provide a mechanism to trap +link-local frames with specific destination addresses to CPU port by +Bridge, to conform to the filtering rules for the distinct Bridge +components. 
+ +Therefore, regardless of the type of the Bridge component, link-local +frames with these destination addresses will be trapped to CPU port: + +01-80-C2-00-00-[00,01,02,03,0E] + +In a Bridge comprising a MAC Bridge component or a C-VLAN component: + + Link-local frames with these destination addresses won't be trapped to + CPU port which won't conform to IEEE Std 802.1Q-2022: + + 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] + +In a Bridge comprising an S-VLAN component: + + Link-local frames with these destination addresses will be trapped to CPU + port which won't conform to IEEE Std 802.1Q-2022: + + 01-80-C2-00-00-00 + + Link-local frames with these destination addresses won't be trapped to + CPU port which won't conform to IEEE Std 802.1Q-2022: + + 01-80-C2-00-00-[04,05,06,07,08,09,0A] + +Currently on this switch intellectual property, if the spanning tree Port +State of the reception Port is discarding, link-local frames will be +discarded. + +To trap link-local frames regardless of the spanning tree Port State, make +the switch regard them as Bridge Protocol Data Units (BPDUs). This switch +intellectual property only lets the frames regarded as BPDUs bypass the +spanning tree Port State function of the Forwarding Process. + +With this change, the only remaining interference is the ingress rules. +When the reception Port has no PVID assigned on software, VLAN-untagged +frames won't be allowed in. There doesn't seem to be a mechanism on the +switch intellectual property to have link-local frames bypass this function +of the Forwarding Process. 
+ +Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") +Reviewed-by: Daniel Golle +Signed-off-by: Arınç ÜNAL +Link: https://lore.kernel.org/r/20240409-b4-for-net-mt7530-fix-link-local-when-stp-discarding-v2-1-07b1150164ac@arinc9.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/dsa/mt7530.c | 229 +++++++++++++++++++++++++++++++++------ + drivers/net/dsa/mt7530.h | 5 + + 2 files changed, 200 insertions(+), 34 deletions(-) + +diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c +index 14c47e614d337..f291d1e70f807 100644 +--- a/drivers/net/dsa/mt7530.c ++++ b/drivers/net/dsa/mt7530.c +@@ -994,20 +994,173 @@ static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface) + mutex_unlock(&priv->reg_mutex); + } + +-/* On page 205, section "8.6.3 Frame filtering" of the active standard, IEEE Std +- * 802.1Q™-2022, it is stated that frames with 01:80:C2:00:00:00-0F as MAC DA +- * must only be propagated to C-VLAN and MAC Bridge components. That means +- * VLAN-aware and VLAN-unaware bridges. On the switch designs with CPU ports, +- * these frames are supposed to be processed by the CPU (software). So we make +- * the switch only forward them to the CPU port. And if received from a CPU +- * port, forward to a single port. The software is responsible of making the +- * switch conform to the latter by setting a single port as destination port on +- * the special tag. ++/* In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer (DLL) ++ * of the Open Systems Interconnection basic reference model (OSI/RM) are ++ * described; the medium access control (MAC) and logical link control (LLC) ++ * sublayers. The MAC sublayer is the one facing the physical layer. + * +- * This switch intellectual property cannot conform to this part of the standard +- * fully. 
Whilst the REV_UN frame tag covers the remaining :04-0D and :0F MAC +- * DAs, it also includes :22-FF which the scope of propagation is not supposed +- * to be restricted for these MAC DAs. ++ * In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A ++ * Bridge component comprises a MAC Relay Entity for interconnecting the Ports ++ * of the Bridge, at least two Ports, and higher layer entities with at least a ++ * Spanning Tree Protocol Entity included. ++ * ++ * Each Bridge Port also functions as an end station and shall provide the MAC ++ * Service to an LLC Entity. Each instance of the MAC Service is provided to a ++ * distinct LLC Entity that supports protocol identification, multiplexing, and ++ * demultiplexing, for protocol data unit (PDU) transmission and reception by ++ * one or more higher layer entities. ++ * ++ * It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC ++ * Entity associated with each Bridge Port is modeled as being directly ++ * connected to the attached Local Area Network (LAN). ++ * ++ * On the switch with CPU port architecture, CPU port functions as Management ++ * Port, and the Management Port functionality is provided by software which ++ * functions as an end station. Software is connected to an IEEE 802 LAN that is ++ * wholly contained within the system that incorporates the Bridge. Software ++ * provides access to the LLC Entity associated with each Bridge Port by the ++ * value of the source port field on the special tag on the frame received by ++ * software. 
++ * ++ * We call frames that carry control information to determine the active ++ * topology and current extent of each Virtual Local Area Network (VLAN), i.e., ++ * spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN Registration ++ * Protocol Data Units (MVRPDUs), and frames from other link constrained ++ * protocols, such as Extensible Authentication Protocol over LAN (EAPOL) and ++ * Link Layer Discovery Protocol (LLDP), link-local frames. They are not ++ * forwarded by a Bridge. Permanently configured entries in the filtering ++ * database (FDB) ensure that such frames are discarded by the Forwarding ++ * Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in detail: ++ * ++ * Each of the reserved MAC addresses specified in Table 8-1 ++ * (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be ++ * permanently configured in the FDB in C-VLAN components and ERs. ++ * ++ * Each of the reserved MAC addresses specified in Table 8-2 ++ * (01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently ++ * configured in the FDB in S-VLAN components. ++ * ++ * Each of the reserved MAC addresses specified in Table 8-3 ++ * (01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB in ++ * TPMR components. ++ * ++ * The FDB entries for reserved MAC addresses shall specify filtering for all ++ * Bridge Ports and all VIDs. Management shall not provide the capability to ++ * modify or remove entries for reserved MAC addresses. ++ * ++ * The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of ++ * propagation of PDUs within a Bridged Network, as follows: ++ * ++ * The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that no ++ * conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN) ++ * component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward. 
++ * PDUs transmitted using this destination address, or any other addresses ++ * that appear in Table 8-1, Table 8-2, and Table 8-3 ++ * (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can ++ * therefore travel no further than those stations that can be reached via a ++ * single individual LAN from the originating station. ++ * ++ * The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an ++ * address that no conformant S-VLAN component, C-VLAN component, or MAC ++ * Bridge can forward; however, this address is relayed by a TPMR component. ++ * PDUs using this destination address, or any of the other addresses that ++ * appear in both Table 8-1 and Table 8-2 but not in Table 8-3 ++ * (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed by ++ * any TPMRs but will propagate no further than the nearest S-VLAN component, ++ * C-VLAN component, or MAC Bridge. ++ * ++ * The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an address ++ * that no conformant C-VLAN component, MAC Bridge can forward; however, it is ++ * relayed by TPMR components and S-VLAN components. PDUs using this ++ * destination address, or any of the other addresses that appear in Table 8-1 ++ * but not in either Table 8-2 or Table 8-3 (01-80-C2-00-00-[00,0B,0C,0D,0F]), ++ * will be relayed by TPMR components and S-VLAN components but will propagate ++ * no further than the nearest C-VLAN component or MAC Bridge. ++ * ++ * Because the LLC Entity associated with each Bridge Port is provided via CPU ++ * port, we must not filter these frames but forward them to CPU port. ++ * ++ * In a Bridge, the transmission Port is majorly decided by ingress and egress ++ * rules, FDB, and spanning tree Port State functions of the Forwarding Process. 
++ * For link-local frames, only CPU port should be designated as destination port ++ * in the FDB, and the other functions of the Forwarding Process must not ++ * interfere with the decision of the transmission Port. We call this process ++ * trapping frames to CPU port. ++ * ++ * Therefore, on the switch with CPU port architecture, link-local frames must ++ * be trapped to CPU port, and certain link-local frames received by a Port of a ++ * Bridge comprising a TPMR component or an S-VLAN component must be excluded ++ * from it. ++ * ++ * A Bridge of the switch with CPU port architecture cannot comprise a Two-Port ++ * MAC Relay (TPMR) component as a TPMR component supports only a subset of the ++ * functionality of a MAC Bridge. A Bridge comprising two Ports (Management Port ++ * doesn't count) of this architecture will either function as a standard MAC ++ * Bridge or a standard VLAN Bridge. ++ * ++ * Therefore, a Bridge of this architecture can only comprise S-VLAN components, ++ * C-VLAN components, or MAC Bridge components. Since there's no TPMR component, ++ * we don't need to relay PDUs using the destination addresses specified on the ++ * Nearest non-TPMR section, and the proportion of the Nearest Customer Bridge ++ * section where they must be relayed by TPMR components. ++ * ++ * One option to trap link-local frames to CPU port is to add static FDB entries ++ * with CPU port designated as destination port. However, because that ++ * Independent VLAN Learning (IVL) is being used on every VID, each entry only ++ * applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC ++ * Bridge component or a C-VLAN component, there would have to be 16 times 4096 ++ * entries. This switch intellectual property can only hold a maximum of 2048 ++ * entries. Using this option, there also isn't a mechanism to prevent ++ * link-local frames from being discarded when the spanning tree Port State of ++ * the reception Port is discarding. 
++ * ++ * The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4 ++ * registers. Whilst this applies to every VID, it doesn't contain all of the ++ * reserved MAC addresses without affecting the remaining Standard Group MAC ++ * Addresses. The REV_UN frame tag utilised using the RGAC4 register covers the ++ * remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination ++ * addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF ++ * destination addresses which may be relayed by MAC Bridges or VLAN Bridges. ++ * The latter option provides better but not complete conformance. ++ * ++ * This switch intellectual property also does not provide a mechanism to trap ++ * link-local frames with specific destination addresses to CPU port by Bridge, ++ * to conform to the filtering rules for the distinct Bridge components. ++ * ++ * Therefore, regardless of the type of the Bridge component, link-local frames ++ * with these destination addresses will be trapped to CPU port: ++ * ++ * 01-80-C2-00-00-[00,01,02,03,0E] ++ * ++ * In a Bridge comprising a MAC Bridge component or a C-VLAN component: ++ * ++ * Link-local frames with these destination addresses won't be trapped to CPU ++ * port which won't conform to IEEE Std 802.1Q-2022: ++ * ++ * 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] ++ * ++ * In a Bridge comprising an S-VLAN component: ++ * ++ * Link-local frames with these destination addresses will be trapped to CPU ++ * port which won't conform to IEEE Std 802.1Q-2022: ++ * ++ * 01-80-C2-00-00-00 ++ * ++ * Link-local frames with these destination addresses won't be trapped to CPU ++ * port which won't conform to IEEE Std 802.1Q-2022: ++ * ++ * 01-80-C2-00-00-[04,05,06,07,08,09,0A] ++ * ++ * To trap link-local frames to CPU port as conformant as this switch ++ * intellectual property can allow, link-local frames are made to be regarded as ++ * Bridge Protocol Data Units (BPDUs). 
This is because this switch intellectual ++ * property only lets the frames regarded as BPDUs bypass the spanning tree Port ++ * State function of the Forwarding Process. ++ * ++ * The only remaining interference is the ingress rules. When the reception Port ++ * has no PVID assigned on software, VLAN-untagged frames won't be allowed in. ++ * There doesn't seem to be a mechanism on the switch intellectual property to ++ * have link-local frames bypass this function of the Forwarding Process. + */ + static void + mt753x_trap_frames(struct mt7530_priv *priv) +@@ -1015,35 +1168,43 @@ mt753x_trap_frames(struct mt7530_priv *priv) + /* Trap 802.1X PAE frames and BPDUs to the CPU port(s) and egress them + * VLAN-untagged. + */ +- mt7530_rmw(priv, MT753X_BPC, MT753X_PAE_EG_TAG_MASK | +- MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK | +- MT753X_BPDU_PORT_FW_MASK, +- MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) | +- MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_BPDU_CPU_ONLY); ++ mt7530_rmw(priv, MT753X_BPC, ++ MT753X_PAE_BPDU_FR | MT753X_PAE_EG_TAG_MASK | ++ MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK | ++ MT753X_BPDU_PORT_FW_MASK, ++ MT753X_PAE_BPDU_FR | ++ MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) | ++ MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_BPDU_CPU_ONLY); + + /* Trap frames with :01 and :02 MAC DAs to the CPU port(s) and egress + * them VLAN-untagged. 
+ */ +- mt7530_rmw(priv, MT753X_RGAC1, MT753X_R02_EG_TAG_MASK | +- MT753X_R02_PORT_FW_MASK | MT753X_R01_EG_TAG_MASK | +- MT753X_R01_PORT_FW_MASK, +- MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) | +- MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_BPDU_CPU_ONLY); ++ mt7530_rmw(priv, MT753X_RGAC1, ++ MT753X_R02_BPDU_FR | MT753X_R02_EG_TAG_MASK | ++ MT753X_R02_PORT_FW_MASK | MT753X_R01_BPDU_FR | ++ MT753X_R01_EG_TAG_MASK | MT753X_R01_PORT_FW_MASK, ++ MT753X_R02_BPDU_FR | ++ MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) | ++ MT753X_R01_BPDU_FR | ++ MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_BPDU_CPU_ONLY); + + /* Trap frames with :03 and :0E MAC DAs to the CPU port(s) and egress + * them VLAN-untagged. + */ +- mt7530_rmw(priv, MT753X_RGAC2, MT753X_R0E_EG_TAG_MASK | +- MT753X_R0E_PORT_FW_MASK | MT753X_R03_EG_TAG_MASK | +- MT753X_R03_PORT_FW_MASK, +- MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) | +- MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | +- MT753X_BPDU_CPU_ONLY); ++ mt7530_rmw(priv, MT753X_RGAC2, ++ MT753X_R0E_BPDU_FR | MT753X_R0E_EG_TAG_MASK | ++ MT753X_R0E_PORT_FW_MASK | MT753X_R03_BPDU_FR | ++ MT753X_R03_EG_TAG_MASK | MT753X_R03_PORT_FW_MASK, ++ MT753X_R0E_BPDU_FR | ++ MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) | ++ MT753X_R03_BPDU_FR | ++ MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | ++ MT753X_BPDU_CPU_ONLY); + } + + static int +diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h +index 03598f9ae288c..299a26ad5809c 100644 +--- a/drivers/net/dsa/mt7530.h ++++ b/drivers/net/dsa/mt7530.h +@@ -64,6 +64,7 @@ enum mt753x_id { + + /* Registers for BPDU and PAE frame control*/ + #define MT753X_BPC 0x24 ++#define MT753X_PAE_BPDU_FR BIT(25) + #define MT753X_PAE_EG_TAG_MASK GENMASK(24, 22) + #define MT753X_PAE_EG_TAG(x) FIELD_PREP(MT753X_PAE_EG_TAG_MASK, x) + #define 
MT753X_PAE_PORT_FW_MASK GENMASK(18, 16) +@@ -74,20 +75,24 @@ enum mt753x_id { + + /* Register for :01 and :02 MAC DA frame control */ + #define MT753X_RGAC1 0x28 ++#define MT753X_R02_BPDU_FR BIT(25) + #define MT753X_R02_EG_TAG_MASK GENMASK(24, 22) + #define MT753X_R02_EG_TAG(x) FIELD_PREP(MT753X_R02_EG_TAG_MASK, x) + #define MT753X_R02_PORT_FW_MASK GENMASK(18, 16) + #define MT753X_R02_PORT_FW(x) FIELD_PREP(MT753X_R02_PORT_FW_MASK, x) ++#define MT753X_R01_BPDU_FR BIT(9) + #define MT753X_R01_EG_TAG_MASK GENMASK(8, 6) + #define MT753X_R01_EG_TAG(x) FIELD_PREP(MT753X_R01_EG_TAG_MASK, x) + #define MT753X_R01_PORT_FW_MASK GENMASK(2, 0) + + /* Register for :03 and :0E MAC DA frame control */ + #define MT753X_RGAC2 0x2c ++#define MT753X_R0E_BPDU_FR BIT(25) + #define MT753X_R0E_EG_TAG_MASK GENMASK(24, 22) + #define MT753X_R0E_EG_TAG(x) FIELD_PREP(MT753X_R0E_EG_TAG_MASK, x) + #define MT753X_R0E_PORT_FW_MASK GENMASK(18, 16) + #define MT753X_R0E_PORT_FW(x) FIELD_PREP(MT753X_R0E_PORT_FW_MASK, x) ++#define MT753X_R03_BPDU_FR BIT(9) + #define MT753X_R03_EG_TAG_MASK GENMASK(8, 6) + #define MT753X_R03_EG_TAG(x) FIELD_PREP(MT753X_R03_EG_TAG_MASK, x) + #define MT753X_R03_PORT_FW_MASK GENMASK(2, 0) +-- +2.43.0 + diff --git a/queue-5.15/net-ena-fix-incorrect-descriptor-free-behavior.patch b/queue-5.15/net-ena-fix-incorrect-descriptor-free-behavior.patch new file mode 100644 index 00000000000..1cb43b85a5f --- /dev/null +++ b/queue-5.15/net-ena-fix-incorrect-descriptor-free-behavior.patch @@ -0,0 +1,72 @@ +From b14a980b100ded9c1dff94ea989bfce3931a5a00 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 10 Apr 2024 09:13:57 +0000 +Subject: net: ena: Fix incorrect descriptor free behavior + +From: David Arinzon + +[ Upstream commit bf02d9fe00632d22fa91d34749c7aacf397b6cde ] + +ENA has two types of TX queues: +- queues which only process TX packets arriving from the network stack +- queues which only process TX packets forwarded to it by XDP_REDIRECT + or XDP_TX instructions + +The 
ena_free_tx_bufs() cycles through all descriptors in a TX queue +and unmaps + frees every descriptor that hasn't been acknowledged yet +by the device (uncompleted TX transactions). +The function assumes that the processed TX queue is necessarily from +the first category listed above and ends up using napi_consume_skb() +for descriptors belonging to an XDP specific queue. + +This patch solves a bug in which, in case of a VF reset, the +descriptors aren't freed correctly, leading to crashes. + +Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") +Signed-off-by: Shay Agroskin +Signed-off-by: David Arinzon +Reviewed-by: Shannon Nelson +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/amazon/ena/ena_netdev.c | 14 +++++++++++--- + 1 file changed, 11 insertions(+), 3 deletions(-) + +diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c +index 44b8df731c889..3ea449be7bdc3 100644 +--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c ++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c +@@ -1205,8 +1205,11 @@ static void ena_unmap_tx_buff(struct ena_ring *tx_ring, + static void ena_free_tx_bufs(struct ena_ring *tx_ring) + { + bool print_once = true; ++ bool is_xdp_ring; + u32 i; + ++ is_xdp_ring = ENA_IS_XDP_INDEX(tx_ring->adapter, tx_ring->qid); ++ + for (i = 0; i < tx_ring->ring_size; i++) { + struct ena_tx_buffer *tx_info = &tx_ring->tx_buffer_info[i]; + +@@ -1226,10 +1229,15 @@ static void ena_free_tx_bufs(struct ena_ring *tx_ring) + + ena_unmap_tx_buff(tx_ring, tx_info); + +- dev_kfree_skb_any(tx_info->skb); ++ if (is_xdp_ring) ++ xdp_return_frame(tx_info->xdpf); ++ else ++ dev_kfree_skb_any(tx_info->skb); + } +- netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev, +- tx_ring->qid)); ++ ++ if (!is_xdp_ring) ++ netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev, ++ tx_ring->qid)); + } + + static void ena_free_all_tx_bufs(struct ena_adapter *adapter) +-- +2.43.0 + diff 
--git a/queue-5.15/net-ena-fix-potential-sign-extension-issue.patch b/queue-5.15/net-ena-fix-potential-sign-extension-issue.patch
new file mode 100644
index 00000000000..53e04e4d30f
--- /dev/null
+++ b/queue-5.15/net-ena-fix-potential-sign-extension-issue.patch
@@ -0,0 +1,66 @@
+From 621fb41832da62226744d73d90d55eb86661a9d3 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 10 Apr 2024 09:13:55 +0000
+Subject: net: ena: Fix potential sign extension issue
+
+From: David Arinzon
+
+[ Upstream commit 713a85195aad25d8a26786a37b674e3e5ec09e3c ]
+
+Small unsigned types are promoted to larger signed types in
+the case of multiplication, the result of which may overflow.
+In case the result of such a multiplication has its MSB
+turned on, it will be sign extended with '1's.
+This changes the multiplication result.
+
+Code example of the phenomenon:
+-------------------------------
+u16 x, y;
+size_t z1, z2;
+
+x = y = 0xffff;
+printk("x=%x y=%x\n",x,y);
+
+z1 = x*y;
+z2 = (size_t)x*y;
+
+printk("z1=%lx z2=%lx\n", z1, z2);
+
+Output:
+-------
+x=ffff y=ffff
+z1=fffffffffffe0001 z2=fffe0001
+
+The expected result of ffff*ffff is fffe0001, and without the
+explicit casting to avoid the unwanted sign extension we got
+fffffffffffe0001.
+
+This commit adds an explicit casting to avoid the sign extension
+issue.
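The printk snippet in the commit message only builds inside the kernel. As a rough userspace sketch of the same effect, the helpers below reproduce both results with fixed-width types. The names mul_promoted() and mul_widened() are hypothetical and not part of the ENA driver. The sketch assumes a compiler such as gcc or clang, where converting an out-of-range value to int32_t wraps, and it does the 32-bit multiply in unsigned arithmetic to stay clear of signed-overflow undefined behavior.

```c
#include <stdint.h>

/*
 * Hypothetical helper: models `z1 = x * y` from the commit message.
 * Both u16 operands are promoted to a 32-bit int before the multiply,
 * so the product is a 32-bit value whose top bit may be set.
 */
static uint64_t mul_promoted(uint16_t x, uint16_t y)
{
	/* Multiply as unsigned 32-bit, then reinterpret as signed 32-bit. */
	int32_t prod = (int32_t)((uint32_t)x * (uint32_t)y);

	/* Widening a negative 32-bit value to 64 bits sign-extends it. */
	return (uint64_t)(int64_t)prod;
}

/*
 * Hypothetical helper: models `z2 = (size_t)x * y`, the shape of the fix.
 * Widening one operand first makes the whole multiply 64-bit and unsigned.
 */
static uint64_t mul_widened(uint16_t x, uint16_t y)
{
	return (uint64_t)x * y;
}
```

With x = y = 0xffff, mul_promoted() yields 0xfffffffffffe0001 and mul_widened() yields 0xfffe0001, matching the z1 and z2 values quoted in the commit message.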
+ +Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com") +Signed-off-by: Arthur Kiyanovski +Signed-off-by: David Arinzon +Reviewed-by: Shannon Nelson +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/amazon/ena/ena_com.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c +index 7979b10192425..e37c82eb62326 100644 +--- a/drivers/net/ethernet/amazon/ena/ena_com.c ++++ b/drivers/net/ethernet/amazon/ena/ena_com.c +@@ -362,7 +362,7 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev, + ENA_COM_BOUNCE_BUFFER_CNTRL_CNT; + io_sq->bounce_buf_ctrl.next_to_use = 0; + +- size = io_sq->bounce_buf_ctrl.buffer_size * ++ size = (size_t)io_sq->bounce_buf_ctrl.buffer_size * + io_sq->bounce_buf_ctrl.buffers_num; + + dev_node = dev_to_node(ena_dev->dmadev); +-- +2.43.0 + diff --git a/queue-5.15/net-ena-wrong-missing-io-completions-check-order.patch b/queue-5.15/net-ena-wrong-missing-io-completions-check-order.patch new file mode 100644 index 00000000000..d76db6d3782 --- /dev/null +++ b/queue-5.15/net-ena-wrong-missing-io-completions-check-order.patch @@ -0,0 +1,108 @@ +From c01fb83542af435de8945928b1820e2d999d92d8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 10 Apr 2024 09:13:56 +0000 +Subject: net: ena: Wrong missing IO completions check order + +From: David Arinzon + +[ Upstream commit f7e417180665234fdb7af2ebe33d89aaa434d16f ] + +Missing IO completions check is called every second (HZ jiffies). +This commit fixes several issues with this check: + +1. Duplicate queues check: + Max of 4 queues are scanned on each check due to monitor budget. 
+   Once reaching the budget, this check exits under the assumption that
+   the next check will continue to scan the remainder of the queues,
+   but in practice, next check will first scan the last already scanned
+   queue which is not necessary and may cause the full queue scan to
+   last a couple of seconds longer.
+   The fix is to start every check with the next queue to scan.
+   For example, on 8 IO queues:
+   Bug: [0,1,2,3], [3,4,5,6], [6,7]
+   Fix: [0,1,2,3], [4,5,6,7]
+
+2. Unbalanced queues check:
+   In case the number of active IO queues is not a multiple of budget,
+   there will be checks which don't utilize the full budget
+   because the full scan exits when reaching the last queue id.
+   The fix is to run every TX completion check with exact queue budget
+   regardless of the queue id.
+   For example, on 7 IO queues:
+   Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
+   Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
+   The budget may be lowered in case the number of IO queues is less
+   than the budget (4) to make sure there are no duplicate queues on
+   the same check.
+ For example, on 3 IO queues: + Bug: [0,1,2,0], [1,2,0,1] + Fix: [0,1,2], [0,1,2] + +Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") +Signed-off-by: Amit Bernstein +Signed-off-by: David Arinzon +Reviewed-by: Shannon Nelson +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/amazon/ena/ena_netdev.c | 21 +++++++++++--------- + 1 file changed, 12 insertions(+), 9 deletions(-) + +diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c +index 43c099141e211..44b8df731c889 100644 +--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c ++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c +@@ -3815,10 +3815,11 @@ static void check_for_missing_completions(struct ena_adapter *adapter) + { + struct ena_ring *tx_ring; + struct ena_ring *rx_ring; +- int i, budget, rc; ++ int qid, budget, rc; + int io_queue_count; + + io_queue_count = adapter->xdp_num_queues + adapter->num_io_queues; ++ + /* Make sure the driver doesn't turn the device in other process */ + smp_rmb(); + +@@ -3831,27 +3832,29 @@ static void check_for_missing_completions(struct ena_adapter *adapter) + if (adapter->missing_tx_completion_to == ENA_HW_HINTS_NO_TIMEOUT) + return; + +- budget = ENA_MONITORED_TX_QUEUES; ++ budget = min_t(u32, io_queue_count, ENA_MONITORED_TX_QUEUES); + +- for (i = adapter->last_monitored_tx_qid; i < io_queue_count; i++) { +- tx_ring = &adapter->tx_ring[i]; +- rx_ring = &adapter->rx_ring[i]; ++ qid = adapter->last_monitored_tx_qid; ++ ++ while (budget) { ++ qid = (qid + 1) % io_queue_count; ++ ++ tx_ring = &adapter->tx_ring[qid]; ++ rx_ring = &adapter->rx_ring[qid]; + + rc = check_missing_comp_in_tx_queue(adapter, tx_ring); + if (unlikely(rc)) + return; + +- rc = !ENA_IS_XDP_INDEX(adapter, i) ? ++ rc = !ENA_IS_XDP_INDEX(adapter, qid) ? 
+ check_for_rx_interrupt_queue(adapter, rx_ring) : 0; + if (unlikely(rc)) + return; + + budget--; +- if (!budget) +- break; + } + +- adapter->last_monitored_tx_qid = i % io_queue_count; ++ adapter->last_monitored_tx_qid = qid; + } + + /* trigger napi schedule after 2 consecutive detections */ +-- +2.43.0 + diff --git a/queue-5.15/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch b/queue-5.15/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch new file mode 100644 index 00000000000..d7fdb6690c2 --- /dev/null +++ b/queue-5.15/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch @@ -0,0 +1,66 @@ +From d165e12536235b076ea1fd57123832f80cde3d27 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Apr 2024 22:08:12 +0300 +Subject: net/mlx5: Properly link new fs rules into the tree + +From: Cosmin Ratiu + +[ Upstream commit 7c6782ad4911cbee874e85630226ed389ff2e453 ] + +Previously, add_rule_fg would only add newly created rules from the +handle into the tree when they had a refcount of 1. On the other hand, +create_flow_handle tries hard to find and reference already existing +identical rules instead of creating new ones. + +These two behaviors can result in a situation where create_flow_handle +1) creates a new rule and references it, then +2) in a subsequent step during the same handle creation references it + again, +resulting in a rule with a refcount of 2 that is not linked into the +tree, will have a NULL parent and root and will result in a crash when +the flow group is deleted because del_sw_hw_rule, invoked on rule +deletion, assumes node->parent is != NULL. + +This happened in the wild, due to another bug related to incorrect +handling of duplicate pkt_reformat ids, which lead to the code in +create_flow_handle incorrectly referencing a just-added rule in the same +flow handle, resulting in the problem described above. Full details are +at [1]. 
+ +This patch changes add_rule_fg to add new rules without parents into +the tree, properly initializing them and avoiding the crash. This makes +it more consistent with how rules are added to an FTE in +create_flow_handle. + +Fixes: 74491de93712 ("net/mlx5: Add multi dest support") +Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1] +Signed-off-by: Cosmin Ratiu +Reviewed-by: Tariq Toukan +Reviewed-by: Mark Bloch +Signed-off-by: Saeed Mahameed +Signed-off-by: Tariq Toukan +Link: https://lore.kernel.org/r/20240409190820.227554-5-tariqt@nvidia.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +index 161ad2ae40196..a55cacb988ac2 100644 +--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c ++++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +@@ -1682,8 +1682,9 @@ static struct mlx5_flow_handle *add_rule_fg(struct mlx5_flow_group *fg, + } + trace_mlx5_fs_set_fte(fte, false); + ++ /* Link newly added rules into the tree. 
*/ + for (i = 0; i < handle->num_rules; i++) { +- if (refcount_read(&handle->rule[i]->node.refcount) == 1) { ++ if (!handle->rule[i]->node.parent) { + tree_add_node(&handle->rule[i]->node, &fte->node); + trace_mlx5_fs_add_rule(handle->rule[i]); + } +-- +2.43.0 + diff --git a/queue-5.15/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch b/queue-5.15/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch new file mode 100644 index 00000000000..f57bf8ec914 --- /dev/null +++ b/queue-5.15/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch @@ -0,0 +1,60 @@ +From 37df019242cb60b5cbb0870bb3c0adabcf75b1e4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 3 Apr 2024 22:38:01 +0200 +Subject: net: openvswitch: fix unwanted error log on timeout policy probing + +From: Ilya Maximets + +[ Upstream commit 4539f91f2a801c0c028c252bffae56030cfb2cae ] + +On startup, ovs-vswitchd probes different datapath features including +support for timeout policies. While probing, it tries to execute +certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE +attributes set. These attributes tell the openvswitch module to not +log any errors when they occur as it is expected that some of the +probes will fail. + +For some reason, setting the timeout policy ignores the PROBE attribute +and logs a failure anyway. This is causing the following kernel log +on each re-start of ovs-vswitchd: + + kernel: Failed to associated timeout policy `ovs_test_tp' + +Fix that by using the same logging macro that all other messages are +using. The message will still be printed at info level when needed +and will be rate limited, but with a net rate limiter instead of +generic printk one. + +The nf_ct_set_timeout() itself will still print some info messages, +but at least this change makes logging in openvswitch module more +consistent. 
+ +Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action") +Signed-off-by: Ilya Maximets +Acked-by: Eelco Chaudron +Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/openvswitch/conntrack.c | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c +index 7106ce231a2dd..60dd6f32d520e 100644 +--- a/net/openvswitch/conntrack.c ++++ b/net/openvswitch/conntrack.c +@@ -1704,8 +1704,9 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr, + if (ct_info.timeout[0]) { + if (nf_ct_set_timeout(net, ct_info.ct, family, key->ip.proto, + ct_info.timeout)) +- pr_info_ratelimited("Failed to associated timeout " +- "policy `%s'\n", ct_info.timeout); ++ OVS_NLERR(log, ++ "Failed to associated timeout policy '%s'", ++ ct_info.timeout); + else + ct_info.nf_ct_timeout = rcu_dereference( + nf_ct_timeout_find(ct_info.ct)->timeout); +-- +2.43.0 + diff --git a/queue-5.15/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch b/queue-5.15/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch new file mode 100644 index 00000000000..9dcaca3b5b1 --- /dev/null +++ b/queue-5.15/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch @@ -0,0 +1,47 @@ +From a7e207cc27c1c50eed9f5f511c294eb56a171736 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Apr 2024 12:41:59 +0200 +Subject: net: sparx5: fix wrong config being used when reconfiguring PCS + +From: Daniel Machon + +[ Upstream commit 33623113a48ea906f1955cbf71094f6aa4462e8f ] + +The wrong port config is being used if the PCS is reconfigured. Fix this +by correctly using the new config instead of the old one. 
+ +Fixes: 946e7fd5053a ("net: sparx5: add port module support") +Signed-off-by: Daniel Machon +Reviewed-by: Jacob Keller +Link: https://lore.kernel.org/r/20240409-link-mode-reconfiguration-fix-v2-1-db6a507f3627@microchip.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/microchip/sparx5/sparx5_port.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c +index 189a6a0a2e08a..8561a7bf53e19 100644 +--- a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c ++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c +@@ -730,7 +730,7 @@ static int sparx5_port_pcs_low_set(struct sparx5 *sparx5, + bool sgmii = false, inband_aneg = false; + int err; + +- if (port->conf.inband) { ++ if (conf->inband) { + if (conf->portmode == PHY_INTERFACE_MODE_SGMII || + conf->portmode == PHY_INTERFACE_MODE_QSGMII) + inband_aneg = true; /* Cisco-SGMII in-band-aneg */ +@@ -947,7 +947,7 @@ int sparx5_port_pcs_set(struct sparx5 *sparx5, + if (err) + return -EINVAL; + +- if (port->conf.inband) { ++ if (conf->inband) { + /* Enable/disable 1G counters in ASM */ + spx5_rmw(ASM_PORT_CFG_CSC_STAT_DIS_SET(high_speed_dev), + ASM_PORT_CFG_CSC_STAT_DIS, +-- +2.43.0 + diff --git a/queue-5.15/netfilter-complete-validation-of-user-input.patch b/queue-5.15/netfilter-complete-validation-of-user-input.patch new file mode 100644 index 00000000000..3b951a08822 --- /dev/null +++ b/queue-5.15/netfilter-complete-validation-of-user-input.patch @@ -0,0 +1,102 @@ +From 3cf2385060b88a0f898001ca94662b62bffa65ac Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 9 Apr 2024 12:07:41 +0000 +Subject: netfilter: complete validation of user input + +From: Eric Dumazet + +[ Upstream commit 65acf6e0501ac8880a4f73980d01b5d27648b956 ] + +In my recent commit, I missed that do_replace() handlers +use copy_from_sockptr() (which I fixed), followed +by unsafe 
copy_from_sockptr_offset() calls. + +In all functions, we can perform the @optlen validation +before even calling xt_alloc_table_info() with the following +check: + +if ((u64)optlen < (u64)tmp.size + sizeof(tmp)) + return -EINVAL; + +Fixes: 0c83842df40f ("netfilter: validate user input for expected length") +Reported-by: syzbot +Signed-off-by: Eric Dumazet +Reviewed-by: Pablo Neira Ayuso +Link: https://lore.kernel.org/r/20240409120741.3538135-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/ipv4/netfilter/arp_tables.c | 4 ++++ + net/ipv4/netfilter/ip_tables.c | 4 ++++ + net/ipv6/netfilter/ip6_tables.c | 4 ++++ + 3 files changed, 12 insertions(+) + +diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c +index 07ecb16231cd0..a9d5a1973224a 100644 +--- a/net/ipv4/netfilter/arp_tables.c ++++ b/net/ipv4/netfilter/arp_tables.c +@@ -965,6 +965,8 @@ static int do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +@@ -1265,6 +1267,8 @@ static int compat_do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c +index 1e1e7488d6bf1..aee7cd584c926 100644 +--- a/net/ipv4/netfilter/ip_tables.c ++++ b/net/ipv4/netfilter/ip_tables.c +@@ -1119,6 +1119,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +@@ -1505,6 +1507,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + 
if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c +index b17990d514ee9..afd22ea9f555b 100644 +--- a/net/ipv6/netfilter/ip6_tables.c ++++ b/net/ipv6/netfilter/ip6_tables.c +@@ -1137,6 +1137,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +@@ -1515,6 +1517,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len) + return -ENOMEM; + if (tmp.num_counters == 0) + return -EINVAL; ++ if ((u64)len < (u64)tmp.size + sizeof(tmp)) ++ return -EINVAL; + + tmp.name[sizeof(tmp.name)-1] = 0; + +-- +2.43.0 + diff --git a/queue-5.15/nouveau-fix-function-cast-warning.patch b/queue-5.15/nouveau-fix-function-cast-warning.patch new file mode 100644 index 00000000000..aa561e2bff3 --- /dev/null +++ b/queue-5.15/nouveau-fix-function-cast-warning.patch @@ -0,0 +1,51 @@ +From cd150c36afa6b3b3e20772aeee83d8f6f198c972 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 4 Apr 2024 18:02:25 +0200 +Subject: nouveau: fix function cast warning + +From: Arnd Bergmann + +[ Upstream commit 185fdb4697cc9684a02f2fab0530ecdd0c2f15d4 ] + +Calling a function through an incompatible pointer type causes breaks +kcfi, so clang warns about the assignment: + +drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] + 73 | .fini = (void(*)(void *))kfree, + +Avoid this with a trivial wrapper. 
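The wrapper approach can be tried outside the kernel with a small stand-in. In this sketch, release_buf() plays the role of kfree() (it takes a const-qualified pointer) and struct source_ops plays the role of the callback table. All names here are hypothetical and chosen for illustration only; they are not nouveau symbols.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for kfree(): note the const-qualified parameter. */
static void release_buf(const void *p)
{
	free((void *)p);
}

/* Hypothetical callback table; .fini expects void (*)(void *). */
struct source_ops {
	void *(*init)(size_t len);
	void (*fini)(void *p);
};

static void *ops_init(size_t len)
{
	return calloc(1, len);
}

/*
 * Trivial wrapper whose prototype matches the .fini slot exactly, so no
 * function-pointer cast is needed and the indirect call goes through a
 * pointer of the same type as the function it targets.
 */
static void ops_fini(void *p)
{
	release_buf(p);
}

static const struct source_ops ops = {
	.init = ops_init,
	.fini = ops_fini,
};

/* Allocate and release one buffer through the table; returns 0 on success. */
static int run_once(void)
{
	void *data = ops.init(16);

	if (!data)
		return -1;
	ops.fini(data);
	return 0;
}
```

Assigning release_buf to .fini directly would need a cast like the one the patch removes; the wrapper costs one extra call and keeps the types honest.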
+ +Fixes: c39f472e9f14 ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)") +Signed-off-by: Arnd Bergmann +Signed-off-by: Danilo Krummrich +Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org +Signed-off-by: Sasha Levin +--- + drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c +index 4bf486b571013..cb05f7f48a98b 100644 +--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c ++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c +@@ -66,11 +66,16 @@ of_init(struct nvkm_bios *bios, const char *name) + return ERR_PTR(-EINVAL); + } + ++static void of_fini(void *p) ++{ ++ kfree(p); ++} ++ + const struct nvbios_source + nvbios_of = { + .name = "OpenFirmware", + .init = of_init, +- .fini = (void(*)(void *))kfree, ++ .fini = of_fini, + .read = of_read, + .size = of_size, + .rw = false, +-- +2.43.0 + diff --git a/queue-5.15/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch b/queue-5.15/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch new file mode 100644 index 00000000000..75f3641ace0 --- /dev/null +++ b/queue-5.15/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch @@ -0,0 +1,59 @@ +From 2f2f64ee15944e3ed3e58227d2817925da0410e2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Apr 2024 12:06:43 +0530 +Subject: octeontx2-af: Fix NIX SQ mode and BP config + +From: Geetha sowjanya + +[ Upstream commit faf23006185e777db18912685922c5ddb2df383f ] + +NIX SQ mode and link backpressure configuration is required for +all platforms. But in current driver this code is wrongly placed +under specific platform check. This patch fixes the issue by +moving the code out of platform check. 
+ +Fixes: 5d9b976d4480 ("octeontx2-af: Support fixed transmit scheduler topology") +Signed-off-by: Geetha sowjanya +Link: https://lore.kernel.org/r/20240408063643.26288-1-gakula@marvell.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + .../ethernet/marvell/octeontx2/af/rvu_nix.c | 20 +++++++++---------- + 1 file changed, 10 insertions(+), 10 deletions(-) + +diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +index bda93e550b08a..34a9a9164f3c6 100644 +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +@@ -4184,18 +4184,18 @@ static int rvu_nix_block_init(struct rvu *rvu, struct nix_hw *nix_hw) + */ + rvu_write64(rvu, blkaddr, NIX_AF_CFG, + rvu_read64(rvu, blkaddr, NIX_AF_CFG) | 0x40ULL); ++ } + +- /* Set chan/link to backpressure TL3 instead of TL2 */ +- rvu_write64(rvu, blkaddr, NIX_AF_PSE_CHANNEL_LEVEL, 0x01); ++ /* Set chan/link to backpressure TL3 instead of TL2 */ ++ rvu_write64(rvu, blkaddr, NIX_AF_PSE_CHANNEL_LEVEL, 0x01); + +- /* Disable SQ manager's sticky mode operation (set TM6 = 0) +- * This sticky mode is known to cause SQ stalls when multiple +- * SQs are mapped to same SMQ and transmitting pkts at a time. +- */ +- cfg = rvu_read64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS); +- cfg &= ~BIT_ULL(15); +- rvu_write64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS, cfg); +- } ++ /* Disable SQ manager's sticky mode operation (set TM6 = 0) ++ * This sticky mode is known to cause SQ stalls when multiple ++ * SQs are mapped to same SMQ and transmitting pkts at a time. 
++ */ ++ cfg = rvu_read64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS); ++ cfg &= ~BIT_ULL(15); ++ rvu_write64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS, cfg); + + ltdefs = rvu->kpu.lt_def; + /* Calibrate X2P bus to check if CGX/LBK links are fine */ +-- +2.43.0 + diff --git a/queue-5.15/revert-drm-qxl-simplify-qxl_fence_wait.patch b/queue-5.15/revert-drm-qxl-simplify-qxl_fence_wait.patch new file mode 100644 index 00000000000..0497ca52bf3 --- /dev/null +++ b/queue-5.15/revert-drm-qxl-simplify-qxl_fence_wait.patch @@ -0,0 +1,115 @@ +From ecd0a4ea412d9ffcaadddca4ba3b5c6c3f79b3bb Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 4 Apr 2024 19:14:48 +0100 +Subject: Revert "drm/qxl: simplify qxl_fence_wait" + +From: Alex Constantino + +[ Upstream commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea ] + +This reverts commit 5a838e5d5825c85556011478abde708251cc0776. + +Changes from commit 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") would +result in a '[TTM] Buffer eviction failed' exception whenever it reached a +timeout. +Due to a dependency to DMA_FENCE_WARN this also restores some code deleted +by commit d72277b6c37d ("dma-buf: nuke DMA_FENCE_TRACE macros v2"). 
+ +Fixes: 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") +Link: https://lore.kernel.org/regressions/ZTgydqRlK6WX_b29@eldamar.lan/ +Reported-by: Timo Lindfors +Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514 +Signed-off-by: Alex Constantino +Signed-off-by: Maxime Ripard +Link: https://patchwork.freedesktop.org/patch/msgid/20240404181448.1643-2-dreaming.about.electric.sheep@gmail.com +Signed-off-by: Sasha Levin +--- + drivers/gpu/drm/qxl/qxl_release.c | 50 +++++++++++++++++++++++++++---- + include/linux/dma-fence.h | 7 +++++ + 2 files changed, 52 insertions(+), 5 deletions(-) + +diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c +index b19f2f00b2158..d4f26075383da 100644 +--- a/drivers/gpu/drm/qxl/qxl_release.c ++++ b/drivers/gpu/drm/qxl/qxl_release.c +@@ -58,16 +58,56 @@ static long qxl_fence_wait(struct dma_fence *fence, bool intr, + signed long timeout) + { + struct qxl_device *qdev; ++ struct qxl_release *release; ++ int count = 0, sc = 0; ++ bool have_drawable_releases; + unsigned long cur, end = jiffies + timeout; + + qdev = container_of(fence->lock, struct qxl_device, release_lock); ++ release = container_of(fence, struct qxl_release, base); ++ have_drawable_releases = release->type == QXL_RELEASE_DRAWABLE; + +- if (!wait_event_timeout(qdev->release_event, +- (dma_fence_is_signaled(fence) || +- (qxl_io_notify_oom(qdev), 0)), +- timeout)) +- return 0; ++retry: ++ sc++; ++ ++ if (dma_fence_is_signaled(fence)) ++ goto signaled; ++ ++ qxl_io_notify_oom(qdev); ++ ++ for (count = 0; count < 11; count++) { ++ if (!qxl_queue_garbage_collect(qdev, true)) ++ break; ++ ++ if (dma_fence_is_signaled(fence)) ++ goto signaled; ++ } ++ ++ if (dma_fence_is_signaled(fence)) ++ goto signaled; ++ ++ if (have_drawable_releases || sc < 4) { ++ if (sc > 2) ++ /* back off */ ++ usleep_range(500, 1000); ++ ++ if (time_after(jiffies, end)) ++ return 0; ++ ++ if (have_drawable_releases && sc > 300) { ++ DMA_FENCE_WARN(fence, ++ 
"failed to wait on release %llu after spincount %d\n", ++ fence->context & ~0xf0000000, sc); ++ goto signaled; ++ } ++ goto retry; ++ } ++ /* ++ * yeah, original sync_obj_wait gave up after 3 spins when ++ * have_drawable_releases is not set. ++ */ + ++signaled: + cur = jiffies; + if (time_after(cur, end)) + return 0; +diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h +index 9d276655cc25a..6659d0369ec5c 100644 +--- a/include/linux/dma-fence.h ++++ b/include/linux/dma-fence.h +@@ -631,4 +631,11 @@ u64 dma_fence_context_alloc(unsigned num); + ##args); \ + } while (0) + ++#define DMA_FENCE_WARN(f, fmt, args...) \ ++ do { \ ++ struct dma_fence *__ff = (f); \ ++ pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\ ++ ##args); \ ++ } while (0) ++ + #endif /* __LINUX_DMA_FENCE_H */ +-- +2.43.0 + diff --git a/queue-5.15/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch b/queue-5.15/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch new file mode 100644 index 00000000000..3c6ff414277 --- /dev/null +++ b/queue-5.15/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch @@ -0,0 +1,39 @@ +From c1b2ef3b46e0e00404c208e748f366c337108fde Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 2 Apr 2024 12:56:54 +0300 +Subject: scsi: qla2xxx: Fix off by one in qla_edif_app_getstats() + +From: Dan Carpenter + +[ Upstream commit 4406e4176f47177f5e51b4cc7e6a7a2ff3dbfbbd ] + +The app_reply->elem[] array is allocated earlier in this function and it +has app_req.num_ports elements. Thus this > comparison needs to be >= to +prevent memory corruption. + +Fixes: 7878f22a2e03 ("scsi: qla2xxx: edif: Add getfcinfo and statistic bsgs") +Signed-off-by: Dan Carpenter +Link: https://lore.kernel.org/r/5c125b2f-92dd-412b-9b6f-fc3a3207bd60@moroto.mountain +Reviewed-by: Himanshu Madhani +Signed-off-by: Martin K. 
Petersen +Signed-off-by: Sasha Levin +--- + drivers/scsi/qla2xxx/qla_edif.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c +index 40a03f9c2d21f..ac702f74dd984 100644 +--- a/drivers/scsi/qla2xxx/qla_edif.c ++++ b/drivers/scsi/qla2xxx/qla_edif.c +@@ -1012,7 +1012,7 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job) + + list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) { + if (fcport->edif.enable) { +- if (pcnt > app_req.num_ports) ++ if (pcnt >= app_req.num_ports) + break; + + app_reply->elem[pcnt].rekey_count = +-- +2.43.0 + diff --git a/queue-5.15/series b/queue-5.15/series index 5ebe62a63d7..11a8eb62931 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -2,3 +2,28 @@ batman-adv-avoid-infinite-loop-trying-to-resize-local-tt.patch ring-buffer-only-update-pages_touched-when-a-new-page-is-touched.patch bluetooth-fix-memory-leak-in-hci_req_sync_complete.patch media-cec-core-remove-length-check-of-timer-status.patch +arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch +revert-drm-qxl-simplify-qxl_fence_wait.patch +nouveau-fix-function-cast-warning.patch +scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch +net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch +u64_stats-disable-preemption-on-32bit-up-smp-preempt.patch +xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch +geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch +af_unix-clear-stale-u-oob_skb.patch +octeontx2-af-fix-nix-sq-mode-and-bp-config.patch +ipv6-fib-hide-unused-pn-variable.patch +ipv4-route-avoid-unused-but-set-variable-warning.patch +ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch +netfilter-complete-validation-of-user-input.patch +net-mlx5-properly-link-new-fs-rules-into-the-tree.patch +net-sparx5-fix-wrong-config-being-used-when-reconfig.patch +net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch 
+af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch
+af_unix-fix-garbage-collector-racing-against-connect.patch
+net-ena-fix-potential-sign-extension-issue.patch
+net-ena-wrong-missing-io-completions-check-order.patch
+net-ena-fix-incorrect-descriptor-free-behavior.patch
+tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
+tracing-hide-unused-ftrace_event_id_fops.patch
+iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
diff --git a/queue-5.15/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch b/queue-5.15/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
new file mode 100644
index 00000000000..a9d71700dac
--- /dev/null
+++ b/queue-5.15/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
@@ -0,0 +1,42 @@
+From 16c9bf4ba17abcfd8818814a7ad662ca950a6f8e Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 22 Mar 2024 17:48:01 +0530
+Subject: tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry
+
+From: Prasad Pandit
+
+[ Upstream commit d96c36004e31e2baaf8ea1b449b7d0b2c2bfb41a ]
+
+Fix FTRACE_RECORD_RECURSION_SIZE entry, replace tab with
+a space character. It helps Kconfig parsers to read file
+without error.
+ +Link: https://lore.kernel.org/linux-trace-kernel/20240322121801.1803948-1-ppandit@redhat.com + +Cc: Masami Hiramatsu +Cc: Mathieu Desnoyers +Fixes: 773c16705058 ("ftrace: Add recording of functions that caused recursion") +Signed-off-by: Prasad Pandit +Reviewed-by: Randy Dunlap +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Sasha Levin +--- + kernel/trace/Kconfig | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig +index 4265d125d50f3..6caa551b7fc98 100644 +--- a/kernel/trace/Kconfig ++++ b/kernel/trace/Kconfig +@@ -847,7 +847,7 @@ config FTRACE_RECORD_RECURSION + + config FTRACE_RECORD_RECURSION_SIZE + int "Max number of recursed functions to record" +- default 128 ++ default 128 + depends on FTRACE_RECORD_RECURSION + help + This defines the limit of number of functions that can be +-- +2.43.0 + diff --git a/queue-5.15/tracing-hide-unused-ftrace_event_id_fops.patch b/queue-5.15/tracing-hide-unused-ftrace_event_id_fops.patch new file mode 100644 index 00000000000..c567515026f --- /dev/null +++ b/queue-5.15/tracing-hide-unused-ftrace_event_id_fops.patch @@ -0,0 +1,76 @@ +From d0a37e9c5504eade98c9a202a3ff9a673fe4867e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 3 Apr 2024 10:06:24 +0200 +Subject: tracing: hide unused ftrace_event_id_fops +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Arnd Bergmann + +[ Upstream commit 5281ec83454d70d98b71f1836fb16512566c01cd ] + +When CONFIG_PERF_EVENTS, a 'make W=1' build produces a warning about the +unused ftrace_event_id_fops variable: + +kernel/trace/trace_events.c:2155:37: error: 'ftrace_event_id_fops' defined but not used [-Werror=unused-const-variable=] + 2155 | static const struct file_operations ftrace_event_id_fops = { + +Hide this in the same #ifdef as the reference to it. 
+ +Link: https://lore.kernel.org/linux-trace-kernel/20240403080702.3509288-7-arnd@kernel.org + +Cc: Masami Hiramatsu +Cc: Oleg Nesterov +Cc: Mathieu Desnoyers +Cc: Zheng Yejian +Cc: Kees Cook +Cc: Ajay Kaher +Cc: Jinjie Ruan +Cc: Clément Léger +Cc: Dan Carpenter +Cc: "Tzvetomir Stoyanov (VMware)" +Fixes: 620a30e97feb ("tracing: Don't pass file_operations array to event_create_dir()") +Signed-off-by: Arnd Bergmann +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Sasha Levin +--- + kernel/trace/trace_events.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c +index 0a7348b90ba50..1f4f3096b9ac4 100644 +--- a/kernel/trace/trace_events.c ++++ b/kernel/trace/trace_events.c +@@ -1645,6 +1645,7 @@ static int trace_format_open(struct inode *inode, struct file *file) + return 0; + } + ++#ifdef CONFIG_PERF_EVENTS + static ssize_t + event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) + { +@@ -1659,6 +1660,7 @@ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, len); + } ++#endif + + static ssize_t + event_filter_read(struct file *filp, char __user *ubuf, size_t cnt, +@@ -2104,10 +2106,12 @@ static const struct file_operations ftrace_event_format_fops = { + .release = seq_release, + }; + ++#ifdef CONFIG_PERF_EVENTS + static const struct file_operations ftrace_event_id_fops = { + .read = event_id_read, + .llseek = default_llseek, + }; ++#endif + + static const struct file_operations ftrace_event_filter_fops = { + .open = tracing_open_file_tr, +-- +2.43.0 + diff --git a/queue-5.15/u64_stats-disable-preemption-on-32bit-up-smp-preempt.patch b/queue-5.15/u64_stats-disable-preemption-on-32bit-up-smp-preempt.patch new file mode 100644 index 00000000000..29f9c0dda40 --- /dev/null +++ b/queue-5.15/u64_stats-disable-preemption-on-32bit-up-smp-preempt.patch @@ -0,0 +1,164 @@ +From 
89c53c2100704bb006e63f6e40ea44ba2a859d3b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 10 Dec 2021 21:29:59 +0100 +Subject: u64_stats: Disable preemption on 32bit UP+SMP PREEMPT_RT during + updates. + +From: Sebastian Andrzej Siewior + +[ Upstream commit 3c118547f87e930d45a5787e386734015dd93b32 ] + +On PREEMPT_RT the seqcount_t for synchronisation is required on 32bit +architectures even on UP because the softirq (and the threaded IRQ handler) can +be preempted. + +With the seqcount_t for synchronisation, a reader with higher priority can +preempt the writer and then spin endlessly in read_seqcount_begin() while the +writer can't make progress. + +To avoid such a lock up on PREEMPT_RT the writer must disable preemption during +the update. There is no need to disable interrupts because no writer is using +this API in hard-IRQ context on PREEMPT_RT. + +Disable preemption on 32bit-RT within the u64_stats write section. + +Signed-off-by: Sebastian Andrzej Siewior +Signed-off-by: David S. 
Miller +Stable-dep-of: 38a15d0a50e0 ("u64_stats: fix u64_stats_init() for lockdep when used repeatedly in one file") +Signed-off-by: Sasha Levin +--- + include/linux/u64_stats_sync.h | 42 ++++++++++++++++++++++------------ + 1 file changed, 28 insertions(+), 14 deletions(-) + +diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h +index e81856c0ba134..6a0f2097d3709 100644 +--- a/include/linux/u64_stats_sync.h ++++ b/include/linux/u64_stats_sync.h +@@ -66,7 +66,7 @@ + #include + + struct u64_stats_sync { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + seqcount_t seq; + #endif + }; +@@ -115,7 +115,7 @@ static inline void u64_stats_inc(u64_stats_t *p) + } + #endif + +-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + #define u64_stats_init(syncp) seqcount_init(&(syncp)->seq) + #else + static inline void u64_stats_init(struct u64_stats_sync *syncp) +@@ -125,15 +125,19 @@ static inline void u64_stats_init(struct u64_stats_sync *syncp) + + static inline void u64_stats_update_begin(struct u64_stats_sync *syncp) + { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_disable(); + write_seqcount_begin(&syncp->seq); + #endif + } + + static inline void u64_stats_update_end(struct u64_stats_sync *syncp) + { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + write_seqcount_end(&syncp->seq); ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_enable(); + #endif + } + +@@ -142,8 +146,11 @@ u64_stats_update_begin_irqsave(struct u64_stats_sync *syncp) + { + unsigned long flags = 0; + +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) +- local_irq_save(flags); ++#if BITS_PER_LONG == 32 && 
(defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_disable(); ++ else ++ local_irq_save(flags); + write_seqcount_begin(&syncp->seq); + #endif + return flags; +@@ -153,15 +160,18 @@ static inline void + u64_stats_update_end_irqrestore(struct u64_stats_sync *syncp, + unsigned long flags) + { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + write_seqcount_end(&syncp->seq); +- local_irq_restore(flags); ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_enable(); ++ else ++ local_irq_restore(flags); + #endif + } + + static inline unsigned int __u64_stats_fetch_begin(const struct u64_stats_sync *syncp) + { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + return read_seqcount_begin(&syncp->seq); + #else + return 0; +@@ -170,7 +180,7 @@ static inline unsigned int __u64_stats_fetch_begin(const struct u64_stats_sync * + + static inline unsigned int u64_stats_fetch_begin(const struct u64_stats_sync *syncp) + { +-#if BITS_PER_LONG==32 && !defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (!defined(CONFIG_SMP) && !defined(CONFIG_PREEMPT_RT)) + preempt_disable(); + #endif + return __u64_stats_fetch_begin(syncp); +@@ -179,7 +189,7 @@ static inline unsigned int u64_stats_fetch_begin(const struct u64_stats_sync *sy + static inline bool __u64_stats_fetch_retry(const struct u64_stats_sync *syncp, + unsigned int start) + { +-#if BITS_PER_LONG==32 && defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && (defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)) + return read_seqcount_retry(&syncp->seq, start); + #else + return false; +@@ -189,7 +199,7 @@ static inline bool __u64_stats_fetch_retry(const struct u64_stats_sync *syncp, + static inline bool u64_stats_fetch_retry(const struct u64_stats_sync *syncp, + unsigned int start) + { +-#if BITS_PER_LONG==32 && !defined(CONFIG_SMP) ++#if 
BITS_PER_LONG == 32 && (!defined(CONFIG_SMP) && !defined(CONFIG_PREEMPT_RT)) + preempt_enable(); + #endif + return __u64_stats_fetch_retry(syncp, start); +@@ -203,7 +213,9 @@ static inline bool u64_stats_fetch_retry(const struct u64_stats_sync *syncp, + */ + static inline unsigned int u64_stats_fetch_begin_irq(const struct u64_stats_sync *syncp) + { +-#if BITS_PER_LONG==32 && !defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && defined(CONFIG_PREEMPT_RT) ++ preempt_disable(); ++#elif BITS_PER_LONG == 32 && !defined(CONFIG_SMP) + local_irq_disable(); + #endif + return __u64_stats_fetch_begin(syncp); +@@ -212,7 +224,9 @@ static inline unsigned int u64_stats_fetch_begin_irq(const struct u64_stats_sync + static inline bool u64_stats_fetch_retry_irq(const struct u64_stats_sync *syncp, + unsigned int start) + { +-#if BITS_PER_LONG==32 && !defined(CONFIG_SMP) ++#if BITS_PER_LONG == 32 && defined(CONFIG_PREEMPT_RT) ++ preempt_enable(); ++#elif BITS_PER_LONG == 32 && !defined(CONFIG_SMP) + local_irq_enable(); + #endif + return __u64_stats_fetch_retry(syncp, start); +-- +2.43.0 + diff --git a/queue-5.15/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch b/queue-5.15/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch new file mode 100644 index 00000000000..8d28461442c --- /dev/null +++ b/queue-5.15/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch @@ -0,0 +1,176 @@ +From 261d3fb9712d8959162c30b5bdfd603c3bededbe Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 4 Apr 2024 20:27:38 +0000 +Subject: xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Eric Dumazet + +[ Upstream commit 237f3cf13b20db183d3706d997eedc3c49eacd44 ] + +syzbot reported an illegal copy in xsk_setsockopt() [1] + +Make sure to validate setsockopt() @optlen parameter. 
+ +[1] + + BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] + BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] + BUG: KASAN: slab-out-of-bounds in xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420 +Read of size 4 at addr ffff888028c6cde3 by task syz-executor.0/7549 + +CPU: 0 PID: 7549 Comm: syz-executor.0 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 +Call Trace: + + __dump_stack lib/dump_stack.c:88 [inline] + dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 + print_address_description mm/kasan/report.c:377 [inline] + print_report+0x169/0x550 mm/kasan/report.c:488 + kasan_report+0x143/0x180 mm/kasan/report.c:601 + copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] + copy_from_sockptr include/linux/sockptr.h:55 [inline] + xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420 + do_sock_setsockopt+0x3af/0x720 net/socket.c:2311 + __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 + __do_sys_setsockopt net/socket.c:2343 [inline] + __se_sys_setsockopt net/socket.c:2340 [inline] + __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 + do_syscall_64+0xfb/0x240 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 +RIP: 0033:0x7fb40587de69 +Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 +RSP: 002b:00007fb40665a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 +RAX: ffffffffffffffda RBX: 00007fb4059abf80 RCX: 00007fb40587de69 +RDX: 0000000000000005 RSI: 000000000000011b RDI: 0000000000000006 +RBP: 00007fb4058ca47a R08: 0000000000000002 R09: 0000000000000000 +R10: 0000000020001980 R11: 0000000000000246 R12: 0000000000000000 +R13: 000000000000000b R14: 00007fb4059abf80 R15: 00007fff57ee4d08 + + +Allocated by task 7549: + kasan_save_stack mm/kasan/common.c:47 [inline] + 
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 + poison_kmalloc_redzone mm/kasan/common.c:370 [inline] + __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387 + kasan_kmalloc include/linux/kasan.h:211 [inline] + __do_kmalloc_node mm/slub.c:3966 [inline] + __kmalloc+0x233/0x4a0 mm/slub.c:3979 + kmalloc include/linux/slab.h:632 [inline] + __cgroup_bpf_run_filter_setsockopt+0xd2f/0x1040 kernel/bpf/cgroup.c:1869 + do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293 + __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 + __do_sys_setsockopt net/socket.c:2343 [inline] + __se_sys_setsockopt net/socket.c:2340 [inline] + __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 + do_syscall_64+0xfb/0x240 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 + +The buggy address belongs to the object at ffff888028c6cde0 + which belongs to the cache kmalloc-8 of size 8 +The buggy address is located 1 bytes to the right of + allocated 2-byte region [ffff888028c6cde0, ffff888028c6cde2) + +The buggy address belongs to the physical page: +page:ffffea0000a31b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888028c6c9c0 pfn:0x28c6c +anon flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff) +page_type: 0xffffffff() +raw: 00fff00000000800 ffff888014c41280 0000000000000000 dead000000000001 +raw: ffff888028c6c9c0 0000000080800057 00000001ffffffff 0000000000000000 +page dumped because: kasan: bad access detected +page_owner tracks the page as allocated +page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 6648, tgid 6644 (syz-executor.0), ts 133906047828, free_ts 133859922223 + set_page_owner include/linux/page_owner.h:31 [inline] + post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533 + prep_new_page mm/page_alloc.c:1540 [inline] + get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311 + __alloc_pages+0x256/0x680 mm/page_alloc.c:4569 + __alloc_pages_node include/linux/gfp.h:238 [inline] + alloc_pages_node include/linux/gfp.h:261 [inline] 
+ alloc_slab_page+0x5f/0x160 mm/slub.c:2175 + allocate_slab mm/slub.c:2338 [inline] + new_slab+0x84/0x2f0 mm/slub.c:2391 + ___slab_alloc+0xc73/0x1260 mm/slub.c:3525 + __slab_alloc mm/slub.c:3610 [inline] + __slab_alloc_node mm/slub.c:3663 [inline] + slab_alloc_node mm/slub.c:3835 [inline] + __do_kmalloc_node mm/slub.c:3965 [inline] + __kmalloc_node+0x2db/0x4e0 mm/slub.c:3973 + kmalloc_node include/linux/slab.h:648 [inline] + __vmalloc_area_node mm/vmalloc.c:3197 [inline] + __vmalloc_node_range+0x5f9/0x14a0 mm/vmalloc.c:3392 + __vmalloc_node mm/vmalloc.c:3457 [inline] + vzalloc+0x79/0x90 mm/vmalloc.c:3530 + bpf_check+0x260/0x19010 kernel/bpf/verifier.c:21162 + bpf_prog_load+0x1667/0x20f0 kernel/bpf/syscall.c:2895 + __sys_bpf+0x4ee/0x810 kernel/bpf/syscall.c:5631 + __do_sys_bpf kernel/bpf/syscall.c:5738 [inline] + __se_sys_bpf kernel/bpf/syscall.c:5736 [inline] + __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736 + do_syscall_64+0xfb/0x240 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 +page last free pid 6650 tgid 6647 stack trace: + reset_page_owner include/linux/page_owner.h:24 [inline] + free_pages_prepare mm/page_alloc.c:1140 [inline] + free_unref_page_prepare+0x95d/0xa80 mm/page_alloc.c:2346 + free_unref_page_list+0x5a3/0x850 mm/page_alloc.c:2532 + release_pages+0x2117/0x2400 mm/swap.c:1042 + tlb_batch_pages_flush mm/mmu_gather.c:98 [inline] + tlb_flush_mmu_free mm/mmu_gather.c:293 [inline] + tlb_flush_mmu+0x34d/0x4e0 mm/mmu_gather.c:300 + tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:392 + exit_mmap+0x4b6/0xd40 mm/mmap.c:3300 + __mmput+0x115/0x3c0 kernel/fork.c:1345 + exit_mm+0x220/0x310 kernel/exit.c:569 + do_exit+0x99e/0x27e0 kernel/exit.c:865 + do_group_exit+0x207/0x2c0 kernel/exit.c:1027 + get_signal+0x176e/0x1850 kernel/signal.c:2907 + arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:310 + exit_to_user_mode_loop kernel/entry/common.c:105 [inline] + exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline] + __syscall_exit_to_user_mode_work 
kernel/entry/common.c:201 [inline] + syscall_exit_to_user_mode+0xc9/0x360 kernel/entry/common.c:212 + do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89 + entry_SYSCALL_64_after_hwframe+0x6d/0x75 + +Memory state around the buggy address: + ffff888028c6cc80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc + ffff888028c6cd00: fa fc fc fc fa fc fc fc 00 fc fc fc 06 fc fc fc +>ffff888028c6cd80: fa fc fc fc fa fc fc fc fa fc fc fc 02 fc fc fc + ^ + ffff888028c6ce00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc + ffff888028c6ce80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc + +Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap") +Reported-by: syzbot +Signed-off-by: Eric Dumazet +Cc: "Björn Töpel" +Cc: Magnus Karlsson +Cc: Maciej Fijalkowski +Cc: Jonathan Lemon +Acked-by: Daniel Borkmann +Link: https://lore.kernel.org/r/20240404202738.3634547-1-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + net/xdp/xsk.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c +index e5eb5616be0ca..1f61d15b3d1d4 100644 +--- a/net/xdp/xsk.c ++++ b/net/xdp/xsk.c +@@ -1135,6 +1135,8 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname, + struct xsk_queue **q; + int entries; + ++ if (optlen < sizeof(entries)) ++ return -EINVAL; + if (copy_from_sockptr(&entries, optval, sizeof(entries))) + return -EFAULT; + +-- +2.43.0 + -- 2.47.2
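
Note on the xsk change above: the KASAN report shows a 4-byte read from a 2-byte allocation, i.e. setsockopt() was called with an optlen smaller than sizeof(int). The following is a minimal userspace sketch of the same check, not the kernel code itself. copy_option() and set_ring_entries() are hypothetical names introduced here for illustration; copy_option() merely stands in for copy_from_sockptr().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's copy_from_sockptr():
 * it copies `size` bytes from the caller's buffer and trusts that the
 * buffer really is that large.
 */
static int copy_option(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

/*
 * Sketch of the validated path: a buffer shorter than the value being
 * read is rejected with -EINVAL before any copy is attempted, which
 * mirrors the two lines the patch adds to xsk_setsockopt().
 */
int set_ring_entries(const void *optval, unsigned int optlen,
		     int *entries_out)
{
	int entries;

	if (optlen < sizeof(entries))
		return -EINVAL;
	if (copy_option(&entries, optval, sizeof(entries)))
		return -EFAULT;

	*entries_out = entries;
	return 0;
}
```

With this check in place, a 2-byte buffer such as the one in the report is refused up front, while a full-sized int is copied as before.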