Fixes for 6.1

author Sasha Levin <sashal@kernel.org>

Mon, 15 Apr 2024 08:56:16 +0000 (04:56 -0400)

committer Sasha Levin <sashal@kernel.org>

Mon, 15 Apr 2024 08:56:16 +0000 (04:56 -0400)
diff --git a/queue-6.1/af_unix-clear-stale-u-oob_skb.patch b/queue-6.1/af_unix-clear-stale-u-oob_skb.patch

new file mode 100644

index 0000000..e075160
--- /dev/null
+++ b/queue-6.1/af_unix-clear-stale-u-oob_skb.patch
@@ -0,0 +1,104 @@
+From 40abb36f74cf22d0ac5b4480daa6721c0e34646e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 15:10:57 -0700
+Subject: af_unix: Clear stale u->oob_skb.
+
+From: Kuniyuki Iwashima <kuniyu@amazon.com>
+
+[ Upstream commit b46f4eaa4f0ec38909fb0072eea3aeddb32f954e ]
+
+syzkaller started to report deadlock of unix_gc_lock after commit
+4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but
+it just uncovers the bug that has been there since commit 314001f0bf92
+("af_unix: Add OOB support").
+
+The repro basically does the following.
+
+  from socket import *
+  from array import array
+
+  c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
+  c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB)
+  c2.recv(1)  # blocked as no normal data in recv queue
+
+  c2.close()  # done async and unblock recv()
+  c1.close()  # done async and trigger GC
+
+A socket sends its file descriptor to itself as OOB data and tries to
+receive normal data, but finally recv() fails due to async close().
+
+The problem here is wrong handling of OOB skb in manage_oob().  When
+recvmsg() is called without MSG_OOB, manage_oob() is called to check
+if the peeked skb is OOB skb.  In such a case, manage_oob() pops it
+out of the receive queue but does not clear unix_sock(sk)->oob_skb.
+This is wrong in terms of uAPI.
+
+Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB.
+The 'o' is handled as OOB data.  When recv() is called twice without
+MSG_OOB, the OOB data should be lost.
+
+  >>> from socket import *
+  >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0)
+  >>> c1.send(b'hello', MSG_OOB)  # 'o' is OOB data
+  5
+  >>> c1.send(b'world')
+  5
+  >>> c2.recv(5)  # OOB data is not received
+  b'hell'
+  >>> c2.recv(5)  # OOB data is skipped
+  b'world'
+  >>> c2.recv(5, MSG_OOB)  # This should return an error
+  b'o'
+
+In the same situation, TCP actually returns -EINVAL for the last
+recv().
+
+Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always sets
+EPOLLPRI even though the data has been passed over by a previous recv().
+
+To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing
+it from recv queue.
+
+The reason why the old GC did not trigger the deadlock is because the
+old GC relied on the receive queue to detect the loop.
+
+When it is triggered, the socket with OOB data is marked as GC candidate
+because file refcount == inflight count (1).  However, after traversing
+all inflight sockets, the socket still has a positive inflight count (1),
+thus the socket is excluded from candidates.  Then, the old GC loses the
+chance to garbage-collect the socket.
+
+With the old GC, the repro continues to create true garbage that will
+never be freed nor detected by kmemleak as it's linked to the global
+inflight list.  That's why we couldn't even notice the issue.
+
+Fixes: 314001f0bf92 ("af_unix: Add OOB support")
+Reported-by: syzbot+7f7f201cc2668a8fd169@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169
+Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Link: https://lore.kernel.org/r/20240405221057.2406-1-kuniyu@amazon.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/unix/af_unix.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index e1af94393789f..373530303ad19 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -2677,7 +2677,9 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
+                               }
+                       } else if (!(flags & MSG_PEEK)) {
+                               skb_unlink(skb, &sk->sk_receive_queue);
+-                              consume_skb(skb);
++                              WRITE_ONCE(u->oob_skb, NULL);
++                              if (!WARN_ON_ONCE(skb_unref(skb)))
++                                      kfree_skb(skb);
+                               skb = skb_peek(&sk->sk_receive_queue);
+                       }
+               }
+-- 
+2.43.0
+
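The stale-pointer behaviour described in the commit message above can be sketched with a toy model. This is not kernel code: `ToyUnixSock`, `clear_on_dequeue`, and `run` are hypothetical names, and the model only mirrors the "hello"/"world" example, assuming the last byte of an MSG_OOB send is the OOB byte.

```python
from collections import deque


class ToyUnixSock:
    """Toy model of a stream receive queue plus an oob_skb pointer."""

    def __init__(self, clear_on_dequeue):
        self.queue = deque()        # stands in for sk->sk_receive_queue
        self.oob_skb = None         # stands in for unix_sk(sk)->oob_skb
        self.clear_on_dequeue = clear_on_dequeue

    def send(self, data, oob=False):
        if oob:
            # Only the last byte is OOB; the rest is queued as normal data.
            if data[:-1]:
                self.queue.append({"data": data[:-1], "oob": False})
            skb = {"data": data[-1:], "oob": True}
            self.queue.append(skb)
            self.oob_skb = skb
        else:
            self.queue.append({"data": data, "oob": False})

    def recv(self):
        """recv() without MSG_OOB: drop an OOB skb found at the queue head."""
        if self.queue and self.queue[0]["oob"]:
            self.queue.popleft()
            if self.clear_on_dequeue:
                self.oob_skb = None  # the step the patch adds
        return self.queue.popleft()["data"] if self.queue else b""

    def recv_oob(self):
        """recv() with MSG_OOB: fail when no OOB data is pending."""
        if self.oob_skb is None:
            raise OSError("EINVAL")
        data, self.oob_skb = self.oob_skb["data"], None
        return data

    def poll_pri(self):
        """Would poll() report EPOLLPRI?"""
        return self.oob_skb is not None


def run(clear_on_dequeue):
    sk = ToyUnixSock(clear_on_dequeue)
    sk.send(b"hello", oob=True)
    sk.send(b"world")
    first, second = sk.recv(), sk.recv()
    pri = sk.poll_pri()
    try:
        third = sk.recv_oob()
    except OSError as exc:
        third = str(exc)
    return first, second, third, pri
```

With `clear_on_dequeue=False` the model still hands out `b'o'` and reports priority data after both normal reads; with `clear_on_dequeue=True` the last read fails, matching the TCP behaviour the commit message cites.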
diff --git a/queue-6.1/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch b/queue-6.1/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch

new file mode 100644

index 0000000..e28699e
--- /dev/null
+++ b/queue-6.1/af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch
@@ -0,0 +1,147 @@
+From 91ded5f845b3a6a7fe1902dd1c9100d4b0980cfc Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 23 Jan 2024 09:08:53 -0800
+Subject: af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
+
+From: Kuniyuki Iwashima <kuniyu@amazon.com>
+
+[ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ]
+
+When touching unix_sk(sk)->inflight, we are always under
+spin_lock(&unix_gc_lock).
+
+Let's convert unix_sk(sk)->inflight to the normal unsigned long.
+
+Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Reviewed-by: Simon Horman <horms@kernel.org>
+Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Stable-dep-of: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/net/af_unix.h |  2 +-
+ net/unix/af_unix.c    |  4 ++--
+ net/unix/garbage.c    | 17 ++++++++---------
+ net/unix/scm.c        |  8 +++++---
+ 4 files changed, 16 insertions(+), 15 deletions(-)
+
+diff --git a/include/net/af_unix.h b/include/net/af_unix.h
+index 0920b669b9b31..16d6936baa2fb 100644
+--- a/include/net/af_unix.h
++++ b/include/net/af_unix.h
+@@ -54,7 +54,7 @@ struct unix_sock {
+       struct mutex            iolock, bindlock;
+       struct sock             *peer;
+       struct list_head        link;
+-      atomic_long_t           inflight;
++      unsigned long           inflight;
+       spinlock_t              lock;
+       unsigned long           gc_flags;
+ #define UNIX_GC_CANDIDATE     0
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index 373530303ad19..0a75d76535f75 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -968,11 +968,11 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern,
+       sk->sk_write_space      = unix_write_space;
+       sk->sk_max_ack_backlog  = net->unx.sysctl_max_dgram_qlen;
+       sk->sk_destruct         = unix_sock_destructor;
+-      u         = unix_sk(sk);
++      u = unix_sk(sk);
++      u->inflight = 0;
+       u->path.dentry = NULL;
+       u->path.mnt = NULL;
+       spin_lock_init(&u->lock);
+-      atomic_long_set(&u->inflight, 0);
+       INIT_LIST_HEAD(&u->link);
+       mutex_init(&u->iolock); /* single task reading lock */
+       mutex_init(&u->bindlock); /* single task binding lock */
+diff --git a/net/unix/garbage.c b/net/unix/garbage.c
+index 9bfffe2a7f020..7b326582d97da 100644
+--- a/net/unix/garbage.c
++++ b/net/unix/garbage.c
+@@ -166,17 +166,18 @@ static void scan_children(struct sock *x, void (*func)(struct unix_sock *),
+ 
+ static void dec_inflight(struct unix_sock *usk)
+ {
+-      atomic_long_dec(&usk->inflight);
++      usk->inflight--;
+ }
+ 
+ static void inc_inflight(struct unix_sock *usk)
+ {
+-      atomic_long_inc(&usk->inflight);
++      usk->inflight++;
+ }
+ 
+ static void inc_inflight_move_tail(struct unix_sock *u)
+ {
+-      atomic_long_inc(&u->inflight);
++      u->inflight++;
++
+       /* If this still might be part of a cycle, move it to the end
+        * of the list, so that it's checked even if it was already
+        * passed over
+@@ -237,14 +238,12 @@ void unix_gc(void)
+        */
+       list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
+               long total_refs;
+-              long inflight_refs;
+ 
+               total_refs = file_count(u->sk.sk_socket->file);
+-              inflight_refs = atomic_long_read(&u->inflight);
+ 
+-              BUG_ON(inflight_refs < 1);
+-              BUG_ON(total_refs < inflight_refs);
+-              if (total_refs == inflight_refs) {
++              BUG_ON(!u->inflight);
++              BUG_ON(total_refs < u->inflight);
++              if (total_refs == u->inflight) {
+                       list_move_tail(&u->link, &gc_candidates);
+                       __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
+                       __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
+@@ -271,7 +270,7 @@ void unix_gc(void)
+               /* Move cursor to after the current position. */
+               list_move(&cursor, &u->link);
+ 
+-              if (atomic_long_read(&u->inflight) > 0) {
++              if (u->inflight) {
+                       list_move_tail(&u->link, &not_cycle_list);
+                       __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
+                       scan_children(&u->sk, inc_inflight_move_tail, NULL);
+diff --git a/net/unix/scm.c b/net/unix/scm.c
+index d1048b4c2baaf..4eff7da9f6f96 100644
+--- a/net/unix/scm.c
++++ b/net/unix/scm.c
+@@ -52,12 +52,13 @@ void unix_inflight(struct user_struct *user, struct file *fp)
+       if (s) {
+               struct unix_sock *u = unix_sk(s);
+ 
+-              if (atomic_long_inc_return(&u->inflight) == 1) {
++              if (!u->inflight) {
+                       BUG_ON(!list_empty(&u->link));
+                       list_add_tail(&u->link, &gc_inflight_list);
+               } else {
+                       BUG_ON(list_empty(&u->link));
+               }
++              u->inflight++;
+               /* Paired with READ_ONCE() in wait_for_unix_gc() */
+               WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + 1);
+       }
+@@ -74,10 +75,11 @@ void unix_notinflight(struct user_struct *user, struct file *fp)
+       if (s) {
+               struct unix_sock *u = unix_sk(s);
+ 
+-              BUG_ON(!atomic_long_read(&u->inflight));
++              BUG_ON(!u->inflight);
+               BUG_ON(list_empty(&u->link));
+ 
+-              if (atomic_long_dec_and_test(&u->inflight))
++              u->inflight--;
++              if (!u->inflight)
+                       list_del_init(&u->link);
+               /* Paired with READ_ONCE() in wait_for_unix_gc() */
+               WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - 1);
+-- 
+2.43.0
+
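The reasoning in the patch above (a counter that is only ever touched under one lock does not need atomic operations) can be sketched in userspace. This is a minimal illustration, not the kernel implementation: `gc_lock`, `ToySock`, `toy_inflight`, and `toy_notinflight` are hypothetical stand-ins for `unix_gc_lock`, `struct unix_sock`, `unix_inflight()`, and `unix_notinflight()`.

```python
import threading

gc_lock = threading.Lock()   # stands in for unix_gc_lock
gc_inflight_list = []        # stands in for gc_inflight_list


class ToySock:
    def __init__(self, name):
        self.name = name
        self.inflight = 0    # plain integer, only touched under gc_lock


def toy_inflight(u):
    """Mirror of the patched unix_inflight(): test first, then increment."""
    with gc_lock:
        if not u.inflight:
            assert u not in gc_inflight_list
            gc_inflight_list.append(u)
        else:
            assert u in gc_inflight_list
        u.inflight += 1


def toy_notinflight(u):
    """Mirror of the patched unix_notinflight(): decrement, then test."""
    with gc_lock:
        assert u.inflight
        assert u in gc_inflight_list
        u.inflight -= 1
        if not u.inflight:
            gc_inflight_list.remove(u)
```

Note the ordering change that replaces `atomic_long_inc_return(...) == 1`: the zero test now happens before the increment, which is only safe because the lock is held across both steps.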
diff --git a/queue-6.1/af_unix-fix-garbage-collector-racing-against-connect.patch b/queue-6.1/af_unix-fix-garbage-collector-racing-against-connect.patch

new file mode 100644

index 0000000..1d6e3c0
--- /dev/null
+++ b/queue-6.1/af_unix-fix-garbage-collector-racing-against-connect.patch
@@ -0,0 +1,122 @@
+From 104c4fc2d2aa99ffd2f7190155d01675476aa57d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 22:09:39 +0200
+Subject: af_unix: Fix garbage collector racing against connect()
+
+From: Michal Luczaj <mhal@rbox.co>
+
+[ Upstream commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51 ]
+
+Garbage collector does not take into account the risk of embryo getting
+enqueued during the garbage collection. If such embryo has a peer that
+carries SCM_RIGHTS, two consecutive passes of scan_children() may see a
+different set of children, leading to an incorrectly elevated inflight
+count, and then a dangling pointer within the gc_inflight_list.
+
+sockets are AF_UNIX/SOCK_STREAM
+S is an unconnected socket
+L is a listening in-flight socket bound to addr, not in fdtable
+V's fd will be passed via sendmsg(), gets inflight count bumped
+
+connect(S, addr)       sendmsg(S, [V]); close(V)       __unix_gc()
+----------------       -------------------------       -----------
+
+NS = unix_create1()
+skb1 = sock_wmalloc(NS)
+L = unix_find_other(addr)
+unix_state_lock(L)
+unix_peer(S) = NS
+                       // V count=1 inflight=0
+
+                       NS = unix_peer(S)
+                       skb2 = sock_alloc()
+                       skb_queue_tail(NS, skb2[V])
+
+                       // V became in-flight
+                       // V count=2 inflight=1
+
+                       close(V)
+
+                       // V count=1 inflight=1
+                       // GC candidate condition met
+
+                                               for u in gc_inflight_list:
+                                                 if (total_refs == inflight_refs)
+                                                   add u to gc_candidates
+
+                                               // gc_candidates={L, V}
+
+                                               for u in gc_candidates:
+                                                 scan_children(u, dec_inflight)
+
+                                               // embryo (skb1) was not
+                                               // reachable from L yet, so V's
+                                               // inflight remains unchanged
+__skb_queue_tail(L, skb1)
+unix_state_unlock(L)
+                                               for u in gc_candidates:
+                                                 if (u.inflight)
+                                                   scan_children(u, inc_inflight_move_tail)
+
+                                               // V count=1 inflight=2 (!)
+
+If there is a GC-candidate listening socket, lock/unlock its state. This
+makes GC wait until the end of any ongoing connect() to that socket. After
+flipping the lock, a possibly SCM-laden embryo is already enqueued. And if
+there is another embryo coming, it can not possibly carry SCM_RIGHTS. At
+this point, unix_inflight() can not happen because unix_gc_lock is already
+taken. Inflight graph remains unaffected.
+
+Fixes: 1fd05ba5a2f2 ("[AF_UNIX]: Rewrite garbage collector, fixes race.")
+Signed-off-by: Michal Luczaj <mhal@rbox.co>
+Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/unix/garbage.c | 18 +++++++++++++++++-
+ 1 file changed, 17 insertions(+), 1 deletion(-)
+
+diff --git a/net/unix/garbage.c b/net/unix/garbage.c
+index 7b326582d97da..85c6f05c0fa3c 100644
+--- a/net/unix/garbage.c
++++ b/net/unix/garbage.c
+@@ -235,11 +235,22 @@ void unix_gc(void)
+        * receive queues.  Other, non candidate sockets _can_ be
+        * added to queue, so we must make sure only to touch
+        * candidates.
++       *
++       * Embryos, though never candidates themselves, affect which
++       * candidates are reachable by the garbage collector.  Before
++       * being added to a listener's queue, an embryo may already
++       * receive data carrying SCM_RIGHTS, potentially making the
++       * passed socket a candidate that is not yet reachable by the
++       * collector.  It becomes reachable once the embryo is
++       * enqueued.  Therefore, we must ensure that no SCM-laden
++       * embryo appears in a (candidate) listener's queue between
++       * consecutive scan_children() calls.
+        */
+       list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
++              struct sock *sk = &u->sk;
+               long total_refs;
+ 
+-              total_refs = file_count(u->sk.sk_socket->file);
++              total_refs = file_count(sk->sk_socket->file);
+ 
+               BUG_ON(!u->inflight);
+               BUG_ON(total_refs < u->inflight);
+@@ -247,6 +258,11 @@ void unix_gc(void)
+                       list_move_tail(&u->link, &gc_candidates);
+                       __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags);
+                       __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags);
++
++                      if (sk->sk_state == TCP_LISTEN) {
++                              unix_state_lock(sk);
++                              unix_state_unlock(sk);
++                      }
+               }
+       }
+ 
+-- 
+2.43.0
+
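The race in the timeline above can be replayed deterministically with a toy model. This is a simplified sketch under stated assumptions: `ToySock`, `scan_children`, and `toy_gc` are hypothetical names, only V's counter is tracked, and the listener L is assumed to keep a positive inflight count so the second pass runs.

```python
class ToySock:
    def __init__(self, name, inflight=0):
        self.name = name
        self.inflight = inflight
        self.embryos = []   # embryos queued on a listening socket
        self.queue = []     # each entry: list of sockets passed via SCM_RIGHTS


def scan_children(listener, func):
    """Visit every socket reachable through the listener's queued embryos."""
    for embryo in listener.embryos:
        for passed_fds in embryo.queue:
            for child in passed_fds:
                func(child)


def dec_inflight(s):
    s.inflight -= 1


def inc_inflight(s):
    s.inflight += 1


def toy_gc(connect_finishes_before_gc):
    v = ToySock("V", inflight=1)
    listener = ToySock("L", inflight=1)
    embryo = ToySock("NS")
    embryo.queue.append([v])                 # sendmsg(S, [V]) landed on NS

    if connect_finishes_before_gc:
        listener.embryos.append(embryo)      # GC waited on the state lock
    scan_children(listener, dec_inflight)    # first pass
    if not connect_finishes_before_gc:
        listener.embryos.append(embryo)      # racing connect() lands mid-GC
    if listener.inflight:
        scan_children(listener, inc_inflight)  # second pass
    return v.inflight
```

In this model `toy_gc(False)` returns 2 (the "inflight=2 (!)" case in the timeline), while `toy_gc(True)` returns 1, because both passes see the same set of children once the embryo is enqueued before the first scan.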
diff --git a/queue-6.1/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch b/queue-6.1/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch

new file mode 100644

index 0000000..722278a
--- /dev/null
+++ b/queue-6.1/arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch
@@ -0,0 +1,95 @@
+From 5988a6a1d560ecd77ad855ce6e218dc25eae01e0 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 22 Mar 2024 12:47:05 -0400
+Subject: arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order
+
+From: Frank Li <Frank.Li@nxp.com>
+
+[ Upstream commit c6ddd6e7b166532a0816825442ff60f70aed9647 ]
+
+The actual clock shows the wrong frequency:
+
+   echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control
+   cat /sys/kernel/debug/mmc0/ios
+
+   clock:          200000000 Hz
+   actual clock:   166000000 Hz
+                   ^^^^^^^^^
+   .....
+
+According to
+
+sdhc0_lpcg: clock-controller@5b200000 {
+                compatible = "fsl,imx8qxp-lpcg";
+                reg = <0x5b200000 0x10000>;
+                #clock-cells = <1>;
+                clocks = <&clk IMX_SC_R_SDHC_0 IMX_SC_PM_CLK_PER>,
+                         <&conn_ipg_clk>, <&conn_axi_clk>;
+                clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_4>,
+                                <IMX_LPCG_CLK_5>;
+                clock-output-names = "sdhc0_lpcg_per_clk",
+                                     "sdhc0_lpcg_ipg_clk",
+                                     "sdhc0_lpcg_ahb_clk";
+                power-domains = <&pd IMX_SC_R_SDHC_0>;
+        }
+
+"per_clk" should be IMX_LPCG_CLK_0 instead of IMX_LPCG_CLK_5.
+
+After correcting the clock order:
+
+   echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control
+   cat /sys/kernel/debug/mmc0/ios
+
+   clock:          200000000 Hz
+   actual clock:   198000000 Hz
+                   ^^^^^^^^
+   ...
+
+Fixes: 16c4ea7501b1 ("arm64: dts: imx8: switch to new lpcg clock binding")
+Signed-off-by: Frank Li <Frank.Li@nxp.com>
+Signed-off-by: Shawn Guo <shawnguo@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+
+diff --git a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
+index 10370d1a6c6de..dbb298b907c1c 100644
+--- a/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi
+@@ -38,8 +38,8 @@ usdhc1: mmc@5b010000 {
+               interrupts = <GIC_SPI 232 IRQ_TYPE_LEVEL_HIGH>;
+               reg = <0x5b010000 0x10000>;
+               clocks = <&sdhc0_lpcg IMX_LPCG_CLK_4>,
+-                       <&sdhc0_lpcg IMX_LPCG_CLK_0>,
+-                       <&sdhc0_lpcg IMX_LPCG_CLK_5>;
++                       <&sdhc0_lpcg IMX_LPCG_CLK_5>,
++                       <&sdhc0_lpcg IMX_LPCG_CLK_0>;
+               clock-names = "ipg", "ahb", "per";
+               power-domains = <&pd IMX_SC_R_SDHC_0>;
+               status = "disabled";
+@@ -49,8 +49,8 @@ usdhc2: mmc@5b020000 {
+               interrupts = <GIC_SPI 233 IRQ_TYPE_LEVEL_HIGH>;
+               reg = <0x5b020000 0x10000>;
+               clocks = <&sdhc1_lpcg IMX_LPCG_CLK_4>,
+-                       <&sdhc1_lpcg IMX_LPCG_CLK_0>,
+-                       <&sdhc1_lpcg IMX_LPCG_CLK_5>;
++                       <&sdhc1_lpcg IMX_LPCG_CLK_5>,
++                       <&sdhc1_lpcg IMX_LPCG_CLK_0>;
+               clock-names = "ipg", "ahb", "per";
+               power-domains = <&pd IMX_SC_R_SDHC_1>;
+               fsl,tuning-start-tap = <20>;
+@@ -62,8 +62,8 @@ usdhc3: mmc@5b030000 {
+               interrupts = <GIC_SPI 234 IRQ_TYPE_LEVEL_HIGH>;
+               reg = <0x5b030000 0x10000>;
+               clocks = <&sdhc2_lpcg IMX_LPCG_CLK_4>,
+-                       <&sdhc2_lpcg IMX_LPCG_CLK_0>,
+-                       <&sdhc2_lpcg IMX_LPCG_CLK_5>;
++                       <&sdhc2_lpcg IMX_LPCG_CLK_5>,
++                       <&sdhc2_lpcg IMX_LPCG_CLK_0>;
+               clock-names = "ipg", "ahb", "per";
+               power-domains = <&pd IMX_SC_R_SDHC_2>;
+               status = "disabled";
+-- 
+2.43.0
+
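Because `clock-names` is matched to `clocks` by position, the effect of the reordering above can be checked with a few lines of Python. This is only an illustration: `LPCG_OUTPUTS` is transcribed from the `sdhc0_lpcg` node quoted in the commit message, and `resolve` is a hypothetical helper, not a devicetree API.

```python
# Which LPCG output each index selects, per the sdhc0_lpcg node quoted above.
LPCG_OUTPUTS = {
    "IMX_LPCG_CLK_0": "sdhc0_lpcg_per_clk",
    "IMX_LPCG_CLK_4": "sdhc0_lpcg_ipg_clk",
    "IMX_LPCG_CLK_5": "sdhc0_lpcg_ahb_clk",
}

# clock-names of the usdhc nodes; entry i names entry i of "clocks".
CLOCK_NAMES = ["ipg", "ahb", "per"]


def resolve(clocks):
    """Pair each clock name with the LPCG output its index selects."""
    return {name: LPCG_OUTPUTS[idx] for name, idx in zip(CLOCK_NAMES, clocks)}


before = resolve(["IMX_LPCG_CLK_4", "IMX_LPCG_CLK_0", "IMX_LPCG_CLK_5"])
after = resolve(["IMX_LPCG_CLK_4", "IMX_LPCG_CLK_5", "IMX_LPCG_CLK_0"])
```

Here `before["per"]` resolves to `sdhc0_lpcg_ahb_clk` and `before["ahb"]` to `sdhc0_lpcg_per_clk`, i.e. the two were swapped; in `after` every name resolves to its own output.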
diff --git a/queue-6.1/bluetooth-l2cap-fix-not-validating-setsockopt-user-i.patch b/queue-6.1/bluetooth-l2cap-fix-not-validating-setsockopt-user-i.patch

new file mode 100644

index 0000000..641a3b8
--- /dev/null
+++ b/queue-6.1/bluetooth-l2cap-fix-not-validating-setsockopt-user-i.patch
@@ -0,0 +1,165 @@
+From f13bb13d3d246fd03872195eafa213855a6c567e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 15:50:47 -0400
+Subject: Bluetooth: L2CAP: Fix not validating setsockopt user input
+
+From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
+
+[ Upstream commit 4f3951242ace5efc7131932e2e01e6ac6baed846 ]
+
+Check user input length before copying data.
+
+Fixes: 33575df7be67 ("Bluetooth: move l2cap_sock_setsockopt() to l2cap_sock.c")
+Fixes: 3ee7b7cd8390 ("Bluetooth: Add BT_MODE socket option")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/bluetooth/l2cap_sock.c | 52 +++++++++++++++-----------------------
+ 1 file changed, 20 insertions(+), 32 deletions(-)
+
+diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
+index 947ca580bb9a2..4198ca66fbe10 100644
+--- a/net/bluetooth/l2cap_sock.c
++++ b/net/bluetooth/l2cap_sock.c
+@@ -745,7 +745,7 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
+       struct sock *sk = sock->sk;
+       struct l2cap_chan *chan = l2cap_pi(sk)->chan;
+       struct l2cap_options opts;
+-      int len, err = 0;
++      int err = 0;
+       u32 opt;
+ 
+       BT_DBG("sk %p", sk);
+@@ -772,11 +772,9 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
+               opts.max_tx   = chan->max_tx;
+               opts.txwin_size = chan->tx_win;
+ 
+-              len = min_t(unsigned int, sizeof(opts), optlen);
+-              if (copy_from_sockptr(&opts, optval, len)) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opts, sizeof(opts), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opts.txwin_size > L2CAP_DEFAULT_EXT_WINDOW) {
+                       err = -EINVAL;
+@@ -819,10 +817,9 @@ static int l2cap_sock_setsockopt_old(struct socket *sock, int optname,
+               break;
+ 
+       case L2CAP_LM:
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt & L2CAP_LM_FIPS) {
+                       err = -EINVAL;
+@@ -903,7 +900,7 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+       struct bt_security sec;
+       struct bt_power pwr;
+       struct l2cap_conn *conn;
+-      int len, err = 0;
++      int err = 0;
+       u32 opt;
+       u16 mtu;
+       u8 mode;
+@@ -929,11 +926,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+ 
+               sec.level = BT_SECURITY_LOW;
+ 
+-              len = min_t(unsigned int, sizeof(sec), optlen);
+-              if (copy_from_sockptr(&sec, optval, len)) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&sec, sizeof(sec), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (sec.level < BT_SECURITY_LOW ||
+                   sec.level > BT_SECURITY_FIPS) {
+@@ -978,10 +973,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+                       break;
+               }
+ 
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt) {
+                       set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
+@@ -993,10 +987,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+               break;
+ 
+       case BT_FLUSHABLE:
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt > BT_FLUSHABLE_ON) {
+                       err = -EINVAL;
+@@ -1028,11 +1021,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+ 
+               pwr.force_active = BT_POWER_FORCE_ACTIVE_ON;
+ 
+-              len = min_t(unsigned int, sizeof(pwr), optlen);
+-              if (copy_from_sockptr(&pwr, optval, len)) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&pwr, sizeof(pwr), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (pwr.force_active)
+                       set_bit(FLAG_FORCE_ACTIVE, &chan->flags);
+@@ -1041,10 +1032,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+               break;
+ 
+       case BT_CHANNEL_POLICY:
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt > BT_CHANNEL_POLICY_AMP_PREFERRED) {
+                       err = -EINVAL;
+@@ -1089,10 +1079,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+                       break;
+               }
+ 
+-              if (copy_from_sockptr(&mtu, optval, sizeof(u16))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&mtu, sizeof(mtu), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (chan->mode == L2CAP_MODE_EXT_FLOWCTL &&
+                   sk->sk_state == BT_CONNECTED)
+@@ -1120,10 +1109,9 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
+                       break;
+               }
+ 
+-              if (copy_from_sockptr(&mode, optval, sizeof(u8))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&mode, sizeof(mode), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               BT_DBG("mode %u", mode);
+ 
+-- 
+2.43.0
+
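Several hunks above replace a `min_t()`-bounded copy with a strict length check, which changes what happens to a short `optval`. The sketch below models that difference in userspace; `old_copy`, `new_copy`, and the two-field `=HH` layout are hypothetical and are not the real `l2cap_options` or `bt_security` layouts.

```python
import errno
import struct


def old_copy(defaults, optval):
    """Old pattern: copy min(sizeof(dst), optlen) bytes over the defaults."""
    n = min(len(defaults), len(optval))
    return optval[:n] + defaults[n:]


def new_copy(defaults, optval):
    """New pattern: reject any input shorter than the destination."""
    if len(defaults) > len(optval):
        return -errno.EINVAL
    return optval[:len(defaults)]


# Toy options struct with two 16-bit fields, pre-filled with current values.
defaults = struct.pack("=HH", 672, 3)
short_input = struct.pack("=H", 100)       # caller supplied only one field
full_input = struct.pack("=HH", 100, 5)
```

In this model `old_copy(defaults, short_input)` silently yields `(100, 3)` (one field from the caller, one left at its previous value), whereas `new_copy` returns `-EINVAL` for the same input and accepts only the full-size buffer.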
diff --git a/queue-6.1/bluetooth-sco-fix-not-validating-setsockopt-user-inp.patch b/queue-6.1/bluetooth-sco-fix-not-validating-setsockopt-user-inp.patch

new file mode 100644

index 0000000..5769420
--- /dev/null
+++ b/queue-6.1/bluetooth-sco-fix-not-validating-setsockopt-user-inp.patch
@@ -0,0 +1,122 @@
+From 98af6b2539047fae9d30882da3ec133502604204 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 15:41:52 -0400
+Subject: Bluetooth: SCO: Fix not validating setsockopt user input
+
+From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
+
+[ Upstream commit 51eda36d33e43201e7a4fd35232e069b2c850b01 ]
+
+syzbot reported sco_sock_setsockopt() is copying data without
+checking user input length.
+
+BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset
+include/linux/sockptr.h:49 [inline]
+BUG: KASAN: slab-out-of-bounds in copy_from_sockptr
+include/linux/sockptr.h:55 [inline]
+BUG: KASAN: slab-out-of-bounds in sco_sock_setsockopt+0xc0b/0xf90
+net/bluetooth/sco.c:893
+Read of size 4 at addr ffff88805f7b15a3 by task syz-executor.5/12578
+
+Fixes: ad10b1a48754 ("Bluetooth: Add Bluetooth socket voice option")
+Fixes: b96e9c671b05 ("Bluetooth: Add BT_DEFER_SETUP option to sco socket")
+Fixes: 00398e1d5183 ("Bluetooth: Add support for BT_PKT_STATUS CMSG data for SCO connections")
+Fixes: f6873401a608 ("Bluetooth: Allow setting of codec for HFP offload use case")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/net/bluetooth/bluetooth.h |  9 +++++++++
+ net/bluetooth/sco.c               | 23 ++++++++++-------------
+ 2 files changed, 19 insertions(+), 13 deletions(-)
+
+diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
+index bcc5a4cd2c17b..5aaf7d7f3c6fa 100644
+--- a/include/net/bluetooth/bluetooth.h
++++ b/include/net/bluetooth/bluetooth.h
+@@ -565,6 +565,15 @@ static inline struct sk_buff *bt_skb_sendmmsg(struct sock *sk,
+       return skb;
+ }
+ 
++static inline int bt_copy_from_sockptr(void *dst, size_t dst_size,
++                                     sockptr_t src, size_t src_size)
++{
++      if (dst_size > src_size)
++              return -EINVAL;
++
++      return copy_from_sockptr(dst, src, dst_size);
++}
++
+ int bt_to_errno(u16 code);
+ __u8 bt_status(int err);
+ 
+diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
+index 6d4168cfeb563..2e9137c539a49 100644
+--- a/net/bluetooth/sco.c
++++ b/net/bluetooth/sco.c
+@@ -831,7 +831,7 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+                              sockptr_t optval, unsigned int optlen)
+ {
+       struct sock *sk = sock->sk;
+-      int len, err = 0;
++      int err = 0;
+       struct bt_voice voice;
+       u32 opt;
+       struct bt_codecs *codecs;
+@@ -850,10 +850,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+                       break;
+               }
+ 
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt)
+                       set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags);
+@@ -870,11 +869,10 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+ 
+               voice.setting = sco_pi(sk)->setting;
+ 
+-              len = min_t(unsigned int, sizeof(voice), optlen);
+-              if (copy_from_sockptr(&voice, optval, len)) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&voice, sizeof(voice), optval,
++                                         optlen);
++              if (err)
+                       break;
+-              }
+ 
+               /* Explicitly check for these values */
+               if (voice.setting != BT_VOICE_TRANSPARENT &&
+@@ -897,10 +895,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+               break;
+ 
+       case BT_PKT_STATUS:
+-              if (copy_from_sockptr(&opt, optval, sizeof(u32))) {
+-                      err = -EFAULT;
++              err = bt_copy_from_sockptr(&opt, sizeof(opt), optval, optlen);
++              if (err)
+                       break;
+-              }
+ 
+               if (opt)
+                       sco_pi(sk)->cmsg_mask |= SCO_CMSG_PKT_STATUS;
+@@ -941,9 +938,9 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+                       break;
+               }
+ 
+-              if (copy_from_sockptr(buffer, optval, optlen)) {
++              err = bt_copy_from_sockptr(buffer, optlen, optval, optlen);
++              if (err) {
+                       hci_dev_put(hdev);
+-                      err = -EFAULT;
+                       break;
+               }
+ 
+-- 
+2.43.0
+
diff --git a/queue-6.1/bnxt_en-reset-ptp-tx_avail-after-possible-firmware-r.patch b/queue-6.1/bnxt_en-reset-ptp-tx_avail-after-possible-firmware-r.patch
new file mode 100644
index 0000000..59375f6
--- /dev/null
+++ b/queue-6.1/bnxt_en-reset-ptp-tx_avail-after-possible-firmware-r.patch
@@ -0,0 +1,42 @@
+From 7f29f921150696c49c78b89bf39487b1736ec6a5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 16:55:13 -0700
+Subject: bnxt_en: Reset PTP tx_avail after possible firmware reset
+
+From: Pavan Chebbi <pavan.chebbi@broadcom.com>
+
+[ Upstream commit faa12ca245585379d612736a4b5e98e88481ea59 ]
+
+It is possible that during error recovery and firmware reset,
+there is a pending TX PTP packet waiting for the timestamp.
+We need to reset this condition so that after recovery, the
+tx_avail count for PTP is reset back to the initial value.
+Otherwise, we may not accept any PTP TX timestamps after
+recovery.
+
+Fixes: 118612d519d8 ("bnxt_en: Add PTP clock APIs, ioctls, and ethtool methods")
+Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
+Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
+Signed-off-by: Michael Chan <michael.chan@broadcom.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+index f810b5dc25f01..0d0aad7141c15 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+@@ -10564,6 +10564,8 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
+       /* VF-reps may need to be re-opened after the PF is re-opened */
+       if (BNXT_PF(bp))
+               bnxt_vf_reps_open(bp);
++      if (bp->ptp_cfg)
++              atomic_set(&bp->ptp_cfg->tx_avail, BNXT_MAX_TX_TS);
+       bnxt_ptp_init_rtc(bp, true);
+       bnxt_ptp_cfg_tstamp_filters(bp);
+       return 0;
+-- 
+2.43.0
+
diff --git a/queue-6.1/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch b/queue-6.1/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch
new file mode 100644
index 0000000..0e2e5b2
--- /dev/null
+++ b/queue-6.1/geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch
@@ -0,0 +1,166 @@
+From 4812de9ebd09d4a19032706488466dbfaa5cb5a2 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 10:30:34 +0000
+Subject: geneve: fix header validation in geneve[6]_xmit_skb
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit d8a6213d70accb403b82924a1c229e733433a5ef ]
+
+syzbot is able to trigger an uninit-value in geneve_xmit() [1]
+
+Problem: While most ip tunnel helpers (like ip_tunnel_get_dsfield())
+use skb_protocol(skb, true), pskb_inet_may_pull() only uses
+skb->protocol.
+
+If anything other than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol,
+pskb_inet_may_pull() does nothing at all.
+
+If a vlan tag was provided by the caller (af_packet in the syzbot case),
+the network header might not point to the correct location, and skb
+linear part could be smaller than expected.
+
+Add skb_vlan_inet_prepare() to perform a complete mac validation.
+
+Use this in geneve for the moment, I suspect we need to adopt this
+more broadly.
+
+v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest
+   - Only call __vlan_get_protocol() for vlan types.
+Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/
+
+v2,v3 - Addressed Sabrina comments on v1 and v2
+Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/
+
+[1]
+
+BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline]
+ BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
+  geneve_xmit_skb drivers/net/geneve.c:910 [inline]
+  geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
+  __netdev_start_xmit include/linux/netdevice.h:4903 [inline]
+  netdev_start_xmit include/linux/netdevice.h:4917 [inline]
+  xmit_one net/core/dev.c:3531 [inline]
+  dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
+  __dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335
+  dev_queue_xmit include/linux/netdevice.h:3091 [inline]
+  packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276
+  packet_snd net/packet/af_packet.c:3081 [inline]
+  packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113
+  sock_sendmsg_nosec net/socket.c:730 [inline]
+  __sock_sendmsg+0x30f/0x380 net/socket.c:745
+  __sys_sendto+0x685/0x830 net/socket.c:2191
+  __do_sys_sendto net/socket.c:2203 [inline]
+  __se_sys_sendto net/socket.c:2199 [inline]
+  __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
+ do_syscall_64+0xd5/0x1f0
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+
+Uninit was created at:
+  slab_post_alloc_hook mm/slub.c:3804 [inline]
+  slab_alloc_node mm/slub.c:3845 [inline]
+  kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
+  kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
+  __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
+  alloc_skb include/linux/skbuff.h:1318 [inline]
+  alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
+  sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
+  packet_alloc_skb net/packet/af_packet.c:2930 [inline]
+  packet_snd net/packet/af_packet.c:3024 [inline]
+  packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113
+  sock_sendmsg_nosec net/socket.c:730 [inline]
+  __sock_sendmsg+0x30f/0x380 net/socket.c:745
+  __sys_sendto+0x685/0x830 net/socket.c:2191
+  __do_sys_sendto net/socket.c:2203 [inline]
+  __se_sys_sendto net/socket.c:2199 [inline]
+  __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
+ do_syscall_64+0xd5/0x1f0
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+
+CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0
+Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
+
+Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb")
+Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com
+Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Cc: Phillip Potter <phil@philpotter.co.uk>
+Cc: Sabrina Dubroca <sd@queasysnail.net>
+Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
+Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/geneve.c     |  4 ++--
+ include/net/ip_tunnels.h | 33 +++++++++++++++++++++++++++++++++
+ 2 files changed, 35 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
+index 3f8da6f0b25ce..488ca1c854962 100644
+--- a/drivers/net/geneve.c
++++ b/drivers/net/geneve.c
+@@ -930,7 +930,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+       __be16 sport;
+       int err;
+ 
+-      if (!pskb_inet_may_pull(skb))
++      if (!skb_vlan_inet_prepare(skb))
+               return -EINVAL;
+ 
+       sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
+@@ -1028,7 +1028,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+       __be16 sport;
+       int err;
+ 
+-      if (!pskb_inet_may_pull(skb))
++      if (!skb_vlan_inet_prepare(skb))
+               return -EINVAL;
+ 
+       sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
+diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
+index bca80522f95c8..f9906b73e7ff4 100644
+--- a/include/net/ip_tunnels.h
++++ b/include/net/ip_tunnels.h
+@@ -351,6 +351,39 @@ static inline bool pskb_inet_may_pull(struct sk_buff *skb)
+       return pskb_network_may_pull(skb, nhlen);
+ }
+ 
++/* Variant of pskb_inet_may_pull().
++ */
++static inline bool skb_vlan_inet_prepare(struct sk_buff *skb)
++{
++      int nhlen = 0, maclen = ETH_HLEN;
++      __be16 type = skb->protocol;
++
++      /* Essentially this is skb_protocol(skb, true)
++       * And we get MAC len.
++       */
++      if (eth_type_vlan(type))
++              type = __vlan_get_protocol(skb, type, &maclen);
++
++      switch (type) {
++#if IS_ENABLED(CONFIG_IPV6)
++      case htons(ETH_P_IPV6):
++              nhlen = sizeof(struct ipv6hdr);
++              break;
++#endif
++      case htons(ETH_P_IP):
++              nhlen = sizeof(struct iphdr);
++              break;
++      }
++      /* For ETH_P_IPV6/ETH_P_IP we make sure to pull
++       * a base network header in skb->head.
++       */
++      if (!pskb_may_pull(skb, maclen + nhlen))
++              return false;
++
++      skb_set_network_header(skb, maclen);
++      return true;
++}
++
+ static inline int ip_encap_hlen(struct ip_tunnel_encap *e)
+ {
+       const struct ip_tunnel_encap_ops *ops;
+-- 
+2.43.0
+
diff --git a/queue-6.1/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch b/queue-6.1/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
new file mode 100644
index 0000000..6290435
--- /dev/null
+++ b/queue-6.1/iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
@@ -0,0 +1,39 @@
+From b5a1ba11b1c4eaff1247a620f252afd8b118874c Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Apr 2024 11:07:43 +0800
+Subject: iommu/vt-d: Allocate local memory for page request queue
+
+From: Jacob Pan <jacob.jun.pan@linux.intel.com>
+
+[ Upstream commit a34f3e20ddff02c4f12df2c0635367394e64c63d ]
+
+The page request queue is per IOMMU, so its allocation should be made
+NUMA-aware for performance reasons.
+
+Fixes: a222a7f0bb6c ("iommu/vt-d: Implement page request handling")
+Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
+Reviewed-by: Kevin Tian <kevin.tian@intel.com>
+Link: https://lore.kernel.org/r/20240403214007.985600-1-jacob.jun.pan@linux.intel.com
+Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
+Signed-off-by: Joerg Roedel <jroedel@suse.de>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/iommu/intel/svm.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
+index 03b25358946c4..cb862ab96873e 100644
+--- a/drivers/iommu/intel/svm.c
++++ b/drivers/iommu/intel/svm.c
+@@ -71,7 +71,7 @@ int intel_svm_enable_prq(struct intel_iommu *iommu)
+       struct page *pages;
+       int irq, ret;
+ 
+-      pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, PRQ_ORDER);
++      pages = alloc_pages_node(iommu->node, GFP_KERNEL | __GFP_ZERO, PRQ_ORDER);
+       if (!pages) {
+               pr_warn("IOMMU: %s: Failed to allocate page request queue\n",
+                       iommu->name);
+-- 
+2.43.0
+
diff --git a/queue-6.1/ipv4-route-avoid-unused-but-set-variable-warning.patch b/queue-6.1/ipv4-route-avoid-unused-but-set-variable-warning.patch
new file mode 100644
index 0000000..d9278c7
--- /dev/null
+++ b/queue-6.1/ipv4-route-avoid-unused-but-set-variable-warning.patch
@@ -0,0 +1,51 @@
+From aa78bd11e34c00469f5b40874e0543e281d8279c Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 Apr 2024 09:42:03 +0200
+Subject: ipv4/route: avoid unused-but-set-variable warning
+
+From: Arnd Bergmann <arnd@arndb.de>
+
+[ Upstream commit cf1b7201df59fb936f40f4a807433fe3f2ce310a ]
+
+The log_martians variable is only used in an #ifdef, causing a 'make W=1'
+warning with gcc:
+
+net/ipv4/route.c: In function 'ip_rt_send_redirect':
+net/ipv4/route.c:880:13: error: variable 'log_martians' set but not used [-Werror=unused-but-set-variable]
+
+Change the #ifdef to an equivalent IS_ENABLED() to let the compiler
+see where the variable is used.
+
+Fixes: 30038fc61adf ("net: ip_rt_send_redirect() optimization")
+Reviewed-by: David Ahern <dsahern@kernel.org>
+Signed-off-by: Arnd Bergmann <arnd@arndb.de>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Link: https://lore.kernel.org/r/20240408074219.3030256-2-arnd@kernel.org
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/ipv4/route.c | 4 +---
+ 1 file changed, 1 insertion(+), 3 deletions(-)
+
+diff --git a/net/ipv4/route.c b/net/ipv4/route.c
+index 474f391fab35d..a0c687ff25987 100644
+--- a/net/ipv4/route.c
++++ b/net/ipv4/route.c
+@@ -926,13 +926,11 @@ void ip_rt_send_redirect(struct sk_buff *skb)
+               icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw);
+               peer->rate_last = jiffies;
+               ++peer->n_redirects;
+-#ifdef CONFIG_IP_ROUTE_VERBOSE
+-              if (log_martians &&
++              if (IS_ENABLED(CONFIG_IP_ROUTE_VERBOSE) && log_martians &&
+                   peer->n_redirects == ip_rt_redirect_number)
+                       net_warn_ratelimited("host %pI4/if%d ignores redirects for %pI4 to %pI4\n",
+                                            &ip_hdr(skb)->saddr, inet_iif(skb),
+                                            &ip_hdr(skb)->daddr, &gw);
+-#endif
+       }
+ out_put_peer:
+       inet_putpeer(peer);
+-- 
+2.43.0
+
diff --git a/queue-6.1/ipv6-fib-hide-unused-pn-variable.patch b/queue-6.1/ipv6-fib-hide-unused-pn-variable.patch
new file mode 100644
index 0000000..c6614c2
--- /dev/null
+++ b/queue-6.1/ipv6-fib-hide-unused-pn-variable.patch
@@ -0,0 +1,60 @@
+From d1afc3102faabd97f068661cf9a0b0bf4e3b96cc Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 Apr 2024 09:42:02 +0200
+Subject: ipv6: fib: hide unused 'pn' variable
+
+From: Arnd Bergmann <arnd@arndb.de>
+
+[ Upstream commit 74043489fcb5e5ca4074133582b5b8011b67f9e7 ]
+
+When CONFIG_IPV6_SUBTREES is disabled, the only user is hidden, causing
+a 'make W=1' warning:
+
+net/ipv6/ip6_fib.c: In function 'fib6_add':
+net/ipv6/ip6_fib.c:1388:32: error: variable 'pn' set but not used [-Werror=unused-but-set-variable]
+
+Add another #ifdef around the variable declaration, matching the other
+uses in this file.
+
+Fixes: 66729e18df08 ("[IPV6] ROUTE: Make sure we have fn->leaf when adding a node on subtree.")
+Link: https://lore.kernel.org/netdev/20240322131746.904943-1-arnd@kernel.org/
+Reviewed-by: David Ahern <dsahern@kernel.org>
+Signed-off-by: Arnd Bergmann <arnd@arndb.de>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Link: https://lore.kernel.org/r/20240408074219.3030256-1-arnd@kernel.org
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/ipv6/ip6_fib.c | 7 +++++--
+ 1 file changed, 5 insertions(+), 2 deletions(-)
+
+diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
+index e606374854ce5..8213626434b91 100644
+--- a/net/ipv6/ip6_fib.c
++++ b/net/ipv6/ip6_fib.c
+@@ -1376,7 +1376,10 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt,
+            struct nl_info *info, struct netlink_ext_ack *extack)
+ {
+       struct fib6_table *table = rt->fib6_table;
+-      struct fib6_node *fn, *pn = NULL;
++      struct fib6_node *fn;
++#ifdef CONFIG_IPV6_SUBTREES
++      struct fib6_node *pn = NULL;
++#endif
+       int err = -ENOMEM;
+       int allow_create = 1;
+       int replace_required = 0;
+@@ -1400,9 +1403,9 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt,
+               goto out;
+       }
+ 
++#ifdef CONFIG_IPV6_SUBTREES
+       pn = fn;
+ 
+-#ifdef CONFIG_IPV6_SUBTREES
+       if (rt->fib6_src.plen) {
+               struct fib6_node *sn;
+ 
+-- 
+2.43.0
+
diff --git a/queue-6.1/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch b/queue-6.1/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch
new file mode 100644
index 0000000..7fb29a6
--- /dev/null
+++ b/queue-6.1/ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch
@@ -0,0 +1,133 @@
+From f589e225f4bf23dbdc29dfe251ef2cf3b47dd1f6 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 Apr 2024 16:18:21 +0200
+Subject: ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
+
+From: Jiri Benc <jbenc@redhat.com>
+
+[ Upstream commit 7633c4da919ad51164acbf1aa322cc1a3ead6129 ]
+
+Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock,
+hlist_for_each_entry_rcu can still return an item that was removed from
+the list. The memory of such an item is not freed, thanks to RCU, but
+nothing guarantees that the content of that memory is sane.
+
+In particular, the reference count can be zero. This can happen if
+ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry
+from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all
+references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough
+timing, this can happen:
+
+1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry.
+
+2. Then, the whole ipv6_del_addr is executed for the given entry. The
+   reference count drops to zero and kfree_rcu is scheduled.
+
+3. ipv6_get_ifaddr continues and tries to increment the reference count
+   (in6_ifa_hold).
+
+4. The rcu is unlocked and the entry is freed.
+
+5. The freed entry is returned.
+
+Prevent incrementing the reference count in such a case. The name
+in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe.
+
+[   41.506330] refcount_t: addition on 0; use-after-free.
+[   41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130
+[   41.507413] Modules linked in: veth bridge stp llc
+[   41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa #14
+[   41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
+[   41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130
+[   41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff
+[   41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282
+[   41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000
+[   41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900
+[   41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff
+[   41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000
+[   41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48
+[   41.514086] FS:  00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000
+[   41.514726] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+[   41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0
+[   41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
+[   41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
+[   41.516799] Call Trace:
+[   41.517037]  <TASK>
+[   41.517249]  ? __warn+0x7b/0x120
+[   41.517535]  ? refcount_warn_saturate+0xa5/0x130
+[   41.517923]  ? report_bug+0x164/0x190
+[   41.518240]  ? handle_bug+0x3d/0x70
+[   41.518541]  ? exc_invalid_op+0x17/0x70
+[   41.520972]  ? asm_exc_invalid_op+0x1a/0x20
+[   41.521325]  ? refcount_warn_saturate+0xa5/0x130
+[   41.521708]  ipv6_get_ifaddr+0xda/0xe0
+[   41.522035]  inet6_rtm_getaddr+0x342/0x3f0
+[   41.522376]  ? __pfx_inet6_rtm_getaddr+0x10/0x10
+[   41.522758]  rtnetlink_rcv_msg+0x334/0x3d0
+[   41.523102]  ? netlink_unicast+0x30f/0x390
+[   41.523445]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
+[   41.523832]  netlink_rcv_skb+0x53/0x100
+[   41.524157]  netlink_unicast+0x23b/0x390
+[   41.524484]  netlink_sendmsg+0x1f2/0x440
+[   41.524826]  __sys_sendto+0x1d8/0x1f0
+[   41.525145]  __x64_sys_sendto+0x1f/0x30
+[   41.525467]  do_syscall_64+0xa5/0x1b0
+[   41.525794]  entry_SYSCALL_64_after_hwframe+0x72/0x7a
+[   41.526213] RIP: 0033:0x7fbc4cfcea9a
+[   41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
+[   41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
+[   41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a
+[   41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005
+[   41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c
+[   41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
+[   41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b
+[   41.531573]  </TASK>
+
+Fixes: 5c578aedcb21d ("IPv6: convert addrconf hash list to RCU")
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Reviewed-by: David Ahern <dsahern@kernel.org>
+Signed-off-by: Jiri Benc <jbenc@redhat.com>
+Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/net/addrconf.h | 4 ++++
+ net/ipv6/addrconf.c    | 7 ++++---
+ 2 files changed, 8 insertions(+), 3 deletions(-)
+
+diff --git a/include/net/addrconf.h b/include/net/addrconf.h
+index 86eb2aba1479c..5bcc63eade035 100644
+--- a/include/net/addrconf.h
++++ b/include/net/addrconf.h
+@@ -437,6 +437,10 @@ static inline void in6_ifa_hold(struct inet6_ifaddr *ifp)
+       refcount_inc(&ifp->refcnt);
+ }
+ 
++static inline bool in6_ifa_hold_safe(struct inet6_ifaddr *ifp)
++{
++      return refcount_inc_not_zero(&ifp->refcnt);
++}
+ 
+ /*
+  *    compute link-local solicited-node multicast address
+diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
+index 1648373692a99..3866deaadbb66 100644
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -2050,9 +2050,10 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, const struct in6_addr *add
+               if (ipv6_addr_equal(&ifp->addr, addr)) {
+                       if (!dev || ifp->idev->dev == dev ||
+                           !(ifp->scope&(IFA_LINK|IFA_HOST) || strict)) {
+-                              result = ifp;
+-                              in6_ifa_hold(ifp);
+-                              break;
++                              if (in6_ifa_hold_safe(ifp)) {
++                                      result = ifp;
++                                      break;
++                              }
+                       }
+               }
+       }
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch b/queue-6.1/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch
new file mode 100644
index 0000000..2958275
--- /dev/null
+++ b/queue-6.1/net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch
@@ -0,0 +1,495 @@
+From f7e4bd05b6adf10c3d49c3208c024f78f7473a00 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 18:01:14 +0300
+Subject: net: dsa: mt7530: trap link-local frames regardless of ST Port State
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Arınç ÜNAL <arinc.unal@arinc9.com>
+
+[ Upstream commit 17c560113231ddc20088553c7b499b289b664311 ]
+
+In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer
+(DLL) of the Open Systems Interconnection basic reference model (OSI/RM)
+are described; the medium access control (MAC) and logical link control
+(LLC) sublayers. The MAC sublayer is the one facing the physical layer.
+
+In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A
+Bridge component comprises a MAC Relay Entity for interconnecting the Ports
+of the Bridge, at least two Ports, and higher layer entities with at least
+a Spanning Tree Protocol Entity included.
+
+Each Bridge Port also functions as an end station and shall provide the MAC
+Service to an LLC Entity. Each instance of the MAC Service is provided to a
+distinct LLC Entity that supports protocol identification, multiplexing,
+and demultiplexing, for protocol data unit (PDU) transmission and reception
+by one or more higher layer entities.
+
+It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC
+Entity associated with each Bridge Port is modeled as being directly
+connected to the attached Local Area Network (LAN).
+
+On the switch with CPU port architecture, CPU port functions as Management
+Port, and the Management Port functionality is provided by software which
+functions as an end station. Software is connected to an IEEE 802 LAN that
+is wholly contained within the system that incorporates the Bridge.
+Software provides access to the LLC Entity associated with each Bridge Port
+by the value of the source port field on the special tag on the frame
+received by software.
+
+We call frames that carry control information to determine the active
+topology and current extent of each Virtual Local Area Network (VLAN),
+i.e., spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN
+Registration Protocol Data Units (MVRPDUs), and frames from other link
+constrained protocols, such as Extensible Authentication Protocol over LAN
+(EAPOL) and Link Layer Discovery Protocol (LLDP), link-local frames. They
+are not forwarded by a Bridge. Permanently configured entries in the
+filtering database (FDB) ensure that such frames are discarded by the
+Forwarding Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in
+detail:
+
+Each of the reserved MAC addresses specified in Table 8-1
+(01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be
+permanently configured in the FDB in C-VLAN components and ERs.
+
+Each of the reserved MAC addresses specified in Table 8-2
+(01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently
+configured in the FDB in S-VLAN components.
+
+Each of the reserved MAC addresses specified in Table 8-3
+(01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB
+in TPMR components.
+
+The FDB entries for reserved MAC addresses shall specify filtering for all
+Bridge Ports and all VIDs. Management shall not provide the capability to
+modify or remove entries for reserved MAC addresses.
+
+The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of
+propagation of PDUs within a Bridged Network, as follows:
+
+  The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that
+  no conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN)
+  component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward.
+  PDUs transmitted using this destination address, or any other addresses
+  that appear in Table 8-1, Table 8-2, and Table 8-3
+  (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can
+  therefore travel no further than those stations that can be reached via a
+  single individual LAN from the originating station.
+
+  The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an
+  address that no conformant S-VLAN component, C-VLAN component, or MAC
+  Bridge can forward; however, this address is relayed by a TPMR component.
+  PDUs using this destination address, or any of the other addresses that
+  appear in both Table 8-1 and Table 8-2 but not in Table 8-3
+  (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed
+  by any TPMRs but will propagate no further than the nearest S-VLAN
+  component, C-VLAN component, or MAC Bridge.
+
+  The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an
+  address that no conformant C-VLAN component, MAC Bridge can forward;
+  however, it is relayed by TPMR components and S-VLAN components. PDUs
+  using this destination address, or any of the other addresses that appear
+  in Table 8-1 but not in either Table 8-2 or Table 8-3
+  (01-80-C2-00-00-[00,0B,0C,0D,0F]), will be relayed by TPMR components and
+  S-VLAN components but will propagate no further than the nearest C-VLAN
+  component or MAC Bridge.
+
+Because the LLC Entity associated with each Bridge Port is provided via CPU
+port, we must not filter these frames but forward them to CPU port.
+
+In a Bridge, the transmission Port is mainly decided by ingress and egress
+rules, FDB, and spanning tree Port State functions of the Forwarding
+Process. For link-local frames, only CPU port should be designated as
+destination port in the FDB, and the other functions of the Forwarding
+Process must not interfere with the decision of the transmission Port. We
+call this process trapping frames to CPU port.
+
+Therefore, on the switch with CPU port architecture, link-local frames must
+be trapped to CPU port, and certain link-local frames received by a Port of
+a Bridge comprising a TPMR component or an S-VLAN component must be
+excluded from it.
+
+A Bridge of the switch with CPU port architecture cannot comprise a
+Two-Port MAC Relay (TPMR) component as a TPMR component supports only a
+subset of the functionality of a MAC Bridge. A Bridge comprising two Ports
+(Management Port doesn't count) of this architecture will either function
+as a standard MAC Bridge or a standard VLAN Bridge.
+
+Therefore, a Bridge of this architecture can only comprise S-VLAN
+components, C-VLAN components, or MAC Bridge components. Since there's no
+TPMR component, we don't need to relay PDUs using the destination addresses
+specified in the Nearest non-TPMR section, and the portion of the
+Nearest Customer Bridge section where they must be relayed by TPMR
+components.
+
+One option to trap link-local frames to CPU port is to add static FDB
+entries with CPU port designated as destination port. However, because
+Independent VLAN Learning (IVL) is being used on every VID, each entry only
+applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC
+Bridge component or a C-VLAN component, there would have to be 16 times
+4096 entries. This switch intellectual property can only hold a maximum of
+2048 entries. Using this option, there also isn't a mechanism to prevent
+link-local frames from being discarded when the spanning tree Port State of
+the reception Port is discarding.
+
+The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4
+registers. Whilst this applies to every VID, it doesn't contain all of the
+reserved MAC addresses without affecting the remaining Standard Group MAC
+Addresses. The REV_UN frame tag utilised using the RGAC4 register covers
+the remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination
+addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF
+destination addresses which may be relayed by MAC Bridges or VLAN Bridges.
+The latter option provides better but not complete conformance.
+
+This switch intellectual property also does not provide a mechanism to trap
+link-local frames with specific destination addresses to CPU port by
+Bridge, to conform to the filtering rules for the distinct Bridge
+components.
+
+Therefore, regardless of the type of the Bridge component, link-local
+frames with these destination addresses will be trapped to CPU port:
+
+01-80-C2-00-00-[00,01,02,03,0E]
+
+In a Bridge comprising a MAC Bridge component or a C-VLAN component:
+
+  Link-local frames with these destination addresses won't be trapped to
+  CPU port which won't conform to IEEE Std 802.1Q-2022:
+
+  01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F]
+
+In a Bridge comprising an S-VLAN component:
+
+  Link-local frames with these destination addresses will be trapped to CPU
+  port, which doesn't conform to IEEE Std 802.1Q-2022:
+
+  01-80-C2-00-00-00
+
+  Link-local frames with these destination addresses won't be trapped to
+  CPU port, which doesn't conform to IEEE Std 802.1Q-2022:
+
+  01-80-C2-00-00-[04,05,06,07,08,09,0A]
+
+Currently on this switch intellectual property, if the spanning tree Port
+State of the reception Port is discarding, link-local frames will be
+discarded.
+
+To trap link-local frames regardless of the spanning tree Port State, make
+the switch regard them as Bridge Protocol Data Units (BPDUs). This switch
+intellectual property only lets the frames regarded as BPDUs bypass the
+spanning tree Port State function of the Forwarding Process.
+
+With this change, the only remaining interference is the ingress rules.
+When the reception Port has no PVID assigned on software, VLAN-untagged
+frames won't be allowed in. There doesn't seem to be a mechanism on the
+switch intellectual property to have link-local frames bypass this function
+of the Forwarding Process.
+
+Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
+Reviewed-by: Daniel Golle <daniel@makrotopia.org>
+Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
+Link: https://lore.kernel.org/r/20240409-b4-for-net-mt7530-fix-link-local-when-stp-discarding-v2-1-07b1150164ac@arinc9.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/dsa/mt7530.c | 229 +++++++++++++++++++++++++++++++++------
+ drivers/net/dsa/mt7530.h |   5 +
+ 2 files changed, 200 insertions(+), 34 deletions(-)
+
+diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
+index 07065c1af55e4..d4515c19a5f34 100644
+--- a/drivers/net/dsa/mt7530.c
++++ b/drivers/net/dsa/mt7530.c
+@@ -998,20 +998,173 @@ static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface)
+       mutex_unlock(&priv->reg_mutex);
+ }
+ 
+-/* On page 205, section "8.6.3 Frame filtering" of the active standard, IEEE Std
+- * 802.1Q™-2022, it is stated that frames with 01:80:C2:00:00:00-0F as MAC DA
+- * must only be propagated to C-VLAN and MAC Bridge components. That means
+- * VLAN-aware and VLAN-unaware bridges. On the switch designs with CPU ports,
+- * these frames are supposed to be processed by the CPU (software). So we make
+- * the switch only forward them to the CPU port. And if received from a CPU
+- * port, forward to a single port. The software is responsible of making the
+- * switch conform to the latter by setting a single port as destination port on
+- * the special tag.
++/* In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer (DLL)
++ * of the Open Systems Interconnection basic reference model (OSI/RM) are
++ * described; the medium access control (MAC) and logical link control (LLC)
++ * sublayers. The MAC sublayer is the one facing the physical layer.
+  *
+- * This switch intellectual property cannot conform to this part of the standard
+- * fully. Whilst the REV_UN frame tag covers the remaining :04-0D and :0F MAC
+- * DAs, it also includes :22-FF which the scope of propagation is not supposed
+- * to be restricted for these MAC DAs.
++ * In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A
++ * Bridge component comprises a MAC Relay Entity for interconnecting the Ports
++ * of the Bridge, at least two Ports, and higher layer entities with at least a
++ * Spanning Tree Protocol Entity included.
++ *
++ * Each Bridge Port also functions as an end station and shall provide the MAC
++ * Service to an LLC Entity. Each instance of the MAC Service is provided to a
++ * distinct LLC Entity that supports protocol identification, multiplexing, and
++ * demultiplexing, for protocol data unit (PDU) transmission and reception by
++ * one or more higher layer entities.
++ *
++ * It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC
++ * Entity associated with each Bridge Port is modeled as being directly
++ * connected to the attached Local Area Network (LAN).
++ *
++ * On the switch with CPU port architecture, CPU port functions as Management
++ * Port, and the Management Port functionality is provided by software which
++ * functions as an end station. Software is connected to an IEEE 802 LAN that is
++ * wholly contained within the system that incorporates the Bridge. Software
++ * provides access to the LLC Entity associated with each Bridge Port by the
++ * value of the source port field on the special tag on the frame received by
++ * software.
++ *
++ * We call frames that carry control information to determine the active
++ * topology and current extent of each Virtual Local Area Network (VLAN), i.e.,
++ * spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN Registration
++ * Protocol Data Units (MVRPDUs), and frames from other link constrained
++ * protocols, such as Extensible Authentication Protocol over LAN (EAPOL) and
++ * Link Layer Discovery Protocol (LLDP), link-local frames. They are not
++ * forwarded by a Bridge. Permanently configured entries in the filtering
++ * database (FDB) ensure that such frames are discarded by the Forwarding
++ * Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in detail:
++ *
++ * Each of the reserved MAC addresses specified in Table 8-1
++ * (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be
++ * permanently configured in the FDB in C-VLAN components and ERs.
++ *
++ * Each of the reserved MAC addresses specified in Table 8-2
++ * (01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently
++ * configured in the FDB in S-VLAN components.
++ *
++ * Each of the reserved MAC addresses specified in Table 8-3
++ * (01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB in
++ * TPMR components.
++ *
++ * The FDB entries for reserved MAC addresses shall specify filtering for all
++ * Bridge Ports and all VIDs. Management shall not provide the capability to
++ * modify or remove entries for reserved MAC addresses.
++ *
++ * The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of
++ * propagation of PDUs within a Bridged Network, as follows:
++ *
++ *   The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that no
++ *   conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN)
++ *   component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward.
++ *   PDUs transmitted using this destination address, or any other addresses
++ *   that appear in Table 8-1, Table 8-2, and Table 8-3
++ *   (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can
++ *   therefore travel no further than those stations that can be reached via a
++ *   single individual LAN from the originating station.
++ *
++ *   The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an
++ *   address that no conformant S-VLAN component, C-VLAN component, or MAC
++ *   Bridge can forward; however, this address is relayed by a TPMR component.
++ *   PDUs using this destination address, or any of the other addresses that
++ *   appear in both Table 8-1 and Table 8-2 but not in Table 8-3
++ *   (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed by
++ *   any TPMRs but will propagate no further than the nearest S-VLAN component,
++ *   C-VLAN component, or MAC Bridge.
++ *
++ *   The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an address
++ *   that no conformant C-VLAN component, MAC Bridge can forward; however, it is
++ *   relayed by TPMR components and S-VLAN components. PDUs using this
++ *   destination address, or any of the other addresses that appear in Table 8-1
++ *   but not in either Table 8-2 or Table 8-3 (01-80-C2-00-00-[00,0B,0C,0D,0F]),
++ *   will be relayed by TPMR components and S-VLAN components but will propagate
++ *   no further than the nearest C-VLAN component or MAC Bridge.
++ *
++ * Because the LLC Entity associated with each Bridge Port is provided via CPU
++ * port, we must not filter these frames but forward them to CPU port.
++ *
++ * In a Bridge, the transmission Port is mainly decided by ingress and egress
++ * rules, FDB, and spanning tree Port State functions of the Forwarding Process.
++ * For link-local frames, only CPU port should be designated as destination port
++ * in the FDB, and the other functions of the Forwarding Process must not
++ * interfere with the decision of the transmission Port. We call this process
++ * trapping frames to CPU port.
++ *
++ * Therefore, on the switch with CPU port architecture, link-local frames must
++ * be trapped to CPU port, and certain link-local frames received by a Port of a
++ * Bridge comprising a TPMR component or an S-VLAN component must be excluded
++ * from it.
++ *
++ * A Bridge of the switch with CPU port architecture cannot comprise a Two-Port
++ * MAC Relay (TPMR) component as a TPMR component supports only a subset of the
++ * functionality of a MAC Bridge. A Bridge comprising two Ports (Management Port
++ * doesn't count) of this architecture will either function as a standard MAC
++ * Bridge or a standard VLAN Bridge.
++ *
++ * Therefore, a Bridge of this architecture can only comprise S-VLAN components,
++ * C-VLAN components, or MAC Bridge components. Since there's no TPMR component,
++ * we don't need to relay PDUs using the destination addresses specified on the
++ * Nearest non-TPMR section, and the portion of the Nearest Customer Bridge
++ * section where they must be relayed by TPMR components.
++ *
++ * One option to trap link-local frames to CPU port is to add static FDB entries
++ * with CPU port designated as destination port. However, because
++ * Independent VLAN Learning (IVL) is being used on every VID, each entry only
++ * applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC
++ * Bridge component or a C-VLAN component, there would have to be 16 times 4096
++ * entries. This switch intellectual property can only hold a maximum of 2048
++ * entries. Using this option, there also isn't a mechanism to prevent
++ * link-local frames from being discarded when the spanning tree Port State of
++ * the reception Port is discarding.
++ *
++ * The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4
++ * registers. Whilst this applies to every VID, it doesn't contain all of the
++ * reserved MAC addresses without affecting the remaining Standard Group MAC
++ * Addresses. The REV_UN frame tag set up via the RGAC4 register covers the
++ * remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination
++ * addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF
++ * destination addresses which may be relayed by MAC Bridges or VLAN Bridges.
++ * The latter option provides better but not complete conformance.
++ *
++ * This switch intellectual property also does not provide a mechanism to trap
++ * link-local frames with specific destination addresses to CPU port by Bridge,
++ * to conform to the filtering rules for the distinct Bridge components.
++ *
++ * Therefore, regardless of the type of the Bridge component, link-local frames
++ * with these destination addresses will be trapped to CPU port:
++ *
++ * 01-80-C2-00-00-[00,01,02,03,0E]
++ *
++ * In a Bridge comprising a MAC Bridge component or a C-VLAN component:
++ *
++ *   Link-local frames with these destination addresses won't be trapped to CPU
++ *   port, which doesn't conform to IEEE Std 802.1Q-2022:
++ *
++ *   01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F]
++ *
++ * In a Bridge comprising an S-VLAN component:
++ *
++ *   Link-local frames with these destination addresses will be trapped to CPU
++ *   port, which doesn't conform to IEEE Std 802.1Q-2022:
++ *
++ *   01-80-C2-00-00-00
++ *
++ *   Link-local frames with these destination addresses won't be trapped to CPU
++ *   port, which doesn't conform to IEEE Std 802.1Q-2022:
++ *
++ *   01-80-C2-00-00-[04,05,06,07,08,09,0A]
++ *
++ * To trap link-local frames to CPU port as conformant as this switch
++ * intellectual property can allow, link-local frames are made to be regarded as
++ * Bridge Protocol Data Units (BPDUs). This is because this switch intellectual
++ * property only lets the frames regarded as BPDUs bypass the spanning tree Port
++ * State function of the Forwarding Process.
++ *
++ * The only remaining interference is the ingress rules. When the reception Port
++ * has no PVID assigned on software, VLAN-untagged frames won't be allowed in.
++ * There doesn't seem to be a mechanism on the switch intellectual property to
++ * have link-local frames bypass this function of the Forwarding Process.
+  */
+ static void
+ mt753x_trap_frames(struct mt7530_priv *priv)
+@@ -1019,35 +1172,43 @@ mt753x_trap_frames(struct mt7530_priv *priv)
+       /* Trap 802.1X PAE frames and BPDUs to the CPU port(s) and egress them
+        * VLAN-untagged.
+        */
+-      mt7530_rmw(priv, MT753X_BPC, MT753X_PAE_EG_TAG_MASK |
+-                 MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK |
+-                 MT753X_BPDU_PORT_FW_MASK,
+-                 MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+-                 MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_BPDU_CPU_ONLY);
++      mt7530_rmw(priv, MT753X_BPC,
++                 MT753X_PAE_BPDU_FR | MT753X_PAE_EG_TAG_MASK |
++                         MT753X_PAE_PORT_FW_MASK | MT753X_BPDU_EG_TAG_MASK |
++                         MT753X_BPDU_PORT_FW_MASK,
++                 MT753X_PAE_BPDU_FR |
++                         MT753X_PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY) |
++                         MT753X_BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_BPDU_CPU_ONLY);
+ 
+       /* Trap frames with :01 and :02 MAC DAs to the CPU port(s) and egress
+        * them VLAN-untagged.
+        */
+-      mt7530_rmw(priv, MT753X_RGAC1, MT753X_R02_EG_TAG_MASK |
+-                 MT753X_R02_PORT_FW_MASK | MT753X_R01_EG_TAG_MASK |
+-                 MT753X_R01_PORT_FW_MASK,
+-                 MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+-                 MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_BPDU_CPU_ONLY);
++      mt7530_rmw(priv, MT753X_RGAC1,
++                 MT753X_R02_BPDU_FR | MT753X_R02_EG_TAG_MASK |
++                         MT753X_R02_PORT_FW_MASK | MT753X_R01_BPDU_FR |
++                         MT753X_R01_EG_TAG_MASK | MT753X_R01_PORT_FW_MASK,
++                 MT753X_R02_BPDU_FR |
++                         MT753X_R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_R02_PORT_FW(MT753X_BPDU_CPU_ONLY) |
++                         MT753X_R01_BPDU_FR |
++                         MT753X_R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_BPDU_CPU_ONLY);
+ 
+       /* Trap frames with :03 and :0E MAC DAs to the CPU port(s) and egress
+        * them VLAN-untagged.
+        */
+-      mt7530_rmw(priv, MT753X_RGAC2, MT753X_R0E_EG_TAG_MASK |
+-                 MT753X_R0E_PORT_FW_MASK | MT753X_R03_EG_TAG_MASK |
+-                 MT753X_R03_PORT_FW_MASK,
+-                 MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) |
+-                 MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
+-                 MT753X_BPDU_CPU_ONLY);
++      mt7530_rmw(priv, MT753X_RGAC2,
++                 MT753X_R0E_BPDU_FR | MT753X_R0E_EG_TAG_MASK |
++                         MT753X_R0E_PORT_FW_MASK | MT753X_R03_BPDU_FR |
++                         MT753X_R03_EG_TAG_MASK | MT753X_R03_PORT_FW_MASK,
++                 MT753X_R0E_BPDU_FR |
++                         MT753X_R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY) |
++                         MT753X_R03_BPDU_FR |
++                         MT753X_R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) |
++                         MT753X_BPDU_CPU_ONLY);
+ }
+ 
+ static int
+diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
+index fa2afa67ceb07..2d1ea390f05ab 100644
+--- a/drivers/net/dsa/mt7530.h
++++ b/drivers/net/dsa/mt7530.h
+@@ -63,6 +63,7 @@ enum mt753x_id {
+ 
+ /* Registers for BPDU and PAE frame control*/
+ #define MT753X_BPC                    0x24
++#define  MT753X_PAE_BPDU_FR           BIT(25)
+ #define  MT753X_PAE_EG_TAG_MASK               GENMASK(24, 22)
+ #define  MT753X_PAE_EG_TAG(x)         FIELD_PREP(MT753X_PAE_EG_TAG_MASK, x)
+ #define  MT753X_PAE_PORT_FW_MASK      GENMASK(18, 16)
+@@ -73,20 +74,24 @@ enum mt753x_id {
+ 
+ /* Register for :01 and :02 MAC DA frame control */
+ #define MT753X_RGAC1                  0x28
++#define  MT753X_R02_BPDU_FR           BIT(25)
+ #define  MT753X_R02_EG_TAG_MASK               GENMASK(24, 22)
+ #define  MT753X_R02_EG_TAG(x)         FIELD_PREP(MT753X_R02_EG_TAG_MASK, x)
+ #define  MT753X_R02_PORT_FW_MASK      GENMASK(18, 16)
+ #define  MT753X_R02_PORT_FW(x)                FIELD_PREP(MT753X_R02_PORT_FW_MASK, x)
++#define  MT753X_R01_BPDU_FR           BIT(9)
+ #define  MT753X_R01_EG_TAG_MASK               GENMASK(8, 6)
+ #define  MT753X_R01_EG_TAG(x)         FIELD_PREP(MT753X_R01_EG_TAG_MASK, x)
+ #define  MT753X_R01_PORT_FW_MASK      GENMASK(2, 0)
+ 
+ /* Register for :03 and :0E MAC DA frame control */
+ #define MT753X_RGAC2                  0x2c
++#define  MT753X_R0E_BPDU_FR           BIT(25)
+ #define  MT753X_R0E_EG_TAG_MASK               GENMASK(24, 22)
+ #define  MT753X_R0E_EG_TAG(x)         FIELD_PREP(MT753X_R0E_EG_TAG_MASK, x)
+ #define  MT753X_R0E_PORT_FW_MASK      GENMASK(18, 16)
+ #define  MT753X_R0E_PORT_FW(x)                FIELD_PREP(MT753X_R0E_PORT_FW_MASK, x)
++#define  MT753X_R03_BPDU_FR           BIT(9)
+ #define  MT753X_R03_EG_TAG_MASK               GENMASK(8, 6)
+ #define  MT753X_R03_EG_TAG(x)         FIELD_PREP(MT753X_R03_EG_TAG_MASK, x)
+ #define  MT753X_R03_PORT_FW_MASK      GENMASK(2, 0)
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-ena-fix-incorrect-descriptor-free-behavior.patch b/queue-6.1/net-ena-fix-incorrect-descriptor-free-behavior.patch

new file mode 100644 (file)

index 0000000..8982d3c
--- /dev/null
+++ b/queue-6.1/net-ena-fix-incorrect-descriptor-free-behavior.patch
@@ -0,0 +1,72 @@
+From 92535ce67980a1288e3e26b3d2c42bfb68ef72be Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 Apr 2024 09:13:57 +0000
+Subject: net: ena: Fix incorrect descriptor free behavior
+
+From: David Arinzon <darinzon@amazon.com>
+
+[ Upstream commit bf02d9fe00632d22fa91d34749c7aacf397b6cde ]
+
+ENA has two types of TX queues:
+- queues which only process TX packets arriving from the network stack
+- queues which only process TX packets forwarded to it by XDP_REDIRECT
+  or XDP_TX instructions
+
+The ena_free_tx_bufs() cycles through all descriptors in a TX queue
+and unmaps + frees every descriptor that hasn't been acknowledged yet
+by the device (uncompleted TX transactions).
+The function assumes that the processed TX queue is necessarily from
+the first category listed above and ends up using napi_consume_skb()
+for descriptors belonging to an XDP specific queue.
+
+This patch solves a bug in which, in case of a VF reset, the
+descriptors aren't freed correctly, leading to crashes.
+
+Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
+Signed-off-by: Shay Agroskin <shayagr@amazon.com>
+Signed-off-by: David Arinzon <darinzon@amazon.com>
+Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/amazon/ena/ena_netdev.c | 14 +++++++++++---
+ 1 file changed, 11 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
+index b2eb6e1958f04..5e37b18ac3adf 100644
+--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
+@@ -1203,8 +1203,11 @@ static void ena_unmap_tx_buff(struct ena_ring *tx_ring,
+ static void ena_free_tx_bufs(struct ena_ring *tx_ring)
+ {
+       bool print_once = true;
++      bool is_xdp_ring;
+       u32 i;
+ 
++      is_xdp_ring = ENA_IS_XDP_INDEX(tx_ring->adapter, tx_ring->qid);
++
+       for (i = 0; i < tx_ring->ring_size; i++) {
+               struct ena_tx_buffer *tx_info = &tx_ring->tx_buffer_info[i];
+ 
+@@ -1224,10 +1227,15 @@ static void ena_free_tx_bufs(struct ena_ring *tx_ring)
+ 
+               ena_unmap_tx_buff(tx_ring, tx_info);
+ 
+-              dev_kfree_skb_any(tx_info->skb);
++              if (is_xdp_ring)
++                      xdp_return_frame(tx_info->xdpf);
++              else
++                      dev_kfree_skb_any(tx_info->skb);
+       }
+-      netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
+-                                                tx_ring->qid));
++
++      if (!is_xdp_ring)
++              netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
++                                                        tx_ring->qid));
+ }
+ 
+ static void ena_free_all_tx_bufs(struct ena_adapter *adapter)
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-ena-fix-potential-sign-extension-issue.patch b/queue-6.1/net-ena-fix-potential-sign-extension-issue.patch

new file mode 100644 (file)

index 0000000..9d8d6d2
--- /dev/null
+++ b/queue-6.1/net-ena-fix-potential-sign-extension-issue.patch
@@ -0,0 +1,66 @@
+From bf5bd0f041a53fac1ea40d164a67c4b6e07615c4 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 Apr 2024 09:13:55 +0000
+Subject: net: ena: Fix potential sign extension issue
+
+From: David Arinzon <darinzon@amazon.com>
+
+[ Upstream commit 713a85195aad25d8a26786a37b674e3e5ec09e3c ]
+
+In a multiplication, small unsigned types are promoted to a
+larger signed type (int), and the product may overflow it.
+If the product has its MSB set, it is sign extended with '1's
+when it is converted to a wider type such as size_t.
+This changes the multiplication result.
+
+Code example of the phenomenon:
+-------------------------------
+u16 x, y;
+size_t z1, z2;
+
+x = y = 0xffff;
+printk("x=%x y=%x\n",x,y);
+
+z1 = x*y;
+z2 = (size_t)x*y;
+
+printk("z1=%lx z2=%lx\n", z1, z2);
+
+Output:
+-------
+x=ffff y=ffff
+z1=fffffffffffe0001 z2=fffe0001
+
+The expected result of ffff*ffff is fffe0001, but without the
+explicit cast, the unwanted sign extension yields
+fffffffffffe0001.
+
+This commit adds an explicit cast to avoid the sign extension
+issue.
+
+Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
+Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
+Signed-off-by: David Arinzon <darinzon@amazon.com>
+Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
+index 633b321d7fdd9..4db689372980e 100644
+--- a/drivers/net/ethernet/amazon/ena/ena_com.c
++++ b/drivers/net/ethernet/amazon/ena/ena_com.c
+@@ -362,7 +362,7 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
+                       ENA_COM_BOUNCE_BUFFER_CNTRL_CNT;
+               io_sq->bounce_buf_ctrl.next_to_use = 0;
+ 
+-              size = io_sq->bounce_buf_ctrl.buffer_size *
++              size = (size_t)io_sq->bounce_buf_ctrl.buffer_size *
+                       io_sq->bounce_buf_ctrl.buffers_num;
+ 
+               dev_node = dev_to_node(ena_dev->dmadev);
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-ena-wrong-missing-io-completions-check-order.patch b/queue-6.1/net-ena-wrong-missing-io-completions-check-order.patch

new file mode 100644 (file)

index 0000000..fee13eb
--- /dev/null
+++ b/queue-6.1/net-ena-wrong-missing-io-completions-check-order.patch
@@ -0,0 +1,108 @@
+From c4eeb81b42c997ba24c362832f171262e27e60c9 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 Apr 2024 09:13:56 +0000
+Subject: net: ena: Wrong missing IO completions check order
+
+From: David Arinzon <darinzon@amazon.com>
+
+[ Upstream commit f7e417180665234fdb7af2ebe33d89aaa434d16f ]
+
+The missing IO completions check runs every second (HZ jiffies).
+This commit fixes several issues with this check:
+
+1. Duplicate queues check:
+   Max of 4 queues are scanned on each check due to monitor budget.
+   Once reaching the budget, this check exits under the assumption that
+   the next check will continue to scan the remainder of the queues,
+   but in practice, next check will first scan the last already scanned
+   queue which is not necessary and may cause the full queue scan to
+   last a couple of seconds longer.
+   The fix is to start every check with the next queue to scan.
+   For example, on 8 IO queues:
+   Bug: [0,1,2,3], [3,4,5,6], [6,7]
+   Fix: [0,1,2,3], [4,5,6,7]
+
+2. Unbalanced queues check:
+   In case the number of active IO queues is not a multiple of budget,
+   there will be checks which don't utilize the full budget
+   because the full scan exits when reaching the last queue id.
+   The fix is to run every TX completion check with exact queue budget
+   regardless of the queue id.
+   For example, on 7 IO queues:
+   Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
+   Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
+   The budget may be lowered in case the number of IO queues is less
+   than the budget (4) to make sure there are no duplicate queues on
+   the same check.
+   For example, on 3 IO queues:
+   Bug: [0,1,2,0], [1,2,0,1]
+   Fix: [0,1,2], [0,1,2]
+
+Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
+Signed-off-by: Amit Bernstein <amitbern@amazon.com>
+Signed-off-by: David Arinzon <darinzon@amazon.com>
+Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/amazon/ena/ena_netdev.c | 21 +++++++++++---------
+ 1 file changed, 12 insertions(+), 9 deletions(-)
+
+diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
+index 9e82e7b9c3b72..b2eb6e1958f04 100644
+--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
++++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
+@@ -3797,10 +3797,11 @@ static void check_for_missing_completions(struct ena_adapter *adapter)
+ {
+       struct ena_ring *tx_ring;
+       struct ena_ring *rx_ring;
+-      int i, budget, rc;
++      int qid, budget, rc;
+       int io_queue_count;
+ 
+       io_queue_count = adapter->xdp_num_queues + adapter->num_io_queues;
++
+       /* Make sure the driver doesn't turn the device in other process */
+       smp_rmb();
+ 
+@@ -3813,27 +3814,29 @@ static void check_for_missing_completions(struct ena_adapter *adapter)
+       if (adapter->missing_tx_completion_to == ENA_HW_HINTS_NO_TIMEOUT)
+               return;
+ 
+-      budget = ENA_MONITORED_TX_QUEUES;
++      budget = min_t(u32, io_queue_count, ENA_MONITORED_TX_QUEUES);
+ 
+-      for (i = adapter->last_monitored_tx_qid; i < io_queue_count; i++) {
+-              tx_ring = &adapter->tx_ring[i];
+-              rx_ring = &adapter->rx_ring[i];
++      qid = adapter->last_monitored_tx_qid;
++
++      while (budget) {
++              qid = (qid + 1) % io_queue_count;
++
++              tx_ring = &adapter->tx_ring[qid];
++              rx_ring = &adapter->rx_ring[qid];
+ 
+               rc = check_missing_comp_in_tx_queue(adapter, tx_ring);
+               if (unlikely(rc))
+                       return;
+ 
+-              rc =  !ENA_IS_XDP_INDEX(adapter, i) ?
++              rc =  !ENA_IS_XDP_INDEX(adapter, qid) ?
+                       check_for_rx_interrupt_queue(adapter, rx_ring) : 0;
+               if (unlikely(rc))
+                       return;
+ 
+               budget--;
+-              if (!budget)
+-                      break;
+       }
+ 
+-      adapter->last_monitored_tx_qid = i % io_queue_count;
++      adapter->last_monitored_tx_qid = qid;
+ }
+ 
+ /* trigger napi schedule after 2 consecutive detections */
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-ks8851-handle-softirqs-at-the-end-of-irq-thread-.patch b/queue-6.1/net-ks8851-handle-softirqs-at-the-end-of-irq-thread-.patch

new file mode 100644 (file)

index 0000000..8c17268
--- /dev/null
+++ b/queue-6.1/net-ks8851-handle-softirqs-at-the-end-of-irq-thread-.patch
@@ -0,0 +1,101 @@
+From a8fe1105b87b4c45a9d200232a1d86007ff5ed87 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 22:30:40 +0200
+Subject: net: ks8851: Handle softirqs at the end of IRQ thread to fix hang
+
+From: Marek Vasut <marex@denx.de>
+
+[ Upstream commit be0384bf599cf1eb8d337517feeb732d71f75a6f ]
+
+The ks8851_irq() thread may call ks8851_rx_pkts() in case there are
+any packets in the MAC FIFO, which calls netif_rx(). This netif_rx()
+implementation is guarded by local_bh_disable() and local_bh_enable().
+The local_bh_enable() may call do_softirq() to run softirqs in case
+any are pending. One of the softirqs is net_rx_action, which ultimately
+reaches the driver .start_xmit callback. If that happens, the system
+hangs. The entire call chain is below:
+
+ks8851_start_xmit_par from netdev_start_xmit
+netdev_start_xmit from dev_hard_start_xmit
+dev_hard_start_xmit from sch_direct_xmit
+sch_direct_xmit from __dev_queue_xmit
+__dev_queue_xmit from __neigh_update
+__neigh_update from neigh_update
+neigh_update from arp_process.constprop.0
+arp_process.constprop.0 from __netif_receive_skb_one_core
+__netif_receive_skb_one_core from process_backlog
+process_backlog from __napi_poll.constprop.0
+__napi_poll.constprop.0 from net_rx_action
+net_rx_action from __do_softirq
+__do_softirq from call_with_stack
+call_with_stack from do_softirq
+do_softirq from __local_bh_enable_ip
+__local_bh_enable_ip from netif_rx
+netif_rx from ks8851_irq
+ks8851_irq from irq_thread_fn
+irq_thread_fn from irq_thread
+irq_thread from kthread
+kthread from ret_from_fork
+
+The hang happens because ks8851_irq() first takes a spinlock via
+ks8851_par.c ks8851_lock_par() spin_lock_irqsave(&ksp->lock, ...)
+and, with that spinlock held, calls netif_rx(). Once the execution
+reaches ks8851_start_xmit_par(), it calls ks8851_lock_par() again,
+which attempts to take the already held spinlock, and the system
+hangs.
+
+Move the do_softirq() call outside of the spinlock protected section
+of ks8851_irq() by disabling BHs around the entire spinlock protected
+section of ks8851_irq() handler. Place local_bh_enable() outside of
+the spinlock protected section, so that it can trigger do_softirq()
+without the ks8851_par.c ks8851_lock_par() spinlock being held, and
+safely call ks8851_start_xmit_par() without attempting to lock the
+already locked spinlock.
+
+Since ks8851_irq() is protected by local_bh_disable()/local_bh_enable()
+now, replace netif_rx() with __netif_rx() which is not duplicating the
+local_bh_disable()/local_bh_enable() calls.
+
+Fixes: 797047f875b5 ("net: ks8851: Implement Parallel bus operations")
+Signed-off-by: Marek Vasut <marex@denx.de>
+Link: https://lore.kernel.org/r/20240405203204.82062-2-marex@denx.de
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/micrel/ks8851_common.c | 6 +++++-
+ 1 file changed, 5 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
+index 896d43bb8883d..d4cdf3d4f5525 100644
+--- a/drivers/net/ethernet/micrel/ks8851_common.c
++++ b/drivers/net/ethernet/micrel/ks8851_common.c
+@@ -299,7 +299,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
+                                       ks8851_dbg_dumpkkt(ks, rxpkt);
+ 
+                               skb->protocol = eth_type_trans(skb, ks->netdev);
+-                              netif_rx(skb);
++                              __netif_rx(skb);
+ 
+                               ks->netdev->stats.rx_packets++;
+                               ks->netdev->stats.rx_bytes += rxlen;
+@@ -330,6 +330,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
+       unsigned long flags;
+       unsigned int status;
+ 
++      local_bh_disable();
++
+       ks8851_lock(ks, &flags);
+ 
+       status = ks8851_rdreg16(ks, KS_ISR);
+@@ -406,6 +408,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
+       if (status & IRQ_LCI)
+               mii_check_link(&ks->mii);
+ 
++      local_bh_enable();
++
+       return IRQ_HANDLED;
+ }
+ 
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-ks8851-inline-ks8851_rx_skb.patch b/queue-6.1/net-ks8851-inline-ks8851_rx_skb.patch
new file mode 100644
index 0000000..d3f81d8
--- /dev/null
+++ b/queue-6.1/net-ks8851-inline-ks8851_rx_skb.patch
@@ -0,0 +1,138 @@
+From b92311c0779585affead78a7a275bb540b768a09 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 Apr 2024 22:30:39 +0200
+Subject: net: ks8851: Inline ks8851_rx_skb()
+
+From: Marek Vasut <marex@denx.de>
+
+[ Upstream commit f96f700449b6d190e06272f1cf732ae8e45b73df ]
+
+Both ks8851_rx_skb_par() and ks8851_rx_skb_spi() call netif_rx(skb),
+inline the netif_rx(skb) call directly into ks8851_common.c and drop
+the .rx_skb callback and ks8851_rx_skb() wrapper. This removes one
+indirect call from the driver, no functional change otherwise.
+
+Signed-off-by: Marek Vasut <marex@denx.de>
+Link: https://lore.kernel.org/r/20240405203204.82062-1-marex@denx.de
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Stable-dep-of: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/micrel/ks8851.h        |  3 ---
+ drivers/net/ethernet/micrel/ks8851_common.c | 12 +-----------
+ drivers/net/ethernet/micrel/ks8851_par.c    | 11 -----------
+ drivers/net/ethernet/micrel/ks8851_spi.c    | 11 -----------
+ 4 files changed, 1 insertion(+), 36 deletions(-)
+
+diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
+index e5ec0a363aff8..31f75b4a67fd7 100644
+--- a/drivers/net/ethernet/micrel/ks8851.h
++++ b/drivers/net/ethernet/micrel/ks8851.h
+@@ -368,7 +368,6 @@ union ks8851_tx_hdr {
+  * @rdfifo: FIFO read callback
+  * @wrfifo: FIFO write callback
+  * @start_xmit: start_xmit() implementation callback
+- * @rx_skb: rx_skb() implementation callback
+  * @flush_tx_work: flush_tx_work() implementation callback
+  *
+  * The @statelock is used to protect information in the structure which may
+@@ -423,8 +422,6 @@ struct ks8851_net {
+                                         struct sk_buff *txp, bool irq);
+       netdev_tx_t             (*start_xmit)(struct sk_buff *skb,
+                                             struct net_device *dev);
+-      void                    (*rx_skb)(struct ks8851_net *ks,
+-                                        struct sk_buff *skb);
+       void                    (*flush_tx_work)(struct ks8851_net *ks);
+ };
+ 
+diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
+index 0bf13b38b8f5b..896d43bb8883d 100644
+--- a/drivers/net/ethernet/micrel/ks8851_common.c
++++ b/drivers/net/ethernet/micrel/ks8851_common.c
+@@ -231,16 +231,6 @@ static void ks8851_dbg_dumpkkt(struct ks8851_net *ks, u8 *rxpkt)
+                  rxpkt[12], rxpkt[13], rxpkt[14], rxpkt[15]);
+ }
+ 
+-/**
+- * ks8851_rx_skb - receive skbuff
+- * @ks: The device state.
+- * @skb: The skbuff
+- */
+-static void ks8851_rx_skb(struct ks8851_net *ks, struct sk_buff *skb)
+-{
+-      ks->rx_skb(ks, skb);
+-}
+-
+ /**
+  * ks8851_rx_pkts - receive packets from the host
+  * @ks: The device information.
+@@ -309,7 +299,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
+                                       ks8851_dbg_dumpkkt(ks, rxpkt);
+ 
+                               skb->protocol = eth_type_trans(skb, ks->netdev);
+-                              ks8851_rx_skb(ks, skb);
++                              netif_rx(skb);
+ 
+                               ks->netdev->stats.rx_packets++;
+                               ks->netdev->stats.rx_bytes += rxlen;
+diff --git a/drivers/net/ethernet/micrel/ks8851_par.c b/drivers/net/ethernet/micrel/ks8851_par.c
+index 7f49042484bdc..96fb0ffcedb90 100644
+--- a/drivers/net/ethernet/micrel/ks8851_par.c
++++ b/drivers/net/ethernet/micrel/ks8851_par.c
+@@ -210,16 +210,6 @@ static void ks8851_wrfifo_par(struct ks8851_net *ks, struct sk_buff *txp,
+       iowrite16_rep(ksp->hw_addr, txp->data, len / 2);
+ }
+ 
+-/**
+- * ks8851_rx_skb_par - receive skbuff
+- * @ks: The device state.
+- * @skb: The skbuff
+- */
+-static void ks8851_rx_skb_par(struct ks8851_net *ks, struct sk_buff *skb)
+-{
+-      netif_rx(skb);
+-}
+-
+ static unsigned int ks8851_rdreg16_par_txqcr(struct ks8851_net *ks)
+ {
+       return ks8851_rdreg16_par(ks, KS_TXQCR);
+@@ -298,7 +288,6 @@ static int ks8851_probe_par(struct platform_device *pdev)
+       ks->rdfifo = ks8851_rdfifo_par;
+       ks->wrfifo = ks8851_wrfifo_par;
+       ks->start_xmit = ks8851_start_xmit_par;
+-      ks->rx_skb = ks8851_rx_skb_par;
+ 
+ #define STD_IRQ (IRQ_LCI |    /* Link Change */       \
+                IRQ_RXI |      /* RX done */           \
+diff --git a/drivers/net/ethernet/micrel/ks8851_spi.c b/drivers/net/ethernet/micrel/ks8851_spi.c
+index 88e26c120b483..4dcbff789b19d 100644
+--- a/drivers/net/ethernet/micrel/ks8851_spi.c
++++ b/drivers/net/ethernet/micrel/ks8851_spi.c
+@@ -298,16 +298,6 @@ static unsigned int calc_txlen(unsigned int len)
+       return ALIGN(len + 4, 4);
+ }
+ 
+-/**
+- * ks8851_rx_skb_spi - receive skbuff
+- * @ks: The device state
+- * @skb: The skbuff
+- */
+-static void ks8851_rx_skb_spi(struct ks8851_net *ks, struct sk_buff *skb)
+-{
+-      netif_rx(skb);
+-}
+-
+ /**
+  * ks8851_tx_work - process tx packet(s)
+  * @work: The work strucutre what was scheduled.
+@@ -435,7 +425,6 @@ static int ks8851_probe_spi(struct spi_device *spi)
+       ks->rdfifo = ks8851_rdfifo_spi;
+       ks->wrfifo = ks8851_wrfifo_spi;
+       ks->start_xmit = ks8851_start_xmit_spi;
+-      ks->rx_skb = ks8851_rx_skb_spi;
+       ks->flush_tx_work = ks8851_flush_tx_work_spi;
+ 
+ #define STD_IRQ (IRQ_LCI |    /* Link Change */       \
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch b/queue-6.1/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch
new file mode 100644
index 0000000..a2b0f71
--- /dev/null
+++ b/queue-6.1/net-mlx5-properly-link-new-fs-rules-into-the-tree.patch
@@ -0,0 +1,66 @@
+From be3404952290ff9ab98889f6cf60b51df0fbba91 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 22:08:12 +0300
+Subject: net/mlx5: Properly link new fs rules into the tree
+
+From: Cosmin Ratiu <cratiu@nvidia.com>
+
+[ Upstream commit 7c6782ad4911cbee874e85630226ed389ff2e453 ]
+
+Previously, add_rule_fg would only add newly created rules from the
+handle into the tree when they had a refcount of 1. On the other hand,
+create_flow_handle tries hard to find and reference already existing
+identical rules instead of creating new ones.
+
+These two behaviors can result in a situation where create_flow_handle
+1) creates a new rule and references it, then
+2) in a subsequent step during the same handle creation references it
+   again,
+resulting in a rule with a refcount of 2 that is not linked into the
+tree, will have a NULL parent and root and will result in a crash when
+the flow group is deleted because del_sw_hw_rule, invoked on rule
+deletion, assumes node->parent is != NULL.
+
+This happened in the wild, due to another bug related to incorrect
+handling of duplicate pkt_reformat ids, which led to the code in
+create_flow_handle incorrectly referencing a just-added rule in the same
+flow handle, resulting in the problem described above. Full details are
+at [1].
+
+This patch changes add_rule_fg to add new rules without parents into
+the tree, properly initializing them and avoiding the crash. This makes
+it more consistent with how rules are added to an FTE in
+create_flow_handle.
+
+Fixes: 74491de93712 ("net/mlx5: Add multi dest support")
+Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1]
+Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
+Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
+Reviewed-by: Mark Bloch <mbloch@nvidia.com>
+Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
+Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
+Link: https://lore.kernel.org/r/20240409190820.227554-5-tariqt@nvidia.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+index e6674118bc428..164e10b5f9b7f 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+@@ -1752,8 +1752,9 @@ static struct mlx5_flow_handle *add_rule_fg(struct mlx5_flow_group *fg,
+       }
+       trace_mlx5_fs_set_fte(fte, false);
+ 
++      /* Link newly added rules into the tree. */
+       for (i = 0; i < handle->num_rules; i++) {
+-              if (refcount_read(&handle->rule[i]->node.refcount) == 1) {
++              if (!handle->rule[i]->node.parent) {
+                       tree_add_node(&handle->rule[i]->node, &fte->node);
+                       trace_mlx5_fs_add_rule(handle->rule[i]);
+               }
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-mlx5e-fix-mlx5e_priv_init-cleanup-flow.patch b/queue-6.1/net-mlx5e-fix-mlx5e_priv_init-cleanup-flow.patch
new file mode 100644
index 0000000..9525826
--- /dev/null
+++ b/queue-6.1/net-mlx5e-fix-mlx5e_priv_init-cleanup-flow.patch
@@ -0,0 +1,109 @@
+From a363c24fc9dcb840eb1096a549c8a92c6b371263 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 22:08:15 +0300
+Subject: net/mlx5e: Fix mlx5e_priv_init() cleanup flow
+
+From: Carolina Jubran <cjubran@nvidia.com>
+
+[ Upstream commit ecb829459a841198e142f72fadab56424ae96519 ]
+
+When mlx5e_priv_init() fails, the cleanup flow calls mlx5e_selq_cleanup which
+calls mlx5e_selq_apply() that assures that the `priv->state_lock` is held using
+lockdep_is_held().
+
+Acquire the state_lock in mlx5e_selq_cleanup().
+
+Kernel log:
+=============================
+WARNING: suspicious RCU usage
+6.8.0-rc3_net_next_841a9b5 #1 Not tainted
+-----------------------------
+drivers/net/ethernet/mellanox/mlx5/core/en/selq.c:124 suspicious rcu_dereference_protected() usage!
+
+other info that might help us debug this:
+
+rcu_scheduler_active = 2, debug_locks = 1
+2 locks held by systemd-modules/293:
+ #0: ffffffffa05067b0 (devices_rwsem){++++}-{3:3}, at: ib_register_client+0x109/0x1b0 [ib_core]
+ #1: ffff8881096c65c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x104/0x1c0 [ib_core]
+
+stack backtrace:
+CPU: 4 PID: 293 Comm: systemd-modules Not tainted 6.8.0-rc3_net_next_841a9b5 #1
+Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
+Call Trace:
+ <TASK>
+ dump_stack_lvl+0x8a/0xa0
+ lockdep_rcu_suspicious+0x154/0x1a0
+ mlx5e_selq_apply+0x94/0xa0 [mlx5_core]
+ mlx5e_selq_cleanup+0x3a/0x60 [mlx5_core]
+ mlx5e_priv_init+0x2be/0x2f0 [mlx5_core]
+ mlx5_rdma_setup_rn+0x7c/0x1a0 [mlx5_core]
+ rdma_init_netdev+0x4e/0x80 [ib_core]
+ ? mlx5_rdma_netdev_free+0x70/0x70 [mlx5_core]
+ ipoib_intf_init+0x64/0x550 [ib_ipoib]
+ ipoib_intf_alloc+0x4e/0xc0 [ib_ipoib]
+ ipoib_add_one+0xb0/0x360 [ib_ipoib]
+ add_client_context+0x112/0x1c0 [ib_core]
+ ib_register_client+0x166/0x1b0 [ib_core]
+ ? 0xffffffffa0573000
+ ipoib_init_module+0xeb/0x1a0 [ib_ipoib]
+ do_one_initcall+0x61/0x250
+ do_init_module+0x8a/0x270
+ init_module_from_file+0x8b/0xd0
+ idempotent_init_module+0x17d/0x230
+ __x64_sys_finit_module+0x61/0xb0
+ do_syscall_64+0x71/0x140
+ entry_SYSCALL_64_after_hwframe+0x46/0x4e
+ </TASK>
+
+Fixes: 8bf30be75069 ("net/mlx5e: Introduce select queue parameters")
+Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
+Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
+Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
+Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
+Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
+Link: https://lore.kernel.org/r/20240409190820.227554-8-tariqt@nvidia.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/mellanox/mlx5/core/en/selq.c | 2 ++
+ drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 --
+ 2 files changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/selq.c b/drivers/net/ethernet/mellanox/mlx5/core/en/selq.c
+index f675b1926340f..f66bbc8464645 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/selq.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/selq.c
+@@ -57,6 +57,7 @@ int mlx5e_selq_init(struct mlx5e_selq *selq, struct mutex *state_lock)
+ 
+ void mlx5e_selq_cleanup(struct mlx5e_selq *selq)
+ {
++      mutex_lock(selq->state_lock);
+       WARN_ON_ONCE(selq->is_prepared);
+ 
+       kvfree(selq->standby);
+@@ -67,6 +68,7 @@ void mlx5e_selq_cleanup(struct mlx5e_selq *selq)
+ 
+       kvfree(selq->standby);
+       selq->standby = NULL;
++      mutex_unlock(selq->state_lock);
+ }
+ 
+ void mlx5e_selq_prepare_params(struct mlx5e_selq *selq, struct mlx5e_params *params)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index 9910a0480f589..e7d396434da36 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -5578,9 +5578,7 @@ void mlx5e_priv_cleanup(struct mlx5e_priv *priv)
+       kfree(priv->tx_rates);
+       kfree(priv->txq2sq);
+       destroy_workqueue(priv->wq);
+-      mutex_lock(&priv->state_lock);
+       mlx5e_selq_cleanup(&priv->selq);
+-      mutex_unlock(&priv->state_lock);
+       free_cpumask_var(priv->scratchpad.cpumask);
+ 
+       for (i = 0; i < priv->htb_max_qos_sqs; i++)
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-mlx5e-htb-fix-inconsistencies-with-qos-sqs-numbe.patch b/queue-6.1/net-mlx5e-htb-fix-inconsistencies-with-qos-sqs-numbe.patch
new file mode 100644
index 0000000..c60df9b
--- /dev/null
+++ b/queue-6.1/net-mlx5e-htb-fix-inconsistencies-with-qos-sqs-numbe.patch
@@ -0,0 +1,83 @@
+From 3e5583c523ed42f75a31a33921688168e2098314 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 22:08:16 +0300
+Subject: net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
+
+From: Carolina Jubran <cjubran@nvidia.com>
+
+[ Upstream commit 2f436f1869771d46e1a9f85738d5a1a7c5653a4e ]
+
+When creating a new HTB class while the interface is down,
+the variable that follows the number of QoS SQs (htb_max_qos_sqs)
+may not be consistent with the number of HTB classes.
+
+Previously, we compared these two values to ensure that
+the node_qid is lower than the number of QoS SQs, and we
+allocated stats for that SQ when they are equal.
+
+Change the check to compare the node_qid with the current
+number of leaf nodes and fix the checking conditions to
+ensure allocation of stats_list and stats for each node.
+
+Fixes: 214baf22870c ("net/mlx5e: Support HTB offload")
+Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
+Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
+Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
+Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
+Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
+Link: https://lore.kernel.org/r/20240409190820.227554-9-tariqt@nvidia.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ .../net/ethernet/mellanox/mlx5/core/en/qos.c  | 33 ++++++++++---------
+ 1 file changed, 17 insertions(+), 16 deletions(-)
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+index 2842195ee548a..1e887d640cffc 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+@@ -82,24 +82,25 @@ int mlx5e_open_qos_sq(struct mlx5e_priv *priv, struct mlx5e_channels *chs,
+ 
+       txq_ix = mlx5e_qid_from_qos(chs, node_qid);
+ 
+-      WARN_ON(node_qid > priv->htb_max_qos_sqs);
+-      if (node_qid == priv->htb_max_qos_sqs) {
+-              struct mlx5e_sq_stats *stats, **stats_list = NULL;
+-
+-              if (priv->htb_max_qos_sqs == 0) {
+-                      stats_list = kvcalloc(mlx5e_qos_max_leaf_nodes(priv->mdev),
+-                                            sizeof(*stats_list),
+-                                            GFP_KERNEL);
+-                      if (!stats_list)
+-                              return -ENOMEM;
+-              }
++      WARN_ON(node_qid >= mlx5e_htb_cur_leaf_nodes(priv->htb));
++      if (!priv->htb_qos_sq_stats) {
++              struct mlx5e_sq_stats **stats_list;
++
++              stats_list = kvcalloc(mlx5e_qos_max_leaf_nodes(priv->mdev),
++                                    sizeof(*stats_list), GFP_KERNEL);
++              if (!stats_list)
++                      return -ENOMEM;
++
++              WRITE_ONCE(priv->htb_qos_sq_stats, stats_list);
++      }
++
++      if (!priv->htb_qos_sq_stats[node_qid]) {
++              struct mlx5e_sq_stats *stats;
++
+               stats = kzalloc(sizeof(*stats), GFP_KERNEL);
+-              if (!stats) {
+-                      kvfree(stats_list);
++              if (!stats)
+                       return -ENOMEM;
+-              }
+-              if (stats_list)
+-                      WRITE_ONCE(priv->htb_qos_sq_stats, stats_list);
++
+               WRITE_ONCE(priv->htb_qos_sq_stats[node_qid], stats);
+               /* Order htb_max_qos_sqs increment after writing the array pointer.
+                * Pairs with smp_load_acquire in en_stats.c.
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch b/queue-6.1/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch
new file mode 100644
index 0000000..f2e9255
--- /dev/null
+++ b/queue-6.1/net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch
@@ -0,0 +1,60 @@
+From 19b17d4a78bb04433f7ad8dd8d9027dd87a3eb6f Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 3 Apr 2024 22:38:01 +0200
+Subject: net: openvswitch: fix unwanted error log on timeout policy probing
+
+From: Ilya Maximets <i.maximets@ovn.org>
+
+[ Upstream commit 4539f91f2a801c0c028c252bffae56030cfb2cae ]
+
+On startup, ovs-vswitchd probes different datapath features including
+support for timeout policies.  While probing, it tries to execute
+certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
+attributes set.  These attributes tell the openvswitch module to not
+log any errors when they occur as it is expected that some of the
+probes will fail.
+
+For some reason, setting the timeout policy ignores the PROBE attribute
+and logs a failure anyway.  This is causing the following kernel log
+on each re-start of ovs-vswitchd:
+
+  kernel: Failed to associated timeout policy `ovs_test_tp'
+
+Fix that by using the same logging macro that all other messages are
+using.  The message will still be printed at info level when needed
+and will be rate limited, but with a net rate limiter instead of
+generic printk one.
+
+The nf_ct_set_timeout() itself will still print some info messages,
+but at least this change makes logging in openvswitch module more
+consistent.
+
+Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
+Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
+Acked-by: Eelco Chaudron <echaudro@redhat.com>
+Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/openvswitch/conntrack.c | 5 +++--
+ 1 file changed, 3 insertions(+), 2 deletions(-)
+
+diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
+index 0591cfb289d50..e4ba86b84b9b1 100644
+--- a/net/openvswitch/conntrack.c
++++ b/net/openvswitch/conntrack.c
+@@ -1711,8 +1711,9 @@ int ovs_ct_copy_action(struct net *net, const struct nlattr *attr,
+       if (ct_info.timeout[0]) {
+               if (nf_ct_set_timeout(net, ct_info.ct, family, key->ip.proto,
+                                     ct_info.timeout))
+-                      pr_info_ratelimited("Failed to associated timeout "
+-                                          "policy `%s'\n", ct_info.timeout);
++                      OVS_NLERR(log,
++                                "Failed to associated timeout policy '%s'",
++                                ct_info.timeout);
+               else
+                       ct_info.nf_ct_timeout = rcu_dereference(
+                               nf_ct_timeout_find(ct_info.ct)->timeout);
+-- 
+2.43.0
+
diff --git a/queue-6.1/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch b/queue-6.1/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch
new file mode 100644
index 0000000..6386c10
--- /dev/null
+++ b/queue-6.1/net-sparx5-fix-wrong-config-being-used-when-reconfig.patch
@@ -0,0 +1,47 @@
+From b594a2ce7e86cc7c79a41f3f9b643c193d732f41 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 12:41:59 +0200
+Subject: net: sparx5: fix wrong config being used when reconfiguring PCS
+
+From: Daniel Machon <daniel.machon@microchip.com>
+
+[ Upstream commit 33623113a48ea906f1955cbf71094f6aa4462e8f ]
+
+The wrong port config is being used if the PCS is reconfigured. Fix this
+by correctly using the new config instead of the old one.
+
+Fixes: 946e7fd5053a ("net: sparx5: add port module support")
+Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
+Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
+Link: https://lore.kernel.org/r/20240409-link-mode-reconfiguration-fix-v2-1-db6a507f3627@microchip.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/microchip/sparx5/sparx5_port.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c
+index 32709d21ab2f9..212bf6f4ed72d 100644
+--- a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c
++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c
+@@ -730,7 +730,7 @@ static int sparx5_port_pcs_low_set(struct sparx5 *sparx5,
+       bool sgmii = false, inband_aneg = false;
+       int err;
+ 
+-      if (port->conf.inband) {
++      if (conf->inband) {
+               if (conf->portmode == PHY_INTERFACE_MODE_SGMII ||
+                   conf->portmode == PHY_INTERFACE_MODE_QSGMII)
+                       inband_aneg = true; /* Cisco-SGMII in-band-aneg */
+@@ -947,7 +947,7 @@ int sparx5_port_pcs_set(struct sparx5 *sparx5,
+       if (err)
+               return -EINVAL;
+ 
+-      if (port->conf.inband) {
++      if (conf->inband) {
+               /* Enable/disable 1G counters in ASM */
+               spx5_rmw(ASM_PORT_CFG_CSC_STAT_DIS_SET(high_speed_dev),
+                        ASM_PORT_CFG_CSC_STAT_DIS,
+-- 
+2.43.0
+
diff --git a/queue-6.1/netfilter-complete-validation-of-user-input.patch b/queue-6.1/netfilter-complete-validation-of-user-input.patch
new file mode 100644
index 0000000..35234d3
--- /dev/null
+++ b/queue-6.1/netfilter-complete-validation-of-user-input.patch
@@ -0,0 +1,102 @@
+From ae6ff91c02b1f1f389a9dd1aaa21c7ca93733f2b Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 Apr 2024 12:07:41 +0000
+Subject: netfilter: complete validation of user input
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit 65acf6e0501ac8880a4f73980d01b5d27648b956 ]
+
+In my recent commit, I missed that do_replace() handlers
+use copy_from_sockptr() (which I fixed), followed
+by unsafe copy_from_sockptr_offset() calls.
+
+In all functions, we can perform the @optlen validation
+before even calling xt_alloc_table_info() with the following
+check:
+
+if ((u64)optlen < (u64)tmp.size + sizeof(tmp))
+        return -EINVAL;
+
+Fixes: 0c83842df40f ("netfilter: validate user input for expected length")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Link: https://lore.kernel.org/r/20240409120741.3538135-1-edumazet@google.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/ipv4/netfilter/arp_tables.c | 4 ++++
+ net/ipv4/netfilter/ip_tables.c  | 4 ++++
+ net/ipv6/netfilter/ip6_tables.c | 4 ++++
+ 3 files changed, 12 insertions(+)
+
+diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
+index b150c9929b12e..14365b20f1c5c 100644
+--- a/net/ipv4/netfilter/arp_tables.c
++++ b/net/ipv4/netfilter/arp_tables.c
+@@ -966,6 +966,8 @@ static int do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+@@ -1266,6 +1268,8 @@ static int compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
+index 1f365e28e316c..a6208efcfccfc 100644
+--- a/net/ipv4/netfilter/ip_tables.c
++++ b/net/ipv4/netfilter/ip_tables.c
+@@ -1120,6 +1120,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+@@ -1506,6 +1508,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
+index 37a2b3301e423..b844e519da1b4 100644
+--- a/net/ipv6/netfilter/ip6_tables.c
++++ b/net/ipv6/netfilter/ip6_tables.c
+@@ -1137,6 +1137,8 @@ do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+@@ -1515,6 +1517,8 @@ compat_do_replace(struct net *net, sockptr_t arg, unsigned int len)
+               return -ENOMEM;
+       if (tmp.num_counters == 0)
+               return -EINVAL;
++      if ((u64)len < (u64)tmp.size + sizeof(tmp))
++              return -EINVAL;
+ 
+       tmp.name[sizeof(tmp.name)-1] = 0;
+ 
+-- 
+2.43.0
+
diff --git a/queue-6.1/nouveau-fix-function-cast-warning.patch b/queue-6.1/nouveau-fix-function-cast-warning.patch
new file mode 100644
index 0000000..3c15571
--- /dev/null
+++ b/queue-6.1/nouveau-fix-function-cast-warning.patch
@@ -0,0 +1,51 @@
+From 1b20095cbe9e9c16dad4588993e2973f2a5e6e5f Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 Apr 2024 18:02:25 +0200
+Subject: nouveau: fix function cast warning
+
+From: Arnd Bergmann <arnd@arndb.de>
+
+[ Upstream commit 185fdb4697cc9684a02f2fab0530ecdd0c2f15d4 ]
+
+Calling a function through an incompatible pointer type breaks
+kcfi, so clang warns about the assignment:
+
+drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
+   73 |         .fini = (void(*)(void *))kfree,
+
+Avoid this with a trivial wrapper.
+
+Fixes: c39f472e9f14 ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)")
+Signed-off-by: Arnd Bergmann <arnd@arndb.de>
+Signed-off-by: Danilo Krummrich <dakr@redhat.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
+index 4bf486b571013..cb05f7f48a98b 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
+@@ -66,11 +66,16 @@ of_init(struct nvkm_bios *bios, const char *name)
+       return ERR_PTR(-EINVAL);
+ }
+ 
++static void of_fini(void *p)
++{
++      kfree(p);
++}
++
+ const struct nvbios_source
+ nvbios_of = {
+       .name = "OpenFirmware",
+       .init = of_init,
+-      .fini = (void(*)(void *))kfree,
++      .fini = of_fini,
+       .read = of_read,
+       .size = of_size,
+       .rw = false,
+-- 
+2.43.0
+
diff --git a/queue-6.1/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch b/queue-6.1/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch
new file mode 100644
index 0000000..fd5bcbb
--- /dev/null
+++ b/queue-6.1/octeontx2-af-fix-nix-sq-mode-and-bp-config.patch
@@ -0,0 +1,59 @@
+From bf7e810dd10ab1e4c776c71c7d0e1805b9d5976e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 Apr 2024 12:06:43 +0530
+Subject: octeontx2-af: Fix NIX SQ mode and BP config
+
+From: Geetha sowjanya <gakula@marvell.com>
+
+[ Upstream commit faf23006185e777db18912685922c5ddb2df383f ]
+
+NIX SQ mode and link backpressure configuration is required for
+all platforms. But in the current driver this code is wrongly placed
+under specific platform check. This patch fixes the issue by
+moving the code out of platform check.
+
+Fixes: 5d9b976d4480 ("octeontx2-af: Support fixed transmit scheduler topology")
+Signed-off-by: Geetha sowjanya <gakula@marvell.com>
+Link: https://lore.kernel.org/r/20240408063643.26288-1-gakula@marvell.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ .../ethernet/marvell/octeontx2/af/rvu_nix.c   | 20 +++++++++----------
+ 1 file changed, 10 insertions(+), 10 deletions(-)
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+index bb99302eab67a..67080d5053e07 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+@@ -4237,18 +4237,18 @@ static int rvu_nix_block_init(struct rvu *rvu, struct nix_hw *nix_hw)
+                */
+               rvu_write64(rvu, blkaddr, NIX_AF_CFG,
+                           rvu_read64(rvu, blkaddr, NIX_AF_CFG) | 0x40ULL);
++      }
+ 
+-              /* Set chan/link to backpressure TL3 instead of TL2 */
+-              rvu_write64(rvu, blkaddr, NIX_AF_PSE_CHANNEL_LEVEL, 0x01);
++      /* Set chan/link to backpressure TL3 instead of TL2 */
++      rvu_write64(rvu, blkaddr, NIX_AF_PSE_CHANNEL_LEVEL, 0x01);
+ 
+-              /* Disable SQ manager's sticky mode operation (set TM6 = 0)
+-               * This sticky mode is known to cause SQ stalls when multiple
+-               * SQs are mapped to same SMQ and transmitting pkts at a time.
+-               */
+-              cfg = rvu_read64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS);
+-              cfg &= ~BIT_ULL(15);
+-              rvu_write64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS, cfg);
+-      }
++      /* Disable SQ manager's sticky mode operation (set TM6 = 0)
++       * This sticky mode is known to cause SQ stalls when multiple
++       * SQs are mapped to same SMQ and transmitting pkts at a time.
++       */
++      cfg = rvu_read64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS);
++      cfg &= ~BIT_ULL(15);
++      rvu_write64(rvu, blkaddr, NIX_AF_SQM_DBG_CTL_STATUS, cfg);
+ 
+       ltdefs = rvu->kpu.lt_def;
+       /* Calibrate X2P bus to check if CGX/LBK links are fine */
+-- 
+2.43.0
+
diff --git a/queue-6.1/revert-drm-qxl-simplify-qxl_fence_wait.patch b/queue-6.1/revert-drm-qxl-simplify-qxl_fence_wait.patch
new file mode 100644
index 0000000..f4607da
--- /dev/null
+++ b/queue-6.1/revert-drm-qxl-simplify-qxl_fence_wait.patch
@@ -0,0 +1,115 @@
+From 23eb3c700a36ce228995c28ffaf5fe15b891f8a5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 Apr 2024 19:14:48 +0100
+Subject: Revert "drm/qxl: simplify qxl_fence_wait"
+
+From: Alex Constantino <dreaming.about.electric.sheep@gmail.com>
+
+[ Upstream commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea ]
+
+This reverts commit 5a838e5d5825c85556011478abde708251cc0776.
+
+Changes from commit 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") would
+result in a '[TTM] Buffer eviction failed' exception whenever it reached a
+timeout.
+Due to a dependency on DMA_FENCE_WARN, this also restores some code deleted
+by commit d72277b6c37d ("dma-buf: nuke DMA_FENCE_TRACE macros v2").
+
+Fixes: 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait")
+Link: https://lore.kernel.org/regressions/ZTgydqRlK6WX_b29@eldamar.lan/
+Reported-by: Timo Lindfors <timo.lindfors@iki.fi>
+Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514
+Signed-off-by: Alex Constantino <dreaming.about.electric.sheep@gmail.com>
+Signed-off-by: Maxime Ripard <mripard@kernel.org>
+Link: https://patchwork.freedesktop.org/patch/msgid/20240404181448.1643-2-dreaming.about.electric.sheep@gmail.com
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/qxl/qxl_release.c | 50 +++++++++++++++++++++++++++----
+ include/linux/dma-fence.h         |  7 +++++
+ 2 files changed, 52 insertions(+), 5 deletions(-)
+
+diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
+index 368d26da0d6a2..9febc8b73f09e 100644
+--- a/drivers/gpu/drm/qxl/qxl_release.c
++++ b/drivers/gpu/drm/qxl/qxl_release.c
+@@ -58,16 +58,56 @@ static long qxl_fence_wait(struct dma_fence *fence, bool intr,
+                          signed long timeout)
+ {
+       struct qxl_device *qdev;
++      struct qxl_release *release;
++      int count = 0, sc = 0;
++      bool have_drawable_releases;
+       unsigned long cur, end = jiffies + timeout;
+ 
+       qdev = container_of(fence->lock, struct qxl_device, release_lock);
++      release = container_of(fence, struct qxl_release, base);
++      have_drawable_releases = release->type == QXL_RELEASE_DRAWABLE;
+ 
+-      if (!wait_event_timeout(qdev->release_event,
+-                              (dma_fence_is_signaled(fence) ||
+-                               (qxl_io_notify_oom(qdev), 0)),
+-                              timeout))
+-              return 0;
++retry:
++      sc++;
++
++      if (dma_fence_is_signaled(fence))
++              goto signaled;
++
++      qxl_io_notify_oom(qdev);
++
++      for (count = 0; count < 11; count++) {
++              if (!qxl_queue_garbage_collect(qdev, true))
++                      break;
++
++              if (dma_fence_is_signaled(fence))
++                      goto signaled;
++      }
++
++      if (dma_fence_is_signaled(fence))
++              goto signaled;
++
++      if (have_drawable_releases || sc < 4) {
++              if (sc > 2)
++                      /* back off */
++                      usleep_range(500, 1000);
++
++              if (time_after(jiffies, end))
++                      return 0;
++
++              if (have_drawable_releases && sc > 300) {
++                      DMA_FENCE_WARN(fence,
++                                     "failed to wait on release %llu after spincount %d\n",
++                                     fence->context & ~0xf0000000, sc);
++                      goto signaled;
++              }
++              goto retry;
++      }
++      /*
++       * yeah, original sync_obj_wait gave up after 3 spins when
++       * have_drawable_releases is not set.
++       */
+ 
++signaled:
+       cur = jiffies;
+       if (time_after(cur, end))
+               return 0;
+diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
+index b79097b9070b3..5d6a5f3097cd0 100644
+--- a/include/linux/dma-fence.h
++++ b/include/linux/dma-fence.h
+@@ -659,4 +659,11 @@ static inline bool dma_fence_is_container(struct dma_fence *fence)
+       return dma_fence_is_array(fence) || dma_fence_is_chain(fence);
+ }
+ 
++#define DMA_FENCE_WARN(f, fmt, args...) \
++      do {                                                            \
++              struct dma_fence *__ff = (f);                           \
++              pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\
++                       ##args);                                       \
++      } while (0)
++
+ #endif /* __LINUX_DMA_FENCE_H */
+-- 
+2.43.0
+
diff --git a/queue-6.1/scsi-hisi_sas-modify-the-deadline-for-ata_wait_after.patch b/queue-6.1/scsi-hisi_sas-modify-the-deadline-for-ata_wait_after.patch
new file mode 100644
index 0000000..f3163d4
--- /dev/null
+++ b/queue-6.1/scsi-hisi_sas-modify-the-deadline-for-ata_wait_after.patch
@@ -0,0 +1,43 @@
+From 64578fe8ecb137768f6168f0a4d0c9f96bb8ca2e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 2 Apr 2024 11:55:13 +0800
+Subject: scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
+
+From: Xiang Chen <chenxiang66@hisilicon.com>
+
+[ Upstream commit 0098c55e0881f0b32591f2110410d5c8b7f9bd5a ]
+
+We found that the second parameter of function ata_wait_after_reset() is
+incorrectly used. We call smp_ata_check_ready_type() to poll the device
+type until the 30s timeout, so the correct deadline should be (jiffies +
+30000).
+
+Fixes: 3c2673a09cf1 ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset")
+Co-developed-by: xiabing <xiabing12@h-partners.com>
+Signed-off-by: xiabing <xiabing12@h-partners.com>
+Co-developed-by: Yihang Li <liyihang9@huawei.com>
+Signed-off-by: Yihang Li <liyihang9@huawei.com>
+Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
+Link: https://lore.kernel.org/r/20240402035513.2024241-3-chenxiang66@hisilicon.com
+Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
+index 450a8578157cb..2116f5ee36e20 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
+@@ -1715,7 +1715,7 @@ static int hisi_sas_debug_I_T_nexus_reset(struct domain_device *device)
+       if (dev_is_sata(device)) {
+               struct ata_link *link = &device->sata_dev.ap->link;
+ 
+-              rc = ata_wait_after_reset(link, HISI_SAS_WAIT_PHYUP_TIMEOUT,
++              rc = ata_wait_after_reset(link, jiffies + HISI_SAS_WAIT_PHYUP_TIMEOUT,
+                                         smp_ata_check_ready_type);
+       } else {
+               msleep(2000);
+-- 
+2.43.0
+
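The hisi_sas fix above turns on the difference between a relative timeout and an absolute deadline. The following is a minimal userspace sketch of that distinction, not kernel code: `ticks_t`, `ticks_after()` and `poll_until()` are hypothetical names chosen for illustration, with `ticks_after()` modelled on the kernel's time_after() comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
typedef unsigned long ticks_t;

/* Wraparound-tolerant "a is after b", modelled on time_after(). */
static bool ticks_after(ticks_t a, ticks_t b)
{
	return (long)(b - a) < 0;
}

/*
 * Polls until `deadline`, an ABSOLUTE tick value, has passed.
 * `now` advances by one tick per poll; returns the number of polls.
 */
static unsigned long poll_until(ticks_t now, ticks_t deadline)
{
	unsigned long polls = 0;

	while (!ticks_after(now, deadline)) {
		polls++;
		now++;
	}
	return polls;
}
```

If a caller passes a bare timeout (say 30) where a deadline is expected, the deadline already lies far in the past once the tick counter is large, so the loop exits without polling at all. Passing `now + 30` gives the intended polling window.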
diff --git a/queue-6.1/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch b/queue-6.1/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch
new file mode 100644
index 0000000..077d7fd
--- /dev/null
+++ b/queue-6.1/scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch
@@ -0,0 +1,39 @@
+From 563c24ad0ed88cc97362393372e5fdbc76904384 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 2 Apr 2024 12:56:54 +0300
+Subject: scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
+
+From: Dan Carpenter <dan.carpenter@linaro.org>
+
+[ Upstream commit 4406e4176f47177f5e51b4cc7e6a7a2ff3dbfbbd ]
+
+The app_reply->elem[] array is allocated earlier in this function and it
+has app_req.num_ports elements.  Thus this > comparison needs to be >= to
+prevent memory corruption.
+
+Fixes: 7878f22a2e03 ("scsi: qla2xxx: edif: Add getfcinfo and statistic bsgs")
+Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
+Link: https://lore.kernel.org/r/5c125b2f-92dd-412b-9b6f-fc3a3207bd60@moroto.mountain
+Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
+Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/scsi/qla2xxx/qla_edif.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
+index 7aee4d093969a..969008071decd 100644
+--- a/drivers/scsi/qla2xxx/qla_edif.c
++++ b/drivers/scsi/qla2xxx/qla_edif.c
+@@ -1058,7 +1058,7 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ 
+               list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
+                       if (fcport->edif.enable) {
+-                              if (pcnt > app_req.num_ports)
++                              if (pcnt >= app_req.num_ports)
+                                       break;
+ 
+                               app_reply->elem[pcnt].rekey_count =
+-- 
+2.43.0
+
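The off-by-one in the qla2xxx patch above is a general pattern: when an array has `cap` elements, index `cap` is already out of bounds, so the guard has to be `>=`. A small userspace sketch, assuming a hypothetical helper `fill_bounded()` that is not part of the driver:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copies up to `cap` values from `src` into `dst` and returns how many
 * were written. Valid indices are 0..cap-1, so stop once count == cap.
 */
static size_t fill_bounded(int *dst, size_t cap, const int *src, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (count >= cap)	/* `count > cap` would permit dst[cap] */
			break;
		dst[count++] = src[i];
	}
	return count;
}
```

With the `>` form of the check, the loop would write one element past the end before breaking, which is the memory corruption the commit message describes.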
diff --git a/queue-6.1/series b/queue-6.1/series
index 38f5d535afa843e475c2bb64d09559fbe32fca65..564a01432f26a5d6177d530bf859a2a2c5120863 100644
--- a/queue-6.1/series
+++ b/queue-6.1/series
@@ -6,3 +6,36 @@ bluetooth-fix-memory-leak-in-hci_req_sync_complete.patch
  drm-amd-pm-fixes-a-random-hang-in-s4-for-smu-v13.0.4-11.patch
  pm-s2idle-make-sure-cpus-will-wakeup-directly-on-resume.patch
  media-cec-core-remove-length-check-of-timer-status.patch
+arm64-dts-imx8-ss-conn-fix-usdhc-wrong-lpcg-clock-or.patch
+revert-drm-qxl-simplify-qxl_fence_wait.patch
+nouveau-fix-function-cast-warning.patch
+scsi-hisi_sas-modify-the-deadline-for-ata_wait_after.patch
+scsi-qla2xxx-fix-off-by-one-in-qla_edif_app_getstats.patch
+net-openvswitch-fix-unwanted-error-log-on-timeout-po.patch
+u64_stats-fix-u64_stats_init-for-lockdep-when-used-r.patch
+xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch
+geneve-fix-header-validation-in-geneve-6-_xmit_skb.patch
+bnxt_en-reset-ptp-tx_avail-after-possible-firmware-r.patch
+net-ks8851-inline-ks8851_rx_skb.patch
+net-ks8851-handle-softirqs-at-the-end-of-irq-thread-.patch
+af_unix-clear-stale-u-oob_skb.patch
+octeontx2-af-fix-nix-sq-mode-and-bp-config.patch
+ipv6-fib-hide-unused-pn-variable.patch
+ipv4-route-avoid-unused-but-set-variable-warning.patch
+ipv6-fix-race-condition-between-ipv6_get_ifaddr-and-.patch
+bluetooth-sco-fix-not-validating-setsockopt-user-inp.patch
+bluetooth-l2cap-fix-not-validating-setsockopt-user-i.patch
+netfilter-complete-validation-of-user-input.patch
+net-mlx5-properly-link-new-fs-rules-into-the-tree.patch
+net-mlx5e-fix-mlx5e_priv_init-cleanup-flow.patch
+net-mlx5e-htb-fix-inconsistencies-with-qos-sqs-numbe.patch
+net-sparx5-fix-wrong-config-being-used-when-reconfig.patch
+net-dsa-mt7530-trap-link-local-frames-regardless-of-.patch
+af_unix-do-not-use-atomic-ops-for-unix_sk-sk-infligh.patch
+af_unix-fix-garbage-collector-racing-against-connect.patch
+net-ena-fix-potential-sign-extension-issue.patch
+net-ena-wrong-missing-io-completions-check-order.patch
+net-ena-fix-incorrect-descriptor-free-behavior.patch
+tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
+tracing-hide-unused-ftrace_event_id_fops.patch
+iommu-vt-d-allocate-local-memory-for-page-request-qu.patch
diff --git a/queue-6.1/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch b/queue-6.1/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
new file mode 100644
index 0000000..a030663
--- /dev/null
+++ b/queue-6.1/tracing-fix-ftrace_record_recursion_size-kconfig-ent.patch
@@ -0,0 +1,42 @@
+From 704384eac6f3a806bf568d43b411f9b069f2ecbd Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 22 Mar 2024 17:48:01 +0530
+Subject: tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry
+
+From: Prasad Pandit <pjp@fedoraproject.org>
+
+[ Upstream commit d96c36004e31e2baaf8ea1b449b7d0b2c2bfb41a ]
+
+Fix FTRACE_RECORD_RECURSION_SIZE entry, replace tab with
+a space character. It helps Kconfig parsers to read file
+without error.
+
+Link: https://lore.kernel.org/linux-trace-kernel/20240322121801.1803948-1-ppandit@redhat.com
+
+Cc: Masami Hiramatsu <mhiramat@kernel.org>
+Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+Fixes: 773c16705058 ("ftrace: Add recording of functions that caused recursion")
+Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
+Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ kernel/trace/Kconfig | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
+index 93d7249962833..8514f86583136 100644
+--- a/kernel/trace/Kconfig
++++ b/kernel/trace/Kconfig
+@@ -910,7 +910,7 @@ config FTRACE_RECORD_RECURSION
+ 
+ config FTRACE_RECORD_RECURSION_SIZE
+       int "Max number of recursed functions to record"
+-      default 128
++      default 128
+       depends on FTRACE_RECORD_RECURSION
+       help
+         This defines the limit of number of functions that can be
+-- 
+2.43.0
+
diff --git a/queue-6.1/tracing-hide-unused-ftrace_event_id_fops.patch b/queue-6.1/tracing-hide-unused-ftrace_event_id_fops.patch
new file mode 100644
index 0000000..7fcaaf6
--- /dev/null
+++ b/queue-6.1/tracing-hide-unused-ftrace_event_id_fops.patch
@@ -0,0 +1,76 @@
+From 4dfd2d5e69b3efef87daeb7018bb5fc346dfe852 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 3 Apr 2024 10:06:24 +0200
+Subject: tracing: hide unused ftrace_event_id_fops
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Arnd Bergmann <arnd@arndb.de>
+
+[ Upstream commit 5281ec83454d70d98b71f1836fb16512566c01cd ]
+
+When CONFIG_PERF_EVENTS is disabled, a 'make W=1' build produces a warning
+about the unused ftrace_event_id_fops variable:
+
+kernel/trace/trace_events.c:2155:37: error: 'ftrace_event_id_fops' defined but not used [-Werror=unused-const-variable=]
+ 2155 | static const struct file_operations ftrace_event_id_fops = {
+
+Hide this in the same #ifdef as the reference to it.
+
+Link: https://lore.kernel.org/linux-trace-kernel/20240403080702.3509288-7-arnd@kernel.org
+
+Cc: Masami Hiramatsu <mhiramat@kernel.org>
+Cc: Oleg Nesterov <oleg@redhat.com>
+Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+Cc: Zheng Yejian <zhengyejian1@huawei.com>
+Cc: Kees Cook <keescook@chromium.org>
+Cc: Ajay Kaher <akaher@vmware.com>
+Cc: Jinjie Ruan <ruanjinjie@huawei.com>
+Cc: Clément Léger <cleger@rivosinc.com>
+Cc: Dan Carpenter <dan.carpenter@linaro.org>
+Cc: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
+Fixes: 620a30e97feb ("tracing: Don't pass file_operations array to event_create_dir()")
+Signed-off-by: Arnd Bergmann <arnd@arndb.de>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ kernel/trace/trace_events.c | 4 ++++
+ 1 file changed, 4 insertions(+)
+
+diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
+index a6d2f99f847d3..24859d9645050 100644
+--- a/kernel/trace/trace_events.c
++++ b/kernel/trace/trace_events.c
+@@ -1669,6 +1669,7 @@ static int trace_format_open(struct inode *inode, struct file *file)
+       return 0;
+ }
+ 
++#ifdef CONFIG_PERF_EVENTS
+ static ssize_t
+ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
+ {
+@@ -1683,6 +1684,7 @@ event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
+ 
+       return simple_read_from_buffer(ubuf, cnt, ppos, buf, len);
+ }
++#endif
+ 
+ static ssize_t
+ event_filter_read(struct file *filp, char __user *ubuf, size_t cnt,
+@@ -2127,10 +2129,12 @@ static const struct file_operations ftrace_event_format_fops = {
+       .release = seq_release,
+ };
+ 
++#ifdef CONFIG_PERF_EVENTS
+ static const struct file_operations ftrace_event_id_fops = {
+       .read = event_id_read,
+       .llseek = default_llseek,
+ };
++#endif
+ 
+ static const struct file_operations ftrace_event_filter_fops = {
+       .open = tracing_open_file_tr,
+-- 
+2.43.0
+
diff --git a/queue-6.1/u64_stats-fix-u64_stats_init-for-lockdep-when-used-r.patch b/queue-6.1/u64_stats-fix-u64_stats_init-for-lockdep-when-used-r.patch
new file mode 100644
index 0000000..70eebdb
--- /dev/null
+++ b/queue-6.1/u64_stats-fix-u64_stats_init-for-lockdep-when-used-r.patch
@@ -0,0 +1,56 @@
+From 30f1b65db1e69bf65e37804e3567dc6c80b5d9cb Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 Apr 2024 09:57:40 +0200
+Subject: u64_stats: fix u64_stats_init() for lockdep when used repeatedly in
+ one file
+
+From: Petr Tesarik <petr@tesarici.cz>
+
+[ Upstream commit 38a15d0a50e0a43778561a5861403851f0b0194c ]
+
+Fix bogus lockdep warnings if multiple u64_stats_sync variables are
+initialized in the same file.
+
+With CONFIG_LOCKDEP, seqcount_init() is a macro which declares:
+
+       static struct lock_class_key __key;
+
+Since u64_stats_init() is a function (albeit an inline one), all calls
+within the same file end up using the same instance, effectively treating
+them all as a single lock-class.
+
+Fixes: 9464ca650008 ("net: make u64_stats_init() a function")
+Closes: https://lore.kernel.org/netdev/ea1567d9-ce66-45e6-8168-ac40a47d1821@roeck-us.net/
+Signed-off-by: Petr Tesarik <petr@tesarici.cz>
+Reviewed-by: Simon Horman <horms@kernel.org>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Link: https://lore.kernel.org/r/20240404075740.30682-1-petr@tesarici.cz
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/linux/u64_stats_sync.h | 9 +++++----
+ 1 file changed, 5 insertions(+), 4 deletions(-)
+
+diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
+index 46040d66334a8..79c3bbaa7e13e 100644
+--- a/include/linux/u64_stats_sync.h
++++ b/include/linux/u64_stats_sync.h
+@@ -135,10 +135,11 @@ static inline void u64_stats_inc(u64_stats_t *p)
+       p->v++;
+ }
+ 
+-static inline void u64_stats_init(struct u64_stats_sync *syncp)
+-{
+-      seqcount_init(&syncp->seq);
+-}
++#define u64_stats_init(syncp)                         \
++      do {                                            \
++              struct u64_stats_sync *__s = (syncp);   \
++              seqcount_init(&__s->seq);               \
++      } while (0)
+ 
+ static inline void __u64_stats_update_begin(struct u64_stats_sync *syncp)
+ {
+-- 
+2.43.0
+
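The u64_stats change above relies on how a `static` variable declared inside a macro behaves: each expansion gets its own object, while a function that wraps the macro contains just one expansion shared by all callers. The sketch below models this in plain userspace C. The names (`struct key`, `struct lock`, `lock_init`, `stats_init_fn`, `stats_init_macro`) are illustrative assumptions, not the kernel's lockdep types.

```c
#include <assert.h>

struct key { int unused; };
struct lock { const struct key *class_key; };

/* Models seqcount_init() under lockdep: every expansion declares a key. */
#define lock_init(l)					\
	do {						\
		static struct key key_;			\
		(l)->class_key = &key_;			\
	} while (0)

/* Function wrapper: a single expansion, so all callers share one key. */
static inline void stats_init_fn(struct lock *l)
{
	lock_init(l);
}

/* Macro wrapper: expands at each call site, giving each site its own key. */
#define stats_init_macro(l)				\
	do {						\
		struct lock *l_ = (l);			\
		lock_init(l_);				\
	} while (0)
```

Two locks initialized through `stats_init_fn()` end up pointing at the same key, which corresponds to the "single lock-class" effect in the commit message. Two call sites of `stats_init_macro()` get distinct keys.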
diff --git a/queue-6.1/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch b/queue-6.1/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch
new file mode 100644
index 0000000..36c3ca4
--- /dev/null
+++ b/queue-6.1/xsk-validate-user-input-for-xdp_-umem-completion-_fi.patch
@@ -0,0 +1,176 @@
+From 44e1f1a454fe5cbc2ae30442872eb46e435c987e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 Apr 2024 20:27:38 +0000
+Subject: xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit 237f3cf13b20db183d3706d997eedc3c49eacd44 ]
+
+syzbot reported an illegal copy in xsk_setsockopt() [1]
+
+Make sure to validate setsockopt() @optlen parameter.
+
+[1]
+
+ BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
+ BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline]
+ BUG: KASAN: slab-out-of-bounds in xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420
+Read of size 4 at addr ffff888028c6cde3 by task syz-executor.0/7549
+
+CPU: 0 PID: 7549 Comm: syz-executor.0 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0
+Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
+Call Trace:
+ <TASK>
+  __dump_stack lib/dump_stack.c:88 [inline]
+  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
+  print_address_description mm/kasan/report.c:377 [inline]
+  print_report+0x169/0x550 mm/kasan/report.c:488
+  kasan_report+0x143/0x180 mm/kasan/report.c:601
+  copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
+  copy_from_sockptr include/linux/sockptr.h:55 [inline]
+  xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420
+  do_sock_setsockopt+0x3af/0x720 net/socket.c:2311
+  __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
+  __do_sys_setsockopt net/socket.c:2343 [inline]
+  __se_sys_setsockopt net/socket.c:2340 [inline]
+  __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
+ do_syscall_64+0xfb/0x240
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+RIP: 0033:0x7fb40587de69
+Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
+RSP: 002b:00007fb40665a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
+RAX: ffffffffffffffda RBX: 00007fb4059abf80 RCX: 00007fb40587de69
+RDX: 0000000000000005 RSI: 000000000000011b RDI: 0000000000000006
+RBP: 00007fb4058ca47a R08: 0000000000000002 R09: 0000000000000000
+R10: 0000000020001980 R11: 0000000000000246 R12: 0000000000000000
+R13: 000000000000000b R14: 00007fb4059abf80 R15: 00007fff57ee4d08
+ </TASK>
+
+Allocated by task 7549:
+  kasan_save_stack mm/kasan/common.c:47 [inline]
+  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
+  poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
+  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
+  kasan_kmalloc include/linux/kasan.h:211 [inline]
+  __do_kmalloc_node mm/slub.c:3966 [inline]
+  __kmalloc+0x233/0x4a0 mm/slub.c:3979
+  kmalloc include/linux/slab.h:632 [inline]
+  __cgroup_bpf_run_filter_setsockopt+0xd2f/0x1040 kernel/bpf/cgroup.c:1869
+  do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293
+  __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
+  __do_sys_setsockopt net/socket.c:2343 [inline]
+  __se_sys_setsockopt net/socket.c:2340 [inline]
+  __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
+ do_syscall_64+0xfb/0x240
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+
+The buggy address belongs to the object at ffff888028c6cde0
+ which belongs to the cache kmalloc-8 of size 8
+The buggy address is located 1 bytes to the right of
+ allocated 2-byte region [ffff888028c6cde0, ffff888028c6cde2)
+
+The buggy address belongs to the physical page:
+page:ffffea0000a31b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888028c6c9c0 pfn:0x28c6c
+anon flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
+page_type: 0xffffffff()
+raw: 00fff00000000800 ffff888014c41280 0000000000000000 dead000000000001
+raw: ffff888028c6c9c0 0000000080800057 00000001ffffffff 0000000000000000
+page dumped because: kasan: bad access detected
+page_owner tracks the page as allocated
+page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 6648, tgid 6644 (syz-executor.0), ts 133906047828, free_ts 133859922223
+  set_page_owner include/linux/page_owner.h:31 [inline]
+  post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
+  prep_new_page mm/page_alloc.c:1540 [inline]
+  get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
+  __alloc_pages+0x256/0x680 mm/page_alloc.c:4569
+  __alloc_pages_node include/linux/gfp.h:238 [inline]
+  alloc_pages_node include/linux/gfp.h:261 [inline]
+  alloc_slab_page+0x5f/0x160 mm/slub.c:2175
+  allocate_slab mm/slub.c:2338 [inline]
+  new_slab+0x84/0x2f0 mm/slub.c:2391
+  ___slab_alloc+0xc73/0x1260 mm/slub.c:3525
+  __slab_alloc mm/slub.c:3610 [inline]
+  __slab_alloc_node mm/slub.c:3663 [inline]
+  slab_alloc_node mm/slub.c:3835 [inline]
+  __do_kmalloc_node mm/slub.c:3965 [inline]
+  __kmalloc_node+0x2db/0x4e0 mm/slub.c:3973
+  kmalloc_node include/linux/slab.h:648 [inline]
+  __vmalloc_area_node mm/vmalloc.c:3197 [inline]
+  __vmalloc_node_range+0x5f9/0x14a0 mm/vmalloc.c:3392
+  __vmalloc_node mm/vmalloc.c:3457 [inline]
+  vzalloc+0x79/0x90 mm/vmalloc.c:3530
+  bpf_check+0x260/0x19010 kernel/bpf/verifier.c:21162
+  bpf_prog_load+0x1667/0x20f0 kernel/bpf/syscall.c:2895
+  __sys_bpf+0x4ee/0x810 kernel/bpf/syscall.c:5631
+  __do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
+  __se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
+  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736
+ do_syscall_64+0xfb/0x240
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+page last free pid 6650 tgid 6647 stack trace:
+  reset_page_owner include/linux/page_owner.h:24 [inline]
+  free_pages_prepare mm/page_alloc.c:1140 [inline]
+  free_unref_page_prepare+0x95d/0xa80 mm/page_alloc.c:2346
+  free_unref_page_list+0x5a3/0x850 mm/page_alloc.c:2532
+  release_pages+0x2117/0x2400 mm/swap.c:1042
+  tlb_batch_pages_flush mm/mmu_gather.c:98 [inline]
+  tlb_flush_mmu_free mm/mmu_gather.c:293 [inline]
+  tlb_flush_mmu+0x34d/0x4e0 mm/mmu_gather.c:300
+  tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:392
+  exit_mmap+0x4b6/0xd40 mm/mmap.c:3300
+  __mmput+0x115/0x3c0 kernel/fork.c:1345
+  exit_mm+0x220/0x310 kernel/exit.c:569
+  do_exit+0x99e/0x27e0 kernel/exit.c:865
+  do_group_exit+0x207/0x2c0 kernel/exit.c:1027
+  get_signal+0x176e/0x1850 kernel/signal.c:2907
+  arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:310
+  exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
+  exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
+  __syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
+  syscall_exit_to_user_mode+0xc9/0x360 kernel/entry/common.c:212
+  do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89
+ entry_SYSCALL_64_after_hwframe+0x6d/0x75
+
+Memory state around the buggy address:
+ ffff888028c6cc80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
+ ffff888028c6cd00: fa fc fc fc fa fc fc fc 00 fc fc fc 06 fc fc fc
+>ffff888028c6cd80: fa fc fc fc fa fc fc fc fa fc fc fc 02 fc fc fc
+                                                       ^
+ ffff888028c6ce00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
+ ffff888028c6ce80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
+
+Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Cc: "Björn Töpel" <bjorn@kernel.org>
+Cc: Magnus Karlsson <magnus.karlsson@intel.com>
+Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
+Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
+Acked-by: Daniel Borkmann <daniel@iogearbox.net>
+Link: https://lore.kernel.org/r/20240404202738.3634547-1-edumazet@google.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/xdp/xsk.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index 5c8e02d56fd43..e3bdfc517424d 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -1127,6 +1127,8 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
+               struct xsk_queue **q;
+               int entries;
+ 
++              if (optlen < sizeof(entries))
++                      return -EINVAL;
+               if (copy_from_sockptr(&entries, optval, sizeof(entries)))
+                       return -EFAULT;
+ 
+-- 
+2.43.0
+
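The xsk fix above adds a length check before copying a fixed-size value out of a caller-supplied buffer. A minimal userspace sketch of the same check follows; `read_int_option()` is a hypothetical helper, with memcpy() standing in for copy_from_sockptr().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Reads an int option from `optval`, which holds `optlen` bytes supplied
 * by the caller. Rejects buffers too short to contain a full int, so the
 * copy never reads past the end of the caller's data.
 */
static int read_int_option(const void *optval, size_t optlen, int *out)
{
	if (optlen < sizeof(*out))
		return -EINVAL;
	memcpy(out, optval, sizeof(*out));
	return 0;
}
```

In the KASAN report above the buffer was a 2-byte allocation and the read was 4 bytes. With the check in place, such a call fails with -EINVAL before any copy takes place.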