--- /dev/null
+From c0313f158b1131fea37d9c7c6b55037f767caca5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 17:34:55 -0700
+Subject: af_unix: Fix a data race of sk->sk_receive_queue->qlen.
+
+From: Kuniyuki Iwashima <kuniyu@amazon.com>
+
+[ Upstream commit 679ed006d416ea0cecfe24a99d365d1dea69c683 ]
+
+KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg()
+updates qlen under the queue lock and sendmsg() checks qlen under
+unix_state_sock(), not the queue lock, so the reader side needs
+READ_ONCE().
+
+BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer
+
+write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0:
+ __skb_unlink include/linux/skbuff.h:2347 [inline]
+ __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197
+ __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263
+ __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452
+ unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549
+ sock_recvmsg_nosec net/socket.c:1019 [inline]
+ ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720
+ ___sys_recvmsg+0xc8/0x150 net/socket.c:2764
+ do_recvmmsg+0x182/0x560 net/socket.c:2858
+ __sys_recvmmsg net/socket.c:2937 [inline]
+ __do_sys_recvmmsg net/socket.c:2960 [inline]
+ __se_sys_recvmmsg net/socket.c:2953 [inline]
+ __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953
+ do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+ do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1:
+ skb_queue_len include/linux/skbuff.h:2127 [inline]
+ unix_recvq_full net/unix/af_unix.c:229 [inline]
+ unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445
+ unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048
+ sock_sendmsg_nosec net/socket.c:724 [inline]
+ sock_sendmsg+0x148/0x160 net/socket.c:747
+ ____sys_sendmsg+0x20e/0x620 net/socket.c:2503
+ ___sys_sendmsg+0xc6/0x140 net/socket.c:2557
+ __sys_sendmmsg+0x11d/0x370 net/socket.c:2643
+ __do_sys_sendmmsg net/socket.c:2672 [inline]
+ __se_sys_sendmmsg net/socket.c:2669 [inline]
+ __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669
+ do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+ do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+value changed: 0x0000000b -> 0x00000001
+
+Reported by Kernel Concurrency Sanitizer on:
+CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
+Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
+
+Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/unix/af_unix.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index 0b0f18ecce447..0a54959e5b944 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -1442,7 +1442,7 @@ static long unix_wait_for_peer(struct sock *other, long timeo)
+
+ sched = !sock_flag(other, SOCK_DEAD) &&
+ !(other->sk_shutdown & RCV_SHUTDOWN) &&
+- unix_recvq_full(other);
++ unix_recvq_full_lockless(other);
+
+ unix_state_unlock(other);
+
+--
+2.39.2
+
--- /dev/null
+From 58002045b23f5ceb359c28a19f25fc5098c5db2c Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 17:34:56 -0700
+Subject: af_unix: Fix data races around sk->sk_shutdown.
+
+From: Kuniyuki Iwashima <kuniyu@amazon.com>
+
+[ Upstream commit e1d09c2c2f5793474556b60f83900e088d0d366d ]
+
+KCSAN found a data race around sk->sk_shutdown where unix_release_sock()
+and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll()
+and unix_dgram_poll() read it locklessly.
+
+We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE().
+
+BUG: KCSAN: data-race in unix_poll / unix_release_sock
+
+write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0:
+ unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
+ unix_release+0x59/0x80 net/unix/af_unix.c:1042
+ __sock_release+0x7d/0x170 net/socket.c:653
+ sock_close+0x19/0x30 net/socket.c:1397
+ __fput+0x179/0x5e0 fs/file_table.c:321
+ ____fput+0x15/0x20 fs/file_table.c:349
+ task_work_run+0x116/0x1a0 kernel/task_work.c:179
+ resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
+ exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
+ exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
+ __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
+ syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
+ do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1:
+ unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170
+ sock_poll+0xcf/0x2b0 net/socket.c:1385
+ vfs_poll include/linux/poll.h:88 [inline]
+ ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855
+ ep_send_events fs/eventpoll.c:1694 [inline]
+ ep_poll fs/eventpoll.c:1823 [inline]
+ do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258
+ __do_sys_epoll_wait fs/eventpoll.c:2270 [inline]
+ __se_sys_epoll_wait fs/eventpoll.c:2265 [inline]
+ __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265
+ do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+ do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+value changed: 0x00 -> 0x03
+
+Reported by Kernel Concurrency Sanitizer on:
+CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
+Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
+
+Fixes: 3c73419c09a5 ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
+Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/unix/af_unix.c | 20 ++++++++++++--------
+ 1 file changed, 12 insertions(+), 8 deletions(-)
+
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index 0a54959e5b944..29c6083a37daf 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -603,7 +603,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
+ /* Clear state */
+ unix_state_lock(sk);
+ sock_orphan(sk);
+- sk->sk_shutdown = SHUTDOWN_MASK;
++ WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
+ path = u->path;
+ u->path.dentry = NULL;
+ u->path.mnt = NULL;
+@@ -628,7 +628,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
+ if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) {
+ unix_state_lock(skpair);
+ /* No more writes */
+- skpair->sk_shutdown = SHUTDOWN_MASK;
++ WRITE_ONCE(skpair->sk_shutdown, SHUTDOWN_MASK);
+ if (!skb_queue_empty(&sk->sk_receive_queue) || embrion)
+ skpair->sk_err = ECONNRESET;
+ unix_state_unlock(skpair);
+@@ -3008,7 +3008,7 @@ static int unix_shutdown(struct socket *sock, int mode)
+ ++mode;
+
+ unix_state_lock(sk);
+- sk->sk_shutdown |= mode;
++ WRITE_ONCE(sk->sk_shutdown, sk->sk_shutdown | mode);
+ other = unix_peer(sk);
+ if (other)
+ sock_hold(other);
+@@ -3028,7 +3028,7 @@ static int unix_shutdown(struct socket *sock, int mode)
+ if (mode&SEND_SHUTDOWN)
+ peer_mode |= RCV_SHUTDOWN;
+ unix_state_lock(other);
+- other->sk_shutdown |= peer_mode;
++ WRITE_ONCE(other->sk_shutdown, other->sk_shutdown | peer_mode);
+ unix_state_unlock(other);
+ other->sk_state_change(other);
+ if (peer_mode == SHUTDOWN_MASK)
+@@ -3160,16 +3160,18 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
+ {
+ struct sock *sk = sock->sk;
+ __poll_t mask;
++ u8 shutdown;
+
+ sock_poll_wait(file, sock, wait);
+ mask = 0;
++ shutdown = READ_ONCE(sk->sk_shutdown);
+
+ /* exceptional events? */
+ if (sk->sk_err)
+ mask |= EPOLLERR;
+- if (sk->sk_shutdown == SHUTDOWN_MASK)
++ if (shutdown == SHUTDOWN_MASK)
+ mask |= EPOLLHUP;
+- if (sk->sk_shutdown & RCV_SHUTDOWN)
++ if (shutdown & RCV_SHUTDOWN)
+ mask |= EPOLLRDHUP | EPOLLIN | EPOLLRDNORM;
+
+ /* readable? */
+@@ -3203,18 +3205,20 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock,
+ struct sock *sk = sock->sk, *other;
+ unsigned int writable;
+ __poll_t mask;
++ u8 shutdown;
+
+ sock_poll_wait(file, sock, wait);
+ mask = 0;
++ shutdown = READ_ONCE(sk->sk_shutdown);
+
+ /* exceptional events? */
+ if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue))
+ mask |= EPOLLERR |
+ (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? EPOLLPRI : 0);
+
+- if (sk->sk_shutdown & RCV_SHUTDOWN)
++ if (shutdown & RCV_SHUTDOWN)
+ mask |= EPOLLRDHUP | EPOLLIN | EPOLLRDNORM;
+- if (sk->sk_shutdown == SHUTDOWN_MASK)
++ if (shutdown == SHUTDOWN_MASK)
+ mask |= EPOLLHUP;
+
+ /* readable? */
+--
+2.39.2
+
--- /dev/null
+From 583a66b3d1bb21bf220c899efcbbaf53101e02bd Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sun, 23 Apr 2023 06:48:45 +0100
+Subject: ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings
+
+From: Randy Dunlap <rdunlap@infradead.org>
+
+[ Upstream commit 46dd6078dbc7e363a8bb01209da67015a1538929 ]
+
+Fix kernel-doc warnings from the kernel test robot:
+
+jornada720_ssp.c:24: warning: Function parameter or member 'jornada_ssp_lock' not described in 'DEFINE_SPINLOCK'
+jornada720_ssp.c:24: warning: expecting prototype for arch/arm/mac(). Prototype was for DEFINE_SPINLOCK() instead
+jornada720_ssp.c:34: warning: Function parameter or member 'byte' not described in 'jornada_ssp_reverse'
+jornada720_ssp.c:57: warning: Function parameter or member 'byte' not described in 'jornada_ssp_byte'
+jornada720_ssp.c:85: warning: Function parameter or member 'byte' not described in 'jornada_ssp_inout'
+
+Link: lore.kernel.org/r/202304210535.tWby3jWF-lkp@intel.com
+
+Fixes: 69ebb22277a5 ("[ARM] 4506/1: HP Jornada 7XX: Addition of SSP Platform Driver")
+Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
+Reported-by: kernel test robot <lkp@intel.com>
+Cc: Arnd Bergmann <arnd@arndb.de>
+Cc: Kristoffer Ericson <Kristoffer.ericson@gmail.com>
+Cc: patches@armlinux.org.uk
+Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm/mach-sa1100/jornada720_ssp.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/arch/arm/mach-sa1100/jornada720_ssp.c b/arch/arm/mach-sa1100/jornada720_ssp.c
+index 1dbe98948ce30..9627c4cf3e41d 100644
+--- a/arch/arm/mach-sa1100/jornada720_ssp.c
++++ b/arch/arm/mach-sa1100/jornada720_ssp.c
+@@ -1,5 +1,5 @@
+ // SPDX-License-Identifier: GPL-2.0-only
+-/**
++/*
+ * arch/arm/mac-sa1100/jornada720_ssp.c
+ *
+ * Copyright (C) 2006/2007 Kristoffer Ericson <Kristoffer.Ericson@gmail.com>
+@@ -26,6 +26,7 @@ static unsigned long jornada_ssp_flags;
+
+ /**
+ * jornada_ssp_reverse - reverses input byte
++ * @byte: input byte to reverse
+ *
+ * we need to reverse all data we receive from the mcu due to its physical location
+ * returns : 01110111 -> 11101110
+@@ -46,6 +47,7 @@ EXPORT_SYMBOL(jornada_ssp_reverse);
+
+ /**
+ * jornada_ssp_byte - waits for ready ssp bus and sends byte
++ * @byte: input byte to transmit
+ *
+ * waits for fifo buffer to clear and then transmits, if it doesn't then we will
+ * timeout after <timeout> rounds. Needs mcu running before its called.
+@@ -77,6 +79,7 @@ EXPORT_SYMBOL(jornada_ssp_byte);
+
+ /**
+ * jornada_ssp_inout - decide if input is command or trading byte
++ * @byte: input byte to send (may be %TXDUMMY)
+ *
+ * returns : (jornada_ssp_byte(byte)) on success
+ * : %-ETIMEDOUT on timeout failure
+--
+2.39.2
+
--- /dev/null
+From ce16bdfb9ad70636efac6955adf81bab0d6c3153 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 11:11:57 +0800
+Subject: bonding: fix send_peer_notif overflow
+
+From: Hangbin Liu <liuhangbin@gmail.com>
+
+[ Upstream commit 9949e2efb54eb3001cb2f6512ff3166dddbfb75d ]
+
+Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3ce9
+("bonding: add an option to specify a delay between peer notifications"),
+the bond->send_peer_notif will be num_peer_notif multiplied by
+peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif
+overflow easily. e.g.
+
+ ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000
+
+To fix the overflow, let's set the send_peer_notif to u32 and limit
+peer_notif_delay to 300s.
+
+Reported-by: Liang Li <liali@redhat.com>
+Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053
+Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications")
+Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/bonding/bond_netlink.c | 7 ++++++-
+ drivers/net/bonding/bond_options.c | 8 +++++++-
+ include/net/bonding.h | 2 +-
+ 3 files changed, 14 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
+index c2d080fc4fc4e..27cbe148f0db5 100644
+--- a/drivers/net/bonding/bond_netlink.c
++++ b/drivers/net/bonding/bond_netlink.c
+@@ -84,6 +84,11 @@ static int bond_fill_slave_info(struct sk_buff *skb,
+ return -EMSGSIZE;
+ }
+
++/* Limit the max delay range to 300s */
++static struct netlink_range_validation delay_range = {
++ .max = 300000,
++};
++
+ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
+ [IFLA_BOND_MODE] = { .type = NLA_U8 },
+ [IFLA_BOND_ACTIVE_SLAVE] = { .type = NLA_U32 },
+@@ -114,7 +119,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
+ [IFLA_BOND_AD_ACTOR_SYSTEM] = { .type = NLA_BINARY,
+ .len = ETH_ALEN },
+ [IFLA_BOND_TLB_DYNAMIC_LB] = { .type = NLA_U8 },
+- [IFLA_BOND_PEER_NOTIF_DELAY] = { .type = NLA_U32 },
++ [IFLA_BOND_PEER_NOTIF_DELAY] = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
+ [IFLA_BOND_MISSED_MAX] = { .type = NLA_U8 },
+ [IFLA_BOND_NS_IP6_TARGET] = { .type = NLA_NESTED },
+ };
+diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
+index f71d5517f8293..5310cb488f11d 100644
+--- a/drivers/net/bonding/bond_options.c
++++ b/drivers/net/bonding/bond_options.c
+@@ -169,6 +169,12 @@ static const struct bond_opt_value bond_num_peer_notif_tbl[] = {
+ { NULL, -1, 0}
+ };
+
++static const struct bond_opt_value bond_peer_notif_delay_tbl[] = {
++ { "off", 0, 0},
++ { "maxval", 300000, BOND_VALFLAG_MAX},
++ { NULL, -1, 0}
++};
++
+ static const struct bond_opt_value bond_primary_reselect_tbl[] = {
+ { "always", BOND_PRI_RESELECT_ALWAYS, BOND_VALFLAG_DEFAULT},
+ { "better", BOND_PRI_RESELECT_BETTER, 0},
+@@ -488,7 +494,7 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
+ .id = BOND_OPT_PEER_NOTIF_DELAY,
+ .name = "peer_notif_delay",
+ .desc = "Delay between each peer notification on failover event, in milliseconds",
+- .values = bond_intmax_tbl,
++ .values = bond_peer_notif_delay_tbl,
+ .set = bond_option_peer_notif_delay_set
+ }
+ };
+diff --git a/include/net/bonding.h b/include/net/bonding.h
+index c3843239517d5..2d034e07b796c 100644
+--- a/include/net/bonding.h
++++ b/include/net/bonding.h
+@@ -233,7 +233,7 @@ struct bonding {
+ */
+ spinlock_t mode_lock;
+ spinlock_t stats_lock;
+- u8 send_peer_notif;
++ u32 send_peer_notif;
+ u8 igmp_retrans;
+ #ifdef CONFIG_PROC_FS
+ struct proc_dir_entry *proc_entry;
+--
+2.39.2
+
--- /dev/null
+From 0a7bc4cb6ac884341c25f8414dadc41e8a72441b Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 6 Apr 2023 16:46:15 +0300
+Subject: drm/dsc: fix DP_DSC_MAX_BPP_DELTA_* macro values
+
+From: Jani Nikula <jani.nikula@intel.com>
+
+[ Upstream commit 0d68683838f2850dd8ff31f1121e05bfb7a2def0 ]
+
+The macro values just don't match the specs. Fix them.
+
+Fixes: 1482ec00be4a ("drm: Add missing DP DSC extended capability definitions.")
+Cc: Vinod Govindapillai <vinod.govindapillai@intel.com>
+Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
+Signed-off-by: Jani Nikula <jani.nikula@intel.com>
+Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230406134615.1422509-2-jani.nikula@intel.com
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/drm/display/drm_dp.h | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h
+index 4545ed6109584..b8b7f990d67f6 100644
+--- a/include/drm/display/drm_dp.h
++++ b/include/drm/display/drm_dp.h
+@@ -286,8 +286,8 @@
+
+ #define DP_DSC_MAX_BITS_PER_PIXEL_HI 0x068 /* eDP 1.4 */
+ # define DP_DSC_MAX_BITS_PER_PIXEL_HI_MASK (0x3 << 0)
+-# define DP_DSC_MAX_BPP_DELTA_VERSION_MASK 0x06
+-# define DP_DSC_MAX_BPP_DELTA_AVAILABILITY 0x08
++# define DP_DSC_MAX_BPP_DELTA_VERSION_MASK (0x3 << 5) /* eDP 1.5 & DP 2.0 */
++# define DP_DSC_MAX_BPP_DELTA_AVAILABILITY (1 << 7) /* eDP 1.5 & DP 2.0 */
+
+ #define DP_DSC_DEC_COLOR_FORMAT_CAP 0x069
+ # define DP_DSC_RGB (1 << 0)
+--
+2.39.2
+
--- /dev/null
+From 302fe94ba864c0666865e090ef96c76ba889137d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 20 Apr 2023 11:05:00 +0800
+Subject: drm/fbdev-generic: prohibit potential out-of-bounds access
+
+From: Sui Jingfeng <suijingfeng@loongson.cn>
+
+[ Upstream commit c8687694bb1f5c48134f152f8c5c2e53483eb99d ]
+
+The fbdev test of IGT may write after EOF, which leads to out-of-bounds
+access for drm drivers with fbdev-generic. For example, running the fbdev
+test on an x86+ast2400 platform, with 1680x1050 resolution, will cause the
+linux kernel hang with the following call trace:
+
+ Oops: 0000 [#1] PREEMPT SMP PTI
+ [IGT] fbdev: starting subtest eof
+ Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
+ [IGT] fbdev: starting subtest nullptr
+
+ RIP: 0010:memcpy_erms+0xa/0x20
+ RSP: 0018:ffffa17d40167d98 EFLAGS: 00010246
+ RAX: ffffa17d4eb7fa80 RBX: ffffa17d40e0aa80 RCX: 00000000000014c0
+ RDX: 0000000000001a40 RSI: ffffa17d40e0b000 RDI: ffffa17d4eb80000
+ RBP: ffffa17d40167e20 R08: 0000000000000000 R09: ffff89522ecff8c0
+ R10: ffffa17d4e4c5000 R11: 0000000000000000 R12: ffffa17d4eb7fa80
+ R13: 0000000000001a40 R14: 000000000000041a R15: ffffa17d40167e30
+ FS: 0000000000000000(0000) GS:ffff895257380000(0000) knlGS:0000000000000000
+ CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+ CR2: ffffa17d40e0b000 CR3: 00000001eaeca006 CR4: 00000000001706e0
+ Call Trace:
+ <TASK>
+ ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper]
+ drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper]
+ process_one_work+0x21f/0x430
+ worker_thread+0x4e/0x3c0
+ ? __pfx_worker_thread+0x10/0x10
+ kthread+0xf4/0x120
+ ? __pfx_kthread+0x10/0x10
+ ret_from_fork+0x2c/0x50
+ </TASK>
+ CR2: ffffa17d40e0b000
+ ---[ end trace 0000000000000000 ]---
+
+This is because the damage rectangles computed by the
+drm_fb_helper_memory_range_to_clip() function are not guaranteed to be
+bounded within the screen's active display area. Possible reasons are:
+
+1) Buffers are allocated in the granularity of page size, for mmap system
+ call support. The shadow screen buffer consumed by fbdev emulation may
+   also be chosen to be page size aligned.
+
+2) The DIV_ROUND_UP() used in drm_fb_helper_memory_range_to_clip()
+ will introduce off-by-one error.
+
+For example, on a 16KB page size system, in order to store a 1920x1080
+XRGB framebuffer, we need allocate 507 pages. Unfortunately, the size
+1920*1080*4 can not be divided exactly by 16KB.
+
+ 1920 * 1080 * 4 = 8294400 bytes
+ 506 * 16 * 1024 = 8290304 bytes
+ 507 * 16 * 1024 = 8306688 bytes
+
+ line_length = 1920*4 = 7680 bytes
+
+ 507 * 16 * 1024 / 7680 = 1081.6
+
+ off / line_length = 507 * 16 * 1024 / 7680 = 1081
+ DIV_ROUND_UP(507 * 16 * 1024, 7680) will yield 1082
+
+memcpy_toio() typically issues the copy line by line; when copying the last
+line, an out-of-bounds access will happen. Because:
+
+ 1082 * line_length = 1082 * 7680 = 8309760, and 8309760 > 8306688
+
+Note that userspace may still write to the invisible area if a buffer
+larger than width x stride is exposed. But it is not a big issue as
+long as there is still memory backing the access.
+
+ - Also limit the y1 (Daniel)
+ - keep fix patch it to minimal (Daniel)
+ - screen_size is page size aligned because of it need mmap (Thomas)
+ - Adding fixes tag (Thomas)
+
+Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
+Fixes: aa15c677cc34 ("drm/fb-helper: Fix vertical damage clipping")
+Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
+Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
+Link: https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c623@189.cn/
+Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230420030500.1578756-1-suijingfeng@loongson.cn
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/drm_fb_helper.c | 16 ++++++++++++----
+ 1 file changed, 12 insertions(+), 4 deletions(-)
+
+diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
+index 2fe8349be0995..2a4e9fea03dd7 100644
+--- a/drivers/gpu/drm/drm_fb_helper.c
++++ b/drivers/gpu/drm/drm_fb_helper.c
+@@ -625,19 +625,27 @@ static void drm_fb_helper_damage(struct drm_fb_helper *helper, u32 x, u32 y,
+ static void drm_fb_helper_memory_range_to_clip(struct fb_info *info, off_t off, size_t len,
+ struct drm_rect *clip)
+ {
++ u32 line_length = info->fix.line_length;
++ u32 fb_height = info->var.yres;
+ off_t end = off + len;
+ u32 x1 = 0;
+- u32 y1 = off / info->fix.line_length;
++ u32 y1 = off / line_length;
+ u32 x2 = info->var.xres;
+- u32 y2 = DIV_ROUND_UP(end, info->fix.line_length);
++ u32 y2 = DIV_ROUND_UP(end, line_length);
++
++ /* Don't allow any of them beyond the bottom bound of display area */
++ if (y1 > fb_height)
++ y1 = fb_height;
++ if (y2 > fb_height)
++ y2 = fb_height;
+
+ if ((y2 - y1) == 1) {
+ /*
+ * We've only written to a single scanline. Try to reduce
+ * the number of horizontal pixels that need an update.
+ */
+- off_t bit_off = (off % info->fix.line_length) * 8;
+- off_t bit_end = (end % info->fix.line_length) * 8;
++ off_t bit_off = (off % line_length) * 8;
++ off_t bit_end = (end % line_length) * 8;
+
+ x1 = bit_off / info->var.bits_per_pixel;
+ x2 = DIV_ROUND_UP(bit_end, info->var.bits_per_pixel);
+--
+2.39.2
+
--- /dev/null
+From 77dd4814cd8b26c66d9cc655e09bd39b736593f1 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 18 Apr 2023 07:04:30 -0700
+Subject: drm/i915/dp: prevent potential div-by-zero
+
+From: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
+
+[ Upstream commit 0ff80028e2702c7c3d78b69705dc47c1ccba8c39 ]
+
+drm_dp_dsc_sink_max_slice_count() may return 0 if something goes
+wrong on the part of the DSC sink and its DPCD register. This null
+value may be later used as a divisor in intel_dsc_compute_params(),
+which will lead to an error.
+In the unlikely event that this issue occurs, fix it by testing the
+return value of drm_dp_dsc_sink_max_slice_count() against zero.
+
+Found by Linux Verification Center (linuxtesting.org) with static
+analysis tool SVACE.
+
+Fixes: a4a157777c80 ("drm/i915/dp: Compute DSC pipe config in atomic check")
+Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
+Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230418140430.69902-1-n.zhandarovich@fintech.ru
+(cherry picked from commit 51f7008239de011370c5067bbba07f0207f06b72)
+Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/i915/display/intel_dp.c | 5 +++++
+ 1 file changed, 5 insertions(+)
+
+diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
+index 62cbab7402e93..c1825f8f885c2 100644
+--- a/drivers/gpu/drm/i915/display/intel_dp.c
++++ b/drivers/gpu/drm/i915/display/intel_dp.c
+@@ -1533,6 +1533,11 @@ int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
+ pipe_config->dsc.slice_count =
+ drm_dp_dsc_sink_max_slice_count(intel_dp->dsc_dpcd,
+ true);
++ if (!pipe_config->dsc.slice_count) {
++ drm_dbg_kms(&dev_priv->drm, "Unsupported Slice Count %d\n",
++ pipe_config->dsc.slice_count);
++ return -EINVAL;
++ }
+ } else {
+ u16 dsc_max_output_bpp = 0;
+ u8 dsc_dp_slice_count;
+--
+2.39.2
+
--- /dev/null
+From 9c40f3f4f16bc863bfb9e2e68227c0f773e08180 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 May 2023 11:22:12 +0300
+Subject: drm/i915: Fix NULL ptr deref by checking new_crtc_state
+
+From: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
+
+[ Upstream commit a41d985902c153c31c616fe183cf2ee331e95ecb ]
+
+intel_atomic_get_new_crtc_state can return NULL if the crtc state wasn't
+obtained previously with intel_atomic_get_crtc_state, so we must check it
+for NULLness here, just as in many other places, where we can't guarantee
+that intel_atomic_get_crtc_state was called.
+We are currently getting NULL ptr deref because of that, so this fix was
+confirmed to help.
+
+Fixes: 74a75dc90869 ("drm/i915/display: move plane prepare/cleanup to intel_atomic_plane.c")
+Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
+Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230505082212.27089-1-stanislav.lisovskiy@intel.com
+(cherry picked from commit 1d5b09f8daf859247a1ea65b0d732a24d88980d8)
+Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/i915/display/intel_atomic_plane.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/gpu/drm/i915/display/intel_atomic_plane.c b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
+index 1409bcfb6fd3d..9afba39613f37 100644
+--- a/drivers/gpu/drm/i915/display/intel_atomic_plane.c
++++ b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
+@@ -1026,7 +1026,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
+ int ret;
+
+ if (old_obj) {
+- const struct intel_crtc_state *crtc_state =
++ const struct intel_crtc_state *new_crtc_state =
+ intel_atomic_get_new_crtc_state(state,
+ to_intel_crtc(old_plane_state->hw.crtc));
+
+@@ -1041,7 +1041,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
+ * This should only fail upon a hung GPU, in which case we
+ * can safely continue.
+ */
+- if (intel_crtc_needs_modeset(crtc_state)) {
++ if (new_crtc_state && intel_crtc_needs_modeset(new_crtc_state)) {
+ ret = i915_sw_fence_await_reservation(&state->commit_ready,
+ old_obj->base.resv,
+ false, 0,
+--
+2.39.2
+
--- /dev/null
+From a80e84e015d8f84c7e6f05cddd3e8097dd799459 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 28 Apr 2023 11:56:33 -0700
+Subject: drm/i915/guc: Don't capture Gen8 regs on Xe devices
+
+From: John Harrison <John.C.Harrison@Intel.com>
+
+[ Upstream commit 275dac1f7f5e9c2a2e806b34d3b10804eec0ac3c ]
+
+A pair of pre-Xe registers were being included in the Xe capture list.
+GuC was rejecting those as being invalid and logging errors about
+them. So, stop doing it.
+
+Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
+Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
+Fixes: dce2bd542337 ("drm/i915/guc: Add Gen9 registers for GuC error state capture.")
+Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
+Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
+Cc: Lucas De Marchi <lucas.demarchi@intel.com>
+Cc: John Harrison <John.C.Harrison@Intel.com>
+Cc: Jani Nikula <jani.nikula@intel.com>
+Cc: Matt Roper <matthew.d.roper@intel.com>
+Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
+Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-2-John.C.Harrison@Intel.com
+(cherry picked from commit b049132d61336f643d8faf2f6574b063667088cf)
+Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c | 7 +++++--
+ 1 file changed, 5 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+index 710999d7189ee..8c08899aa3c8d 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+@@ -30,12 +30,14 @@
+ { FORCEWAKE_MT, 0, 0, "FORCEWAKE" }
+
+ #define COMMON_GEN9BASE_GLOBAL \
+- { GEN8_FAULT_TLB_DATA0, 0, 0, "GEN8_FAULT_TLB_DATA0" }, \
+- { GEN8_FAULT_TLB_DATA1, 0, 0, "GEN8_FAULT_TLB_DATA1" }, \
+ { ERROR_GEN6, 0, 0, "ERROR_GEN6" }, \
+ { DONE_REG, 0, 0, "DONE_REG" }, \
+ { HSW_GTT_CACHE_EN, 0, 0, "HSW_GTT_CACHE_EN" }
+
++#define GEN9_GLOBAL \
++ { GEN8_FAULT_TLB_DATA0, 0, 0, "GEN8_FAULT_TLB_DATA0" }, \
++ { GEN8_FAULT_TLB_DATA1, 0, 0, "GEN8_FAULT_TLB_DATA1" }
++
+ #define COMMON_GEN12BASE_GLOBAL \
+ { GEN12_FAULT_TLB_DATA0, 0, 0, "GEN12_FAULT_TLB_DATA0" }, \
+ { GEN12_FAULT_TLB_DATA1, 0, 0, "GEN12_FAULT_TLB_DATA1" }, \
+@@ -141,6 +143,7 @@ static const struct __guc_mmio_reg_descr xe_lpd_gsc_inst_regs[] = {
+ static const struct __guc_mmio_reg_descr default_global_regs[] = {
+ COMMON_BASE_GLOBAL,
+ COMMON_GEN9BASE_GLOBAL,
++ GEN9_GLOBAL,
+ };
+
+ static const struct __guc_mmio_reg_descr default_rc_class_regs[] = {
+--
+2.39.2
+
--- /dev/null
+From fa63fd2399d89594b0979c1901719b0d84ea0390 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 May 2023 13:35:08 +0300
+Subject: drm/i915: taint kernel when force probing unsupported devices
+
+From: Jani Nikula <jani.nikula@intel.com>
+
+[ Upstream commit 79c901c93562bdf1c84ce6c1b744fbbe4389a6eb ]
+
+For development and testing purposes, the i915.force_probe module
+parameter and DRM_I915_FORCE_PROBE kconfig option allow probing of
+devices that aren't supported by the driver.
+
+The i915.force_probe module parameter is "unsafe" and setting it taints
+the kernel. However, using the kconfig option does not.
+
+Always taint the kernel when force probing a device that is not
+supported.
+
+v2: Drop "depends on EXPERT" to avoid build breakage (kernel test robot)
+
+Fixes: 7ef5ef5cdead ("drm/i915: add force_probe module parameter to replace alpha_support")
+Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
+Cc: Daniel Vetter <daniel@ffwll.ch>
+Cc: Dave Airlie <airlied@gmail.com>
+Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
+Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
+Signed-off-by: Jani Nikula <jani.nikula@intel.com>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230504103508.1818540-1-jani.nikula@intel.com
+(cherry picked from commit 3312bb4ad09ca6423bd4a5b15a94588a8962fb8e)
+Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/i915/Kconfig | 12 +++++++-----
+ drivers/gpu/drm/i915/i915_pci.c | 6 ++++++
+ 2 files changed, 13 insertions(+), 5 deletions(-)
+
+diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
+index 98f4e44976e09..9c9bb0a0dcfca 100644
+--- a/drivers/gpu/drm/i915/Kconfig
++++ b/drivers/gpu/drm/i915/Kconfig
+@@ -62,10 +62,11 @@ config DRM_I915_FORCE_PROBE
+ This is the default value for the i915.force_probe module
+ parameter. Using the module parameter overrides this option.
+
+- Force probe the i915 for Intel graphics devices that are
+- recognized but not properly supported by this kernel version. It is
+- recommended to upgrade to a kernel version with proper support as soon
+- as it is available.
++ Force probe the i915 driver for Intel graphics devices that are
++ recognized but not properly supported by this kernel version. Force
++ probing an unsupported device taints the kernel. It is recommended to
++ upgrade to a kernel version with proper support as soon as it is
++ available.
+
+ It can also be used to block the probe of recognized and fully
+ supported devices.
+@@ -75,7 +76,8 @@ config DRM_I915_FORCE_PROBE
+ Use "<pci-id>[,<pci-id>,...]" to force probe the i915 for listed
+ devices. For example, "4500" or "4500,4571".
+
+- Use "*" to force probe the driver for all known devices.
++ Use "*" to force probe the driver for all known devices. Not
++ recommended.
+
+ Use "!" right before the ID to block the probe of the device. For
+ example, "4500,!4571" forces the probe of 4500 and blocks the probe of
+diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
+index 125f7ef1252c3..2b5aaea422208 100644
+--- a/drivers/gpu/drm/i915/i915_pci.c
++++ b/drivers/gpu/drm/i915/i915_pci.c
+@@ -1346,6 +1346,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ return -ENODEV;
+ }
+
++ if (intel_info->require_force_probe) {
++ dev_info(&pdev->dev, "Force probing unsupported Device ID %04x, tainting kernel\n",
++ pdev->device);
++ add_taint(TAINT_USER, LOCKDEP_STILL_OK);
++ }
++
+ /* Only bind to function 0 of the device. Early generations
+ * used function 1 as a placeholder for multi-head. This causes
+ * us confusion instead, especially on the systems where both
+--
+2.39.2
+
--- /dev/null
+From 6b73cc975e3b02fe7b43bdd31af18fba31c21836 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 9 Mar 2023 22:39:09 -0800
+Subject: drm/mipi-dsi: Set the fwnode for mipi_dsi_device
+
+From: Saravana Kannan <saravanak@google.com>
+
+[ Upstream commit a26cc2934331b57b5a7164bff344f0a2ec245fc0 ]
+
+After commit 3fb16866b51d ("driver core: fw_devlink: Make cycle
+detection more robust"), fw_devlink prints an error when consumer
+devices don't have their fwnode set. This used to be ignored silently.
+
+Set the fwnode mipi_dsi_device so fw_devlink can find them and properly
+track their dependencies.
+
+This fixes errors like this:
+[ 0.334054] nwl-dsi 30a00000.mipi-dsi: Failed to create device link with regulator-lcd-1v8
+[ 0.346964] nwl-dsi 30a00000.mipi-dsi: Failed to create device link with backlight-dsi
+
+Reported-by: Martin Kepplinger <martin.kepplinger@puri.sm>
+Link: https://lore.kernel.org/lkml/2a8e407f4f18c9350f8629a2b5fa18673355b2ae.camel@puri.sm/
+Fixes: 068a00233969 ("drm: Add MIPI DSI bus support")
+Signed-off-by: Saravana Kannan <saravanak@google.com>
+Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
+Link: https://lore.kernel.org/r/20230310063910.2474472-1-saravanak@google.com
+Signed-off-by: Maxime Ripard <maxime@cerno.tech>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/drm_mipi_dsi.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
+index b41aaf2bb9f16..7923cc21b78e8 100644
+--- a/drivers/gpu/drm/drm_mipi_dsi.c
++++ b/drivers/gpu/drm/drm_mipi_dsi.c
+@@ -221,7 +221,7 @@ mipi_dsi_device_register_full(struct mipi_dsi_host *host,
+ return dsi;
+ }
+
+- dsi->dev.of_node = info->node;
++ device_set_node(&dsi->dev, of_fwnode_handle(info->node));
+ dsi->channel = info->channel;
+ strlcpy(dsi->name, info->type, sizeof(dsi->name));
+
+--
+2.39.2
+
--- /dev/null
+From 385cae759f38e35e99bd4749bf6388c02395027a Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 4 Feb 2023 10:43:10 -0800
+Subject: drm/nouveau/disp: More DP_RECEIVER_CAP_SIZE array fixes
+
+From: Kees Cook <keescook@chromium.org>
+
+[ Upstream commit 25feda6fbd0cfefcb69308fb20d4d4815a107c5e ]
+
+More arrays (and arguments) for dcpd were set to 16, when it looks like
+DP_RECEIVER_CAP_SIZE (15) should be used. Fix the remaining cases, seen
+with GCC 13:
+
+../drivers/gpu/drm/nouveau/nvif/outp.c: In function 'nvif_outp_acquire_dp':
+../include/linux/fortify-string.h:57:33: warning: array subscript 'unsigned char[16][0]' is partly outside array bounds of 'u8[15]' {aka 'unsigned char[15]'} [-Warray-bounds=]
+ 57 | #define __underlying_memcpy __builtin_memcpy
+ | ^
+...
+../drivers/gpu/drm/nouveau/nvif/outp.c:140:9: note: in expansion of macro 'memcpy'
+ 140 | memcpy(args.dp.dpcd, dpcd, sizeof(args.dp.dpcd));
+ | ^~~~~~
+../drivers/gpu/drm/nouveau/nvif/outp.c:130:49: note: object 'dpcd' of size [0, 15]
+ 130 | nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[DP_RECEIVER_CAP_SIZE],
+ | ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Fixes: 813443721331 ("drm/nouveau/disp: move DP link config into acquire")
+Cc: Ben Skeggs <bskeggs@redhat.com>
+Cc: Lyude Paul <lyude@redhat.com>
+Cc: Karol Herbst <kherbst@redhat.com>
+Cc: David Airlie <airlied@gmail.com>
+Cc: Daniel Vetter <daniel@ffwll.ch>
+Cc: Dave Airlie <airlied@redhat.com>
+Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
+Cc: dri-devel@lists.freedesktop.org
+Cc: nouveau@lists.freedesktop.org
+Signed-off-by: Kees Cook <keescook@chromium.org>
+Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
+Reviewed-by: Karol Herbst <kherbst@redhat.com>
+Signed-off-by: Karol Herbst <git@karolherbst.de>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230204184307.never.825-kees@kernel.org
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/nouveau/include/nvif/if0012.h | 4 +++-
+ drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h | 3 ++-
+ drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.c | 2 +-
+ 3 files changed, 6 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/gpu/drm/nouveau/include/nvif/if0012.h b/drivers/gpu/drm/nouveau/include/nvif/if0012.h
+index eb99d84eb8443..16d4ad5023a3e 100644
+--- a/drivers/gpu/drm/nouveau/include/nvif/if0012.h
++++ b/drivers/gpu/drm/nouveau/include/nvif/if0012.h
+@@ -2,6 +2,8 @@
+ #ifndef __NVIF_IF0012_H__
+ #define __NVIF_IF0012_H__
+
++#include <drm/display/drm_dp.h>
++
+ union nvif_outp_args {
+ struct nvif_outp_v0 {
+ __u8 version;
+@@ -63,7 +65,7 @@ union nvif_outp_acquire_args {
+ __u8 hda;
+ __u8 mst;
+ __u8 pad04[4];
+- __u8 dpcd[16];
++ __u8 dpcd[DP_RECEIVER_CAP_SIZE];
+ } dp;
+ };
+ } v0;
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h b/drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h
+index b7631c1ab2420..4e7f873f66e27 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h
+@@ -3,6 +3,7 @@
+ #define __NVKM_DISP_OUTP_H__
+ #include "priv.h"
+
++#include <drm/display/drm_dp.h>
+ #include <subdev/bios.h>
+ #include <subdev/bios/dcb.h>
+ #include <subdev/bios/dp.h>
+@@ -42,7 +43,7 @@ struct nvkm_outp {
+ bool aux_pwr_pu;
+ u8 lttpr[6];
+ u8 lttprs;
+- u8 dpcd[16];
++ u8 dpcd[DP_RECEIVER_CAP_SIZE];
+
+ struct {
+ int dpcd; /* -1, or index into SUPPORTED_LINK_RATES table */
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.c
+index 4f0ca709c85a4..fc283a4a1522a 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/uoutp.c
+@@ -146,7 +146,7 @@ nvkm_uoutp_mthd_release(struct nvkm_outp *outp, void *argv, u32 argc)
+ }
+
+ static int
+-nvkm_uoutp_mthd_acquire_dp(struct nvkm_outp *outp, u8 dpcd[16],
++nvkm_uoutp_mthd_acquire_dp(struct nvkm_outp *outp, u8 dpcd[DP_RECEIVER_CAP_SIZE],
+ u8 link_nr, u8 link_bw, bool hda, bool mst)
+ {
+ int ret;
+--
+2.39.2
+
--- /dev/null
+From 71dd79df1cac03bdd90c094a3fc3deeb35fd74e6 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 May 2023 09:51:11 -0400
+Subject: drm/sched: Check scheduler work queue before calling timeout handling
+
+From: Vitaly Prosyak <vitaly.prosyak@amd.com>
+
+[ Upstream commit 2da5bffe9eaa5819a868e8eaaa11b3fd0f16a691 ]
+
+During an IGT GPU reset test we again see an oops despite
+commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling
+timeout handling).
+
+It uses ready condition whether to call drm_sched_fault which unwind
+the TDR leads to GPU reset.
+However it looks the ready condition is overloaded with other meanings,
+for example, for the following stack is related GPU reset :
+
+0 gfx_v9_0_cp_gfx_start
+1 gfx_v9_0_cp_gfx_resume
+2 gfx_v9_0_cp_resume
+3 gfx_v9_0_hw_init
+4 gfx_v9_0_resume
+5 amdgpu_device_ip_resume_phase2
+
+does the following:
+ /* start the ring */
+ gfx_v9_0_cp_gfx_start(adev);
+ ring->sched.ready = true;
+
+The same approach is for other ASICs as well :
+gfx_v8_0_cp_gfx_resume
+gfx_v10_0_kiq_resume, etc...
+
+As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault
+and then drm_sched_fault. However now it depends on whether the interrupt service routine
+drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready
+field of the scheduler to true even for uninitialized schedulers and causes oops vs
+no fault or when ISR drm_sched_fault is completed prior gfx_v9_0_cp_gfx_start and
+NULL pointer dereference does not occur.
+
+Use the field timeout_wq to prevent oops for uninitialized schedulers.
+The field could be initialized by the work queue of resetting the domain.
+
+v1: Corrections to commit message (Luben)
+
+Fixes: 11b3b9f461c5c4 ("drm/sched: Check scheduler ready before calling timeout handling")
+Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
+Link: https://lore.kernel.org/r/20230510135111.58631-1-vitaly.prosyak@amd.com
+Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
+Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/gpu/drm/scheduler/sched_main.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
+index 1e08cc5a17029..78c959eaef0c5 100644
+--- a/drivers/gpu/drm/scheduler/sched_main.c
++++ b/drivers/gpu/drm/scheduler/sched_main.c
+@@ -308,7 +308,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
+ */
+ void drm_sched_fault(struct drm_gpu_scheduler *sched)
+ {
+- if (sched->ready)
++ if (sched->timeout_wq)
+ mod_delayed_work(sched->timeout_wq, &sched->work_tdr, 0);
+ }
+ EXPORT_SYMBOL(drm_sched_fault);
+--
+2.39.2
+
--- /dev/null
+From 8e30410264cd8bcea3c2f1636ca25027b3006956 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 29 Apr 2023 00:06:28 -0400
+Subject: ext4: allow ext4_get_group_info() to fail
+
+From: Theodore Ts'o <tytso@mit.edu>
+
+[ Upstream commit 5354b2af34064a4579be8bc0e2f15a7b70f14b5f ]
+
+Previously, ext4_get_group_info() would treat an invalid group number
+as BUG(), since in theory it should never happen. However, if a
+malicious attacker (or fuzzer) modifies the superblock via the block
+device while the file system is mounted, it is possible for
+s_first_data_block to get set to a very large number. In that case,
+when calculating the block group of some block number (such as the
+starting block of a preallocation region), could result in an
+underflow and very large block group number. Then the BUG_ON check in
+ext4_get_group_info() would fire, resulting in a denial of service
+attack that can be triggered by root or someone with write access to
+the block device.
+
+For a quality of implementation perspective, it's best that even if
+the system administrator does something that they shouldn't, that it
+will not trigger a BUG. So instead of BUG'ing, ext4_get_group_info()
+will call ext4_error and return NULL. We also add fallback code in
+all of the callers of ext4_get_group_info() in case it might return NULL.
+
+Also, since ext4_get_group_info() was already borderline to be an
+inline function, un-inline it.  This results in a net reduction of the
+compiled text size of ext4 by roughly 2k.
+
+Cc: stable@kernel.org
+Link: https://lore.kernel.org/r/20230430154311.579720-2-tytso@mit.edu
+Reported-by: syzbot+e2efa3efc15a1c9e95c3@syzkaller.appspotmail.com
+Link: https://syzkaller.appspot.com/bug?id=69b28112e098b070f639efb356393af3ffec4220
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ fs/ext4/balloc.c | 18 ++++++++++++-
+ fs/ext4/ext4.h | 15 ++---------
+ fs/ext4/ialloc.c | 12 ++++++---
+ fs/ext4/mballoc.c | 64 +++++++++++++++++++++++++++++++++++++++--------
+ fs/ext4/super.c | 2 ++
+ 5 files changed, 82 insertions(+), 29 deletions(-)
+
+diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
+index f2c415f31b755..a38aa33af08ef 100644
+--- a/fs/ext4/balloc.c
++++ b/fs/ext4/balloc.c
+@@ -319,6 +319,22 @@ static ext4_fsblk_t ext4_valid_block_bitmap_padding(struct super_block *sb,
+ return (next_zero_bit < bitmap_size ? next_zero_bit : 0);
+ }
+
++struct ext4_group_info *ext4_get_group_info(struct super_block *sb,
++ ext4_group_t group)
++{
++ struct ext4_group_info **grp_info;
++ long indexv, indexh;
++
++ if (unlikely(group >= EXT4_SB(sb)->s_groups_count)) {
++ ext4_error(sb, "invalid group %u", group);
++ return NULL;
++ }
++ indexv = group >> (EXT4_DESC_PER_BLOCK_BITS(sb));
++ indexh = group & ((EXT4_DESC_PER_BLOCK(sb)) - 1);
++ grp_info = sbi_array_rcu_deref(EXT4_SB(sb), s_group_info, indexv);
++ return grp_info[indexh];
++}
++
+ /*
+ * Return the block number which was discovered to be invalid, or 0 if
+ * the block bitmap is valid.
+@@ -393,7 +409,7 @@ static int ext4_validate_block_bitmap(struct super_block *sb,
+
+ if (buffer_verified(bh))
+ return 0;
+- if (EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
++ if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
+ return -EFSCORRUPTED;
+
+ ext4_lock_group(sb, block_group);
+diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
+index df0255b7d1faa..68228bd60e836 100644
+--- a/fs/ext4/ext4.h
++++ b/fs/ext4/ext4.h
+@@ -2740,6 +2740,8 @@ extern void ext4_check_blocks_bitmap(struct super_block *);
+ extern struct ext4_group_desc * ext4_get_group_desc(struct super_block * sb,
+ ext4_group_t block_group,
+ struct buffer_head ** bh);
++extern struct ext4_group_info *ext4_get_group_info(struct super_block *sb,
++ ext4_group_t group);
+ extern int ext4_should_retry_alloc(struct super_block *sb, int *retries);
+
+ extern struct buffer_head *ext4_read_block_bitmap_nowait(struct super_block *sb,
+@@ -3347,19 +3349,6 @@ static inline void ext4_isize_set(struct ext4_inode *raw_inode, loff_t i_size)
+ raw_inode->i_size_high = cpu_to_le32(i_size >> 32);
+ }
+
+-static inline
+-struct ext4_group_info *ext4_get_group_info(struct super_block *sb,
+- ext4_group_t group)
+-{
+- struct ext4_group_info **grp_info;
+- long indexv, indexh;
+- BUG_ON(group >= EXT4_SB(sb)->s_groups_count);
+- indexv = group >> (EXT4_DESC_PER_BLOCK_BITS(sb));
+- indexh = group & ((EXT4_DESC_PER_BLOCK(sb)) - 1);
+- grp_info = sbi_array_rcu_deref(EXT4_SB(sb), s_group_info, indexv);
+- return grp_info[indexh];
+-}
+-
+ /*
+ * Reading s_groups_count requires using smp_rmb() afterwards. See
+ * the locking protocol documented in the comments of ext4_group_add()
+diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
+index 157663031f8c9..2354538a430e3 100644
+--- a/fs/ext4/ialloc.c
++++ b/fs/ext4/ialloc.c
+@@ -91,7 +91,7 @@ static int ext4_validate_inode_bitmap(struct super_block *sb,
+
+ if (buffer_verified(bh))
+ return 0;
+- if (EXT4_MB_GRP_IBITMAP_CORRUPT(grp))
++ if (!grp || EXT4_MB_GRP_IBITMAP_CORRUPT(grp))
+ return -EFSCORRUPTED;
+
+ ext4_lock_group(sb, block_group);
+@@ -293,7 +293,7 @@ void ext4_free_inode(handle_t *handle, struct inode *inode)
+ }
+ if (!(sbi->s_mount_state & EXT4_FC_REPLAY)) {
+ grp = ext4_get_group_info(sb, block_group);
+- if (unlikely(EXT4_MB_GRP_IBITMAP_CORRUPT(grp))) {
++ if (!grp || unlikely(EXT4_MB_GRP_IBITMAP_CORRUPT(grp))) {
+ fatal = -EFSCORRUPTED;
+ goto error_return;
+ }
+@@ -1047,7 +1047,7 @@ struct inode *__ext4_new_inode(struct mnt_idmap *idmap,
+ * Skip groups with already-known suspicious inode
+ * tables
+ */
+- if (EXT4_MB_GRP_IBITMAP_CORRUPT(grp))
++ if (!grp || EXT4_MB_GRP_IBITMAP_CORRUPT(grp))
+ goto next_group;
+ }
+
+@@ -1185,6 +1185,10 @@ struct inode *__ext4_new_inode(struct mnt_idmap *idmap,
+
+ if (!(sbi->s_mount_state & EXT4_FC_REPLAY)) {
+ grp = ext4_get_group_info(sb, group);
++ if (!grp) {
++ err = -EFSCORRUPTED;
++ goto out;
++ }
+ down_read(&grp->alloc_sem); /*
+ * protect vs itable
+ * lazyinit
+@@ -1528,7 +1532,7 @@ int ext4_init_inode_table(struct super_block *sb, ext4_group_t group,
+ }
+
+ gdp = ext4_get_group_desc(sb, group, &group_desc_bh);
+- if (!gdp)
++ if (!gdp || !grp)
+ goto out;
+
+ /*
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 343cb38ea3653..2a1df157d1206 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -745,6 +745,8 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file,
+ MB_CHECK_ASSERT(e4b->bd_info->bb_fragments == fragments);
+
+ grp = ext4_get_group_info(sb, e4b->bd_group);
++ if (!grp)
++ return NULL;
+ list_for_each(cur, &grp->bb_prealloc_list) {
+ ext4_group_t groupnr;
+ struct ext4_prealloc_space *pa;
+@@ -1060,9 +1062,9 @@ mb_set_largest_free_order(struct super_block *sb, struct ext4_group_info *grp)
+
+ static noinline_for_stack
+ void ext4_mb_generate_buddy(struct super_block *sb,
+- void *buddy, void *bitmap, ext4_group_t group)
++ void *buddy, void *bitmap, ext4_group_t group,
++ struct ext4_group_info *grp)
+ {
+- struct ext4_group_info *grp = ext4_get_group_info(sb, group);
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ ext4_grpblk_t max = EXT4_CLUSTERS_PER_GROUP(sb);
+ ext4_grpblk_t i = 0;
+@@ -1183,6 +1185,8 @@ static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
+ break;
+
+ grinfo = ext4_get_group_info(sb, group);
++ if (!grinfo)
++ continue;
+ /*
+ * If page is uptodate then we came here after online resize
+ * which added some new uninitialized group info structs, so
+@@ -1248,6 +1252,10 @@ static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
+ group, page->index, i * blocksize);
+ trace_ext4_mb_buddy_bitmap_load(sb, group);
+ grinfo = ext4_get_group_info(sb, group);
++ if (!grinfo) {
++ err = -EFSCORRUPTED;
++ goto out;
++ }
+ grinfo->bb_fragments = 0;
+ memset(grinfo->bb_counters, 0,
+ sizeof(*grinfo->bb_counters) *
+@@ -1258,7 +1266,7 @@ static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
+ ext4_lock_group(sb, group);
+ /* init the buddy */
+ memset(data, 0xff, blocksize);
+- ext4_mb_generate_buddy(sb, data, incore, group);
++ ext4_mb_generate_buddy(sb, data, incore, group, grinfo);
+ ext4_unlock_group(sb, group);
+ incore = NULL;
+ } else {
+@@ -1372,6 +1380,9 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group, gfp_t gfp)
+ might_sleep();
+ mb_debug(sb, "init group %u\n", group);
+ this_grp = ext4_get_group_info(sb, group);
++ if (!this_grp)
++ return -EFSCORRUPTED;
++
+ /*
+ * This ensures that we don't reinit the buddy cache
+ * page which map to the group from which we are already
+@@ -1446,6 +1457,8 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
+
+ blocks_per_page = PAGE_SIZE / sb->s_blocksize;
+ grp = ext4_get_group_info(sb, group);
++ if (!grp)
++ return -EFSCORRUPTED;
+
+ e4b->bd_blkbits = sb->s_blocksize_bits;
+ e4b->bd_info = grp;
+@@ -2162,6 +2175,8 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
+ struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
+ struct ext4_free_extent ex;
+
++ if (!grp)
++ return -EFSCORRUPTED;
+ if (!(ac->ac_flags & (EXT4_MB_HINT_TRY_GOAL | EXT4_MB_HINT_GOAL_ONLY)))
+ return 0;
+ if (grp->bb_free == 0)
+@@ -2386,7 +2401,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
+
+ BUG_ON(cr < 0 || cr >= 4);
+
+- if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
++ if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp) || !grp))
+ return false;
+
+ free = grp->bb_free;
+@@ -2455,6 +2470,8 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
+ ext4_grpblk_t free;
+ int ret = 0;
+
++ if (!grp)
++ return -EFSCORRUPTED;
+ if (sbi->s_mb_stats)
+ atomic64_inc(&sbi->s_bal_cX_groups_considered[ac->ac_criteria]);
+ if (should_lock) {
+@@ -2535,7 +2552,7 @@ ext4_group_t ext4_mb_prefetch(struct super_block *sb, ext4_group_t group,
+ * prefetch once, so we avoid getblk() call, which can
+ * be expensive.
+ */
+- if (!EXT4_MB_GRP_TEST_AND_SET_READ(grp) &&
++ if (gdp && grp && !EXT4_MB_GRP_TEST_AND_SET_READ(grp) &&
+ EXT4_MB_GRP_NEED_INIT(grp) &&
+ ext4_free_group_clusters(sb, gdp) > 0 &&
+ !(ext4_has_group_desc_csum(sb) &&
+@@ -2579,7 +2596,7 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
+ group--;
+ grp = ext4_get_group_info(sb, group);
+
+- if (EXT4_MB_GRP_NEED_INIT(grp) &&
++ if (grp && gdp && EXT4_MB_GRP_NEED_INIT(grp) &&
+ ext4_free_group_clusters(sb, gdp) > 0 &&
+ !(ext4_has_group_desc_csum(sb) &&
+ (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))) {
+@@ -2838,6 +2855,8 @@ static int ext4_mb_seq_groups_show(struct seq_file *seq, void *v)
+ sizeof(struct ext4_group_info);
+
+ grinfo = ext4_get_group_info(sb, group);
++ if (!grinfo)
++ return 0;
+ /* Load the group info in memory only if not already loaded. */
+ if (unlikely(EXT4_MB_GRP_NEED_INIT(grinfo))) {
+ err = ext4_mb_load_buddy(sb, group, &e4b);
+@@ -2848,7 +2867,7 @@ static int ext4_mb_seq_groups_show(struct seq_file *seq, void *v)
+ buddy_loaded = 1;
+ }
+
+- memcpy(&sg, ext4_get_group_info(sb, group), i);
++ memcpy(&sg, grinfo, i);
+
+ if (buddy_loaded)
+ ext4_mb_unload_buddy(&e4b);
+@@ -3210,8 +3229,12 @@ static int ext4_mb_init_backend(struct super_block *sb)
+
+ err_freebuddy:
+ cachep = get_groupinfo_cache(sb->s_blocksize_bits);
+- while (i-- > 0)
+- kmem_cache_free(cachep, ext4_get_group_info(sb, i));
++ while (i-- > 0) {
++ struct ext4_group_info *grp = ext4_get_group_info(sb, i);
++
++ if (grp)
++ kmem_cache_free(cachep, grp);
++ }
+ i = sbi->s_group_info_size;
+ rcu_read_lock();
+ group_info = rcu_dereference(sbi->s_group_info);
+@@ -3525,6 +3548,8 @@ int ext4_mb_release(struct super_block *sb)
+ for (i = 0; i < ngroups; i++) {
+ cond_resched();
+ grinfo = ext4_get_group_info(sb, i);
++ if (!grinfo)
++ continue;
+ mb_group_bb_bitmap_free(grinfo);
+ ext4_lock_group(sb, i);
+ count = ext4_mb_cleanup_pa(grinfo);
+@@ -4454,6 +4479,8 @@ static void ext4_mb_generate_from_freelist(struct super_block *sb, void *bitmap,
+ struct ext4_free_data *entry;
+
+ grp = ext4_get_group_info(sb, group);
++ if (!grp)
++ return;
+ n = rb_first(&(grp->bb_free_root));
+
+ while (n) {
+@@ -4481,6 +4508,9 @@ void ext4_mb_generate_from_pa(struct super_block *sb, void *bitmap,
+ int preallocated = 0;
+ int len;
+
++ if (!grp)
++ return;
++
+ /* all form of preallocation discards first load group,
+ * so the only competing code is preallocation use.
+ * we don't need any locking here
+@@ -4672,6 +4702,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
+
+ ei = EXT4_I(ac->ac_inode);
+ grp = ext4_get_group_info(sb, ac->ac_b_ex.fe_group);
++ if (!grp)
++ return;
+
+ pa->pa_obj_lock = &ei->i_prealloc_lock;
+ pa->pa_inode = ac->ac_inode;
+@@ -4725,6 +4757,8 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
+ atomic_add(pa->pa_free, &EXT4_SB(sb)->s_mb_preallocated);
+
+ grp = ext4_get_group_info(sb, ac->ac_b_ex.fe_group);
++ if (!grp)
++ return;
+ lg = ac->ac_lg;
+ BUG_ON(lg == NULL);
+
+@@ -4853,6 +4887,8 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
+ int err;
+ int free = 0;
+
++ if (!grp)
++ return 0;
+ mb_debug(sb, "discard preallocation for group %u\n", group);
+ if (list_empty(&grp->bb_prealloc_list))
+ goto out_dbg;
+@@ -5090,6 +5126,9 @@ static inline void ext4_mb_show_pa(struct super_block *sb)
+ struct ext4_prealloc_space *pa;
+ ext4_grpblk_t start;
+ struct list_head *cur;
++
++ if (!grp)
++ continue;
+ ext4_lock_group(sb, i);
+ list_for_each(cur, &grp->bb_prealloc_list) {
+ pa = list_entry(cur, struct ext4_prealloc_space,
+@@ -5889,6 +5928,7 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
+ struct buffer_head *bitmap_bh = NULL;
+ struct super_block *sb = inode->i_sb;
+ struct ext4_group_desc *gdp;
++ struct ext4_group_info *grp;
+ unsigned int overflow;
+ ext4_grpblk_t bit;
+ struct buffer_head *gd_bh;
+@@ -5914,8 +5954,8 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
+ overflow = 0;
+ ext4_get_group_no_and_offset(sb, block, &block_group, &bit);
+
+- if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(
+- ext4_get_group_info(sb, block_group))))
++ grp = ext4_get_group_info(sb, block_group);
++ if (unlikely(!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
+ return;
+
+ /*
+@@ -6517,6 +6557,8 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+
+ for (group = first_group; group <= last_group; group++) {
+ grp = ext4_get_group_info(sb, group);
++ if (!grp)
++ continue;
+ /* We only do this if the grp has never been initialized */
+ if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
+ ret = ext4_mb_init_group(sb, group, GFP_NOFS);
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 7c45ab1dbd34e..d34afa8e0c158 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -1048,6 +1048,8 @@ void ext4_mark_group_bitmap_corrupted(struct super_block *sb,
+ struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group, NULL);
+ int ret;
+
++ if (!grp || !gdp)
++ return;
+ if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) {
+ ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
+ &grp->bb_state);
+--
+2.39.2
+
--- /dev/null
+From 295f37d7e34b7a371c0c600a454694ed4f27853e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 4 Mar 2023 01:21:02 +0800
+Subject: ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set
+
+From: Kemeng Shi <shikemeng@huaweicloud.com>
+
+[ Upstream commit 01e4ca29451760b9ac10b4cdc231c52150842643 ]
+
+If EXT4_MB_HINT_GOAL_ONLY is set, ext4_mb_regular_allocator will only
+allocate blocks from ext4_mb_find_by_goal. Allow to find by goal in
+ext4_mb_find_by_goal if EXT4_MB_HINT_GOAL_ONLY is set or allocation
+with EXT4_MB_HINT_GOAL_ONLY set will always fail.
+
+EXT4_MB_HINT_GOAL_ONLY is not used at all, so the problem is not
+found for now.
+
+Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
+Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
+Link: https://lore.kernel.org/r/20230303172120.3800725-3-shikemeng@huaweicloud.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Stable-dep-of: 5354b2af3406 ("ext4: allow ext4_get_group_info() to fail")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ fs/ext4/mballoc.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 5639a4cf7ff98..343cb38ea3653 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -2162,7 +2162,7 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
+ struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
+ struct ext4_free_extent ex;
+
+- if (!(ac->ac_flags & EXT4_MB_HINT_TRY_GOAL))
++ if (!(ac->ac_flags & (EXT4_MB_HINT_TRY_GOAL | EXT4_MB_HINT_GOAL_ONLY)))
+ return 0;
+ if (grp->bb_free == 0)
+ return 0;
+--
+2.39.2
+
--- /dev/null
+From 2feae74b7bff73acb1d3059abf0bd1eb8d69b5e1 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 May 2023 21:02:30 -0400
+Subject: ext4: don't clear SB_RDONLY when remounting r/w until quota is
+ re-enabled
+
+From: Theodore Ts'o <tytso@mit.edu>
+
+[ Upstream commit a44be64bbecb15a452496f60db6eacfee2b59c79 ]
+
+When a file system currently mounted read/only is remounted
+read/write, if we clear the SB_RDONLY flag too early, before the quota
+is initialized, and there is another process/thread constantly
+attempting to create a directory, it's possible to trigger the
+
+ WARN_ON_ONCE(dquot_initialize_needed(inode));
+
+in ext4_xattr_block_set(), with the following stack trace:
+
+ WARNING: CPU: 0 PID: 5338 at fs/ext4/xattr.c:2141 ext4_xattr_block_set+0x2ef2/0x3680
+ RIP: 0010:ext4_xattr_block_set+0x2ef2/0x3680 fs/ext4/xattr.c:2141
+ Call Trace:
+ ext4_xattr_set_handle+0xcd4/0x15c0 fs/ext4/xattr.c:2458
+ ext4_initxattrs+0xa3/0x110 fs/ext4/xattr_security.c:44
+ security_inode_init_security+0x2df/0x3f0 security/security.c:1147
+ __ext4_new_inode+0x347e/0x43d0 fs/ext4/ialloc.c:1324
+ ext4_mkdir+0x425/0xce0 fs/ext4/namei.c:2992
+ vfs_mkdir+0x29d/0x450 fs/namei.c:4038
+ do_mkdirat+0x264/0x520 fs/namei.c:4061
+ __do_sys_mkdirat fs/namei.c:4076 [inline]
+ __se_sys_mkdirat fs/namei.c:4074 [inline]
+ __x64_sys_mkdirat+0x89/0xa0 fs/namei.c:4074
+
+Cc: stable@kernel.org
+Link: https://lore.kernel.org/r/20230506142419.984260-1-tytso@mit.edu
+Reported-by: syzbot+6385d7d3065524c5ca6d@syzkaller.appspotmail.com
+Link: https://syzkaller.appspot.com/bug?id=6513f6cb5cd6b5fc9f37e3bb70d273b94be9c34c
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ fs/ext4/super.c | 6 +++++-
+ 1 file changed, 5 insertions(+), 1 deletion(-)
+
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 7b36089394175..7c45ab1dbd34e 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -6352,6 +6352,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ struct ext4_mount_options old_opts;
+ ext4_group_t g;
+ int err = 0;
++ int enable_rw = 0;
+ #ifdef CONFIG_QUOTA
+ int enable_quota = 0;
+ int i, j;
+@@ -6538,7 +6539,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ if (err)
+ goto restore_opts;
+
+- sb->s_flags &= ~SB_RDONLY;
++ enable_rw = 1;
+ if (ext4_has_feature_mmp(sb)) {
+ err = ext4_multi_mount_protect(sb,
+ le64_to_cpu(es->s_mmp_block));
+@@ -6597,6 +6598,9 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks)
+ ext4_release_system_zone(sb);
+
++ if (enable_rw)
++ sb->s_flags &= ~SB_RDONLY;
++
+ if (!ext4_has_feature_mmp(sb) || sb_rdonly(sb))
+ ext4_stop_mmpd(sbi);
+
+--
+2.39.2
+
--- /dev/null
+From 8e3ea06b5c82f9484990c19b96749451716cf618 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 27 Apr 2023 22:49:34 -0400
+Subject: ext4: reflect error codes from ext4_multi_mount_protect() to its
+ callers
+
+From: Theodore Ts'o <tytso@mit.edu>
+
+[ Upstream commit 3b50d5018ed06a647bb26c44bb5ae74e59c903c7 ]
+
+This will allow more fine-grained errno codes to be returned by the
+mount system call.
+
+Cc: Andreas Dilger <adilger.kernel@dilger.ca>
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Stable-dep-of: a44be64bbecb ("ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ fs/ext4/mmp.c | 9 ++++++++-
+ fs/ext4/super.c | 16 +++++++++-------
+ 2 files changed, 17 insertions(+), 8 deletions(-)
+
+diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
+index 46735ce315b5a..0aaf38ffcb6ec 100644
+--- a/fs/ext4/mmp.c
++++ b/fs/ext4/mmp.c
+@@ -290,6 +290,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+ if (mmp_block < le32_to_cpu(es->s_first_data_block) ||
+ mmp_block >= ext4_blocks_count(es)) {
+ ext4_warning(sb, "Invalid MMP block in superblock");
++ retval = -EINVAL;
+ goto failed;
+ }
+
+@@ -315,6 +316,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+
+ if (seq == EXT4_MMP_SEQ_FSCK) {
+ dump_mmp_msg(sb, mmp, "fsck is running on the filesystem");
++ retval = -EBUSY;
+ goto failed;
+ }
+
+@@ -328,6 +330,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+
+ if (schedule_timeout_interruptible(HZ * wait_time) != 0) {
+ ext4_warning(sb, "MMP startup interrupted, failing mount\n");
++ retval = -ETIMEDOUT;
+ goto failed;
+ }
+
+@@ -338,6 +341,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+ if (seq != le32_to_cpu(mmp->mmp_seq)) {
+ dump_mmp_msg(sb, mmp,
+ "Device is already active on another node.");
++ retval = -EBUSY;
+ goto failed;
+ }
+
+@@ -361,6 +365,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+ */
+ if (schedule_timeout_interruptible(HZ * wait_time) != 0) {
+ ext4_warning(sb, "MMP startup interrupted, failing mount");
++ retval = -ETIMEDOUT;
+ goto failed;
+ }
+
+@@ -371,6 +376,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+ if (seq != le32_to_cpu(mmp->mmp_seq)) {
+ dump_mmp_msg(sb, mmp,
+ "Device is already active on another node.");
++ retval = -EBUSY;
+ goto failed;
+ }
+
+@@ -390,6 +396,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
+ EXT4_SB(sb)->s_mmp_tsk = NULL;
+ ext4_warning(sb, "Unable to create kmmpd thread for %s.",
+ sb->s_id);
++ retval = -ENOMEM;
+ goto failed;
+ }
+
+@@ -397,5 +404,5 @@ int ext4_multi_mount_protect(struct super_block *sb,
+
+ failed:
+ brelse(bh);
+- return 1;
++ return retval;
+ }
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index d6ac61f43ac35..7b36089394175 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -5264,9 +5264,11 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
+ ext4_has_feature_orphan_present(sb) ||
+ ext4_has_feature_journal_needs_recovery(sb));
+
+- if (ext4_has_feature_mmp(sb) && !sb_rdonly(sb))
+- if (ext4_multi_mount_protect(sb, le64_to_cpu(es->s_mmp_block)))
++ if (ext4_has_feature_mmp(sb) && !sb_rdonly(sb)) {
++ err = ext4_multi_mount_protect(sb, le64_to_cpu(es->s_mmp_block));
++ if (err)
+ goto failed_mount3a;
++ }
+
+ /*
+ * The first inode we look at is the journal inode. Don't try
+@@ -6537,12 +6539,12 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ goto restore_opts;
+
+ sb->s_flags &= ~SB_RDONLY;
+- if (ext4_has_feature_mmp(sb))
+- if (ext4_multi_mount_protect(sb,
+- le64_to_cpu(es->s_mmp_block))) {
+- err = -EROFS;
++ if (ext4_has_feature_mmp(sb)) {
++ err = ext4_multi_mount_protect(sb,
++ le64_to_cpu(es->s_mmp_block));
++ if (err)
+ goto restore_opts;
+- }
++ }
+ #ifdef CONFIG_QUOTA
+ enable_quota = 1;
+ #endif
+--
+2.39.2
+
--- /dev/null
+From 148d2302d73c54fb1f420d2821d15cd29587929f Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 19:27:26 +0800
+Subject: fbdev: arcfb: Fix error handling in arcfb_probe()
+
+From: Zongjie Li <u202112089@hust.edu.cn>
+
+[ Upstream commit 5a6bef734247c7a8c19511664ff77634ab86f45b ]
+
+Smatch complains that:
+arcfb_probe() warn: 'irq' from request_irq() not released on lines: 587.
+
+Fix error handling in the arcfb_probe() function. If IO addresses are
+not provided or framebuffer registration fails, the code will jump to
+the err_addr or err_register_fb label to release resources.
+If IRQ request fails, previously allocated resources will be freed.
+
+Fixes: 1154ea7dcd8e ("[PATCH] Framebuffer driver for Arc LCD board")
+Signed-off-by: Zongjie Li <u202112089@hust.edu.cn>
+Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
+Signed-off-by: Helge Deller <deller@gmx.de>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/video/fbdev/arcfb.c | 15 +++++++++------
+ 1 file changed, 9 insertions(+), 6 deletions(-)
+
+diff --git a/drivers/video/fbdev/arcfb.c b/drivers/video/fbdev/arcfb.c
+index 45e64016db328..024d0ee4f04f9 100644
+--- a/drivers/video/fbdev/arcfb.c
++++ b/drivers/video/fbdev/arcfb.c
+@@ -523,7 +523,7 @@ static int arcfb_probe(struct platform_device *dev)
+
+ info = framebuffer_alloc(sizeof(struct arcfb_par), &dev->dev);
+ if (!info)
+- goto err;
++ goto err_fb_alloc;
+
+ info->screen_base = (char __iomem *)videomemory;
+ info->fbops = &arcfb_ops;
+@@ -535,7 +535,7 @@ static int arcfb_probe(struct platform_device *dev)
+
+ if (!dio_addr || !cio_addr || !c2io_addr) {
+ printk(KERN_WARNING "no IO addresses supplied\n");
+- goto err1;
++ goto err_addr;
+ }
+ par->dio_addr = dio_addr;
+ par->cio_addr = cio_addr;
+@@ -551,12 +551,12 @@ static int arcfb_probe(struct platform_device *dev)
+ printk(KERN_INFO
+ "arcfb: Failed req IRQ %d\n", par->irq);
+ retval = -EBUSY;
+- goto err1;
++ goto err_addr;
+ }
+ }
+ retval = register_framebuffer(info);
+ if (retval < 0)
+- goto err1;
++ goto err_register_fb;
+ platform_set_drvdata(dev, info);
+ fb_info(info, "Arc frame buffer device, using %dK of video memory\n",
+ videomemorysize >> 10);
+@@ -580,9 +580,12 @@ static int arcfb_probe(struct platform_device *dev)
+ }
+
+ return 0;
+-err1:
++
++err_register_fb:
++ free_irq(par->irq, info);
++err_addr:
+ framebuffer_release(info);
+-err:
++err_fb_alloc:
+ vfree(videomemory);
+ return retval;
+ }
+--
+2.39.2
+
--- /dev/null
+From 728f432a02edf83b0d1d5e4e45f2f2e682a15744 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 19 Apr 2023 00:48:34 -0400
+Subject: firmware/sysfb: Fix VESA format selection
+
+From: Pierre Asselin <pa@panix.com>
+
+[ Upstream commit 1b617bc93178912fa36f87a957c15d1f1708c299 ]
+
+Some legacy BIOSes report no reserved bits in their 32-bit rgb mode,
+breaking the calculation of bits_per_pixel in commit f35cd3fa7729
+("firmware/sysfb: Fix EFI/VESA format selection"). However they report
+lfb_depth correctly for those modes. Keep the computation but
+set bits_per_pixel to lfb_depth if the latter is larger.
+
+v2 fixes the warnings from a max3() macro with arguments of different
+types; split the bits_per_pixel assignment to avoid uglyfing the code
+with too many typecasts.
+
+v3 fixes space and formatting blips pointed out by Javier, and change
+the bit_per_pixel assignment back to a single statement using two casts.
+
+v4 go back to v2 and use max_t()
+
+Signed-off-by: Pierre Asselin <pa@panix.com>
+Fixes: f35cd3fa7729 ("firmware/sysfb: Fix EFI/VESA format selection")
+Link: https://lore.kernel.org/r/4Psm6B6Lqkz1QXM@panix3.panix.com
+Link: https://lore.kernel.org/r/20230412150225.3757223-1-javierm@redhat.com
+Tested-by: Thomas Zimmermann <tzimmermann@suse.de>
+Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
+Link: https://patchwork.freedesktop.org/patch/msgid/20230419044834.10816-1-pa@panix.com
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/firmware/sysfb_simplefb.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/firmware/sysfb_simplefb.c b/drivers/firmware/sysfb_simplefb.c
+index 82c64cb9f5316..74363ed7501f6 100644
+--- a/drivers/firmware/sysfb_simplefb.c
++++ b/drivers/firmware/sysfb_simplefb.c
+@@ -51,7 +51,8 @@ __init bool sysfb_parse_mode(const struct screen_info *si,
+ *
+ * It's not easily possible to fix this in struct screen_info,
+ * as this could break UAPI. The best solution is to compute
+- * bits_per_pixel here and ignore lfb_depth. In the loop below,
++ * bits_per_pixel from the color bits, reserved bits and
++ * reported lfb_depth, whichever is highest. In the loop below,
+ * ignore simplefb formats with alpha bits, as EFI and VESA
+ * don't specify alpha channels.
+ */
+@@ -60,6 +61,7 @@ __init bool sysfb_parse_mode(const struct screen_info *si,
+ si->green_size + si->green_pos,
+ si->blue_size + si->blue_pos),
+ si->rsvd_size + si->rsvd_pos);
++ bits_per_pixel = max_t(u32, bits_per_pixel, si->lfb_depth);
+ } else {
+ bits_per_pixel = si->lfb_depth;
+ }
+--
+2.39.2
+
--- /dev/null
+From 6c9608ec0e873c7d907616aaf8409f89ff49dc08 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 15:51:23 -0700
+Subject: gve: Remove the code of clearing PBA bit
+
+From: Ziwei Xiao <ziweixiao@google.com>
+
+[ Upstream commit f4c2e67c1773d2a2632381ee30e9139c1e744c16 ]
+
+Clearing the PBA bit from the driver is race prone and it may lead to
+dropped interrupt events. This could potentially lead to the traffic
+being completely halted.
+
+Fixes: 5e8c5adf95f8 ("gve: DQO: Add core netdev features")
+Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
+Signed-off-by: Bailey Forrest <bcf@google.com>
+Reviewed-by: Simon Horman <simon.horman@corigine.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/google/gve/gve_main.c | 13 -------------
+ 1 file changed, 13 deletions(-)
+
+diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
+index 07111c241e0eb..60bf0e3fb2176 100644
+--- a/drivers/net/ethernet/google/gve/gve_main.c
++++ b/drivers/net/ethernet/google/gve/gve_main.c
+@@ -284,19 +284,6 @@ static int gve_napi_poll_dqo(struct napi_struct *napi, int budget)
+ bool reschedule = false;
+ int work_done = 0;
+
+- /* Clear PCI MSI-X Pending Bit Array (PBA)
+- *
+- * This bit is set if an interrupt event occurs while the vector is
+- * masked. If this bit is set and we reenable the interrupt, it will
+- * fire again. Since we're just about to poll the queue state, we don't
+- * need it to fire again.
+- *
+- * Under high softirq load, it's possible that the interrupt condition
+- * is triggered twice before we got the chance to process it.
+- */
+- gve_write_irq_doorbell_dqo(priv, block,
+- GVE_ITR_NO_UPDATE_DQO | GVE_ITR_CLEAR_PBA_BIT_DQO);
+-
+ if (block->tx)
+ reschedule |= gve_tx_poll_dqo(block, /*do_clean=*/true);
+
+--
+2.39.2
+
--- /dev/null
+From e667acde523b0122c44982258f74965f069b2087 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 May 2023 11:50:44 +0800
+Subject: ipvlan:Fix out-of-bounds caused by unclear skb->cb
+
+From: t.feng <fengtao40@huawei.com>
+
+[ Upstream commit 90cbed5247439a966b645b34eb0a2e037836ea8e ]
+
+If skb enqueue the qdisc, fq_skb_cb(skb)->time_to_send is changed which
+is actually skb->cb, and IPCB(skb_in)->opt will be used in
+__ip_options_echo. It is possible that memcpy is out of bounds and lead
+to stack overflow.
+We should clear skb->cb before ip_local_out or ip6_local_out.
+
+v2:
+1. clean the stack info
+2. use IPCB/IP6CB instead of skb->cb
+
+crash on stable-5.10(reproduce in kasan kernel).
+Stack info:
+[ 2203.651571] BUG: KASAN: stack-out-of-bounds in
+__ip_options_echo+0x589/0x800
+[ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task
+swapper/3/0
+[ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted
+5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
+[ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
+BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
+[ 2203.655475] Call Trace:
+[ 2203.655481] <IRQ>
+[ 2203.655501] dump_stack+0x9c/0xd3
+[ 2203.655514] print_address_description.constprop.0+0x19/0x170
+[ 2203.655530] __kasan_report.cold+0x6c/0x84
+[ 2203.655586] kasan_report+0x3a/0x50
+[ 2203.655594] check_memory_region+0xfd/0x1f0
+[ 2203.655601] memcpy+0x39/0x60
+[ 2203.655608] __ip_options_echo+0x589/0x800
+[ 2203.655654] __icmp_send+0x59a/0x960
+[ 2203.655755] nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
+[ 2203.655763] reject_tg+0x77/0x1bf [ipt_REJECT]
+[ 2203.655772] ipt_do_table+0x691/0xa40 [ip_tables]
+[ 2203.655821] nf_hook_slow+0x69/0x100
+[ 2203.655828] __ip_local_out+0x21e/0x2b0
+[ 2203.655857] ip_local_out+0x28/0x90
+[ 2203.655868] ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
+[ 2203.655931] ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
+[ 2203.655967] ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
+[ 2203.655977] ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
+[ 2203.655984] xmit_one.constprop.0+0xe1/0x280
+[ 2203.655992] dev_hard_start_xmit+0x62/0x100
+[ 2203.656000] sch_direct_xmit+0x215/0x640
+[ 2203.656028] __qdisc_run+0x153/0x1f0
+[ 2203.656069] __dev_queue_xmit+0x77f/0x1030
+[ 2203.656173] ip_finish_output2+0x59b/0xc20
+[ 2203.656244] __ip_finish_output.part.0+0x318/0x3d0
+[ 2203.656312] ip_finish_output+0x168/0x190
+[ 2203.656320] ip_output+0x12d/0x220
+[ 2203.656357] __ip_queue_xmit+0x392/0x880
+[ 2203.656380] __tcp_transmit_skb+0x1088/0x11c0
+[ 2203.656436] __tcp_retransmit_skb+0x475/0xa30
+[ 2203.656505] tcp_retransmit_skb+0x2d/0x190
+[ 2203.656512] tcp_retransmit_timer+0x3af/0x9a0
+[ 2203.656519] tcp_write_timer_handler+0x3ba/0x510
+[ 2203.656529] tcp_write_timer+0x55/0x180
+[ 2203.656542] call_timer_fn+0x3f/0x1d0
+[ 2203.656555] expire_timers+0x160/0x200
+[ 2203.656562] run_timer_softirq+0x1f4/0x480
+[ 2203.656606] __do_softirq+0xfd/0x402
+[ 2203.656613] asm_call_irq_on_stack+0x12/0x20
+[ 2203.656617] </IRQ>
+[ 2203.656623] do_softirq_own_stack+0x37/0x50
+[ 2203.656631] irq_exit_rcu+0x134/0x1a0
+[ 2203.656639] sysvec_apic_timer_interrupt+0x36/0x80
+[ 2203.656646] asm_sysvec_apic_timer_interrupt+0x12/0x20
+[ 2203.656654] RIP: 0010:default_idle+0x13/0x20
+[ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc
+cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb
+f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
+[ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
+[ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX:
+ffffffffaf290191
+[ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI:
+ffff88811a3c4f60
+[ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09:
+ffff88811a3c4f63
+[ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12:
+0000000000000003
+[ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15:
+0000000000000000
+[ 2203.656729] default_idle_call+0x5a/0x150
+[ 2203.656735] cpuidle_idle_call+0x1c6/0x220
+[ 2203.656780] do_idle+0xab/0x100
+[ 2203.656786] cpu_startup_entry+0x19/0x20
+[ 2203.656793] secondary_startup_64_no_verify+0xc2/0xcb
+
+[ 2203.657409] The buggy address belongs to the page:
+[ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0
+mapping:0000000000000000 index:0x0 pfn:0x11a388
+[ 2203.658665] flags:
+0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
+[ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208
+0000000000000000
+[ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff
+0000000000000000
+[ 2203.658686] page dumped because: kasan: bad access detected
+
+To reproduce(ipvlan with IPVLAN_MODE_L3):
+Env setting:
+=======================================================
+modprobe ipvlan ipvlan_default_mode=1
+sysctl net.ipv4.conf.eth0.forwarding=1
+iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j
+MASQUERADE
+ip link add gw link eth0 type ipvlan
+ip -4 addr add 20.0.0.254/24 dev gw
+ip netns add net1
+ip link add ipv1 link eth0 type ipvlan
+ip link set ipv1 netns net1
+ip netns exec net1 ip link set ipv1 up
+ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
+ip netns exec net1 route add default gw 20.0.0.254
+ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
+ifconfig gw up
+iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with
+icmp-port-unreachable
+=======================================================
+And then excute the shell(curl any address of eth0 can reach):
+
+for((i=1;i<=100000;i++))
+do
+ ip netns exec net1 curl x.x.x.x:8888
+done
+=======================================================
+
+Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
+Signed-off-by: "t.feng" <fengtao40@huawei.com>
+Suggested-by: Florian Westphal <fw@strlen.de>
+Reviewed-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ipvlan/ipvlan_core.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
+index 460b3d4f2245f..ab5133eb1d517 100644
+--- a/drivers/net/ipvlan/ipvlan_core.c
++++ b/drivers/net/ipvlan/ipvlan_core.c
+@@ -436,6 +436,9 @@ static int ipvlan_process_v4_outbound(struct sk_buff *skb)
+ goto err;
+ }
+ skb_dst_set(skb, &rt->dst);
++
++ memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
++
+ err = ip_local_out(net, skb->sk, skb);
+ if (unlikely(net_xmit_eval(err)))
+ dev->stats.tx_errors++;
+@@ -474,6 +477,9 @@ static int ipvlan_process_v6_outbound(struct sk_buff *skb)
+ goto err;
+ }
+ skb_dst_set(skb, dst);
++
++ memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
++
+ err = ip6_local_out(net, skb->sk, skb);
+ if (unlikely(net_xmit_eval(err)))
+ dev->stats.tx_errors++;
+--
+2.39.2
+
--- /dev/null
+From 089802472329ae1336438cb1844d680e40d1ce3d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sun, 7 May 2023 16:57:43 +0300
+Subject: linux/dim: Do nothing if no time delta between samples
+
+From: Roy Novich <royno@nvidia.com>
+
+[ Upstream commit 162bd18eb55adf464a0fa2b4144b8d61c75ff7c2 ]
+
+Add return value for dim_calc_stats. This is an indication for the
+caller if curr_stats was assigned by the function. Avoid using
+curr_stats uninitialized over {rdma/net}_dim, when no time delta between
+samples. Coverity reported this potential use of an uninitialized
+variable.
+
+Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
+Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing")
+Signed-off-by: Roy Novich <royno@nvidia.com>
+Reviewed-by: Aya Levin <ayal@nvidia.com>
+Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
+Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
+Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
+Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
+Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/linux/dim.h | 3 ++-
+ lib/dim/dim.c | 5 +++--
+ lib/dim/net_dim.c | 3 ++-
+ lib/dim/rdma_dim.c | 3 ++-
+ 4 files changed, 9 insertions(+), 5 deletions(-)
+
+diff --git a/include/linux/dim.h b/include/linux/dim.h
+index 6c5733981563e..f343bc9aa2ec9 100644
+--- a/include/linux/dim.h
++++ b/include/linux/dim.h
+@@ -236,8 +236,9 @@ void dim_park_tired(struct dim *dim);
+ *
+ * Calculate the delta between two samples (in data rates).
+ * Takes into consideration counter wrap-around.
++ * Returned boolean indicates whether curr_stats are reliable.
+ */
+-void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
++bool dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
+ struct dim_stats *curr_stats);
+
+ /**
+diff --git a/lib/dim/dim.c b/lib/dim/dim.c
+index 38045d6d05381..e89aaf07bde50 100644
+--- a/lib/dim/dim.c
++++ b/lib/dim/dim.c
+@@ -54,7 +54,7 @@ void dim_park_tired(struct dim *dim)
+ }
+ EXPORT_SYMBOL(dim_park_tired);
+
+-void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
++bool dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
+ struct dim_stats *curr_stats)
+ {
+ /* u32 holds up to 71 minutes, should be enough */
+@@ -66,7 +66,7 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
+ start->comp_ctr);
+
+ if (!delta_us)
+- return;
++ return false;
+
+ curr_stats->ppms = DIV_ROUND_UP(npkts * USEC_PER_MSEC, delta_us);
+ curr_stats->bpms = DIV_ROUND_UP(nbytes * USEC_PER_MSEC, delta_us);
+@@ -79,5 +79,6 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
+ else
+ curr_stats->cpe_ratio = 0;
+
++ return true;
+ }
+ EXPORT_SYMBOL(dim_calc_stats);
+diff --git a/lib/dim/net_dim.c b/lib/dim/net_dim.c
+index 53f6b9c6e9366..4e32f7aaac86c 100644
+--- a/lib/dim/net_dim.c
++++ b/lib/dim/net_dim.c
+@@ -227,7 +227,8 @@ void net_dim(struct dim *dim, struct dim_sample end_sample)
+ dim->start_sample.event_ctr);
+ if (nevents < DIM_NEVENTS)
+ break;
+- dim_calc_stats(&dim->start_sample, &end_sample, &curr_stats);
++ if (!dim_calc_stats(&dim->start_sample, &end_sample, &curr_stats))
++ break;
+ if (net_dim_decision(&curr_stats, dim)) {
+ dim->state = DIM_APPLY_NEW_PROFILE;
+ schedule_work(&dim->work);
+diff --git a/lib/dim/rdma_dim.c b/lib/dim/rdma_dim.c
+index 15462d54758d3..88f7794867078 100644
+--- a/lib/dim/rdma_dim.c
++++ b/lib/dim/rdma_dim.c
+@@ -88,7 +88,8 @@ void rdma_dim(struct dim *dim, u64 completions)
+ nevents = curr_sample->event_ctr - dim->start_sample.event_ctr;
+ if (nevents < DIM_NEVENTS)
+ break;
+- dim_calc_stats(&dim->start_sample, curr_sample, &curr_stats);
++ if (!dim_calc_stats(&dim->start_sample, curr_sample, &curr_stats))
++ break;
+ if (rdma_dim_decision(&curr_stats, dim)) {
+ dim->state = DIM_APPLY_NEW_PROFILE;
+ schedule_work(&dim->work);
+--
+2.39.2
+
--- /dev/null
+From a1cd5200c45814449ab33c3a47aa79a4c52c1cae Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 13:18:57 +0000
+Subject: net: add vlan_get_protocol_and_depth() helper
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit 4063384ef762cc5946fc7a3f89879e76c6ec51e2 ]
+
+Before blamed commit, pskb_may_pull() was used instead
+of skb_header_pointer() in __vlan_get_protocol() and friends.
+
+Few callers depended on skb->head being populated with MAC header,
+syzbot caught one of them (skb_mac_gso_segment())
+
+Add vlan_get_protocol_and_depth() to make the intent clearer
+and use it where sensible.
+
+This is a more generic fix than commit e9d3f80935b6
+("net/af_packet: make sure to pull mac header") which was
+dealing with a similar issue.
+
+kernel BUG at include/linux/skbuff.h:2655 !
+invalid opcode: 0000 [#1] SMP KASAN
+CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0
+Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
+RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline]
+RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136
+Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
+RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286
+RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00
+RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
+RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9
+R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012
+R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7
+FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0
+DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
+DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
+Call Trace:
+<TASK>
+[<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419
+[<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline]
+[<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725
+[<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313
+[<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029
+[<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline]
+[<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142
+[<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline]
+[<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline]
+[<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139
+[<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline]
+[<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline]
+[<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
+[<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+[<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
+[<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+Fixes: 469aceddfa3e ("vlan: consolidate VLAN parsing code and limit max parsing depth")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Cc: Toke Høiland-Jørgensen <toke@redhat.com>
+Cc: Willem de Bruijn <willemb@google.com>
+Reviewed-by: Simon Horman <simon.horman@corigine.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 4 ++--
+ include/linux/if_vlan.h | 17 +++++++++++++++++
+ net/bridge/br_forward.c | 2 +-
+ net/core/dev.c | 2 +-
+ net/packet/af_packet.c | 6 ++----
+ 5 files changed, 23 insertions(+), 8 deletions(-)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index 8941aa199ea33..456de9c3ea169 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -739,7 +739,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
+
+ /* Move network header to the right position for VLAN tagged packets */
+ if (eth_type_vlan(skb->protocol) &&
+- __vlan_get_protocol(skb, skb->protocol, &depth) != 0)
++ vlan_get_protocol_and_depth(skb, skb->protocol, &depth) != 0)
+ skb_set_network_header(skb, depth);
+
+ /* copy skb_ubuf_info for callback when skb has no error */
+@@ -1186,7 +1186,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+
+ /* Move network header to the right position for VLAN tagged packets */
+ if (eth_type_vlan(skb->protocol) &&
+- __vlan_get_protocol(skb, skb->protocol, &depth) != 0)
++ vlan_get_protocol_and_depth(skb, skb->protocol, &depth) != 0)
+ skb_set_network_header(skb, depth);
+
+ rcu_read_lock();
+diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
+index 6864b89ef8681..7ad09082f56c3 100644
+--- a/include/linux/if_vlan.h
++++ b/include/linux/if_vlan.h
+@@ -628,6 +628,23 @@ static inline __be16 vlan_get_protocol(const struct sk_buff *skb)
+ return __vlan_get_protocol(skb, skb->protocol, NULL);
+ }
+
++/* This version of __vlan_get_protocol() also pulls mac header in skb->head */
++static inline __be16 vlan_get_protocol_and_depth(struct sk_buff *skb,
++ __be16 type, int *depth)
++{
++ int maclen;
++
++ type = __vlan_get_protocol(skb, type, &maclen);
++
++ if (type) {
++ if (!pskb_may_pull(skb, maclen))
++ type = 0;
++ else if (depth)
++ *depth = maclen;
++ }
++ return type;
++}
++
+ /* A getter for the SKB protocol field which will handle VLAN tags consistently
+ * whether VLAN acceleration is enabled or not.
+ */
+diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
+index 02bb620d3b8da..bd54f17e3c3d8 100644
+--- a/net/bridge/br_forward.c
++++ b/net/bridge/br_forward.c
+@@ -42,7 +42,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
+ eth_type_vlan(skb->protocol)) {
+ int depth;
+
+- if (!__vlan_get_protocol(skb, skb->protocol, &depth))
++ if (!vlan_get_protocol_and_depth(skb, skb->protocol, &depth))
+ goto drop;
+
+ skb_set_network_header(skb, depth);
+diff --git a/net/core/dev.c b/net/core/dev.c
+index 1488f700bf819..8fbd241849c01 100644
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -3338,7 +3338,7 @@ __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
+ type = eth->h_proto;
+ }
+
+- return __vlan_get_protocol(skb, type, depth);
++ return vlan_get_protocol_and_depth(skb, type, depth);
+ }
+
+ /* openvswitch calls this on rx path, so we need a different check.
+diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
+index b8c62d88567ba..db9c2fa71c50c 100644
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -1935,10 +1935,8 @@ static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
+ /* Move network header to the right position for VLAN tagged packets */
+ if (likely(skb->dev->type == ARPHRD_ETHER) &&
+ eth_type_vlan(skb->protocol) &&
+- __vlan_get_protocol(skb, skb->protocol, &depth) != 0) {
+- if (pskb_may_pull(skb, depth))
+- skb_set_network_header(skb, depth);
+- }
++ vlan_get_protocol_and_depth(skb, skb->protocol, &depth) != 0)
++ skb_set_network_header(skb, depth);
+
+ skb_probe_transport_header(skb);
+ }
+--
+2.39.2
+
--- /dev/null
+From 9f3e6c717c489b276b3e948ba3bf53a90ba99b0b Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 16:35:53 +0000
+Subject: net: annotate sk->sk_err write from do_recvmmsg()
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit e05a5f510f26607616fecdd4ac136310c8bea56b ]
+
+do_recvmmsg() can write to sk->sk_err from multiple threads.
+
+As said before, many other points reading or writing sk_err
+need annotations.
+
+Fixes: 34b88a68f26a ("net: Fix use after free in the recvmmsg exit path")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/socket.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/net/socket.c b/net/socket.c
+index 9c92c0e6c4da8..263fab8e49010 100644
+--- a/net/socket.c
++++ b/net/socket.c
+@@ -2909,7 +2909,7 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
+ * error to return on the next call or if the
+ * app asks about it using getsockopt(SO_ERROR).
+ */
+- sock->sk->sk_err = -err;
++ WRITE_ONCE(sock->sk->sk_err, -err);
+ }
+ out_put:
+ fput_light(sock->file, fput_needed);
+--
+2.39.2
+
--- /dev/null
+From 0ab97a58b1b939fdbd0f1f8f46bb17185483def3 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 17:31:31 +0000
+Subject: net: datagram: fix data-races in datagram_poll()
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit 5bca1d081f44c9443e61841842ce4e9179d327b6 ]
+
+datagram_poll() runs locklessly, we should add READ_ONCE()
+annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state.
+
+Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/core/datagram.c | 15 ++++++++++-----
+ 1 file changed, 10 insertions(+), 5 deletions(-)
+
+diff --git a/net/core/datagram.c b/net/core/datagram.c
+index e4ff2db40c981..8dabb9a74cb17 100644
+--- a/net/core/datagram.c
++++ b/net/core/datagram.c
+@@ -799,18 +799,21 @@ __poll_t datagram_poll(struct file *file, struct socket *sock,
+ {
+ struct sock *sk = sock->sk;
+ __poll_t mask;
++ u8 shutdown;
+
+ sock_poll_wait(file, sock, wait);
+ mask = 0;
+
+ /* exceptional events? */
+- if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue))
++ if (READ_ONCE(sk->sk_err) ||
++ !skb_queue_empty_lockless(&sk->sk_error_queue))
+ mask |= EPOLLERR |
+ (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? EPOLLPRI : 0);
+
+- if (sk->sk_shutdown & RCV_SHUTDOWN)
++ shutdown = READ_ONCE(sk->sk_shutdown);
++ if (shutdown & RCV_SHUTDOWN)
+ mask |= EPOLLRDHUP | EPOLLIN | EPOLLRDNORM;
+- if (sk->sk_shutdown == SHUTDOWN_MASK)
++ if (shutdown == SHUTDOWN_MASK)
+ mask |= EPOLLHUP;
+
+ /* readable? */
+@@ -819,10 +822,12 @@ __poll_t datagram_poll(struct file *file, struct socket *sock,
+
+ /* Connection-based need to check for termination and startup */
+ if (connection_based(sk)) {
+- if (sk->sk_state == TCP_CLOSE)
++ int state = READ_ONCE(sk->sk_state);
++
++ if (state == TCP_CLOSE)
+ mask |= EPOLLHUP;
+ /* connection hasn't started yet? */
+- if (sk->sk_state == TCP_SYN_SENT)
++ if (state == TCP_SYN_SENT)
+ return mask;
+ }
+
+--
+2.39.2
+
--- /dev/null
+From 9361145b42164b02503a274e0ffb79940a72f6c1 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 18:29:48 +0000
+Subject: net: deal with most data-races in sk_wait_event()
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit d0ac89f6f9879fae316c155de77b5173b3e2c9c9 ]
+
+__condition is evaluated twice in sk_wait_event() macro.
+
+First invocation is lockless, and reads can race with writes,
+as spotted by syzbot.
+
+BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect
+
+write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
+tcp_disconnect+0x2cd/0xdb0
+inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
+__sys_shutdown_sock net/socket.c:2343 [inline]
+__sys_shutdown net/socket.c:2355 [inline]
+__do_sys_shutdown net/socket.c:2363 [inline]
+__se_sys_shutdown+0xf8/0x140 net/socket.c:2361
+__x64_sys_shutdown+0x31/0x40 net/socket.c:2361
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
+sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
+tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
+tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
+inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
+sock_sendmsg_nosec net/socket.c:724 [inline]
+sock_sendmsg net/socket.c:747 [inline]
+__sys_sendto+0x246/0x300 net/socket.c:2142
+__do_sys_sendto net/socket.c:2154 [inline]
+__se_sys_sendto net/socket.c:2150 [inline]
+__x64_sys_sendto+0x78/0x90 net/socket.c:2150
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+value changed: 0x00000000 -> 0x00000068
+
+Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/core/stream.c | 12 ++++++------
+ net/ipv4/tcp_bpf.c | 2 +-
+ net/llc/af_llc.c | 8 +++++---
+ net/smc/smc_close.c | 4 ++--
+ net/smc/smc_rx.c | 4 ++--
+ net/smc/smc_tx.c | 4 ++--
+ net/tipc/socket.c | 4 ++--
+ net/tls/tls_main.c | 3 ++-
+ 8 files changed, 22 insertions(+), 19 deletions(-)
+
+diff --git a/net/core/stream.c b/net/core/stream.c
+index 434446ab14c57..f5c4e47df1650 100644
+--- a/net/core/stream.c
++++ b/net/core/stream.c
+@@ -73,8 +73,8 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
+ add_wait_queue(sk_sleep(sk), &wait);
+ sk->sk_write_pending++;
+ done = sk_wait_event(sk, timeo_p,
+- !sk->sk_err &&
+- !((1 << sk->sk_state) &
++ !READ_ONCE(sk->sk_err) &&
++ !((1 << READ_ONCE(sk->sk_state)) &
+ ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
+ sk->sk_write_pending--;
+@@ -87,9 +87,9 @@ EXPORT_SYMBOL(sk_stream_wait_connect);
+ * sk_stream_closing - Return 1 if we still have things to send in our buffers.
+ * @sk: socket to verify
+ */
+-static inline int sk_stream_closing(struct sock *sk)
++static int sk_stream_closing(const struct sock *sk)
+ {
+- return (1 << sk->sk_state) &
++ return (1 << READ_ONCE(sk->sk_state)) &
+ (TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK);
+ }
+
+@@ -142,8 +142,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
+
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+ sk->sk_write_pending++;
+- sk_wait_event(sk, ¤t_timeo, sk->sk_err ||
+- (sk->sk_shutdown & SEND_SHUTDOWN) ||
++ sk_wait_event(sk, ¤t_timeo, READ_ONCE(sk->sk_err) ||
++ (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN) ||
+ (sk_stream_memory_free(sk) &&
+ !vm_wait), &wait);
+ sk->sk_write_pending--;
+diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
+index ebf9175119370..2e9547467edbe 100644
+--- a/net/ipv4/tcp_bpf.c
++++ b/net/ipv4/tcp_bpf.c
+@@ -168,7 +168,7 @@ static int tcp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
+ sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
+ ret = sk_wait_event(sk, &timeo,
+ !list_empty(&psock->ingress_msg) ||
+- !skb_queue_empty(&sk->sk_receive_queue), &wait);
++ !skb_queue_empty_lockless(&sk->sk_receive_queue), &wait);
+ sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk);
+ remove_wait_queue(sk_sleep(sk), &wait);
+ return ret;
+diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
+index da7fe94bea2eb..9ffbc667be6cf 100644
+--- a/net/llc/af_llc.c
++++ b/net/llc/af_llc.c
+@@ -583,7 +583,8 @@ static int llc_ui_wait_for_disc(struct sock *sk, long timeout)
+
+ add_wait_queue(sk_sleep(sk), &wait);
+ while (1) {
+- if (sk_wait_event(sk, &timeout, sk->sk_state == TCP_CLOSE, &wait))
++ if (sk_wait_event(sk, &timeout,
++ READ_ONCE(sk->sk_state) == TCP_CLOSE, &wait))
+ break;
+ rc = -ERESTARTSYS;
+ if (signal_pending(current))
+@@ -603,7 +604,8 @@ static bool llc_ui_wait_for_conn(struct sock *sk, long timeout)
+
+ add_wait_queue(sk_sleep(sk), &wait);
+ while (1) {
+- if (sk_wait_event(sk, &timeout, sk->sk_state != TCP_SYN_SENT, &wait))
++ if (sk_wait_event(sk, &timeout,
++ READ_ONCE(sk->sk_state) != TCP_SYN_SENT, &wait))
+ break;
+ if (signal_pending(current) || !timeout)
+ break;
+@@ -622,7 +624,7 @@ static int llc_ui_wait_for_busy_core(struct sock *sk, long timeout)
+ while (1) {
+ rc = 0;
+ if (sk_wait_event(sk, &timeout,
+- (sk->sk_shutdown & RCV_SHUTDOWN) ||
++ (READ_ONCE(sk->sk_shutdown) & RCV_SHUTDOWN) ||
+ (!llc_data_accept_state(llc->state) &&
+ !llc->remote_busy_flag &&
+ !llc->p_flag), &wait))
+diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
+index 31db7438857c9..dbdf03e8aa5b5 100644
+--- a/net/smc/smc_close.c
++++ b/net/smc/smc_close.c
+@@ -67,8 +67,8 @@ static void smc_close_stream_wait(struct smc_sock *smc, long timeout)
+
+ rc = sk_wait_event(sk, &timeout,
+ !smc_tx_prepared_sends(&smc->conn) ||
+- sk->sk_err == ECONNABORTED ||
+- sk->sk_err == ECONNRESET ||
++ READ_ONCE(sk->sk_err) == ECONNABORTED ||
++ READ_ONCE(sk->sk_err) == ECONNRESET ||
+ smc->conn.killed,
+ &wait);
+ if (rc)
+diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
+index 4380d32f5a5f9..9a2f3638d161d 100644
+--- a/net/smc/smc_rx.c
++++ b/net/smc/smc_rx.c
+@@ -267,9 +267,9 @@ int smc_rx_wait(struct smc_sock *smc, long *timeo,
+ sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
+ add_wait_queue(sk_sleep(sk), &wait);
+ rc = sk_wait_event(sk, timeo,
+- sk->sk_err ||
++ READ_ONCE(sk->sk_err) ||
+ cflags->peer_conn_abort ||
+- sk->sk_shutdown & RCV_SHUTDOWN ||
++ READ_ONCE(sk->sk_shutdown) & RCV_SHUTDOWN ||
+ conn->killed ||
+ fcrit(conn),
+ &wait);
+diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
+index f4b6a71ac488a..45128443f1f10 100644
+--- a/net/smc/smc_tx.c
++++ b/net/smc/smc_tx.c
+@@ -113,8 +113,8 @@ static int smc_tx_wait(struct smc_sock *smc, int flags)
+ break; /* at least 1 byte of free & no urgent data */
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+ sk_wait_event(sk, &timeo,
+- sk->sk_err ||
+- (sk->sk_shutdown & SEND_SHUTDOWN) ||
++ READ_ONCE(sk->sk_err) ||
++ (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN) ||
+ smc_cdc_rxed_any_close(conn) ||
+ (atomic_read(&conn->sndbuf_space) &&
+ !conn->urg_tx_pend),
+diff --git a/net/tipc/socket.c b/net/tipc/socket.c
+index 37edfe10f8c6f..dd73d71c02a99 100644
+--- a/net/tipc/socket.c
++++ b/net/tipc/socket.c
+@@ -314,9 +314,9 @@ static void tsk_rej_rx_queue(struct sock *sk, int error)
+ tipc_sk_respond(sk, skb, error);
+ }
+
+-static bool tipc_sk_connected(struct sock *sk)
++static bool tipc_sk_connected(const struct sock *sk)
+ {
+- return sk->sk_state == TIPC_ESTABLISHED;
++ return READ_ONCE(sk->sk_state) == TIPC_ESTABLISHED;
+ }
+
+ /* tipc_sk_type_connectionless - check if the socket is datagram socket
+diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
+index b32c112984dd9..f2e7302a4d96b 100644
+--- a/net/tls/tls_main.c
++++ b/net/tls/tls_main.c
+@@ -111,7 +111,8 @@ int wait_on_pending_writer(struct sock *sk, long *timeo)
+ break;
+ }
+
+- if (sk_wait_event(sk, timeo, !sk->sk_write_pending, &wait))
++ if (sk_wait_event(sk, timeo,
++ !READ_ONCE(sk->sk_write_pending), &wait))
+ break;
+ }
+ remove_wait_queue(sk_sleep(sk), &wait);
+--
+2.39.2
+
--- /dev/null
+From cd6ded14ed341f1dcbea2e1952cab31816d77551 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 May 2023 10:55:43 -0700
+Subject: net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
+
+From: Kuniyuki Iwashima <kuniyu@amazon.com>
+
+[ Upstream commit dfd9248c071a3710c24365897459538551cb7167 ]
+
+KCSAN found a data race in sock_recv_cmsgs() where the read access
+to sk->sk_stamp needs READ_ONCE().
+
+BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg
+
+write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0:
+ sock_write_timestamp include/net/sock.h:2670 [inline]
+ sock_recv_cmsgs include/net/sock.h:2722 [inline]
+ packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489
+ sock_recvmsg_nosec net/socket.c:1019 [inline]
+ sock_recvmsg+0x11a/0x130 net/socket.c:1040
+ sock_read_iter+0x176/0x220 net/socket.c:1118
+ call_read_iter include/linux/fs.h:1845 [inline]
+ new_sync_read fs/read_write.c:389 [inline]
+ vfs_read+0x5e0/0x630 fs/read_write.c:470
+ ksys_read+0x163/0x1a0 fs/read_write.c:613
+ __do_sys_read fs/read_write.c:623 [inline]
+ __se_sys_read fs/read_write.c:621 [inline]
+ __x64_sys_read+0x41/0x50 fs/read_write.c:621
+ do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+ do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1:
+ sock_recv_cmsgs include/net/sock.h:2721 [inline]
+ packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489
+ sock_recvmsg_nosec net/socket.c:1019 [inline]
+ sock_recvmsg+0x11a/0x130 net/socket.c:1040
+ sock_read_iter+0x176/0x220 net/socket.c:1118
+ call_read_iter include/linux/fs.h:1845 [inline]
+ new_sync_read fs/read_write.c:389 [inline]
+ vfs_read+0x5e0/0x630 fs/read_write.c:470
+ ksys_read+0x163/0x1a0 fs/read_write.c:613
+ __do_sys_read fs/read_write.c:623 [inline]
+ __se_sys_read fs/read_write.c:621 [inline]
+ __x64_sys_read+0x41/0x50 fs/read_write.c:621
+ do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+ do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+
+value changed: 0xffffffffc4653600 -> 0x0000000000000000
+
+Reported by Kernel Concurrency Sanitizer on:
+CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
+Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
+
+Fixes: 6c7c98bad488 ("sock: avoid dirtying sk_stamp, if possible")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
+Reviewed-by: Eric Dumazet <edumazet@google.com>
+Link: https://lore.kernel.org/r/20230508175543.55756-1-kuniyu@amazon.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ include/net/sock.h | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 573f2bf7e0de7..9cd0354221507 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -2718,7 +2718,7 @@ static inline void sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
+ __sock_recv_cmsgs(msg, sk, skb);
+ else if (unlikely(sock_flag(sk, SOCK_TIMESTAMP)))
+ sock_write_timestamp(sk, skb->tstamp);
+- else if (unlikely(sk->sk_stamp == SK_DEFAULT_STAMP))
++ else if (unlikely(sock_read_timestamp(sk) == SK_DEFAULT_STAMP))
+ sock_write_timestamp(sk, 0);
+ }
+
+--
+2.39.2
+
--- /dev/null
+From fffd92d1c075a8b89f55a4d0ffcb8e88235e0402 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 May 2023 20:39:33 +0200
+Subject: net: mdio: mvusb: Fix an error handling path in mvusb_mdio_probe()
+
+From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
+
+[ Upstream commit 27c1eaa07283b0c94becf8241f95368267cf558b ]
+
+Should of_mdiobus_register() fail, a previous usb_get_dev() call should be
+undone as in the .disconnect function.
+
+Fixes: 04e37d92fbed ("net: phy: add marvell usb to mdio controller")
+Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
+Reviewed-by: Simon Horman <simon.horman@corigine.com>
+Reviewed-by: Andrew Lunn <andrew@lunn.ch>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/mdio/mdio-mvusb.c | 11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/mdio/mdio-mvusb.c b/drivers/net/mdio/mdio-mvusb.c
+index 68fc55906e788..554837c21e73c 100644
+--- a/drivers/net/mdio/mdio-mvusb.c
++++ b/drivers/net/mdio/mdio-mvusb.c
+@@ -67,6 +67,7 @@ static int mvusb_mdio_probe(struct usb_interface *interface,
+ struct device *dev = &interface->dev;
+ struct mvusb_mdio *mvusb;
+ struct mii_bus *mdio;
++ int ret;
+
+ mdio = devm_mdiobus_alloc_size(dev, sizeof(*mvusb));
+ if (!mdio)
+@@ -87,7 +88,15 @@ static int mvusb_mdio_probe(struct usb_interface *interface,
+ mdio->write = mvusb_mdio_write;
+
+ usb_set_intfdata(interface, mvusb);
+- return of_mdiobus_register(mdio, dev->of_node);
++ ret = of_mdiobus_register(mdio, dev->of_node);
++ if (ret)
++ goto put_dev;
++
++ return 0;
++
++put_dev:
++ usb_put_dev(mvusb->udev);
++ return ret;
+ }
+
+ static void mvusb_mdio_disconnect(struct usb_interface *interface)
+--
+2.39.2
+
--- /dev/null
+From 0810204147d98d5668fdfde007fadfd22f2d67ca Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 21:48:51 -0700
+Subject: net: mscc: ocelot: fix stat counter register values
+
+From: Colin Foster <colin.foster@in-advantage.com>
+
+[ Upstream commit cdc2e28e214fe9315cdd7e069c1c8e2428f93427 ]
+
+Commit d4c367650704 ("net: mscc: ocelot: keep ocelot_stat_layout by reg
+address, not offset") organized the stats counters for Ocelot chips, namely
+the VSC7512 and VSC7514. A few of the counter offsets were incorrect, and
+were caught by this warning:
+
+WARNING: CPU: 0 PID: 24 at drivers/net/ethernet/mscc/ocelot_stats.c:909
+ocelot_stats_init+0x1fc/0x2d8
+reg 0x5000078 had address 0x220 but reg 0x5000079 has address 0x214,
+bulking broken!
+
+Fix these register offsets.
+
+Fixes: d4c367650704 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
+Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
+Reviewed-by: Simon Horman <simon.horman@corigine.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/mscc/vsc7514_regs.c | 18 +++++++++---------
+ 1 file changed, 9 insertions(+), 9 deletions(-)
+
+diff --git a/drivers/net/ethernet/mscc/vsc7514_regs.c b/drivers/net/ethernet/mscc/vsc7514_regs.c
+index ef6fd3f6be309..5595bfe84bbbb 100644
+--- a/drivers/net/ethernet/mscc/vsc7514_regs.c
++++ b/drivers/net/ethernet/mscc/vsc7514_regs.c
+@@ -307,15 +307,15 @@ static const u32 vsc7514_sys_regmap[] = {
+ REG(SYS_COUNT_DROP_YELLOW_PRIO_4, 0x000218),
+ REG(SYS_COUNT_DROP_YELLOW_PRIO_5, 0x00021c),
+ REG(SYS_COUNT_DROP_YELLOW_PRIO_6, 0x000220),
+- REG(SYS_COUNT_DROP_YELLOW_PRIO_7, 0x000214),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_0, 0x000218),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_1, 0x00021c),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_2, 0x000220),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_3, 0x000224),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_4, 0x000228),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_5, 0x00022c),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_6, 0x000230),
+- REG(SYS_COUNT_DROP_GREEN_PRIO_7, 0x000234),
++ REG(SYS_COUNT_DROP_YELLOW_PRIO_7, 0x000224),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_0, 0x000228),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_1, 0x00022c),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_2, 0x000230),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_3, 0x000234),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_4, 0x000238),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_5, 0x00023c),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_6, 0x000240),
++ REG(SYS_COUNT_DROP_GREEN_PRIO_7, 0x000244),
+ REG(SYS_RESET_CFG, 0x000508),
+ REG(SYS_CMID, 0x00050c),
+ REG(SYS_VLAN_ETYPE_CFG, 0x000510),
+--
+2.39.2
+
--- /dev/null
+From 16afd432371c0a79792b5a7a1f338b52541020ad Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Mon, 8 May 2023 16:17:49 -0700
+Subject: net: phy: bcm7xx: Correct read from expansion register
+
+From: Florian Fainelli <f.fainelli@gmail.com>
+
+[ Upstream commit 582dbb2cc1a0a7427840f5b1e3c65608e511b061 ]
+
+Since the driver works in the "legacy" addressing mode, we need to write
+to the expansion register (0x17) with bits 11:8 set to 0xf to properly
+select the expansion register passed as argument.
+
+Fixes: f68d08c437f9 ("net: phy: bcm7xxx: Add EPHY entry for 72165")
+Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
+Reviewed-by: Simon Horman <simon.horman@corigine.com>
+Link: https://lore.kernel.org/r/20230508231749.1681169-1-f.fainelli@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/phy/bcm-phy-lib.h | 5 +++++
+ drivers/net/phy/bcm7xxx.c | 2 +-
+ 2 files changed, 6 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/phy/bcm-phy-lib.h b/drivers/net/phy/bcm-phy-lib.h
+index 9902fb1820997..729db441797a0 100644
+--- a/drivers/net/phy/bcm-phy-lib.h
++++ b/drivers/net/phy/bcm-phy-lib.h
+@@ -40,6 +40,11 @@ static inline int bcm_phy_write_exp_sel(struct phy_device *phydev,
+ return bcm_phy_write_exp(phydev, reg | MII_BCM54XX_EXP_SEL_ER, val);
+ }
+
++static inline int bcm_phy_read_exp_sel(struct phy_device *phydev, u16 reg)
++{
++ return bcm_phy_read_exp(phydev, reg | MII_BCM54XX_EXP_SEL_ER);
++}
++
+ int bcm54xx_auxctl_write(struct phy_device *phydev, u16 regnum, u16 val);
+ int bcm54xx_auxctl_read(struct phy_device *phydev, u16 regnum);
+
+diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
+index 75593e7d1118f..6cebf3aaa621f 100644
+--- a/drivers/net/phy/bcm7xxx.c
++++ b/drivers/net/phy/bcm7xxx.c
+@@ -487,7 +487,7 @@ static int bcm7xxx_16nm_ephy_afe_config(struct phy_device *phydev)
+ bcm_phy_write_misc(phydev, 0x0038, 0x0002, 0xede0);
+
+ /* Read CORE_EXPA9 */
+- tmp = bcm_phy_read_exp(phydev, 0x00a9);
++ tmp = bcm_phy_read_exp_sel(phydev, 0x00a9);
+ /* CORE_EXPA9[6:1] is rcalcode[5:0] */
+ rcalcode = (tmp & 0x7e) / 2;
+ /* Correct RCAL code + 1 is -1% rprogr, LP: +16 */
+--
+2.39.2
+
--- /dev/null
+From 0ec994ab3534666759946c8051429ccffd8f0ade Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Fri, 5 May 2023 17:06:18 +0000
+Subject: net: skb_partial_csum_set() fix against transport header magic value
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit 424f8416bb39936df6365442d651ee729b283460 ]
+
+skb->transport_header uses the special 0xFFFF value
+to mark if the transport header was set or not.
+
+We must prevent callers to accidentaly set skb->transport_header
+to 0xFFFF. Note that only fuzzers can possibly do this today.
+
+syzbot reported:
+
+WARNING: CPU: 0 PID: 2340 at include/linux/skbuff.h:2847 skb_transport_offset include/linux/skbuff.h:2956 [inline]
+WARNING: CPU: 0 PID: 2340 at include/linux/skbuff.h:2847 virtio_net_hdr_to_skb+0xbcc/0x10c0 include/linux/virtio_net.h:103
+Modules linked in:
+CPU: 0 PID: 2340 Comm: syz-executor.0 Not tainted 6.3.0-syzkaller #0
+Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
+RIP: 0010:skb_transport_header include/linux/skbuff.h:2847 [inline]
+RIP: 0010:skb_transport_offset include/linux/skbuff.h:2956 [inline]
+RIP: 0010:virtio_net_hdr_to_skb+0xbcc/0x10c0 include/linux/virtio_net.h:103
+Code: 41 39 df 0f 82 c3 04 00 00 48 8b 7c 24 10 44 89 e6 e8 08 6e 59 ff 48 85 c0 74 54 e8 ce 36 7e fc e9 37 f8 ff ff e8 c4 36 7e fc <0f> 0b e9 93 f8 ff ff 44 89 f7 44 89 e6 e8 32 38 7e fc 45 39 e6 0f
+RSP: 0018:ffffc90004497880 EFLAGS: 00010293
+RAX: ffffffff84fea55c RBX: 000000000000ffff RCX: ffff888120be2100
+RDX: 0000000000000000 RSI: 000000000000ffff RDI: 000000000000ffff
+RBP: ffffc90004497990 R08: ffffffff84fe9de5 R09: 0000000000000034
+R10: ffffea00048ebd80 R11: 0000000000000034 R12: ffff88811dc2d9c8
+R13: dffffc0000000000 R14: ffff88811dc2d9ae R15: 1ffff11023b85b35
+FS: 00007f9211a59700(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 00000000200002c0 CR3: 00000001215a5000 CR4: 00000000003506f0
+DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
+DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
+Call Trace:
+<TASK>
+packet_snd net/packet/af_packet.c:3076 [inline]
+packet_sendmsg+0x4590/0x61a0 net/packet/af_packet.c:3115
+sock_sendmsg_nosec net/socket.c:724 [inline]
+sock_sendmsg net/socket.c:747 [inline]
+__sys_sendto+0x472/0x630 net/socket.c:2144
+__do_sys_sendto net/socket.c:2156 [inline]
+__se_sys_sendto net/socket.c:2152 [inline]
+__x64_sys_sendto+0xe5/0x100 net/socket.c:2152
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+RIP: 0033:0x7f9210c8c169
+Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
+RSP: 002b:00007f9211a59168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
+RAX: ffffffffffffffda RBX: 00007f9210dabf80 RCX: 00007f9210c8c169
+RDX: 000000000000ffed RSI: 00000000200000c0 RDI: 0000000000000003
+RBP: 00007f9210ce7ca1 R08: 0000000020000540 R09: 0000000000000014
+R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
+R13: 00007ffe135d65cf R14: 00007f9211a59300 R15: 0000000000022000
+
+Fixes: 66e4c8d95008 ("net: warn if transport header was not set")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Cc: Willem de Bruijn <willemb@google.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/core/skbuff.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index 14bb41aafee30..afec5e2c21ac0 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -5243,7 +5243,7 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off)
+ u32 csum_end = (u32)start + (u32)off + sizeof(__sum16);
+ u32 csum_start = skb_headroom(skb) + (u32)start;
+
+- if (unlikely(csum_start > U16_MAX || csum_end > skb_headlen(skb))) {
++ if (unlikely(csum_start >= U16_MAX || csum_end > skb_headlen(skb))) {
+ net_warn_ratelimited("bad partial csum: csum=%u/%u headroom=%u headlen=%u\n",
+ start, off, skb_headroom(skb), skb_headlen(skb));
+ return false;
+@@ -5251,7 +5251,7 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off)
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ skb->csum_start = csum_start;
+ skb->csum_offset = off;
+- skb_set_transport_header(skb, start);
++ skb->transport_header = csum_start;
+ return true;
+ }
+ EXPORT_SYMBOL_GPL(skb_partial_csum_set);
+--
+2.39.2
+
--- /dev/null
+From af422c3009ce35808c0179babb34f3c02f12b391 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sun, 7 May 2023 01:58:45 +0200
+Subject: net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
+
+From: Marek Vasut <marex@denx.de>
+
+[ Upstream commit 8efbdbfa99381a017dd2c0f6375a7d80a8118b74 ]
+
+Initialize MAC_ONEUS_TIC_COUNTER register with correct value derived
+from CSR clock, otherwise EEE is unstable on at least NXP i.MX8M Plus
+and Micrel KSZ9131RNX PHY, to the point where not even ARP request can
+be sent out.
+
+i.MX 8M Plus Applications Processor Reference Manual, Rev. 1, 06/2021
+11.7.6.1.34 One-microsecond Reference Timer (MAC_ONEUS_TIC_COUNTER)
+defines this register as:
+"
+This register controls the generation of the Reference time (1 microsecond
+tic) for all the LPI timers. This timer has to be programmed by the software
+initially.
+...
+The application must program this counter so that the number of clock cycles
+of CSR clock is 1us. (Subtract 1 from the value before programming).
+For example if the CSR clock is 100MHz then this field needs to be programmed
+to value 100 - 1 = 99 (which is 0x63).
+This is required to generate the 1US events that are used to update some of
+the EEE related counters.
+"
+
+The reset value is 0x63 on i.MX8M Plus, which means expected CSR clock are
+100 MHz. However, the i.MX8M Plus "enet_qos_root_clk" are 266 MHz instead,
+which means the LPI timers reach their count much sooner on this platform.
+
+This is visible using a scope by monitoring e.g. exit from LPI mode on TX_CTL
+line from MAC to PHY. This should take 30us per STMMAC_DEFAULT_TWT_LS setting,
+during which the TX_CTL line transitions from tristate to low, and 30 us later
+from low to high. On i.MX8M Plus, this transition takes 11 us, which matches
+the 30us * 100/266 formula for misconfigured MAC_ONEUS_TIC_COUNTER register.
+
+Configure MAC_ONEUS_TIC_COUNTER based on CSR clock, so that the LPI timers
+have correct 1us reference. This then fixes EEE on i.MX8M Plus with Micrel
+KSZ9131RNX PHY.
+
+Fixes: 477286b53f55 ("stmmac: add GMAC4 core support")
+Signed-off-by: Marek Vasut <marex@denx.de>
+Tested-by: Harald Seiler <hws@denx.de>
+Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
+Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Verdin iMX8MP
+Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
+Link: https://lore.kernel.org/r/20230506235845.246105-1-marex@denx.de
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/ethernet/stmicro/stmmac/dwmac4.h | 1 +
+ drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 5 +++++
+ 2 files changed, 6 insertions(+)
+
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
+index ccd49346d3b30..a70b0d8a622d6 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
+@@ -181,6 +181,7 @@ enum power_event {
+ #define GMAC4_LPI_CTRL_STATUS 0xd0
+ #define GMAC4_LPI_TIMER_CTRL 0xd4
+ #define GMAC4_LPI_ENTRY_TIMER 0xd8
++#define GMAC4_MAC_ONEUS_TIC_COUNTER 0xdc
+
+ /* LPI control and status defines */
+ #define GMAC4_LPI_CTRL_STATUS_LPITCSE BIT(21) /* LPI Tx Clock Stop Enable */
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+index 36251ec2589c9..24d6ec06732d9 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+@@ -25,6 +25,7 @@ static void dwmac4_core_init(struct mac_device_info *hw,
+ struct stmmac_priv *priv = netdev_priv(dev);
+ void __iomem *ioaddr = hw->pcsr;
+ u32 value = readl(ioaddr + GMAC_CONFIG);
++ u32 clk_rate;
+
+ value |= GMAC_CORE_INIT;
+
+@@ -47,6 +48,10 @@ static void dwmac4_core_init(struct mac_device_info *hw,
+
+ writel(value, ioaddr + GMAC_CONFIG);
+
++ /* Configure LPI 1us counter to number of CSR clock ticks in 1us - 1 */
++ clk_rate = clk_get_rate(priv->plat->stmmac_clk);
++ writel((clk_rate / 1000000) - 1, ioaddr + GMAC4_MAC_ONEUS_TIC_COUNTER);
++
+ /* Enable GMAC interrupts */
+ value = GMAC_INT_DEFAULT_ENABLE;
+
+--
+2.39.2
+
--- /dev/null
+From d1ed227baa56429382bc15fcc59fb2e6d965e2e9 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 May 2023 14:55:02 +0200
+Subject: netfilter: conntrack: fix possible bug_on with enable_hooks=1
+
+From: Florian Westphal <fw@strlen.de>
+
+[ Upstream commit e72eeab542dbf4f544e389e64fa13b82a1b6d003 ]
+
+I received a bug report (no reproducer so far) where we trip over
+
+712 rcu_read_lock();
+713 ct_hook = rcu_dereference(nf_ct_hook);
+714 BUG_ON(ct_hook == NULL); // here
+
+In nf_conntrack_destroy().
+
+First turn this BUG_ON into a WARN. I think it was triggered
+via enable_hooks=1 flag.
+
+When this flag is turned on, the conntrack hooks are registered
+before nf_ct_hook pointer gets assigned.
+This opens a short window where packets enter the conntrack machinery,
+can have skb->_nfct set up and a subsequent kfree_skb might occur
+before nf_ct_hook is set.
+
+Call nf_conntrack_init_end() to set nf_ct_hook before we register the
+pernet ops.
+
+Fixes: ba3fbe663635 ("netfilter: nf_conntrack: provide modparam to always register conntrack hooks")
+Signed-off-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/netfilter/core.c | 6 ++++--
+ net/netfilter/nf_conntrack_standalone.c | 3 ++-
+ 2 files changed, 6 insertions(+), 3 deletions(-)
+
+diff --git a/net/netfilter/core.c b/net/netfilter/core.c
+index 358220b585215..edf92074221e2 100644
+--- a/net/netfilter/core.c
++++ b/net/netfilter/core.c
+@@ -699,9 +699,11 @@ void nf_conntrack_destroy(struct nf_conntrack *nfct)
+
+ rcu_read_lock();
+ ct_hook = rcu_dereference(nf_ct_hook);
+- BUG_ON(ct_hook == NULL);
+- ct_hook->destroy(nfct);
++ if (ct_hook)
++ ct_hook->destroy(nfct);
+ rcu_read_unlock();
++
++ WARN_ON(!ct_hook);
+ }
+ EXPORT_SYMBOL(nf_conntrack_destroy);
+
+diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
+index 57f6724c99a76..169e16fc2bceb 100644
+--- a/net/netfilter/nf_conntrack_standalone.c
++++ b/net/netfilter/nf_conntrack_standalone.c
+@@ -1218,11 +1218,12 @@ static int __init nf_conntrack_standalone_init(void)
+ nf_conntrack_htable_size_user = nf_conntrack_htable_size;
+ #endif
+
++ nf_conntrack_init_end();
++
+ ret = register_pernet_subsys(&nf_conntrack_net_ops);
+ if (ret < 0)
+ goto out_pernet;
+
+- nf_conntrack_init_end();
+ return 0;
+
+ out_pernet:
+--
+2.39.2
+
--- /dev/null
+From bf9619a868568156160f37e56b2e97c0c59e9764 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 4 May 2023 14:20:21 +0200
+Subject: netfilter: nf_tables: always release netdev hooks from notifier
+
+From: Florian Westphal <fw@strlen.de>
+
+[ Upstream commit dc1c9fd4a8bbe1e06add9053010b652449bfe411 ]
+
+This reverts "netfilter: nf_tables: skip netdev events generated on netns removal".
+
+The problem is that when a veth device is released, the veth release
+callback will also queue the peer netns device for removal.
+
+Its possible that the peer netns is also slated for removal. In this
+case, the device memory is already released before the pre_exit hook of
+the peer netns runs:
+
+BUG: KASAN: slab-use-after-free in nf_hook_entry_head+0x1b8/0x1d0
+Read of size 8 at addr ffff88812c0124f0 by task kworker/u8:1/45
+Workqueue: netns cleanup_net
+Call Trace:
+ nf_hook_entry_head+0x1b8/0x1d0
+ __nf_unregister_net_hook+0x76/0x510
+ nft_netdev_unregister_hooks+0xa0/0x220
+ __nft_release_hook+0x184/0x490
+ nf_tables_pre_exit_net+0x12f/0x1b0
+ ..
+
+Order is:
+1. First netns is released, veth_dellink() queues peer netns device
+ for removal
+2. peer netns is queued for removal
+3. peer netns device is released, unreg event is triggered
+4. unreg event is ignored because netns is going down
+5. pre_exit hook calls nft_netdev_unregister_hooks but device memory
+ might be free'd already.
+
+Fixes: 68a3765c659f ("netfilter: nf_tables: skip netdev events generated on netns removal")
+Signed-off-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/netfilter/nft_chain_filter.c | 9 ++++++---
+ 1 file changed, 6 insertions(+), 3 deletions(-)
+
+diff --git a/net/netfilter/nft_chain_filter.c b/net/netfilter/nft_chain_filter.c
+index c3563f0be2692..680fe557686e4 100644
+--- a/net/netfilter/nft_chain_filter.c
++++ b/net/netfilter/nft_chain_filter.c
+@@ -344,6 +344,12 @@ static void nft_netdev_event(unsigned long event, struct net_device *dev,
+ return;
+ }
+
++ /* UNREGISTER events are also happening on netns exit.
++ *
++ * Although nf_tables core releases all tables/chains, only this event
++ * handler provides guarantee that hook->ops.dev is still accessible,
++ * so we cannot skip exiting net namespaces.
++ */
+ __nft_release_basechain(ctx);
+ }
+
+@@ -362,9 +368,6 @@ static int nf_tables_netdev_event(struct notifier_block *this,
+ event != NETDEV_CHANGENAME)
+ return NOTIFY_DONE;
+
+- if (!check_net(ctx.net))
+- return NOTIFY_DONE;
+-
+ nft_net = nft_pernet(ctx.net);
+ mutex_lock(&nft_net->commit_mutex);
+ list_for_each_entry(table, &nft_net->tables, list) {
+--
+2.39.2
+
--- /dev/null
+From 61e89a9d06bd3cc75aaee640bd5ef2db9af6a3f9 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 16:56:34 +0000
+Subject: netlink: annotate accesses to nlk->cb_running
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit a939d14919b799e6fff8a9c80296ca229ba2f8a4 ]
+
+Both netlink_recvmsg() and netlink_native_seq_show() read
+nlk->cb_running locklessly. Use READ_ONCE() there.
+
+Add corresponding WRITE_ONCE() to netlink_dump() and
+__netlink_dump_start()
+
+syzbot reported:
+BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg
+
+write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0:
+__netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399
+netlink_dump_start include/linux/netlink.h:308 [inline]
+rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130
+netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577
+rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192
+netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
+netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
+netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942
+sock_sendmsg_nosec net/socket.c:724 [inline]
+sock_sendmsg net/socket.c:747 [inline]
+sock_write_iter+0x1aa/0x230 net/socket.c:1138
+call_write_iter include/linux/fs.h:1851 [inline]
+new_sync_write fs/read_write.c:491 [inline]
+vfs_write+0x463/0x760 fs/read_write.c:584
+ksys_write+0xeb/0x1a0 fs/read_write.c:637
+__do_sys_write fs/read_write.c:649 [inline]
+__se_sys_write fs/read_write.c:646 [inline]
+__x64_sys_write+0x42/0x50 fs/read_write.c:646
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1:
+netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022
+sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017
+____sys_recvmsg+0x2db/0x310 net/socket.c:2718
+___sys_recvmsg net/socket.c:2762 [inline]
+do_recvmmsg+0x2e5/0x710 net/socket.c:2856
+__sys_recvmmsg net/socket.c:2935 [inline]
+__do_sys_recvmmsg net/socket.c:2958 [inline]
+__se_sys_recvmmsg net/socket.c:2951 [inline]
+__x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+value changed: 0x00 -> 0x01
+
+Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/netlink/af_netlink.c | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
+index 9b6eb28e6e94f..45d47b39de225 100644
+--- a/net/netlink/af_netlink.c
++++ b/net/netlink/af_netlink.c
+@@ -1990,7 +1990,7 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+
+ skb_free_datagram(sk, skb);
+
+- if (nlk->cb_running &&
++ if (READ_ONCE(nlk->cb_running) &&
+ atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
+ ret = netlink_dump(sk);
+ if (ret) {
+@@ -2304,7 +2304,7 @@ static int netlink_dump(struct sock *sk)
+ if (cb->done)
+ cb->done(cb);
+
+- nlk->cb_running = false;
++ WRITE_ONCE(nlk->cb_running, false);
+ module = cb->module;
+ skb = cb->skb;
+ mutex_unlock(nlk->cb_mutex);
+@@ -2367,7 +2367,7 @@ int __netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
+ goto error_put;
+ }
+
+- nlk->cb_running = true;
++ WRITE_ONCE(nlk->cb_running, true);
+ nlk->dump_done_errno = INT_MAX;
+
+ mutex_unlock(nlk->cb_mutex);
+@@ -2705,7 +2705,7 @@ static int netlink_native_seq_show(struct seq_file *seq, void *v)
+ nlk->groups ? (u32)nlk->groups[0] : 0,
+ sk_rmem_alloc_get(s),
+ sk_wmem_alloc_get(s),
+- nlk->cb_running,
++ READ_ONCE(nlk->cb_running),
+ refcount_read(&s->sk_refcnt),
+ atomic_read(&s->sk_drops),
+ sock_i_ino(s)
+--
+2.39.2
+
--- /dev/null
+From b9749dfe227fcd3ccccca58f96a4bba205c3e3e0 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 25 Apr 2023 10:32:17 +0000
+Subject: perf/core: Fix perf_sample_data not properly initialized for
+ different swevents in perf_tp_event()
+
+From: Yang Jihong <yangjihong1@huawei.com>
+
+[ Upstream commit 1d1bfe30dad50d4bea83cd38d73c441972ea0173 ]
+
+data->sample_flags may be modified in perf_prepare_sample(),
+in perf_tp_event(), different swevents use the same on-stack
+perf_sample_data, the previous swevent may change sample_flags in
+perf_prepare_sample(), as a result, some members of perf_sample_data are
+not correctly initialized when next swevent_event preparing sample
+(for example data->id, the value varies according to swevent).
+
+A simple scenario triggers this problem is as follows:
+
+ # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209014396 ]
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209014662 ]
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209014910 ]
+ [ perf record: Woken up 0 times to write data ]
+ [ perf record: Dump perf.data.2023041209015164 ]
+ [ perf record: Captured and wrote 0.069 MB perf.data.<timestamp> ]
+ # ls -l
+ total 860
+ -rw------- 1 root root 95694 Apr 12 09:01 perf.data.2023041209014396
+ -rw------- 1 root root 606430 Apr 12 09:01 perf.data.2023041209014662
+ -rw------- 1 root root 82246 Apr 12 09:01 perf.data.2023041209014910
+ -rw------- 1 root root 82342 Apr 12 09:01 perf.data.2023041209015164
+ # perf script -i perf.data.2023041209014396
+ 0x11d58 [0x80]: failed to process type: 9 [Bad address]
+
+Solution: Re-initialize perf_sample_data after each event is processed.
+Note that data->raw->frag.data may be accessed in perf_tp_event_match().
+Therefore, need to init sample_data and then go through swevent hlist to prevent
+reference of NULL pointer, reported by [1].
+
+After fix:
+
+ # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209442259 ]
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209442514 ]
+ [ perf record: dump data: Woken up 0 times ]
+ [ perf record: Dump perf.data.2023041209442760 ]
+ [ perf record: Woken up 0 times to write data ]
+ [ perf record: Dump perf.data.2023041209443003 ]
+ [ perf record: Captured and wrote 0.069 MB perf.data.<timestamp> ]
+ # ls -l
+ total 864
+ -rw------- 1 root root 100166 Apr 12 09:44 perf.data.2023041209442259
+ -rw------- 1 root root 606438 Apr 12 09:44 perf.data.2023041209442514
+ -rw------- 1 root root 82246 Apr 12 09:44 perf.data.2023041209442760
+ -rw------- 1 root root 82342 Apr 12 09:44 perf.data.2023041209443003
+ # perf script -i perf.data.2023041209442259 | head -n 5
+ perf 232 [000] 66.846217: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=D ==> next_comm=perf next_pid=234 next_prio=120
+ perf 234 [000] 66.846449: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=232 next_prio=120
+ perf 232 [000] 66.846546: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=234 next_prio=120
+ perf 234 [000] 66.846606: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=232 next_prio=120
+ perf 232 [000] 66.846646: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=234 next_prio=120
+
+[1] Link: https://lore.kernel.org/oe-lkp/202304250929.efef2caa-yujie.liu@intel.com
+
+Fixes: bb447c27a467 ("perf/core: Set data->sample_flags in perf_prepare_sample()")
+Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Link: https://lkml.kernel.org/r/20230425103217.130600-1-yangjihong1@huawei.com
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ kernel/events/core.c | 14 +++++++++++++-
+ 1 file changed, 13 insertions(+), 1 deletion(-)
+
+diff --git a/kernel/events/core.c b/kernel/events/core.c
+index 68baa8194d9f8..db016e4189319 100644
+--- a/kernel/events/core.c
++++ b/kernel/events/core.c
+@@ -10150,8 +10150,20 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
+ perf_trace_buf_update(record, event_type);
+
+ hlist_for_each_entry_rcu(event, head, hlist_entry) {
+- if (perf_tp_event_match(event, &data, regs))
++ if (perf_tp_event_match(event, &data, regs)) {
+ perf_swevent_event(event, count, &data, regs);
++
++ /*
++ * Here use the same on-stack perf_sample_data,
++ * some members in data are event-specific and
++ * need to be re-computed for different swevents.
++ * Re-initialize data->sample_flags safely to avoid
++ * the problem that next event skips preparing data
++ * because data->sample_flags is set.
++ */
++ perf_sample_data_init(&data, 0, 0);
++ perf_sample_save_raw_data(&data, &raw);
++ }
+ }
+
+ /*
+--
+2.39.2
+
--- /dev/null
+From 817bff30d0d2c033b4d676a852b0719932e5187d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 25 Apr 2023 12:17:21 +0900
+Subject: scsi: ufs: core: Fix I/O hang that occurs when BKOPS fails in W-LUN
+ suspend
+
+From: Keoseong Park <keosung.park@samsung.com>
+
+[ Upstream commit 1a7edd041f2d252f251523ba3f2eaead076a8f8d ]
+
+Even when urgent BKOPS fails, the consumer will get stuck in runtime
+suspend status. Like commit 1a5665fc8d7a ("scsi: ufs: core: WLUN suspend
+SSU/enter hibern8 fail recovery"), trigger the error handler and return
+-EBUSY to break the suspend.
+
+Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
+Signed-off-by: Keoseong Park <keosung.park@samsung.com>
+Link: https://lore.kernel.org/r/20230425031721epcms2p5d4de65616478c967d466626e20c42a3a@epcms2p5
+Reviewed-by: Avri Altman <avri.altman@wdc.com>
+Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/ufs/core/ufshcd.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
+index 70b112038792a..8ac2945e849f4 100644
+--- a/drivers/ufs/core/ufshcd.c
++++ b/drivers/ufs/core/ufshcd.c
+@@ -9428,8 +9428,16 @@ static int __ufshcd_wl_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
+ * that performance might be impacted.
+ */
+ ret = ufshcd_urgent_bkops(hba);
+- if (ret)
++ if (ret) {
++ /*
++ * If return err in suspend flow, IO will hang.
++ * Trigger error handler and break suspend for
++ * error recovery.
++ */
++ ufshcd_force_error_recovery(hba);
++ ret = -EBUSY;
+ goto enable_scaling;
++ }
+ } else {
+ /* make sure that auto bkops is disabled */
+ ufshcd_disable_auto_bkops(hba);
+--
+2.39.2
+
+drm-fbdev-generic-prohibit-potential-out-of-bounds-a.patch
+firmware-sysfb-fix-vesa-format-selection.patch
+drm-dsc-fix-dp_dsc_max_bpp_delta_-macro-values.patch
+drm-nouveau-disp-more-dp_receiver_cap_size-array-fix.patch
+drm-mipi-dsi-set-the-fwnode-for-mipi_dsi_device.patch
+arm-9296-1-hp-jornada-7xx-fix-kernel-doc-warnings.patch
+net-skb_partial_csum_set-fix-against-transport-heade.patch
+net-mdio-mvusb-fix-an-error-handling-path-in-mvusb_m.patch
+perf-core-fix-perf_sample_data-not-properly-initiali.patch
+scsi-ufs-core-fix-i-o-hang-that-occurs-when-bkops-fa.patch
+tick-broadcast-make-broadcast-device-replacement-wor.patch
+linux-dim-do-nothing-if-no-time-delta-between-sample.patch
+net-stmmac-initialize-mac_oneus_tic_counter-register.patch
+net-fix-load-tearing-on-sk-sk_stamp-in-sock_recv_cms.patch
+net-phy-bcm7xx-correct-read-from-expansion-register.patch
+netfilter-nf_tables-always-release-netdev-hooks-from.patch
+netfilter-conntrack-fix-possible-bug_on-with-enable_.patch
+bonding-fix-send_peer_notif-overflow.patch
+netlink-annotate-accesses-to-nlk-cb_running.patch
+net-annotate-sk-sk_err-write-from-do_recvmmsg.patch
+net-deal-with-most-data-races-in-sk_wait_event.patch
+net-add-vlan_get_protocol_and_depth-helper.patch
+tcp-add-annotations-around-sk-sk_shutdown-accesses.patch
+gve-remove-the-code-of-clearing-pba-bit.patch
+ipvlan-fix-out-of-bounds-caused-by-unclear-skb-cb.patch
+net-mscc-ocelot-fix-stat-counter-register-values.patch
+drm-sched-check-scheduler-work-queue-before-calling-.patch
+net-datagram-fix-data-races-in-datagram_poll.patch
+af_unix-fix-a-data-race-of-sk-sk_receive_queue-qlen.patch
+af_unix-fix-data-races-around-sk-sk_shutdown.patch
+drm-i915-guc-don-t-capture-gen8-regs-on-xe-devices.patch
+drm-i915-fix-null-ptr-deref-by-checking-new_crtc_sta.patch
+drm-i915-dp-prevent-potential-div-by-zero.patch
+drm-i915-taint-kernel-when-force-probing-unsupported.patch
+fbdev-arcfb-fix-error-handling-in-arcfb_probe.patch
+ext4-reflect-error-codes-from-ext4_multi_mount_prote.patch
+ext4-don-t-clear-sb_rdonly-when-remounting-r-w-until.patch
+ext4-allow-to-find-by-goal-if-ext4_mb_hint_goal_only.patch
+ext4-allow-ext4_get_group_info-to-fail.patch
--- /dev/null
+From c90cf05cab066db0392d1ef069e6995588db6706 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 9 May 2023 20:36:56 +0000
+Subject: tcp: add annotations around sk->sk_shutdown accesses
+
+From: Eric Dumazet <edumazet@google.com>
+
+[ Upstream commit e14cadfd80d76f01bfaa1a8d745b1db19b57d6be ]
+
+Now sk->sk_shutdown is no longer a bitfield, we can add
+standard READ_ONCE()/WRITE_ONCE() annotations to silence
+KCSAN reports like the following:
+
+BUG: KCSAN: data-race in tcp_disconnect / tcp_poll
+
+write to 0xffff88814588582c of 1 bytes by task 3404 on cpu 1:
+tcp_disconnect+0x4d6/0xdb0 net/ipv4/tcp.c:3121
+__inet_stream_connect+0x5dd/0x6e0 net/ipv4/af_inet.c:715
+inet_stream_connect+0x48/0x70 net/ipv4/af_inet.c:727
+__sys_connect_file net/socket.c:2001 [inline]
+__sys_connect+0x19b/0x1b0 net/socket.c:2018
+__do_sys_connect net/socket.c:2028 [inline]
+__se_sys_connect net/socket.c:2025 [inline]
+__x64_sys_connect+0x41/0x50 net/socket.c:2025
+do_syscall_x64 arch/x86/entry/common.c:50 [inline]
+do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+read to 0xffff88814588582c of 1 bytes by task 3374 on cpu 0:
+tcp_poll+0x2e6/0x7d0 net/ipv4/tcp.c:562
+sock_poll+0x253/0x270 net/socket.c:1383
+vfs_poll include/linux/poll.h:88 [inline]
+io_poll_check_events io_uring/poll.c:281 [inline]
+io_poll_task_func+0x15a/0x820 io_uring/poll.c:333
+handle_tw_list io_uring/io_uring.c:1184 [inline]
+tctx_task_work+0x1fe/0x4d0 io_uring/io_uring.c:1246
+task_work_run+0x123/0x160 kernel/task_work.c:179
+get_signal+0xe64/0xff0 kernel/signal.c:2635
+arch_do_signal_or_restart+0x89/0x2a0 arch/x86/kernel/signal.c:306
+exit_to_user_mode_loop+0x6f/0xe0 kernel/entry/common.c:168
+exit_to_user_mode_prepare+0x6c/0xb0 kernel/entry/common.c:204
+__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
+syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
+do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
+entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+value changed: 0x03 -> 0x00
+
+Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
+Reported-by: syzbot <syzkaller@googlegroups.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/ipv4/af_inet.c | 2 +-
+ net/ipv4/tcp.c | 14 ++++++++------
+ net/ipv4/tcp_input.c | 4 ++--
+ 3 files changed, 11 insertions(+), 9 deletions(-)
+
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 8db6747f892f8..70fd769f1174b 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -894,7 +894,7 @@ int inet_shutdown(struct socket *sock, int how)
+ EPOLLHUP, even on eg. unconnected UDP sockets -- RR */
+ fallthrough;
+ default:
+- sk->sk_shutdown |= how;
++ WRITE_ONCE(sk->sk_shutdown, sk->sk_shutdown | how);
+ if (sk->sk_prot->shutdown)
+ sk->sk_prot->shutdown(sk, how);
+ break;
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index 288693981b006..6c7c666554ced 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -498,6 +498,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
+ __poll_t mask;
+ struct sock *sk = sock->sk;
+ const struct tcp_sock *tp = tcp_sk(sk);
++ u8 shutdown;
+ int state;
+
+ sock_poll_wait(file, sock, wait);
+@@ -540,9 +541,10 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
+ * NOTE. Check for TCP_CLOSE is added. The goal is to prevent
+ * blocking on fresh not-connected or disconnected socket. --ANK
+ */
+- if (sk->sk_shutdown == SHUTDOWN_MASK || state == TCP_CLOSE)
++ shutdown = READ_ONCE(sk->sk_shutdown);
++ if (shutdown == SHUTDOWN_MASK || state == TCP_CLOSE)
+ mask |= EPOLLHUP;
+- if (sk->sk_shutdown & RCV_SHUTDOWN)
++ if (shutdown & RCV_SHUTDOWN)
+ mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
+
+ /* Connected or passive Fast Open socket? */
+@@ -559,7 +561,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
+ if (tcp_stream_is_readable(sk, target))
+ mask |= EPOLLIN | EPOLLRDNORM;
+
+- if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
++ if (!(shutdown & SEND_SHUTDOWN)) {
+ if (__sk_stream_is_writeable(sk, 1)) {
+ mask |= EPOLLOUT | EPOLLWRNORM;
+ } else { /* send SIGIO later */
+@@ -2866,7 +2868,7 @@ void __tcp_close(struct sock *sk, long timeout)
+ int data_was_unread = 0;
+ int state;
+
+- sk->sk_shutdown = SHUTDOWN_MASK;
++ WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
+
+ if (sk->sk_state == TCP_LISTEN) {
+ tcp_set_state(sk, TCP_CLOSE);
+@@ -3118,7 +3120,7 @@ int tcp_disconnect(struct sock *sk, int flags)
+
+ inet_bhash2_reset_saddr(sk);
+
+- sk->sk_shutdown = 0;
++ WRITE_ONCE(sk->sk_shutdown, 0);
+ sock_reset_flag(sk, SOCK_DONE);
+ tp->srtt_us = 0;
+ tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
+@@ -4648,7 +4650,7 @@ void tcp_done(struct sock *sk)
+ if (req)
+ reqsk_fastopen_remove(sk, req, false);
+
+- sk->sk_shutdown = SHUTDOWN_MASK;
++ WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
+
+ if (!sock_flag(sk, SOCK_DEAD))
+ sk->sk_state_change(sk);
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index cc072d2cfcd82..10776c54ff784 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -4362,7 +4362,7 @@ void tcp_fin(struct sock *sk)
+
+ inet_csk_schedule_ack(sk);
+
+- sk->sk_shutdown |= RCV_SHUTDOWN;
++ WRITE_ONCE(sk->sk_shutdown, sk->sk_shutdown | RCV_SHUTDOWN);
+ sock_set_flag(sk, SOCK_DONE);
+
+ switch (sk->sk_state) {
+@@ -6597,7 +6597,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
+ break;
+
+ tcp_set_state(sk, TCP_FIN_WAIT2);
+- sk->sk_shutdown |= SEND_SHUTDOWN;
++ WRITE_ONCE(sk->sk_shutdown, sk->sk_shutdown | SEND_SHUTDOWN);
+
+ sk_dst_confirm(sk);
+
+--
+2.39.2
+
--- /dev/null
+From d679d1ba1109aad6b73daa7da3553ce92f6080de Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 6 May 2023 18:40:57 +0200
+Subject: tick/broadcast: Make broadcast device replacement work correctly
+
+From: Thomas Gleixner <tglx@linutronix.de>
+
+[ Upstream commit f9d36cf445ffff0b913ba187a3eff78028f9b1fb ]
+
+When a tick broadcast clockevent device is initialized for one shot mode
+then tick_broadcast_setup_oneshot() OR's the periodic broadcast mode
+cpumask into the oneshot broadcast cpumask.
+
+This is required when switching from periodic broadcast mode to oneshot
+broadcast mode to ensure that CPUs which are waiting for periodic
+broadcast are woken up on the next tick.
+
+But it is subtly broken, when an active broadcast device is replaced and
+the system is already in oneshot (NOHZ/HIGHRES) mode. Victor observed
+this and debugged the issue.
+
+Then the OR of the periodic broadcast CPU mask is wrong as the periodic
+cpumask bits are sticky after tick_broadcast_enable() set it for a CPU
+unless explicitly cleared via tick_broadcast_disable().
+
+That means that this sets all other CPUs which have tick broadcasting
+enabled at that point unconditionally in the oneshot broadcast mask.
+
+If the affected CPUs were already idle and had their bits set in the
+oneshot broadcast mask then this does no harm. But for non idle CPUs
+which were not set this corrupts their state.
+
+On their next invocation of tick_broadcast_enable() they observe the bit
+set, which indicates that the broadcast for the CPU is already set up.
+As a consequence they fail to update the broadcast event even if their
+earliest expiring timer is before the actually programmed broadcast
+event.
+
+If the programmed broadcast event is far in the future, then this can
+cause stalls or trigger the hung task detector.
+
+Avoid this by telling tick_broadcast_setup_oneshot() explicitly whether
+this is the initial switch over from periodic to oneshot broadcast which
+must take the periodic broadcast mask into account. In the case of
+initialization of a replacement device this prevents that the broadcast
+oneshot mask is modified.
+
+There is a second problem with broadcast device replacement in this
+function. The broadcast device is only armed when the previous state of
+the device was periodic.
+
+That is correct for the switch from periodic broadcast mode to oneshot
+broadcast mode as the underlying broadcast device could operate in
+oneshot state already due to lack of periodic state in hardware. In that
+case it is already armed to expire at the next tick.
+
+For the replacement case this is wrong as the device is in shutdown
+state. That means that any already pending broadcast event will not be
+armed.
+
+This went unnoticed because any CPU which goes idle will observe that
+the broadcast device has an expiry time of KTIME_MAX and therefore any
+CPUs next timer event will be earlier and cause a reprogramming of the
+broadcast device. But that does not guarantee that the events of the
+CPUs which were already in idle are delivered on time.
+
+Fix this by arming the newly installed device for an immediate event
+which will reevaluate the per CPU expiry times and reprogram the
+broadcast device accordingly. This is simpler than caching the last
+expiry time in yet another place or saving it before the device exchange
+and handing it down to the setup function. Replacement of broadcast
+devices is not a frequent operation and usually happens once somewhere
+late in the boot process.
+
+Fixes: 9c336c9935cf ("tick/broadcast: Allow late registered device to enter oneshot mode")
+Reported-by: Victor Hassan <victor@allwinnertech.com>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
+Link: https://lore.kernel.org/r/87pm7d2z1i.ffs@tglx
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ kernel/time/tick-broadcast.c | 120 +++++++++++++++++++++++++----------
+ 1 file changed, 88 insertions(+), 32 deletions(-)
+
+diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
+index 93bf2b4e47e56..771d1e040303b 100644
+--- a/kernel/time/tick-broadcast.c
++++ b/kernel/time/tick-broadcast.c
+@@ -35,14 +35,15 @@ static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
+ #ifdef CONFIG_TICK_ONESHOT
+ static DEFINE_PER_CPU(struct clock_event_device *, tick_oneshot_wakeup_device);
+
+-static void tick_broadcast_setup_oneshot(struct clock_event_device *bc);
++static void tick_broadcast_setup_oneshot(struct clock_event_device *bc, bool from_periodic);
+ static void tick_broadcast_clear_oneshot(int cpu);
+ static void tick_resume_broadcast_oneshot(struct clock_event_device *bc);
+ # ifdef CONFIG_HOTPLUG_CPU
+ static void tick_broadcast_oneshot_offline(unsigned int cpu);
+ # endif
+ #else
+-static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc) { BUG(); }
++static inline void
++tick_broadcast_setup_oneshot(struct clock_event_device *bc, bool from_periodic) { BUG(); }
+ static inline void tick_broadcast_clear_oneshot(int cpu) { }
+ static inline void tick_resume_broadcast_oneshot(struct clock_event_device *bc) { }
+ # ifdef CONFIG_HOTPLUG_CPU
+@@ -264,7 +265,7 @@ int tick_device_uses_broadcast(struct clock_event_device *dev, int cpu)
+ if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
+ tick_broadcast_start_periodic(bc);
+ else
+- tick_broadcast_setup_oneshot(bc);
++ tick_broadcast_setup_oneshot(bc, false);
+ ret = 1;
+ } else {
+ /*
+@@ -500,7 +501,7 @@ void tick_broadcast_control(enum tick_broadcast_mode mode)
+ if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
+ tick_broadcast_start_periodic(bc);
+ else
+- tick_broadcast_setup_oneshot(bc);
++ tick_broadcast_setup_oneshot(bc, false);
+ }
+ }
+ out:
+@@ -1020,48 +1021,101 @@ static inline ktime_t tick_get_next_period(void)
+ /**
+ * tick_broadcast_setup_oneshot - setup the broadcast device
+ */
+-static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
++static void tick_broadcast_setup_oneshot(struct clock_event_device *bc,
++ bool from_periodic)
+ {
+ int cpu = smp_processor_id();
++ ktime_t nexttick = 0;
+
+ if (!bc)
+ return;
+
+- /* Set it up only once ! */
+- if (bc->event_handler != tick_handle_oneshot_broadcast) {
+- int was_periodic = clockevent_state_periodic(bc);
+-
+- bc->event_handler = tick_handle_oneshot_broadcast;
+-
++ /*
++ * When the broadcast device was switched to oneshot by the first
++ * CPU handling the NOHZ change, the other CPUs will reach this
++ * code via hrtimer_run_queues() -> tick_check_oneshot_change()
++ * too. Set up the broadcast device only once!
++ */
++ if (bc->event_handler == tick_handle_oneshot_broadcast) {
+ /*
+- * We must be careful here. There might be other CPUs
+- * waiting for periodic broadcast. We need to set the
+- * oneshot_mask bits for those and program the
+- * broadcast device to fire.
++ * The CPU which switched from periodic to oneshot mode
++ * set the broadcast oneshot bit for all other CPUs which
++ * are in the general (periodic) broadcast mask to ensure
++ * that CPUs which wait for the periodic broadcast are
++ * woken up.
++ *
++ * Clear the bit for the local CPU as the set bit would
++ * prevent the first tick_broadcast_enter() after this CPU
++ * switched to oneshot state to program the broadcast
++ * device.
++ *
++ * This code can also be reached via tick_broadcast_control(),
++ * but this cannot avoid the tick_broadcast_clear_oneshot()
++ * as that would break the periodic to oneshot transition of
++ * secondary CPUs. But that's harmless as the below only
++ * clears already cleared bits.
+ */
++ tick_broadcast_clear_oneshot(cpu);
++ return;
++ }
++
++
++ bc->event_handler = tick_handle_oneshot_broadcast;
++ bc->next_event = KTIME_MAX;
++
++ /*
++ * When the tick mode is switched from periodic to oneshot it must
++ * be ensured that CPUs which are waiting for periodic broadcast
++ * get their wake-up at the next tick. This is achieved by ORing
++ * tick_broadcast_mask into tick_broadcast_oneshot_mask.
++ *
++ * For other callers, e.g. broadcast device replacement,
++ * tick_broadcast_oneshot_mask must not be touched as this would
++ * set bits for CPUs which are already NOHZ, but not idle. Their
++ * next tick_broadcast_enter() would observe the bit set and fail
++ * to update the expiry time and the broadcast event device.
++ */
++ if (from_periodic) {
+ cpumask_copy(tmpmask, tick_broadcast_mask);
++ /* Remove the local CPU as it is obviously not idle */
+ cpumask_clear_cpu(cpu, tmpmask);
+- cpumask_or(tick_broadcast_oneshot_mask,
+- tick_broadcast_oneshot_mask, tmpmask);
++ cpumask_or(tick_broadcast_oneshot_mask, tick_broadcast_oneshot_mask, tmpmask);
+
+- if (was_periodic && !cpumask_empty(tmpmask)) {
+- ktime_t nextevt = tick_get_next_period();
++ /*
++ * Ensure that the oneshot broadcast handler will wake the
++ * CPUs which are still waiting for periodic broadcast.
++ */
++ nexttick = tick_get_next_period();
++ tick_broadcast_init_next_event(tmpmask, nexttick);
+
+- clockevents_switch_state(bc, CLOCK_EVT_STATE_ONESHOT);
+- tick_broadcast_init_next_event(tmpmask, nextevt);
+- tick_broadcast_set_event(bc, cpu, nextevt);
+- } else
+- bc->next_event = KTIME_MAX;
+- } else {
+ /*
+- * The first cpu which switches to oneshot mode sets
+- * the bit for all other cpus which are in the general
+- * (periodic) broadcast mask. So the bit is set and
+- * would prevent the first broadcast enter after this
+- * to program the bc device.
++ * If the underlying broadcast clock event device is
++ * already in oneshot state, then there is nothing to do.
++ * The device was already armed for the next tick
++ * in tick_handle_broadcast_periodic()
+ */
+- tick_broadcast_clear_oneshot(cpu);
++ if (clockevent_state_oneshot(bc))
++ return;
+ }
++
++ /*
++ * When switching from periodic to oneshot mode arm the broadcast
++ * device for the next tick.
++ *
++ * If the broadcast device has been replaced in oneshot mode and
++ * the oneshot broadcast mask is not empty, then arm it to expire
++ * immediately in order to reevaluate the next expiring timer.
++ * @nexttick is 0 and therefore in the past which will cause the
++ * clockevent code to force an event.
++ *
++ * For both cases the programming can be avoided when the oneshot
++ * broadcast mask is empty.
++ *
++ * tick_broadcast_set_event() implicitly switches the broadcast
++ * device to oneshot state.
++ */
++ if (!cpumask_empty(tick_broadcast_oneshot_mask))
++ tick_broadcast_set_event(bc, cpu, nexttick);
+ }
+
+ /*
+@@ -1070,14 +1124,16 @@ static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
+ void tick_broadcast_switch_to_oneshot(void)
+ {
+ struct clock_event_device *bc;
++ enum tick_device_mode oldmode;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
+
++ oldmode = tick_broadcast_device.mode;
+ tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT;
+ bc = tick_broadcast_device.evtdev;
+ if (bc)
+- tick_broadcast_setup_oneshot(bc);
++ tick_broadcast_setup_oneshot(bc, oldmode == TICKDEV_MODE_PERIODIC);
+
+ raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+ }
+--
+2.39.2
+