From: Sasha Levin Date: Tue, 30 Jul 2024 13:55:22 +0000 (-0400) Subject: Fixes for 5.15 X-Git-Tag: v6.1.103~15^2~4 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=638ad56389afc6352bca815c5d8d0aafac93a2f1;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 5.15 Signed-off-by: Sasha Levin --- diff --git a/queue-5.15/apparmor-fix-null-pointer-deref-when-receiving-skb-d.patch b/queue-5.15/apparmor-fix-null-pointer-deref-when-receiving-skb-d.patch new file mode 100644 index 00000000000..342be2d7d65 --- /dev/null +++ b/queue-5.15/apparmor-fix-null-pointer-deref-when-receiving-skb-d.patch @@ -0,0 +1,111 @@ +From f4ae3401fa5f3bb723c45a3001618da06738d5b8 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 2 Sep 2023 08:48:38 +0800 +Subject: apparmor: Fix null pointer deref when receiving skb during sock + creation + +From: Xiao Liang + +[ Upstream commit fce09ea314505a52f2436397608fa0a5d0934fb1 ] + +The panic below is observed when receiving ICMP packets with secmark set +while an ICMP raw socket is being created. SK_CTX(sk)->label is updated +in apparmor_socket_post_create(), but the packet is delivered to the +socket before that, causing the null pointer dereference. +Drop the packet if label context is not set. + + BUG: kernel NULL pointer dereference, address: 000000000000004c + #PF: supervisor read access in kernel mode + #PF: error_code(0x0000) - not-present page + PGD 0 P4D 0 + Oops: 0000 [#1] PREEMPT SMP NOPTI + CPU: 0 PID: 407 Comm: a.out Not tainted 6.4.12-arch1-1 #1 3e6fa2753a2d75925c34ecb78e22e85a65d083df + Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/28/2020 + RIP: 0010:aa_label_next_confined+0xb/0x40 + Code: 00 00 48 89 ef e8 d5 25 0c 00 e9 66 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 89 f0 <8b> 77 4c 39 c6 7e 1f 48 63 d0 48 8d 14 d7 eb 0b 83 c0 01 48 83 c2 + RSP: 0018:ffffa92940003b08 EFLAGS: 00010246 + RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000e + RDX: ffffa92940003be8 RSI: 0000000000000000 RDI: 0000000000000000 + RBP: ffff8b57471e7800 R08: ffff8b574c642400 R09: 0000000000000002 + R10: ffffffffbd820eeb R11: ffffffffbeb7ff00 R12: ffff8b574c642400 + R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000 + FS: 00007fb092ea7640(0000) GS:ffff8b577bc00000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 000000000000004c CR3: 00000001020f2005 CR4: 00000000007706f0 + PKRU: 55555554 + Call Trace: + + ? __die+0x23/0x70 + ? page_fault_oops+0x171/0x4e0 + ? exc_page_fault+0x7f/0x180 + ? asm_exc_page_fault+0x26/0x30 + ? aa_label_next_confined+0xb/0x40 + apparmor_secmark_check+0xec/0x330 + security_sock_rcv_skb+0x35/0x50 + sk_filter_trim_cap+0x47/0x250 + sock_queue_rcv_skb_reason+0x20/0x60 + raw_rcv+0x13c/0x210 + raw_local_deliver+0x1f3/0x250 + ip_protocol_deliver_rcu+0x4f/0x2f0 + ip_local_deliver_finish+0x76/0xa0 + __netif_receive_skb_one_core+0x89/0xa0 + netif_receive_skb+0x119/0x170 + ? 
__netdev_alloc_skb+0x3d/0x140 + vmxnet3_rq_rx_complete+0xb23/0x1010 [vmxnet3 56a84f9c97178c57a43a24ec073b45a9d6f01f3a] + vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3 56a84f9c97178c57a43a24ec073b45a9d6f01f3a] + __napi_poll+0x28/0x1b0 + net_rx_action+0x2a4/0x380 + __do_softirq+0xd1/0x2c8 + __irq_exit_rcu+0xbb/0xf0 + common_interrupt+0x86/0xa0 + + + asm_common_interrupt+0x26/0x40 + RIP: 0010:apparmor_socket_post_create+0xb/0x200 + Code: 08 48 85 ff 75 a1 eb b1 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 <55> 48 89 fd 53 45 85 c0 0f 84 b2 00 00 00 48 8b 1d 80 56 3f 02 48 + RSP: 0018:ffffa92940ce7e50 EFLAGS: 00000286 + RAX: ffffffffbc756440 RBX: 0000000000000000 RCX: 0000000000000001 + RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8b574eaab740 + RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 + R10: ffff8b57444cec70 R11: 0000000000000000 R12: 0000000000000003 + R13: 0000000000000002 R14: ffff8b574eaab740 R15: ffffffffbd8e4748 + ? __pfx_apparmor_socket_post_create+0x10/0x10 + security_socket_post_create+0x4b/0x80 + __sock_create+0x176/0x1f0 + __sys_socket+0x89/0x100 + __x64_sys_socket+0x17/0x20 + do_syscall_64+0x5d/0x90 + ? do_syscall_64+0x6c/0x90 + ? do_syscall_64+0x6c/0x90 + ? 
do_syscall_64+0x6c/0x90 + entry_SYSCALL_64_after_hwframe+0x72/0xdc + +Fixes: ab9f2115081a ("apparmor: Allow filtering based on secmark policy") +Signed-off-by: Xiao Liang +Signed-off-by: John Johansen +Signed-off-by: Sasha Levin +--- + security/apparmor/lsm.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c +index 10274eb90fa37..cf26ffe8cccb7 100644 +--- a/security/apparmor/lsm.c ++++ b/security/apparmor/lsm.c +@@ -1057,6 +1057,13 @@ static int apparmor_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) + if (!skb->secmark) + return 0; + ++ /* ++ * If reach here before socket_post_create hook is called, in which ++ * case label is null, drop the packet. ++ */ ++ if (!ctx->label) ++ return -EACCES; ++ + return apparmor_secmark_check(ctx->label, OP_RECVMSG, AA_MAY_RECEIVE, + skb->secmark, sk); + } +-- +2.43.0 + diff --git a/queue-5.15/asoc-intel-use-soc_intel_is_byt_cr-only-when-iosf_mb.patch b/queue-5.15/asoc-intel-use-soc_intel_is_byt_cr-only-when-iosf_mb.patch new file mode 100644 index 00000000000..fc1b0596197 --- /dev/null +++ b/queue-5.15/asoc-intel-use-soc_intel_is_byt_cr-only-when-iosf_mb.patch @@ -0,0 +1,54 @@ +From eeea851d58c13ad2d555b881951b7e192f26224f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 22 Jul 2024 10:30:02 +0200 +Subject: ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is + reachable +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Pierre-Louis Bossart + +[ Upstream commit 9931f7d5d251882a147cc5811060097df43e79f5 ] + +the Intel kbuild bot reports a link failure when IOSF_MBI is built-in +but the Merrifield driver is configured as a module. The +soc-intel-quirks.h is included for Merrifield platforms, but IOSF_MBI +is not selected for that platform. 
+ +ld.lld: error: undefined symbol: iosf_mbi_read +>>> referenced by atom.c +>>> sound/soc/sof/intel/atom.o:(atom_machine_select) in archive vmlinux.a + +This patch forces the use of the fallback static inline when IOSF_MBI is not reachable. + +Fixes: 536cfd2f375d ("ASoC: Intel: use common helpers to detect CPUs") +Reported-by: kernel test robot +Closes: https://lore.kernel.org/oe-kbuild-all/202407160704.zpdhJ8da-lkp@intel.com/ +Suggested-by: Takashi Iwai +Signed-off-by: Pierre-Louis Bossart +Reviewed-by: Péter Ujfalusi +Reviewed-by: Bard Liao +Link: https://patch.msgid.link/20240722083002.10800-1-pierre-louis.bossart@linux.intel.com +Signed-off-by: Mark Brown +Signed-off-by: Sasha Levin +--- + sound/soc/intel/common/soc-intel-quirks.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/sound/soc/intel/common/soc-intel-quirks.h b/sound/soc/intel/common/soc-intel-quirks.h +index de4e550c5b34d..42bd51456b945 100644 +--- a/sound/soc/intel/common/soc-intel-quirks.h ++++ b/sound/soc/intel/common/soc-intel-quirks.h +@@ -11,7 +11,7 @@ + + #include + +-#if IS_ENABLED(CONFIG_X86) ++#if IS_REACHABLE(CONFIG_IOSF_MBI) + + #include + #include +-- +2.43.0 + diff --git a/queue-5.15/bpf-events-use-prog-to-emit-ksymbol-event-for-main-p.patch b/queue-5.15/bpf-events-use-prog-to-emit-ksymbol-event-for-main-p.patch new file mode 100644 index 00000000000..cf8da236dd2 --- /dev/null +++ b/queue-5.15/bpf-events-use-prog-to-emit-ksymbol-event-for-main-p.patch @@ -0,0 +1,84 @@ +From a62189f4baf99e921a4421ebee4653f1b21bbccc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sun, 14 Jul 2024 14:55:33 +0800 +Subject: bpf, events: Use prog to emit ksymbol event for main program + +From: Hou Tao + +[ Upstream commit 0be9ae5486cd9e767138c13638820d240713f5f1 ] + +Since commit 0108a4e9f358 ("bpf: ensure main program has an extable"), +prog->aux->func[0]->kallsyms is left as uninitialized. 
For BPF programs +with subprogs, the symbol for the main program is missing just as shown +in the output of perf script below: + + ffffffff81284b69 qp_trie_lookup_elem+0xb9 ([kernel.kallsyms]) + ffffffffc0011125 bpf_prog_a4a0eb0651e6af8b_lookup_qp_trie+0x5d (bpf...) + ffffffff8127bc2b bpf_for_each_array_elem+0x7b ([kernel.kallsyms]) + ffffffffc00110a1 +0x25 () + ffffffff8121a89a trace_call_bpf+0xca ([kernel.kallsyms]) + +Fix it by always using prog instead prog->aux->func[0] to emit ksymbol +event for the main program. After the fix, the output of perf script +will be correct: + + ffffffff81284b96 qp_trie_lookup_elem+0xe6 ([kernel.kallsyms]) + ffffffffc001382d bpf_prog_a4a0eb0651e6af8b_lookup_qp_trie+0x5d (bpf...) + ffffffff8127bc2b bpf_for_each_array_elem+0x7b ([kernel.kallsyms]) + ffffffffc0013779 bpf_prog_245c55ab25cfcf40_qp_trie_lookup+0x25 (bpf...) + ffffffff8121a89a trace_call_bpf+0xca ([kernel.kallsyms]) + +Fixes: 0108a4e9f358 ("bpf: ensure main program has an extable") +Signed-off-by: Hou Tao +Signed-off-by: Daniel Borkmann +Tested-by: Yonghong Song +Reviewed-by: Krister Johansen +Reviewed-by: Jiri Olsa +Link: https://lore.kernel.org/bpf/20240714065533.1112616-1-houtao@huaweicloud.com +Signed-off-by: Sasha Levin +--- + kernel/events/core.c | 28 +++++++++++++--------------- + 1 file changed, 13 insertions(+), 15 deletions(-) + +diff --git a/kernel/events/core.c b/kernel/events/core.c +index b689b35473a38..ce6f9052d4bc4 100644 +--- a/kernel/events/core.c ++++ b/kernel/events/core.c +@@ -9168,21 +9168,19 @@ static void perf_event_bpf_emit_ksymbols(struct bpf_prog *prog, + bool unregister = type == PERF_BPF_EVENT_PROG_UNLOAD; + int i; + +- if (prog->aux->func_cnt == 0) { +- perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_BPF, +- (u64)(unsigned long)prog->bpf_func, +- prog->jited_len, unregister, +- prog->aux->ksym.name); +- } else { +- for (i = 0; i < prog->aux->func_cnt; i++) { +- struct bpf_prog *subprog = prog->aux->func[i]; +- +- perf_event_ksymbol( +- 
PERF_RECORD_KSYMBOL_TYPE_BPF, +- (u64)(unsigned long)subprog->bpf_func, +- subprog->jited_len, unregister, +- subprog->aux->ksym.name); +- } ++ perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_BPF, ++ (u64)(unsigned long)prog->bpf_func, ++ prog->jited_len, unregister, ++ prog->aux->ksym.name); ++ ++ for (i = 1; i < prog->aux->func_cnt; i++) { ++ struct bpf_prog *subprog = prog->aux->func[i]; ++ ++ perf_event_ksymbol( ++ PERF_RECORD_KSYMBOL_TYPE_BPF, ++ (u64)(unsigned long)subprog->bpf_func, ++ subprog->jited_len, unregister, ++ subprog->aux->ksym.name); + } + } + +-- +2.43.0 + diff --git a/queue-5.15/bpf-fix-a-segment-issue-when-downgrading-gso_size.patch b/queue-5.15/bpf-fix-a-segment-issue-when-downgrading-gso_size.patch new file mode 100644 index 00000000000..89c30df8332 --- /dev/null +++ b/queue-5.15/bpf-fix-a-segment-issue-when-downgrading-gso_size.patch @@ -0,0 +1,57 @@ +From 0223776c756cb414810939af1d0ec1ccc35187c2 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 19 Jul 2024 10:46:53 +0800 +Subject: bpf: Fix a segment issue when downgrading gso_size + +From: Fred Li + +[ Upstream commit fa5ef655615a01533035c6139248c5b33aa27028 ] + +Linearize the skb when downgrading gso_size because it may trigger a +BUG_ON() later when the skb is segmented as described in [1,2]. 
+ +Fixes: 2be7e212d5419 ("bpf: add bpf_skb_adjust_room helper") +Signed-off-by: Fred Li +Signed-off-by: Daniel Borkmann +Reviewed-by: Willem de Bruijn +Acked-by: Daniel Borkmann +Link: https://lore.kernel.org/all/20240626065555.35460-2-dracodingfly@gmail.com [1] +Link: https://lore.kernel.org/all/668d5cf1ec330_1c18c32947@willemb.c.googlers.com.notmuch [2] +Link: https://lore.kernel.org/bpf/20240719024653.77006-1-dracodingfly@gmail.com +Signed-off-by: Sasha Levin +--- + net/core/filter.c | 15 +++++++++++---- + 1 file changed, 11 insertions(+), 4 deletions(-) + +diff --git a/net/core/filter.c b/net/core/filter.c +index a873c8fd51b67..a92a35c0f1e72 100644 +--- a/net/core/filter.c ++++ b/net/core/filter.c +@@ -3507,13 +3507,20 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, + if (skb_is_gso(skb)) { + struct skb_shared_info *shinfo = skb_shinfo(skb); + +- /* Due to header grow, MSS needs to be downgraded. */ +- if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO)) +- skb_decrease_gso_size(shinfo, len_diff); +- + /* Header must be checked, and gso_segs recomputed. */ + shinfo->gso_type |= gso_type; + shinfo->gso_segs = 0; ++ ++ /* Due to header growth, MSS needs to be downgraded. ++ * There is a BUG_ON() when segmenting the frag_list with ++ * head_frag true, so linearize the skb after downgrading ++ * the MSS. 
++ */ ++ if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO)) { ++ skb_decrease_gso_size(shinfo, len_diff); ++ if (shinfo->frag_list) ++ return skb_linearize(skb); ++ } + } + + return 0; +-- +2.43.0 + diff --git a/queue-5.15/ceph-fix-incorrect-kmalloc-size-of-pagevec-mempool.patch b/queue-5.15/ceph-fix-incorrect-kmalloc-size-of-pagevec-mempool.patch new file mode 100644 index 00000000000..79079fe7086 --- /dev/null +++ b/queue-5.15/ceph-fix-incorrect-kmalloc-size-of-pagevec-mempool.patch @@ -0,0 +1,38 @@ +From b5ba66444b44724e89715830a5faf2b34c02f501 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jul 2024 14:47:56 +0800 +Subject: ceph: fix incorrect kmalloc size of pagevec mempool + +From: ethanwu + +[ Upstream commit 03230edb0bd831662a7c08b6fef66b2a9a817774 ] + +The kmalloc size of pagevec mempool is incorrectly calculated. +It misses the size of page pointer and only accounts the number for the array. + +Fixes: a0102bda5bc0 ("ceph: move sb->wb_pagevec_pool to be a global mempool") +Signed-off-by: ethanwu +Reviewed-by: Xiubo Li +Signed-off-by: Ilya Dryomov +Signed-off-by: Sasha Levin +--- + fs/ceph/super.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ceph/super.c b/fs/ceph/super.c +index 1723ec21cd470..b5ed6d9a19f4a 100644 +--- a/fs/ceph/super.c ++++ b/fs/ceph/super.c +@@ -783,7 +783,8 @@ static int __init init_caches(void) + if (!ceph_mds_request_cachep) + goto bad_mds_req; + +- ceph_wb_pagevec_pool = mempool_create_kmalloc_pool(10, CEPH_MAX_WRITE_SIZE >> PAGE_SHIFT); ++ ceph_wb_pagevec_pool = mempool_create_kmalloc_pool(10, ++ (CEPH_MAX_WRITE_SIZE >> PAGE_SHIFT) * sizeof(struct page *)); + if (!ceph_wb_pagevec_pool) + goto bad_pagevec_pool; + +-- +2.43.0 + diff --git a/queue-5.15/dma-fix-call-order-in-dmam_free_coherent.patch b/queue-5.15/dma-fix-call-order-in-dmam_free_coherent.patch new file mode 100644 index 00000000000..b3e38d1648d --- /dev/null +++ b/queue-5.15/dma-fix-call-order-in-dmam_free_coherent.patch @@ -0,0 +1,52 @@ 
+From 0dde245834ef58da29c6dd71fd7ac69f3c100b6a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 18 Jul 2024 14:38:24 +0000 +Subject: dma: fix call order in dmam_free_coherent + +From: Lance Richardson + +[ Upstream commit 28e8b7406d3a1f5329a03aa25a43aa28e087cb20 ] + +dmam_free_coherent() frees a DMA allocation, which makes the +freed vaddr available for reuse, then calls devres_destroy() +to remove and free the data structure used to track the DMA +allocation. Between the two calls, it is possible for a +concurrent task to make an allocation with the same vaddr +and add it to the devres list. + +If this happens, there will be two entries in the devres list +with the same vaddr and devres_destroy() can free the wrong +entry, triggering the WARN_ON() in dmam_match. + +Fix by destroying the devres entry before freeing the DMA +allocation. + +Tested: + kokonut //net/encryption + http://sponge2/b9145fe6-0f72-4325-ac2f-a84d81075b03 + +Fixes: 9ac7849e35f7 ("devres: device resource management") +Signed-off-by: Lance Richardson +Signed-off-by: Christoph Hellwig +Signed-off-by: Sasha Levin +--- + kernel/dma/mapping.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c +index c9dbc8f5812b8..9e1a724ae7e7d 100644 +--- a/kernel/dma/mapping.c ++++ b/kernel/dma/mapping.c +@@ -62,8 +62,8 @@ void dmam_free_coherent(struct device *dev, size_t size, void *vaddr, + { + struct dma_devres match_data = { size, vaddr, dma_handle }; + +- dma_free_coherent(dev, size, vaddr, dma_handle); + WARN_ON(devres_destroy(dev, dmam_release, dmam_match, &match_data)); ++ dma_free_coherent(dev, size, vaddr, dma_handle); + } + EXPORT_SYMBOL(dmam_free_coherent); + +-- +2.43.0 + diff --git a/queue-5.15/dmaengine-ti-k3-udma-fix-bchan-count-with-uhc-and-hc.patch b/queue-5.15/dmaengine-ti-k3-udma-fix-bchan-count-with-uhc-and-hc.patch new file mode 100644 index 00000000000..14f66472b85 --- /dev/null +++ 
b/queue-5.15/dmaengine-ti-k3-udma-fix-bchan-count-with-uhc-and-hc.patch @@ -0,0 +1,42 @@ +From 3355f7d557dc829f791fd1263bafe2d1040527b1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 7 Jun 2024 23:41:03 +0530 +Subject: dmaengine: ti: k3-udma: Fix BCHAN count with UHC and HC channels + +From: Vignesh Raghavendra + +[ Upstream commit 372f8b3621294173f539b32976e41e6e12f5decf ] + +Unlike other channel counts in CAPx registers, BCDMA BCHAN CNT doesn't +include UHC and HC BC channels. So include them explicitly to arrive at +total BC channel in the instance. + +Fixes: 8844898028d4 ("dmaengine: ti: k3-udma: Add support for BCDMA channel TPL handling") +Signed-off-by: Vignesh Raghavendra +Signed-off-by: Jai Luthra +Tested-by: Jayesh Choudhary +Link: https://lore.kernel.org/r/20240607-bcdma_chan_cnt-v2-1-bf1a55529d91@ti.com +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/dma/ti/k3-udma.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c +index 698fb898847c1..9db45c4eaaf24 100644 +--- a/drivers/dma/ti/k3-udma.c ++++ b/drivers/dma/ti/k3-udma.c +@@ -4415,7 +4415,9 @@ static int udma_get_mmrs(struct platform_device *pdev, struct udma_dev *ud) + ud->rchan_cnt = UDMA_CAP2_RCHAN_CNT(cap2); + break; + case DMA_TYPE_BCDMA: +- ud->bchan_cnt = BCDMA_CAP2_BCHAN_CNT(cap2); ++ ud->bchan_cnt = BCDMA_CAP2_BCHAN_CNT(cap2) + ++ BCDMA_CAP3_HBCHAN_CNT(cap3) + ++ BCDMA_CAP3_UBCHAN_CNT(cap3); + ud->tchan_cnt = BCDMA_CAP2_TCHAN_CNT(cap2); + ud->rchan_cnt = BCDMA_CAP2_RCHAN_CNT(cap2); + ud->rflow_cnt = ud->rchan_cnt; +-- +2.43.0 + diff --git a/queue-5.15/f2fs-add-a-way-to-limit-roll-forward-recovery-time.patch b/queue-5.15/f2fs-add-a-way-to-limit-roll-forward-recovery-time.patch new file mode 100644 index 00000000000..33fe3ad6ace --- /dev/null +++ b/queue-5.15/f2fs-add-a-way-to-limit-roll-forward-recovery-time.patch @@ -0,0 +1,199 @@ +From 891520a1e9d657d51327e792f782dc5421ca1cd3 Mon Sep 17 
00:00:00 2001 +From: Sasha Levin +Date: Thu, 27 Jan 2022 13:31:43 -0800 +Subject: f2fs: add a way to limit roll forward recovery time + +From: Jaegeuk Kim + +[ Upstream commit 47c8ebcce85ed7113e9e3e3f1d8c6374fa87848e ] + +This adds a sysfs entry to call checkpoint during fsync() in order to avoid +long elapsed time to run roll-forward recovery when booting the device. +Default value doesn't enforce the limitation which is same as before. + +Reviewed-by: Chao Yu +Signed-off-by: Jaegeuk Kim +Stable-dep-of: f06c0f82e38b ("f2fs: fix to update user block counts in block_operations()") +Signed-off-by: Sasha Levin +--- + Documentation/ABI/testing/sysfs-fs-f2fs | 6 ++++++ + fs/f2fs/checkpoint.c | 1 + + fs/f2fs/debug.c | 3 +++ + fs/f2fs/f2fs.h | 3 +++ + fs/f2fs/node.c | 2 ++ + fs/f2fs/node.h | 3 +++ + fs/f2fs/recovery.c | 4 ++++ + fs/f2fs/super.c | 14 ++++++++++++-- + fs/f2fs/sysfs.c | 2 ++ + 9 files changed, 36 insertions(+), 2 deletions(-) + +diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs +index bdbece0b08051..92bc2bdc8baf1 100644 +--- a/Documentation/ABI/testing/sysfs-fs-f2fs ++++ b/Documentation/ABI/testing/sysfs-fs-f2fs +@@ -536,3 +536,9 @@ Contact: "Daeho Jeong" + Description: You can set the trial count limit for GC urgent high mode with this value. + If GC thread gets to the limit, the mode will turn back to GC normal mode. + By default, the value is zero, which means there is no limit like before. ++ ++What: /sys/fs/f2fs//max_roll_forward_node_blocks ++Date: January 2022 ++Contact: "Jaegeuk Kim" ++Description: Controls max # of node block writes to be used for roll forward ++ recovery. This can limit the roll forward recovery time. 
+diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c +index 71a3714419f85..8d12e2fa32b8f 100644 +--- a/fs/f2fs/checkpoint.c ++++ b/fs/f2fs/checkpoint.c +@@ -1569,6 +1569,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) + /* update user_block_counts */ + sbi->last_valid_block_count = sbi->total_valid_block_count; + percpu_counter_set(&sbi->alloc_valid_block_count, 0); ++ percpu_counter_set(&sbi->rf_node_block_count, 0); + + /* Here, we have one bio having CP pack except cp pack 2 page */ + f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO); +diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c +index b449c7a372a4b..6d26872c7364d 100644 +--- a/fs/f2fs/debug.c ++++ b/fs/f2fs/debug.c +@@ -534,6 +534,9 @@ static int stat_show(struct seq_file *s, void *v) + si->ndirty_meta, si->meta_pages); + seq_printf(s, " - imeta: %4d\n", + si->ndirty_imeta); ++ seq_printf(s, " - fsync mark: %4lld\n", ++ percpu_counter_sum_positive( ++ &si->sbi->rf_node_block_count)); + seq_printf(s, " - NATs: %9d/%9d\n - SITs: %9d/%9d\n", + si->dirty_nats, si->nats, si->dirty_sits, si->sits); + seq_printf(s, " - free_nids: %9d/%9d\n - alloc_nids: %9d\n", +diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h +index d4a5700927cd5..fb1422a81d382 100644 +--- a/fs/f2fs/f2fs.h ++++ b/fs/f2fs/f2fs.h +@@ -891,6 +891,7 @@ struct f2fs_nm_info { + nid_t max_nid; /* maximum possible node ids */ + nid_t available_nids; /* # of available node ids */ + nid_t next_scan_nid; /* the next nid to be scanned */ ++ nid_t max_rf_node_blocks; /* max # of nodes for recovery */ + unsigned int ram_thresh; /* control the memory footprint */ + unsigned int ra_nid_pages; /* # of nid pages to be readaheaded */ + unsigned int dirty_nats_ratio; /* control dirty nats ratio threshold */ +@@ -1663,6 +1664,8 @@ struct f2fs_sb_info { + atomic_t nr_pages[NR_COUNT_TYPE]; + /* # of allocated blocks */ + struct percpu_counter alloc_valid_block_count; ++ /* # of node block writes as roll forward recovery */ ++ 
struct percpu_counter rf_node_block_count; + + /* writeback control */ + atomic_t wb_sync_req[META]; /* count # of WB_SYNC threads */ +diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c +index b6758887540f2..16eab673ca84d 100644 +--- a/fs/f2fs/node.c ++++ b/fs/f2fs/node.c +@@ -1787,6 +1787,7 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode, + + if (!atomic || page == last_page) { + set_fsync_mark(page, 1); ++ percpu_counter_inc(&sbi->rf_node_block_count); + if (IS_INODE(page)) { + if (is_inode_flag_set(inode, + FI_DIRTY_INODE)) +@@ -3227,6 +3228,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi) + nm_i->ram_thresh = DEF_RAM_THRESHOLD; + nm_i->ra_nid_pages = DEF_RA_NID_PAGES; + nm_i->dirty_nats_ratio = DEF_DIRTY_NAT_RATIO_THRESHOLD; ++ nm_i->max_rf_node_blocks = DEF_RF_NODE_BLOCKS; + + INIT_RADIX_TREE(&nm_i->free_nid_root, GFP_ATOMIC); + INIT_LIST_HEAD(&nm_i->free_nid_list); +diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h +index ff14a6e5ac1c9..048f309e32ff4 100644 +--- a/fs/f2fs/node.h ++++ b/fs/f2fs/node.h +@@ -31,6 +31,9 @@ + /* control total # of nats */ + #define DEF_NAT_CACHE_THRESHOLD 100000 + ++/* control total # of node writes used for roll-fowrad recovery */ ++#define DEF_RF_NODE_BLOCKS 0 ++ + /* vector size for gang look-up from nat cache that consists of radix tree */ + #define NATVEC_SIZE 64 + #define SETVEC_SIZE 32 +diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c +index f07ae58d266d1..a5044be137988 100644 +--- a/fs/f2fs/recovery.c ++++ b/fs/f2fs/recovery.c +@@ -55,6 +55,10 @@ bool f2fs_space_for_roll_forward(struct f2fs_sb_info *sbi) + + if (sbi->last_valid_block_count + nalloc > sbi->user_block_count) + return false; ++ if (NM_I(sbi)->max_rf_node_blocks && ++ percpu_counter_sum_positive(&sbi->rf_node_block_count) >= ++ NM_I(sbi)->max_rf_node_blocks) ++ return false; + return true; + } + +diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c +index 098daef550f49..144c35b2760f4 100644 +--- a/fs/f2fs/super.c ++++ b/fs/f2fs/super.c 
+@@ -1540,8 +1540,9 @@ static void f2fs_free_inode(struct inode *inode) + + static void destroy_percpu_info(struct f2fs_sb_info *sbi) + { +- percpu_counter_destroy(&sbi->alloc_valid_block_count); + percpu_counter_destroy(&sbi->total_valid_inode_count); ++ percpu_counter_destroy(&sbi->rf_node_block_count); ++ percpu_counter_destroy(&sbi->alloc_valid_block_count); + } + + static void destroy_device_list(struct f2fs_sb_info *sbi) +@@ -3659,11 +3660,20 @@ static int init_percpu_info(struct f2fs_sb_info *sbi) + if (err) + return err; + ++ err = percpu_counter_init(&sbi->rf_node_block_count, 0, GFP_KERNEL); ++ if (err) ++ goto err_valid_block; ++ + err = percpu_counter_init(&sbi->total_valid_inode_count, 0, + GFP_KERNEL); + if (err) +- percpu_counter_destroy(&sbi->alloc_valid_block_count); ++ goto err_node_block; ++ return 0; + ++err_node_block: ++ percpu_counter_destroy(&sbi->rf_node_block_count); ++err_valid_block: ++ percpu_counter_destroy(&sbi->alloc_valid_block_count); + return err; + } + +diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c +index 673b1153dbc67..5bccd70a3f3be 100644 +--- a/fs/f2fs/sysfs.c ++++ b/fs/f2fs/sysfs.c +@@ -720,6 +720,7 @@ F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_ssr_sections, min_ssr_sections); + F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ram_thresh, ram_thresh); + F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ra_nid_pages, ra_nid_pages); + F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, dirty_nats_ratio, dirty_nats_ratio); ++F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, max_roll_forward_node_blocks, max_rf_node_blocks); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_victim_search, max_victim_search); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, migration_granularity, migration_granularity); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level); +@@ -837,6 +838,7 @@ static struct attribute *f2fs_attrs[] = { + ATTR_LIST(ram_thresh), + ATTR_LIST(ra_nid_pages), + ATTR_LIST(dirty_nats_ratio), ++ ATTR_LIST(max_roll_forward_node_blocks), + ATTR_LIST(cp_interval), + ATTR_LIST(idle_interval), + 
ATTR_LIST(discard_idle_interval), +-- +2.43.0 + diff --git a/queue-5.15/f2fs-add-gc_urgent_high_remaining-sysfs-node.patch b/queue-5.15/f2fs-add-gc_urgent_high_remaining-sysfs-node.patch new file mode 100644 index 00000000000..0bede465b76 --- /dev/null +++ b/queue-5.15/f2fs-add-gc_urgent_high_remaining-sysfs-node.patch @@ -0,0 +1,129 @@ +From 6d0fb871dbe1cd210304a6833e845a4f1c5cfb27 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 8 Dec 2021 16:41:51 -0800 +Subject: f2fs: add gc_urgent_high_remaining sysfs node + +From: Daeho Jeong + +[ Upstream commit 325163e9892b627fc9fb1af51e51f0f95dded517 ] + +Added a new sysfs node called gc_urgent_high_remaining. The user can +set the trial count limit for GC urgent high mode with this value. If +GC thread gets to the limit, the mode will turn back to GC normal mode. +By default, the value is zero, which means there is no limit like before. + +Signed-off-by: Daeho Jeong +Signed-off-by: Jaegeuk Kim +Stable-dep-of: f06c0f82e38b ("f2fs: fix to update user block counts in block_operations()") +Signed-off-by: Sasha Levin +--- + Documentation/ABI/testing/sysfs-fs-f2fs | 7 +++++++ + fs/f2fs/f2fs.h | 3 +++ + fs/f2fs/gc.c | 12 ++++++++++++ + fs/f2fs/super.c | 1 + + fs/f2fs/sysfs.c | 11 +++++++++++ + 5 files changed, 34 insertions(+) + +diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs +index 91e2b549f8172..bdbece0b08051 100644 +--- a/Documentation/ABI/testing/sysfs-fs-f2fs ++++ b/Documentation/ABI/testing/sysfs-fs-f2fs +@@ -529,3 +529,10 @@ Description: With "mode=fragment:block" mount options, we can scatter block allo + f2fs will allocate 1.. blocks in a chunk and make a hole + in the length of 1.. by turns. This value can be set + between 1..512 and the default value is 4. ++ ++What: /sys/fs/f2fs//gc_urgent_high_remaining ++Date: December 2021 ++Contact: "Daeho Jeong" ++Description: You can set the trial count limit for GC urgent high mode with this value. 
++ If GC thread gets to the limit, the mode will turn back to GC normal mode. ++ By default, the value is zero, which means there is no limit like before. +diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h +index b5f1099ab388f..d4a5700927cd5 100644 +--- a/fs/f2fs/f2fs.h ++++ b/fs/f2fs/f2fs.h +@@ -1682,6 +1682,9 @@ struct f2fs_sb_info { + unsigned int cur_victim_sec; /* current victim section num */ + unsigned int gc_mode; /* current GC state */ + unsigned int next_victim_seg[2]; /* next segment in victim section */ ++ spinlock_t gc_urgent_high_lock; ++ bool gc_urgent_high_limited; /* indicates having limited trial count */ ++ unsigned int gc_urgent_high_remaining; /* remaining trial count for GC_URGENT_HIGH */ + + /* for skip statistic */ + unsigned int atomic_files; /* # of opened atomic file */ +diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c +index 4d4d7f0c8a71b..21081e7ff55d5 100644 +--- a/fs/f2fs/gc.c ++++ b/fs/f2fs/gc.c +@@ -92,6 +92,18 @@ static int gc_thread_func(void *data) + * So, I'd like to wait some time to collect dirty segments. 
+ */ + if (sbi->gc_mode == GC_URGENT_HIGH) { ++ spin_lock(&sbi->gc_urgent_high_lock); ++ if (sbi->gc_urgent_high_limited) { ++ if (!sbi->gc_urgent_high_remaining) { ++ sbi->gc_urgent_high_limited = false; ++ spin_unlock(&sbi->gc_urgent_high_lock); ++ sbi->gc_mode = GC_NORMAL; ++ continue; ++ } ++ sbi->gc_urgent_high_remaining--; ++ } ++ spin_unlock(&sbi->gc_urgent_high_lock); ++ + wait_ms = gc_th->urgent_sleep_time; + down_write(&sbi->gc_lock); + goto do_gc; +diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c +index 339e44467b9cd..098daef550f49 100644 +--- a/fs/f2fs/super.c ++++ b/fs/f2fs/super.c +@@ -3621,6 +3621,7 @@ static void init_sb_info(struct f2fs_sb_info *sbi) + sbi->seq_file_ra_mul = MIN_RA_MUL; + sbi->max_fragment_chunk = DEF_FRAGMENT_SIZE; + sbi->max_fragment_hole = DEF_FRAGMENT_SIZE; ++ spin_lock_init(&sbi->gc_urgent_high_lock); + + sbi->dir_level = DEF_DIR_LEVEL; + sbi->interval_time[CP_TIME] = DEF_CP_INTERVAL; +diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c +index c0e72bd44b135..673b1153dbc67 100644 +--- a/fs/f2fs/sysfs.c ++++ b/fs/f2fs/sysfs.c +@@ -480,6 +480,15 @@ static ssize_t __sbi_store(struct f2fs_attr *a, + return count; + } + ++ if (!strcmp(a->attr.name, "gc_urgent_high_remaining")) { ++ spin_lock(&sbi->gc_urgent_high_lock); ++ sbi->gc_urgent_high_limited = t == 0 ? 
false : true; ++ sbi->gc_urgent_high_remaining = t; ++ spin_unlock(&sbi->gc_urgent_high_lock); ++ ++ return count; ++ } ++ + #ifdef CONFIG_F2FS_IOSTAT + if (!strcmp(a->attr.name, "iostat_enable")) { + sbi->iostat_enable = !!t; +@@ -735,6 +744,7 @@ F2FS_RW_ATTR(FAULT_INFO_TYPE, f2fs_fault_info, inject_type, inject_type); + #endif + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, data_io_flag, data_io_flag); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, node_io_flag, node_io_flag); ++F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_urgent_high_remaining, gc_urgent_high_remaining); + F2FS_RW_ATTR(CPRC_INFO, ckpt_req_control, ckpt_thread_ioprio, ckpt_thread_ioprio); + F2FS_GENERAL_RO_ATTR(dirty_segments); + F2FS_GENERAL_RO_ATTR(free_segments); +@@ -846,6 +856,7 @@ static struct attribute *f2fs_attrs[] = { + #endif + ATTR_LIST(data_io_flag), + ATTR_LIST(node_io_flag), ++ ATTR_LIST(gc_urgent_high_remaining), + ATTR_LIST(ckpt_thread_ioprio), + ATTR_LIST(dirty_segments), + ATTR_LIST(free_segments), +-- +2.43.0 + diff --git a/queue-5.15/f2fs-fix-start-segno-of-large-section.patch b/queue-5.15/f2fs-fix-start-segno-of-large-section.patch new file mode 100644 index 00000000000..6a6351093d9 --- /dev/null +++ b/queue-5.15/f2fs-fix-start-segno-of-large-section.patch @@ -0,0 +1,40 @@ +From 42507285d0900ef65a62dc3aef2b194073593961 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 8 Jul 2024 20:04:07 +0800 +Subject: f2fs: fix start segno of large section + +From: Sheng Yong + +[ Upstream commit 8c409989678e92e4a737e7cd2bb04f3efb81071a ] + +get_ckpt_valid_blocks() checks valid ckpt blocks in current section. +It counts all vblocks from the first to the last segment in the +large section. However, START_SEGNO() is used to get the first segno +in an SIT block. This patch fixes that to get the correct start segno. 
+ +Fixes: 61461fc921b7 ("f2fs: fix to avoid touching checkpointed data in get_victim()") +Signed-off-by: Sheng Yong +Reviewed-by: Chao Yu +Signed-off-by: Jaegeuk Kim +Signed-off-by: Sasha Levin +--- + fs/f2fs/segment.h | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h +index 04f448ddf49ea..1d16449089d02 100644 +--- a/fs/f2fs/segment.h ++++ b/fs/f2fs/segment.h +@@ -369,7 +369,8 @@ static inline unsigned int get_ckpt_valid_blocks(struct f2fs_sb_info *sbi, + unsigned int segno, bool use_section) + { + if (use_section && __is_large_section(sbi)) { +- unsigned int start_segno = START_SEGNO(segno); ++ unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); ++ unsigned int start_segno = GET_SEG_FROM_SEC(sbi, secno); + unsigned int blocks = 0; + int i; + +-- +2.43.0 + diff --git a/queue-5.15/f2fs-fix-to-update-user-block-counts-in-block_operat.patch b/queue-5.15/f2fs-fix-to-update-user-block-counts-in-block_operat.patch new file mode 100644 index 00000000000..ef26e3d6346 --- /dev/null +++ b/queue-5.15/f2fs-fix-to-update-user-block-counts-in-block_operat.patch @@ -0,0 +1,86 @@ +From 28a1435984b58374df33937baf30b25d7668f3b4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 25 Jun 2024 10:32:39 +0800 +Subject: f2fs: fix to update user block counts in block_operations() + +From: Chao Yu + +[ Upstream commit f06c0f82e38bbda7264d6ef3c90045ad2810e0f3 ] + +Commit 59c9081bc86e ("f2fs: allow write page cache when writting cp") +allows write() to write data to page cache during checkpoint, so block +count fields like .total_valid_block_count, .alloc_valid_block_count +and .rf_node_block_count may encounter race condition as below: + +CP Thread A +- write_checkpoint + - block_operations + - f2fs_down_write(&sbi->node_change) + - __prepare_cp_block + : ckpt->valid_block_count = .total_valid_block_count + - f2fs_up_write(&sbi->node_change) + - write + - f2fs_preallocate_blocks + - f2fs_map_blocks(,F2FS_GET_BLOCK_PRE_AIO) + - 
f2fs_map_lock + - f2fs_down_read(&sbi->node_change) + - f2fs_reserve_new_blocks + - inc_valid_block_count + : percpu_counter_add(&sbi->alloc_valid_block_count, count) + : sbi->total_valid_block_count += count + - f2fs_up_read(&sbi->node_change) + - do_checkpoint + : sbi->last_valid_block_count = sbi->total_valid_block_count + : percpu_counter_set(&sbi->alloc_valid_block_count, 0) + : percpu_counter_set(&sbi->rf_node_block_count, 0) + - fsync + - need_do_checkpoint + - f2fs_space_for_roll_forward + : alloc_valid_block_count was reset to zero, + so, it may missed last data during checkpoint + +Let's change to update .total_valid_block_count, .alloc_valid_block_count +and .rf_node_block_count in block_operations(), then their access can be +protected by .node_change and .cp_rwsem lock, so that it can avoid above +race condition. + +Fixes: 59c9081bc86e ("f2fs: allow write page cache when writting cp") +Cc: Yunlei He +Signed-off-by: Chao Yu +Signed-off-by: Jaegeuk Kim +Signed-off-by: Sasha Levin +--- + fs/f2fs/checkpoint.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c +index 8d12e2fa32b8f..ab91aa0003fe4 100644 +--- a/fs/f2fs/checkpoint.c ++++ b/fs/f2fs/checkpoint.c +@@ -1175,6 +1175,11 @@ static void __prepare_cp_block(struct f2fs_sb_info *sbi) + ckpt->valid_node_count = cpu_to_le32(valid_node_count(sbi)); + ckpt->valid_inode_count = cpu_to_le32(valid_inode_count(sbi)); + ckpt->next_free_nid = cpu_to_le32(last_nid); ++ ++ /* update user_block_counts */ ++ sbi->last_valid_block_count = sbi->total_valid_block_count; ++ percpu_counter_set(&sbi->alloc_valid_block_count, 0); ++ percpu_counter_set(&sbi->rf_node_block_count, 0); + } + + static bool __need_flush_quota(struct f2fs_sb_info *sbi) +@@ -1566,11 +1571,6 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) + start_blk += NR_CURSEG_NODE_TYPE; + } + +- /* update user_block_counts */ +- sbi->last_valid_block_count = 
sbi->total_valid_block_count; +- percpu_counter_set(&sbi->alloc_valid_block_count, 0); +- percpu_counter_set(&sbi->rf_node_block_count, 0); +- + /* Here, we have one bio having CP pack except cp pack 2 page */ + f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO); + /* Wait for all dirty meta pages to be submitted for IO */ +-- +2.43.0 + diff --git a/queue-5.15/f2fs-introduce-fragment-allocation-mode-mount-option.patch b/queue-5.15/f2fs-introduce-fragment-allocation-mode-mount-option.patch new file mode 100644 index 00000000000..f15422e8dff --- /dev/null +++ b/queue-5.15/f2fs-introduce-fragment-allocation-mode-mount-option.patch @@ -0,0 +1,312 @@ +From bcfa36b10c3241ed299f21f0b186e97a9f064eed Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 29 Sep 2021 11:12:03 -0700 +Subject: f2fs: introduce fragment allocation mode mount option + +From: Daeho Jeong + +[ Upstream commit 6691d940b0e09dd1564130e7a354d6deaf05d009 ] + +Added two options into "mode=" mount option to make it possible for +developers to simulate filesystem fragmentation/after-GC situation +itself. The developers use these modes to understand filesystem +fragmentation/after-GC condition well, and eventually get some +insights to handle them better. + +"fragment:segment": f2fs allocates a new segment in ramdom position. + With this, we can simulate the after-GC condition. +"fragment:block" : We can scatter block allocation with + "max_fragment_chunk" and "max_fragment_hole" sysfs + nodes. f2fs will allocate 1.. + blocks in a chunk and make a hole in the length of + 1.. by turns in a newly allocated + free segment. Plus, this mode implicitly enables + "fragment:segment" option for more randomness. 
+ +Reviewed-by: Chao Yu +Signed-off-by: Daeho Jeong +Signed-off-by: Jaegeuk Kim +Stable-dep-of: f06c0f82e38b ("f2fs: fix to update user block counts in block_operations()") +Signed-off-by: Sasha Levin +--- + Documentation/ABI/testing/sysfs-fs-f2fs | 16 ++++++++++++++++ + Documentation/filesystems/f2fs.rst | 18 ++++++++++++++++++ + fs/f2fs/f2fs.h | 19 +++++++++++++++++-- + fs/f2fs/gc.c | 5 ++++- + fs/f2fs/segment.c | 20 ++++++++++++++++++-- + fs/f2fs/segment.h | 1 + + fs/f2fs/super.c | 10 ++++++++++ + fs/f2fs/sysfs.c | 20 ++++++++++++++++++++ + 8 files changed, 104 insertions(+), 5 deletions(-) + +diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs +index 48d41b6696270..91e2b549f8172 100644 +--- a/Documentation/ABI/testing/sysfs-fs-f2fs ++++ b/Documentation/ABI/testing/sysfs-fs-f2fs +@@ -513,3 +513,19 @@ Date: July 2021 + Contact: "Daeho Jeong" + Description: You can control the multiplier value of bdi device readahead window size + between 2 (default) and 256 for POSIX_FADV_SEQUENTIAL advise option. ++ ++What: /sys/fs/f2fs//max_fragment_chunk ++Date: August 2021 ++Contact: "Daeho Jeong" ++Description: With "mode=fragment:block" mount options, we can scatter block allocation. ++ f2fs will allocate 1.. blocks in a chunk and make a hole ++ in the length of 1.. by turns. This value can be set ++ between 1..512 and the default value is 4. ++ ++What: /sys/fs/f2fs//max_fragment_hole ++Date: August 2021 ++Contact: "Daeho Jeong" ++Description: With "mode=fragment:block" mount options, we can scatter block allocation. ++ f2fs will allocate 1.. blocks in a chunk and make a hole ++ in the length of 1.. by turns. This value can be set ++ between 1..512 and the default value is 4. 
+diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst +index 7fe50b0bccde9..6954c04753ad7 100644 +--- a/Documentation/filesystems/f2fs.rst ++++ b/Documentation/filesystems/f2fs.rst +@@ -202,6 +202,24 @@ fault_type=%d Support configuring fault injection type, should be + mode=%s Control block allocation mode which supports "adaptive" + and "lfs". In "lfs" mode, there should be no random + writes towards main area. ++ "fragment:segment" and "fragment:block" are newly added here. ++ These are developer options for experiments to simulate filesystem ++ fragmentation/after-GC situation itself. The developers use these ++ modes to understand filesystem fragmentation/after-GC condition well, ++ and eventually get some insights to handle them better. ++ In "fragment:segment", f2fs allocates a new segment in ramdom ++ position. With this, we can simulate the after-GC condition. ++ In "fragment:block", we can scatter block allocation with ++ "max_fragment_chunk" and "max_fragment_hole" sysfs nodes. ++ We added some randomness to both chunk and hole size to make ++ it close to realistic IO pattern. So, in this mode, f2fs will allocate ++ 1.. blocks in a chunk and make a hole in the ++ length of 1.. by turns. With this, the newly ++ allocated blocks will be scattered throughout the whole partition. ++ Note that "fragment:block" implicitly enables "fragment:segment" ++ option for more randomness. ++ Please, use these options for your experiments and we strongly ++ recommend to re-format the filesystem after using these options. + io_bits=%u Set the bit size of write IO requests. It should be set + with "mode=lfs". + usrquota Enable plain user disk quota accounting. 
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h +index e49fca9daf2d3..b5f1099ab388f 100644 +--- a/fs/f2fs/f2fs.h ++++ b/fs/f2fs/f2fs.h +@@ -1294,8 +1294,10 @@ enum { + }; + + enum { +- FS_MODE_ADAPTIVE, /* use both lfs/ssr allocation */ +- FS_MODE_LFS, /* use lfs allocation only */ ++ FS_MODE_ADAPTIVE, /* use both lfs/ssr allocation */ ++ FS_MODE_LFS, /* use lfs allocation only */ ++ FS_MODE_FRAGMENT_SEG, /* segment fragmentation mode */ ++ FS_MODE_FRAGMENT_BLK, /* block fragmentation mode */ + }; + + enum { +@@ -1770,6 +1772,9 @@ struct f2fs_sb_info { + + unsigned long seq_file_ra_mul; /* multiplier for ra_pages of seq. files in fadvise */ + ++ int max_fragment_chunk; /* max chunk size for block fragmentation mode */ ++ int max_fragment_hole; /* max hole size for block fragmentation mode */ ++ + #ifdef CONFIG_F2FS_FS_COMPRESSION + struct kmem_cache *page_array_slab; /* page array entry */ + unsigned int page_array_slab_size; /* default page array slab size */ +@@ -3539,6 +3544,16 @@ unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi, + unsigned int f2fs_usable_blks_in_seg(struct f2fs_sb_info *sbi, + unsigned int segno); + ++#define DEF_FRAGMENT_SIZE 4 ++#define MIN_FRAGMENT_SIZE 1 ++#define MAX_FRAGMENT_SIZE 512 ++ ++static inline bool f2fs_need_rand_seg(struct f2fs_sb_info *sbi) ++{ ++ return F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_SEG || ++ F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_BLK; ++} ++ + /* + * checkpoint.c + */ +diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c +index 9a57754e6e0c1..4d4d7f0c8a71b 100644 +--- a/fs/f2fs/gc.c ++++ b/fs/f2fs/gc.c +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + + #include "f2fs.h" + #include "node.h" +@@ -257,7 +258,9 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type, + p->max_search = sbi->max_victim_search; + + /* let's select beginning hot/small space first in no_heap mode*/ +- if (test_opt(sbi, NOHEAP) && ++ if (f2fs_need_rand_seg(sbi)) ++ p->offset = prandom_u32() % 
(MAIN_SECS(sbi) * sbi->segs_per_sec); ++ else if (test_opt(sbi, NOHEAP) && + (type == CURSEG_HOT_DATA || IS_NODESEG(type))) + p->offset = 0; + else +diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c +index 1c69dc91c3292..b059b02fc179d 100644 +--- a/fs/f2fs/segment.c ++++ b/fs/f2fs/segment.c +@@ -15,6 +15,7 @@ + #include + #include + #include ++#include + + #include "f2fs.h" + #include "segment.h" +@@ -2633,6 +2634,8 @@ static unsigned int __get_next_segno(struct f2fs_sb_info *sbi, int type) + unsigned short seg_type = curseg->seg_type; + + sanity_check_seg_type(sbi, seg_type); ++ if (f2fs_need_rand_seg(sbi)) ++ return prandom_u32() % (MAIN_SECS(sbi) * sbi->segs_per_sec); + + /* if segs_per_sec is large than 1, we need to keep original policy. */ + if (__is_large_section(sbi)) +@@ -2684,6 +2687,9 @@ static void new_curseg(struct f2fs_sb_info *sbi, int type, bool new_sec) + curseg->next_segno = segno; + reset_curseg(sbi, type, 1); + curseg->alloc_type = LFS; ++ if (F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_BLK) ++ curseg->fragment_remained_chunk = ++ prandom_u32() % sbi->max_fragment_chunk + 1; + } + + static int __next_free_blkoff(struct f2fs_sb_info *sbi, +@@ -2710,12 +2716,22 @@ static int __next_free_blkoff(struct f2fs_sb_info *sbi, + static void __refresh_next_blkoff(struct f2fs_sb_info *sbi, + struct curseg_info *seg) + { +- if (seg->alloc_type == SSR) ++ if (seg->alloc_type == SSR) { + seg->next_blkoff = + __next_free_blkoff(sbi, seg->segno, + seg->next_blkoff + 1); +- else ++ } else { + seg->next_blkoff++; ++ if (F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_BLK) { ++ /* To allocate block chunks in different sizes, use random number */ ++ if (--seg->fragment_remained_chunk <= 0) { ++ seg->fragment_remained_chunk = ++ prandom_u32() % sbi->max_fragment_chunk + 1; ++ seg->next_blkoff += ++ prandom_u32() % sbi->max_fragment_hole + 1; ++ } ++ } ++ } + } + + bool f2fs_segment_has_free_slot(struct f2fs_sb_info *sbi, int segno) +diff --git a/fs/f2fs/segment.h 
b/fs/f2fs/segment.h +index 1d16449089d02..d1c0c8732c4fd 100644 +--- a/fs/f2fs/segment.h ++++ b/fs/f2fs/segment.h +@@ -321,6 +321,7 @@ struct curseg_info { + unsigned short next_blkoff; /* next block offset to write */ + unsigned int zone; /* current zone number */ + unsigned int next_segno; /* preallocated segment */ ++ int fragment_remained_chunk; /* remained block size in a chunk for block fragmentation mode */ + bool inited; /* indicate inmem log is inited */ + }; + +diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c +index 706d7adda3b22..339e44467b9cd 100644 +--- a/fs/f2fs/super.c ++++ b/fs/f2fs/super.c +@@ -881,6 +881,10 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount) + F2FS_OPTION(sbi).fs_mode = FS_MODE_ADAPTIVE; + } else if (!strcmp(name, "lfs")) { + F2FS_OPTION(sbi).fs_mode = FS_MODE_LFS; ++ } else if (!strcmp(name, "fragment:segment")) { ++ F2FS_OPTION(sbi).fs_mode = FS_MODE_FRAGMENT_SEG; ++ } else if (!strcmp(name, "fragment:block")) { ++ F2FS_OPTION(sbi).fs_mode = FS_MODE_FRAGMENT_BLK; + } else { + kfree(name); + return -EINVAL; +@@ -1972,6 +1976,10 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root) + seq_puts(seq, "adaptive"); + else if (F2FS_OPTION(sbi).fs_mode == FS_MODE_LFS) + seq_puts(seq, "lfs"); ++ else if (F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_SEG) ++ seq_puts(seq, "fragment:segment"); ++ else if (F2FS_OPTION(sbi).fs_mode == FS_MODE_FRAGMENT_BLK) ++ seq_puts(seq, "fragment:block"); + seq_printf(seq, ",active_logs=%u", F2FS_OPTION(sbi).active_logs); + if (test_opt(sbi, RESERVE_ROOT)) + seq_printf(seq, ",reserve_root=%u,resuid=%u,resgid=%u", +@@ -3611,6 +3619,8 @@ static void init_sb_info(struct f2fs_sb_info *sbi) + sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH; + sbi->migration_granularity = sbi->segs_per_sec; + sbi->seq_file_ra_mul = MIN_RA_MUL; ++ sbi->max_fragment_chunk = DEF_FRAGMENT_SIZE; ++ sbi->max_fragment_hole = DEF_FRAGMENT_SIZE; + + sbi->dir_level = DEF_DIR_LEVEL; + 
sbi->interval_time[CP_TIME] = DEF_CP_INTERVAL; +diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c +index 63af1573ebcaa..c0e72bd44b135 100644 +--- a/fs/f2fs/sysfs.c ++++ b/fs/f2fs/sysfs.c +@@ -553,6 +553,22 @@ static ssize_t __sbi_store(struct f2fs_attr *a, + return count; + } + ++ if (!strcmp(a->attr.name, "max_fragment_chunk")) { ++ if (t >= MIN_FRAGMENT_SIZE && t <= MAX_FRAGMENT_SIZE) ++ sbi->max_fragment_chunk = t; ++ else ++ return -EINVAL; ++ return count; ++ } ++ ++ if (!strcmp(a->attr.name, "max_fragment_hole")) { ++ if (t >= MIN_FRAGMENT_SIZE && t <= MAX_FRAGMENT_SIZE) ++ sbi->max_fragment_hole = t; ++ else ++ return -EINVAL; ++ return count; ++ } ++ + *ui = (unsigned int)t; + + return count; +@@ -783,6 +799,8 @@ F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_age_threshold, age_threshold); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, seq_file_ra_mul, seq_file_ra_mul); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_segment_mode, gc_segment_mode); + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_reclaimed_segments, gc_reclaimed_segs); ++F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_chunk, max_fragment_chunk); ++F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_hole, max_fragment_hole); + + #define ATTR_LIST(name) (&f2fs_attr_##name.attr) + static struct attribute *f2fs_attrs[] = { +@@ -861,6 +879,8 @@ static struct attribute *f2fs_attrs[] = { + ATTR_LIST(seq_file_ra_mul), + ATTR_LIST(gc_segment_mode), + ATTR_LIST(gc_reclaimed_segments), ++ ATTR_LIST(max_fragment_chunk), ++ ATTR_LIST(max_fragment_hole), + NULL, + }; + ATTRIBUTE_GROUPS(f2fs); +-- +2.43.0 + diff --git a/queue-5.15/fs-don-t-allow-non-init-s_user_ns-for-filesystems-wi.patch b/queue-5.15/fs-don-t-allow-non-init-s_user_ns-for-filesystems-wi.patch new file mode 100644 index 00000000000..935e4981729 --- /dev/null +++ b/queue-5.15/fs-don-t-allow-non-init-s_user_ns-for-filesystems-wi.patch @@ -0,0 +1,65 @@ +From 950476bbe7abec78876f2f0c16ea0436fd524623 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 24 Jul 2024 
09:53:59 -0500 +Subject: fs: don't allow non-init s_user_ns for filesystems without + FS_USERNS_MOUNT + +From: Seth Forshee (DigitalOcean) + +[ Upstream commit e1c5ae59c0f22f7fe5c07fb5513a29e4aad868c9 ] + +Christian noticed that it is possible for a privileged user to mount +most filesystems with a non-initial user namespace in sb->s_user_ns. +When fsopen() is called in a non-init namespace the caller's namespace +is recorded in fs_context->user_ns. If the returned file descriptor is +then passed to a process priviliged in init_user_ns, that process can +call fsconfig(fd_fs, FSCONFIG_CMD_CREATE), creating a new superblock +with sb->s_user_ns set to the namespace of the process which called +fsopen(). + +This is problematic. We cannot assume that any filesystem which does not +set FS_USERNS_MOUNT has been written with a non-initial s_user_ns in +mind, increasing the risk for bugs and security issues. + +Prevent this by returning EPERM from sget_fc() when FS_USERNS_MOUNT is +not set for the filesystem and a non-initial user namespace will be +used. sget() does not need to be updated as it always uses the user +namespace of the current context, or the initial user namespace if +SB_SUBMOUNT is set. + +Fixes: cb50b348c71f ("convenience helpers: vfs_get_super() and sget_fc()") +Reported-by: Christian Brauner +Signed-off-by: Seth Forshee (DigitalOcean) +Link: https://lore.kernel.org/r/20240724-s_user_ns-fix-v1-1-895d07c94701@kernel.org +Reviewed-by: Alexander Mikhalitsyn +Signed-off-by: Christian Brauner +Signed-off-by: Sasha Levin +--- + fs/super.c | 11 +++++++++++ + 1 file changed, 11 insertions(+) + +diff --git a/fs/super.c b/fs/super.c +index 048576b19af63..39d866f7d7c6b 100644 +--- a/fs/super.c ++++ b/fs/super.c +@@ -528,6 +528,17 @@ struct super_block *sget_fc(struct fs_context *fc, + struct user_namespace *user_ns = fc->global ? 
&init_user_ns : fc->user_ns; + int err; + ++ /* ++ * Never allow s_user_ns != &init_user_ns when FS_USERNS_MOUNT is ++ * not set, as the filesystem is likely unprepared to handle it. ++ * This can happen when fsconfig() is called from init_user_ns with ++ * an fs_fd opened in another user namespace. ++ */ ++ if (user_ns != &init_user_ns && !(fc->fs_type->fs_flags & FS_USERNS_MOUNT)) { ++ errorfc(fc, "VFS: Mounting from non-initial user namespace is not allowed"); ++ return ERR_PTR(-EPERM); ++ } ++ + retry: + spin_lock(&sb_lock); + if (test) { +-- +2.43.0 + diff --git a/queue-5.15/iommu-sprd-avoid-null-deref-in-sprd_iommu_hw_en.patch b/queue-5.15/iommu-sprd-avoid-null-deref-in-sprd_iommu_hw_en.patch new file mode 100644 index 00000000000..38d4b02d005 --- /dev/null +++ b/queue-5.15/iommu-sprd-avoid-null-deref-in-sprd_iommu_hw_en.patch @@ -0,0 +1,41 @@ +From 1b298e56cdc3ac3a7d740d3d2ebeaf1568db0e2f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 16 Jul 2024 15:55:14 +0300 +Subject: iommu: sprd: Avoid NULL deref in sprd_iommu_hw_en + +From: Artem Chernyshev + +[ Upstream commit 630482ee0653decf9e2482ac6181897eb6cde5b8 ] + +In sprd_iommu_cleanup() before calling function sprd_iommu_hw_en() +dom->sdev is equal to NULL, which leads to null dereference. + +Found by Linux Verification Center (linuxtesting.org) with SVACE. 
+ +Fixes: 9afea57384d4 ("iommu/sprd: Release dma buffer to avoid memory leak") +Signed-off-by: Artem Chernyshev +Reviewed-by: Chunyan Zhang +Link: https://lore.kernel.org/r/20240716125522.3690358-1-artem.chernyshev@red-soft.ru +Signed-off-by: Will Deacon +Signed-off-by: Sasha Levin +--- + drivers/iommu/sprd-iommu.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c +index 6b11770e3d75a..f9392dbe6511e 100644 +--- a/drivers/iommu/sprd-iommu.c ++++ b/drivers/iommu/sprd-iommu.c +@@ -234,8 +234,8 @@ static void sprd_iommu_cleanup(struct sprd_iommu_domain *dom) + + pgt_size = sprd_iommu_pgt_size(&dom->domain); + dma_free_coherent(dom->sdev->dev, pgt_size, dom->pgt_va, dom->pgt_pa); +- dom->sdev = NULL; + sprd_iommu_hw_en(dom->sdev, false); ++ dom->sdev = NULL; + } + + static void sprd_iommu_domain_free(struct iommu_domain *domain) +-- +2.43.0 + diff --git a/queue-5.15/ipv4-fix-incorrect-source-address-in-record-route-op.patch b/queue-5.15/ipv4-fix-incorrect-source-address-in-record-route-op.patch new file mode 100644 index 00000000000..26f3638a843 --- /dev/null +++ b/queue-5.15/ipv4-fix-incorrect-source-address-in-record-route-op.patch @@ -0,0 +1,49 @@ +From 731e85183435f5a8c636c8efd5806a5f32fec82c Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 18 Jul 2024 15:34:07 +0300 +Subject: ipv4: Fix incorrect source address in Record Route option + +From: Ido Schimmel + +[ Upstream commit cc73bbab4b1fb8a4f53a24645871dafa5f81266a ] + +The Record Route IP option records the addresses of the routers that +routed the packet. In the case of forwarded packets, the kernel performs +a route lookup via fib_lookup() and fills in the preferred source +address of the matched route. + +The lookup is performed with the DS field of the forwarded packet, but +using the RT_TOS() macro which only masks one of the two ECN bits. 
If +the packet is ECT(0) or CE, the matched route might be different than +the route via which the packet was forwarded as the input path masks +both of the ECN bits, resulting in the wrong address being filled in the +Record Route option. + +Fix by masking both of the ECN bits. + +Fixes: 8e36360ae876 ("ipv4: Remove route key identity dependencies in ip_rt_get_source().") +Signed-off-by: Ido Schimmel +Reviewed-by: Guillaume Nault +Link: https://patch.msgid.link/20240718123407.434778-1-idosch@nvidia.com +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + net/ipv4/route.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/net/ipv4/route.c b/net/ipv4/route.c +index e7130a9f0e1a9..60fc35defdf8b 100644 +--- a/net/ipv4/route.c ++++ b/net/ipv4/route.c +@@ -1282,7 +1282,7 @@ void ip_rt_get_source(u8 *addr, struct sk_buff *skb, struct rtable *rt) + struct flowi4 fl4 = { + .daddr = iph->daddr, + .saddr = iph->saddr, +- .flowi4_tos = RT_TOS(iph->tos), ++ .flowi4_tos = iph->tos & IPTOS_RT_MASK, + .flowi4_oif = rt->dst.dev->ifindex, + .flowi4_iif = skb->dev->ifindex, + .flowi4_mark = skb->mark, +-- +2.43.0 + diff --git a/queue-5.15/jfs-fix-array-index-out-of-bounds-in-difree.patch b/queue-5.15/jfs-fix-array-index-out-of-bounds-in-difree.patch new file mode 100644 index 00000000000..c8508a4e822 --- /dev/null +++ b/queue-5.15/jfs-fix-array-index-out-of-bounds-in-difree.patch @@ -0,0 +1,46 @@ +From 09def29812da75840f732bc49466a6ec71f388cd Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 30 May 2024 22:28:09 +0900 +Subject: jfs: Fix array-index-out-of-bounds in diFree + +From: Jeongjun Park + +[ Upstream commit f73f969b2eb39ad8056f6c7f3a295fa2f85e313a ] + +Reported-by: syzbot+241c815bda521982cb49@syzkaller.appspotmail.com +Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") +Signed-off-by: Jeongjun Park +Signed-off-by: Dave Kleikamp +Signed-off-by: Sasha Levin +--- + fs/jfs/jfs_imap.c | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +diff 
--git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c +index ac42f8ee553fc..ba6f28521360b 100644 +--- a/fs/jfs/jfs_imap.c ++++ b/fs/jfs/jfs_imap.c +@@ -290,7 +290,7 @@ int diSync(struct inode *ipimap) + int diRead(struct inode *ip) + { + struct jfs_sb_info *sbi = JFS_SBI(ip->i_sb); +- int iagno, ino, extno, rc; ++ int iagno, ino, extno, rc, agno; + struct inode *ipimap; + struct dinode *dp; + struct iag *iagp; +@@ -339,8 +339,11 @@ int diRead(struct inode *ip) + + /* get the ag for the iag */ + agstart = le64_to_cpu(iagp->agstart); ++ agno = BLKTOAG(agstart, JFS_SBI(ip->i_sb)); + + release_metapage(mp); ++ if (agno >= MAXAG || agno < 0) ++ return -EIO; + + rel_inode = (ino & (INOSPERPAGE - 1)); + pageno = blkno >> sbi->l2nbperpage; +-- +2.43.0 + diff --git a/queue-5.15/kdb-address-wformat-security-warnings.patch b/queue-5.15/kdb-address-wformat-security-warnings.patch new file mode 100644 index 00000000000..d28a33c2d79 --- /dev/null +++ b/queue-5.15/kdb-address-wformat-security-warnings.patch @@ -0,0 +1,58 @@ +From 1a4b498b1aa25c8c7dc48aaa872e9491d2cd6e4d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 28 May 2024 14:11:48 +0200 +Subject: kdb: address -Wformat-security warnings + +From: Arnd Bergmann + +[ Upstream commit 70867efacf4370b6c7cdfc7a5b11300e9ef7de64 ] + +When -Wformat-security is not disabled, using a string pointer +as a format causes a warning: + +kernel/debug/kdb/kdb_io.c: In function 'kdb_read': +kernel/debug/kdb/kdb_io.c:365:36: error: format not a string literal and no format arguments [-Werror=format-security] + 365 | kdb_printf(kdb_prompt_str); + | ^~~~~~~~~~~~~~ +kernel/debug/kdb/kdb_io.c: In function 'kdb_getstr': +kernel/debug/kdb/kdb_io.c:456:20: error: format not a string literal and no format arguments [-Werror=format-security] + 456 | kdb_printf(kdb_prompt_str); + | ^~~~~~~~~~~~~~ + +Use an explcit "%s" format instead. 
+ +Signed-off-by: Arnd Bergmann +Fixes: 5d5314d6795f ("kdb: core for kgdb back end (1 of 2)") +Reviewed-by: Douglas Anderson +Link: https://lore.kernel.org/r/20240528121154.3662553-1-arnd@kernel.org +Signed-off-by: Daniel Thompson +Signed-off-by: Sasha Levin +--- + kernel/debug/kdb/kdb_io.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c +index a3b4b55d2e2e1..a4256e558a701 100644 +--- a/kernel/debug/kdb/kdb_io.c ++++ b/kernel/debug/kdb/kdb_io.c +@@ -358,7 +358,7 @@ static char *kdb_read(char *buffer, size_t bufsize) + if (i >= dtab_count) + kdb_printf("..."); + kdb_printf("\n"); +- kdb_printf(kdb_prompt_str); ++ kdb_printf("%s", kdb_prompt_str); + kdb_printf("%s", buffer); + if (cp != lastchar) + kdb_position_cursor(kdb_prompt_str, buffer, cp); +@@ -450,7 +450,7 @@ char *kdb_getstr(char *buffer, size_t bufsize, const char *prompt) + { + if (prompt && kdb_prompt_str != prompt) + strscpy(kdb_prompt_str, prompt, CMD_BUFLEN); +- kdb_printf(kdb_prompt_str); ++ kdb_printf("%s", kdb_prompt_str); + kdb_nextline = 1; /* Prompt and input resets line number */ + return kdb_read(buffer, bufsize); + } +-- +2.43.0 + diff --git a/queue-5.15/kdb-use-the-passed-prompt-in-kdb_position_cursor.patch b/queue-5.15/kdb-use-the-passed-prompt-in-kdb_position_cursor.patch new file mode 100644 index 00000000000..339f82a22fe --- /dev/null +++ b/queue-5.15/kdb-use-the-passed-prompt-in-kdb_position_cursor.patch @@ -0,0 +1,42 @@ +From fe59bb2ab2eebe22e0d797cb79535e327fa8e0a1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 28 May 2024 07:11:48 -0700 +Subject: kdb: Use the passed prompt in kdb_position_cursor() + +From: Douglas Anderson + +[ Upstream commit e2e821095949cde46256034975a90f88626a2a73 ] + +The function kdb_position_cursor() takes in a "prompt" parameter but +never uses it. 
This doesn't _really_ matter since all current callers +of the function pass the same value and it's a global variable, but +it's a bit ugly. Let's clean it up. + +Found by code inspection. This patch is expected to functionally be a +no-op. + +Fixes: 09b35989421d ("kdb: Use format-strings rather than '\0' injection in kdb_read()") +Signed-off-by: Douglas Anderson +Link: https://lore.kernel.org/r/20240528071144.1.I0feb49839c6b6f4f2c4bf34764f5e95de3f55a66@changeid +Signed-off-by: Daniel Thompson +Signed-off-by: Sasha Levin +--- + kernel/debug/kdb/kdb_io.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c +index a4256e558a701..b28b8a5ef6381 100644 +--- a/kernel/debug/kdb/kdb_io.c ++++ b/kernel/debug/kdb/kdb_io.c +@@ -194,7 +194,7 @@ char kdb_getchar(void) + */ + static void kdb_position_cursor(char *prompt, char *buffer, char *cp) + { +- kdb_printf("\r%s", kdb_prompt_str); ++ kdb_printf("\r%s", prompt); + if (cp > buffer) + kdb_printf("%.*s", (int)(cp - buffer), buffer); + } +-- +2.43.0 + diff --git a/queue-5.15/libbpf-fix-no-args-func-prototype-btf-dumping-syntax.patch b/queue-5.15/libbpf-fix-no-args-func-prototype-btf-dumping-syntax.patch new file mode 100644 index 00000000000..b4f3b82dd63 --- /dev/null +++ b/queue-5.15/libbpf-fix-no-args-func-prototype-btf-dumping-syntax.patch @@ -0,0 +1,96 @@ +From 5daa156f287a237a2651d98983d53979bb2c4b15 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 12 Jul 2024 15:44:42 -0700 +Subject: libbpf: Fix no-args func prototype BTF dumping syntax + +From: Andrii Nakryiko + +[ Upstream commit 189f1a976e426011e6a5588f1d3ceedf71fe2965 ] + +For all these years libbpf's BTF dumper has been emitting not strictly +valid syntax for function prototypes that have no input arguments. + +Instead of `int (*blah)()` we should emit `int (*blah)(void)`. + +This is not normally a problem, but it manifests when we get kfuncs in +vmlinux.h that have no input arguments. 
Due to compiler internal +specifics, we get no BTF information for such kfuncs, if they are not +declared with proper `(void)`. + +The fix is trivial. We also need to adjust a few ancient tests that +happily assumed `()` is correct. + +Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") +Reported-by: Tejun Heo +Signed-off-by: Andrii Nakryiko +Signed-off-by: Daniel Borkmann +Acked-by: Stanislav Fomichev +Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org +Signed-off-by: Sasha Levin +--- + tools/lib/bpf/btf_dump.c | 8 +++++--- + .../selftests/bpf/progs/btf_dump_test_case_multidim.c | 4 ++-- + .../selftests/bpf/progs/btf_dump_test_case_syntax.c | 4 ++-- + 3 files changed, 9 insertions(+), 7 deletions(-) + +diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c +index b91dd7cd4ffb0..c2bf996fcba82 100644 +--- a/tools/lib/bpf/btf_dump.c ++++ b/tools/lib/bpf/btf_dump.c +@@ -1458,10 +1458,12 @@ static void btf_dump_emit_type_chain(struct btf_dump *d, + * Clang for BPF target generates func_proto with no + * args as a func_proto with a single void arg (e.g., + * `int (*f)(void)` vs just `int (*f)()`). We are +- * going to pretend there are no args for such case. ++ * going to emit valid empty args (void) syntax for ++ * such case. Similarly and conveniently, valid ++ * no args case can be special-cased here as well. 
+ */ +- if (vlen == 1 && p->type == 0) { +- btf_dump_printf(d, ")"); ++ if (vlen == 0 || (vlen == 1 && p->type == 0)) { ++ btf_dump_printf(d, "void)"); + return; + } + +diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_multidim.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_multidim.c +index ba97165bdb282..a657651eba523 100644 +--- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_multidim.c ++++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_multidim.c +@@ -14,9 +14,9 @@ typedef int *ptr_arr_t[6]; + + typedef int *ptr_multiarr_t[7][8][9][10]; + +-typedef int * (*fn_ptr_arr_t[11])(); ++typedef int * (*fn_ptr_arr_t[11])(void); + +-typedef int * (*fn_ptr_multiarr_t[12][13])(); ++typedef int * (*fn_ptr_multiarr_t[12][13])(void); + + struct root_struct { + arr_t _1; +diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c +index 970598dda7322..7fda004c153a6 100644 +--- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c ++++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c +@@ -67,7 +67,7 @@ typedef void (*printf_fn_t)(const char *, ...); + * `int -> char *` function and returns pointer to a char. 
Equivalent: + * typedef char * (*fn_input_t)(int); + * typedef char * (*fn_output_outer_t)(fn_input_t); +- * typedef const fn_output_outer_t (* fn_output_inner_t)(); ++ * typedef const fn_output_outer_t (* fn_output_inner_t)(void); + * typedef const fn_output_inner_t fn_ptr_arr2_t[5]; + */ + /* ----- START-EXPECTED-OUTPUT ----- */ +@@ -94,7 +94,7 @@ typedef void (* (*signal_t)(int, void (*)(int)))(int); + + typedef char * (*fn_ptr_arr1_t[10])(int **); + +-typedef char * (* (* const fn_ptr_arr2_t[5])())(char * (*)(int)); ++typedef char * (* (* const fn_ptr_arr2_t[5])(void))(char * (*)(int)); + + struct struct_w_typedefs { + int_t a; +-- +2.43.0 + diff --git a/queue-5.15/lirc-rc_dev_get_from_fd-fix-file-leak.patch b/queue-5.15/lirc-rc_dev_get_from_fd-fix-file-leak.patch new file mode 100644 index 00000000000..f8955df83e5 --- /dev/null +++ b/queue-5.15/lirc-rc_dev_get_from_fd-fix-file-leak.patch @@ -0,0 +1,37 @@ +From eeb70c7aa92af1dd9f39f4063b4a488d207cc78b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 30 May 2024 23:58:26 -0400 +Subject: lirc: rc_dev_get_from_fd(): fix file leak + +From: Al Viro + +[ Upstream commit bba1f6758a9ec90c1adac5dcf78f8a15f1bad65b ] + +missing fdput() on a failure exit + +Fixes: 6a9d552483d50 "media: rc: bpf attach/detach requires write permission" # v6.9 +Signed-off-by: Al Viro +Signed-off-by: Sasha Levin +--- + drivers/media/rc/lirc_dev.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c +index d73f02b0db842..54f4a7cd88f43 100644 +--- a/drivers/media/rc/lirc_dev.c ++++ b/drivers/media/rc/lirc_dev.c +@@ -841,8 +841,10 @@ struct rc_dev *rc_dev_get_from_fd(int fd, bool write) + return ERR_PTR(-EINVAL); + } + +- if (write && !(f.file->f_mode & FMODE_WRITE)) ++ if (write && !(f.file->f_mode & FMODE_WRITE)) { ++ fdput(f); + return ERR_PTR(-EPERM); ++ } + + fh = f.file->private_data; + dev = fh->rc; +-- +2.43.0 + diff --git 
a/queue-5.15/mips-smp-cps-fix-address-for-gcr_access-register-for.patch b/queue-5.15/mips-smp-cps-fix-address-for-gcr_access-register-for.patch new file mode 100644 index 00000000000..0fbf6106496 --- /dev/null +++ b/queue-5.15/mips-smp-cps-fix-address-for-gcr_access-register-for.patch @@ -0,0 +1,66 @@ +From ef8341e7f98b7f98f7a0b8a5dfc2243d54b273e1 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 22 Jul 2024 15:15:39 +0200 +Subject: MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later + +From: Gregory CLEMENT + +[ Upstream commit a263e5f309f32301e1f3ad113293f4e68a82a646 ] + +When the CM block migrated from CM2.5 to CM3.0, the address offset for +the Global CSR Access Privilege register was modified. We saw this in +the "MIPS64 I6500 Multiprocessing System Programmer's Guide," it is +stated that "the Global CSR Access Privilege register is located at +offset 0x0120" in section 5.4. It is at least the same for I6400. + +This fix allows to use the VP cores in SMP mode if the reset values +were modified by the bootloader. + +Based on the work of Vladimir Kondratiev + and the feedback from Jiaxun Yang +. 
+ +Fixes: 197e89e0984a ("MIPS: mips-cm: Implement mips_cm_revision") +Signed-off-by: Gregory CLEMENT +Reviewed-by: Jiaxun Yang +Signed-off-by: Thomas Bogendoerfer +Signed-off-by: Sasha Levin +--- + arch/mips/include/asm/mips-cm.h | 4 ++++ + arch/mips/kernel/smp-cps.c | 5 ++++- + 2 files changed, 8 insertions(+), 1 deletion(-) + +diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h +index 23c67c0871b17..696b40beb774f 100644 +--- a/arch/mips/include/asm/mips-cm.h ++++ b/arch/mips/include/asm/mips-cm.h +@@ -228,6 +228,10 @@ GCR_ACCESSOR_RO(32, 0x0d0, gic_status) + GCR_ACCESSOR_RO(32, 0x0f0, cpc_status) + #define CM_GCR_CPC_STATUS_EX BIT(0) + ++/* GCR_ACCESS - Controls core/IOCU access to GCRs */ ++GCR_ACCESSOR_RW(32, 0x120, access_cm3) ++#define CM_GCR_ACCESS_ACCESSEN GENMASK(7, 0) ++ + /* GCR_L2_CONFIG - Indicates L2 cache configuration when Config5.L2C=1 */ + GCR_ACCESSOR_RW(32, 0x130, l2_config) + #define CM_GCR_L2_CONFIG_BYPASS BIT(20) +diff --git a/arch/mips/kernel/smp-cps.c b/arch/mips/kernel/smp-cps.c +index f2df0cae1b4d9..7409d46ce31a8 100644 +--- a/arch/mips/kernel/smp-cps.c ++++ b/arch/mips/kernel/smp-cps.c +@@ -230,7 +230,10 @@ static void boot_core(unsigned int core, unsigned int vpe_id) + write_gcr_co_reset_ext_base(CM_GCR_Cx_RESET_EXT_BASE_UEB); + + /* Ensure the core can access the GCRs */ +- set_gcr_access(1 << core); ++ if (mips_cm_revision() < CM_REV_CM3) ++ set_gcr_access(1 << core); ++ else ++ set_gcr_access_cm3(1 << core); + + if (mips_cpc_present()) { + /* Reset the core */ +-- +2.43.0 + diff --git a/queue-5.15/misdn-fix-a-use-after-free-in-hfcmulti_tx.patch b/queue-5.15/misdn-fix-a-use-after-free-in-hfcmulti_tx.patch new file mode 100644 index 00000000000..00d0d174000 --- /dev/null +++ b/queue-5.15/misdn-fix-a-use-after-free-in-hfcmulti_tx.patch @@ -0,0 +1,55 @@ +From a6240675a25eab05d691ad52191f3cad593633ec Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 24 Jul 2024 11:08:18 -0500 +Subject: mISDN: Fix a use 
after free in hfcmulti_tx() + +From: Dan Carpenter + +[ Upstream commit 61ab751451f5ebd0b98e02276a44e23a10110402 ] + +Don't dereference *sp after calling dev_kfree_skb(*sp). + +Fixes: af69fb3a8ffa ("Add mISDN HFC multiport driver") +Signed-off-by: Dan Carpenter +Reviewed-by: Simon Horman +Link: https://patch.msgid.link/8be65f5a-c2dd-4ba0-8a10-bfe5980b8cfb@stanley.mountain +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/isdn/hardware/mISDN/hfcmulti.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c +index e840609c50eb7..2063afffd0853 100644 +--- a/drivers/isdn/hardware/mISDN/hfcmulti.c ++++ b/drivers/isdn/hardware/mISDN/hfcmulti.c +@@ -1931,7 +1931,7 @@ hfcmulti_dtmf(struct hfc_multi *hc) + static void + hfcmulti_tx(struct hfc_multi *hc, int ch) + { +- int i, ii, temp, len = 0; ++ int i, ii, temp, tmp_len, len = 0; + int Zspace, z1, z2; /* must be int for calculation */ + int Fspace, f1, f2; + u_char *d; +@@ -2152,14 +2152,15 @@ hfcmulti_tx(struct hfc_multi *hc, int ch) + HFC_wait_nodebug(hc); + } + ++ tmp_len = (*sp)->len; + dev_kfree_skb(*sp); + /* check for next frame */ + if (bch && get_next_bframe(bch)) { +- len = (*sp)->len; ++ len = tmp_len; + goto next_frame; + } + if (dch && get_next_dframe(dch)) { +- len = (*sp)->len; ++ len = tmp_len; + goto next_frame; + } + +-- +2.43.0 + diff --git a/queue-5.15/net-bonding-correctly-annotate-rcu-in-bond_should_no.patch b/queue-5.15/net-bonding-correctly-annotate-rcu-in-bond_should_no.patch new file mode 100644 index 00000000000..2a27208d53c --- /dev/null +++ b/queue-5.15/net-bonding-correctly-annotate-rcu-in-bond_should_no.patch @@ -0,0 +1,53 @@ +From 41ace79dc0021d41770e7ef683928c63e85f8633 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 19 Jul 2024 09:41:18 -0700 +Subject: net: bonding: correctly annotate RCU in bond_should_notify_peers() + +From: Johannes Berg + +[ Upstream 
commit 3ba359c0cd6eb5ea772125a7aededb4a2d516684 ] + +RCU use in bond_should_notify_peers() looks wrong, since it does +rcu_dereference(), leaves the critical section, and uses the +pointer after that. + +Luckily, it's called either inside a nested RCU critical section +or with the RTNL held. + +Annotate it with rcu_dereference_rtnl() instead, and remove the +inner RCU critical section. + +Fixes: 4cb4f97b7e36 ("bonding: rebuild the lock use for bond_mii_monitor()") +Reviewed-by: Jiri Pirko +Signed-off-by: Johannes Berg +Acked-by: Jay Vosburgh +Link: https://patch.msgid.link/20240719094119.35c62455087d.I68eb9c0f02545b364b79a59f2110f2cf5682a8e2@changeid +Signed-off-by: Paolo Abeni +Signed-off-by: Sasha Levin +--- + drivers/net/bonding/bond_main.c | 7 ++----- + 1 file changed, 2 insertions(+), 5 deletions(-) + +diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c +index 9aed194d308d6..6a91229b0e05b 100644 +--- a/drivers/net/bonding/bond_main.c ++++ b/drivers/net/bonding/bond_main.c +@@ -1087,13 +1087,10 @@ static struct slave *bond_find_best_slave(struct bonding *bond) + return bestslave; + } + ++/* must be called in RCU critical section or with RTNL held */ + static bool bond_should_notify_peers(struct bonding *bond) + { +- struct slave *slave; +- +- rcu_read_lock(); +- slave = rcu_dereference(bond->curr_active_slave); +- rcu_read_unlock(); ++ struct slave *slave = rcu_dereference_rtnl(bond->curr_active_slave); + + if (!slave || !bond->send_peer_notif || + bond->send_peer_notif % +-- +2.43.0 + diff --git a/queue-5.15/net-nexthop-initialize-all-fields-in-dumped-nexthops.patch b/queue-5.15/net-nexthop-initialize-all-fields-in-dumped-nexthops.patch new file mode 100644 index 00000000000..4f001a41560 --- /dev/null +++ b/queue-5.15/net-nexthop-initialize-all-fields-in-dumped-nexthops.patch @@ -0,0 +1,55 @@ +From f993574bc20b1104e1a08428a199bc9f7dfb9ef5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Jul 2024 18:04:16 +0200 +Subject: 
net: nexthop: Initialize all fields in dumped nexthops + +From: Petr Machata + +[ Upstream commit 6d745cd0e9720282cd291d36b9db528aea18add2 ] + +struct nexthop_grp contains two reserved fields that are not initialized by +nla_put_nh_group(), and carry garbage. This can be observed e.g. with +strace (edited for clarity): + + # ip nexthop add id 1 dev lo + # ip nexthop add id 101 group 1 + # strace -e recvmsg ip nexthop get id 101 + ... + recvmsg(... [{nla_len=12, nla_type=NHA_GROUP}, + [{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52 + +The fields are reserved and therefore not currently used. But as they are, they +leak kernel memory, and the fact they are not just zero complicates repurposing +of the fields for new ends. Initialize the full structure. + +Fixes: 430a049190de ("nexthop: Add support for nexthop groups") +Signed-off-by: Petr Machata +Reviewed-by: Ido Schimmel +Reviewed-by: Eric Dumazet +Signed-off-by: David S. Miller +Signed-off-by: Sasha Levin +--- + net/ipv4/nexthop.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c +index c140a36bd1e65..633eab6ff55dd 100644 +--- a/net/ipv4/nexthop.c ++++ b/net/ipv4/nexthop.c +@@ -675,9 +675,10 @@ static int nla_put_nh_group(struct sk_buff *skb, struct nh_group *nhg) + + p = nla_data(nla); + for (i = 0; i < nhg->num_nh; ++i) { +- p->id = nhg->nh_entries[i].nh->id; +- p->weight = nhg->nh_entries[i].weight - 1; +- p += 1; ++ *p++ = (struct nexthop_grp) { ++ .id = nhg->nh_entries[i].nh->id, ++ .weight = nhg->nh_entries[i].weight - 1, ++ }; + } + + if (nhg->resilient && nla_put_nh_group_res(skb, nhg)) +-- +2.43.0 + diff --git a/queue-5.15/net-stmmac-correct-byte-order-of-perfect_match.patch b/queue-5.15/net-stmmac-correct-byte-order-of-perfect_match.patch new file mode 100644 index 00000000000..96bbfc73db9 --- /dev/null +++ b/queue-5.15/net-stmmac-correct-byte-order-of-perfect_match.patch @@ -0,0 +1,110 @@ +From 
2026b160c36773dc62c3103d5311fdffd40a7792 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 23 Jul 2024 14:29:27 +0100 +Subject: net: stmmac: Correct byte order of perfect_match + +From: Simon Horman + +[ Upstream commit e9dbebae2e3c338122716914fe105458f41e3a4a ] + +The perfect_match parameter of the update_vlan_hash operation is __le16, +and is correctly converted from host byte-order in the lone caller, +stmmac_vlan_update(). + +However, the implementations of this callback, dwmac4_update_vlan_hash() +and dwxgmac2_update_vlan_hash(), both treat this parameter as host byte +order, using the following pattern: + + u32 value = ... + ... + writel(value | perfect_match, ...); + +This is not correct because both: +1) value is host byte order; and +2) writel expects a host byte order value as its first argument + +I believe that this will break on big endian systems. And I expect it +has gone unnoticed by only being exercised on little endian systems. + +The approach taken by this patch is to update the callback, and its +caller to simply use a host byte order value. + +Flagged by Sparse. +Compile tested only. + +Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available") +Signed-off-by: Simon Horman +Reviewed-by: Maxime Chevallier +Signed-off-by: David S.
Miller +Signed-off-by: Sasha Levin +--- + drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 2 +- + drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 2 +- + drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +- + drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++-- + 4 files changed, 5 insertions(+), 5 deletions(-) + +diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +index 026e3645e566a..e5c5a9c5389c3 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c ++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +@@ -972,7 +972,7 @@ static void dwmac4_set_mac_loopback(void __iomem *ioaddr, bool enable) + } + + static void dwmac4_update_vlan_hash(struct mac_device_info *hw, u32 hash, +- __le16 perfect_match, bool is_double) ++ u16 perfect_match, bool is_double) + { + void __iomem *ioaddr = hw->pcsr; + u32 value; +diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +index dd73f38ec08d8..813327d04c56f 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c ++++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +@@ -582,7 +582,7 @@ static int dwxgmac2_rss_configure(struct mac_device_info *hw, + } + + static void dwxgmac2_update_vlan_hash(struct mac_device_info *hw, u32 hash, +- __le16 perfect_match, bool is_double) ++ u16 perfect_match, bool is_double) + { + void __iomem *ioaddr = hw->pcsr; + +diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h +index 58e5c6c428dc0..414b63d5b9ebe 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h ++++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h +@@ -370,7 +370,7 @@ struct stmmac_ops { + struct stmmac_rss *cfg, u32 num_rxq); + /* VLAN */ + void (*update_vlan_hash)(struct mac_device_info *hw, u32 hash, +- __le16 perfect_match, bool is_double); ++ u16 perfect_match, bool is_double); + void (*enable_vlan)(struct 
mac_device_info *hw, u32 type); + int (*add_hw_vlan_rx_fltr)(struct net_device *dev, + struct mac_device_info *hw, +diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +index b0ab8f6986f8b..a5cbb495b5581 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c ++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +@@ -6237,7 +6237,7 @@ static u32 stmmac_vid_crc32_le(__le16 vid_le) + static int stmmac_vlan_update(struct stmmac_priv *priv, bool is_double) + { + u32 crc, hash = 0; +- __le16 pmatch = 0; ++ u16 pmatch = 0; + int count = 0; + u16 vid = 0; + +@@ -6252,7 +6252,7 @@ static int stmmac_vlan_update(struct stmmac_priv *priv, bool is_double) + if (count > 2) /* VID = 0 always passes filter */ + return -EOPNOTSUPP; + +- pmatch = cpu_to_le16(vid); ++ pmatch = vid; + hash = 0; + } + +-- +2.43.0 + diff --git a/queue-5.15/netfilter-nft_set_pipapo_avx2-disable-softinterrupts.patch b/queue-5.15/netfilter-nft_set_pipapo_avx2-disable-softinterrupts.patch new file mode 100644 index 00000000000..11a7482ca0b --- /dev/null +++ b/queue-5.15/netfilter-nft_set_pipapo_avx2-disable-softinterrupts.patch @@ -0,0 +1,71 @@ +From b35f611f0edaba00c1b7f0523a681d0bbe4c5e97 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 19 Jul 2024 13:19:26 +0200 +Subject: netfilter: nft_set_pipapo_avx2: disable softinterrupts + +From: Florian Westphal + +[ Upstream commit a16909ae9982e931841c456061cb57fbaec9c59e ] + +We need to disable softinterrupts, else we get following problem: + +1. pipapo_avx2 called from process context; fpu usable +2. preempt_disable() called, pcpu scratchmap in use +3. softirq handles rx or tx, we re-enter pipapo_avx2 +4. fpu busy, fallback to generic non-avx version +5. fallback reuses scratch map and index, which are in use + by the preempted process + +Handle this same way as generic version by first disabling +softinterrupts while the scratchmap is in use. 
+ +Fixes: f0b3d338064e ("netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version") +Cc: Stefano Brivio +Signed-off-by: Florian Westphal +Reviewed-by: Stefano Brivio +Signed-off-by: Pablo Neira Ayuso +Signed-off-by: Sasha Levin +--- + net/netfilter/nft_set_pipapo_avx2.c | 12 ++++++++++-- + 1 file changed, 10 insertions(+), 2 deletions(-) + +diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c +index 295406cf63672..dfae90cd34939 100644 +--- a/net/netfilter/nft_set_pipapo_avx2.c ++++ b/net/netfilter/nft_set_pipapo_avx2.c +@@ -1141,8 +1141,14 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, + bool map_index; + int i, ret = 0; + +- if (unlikely(!irq_fpu_usable())) +- return nft_pipapo_lookup(net, set, key, ext); ++ local_bh_disable(); ++ ++ if (unlikely(!irq_fpu_usable())) { ++ bool fallback_res = nft_pipapo_lookup(net, set, key, ext); ++ ++ local_bh_enable(); ++ return fallback_res; ++ } + + m = rcu_dereference(priv->match); + +@@ -1157,6 +1163,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, + scratch = *raw_cpu_ptr(m->scratch); + if (unlikely(!scratch)) { + kernel_fpu_end(); ++ local_bh_enable(); + return false; + } + +@@ -1237,6 +1244,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, + if (i % 2) + scratch->map_index = !map_index; + kernel_fpu_end(); ++ local_bh_enable(); + + return ret >= 0; + } +-- +2.43.0 + diff --git a/queue-5.15/nvme-pci-add-missing-condition-check-for-existence-o.patch b/queue-5.15/nvme-pci-add-missing-condition-check-for-existence-o.patch new file mode 100644 index 00000000000..ec45d2acca5 --- /dev/null +++ b/queue-5.15/nvme-pci-add-missing-condition-check-for-existence-o.patch @@ -0,0 +1,39 @@ +From 0b7a198a03e8eaee8385bc0b78a6ac50ff168b6e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 24 Jul 2024 13:31:14 +0300 +Subject: nvme-pci: add missing condition check for 
existence of mapped data + +From: Leon Romanovsky + +[ Upstream commit c31fad1470389666ac7169fe43aa65bf5b7e2cfd ] + +nvme_map_data() is called when request has physical segments, hence +the nvme_unmap_data() should have same condition to avoid dereference. + +Fixes: 4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data") +Signed-off-by: Leon Romanovsky +Reviewed-by: Christoph Hellwig +Reviewed-by: Nitesh Shetty +Signed-off-by: Keith Busch +Signed-off-by: Sasha Levin +--- + drivers/nvme/host/pci.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c +index 01f16989d0d84..1df3e083f3c67 100644 +--- a/drivers/nvme/host/pci.c ++++ b/drivers/nvme/host/pci.c +@@ -914,7 +914,8 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req) + blk_mq_start_request(req); + return BLK_STS_OK; + out_unmap_data: +- nvme_unmap_data(dev, req); ++ if (blk_rq_nr_phys_segments(req)) ++ nvme_unmap_data(dev, req); + out_free_cmd: + nvme_cleanup_cmd(req); + return ret; +-- +2.43.0 + diff --git a/queue-5.15/nvme-separate-command-prep-and-issue.patch b/queue-5.15/nvme-separate-command-prep-and-issue.patch new file mode 100644 index 00000000000..2cee89e65be --- /dev/null +++ b/queue-5.15/nvme-separate-command-prep-and-issue.patch @@ -0,0 +1,127 @@ +From 6e47ad61670e9205955d8467e0e3e45f5a80b64a Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 29 Oct 2021 14:34:11 -0600 +Subject: nvme: separate command prep and issue + +From: Jens Axboe + +[ Upstream commit 62451a2b2e7ea17c4a547ada6a5deebf8787a27a ] + +Add a nvme_prep_rq() helper to setup a command, and nvme_queue_rq() is +adapted to use this helper. 
+ +Reviewed-by: Hannes Reinecke +Reviewed-by: Christoph Hellwig +Signed-off-by: Jens Axboe +Stable-dep-of: c31fad147038 ("nvme-pci: add missing condition check for existence of mapped data") +Signed-off-by: Sasha Levin +--- + drivers/nvme/host/pci.c | 63 +++++++++++++++++++++++------------------ + 1 file changed, 36 insertions(+), 27 deletions(-) + +diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c +index 04e51134165dd..01f16989d0d84 100644 +--- a/drivers/nvme/host/pci.c ++++ b/drivers/nvme/host/pci.c +@@ -886,55 +886,32 @@ static blk_status_t nvme_map_metadata(struct nvme_dev *dev, struct request *req, + return BLK_STS_OK; + } + +-/* +- * NOTE: ns is NULL when called on the admin queue. +- */ +-static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx, +- const struct blk_mq_queue_data *bd) ++static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req) + { +- struct nvme_ns *ns = hctx->queue->queuedata; +- struct nvme_queue *nvmeq = hctx->driver_data; +- struct nvme_dev *dev = nvmeq->dev; +- struct request *req = bd->rq; + struct nvme_iod *iod = blk_mq_rq_to_pdu(req); +- struct nvme_command *cmnd = &iod->cmd; + blk_status_t ret; + + iod->aborted = 0; + iod->npages = -1; + iod->nents = 0; + +- /* +- * We should not need to do this, but we're still using this to +- * ensure we can drain requests on a dying queue. 
+- */ +- if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags))) +- return BLK_STS_IOERR; +- +- if (!nvme_check_ready(&dev->ctrl, req, true)) +- return nvme_fail_nonready_command(&dev->ctrl, req); +- +- ret = nvme_setup_cmd(ns, req); ++ ret = nvme_setup_cmd(req->q->queuedata, req); + if (ret) + return ret; + + if (blk_rq_nr_phys_segments(req)) { +- ret = nvme_map_data(dev, req, cmnd); ++ ret = nvme_map_data(dev, req, &iod->cmd); + if (ret) + goto out_free_cmd; + } + + if (blk_integrity_rq(req)) { +- ret = nvme_map_metadata(dev, req, cmnd); ++ ret = nvme_map_metadata(dev, req, &iod->cmd); + if (ret) + goto out_unmap_data; + } + + blk_mq_start_request(req); +- spin_lock(&nvmeq->sq_lock); +- nvme_sq_copy_cmd(nvmeq, &iod->cmd); +- nvme_write_sq_db(nvmeq, bd->last); +- spin_unlock(&nvmeq->sq_lock); + return BLK_STS_OK; + out_unmap_data: + nvme_unmap_data(dev, req); +@@ -943,6 +920,38 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx, + return ret; + } + ++/* ++ * NOTE: ns is NULL when called on the admin queue. ++ */ ++static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx, ++ const struct blk_mq_queue_data *bd) ++{ ++ struct nvme_queue *nvmeq = hctx->driver_data; ++ struct nvme_dev *dev = nvmeq->dev; ++ struct request *req = bd->rq; ++ struct nvme_iod *iod = blk_mq_rq_to_pdu(req); ++ blk_status_t ret; ++ ++ /* ++ * We should not need to do this, but we're still using this to ++ * ensure we can drain requests on a dying queue. 
++ */ ++ if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags))) ++ return BLK_STS_IOERR; ++ ++ if (unlikely(!nvme_check_ready(&dev->ctrl, req, true))) ++ return nvme_fail_nonready_command(&dev->ctrl, req); ++ ++ ret = nvme_prep_rq(dev, req); ++ if (unlikely(ret)) ++ return ret; ++ spin_lock(&nvmeq->sq_lock); ++ nvme_sq_copy_cmd(nvmeq, &iod->cmd); ++ nvme_write_sq_db(nvmeq, bd->last); ++ spin_unlock(&nvmeq->sq_lock); ++ return BLK_STS_OK; ++} ++ + static void nvme_pci_complete_rq(struct request *req) + { + struct nvme_queue *nvmeq = req->mq_hctx->driver_data; +-- +2.43.0 + diff --git a/queue-5.15/nvme-split-command-copy-into-a-helper.patch b/queue-5.15/nvme-split-command-copy-into-a-helper.patch new file mode 100644 index 00000000000..575cb91f978 --- /dev/null +++ b/queue-5.15/nvme-split-command-copy-into-a-helper.patch @@ -0,0 +1,81 @@ +From 71af5fa40ecd43dbb5118d6e79449f95f94cfb72 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 29 Oct 2021 14:32:44 -0600 +Subject: nvme: split command copy into a helper + +From: Jens Axboe + +[ Upstream commit 3233b94cf842984ea7e208d5be1ad2f2af02d495 ] + +We'll need it for batched submit as well. Since we now have a copy +helper, get rid of the nvme_submit_cmd() wrapper. 
+ +Reviewed-by: Chaitanya Kulkarni +Reviewed-by: Hannes Reinecke +Reviewed-by: Max Gurtovoy +Reviewed-by: Christoph Hellwig +Signed-off-by: Jens Axboe +Stable-dep-of: c31fad147038 ("nvme-pci: add missing condition check for existence of mapped data") +Signed-off-by: Sasha Levin +--- + drivers/nvme/host/pci.c | 26 ++++++++++++-------------- + 1 file changed, 12 insertions(+), 14 deletions(-) + +diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c +index 5a3ba7e390546..04e51134165dd 100644 +--- a/drivers/nvme/host/pci.c ++++ b/drivers/nvme/host/pci.c +@@ -479,22 +479,13 @@ static inline void nvme_write_sq_db(struct nvme_queue *nvmeq, bool write_sq) + nvmeq->last_sq_tail = nvmeq->sq_tail; + } + +-/** +- * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell +- * @nvmeq: The queue to use +- * @cmd: The command to send +- * @write_sq: whether to write to the SQ doorbell +- */ +-static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd, +- bool write_sq) ++static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq, ++ struct nvme_command *cmd) + { +- spin_lock(&nvmeq->sq_lock); + memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes), +- cmd, sizeof(*cmd)); ++ absolute_pointer(cmd), sizeof(*cmd)); + if (++nvmeq->sq_tail == nvmeq->q_depth) + nvmeq->sq_tail = 0; +- nvme_write_sq_db(nvmeq, write_sq); +- spin_unlock(&nvmeq->sq_lock); + } + + static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx) +@@ -940,7 +931,10 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx, + } + + blk_mq_start_request(req); +- nvme_submit_cmd(nvmeq, cmnd, bd->last); ++ spin_lock(&nvmeq->sq_lock); ++ nvme_sq_copy_cmd(nvmeq, &iod->cmd); ++ nvme_write_sq_db(nvmeq, bd->last); ++ spin_unlock(&nvmeq->sq_lock); + return BLK_STS_OK; + out_unmap_data: + nvme_unmap_data(dev, req); +@@ -1109,7 +1103,11 @@ static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl) + + c.common.opcode = nvme_admin_async_event; + c.common.command_id = 
NVME_AQ_BLK_MQ_DEPTH; +- nvme_submit_cmd(nvmeq, &c, true); ++ ++ spin_lock(&nvmeq->sq_lock); ++ nvme_sq_copy_cmd(nvmeq, &c); ++ nvme_write_sq_db(nvmeq, true); ++ spin_unlock(&nvmeq->sq_lock); + } + + static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id) +-- +2.43.0 + diff --git a/queue-5.15/phy-cadence-torrent-check-return-value-on-register-r.patch b/queue-5.15/phy-cadence-torrent-check-return-value-on-register-r.patch new file mode 100644 index 00000000000..2f29a593d12 --- /dev/null +++ b/queue-5.15/phy-cadence-torrent-check-return-value-on-register-r.patch @@ -0,0 +1,40 @@ +From 9a474c34d6e9e755b03615d9cb09b89c2e4c7967 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 2 Jul 2024 11:20:42 +0800 +Subject: phy: cadence-torrent: Check return value on register read + +From: Ma Ke + +[ Upstream commit 967969cf594ed3c1678a9918d6e9bb2d1591cbe9 ] + +cdns_torrent_dp_set_power_state() does not consider that ret might be +overwritten. Add return value check of regmap_read_poll_timeout() after +register read in cdns_torrent_dp_set_power_state(). 
+ +Fixes: 5b16a790f18d ("phy: cadence-torrent: Reorder few functions to remove function declarations") +Signed-off-by: Ma Ke +Reviewed-by: Roger Quadros +Link: https://lore.kernel.org/r/20240702032042.3993031-1-make24@iscas.ac.cn +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/phy/cadence/phy-cadence-torrent.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/drivers/phy/cadence/phy-cadence-torrent.c b/drivers/phy/cadence/phy-cadence-torrent.c +index 415ace64adc5c..8f23146d62bf0 100644 +--- a/drivers/phy/cadence/phy-cadence-torrent.c ++++ b/drivers/phy/cadence/phy-cadence-torrent.c +@@ -1056,6 +1056,9 @@ static int cdns_torrent_dp_set_power_state(struct cdns_torrent_phy *cdns_phy, + ret = regmap_read_poll_timeout(regmap, PHY_PMA_XCVR_POWER_STATE_ACK, + read_val, (read_val & mask) == value, 0, + POLL_TIMEOUT_US); ++ if (ret) ++ return ret; ++ + cdns_torrent_dp_write(regmap, PHY_PMA_XCVR_POWER_STATE_REQ, 0x00000000); + ndelay(100); + +-- +2.43.0 + diff --git a/queue-5.15/powerpc-fix-a-file-leak-in-kvm_vcpu_ioctl_enable_cap.patch b/queue-5.15/powerpc-fix-a-file-leak-in-kvm_vcpu_ioctl_enable_cap.patch new file mode 100644 index 00000000000..96174b0fe5f --- /dev/null +++ b/queue-5.15/powerpc-fix-a-file-leak-in-kvm_vcpu_ioctl_enable_cap.patch @@ -0,0 +1,37 @@ +From a780440bb4683536500a8b4bceef6ff3ff829041 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 30 May 2024 23:54:55 -0400 +Subject: powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() + +From: Al Viro + +[ Upstream commit b4cf5fc01ce83e5c0bcf3dbb9f929428646b9098 ] + +missing fdput() on one of the failure exits + +Fixes: eacc56bb9de3e # v5.2 +Signed-off-by: Al Viro +Signed-off-by: Sasha Levin +--- + arch/powerpc/kvm/powerpc.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c +index ee305455bd8db..fc7174b32e982 100644 +--- a/arch/powerpc/kvm/powerpc.c ++++ b/arch/powerpc/kvm/powerpc.c +@@ -1963,8 
+1963,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, + break; + + r = -ENXIO; +- if (!xive_enabled()) ++ if (!xive_enabled()) { ++ fdput(f); + break; ++ } + + r = -EPERM; + dev = kvm_device_from_filp(f.file); +-- +2.43.0 + diff --git a/queue-5.15/s390-cpum_cf-fix-endless-loop-in-cf_diag-event-stop.patch b/queue-5.15/s390-cpum_cf-fix-endless-loop-in-cf_diag-event-stop.patch new file mode 100644 index 00000000000..2a58ba9b1f6 --- /dev/null +++ b/queue-5.15/s390-cpum_cf-fix-endless-loop-in-cf_diag-event-stop.patch @@ -0,0 +1,87 @@ +From 7982effd2e5da77dcf1dfbd2515e9cb512d7bb33 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 15 Jul 2024 12:07:29 +0200 +Subject: s390/cpum_cf: Fix endless loop in CF_DIAG event stop + +From: Thomas Richter + +[ Upstream commit e6ce1f12d777f6ee22b20e10ae6a771e7e6f44f5 ] + +Event CF_DIAG reads out complete counter sets using stcctm +instruction. This is done at event start time when the process +starts execution and at event stop time when the process is +removed from the CPU. During removal the difference of each +counter in the counter sets is calculated and saved as raw data +in the ring buffer. This works fine unless the number of counters +in a counter set is zero. This may happen for the extended counter +set. This set is machine specific and the size of the counter +set can be zero even when extended counter set is authorized for +read access. + +This case is not handled. cfdiag_diffctr() checks authorization +of the extended counter set. If true the functions assumes +the extended counter set has been saved in a data buffer. However +this is not the case, cfdiag_getctrset() does not save a counter +set with counter set size of zero. This mismatch causes an endless +loop in the counter set readout during event stop handling. + +The calculation of the difference of the counters in each counter +now verifies the size of the counter set is non-zero. A counter set +with size zero is skipped. 
+ +Fixes: a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") +Signed-off-by: Thomas Richter +Acked-by: Sumanth Korikkar +Acked-by: Heiko Carstens +Cc: Heiko Carstens +Cc: Vasily Gorbik +Cc: Alexander Gordeev +Signed-off-by: Vasily Gorbik +Signed-off-by: Sasha Levin +--- + arch/s390/kernel/perf_cpum_cf.c | 14 ++++++++++---- + 1 file changed, 10 insertions(+), 4 deletions(-) + +diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c +index d2a2a18b55808..34b8d9410503d 100644 +--- a/arch/s390/kernel/perf_cpum_cf.c ++++ b/arch/s390/kernel/perf_cpum_cf.c +@@ -213,25 +213,31 @@ static int cfdiag_diffctr(struct cpu_cf_events *cpuhw, unsigned long auth) + struct cf_trailer_entry *trailer_start, *trailer_stop; + struct cf_ctrset_entry *ctrstart, *ctrstop; + size_t offset = 0; ++ int i; + +- auth &= (1 << CPUMF_LCCTL_ENABLE_SHIFT) - 1; +- do { ++ for (i = CPUMF_CTR_SET_BASIC; i < CPUMF_CTR_SET_MAX; ++i) { + ctrstart = (struct cf_ctrset_entry *)(cpuhw->start + offset); + ctrstop = (struct cf_ctrset_entry *)(cpuhw->stop + offset); + ++ /* Counter set not authorized */ ++ if (!(auth & cpumf_ctr_ctl[i])) ++ continue; ++ /* Counter set size zero was not saved */ ++ if (!cpum_cf_read_setsize(i)) ++ continue; ++ + if (memcmp(ctrstop, ctrstart, sizeof(*ctrstop))) { + pr_err_once("cpum_cf_diag counter set compare error " + "in set %i\n", ctrstart->set); + return 0; + } +- auth &= ~cpumf_ctr_ctl[ctrstart->set]; + if (ctrstart->def == CF_DIAG_CTRSET_DEF) { + cfdiag_diffctrset((u64 *)(ctrstart + 1), + (u64 *)(ctrstop + 1), ctrstart->ctr); + offset += ctrstart->ctr * sizeof(u64) + + sizeof(*ctrstart); + } +- } while (ctrstart->def && auth); ++ } + + /* Save time_stamp from start of event in stop's trailer */ + trailer_start = (struct cf_trailer_entry *)(cpuhw->start + offset); +-- +2.43.0 + diff --git a/queue-5.15/s390-pci-allow-allocation-of-more-than-1-msi-interru.patch 
b/queue-5.15/s390-pci-allow-allocation-of-more-than-1-msi-interru.patch new file mode 100644 index 00000000000..9dece9d367e --- /dev/null +++ b/queue-5.15/s390-pci-allow-allocation-of-more-than-1-msi-interru.patch @@ -0,0 +1,168 @@ +From c4239ed4790c311bd555c9b36c94840f55c1a8a7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jul 2024 15:45:27 +0200 +Subject: s390/pci: Allow allocation of more than 1 MSI interrupt + +From: Gerd Bayer + +[ Upstream commit ab42fcb511fd9d241bbab7cc3ca04e34e9fc0666 ] + +On a PCI adapter that provides up to 8 MSI interrupt sources the s390 +implementation of PCI interrupts rejected to accommodate them, although +the underlying hardware is able to support that. + +For MSI-X it is sufficient to allocate a single irq_desc per msi_desc, +but for MSI multiple irq descriptors are attached to and controlled by +a single msi descriptor. Add the appropriate loops to maintain multiple +irq descriptors and tie/untie them to/from the appropriate AIBV bit, if +a device driver allocates more than 1 MSI interrupt. + +Common PCI code passes on requests to allocate a number of interrupt +vectors based on the device drivers' demand and the PCI functions' +capabilities. However, the root-complex of s390 systems support just a +limited number of interrupt vectors per PCI function. +Produce a kernel log message to inform about any architecture-specific +capping that might be done. + +With this change, we had a PCI adapter successfully raising +interrupts to its device driver via all 8 sources. 
+ +Fixes: a384c8924a8b ("s390/PCI: Fix single MSI only check") +Signed-off-by: Gerd Bayer +Reviewed-by: Niklas Schnelle +Signed-off-by: Vasily Gorbik +Signed-off-by: Sasha Levin +--- + arch/s390/pci/pci_irq.c | 62 ++++++++++++++++++++++++++++------------- + 1 file changed, 42 insertions(+), 20 deletions(-) + +diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c +index 39c3c29f0d1d3..4a1dfce1a5cd2 100644 +--- a/arch/s390/pci/pci_irq.c ++++ b/arch/s390/pci/pci_irq.c +@@ -292,8 +292,8 @@ static int __alloc_airq(struct zpci_dev *zdev, int msi_vecs, + + int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) + { ++ unsigned int hwirq, msi_vecs, irqs_per_msi, i, cpu; + struct zpci_dev *zdev = to_zpci(pdev); +- unsigned int hwirq, msi_vecs, cpu; + struct msi_desc *msi; + struct msi_msg msg; + unsigned long bit; +@@ -303,30 +303,46 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) + zdev->aisb = -1UL; + zdev->msi_first_bit = -1U; + +- if (type == PCI_CAP_ID_MSI && nvec > 1) +- return 1; + msi_vecs = min_t(unsigned int, nvec, zdev->max_msi); ++ if (msi_vecs < nvec) { ++ pr_info("%s requested %d irqs, allocate system limit of %d", ++ pci_name(pdev), nvec, zdev->max_msi); ++ } + + rc = __alloc_airq(zdev, msi_vecs, &bit); + if (rc < 0) + return rc; + +- /* Request MSI interrupts */ ++ /* ++ * Request MSI interrupts: ++ * When using MSI, nvec_used interrupt sources and their irq ++ * descriptors are controlled through one msi descriptor. ++ * Thus the outer loop over msi descriptors shall run only once, ++ * while two inner loops iterate over the interrupt vectors. ++ * When using MSI-X, each interrupt vector/irq descriptor ++ * is bound to exactly one msi descriptor (nvec_used is one). ++ * So the inner loops are executed once, while the outer iterates ++ * over the MSI-X descriptors. 
++ */ + hwirq = bit; + msi_for_each_desc(msi, &pdev->dev, MSI_DESC_NOTASSOCIATED) { +- rc = -EIO; + if (hwirq - bit >= msi_vecs) + break; +- irq = __irq_alloc_descs(-1, 0, 1, 0, THIS_MODULE, +- (irq_delivery == DIRECTED) ? +- msi->affinity : NULL); ++ irqs_per_msi = min_t(unsigned int, msi_vecs, msi->nvec_used); ++ irq = __irq_alloc_descs(-1, 0, irqs_per_msi, 0, THIS_MODULE, ++ (irq_delivery == DIRECTED) ? ++ msi->affinity : NULL); + if (irq < 0) + return -ENOMEM; +- rc = irq_set_msi_desc(irq, msi); +- if (rc) +- return rc; +- irq_set_chip_and_handler(irq, &zpci_irq_chip, +- handle_percpu_irq); ++ ++ for (i = 0; i < irqs_per_msi; i++) { ++ rc = irq_set_msi_desc_off(irq, i, msi); ++ if (rc) ++ return rc; ++ irq_set_chip_and_handler(irq + i, &zpci_irq_chip, ++ handle_percpu_irq); ++ } ++ + msg.data = hwirq - bit; + if (irq_delivery == DIRECTED) { + if (msi->affinity) +@@ -339,31 +355,35 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) + msg.address_lo |= (cpu_addr << 8); + + for_each_possible_cpu(cpu) { +- airq_iv_set_data(zpci_ibv[cpu], hwirq, irq); ++ for (i = 0; i < irqs_per_msi; i++) ++ airq_iv_set_data(zpci_ibv[cpu], ++ hwirq + i, irq + i); + } + } else { + msg.address_lo = zdev->msi_addr & 0xffffffff; +- airq_iv_set_data(zdev->aibv, hwirq, irq); ++ for (i = 0; i < irqs_per_msi; i++) ++ airq_iv_set_data(zdev->aibv, hwirq + i, irq + i); + } + msg.address_hi = zdev->msi_addr >> 32; + pci_write_msi_msg(irq, &msg); +- hwirq++; ++ hwirq += irqs_per_msi; + } + + zdev->msi_first_bit = bit; +- zdev->msi_nr_irqs = msi_vecs; ++ zdev->msi_nr_irqs = hwirq - bit; + + rc = zpci_set_irq(zdev); + if (rc) + return rc; + +- return (msi_vecs == nvec) ? 0 : msi_vecs; ++ return (zdev->msi_nr_irqs == nvec) ? 
0 : zdev->msi_nr_irqs; + } + + void arch_teardown_msi_irqs(struct pci_dev *pdev) + { + struct zpci_dev *zdev = to_zpci(pdev); + struct msi_desc *msi; ++ unsigned int i; + int rc; + + /* Disable interrupts */ +@@ -373,8 +393,10 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev) + + /* Release MSI interrupts */ + msi_for_each_desc(msi, &pdev->dev, MSI_DESC_ASSOCIATED) { +- irq_set_msi_desc(msi->irq, NULL); +- irq_free_desc(msi->irq); ++ for (i = 0; i < msi->nvec_used; i++) { ++ irq_set_msi_desc(msi->irq + i, NULL); ++ irq_free_desc(msi->irq + i); ++ } + msi->msg.address_lo = 0; + msi->msg.address_hi = 0; + msi->msg.data = 0; +-- +2.43.0 + diff --git a/queue-5.15/s390-pci-refactor-arch_setup_msi_irqs.patch b/queue-5.15/s390-pci-refactor-arch_setup_msi_irqs.patch new file mode 100644 index 00000000000..f0ef494ce9a --- /dev/null +++ b/queue-5.15/s390-pci-refactor-arch_setup_msi_irqs.patch @@ -0,0 +1,106 @@ +From 6ab27cbbf08e1d33b1754c864437e6be59db6569 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jul 2024 15:45:26 +0200 +Subject: s390/pci: Refactor arch_setup_msi_irqs() + +From: Gerd Bayer + +[ Upstream commit 5fd11b96b43708f2f6e3964412c301c1bd20ec0f ] + +Factor out adapter interrupt allocation from arch_setup_msi_irqs() in +preparation for enabling registration of multiple MSIs. Code movement +only, no change of functionality intended. 
+ +Signed-off-by: Gerd Bayer +Reviewed-by: Niklas Schnelle +Signed-off-by: Vasily Gorbik +Stable-dep-of: ab42fcb511fd ("s390/pci: Allow allocation of more than 1 MSI interrupt") +Signed-off-by: Sasha Levin +--- + arch/s390/pci/pci_irq.c | 54 ++++++++++++++++++++++++----------------- + 1 file changed, 32 insertions(+), 22 deletions(-) + +diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c +index 49e404c3e987a..39c3c29f0d1d3 100644 +--- a/arch/s390/pci/pci_irq.c ++++ b/arch/s390/pci/pci_irq.c +@@ -262,33 +262,20 @@ static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating) + } + } + +-int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) ++static int __alloc_airq(struct zpci_dev *zdev, int msi_vecs, ++ unsigned long *bit) + { +- struct zpci_dev *zdev = to_zpci(pdev); +- unsigned int hwirq, msi_vecs, cpu; +- unsigned long bit; +- struct msi_desc *msi; +- struct msi_msg msg; +- int cpu_addr; +- int rc, irq; +- +- zdev->aisb = -1UL; +- zdev->msi_first_bit = -1U; +- if (type == PCI_CAP_ID_MSI && nvec > 1) +- return 1; +- msi_vecs = min_t(unsigned int, nvec, zdev->max_msi); +- + if (irq_delivery == DIRECTED) { + /* Allocate cpu vector bits */ +- bit = airq_iv_alloc(zpci_ibv[0], msi_vecs); +- if (bit == -1UL) ++ *bit = airq_iv_alloc(zpci_ibv[0], msi_vecs); ++ if (*bit == -1UL) + return -EIO; + } else { + /* Allocate adapter summary indicator bit */ +- bit = airq_iv_alloc_bit(zpci_sbv); +- if (bit == -1UL) ++ *bit = airq_iv_alloc_bit(zpci_sbv); ++ if (*bit == -1UL) + return -EIO; +- zdev->aisb = bit; ++ zdev->aisb = *bit; + + /* Create adapter interrupt vector */ + zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK); +@@ -296,10 +283,33 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) + return -ENOMEM; + + /* Wire up shortcut pointer */ +- zpci_ibv[bit] = zdev->aibv; ++ zpci_ibv[*bit] = zdev->aibv; + /* Each function has its own interrupt vector */ +- bit = 0; ++ *bit = 0; + } ++ return 0; ++} 
++ ++int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) ++{ ++ struct zpci_dev *zdev = to_zpci(pdev); ++ unsigned int hwirq, msi_vecs, cpu; ++ struct msi_desc *msi; ++ struct msi_msg msg; ++ unsigned long bit; ++ int cpu_addr; ++ int rc, irq; ++ ++ zdev->aisb = -1UL; ++ zdev->msi_first_bit = -1U; ++ ++ if (type == PCI_CAP_ID_MSI && nvec > 1) ++ return 1; ++ msi_vecs = min_t(unsigned int, nvec, zdev->max_msi); ++ ++ rc = __alloc_airq(zdev, msi_vecs, &bit); ++ if (rc < 0) ++ return rc; + + /* Request MSI interrupts */ + hwirq = bit; +-- +2.43.0 + diff --git a/queue-5.15/s390-pci-rework-msi-descriptor-walk.patch b/queue-5.15/s390-pci-rework-msi-descriptor-walk.patch new file mode 100644 index 00000000000..5b0599f173e --- /dev/null +++ b/queue-5.15/s390-pci-rework-msi-descriptor-walk.patch @@ -0,0 +1,49 @@ +From b1970baad93425f383a8774a33b126a5b1c276df Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 6 Dec 2021 23:51:23 +0100 +Subject: s390/pci: Rework MSI descriptor walk + +From: Thomas Gleixner + +[ Upstream commit 2ca5e908d0f4cde61d9d3595e8314adca5d914a1 ] + +Replace the about to vanish iterators and make use of the filtering. 
+
+Signed-off-by: Thomas Gleixner
+Tested-by: Niklas Schnelle
+Reviewed-by: Jason Gunthorpe
+Acked-by: Niklas Schnelle
+Link: https://lore.kernel.org/r/20211206210748.305656158@linutronix.de
+Stable-dep-of: ab42fcb511fd ("s390/pci: Allow allocation of more than 1 MSI interrupt")
+Signed-off-by: Sasha Levin
+---
+ arch/s390/pci/pci_irq.c | 6 ++----
+ 1 file changed, 2 insertions(+), 4 deletions(-)
+
+diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
+index 3823e159bf749..49e404c3e987a 100644
+--- a/arch/s390/pci/pci_irq.c
++++ b/arch/s390/pci/pci_irq.c
+@@ -303,7 +303,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
+
+ /* Request MSI interrupts */
+ hwirq = bit;
+- for_each_pci_msi_entry(msi, pdev) {
++ msi_for_each_desc(msi, &pdev->dev, MSI_DESC_NOTASSOCIATED) {
+ rc = -EIO;
+ if (hwirq - bit >= msi_vecs)
+ break;
+@@ -362,9 +362,7 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
+ return;
+
+ /* Release MSI interrupts */
+- for_each_pci_msi_entry(msi, pdev) {
+- if (!msi->irq)
+- continue;
++ msi_for_each_desc(msi, &pdev->dev, MSI_DESC_ASSOCIATED) {
+ irq_set_msi_desc(msi->irq, NULL);
+ irq_free_desc(msi->irq);
+ msi->msg.address_lo = 0;
+--
+2.43.0
+
diff --git a/queue-5.15/series b/queue-5.15/series
index 0a52af6aad6..f6f7bd3643d 100644
--- a/queue-5.15/series
+++ b/queue-5.15/series
@@ -258,3 +258,47 @@ bluetooth-btusb-add-realtek-rtl8852be-support-id-0x13d3-0x3591.patch
 nilfs2-handle-inconsistent-state-in-nilfs_btnode_create_block.patch
 io_uring-io-wq-limit-retrying-worker-initialisation.patch
 kernel-rerun-task_work-while-freezing-in-get_signal.patch
+kdb-address-wformat-security-warnings.patch
+kdb-use-the-passed-prompt-in-kdb_position_cursor.patch
+jfs-fix-array-index-out-of-bounds-in-difree.patch
+dmaengine-ti-k3-udma-fix-bchan-count-with-uhc-and-hc.patch
+phy-cadence-torrent-check-return-value-on-register-r.patch
+um-time-travel-fix-time-travel-start-option.patch
+um-time-travel-fix-signal-blocking-race-hang.patch
+f2fs-fix-start-segno-of-large-section.patch
+f2fs-introduce-fragment-allocation-mode-mount-option.patch
+f2fs-add-gc_urgent_high_remaining-sysfs-node.patch
+f2fs-add-a-way-to-limit-roll-forward-recovery-time.patch
+f2fs-fix-to-update-user-block-counts-in-block_operat.patch
+libbpf-fix-no-args-func-prototype-btf-dumping-syntax.patch
+dma-fix-call-order-in-dmam_free_coherent.patch
+bpf-events-use-prog-to-emit-ksymbol-event-for-main-p.patch
+mips-smp-cps-fix-address-for-gcr_access-register-for.patch
+ipv4-fix-incorrect-source-address-in-record-route-op.patch
+net-bonding-correctly-annotate-rcu-in-bond_should_no.patch
+netfilter-nft_set_pipapo_avx2-disable-softinterrupts.patch
+tipc-return-non-zero-value-from-tipc_udp_addr2str-on.patch
+net-stmmac-correct-byte-order-of-perfect_match.patch
+net-nexthop-initialize-all-fields-in-dumped-nexthops.patch
+bpf-fix-a-segment-issue-when-downgrading-gso_size.patch
+misdn-fix-a-use-after-free-in-hfcmulti_tx.patch
+apparmor-fix-null-pointer-deref-when-receiving-skb-d.patch
+powerpc-fix-a-file-leak-in-kvm_vcpu_ioctl_enable_cap.patch
+lirc-rc_dev_get_from_fd-fix-file-leak.patch
+spi-spidev-make-probe-to-fail-early-if-a-spidev-comp.patch
+spi-spidev-replace-acpi-specific-code-by-device_get_.patch
+spi-spidev-replace-of-specific-code-by-device-proper.patch
+spidev-add-silicon-labs-em3581-device-compatible.patch
+spi-spidev-order-compatibles-alphabetically.patch
+spi-spidev-add-correct-compatible-for-rohm-bh2228fv.patch
+asoc-intel-use-soc_intel_is_byt_cr-only-when-iosf_mb.patch
+ceph-fix-incorrect-kmalloc-size-of-pagevec-mempool.patch
+s390-pci-rework-msi-descriptor-walk.patch
+s390-pci-refactor-arch_setup_msi_irqs.patch
+s390-pci-allow-allocation-of-more-than-1-msi-interru.patch
+s390-cpum_cf-fix-endless-loop-in-cf_diag-event-stop.patch
+iommu-sprd-avoid-null-deref-in-sprd_iommu_hw_en.patch
+nvme-split-command-copy-into-a-helper.patch
+nvme-separate-command-prep-and-issue.patch
+nvme-pci-add-missing-condition-check-for-existence-o.patch
+fs-don-t-allow-non-init-s_user_ns-for-filesystems-wi.patch
diff --git a/queue-5.15/spi-spidev-add-correct-compatible-for-rohm-bh2228fv.patch b/queue-5.15/spi-spidev-add-correct-compatible-for-rohm-bh2228fv.patch
new file mode 100644
index 00000000000..3b1d9d5e99a
--- /dev/null
+++ b/queue-5.15/spi-spidev-add-correct-compatible-for-rohm-bh2228fv.patch
@@ -0,0 +1,41 @@
+From 1dff9b14624bba7a0f78f7216add85e4baa0803c Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 17 Jul 2024 10:59:49 +0100
+Subject: spi: spidev: add correct compatible for Rohm BH2228FV
+
+From: Conor Dooley
+
+[ Upstream commit fc28d1c1fe3b3e2fbc50834c8f73dda72f6af9fc ]
+
+When Maxime originally added the BH2228FV to the spidev driver, he spelt
+it incorrectly - the d should have been a b. Add the correctly spelt
+compatible to the driver. Although the majority of users of this
+compatible are abusers, there is at least one board that validly uses
+the incorrect spelt compatible, so keep it in the driver to avoid
+breaking the few real users it has.
+
+Fixes: 8fad805bdc52 ("spi: spidev: Add Rohm DH2228FV DAC compatible string")
+Signed-off-by: Conor Dooley
+Acked-by: Maxime Ripard
+Link: https://patch.msgid.link/20240717-ventricle-strewn-a7678c509e85@spud
+Signed-off-by: Mark Brown
+Signed-off-by: Sasha Levin
+---
+ drivers/spi/spidev.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
+index 14bebc079ddbd..99bdbc040d1ae 100644
+--- a/drivers/spi/spidev.c
++++ b/drivers/spi/spidev.c
+@@ -715,6 +715,7 @@ static const struct of_device_id spidev_dt_ids[] = {
+ { .compatible = "lwn,bk4", .data = &spidev_of_check },
+ { .compatible = "menlo,m53cpld", .data = &spidev_of_check },
+ { .compatible = "micron,spi-authenta", .data = &spidev_of_check },
++ { .compatible = "rohm,bh2228fv", .data = &spidev_of_check },
+ { .compatible = "rohm,dh2228fv", .data = &spidev_of_check },
+ { .compatible = "semtech,sx1301", .data = &spidev_of_check },
+ { .compatible = "silabs,em3581", .data = &spidev_of_check },
+--
+2.43.0
+
diff --git a/queue-5.15/spi-spidev-make-probe-to-fail-early-if-a-spidev-comp.patch b/queue-5.15/spi-spidev-make-probe-to-fail-early-if-a-spidev-comp.patch
new file mode 100644
index 00000000000..ba01a7d4e32
--- /dev/null
+++ b/queue-5.15/spi-spidev-make-probe-to-fail-early-if-a-spidev-comp.patch
@@ -0,0 +1,55 @@
+From 925416f2cebd2b8b493c4aa670c18910b16a9825 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 9 Nov 2021 23:59:20 +0100
+Subject: spi: spidev: Make probe to fail early if a spidev compatible is used
+
+From: Javier Martinez Canillas
+
+[ Upstream commit fffc84fd87d963a2ea77a125b8a6f5a3c9f3192d ]
+
+Some Device Trees don't use a real device name in the compatible string
+for SPI devices nodes, abusing the fact that the spidev driver name is
+used to match as a fallback when a SPI device ID table is not defined.
+
+But since commit 6840615f85f6 ("spi: spidev: Add SPI ID table") a table
+for SPI device IDs was added to the driver breaking the assumption that
+these DTs were relying on.
+
+There has been a warning message for some time since commit 956b200a846e
+("spi: spidev: Warn loudly if instantiated from DT as "spidev""), making
+quite clear that this case is not really supported by the spidev driver.
+
+Since these devices won't match anyways after the mentioned commit, there
+is no point to continue if an spidev compatible is used. Let's just make
+the driver probe to fail early.
+
+Signed-off-by: Javier Martinez Canillas
+Link: https://lore.kernel.org/r/20211109225920.1158920-1-javierm@redhat.com
+Signed-off-by: Mark Brown
+Stable-dep-of: fc28d1c1fe3b ("spi: spidev: add correct compatible for Rohm BH2228FV")
+Signed-off-by: Sasha Levin
+---
+ drivers/spi/spidev.c | 7 ++++---
+ 1 file changed, 4 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
+index 922d778df0641..75eb1c95c4a04 100644
+--- a/drivers/spi/spidev.c
++++ b/drivers/spi/spidev.c
+@@ -760,9 +760,10 @@ static int spidev_probe(struct spi_device *spi)
+ * compatible string, it is a Linux implementation thing
+ * rather than a description of the hardware.
+ */
+- WARN(spi->dev.of_node &&
+- of_device_is_compatible(spi->dev.of_node, "spidev"),
+- "%pOF: buggy DT: spidev listed directly in DT\n", spi->dev.of_node);
++ if (spi->dev.of_node && of_device_is_compatible(spi->dev.of_node, "spidev")) {
++ dev_err(&spi->dev, "spidev listed directly in DT is not supported\n");
++ return -EINVAL;
++ }
+
+ spidev_probe_acpi(spi);
+
+--
+2.43.0
+
diff --git a/queue-5.15/spi-spidev-order-compatibles-alphabetically.patch b/queue-5.15/spi-spidev-order-compatibles-alphabetically.patch
new file mode 100644
index 00000000000..f28503ca4cb
--- /dev/null
+++ b/queue-5.15/spi-spidev-order-compatibles-alphabetically.patch
@@ -0,0 +1,46 @@
+From a73b9b3bcaa20babb090ce5b47e1302fae45eff4 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Fri, 20 Jan 2023 08:56:51 +0100
+Subject: spi: spidev: order compatibles alphabetically
+
+From: Krzysztof Kozlowski
+
+[ Upstream commit be5852457b7e85ad13b1bded9c97bed5ee1715a3 ]
+
+Bring some order to reduce possibilities of conflicts.
+
+Signed-off-by: Krzysztof Kozlowski
+Link: https://lore.kernel.org/r/20230120075651.153763-1-krzysztof.kozlowski@linaro.org
+Signed-off-by: Mark Brown
+Stable-dep-of: fc28d1c1fe3b ("spi: spidev: add correct compatible for Rohm BH2228FV")
+Signed-off-by: Sasha Levin
+---
+ drivers/spi/spidev.c | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
+index c083d511f63dd..14bebc079ddbd 100644
+--- a/drivers/spi/spidev.c
++++ b/drivers/spi/spidev.c
+@@ -709,14 +709,14 @@ static int spidev_of_check(struct device *dev)
+ }
+
+ static const struct of_device_id spidev_dt_ids[] = {
+- { .compatible = "rohm,dh2228fv", .data = &spidev_of_check },
++ { .compatible = "cisco,spi-petra", .data = &spidev_of_check },
++ { .compatible = "dh,dhcom-board", .data = &spidev_of_check },
+ { .compatible = "lineartechnology,ltc2488", .data = &spidev_of_check },
+- { .compatible = "semtech,sx1301", .data = &spidev_of_check },
+ { .compatible = "lwn,bk4", .data = &spidev_of_check },
+- { .compatible = "dh,dhcom-board", .data = &spidev_of_check },
+ { .compatible = "menlo,m53cpld", .data = &spidev_of_check },
+- { .compatible = "cisco,spi-petra", .data = &spidev_of_check },
+ { .compatible = "micron,spi-authenta", .data = &spidev_of_check },
++ { .compatible = "rohm,dh2228fv", .data = &spidev_of_check },
++ { .compatible = "semtech,sx1301", .data = &spidev_of_check },
+ { .compatible = "silabs,em3581", .data = &spidev_of_check },
+ {},
+ };
+--
+2.43.0
+
diff --git a/queue-5.15/spi-spidev-replace-acpi-specific-code-by-device_get_.patch b/queue-5.15/spi-spidev-replace-acpi-specific-code-by-device_get_.patch
new file mode 100644
index 00000000000..e3652f84067
--- /dev/null
+++ b/queue-5.15/spi-spidev-replace-acpi-specific-code-by-device_get_.patch
@@ -0,0 +1,129 @@
+From 8f82c35cb52ae1e4d99dbb5f803ddd7a9a1b7901 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 23 Mar 2022 16:02:14 +0200
+Subject: spi: spidev: Replace
ACPI specific code by device_get_match_data() + +From: Andy Shevchenko + +[ Upstream commit 2a7f669dd8f6561d227e724ca2614c25732f4799 ] + +Instead of calling the ACPI specific APIs, use device_get_match_data(). + +Signed-off-by: Andy Shevchenko +Link: https://lore.kernel.org/r/20220323140215.2568-3-andriy.shevchenko@linux.intel.com +Signed-off-by: Mark Brown +Stable-dep-of: fc28d1c1fe3b ("spi: spidev: add correct compatible for Rohm BH2228FV") +Signed-off-by: Sasha Levin +--- + drivers/spi/spidev.c | 47 ++++++++++++++++++-------------------------- + 1 file changed, 19 insertions(+), 28 deletions(-) + +diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c +index 75eb1c95c4a04..8c69ab348a7f7 100644 +--- a/drivers/spi/spidev.c ++++ b/drivers/spi/spidev.c +@@ -8,19 +8,20 @@ + */ + + #include +-#include + #include + #include + #include + #include + #include + #include ++#include ++#include + #include ++#include + #include + #include + #include + #include +-#include + + #include + #include +@@ -710,10 +711,12 @@ static const struct of_device_id spidev_dt_ids[] = { + MODULE_DEVICE_TABLE(of, spidev_dt_ids); + #endif + +-#ifdef CONFIG_ACPI +- + /* Dummy SPI devices not to be used in production systems */ +-#define SPIDEV_ACPI_DUMMY 1 ++static int spidev_acpi_check(struct device *dev) ++{ ++ dev_warn(dev, "do not use this driver in production systems!\n"); ++ return 0; ++} + + static const struct acpi_device_id spidev_acpi_ids[] = { + /* +@@ -722,35 +725,18 @@ static const struct acpi_device_id spidev_acpi_ids[] = { + * description of the connected peripheral and they should also use + * a proper driver instead of poking directly to the SPI bus. 
+ */ +- { "SPT0001", SPIDEV_ACPI_DUMMY }, +- { "SPT0002", SPIDEV_ACPI_DUMMY }, +- { "SPT0003", SPIDEV_ACPI_DUMMY }, ++ { "SPT0001", (kernel_ulong_t)&spidev_acpi_check }, ++ { "SPT0002", (kernel_ulong_t)&spidev_acpi_check }, ++ { "SPT0003", (kernel_ulong_t)&spidev_acpi_check }, + {}, + }; + MODULE_DEVICE_TABLE(acpi, spidev_acpi_ids); + +-static void spidev_probe_acpi(struct spi_device *spi) +-{ +- const struct acpi_device_id *id; +- +- if (!has_acpi_companion(&spi->dev)) +- return; +- +- id = acpi_match_device(spidev_acpi_ids, &spi->dev); +- if (WARN_ON(!id)) +- return; +- +- if (id->driver_data == SPIDEV_ACPI_DUMMY) +- dev_warn(&spi->dev, "do not use this driver in production systems!\n"); +-} +-#else +-static inline void spidev_probe_acpi(struct spi_device *spi) {} +-#endif +- + /*-------------------------------------------------------------------------*/ + + static int spidev_probe(struct spi_device *spi) + { ++ int (*match)(struct device *dev); + struct spidev_data *spidev; + int status; + unsigned long minor; +@@ -765,7 +751,12 @@ static int spidev_probe(struct spi_device *spi) + return -EINVAL; + } + +- spidev_probe_acpi(spi); ++ match = device_get_match_data(&spi->dev); ++ if (match) { ++ status = match(&spi->dev); ++ if (status) ++ return status; ++ } + + /* Allocate driver data */ + spidev = kzalloc(sizeof(*spidev), GFP_KERNEL); +@@ -837,7 +828,7 @@ static struct spi_driver spidev_spi_driver = { + .driver = { + .name = "spidev", + .of_match_table = of_match_ptr(spidev_dt_ids), +- .acpi_match_table = ACPI_PTR(spidev_acpi_ids), ++ .acpi_match_table = spidev_acpi_ids, + }, + .probe = spidev_probe, + .remove = spidev_remove, +-- +2.43.0 + diff --git a/queue-5.15/spi-spidev-replace-of-specific-code-by-device-proper.patch b/queue-5.15/spi-spidev-replace-of-specific-code-by-device-proper.patch new file mode 100644 index 00000000000..818299a920c --- /dev/null +++ b/queue-5.15/spi-spidev-replace-of-specific-code-by-device-proper.patch @@ -0,0 +1,107 @@ +From 
58ad2278ca73786059d29fbffbe0b36f141adc98 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 23 Mar 2022 16:02:15 +0200 +Subject: spi: spidev: Replace OF specific code by device property API + +From: Andy Shevchenko + +[ Upstream commit 88a285192084edab6657e819f7f130f9cfcb0579 ] + +Instead of calling the OF specific APIs, use device property ones. + +It also prevents misusing PRP0001 in ACPI when trying to instantiate +spidev directly. We only support special SPI test devices there. + +Signed-off-by: Andy Shevchenko +Link: https://lore.kernel.org/r/20220323140215.2568-4-andriy.shevchenko@linux.intel.com +Signed-off-by: Mark Brown +Stable-dep-of: fc28d1c1fe3b ("spi: spidev: add correct compatible for Rohm BH2228FV") +Signed-off-by: Sasha Levin +--- + drivers/spi/spidev.c | 45 ++++++++++++++++++++++---------------------- + 1 file changed, 22 insertions(+), 23 deletions(-) + +diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c +index 8c69ab348a7f7..4a19c2142e474 100644 +--- a/drivers/spi/spidev.c ++++ b/drivers/spi/spidev.c +@@ -20,8 +20,6 @@ + #include + #include + #include +-#include +-#include + + #include + #include +@@ -696,20 +694,31 @@ static const struct spi_device_id spidev_spi_ids[] = { + }; + MODULE_DEVICE_TABLE(spi, spidev_spi_ids); + +-#ifdef CONFIG_OF ++/* ++ * spidev should never be referenced in DT without a specific compatible string, ++ * it is a Linux implementation thing rather than a description of the hardware. 
++ */ ++static int spidev_of_check(struct device *dev) ++{ ++ if (device_property_match_string(dev, "compatible", "spidev") < 0) ++ return 0; ++ ++ dev_err(dev, "spidev listed directly in DT is not supported\n"); ++ return -EINVAL; ++} ++ + static const struct of_device_id spidev_dt_ids[] = { +- { .compatible = "rohm,dh2228fv" }, +- { .compatible = "lineartechnology,ltc2488" }, +- { .compatible = "semtech,sx1301" }, +- { .compatible = "lwn,bk4" }, +- { .compatible = "dh,dhcom-board" }, +- { .compatible = "menlo,m53cpld" }, +- { .compatible = "cisco,spi-petra" }, +- { .compatible = "micron,spi-authenta" }, ++ { .compatible = "rohm,dh2228fv", .data = &spidev_of_check }, ++ { .compatible = "lineartechnology,ltc2488", .data = &spidev_of_check }, ++ { .compatible = "semtech,sx1301", .data = &spidev_of_check }, ++ { .compatible = "lwn,bk4", .data = &spidev_of_check }, ++ { .compatible = "dh,dhcom-board", .data = &spidev_of_check }, ++ { .compatible = "menlo,m53cpld", .data = &spidev_of_check }, ++ { .compatible = "cisco,spi-petra", .data = &spidev_of_check }, ++ { .compatible = "micron,spi-authenta", .data = &spidev_of_check }, + {}, + }; + MODULE_DEVICE_TABLE(of, spidev_dt_ids); +-#endif + + /* Dummy SPI devices not to be used in production systems */ + static int spidev_acpi_check(struct device *dev) +@@ -741,16 +750,6 @@ static int spidev_probe(struct spi_device *spi) + int status; + unsigned long minor; + +- /* +- * spidev should never be referenced in DT without a specific +- * compatible string, it is a Linux implementation thing +- * rather than a description of the hardware. 
+- */ +- if (spi->dev.of_node && of_device_is_compatible(spi->dev.of_node, "spidev")) { +- dev_err(&spi->dev, "spidev listed directly in DT is not supported\n"); +- return -EINVAL; +- } +- + match = device_get_match_data(&spi->dev); + if (match) { + status = match(&spi->dev); +@@ -827,7 +826,7 @@ static int spidev_remove(struct spi_device *spi) + static struct spi_driver spidev_spi_driver = { + .driver = { + .name = "spidev", +- .of_match_table = of_match_ptr(spidev_dt_ids), ++ .of_match_table = spidev_dt_ids, + .acpi_match_table = spidev_acpi_ids, + }, + .probe = spidev_probe, +-- +2.43.0 + diff --git a/queue-5.15/spidev-add-silicon-labs-em3581-device-compatible.patch b/queue-5.15/spidev-add-silicon-labs-em3581-device-compatible.patch new file mode 100644 index 00000000000..c6898e48275 --- /dev/null +++ b/queue-5.15/spidev-add-silicon-labs-em3581-device-compatible.patch @@ -0,0 +1,43 @@ +From 882726a3d83a0077c1ce61a4a7000d45d9206d66 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 26 Dec 2022 21:35:48 -0500 +Subject: spidev: Add Silicon Labs EM3581 device compatible + +From: Vincent Tremblay + +[ Upstream commit c67d90e058550403a3e6f9b05bfcdcfa12b1815c ] + +Add compatible string for Silicon Labs EM3581 device. 
+
+Signed-off-by: Vincent Tremblay
+Link: https://lore.kernel.org/r/20221227023550.569547-2-vincent@vtremblay.dev
+Signed-off-by: Mark Brown
+Stable-dep-of: fc28d1c1fe3b ("spi: spidev: add correct compatible for Rohm BH2228FV")
+Signed-off-by: Sasha Levin
+---
+ drivers/spi/spidev.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
+index 4a19c2142e474..c083d511f63dd 100644
+--- a/drivers/spi/spidev.c
++++ b/drivers/spi/spidev.c
+@@ -690,6 +690,7 @@ static const struct spi_device_id spidev_spi_ids[] = {
+ { .name = "m53cpld" },
+ { .name = "spi-petra" },
+ { .name = "spi-authenta" },
++ { .name = "em3581" },
+ {},
+ };
+ MODULE_DEVICE_TABLE(spi, spidev_spi_ids);
+@@ -716,6 +717,7 @@ static const struct of_device_id spidev_dt_ids[] = {
+ { .compatible = "menlo,m53cpld", .data = &spidev_of_check },
+ { .compatible = "cisco,spi-petra", .data = &spidev_of_check },
+ { .compatible = "micron,spi-authenta", .data = &spidev_of_check },
++ { .compatible = "silabs,em3581", .data = &spidev_of_check },
+ {},
+ };
+ MODULE_DEVICE_TABLE(of, spidev_dt_ids);
+--
+2.43.0
+
diff --git a/queue-5.15/tipc-return-non-zero-value-from-tipc_udp_addr2str-on.patch b/queue-5.15/tipc-return-non-zero-value-from-tipc_udp_addr2str-on.patch
new file mode 100644
index 00000000000..87454cbbe6d
--- /dev/null
+++ b/queue-5.15/tipc-return-non-zero-value-from-tipc_udp_addr2str-on.patch
@@ -0,0 +1,43 @@
+From 2432f3c8c8936ae2a80e49b009dba5eb91b9f84b Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Tue, 16 Jul 2024 11:09:05 +0900
+Subject: tipc: Return non-zero value from tipc_udp_addr2str() on error
+
+From: Shigeru Yoshida
+
+[ Upstream commit fa96c6baef1b5385e2f0c0677b32b3839e716076 ]
+
+tipc_udp_addr2str() should return non-zero value if the UDP media
+address is invalid. Otherwise, a buffer overflow access can occur in
+tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
+media address.
+
+Fixes: d0f91938bede ("tipc: add ip/udp media type")
+Signed-off-by: Shigeru Yoshida
+Reviewed-by: Tung Nguyen
+Signed-off-by: David S. Miller
+Signed-off-by: Sasha Levin
+---
+ net/tipc/udp_media.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
+index 0a85244fd6188..73e461dc12d7b 100644
+--- a/net/tipc/udp_media.c
++++ b/net/tipc/udp_media.c
+@@ -135,8 +135,11 @@ static int tipc_udp_addr2str(struct tipc_media_addr *a, char *buf, int size)
+ snprintf(buf, size, "%pI4:%u", &ua->ipv4, ntohs(ua->port));
+ else if (ntohs(ua->proto) == ETH_P_IPV6)
+ snprintf(buf, size, "%pI6:%u", &ua->ipv6, ntohs(ua->port));
+- else
++ else {
+ pr_err("Invalid UDP media address\n");
++ return 1;
++ }
++
+ return 0;
+ }
+
+--
+2.43.0
+
diff --git a/queue-5.15/um-time-travel-fix-signal-blocking-race-hang.patch b/queue-5.15/um-time-travel-fix-signal-blocking-race-hang.patch
new file mode 100644
index 00000000000..57a91996b14
--- /dev/null
+++ b/queue-5.15/um-time-travel-fix-signal-blocking-race-hang.patch
@@ -0,0 +1,254 @@
+From ff1bc614d635092faa2d802d119a026ed9e657f0 Mon Sep 17 00:00:00 2001
+From: Sasha Levin
+Date: Wed, 3 Jul 2024 13:01:45 +0200
+Subject: um: time-travel: fix signal blocking race/hang
+
+From: Johannes Berg
+
+[ Upstream commit 2cf3a3c4b84def5406b830452b1cb8bbfffe0ebe ]
+
+When signals are hard-blocked in order to do time-travel
+socket processing, we set signals_blocked and then handle
+SIGIO signals by setting the SIGIO bit in signals_pending.
+When unblocking, we first set signals_blocked to 0, and
+then handle all pending signals. We have to set it first,
+so that we can again properly block/unblock inside the
+unblock, if the time-travel handlers need to be processed.
+
+Unfortunately, this is racy.
We can get into this situation: + +// signals_pending = SIGIO_MASK + +unblock_signals_hard() + signals_blocked = 0; + if (signals_pending && signals_enabled) { + block_signals(); + unblock_signals() + ... + sig_handler_common(SIGIO, NULL, NULL); + sigio_handler() + ... + sigio_reg_handler() + irq_do_timetravel_handler() + reg->timetravel_handler() == + vu_req_interrupt_comm_handler() + vu_req_read_message() + vhost_user_recv_req() + vhost_user_recv() + vhost_user_recv_header() + // reads 12 bytes header of + // 20 bytes message +<-- receive SIGIO here <-- +sig_handler() + int enabled = signals_enabled; // 1 + if ((signals_blocked || !enabled) && (sig == SIGIO)) { + if (!signals_blocked && time_travel_mode == TT_MODE_EXTERNAL) + sigio_run_timetravel_handlers() + _sigio_handler() + sigio_reg_handler() + ... as above ... + vhost_user_recv_header() + // reads 8 bytes that were message payload + // as if it were header - but aborts since + // it then gets -EAGAIN +... +--> end signal handler --> + // continue in vhost_user_recv() + // full_read() for 8 bytes payload busy loops + // entire process hangs here + +Conceptually, to fix this, we need to ensure that the +signal handler cannot run while we hard-unblock signals. +The thing that makes this more complex is that we can be +doing hard-block/unblock while unblocking. Introduce a +new signals_blocked_pending variable that we can keep at +non-zero as long as pending signals are being processed, +then we only need to ensure it's decremented safely and +the signal handler will only increment it if it's already +non-zero (or signals_blocked is set, of course.) + +Note also that only the outermost call to hard-unblock is +allowed to decrement signals_blocked_pending, since it +could otherwise reach zero in an inner call, and leave +the same race happening if the timetravel_handler loops, +but that's basically required of it. 
+ +Fixes: d6b399a0e02a ("um: time-travel/signals: fix ndelay() in interrupt") +Link: https://patch.msgid.link/20240703110144.28034-2-johannes@sipsolutions.net +Signed-off-by: Johannes Berg +Signed-off-by: Sasha Levin +--- + arch/um/os-Linux/signal.c | 118 +++++++++++++++++++++++++++++++------- + 1 file changed, 98 insertions(+), 20 deletions(-) + +diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c +index 24a403a70a020..850d21e6473ee 100644 +--- a/arch/um/os-Linux/signal.c ++++ b/arch/um/os-Linux/signal.c +@@ -8,6 +8,7 @@ + + #include + #include ++#include + #include + #include + #include +@@ -65,9 +66,7 @@ static void sig_handler_common(int sig, struct siginfo *si, mcontext_t *mc) + + int signals_enabled; + #ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT +-static int signals_blocked; +-#else +-#define signals_blocked 0 ++static int signals_blocked, signals_blocked_pending; + #endif + static unsigned int signals_pending; + static unsigned int signals_active = 0; +@@ -76,14 +75,27 @@ void sig_handler(int sig, struct siginfo *si, mcontext_t *mc) + { + int enabled = signals_enabled; + +- if ((signals_blocked || !enabled) && (sig == SIGIO)) { ++#ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT ++ if ((signals_blocked || ++ __atomic_load_n(&signals_blocked_pending, __ATOMIC_SEQ_CST)) && ++ (sig == SIGIO)) { ++ /* increment so unblock will do another round */ ++ __atomic_add_fetch(&signals_blocked_pending, 1, ++ __ATOMIC_SEQ_CST); ++ return; ++ } ++#endif ++ ++ if (!enabled && (sig == SIGIO)) { + /* + * In TT_MODE_EXTERNAL, need to still call time-travel +- * handlers unless signals are also blocked for the +- * external time message processing. This will mark +- * signals_pending by itself (only if necessary.) ++ * handlers. This will mark signals_pending by itself ++ * (only if necessary.) ++ * Note we won't get here if signals are hard-blocked ++ * (which is handled above), in that case the hard- ++ * unblock will handle things. 
+ */ +- if (!signals_blocked && time_travel_mode == TT_MODE_EXTERNAL) ++ if (time_travel_mode == TT_MODE_EXTERNAL) + sigio_run_timetravel_handlers(); + else + signals_pending |= SIGIO_MASK; +@@ -380,33 +392,99 @@ int um_set_signals_trace(int enable) + #ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT + void mark_sigio_pending(void) + { ++ /* ++ * It would seem that this should be atomic so ++ * it isn't a read-modify-write with a signal ++ * that could happen in the middle, losing the ++ * value set by the signal. ++ * ++ * However, this function is only called when in ++ * time-travel=ext simulation mode, in which case ++ * the only signal ever pending is SIGIO, which ++ * is blocked while this can be called, and the ++ * timer signal (SIGALRM) cannot happen. ++ */ + signals_pending |= SIGIO_MASK; + } + + void block_signals_hard(void) + { +- if (signals_blocked) +- return; +- signals_blocked = 1; ++ signals_blocked++; + barrier(); + } + + void unblock_signals_hard(void) + { ++ static bool unblocking; ++ + if (!signals_blocked) ++ panic("unblocking signals while not blocked"); ++ ++ if (--signals_blocked) + return; +- /* Must be set to 0 before we check the pending bits etc. */ +- signals_blocked = 0; ++ /* ++ * Must be set to 0 before we check pending so the ++ * SIGIO handler will run as normal unless we're still ++ * going to process signals_blocked_pending. ++ */ + barrier(); + +- if (signals_pending && signals_enabled) { +- /* this is a bit inefficient, but that's not really important */ +- block_signals(); +- unblock_signals(); +- } else if (signals_pending & SIGIO_MASK) { +- /* we need to run time-travel handlers even if not enabled */ +- sigio_run_timetravel_handlers(); ++ /* ++ * Note that block_signals_hard()/unblock_signals_hard() can be called ++ * within the unblock_signals()/sigio_run_timetravel_handlers() below. ++ * This would still be prone to race conditions since it's actually a ++ * call _within_ e.g. 
vu_req_read_message(), where we observed this ++ * issue, which loops. Thus, if the inner call handles the recorded ++ * pending signals, we can get out of the inner call with the real ++ * signal hander no longer blocked, and still have a race. Thus don't ++ * handle unblocking in the inner call, if it happens, but only in ++ * the outermost call - 'unblocking' serves as an ownership for the ++ * signals_blocked_pending decrement. ++ */ ++ if (unblocking) ++ return; ++ unblocking = true; ++ ++ while (__atomic_load_n(&signals_blocked_pending, __ATOMIC_SEQ_CST)) { ++ if (signals_enabled) { ++ /* signals are enabled so we can touch this */ ++ signals_pending |= SIGIO_MASK; ++ /* ++ * this is a bit inefficient, but that's ++ * not really important ++ */ ++ block_signals(); ++ unblock_signals(); ++ } else { ++ /* ++ * we need to run time-travel handlers even ++ * if not enabled ++ */ ++ sigio_run_timetravel_handlers(); ++ } ++ ++ /* ++ * The decrement of signals_blocked_pending must be atomic so ++ * that the signal handler will either happen before or after ++ * the decrement, not during a read-modify-write: ++ * - If it happens before, it can increment it and we'll ++ * decrement it and do another round in the loop. ++ * - If it happens after it'll see 0 for both signals_blocked ++ * and signals_blocked_pending and thus run the handler as ++ * usual (subject to signals_enabled, but that's unrelated.) ++ * ++ * Note that a call to unblock_signals_hard() within the calls ++ * to unblock_signals() or sigio_run_timetravel_handlers() above ++ * will do nothing due to the 'unblocking' state, so this cannot ++ * underflow as the only one decrementing will be the outermost ++ * one. 
++ */ ++ if (__atomic_sub_fetch(&signals_blocked_pending, 1, ++ __ATOMIC_SEQ_CST) < 0) ++ panic("signals_blocked_pending underflow"); + } ++ ++ unblocking = false; + } + #endif + +-- +2.43.0 + diff --git a/queue-5.15/um-time-travel-fix-time-travel-start-option.patch b/queue-5.15/um-time-travel-fix-time-travel-start-option.patch new file mode 100644 index 00000000000..4cd4386dfe2 --- /dev/null +++ b/queue-5.15/um-time-travel-fix-time-travel-start-option.patch @@ -0,0 +1,40 @@ +From 50f1be619b55e25278e058bb6c941d154ff58c71 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 17 Apr 2024 10:27:45 +0200 +Subject: um: time-travel: fix time-travel-start option + +From: Johannes Berg + +[ Upstream commit 7d0a8a490aa3a2a82de8826aaf1dfa38575cb77a ] + +We need to have the = as part of the option so that the +value can be parsed properly. Also document that it must +be given in nanoseconds, not seconds. + +Fixes: 065038706f77 ("um: Support time travel mode") +Link: https://patch.msgid.link/20240417102744.14b9a9d4eba0.Ib22e9136513126b2099d932650f55f193120cd97@changeid +Signed-off-by: Johannes Berg +Signed-off-by: Sasha Levin +--- + arch/um/kernel/time.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c +index 3e270da6b6f67..c8c4ef94c753f 100644 +--- a/arch/um/kernel/time.c ++++ b/arch/um/kernel/time.c +@@ -874,9 +874,9 @@ int setup_time_travel_start(char *str) + return 1; + } + +-__setup("time-travel-start", setup_time_travel_start); ++__setup("time-travel-start=", setup_time_travel_start); + __uml_help(setup_time_travel_start, +-"time-travel-start=\n" ++"time-travel-start=\n" + "Configure the UML instance's wall clock to start at this value rather than\n" + "the host's wall clock at the time of UML boot.\n"); + #endif +-- +2.43.0 +