From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Sat, 24 Jun 2023 14:11:37 +0000 (+0200)
Subject: 5.10-stable patches
X-Git-Tag: v4.14.320~33
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=dbc6e02421f4cad4aa4afcf1ce6feae57b8926be;p=thirdparty%2Fkernel%2Fstable-queue.git

5.10-stable patches

added patches:
	io_uring-net-clear-msg_controllen-on-partial-sendmsg-retry.patch
	io_uring-net-disable-partial-retries-for-recvmsg-with-cmsg.patch
	io_uring-net-save-msghdr-msg_control-for-retries.patch
	nilfs2-prevent-general-protection-fault-in-nilfs_clear_dirty_page.patch
	writeback-fix-dereferencing-null-mapping-host-on-writeback_page_template.patch
---

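Note on the writeback patch below: it only widens a NULL guard. The
swap-cache address space has no ->host, so the tracepoint must check both
the mapping and its host before reading i_ino. A minimal userspace sketch of
that guard follows. The struct and function names are hypothetical stand-ins
for illustration, not the kernel's definitions, and the sketch is not part of
any queued patch.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins for the kernel structures; only the
 * fields needed for this illustration are modelled.
 */
struct inode_model {
	unsigned long i_ino;
};

struct address_space_model {
	struct inode_model *host;	/* NULL for swap-cache style mappings */
};

/*
 * Mirrors the fixed expression: both the mapping and its host are checked
 * before i_ino is read, and 0 is reported when either one is missing.
 */
static unsigned long trace_ino(const struct address_space_model *mapping)
{
	return (mapping && mapping->host) ? mapping->host->i_ino : 0;
}
```

With the guard on mapping alone, the host == NULL case would dereference a
NULL inode pointer, which matches the oops quoted in the patch description.
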
diff --git a/queue-5.10/io_uring-net-clear-msg_controllen-on-partial-sendmsg-retry.patch b/queue-5.10/io_uring-net-clear-msg_controllen-on-partial-sendmsg-retry.patch
new file mode 100644
index 00000000000..0413cb97cfa
--- /dev/null
+++ b/queue-5.10/io_uring-net-clear-msg_controllen-on-partial-sendmsg-retry.patch
@@ -0,0 +1,33 @@
+From 309fd8aa08da865a2fa8935d006c932bcd4ae216 Mon Sep 17 00:00:00 2001
+From: Jens Axboe <axboe@kernel.dk>
+Date: Fri, 23 Jun 2023 07:39:42 -0600
+Subject: io_uring/net: clear msg_controllen on partial sendmsg retry
+
+From: Jens Axboe <axboe@kernel.dk>
+
+Commit b1dc492087db0f2e5a45f1072a743d04618dd6be upstream.
+
+If we have cmsg attached and at least some data was transferred, clear
+msg_controllen on retry so we don't attempt to send the cmsg again.
+
+Cc: stable@vger.kernel.org # 5.10+
+Fixes: cac9e4418f4c ("io_uring/net: save msghdr->msg_control for retries")
+Reported-by: Stefan Metzmacher <metze@samba.org>
+Reviewed-by: Stefan Metzmacher <metze@samba.org>
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ io_uring/io_uring.c |    2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -5064,6 +5064,8 @@ static int io_recvmsg(struct io_kiocb *r
+ 		if (ret == -ERESTARTSYS)
+ 			ret = -EINTR;
+ 		if (ret > 0 && io_net_retry(sock, flags)) {
++			kmsg->msg.msg_controllen = 0;
++			kmsg->msg.msg_control = NULL;
+ 			sr->done_io += ret;
+ 			req->flags |= REQ_F_PARTIAL_IO;
+ 			return io_setup_async_msg(req, kmsg);
diff --git a/queue-5.10/io_uring-net-disable-partial-retries-for-recvmsg-with-cmsg.patch b/queue-5.10/io_uring-net-disable-partial-retries-for-recvmsg-with-cmsg.patch
new file mode 100644
index 00000000000..748daaffb7e
--- /dev/null
+++ b/queue-5.10/io_uring-net-disable-partial-retries-for-recvmsg-with-cmsg.patch
@@ -0,0 +1,38 @@
+From b3f9442fb5b504d240e6710f483232641beb1b8f Mon Sep 17 00:00:00 2001
+From: Jens Axboe <axboe@kernel.dk>
+Date: Fri, 23 Jun 2023 07:41:10 -0600
+Subject: io_uring/net: disable partial retries for recvmsg with cmsg
+
+From: Jens Axboe <axboe@kernel.dk>
+
+Commit 78d0d2063bab954d19a1696feae4c7706a626d48 upstream.
+
+We cannot sanely handle partial retries for recvmsg if we have cmsg
+attached. If we did retry, we'd just be overwriting the initial cmsg
+header on each retry. Alternatively we could increment and handle this
+appropriately, but it doesn't seem worth the complication.
+
+Move the MSG_WAITALL check into the non-multishot case while at it,
+since MSG_WAITALL is explicitly disabled for multishot anyway.
+
+Link: https://lore.kernel.org/io-uring/0b0d4411-c8fd-4272-770b-e030af6919a0@kernel.dk/
+Cc: stable@vger.kernel.org # 5.10+
+Reported-by: Stefan Metzmacher <metze@samba.org>
+Reviewed-by: Stefan Metzmacher <metze@samba.org>
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ io_uring/io_uring.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -5053,7 +5053,7 @@ static int io_recvmsg(struct io_kiocb *r
+ 	flags = req->sr_msg.msg_flags;
+ 	if (force_nonblock)
+ 		flags |= MSG_DONTWAIT;
+-	if (flags & MSG_WAITALL)
++	if (flags & MSG_WAITALL && !kmsg->msg.msg_controllen)
+ 		min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+ 
+ 	ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
diff --git a/queue-5.10/io_uring-net-save-msghdr-msg_control-for-retries.patch b/queue-5.10/io_uring-net-save-msghdr-msg_control-for-retries.patch
new file mode 100644
index 00000000000..3dcd865a47b
--- /dev/null
+++ b/queue-5.10/io_uring-net-save-msghdr-msg_control-for-retries.patch
@@ -0,0 +1,62 @@
+From 76513d9f99764e6acf9f0e2e53b7d42d95d6630d Mon Sep 17 00:00:00 2001
+From: Jens Axboe <axboe@kernel.dk>
+Date: Fri, 23 Jun 2023 07:38:14 -0600
+Subject: io_uring/net: save msghdr->msg_control for retries
+
+From: Jens Axboe <axboe@kernel.dk>
+
+Commit cac9e4418f4cbd548ccb065b3adcafe073f7f7d2 upstream.
+
+If the application sets ->msg_control and we have to later retry this
+command, or if it got queued with IOSQE_ASYNC to begin with, then we
+need to retain the original msg_control value. This is due to the net
+stack overwriting this field with an in-kernel pointer, to copy it
+in. Hitting that path for the second time will now fail the copy from
+user, as it's attempting to copy from a non-user address.
+
+Cc: stable@vger.kernel.org # 5.10+
+Link: https://github.com/axboe/liburing/issues/880
+Reported-and-tested-by: Marek Majkowski <marek@cloudflare.com>
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ io_uring/io_uring.c |   11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -581,6 +581,7 @@ struct io_sr_msg {
+ 	size_t				len;
+ 	size_t				done_io;
+ 	struct io_buffer		*kbuf;
++	void __user			*msg_control;
+ };
+ 
+ struct io_open {
+@@ -4718,10 +4719,16 @@ static int io_setup_async_msg(struct io_
+ static int io_sendmsg_copy_hdr(struct io_kiocb *req,
+ 			       struct io_async_msghdr *iomsg)
+ {
++	struct io_sr_msg *sr = &req->sr_msg;
++	int ret;
++
+ 	iomsg->msg.msg_name = &iomsg->addr;
+ 	iomsg->free_iov = iomsg->fast_iov;
+-	return sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
++	ret = sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
+ 				   req->sr_msg.msg_flags, &iomsg->free_iov);
++	/* save msg_control as sys_sendmsg() overwrites it */
++	sr->msg_control = iomsg->msg.msg_control;
++	return ret;
+ }
+ 
+ static int io_sendmsg_prep_async(struct io_kiocb *req)
+@@ -4778,6 +4785,8 @@ static int io_sendmsg(struct io_kiocb *r
+ 		if (ret)
+ 			return ret;
+ 		kmsg = &iomsg;
++	} else {
++		kmsg->msg.msg_control = sr->msg_control;
+ 	}
+ 
+ 	flags = req->sr_msg.msg_flags;
diff --git a/queue-5.10/nilfs2-prevent-general-protection-fault-in-nilfs_clear_dirty_page.patch b/queue-5.10/nilfs2-prevent-general-protection-fault-in-nilfs_clear_dirty_page.patch
new file mode 100644
index 00000000000..a492ac8fd18
--- /dev/null
+++ b/queue-5.10/nilfs2-prevent-general-protection-fault-in-nilfs_clear_dirty_page.patch
@@ -0,0 +1,56 @@
+From 782e53d0c14420858dbf0f8f797973c150d3b6d7 Mon Sep 17 00:00:00 2001
+From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
+Date: Mon, 12 Jun 2023 11:14:56 +0900
+Subject: nilfs2: prevent general protection fault in nilfs_clear_dirty_page()
+
+From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
+
+commit 782e53d0c14420858dbf0f8f797973c150d3b6d7 upstream.
+
+In a syzbot stress test that deliberately causes file system errors on
+nilfs2 with a corrupted disk image, it has been reported that
+nilfs_clear_dirty_page() called from nilfs_clear_dirty_pages() can cause a
+general protection fault.
+
+In nilfs_clear_dirty_pages(), when looking up dirty pages from the page
+cache and calling nilfs_clear_dirty_page() for each dirty page/folio
+retrieved, the back reference from the argument page to "mapping" may have
+been changed to NULL (and possibly others).  It is necessary to check this
+after locking the page/folio.
+
+So, fix this issue by not calling nilfs_clear_dirty_page() on a page/folio
+after locking it in nilfs_clear_dirty_pages() if the back reference
+"mapping" from the page/folio is different from the "mapping" that held
+the page/folio just before.
+
+Link: https://lkml.kernel.org/r/20230612021456.3682-1-konishi.ryusuke@gmail.com
+Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
+Reported-by: syzbot+53369d11851d8f26735c@syzkaller.appspotmail.com
+Closes: https://lkml.kernel.org/r/000000000000da4f6b05eb9bf593@google.com
+Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/nilfs2/page.c |   10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/fs/nilfs2/page.c
++++ b/fs/nilfs2/page.c
+@@ -369,7 +369,15 @@ void nilfs_clear_dirty_pages(struct addr
+ 			struct page *page = pvec.pages[i];
+ 
+ 			lock_page(page);
+-			nilfs_clear_dirty_page(page, silent);
++
++			/*
++			 * This page may have been removed from the address
++			 * space by truncation or invalidation when the lock
++			 * was acquired.  Skip processing in that case.
++			 */
++			if (likely(page->mapping == mapping))
++				nilfs_clear_dirty_page(page, silent);
++
+ 			unlock_page(page);
+ 		}
+ 		pagevec_release(&pvec);
diff --git a/queue-5.10/series b/queue-5.10/series
index d9c5ba6e36c..be156270e99 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -19,3 +19,8 @@ mmc-mmci-stm32-fix-max-busy-timeout-calculation.patch
 ip_tunnels-allow-vxlan-geneve-to-inherit-tos-ttl-from-vlan.patch
 regulator-pca9450-fix-ldo3out-and-ldo4out-mask.patch
 regmap-spi-avmm-fix-regmap_bus-max_raw_write.patch
+writeback-fix-dereferencing-null-mapping-host-on-writeback_page_template.patch
+io_uring-net-save-msghdr-msg_control-for-retries.patch
+io_uring-net-clear-msg_controllen-on-partial-sendmsg-retry.patch
+io_uring-net-disable-partial-retries-for-recvmsg-with-cmsg.patch
+nilfs2-prevent-general-protection-fault-in-nilfs_clear_dirty_page.patch
diff --git a/queue-5.10/writeback-fix-dereferencing-null-mapping-host-on-writeback_page_template.patch b/queue-5.10/writeback-fix-dereferencing-null-mapping-host-on-writeback_page_template.patch
new file mode 100644
index 00000000000..54f8eabef6a
--- /dev/null
+++ b/queue-5.10/writeback-fix-dereferencing-null-mapping-host-on-writeback_page_template.patch
@@ -0,0 +1,99 @@
+From 54abe19e00cfcc5a72773d15cd00ed19ab763439 Mon Sep 17 00:00:00 2001
+From: Rafael Aquini <aquini@redhat.com>
+Date: Tue, 6 Jun 2023 19:36:13 -0400
+Subject: writeback: fix dereferencing NULL mapping->host on writeback_page_template
+
+From: Rafael Aquini <aquini@redhat.com>
+
+commit 54abe19e00cfcc5a72773d15cd00ed19ab763439 upstream.
+
+When commit 19343b5bdd16 ("mm/page-writeback: introduce tracepoint for
+wait_on_page_writeback()") repurposed the writeback_dirty_page trace event
+as a template to create its new wait_on_page_writeback trace event, it
+ended up opening a window to NULL pointer dereference crashes due to the
+(infrequent) occurrence of a race where an access to a page in the
+swap-cache happens concurrently with the moment this page is being written
+to disk and the tracepoint is enabled:
+
+    BUG: kernel NULL pointer dereference, address: 0000000000000040
+    #PF: supervisor read access in kernel mode
+    #PF: error_code(0x0000) - not-present page
+    PGD 800000010ec0a067 P4D 800000010ec0a067 PUD 102353067 PMD 0
+    Oops: 0000 [#1] PREEMPT SMP PTI
+    CPU: 1 PID: 1320 Comm: shmem-worker Kdump: loaded Not tainted 6.4.0-rc5+ #13
+    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230301gitf80f052277c8-1.fc37 03/01/2023
+    RIP: 0010:trace_event_raw_event_writeback_folio_template+0x76/0xf0
+    Code: 4d 85 e4 74 5c 49 8b 3c 24 e8 06 98 ee ff 48 89 c7 e8 9e 8b ee ff ba 20 00 00 00 48 89 ef 48 89 c6 e8 fe d4 1a 00 49 8b 04 24 <48> 8b 40 40 48 89 43 28 49 8b 45 20 48 89 e7 48 89 43 30 e8 a2 4d
+    RSP: 0000:ffffaad580b6fb60 EFLAGS: 00010246
+    RAX: 0000000000000000 RBX: ffff90e38035c01c RCX: 0000000000000000
+    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff90e38035c044
+    RBP: ffff90e38035c024 R08: 0000000000000002 R09: 0000000000000006
+    R10: ffff90e38035c02e R11: 0000000000000020 R12: ffff90e380bac000
+    R13: ffffe3a7456d9200 R14: 0000000000001b81 R15: ffffe3a7456d9200
+    FS:  00007f2e4e8a15c0(0000) GS:ffff90e3fbc80000(0000) knlGS:0000000000000000
+    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+    CR2: 0000000000000040 CR3: 00000001150c6003 CR4: 0000000000170ee0
+    Call Trace:
+     <TASK>
+     ? __die+0x20/0x70
+     ? page_fault_oops+0x76/0x170
+     ? kernelmode_fixup_or_oops+0x84/0x110
+     ? exc_page_fault+0x65/0x150
+     ? asm_exc_page_fault+0x22/0x30
+     ? trace_event_raw_event_writeback_folio_template+0x76/0xf0
+     folio_wait_writeback+0x6b/0x80
+     shmem_swapin_folio+0x24a/0x500
+     ? filemap_get_entry+0xe3/0x140
+     shmem_get_folio_gfp+0x36e/0x7c0
+     ? find_busiest_group+0x43/0x1a0
+     shmem_fault+0x76/0x2a0
+     ? __update_load_avg_cfs_rq+0x281/0x2f0
+     __do_fault+0x33/0x130
+     do_read_fault+0x118/0x160
+     do_pte_missing+0x1ed/0x2a0
+     __handle_mm_fault+0x566/0x630
+     handle_mm_fault+0x91/0x210
+     do_user_addr_fault+0x22c/0x740
+     exc_page_fault+0x65/0x150
+     asm_exc_page_fault+0x22/0x30
+
+This problem arises from the fact that the repurposed writeback_dirty_page
+trace event code was written assuming that every pointer to mapping
+(struct address_space) would come from a file-mapped page-cache object,
+thus mapping->host would always be populated, and that was a valid case
+before commit 19343b5bdd16.  The swap-cache address space
+(swapper_spaces), however, doesn't populate its ->host (struct inode)
+pointer, thus leading to the crashes in the corner-case aforementioned.
+
+commit 19343b5bdd16 ended up breaking the assignment of __entry->name and
+__entry->ino for the wait_on_page_writeback tracepoint -- both dependent
+on mapping->host carrying a pointer to a valid inode.  The assignment of
+__entry->name was fixed by commit 68f23b89067f ("memcg: fix a crash in
+wb_workfn when a device disappears"), and this commit fixes the remaining
+case, for __entry->ino.
+
+Link: https://lkml.kernel.org/r/20230606233613.1290819-1-aquini@redhat.com
+Fixes: 19343b5bdd16 ("mm/page-writeback: introduce tracepoint for wait_on_page_writeback()")
+Signed-off-by: Rafael Aquini <aquini@redhat.com>
+Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
+Cc: Aristeu Rozanski <aris@redhat.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+
+---
+ include/trace/events/writeback.h |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/include/trace/events/writeback.h
++++ b/include/trace/events/writeback.h
+@@ -67,7 +67,7 @@ DECLARE_EVENT_CLASS(writeback_page_templ
+ 		strscpy_pad(__entry->name,
+ 			    bdi_dev_name(mapping ? inode_to_bdi(mapping->host) :
+ 					 NULL), 32);
+-		__entry->ino = mapping ? mapping->host->i_ino : 0;
++		__entry->ino = (mapping && mapping->host) ? mapping->host->i_ino : 0;
+ 		__entry->index = page->index;
+ 	),
+