From: Greg Kroah-Hartman Date: Fri, 12 Aug 2022 15:39:18 +0000 (+0200) Subject: 5.10-stable patches X-Git-Tag: v5.15.61~198 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=44b00538deff7721f8e394e0c9dda08d5d7db495;p=thirdparty%2Fkernel%2Fstable-queue.git 5.10-stable patches added patches: kvm-x86-tag-kvm_mmu_x86_module_init-with-__init.patch mm-add-kvrealloc.patch riscv-set-default-pm_power_off-to-null.patch xfs-fix-i_dontcache.patch xfs-only-set-iomap_f_shared-when-providing-a-srcmap-to-a-write.patch --- diff --git a/queue-5.10/kvm-x86-tag-kvm_mmu_x86_module_init-with-__init.patch b/queue-5.10/kvm-x86-tag-kvm_mmu_x86_module_init-with-__init.patch new file mode 100644 index 00000000000..69921e1a86e --- /dev/null +++ b/queue-5.10/kvm-x86-tag-kvm_mmu_x86_module_init-with-__init.patch @@ -0,0 +1,48 @@ +From 982bae43f11c37b51d2f1961bb25ef7cac3746fa Mon Sep 17 00:00:00 2001 +From: Sean Christopherson +Date: Wed, 3 Aug 2022 22:49:55 +0000 +Subject: KVM: x86: Tag kvm_mmu_x86_module_init() with __init + +From: Sean Christopherson + +commit 982bae43f11c37b51d2f1961bb25ef7cac3746fa upstream. + +Mark kvm_mmu_x86_module_init() with __init, the entire reason it exists +is to initialize variables when kvm.ko is loaded, i.e. it must never be +called after module initialization. + +Fixes: 1d0e84806047 ("KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded") +Cc: stable@vger.kernel.org +Reviewed-by: Kai Huang +Tested-by: Michael Roth +Signed-off-by: Sean Christopherson +Message-Id: <20220803224957.1285926-2-seanjc@google.com> +Signed-off-by: Paolo Bonzini +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/include/asm/kvm_host.h | 2 +- + arch/x86/kvm/mmu/mmu.c | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +--- a/arch/x86/include/asm/kvm_host.h ++++ b/arch/x86/include/asm/kvm_host.h +@@ -1340,7 +1340,7 @@ static inline int kvm_arch_flush_remote_ + return -ENOTSUPP; + } + +-void kvm_mmu_x86_module_init(void); ++void __init kvm_mmu_x86_module_init(void); + int kvm_mmu_vendor_module_init(void); + void kvm_mmu_vendor_module_exit(void); + +--- a/arch/x86/kvm/mmu/mmu.c ++++ b/arch/x86/kvm/mmu/mmu.c +@@ -5886,7 +5886,7 @@ static int set_nx_huge_pages(const char + * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as + * its default value of -1 is technically undefined behavior for a boolean. + */ +-void kvm_mmu_x86_module_init(void) ++void __init kvm_mmu_x86_module_init(void) + { + if (nx_huge_pages == -1) + __set_nx_huge_pages(get_nx_auto_mode()); diff --git a/queue-5.10/mm-add-kvrealloc.patch b/queue-5.10/mm-add-kvrealloc.patch new file mode 100644 index 00000000000..9949f57fad2 --- /dev/null +++ b/queue-5.10/mm-add-kvrealloc.patch @@ -0,0 +1,130 @@ +From foo@baz Fri Aug 12 05:38:47 PM CEST 2022 +From: Amir Goldstein +Date: Wed, 10 Aug 2022 16:15:50 +0200 +Subject: mm: Add kvrealloc() +To: Greg Kroah-Hartman +Cc: Sasha Levin , "Darrick J . Wong" , Leah Rumancik , Chandan Babu R , Luis Chamberlain , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Dave Chinner , Mel Gorman +Message-ID: <20220810141552.168763-2-amir73il@gmail.com> + +From: Dave Chinner + +commit de2860f4636256836450c6543be744a50118fc66 upstream. + +During log recovery of an XFS filesystem with 64kB directory +buffers, rebuilding a buffer split across two log records results +in a memory allocation warning from krealloc like this: + +xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff) +XFS (dm-0): Unmounting Filesystem +XFS (dm-0): Mounting V5 Filesystem +XFS (dm-0): Starting recovery (logdev: internal) +------------[ cut here ]------------ +WARNING: CPU: 5 PID: 3435170 at mm/page_alloc.c:3539 get_page_from_freelist+0xdee/0xe40 +..... +RIP: 0010:get_page_from_freelist+0xdee/0xe40 +Call Trace: + ? complete+0x3f/0x50 + __alloc_pages+0x16f/0x300 + alloc_pages+0x87/0x110 + kmalloc_order+0x2c/0x90 + kmalloc_order_trace+0x1d/0x90 + __kmalloc_track_caller+0x215/0x270 + ? xlog_recover_add_to_cont_trans+0x63/0x1f0 + krealloc+0x54/0xb0 + xlog_recover_add_to_cont_trans+0x63/0x1f0 + xlog_recovery_process_trans+0xc1/0xd0 + xlog_recover_process_ophdr+0x86/0x130 + xlog_recover_process_data+0x9f/0x160 + xlog_recover_process+0xa2/0x120 + xlog_do_recovery_pass+0x40b/0x7d0 + ? __irq_work_queue_local+0x4f/0x60 + ? irq_work_queue+0x3a/0x50 + xlog_do_log_recovery+0x70/0x150 + xlog_do_recover+0x38/0x1d0 + xlog_recover+0xd8/0x170 + xfs_log_mount+0x181/0x300 + xfs_mountfs+0x4a1/0x9b0 + xfs_fs_fill_super+0x3c0/0x7b0 + get_tree_bdev+0x171/0x270 + ? suffix_kstrtoint.constprop.0+0xf0/0xf0 + xfs_fs_get_tree+0x15/0x20 + vfs_get_tree+0x24/0xc0 + path_mount+0x2f5/0xaf0 + __x64_sys_mount+0x108/0x140 + do_syscall_64+0x3a/0x70 + entry_SYSCALL_64_after_hwframe+0x44/0xae + +Essentially, we are taking a multi-order allocation from kmem_alloc() +(which has an open coded no fail, no warn loop) and then +reallocating it out to 64kB using krealloc(__GFP_NOFAIL) and that is +then triggering the above warning. + +This is a regression caused by converting this code from an open +coded no fail/no warn reallocation loop to using __GFP_NOFAIL. + +What we actually need here is kvrealloc(), so that if contiguous +page allocation fails we fall back to vmalloc() and we don't +get nasty warnings happening in XFS. + +Fixes: 771915c4f688 ("xfs: remove kmem_realloc()") +Signed-off-by: Dave Chinner +Acked-by: Mel Gorman +Reviewed-by: Darrick J. Wong +Signed-off-by: Darrick J. Wong +Signed-off-by: Amir Goldstein +Acked-by: Darrick J. Wong +Signed-off-by: Greg Kroah-Hartman +--- + fs/xfs/xfs_log_recover.c | 4 +++- + include/linux/mm.h | 2 ++ + mm/util.c | 15 +++++++++++++++ + 3 files changed, 20 insertions(+), 1 deletion(-) + +--- a/fs/xfs/xfs_log_recover.c ++++ b/fs/xfs/xfs_log_recover.c +@@ -2061,7 +2061,9 @@ xlog_recover_add_to_cont_trans( + old_ptr = item->ri_buf[item->ri_cnt-1].i_addr; + old_len = item->ri_buf[item->ri_cnt-1].i_len; + +- ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL); ++ ptr = kvrealloc(old_ptr, old_len, len + old_len, GFP_KERNEL); ++ if (!ptr) ++ return -ENOMEM; + memcpy(&ptr[old_len], dp, len); + item->ri_buf[item->ri_cnt-1].i_len += len; + item->ri_buf[item->ri_cnt-1].i_addr = ptr; +--- a/include/linux/mm.h ++++ b/include/linux/mm.h +@@ -788,6 +788,8 @@ static inline void *kvcalloc(size_t n, s + return kvmalloc_array(n, size, flags | __GFP_ZERO); + } + ++extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, ++ gfp_t flags); + extern void kvfree(const void *addr); + extern void kvfree_sensitive(const void *addr, size_t len); + +--- a/mm/util.c ++++ b/mm/util.c +@@ -661,6 +661,21 @@ void kvfree_sensitive(const void *addr, + } + EXPORT_SYMBOL(kvfree_sensitive); + ++void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags) ++{ ++ void *newp; ++ ++ if (oldsize >= newsize) ++ return (void *)p; ++ newp = kvmalloc(newsize, flags); ++ if (!newp) ++ return NULL; ++ memcpy(newp, p, oldsize); ++ kvfree(p); ++ return newp; ++} ++EXPORT_SYMBOL(kvrealloc); ++ + static inline void *__page_rmapping(struct page *page) + { + unsigned long mapping; diff --git a/queue-5.10/riscv-set-default-pm_power_off-to-null.patch b/queue-5.10/riscv-set-default-pm_power_off-to-null.patch new file mode 100644 index 00000000000..c5f2ac0807f --- /dev/null +++ b/queue-5.10/riscv-set-default-pm_power_off-to-null.patch @@ -0,0 +1,65 @@ +From f2928e224d85e7cc139009ab17cefdfec2df5d11 Mon Sep 17 00:00:00 2001 +From: Dimitri John Ledkov +Date: Tue, 7 Sep 2021 01:28:47 +0100 +Subject: riscv: set default pm_power_off to NULL + +From: Dimitri John Ledkov + +commit f2928e224d85e7cc139009ab17cefdfec2df5d11 upstream. + +Set pm_power_off to NULL like on all other architectures, check if it +is set in machine_halt() and machine_power_off() and fallback to +default_power_off if no other power driver got registered. + +This brings riscv architecture inline with all other architectures, +and allows to reuse exiting power drivers unmodified. + +Kernels without legacy SBI v0.1 extensions (CONFIG_RISCV_SBI_V01 is +not set), do not set pm_power_off to sbi_shutdown(). There is no +support for SBI v0.3 system reset extension either. This prevents +using gpio_poweroff on SiFive HiFive Unmatched. + +Tested on SiFive HiFive unmatched, with a dtb specifying gpio-poweroff +node and kernel complied without CONFIG_RISCV_SBI_V01. + +BugLink: https://bugs.launchpad.net/bugs/1942806 +Signed-off-by: Dimitri John Ledkov +Reviewed-by: Anup Patel +Tested-by: Ron Economos +Signed-off-by: Palmer Dabbelt +Cc: Nathan Chancellor +Signed-off-by: Greg Kroah-Hartman +--- + arch/riscv/kernel/reset.c | 12 +++++++++--- + 1 file changed, 9 insertions(+), 3 deletions(-) + +--- a/arch/riscv/kernel/reset.c ++++ b/arch/riscv/kernel/reset.c +@@ -12,7 +12,7 @@ static void default_power_off(void) + wait_for_interrupt(); + } + +-void (*pm_power_off)(void) = default_power_off; ++void (*pm_power_off)(void) = NULL; + EXPORT_SYMBOL(pm_power_off); + + void machine_restart(char *cmd) +@@ -23,10 +23,16 @@ void machine_restart(char *cmd) + + void machine_halt(void) + { +- pm_power_off(); ++ if (pm_power_off != NULL) ++ pm_power_off(); ++ else ++ default_power_off(); + } + + void machine_power_off(void) + { +- pm_power_off(); ++ if (pm_power_off != NULL) ++ pm_power_off(); ++ else ++ default_power_off(); + } diff --git a/queue-5.10/series b/queue-5.10/series index 87c41f5f1ac..952bdc788b1 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -18,3 +18,8 @@ kvm-s390-pv-don-t-present-the-ecall-interrupt-twice.patch kvm-nvmx-let-userspace-set-nvmx-msr-to-any-_host_-supported-value.patch kvm-x86-mark-tss-busy-during-ltr-emulation-_after_-all-fault-checks.patch kvm-x86-set-error-code-to-segment-selector-on-lldt-ltr-non-canonical-gp.patch +kvm-x86-tag-kvm_mmu_x86_module_init-with-__init.patch +riscv-set-default-pm_power_off-to-null.patch +mm-add-kvrealloc.patch +xfs-only-set-iomap_f_shared-when-providing-a-srcmap-to-a-write.patch +xfs-fix-i_dontcache.patch diff --git a/queue-5.10/xfs-fix-i_dontcache.patch b/queue-5.10/xfs-fix-i_dontcache.patch new file mode 100644 index 00000000000..b3ab493764a --- /dev/null +++ b/queue-5.10/xfs-fix-i_dontcache.patch @@ -0,0 +1,51 @@ +From foo@baz Fri Aug 12 05:38:47 PM CEST 2022 +From: Amir Goldstein +Date: Wed, 10 Aug 2022 16:15:52 +0200 +Subject: xfs: fix I_DONTCACHE +To: Greg Kroah-Hartman +Cc: Sasha Levin , "Darrick J . Wong" , Leah Rumancik , Chandan Babu R , Luis Chamberlain , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Dave Chinner +Message-ID: <20220810141552.168763-4-amir73il@gmail.com> + +From: Dave Chinner + +commit f38a032b165d812b0ba8378a5cd237c0888ff65f upstream. + +Yup, the VFS hoist broke it, and nobody noticed. Bulkstat workloads +make it clear that it doesn't work as it should. + +Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer") +Signed-off-by: Dave Chinner +Reviewed-by: Darrick J. Wong +Signed-off-by: Darrick J. Wong +Signed-off-by: Amir Goldstein +Acked-by: Darrick J. Wong +Signed-off-by: Greg Kroah-Hartman +--- + fs/xfs/xfs_icache.c | 3 ++- + fs/xfs/xfs_iops.c | 2 +- + 2 files changed, 3 insertions(+), 2 deletions(-) + +--- a/fs/xfs/xfs_icache.c ++++ b/fs/xfs/xfs_icache.c +@@ -47,8 +47,9 @@ xfs_inode_alloc( + return NULL; + } + +- /* VFS doesn't initialise i_mode! */ ++ /* VFS doesn't initialise i_mode or i_state! */ + VFS_I(ip)->i_mode = 0; ++ VFS_I(ip)->i_state = 0; + + XFS_STATS_INC(mp, vn_active); + ASSERT(atomic_read(&ip->i_pincount) == 0); +--- a/fs/xfs/xfs_iops.c ++++ b/fs/xfs/xfs_iops.c +@@ -1328,7 +1328,7 @@ xfs_setup_inode( + gfp_t gfp_mask; + + inode->i_ino = ip->i_ino; +- inode->i_state = I_NEW; ++ inode->i_state |= I_NEW; + + inode_sb_list_add(inode); + /* make the inode look hashed for the writeback code */ diff --git a/queue-5.10/xfs-only-set-iomap_f_shared-when-providing-a-srcmap-to-a-write.patch b/queue-5.10/xfs-only-set-iomap_f_shared-when-providing-a-srcmap-to-a-write.patch new file mode 100644 index 00000000000..9a23a83a20c --- /dev/null +++ b/queue-5.10/xfs-only-set-iomap_f_shared-when-providing-a-srcmap-to-a-write.patch @@ -0,0 +1,103 @@ +From foo@baz Fri Aug 12 05:38:47 PM CEST 2022 +From: Amir Goldstein +Date: Wed, 10 Aug 2022 16:15:51 +0200 +Subject: xfs: only set IOMAP_F_SHARED when providing a srcmap to a write +To: Greg Kroah-Hartman +Cc: Sasha Levin , "Darrick J . Wong" , Leah Rumancik , Chandan Babu R , Luis Chamberlain , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Christoph Hellwig , Chandan Babu R +Message-ID: <20220810141552.168763-3-amir73il@gmail.com> + +From: "Darrick J. Wong" + +commit 72a048c1056a72e37ea2ee34cc73d8c6d6cb4290 upstream. + +While prototyping a free space defragmentation tool, I observed an +unexpected IO error while running a sequence of commands that can be +recreated by the following sequence of commands: + +$ xfs_io -f -c "pwrite -S 0x58 -b 10m 0 10m" file1 +$ cp --reflink=always file1 file2 +$ punch-alternating -o 1 file2 +$ xfs_io -c "funshare 0 10m" file2 +fallocate: Input/output error + +I then scraped this (abbreviated) stack trace from dmesg: + +WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450 +CPU: 0 PID: 30788 Comm: xfs_io Not tainted 5.14.0-rc6-xfsx #rc6 5ef57b62a900814b3e4d885c755e9014541c8732 +Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014 +RIP: 0010:iomap_write_begin+0x376/0x450 +RSP: 0018:ffffc90000c0fc20 EFLAGS: 00010297 +RAX: 0000000000000001 RBX: ffffc90000c0fd10 RCX: 0000000000001000 +RDX: ffffc90000c0fc54 RSI: 000000000000000c RDI: 000000000000000c +RBP: ffff888005d5dbd8 R08: 0000000000102000 R09: ffffc90000c0fc50 +R10: 0000000000b00000 R11: 0000000000101000 R12: ffffea0000336c40 +R13: 0000000000001000 R14: ffffc90000c0fd10 R15: 0000000000101000 +FS: 00007f4b8f62fe40(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 000056361c554108 CR3: 000000000524e004 CR4: 00000000001706f0 +Call Trace: + iomap_unshare_actor+0x95/0x140 + iomap_apply+0xfa/0x300 + iomap_file_unshare+0x44/0x60 + xfs_reflink_unshare+0x50/0x140 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195] + xfs_file_fallocate+0x27c/0x610 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195] + vfs_fallocate+0x133/0x330 + __x64_sys_fallocate+0x3e/0x70 + do_syscall_64+0x35/0x80 + entry_SYSCALL_64_after_hwframe+0x44/0xae +RIP: 0033:0x7f4b8f79140a + +Looking at the iomap tracepoints, I saw this: + +iomap_iter: dev 8:64 ino 0x100 pos 0 length 0 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare +iomap_iter_dstmap: dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 0 length 131072 type DELALLOC flags SHARED +iomap_iter_srcmap: dev 8:64 ino 0x100 bdev 8:64 addr 147456 offset 0 length 4096 type MAPPED flags +iomap_iter: dev 8:64 ino 0x100 pos 0 length 4096 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare +iomap_iter_dstmap: dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 4096 length 4096 type DELALLOC flags SHARED +console: WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450 + +The first time funshare calls ->iomap_begin, xfs sees that the first +block is shared and creates a 128k delalloc reservation in the COW fork. +The delalloc reservation is returned as dstmap, and the shared block is +returned as srcmap. So far so good. + +funshare calls ->iomap_begin to try the second block. This time there's +no srcmap (punch-alternating punched it out!) but we still have the +delalloc reservation in the COW fork. Therefore, we again return the +reservation as dstmap and the hole as srcmap. iomap_unshare_iter +incorrectly tries to unshare the hole, which __iomap_write_begin rejects +because shared regions must be fully written and therefore cannot +require zeroing. + +Therefore, change the buffered write iomap_begin function not to set +IOMAP_F_SHARED when there isn't a source mapping to read from for the +unsharing. + +Signed-off-by: Darrick J. Wong +Reviewed-by: Christoph Hellwig +Reviewed-by: Chandan Babu R +Signed-off-by: Amir Goldstein +Acked-by: Darrick J. Wong +Signed-off-by: Greg Kroah-Hartman +--- + fs/xfs/xfs_iomap.c | 8 ++++---- + 1 file changed, 4 insertions(+), 4 deletions(-) + +--- a/fs/xfs/xfs_iomap.c ++++ b/fs/xfs/xfs_iomap.c +@@ -1062,11 +1062,11 @@ found_cow: + error = xfs_bmbt_to_iomap(ip, srcmap, &imap, 0); + if (error) + return error; +- } else { +- xfs_trim_extent(&cmap, offset_fsb, +- imap.br_startoff - offset_fsb); ++ return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED); + } +- return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED); ++ ++ xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb); ++ return xfs_bmbt_to_iomap(ip, iomap, &cmap, 0); + + out_unlock: + xfs_iunlock(ip, XFS_ILOCK_EXCL);