From: Greg Kroah-Hartman
Date: Wed, 2 Oct 2024 10:45:25 +0000 (+0200)
Subject: 6.10-stable patches
X-Git-Tag: v6.6.54~27
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=ffa527ef31c17cefd5bd8e0572bd989ab36bd6ec;p=thirdparty%2Fkernel%2Fstable-queue.git

6.10-stable patches

added patches:
	bpf-lsm-set-bpf_lsm_blob_sizes.lbs_task-to-0.patch
	compiler.h-specify-correct-attribute-for-.rodata..c_jump_table.patch
	dm-verity-restart-or-panic-on-an-i-o-error.patch
	exfat-resolve-memory-leak-from-exfat_create_upcase_table.patch
	fbdev-xen-fbfront-assign-fb_info-device.patch
	i2c-aspeed-update-the-stop-sw-state-when-the-bus-recovery-occurs.patch
	i2c-isch-add-missed-else.patch
	lockdep-fix-deadlock-issue-between-lockdep-and-rcu.patch
	mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch
	mm-damon-vaddr-protect-vma-traversal-in-__damon_va_thre_regions-with-rcu-read-lock.patch
	mm-huge_memory-ensure-huge_zero_folio-won-t-have-large_rmappable-flag-set.patch
	mm-hugetlb.c-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch
	mm-hugetlb_vmemmap-batch-hvo-work-when-demoting.patch
	mm-only-enforce-minimum-stack-gap-size-if-it-s-sensible.patch
	module-fix-kcov-ignored-file-name.patch
	s390-ftrace-avoid-calling-unwinder-in-ftrace_return_address.patch
	spi-fspi-add-support-for-imx8ulp.patch
	tpm-export-tpm2_sessions_init-to-fix-ibmvtpm-building.patch
---

diff --git a/queue-6.10/bpf-lsm-set-bpf_lsm_blob_sizes.lbs_task-to-0.patch b/queue-6.10/bpf-lsm-set-bpf_lsm_blob_sizes.lbs_task-to-0.patch
new file mode 100644
index 00000000000..bdaadda0e7a
--- /dev/null
+++ b/queue-6.10/bpf-lsm-set-bpf_lsm_blob_sizes.lbs_task-to-0.patch
@@ -0,0 +1,36 @@
+From 300a90b2cb5d442879e6398920c49aebbd5c8e40 Mon Sep 17 00:00:00 2001
+From: Song Liu
+Date: Tue, 10 Sep 2024 22:55:08 -0700
+Subject: bpf: lsm: Set bpf_lsm_blob_sizes.lbs_task to 0
+
+From: Song Liu
+
+commit 300a90b2cb5d442879e6398920c49aebbd5c8e40 upstream.
+
+bpf task local storage is now using task_struct->bpf_storage, so
+bpf_lsm_blob_sizes.lbs_task is no longer needed. Remove it to save some
+memory.
+
+Fixes: a10787e6d58c ("bpf: Enable task local storage for tracing programs")
+Cc: stable@vger.kernel.org
+Cc: KP Singh
+Cc: Matt Bobrowski
+Signed-off-by: Song Liu
+Acked-by: Matt Bobrowski
+Link: https://lore.kernel.org/r/20240911055508.9588-1-song@kernel.org
+Signed-off-by: Alexei Starovoitov
+Signed-off-by: Greg Kroah-Hartman
+---
+ security/bpf/hooks.c | 1 -
+ 1 file changed, 1 deletion(-)
+
+--- a/security/bpf/hooks.c
++++ b/security/bpf/hooks.c
+@@ -31,7 +31,6 @@ static int __init bpf_lsm_init(void)
+ 
+ struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
+ 	.lbs_inode = sizeof(struct bpf_storage_blob),
+-	.lbs_task = sizeof(struct bpf_storage_blob),
+ };
+ 
+ DEFINE_LSM(bpf) = {
diff --git a/queue-6.10/compiler.h-specify-correct-attribute-for-.rodata..c_jump_table.patch b/queue-6.10/compiler.h-specify-correct-attribute-for-.rodata..c_jump_table.patch
new file mode 100644
index 00000000000..1b399a30acf
--- /dev/null
+++ b/queue-6.10/compiler.h-specify-correct-attribute-for-.rodata..c_jump_table.patch
@@ -0,0 +1,66 @@
+From c5b1184decc819756ae549ba54c63b6790c4ddfd Mon Sep 17 00:00:00 2001
+From: Tiezhu Yang
+Date: Tue, 24 Sep 2024 14:27:10 +0800
+Subject: compiler.h: specify correct attribute for .rodata..c_jump_table
+
+From: Tiezhu Yang
+
+commit c5b1184decc819756ae549ba54c63b6790c4ddfd upstream.
+
+Currently, there is an assembler message when generating kernel/bpf/core.o
+under CONFIG_OBJTOOL with LoongArch compiler toolchain:
+
+  Warning: setting incorrect section attributes for .rodata..c_jump_table
+
+This is because the section ".rodata..c_jump_table" should be readonly,
+but there is a "W" (writable) part of the flags:
+
+  $ readelf -S kernel/bpf/core.o | grep -A 1 "rodata..c"
+  [34] .rodata..c_j[...] PROGBITS 0000000000000000 0000d2e0
+       0000000000000800 0000000000000000 WA 0 0 8
+
+There is no above issue on x86 due to the generated section flag is only
+"A" (allocatable). In order to silence the warning on LoongArch, specify
+the attribute like ".rodata..c_jump_table,\"a\",@progbits #" explicitly,
+then the section attribute of ".rodata..c_jump_table" must be readonly
+in the kernel/bpf/core.o file.
+
+Before:
+
+  $ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c"
+  21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3
+     CONTENTS, ALLOC, LOAD, RELOC, DATA
+
+After:
+
+  $ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c"
+  21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3
+     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
+
+By the way, AFAICT, maybe the root cause is related with the different
+compiler behavior of various archs, so to some extent this change is a
+workaround for LoongArch, and also there is no effect for x86 which is the
+only port supported by objtool before LoongArch with this patch.
+
+Link: https://lkml.kernel.org/r/20240924062710.1243-1-yangtiezhu@loongson.cn
+Signed-off-by: Tiezhu Yang
+Cc: Josh Poimboeuf
+Cc: Peter Zijlstra
+Cc: [6.9+]
+Signed-off-by: Andrew Morton
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/linux/compiler.h | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/include/linux/compiler.h
++++ b/include/linux/compiler.h
+@@ -133,7 +133,7 @@ void ftrace_likely_update(struct ftrace_
+ #define annotate_unreachable() __annotate_unreachable(__COUNTER__)
+ 
+ /* Annotate a C jump table to allow objtool to follow the code flow */
+-#define __annotate_jump_table __section(".rodata..c_jump_table")
++#define __annotate_jump_table __section(".rodata..c_jump_table,\"a\",@progbits #")
+ 
+ #else /* !CONFIG_OBJTOOL */
+ #define annotate_reachable()
diff --git a/queue-6.10/dm-verity-restart-or-panic-on-an-i-o-error.patch b/queue-6.10/dm-verity-restart-or-panic-on-an-i-o-error.patch
new file mode 100644
index 00000000000..35beaf7c1f0
--- /dev/null
+++ b/queue-6.10/dm-verity-restart-or-panic-on-an-i-o-error.patch
@@ -0,0 +1,69 @@
+From e6a3531dd542cb127c8de32ab1e54a48ae19962b Mon Sep 17 00:00:00 2001
+From: Mikulas Patocka
+Date: Tue, 24 Sep 2024 15:18:29 +0200
+Subject: dm-verity: restart or panic on an I/O error
+
+From: Mikulas Patocka
+
+commit e6a3531dd542cb127c8de32ab1e54a48ae19962b upstream.
+
+Maxim Suhanov reported that dm-verity doesn't crash if an I/O error
+happens. In theory, this could be used to subvert security, because an
+attacker can create sectors that return error with the Write Uncorrectable
+command. Some programs may misbehave if they have to deal with EIO.
+
+This commit fixes dm-verity, so that if "panic_on_corruption" or
+"restart_on_corruption" was specified and an I/O error happens, the
+machine will panic or restart.
+
+This commit also changes kernel_restart to emergency_restart -
+kernel_restart calls reboot notifiers and these reboot notifiers may wait
+for the bio that failed. emergency_restart doesn't call the notifiers.
+
+Reported-by: Maxim Suhanov
+Signed-off-by: Mikulas Patocka
+Cc: stable@vger.kernel.org
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/md/dm-verity-target.c | 23 +++++++++++++++++++++--
+ 1 file changed, 21 insertions(+), 2 deletions(-)
+
+--- a/drivers/md/dm-verity-target.c
++++ b/drivers/md/dm-verity-target.c
+@@ -265,8 +265,10 @@ out:
+ 	if (v->mode == DM_VERITY_MODE_LOGGING)
+ 		return 0;
+ 
+-	if (v->mode == DM_VERITY_MODE_RESTART)
+-		kernel_restart("dm-verity device corrupted");
++	if (v->mode == DM_VERITY_MODE_RESTART) {
++		pr_emerg("dm-verity device corrupted\n");
++		emergency_restart();
++	}
+ 
+ 	if (v->mode == DM_VERITY_MODE_PANIC)
+ 		panic("dm-verity device corrupted");
+@@ -691,6 +693,23 @@ static void verity_finish_io(struct dm_v
+ 	if (!static_branch_unlikely(&use_bh_wq_enabled) || !io->in_bh)
+ 		verity_fec_finish_io(io);
+ 
++	if (unlikely(status != BLK_STS_OK) &&
++	    unlikely(!(bio->bi_opf & REQ_RAHEAD)) &&
++	    !verity_is_system_shutting_down()) {
++		if (v->mode == DM_VERITY_MODE_RESTART ||
++		    v->mode == DM_VERITY_MODE_PANIC)
++			DMERR_LIMIT("%s has error: %s", v->data_dev->name,
++					blk_status_to_str(status));
++
++		if (v->mode == DM_VERITY_MODE_RESTART) {
++			pr_emerg("dm-verity device corrupted\n");
++			emergency_restart();
++		}
++
++		if (v->mode == DM_VERITY_MODE_PANIC)
++			panic("dm-verity device corrupted");
++	}
++
+ 	bio_endio(bio);
+ }
+ 
diff --git a/queue-6.10/exfat-resolve-memory-leak-from-exfat_create_upcase_table.patch b/queue-6.10/exfat-resolve-memory-leak-from-exfat_create_upcase_table.patch
new file mode 100644
index 00000000000..0a27b50c59d
--- /dev/null
+++ b/queue-6.10/exfat-resolve-memory-leak-from-exfat_create_upcase_table.patch
@@ -0,0 +1,47 @@
+From c290fe508eee36df1640c3cb35dc8f89e073c8a8 Mon Sep 17 00:00:00 2001
+From: Daniel Yang
+Date: Mon, 16 Sep 2024 16:05:06 -0700
+Subject: exfat: resolve memory leak from exfat_create_upcase_table()
+
+From: Daniel Yang
+
+commit c290fe508eee36df1640c3cb35dc8f89e073c8a8 upstream.
+
+If exfat_load_upcase_table reaches end and returns -EINVAL,
+allocated memory doesn't get freed and while
+exfat_load_default_upcase_table allocates more memory, leading to a
+memory leak.
+
+Here's link to syzkaller crash report illustrating this issue:
+https://syzkaller.appspot.com/text?tag=CrashReport&x=1406c201980000
+
+Reported-by: syzbot+e1c69cadec0f1a078e3d@syzkaller.appspotmail.com
+Fixes: a13d1a4de3b0 ("exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper")
+Cc: stable@vger.kernel.org
+Signed-off-by: Daniel Yang
+Signed-off-by: Namjae Jeon
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/exfat/nls.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c
+index afdf13c34ff5..1ac011088ce7 100644
+--- a/fs/exfat/nls.c
++++ b/fs/exfat/nls.c
+@@ -779,8 +779,11 @@ int exfat_create_upcase_table(struct super_block *sb)
+ 			le32_to_cpu(ep->dentry.upcase.checksum));
+ 
+ 	brelse(bh);
+-	if (ret && ret != -EIO)
++	if (ret && ret != -EIO) {
++		/* free memory from exfat_load_upcase_table call */
++		exfat_free_upcase_table(sbi);
+ 		goto load_default;
++	}
+ 
+ 	/* load successfully */
+ 	return ret;
+-- 
+2.46.2
+
diff --git a/queue-6.10/fbdev-xen-fbfront-assign-fb_info-device.patch b/queue-6.10/fbdev-xen-fbfront-assign-fb_info-device.patch
new file mode 100644
index 00000000000..e45b49225c3
--- /dev/null
+++ b/queue-6.10/fbdev-xen-fbfront-assign-fb_info-device.patch
@@ -0,0 +1,48 @@
+From c2af2a45560bd4046c2e109152acde029ed0acc2 Mon Sep 17 00:00:00 2001
+From: Jason Andryuk
+Date: Mon, 9 Sep 2024 22:09:16 -0400
+Subject: fbdev: xen-fbfront: Assign fb_info->device
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Jason Andryuk
+
+commit c2af2a45560bd4046c2e109152acde029ed0acc2 upstream.
+
+Probing xen-fbfront faults in video_is_primary_device(). The passed-in
+struct device is NULL since xen-fbfront doesn't assign it and the
+memory is kzalloc()-ed. Assign fb_info->device to avoid this.
+
+This was exposed by the conversion of fb_is_primary_device() to
+video_is_primary_device() which dropped a NULL check for struct device.
+
+Fixes: f178e96de7f0 ("arch: Remove struct fb_info from video helpers")
+Reported-by: Arthur Borsboom
+Closes: https://lore.kernel.org/xen-devel/CALUcmUncX=LkXWeiSiTKsDY-cOe8QksWhFvcCneOKfrKd0ZajA@mail.gmail.com/
+Tested-by: Arthur Borsboom
+CC: stable@vger.kernel.org
+Signed-off-by: Jason Andryuk
+Reviewed-by: Roger Pau Monné
+Reviewed-by: Thomas Zimmermann
+Signed-off-by: Helge Deller
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/video/fbdev/xen-fbfront.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c
+index 66d4628a96ae..c90f48ebb15e 100644
+--- a/drivers/video/fbdev/xen-fbfront.c
++++ b/drivers/video/fbdev/xen-fbfront.c
+@@ -407,6 +407,7 @@ static int xenfb_probe(struct xenbus_device *dev,
+ 	/* complete the abuse: */
+ 	fb_info->pseudo_palette = fb_info->par;
+ 	fb_info->par = info;
++	fb_info->device = &dev->dev;
+ 
+ 	fb_info->screen_buffer = info->fb;
+ 
+-- 
+2.46.2
+
diff --git a/queue-6.10/i2c-aspeed-update-the-stop-sw-state-when-the-bus-recovery-occurs.patch b/queue-6.10/i2c-aspeed-update-the-stop-sw-state-when-the-bus-recovery-occurs.patch
new file mode 100644
index 00000000000..ac5021c82e5
--- /dev/null
+++ b/queue-6.10/i2c-aspeed-update-the-stop-sw-state-when-the-bus-recovery-occurs.patch
@@ -0,0 +1,63 @@
+From 93701d3b84ac5f3ea07259d4ced405c53d757985 Mon Sep 17 00:00:00 2001
+From: Tommy Huang
+Date: Wed, 11 Sep 2024 17:39:51 +0800
+Subject: i2c: aspeed: Update the stop sw state when the bus recovery occurs
+
+From: Tommy Huang
+
+commit 93701d3b84ac5f3ea07259d4ced405c53d757985 upstream.
+
+When the i2c bus recovery occurs, driver will send i2c stop command
+in the scl low condition. In this case the sw state will still keep
+original situation. Under multi-master usage, i2c bus recovery will
+be called when i2c transfer timeout occurs. Update the stop command
+calling with aspeed_i2c_do_stop function to update master_state.
+
+Fixes: f327c686d3ba ("i2c: aspeed: added driver for Aspeed I2C")
+Cc: stable@vger.kernel.org # v4.13+
+Signed-off-by: Tommy Huang
+Signed-off-by: Andi Shyti
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/i2c/busses/i2c-aspeed.c | 16 ++++++++--------
+ 1 file changed, 8 insertions(+), 8 deletions(-)
+
+--- a/drivers/i2c/busses/i2c-aspeed.c
++++ b/drivers/i2c/busses/i2c-aspeed.c
+@@ -170,6 +170,13 @@ struct aspeed_i2c_bus {
+ 
+ static int aspeed_i2c_reset(struct aspeed_i2c_bus *bus);
+ 
++/* precondition: bus.lock has been acquired. */
++static void aspeed_i2c_do_stop(struct aspeed_i2c_bus *bus)
++{
++	bus->master_state = ASPEED_I2C_MASTER_STOP;
++	writel(ASPEED_I2CD_M_STOP_CMD, bus->base + ASPEED_I2C_CMD_REG);
++}
++
+ static int aspeed_i2c_recover_bus(struct aspeed_i2c_bus *bus)
+ {
+ 	unsigned long time_left, flags;
+@@ -187,7 +194,7 @@ static int aspeed_i2c_recover_bus(struct
+ 			command);
+ 
+ 		reinit_completion(&bus->cmd_complete);
+-		writel(ASPEED_I2CD_M_STOP_CMD, bus->base + ASPEED_I2C_CMD_REG);
++		aspeed_i2c_do_stop(bus);
+ 		spin_unlock_irqrestore(&bus->lock, flags);
+ 
+ 		time_left = wait_for_completion_timeout(
+@@ -391,13 +398,6 @@ static void aspeed_i2c_do_start(struct a
+ }
+ 
+ /* precondition: bus.lock has been acquired. */
+-static void aspeed_i2c_do_stop(struct aspeed_i2c_bus *bus)
+-{
+-	bus->master_state = ASPEED_I2C_MASTER_STOP;
+-	writel(ASPEED_I2CD_M_STOP_CMD, bus->base + ASPEED_I2C_CMD_REG);
+-}
+-
+-/* precondition: bus.lock has been acquired. */
+ static void aspeed_i2c_next_msg_or_stop(struct aspeed_i2c_bus *bus)
+ {
+ 	if (bus->msgs_index + 1 < bus->msgs_count) {
diff --git a/queue-6.10/i2c-isch-add-missed-else.patch b/queue-6.10/i2c-isch-add-missed-else.patch
new file mode 100644
index 00000000000..dfbf4dedfbe
--- /dev/null
+++ b/queue-6.10/i2c-isch-add-missed-else.patch
@@ -0,0 +1,34 @@
+From 1db4da55070d6a2754efeb3743f5312fc32f5961 Mon Sep 17 00:00:00 2001
+From: Andy Shevchenko
+Date: Wed, 11 Sep 2024 18:39:14 +0300
+Subject: i2c: isch: Add missed 'else'
+
+From: Andy Shevchenko
+
+commit 1db4da55070d6a2754efeb3743f5312fc32f5961 upstream.
+
+In accordance with the existing comment and code analysis
+it is quite likely that there is a missed 'else' when adapter
+times out. Add it.
+
+Fixes: 5bc1200852c3 ("i2c: Add Intel SCH SMBus support")
+Signed-off-by: Andy Shevchenko
+Cc: # v2.6.27+
+Signed-off-by: Andi Shyti
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/i2c/busses/i2c-isch.c | 3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+--- a/drivers/i2c/busses/i2c-isch.c
++++ b/drivers/i2c/busses/i2c-isch.c
+@@ -99,8 +99,7 @@ static int sch_transaction(void)
+ 	if (retries > MAX_RETRIES) {
+ 		dev_err(&sch_adapter.dev, "SMBus Timeout!\n");
+ 		result = -ETIMEDOUT;
+-	}
+-	if (temp & 0x04) {
++	} else if (temp & 0x04) {
+ 		result = -EIO;
+ 		dev_dbg(&sch_adapter.dev, "Bus collision! SMBus may be "
+ 			"locked until next hard reset. (sorry!)\n");
diff --git a/queue-6.10/lockdep-fix-deadlock-issue-between-lockdep-and-rcu.patch b/queue-6.10/lockdep-fix-deadlock-issue-between-lockdep-and-rcu.patch
new file mode 100644
index 00000000000..e6c7e1cd199
--- /dev/null
+++ b/queue-6.10/lockdep-fix-deadlock-issue-between-lockdep-and-rcu.patch
@@ -0,0 +1,215 @@
+From a6f88ac32c6e63e69c595bfae220d8641704c9b7 Mon Sep 17 00:00:00 2001
+From: Zhiguo Niu
+Date: Thu, 20 Jun 2024 22:54:34 +0000
+Subject: lockdep: fix deadlock issue between lockdep and rcu
+
+From: Zhiguo Niu
+
+commit a6f88ac32c6e63e69c595bfae220d8641704c9b7 upstream.
+
+There is a deadlock scenario between lockdep and rcu when
+rcu nocb feature is enabled, just as following call stack:
+
+	 rcuop/x
+-000|queued_spin_lock_slowpath(lock = 0xFFFFFF817F2A8A80, val = ?)
+-001|queued_spin_lock(inline) // try to hold nocb_gp_lock
+-001|do_raw_spin_lock(lock = 0xFFFFFF817F2A8A80)
+-002|__raw_spin_lock_irqsave(inline)
+-002|_raw_spin_lock_irqsave(lock = 0xFFFFFF817F2A8A80)
+-003|wake_nocb_gp_defer(inline)
+-003|__call_rcu_nocb_wake(rdp = 0xFFFFFF817F30B680)
+-004|__call_rcu_common(inline)
+-004|call_rcu(head = 0xFFFFFFC082EECC28, func = ?)
+-005|call_rcu_zapped(inline)
+-005|free_zapped_rcu(ch = ?)// hold graph lock
+-006|rcu_do_batch(rdp = 0xFFFFFF817F245680)
+-007|nocb_cb_wait(inline)
+-007|rcu_nocb_cb_kthread(arg = 0xFFFFFF817F245680)
+-008|kthread(_create = 0xFFFFFF80803122C0)
+-009|ret_from_fork(asm)
+
+	 rcuop/y
+-000|queued_spin_lock_slowpath(lock = 0xFFFFFFC08291BBC8, val = 0)
+-001|queued_spin_lock()
+-001|lockdep_lock()
+-001|graph_lock() // try to hold graph lock
+-002|lookup_chain_cache_add()
+-002|validate_chain()
+-003|lock_acquire
+-004|_raw_spin_lock_irqsave(lock = 0xFFFFFF817F211D80)
+-005|lock_timer_base(inline)
+-006|mod_timer(inline)
+-006|wake_nocb_gp_defer(inline)// hold nocb_gp_lock
+-006|__call_rcu_nocb_wake(rdp = 0xFFFFFF817F2A8680)
+-007|__call_rcu_common(inline)
+-007|call_rcu(head = 0xFFFFFFC0822E0B58, func = ?)
+-008|call_rcu_hurry(inline)
+-008|rcu_sync_call(inline)
+-008|rcu_sync_func(rhp = 0xFFFFFFC0822E0B58)
+-009|rcu_do_batch(rdp = 0xFFFFFF817F266680)
+-010|nocb_cb_wait(inline)
+-010|rcu_nocb_cb_kthread(arg = 0xFFFFFF817F266680)
+-011|kthread(_create = 0xFFFFFF8080363740)
+-012|ret_from_fork(asm)
+
+rcuop/x and rcuop/y are rcu nocb threads with the same nocb gp thread.
+This patch release the graph lock before lockdep call_rcu.
+
+Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
+Cc: stable@vger.kernel.org
+Cc: Boqun Feng
+Cc: Waiman Long
+Cc: Carlos Llamas
+Cc: Bart Van Assche
+Signed-off-by: Zhiguo Niu
+Signed-off-by: Xuewen Yan
+Reviewed-by: Waiman Long
+Reviewed-by: Carlos Llamas
+Reviewed-by: Bart Van Assche
+Signed-off-by: Carlos Llamas
+Acked-by: Paul E. McKenney
+Signed-off-by: Boqun Feng
+Link: https://lore.kernel.org/r/20240620225436.3127927-1-cmllamas@google.com
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/locking/lockdep.c | 48 +++++++++++++++++++++++++++++++----------------
+ 1 file changed, 32 insertions(+), 16 deletions(-)
+
+--- a/kernel/locking/lockdep.c
++++ b/kernel/locking/lockdep.c
+@@ -6184,25 +6184,27 @@ static struct pending_free *get_pending_
+ static void free_zapped_rcu(struct rcu_head *cb);
+ 
+ /*
+- * Schedule an RCU callback if no RCU callback is pending. Must be called with
+- * the graph lock held.
+- */
+-static void call_rcu_zapped(struct pending_free *pf)
++* See if we need to queue an RCU callback, must called with
++* the lockdep lock held, returns false if either we don't have
++* any pending free or the callback is already scheduled.
++* Otherwise, a call_rcu() must follow this function call.
++*/
++static bool prepare_call_rcu_zapped(struct pending_free *pf)
+ {
+ 	WARN_ON_ONCE(inside_selftest());
+ 
+ 	if (list_empty(&pf->zapped))
+-		return;
++		return false;
+ 
+ 	if (delayed_free.scheduled)
+-		return;
++		return false;
+ 
+ 	delayed_free.scheduled = true;
+ 
+ 	WARN_ON_ONCE(delayed_free.pf + delayed_free.index != pf);
+ 	delayed_free.index ^= 1;
+ 
+-	call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
++	return true;
+ }
+ 
+ /* The caller must hold the graph lock. May be called from RCU context. */
+@@ -6228,6 +6230,7 @@ static void free_zapped_rcu(struct rcu_h
+ {
+ 	struct pending_free *pf;
+ 	unsigned long flags;
++	bool need_callback;
+ 
+ 	if (WARN_ON_ONCE(ch != &delayed_free.rcu_head))
+ 		return;
+@@ -6239,14 +6242,18 @@ static void free_zapped_rcu(struct rcu_h
+ 	pf = delayed_free.pf + (delayed_free.index ^ 1);
+ 	__free_zapped_classes(pf);
+ 	delayed_free.scheduled = false;
++	need_callback =
++		prepare_call_rcu_zapped(delayed_free.pf + delayed_free.index);
++	lockdep_unlock();
++	raw_local_irq_restore(flags);
+ 
+ 	/*
+-	 * If there's anything on the open list, close and start a new callback.
+-	 */
+-	call_rcu_zapped(delayed_free.pf + delayed_free.index);
++	 * If there's pending free and its callback has not been scheduled,
++	 * queue an RCU callback.
++	 */
++	if (need_callback)
++		call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
+ 
+-	lockdep_unlock();
+-	raw_local_irq_restore(flags);
+ }
+ 
+ /*
+@@ -6286,6 +6293,7 @@ static void lockdep_free_key_range_reg(v
+ {
+ 	struct pending_free *pf;
+ 	unsigned long flags;
++	bool need_callback;
+ 
+ 	init_data_structures_once();
+ 
+@@ -6293,10 +6301,11 @@ static void lockdep_free_key_range_reg(v
+ 	lockdep_lock();
+ 	pf = get_pending_free();
+ 	__lockdep_free_key_range(pf, start, size);
+-	call_rcu_zapped(pf);
++	need_callback = prepare_call_rcu_zapped(pf);
+ 	lockdep_unlock();
+ 	raw_local_irq_restore(flags);
+-
++	if (need_callback)
++		call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
+ 	/*
+ 	 * Wait for any possible iterators from look_up_lock_class() to pass
+ 	 * before continuing to free the memory they refer to.
+@@ -6390,6 +6399,7 @@ static void lockdep_reset_lock_reg(struc
+ 	struct pending_free *pf;
+ 	unsigned long flags;
+ 	int locked;
++	bool need_callback = false;
+ 
+ 	raw_local_irq_save(flags);
+ 	locked = graph_lock();
+@@ -6398,11 +6408,13 @@ static void lockdep_reset_lock_reg(struc
+ 
+ 	pf = get_pending_free();
+ 	__lockdep_reset_lock(pf, lock);
+-	call_rcu_zapped(pf);
++	need_callback = prepare_call_rcu_zapped(pf);
+ 
+ 	graph_unlock();
+ out_irq:
+ 	raw_local_irq_restore(flags);
++	if (need_callback)
++		call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
+ }
+ 
+ /*
+@@ -6446,6 +6458,7 @@ void lockdep_unregister_key(struct lock_
+ 	struct pending_free *pf;
+ 	unsigned long flags;
+ 	bool found = false;
++	bool need_callback = false;
+ 
+ 	might_sleep();
+ 
+@@ -6466,11 +6479,14 @@ void lockdep_unregister_key(struct lock_
+ 	if (found) {
+ 		pf = get_pending_free();
+ 		__lockdep_free_key_range(pf, key, 1);
+-		call_rcu_zapped(pf);
++		need_callback = prepare_call_rcu_zapped(pf);
+ 	}
+ 	lockdep_unlock();
+ 	raw_local_irq_restore(flags);
+ 
++	if (need_callback)
++		call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
++
+ 	/* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
+ 	synchronize_rcu();
+ }
diff --git a/queue-6.10/mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch b/queue-6.10/mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch
new file mode 100644
index 00000000000..4833566eae4
--- /dev/null
+++ b/queue-6.10/mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch
@@ -0,0 +1,85 @@
+From 2a058ab3286d6475b2082b90c2d2182d2fea4b39 Mon Sep 17 00:00:00 2001
+From: "Vishal Moola (Oracle)"
+Date: Sat, 14 Sep 2024 12:41:18 -0700
+Subject: mm: change vmf_anon_prepare() to __vmf_anon_prepare()
+
+From: Vishal Moola (Oracle)
+
+commit 2a058ab3286d6475b2082b90c2d2182d2fea4b39 upstream.
+
+Some callers of vmf_anon_prepare() may not want us to release the per-VMA
+lock ourselves. Rename vmf_anon_prepare() to __vmf_anon_prepare() and let
+the callers drop the lock when desired.
+
+Also, make vmf_anon_prepare() a wrapper that releases the per-VMA lock
+itself for any callers that don't care.
+
+This is in preparation to fix this bug reported by syzbot:
+https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
+
+Link: https://lkml.kernel.org/r/20240914194243.245-1-vishal.moola@gmail.com
+Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
+Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
+Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
+Signed-off-by: Vishal Moola (Oracle)
+Cc: Muchun Song
+Cc:
+Signed-off-by: Andrew Morton
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/internal.h | 11 ++++++++++-
+ mm/memory.c | 8 +++-----
+ 2 files changed, 13 insertions(+), 6 deletions(-)
+
+--- a/mm/internal.h
++++ b/mm/internal.h
+@@ -293,7 +293,16 @@ static inline void wake_throttle_isolate
+ 	wake_up(wqh);
+ }
+ 
+-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf);
++vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf);
++static inline vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
++{
++	vm_fault_t ret = __vmf_anon_prepare(vmf);
++
++	if (unlikely(ret & VM_FAULT_RETRY))
++		vma_end_read(vmf->vma);
++	return ret;
++}
++
+ vm_fault_t do_swap_page(struct vm_fault *vmf);
+ void folio_rotate_reclaimable(struct folio *folio);
+ bool __folio_end_writeback(struct folio *folio);
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -3226,7 +3226,7 @@ static inline vm_fault_t vmf_can_call_fa
+ }
+ 
+ /**
+- * vmf_anon_prepare - Prepare to handle an anonymous fault.
++ * __vmf_anon_prepare - Prepare to handle an anonymous fault.
+  * @vmf: The vm_fault descriptor passed from the fault handler.
+  *
+  * When preparing to insert an anonymous page into a VMA from a
+@@ -3240,7 +3240,7 @@ static inline vm_fault_t vmf_can_call_fa
+  * Return: 0 if fault handling can proceed. Any other value should be
+  * returned to the caller.
+  */
+-vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
++vm_fault_t __vmf_anon_prepare(struct vm_fault *vmf)
+ {
+ 	struct vm_area_struct *vma = vmf->vma;
+ 	vm_fault_t ret = 0;
+@@ -3248,10 +3248,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fa
+ 	if (likely(vma->anon_vma))
+ 		return 0;
+ 	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+-		if (!mmap_read_trylock(vma->vm_mm)) {
+-			vma_end_read(vma);
++		if (!mmap_read_trylock(vma->vm_mm))
+ 			return VM_FAULT_RETRY;
+-		}
+ 	}
+ 	if (__anon_vma_prepare(vma))
+ 		ret = VM_FAULT_OOM;
diff --git a/queue-6.10/mm-damon-vaddr-protect-vma-traversal-in-__damon_va_thre_regions-with-rcu-read-lock.patch b/queue-6.10/mm-damon-vaddr-protect-vma-traversal-in-__damon_va_thre_regions-with-rcu-read-lock.patch
new file mode 100644
index 00000000000..2458344ae9a
--- /dev/null
+++ b/queue-6.10/mm-damon-vaddr-protect-vma-traversal-in-__damon_va_thre_regions-with-rcu-read-lock.patch
@@ -0,0 +1,47 @@
+From fb497d6db7c19c797cbd694b52d1af87c4eebcc6 Mon Sep 17 00:00:00 2001
+From: "Liam R. Howlett"
+Date: Wed, 4 Sep 2024 17:12:04 -0700
+Subject: mm/damon/vaddr: protect vma traversal in __damon_va_thre_regions() with rcu read lock
+
+From: Liam R. Howlett
+
+commit fb497d6db7c19c797cbd694b52d1af87c4eebcc6 upstream.
+
+Traversing VMAs of a given maple tree should be protected by rcu read
+lock. However, __damon_va_three_regions() is not doing the protection.
+Hold the lock.
+
+Link: https://lkml.kernel.org/r/20240905001204.1481-1-sj@kernel.org
+Fixes: d0cf3dd47f0d ("damon: convert __damon_va_three_regions to use the VMA iterator")
+Signed-off-by: Liam R. Howlett
+Signed-off-by: SeongJae Park
+Reported-by: Guenter Roeck
+Closes: https://lore.kernel.org/b83651a0-5b24-4206-b860-cb54ffdf209b@roeck-us.net
+Tested-by: Guenter Roeck
+Cc: David Hildenbrand
+Cc: Matthew Wilcox
+Cc:
+Signed-off-by: Andrew Morton
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/damon/vaddr.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/mm/damon/vaddr.c
++++ b/mm/damon/vaddr.c
+@@ -126,6 +126,7 @@ static int __damon_va_three_regions(stru
+ 	 * If this is too slow, it can be optimised to examine the maple
+ 	 * tree gaps.
+ 	 */
++	rcu_read_lock();
+ 	for_each_vma(vmi, vma) {
+ 		unsigned long gap;
+ 
+@@ -146,6 +147,7 @@ static int __damon_va_three_regions(stru
+ next:
+ 		prev = vma;
+ 	}
++	rcu_read_unlock();
+ 
+ 	if (!sz_range(&second_gap) || !sz_range(&first_gap))
+ 		return -EINVAL;
diff --git a/queue-6.10/mm-huge_memory-ensure-huge_zero_folio-won-t-have-large_rmappable-flag-set.patch b/queue-6.10/mm-huge_memory-ensure-huge_zero_folio-won-t-have-large_rmappable-flag-set.patch
new file mode 100644
index 00000000000..db15e416e39
--- /dev/null
+++ b/queue-6.10/mm-huge_memory-ensure-huge_zero_folio-won-t-have-large_rmappable-flag-set.patch
@@ -0,0 +1,35 @@
+From 2a1b8648d9be9f37f808a36c0f74adb8c53d06e6 Mon Sep 17 00:00:00 2001
+From: Miaohe Lin
+Date: Sat, 14 Sep 2024 09:53:06 +0800
+Subject: mm/huge_memory: ensure huge_zero_folio won't have large_rmappable flag set
+
+From: Miaohe Lin
+
+commit 2a1b8648d9be9f37f808a36c0f74adb8c53d06e6 upstream.
+
+Ensure huge_zero_folio won't have large_rmappable flag set. So it can be
+reported as thp,zero correctly through stable_page_flags().
+
+Link: https://lkml.kernel.org/r/20240914015306.3656791-1-linmiaohe@huawei.com
+Fixes: 5691753d73a2 ("mm: convert huge_zero_page to huge_zero_folio")
+Signed-off-by: Miaohe Lin
+Cc: David Hildenbrand
+Cc: Matthew Wilcox (Oracle)
+Cc:
+Signed-off-by: Andrew Morton
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/huge_memory.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/mm/huge_memory.c
++++ b/mm/huge_memory.c
+@@ -214,6 +214,8 @@ retry:
+ 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
+ 		return false;
+ 	}
++	/* Ensure zero folio won't have large_rmappable flag set. */
++	folio_clear_large_rmappable(zero_folio);
+ 	preempt_disable();
+ 	if (cmpxchg(&huge_zero_folio, NULL, zero_folio)) {
+ 		preempt_enable();
diff --git a/queue-6.10/mm-hugetlb.c-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch b/queue-6.10/mm-hugetlb.c-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch
new file mode 100644
index 00000000000..169679ffd94
--- /dev/null
+++ b/queue-6.10/mm-hugetlb.c-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch
@@ -0,0 +1,80 @@
+From 98b74bb4d7e96b4da5ef3126511febe55b76b807 Mon Sep 17 00:00:00 2001
+From: "Vishal Moola (Oracle)"
+Date: Sat, 14 Sep 2024 12:41:19 -0700
+Subject: mm/hugetlb.c: fix UAF of vma in hugetlb fault pathway
+
+From: Vishal Moola (Oracle)
+
+commit 98b74bb4d7e96b4da5ef3126511febe55b76b807 upstream.
+
+Syzbot reports a UAF in hugetlb_fault(). This happens because
+vmf_anon_prepare() could drop the per-VMA lock and allow the current VMA
+to be freed before hugetlb_vma_unlock_read() is called.
+
+We can fix this by using a modified version of vmf_anon_prepare() that
+doesn't release the VMA lock on failure, and then release it ourselves
+after hugetlb_vma_unlock_read().
+
+Link: https://lkml.kernel.org/r/20240914194243.245-2-vishal.moola@gmail.com
+Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
+Reported-by: syzbot+2dab93857ee95f2eeb08@syzkaller.appspotmail.com
+Closes: https://lore.kernel.org/linux-mm/00000000000067c20b06219fbc26@google.com/
+Signed-off-by: Vishal Moola (Oracle)
+Cc: Muchun Song
+Cc:
+Signed-off-by: Andrew Morton
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/hugetlb.c | 20 ++++++++++++++++++--
+ 1 file changed, 18 insertions(+), 2 deletions(-)
+
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -6075,7 +6075,7 @@ retry_avoidcopy:
+ 	 * When the original hugepage is shared one, it does not have
+ 	 * anon_vma prepared.
+ 	 */
+-	ret = vmf_anon_prepare(vmf);
++	ret = __vmf_anon_prepare(vmf);
+ 	if (unlikely(ret))
+ 		goto out_release_all;
+ 
+@@ -6274,7 +6274,7 @@ static vm_fault_t hugetlb_no_page(struct
+ 	}
+ 
+ 	if (!(vma->vm_flags & VM_MAYSHARE)) {
+-		ret = vmf_anon_prepare(vmf);
++		ret = __vmf_anon_prepare(vmf);
+ 		if (unlikely(ret))
+ 			goto out;
+ 	}
+@@ -6406,6 +6406,14 @@ static vm_fault_t hugetlb_no_page(struct
+ 	folio_unlock(folio);
+ out:
+ 	hugetlb_vma_unlock_read(vma);
++
++	/*
++	 * We must check to release the per-VMA lock. __vmf_anon_prepare() is
++	 * the only way ret can be set to VM_FAULT_RETRY.
++	 */
++	if (unlikely(ret & VM_FAULT_RETRY))
++		vma_end_read(vma);
++
+ 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ 	return ret;
+ 
+@@ -6627,6 +6635,14 @@ out_ptl:
+ }
+ out_mutex:
+ 	hugetlb_vma_unlock_read(vma);
++
++	/*
++	 * We must check to release the per-VMA lock. __vmf_anon_prepare() in
++	 * hugetlb_wp() is the only way ret can be set to VM_FAULT_RETRY.
++	 */
++	if (unlikely(ret & VM_FAULT_RETRY))
++		vma_end_read(vma);
++
+ 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ 	/*
+ 	 * Generally it's safe to hold refcount during waiting page lock. But
diff --git a/queue-6.10/mm-hugetlb_vmemmap-batch-hvo-work-when-demoting.patch b/queue-6.10/mm-hugetlb_vmemmap-batch-hvo-work-when-demoting.patch
new file mode 100644
index 00000000000..8f08217c5ed
--- /dev/null
+++ b/queue-6.10/mm-hugetlb_vmemmap-batch-hvo-work-when-demoting.patch
@@ -0,0 +1,264 @@
+From c0f398c3b2cf67976bca216f80668b9c93368385 Mon Sep 17 00:00:00 2001
+From: Yu Zhao
+Date: Mon, 12 Aug 2024 16:48:23 -0600
+Subject: mm/hugetlb_vmemmap: batch HVO work when demoting
+
+From: Yu Zhao
+
+commit c0f398c3b2cf67976bca216f80668b9c93368385 upstream.
+
+Batch the HVO work, including de-HVO of the source and HVO of the
+destination hugeTLB folios, to speed up demotion.
+
+After commit bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative
+PFN walkers"), each request of HVO or de-HVO, batched or not, invokes
+synchronize_rcu() once. For example, when not batched, demoting one 1GB
+hugeTLB folio to 512 2MB hugeTLB folios invokes synchronize_rcu() 513
+times (1 de-HVO plus 512 HVO requests), whereas when batched, only twice
+(1 de-HVO plus 1 HVO request). And the performance difference between the
+two cases is significant, e.g.,
+
+  echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
+  time echo 100 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
+
+Before this patch:
+  real 8m58.158s
+  user 0m0.009s
+  sys 0m5.900s
+
+After this patch:
+  real 0m0.900s
+  user 0m0.000s
+  sys 0m0.851s
+
+Note that this patch changes the behavior of the `demote` interface when
+de-HVO fails. Before, the interface aborts immediately upon failure; now,
+it tries to finish an entire batch, meaning it can make extra progress if
+the rest of the batch contains folios that do not need to de-HVO.
+ +Link: https://lkml.kernel.org/r/20240812224823.3914837-1-yuzhao@google.com +Fixes: bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers") +Signed-off-by: Yu Zhao +Reviewed-by: Muchun Song +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Greg Kroah-Hartman +--- + mm/hugetlb.c | 156 ++++++++++++++++++++++++++++++++++------------------------- + 1 file changed, 92 insertions(+), 64 deletions(-) + +--- a/mm/hugetlb.c ++++ b/mm/hugetlb.c +@@ -3919,101 +3919,125 @@ out: + return 0; + } + +-static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio) ++static long demote_free_hugetlb_folios(struct hstate *src, struct hstate *dst, ++ struct list_head *src_list) + { +- int i, nid = folio_nid(folio); +- struct hstate *target_hstate; +- struct page *subpage; +- struct folio *inner_folio; +- int rc = 0; ++ long rc; ++ struct folio *folio, *next; ++ LIST_HEAD(dst_list); ++ LIST_HEAD(ret_list); + +- target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order); +- +- remove_hugetlb_folio(h, folio, false); +- spin_unlock_irq(&hugetlb_lock); +- +- /* +- * If vmemmap already existed for folio, the remove routine above would +- * have cleared the hugetlb folio flag. Hence the folio is technically +- * no longer a hugetlb folio. hugetlb_vmemmap_restore_folio can only be +- * passed hugetlb folios and will BUG otherwise. +- */ +- if (folio_test_hugetlb(folio)) { +- rc = hugetlb_vmemmap_restore_folio(h, folio); +- if (rc) { +- /* Allocation of vmemmmap failed, we can not demote folio */ +- spin_lock_irq(&hugetlb_lock); +- add_hugetlb_folio(h, folio, false); +- return rc; +- } +- } +- +- /* +- * Use destroy_compound_hugetlb_folio_for_demote for all huge page +- * sizes as it will not ref count folios. 
+- */ +- destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(h)); ++ rc = hugetlb_vmemmap_restore_folios(src, src_list, &ret_list); ++ list_splice_init(&ret_list, src_list); + + /* + * Taking target hstate mutex synchronizes with set_max_huge_pages. + * Without the mutex, pages added to target hstate could be marked + * as surplus. + * +- * Note that we already hold h->resize_lock. To prevent deadlock, ++ * Note that we already hold src->resize_lock. To prevent deadlock, + * use the convention of always taking larger size hstate mutex first. + */ +- mutex_lock(&target_hstate->resize_lock); +- for (i = 0; i < pages_per_huge_page(h); +- i += pages_per_huge_page(target_hstate)) { +- subpage = folio_page(folio, i); +- inner_folio = page_folio(subpage); +- if (hstate_is_gigantic(target_hstate)) +- prep_compound_gigantic_folio_for_demote(inner_folio, +- target_hstate->order); +- else +- prep_compound_page(subpage, target_hstate->order); +- folio_change_private(inner_folio, NULL); +- prep_new_hugetlb_folio(target_hstate, inner_folio, nid); +- free_huge_folio(inner_folio); ++ mutex_lock(&dst->resize_lock); ++ ++ list_for_each_entry_safe(folio, next, src_list, lru) { ++ int i; ++ ++ if (folio_test_hugetlb_vmemmap_optimized(folio)) ++ continue; ++ ++ list_del(&folio->lru); ++ /* ++ * Use destroy_compound_hugetlb_folio_for_demote for all huge page ++ * sizes as it will not ref count folios. 
++ */ ++ destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(src)); ++ ++ for (i = 0; i < pages_per_huge_page(src); i += pages_per_huge_page(dst)) { ++ struct page *page = folio_page(folio, i); ++ ++ if (hstate_is_gigantic(dst)) ++ prep_compound_gigantic_folio_for_demote(page_folio(page), ++ dst->order); ++ else ++ prep_compound_page(page, dst->order); ++ set_page_private(page, 0); ++ ++ init_new_hugetlb_folio(dst, page_folio(page)); ++ list_add(&page->lru, &dst_list); ++ } + } +- mutex_unlock(&target_hstate->resize_lock); + +- spin_lock_irq(&hugetlb_lock); ++ prep_and_add_allocated_folios(dst, &dst_list); + +- /* +- * Not absolutely necessary, but for consistency update max_huge_pages +- * based on pool changes for the demoted page. +- */ +- h->max_huge_pages--; +- target_hstate->max_huge_pages += +- pages_per_huge_page(h) / pages_per_huge_page(target_hstate); ++ mutex_unlock(&dst->resize_lock); + + return rc; + } + +-static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed) ++static long demote_pool_huge_page(struct hstate *src, nodemask_t *nodes_allowed, ++ unsigned long nr_to_demote) + __must_hold(&hugetlb_lock) + { + int nr_nodes, node; +- struct folio *folio; ++ struct hstate *dst; ++ long rc = 0; ++ long nr_demoted = 0; + + lockdep_assert_held(&hugetlb_lock); + + /* We should never get here if no demote order */ +- if (!h->demote_order) { ++ if (!src->demote_order) { + pr_warn("HugeTLB: NULL demote order passed to demote_pool_huge_page.\n"); + return -EINVAL; /* internal error */ + } ++ dst = size_to_hstate(PAGE_SIZE << src->demote_order); + +- for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) { +- list_for_each_entry(folio, &h->hugepage_freelists[node], lru) { ++ for_each_node_mask_to_free(src, nr_nodes, node, nodes_allowed) { ++ LIST_HEAD(list); ++ struct folio *folio, *next; ++ ++ list_for_each_entry_safe(folio, next, &src->hugepage_freelists[node], lru) { + if (folio_test_hwpoison(folio)) + continue; +- return 
demote_free_hugetlb_folio(h, folio); ++ ++ remove_hugetlb_folio(src, folio, false); ++ list_add(&folio->lru, &list); ++ ++ if (++nr_demoted == nr_to_demote) ++ break; ++ } ++ ++ spin_unlock_irq(&hugetlb_lock); ++ ++ rc = demote_free_hugetlb_folios(src, dst, &list); ++ ++ spin_lock_irq(&hugetlb_lock); ++ ++ list_for_each_entry_safe(folio, next, &list, lru) { ++ list_del(&folio->lru); ++ add_hugetlb_folio(src, folio, false); ++ ++ nr_demoted--; + } ++ ++ if (rc < 0 || nr_demoted == nr_to_demote) ++ break; + } + + /* ++ * Not absolutely necessary, but for consistency update max_huge_pages ++ * based on pool changes for the demoted page. ++ */ ++ src->max_huge_pages -= nr_demoted; ++ dst->max_huge_pages += nr_demoted << (huge_page_order(src) - huge_page_order(dst)); ++ ++ if (rc < 0) ++ return rc; ++ ++ if (nr_demoted) ++ return nr_demoted; ++ /* + * Only way to get here is if all pages on free lists are poisoned. + * Return -EBUSY so that caller will not retry. + */ +@@ -4247,6 +4271,8 @@ static ssize_t demote_store(struct kobje + spin_lock_irq(&hugetlb_lock); + + while (nr_demote) { ++ long rc; ++ + /* + * Check for available pages to demote each time thorough the + * loop as demote_pool_huge_page will drop hugetlb_lock. 
+@@ -4259,11 +4285,13 @@ static ssize_t demote_store(struct kobje + if (!nr_available) + break; + +- err = demote_pool_huge_page(h, n_mask); +- if (err) ++ rc = demote_pool_huge_page(h, n_mask, nr_demote); ++ if (rc < 0) { ++ err = rc; + break; ++ } + +- nr_demote--; ++ nr_demote -= rc; + } + + spin_unlock_irq(&hugetlb_lock); diff --git a/queue-6.10/mm-only-enforce-minimum-stack-gap-size-if-it-s-sensible.patch b/queue-6.10/mm-only-enforce-minimum-stack-gap-size-if-it-s-sensible.patch new file mode 100644 index 00000000000..1149281eece --- /dev/null +++ b/queue-6.10/mm-only-enforce-minimum-stack-gap-size-if-it-s-sensible.patch @@ -0,0 +1,51 @@ +From 69b50d4351ed924f29e3d46b159e28f70dfc707f Mon Sep 17 00:00:00 2001 +From: David Gow +Date: Sat, 3 Aug 2024 15:46:41 +0800 +Subject: mm: only enforce minimum stack gap size if it's sensible + +From: David Gow + +commit 69b50d4351ed924f29e3d46b159e28f70dfc707f upstream. + +The generic mmap_base code tries to leave a gap between the top of the +stack and the mmap base address, but enforces a minimum gap size (MIN_GAP) +of 128MB, which is too large on some setups. In particular, on arm tasks +without ADDR_LIMIT_32BIT, the STACK_TOP value is less than 128MB, so it's +impossible to fit such a gap in. + +Only enforce this minimum if MIN_GAP < MAX_GAP, as we'd prefer to honour +MAX_GAP, which is defined proportionally, so scales better and always +leaves us with both _some_ stack space and some room for mmap. + +This fixes the usercopy KUnit test suite on 32-bit arm, as it doesn't set +any personality flags so gets the default (in this case 26-bit) task size. 
+This test can be run with: ./tools/testing/kunit/kunit.py run --arch arm +usercopy --make_options LLVM=1 + +Link: https://lkml.kernel.org/r/20240803074642.1849623-2-davidgow@google.com +Fixes: dba79c3df4a2 ("arm: use generic mmap top-down layout and brk randomization") +Signed-off-by: David Gow +Reviewed-by: Kees Cook +Cc: Alexandre Ghiti +Cc: Linus Walleij +Cc: Luis Chamberlain +Cc: Mark Rutland +Cc: Russell King +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Greg Kroah-Hartman +--- + mm/util.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/mm/util.c ++++ b/mm/util.c +@@ -451,7 +451,7 @@ static unsigned long mmap_base(unsigned + if (gap + pad > gap) + gap += pad; + +- if (gap < MIN_GAP) ++ if (gap < MIN_GAP && MIN_GAP < MAX_GAP) + gap = MIN_GAP; + else if (gap > MAX_GAP) + gap = MAX_GAP; diff --git a/queue-6.10/module-fix-kcov-ignored-file-name.patch b/queue-6.10/module-fix-kcov-ignored-file-name.patch new file mode 100644 index 00000000000..4f0db74c2ae --- /dev/null +++ b/queue-6.10/module-fix-kcov-ignored-file-name.patch @@ -0,0 +1,36 @@ +From f34d086fb7102fec895fd58b9e816b981b284c17 Mon Sep 17 00:00:00 2001 +From: Dmitry Vyukov +Date: Tue, 11 Jun 2024 09:50:32 +0200 +Subject: module: Fix KCOV-ignored file name + +From: Dmitry Vyukov + +commit f34d086fb7102fec895fd58b9e816b981b284c17 upstream. + +module.c was renamed to main.c, but the Makefile directive was copy-pasted +verbatim with the old file name. Fix up the file name. 
+ +Fixes: cfc1d277891e ("module: Move all into module/") +Signed-off-by: Dmitry Vyukov +Signed-off-by: Thomas Gleixner +Reviewed-by: Alexander Potapenko +Reviewed-by: Marco Elver +Reviewed-by: Andrey Konovalov +Cc: stable@vger.kernel.org +Link: https://lore.kernel.org/all/bc0cf790b4839c5e38e2fafc64271f620568a39e.1718092070.git.dvyukov@google.com +Signed-off-by: Greg Kroah-Hartman +--- + kernel/module/Makefile | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/kernel/module/Makefile ++++ b/kernel/module/Makefile +@@ -5,7 +5,7 @@ + + # These are called from save_stack_trace() on slub debug path, + # and produce insane amounts of uninteresting coverage. +-KCOV_INSTRUMENT_module.o := n ++KCOV_INSTRUMENT_main.o := n + + obj-y += main.o + obj-y += strict_rwx.o diff --git a/queue-6.10/s390-ftrace-avoid-calling-unwinder-in-ftrace_return_address.patch b/queue-6.10/s390-ftrace-avoid-calling-unwinder-in-ftrace_return_address.patch new file mode 100644 index 00000000000..b28e30924b5 --- /dev/null +++ b/queue-6.10/s390-ftrace-avoid-calling-unwinder-in-ftrace_return_address.patch @@ -0,0 +1,103 @@ +From a84dd0d8ae24bdc6da341187fc4c1a0adfce2ccc Mon Sep 17 00:00:00 2001 +From: Vasily Gorbik +Date: Sat, 24 Aug 2024 02:14:04 +0200 +Subject: s390/ftrace: Avoid calling unwinder in ftrace_return_address() + +From: Vasily Gorbik + +commit a84dd0d8ae24bdc6da341187fc4c1a0adfce2ccc upstream. + +ftrace_return_address() is called extremely often from +performance-critical code paths when debugging features like +CONFIG_TRACE_IRQFLAGS are enabled. 
For example, with debug_defconfig,
+ftrace selftests on my LPAR currently execute ftrace_return_address()
+as follows:
+
+ftrace_return_address(0) - 0 times (common code uses __builtin_return_address(0) instead)
+ftrace_return_address(1) - 2,986,805,401 times (with this patch applied)
+ftrace_return_address(2) - 140 times
+ftrace_return_address(>2) - 0 times
+
+The use of __builtin_return_address(n) was replaced by return_address()
+with an unwinder call by commit cae74ba8c295 ("s390/ftrace:
+Use unwinder instead of __builtin_return_address()") because
+__builtin_return_address(n) simply walks the stack backchain and doesn't
+check for reaching the stack top. For shallow stacks with fewer than
+"n" frames, this results in reads at low addresses and random
+memory accesses.
+
+While calling the fully functional unwinder "works", it is very slow
+for this purpose. Moreover, potentially following stack switches and
+walking past IRQ context is simply the wrong thing to do for
+ftrace_return_address().
+
+Reimplement return_address() to essentially be __builtin_return_address(n)
+with checks for reaching the stack top. Since the ftrace_return_address(n)
+argument is always a constant, keep the implementation in the header,
+allowing both GCC and Clang to unroll the loop and optimize it to the
+bare minimum.
+ +Fixes: cae74ba8c295 ("s390/ftrace: Use unwinder instead of __builtin_return_address()") +Cc: stable@vger.kernel.org +Reported-by: Sumanth Korikkar +Reviewed-by: Heiko Carstens +Acked-by: Sumanth Korikkar +Signed-off-by: Vasily Gorbik +Signed-off-by: Greg Kroah-Hartman +--- + arch/s390/include/asm/ftrace.h | 17 ++++++++++++++++- + arch/s390/kernel/stacktrace.c | 19 ------------------- + 2 files changed, 16 insertions(+), 20 deletions(-) + +--- a/arch/s390/include/asm/ftrace.h ++++ b/arch/s390/include/asm/ftrace.h +@@ -7,8 +7,23 @@ + #define MCOUNT_INSN_SIZE 6 + + #ifndef __ASSEMBLY__ ++#include + +-unsigned long return_address(unsigned int n); ++static __always_inline unsigned long return_address(unsigned int n) ++{ ++ struct stack_frame *sf; ++ ++ if (!n) ++ return (unsigned long)__builtin_return_address(0); ++ ++ sf = (struct stack_frame *)current_frame_address(); ++ do { ++ sf = (struct stack_frame *)sf->back_chain; ++ if (!sf) ++ return 0; ++ } while (--n); ++ return sf->gprs[8]; ++} + #define ftrace_return_address(n) return_address(n) + + void ftrace_caller(void); +--- a/arch/s390/kernel/stacktrace.c ++++ b/arch/s390/kernel/stacktrace.c +@@ -162,22 +162,3 @@ void arch_stack_walk_user(stack_trace_co + { + arch_stack_walk_user_common(consume_entry, cookie, NULL, regs, false); + } +- +-unsigned long return_address(unsigned int n) +-{ +- struct unwind_state state; +- unsigned long addr; +- +- /* Increment to skip current stack entry */ +- n++; +- +- unwind_for_each_frame(&state, NULL, NULL, 0) { +- addr = unwind_get_return_address(&state); +- if (!addr) +- break; +- if (!n--) +- return addr; +- } +- return 0; +-} +-EXPORT_SYMBOL_GPL(return_address); diff --git a/queue-6.10/series b/queue-6.10/series index aa37c6408f5..2bfe577ddec 100644 --- a/queue-6.10/series +++ b/queue-6.10/series @@ -612,3 +612,21 @@ fs_parse-add-uid-gid-option-option-parsing-helpers.patch debugfs-convert-to-new-uid-gid-option-parsing-helper.patch 
debugfs-show-actual-source-in-proc-mounts.patch
lsm-infrastructure-management-of-the-sock-security.patch
+bpf-lsm-set-bpf_lsm_blob_sizes.lbs_task-to-0.patch
+dm-verity-restart-or-panic-on-an-i-o-error.patch
+compiler.h-specify-correct-attribute-for-.rodata..c_jump_table.patch
+lockdep-fix-deadlock-issue-between-lockdep-and-rcu.patch
+exfat-resolve-memory-leak-from-exfat_create_upcase_table.patch
+mm-hugetlb_vmemmap-batch-hvo-work-when-demoting.patch
+s390-ftrace-avoid-calling-unwinder-in-ftrace_return_address.patch
+mm-only-enforce-minimum-stack-gap-size-if-it-s-sensible.patch
+spi-fspi-add-support-for-imx8ulp.patch
+module-fix-kcov-ignored-file-name.patch
+fbdev-xen-fbfront-assign-fb_info-device.patch
+tpm-export-tpm2_sessions_init-to-fix-ibmvtpm-building.patch
+mm-hugetlb.c-fix-uaf-of-vma-in-hugetlb-fault-pathway.patch
+mm-huge_memory-ensure-huge_zero_folio-won-t-have-large_rmappable-flag-set.patch
+mm-change-vmf_anon_prepare-to-__vmf_anon_prepare.patch
+mm-damon-vaddr-protect-vma-traversal-in-__damon_va_thre_regions-with-rcu-read-lock.patch
+i2c-aspeed-update-the-stop-sw-state-when-the-bus-recovery-occurs.patch
+i2c-isch-add-missed-else.patch
diff --git a/queue-6.10/spi-fspi-add-support-for-imx8ulp.patch b/queue-6.10/spi-fspi-add-support-for-imx8ulp.patch
new file mode 100644
index 00000000000..1f252aaaf40
--- /dev/null
+++ b/queue-6.10/spi-fspi-add-support-for-imx8ulp.patch
@@ -0,0 +1,52 @@
+From 9228956a620553d7fd17f703a37a26c91e4d92ab Mon Sep 17 00:00:00 2001
+From: Haibo Chen
+Date: Thu, 5 Sep 2024 17:43:37 +0800
+Subject: spi: fspi: add support for imx8ulp
+
+From: Haibo Chen
+
+commit 9228956a620553d7fd17f703a37a26c91e4d92ab upstream.
+
+The flexspi on imx8ulp only has 16 LUTs, different from others which
+have up to 32 LUTs.
+
+Add a separate compatible string and nxp_fspi_devtype_data to support
+flexspi on imx8ulp.
+ +Fixes: ef89fd56bdfc ("arm64: dts: imx8ulp: add flexspi node") +Cc: stable@kernel.org +Signed-off-by: Haibo Chen +Reviewed-by: Frank Li +Link: https://patch.msgid.link/20240905094338.1986871-4-haibo.chen@nxp.com +Signed-off-by: Mark Brown +Signed-off-by: Greg Kroah-Hartman +--- + drivers/spi/spi-nxp-fspi.c | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +--- a/drivers/spi/spi-nxp-fspi.c ++++ b/drivers/spi/spi-nxp-fspi.c +@@ -371,6 +371,15 @@ static struct nxp_fspi_devtype_data imx8 + .little_endian = true, /* little-endian */ + }; + ++static struct nxp_fspi_devtype_data imx8ulp_data = { ++ .rxfifo = SZ_512, /* (64 * 64 bits) */ ++ .txfifo = SZ_1K, /* (128 * 64 bits) */ ++ .ahb_buf_size = SZ_2K, /* (256 * 64 bits) */ ++ .quirks = 0, ++ .lut_num = 16, ++ .little_endian = true, /* little-endian */ ++}; ++ + struct nxp_fspi { + void __iomem *iobase; + void __iomem *ahb_addr; +@@ -1297,6 +1306,7 @@ static const struct of_device_id nxp_fsp + { .compatible = "nxp,imx8mp-fspi", .data = (void *)&imx8mm_data, }, + { .compatible = "nxp,imx8qxp-fspi", .data = (void *)&imx8qxp_data, }, + { .compatible = "nxp,imx8dxl-fspi", .data = (void *)&imx8dxl_data, }, ++ { .compatible = "nxp,imx8ulp-fspi", .data = (void *)&imx8ulp_data, }, + { /* sentinel */ } + }; + MODULE_DEVICE_TABLE(of, nxp_fspi_dt_ids); diff --git a/queue-6.10/tpm-export-tpm2_sessions_init-to-fix-ibmvtpm-building.patch b/queue-6.10/tpm-export-tpm2_sessions_init-to-fix-ibmvtpm-building.patch new file mode 100644 index 00000000000..40571ccaf4f --- /dev/null +++ b/queue-6.10/tpm-export-tpm2_sessions_init-to-fix-ibmvtpm-building.patch @@ -0,0 +1,45 @@ +From f168c000d27f8134160d4a52dfc474a948a3d7e9 Mon Sep 17 00:00:00 2001 +From: Kexy Biscuit +Date: Mon, 9 Sep 2024 20:28:30 +0300 +Subject: tpm: export tpm2_sessions_init() to fix ibmvtpm building + +From: Kexy Biscuit + +commit f168c000d27f8134160d4a52dfc474a948a3d7e9 upstream. 
+
+Commit 08d08e2e9f0a ("tpm: ibmvtpm: Call tpm2_sessions_init() to
+initialize session support") adds a call to tpm2_sessions_init() in ibmvtpm,
+which could be built as a module. However, tpm2_sessions_init() wasn't
+exported, causing ibmvtpm to fail to build as a module:
+
+ERROR: modpost: "tpm2_sessions_init" [drivers/char/tpm/tpm_ibmvtpm.ko] undefined!
+
+Export tpm2_sessions_init() to resolve the issue.
+
+Cc: stable@vger.kernel.org # v6.10+
+Reported-by: kernel test robot
+Closes: https://lore.kernel.org/oe-kbuild-all/202408051735.ZJkAPQ3b-lkp@intel.com/
+Fixes: 08d08e2e9f0a ("tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session support")
+Signed-off-by: Kexy Biscuit
+Signed-off-by: Mingcong Bai
+Reviewed-by: Stefan Berger
+Reviewed-by: Jarkko Sakkinen
+Signed-off-by: Jarkko Sakkinen
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/char/tpm/tpm2-sessions.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
+index d3521aadd43e..44f60730cff4 100644
+--- a/drivers/char/tpm/tpm2-sessions.c
++++ b/drivers/char/tpm/tpm2-sessions.c
+@@ -1362,4 +1362,5 @@ int tpm2_sessions_init(struct tpm_chip *chip)
+
+ 	return rc;
+ }
++EXPORT_SYMBOL(tpm2_sessions_init);
+ #endif /* CONFIG_TCG_TPM2_HMAC */
+--
+2.46.2
+