From: Sasha Levin Date: Sun, 2 Mar 2025 14:46:01 +0000 (-0500) Subject: Fixes for 6.13 X-Git-Tag: v6.6.81~40 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=1be1e0d513ea9678acb712da669f6e35efc678d0;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 6.13 Signed-off-by: Sasha Levin --- diff --git a/queue-6.13/io_uring-net-save-msg_control-for-compat.patch b/queue-6.13/io_uring-net-save-msg_control-for-compat.patch new file mode 100644 index 0000000000..338706ce60 --- /dev/null +++ b/queue-6.13/io_uring-net-save-msg_control-for-compat.patch @@ -0,0 +1,39 @@ +From 9f8f9e3a606823ae83182132b7478c5ebf918aaf Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 25 Feb 2025 15:59:02 +0000 +Subject: io_uring/net: save msg_control for compat + +From: Pavel Begunkov + +[ Upstream commit 6ebf05189dfc6d0d597c99a6448a4d1064439a18 ] + +Match the compat part of io_sendmsg_copy_hdr() with its counterpart and +save msg_control. + +Fixes: c55978024d123 ("io_uring/net: move receive multishot out of the generic msghdr path") +Signed-off-by: Pavel Begunkov +Link: https://lore.kernel.org/r/2a8418821fe83d3b64350ad2b3c0303e9b732bbd.1740498502.git.asml.silence@gmail.com +Signed-off-by: Jens Axboe +Signed-off-by: Sasha Levin +--- + io_uring/net.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/io_uring/net.c b/io_uring/net.c +index b01bf900e3b94..96af3408792bb 100644 +--- a/io_uring/net.c ++++ b/io_uring/net.c +@@ -334,7 +334,9 @@ static int io_sendmsg_copy_hdr(struct io_kiocb *req, + if (unlikely(ret)) + return ret; + +- return __get_compat_msghdr(&iomsg->msg, &cmsg, NULL); ++ ret = __get_compat_msghdr(&iomsg->msg, &cmsg, NULL); ++ sr->msg_control = iomsg->msg.msg_control_user; ++ return ret; + } + #endif + +-- +2.39.5 + diff --git a/queue-6.13/objtool-fix-c-jump-table-annotations-for-clang.patch b/queue-6.13/objtool-fix-c-jump-table-annotations-for-clang.patch new file mode 100644 index 0000000000..d87ecb6dcf --- /dev/null +++ 
b/queue-6.13/objtool-fix-c-jump-table-annotations-for-clang.patch @@ -0,0 +1,106 @@ +From 5638fbee08fd291cc7e22c2cb7960cfdeed9237d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 21 Feb 2025 14:57:07 +0100 +Subject: objtool: Fix C jump table annotations for Clang + +From: Ard Biesheuvel + +[ Upstream commit 73cfc53cc3b6380eccf013049574485f64cb83ca ] + +A C jump table (such as the one used by the BPF interpreter) is a const +global array of absolute code addresses, and this means that the actual +values in the table may not be known until the kernel is booted (e.g., +when using KASLR or when the kernel VA space is sized dynamically). + +When using PIE codegen, the compiler will default to placing such const +global objects in .data.rel.ro (which is annotated as writable), rather +than .rodata (which is annotated as read-only). As C jump tables are +explicitly emitted into .rodata, this used to result in warnings for +LoongArch builds (which uses PIE codegen for the entire kernel) like + + Warning: setting incorrect section attributes for .rodata..c_jump_table + +due to the fact that the explicitly specified .rodata section inherited +the read-write annotation that the compiler uses for such objects when +using PIE codegen. 
+ +This warning was suppressed by explicitly adding the read-only +annotation to the __attribute__((section(""))) string, by commit + + c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table") + +Unfortunately, this hack does not work on Clang's integrated assembler, +which happily interprets the appended section type and permission +specifiers as part of the section name, which therefore no longer +matches the hard-coded pattern '.rodata..c_jump_table' that objtool +expects, causing it to emit a warning + + kernel/bpf/core.o: warning: objtool: ___bpf_prog_run+0x20: sibling call from callable instruction with modified stack frame + +Work around this, by emitting C jump tables into .data.rel.ro instead, +which is treated as .rodata by the linker script for all builds, not +just PIE based ones. + +Fixes: c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table") +Tested-by: Tiezhu Yang # on LoongArch +Signed-off-by: Ard Biesheuvel +Link: https://lore.kernel.org/r/20250221135704.431269-6-ardb+git@google.com +Signed-off-by: Josh Poimboeuf +Signed-off-by: Sasha Levin +--- + include/linux/compiler.h | 2 +- + tools/objtool/check.c | 7 ++++--- + tools/objtool/include/objtool/special.h | 2 +- + 3 files changed, 6 insertions(+), 5 deletions(-) + +diff --git a/include/linux/compiler.h b/include/linux/compiler.h +index 8104d3568d673..d004f9b5528d7 100644 +--- a/include/linux/compiler.h ++++ b/include/linux/compiler.h +@@ -110,7 +110,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, + /* Unreachable code */ + #ifdef CONFIG_OBJTOOL + /* Annotate a C jump table to allow objtool to follow the code flow */ +-#define __annotate_jump_table __section(".rodata..c_jump_table,\"a\",@progbits #") ++#define __annotate_jump_table __section(".data.rel.ro.c_jump_table") + #else /* !CONFIG_OBJTOOL */ + #define __annotate_jump_table + #endif /* CONFIG_OBJTOOL */ +diff --git a/tools/objtool/check.c b/tools/objtool/check.c +index 
4d7d2c115cbac..6691bd106e4b6 100644 +--- a/tools/objtool/check.c ++++ b/tools/objtool/check.c +@@ -2589,13 +2589,14 @@ static void mark_rodata(struct objtool_file *file) + * + * - .rodata: can contain GCC switch tables + * - .rodata.: same, if -fdata-sections is being used +- * - .rodata..c_jump_table: contains C annotated jump tables ++ * - .data.rel.ro.c_jump_table: contains C annotated jump tables + * + * .rodata.str1.* sections are ignored; they don't contain jump tables. + */ + for_each_sec(file, sec) { +- if (!strncmp(sec->name, ".rodata", 7) && +- !strstr(sec->name, ".str1.")) { ++ if ((!strncmp(sec->name, ".rodata", 7) && ++ !strstr(sec->name, ".str1.")) || ++ !strncmp(sec->name, ".data.rel.ro", 12)) { + sec->rodata = true; + found = true; + } +diff --git a/tools/objtool/include/objtool/special.h b/tools/objtool/include/objtool/special.h +index 86d4af9c5aa9d..89ee12b1a1384 100644 +--- a/tools/objtool/include/objtool/special.h ++++ b/tools/objtool/include/objtool/special.h +@@ -10,7 +10,7 @@ + #include + #include + +-#define C_JUMP_TABLE_SECTION ".rodata..c_jump_table" ++#define C_JUMP_TABLE_SECTION ".data.rel.ro.c_jump_table" + + struct special_alt { + struct list_head list; +-- +2.39.5 + diff --git a/queue-6.13/objtool-remove-annotate_-un-reachable.patch b/queue-6.13/objtool-remove-annotate_-un-reachable.patch new file mode 100644 index 0000000000..266e8cc748 --- /dev/null +++ b/queue-6.13/objtool-remove-annotate_-un-reachable.patch @@ -0,0 +1,131 @@ +From fa2d0106a0ebc51d45a3315428ba8cb720134cbf Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 28 Nov 2024 10:39:04 +0100 +Subject: objtool: Remove annotate_{,un}reachable() + +From: Peter Zijlstra + +[ Upstream commit 06e24745985c8dd0da18337503afcf2f2fdbdff1 ] + +There are no users of annotate_reachable() left. + +And the annotate_unreachable() usage in unreachable() is plain wrong; +it will hide dangerous fall-through code-gen. + +Remove both. 
+ +Signed-off-by: Peter Zijlstra (Intel) +Acked-by: Josh Poimboeuf +Link: https://lore.kernel.org/r/20241128094312.235637588@infradead.org +Stable-dep-of: 73cfc53cc3b6 ("objtool: Fix C jump table annotations for Clang") +Signed-off-by: Sasha Levin +--- + include/linux/compiler.h | 27 ------------------------- + tools/objtool/check.c | 43 ++-------------------------------------- + 2 files changed, 2 insertions(+), 68 deletions(-) + +diff --git a/include/linux/compiler.h b/include/linux/compiler.h +index bd5d10c479c09..8104d3568d673 100644 +--- a/include/linux/compiler.h ++++ b/include/linux/compiler.h +@@ -109,35 +109,9 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, + + /* Unreachable code */ + #ifdef CONFIG_OBJTOOL +-/* +- * These macros help objtool understand GCC code flow for unreachable code. +- * The __COUNTER__ based labels are a hack to make each instance of the macros +- * unique, to convince GCC not to merge duplicate inline asm statements. +- */ +-#define __stringify_label(n) #n +- +-#define __annotate_reachable(c) ({ \ +- asm volatile(__stringify_label(c) ":\n\t" \ +- ".pushsection .discard.reachable\n\t" \ +- ".long " __stringify_label(c) "b - .\n\t" \ +- ".popsection\n\t"); \ +-}) +-#define annotate_reachable() __annotate_reachable(__COUNTER__) +- +-#define __annotate_unreachable(c) ({ \ +- asm volatile(__stringify_label(c) ":\n\t" \ +- ".pushsection .discard.unreachable\n\t" \ +- ".long " __stringify_label(c) "b - .\n\t" \ +- ".popsection\n\t" : : "i" (c)); \ +-}) +-#define annotate_unreachable() __annotate_unreachable(__COUNTER__) +- + /* Annotate a C jump table to allow objtool to follow the code flow */ + #define __annotate_jump_table __section(".rodata..c_jump_table,\"a\",@progbits #") +- + #else /* !CONFIG_OBJTOOL */ +-#define annotate_reachable() +-#define annotate_unreachable() + #define __annotate_jump_table + #endif /* CONFIG_OBJTOOL */ + +@@ -147,7 +121,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int 
val, + * control elsewhere. + */ + #define unreachable() do { \ +- annotate_unreachable(); \ + barrier_before_unreachable(); \ + __builtin_unreachable(); \ + } while (0) +diff --git a/tools/objtool/check.c b/tools/objtool/check.c +index e7ec29dfdff22..4d7d2c115cbac 100644 +--- a/tools/objtool/check.c ++++ b/tools/objtool/check.c +@@ -639,47 +639,8 @@ static int add_dead_ends(struct objtool_file *file) + uint64_t offset; + + /* +- * Check for manually annotated dead ends. +- */ +- rsec = find_section_by_name(file->elf, ".rela.discard.unreachable"); +- if (!rsec) +- goto reachable; +- +- for_each_reloc(rsec, reloc) { +- if (reloc->sym->type == STT_SECTION) { +- offset = reloc_addend(reloc); +- } else if (reloc->sym->local_label) { +- offset = reloc->sym->offset; +- } else { +- WARN("unexpected relocation symbol type in %s", rsec->name); +- return -1; +- } +- +- insn = find_insn(file, reloc->sym->sec, offset); +- if (insn) +- insn = prev_insn_same_sec(file, insn); +- else if (offset == reloc->sym->sec->sh.sh_size) { +- insn = find_last_insn(file, reloc->sym->sec); +- if (!insn) { +- WARN("can't find unreachable insn at %s+0x%" PRIx64, +- reloc->sym->sec->name, offset); +- return -1; +- } +- } else { +- WARN("can't find unreachable insn at %s+0x%" PRIx64, +- reloc->sym->sec->name, offset); +- return -1; +- } +- +- insn->dead_end = true; +- } +- +-reachable: +- /* +- * These manually annotated reachable checks are needed for GCC 4.4, +- * where the Linux unreachable() macro isn't supported. In that case +- * GCC doesn't know the "ud2" is fatal, so it generates code as if it's +- * not a dead end. ++ * UD2 defaults to being a dead-end, allow them to be annotated for ++ * non-fatal, eg WARN. 
+ */ + rsec = find_section_by_name(file->elf, ".rela.discard.reachable"); + if (!rsec) +-- +2.39.5 + diff --git a/queue-6.13/perf-core-order-the-pmu-list-to-fix-warning-about-un.patch b/queue-6.13/perf-core-order-the-pmu-list-to-fix-warning-about-un.patch new file mode 100644 index 0000000000..71f6255247 --- /dev/null +++ b/queue-6.13/perf-core-order-the-pmu-list-to-fix-warning-about-un.patch @@ -0,0 +1,108 @@ +From f0843046cdeecf9316d050fc2b3fc19f56d79652 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 22 Jan 2025 07:33:56 +0000 +Subject: perf/core: Order the PMU list to fix warning about unordered + pmu_ctx_list + +From: Luo Gengkun + +[ Upstream commit 2016066c66192a99d9e0ebf433789c490a6785a2 ] + +Syskaller triggers a warning due to prev_epc->pmu != next_epc->pmu in +perf_event_swap_task_ctx_data(). vmcore shows that two lists have the same +perf_event_pmu_context, but not in the same order. + +The problem is that the order of pmu_ctx_list for the parent is impacted by +the time when an event/PMU is added. While the order for a child is +impacted by the event order in the pinned_groups and flexible_groups. So +the order of pmu_ctx_list in the parent and child may be different. + +To fix this problem, insert the perf_event_pmu_context to its proper place +after iteration of the pmu_ctx_list. 
+ +The follow testcase can trigger above warning: + + # perf record -e cycles --call-graph lbr -- taskset -c 3 ./a.out & + # perf stat -e cpu-clock,cs -p xxx // xxx is the pid of a.out + + test.c + + void main() { + int count = 0; + pid_t pid; + + printf("%d running\n", getpid()); + sleep(30); + printf("running\n"); + + pid = fork(); + if (pid == -1) { + printf("fork error\n"); + return; + } + if (pid == 0) { + while (1) { + count++; + } + } else { + while (1) { + count++; + } + } + } + +The testcase first opens an LBR event, so it will allocate task_ctx_data, +and then open tracepoint and software events, so the parent context will +have 3 different perf_event_pmu_contexts. On inheritance, child ctx will +insert the perf_event_pmu_context in another order and the warning will +trigger. + +[ mingo: Tidied up the changelog. ] + +Fixes: bd2756811766 ("perf: Rewrite core context handling") +Signed-off-by: Luo Gengkun +Signed-off-by: Ingo Molnar +Reviewed-by: Kan Liang +Link: https://lore.kernel.org/r/20250122073356.1824736-1-luogengkun@huaweicloud.com +Signed-off-by: Sasha Levin +--- + kernel/events/core.c | 11 +++++++++-- + 1 file changed, 9 insertions(+), 2 deletions(-) + +diff --git a/kernel/events/core.c b/kernel/events/core.c +index e9f698c08dc17..91c58014fe5dc 100644 +--- a/kernel/events/core.c ++++ b/kernel/events/core.c +@@ -4950,7 +4950,7 @@ static struct perf_event_pmu_context * + find_get_pmu_context(struct pmu *pmu, struct perf_event_context *ctx, + struct perf_event *event) + { +- struct perf_event_pmu_context *new = NULL, *epc; ++ struct perf_event_pmu_context *new = NULL, *pos = NULL, *epc; + void *task_ctx_data = NULL; + + if (!ctx->task) { +@@ -5007,12 +5007,19 @@ find_get_pmu_context(struct pmu *pmu, struct perf_event_context *ctx, + atomic_inc(&epc->refcount); + goto found_epc; + } ++ /* Make sure the pmu_ctx_list is sorted by PMU type: */ ++ if (!pos && epc->pmu->type > pmu->type) ++ pos = epc; + } + + epc = new; + new = NULL; + +- 
list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list); ++ if (!pos) ++ list_add_tail(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list); ++ else ++ list_add(&epc->pmu_ctx_entry, pos->pmu_ctx_entry.prev); ++ + epc->ctx = ctx; + + found_epc: +-- +2.39.5 + diff --git a/queue-6.13/series b/queue-6.13/series index a26c8277a3..d9dbda61cb 100644 --- a/queue-6.13/series +++ b/queue-6.13/series @@ -60,3 +60,14 @@ net-ipv6-fix-dst-ref-loop-on-input-in-rpl-lwt.patch selftests-drv-net-check-if-combined-count-exists.patch idpf-fix-checksums-set-in-idpf_rx_rsc.patch net-ti-icss-iep-reject-perout-generation-request.patch +thermal-gov_power_allocator-fix-incorrect-calculatio.patch +perf-core-order-the-pmu-list-to-fix-warning-about-un.patch +uprobes-reject-the-shared-zeropage-in-uprobe_write_o.patch +thermal-of-fix-cdev-lookup-in-thermal_of_should_bind.patch +thermal-gov_power_allocator-update-total_weight-on-b.patch +io_uring-net-save-msg_control-for-compat.patch +unreachable-unify.patch +objtool-remove-annotate_-un-reachable.patch +objtool-fix-c-jump-table-annotations-for-clang.patch +x86-cpu-fix-warm-boot-hang-regression-on-amd-sc1100-.patch +uprobes-remove-too-strict-lockdep_assert-condition-i.patch diff --git a/queue-6.13/thermal-gov_power_allocator-fix-incorrect-calculatio.patch b/queue-6.13/thermal-gov_power_allocator-fix-incorrect-calculatio.patch new file mode 100644 index 0000000000..01420e762d --- /dev/null +++ b/queue-6.13/thermal-gov_power_allocator-fix-incorrect-calculatio.patch @@ -0,0 +1,46 @@ +From 51af8d063df6f3a45b7925102a774d96bc07d9b7 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 19 Feb 2025 15:07:48 +0800 +Subject: thermal: gov_power_allocator: Fix incorrect calculation in + divvy_up_power() + +From: Yu-Che Cheng + +[ Upstream commit 4ecaa75771a75f2b78a431bf67dea165d19d72a6 ] + +divvy_up_power() should use weighted_req_power instead of req_power to +calculate granted_power. 
Otherwise, granted_power may be unexpected as +the denominator total_req_power is a weighted sum. + +This is a mistake made during the previous refactor. + +Replace req_power with weighted_req_power in divvy_up_power() +calculation. + +Fixes: 912e97c67cc3 ("thermal: gov_power_allocator: Move memory allocation out of throttle()") +Signed-off-by: Yu-Che Cheng +Reviewed-by: Lukasz Luba +Link: https://patch.msgid.link/20250219-fix-power-allocator-calc-v1-1-48b860291919@chromium.org +[ rjw: Subject and changelog edits ] +Signed-off-by: Rafael J. Wysocki +Signed-off-by: Sasha Levin +--- + drivers/thermal/gov_power_allocator.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c +index 3b644de3292e2..3b626db55b2b9 100644 +--- a/drivers/thermal/gov_power_allocator.c ++++ b/drivers/thermal/gov_power_allocator.c +@@ -370,7 +370,7 @@ static void divvy_up_power(struct power_actor *power, int num_actors, + + for (i = 0; i < num_actors; i++) { + struct power_actor *pa = &power[i]; +- u64 req_range = (u64)pa->req_power * power_range; ++ u64 req_range = (u64)pa->weighted_req_power * power_range; + + pa->granted_power = DIV_ROUND_CLOSEST_ULL(req_range, + total_req_power); +-- +2.39.5 + diff --git a/queue-6.13/thermal-gov_power_allocator-update-total_weight-on-b.patch b/queue-6.13/thermal-gov_power_allocator-update-total_weight-on-b.patch new file mode 100644 index 0000000000..7c9828410f --- /dev/null +++ b/queue-6.13/thermal-gov_power_allocator-update-total_weight-on-b.patch @@ -0,0 +1,94 @@ +From e0ef9472d7cb07f965b5bc1ded1d2edcd1417690 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Sat, 22 Feb 2025 11:20:34 +0800 +Subject: thermal: gov_power_allocator: Update total_weight on bind and cdev + updates + +From: Yu-Che Cheng + +[ Upstream commit 0cde378a10c1cbfaa8dd2b89672d42f36c2809c3 ] + +params->total_weight is not initialized during bind and not updated when +the bound cdev changes. 
The cooling device weight will not be used due +to the uninitialized total_weight, until an update via sysfs is +triggered. + +The bound cdevs are updated during thermal zone registration, where each +cooling device will be bound to the thermal zone one by one, but +power_allocator_bind() can be called without an additional cdev update +when manually changing the policy of a thermal zone via sysfs. + +Add a new function to handle weight update logic, including updating +total_weight, and call it when bind, weight changes, and cdev updates to +ensure total_weight is always correct. + +Fixes: a3cd6db4cc2e ("thermal: gov_power_allocator: Support new update callback of weights") +Signed-off-by: Yu-Che Cheng +Link: https://patch.msgid.link/20250222-fix-power-allocator-weight-v2-1-a94de86b685a@chromium.org +[ rjw: Changelog edits ] +Signed-off-by: Rafael J. Wysocki +Signed-off-by: Sasha Levin +--- + drivers/thermal/gov_power_allocator.c | 30 ++++++++++++++++++++------- + 1 file changed, 22 insertions(+), 8 deletions(-) + +diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c +index 3b626db55b2b9..0d9f636c80f4d 100644 +--- a/drivers/thermal/gov_power_allocator.c ++++ b/drivers/thermal/gov_power_allocator.c +@@ -641,6 +641,22 @@ static int allocate_actors_buffer(struct power_allocator_params *params, + return ret; + } + ++static void power_allocator_update_weight(struct power_allocator_params *params) ++{ ++ const struct thermal_trip_desc *td; ++ struct thermal_instance *instance; ++ ++ if (!params->trip_max) ++ return; ++ ++ td = trip_to_trip_desc(params->trip_max); ++ ++ params->total_weight = 0; ++ list_for_each_entry(instance, &td->thermal_instances, trip_node) ++ if (power_actor_is_valid(instance)) ++ params->total_weight += instance->weight; ++} ++ + static void power_allocator_update_tz(struct thermal_zone_device *tz, + enum thermal_notify_event reason) + { +@@ -656,16 +672,12 @@ static void power_allocator_update_tz(struct 
thermal_zone_device *tz, + if (power_actor_is_valid(instance)) + num_actors++; + +- if (num_actors == params->num_actors) +- return; ++ if (num_actors != params->num_actors) ++ allocate_actors_buffer(params, num_actors); + +- allocate_actors_buffer(params, num_actors); +- break; ++ fallthrough; + case THERMAL_INSTANCE_WEIGHT_CHANGED: +- params->total_weight = 0; +- list_for_each_entry(instance, &td->thermal_instances, trip_node) +- if (power_actor_is_valid(instance)) +- params->total_weight += instance->weight; ++ power_allocator_update_weight(params); + break; + default: + break; +@@ -731,6 +743,8 @@ static int power_allocator_bind(struct thermal_zone_device *tz) + + tz->governor_data = params; + ++ power_allocator_update_weight(params); ++ + return 0; + + free_params: +-- +2.39.5 + diff --git a/queue-6.13/thermal-of-fix-cdev-lookup-in-thermal_of_should_bind.patch b/queue-6.13/thermal-of-fix-cdev-lookup-in-thermal_of_should_bind.patch new file mode 100644 index 0000000000..9271fce5ca --- /dev/null +++ b/queue-6.13/thermal-of-fix-cdev-lookup-in-thermal_of_should_bind.patch @@ -0,0 +1,106 @@ +From 57f43b28fb20bf48d723e99ef3770b0552b48f89 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 21 Feb 2025 17:57:11 +0100 +Subject: thermal/of: Fix cdev lookup in thermal_of_should_bind() + +From: Rafael J. Wysocki + +[ Upstream commit 423de5b5bc5b267586b449abd1c4fde562aa0cf9 ] + +Since thermal_of_should_bind() terminates the loop after processing +the first child found in cooling-maps, it will never match more than +one cdev to a given trip point which is incorrect, as there may be +cooling-maps associating one trip point with multiple cooling devices. + +Address this by letting the loop continue until either all +children have been processed or a matching one has been found. + +To avoid adding conditionals or goto statements, put the loop in +question into a separate function and make that function return +right away after finding a matching cooling-maps entry. 
+ +Fixes: 94c6110b0b13 ("thermal/of: Use the .should_bind() thermal zone callback") +Link: https://lore.kernel.org/linux-pm/20250219-fix-thermal-of-v1-1-de36e7a590c4@chromium.org/ +Reported-by: Yu-Che Cheng +Signed-off-by: Rafael J. Wysocki +Reviewed-by: Yu-Che Cheng +Tested-by: Yu-Che Cheng +Reviewed-by: Lukasz Luba +Tested-by: Lukasz Luba +Link: https://patch.msgid.link/2788228.mvXUDI8C0e@rjwysocki.net +Signed-off-by: Sasha Levin +--- + drivers/thermal/thermal_of.c | 50 +++++++++++++++++++++--------------- + 1 file changed, 29 insertions(+), 21 deletions(-) + +diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c +index 5ab4ce4daaebd..5401f03d6b6c1 100644 +--- a/drivers/thermal/thermal_of.c ++++ b/drivers/thermal/thermal_of.c +@@ -274,6 +274,34 @@ static bool thermal_of_get_cooling_spec(struct device_node *map_np, int index, + return true; + } + ++static bool thermal_of_cm_lookup(struct device_node *cm_np, ++ const struct thermal_trip *trip, ++ struct thermal_cooling_device *cdev, ++ struct cooling_spec *c) ++{ ++ for_each_child_of_node_scoped(cm_np, child) { ++ struct device_node *tr_np; ++ int count, i; ++ ++ tr_np = of_parse_phandle(child, "trip", 0); ++ if (tr_np != trip->priv) ++ continue; ++ ++ /* The trip has been found, look up the cdev. */ ++ count = of_count_phandle_with_args(child, "cooling-device", ++ "#cooling-cells"); ++ if (count <= 0) ++ pr_err("Add a cooling_device property with at least one device\n"); ++ ++ for (i = 0; i < count; i++) { ++ if (thermal_of_get_cooling_spec(child, i, cdev, c)) ++ return true; ++ } ++ } ++ ++ return false; ++} ++ + static bool thermal_of_should_bind(struct thermal_zone_device *tz, + const struct thermal_trip *trip, + struct thermal_cooling_device *cdev, +@@ -293,27 +321,7 @@ static bool thermal_of_should_bind(struct thermal_zone_device *tz, + goto out; + + /* Look up the trip and the cdev in the cooling maps. 
*/ +- for_each_child_of_node_scoped(cm_np, child) { +- struct device_node *tr_np; +- int count, i; +- +- tr_np = of_parse_phandle(child, "trip", 0); +- if (tr_np != trip->priv) +- continue; +- +- /* The trip has been found, look up the cdev. */ +- count = of_count_phandle_with_args(child, "cooling-device", "#cooling-cells"); +- if (count <= 0) +- pr_err("Add a cooling_device property with at least one device\n"); +- +- for (i = 0; i < count; i++) { +- result = thermal_of_get_cooling_spec(child, i, cdev, c); +- if (result) +- break; +- } +- +- break; +- } ++ result = thermal_of_cm_lookup(cm_np, trip, cdev, c); + + of_node_put(cm_np); + out: +-- +2.39.5 + diff --git a/queue-6.13/unreachable-unify.patch b/queue-6.13/unreachable-unify.patch new file mode 100644 index 0000000000..79b8ead7bd --- /dev/null +++ b/queue-6.13/unreachable-unify.patch @@ -0,0 +1,72 @@ +From 53c27a4c300cff7831abd5ec264ddd4069901837 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 28 Nov 2024 10:39:01 +0100 +Subject: unreachable: Unify + +From: Peter Zijlstra + +[ Upstream commit c837de3810982cd41cd70e5170da1931439f025c ] + +Since barrier_before_unreachable() is empty for !GCC it is trivial to +unify the two definitions. Less is more. + +Signed-off-by: Peter Zijlstra (Intel) +Acked-by: Josh Poimboeuf +Link: https://lore.kernel.org/r/20241128094311.924381359@infradead.org +Stable-dep-of: 73cfc53cc3b6 ("objtool: Fix C jump table annotations for Clang") +Signed-off-by: Sasha Levin +--- + include/linux/compiler-gcc.h | 12 ------------ + include/linux/compiler.h | 10 +++++++--- + 2 files changed, 7 insertions(+), 15 deletions(-) + +diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h +index d0ed9583743fc..c9b58188ec61e 100644 +--- a/include/linux/compiler-gcc.h ++++ b/include/linux/compiler-gcc.h +@@ -52,18 +52,6 @@ + */ + #define barrier_before_unreachable() asm volatile("") + +-/* +- * Mark a position in code as unreachable. 
This can be used to +- * suppress control flow warnings after asm blocks that transfer +- * control elsewhere. +- */ +-#define unreachable() \ +- do { \ +- annotate_unreachable(); \ +- barrier_before_unreachable(); \ +- __builtin_unreachable(); \ +- } while (0) +- + #if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) + #define __HAVE_BUILTIN_BSWAP32__ + #define __HAVE_BUILTIN_BSWAP64__ +diff --git a/include/linux/compiler.h b/include/linux/compiler.h +index 7af999a131cb2..bd5d10c479c09 100644 +--- a/include/linux/compiler.h ++++ b/include/linux/compiler.h +@@ -141,12 +141,16 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, + #define __annotate_jump_table + #endif /* CONFIG_OBJTOOL */ + +-#ifndef unreachable +-# define unreachable() do { \ ++/* ++ * Mark a position in code as unreachable. This can be used to ++ * suppress control flow warnings after asm blocks that transfer ++ * control elsewhere. ++ */ ++#define unreachable() do { \ + annotate_unreachable(); \ ++ barrier_before_unreachable(); \ + __builtin_unreachable(); \ + } while (0) +-#endif + + /* + * KENTRY - kernel entry point +-- +2.39.5 + diff --git a/queue-6.13/uprobes-reject-the-shared-zeropage-in-uprobe_write_o.patch b/queue-6.13/uprobes-reject-the-shared-zeropage-in-uprobe_write_o.patch new file mode 100644 index 0000000000..7370adf00e --- /dev/null +++ b/queue-6.13/uprobes-reject-the-shared-zeropage-in-uprobe_write_o.patch @@ -0,0 +1,112 @@ +From 02ca8294d4b48e2da3eb75b92a543342d56e7933 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 24 Feb 2025 11:11:49 +0800 +Subject: uprobes: Reject the shared zeropage in uprobe_write_opcode() + +From: Tong Tiangen + +[ Upstream commit bddf10d26e6e5114e7415a0e442ec6f51a559468 ] + +We triggered the following crash in syzkaller tests: + + BUG: Bad page state in process syz.7.38 pfn:1eff3 + page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1eff3 + flags: 0x3fffff00004004(referenced|reserved|node=0|zone=1|lastcpupid=0x1fffff) + raw: 
003fffff00004004 ffffe6c6c07bfcc8 ffffe6c6c07bfcc8 0000000000000000 + raw: 0000000000000000 0000000000000000 00000000fffffffe 0000000000000000 + page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 + Call Trace: + + dump_stack_lvl+0x32/0x50 + bad_page+0x69/0xf0 + free_unref_page_prepare+0x401/0x500 + free_unref_page+0x6d/0x1b0 + uprobe_write_opcode+0x460/0x8e0 + install_breakpoint.part.0+0x51/0x80 + register_for_each_vma+0x1d9/0x2b0 + __uprobe_register+0x245/0x300 + bpf_uprobe_multi_link_attach+0x29b/0x4f0 + link_create+0x1e2/0x280 + __sys_bpf+0x75f/0xac0 + __x64_sys_bpf+0x1a/0x30 + do_syscall_64+0x56/0x100 + entry_SYSCALL_64_after_hwframe+0x78/0xe2 + + BUG: Bad rss-counter state mm:00000000452453e0 type:MM_FILEPAGES val:-1 + +The following syzkaller test case can be used to reproduce: + + r2 = creat(&(0x7f0000000000)='./file0\x00', 0x8) + write$nbd(r2, &(0x7f0000000580)=ANY=[], 0x10) + r4 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x42, 0x0) + mmap$IORING_OFF_SQ_RING(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x0, 0x12, r4, 0x0) + r5 = userfaultfd(0x80801) + ioctl$UFFDIO_API(r5, 0xc018aa3f, &(0x7f0000000040)={0xaa, 0x20}) + r6 = userfaultfd(0x80801) + ioctl$UFFDIO_API(r6, 0xc018aa3f, &(0x7f0000000140)) + ioctl$UFFDIO_REGISTER(r6, 0xc020aa00, &(0x7f0000000100)={{&(0x7f0000ffc000/0x4000)=nil, 0x4000}, 0x2}) + ioctl$UFFDIO_ZEROPAGE(r5, 0xc020aa04, &(0x7f0000000000)={{&(0x7f0000ffd000/0x1000)=nil, 0x1000}}) + r7 = bpf$PROG_LOAD(0x5, &(0x7f0000000140)={0x2, 0x3, &(0x7f0000000200)=ANY=[@ANYBLOB="1800000000120000000000000000000095"], &(0x7f0000000000)='GPL\x00', 0x7, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, @fallback=0x30, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x94) + bpf$BPF_LINK_CREATE_XDP(0x1c, &(0x7f0000000040)={r7, 0x0, 0x30, 0x1e, @val=@uprobe_multi={&(0x7f0000000080)='./file0\x00', 
&(0x7f0000000100)=[0x2], 0x0, 0x0, 0x1}}, 0x40) + +The cause is that zero pfn is set to the PTE without increasing the RSS +count in mfill_atomic_pte_zeropage() and the refcount of zero folio does +not increase accordingly. Then, the operation on the same pfn is performed +in uprobe_write_opcode()->__replace_page() to unconditional decrease the +RSS count and old_folio's refcount. + +Therefore, two bugs are introduced: + + 1. The RSS count is incorrect, when process exit, the check_mm() report + error "Bad rss-count". + + 2. The reserved folio (zero folio) is freed when folio->refcount is zero, + then free_pages_prepare->free_page_is_bad() report error + "Bad page state". + +There is more, the following warning could also theoretically be triggered: + + __replace_page() + -> ... + -> folio_remove_rmap_pte() + -> VM_WARN_ON_FOLIO(is_zero_folio(folio), folio) + +Considering that uprobe hit on the zero folio is a very rare case, just +reject zero old folio immediately after get_user_page_vma_remote(). 
+ +[ mingo: Cleaned up the changelog ] + +Fixes: 7396fa818d62 ("uprobes/core: Make background page replacement logic account for rss_stat counters") +Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints") +Signed-off-by: Tong Tiangen +Signed-off-by: Ingo Molnar +Reviewed-by: David Hildenbrand +Reviewed-by: Oleg Nesterov +Cc: Peter Zijlstra +Cc: Masami Hiramatsu +Link: https://lore.kernel.org/r/20250224031149.1598949-1-tongtiangen@huawei.com +Signed-off-by: Sasha Levin +--- + kernel/events/uprobes.c | 5 +++++ + 1 file changed, 5 insertions(+) + +diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c +index 7f1a95b4f14de..e11e2df50a3ee 100644 +--- a/kernel/events/uprobes.c ++++ b/kernel/events/uprobes.c +@@ -495,6 +495,11 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm, + if (ret <= 0) + goto put_old; + ++ if (is_zero_page(old_page)) { ++ ret = -EINVAL; ++ goto put_old; ++ } ++ + if (WARN(!is_register && PageCompound(old_page), + "uprobe unregister should never work on compound page\n")) { + ret = -EINVAL; +-- +2.39.5 + diff --git a/queue-6.13/uprobes-remove-too-strict-lockdep_assert-condition-i.patch b/queue-6.13/uprobes-remove-too-strict-lockdep_assert-condition-i.patch new file mode 100644 index 0000000000..90b6d4b7b0 --- /dev/null +++ b/queue-6.13/uprobes-remove-too-strict-lockdep_assert-condition-i.patch @@ -0,0 +1,65 @@ +From 45e07f5f34a46ee8451994c2e4de5442abaa9251 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 25 Feb 2025 14:32:14 -0800 +Subject: uprobes: Remove too strict lockdep_assert() condition in + hprobe_expire() + +From: Andrii Nakryiko + +[ Upstream commit f8c857238a392f21d5726d07966f6061007c8d4f ] + +hprobe_expire() is used to atomically switch pending uretprobe instance +(struct return_instance) from being SRCU protected to be refcounted. +This can be done from background timer thread, or synchronously within +current thread when task is forked. 
+ +In the former case, return_instance has to be protected through RCU read +lock, and that's what hprobe_expire() used to check with +lockdep_assert(rcu_read_lock_held()). + +But in the latter case (hprobe_expire() called from dup_utask()) there +is no RCU lock being held, and it's both unnecessary and inconvenient. +Inconvenient due to the intervening memory allocations inside +dup_return_instance()'s loop. Unnecessary because dup_utask() is called +synchronously in current thread, and no uretprobe can run at that point, +so return_instance can't be freed either. + +So drop rcu_read_lock_held() condition, and expand corresponding comment +to explain necessary lifetime guarantees. lockdep_assert()-detected +issue is a false positive. + +Fixes: dd1a7567784e ("uprobes: SRCU-protect uretprobe lifetime (with timeout)") +Reported-by: Breno Leitao +Signed-off-by: Andrii Nakryiko +Signed-off-by: Ingo Molnar +Link: https://lore.kernel.org/r/20250225223214.2970740-1-andrii@kernel.org +Signed-off-by: Sasha Levin +--- + kernel/events/uprobes.c | 10 +++++++--- + 1 file changed, 7 insertions(+), 3 deletions(-) + +diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c +index e11e2df50a3ee..3c34761c9ae73 100644 +--- a/kernel/events/uprobes.c ++++ b/kernel/events/uprobes.c +@@ -767,10 +767,14 @@ static struct uprobe *hprobe_expire(struct hprobe *hprobe, bool get) + enum hprobe_state hstate; + + /* +- * return_instance's hprobe is protected by RCU. +- * Underlying uprobe is itself protected from reuse by SRCU. ++ * Caller should guarantee that return_instance is not going to be ++ * freed from under us. This can be achieved either through holding ++ * rcu_read_lock() or by owning return_instance in the first place. ++ * ++ * Underlying uprobe is itself protected from reuse by SRCU, so ensure ++ * SRCU lock is held properly. 
+ */ +- lockdep_assert(rcu_read_lock_held() && srcu_read_lock_held(&uretprobes_srcu)); ++ lockdep_assert(srcu_read_lock_held(&uretprobes_srcu)); + + hstate = READ_ONCE(hprobe->state); + switch (hstate) { +-- +2.39.5 + diff --git a/queue-6.13/x86-cpu-fix-warm-boot-hang-regression-on-amd-sc1100-.patch b/queue-6.13/x86-cpu-fix-warm-boot-hang-regression-on-amd-sc1100-.patch new file mode 100644 index 0000000000..200bf52460 --- /dev/null +++ b/queue-6.13/x86-cpu-fix-warm-boot-hang-regression-on-amd-sc1100-.patch @@ -0,0 +1,95 @@ +From ec8a7530b8ae8116c29b92348cc656c49bf9be8e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 25 Feb 2025 22:31:20 +0100 +Subject: x86/CPU: Fix warm boot hang regression on AMD SC1100 SoC systems + +From: Russell Senior + +[ Upstream commit bebe35bb738b573c32a5033499cd59f20293f2a3 ] + +I still have some Soekris net4826 in a Community Wireless Network I +volunteer with. These devices use an AMD SC1100 SoC. I am running +OpenWrt on them, which uses a patched kernel, that naturally has +evolved over time. I haven't updated the ones in the field in a +number of years (circa 2017), but have one in a test bed, where I have +intermittently tried out test builds. + +A few years ago, I noticed some trouble, particularly when "warm +booting", that is, doing a reboot without removing power, and noticed +the device was hanging after the kernel message: + + [ 0.081615] Working around Cyrix MediaGX virtual DMA bugs. + +If I removed power and then restarted, it would boot fine, continuing +through the message above, thusly: + + [ 0.081615] Working around Cyrix MediaGX virtual DMA bugs. + [ 0.090076] Enable Memory-Write-back mode on Cyrix/NSC processor. + [ 0.100000] Enable Memory access reorder on Cyrix/NSC processor. 
+ [ 0.100070] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 + [ 0.110058] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 + [ 0.120037] CPU: NSC Geode(TM) Integrated Processor by National Semi (family: 0x5, model: 0x9, stepping: 0x1) + [...] + +In order to continue using modern tools, like ssh, to interact with +the software on these old devices, I need modern builds of the OpenWrt +firmware on the devices. I confirmed that the warm boot hang was still +an issue in modern OpenWrt builds (currently using a patched linux +v6.6.65). + +Last night, I decided it was time to get to the bottom of the warm +boot hang, and began bisecting. From preserved builds, I narrowed down +the bisection window from late February to late May 2019. During this +period, the OpenWrt builds were using 4.14.x. I was able to build +using period-correct Ubuntu 18.04.6. After a number of bisection +iterations, I identified a kernel bump from 4.14.112 to 4.14.113 as +the commit that introduced the warm boot hang. + + https://github.com/openwrt/openwrt/commit/07aaa7e3d62ad32767d7067107db64b6ade81537 + +Looking at the upstream changes in the stable kernel between 4.14.112 +and 4.14.113 (tig v4.14.112..v4.14.113), I spotted a likely suspect: + + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=20afb90f730982882e65b01fb8bdfe83914339c5 + +So, I tried reverting just that kernel change on top of the breaking +OpenWrt commit, and my warm boot hang went away. + +Presumably, the warm boot hang is due to some register not getting +cleared in the same way that a loss of power does. That is +approximately as much as I understand about the problem. + +More poking/prodding and coaching from Jonas Gorski, it looks +like this test patch fixes the problem on my board: Tested against +v6.6.67 and v4.14.113. 
+ +Fixes: 18fb053f9b82 ("x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors") +Debugged-by: Jonas Gorski +Signed-off-by: Russell Senior +Signed-off-by: Ingo Molnar +Link: https://lore.kernel.org/r/CAHP3WfOgs3Ms4Z+L9i0-iBOE21sdMk5erAiJurPjnrL9LSsgRA@mail.gmail.com +Cc: Matthew Whitehead +Cc: Thomas Gleixner +Signed-off-by: Sasha Levin +--- + arch/x86/kernel/cpu/cyrix.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/arch/x86/kernel/cpu/cyrix.c b/arch/x86/kernel/cpu/cyrix.c +index 9651275aecd1b..dfec2c61e3547 100644 +--- a/arch/x86/kernel/cpu/cyrix.c ++++ b/arch/x86/kernel/cpu/cyrix.c +@@ -153,8 +153,8 @@ static void geode_configure(void) + u8 ccr3; + local_irq_save(flags); + +- /* Suspend on halt power saving and enable #SUSP pin */ +- setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88); ++ /* Suspend on halt power saving */ ++ setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x08); + + ccr3 = getCx86(CX86_CCR3); + setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */ +-- +2.39.5 +