From: Greg Kroah-Hartman Date: Thu, 19 Mar 2026 09:56:05 +0000 (+0100) Subject: 6.12-stable patches X-Git-Tag: v6.18.19~15 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=9abfbd162c8c3b7b43d2ce4a02a3df8dd019cf67;p=thirdparty%2Fkernel%2Fstable-queue.git 6.12-stable patches added patches: btrfs-do-not-strictly-require-dirty-metadata-threshold-for-metadata-writepages.patch dm-verity-disable-recursive-forward-error-correction.patch drm-xe-sync-cleanup-partially-initialized-sync-on-parse-failure.patch ice-fix-devlink-reload-call-trace.patch io_uring-uring_cmd-fix-too-strict-requirement-on-ioctl.patch ipv6-use-rcu-in-ip6_xmit.patch octeontx2-af-add-proper-checks-for-fwdata.patch platform-x86-amd-pmc-add-support-for-van-gogh-soc.patch rxrpc-fix-recvmsg-unconditional-requeue.patch tracing-add-recursion-protection-in-kernel-stack-trace-recording.patch x86-uprobes-fix-xol-allocation-failure-for-32-bit-tasks.patch --- diff --git a/queue-6.12/btrfs-do-not-strictly-require-dirty-metadata-threshold-for-metadata-writepages.patch b/queue-6.12/btrfs-do-not-strictly-require-dirty-metadata-threshold-for-metadata-writepages.patch new file mode 100644 index 0000000000..8be2bcdb00 --- /dev/null +++ b/queue-6.12/btrfs-do-not-strictly-require-dirty-metadata-threshold-for-metadata-writepages.patch @@ -0,0 +1,165 @@ +From stable+bounces-219891-greg=kroah.com@vger.kernel.org Fri Feb 27 02:57:59 2026 +From: Rahul Sharma +Date: Fri, 27 Feb 2026 09:56:58 +0800 +Subject: btrfs: do not strictly require dirty metadata threshold for metadata writepages +To: gregkh@linuxfoundation.org, stable@vger.kernel.org +Cc: linux-kernel@vger.kernel.org, Qu Wenruo , Jan Kara , Boris Burkov , David Sterba , Rahul Sharma +Message-ID: <20260227015658.1116424-1-black.hawk@163.com> + +From: Qu Wenruo + +[ Upstream commit 4e159150a9a56d66d247f4b5510bed46fe58aa1c ] + +[BUG] +There is an internal report that over 1000 processes are +waiting at the io_schedule_timeout() of balance_dirty_pages(), causing 
+a system hang and triggering a kernel coredump.
+
+The kernel is based on v6.4, but the root problem still applies to
+any upstream kernel before v6.18.
+
+[CAUSE]
+First, from Jan Kara, with his wisdom on the dirty page balance behavior:
+
+  This cgroup dirty limit was what was actually playing the role here
+  because the cgroup had only a small amount of memory and so the dirty
+  limit for it was something like 16MB.
+
+  Dirty throttling is responsible for enforcing that nobody can dirty
+  (significantly) more dirty memory than there's dirty limit. Thus when
+  a task is dirtying pages it periodically enters into balance_dirty_pages()
+  and we let it sleep there to slow down the dirtying.
+
+  When the system is over dirty limit already (either globally or within
+  a cgroup of the running task), we will not let the task exit from
+  balance_dirty_pages() until the number of dirty pages drops below the
+  limit.
+
+  So in this particular case, as I already mentioned, there was a cgroup
+  with relatively small amount of memory and as a result with dirty limit
+  set at 16MB. A task from that cgroup has dirtied about 28MB worth of
+  pages in btrfs btree inode and these were practically the only dirty
+  pages in that cgroup.
+
+So that means the only way to reduce the dirty pages of that cgroup is
+to write back the dirty pages of the btrfs btree inode, and only after
+that can those processes exit balance_dirty_pages().
+
+Now back to the btrfs part: btree_writepages() is responsible for
+writing back dirty btree inode pages.
+
+The problem here is that btrfs has an internal threshold: if the
+btree inode's dirty bytes are below the 32MiB threshold, it will not
+do any writeback.
+
+This behavior batches as much metadata as possible so that we won't
+write back those tree blocks and then later re-COW them again for
+another modification.
+
+This internal 32MiB threshold is higher than the existing amount of
+dirty pages (28MiB), meaning no writeback will happen, causing a
+deadlock between btrfs and cgroup:
+
+- Btrfs doesn't want to write back the btree inode until more pages are dirtied
+
+- Cgroup/MM doesn't want more dirty pages for the btrfs btree inode
+  Thus any process touching that btree inode is put to sleep until
+  the number of dirty pages is reduced.
+
+Many thanks to Jan Kara for the analysis of the root cause.
+
+[ENHANCEMENT]
+Since kernel commit b55102826d7d ("btrfs: set AS_KERNEL_FILE on the
+btree_inode"), btrfs btree inode pages will only be charged to the root
+cgroup, which should have a much larger limit than btrfs' 32MiB
+threshold.
+So it should not affect newer kernels.
+
+But all current LTS kernels are affected by this problem,
+and backporting the whole AS_KERNEL_FILE may not be a good idea.
+
+Even for newer kernels, I still think it's a good idea to get
+rid of the internal threshold at btree_writepages(), since in most cases
+cgroup/MM has a better view of full system memory usage than btrfs' fixed
+threshold.
+
+For internal callers using btrfs_btree_balance_dirty(), since that
+function is already doing its own threshold check, we don't need to
+bother them.
+
+But for external callers of btree_writepages(), just respect their
+requests and write back whatever they want, ignoring the internal
+btrfs threshold, to avoid such a deadlock on btree inode dirty page
+balancing.
+
+CC: stable@vger.kernel.org
+CC: Jan Kara
+Reviewed-by: Boris Burkov
+Signed-off-by: Qu Wenruo
+Signed-off-by: David Sterba
+[ The context change is due to commit 5e121ae687b8
+("btrfs: use buffer xarray for extent buffer writeback operations")
+in v6.16, which is irrelevant to the logic of this patch. 
] +Signed-off-by: Rahul Sharma +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/disk-io.c | 22 ---------------------- + fs/btrfs/extent_io.c | 3 +-- + fs/btrfs/extent_io.h | 3 +-- + 3 files changed, 2 insertions(+), 26 deletions(-) + +--- a/fs/btrfs/disk-io.c ++++ b/fs/btrfs/disk-io.c +@@ -498,28 +498,6 @@ static int btree_migrate_folio(struct ad + #define btree_migrate_folio NULL + #endif + +-static int btree_writepages(struct address_space *mapping, +- struct writeback_control *wbc) +-{ +- int ret; +- +- if (wbc->sync_mode == WB_SYNC_NONE) { +- struct btrfs_fs_info *fs_info; +- +- if (wbc->for_kupdate) +- return 0; +- +- fs_info = inode_to_fs_info(mapping->host); +- /* this is a bit racy, but that's ok */ +- ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes, +- BTRFS_DIRTY_METADATA_THRESH, +- fs_info->dirty_metadata_batch); +- if (ret < 0) +- return 0; +- } +- return btree_write_cache_pages(mapping, wbc); +-} +- + static bool btree_release_folio(struct folio *folio, gfp_t gfp_flags) + { + if (folio_test_writeback(folio) || folio_test_dirty(folio)) +--- a/fs/btrfs/extent_io.c ++++ b/fs/btrfs/extent_io.c +@@ -2088,8 +2088,7 @@ static int submit_eb_page(struct folio * + return 1; + } + +-int btree_write_cache_pages(struct address_space *mapping, +- struct writeback_control *wbc) ++int btree_writepages(struct address_space *mapping, struct writeback_control *wbc) + { + struct btrfs_eb_write_context ctx = { .wbc = wbc }; + struct btrfs_fs_info *fs_info = inode_to_fs_info(mapping->host); +--- a/fs/btrfs/extent_io.h ++++ b/fs/btrfs/extent_io.h +@@ -244,8 +244,7 @@ void extent_write_locked_range(struct in + u64 start, u64 end, struct writeback_control *wbc, + bool pages_dirty); + int btrfs_writepages(struct address_space *mapping, struct writeback_control *wbc); +-int btree_write_cache_pages(struct address_space *mapping, +- struct writeback_control *wbc); ++int btree_writepages(struct address_space *mapping, struct writeback_control *wbc); + void 
btrfs_readahead(struct readahead_control *rac); + int set_folio_extent_mapped(struct folio *folio); + int set_page_extent_mapped(struct page *page); diff --git a/queue-6.12/dm-verity-disable-recursive-forward-error-correction.patch b/queue-6.12/dm-verity-disable-recursive-forward-error-correction.patch new file mode 100644 index 0000000000..fb85dbd1a9 --- /dev/null +++ b/queue-6.12/dm-verity-disable-recursive-forward-error-correction.patch @@ -0,0 +1,66 @@ +From stable+bounces-219750-greg=kroah.com@vger.kernel.org Thu Feb 26 05:35:52 2026 +From: Rahul Sharma +Date: Thu, 26 Feb 2026 12:35:00 +0800 +Subject: dm-verity: disable recursive forward error correction +To: gregkh@linuxfoundation.org, stable@vger.kernel.org +Cc: linux-kernel@vger.kernel.org, Mikulas Patocka , Guangwu Zhang , Sami Tolvanen , Eric Biggers , Rahul Sharma +Message-ID: <20260226043500.3945988-1-black.hawk@163.com> + +From: Mikulas Patocka + +[ Upstream commit d9f3e47d3fae0c101d9094bc956ed24e7a0ee801 ] + +There are two problems with the recursive correction: + +1. It may cause denial-of-service. In fec_read_bufs, there is a loop that +has 253 iterations. For each iteration, we may call verity_hash_for_block +recursively. There is a limit of 4 nested recursions - that means that +there may be at most 253^4 (4 billion) iterations. Red Hat QE team +actually created an image that pushes dm-verity to this limit - and this +image just makes the udev-worker process get stuck in the 'D' state. + +2. It doesn't work. In fec_read_bufs we store data into the variable +"fio->bufs", but fio bufs is shared between recursive invocations, if +"verity_hash_for_block" invoked correction recursively, it would +overwrite partially filled fio->bufs. 
+ +Signed-off-by: Mikulas Patocka +Reported-by: Guangwu Zhang +Reviewed-by: Sami Tolvanen +Reviewed-by: Eric Biggers +[ The context change is due to the commit bdf253d580d7 +("dm-verity: remove support for asynchronous hashes") +in v6.18 which is irrelevant to the logic of this patch. ] +Signed-off-by: Rahul Sharma +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-verity-fec.c | 4 +--- + drivers/md/dm-verity-fec.h | 3 --- + 2 files changed, 1 insertion(+), 6 deletions(-) + +--- a/drivers/md/dm-verity-fec.c ++++ b/drivers/md/dm-verity-fec.c +@@ -424,10 +424,8 @@ int verity_fec_decode(struct dm_verity * + if (!verity_fec_is_enabled(v)) + return -EOPNOTSUPP; + +- if (fio->level >= DM_VERITY_FEC_MAX_RECURSION) { +- DMWARN_LIMIT("%s: FEC: recursion too deep", v->data_dev->name); ++ if (fio->level) + return -EIO; +- } + + fio->level++; + +--- a/drivers/md/dm-verity-fec.h ++++ b/drivers/md/dm-verity-fec.h +@@ -23,9 +23,6 @@ + #define DM_VERITY_FEC_BUF_MAX \ + (1 << (PAGE_SHIFT - DM_VERITY_FEC_BUF_RS_BITS)) + +-/* maximum recursion level for verity_fec_decode */ +-#define DM_VERITY_FEC_MAX_RECURSION 4 +- + #define DM_VERITY_OPT_FEC_DEV "use_fec_from_device" + #define DM_VERITY_OPT_FEC_BLOCKS "fec_blocks" + #define DM_VERITY_OPT_FEC_START "fec_start" diff --git a/queue-6.12/drm-xe-sync-cleanup-partially-initialized-sync-on-parse-failure.patch b/queue-6.12/drm-xe-sync-cleanup-partially-initialized-sync-on-parse-failure.patch new file mode 100644 index 0000000000..33042ddb30 --- /dev/null +++ b/queue-6.12/drm-xe-sync-cleanup-partially-initialized-sync-on-parse-failure.patch @@ -0,0 +1,83 @@ +From 1bfd7575092420ba5a0b944953c95b74a5646ff8 Mon Sep 17 00:00:00 2001 +From: Shuicheng Lin +Date: Thu, 19 Feb 2026 23:35:18 +0000 +Subject: drm/xe/sync: Cleanup partially initialized sync on parse failure + +From: Shuicheng Lin + +commit 1bfd7575092420ba5a0b944953c95b74a5646ff8 upstream. 
+ +xe_sync_entry_parse() can allocate references (syncobj, fence, chain fence, +or user fence) before hitting a later failure path. Several of those paths +returned directly, leaving partially initialized state and leaking refs. + +Route these error paths through a common free_sync label and call +xe_sync_entry_cleanup(sync) before returning the error. + +Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") +Cc: Matthew Brost +Signed-off-by: Shuicheng Lin +Reviewed-by: Matthew Brost +Signed-off-by: Matthew Brost +Link: https://patch.msgid.link/20260219233516.2938172-5-shuicheng.lin@intel.com +(cherry picked from commit f939bdd9207a5d1fc55cced5459858480686ce22) +Cc: stable@vger.kernel.org +Signed-off-by: Rodrigo Vivi +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/xe/xe_sync.c | 24 +++++++++++++++++------- + 1 file changed, 17 insertions(+), 7 deletions(-) + +--- a/drivers/gpu/drm/xe/xe_sync.c ++++ b/drivers/gpu/drm/xe/xe_sync.c +@@ -142,8 +142,10 @@ int xe_sync_entry_parse(struct xe_device + + if (!signal) { + sync->fence = drm_syncobj_fence_get(sync->syncobj); +- if (XE_IOCTL_DBG(xe, !sync->fence)) +- return -EINVAL; ++ if (XE_IOCTL_DBG(xe, !sync->fence)) { ++ err = -EINVAL; ++ goto free_sync; ++ } + } + break; + +@@ -163,17 +165,21 @@ int xe_sync_entry_parse(struct xe_device + + if (signal) { + sync->chain_fence = dma_fence_chain_alloc(); +- if (!sync->chain_fence) +- return -ENOMEM; ++ if (!sync->chain_fence) { ++ err = -ENOMEM; ++ goto free_sync; ++ } + } else { + sync->fence = drm_syncobj_fence_get(sync->syncobj); +- if (XE_IOCTL_DBG(xe, !sync->fence)) +- return -EINVAL; ++ if (XE_IOCTL_DBG(xe, !sync->fence)) { ++ err = -EINVAL; ++ goto free_sync; ++ } + + err = dma_fence_chain_find_seqno(&sync->fence, + sync_in.timeline_value); + if (err) +- return err; ++ goto free_sync; + } + break; + +@@ -207,6 +213,10 @@ int xe_sync_entry_parse(struct xe_device + sync->timeline_value = sync_in.timeline_value; + + return 0; ++ ++free_sync: ++ 
xe_sync_entry_cleanup(sync); ++ return err; + } + + int xe_sync_entry_add_deps(struct xe_sync_entry *sync, struct xe_sched_job *job) diff --git a/queue-6.12/ice-fix-devlink-reload-call-trace.patch b/queue-6.12/ice-fix-devlink-reload-call-trace.patch new file mode 100644 index 0000000000..86e9c777da --- /dev/null +++ b/queue-6.12/ice-fix-devlink-reload-call-trace.patch @@ -0,0 +1,75 @@ +From stable+bounces-219893-greg=kroah.com@vger.kernel.org Fri Feb 27 03:30:26 2026 +From: Wenshan Lan +Date: Fri, 27 Feb 2026 10:29:14 +0800 +Subject: ice: fix devlink reload call trace +To: gregkh@linuxfoundation.org, stable@vger.kernel.org +Cc: Paul Greenwalt , Aleksandr Loktionov , Paul Menzel , Rinitha S , Tony Nguyen , Wenshan Lan +Message-ID: <20260227022914.1378328-1-jetlan9@163.com> + +From: Paul Greenwalt + +[ Upstream commit d3f867e7a04678640ebcbfb81893c59f4af48586 ] + +Commit 4da71a77fc3b ("ice: read internal temperature sensor") introduced +internal temperature sensor reading via HWMON. ice_hwmon_init() was added +to ice_init_feature() and ice_hwmon_exit() was added to ice_remove(). As a +result if devlink reload is used to reinit the device and then the driver +is removed, a call trace can occur. + +BUG: unable to handle page fault for address: ffffffffc0fd4b5d +Call Trace: + string+0x48/0xe0 + vsnprintf+0x1f9/0x650 + sprintf+0x62/0x80 + name_show+0x1f/0x30 + dev_attr_show+0x19/0x60 + +The call trace repeats approximately every 10 minutes when system +monitoring tools (e.g., sadc) attempt to read the orphaned hwmon sysfs +attributes that reference freed module memory. + +The sequence is: +1. Driver load, ice_hwmon_init() gets called from ice_init_feature() +2. Devlink reload down, flow does not call ice_remove() +3. Devlink reload up, ice_hwmon_init() gets called from + ice_init_feature() resulting in a second instance +4. 
Driver unload, ice_hwmon_exit() called from ice_remove() leaving the + first hwmon instance orphaned with dangling pointer + +Fix this by moving ice_hwmon_exit() from ice_remove() to +ice_deinit_features() to ensure proper cleanup symmetry with +ice_hwmon_init(). + +Fixes: 4da71a77fc3b ("ice: read internal temperature sensor") +Reviewed-by: Aleksandr Loktionov +Signed-off-by: Paul Greenwalt +Reviewed-by: Paul Menzel +Tested-by: Rinitha S (A Contingent worker at Intel) +Signed-off-by: Tony Nguyen +[ Adjust context. The context change is irrelevant to the current patch +logic. ] +Signed-off-by: Wenshan Lan +Signed-off-by: Greg Kroah-Hartman +--- + drivers/net/ethernet/intel/ice/ice_main.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +--- a/drivers/net/ethernet/intel/ice/ice_main.c ++++ b/drivers/net/ethernet/intel/ice/ice_main.c +@@ -4920,6 +4920,7 @@ static void ice_deinit_features(struct i + ice_dpll_deinit(pf); + if (pf->eswitch_mode == DEVLINK_ESWITCH_MODE_SWITCHDEV) + xa_destroy(&pf->eswitch.reprs); ++ ice_hwmon_exit(pf); + } + + static void ice_init_wakeup(struct ice_pf *pf) +@@ -5451,8 +5452,6 @@ static void ice_remove(struct pci_dev *p + ice_free_vfs(pf); + } + +- ice_hwmon_exit(pf); +- + ice_service_task_stop(pf); + ice_aq_cancel_waiting_tasks(pf); + set_bit(ICE_DOWN, pf->state); diff --git a/queue-6.12/io_uring-uring_cmd-fix-too-strict-requirement-on-ioctl.patch b/queue-6.12/io_uring-uring_cmd-fix-too-strict-requirement-on-ioctl.patch new file mode 100644 index 0000000000..77722ea9e0 --- /dev/null +++ b/queue-6.12/io_uring-uring_cmd-fix-too-strict-requirement-on-ioctl.patch @@ -0,0 +1,63 @@ +From stable+bounces-221221-greg=kroah.com@vger.kernel.org Sat Feb 28 22:31:16 2026 +From: "Asbjørn Sloth Tønnesen" +Date: Sat, 28 Feb 2026 18:08:53 +0000 +Subject: io_uring/uring_cmd: fix too strict requirement on ioctl +To: stable@vger.kernel.org +Cc: "Asbjørn Sloth Tønnesen" , "Jens Axboe" , io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, 
"Gabriel Krisman Bertazi" +Message-ID: <20260228180858.66938-1-ast@fiberby.net> + +From: "Asbjørn Sloth Tønnesen" + +[ Upstream commit 600b665b903733bd60334e86031b157cc823ee55 ] + +Attempting SOCKET_URING_OP_SETSOCKOPT on an AF_NETLINK socket resulted +in an -EOPNOTSUPP, as AF_NETLINK doesn't have an ioctl in its struct +proto, but only in struct proto_ops. + +Prior to the blamed commit, io_uring_cmd_sock() only had two cmd_op +operations, both requiring ioctl, thus the check was warranted. + +Since then, 4 new cmd_op operations have been added, none of which +depend on ioctl. This patch moves the ioctl check, so it only applies +to the original operations. + +AFAICT, the ioctl requirement was unintentional, and it wasn't +visible in the blamed patch within 3 lines of context. + +Cc: stable@vger.kernel.org +Fixes: a5d2f99aff6b ("io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT") +Signed-off-by: Asbjørn Sloth Tønnesen +Reviewed-by: Gabriel Krisman Bertazi +Signed-off-by: Jens Axboe +[Asbjørn: function moved in commit 91db6edc573b; updated subject prefix] +Signed-off-by: Asbjørn Sloth Tønnesen +Signed-off-by: Greg Kroah-Hartman +--- + io_uring/uring_cmd.c | 9 ++++++--- + 1 file changed, 6 insertions(+), 3 deletions(-) + +--- a/io_uring/uring_cmd.c ++++ b/io_uring/uring_cmd.c +@@ -338,16 +338,19 @@ int io_uring_cmd_sock(struct io_uring_cm + struct proto *prot = READ_ONCE(sk->sk_prot); + int ret, arg = 0; + +- if (!prot || !prot->ioctl) +- return -EOPNOTSUPP; +- + switch (cmd->cmd_op) { + case SOCKET_URING_OP_SIOCINQ: ++ if (!prot || !prot->ioctl) ++ return -EOPNOTSUPP; ++ + ret = prot->ioctl(sk, SIOCINQ, &arg); + if (ret) + return ret; + return arg; + case SOCKET_URING_OP_SIOCOUTQ: ++ if (!prot || !prot->ioctl) ++ return -EOPNOTSUPP; ++ + ret = prot->ioctl(sk, SIOCOUTQ, &arg); + if (ret) + return ret; diff --git a/queue-6.12/ipv6-use-rcu-in-ip6_xmit.patch b/queue-6.12/ipv6-use-rcu-in-ip6_xmit.patch new file mode 100644 index 0000000000..97aeff4c88 --- /dev/null +++ 
b/queue-6.12/ipv6-use-rcu-in-ip6_xmit.patch @@ -0,0 +1,109 @@ +From 9085e56501d93af9f2d7bd16f7fcfacdde47b99c Mon Sep 17 00:00:00 2001 +From: Eric Dumazet +Date: Thu, 28 Aug 2025 19:58:18 +0000 +Subject: ipv6: use RCU in ip6_xmit() + +From: Eric Dumazet + +commit 9085e56501d93af9f2d7bd16f7fcfacdde47b99c upstream. + +Use RCU in ip6_xmit() in order to use dst_dev_rcu() to prevent +possible UAF. + +Fixes: 4a6ce2b6f2ec ("net: introduce a new function dst_dev_put()") +Signed-off-by: Eric Dumazet +Reviewed-by: David Ahern +Link: https://patch.msgid.link/20250828195823.3958522-4-edumazet@google.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Keerthana K +Signed-off-by: Shivani Agarwal +Signed-off-by: Greg Kroah-Hartman +--- + net/ipv6/ip6_output.c | 35 +++++++++++++++++++++-------------- + 1 file changed, 21 insertions(+), 14 deletions(-) + +--- a/net/ipv6/ip6_output.c ++++ b/net/ipv6/ip6_output.c +@@ -267,35 +267,36 @@ bool ip6_autoflowlabel(struct net *net, + int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6, + __u32 mark, struct ipv6_txoptions *opt, int tclass, u32 priority) + { +- struct net *net = sock_net(sk); + const struct ipv6_pinfo *np = inet6_sk(sk); + struct in6_addr *first_hop = &fl6->daddr; + struct dst_entry *dst = skb_dst(skb); +- struct net_device *dev = dst_dev(dst); + struct inet6_dev *idev = ip6_dst_idev(dst); + struct hop_jumbo_hdr *hop_jumbo; + int hoplen = sizeof(*hop_jumbo); ++ struct net *net = sock_net(sk); + unsigned int head_room; ++ struct net_device *dev; + struct ipv6hdr *hdr; + u8 proto = fl6->flowi6_proto; + int seg_len = skb->len; +- int hlimit = -1; ++ int ret, hlimit = -1; + u32 mtu; + ++ rcu_read_lock(); ++ ++ dev = dst_dev_rcu(dst); + head_room = sizeof(struct ipv6hdr) + hoplen + LL_RESERVED_SPACE(dev); + if (opt) + head_room += opt->opt_nflen + opt->opt_flen; + + if (unlikely(head_room > skb_headroom(skb))) { +- /* Make sure idev stays alive */ +- rcu_read_lock(); ++ /* idev stays alive while we hold 
rcu_read_lock(). */ + skb = skb_expand_head(skb, head_room); + if (!skb) { + IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS); +- rcu_read_unlock(); +- return -ENOBUFS; ++ ret = -ENOBUFS; ++ goto unlock; + } +- rcu_read_unlock(); + } + + if (opt) { +@@ -357,17 +358,21 @@ int ip6_xmit(const struct sock *sk, stru + * skb to its handler for processing + */ + skb = l3mdev_ip6_out((struct sock *)sk, skb); +- if (unlikely(!skb)) +- return 0; ++ if (unlikely(!skb)) { ++ ret = 0; ++ goto unlock; ++ } + + /* hooks should never assume socket lock is held. + * we promote our socket to non const + */ +- return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT, +- net, (struct sock *)sk, skb, NULL, dev, +- dst_output); ++ ret = NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT, ++ net, (struct sock *)sk, skb, NULL, dev, ++ dst_output); ++ goto unlock; + } + ++ ret = -EMSGSIZE; + skb->dev = dev; + /* ipv6_local_error() does not require socket lock, + * we promote our socket to non const +@@ -376,7 +381,9 @@ int ip6_xmit(const struct sock *sk, stru + + IP6_INC_STATS(net, idev, IPSTATS_MIB_FRAGFAILS); + kfree_skb(skb); +- return -EMSGSIZE; ++unlock: ++ rcu_read_unlock(); ++ return ret; + } + EXPORT_SYMBOL(ip6_xmit); + diff --git a/queue-6.12/octeontx2-af-add-proper-checks-for-fwdata.patch b/queue-6.12/octeontx2-af-add-proper-checks-for-fwdata.patch new file mode 100644 index 0000000000..1a7bca4f01 --- /dev/null +++ b/queue-6.12/octeontx2-af-add-proper-checks-for-fwdata.patch @@ -0,0 +1,62 @@ +From stable+bounces-219935-greg=kroah.com@vger.kernel.org Fri Feb 27 09:38:14 2026 +From: Rajani Kantha <681739313@139.com> +Date: Fri, 27 Feb 2026 16:34:51 +0800 +Subject: Octeontx2-af: Add proper checks for fwdata +To: hkelam@marvell.com, kuba@kernel.org, stable@vger.kernel.org +Message-ID: <20260227083451.20062-1-681739313@139.com> + +From: Hariprasad Kelam + +[ Upstream commit 4a3dba48188208e4f66822800e042686784d29d1 ] + +firmware populates MAC address, link modes (supported, advertised) +and EEPROM data in 
shared firmware structure which kernel access +via MAC block(CGX/RPM). + +Accessing fwdata, on boards booted with out MAC block leading to +kernel panics. + +Internal error: Oops: 0000000096000005 [#1] SMP +[ 10.460721] Modules linked in: +[ 10.463779] CPU: 0 UID: 0 PID: 174 Comm: kworker/0:3 Not tainted 6.19.0-rc5-00154-g76ec646abdf7-dirty #3 PREEMPT +[ 10.474045] Hardware name: Marvell OcteonTX CN98XX board (DT) +[ 10.479793] Workqueue: events work_for_cpu_fn +[ 10.484159] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) +[ 10.491124] pc : rvu_sdp_init+0x18/0x114 +[ 10.495051] lr : rvu_probe+0xe58/0x1d18 + +Fixes: 997814491cee ("Octeontx2-af: Fetch MAC channel info from firmware") +Fixes: 5f21226b79fd ("Octeontx2-pf: ethtool: support multi advertise mode") +Signed-off-by: Hariprasad Kelam +Link: https://patch.msgid.link/20260121094819.2566786-1-hkelam@marvell.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Rajani Kantha <681739313@139.com> +Signed-off-by: Greg Kroah-Hartman +--- + drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c | 3 +++ + drivers/net/ethernet/marvell/octeontx2/af/rvu_sdp.c | 2 +- + 2 files changed, 4 insertions(+), 1 deletion(-) + +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c +@@ -1224,6 +1224,9 @@ int rvu_mbox_handler_cgx_set_link_mode(s + u8 cgx_idx, lmac; + void *cgxd; + ++ if (!rvu->fwdata) ++ return LMAC_AF_ERR_FIRMWARE_DATA_NOT_MAPPED; ++ + if (!is_cgx_config_permitted(rvu, req->hdr.pcifunc)) + return -EPERM; + +--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_sdp.c ++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_sdp.c +@@ -56,7 +56,7 @@ int rvu_sdp_init(struct rvu *rvu) + struct rvu_pfvf *pfvf; + u32 i = 0; + +- if (rvu->fwdata->channel_data.valid) { ++ if (rvu->fwdata && rvu->fwdata->channel_data.valid) { + sdp_pf_num[0] = 0; + pfvf = &rvu->pf[sdp_pf_num[0]]; + pfvf->sdp_info = &rvu->fwdata->channel_data.info; diff --git 
a/queue-6.12/platform-x86-amd-pmc-add-support-for-van-gogh-soc.patch b/queue-6.12/platform-x86-amd-pmc-add-support-for-van-gogh-soc.patch new file mode 100644 index 0000000000..0da253d7b7 --- /dev/null +++ b/queue-6.12/platform-x86-amd-pmc-add-support-for-van-gogh-soc.patch @@ -0,0 +1,71 @@ +From stable+bounces-222779-greg=kroah.com@vger.kernel.org Tue Mar 3 05:18:24 2026 +From: Alva Lan +Date: Tue, 3 Mar 2026 12:15:12 +0800 +Subject: platform/x86/amd/pmc: Add support for Van Gogh SoC +To: gregkh@linuxfoundation.org, stable@vger.kernel.org +Cc: platform-driver-x86@vger.kernel.org, "Antheas Kapenekakis" , "Mario Limonciello" , "Shyam Sundar S K" , "Ilpo Järvinen" , "Alva Lan" +Message-ID: + +From: Antheas Kapenekakis + +[ Upstream commit db4a3f0fbedb0398f77b9047e8b8bb2b49f355bb ] + +The ROG Xbox Ally (non-X) SoC features a similar architecture to the +Steam Deck. While the Steam Deck supports S3 (s2idle causes a crash), +this support was dropped by the Xbox Ally which only S0ix suspend. + +Since the handler is missing here, this causes the device to not suspend +and the AMD GPU driver to crash while trying to resume afterwards due to +a power hang. 
+ +Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659 +Signed-off-by: Antheas Kapenekakis +Reviewed-by: Mario Limonciello (AMD) +Acked-by: Shyam Sundar S K +Link: https://patch.msgid.link/20251024152152.3981721-2-lkml@antheas.dev +Reviewed-by: Ilpo Järvinen +Signed-off-by: Ilpo Järvinen +[ Adjust context ] +Signed-off-by: Alva Lan +Signed-off-by: Greg Kroah-Hartman +--- + drivers/platform/x86/amd/pmc/pmc.c | 3 +++ + drivers/platform/x86/amd/pmc/pmc.h | 1 + + 2 files changed, 4 insertions(+) + +--- a/drivers/platform/x86/amd/pmc/pmc.c ++++ b/drivers/platform/x86/amd/pmc/pmc.c +@@ -347,6 +347,7 @@ static void amd_pmc_get_ip_info(struct a + switch (dev->cpu_id) { + case AMD_CPU_ID_PCO: + case AMD_CPU_ID_RN: ++ case AMD_CPU_ID_VG: + case AMD_CPU_ID_YC: + case AMD_CPU_ID_CB: + dev->num_ips = 12; +@@ -765,6 +766,7 @@ static int amd_pmc_get_os_hint(struct am + case AMD_CPU_ID_PCO: + return MSG_OS_HINT_PCO; + case AMD_CPU_ID_RN: ++ case AMD_CPU_ID_VG: + case AMD_CPU_ID_YC: + case AMD_CPU_ID_CB: + case AMD_CPU_ID_PS: +@@ -977,6 +979,7 @@ static const struct pci_device_id pmc_pc + { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_PCO) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_RV) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SP) }, ++ { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_VG) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_ROOT) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_ROOT) }, + { } +--- a/drivers/platform/x86/amd/pmc/pmc.h ++++ b/drivers/platform/x86/amd/pmc/pmc.h +@@ -62,6 +62,7 @@ void amd_mp2_stb_deinit(struct amd_pmc_d + #define AMD_CPU_ID_RN 0x1630 + #define AMD_CPU_ID_PCO AMD_CPU_ID_RV + #define AMD_CPU_ID_CZN AMD_CPU_ID_RN ++#define AMD_CPU_ID_VG 0x1645 + #define AMD_CPU_ID_YC 0x14B5 + #define AMD_CPU_ID_CB 0x14D8 + #define AMD_CPU_ID_PS 0x14E8 diff --git a/queue-6.12/rxrpc-fix-recvmsg-unconditional-requeue.patch b/queue-6.12/rxrpc-fix-recvmsg-unconditional-requeue.patch new file mode 100644 index 
0000000000..81a9fe5272 --- /dev/null +++ b/queue-6.12/rxrpc-fix-recvmsg-unconditional-requeue.patch @@ -0,0 +1,106 @@ +From stable+bounces-219775-greg=kroah.com@vger.kernel.org Thu Feb 26 09:28:14 2026 +From: Robert Garcia +Date: Thu, 26 Feb 2026 16:23:46 +0800 +Subject: rxrpc: Fix recvmsg() unconditional requeue +To: stable@vger.kernel.org, David Howells +Cc: Marc Dionne , Robert Garcia , Steven Rostedt , linux-kernel@vger.kernel.org, Masami Hiramatsu , "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , linux-afs@lists.infradead.org, linux-trace-kernel@vger.kernel.org, netdev@vger.kernel.org, Faith , Pumpkin Chang , Nir Ohfeld , Willy Tarreau , Simon Horman +Message-ID: <20260226082346.3200864-1-rob_garcia@163.com> + +From: David Howells + +[ Upstream commit 2c28769a51deb6022d7fbd499987e237a01dd63a ] + +If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at +the front of the recvmsg queue already has its mutex locked, it requeues +the call - whether or not the call is already queued. The call may be on +the queue because MSG_PEEK was also passed and so the call was not dequeued +or because the I/O thread requeued it. + +The unconditional requeue may then corrupt the recvmsg queue, leading to +things like UAFs or refcount underruns. + +Fix this by only requeuing the call if it isn't already on the queue - and +moving it to the front if it is already queued. If we don't queue it, we +have to put the ref we obtained by dequeuing it. + +Also, MSG_PEEK doesn't dequeue the call so shouldn't call +rxrpc_notify_socket() for the call if we didn't use up all the data on the +queue, so fix that also. 
+
+Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
+Reported-by: Faith
+Reported-by: Pumpkin Chang
+Signed-off-by: David Howells
+Acked-by: Marc Dionne
+cc: Nir Ohfeld
+cc: Willy Tarreau
+cc: Simon Horman
+cc: linux-afs@lists.infradead.org
+cc: stable@kernel.org
+Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk
+Signed-off-by: Jakub Kicinski
+[Use spin_unlock instead of spin_unlock_irq to maintain context consistency.]
+Signed-off-by: Robert Garcia
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/trace/events/rxrpc.h |  4 ++++
+ net/rxrpc/recvmsg.c          | 19 +++++++++++++++----
+ 2 files changed, 19 insertions(+), 4 deletions(-)
+
+--- a/include/trace/events/rxrpc.h
++++ b/include/trace/events/rxrpc.h
+@@ -274,6 +274,7 @@
+ 	EM(rxrpc_call_put_kernel, "PUT kernel ") \
+ 	EM(rxrpc_call_put_poke, "PUT poke ") \
+ 	EM(rxrpc_call_put_recvmsg, "PUT recvmsg ") \
++	EM(rxrpc_call_put_recvmsg_peek_nowait, "PUT peek-nwt") \
+ 	EM(rxrpc_call_put_release_sock, "PUT rls-sock") \
+ 	EM(rxrpc_call_put_release_sock_tba, "PUT rls-sk-a") \
+ 	EM(rxrpc_call_put_sendmsg, "PUT sendmsg ") \
+@@ -291,6 +292,9 @@
+ 	EM(rxrpc_call_see_distribute_error, "SEE dist-err") \
+ 	EM(rxrpc_call_see_input, "SEE input ") \
+ 	EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \
++	EM(rxrpc_call_see_recvmsg_requeue, "SEE recv-rqu") \
++	EM(rxrpc_call_see_recvmsg_requeue_first, "SEE recv-rqF") \
++	EM(rxrpc_call_see_recvmsg_requeue_move, "SEE recv-rqM") \
+ 	EM(rxrpc_call_see_release, "SEE release ") \
+ 	EM(rxrpc_call_see_userid_exists, "SEE u-exists") \
+ 	EM(rxrpc_call_see_waiting_call, "SEE q-conn ") \
+--- a/net/rxrpc/recvmsg.c
++++ b/net/rxrpc/recvmsg.c
+@@ -430,7 +430,8 @@ try_again:
+ 	if (rxrpc_call_has_failed(call))
+ 		goto call_failed;
+
+-	if (!skb_queue_empty(&call->recvmsg_queue))
++	if (!(flags & MSG_PEEK) &&
++	    !skb_queue_empty(&call->recvmsg_queue))
+ 		rxrpc_notify_socket(call);
+ 	goto not_yet_complete;
+
+@@ -461,11 +462,21 @@ error_unlock_call:
+ error_requeue_call:
+ 	if (!(flags & MSG_PEEK)) {
+ 		spin_lock(&rx->recvmsg_lock);
+-		list_add(&call->recvmsg_link, &rx->recvmsg_q);
+-		spin_unlock(&rx->recvmsg_lock);
++		if (list_empty(&call->recvmsg_link)) {
++			list_add(&call->recvmsg_link, &rx->recvmsg_q);
++			rxrpc_see_call(call, rxrpc_call_see_recvmsg_requeue);
++			spin_unlock(&rx->recvmsg_lock);
++		} else if (list_is_first(&call->recvmsg_link, &rx->recvmsg_q)) {
++			spin_unlock(&rx->recvmsg_lock);
++			rxrpc_put_call(call, rxrpc_call_see_recvmsg_requeue_first);
++		} else {
++			list_move(&call->recvmsg_link, &rx->recvmsg_q);
++			spin_unlock(&rx->recvmsg_lock);
++			rxrpc_put_call(call, rxrpc_call_see_recvmsg_requeue_move);
++		}
+ 		trace_rxrpc_recvmsg(call_debug_id, rxrpc_recvmsg_requeue, 0);
+ 	} else {
+-		rxrpc_put_call(call, rxrpc_call_put_recvmsg);
++		rxrpc_put_call(call, rxrpc_call_put_recvmsg_peek_nowait);
+ 	}
+ error_no_call:
+ 	release_sock(&rx->sk);
diff --git a/queue-6.12/series b/queue-6.12/series
index 603a4f1ba9..65b80b411c 100644
--- a/queue-6.12/series
+++ b/queue-6.12/series
@@ -270,3 +270,14 @@ net-macb-shuffle-the-tx-ring-before-enabling-tx.patch
 cifs-open-files-should-not-hold-ref-on-superblock.patch
 crypto-atmel-sha204a-fix-oom-tfm_count-leak.patch
 xfs-fix-integer-overflow-in-bmap-intent-sort-comparator.patch
+drm-xe-sync-cleanup-partially-initialized-sync-on-parse-failure.patch
+ipv6-use-rcu-in-ip6_xmit.patch
+dm-verity-disable-recursive-forward-error-correction.patch
+rxrpc-fix-recvmsg-unconditional-requeue.patch
+btrfs-do-not-strictly-require-dirty-metadata-threshold-for-metadata-writepages.patch
+ice-fix-devlink-reload-call-trace.patch
+tracing-add-recursion-protection-in-kernel-stack-trace-recording.patch
+octeontx2-af-add-proper-checks-for-fwdata.patch
+io_uring-uring_cmd-fix-too-strict-requirement-on-ioctl.patch
+x86-uprobes-fix-xol-allocation-failure-for-32-bit-tasks.patch
+platform-x86-amd-pmc-add-support-for-van-gogh-soc.patch
diff --git a/queue-6.12/tracing-add-recursion-protection-in-kernel-stack-trace-recording.patch b/queue-6.12/tracing-add-recursion-protection-in-kernel-stack-trace-recording.patch
new file mode 100644
index 0000000000..cc93519edc
--- /dev/null
+++ b/queue-6.12/tracing-add-recursion-protection-in-kernel-stack-trace-recording.patch
@@ -0,0 +1,93 @@
+From stable+bounces-220035-greg=kroah.com@vger.kernel.org Sat Feb 28 03:51:36 2026
+From: Leon Chen
+Date: Sat, 28 Feb 2026 10:51:24 +0800
+Subject: tracing: Add recursion protection in kernel stack trace recording
+To: mhiramat@kernel.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, joel@joelfernandes.org, paulmck@kernel.org, boqun.feng@gmail.com, stable@vger.kernel.org
+Message-ID: <20260228025124.4590-1-leonchen.oss@139.com>
+
+From: Steven Rostedt
+
+[ Upstream commit 5f1ef0dfcb5b7f4a91a9b0e0ba533efd9f7e2cdb ]
+
+A bug was reported about an infinite recursion caused by tracing the rcu
+events with the kernel stack trace trigger enabled. The stack trace code
+called back into RCU which then called the stack trace again.
+
+Expand the ftrace recursion protection to add a set of bits to protect
+events from recursion. Each bit represents the context that the event is
+in (normal, softirq, interrupt and NMI).
+
+Have the stack trace code use the interrupt context to protect against
+recursion.
+
+Note, the bug showed an issue in both the RCU code as well as the tracing
+stacktrace code. This only handles the tracing stack trace side of the
+bug. The RCU fix will be handled separately.
+
+Link: https://lore.kernel.org/all/20260102122807.7025fc87@gandalf.local.home/
+
+Cc: stable@vger.kernel.org
+Cc: Masami Hiramatsu
+Cc: Mathieu Desnoyers
+Cc: Joel Fernandes
+Cc: "Paul E. McKenney"
+Cc: Boqun Feng
+Link: https://patch.msgid.link/20260105203141.515cd49f@gandalf.local.home
+Reported-by: Yao Kai
+Tested-by: Yao Kai
+Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Leon Chen
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/linux/trace_recursion.h | 9 +++++++++
+ kernel/trace/trace.c            | 6 ++++++
+ 2 files changed, 15 insertions(+)
+
+--- a/include/linux/trace_recursion.h
++++ b/include/linux/trace_recursion.h
+@@ -34,6 +34,13 @@ enum {
+ 	TRACE_INTERNAL_SIRQ_BIT,
+ 	TRACE_INTERNAL_TRANSITION_BIT,
+
++	/* Internal event use recursion bits */
++	TRACE_INTERNAL_EVENT_BIT,
++	TRACE_INTERNAL_EVENT_NMI_BIT,
++	TRACE_INTERNAL_EVENT_IRQ_BIT,
++	TRACE_INTERNAL_EVENT_SIRQ_BIT,
++	TRACE_INTERNAL_EVENT_TRANSITION_BIT,
++
+ 	TRACE_BRANCH_BIT,
+ /*
+  * Abuse of the trace_recursion.
+@@ -58,6 +65,8 @@ enum {
+
+ #define TRACE_LIST_START TRACE_INTERNAL_BIT
+
++#define TRACE_EVENT_START TRACE_INTERNAL_EVENT_BIT
++
+ #define TRACE_CONTEXT_MASK ((1 << (TRACE_LIST_START + TRACE_CONTEXT_BITS)) - 1)
+
+ /*
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -2944,6 +2944,11 @@ static void __ftrace_trace_stack(struct
+ 	struct ftrace_stack *fstack;
+ 	struct stack_entry *entry;
+ 	int stackidx;
++	int bit;
++
++	bit = trace_test_and_set_recursion(_THIS_IP_, _RET_IP_, TRACE_EVENT_START);
++	if (bit < 0)
++		return;
+
+ 	/*
+ 	 * Add one, for this function and the call to save_stack_trace()
+@@ -3015,6 +3020,7 @@ static void __ftrace_trace_stack(struct
+ 	__this_cpu_dec(ftrace_stack_reserve);
+ 	preempt_enable_notrace();
+
++	trace_clear_recursion(bit);
+ }
+
+ static inline void ftrace_trace_stack(struct trace_array *tr,
diff --git a/queue-6.12/x86-uprobes-fix-xol-allocation-failure-for-32-bit-tasks.patch b/queue-6.12/x86-uprobes-fix-xol-allocation-failure-for-32-bit-tasks.patch
new file mode 100644
index 0000000000..bc9f8a71cd
--- /dev/null
+++ b/queue-6.12/x86-uprobes-fix-xol-allocation-failure-for-32-bit-tasks.patch
@@ -0,0 +1,130 @@
+From stable+bounces-222623-greg=kroah.com@vger.kernel.org Mon Mar 2 16:23:30 2026
+From: Oleg Nesterov
+Date: Mon, 2 Mar 2026 16:14:27 +0100
+Subject: x86/uprobes: Fix XOL allocation failure for 32-bit tasks
+To: Sasha Levin
+Cc: stable@vger.kernel.org, Paulo Andrade , "Peter Zijlstra (Intel)" , linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
+Message-ID:
+Content-Disposition: inline
+
+From: Oleg Nesterov
+
+[ Upstream commit d55c571e4333fac71826e8db3b9753fadfbead6a ]
+
+This script
+
+	#!/usr/bin/bash
+
+	echo 0 > /proc/sys/kernel/randomize_va_space
+
+	echo 'void main(void) {}' > TEST.c
+
+	# -fcf-protection to ensure that the 1st endbr32 insn can't be emulated
+	gcc -m32 -fcf-protection=branch TEST.c -o test
+
+	bpftrace -e 'uprobe:./test:main {}' -c ./test
+
+"hangs", the probed ./test task enters an endless loop.
+
+The problem is that with randomize_va_space == 0
+get_unmapped_area(TASK_SIZE - PAGE_SIZE) called by xol_add_vma() can not
+just return the "addr == TASK_SIZE - PAGE_SIZE" hint, this addr is used
+by the stack vma.
+
+arch_get_unmapped_area_topdown() doesn't take TIF_ADDR32 into account and
+in_32bit_syscall() is false, this leads to info.high_limit > TASK_SIZE.
+vm_unmapped_area() happily returns the high address > TASK_SIZE and then
+get_unmapped_area() returns -ENOMEM after the "if (addr > TASK_SIZE - len)"
+check.
+
+handle_swbp() doesn't report this failure (probably it should) and silently
+restarts the probed insn. Endless loop.
+
+I think that the right fix should change the x86 get_unmapped_area() paths
+to rely on TIF_ADDR32 rather than in_32bit_syscall(). Note also that if
+CONFIG_X86_X32_ABI=y, in_x32_syscall() falsely returns true in this case
+because ->orig_ax = -1.
+
+But we need a simple fix for -stable, so this patch just sets TS_COMPAT if
+the probed task is 32-bit to make in_ia32_syscall() true.
+
+Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
+Reported-by: Paulo Andrade
+Signed-off-by: Oleg Nesterov
+Signed-off-by: Peter Zijlstra (Intel)
+Link: https://lore.kernel.org/all/aV5uldEvV7pb4RA8@redhat.com/
+Cc: stable@vger.kernel.org
+Link: https://patch.msgid.link/aWO7Fdxn39piQnxu@redhat.com
+Signed-off-by: Greg Kroah-Hartman
+---
+ arch/x86/kernel/uprobes.c | 24 ++++++++++++++++++++++++
+ include/linux/uprobes.h   |  1 +
+ kernel/events/uprobes.c   | 10 +++++++---
+ 3 files changed, 32 insertions(+), 3 deletions(-)
+
+--- a/arch/x86/kernel/uprobes.c
++++ b/arch/x86/kernel/uprobes.c
+@@ -1223,3 +1223,27 @@ bool arch_uretprobe_is_alive(struct retu
+ 	else
+ 		return regs->sp <= ret->stack;
+ }
++
++#ifdef CONFIG_IA32_EMULATION
++unsigned long arch_uprobe_get_xol_area(void)
++{
++	struct thread_info *ti = current_thread_info();
++	unsigned long vaddr;
++
++	/*
++	 * HACK: we are not in a syscall, but x86 get_unmapped_area() paths
++	 * ignore TIF_ADDR32 and rely on in_32bit_syscall() to calculate
++	 * vm_unmapped_area_info.high_limit.
++	 *
++	 * The #ifdef above doesn't cover the CONFIG_X86_X32_ABI=y case,
++	 * but in this case in_32bit_syscall() -> in_x32_syscall() always
++	 * (falsely) returns true because ->orig_ax == -1.
++	 */
++	if (test_thread_flag(TIF_ADDR32))
++		ti->status |= TS_COMPAT;
++	vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
++	ti->status &= ~TS_COMPAT;
++
++	return vaddr;
++}
++#endif
+--- a/include/linux/uprobes.h
++++ b/include/linux/uprobes.h
+@@ -146,6 +146,7 @@ extern void arch_uprobe_copy_ixol(struct
+ extern void uprobe_handle_trampoline(struct pt_regs *regs);
+ extern void *arch_uprobe_trampoline(unsigned long *psize);
+ extern unsigned long uprobe_get_trampoline_vaddr(void);
++extern unsigned long arch_uprobe_get_xol_area(void);
+ #else /* !CONFIG_UPROBES */
+ struct uprobes_state {
+ };
+--- a/kernel/events/uprobes.c
++++ b/kernel/events/uprobes.c
+@@ -1493,6 +1493,12 @@ static const struct vm_special_mapping x
+ 	.fault = xol_fault,
+ };
+
++unsigned long __weak arch_uprobe_get_xol_area(void)
++{
++	/* Try to map as high as possible, this is only a hint. */
++	return get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
++}
++
+ /* Slot allocation for XOL */
+ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
+ {
+@@ -1508,9 +1514,7 @@ static int xol_add_vma(struct mm_struct
+ 	}
+
+ 	if (!area->vaddr) {
+-		/* Try to map as high as possible, this is only a hint. */
+-		area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE,
+-						PAGE_SIZE, 0, 0);
++		area->vaddr = arch_uprobe_get_xol_area();
+ 		if (IS_ERR_VALUE(area->vaddr)) {
+ 			ret = area->vaddr;
+ 			goto fail;