]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
2 months agoACPI: battery: Drop redundant checks from acpi_battery_remove()
Rafael J. Wysocki [Thu, 12 Feb 2026 13:26:54 +0000 (14:26 +0100)] 
ACPI: battery: Drop redundant checks from acpi_battery_remove()

In acpi_battery_remove(), "battery" cannot be NULL because it is the
driver data of the platform device passed to that function and it has
been set by acpi_battery_probe(), so drop the redundant check of it
against NULL.

Moreover, getting the ACPI device pointer from battery->device is
slightly less overhead than using the ACPI_COMPANION() macro on the
platform device to retrieve it, so do that and drop the check of that
pointer against NULL which is also redundant.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12836976.O9o76ZdvQC@rafael.j.wysocki
2 months agoMerge tag 'amd-drm-next-6.20-2026-02-06' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 13 Feb 2026 01:46:51 +0000 (11:46 +1000)] 
Merge tag 'amd-drm-next-6.20-2026-02-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.20-2026-02-06:

amdgpu:
- DML 2.1 fixes
- Panel replay fixes
- Display writeback fixes
- MES 11 old firmware compat fix
- DC CRC improvements
- DPIA fixes
- XGMI fixes
- ASPM fix
- SMU feature bit handling fixes
- DC LUT fixes
- RAS fixes
- Misc memory leak in error path fixes
- SDMA queue reset fixes
- PG handling fixes
- 5 level GPUVM page table fix
- SR-IOV fix
- Queue reset fix

amdkfd:
- Fix possible double deletion of validate list
- Event setup fix
- Device disconnect regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260206192706.59396-1-alexander.deucher@amd.com
2 months agoMerge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 13 Feb 2026 03:17:44 +0000 (19:17 -0800)] 
Merge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Paul Walmsley:

 - Add support for control flow integrity for userspace processes.

   This is based on the standard RISC-V ISA extensions Zicfiss and
   Zicfilp

 - Improve ptrace behavior regarding vector registers, and add some
   selftests

 - Optimize our strlen() assembly

 - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for
   EFI volume mounting

 - Clean up some code slightly, including defining copy_user_page() as
   copy_page() rather than memcpy(), aligning us with other
   architectures; and using max3() to slightly simplify an expression
   in riscv_iommu_init_check()

* tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  riscv: lib: optimize strlen loop efficiency
  selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function
  selftests: riscv: verify ptrace accepts valid vector csr values
  selftests: riscv: verify ptrace rejects invalid vector csr inputs
  selftests: riscv: verify syscalls discard vector context
  selftests: riscv: verify initial vector state with ptrace
  selftests: riscv: test ptrace vector interface
  riscv: ptrace: validate input vector csr registers
  riscv: csr: define vtype register elements
  riscv: vector: init vector context with proper vlenb
  riscv: ptrace: return ENODATA for inactive vector extension
  kselftest/riscv: add kselftest for user mode CFI
  riscv: add documentation for shadow stack
  riscv: add documentation for landing pad / indirect branch tracking
  riscv: create a Kconfig fragment for shadow stack and landing pad support
  arch/riscv: add dual vdso creation logic and select vdso based on hw
  arch/riscv: compile vdso with landing pad and shadow stack note
  riscv: enable kernel access to shadow stack memory via the FWFT SBI call
  riscv: add kernel command line option to opt out of user CFI
  riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe
  ...

2 months agoMerge branch 'net-mscc-ocelot-fix-missing-lock-in-ocelot_port_xmit'
Jakub Kicinski [Fri, 13 Feb 2026 03:01:17 +0000 (19:01 -0800)] 
Merge branch 'net-mscc-ocelot-fix-missing-lock-in-ocelot_port_xmit'

Ziyi Guo says:

====================
net: mscc: ocelot: fix missing lock in ocelot_port_xmit()

ocelot_port_xmit() calls ocelot_can_inject() and
ocelot_port_inject_frame() without holding the injection group lock.
Both functions contain lockdep_assert_held() for the injection lock,
and the correct caller felix_port_deferred_xmit() properly acquires
the lock using ocelot_lock_inj_grp() before calling these functions.

this v3 splits the fix into a 3-patch series to separate
refactoring from the behavioral change:

  1/3: Extract the PTP timestamp handling into an ocelot_xmit_timestamp()
       helper so the logic isn't duplicated when the function is split.

  2/3: Split ocelot_port_xmit() into ocelot_port_xmit_fdma() and
       ocelot_port_xmit_inj(), keeping the FDMA and register injection
       code paths fully separate.

  3/3: Add ocelot_lock_inj_grp()/ocelot_unlock_inj_grp() in
       ocelot_port_xmit_inj() to fix the missing lock protection.

Patches 1-2 are pure refactors with no behavioral change.
Patch 3 is the actual bug fix.
====================

Link: https://patch.msgid.link/20260208225602.1339325-1-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mscc: ocelot: add missing lock protection in ocelot_port_xmit_inj()
Ziyi Guo [Sun, 8 Feb 2026 22:56:02 +0000 (22:56 +0000)] 
net: mscc: ocelot: add missing lock protection in ocelot_port_xmit_inj()

ocelot_port_xmit_inj() calls ocelot_can_inject() and
ocelot_port_inject_frame() without holding the injection group lock.
Both functions contain lockdep_assert_held() for the injection lock,
and the correct caller felix_port_deferred_xmit() properly acquires
the lock using ocelot_lock_inj_grp() before calling these functions.

Add ocelot_lock_inj_grp()/ocelot_unlock_inj_grp() around the register
injection path to fix the missing lock protection. The FDMA path is not
affected as it uses its own locking mechanism.

Fixes: c5e12ac3beb0 ("net: mscc: ocelot: serialize access to the injection/extraction groups")
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-4-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mscc: ocelot: split xmit into FDMA and register injection paths
Ziyi Guo [Sun, 8 Feb 2026 22:56:01 +0000 (22:56 +0000)] 
net: mscc: ocelot: split xmit into FDMA and register injection paths

Split ocelot_port_xmit() into two separate functions:
- ocelot_port_xmit_fdma(): handles the FDMA injection path
- ocelot_port_xmit_inj(): handles the register-based injection path

The top-level ocelot_port_xmit() now dispatches to the appropriate
function based on the ocelot_fdma_enabled static key.

This is a pure refactor with no behavioral change. Separating the two
code paths makes each one simpler and prepares for adding proper locking
to the register injection path without affecting the FDMA path.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-3-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mscc: ocelot: extract ocelot_xmit_timestamp() helper
Ziyi Guo [Sun, 8 Feb 2026 22:56:00 +0000 (22:56 +0000)] 
net: mscc: ocelot: extract ocelot_xmit_timestamp() helper

Extract the PTP timestamp handling logic from ocelot_port_xmit() into a
separate ocelot_xmit_timestamp() helper function. This is a pure
refactor with no behavioral change.

The helper returns false if the skb was consumed (freed) due to a
timestamp request failure, and true if the caller should continue with
frame injection. The rew_op value is returned via pointer.

This prepares for splitting ocelot_port_xmit() into separate FDMA and
register injection paths in a subsequent patch.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260208225602.1339325-2-n7l8m4@u.northwestern.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: sparx5/lan969x: fix DWRR cost max to match hardware register width
Daniel Machon [Tue, 10 Feb 2026 13:44:01 +0000 (14:44 +0100)] 
net: sparx5/lan969x: fix DWRR cost max to match hardware register width

DWRR (Deficit Weighted Round Robin) scheduling distributes bandwidth
across traffic classes based on per-queue cost values, where lower cost
means higher bandwidth share.

The SPX5_DWRR_COST_MAX constant is 63 (6 bits) but the hardware
register field HSCH_DWRR_ENTRY_DWRR_COST is GENMASK(24, 20), only
5 bits wide (max 31). This causes sparx5_weight_to_hw_cost() to
compute cost values that silently overflow via FIELD_PREP, resulting
in incorrect scheduling weights.

Set SPX5_DWRR_COST_MAX to 31 to match the hardware register width.

Fixes: 211225428d65 ("net: microchip: sparx5: add support for offloading ets qdisc")
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210-sparx5-fix-dwrr-cost-max-v1-1-58fbdbc25652@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoatm: fore200e: fix use-after-free in tasklets during device removal
Duoming Zhou [Tue, 10 Feb 2026 09:45:37 +0000 (17:45 +0800)] 
atm: fore200e: fix use-after-free in tasklets during device removal

When the PCA-200E or SBA-200E adapter is being detached, the fore200e
is deallocated. However, the tx_tasklet or rx_tasklet may still be running
or pending, leading to use-after-free bug when the already freed fore200e
is accessed again in fore200e_tx_tasklet() or fore200e_rx_tasklet().

One of the race conditions can occur as follows:

CPU 0 (cleanup)           | CPU 1 (tasklet)
fore200e_pca_remove_one() | fore200e_interrupt()
  fore200e_shutdown()     |   tasklet_schedule()
    kfree(fore200e)       | fore200e_tx_tasklet()
                          |   fore200e-> // UAF

Fix this by ensuring tx_tasklet or rx_tasklet is properly canceled before
the fore200e is released. Add tasklet_kill() in fore200e_shutdown() to
synchronize with any pending or running tasklets. Moreover, since
fore200e_reset() could prevent further interrupts or data transfers,
the tasklet_kill() should be placed after fore200e_reset() to prevent
the tasklet from being rescheduled in fore200e_interrupt(). Finally,
it only needs to do tasklet_kill() when the fore200e state is greater
than or equal to FORE200E_STATE_IRQ, since tasklets are uninitialized
in earlier states. In a word, the tasklet_kill() should be placed in
the FORE200E_STATE_IRQ branch within the switch...case structure.

This bug was identified through static analysis.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@kernel.org
Suggested-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210094537.9767-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: stmmac: fix oops when split header is enabled
Jie Zhang [Mon, 9 Feb 2026 22:50:32 +0000 (17:50 -0500)] 
net: stmmac: fix oops when split header is enabled

For GMAC4, when split header is enabled, in some rare cases, the
hardware does not fill buf2 of the first descriptor with payload.
Thus we cannot assume buf2 is always fully filled if it is not
the last descriptor. Otherwise, the length of buf2 of the second
descriptor will be calculated wrong and cause an oops:

Unable to handle kernel paging request at virtual address ffff00019246bfc0
...
x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000
Call trace:
 dcache_inval_poc+0x28/0x58 (P)
 dma_direct_sync_single_for_cpu+0x38/0x6c
 __dma_sync_single_for_cpu+0x34/0x6c
 stmmac_napi_poll_rx+0x8f0/0xb60
 __napi_poll.constprop.0+0x30/0x144
 net_rx_action+0x160/0x274
 handle_softirqs+0x1b8/0x1fc
...

To fix this, the PL bit-field in RDES3 register is used for all
descriptors, whether it is the last descriptor or not.

Fixes: ec222003bd94 ("net: stmmac: Prepare to add Split Header support")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Link: https://patch.msgid.link/20260209225037.589130-1-jie.zhang@analog.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: intel: fix PCI device ID conflict between i40e and ipw2200
Ethan Nelson-Moore [Tue, 10 Feb 2026 02:12:34 +0000 (18:12 -0800)] 
net: intel: fix PCI device ID conflict between i40e and ipw2200

The ID 8086:104f is matched by both i40e and ipw2200. The same device
ID should not be in more than one driver, because in that case, which
driver is used is unpredictable. Fix this by taking advantage of the
fact that i40e devices use PCI_CLASS_NETWORK_ETHERNET and ipw2200
devices use PCI_CLASS_NETWORK_OTHER to differentiate the devices.

Fixes: 2e45d3f4677a ("i40e: Add support for X710 B/P & SFP+ cards")
Cc: stable@vger.kernel.org
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20260210021235.16315-1-enelsonmoore@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: limit RPS test CPUs to supported range
Gal Pressman [Tue, 10 Feb 2026 09:31:10 +0000 (11:31 +0200)] 
selftests: drv-net: limit RPS test CPUs to supported range

The _get_unused_cpus() function can return CPU numbers >= 16, which
exceeds RPS_MAX_CPUS in toeplitz.c. When this happens, the test fails
with a cryptic message:

  # Exception| Traceback (most recent call last):
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/ksft.py", line 319, in ksft_run
  # Exception|     func(*args)
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/drivers/net/hw/toeplitz.py", line 189, in test
  # Exception|     with bkg(" ".join(rx_cmd), ksft_ready=True, exit_wait=True) as rx_proc:
  # Exception|          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/utils.py", line 124, in __init__
  # Exception|     super().__init__(comm, background=True,
  # Exception|   File "/tmp/cur/linux/tools/testing/selftests/net/lib/py/utils.py", line 77, in __init__
  # Exception|     raise Exception("Did not receive ready message")
  # Exception| Exception: Did not receive ready message

Rename _get_unused_cpus() to _get_unused_rps_cpus() and cap the CPU
search range to RPS_MAX_CPUS.

Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260210093110.1935149-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: net: lib: Fix jq parsing error
Yue Haibing [Wed, 11 Feb 2026 02:21:46 +0000 (10:21 +0800)] 
selftests: net: lib: Fix jq parsing error

The testcase failed as below:
$./vlan_bridge_binding.sh
...
+ adf_ip_link_set_up d1
+ local name=d1
+ shift
+ ip_link_is_up d1
+ ip_link_has_flag d1 UP
+ local name=d1
+ shift
+ local flag=UP
+ shift
++ ip -j link show d1
++ jq --arg flag UP 'any(.[].flags.[]; . == $flag)'
jq: error: syntax error, unexpected '[', expecting FORMAT or QQSTRING_START
 (Unix shell quoting issues?) at <top-level>, line 1:
any(.[].flags.[]; . == $flag)
jq: 1 compile error

Remove the extra dot (.) after flags array to fix this.

Fixes: 4baa1d3a5080 ("selftests: net: lib: Add ip_link_has_flag()")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260211022146.190948-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agobng_en: Remove duplicate include
Chen Ni [Wed, 11 Feb 2026 03:20:21 +0000 (11:20 +0800)] 
bng_en: Remove duplicate include

Remove duplicate inclusion of <net/netdev_queues.h>.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Link: https://patch.msgid.link/20260211032021.2719742-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: mlxsw: tc_restrictions: Fix test failure with new iproute2
Ido Schimmel [Mon, 9 Feb 2026 13:53:53 +0000 (14:53 +0100)] 
selftests: mlxsw: tc_restrictions: Fix test failure with new iproute2

As explained in [1], iproute2 started rejecting tc-police burst sizes
that result in an overflow. This can happen when the burst size is high
enough and the rate is low enough.

A couple of test cases specify such configurations, resulting in
iproute2 errors and test failure.

Fix by reducing the burst size so that the test will pass with both new
and old iproute2 versions.

[1] https://lore.kernel.org/netdev/20250916215731.3431465-1-jay.vosburgh@canonical.com/

Fixes: cb12d1763267 ("selftests: mlxsw: tc_restrictions: Test tc-police restrictions")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/88b00c6e85188aa6a065dc240206119b328c46e1.1770643998.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mctp: ensure our nlmsg responses are initialised
Jeremy Kerr [Mon, 9 Feb 2026 07:27:33 +0000 (15:27 +0800)] 
net: mctp: ensure our nlmsg responses are initialised

Syed Faraz Abrar (@farazsth98) from Zellic, and Pumpkin (@u1f383) from
DEVCORE Research Team working with Trend Micro Zero Day Initiative
report that a RTM_GETNEIGH will return uninitalised data in the pad
bytes of the ndmsg data.

Ensure we're initialising the netlink data to zero, in the link, addr
and neigh response messages.

Fixes: 831119f88781 ("mctp: Add neighbour netlink interface")
Fixes: 06d2f4c583a7 ("mctp: Add netlink route management")
Fixes: 583be982d934 ("mctp: Add device handling and netlink interface")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260209-dev-mctp-nlmsg-v1-1-f1e30c346a43@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power...
Linus Torvalds [Fri, 13 Feb 2026 02:24:37 +0000 (18:24 -0800)] 
Merge tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 "power-supply core:
   - sysfs: constify pointer passed to dev_attr_psp
   - extend DT binding documentation for battery cells to allow
     describing voltage drop behaviour

  power-supply drivers:
   - multiple: Remove unused gpio include header
   - multiple: Fix potential IRQ use-after-free on driver unload
   - bd71828: Add support for ROHM BD72720
   - misc small fixes

  reset drivers:
   - tdx-ec-poweroff: fix restart"

* tag 'for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (30 commits)
  power: supply: bd71828: Use dev_err_probe()
  dt-bindings: power: supply: google,goldfish-battery: Convert to DT schema
  power: supply: qcom_battmgr: Recognize "LiP" as lithium-polymer
  power: supply: wm97xx: Use devm_power_supply_register()
  power: supply: wm97xx: Use devm_kcalloc()
  power: supply: pm8916_lbc: Fix use-after-free for extcon in IRQ handler
  power: reset: tdx-ec-poweroff: fix restart
  docs: power: update documentation about removed function
  power: supply: wm97xx: Fix NULL pointer dereference in power_supply_changed()
  MAINTAINERS: adjust file entry in ROHM BD71828 CHARGER
  power: supply: ab8500_chargalg: improve kernel-doc
  power: supply: sysfs: Constify pointer passed to dev_attr_psp()
  power: supply: bq27xxx: fix wrong errno when bus ops are unsupported
  power: reset: nvmem-reboot-mode: respect cell size for nvmem_cell_write
  power: supply: sbs-battery: Fix use-after-free in power_supply_changed()
  power: supply: rt9455: Fix use-after-free in power_supply_changed()
  power: supply: pm8916_lbc: Fix use-after-free in power_supply_changed()
  power: supply: pm8916_bms_vm: Fix use-after-free in power_supply_changed()
  power: supply: pf1550: Fix use-after-free in power_supply_changed()
  power: supply: goldfish: Fix use-after-free in power_supply_changed()
  ...

2 months agomyri10ge: replace comma with semicolons
Chen Ni [Thu, 12 Feb 2026 05:50:28 +0000 (13:50 +0800)] 
myri10ge: replace comma with semicolons

Commit fd24173439c0 ("myri10ge: avoid uninitialized variable use")
added commas instead of semicolons

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260212055028.3248491-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Fri, 13 Feb 2026 01:49:33 +0000 (17:49 -0800)] 
Merge tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New Features:
   - Use an LRU list for returning unused delegations
   - Introduce a KConfig option to disable NFS v4.0 and make NFS v4.1
     the default

  Bugfixes:
   - NFS/localio:
       - Handle short writes by retrying
       - Prevent direct reclaim recursion into NFS via nfs_writepages
       - Use GFP_NOIO and non-memreclaim workqueue in nfs_local_commit
       - Remove -EAGAIN handling in nfs_local_doio()
   - pNFS: fix a missing wake up while waiting on NFS_LAYOUT_DRAIN
   - fs/nfs: Fix a readdir slow-start regression
   - SUNRPC: fix gss_auth kref leak in gss_alloc_msg error path

  Other cleanups and improvements:
   - A few other NFS/localio cleanups
   - Various other delegation handling cleanups from Christoph
   - Unify security_inode_listsecurity() calls
   - Improvements to NFSv4 lease handling
   - Clean up SUNRPC *_debug fields when CONFIG_SUNRPC_DEBUG is not set"

* tag 'nfs-for-7.0-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (60 commits)
  SUNRPC: fix gss_auth kref leak in gss_alloc_msg error path
  nfs: nfs4proc: Convert comma to semicolon
  SUNRPC: Change list definition method
  sunrpc: rpc_debug and others are defined even if CONFIG_SUNRPC_DEBUG unset
  NFSv4: limit lease period in nfs4_set_lease_period()
  NFSv4: pass lease period in seconds to nfs4_set_lease_period()
  nfs: unify security_inode_listsecurity() calls
  fs/nfs: Fix readdir slow-start regression
  pNFS: fix a missing wake up while waiting on NFS_LAYOUT_DRAIN
  NFS: fix delayed delegation return handling
  NFS: simplify error handling in nfs_end_delegation_return
  NFS: fold nfs_abort_delegation_return into nfs_end_delegation_return
  NFS: remove the delegation == NULL check in nfs_end_delegation_return
  NFS: use bool for the issync argument to nfs_end_delegation_return
  NFS: return void from ->return_delegation
  NFS: return void from nfs4_inode_make_writeable
  NFS: Merge CONFIG_NFS_V4_1 with CONFIG_NFS_V4
  NFS: Add a way to disable NFS v4.0 via KConfig
  NFS: Move sequence slot operations into minorversion operations
  NFS: Pass a struct nfs_client to nfs4_init_sequence()
  ...

2 months agoMerge branch 'net-phy_port-sfp-modules-representation-and-phy_port-listing'
Jakub Kicinski [Fri, 13 Feb 2026 01:45:02 +0000 (17:45 -0800)] 
Merge branch 'net-phy_port-sfp-modules-representation-and-phy_port-listing'

Maxime Chevallier says:

====================
net: phy_port: SFP modules representation and phy_port listing (part)
====================

Applying just the initial cleanup + fixes from the series.

Link: https://patch.msgid.link/20260205092317.755906-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: phy_port: Correctly recompute the port's linkmodes
Maxime Chevallier [Thu, 5 Feb 2026 09:23:06 +0000 (10:23 +0100)] 
net: phy: phy_port: Correctly recompute the port's linkmodes

a PHY-driven phy_port contains a 'supported' field containing the
linkmodes available on this port. This is populated based on :
 - The PHY's reported features
 - The DT representation of the connector
 - The PHY's attach_mdi() callback

As these different attrbution methods work in conjunction, the helper
phy_port_update_supported() recomputes the final 'supported' value based
on the populated mediums, linkmodes and pairs.

However this recompute wasn't correctly implemented, and added more
modes than necessary by or'ing the medium-specific modes to the existing
support. Let's fix this and properly filter the modes.

Fixes: 589e934d2735 ("net: phy: Introduce PHY ports representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: phy_port: Cleanup the of-parsing logic for phy_port
Maxime Chevallier [Thu, 5 Feb 2026 09:23:05 +0000 (10:23 +0100)] 
net: phy: phy_port: Cleanup the of-parsing logic for phy_port

We don't need to maintain a mediums bitfield, let's drop it and drop a
bogus check for empty mediums, as we already check it above.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: phy: initialize the port support based on the PHY's for OF ports
Maxime Chevallier [Thu, 5 Feb 2026 09:23:04 +0000 (10:23 +0100)] 
net: phy: initialize the port support based on the PHY's for OF ports

With the phy_port infrastructure came an ethernet-connector binding,
allowing to represent the MDI of a PHY in devicetree. This allows
specifying the mediums and pairs of a port.

Let's initialize the port's supported list based on what the PHY
reports, so that we can then filter it with what the connector allows
using.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://patch.msgid.link/20260205092317.755906-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotools/testing: keep legacy generated files around in .gitignore file
Linus Torvalds [Fri, 13 Feb 2026 01:19:36 +0000 (17:19 -0800)] 
tools/testing: keep legacy generated files around in .gitignore file

People keep removing generated files from .gitignore files even when the
files stay around.  Please don't do that: just because the file is no
longer being generated doesn't make it magically go away, and doesn't
make it suddenly be something that should now not be ignored any more.

Fixes: dd2c6ec24fca ("selftests/mm: remove virtual_address_range test")
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoMerge tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Fri, 13 Feb 2026 01:12:43 +0000 (17:12 -0800)] 
Merge tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ATA updates from Damien Le Moal:

 - Cleanup IRQ masking in the handling of completed report zones
   commands (Niklas)

 - Improve the handling of Thunderbolt attached devices to speed up
   device removal (Henry)

 - Several patches to generalize the existing max_sec quirks to
   facilitates quirking the maximum command size of buggy drives, many
   of which have recently showed up with the recent increase of the
   default max_sectors block limit (Niklas)

 - Cleanup the ahci-platform and sata dt-bindings schema (Rob,
   Manivannan)

 - Improve device node scan in the ahci-dwc driver (Krzysztof)

 - Remove clang W=1 warnings with the ahci-imx and ahci-xgene drivers
   (Krzysztof)

 - Fix a long standing potential command starvation situation with
   non-NCQ commands issued when NCQ commands are on-going (me)

 - Limit max_sectors to 8191 on the INTEL SSDSC2KG480G8 SSD (Niklas)

 - Remove Vesa Local Bus (VLB) support in the pata_legacy driver (Ethan)

 - Simple fixes in the pata_cypress (typo) and pata_ftide010 (timing)
   drivers (Ethan, Linus W)

* tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: pata_ftide010: Fix some DMA timings
  ata: pata_cypress: fix typo in error message
  ata: pata_legacy: remove VLB support
  ata: libata-core: Quirk INTEL SSDSC2KG480G8 max_sectors
  dt-bindings: ata: sata: Document the graph port
  ata: libata-scsi: avoid Non-NCQ command starvation
  ata: libata-scsi: refactor ata_scsi_translate()
  ata: ahci-xgene: Fix Wvoid-pointer-to-enum-cast warning
  ata: ahci-imx: Fix Wvoid-pointer-to-enum-cast warning
  ata: ahci-dwc: Simplify with scoped for each OF child loop
  dt-bindings: ata: ahci-platform: Drop unnecessary select schema
  ata: libata: Allow more quirks
  ata: libata: Add libata.force parameter max_sec
  ata: libata: Add support to parse equal sign in libata.force
  ata: libata: Change libata.force to use the generic ATA_QUIRK_MAX_SEC quirk
  ata: libata: Add ata_force_get_fe_for_dev() helper
  ata: libata: Add ATA_QUIRK_MAX_SEC and convert all device quirks
  ata: libata: avoid long timeouts on hot-unplugged SATA DAS
  ata: libata-scsi: Remove superfluous local_irq_save()

2 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 13 Feb 2026 01:05:20 +0000 (17:05 -0800)] 
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Usual smallish cycle. The NFS biovec work to push it down into RDMA
  instead of indirecting through a scatterlist is pretty nice to see,
  been talked about for a long time now.

   - Various code improvements in irdma, rtrs, qedr, ocrdma, irdma, rxe

   - Small driver improvements and minor bug fixes to hns, mlx5, rxe,
     mana, mlx5, irdma

   - Robusness improvements in completion processing for EFA

   - New query_port_speed() verb to move past limited IBA defined speed
     steps

   - Support for SG_GAPS in rts and many other small improvements

   - Rare list corruption fix in iwcm

   - Better support different page sizes in rxe

   - Device memory support for mana

   - Direct bio vec to kernel MR for use by NFS-RDMA

   - QP rate limiting for bnxt_re

   - Remote triggerable NULL pointer crash in siw

   - DMA-buf exporter support for RDMA mmaps like doorbells"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits)
  RDMA/mlx5: Implement DMABUF export ops
  RDMA/uverbs: Add DMABUF object type and operations
  RDMA/uverbs: Support external FD uobjects
  RDMA/siw: Fix potential NULL pointer dereference in header processing
  RDMA/umad: Reject negative data_len in ib_umad_write
  IB/core: Extend rate limit support for RC QPs
  RDMA/mlx5: Support rate limit only for Raw Packet QP
  RDMA/bnxt_re: Report QP rate limit in debugfs
  RDMA/bnxt_re: Report packet pacing capabilities when querying device
  RDMA/bnxt_re: Add support for QP rate limiting
  MAINTAINERS: Drop RDMA files from Hyper-V section
  RDMA/uverbs: Add __GFP_NOWARN to ib_uverbs_unmarshall_recv() kmalloc
  svcrdma: use bvec-based RDMA read/write API
  RDMA/core: add rdma_rw_max_sge() helper for SQ sizing
  RDMA/core: add MR support for bvec-based RDMA operations
  RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations
  RDMA/core: add bio_vec based RDMA read/write API
  RDMA/irdma: Use kvzalloc for paged memory DMA address array
  RDMA/rxe: Fix race condition in QP timer handlers
  RDMA/mana_ib: Add device‑memory support
  ...

2 months agoMerge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Linus Torvalds [Fri, 13 Feb 2026 00:33:05 +0000 (16:33 -0800)] 
Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:

 - Introduce cxl_memdev_attach and pave way for soft reserved handling,
   type2 accelerator enabling, and LSA 2.0 enabling. All these series
   require the endpoint driver to settle before continuing the memdev
   driver probe.

 - Address CXL port error protocol handling and reporting.

   The large patch series was split into three parts. The first two
   parts are included here with the final part coming later.

   The first part consists of a series of code refactoring to PCI AER
   sub-system that addresses CXL and also CXL RAS code to prepare for
   port error handling.

   The second part refactors the CXL code to move management of
   component registers to cxl_port objects to allow all CXL AER errors
   to be handled through the cxl_port hierarchy.

 - Provide AMD Zen5 platform address translation for CXL using ACPI
   PRMT. This includes a conventions document to explain why this is
   needed and how it's implemented.

 - Misc CXL patches of fixes, cleanups, and updates. Including CXL
   address translation for unaligned MOD3 regions.

[ TLA service: CXL is "Compute Express Link" ]

* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl/atl: Lock decoders that need address translation
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Introduce callback for HPA address ranges translation
  cxl/region: Use region data to get the root decoder
  cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  cxl/region: Separate region parameter setup and region construction
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Store HPA range in struct cxl_region
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Rename misleading variable name @hpa to @hpa_range
  Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  cxl, doc: Moving conventions in separate files
  cxl, doc: Remove isonum.txt inclusion
  cxl/port: Unify endpoint and switch port lookup
  cxl/port: Move endpoint component register management to cxl_port
  cxl/port: Map Port RAS registers
  cxl/port: Move dport RAS setup to dport add time
  ...

2 months agoMerge tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio
Linus Torvalds [Thu, 12 Feb 2026 23:52:39 +0000 (15:52 -0800)] 
Merge tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:
 "A small cycle with the bulk in selftests and reintroducing poison
  handling in the nvgrace-gpu driver. The rest are fixes, cleanups, and
  some dmabuf structure consolidation.

   - Update outdated mdev comment referencing the renamed
     mdev_type_add() function (Julia Lawall)

   - Introduce selftest support for IOMMU mapping of PCI MMIO BARs (Alex
     Mastro)

   - Relax selftest assertion relative to differences in huge page
     handling between legacy (v1) TYPE1 IOMMU mapping behavior and the
     compatibility mode supported by IOMMUFD (David Matlack)

   - Reintroduce memory poison handling support for non-struct-page-
     backed memory in the nvgrace-gpu variant driver (Ankit Agrawal)

   - Replace dma_buf_phys_vec with phys_vec to avoid duplicate structure
     and semantics (Leon Romanovsky)

   - Add missing upstream bridge locking across PCI function reset,
     resolving an assertion failure when secondary bus reset is used to
     provide that reset (Anthony Pighin)

   - Fixes to hisi_acc vfio-pci variant driver to resolve corner case
     issues related to resets, repeated migration, and error injection
     scenarios (Longfang Liu, Weili Qian)

   - Restrict vfio selftest builds to arm64 and x86_64, resolving
     compiler warnings on 32-bit archs (Ted Logan)

   - Un-deprecate the fsl-mc vfio bus driver as a new maintainer has
     stepped up (Ioana Ciornei)"

* tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio:
  vfio/fsl-mc: add myself as maintainer
  vfio: selftests: only build tests on arm64 and x86_64
  hisi_acc_vfio_pci: fix the queue parameter anomaly issue
  hisi_acc_vfio_pci: resolve duplicate migration states
  hisi_acc_vfio_pci: update status after RAS error
  hisi_acc_vfio_pci: fix VF reset timeout issue
  vfio/pci: Lock upstream bridge for vfio_pci_core_disable()
  types: reuse common phys_vec type instead of DMABUF open‑coded variant
  vfio/nvgrace-gpu: register device memory for poison handling
  mm: add stubs for PFNMAP memory failure registration functions
  vfio: selftests: Drop IOMMU mapping size assertions for VFIO_TYPE1_IOMMU
  vfio: selftests: Add vfio_dma_mapping_mmio_test
  vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
  vfio: selftests: Centralize IOMMU mode name definitions
  vfio/mdev: update outdated comment

2 months agosmb: client: fix data corruption due to racy lease checks
Paulo Alcantara [Thu, 12 Feb 2026 22:53:07 +0000 (19:53 -0300)] 
smb: client: fix data corruption due to racy lease checks

Customer reported data corruption in some of their files.  It turned
out the client would end up calling cacheless IO functions while
having RHW lease, bypassing the pagecache and then leaving gaps in the
file while writing to it.  It was related to concurrent opens changing
the lease state while having writes in flight.  Lease breaks and
re-opens due to reconnect could also cause same issue.

Fix this by serialising the lease updates with
cifsInodeInfo::open_file_lock.  When handling oplock break, make sure
to use the downgraded oplock value rather than one in cifsInodeinfo as
it could be changed concurrently.

Reported-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2 months agolib/group_cpus: handle const qualifier from clusters allocation type
Kees Cook [Fri, 6 Feb 2026 22:20:13 +0000 (14:20 -0800)] 
lib/group_cpus: handle const qualifier from clusters allocation type

In preparation for making the kmalloc family of allocators type aware, we
need to make sure that the returned type from the allocation matches the
type of the variable being assigned.  (Before, the allocator would always
return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "const struct cpumask **", but the returned type,
while matching, is not const qualified.  To get them exactly matching,
just use the dereferenced pointer for the sizeof().

Link: https://lkml.kernel.org/r/20260206222010.work.349-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Cc: Wangyang Guo <wangyang.guo@intel.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agokho: remove unnecessary WARN_ON(err) in kho_populate()
Ran Xiaokai [Thu, 12 Feb 2026 11:11:46 +0000 (11:11 +0000)] 
kho: remove unnecessary WARN_ON(err) in kho_populate()

The following pr_warn() provides detailed error and location information,
WARN_ON(err) adds no additional debugging value, so remove the redundant
WARN_ON() call.

Link: https://lkml.kernel.org/r/20260212111146.210086-3-ranxiaokai627@163.com
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agokho: fix missing early_memunmap() call in kho_populate()
Ran Xiaokai [Thu, 12 Feb 2026 11:11:45 +0000 (11:11 +0000)] 
kho: fix missing early_memunmap() call in kho_populate()

Patch series "two fixes in kho_populate()", v3.

This patch (of 2):

kho_populate() returns without calling early_memunmap() on success path,
this will cause early ioremap virtual address space leak.

Link: https://lkml.kernel.org/r/20260212111146.210086-1-ranxiaokai627@163.com
Link: https://lkml.kernel.org/r/20260212111146.210086-2-ranxiaokai627@163.com
Fixes: b50634c5e84a ("kho: cleanup error handling in kho_populate()")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoscripts/gdb: implement x86_page_ops in mm.py
Seongjun Hong [Mon, 2 Feb 2026 03:42:41 +0000 (12:42 +0900)] 
scripts/gdb: implement x86_page_ops in mm.py

Implement all member functions of x86_page_ops strictly following the
logic of aarch64_page_ops.

This includes full support for SPARSEMEM and standard page translation
functions.

This fixes compatibility with 'lx-' commands on x86_64, preventing
AttributeErrors when using lx-pfn_to_page and others.

Link: https://lkml.kernel.org/r/20260202034241.649268-1-hsj0512@snu.ac.kr
Signed-off-by: Seongjun Hong <hsj0512@snu.ac.kr>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoobjpool: fix the overestimation of object pooling metadata size
zhouwenhao [Mon, 2 Feb 2026 13:28:46 +0000 (21:28 +0800)] 
objpool: fix the overestimation of object pooling metadata size

objpool uses struct objpool_head to store metadata information, and its
cpu_slots member points to an array of pointers that store the addresses
of the percpu ring arrays.  However, the memory size allocated during the
initialization of cpu_slots is nr_cpu_ids * sizeof(struct objpool_slot).
On a 64-bit machine, the size of struct objpool_slot is 16 bytes, which is
twice the size of the actual pointer required, and the extra memory is
never be used, resulting in a waste of memory.  Therefore, the memory size
required for cpu_slots needs to be corrected.

Link: https://lkml.kernel.org/r/20260202132846.68257-1-zhouwenhao7600@gmail.com
Fixes: b4edb8d2d464 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: zhouwenhao <zhouwenhao7600@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Matt Wu <wuqiang.matt@bytedance.com>
Cc: wuqiang.matt <wuqiang.matt@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoselftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT
Aristeu Rozanski [Mon, 2 Feb 2026 14:38:05 +0000 (09:38 -0500)] 
selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT

selftests/memfd: use IPC semaphore instead of SIGSTOP/SIGCONT

In order to synchronize new processes to test inheritance of memfd_noexec
sysctl, memfd_test sets up the sysctl with a value before creating the new
process.  The new process then sends itself a SIGSTOP in order to wait for
the parent to flip the sysctl value and send a SIGCONT signal.

This would work as intended if it wasn't the fact that the new process is
being created with CLONE_NEWPID, which creates a new PID namespace and the
new process has PID 1 in this namespace.  There're restrictions on sending
signals to PID 1 and, although it's relaxed for other than root PID
namespace, it's biting us here.  In this specific case the SIGSTOP sent by
the new process is ignored (no error to kill() is returned) and it never
stops its execution.  This is usually not noticiable as the parent usually
manages to set the new sysctl value before the child has a chance to run
and the test succeeds.  But if you run the test in a loop, it eventually
reproduces:

while [ 1 ]; do ./memfd_test >log 2>&1 || break; done; cat log

So this patch replaces the SIGSTOP/SIGCONT synchronization with IPC
semaphore.

Link: https://lkml.kernel.org/r/a7776389-b3d6-4b18-b438-0b0e3ed1fd3b@work
Fixes: 6469b66e3f5a ("selftests: improve vm.memfd_noexec sysctl tests")
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: liuye <liuye@kylinos.cn>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agodelayacct: fix build regression on accounting tool
Arnd Bergmann [Tue, 10 Feb 2026 10:34:22 +0000 (11:34 +0100)] 
delayacct: fix build regression on accounting tool

The accounting tool was modified for the original ABI using a custom
'timespec64' type in linux/taskstats.h, which I changed to use the regular
__kernel_timespec type, causing a build failure:

        getdelays.c:202:45: warning: 'struct timespec64' declared inside parameter list will not be visible outside of this definition or declaration

     202 | static const char *format_timespec64(struct timespec64 *ts)
         |                                             ^~~~~~~~~~

Change the tool to match the updated header.

Link: https://lkml.kernel.org/r/20260210103427.2984963-1-arnd@kernel.org
Fixes: 503efe850c74 ("delayacct: add timestamp of delay max")
Fixes: f06e31eef4c1 ("delayacct: fix uapi timespec64 definition")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202602091611.lxgINqXp-lkp@intel.com/
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/page_alloc: clear page->private in free_pages_prepare()
Mikhail Gavrilov [Sat, 7 Feb 2026 17:36:14 +0000 (22:36 +0500)] 
mm/page_alloc: clear page->private in free_pages_prepare()

Several subsystems (slub, shmem, ttm, etc.) use page->private but don't
clear it before freeing pages.  When these pages are later allocated as
high-order pages and split via split_page(), tail pages retain stale
page->private values.

This causes a use-after-free in the swap subsystem.  The swap code uses
page->private to track swap count continuations, assuming freshly
allocated pages have page->private == 0.  When stale values are present,
swap_count_continued() incorrectly assumes the continuation list is valid
and iterates over uninitialized page->lru containing LIST_POISON values,
causing a crash:

  KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
  RIP: 0010:__do_sys_swapoff+0x1151/0x1860

Fix this by clearing page->private in free_pages_prepare(), ensuring all
freed pages have clean state regardless of previous use.

Link: https://lkml.kernel.org/r/20260207173615.146159-1-mikhail.v.gavrilov@gmail.com
Fixes: 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Suggested-by: Zi Yan <ziy@nvidia.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Thu, 12 Feb 2026 23:43:02 +0000 (15:43 -0800)] 
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted
  cleanups and fixes.

  The biggest core change is the massive code motion in the sd driver to
  remove forward declarations and the most significant change is to
  enumify the queuecommand return"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (78 commits)
  scsi: csiostor: Fix dereference of null pointer rn
  scsi: buslogic: Reduce stack usage
  scsi: ufs: host: mediatek: Require CONFIG_PM
  scsi: ufs: mediatek: Fix page faults in ufs_mtk_clk_scale() trace event
  scsi: smartpqi: Fix memory leak in pqi_report_phys_luns()
  scsi: mpi3mr: Make driver probing asynchronous
  scsi: ufs: core: Flush exception handling work when RPM level is zero
  scsi: efct: Use IRQF_ONESHOT and default primary handler
  scsi: ufs: core: Use a host-wide tagset in SDB mode
  scsi: qla2xxx: target: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qla2xxx: Add WQ_PERCPU to alloc_workqueue() users
  scsi: qla4xxx: Add WQ_PERCPU to alloc_workqueue() users
  scsi: mpi3mr: Driver version update to 8.17.0.3.50
  scsi: mpi3mr: Fixed the W=1 compilation warning
  scsi: mpi3mr: Record and report controller firmware faults
  scsi: mpi3mr: Update MPI Headers to revision 39
  scsi: mpi3mr: Use negotiated link rate from DevicePage0
  scsi: mpi3mr: Avoid redundant diag-fault resets
  scsi: mpi3mr: Rename log data save helper to reflect threaded/BH context
  scsi: mpi3mr: Add module parameter to control threaded IRQ polling
  ...

2 months agoselftests/mm: add memory failure dirty pagecache test
Miaohe Lin [Fri, 6 Feb 2026 03:16:39 +0000 (11:16 +0800)] 
selftests/mm: add memory failure dirty pagecache test

This patch adds a new testcase to validate memory failure handling for
dirty pagecache.  This performs similar operations as clean pagecaches
except fsync() is not used to keep pages dirty.

This test helps ensure that memory failure handling for dirty pagecache
works correctly, including proper SIGBUS delivery, page isolation, and
recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: kernel test robot <lkp@intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoselftests/mm: add memory failure clean pagecache test
Miaohe Lin [Fri, 6 Feb 2026 03:16:38 +0000 (11:16 +0800)] 
selftests/mm: add memory failure clean pagecache test

This patch adds a new testcase to validate memory failure handling for
clean pagecache.  This test performs similar operations as anonymous pages
except allocating memory using mmap() with a file fd.

This test helps ensure that memory failure handling for clean pagecache
works correctly, including unchanged page content, page isolation, and
recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601221142.mDWA1ucw-lkp@intel.com/
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoselftests/mm: add memory failure anonymous page test
Miaohe Lin [Fri, 6 Feb 2026 03:16:37 +0000 (11:16 +0800)] 
selftests/mm: add memory failure anonymous page test

Patch series "selftests/mm: add memory failure selftests", v4.

Introduce selftests to validate the functionality of memory failure.
These tests help ensure that memory failure handling for anonymous pages,
pagecaches pages works correctly, including proper SIGBUS delivery to user
processes, page isolation, and recovery paths.

Currently madvise syscall is used to inject memory failures.  And only
anonymous pages and pagecaches are tested.  More test scenarios, e.g.
hugetlb, shmem, thp, will be added.  Also more memory failure injecting
methods will be supported, e.g.  APEI Error INJection, if required.

This patch (of 3):

This patch adds a new kselftest to validate memory failure handling for
anonymous pages. The test performs the following operations:
1. Allocates anonymous pages using mmap().
2. Injects memory failure via madvise syscall.
3. Verifies expected error handling behavior.
4. Unpoison memory.

This test helps ensure that memory failure handling for anonymous pages
works correctly, including proper SIGBUS delivery to user processes, page
isolation and recovery paths.

Link: https://lkml.kernel.org/r/20260206031639.2707102-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20260206031639.2707102-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: rmap: support batched unmapping for file large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:28 +0000 (22:07 +0800)] 
mm: rmap: support batched unmapping for file large folios

Similar to folio_referenced_one(), we can apply batched unmapping for file
large folios to optimize the performance of file folios reclamation.

Barry previously implemented batched unmapping for lazyfree anonymous
large folios[1] and did not further optimize anonymous large folios or
file-backed large folios at that stage.  As for file-backed large folios,
the batched unmapping support is relatively straightforward, as we only
need to clear the consecutive (present) PTE entries for file-backed large
folios.

Note that it's not ready to support batched unmapping for uffd case, so
let's still fallback to per-page unmapping for the uffd case.

Performance testing:
Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and
try to reclaim 8G file-backed folios via the memory.reclaim interface.  I
can observe 75% performance improvement on my Arm64 32-core server (and
50%+ improvement on my X86 machine) with this patch.

W/o patch:
real    0m1.018s
user    0m0.000s
sys     0m1.018s

W/ patch:
real 0m0.249s
user 0m0.000s
sys 0m0.249s

[1] https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
Link: https://lkml.kernel.org/r/b53a16f67c93a3fe65e78092069ad135edf00eff.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoarm64: mm: implement the architecture-specific clear_flush_young_ptes()
Baolin Wang [Mon, 9 Feb 2026 14:07:27 +0000 (22:07 +0800)] 
arm64: mm: implement the architecture-specific clear_flush_young_ptes()

Implement the Arm64 architecture-specific clear_flush_young_ptes() to
enable batched checking of young flags and TLB flushing, improving
performance during large folio reclamation.

Performance testing:
Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and
try to reclaim 8G file-backed folios via the memory.reclaim interface.  I
can observe 33% performance improvement on my Arm64 32-core server (and
10%+ improvement on my X86 machine).  Meanwhile, the hotspot
folio_check_references() dropped from approximately 35% to around 5%.

W/o patchset:
real 0m1.518s
user 0m0.000s
sys 0m1.518s

W/ patchset:
real 0m1.018s
user 0m0.000s
sys 0m1.018s

Link: https://lkml.kernel.org/r/ce749fbae3e900e733fa104a16fcb3ca9fe4f9bd.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoarm64: mm: support batch clearing of the young flag for large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:26 +0000 (22:07 +0800)] 
arm64: mm: support batch clearing of the young flag for large folios

Currently, contpte_ptep_test_and_clear_young() and
contpte_ptep_clear_flush_young() only clear the young flag and flush TLBs
for PTEs within the contiguous range.  To support batch PTE operations for
other sized large folios in the following patches, adding a new parameter
to specify the number of PTEs that map consecutive pages of the same large
folio in a single VMA and a single page table.

While we are at it, rename the functions to maintain consistency with
other contpte_*() functions.

Link: https://lkml.kernel.org/r/5644250dcc0417278c266ad37118d27f541fd052.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoarm64: mm: factor out the address and ptep alignment into a new helper
Baolin Wang [Mon, 9 Feb 2026 14:07:25 +0000 (22:07 +0800)] 
arm64: mm: factor out the address and ptep alignment into a new helper

Factor out the contpte block's address and ptep alignment into a new
helper, and will be reused in the following patch.

No functional changes.

Link: https://lkml.kernel.org/r/8076d12cb244b2d9e91119b44dc6d5e4ad9c00af.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: rmap: support batched checks of the references for large folios
Baolin Wang [Mon, 9 Feb 2026 14:07:24 +0000 (22:07 +0800)] 
mm: rmap: support batched checks of the references for large folios

Patch series "support batch checking of references and unmapping for large
folios", v6.

Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios.  This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.

Moreover, on Arm architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range.  However, this is not sufficient.  We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).

Similar to folio_referenced_one(), we can also apply batched unmapping for
large file folios to optimize the performance of file folio reclamation.
By supporting batched checking of the young flags, flushing TLB entries,
and unmapping, I can observed a significant performance improvements in my
performance tests for file folios reclamation.  Please check the
performance data in the commit message of each patch.

This patch (of 5):

Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios.  This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.

Moreover, on Arm64 architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range.  However, this is not sufficient.  We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).

Introduce a new API: clear_flush_young_ptes() to facilitate batched
checking of the young flags and flushing TLB entries, thereby improving
performance during large folio reclamation.  And it will be overridden by
the architecture that implements a more efficient batch operation in the
following patches.

While we are at it, rename ptep_clear_flush_young_notify() to
clear_flush_young_ptes_notify() to indicate that this is a batch
operation.

Link: https://lkml.kernel.org/r/cover.1770645603.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/12132694536834262062d1fb304f8f8a064b6750.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agotools/testing/vma: add VMA userland tests for VMA flag functions
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:22 +0000 (16:06 +0000)] 
tools/testing/vma: add VMA userland tests for VMA flag functions

Now we have the capability to test the new helpers for the bitmap VMA
flags in userland, do so.

We also update the Makefile such that both VMA (and while we're here)
mm_struct flag sizes can be customised on build.  We default to 128-bit to
enable testing of flags above word size even on 64-bit systems.

We add userland tests to ensure that we do not regress VMA flag behaviour
with the introduction when using bitmap VMA flags, nor accidentally
introduce unexpected results due to for instance higher bit values not
being correctly cleared/set.

As part of this change, make __mk_vma_flags() a custom function so we can
handle specifying invalid VMA bits.  This is purposeful so we can have the
VMA tests work at lower and higher number of VMA flags without having to
duplicate code too much.

Link: https://lkml.kernel.org/r/7fe6afe9c8c61e4d3cfc9a2d50a5d24da8528e68.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agotools/testing/vma: separate out vma_internal.h into logical headers
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:21 +0000 (16:06 +0000)] 
tools/testing/vma: separate out vma_internal.h into logical headers

The vma_internal.h file is becoming entirely unmanageable.  It combines
duplicated kernel implementation logic that needs to be kept in-sync with
the kernel, stubbed out declarations that we simply ignore for testing
purposes and custom logic added to aid testing.

If we separate each of the three things into separate headers it makes
things far more manageable, so do so:

* include/stubs.h  contains the stubbed declarations,
* include/dup.h    contains the duplicated kernel declarations, and
* include/custom.h contains declarations customised for testing.

[lorenzo.stoakes@oracle.com: avoid a duplicate struct define]
Link: https://lkml.kernel.org/r/1e032732-61c3-485c-9aa7-6a09016fefc1@lucifer.local
Link: https://lkml.kernel.org/r/dd57baf5b5986cb96a167150ac712cbe804b63ee.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agotools/testing/vma: separate VMA userland tests into separate files
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:20 +0000 (16:06 +0000)] 
tools/testing/vma: separate VMA userland tests into separate files

So far the userland VMA tests have been established as a rough expression
of what's been possible.

Adapt it into a more usable form by separating out tests and shared
helper functions.

Since we test functions that are declared statically in mm/vma.c, we make
use of the trick of #include'ing kernel C files directly.

In order for the tests to continue to function, we must therefore also
this way into the tests/ directory.

We try to keep as much shared logic actually modularised into a separate
compilation unit in shared.c, however the merge_existing() and
attach_vma() helpers rely on statically declared mm/vma.c functions so
these must be declared in main.c.

Link: https://lkml.kernel.org/r/a0455ccfe4fdcd1c962c64f76304f612e5662a4e.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: make vm_area_desc utilise vma_flags_t only
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:19 +0000 (16:06 +0000)] 
mm: make vm_area_desc utilise vma_flags_t only

Now we have eliminated all uses of vm_area_desc->vm_flags, eliminate this
field, and have mmap_prepare users utilise the vma_flags_t
vm_area_desc->vma_flags field only.

As part of this change we alter is_shared_maywrite() to accept a
vma_flags_t parameter, and introduce is_shared_maywrite_vm_flags() for use
with legacy vm_flags_t flags.

We also update struct mmap_state to add a union between vma_flags and
vm_flags temporarily until the mmap logic is also converted to using
vma_flags_t.

Also update the VMA userland tests to reflect this change.

Link: https://lkml.kernel.org/r/fd2a2938b246b4505321954062b1caba7acfc77a.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: update all remaining mmap_prepare users to use vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:18 +0000 (16:06 +0000)] 
mm: update all remaining mmap_prepare users to use vma_flags_t

We will be shortly removing the vm_flags_t field from vm_area_desc so we
need to update all mmap_prepare users to only use the dessc->vma_flags
field.

This patch achieves that and makes all ancillary changes required to make
this possible.

This lays the groundwork for future work to eliminate the use of
vm_flags_t in vm_area_desc altogether and more broadly throughout the
kernel.

While we're here, we take the opportunity to replace VM_REMAP_FLAGS with
VMA_REMAP_FLAGS, the vma_flags_t equivalent.

No functional changes intended.

Link: https://lkml.kernel.org/r/fb1f55323799f09fe6a36865b31550c9ec67c225.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Damien Le Moal <dlemoal@kernel.org> [zonefs]
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: update shmem_[kernel]_file_*() functions to use vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:17 +0000 (16:06 +0000)] 
mm: update shmem_[kernel]_file_*() functions to use vma_flags_t

In order to be able to use only vma_flags_t in vm_area_desc we must adjust
shmem file setup functions to operate in terms of vma_flags_t rather than
vm_flags_t.

This patch makes this change and updates all callers to use the new
functions.

No functional changes intended.

[akpm@linux-foundation.org: comment fixes, per Baolin]
Link: https://lkml.kernel.org/r/736febd280eb484d79cef5cf55b8a6f79ad832d2.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: update secretmem to use VMA flags on mmap_prepare
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:16 +0000 (16:06 +0000)] 
mm: update secretmem to use VMA flags on mmap_prepare

This patch updates secretmem to use the new vma_flags_t type which will
soon supersede vm_flags_t altogether.

In order to make this change we also have to update mlock_future_ok(), we
replace the vm_flags_t parameter with a simple boolean is_vma_locked one,
which also simplifies the invocation here.

This is laying the groundwork for eliminating the vm_flags_t in
vm_area_desc and more broadly throughout the kernel.

No functional changes intended.

[lorenzo.stoakes@oracle.com: fix check_brk_limits(), per Chris]
Link: https://lkml.kernel.org/r/3aab9ab1-74b4-405e-9efb-08fc2500c06e@lucifer.local
Link: https://lkml.kernel.org/r/a243a09b0a5d0581e963d696de1735f61f5b2075.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: update hugetlbfs to use VMA flags on mmap_prepare
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:15 +0000 (16:06 +0000)] 
mm: update hugetlbfs to use VMA flags on mmap_prepare

In order to update all mmap_prepare users to utilising the new VMA flags
type vma_flags_t and associated helper functions, we start by updating
hugetlbfs which has a lot of additional logic that requires updating to
make this change.

This is laying the groundwork for eliminating the vm_flags_t from struct
vm_area_desc and using vma_flags_t only, which further lays the ground for
removing the deprecated vm_flags_t type altogether.

No functional changes intended.

Link: https://lkml.kernel.org/r/9226bec80c9aa3447cc2b83354f733841dba8a50.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: add basic VMA flag operation helper functions
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:14 +0000 (16:06 +0000)] 
mm: add basic VMA flag operation helper functions

Now we have the mk_vma_flags() macro helper which permits easy
specification of any number of VMA flags, add helper functions which
operate with vma_flags_t parameters.

This patch provides vma_flags_test[_mask](), vma_flags_set[_mask]() and
vma_flags_clear[_mask]() respectively testing, setting and clearing flags
with the _mask variants accepting vma_flag_t parameters, and the non-mask
variants implemented as macros which accept a list of flags.

This allows us to trivially test/set/clear aggregate VMA flag values as
necessary, for instance:

if (vma_flags_test(&flags, VMA_READ_BIT, VMA_WRITE_BIT))
goto readwrite;

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT);

vma_flags_clear(&flags, VMA_READ_BIT, VMA_WRITE_BIT);

We also add a function for testing that ALL flags are set for convenience,
e.g.:

if (vma_flags_test_all(&flags, VMA_READ_BIT, VMA_MAYREAD_BIT)) {
/* Both READ and MAYREAD flags set */
...
}

The compiler generates optimal assembly for each such that they behave as
if the caller were setting the bitmap flags manually.

This is important for e.g.  drivers which manipulate flag values rather
than a VMA's specific flag values.

We also add helpers for testing, setting and clearing flags for VMA's and
VMA descriptors to reduce boilerplate.

Also add the EMPTY_VMA_FLAGS define to aid initialisation of empty flags.

Finally, update the userland VMA tests to add the helpers there so they
can be utilised as part of userland testing.

Link: https://lkml.kernel.org/r/885d4897d67a6a57c0b07fa182a7055ad752df11.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agotools: bitmap: add missing bitmap_[subset(), andnot()]
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:13 +0000 (16:06 +0000)] 
tools: bitmap: add missing bitmap_[subset(), andnot()]

The bitmap_subset() and bitmap_andnot() functions are not present in the
tools version of include/linux/bitmap.h, so add them as subsequent patches
implement test code that requires them.

We also add the missing __bitmap_subset() to tools/lib/bitmap.c.

Link: https://lkml.kernel.org/r/0fd0d4ec868297f522003cb4b5898b53b498805b.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: add mk_vma_flags() bitmap flag macro helper
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:12 +0000 (16:06 +0000)] 
mm: add mk_vma_flags() bitmap flag macro helper

This patch introduces the mk_vma_flags() macro helper to allow easy
manipulation of VMA flags utilising the new bitmap representation
implemented of VMA flags defined by the vma_flags_t type.

It is a variadic macro which provides a bitwise-or'd representation of all
of each individual VMA flag specified.

Note that, while we maintain VM_xxx flags for backwards compatibility
until the conversion is complete, we define VMA flags of type vma_flag_t
using VMA_xxx_BIT to avoid confusing the two.

This helper macro therefore can be used thusly:

vma_flags_t flags = mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT);

Testing has demonstrated that the compiler optimises this code such that
it generates the same assembly utilising this macro as it does if the
flags were specified manually, for instance:

vma_flags_t get_flags(void)
{
return mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
}

Generates the same code as:

vma_flags_t get_flags(void)
{
vma_flags_t flags;

vma_flags_clear_all(&flags);
vma_flag_set(&flags, VMA_READ_BIT);
vma_flag_set(&flags, VMA_WRITE_BIT);
vma_flag_set(&flags, VMA_EXEC_BIT);

return flags;
}

And:

vma_flags_t get_flags(void)
{
vma_flags_t flags;
unsigned long *bitmap = ACCESS_PRIVATE(&flags, __vma_flags);

*bitmap = 1UL << (__force int)VMA_READ_BIT;
*bitmap |= 1UL << (__force int)VMA_WRITE_BIT;
*bitmap |= 1UL << (__force int)VMA_EXEC_BIT;

return flags;
}

That is:

get_flags:
        movl    $7, %eax
        ret

Link: https://lkml.kernel.org/r/fde00df6ff7fb8c4b42cc0defa5a4924c7a1943a.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: rename vma_flag_test/set_atomic() to vma_test/set_atomic_flag()
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:11 +0000 (16:06 +0000)] 
mm: rename vma_flag_test/set_atomic() to vma_test/set_atomic_flag()

In order to stay consistent between functions which manipulate a
vm_flags_t argument of the form of vma_flags_...() and those which
manipulate a VMA (in this case the flags field of a VMA), rename
vma_flag_[test/set]_atomic() to vma_[test/set]_atomic_flag().

This lays the groundwork for adding VMA flag manipulation functions in a
subsequent commit.

Link: https://lkml.kernel.org/r/033dcf12e819dee5064582bced9b12ea346d1607.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vma: remove __private sparse decoration from vma_flags_t
Lorenzo Stoakes [Thu, 22 Jan 2026 16:06:10 +0000 (16:06 +0000)] 
mm/vma: remove __private sparse decoration from vma_flags_t

Patch series "mm: add bitmap VMA flag helpers and convert all mmap_prepare
to use them", v2.

We introduced the bitmap VMA type vma_flags_t in the aptly named commit
9ea35a25d51b ("mm: introduce VMA flags bitmap type") in order to permit
future growth in VMA flags and to prevent the asinine requirement that VMA
flags be available to 64-bit kernels only if they happened to use a bit
number about 32-bits.

This is a long-term project as there are very many users of VMA flags
within the kernel that need to be updated in order to utilise this new
type.

In order to further this aim, this series adds a number of helper
functions to enable ordinary interactions with VMA flags - that is
testing, setting and clearing them.

In order to make working with VMA bit numbers less cumbersome this series
introduces the mk_vma_flags() helper macro which generates a vma_flags_t
from a variadic parameter list, e.g.:

vma_flags_t flags = mk_vma_flags(VMA_READ_BIT, VMA_WRITE_BIT,
 VMA_EXEC_BIT);

It turns out that the compiler optimises this very well to the point that
this is just as efficient as using VM_xxx pre-computed bitmap values.

This series then introduces the following functions:

bool vma_flags_test_mask(vma_flags_t flags, vma_flags_t to_test);
bool vma_flags_test_all_mask(vma_flags_t flags, vma_flags_t to_test);
void vma_flags_set_mask(vma_flags_t *flags, vma_flags_t to_set);
void vma_flags_clear_mask(vma_flags_t *flags, vma_flags_t to_clear);

Providing means of testing any flag, testing all flags, setting, and
clearing a specific vma_flags_t mask.

For convenience, helper macros are provided - vma_flags_test(),
vma_flags_set() and vma_flags_clear(), each of which utilise
mk_vma_flags() to make these operations easier, as well as an
EMPTY_VMA_FLAGS macro to make initialisation of an empty vma_flags_t value
easier, e.g.:

vma_flags_t flags = EMPTY_VMA_FLAGS;

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
...
if (vma_flags_test(flags, VMA_READ_BIT)) {
...
}
...
if (vma_flags_test_all_mask(flags, VMA_REMAP_FLAGS)) {
...
}
...
vma_flags_clear(&flags, VMA_READ_BIT);

Since callers are often dealing with a vm_area_struct (VMA) or
vm_area_desc (VMA descriptor as used in .mmap_prepare) object, this series
further provides helpers for these - firstly vma_set_flags_mask() and
vma_set_flags() for a VMA:

vma_flags_t flags = EMPTY_VMA_FLAGS:

vma_flags_set(&flags, VMA_READ_BIT, VMA_WRITE_BIT, VMA_EXEC_BIT);
...
vma_set_flags_mask(&vma, flags);
...
vma_set_flags(&vma, VMA_DONTDUMP_BIT);

Note that these do NOT ensure appropriate locks are taken and assume the
callers takes care of this.

For VMA descriptors this series adds vma_desc_[test, set,
clear]_flags_mask() and vma_desc_[test, set, clear]_flags() for a VMA
descriptor, e.g.:

static int foo_mmap_prepare(struct vm_area_desc *desc)
{
...
vma_desc_set_flags(desc, VMA_SEQ_READ_BIT);
vma_desc_clear_flags(desc, VMA_RAND_READ_BIT);
...
if (vma_desc_test_flags(desc, VMA_SHARED_BIT) {
...
}
...
}

With these helpers introduced, this series then updates all mmap_prepare
users to make use of the vma_flags_t vm_area_desc->vma_flags field rather
than the legacy vm_flags_t vm_area_desc->vm_flags field.

In order to do so, several other related functions need to be updated,
with separate patches for larger changes in hugetlbfs, secretmem and shmem
before finally removing vm_area_desc->vm_flags altogether.

This lays the foundations for future elimination of vm_flags_t and
associated defines and functionality altogether in the long run, and
elimination of the use of vm_flags_t in f_op->mmap() hooks in the near
term as mmap_prepare replaces these.

There is a useful synergy between the VMA flags and mmap_prepare work here
as with this change in place, converting f_op->mmap() to
f_op->mmap_prepare naturally also converts use of vm_flags_t to
vma_flags_t in all drivers which declare mmap handlers.

This accounts for the majority of the users of the legacy vm_flags_*()
helpers and thus a large number of drivers which need to interact with VMA
flags in general.

This series also updates the userland VMA tests to account for the change,
and adds unit tests for these helper functions to assert that they behave
as expected.

In order to faciliate this change in a sensible way, the series also
separates out the VMA unit tests into - code that is duplicated from the
kernel that should be kept in sync, code that is customised for test
purposes and code that is stubbed out.

We also separate out the VMA userland tests into separate files to make it
easier to manage and to provide a sensible baseline for adding the
userland tests for these helpers.

This patch (of 13):

We need to pass around these values and access them in a way that sparse
does not allow, as __private implies noderef, i.e.  disallowing
dereference of the value, which manifests as sparse warnings even when
passed around benignly.

Link: https://lkml.kernel.org/r/cover.1769097829.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/64fa89f416f22a60ae74cfff8fd565e7677be192.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: use unmap_desc struct for freeing page tables
Liam R. Howlett [Wed, 21 Jan 2026 16:49:46 +0000 (11:49 -0500)] 
mm: use unmap_desc struct for freeing page tables

Pass through the unmap_desc to free_pgtables() because it almost has
everything necessary and is already on the stack.

Updates testing code as necessary.

No functional changes intended.

[Liam.Howlett@oracle.com: fix up unmap desc use on exit_mmap()]
Link: https://lkml.kernel.org/r/20260210214214.364856-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20260121164946.2093480-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vma: use unmap_region() in vms_clear_ptes()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:45 +0000 (11:49 -0500)] 
mm/vma: use unmap_region() in vms_clear_ptes()

There is no need to open code the vms_clear_ptes() now that unmap_desc
struct is used.

Link: https://lkml.kernel.org/r/20260121164946.2093480-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vma: use unmap_desc in exit_mmap() and vms_clear_ptes()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:44 +0000 (11:49 -0500)] 
mm/vma: use unmap_desc in exit_mmap() and vms_clear_ptes()

Convert vms_clear_ptes() to use unmap_desc to call unmap_vmas() instead of
the large argument list.  The UNMAP_STATE() cannot be used because the vma
iterator in the vms does not point to the correct maple state
(mas_detach), and the tree_end will be set incorrectly.  Setting up the
arguments manually avoids setting the struct up incorrectly and doing
extra work to get the correct pagetable range.

exit_mmap() also calls unmap_vmas() with many arguments.  Using the
unmap_all_init() function to set the unmap descriptor for all vmas makes
this a bit easier to read.

Update to the vma test code is necessary to ensure testing continues to
function.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: introduce unmap_desc struct to reduce function arguments
Liam R. Howlett [Wed, 21 Jan 2026 16:49:43 +0000 (11:49 -0500)] 
mm: introduce unmap_desc struct to reduce function arguments

The unmap_region code uses a number of arguments that could use better
documentation.  With the addition of a descriptor for unmap (called
unmap_desc), the arguments can be more self-documenting and increase the
descriptions within the declaration.

No functional change intended

Link: https://lkml.kernel.org/r/20260121164946.2093480-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: change dup_mmap() recovery
Liam R. Howlett [Wed, 21 Jan 2026 16:49:42 +0000 (11:49 -0500)] 
mm: change dup_mmap() recovery

When the dup_mmap() fails during the vma duplication or setup, don't write
the XA_ZERO entry in the vma tree.  Instead, destroy the tree and free the
new resources, leaving an empty vma tree.

Using XA_ZERO introduced races where the vma could be found between
dup_mmap() dropping all locks and exit_mmap() taking the locks.  The race
can occur because the mm can be reached through the other trees via
successfully copied vmas and other methods such as the swapoff code.

XA_ZERO was marking the location to stop vma removal and pagetable
freeing.  The newly created arguments to the unmap_vmas() and
free_pgtables() serve this function.

Replacing the XA_ZERO entry use with the new argument list also means the
checks for xa_is_zero() are no longer necessary so these are also removed.

Note that dup_mmap() now cleans up when ALL vmas are successfully copied,
but the dup_mmap() fails to completely set up some other aspect of the
duplication.

Link: https://lkml.kernel.org/r/20260121164946.2093480-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vma: add page table limit to unmap_region()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:41 +0000 (11:49 -0500)] 
mm/vma: add page table limit to unmap_region()

The unmap_region() calls need to pass through the page table limit for a
future patch.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/memory: add tree limit to free_pgtables()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:40 +0000 (11:49 -0500)] 
mm/memory: add tree limit to free_pgtables()

The ceiling and tree search limit need to be different arguments for the
future change in the failed fork attempt.  The ceiling and floor variables
are not very descriptive, so change them to pg_start/pg_end.

Adding a new variable for the vma_end to the function as it will differ
from the pg_end in the later patches in the series.

Add a kernel doc about the free_pgtables() function.

Test code also updated.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vma: add limits to unmap_region() for vmas
Liam R. Howlett [Wed, 21 Jan 2026 16:49:39 +0000 (11:49 -0500)] 
mm/vma: add limits to unmap_region() for vmas

Add a limit to the vma search instead of using the start and end of the
one passed in.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/mmap: abstract vma clean up from exit_mmap()
Liam R. Howlett [Wed, 21 Jan 2026 16:49:38 +0000 (11:49 -0500)] 
mm/mmap: abstract vma clean up from exit_mmap()

Create the new function tear_down_vmas() to remove a range of vmas.
exit_mmap() will be removing all the vmas.

This is necessary for future patches.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260121164946.2093480-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/mmap: move exit_mmap() trace point
Liam R. Howlett [Wed, 21 Jan 2026 16:49:37 +0000 (11:49 -0500)] 
mm/mmap: move exit_mmap() trace point

Move the trace point later in the function so that it is not skipped in
the event of a failed fork.

Link: https://lkml.kernel.org/r/20260121164946.2093480-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: relocate the page table ceiling and floor definitions
Liam R. Howlett [Wed, 21 Jan 2026 16:49:36 +0000 (11:49 -0500)] 
mm: relocate the page table ceiling and floor definitions

Patch series " Remove XA_ZERO from error recovery of dup_mmap()", v3.

It is possible that the dup_mmap() call fails on allocating or setting up
a vma after the maple tree of the oldmm is copied.  Today, that failure
point is marked by inserting an XA_ZERO entry over the failure point so
that the exact location does not need to be communicated through to
exit_mmap().

However, a race exists in the tear down process because the dup_mmap()
drops the mmap lock before exit_mmap() can remove the partially set up vma
tree.  This means that other tasks may get to the mm tree and find the
invalid vma pointer (since it's an XA_ZERO entry), even though the mm is
marked as MMF_OOM_SKIP and MMF_UNSTABLE.

To remove the race fully, the tree must be cleaned up before dropping the
lock.  This is accomplished by extracting the vma cleanup in exit_mmap()
and changing the required functions to pass through the vma search limit.
Any other tree modifications would require extra cycles which should be
spent on freeing memory.

This does run the risk of increasing the possibility of finding no vmas
(which is already possible!) in code that isn't careful.

The final four patches are to address the excessive argument lists being
passed between the functions.  Using the struct unmap_desc also allows
some special-case code to be removed in favour of the struct setup
differences.

This patch (of 11):

pgtables.h defines a fallback for ceiling and floor of the page tables
within the CONFIG_MMU section.  Moving the definitions to outside the
CONFIG_MMU allows for using them in generic code.

[akpm@linux-foundation.org: remove stray newline, per SeongJae]
Link: https://lkml.kernel.org/r/20260121164946.2093480-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20260121164946.2093480-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: SeongJae Park <sj@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm: folio_zero_user: open code range computation in folio_zero_user()
Ankur Arora [Wed, 28 Jan 2026 18:59:43 +0000 (10:59 -0800)] 
mm: folio_zero_user: open code range computation in folio_zero_user()

riscv64-gcc-linux-gnu (v8.5) reports a compile time assert in:

   r[2] = DEFINE_RANGE(clamp_t(s64, fault_idx - radius, pg.start, pg.end),
         clamp_t(s64, fault_idx + radius, pg.start, pg.end));

where it decides that pg.start > pg.end in:
  clamp_t(s64, fault_idx + radius, pg.start, pg.end));

where pg comes from:
  const struct range pg = DEFINE_RANGE(0, folio_nr_pages(folio) - 1);

That does not seem like it could be true. Even for pg.start == pg.end,
we would need folio_test_large() to evaluate to false at compile time:

  static inline unsigned long folio_nr_pages(const struct folio *folio)
  {
if (!folio_test_large(folio))
return 1;
return folio_large_nr_pages(folio);
  }

Workaround by open coding the range computation. Also, simplify the type
declarations for the relevant variables.

[ankur.a.arora@oracle.com: remove unneeded cast, per David]
Link: https://lkml.kernel.org/r/20260206223801.2617497-1-ankur.a.arora@oracle.com
Link: https://lkml.kernel.org/r/20260206223801.2617497-1-ankur.a.arora@oracle.com
Link: https://lkml.kernel.org/r/20260128185943.2397128-1-ankur.a.arora@oracle.com
Fixes: 93552c9a3350 ("mm: folio_zero_user: cache neighbouring pages")
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601240453.QCjgGdJa-lkp@intel.com/
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzessutek Wilk <konrad.wilk@oracle.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Li Zhe <lizhe.67@bytedance.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vmscan: select the closest preferred node in demote_folio_list()
Bing Jiao [Wed, 14 Jan 2026 20:53:03 +0000 (20:53 +0000)] 
mm/vmscan: select the closest preferred node in demote_folio_list()

The preferred demotion node (migration_target_control.nid) should be the
one closest to the source node to minimize migration latency.  Currently,
a discrepancy exists where demote_folio_list() randomly selects an allowed
node if the preferred node from next_demotion_node() is not set in
mems_effective.

To address it, update next_demotion_node() to select a preferred target
against allowed nodes; and to return the closest demotion target if all
preferred nodes are not in mems_effective via next_demotion_node().

It ensures that the preferred demotion target is consistently the closest
available node to the source node.

[akpm@linux-foundation.org: fix comment typo, per Shakeel]
Link: https://lkml.kernel.org/r/20260114205305.2869796-3-bingjiao@google.com
Signed-off-by: Bing Jiao <bingjiao@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Gregory Price <gourry@gourry.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/vmscan: fix demotion targets checks in reclaim/demotion
Bing Jiao [Wed, 14 Jan 2026 20:53:02 +0000 (20:53 +0000)] 
mm/vmscan: fix demotion targets checks in reclaim/demotion

Patch series "mm/vmscan: fix demotion targets checks in reclaim/demotion",
v9.

This patch series addresses two issues in demote_folio_list(),
can_demote(), and next_demotion_node() in reclaim/demotion.

1. demote_folio_list() and can_demote() do not correctly check
   demotion target against cpuset.mems_effective, which will cause (a)
   pages to be demoted to not-allowed nodes and (b) pages fail demotion
   even if the system still has allowed demotion nodes.

   Patch 1 fixes this bug by updating cpuset_node_allowed() and
   mem_cgroup_node_allowed() to return effective_mems, allowing directly
   logic-and operation against demotion targets.

2. next_demotion_node() returns a preferred demotion target, but it
   does not check the node against allowed nodes.

   Patch 2 ensures that next_demotion_node() filters against the allowed
   node mask and selects the closest demotion target to the source node.

This patch (of 2):

Fix two bugs in demote_folio_list() and can_demote() due to incorrect
demotion target checks against cpuset.mems_effective in reclaim/demotion.

Commit 7d709f49babc ("vmscan,cgroup: apply mems_effective to reclaim")
introduces the cpuset.mems_effective check and applies it to can_demote().
However:

  1. It does not apply this check in demote_folio_list(), which leads
     to situations where pages are demoted to nodes that are
     explicitly excluded from the task's cpuset.mems.

  2. It checks only the nodes in the immediate next demotion hierarchy
     and does not check all allowed demotion targets in can_demote().
     This can cause pages to never be demoted if the nodes in the next
     demotion hierarchy are not set in mems_effective.

These bugs break resource isolation provided by cpuset.mems.  This is
visible from userspace because pages can either fail to be demoted
entirely or are demoted to nodes that are not allowed in multi-tier memory
systems.

To address these bugs, update cpuset_node_allowed() and
mem_cgroup_node_allowed() to return effective_mems, allowing directly
logic-and operation against demotion targets.  Also update can_demote()
and demote_folio_list() accordingly.

Bug 1 reproduction:
  Assume a system with 4 nodes, where nodes 0-1 are top-tier and
  nodes 2-3 are far-tier memory. All nodes have equal capacity.

  Test script:
    echo 1 > /sys/kernel/mm/numa/demotion_enabled
    mkdir /sys/fs/cgroup/test
    echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
    echo "0-2" > /sys/fs/cgroup/test/cpuset.mems
    echo $$ > /sys/fs/cgroup/test/cgroup.procs
    swapoff -a
    # Expectation: Should respect node 0-2 limit.
    # Observation: Node 3 shows significant allocation (MemFree drops)
    stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1

Bug 2 reproduction:
  Assume a system with 6 nodes, where nodes 0-2 are top-tier,
  node 3 is a far-tier node, and nodes 4-5 are the farthest-tier nodes.
  All nodes have equal capacity.

  Test script:
    echo 1 > /sys/kernel/mm/numa/demotion_enabled
    mkdir /sys/fs/cgroup/test
    echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
    echo "0-2,4-5" > /sys/fs/cgroup/test/cpuset.mems
    echo $$ > /sys/fs/cgroup/test/cgroup.procs
    swapoff -a
    # Expectation: Pages are demoted to Nodes 4-5
    # Observation: No pages are demoted before oom.
    stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1,2

Link: https://lkml.kernel.org/r/20260114205305.2869796-1-bingjiao@google.com
Link: https://lkml.kernel.org/r/20260114205305.2869796-2-bingjiao@google.com
Fixes: 7d709f49babc ("vmscan,cgroup: apply mems_effective to reclaim")
Signed-off-by: Bing Jiao <bingjiao@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Gregory Price <gourry@gourry.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoprocfs: fix possible double mmput() in do_procmap_query()
Andrii Nakryiko [Tue, 10 Feb 2026 19:27:38 +0000 (11:27 -0800)] 
procfs: fix possible double mmput() in do_procmap_query()

When user provides incorrectly sized buffer for build ID for PROCMAP_QUERY
we return with -ENAMETOOLONG error.  After recent changes this condition
happens later, after we unlocked mmap_lock/per-VMA lock and did mmput(),
so original goto out is now wrong and will double-mmput() mm_struct.  Fix
by jumping further to clean up only vm_file and name_buf.

Link: https://lkml.kernel.org/r/20260210192738.3041609-1-andrii@kernel.org
Fixes: b5cbacd7f86f ("procfs: avoid fetching build ID while holding VMA lock")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reported-by: Ruikai Peng <ruikai@pwno.io>
Reported-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reported-by: syzbot+237b5b985b78c1da9600@syzkaller.appspotmail.com
Cc: Ruikai Peng <ruikai@pwno.io>
Closes: https://lkml.kernel.org/r/CAFD3drOJANTZPuyiqMdqpiRwOKnHwv5QgMNZghCDr-WxdiHvMg@mail.gmail.com
Closes: https://lore.kernel.org/all/698aaf3c.050a0220.3b3015.0088.GAE@google.com/T/#u
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/page_alloc: skip debug_check_no_{obj,locks}_freed with FPI_TRYLOCK
Harry Yoo [Mon, 9 Feb 2026 06:26:39 +0000 (15:26 +0900)] 
mm/page_alloc: skip debug_check_no_{obj,locks}_freed with FPI_TRYLOCK

When CONFIG_DEBUG_OBJECTS_FREE is enabled,
debug_check_no_{obj,locks}_freed() functions are called.

Since both of them spin on a lock, they are not safe to be called if the
FPI_TRYLOCK flag is specified.  This leads to a lockdep splat:

  ================================
  WARNING: inconsistent lock state
  6.19.0-rc5-slab-for-next+ #326 Tainted: G                 N
  --------------------------------
  inconsistent {INITIAL USE} -> {IN-NMI} usage.
  kunit_try_catch/9046 [HC2[2]:SC0[0]:HE0:SE1] takes:
  ffffffff84ed6bf8 (&obj_hash[i].lock){-.-.}-{2:2}, at: __debug_check_no_obj_freed+0xe0/0x300
  {INITIAL USE} state was registered at:
    lock_acquire+0xd9/0x2f0
    _raw_spin_lock_irqsave+0x4c/0x80
    __debug_object_init+0x9d/0x1f0
    debug_object_init+0x34/0x50
    __init_work+0x28/0x40
    init_cgroup_housekeeping+0x151/0x210
    init_cgroup_root+0x3d/0x140
    cgroup_init_early+0x30/0x240
    start_kernel+0x3e/0xcd0
    x86_64_start_reservations+0x18/0x30
    x86_64_start_kernel+0xf3/0x140
    common_startup_64+0x13e/0x148
  irq event stamp: 2998
  hardirqs last  enabled at (2997): [<ffffffff8298b77a>] exc_nmi+0x11a/0x240
  hardirqs last disabled at (2998): [<ffffffff8298b991>] sysvec_irq_work+0x11/0x110
  softirqs last  enabled at (1416): [<ffffffff813c1f72>] __irq_exit_rcu+0x132/0x1c0
  softirqs last disabled at (1303): [<ffffffff813c1f72>] __irq_exit_rcu+0x132/0x1c0

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&obj_hash[i].lock);
    <Interrupt>
      lock(&obj_hash[i].lock);

   *** DEADLOCK ***

Rename free_pages_prepare() to __free_pages_prepare(), add an fpi_t
parameter, and skip those checks if FPI_TRYLOCK is set.  To keep the fpi_t
definition in mm/page_alloc.c, add a wrapper function free_pages_prepare()
that always passes FPI_NONE and use it in mm/compaction.c.

Link: https://lkml.kernel.org/r/20260209062639.16577-1-harry.yoo@oracle.com
Fixes: 8c57b687e833 ("mm, bpf: Introduce free_pages_nolock()")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agomm/hugetlb: restore failed global reservations to subpool
Joshua Hahn [Fri, 16 Jan 2026 20:40:36 +0000 (15:40 -0500)] 
mm/hugetlb: restore failed global reservations to subpool

Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
fixed an underflow error for hstate->resv_huge_pages caused by incorrectly
attributing globally requested pages to the subpool's reservation.

Unfortunately, this fix also introduced the opposite problem, which would
leave spool->used_hpages elevated if the globally requested pages could
not be acquired.  This is because while a subpool's reserve pages only
accounts for what is requested and allocated from the subpool, its "used"
counter keeps track of what is consumed in total, both from the subpool
and globally.  Thus, we need to adjust spool->used_hpages in the other
direction, and make sure that globally requested pages are uncharged from
the subpool's used counter.

Each failed allocation attempt increments the used_hpages counter by how
many pages were requested from the global pool.  Ultimately, this renders
the subpool unusable, as used_hpages approaches the max limit.

The issue can be reproduced as follows:
1. Allocate 4 hugetlb pages
2. Create a hugetlb mount with max=4, min=2
3. Consume 2 pages globally
4. Request 3 pages from the subpool (2 from subpool + 1 from global)
4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
used_hpages += 3
4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
used_hpages -= 2
5. Subpool now has used_hpages = 1, despite not being able to
   successfully allocate any hugepages. It believes it can now only
   allocate 3 more hugepages, not 4.

With each failed allocation attempt incrementing the used counter, the
subpool eventually reaches a point where its used counter equals its
max counter.  At that point, any future allocations that try to
allocate hugeTLB pages from the subpool will fail, despite the subpool
not having any of its hugeTLB pages consumed by any user.

Once this happens, there is no way to make the subpool usable again,
since there is no way to decrement the used counter as no process is
really consuming the hugeTLB pages.

The underflow issue that the original commit fixes still remains fixed
as well.

Without this fix, used_hpages would keep on leaking if
hugetlb_acct_memory() fails.

Link: https://lkml.kernel.org/r/20260116204037.2270096-1-joshua.hahnjy@gmail.com
Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Acked-by: Usama Arif <usama.arif@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Ma Wupeng <mawupeng1@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 months agoMerge tag 'for-7.0/io_uring-zcrx-large-buffers-20260206' of git://git.kernel.org...
Linus Torvalds [Thu, 12 Feb 2026 23:07:50 +0000 (15:07 -0800)] 
Merge tag 'for-7.0/io_uring-zcrx-large-buffers-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring large rx buffer support from Jens Axboe:
 "Now that the networking updates are upstream, here's the support for
  large buffers for zcrx.

  Using larger (bigger than 4K) rx buffers can increase the effiency of
  zcrx. For example, it's been shown that using 32K buffers can decrease
  CPU usage by ~30% compared to 4K buffers"

* tag 'for-7.0/io_uring-zcrx-large-buffers-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/zcrx: implement large rx buffer support

2 months agoMerge tag 'libnvdimm-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm...
Linus Torvalds [Thu, 12 Feb 2026 22:47:35 +0000 (14:47 -0800)] 
Merge tag 'libnvdimm-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Ira Weiny:
 "A kmap conversion and a bug fix this go around:

   - drivers/nvdimm: Use local kmaps

   - nvdimm: virtio_pmem: serialize flush requests"

* tag 'libnvdimm-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: virtio_pmem: serialize flush requests
  drivers/nvdimm: Use local kmaps

2 months agoMerge tag 'trace-rtla-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Thu, 12 Feb 2026 22:31:02 +0000 (14:31 -0800)] 
Merge tag 'trace-rtla-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull RTLA updates from Steven Rostedt:

 - Remove unused function declarations

   Some functions were removed in recent code consolidation 6.18, but
   their prototypes were not removed from headers. Remove them.

 - Set stop threshold after enabling instances

   Prefer recording samples without stopping on them on the start of
   tracing to stopping on samples that are never recorded. This fixes
   flakiness of some RTLA tests and unifies behavior of sample
   collection between tracefs mode and BPF mode.

 - Consolidate usage help message implementation

   RTLA tools (osnoise-top, osnoise-hist, timerlat-top, timerlat-hist)
   each implement usage help individually. Move common logic between
   them into a new function to reduce code duplication.

 - Add BPF actions feature

   Add option --bpf-action to attach a BPF program (passed as filename
   of its ELF representation) to be executed via BPF tail call at
   latency threshold.

 - Consolidate command line option parsing

   Each RTLA tool implements the parsing of command line options
   individually. Now that we have a common structure for parameters,
   unify the parsing of those options common among all four tools into
   one function.

 - De-duplicate cgroup common code

   Two functions in utils.c, setting cgroup for comm and setting cgroup
   for pid, duplicate code for constructing the cgroup path. Extract it
   to a new helper function.

 - String and error handling fixes and cleanups

   There were several instances of unsafe string and error handling that
   could cause invalid memory access; fix them. Also, remove a few
   unused headers, update .gitignore, and deduplicate code.

* tag 'trace-rtla-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (30 commits)
  rtla: Fix parse_cpu_set() bug introduced by strtoi()
  rtla: Fix parse_cpu_set() return value documentation
  rtla: Ensure null termination after read operations in utils.c
  rtla: Make stop_tracing variable volatile
  rtla: Add generated output files to gitignore
  rtla: Fix NULL pointer dereference in actions_parse
  rtla: Remove unused headers
  rtla: Remove redundant memset after calloc
  rtla: Use standard exit codes for result enum
  rtla: Replace atoi() with a robust strtoi()
  rtla: Introduce for_each_action() helper
  tools/rtla: Deduplicate cgroup path opening code
  tools/rtla: Consolidate -H/--house-keeping option parsing
  tools/rtla: Consolidate -P/--priority option parsing
  tools/rtla: Consolidate -e/--event option parsing
  tools/rtla: Consolidate -d/--duration option parsing
  tools/rtla: Consolidate -D/--debug option parsing
  tools/rtla: Consolidate -C/--cgroup option parsing
  tools/rtla: Consolidate -c/--cpus option parsing
  tools/rtla: Add common_parse_options()
  ...

2 months agoMerge tag 'trace-rv-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Thu, 12 Feb 2026 22:08:49 +0000 (14:08 -0800)] 
Merge tag 'trace-rv-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull runtime verifier updates from Steven Rostedt:

 - Refactor da_monitor to minimize macros

   Complete refactor of da_monitor.h to reduce reliance on macros
   generating functions. Use generic static functions and uses the
   preprocessor only when strictly necessary (e.g. for tracepoint
   handlers).

   The change essentially relies on functions with generic names (e.g.
   da_handle) instead of monitor-specific as well adding the need to
   define constant (e.g. MONITOR_NAME, MONITOR_TYPE) before including
   the header rather than calling macros that would define functions.
   Also adapt monitors and documentation accordingly.

 - Cleanup DA code generation scripts

   Clean up functions in dot2c removing reimplementations of trivial
   library functions (__buff_to_string) and removing some other unused
   intermediate steps.

 - Annotate functions with types in the rvgen python scripts

 - Remove superfluous assignments and cleanup generated code

   The rvgen scripts generate a superfluous assignment to 0 for enum
   variables and don't add commas to the last elements, which is against
   the kernel coding standards. Change the generation process for a
   better compliance and slightly simpler logic.

 - Remove superfluous declarations from generated code

   The monitor container source files contained a declaration and a
   definition for the rv_monitor variable. The former is superfluous and
   was removed.

 - Fix reference to outdated documentation

   s/da_monitor_synthesis.rst/monitor_synthesis.rst in comment in
   da_monitor.h

* tag 'trace-rv-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv: Fix documentation reference in da_monitor.h
  verification/rvgen: Remove unused variable declaration from containers
  verification/dot2c: Remove superfluous enum assignment and add last comma
  verification/dot2c: Remove __buff_to_string() and cleanup
  verification/rvgen: Annotate DA functions with types
  verification/rvgen: Adapt dot2k and templates after refactoring da_monitor.h
  Documentation/rv: Adapt documentation after da_monitor refactoring
  rv: Cleanup da_monitor after refactor
  rv: Refactor da_monitor to minimise macros

2 months agoMerge tag 'for-linus' of https://github.com/openrisc/linux
Linus Torvalds [Thu, 12 Feb 2026 22:04:43 +0000 (14:04 -0800)] 
Merge tag 'for-linus' of https://github.com/openrisc/linux

Pull OpenRISC updates from Stafford Horne:
 "The main focus for this series has been to improve OpenRISC kernel
  out-of-the-box support for FPGA dev boards.

   - Add device tree configurations for De0 Nano single and multicore
     configurations

   - Fix bug in OpenRISC SMP preventing the kernel from running on FPGA
     boards, due to IPIs not being unmasked on secondary CPUs in some
     configurations

   - Pick up a fix from Brian Masney defining the nop() macro to fix
     build failures on OpenRISC for drivers using it"

* tag 'for-linus' of https://github.com/openrisc/linux:
  openrisc: define arch-specific version of nop()
  openrisc: dts: Add de0 nano multicore config and devicetree
  openrisc: dts: Split simple smp dts to dts and dtsi
  openrisc: Fix IPIs on simple multicore systems
  openrisc: dts: Add de0 nano config and devicetree

2 months agoMerge tag 'configfs-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/a...
Linus Torvalds [Thu, 12 Feb 2026 22:01:38 +0000 (14:01 -0800)] 
Merge tag 'configfs-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux

Pull configfs updates from Andreas Hindborg:

 - Switch the configfs rust bindings to use c string literals provided
   by the compiler, rather than a macro

 - A follow up on constifying `configfs_item_operations`, applying the
   change to the configfs sample

* tag 'configfs-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux:
  samples: configfs: Constify struct configfs_item_operations and configfs_group_operations
  rust: configfs: replace `kernel::c_str!` with C-Strings

2 months agotools/sched_ext: scx_userland: fix restart and stats thread lifecycle bugs
David Carlier [Thu, 12 Feb 2026 20:35:19 +0000 (20:35 +0000)] 
tools/sched_ext: scx_userland: fix restart and stats thread lifecycle bugs

Fix three issues in scx_userland's restart path:

- exit_req is not reset on restart, causing sched_main_loop() to exit
  immediately without doing any scheduling work.

- stats_printer thread handle is local to spawn_stats_thread(), making
  it impossible to join from main(). Promote it to file scope.

- The stats thread continues reading skel->bss after the skeleton is
  destroyed on restart, causing a use-after-free. Join the stats thread
  before destroying the skeleton to ensure it has exited.

Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 months agoperf test script: Add python script testing support
Ian Rogers [Mon, 9 Feb 2026 20:22:08 +0000 (12:22 -0800)] 
perf test script: Add python script testing support

Basic coverage of python script support from `perf script`.

Committer testing:

  $ perf test 'perf script python'
  107: perf script python tests                                        : Ok
  $ perf test -vv 'perf script python'
  107: perf script python tests:
  --- start ---
  test child forked, pid 595537
  Testing event: sched:sched_switch
  perf script python test [Skipped: failed to record sched:sched_switch]
  Testing event: task-clock
  Generating python script...
  generated Python script: /tmp/__perf_test_script.J4rWj.py
  Executing python script...
  perf script python test [Success: task-clock triggered param_dict]
  ---- end(0) ----
  107: perf script python tests                                        : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf test script: Add perl script testing support
Ian Rogers [Mon, 9 Feb 2026 20:22:07 +0000 (12:22 -0800)] 
perf test script: Add perl script testing support

Basic coverage of perl script support from `perf script`. This is
disabled by default and so the test will most normally skip.

Committer testing:

  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Skip
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 578323
  perf script perl test [Skipped: no libperl support]
  ---- end(-2) ----
  106: perf script perl tests                                          : Skip
  $ perf check feature libperl
                 libperl: [ OFF ]  # HAVE_LIBPERL_SUPPORT ( tip: Deprecated, use LIBPERL=1 and install perl-ExtUtils-Embed/libperl-dev to build with it )
  $

Install perl-ExtUtils-Embed, build with LIBPERL=1, rebuild:

  $ perf check feature libperl
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Ok
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 588206
  Testing event: sched:sched_switch
  perf script perl test [Skipped: failed to record sched:sched_switch]
  Testing event: task-clock
  Generating perl script...
  generated Perl script: /tmp/__perf_test_script.RpMn5.pl
  Executing perl script...
  perf script perl test [Success: task-clock triggered $VAR1]
  ---- end(0) ----
  106: perf script perl tests                                          : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf script: Allow the generated script to be a path
Ian Rogers [Mon, 9 Feb 2026 20:22:06 +0000 (12:22 -0800)] 
perf script: Allow the generated script to be a path

Allow the script generated by "perf script -g <language>" to be a file
path and the language determined by the file extension.

This is useful in testing so that the generated script file can be
written to a test directory.

Committer testing:

  $ perf record ls a.a
  ls: cannot access 'a.a': No such file or directory
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
  $ perf script -g python
  generated Python script: perf-script.py
  $ perf script -g myscript.py
  generated Python script: myscript.py
  $ diff -u perf-script.py myscript.py
  $ tail myscript.py
  def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
   print(get_dict_as_string(event_fields_dict))
   print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')

  def print_header(event_name, cpu, secs, nsecs, pid, comm):
   print("%-20s %5u %05u.%09u %8u %-20s " % \
   (event_name, cpu, secs, nsecs, pid, comm), end="")

  def get_dict_as_string(a_dict, delimiter=' '):
   return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf test: perf data --to-ctf testing
Ian Rogers [Wed, 11 Feb 2026 01:52:42 +0000 (17:52 -0800)] 
perf test: perf data --to-ctf testing

If babeltrace is detected check that --to-ctf functions with a data
file and in pipe mode.

Committer testing:

  $ perf test 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test                       : Ok
  $ perf test -vv 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test:
  --- start ---
  test child forked, pid 556008
           libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  Testing Perf Data Conversion Command to CTF (File input)
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.021 MB /tmp/__perf_test.perf.data.9TxzZ (115 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.012 MB (115 samples) ]
  Perf Data Converter Command to CTF (File input) [SUCCESS]
  Testing Perf Data Conversion Command to CTF (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.047 MB - ]
  Failed to setup all events.
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.000 MB (0 samples) ]
  Perf Data Converter Command to CTF (Pipe mode) [SUCCESS]
  Unexpected signal in main
  ---- end(0) ----
  124: 'perf data convert --to-ctf' command test                       : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf test: Test pipe mode with data conversion --to-json
Ian Rogers [Wed, 11 Feb 2026 01:52:41 +0000 (17:52 -0800)] 
perf test: Test pipe mode with data conversion --to-json

Add pipe mode test for json data conversion. Tidy up exit and cleanup
code.

Committer testing:

  $ perf test 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test                      : Ok
  $ perf test -vv 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test:
  --- start ---
  test child forked, pid 548738
  Testing Perf Data Conversion Command to JSON
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB /tmp/__perf_test.perf.data.krxvl (104 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.krxvl' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.075 MB (104 samples) ]
  Perf Data Converter Command to JSON [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  Testing Perf Data Conversion Command to JSON (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.046 MB - ]
  [ perf data convert: Converted '-' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.081 MB (110 samples) ]
  Perf Data Converter Command to JSON (Pipe mode) [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  ---- end(0) ----
  124: 'perf data convert --to-json' command test                      : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf json: Pipe mode --to-ctf support
Ian Rogers [Wed, 11 Feb 2026 01:52:40 +0000 (17:52 -0800)] 
perf json: Pipe mode --to-ctf support

In pipe mode the environment may not be fully initialized so be robust
to fields being NULL.

Add default handling of attr events, use the feature events to populate
the ctf writer environment.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf json: Pipe mode --to-json support
Ian Rogers [Wed, 11 Feb 2026 01:52:39 +0000 (17:52 -0800)] 
perf json: Pipe mode --to-json support

In pipe mode the environment may not be fully initialized so be robust
to fields being NULL. Add default handling of feature and attr events.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf check: Add libbabeltrace to the listed features
Ian Rogers [Wed, 11 Feb 2026 01:52:38 +0000 (17:52 -0800)] 
perf check: Add libbabeltrace to the listed features

This enables scripts to more easily determine if `perf data --to-ctf`
is supported.

Committer testing:

  $ perf check feature libbabeltrace
         libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  $ perf check feature -q libbabeltrace && echo have libbabeltrace support
  have libbabeltrace support
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
hupu [Tue, 23 Dec 2025 08:43:34 +0000 (16:43 +0800)] 
perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS

Add support for EXTRA_BPF_FLAGS in the eBPF skeleton build, allowing
users to pass additional clang options such as --sysroot or custom
include paths when cross-compiling perf.

This is primarily intended for cross-build scenarios where the default
host include paths do not match the target kernel version.

Example usage:

    make perf ARCH="arm64" EXTRA_BPF_FLAGS="--sysroot=..."

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: hupu <hupu.gm@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload...
Arnaldo Carvalho de Melo [Wed, 11 Feb 2026 12:46:22 +0000 (09:46 -0300)] 
perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing

Namhyung suggested skipping only the rust tests when the code_with_type
'perf test' workload is not built into perf, do it so that we can
continue to test the C based workloads:

With rust:

  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2645245
  Basic Rust perf annotate test
  Basic annotate test [Success]
  Pipe Rust perf annotate test
  Pipe annotate test [Success]
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Without:

  root@number:/# perf test "data type"
   83: perf data type profiling tests                                  : Ok
  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2634759
  Basic Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Pipe Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agotools build: Fix feature test for rust compiler
Dmitrii Dolgov [Wed, 11 Feb 2026 09:58:01 +0000 (10:58 +0100)] 
tools build: Fix feature test for rust compiler

Currently a dummy rust code is compiled to detect if the rust feature
could be enabled. It turns out that in this case rust emits a dependency
file without any external references:

    /perf/feature/test-rust.d: test-rust.rs

    /perf/feature/test-rust.bin: test-rust.rs

    test-rust.rs:

This can lead to a situation, when rustc was removed after a successful build,
but the build process still thinks it's there and the feature is enabled on
subsequent runs.

Instead simply check the compiler presence to detect the feature, as
suggested by Arnaldo.

This way no actual test-rust.bin will be created, meaning the feature
check will not be cached and always performed. That's exactly what we
want, and the overhead of doing this every time is minimal.

Tested with multiple rounds of install/remove of the rust package.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agoperf libunwind: Fix calls to thread__e_machine()
Ian Rogers [Wed, 11 Feb 2026 05:38:27 +0000 (21:38 -0800)] 
perf libunwind: Fix calls to thread__e_machine()

Add the missing 'e_flags' option to fix the build.

Fixes: 4e66527f8859a661 ("perf thread: Add optional e_flags output argument to thread__e_machine")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 months agodrm/amdgpu: lock both VM and BO in amdgpu_gem_object_open
Christian König [Tue, 20 Jan 2026 11:57:21 +0000 (12:57 +0100)] 
drm/amdgpu: lock both VM and BO in amdgpu_gem_object_open

The VM was not locked in the past since we initially only cleared the
linked list element and not added it to any VM state.

But this has changed quite some time ago, we just never realized this
problem because the VM state lock was masking it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix out-of-bounds stream encoder index v3
Srinivasan Shanmugam [Fri, 6 Feb 2026 15:19:23 +0000 (20:49 +0530)] 
drm/amd/display: Fix out-of-bounds stream encoder index v3

eng_id can be negative and that stream_enc_regs[]
can be indexed out of bounds.

eng_id is used directly as an index into stream_enc_regs[], which has
only 5 entries. When eng_id is 5 (ENGINE_ID_DIGF) or negative, this can
access memory past the end of the array.

Add a bounds check using ARRAY_SIZE() before using eng_id as an index.
The unsigned cast also rejects negative values.

This avoids out-of-bounds access.

Fixes the below smatch error:
dcn*_resource.c: stream_encoder_create() may index
stream_enc_regs[eng_id] out of bounds (size 5).

drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn351/dcn351_resource.c
    1246 static struct stream_encoder *dcn35_stream_encoder_create(
    1247         enum engine_id eng_id,
    1248         struct dc_context *ctx)
    1249 {

    ...

    1255
    1256         /* Mapping of VPG, AFMT, DME register blocks to DIO block instance */
    1257         if (eng_id <= ENGINE_ID_DIGF) {

ENGINE_ID_DIGF is 5.  should <= be <?

Unrelated but, ugh, why is Smatch saying that "eng_id" can be negative?
end_id is type signed long, but there are checks in the caller which prevent it from being negative.

    1258                 vpg_inst = eng_id;
    1259                 afmt_inst = eng_id;
    1260         } else
    1261                 return NULL;
    1262

    ...

    1281
    1282         dcn35_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios,
    1283                                         eng_id, vpg, afmt,
--> 1284                                         &stream_enc_regs[eng_id],
                                                  ^^^^^^^^^^^^^^^^^^^^^^^ This stream_enc_regs[] array has 5 elements so we are one element beyond the end of the array.

    ...

    1287         return &enc1->base;
    1288 }

v2: use explicit bounds check as suggested by Roman/Dan; avoid unsigned int cast

v3: The compiler already knows how to compare the two values, so the
    cast (int) is not needed. (Roman)

Fixes: 2728e9c7c842 ("drm/amd/display: add DC changes for DCN351")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Mario Limonciello <superm1@kernel.org>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Cc: Roman Li <roman.li@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdkfd: Fix APU to use GTT, not VRAM for MQD
Siwei He [Mon, 9 Feb 2026 21:13:20 +0000 (16:13 -0500)] 
drm/amdkfd: Fix APU to use GTT, not VRAM for MQD

Add a check in mqd_on_vram. If the device prefers GTT, it returns false

Fixes: d4a814f400d4 ("drm/amdkfd: Move gfx9.4.3 and gfx 9.5 MQD to HBM")
Signed-off-by: Siwei He <siwei.he@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu:Add psp v13_0_15 ip block
Mangesh Gadre [Tue, 27 Jan 2026 10:00:59 +0000 (18:00 +0800)] 
drm/amdgpu:Add psp v13_0_15 ip block

Add support for psp v13_0_15 ip block

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: set family for GC 11.5.4
Alex Deucher [Tue, 10 Feb 2026 21:53:08 +0000 (16:53 -0500)] 
drm/amdgpu: set family for GC 11.5.4

Set the family for GC 11.5.4

Fixes: 47ae1f938d12 ("drm/amdgpu: add support for GC IP version 11.5.4")
Cc: Tim Huang <tim.huang@amd.com>
Cc: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>