cpu: Make atomic hotplug callbacks run with interrupts disabled on UP
On SMP systems the CPU hotplug callbacks in the "starting" range are
invoked while the CPU is brought up and interrupts are still
disabled. Callbacks which are added later are invoked via the
hotplug-thread on the target CPU and interrupts are explicitly disabled.
In the UP case callbacks which are added later are invoked directly without
the thread indirection. This is in principle okay since there is just one
CPU but those callbacks are invoked with interrupt disabled code. That's
incorrect as those callbacks assume interrupt disabled context.
Disable interrupts before invoking the callbacks on UP if the state is
atomic and interrupts are expected to be disabled. The "save" part is
required because this is also invoked early in the boot process while
interrupts are disabled and must not be enabled prematurely.
Fixes: 06ddd17521bf1 ("sched/smp: Always define is_percpu_thread() and scheduler_ipi()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251127144723.ev9DuXXR@linutronix.de
Linus Torvalds [Tue, 9 Dec 2025 03:06:20 +0000 (12:06 +0900)]
Merge tag 'f2fs-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This series focuses on minor clean-ups and performance optimizations
across sysfs, documentation, debugfs, tracepoints, slab allocation,
and GC. Furthermore, it resolves several corner-case bugs caught by
xfstests, as well as issues related to 16KB page support and
f2fs_enable_checkpoint.
Enhancement:
- wrap ASCII tables in literal blocks to fix LaTeX build
- optimize trace_f2fs_write_checkpoint with enums
- support to show curseg.next_blkoff in debugfs
- add a sysfs entry to show max open zones
- add fadvise tracepoint
- use global inline_xattr_slab instead of per-sb slab cache
- set default valid_thresh_ratio to 80 for zoned devices
- maintain one time GC mode is enabled during whole zoned GC cycle
Bug fix:
- ensure node page reads complete before f2fs_put_super() finishes
- do not account invalid blocks in get_left_section_blocks()
- revert summary entry count from 2048 to 512 in 16kb block support
- detect recoverable inode during dryrun of find_fsync_dnodes()
- fix age extent cache insertion skip on counter overflow
- add sanity checks before unlinking and loading inodes
- ensure minimum trim granularity accounts for all devices
- block cache/dio write during f2fs_enable_checkpoint()
- propagate error from f2fs_enable_checkpoint()
- invalidate dentry cache on failed whiteout creation
- avoid updating compression context during writeback
- avoid updating zero-sized extent in extent cache
- avoid potential deadlock"
* tag 'f2fs-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (39 commits)
f2fs: ignore discard return value
f2fs: optimize trace_f2fs_write_checkpoint with enums
f2fs: fix to not account invalid blocks in get_left_section_blocks()
f2fs: support to show curseg.next_blkoff in debugfs
docs: f2fs: wrap ASCII tables in literal blocks to fix LaTeX build
f2fs: expand scalability of f2fs mount option
f2fs: change default schedule timeout value
f2fs: introduce f2fs_schedule_timeout()
f2fs: use memalloc_retry_wait() as much as possible
f2fs: add a sysfs entry to show max open zones
f2fs: wrap all unusable_blocks_per_sec code in CONFIG_BLK_DEV_ZONED
f2fs: simplify list initialization in f2fs_recover_fsync_data()
f2fs: revert summary entry count from 2048 to 512 in 16kb block support
f2fs: fix to detect recoverable inode during dryrun of find_fsync_dnodes()
f2fs: fix return value of f2fs_recover_fsync_data()
f2fs: add fadvise tracepoint
f2fs: fix age extent cache insertion skip on counter overflow
f2fs: Add sanity checks before unlinking and loading inodes
f2fs: Rename f2fs_unlink exit label
f2fs: ensure minimum trim granularity accounts for all devices
...
Linus Torvalds [Tue, 9 Dec 2025 00:07:28 +0000 (09:07 +0900)]
Merge tag 'io_uring-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring updates from Jens Axboe:
"Followup set of fixes for io_uring for this merge window. These are
either later fixes, or cleanups that don't make sense to defer. This
pull request contains:
- Fix for a recent regression in io-wq worker creation
- Tracing cleanup
- Use READ_ONCE/WRITE_ONCE consistently for ring mapped kbufs. Mostly
for documentation purposes, indicating that they are shared with
userspace
- Fix for POLL_ADD losing a completion, if the request is updated and
now is triggerable - eg, if POLLIN is set with the updated, and the
polled file is readable
- In conjunction with the above fix, also unify how poll wait queue
entries are deleted with the head update. We had 3 different spots
doing both the list deletion and head write, with one of them
nicely documented. Abstract that into a helper and use it
consistently
- Small series from Joanne fixing an issue with buffer cloning, and
cleaning up the arg validation"
* tag 'io_uring-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/poll: unify poll waitqueue entry and list removal
io_uring/kbuf: use WRITE_ONCE() for userspace-shared buffer ring fields
io_uring/kbuf: use READ_ONCE() for userspace-mapped memory
io_uring/rsrc: fix lost entries after cloned range
io_uring/rsrc: rename misleading src_node variable in io_clone_buffers()
io_uring/rsrc: clean up buffer cloning arg validation
io_uring/trace: rename io_uring_queue_async_work event "rw" field
io_uring/io-wq: always retry worker create on ERESTART*
io_uring/poll: correctly handle io_poll_add() return value on update
Linus Torvalds [Mon, 8 Dec 2025 23:53:24 +0000 (08:53 +0900)]
Merge tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
"Followup set of fixes and updates for block for the 6.19 merge window.
NVMe had some late minute debates which lead to dropping some patches
from that tree, which is why the initial PR didn't have NVMe included.
It's here now. This pull request contains:
- Fix a memory leak in the discard ioctl code, if the task is being
interrupted by a signal at just the wrong time
- Zoned write plugging fixes
- Add ioctls for for persistent reservations
- Enable per-cpu bio caching by default
- Various little fixes and tweaks"
* tag 'block-6.19-20251208' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (27 commits)
nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
nvme-auth: use kvfree() for memory allocated with kvcalloc()
nvmet-tcp: use kvcalloc for commands array
nvmet-rdma: use kvcalloc for commands and responses arrays
nvme: fix typo error in nvme target
nvmet-fc: use pr_* print macros instead of dev_*
nvmet-fcloop: remove unused lsdir member.
nvmet-fcloop: check all request and response have been processed
nvme-fc: check all request and response have been processed
block: fix memory leak in __blkdev_issue_zero_pages
block: fix comment for op_is_zone_mgmt() to include RESET_ALL
block: Clear BLK_ZONE_WPLUG_PLUGGED when aborting plugged BIOs
blk-mq: Abort suspend when wakeup events are pending
blk-mq: add blk_rq_nr_bvec() helper
block: add IOC_PR_READ_RESERVATION ioctl
block: add IOC_PR_READ_KEYS ioctl
nvme: reject invalid pr_read_keys() num_keys values
scsi: sd: reject invalid pr_read_keys() num_keys values
block: enable per-cpu bio cache by default
block: use bio_alloc_bioset for passthru IO by default
...
Linus Torvalds [Mon, 8 Dec 2025 23:46:10 +0000 (08:46 +0900)]
Merge tag 'hwmon-for-v6.19-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes Guenter Roeck:
- Documentation: Fix link to g762 devicetree binding
- emc2305: Fix devicetree refcount leak and double put
- dell-smm: Fix channel-index off-by-one error
- w83791d: Convert macros to functions to avoid TOCTOU
* tag 'hwmon-for-v6.19-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
docs: hwmon: fix link to g762 devicetree binding
hwmon: (emc2305) fix device node refcount leak in error path
hwmon: (emc2305) fix double put in emc2305_probe_childs_from_dt
hwmon: (dell-smm) Fix off-by-one error in dell_smm_is_visible()
hwmon: (w83791d) Convert macros to functions to avoid TOCTOU
Linus Torvalds [Mon, 8 Dec 2025 21:45:00 +0000 (06:45 +0900)]
Merge tag 'pinctrl-v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"The technical details below. For me the CIX Semi and Axis
Communications ARTPEC-9 SoCs were the most interesting new drivers in
this merge window.
Core changes:
- Handle per-direction skew control in the generic pin config
- Drop the pointless subsystem boilerplate banner message during
boot. Less noise in the console. It's available as debug message if
someone really want it
New drivers:
- Samsung Exynos 8890 SoC support
- Samsung Exynos derived Axis Communications ARTPEC-9 SoC support.
These guys literally live next door to me, ARTPEC spells out "Axis
Real-Time Picture Encoding Chip" and is tailored for camera image
streams and is something they have evolved for a quarter of a
century
- Mediatek MT6878 SoC support
- Qualcomm Glymur PMIC support (mostly just compatible strings)
- Qualcomm Kaanapali SoC TLMM support
- Microchip pic64gx "gpio2" SoC support
- Microchip Polarfire "iomux0" SoC support
- CIX Semiconductors SKY1 SoC support
- Rockchip RK3506 SoC support
- Airhoa AN7583 chip support
Improvements:
- Improvements for ST Microelectronics STM32 handling of skew
settings so input and output can have different skew settings
- A whole bunch of device tree binding cleanups: Marvell Armada and
Berlin, Actions Semiconductor S700 and S900, Broadcom Northstar 2
(NS2), Bitmain BM1880 and Spreadtrum SC9860 are moved over to
schema"
* tag 'pinctrl-v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (107 commits)
pinctrl: add CONFIG_OF dependencies for microchip drivers
pinctrl: starfive: use dynamic GPIO base allocation
pinctrl: single: Fix incorrect type for error return variable
MAINTAINERS: Change Linus Walleij mail address
pinctrl: cix: Fix obscure dependency
dt-bindings: pinctrl: cix,sky1-pinctrl: Drop duplicate newline
dt-bindings: pinctrl: aspeed,ast2600-pinctrl: Add PCIe RC PERST# group
pinctrl: airoha: Fix AIROHA_PINCTRL_CONFS_DRIVE_E2 in an7583_pinctrl_match_data
pinctrl: airoha: fix pinctrl function mismatch issue
pinctrl: cherryview: Convert to use intel_gpio_add_pin_ranges()
pinctrl: intel: Export intel_gpio_add_pin_ranges()
pinctrl: renesas: rzg2l: Refactor OEN register PWPR handling
pinctrl: airoha: convert comma to semicolon
pinctrl: elkhartlake: Switch to INTEL_GPP() macro
pinctrl: cherryview: Switch to INTEL_GPP() macro
pinctrl: emmitsburg: Switch to INTEL_GPP() macro
pinctrl: denverton: Switch to INTEL_GPP() macro
pinctrl: cedarfork: Switch to INTEL_GPP() macro
pinctrl: airoha: add support for Airoha AN7583 PINs
dt-bindings: pinctrl: airoha: Document AN7583 Pin Controller
...
Linus Torvalds [Mon, 8 Dec 2025 21:35:53 +0000 (06:35 +0900)]
Merge tag 'dmaengine-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
- Renesas driver conversion to RUNTIME_PM_OPS() etc
- Dropping module alias on bunch of drivers
- GPI Block event interrupt support in Qualcomm driver and updates to
I2C driver as well
* tag 'dmaengine-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (23 commits)
dt-bindings: dma: xilinx: Simplify dma-coherent property
dmaengine: fsl-edma: configure tcd attr with separate src and dst settings
dmaengine: st_fdma: drop unused module alias
dmaengine: bcm2835: enable compile testing
dmaengine: tegra210-adma: drop unused module alias
dmaengine: sprd: drop unused module alias
dmaengine: mmp_tdma: drop unnecessary OF node check in remove
dmaengine: mmp_tdma: drop unused module alias
dmaengine: k3dma: drop unused module alias
dmaengine: fsl-qdma: drop unused module alias
dmaengine: fsl-edma: drop unused module alias
dmaengine: dw: drop unused module alias
dmaengine: bcm2835: drop unused module alias
dmaengine: at_hdmac: add COMPILE_TEST support
dmaengine: at_hdmac: fix formats under 64-bit
i2c: i2c-qcom-geni: Add Block event interrupt support
dmaengine: qcom: gpi: Add GPI Block event interrupt support
dmaengine: idxd: drain ATS translations when disabling WQ
dmaengine: sh: Kconfig: Drop ARCH_R7S72100/ARCH_RZG2L dependency
dmaengine: rcar-dmac: Convert to NOIRQ_SYSTEM_SLEEP/RUNTIME_PM_OPS()
...
Linus Torvalds [Mon, 8 Dec 2025 21:31:47 +0000 (06:31 +0900)]
Merge tag 'phy-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"Core:
- Drop Kishon as maintainer, thanks to him for helping, move to
credits and add Neil to help with reviews.
- Add new phy_notify_stat to notify phy from controllers during the
runtime transitions and usage in samsung phy
New hardware support:
- Renesas RZ/G3E USB3.0 driver
- NXP Support TJA1048/TJA1051 CAN phy
- Rockchip support for rk3506 dsi dphy
- Qualcomm Glymur QMP PCIe PHY support
Updates:
- PM support for rcar-gen3-usb2 driver
- Samsung HDMI/eDP Transmitter Combo PHY updates
- Freescale imx8mq support for alternate reference clock"
* tag 'phy-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (40 commits)
MAINTAINERS: phy: Add Neil Armstrong as reviewers for phy subsystem
MAINTAINERS: phy: Move Kishon Vijay Abraham I to credits
phy: fsl-imx8mq-usb: support alternate reference clock
dt-bindings: phy: imx8mq-usb: add alternate reference clock
phy: rockchip: samsung-hdptx: Prevent Inter-Pair Skew from exceeding the limits
phy: rockchip: samsung-hdptx: Reduce ROPLL loop bandwidth
phy: rockchip: samsung-hdptx: Fix reported clock rate in high bpc mode
phy: ti: gmii-sel: Add a sanity check on the phy_id
phy: qcom: qmp-pcie: Add support for Glymur PCIe Gen5x4 PHY
phy: qcom-qmp: pcs: Add v8.50 register offsets
dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Document the Glymur QMP PCIe PHY
dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Restrict resets per each device
phy: freescale: Initialize priv->lock
phy: renesas: Remove unneeded semicolons
phy: qcom: m31-eusb2: Update init sequence to set PHY_ENABLE
phy: qcom: qmp-combo: get the USB3 & DisplayPort lanes mapping from DT
dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp-phy: Document lanes mapping when not using in USB-C complex
phy: rockchip: naneng-combphy: Fix PCIe L1ss support RK3562
phy: rockchip: naneng-combphy: Fix PCIe L1ss support RK3528
phy: renesas: rcar-gen3-usb2: Add suspend/resume support
...
Linus Torvalds [Mon, 8 Dec 2025 21:10:17 +0000 (06:10 +0900)]
Merge tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Enhancements to Linux as the root partition for Microsoft Hypervisor:
- Support a new mode called L1VH, which allows Linux to drive the
hypervisor running the Azure Host directly
- Support for MSHV crash dump collection
- Allow Linux's memory management subsystem to better manage guest
memory regions
- Fix issues that prevented a clean shutdown of the whole system on
bare metal and nested configurations
- ARM64 support for the MSHV driver
- Various other bug fixes and cleanups
- Add support for Confidential VMBus for Linux guest on Hyper-V
- Secure AVIC support for Linux guests on Hyper-V
- Add the mshv_vtl driver to allow Linux to run as the secure kernel in
a higher virtual trust level for Hyper-V
* tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (58 commits)
mshv: Cleanly shutdown root partition with MSHV
mshv: Use reboot notifier to configure sleep state
mshv: Add definitions for MSHV sleep state configuration
mshv: Add support for movable memory regions
mshv: Add refcount and locking to mem regions
mshv: Fix huge page handling in memory region traversal
mshv: Move region management to mshv_regions.c
mshv: Centralize guest memory region destruction
mshv: Refactor and rename memory region handling functions
mshv: adjust interrupt control structure for ARM64
Drivers: hv: use kmalloc_array() instead of kmalloc()
mshv: Add ioctl for self targeted passthrough hvcalls
Drivers: hv: Introduce mshv_vtl driver
Drivers: hv: Export some symbols for mshv_vtl
static_call: allow using STATIC_CALL_TRAMP_STR() from assembly
mshv: Extend create partition ioctl to support cpu features
mshv: Allow mappings that overlap in uaddr
mshv: Fix create memory region overlap check
mshv: add WQ_PERCPU to alloc_workqueue users
Drivers: hv: Use kmalloc_array() instead of kmalloc()
...
Linus Torvalds [Mon, 8 Dec 2025 02:25:14 +0000 (11:25 +0900)]
Merge tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c updates from Alexandre Belloni:
"HDR support has finally been added. mipi-i3c-hci has been reworked and
Intel Nova Lake-S support has been added.
Subsystem:
- Add HDR transfer support
Drivers:
- dw: fix bus hang on Agilex5
- mipi-i3c-hci: Intel Nova Lake-S support, IOMMU support
- svc: HDR support"
* tag 'i3c/for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (28 commits)
regmap: i3c: switch to use i3c_xfer from i3c_priv_xfer
net: mctp i3c: switch to use i3c_xfer from i3c_priv_xfer
hwmon: (lm75): switch to use i3c_xfer from i3c_priv_xfer
i3c: document i3c_xfers
i3c: fix I3C_SDR bit number
i3c: master: svc: Add basic HDR mode support
i3c: master: svc: Replace bool rnw with union for HDR support
i3c: Switch to use new i3c_xfer from i3c_priv_xfer
i3c: Add HDR API support
i3c: master: add WQ_PERCPU to alloc_workqueue users
i3c: master: Remove i3c_device_free_ibi from i3c_device_remove
i3c: mipi-i3c-hci-pci: Set d3cold_delay to 0 for Intel controllers
i3c: mipi-i3c-hci-pci: Add LTR support for Intel controllers
i3c: mipi-i3c-hci-pci: Add exit callback
i3c: mipi-i3c-hci-pci: Change callback parameter
i3c: mipi-i3c-hci-pci: Allocate a structure for mipi_i3c_hci_pci device information
i3c: mipi-i3c-hci-pci: Factor out intel_reset()
i3c: mipi-i3c-hci-pci: Factor out private registers ioremapping
i3c: mipi-i3c-hci-pci: Constify driver data
i3c: mipi-i3c-hci-pci: Use readl_poll_timeout()
...
Linus Torvalds [Mon, 8 Dec 2025 00:38:52 +0000 (09:38 +0900)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"This is entirely SoC clk drivers.
The majority diff wise is for the new Rockchip and Qualcomm clk
drivers which is mostly lines and lines of data structures to describe
the clk hardware in these SoCs. Beyond those two, Renesas continues to
incrementally add clks to their SoC drivers, causing them to show up
higher in the diffstat this time because they added quite a few clks
all over the place.
Overall it is a semi-quiet release that has some new clk drivers and
the usual fixes for clock data that was wrong or missing and
non-critical cleanups that plug error paths or fix typos.
New Drivers:
- Qualcomm IPQ5424 Network Subsystem Clock Controller
- Qualcomm SM8750 Video Clock Controller
- Rockchip RV1126B and RK3506 clock drivers
- i.MX8ULP SIM LPAV clock driver
- Samsung ACPM (firmware interface) clock driver
- Altera Agilex5 clock driver"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (117 commits)
clk: keystone: fix compile testing
clk: keystone: syscon-clk: fix regmap leak on probe failure
clk: qcom: Mark camcc_sm7150_hws static
clk: samsung: exynos-clkout: Assign .num before accessing .hws
clk: rockchip: Add clock and reset driver for RK3506
dt-bindings: clock: rockchip: Add RK3506 clock and reset unit
clk: actions: Fix discarding const qualifier by 'container_of' macro
clk: spacemit: Set clk_hw_onecell_data::num before using flex array
clk: visconti: Add VIIF clocks
dt-bindings: clock: tmpv770x: Add VIIF clocks
dt-bindings: clock: tmpv770x: Remove definition of number of clocks
clk: visconti: Do not define number of clocks in bindings
clk: rockchip: Add clock controller for the RV1126B
dt-bindings: clock, reset: Add support for rv1126b
clk: rockchip: Implement rockchip_clk_register_armclk_multi_pll()
clk: qcom: x1e80100-dispcc: Add USB4 router link resets
dt-bindings: clock: qcom: x1e80100-dispcc: Add USB4 router link resets
clk: qcom: videocc-sm8750: Add video clock controller driver for SM8750
dt-bindings: clock: qcom: Add SM8750 video clock controller
clk: qcom: branch: Extend invert logic for branch2 mem clocks
...
Pei Xiao [Fri, 5 Dec 2025 03:15:13 +0000 (11:15 +0800)]
hwmon: (emc2305) fix device node refcount leak in error path
The for_each_child_of_node() macro automatically manages device node
reference counts during normal iteration. However, when breaking out
of the loop early with return, the current iteration's node is not
automatically released, leading to a reference count leak.
Fix this by adding of_node_put(child) before returning from the loop
when emc2305_set_single_tz() fails.
This issue could lead to memory leaks over multiple probe cycles.
Armin Wolf [Wed, 3 Dec 2025 20:21:09 +0000 (21:21 +0100)]
hwmon: (dell-smm) Fix off-by-one error in dell_smm_is_visible()
The documentation states that on machines supporting only global
fan mode control, the pwmX_enable attributes should only be created
for the first fan channel (pwm1_enable, aka channel 0).
Fix the off-by-one error caused by the fact that fan channels have
a zero-based index.
Cc: stable@vger.kernel.org Fixes: 1c1658058c99 ("hwmon: (dell-smm) Add support for automatic fan mode") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20251203202109.331528-1-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Gui-Dong Han [Tue, 2 Dec 2025 18:01:05 +0000 (02:01 +0800)]
hwmon: (w83791d) Convert macros to functions to avoid TOCTOU
The macro FAN_FROM_REG evaluates its arguments multiple times. When used
in lockless contexts involving shared driver data, this leads to
Time-of-Check to Time-of-Use (TOCTOU) race conditions, potentially
causing divide-by-zero errors.
Convert the macro to a static function. This guarantees that arguments
are evaluated only once (pass-by-value), preventing the race
conditions.
Additionally, in store_fan_div, move the calculation of the minimum
limit inside the update lock. This ensures that the read-modify-write
sequence operates on consistent data.
Adhere to the principle of minimal changes by only converting macros
that evaluate arguments multiple times and are used in lockless
contexts.
Linus Torvalds [Sun, 7 Dec 2025 16:56:10 +0000 (08:56 -0800)]
Merge tag 'memblock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock update from Mike Rapoport:
"Introduce a 'check_pages' boot parameter to decouple simple checks for
page state on allocation and free from CONFIG_DEBUG_VM.
This allows enabling page checking without building kernel with
CONFIG_DEBUG_VM or forcing init_on_{alloc, free} or other heavier
mechanisms"
* tag 'memblock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
mm/mm_init: Introduce a boot parameter for check_pages
Linus Torvalds [Sun, 7 Dec 2025 16:29:09 +0000 (08:29 -0800)]
Merge tag '9p-for-6.19-rc1' of https://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:
- fix a bug with O_APPEND in cached mode causing data to be written
multiple times on server
- use kvmalloc for trans_fd to avoid problems with large msize and
fragmented memory This should hopefully be used in more transports
when time allows
- convert to new mount API
- minor cleanups
* tag '9p-for-6.19-rc1' of https://github.com/martinetd/linux:
9p: fix new mount API cache option handling
9p: fix cache/debug options printing in v9fs_show_options
9p: convert to the new mount API
9p: create a v9fs_context structure to hold parsed options
net/9p: move structures and macros to header files
fs/fs_parse: add back fsparam_u32hex
fs/9p: delete unnnecessary condition
fs/9p: Don't open remote file with APPEND mode when writeback cache is used
net/9p: cleanup: change p9_trans_module->def to bool
9p: Use kvmalloc for message buffers on supported transports
Linus Torvalds [Sun, 7 Dec 2025 15:07:02 +0000 (07:07 -0800)]
Merge tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"Perf event/metric description:
Unify all event and metric descriptions in JSON format. Now event
parsing and handling is greatly simplified by that.
From users point of view, perf list will provide richer information
about hardware events like the following.
$ perf list hw
List of pre-defined events (to be used in -e or -M):
legacy hardware:
branch-instructions
[Retired branch instructions [This event is an alias of branches]. Unit: cpu]
branch-misses
[Mispredicted branch instructions. Unit: cpu]
branches
[Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
bus-cycles
[Bus cycles,which can be different from total cycles. Unit: cpu]
cache-misses
[Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
cache-references
[Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
cpu-cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
instructions
[Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu]
ref-cycles
[Total cycles; not affected by CPU frequency scaling. Unit: cpu]
But most notable changes would be in the perf stat. On the right side,
the default metrics are better named and aligned. :)
$ perf stat -- perf test -w noploop
Performance counter stats for 'perf test -w noploop':
With the kernel support (commit c69993ecdd4d: "perf: Support deferred
user unwind"), perf can use deferred callchains for userspace stack
trace with frame pointers like below:
$ perf record --call-graph fp,defer ...
This will be transparent to users when it comes to other commands like
perf report and perf script. They will merge the deferred callchains
to the previous samples as if they were collected together.
ARM SPE updates
- Extensive enhancements to support various kinds of memory
operations including GCS, MTE allocation tags, memcpy/memset,
register access, and SIMD operations.
- Add inverted data source filter (inv_data_src_filter) support to
exclude certain data sources.
- Improve documentation.
Vendor event updates:
- Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
Lake, Lunar Lake, Granite Rapids, and others.
- Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
definitions.
- RISC-V: Added JSON support for T-HEAD C920V2.
Misc:
- Improve pointer tracking in data type profiling. It'd give better
output when the variable is using container_of() to convert type.
- Annotation support for perf c2c report in TUI. Press 'a' key to
enter annotation view from cacheline browser window. This will show
which instruction is causing the cacheline contention.
- Lots of fixes and test coverage improvements!"
* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
libperf: Use 'extern' in LIBPERF_API visibility macro
perf stat: Improve handling of termination by signal
perf tests stat: Add test for error for an offline CPU
perf stat: When no events, don't report an error if there is none
perf tests stat: Add "--null" coverage
perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
perf stat: Allow no events to open if this is a "--null" run
perf test kvm: Add some basic perf kvm test coverage
perf tests evlist: Add basic evlist test
perf tests script dlfilter: Add a dlfilter test
perf tests kallsyms: Add basic kallsyms test
perf tests timechart: Add a perf timechart test
perf tests top: Add basic perf top coverage test
perf tests buildid: Add purge and remove testing
perf tests c2c: Add a basic c2c
perf c2c: Clean up some defensive gets and make asan clean
perf jitdump: Fix missed dso__put
perf mem-events: Don't leak online CPU map
perf hist: In init, ensure mem_info is put on error paths
...
Linus Torvalds [Sun, 7 Dec 2025 02:52:00 +0000 (18:52 -0800)]
Merge tag 'staging-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the big set of staging driver updates for 6.19-rc1.
Only thing "major" in here is that two subsystems, gpib and vc04 have
moved out of the staging tree into the "real" portion of the kernel,
which is great to see. Other than that, the rest of the changes are
just tiny coding style cleanups, nothing earth-shattering.
All of these have been in linux-next for a while with no reported
problems"
* tag 'staging-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (53 commits)
staging: rtl8723bs: fix out-of-bounds read in OnBeacon ESR IE parsing
staging: rtl8723bs: fix stack buffer overflow in OnAssocReq IE parsing
staging: rtl8723bs: fix out-of-bounds read in rtw_get_ie() parser
staging: gpib: Clean-up commented-out code
staging: rtl8723bs: remove custom FIELD_OFFSET macro
staging: rtl8723bs: replace FIELD_OFFSET usage with offsetof in rtw_mlme_ext.c
staging: rtl8723bs: remove dead commented code from odm.c
staging: rtl8723bs: use standard offsetof in cfg80211 operations
staging: rtl8723bs: remove unused registry and BSSID offset macros
staging: rtl8723bs: core: delete commented-out code
staging: rtl8723bs: core: fix block comment style issues
staging: greybus: uart: check return values during probe
staging: fbtft: core: fix potential memory leak in fbtft_probe_common()
staging: gpib: Destage gpib
staging: gpib: Fix SPDX license for gpib headers
staging: gpib: Update TODO file
staging: gpib: Change // comments in uapi header file
platform/raspberrypi: Destage VCHIQ MMAL driver
platform/raspberrypi: Destage VCHIQ interface
staging: vc04_services: Cleanup VCHIQ TODO entries
...
Linus Torvalds [Sun, 7 Dec 2025 02:42:12 +0000 (18:42 -0800)]
Merge tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt driver updates for
6.19-rc1. Nothing major here, just lots of tiny updates for most of
the common USB drivers. Included in here are:
- more xhci driver updates and fixes
- Thunderbolt driver cleanups
- usb serial driver updates
- typec driver updates
- USB tracepoint additions
- dwc3 driver updates, including support for Apple hardware
- lots of other smaller driver updates and cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (161 commits)
usb: gadget: tegra-xudc: Always reinitialize data toggle when clear halt
USB: serial: option: move Telit 0x10c7 composition in the right place
USB: serial: option: add Telit Cinterion FE910C04 new compositions
usb: typec: ucsi: fix use-after-free caused by uec->work
usb: typec: ucsi: fix probe failure in gaokun_ucsi_probe()
usb: dwc3: core: Remove redundant comment in core init
usb: phy: Initialize struct usb_phy list_head
USB: serial: option: add Foxconn T99W760
usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.
usb: typec: hd3ss3220: Enable VBUS based on ID pin state
dt-bindings: usb: ti,hd3ss3220: Add support for VBUS based on ID state
usb: typec: anx7411: add WQ_PERCPU to alloc_workqueue users
USB: add WQ_PERCPU to alloc_workqueue users
dt-bindings: usb: dwc3-xilinx: Describe the reset constraint for the versal platform
drivers/usb/storage: use min() instead of min_t()
usb: raw-gadget: cap raw_io transfer length to KMALLOC_MAX_SIZE
usb: ohci-da8xx: remove unused platform data
usb: gadget: functionfs: use dma_buf_unmap_attachment_unlocked() helper
usb: uas: reduce time under spinlock
usb: dwc3: eic7700: Add EIC7700 USB driver
...
Linus Torvalds [Sun, 7 Dec 2025 02:38:19 +0000 (18:38 -0800)]
Merge tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the big set of tty/serial driver changes for 6.19-rc1. Nothing
major at all, just small constant churn to make the tty layer
"cleaner" as well as serial driver updates and even a new test added!
Included in here are:
- More tty/serial cleanups from Jiri
- tty tiocsti test added to hopefully ensure we don't regress in this
area again
- sc16is7xx driver updates
- imx serial driver updates
- 8250 driver updates
- new hardware device ids added
- other minor serial/tty driver cleanups and tweaks
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (60 commits)
serial: sh-sci: Fix deadlock during RSCI FIFO overrun error
dt-bindings: serial: rsci: Drop "uart-has-rtscts: false"
LoongArch: dts: Add uart new compatible string
serial: 8250: Add Loongson uart driver support
dt-bindings: serial: 8250: Add Loongson uart compatible
serial: 8250: add driver for KEBA UART
serial: Keep rs485 settings for devices without firmware node
serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms
serial: qcom-geni: Enable PM runtime for serial driver
serial: sprd: Return -EPROBE_DEFER when uart clock is not ready
tty: serial: samsung: Declare earlycon for Exynos850
serial: icom: Convert PCIBIOS_* return codes to errnos
serial: 8250-of: Fix style issues in 8250_of.c
serial: add support of CPCI cards
serial: mux: Fix kernel doc for mux_poll()
tty: replace use of system_unbound_wq with system_dfl_wq
serial: 8250_platform: simplify IRQF_SHARED handling
serial: 8250: make share_irqs local to 8250_platform
serial: 8250: move skip_txen_test to core
serial: drop SERIAL_8250_DEPRECATED_OPTIONS
...
Linus Torvalds [Sun, 7 Dec 2025 02:34:24 +0000 (18:34 -0800)]
Merge tag 'char-misc-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO driver updates from Greg KH:
"Here is the big set of char/misc/iio driver updates for 6.19-rc1. Lots
of stuff in here including:
- lots of IIO driver updates, cleanups, and additions
- large interconnect driver changes as they get converted over to a
dynamic system of ids
- coresight driver updates
- mwave driver updates
- binder driver updates and changes
- comedi driver fixes now that the fuzzers are being set loose on
them
- nvmem driver updates
- new uio driver addition
- lots of other small char/misc driver updates, full details in the
shortlog
All of these have been in linux-next for a while now"
* tag 'char-misc-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (304 commits)
char: applicom: fix NULL pointer dereference in ac_ioctl
hangcheck-timer: fix coding style spacing
hangcheck-timer: Replace %Ld with %lld
hangcheck-timer: replace printk(KERN_CRIT) with pr_crit
uio: Add SVA support for PCI devices via uio_pci_generic_sva.c
dt-bindings: slimbus: fix warning from example
intel_th: Fix error handling in intel_th_output_open
misc: rp1: Fix an error handling path in rp1_probe()
char: xillybus: add WQ_UNBOUND to alloc_workqueue users
misc: bh1770glc: use pm_runtime_resume_and_get() in power_state_store
misc: cb710: Fix a NULL vs IS_ERR() check in probe()
mux: mmio: Add suspend and resume support
virt: acrn: split acrn_mmio_dev_res out of acrn_mmiodev
greybus: gb-beagleplay: Fix timeout handling in bootloader functions
greybus: add WQ_PERCPU to alloc_workqueue users
char/mwave: drop typedefs
char/mwave: drop printk wrapper
char/mwave: remove printk tracing
char/mwave: remove unneeded fops
char/mwave: remove MWAVE_FUTZ_WITH_OTHER_DEVICES ifdeffery
...
Linus Torvalds [Sun, 7 Dec 2025 02:28:52 +0000 (18:28 -0800)]
Merge tag 'spdx-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX License update from Greg KH:
"Here is a single patch that updates the LGPL-2.1 license text with
the "alternate" SPDX tags that are allowed for this license type"
* tag 'spdx-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
LICENSES: Add modern form of the LGPL-2.1 tags to the usage guide section
Linus Torvalds [Sun, 7 Dec 2025 00:24:52 +0000 (16:24 -0800)]
Merge tag 'parisc-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture updates from Helge Deller:
"A fix which allows booting on the very old 710 workstations, and two
fixes in the syscall entry/exit path which allow to execute 64-bit
userspace binaries.
Note that although we currently have a 64-bit (static) kernel to allow
more than 4 GB physical RAM, there is no support for 64-bit userspace
for parisc-linux yet, but Dave and Sven are making slowly progress to
port and fix glibc and gcc.
Summary:
- Fix boot on 710 workstation by not reprogramming ASP chip
- Fix 64bit userspace syscalls (64-bit userspace is still being
developed)
- minor code cleanups in asm/bug.h and perf_regs.c"
* tag 'parisc-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Do not reprogram affinitiy on ASP chip
parisc: Drop linux/kernel.h include from asm/bug.h header
parisc: remove unneeded semicolon in perf_regs.c
parisc: entry.S: fix space adjustment on interruption for 64-bit userspace
parisc: entry: set W bit for !compat tasks in syscall_restore_rfi()
parisc: Drop padding fields and layers entries from inventory log
Linus Torvalds [Sat, 6 Dec 2025 23:41:26 +0000 (15:41 -0800)]
Merge tag 'fbdev-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"The Termius 10x18 console bitmap font has been added. It is good
match for modern 13-16 inch laptop displays with resolutions like
1280x800 and 1440x900 pixels.
The gbefb and tcx.c drivers got some fixes to restore X11 support,
pxafb was not actually clamping input values and the ssd1307fb driver
leaked memory in the failure path.
The other patches convert some common drivers to use dev_info() and
dev_dbg() instead of printk(). Summary:
Cleanups:
- vga16fb: Request memory region [Javier Garcia]
- vga16fb: replace printk() with dev_*() in probe [Vivek BalachandharTN]
- vesafb, gxt4500fb, tridentfb: Use dev_dbg() instead of printk() [Javier Garcia]
- i810: use dev_info() [Shi Hao]"
* tag 'fbdev-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: ssd1307fb: fix potential page leak in ssd1307fb_probe()
fbdev: i810: use appopriate log interface dev_info
fbdev: tridentfb: replace printk() with dev_*() in probe
lib/fonts: Add Terminus 10x18 console font
fbdev: pxafb: Fix multiple clamped values in pxafb_adjust_timing
fbdev: tcx.c fix mem_map to correct smem_start offset
fbdev: gxt4500fb: Use dev_err instead of printk
fbdev: gbefb: fix to use physical address instead of dma address
fbdev: vesafb: Use dev_* fn's instead printk
fbdev: vga16fb: Request memory region
fbdev: vga16fb: replace printk() with dev_*() in probe
Linus Torvalds [Sat, 6 Dec 2025 23:28:11 +0000 (15:28 -0800)]
ocfs2: fix xattr array entry __counted_by error
Commit 2f26f58df041 ("ocfs2: annotate flexible array members with
__counted_by_le()") started annotating the flexible arrays used by
ocfs2, and now gcc complains about ocfs2_reflink_xattr_header():
In function ‘fortify_memset_chk’,
inlined from ‘ocfs2_reflink_xattr_header’ at fs/ocfs2/xattr.c:6365:5:
include/linux/fortify-string.h:480:25: error: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
and it looks like the complaint is valid - even if the actual error
message is somewhat confusing.
The 'last' pointer points to past the end of the counted flex array, but
is used as an actual 'last' entry rather than a 'one-past-last'.
It looks like the code copied and cleared an extra entry (which is
likely harmless in practice), but I don't know ocfs2 at all. Because
it's also possible that the counted-by annotations are off-by-one, and
so this needs checking by somebody who actually knows ocfs2.
But in the meantime this fixes the build error, and certainly _looks_
sane.
Cc: Dmitry Antipov <dmantipov@yandex.ru> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 6 Dec 2025 22:01:20 +0000 (14:01 -0800)]
Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
fixes a build issue and does some cleanup in ib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
enhances the 64-bit math code on behalf of a PWM driver and beefs up
the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
makes BPF symbol names, sizes, and line numbers available to the GDB
debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
adds a sysctl which can be used to cause additional info dumping when
the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
adds a general base64 encoder/decoder to lib/ and migrates several
users away from their private implementations
- "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
reworks the KEXEC Handover interfaces in preparation for Live Update
Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin)
is a major new feature targeted at cloud environments. Quoting the
cover letter:
This series introduces the Live Update Orchestrator, a kernel
subsystem designed to facilitate live kernel updates using a
kexec-based reboot. This capability is critical for cloud
environments, allowing hypervisors to be updated with minimal
downtime for running virtual machines. LUO achieves this by
preserving the state of selected resources, such as memory,
devices and their dependencies, across the kernel transition.
As a key feature, this series includes support for preserving
memfd file descriptors, which allows critical in-memory data, such
as guest RAM or any other large memory region, to be maintained in
RAM across the kexec reboot.
Mike Rappaport merits a mention here, for his extensive review and
testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
moves the kexec and kdump sysfs entries from /sys/kernel/ to
/sys/kernel/kexec/ and adds back-compatibility symlinks which can
hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport)
fixes a BUG which was being hit during KHO restoration of vmalloc()
regions
* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
calibrate: update header inclusion
Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
vmcoreinfo: track and log recoverable hardware errors
kho: fix restoring of contiguous ranges of order-0 pages
kho: kho_restore_vmalloc: fix initialization of pages array
MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
init: replace simple_strtoul with kstrtoul to improve lpj_setup
KHO: fix boot failure due to kmemleak access to non-PRESENT pages
Documentation/ABI: new kexec and kdump sysfs interface
Documentation/ABI: mark old kexec sysfs deprecated
kexec: move sysfs entries to /sys/kernel/kexec
test_kho: always print restore status
kho: free chunks using free_page() instead of kfree()
selftests/liveupdate: add kexec test for multiple and empty sessions
selftests/liveupdate: add simple kexec-based selftest for LUO
selftests/liveupdate: add userspace API selftests
docs: add documentation for memfd preservation via LUO
mm: memfd_luo: allow preserving memfd
liveupdate: luo_file: add private argument to store runtime state
mm: shmem: export some functions to internal.h
...
Linus Torvalds [Sat, 6 Dec 2025 21:49:40 +0000 (13:49 -0800)]
Merge tag 'trace-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix accounting of stop_count in file release
On opening the trace file, if "pause-on-trace" option is set, it will
increment the stop_count. On file release, it checks if stop_count is
set, and if so it decrements it. Since this code was originally
written, the stop_count can be incremented by other use cases. This
makes just checking the stop_count not enough to know if it should be
decremented.
Add a new iterator flag called "PAUSE" and have it set if the open
disables tracing and only decrement the stop_count if that flag is
set on close.
- Remove length field in trace_seq_printf() of print_synth_event()
When printing the synthetic event that has a static length array
field, the vsprintf() of the trace_seq_printf() triggered a
"(efault)" in the output. That's because the print_fmt replaced the
"%.*s" with "%s" causing the arguments to be off.
- Fix a bunch of typos
* tag 'trace-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix typo in trace_seq.c
tracing: Fix typo in trace_probe.c
tracing: Fix multiple typos in trace_osnoise.c
tracing: Fix multiple typos in trace_events_user.c
tracing: Fix typo in trace_events_trigger.c
tracing: Fix typo in trace_events_hist.c
tracing: Fix typo in trace_events_filter.c
tracing: Fix multiple typos in trace_events.c
tracing: Fix multiple typos in trace.c
tracing: Fix typo in ring_buffer_benchmark.c
tracing: Fix multiple typos in ring_buffer.c
tracing: Fix typo in fprobe.c
tracing: Fix typo in fpgraph.c
tracing: Fix fixed array of synthetic event
tracing: Fix enabling of tracing on file release
Linus Torvalds [Sat, 6 Dec 2025 20:33:26 +0000 (12:33 -0800)]
Merge tag 'x86-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Miscellaneous documentation fixes"
* tag 'x86-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/Documentation: Prefix hexadecimal literals with 0x
x86/boot/Documentation: Spell 'ID' consistently
x86/platform: Fix and extend kernel-doc comments in <asm/x86_init.h>
Linus Torvalds [Sat, 6 Dec 2025 20:31:21 +0000 (12:31 -0800)]
Merge tag 'sched-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Miscellaneous scheduler fixes/cleanups:
- Fix psi_dequeue() for Proxy Execution
- Fix hrtick() vs. scheduling context bug
- Fix unfairness caused by stalled tg_load_avg_contrib when the last
task migrates out
- Fix whitespace noise in headers
- Remove a preempt-disable section in rt_mutex_setprio()"
* tag 'sched-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix psi_dequeue() for Proxy Execution
sched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task migrates out
sched/rt: Remove a preempt-disable section in rt_mutex_setprio()
sched/hrtick: Fix hrtick() vs. scheduling context
sched/headers: Remove whitespace noise from kernel/sched/sched.h
Linus Torvalds [Sat, 6 Dec 2025 19:56:51 +0000 (11:56 -0800)]
Merge tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
"Address various objtool scalability bugs/inefficiencies exposed by
allmodconfig builds, plus improve the quality of alternatives
instructions generated code and disassembly"
* tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Simplify .annotate_insn code generation output some more
objtool: Add more robust signal error handling, detect and warn about stack overflows
objtool: Remove newlines and tabs from annotation macros
objtool: Consolidate annotation macros
x86/asm: Remove ANNOTATE_DATA_SPECIAL usage
x86/alternative: Remove ANNOTATE_DATA_SPECIAL usage
objtool: Fix stack overflow in validate_branch()
Linus Torvalds [Sat, 6 Dec 2025 19:31:49 +0000 (11:31 -0800)]
Merge tag 'locking-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Two fixes related to recent introduction of scoped_seqlock_read():
- Fix compiler build failures when a particular .config and compiler
build options variant doesn't result in the expected removal of
unused, catch-bugs portions of scoped_seqlock_read() by the inliner
at build time, and cause a linker fail even in correct code
- Match read-locking order in do_task_stat() and do_io_accounting().
The inconsistency here was harmless but unnecessary"
* tag 'locking-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
seqlock: Cure some more scoped_seqlock() optimization fails
seqlock, procfs: Match scoped_seqlock_read() critical section vs. RCU ordering in do_task_stat() to do_io_accounting()
Linus Torvalds [Sat, 6 Dec 2025 19:13:50 +0000 (11:13 -0800)]
iommu/amd: fix SEV-TIO support reporting
Commit eeb934137deb ("iommu/amd: Report SEV-TIO support") was confused
about the config options that expose amd_iommu_sev_tio_supported(), and
made the declaration (and alternative dummy function) conditional on the
CONFIG_AMD_IOMMU config option.
But the code is actually dependent on CONFIG_KVM_AMD_SEV, resulting in
Linus Torvalds [Sat, 6 Dec 2025 18:57:02 +0000 (10:57 -0800)]
Merge tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
- Mike Snitzer's mechanism for disabling I/O caching introduced in
v6.18 is extended to include using direct I/O. The goal is to further
reduce the memory footprint consumed by NFS clients accessing large
data sets via NFSD.
- The NFSD community adopted a maintainer entry profile during this
cycle. See
- Work continues on hardening NFSD's implementation of the pNFS block
layout type. This type enables pNFS clients to directly access the
underlying block devices that contain an exported file system,
reducing server overhead and increasing data throughput.
- The remaining patches are clean-ups and minor optimizations. Many
thanks to the contributors, reviewers, testers, and bug reporters who
participated during the v6.19 NFSD development cycle.
* tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (38 commits)
NFSD: nfsd-io-modes: Separate lists
NFSD: nfsd-io-modes: Wrap shell snippets in literal code blocks
NFSD: Add toctree entry for NFSD IO modes docs
NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst
NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
NFSD: Make FILE_SYNC WRITEs comply with spec
NFSD: Add trace point for SCSI fencing operation.
NFSD: use correct reservation type in nfsd4_scsi_fence_client
xdrgen: Don't generate unnecessary semicolon
xdrgen: Fix union declarations
NFSD: don't start nfsd if sv_permsocks is empty
xdrgen: handle _XdrString in union encoder/decoder
xdrgen: Fix the variable-length opaque field decoder template
xdrgen: Make the xdrgen script location-independent
xdrgen: Generalize/harden pathname construction
lockd: don't allow locking on reexported NFSv2/3
MAINTAINERS: add a nfsd blocklayout reviewer
nfsd: Use MD5 library instead of crypto_shash
nfsd: stop pretending that we cache the SEQUENCE reply.
NFS: nfsd-maintainer-entry-profile: Inline function name prefixes
...
Linus Torvalds [Sat, 6 Dec 2025 18:49:19 +0000 (10:49 -0800)]
Merge tag 'for-linus-6.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"This round it contains only three small cleanup patches"
* tag 'for-linus-6.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
drivers/xen: use min() instead of min_t()
drivers/xen/xenbus: Replace deprecated strcpy in xenbus_transaction_end
drivers/xen/xenbus: Simplify return statement in join()
Linus Torvalds [Sat, 6 Dec 2025 18:15:41 +0000 (10:15 -0800)]
Merge tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull PCIe Link Encryption and Device Authentication from Dan Williams:
"New PCI infrastructure and one architecture implementation for PCIe
link encryption establishment via platform firmware services.
This work is the result of multiple vendors coming to consensus on
some core infrastructure (thanks Alexey, Yilun, and Aneesh!), and
three vendor implementations, although only one is included in this
pull. The PCI core changes have an ack from Bjorn, the crypto/ccp/
changes have an ack from Tom, and the iommu/amd/ changes have an ack
from Joerg.
PCIe link encryption is made possible by the soup of acronyms
mentioned in the shortlog below. Link Integrity and Data Encryption
(IDE) is a protocol for installing keys in the transmitter and
receiver at each end of a link. That protocol is transported over Data
Object Exchange (DOE) mailboxes using PCI configuration requests.
The aspect that makes this a "platform firmware service" is that the
key provisioning and protocol is coordinated through a Trusted
Execution Envrionment (TEE) Security Manager (TSM). That is either
firmware running in a coprocessor (AMD SEV-TIO), or quasi-hypervisor
software (Intel TDX Connect / ARM CCA) running in a protected CPU
mode.
Now, the only reason to ask a TSM to run this protocol and install the
keys rather than have a Linux driver do the same is so that later, a
confidential VM can ask the TSM directly "can you certify this
device?".
That precludes host Linux from provisioning its own keys, because host
Linux is outside the trust domain for the VM. It also turns out that
all architectures, save for one, do not publish a mechanism for an OS
to establish keys in the root port. So "TSM-established link
encryption" is the only cross-architecture path for this capability
for the foreseeable future.
This unblocks the other arch implementations to follow in v6.20/v7.0,
once they clear some other dependencies, and it unblocks the next
phase of work to implement the end-to-end flow of confidential device
assignment. The PCIe specification calls this end-to-end flow Trusted
Execution Environment (TEE) Device Interface Security Protocol
(TDISP).
In the meantime, Linux gets a link encryption facility which has
practical benefits along the same lines as memory encryption. It
authenticates devices via certificates and may protect against
interposer attacks trying to capture clear-text PCIe traffic.
Summary:
- Introduce the PCI/TSM core for the coordination of device
authentication, link encryption and establishment (IDE), and later
management of the device security operational states (TDISP).
Notify the new TSM core layer of PCI device arrival and departure
- Add a low level TSM driver for the link encryption establishment
capabilities of the AMD SEV-TIO architecture
- Add a library of helpers TSM drivers to use for IDE establishment
and the DOE transport
- Add skeleton support for 'bind' and 'guest_request' operations in
support of TDISP"
* tag 'tsm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: (23 commits)
crypto/ccp: Fix CONFIG_PCI=n build
virt: Fix Kconfig warning when selecting TSM without VIRT_DRIVERS
crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)
iommu/amd: Report SEV-TIO support
psp-sev: Assign numbers to all status codes and add new
ccp: Make snp_reclaim_pages and __sev_do_cmd_locked public
PCI/TSM: Add 'dsm' and 'bound' attributes for dependent functions
PCI/TSM: Add pci_tsm_guest_req() for managing TDIs
PCI/TSM: Add pci_tsm_bind() helper for instantiating TDIs
PCI/IDE: Initialize an ID for all IDE streams
PCI/IDE: Add Address Association Register setup for downstream MMIO
resource: Introduce resource_assigned() for discerning active resources
PCI/TSM: Drop stub for pci_tsm_doe_transfer()
drivers/virt: Drop VIRT_DRIVERS build dependency
PCI/TSM: Report active IDE streams
PCI/IDE: Report available IDE streams
PCI/IDE: Add IDE establishment helpers
PCI: Establish document for PCI host bridge sysfs attributes
PCI: Add PCIe Device 3 Extended Capability enumeration
PCI/TSM: Establish Secure Sessions and Link Encryption
...
Linus Torvalds [Sat, 6 Dec 2025 17:55:38 +0000 (09:55 -0800)]
Merge tag 'rproc-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
- Add support for the compute DSP in the Qualcomm SDM660 platform, and
finally fix up the way MSM8974 audio DSP remoteproc driver manages
its power rails
- Replace the usage of of_reserved_mem_lookup() with
of_reserved_mem_region_to_resource() to clean things up across most
of the drivers
- Perform a variety of housekeeping and cleanup work across iMX,
Mediatek, and TI remoteproc drivers
* tag 'rproc-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (45 commits)
remoteproc: qcom_q6v5_wcss: use optional reset for wcss_q6_bcr_reset
remoteproc: qcom_q6v5_wcss: fix parsing of qcom,halt-regs
remoteproc: qcom_wcnss: Fix NULL vs IS_ERR() bug in wcnss_alloc_memory_region()
remoteproc: qcom: q6v5: Fix NULL vs IS_ERR() bug in q6v5_alloc_memory_region()
remoteproc: qcom: pas: Fix a couple NULL vs IS_ERR() bugs
remoteproc: qcom_q6v5_adsp: Fix a NULL vs IS_ERR() check in adsp_alloc_memory_region()
remoteproc: imx_dsp_rproc: Fix NULL vs IS_ERR() bug in imx_dsp_rproc_add_carveout()
remoteproc: st: Fix indexing of memory-regions
remoteproc: qcom: pas: Add support for SDM660 CDSP
dt-bindings: remoteproc: qcom: adsp: Add SDM660 CDSP compatible
dt-bindings: remoteproc: qcom: adsp: Add missing constrains for SDM660 ADSP
dt-bindings: remoteproc: qcom,sc8280xp-pas: Fix CDSP power desc
remoteproc: omap: Remove redundant pm_runtime_mark_last_busy() calls
remoteproc: qcom: Use of_reserved_mem_region_* functions for "memory-region"
remoteproc: qcom_q6v5_pas: Use resource with CX PD for MSM8974
dt-bindings: remoteproc: qcom,adsp: Make msm8974 use CX as power domain
remoteproc: Use of_reserved_mem_region_* functions for "memory-region"
remoteproc: imx_dsp_rproc: Simplify start/stop error handling
remoteproc: imx_rproc: Remove enum imx_rproc_method
remoteproc: imx_dsp_rproc: Simplify IMX_RPROC_RESET_CONTROLLER switch case
...
Linus Torvalds [Sat, 6 Dec 2025 17:52:41 +0000 (09:52 -0800)]
Merge tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This mainly fixes handling of disconnected directories and adds new
tests"
* tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Add disconnected leafs and branch test suites
selftests/landlock: Add tests for access through disconnected paths
landlock: Improve variable scope
landlock: Fix handling of disconnected directories
selftests/landlock: Fix makefile header list
landlock: Make docs in cred.h and domain.h visible
landlock: Minor comments improvements
Linus Torvalds [Sat, 6 Dec 2025 17:32:25 +0000 (09:32 -0800)]
Merge tag 'libnvdimm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm updates from Ira Weiny:
"These are mainly bug fixes and code updates.
There is a new feature to divide up memmap= carve outs and a fix
caught in linux-next for that patch. Managing memmap memory on the fly
for multiple VM's was proving difficult and Mike provided a driver
which allows for the memory to be better manged.
Summary:
- Allow exposing RAM carveouts as NVDIMM DIMM devices
- Prevent integer overflow in ramdax_get_config_data()
- Replace use of system_wq with system_percpu_wq
- Documentation: btt: Unwrap bit 31-30 nested table
- tools/testing/nvdimm: Use per-DIMM device handle"
* tag 'libnvdimm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm: Prevent integer overflow in ramdax_get_config_data()
Documentation: btt: Unwrap bit 31-30 nested table
nvdimm: replace use of system_wq with system_percpu_wq
tools/testing/nvdimm: Use per-DIMM device handle
nvdimm: allow exposing RAM carveouts as NVDIMM DIMM devices
Linus Torvalds [Sat, 6 Dec 2025 17:25:05 +0000 (09:25 -0800)]
Merge tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping updates from Marek Szyprowski:
- More DMA mapping API refactoring to physical addresses as the primary
interface instead of page+offset parameters.
This time dma_map_ops callbacks are converted to physical addresses,
what in turn results also in some simplification of architecture
specific code (Leon Romanovsky and Jason Gunthorpe)
- Clarify that dma_map_benchmark is not a kernel self-test, but
standalone tool (Qinxin Xia)
* tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: remove unused map_page callback
xen: swiotlb: Convert mapping routine to rely on physical address
x86: Use physical address for DMA mapping
sparc: Use physical address DMA mapping
powerpc: Convert to physical address DMA mapping
parisc: Convert DMA map_page to map_phys interface
MIPS/jazzdma: Provide physical address directly
alpha: Convert mapping routine to rely on physical address
dma-mapping: remove unused mapping resource callbacks
xen: swiotlb: Switch to physical address mapping callbacks
ARM: dma-mapping: Switch to physical address mapping callbacks
ARM: dma-mapping: Reduce struct page exposure in arch_sync_dma*()
dma-mapping: convert dummy ops to physical address mapping
dma-mapping: prepare dma_map_ops to conversion to physical address
tools/dma: move dma_map_benchmark from selftests to tools/dma
Linus Torvalds [Sat, 6 Dec 2025 17:01:27 +0000 (09:01 -0800)]
Merge tag 'bitmap-for-6.19' of github.com:/norov/linux
Pull bitmap updates from Yury Norov:
- Runtime field_{get,prep}() (Geert)
- Rust ID pool updates (Alice)
- min_t() simplification (David)
- __sw_hweightN kernel-doc fixes (Andy)
- cpumask.h headers cleanup (Andy)
* tag 'bitmap-for-6.19' of github.com:/norov/linux: (32 commits)
rust_binder: use bitmap for allocation of handles
rust: id_pool: do not immediately acquire new ids
rust: id_pool: do not supply starting capacity
rust: id_pool: rename IdPool::new() to with_capacity()
rust: bitmap: add BitmapVec::new_inline()
rust: bitmap: add MAX_LEN and MAX_INLINE_LEN constants
cpumask: Don't use "proxy" headers
soc: renesas: Use bitfield helpers
clk: renesas: Use bitfield helpers
ALSA: usb-audio: Convert to common field_{get,prep}() helpers
soc: renesas: rz-sysc: Convert to common field_get() helper
pinctrl: ma35: Convert to common field_{get,prep}() helpers
iio: mlx90614: Convert to common field_{get,prep}() helpers
iio: dac: Convert to common field_prep() helper
gpio: aspeed: Convert to common field_{get,prep}() helpers
EDAC/ie31200: Convert to common field_get() helper
crypto: qat - convert to common field_get() helper
clk: at91: Convert to common field_{get,prep}() helpers
bitfield: Add non-constant field_{prep,get}() helpers
bitfield: Add less-checking __FIELD_{GET,PREP}()
...
Miguel Ojeda [Thu, 4 Dec 2025 14:50:35 +0000 (15:50 +0100)]
rust: sync: atomic: separate import "blocks"
Commit 14e9a18b07ec ("rust: sync: atomic: Make Atomic*Ops pub(crate)")
added a `pub(crate)` import in the same "block" as the `pub` one,
without running `rustfmt`, which would sort them differently.
Instead of running `rustfmt` as-is, add a newline to keep the import
"blocks" with different visibilities separate.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 6 Dec 2025 16:27:07 +0000 (08:27 -0800)]
Merge tag 'modules-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull module updates from Daniel Gomez:
"Rust module parameter support:
- Add Rust module parameter support, enabling Rust kernel modules to
declare and use module parameters. The rust_minimal sample module
demonstrates this, and the rust null block driver will be the first
to use it in the next cycle. This also adds the Rust module files
under the modules subsystem as agreed between the Rust and modules
maintainers.
Hardening:
- Add compile-time check for embedded NUL characters in MODULE_*()
macros. This module metadata was once used (and maybe still) to
bypass license enforcement (LWN article from 2003):
https://lwn.net/Articles/82305/ [1]
MAINTAINERS:
- Add Aaron Tomlin as reviewer for the Modules subsystem"
* tag 'modules-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
MAINTAINERS: Add myself as reviewer for module support
module: Add compile-time check for embedded NUL characters
media: radio: si470x: Fix DRIVER_AUTHOR macro definition
media: dvb-usb-v2: lmedm04: Fix firmware macro definitions
modules: add rust modules files to MAINTAINERS
rust: samples: add a module parameter to the rust_minimal sample
rust: module: update the module macro with module parameter support
rust: module: use a reference in macros::module::module
rust: introduce module_param module
rust: str: add radix prefixed integer parsing functions
rust: sync: add `SetOnce`
John Stultz [Fri, 5 Dec 2025 01:27:09 +0000 (01:27 +0000)]
sched/core: Fix psi_dequeue() for Proxy Execution
Currently, if the sleep flag is set, psi_dequeue() doesn't
change any of the psi_flags.
This is because psi_task_switch() will clear TSK_ONCPU as well
as other potential flags (TSK_RUNNING), and the assumption is
that a voluntary sleep always consists of a task being dequeued
followed shortly there after with a psi_sched_switch() call.
Proxy Execution changes this expectation, as mutex-blocked tasks
that would normally sleep stay on the runqueue. But in the case
where the mutex-owning task goes to sleep, or the owner is on a
remote cpu, we will then deactivate the blocked task shortly
after.
In that situation, the mutex-blocked task will have had its
TSK_ONCPU cleared when it was switched off the cpu, but it will
stay TSK_RUNNING. Then if we later dequeue it (as currently done
if we hit a case find_proxy_task() can't yet handle, such as the
case of the owner being on another rq or a sleeping owner)
psi_dequeue() won't change any state (leaving it TSK_RUNNING),
as it incorrectly expects a psi_task_switch() call to
immediately follow.
Later on when the task get woken/re-enqueued, and psi_flags are
set for TSK_RUNNING, we hit an error as the task is already
TSK_RUNNING:
To resolve this, extend the logic in psi_dequeue() so that
if the sleep flag is set, we also check if psi_flags have
TSK_ONCPU set (meaning the psi_task_switch is imminent) before
we do the shortcut return.
If TSK_ONCPU is not set, that means we've already switched away,
and this psi_dequeue call needs to clear the flags.
Fixes: be41bde4c3a8 ("sched: Add an initial sketch of the find_proxy_task() function") Reported-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Haiyue Wang <haiyuewa@163.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://patch.msgid.link/20251205012721.756394-1-jstultz@google.com Closes: https://lore.kernel.org/lkml/20251117185550.365156-1-kprateek.nayak@amd.com/
xupengbo [Wed, 27 Aug 2025 02:22:07 +0000 (10:22 +0800)]
sched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task migrates out
When a task is migrated out, there is a probability that the tg->load_avg
value will become abnormal. The reason is as follows:
1. Due to the 1ms update period limitation in update_tg_load_avg(), there
is a possibility that the reduced load_avg is not updated to tg->load_avg
when a task migrates out.
2. Even though __update_blocked_fair() traverses the leaf_cfs_rq_list and
calls update_tg_load_avg() for cfs_rqs that are not fully decayed, the key
function cfs_rq_is_decayed() does not check whether
cfs->tg_load_avg_contrib is null. Consequently, in some cases,
__update_blocked_fair() removes cfs_rqs whose avg.load_avg has not been
updated to tg->load_avg.
Add a check of cfs_rq->tg_load_avg_contrib in cfs_rq_is_decayed(),
which fixes the case (2.) mentioned above.
Fixes: 1528c661c24b ("sched/fair: Ratelimit update to tg->load_avg") Signed-off-by: xupengbo <xupengbo@oppo.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Aaron Lu <ziqianlu@bytedance.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Aaron Lu <ziqianlu@bytedance.com> Link: https://patch.msgid.link/20250827022208.14487-1-xupengbo@oppo.com
sched/rt: Remove a preempt-disable section in rt_mutex_setprio()
rt_mutex_setprio() has only one caller: rt_mutex_adjust_prio(). It
expects that task_struct::pi_lock and rt_mutex_base::wait_lock are held.
Both locks are raw_spinlock_t and are acquired with disabled interrupts.
Nevertheless rt_mutex_setprio() disables preemption while invoking
__balance_callbacks() and raw_spin_rq_unlock(). Even if one of the
balance callbacks unlocks the rq then it must not enable interrupts
because rt_mutex_base::wait_lock is still locked.
Therefore interrupts should remain disabled and disabling preemption is
not needed.
Commit 4c9a4bc89a9cc ("sched: Allow balance callbacks for check_class_changed()")
adds a preempt-disable section to rt_mutex_setprio() and
__sched_setscheduler(). In __sched_setscheduler() the preemption is
disabled before rq is unlocked and interrupts enabled but I don't see
why it makes a difference in rt_mutex_setprio().
Remove the preempt_disable() section from rt_mutex_setprio().
Peter Zijlstra [Thu, 4 Dec 2025 10:43:32 +0000 (11:43 +0100)]
seqlock: Cure some more scoped_seqlock() optimization fails
Arnd reported an x86 randconfig using gcc-15 tripped over
__scoped_seqlock_bug(). Turns out GCC chose not to inline the
scoped_seqlock helper functions and as such was not able to optimize
properly.
[ mingo: Clang fails the build too in some circumstances. ]
Linus Torvalds [Sat, 6 Dec 2025 05:29:02 +0000 (21:29 -0800)]
Merge tag 'driver-core-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"Arch Topology:
- Move parse_acpi_topology() from arm64 to common code for reuse in
RISC-V
CPU:
- Expose housekeeping CPUs through /sys/devices/system/cpu/housekeeping
- Print a newline (or 0x0A) instead of '(null)' reading
/sys/devices/system/cpu/nohz_full when nohz_full= is not set
debugfs
- Remove (broken) 'no-mount' mode
- Remove redundant access mode checks in debugfs_get_tree() and
debugfs_create_*() functions
Devres:
- Remove unused devm_free_percpu() helper
- Move devm_alloc_percpu() from device.h to devres.h
Firmware Loader:
- Replace simple_strtol() with kstrtoint()
- Do not call cancel_store() when no upload is in progress
kernfs:
- Increase struct super_block::maxbytes to MAX_LFS_FILESIZE
- Fix a missing unwind path in __kernfs_new_node()
Misc:
- Increase the name size in struct auxiliary_device_id to 40
characters
- Replace system_unbound_wq with system_dfl_wq and add WQ_PERCPU to
alloc_workqueue()
Platform:
- Replace ERR_PTR() with IOMEM_ERR_PTR() in platform ioremap
functions
Rust:
- Auxiliary:
- Unregister auxiliary device on parent device unbind
- Move parent() to impl Device; implement device context aware
parent() for Device<Bound>
- Illustrate how to safely obtain a driver's device private data
when calling from an auxiliary driver into the parant device
driver
- DebugFs:
- Implement support for binary large objects
- Device:
- Let probe() return the driver's device private data as pinned
initializer, i.e. impl PinInit<Self, Error>
- Implement safe accessor for a driver's device private data for
Device<Bound> (returned reference can't out-live driver binding
and guarantees the correct private data type)
- Implement AsBusDevice trait, to be used by class device
abstractions to derive the bus device type of the parent device
- DMA:
- Store raw pointer of allocation as NonNull
- Use start_ptr() and start_ptr_mut() to inherit correct
mutability of self
- FS:
- Add file::Offset type alias
- I2C:
- Add abstractions for I2C device / driver infrastructure
- Implement abstractions for manual I2C device registrations
- I/O:
- Use "kernel vertical" style for imports
- Define ResourceSize as resource_size_t
- Move ResourceSize to top-level I/O module
- Add type alias for phys_addr_t
- Implement Rust version of read_poll_timeout_atomic()
- PCI:
- Use "kernel vertical" style for imports
- Move I/O and IRQ infrastructure to separate files
- Add support for PCI interrupt vectors
- Implement TryInto<IrqRequest<'a>> for IrqVector<'a> to convert
an IrqVector bound to specific pci::Device into an IrqRequest
bound to the same pci::Device's parent Device
- Leverage pin_init_scope() to get rid of redundant Result in IRQ
methods
- PinInit:
- Add {pin_}init_scope() to execute code before creating an
initializer
- Platform:
- Leverage pin_init_scope() to get rid of redundant Result in IRQ
methods
- Timekeeping:
- Implement abstraction of udelay()
- Uaccess:
- Implement read_slice_partial() and read_slice_file() for
UserSliceReader
- Implement write_slice_partial() and write_slice_file() for
UserSliceWriter
sysfs:
- Prepare the constification of struct attribute"
* tag 'driver-core-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (75 commits)
rust: pci: fix build failure when CONFIG_PCI_MSI is disabled
debugfs: Fix default access mode config check
debugfs: Remove broken no-mount mode
debugfs: Remove redundant access mode checks
driver core: Check drivers_autoprobe for all added devices
driver core: WQ_PERCPU added to alloc_workqueue users
driver core: replace use of system_unbound_wq with system_dfl_wq
tick/nohz: Expose housekeeping CPUs in sysfs
tick/nohz: avoid showing '(null)' if nohz_full= not set
sysfs/cpu: Use DEVICE_ATTR_RO for nohz_full attribute
kernfs: fix memory leak of kernfs_iattrs in __kernfs_new_node
fs/kernfs: raise sb->maxbytes to MAX_LFS_FILESIZE
mod_devicetable: Bump auxiliary_device_id name size
sysfs: simplify attribute definition macros
samples/kobject: constify 'struct foo_attribute'
samples/kobject: add is_visible() callback to attribute group
sysfs: attribute_group: enable const variants of is_visible()
sysfs: introduce __SYSFS_FUNCTION_ALTERNATIVE()
sysfs: transparently handle const pointers in ATTRIBUTE_GROUPS()
sysfs: attribute_group: allow registration of const attribute
...
Linus Torvalds [Sat, 6 Dec 2025 04:49:24 +0000 (20:49 -0800)]
Merge tag 'for-linus-6.19-1' of https://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"Minor IPMI fixes:
- Some device tree cleanups and a maintainer add
- Fix a race when handling channel updates that could result in
errors being reported to the user in some cases"
* tag 'for-linus-6.19-1' of https://github.com/cminyard/linux-ipmi:
MAINTAINERS: Add entry on Loongson-2K IPMI driver
dt-bindings: ipmi: Convert aspeed,ast2400-ibt-bmc to DT schema
dt-bindings: ipmi: Convert nuvoton,npcm750-kcs-bmc to DT schema
ipmi: Skip channel scan if channels are already marked ready
ipmi: Fix __scan_channels() failing to rescan channels
ipmi: Fix the race between __scan_channels() and deliver_response()
Linus Torvalds [Sat, 6 Dec 2025 04:41:20 +0000 (20:41 -0800)]
Merge tag 'ata-6.19-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- The DELLBOSS VD SATA controller times out when sending I/Os of size
4096 KiB or larger, even though it claims LBA48 support, which per
the ACS standard requires support for a maximum command size of
65535 sectors, i.e. 32 MiB - 512. Thus, quirk the device so that it
sets a lower maximum command size (me)
* tag 'ata-6.19-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: Quirk DELLBOSS VD max_sectors
ata: libata: Move quirk flags to their own enum
Linus Torvalds [Sat, 6 Dec 2025 04:36:28 +0000 (20:36 -0800)]
Merge tag 'tpmdd-sessions-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull more tpm updates from Jarkko Sakkinen:
"This is targeted for tpm2-sessions updates.
There's two bug fixes and two more cosmetic tweaks for HMAC protected
sessions. They provide a baseine for further improvements to be
implemented during the the course of the release cycle"
* tag 'tpmdd-sessions-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm2-sessions: Open code tpm_buf_append_hmac_session()
tpm2-sessions: Remove 'attributes' parameter from tpm_buf_append_auth
tpm2-sessions: Fix tpm2_read_public range checks
tpm2-sessions: Fix out of range indexing in name_size
Linus Torvalds [Sat, 6 Dec 2025 03:56:50 +0000 (19:56 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
cleanups and fixes including the WQ_PERCPU series.
The biggest core change is the new allocation of pseudo-devices which
allow the sending of internal commands to a given SCSI target"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits)
scsi: MAINTAINERS: Add the UFS include directory
scsi: scsi_debug: Support injecting unaligned write errors
scsi: qla2xxx: Fix improper freeing of purex item
scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB
scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify
scsi: ufs: core: Use scsi_device_busy()
scsi: ufs: core: Fix single doorbell mode support
scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users
scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users()
scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq
...
Mike Snitzer [Wed, 26 Nov 2025 06:01:25 +0000 (01:01 -0500)]
nfs/localio: fix regression due to out-of-order __put_cred
Commit f2060bdc21d7 ("nfs/localio: add refcounting for each iocb IO
associated with NFS pgio header") inadvertantly reintroduced the same
potential for __put_cred() triggering BUG_ON(cred == current->cred) that
commit 992203a1fba5 ("nfs/localio: restore creds before releasing pageio
data") fixed.
Fix this by saving and restoring the cred around each {read,write}_iter
call within the respective for loop of nfs_local_call_{read,write} using
scoped_with_creds().
NOTE: this fix started by first reverting the following commits:
94afb627dfc2 ("nfs: use credential guards in nfs_local_call_read()") bff3c841f7bd ("nfs: use credential guards in nfs_local_call_write()") 1d18101a644e ("Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
followed by narrowly fixing the cred lifetime issue by using
scoped_with_creds(). In doing so, this commit's changes appear more
extensive than they really are (as evidenced by comparing to v6.18's
fs/nfs/localio.c).
Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/linux-next/20251205111942.4150b06f@canb.auug.org.au/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 6 Dec 2025 01:47:59 +0000 (17:47 -0800)]
Merge tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC driver updates from Arnd Bergmann:
"These updates came a little late, or were based on a later 6.18-rc tag
than the others:
- A new driver for cache management on cxl devices with memory shared
in a coherent cluster. This is part of the drivers/cache/ tree, but
unlike the other drivers that back the dma-mapping interfaces, this
one is needed only during CPU hotplug.
- A shared branch for reset controllers using swnode infrastructure
- Added support for new SoC variants in the Amlogic soc_device
identification
- Minor updates in Freescale, Microchip, Samsung, and Apple SoC
drivers"
* tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
soc: samsung: exynos-pmu: fix device leak on regmap lookup
soc: samsung: exynos-pmu: Fix structure initialization
soc: fsl: qbman: use kmalloc_array() instead of kmalloc()
soc: fsl: qbman: add WQ_PERCPU to alloc_workqueue users
MAINTAINERS: Update email address for Christophe Leroy
MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
cache: Make top level Kconfig menu a boolean dependent on RISCV
MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
soc: amlogic: meson-gx-socinfo: add new SoCs id
dt-bindings: arm: amlogic: meson-gx-ao-secure: support more SoCs
memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
dt-bindings: cache: sifive,ccache0: add a pic64gx compatible
MAINTAINERS: rename Microchip RISC-V entry
MAINTAINERS: add new soc drivers to Microchip RISC-V entry
soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC
dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC
...
Linus Torvalds [Sat, 6 Dec 2025 01:27:12 +0000 (17:27 -0800)]
Merge tag 'soc-newsoc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull new SoC families update from Arnd Bergmann:
"These three new families of SoC are split out into a separate branch
because they touch multiple parts of the source tree and are better
left separate for the initial merge.
- Black Sesame Technologies C1200 is an automotive SoC using
Cortex-A78 CPU cores
- Anlogic dr1v90 (not to be confused with Amlogic) is an FPGA
platform using a single nuclei ux900 RISC-V core
- Tenstorrent Blackhole is a Neural Processing Unit using custom
"Tensix" cores for computation offload managed by Linux running on
SiFive X280 RISC-V cores.
Support for all three is rather rudimentary at the moment and will get
improved as device drivers are merged through other tree"
* tag 'soc-newsoc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
MAINTAINERS: add Black Sesame Technologies (BST) ARM SoC support
arm64: defconfig: enable BST platform support
arm64: dts: bst: add support for Black Sesame Technologies C1200 CDCU1.0 board
arm64: Kconfig: add ARCH_BST for Black Sesame Technologies SoCs
dt-bindings: arm: add Black Sesame Technologies (bst) SoC
dt-bindings: vendor-prefixes: Add Black Sesame Technologies Co., Ltd.
MAINTAINERS: Setup support for Anlogic tree
riscv: defconfig: Enable Anlogic SoC
riscv: dts: anlogic: Add Milianke MLKPAI FS01 board
riscv: dts: Add initial Anlogic DR1V90 SoC device tree
riscv: Add Anlogic SoC famly Kconfig support
dt-bindings: serial: snps-dw-apb-uart: Add Anlogic DR1V90 uart
dt-bindings: timer: Add Anlogic DR1V90 ACLINT MTIMER
dt-bindings: riscv: Add Anlogic DR1V90
dt-bindings: riscv: Add Nuclei UX900 compatibles
dt-bindings: vendor-prefixes: Add Anlogic, Milianke and Nuclei
riscv: defconfig: Enable Tenstorrent SoCs
riscv: Kconfig.socs: Add ARCH_TENSTORRENT for Tenstorrent SoCs
riscv: dts: Add Tenstorrent Blackhole SoC PCIe cards
dt-bindings: interrupt-controller: Add Tenstorrent Blackhole compatible
...
Linus Torvalds [Sat, 6 Dec 2025 01:24:29 +0000 (17:24 -0800)]
Merge tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
"Three new SoCs got added in existing arm64 chip families:
- Renesas R-Car X5H (R8A78000) is a new generation of automotive
SoCs, based on 16 Cortex-A720 (Armv9.2) cores, which makes the the
currently highest-perforance embedded SoC.
- TI AM62L is a new variant of the AM62 family of industrial SoCs,
this one comes without a GPU.
- Qualcomm MSM8937 (Snapdragon 430) is an older mobile phone chip
based on Cortex-A53, and closely related to MSM8917 (Snapdragn
425), which we already support.
In addition, there are a good number of newly supported machines
across SoC families:
- Two Aspeed AST2600 (Cortex-A7) based BMC setups for large servers
- Mobile Phones and tables based on Mediatek MT6582, Nvidia Tegra124,
Qualcomm MSM8937 and Qualcomm MSM8939,
- Two Laptops based on Qualcomm SoCs: one using the older sdm850, the
other using x1p42100.
- One Router based on Rockchips RK3568
- 24 variants of the Enclustra Mercury system-on-module, all based on
32-bit Intel/Altera SocFPGA chips, plus two boards using 64-bit
SocFPGA Agilex chips..
- 30 industrial/embedded boards and single-board computers, using
various chips from NXP, Rockchips, Mediatek, TI, Amlogic, Qualcomm,
Spacemit, and Starfive.
In total there are 783 commits here, the majority of these improving
hardware support and cleaning up devicetree files across the tree,
with the majority of the changes going into the Qualcomm, NXP, Renesas
and Rockchips platforms"
* tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (782 commits)
arm64: dts: mediatek: mt8195: Fix address range for JPEG decoder core 1
ARM: dts: samsung: exynos4412-midas: turn off SDIO WLAN chip during system suspend
ARM: dts: samsung: exynos4210-trats: turn off SDIO WLAN chip during system suspend
ARM: dts: samsung: exynos4210-i9100: turn off SDIO WLAN chip during system suspend
ARM: dts: samsung: universal_c210: turn off SDIO WLAN chip during system suspend
arm64: dts: amlogic: meson-g12b: Fix L2 cache reference for S922X CPUs
arm64: dts: Add gpio_intc node for Amlogic S7D SoCs
arm64: dts: Add gpio_intc node for Amlogic S7 SoCs
arm64: dts: Add gpio_intc node for Amlogic S6 SoCs
arm64: dts: amlogic: s7d: add ao secure node
arm64: dts: amlogic: s7: add ao secure node
arm64: dts: amlogic: s6: add ao secure node
arm64: dts: amlogic: Fix the register name of the 'DBI' region
dts: arm64: amlogic: add a5 pinctrl node
arm64: dts: amlogic: s7d: add power domain controller node
arm64: dts: amlogic: s7: add power domain controller node
arm64: dts: amlogic: s6: add power domain controller node
dts: arm64: amlogic: Add ISP related nodes for C3
arm64: dts: meson: add initial device-tree for Tanix TX9 Pro
dt-bindings: arm: amlogic: add support for Tanix TX9 Pro
...
Linus Torvalds [Sat, 6 Dec 2025 01:23:12 +0000 (17:23 -0800)]
Merge tag 'soc-arm-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC ARM code updates from Arnd Bergmann:
"These are very minimal changes for 32-bit Arm platform code, enabling
SMP bringup for one more SoC variant (mt6582) among spelling changes
and a build warning fix"
* tag 'soc-arm-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: omap1: avoid symbol clashes in fiq handler
ARM: gemini: fix typos in comments
ARM: versatile: Fix typo in versatile.c
ARM: OMAP2+: Fix falg->flag typo in omap_smc2()
ARM: mediatek: add MT6582 smp bring up code
ARM: mediatek: add board_dt_compat entry for the MT6582 SoC
Linus Torvalds [Sat, 6 Dec 2025 01:22:09 +0000 (17:22 -0800)]
Merge tag 'soc-defconfig-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
"As usual, a number of newly added drivers get enabled in the arm64
defconfig, in addition to minor housekeeping work on defconfig files
for arm32, arm64 and riscv"
* tag 'soc-defconfig-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
arm64: defconfig: enable Exynos ACPM clocks
arm64: defconfig: Remove the redundant SCHED_MC/SCHED_SMT
ARM: multi_v7_defconfig: Enable TI PRU Ethernet driver
arm64: defconfig: enable i.MX AIPSTZ driver
ARM: mxs_defconfig: enable sound drivers for imx28-amarula-rmm
arm64: defconfig: Enable i.MX95 drivers for pinctrl, Ethernet and PCIe
arm64: defconfig: enable rockchip camera interface
ARM: tegra: Enable EXT4 for Tegra
arm64: defconfig: Enable NVIDIA VRS PSEQ RTC
arm64: defconfig: Enable SX150x GPIO expander driver
riscv: defconfig: enable SPI_FSL_QUADSPI as a module
ARM: at91: at91_dt_defconfig: set MMC_SPI to module
arm64: defconfig: Build NSS clock controller driver for IPQ5424
arm64: defconfig: Enable SCSI UFS Crypto and Block Inline encryption drivers
arm64: defconfig: Add M31 eUSB2 PHY config
arm64: defconfig: Enable configs for Fairphone 3, 4, 5 smartphones
arm64: defconfig: Enable two Novatek display panels for MTP8750 and Tianma
arm64: defconfig: Enable RZ/T2H / RZ/N2H ADC driver
ARM: shmobile: defconfig: Refresh for v6.18-rc1
arm64: defconfig: Enable DW HDMI QP CEC support
...
Linus Torvalds [Sat, 6 Dec 2025 01:01:20 +0000 (17:01 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- Support for userspace handling of synchronous external aborts
(SEAs), allowing the VMM to potentially handle the abort in a
non-fatal manner
- Large rework of the VGIC's list register handling with the goal of
supporting more active/pending IRQs than available list registers
in hardware. In addition, the VGIC now supports EOImode==1 style
deactivations for IRQs which may occur on a separate vCPU than the
one that acked the IRQ
- Support for FEAT_XNX (user / privileged execute permissions) and
FEAT_HAF (hardware update to the Access Flag) in the software page
table walkers and shadow MMU
- Allow page table destruction to reschedule, fixing long
need_resched latencies observed when destroying a large VM
- Minor fixes to KVM and selftests
Loongarch:
- Get VM PMU capability from HW GCFG register
- Add AVEC basic support
- Use 64-bit register definition for EIOINTC
- Add KVM timer test cases for tools/selftests
RISC/V:
- SBI message passing (MPXY) support for KVM guest
- Give a new, more specific error subcode for the case when in-kernel
AIA virtualization fails to allocate IMSIC VS-file
- Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
in small chunks
- Fix guest page fault within HLV* instructions
- Flush VS-stage TLB after VCPU migration for Andes cores
s390:
- Always allocate ESCA (Extended System Control Area), instead of
starting with the basic SCA and converting to ESCA with the
addition of the 65th vCPU. The price is increased number of exits
(and worse performance) on z10 and earlier processor; ESCA was
introduced by z114/z196 in 2010
- VIRT_XFER_TO_GUEST_WORK support
- Operation exception forwarding support
- Cleanups
x86:
- Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
SPTE caching is disabled, as there can't be any relevant SPTEs to
zap
- Relocate a misplaced export
- Fix an async #PF bug where KVM would clear the completion queue
when the guest transitioned in and out of paging mode, e.g. when
handling an SMI and then returning to paged mode via RSM
- Leave KVM's user-return notifier registered even when disabling
virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
keeping the notifier registered is ok; the kernel does not use the
MSRs and the callback will run cleanly and restore host MSRs if the
CPU manages to return to userspace before the system goes down
- Use the checked version of {get,put}_user()
- Fix a long-lurking bug where KVM's lack of catch-up logic for
periodic APIC timers can result in a hard lockup in the host
- Revert the periodic kvmclock sync logic now that KVM doesn't use a
clocksource that's subject to NTP corrections
- Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
latter behind CONFIG_CPU_MITIGATIONS
- Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
path; the only reason they were handled in the fast path was to
paper of a bug in the core #MC code, and that has long since been
fixed
- Add emulator support for AVX MOV instructions, to play nice with
emulated devices whose guest drivers like to access PCI BARs with
large multi-byte instructions
x86 (AMD):
- Fix a few missing "VMCB dirty" bugs
- Fix the worst of KVM's lack of EFER.LMSLE emulation
- Add AVIC support for addressing 4k vCPUs in x2AVIC mode
- Fix incorrect handling of selective CR0 writes when checking
intercepts during emulation of L2 instructions
- Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32]
on VMRUN and #VMEXIT
- Fix a bug where KVM corrupt the guest code stream when re-injecting
a soft interrupt if the guest patched the underlying code after the
VM-Exit, e.g. when Linux patches code with a temporary INT3
- Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
to userspace, and extend KVM "support" to all policy bits that
don't require any actual support from KVM
x86 (Intel):
- Use the root role from kvm_mmu_page to construct EPTPs instead of
the current vCPU state, partly as worthwhile cleanup, but mostly to
pave the way for tracking per-root TLB flushes, and elide EPT
flushes on pCPU migration if the root is clean from a previous
flush
- Add a few missing nested consistency checks
- Rip out support for doing "early" consistency checks via hardware
as the functionality hasn't been used in years and is no longer
useful in general; replace it with an off-by-default module param
to WARN if hardware fails a check that KVM does not perform
- Fix a currently-benign bug where KVM would drop the guest's
SPEC_CTRL[63:32] on VM-Enter
- Misc cleanups
- Overhaul the TDX code to address systemic races where KVM (acting
on behalf of userspace) could inadvertantly trigger lock contention
in the TDX-Module; KVM was either working around these in weird,
ugly ways, or was simply oblivious to them (though even Yan's
devilish selftests could only break individual VMs, not the host
kernel)
- Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
TDX vCPU, if creating said vCPU failed partway through
- Fix a few sparse warnings (bad annotation, 0 != NULL)
- Use struct_size() to simplify copying TDX capabilities to userspace
- Fix a bug where TDX would effectively corrupt user-return MSR
values if the TDX Module rejects VP.ENTER and thus doesn't clobber
host MSRs as expected
Selftests:
- Fix a math goof in mmu_stress_test when running on a single-CPU
system/VM
- Forcefully override ARCH from x86_64 to x86 to play nice with
specifying ARCH=x86_64 on the command line
- Extend a bunch of nested VMX to validate nested SVM as well
- Add support for LA57 in the core VM_MODE_xxx macro, and add a test
to verify KVM can save/restore nested VMX state when L1 is using
5-level paging, but L2 is not
- Clean up the guest paging code in anticipation of sharing the core
logic for nested EPT and nested NPT
guest_memfd:
- Add NUMA mempolicy support for guest_memfd, and clean up a variety
of rough edges in guest_memfd along the way
- Define a CLASS to automatically handle get+put when grabbing a
guest_memfd from a memslot to make it harder to leak references
- Enhance KVM selftests to make it easer to develop and debug
selftests like those added for guest_memfd NUMA support, e.g. where
test and/or KVM bugs often result in hard-to-debug SIGBUS errors
- Misc cleanups
Generic:
- Use the recently-added WQ_PERCPU when creating the per-CPU
workqueue for irqfd cleanup
- Fix a goof in the dirty ring documentation
- Fix choice of target for directed yield across different calls to
kvm_vcpu_on_spin(); the function was always starting from the first
vCPU instead of continuing the round-robin search"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
KVM: arm64: selftests: Add test for AT emulation
KVM: arm64: nv: Expose hardware access flag management to NV guests
KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
KVM: arm64: Implement HW access flag management in stage-1 SW PTW
KVM: arm64: Propagate PTW errors up to AT emulation
KVM: arm64: Add helper for swapping guest descriptor
KVM: arm64: nv: Use pgtable definitions in stage-2 walk
KVM: arm64: Handle endianness in read helper for emulated PTW
KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
KVM: arm64: Call helper for reading descriptors directly
KVM: arm64: nv: Advertise support for FEAT_XNX
KVM: arm64: Teach ptdump about FEAT_XNX permissions
KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
...
Linus Torvalds [Sat, 6 Dec 2025 00:30:56 +0000 (16:30 -0800)]
Merge tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Johannes Berg:
"Apart from the usual small churn, we have
- initial SMP support (only kernel)
- major vDSO cleanups (and fixes for 32-bit)"
* tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (33 commits)
um: Disable KASAN_INLINE when STATIC_LINK is selected
um: Don't rename vmap to kernel_vmap
um: drivers: virtio: use string choices helper
um: Always set up AT_HWCAP and AT_PLATFORM
x86/um: Remove FIXADDR_USER_START and FIXADDR_USE_END
um: Remove __access_ok_vsyscall()
um: Remove redundant range check from __access_ok_vsyscall()
um: Remove fixaddr_user_init()
x86/um: Drop gate area handling
x86/um: Do not inherit vDSO from host
um: Split out default elf_aux_hwcap
x86/um: Move ELF_PLATFORM fallback to x86-specific code
um: Split out default elf_aux_platform
um: Avoid circular dependency on asm-offsets in pgtable.h
um: Enable SMP support on x86
asm-generic: percpu: Add assembly guard
um: vdso: Remove getcpu support on x86
um: Add initial SMP support
um: Define timers on a per-CPU basis
um: Determine sleep based on need_resched()
...
Linus Torvalds [Sat, 6 Dec 2025 00:26:57 +0000 (16:26 -0800)]
Merge tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
- Enable parallel hotplug for RISC-V
- Optimize vector regset allocation for ptrace()
- Add a kernel selftest for the vector ptrace interface
- Enable the userspace RAID6 test to build and run using RISC-V vectors
- Add initial support for the Zalasr RISC-V ratified ISA extension
- For the Zicbop RISC-V ratified ISA extension to userspace, expose
hardware and kernel support to userspace and add a kselftest for
Zicbop
- Convert open-coded instances of 'asm goto's that are controlled by
runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(),
following arm64's alternative_has_cap_{un,}likely()
- Remove an unnecessary mask in the GFP flags used in some calls to
pagetable_alloc()
* tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
selftests/riscv: Add Zicbop prefetch test
riscv: hwprobe: Expose Zicbop extension and its block size
riscv: Introduce Zalasr instructions
riscv: hwprobe: Export Zalasr extension
dt-bindings: riscv: Add Zalasr ISA extension description
riscv: Add ISA extension parsing for Zalasr
selftests: riscv: Add test for the Vector ptrace interface
riscv: ptrace: Optimize the allocation of vector regset
raid6: test: Add support for RISC-V
raid6: riscv: Allow code to be compiled in userspace
raid6: riscv: Prevent compiler from breaking inline vector assembly code
riscv: cmpxchg: Use riscv_has_extension_likely
riscv: bitops: Use riscv_has_extension_likely
riscv: hweight: Use riscv_has_extension_likely
riscv: checksum: Use riscv_has_extension_likely
riscv: pgtable: Use riscv_has_extension_unlikely
riscv: Remove __GFP_HIGHMEM masking
RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
Linus Torvalds [Sat, 6 Dec 2025 00:18:21 +0000 (16:18 -0800)]
Merge tag 'powerpc-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Restore clearing of MSR[RI] at interrupt/syscall exit on 32-bit
- Fix unpaired stwcx on interrupt exit on 32-bit
- Fix race condition leading to double list-add in
mac_hid_toggle_emumouse()
- Fix mprotect on book3s 32-bit
- Fix SLB multihit issue during SLB preload with 64-bit hash MMU
- Add support for crashkernel CMA reservation
- Add die_id and die_cpumask for Power10 & later to expose chip
hemispheres
- A series of minor fixes and improvements to the hash SLB code
Thanks to Antonio Alvarez Feijoo, Ben Collins, Bhaskar Chowdhury,
Christophe Leroy, Daniel Thompson, Dave Vasilevsky, Donet Tom,
J. Neuschäfer, Kunwu Chan, Long Li, Naresh Kamboju, Nathan Chancellor,
Ritesh Harjani (IBM), Shirisha G, Shrikanth Hegde, Sourabh Jain, Srikar
Dronamraju, Stephen Rothwell, Thomas Zimmermann, Venkat Rao Bagalkote,
and Vishal Chourasia.
* tag 'powerpc-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (32 commits)
macintosh/via-pmu-backlight: Include <linux/fb.h> and <linux/of.h>
powerpc/powermac: backlight: Include <linux/of.h>
powerpc/64s/slb: Add no_slb_preload early cmdline param
powerpc/64s/slb: Make preload_add return type as void
powerpc/ptdump: Dump PXX level info for kernel_page_tables
powerpc/64s/pgtable: Enable directMap counters in meminfo for Hash
powerpc/64s/hash: Update directMap page counters for Hash
powerpc/64s/hash: Hash hpt_order should be only available with Hash MMU
powerpc/64s/hash: Improve hash mmu printk messages
powerpc/64s/hash: Fix phys_addr_t printf format in htab_initialize()
powerpc/64s/ptdump: Fix kernel_hash_pagetable dump for ISA v3.00 HPTE format
powerpc/64s/hash: Restrict stress_hpt_struct memblock region to within RMA limit
powerpc/64s/slb: Fix SLB multihit issue during SLB preload
powerpc, mm: Fix mprotect on book3s 32-bit
powerpc/smp: Expose die_id and die_cpumask
powerpc/83xx: Add a null pointer check to mcu_gpiochip_add
arch:powerpc:tools This file was missing shebang line, so added it
kexec: Include kernel-end even without crashkernel
powerpc: p2020: Rename wdt@ nodes to watchdog@
powerpc: 86xx: Rename wdt@ nodes to watchdog@
...
ovl: pass original credentials, not mounter credentials during create
When creating new files the security layer expects the original
credentials to be passed. When cleaning up the code this was accidently
changed to pass the mounter's credentials by relying on current->cred
which is already overriden at this point. Pass the original credentials
directly.
Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: Paul Moore <paul@paul-moore.com> Fixes: e566bff96322 ("ovl: port ovl_create_or_link() to new ovl_override_creator_creds") Link: https://lore.kernel.org/CAFqZXNvL1ciLXMhHrnoyBmQu1PAApH41LkSWEhrcvzAAbFij8Q@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Dec 2025 23:52:30 +0000 (15:52 -0800)]
Merge tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix a type conversion bug in the ipc subsystem
- Fix per-dentry timeout warning in autofs
- Drop the fd conversion from sockets
- Move assert from iput_not_last() to iput()
- Fix reversed check in filesystems_freeze_callback()
- Use proper uapi types for new struct delegation definitions
* tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
vfs: use UAPI types for new struct delegation definition
mqueue: correct the type of ro to int
Revert "net/socket: convert sock_map_fd() to FD_ADD()"
autofs: fix per-dentry timeout warning
fs: assert on I_FREEING not being set in iput() and iput_not_last()
fs: PM: Fix reverse check in filesystems_freeze_callback()
Linus Torvalds [Fri, 5 Dec 2025 23:48:09 +0000 (15:48 -0800)]
Merge tag 'exfat-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Fix a remount failure caused by differing process masks by inheriting
the original mount options during the remount process
- Fix a potential divide-by-zero error and system crash in
exfat_allocate_bitmap that occurred when the readahead count was zero
- Add validation for directory cluster bitmap bits to prevent directory
and root cluster from being incorrectly zeroed out on corrupted
images
- Clear the post-EOF page cache when extending a file to prevent stale
mmap data from becoming visible, addressing an generic/363 failure
- Fix a reference count leak in exfat_find by properly releasing the
dentry set in specific error paths
* tag 'exfat-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix remount failure in different process environments
exfat: fix divide-by-zero in exfat_allocate_bitmap
exfat: validate the cluster bitmap bits of directory
exfat: zero out post-EOF page cache on file extension
exfat: fix refcount leak in exfat_find
Linus Torvalds [Fri, 5 Dec 2025 23:25:13 +0000 (15:25 -0800)]
Merge tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Add mechanism for cleaning out unused, stale dentries; controlled via
a module option (Luis Henriques)
- Fix various bugs
- Cleanups
* tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Uninitialized variable in fuse_epoch_work()
fuse: fix io-uring list corruption for terminated non-committed requests
fuse: signal that a fuse inode should exhibit local fs behaviors
fuse: Always flush the page cache before FOPEN_DIRECT_IO write
fuse: Invalidate the page cache after FOPEN_DIRECT_IO write
fuse: rename 'namelen' to 'namesize'
fuse: use strscpy instead of strcpy
fuse: refactor fuse_conn_put() to remove negative logic.
fuse: new work queue to invalidate dentries from old epochs
fuse: new work queue to periodically invalidate expired dentries
dcache: export shrink_dentry_list() and add new helper d_dispose_if_unused()
fuse: add WARN_ON and comment for RCU revalidate
fuse: Fix whitespace for fuse_uring_args_to_ring() comment
fuse: missing copy_finish in fuse-over-io-uring argument copies
fuse: fix readahead reclaim deadlock
Root partitions running on MSHV currently attempt ACPI power-off, which
MSHV intercepts and triggers a Machine Check Exception (MCE), leading
to a kernel panic.
Root partitions panic with a trace similar to:
[ 81.306348] reboot: Power down
[ 81.314709] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0060001
[ 81.314711] mce: [Hardware Error]: TSC 3b8cb60a66 PPIN 11d98332458e4ea9
[ 81.314713] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1759339405 SOCKET 0 APIC 0 microcode ffffffff
[ 81.314715] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[ 81.314716] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 81.314717] Kernel panic - not syncing: Fatal machine check
To avoid this, configure the sleep state in the hypervisor and invoke
the HVCALL_ENTER_SLEEP_STATE hypercall as the final step in the shutdown
sequence. This ensures a clean and safe shutdown of the root partition.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
mshv: Use reboot notifier to configure sleep state
Configure sleep state information from ACPI in MSHV hypervisor using
a reboot notifier. This data allows the hypervisor to correctly power
off the host during shutdown.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Stansialv Kinsburskii <skinsburskii@linux.miscrosoft.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Introduce support for movable memory regions in the Hyper-V root partition
driver to improve memory management flexibility and enable advanced use
cases such as dynamic memory remapping.
Mirror the address space between the Linux root partition and guest VMs
using HMM. The root partition owns the memory, while guest VMs act as
devices with page tables managed via hypercalls. MSHV handles VP intercepts
by invoking hmm_range_fault() and updating SLAT entries. When memory is
reclaimed, HMM invalidates the relevant regions, prompting MSHV to clear
SLAT entries; guest VMs will fault again on access.
Integrate mmu_interval_notifier for movable regions, implement handlers for
HMM faults and memory invalidation, and update memory region mapping logic
to support movable regions.
While MMU notifiers are commonly used in virtualization drivers, this
implementation leverages HMM (Heterogeneous Memory Management) for its
specialized functionality. HMM provides a framework for mirroring,
invalidation, and fault handling, reducing boilerplate and improving
maintainability compared to generic MMU notifiers.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Introduce kref-based reference counting and spinlock protection for
memory regions in Hyper-V partition management. This change improves
memory region lifecycle management and ensures thread-safe access to the
region list.
Previously, the regions list was protected by the partition mutex.
However, this approach is too heavy for frequent fault and invalidation
operations. Finer grained locking is now used to improve efficiency and
concurrency.
This is a precursor to supporting movable memory regions. Fault and
invalidation handling for movable regions will require safe traversal of
the region list and holding a region reference while performing
invalidation or fault operations.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
mshv: Fix huge page handling in memory region traversal
The previous code assumed that if a region's first page was huge, the
entire region consisted of huge pages and stored this in a large_pages
flag. This premise is incorrect not only for movable regions (where
pages can be split and merged on invalidate callbacks or page faults),
but even for pinned regions: THPs can be split and merged during
allocation, so a large, pinned region may contain a mix of huge and
regular pages.
This change removes the large_pages flag and replaces region-wide
assumptions with per-chunk inspection of the actual page size when
mapping, unmapping, sharing, and unsharing. This makes huge page
handling correct for mixed-page regions and avoids relying on stale
metadata that can easily become invalid as memory is remapped.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Refactor memory region management functions from mshv_root_main.c into
mshv_regions.c for better modularity and code organization.
Adjust function calls and headers to use the new implementation. Improve
maintainability and separation of concerns in the mshv_root module.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Centralize guest memory region destruction to prevent resource leaks and
inconsistent cleanup across unmap and partition destruction paths.
Unify region removal, encrypted partition access recovery, and region
invalidation to improve maintainability and reliability. Reduce code
duplication and make future updates less error-prone by encapsulating
cleanup logic in a single helper.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
mshv: Refactor and rename memory region handling functions
Simplify and unify memory region management to improve code clarity and
reliability. Consolidate pinning and invalidation logic, adopt consistent
naming, and remove redundant checks to reduce complexity.
Enhance documentation and update call sites for maintainability.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Jinank Jain [Mon, 24 Nov 2025 14:25:59 +0000 (14:25 +0000)]
mshv: adjust interrupt control structure for ARM64
Interrupt control structure (union hv_interupt_control) has different
fields when it comes to x86 vs ARM64. Bring in the correct structure
from HyperV header files and adjust the existing interrupt routing
code accordingly.
mshv: Add ioctl for self targeted passthrough hvcalls
Allow MSHV_ROOT_HVCALL IOCTL on the /dev/mshv fd. This IOCTL would
execute a passthrough hypercall targeting the root/parent partition
i.e. HV_PARTITION_ID_SELF.
This will be useful for the VMM to query things like supported
synthetic processor features, supported VMM capabiliites etc.
Since hypercalls targeting the host partition could potentially perform
privileged operations, allow only a limited set of hypercalls. To begin
with, allow only:
Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Naman Jain [Thu, 13 Nov 2025 04:41:49 +0000 (04:41 +0000)]
Drivers: hv: Introduce mshv_vtl driver
Provide an interface for Virtual Machine Monitor like OpenVMM and its
use as OpenHCL paravisor to control VTL0 (Virtual trust Level).
Expose devices and support IOCTLs for features like VTL creation,
VTL0 memory management, context switch, making hypercalls,
mapping VTL0 address space to VTL2 userspace, getting new VMBus
messages and channel events in VTL2 etc.
Co-developed-by: Roman Kisel <romank@linux.microsoft.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Linus Torvalds [Fri, 5 Dec 2025 22:36:21 +0000 (14:36 -0800)]
Merge tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull persistent dentry infrastructure and conversion from Al Viro:
"Some filesystems use a kinda-sorta controlled dentry refcount leak to
pin dentries of created objects in dcache (and undo it when removing
those). A reference is grabbed and not released, but it's not actually
_stored_ anywhere.
That works, but it's hard to follow and verify; among other things, we
have no way to tell _which_ of the increments is intended to be an
unpaired one. Worse, on removal we need to decide whether the
reference had already been dropped, which can be non-trivial if that
removal is on umount and we need to figure out if this dentry is
pinned due to e.g. unlink() not done. Usually that is handled by using
kill_litter_super() as ->kill_sb(), but there are open-coded special
cases of the same (consider e.g. /proc/self).
Things get simpler if we introduce a new dentry flag
(DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set
claims responsibility for +1 in refcount.
The end result this series is aiming for:
- get these unbalanced dget() and dput() replaced with new primitives
that would, in addition to adjusting refcount, set and clear
persistency flag.
- instead of having kill_litter_super() mess with removing the
remaining "leaked" references (e.g. for all tmpfs files that hadn't
been removed prior to umount), have the regular
shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries,
dropping the corresponding reference if it had been set. After that
kill_litter_super() becomes an equivalent of kill_anon_super().
Doing that in a single step is not feasible - it would affect too many
places in too many filesystems. It has to be split into a series.
This work has really started early in 2024; quite a few preliminary
pieces have already gone into mainline. This chunk is finally getting
to the meat of that stuff - infrastructure and most of the conversions
to it.
Some pieces are still sitting in the local branches, but the bulk of
that stuff is here"
* tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
d_make_discardable(): warn if given a non-persistent dentry
kill securityfs_recursive_remove()
convert securityfs
get rid of kill_litter_super()
convert rust_binderfs
convert nfsctl
convert rpc_pipefs
convert hypfs
hypfs: swich hypfs_create_u64() to returning int
hypfs: switch hypfs_create_str() to returning int
hypfs: don't pin dentries twice
convert gadgetfs
gadgetfs: switch to simple_remove_by_name()
convert functionfs
functionfs: switch to simple_remove_by_name()
functionfs: fix the open/removal races
functionfs: need to cancel ->reset_work in ->kill_sb()
functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}()
functionfs: don't abuse ffs_data_closed() on fs shutdown
convert selinuxfs
...
Linus Torvalds [Fri, 5 Dec 2025 21:52:43 +0000 (13:52 -0800)]
Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
Rework the vmalloc() code to support non-blocking allocations
(GFP_ATOIC, GFP_NOWAIT)
"ksm: fix exec/fork inheritance" (xu xin)
Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
inherited across fork/exec
"mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
Some light maintenance work on the zswap code
"mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
Enhance the /sys/kernel/debug/page_owner debug feature by adding
unique identifiers to differentiate the various stack traces so
that userspace monitoring tools can better match stack traces over
time
"mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
Minor alterations to the page allocator's per-cpu-pages feature
"Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
Address a scalability issue in userfaultfd's UFFDIO_MOVE operation
"kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)
"drivers/base/node: fold node register and unregister functions" (Donet Tom)
Clean up the NUMA node handling code a little
"mm: some optimizations for prot numa" (Kefeng Wang)
Cleanups and small optimizations to the NUMA allocation hinting
code
"mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
Address long lock hold times at boot on large machines. These were
causing (harmless) softlockup warnings
"optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
Remove some now-unnecessary work from page reclaim
"mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
Enhance the DAMOS auto-tuning feature
"mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
configuration
"expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
Enhance the new(ish) file_operations.mmap_prepare() method and port
additional callsites from the old ->mmap() over to ->mmap_prepare()
"Fix stale IOTLB entries for kernel address space" (Lu Baolu)
Fix a bug (and possible security issue on non-x86) in the IOMMU
code. In some situations the IOMMU could be left hanging onto a
stale kernel pagetable entry
"mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
Clean up and optimize the folio splitting code
"mm, swap: misc cleanup and bugfix" (Kairui Song)
Some cleanups and a minor fix in the swap discard code
"mm/damon: support pin-point targets removal" (SeongJae Park)
Permit userspace to remove a specific monitoring target in the
middle of the current targets list
"mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
A couple of cleanups related to mm header file inclusion
"mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
improve the selection of swap devices for NUMA machines
"mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
Change the memory block labels from macros to enums so they will
appear in kernel debug info
"ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
Address an inefficiency when KSM unmerges an address range
"mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
Fix leaks and unhandled malloc() failures in DAMON userspace unit
tests
"some cleanups for pageout()" (Baolin Wang)
Clean up a couple of minor things in the page scanner's
writeback-for-eviction code
"mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
Move hugetlb's sysfs/sysctl handling code into a new file
"introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
Make the VMA guard regions available in /proc/pid/smaps and
improves the mergeability of guarded VMAs
"mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
Reduce mmap lock contention for callers performing VMA guard region
operations
"vma_start_write_killable" (Matthew Wilcox)
Start work on permitting applications to be killed when they are
waiting on a read_lock on the VMA lock
"mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
Add additional userspace testing of DAMON's "commit" feature
"mm/damon: misc cleanups" (SeongJae Park)
"make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
VMA is merged with another
"mm: support device-private THP" (Balbir Singh)
Introduce support for Transparent Huge Page (THP) migration in zone
device-private memory
"Optimize folio split in memory failure" (Zi Yan)
"mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
Some more cleanups in the folio splitting code
"mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
Clean up our handling of pagetable leaf entries by introducing the
concept of 'software leaf entries', of type softleaf_t
"reparent the THP split queue" (Muchun Song)
Reparent the THP split queue to its parent memcg. This is in
preparation for addressing the long-standing "dying memcg" problem,
wherein dead memcg's linger for too long, consuming memory
resources
"unify PMD scan results and remove redundant cleanup" (Wei Yang)
A little cleanup in the hugepage collapse code
"zram: introduce writeback bio batching" (Sergey Senozhatsky)
Improve zram writeback efficiency by introducing batched bio
writeback support
"memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
Clean up our handling of the interrupt safety of some memcg stats
"make vmalloc gfp flags usage more apparent" (Vishal Moola)
Clean up vmalloc's handling of incoming GFP flags
"mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
Teach soft dirty and userfaultfd write protect tracking to use
RISC-V's Svrsw60t59b extension
"mm: swap: small fixes and comment cleanups" (Youngjun Park)
Fix a small bug and clean up some of the swap code
"initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
Start work on converting the vma struct's flags to a bitmap, so we
stop running out of them, especially on 32-bit
"mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
Address a possible bug in the swap discard code and clean things
up a little
[ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu:
register device memory for poison handling") because it looks
broken to me, I've asked for clarification - Linus ]
* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm: fix vma_start_write_killable() signal handling
mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
mm/swapfile: fix list iteration when next node is removed during discard
fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
mm/kfence: add reboot notifier to disable KFENCE on shutdown
memcg: remove inc/dec_lruvec_kmem_state helpers
selftests/mm/uffd: initialize char variable to Null
mm: fix DEBUG_RODATA_TEST indentation in Kconfig
mm: introduce VMA flags bitmap type
tools/testing/vma: eliminate dependency on vma->__vm_flags
mm: simplify and rename mm flags function for clarity
mm: declare VMA flags by bit
zram: fix a spelling mistake
mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
pagemap: update BUDDY flag documentation
mm: swap: remove scan_swap_map_slots() references from comments
mm: swap: change swap_alloc_slow() to void
mm, swap: remove redundant comment for read_swap_cache_async
mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
...