Cc: Steve French <smfrench@gmail.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Dudu Lu [Wed, 15 Apr 2026 10:24:24 +0000 (18:24 +0800)]
smb: client: fix integer underflow in receive_encrypted_read()
In receive_encrypted_read(), the length of data to read from the socket
is computed as:
len = le32_to_cpu(tr_hdr->OriginalMessageSize) -
server->vals->read_rsp_size;
OriginalMessageSize comes from the server's transform header and is
untrusted. If a malicious server sends a value smaller than
read_rsp_size, the unsigned subtraction wraps to a very large value
(~4GB). This value is then passed to netfs_alloc_folioq_buffer() and
cifs_read_iter_from_socket(), causing either a massive allocation
attempt that fails with -ENOMEM (DoS), or under extreme memory
pressure, potential heap corruption.
Fix by adding a check that OriginalMessageSize is at least
read_rsp_size before the subtraction. On failure, jump to
discard_data to drain the remaining PDU from the socket, preventing
desync of subsequent reads on the connection.
Signed-off-by: Dudu Lu <phx0fer@gmail.com> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
Merge tag 'v7.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Replace crypto_get_default_rng with crypto_stdrng_get_bytes
- Remove simd skcipher support
- Allow algorithm types to be disabled when CRYPTO_SELFTESTS is off
Algorithms:
- Remove CPU-based des/3des acceleration
- Add test vectors for authenc(hmac(md5),cbc({aes,des})) and
authenc(hmac({md5,sha1,sha224,sha256,sha384,sha512}),rfc3686(ctr(aes)))
- Replace spin lock with mutex in jitterentropy
Drivers:
- Add authenc algorithms to safexcel
- Add support for zstd in qat
- Add wireless mode support for QAT GEN6
- Add anti-rollback support for QAT GEN6
- Add support for ctr(aes), gcm(aes), and ccm(aes) in dthev2"
* tag 'v7.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (129 commits)
crypto: af_alg - use sock_kmemdup in alg_setkey_by_key_serial
crypto: vmx - remove CRYPTO_DEV_VMX from Kconfig
crypto: omap - convert reqctx buffer to fixed-size array
crypto: atmel-sha204a - add Thorsten Blum as maintainer
crypto: atmel-ecc - add Thorsten Blum as maintainer
crypto: qat - fix IRQ cleanup on 6xxx probe failure
crypto: geniv - Remove unused spinlock from struct aead_geniv_ctx
crypto: qce - simplify qce_xts_swapiv()
crypto: hisilicon - Fix dma_unmap_single() direction
crypto: talitos - rename first/last to first_desc/last_desc
crypto: talitos - fix SEC1 32k ahash request limitation
crypto: jitterentropy - replace long-held spinlock with mutex
crypto: hisilicon - remove unused and non-public APIs for qm and sec
crypto: hisilicon/qm - drop redundant variable initialization
crypto: hisilicon/qm - remove else after return
crypto: hisilicon/qm - add const qualifier to info_name in struct qm_cmd_dump_item
crypto: hisilicon - fix the format string type error
crypto: ccree - fix a memory leak in cc_mac_digest()
crypto: qat - add support for zstd
crypto: qat - use swab32 macro
...
Merge tag 'ipe-pr-20260413' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull IPE update from Fan Wu:
"A single commit from Evan Ducas that fixes several spelling and
grammar mistakes in the IPE documentation. There are no functional
changes"
* tag 'ipe-pr-20260413' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
docs: security: ipe: fix typos and grammar
Merge tag 'for-linus-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- fix an error path in drivers/xen/manage.c
- fix the Xen console driver solving a boot hangup when the console
backend isn't yet running
- comment fix in the Xen swiotlb driver
- hardening for Xen on Arm adding a more thorough validation
- cleanup of the Xen grant table code hiding suspend/resume code for
the case if CONFIG_HIBERNATE_CALLBACKS isn't defined
* tag 'for-linus-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/grant-table: guard gnttab_suspend/resume with CONFIG_HIBERNATE_CALLBACKS
hvc/xen: Check console connection flag
xen/swiotlb: fix stale reference to swiotlb_unmap_page()
xen/manage: unwind partial shutdown watcher setup on error
ARM: xen: validate hypervisor compatible before parsing its version
Merge tag 'for-7.1/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Benjamin Marzinski:
"There are fixes for some corner case crashes in dm-cache and
dm-mirror, new setup functionality for dm-vdo, and miscellaneous minor
fixes and cleanups, especially to dm-verity.
dm-vdo:
- Make dm-vdo able to format the device itself, like other dm
targets, instead of needing a userspace formating program
- Add some sanity checks and code cleanup
dm-cache:
- Fix crashes and hangs when operating in passthrough mode (which
have been around, unnoticed, since 4.12), as well as a late
arriving fix for an error path bug in the passthrough fix
- Fix a corner case memory leak
dm-verity:
- Another set of minor bugfixes and code cleanups to the forward
error correction code
dm-mirror
- Fix minor initialization bug
- Fix overflow crash on a large devices with small region sizes
dm-crypt
- Reimplement elephant diffuser using AES library and minor cleanups
dm-core:
- Claude found a buffer overflow in /dev/mapper/contrl ioctl handling
- make dm_mod.wait_for correctly wait for partitions
- minor code fixes and cleanups"
* tag 'for-7.1/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
dm cache: fix missing return in invalidate_committed's error path
dm: fix a buffer overflow in ioctl processing
dm-crypt: Make crypt_iv_operations::post return void
dm vdo: Fix spelling mistake "postive" -> "positive"
dm: provide helper to set stacked limits
dm-integrity: always set the io hints
dm-integrity: fix mismatched queue limits
dm-bufio: use kzalloc_flex
dm vdo: save the formatted metadata to disk
dm vdo: add formatting logic and initialization
dm vdo: add synchronous metadata I/O submission helper
dm vdo: add geometry block structure
dm vdo: add geometry block encoding
dm vdo: add upfront validation for logical size
dm vdo: add formatting parameters to table line
dm vdo: add super block initialization to encodings.c
dm vdo: add geometry block initialization to encodings.c
dm-crypt: Make crypt_iv_operations::wipe return void
dm-crypt: Reimplement elephant diffuser using AES library
dm-verity-fec: warn even when there were no errors
...
Merge tag 'iommu-updates-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
"Core:
- Support for RISC-V IO-page-table format in generic iommupt code
ARM-SMMU Updates:
- Introduction of an "invalidation array" for SMMUv3, which enables
future scalability work and optimisations for devices with a large
number of SMMUv3 instances
- Update the conditions under which the SMMUv3 driver works around
hardware errata for invalidation on MMU-700 implementations
- Fix broken command filtering for the host view of NVIDIA's "cmdqv"
SMMUv3 extension
- MMU-500 device-tree binding additions for Qualcomm Eliza & Hawi
SoCs
Intel VT-d:
- Support for dirty tracking on domains attached to PASID
- Removal of unnecessary read*()/write*() wrappers
- Improvements to the invalidation paths
AMD Vi:
- Race-condition fixed in debugfs code
- Make log buffer allocation NUMA aware
RISC-V:
- IO-TLB flushing improvements
- Minor fixes"
* tag 'iommu-updates-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (48 commits)
iommu/vt-d: Restore IOMMU_CAP_CACHE_COHERENCY
dt-bindings: arm-smmu: qcom: Add compatible for Hawi SoC
iommu/amd: Invalidate IRT cache for DMA aliases
iommu/riscv: Remove overflows on the invalidation path
iommu/amd: Fix clone_alias() to use the original device's devid
iommu/vt-d: Remove the remaining pages along the invalidation path
iommu/vt-d: Pass size_order to qi_desc_piotlb() not npages
iommu/vt-d: Split piotlb invalidation into range and all
iommu/vt-d: Remove dmar_writel() and dmar_writeq()
iommu/vt-d: Remove dmar_readl() and dmar_readq()
iommufd/selftest: Test dirty tracking on PASID
iommu/vt-d: Support dirty tracking on PASID
iommu/vt-d: Rename device_set_dirty_tracking() and pass dmar_domain pointer
iommu/vt-d: Block PASID attachment to nested domain with dirty tracking
iommu/dma: Always allow DMA-FQ when iommupt provides the iommu_domain
iommu/riscv: Fix signedness bug
iommu/amd: Fix illegal cap/mmio access in IOMMU debugfs
iommu/amd: Fix illegal device-id access in IOMMU debugfs
iommu/tegra241-cmdqv: Update uAPI to clarify HYP_OWN requirement
iommu/tegra241-cmdqv: Set supports_cmd op in tegra241_vcmdq_hw_init()
...
Merge tag 'ata-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata updates from Niklas Cassel:
- Misc code cleanups related to tag checking and tag command completion
(Damien)
- Remove Baikal bt1-ahci DT binding since the upstreaming for this SoC
is not going to be finalized (Andy)
- Only call the libata port error handler from the SCSI error handler
if there were command timeouts or if EH was scheduled for the port
(Damien)
- Refactor ata_scsiop_maint_in() to more clearly show that there is
only one service action implemented for the MAINTENANCE IN command
(me)
- Clean up the handling of sysfs attributes exposed by libata (Heiner)
- Let libahci_platform use a flexible array member for platform PHYs to
avoid multiple allocations (Rosen)
- Do not retry reset if the device has been removed/hot-unplugged
(Igor)
- Add missing newlines to error prints in pata_arasan_cf driver (Haoyu)
- Use the correct SCSI host byte when completing deferred ATA
PASS-THROUGH commands, to avoid the SCSI mid-layer from failing the
commands instead of requeuing (Igor)
* tag 'ata-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-scsi: fix requeue of deferred ATA PASS-THROUGH commands
ata: pata_arasan_cf: fix missing newline in dev_err() messages
ata: libata-transport: remove static variable ata_scsi_transport_template
ata: libata-transport: split struct ata_internal
ata: libata-transport: use static struct ata_transport_internal to simplify match functions
ata: libata-transport: inline ata_attach|release_transport
ata: libata-transport: instantiate struct ata_internal statically
ata: libata-eh: Do not retry reset if the device is gone
ata: libahci_platform: use flex array for platform PHYs
ata: libata-transport: remove redundant dynamic sysfs attributes
ata: libata-scsi: refactor ata_scsiop_maint_in()
ata: libata-eh: avoid unnecessary calls to ata_scsi_port_error_handler()
ata: ahci-dwc: Remove not-going-to-be-supported code for Baikal SoC
ata: libata-scsi: rename and improve ata_qc_done()
ata: libata-scsi: make ata_scsi_simulate() static
ata: libata-scsi: simplify ata_scsi_requeue_deferred_qc()
ata: libata-sata: simplify ata_sas_queuecmd()
ata: libata-core: improve tag checks in ata_qc_issue()
Merge tag 'pci-v7.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Allow TLP Processing Hints to be enabled for RCiEPs (George Abraham
P)
- Enable AtomicOps only if we know the Root Port supports them (Gerd
Bayer)
- Don't enable AtomicOps for RCiEPs since none of them need Atomic
Ops and we can't tell whether the Root Complex would support them
(Gerd Bayer)
- Leave Precision Time Measurement disabled until a driver enables it
to avoid PCIe errors (Mika Westerberg)
- Make pci_set_vga_state() fail if bridge doesn't support VGA
routing, i.e., PCI_BRIDGE_CTL_VGA is not writable, and return
errors to vga_get() callers including userspace via
/dev/vga_arbiter (Simon Richter)
- Validate max-link-speed from DT in j721e, brcmstb, mediatek-gen3,
rzg3s drivers (where the actual controller constraints are known),
and remove validation from the generic OF DT accessor (Hans Zhang)
- Remove pc110pad driver (no longer useful after 486 CPU support
removed) and no_pci_devices() (pc110pad was the last user) (Dmitry
Torokhov, Heiner Kallweit)
Resource management:
- Prevent assigning space to unimplemented bridge windows; previously
we mistakenly assumed prefetchable window existed and assigned
space and put a BAR there (Ahmed Naseef)
- Avoid shrinking bridge windows to fit in the initial Root Port
window; fixes one problem with devices with large BARs connected
via switches, e.g., Thunderbolt (Ilpo Järvinen)
- Pass full extent of empty space, not just the aligned space, to
resource_alignf callback so free space before the requested
alignment can be used (Ilpo Järvinen)
- Place small resources before larger ones for better utilization of
address space (Ilpo Järvinen)
- Fix alignment calculation for resource size larger than align,
e.g., bridge windows larger than the 1MB required alignment (Ilpo
Järvinen)
Reset:
- Update slot handling so all ARI functions are treated as being in
the same slot. They're all reset by Secondary Bus Reset, but
previously drivers of ARI functions that appeared to be on a
non-zero device weren't notified and fatal hardware errors could
result (Keith Busch)
- Make sysfs reset_subordinate hotplug safe to avoid spurious hotplug
events (Keith Busch)
- Hide Secondary Bus Reset ('bus') from sysfs reset_methods if masked
by CXL because it has no effect (Vidya Sagar)
- Avoid FLR for AMD NPU device, where it causes the device to hang
(Lizhi Hou)
Error handling:
- Clear only error bits in PCIe Device Status to avoid accidentally
clearing Emergency Power Reduction Detected (Shuai Xue)
- Check for AER errors even in devices without drivers (Lukas Wunner)
- Initialize ratelimit info so DPC and EDR paths log AER error
information (Kuppuswamy Sathyanarayanan)
Power control:
- Add UPD720201/UPD720202 USB 3.0 xHCI Host Controller .compatible so
generic pwrctrl driver can control it (Neil Armstrong)
Hotplug:
- Set LED_HW_PLUGGABLE for NPEM hotplug-capable ports so LED core
doesn't complain when setting brightness fails because the endpoint
is gone (Richard Cheng)
Peer-to-peer DMA:
- Allow wildcards in list of host bridges that support peer-to-peer
DMA between hierarchy domains and add all Google SoCs (Jacob
Moroni)
Endpoint framework:
- Advertise dynamic inbound mapping support in pci-epf-test and
update host pci_endpoint_test to skip doorbell testing if not
advertised by endpoint (Koichiro Den)
- Return 0, not remaining timeout, when MHI eDMA ops complete so
mhi_ep_ring_add_element() doesn't interpret non-zero as failure
(Daniel Hodges)
- Remove vntb and ntb duplicate resource teardown that leads to oops
when .allow_link() fails or .drop_link() is called (Koichiro Den)
- Disable vntb delayed work before clearing BAR mappings and
doorbells to avoid oops caused by doing the work after resources
have been torn down (Koichiro Den)
- Add a way to describe reserved subregions within BARs, e.g.,
platform-owned fixed register windows, and use it for the RK3588
BAR4 DMA ctrl window (Koichiro Den)
- Add BAR_DISABLED for BARs that will never be available to an EPF
driver, and change some BAR_RESERVED annotations to BAR_DISABLED
(Niklas Cassel)
- Add NTB .get_dma_dev() callback for cases where DMA API requires a
different device, e.g., vNTB devices (Koichiro Den)
- Add reserved region types for MSI-X Table and PBA so Endpoint
controllers can them as describe hardware-owned regions in a
BAR_RESERVED BAR (Manikanta Maddireddy)
- Make Tegra194/234 BAR0 programmable and remove 1MB size limit
(Manikanta Maddireddy)
- Expose Tegra BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
(Manikanta Maddireddy)
- Add Tegra194 and Tegra234 device table entries to pci_endpoint_test
(Manikanta Maddireddy)
- Skip the BAR subrange selftest if there are not enough inbound
window resources to run the test (Christian Bruel)
New native PCIe controller drivers:
- Add DT binding and driver for Andes QiLai SoC PCIe host controller
(Randolph Lin)
- Add DT binding and driver for ESWIN PCIe Root Complex (Senchuan
Zhang)
Baikal T-1 PCIe controller driver:
- Remove driver since it never quite became usable (Andy Shevchenko)
Cadence PCIe controller driver:
- Implement byte/word config reads with dword (32-bit) reads because
some Cadence controllers don't support sub-dword accesses (Aksh
Garg)
CIX Sky1 PCIe controller driver:
- Add 'power-domains' to DT binding for SCMI power domain (Gary Yang)
Freescale i.MX6 PCIe controller driver:
- Add i.MX94 and i.MX943 to fsl,imx6q-pcie-ep DT binding (Richard
Zhu)
- Delay instead of polling for L2/L3 Ready after PME_Turn_off when
suspending i.MX6SX because LTSSM registers are inaccessible
(Richard Zhu)
- Separate PERST# assertion (for resetting endpoints) from core reset
(for resetting the RC itself) to prepare for new DTs with PERST#
GPIO in per-Root Port nodes (Sherry Sun)
- Retain Root Port MSI capability on i.MX7D, i.MX8MM, and i.MX8MQ so
MSI from downstream devices will work (Richard Zhu)
- Fix i.MX95 reference clock source selection when internal refclk is
used (Franz Schnyder)
Freescale Layerscape PCIe controller driver:
- Allow building as a removable module (Sascha Hauer)
MediaTek PCIe Gen3 controller driver:
- Use dev_err_probe() to simplify error paths and make deferred probe
messages visible in /sys/kernel/debug/devices_deferred (Chen-Yu
Tsai)
- Power off device if setup fails (Chen-Yu Tsai)
- Integrate new pwrctrl API to enable power control for WiFi/BT
adapters on mainboard or in PCIe or M.2 slots (Chen-Yu Tsai)
NVIDIA Tegra194 PCIe controller driver:
- Poll less aggressively and non-atomically for PME_TO_Ack during
transition to L2 (Vidya Sagar)
- Disable LTSSM after transition to Detect on surprise link down to
stop toggling between Polling and Detect (Manikanta Maddireddy)
- Don't force the device into the D0 state before L2 when suspending
or shutting down the controller (Vidya Sagar)
- Disable PERST# IRQ only in Endpoint mode because it's not
registered in Root Port mode (Manikanta Maddireddy)
- Handle 'nvidia,refclk-select' as optional (Vidya Sagar)
- Disable direct speed change in Endpoint mode so link speed change
is controlled by the host (Vidya Sagar)
- Set LTR values before link up to avoid bogus LTR messages with 0
latency (Vidya Sagar)
- Allow system suspend when the Endpoint link is down (Vidya Sagar)
- Use DWC IP core version, not Tegra custom values, to avoid DWC core
version check warnings (Manikanta Maddireddy)
- Apply ECRC workaround to devices based on DesignWare 5.00a as well
as 4.90a (Manikanta Maddireddy)
- Disable PM Substate L1.2 in Endpoint mode to work around Tegra234
erratum (Vidya Sagar)
- Delay post-PERST# cleanup until core is powered on to avoid CBB
timeout (Manikanta Maddireddy)
- Assert CLKREQ# so switches that forward it to their downstream side
can bring up those links successfully (Vidya Sagar)
- Calibrate pipe to UPHY for Endpoint mode to reset stale PLL state
from any previous bad link state (Vidya Sagar)
- Remove IRQF_ONESHOT flag from Endpoint interrupt registration so
DMA driver and Endpoint controller driver can share the interrupt
line (Vidya Sagar)
- Enable DMA interrupt to support DMA in both Root Port and Endpoint
modes (Vidya Sagar)
- Enable hardware link retraining after link goes down in Endpoint
mode (Vidya Sagar)
- Add DT binding and driver support for core clock monitoring (Vidya
Sagar)
Qualcomm PCIe controller driver:
- Advertise 'Hot-Plug Capable' and set 'No Command Completed Support'
since Qcom Root Ports support hotplug events like DL_Up/Down and
can accept writes to Slot Control without delays between writes
(Krishna Chaitanya Chundru)
Renesas R-Car PCIe controller driver:
- Mark Endpoint BAR0 and BAR2 as Resizable (Koichiro Den)
- Reduce EPC BAR alignment requirement to 4K (Koichiro Den)
Renesas RZ/G3S PCIe controller driver:
- Add RZ/G3E to DT binding and to driver (John Madieu)
- Assert resets in suspend path in reverse order they were deasserted
during probe (John Madieu)
- Rework inbound window algorithm to prevent mapping more than
intended region and enforce alignment on size, to prepare for
RZ/G3E support (John Madieu)
Rockchip DesignWare PCIe controller driver:
- Add tracepoints for PCIe controller LTSSM transitions and link rate
changes (Shawn Lin)
- Trace LTSSM events collected by the dw-rockchip debug FIFO (Shawn
Lin)
SOPHGO PCIe controller driver:
- Disable ASPM L0s and L1 on Sophgo 2042 PCIe Root Ports that
advertise support for them (Yao Zi)
Synopsys DesignWare PCIe controller driver:
- Continue with system suspend even if an Endpoint doesn't respond
with PME_TO_Ack message (Manivannan Sadhasivam)
- Set Endpoint MSI-X Table Size in the correct function of a
multi-function device when configuring MSI-X, not in Function 0
(Aksh Garg)
- Set Max Link Width and Max Link Speed for all functions of a
multi-function device, not just Function 0 (Aksh Garg)
- Expose PCIe event counters in groups 5-7 in debugfs (Hans Zhang)
Miscellaneous:
- Warn only once about invalid ACS kernel parameter format (Richard
Cheng)
- Suppress FW_BUG warning when writing sysfs 'numa_node' with the
current value (Li RongQing)
- Drop redundant 'depends on PCI' from Kconfig (Julian Braha)"
* tag 'pci-v7.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (165 commits)
PCI/P2PDMA: Add Google SoCs to the P2P DMA host bridge list
PCI/P2PDMA: Allow wildcard Device IDs in host bridge list
PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports
PCI: cadence: Add flags for disabling ASPM capability for broken Root Ports
PCI: tegra194: Add core monitor clock support
dt-bindings: PCI: tegra194: Add monitor clock support
PCI: tegra194: Enable hardware hot reset mode in Endpoint mode
PCI: tegra194: Enable DMA interrupt
PCI: tegra194: Remove IRQF_ONESHOT flag during Endpoint interrupt registration
PCI: tegra194: Calibrate pipe to UPHY for Endpoint mode
PCI: tegra194: Assert CLKREQ# explicitly by default
PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on
PCI: tegra194: Disable L1.2 capability of Tegra234 EP
PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well
PCI: tegra194: Use DWC IP core version
PCI: tegra194: Free up Endpoint resources during remove()
PCI: tegra194: Allow system suspend when the Endpoint link is not up
PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode
PCI: tegra194: Disable direct speed change for Endpoint mode
PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select"
...
Merge tag 'hwmon-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
"New drivers:
- Lenovo Yoga/Legion fan monitoring (yogafan)
- LattePanda Sigma EC
- Infineon XDP720 eFuse
- Microchip MCP998X
New device support:
- TI INA234
- Infineon XDPE1A2G5B/7B
- Renesas RAA228942 and RAA228943 (isl68137)
- Delta Q54SN120A1 and Q54SW120A7 (pmbus)
- TI TMP110 and TMP113 (tmp102)
- Sony APS-379 (pmbus)
- ITE IT8689E (it87)
- ASUS ROG STRIX Z790-H, X470-F, and CROSSHAIR X670E (asus-ec-sensors)
- GPD Win 5 (gpd-fan)
Modernization and Cleanups:
- Convert asus_atk0110 and acpi_power_meter ACPI drivers to platform
drivers
- Remove i2c_match_id() usage in many PMBus drivers
- Use guard() for mutex protection in pmbus_core
- Replace sprintf() with sysfs_emit() in ads7871, emc1403, max6650,
ads7828, max31722, and tc74
- Various markup and documentation improvements for yogafan and
ltc4282
Bug fixes:
- Fix use-after-free and missing usb_kill_urb on disconnect in powerz
driver
- Avoid cacheline sharing for DMA buffer in powerz driver
- Fix integer overflow in power calculation on 32-bit in isl28022
driver
- Fix bugs in pt5161l_read_block_data()
- Propagate SPI errors and fix incorrect error codes in ads7871
driver
- Fix i2c_smbus_write_byte_data wrapper argument type in max31785
driver
Device tree bindings:
- Convert npcm750-pwm-fan to DT schema
- Add bindings for Infineon XDP720, Microchip MCP998X, Sony APS-379,
Renesas RAA228942/3, Delta Q54SN120A1/7, XDPE1A2G5B/7B, Aosong
AHT10/20, DHT20, and TI INA234
- Adapt moortec,mr75203 bindings for T-Head TH1520"
* tag 'hwmon-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (82 commits)
hwmon: (ina233) Don't check for specific errors when parsing properties
hwmon: (isl28022) Don't check for specific errors when parsing properties
hwmon: (pmbus/tps25990) Don't check for specific errors when parsing properties
hwmon: (nct6683) Add customer ID for ASRock B650I Lightning WiFi
hwmon:(pmbus/xdp720) Add support for efuse xdp720
dt-bindings: hwmon/pmbus: Add Infineon XDP720
hwmon: add support for MCP998X
dt-bindings: hwmon: add support for MCP998X
hwmon: (powerz) Avoid cacheline sharing for DMA buffer
hwmon: (isl28022) Fix integer overflow in power calculation on 32-bit
hwmon: (pt5161l) Fix bugs in pt5161l_read_block_data()
hwmon: (powerz) Fix missing usb_kill_urb() on signal interrupt
hwmon: (powerz) Fix use-after-free on USB disconnect
hwmon: pmbus: Add support for Sony APS-379
dt-bindings: trivial-devices: Add sony,aps-379
hwmon: (yogafan) various markup improvements
hwmon: (sparx5) Make it selectable for ARCH_LAN969X
hwmon: (tmp102) add support for update interval
hwmon: (yogafan) fix markup warning
hwmon: (yogafan) Add support for Lenovo Yoga/Legion fan monitoring
...
Merge tag 'spi-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"A busy release for SPI, almost all of it in a couple of larger fix and
cleanup series for patterns that affected many drivers. We do have a
couple of core API additions as well, relatively application specific
but they enable some new use cases.
- A packed command operation for spi-mem devices
- Improvements to the ancillary device support to enable some IIO use
cases from Antoniu Miclaus
- Fixes for a registration ordering issue pattern caused by the
handover between allocation and registration of controllers in
concert with devm from Johan Hovold
- Improvements to handling of clock allocation from Pei Xiao
- Cleanups in the fsl-lpspi driver from Marc Kleine-Budde
Merge tag 'regulator-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"This has been a very quiet update for the regulator API, the bulk of
the diffstat is DT binding conversions and the most promient series in
the changelog is Johan Hovold cleaning up some leaks of OF nodes. For
some reason we have had several different people sending improvements
to better describe the parent supplies for existing regulators, these
look to be independent efforts.
The only new hardware support is for some Motorola custom varints of
cpcap"
* tag 'regulator-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (35 commits)
regulator: max77620: drop redundant OF node initialisation
regulator: bq257xx: Make OTG enable GPIO really optional
regulator: bq257xx: Remove reference to the parent MFD's dev
regulator: bd9571mwv: fix OF node reference imbalance
regulator: act8945a: fix OF node reference imbalance
regulator: s2dos05: fix OF node reference imbalance
regulator: mt6357: fix OF node reference imbalance
regulator: max77650: fix OF node reference imbalance
regulator: rk808: fix OF node reference imbalance
regulator: bq257xx: fix OF node reference imbalance
regulator: dt-bindings: qcom,qca6390-pmu: Document WCN6755 PMU
regulator: dt-bindings: regulator-max77620: convert to DT schema
regulator: mt6315: Add regulator supplies
regulator: dt-bindings: mt6315: Add regulator supplies
regulator: devres: Use enum regulator_get_type in internal functions
regulator: dt-bindings: mps,mp8859: convert to DT schema
regulator: da9121: Allow caching BUCK registers
regulator: dt-bindings: dlg,da9121: Add dlg,no-gpio-control
regulator: cros-ec: Add regulator supply
regulator: dt-bindings: cros-ec: Add regulator supply
...
Merge tag 'regmap-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This has been quite a busy release for regmap, the user visible
changes are quite minor but there's some quite good work on internal
code improvements:
- Cleanup helper for __free()ing regmap_fields
- Support non-devm I3C regmaps
- A bunch of cleanup work, mostly from Andy Shevchenko
- Fix for bootstrapping issues with hardware initialised regmaps,
which was the main inspiration for some of the cleanups"
* tag 'regmap-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: i3c: Add non-devm regmap_init_i3c() helper
regmap: debugfs: fix race condition in dummy name allocation
regmap: Synchronize cache for the page selector
regmap: Simplify devres handling
regcache: Move HW readback after cache initialisation
regcache: Allocate and free reg_defaults on the same level
regcache: Move count check and cache_bypass assignment to the caller
regcache: Factor out regcache_hw_exit() helper
regcache: Amend printf() specifiers when printing registers
regcache: Define iterator inside for-loop and align their types
regmap: define cleanup helper for regmap_field
regmap: sort header includes
regcache: Split regcache_count_cacheable_registers() helper
regcache: Remove duplicate check in regcache_hw_init()
Merge tag 'pmdomain-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain updates from Ulf Hansson:
"pmdomain core:
- Extend statistics for domain idle states with s2idle data
- Show latency/residency for domain idle states in debugfs
pmdomain providers:
- imx: Add support for optional subnodes for imx93-blk-ctrl
- marvell: Add audio power island for Marvell PXA1908
- mediatek:
- Add legacy support for the MT7622 audio power domain
- Add nvmem provider functionality to the mtk-mfg-pmdomain
- Add support for the MT8189 power domains
- qcom: Add support for the Eliza and Hawi power domains
- sunxi: Add support for the Allwinner A733 power domains
- ti: Handle wakeup constraints for out-of-band wakeups for ti_sci"
* tag 'pmdomain-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (32 commits)
pmdomain: qcom: rpmhpd: Add power domains for Hawi SoC
dt-bindings: power: qcom,rpmhpd: Add RPMh power domain for Hawi SoC
pmdomain: qcom: cpr: add COMPILE_TEST support
PM: domains: De-constify fields in struct dev_pm_domain_attach_data
pmdomain: qcom: cpr: simplify main allocation
pmdomain: bcm: bcm2835-power: Replace open-coded polling with readl_poll_timeout_atomic()
pmdomain: sunxi: Add support for A733 to Allwinner PCK600 driver
pmdomain: qcom: rpmhpd: Add Eliza RPMh Power Domains
pmdomain: arm: Add print after a successful probe for SCMI power domains
pmdomain: rockchip: quiet regulator error on -EPROBE_DEFER
pmdomain: mediatek: Add power domain driver for MT8189 SoC
pmdomain: mediatek: Add bus protect control flow for MT8189
pmdomain: core: Extend statistics for domain idle states with s2idle data
pmdomain: core: Show latency/residency for domain idle states in debugfs
pmdomain: core: Restructure domain idle states data for genpd in debugfs
pmdomain: qcom: rpmpd: drop stray semicolon
pmdomain: imx: scu-pd: Fix device_node reference leak during ->probe()
pmdomain: ti: omap_prm: Fix a reference leak on device node
pmdomain: mediatek: scpsys: Add MT7622 Audio power domain to legacy driver
pmdomain: mediatek: Simplify with scoped for each OF child loop
...
Merge tag 'mmc-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Add NXP vendor and IW61x device IDs for WiFi chips over SDIO
- Add quirk for incorrect manufacturing date
- Add support for manufacturing date beyond 2025
- Optimize support for secure erase/trim for some Kingston eMMCs
- Remove support for the legacy "enable-sdio-wakeup" DT property
- Use single block writes in the retry path
MMC host:
- dw_mmc:
- A great amount of cleanups/simplifications to improve the code
- Add clk_phase_map support
- Remove mshc DT alias support
- dw_mmc-rockchip:
- Fix runtime PM support for internal phase
- Add support for the RV1103B variant
- loongson2:
- Add support for the Loongson-2K0300 SD/SDIO/eMMC controller
- mtk-sd:
- Add support for the MT8189 variant
- renesas_sdhi_core:
- Add support for selecting an optional mux
- rtsx_pci_sdmmc:
- Simplify voltage switch handling
- sdhci:
- Stop advertising the driver in dmesg
- sdhci-esdhc-imx:
- Add 1-bit bus width support
- Add support for the NXP S32N79 variant
- sdhci-msm:
- Add support for the IPQ5210 and IPQ9650 variants
- Add support for wrapped keys
- Enable ICE for CQE-capable controllers with non-CQE cards
- sdhci-of-arasan:
- Add support for the Axiado AX3000 variant
- sdhci-of-aspeed:
- Add support for the AST2700 variant
- sdhci-of-bst:
- Add driver for the Black Sesame Technologies C1200 controller
- sdhci-of-dwcmshc:
- Add support for the Canaan K230 variant
- Add support for the HPE GSC variant
- Prevent clock glitches to avoid malfunction
- sdhci-of-k1:
- Add support for the K3 variant
mux core/consumers:
- core:
- Add helper functions for getting optional and selected mux-state
- i2c-omap:
- Convert to devm_mux_state_get_optional_selected()
- phy-renesas:
- Convert to devm_mux_state_get_optional_selected()
- phy-can-transceiver:
- Convert to devm_mux_state_get_optional()"
* tag 'mmc-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (131 commits)
mmc: sdhci-msm: Fix the wrapped key handling
mmc: sdhci-of-dwcmshc: Disable clock before DLL configuration
mmc: core: Simplify with scoped for each OF child loop
mmc: core: Optimize size of struct mmc_queue_req
mmc: vub300: clean up module init
mmc: vub300: rename probe error labels
mmc: dw_mmc: Remove dw_mci_start_request wrapper and rename core function
mmc: dw_mmc: Inline dw_mci_queue_request() into dw_mci_request()
mmc: block: Use MQRQ_XFER_SINGLE_BLOCK for both read and write recovery
mmc: mmc_test: Replace hard-coded values with macros and consolidate test parameters
mmc: block: Convert to use DEFINE_SIMPLE_DEV_PM_OPS()
mmc: core: Replace the hard-coded shift value 9 with SECTOR_SHIFT
mmc: sdhci-dwcmshc: Refactor Rockchip platform data for controller revisions
mmc: core: Switch to use pm_ptr() for mmc_host_class_dev_pm_ops
mmc: core: Remove legacy 'enable-sdio-wakeup' DT property support
mmc: mmc_test: use kzalloc_flex
mmc: mtk-sd: disable new_tx/rx and modify related settings for mt8189
dt-bindings: mmc: hisilicon,hi3660-dw-mshc: Convert to DT schema
dt-bindings: mmc: sdhci-msm: add IPQ9650 compatible
mmc: block: use single block write in retry
...
Merge tag 'pwm/for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm updates from Uwe Kleine-König:
"Just two minor fixes, a device tree binding addition to support a few
more SoCs (without the need for driver adaptions), a driver include
cleanup and the addition of the #linux-pwm irc channel to MAINTAINERS"
* tag 'pwm/for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: th1520: fix `CLIPPY=1` warning
pwm: jz4740: Drop unused include
MAINTAINERS: Add #linux-pwm irc channel to pwm entry
dt-bindings: pwm: amlogic: Document A4 A5 and T7 PWM
pwm: imx-tpm: Count the number of enabled channels in probe
Merge tag 'locking_futex_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex selftest updates from Borislav Petkov:
- Correct the version guard for the futex_numa_mpol test to require
libnuma 2.0.18 instead of 2.0.16, which is the version that actually
introduced numa_set_mempolicy_home_node() used by the test
- Allow the futex_numa_mpol selftest to build and run on systems
without libnuma installed with affected test gracefully being skipped
instead of failing to compile
- Use the proper assertion macros so that individual sub-test failures
are correctly propagated and the test suite reports failure when
something goes wrong
* tag 'locking_futex_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/futex: Bump up libnuma version check
selftests/futex: Conditionally include libnuma support
selftests/futex: Fix incorrect result reporting of futex_requeue test item
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by
prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based
metadata tracking system with serialization to a radix tree data
structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and
VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of
a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec",
which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
pages encode their relationship to the head page
Provide some cleanups and slight fixes in the mremap, mmap and vma
code
- "mm/damon: support addr_unit on default monitoring targets for
modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged
mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type
to the new vma_flags_t type which replaces it. Mainly in the vma
code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace
the deprecated f_op->mmap hook which has been the source of bugs and
security issues for some time. Cleanups, documentation, extension of
mmap_prepare into filesystem drivers
Simplify and clean up zap_huge_pmd(). Additional cleanups around
vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm: fix deferred split queue races during migration
mm/khugepaged: fix issue with tracking lock
mm/huge_memory: add and use has_deposited_pgtable()
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
mm/huge_memory: separate out the folio part of zap_huge_pmd()
mm/huge_memory: use mm instead of tlb->mm
mm/huge_memory: remove unnecessary sanity checks
mm/huge_memory: deduplicate zap deposited table call
mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
mm/huge_memory: add a common exit path to zap_huge_pmd()
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
mm/huge: avoid big else branch in zap_huge_pmd()
mm/huge_memory: simplify vma_is_specal_huge()
mm: on remap assert that input range within the proposed VMA
mm: add mmap_action_map_kernel_pages[_full]()
uio: replace deprecated mmap hook with mmap_prepare in uio_info
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
mm: allow handling of stacked mmap_prepare hooks in more drivers
...
selftests/bpf: Test small task local data allocation
Make sure task local data is working correctly for different allocation
sizes. Existing task local data selftests allocate the maximum amount of
data possible but miss the garbage data issue when only small amount of
data is allocated. Therefore, test small data allocations as well.
selftests/bpf: Fix tld_get_data() returning garbage data
BPF side tld_get_data() currently may return garbage when tld_data_u is
not aligned to page_size. This can happen when small amount of memory
is allocated for tld_data_u. The misalignment is supposed to be allowed
and the BPF side will use tld_data_u->start to reference the tld_data_u
in a page. However, since "start" is within tld_data_u, there is no way
to know the correct "start" in the first place. As a result, BPF
programs will see garbage data. The selftest did not catch this since
it tries to allocate the maximum amount of data possible (i.e., a page)
such that tld_data_u->start is always correct.
Fix it by moving tld_data_u->start to tld_data_map->start. The original
field is now renamed as unused instead of removing it because BPF side
tld_get_data() views off = 0 returned from tld_fetch_key() as
uninitialized.
selftests/bpf: Prevent allocating data larger than a page
Fix a bug in the task local data library that may allocate more than a
a page for tld_data_u. This may happen when users set a too large
TLD_DYN_DATA_SIZE, so check it when creating dynamic TLD fields and fix
the corresponding selftest.
Changelog:
v1: https://lore.kernel.org/all/20260413123256.3296452-1-puranjay@kernel.org/
Changes in v2:
- Remove "#include <asm/cacheflush.h>" as it is not needed now.
- Add Acked-by: Song Liu <song@kernel.org>
When the BPF prog pack allocator was added for arm64 and riscv, the
existing bpf_flush_icache() calls were retained after
bpf_jit_binary_pack_finalize(). However, the finalize path copies the
JITed code via architecture text patching routines (__text_poke on arm64,
patch_text_nosync on riscv) that already perform a full
flush_icache_range() internally. The subsequent bpf_flush_icache()
repeats the same cache maintenance on the same range.
Remove the redundant flush and the now-unused bpf_flush_icache()
definitions on both architectures.
====================
bpf, riscv: Remove redundant bpf_flush_icache() after pack allocator finalize
bpf_flush_icache() calls flush_icache_range() to clean the data cache
and invalidate the instruction cache for the JITed code region. However,
since commit 48a8f78c50bd ("bpf, riscv: use prog pack allocator in the
BPF JIT"), this flush is redundant.
bpf_jit_binary_pack_finalize() copies the JITed instructions to the ROX
region via bpf_arch_text_copy() -> patch_text_nosync(), and
patch_text_nosync() already calls flush_icache_range() on the written
range. The subsequent bpf_flush_icache() repeats the same cache
maintenance on an overlapping range.
Remove the redundant bpf_flush_icache() call and its now-unused
definition.
Fixes: 48a8f78c50bd ("bpf, riscv: use prog pack allocator in the BPF JIT") Acked-by: Song Liu <song@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Tested-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260413191111.3426023-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
bpf, arm64: Remove redundant bpf_flush_icache() after pack allocator finalize
bpf_flush_icache() calls flush_icache_range() to clean the data cache
and invalidate the instruction cache for the JITed code region. However,
since commit 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory
management"), this flush is redundant.
bpf_jit_binary_pack_finalize() copies the JITed instructions to the ROX
region via bpf_arch_text_copy() -> aarch64_insn_copy() -> __text_poke(),
and __text_poke() already calls flush_icache_range() on the written
range. The subsequent bpf_flush_icache() repeats the same cache
maintenance on an overlapping range, including an unnecessary second
synchronous IPI to all CPUs via kick_all_cpus_sync().
Remove the redundant bpf_flush_icache() call and its now-unused
definition.
Fixes: 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory management") Acked-by: Song Liu <song@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20260413191111.3426023-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
arena_vm_open() only bumps vml->mmap_count but never registers the
child VMA in arena->vma_list. The vml->vma always points at the
parent VMA, so after parent munmap the pointer dangles. If the child
then calls bpf_arena_free_pages(), zap_pages() reads the stale
vml->vma triggering use-after-free.
Fix this by preventing the arena VMA from being inherited across
fork with VM_DONTCOPY, and preventing VMA splits via the may_split
callback.
Also reject mremap with a .mremap callback returning -EINVAL. A
same-size mremap(MREMAP_FIXED) on the full arena VMA reaches
copy_vma() through the following path:
check_prep_vma() - returns 0 early: new_len == old_len
skips VM_DONTEXPAND check
prep_move_vma() - vm_start == old_addr and
vm_end == old_addr + old_len
so may_split is never called
move_vma()
copy_vma_and_data()
copy_vma()
vm_area_dup() - copies vm_private_data (vml pointer)
vm_ops->open() - bumps vml->mmap_count
vm_ops->mremap() - returns -EINVAL, rollback unmaps new VMA
The refcount ensures the rollback's arena_vm_close does not free
the vml shared with the original VMA.
Reported-by: Weiming Shi <bestswngs@gmail.com> Reported-by: Xiang Mei <xmei5@asu.edu> Fixes: 317460317a02 ("bpf: Introduce bpf_arena.") Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260413194245.21449-1-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 15 Apr 2026 12:14:03 +0000 (14:14 +0200)]
bpf, arm64: Fix off-by-one in check_imm signed range check
check_imm(bits, imm) is used in the arm64 BPF JIT to verify that
a branch displacement (in arm64 instruction units) fits into the
signed N-bit immediate field of a B, B.cond or CBZ/CBNZ encoding
before it is handed to the encoder. The macro currently tests for
(imm > 0 && imm >> bits) || (imm < 0 && ~imm >> bits) which admits
values in [-2^N, 2^N) — effectively a signed (N+1)-bit range. A
signed N-bit field only holds [-2^(N-1), 2^(N-1)), so the check
admits one extra bit of range on each side.
In particular, for check_imm19(), values in [2^18, 2^19) slip past
the check but do not fit into the 19-bit signed imm19 field of
B.cond. aarch64_insn_encode_immediate() then masks the raw value
into the 19-bit field, setting bit 18 (the sign bit) and flipping
a forward branch into a backward one. Same class of issue exists
for check_imm26() and the B/BL encoding. Shift by (bits - 1)
instead of bits so the actual signed N-bit range is enforced.
Daniel Borkmann [Wed, 15 Apr 2026 12:14:02 +0000 (14:14 +0200)]
bpf, arm64: Reject out-of-range B.cond targets
aarch64_insn_gen_cond_branch_imm() calls label_imm_common() to
compute a 19-bit signed byte offset for a conditional branch,
but unlike its siblings aarch64_insn_gen_branch_imm() and
aarch64_insn_gen_comp_branch_imm(), it does not check whether
label_imm_common() returned its out-of-range sentinel (range)
before feeding the value to aarch64_insn_encode_immediate().
aarch64_insn_encode_immediate() unconditionally masks the value
with the 19-bit field mask, so an offset that was rejected by
label_imm_common() gets silently truncated. With the sentinel
value SZ_1M, the resulting field ends up with bit 18 (the sign
bit of the 19-bit signed displacement) set, and the CPU decodes
it as a ~1 MiB *backward* branch, producing an incorrectly
targeted B.cond instruction. For code-gen locations like the
emit_bpf_tail_call() this function is the only barrier between
an overflowing displacement and a silently miscompiled branch.
Fix it by returning AARCH64_BREAK_FAULT when the offset is out
of range, so callers see a loud failure instead of a silently
misencoded branch. validate_code() scans the generated image
for any AARCH64_BREAK_FAULT and then lets the JIT fail.
Merge tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext updates from Tejun Heo:
- cgroup sub-scheduler groundwork
Multiple BPF schedulers can be attached to cgroups and the dispatch
path is made hierarchical. This involves substantial restructuring of
the core dispatch, bypass, watchdog, and dump paths to be
per-scheduler, along with new infrastructure for scheduler ownership
enforcement, lifecycle management, and cgroup subtree iteration
The enqueue path is not yet updated and will follow in a later cycle
- scx_bpf_dsq_reenq() generalized to support any DSQ including remote
local DSQs and user DSQs
Built on top of this, SCX_ENQ_IMMED guarantees that tasks dispatched
to local DSQs either run immediately or get reenqueued back through
ops.enqueue(), giving schedulers tighter control over queueing
latency
Also useful for opportunistic CPU sharing across sub-schedulers
- ops.dequeue() was only invoked when the core knew a task was in BPF
data structures, missing scheduling property change events and
skipping callbacks for non-local DSQ dispatches from ops.select_cpu()
Fixed to guarantee exactly one ops.dequeue() call when a task leaves
BPF scheduler custody
- Kfunc access validation moved from runtime to BPF verifier time,
removing runtime mask enforcement
- Idle SMT sibling prioritization in the idle CPU selection path
- Documentation, selftest, and tooling updates. Misc bug fixes and
cleanups
* tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (134 commits)
tools/sched_ext: Add explicit cast from void* in RESIZE_ARRAY()
sched_ext: Make string params of __ENUM_set() const
tools/sched_ext: Kick home CPU for stranded tasks in scx_qmap
sched_ext: Drop spurious warning on kick during scheduler disable
sched_ext: Warn on task-based SCX op recursion
sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok()
sched_ext: Remove runtime kfunc mask enforcement
sched_ext: Add verifier-time kfunc context filter
sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup()
sched_ext: Decouple kfunc unlocked-context check from kf_mask
sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking
sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask
sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked
sched_ext: Drop TRACING access to select_cpu kfuncs
selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message
sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code
selftests/sched_ext: Improve runner error reporting for invalid arguments
sched_ext: Documentation: Fix scx_bpf_move_to_local kfunc name
sched_ext: Documentation: Add ops.dequeue() to task lifecycle
tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
...
Merge tag 'wq-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
- New default WQ_AFFN_CACHE_SHARD affinity scope subdivides LLCs into
smaller shards to improve scalability on machines with many CPUs per
LLC
- Misc:
- system_dfl_long_wq for long unbound works
- devm_alloc_workqueue() for device-managed allocation
- sysfs exposure for ordered workqueues and the EFI workqueue
- removal of HK_TYPE_WQ from wq_unbound_cpumask
- various small fixes
* tag 'wq-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (21 commits)
workqueue: validate cpumask_first() result in llc_populate_cpu_shard_id()
workqueue: use NR_STD_WORKER_POOLS instead of hardcoded value
workqueue: avoid unguarded 64-bit division
docs: workqueue: document WQ_AFFN_CACHE_SHARD affinity scope
workqueue: add test_workqueue benchmark module
tools/workqueue: add CACHE_SHARD support to wq_dump.py
workqueue: set WQ_AFFN_CACHE_SHARD as the default affinity scope
workqueue: add WQ_AFFN_CACHE_SHARD affinity scope
workqueue: fix typo in WQ_AFFN_SMT comment
workqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask
workqueue: unlink pwqs from wq->pwqs list in alloc_and_link_pwqs() error path
workqueue: Remove NULL wq WARN in __queue_delayed_work()
workqueue: fix parse_affn_scope() prefix matching bug
workqueue: devres: Add device-managed allocate workqueue
workqueue: Add system_dfl_long_wq for long unbound works
tools/workqueue/wq_dump.py: add NODE prefix to all node columns
tools/workqueue/wq_dump.py: fix column alignment in node_nr/max_active section
tools/workqueue/wq_dump.py: remove backslash separator from node_nr/max_active header
efi: Allow to expose the workqueue via sysfs
workqueue: Allow to expose ordered workqueues via sysfs
...
Merge tag 'cgroup-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cgroup_file_notify() locking converted from a global lock to
per-cgroup_file spinlock with a lockless fast-path when no
notification is needed
- Misc changes including exposing cgroup helpers for sched_ext and
minor fixes
* tag 'cgroup-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/rdma: fix swapped arguments in pr_warn() format string
cgroup/dmem: remove region parameter from dmemcg_parse_limit
cgroup: replace global cgroup_file_kn_lock with per-cgroup_file lock
cgroup: add lockless fast-path checks to cgroup_file_notify()
cgroup: reduce cgroup_file_kn_lock hold time in cgroup_file_notify()
cgroup: Expose some cgroup helpers
Merge tag 'slab-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Sheaves performance improvements for systems with memoryless NUMA
nodes, developed in response to regression reports.
These mainly ensure that percpu sheaves exist and are used on cpus
that belong to these memoryless nodes (Vlastimil Babka, Hao Li).
- Cleanup API usage and constify sysfs attributes (Thomas Weißschuh)
- Disable kfree_rcu() batching on builds intended for fuzzing/debugging
that enable CONFIG_RCU_STRICT_GRACE_PERIOD (Jann Horn)
- Add a kunit test for kmalloc_nolock()/kfree_nolock() (Harry Yoo)
* tag 'slab-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slub: clarify kmem_cache_refill_sheaf() comments
lib/tests/slub_kunit: add a test case for {kmalloc,kfree}_nolock
MAINTAINERS: add lib/tests/slub_kunit.c to SLAB ALLOCATOR section
slub: use N_NORMAL_MEMORY in can_free_to_pcs to handle remote frees
slab,rcu: disable KVFREE_RCU_BATCHED for strict grace period
slab: free remote objects to sheaves on memoryless nodes
slab: create barns for online memoryless nodes
slab: decouple pointer to barn from kmem_cache_node
slab: remove alloc_full_sheaf()
mm/slab: constify sysfs attributes
mm/slab: create sysfs attribute through default_groups
Merge tag 'sound-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"Nothing too thrilling here, but we see lots of driver updates and bug
fixes, including quirk additions and refactoring works, while there
have been little changes in the core functionality. Here are some
highlights:
Core:
- Add validation for the control API put callback
- Fixes in compress-offload API timestamp handling
- Continued ASoC core API cleanups
ASoC:
- Add support for bus keepers (for Apple devices in future)
- Enhancements to the SDCA support, including retaskable jacks
- Test improvements for Cirrus Logic drivers
- Lots of fixes for the NXP, nVidia and Qualcomm
- Support for AMD RPL DMIC, Cirrus Logic CS42L43 and CS47L47, nVidia
machines with CPCAP and WM8962
USB-audio:
- Quirks for Huawei Headset, Focusrite Novation, MV-Silicon, Studio
1824, Arturia AF16Rig, Hotone Audio, Feaulle Rainbow, PreSonus
AudioBox, Moondrop Ju Jiu, Scarlett 18i20, etc
- Extended mixer volume quirk handling
- UAF and other fixes for us144mkii, 6fire and caiaq drivers
HD-audio:
- Add quirks or fixes for Acer, Lenovo, HP, ASUS machines
- Fixes & cleanups of GPIO helper code
Misc:
- Add suspend/resume support for multiple legacy ISA and Apple
drivers
- Further regression fixes for ctxfi driver"
* tag 'sound-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (359 commits)
ALSA: control: Validate buf_len before strnlen() in snd_ctl_elem_init_enum_names()
ALSA: usb-audio: Fix missing error handling for get_min_max*()
ALSA: hda/realtek - fixed speaker no sound update
ALSA: hda/realtek: Add quirk for Acer PT316-51S headset mic
ALSA: usb-audio: Exclude Scarlett 18i20 1st Gen from SKIP_IFACE_SETUP
ALSA: hda/realtek: Add quirk for Legion S7 15IMH
ALSA: hda/realtek: Add quirk for HP Spectre x360 14-ea
ALSA: caiaq: take a reference on the USB device in create_card()
ASoC: dt-bindings: rockchip: convert rk3399-gru-sound to DT Schema
ALSA: sscape: Add suspend and resume support
ALSA: sscape: Cache per-card resources for board reinitialization
ALSA: usb-audio: Do not expose sticky mixers
ALSA: usb-audio: Move volume control resolution check into a function
ALSA: usb-audio: Add error checks against get_min_max*()
ALSA: usb-audio: Add quirk for PreSonus AudioBox USB
ALSA: interwave: guard PM-only restore helpers with CONFIG_PM
ALSA: usb-audio: Evaluate packsize caps at the right place
ALSA: sc6000: Restore board setup across suspend
ALSA: sc6000: Keep the programmed board state in card-private data
ALSA: 6fire: fix use-after-free on disconnect
...
Merge tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"Highlights:
- new DRM RAS infrastructure using netlink
- amdgpu: enable DC on CIK APUs, and more IP enablement, and more
user queue work
- xe: purgeable BO support, and new hw enablement
- dma-buf : add revocable operations
math:
- provide __KERNEL_DIV_ROUND_CLOSEST() in UAPI
- implement DIV_ROUND_CLOSEST() with __KERNEL_DIV_ROUND_CLOSEST()
rust:
- shared tag with driver-core: register macro and io infra
- core: rework DMA coherent API
- core: add interop::list to interop with C linked lists
- core: add more num::Bounded operations
- core: enable generic_arg_infer and add EMSGSIZE
- workqueue: add ARef<T> support for work and delayed work
- add GPU buddy allocator abstraction
- add DRM shmem GEM helper abstraction
- allow drm:::Device to dispatch work and delayed work items
to driver private data
- add dma_resv_lock helper and raw accessors
core:
- introduce DRM RAS infrastructure over netlink
- add connector panel_type property
- fourcc: add ARM interleaved 64k modifier
- colorop: add destroy helper
- suballoc: split into alloc and init helpers
- mode: provide DRM_ARGB_GET*() macros for reading color components
edid:
- provide drm_output_color_Format
dma-buf:
- provide revoke mechanism for shared buffers
- rename move_notify to invalidate_mappings
- always enable move_notify
- protect dma_fence_ops with RCU and improve locking
- clean pages with helpers
atomic:
- allocate drm_private_state via callback
- helper: use system_percpu_wq
buddy:
- make buddy allocator available to gpu level
- add kernel-doc for buddy allocator
- improve aligned allocation
ttm:
- fix fence signalling
- improve tests and docs
- improve handling of gfp_retry_mayfail
- use per-node stat counters to track memory allocations
- port pool to use list_lru
- drop NUMA specific pools
- make pool shrinker numa aware
- track allocated pages per numa node
coreboot:
- cleanup coreboot framebuffer support
sched:
- fix race condition in drm_sched_fini
pagemap:
- enable THP support
- pass pagemap_addr by reference
gem-shmem:
- Track page accessed/dirty status across mmap/vmap
bridge:
- anx7625: Support USB-C plus DT bindings
- connector: Fix EDID detection
- dw-hdmi-qp: Support Vendor-Specfic and SDP Infoframes; improve
others
- fsl-ldb: Fix visual artifacts plus related DT property
'enable-termination-resistor'
- imx8qxp-pixel-link: Improve bridge reference handling
- lt9611: Support Port-B-only input plus DT bindings
- tda998x: Support DRM_BRIDGE_ATTACH_NO_CONNECTOR; Clean up
- Support TH1520 HDMI plus DT bindings
- waveshare-dsi: Fix register and attach; Support 1..4 DSI lanes plus
DT bindings
- anx7625: Fix USB Type-C handling
- cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check
- Support Lontium LT8713SX DP MST bridge plus DT bindings
- analogix_dp: Use DP helpers for link training
panel:
- panel-jdi-lt070me05000: Use mipi-dsi multi functions
- panel-edp: Support Add AUO B116XAT04.1 (HW: 1A); Support CMN
N116BCL-EAK (C2); Support FriendlyELEC plus DT changes
- panel-edp: Fix timings for BOE NV140WUM-N64
- ilitek-ili9882t: Allow GPIO calls to sleep
- jadard: Support TAIGUAN XTI05101-01A
- lxd: Support LXD M9189A plus DT bindings
- mantix: Fix pixel clock; Clean up
- motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings
- novatek: Support Novatek/Tianma NT37700F plus DT bindings
- simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip
PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3"
- novatek-nt36672a: Use mipi_dsi_*_multi() functions
- panel-edp: Support BOE NV153WUM-N42, CMN N153JCA-ELK, CSW
MNF307QS3-2
- support Himax HX83121A plus DT bindings
- support JuTouch JT070TM041 plus DT bindings
- support Samsung S6E8FC0 plus DT bindings
- himax-hx83102c: support Samsung S6E8FC0 plus DT bindings; support
backlight
- ili9806e: support Rocktech RK050HR345-CT106A plus DT bindings
- simple: support Tianma TM050RDH03 plus DT bindings
amdgpu:
- enable DC by default on CIK APUs
- userq fence ioctl param size fixes
- set panel_type to OLED for eDP
- refactor DC i2c code
- FAMS2 update
- rework ttm handling to allow multiple engines
- DC DCE 6.x cleanup
- DC support for NUTMEG/TRAVIS DP bridge
- DCN 4.2 support
- GC12 idle power fix for compute
- use struct drm_edid in non-DC code
- enable NV12/P010 support on primary planes
- support newer IP discovery tables
- VCN/JPEG 5.0.2 support
- GC/MES 12.1 updates
- USERQ fixes
- add DC idle state manager
- eDP DSC seamless boot
amdkfd:
- GC 12.1 updates
- non 4K page fixes
xe:
- basic Xe3p_LPG and NVL-P enabling patches
- allow VM_BIND decompress support
- add purgeable buffer object support
- add xe_vm_get_property_ioctl
- restrict multi-lrc to VCS/VECS engines
- allow disabling VM overcommit in fault mode
- dGPU memory optimizations
- Workaround cleanups and simplification
- Allow VFs VRAM quote changes using sysfs
- convert GT stats to per-cpu counters
- pagefault refactors
- enable multi-queue on xe3p_xpc
- disable DCC on PTL
- make MMIO communication more robust
- disable D3Cold for BMG on specific platforms
- vfio: improve FLR sync for Xe VFIO
i915/display:
- C10/C20/LT PHY PLL divider verification
- use trans push mechanism to generate PSR frame change on LNL+
- refactor DP DSC slice config
- VGA decode refactoring
- refactor DPT, gen2-4 overlay, masked field register macro helpers
- refactor stolen memory allocation decisions
- prepare for UHBR DP tunnels
- refactor LT PHY PLL to use DPLL framework
- implement register polling/waiting in display code
- add shared stepping header between i915 and display
i915:
- fix potential overflow of shmem scatterlist length
nouveau:
- provide Z cull info to userspace
- initial GA100 support
- shutdown on PCI device shutdown
nova-core:
- harden GSP command queue
- add support for large RPCs
- simplify GSP sequencer and message handling
- refactor falcon firmware handling
- convert to new register macro
- conver to new DMA coherent API
- use checked arithmetic
- add debugfs support for gsp-rm log buffers
- fix aux device registration for multi-GPU
msm:
- CI:
- Uprev mesa
- Restore CI jobs for Qualcomm APQ8016 and APQ8096 devices
- Core:
- Switched to of_get_available_child_by_name()
- DPU:
- Fixes for DSC panels
- Fixed brownout because of the frequency / OPP mismatch
- Quad pipe preparation (not enabled yet)
- Switched to virtual planes by default
- Dropped VBIF_NRT support
- Added support for Eliza platform
- Reworked alpha handling
- Switched to correct CWB definitions on Eliza
- Dropped dummy INTF_0 on MSM8953
- Corrected INTFs related to DP-MST
- DP:
- Removed debug prints looking into PHY internals
- DSI:
- Fixes for DSC panels
- RGB101010 support
- Support for SC8280XP
- Moved PHY bindings from display/ to phy/
- GPU:
- Preemption support for x2-85 and a840
- IFPC support for a840
- SKU detection support for x2-85 and a840
- Expose AQE support (VK ray-pipeline)
- Avoid locking in VM_BIND fence signaling path
- Fix to avoid reclaim in GPU snapshot path
- Disallow foreign mapping of _NO_SHARE BOs
- HDMI:
- Fixed infoframes programming
- MDP5:
- Dropped support for MSM8974v1
- Dropped now unused code for MSM8974 v1 and SDM660 / MSM8998
panthor:
- add tracepoints for power and IRQs
- fix fence handling
- extend timestamp query with flags
- support various sources for timestamp queries
tyr:
- fix names and model/versions
rockchip:
- vop2: use drm logging function
- rk3576 displayport support
- support CRTC background color
atmel-hlcdc:
- support sana5d65 LCD controller
tilcdc:
- use DT bindings schema
- use managed DRM interfaces
- support DRM_BRIDGE_ATTACH_NO_CONNECTOR
verisilicon:
- support DC8200 + DT bindings
virtgpu:
- support PRIME import with 3D enabled
komeda:
- fix integer overflow in AFBC checks
mcde:
- improve bridge handling
gma500:
- use drm client buffer for fbdev framebuffer
amdxdna:
- add sensors ioctls
- provide NPU power estimate
- support column utilization sensor
- allow forcing DMA through IOMMU IOVA
- support per-BO mem usage queries
- refactor GEM implementation
ivpu:
- update boot API to v3.29.4
- limit per-user number of doorbells/contexts
- perform engine reset on TDR error
loongson:
- replace custom code with drm_gem_ttm_dumb_map_offset()
imx:
- support planes behind the primary plane
- fix bus-format selection
vkms:
- support CRTC background color
v3d:
- improve handling of struct v3d_stats
komeda:
- support Arm China Linlon D6 plus DT bindings
imagination:
- improve power-off sequence
- support context-reset notification from firmware
mediatek:
- mtk_dsi: enable hs clock during pre-enable
- Remove all conflicting aperture devices during probe
- Add support for mt8167 display blocks"
* tag 'drm-next-2026-04-15' of https://gitlab.freedesktop.org/drm/kernel: (1735 commits)
drm/ttm/tests: Remove checks from ttm_pool_free_no_dma_alloc
drm/ttm/tests: fix lru_count ASSERT
drm/vram: remove DRM_VRAM_MM_FILE_OPERATIONS from docs
drm/fb-helper: Fix a locking bug in an error path
dma-fence: correct kernel-doc function parameter @flags
ttm/pool: track allocated_pages per numa node.
ttm/pool: make pool shrinker NUMA aware (v2)
ttm/pool: drop numa specific pools
ttm/pool: port to list_lru. (v2)
drm/ttm: use gpu mm stats to track gpu memory allocations. (v4)
mm: add gpu active/reclaim per-node stat counters (v2)
gpu: nova-core: fix missing colon in SEC2 boot debug message
gpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing
gpu: nova-core: bitfield: fix broken Default implementation
gpu: nova-core: falcon: pad firmware DMA object size to required block alignment
gpu: nova-core: gsp: fix undefined behavior in command queue code
drm/shmem_helper: Make sure PMD entries get the writeable upgrade
accel/ivpu: Trigger recovery on TDR with OS scheduling
drm/msm: Use of_get_available_child_by_name()
dt-bindings: display/msm: move DSI PHY bindings to phy/ subdir
...
====================
bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
When the static arg tracking analysis encounters a store through a
pointer with imprecise or multi-offset destination, it must use weak
updates (join) instead of strong updates (overwrite) for the affected
at_stack slots. At runtime only one slot is actually written; the
others retain their old values.
Two cases are addressed:
- BPF_STX, handled by spill_to_stack(). It was gated on
`dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
pointers entirely.
- BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
clear_overlapping_stack_slots() which unconditionally set
`at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
when multiple candidate slots exist (cnt != 1), so that untouched
slots preserve their tracked values.
No veristat diff compared to current master when tested on selftests,
sched_ext, cilium and a set of Meta internal programs.
This addresses issues reported by sashiko for patch #7 in [1].
Changelog:
v2 -> v3:
- Use check_add_overflow() in arg_add() (Alexei).
- Add missing fixes tag (CI bot).
- Remove unused __imm in the selftest (sashiko).
v1 -> v2:
- Delete the OFF_IMPRECISE constant, always rely on
arg_track->cnt == 0 as a marker the offset is imprecise.
(Alexei).
- Squash all patches together to simplify backporting to
'bpf' branch (Alexei).
Eduard Zingerman [Mon, 13 Apr 2026 23:30:53 +0000 (16:30 -0700)]
selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
Add test cases for clear_stack_for_all_offs and dst_is_local_fp
handling of multi-offset and ARG_IMPRECISE stack pointers:
- st_imm_join_with_multi_off: BPF_ST through multi-offset dst should
join at_stack with none instead of overwriting both candidate slots.
- st_imm_join_with_imprecise_off: BPF_ST through offset-imprecise dst
should join at_stack with none instead of clearing all slots.
- st_imm_join_with_single_off: a canary checking that BPF_ST with a
known offset overwrites slot instead of joining.
- imprecise_dst_spill_join: BPF_STX through ARG_IMPRECISE dst should
be recognized as a local spill and join at_stack with the written
value.
Eduard Zingerman [Mon, 13 Apr 2026 23:30:52 +0000 (16:30 -0700)]
bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
BPF_STX through ARG_IMPRECISE dst should be recognized as a local
spill and join at_stack with the written value. For example,
consider the following situation:
BPF_ST through multi-offset or imprecise dst should join at_stack with
none instead of overwriting the slots. For example, consider the
following situation:
Move the definition of the clear_overlapping_stack_slots() in order to
have __arg_track_join() visible. Remove the OFF_IMPRECISE constant to
avoid having two ways to express imprecise offset.
Only 'offset-imprecise {frame=N, cnt=0}' remains.
Merge tag 'fbdev-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"A major refactorization by Thomas Zimmermann from SUSE regarding
handling of console font data, addition of helpers for console font
rotation and split into individual components for glyphs, fonts and
the overall fbcon state.
And there is the round of usual code cleanups and fixes:
Cleanups:
- atyfb: Remove unused fb_list (Geert Uytterhoeven)
- goldfishfb, wmt_ge_rops: use devm_platform_ioremap_resource() (Amin GATTOUT)
- matroxfb: Mark variable with __maybe_unused (Andy Shevchenko)
- omapfb: Add missing error check for clk_get() (Chen Ni)
- tdfxfb: Make the VGA register initialisation a bit more obvious (Daniel Palmer)
- macfb: Replace deprecated strcpy with strscpy (Thorsten Blum)
Fixes:
- tdfxfb, udlfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO (Greg Kroah-Hartman)
- omap2: fix inconsistent lock returns in omapfb_mmap (Hongling Zeng)
- viafb: check ioremap return value in viafb_lcd_get_mobile_state (Wang Jun)"
* tag 'fbdev-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (40 commits)
fbdev: udlfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO
fbdev: tdfxfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO
fbdev: omap2: fix inconsistent lock returns in omapfb_mmap
MAINTAINERS: Add dedicated entry for fbcon
fbcon: Put font-rotation state into separate struct
fbcon: Fill cursor mask in helper function
lib/fonts: Implement font rotation
lib/fonts: Refactor glyph-rotation helpers
lib/fonts: Refactor glyph-pattern helpers
lib/fonts: Implement glyph rotation
lib/fonts: Clean up Makefile
lib/fonts: Provide helpers for calculating glyph pitch and size
vt: Implement helpers for struct vc_font in source file
fbcon: Avoid OOB font access if console rotation fails
fbdev: atyfb: Remove unused fb_list
fbdev: matroxfb: Mark variable with __maybe_unused to avoid W=1 build break
fbdev: update help text for CONFIG_FB_NVIDIA
fbdev: omapfb: Add missing error check for clk_get()
fbdev: viafb: check ioremap return value in viafb_lcd_get_mobile_state
lib/fonts: Remove internal symbols and macros from public header file
...
selftests/bpf: Fix timer_start_deadlock failure due to hrtimer change
Since commit f2e388a019e4 ("hrtimer: Reduce trace noise in hrtimer_start()"),
hrtimer_cancel tracepoint is no longer called when a hrtimer is re-armed. So
instead of a hrtimer_cancel followed by hrtimer_start tracepoint events, there
is now only a since hrtimer_start tracepoint event with the new was_armed field
set to 1, to indicated that the hrtimer was previously armed.
Update timer_start_deadlock accordingly so it traces hrtimer_start tracepoint
instead, with was_armed used as guard.
[CAUSE]
ocfs2_group_add() calls ocfs2_set_new_buffer_uptodate() on a
user-controlled group block before ocfs2_verify_group_and_input()
validates that block number. That helper is only valid for newly
allocated metadata and asserts that the block is not already present in
the chosen metadata cache. The code also uses INODE_CACHE(inode) even
though the group descriptor belongs to main_bm_inode and later journal
accesses use that cache context instead.
[FIX]
Validate the on-disk group descriptor before caching it, then add it to
the metadata cache tracked by INODE_CACHE(main_bm_inode). Keep the
validation failure path separate from the later cleanup path so we only
remove the buffer from that cache after it has actually been inserted.
This keeps the group buffer lifetime consistent across validation,
journaling, and cleanup.
Link: https://lkml.kernel.org/r/20260410020209.3786348-1-gality369@gmail.com Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[CAUSE]
ocfs2_info_freefrag_scan_chain() uses on-disk bg_bits directly as the
bitmap scan limit. The coherent path reads group descriptors through
ocfs2_read_group_descriptor(), which validates the descriptor before
use. The non-coherent path uses ocfs2_read_blocks_sync() instead and
skips that validation, so an impossible bg_bits value can drive the
bitmap walk past the end of the block.
[FIX]
Compute the bitmap capacity from the filesystem format with
ocfs2_group_bitmap_size(), report descriptors whose bg_bits exceeds
that limit, and clamp the scan to the computed capacity. This keeps the
freefrag report going while avoiding reads beyond the buffer.
Link: https://lkml.kernel.org/r/20260410034220.3825769-1-gality369@gmail.com Fixes: d24a10b9f8ed ("Ocfs2: Add a new code 'OCFS2_INFO_FREEFRAG' for o2info ioctl.") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: fix listxattr handling when the buffer is full
[BUG]
If an OCFS2 inode has both inline and block-based xattrs, listxattr()
can return a size larger than the caller's buffer when the inline names
consume that buffer exactly.
[CAUSE]
Commit 936b8834366e ("ocfs2: Refactor xattr list and remove
ocfs2_xattr_handler().") replaced the old per-handler list accounting
with ocfs2_xattr_list_entry(), but it kept using size == 0 to detect
probe mode.
That assumption stops being true once ocfs2_listxattr() finishes the
inline-xattr pass. If the inline names fill the caller buffer exactly,
the block-xattr pass runs with a non-NULL buffer and a remaining size of
zero. ocfs2_xattr_list_entry() then skips the bounds check, keeps
counting block names, and returns a positive size larger than the
supplied buffer.
[FIX]
Detect probe mode by testing whether the destination buffer pointer is
NULL instead of whether the remaining size is zero.
That restores the pre-refactor behavior and matches the OCFS2 getxattr
helpers. Once the remaining buffer reaches zero while more names are
left, the block-xattr pass now returns -ERANGE instead of reporting a
size larger than the allocated list buffer.
Link: https://lkml.kernel.org/r/20260410040339.3837162-1-gality369@gmail.com Fixes: 936b8834366e ("ocfs2: Refactor xattr list and remove ocfs2_xattr_handler().") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sean Anderson [Tue, 7 Apr 2026 16:47:21 +0000 (12:47 -0400)]
update Sean's email address
Soon I will no longer be working at SECO. Update the mailmap to redirect
to my linux.dev address which I still have access to.
Link: https://lkml.kernel.org/r/20260407164722.211610-1-sean.anderson@linux.dev Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Cc: Sean Anderson <sean.anderson@seco.com> Cc: Conor Dooley <conor+dt@kernel.org> Cc: Daniel Lezcano <daniel.lezcano@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Carlier [Sun, 5 Apr 2026 15:47:20 +0000 (16:47 +0100)]
ocfs2: use get_random_u32() where appropriate
Use the typed random integer helpers instead of get_random_bytes() when
filling a single integer variable. The helpers return the value directly,
require no pointer or size argument, and better express intent.
Link: https://lkml.kernel.org/r/20260405154720.4732-1-devnexen@gmail.com Signed-off-by: David Carlier <devnexen@gmail.com Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: split transactions in dio completion to avoid credit exhaustion
During ocfs2 dio operations, JBD2 may report warnings via following
call trace:
ocfs2_dio_end_io_write
ocfs2_mark_extent_written
ocfs2_change_extent_flag
ocfs2_split_extent
ocfs2_try_to_merge_extent
ocfs2_extend_rotate_transaction
ocfs2_extend_trans
jbd2__journal_restart
start_this_handle
output: JBD2: kworker/6:2 wants too many credits credits:5450 rsv_credits:0 max:5449
To prevent exceeding the credits limit, modify ocfs2_dio_end_io_write() to
handle extents in a batch of transaction.
Additionally, relocate ocfs2_del_inode_from_orphan(). The orphan inode
should only be removed from the orphan list after the extent tree update
is complete. This ensures that if a crash occurs in the middle of extent
tree updates, we won't leave stale blocks beyond EOF.
This patch also changes the logic for updating the inode size and removing
orphan, making it similar to ext4_dio_write_end_io(). Both operations are
performed only when everything looks good.
Finally, thanks to Jans and Joseph for providing the bug fix prototype and
suggestions.
Link: https://lkml.kernel.org/r/20260402134328.27334-2-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joseph Qi [Fri, 3 Apr 2026 09:08:03 +0000 (17:08 +0800)]
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
The l_next_free_rec > l_count check after ocfs2_read_extent_block() in
__ocfs2_find_path() is now redundant, as ocfs2_validate_extent_block()
already performs this validation at block read time.
Remove the duplicate check to avoid maintaining the same validation in two
places.
Link: https://lkml.kernel.org/r/20260403090803.3860971-5-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joseph Qi [Fri, 3 Apr 2026 09:08:02 +0000 (17:08 +0800)]
ocfs2: validate extent block list fields during block read
Add extent list validation to ocfs2_validate_extent_block() so that
corrupted on-disk fields are caught early at block read time rather than
during extent tree traversal.
Two checks are added:
- l_count must equal the expected value from
ocfs2_extent_recs_per_eb(), catching blocks with a corrupted record
count before any array iteration.
- l_next_free_rec must not exceed l_count, preventing out-of-bounds
access when iterating over extent records.
Link: https://lkml.kernel.org/r/20260403090803.3860971-4-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joseph Qi [Fri, 3 Apr 2026 09:08:01 +0000 (17:08 +0800)]
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
The full extent list check is introduced by commit 44acc46d182f, which is
to avoid NULL pointer dereference if a dirent is not found.
Reworking the error message to not reference rec. Instead, report
major_hash being looked up and l_next_free_rec, which naturally covers
both failure cases (empty extent list and no matching record) without
needing a separate l_next_free_rec == 0 guard.
Link: https://lkml.kernel.org/r/20260403090803.3860971-3-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joseph Qi [Fri, 3 Apr 2026 09:08:00 +0000 (17:08 +0800)]
ocfs2: validate dx_root extent list fields during block read
Patch series "ocfs2: consolidate extent list validation into block read
callbacks".
ocfs2 validates extent list fields (l_count, l_next_free_rec) at various
points during extent tree traversal. This is fragile because each caller
must remember to check for corrupted on-disk data before using it.
This series moves those checks into the block read validation callbacks
(ocfs2_validate_dx_root and ocfs2_validate_extent_block), so corrupted
fields are caught early at block read time. Redundant post-read checks
are then removed.
This patch (of 4):
Move the extent list l_count validation from ocfs2_dx_dir_lookup_rec()
into ocfs2_validate_dx_root(), so that corrupted on-disk fields are caught
early at block read time rather than during directory lookups.
Additionally, add a l_next_free_rec <= l_count check to prevent
out-of-bounds access when iterating over extent records.
Both checks are skipped for inline dx roots (OCFS2_DX_FLAG_INLINE), which
use dr_entries instead of dr_list.
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
filemap_fault() may drop the mmap_lock before returning VM_FAULT_RETRY,
as documented in mm/filemap.c:
"If our return value has VM_FAULT_RETRY set, it's because the mmap_lock
may be dropped before doing I/O or by lock_folio_maybe_drop_mmap()."
When this happens, a concurrent munmap() can call remove_vma() and free
the vm_area_struct via RCU. The saved 'vma' pointer in ocfs2_fault() then
becomes a dangling pointer, and the subsequent trace_ocfs2_fault() call
dereferences it -- a use-after-free.
Fix this by saving ip_blkno as a plain integer before calling
filemap_fault(), and removing vma from the trace event. Since
ip_blkno is copied by value before the lock can be dropped, it
remains valid regardless of what happens to the vma or inode
afterward.
Link: https://lkml.kernel.org/r/20260410083816.34951-1-tejas.bharambe@outlook.com Fixes: 614a9e849ca6 ("ocfs2: Remove FILE_IO from masklog.") Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com> Reported-by: syzbot+a49010a0e8fcdeea075f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a49010a0e8fcdeea075f Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[CAUSE]
ocfs2_group_extend() assumes that the global bitmap inode block
returned from ocfs2_inode_lock() has already been validated and
BUG_ONs when the signature is not a dinode. That assumption is too
strong for crafted filesystems because the JBD2-managed buffer path
can bypass structural validation and return an invalid dinode to the
resize ioctl.
[FIX]
Validate the dinode explicitly in ocfs2_group_extend(). If the global
bitmap buffer does not contain a valid dinode, report filesystem
corruption with ocfs2_error() and fail the resize operation instead of
crashing the kernel.
Link: https://lkml.kernel.org/r/20260401092303.3709187-1-gality369@gmail.com Fixes: 10995aa2451a ("ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: validate bg_list extent bounds in discontig groups
[BUG]
Running ocfs2 on a corrupted image with a discontiguous block
group whose bg_list.l_next_free_rec is set to an excessively
large value triggers a KASAN use-after-free crash:
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
[CAUSE]
ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
using l_next_free_rec as the upper bound without any sanity check:
for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
rec = &bg->bg_list.l_recs[i];
l_next_free_rec is read directly from the on-disk group descriptor and
is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
crafted filesystem image can set l_next_free_rec to an arbitrarily
large value, causing the loop to index past the end of the group
descriptor buffer_head data page and into an adjacent freed page.
[FIX]
Validate discontiguous bg_list.l_count against
ocfs2_extent_recs_per_gd(sb), then reject l_next_free_rec values that
exceed l_count. This keeps the on-disk extent list self-consistent and
matches how the rest of ocfs2 uses l_count as the extent-list bound.
Link: https://lkml.kernel.org/r/20260401021622.3560952-1-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Taylor Nelms [Tue, 31 Mar 2026 18:15:09 +0000 (14:15 -0400)]
checkpatch: exclude forward declarations of const structs
Limit checkpatch warnings for normally-const structs by excluding patterns
consistent with forward declarations.
For example, the forward declaration `struct regmap_access_table;` in a
header file currently generates a warning recommending that it is
generally declared as const; however, this would apply a useless type
qualifier in the empty declaration `const struct regmap_access_table;`,
and subsequently generate compiler warnings.
Link: https://lkml.kernel.org/r/20260331181509.1258693-1-tknelms@google.com Signed-off-by: Taylor Nelms <tknelms@google.com> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
procacct and getdelays use a fixed receive buffer for taskstats generic
netlink messages. A multi-threaded process exit can emit a single
PID+TGID notification large enough to exceed that buffer on newer kernels.
Switch to recvmsg() so MSG_TRUNC is detected explicitly, increase the
message buffer size, and report truncated datagrams clearly instead of
misparsing them as fatal netlink errors.
Also print the taskstats version in debug output to make version
mismatches easier to diagnose while inspecting taskstats traffic.
Link: https://lkml.kernel.org/r/520308bb4cbbaf8dc2c7296b5f60f11e12fb30a5.1774810498.git.cyyzero16@gmail.com Signed-off-by: Yiyang Chen <cyyzero16@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Wang Yaxin <wang.yaxin@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yiyang Chen [Sun, 29 Mar 2026 19:00:40 +0000 (03:00 +0800)]
taskstats: set version in TGID exit notifications
delay accounting started populating taskstats records with a valid version
field via fill_pid() and fill_tgid().
Later, commit ad4ecbcba728 ("[PATCH] delay accounting taskstats interface
send tgid once") changed the TGID exit path to send the cached
signal->stats aggregate directly instead of building the outgoing record
through fill_tgid(). Unlike fill_tgid(), fill_tgid_exit() only
accumulates accounting data and never initializes stats->version.
As a result, TGID exit notifications can reach userspace with version == 0
even though PID exit notifications and TASKSTATS_CMD_GET replies carry a
valid taskstats version.
This is easy to reproduce with `tools/accounting/getdelays.c`.
I have a small follow-up patch for that tool which:
1. increases the receive buffer/message size so the pid+tgid
combined exit notification is not dropped/truncated
That produces both PID and TGID exit notifications for the same
process. The PID exit record reports a valid taskstats version, while
the TGID exit record reports `version 0`.
This patch (of 2):
Set stats->version = TASKSTATS_VERSION after copying the cached TGID
aggregate into the outgoing netlink payload so all taskstats records are
self-describing again.
Link: https://lkml.kernel.org/r/ba83d934e59edd431b693607de573eb9ca059309.1774810498.git.cyyzero16@gmail.com Fixes: ad4ecbcba728 ("[PATCH] delay accounting taskstats interface send tgid once") Signed-off-by: Yiyang Chen <cyyzero16@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Wang Yaxin <wang.yaxin@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yufan Chen [Mon, 30 Mar 2026 15:34:28 +0000 (23:34 +0800)]
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
o2hb_map_slot_data() allocates hr_tmp_block, hr_slots, hr_slot_data, and
pages in stages. If a later allocation fails, the current code returns
without unwinding the earlier allocations.
o2hb_region_dev_store() also leaves slot mapping resources behind when
setup aborts, and it keeps hr_aborted_start/hr_node_deleted set across
retries. That leaves stale state behind after a failed start.
Factor the slot cleanup into o2hb_unmap_slot_data(), use it from both
o2hb_map_slot_data() and o2hb_region_release(), and call it from the
dev_store() rollback after stopping a started heartbeat thread. While
freeing pages, clear each hr_slot_data entry as it is released, and reset
the start state before each new setup attempt.
This closes the slot mapping leak on allocation/setup failure paths and
keeps failed setup attempts retryable.
Link: https://lkml.kernel.org/r/20260330153428.19586-1-yufan.chen@linux.dev Signed-off-by: Yufan Chen <ericterminal@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix mismerge of the arm64 / timer-core interrupt handling changes
Commit c43267e6794a ("Merge tag 'arm64-upstream' of git://...") had a
conflict in the irq entry/exit code due to commit c5538d0141b3 ("entry:
Split kernel mode logic from irqentry_{enter,exit}()") having moved the
core code in irqentry_enter/exit() from kernel/entry/common.c into
helper inline functions in include/linux/irq-entry-common.h.
On the other side of the merge, the timer-core code had introduced
deferred hrtimer rearming infrastructure in commit 0e98eb14814e ("entry:
Prepare for deferred hrtimer rearming"), adding two calls to
hrtimer_rearm_deferred() in irqentry_enter().
When merging the two, moving the two calls to the new location wasn't a
problem, but afterwards I had made the mistake of looking what had
happened in linux-next. And linux-next had a very different merge
resolution in commit 04f02dc3ea74 ("Merge tag 'entry-for-arm64-26-04-08'
into sched/hrtick"), which had unified the two calls into one single
call-site in irqentry_exit_to_kernel_mode_preempt().
And that merge resolution looked cleverer than the straightforward one I
had done, so I re-did my merge the way it had been done in linux-next.
But it turns out nobody apparently tests linux-next, and the merge in
linux-next was just wrong.
The difference is that hrtimer_rearm_deferred() doesn't get called at
all for the case when state.exit_rcu is true, and the boot will
typically fail due to timers not triggering correctly.
So this undoes the "clever" merge, and does the straightforward one
instead.
Merge tag 'kernel-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pid_namespace updates from Christian Brauner:
- pid_namespace: make init creation more flexible
Annotate ->child_reaper accesses with {READ,WRITE}_ONCE() to protect
the unlocked readers from cpu/compiler reordering, and enforce that
pid 1 in a pid namespace is always the first allocated pid (the
set_tid path already required this).
On top of that, allow opening pid_for_children before the pid
namespace init has been created. This lets one process create the pid
namespace and a different process create the init via setns(), which
makes clone3(set_tid) usable in all cases evenly and is particularly
useful to CRIU when restoring nested containers.
A new selftest covers both the basic create-pidns-then-init flow and
the cross-process variant, and a MAINTAINERS entry for the pid
namespace code is added.
- unrelated signal cleanup: update outdated comment for the removed
freezable_schedule()
* tag 'kernel-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
signal: update outdated comment for removed freezable_schedule()
MAINTAINERS: add a pid namespace entry
selftests: Add tests for creating pidns init via setns
pid_namespace: allow opening pid_for_children before init was created
pid: check init is created first after idr alloc
pid_namespace: avoid optimization of accesses to ->child_reaper
Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
- Add FSMOUNT_NAMESPACE flag to fsmount() that creates a new mount
namespace with the newly created filesystem attached to a copy of the
real rootfs. This returns a namespace file descriptor instead of an
O_PATH mount fd, similar to how OPEN_TREE_NAMESPACE works for
open_tree().
This allows creating a new filesystem and immediately placing it in a
new mount namespace in a single operation, which is useful for
container runtimes and other namespace-based isolation mechanisms.
This accompanies OPEN_TREE_NAMESPACE and avoids a needless detour via
OPEN_TREE_NAMESPACE to get the same effect. Will be especially useful
when you mount an actual filesystem to be used as the container
rootfs.
- Currently, creating a new mount namespace always copies the entire
mount tree from the caller's namespace. For containers and sandboxes
that intend to build their mount table from scratch this is wasteful:
they inherit a potentially large mount tree only to immediately tear
it down.
This series adds support for creating a mount namespace that contains
only a clone of the root mount, with none of the child mounts. Two
new flags are introduced:
- CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag space
- UNSHARE_EMPTY_MNTNS (0x00100000) for unshare()
Both flags imply CLONE_NEWNS. The resulting namespace contains a
single nullfs root mount with an immutable empty directory. The
intended workflow is to then mount a real filesystem (e.g., tmpfs)
over the root and build the mount table from there.
- Allow MOVE_MOUNT_BENEATH to target the caller's rootfs, allowing to
switch out the rootfs without pivot_root(2).
The traditional approach to switching the rootfs involves
pivot_root(2) or a chroot_fs_refs()-based mechanism that atomically
updates fs->root for all tasks sharing the same fs_struct. This has
consequences for fork(), unshare(CLONE_FS), and setns().
This series instead decomposes root-switching into individually
atomic, locally-scoped steps:
Since each step only modifies the caller's own state, the
fork/unshare/setns races are eliminated by design.
A key step to making this possible is to remove the locked mount
restriction. Originally MOVE_MOUNT_BENEATH doesn't support mounting
beneath a mount that is locked. The locked mount protects the
underlying mount from being revealed. This is a core mechanism of
unshare(CLONE_NEWUSER | CLONE_NEWNS). The mounts in the new mount
namespace become locked. That effectively makes the new mount table
useless as the caller cannot ever get rid of any of the mounts no
matter how useless they are.
We can lift this restriction though. We simply transfer the locked
property from the top mount to the mount beneath. This works because
what we care about is to protect the underlying mount aka the parent.
The mount mounted between the parent and the top mount takes over the
job of protecting the parent mount from the top mount mount. This
leaves us free to remove the locked property from the top mount which
can consequently be unmounted:
unshare(CLONE_NEWUSER | CLONE_NEWNS)
and we inherit a clone of procfs on /proc then currently we cannot
unmount it as:
umount -l /proc
will fail with EINVAL because the procfs mount is locked.
After this series we can now do:
mount --beneath -t tmpfs tmpfs /proc
umount -l /proc
after which a tmpfs mount has been placed beneath the procfs mount.
The tmpfs mount has become locked and the procfs mount has become
unlocked.
This means you can safely modify an inherited mount table after
unprivileged namespace creation.
Afterwards we simply make it possible to move a mount beneath the
rootfs allowing to upgrade the rootfs.
Removing the locked restriction makes this very useful for containers
created with unshare(CLONE_NEWUSER | CLONE_NEWNS) to reshuffle an
inherited mount table safely and MOVE_MOUNT_BENEATH makes it possible
to switch out the rootfs instead of using the costly pivot_root(2).
* tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/namespaces: remove unused utils.h include from listns_efault_test
selftests/fsmount_ns: add missing TARGETS and fix cap test
selftests/empty_mntns: fix wrong CLONE_EMPTY_MNTNS hex value in comment
selftests/empty_mntns: fix statmount_alloc() signature mismatch
selftests/statmount: remove duplicate wait_for_pid()
mount: always duplicate mount
selftests/filesystems: add MOVE_MOUNT_BENEATH rootfs tests
move_mount: allow MOVE_MOUNT_BENEATH on the rootfs
move_mount: transfer MNT_LOCKED
selftests/filesystems: add clone3 tests for empty mount namespaces
selftests/filesystems: add tests for empty mount namespaces
namespace: allow creating empty mount namespaces
selftests: add FSMOUNT_NAMESPACE tests
selftests/statmount: add statmount_alloc() helper
tools: update mount.h header
mount: add FSMOUNT_NAMESPACE
mount: simplify __do_loopback()
mount: start iterating from start of rbtree
Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support HW queue leasing, allowing containers to be granted access
to HW queues for zero-copy operations and AF_XDP
- Number of code moves to help the compiler with inlining. Avoid
output arguments for returning drop reason where possible
- Rework drop handling within qdiscs to include more metadata about
the reason and dropping qdisc in the tracepoints
- Remove the rtnl_lock use from IP Multicast Routing
- Pack size information into the Rx Flow Steering table pointer
itself. This allows making the table itself a flat array of u32s,
thus making the table allocation size a power of two
- Report TCP delayed ack timer information via socket diag
- Add ip_local_port_step_width sysctl to allow distributing the
randomly selected ports more evenly throughout the allowed space
- Add support for per-route tunsrc in IPv6 segment routing
- Start work of switching sockopt handling to iov_iter
- Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
buffer size drifting up
- Support MSG_EOR in MPTCP
- Add stp_mode attribute to the bridge driver for STP mode selection.
This addresses concerns about call_usermodehelper() usage
- Remove UDP-Lite support (as announced in 2023)
- Remove support for building IPv6 as a module. Remove the now
unnecessary function calling indirection
Cross-tree stuff:
- Move Michael MIC code from generic crypto into wireless, it's
considered insecure but some WiFi networks still need it
Netfilter:
- Switch nft_fib_ipv6 module to no longer need temporary dst_entry
object allocations by using fib6_lookup() + RCU.
Florian W reports this gets us ~13% higher packet rate
- Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
switch the service tables to be per-net. Convert some code that
walks the service lists to use RCU instead of the service_mutex
- Add more opinionated input validation to lower security exposure
- Make IPVS hash tables to be per-netns and resizable
- Multi-link support for FILS, probe response templates and client
probing
- New APIs and mac80211 support for NAN (Neighbor Aware Networking,
aka Wi-Fi Aware) so less work must be in firmware
Driver API:
- Add numerical ID for devlink instances (to avoid having to create
fake bus/device pairs just to have an ID). Support shared devlink
instances which span multiple PFs
- Add standard counters for reporting pause storm events (implement
in mlx5 and fbnic)
- Add configuration API for completion writeback buffering (implement
in mana)
- Support driver-initiated change of RSS context sizes
- Support DPLL monitoring input frequency (implement in zl3073x)
- Support per-port resources in devlink (implement in mlx5)
Misc:
- Expand the YAML spec for Netfilter
Drivers
- Software:
- macvlan: support multicast rx for bridge ports with shared
source MAC address
- team: decouple receive and transmit enablement for IEEE 802.3ad
LACP "independent control"
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- support high order pages in zero-copy mode (for payload
coalescing)
- support multiple packets in a page (for systems with 64kB
pages)
- Broadcom 25-400GE (bnxt):
- implement XDP RSS hash metadata extraction
- add software fallback for UDP GSO, lowering the IOMMU cost
- Broadcom 800GE (bnge):
- add link status and configuration handling
- add various HW and SW statistics
- Marvell/Cavium:
- NPC HW block support for cn20k
- Huawei (hinic3):
- add mailbox / control queue
- add rx VLAN offload
- add driver info and link management
- Ethernet NICs:
- Marvell/Aquantia:
- support reading SFP module info on some AQC100 cards
- Realtek PCI (r8169):
- add support for RTL8125cp
- Realtek USB (r8152):
- support for the RTL8157 5Gbit chip
- add 2500baseT EEE status/configuration support
- Ethernet NICs embedded and off-the-shelf IP:
- Synopsys (stmmac):
- cleanup and reorganize SerDes handling and PCS support
- cleanup descriptor handling and per-platform data
- cleanup and consolidate MDIO defines and handling
- shrink driver memory use for internal structures
- improve Tx IRQ coalescing
- improve TCP segmentation handling
- add support for Spacemit K3
- Cadence (macb):
- support PHYs that have inband autoneg disabled with GEM
- support IEEE 802.3az EEE
- rework usrio capabilities and handling
- AMD (xgbe):
- improve power management for S0i3
- improve TX resilience for link-down handling
- Virtual:
- Google cloud vNIC:
- support larger ring sizes in DQO-QPL mode
- improve HW-GRO handling
- support UDP GSO for DQO format
- PCIe NTB:
- support queue count configuration
- Ethernet PHYs:
- automatically disable PHY autonomous EEE if MAC is in charge
- Broadcom:
- add BCM84891/BCM84892 support
- Micrel:
- support for LAN9645X internal PHY
- Realtek:
- add RTL8224 pair order support
- support PHY LEDs on RTL8211F-VD
- support spread spectrum clocking (SSC)
- Maxlinear:
- add PHY-level statistics via ethtool
- Ethernet switches:
- Maxlinear (mxl862xx):
- support for bridge offloading
- support for VLANs
- support driver statistics
- Bluetooth:
- large number of fixes and new device IDs
- Mediatek:
- support MT6639 (MT7927)
- support MT7902 SDIO
- WiFi:
- Intel (iwlwifi):
- UNII-9 and continuing UHR work
- MediaTek (mt76):
- mt7996/mt7925 MLO fixes/improvements
- mt7996 NPU support (HW eth/wifi traffic offload)
- Qualcomm (ath12k):
- monitor mode support on IPQ5332
- basic hwmon temperature reporting
- support IPQ5424
- Realtek:
- add USB RX aggregation to improve performance
- add USB TX flow control by tracking in-flight URBs
- Cellular:
- IPA v5.2 support"
* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
wireguard: allowedips: remove redundant space
tools: ynl: add sample for wireguard
wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
MAINTAINERS: Add netkit selftest files
selftests/net: Add additional test coverage in nk_qlease
selftests/net: Split netdevsim tests from HW tests in nk_qlease
tools/ynl: Make YnlFamily closeable as a context manager
net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
net: airoha: Fix VIP configuration for AN7583 SoC
net: caif: clear client service pointer on teardown
net: strparser: fix skb_head leak in strp_abort_strp()
net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
selftests/bpf: add test for xdp_master_redirect with bond not up
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
sctp: disable BH before calling udp_tunnel_xmit_skb()
sctp: fix missing encap_port propagation for GSO fragments
net: airoha: Rely on net_device pointer in ETS callbacks
...
Merge tag 'bpf-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Welcome new BPF maintainers: Kumar Kartikeya Dwivedi, Eduard
Zingerman while Martin KaFai Lau reduced his load to Reviwer.
- Lots of fixes everywhere from many first time contributors. Thank you
All.
- Diff stat is dominated by mechanical split of verifier.c into
multiple components:
- backtrack.c: backtracking logic and jump history
- states.c: state equivalence
- cfg.c: control flow graph, postorder, strongly connected
components
- liveness.c: register and stack liveness
- fixups.c: post-verification passes: instruction patching, dead
code removal, bpf_loop inlining, finalize fastcall
8k line were moved. verifier.c still stands at 20k lines.
Further refactoring is planned for the next release.
- Replace dynamic stack liveness with static stack liveness based on
data flow analysis.
This improved the verification time by 2x for some programs and
equally reduced memory consumption. New logic is in liveness.c and
supported by constant folding in const_fold.c (Eduard Zingerman,
Alexei Starovoitov)
- Introduce BTF layout to ease addition of new BTF kinds (Alan Maguire)
- Use kmalloc_nolock() universally in BPF local storage (Amery Hung)
- Fix several bugs in linked registers delta tracking (Daniel Borkmann)
- Improve verifier support of arena pointers (Emil Tsalapatis)
- Improve verifier tracking of register bounds in min/max and tnum
domains (Harishankar Vishwanathan, Paul Chaignon, Hao Sun)
- Further extend support for implicit arguments in the verifier (Ihor
Solodrai)
- Add support for nop,nop5 instruction combo for USDT probes in libbpf
(Jiri Olsa)
- Support merging multiple module BTFs (Josef Bacik)
- Extend applicability of bpf_kptr_xchg (Kaitao Cheng)
- Support variable offset context access for 'syscall' programs (Kumar
Kartikeya Dwivedi)
- Migrate bpf_task_work and dynptr to kmalloc_nolock() (Mykyta
Yatsenko)
- Fix UAF in in open-coded task_vma iterator (Puranjay Mohan)
* tag 'bpf-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (241 commits)
selftests/bpf: cover short IPv4/IPv6 inputs with adjust_room
bpf: reject short IPv4/IPv6 inputs in bpf_prog_test_run_skb
selftests/bpf: Use memfd_create instead of shm_open in cgroup_iter_memcg
selftests/bpf: Add test for cgroup storage OOB read
bpf: Fix OOB in pcpu_init_value
selftests/bpf: Fix reg_bounds to match new tnum-based refinement
selftests/bpf: Add tests for non-arena/arena operations
bpf: Allow instructions with arena source and non-arena dest registers
bpftool: add missing fsession to the usage and docs of bpftool
docs/bpf: add missing fsession attach type to docs
bpf: add missing fsession to the verifier log
bpf: Move BTF checking logic into check_btf.c
bpf: Move backtracking logic to backtrack.c
bpf: Move state equivalence logic to states.c
bpf: Move check_cfg() into cfg.c
bpf: Move compute_insn_live_regs() into liveness.c
bpf: Move fixup/post-processing logic from verifier.c into fixups.c
bpf: Simplify do_check_insn()
bpf: Move checks for reserved fields out of the main pass
bpf: Delete unused variable
...
Merge tag 'linux_kselftest-next-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
- cpu-hotplug: fix to check if cpu hotplug is supported to avoid
test failures when cpu hotplug isn't supported.
- frace: fix to relevant comparisons and path checks in the helper so
it handles those patterns without spurious shell warnings.
- runner.sh: add ktrap support
- tracing: fix to make --logdir option work again
- tracing: fix to check awk supports non POSIX strtonum()
- mqueue: fix incorrectly named settings file to make sure the test
used the correct timeout value
- kselftest:
- fix to treat xpass as successful result
- add ksft_reset_state()
- kselftest_harness:
- validate kselftest exit codes are handled explicitly
- add detection of invalid mixing of kselftest and harness
functionality
- add validation of intermixing of kselftest and harness
functionality
- run_kselftest.sh:
- remove unused $ROOT
- resolve BASE_DIR with pwd -P to avoid dependency on realpath
or readlink commands to generate a physical absolute path for
BASE_DIR
- allow choosing per-test log directory
- preserve subtarget failures in all/install
* tag 'linux_kselftest-next-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/ftrace: Quote check_requires comparisons
selftests: Preserve subtarget failures in all/install
selftests/run_kselftest.sh: Allow choosing per-test log directory
selftests/run_kselftest.sh: Resolve BASE_DIR with pwd -P
selftests/run_kselftest.sh: Remove unused $ROOT
selftests/cpu-hotplug: Fix check for cpu hotplug not supported
selftests/mqueue: Fix incorrectly named file
selftests: Use ktap helpers for runner.sh
selftests: harness: Validate intermixing of kselftest and harness functionality
selftests: harness: Detect illegal mixing of kselftest and harness functionality
selftests: kselftest: Add ksft_reset_state()
selftests: harness: Validate that explicit kselftest exitcodes are handled
selftests: kselftest: Treat xpass as successful result
selftests/tracing: Fix to check awk supports non POSIX strtonum()
selftests/tracing: Fix to make --logdir option work again
Merge tag 'linux_kselftest-kunit-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit tool updates from Shuah Khan:
- terminate kernel under test on SIGINT when it catches SIGINT to make
sure the TTY isn't messed up and terminate the running kernel
- recommend --raw_output=all when KTAP header isn't found in the kernel
output, it's useful to re-run the test with --raw_output=all to find
out the reasons why the test didn't complete.
- skip stty when stdin is not a tty to avoid writing noise to stderr.
- show suites when user runs --list_suites option instead of entire
list of tests to make the output user friendly and concise.
* tag 'linux_kselftest-kunit-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: Terminate kernel under test on SIGINT
kunit: tool: skip stty when stdin is not a tty
kunit: tool: Recommend --raw_output=all if no KTAP found
kunit: Add --list_suites to show suites
Merge tag 'modules-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull module updates from Sami Tolvanen:
"Kernel symbol flags:
- Replace the separate *_gpl symbol sections (__ksymtab_gpl and
__kcrctab_gpl) with a unified symbol table and a new __kflagstab
section.
This section stores symbol flags, such as the GPL-only flag, as an
8-bit bitset for each exported symbol. This is a cleanup that
simplifies symbol lookup in the module loader by avoiding table
fragmentation and will allow a cleaner way to add more flags later
if needed.
Module signature UAPI:
- Move struct module_signature to the UAPI headers to allow reuse by
tools outside the kernel proper, such as kmod and
scripts/sign-file.
This also renames a few constants for clarity and drops unused
signature types as preparation for hash-based module integrity
checking work that's in progress.
Sysfs:
- Add a /sys/module/<module>/import_ns sysfs attribute to show the
symbol namespaces imported by loaded modules.
This makes it easier to verify driver API access at runtime on
systems that care about such things (e.g. Android).
Cleanups and fixes:
- Force sh_addr to 0 for all sections in module.lds. This prevents
non-zero section addresses when linking modules with 'ld.bfd -r',
which confused elfutils.
- Fix a memory leak of charp module parameters on module unload when
the kernel is configured with CONFIG_SYSFS=n.
- Override the -EEXIST error code returned by module_init() to
userspace. This prevents confusion with the errno reserved by the
module loader to indicate that a module is already loaded.
- Simplify the warning message and drop the stack dump on positive
returns from module_init().
- Drop unnecessary extern keywords from function declarations and
synchronize parse_args() arguments with their implementation"
* tag 'modules-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (23 commits)
module: Simplify warning on positive returns from module_init()
module: Override -EEXIST module return
documentation: remove references to *_gpl sections
module: remove *_gpl sections from vmlinux and modules
module: deprecate usage of *_gpl sections in module loader
module: use kflagstab instead of *_gpl sections
module: populate kflagstab in modpost
module: add kflagstab section to vmlinux and modules
module: define ksym_flags enumeration to represent kernel symbol flags
selftests/bpf: verify_pkcs7_sig: Use 'struct module_signature' from the UAPI headers
sign-file: use 'struct module_signature' from the UAPI headers
tools uapi headers: add linux/module_signature.h
module: Move 'struct module_signature' to UAPI
module: Give MODULE_SIG_STRING a more descriptive name
module: Give 'enum pkey_id_type' a more specific name
module: Drop unused signature types
extract-cert: drop unused definition of PKEY_ID_PKCS7
docs: symbol-namespaces: mention sysfs attribute
module: expose imported namespaces via sysfs
module: Remove extern keyword from param prototypes
...
Merge tag 'nolibc-20260412-for-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc
Pull nolibc updates from Thomas Weißschuh:
- Many new features and optimizations to printf()
- Rename non-standard symbols to avoid collisions with application code
- Support for byteswap.h, endian.h, err.h and asprintf()
- 64-bit dev_t
- Smaller cleanups and fixes to the code and build system
* tag 'nolibc-20260412-for-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (61 commits)
selftests/nolibc: use gcc 15
tools/nolibc: support UBSAN on gcc
tools/nolibc: create __nolibc_no_sanitize_ubsan
selftests/nolibc: don't skip tests for unimplemented syscalls anymore
selftests/nolibc: explicitly handle ENOSYS from ptrace()
tools/nolibc: add byteorder conversions
tools/nolibc: add the _syscall() macro
tools/nolibc: move the call to __sysret() into syscall()
tools/nolibc: rename the internal macros used in syscall()
selftests/nolibc: only use libgcc when really necessary
selftests/nolibc: test the memory allocator
tools/nolibc: check for overflow in calloc() without divisions
tools/nolibc: add support for asprintf()
tools/nolibc: use __builtin_offsetof()
tools/nolibc: use makedev() in fstatat()
tools/nolibc: handle all major and minor numbers in makedev() and friends
tools/nolibc: make dev_t 64 bits wide
tools/nolibc: move the logic of makedev() and friends into functions
selftests/nolibc: add a test for stat().st_rdev
selftests/nolibc: add some tests for makedev() and friends
...
Merge tag 'powerpc-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- powerpc support for huge pfnmaps
- Cleanups to use masked user access
- Rework pnv_ioda_pick_m64_pe() to use better bitmap API
- Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC
- Backup region offset update to eflcorehdr
- Fixes for wii/ps3 platform
- Implement JIT support for private stack in powerpc
- Implement JIT support for fsession in powerpc64 trampoline
- Add support for instruction array and indirect jump in powerpc
- Misc selftest fixes and cleanups
Thanks to Abhishek Dubey, Aditya Gupta, Alex Williamson, Amit Machhiwal,
Andrew Donnellan, Bartosz Golaszewski, Cédric Le Goater, Chen Ni,
Christophe Leroy (CS GROUP), Hari Bathini, J. Neuschäfer, Mukesh Kumar
Chaurasiya (IBM), Nam Cao, Nilay Shroff, Pavithra Prakash, Randy Dunlap,
Ritesh Harjani (IBM), Shrikanth Hegde, Sourabh Jain, Vaibhav Jain,
Venkat Rao Bagalkote, and Yury Norov (NVIDIA)
* tag 'powerpc-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (47 commits)
mailmap: Add entry for Andrew Donnellan
powerpc32/bpf: fix loading fsession func metadata using PPC_LI32
selftest/bpf: Enable gotox tests for powerpc64
powerpc64/bpf: Add support for indirect jump
selftest/bpf: Enable instruction array test for powerpc
powerpc/bpf: Add support for instruction array
powerpc32/bpf: Add fsession support
powerpc64/bpf: Implement fsession support
selftests/bpf: Enable private stack tests for powerpc64
powerpc64/bpf: Implement JIT support for private stack
powerpc: pci-ioda: Optimize pnv_ioda_pick_m64_pe()
powerpc: pci-ioda: use bitmap_alloc() in pnv_ioda_pick_m64_pe()
powerpc/net: Inline checksum wrappers and convert to scoped user access
powerpc/sstep: Convert to scoped user access
powerpc/align: Convert emulate_spe() to scoped user access
powerpc/ptrace: Convert gpr32_set_common_user() to scoped user access
powerpc/futex: Use masked user access
powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC
cpuidle: powerpc: avoid double clear when breaking snooze
powerpc/ps3: spu.c: fix enum and Return kernel-doc warnings
...
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The biggest changes are MPAM enablement in drivers/resctrl and new PMU
support under drivers/perf.
On the core side, FEAT_LSUI lets futex atomic operations with EL0
permissions, avoiding PAN toggling.
The rest is mostly TLB invalidation refactoring, further generic entry
work, sysreg updates and a few fixes.
Core features:
- Add support for FEAT_LSUI, allowing futex atomic operations without
toggling Privileged Access Never (PAN)
- Further refactor the arm64 exception handling code towards the
generic entry infrastructure
- Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
through it
Memory management:
- Refactor the arm64 TLB invalidation API and implementation for
better control over barrier placement and level-hinted invalidation
- Enable batched TLB flushes during memory hot-unplug
- Fix rodata=full block mapping support for realm guests (when
BBML2_NOABORT is available)
Perf and PMU:
- Add support for a whole bunch of system PMUs featured in NVIDIA's
Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
for CPU/C2C memory latency PMUs)
- Clean up iomem resource handling in the Arm CMN driver
- Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}
MPAM (Memory Partitioning And Monitoring):
- Add architecture context-switch and hiding of the feature from KVM
- Add interface to allow MPAM to be exposed to user-space using
resctrl
- Add errata workaround for some existing platforms
- Add documentation for using MPAM and what shape of platforms can
use resctrl
Miscellaneous:
- Check DAIF (and PMR, where relevant) at task-switch time
- Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
(only relevant to asynchronous or asymmetric tag check modes)
- Remove a duplicate allocation in the kexec code
- Remove redundant save/restore of SCS SP on entry to/from EL0
- Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
descriptions
- Add kselftest coverage for cmpbr_sigill()
- Update sysreg definitions"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits)
arm64: rsi: use linear-map alias for realm config buffer
arm64: Kconfig: fix duplicate word in CMDLINE help text
arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
arm64: kexec: Remove duplicate allocation for trans_pgd
ACPI: AGDI: fix missing newline in error message
arm64: Check DAIF (and PMR) at task-switch time
arm64: entry: Use split preemption logic
arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
arm64: entry: Consistently prefix arm64-specific wrappers
arm64: entry: Don't preempt with SError or Debug masked
entry: Split preemption from irqentry_exit_to_kernel_mode()
entry: Split kernel mode logic from irqentry_{enter,exit}()
entry: Move irqentry_enter() prototype later
entry: Remove local_irq_{enable,disable}_exit_to_user()
...
Merge tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- Add new AMD MCA bank names and types to the MCA code, preceded by a
clean up of the relevant places to have them more developer-friendly
(read: sort them alphanumerically and clean up comments) such that
adding new banks is easy
* tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce, EDAC/mce_amd: Add new SMCA bank types
x86/mce, EDAC/mce_amd: Update CS bank type naming
x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
Merge tag 'edac_updates_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- amd64_edac: Add support for AMD Zen 3 (family 19h, models 40h–4fh)
- i10nm: Add GNR error information decoder support as an alternative to
the firmware decoder
- versalnet: Restructure the init/teardown logic for correct and more
readable error handling. Also, fix two memory leaks and a resource
leak
- Convert several internal structs to use bounded flex arrays, enabling
the kernel's runtime checker to catch out-of-bounds memory accesses
- Mark various sysfs attribute tables read-only, preventing accidental
modification at runtime
- The usual fixes and cleanups across the subsystem
* tag 'edac_updates_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/mc: Use kzalloc_flex()
EDAC/ie31200: Make rpl_s_cfg static
EDAC/i10nm: Fix spelling mistake "readd" -> "read"
EDAC/versalnet: Fix device_node leak in mc_probe()
EDAC/versalnet: Fix memory leak in remove and probe error paths
EDAC/amd64: Add support for family 19h, models 40h-4fh
EDAC/i10nm: Add driver decoder for Granite Rapids server
EDAC/sb: Use kzalloc_flex()
EDAC/i7core: Use kzalloc_flex()
EDAC/mpc85xx: Constify device sysfs attributes
EDAC/device: Allow addition of const sysfs attributes
EDAC/pci_sysfs: Constify instance sysfs attributes
EDAC/device: Constify info sysfs attributes
EDAC/device: Drop unnecessary and dangerous casts of attributes
EDAC/device: Drop unused macro to_edacdev_attr()
EDAC/altera: Drop unused field eccmgr_sysfs_attr
EDAC/versalnet: Refactor memory controller initialization and cleanup
Merge tag 'x86_sev_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Change the SEV host code handling of when SNP gets enabled in order
to allow the machine to claim SNP-related resources only when SNP
guests are really going to be launched. The user requests this by
loading the ccp module and thus it controls when SNP initialization
is done
So export an API which module code can call and do the necessary SNP
setup only when really needed
- Drop an unnecessary write-back and invalidate operation that was
being performed too early, since the ccp driver already issues its
own at the correct point in the initialization sequence
- Drop the hotplug callbacks for enabling SNP on newly onlined CPUs,
which were both architecturally unsound (the firmware rejects
initialization if any CPU lacks the required configuration) and buggy
(the MFDM SYSCFG MSR bit was not being set)
- Code refactoring and cleanups to accomplish the above
* tag 'x86_sev_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
crypto/ccp: Update HV_FIXED page states to allow freeing of memory
crypto/ccp: Implement SNP x86 shutdown
x86/sev, crypto/ccp: Move HSAVE_PA setup to arch/x86/
x86/sev, crypto/ccp: Move SNP init to ccp driver
x86/sev: Create snp_shutdown()
x86/sev: Create snp_prepare()
x86/sev: Create a function to clear/zero the RMP
x86/sev: Rename SNP_FEATURES_PRESENT to SNP_FEATURES_IMPL
x86/virt/sev: Keep the RMP table bookkeeping area mapped
x86/virt/sev: Drop WBINVD before setting MSR_AMD64_SYSCFG_SNP_EN
x86/virt/sev: Drop support for SNP hotplug
Merge tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Reference the tip tree maintainer handbook directly from the relevant
MAINTAINERS file entries (covering timers, IRQ, locking, scheduling,
perf, x86, and others) so that contributors and tooling can know
where to look
- Enable interrupt remapping in defconfig, which is an architectural
requirement for x2APIC to function correctly on bare metal. Without
it, x2APIC was effectively enabled but non-functional.
- Ensure that drivers which register custom restart handlers (such as
those needed for SoC-based x86 devices like Intel Lightning Mountain)
are actually invoked during reboot, bringing x86 in line with how
other architectures handle this.
- Cleanups
* tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add references to tip tree handbook
x86/64/defconfig: Add CONFIG_IRQ_REMAP
x86/reboot: Execute the kernel restart handler upon machine restart
x86/mtrr: Use kstrtoul() in parse_mtrr_spare_reg()
Merge tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loading updates from Borislav Petkov:
"The kernel carries a table of Intel CPUs family, model, stepping, etc
tuples which say what is the latest microcode for that particular CPU.
Some CPU variants differ only by the platform ID which determines what
microcode needs to be loaded on them.
Carve out the platform ID handling from the microcode loader and make it
available in a more generic place so that the old microcode
verification machinery can use it"
* tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Add platform mask to Intel microcode "old" list
x86/cpu: Add platform ID to CPU matching structure
x86/cpu: Add platform ID to CPU info structure
x86/microcode: Refactor platform ID enumeration into a helper
Merge tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FRED updates from Borislav Petkov:
"We made the FRED support an opt-in initially out of fear of it
breaking machines left and right in the case of a hw bug in the first
generation of machines supporting it.
Now that that the FRED code has seen a lot of hammering, flip the
logic to be opt-out as is the usual case with new hw features"
* tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fred: Remove kernel log message when initializing exceptions
x86/fred: Enable FRED by default
Merge tag 'x86_cache_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
- Add return value descriptions to several internal functions,
addressing kernel-doc complaints
- Add the x86 maintainer mailing list to the resctrl section so they
are automatically included in patch submissions, and reference the
applicable contribution rules document
- Allow users to apply a single Capacity Bitmask to all cache domains
at once using '*' as a shorthand, instead of having to specify each
domain individually. This is particularly user-friendly on high
core-count systems with many cache clusters
- When a user provides a non-existent domain ID while configuring cache
allocation, ensure the failure reason is properly reported to the
user rather than silently returning an error with a misleading "ok"
status
* tag 'x86_cache_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
fs/resctrl: Add missing return value descriptions
MAINTAINERS: Update resctrl entry
fs/resctrl: Add "*" shorthand to set io_alloc CBM for all domains
fs/resctrl: Report invalid domain ID when parsing io_alloc_cbm
Merge tag 'x86_tdx_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 TDX updates from Dave Hansen:
"The only real thing of note here is printing the TDX module version.
This is a little silly on its own, but the upcoming TDX module update
code needs the same TDX module call. This shrinks that set a wee bit.
There's also few minor macro cleanups and a tweak to the GetQuote ABI
to make it easier for userspace to detect zero-length (failed) quotes"
* tag 'x86_tdx_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
virt: tdx-guest: Return error for GetQuote failures
KVM/TDX: Rename KVM_SUPPORTED_TD_ATTRS to KVM_SUPPORTED_TDX_TD_ATTRS
x86/tdx: Rename TDX_ATTR_* to TDX_TD_ATTR_*
KVM/TDX: Remove redundant definitions of TDX_TD_ATTR_*
x86/tdx: Fix the typo in TDX_ATTR_MIGRTABLE
x86/virt/tdx: Print TDX module version during init
x86/virt/tdx: Retrieve TDX module version
Merge tag 'x86_mm_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Dave Hansen:
- Convert x86 code to use generic "pagetable" APIs and ptdescs
This aligns some the set_memory*() code better with the new page
table APIs, especially using ptdescs as opposed to 'struct page'
directly.
* tag 'x86_mm_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/pat: Convert split_large_page() to use ptdescs
x86/mm/pat: Convert populate_pgd() to use page table apis
x86/mm/pat: Convert pmd code to use page table apis
x86/mm/pat: Convert pte code to use page table apis
Merge tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Dave Hansen:
- Complete LASS enabling: deal with vsyscall and EFI
The existing Linear Address Space Separation (LASS) support punted
on support for common EFI and vsyscall configs. Complete the
implementation by supporting EFI and vsyscall=xonly.
- Clean up CPUID usage in newer Intel "avs" audio driver and update the
x86-cpuid-db file
* tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0
ASoC: Intel: avs: Include CPUID header at file scope
ASoC: Intel: avs: Check maximum valid CPUID leaf
x86/cpu: Remove LASS restriction on vsyscall emulation
x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
x86/vsyscall: Restore vsyscall=xonly mode under LASS
x86/traps: Consolidate user fixups in the #GP handler
x86/vsyscall: Reorganize the page fault emulation code
x86/cpu: Remove LASS restriction on EFI
x86/efi: Disable LASS while executing runtime services
x86/cpu: Defer LASS enabling until userspace comes up
Merge tag 'x86-vdso-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
"vdso cleanups by Thomas Weißschuh:
- Clean up remnants of VDSO32_NOTE_MASK
- Drop pointless #ifdeffery in vvar_vclock_fault()"
* tag 'x86-vdso-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Drop pointless #ifdeffery in vvar_vclock_fault()
x86/vdso: Clean up remnants of VDSO32_NOTE_MASK
Merge tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
- Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)
- Print AGESA string from DMI additional information entry (Yazen
Ghannam, Mario Limonciello)
- Improve and fix the DMI code (Mario Limonciello):
- Correct an indexing error in <linux/dmi.h>
- Adjust dmi_decode() to use enums <linux/dmi.h>
- Add pr_fmt() for dmi_scan.c to fix & standardize the log prefixes
* tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Print AGESA string from DMI additional information entry
firmware: dmi: Add pr_fmt() for dmi_scan.c
firmware: dmi: Adjust dmi_decode() to use enums
firmware: dmi: Correct an indexing error in dmi.h
x86/cpu: Remove M486/M486SX/ELAN support
* tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Correct the comment explaining what xfeatures_in_use() does
x86/split_lock: Don't warn about unknown split_lock_detect parameter
x86/fpu: Correct misspelled xfeaures_to_write local var
x86/apic: Drop AMD Extended Interrupt LVT macros
x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology()
block/floppy: Don't use REALLY_SLOW_IO for delays
x86/paravirt: Replace io_delay() hook with a bool
x86/irqflags: Preemptively move include paravirt.h directive where it belongs
x86/split_lock: Restructure the unwieldy switch-case in sld_state_show()
x86/local: Remove trailing semicolon from _ASM_XADD in local_add_return()
x86/asm: Use inout "+" asm onstraint modifiers in __iowrite32_copy()
Merge tag 'x86-asm-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm from Ingo Molnar:
"x86 asm cleanups by Uros Bizjak:
- Remove unnecessary memory clobbers from FS/GS base (read-)
accessors and savesegment()
- Use ASM_INPUT_RM in __loadsegment_fs() to work around clang code
generation problems
- Implement loadsegment()/savesegment() macros with static inline
helpers
- Use savesegment() for segment register reads in ELF core dump and
__show_regs()
- Use correct type for 'gs' variable in __show_regs() to avoid
zero-extension
- Clean up 'sel' variable usage in do_set_thread_area()"
* tag 'x86-asm-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tls: Clean up 'sel' variable usage in do_set_thread_area()
x86/process/32: Use correct type for 'gs' variable in __show_regs() to avoid zero-extension
x86/process/64: Use savesegment() in __show_regs() instead of inline asm
x86/elf: Use savesegment() for segment register reads in ELF core dump
x86/asm/segment: Implement loadsegment()/savesegment() macros with static inline helpers
x86/asm/segment: Use ASM_INPUT_RM in __loadsegment_fs()
x86/asm/segment: Remove unnecessary "memory" clobber from savesegment()
x86/asm/fsgsbase: Remove unnecessary "memory" clobbers from FS/GS base (read-) accessors
Merge tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Fair scheduling updates:
- Skip SCHED_IDLE rq for SCHED_IDLE tasks (Christian Loehle)
- Remove superfluous rcu_read_lock() in the wakeup path (K Prateek Nayak)
- Simplify the entry condition for update_idle_cpu_scan() (K Prateek Nayak)
- Simplify SIS_UTIL handling in select_idle_cpu() (K Prateek Nayak)
- Avoid overflow in enqueue_entity() (K Prateek Nayak)
- Update overutilized detection (Vincent Guittot)
- Prevent negative lag increase during delayed dequeue (Vincent Guittot)
- Clear buddies for preempt_short (Vincent Guittot)
- Implement more complex proportional newidle balance (Peter Zijlstra)
- Increase weight bits for avg_vruntime (Peter Zijlstra)
- Use full weight to __calc_delta() (Peter Zijlstra)
RT and DL scheduling updates:
- Fix incorrect schedstats for rt and dl thread (Dengjun Su)
- Skip group schedulable check with rt_group_sched=0 (Michal Koutný)
- Move group schedulability check to sched_rt_global_validate()
(Michal Koutný)
- Add reporting of runtime left & abs deadline to sched_getattr()
for DEADLINE tasks (Tommaso Cucinotta)
Scheduling topology updates by K Prateek Nayak:
- Compute sd_weight considering cpuset partitions
- Extract "imb_numa_nr" calculation into a separate helper
- Allocate per-CPU sched_domain_shared in s_data
- Switch to assigning "sd->shared" from s_data
- Remove sched_domain_shared allocation with sd_data
Energy-aware scheduling updates:
- Filter false overloaded_group case for EAS (Vincent Guittot)
- PM: EM: Switch to rcu_dereference_all() in wakeup path
(Dietmar Eggemann)
Infrastructure updates:
- Replace use of system_unbound_wq with system_dfl_wq (Marco Crivellari)
Proxy scheduling updates by John Stultz:
- Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
- Minimise repeated sched_proxy_exec() checking
- Fix potentially missing balancing with Proxy Exec
- Fix and improve task::blocked_on et al handling
- Add assert_balance_callbacks_empty() helper
- Add logic to zap balancing callbacks if we pick again
- Move attach_one_task() and attach_task() helpers to sched.h
- Handle blocked-waiter migration (and return migration)
- Add K Prateek Nayak to scheduler reviewers for proxy execution
Misc cleanups and fixes by John Stultz, Joseph Salisbury, Peter
Zijlstra, K Prateek Nayak, Michal Koutný, Randy Dunlap, Shrikanth
Hegde, Vincent Guittot, Zhan Xusheng, Xie Yuanbin and Vincent Guittot"
* tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
sched/eevdf: Clear buddies for preempt_short
sched/rt: Cleanup global RT bandwidth functions
sched/rt: Move group schedulability check to sched_rt_global_validate()
sched/rt: Skip group schedulable check with rt_group_sched=0
sched/fair: Avoid overflow in enqueue_entity()
sched: Use u64 for bandwidth ratio calculations
sched/fair: Prevent negative lag increase during delayed dequeue
sched/fair: Use sched_energy_enabled()
sched: Handle blocked-waiter migration (and return migration)
sched: Move attach_one_task and attach_task helpers to sched.h
sched: Add logic to zap balance callbacks if we pick again
sched: Add assert_balance_callbacks_empty helper
sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration
sched: Fix modifying donor->blocked on without proper locking
locking: Add task::blocked_lock to serialize blocked_on state
sched: Fix potentially missing balancing with Proxy Exec
sched: Minimise repeated sched_proxy_exec() checking
sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
MAINTAINERS: Add K Prateek Nayak to scheduler reviewers
sched/core: Get this cpu once in ttwu_queue_cond()
...
Merge tag 'perf-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance events updates from Ingo Molnar:
"Core updates:
- Try to allocate task_ctx_data quickly, to optimize O(N^2) algorithm
on large systems with O(100k) threads (Namhyung Kim)
AMD PMU driver IBS support updates and fixes, by Ravi Bangoria:
- Fix interrupt accounting for discarded samples
- Fix a Zen5-specific quirk
- Fix PhyAddrVal handling
- Fix NMI-safety with perf_allow_kernel()
- Fix a race between event add and NMIs
Intel PMU driver updates:
- Only check GP counters for PEBS constraints validation (Dapeng Mi)
MSR driver:
- Turn SMI_COUNT and PPERF on by default, instead of a long list of
CPU models to enable them on (Kan Liang)
... and misc cleanups and fixes by Aldf Conte, Anshuman Khandual,
Namhyung Kim, Ravi Bangoria and Yen-Hsiang Hsu"
* tag 'perf-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/events: Replace READ_ONCE() with standard pgtable accessors
perf/x86/msr: Make SMI and PPERF on by default
perf/x86/intel/p4: Fix unused variable warning in p4_pmu_init()
perf/x86/intel: Only check GP counters for PEBS constraints validation
perf/x86/amd/ibs: Fix comment typo in ibs_op_data
perf/amd/ibs: Advertise remote socket capability
perf/amd/ibs: Enable streaming store filter
perf/amd/ibs: Enable RIP bit63 hardware filtering
perf/amd/ibs: Enable fetch latency filtering
perf/amd/ibs: Support IBS_{FETCH|OP}_CTL2[Dis] to eliminate RMW race
perf/amd/ibs: Add new MSRs and CPUID bits definitions
perf/amd/ibs: Define macro for ldlat mask and shift
perf/amd/ibs: Avoid race between event add and NMI
perf/amd/ibs: Avoid calling perf_allow_kernel() from the IBS NMI handler
perf/amd/ibs: Preserve PhyAddrVal bit when clearing PhyAddr MSR
perf/amd/ibs: Limit ldlat->l3missonly dependency to Zen5
perf/amd/ibs: Account interrupt for discarded samples
perf/core: Simplify __detach_global_ctx_data()
perf/core: Try to allocate task_ctx_data quickly
perf/core: Pass GFP flags to attach_task_ctx_data()
Merge tag 'objtool-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- KLP support updates and fixes (Song Liu)
- KLP-build script updates and fixes (Joe Lawrence)
- Support Clang RAX DRAP sequence, to address clang false positive
(Josh Poimboeuf)
- Reorder ORC register numbering to match regular x86 register
numbering (Josh Poimboeuf)
- Misc cleanups (Wentong Tian, Song Liu)
* tag 'objtool-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool/x86: Reorder ORC register numbering
objtool: Support Clang RAX DRAP sequence
livepatch/klp-build: report patch validation fuzz
livepatch/klp-build: add terminal color output
livepatch/klp-build: provide friendlier error messages
livepatch/klp-build: improve short-circuit validation
livepatch/klp-build: fix shellcheck complaints
livepatch/klp-build: add Makefile with check target
livepatch/klp-build: add grep-override function
livepatch/klp-build: switch to GNU patch and recountdiff
livepatch/klp-build: support patches that add/remove files
objtool/klp: Correlate locals to globals
objtool/klp: Match symbols based on demangled_name for global variables
objtool/klp: Remove .llvm suffix in demangle_name()
objtool/klp: Also demangle global objects
objtool/klp: Use sym->demangled_name for symbol_name hash
objtool/klp: Remove trailing '_' in demangle_name()
objtool/klp: Remove redundant strcmp() in correlate_symbols()
objtool: Use section/symbol type helpers
Wolfram Sang [Tue, 14 Apr 2026 19:48:38 +0000 (21:48 +0200)]
Merge tag 'i2c-host-7.1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow
i2c-host for v7.1, part 1
- generic cleanups in npcm7xx, qcom-cci, xiic and designware DT
bindings
- atr: use kzalloc_flex for alias pool allocation
- ixp4xx: convert bindings to DT schema
- ocores: use read_poll_timeout_atomic() for polling waits
- qcom-geni: skip extra TX DMA TRE for single read messages
- s3c24xx: validate SMBus block length before using it
- spacemit: refactor xfer path and add K1 PIO support
- tegra: identify DVC and VI with SoC data variants
- tegra: support SoC-specific register offsets
- xiic: switch to devres and generic fw properties
- xiic: skip input clock setup on non-OF systems
rtl9300:
- add per-SoC callbacks and clock support for RTL9607C
- add support for new 50 kHz and 2.5 MHz bus speeds
- general refactoring in preparation for RTL9607C support
New support:
- DesignWare GOOG5000 (ACPI HID)
- Intel Nova Lake (ACPI ID)
- Realtek RTL9607C
- SpacemiT K3 binding
- Tegra410 register layout support
Miscellaneous fixes and cleanups by Peter Zijlstra, Randy Dunlap,
Thomas Weißschuh, Davidlohr Bueso and Mikhail Gavrilov"
* tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
compiler: Simplify generic RELOC_HIDE()
locking: Add lock context annotations in the spinlock implementation
locking: Add lock context support in do_raw_{read,write}_trylock()
locking: Fix rwlock support in <linux/spinlock_up.h>
lockdep: Raise default stack trace limits when KASAN is enabled
cleanup: Optimize guards
jump_label: remove workaround for old compilers in initializations
jump_label: use ATOMIC_INIT() for initialization of .enabled
futex: Convert to compiler context analysis
locking/rwsem: Fix logic error in rwsem_del_waiter()
locking/rwsem: Add context analysis
locking/rtmutex: Add context analysis
locking/mutex: Add context analysis
compiler-context-analysys: Add __cond_releases()
locking/mutex: Remove the list_head from struct mutex
locking/semaphore: Remove the list_head from struct semaphore
locking/rwsem: Remove the list_head from struct rw_semaphore
rust: atomic: Update a safety comment in impl of `fetch_add()`
rust: sync: atomic: Update documentation for `fetch_add()`
rust: sync: atomic: Add fetch_sub()
...
Merge in late fixes in preparation for the net-next PR.
Conflicts:
include/net/sch_generic.h a6bd339dbb351 ("net_sched: fix skb memory leak in deferred qdisc drops") ff2998f29f390 ("net: sched: introduce qdisc-specific drop reason tracing")
https://lore.kernel.org/adz0iX85FHMz0HdO@sirena.org.uk
drivers/net/ethernet/airoha/airoha_eth.c 1acdfbdb516b ("net: airoha: Fix VIP configuration for AN7583 SoC") bf3471e6e6c0 ("net: airoha: Make flow control source port mapping dependent on nbq parameter")
Adjacent changes:
drivers/net/ethernet/airoha/airoha_ppe.c f44218cd5e6a ("net: airoha: Reset PPE cpu port configuration in airoha_ppe_hw_init()") 7da62262ec96 ("inet: add ip_local_port_step_width sysctl to improve port usage distribution")
net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
The kernel-doc comment header incorrectly referenced the function
name pse_control_find_net_by_id() instead of the actual function name
pse_control_find_by_id(). Correct the function name in the documentation
to match the implementation.
wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
wg_netns_pre_exit() manually acquires rtnl_lock() inside the
pernet .pre_exit callback. This causes a hung task when another
thread holds rtnl_mutex - the cleanup_net workqueue (or the
setup_net failure rollback path) blocks indefinitely in
wg_netns_pre_exit() waiting to acquire the lock.
Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
Add ->exit_rtnl() hook to struct pernet_operations."), where the
framework already holds RTNL and batches all callbacks under a
single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
window.
The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
because all RCU readers of creating_net either use maybe_get_net()
- which returns NULL for a dying namespace with zero refcount - or
access net->user_ns which remains valid throughout the entire
ops_undo_list sequence.
Reported-by: syzbot+f2fbf7478a35a94c8b7c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70 Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
[ Jason: added __net_exit and __read_mostly annotations that were missing. ] Fixes: 900575aa33a3 ("wireguard: device: avoid circular netns references") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20260414153944.2742252-5-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>