Nuoqi Gui [Sun, 7 Jun 2026 13:24:13 +0000 (21:24 +0800)]
bpf: Keep dynamic inner array lookups nullable
An ARRAY_OF_MAPS can use an array created with BPF_F_INNER_MAP as its
inner map template. A concrete inner array with a different max_entries
value can then replace the template.
After a successful outer map lookup, the verifier represents the
resulting map pointer using the inner map template. Const-key lookup
nullness elision consequently uses the template max_entries even though
the runtime helper uses the concrete inner map max_entries.
Do not elide lookup result nullness for maps marked with BPF_F_INNER_MAP,
because the template max_entries does not prove that the key is in bounds
for the concrete runtime map.
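
The decision described above can be modelled outside the kernel. What follows is a minimal Python sketch, not the verifier's code: can_elide_null_check() and its parameters are hypothetical names, and the flag value is assumed from the UAPI header. Only the BPF_F_INNER_MAP test and the comparison against max_entries come from the description.

```python
# Illustrative model only; the real logic lives in the BPF verifier (C).
BPF_F_INNER_MAP = 1 << 12  # assumed to match the UAPI definition


def can_elide_null_check(map_flags: int, template_max_entries: int,
                         const_key: int) -> bool:
    """Return True if a const-key array lookup may be treated as non-NULL."""
    # A template marked BPF_F_INNER_MAP may be replaced at runtime by an
    # array with a different max_entries, so its size proves nothing.
    if map_flags & BPF_F_INNER_MAP:
        return False
    return 0 <= const_key < template_max_entries


# The template has 8 entries, but the concrete inner map may be smaller.
print(can_elide_null_check(BPF_F_INNER_MAP, 8, 5))  # False
print(can_elide_null_check(0, 8, 5))                # True
```

With the flag set, the lookup result stays nullable even for a key that is in bounds for the template.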
Fixes: d2102f2f5d75 ("bpf: verifier: Support eliding map lookup nullness")
Signed-off-by: Nuoqi Gui <gnq25@mails.tsinghua.edu.cn>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20260607-f01-v2-v2-1-da48453146e8@mails.tsinghua.edu.cn
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Frank Li [Thu, 21 May 2026 14:21:53 +0000 (23:21 +0900)]
dmaengine: dw-edma: Add spinlock to protect DONE_INT_MASK and ABORT_INT_MASK
The DONE_INT_MASK and ABORT_INT_MASK registers are shared by all DMA
channels, and modifying them requires a read-modify-write sequence.
Because this operation is not atomic, concurrent calls to
dw_edma_v0_core_start() can introduce race conditions if two channels
update these registers simultaneously.
Add a spinlock to serialize access to these registers and prevent race
conditions.
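
The hazard and the fix can be illustrated with a small userspace model. This is a sketch under stated assumptions, not driver code: SharedMaskRegister and unmask_channel() are hypothetical names, a threading.Lock stands in for the spinlock, and clearing one bit per channel is an assumed register layout.

```python
import threading


class SharedMaskRegister:
    """Hypothetical stand-in for a register shared by all channels."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.lock = threading.Lock()  # plays the role of the spinlock

    def unmask_channel(self, chan: int) -> None:
        # The read-modify-write must be one critical section; otherwise two
        # channels can read the same old value and one update is lost.
        with self.lock:
            tmp = self.value      # read
            tmp &= ~(1 << chan)   # modify: clear this channel's mask bit
            self.value = tmp      # write


reg = SharedMaskRegister(0xFF)  # all eight channels masked
threads = [threading.Thread(target=reg.unmask_channel, args=(ch,))
           for ch in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(hex(reg.value))  # 0x0
```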
Koichiro Den [Thu, 21 May 2026 14:21:51 +0000 (23:21 +0900)]
dmaengine: dw-edma-pcie: Reject devices without driver data
dw_edma_pcie_probe() treats the PCI device ID driver_data as the
template for the controller layout and copies it unconditionally. A
device bound dynamically via sysfs can match the driver without that
data, which leads to a NULL pointer dereference.
Reject such matches before enabling the device.
Fixes: 41aaff2a2ac0 ("dmaengine: Add Synopsys eDMA IP PCIe glue-logic")
Cc: stable@vger.kernel.org
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260521142153.2957432-3-den@valinux.co.jp
Signed-off-by: Vinod Koul <vkoul@kernel.org>
John Madieu [Mon, 25 May 2026 11:07:50 +0000 (11:07 +0000)]
dmaengine: sh: rz-dmac: Add DMA ACK signal routing support
Some peripherals on RZ/G3E SoCs (SSIU, SPDIF, SCU/SRC, DVC, PFC) require
explicit ACK signal routing through the ICU for level-based DMA handshaking.
Rather than extending the DT binding with an optional second #dma-cells
(which would require all DMA consumers to supply two cells even when ACK
routing is not needed), derive the ACK signal number directly from the
MID/RID request number using the linear mapping defined in RZ/G3E hardware
manual Table 4.6-28:
ACK routing is programmed when a channel is prepared for transfer and
cleared when the channel is released or the transfer times out, following
the same pattern as MID/RID request routing.
John Madieu [Mon, 25 May 2026 11:07:49 +0000 (11:07 +0000)]
irqchip/renesas-rzv2h: Add DMA ACK signal routing support
Some peripherals on RZ/G3E SoCs (SSIU, SPDIF, SCU/SRC, DVC) require
explicit ACK signal routing through the ICU via the ICU_DMACKSELk
registers for level-based DMA handshaking.
Add rzv2h_icu_register_dma_ack() to configure ICU_DMACKSELk, routing
a DMAC channel's ACK signal to the specified peripheral.
Devendra K Verma [Tue, 26 May 2026 05:31:10 +0000 (11:01 +0530)]
dmaengine: dw-edma: Remove dw_edma_add_irq_mask()
The dw_edma_add_irq_mask() function stores the mask of the
interrupts allotted to the read / write channels in a variable.
That mask is never used, and the function is not called
anywhere else, which makes it redundant.
Remove the function; doing so does not affect any
existing behavior.
dmaengine: nbpfaxi: Drop unused platform_device_id array
The dma-nbpf driver only probes devices from device tree and fails to
probe devices relying on the traditional platform device probe path. So
the platform_device_id array is unused apart from providing misleading
module meta data.
dmaengine: cirrus: Drop left-over from platform probing
Since commit 2e7f55ce4302 ("dmaengine: cirrus: Convert to DT for Cirrus
EP93xx") the driver cannot probe devices using the traditional platform
device way any more. Thus the driver's .id_table serves no purpose any
more and can be dropped.
Rosen Penev [Sat, 30 May 2026 20:03:22 +0000 (13:03 -0700)]
dmaengine: dmatest: split struct dmatest_info from variable declaration
Combining the struct definition with its variable initializer confuses the
kernel-doc parser because __MUTEX_INITIALIZER() expands to contain braces,
breaking brace counting and causing:
Warning: drivers/dma/dmatest.c:152 struct member '' not described in 'dmatest_info'
Split into separate struct definition and variable declaration, which is
the standard kernel pattern.
Rosen Penev [Sun, 31 May 2026 02:08:43 +0000 (19:08 -0700)]
dmaengine: ste_dma40: turn d40_base phy_chans into a flexible array
Convert the separately-offset phy_chans pointer to a C99 flexible array
member at the end of struct d40_base, and switch the allocation to
struct_size(). The log_chans and memcpy_chans slots continue to live
in the same allocation immediately after phy_chans, indexed via
base->log_chans. This removes the hand-rolled pointer fixup that
recomputed phy_chans from base + ALIGN(sizeof(struct d40_base), 4).
The ALIGN(sizeof(struct d40_base), 4) requirement is met implicitly by the
C compiler when using a flexible array member. With struct d40_chan
phy_chans[] as the last member, the C standard guarantees
sizeof(struct d40_base) includes trailing padding to satisfy the alignment
of the flexible array element type (struct d40_chan). Since struct d40_chan
contains members like spinlock_t, pointers, and struct dma_chan — all with
alignment ≥ 4 — the compiler ensures sizeof(struct d40_base) is already a
multiple of _Alignof(struct d40_chan) >= 4. The struct_size() macro then
computes sizeof(struct d40_base) + sizeof(struct d40_chan) * num_phy_chans,
so phy_chans[0] lands at a properly aligned offset without needing the manual
ALIGN.
Sheetal [Sun, 17 May 2026 16:30:45 +0000 (16:30 +0000)]
dmaengine: tegra210-adma: Add error logging on failure paths
Add dev_err/dev_err_probe logging across failure paths to improve
debuggability of DMA errors during runtime and probe.
Use return dev_err_probe() pattern where no cleanup is required in the
probe function. On error paths that need explicit unwind, store the
dev_err_probe() return value in ret before jumping to the cleanup label.
Also convert existing dev_err calls in probe to dev_err_probe for
consistency, and use dev_err in non-probe functions.
Keep explicit runtime PM and DMA registration unwind instead of managed or
scoped cleanup. The scoped runtime PM guard releases the usage count with
pm_runtime_put(), while this probe error path needs pm_runtime_put_sync()
before pm_runtime_disable(). The OF DMA registration failure path also
needs to unregister the DMA engine before dropping the runtime PM reference.
Tejun Heo [Mon, 8 Jun 2026 07:25:47 +0000 (21:25 -1000)]
arm64: mm: Complete the PTE store in ptep_try_set()
ptep_try_set() installs a kernel PTE with try_cmpxchg() but, unlike
__set_pte(), skips the barriers that arm64 requires after writing a valid
kernel PTE. Without them a subsequent access can fault instead of seeing
the new mapping.
Issue them with emit_pte_barriers() rather than __set_pte_complete().
ptep_try_set() must finish the store before it returns, but
__set_pte_complete() would defer the barriers when the calling context is in
lazy MMU mode.
v2: Emit the barriers directly instead of __set_pte_complete(). (Catalin)
sb_regs_write() looks up the matching sideband register entry before
validating the number of bytes to write.
However, the size check uses sb_regs->size, which is the size of the
first entry in the register table, instead of the matched entry. This
rejects valid writes to larger sideband registers such as USB4_SB_DEBUG
or USB4_SB_DATA.
Use the matched register entry for the size check.
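
The difference between the two checks can be shown with a small model. This is a hedged sketch, not the thunderbolt debugfs code: the table contents, register numbers and sizes are made up for illustration, and find_reg() and the write_allowed_*() helpers are hypothetical names.

```python
from typing import NamedTuple, Optional


class SbReg(NamedTuple):
    reg: int
    size: int  # bytes


# Hypothetical table: register numbers and sizes are illustrative only.
SB_REGS = [
    SbReg(reg=0x00, size=4),
    SbReg(reg=0x0D, size=54),  # stands in for a larger register (e.g. USB4_SB_DEBUG)
    SbReg(reg=0x12, size=64),  # stands in for a larger register (e.g. USB4_SB_DATA)
]


def find_reg(reg: int) -> Optional[SbReg]:
    return next((r for r in SB_REGS if r.reg == reg), None)


def write_allowed_buggy(reg: int, length: int) -> bool:
    entry = find_reg(reg)
    # Bug: the limit comes from the first table entry, not the matched one.
    return entry is not None and length <= SB_REGS[0].size


def write_allowed_fixed(reg: int, length: int) -> bool:
    entry = find_reg(reg)
    return entry is not None and length <= entry.size


print(write_allowed_buggy(0x12, 16))  # False: a valid write is rejected
print(write_allowed_fixed(0x12, 16))  # True
```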
Signed-off-by: Xu Rao <raoxu@uniontech.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
With the implementation of masked user access, we always have a memory
gap between user memory space and kernel memory space, so use it to
simplify access_ok() by relying on access fault in case of an access
in the gap.
Most of the time the size is known at build time.
On powerpc64, the kernel space starts at 0x8000000000000000 which is
always more than two times TASK_USER_MAX so when the size is known at
build time and lower than TASK_USER_MAX, only the address needs to be
verified. If not, a bitwise OR of the address and size must be lower than
TASK_USER_MAX. As TASK_USER_MAX is a power of 2, just check that
there is no bit set outside of TASK_USER_MAX - 1 mask.
On powerpc32, there is a guaranteed gap of 128KB so when the size is
known at build time and not greater than 128KB, just check that the
address is below TASK_SIZE. Otherwise use the original formula.
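
The powerpc64 reasoning can be checked with plain integer arithmetic. The sketch below is illustrative only: the TASK_USER_MAX value is an assumption (any power of two no larger than half the kernel start address behaves the same), and the function names are hypothetical rather than the kernel's.

```python
# Assumed value for illustration; the real constant is configuration dependent.
TASK_USER_MAX = 1 << 52
KERNEL_START = 0x8000000000000000


def access_ok_const_size(addr: int) -> bool:
    """Size known at build time and below TASK_USER_MAX: check addr only."""
    return addr < TASK_USER_MAX


def access_ok_var_size(addr: int, size: int) -> bool:
    """Neither addr nor size may have a bit set outside TASK_USER_MAX - 1."""
    return ((addr | size) & ~(TASK_USER_MAX - 1)) == 0


# If the check passes, addr + size < 2 * TASK_USER_MAX <= KERNEL_START, so the
# access ends in user space or in the gap, where it faults.
assert 2 * TASK_USER_MAX <= KERNEL_START
print(access_ok_var_size(0x1000, 4096))           # True
print(access_ok_var_size(TASK_USER_MAX, 8))       # False
print(access_ok_var_size(0x1000, TASK_USER_MAX))  # False
```

The single mask test replaces a comparison on the sum of address and size, which is what makes the power-of-two property of TASK_USER_MAX useful.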
Shrikanth Hegde [Wed, 3 Jun 2026 13:10:54 +0000 (18:40 +0530)]
powerpc/entry: Disable interrupts before irqentry_exit
Venkat reported a panic on powerpc-next tree where GENERIC_ENTRY has
been enabled.
kernel BUG at kernel/sched/core.c:7512!
NIP preempt_schedule_irq+0x44/0x118
LR dynamic_irqentry_exit_cond_resched+0x40/0x1a4
Call Trace:
dynamic_irqentry_exit_cond_resched+0x40/0x1a4
do_page_fault+0xc0/0x104
data_access_common_virt+0x210/0x220
This happens because __do_page_fault ends up enabling interrupts and
can run long enough for need_resched to be set. That leads to a
schedule call in irqentry_exit with interrupts enabled, which triggers
the bug. Many other irq handlers enable interrupts in the same way.
Fix it by disabling the irq before calling irqentry_exit. The same
pattern exists today in interrupt_exit_kernel_prepare.
pinctrl: qcom: lpass-lpi: Switch to PM clock framework for runtime PM
Convert the LPASS LPI pinctrl driver to use the PM clock framework for
runtime power management.
This allows the LPASS LPI pinctrl driver to drop clock votes when idle,
improves power efficiency on platforms using LPASS LPI island mode, and
aligns the driver with common runtime PM patterns used across Qualcomm
LPASS subsystems.
Guard GPIO register read/write helpers and slew-rate register programming
with synchronous runtime PM calls so the device is active during MMIO
operations whenever autosuspend is enabled.
Make PINCTRL_LPASS_LPI depend on PM_CLK, since this patch introduces
direct PM clock API use in the shared core.
The LPASS LPI core conversion to PM clock framework relies on variant
drivers wiring runtime PM callbacks.
Hook up runtime PM callbacks for the LPASS LPI variant drivers touched
in this patch so they are prepared for the shared core conversion.
This commit is a preparatory NOP on its own, as runtime PM is still
disabled on these devices until the following core conversion patch.
This is a mechanical per-variant driver update that relies on the
same generic PM clock flow (of_pm_clk_add_clks() + pm_clk_suspend/
pm_clk_resume()) and DT-provided clocks.
Runtime behavior was validated on Kodiak (sc7280).
OF_GPIO is selected automatically on all OF systems. Any symbols it
controls also provide stubs and are private to GPIOLIB anyway so there's
really no reason to select it explicitly.
staging: media: max96712: drop unneeded dependency on OF_GPIO
OF_GPIO is selected automatically on all OF systems. Any symbols it
controls also provide stubs and are private to GPIOLIB anyway so there's
really no reason to select it explicitly.
Billy Tsai [Fri, 5 Jun 2026 06:38:09 +0000 (14:38 +0800)]
pinctrl: aspeed: Fix GPIO mux value for ADC-capable balls
aspeed_g7_soc1_gpio_request_enable() unconditionally writes mux
function 0 to route the requested pin to GPIO. This is wrong for the
ADC-capable balls W17 through AB19 (ADC0-ADC15), where function 0
selects the ADC input and function 1 selects GPIO. Requesting one of
those GPIOs therefore muxed the ball to ADC instead.
Write mux value 1 for balls W17 through AB19 so the GPIO function is
actually selected.
power: sequencing: Add an API to return the pwrseq device's 'dev' pointer
The consumer drivers can make use of the pwrseq device's 'dev' pointer to
query the pwrseq provider's DT node to check for existence of specific
properties.
Hence, add an API to return the pwrseq device's 'dev' pointer to consumers.
Note that since pwrseq_get() would've increased the pwrseq refcount, there
is no need to increase the refcount in this API again.
Mingyu Wang [Mon, 4 May 2026 07:48:23 +0000 (15:48 +0800)]
agp/amd64: Fix broken error propagation in agp_amd64_probe()
A NULL pointer dereference was observed in the AMD64 AGP driver when
running in a virtualized environment (e.g. qemu/kvm) without a physical
AMD northbridge. The crash occurs in amd64_fetch_size() when attempting
to dereference the pointer returned by node_to_amd_nb(0).
The root cause of this crash is broken error propagation in
agp_amd64_probe(): When no AMD northbridges are found, cache_nbs()
correctly returns -ENODEV. However, the probe function erroneously
checks the return value against exactly -1, rather than < 0.
As a result, the hardware absence error is masked, allowing the driver
to improperly proceed with initialization. It eventually calls
agp_add_bridge(), which invokes amd64_fetch_size(). Since the hardware
does not exist, node_to_amd_nb(0) returns NULL, leading to a General
Protection Fault (GPF) when accessing its ->misc member.
Fix the issue by correcting the error check in agp_amd64_probe() to
abort properly when cache_nbs() returns any negative error code. This
prevents the driver from erroneously proceeding without hardware, thereby
avoiding the subsequent NULL pointer dereference at its source.
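
The effect of the two comparisons is easy to demonstrate in isolation. This is a minimal model, not the driver: cache_nbs_no_hardware(), probe_buggy() and probe_fixed() are hypothetical stand-ins, and the values the real probe function returns on each path are an assumption.

```python
import errno


def cache_nbs_no_hardware() -> int:
    """Stand-in for cache_nbs() when no AMD northbridge is present."""
    return -errno.ENODEV


def probe_buggy() -> int:
    if cache_nbs_no_hardware() == -1:  # matches only -1, never -ENODEV
        return -errno.ENODEV
    return 0  # wrongly continues with initialization


def probe_fixed() -> int:
    ret = cache_nbs_no_hardware()
    if ret < 0:  # any negative error code aborts the probe
        return ret
    return 0


print(probe_buggy())                   # 0
print(probe_fixed() == -errno.ENODEV)  # True
```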
Fixes: a32073bffc65 ("[PATCH] x86_64: Clean and enhance up K8 northbridge access code")
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v2.6.18+
Link: https://patch.msgid.link/20260504074823.99377-1-w15303746062@163.com
power: sequencing: pcie-m2: Create BT node based on the pci_device_id[] table
Currently, pwrseq_pcie_m2_create_bt_node() hardcodes the BT compatible for
creating the devicetree node. But to allow adding support for more devices
in the future, create the BT node based on the pci_device_id[] table. The
BT compatible is passed using 'driver_data'.
Co-developed-by: Wei Deng <wei.deng@oss.qualcomm.com>
Signed-off-by: Wei Deng <wei.deng@oss.qualcomm.com>
Tested-by: Wei Deng <wei.deng@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://patch.msgid.link/20260519-pwrseq-m2-bt-v3-5-b39dc2ae3966@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
power: sequencing: pcie-m2: Create serdev for PCI devices present before probe
So far, the driver is registering a notifier to create serdev for the PCI
devices that are going to be attached after probe. But it doesn't handle
the devices present before probe. Due to this, serdev is not getting
created for those existing devices.
Hence, create serdev for PCI devices available before probe as well.
Note that the serdev for available devices is created before
registering the notifier. There is a small window where a device could
appear after pwrseq_pcie_m2_create_serdev(), before notifier registration.
But since M.2 cards are fixed to a slot, they are mostly added either
before booting the host or after using hotplug. So this window is mostly
theoretical.
power: sequencing: pcie-m2: Allow creating serdev for multiple PCI devices
Current code makes it possible to create serdev for only one PCI device.
But for scaling this driver, it is necessary to allow creating serdev for
multiple PCI devices.
Hence, add provision for it by creating 'struct pwrseq_pci_dev' for each
PCI device that requires serdev and add them to
'pwrseq_pcie_m2_ctx::pci_devices' list.
Juergen Gross [Tue, 26 May 2026 15:05:13 +0000 (17:05 +0200)]
x86/xen: Get rid of last XEN_LAZY_MMU uses
There are only very few use cases of XEN_LAZY_MMU left. Get rid of
them in order to avoid having to call enter_lazy(XEN_LAZY_MMU) and
leave_lazy(XEN_LAZY_MMU).
The query in xen_batched_set_pte() can be replaced by using
is_lazy_mmu_mode_active() instead.
As xen_flush_lazy_mmu() will be called only with lazy MMU mode being
active, the test for the lazy mode can just be dropped.
In xen_start_context_switch() and xen_end_context_switch() use
__task_lazy_mmu_mode_pause() and __task_lazy_mmu_mode_resume(),
allowing to drop xen_enter_lazy_mmu() and xen_leave_lazy_mmu()
completely.
Call arch_flush_lazy_mmu_mode() from arch_leave_lazy_mmu_mode(), as
this is the only required action now.
Drop the lazy mmu enter and leave paravirt hooks, leaving the flush
hook as the only needed one.
Juergen Gross [Fri, 22 May 2026 15:21:14 +0000 (17:21 +0200)]
x86/xen: Remove Xen debugfs support
The only Xen file in debugfs is for dumping the p2m table when running
as a Xen PV guest. This might have been useful when the PV code was
young, but there haven't been any p2m related bugs requiring the p2m
dump for a long time.
Juergen Gross [Fri, 22 May 2026 15:21:12 +0000 (17:21 +0200)]
x86/xen: Guard PV-only stuff in xen-ops.h with CONFIG_XEN_PV
A lot of arch/x86/xen/xen-ops.h is meant to be for PV only. Guard all
of it with CONFIG_XEN_PV in order to avoid someone misusing it in
non-PV builds. Additionally, any 64-bit tests for the now-guarded items can
be dropped.
Move the enum pt_level definition to mmu_pv.c, as it is used only there.
Len Bao [Sat, 23 May 2026 13:28:01 +0000 (13:28 +0000)]
xen/mcelog: mark g_physinfo, ncpus and xen_mce_chrdev_device as __ro_after_init
The 'g_physinfo' and 'ncpus' variables are initialized only during the
init phase in the 'bind_virq_for_mce' function and never changed. So,
mark them as __ro_after_init.
The 'xen_mce_chrdev_device' variable is initialized only in the
declaration and never changed. So, this variable could be 'const', but
using the 'misc_register' and 'misc_deregister' functions discards the
'const' qualifier. Therefore, as an alternative, mark it as
__ro_after_init.
Signed-off-by: Len Bao <len.bao@gmx.us>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260523132802.25391-1-len.bao@gmx.us>
Bryam Vargas [Sat, 6 Jun 2026 07:43:45 +0000 (07:43 +0000)]
wifi: mac80211: bound S1G TIM PVB walk to the TIM element
ieee80211_s1g_check_tim() parses the S1G Partial Virtual Bitmap (PVB) of a
received TIM element. The TIM is handed in as the element payload:
ieee802_11_parse_elems_full() stores elems->tim = elem->data and
elems->tim_len = elem->datalen (net/mac80211/parse.c), so the valid bytes
are [tim, tim + tim_len).
When walking the encoded blocks the function passes the walker an end
sentinel of (const u8 *)tim + tim_len + 2, i.e. two bytes past the end of
the element. ieee80211_s1g_find_target_block() loops while (ptr + 1 <= end)
and dereferences ptr (and the per-mode ieee80211_s1g_len_*() helpers read
*ptr), so it can read up to two bytes beyond the TIM element -- an
out-of-bounds read of adjacent skb/heap data when the TIM is the last
element in the frame. The +2 appears to account for the element id/len
header, but tim already points past that header at the element payload, so
the addend is wrong.
Pass the correct element end, (const u8 *)tim + tim_len.
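
The off-by-two can be reproduced with offsets alone. The following is a simplified model, not the mac80211 parser: bytes_visited() is a hypothetical helper that advances one byte per step, whereas the real walker advances by encoded block lengths, and the offsets are invented for the example.

```python
def bytes_visited(start: int, end: int) -> list:
    """Offsets dereferenced by a walker that loops while ptr + 1 <= end."""
    visited = []
    ptr = start
    while ptr + 1 <= end:
        visited.append(ptr)  # models reading *ptr
        ptr += 1             # simplification: real code skips whole blocks
    return visited


# Hypothetical layout: 2-byte element header, then a 4-byte TIM payload.
tim_off, tim_len = 2, 4
valid = range(tim_off, tim_off + tim_len)  # [tim, tim + tim_len)

buggy = bytes_visited(tim_off, tim_off + tim_len + 2)
fixed = bytes_visited(tim_off, tim_off + tim_len)
print([p for p in buggy if p not in valid])  # [6, 7]
print([p for p in fixed if p not in valid])  # []
```

Because the payload pointer already sits past the header, the extra two bytes in the buggy sentinel fall entirely outside the element.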
xen/platform-pci: Simplify initialization of pci_device_id array
Instead of using a list initializer---that is hard to read unless you know
the structure of struct pci_device_id by heart---use the PCI_VDEVICE
macro to assign the needed values and drop all explicit but unneeded
zeros.
This doesn't introduce any changes to the compiled result of the array.
Junrui Luo [Thu, 16 Apr 2026 14:18:05 +0000 (22:18 +0800)]
mshv: add bounds check on vp_index in mshv_intercept_isr()
mshv_intercept_isr() extracts vp_index from the hypervisor message
payload and uses it directly to index into pt_vp_array without
validation. handle_bitset_message() and handle_pair_message() already
validate vp_index against MSHV_MAX_VPS before array access.
Add the same MSHV_MAX_VPS bounds check for consistency with the other
message handlers.
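
As a rough illustration of the added check, the sketch below models the lookup in Python. It is not the mshv code: lookup_vp() is a hypothetical helper and the MSHV_MAX_VPS value is an assumption chosen for the example.

```python
MSHV_MAX_VPS = 256  # assumed limit for this sketch only


def lookup_vp(pt_vp_array: list, vp_index: int):
    """Return the VP for vp_index, or None if the index is out of range."""
    # The index comes from the hypervisor message payload, so it is
    # validated before it is used to index the array.
    if vp_index < 0 or vp_index >= MSHV_MAX_VPS:
        return None
    return pt_vp_array[vp_index]


vps = ["vp%d" % i for i in range(MSHV_MAX_VPS)]
print(lookup_vp(vps, 3))     # vp3
print(lookup_vp(vps, 4096))  # None
```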
Fixes: 621191d709b1 ("Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Mukesh R [Mon, 1 Jun 2026 22:51:16 +0000 (15:51 -0700)]
x86/hyperv: Cosmetic changes in irqdomain.c for readability
Make cosmetic changes:
o Rename struct pci_dev *dev to *pdev since there are cases of
struct device *dev in the file and all over the kernel
o Rename hv_build_pci_dev_id to hv_build_devid_type_pci in anticipation
of building different types of device ids
o Fix checkpatch.pl issues with return and extraneous printk
o Replace spaces with tabs
o Rename struct hv_devid *xxx to struct hv_devid *hv_devid given code
paths involve many types of device ids
o Fix indentation in a large if block by using goto.
There are no functional changes.
Reviewed-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Signed-off-by: Mukesh R <mrathor@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Mukesh R [Wed, 3 Jun 2026 22:50:10 +0000 (15:50 -0700)]
iommu/hyperv: Create hyperv subdirectory under drivers/iommu
Create hyperv subdirectory under drivers/iommu in anticipation of more
Hyper-V related files from upcoming PCI passthrough and PV-IOMMU patches.
Also, the current file hyperv-iommu.c actually implements irq remapping on
x86, so rename to more appropriate hv-irq-remap-x86.c and move it under
the new hyperv subdirectory. Since this file implements irq_remap_ops
exposed by drivers/iommu/irq_remapping.h, it cannot be relocated to the
irq directory. This is in sync with other backend directories like amd
and intel there.
Lastly, this file should not be tied to CONFIG_HYPERV_IOMMU, but to
CONFIG_HYPERV and CONFIG_IRQ_REMAP.
Signed-off-by: Mukesh R <mrathor@linux.microsoft.com>
Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Akashdeep Kaur [Wed, 3 Jun 2026 07:24:38 +0000 (12:54 +0530)]
cpufreq: ti: Add EPROBE_DEFER for K3 SoCs
On K3 SoCs, ti-cpufreq relies on k3-socinfo to register the SoC
device before soc_device_match() can return valid revision
information. If ti-cpufreq probes before k3-socinfo,
soc_device_match() returns NULL, leading to incorrect CPU frequency
scaling behavior.
Add a needs_k3_socinfo flag to ti_cpufreq_soc_data (similar to
the existing multi_regulator pattern) to defer probe when k3-socinfo
hasn't registered the SoC device yet.
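
The deferral rule amounts to a two-input decision, sketched below. This is a model under assumptions, not the ti-cpufreq driver: ti_cpufreq_probe_model() is a hypothetical name, and how the driver actually detects that the SoC device is registered is not stated in the description, so it is passed in as a boolean here.

```python
EPROBE_DEFER = 517  # kernel-internal error code requesting a later re-probe


def ti_cpufreq_probe_model(needs_k3_socinfo: bool,
                           soc_device_registered: bool) -> int:
    """Return 0 to continue probing, or -EPROBE_DEFER to retry later."""
    if needs_k3_socinfo and not soc_device_registered:
        return -EPROBE_DEFER
    return 0


print(ti_cpufreq_probe_model(True, False))   # -517
print(ti_cpufreq_probe_model(True, True))    # 0
print(ti_cpufreq_probe_model(False, False))  # 0
```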
Taniya Das [Fri, 22 May 2026 15:16:23 +0000 (20:46 +0530)]
cpufreq: qcom: Add cpufreq scaling support for Qualcomm Shikra SoC
The Qualcomm Shikra cpufreq hardware is functionally identical to EPSS,
but supports only up to 12 frequency lookup table (LUT) entries. When all
12 entries are populated, the existing repetitive LUT entry check may read
beyond valid entries and expose incorrect frequencies. Hence, introduce
shikra_epss_soc_data that reuses EPSS configuration with appropriate LUT
entries limit.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Imran Shaik <imran.shaik@oss.qualcomm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The Qualcomm Shikra cpufreq hardware is functionally identical to EPSS,
but supports only up to 12 frequency lookup table (LUT) entries. Introduce
Shikra specific bindings to represent this constrained EPSS variant.
m68k: coldfire: use ColdFire specific IO access in SoC code
Convert all ColdFire specific SoC/board setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in system code
Convert all ColdFire specific system setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
With the basic ColdFire IO register functions now consistently named
with a "mcf_read"/"mcf_write" prefix it makes sense to name the timers
internal access defines the same way. Convert the local __raw_readtrr
and __raw_writetrr defines to use the consistent prefixes too. Thus
the change is:
m68k: coldfire: use ColdFire specific IO access in timer code
Convert all ColdFire specific timer setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in interrupt code
Convert all ColdFire specific interrupt setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in headers
Convert all m68k/ColdFire specific header file code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: create IO access functions for internal registers
The internal peripheral registers contained in all varieties of ColdFire
SoCs require simple big endian access ranging in sizes from 8, 16 and 32
bit. Currently there is a mixture of IO access methods used across the
various CPU support code, some using readx/writex and some using the
simpler __raw_readx/__raw_writex.
The readx/writex use cases are particularly kludgy in that they contain
code to differentiate internal register access and other general attached
peripheral register access - say on a PCI bus. In effect this means that
the readx/writex family for ColdFire is non-standard. This ultimately
ends up causing problems with definitions of other IO access support
functions like ioreadx/ioreadxbe/iowritex/iowritexbe which in the
generic case are defined in terms of readx/writex.
Create a set of internal only register access methods to ultimately
replace all internal register access code. The new access functions
mirror the existing readx/writex family but using the preferred 8/16/32
suffixes.
m68k: defconfig: add config for SnapGear/NETtel board
Add a default configuration for a basic M5307 based NETtel board.
This is primarily to improve defconfig build coverage. This platform
uses the SMSC ethernet drivers for network ports, and has a few other
minor quirks that make it different from other ColdFire platforms.
Add a default configuration for a basic M54418 based EVB board.
The SoC has been supported for a long time but there is no default
configuration. Create one to improve build and test coverage.
Add a default configuration file for the Freescale M5329EVB board.
Although the SoC type has been supported for a long time there has been
no defconfig for the base platform. Create one to give better build
and test coverage.
m68k: coldfire: select legacy gpiolib interface for mcfqspi
The common coldfire code uses the old GPIO number based interfaces for
at least the QSPI chipselect lines. Select the required Kconfig symbol
to keep it building when that becomes optional.
Apparently there are no devices attached to a QSPI controller in any of
the coldfire boards, so this is not actually used in upstream kernels.
David Gow [Sat, 6 Jun 2026 02:03:15 +0000 (10:03 +0800)]
kunit: tool: Don't write to stdout when it should be disabled
The kunit_parser module accepts a 'printer' object which is used as a
destination for all output. This is typically set to stdout, so that the
parsed results are visible, but can be set to a special 'null_printer' to
implement options where not all results are always printed.
However, there are a few places where use of stdout is hardcoded, notably
in handling crashed tests and in outputting the colour escape sequences.
Properly use the specified printer for all output. This is okay for the
colour handling (as this is already gated behind isatty() anyway), and also
for the crash handling, as cases where printer != stdout are separately
printed afterwards.
David Gow [Sat, 6 Jun 2026 01:38:18 +0000 (09:38 +0800)]
kunit: tool: Add (primitive) support for outputting JUnit XML
This is used by things like Jenkins and other CI systems, which can
pretty-print the test output and potentially provide test-level comparisons
between runs.
The implementation here is pretty basic: it only provides the raw results,
split into tests and test suites, and doesn't provide any overall metadata.
However, CI systems like Jenkins can ingest it and it is already useful.
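
For readers unfamiliar with the format, the sketch below shows the general shape of such a file using only the Python standard library. It is not the kunit.py implementation: to_junit_xml() and the input layout are hypothetical, and the element names follow the commonly used JUnit XML convention rather than anything stated in the description.

```python
import xml.etree.ElementTree as ET


def to_junit_xml(suites: dict) -> str:
    """Build a JUnit-style document from {suite: [(test, status), ...]}."""
    root = ET.Element("testsuites")
    for suite_name, tests in suites.items():
        suite = ET.SubElement(root, "testsuite", name=suite_name,
                              tests=str(len(tests)))
        for test_name, status in tests:
            case = ET.SubElement(suite, "testcase", name=test_name,
                                 classname=suite_name)
            if status == "fail":
                ET.SubElement(case, "failure")
            elif status == "skip":
                ET.SubElement(case, "skipped")
    return ET.tostring(root, encoding="unicode")


xml_text = to_junit_xml({"example": [("test_a", "pass"), ("test_b", "skip")]})
print(xml_text)
```

A passing test is represented by a testcase element with no children, which is why only failures and skips add a child element.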
David Gow [Sat, 6 Jun 2026 01:38:17 +0000 (09:38 +0800)]
kunit: tool: Parse and print the reason tests are skipped
When a KUnit test (or other KTAP test) is skipped, a "skip reason" can be
provided. kunit.py has never done anything with this, ignoring anything
included in the KTAP output after the 'SKIP' directive.
Since we have it, and it's used, print it in a nice friendly yellow in
parentheses after a skipped test's name.
(And, by parsing it, it can be included in the JUnit results as well.)
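
Extracting the reason from a KTAP result line can be done with one regular expression. The sketch below is illustrative and is not taken from kunit_parser: SKIP_RE, format_skipped() and the "[SKIPPED]" label are hypothetical, and the escape codes are ordinary ANSI colour sequences chosen for the example.

```python
import re
from typing import Optional

# Matches a KTAP result line carrying a SKIP directive and an optional reason.
SKIP_RE = re.compile(
    r"^\s*(?:not )?ok \d+ (?:- )?(?P<name>.*?) # SKIP\s*(?P<reason>.*)$")
YELLOW, RESET = "\033[1;33m", "\033[0;0m"


def format_skipped(line: str) -> Optional[str]:
    """Return a display string for a skipped test, or None if not skipped."""
    m = SKIP_RE.match(line)
    if m is None:
        return None
    text = "[SKIPPED] " + m.group("name")
    if m.group("reason"):
        # Reason goes in parentheses after the name, coloured yellow.
        text += " " + YELLOW + "(" + m.group("reason") + ")" + RESET
    return text


print(format_skipped("ok 2 example_skip_test # SKIP this test should be skipped"))
print(format_skipped("ok 1 example_simple_test"))  # None
```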
This series fixes AA-deadlocks where NMI and tracepoint BPF programs
re-enter the per-CPU or global LRU lock already held on the same CPU
(syzbot c69a0a2c816716f1e0d5, 18b26edb69b2e19f3b33).
Patch 1 converts every LRU lock site to rqspinlock_t
and adds explicit recovery for some failures so no node leaks.
Patch 2 refreshes Documentation/bpf/map_lru_hash_update.dot to show
the new rqspinlock failure exits and recovery routes.
Patch 3 introduces a stress test.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
Changes in v3:
- Removed RFC tag
- Link to v2: https://patch.msgid.link/20260603-lru_map_spin-v2-0-7060cfb6cdac@meta.com
Changes in v2:
- Patch 1: __bpf_lru_node_move_in() now clears pending_free only when
moving to the FREE list.
- Patch 3: address sashiko's feedback.
- Link to v1: https://patch.msgid.link/20260528-lru_map_spin-v1-0-4f52223170cf@meta.com
====================
Introduce a stress test for bpf_lru_list that exercises the
lock-failure and orphan-recovery paths added by the LRU rqspinlock
conversion.
Runs three subtests: common LRU, per-CPU LRU lists (BPF_F_NO_COMMON_LRU),
and per-CPU LRU map. Each pins one userspace hammer per CPU and attaches
the perf_event NMI BPF prog (update+delete mix) on every online CPU.
Pre-fix, lockdep fires the "INITIAL USE -> IN-NMI" splat during stress.
After the stress test, drain_then_verify_capacity() drains every key
and refills the LRU map.
A stranded node on any CPU's pool would have forced eviction of
a just-inserted key on that CPU, surfacing here as a missing lookup.
Marked serial_ because per-CPU pinning and high-rate HW perf events
would perturb parallel tests.
Mykyta Yatsenko [Sun, 7 Jun 2026 20:30:42 +0000 (13:30 -0700)]
Documentation/bpf: Refresh map_lru_hash_update.dot for rqspinlock
Reflect the rqspinlock conversion and orphan-recovery paths added in
the previous commit:
- All LRU locks are rqspinlock_t; any acquire can fail (AA or
timeout). A shared "rqspinlock acquire failed" terminal collapses
to the existing -ENOMEM exit. Dashed arrows from each acquire site
mark the failure paths.
- The per-CPU local freelist is now lockless (free_llist).
- Post-steal: re-acquiring loc_l->lock to insert the stolen node
into the local pending list can fail; on failure the node is
published to free_llist instead of being orphaned, and the call
returns -ENOMEM.
- Steal-loop victim lock failure is silent: skip the victim and try
the next CPU.
Mykyta Yatsenko [Sun, 7 Jun 2026 20:30:41 +0000 (13:30 -0700)]
bpf: Fix NMI/tracepoint re-entry deadlock on lru locks
NMI and tracepoint BPF programs can re-enter the per-CPU or global
LRU lock that bpf_lru_pop_free()/push_free() already hold on the
same CPU, AA-deadlocking. Lockdep reports "inconsistent
{INITIAL USE} -> {IN-NMI}" on &l->lock (syzbot c69a0a2c816716f1e0d5)
and "possible recursive locking detected" on &loc_l->lock (syzbot 18b26edb69b2e19f3b33).
Prior trylock and rqspinlock based fixes (see links) were nacked
because they compromised on reliability.
This patch converts every LRU lock site to rqspinlock_t and adds a
recovery path for some failure windows to avoid node leaks.
Failure recovery:
- *_pop_free top-level: return NULL; prealloc_lru_pop() already
treats that as no-free-element (-ENOMEM).
- Cross-CPU steal: skip the victim's locked loc_l, try next CPU.
- Post-steal local lock fail: publish stolen node to lockless
per-CPU free_llist; next pop on this CPU picks it up.
- push_free fail: mark node pending_free=1. __local_list_flush(),
__local_list_pop_pending() reclaim the node from pending_list.
__bpf_lru_list_shrink_inactive() reclaims the node from inactive
list. Nodes on the active list are reclaimed by __bpf_lru_list_shrink()
or after __bpf_lru_list_rotate_active() demotes them to the inactive list.
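The cross-CPU steal rule above (skip a victim whose lock cannot be taken, then try the next CPU) can be sketched in plain userspace C. This is a single-threaded toy model, not the kernel code: toy_lock, toy_cpu_list and toy_steal() are hypothetical names, and the bool flag only stands in for a lock whose acquire can fail, as an rqspinlock acquire can.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock whose acquire may fail instead of spinning.
 * Single-threaded model only; a real lock would need atomics. */
struct toy_lock {
	bool held;
};

/* Returns 0 on success, -1 if the lock is already held. */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -1;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical per-CPU list: a lock plus a count of free nodes. */
struct toy_cpu_list {
	struct toy_lock lock;
	int nr_free;
};

/*
 * Visit every other CPU once, starting after 'self'. A victim whose lock
 * cannot be acquired is skipped silently. Returns the index of the CPU a
 * node was taken from, or -1 if nothing could be stolen.
 */
static int toy_steal(struct toy_cpu_list *cpus, int nr_cpus, int self)
{
	for (int i = 1; i < nr_cpus; i++) {
		int victim = (self + i) % nr_cpus;
		struct toy_cpu_list *v = &cpus[victim];

		if (toy_lock_acquire(&v->lock))
			continue;	/* acquire failed: try the next CPU */

		if (v->nr_free > 0) {
			v->nr_free--;
			toy_lock_release(&v->lock);
			return victim;
		}
		toy_lock_release(&v->lock);
	}
	return -1;
}
```

In this model a failed acquire is not reported to the caller at all; only running out of victims is, which is the point where the real code would fall back to its no-free-element result.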
Now that the Rust KUnit tests are protected with Kconfig, update the
documentation to mention it.
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Reviewed-by: David Gow <david@davidgow.net>
Acked-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260417031531.315281-4-ynorov@nvidia.com
[ Fixed the paragraph by moving the new sentence above. Added gate
in the other example as well. Applied proper formatting. Reworded
slightly. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
There are 6 individual Rust KUnit test suites (plus the doctests one). All
the tests are compiled unconditionally now, which adds ~200 kB to the
kernel image for me on x86_64. As Rust matures, this bloat will
inevitably grow.
Add Kconfig.test which includes a RUST_KUNIT_TESTS menu, and all
individual tests under it.
As usual, new tests are all enabled if KUNIT_ALL_TESTS=y.
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Reviewed-by: David Gow <david@davidgow.net>
Acked-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260417031531.315281-3-ynorov@nvidia.com
[ Fixed capitalization. Used singular for "API" for consistency.
Reworded to clarify these are suites and that there exists
the doctests one (which is the biggest at the moment by
far). - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
rust: tests: drop 'use crate' in bitmap and atomic KUnit tests
The following patch makes use of the macros::kunit_tests crate conditional
on the corresponding configs. When the configs are disabled, the compiler
warns about the unused crate. So, embed it in the unit test declaration.
The wakeup condition if a min timeout is present and has expired is that
at least _one_ CQE was posted. Thus set the cq_tail target to
->cq_min_tail + 1. Without this commit a spurious wakeup can result in a
premature wakeup because io_should_wake() will return true even if _no_
CQE was posted at all.
Cc: Tip ten Brink <tip@tenbrinkmeijs.com>
Fixes: e15cb2200b93 ("io_uring: fix min_wait wakeups for SQPOLL")
Cc: stable@vger.kernel.org
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Link: https://patch.msgid.link/20260606201120.1441447-1-lk@c--e.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
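The effect of the "+ 1" in the commit above can be illustrated with a small userspace model. This is only a sketch under assumptions: toy_should_wake() and toy_min_wait_target() are hypothetical helpers, and the wake test is reduced to a wraparound-safe "tail has reached the target" comparison; the real io_should_wake() may check more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the wake test: wake once the CQ tail has reached
 * the target. The signed difference keeps the comparison correct when the
 * 32-bit tail counter wraps around.
 */
static bool toy_should_wake(uint32_t cq_tail, uint32_t target_tail)
{
	return (int32_t)(cq_tail - target_tail) >= 0;
}

/*
 * Target once the min timeout has expired: at least one CQE posted beyond
 * the tail recorded when the wait started (cq_min_tail).
 */
static uint32_t toy_min_wait_target(uint32_t cq_min_tail)
{
	return cq_min_tail + 1;
}
```

With the target set to cq_min_tail itself, toy_should_wake(10, 10) is already true even though no CQE was posted, which models the premature wakeup the commit describes. With toy_min_wait_target(10) the tail has to advance to 11 first.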
Jens Axboe [Sun, 7 Jun 2026 22:05:47 +0000 (16:05 -0600)]
io_uring/kbuf: don't truncate end buffer for bundles
If buffers have been peeked for a bundle receive, the kernel will
truncate the end buffer, if the available length is shorter than the
buffer itself. This is unnecessary, as applications iterating bundle
receives must always use the minimum size of the buffer length and the
remaining number of bytes in the bundle. The examples in liburing do
that as well, e.g. examples/proxy.c.
If the kernel does truncate this buffer AND the current transfer fails,
then the buffer will be left with a smaller size than what is otherwise
available.
Just remove the buffer truncation, as it's not necessary in the first
place.
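The iteration rule the commit above relies on (for each buffer, consume the minimum of the buffer length and the bytes still remaining in the bundle) can be sketched as follows. This is an illustrative userspace model, not liburing code: struct toy_buf and toy_walk_bundle() are hypothetical, and a real application would look the buffers up in its provided-buffer ring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one provided buffer: only its full length. */
struct toy_buf {
	size_t len;
};

/*
 * Walk the buffers of one bundle holding total_bytes of data. For each
 * buffer touched, out_used[i] receives min(buffer length, bytes remaining).
 * Returns the number of buffers the bundle consumed.
 */
static size_t toy_walk_bundle(const struct toy_buf *bufs, size_t nr_bufs,
			      size_t total_bytes, size_t *out_used)
{
	size_t remaining = total_bytes;
	size_t i = 0;

	assert(bufs && out_used);

	while (remaining > 0 && i < nr_bufs) {
		size_t this_len = bufs[i].len < remaining ? bufs[i].len : remaining;

		out_used[i] = this_len;
		remaining -= this_len;
		i++;
	}
	return i;
}
```

Because the consumer clamps the final buffer itself, the result is the same whether or not the recorded length of the end buffer was truncated beforehand.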
Linus Torvalds [Sun, 7 Jun 2026 20:12:29 +0000 (13:12 -0700)]
Merge tag 'x86-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Add more AMD Zen6 models (Pratik Vishwakarma)
- Avoid confusing bootup message by the Intel resctrl enumeration
code when running on certain AMD systems (Tony Luck)
* tag 'x86-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Only check Intel systems for SNC
x86/CPU/AMD: Add more Zen6 models
Linus Torvalds [Sun, 7 Jun 2026 20:02:02 +0000 (13:02 -0700)]
Merge tag 'timers-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
- Fix the arch_inlined_clockevent_set_next_coupled() prototype in the
!CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST case (Naveen Kumar Chaudhary)
- Fix an off-by-1 bug in the sys_settimeofday() usecs validation code
(Naveen Kumar Chaudhary)
- Mark vdso_k_*_data pointers as __ro_after_init (Thomas Weißschuh)
- Fix livelock race in tmigr_handle_remote_up() (Amit Matityahu)
* tag 'timers-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/migration: Fix livelock in tmigr_handle_remote_up()
vdso/datastore: Mark vdso_k_*_data pointers as __ro_after_init
time: Fix off-by-one in settimeofday() usec validation
clockevents: Fix duplicate type specifier in stub function parameter
Linus Torvalds [Sun, 7 Jun 2026 19:54:37 +0000 (12:54 -0700)]
Merge tag 'sched-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fix from Ingo Molnar:
- Fix uninitialized stack variable in rseq_exit_user_update() (Qing
Wang)
* tag 'sched-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Fix using an uninitialized stack variable in rseq_exit_user_update()