pinctrl: qcom: lpass-lpi: Switch to PM clock framework for runtime PM
Convert the LPASS LPI pinctrl driver to use the PM clock framework for
runtime power management.
This allows the LPASS LPI pinctrl driver to drop clock votes when idle,
improves power efficiency on platforms using LPASS LPI island mode, and
aligns the driver with common runtime PM patterns used across Qualcomm
LPASS subsystems.
Guard GPIO register read/write helpers and slew-rate register programming
with synchronous runtime PM calls so the device is active during MMIO
operations whenever autosuspend is enabled.
Make PINCTRL_LPASS_LPI depend on PM_CLK, since this patch introduces
direct PM clock API use in the shared core.
The LPASS LPI core conversion to PM clock framework relies on variant
drivers wiring runtime PM callbacks.
Hook up runtime PM callbacks for the LPASS LPI variant drivers touched
in this patch so they are prepared for the shared core conversion.
This commit is a preparatory NOP on its own, as runtime PM is still
disabled on these devices until the following core conversion patch.
This is a mechanical per-variant driver update that relies on the
same generic PM clock flow (of_pm_clk_add_clks() + pm_clk_suspend/
pm_clk_resume()) and DT-provided clocks.
Runtime behavior was validated on Kodiak (sc7280).
OF_GPIO is selected automatically on all OF systems. Any symbols it
controls also provide stubs and are private to GPIOLIB anyway so there's
really no reason to select it explicitly.
staging: media: max96712: drop unneeded dependency on OF_GPIO
OF_GPIO is selected automatically on all OF systems. Any symbols it
controls also provide stubs and are private to GPIOLIB anyway so there's
really no reason to select it explicitly.
Billy Tsai [Fri, 5 Jun 2026 06:38:09 +0000 (14:38 +0800)]
pinctrl: aspeed: Fix GPIO mux value for ADC-capable balls
aspeed_g7_soc1_gpio_request_enable() unconditionally writes mux
function 0 to route the requested pin to GPIO. This is wrong for the
ADC-capable balls W17 through AB19 (ADC0-ADC15), where function 0
selects the ADC input and function 1 selects GPIO. Requesting one of
those GPIOs therefore muxed the ball to ADC instead.
Write mux value 1 for balls W17 through AB19 so the GPIO function is
actually selected.
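As an illustration only, the selection logic can be sketched in plain C. The pin numbers below are hypothetical placeholders and the sketch assumes the ADC-capable balls are numbered consecutively; it is not the driver's actual code.

```c
#include <stdbool.h>

/* Hypothetical pin numbers for the first and last ADC-capable balls. */
#define PIN_W17		100u
#define PIN_AB19	115u	/* 16 consecutive balls: ADC0-ADC15 */

/* Assumption: the ADC-capable balls form one contiguous pin range. */
static bool pin_is_adc_capable(unsigned int pin)
{
	return pin >= PIN_W17 && pin <= PIN_AB19;
}

/* Mux function value that routes the given pin to GPIO. */
static unsigned int gpio_mux_value(unsigned int pin)
{
	/* Function 0 is ADC on these balls, so GPIO is function 1. */
	return pin_is_adc_capable(pin) ? 1u : 0u;
}
```

For any pin outside the ADC range the sketch keeps writing function 0, which matches the behavior described for the remaining balls.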
power: sequencing: Add an API to return the pwrseq device's 'dev' pointer
The consumer drivers can make use of the pwrseq device's 'dev' pointer to
query the pwrseq provider's DT node to check for existence of specific
properties.
Hence, add an API to return the pwrseq device's 'dev' pointer to consumers.
Note that since pwrseq_get() would've increased the pwrseq refcount, there
is no need to increase the refcount in this API again.
Mingyu Wang [Mon, 4 May 2026 07:48:23 +0000 (15:48 +0800)]
agp/amd64: Fix broken error propagation in agp_amd64_probe()
A NULL pointer dereference was observed in the AMD64 AGP driver when
running in a virtualized environment (e.g. qemu/kvm) without a physical
AMD northbridge. The crash occurs in amd64_fetch_size() when attempting
to dereference the pointer returned by node_to_amd_nb(0).
The root cause of this crash is broken error propagation in
agp_amd64_probe(): When no AMD northbridges are found, cache_nbs()
correctly returns -ENODEV. However, the probe function erroneously
checks the return value against exactly -1, rather than < 0.
As a result, the hardware absence error is masked, allowing the driver
to improperly proceed with initialization. It eventually calls
agp_add_bridge(), which invokes amd64_fetch_size(). Since the hardware
does not exist, node_to_amd_nb(0) returns NULL, leading to a General
Protection Fault (GPF) when accessing its ->misc member.
Fix the issue by correcting the error check in agp_amd64_probe() to
abort properly when cache_nbs() returns any negative error code. This
prevents the driver from erroneously proceeding without hardware, thereby
avoiding the subsequent NULL pointer dereference at its source.
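The difference between the two checks can be shown with a small userspace sketch. cache_nbs_stub() is a hypothetical stand-in that always reports missing hardware; the two probe functions only model the return-value check, not the real probe path.

```c
#include <errno.h>

/* Hypothetical stand-in for cache_nbs(): no AMD northbridge present. */
static int cache_nbs_stub(void)
{
	return -ENODEV;
}

/* Old check: only the literal value -1 is treated as a failure. */
static int probe_buggy(void)
{
	if (cache_nbs_stub() == -1)
		return -ENODEV;
	return 0;	/* -ENODEV slips through, initialization continues */
}

/* Fixed check: any negative errno aborts the probe and is propagated. */
static int probe_fixed(void)
{
	int ret = cache_nbs_stub();

	if (ret < 0)
		return ret;
	return 0;
}
```

Since -ENODEV is not -1, the old comparison reports success even though no hardware was found.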
Fixes: a32073bffc65 ("[PATCH] x86_64: Clean and enhance up K8 northbridge access code")
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v2.6.18+
Link: https://patch.msgid.link/20260504074823.99377-1-w15303746062@163.com
power: sequencing: pcie-m2: Create BT node based on the pci_device_id[] table
Currently, pwrseq_pcie_m2_create_bt_node() hardcodes the BT compatible for
creating the devicetree node. But to allow adding support for more devices
in the future, create the BT node based on the pci_device_id[] table. The
BT compatible is passed using 'driver_data'.
Co-developed-by: Wei Deng <wei.deng@oss.qualcomm.com>
Signed-off-by: Wei Deng <wei.deng@oss.qualcomm.com>
Tested-by: Wei Deng <wei.deng@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://patch.msgid.link/20260519-pwrseq-m2-bt-v3-5-b39dc2ae3966@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
power: sequencing: pcie-m2: Create serdev for PCI devices present before probe
So far, the driver is registering a notifier to create serdev for the PCI
devices that are going to be attached after probe. But it doesn't handle
the devices present before probe. Due to this, serdev is not getting
created for those existing devices.
Hence, create serdev for PCI devices available before probe as well.
Note that the serdev for available devices are created before
registering the notifier. There is a small window where a device could
appear after pwrseq_pcie_m2_create_serdev(), before notifier registration.
But since M.2 cards are fixed to a slot, they are mostly added either
before booting the host or after using hotplug. So this window is mostly
theoretical.
power: sequencing: pcie-m2: Allow creating serdev for multiple PCI devices
Current code makes it possible to create serdev for only one PCI device.
But for scaling this driver, it is necessary to allow creating serdev for
multiple PCI devices.
Hence, add provision for it by creating 'struct pwrseq_pci_dev' for each
PCI device that requires serdev and add them to
'pwrseq_pcie_m2_ctx::pci_devices' list.
Juergen Gross [Tue, 26 May 2026 15:05:13 +0000 (17:05 +0200)]
x86/xen: Get rid of last XEN_LAZY_MMU uses
There are only very few use cases of XEN_LAZY_MMU left. Get rid of
them in order to avoid having to call enter_lazy(XEN_LAZY_MMU) and
leave_lazy(XEN_LAZY_MMU).
The query in xen_batched_set_pte() can be replaced by using
is_lazy_mmu_mode_active() instead.
As xen_flush_lazy_mmu() will be called only with lazy MMU mode being
active, the test for the lazy mode can just be dropped.
In xen_start_context_switch() and xen_end_context_switch() use
__task_lazy_mmu_mode_pause() and __task_lazy_mmu_mode_resume(),
allowing to drop xen_enter_lazy_mmu() and xen_leave_lazy_mmu()
completely.
Call arch_flush_lazy_mmu_mode() from arch_leave_lazy_mmu_mode(), as
this is the only required action now.
Drop the lazy mmu enter and leave paravirt hooks, leaving the flush
hook as the only needed one.
Juergen Gross [Fri, 22 May 2026 15:21:14 +0000 (17:21 +0200)]
x86/xen: Remove Xen debugfs support
The only Xen file in debugfs is for dumping the p2m table when running
as a Xen PV guest. This might have been useful when the PV code was
young, but there haven't been any p2m related bugs requiring the p2m
dump for ages.
Juergen Gross [Fri, 22 May 2026 15:21:12 +0000 (17:21 +0200)]
x86/xen: Guard PV-only stuff in xen-ops.h with CONFIG_XEN_PV
A lot of arch/x86/xen/xen-ops.h is meant to be for PV only. Guard all
of it with CONFIG_XEN_PV in order to avoid someone misusing it in
non-PV builds. Additionally any 64-bit tests for now guarded items can
be dropped.
Move the enum pt_level definition to mmu_pv.c, as it is used only there.
Len Bao [Sat, 23 May 2026 13:28:01 +0000 (13:28 +0000)]
xen/mcelog: mark g_physinfo, ncpus and xen_mce_chrdev_device as __ro_after_init
The 'g_physinfo' and 'ncpus' variables are initialized only during the
init phase in the 'bind_virq_for_mce' function and never changed. So,
mark them as __ro_after_init.
The 'xen_mce_chrdev_device' variable is initialized only in the
declaration and never changed. So, this variable could be 'const', but
using the 'misc_register' and 'misc_deregister' functions discards the
'const' qualifier. Therefore, as an alternative, mark it as
__ro_after_init.
Signed-off-by: Len Bao <len.bao@gmx.us>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260523132802.25391-1-len.bao@gmx.us>
Bryam Vargas [Sat, 6 Jun 2026 07:43:45 +0000 (07:43 +0000)]
wifi: mac80211: bound S1G TIM PVB walk to the TIM element
ieee80211_s1g_check_tim() parses the S1G Partial Virtual Bitmap (PVB) of a
received TIM element. The TIM is handed in as the element payload:
ieee802_11_parse_elems_full() stores elems->tim = elem->data and
elems->tim_len = elem->datalen (net/mac80211/parse.c), so the valid bytes
are [tim, tim + tim_len).
When walking the encoded blocks the function passes the walker an end
sentinel of (const u8 *)tim + tim_len + 2, i.e. two bytes past the end of
the element. ieee80211_s1g_find_target_block() loops while (ptr + 1 <= end)
and dereferences ptr (and the per-mode ieee80211_s1g_len_*() helpers read
*ptr), so it can read up to two bytes beyond the TIM element -- an
out-of-bounds read of adjacent skb/heap data when the TIM is the last
element in the frame. The +2 appears to account for the element id/len
header, but tim already points past that header at the element payload, so
the addend is wrong.
Pass the correct element end, (const u8 *)tim + tim_len.
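The effect of the wrong addend can be seen in a simplified userspace sketch. walk_bytes() is a hypothetical walker that advances one byte per iteration and only reuses the loop condition quoted above; the real block parser is more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* End sentinel as passed before the fix: two bytes past the payload. */
static const uint8_t *tim_end_buggy(const uint8_t *tim, size_t tim_len)
{
	return tim + tim_len + 2;
}

/* End sentinel after the fix: exactly the end of the element payload. */
static const uint8_t *tim_end_fixed(const uint8_t *tim, size_t tim_len)
{
	return tim + tim_len;
}

/*
 * Simplified walker using the same loop condition as the real code.
 * Returns how many bytes it would dereference before stopping.
 */
static size_t walk_bytes(const uint8_t *ptr, const uint8_t *end)
{
	size_t n = 0;

	while (ptr + 1 <= end) {
		ptr++;	/* the real code reads *ptr here */
		n++;
	}
	return n;
}
```

With a 6-byte payload the fixed sentinel stops after 6 bytes, while the old one lets the walker touch 8.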
xen/platform-pci: Simplify initialization of pci_device_id array
Instead of using a list initializer---that is hard to read unless you know
the structure of struct pci_device_id by heart---use the PCI_VDEVICE
macro to assign the needed values and drop all explicit but unneeded
zeros.
This doesn't introduce any changes to the compiled result of the array.
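To illustrate the readability difference, here is a self-contained sketch with simplified stand-ins for struct pci_device_id and PCI_VDEVICE. The definitions and the vendor/device values are local assumptions for the example, not copies of the kernel headers or of this driver's table.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions (illustration only). */
#define PCI_ANY_ID		(~0u)
#define PCI_VENDOR_ID_XEN	0x5853u

struct pci_device_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class, class_mask;
	unsigned long driver_data;
};

#define PCI_VDEVICE(vend, dev) \
	.vendor = PCI_VENDOR_ID_##vend, .device = (dev), \
	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID

/* Positional form: the reader has to know the field order. */
static const struct pci_device_id ids_positional[] = {
	{ PCI_VENDOR_ID_XEN, 0x0001, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
	{ 0 }
};

/* Macro form: only the meaningful values are spelled out. */
static const struct pci_device_id ids_macro[] = {
	{ PCI_VDEVICE(XEN, 0x0001) },
	{ 0 }
};
```

Both tables initialize every field to the same value; fields not named in the macro form are zeroed implicitly.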
Akashdeep Kaur [Wed, 3 Jun 2026 07:24:38 +0000 (12:54 +0530)]
cpufreq: ti: Add EPROBE_DEFER for K3 SoCs
On K3 SoCs, ti-cpufreq relies on k3-socinfo to register the SoC
device before soc_device_match() can return valid revision
information. If ti-cpufreq probes before k3-socinfo,
soc_device_match() returns NULL, leading to incorrect CPU frequency
scaling behavior.
Add a needs_k3_socinfo flag to ti_cpufreq_soc_data (similar to
the existing multi_regulator pattern) to defer probe when k3-socinfo
hasn't registered the SoC device yet.
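A rough userspace model of the deferral, under stated assumptions: soc_match_stub() is a hypothetical replacement for soc_device_match(), struct soc_data_sketch carries only the new flag, and EPROBE_DEFER is defined locally because it is a kernel-internal errno.

```c
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal errno value, not provided by userspace <errno.h>. */
#define EPROBE_DEFER	517

/* Hypothetical reduction of ti_cpufreq_soc_data to the new flag. */
struct soc_data_sketch {
	bool needs_k3_socinfo;
};

/* Stand-in for soc_device_match(): NULL until the SoC device exists. */
static const void *soc_match_stub(bool socinfo_registered)
{
	static const int match = 0;

	return socinfo_registered ? &match : NULL;
}

static int probe_sketch(const struct soc_data_sketch *data,
			bool socinfo_registered)
{
	/* Only SoCs that depend on k3-socinfo defer their probe. */
	if (data->needs_k3_socinfo && !soc_match_stub(socinfo_registered))
		return -EPROBE_DEFER;

	return 0;
}
```

Returning -EPROBE_DEFER asks the driver core to retry the probe later, by which time k3-socinfo may have registered the SoC device.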
Taniya Das [Fri, 22 May 2026 15:16:23 +0000 (20:46 +0530)]
cpufreq: qcom: Add cpufreq scaling support for Qualcomm Shikra SoC
The Qualcomm Shikra cpufreq hardware is functionally identical to EPSS,
but supports only up to 12 frequency lookup table (LUT) entries. When all
12 entries are populated, the existing repetitive LUT entry check may read
beyond valid entries and expose incorrect frequencies. Hence, introduce
shikra_epss_soc_data that reuses EPSS configuration with appropriate LUT
entries limit.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Imran Shaik <imran.shaik@oss.qualcomm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The Qualcomm Shikra cpufreq hardware is functionally identical to EPSS,
but supports only up to 12 frequency lookup table (LUT) entries. Introduce
Shikra specific bindings to represent this constrained EPSS variant.
m68k: coldfire: use ColdFire specific IO access in SoC code
Convert all ColdFire specific SoC/board setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in system code
Convert all ColdFire specific system setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
With the basic ColdFire IO register functions now consistently named
with a "mcf_read"/"mcf_write" prefix it makes sense to name the timers
internal access defines the same way. Convert the local __raw_readtrr
and __raw_writetrr defines to use the consistent prefixes too. Thus
the change is:
m68k: coldfire: use ColdFire specific IO access in timer code
Convert all ColdFire specific timer setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in interrupt code
Convert all ColdFire specific interrupt setup code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: use ColdFire specific IO access in headers
Convert all m68k/ColdFire specific header file code to only use the
newly created internal register access methods. This is replacing the
mixed and inconsistent use of readx/writex and __raw_readx/__raw_writex
for internal SoC registers.
m68k: coldfire: create IO access functions for internal registers
The internal peripheral registers contained in all varieties of ColdFire
SoCs require simple big endian access ranging in sizes from 8, 16 and 32
bit. Currently there is a mixture of IO access methods used across the
various CPU support code, some using readx/writex and some using the
simpler __raw_readx/__raw_writex.
The readx/writex use cases are particularly kludgy in that they contain
code to differentiate internal register access and other general attached
peripheral register access - say on a PCI bus. In effect this means that
the readx/writex family for ColdFire is non-standard. This ultimately
ends up causing problems with definitions of other IO access support
functions like ioreadx/ioreadxbe/iowritex/iowritexbe which in the
generic case are defined in terms of readx/writex.
Create a set of internal only register access methods to ultimately
replace all internal register access code. The new access functions
mirror the existing readx/writex family but using the preferred 8/16/32
suffixes.
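A minimal sketch of what such accessors could look like, assuming the mcf_read/mcf_write prefix with 8/16/32 suffixes and a (value, address) argument order like writel(). The exact names and signatures are assumptions; plain native-order accesses are used because ColdFire itself is big endian.

```c
#include <stdint.h>

/* Reads of internal registers: plain volatile loads, no byte swapping. */
static inline uint8_t mcf_read8(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline uint16_t mcf_read16(const volatile void *addr)
{
	return *(const volatile uint16_t *)addr;
}

static inline uint32_t mcf_read32(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Writes of internal registers: value first, address second. */
static inline void mcf_write8(uint8_t val, volatile void *addr)
{
	*(volatile uint8_t *)addr = val;
}

static inline void mcf_write16(uint16_t val, volatile void *addr)
{
	*(volatile uint16_t *)addr = val;
}

static inline void mcf_write32(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}
```

Because these helpers never need to distinguish internal registers from bus-attached peripherals, readx/writex can eventually return to their standard meaning.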
m68k: defconfig: add config for SnapGear/NETtel board
Add a default configuration for a basic M5307 based NETtel board.
This is primarily to improve defconfig build coverage. This platform
uses the SMSC ethernet drivers for network ports, and has a few other
minor quirks that make it different from other ColdFire platforms.
Add a default configuration for a basic M54418 based EVB board.
The SoC has been supported for a long time but there is no default
configuration. Create one to improve build and test coverage.
Add a default configuration file for the Freescale M5329EVB board.
Although the SoC type has been supported for a long time there has been
no defconfig for the base platform. Create one to give better build
and test coverage.
m68k: coldfire: select legacy gpiolib interface for mcfqspi
The common coldfire code uses the old GPIO number based interfaces for
at least the QSPI chipselect lines. Select the required Kconfig symbol
to keep it building when that becomes optional.
Apparently there are no devices attached to a QSPI controller in any of
the coldfire boards, so this is not actually used in upstream kernels.
David Gow [Sat, 6 Jun 2026 02:03:15 +0000 (10:03 +0800)]
kunit: tool: Don't write to stdout when it should be disabled
The kunit_parser module accepts a 'printer' object which is used as a
destination for all output. This is typically set to stdout, so that the
parsed results are visible, but can be set to a special 'null_printer' to
implement options where not all results are always printed.
However, there are a few places where use of stdout is hardcoded, notably
in handling crashed tests and in outputting the colour escape sequences.
Properly use the specified printer for all output. This is okay for the
colour handling (as this is already gated behind isatty() anyway), and also
for the crash handling, as cases where printer != stdout are separately
printed afterwards.
David Gow [Sat, 6 Jun 2026 01:38:18 +0000 (09:38 +0800)]
kunit: tool: Add (primitive) support for outputting JUnit XML
This is used by things like Jenkins and other CI systems, which can
pretty-print the test output and potentially provide test-level comparisons
between runs.
The implementation here is pretty basic: it only provides the raw results,
split into tests and test suites, and doesn't provide any overall metadata.
However, CI systems like Jenkins can ingest it and it is already useful.
David Gow [Sat, 6 Jun 2026 01:38:17 +0000 (09:38 +0800)]
kunit: tool: Parse and print the reason tests are skipped
When a KUnit test (or other KTAP test) is skipped, a "skip reason" can be
provided. kunit.py has never done anything with this, ignoring anything
included in the KTAP output after the 'SKIP' directive.
Since we have it, and it's used, print it in a nice friendly yellow in
parentheses after a skipped test's name.
(And, by parsing it, it can be included in the JUnit results as well.)
This series fixes AA-deadlocks where NMI and tracepoint BPF programs
re-enter the per-CPU or global LRU lock already held on the same CPU
(syzbot c69a0a2c816716f1e0d5, 18b26edb69b2e19f3b33).
Patch 1 converts every LRU lock site to rqspinlock_t
and adds explicit recovery for some failures so no node leaks.
Patch 2 refreshes Documentation/bpf/map_lru_hash_update.dot to show
the new rqspinlock failure exits and recovery routes.
Patch 3 introduces a stress test.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
Changes in v3:
- Removed RFC tag
- Link to v2: https://patch.msgid.link/20260603-lru_map_spin-v2-0-7060cfb6cdac@meta.com
Changes in v2:
- Patch 1: __bpf_lru_node_move_in() now clears pending_free only when
moving to the FREE list.
- Patch 3: address sashiko's feedback.
- Link to v1: https://patch.msgid.link/20260528-lru_map_spin-v1-0-4f52223170cf@meta.com
====================
Introduces stress test for bpf_lru_list that exercises
lock-failures and orphan-recovery, added by the LRU rqspinlock
conversion.
Runs three subtests: common LRU, per-CPU LRU lists (BPF_F_NO_COMMON_LRU),
and per-CPU LRU map. Each pins one userspace hammer per CPU and attaches
the perf_event NMI BPF prog (update+delete mix) on every online CPU.
Pre-fix, lockdep fires the "INITIAL USE -> IN-NMI" splat during stress.
After stress test, drain_then_verify_capacity() drains every key
and refills the lru map.
A stranded node on any CPU's pool would have forced eviction of
a just-inserted key on that CPU, surfacing here as a missing lookup.
Marked serial_ because per-CPU pinning and high-rate HW perf events
would perturb parallel tests.
Mykyta Yatsenko [Sun, 7 Jun 2026 20:30:42 +0000 (13:30 -0700)]
Documentation/bpf: Refresh map_lru_hash_update.dot for rqspinlock
Reflect the rqspinlock conversion and orphan-recovery paths added in
the previous commit:
- All LRU locks are rqspinlock_t; any acquire can fail (AA or
timeout). A shared "rqspinlock acquire failed" terminal collapses
to the existing -ENOMEM exit. Dashed arrows from each acquire site
mark the failure paths.
- The per-CPU local freelist is now lockless (free_llist).
- Post-steal: re-acquiring loc_l->lock to insert the stolen node
into the local pending list can fail; on failure the node is
published to free_llist instead of being orphaned, and the call
returns -ENOMEM.
- Steal-loop victim lock failure is silent: skip the victim and try
the next CPU.
Mykyta Yatsenko [Sun, 7 Jun 2026 20:30:41 +0000 (13:30 -0700)]
bpf: Fix NMI/tracepoint re-entry deadlock on lru locks
NMI and tracepoint BPF programs can re-enter the per-CPU or global
LRU lock that bpf_lru_pop_free()/push_free() already hold on the
same CPU, AA-deadlocking. Lockdep reports "inconsistent
{INITIAL USE} -> {IN-NMI}" on &l->lock (syzbot c69a0a2c816716f1e0d5)
and "possible recursive locking detected" on &loc_l->lock (syzbot 18b26edb69b2e19f3b33).
Prior trylock and rqspinlock based fixes (see links) were nacked
because they compromised on reliability.
This patch converts every LRU lock site to rqspinlock_t and adds a
recovery path for some failure windows to avoid node leaks.
Failure recovery:
- *_pop_free top-level: return NULL; prealloc_lru_pop() already
treats that as no-free-element (-ENOMEM).
- Cross-CPU steal: skip the victim's locked loc_l, try next CPU.
- Post-steal local lock fail: publish stolen node to lockless
per-CPU free_llist; next pop on this CPU picks it up.
- push_free fail: mark node pending_free=1. __local_list_flush(),
__local_list_pop_pending() reclaim the node from pending_list.
__bpf_lru_list_shrink_inactive() reclaims the node from inactive
list. Nodes from active list are reclaimed by __bpf_lru_list_shrink()
or after __bpf_lru_list_rotate_active() demotes it to the inactive.
Now that the Rust KUnit tests are protected with Kconfig, update the
documentation to mention it.
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Reviewed-by: David Gow <david@davidgow.net>
Acked-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260417031531.315281-4-ynorov@nvidia.com
[ Fixed the paragraph by moving the new sentence above. Added gate
in the other example as well. Applied proper formatting. Reworded
slightly. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
There are 6 individual Rust KUnit test suites (plus the doctests one). All
the tests are compiled unconditionally now, which adds ~200 kB to the
kernel image for me on x86_64. As Rust matures, this bloating will
inevitably grow.
Add Kconfig.test which includes a RUST_KUNIT_TESTS menu, and all
individual tests under it.
As usual, new tests are all enabled if KUNIT_ALL_TESTS=y.
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Reviewed-by: David Gow <david@davidgow.net>
Acked-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260417031531.315281-3-ynorov@nvidia.com
[ Fixed capitalization. Used singular for "API" for consistency.
Reworded to clarify these are suites and that there exists
the doctests one (which is the biggest at the moment by
far). - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
rust: tests: drop 'use crate' in bitmap and atomic KUnit tests
The following patch makes use of the macros::kunit_tests crate conditional
on the corresponding configs. When the configs are disabled, the compiler
warns about the unused crate. So, embed it in the unit test declaration.
The wakeup condition if a min timeout is present and has expired is that
at least _one_ CQE was posted. Thus set the cq_tail target to
->cq_min_tail + 1. Without this commit a spurious wakeup can result in a
premature wakeup because io_should_wake() will return true even if _no_
CQE was posted at all.
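The off-by-one in the target can be modelled with a simplified tail comparison. tail_reached() is a hypothetical reduction of the wake check to a wraparound-safe comparison and assumes cq_min_tail holds the CQ tail sampled when the wait started; the real io_should_wake() checks more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wake once the CQ tail has reached the target. The signed difference
 * keeps the comparison correct across 32-bit wraparound.
 */
static bool tail_reached(uint32_t cq_tail, uint32_t target)
{
	return (int32_t)(cq_tail - target) >= 0;
}
```

With the target set to cq_min_tail the check already passes when the tail has not moved, so a spurious wakeup ends the wait; with cq_min_tail + 1 at least one new CQE is required.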
Cc: Tip ten Brink <tip@tenbrinkmeijs.com>
Fixes: e15cb2200b93 ("io_uring: fix min_wait wakeups for SQPOLL")
Cc: stable@vger.kernel.org
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
Link: https://patch.msgid.link/20260606201120.1441447-1-lk@c--e.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sun, 7 Jun 2026 22:05:47 +0000 (16:05 -0600)]
io_uring/kbuf: don't truncate end buffer for bundles
If buffers have been peeked for a bundle receive, the kernel will
truncate the end buffer, if the available length is shorter than the
buffer itself. This is unnecessary, as applications iterating bundle
receives must always use the minimum size of the buffer length and the
remaining number of bytes in the bundle. The examples in liburing do
that as well, eg examples/proxy.c.
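The application-side rule can be sketched as follows. Both helpers are hypothetical and work on plain length arrays rather than on a real provided-buffer ring; they only show the min(buffer length, remaining bytes) step.

```c
#include <stddef.h>

/* Bytes of the current buffer that belong to the bundle. */
static size_t bundle_chunk(size_t buf_len, size_t remaining)
{
	return buf_len < remaining ? buf_len : remaining;
}

/*
 * Walk a bundle of 'total' bytes across buffers whose full lengths are
 * buf_lens[0..nbufs). Returns the number of buffers consumed.
 */
static size_t bundle_buffers_used(const size_t *buf_lens, size_t nbufs,
				  size_t total)
{
	size_t i = 0;

	while (total > 0 && i < nbufs) {
		total -= bundle_chunk(buf_lens[i], total);
		i++;
	}
	return i;
}
```

Since the consumer clamps the last chunk itself, it gets the same result whether or not the kernel shortened the final buffer.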
If the kernel does truncate this buffer AND the current transfer fails,
then the buffer will be left with a smaller size than what is otherwise
available.
Just remove the buffer truncation, as it's not necessary in the first
place.
Linus Torvalds [Sun, 7 Jun 2026 20:12:29 +0000 (13:12 -0700)]
Merge tag 'x86-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Add more AMD Zen6 models (Pratik Vishwakarma)
- Avoid confusing bootup message by the Intel resctrl enumeration
code when running on certain AMD systems (Tony Luck)
* tag 'x86-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Only check Intel systems for SNC
x86/CPU/AMD: Add more Zen6 models
Linus Torvalds [Sun, 7 Jun 2026 20:02:02 +0000 (13:02 -0700)]
Merge tag 'timers-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
- Fix the arch_inlined_clockevent_set_next_coupled() prototype in the
!CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST case (Naveen Kumar Chaudhary)
- Fix an off-by-1 bug in the sys_settimeofday() usecs validation code
(Naveen Kumar Chaudhary)
- Mark vdso_k_*_data pointers as __ro_after_init (Thomas Weißschuh)
- Fix livelock race in tmigr_handle_remote_up() (Amit Matityahu)
* tag 'timers-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/migration: Fix livelock in tmigr_handle_remote_up()
vdso/datastore: Mark vdso_k_*_data pointers as __ro_after_init
time: Fix off-by-one in settimeofday() usec validation
clockevents: Fix duplicate type specifier in stub function parameter
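For the settimeofday() item, a sketch of what an off-by-one in the usec validation typically looks like. This assumes the bug is an inclusive upper bound ('>' where '>=' is needed); the pull message does not spell that out, and both helpers are hypothetical.

```c
#include <stdbool.h>

#define USEC_PER_SEC	1000000L

/* Assumed old check: wrongly accepts exactly one full second of usecs. */
static bool usec_valid_buggy(long usec)
{
	return !(usec > USEC_PER_SEC || usec < 0);
}

/* Strict check: the valid range is 0 .. USEC_PER_SEC - 1. */
static bool usec_valid_fixed(long usec)
{
	return !(usec >= USEC_PER_SEC || usec < 0);
}
```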
Linus Torvalds [Sun, 7 Jun 2026 19:54:37 +0000 (12:54 -0700)]
Merge tag 'sched-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fix from Ingo Molnar:
- Fix uninitialized stack variable in rseq_exit_user_update() (Qing
Wang)
* tag 'sched-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Fix using an uninitialized stack variable in rseq_exit_user_update()
Linus Torvalds [Sun, 7 Jun 2026 19:43:21 +0000 (12:43 -0700)]
Merge tag 'locking-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
- Fix a NULL pointer dereference bug in the FUTEX_CMP_REQUEUE_PI
code (Ji'an Zhou)
- Fix a NULL pointer dereference bug in the rtmutex code (Davidlohr
Bueso)
* tag 'locking-urgent-2026-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Skip remove_waiter() when waiter is not enqueued
futex/requeue: Prevent NULL pointer dereference in remove_waiter() on self-deadlock
v8 changes:
- add back the btf_is_union check to btf_get_type_size [sashiko]
v7 changes:
- added ftrace_hash_count stub for !CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS case [sashiko]
- selftests fixes [sashiko]
- use hash_ptr in select_trampoline_lock [sashiko]
- changed the check duplicate logic in check_dup_ids [sashiko]
- use sort_r_nonatomic in check_dup_ids [sashiko]
- added BPF_TRACE_FSESSION_MULTI to can_be_sleepable,
plus added testcase for sleepable fsession
- make bpf_tracing_multi_opts pointer fields as const
- add ___migrate_enable to trace_blacklist
v6 changes:
- move ftrace_hash_count declaration under CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS [sashiko]
- fix ftrace_hash_remove check/deref [sashiko]
- disable context access for multi programs by using stub function with no arguments
for verification [sashiko]
- add __used for bpf_multi_func, and removed arguments, we do not allow direct access [sashiko]
- rebased on latest loongarch changes, fix ppc build
- guard update_ftrace_direct_del with ftrace_hash_count on rollback [sashiko]
- fix noreturn attachment condition in bpf_check_attach_btf_id_multi [sashiko]
- fail early on multiple same IDs provided by user [sashiko]
- fix selftests error paths [sashiko]
- add MAX_RESOLVE_DEPTH check to btf_get_type_size [sashiko]
- use btf__pointer_size [sashiko]
- fixed compilation on powerpc [sashiko]
- added verifier fails selftest
- after discussing with Song, it was determined that cleaning up FTRACE_OPS_CMD_DISABLE_SHARE_IPMODIFY_PEER
is not strictly necessary — keeping the trampoline in the ipmodify_enabled state is acceptable.
The race condition this introduces remains unlikely, so the concern raised in [1] will not be
addressed at this time.
[1] https://lore.kernel.org/bpf/aec7bAbGlnEo3R1g@krava/
v5 changes:
- add dedicated hashes used for detach, so there's no need to allocate
them on detach [sashiko]
- safely release old trampoline images [sashiko]
- add cond_resched() to couple of loops [sashiko]
- validate attr->link_create.target_fd [sashiko]
- allow only bpf_get_func_ret() for return value retrieval [sashiko]
- do not allow attachment of fexit/fsession_multi for noreturn functions [sashiko]
- fixed double free/close in libbpf btf cleanup, in separate patch [sashiko]
- make btf_type_is_traceable_func closer to btf_distill_func_proto [sashiko]
- add prog->attach_btf_obj_fd check to collect_func_ids_by_glob,
to check we don't load module programs for kernel [sashiko]
- make sure program is loaded in bpf_program__attach_tracing_multi [sashiko]
- several selftests fixes [sashiko]
- add attach_type to fdinfo output [Leon Hwang]
- selftests cleanup fixes [Leon Hwang]
v4 changes:
- unlink rollback fix (added ftrace_hash_count) [bot]
- use const for some bpf_link_create_opts tracing_multi members [bot]
- adding missing comment for lockdep keys [bot]
- selftest error path fixes (leaks) and other assorted test fixes [Leon Hwang]
- several compile fixes wrt CONFIG_BPF_SYSCALL and CONFIG_BPF_JIT [kernel test robot]
- make ftrace_hash_clear global, because it's needed in rollback
v3 changes:
- fix module parsing [Leon Hwang]
- use function traceable check from libbpf [Leon Hwang]
- use ptr_to_u64 and fix/updated few comments [ci]
- display cookies as decimal numbers [ci]
- added link_create.flags check [ci]
- fix error path in bpf_trampoline_multi_detach [ci]
- make fentry/fexit.multi not extendable [ci]
- add missing OPTS_VALID to bpf_program__attach_tracing_multi [ci]
v2 changes:
- allocate data.unreg in bpf_trampoline_multi_attach for rollback path [ci]
and fixed link count setup in rollback path [ci]
- several small assorted fixes [ci]
- added loongarch and powerpc changes for struct bpf_tramp_node change
- added support to attach functions from modules
- added tests for sleepable programs
- added rollback tests
v1 changes:
- added ftrace_hash_count as wrapper for hash_count [Steven]
- added trampoline mutex pool [Andrii]
- reworked 'struct bpf_tramp_node' separation [Andrii]
- the 'struct bpf_tramp_node' now holds pointer to bpf_link,
which is similar to what we do for uprobe_multi;
I understand it's not a fundamental change compared to previous
version which used bpf_prog pointer instead, but I don't see better
way of doing this.. I'm happy to discuss this further if there's
better idea
- reworked 'struct bpf_fsession_link' based on bpf_tramp_node
- made btf__find_by_glob_kind function internal helper [Andrii]
- many small assorted fixes [Andrii,CI]
- added session support [Leon Hwang]
- added cookies support
- added more tests
Note I plan to send linkinfo support separately, the patchset is big enough.
Jiri Olsa [Sat, 6 Jun 2026 12:39:54 +0000 (14:39 +0200)]
selftests/bpf: Add tracing multi attach rollback tests
Adding tests for the rollback code when the tracing_multi
link won't get attached, covering 2 reasons:
- wrong btf id passed by user, where all previously allocated
trampolines will be released
- trampoline for requested function is fully attached (already has the
maximum number of programs attached) and the link fails, so the
rollback code needs to unlink and release all previously linked
trampolines
We need the bpf_fentry_test* unattached for the tests to pass,
so the rollback tests are serial.
Jiri Olsa [Sat, 6 Jun 2026 12:39:47 +0000 (14:39 +0200)]
selftests/bpf: Add tracing multi skel/pattern/ids module attach tests
Adding tests for tracing_multi link attachment via all possible
libbpf APIs - skeleton, function pattern and BTF ids on top of
the bpf_testmod kernel module.
The user can specify functions to attach with the 'pattern' argument,
which allows wildcards ('*' and '?' are supported), or provide BTF ids
of functions directly in an array via the opts argument. These options
are mutually exclusive.
When using BTF ids, the user can also provide a cookie value for each
provided id/function, which can be retrieved later in the bpf program
with the bpf_get_attach_cookie helper. Each cookie value is paired with
the provided BTF id at the same array index.
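The wildcard semantics described above can be illustrated with a small
matcher. This is only a minimal sketch, assuming '*' matches any run of
characters (including none) and '?' matches exactly one character; the
name sketch_glob_match is hypothetical and this is not the libbpf
implementation.

```c
#include <stdbool.h>

/*
 * Hypothetical helper for illustration only (not the libbpf code).
 * Returns true when 'str' matches 'pat', where:
 *   '*' matches any run of characters, including an empty one
 *   '?' matches exactly one character
 */
static bool sketch_glob_match(const char *str, const char *pat)
{
	/* Consume the literal / '?' prefix up to the first '*'. */
	while (*str && *pat && *pat != '*') {
		if (*pat != '?' && *str != *pat)
			return false;
		str++;
		pat++;
	}

	if (*pat == '*') {
		/* Collapse consecutive '*' into one. */
		while (*pat == '*')
			pat++;
		/* Trailing '*' matches whatever is left. */
		if (!*pat)
			return true;
		/* Try to match the rest of the pattern at every offset. */
		while (*str) {
			if (sketch_glob_match(str, pat))
				return true;
			str++;
		}
	}

	/* Match only if both the string and the pattern are exhausted. */
	return !*str && !*pat;
}
```

With this sketch, "bpf_fentry_test*" matches every function name that
starts with bpf_fentry_test, while "bpf_fentry_test?" matches only names
with exactly one trailing character.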
Adding support to auto attach programs with following sections:
The provided <pattern> is used as 'pattern' argument in
bpf_program__attach_kprobe_multi_opts function.
The <pattern> can specify an optional kernel module name with the
following syntax:
<module>:<function_pattern>
In order to attach a tracing_multi link to module functions:
- program must be loaded with 'module' btf fd
(in attr::attach_btf_obj_fd)
- bpf_program__attach_tracing_multi must either have
pattern with module spec or BTF ids from the module
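The <module>:<function_pattern> syntax can be split as in the sketch
below. The struct and function names (sketch_spec, sketch_parse_spec),
the 64-byte module buffer and the rejection of an empty module name are
all assumptions made for illustration, they are not taken from libbpf.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for a parsed "<module>:<function_pattern>". */
struct sketch_spec {
	char module[64];		/* empty string when no module is given */
	const char *func_pattern;	/* points into the original spec string */
};

/*
 * Hypothetical parser for illustration only. Splits 'spec' on the first
 * ':'; without a ':' the whole string is the function pattern.
 * Returns 0 on success, -1 on an empty or too long module name
 * (that policy is an assumption of this sketch).
 */
static int sketch_parse_spec(const char *spec, struct sketch_spec *out)
{
	const char *colon = strchr(spec, ':');
	size_t len;

	out->module[0] = '\0';
	if (!colon) {
		out->func_pattern = spec;
		return 0;
	}

	len = (size_t)(colon - spec);
	if (len == 0 || len >= sizeof(out->module))
		return -1;

	memcpy(out->module, spec, len);
	out->module[len] = '\0';
	out->func_pattern = colon + 1;
	return 0;
}
```

For example, "bpf_testmod:bpf_testmod_fentry_test*" would yield the
module "bpf_testmod" and the pattern "bpf_testmod_fentry_test*", while a
plain "bpf_fentry_test?" leaves the module empty.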
Jiri Olsa [Sat, 6 Jun 2026 12:39:44 +0000 (14:39 +0200)]
libbpf: Add btf_type_is_traceable_func function
Adding btf_type_is_traceable_func function to perform same checks
as the kernel's btf_distill_func_proto function to prevent attachment
on some of the functions.
Exporting the function via libbpf_internal.h because it will be used
by benchmark test in following changes.
Adding bpf_trampoline_multi_attach/detach functions that allow
attaching/detaching a tracing program to/from multiple functions/trampolines.
The attachment is defined with bpf_program and array of BTF ids of
functions to attach the bpf program to.
Adding bpf_tracing_multi_link object that holds all the attached
trampolines and is initialized in attach and used in detach.
The attachment allocates or uses currently existing trampoline
for each function to attach and links it with the bpf program.
The attach works as follows:
- we get all the needed trampolines
- lock them and add the bpf program to each (__bpf_trampoline_link_prog)
- the trampoline_multi_ops passed in __bpf_trampoline_link_prog gathers
ftrace_hash (ip -> trampoline) objects
- we call update_ftrace_direct_add/mod to update needed locations
- we unlock all the trampolines
The detach works as follows:
- we lock all the needed trampolines
- remove the program from each (__bpf_trampoline_unlink_prog)
- the trampoline_multi_ops passed in __bpf_trampoline_unlink_prog gathers
ftrace_hash (ip -> trampoline) objects
- we call update_ftrace_direct_del/mod to update needed locations
- we unlock and put all the trampolines
We store the old image/flags in the trampoline before the update
and use it in case we need to rollback the attachment.
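The save-then-rollback idea can be sketched in plain user-space C as
below. This is a heavily simplified model and not the kernel code: the
sketch_tramp struct, the SKETCH_MAX_PROGS limit and both functions are
hypothetical, a program counter stands in for the trampoline
image/flags, and locking and the ftrace update are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-trampoline program limit, chosen for the sketch. */
#define SKETCH_MAX_PROGS 2

/* Hypothetical stand-in for a trampoline. */
struct sketch_tramp {
	int nr_progs;		/* currently linked programs */
	int old_nr_progs;	/* state saved before the update */
};

/* Link one program; save the old state first so it can be restored. */
static int sketch_link_one(struct sketch_tramp *tr)
{
	if (tr->nr_progs >= SKETCH_MAX_PROGS)
		return -E2BIG;
	tr->old_nr_progs = tr->nr_progs;
	tr->nr_progs++;
	return 0;
}

/*
 * Attach to all 'cnt' trampolines or to none of them: on the first
 * failure, restore the saved state of every trampoline linked so far.
 */
static int sketch_multi_attach(struct sketch_tramp *trs, size_t cnt)
{
	size_t i;
	int err = 0;

	for (i = 0; i < cnt; i++) {
		err = sketch_link_one(&trs[i]);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* trs[i] itself failed and was not modified; undo trs[0..i-1]. */
	while (i--)
		trs[i].nr_progs = trs[i].old_nr_progs;
	return err;
}
```

The point of the sketch is the all-or-nothing behaviour: a failure on
any trampoline leaves every other trampoline in the state it had before
the attach started.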
We keep the ftrace_hash objects allocated during attach in the link
so they can be used for detach as well.
Adding trampoline_(un)lock_all functions to (un)lock all trampolines
to gate the tracing_multi attachment.
Note this is supported only on archs (x86_64) that have ftrace direct
and single ops support.
Jiri Olsa [Sat, 6 Jun 2026 12:39:36 +0000 (14:39 +0200)]
bpf: Move sleepable verification code to btf_id_allow_sleepable
Move sleepable verification code to btf_id_allow_sleepable function.
It will be used in following changes.
Adding code to retrieve type's name instead of passing it from
bpf_check_attach_target function, because this function will be
called from another place in following changes and it's easier
to retrieve the name directly in here.
Jiri Olsa [Sat, 6 Jun 2026 12:39:35 +0000 (14:39 +0200)]
bpf: Add multi tracing attach types
Adding new program attach types for multi tracing attachment:
BPF_TRACE_FENTRY_MULTI
BPF_TRACE_FEXIT_MULTI
and their base support in verifier code.
Programs with such an attach type will use a specific link attachment
interface coming in following changes.
This was suggested by Andrii some (long) time ago and turned out
to be easier than having special program flag for that.
Bpf programs with such types have 'bpf_multi_func' function set as
their attach_btf_id and keep module reference when it's specified
by attach_prog_fd.
They are also accepted as sleepable programs during verification,
and the real validation for specific BTF_IDs/functions will happen
during the multi link attachment in following changes.
Jiri Olsa [Sat, 6 Jun 2026 12:39:34 +0000 (14:39 +0200)]
bpf: Factor fsession link to use struct bpf_tramp_node
Now that we split trampoline attachment object (bpf_tramp_node) from
the link object (bpf_tramp_link) we can use bpf_tramp_node as fsession's
fexit attachment object and get rid of the bpf_fsession_link object.