====================
selftests/bpf: impose global ordering for test decl_tags
Impose global ordering for all decl tags used by test_loader.c based
tests: __success, __failure, __msg, etc. The tags are now sorted by the
testing framework so that they are processed in the same order as they
appear in the C source code of the test.
The ordering is necessary for gcc-bpf. Neither GCC nor the C standard
defines the order in which function attributes are consumed.
While Clang tends to preserve tag definition order in the output BTF,
GCC does not. This inconsistency causes BPF tests with multiple __msg
entries to fail when compiled with GCC.
This is based on a patch [1] from Cupertino Miranda (see patch #3) and
includes some additional cleanups for test_loader.c decl tags
declaration and processing (see patches #1, #2, #4).
Eduard Zingerman [Sat, 11 Apr 2026 07:33:47 +0000 (00:33 -0700)]
selftests/bpf: inline TEST_TAG constants in test_loader.c
After the str_has_pfx() refactoring, each TEST_TAG_* / TEST_BTF_PATH
constant is used exactly once. Since constant definitions are not
shared between BPF-side bpf_misc.h and userspace side test_loader.c,
there is no need for the additional redirection layer.
Eduard Zingerman [Sat, 11 Apr 2026 07:33:46 +0000 (00:33 -0700)]
selftests/bpf: impose global ordering for test decl_tags
Impose global ordering for all decl tags used by test_loader.c based
tests (__success, __failure, __msg, etc):
- change every tag to expand as
__attribute__((btf_decl_tag("comment:" XSTR(__COUNTER__) ...)))
- change parse_test_spec() to collect all decl tags before
processing and sort them using strverscmp().
The ordering is necessary for gcc-bpf.
Neither GCC nor the C standard defines the order in which function
attributes are consumed. While Clang tends to preserve definition order,
GCC may process them out of sequence. This inconsistency causes BPF
tests with multiple __msg entries to fail when compiled with GCC.
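The sorting step can be sketched in userspace as follows. This is a
minimal illustration, not the test_loader.c code: the helper names and
the exact tag payloads are hypothetical, and only the "comment:" prefix
followed by a counter value comes from the description above. It shows
why strverscmp() is used: it compares embedded numbers as integers, so
counter value 10 sorts after 2, which a plain strcmp() would get wrong.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* strverscmp() is a GNU extension */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: order tag strings by embedded numbers, not bytes. */
static int sketch_cmp_tags(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return strverscmp(*sa, *sb);
}

/* Sort an array of decl tag strings (hypothetical helper). */
static void sketch_sort_decl_tags(const char **tags, size_t cnt)
{
	assert(tags || cnt == 0);
	if (cnt)
		qsort(tags, cnt, sizeof(*tags), sketch_cmp_tags);
}
```

Because every tag starts with the same prefix, the first differing
component is the counter, so the sorted order follows the order in
which the macros were expanded in the source file.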
software node: return -ENOTCONN when referenced swnode is not registered yet
It's possible that at the time of resolving a reference to a remote
software node, the node we know exists is not yet registered as a full
firmware node. We currently return -ENOENT in this case but the same
error code is also returned in some other cases, like the reference
property with given name not existing in the property list of the local
software node.
It makes sense to let users know that we're dealing with an unregistered
software node so that they can defer probe - the situation is somewhat
similar to there existing a firmware node to which no device is bound
yet - which is valid grounds for probe deferral. To that end: use
-ENOTCONN to indicate the software node is "not connected".
Acked-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260407-swnode-unreg-retcode-v4-1-1b2f0725eb9c@oss.qualcomm.com
[ Drop software node backend specifics from
fwnode_property_get_reference_args() documentation. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
I stumbled upon "isolcpus=managed_irq" which is the last piece which
can only be handled by isolcpus= and has no runtime knob. I knew roughly
what managed interrupts should do but I lacked some details on how it is
used and what the managed_irq sub parameter means in practice.
This documents what we have as of today and how it works. I added some
examples of how the parameter affects the configuration. Did I miss
something?
Given that the spreading as computed by group_cpus_evenly() does not take
the mask of isolated CPUs into account I'm not sure how relevant the
managed_irq argument is. The virtio_scsi driver has no way to limit the
interrupts and I don't see this for the nvme. Even if the number of
queues can be reduced to two (as in the example) it is still spread
evenly in the system instead and the isolated CPUs are not taken into
account.
To make this worse, you can even argue further whether or not the
application on the isolated CPU wants to receive the interrupt directly
or would prefer not to.
Given all this, I am not sure if it makes sense to add 'io_queue' to the
mix or if it could be incorporated into 'managed_irq'.
One more point: Given that isolcpus= is marked deprecated as of commit b0d40d2b22fe4 ("sched/isolation: Document isolcpus= boot parameter flags, mark it deprecated")
and the 'managed_irq' is evaluated at device's probe time it would
require additional callbacks to re-evaluate the situation. Probably for
'io_queue', too. Does it make sense or should we simply drop the
"deprecation" notice and allow using it long term?
Dynamic partitions work with cpusets; there is this (managed_irq)
limitation, but is it really one? And if a static partition is the use
case, why bother?
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Acked-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260401110232.ET5RxZfl@linutronix.de>
Sina Hassani [Fri, 10 Apr 2026 18:32:44 +0000 (11:32 -0700)]
iommufd: Fix a race with concurrent allocation and unmap
iopt_unmap_iova_range() releases the lock on iova_rwsem inside the loop
body when getting to the more expensive unmap operations. This is fine on
its own, except the loop condition is based on the first area that matches
the unmap address range. If a concurrent call to map picks an area that
was unmapped in previous iterations, the loop mistakenly tries to unmap
it.
This is reproducible by having one userspace thread map buffers and pass
them to another thread that unmaps them. The problem manifests as EBUSY
errors with single page mappings.
Fix this by advancing the start pointer after unmapping an area. This
ensures each iteration only examines the IOVA range that remains mapped,
which is guaranteed not to have overlaps.
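The effect of advancing the start pointer can be modelled in userspace.
The sketch below is a toy, not the iommufd code: the area structure,
the hook that simulates a concurrent map, and all function names are
hypothetical. The hook runs at the point where the real loop has
dropped iova_rwsem, and it re-maps the IOVA of the first area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an area: an inclusive IOVA range. */
struct sketch_area {
	unsigned long start, last;
	bool mapped;
};

/* Hook run where the real code has dropped iova_rwsem. */
typedef void (*sketch_unlocked_hook)(struct sketch_area *areas, size_t n);

/* Simulated concurrent map that reuses the IOVA of the first area. */
static void sketch_concurrent_map(struct sketch_area *areas, size_t n)
{
	if (n)
		areas[0].mapped = true;
}

/* Lowest mapped area overlapping [start, last], or NULL. */
static struct sketch_area *sketch_first_area(struct sketch_area *areas,
					     size_t n, unsigned long start,
					     unsigned long last)
{
	struct sketch_area *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_area *a = &areas[i];

		if (!a->mapped || a->start > last || a->last < start)
			continue;
		if (!best || a->start < best->start)
			best = a;
	}
	return best;
}

/* Unmap everything in [start, last]; returns the number of areas unmapped. */
static size_t sketch_unmap_range(struct sketch_area *areas, size_t n,
				 unsigned long start, unsigned long last,
				 sketch_unlocked_hook hook)
{
	struct sketch_area *a;
	size_t unmapped = 0;

	assert(start <= last);
	while ((a = sketch_first_area(areas, n, start, last))) {
		unsigned long area_last = a->last;

		a->mapped = false;
		if (hook)
			hook(areas, n);	/* lock dropped: others may map here */
		unmapped++;
		if (area_last >= last)
			break;
		start = area_last + 1;	/* the fix: skip IOVA already handled */
	}
	return unmapped;
}
```

With the start pointer advanced, the second iteration only looks at the
IOVA range above the first area, so the freshly created mapping is left
alone. Without that line, the loop would pick the new mapping again
each time the hook re-creates it.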
driver core: Don't let a device probe until it's ready
The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:
As a result of the above, it was seen that device_links_driver_bound()
could be called for the device before "dev->fwnode->dev" was
assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
being called which meant that other devices waiting on our driver's
sub-nodes were stuck deferring forever.
It's believed that this problem is showing up suddenly for two
reasons:
1. Android has recently (last ~1 year) implemented an optimization to
the order it loads modules [2]. When devices opt-in to this faster
loading, modules are loaded one-after-the-other very quickly. This
is unlike how other distributions do it. The reproduction of this
problem has only been seen on devices that opt-in to Android's
"parallel module loading".
2. Android devices typically opt-in to fw_devlink, and the most
noticeable issue is the NULL "dev->fwnode->dev" in
device_links_driver_bound(). fw_devlink is somewhat new code and
also not in use by all Linux devices.
Even though the specific symptom where "dev->fwnode->dev" wasn't
assigned could be fixed by moving that assignment higher in
device_add(), other parts of device_add() (like the call to
device_pm_add()) are also important to run before probe. Only moving
the "dev->fwnode->dev" assignment would likely fix the current
symptoms but lead to difficult-to-debug problems in the future.
Fix the problem by preventing probe until device_add() has run far
enough that the device is ready to probe. If somehow we end up trying
to probe before we're allowed, __driver_probe_device() will return
-EPROBE_DEFER which will make certain the device is noticed.
In the race condition that was seen with Android's faster module
loading, we will temporarily add the device to the deferred list and
then take it off immediately when device_add() probes the device.
Instead of adding another flag to the bitfields already in "struct
device", add a new "flags" field and use that. This allows us
to freely change the bit from different threads without worrying about
corrupting nearby bits (and means threads changing other bits won't
corrupt us).
[1] Captured on a machine running a downstream 6.6 kernel
[2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel
Cc: stable@vger.kernel.org
Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
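The "flags" word and the probe gate described above can be sketched in
userspace with C11 atomics. This is an illustration under assumptions:
the structure, the bit name and the helpers are hypothetical, C11
atomics stand in for the kernel's atomic bit operations, and the
EPROBE_DEFER value is redefined locally because userspace headers do
not provide it.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_EPROBE_DEFER		517	/* kernel-internal errno value */
#define SKETCH_DEV_READY_TO_PROBE	0	/* hypothetical bit number */

struct sketch_device {
	atomic_ulong flags;	/* one word, only touched with atomic ops */
};

/* Atomically set one bit without disturbing its neighbours. */
static void sketch_set_flag(struct sketch_device *dev, unsigned int bit)
{
	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(&dev->flags, 1UL << bit);
}

static bool sketch_test_flag(struct sketch_device *dev, unsigned int bit)
{
	return (atomic_load(&dev->flags) & (1UL << bit)) != 0;
}

/* Refuse to probe until the add path has marked the device ready. */
static int sketch_driver_probe_device(struct sketch_device *dev)
{
	if (!sketch_test_flag(dev, SKETCH_DEV_READY_TO_PROBE))
		return -SKETCH_EPROBE_DEFER;
	return 0;
}
```

A C bitfield member cannot be updated this way: the compiler emits a
read-modify-write of the containing storage unit, so two threads
changing different bitfield members can lose each other's update.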
Instead of repeating the command opcode twice, some flash devices try to
pack command and address bits. In this case, the second opcode byte
being sent (LSB) is free to be used. The input data must be ANDed to
only provide the relevant bits.
While at first glance this doesn't give the second byte a chance to be
shifted out on the bus, this is actually the second step of an
initialization, where the byte being apparently "ignored" in DTR mode
has already been written in a dedicated "extended opcode" register. As
such, the comment and the extra check that I proposed were entirely
wrong; remove them.
dt-bindings: usb: dwc3: add support for StarFive JHB100
Add support for the USB 2.0 Dual-Role Device (DRD) controller embedded
in the StarFive JHB100 SoC. The controller is based on the Synopsys
DesignWare Core USB 3 (DWC3) IP.
Charan Pedumuru [Fri, 27 Mar 2026 16:47:46 +0000 (16:47 +0000)]
dt-bindings: usb: atmel,at91sam9rl-udc: convert to DT schema
Convert Atmel High-Speed USB Device Controller (USBA) binding to DT schema.
Changes during conversion:
- Make the "clock-names" property flexible enough to accept the items
in any order as the existing in-tree DTS nodes don't follow an order.
Charan Pedumuru [Fri, 27 Mar 2026 16:47:45 +0000 (16:47 +0000)]
dt-bindings: usb: atmel,at91rm9200-udc: convert to DT schema
Convert Atmel AT91 USB Device Controller (UDC) binding to DT schema.
Changes during conversion:
- Include "atmel,pullup-gpio" and "atmel,matrix" in the properties since
they are required by existing in-tree DTS definitions.
Charan Pedumuru [Fri, 27 Mar 2026 16:47:42 +0000 (16:47 +0000)]
arm: dts: at91: remove unused #address-cells/#size-cells from sam9x60 udc node
The UDC node does not define any child nodes, so the "#address-cells" and
"#size-cells" properties are unnecessary. Remove these unused properties
to simplify the devicetree node and keep it consistent with DT conventions.
usbip: tools: add hint when no exported devices are found
When refresh_exported_devices() finds no devices, it's helpful to
inform users about potential causes. This could be due to:
1. The usbip driver module is not loaded.
2. No devices have been exported yet.
Add an informational message to guide users when ndevs == 0.
Also update the condition in usbip_host_driver_open() and
usbip_device_driver_open() to check both ret and ndevs == 0,
and change err() to info().
Message visibility by scenario:
- usbipd (console mode): Show on console/serial, this allows instant
visibility for debugging.
- usbipd -D (daemon mode): Message logged to syslog, can keep logs for
later traceability in production. Also can use "journalctl -f" to
trace on console.
Merge tag 'mvebu-dt-7.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt
mvebu dt for 7.1 (part 1)
Drop unnecessary MAINTAINERS entry for non-existent Marvell db-falcon files
* tag 'mvebu-dt-7.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
MAINTAINERS: drop file entry in Marvell Kirkwood and Armada SOC support
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
s390/zcrypt: Fix warning about wrong kernel doc comment
Fix this warning:
Warning: drivers/s390/crypto/zcrypt_msgtype6.c:1253 This comment
starts with '/**', but isn't a kernel-doc comment. Refer to
Documentation/doc-guide/kernel-doc.rst
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603252022.vEojGo3V-lkp@intel.com/
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
PCI: s390: Expose the UID as an arch specific PCI slot attribute
On s390, an individual PCI function can generally be identified by two
identifiers, the FID and the UID. Which identifier is used depends on
the scope and the platform configuration.
The first identifier, the FID, is always available and identifies a PCI
device uniquely within a machine. The FID may be virtualized by
hypervisors, but on the LPAR level, the machine scope makes it
impossible to create the same configuration based on FIDs on two
different LPARs of the same machine, and difficult to reuse across
machines.
Such matching LPAR configurations are useful, though, allowing
standardized setups and booting a Linux installation on different LPARs.
To this end the UID, or user-defined identifier, was introduced. While
it is only guaranteed to be unique within an LPAR and only if indicated
by firmware, it allows users to replicate PCI device setups.
On s390, which uses a machine hypervisor, a per PCI function hotplug
model is used. The shortcoming with the UID then is, that it is not
visible to the user without first attaching the PCI function and
accessing the "uid" device attribute. The FID, on the other hand, is
used as the slot name and is thus known even with the PCI function in
standby.
Remedy this shortcoming by providing the UID as an attribute on the slot
allowing the user to identify a PCI function based on the UID without
having to first attach it. Do this via a macro mechanism analogous to
what was introduced by commit 265baca69a07 ("s390/pci: Stop usurping
pdev->dev.groups") for the PCI device attributes.
docs: s390/pci: Improve and update PCI documentation
Update the s390 specific PCI documentation to better reflect current
behavior and terms such as the handling of Isolated VFs via commit 25f39d3dcb48 ("s390/pci: Ignore RID for isolated VFs").
Add a description for /sys/firmware/clp/uid_checking which was added
in commit b043a81ce3ee ("s390/pci: Expose firmware provided UID Checking
state in sysfs") but missed documentation.
Similarly add documentation for the fidparm attribute added by commit 99ad39306a62 ("s390/pci: Expose FIDPARM attribute in sysfs") and
add a list of pft values and their names.
Finally improve formatting of the different attribute descriptions by
adding a separating colon.
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Link: https://lore.kernel.org/r/20260407-uid_slot-v8-1-15ae4409d2ce@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Merge tag 'mvebu-fixes-7.0-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes
mvebu fixes for 7.0 (part 1)
A new device tree has been merged without a binding, which triggered
warnings during dtb_checks. The commit in this pull request fixes this
issue, enabling easier detection of new problems in device trees by
reducing the number of false warnings.
* tag 'mvebu-fixes-7.0-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
dt-bindings: arm64: add Marvell 7k COMe boards
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'reset-fixes-for-v7.0-3' of https://git.pengutronix.de/git/pza/linux into arm/fixes
Reset controller fixes for v7.0, part 3
* Add missing reset ops for amlogic,t7-reset to the reset-meson driver.
The resets are unused as of now, but as soon as they are, the driver
would otherwise run into a NULL pointer dereference.
* tag 'reset-fixes-for-v7.0-3' of https://git.pengutronix.de/git/pza/linux:
reset: amlogic: t7: Fix null reset ops
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'ffa-fix-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers
Arm FF-A fix for v7.1
Use the page aligned backing allocation size when computing the RXTX_MAP
page count. This fixes FF-A RX/TX buffer registration on kernels built
with 16K/64K PAGE_SIZE, where alloc_pages_exact() backs the buffer with a
larger aligned span than the discovered minimum buffer size.
* tag 'ffa-fix-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Use the correct buffer size during RXTX_MAP
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
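The arithmetic behind the FF-A RXTX_MAP fix above can be sketched as
follows. This is a standalone illustration, not the arm_ffa driver
code: the helper names are hypothetical, and it assumes that RXTX_MAP
counts buffer pages in 4 KiB units and that the backing allocation is
rounded up to the kernel PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed unit of the RXTX_MAP page count (4 KiB). */
#define SKETCH_FFA_PAGE_SIZE 4096UL

/* Round size up to a multiple of page_size (a power of two). */
static size_t sketch_page_align(size_t size, size_t page_size)
{
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return (size + page_size - 1) & ~(page_size - 1);
}

/*
 * Page count to register: derived from the page-aligned backing
 * allocation, not from the discovered minimum buffer size.
 */
static size_t sketch_rxtx_page_count(size_t min_buf_sz,
				     size_t kernel_page_size)
{
	return sketch_page_align(min_buf_sz, kernel_page_size) /
	       SKETCH_FFA_PAGE_SIZE;
}
```

With a 4 KiB minimum buffer the count is 1 on a 4K kernel, but the
backing span is a full kernel page on 16K and 64K kernels, which under
these assumptions corresponds to 4 and 16 units respectively.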
Merge tag 'sunxi-dt-for-7.1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into soc/dt
Allwinner Device Tree Changes for 7.1 - Part 2
UART DMA channels added for A64 and H6. Standard resolution MMIO timer added
for H616. This timer can be used as a broadcast timer for wakeup from idle
states.
* tag 'sunxi-dt-for-7.1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: dts: allwinner: enable h616 timer support
arm64: dts: allwinner: sun50i-h6: add UART DMA channels
arm64: dts: allwinner: sun50i-a64: add UART DMA channels
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'qcom-drivers-for-7.1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers
One more Qualcomm driver update for v7.1
Flag Lenovo IdeaCentre Mini X to have functional QSEECOM/uefisecapp.
* tag 'qcom-drivers-for-7.1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
firmware: qcom: scm: Allow QSEECOM on Lenovo IdeaCentre Mini X
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'riscv-soc-drivers-for-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers
RISC-V soc drivers for v7.1
Microchip:
Add coverage for the pic64gx in the system controller and syscons.
Add an interrupt mux driver (akin to the one that Renesas recently added)
that fixes a problem where the platform never properly modelled gpio
interrupts. There's a gpio driver change here that Bartosz has acked
that adds the interrupt support to the GPIO driver itself.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-soc-drivers-for-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux:
soc: microchip: add mpfs gpio interrupt mux driver
dt-bindings: soc: microchip: document PolarFire SoC's gpio interrupt mux
gpio: mpfs: Add interrupt support
soc: microchip: mpfs-sys-controller: add support for pic64gx
dt-bindings: soc: microchip: mpfs-sys-controller: Add pic64gx compatibility
dt-bindings: soc: microchip: add compatible for the mss-top-sysreg on pic64gx
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'riscv-dt-for-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt
RISC-V devicetrees for v7.1
Generic:
Add binding coverage for Supm.
Microchip:
Add support for the pic64gx and its curiosity board. This is a PolarFire
SoC without the FPGA.
Add the missing tsu_clk for ptp on the macb on PolarFire SoC and resolve
a long-running problem with gpio interrupts being incorrectly described
on the platform.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-dt-for-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: microchip: update mpfs gpio interrupts to better match the SoC
riscv: dts: microchip: add tsu clock to macb on mpfs
dt-bindings: riscv: Add Supm extension description
riscv: dts: microchip: remove POLARFIRE mention in Makefile
riscv: dts: microchip: add pic64gx and its curiosity kit
dt-bindings: riscv: microchip: document the PIC64GX curiosity kit
dt-bindings: timer: sifive,clint: add pic64gx compatibility
riscv: dts: microchip: add pinctrl nodes for mpfs/icicle kit
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'imx-dt-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux into soc/dt
i.MX ARM device tree changes for 7.1:
- Device Tree Schema Compliance Fixes
Fixed numerous CHECK_DTBS warnings across multiple i.MX SoC families
Renamed nodes to match schema requirements (tcq→touchscreen,
uart8250→serial, iomuxc→pinmux, etc.). Fixed node naming conventions
(added "led-" prefix, proper addressing formats).
Corrected compatible strings and removed undocumented fallbacks. Added
required properties (clocks, clock-names, power supplies,
#sound-dai-cells).
- New Hardware Support
Added DT overlays for various expansion modules (i.MX6 DHCOM PDK2,
PicoITX display boards). Added support for muRata 1YN WiFi chip
(replacement for 1DX) on i.MX6ULL DHCOR board.
i.MX7ULP: Added CPU clock and OPP table support for frequency scaling.
Merge tag 'imx-fixes-7.0-2nd' of https://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux into arm/fixes
i.MX fixes for 7.0 2nd round:
- Fixes interrupt storm by adding pull up pinctrl config for pin PMIC_nINT.
* tag 'imx-fixes-7.0-2nd' of https://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux:
arm64: dts: imx8mm-tqma8mqml: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mn-tqma8mqnl: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mm-emtop-som: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-data-modul-edm-sbc: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-dhcom-som: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-ultra-mach-sbc: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-sr-som: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-nitrogen-som: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-aristainetos3a-som-v1: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-edm-g: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-icore-mx8mp: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-navqp: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-debix-som-a: Correct PAD settings for PMIC_nINT
arm64: dts: imx8mp-debix-model-a: Correct PAD settings for PMIC_nINT
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'v7.1-rockchip-dts32-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt
Support for the RV1103B SoC and the Onion Omega4 board using it.
While the RV1103B only got a B-extension to its name, the SoC internals
were reworked heavily. So likely it's mainly pin compatible to the
non-B variant.
The dt-binding for the RV1103B clock driver is shared with the clock-
driver branch going into the clock-tree.
* tag 'v7.1-rockchip-dts32-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add Onion Omega4 Evaluation Board
dt-bindings: arm: rockchip: Add Omega4 Evaluation board
ARM: dts: rockchip: Add support for RV1103B
dt-bindings: soc: rockchip: grf: Add RV1103B compatibles
dt-bindings: clock: rockchip: Add RV1103B CRU support
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Merge tag 'v7.1-rockchip-dts32-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt
A number of dt-schema cleanups that are long-standing, so not suitable
as fix for the current release.
* tag 'v7.1-rockchip-dts32-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Pass linux,code to the power key on rk3288-veyron-pinky
ARM: dts: rockchip: Fix LED node names on rk3288-phycore-rdk
ARM: dts: rockchip: Fix GMAC description n RK3288 boards
ARM: dts: rockchip: Fix RTC description on rk3288-firefly-reload
ARM: dts: rockchip: Add missing the touchscreen interrupt on rk3288-phycore-rdk
ARM: dts: rockchip: Fix the trackpad supply on rk3288-veyron-jerry
ARM: dts: rockchip: Fix the Bluetooth node name on rk3288-veyron
ARM: dts: rockchip: Remove invalid regulator-property from rk3288-veyron
ARM: dts: rockchip: Use mount-matrix on rk3188-bqedison2qc
ARM: dts: rockchip: Fix RTC compatible on rk3288-phycore-rdk
ARM: dts: rockchip: Move PHY reset to ethernet-phy node on rk3036 boards
ARM: dts: rockchip: Remove rockchip,grf from rk3288 tsadc
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Rong Zhang [Fri, 10 Apr 2026 17:49:04 +0000 (01:49 +0800)]
ALSA: usb-audio: Do not expose sticky mixers
Some devices' mixers are sticky, which accept SET_CUR but do absolutely
nothing. Registering these mixers confuses userspace and results in
ineffective volume control.
Check if a mixer is sticky by setting the volume to the maximum or
minimum value and checking for effectiveness afterward. Prevent the
mixer from being registered if it turns out to be sticky.
Quirky device sample:
usb 7-1: New USB device found, idVendor=0e0b, idProduct=fa01, bcdDevice= 1.00
usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 7-1: Product: Feaulle Rainbow
usb 7-1: Manufacturer: Generic
usb 7-1: SerialNumber: 20210726905926
(Mic Capture Volume)
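The sticky-mixer check described above can be sketched against a
simulated device. Everything in this sketch is hypothetical (the
structure, the SET_CUR/GET_CUR stand-ins and the helper names); it only
illustrates the idea of writing an extreme value and reading it back,
not the usb-audio implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated mixer control; "sticky" models the hardware quirk. */
struct sketch_mixer {
	int min, max;
	int cur;
	bool sticky;
};

/* Stand-in for SET_CUR: a sticky device accepts it and does nothing. */
static int sketch_set_cur(struct sketch_mixer *m, int val)
{
	if (!m->sticky)
		m->cur = val;
	return 0;
}

/* Stand-in for GET_CUR. */
static int sketch_get_cur(const struct sketch_mixer *m)
{
	return m->cur;
}

/* Write max (or min if already at max), read back, then restore. */
static bool sketch_mixer_is_sticky(struct sketch_mixer *m)
{
	int saved = sketch_get_cur(m);
	int probe = (saved == m->max) ? m->min : m->max;
	bool sticky;

	assert(m->min < m->max);
	sketch_set_cur(m, probe);
	sticky = sketch_get_cur(m) != probe;
	sketch_set_cur(m, saved);
	return sticky;
}
```

Choosing the probe value so that it always differs from the current
value is what makes the read-back meaningful; a control that is already
at its maximum is probed with the minimum instead.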
Rong Zhang [Fri, 10 Apr 2026 17:49:02 +0000 (01:49 +0800)]
ALSA: usb-audio: Add error checks against get_min_max*()
All callers of get_min_max*() ignore the latter's return code
completely. This is meant to ignore temporary errors at probe time.
However, it is not optimal and leads to some maintenance burden.
Return -EAGAIN for temporary errors, and check against it in the callers
of get_min_max*(). If any other error occurs, bail out of the caller
early.
pstore/ftrace: Factor KASLR offset in the core kernel instruction addresses
The pstore ftrace frontend works by purely collecting the
instruction address, saving it on the persistent area through
the backend and when the log is read, on next boot for example,
the address is then resolved by using the regular printk symbol
lookup (%pS for example).
Problem: if we are running a relocatable kernel with KASLR enabled,
this is a recipe for failure in the symbol resolution on next boots,
since the addresses are offset by the KASLR address. So, naturally
the way to go is to factor the KASLR address out of instruction address
collection, and add the fresh offset when resolving the symbol
on future boots.
Problem #2: modules also have varying addresses that float based
on module base address and potentially the module ordering in
memory, meaning factoring KASLR offset for them is useless.
So, let's hereby only take KASLR offset into account for core
kernel addresses, leaving module ones as is.
And we have yet a 3rd complexity: the range check for core kernel
addresses does not necessarily hold true on future boots, since the
module base address will vary. With that, the choice was to mark
the addresses as being core vs module based on its MSB. And with
that...
...we have the 4th challenge here: for some "simple" architectures,
the CPU number is saved bit-encoded on the instruction pointer, to
allow bigger timestamps - this is set through the PSTORE_CPU_IN_IP
define for such architectures. Hence, the approach here is to skip
such architectures (at least for now).
Finished? No. On top of all previous complexities, we have one
extra pain point: kaslr_offset() is inlined and fully "resolved"
at boot-time, after kernel decompression, through ELF relocation
mechanism. Once the offset is known, it's patched to the kernel
text area, wherever it is used. The mechanism, and its users, are
only built-in - incompatible with module usage. Though there are
possibly some hacks (such as computing the offset using some kallsym
lookup), the choice here is to restrict this optimization to the
(hopefully common) case of CONFIG_PSTORE=y.
TL;DR: let's factor KASLR offsets on pstore/ftrace for core kernel
addresses, only when PSTORE is built-in and leaving module addresses
out, as well as architectures that define PSTORE_CPU_IN_IP.
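The encode/decode idea can be sketched in a toy address space. This is
not the pstore code: the function names, the tag polarity (bit 63 set
means "core kernel address") and the assumption that bit 63 is
otherwise unused are all made up for illustration, and the text above
does not say how the real patch uses the MSB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker bit: set for core kernel addresses. */
#define SKETCH_CORE_TAG (UINT64_C(1) << 63)

/* Before saving: make core addresses independent of this boot's offset. */
static uint64_t sketch_encode_ip(uint64_t ip, bool is_core,
				 uint64_t kaslr_off)
{
	assert((ip & SKETCH_CORE_TAG) == 0);	/* toy address space only */
	if (!is_core)
		return ip;			/* module address: keep as is */
	assert(ip >= kaslr_off);
	return (ip - kaslr_off) | SKETCH_CORE_TAG;
}

/* When reading on a later boot: re-apply that boot's offset. */
static uint64_t sketch_decode_ip(uint64_t rec, uint64_t kaslr_off_now)
{
	if (!(rec & SKETCH_CORE_TAG))
		return rec;
	return (rec & ~SKETCH_CORE_TAG) + kaslr_off_now;
}
```

Carrying the core/module decision in the record itself avoids redoing
a range check on the next boot, when the layout that check would rely
on has changed.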
====================
Use kmalloc_nolock() universally in BPF local storage
Socket local storage did not convert to use kmalloc_nolock() since there
was observable performance degradation due to kfree_nolock() hitting the
slow path and the lack of kfree_rcu()-like batched freeing. Now that
these concerns were addressed in slub, convert all remaining local storage
flavors to use kmalloc_nolock().
bpf: Remove gfp_flags plumbing from bpf_local_storage_update()
Remove the check that rejects sleepable BPF programs from doing
BPF_ANY/BPF_EXIST updates on local storage. This restriction was added
in commit b00fa38a9c1c ("bpf: Enable non-atomic allocations in local
storage") because kzalloc(GFP_KERNEL) could sleep inside
local_storage->lock. This is no longer a concern: all local storage
allocations now use kmalloc_nolock() which never sleeps.
In addition, since kmalloc_nolock() only accepts __GFP_ACCOUNT,
__GFP_ZERO and __GFP_NO_OBJ_EXT, the gfp_flags parameter plumbing from
bpf_*_storage_get() to bpf_local_storage_update() becomes dead code.
Remove gfp_flags from bpf_selem_alloc(), bpf_local_storage_alloc() and
bpf_local_storage_update(). Drop the hidden 5th argument from
bpf_*_storage_get helpers, and remove the verifier patching that
injected GFP_KERNEL/GFP_ATOMIC into the fifth argument.
bpf: Use kmalloc_nolock() universally in local storage
Switch to kmalloc_nolock() universally in local storage. Socket local
storage didn't move to kmalloc_nolock() when BPF memory allocator was
replaced by it for performance reasons. Now that kfree_rcu() supports
freeing memory allocated by kmalloc_nolock(), we can move the remaining
local storages to use kmalloc_nolock() and clean up the cluttered free
paths.
Use kfree() instead of kfree_nolock() in bpf_selem_free_trace_rcu() and
bpf_local_storage_free_trace_rcu(). Both callbacks run in process context
where spinning is allowed, so kfree_nolock() is unnecessary.
The benchmark is a microbenchmark stress-testing how fast local storage
can be created. There is no measurable throughput change for socket local
storage after switching from kzalloc() to kmalloc_nolock().
selftests/bpf: Remove kmalloc tracing from local storage create bench
Remove the raw_tp/kmalloc BPF program and its associated reporting from
the local storage create benchmark. The kmalloc count per create is not
a useful metric as different code paths use different allocators (e.g.
kmalloc_nolock vs kzalloc), introducing noise that makes the number
hard to interpret.
Keep total_creates in the summary output as it is useful for normalizing
perf statistics collected alongside the benchmark.
sched_ext: Drop spurious warning on kick during scheduler disable
kick_cpus_irq_workfn() warns when scx_kick_syncs is NULL, but this can
legitimately happen when a BPF timer or other kick source races with
free_kick_syncs() during scheduler disable. Drop the pr_warn_once() and
add a comment explaining the race.
Paulo Alcantara [Fri, 10 Apr 2026 23:20:55 +0000 (20:20 -0300)]
smb: client: get rid of d_drop()+d_add()
Replace d_drop()+d_add() in cifs_tmpfile() and cifs_create() with
d_instantiate(), and in cifs_atomic_open() with d_splice_alias() if
in-lookup, otherwise d_instantiate().
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Closes: https://lore.kernel.org/r/20260408065719.GF3836593@ZenIV
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: David Howells <dhowells@redhat.com>
Cc: NeilBrown <neilb@ownmail.net>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Daniel Borkmann [Fri, 10 Apr 2026 23:26:50 +0000 (01:26 +0200)]
bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars
When regsafe() compares two scalar registers that both carry
BPF_ADD_CONST, check_scalar_ids() maps their full compound id
(aka base | BPF_ADD_CONST flag) as one idmap entry. However,
it never verifies that the underlying base ids, that is, with
the flag stripped are consistent with existing idmap mappings.
This allows construction of two verifier states where the old
state has R3 = R2 + 10 (both sharing base id A) while the current
state has R3 = R4 + 10 (base id C, unrelated to R2). The idmap
creates two independent entries: A->B (for R2) and A|flag->C|flag
(for R3), without catching that A->C conflicts with A->B. State
pruning then incorrectly succeeds.
Fix this by additionally verifying base ID mapping consistency
whenever BPF_ADD_CONST is set: after mapping the compound ids,
also invoke check_ids() on the base IDs (flag bits stripped).
This ensures that if A was already mapped to B from comparing
the source register, any ADD_CONST derivative must also derive
from B, not an unrelated C.
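The extra base-id check can be sketched in a few lines of standalone C. This is a simplified model, not the verifier code: the idmap layout, the flag value, and the helper name check_scalar_ids_fixed() are assumptions made for illustration, ids are assumed to be non-zero, and both registers are assumed to carry the flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADD_CONST_FLAG 0x80000000u	/* stand-in for the BPF_ADD_CONST bit */
#define IDMAP_SIZE 16

struct idmap_entry { uint32_t old_id, cur_id; };
struct idmap { struct idmap_entry e[IDMAP_SIZE]; };

/* Record old_id -> cur_id, or confirm it agrees with an earlier entry. */
static bool check_ids(struct idmap *m, uint32_t old_id, uint32_t cur_id)
{
	for (int i = 0; i < IDMAP_SIZE; i++) {
		if (m->e[i].old_id == 0) {	/* free slot: remember the pair */
			m->e[i].old_id = old_id;
			m->e[i].cur_id = cur_id;
			return true;
		}
		if (m->e[i].old_id == old_id)
			return m->e[i].cur_id == cur_id;
		if (m->e[i].cur_id == cur_id)	/* cur_id already mapped from another id */
			return false;
	}
	return false;	/* map full: be conservative */
}

/* Hypothetical helper: map the compound ids, then also the base ids. */
static bool check_scalar_ids_fixed(struct idmap *m, uint32_t old_id,
				   uint32_t cur_id)
{
	if (!check_ids(m, old_id, cur_id))
		return false;
	if (old_id & ADD_CONST_FLAG)
		return check_ids(m, old_id & ~ADD_CONST_FLAG,
				 cur_id & ~ADD_CONST_FLAG);
	return true;
}
```

With A=1, B=2, C=3, mapping A->B and then A|flag->C|flag passes when only the compound ids are compared, but is rejected once the base ids are checked as well, because A->C conflicts with the recorded A->B.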
Fixes: 98d7ca374ba4 ("bpf: Track delta between "linked" registers.")
Reported-by: STAR Labs SG <info@starlabs.sg>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260410232651.559778-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Merge tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"Before v7.0 is released, fix a few issues with the CFI patchset,
merged earlier in v7.0-rc, that primarily affect interfaces to
non-kernel code:
- Improve the prctl() interface for per-task indirect branch landing
pad control to expand abbreviations and to resemble the speculation
control prctl() interface
- Expand the "LP" and "SS" abbreviations in the ptrace uapi header
file to "branch landing pad" and "shadow stack", to improve
readability
- Fix a typo in a CFI-related macro name in the ptrace uapi header
file
- Ensure that the indirect branch tracking state and shadow stack
state are unlocked immediately after an exec() on the new task so
that libc subsequently can control it
- While working in this area, clean up the kernel-internal,
cross-architecture prctl() function names by expanding the
abbreviations mentioned above"
* tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
prctl: cfi: change the branch landing pad prctl()s to be more descriptive
riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers
prctl: rename branch landing pad implementation functions to be more explicit
riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers
riscv: cfi: clear CFI lock status in start_thread()
riscv: ptrace: cfi: fix "PRACE" typo in uapi header
Merge tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Last set of fixes, a few vc4, and i915, one xe and one ethosu Kconfig
fix.
xe:
- Fix HW engine idleness unit conversion
i915:
- Drop check for changed VM in EXECBUF
- Fix refcount underflow race in intel_engine_park_heartbeat
- Do not use pipe_src as borders for SU area in PSR
* tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel:
drm/i915/gem: Drop check for changed VM in EXECBUF
drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeat
drm/xe: Fix bug in idledly unit conversion
drm/i915/psr: Do not use pipe_src as borders for SU area
accel: ethosu: Add hardware dependency hint
drm/vc4: Protect madv read in vc4_gem_object_mmap() with madv_lock
drm/vc4: Fix a memory leak in hang state error path
drm/vc4: Fix memory leak of BO array in hang state
drm/vc4: Release runtime PM reference after binding V3D
iavf: fix kernel-doc comment style in iavf_ethtool.c
iavf_ethtool.c contains 31 kernel-doc comment blocks using the legacy
`**/` terminator instead of the correct single `*/`. Two function
headers also use a colon separator (`iavf_get_channels:`,
`iavf_set_channels:`) instead of the ` - ` dash required by kernel-doc.
Additionally several comments embed their return-value descriptions in
the body paragraph, producing `scripts/kernel-doc -Wreturn` warnings.
Void functions that incorrectly say "Returns ..." are also rephrased.
Fix all issues across the full file:
- Replace every `**/` terminator with `*/`.
- Change `function_name:` doc headers to `function_name -`.
- Move inline "Returns ..." sentences into dedicated `Return:` sections
for non-void functions (iavf_get_msglevel, iavf_get_rxnfc,
iavf_set_channels, iavf_get_rxfh_key_size, iavf_get_rxfh_indir_size,
iavf_get_rxfh, iavf_set_rxfh).
- Rephrase body descriptions in void functions that incorrectly said
"Returns ..." (iavf_get_drvinfo, iavf_get_ringparam, iavf_get_coalesce).
- Remove boilerplate body text for iavf_get_rxfh_key_size and
iavf_get_rxfh_indir_size; the `Return:` line now conveys the same
information without the vague "Returns the table size." sentence.
Suggested-by: Anthony L. Nguyen <anthony.l.nguyen@intel.com>
Suggested-by: Leszek Pepiak <leszek.pepiak@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260409093020.3808687-1-aleksandr.loktionov@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: airoha: Fix FE_PSE_BUF_SET configuration if PPE2 is available
The airoha_fe_set routine is used to set specified bits to 1 in the
selected register. In the FE_PSE_BUF_SET case this can lead to an
overestimation of the required buffers for I/O queues, since bits of the
PSE_ALLRSV_MASK subfield that should be 0 may be left set. Fix the issue
by relying on the airoha_fe_rmw routine instead.
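The difference between an OR-only update and a read-modify-write can be illustrated with a plain variable standing in for the register. The helper names, the field mask and the values below are hypothetical; this is not the driver's code.

```c
#include <stdint.h>

/* Clear the bits in mask, then OR in val (read-modify-write). */
static uint32_t reg_rmw(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | val;
	return *reg;
}

/* OR-only update: bits that are already 1 can never be cleared. */
static uint32_t reg_set(uint32_t *reg, uint32_t val)
{
	return reg_rmw(reg, 0, val);
}
```

Starting from a register holding 0x200 in a 12-bit field, reg_set() with 0x0c0 yields 0x2c0 (the stale bit survives and the field is too large), while reg_rmw() with mask 0xfff yields the intended 0x0c0.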
====================
net: dsa: mxl862xx: VLAN support and minor improvements
This series adds VLAN offloading to the mxl862xx DSA driver along
with two minor improvements to port setup and bridge configuration.
VLAN support uses a hybrid architecture combining the Extended VLAN
engine for PVID insertion and tag stripping with the VLAN Filter
engine for per-port VID membership, both drawing from shared
1024-entry hardware pools partitioned across user ports at probe time.
====================
Daniel Golle [Tue, 7 Apr 2026 17:31:01 +0000 (18:31 +0100)]
net: dsa: mxl862xx: implement VLAN functionality
Add VLAN support using both the Extended VLAN (EVLAN) engine and the
VLAN Filter (VF) engine in a hybrid architecture that allows a higher
number of VIDs than either engine could achieve alone.
The VLAN Filter engine handles per-port VID membership checks with
discard-unmatched semantics. The Extended VLAN engine handles PVID
insertion on ingress (via fixed catchall rules) and tag stripping on
egress (2 rules per untagged VID). Tagged-only VIDs need no EVLAN
egress rules at all, so they consume only a VF entry.
Both engines draw from shared 1024-entry hardware pools. The VF pool
is divided equally among user ports for VID membership, while the
EVLAN pool is partitioned into small fixed-size ingress blocks (7
entries of catchall rules per port) and fixed-size egress blocks for
tag stripping.
With 5 user ports this yields up to 204 VIDs per port (limited by VF),
of which up to 98 can be untagged (limited by EVLAN egress budget).
With 9 user ports the numbers are 113 total and 53 untagged.
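The quoted capacities can be reproduced with simple integer arithmetic. The sketch below assumes each pool is split equally across user ports and that the per-port EVLAN share pays for the 7 ingress entries before egress rules are counted; this is one partitioning that matches the numbers above, not necessarily the driver's exact layout. The macro and function names are made up for the example.

```c
#define POOL_ENTRIES			1024	/* size of each shared pool */
#define INGRESS_ENTRIES_PER_PORT	7	/* catchall rules per port */
#define EGRESS_RULES_PER_UNTAGGED_VID	2

/* VLAN Filter pool split equally: one entry per VID. */
static unsigned int max_vids_per_port(unsigned int ports)
{
	return POOL_ENTRIES / ports;
}

/* EVLAN share minus the ingress block, two egress rules per untagged VID. */
static unsigned int max_untagged_per_port(unsigned int ports)
{
	unsigned int evlan_share = POOL_ENTRIES / ports;

	return (evlan_share - INGRESS_ENTRIES_PER_PORT) /
	       EGRESS_RULES_PER_UNTAGGED_VID;
}
```

For 5 ports this gives 1024 / 5 = 204 VIDs and (204 - 7) / 2 = 98 untagged; for 9 ports it gives 113 and (113 - 7) / 2 = 53.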
Wire up .port_vlan_add, .port_vlan_del, and .port_vlan_filtering.
Reprogram all EVLAN rules when the PVID or filtering mode changes.
Detach blocks from the bridge port before freeing them on bridge leave
to satisfy the firmware's internal refcount.
Future optimizations could increase VID capacity by dynamically sizing
the egress EVLAN blocks based on actual per-port untagged VID counts
rather than worst-case pre-allocation, or by sharing EVLAN egress and
VLAN Filter blocks across ports with identical VID sets.
Daniel Golle [Tue, 7 Apr 2026 17:30:35 +0000 (18:30 +0100)]
net: dsa: mxl862xx: don't skip early bridge port configuration
mxl862xx_bridge_port_set() is currently guarded by the
mxl8622_port->setup_done flag, as the early call to
mxl862xx_bridge_port_set() from mxl862xx_port_stp_state_set() would
otherwise cause a NULL-pointer dereference on unused ports which don't
have dp->cpu_dp despite not being a CPU port.
Using the setup_done flag (which is never set for unused ports),
however, also prevents mxl862xx_bridge_port_set() from configuring
user ports' single-port bridges early, which was unintended.
Fix this by returning early from mxl862xx_bridge_port_set() in case
dsa_port_is_unused().
Daniel Golle [Tue, 7 Apr 2026 17:30:27 +0000 (18:30 +0100)]
net: dsa: mxl862xx: reject DSA_PORT_TYPE_DSA
DSA links aren't supported by the mxl862xx driver.
Instead of returning early from .port_setup when called for
DSA_PORT_TYPE_DSA ports, return -EOPNOTSUPP and show an error
message.
The desired side-effect is that the framework will switch the port to
DSA_PORT_TYPE_UNUSED, so we can stop caring about DSA_PORT_TYPE_DSA in
all other places.
====================
net: bridge: add stp_mode attribute for STP mode selection
The bridge-stp usermode helper is currently restricted to the initial
network namespace, preventing userspace STP daemons like mstpd from
operating on bridges in other namespaces. Since commit ff62198553e4
("bridge: Only call /sbin/bridge-stp for the initial network
namespace"), bridges in non-init namespaces silently fall back to
kernel STP with no way to request userspace STP.
This series adds a new IFLA_BR_STP_MODE bridge attribute that allows
explicit per-bridge control over STP mode selection. Three modes are
supported:
- auto (default): existing behavior, try /sbin/bridge-stp in
init_net, fall back to kernel STP otherwise
- user: directly enable BR_USER_STP without invoking the helper,
works in any network namespace
- kernel: directly enable BR_KERNEL_STP without invoking the helper
The user and kernel modes bypass call_usermodehelper() entirely,
addressing the security concerns discussed at [1]. Userspace is
responsible for ensuring an STP daemon manages the bridge, rather
than relying on the kernel to invoke /sbin/bridge-stp.
Patch 1 adds the kernel support. The mode can only be changed while
STP is disabled and is processed before IFLA_BR_STP_STATE in
br_changelink() so both can be set atomically in a single netlink
message.
Patch 2 adds documentation for the new attribute in the bridge docs.
Patch 3 adds a selftest with 9 test cases. The test requires iproute2
with IFLA_BR_STP_MODE support and can be run with virtme-ng:
Andy Roulin [Sun, 5 Apr 2026 20:52:24 +0000 (13:52 -0700)]
selftests: net: add bridge STP mode selection test
Add a selftest for the IFLA_BR_STP_MODE bridge attribute that verifies:
1. stp_mode defaults to auto on new bridges
2. stp_mode can be toggled between user, kernel, and auto
3. Changing stp_mode while STP is active is rejected with -EBUSY
4. Re-setting the same stp_mode while STP is active succeeds
5. stp_mode user in a network namespace yields userspace STP (stp_state=2)
6. stp_mode kernel forces kernel STP (stp_state=1)
7. stp_mode auto in a netns preserves traditional fallback to kernel STP
8. stp_mode and stp_state can be set atomically in a single message
9. stp_mode persists across STP disable/enable cycles
Test 5 is the key use case: it demonstrates that userspace STP can now
be enabled in non-init network namespaces by setting stp_mode to user
before enabling STP.
Test 8 verifies the atomic usage pattern where both attributes are set
in a single netlink message, which is supported because br_changelink()
processes IFLA_BR_STP_MODE before IFLA_BR_STP_STATE.
The test gracefully skips if the installed iproute2 does not support
the stp_mode attribute.
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Link: https://patch.msgid.link/20260405205224.3163000-4-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andy Roulin [Sun, 5 Apr 2026 20:52:23 +0000 (13:52 -0700)]
docs: net: bridge: document stp_mode attribute
Add documentation for the IFLA_BR_STP_MODE bridge attribute in the
"User space STP helper" section of the bridge documentation. Reference
the BR_STP_MODE_* values via kernel-doc and describe the use case for
network namespace environments.
Andy Roulin [Sun, 5 Apr 2026 20:52:22 +0000 (13:52 -0700)]
net: bridge: add stp_mode attribute for STP mode selection
The bridge-stp usermode helper is currently restricted to the initial
network namespace, preventing userspace STP daemons (e.g. mstpd) from
operating on bridges in other network namespaces. Since commit
ff62198553e4 ("bridge: Only call /sbin/bridge-stp for the initial
network namespace"), bridges in non-init namespaces silently fall
back to kernel STP with no way to use userspace STP.
Add a new bridge attribute IFLA_BR_STP_MODE that allows explicit
per-bridge control over STP mode selection:
BR_STP_MODE_AUTO (default) - Existing behavior: invoke the
/sbin/bridge-stp helper in init_net only; fall back to kernel STP
if it fails or in non-init namespaces.
BR_STP_MODE_USER - Directly enable userspace STP (BR_USER_STP)
without invoking the helper. Works in any network namespace.
Userspace is responsible for ensuring an STP daemon manages the
bridge.
BR_STP_MODE_KERNEL - Directly enable kernel STP (BR_KERNEL_STP)
without invoking the helper.
The mode can only be changed while STP is disabled, or set to the
same value (-EBUSY otherwise). IFLA_BR_STP_MODE is processed before
IFLA_BR_STP_STATE in br_changelink(), so both can be set atomically
in a single netlink message. The mode can also be changed in the
same message that disables STP.
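The rule for when the mode may change can be modelled as below. This is a userspace sketch with made-up type and function names; in particular, the msg_disables_stp parameter is an assumed way of expressing "the same message disables STP" and does not reflect how br_changelink() is actually structured.

```c
#include <errno.h>
#include <stdbool.h>

enum stp_mode { STP_MODE_AUTO, STP_MODE_USER, STP_MODE_KERNEL };

struct bridge {
	bool stp_enabled;
	enum stp_mode stp_mode;
};

/* Returns 0 on success, -EBUSY if STP is active and the mode would change. */
static int bridge_set_stp_mode(struct bridge *br, enum stp_mode mode,
			       bool msg_disables_stp)
{
	if (br->stp_mode == mode)
		return 0;	/* re-setting the same value is always fine */
	if (br->stp_enabled && !msg_disables_stp)
		return -EBUSY;
	br->stp_mode = mode;
	return 0;
}
```

With STP active in auto mode, requesting user mode fails with -EBUSY, re-setting auto succeeds, and requesting user mode together with an STP disable succeeds.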
The stp_mode struct field is u8 since all possible values fit, while
NLA_U32 is used for the netlink attribute since it occupies the same
space in the netlink message as NLA_U8.
A new stp_helper_active boolean tracks whether the /sbin/bridge-stp
helper was invoked during br_stp_start(), so that br_stp_stop() only
calls the helper for stop when it was called for start. This avoids
calling the helper asymmetrically when stp_mode changes between
start and stop.
Suggested-by: Ido Schimmel <idosch@nvidia.com>
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Link: https://patch.msgid.link/20260405205224.3163000-2-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend the ntuple flow steering test to cover dst-ip, src-port, and
dst-port fields. The test supports arbitrary combinations of the fields;
for now we test src_ip/dst_ip, and src_ip/dst_ip/src_port/dst_port.
The tests currently match full fields, but we can consider adding
support for masked fields in the future.
TAP version 13
1..24
ok 1 ntuple.queue.tcp4.src_ip
ok 2 ntuple.queue.tcp4.dst_ip
ok 3 ntuple.queue.tcp4.src_port
ok 4 ntuple.queue.tcp4.dst_port
ok 5 ntuple.queue.tcp4.src_ip.dst_ip
ok 6 ntuple.queue.tcp4.src_ip.dst_ip.src_port.dst_port
ok 7 ntuple.queue.udp4.src_ip
ok 8 ntuple.queue.udp4.dst_ip
ok 9 ntuple.queue.udp4.src_port
ok 10 ntuple.queue.udp4.dst_port
ok 11 ntuple.queue.udp4.src_ip.dst_ip
ok 12 ntuple.queue.udp4.src_ip.dst_ip.src_port.dst_port
ok 13 ntuple.queue.tcp6.src_ip
ok 14 ntuple.queue.tcp6.dst_ip
ok 15 ntuple.queue.tcp6.src_port
ok 16 ntuple.queue.tcp6.dst_port
ok 17 ntuple.queue.tcp6.src_ip.dst_ip
ok 18 ntuple.queue.tcp6.src_ip.dst_ip.src_port.dst_port
ok 19 ntuple.queue.udp6.src_ip
ok 20 ntuple.queue.udp6.dst_ip
ok 21 ntuple.queue.udp6.src_port
ok 22 ntuple.queue.udp6.dst_port
ok 23 ntuple.queue.udp6.src_ip.dst_ip
ok 24 ntuple.queue.udp6.src_ip.dst_ip.src_port.dst_port
# Totals: pass:24 fail:0 xfail:0 xpass:0 skip:0 error:0
selftests: drv-net: Add ntuple (NFC) flow steering test
Add a test for ethtool NFC (ntuple) flow steering rules. The test
creates an ntuple rule matching on various flow fields and verifies
that traffic is steered to the correct queue.
The test forces all traffic to queue 0 via the indirection table,
then installs an ntuple rule to steer select traffic to a specific
queue. The test then verifies the expected number of packets is received
on the queue.
This test has variants for TCP/UDP over IPv4/IPv6, with rules matching
the source IP. Additional match fields will be added in the next commit.
TAP version 13
1..4
ok 1 ntuple.queue.tcp4.src_ip
ok 2 ntuple.queue.udp4.src_ip
ok 3 ntuple.queue.tcp6.src_ip
ok 4 ntuple.queue.udp6.src_ip
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
====================
bpf: static stack liveness data flow analysis
This patch set converts the current dynamic stack slot liveness tracking
mechanism to a static data flow analysis. The result is used during
state pruning (clean_verifier_state) to zero out dead stack slots,
enabling more aggressive state equivalence and pruning. To improve
analysis precision, live stack slot tracking is converted to 4-byte
granularity.
The key ideas and the bulk of the execution behind the series belong
to Alexei Starovoitov. I contributed to patch set integration
with existing liveness tracking mechanism.
Due to complexity of the changes the bisectability property of the
patch set is not preserved. Some selftests may fail between
intermediate patches of the series.
Analysis consists of two passes:
- A forward fixed-point analysis that tracks which frame's FP each
register value is derived from, and at what byte offset. This is
needed because a callee can receive a pointer to its caller's stack
frame (e.g. r1 = fp-16 at the call site), then do *(u64 *)(r1 + 0)
inside the callee - a cross-frame stack access that the callee's
local liveness must attribute to the caller's stack.
- A backward dataflow pass within each callee subprog that computes
live_in = (live_out \ def) ∪ use for both local and non-local
(ancestor) stack slots. The result of the analysis for callee is
propagated up to the callsite.
The key idea making such analysis possible is that a limited and
conservative argument tracking pass is sufficient to recover most of
the offsets / stack pointer arguments.
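The backward equation live_in = (live_out \ def) ∪ use maps directly onto bitmask operations. The sketch below walks a straight-line block from the last instruction to the first; the struct and function names are invented for illustration, and a single 64-bit mask is used instead of the series' own mask type.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-instruction stack effects, one bit per tracked slot. */
struct insn_effect {
	uint64_t use;	/* slots read by the instruction */
	uint64_t def;	/* slots fully overwritten by the instruction */
};

/* live_in = (live_out \ def) U use */
static uint64_t live_in(uint64_t live_out, struct insn_effect e)
{
	return (live_out & ~e.def) | e.use;
}

/* Liveness at block entry, given liveness at block exit. */
static uint64_t block_live_in(const struct insn_effect *insns, size_t n,
			      uint64_t live_out)
{
	uint64_t live = live_out;

	while (n--)
		live = live_in(live, insns[n]);
	return live;
}
```

For a block that first writes slot 0 and then reads slots 0 and 1, only slot 1 is live at entry: the write kills slot 0 before the read can see an older value.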
Changelog:
v3 -> v4:
liveness.c:
- fill_from_stack(): correct conservative stack mask for imprecise
result, instead of picking frames from pointer register
(Alexei, sashiko).
- spill_to_stack(): join with existing values instead of
overwriting when dst has multiple offsets (cnt > 1) or imprecise
offset (cnt == 0) (Alexei, sashiko).
- analyze_subprog(): big change, now each analyze_subprog() is
called with a fresh func_instance, once read/write marks are
collected the instance is joined with the one accumulated for
(callsite, depth) and update_instance() is called.
This handles several issues:
- Avoids stale must_write marks when same func_instance is reused
by analyze_subprog() several times.
- Handles potential multiple calls to mark_stack_write()
within a single instruction.
(Alexei, sashiko).
- analyze_subprog(): added complexity limit to avoid exponential
analysis time blowup for crafted programs with lots of nested
function calls (Alexei, sashiko).
- the patch "bpf: record arg tracking results in bpf_liveness masks"
is reinstated, it was accidentally squashed during v1->v2
transition.
verifier.c:
- clean_live_states() is replaced by a direct call to
clean_verifier_state(), bpf_verifier_state->cleaned is dropped.
verifier_live_stack.c:
- added selftests for arg tracking changes.
v2 -> v3:
liveness.c:
- record_stack_access(): handle S64_MIN (unknown read) with
imprecise offset. Test case can't be created with existing
helpers/kfuncs (sashiko).
- fmt_subprog(): handle NULL name (subprogs without BTF info).
- print_instance(): use u64 for pos/insn_pos to avoid truncation
(bot+bpf-ci).
- compute_subprog_args(): return error if
'env->callsite_at_stack[idx] = kvmalloc_objs(...)' fails
(sashiko).
- clear_overlapping_stack_slots(): avoid integer promotion
issues by adding an explicit (int) cast (sashiko).
bpf_verifier.h, verifier.c, liveness.c:
- Fixes in comments and commit messages (bot+bpf-ci).
v1 -> v2:
liveness.c:
- Removed func_instance->callsites and replaced it with explicit
spine passed through analyze_subprog() calls (sashiko).
- Fixed BPF_LOAD_ACQ handling in arg_track_xfer: don't clear dst
register tracking (sashiko).
- Various error threading nits highlighted by bots
(sashiko, bot+bpf-ci).
- Massaged fmt_spis_mask() to be more concise (Alexei)
verifier.c:
- Move subprog_info[i].name assignment from add_subprog_and_kfunc to
check_btf_func (sashiko, bot+bpf-ci).
- Fixed inverse usage of msb/lsb halves by patch
"bpf: make liveness.c track stack with 4-byte granularity"
(sashiko, bot+bpf-ci).
Jacob Moroni [Thu, 9 Apr 2026 15:01:22 +0000 (15:01 +0000)]
PCI/P2PDMA: Allow wildcard Device IDs in host bridge list
Currently, the pci_p2pdma_whitelist array requires an exact match for both
Vendor and Device ID. Some hardware vendors support cross bridge
peer-to-peer DMA across their entire silicon lineup, so add support for
wildcard device IDs to avoid the need to continuously update this array.
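A wildcard match of this kind could look like the following standalone sketch. The struct, the ANY_DEVICE marker and the IDs used are hypothetical; the actual patch may encode the wildcard differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_DEVICE 0xffffu	/* hypothetical wildcard marker */

struct bridge_id {
	uint16_t vendor;
	uint16_t device;	/* ANY_DEVICE matches every device of the vendor */
};

static bool bridge_listed(const struct bridge_id *list, size_t n,
			  uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < n; i++) {
		if (list[i].vendor != vendor)
			continue;
		if (list[i].device == ANY_DEVICE || list[i].device == device)
			return true;
	}
	return false;
}
```

An entry with an exact device ID keeps today's behaviour, while an entry using the wildcard covers any device from that vendor, so new parts need no further list updates.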
net: hamradio: 6pack: fix uninit-value in sixpack_receive_buf
sixpack_receive_buf() does not properly skip bytes with TTY error flags.
The while loop iterates through the flags buffer but never advances the
data pointer (cp), and passes the original count (including error bytes)
to sixpack_decode(). This causes sixpack_decode() to process bytes that
should have been skipped due to TTY errors. The TTY layer does not
guarantee that cp[i] holds a meaningful value when fp[i] is set, so
passing those positions to sixpack_decode() results in KMSAN reporting
an uninit-value read.
Fix this by processing bytes one at a time, advancing cp on each
iteration, and only passing valid (non-error) bytes to sixpack_decode().
This matches the pattern used by slip_receive_buf() and
mkiss_receive_buf() for the same purpose.
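The per-byte pattern described above can be sketched as follows. This is not the driver code: the function name is made up, and copying into an output buffer stands in for handing each valid byte to sixpack_decode().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk data (cp) and flags (fp) in lockstep. Bytes whose flag is non-zero
 * are skipped; the rest are copied to out, which must hold count bytes.
 * Returns the number of bytes kept.
 */
static size_t filter_valid_bytes(const uint8_t *cp, const uint8_t *fp,
				 size_t count, uint8_t *out)
{
	size_t kept = 0;

	while (count--) {
		if (fp && *fp++) {
			cp++;		/* advance past the errored byte */
			continue;
		}
		out[kept++] = *cp++;	/* stand-in for decoding one byte */
	}
	return kept;
}
```

The important detail is that cp advances on every iteration, including the error case, so data and flags never drift apart.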
As a sanity check, poison stack slots that stack liveness determined
to be dead, so that any read from such slots will cause program rejection.
If the stack liveness logic is incorrect, the poison can cause
a valid program to be rejected, but it will also prevent an unsafe
program from being accepted.
Allow global subprogs "read" poisoned stack slots.
The static stack liveness determined that subprog doesn't read certain
stack slots, but sizeof(arg_type) based global subprog validation
isn't accurate enough to know which slots will actually be read by
the callee, so it needs to check full sizeof(arg_type) at the caller.
The new liveness analysis in liveness.c adds verbose output at
BPF_LOG_LEVEL2, making the verifier log for good_prog exceed the 1024-byte
reference buffer. When the reference is truncated in fixed mode, the
rolling mode captures the actual tail of the full log, which doesn't match
the truncated reference.
The fix is to increase the buffer sizes in the test.
selftests/bpf: update existing tests due to liveness changes
The verifier cleans all dead registers and stack slots in the current
state. Adjust expected output in tests or insert dummy stack/register
reads. Also update verifier_live_stack tests to adhere to new logging
scheme.
Eduard Zingerman [Fri, 10 Apr 2026 20:56:00 +0000 (13:56 -0700)]
bpf: simplify liveness to use (callsite, depth) keyed func_instances
Rework func_instance identification and remove the dynamic liveness
API, completing the transition to fully static stack liveness analysis.
Replace callchain-based func_instance keys with (callsite, depth)
pairs. The full callchain (all ancestor callsites) is no longer part
of the hash key; only the immediate callsite and the call depth
matter. This does not lose precision in practice and simplifies the
data structure significantly: struct callchain is removed entirely,
func_instance stores just callsite, depth.
Drop must_write_acc propagation. Previously, must_write marks were
accumulated across successors and propagated to the caller via
propagate_to_outer_instance(). Instead, callee entry liveness
(live_before at subprog start) is pulled directly back to the
caller's callsite in analyze_subprog() after each callee returns.
Since (callsite, depth) instances are shared across different call
chains that invoke the same subprog at the same depth, must_write
marks from one call may be stale for another. To handle this,
analyze_subprog() records into a fresh_instance() when the instance
was already visited (must_write_initialized), then merge_instances()
combines the results: may_read is unioned, must_write is intersected.
This ensures only slots written on ALL paths through all call sites
are marked as guaranteed writes.
This replaces commit_stack_write_marks() logic.
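The merge rule (may_read unioned, must_write intersected) is small enough to show directly. The struct and function names below are illustrative and a single 64-bit mask replaces the real per-frame masks.

```c
#include <stdint.h>

struct frame_masks {
	uint64_t may_read;	/* slots that some path may read */
	uint64_t must_write;	/* slots that every path writes */
};

/* Combine results from two analyses of the same (callsite, depth). */
static struct frame_masks merge_masks(struct frame_masks a,
				      struct frame_masks b)
{
	struct frame_masks r = {
		.may_read = a.may_read | b.may_read,
		.must_write = a.must_write & b.must_write,
	};

	return r;
}
```

A slot written on only one of the two paths drops out of must_write, while a slot read on either path stays in may_read, which keeps the merged result conservative in both directions.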
Skip recursive descent into callees that receive no FP-derived
arguments (has_fp_args() check). This is needed because global
subprogram calls can push depth beyond MAX_CALL_FRAMES (max depth
is 64 for global calls but only 8 frames are accommodated for FP
passing). It also handles the case where a callback subprog cannot be
determined by argument tracking: such callbacks will be processed by
analyze_subprog() at depth 0 independently.
Update lookup_instance() (used by is_live_before queries) to search
for the func_instance with maximal depth at the corresponding
callsite, walking depth downward from frameno to 0. This accounts for
the fact that instance depth no longer corresponds 1:1 to
bpf_verifier_state->curframe, since skipped non-FP calls create gaps.
Remove the dynamic public liveness API from verifier.c:
- bpf_mark_stack_{read,write}(), bpf_reset/commit_stack_write_marks()
- bpf_update_live_stack(), bpf_reset_live_stack_callchain()
- All call sites in check_stack_{read,write}_fixed_off(),
check_stack_range_initialized(), mark_stack_slot_obj_read(),
mark/unmark_stack_slots_{dynptr,iter,irq_flag}()
- The per-instruction write mark accumulation in do_check()
- The bpf_update_live_stack() call in prepare_func_exit()
mark_stack_read() and mark_stack_write() become static functions in
liveness.c, called only from the static analysis pass. The
func_instance->updated and must_write_dropped flags are removed.
Remove spis_single_slot(), spis_one_bit() helpers from bpf_verifier.h
as they are no longer used.
Eduard Zingerman [Fri, 10 Apr 2026 20:55:59 +0000 (13:55 -0700)]
bpf: record arg tracking results in bpf_liveness masks
After arg tracking reaches a fixed point, perform a single linear scan
over the converged at_in[] state and translate each memory access into
liveness read/write masks on the func_instance:
- Load/store instructions: FP-derived pointer's frame and offset(s)
are converted to half-slot masks targeting
per_frame_masks->{may_read,must_write}
- Helper/kfunc calls: record_call_access() queries
bpf_helper_stack_access_bytes() / bpf_kfunc_stack_access_bytes()
for each FP-derived argument to determine access size and direction.
Unknown access size (S64_MIN) conservatively marks all slots from
fp_off to fp+0 as read.
- Imprecise pointers (frame == ARG_IMPRECISE): conservatively mark
all slots in every frame covered by the pointer's frame bitmask
as fully read.
- Static subprog calls with unresolved arguments: conservatively mark
all frames as fully read.
Instead of a call to clean_live_states(), start cleaning the current
state continuously as registers and stack become dead since the static
analysis provides complete liveness information. This makes
clean_live_states() and bpf_verifier_state->cleaned unnecessary.
The analysis is a basis for static liveness tracking mechanism
introduced by the next two commits.
A forward fixed-point analysis that tracks which frame's FP each
register value is derived from, and at what byte offset. This is
needed because a callee can receive a pointer to its caller's stack
frame (e.g. r1 = fp-16 at the call site), then do *(u64 *)(r1 + 0)
inside the callee — a cross-frame stack access that the callee's local
liveness must attribute to the caller's stack.
Each register holds an arg_track value from a three-level lattice:
- Precise {frame=N, off=[o1,o2,...]} — known frame index and
up to 4 concrete byte offsets
- Offset-imprecise {frame=N, off_cnt=0} — known frame, unknown offset
- Fully-imprecise {frame=ARG_IMPRECISE, mask=bitmask} — unknown frame,
mask says which frames might be involved
At CFG merge points the lattice moves toward imprecision (same
frame+offset stays precise, same frame different offsets merges offset
sets or becomes offset-imprecise, different frames become
fully-imprecise with OR'd bitmask).
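The merge rule at CFG join points can be sketched as below. The struct layout, the FRAME_IMPRECISE marker and the function names are assumptions for illustration and do not mirror the definitions in liveness.c; values that are not FP-derived at all are not modelled.

```c
#include <stdint.h>

#define MAX_OFFS 4
#define FRAME_IMPRECISE (-1)

struct track {
	int frame;		/* frame index, or FRAME_IMPRECISE */
	uint32_t mask;		/* candidate frames when frame is imprecise */
	int off_cnt;		/* 0 means "offset unknown" */
	int off[MAX_OFFS];
};

static uint32_t frames_of(const struct track *t)
{
	return t->frame == FRAME_IMPRECISE ? t->mask : 1u << t->frame;
}

static struct track join(struct track a, struct track b)
{
	struct track r = { .frame = a.frame };

	/* Different or unknown frames: keep only the OR'd frame bitmask. */
	if (a.frame == FRAME_IMPRECISE || a.frame != b.frame) {
		r.frame = FRAME_IMPRECISE;
		r.mask = frames_of(&a) | frames_of(&b);
		return r;
	}
	/* Same frame, but one side has no known offset. */
	if (a.off_cnt == 0 || b.off_cnt == 0)
		return r;

	/* Same frame: union of the offset sets, if it still fits. */
	for (int i = 0; i < a.off_cnt; i++)
		r.off[r.off_cnt++] = a.off[i];
	for (int i = 0; i < b.off_cnt; i++) {
		int j;

		for (j = 0; j < r.off_cnt; j++)
			if (r.off[j] == b.off[i])
				break;
		if (j < r.off_cnt)
			continue;	/* offset already present */
		if (r.off_cnt == MAX_OFFS) {
			r.off_cnt = 0;	/* too many: offset-imprecise */
			return r;
		}
		r.off[r.off_cnt++] = b.off[i];
	}
	return r;
}
```

Joining fp-16 and fp-8 of frame 1 keeps both offsets; joining pointers into frames 0 and 1 gives a fully-imprecise value whose mask has bits 0 and 1 set.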
The analysis also tracks spills/fills to the callee's own stack
(at_stack_in/out), so FP-derived values stay tracked when they are
spilled and reloaded.
This pass is run recursively per call site: when subprog A calls B
with specific FP-derived arguments, B is re-analyzed with those entry
args. The recursion follows analyze_subprog -> compute_subprog_args ->
(for each call insn) -> analyze_subprog. Subprogs that receive no
FP-derived args are skipped during recursion and analyzed
independently at depth 0.
Eduard Zingerman [Fri, 10 Apr 2026 20:55:57 +0000 (13:55 -0700)]
bpf: prepare liveness internal API for static analysis pass
Move the `updated` check and reset from bpf_update_live_stack() into
update_instance() itself, so callers outside the main loop can reuse
it. Similarly, move write_insn_idx assignment out of
reset_stack_write_marks() into its public caller, and thread insn_idx
as a parameter to commit_stack_write_marks() instead of reading it
from liveness->write_insn_idx. Drop the unused `env` parameter from
alloc_frame_masks() and mark_stack_read().
Eduard Zingerman [Fri, 10 Apr 2026 20:55:56 +0000 (13:55 -0700)]
bpf: 4-byte precise clean_verifier_state
Migrate clean_verifier_state() and its liveness queries from 8-byte
SPI granularity to 4-byte half-slot granularity.
In __clean_func_state(), each SPI is cleaned in two independent
halves:
- half_spi 2*i (lo): slot_type[0..3]
- half_spi 2*i+1 (hi): slot_type[4..7]
Slot types STACK_DYNPTR, STACK_ITER and STACK_IRQ_FLAG are never
cleaned, as their slot type markers are required by
destroy_if_dynptr_stack_slot(), is_iter_reg_valid_uninit() and
is_irq_flag_reg_valid_uninit() for correctness.
When only the hi half is dead, spilled_ptr metadata is destroyed and
the lo half's STACK_SPILL bytes are downgraded to STACK_MISC or
STACK_ZERO. When only the lo half is dead, spilled_ptr is preserved
because the hi half may still need it for state comparison.
Eduard Zingerman [Fri, 10 Apr 2026 20:55:55 +0000 (13:55 -0700)]
bpf: make liveness.c track stack with 4-byte granularity
Convert liveness bitmask type from u64 to spis_t, doubling the number
of trackable stack slots from 64 to 128 to support 4-byte granularity.
Each 8-byte SPI now maps to two consecutive 4-byte sub-slots in the
bitmask: spi*2 half and spi*2+1 half. In verifier.c,
check_stack_write_fixed_off() now reports 4-byte aligned 4-byte
writes as half-slot marks and 8-byte aligned 8-byte writes as
two slots. Similar logic is applied in check_stack_read_fixed_off().
Queries (is_live_before) are not yet migrated to half-slot
granularity.
bpf: Add spis_*() helpers for 4-byte stack slot bitmasks
Add helper functions for manipulating u64[2] bitmasks that represent
4-byte stack slot liveness. The 512-byte BPF stack is divided into
128 4-byte slots, requiring 128 bits (two u64s) to track.
These will be used by the static stack liveness analysis in the
next commit.
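A 128-bit mask built from two u64 words can be manipulated as in the sketch below. The type and helper names are made up and are not the spis_*() helpers themselves; only the sizes (512-byte stack, 4-byte slots, two u64 words) come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_BYTES	512
#define HALF_SLOT_BYTES	4
#define HALF_SLOTS	(STACK_BYTES / HALF_SLOT_BYTES)	/* 128 */

struct slot_mask {
	uint64_t v[2];	/* bits 0..63 in v[0], bits 64..127 in v[1] */
};

static void slot_mask_set(struct slot_mask *m, unsigned int half_slot)
{
	m->v[half_slot / 64] |= 1ULL << (half_slot % 64);
}

static bool slot_mask_test(const struct slot_mask *m, unsigned int half_slot)
{
	return (m->v[half_slot / 64] & (1ULL << (half_slot % 64))) != 0;
}
```

With two half-slots per 8-byte SPI, the halves of SPI i sit at indices 2*i and 2*i+1, so SPI 5 occupies bits 10 and 11 of the first word and the last half-slot (127) is bit 63 of the second word.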
Eduard Zingerman [Fri, 10 Apr 2026 20:55:53 +0000 (13:55 -0700)]
bpf: save subprogram name in bpf_subprog_info
Subprogram name can be computed from function info and BTF, but it is
convenient to have the name readily available for logging purposes.
Update comment saying that bpf_subprog_info->start has to be the first
field, this is no longer true, relevant sites access .start field
by its name.
Dave Airlie [Fri, 10 Apr 2026 21:35:21 +0000 (07:35 +1000)]
Merge tag 'drm-intel-fixes-2026-04-09' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Drop check for changed VM in EXECBUF
- Fix refcount underflow race in intel_engine_park_heartbeat
- Do not use pipe_src as borders for SU area in PSR
Thomas Gleixner [Tue, 7 Apr 2026 08:54:17 +0000 (10:54 +0200)]
clockevents: Prevent timer interrupt starvation
Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which sets up a timerfd based
timer and then rearms it in a loop with an absolute expiry time of 1ns.
As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in an endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.
As a first step to prevent this, avoid reprogramming the clock event device
when:
- a forced minimum delta event is pending
- the new expiry delta is less than or equal to the minimum delta
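The guard described by the two conditions above might be sketched as
follows. This is a simplified model, not the actual clockevents code: the
struct, the forced_min_pending flag and the function name are hypothetical,
and the sketch assumes both conditions must hold together for the
reprogramming to be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a clock event device. */
struct demo_clockevent {
	uint64_t min_delta_ns;		/* smallest programmable delta */
	bool forced_min_pending;	/* a forced minimum delta event is armed */
};

/*
 * Return true if reprogramming can be skipped: a minimum delta event is
 * already pending and the new expiry would not fire any later than that.
 */
static bool demo_skip_reprogram(const struct demo_clockevent *dev,
				int64_t delta_ns)
{
	assert(dev);
	return dev->forced_min_pending &&
	       delta_ns <= (int64_t)dev->min_delta_ns;
}
```

In this model, skipping lets the already armed interrupt fire instead of
pushing its expiry out again on every rearm.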
Thanks to Calvin for providing the reproducer and to Borislav for testing
and providing data from his Zen5 machine.
The problem is not limited to Zen5, but depending on the underlying
clock event device (e.g. TSC deadline timer on Intel) and the CPU speed,
it is not necessarily observable.
This change serves only as the last resort and further changes will be made
to prevent this scenario earlier in the call chain as far as possible.
[ tglx: Updated to restore the old behaviour vs. !force and delta <= 0 and
fixed up the tick-broadcast handlers as pointed out by Borislav ]
i2c: qcom-geni: Avoid extra TX DMA TRE for single read message in GPI mode
In GPI mode, the I2C GENI driver programs an extra TX DMA transfer
descriptor (TRE) on the TX channel when handling a single read message.
This results in an unintended write phase being issued on the I2C bus,
even though a read transaction does not require any TX data.
For a single-byte read, the correct hardware sequence consists of the
CONFIG and GO commands followed by a single RX DMA TRE. Programming an
additional TX DMA TRE is redundant, causes unnecessary DMA buffer
mapping on the TX channel, and may lead to incorrect bus behavior.
Update the transfer logic to avoid programming a TX DMA TRE for single
read messages in GPI mode.
====================
selftests/bpf: Test BTF sanitization
Allow simulation of missing BPF features through provision of
a synthetic feature cache set, and use this to simulate the case
where FEAT_BTF_LAYOUT is missing. Ensure sanitization leaves us
with the expected BTF (layout info removed, layout header fields
zeroed, strings data adjusted).
Specifying a feature cache with selected missing features will
allow testing of other missing feature codepaths, but for now
add BTF layout sanitization test only.
Changes since v2 [1]:
- change zfree() to free() since we immediately assign the
feat_cache (Jiri, patch 1)
- "goto out" to avoid skeleton leak (Chengkaitao, patch 2)
- just use kfree_skb__open() since we do not need to load
skeleton
Alan Maguire [Wed, 8 Apr 2026 16:57:35 +0000 (17:57 +0100)]
selftests/bpf: Add BTF sanitize test covering BTF layout
Add a test that fakes up a feature cache of supported BPF
features to simulate an older kernel that does not support
BTF layout information. Ensure that BTF is sanitized correctly
to remove layout info between types and strings, and that all
offsets and lengths are adjusted appropriately.
Alan Maguire [Wed, 8 Apr 2026 16:57:34 +0000 (17:57 +0100)]
libbpf: Allow use of feature cache for non-token cases
Allow bpf object feat_cache assignment in BPF selftests
to simulate missing features via inclusion of libbpf_internal.h
and use of bpf_object_set_feat_cache() and bpf_object__sanitize_btf() to
test BTF sanitization for cases where missing features are simulated.
test_access_variable_array relied on accessing struct sched_domain::span
to validate variable-length array handling via BTF. Recent scheduler
refactoring removed or hid this field, causing the test
to fail to build.
Given that this test depends on internal scheduler structures that are
subject to refactoring, and equivalent variable-length array coverage
already exists via bpf_testmod-based tests, remove
test_access_variable_array entirely.