Guangshuo Li [Wed, 15 Apr 2026 19:31:38 +0000 (03:31 +0800)]
ALSA: pcmtest: fix reference leak on failed device registration
When platform_device_register() fails in mod_init(), the embedded struct
device in pcmtst_pdev has already been initialized by
device_initialize(), but the failure path returns the error without
dropping the device reference for the current platform device:
Cássio Gabriel [Thu, 16 Apr 2026 13:24:40 +0000 (10:24 -0300)]
ALSA: 6fire: Fix input volume change detection
usb6fire_control_input_vol_put() stores the analog capture volume
as a signed offset in rt->input_vol[] (-15..+15), but it compares
the cached value against the user-visible mixer value (0..30)
before subtracting 15.
This mixes two domains in the change detection path. Since the
runtime is zero-initialized, the visible default is 15; writing 0
right after probe is ignored, while writing 15 is reported as a
change even though the cached value remains 0.
Normalize the user value before comparing it with the cached offset.
ALSA: usb-audio: Add quirk entries for NexiGo N930W webcam
The NexiGo N930W 60fps webcam (USB ID 3443:930d) hits the same
'cannot get freq at ep 0x84' error in snd-usb-audio as its sibling
N930AF (1bcf:2283). Without QUIRK_FLAG_GET_SAMPLE_RATE the ADC clock
is never configured and the microphone streams only zero samples.
Testing on Linux 6.17 with QUIRK_FLAG_GET_SAMPLE_RATE |
QUIRK_FLAG_MIC_RES_16 (via quirk_alias=3443930d:1bcf2283) confirmed
the microphone captures real audio after a cold USB re-enumeration.
Adding a native quirk_flags_table entry avoids the alias workaround.
Michal Wajdeczko [Thu, 16 Apr 2026 13:18:31 +0000 (15:18 +0200)]
drm/xe/pf: Fix VF's scheduling priority reporting
When preparing number of impacted VFs parameter for the reporting
helper function, we wrongly ended with adding +1 (representing PF)
twice, since local variable total_vfs was already adjusted. This
resulted in printing a message that was referring to an invalid VF:
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"Several fixes:
- Add missing static const
- Correct type 1 emulation for VFIO_CHECK_EXTENSION when no-iommu is
turned on
- Fix selftest memory leak and syzkaller splat
- Fix missed -EFAULT in fault reporting write() fops
- Fix a race where map/unmap with the internal IOVA allocator can
unmap things it should not"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Fix a race with concurrent allocation and unmap
iommufd/selftest: Remove MOCK_IOMMUPT_AMDV1 format
iommufd: Fix return value of iommufd_fault_fops_write()
iommufd: update outdated comment for renamed iommufd_hw_pagetable_alloc()
iommufd/selftest: Fix page leaks in mock_viommu_{init,destroy}
iommufd: vfio compatibility extension check for noiommu mode
iommufd: Constify struct dma_buf_attach_ops
Paulo Alcantara [Fri, 17 Apr 2026 00:15:50 +0000 (21:15 -0300)]
smb: client: fix dir separator in SMB1 UNIX mounts
When calling cifs_mount_get_tcon() with SMB1 UNIX mounts,
@cifs_sb->mnt_cifs_flags needs to be read or updated only after
calling reset_cifs_unix_caps(), otherwise it might end up with missing
CIFS_MOUNT_POSIXACL and CIFS_MOUNT_POSIX_PATHS bits.
This fixes the wrong dir separator used in paths caused by the missing
CIFS_MOUNT_POSIX_PATHS bit in cifs_sb_info::mnt_cifs_flags.
Reported-by: "Kris Karas (Bug Reporting)" <bugs-a21@moonlit-rail.com> Closes: https://lore.kernel.org/r/f758f4ff-4d54-4244-931d-38f469c3ff14@moonlit-rail.com Fixes: 4fc3a433c139 ("smb: client: use atomic_t for mnt_cifs_flags") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl
Pull fwctl updates from Jason Gunthorpe:
- New fwctl driver for Broadcom RDMA NICs
- Bug fix for non-modular builds
* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl:
fwctl: Fix class init ordering to avoid NULL pointer dereference on device removal
fwctl/bnxt_fwctl: Add documentation entries
fwctl/bnxt_fwctl: Add bnxt fwctl device
fwctl/bnxt_en: Create an aux device for fwctl
fwctl/bnxt_en: Refactor aux bus functions to be more generic
fwctl/bnxt_en: Move common definitions to include/linux/bnxt/
Merge tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC ARM code updates from Arnd Bergmann:
"These are again very minimal updates:
- A workaround for firmware on Google Nexus 10
- A fix for early debugging on OMAP1
- A rework for Microchip SoC configuration
- Cleanups on OMAP2 an R-Car-Gen2"
* tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: omap2: dead code cleanup in kconfig for ARCH_OMAP4
ARM: OMAP1: Fix DEBUG_LL and earlyprintk on OMAP16XX
arm64: Kconfig: provide a top-level switch for Microchip platforms
ARM: shmobile: rcar-gen2: Use of_phandle_args_equal() helper
ARM: omap: fix all kernel-doc warnings
ARM: omap2: Replace scnprintf with strscpy in omap3_cpuinfo
ARM: samsung: exynos5250: Allow CPU1 to boot
Merge tag 'soc-defconfig-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
"As usual, we enable a number of additional device drivers as loadable
modules, to support the added platforms. The largest change this time
is for OMAP2/3, which were not that well supported in the generic
arm32 defconfig.
The Tegra SoC platforms are now enabled by default in Kconfig when
ARCH_TEGRA is enabled, which means the defconfig change is done at the
same time as the Kconfig change here"
* tag 'soc-defconfig-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
arch/arm: Drop CONFIG_FIRMWARE_EDID from defconfig files
arm64: defconfig: Enable DP83TG720 PHY driver
arm64: tegra: defconfig: Drop redundant ARCH_TEGRA_foo_SOC
ARM: tegra: defconfig: Drop redundant ARCH_TEGRA_foo_SOC
arm64: defconfig: enable pci-pwrctrl-generic as module
arm64: defconfig: Enable Lontium LT8713sx driver
arm64: defconfig: Enable Qualcomm Eliza SoC display clock controller
arm64: defconfig: enable IPQ5210 RDP504 base configs
arm64: defconfig: Enable Milos LPASS LPI pinctrl driver
arm64: defconfig: Enable Kaanapali clock controllers
arm64: defconfig: Enable configs for Arduino VENTUNO Q
arm64: defconfig: Enable Qualcomm Eliza basic resource providers
arm64: defconfig: Enable S5KJN1 camera sensor
arm64: defconfig: Enable configurations for Toradex Aquila AM69
arm64: defconfig: remove SENSORS_SA67MCU
arm64: defconfig: Enable Qualcomm WCD937x headphone codec as module
arm64: defconfig: Enable QCOMTEE module for QTEE-enabled Qualcomm SoCs
ARM: shmobile: defconfig: Refresh for v7.0-rc1
arm: multi_v7_defconfig: Enable more OMAP 3/4 related configs
ARM: multi_v7_defconfig: omap2plus_defconfig: Enable ITE IT66121 driver
...
Merge tag 'soc-drivers-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"The driver updates again are all over the place with many minor fixes
going into platform specific code. The most notable changes are:
- Support for Microchip pic64gx system controllers
- Work on cleaning up devicetree bindings for SoC drivers, and
converting them into the new format
- Lots of smaller changes for Qualcomm SoC drivers, including support
for a number of newly supported chips
- reset controller API cleanups and a new driver for Cix Sky1
- Reworks of the Tegra PMC and CBB drivers, along with a change to
how individual Tegra SoCs get selected in Kconfig and BPMP firmware
driver updates including a refresh of the ABI header to match the
version used by firmware
- STM32 updates to the firewall bus driver and support for the debug
bus through OP-TEE
- SCMI firmware driver improvements for reliability, in particular
for dealing with broken firmware interrupts
- Memory driver updates for Tegra, and a patch to remove the unused
Baikal T1 driver"
* tag 'soc-drivers-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (193 commits)
firmware: arm_ffa: Use the correct buffer size during RXTX_MAP
firmware: qcom: scm: Allow QSEECOM on Lenovo IdeaCentre Mini X
clk: spear: fix resource leak in clk_register_vco_pll()
reset: rzv2h-usb2phy: Add support for VBUS mux controller registration
reset: rzv2h-usb2phy: Convert to regmap API
dt-bindings: reset: renesas,rzv2h-usb2phy: Document RZ/G3E USB2PHY reset
dt-bindings: reset: renesas,rzv2h-usb2phy: Add '#mux-state-cells' property
soc: microchip: add mpfs gpio interrupt mux driver
dt-bindings: soc: microchip: document PolarFire SoC's gpio interrupt mux
gpio: mpfs: Add interrupt support
soc: qcom: ubwc: add helpers to get programmable values
soc: qcom: ubwc: add helper to get min_acc length
firmware: qcom: scm: Register gunyah watchdog device
soc: qcom: socinfo: Add SoC ID for SA8650P
dt-bindings: arm: qcom,ids: Add SoC ID for SA8650P
firmware: qcom: scm: Allow QSEECOM on Mahua CRD
soc: qcom: wcnss: simplify allocation of req
soc: qcom: pd-mapper: Add support for Eliza
soc: qcom: aoss: compare against normalized cooling state
soc: qcom: llcc: fix v1 SB syndrome register offset
...
Merge tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
"A number of SoC platforms are adding modernized variants of their
already supported chips time, with a total of 12 new SoCs, and two
older SoC getting removed:
- Qualcomm Glymur is a compute SoC using 18 Oryon-2 CPU cores
- Qualcomm Mahua is a variant of Glymur with only 12 CPU cores, but
largely identical.
- Qualcomm Eliza is an embeded platform for mobile phone (SM7750) and
IOT (QC7790S/M) workloads
- Qualcomm IPQ5210 is a wireless networking SoC using Cortex-A53
cores
- Qualcomm apq8084 and ipq806x had only rudimentary support but no
actual products using them, so they are now gone.
- Axis ARTPEC-9 is a follow-up to the ARTPEC-8 embedded SoC, using
the Samsung SoC platform but now with Cortex-A55 cores
- ARM Zena is a virtual platform in FVP using Cortex-A720AE cores,
with additional versions planned to be merged in the future.
- ARM corstone-1000-a320 is a reference platform for IOT, using
low-end Cortex-A320 cores
- Microchip LAN9691 is an updated 64-bit variant of the arm32 lan966x
series of networking SoCs
- Microchip PIC64GX is an embedded RISC-V chip using SIFIVE U54 CPU
cores
- Rockchip RV1103B is the low-end 32-bit single-core vision processor
- Renesas RZ/G3L (r9a08g046) is an industrial embedded chip using
Cortex-A55 cores, similar to the G3E and G3S variants we already
supported.
- NXP S32N79 is an automotive SoC using Cortex-A78AE cores, a
significant upgrade from the older S32V and S32G series
These all come with at least one reference board or an initial product
using these, in total there are 67 newly added boards. The ones for
already supported SoCs are:
- Two more Aspeed BMC based boards
- Three older tablets based on 32-bit OMAP4 and Exynos5 SoCs
- One Set-top-box based on Allwinner H6
- 22 additional industrial/embedded boards using 64-bit NXP i.MX8M or
i.MX9 SoCs
- 20 Qualcomm SoC based machines across all possible markets:
workstation, gaming, laptop, phone, networking, reference, ...
- Three more Rockchips rk35xx based boards
- Four variants of the Toradex Verdin using TI AM62
Other notable bits are:
- A cleanup for the 32-bit Tegra paz00 board moved the last board
specific code on Tegra into equivalent dts syntax.
- There continues to be a significant number of fixes for static
checking of dtc syntax, but it feels like this is slowing down,
hopefully getting into a state where most known issues are
addressed
- Additional hardware support for many existing boards across SoC
families, notably Qualcomm, Broadcom, i.MX2, i.MX6, Rockchips,
STM32, Mediatek, Tegra, TI and Microchip"
* tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (841 commits)
arm64: dts: ti: k3: Use memory-region-names for r5f
ARM: dts: imx: Add DT overlays for DH i.MX6 DHCOM SoM and boards
ARM: dts: imx6sx: remove fallback compatible string fsl,imx28-lcdif
ARM: dts: imx25: rename node name tcq to touchscreen
ARM: dts: imx: b850v3: Disable unused usdhc4
ARM: dts: imx: b850v3: Define GPIO line names
ARM: dts: imx: b850v3: Use alphabetical sorting
ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning
ARM: dts: imx: bx50v3: Configure switch PHY max-speed to 100Mbps
ARM: dts: imx7ulp: Add CPU clock and OPP table support
ARM: dts: imx7-mba7: Deassert BOOT_EN after boot
ARM: dts: tqma7: add boot phase properties
ARM: dts: imx7s: add boot phase properties
ARM: dts: tqma6ul[l]: correct spelling of TQ-Systems
ARM: dts: mba6ulx: add boot phase properties
ARM: dts: imx6ul[l]-tqma6ul[l]: add boot phase properties
ARM: dts: imx6ul/imx6ull: add boot phase properties
ARM: dts: imx6qdl-mba6: add boot phase properties
ARM: dts: imx6qdl-tqma6: add boot phase properties
ARM: dts: imx6qdl: add boot phase properties
...
Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "pid: make sub-init creation retryable" (Oleg Nesterov)
Make creation of init in a new namespace more robust by clearing away
some historical cruft which is no longer needed. Also some
documentation fixups
- "selftests/fchmodat2: Error handling and general" (Mark Brown)
Fix and a cleanup for the fchmodat2() syscall selftest
- "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)
- "hung_task: Provide runtime reset interface for hung task detector"
(Aaron Tomlin)
Give administrators the ability to zero out
/proc/sys/kernel/hung_task_detect_count
- "tools/getdelays: use the static UAPI headers from
tools/include/uapi" (Thomas Weißschuh)
Teach getdelays to use the in-kernel UAPI headers rather than the
system-provided ones
- "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)
Several cleanups and fixups to the hardlockup detector code and its
documentation
- "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)
A couple of small/theoretical fixes in the bch code
- "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)
- "cleanup the RAID5 XOR library" (Christoph Hellwig)
A quite far-reaching cleanup to this code. I can't do better than to
quote Christoph:
"The XOR library used for the RAID5 parity is a bit of a mess right
now. The main file sits in crypto/ despite not being cryptography
and not using the crypto API, with the generic implementations
sitting in include/asm-generic and the arch implementations
sitting in an asm/ header in theory. The latter doesn't work for
many cases, so architectures often build the code directly into
the core kernel, or create another module for the architecture
code.
Change this to a single module in lib/ that also contains the
architecture optimizations, similar to the library work Eric
Biggers has done for the CRC and crypto libraries later. After
that it changes to better calling conventions that allow for
smarter architecture implementations (although none is contained
here yet), and uses static_call to avoid indirection function call
overhead"
- "lib/list_sort: Clean up list_sort() scheduling workarounds"
(Kuan-Wei Chiu)
Clean up this library code by removing a hacky thing which was added
for UBIFS, which UBIFS doesn't actually need
- "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)
Fix a few bugs in the scatterlist code, add in-kernel tests for the
now-fixed bugs and fix a leak in the test itself
- "kdump: Enable LUKS-encrypted dump target support in ARM64 and
PowerPC" (Coiby Xu)
Enable support of the LUKS-encrypted device dump target on arm64 and
powerpc
- "ocfs2: consolidate extent list validation into block read callbacks"
(Joseph Qi)
Cleanup, simplify, and make more robust ocfs2's validation of extent
list fields (Kernel test robot loves mounting corrupted fs images!)
* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
ocfs2: validate group add input before caching
ocfs2: validate bg_bits during freefrag scan
ocfs2: fix listxattr handling when the buffer is full
doc: watchdog: fix typos etc
update Sean's email address
ocfs2: use get_random_u32() where appropriate
ocfs2: split transactions in dio completion to avoid credit exhaustion
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
ocfs2: validate extent block list fields during block read
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
ocfs2: validate dx_root extent list fields during block read
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
ocfs2: handle invalid dinode in ocfs2_group_extend
.get_maintainer.ignore: add Askar
ocfs2: validate bg_list extent bounds in discontig groups
checkpatch: exclude forward declarations of const structs
tools/accounting: handle truncated taskstats netlink messages
taskstats: set version in TGID exit notifications
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
...
====================
vsock/virtio: fix MSG_PEEK calculation on bytes to copy
`virtio_transport_stream_do_peek`, when calculating the number of bytes to
copy, didn't consider the `offset`, caused by partial reads that happened
before.
This might cause out-of-bounds read that lead to an EFAULT.
More details in the commits.
Commit 1 introduces the fix
Commit 2 introduces some preliminary work for adding a test and fixes a
problem in existing tests.
Commit 3 introduces a test that checks for this bug to avoid future
regressions.
For disclosure: this bug was found initially by claude opus 4.6, I then analyzed
it and worked on the fix and the test.
====================
Luigi Leonardi [Wed, 15 Apr 2026 15:09:30 +0000 (17:09 +0200)]
vsock/test: add MSG_PEEK after partial recv test
Add a test that verifies MSG_PEEK works correctly after a partial
recv().
This is to test a bug that was present in the
`virtio_transport_stream_do_peek()` when computing the number of bytes to
copy: After a partial read, the peek function didn't take into
consideration the number of bytes that were already read. So peeking the
whole buffer would cause an out-of-bounds read, that resulted in a -EFAULT.
This test does exactly this: do a partial recv on a buffer, then try to
peek the whole buffer content. The test re-uses
`test_stream_msg_peek_client()` to also cover this scenario.
Luigi Leonardi [Wed, 15 Apr 2026 15:09:29 +0000 (17:09 +0200)]
vsock/test: fix MSG_PEEK handling in recv_buf()
`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
`recv` until all requested bytes are available or an error occurs.
The problem is how it calculates the number of bytes read: MSG_PEEK
doesn't consume any bytes and will re-read the same bytes from the buffer
head, so summing the return value every time is wrong.
Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if more
bytes are requested than are available, the loop will never terminate,
because `recv` will never return EOF. For this reason, we need to compare
the number of bytes read with the number of bytes expected.
Add a check: if the MSG_PEEK flag is present, update the byte counter and
break out of the loop only after at least the expected number of bytes
have been received; otherwise, retry after a short delay to avoid
consuming too many CPU cycles.
This allows us to simplify the `test_stream_credit_update_test` by
reusing `recv_buf`, like some other tests already do.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Luigi Leonardi <leonardi@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20260415-fix_peek-v4-2-8207e872759e@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Luigi Leonardi [Wed, 15 Apr 2026 15:09:28 +0000 (17:09 +0200)]
vsock/virtio: fix MSG_PEEK ignoring skb offset when calculating bytes to copy
`virtio_transport_stream_do_peek()` does not account for the skb offset
when computing the number of bytes to copy.
This means that, after a partial recv() that advances the offset, a peek
requesting more bytes than are available in the sk_buff causes
`skb_copy_datagram_iter()` to go past the valid payload, resulting in
a -EFAULT.
The dequeue path already handles this correctly.
Apply the same logic to the peek path.
Fixes: 0df7cd3c13e4 ("vsock/virtio/vhost: read data from non-linear skb") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Signed-off-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20260415-fix_peek-v4-1-8207e872759e@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: enetc: fix command BD ring issues
Currently, the implementation of command BD ring has two issues, one is
that the driver may obtain wrong consumer index of the ring, because the
driver does not mask out the SBE bit of the CIR value, so a wrong index
will be obtained when a SBE error ouccrs. The other one is that the DMA
buffer may be used after free. If netc_xmit_ntmp_cmd() times out and
returns an error, the pending command is not explicitly aborted, while
ntmp_free_data_mem() unconditionally frees the DMA buffer. If the buffer
has already been reallocated elsewhere, this may lead to silent memory
corruption. Because the hardware eventually processes the pending command
and perform a DMA write of the response to the physical address of the
freed buffer. So this patch set is to fix these two issues.
====================
The AI-generated review reported a potential DMA use-after-free issue
[1]. If netc_xmit_ntmp_cmd() times out and returns an error, the pending
command is not explicitly aborted, while ntmp_free_data_mem()
unconditionally frees the DMA buffer. If the buffer has already been
reallocated elsewhere, this may lead to silent memory corruption. Because
the hardware eventually processes the pending command and perform a DMA
write of the response to the physical address of the freed buffer.
To resolve this issue, this patch does the following modifications:
1. Convert cbdr->ring_lock from a spinlock to a mutex
The lock was originally a spinlock in case NTMP operations might be
invoked from atomic context. After downstream support for all NTMP
tables, no such usage has materialized. A mutex lock is now required
because the driver now needs to reclaim used BDs and release associated
DMA memory within the lock's context, while dma_free_coherent() might
sleep.
The hardware write-back overwrites the addr and len fields of the BD,
so the driver cannot rely on the hardware BD to free the associated DMA
memory. The driver now maintains a software shadow BD storing the DMA
buffer pointer, DMA address, and size. And netc_xmit_ntmp_cmd() only
reclaims older BDs when the number of used BDs reaches
NETC_CBDR_CLEAN_WORK (16). The software BD enables correct DMA memory
release. With this, struct ntmp_dma_buf and ntmp_free_data_mem() are no
longer needed and are removed.
3. Require callers to hold ring_lock across netc_xmit_ntmp_cmd()
netc_xmit_ntmp_cmd() releases the ring_lock before the caller finishes
consuming the response. At this point, if a concurrent thread submits
a new command, it may trigger ntmp_clean_cbdr() and free the DMA buffer
while it is still in use. Move ring_lock ownership to the caller to
ensure the response buffer cannot be reclaimed prematurely. So the
helpers ntmp_select_and_lock_cbdr() and ntmp_unlock_cbdr() are added.
These changes eliminate the DMA use-after-free condition and ensure safe
and consistent BD reclamation and DMA buffer lifecycle management.
net: enetc: correct the command BD ring consumer index
The command BD ring cousumer index register has the consumer index as
the lower 10 bits, and the bit 31 is SBE, which indicates whether a
system bus error occurred during execution of the CBD command. So if a
system bus error occurs, reading the register will get the SBE bit set.
However, the current implementation directly uses the register value as
the consumer index without masking it. Therefore, if a system bus error
occurs, an incorrect consumer index will be obtained, causing errors in
the processing of the command BD ring. Thus, we need to mask out the
other bits to obtain the correct consumer index.
In addition, this patch adds a check for the SBE bit after the polling
loop and returns an error if the bit is set.
net: pse-pd: fix out-of-bounds bitmap access in pse_isr() on 32-bit
In pse_isr(), notifs_mask was declared as a single unsigned long on the
stack (32 bits on 32-bit architectures). For PSE controllers with more
than 32 ports, this causes two problems:
- map_event callbacks could wrote bit positions >= 32 via
*notifs_mask |= BIT(i), which is undefined behaviour on a 32-bit
unsigned long and corrupts adjacent stack memory.
- for_each_set_bit(i, ¬ifs_mask, pcdev->nr_lines) treats
¬ifs_mask as a multi-word bitmap and reads beyond the single
unsigned long when nr_lines > BITS_PER_LONG.
Fix this by moving notifs_mask out of the stack and into struct pse_irq
as a dynamically allocated bitmap. It is sized with
BITS_TO_LONGS(pcdev->nr_lines) words in devm_pse_irq_helper(), so it
is always wide enough regardless of the host word size.
[Jakub]: No upstream driver currently supports >=32 ports.
Merge tag 'v7.1-rc1-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Fix integer underflow in encrypted read
- Four debug patches, adding a few tracepoints
- Minor update to MAINTAINERS file (preferred server URL for cifs)
- Remove the BUG_ON() calls in d_mark_tmpfile_name
* tag 'v7.1-rc1-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
MAINTAINERS: change git.samba.org to https
smb: client: fix integer underflow in receive_encrypted_read()
smb: client: add tracepoints for deferred handle caching
smb: client: add oplock level to smb3_open_done tracepoint
smb: client: add tracepoint for local lock conflicts
smb: client: add tracepoints for lock operations
vfs: get rid of BUG_ON() in d_mark_tmpfile_name()
net: dsa: remove redundant netdev_lock_ops() from conduit ethtool ops
DSA replaces the conduit (master) device's ethtool_ops with its own
wrappers that aggregate stats from both the conduit and DSA switch
ports. Taking the lock again inside the DSA wrappers causes a deadlock.
Stumbled upon this when booting qemu with fbnic and CONFIG_NET_DSA_LOOP=y
(which looks like some kind of testing device that auto-populates the ports
of eth0). `ethtool -i` is enough to deadlock. This means we have basically zero
coverage for DSA stuff with real ops locked devs.
Remove the redundant netdev_lock_ops()/netdev_unlock_ops() calls from
the DSA conduit ethtool wrappers.
Fixes: 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260414231035.1917035-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/sched: taprio: fix use-after-free in advance_sched() on schedule switch
In advance_sched(), when should_change_schedules() returns true,
switch_schedules() is called to promote the admin schedule to oper.
switch_schedules() queues the old oper schedule for RCU freeing via
call_rcu(), but 'next' still points into an entry of the old oper
schedule. The subsequent 'next->end_time = end_time' and
rcu_assign_pointer(q->current_entry, next) are use-after-free.
Fix this by selecting 'next' from the new oper schedule immediately
after switch_schedules(), and using its pre-calculated end_time.
setup_first_end_time() sets the first entry's end_time to
base_time + interval when the schedule is installed, so the value
is already correct.
The deleted 'end_time = sched_base_time(admin)' assignment was also
harmful independently: it would overwrite the new first entry's
pre-calculated end_time with just base_time.
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Reported-by: Junxi Qian <qjx1298677004@gmail.com> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mdio: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP
The PIC64-HPSC/HX MDIO interface is only present on Microchip
PIC64-HPSC/HX SoCs. Hence add a dependency on ARCH_MICROCHIP, to
prevent asking the user about this driver when configuring a kernel
without Microchip SoC support.
Lorenzo Bianconi [Tue, 14 Apr 2026 14:08:52 +0000 (16:08 +0200)]
net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
In order to properly enable flowtable hw offloading, poll
REG_PPE_FLOW_CFG register in airoha_ppe_offload_setup routine and
wait for NPU PPE configuration triggered by ppe_init callback to complete
before running airoha_ppe_hw_init().
Dave Airlie [Thu, 16 Apr 2026 21:32:30 +0000 (07:32 +1000)]
Merge tag 'topic/pipe-reorder-2026-04-15' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915/display: change pipe allocation order for discrete platforms
This is a topic pull request for changing the pipe allocation order for
discrete platforms from the usual A,B,C,D to A,C,B,D. The goal is to
help pipe joiner configurations that reserve the adjacent pipe as the
secondary pipe without the user space knowing. More details in the
relevant commit message. The CRTC iteration is also changed to remain in
pipe order.
Jiri Olsa [Thu, 16 Apr 2026 10:00:34 +0000 (12:00 +0200)]
libbpf: Prevent double close and leak of btf objects
Sashiko found possible double close of btf object fd [1],
which happens when strdup in load_module_btfs fails at which
point the obj->btf_module_cnt is already incremented.
The error path close btf fd and so does later cleanup code in
bpf_object_post_load_cleanup function.
Also libbpf_ensure_mem failure leaves btf object not assigned
and it's leaked.
Replacing the err_out label with break to make the error path
less confusing as suggested by Alan.
Incrementing obj->btf_module_cnt only if there's no failure
and releasing btf object in error path.
====================
bpf: allow UTF-8 literals in bpf_bprintf_prepare()
bpf_bprintf_prepare() currently rejects any non-ASCII byte in format
strings, so helpers such as bpf_trace_printk() fail to emit UTF-8
literal text even when those bytes are not part of a format specifier.
Keep plain text permissive while continuing to parse '%' sequences as
ASCII-only. Patch 1 updates snprintf_negative() at the same time so the
selftests stay consistent during bisection. Patch 2 then extends
trace_printk coverage for both the valid UTF-8 literal case and the
invalid non-ASCII-after-'%' case.
Changes in v3:
- drop Suggested-by trailers and move review credit into this changelog
- update test_snprintf_negative() in patch 1/2 so plain non-ASCII text is
accepted while non-ASCII after '%' is still rejected, keeping
./test_progs -t snprintf aligned with the new behavior.
- clarify the trace_printk negative case with an explicit invalid format
string and comment
- address Paul Chaignon's review feedback and keep the negative coverage
requested earlier by Alan Maguire
Changes in v2:
- split the core change and selftest updates into two patches
- drop unnecessary isspace()/ispunct() casts
- add comments to clarify plain-text vs format-specifier handling
- add a negative selftest for non-ASCII bytes inside '%' sequences
Testing:
- Reproduced on x86_64 without the core fix: ASCII trace output works,
while UTF-8 literal text in bpf_trace_printk() is rejected and
produces no trace output
- Verified with tools/testing/selftests/bpf: ./test_progs -t trace_printk
- Verified with tools/testing/selftests/bpf: ./test_progs -t snprintf
====================
Extend trace_printk coverage to verify that UTF-8 literal text is
emitted successfully and that '%' parsing still rejects non-ASCII
bytes once format parsing starts.
Use an explicitly invalid format string for the negative case so the
ASCII-only parser expectation is visible from the test code itself.
bpf: allow UTF-8 literals in bpf_bprintf_prepare()
bpf_bprintf_prepare() only needs ASCII parsing for conversion
specifiers. Plain text can safely carry bytes >= 0x80, so allow
UTF-8 literals outside '%' sequences while keeping ASCII control
bytes rejected and format specifiers ASCII-only.
This keeps existing parsing rules for format directives unchanged,
while allowing helpers such as bpf_trace_printk() to emit UTF-8
literal text.
Update test_snprintf_negative() in the same commit so selftests keep
matching the new plain-text vs format-specifier split during bisection.
====================
bpf: Fix NULL deref when storing scalar into kptr slot
map_kptr_match_type() accesses reg->btf before confirming the register
is PTR_TO_BTF_ID. A scalar store into a kptr slot has no btf, causing
a NULL pointer dereference. Guard base_type() first.
bpf: Fix NULL deref in map_kptr_match_type for scalar regs
Commit ab6c637ad027 ("bpf: Fix a bpf_kptr_xchg() issue with local
kptr") refactored map_kptr_match_type() to branch on btf_is_kernel()
before checking base_type(). A scalar register stored into a kptr
slot has no btf, so the btf_is_kernel(reg->btf) call dereferences
NULL.
Move the base_type() != PTR_TO_BTF_ID guard before any reg->btf
access.
Fixes: ab6c637ad027 ("bpf: Fix a bpf_kptr_xchg() issue with local kptr") Reported-by: Hiker Cl <clhiker365@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221372 Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260416-kptr_crash-v1-1-5589356584b4@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tao Jiang [Wed, 15 Apr 2026 17:27:15 +0000 (01:27 +0800)]
nvme-pci: add quirk for Memblaze Pblaze5 (0x1c5f:0x0555)
The Memblaze Pblaze5 NVMe device (PCI ID 0x1c5f:0x0555)
is detected as a controller on recent kernels (tested on 5.15.85
and 6.8.4), but no namespace is exposed.
Tools like lsblk and fdisk do not report any block device.
dmesg shows:
nvme nvme0: missing or invalid SUBNQN field.
The device works correctly on older kernels (e.g. 4.19), suggesting
a compatibility issue with newer namespace handling.
This indicates the device does not properly support the
Namespace Descriptor List feature.
Applying NVME_QUIRK_NO_NS_DESC_LIST allows the namespace to be
discovered correctly.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Tao Jiang <tanroame.kyle@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
John Garry [Wed, 15 Apr 2026 15:53:58 +0000 (15:53 +0000)]
nvme-multipath: put module reference when delayed removal work is canceled
The delayed disk removal work is canceled when a NS (re)appears. However,
we do not put the module reference grabbed in nvme_mpath_remove_disk(), so
fix that.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
Daniel Wagner [Wed, 8 Apr 2026 16:19:56 +0000 (18:19 +0200)]
nvme: expose TLS mode
It is not possible to determine the active TLS mode from the
presence or absence of sysfs attributes like tls_key,
tls_configured_key, or dhchap_secret.
With the introduction of the concat mode and optional DH-CHAP
authentication, different configurations can result in identical
sysfs state. This makes user space detection unreliable.
Expose the TLS mode explicitly to allow user space to
unambiguously identify the active configuration and avoid
fragile heuristics in nvme-cli.
Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
nvme-apple: drop invalid put of admin queue reference count
Commit 03b3bcd319b3 ("nvme: fix admin request_queue lifetime") moved the
admin queue reference ->put call into nvme_free_ctrl() - a controller
device release callback performed for every nvme driver doing
nvme_init_ctrl().
nvme-apple sets refcount of the admin queue to 1 at allocation during the
probe function and then puts it twice now:
Note that there is a commit 941f7298c70c ("nvme-apple: remove an extra
queue reference") which intended to drop taking an extra admin queue
reference. Looks like at that moment it accidentally fixed a refcount
leak, which existed since the driver's introduction. There were two ->get
calls at driver's probe function and a single ->put inside
apple_nvme_free_ctrl().
However now after commit 03b3bcd319b3 ("nvme: fix admin request_queue
lifetime") the refcount is imbalanced again. Fix it by removing extra
->put call from apple_nvme_free_ctrl(). anv->dev and ctrl->dev point to
the same device, so use ctrl->dev directly for simplification. Compile
tested only.
Found by Linux Verification Center (linuxtesting.org).
In the declaration of the structure "core_quirks[]", in the comment
referred to the devices "Kioxia CD6-V Series / HPE PE8030", the
parameter "default_ps_max_latency_us" is reported in a wrong way:
nvme_core.default_ps_max_latency=0
The correct form is, instead:
nvme_core.default_ps_max_latency_us=0
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Flavio Suligoi <f.suligoi@asem.it> Signed-off-by: Keith Busch <kbusch@kernel.org>
nvmet: avoid recursive nvmet-wq flush in nvmet_ctrl_free
nvmet_tcp_release_queue_work() runs on nvmet-wq and can drop the
final controller reference through nvmet_cq_put(). If that triggers
nvmet_ctrl_free(), the teardown path flushes ctrl->async_event_work on
the same nvmet-wq.
Previously Scheduled by :-
nvmet_add_async_event
queue_work(nvmet_wq, &ctrl->async_event_work);
This trips lockdep with a possible recursive locking warning.
[ 5223.015876] run blktests nvme/003 at 2026-04-07 20:53:55
[ 5223.061801] loop0: detected capacity change from 0 to 2097152
[ 5223.072206] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 5223.088368] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 5223.126086] nvmet: Created discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 5223.128453] nvme nvme1: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 5233.199447] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 5233.227718] ============================================
[ 5233.231283] WARNING: possible recursive locking detected
[ 5233.234696] 7.0.0-rc3nvme+ #20 Tainted: G O N
[ 5233.238434] --------------------------------------------
[ 5233.241852] kworker/u192:6/2413 is trying to acquire lock:
[ 5233.245429] ffff888111632548 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
[ 5233.251438]
but task is already holding lock:
[ 5233.255254] ffff888111632548 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x5cc/0x6e0
[ 5233.261125]
other info that might help us debug this:
[ 5233.265333] Possible unsafe locking scenario:
There is also no need to flush async_event_work from controller
teardown. The admin queue teardown already fails outstanding AER
requests before the final controller put :-
The controller has already been removed from the subsystem list before
nvmet_ctrl_free() quiesces outstanding work.
Replace flush_work() with cancel_work_sync() so a pending
async_event_work item is canceled and a running instance is waited on
without recursing into the same workqueue.
Fixes: 06406d81a2d7 ("nvmet: cancel fatal error and flush async work before free controller") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
extract-cert: Wrap key_pass with '#ifdef USE_PKCS11_ENGINE'
A recent strengthening of -Wunused-but-set-variable (enabled with -Wall)
in clang under a new subwarning, -Wunused-but-set-global, points out an
unused static global variable in certs/extract-cert.c:
certs/extract-cert.c:46:20: error: variable 'key_pass' set but not used [-Werror,-Wunused-but-set-global]
46 | static const char *key_pass;
| ^
After commit 558bdc45dfb2 ("sign-file,extract-cert: use pkcs11 provider
for OPENSSL MAJOR >= 3"), key_pass is only used with the OpenSSL engine
API, not the new provider API. Wrap key_pass's declaration and
assignment with '#ifdef USE_PKCS11_ENGINE' so that it is only included
with its use to clear up the warning. While this is a little uglier than
just marking key_pass with the unused attribute, this will make it
easier to clean up all code associated with the use of the engine API if
it were ever removed in the future. While in the area, use a tab for
the key_pass assignment line to match the rest of the file.
Mark Brown [Thu, 16 Apr 2026 19:03:59 +0000 (20:03 +0100)]
selftests: Fix runner.sh for non-bash shells
Commit 2964f6b816c2 ("selftests: Use ktap helpers for runner.sh") added a
number of bashisms and updated the interpreter specified for the script to
be /bin/bash to reflect this. Unfortunately this does not actually achieve
anything in production since the main way runner.sh is invoked is from the
top level run_kselftest.sh which sources it rather than running it as a
separate script and specifies the shell as /bin/sh. This means that on
systems where /bin/sh is not bash (such as Debian where /bin/sh defaults to
being dash) we see failures:
4. In runner.sh run_one(), get the return value and use ktap helpers for
all pass/fail reporting. This allows counting pass/fail numbers in the
main process.
which uses a bash array to track all the subtests being run. Convert this
to use a simple flat variable instead.
Mark Brown [Thu, 16 Apr 2026 19:03:58 +0000 (20:03 +0100)]
selftests: Fix runner.sh busybox support
Commit 2964f6b816c2 ("selftests: Use ktap helpers for runner.sh") added an
import of ktap_helper.sh to runner.sh in order to standardise on these for
output formatting. Rather than build on the existing requirement for the
user to supply BASE_DIR to find the helpers it uses some magic which
features a use of "readlink -e". Unfortunately the -e option is a GNU
extension and is not available in at least busybox, meaning that runner.sh
starts failing:
./run_kselftest.sh: 5: ./kselftest/runner.sh: Bad substitution
./run_kselftest.sh: 5: .: cannot open ./ktap_helpers.sh: No such file
Fix this by using the already required BASE_DIR to locate the helper
library.
Mark Brown [Thu, 16 Apr 2026 13:19:24 +0000 (14:19 +0100)]
selftests: Deescalate error reporting
Commit 7e47389142b8 ("selftests: Preserve subtarget failures in
all/install") updated the propagation of errors from indivdual kselftest
targets to be similar to that seen with FORCE_TARGETS. While it would
be really nice to be in a position to do this currently it is premature
to do this as the default behaviour.
At present we default to trying to build all selftests but a combination
of code quality issues and build dependencies mean that it is almost
certain that at least one of them will fail to build (for example,
several depend on clang so don't work in a GCC container) and a top
level failure in the kselftest build reported. Further, the resulting
failures mean that the install target does not run at all so any build
problem is escallated to a complete failure to produce a kselftest
tarball so CI systems that run into issues loose all selftests coverage.
This has been causing disruption to a range of CI systems including
KernelCI, mine and Arm's internal one.
Revert the commit, users who need this behaviour should be able to use
FORCE_TARGETS for the time being. At present users that do this (such
as linux-next) are most likely building a subset of targets known to
succeed in their environments.
Shuicheng Lin [Wed, 15 Apr 2026 22:54:28 +0000 (22:54 +0000)]
drm/xe/eustall: Fix drm_dev_put called before stream disable in close
In xe_eu_stall_stream_close(), drm_dev_put() is called before the
stream is disabled and its resources are freed. If this drops the
last reference, the device structures could be freed while the
subsequent cleanup code still accesses them, leading to a
use-after-free.
Fix this by moving drm_dev_put() after all device accesses are
complete. This matches the ordering in xe_oa_release().
Fixes: 9a0b11d4cf3b ("drm/xe/eustall: Add support to init, enable and disable EU stall sampling") Cc: Harish Chegondi <harish.chegondi@intel.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Link: https://patch.msgid.link/20260415225428.3399934-1-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Philipp Stanner [Wed, 15 Apr 2026 14:49:56 +0000 (16:49 +0200)]
drm/sched: Make drm_sched_entity_kill() a public function
Some drivers do not care on teardown whether the last jobs pending in an
entity are actually executed before teardown completed. For such
scenarios, drm_sched_entity_flush() is not the ideal function since it's
intended to wait for jobs to complete.
Make drm_sched_entity_kill() public for that use-case and update the
documentation.
Thomas Gleixner [Tue, 14 Apr 2026 20:55:01 +0000 (22:55 +0200)]
clockevents: Add missing resets of the next_event_forced flag
The prevention mechanism against timer interrupt starvation missed to reset
the next_event_forced flag in a couple of places:
- When the clock event state changes. That can cause the flag to be
stale over a shutdown/startup sequence
- When a non-forced event is armed, which then prevents rearming before
that event. If that event is far out in the future this will cause
missed timer interrupts.
- In the suspend wakeup handler.
That led to stalls which have been reported by several people.
Add the missing resets, which fixes the problems for the reporters.
Jonathan Cavitt [Mon, 2 Feb 2026 21:47:10 +0000 (21:47 +0000)]
drm/colorop: Check if getting curve_1d_type default succeeds
Static analysis issue:
In all other uses of the function drm_object_property_get_default_value,
the return value of the function is checked before the output is saved to
the relevant object parameter. Though likely unnecessary given the
execution path involved, keep the behavior consistent across uses and only
set colorop_state->curve_1d_type in __drm_colorop_state_reset if
drm_object_property_get_default_value succeeds.
Jonathan Cavitt [Fri, 30 Jan 2026 19:19:54 +0000 (19:19 +0000)]
drm/gpuvm: Do not prepare NULL objects
Statis analysis issue:
drm_gpuvm_prepare_range issues an exec_object_prepare call to all
drm_gem_objects mapped between addr and addr + range. However, it is
possible (albeit very unlikely) that the objects found through
drm_gpuvm_for_each_va_range (as connected to va->gem) are NULL, as seen
in other functions such as drm_gpuva_link and drm_gpuva_unlink_defer.
Do not prepare NULL objects.
Fixes: 50c1a36f594b ("drm/gpuvm: track/lock/validate external/evicted objects") Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Link: https://patch.msgid.link/20260130191953.61718-2-jonathan.cavitt@intel.com
sched_ext: scx_qmap: replace FIFO queue maps with arena-backed lists
Arena simplifies verification and allows more natural programming.
Convert scx_qmap to arena as preparation for further sub-sched work.
Replace the five BPF_MAP_TYPE_QUEUE maps with doubly-linked lists in
arena, threaded through task_ctx. Each queue is a struct qmap_fifo with
head/tail pointers and its own per-queue bpf_res_spin_lock.
qmap_dequeue() now properly removes tasks from the queue instead of
leaving stale entries for dispatch to skip.
v2:
- Remove duplicate QMAP_TOUCH_ARENA() in qmap_dump_task (Andrea).
- Update file-level description for arena-backed lists (Andrea).
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
sched_ext: scx_qmap: move task_ctx into a BPF arena slab
Arena simplifies verification and allows more natural programming.
Convert scx_qmap to arena as preparation for further sub-sched work.
Allocate per-task context from an arena slab instead of storing it
directly in task_storage. task_ctx_stor now holds an arena pointer to
the task's slab entry. Free entries form a singly-linked list protected
by bpf_res_spin_lock; slab exhaustion triggers scx_bpf_error().
The slab size is configurable via the new -N option (default 16384).
Also add bpf_res_spin_lock/unlock declarations to common.bpf.h.
Scheduling logic unchanged.
v2: Add task_ctx_t typedef for struct task_ctx __arena (Emil).
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
sched_ext: scx_qmap: move globals and cpu_ctx into a BPF arena map
Arena simplifies verification and allows more natural programming.
Convert scx_qmap to arena as preparation for further sub-sched work.
Move scheduler state from BSS globals and a percpu array map
into a single BPF arena map. A shared struct qmap_arena is declared as
an __arena global so BPF accesses it directly and userspace reaches it
through skel->arena->qa.
Scheduling logic unchanged; only memory backing changes.
v2: Drop "mutable" from comments.
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Ville Syrjälä [Wed, 15 Apr 2026 21:04:11 +0000 (00:04 +0300)]
drm/i915/reset: Disable execlist per-engine reset for display reset tests
The display reset only happens from the full reset path. We must
therefore force execlist submission to always take the full reset
path and not the per-engine reset path. Currently the display
reset tests are in fact not testing display resets at all on
platforms using execlist submission. Ring submission and GuC
submission always take the full path anyway.
Also disable the engine reset inside __intel_gt_set_wedged() so
that we simulate the intel_gt_gpu_reset_clobbers_display() behavior
as closely as possible also when taking the full wedge path.
The slight race between the separate intel_display_reset_test()
calls in the overall reset path is harmless. kms_busy will keep
the modparam fixed during the test, and even if someone were to
fiddle with the modparam manually nothing bad should happen if
the calls return different values.
Expose the number of display resets performed in a new
"display_reset_count" debugfs file. kms_busy can use this to
confirm that the kernel actually took the full display reset path.
v2: Give the file an "intel_" namespace (Jani)
Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Testcase: igt/kms_busy/*-with-reset Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260415210411.24750-7-ville.syrjala@linux.intel.com
Stephen Boyd [Thu, 16 Apr 2026 17:07:47 +0000 (10:07 -0700)]
Merge branches 'clk-fixes', 'clk-renesas', 'clk-rpi', 'clk-eswin' and 'clk-mediatek' into clk-next
- ESWIN eic700 SoC clk support
- Econet EN751221 SoC clock/reset support
* clk-fixes:
clk: spacemit: ccu_mix: fix inverted condition in ccu_mix_trigger_fc()
clk: microchip: mpfs-ccc: fix out of bounds access during output registration
clk: qcom: dispcc-sm8450: use RCG2 ops for DPTX1 AUX clock source
* clk-renesas:
clk: renesas: Add support for RZ/G3L SoC
dt-bindings: clock: renesas,rzg2l-cpg: Document RZ/G3L SoC
clk: renesas: rzg2l: Re-enable critical module clocks during resume
clk: renesas: rzg2l: Add rzg2l_mod_clock_init_mstop_helper()
clk: renesas: rzg2l: Add helper for mod clock enable/disable
clk: renesas: r9a0{7g04[34],8g045}: Add critical reset entries
clk: renesas: rzg2l: Add support for critical resets
clk: renesas: r9a09g056: Remove entries for WDT{0,2,3}
clk: renesas: r9a06g032: Enable watchdog reset sources
clk: renesas: cpg-mssr: Use struct_size() helper
clk: renesas: r9a09g047: Add PCIe clocks and reset
clk: renesas: r9a09g057: Add PCIe clocks and reset
clk: renesas: r9a09g056: Add PCIe clocks and reset
clk: renesas: r9a09g047: Add entries for the RSPIs
clk: renesas: r9a09g056: Add clock and reset entries for RTC
clk: renesas: r9a09g057: Remove entries for WDT{0,2,3}
clk: renesas: r9a09g056: Fix ordering of module clocks array
clk: renesas: r9a09g057: Fix ordering of module clocks array
Ville Syrjälä [Wed, 15 Apr 2026 21:04:09 +0000 (00:04 +0300)]
drm/xe/display: Add init_clock_gating.h stubs
Add static inline stubs for init_clock_gating.h functions
so that we don't need ifdefs in the actual code. We already
have one in intel_display_power.c, and now I need to bring
over intel_display_reset.c.
Ville Syrjälä [Wed, 15 Apr 2026 21:04:08 +0000 (00:04 +0300)]
drm/i915/reset: Move pending_fb_pin handling to i915
Only i915 uses the pending_fb_pin counter to potentially whack
the GPU harder if the display gets nuked during a GPU reset.
Move the atomic counter into the i915 specific bits of code, so
that we don't need to worry about on the display side.
For some reason the overlay code kept the pending_fb_pin counter
elevated for longer than just for the pin, but from now on it'll
just cover the actual pinning part.
Ville Syrjälä [Wed, 15 Apr 2026 21:04:07 +0000 (00:04 +0300)]
drm/i915/reset: Reorganize display reset code
Stop returning the "is there a display?" status from
intel_display_reset_prepare(). I plan to move the pending_fb_pin
into the i915 code, so I need to make that determination already
before intel_display_reset_prepare() is called. Add a new
intel_display_reset_supported() function for that.
Felix Gu [Thu, 16 Apr 2026 13:37:23 +0000 (21:37 +0800)]
accel/amdxdna: Fix memory leak in amdxdna_iommu_alloc()
In amdxdna_iommu_alloc(), if iommu_map() fails after successfully
allocating both iova and cpu_addr, the code jumps to free_iova
which only frees the iova, leaking the allocated pages.
Max Zhen [Wed, 15 Apr 2026 17:11:39 +0000 (10:11 -0700)]
accel/amdxdna: Add hardware scheduler time quantum support
Add support for configuring the hardware scheduler time quantum to
improve fairness across concurrent contexts.
The scheduler enforces a fixed time slice per context, preventing
long-running workloads from monopolizing the device and allowing
other contexts to make forward progress.
The default time quantum is 30ms and can be configured via the
time_quantum_ms module parameter.
Shuicheng Lin [Tue, 14 Apr 2026 22:54:33 +0000 (22:54 +0000)]
drm/xe: Fix type and parameter name mismatches in kernel-doc references
Fix kernel-doc references that point to wrong type or parameter names:
- xe_guc_capture_types.h: register_data_type ->
capture_register_data_type to match actual enum name
- xe_oa_types.h: enum @drm_xe_oa_format_type ->
enum drm_xe_oa_format_type (spurious '@' before type name)
- xe_pt_walk.h: @sizes -> @shifts to match actual struct member,
@start -> @addr to match actual parameter name,
page.table. -> page table. (typo)
Shuicheng Lin [Tue, 14 Apr 2026 22:54:32 +0000 (22:54 +0000)]
drm/xe: Fix kernel-doc comment syntax issues in header files
Fix doc comment formatting that does not conform to kernel-doc syntax
rules:
- xe_guc_relay_types.h: Add missing space after '/**' in member doc
comment (/**@lock -> /** @lock)
- xe_oa_types.h: Fix struct documentation to use proper kernel-doc
format (/** @xe_oa_buffer: -> /** struct xe_oa_buffer -)
- xe_pagefault_types.h: Use dash instead of colon as separator in
struct doc (struct xe_pagefault_queue: -> struct xe_pagefault_queue -)
- xe_pt_walk.h: Use dash instead of colon as separator in function
docs (xe_pt_num_entries: -> xe_pt_num_entries -,
xe_pt_offset: -> xe_pt_offset -)
Shuicheng Lin [Tue, 14 Apr 2026 22:54:31 +0000 (22:54 +0000)]
drm/xe: Add missing '@' prefix to kernel-doc member tags
Add the required '@' prefix to member documentation tags that were
missing it. Without the prefix, kernel-doc does not associate the
comment with the struct member.
Shuicheng Lin [Tue, 14 Apr 2026 22:54:30 +0000 (22:54 +0000)]
drm/xe: Fix stale and mismatched kernel-doc member tags in header files
Fix kernel-doc member tags that do not match the actual struct member
names, which would cause warnings with scripts/kernel-doc or produce
incorrect documentation:
Merge tag 'v7.1-rc-part1-smbdirect-fixes' of git://git.samba.org/ksmbd
Pull smbdirect updates from Steve French:
"Move smbdirect server and client code to common directory:
- temporary use of smbdirect_all_c_files.c to allow micro steps
- factor out common functions into a smbdirect.ko.
- convert cifs.ko to use smbdirect.ko
- convert ksmbd.ko to use smbdirect.ko
- let smbdirect.ko use global workqueues
- move ib_client logic from ksmbd.ko into smbdirect.ko
- remove smbdirect_all_c_files.c hack again
- some locking and teardown related fixes on top"
* tag 'v7.1-rc-part1-smbdirect-fixes' of git://git.samba.org/ksmbd: (145 commits)
smb: smbdirect: let smbdirect_connection_deregister_mr_io unlock while waiting
smb: smbdirect: fix the logic in smbdirect_socket_destroy_sync() without an error
smb: smbdirect: fix copyright header of smbdirect.h
smb: smbdirect: change smbdirect_socket_parameters.{initiator_depth,responder_resources} to __u16
smb: smbdirect: remove unused SMBDIRECT_USE_INLINE_C_FILES logic
smb: server: no longer use smbdirect_socket_set_custom_workqueue()
smb: client: no longer use smbdirect_socket_set_custom_workqueue()
smb: smbdirect: introduce global workqueues
smb: smbdirect: prepare use of dedicated workqueues for different steps
smb: smbdirect: remove unused smbdirect_connection_mr_io_recovery_work()
smb: smbdirect: wrap rdma_disconnect() in rdma_[un]lock_handler()
smb: server: make use of smbdirect_netdev_rdma_capable_mode_type()
smb: smbdirect: introduce smbdirect_netdev_rdma_capable_mode_type()
smb: server: make use of smbdirect.ko
smb: server: remove unused ksmbd_transport_ops.prepare()
smb: server: make use of smbdirect_socket_{listen,accept}()
smb: server: only use public smbdirect functions
smb: server: make use of smbdirect_socket_create_accepting()/smbdirect_socket_release()
smb: server: make use of smbdirect_{socket_init_accepting,connection_wait_for_connected}()
smb: server: make use of smbdirect_connection_send_iter() and related functions
...
Merge tag 'livepatching-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching updates from Petr Mladek:
- Add two new selftests
* tag 'livepatching-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
selftests/livepatch: add test for module function patching
selftests: livepatch: test-ftrace: livepatch a traced function
Merge tag 'efi-next-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"Again not a busy cycle for EFI, just some minor tweaks and bug fixes:
- Enable boot graphics resource table (BGRT) on Xen/x86
- Correct a misguided assumption in the memory attributes table
sanity check
- Start tagging efi_mem_reserve()'d regions as MEMBLOCK_RSRV_KERN
- Some other minor fixes and cleanups"
* tag 'efi-next-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/capsule-loader: fix incorrect sizeof in phys array reallocation
efi: Tag memblock reservations of boot services regions as RSRV_KERN
memblock: Permit existing reserved regions to be marked RSRV_KERN
efi/memattr: Fix thinko in table size sanity check
efi: libstub: fix type of fdt 32 and 64bit variables
efi: Drop unused efi_range_is_wc() function
efi: Enable BGRT loading under Xen
efi: make efi_mem_type() and efi_mem_attributes() work on Xen PV
- Conversions to const struct class in support of class_create()
deprecation (Jori Koolstra)
- Improve selftest compiler compatibility by avoiding initializer on
variable-length array (Manish Honap)
- Define new uAPI for drivers supporting migration to advise user-
space of new initial data for reducing target startup latency.
Implemented for mlx5 vfio-pci variant driver (Yishai Hadas)
- Enable vfio selftests on aarch64, not just cross-compiles reporting
arm64 (Ted Logan)
- Update vfio selftest driver support to include additional DSA devices
(Yi Lai)
- Unconditionally include debugfs root pointer in vfio device struct,
avoiding a build failure seen in hisi_acc variant driver without
debugfs otherwise (Arnd Bergmann)
- Add support for the s390 ISM (Internal Shared Memory) device via a
new variant driver. The device is unique in the size of its BAR space
(256TiB) and lack of mmap support (Julian Ruess)
- Enforce that vfio-pci drivers implement a name in their ops structure
for use in sequestering SR-IOV VFs (Alex Williamson)
- Prune leftover group notifier code (Paolo Bonzini)
- Fix Xe vfio-pci variant driver to avoid migration support as a
dependency in the reset path and missing release call (Michał
Winiarski)
* tag 'vfio-v7.1-rc1' of https://github.com/awilliam/linux-vfio: (23 commits)
vfio/xe: Add a missing vfio_pci_core_release_dev()
vfio/xe: Reorganize the init to decouple migration from reset
vfio: remove dead notifier code
vfio/pci: Require vfio_device_ops.name
MAINTAINERS: add VFIO ISM PCI DRIVER section
vfio/ism: Implement vfio_pci driver for ISM devices
vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it
vfio: unhide vdev->debug_root
vfio/qat: add support for Intel QAT 420xx VFs
vfio: selftests: Support DMR and GNR-D DSA devices
vfio: selftests: Build tests on aarch64
vfio/mlx5: Add REINIT support to VFIO_MIG_GET_PRECOPY_INFO
vfio/mlx5: consider inflight SAVE during PRE_COPY
net/mlx5: Add IFC bits for migration state
vfio: Adapt drivers to use the core helper vfio_check_precopy_ioctl
vfio: Add support for VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2
vfio: Define uAPI for re-init initial bytes during the PRE_COPY phase
vfio: selftests: Fix VLA initialisation in vfio_pci_irq_set()
vfio: uapi: fix comment typo
vfio: mdev: replace mtty_dev->vd_class with a const struct class
...
Melissa Wen [Wed, 18 Mar 2026 16:27:11 +0000 (13:27 -0300)]
drm/drm_atomic: duplicate colorop states if plane color pipeline in use
For suspend/resume to work correctly, do for colorop state the same we
do for plane/crtc/connector states: duplicate the state of colorops in a
color pipeline if it's in use by a given plane when suspending and
restore cached colorop states when resuming. While at it, prevent
unused-variable warning when using for_each_new_colorop_in_stage here.
Fixes: 2afc3184f3b3 ("drm/plane: Add COLOR PIPELINE property") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Link: https://patch.msgid.link/20260318163629.300627-1-mwen@igalia.com Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Ville Syrjälä [Wed, 15 Apr 2026 21:04:05 +0000 (00:04 +0300)]
drm/i915: Clear i915->display when no longer valid
Don't leave a stale i915->display pointer hanging around after
the display driver has been torn down. Apparently the gt code
calls into the reset codepaths after this, and if the display
pointer is still around we may try to access freed memory.
The whole teardown sequence here seems rather suspect. Why is
display done first and then everything else via the managed
release? Who the heck knows. Someone really needs to dig into
this stuff and figure out the proper init/cleanup sequence for
both i915 (real and mock) and xe...
Cc: Jani Nikula <jani.nikula@intel.com> Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260415210411.24750-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tomas Glozar [Thu, 16 Apr 2026 11:59:42 +0000 (13:59 +0200)]
tracing/osnoise: Add option to align tlat threads
Add an option called TIMERLAT_ALIGN to osnoise/options, together with a
corresponding setting osnoise/timerlat_align_us.
This option sets the alignment of wakeup times between different
timerlat threads, similarly to cyclictest's -A/--aligned option. If
TIMERLAT_ALIGN is set, the first thread that reaches the first cycle
records its first wake-up time. Each following thread sets its first
wake-up time to a fixed offset from the recorded time, and increments
it by the same offset.
Example:
osnoise/timerlat_period is set to 1000, osnoise/timerlat_align_us is
set to 20. There are four threads, on CPUs 1 to 4.
- CPU 4 enters first cycle first. The current time is 20000us, so
the wake-up of the first cycle is set to 21000us. This time is recorded.
- CPU 2 enter first cycle next. It reads the recorded time, increments
it to 21020us, and uses this value as its own wake-up time for the first
cycle.
- CPU 3 enters first cycle next. It reads the recorded time, increments
it to 21040 us, and uses the value as its own wake-up time.
- CPU 1 proceeds analogically.
In each next cycle, the wake-up time (called "absolute period" in
timerlat code) is incremented by the (relative) period of 1000us. Thus,
the wake-ups in the following cycles (provided the times are reached and
not in the past) will be as follows:
CPU 1 CPU 2 CPU 3 CPU 4
21080us 21020us 21040us 21000us
22080us 22020us 22040us 22000us
... ... ... ...
Even if any cycle is skipped due to e.g. the first cycle calculation
happening later, the alignment stays in place.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://patch.msgid.link/20260416115942.544032-1-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Crystal Wood <crwood@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Borkmann [Thu, 16 Apr 2026 12:27:19 +0000 (14:27 +0200)]
bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
Fix an operator precedence issue in convert_bpf_ld_abs() where the
expression offset + ip_align % size evaluates as offset + (ip_align % size)
due to % having higher precedence than +. That latter evaluation does
not make any sense. The intended check is (offset + ip_align) % size == 0
to verify that the packet load offset is properly aligned for direct
access.
With NET_IP_ALIGN == 2, the bug causes the inline fast-path for direct
packet loads to almost never be taken on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
platforms. This forces nearly all cBPF BPF_LD_ABS packet loads through
the bpf_skb_load_helper slow path on the affected archs.
====================
emit ENDBR/BTI instructions for indirect
On architectures with CFI protection enabled that require landing pad
instructions at indirect jump targets, such as x86 with CET/IBT enabled
and arm64 with BTI enabled, kernel panics when an indirect jump lands on
a target without landing pad. Therefore, the JIT must emit landing pad
instructions for indirect jump targets.
The verifier already recognizes which instructions are indirect jump
targets during the verification phase. So we can store this information
in env->insn_aux_data and pass it to the JIT as new parameter, allowing
the JIT to consult env->insn_aux_data to determine which instructions are
indirect jump targets.
During JIT, constants blinding is performed. It rewrites the private copy
of instructions for the JITed program, but it does not adjust the global
env->insn_aux_data array. As a result, after constants blinding, the
instruction indexes used by JIT may no longer match the indexes in
env->insn_aux_data, so the JIT can not use env->insn_aux_data directly.
To avoid this mismatch, and given that all existing arch-specific JITs
already implement constants blinding with largely duplicated code, move
constants blinding from JIT to generic code.
v15:
- Rebase and target bpf tree
- Resotre subprog_start of the fake 'exit' subprog on failure
- Fix wrong function name used in comment
v14: https://lore.kernel.org/all/cover.1776062885.git.xukuohai@hotmail.com/
- Rebase
- Fix comment style
- Fix incorrect variable and function name used in commit message
v13: https://lore.kernel.org/bpf/20260411133847.1042658-1-xukuohai@huaweicloud.com
- Use vmalloc to allocate memory for insn_aux_data copies to match with vfree
- Do not free the copied memory of insn_aux_data when restoring from failure
- Code cleanup
v12: https://lore.kernel.org/bpf/20260403132811.753894-1-xukuohai@huaweicloud.com
- Restore env->insn_aux_data on JIT failure
- Fix incorrect error code sign (-EFAULT vs EFAULT)
- Fix incorrect prog used in the restore path
v11: https://lore.kernel.org/bpf/20260403090915.473493-1-xukuohai@huaweicloud.com
- Restore env->subprog_info after jit_subprogs() fails
- Clear prog->jit_requested and prog->blinding_requested on failure
- Use the actual env->insn_aux_data size in clear_insn_aux_data() on failure
v10: https://lore.kernel.org/bpf/20260324122052.342751-1-xukuohai@huaweicloud.com
- Fix the incorrect call_imm restore in jit_subprogs
- Define a dummy void version of bpf_jit_prog_release_other and
bpf_patch_insn_data when the corresponding config is not set
- Remove the unnecessary #ifdef in x86_64 JIT (Leon Hwang)
v9: https://lore.kernel.org/bpf/20260312170255.3427799-1-xukuohai@huaweicloud.com
- Make constant blinding available for classic bpf (Eduard)
- Clear prog->bpf_func, prog->jited ... on the error path of extra pass (Eduard)
- Fix spelling errors and remove unused parameter (Anton Protopopov)
v8: https://lore.kernel.org/bpf/20260309140044.2652538-1-xukuohai@huaweicloud.com
- Define void bpf_jit_blind_constants() function when CONFIG_BPF_JIT is not set
- Move indirect_target fixup for insn patching from bpf_jit_blind_constants()
to adjust_insn_aux_data()
v7: https://lore.kernel.org/bpf/20260307103949.2340104-1-xukuohai@huaweicloud.com
- Move constants blinding logic back to bpf/core.c
- Compute ip address before switch statement in x86 JIT
- Clear JIT state from error path on arm64 and loongarch
v6: https://lore.kernel.org/bpf/20260306102329.2056216-1-xukuohai@huaweicloud.com
- Move constants blinding from JIT to verifier
- Move call to bpf_prog_select_runtime from bpf_prog_load to verifier
v5: https://lore.kernel.org/bpf/20260302102726.1126019-1-xukuohai@huaweicloud.com
- Switch to pass env to JIT directly to get rid of copying private insn_aux_data for
each prog
v4: https://lore.kernel.org/all/20260114093914.2403982-1-xukuohai@huaweicloud.com
- Switch to the approach proposed by Eduard, using insn_aux_data to identify indirect
jump targets, and emit ENDBR on x86
v3: https://lore.kernel.org/bpf/20251227081033.240336-1-xukuohai@huaweicloud.com
- Get rid of unnecessary enum definition (Yonghong Song, Anton Protopopov)
v2: https://lore.kernel.org/bpf/20251223085447.139301-1-xukuohai@huaweicloud.com
- Exclude instruction arrays not used for indirect jumps (Anton Protopopov)
On CPUs that support CET/IBT, the indirect jump selftest triggers
a kernel panic because the indirect jump targets lack ENDBR
instructions.
To fix it, emit an ENDBR instruction to each indirect jump target. Since
the ENDBR instruction shifts the position of original jited instructions,
fix the instruction address calculation wherever the addresses are used.
For reference, below is a sample panic log.
Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/cet.c:133!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
Introduce helper bpf_insn_is_indirect_target to check whether a BPF
instruction is an indirect jump target.
Since the verifier knows which instructions are indirect jump targets,
add a new flag indirect_target to struct bpf_insn_aux_data to mark
them. The verifier sets this flag when verifying an indirect jump target
instruction, and the helper checks the flag to determine whether an
instruction is an indirect jump target.
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> #v8 Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> #v12 Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260416064341.151802-4-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>