]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
12 days agospi: microchip-core-qspi: control built-in cs manually
Conor Dooley [Thu, 30 Apr 2026 10:10:18 +0000 (11:10 +0100)] 
spi: microchip-core-qspi: control built-in cs manually

The coreQSPI IP supports only a single chip select, which is
automagically operated by the hardware - set low when the transmit
buffer first gets written to and set high when the number of bytes
written to the TOTALBYTES field of the FRAMES register have been sent on
the bus. Additional devices must use GPIOs for their chip selects.
It was reported to me that if there are two devices attached to this
QSPI controller that the in-built chip select is set low while linux
tries to access the device attached to the GPIO.

This went undetected as the boards that connected multiple devices to
the SPI controller all exclusively used GPIOs for chip selects, not
relying on the built-in chip select at all. It turns out that this was
because the built-in chip select, when controlled automagically, is set
low when active and high when inactive, thereby ruling out its use for
active-high devices or devices that need to transmit with the chip
select disabled.

Modify the driver so that it controls chip select directly, retaining
the behaviour for mem_ops of setting the chip select active for the
entire duration of the transfer in the exec_op callback. For regular
transfers, implement the set_cs callback for the core to use.

As part of this, the existing setup callback, mchp_coreqspi_setup_op(),
is removed. Modifying the CLKIDLE field is not safe to do during
operation when there are multiple devices, so this code is removed
entirely. Setting the MASTER and ENABLE fields is something that can be
done once at probe, it doesn't need to be re-run for each device.
Instead the new setup callback sets the built-in chip select to its
inactive state for active-low devices, as the reset value of the chip
select in software controlled mode is low.

Fixes: 8f9cf02c88528 ("spi: microchip-core-qspi: Add regular transfers")
Fixes: 8596124c4c1bc ("spi: microchip-core-qspi: Add support for microchip fpga qspi controllers")
CC: stable@vger.kernel.org
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260430-hamstring-busload-f941d0347b5e@spud
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agospi: imx: Three fixes for the i.MX SPI driver
Mark Brown [Mon, 4 May 2026 13:22:18 +0000 (22:22 +0900)] 
spi: imx: Three fixes for the i.MX SPI driver

John Madieu <john.madieu@gmail.com> says:

This series independent fixes found in the i.MX SPI driver.

These are:

1/3 fixes a precedence bug in spi_imx_dma_max_wml_find() that makes
    the watermark-finding logic effectively dead code. The function
    currently always returns wml = 1 because of how the !-operator
    binds to the modulo expression.

2/3 fixes a missing return on the package-1 failure path in
    spi_imx_dma_data_prepare(). The error path frees the
    dma_data array and the package-0 buffers, then falls through
    to "return 0" - the caller proceeds with a freed pointer.

3/3 makes spi_imx_setupxfer() propagate the prepare_transfer()
    return value. Currently a -EINVAL from mx51_ecspi_prepare_transfer
    (e.g. on a word_delay overflow) is silently swallowed and the
    transfer proceeds with a partially-configured controller.

12 days agospi: imx: Propagate prepare_transfer() error from spi_imx_setupxfer()
John Madieu [Fri, 1 May 2026 13:59:51 +0000 (13:59 +0000)] 
spi: imx: Propagate prepare_transfer() error from spi_imx_setupxfer()

spi_imx_setupxfer() calls the per-variant prepare_transfer()
callback and returns 0 unconditionally:

spi_imx->devtype_data->prepare_transfer(spi_imx, spi, t);

return 0;

mx51_ecspi_prepare_transfer() can return -EINVAL when the requested
word_delay does not fit in MX51_ECSPI_PERIOD_MASK. The error is
detected after a partial set of register writes (CTRL: BL, clkdiv,
SMC), so the controller is left in a partially-configured state and
the transfer is then submitted as if setup succeeded.

Propagate the return value. The other variants' prepare_transfer
callbacks all return 0, so this is a no-op for them.

Fixes: a3bb4e663df3 ("spi: imx: support word delay")
Signed-off-by: John Madieu <john.madieu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260501135951.2416527-4-john.madieu@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agospi: imx: Fix UAF on package-1 prepare failure in spi_imx_dma_data_prepare()
John Madieu [Fri, 1 May 2026 13:59:50 +0000 (13:59 +0000)] 
spi: imx: Fix UAF on package-1 prepare failure in spi_imx_dma_data_prepare()

When transfer->len exceeds MX51_ECSPI_CTRL_MAX_BURST and is not a
multiple of it, spi_imx_dma_data_prepare() splits the transfer into
two DMA packages. If preparing the second package fails:

ret = spi_imx_dma_tx_data_handle(spi_imx, &spi_imx->dma_data[1],
 transfer->tx_buf + spi_imx->dma_data[0].data_len,
 false);
if (ret) {
kfree(spi_imx->dma_data[0].dma_tx_buf);
kfree(spi_imx->dma_data[0].dma_rx_buf);
kfree(spi_imx->dma_data);
}
}

return 0;

the function frees the package-0 buffers and the dma_data array,
then falls through to `return 0`, telling the caller the prepare
succeeded. The caller then dereferences the freed dma_data array,
producing a use-after-free.

Return the error from the failure path so the caller takes its
existing failure branch.

Fixes: faa8e404ad8e ("spi: imx: support dynamic burst length for ECSPI DMA mode")
Signed-off-by: John Madieu <john.madieu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260501135951.2416527-3-john.madieu@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agospi: imx: Fix precedence bug in spi_imx_dma_max_wml_find()
John Madieu [Fri, 1 May 2026 13:59:49 +0000 (13:59 +0000)] 
spi: imx: Fix precedence bug in spi_imx_dma_max_wml_find()

The watermark search in spi_imx_dma_max_wml_find() reads:

if (!dma_data->dma_len % (i * bytes_per_word))
break;

The unary ! binds tighter than %, so this parses as:

if ((!dma_data->dma_len) % (i * bytes_per_word))
break;

!dma_data->dma_len is 0 or 1, and `0 % x == 0` for any x; `1 % x` is
0 unless x == 1. The condition is therefore false in every case
except dma_len != 0 with i * bytes_per_word == 1, i.e. i == 1 and
bytes_per_word == 1.

The loop almost always falls through to its end, leaving i == 0,
which the post-loop fallback rewrites to 1:

if (i == 0)
i = 1;

So spi_imx->wml ends up at 1 for essentially every DMA transfer,
defeating the entire purpose of the function. The DMA engine then
requests service after every single FIFO word instead of using
multi-word bursts, hurting throughput on every DMA-capable variant.

Add the missing parentheses so the modulo is computed first, then
negated:

if (!(dma_data->dma_len % (i * bytes_per_word)))
break;

Fixes: faa8e404ad8e ("spi: imx: support dynamic burst length for ECSPI DMA mode")
Signed-off-by: John Madieu <john.madieu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260501135951.2416527-2-john.madieu@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: fsl_xcvr: Fix event generation for cached controls
Cássio Gabriel [Tue, 28 Apr 2026 03:07:08 +0000 (00:07 -0300)] 
ASoC: fsl_xcvr: Fix event generation for cached controls

ALSA controls should return 1 from a put callback when the control
value changes. fsl_xcvr_capds_put() and fsl_xcvr_tx_cs_put() both
update cached control data but always return 0, so ALSA suppresses
change notifications for the Capabilities Data Structure and playback
IEC958 channel status controls.

Compare the old and new cached values before copying the new data,
and return whether the control value changed.

Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260428-asoc-fsl-xcvr-event-generation-v1-1-f21cf0812c4f@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: sdw_utils: avoid the SDCA companion function not supported failure
Derek Fang [Thu, 30 Apr 2026 12:10:43 +0000 (20:10 +0800)] 
ASoC: sdw_utils: avoid the SDCA companion function not supported failure

Treat the companion amp as generic AMP until full support for companion
amp is added.

Signed-off-by: Derek Fang <derek.fang@realtek.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260430121043.552241-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: amd: yc: Add HP OMEN Gaming Laptop 16-ap0xxx product line in quirk table
Tommaso Soncin [Wed, 29 Apr 2026 16:08:57 +0000 (18:08 +0200)] 
ASoC: amd: yc: Add HP OMEN Gaming Laptop 16-ap0xxx product line in quirk table

Add a DMI quirk for the HP OMEN Gaming Laptop 16-ap0xxx line fixing the
issue where the internal microphone was not detected.

Cc: stable@vger.kernel.org
Signed-off-by: Tommaso Soncin <soncintommaso@gmail.com>
Link: https://patch.msgid.link/20260429160858.538986-1-soncintommaso@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: cs35l56: Fix out-of-bounds in dev_err() in cs35l56_read_onchip_spkid()
Richard Fitzgerald [Thu, 30 Apr 2026 10:11:34 +0000 (11:11 +0100)] 
ASoC: cs35l56: Fix out-of-bounds in dev_err() in cs35l56_read_onchip_spkid()

Remove the incorrect use of onchip_spkid_gpios[i] in the dev_err() after
regmap_read() of CS35L56_GPIO_STATUS1 returns an error.

This dev_err() was incorrectly copy-pasted from one inside the for-loop,
where i was valid. The read of CS35L56_GPIO_STATUS1 isn't for a specific
GPIO register, so the use of onchip_spkid_gpios[i] in the error message is
both irrelevant and out-of-bounds here.

Fixes: 4d1e3e2c404d ("ASoC: cs35l56: Support for reading speaker ID from on-chip GPIOs")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260430101134.2655938-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: amd: yc: Add DMI quirk for MSI Bravo 15 C7VE
Bob Song [Thu, 30 Apr 2026 01:49:20 +0000 (09:49 +0800)] 
ASoC: amd: yc: Add DMI quirk for MSI Bravo 15 C7VE

The laptop requires a quirk ID to enable its internal microphone. Add
it to the DMI quirk table.

Reported-by: gannovera <gannovera@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218402
Signed-off-by: Bob Song <songxiebing@kylinos.cn>
Link: https://patch.msgid.link/20260430014920.141276-1-songxiebing@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: cs35l56: Fix hibernate write in runtime resume error path
Richard Fitzgerald [Wed, 29 Apr 2026 10:53:15 +0000 (11:53 +0100)] 
ASoC: cs35l56: Fix hibernate write in runtime resume error path

The error path of cs35l56_runtime_resume_common() should only write
the hibernation sequence if can_hibernate is true.

Something has already gone badly wrong if we ever reach the error
path. But triggering hibernate on hardware that does not support it
is likely to make the situation unrecoverable without a full reboot
because there might not be any hardware signal to exit hibernate.

Fixes: a47cf4dac7dc ("ASoC: cs35l56: Change hibernate sequence to use allow auto hibernate")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260429105315.2438298-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: spacemit: fix RX DMA params not set when TX is running
Troy Mitchell [Wed, 29 Apr 2026 09:00:50 +0000 (17:00 +0800)] 
ASoC: spacemit: fix RX DMA params not set when TX is running

When TX is already running (SSCR_SSE is set), the hw_params callback
returns early before setting up DMA parameters for the RX stream. This
prevents the capture path from configuring its DMA data properly.

Move the SSCR_SSE check after DMA parameter setup and format
constraints, so both TX and RX streams get their DMA configuration
regardless of whether the hardware is already enabled. The early return
now only skips the register writes that would disrupt an active stream.

Fixes: fce217449075 ("ASoC: spacemit: add i2s support for K1 SoC")
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Link: https://patch.msgid.link/20260429-k1-i2s-fix-v2-1-8d67835aaddc@linux.spacemit.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agodrm/fb-helper: Fix clipping when damage area spans a single scanline
Francesco Lavra [Tue, 10 Feb 2026 17:35:45 +0000 (18:35 +0100)] 
drm/fb-helper: Fix clipping when damage area spans a single scanline

When the damage area resulting from a dirty memory range spans a single
scanline, the width of the rectangle is calculated dynamically because it
may not coincide with the framebuffer width.
If the dirty range ends exactly at the end of the scanline, the `bit_end`
variable is incorrectly assigned a 0 value, which results in a bogus clip
rectangle where the x2 coordinate is 0. This prevents the dirty scanline
from being flushed to the hardware.
Change the calculation of the `bit_end` value to fix the x2 coordinate
value in the above edge case.

Fixes: ded74cafeea9 ("drm/fb-helper: Clip damage area horizontally")
Signed-off-by: Francesco Lavra <flavra@baylibre.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260210173545.733937-1-flavra@baylibre.com
12 days agodrm/qxl: Fix missing KMS poll cleanup
Myeonghun Pak [Fri, 24 Apr 2026 11:25:18 +0000 (20:25 +0900)] 
drm/qxl: Fix missing KMS poll cleanup

drm_kms_helper_poll_init() initializes the output polling work and
enables polling for the DRM device. qxl enables polling before calling
drm_dev_register(), but the drm_dev_register() failure path tears down
the modeset and device state without disabling the polling helper.

The remove path also unregisters and shuts down the DRM device without
first disabling the polling helper. Add matching drm_kms_helper_poll_fini()
calls in both paths so the delayed polling work is cancelled before qxl
tears down the associated modeset/device state.

Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 5ff91e442652 ("qxl: use drm helper hotplug support")
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260424112543.57819-1-mhun512@gmail.com
12 days agoASoC: codecs: ab8500: Remove suspicious code
Uwe Kleine-König (The Capable Hub) [Thu, 30 Apr 2026 15:45:24 +0000 (17:45 +0200)] 
ASoC: codecs: ab8500: Remove suspicious code

anc_configure() passed values from drvdata->anc_fir_values[],
drvdata->anc_iir_values[] and drvdata->sid_fir_values[] as register
offset to snd_soc_component_read(). The content of these arrays are user
controllable via the component controls "ANC FIR Coefficients", "ANC
IIR Coefficients" and "Sidetone FIR Coefficients" which I assume are
supposed to hold register values, not register offsets.

Without a datasheet for that component and given that before commit
a201aef1a88b ("ASoC: codecs: ab8500: Fix casting of private data") the
arrays overlapped with driver control structures and thus didn't work
properly since 2012, drop that functionality and let someone repair it
who has an actual need for it.

With the core functionally removed several code parts become essentially
unused and are removed, too.

Reported-by: Sashiko (gemini/gemini-3.1-pro-preview)
Link: https://sashiko.dev/#/patchset/20260428192255.2294705-2-u.kleine-koenig%40baylibre.com
Fixes: 679d7abdc754 ("ASoC: codecs: Add AB8500 codec-driver")
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260430154524.338912-2-u.kleine-koenig@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoALSA: pcmtest: Return -EFAULT on pattern read copy failure
Cássio Gabriel [Fri, 1 May 2026 17:45:14 +0000 (14:45 -0300)] 
ALSA: pcmtest: Return -EFAULT on pattern read copy failure

pattern_write() reports -EFAULT when copy_from_user() fails, but
pattern_read() converts copy_to_user() failures into a zero-length read.
That makes a userspace buffer fault look like EOF instead of reporting the
actual error.

Return -EFAULT from pattern_read() when copying the pattern data to
userspace fails, and update the file offset only after a successful copy.

Fixes: 315a3d57c64c ("ALSA: Implement the new Virtual PCM Test Driver")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260501-alsa-pcmtest-pattern-read-efault-v1-1-53e1e8c11dda@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 days agoi2c: stub: Reject I2C block transfers with invalid length
Weiming Shi [Tue, 14 Apr 2026 17:23:39 +0000 (01:23 +0800)] 
i2c: stub: Reject I2C block transfers with invalid length

The I2C_SMBUS_I2C_BLOCK_DATA case in stub_xfer() uses data->block[0]
as the transfer length. The existing check only clamps it to avoid
overrunning the chip->words[256] register array, but does not validate
it against I2C_SMBUS_BLOCK_MAX (32), which is the limit of the union
i2c_smbus_data.block buffer (34 bytes total). The driver is a
development/test tool (CONFIG_I2C_STUB=m, not built by default)
that must be loaded with a chip_addr= parameter.

A local user with access to /dev/i2c-* can issue an I2C_SMBUS ioctl
with I2C_SMBUS_I2C_BLOCK_DATA and data->block[0] > 32, causing
stub_xfer() to read or write past the end of the union
i2c_smbus_data.block buffer:

 BUG: KASAN: stack-out-of-bounds in stub_xfer (drivers/i2c/i2c-stub.c:223)
 Read of size 1 at addr ffff88800abcfd92 by task exploit/81
 Call Trace:
  <TASK>
  stub_xfer (drivers/i2c/i2c-stub.c:223)
  __i2c_smbus_xfer (drivers/i2c/i2c-core-smbus.c:593)
  i2c_smbus_xfer (drivers/i2c/i2c-core-smbus.c:536)
  i2cdev_ioctl_smbus (drivers/i2c/i2c-dev.c:391)
  i2cdev_ioctl (drivers/i2c/i2c-dev.c:478)
  __x64_sys_ioctl (fs/ioctl.c:583)
  do_syscall_64 (arch/x86/entry/syscall_64.c:94)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  </TASK>

The bug exists because i2c-stub implements .smbus_xfer directly,
bypassing the I2C_SMBUS_BLOCK_MAX validation in
i2c_smbus_xfer_emulated(). The I2C_SMBUS_BLOCK_DATA case in the same
function correctly validates against I2C_SMBUS_BLOCK_MAX, but the
I2C_SMBUS_I2C_BLOCK_DATA case does not.

Fix by rejecting transfers with data->block[0] == 0 or
data->block[0] > I2C_SMBUS_BLOCK_MAX with -EINVAL, consistent with
both the I2C_SMBUS_BLOCK_DATA case in the same function and the
I2C_SMBUS_I2C_BLOCK_DATA validation in i2c_smbus_xfer_emulated().

Fixes: 4710317891e4 ("i2c-stub: Implement I2C block support")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
12 days agox86/efi: Fix graceful fault handling after FPU softirq changes
Ivan Hu [Thu, 30 Apr 2026 07:41:07 +0000 (15:41 +0800)] 
x86/efi: Fix graceful fault handling after FPU softirq changes

Since commit d02198550423 ("x86/fpu: Improve crypto performance by
making kernel-mode FPU reliably usable in softirqs"), kernel_fpu_begin()
calls fpregs_lock() which uses local_bh_disable() instead of the
previous preempt_disable(). This sets SOFTIRQ_OFFSET in preempt_count
during the entire EFI runtime service call, causing in_interrupt() to
return true in normal task context.

The graceful page fault handler efi_crash_gracefully_on_page_fault()
uses in_interrupt() to bail out for faults in real interrupt context.
With SOFTIRQ_OFFSET now set, the handler always bails out, leaving EFI
firmware page faults unhandled. This escalates to die() which also sees
in_interrupt() as true and calls panic("Fatal exception in interrupt"),
resulting in a hard system freeze. On systems with buggy firmware that
triggers page faults during EFI runtime calls (e.g., accessing unmapped
memory in GetTime()), this causes an unrecoverable hang instead of the
expected graceful EFI_ABORTED recovery.

Fix by replacing in_interrupt() with !in_task(). This preserves the
original intent of bailing for interrupts or NMI faults, while no longer
falsely triggering from the FPU code path's local_bh_disable().

Fixes: d02198550423 ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ivan Hu <ivan.hu@canonical.com>
[ardb: Sashiko spotted that using 'in_hardirq() || in_nmi()' leaves a
       window where a softirq may be taken before fpregs_lock() is
       called, but after efi_rts_work.efi_rts_id has been assigned,
       and any page faults occurring in that window will then be
       misidentified as having been caused by the firmware. Instead,
       use !in_task(), which incorporates in_serving_softirq(). ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
13 days agoi2c: Compare the return value of gpiod_get_direction against GPIO_LINE_DIRECTION_OUT
Nikola Z. Ivanov [Wed, 15 Apr 2026 20:50:21 +0000 (23:50 +0300)] 
i2c: Compare the return value of gpiod_get_direction against GPIO_LINE_DIRECTION_OUT

The GPIO_LINE_DIRECTION_* definitions have just recently been exposed to
gpio consumers.h by breaking them out in a separate defs.h file.

Use this to validate the gpio direction instead of the hard-coded literal.

Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agoparisc: Fix IRQ leak in LASI driver
Hongling Zeng [Sun, 3 May 2026 04:17:44 +0000 (12:17 +0800)] 
parisc: Fix IRQ leak in LASI driver

When request_irq() succeeds but gsc_common_setup() fails later,
the IRQ is never released. Fix this by adding proper error handling
with goto labels to ensure resources are released in LIFO order.

Detected by Smatch:
  drivers/parisc/lasi.c:216 lasi_init_chip() warn: 'lasi->gsc_irq.irq'
from request_irq() not released on lines: 207.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604180957.4QdAIxP6-lkp@intel.com/
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
13 days agostaging: rtl8723bs: os_dep: avoid NULL pointer dereference in rtw_cbuf_alloc
Shyam Sunder Reddy Padira [Tue, 14 Apr 2026 07:13:06 +0000 (12:43 +0530)] 
staging: rtl8723bs: os_dep: avoid NULL pointer dereference in rtw_cbuf_alloc

The return value of kzalloc_flex() is used without
ensuring that the allocation succeeded, and the
pointer is dereferenced unconditionally.

Guard the access to the allocated structure to
avoid a potential NULL pointer dereference if the
allocation fails.

Fixes: 980cd426a257 ("staging: rtl8723bs: replace rtw_zmalloc() with kzalloc()")
Cc: stable <stable@kernel.org>
Signed-off-by: Shyam Sunder Reddy Padira <shyamsunderreddypadira@gmail.com>
Reviewed-by: Dan Carpenter <error27@gmail.com>
Link: https://patch.msgid.link/20260414071308.4781-2-shyamsunderreddypadira@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 days agoi2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl
Mingyu Wang [Mon, 27 Apr 2026 02:57:45 +0000 (10:57 +0800)] 
i2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl

While fuzzing with Syzkaller, a persistent `schedule_timeout: wrong
timeout value` warning was observed, accompanied by SMBus controller
state machine corruption.

The I2C_TIMEOUT ioctl accepts a user-provided timeout in multiples of
10 ms. The user argument is checked against INT_MAX, but it is
subsequently multiplied by 10 before being passed to msecs_to_jiffies().

A malicious user can pass a large value (e.g., 429496729) that passes
the `arg > INT_MAX` check but overflows when multiplied by 10. This
results in a truncated 32-bit unsigned value that bypasses the
internal `(int)m < 0` check in `msecs_to_jiffies()`.

The truncated value is then assigned to `client->adapter->timeout`
(a signed 32-bit int), which is reinterpreted as a negative number.
When passed to wait_for_completion_timeout(), this negative value
undergoes sign extension to a 64-bit unsigned long, triggering the
`schedule_timeout` warning and causing premature returns. This leaves
the SMBus state machine in an unrecoverable state, constituting a
local Denial of Service (DoS).

Fix this by bounding the user argument to `INT_MAX / 10`.

Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
[wsa: move the comment as well]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agostaging: vme_user: fix root device leak on init failure
Johan Hovold [Fri, 24 Apr 2026 10:49:10 +0000 (12:49 +0200)] 
staging: vme_user: fix root device leak on init failure

Make sure to deregister and free the root device in case module
initialisation fails.

Fixes: 658bcdae9c67 ("vme: Adding Fake VME driver")
Cc: stable@vger.kernel.org # 4.9
Cc: Martyn Welch <martyn@welchs.me.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260424104910.2619349-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 days agoi2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids
Niels Franke [Sat, 18 Apr 2026 05:37:19 +0000 (07:37 +0200)] 
i2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids

The ELAN0678 touchpad (04F3:3195) found in the Lenovo ThinkPad X13
exhibits excessive smoothing when the I2C bus runs at 400KHz, making
the touchpad feel sluggish when plugged into AC power. This is the
same issue previously fixed for ELAN06FA.

The device's ACPI table (Lenovo TP-R22) specifies 0x00061A80 (400KHz)
for the I2cSerialBusV2 descriptor. Forcing the bus to 100KHz eliminates
the sluggish behavior.

Signed-off-by: Niels Franke <nielsfranke@gmail.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: kept the sorting]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agofbdev: udlfb: add vm_ops to dlfb_ops_mmap to prevent use-after-free
Rajat Gupta [Mon, 4 May 2026 03:51:10 +0000 (20:51 -0700)] 
fbdev: udlfb: add vm_ops to dlfb_ops_mmap to prevent use-after-free

dlfb_ops_mmap() uses remap_pfn_range() to map vmalloc framebuffer pages
to userspace but sets no vm_ops on the VMA. This means the kernel cannot
track active mmaps. When dlfb_realloc_framebuffer() replaces the backing
buffer via FBIOPUT_VSCREENINFO, existing mmap PTEs are not invalidated.
On USB disconnect, dlfb_ops_destroy() calls vfree() on the old pages
while userspace PTEs still reference them, resulting in a use-after-free:
the process retains read/write access to freed kernel pages.

Add vm_operations_struct with open/close callbacks that maintain an
atomic mmap_count on struct dlfb_data. In dlfb_realloc_framebuffer(),
check mmap_count and return -EBUSY if the buffer is currently mapped,
preventing buffer replacement while userspace holds stale PTEs.

Tested with PoC using dummy_hcd + raw_gadget USB device emulation.

Signed-off-by: Rajat Gupta <rajgupt@qti.qualcomm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
13 days agodt-bindings: i2c: apple,i2c: Add t8122 compatible
Janne Grunau [Fri, 20 Mar 2026 12:23:24 +0000 (13:23 +0100)] 
dt-bindings: i2c: apple,i2c: Add t8122 compatible

The i2c block on the Apple silicon t8122 (M3) SoC is compatible with the
existing driver. Add "apple,t8122-i2c" as SoC specific compatible under
"apple,t8103-i2c" used by the deriver.

Signed-off-by: Janne Grunau <j@jannau.net>
Acked-by: Andi Shyti <andi.shyti@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agoiommu/amd: Fix precedence order in set_dte_passthrough()
Weinan Liu [Thu, 30 Apr 2026 23:28:51 +0000 (23:28 +0000)] 
iommu/amd: Fix precedence order in set_dte_passthrough()

Bitwise OR | operator has a higher precedence than the ternary ?:
operatior. It will be incorrectly evaluated as:

new->data[1] |= (FIELD_PREP(...) | dev_data->ats_enabled) ? DTE_FLAG_IOTLB : 0;

Wrap the conditional operation in parentheses to enforce the
correct evaluation order.

Fixes: 93eee2a49c1b ("iommu/amd: Refactor logic to program the host page table in DTE")
Signed-off-by: Weinan Liu <wnliu@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
13 days agoi2c: stm32f7: reinit_completion() per transfer not per msg
Marek Vasut [Sat, 2 May 2026 15:31:54 +0000 (17:31 +0200)] 
i2c: stm32f7: reinit_completion() per transfer not per msg

Currently, the driver may repeatedly call reinit_completion() during
transfer which contains multiple messages, while another thread is
waiting for the completion.

This happens during transfer with more than 1 message, invoked via
stm32f7_i2c_xfer_core() -> stm32f7_i2c_xfer_msg(). After invoking the
stm32f7_i2c_xfer_msg() to start transfer, stm32f7_i2c_xfer_core()
calls wait_for_completion_timeout() to wait for completion of the
transfer of all messages. When the first message transfer completes,
the hard IRQ handler triggers, and detects transfer completion, which
leads to stm32f7_i2c_isr_event_thread() IRQ thread being started. The
stm32f7_i2c_isr_event_thread() calls stm32f7_i2c_xfer_msg() in case
there are more messages.

Without this change, the second and later stm32f7_i2c_xfer_msg() would
call reinit_completion() on the completion which is still being waited
for in stm32f7_i2c_xfer_core(). Fix this by moving the reinit_completion()
into stm32f7_i2c_xfer_core(), together with wait_for_completion_timeout().

Since stm32f7_i2c_xfer_core() now waits for completion of the entire
transfer, increase the default timeout. This fixes sporadic transfer
timeouts on STM32MP25xx during kernel boot.

Fixes: aeb068c57214 ("i2c: i2c-stm32f7: add driver")
Signed-off-by: Marek Vasut <marex@nabladev.com>
[wsa: reworded commit subject]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agodt-bindings: i2c: amlogic: Add compatible for T7 SOC
Ronald Claveau [Fri, 24 Apr 2026 14:17:33 +0000 (16:17 +0200)] 
dt-bindings: i2c: amlogic: Add compatible for T7 SOC

Add the T7 SOC compatible which fallback to AXG compatible.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agoi2c: testunit: Replace system_long_wq with system_dfl_long_wq
Marco Crivellari [Thu, 30 Apr 2026 09:08:10 +0000 (11:08 +0200)] 
i2c: testunit: Replace system_long_wq with system_dfl_long_wq

Currently the code enqueue work items using {queue|mod}_delayed_work(),
using system_long_wq. This workqueue should be used when long works are
expected, but it is a per-cpu workqueue.

This is important because queue_delayed_work() queue the work using:

   queue_delayed_work_on(WORK_CPU_UNBOUND, ...);

Note that WORK_CPU_UNBOUND = NR_CPUS.

This would end up calling __queue_delayed_work() that does:

    if (housekeeping_enabled(HK_TYPE_TIMER)) {
    //      [....]
    } else {
            if (likely(cpu == WORK_CPU_UNBOUND))
                    add_timer_global(timer);
            else
                    add_timer_on(timer, cpu);
    }

So when cpu == WORK_CPU_UNBOUND the timer is global and is
not using a specific CPU. Later, when __queue_work() is called:

    if (req_cpu == WORK_CPU_UNBOUND) {
            if (wq->flags & WQ_UNBOUND)
                    cpu = wq_select_unbound_cpu(raw_smp_processor_id());
            else
                    cpu = raw_smp_processor_id();
    }

Because the wq is not unbound, it takes the CPU where the timer
fired and enqueue the work on that CPU.
The consequence of all of this is that the work can run anywhere,
depending on where the timer fired.

Recently, a new unbound workqueue specific for long running work has
been added:

    c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")

So change system_long_wq with system_dfl_long_wq so that the work may
benefit from scheduler task placement.

Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
[wsa: remove FIXME as well]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
13 days agowifi: cw1200: Revert "Fix locking in error paths"
Bart Van Assche [Thu, 30 Apr 2026 17:44:15 +0000 (10:44 -0700)] 
wifi: cw1200: Revert "Fix locking in error paths"

Revert commit d98c24617a83 ("wifi: cw1200: Fix locking in error paths")
because it introduces a locking bug instead of fixing a locking bug.
cw1200_wow_resume() unlocks priv->conf_mutex. Hence, adding
mutex_unlock(&priv->conf_mutex) just after cw1200_wow_resume() is wrong.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Closes: https://lore.kernel.org/all/408661f69f263266b028713e1412ba36d457e63d.camel@decadent.org.uk/
Fixes: d98c24617a83 ("wifi: cw1200: Fix locking in error paths")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260430174418.1845431-1-bvanassche@acm.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
13 days agowifi: mac80211: tests: mark HT check strict
Johannes Berg [Mon, 4 May 2026 06:54:27 +0000 (08:54 +0200)] 
wifi: mac80211: tests: mark HT check strict

The HT check now only applies in strict mode since APs
were found to be broken. Mark it as such.

Fixes: 711a9c018ad2 ("wifi: mac80211: skip ieee80211_verify_sta_ht_mcs_support check in non-strict mode")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
13 days agoio_uring/eventfd: reset deferred signal state
Yufan Chen [Sun, 3 May 2026 17:57:10 +0000 (01:57 +0800)] 
io_uring/eventfd: reset deferred signal state

Recursive eventfd wakeups must defer io_uring eventfd signaling because
eventfd_signal_mask() rejects reentry from eventfd wakeup handlers. The
io_ev_fd ops bit tracks an outstanding deferred signal so that the same
rcu_head is not queued twice.

That bit is only set today. Once the first deferred callback runs, later
recursive notifications still see the bit set and skip queueing another
deferred signal. This can leave new completions without a matching
eventfd wake after the first recursive deferral.

Clear the pending bit before issuing the deferred signal. If the wakeup
path recurses while the callback runs, a new signal can be queued for
the next RCU grace period while the current callback keeps its reference
until it returns.

Signed-off-by: Yufan Chen <ericterminal@gmail.com>
Fixes: 60b6c075e8eb ("io_uring/eventfd: move to more idiomatic RCU free usage")
Link: https://patch.msgid.link/20260503175710.37209-1-yufan.chen@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
13 days agoio_uring/napi: clear tracked NAPI entries on unregister
Yufan Chen [Sun, 3 May 2026 17:56:10 +0000 (01:56 +0800)] 
io_uring/napi: clear tracked NAPI entries on unregister

IORING_UNREGISTER_NAPI disables NAPI busy polling, but it currently
leaves any previously tracked NAPI IDs on the ring context. The normal
wait path only checks whether the list is empty before entering the busy
poll helper, so an unregistered ring can still observe stale entries and
run an unexpected busy poll pass.

Make unregister switch the context to inactive and free the tracked
entries. Do the same inactive transition while changing the tracking
strategy, and recheck the expected tracking mode under napi_lock before
inserting a newly learned NAPI ID. This prevents a racing poll path from
repopulating the list after unregister or reconfiguration.

Also make the busy poll dispatcher ignore inactive mode explicitly.

Signed-off-by: Yufan Chen <ericterminal@gmail.com>
Fixes: 6bf90bd8c58a ("io_uring/napi: add static napi tracking strategy")
Link: https://patch.msgid.link/20260503175610.35521-1-yufan.chen@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
13 days agodrm/ttm: Fix GPU MM stats during pool shrinking
Matthew Brost [Sat, 2 May 2026 06:53:38 +0000 (23:53 -0700)] 
drm/ttm: Fix GPU MM stats during pool shrinking

TTM pool shrinking frees pages by calling __free_pages() directly,
which bypasses updates to NR_GPU_ACTIVE and leaves GPU MM accounting
out of sync.

Introduce a helper, __free_pages_gpu_account(), and use it for all page
frees in ttm_pool.c so GPU MM statistics are updated consistently.

Reported-by: Kenneth Crudup <kenny@panix.com>
Fixes: ae80122f3896 ("drm/ttm: use gpu mm stats to track gpu memory allocations. (v4)")
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Kenneth Crudup <kenny@panix.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260502065338.2720646-1-matthew.brost@intel.com
13 days agosmb: client: use kzalloc to zero-initialize security descriptor buffer
Bjoern Doebel [Thu, 30 Apr 2026 08:57:17 +0000 (08:57 +0000)] 
smb: client: use kzalloc to zero-initialize security descriptor buffer

Commit 62e7dd0a39c2d ("smb: common: change the data type of num_aces
to le16") split struct smb_acl's __le32 num_aces field into __le16
num_aces and __le16 reserved. The reserved field corresponds to Sbz2
in the MS-DTYP ACL wire format, which must be zero [1].

When building an ACL descriptor in build_sec_desc(), we are using a
kmalloc()'ed descriptor buffer and writing the fields explicitly using
le16() writes now. This never writes to the 2 byte reserved field,
leaving it as uninitialized heap data.

When the reserved field happens to contain non-zero slab garbage,
Samba rejects the security descriptor with "ndr_pull_security_descriptor
failed: Range Error", causing chmod to fail with EINVAL.

Change kmalloc() to kzalloc() to ensure the entire buffer is
zero-initialized.

Fixes: 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16")
Cc: stable@vger.kernel.org
Signed-off-by: Bjoern Doebel <doebel@amazon.de>
Assisted-by: Kiro:claude-opus-4.6
[1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/20233ed8-a6c6-4097-aafa-dd545ed24428
Signed-off-by: Steve French <stfrench@microsoft.com>
13 days agocifs: abort open_cached_dir if we don't request leases
Shyam Prasad N [Tue, 28 Apr 2026 16:07:47 +0000 (21:37 +0530)] 
cifs: abort open_cached_dir if we don't request leases

It is possible that SMB2_open_init may not set lease context based
on the requested oplock level. This can happen when leases have been
temporarily or permanently disabled. When this happens, we will have
open_cached_dir making an open without lease context and the response
will anyway be rejected by open_cached_dir (thereby forcing a close to
discard this open). That's unnecessary two round-trips to the server.

This change adds a check before making the open request to the server
to make sure that SMB2_open_init did add the expected lease context
to the open in open_cached_dir.

Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
13 days agoLoongArch: KVM: Move unconditional delay into timer clear scenery
Bibo Mao [Mon, 4 May 2026 01:00:48 +0000 (09:00 +0800)] 
LoongArch: KVM: Move unconditional delay into timer clear scenery

When timer interrupt arrives in guest kernel, guest kernel clears the
timer interrupt and program timer with the next incoming event.

During this stage, timer tick is -1 and timer interrupt status is
disabled in ESTAT register. KVM hypervisor need write zero with timer
tick register and wait timer interrupt injection from HW side, and
then clear timer interrupt.

So there is 2 cycle delay in KVM hypervisor to emulate such scenery,
and the delay is unnecessary if there is no need to clear the timer
interrupt.

Here move 2 cycle delay into timer clear scenery and add timer ESTAT
checking after delay, and set max timer expire value if timer interrupt
does not arrive still.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software
Bibo Mao [Mon, 4 May 2026 01:00:48 +0000 (09:00 +0800)] 
LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software

With passthrough HW timer, timer interrupt is injected by HW. When
inject emulated CPU interrupt by software such SIP0/SIP1/IPI, HW timer
interrupt may be lost.

Here check whether there is timer tick value inversion before and after
injecting emulated CPU interrupt by software, timer enabling by reading
timer cfg register is skipped. If the timer tick value is detected with
changing, then timer should be enabled. And inject a timer interrupt by
software if there is.

Cc: <stable@vger.kernel.org>
Fixes: f45ad5b8aa93 ("LoongArch: KVM: Implement vcpu interrupt operations").
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Move AVEC interrupt injection into switch loop
Bibo Mao [Mon, 4 May 2026 01:00:48 +0000 (09:00 +0800)] 
LoongArch: KVM: Move AVEC interrupt injection into switch loop

When AVEC interrupt controller is emulated in user space, AVEC interrupt
is injected by software like SIP0/SIP1/TI/IPI interrupts. Here also move
the AVEC interrupt injection in switch loop.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte()
Tao Cui [Mon, 4 May 2026 01:00:38 +0000 (09:00 +0800)] 
LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte()

kvm_flush_pte() is the only caller that directly assigns *pte instead
of using the kvm_set_pte() wrapper. Use the wrapper for consistency with
the rest of the file.

No functional change intended.

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read()
Tao Cui [Mon, 4 May 2026 01:00:38 +0000 (09:00 +0800)] 
LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read()

In the ldptr (0x24...0x27) opcode decoding path, the default case only
breaks out but without setting "ret" value to EMULATE_FAIL. This leaves
run->mmio.len uninitialized (stale from a previous MMIO operation) while
"ret" value remains EMULATE_DO_MMIO, causing the code to proceed with an
incorrect MMIO length.

Add "ret = EMULATE_FAIL" to match the other default branches in the same
function (e.g. the 0x28...0x2e and 0x38 cases).

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
Qiang Ma [Mon, 4 May 2026 01:00:37 +0000 (09:00 +0800)] 
LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of vCPUs
which exceeds the maximum possible number of vCPUs.

Other architectures have already done this, such as commit 57a2e13ebdda
("KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS")

Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry
Xianglai Li [Mon, 4 May 2026 01:00:37 +0000 (09:00 +0800)] 
LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry

Insert the appropriate UNWIND hint into the kvm_exc_entry assembly
function to guide the generation of correct ORC table entries, thereby
solving the timeout problem ("unreliable stack") while loading the
livepatch-sample module on a physical machine running virtual machines
with multiple vcpus.

Cc: stable@vger.kernel.org
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: KVM: Compile switch.S directly into the kernel
Xianglai Li [Mon, 4 May 2026 01:00:37 +0000 (09:00 +0800)] 
LoongArch: KVM: Compile switch.S directly into the kernel

If we directly compile the switch.S file into the kernel, the address of
the kvm_exc_entry function will definitely be within the DMW memory area.
Therefore, we will no longer need to perform a copy relocation of the
kvm_exc_entry.

So this patch compiles switch.S directly into the kernel, and then remove
the copy relocation execution logic for the kvm_exc_entry function.

Cc: stable@vger.kernel.org
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: vDSO: Drop custom __arch_vdso_hres_capable()
Thomas Weißschuh [Mon, 4 May 2026 01:00:20 +0000 (09:00 +0800)] 
LoongArch: vDSO: Drop custom __arch_vdso_hres_capable()

The custom definition is identical to the generic fallback one.

So remove it.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang()
Wentao Guan [Mon, 4 May 2026 01:00:20 +0000 (09:00 +0800)] 
LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang()

The switch case in loongson_gpu_fixup_dma_hang() may not DC2 or DC3, and
readl(crtc_reg) will access with random address, because the "device" is
from "base+PCI_DEVICE_ID", "base" is from "pdev->devfn+1". This is wrong
when my platform inserts a discrete GPU:

lspci -tv
-[0000:00]-+-00.0  Loongson Technology LLC Hyper Transport Bridge Controller
...
           +-06.0  Loongson Technology LLC LG100 GPU
           +-06.2  Loongson Technology LLC Device 7a37
...

Add a default switch case to fix the panic as below:

 Kernel ade access[#1]:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.136-loong64-desktop-hwe+ #4
 pc 90000000017e5534 ra 90000000017e54c0 tp 90000001002f8000 sp 90000001002fb6c0
 a0 80000efe00003100 a1 0000000000003100 a2 0000000000000000 a3 0000000000000002
 a4 90000001002fb6b4 a5 900000087cdb58fd a6 90000000027af000 a7 0000000000000001
 t0 00000000000085b9 t1 000000000000ffff t2 0000000000000000 t3 0000000000000000
 t4 fffffffffffffffd t5 00000000fffb6d9c t6 0000000000083b00 t7 00000000000070c0
 t8 900000087cdb4d94 u0 900000087cdb58fd s9 90000001002fb826 s0 90000000031c12c8
 s1 7fffffffffffff00 s2 90000000031c12d0 s3 0000000000002710 s4 0000000000000000
 s5 0000000000000000 s6 9000000100053000 s7 7fffffffffffff00 s8 90000000030d4000
    ra: 90000000017e54c0 loongson_gpu_fixup_dma_hang+0x40/0x210
   ERA: 90000000017e5534 loongson_gpu_fixup_dma_hang+0xb4/0x210
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 00480000 [ADEM] (IS= ECode=8 EsubCode=1)
  BADV: 7fffffffffffff00
  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
 Modules linked in:
 Process swapper/0 (pid: 1, threadinfo=(____ptrval____), task=(____ptrval____))
 Stack : 0000000000000006 90000001002fb778 90000001002fb704 0000000000000007
         0000000016a65700 90000000017e5690 000000000000ffff ffffffffffffffff
         900000000209f7c0 9000000100053000 900000000209f7a8 9000000000eebc08
         0000000000000000 0000000000000000 0000000000000006 90000001002fb778
         90000001000530b8 90000000027af000 0000000000000000 9000000100054000
         9000000100053000 9000000000ebb70c 9000000100004c00 9000000004000001
         90000001002fb7e4 bae765461f31cb12 0000000000000000 0000000000000000
         0000000000000006 90000000027af000 0000000000000030 90000000027af000
         900000087cd6f800 9000000100053000 0000000000000000 9000000000ebc560
         7a2500147cdaf720 bae765461f31cb12 0000000000000001 0000000000000030
         ...
 Call Trace:
 [<90000000017e5534>] loongson_gpu_fixup_dma_hang+0xb4/0x210
 [<9000000000eebc08>] pci_fixup_device+0x108/0x280
 [<9000000000ebb70c>] pci_setup_device+0x24c/0x690
 [<9000000000ebc560>] pci_scan_single_device+0xe0/0x140
 [<9000000000ebc684>] pci_scan_slot+0xc4/0x280
 [<9000000000ebdd00>] pci_scan_child_bus_extend+0x60/0x3f0
 [<9000000000f5bc94>] acpi_pci_root_create+0x2b4/0x420
 [<90000000017e5e74>] pci_acpi_scan_root+0x2d4/0x440
 [<9000000000f5b02c>] acpi_pci_root_add+0x21c/0x3a0
 [<9000000000f4ee54>] acpi_bus_attach+0x1a4/0x3c0
 [<90000000010e200c>] device_for_each_child+0x6c/0xe0
 [<9000000000f4bbf4>] acpi_dev_for_each_child+0x44/0x70
 [<9000000000f4ef40>] acpi_bus_attach+0x290/0x3c0
 [<90000000010e200c>] device_for_each_child+0x6c/0xe0
 [<9000000000f4bbf4>] acpi_dev_for_each_child+0x44/0x70
 [<9000000000f4ef40>] acpi_bus_attach+0x290/0x3c0
 [<9000000000f5211c>] acpi_bus_scan+0x6c/0x280
 [<900000000189c028>] acpi_scan_init+0x194/0x310
 [<900000000189bc6c>] acpi_init+0xcc/0x140
 [<9000000000220cdc>] do_one_initcall+0x4c/0x310
 [<90000000018618fc>] kernel_init_freeable+0x258/0x2d4
 [<900000000184326c>] kernel_init+0x28/0x13c
 [<9000000000222008>] ret_from_kernel_thread+0xc/0xa4

Cc: stable@vger.kernel.org
Fixes: 95db0c9f526d ("LoongArch: Workaround LS2K/LS7A GPU DMA hang bug")
Link: https://gist.github.com/opsiff/ebf2dac51b4013d22462f2124c55f807
Link: https://gist.github.com/opsiff/a62f2a73db0492b3c49bf223a339b133
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup
Huacai Chen [Mon, 4 May 2026 01:00:20 +0000 (09:00 +0800)] 
LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup

When firmware enables 64-bit PCI host bridge support, some root bridges
already provide valid 64-bit mem resource windows through ACPI.

In this case, the LoongArch-specific mem resource high-bits fixup in
acpi_prepare_root_resources() should not be applied unconditionally.
Otherwise, the kernel may override the native resource layout derived
from firmware, and later BAR assignment can fail to place device BARs
into the intended 64-bit address space correctly.

Add a per-root-bridge ACPI flag, PCIH, and evaluate it from the current
root bridge device scope. When PCIH is set, skip the mem resource high-
bits fixup path and let the kernel use the firmware-provided resource
description directly. When PCIH is absent or cleared, keep the existing
behavior and continue filling the high address bits from the host bridge
address.

This makes the behavior per-root-bridge configurable and avoids breaking
valid 64-bit BAR space allocation on bridges whose 64-bit windows have
already been fully described by firmware.

Cc: stable@vger.kernel.org
Suggested-by: Chao Li <lichao@loongson.cn>
Tested-by: Dongyan Qian <qiandongyan@loongson.cn>
Signed-off-by: Dongyan Qian <qiandongyan@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: Fix SYM_SIGFUNC_START definition for 32BIT
Huacai Chen [Mon, 4 May 2026 01:00:01 +0000 (09:00 +0800)] 
LoongArch: Fix SYM_SIGFUNC_START definition for 32BIT

The SYM_SIGFUNC_START definition should match sigcontext that the length
of GPRs are 8 bytes for both 32BIT and 64BIT. So replace SZREG with 8 to
fix it.

Cc: stable@vger.kernel.org
Fixes: e4878c37f6679fde ("LoongArch: vDSO: Emit GNU_EH_FRAME correctly")
Suggested-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: Specify -m32/-m64 explicitly for 32BIT/64BIT
Huacai Chen [Mon, 4 May 2026 01:00:01 +0000 (09:00 +0800)] 
LoongArch: Specify -m32/-m64 explicitly for 32BIT/64BIT

Clang/LLVM build needs -m32/-m64 to switch triple variants (i.e. the
--target=xxx parameter). Otherwise we get build errors for CONFIG_32BIT.

GCC doesn't support -m32/-m64 now, but maybe support in future, so use
cc-option to specify them.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604232041.ESJDwVG4-lkp@intel.com/
Suggested-by: Nathan Chancellor <nathan@kernel.org
Tested-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoLoongArch: Make CONFIG_64BIT as the default option
Huacai Chen [Mon, 4 May 2026 01:00:00 +0000 (09:00 +0800)] 
LoongArch: Make CONFIG_64BIT as the default option

CONFIG_64BIT is the mandatory option before v7.0, but in v7.1-rc1 both
CONFIG_32BIT and CONFIG_64BIT are selectable and CONFIG_32BIT became the
default option. This breaks existing configurations, so explicitly make
CONFIG_64BIT as the default option to keep existing behavior.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
13 days agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 3 May 2026 22:25:47 +0000 (15:25 -0700)] 
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Three bug fixes for x86:

   - Check that nEPT/nNPT is enabled in slow flush hypercalls. If it is
     not, the hypercalls can be processed as usual even while running a
     nested guest

   - Fix shadow paging use-after-free due to page tables changing
     outside execution of the guest. A bug that is 16 years old and
     stems from an imprecision in the very first KVM series

   - Scan IRR whenever PID.ON is true, even if PIR is empty, which
     avoids a somewhat rare WARN"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
  KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow
  KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty
  KVM: x86: check for nEPT/nNPT in slow flush hypercalls

13 days agoLinux 7.1-rc2 v7.1-rc2
Linus Torvalds [Sun, 3 May 2026 21:21:25 +0000 (14:21 -0700)] 
Linux 7.1-rc2

13 days agoKVM: x86: Fix shadow paging use-after-free due to unexpected GFN
Sean Christopherson [Wed, 15 Apr 2026 00:52:34 +0000 (00:52 +0000)] 
KVM: x86: Fix shadow paging use-after-free due to unexpected GFN

The shadow MMU computes GFNs for direct shadow pages using sp->gfn plus
the SPTE index. This assumption breaks for shadow paging if the guest
page tables are modified between VM entries (similar to commit
aad885e77496, "KVM: x86/mmu: Drop/zap existing present SPTE even
when creating an MMIO SPTE", 2026-03-27).  The flow is as follows:

- a PDE is installed for a 2MB mapping, and a page in that area is
  accessed.  KVM creates a kvm_mmu_page consisting of 512 4KB pages;
  the kvm_mmu_page is marked by FNAME(fetch) as direct-mapped because
  the guest's mapping is a huge page (and thus contiguous).

- the PDE mapping is changed from outside the guest.

- the guest accesses another page in the same 2MB area.  KVM installs
  a new leaf SPTE and rmap entry; the SPTE uses the "correct" GFN
  (i.e. based on the new mapping, as changed in the previous step) but
  that GFN is outside of the [sp->gfn, sp->gfn + 511] range; therefore
  the rmap entry cannot be found and removed when the kvm_mmu_page
  is zapped.

- the memslot that covers the first 2MB mapping is deleted, and the
  kvm_mmu_page for the now-invalid GPA is zapped.  However, rmap_remove()
  only looks at the [sp->gfn, sp->gfn + 511] range established in step 1,
  and fails to find the rmap entry that was recorded by step 3.

- any operation that causes an rmap walk for the same page accessed
  by step 3 then walks a stale rmap and dereferences a freed kvm_mmu_page.
  This includes dirty logging or MMU notifier invalidations (e.g., from
  MADV_DONTNEED).

The underlying issue is that KVM's walking of shadow PTEs assumes that
if a SPTE is present when KVM wants to install a non-leaf SPTE, then the
existing kvm_mmu_page must be for the correct gfn.  Because the only way
for the gfn to be wrong is if KVM messed up and failed to zap a SPTE...
which shouldn't happen, but *actually* only happens in response to a
guest write.

That bug dates back literally forever, as even the first version of KVM
assumes that the GFN matches and walks into the "wrong" shadow page.
However, that was only an imprecision until 2032a93d66fa ("KVM: MMU:
Don't allocate gfns page for direct mmu pages") came along.

Fix it by checking for a target gfn mismatch and zapping the existing
SPTE.  That way the old SP and rmap entries are gone, KVM installs
the rmap in the right location, and everyone is happy.

Fixes: 2032a93d66fa ("KVM: MMU: Don't allocate gfns page for direct mmu pages")
Fixes: 6aa8b732ca01 ("kvm: userspace interface")
Reported-by: Alexander Bulekov <bkov@amazon.com>
Reported-by: Fred Griffoul <fgriffo@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260503201029.106481-1-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
13 days agoKVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow
Sean Christopherson [Tue, 28 Apr 2026 15:50:43 +0000 (08:50 -0700)] 
KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow

Rename kvm_apic_update_irr()'s "irr_updated" and vmx_sync_pir_to_irr()'s
"got_posted_interrupt" to a more accurate "max_irr_is_from_pir", as neither
"irr_updated" nor "got_posted_interrupt" is accurate.
__kvm_apic_update_irr() and thus kvm_apic_update_irr() specifically return
true if and only if the highest priority IRQ, i.e. max_irr, is a "new"
pending IRQ from the PIR.  I.e. it's possible for the IRR to be updated,
i.e. for a posted IRQ to be "got", *without* the APIs returning true.

Expand vmx_sync_pir_to_irr()'s comment to explain why it's necessary to
set KVM_REQ_EVENT only if a "new" IRQ was found, and to explain why it's
safe to do so only if a new IRQ is also the highest priority pending IRQ.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260503201703.108231-3-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
13 days agoKVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty
Paolo Bonzini [Sun, 3 May 2026 17:19:32 +0000 (19:19 +0200)] 
KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty

Fall back to apic_find_highest_vector() when PID.ON is set but PIR
turns out to be empty, to correctly report the highest pending interrupt
from the existing IRR.

In a nested VM stress test, the following WARNING fires in
vmx_check_nested_events() when kvm_cpu_has_interrupt() reports a pending
interrupt but the subsequent kvm_apic_has_interrupt() (which invokes
vmx_sync_pir_to_irr() again) returns -1:

  WARNING: CPU: 99 PID: 57767 at arch/x86/kvm/vmx/nested.c:4449 vmx_check_nested_events+0x6bf/0x6e0 [kvm_intel]
  Call Trace:
   kvm_check_and_inject_events
   vcpu_enter_guest.constprop.0
   vcpu_run
   kvm_arch_vcpu_ioctl_run
   kvm_vcpu_ioctl
   __x64_sys_ioctl
   do_syscall_64
   entry_SYSCALL_64_after_hwframe

The root cause is a race between vmx_sync_pir_to_irr() on the target vCPU
and __vmx_deliver_posted_interrupt() on a sender vCPU.  The sender
performs two individually-atomic operations that are not a single
transaction:

  1. pi_test_and_set_pir(vector)  -- sets the PIR bit
  2. pi_test_and_set_on()         -- sets PID.ON

The following interleaving triggers the bug:

  Sender vCPU (IPI):              Target vCPU (1st sync_pir_to_irr):
  B1: set PIR[vector]
                                  A1: pi_clear_on()
                                  A2: pi_harvest_pir() -> sees B1 bit
                                  A3: xchg() -> consumes bit, PIR=0
                                      (1st sync returns correct max_irr)
  B2: set PID.ON = 1

                                  Target vCPU (2nd sync_pir_to_irr):
                                  C1: pi_test_on() -> TRUE (from B2)
                                  C2: pi_clear_on() -> ON=0
                                  C3: pi_harvest_pir() -> PIR empty
                                  C4: *max_irr = -1, early return
                                      IRR NOT SCANNED

The interrupt is not lost (it resides in the IRR from the first sync and
is recovered on the next vcpu_enter_guest() iteration), but the incorrect
max_irr causes a spurious WARNING and a wasted L2 VM-Enter/VM-Exit cycle.

Fixes: b41f8638b9d3 ("KVM: VMX: Isolate pure loads from atomic XCHG when processing PIR")
Reported-by: Farrah Chen <farrah.chen@intel.com>
Analyzed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/kvm/20260428070349.1633238-1-chenyi.qiang@intel.com/T/
Link: https://patch.msgid.link/20260503201703.108231-2-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
13 days agoKVM: x86: check for nEPT/nNPT in slow flush hypercalls
Paolo Bonzini [Mon, 27 Apr 2026 12:25:40 +0000 (14:25 +0200)] 
KVM: x86: check for nEPT/nNPT in slow flush hypercalls

Checking is_guest_mode(vcpu) is incorrect, because translate_nested_gpa()
is only valid if an L2 guest is running *with nested EPT/NPT enabled*.
Instead use the same condition as translate_nested_gpa() itself.

Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Fixes: aee738236dca ("KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs", 2022-11-18)
Link: https://patch.msgid.link/20260503200905.106077-1-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
13 days agoMerge tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubit...
Linus Torvalds [Sun, 3 May 2026 15:58:42 +0000 (08:58 -0700)] 
Merge tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh fix from John Paul Adrian Glaubitz:
 "The ZERO_PAGE consolidation in v7.1, introduced a regression on sh
  which made these systems unbootable.

  The problem was that on sh, the initial boot parameters were
  previously referenced as an array and after 6215d9f4470f ("arch, mm:
  consolidate empty_zero_page"), they were referenced as a pointer which
  caused wrong code generation and boot hang.

  This changes the declaration back to being an array which fixes the
  boot hang"

* tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: Fix fallout from ZERO_PAGE consolidation

13 days agoMerge tag 'slab-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka...
Linus Torvalds [Sun, 3 May 2026 15:19:57 +0000 (08:19 -0700)] 
Merge tag 'slab-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - Stable fixes for CONFIG_SMP=n where _nolock() allocations in NMI both
   at kmalloc and page allocator levels are not properly protected by
   the spin_trylock() semantics on !SMP (Harry Yoo)

* tag 'slab-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab: return NULL early from kmalloc_nolock() in NMI on UP
  mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP

13 days agoMerge tag 'locking-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 3 May 2026 15:17:09 +0000 (08:17 -0700)] 
Merge tag 'locking-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
 "Fix lockup in requeue-PI during signal/timeout wakeups, by Sebastian
  Andrzej Siewior"

* tag 'locking-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Prevent lockup in requeue-PI during signal/ timeout wakeup

13 days agoMerge tag 'sched-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 3 May 2026 15:05:23 +0000 (08:05 -0700)] 
Merge tag 'sched-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:

 - Fix the delayed dequeue negative lag increase fix in the
   fair scheduler (Peter Zijlstra)

 - Fix wakeup_preempt_fair() to do proper delayed dequeue
   (Vincent Guittot)

 - Clear sched_entity::rel_deadline when initializing
   forked entities, which bug can cause all tasks to be
   EEVDF-ineligible, causing a NULL pointer dereference
   crash in pick_next_entity() (Zicheng Qu)

* tag 'sched-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Clear rel_deadline when initializing forked entities
  sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue
  sched/fair: Fix the negative lag increase fix

13 days agosh: Fix fallout from ZERO_PAGE consolidation
Mike Rapoport (Microsoft) [Fri, 17 Apr 2026 10:32:07 +0000 (13:32 +0300)] 
sh: Fix fallout from ZERO_PAGE consolidation

Consolidation of empty_zero_page declarations broke boot on sh.

sh stores its initial boot parameters in a page reserved in
arch/sh/kernel/head_32.S. Before commit 6215d9f4470f ("arch, mm:
consolidate empty_zero_page") this page was referenced in C code
as an array and after that commit it is referenced as a pointer.

This causes wrong code generation and boot hang.

Declare boot_params_page as an array to fix the issue.

Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Fixes: 6215d9f4470f ("arch, mm: consolidate empty_zero_page")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
13 days agorust: drm: fix unsound initialization in drm::Device::new
Eliot Courtney [Fri, 1 May 2026 10:49:37 +0000 (19:49 +0900)] 
rust: drm: fix unsound initialization in drm::Device::new

If pinned initialization of drm::Device::Data fails, it calls
drm::Device::release via drm_dev_put. This materializes a reference to
&drm::Device, but it's not fully constructed yet, because initializing
`data` failed. It should not be dropped either. Instead, if pinned
initialization fails, make sure drm::Device::release isn't called.

Fixes: 2e9fdbe5ec7a ("rust: drm: device: drop_in_place() the drm::Device in release()")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260501-fix-drm-1-v2-1-5c4f681837bc@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agoselftests: tls: add test for data loss on small pipe
Jakub Kicinski [Wed, 29 Apr 2026 22:29:39 +0000 (15:29 -0700)] 
selftests: tls: add test for data loss on small pipe

Add selftest for data loss on short splice.

Link: https://patch.msgid.link/20260429222944.2139041-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: tls: fix silent data drop under pipe back-pressure
Jakub Kicinski [Wed, 29 Apr 2026 22:29:38 +0000 (15:29 -0700)] 
net: tls: fix silent data drop under pipe back-pressure

tls_sw_splice_read() uses len when advancing rxm->offset / rxm->full_len
after skb_splice_bits(), rather than copied (the actual number of bytes
successfully spliced into the pipe). When the destination pipe cannot
accept all the requested bytes, splice_to_pipe() returns fewer bytes
than len, and 'len - copied' of data is effectively skipped over.

Fixes: e062fe99cccd ("tls: splice_read: fix accessing pre-processed records")
Link: https://patch.msgid.link/20260429222944.2139041-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoMerge branch 'net-sched-sch_cake-annotate-data-races-in-cake_dump_class_stats-series'
Jakub Kicinski [Sat, 2 May 2026 23:59:11 +0000 (16:59 -0700)] 
Merge branch 'net-sched-sch_cake-annotate-data-races-in-cake_dump_class_stats-series'

Eric Dumazet says:

====================
net/sched: sch_cake: annotate data-races in cake_dump_class_stats (series)

cake_dump_class_stats() runs without qdisc spinlock being held.

In this series (of two), I add READ_ONCE()/WRITE_ONCE() annotations for:

- flow->head
- flow->dropped
- b->backlogs[]
- flow->deficit
- flow->cvars.dropping
- flow->cvars.count
- flow->cvars.p_drop
- flow->cvars.blue_timer
- flow->cvars.drop_next
====================

Link: https://patch.msgid.link/20260430061610.3503483-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet/sched: sch_cake: annotate data-races in cake_dump_class_stats (II)
Eric Dumazet [Thu, 30 Apr 2026 06:16:10 +0000 (06:16 +0000)] 
net/sched: sch_cake: annotate data-races in cake_dump_class_stats (II)

cake_dump_class_stats() runs without qdisc spinlock being held.

In this second patch, I add READ_ONCE()/WRITE_ONCE() annotations for:

- flow->deficit
- flow->cvars.dropping
- flow->cvars.count
- flow->cvars.p_drop
- flow->cvars.blue_timer
- flow->cvars.drop_next

Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260430061610.3503483-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet/sched: sch_cake: annotate data-races in cake_dump_class_stats (I)
Eric Dumazet [Thu, 30 Apr 2026 06:16:09 +0000 (06:16 +0000)] 
net/sched: sch_cake: annotate data-races in cake_dump_class_stats (I)

cake_dump_class_stats() runs without qdisc spinlock being held.

In this first patch, I add READ_ONCE()/WRITE_ONCE() annotations for:

- flow->head
- flow->dropped
- b->backlogs[]

Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260430061610.3503483-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agobatman-adv: stop tp_meter sessions during mesh teardown
Jiexun Wang [Mon, 27 Apr 2026 06:43:34 +0000 (14:43 +0800)] 
batman-adv: stop tp_meter sessions during mesh teardown

TP meter sessions remain linked on bat_priv->tp_list after the netlink
request has already finished. When the mesh interface is removed,
batadv_mesh_free() currently tears down the mesh without first draining
these sessions.

A running sender thread or a late incoming tp_meter packet can then keep
processing against a mesh instance which is already shutting down.
Synchronize tp_meter with the mesh lifetime by stopping all active
sessions from batadv_mesh_free() and waiting for sender threads to exit
before teardown continues.

Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agobatman-adv: reject new tp_meter sessions during teardown
Jiexun Wang [Mon, 27 Apr 2026 06:43:33 +0000 (14:43 +0800)] 
batman-adv: reject new tp_meter sessions during teardown

Prevent tp_meter from starting new sender or receiver sessions after
mesh_state has left BATADV_MESH_ACTIVE.

Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agobatman-adv: fix integer overflow on buff_pos
Lyes Bourennani [Tue, 21 Apr 2026 22:20:22 +0000 (00:20 +0200)] 
batman-adv: fix integer overflow on buff_pos

Fixing an integer overflow present in batadv_iv_ogm_send_to_if. The size
check is done using the int type in batadv_iv_ogm_aggr_packet whereas the
buff_pos variable uses the s16 type. This could lead to an out-of-bound
read.

Cc: stable@vger.kernel.org
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
Signed-off-by: Lyes Bourennani <lbourennani@fuzzinglabs.com>
Signed-off-by: Alexis Pinson <apinson@fuzzinglabs.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agoMerge tag 'v7.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sat, 2 May 2026 19:31:43 +0000 (12:31 -0700)] 
Merge tag 'v7.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:

 - Reject algorithms with authsizes that are too short in authencesn

* tag 'v7.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: authencesn - reject short ahash digests during instance creation

2 weeks agoMerge tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinj...
Linus Torvalds [Sat, 2 May 2026 19:25:57 +0000 (12:25 -0700)] 
Merge tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs

Pull ntfs fixes from Namjae Jeon:

 - Fix a NULL pointer dereference in ntfs_index_walk_down() by
   validating index block allocation

 - Fix a memory leak of the symlink target string in
   ntfs_reparse_set_wsl_symlink() during error paths

 - Prevent VCN overflow and validate lowest_vcn in
   ntfs_mapping_pairs_decompress() to avoid runlist corruption

 - Fix a page reference leak in ntfs_write_iomap_end_resident()
   when attribute search context allocation fails

 - Fix an invalid PTR_ERR() usage on a valid folio pointer in
   __ntfs_bitmap_set_bits_in_run()

 - Correct directory link counting by dropping nlink only when
   the MFT record link count reaches zero for WIN32/DOS aliases

 - Fix an uninitialized variable in ntfs_mapping_pairs_decompress()
   by returning an error pointer directly

* tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
  ntfs: Use return instead of goto in ntfs_mapping_pairs_decompress()
  ntfs: drop nlink once for WIN32/DOS aliases
  ntfs: fix invalid PTR_ERR() usage in __ntfs_bitmap_set_bits_in_run()
  ntfs: fix error handling in ntfs_write_iomap_end_resident()
  ntfs: fix VCN overflow in ntfs_mapping_pairs_decompress()
  ntfs: fix WSL symlink target leak on reparse failure
  ntfs: fix NULL dereference in ntfs_index_walk_down()

2 weeks agoRDMA/hns: Fix unlocked call to hns_roce_qp_remove()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:48 +0000 (13:17 -0300)] 
RDMA/hns: Fix unlocked call to hns_roce_qp_remove()

Sashiko points out that hns_roce_qp_remove() requires the caller to hold
locks.  The error flow in hns_roce_create_qp_common() doesn't hold those
locks for the error unwind so it risks corrupting memory.

Grab the same locks the other two callers use.

Cc: stable@vger.kernel.org
Fixes: e088a685eae9 ("RDMA/hns: Support rq record doorbell for the user space")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=9
Link: https://patch.msgid.link/r/15-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/hns: Fix xarray race in hns_roce_create_qp_common()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:47 +0000 (13:17 -0300)] 
RDMA/hns: Fix xarray race in hns_roce_create_qp_common()

Similar to the SRQ case the hr_qp is stored in the xarray before it is
fully initialized. Unlike the SRQ case the error unwinds do not wait for
the completion so keep the refcount 0 until the function succeeds.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Link: https://patch.msgid.link/r/14-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Suggested-by: Junxian Huang <huangjunxian6@hisilicon.com>
Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/hns: Fix xarray race in hns_roce_create_srq()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:46 +0000 (13:17 -0300)] 
RDMA/hns: Fix xarray race in hns_roce_create_srq()

Sashiko points out that once the srq memory is stored into the xarray by
alloc_srqc() it can immediately be looked up by:

xa_lock(&srq_table->xa);
srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1));
if (srq)
refcount_inc(&srq->refcount);
xa_unlock(&srq_table->xa);

Which will fail refcount debug because the refcount is 0 and then crash:

srq->event(srq, event_type);

Because event is NULL.

Use refcount_inc_not_zero() instead to ensure a partially prepared srq is
never retrieved from the event handler and fix the ordering of the
initialization so refcount becomes 1 only after it is fully ready.

All the initialization must be done before calling free_srqc() since it
depends on the completion and refcount.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=3
Link: https://patch.msgid.link/r/13-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mlx4: Fix mis-use of RCU in mlx4_srq_event()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:45 +0000 (13:17 -0300)] 
RDMA/mlx4: Fix mis-use of RCU in mlx4_srq_event()

Sashiko points out the radix_tree itself is RCU safe, but nothing ever
frees the mlx4_srq struct with RCU, and it isn't even accessed within the
RCU critical section. It also will crash if an event is delivered before
the srq object is finished initializing.

Use the spinlock since it isn't easy to make RCU work, use
refcount_inc_not_zero() to protect against partially initialized objects,
and order the refcount_set() to be after the srq is fully initialized.

Cc: stable@vger.kernel.org
Fixes: 30353bfc43a1 ("net/mlx4_core: Use RCU to perform radix tree lookup for SRQ")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=5
Link: https://patch.msgid.link/r/12-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mlx4: Fix resource leak on error in mlx4_ib_create_srq()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:44 +0000 (13:17 -0300)] 
RDMA/mlx4: Fix resource leak on error in mlx4_ib_create_srq()

Sashiko points out that mlx4_srq_alloc() was not undone during error
unwind, add the missing call to mlx4_srq_free().

Cc: stable@vger.kernel.org
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=8
Link: https://patch.msgid.link/r/11-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/vmw_pvrdma: Fix double free on pvrdma_alloc_ucontext() error path
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:43 +0000 (13:17 -0300)] 
RDMA/vmw_pvrdma: Fix double free on pvrdma_alloc_ucontext() error path

Sashiko points out that pvrdma_uar_free() is already called within
pvrdma_dealloc_ucontext(), so calling it before triggers a double free.

Cc: stable@vger.kernel.org
Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/10-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/ocrdma: Don't NULL deref uctx on errors in ocrdma_copy_pd_uresp()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:42 +0000 (13:17 -0300)] 
RDMA/ocrdma: Don't NULL deref uctx on errors in ocrdma_copy_pd_uresp()

Sashiko points out that pd->uctx isn't initialized until late in the
function so all these error flow references are NULL and will crash. Use
the uctx that isn't NULL.

Cc: stable@vger.kernel.org
Fixes: fe2caefcdf58 ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/9-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/ocrdma: Clarify the mm_head searching
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:41 +0000 (13:17 -0300)] 
RDMA/ocrdma: Clarify the mm_head searching

The intention of this code is to find matching entries exactly, the driver
never creates phys_addr's with different lens so the current expression is
not a bug, but it doesn't make sense and confuses review tooling.

Search for exact match instead.

Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/8-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mana: Fix error unwind in mana_ib_create_qp_rss()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:40 +0000 (13:17 -0300)] 
RDMA/mana: Fix error unwind in mana_ib_create_qp_rss()

Sashiko points out that mana_ib_cfg_vport_steering() is leaked, the normal
destroy path cleans it up.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/7-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mana: Fix mana_destroy_wq_obj() cleanup in mana_ib_create_qp_rss()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:39 +0000 (13:17 -0300)] 
RDMA/mana: Fix mana_destroy_wq_obj() cleanup in mana_ib_create_qp_rss()

Sashiko points out there are two bugs here in the error unwind flow, both
related to how the WQ table is unwound.

First there is a double i-- on the first failure path due to the while loop
having a i--, remove it.

Second if mana_ib_install_cq_cb() fails then mana_create_wq_obj() is not
undone due to the above i--.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/6-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss()
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:38 +0000 (13:17 -0300)] 
RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss()

Sashiko points out that the user can specify WQs sharing the same CQ as a
part of the uAPI and this will trigger the WARN_ON() then go on to corrupt
the kernel.

Just reject it outright and fail the QP creation.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/5-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mana: Validate rx_hash_key_len
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:37 +0000 (13:17 -0300)] 
RDMA/mana: Validate rx_hash_key_len

Sashiko points out that rx_hash_key_len comes from a uAPI structure and is
blindly passed to memcpy, allowing the userspace to trash kernel
memory. Bounds check it so the memcpy cannot overflow.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/4-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mlx5: Add missing store/release for lock elision pattern
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:36 +0000 (13:17 -0300)] 
RDMA/mlx5: Add missing store/release for lock elision pattern

mlx5 has a common pattern implementing a device-global singleton resource
where it checks the resource pointer for !NULL and then skips obtaining
the lock.

This is not ordered properly as observing !NULL doesn't mean that all the
data under that pointer is also visible on this CPU when the lock is not
taken.

Use a release/acquire pairing to explicitly manage this.

Pointed out by sashiko, Codex found more cases.

Fixes: 5895e70f2e6e ("IB/mlx5: Allocate resources just before first QP/SRQ is created")
Fixes: 638420115cc4 ("IB/mlx5: Create UMR QP just before first reg_mr occurs")
Link: https://sashiko.dev/#/patchset/SYBPR01MB7881E1E0970268BD69C0BA75AF2B2%40SYBPR01MB7881.ausprd01.prod.outlook.com
Link: https://patch.msgid.link/r/3-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Assisted-by: Codex:GPT-5.5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/mlx5: Restore zero-init to mlx5_ib_modify_qp() ucmd
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:35 +0000 (13:17 -0300)] 
RDMA/mlx5: Restore zero-init to mlx5_ib_modify_qp() ucmd

Sashiko points out the check for inlen==0 got missed, the ={} was not
redundant, put it back.

Fixes: a9cd442a5347 ("RDMA: Remove redundant = {} for udata req structs")
Link: https://patch.msgid.link/r/2-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoRDMA/ionic: Fix typo in format string
Jason Gunthorpe [Tue, 28 Apr 2026 16:17:34 +0000 (13:17 -0300)] 
RDMA/ionic: Fix typo in format string

Applying the corrupted patch by hand mangled the format string, put the s
in the right place.

Cc: stable@vger.kernel.org
Fixes: 654a27f25530 ("RDMA/ionic: bound node_desc sysfs read with %.64s")
Link: https://patch.msgid.link/r/1-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reported-by: Brad Spengler <brad.spengler@opensrcsec.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoip6_gre: Use cached t->net in ip6erspan_changelink().
Maoyi Xie [Thu, 30 Apr 2026 10:33:18 +0000 (18:33 +0800)] 
ip6_gre: Use cached t->net in ip6erspan_changelink().

After commit 5e72ce3e3980 ("net: ipv6: Use link netns in newlink() of
rtnl_link_ops"), ip6erspan_newlink() correctly resolves the per-netns
ip6gre hash via link_net. ip6erspan_changelink() was not converted in
that series and still uses dev_net(dev), which diverges from the
device's creation netns after IFLA_NET_NS_FD migration.

This re-inserts the tunnel into the wrong per-netns hash. The
original netns keeps a stale entry. When that netns is later
destroyed, ip6gre_exit_rtnl_net() walks the stale entry, producing a
slab-use-after-free reported by KASAN, followed by a kernel BUG at
net/core/dev.c (LIST_POISON1) in unregister_netdevice_many_notify().

Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net).

ip6gre_changelink() earlier in the same file already uses the cached
t->net; only ip6erspan_changelink() has the wrong shape.

Fixes: 2d665034f239 ("net: ip6_gre: Fix ip6erspan hlen calculation")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260430103318.3206018-1-maoyi.xie@ntu.edu.sg
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoMerge branch 'replace-direct-dequeue-call-with-qdisc_dequeue_peeked'
Jakub Kicinski [Sat, 2 May 2026 17:20:58 +0000 (10:20 -0700)] 
Merge branch 'replace-direct-dequeue-call-with-qdisc_dequeue_peeked'

Jamal Hadi Salim says:

====================
Replace direct dequeue call with qdisc_dequeue_peeked

When sfb and red qdiscs have children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (red/sfb in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(red/sfb) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (red/sfb). And herein lies the problem.
     - red/sfb will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Patch 1 fixes the issue for red qdisc. Patch 2 fixes it for sfb.
Patch 3 adds testcases for the two setups.
====================

Link: https://patch.msgid.link/20260430152957.194015-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoselftests/tc-testing: Add tests that force red and sfb to dequeue from child's gso_skb
Victor Nogueira [Thu, 30 Apr 2026 15:29:57 +0000 (11:29 -0400)] 
selftests/tc-testing: Add tests that force red and sfb to dequeue from child's gso_skb

Create 4 test cases:
- Force red to dequeue from its child's gso_skb with qfq leaf
- Force sfb to dequeue from its child's gso_skb with qfq leaf
- Force red to dequeue from its child's gso_skb with dualpi2 leaf
- Force sfb to dequeue from its child's gso_skb with dualpi2 leaf

All of them have tbf followed by red (or sfb) followed by qfq (or
dualpi2). Since tbf calls its child's peek followed by
qdisc_dequeue_peeked, it will force red/sfb to call their child's peek.
In this case, since the child (qfq/dualpi2) has qdisc_peek_dequeued as
its peek callback, the packet will be stored in its gso_skb queue. During
the subsequent call to qdisc_dequeue_peeked, red/sfb will have to dequeue
from the child's gso_skb to retrieve the packet.
Not doing so will cause a NULL ptr deref which was happening before a
recent fix.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260430152957.194015-4-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet/sched: sch_sfb: Replace direct dequeue call with peek and qdisc_dequeue_peeked
Victor Nogueria [Thu, 30 Apr 2026 15:29:56 +0000 (11:29 -0400)] 
net/sched: sch_sfb: Replace direct dequeue call with peek and qdisc_dequeue_peeked

When sfb has children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (sfb in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(sfb) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (sfb). And herein lies the problem.
     - sfb will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

[  127.594489][  T453] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[  127.594741][  T453] CPU: 2 UID: 0 PID: 453 Comm: ping Not tainted 7.1.0-rc1-00035-gac961974495b-dirty #793 PREEMPT(full)
[  127.595059][  T453] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  127.595254][  T453] RIP: 0010:qfq_dequeue+0x35c/0x1650 [sch_qfq]
[  127.595461][  T453] Code: 00 fc ff df 80 3c 02 00 0f 85 17 0e 00 00 4c 8d 73 48 48 89 9d b8 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 76 0c 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b
[  127.596081][  T453] RSP: 0018:ffff88810e5af440 EFLAGS: 00010216
[  127.596337][  T453] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  127.596623][  T453] RDX: 0000000000000009 RSI: 0000001880000000 RDI: ffff888104fd82b0
[  127.596917][  T453] RBP: ffff888104fd8000 R08: ffff888104fd8280 R09: 1ffff110211893a3
[  127.597165][  T453] R10: 1ffff110211893a6 R11: 1ffff110211893a7 R12: 0000001880000000
[  127.597404][  T453] R13: ffff888104fd82b8 R14: 0000000000000048 R15: 0000000040000000
[  127.597644][  T453] FS:  00007fc380cbfc40(0000) GS:ffff88816f2a8000(0000) knlGS:0000000000000000
[  127.597956][  T453] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  127.598160][  T453] CR2: 00005610aa9890a8 CR3: 000000010369e000 CR4: 0000000000750ef0
[  127.598390][  T453] PKRU: 55555554
[  127.598509][  T453] Call Trace:
[  127.598629][  T453]  <TASK>
[  127.598718][  T453]  ? mark_held_locks+0x40/0x70
[  127.598890][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599053][  T453]  sfb_dequeue+0x88/0x4d0
[  127.599174][  T453]  ? ktime_get+0x137/0x230
[  127.599328][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599480][  T453]  ? qdisc_peek_dequeued+0x7b/0x350 [sch_qfq]
[  127.599670][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599831][  T453]  tbf_dequeue+0x6b1/0x1098 [sch_tbf]
[  127.599988][  T453]  __qdisc_run+0x169/0x1900

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Fixes: e13e02a3c68d ("net_sched: SFB flow scheduler")
Signed-off-by: Victor Nogueria <victor@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430152957.194015-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet/sched: sch_red: Replace direct dequeue call with peek and qdisc_dequeue_peeked
Jamal Hadi Salim [Thu, 30 Apr 2026 15:29:55 +0000 (11:29 -0400)] 
net/sched: sch_red: Replace direct dequeue call with peek and qdisc_dequeue_peeked

When red qdisc has children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (red in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(red) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (red). And herein lies the problem.
     - red will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

[   78.667668][  T363] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[   78.667927][  T363] CPU: 1 UID: 0 PID: 363 Comm: ping Not tainted 7.1.0-rc1-00033-g46f74a3f7d57-dirty #790 PREEMPT(full)
[   78.668263][  T363] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   78.668486][  T363] RIP: 0010:qfq_dequeue+0x446/0xc90 [sch_qfq]
[   78.668718][  T363] Code: 54 c0 e8 dd 90 00 f1 48 c7 c7 e0 03 54 c0 48 89 de e8 ce 90 00 f1 48 8d 7b 48 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a 48 c1 ea 03 <80> 3c 02 00 74 05 e8 ef a1 e1 f1 48 8b 7b 48 48 8d 54 24 58 48 8d
[   78.669312][  T363] RSP: 0018:ffff88810de573e0 EFLAGS: 00010216
[   78.669533][  T363] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   78.669790][  T363] RDX: 0000000000000009 RSI: 0000000000000004 RDI: 0000000000000048
[   78.670044][  T363] RBP: ffff888110dc4000 R08: ffffffffb1b0885a R09: fffffbfff6ba9078
[   78.670297][  T363] R10: 0000000000000003 R11: ffff888110e31c80 R12: 0000001880000000
[   78.670560][  T363] R13: ffff888110dc4150 R14: ffff888110dc42b8 R15: 0000000000000200
[   78.670814][  T363] FS:  00007f66a8f09c40(0000) GS:ffff888163428000(0000) knlGS:0000000000000000
[   78.671110][  T363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   78.671324][  T363] CR2: 000055db4c6a30a8 CR3: 000000010da67000 CR4: 0000000000750ef0
[   78.671585][  T363] PKRU: 55555554
[   78.671713][  T363] Call Trace:
[   78.671843][  T363]  <TASK>
[   78.671936][  T363]  ? __pfx_qfq_dequeue+0x10/0x10 [sch_qfq]
[   78.672148][  T363]  ? __pfx__printk+0x10/0x10
[   78.672322][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.672496][  T363]  ? lockdep_hardirqs_on_prepare+0xa8/0x1a0
[   78.672706][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.672875][  T363]  ? trace_hardirqs_on+0x19/0x1a0
[   78.673047][  T363]  red_dequeue+0x65/0x270 [sch_red]
[   78.673217][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.673385][  T363]  tbf_dequeue.cold+0xb0/0x70c [sch_tbf]
[   78.673566][  T363]  __qdisc_run+0x169/0x1900

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Fixes: 77be155cba4e ("pkt_sched: Add peek emulation for non-work-conserving qdiscs.")
Reported-by: Manas <ghandatmanas@gmail.com>
Reported-by: Rakshit Awasthi <rakshitawasthi17@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430152957.194015-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoamd-xgbe: fix PTP addend overflow causing frozen clock
Gregory Fuchedgi [Wed, 29 Apr 2026 21:54:14 +0000 (14:54 -0700)] 
amd-xgbe: fix PTP addend overflow causing frozen clock

XGBE_PTP_ACT_CLK_FREQ and XGBE_V2_PTP_ACT_CLK_FREQ were 10x too
large (500MHz/1GHz instead of 50MHz/100MHz), causing the computed
addend to overflow the 32-bit tstamp_addend. In the general case
this would result in the clock advancing at the wrong rate. For v2
(PCI), ptpclk_rate is hardcoded to 125MHz, so the addend formula
(ACT_CLK_FREQ << 32) / ptpclk_rate yields exactly 8 * 2^32, and
when stored to the 32-bit tstamp_addend the value is zero. With
addend = 0 the hardware accumulator never overflows and the PTP
clock is fully stopped. For v1 (platform), ptpclk_rate is read from
ACPI/DT so the exact overflow behavior depends on the
firmware-reported frequency.

Define the constants as NSEC_PER_SEC / SSINC so the relationship is
explicit and cannot drift out of sync.

Fixes: fbd47be098b5 ("amd-xgbe: add hardware PTP timestamping support")
Tested-by: Gregory Fuchedgi <gfuchedgi@gmail.com>
Signed-off-by: Gregory Fuchedgi <gfuchedgi@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260429-fix-xgbe-ptp-addend-v1-1-fca5b0ca5e62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: wan: fsl_ucc_hdlc: fix ucc_hdlc_remove
Holger Brunck [Wed, 29 Apr 2026 11:42:08 +0000 (13:42 +0200)] 
net: wan: fsl_ucc_hdlc: fix ucc_hdlc_remove

If the driver is used in a non tdm mode priv->utdm is a NULL pointer.
Therefore we need to check this pointer first before checking si_regs.

Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: wan: fsl_ucc_hdlc: fix uhdlc_memclean
Holger Brunck [Wed, 29 Apr 2026 11:42:07 +0000 (13:42 +0200)] 
net: wan: fsl_ucc_hdlc: fix uhdlc_memclean

Unmapping of uf_regs is done from ucc_fast_free and doesn't need to be
done explicitly. If already unmapped ucc_fast_free will crash.

Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoMAINTAINERS: Add self for the DEC LANCE network driver
Maciej W. Rozycki [Mon, 27 Apr 2026 10:23:35 +0000 (11:23 +0100)] 
MAINTAINERS: Add self for the DEC LANCE network driver

Like with the rest of DECstation and TURBOchannel hardware I have been
handling the DEC LANCE network driver for some 25 years now anyway.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/alpine.DEB.2.21.2604271113520.28583@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agotools/headers: Regenerate stddef.h to fix BPF selftests
Paul Chaignon [Sat, 2 May 2026 10:12:40 +0000 (12:12 +0200)] 
tools/headers: Regenerate stddef.h to fix BPF selftests

With commit dacbfc167808 ("crypto: af_alg - Annotate struct af_alg_iv
with __counted_by"), two selftests, test_tag and crypto_sanity, now
indirectly rely on the __counted_by macro. On systems with commit
dacbfc167808 in the installed UAPI headers, the selftests build fails
with:

  In file included from tools/testing/selftests/bpf/prog_tests/crypto_sanity.c:7:
  /usr/include/linux/if_alg.h:45:22: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘__counted_by’
     45 |         __u8    iv[] __counted_by(ivlen);
        |                      ^~~~~~~~~~~~

This patch fixes it by regenerating stddef.h in tools/include using the
instructions from commit a778f5d46b62 ("tools/headers: Pull in stddef.h
to uapi to fix BPF selftests build in CI").

Fixes: dacbfc167808 ("crypto: af_alg - Annotate struct af_alg_iv with __counted_by")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/8da8ef16055aa452d940668ed5359ce54adc6b0b.1777715500.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoksmbd: validate inherited ACE SID length
Shota Zaizen [Tue, 28 Apr 2026 10:02:55 +0000 (19:02 +0900)] 
ksmbd: validate inherited ACE SID length

smb_inherit_dacl() walks the parent directory DACL loaded from the
security descriptor xattr. It verifies that each ACE contains the fixed
SID header before using it, but does not verify that the variable-length
SID described by sid.num_subauth is fully contained in the ACE.

A malformed inheritable ACE can advertise more subauthorities than are
present in the ACE. compare_sids() may then read past the ACE.
smb_set_ace() also clamps the copied destination SID, but used the
unchecked source SID count to compute the inherited ACE size. That could
advance the temporary inherited ACE buffer pointer and nt_size accounting
past the allocated buffer.

Fix this by validating the parent ACE SID count and SID length before
using the SID during inheritance. Compute the inherited ACE size from the
copied SID so the size matches the bounded destination SID. Reject the
inherited DACL if size accumulation would overflow smb_acl.size or the
security descriptor allocation size.

Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shota Zaizen <s@zaizen.me>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 weeks agoksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()
Namjae Jeon [Thu, 30 Apr 2026 23:34:55 +0000 (08:34 +0900)] 
ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()

The kernel test robot reported W=1 build warnings for ksmbd_conn_get()
and ksmbd_conn_put() due to missing parameter descriptions.
Add the @conn description to fix these warnings.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>