Marek Vasut [Sat, 4 Apr 2026 18:35:01 +0000 (20:35 +0200)]
ASoC: fsl_sai: Add RX/TX BCLK swap support
Add support for setting the Bit Clock Swap bit in CR2 register
via new "fsl,sai-bit-clock-swap" DT property. This bit swaps the
bit clock used by the transmitter or receiver in asynchronous mode,
i.e. makes transmitter use RX_BCLK and TX_SYNC, and vice versa,
makes receiver use TX_BCLK and RX_SYNC.
Marek Vasut [Sat, 4 Apr 2026 18:35:00 +0000 (20:35 +0200)]
ASoC: dt-bindings: fsl-sai: Document RX/TX BCLK swap support
Document support for setting the Bit Clock Swap bit in CR2 register
via new "fsl,sai-bit-clock-swap" DT property. This bit swaps the
bit clock used by the transmitter or receiver in asynchronous mode,
i.e. makes transmitter use RX_BCLK and TX_SYNC, and vice versa,
makes receiver use TX_BCLK and RX_SYNC.
Li Jian [Fri, 17 Apr 2026 10:53:14 +0000 (18:53 +0800)]
ASoC: ES8389: convert to devm_clk_get_optional() to get clock
When enabling ES8390 via ACPI description, es8389 would fail to
obtain a clock source, causing the driver to fail to initialize.
This was not an issue with older kernels, but since commit abae8e57e49a ("clk: generalize devm_clk_get() a bit"),
devm_clk_get() would return an error pointer when a clock source
was not detected (instead of falling back to a static clock),
causing the driver to fail early.
Use devm_clk_get_optional() instead to return to the previous
behaviour, allowing the use of a static clock source.
Currently, the function topology provides the basic audio function. The
commit allow the users add their topologies to the topology list. So
they can add their own features.
ASoC: sof-function-topology-lib: add virtual loop dai support
The virtual loop dai link is created by the machine driver and no
topology is needed for the dai link. Handle it to avoid the dai_link
is not supported error.
This series simplifies spinlock management across several Samsung ASoC
drivers by adopting the guard() macro.
All patches are strictly code refactoring with no functional changes
intended.
Johan Hovold [Tue, 21 Apr 2026 12:53:53 +0000 (14:53 +0200)]
spi: cadence-quadspi: clean up disable runtime pm quirk
Commit 30dbc1c8d50f ("spi: cadence-qspi: defer runtime support on
socfpga if reset bit is enabled") fixed a warm reset issue on SoCFPGA by
disabling runtime PM on that platform.
Clean up the quirk implementation by never dropping the runtime PM usage
count on probe instead of sprinkling conditionals throughout the driver
which makes the code unnecessarily hard to read and maintain.
Cc: Khairul Anuar Romli <khairul.anuar.romli@altera.com> Cc: Adrian Ng Ho Yin <adrianhoyin.ng@altera.com> Cc: Niravkumar L Rabara <nirav.rabara@altera.com> Cc: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421125354.1534871-6-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Tue, 21 Apr 2026 12:53:52 +0000 (14:53 +0200)]
spi: cadence-quadspi: fix runtime pm and clock imbalance on unbind
Make sure to balance the runtime PM usage count before returning on
probe failure (to allow the controller to suspend after a probe
deferral) and to only drop the usage count on driver unbind to avoid a
clock disable imbalance.
Also restore the autosuspend setting.
Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") Cc: stable@vger.kernel.org # 6.7 Cc: Dhruva Gole <d-gole@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421125354.1534871-5-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Tue, 21 Apr 2026 12:53:50 +0000 (14:53 +0200)]
spi: cadence-quadspi: fix clock imbalance on probe failure
Drop the bogus runtime PM get on probe failures that was never needed
and that leaks a usage count reference while preventing the clocks from
being disabled (as runtime PM has not yet been enabled).
Fixes: 1889dd208197 ("spi: cadence-quadspi: Fix clock disable on probe failure path") Cc: stable@vger.kernel.org # 6.19 Cc: Anurag Dutta <a-dutta@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421125354.1534871-3-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Tue, 21 Apr 2026 12:53:49 +0000 (14:53 +0200)]
spi: cadence-quadspi: fix runtime pm disable imbalance on probe failure
A recent attempt to fix the probe error handling introduced a runtime PM
disable depth imbalance by incorrectly disabling runtime PM on early
failures (e.g. probe deferral).
Fixes: f18c8cfa4f1a ("spi: cadence-qspi: Fix probe error path and remove") Cc: stable@vger.kernel.org # 7.0 Cc: Miquel Raynal (Schneider Electric) <miquel.raynal@bootlin.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421125354.1534871-2-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Tue, 21 Apr 2026 12:36:14 +0000 (14:36 +0200)]
spi: cadence: rename probe error labels
The "clk_dis_all" error label is not used to disable clocks since commit f64b1600f92e ("spi: spi-cadence: Use helper function
devm_clk_get_enabled()").
Similarly, "remove_ctlr" drops a reference rather than deregisters the
controller.
Johan Hovold [Tue, 21 Apr 2026 12:36:12 +0000 (14:36 +0200)]
spi: cadence: fix unclocked access on unbind
Make sure that the controller is runtime resumed before disabling it
during driver unbind to avoid unclocked register access and unbalanced
clock disable.
Also restore the autosuspend setting.
This issue was flagged by Sashiko when reviewing a controller
deregistration fix.
ASoC: tegra: Remove stale snd-soc-tegra-utils composite module definition
kconfiglint reports two warnings for sound/soc/tegra/Makefile:
M002: composite module 'snd-soc-tegra-utils' defined but not in any obj-*
M008: composite module 'snd-soc-tegra-utils': tegra_asoc_utils.o has no
source file
The composite module definition
`snd-soc-tegra-utils-y += tegra_asoc_utils.o` references a source file that
no longer exists and defines a module that is never included in any obj-*
target.
The tegra_asoc_utils module was originally introduced in commit a3cd50deef7b ("ASoC: Tegra: Move utilities to separate module") by Stephen
Warren in 2011 to provide shared clock/rate utility functions for Tegra
machine drivers. At that time, the Makefile had both the composite
definition (`snd-soc-tegra-utils-objs`) and the build target
(`obj-$(CONFIG_SND_TEGRA_SOC) += snd-soc-tegra-utils.o`).
In 2021,
commit 8c1b3b159300 ("ASoC: tegra: Squash utils into common machine
driver")
by Dmitry Osipenko merged tegra_asoc_utils.c into tegra_asoc_machine.c,
deleting both the .c and .h files. That commit correctly removed the obj-*
build target line but overlooked the composite module definition line
(`snd-soc-tegra-utils-objs += tegra_asoc_utils.o`).
The orphaned line persisted unnoticed and was even mechanically updated in
2024 by
commit 51a50d6ad727 ("ASoC: tegra: Use *-y instead of *-objs in
Makefile")
by Takashi Iwai, which converted it from `-objs` to `-y` syntax as part of
a treewide cleanup — inadvertently refreshing a stale definition.
Remove the orphaned composite module definition since it serves no purpose:
the source file was deleted, the obj-* target was already removed, and the
functionality now lives in tegra_asoc_machine.c.
Aaron Ma [Thu, 23 Apr 2026 10:13:38 +0000 (18:13 +0800)]
ASoC: rt722-sdca: add FU06 Playback Switch for speaker mute control
The rt722-sdca codec driver exposes FU06 Playback Volume but no
corresponding mute switch. Without a user-facing ALSA switch, UCM
cannot attach the speaker mute LED via snd_ctl_led, and PipeWire
cannot drive hardware mute.
MT6360 PMIC provides 2 buck and 6 ldo regulators, that have each one a
separate supply.
Currently, the supplies for the ldo regulators are described in the
dt-bindings but the ones for the buck regulators are not.
James Calligeros [Sat, 25 Apr 2026 00:44:05 +0000 (10:44 +1000)]
ASoC: tas2770: Fix order of operations for temperature calculation
The order of operations to derive the temperature from the temp
register values was wrong, since 1000 / 16 is not an integer. This
resulted in the calculated temperature value deviating from the
value represented by the registers slightly, which was most obvious
when the registers were zeroed (-92.265 *C vs the expected -93.000 *C).
Scale the reading before dividing the whole thing by 16 to correct
this.
John Madieu [Sat, 25 Apr 2026 09:29:34 +0000 (09:29 +0000)]
spi: rockchip: Read ISR, not IMR, to detect cs-inactive IRQ
rockchip_spi_isr() decides whether the current interrupt was the
cs-inactive event by reading IMR:
if (rs->cs_inactive &&
readl_relaxed(rs->regs + ROCKCHIP_SPI_IMR) & INT_CS_INACTIVE)
ctlr->target_abort(ctlr);
IMR is the interrupt mask register: it tells which sources are enabled,
not which one fired. In the PIO path, rockchip_spi_prepare_irq() enables
both INT_RF_FULL and INT_CS_INACTIVE in IMR when rs->cs_inactive is true:
if (rs->cs_inactive)
writel_relaxed(INT_RF_FULL | INT_CS_INACTIVE,
rs->regs + ROCKCHIP_SPI_IMR);
so the IMR check is always true once cs_inactive is enabled, and every
PIO interrupt - including normal RF_FULL completions - is dispatched to
ctlr->target_abort(), aborting the transfer. The bug is reachable on
ROCKCHIP_SPI_VER2_TYPE2 in target mode with a DMA-capable controller
when the transfer is short enough to fall back to PIO
(rockchip_spi_can_dma() returns false below fifo_len).
Read ISR (which is RISR masked by IMR) so the check actually reflects
which interrupt fired, and parenthesise the expression for clarity while
at it.
Marek Vasut [Sat, 11 Apr 2026 13:05:12 +0000 (15:05 +0200)]
ASoC: tlv320aic3x: Add multi endpoint support
Support multiple endpoints on TLV320AIC3xxx codec port when
used in of_graph context.
This patch allows to share the codec port between two CPU DAIs.
Custom STM32MP255C board uses TLV320AIC3104 audio codec. This codec
is connected to two serial audio interfaces, which are configured
either as rx or tx.
However, when the audio graph card parses the codec nodes, it expects
to find DAI interface indexes matching the endpoints indexes.
The current patch forces the use of DAI id 0 for both endpoints,
which allows to share the codec DAI between the two CPU DAIs
for playback and capture streams respectively.
An empty adr_link is expected to terminate the
for (adr_link = mach_params->links; adr_link->num_adr; adr_link++) loop.
Allocate link_num + 1 links to add an empty adr_link.
Fixes: 5226d19d4cae5 ("ASoC: SOF: Intel: use sof_sdw as default SDW machine driver") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20260424105031.114053-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
Commit 671dd2ffbd8b ("ASoC: amd: acp: Add new cpu dai and dailink creation for I2S BT instance")
introduced a change that "broke" Steam Deck's audio probe, in the OLED
model, as observed in the following dmesg snippet:
[...]
snd_sof_amd_vangogh 0000:04:00.5: Topology: ABI 3:26:0 Kernel ABI 3:23:1
sof_mach nau8821-max: ASoC: physical link acp-bt-codec (id 2) not exist
sof_mach nau8821-max: ASoC: topology: could not load header: -22
snd_sof_amd_vangogh 0000:04:00.5: tplg amd/sof-tplg/sof-vangogh-nau8821-max.tplg component load failed -22
snd_sof_amd_vangogh 0000:04:00.5: error: failed to load DSP topology -22
snd_sof_amd_vangogh 0000:04:00.5: ASoC error (-22): at snd_soc_component_probe() on 0000:04:00.5
sof_mach nau8821-max: ASoC: failed to instantiate card -22
sof_mach nau8821-max: error -EINVAL: Failed to register card(sof-nau8821-max)
sof_mach nau8821-max: probe with driver sof_mach failed with error -22
[...]
Notice the quotes in "broke": it's not really a bug in such commit,
but instead a problem with a topology file from Steam Deck OLED. This
was discussed to great extent in [1], and Cristian proposed a pretty
simple and functional change that resolved the issue for the Deck's
issue. That change, though, would break other devices, so it wasn't
accepted upstream. And the proper suggested solution (fix the topology)
was never implemented, so Valve's kernel (and anyone that wants to boot
the mainline on Steam Deck OLED) is carrying that fix downstream.
So, we propose hereby a different approach: a DMI quirk, as many already
present in the sound drivers, to address this issue solely on Steam Deck
OLED, not breaking other devices and as a bonus, allowing simple patch
up in case eventually the topology file gets fixed (we'd just need to
check against any DMI info reflecting that or the topology/FW versions).
The motivation of such upstream quirk is related to users that want
to test latest kernel trees on their devices and get no only non-working
sound device, but seems some games (like Ori and the Blind Forest)
can't properly work without a proper functional audio device.
Example of such report can be seen at [2].
ASoC: atmel: Change manual bitfield manipulation to use FIELD_PREP()
Co-developed-by: Carlos Alberto Marques Rabelo <carlos.albmr@usp.br> Signed-off-by: Carlos Alberto Marques Rabelo <carlos.albmr@usp.br> Signed-off-by: Gabriel Jacob Perin <gabrieljp@usp.br> Link: https://patch.msgid.link/20260421193113.1060213-1-gabrieljp@usp.br Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Wed, 22 Apr 2026 20:34:05 +0000 (21:34 +0100)]
ASoC: ops: Log unknown controls in snd_soc_limit_volume()
When we fail to look up the control name in snd_soc_limit_volume() we don't
log anything, the error code isn't particularly descriptive and checking
the return value of the function at all is a bit erratic among the callers.
Since there is no reason why anyone should ever be attempting to limit the
volume of a nonexistant control add a log message in the core to improve
usability.
João [Mon, 20 Apr 2026 20:32:03 +0000 (17:32 -0300)]
ASoC: mchp-spdifrx: Replace manual bitfield manipulations with macros and typo correction
Replace manual bitfield manipulations with FIELD_GET() and
FIELD_PREP() in order to improve code readability, security
and manageability. Also correcting GENAMSK typo for GENMASK.
Signed-off-by: João Marinho <joao.bcc@usp.br> Co-developed-by: Micael Vinicius <micael0208@usp.br> Signed-off-by: Micael Vinicius <micael0208@usp.br> Link: https://patch.msgid.link/20260420203218.15060-1-joao.bcc@usp.br Signed-off-by: Mark Brown <broonie@kernel.org>
driver core: Replace dev->offline + ->offline_disabled with accessors
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "offline" and "offline_disabled" over
to the "flags" field so modifications are safe.
Cc: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.9.I897d478b4a9361d79cd5073207c1062fd4d0d0e4@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Johan Hovold [Thu, 23 Apr 2026 07:58:01 +0000 (09:58 +0200)]
spi: mpc52xx: clean up interrupt handling
The driver is relying on the assumption that the invalid interrupt 0 can
be freed without any side effects, but that is not the case on
architectures like x86 where it would trigger a warning about freeing an
already free interrupt.
This should not cause any trouble on powerpc where this driver is used,
but make the code more portable (and obviously correct) by making sure
that the interrupts have been requested before freeing them.
driver core: Replace dev->of_node_reused with dev_of_node_reused()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "of_node_reused" over to the "flags"
field so modifications are safe.
Cc: Johan Hovold <johan@kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Johan Hovold <johan@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org> # PCI_PWRCTRL Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20260406162231.v5.8.I806b8636cd3724f6cd1f5e199318ab8694472d90@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Shengjiu Wang [Mon, 20 Apr 2026 08:53:44 +0000 (16:53 +0800)]
ASoC: fsl_micfil: Add DC output remover control
Add support for the DC output remover feature available on i.MX93 and
newer platforms. This allows users to configure the output DC removal
filter with cut-off frequencies of 20Hz, 13.3Hz, 40Hz, or bypass it
entirely.
The control is exposed as an ALSA mixer control and defaults to bypass
mode. It is only available on platforms with the use_verid flag set
(i.MX93+).
John Madieu [Sat, 25 Apr 2026 02:47:25 +0000 (02:47 +0000)]
spi: rzv2h-rspi: Fix silent failure in clock setup error path
rzv2h_rspi_setup_clock() is declared to return u32 but returns -EINVAL
when no valid clock parameters are found. Cast to u32, -EINVAL becomes
0xffffffea, which is a non-zero value. The caller in
rzv2h_rspi_prepare_message() guards against failure with:
rspi->freq = rzv2h_rspi_setup_clock(rspi, speed_hz);
if (!rspi->freq)
return -EINVAL;
Because 0xffffffea is non-zero, the check is bypassed and the controller
proceeds to program SPBR/SPCMD with stale values, leading to an unknown
bit rate.
Return 0 on the failed-search path, consistent with the existing
clk_set_rate() failure path which already returns 0.
Fixes: 77d931584dd3 ("spi: rzv2h-rspi: make transfer clock rate finding chip-specific") Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com> Link: https://patch.msgid.link/20260425024725.2393632-1-john.madieu.xa@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
driver core: Replace dev->dma_coherent with dev_dma_coherent()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "dma_coherent" over to the "flags"
field so modifications are safe.
Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Vinod Koul <vkoul@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.7.If839f6dde98979fce177f70c6c74689a1904ee76@changeid
[ Since all DEV_FLAG_DMA_COHERENT accessors are exposed unconditionally,
also drop the CONFIG guards around dev_assign_dma_coherent() in
device_initialize() to ensure a correct default value. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
driver core: Replace dev->state_synced with dev_state_synced()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "state_synced" over to the "flags"
field so modifications are safe.
Cc: Saravana Kannan <saravanak@kernel.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.6.Idb4818e1159fef104c7756bfd6e7ba8f374bebcd@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
driver core: Replace dev->dma_ops_bypass with dev_dma_ops_bypass()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "dma_ops_bypass" over to the "flags"
field so modifications are safe.
Cc: Christoph Hellwig <hch@lst.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.5.If62b84471ef2c85e7ad250f0468867d6dba965ab@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
driver core: Replace dev->dma_skip_sync with dev_dma_skip_sync()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "dma_skip_sync" over to the "flags"
field so modifications are safe.
Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.4.Icf072aa4184dd86a88fa8ca195b09d1651984000@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
driver core: Replace dev->dma_iommu with dev_dma_iommu()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "dma_iommu" over to the "flags" field
so modifications are safe.
Cc: Leon Romanovsky <leon@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.3.Id20d5973cbff542fea290e13177e9423f5d81342@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
driver core: Replace dev->can_match with dev_can_match()
In C, bitfields are not necessarily safe to modify from multiple
threads without locking. Switch "can_match" over to the "flags" field
so modifications are safe.
Cc: Saravana Kannan <saravanak@kernel.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.2.I54b3ae6311ff34ad30227659d91bb109911a4aea@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Petr Hodina [Sun, 26 Apr 2026 20:57:41 +0000 (13:57 -0700)]
Input: stmfts - add optional reset GPIO support
Add support for an optional "reset-gpios" property. If present, the
driver drives the reset line high at probe time and releases it during
power-on, after the regulators have been enabled.
Signed-off-by: Petr Hodina <petr.hodina@protonmail.com> Co-developed-by: David Heidelberg <david@ixit.cz> Signed-off-by: David Heidelberg <david@ixit.cz> Link: https://patch.msgid.link/20260409-stmfts5-v4-8-64fe62027db5@ixit.cz Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
David Heidelberg [Sun, 26 Apr 2026 20:01:31 +0000 (13:01 -0700)]
Input: stmfts - fix the MODULE_LICENSE() string
Replace the bogus "GPL v2" with "GPL" as MODULE_LICNSE() string. The
value does not declare the module's exact license, but only lets the
module loader test whether the module is Free Software or not.
See commit bf7fbeeae6db ("module: Cure the MODULE_LICENSE "GPL" vs.
"GPL v2" bogosity") in the details of the issue. The fix is to use
"GPL" for all modules under any variant of the GPL.
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One more fix for the merge window to avoid a boot hang on
Raspberry Pi 3B by marking the VEC clk critical so that it
doesn't get turned off and hang the bus"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: bcm: rpi: Mark VEC clock as CLK_IGNORE_UNUSED
Merge tag 'tsm-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull PCIe TSP update from Dan Williams:
"A small update for the TSM core. It is arguably a fix and coming in
late as I have been offline the past few weeks:
- Drop class_create() for the 'tsm' class"
* tag 'tsm-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
virt: coco: change tsm_class to a const struct
Hamza Mahfooz [Sat, 25 Apr 2026 18:17:19 +0000 (11:17 -0700)]
drm/hyperv: use VMBUS_RING_SIZE()
VMBUS ring buffers must be page aligned. So, use VMBUS_RING_SIZE() to
ensure they are always aligned and large enough to hold all of the
relevant data.
Hamza Mahfooz [Wed, 25 Feb 2026 17:57:08 +0000 (12:57 -0500)]
drm/edid: add CTA Video Format Data Block support
Video Format Data Blocks (VFDBs) contain the necessary information that
needs to be fed to the Optimized Video Timings (OVT) Algorithm.
Also, we require OVT support to cover modes that aren't supported by
earlier standards (e.g. CVT). So, parse all of the relevant VFDB data
and feed it to the OVT Algorithm, to extract all of the missing OVT
modes.
Merge tag 'kbuild-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nicolas Schier:
- builddeb - avoid recompiles for non-cross-compiles
Avoid triggering complete rebuilds for non-cross-compile Debian
package builds by only triggering the rebuild of host tools for
actual cross-compile builds
- Never respect CONFIG_WERROR / W=e to fixdep
Avoid spurious rebuilds of fixdep w/ and w/o -Werror during a single
kbuild invocation by never respecting CONFIG_WERROR for fixdep
* tag 'kbuild-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Never respect CONFIG_WERROR / W=e to fixdep
kbuild: builddeb - avoid recompiles for non-cross-compiles
Merge tag 'power-utilities-2026.04.25' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull power utility updates from Len Brown:
"x86_energy_perf_policy:
- Initial SoC Slider support
turbostat:
- Display HT siblings in cpu# order
- Add Module-ID column
- Print Core-ID and APIC-ID in hex
- Fix misc bugs"
* tag 'power-utilities-2026.04.25' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power x86_energy_perf_policy: Version 2026.04.25
tools/power x86_energy_perf_policy.8: Document SoC Slider Options
tools/power x86_energy_perf_policy: Enhances SoC Slider related checks
tools/power turbostat: v2026.04.21
tools/power turbostat: Process HT siblings in CPU order
tools/power turbostat: Show module_id column
tools/power turbostat: Print core_id and apic_id in hex
tools/power turbostat: Cleanup print helper functions
tools/power turbostat: Fix --cpu-set 1 regression on HT systems
tools/power turbostat: Fix --cpu-set 0 regression on HT systems
tools/power turbostat: Fix unrecognized option '-P'
tools/power turbostat: Fix AMD RAPL regression on big systems
tools/power/x86: Add SOC slider and platform profile support
Merge tag 'rtc-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- add data_race() in rtc_dev_poll()
Drivers:
- remove i2c_match_id usage
- abx80x: Disable alarm feature if no interrupt attached
- ti-k3: support resuming from IO DDR low power mode"
* tag 'rtc-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: abx80x: Disable alarm feature if no interrupt attached
rtc: ntxec: fix OF node reference imbalance
rtc: pic32: allow driver to be compiled with COMPILE_TEST
rtc: ti-k3: Add support to resume from IO DDR low power mode
rtc: cmos: Use platform_get_irq_optional() in cmos_platform_probe()
dt-bindings: rtc: add olpc,xo1-rtc to trivial-rtc
dt-bindings: rtc: sc2731: Add compatible for SC2730
rtc: add data_race() in rtc_dev_poll()
rtc: armada38x: zalloc + calloc to single allocation
dt-bindings: rtc: isl12026: convert to YAML schema
dt-bindings: rtc: microcrystal,rv3028: Allow to specify vdd-supply
rtc: max77686: convert to i2c_new_ancillary_device
dt-bindings: rtc: mpfs-rtc: permit resets
rtc: rx8025: Remove use of i2c_match_id()
rtc: rv8803: Remove use of i2c_match_id()
rtc: rs5c372: Remove use of i2c_match_id()
rtc: pcf2127: Remove use of i2c_match_id()
rtc: m41t80: Remove use of i2c_match_id()
rtc: abx80x: Remove use of i2c_match_id()
Merge tag 'for-next-tpm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"Here are the accumulated fixes for 7.1-rc1 and a single structural
change worth mentioning separately: Rafael's commit converting tpm_crb
from ACPI driver to a platform driver"
* tag 'for-next-tpm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: tpm_tis: stop transmit if retries are exhausted
tpm: tpm_tis: add error logging for data transfer
tpm: avoid -Wunused-but-set-variable
tpm: Use kfree_sensitive() to free auth session in tpm_dev_release()
tpm2-sessions: Fix missing tpm_buf_destroy() in tpm2_read_public()
tpm: Fix auth session leak in tpm2_get_random() error path
tpm: i2c: atmel: fix block comment formatting
tpm_crb: Convert ACPI driver to a platform one
tpm: Make tcpci_pm_ops variable static const
Len Brown [Sat, 25 Apr 2026 17:26:16 +0000 (13:26 -0400)]
tools/power x86_energy_perf_policy: Version 2026.04.25
Since v2025.11.22:
Initial SoC Slider support
SoC Slider is an SoC-wide power/performance policy setting.
On SoC Slider systems, EPP plays a diminished role.
Len Brown [Wed, 15 Apr 2026 19:12:29 +0000 (15:12 -0400)]
tools/power x86_energy_perf_policy: Enhances SoC Slider related checks
When processor_thermal_soc_slider is loaded, its slider
and offset modparams are visible. Check that the driver
actually registered the profile named "SoC Slider" before
reading or writing these modparams.
n.b. This utility allows writing the Slider and Offset modparams
even if the driver policy is not "balanced". Currently the
processor_thermal_soc_slider consults those modparams
only in "balanced" mode.
clk: bcm: rpi: Mark VEC clock as CLK_IGNORE_UNUSED
On Raspberry Pi 3B, the VEC clock is used by the VideoCore firmware
display driver, which remains active until the vc4 driver loads and
sends NOTIFY_DISPLAY_DONE. If this clock is disabled during boot, a bus
lockup happens and the firmware becomes unresponsive, causing a complete
system lockup.
Mark the VEC clock with CLK_IGNORE_UNUSED so it survives the unused
clock disablement and remains available until the vc4 driver takes over
display management.
Fixes: 672299736af6 ("clk: bcm: rpi: Manage clock rate in prepare/unprepare callbacks") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/r/5f0bec08-f458-4fba-8bf3-06817a100c4c@sirena.org.uk Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patch.msgid.link/20260401111416.562279-2-mcanal@igalia.com Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Brian Masney <bmasney@redhat.com> # Active contributor to clk Reviewed-by: Stefan Wahren <wahrenst@gmx.net> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM updates from Russell King:
- fix a race condition handling PG_dcache_clean
- further cleanups for the fault handling, allowing RT to be enabled
- fixing nzones validation in adfs filesystem driver
- fix for module unwinding
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9463/1: Allow to enable RT
ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache()
ARM: 9471/1: module: fix unwind section relocation out of range error
fs/adfs: validate nzones in adfs_validate_bblk()
ARM: provide individual is_translation_fault() and is_permission_fault()
ARM: move FSR fault status definitions before fsr_fs()
ARM: use BIT() and GENMASK() for fault status register fields
ARM: move is_permission_fault() and is_translation_fault() to fault.h
ARM: move vmalloc() lazy-page table population
ARM: ensure interrupts are enabled in __do_user_fault()
====================
bpf: replace min/max fields with struct cnum{32,64}
This RFC replaces s64, u64, s32, u32 scalar range domains tracked by
verifier by a pair of circular numbers (cnums): one for 64-bit domain
and another for 32-bit domain. Each cnum represents a range as a
single arc on the circular number line, from which signed and unsigned
bounds are derived on demand. See also wrapped intervals
representation as in [1].
The use of such representation simplifies arithmetic and conditions
handling in verifier.c and allows to express 32 <-> 64 bit deductions
in a more mathematically rigorous way.
Changelog
=========
v2 -> v1:
- Fixes in source code comments highlighted by bot+bpf-ci.
RFCv1 -> v2:
- Dropped RFC tag.
- Dropped cnum{32,64}_mul(), too much complexity and no veristat
or selftests gains.
- cnum32_from_cnum64() normalizes to CNUM32_EMPTY when input
cnum64.size >= U32_MAX, previously only checked for > U32_MAX
(bot+bpf-ci).
- cnum32_from_cnum64() and cnum64_cnum32_intersect() now check for
empty inputs (sashiko).
- In regs_refine_cond_op() case BPF_JSLT use cnum{32,64}_from_srange
constructors instead of unsigned variants (sashiko).
- cnum{32,64}_intersect_with{,_urange,_srange}() helpers added
(Alexei)
Patch #1 introduces the cnum (circular number) data type and basic
operations: intersection, addition, multiplication, negation, 32-to-64
and 64-to-32 projections.
CBMC based proofs for these operations are available at [2].
(The proofs lag behind the current patch set and need an update if
this RFC moves further. Although, I don't expect any significant
changes).
[2] https://github.com/eddyz87/cnum-verif
Patch #2 mechanically converts all direct accesses to bpf_reg_state's
min/max fields to use accessor functions, preparing for the field
replacement.
Patch #3 replaces the eight independent min/max fields in
bpf_reg_state with two cnum fields. The signed and unsigned bounds are
derived on demand from the cnum via the accessor functions.
This eliminates the separate signed<->unsigned cross-deduction logic,
simplifies reg_bounds_sync(), regs_refine_cond_op() and some
arithmetic operations.
Patch #4 adds selftest cases for improved 32->64 range refinements
enabled by cnum64_cnum32_intersect().
Precision trade-offs
====================
A cnum represents a contiguous arc on the circular number line.
Current master tracks four independent ranges (s64, u64, s32, u32)
whose intersection could implicitly represent value sets that are
two disjoint arcs on the circle - something a single cnum cannot.
Cnums lose precision when a cross-domain conditional comparison
(e.g., unsigned comparison on a register with a signed-derived range)
produces an intersection that would be two disjoint arcs.
The cnum must over-approximate by returning one of the input arcs.
E.g. a pair of ranges u64:[1,U64_MAX] s:[-1,1] represents a set of two
values: {-1,1}, this can only be approximated as a set of three values
by cnum: {-1,0,1}.
Cnums gain precision in arithmetic: when addition, subtraction or
multiplication causes register value to cross both signed and unsigned
min/max boundaries. Current master resets the register as unbound in
such cases, while cnums preserve the exact wrapping arc.
In practice precision loss turned out to matter only for a set of
specifically crafted selftests from reg_bounds.c (see patch #3 for
more details). On the other hand, precision gains are observable for a
set real-life programs.
Tested on 6683 programs across four corpora: Cilium (134 programs),
selftests (4635), sched_ext (376), and Meta internal BPF corpus
(1540). Program verification verdicts are identical across all four
program sets - no program changes between success and failure.
Instruction count comparison (cnum vs signed/unsigned pair):
- 83 programs improve, 44 regress, 6596 unchanged
- Total saved: ~290K insns, total added: 10K insns
- Largest improvements:
-26% insns in a internal network firewall program;
-47% insns in rusty/wd40 sched_ext schedulers.
- Largest regression: +24% insns in one Cilium program (5062 insns
added).
Raw performance data is at the bottom of the email.
Raw verification performance metrics
====================================
Filtered out by -f insns_pct>5 -f !insns<10000
========= selftests: master vs cnums-everywhere-v2 =========
Eduard Zingerman [Fri, 24 Apr 2026 22:52:45 +0000 (15:52 -0700)]
selftests/bpf: new cases handled by 32->64 range refinements
1. 32-bit range starts before 64-bit range's low bits in each block,
causing intersection to skip entire blocks.
2. 32-bit range crosses the U32_MAX/0 boundary, represented as s32
range crossing sign boundary.
Eduard Zingerman [Fri, 24 Apr 2026 22:52:44 +0000 (15:52 -0700)]
bpf: replace min/max fields with struct cnum{32,64}
Replace eight independent s64, u64, s32, u32 min/max fields in
bpf_reg_state with two circular number fields:
- cnum64 for a unified signed/unsigned 64-bit range tracking;
- cnum32 for a unified signed/unsigned 32-bit range tracking.
Each cnum represents a range as a single arc on the circular number
line (base + size), from which signed and unsigned bounds are derived
on demand via accessor functions introduced in the preceding commit.
Notable changes:
- Signed<->unsigned deductions in __reg_deduce_bounds() are removed.
- 64<->32 bit deductions are replaced with:
- reg->r32 = cnum32_intersect(reg->r32, cnum32_from_cnum64(reg->r64));
this is functionally equivalent to the old code.
- reg->r64 = cnum64_cnum32_intersect(reg->r64, reg->r32);
this handles a few additional cases, see commit message for
"bpf: representation and basic operations on circular numbers".
- regs_refine_cond_op() now computes results in terms of operations on
sets, e.g. for JNE:
/* Complement of the range [val, val] as cnum64. */
lo = (struct cnum64){ val + 1, U64_MAX - 1 };
reg1->r64 = cnum64_intersect(reg1->r64, lo);
- For add, sub operations on scalars replace explicit bounds
computations with cnum{32,64}_{add,negate}.
- For add, sub operations on pointers deduplicate with arithmetic
operations on scalars and use cnum{32,64}_{add,negate}.
- For and, or, xor operations on scalars remove explicit signed bounds
computations.
- range_bounds_violation() reduces to checking cnum_is_empty().
- const_tnum_range_mismatch() reduces to checking cnum_is_const().
Selftest adjustments: a few existing tests are updated because a
single cnum arc cannot always represent what the old system expressed
as the intersection of independent signed and unsigned ranges.
For example, if the old system tracked u64=[0, U64_MAX-U32_MAX+2] and
s64=[S64_MIN+2, 2] independently, their intersection is a tight
two-point set. A single cnum must pick the shorter arc, losing the
other constraint. These cases are documented with comments in the
adjusted tests.
reg_bounds.c is updated with logic similar to
cnum64_cnum32_intersect(). Instead of using cnums it inspects
intersection between 'b' and first / last / next-after-first /
previous-before-last sub-ranges of 'a'.
reg_bounds.c is also updated to skip test cases that rely
in signed and unsigned ranges intersecting in two intervals,
as such cases are not representable by a single cnum.
The following "crafted" test cases are affected:
- reg_bounds_crafted/(s64)[0xffffffffffff8000; 0x7fff] (u32)<op> [0; 0x1f]
- reg_bounds_crafted/(s64)[0; 0x1f] (u32)<op> [0xffffffffffffff80; 0x7f]
- reg_bounds_crafted/(s64)[0xffffffffffffff80; 0x7f] (u32)<op> [0; 0x1f]
- reg_bounds_crafted/(u64)[0; 1] (s32)<op> [1; 2147483648]
- reg_bounds_crafted/(u64)[1; 2147483648] (s32)<op> [0; 1]
- reg_bounds_crafted/(u64)[0; 0xffffffff00000000] (s64)<op> 0
- reg_bounds_crafted/(u64)0 (s64)<op> [0; 0xffffffff00000000]
- reg_bounds_crafted/(u64)[0; 0xffffffff00000000] (s32)<op> 0
- reg_bounds_crafted/(u64)0 (s32)<op> [0; 0xffffffff00000000]
- reg_bounds_crafted/(s64)[S64_MIN; 0] (u64)<op> S64_MIN
- reg_bounds_crafted/(s64)S64_MIN (u64)<op> [S64_MIN; 0]
- reg_bounds_crafted/(s32)[S32_MIN; 0] (u32)<op> S32_MIN
- reg_bounds_crafted/(s32)S32_MIN (u32)<op> [S32_MIN; 0]
- reg_bounds_crafted/(s64)[0; 0x1f] (u32)<op> [0xffffffff80000000; 0x7fffffff]
- reg_bounds_crafted/(s64)[0xffffffff80000000; 0x7fffffff] (u32)<op> [0; 0x1f]
- reg_bounds_crafted/(s64)[0; 0x1f] (u32)<op> [0xffffffffffff8000; 0x7fff]
As well as some reg_bounds_roand_{consts,ranges}_A_B, where A and B
differ in sign domain.
Eduard Zingerman [Fri, 24 Apr 2026 22:52:43 +0000 (15:52 -0700)]
bpf: use accessor functions for bpf_reg_state min/max fields
Replace direct access to bpf_reg_state->{smin,smax,umin,umax,
s32_min,s32_max,u32_min,u32_max}_value with getter/setter inline
functions, preparing for future switch to cnum-based internal
representation.
Eduard Zingerman [Fri, 24 Apr 2026 22:52:42 +0000 (15:52 -0700)]
bpf: representation and basic operations on circular numbers
This commit adds basic definitions for cnum32/cnum64.
This is a unified numeric range representation for signed and unsigned
domains. Inspired by an old post from Shung-Hsi Yu [1] and paper [2].
Operations correctness is verified using cbmc model checker,
tests source code can be found in a separate repo [3].
The cnum64_cnum32_intersect() function is notable, because it handled
several cases verifier.c:deduce_bounds_64_from_32() does not.
Given:
- a is a 64-bit range
- b is a 32-bit range
- t is a refined 64-bit range, such that ∀ v ∈ a, (u32)v ∈ b: v ∈ t.
cnum64_cnum32_intersect() makes the following deductions:
(A): 'b' is a sub-range of the first or the last 32-bit
sub-range of 'a':
64-bit number axis --->
N*2^32 (N+1)*2^32 (N+2)*2^32 (N+3)*2^32
||------|---|=====|-------||----------|=====|-------||----------|=====|----|--||
| |< b >| |< b >| |< b >| |
| | | |
|<--+--------------------------- a ---------------------------+--->|
| |
|<-------------------------- t -------------------------->|
(B) 'b' does not intersect with the first of the last 32-bit
sub-range of 'a':
N*2^32 (N+1)*2^32 (N+2)*2^32 (N+3)*2^32
||--|=====|----|----------||--|=====|---------------||--|=====|------------|--||
|< b >| | |< b >| |< b >| |
| | | |
|<-------------+--------- a -------------------|----------->|
| |
|<-------- t ------------------>|
sched_ext: Release cpus_read_lock on scx_link_sched() failure in root enable
scx_root_enable_workfn() takes cpus_read_lock() before
scx_link_sched(sch), but the `if (ret) goto err_disable` on failure
skips the matching cpus_read_unlock() - all other err_disable gotos
along this path drop the lock first.
scx_link_sched() only returns non-zero on the sub-sched path
(parent != NULL), so the leak path is unreachable via the root
caller today. Still, the unwind is out of line with the surrounding
paths.
Drop cpus_read_lock() before goto err_disable.
v2: Correct Fixes: tag (Andrea Righi).
Fixes: 25037af712eb ("sched_ext: Add rhashtable lookup for sub-schedulers") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org>
sched_ext: Reject NULL-sch callers in scx_bpf_task_set_slice/dsq_vtime
scx_prog_sched(aux) returns NULL for TRACING / SYSCALL BPF progs that
have no struct_ops association when the root scheduler has sub_attach
set. scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() pass
that NULL into scx_task_on_sched(sch, p), which under
CONFIG_EXT_SUB_SCHED is rcu_access_pointer(p->scx.sched) == sch. For
any non-scx task p->scx.sched is NULL, so NULL == NULL returns true
and the authority gate is bypassed - a privileged but
non-struct_ops-associated prog can poke p->scx.slice /
p->scx.dsq_vtime on arbitrary tasks.
Reject !sch up front so the gate only admits callers with a resolved
scheduler.
Fixes: 245d09c594ea ("sched_ext: Enforce scheduler ownership when updating slice and dsq_vtime") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
select_cpu_from_kfunc() skipped pi_lock for @p when called from
ops.select_cpu() or another rq-locked SCX op, assuming the held lock
protects @p. scx_bpf_select_cpu_dfl() / __scx_bpf_select_cpu_and() accept an
arbitrary KF_RCU task_struct, so a caller in e.g. ops.select_cpu(p1) or
ops.enqueue(p1) can pass some other p2 - the held pi_lock / rq lock is p1's,
not p2's - and reading p2->cpus_ptr / nr_cpus_allowed races with
set_cpus_allowed_ptr() and migrate_disable_switch() on another CPU.
Abort the scheduler on cross-task calls in both branches: for
ops.select_cpu() use scx_kf_arg_task_ok() to verify @p is the wake-up
task recorded in current->scx.kf_tasks[] by SCX_CALL_OP_TASK_RET();
for other rq-locked SCX ops compare task_rq(p) against scx_locked_rq().
v2: Switch the in_select_cpu cross-task check from direct_dispatch_task
comparison to scx_kf_arg_task_ok(). The former spuriously rejects when
ops.select_cpu() calls scx_bpf_dsq_insert() first, then calls
scx_bpf_select_cpu_*() on the same task. (Andrea Righi)
Fixes: 0022b328504d ("sched_ext: Decouple kfunc unlocked-context check from kf_mask") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andrea Righi <arighi@nvidia.com>
sched_ext: Align cgroup #ifdef guards with SUB_SCHED vs GROUP_SCHED
Two EXT_GROUP_SCHED/SUB_SCHED guards are misclassified:
- scx_root_enable_workfn()'s cgroup_get(cgrp) and the err_put_cgrp unwind
in scx_alloc_and_add_sched() are under `#if GROUP || SUB`, but the
matching cgroup_put() in scx_sched_free_rcu_work() is inside `#ifdef SUB`
only (via sch->cgrp, stored only under SUB). GROUP-only would leak a
reference on every root-sched enable.
- sch_cgroup() / set_cgroup_sched() live under `#if GROUP || SUB` but touch
SUB-only fields (sch->cgrp, cgroup->scx_sched). GROUP-only wouldn't
compile.
GROUP needs CGROUP_SCHED; SUB needs only CGROUPS. CGROUPS=y/CGROUP_SCHED=n
gives the reachable GROUP=n, SUB=y combination; GROUP=y, SUB=n isn't
reachable today (SUB is def_bool y under CGROUPS). Neither miscategorization
triggers a real bug in any reachable config, but keep the guards honest:
- Narrow cgroup_get and err_put_cgrp to `#ifdef SUB` (matches the free-side
put).
- Move sch_cgroup() and set_cgroup_sched() to a separate `#ifdef SUB` block
with no-op stubs for the !SUB case; keep root_cgroup() and scx_cgroup_{
lock,unlock}() under `#if GROUP || SUB` since those only need cgroup core.
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
scx_bypass_lb_{donee,resched}_cpumask were file-scope statics shared by all
scheduler instances. With CONFIG_EXT_SUB_SCHED, multiple sched instances
each arm their own bypass_lb_timer; concurrent bypass_lb_node() calls RMW
the global cpumasks with no lock, corrupting donee/resched decisions.
Move the cpumasks into struct scx_sched, allocate them alongside the timer
in scx_alloc_and_add_sched(), free them in scx_sched_free_rcu_work().
Fixes: 95d1df610cdc ("sched_ext: Implement load balancer for bypass mode") Cc: stable@vger.kernel.org # v6.19+ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Pass held rq to SCX_CALL_OP() for core_sched_before
scx_prio_less() runs from core-sched's pick_next_task() path with rq
locked but invokes ops.core_sched_before() with NULL locked_rq, leaving
scx_locked_rq_state NULL. If the BPF callback calls a kfunc that
re-acquires rq based on scx_locked_rq() - e.g. scx_bpf_cpuperf_set(cpu)
- it re-acquires the already-held rq.
sched_ext: Pass held rq to SCX_CALL_OP() for dump_cpu/dump_task
scx_dump_state() walks CPUs with rq_lock_irqsave() held and invokes
ops.dump_cpu / ops.dump_task with NULL locked_rq, leaving
scx_locked_rq_state NULL. If the BPF callback calls a kfunc that
re-acquires rq based on scx_locked_rq() - e.g. scx_bpf_cpuperf_set(cpu)
- it re-acquires the already-held rq.
Pass the held rq to SCX_CALL_OP(). Thread it into scx_dump_task() too.
The pre-loop ops.dump call runs before rq_lock_irqsave() so keeps
rq=NULL.
Fixes: 07814a9439a3 ("sched_ext: Print debug dump after an error exit") Cc: stable@vger.kernel.org # v6.12+ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Save and restore scx_locked_rq across SCX_CALL_OP
SCX_CALL_OP{,_RET}() unconditionally clears scx_locked_rq_state to NULL on
exit. Correct at the top level, but ops can recurse via
scx_bpf_sub_dispatch(): a parent's ops.dispatch calls the helper, which
invokes the child's ops.dispatch under another SCX_CALL_OP. When the inner
call returns, the NULL clobbers the outer's state. The parent's BPF then
calls kfuncs like scx_bpf_cpuperf_set() which read scx_locked_rq()==NULL and
re-acquire the already-held rq.
Snapshot scx_locked_rq_state on entry and restore on exit. Rename the rq
parameter to locked_rq across all SCX_CALL_OP* macros so the snapshot local
can be typed as 'struct rq *' without colliding with the parameter token in
the expansion. SCX_CALL_OP_TASK{,_RET}() and SCX_CALL_OP_2TASKS_RET() funnel
through the two base macros and inherit the fix.
Fixes: 4f8b122848db ("sched_ext: Add basic building blocks for nested sub-scheduler dispatching") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Use dsq->first_task instead of list_empty() in dispatch_enqueue() FIFO-tail
dispatch_enqueue()'s FIFO-tail path used list_empty(&dsq->list) to decide
whether to set dsq->first_task on enqueue. dsq->list can contain parked BPF
iterator cursors (SCX_DSQ_LNODE_ITER_CURSOR), so list_empty() is not a
reliable "no real task" check. If the last real task is unlinked while a
cursor is parked, first_task becomes NULL; the next FIFO-tail enqueue then
sees list_empty() == false and skips the first_task update, leaving
scx_bpf_dsq_peek() returning NULL for a non-empty DSQ.
Test dsq->first_task directly, which already tracks only real tasks and is
maintained under dsq->lock.
Fixes: 44f5c8ec5b9a ("sched_ext: Add lockless peek operation for DSQs") Cc: stable@vger.kernel.org # v6.19+ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com> Cc: Ryan Newton <newton@meta.com>
sched_ext: Resolve caller's scheduler in scx_bpf_destroy_dsq() / scx_bpf_dsq_nr_queued()
scx_bpf_create_dsq() resolves the calling scheduler via scx_prog_sched(aux)
and inserts the new DSQ into that scheduler's dsq_hash. Its inverse
scx_bpf_destroy_dsq() and the query helper scx_bpf_dsq_nr_queued() were
hard-coded to rcu_dereference(scx_root), so a sub-scheduler could only
destroy or query DSQs in the root scheduler's hash - never its own. If the
root had a DSQ with the same id, the sub-sched silently destroyed it and the
root aborted on the next dispatch ("invalid DSQ ID 0x0..").
Take a const struct bpf_prog_aux *aux via KF_IMPLICIT_ARGS and resolve the
scheduler with scx_prog_sched(aux), matching scx_bpf_create_dsq().
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Read scx_root under scx_cgroup_ops_rwsem in cgroup setters
scx_group_set_{weight,idle,bandwidth}() cache scx_root before acquiring
scx_cgroup_ops_rwsem, so the pointer can be stale by the time the op runs.
If the loaded scheduler is disabled and freed (via RCU work) and another is
enabled between the naked load and the rwsem acquire, the reader sees
scx_cgroup_enabled=true (the new scheduler's) but dereferences the freed one
- UAF on SCX_HAS_OP(sch, ...) / SCX_CALL_OP(sch, ...).
scx_cgroup_enabled is toggled only under scx_cgroup_ops_rwsem write
(scx_cgroup_{init,exit}), so reading scx_root inside the rwsem read section
correlates @sch with the enabled snapshot.
Fixes: a5bd6ba30b33 ("sched_ext: Use cgroup_lock/unlock() to synchronize against cgroup operations") Cc: stable@vger.kernel.org # v6.18+ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Don't disable tasks in scx_sub_enable_workfn() abort path
scx_sub_enable_workfn()'s prep loop calls __scx_init_task(sch, p, false)
without transitioning task state, then sets SCX_TASK_SUB_INIT. If prep fails
partway, the abort path runs __scx_disable_and_exit_task(sch, p) on the
marked tasks. Task state is still the parent's ENABLED, so that dispatches
to the SCX_TASK_ENABLED arm and calls scx_disable_task(sch, p) - i.e.
child->ops.disable() - for tasks on which child->ops.enable() never ran. A
BPF sub-scheduler allocating per-task state in enable/freeing in disable
would operate on uninitialized state.
The dying-task branch in scx_disable_and_exit_task() has the same problem,
and scx_enabling_sub_sched was cleared before the abort cleanup loop - a
task exiting during cleanup tripped the WARN and skipped both ops.exit_task
and the SCX_TASK_SUB_INIT clear, leaking per-task resources and leaving the
task stuck.
Introduce scx_sub_init_cancel_task() that calls ops.exit_task with
cancelled=true - matching what the top-level init path does when init_task
itself returns -errno. Use it in the abort loop and in the dying-task
branch. scx_enabling_sub_sched now stays set until the abort loop finishes
clearing SUB_INIT, so concurrent exits hitting the dying-task branch can
still find @sch. That branch also clears SCX_TASK_SUB_INIT unconditionally
when seen, leaving the task unmarked even if the WARN fires.
Fixes: 337ec00b1d9c ("sched_ext: Implement cgroup sub-sched enabling and disabling") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()
bypass_lb_cpu() transfers tasks between per-CPU bypass DSQs without
migrating them - task_cpu() only updates when the donee later consumes the
task via move_remote_task_to_local_dsq(). If the LB timer fires again before
consumption and the new DSQ becomes a donor, @p is still on the previous CPU
and task_rq(@p) != donor_rq. @p can't be moved without its own rq locked.
Skip such tasks.
Fixes: 95d1df610cdc ("sched_ext: Implement load balancer for bypass mode") Cc: stable@vger.kernel.org # v6.19+ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext: Guard scx_dsq_move() against NULL kit->dsq after failed iter_new
bpf_iter_scx_dsq_new() clears kit->dsq on failure and
bpf_iter_scx_dsq_{next,destroy}() guard against that. scx_dsq_move() doesn't -
it dereferences kit->dsq immediately, so a BPF program that calls
scx_bpf_dsq_move[_vtime]() after a failed iter_new oopses the kernel.
sched_ext: Unregister sub_kset on scheduler disable
When ops.sub_attach is set, scx_alloc_and_add_sched() creates sub_kset as a
child of &sch->kobj, which pins the parent with its own reference. The
disable paths never call kset_unregister(), so the final kobject_put() in
bpf_scx_unreg() leaves a stale reference and scx_kobj_release() never runs,
leaking the whole struct scx_sched on every load/unload cycle.
Unregister sub_kset in scx_root_disable() and scx_sub_disable() before
kobject_del(&sch->kobj).
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
scx_hardlockup() runs from NMI and eventually calls scx_claim_exit(),
which takes scx_sched_lock. scx_sched_lock isn't NMI-safe and grabbing
it from NMI context can lead to deadlocks.
The hardlockup handler is best-effort recovery and the disable path it
triggers runs off of irq_work anyway. Move the handle_lockup() call into
an irq_work so it runs in IRQ context.
Merge tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fix from Steven Rostedt:
- Fix accounting of persistent ring buffer rewind
On boot up, the head page is moved back to the earliest point of the
saved ring buffer. This is because the ring buffer being read by user
space on a crash may not save the part it read. Rewinding the head
page back to the earliest saved position helps keep those events from
being lost.
The number of events is also read during boot up and displayed in the
stats file in the tracefs directory. It's also used for other
accounting as well. On boot up, the "reader page" is accounted for
but a rewind may put it back into the buffer and then the reader page
may be accounted for again.
Save off the original reader page and skip accounting it when
scanning the pages in the ring buffer.
* tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not double count the reader_page
Merge tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Series for zloop, fixing a variety of issues
- t10-pi code cleanup
- Fix for a merge window regression with the bio memory allocation mask
- Fix for a merge window regression in ublk, caused by an issue with
the maple tree iteration code at teardown
- ublk self tests additions
- Zoned device pgmap fixes
- Various little cleanups and fixes
* tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (21 commits)
Revert "floppy: fix reference leak on platform_device_register() failure"
ublk: avoid unpinning pages under maple tree spinlock
ublk: refactor common helper ublk_shmem_remove_ranges()
ublk: fix maple tree lockdep warning in ublk_buf_cleanup
selftests: ublk: add ublk auto integrity test
selftests: ublk: enable test_integrity_02.sh on fio 3.42
selftests: ublk: remove unused argument to _cleanup
block: only restrict bio allocation gfp mask asked to block
block/blk-throttle: Add WQ_PERCPU to alloc_workqueue users
block: Add WQ_PERCPU to alloc_workqueue users
block: relax pgmap check in bio_add_page for compatible zone device pages
block: add pgmap check to biovec_phys_mergeable
floppy: fix reference leak on platform_device_register() failure
ublk: use unchecked copy helpers for bio page data
t10-pi: reduce ref tag code duplication
zloop: remove irq-safe locking
zloop: factor out zloop_mark_{full,empty} helpers
zloop: set RQF_QUIET when completing requests on deleted devices
zloop: improve the unaligned write pointer warning
zloop: use vfs_truncate
...