Mark Brown [Mon, 11 May 2026 01:04:02 +0000 (10:04 +0900)]
ASoC: sdw_utils: make RT712/RT721 CODEC_MIC be optional
Bard Liao <yung-chuan.liao@linux.intel.com> says:
The RT712 and RT721 codec mic are optional and are not used on some
products. Add a quirk to make it optional and skip the codec mic DAI
when it is not present in DisCo table.
Mac Chiang [Fri, 8 May 2026 09:32:23 +0000 (17:32 +0800)]
ASoC: sdw_utils: Add quirk to ignore RT712 CODEC_MIC
Some devices do not use CODEC_MIC but use the host PCH_DMIC
instead. Add a quirk to skip the CODEC_MIC DAI when it is not present
in disco table, ensuring the correct capture device is used.
If CODEC_MIC is present, it continues to be used as default.
Jang Pyohwan [Sat, 9 May 2026 08:53:10 +0000 (17:53 +0900)]
ASoC: Intel: soc-acpi: add LG Gram 16Z90U RT713 + single RT1320 quirk
Add a SoundWire machine table entry for the LG Gram Pro 2026
(16Z90U-KU7BK), which has an unusual configuration:
sdw:0:1:025d:1320:01 single stereo RT1320 SmartAmp on link 1
sdw:0:3:025d:0713:01 RT713 jack/headset codec on link 3
Existing rt713-rt1320 boards have two RT1320 amps on different links
("link_mask = BIT(1) | BIT(2) | BIT(3)"). The LG Gram uses a single
stereo RT1320 chip, so the new entry uses "link_mask = BIT(1) |
BIT(3)" with the existing rt1320_1_group2_adr structure, leaving the
two-channel routing to the topology.
The RT713 on this board does not expose a SMART_MIC function in
ACPI, so the .machine_check callback used by the existing entries
(snd_soc_acpi_intel_sdca_is_device_rt712_vb) would reject this
board. Drop machine_check for the new entry; speaker output and
the headset jack do not depend on the SMART_MIC presence check.
The corresponding topology source has been submitted to the SOF
project at https://github.com/thesofproject/sof/pull/10760 . The
generated sof-ptl-rt713-l3-rt1320-l1-2ch.tplg and
nhlt-sof-ptl-rt713-l3-rt1320-l1.bin will follow in linux-firmware
once that lands.
Tested on Ubuntu 26.04 with kernel 7.0.0-15: speaker (RT1320
stereo), headphone jack with auto-routing, headset mic, and the
internal NHLT DMIC array all work via the UCM HiFi profile.
Gary C Wang [Fri, 8 May 2026 10:42:38 +0000 (18:42 +0800)]
ASoC: soc-acpi-intel-arl-match: add rt712_l0_rt1320_l3 support
Add support for using the rt712 multi-function codec on link 0 and the
rt1320 amplifier on link 3 on ARL platforms.
Signed-off-by: Gary C Wang <gary.c.wang@intel.com> Co-developed-by: Mac Chiang <mac.chiang@intel.com> Signed-off-by: Mac Chiang <mac.chiang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20260508104239.1247525-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
This series converts mutex and spinlock handling in TI ASoC drivers
to use guard() helpers.
Most patches are straightforward conversions to guard() helpers.
Two patches include minor cleanup changes in the process:
omap-dmic: Simplified omap_dmic_dai_startup() by removing the
temporary return variable and using a direct return path on error.
omap-mcbsp: Modernized omap_mcbsp_request() by using __free(kfree)
for memory management. This ensures that memory is always freed on
error paths after the spinlock is released, without needing manual
goto labels.
ASoC: ti: omap-mcbsp: Simplify lock and resource handling
Convert spinlock protected sections to guard()/scoped_guard()
helpers and simplify the cleanup paths, including the
reg_cache lifetime handling in omap_mcbsp_request().
Mark Brown [Mon, 11 May 2026 00:55:56 +0000 (09:55 +0900)]
spi: switch to managed controller allocation (part 2/3)
Johan Hovold <johan@kernel.org> says:
In preparation for fixing the SPI controller API so that it no longer
drops a reference when deregistering (non-managed) controllers (cf.
[1]), this series converts drivers using non-managed registration to use
managed allocation.
Included is also a related cleanup of a ti-qspi error path.
This second set will be followed by a third set of 12 patches for
drivers using managed registration.
That leaves us with 18 drivers using non-managed allocation, which is
few enough to be able to fix the API in tree-wide change.
Mark Brown [Mon, 11 May 2026 00:52:53 +0000 (09:52 +0900)]
ASoC: Move system_long_wq to system_dfl_long_wq
Marco Crivellari <marco.crivellari@suse.com> says:
Currently the code uses the per-cpu workqueue system_long_wq to schedule
long running works.
Unbound works could benefit from scheduler task placement, to optimize
performance and power consumption. Another good reason to have this unbound,
is the "queue_delayed_work()" function, used to enqueue the work item.
More details on this will follow in the next section.
Recently, a new unbound workqueue specific for long running work has been
added:
c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
~~~ Details about queue_delayed_work ~~~
system_long_wq is a per-cpu workqueue and it is used as a parameter of
queue_delayed_work(). This function schedule an item that it will later
be enqueued (once the timer will fire). __queue_delayed_work() does the job
receiving as "cpu" WORK_CPU_UNBOUND:
if (housekeeping_enabled(HK_TYPE_TIMER)) {
// [....]
} else {
if (likely(cpu == WORK_CPU_UNBOUND))
add_timer_global(timer);
else
add_timer_on(timer, cpu);
}
The timer is global, so can fire everywhere, and the work item will be
enqueued where the timer fired.
Since the workqueue work doesn't rely on per-cpu variables, there is no
obvious reason that justify the use of a per-cpu workqueue. So change the
workqueue with the new system_dfl_long_wq, so that the used workqueue is
now unbound and can benefit from scheduler task placement.
ASoC: codecs: rt5640: Move long delayed work on system_dfl_long_wq
Currently the code enqueue work items using {queue|mod}_delayed_work(),
using system_long_wq. This workqueue should be used when long works are
expected and it is a per-cpu workqueue.
The function(s) end up calling __queue_delayed_work(), which set a global
timer that could fire anywhere, enqueuing the work where the timer fired.
Unbound works could benefit from scheduler task placement, to optimize
performance and power consumption. Long work shouldn't stick to a single
CPU.
Recently, a new unbound workqueue specific for long running work has
been added:
    c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
Since the workqueue work doesn't rely on per-cpu variables, there is no
obvious reason that justify the use of a per-cpu workqueue. So change
system_long_wq with system_dfl_long_wq so that the work may benefit from
scheduler task placement.
ASoC: cs42l43: Move long delayed work on system_dfl_long_wq
Move long delayed work on system_dfl_long_wq
Currently the code enqueue work items using {queue|mod}_delayed_work(),
using system_long_wq. This workqueue should be used when long works are
expected and it is a per-cpu workqueue.
The function(s) end up calling __queue_delayed_work(), which set a global
timer that could fire anywhere, enqueuing the work where the timer fired.
Unbound works could benefit from scheduler task placement, to optimize
performance and power consumption. Long work shouldn't stick to a single
CPU.
Recently, a new unbound workqueue specific for long running work has
been added:
    c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
Since the workqueue work doesn't rely on per-cpu variables, there is no
obvious reason that justify the use of a per-cpu workqueue. So change
system_long_wq with system_dfl_long_wq so that the work may benefit from
scheduler task placement.
Cc: David Rhodes <david.rhodes@cirrus.com> Cc: Richard Fitzgerald <rf@opensource.cirrus.com> Cc: patches@opensource.cirrus.com Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20260508150327.351779-2-marco.crivellari@suse.com Signed-off-by: Mark Brown <broonie@kernel.org>
ASoC: fsl: eukrea-tlv320: update board checks to use the DT
The eukrea-tlv320 driver contains checks for ARM machine IDs via
machine_is_*() macros. The boards concerned now support only FDT
booting, which does not use machine IDs, and therefore the code should
be updated to check the DT compatible property instead.
Non-DT booting support for these machines was removed in these
commits:
commit f2f55499942a ("ARM: imx: Remove eukrea_mbimxsd35 non-dt support")
commit 3877942b0c7f ("ARM: imx25: Remove eukrea mx25 board files")
commit 7c5deaf77526 ("ARM: i.MX: Remove mach-cpuimx27sd board file")
commit 8da4d6b2f798 ("ARM: mx51: Remove mach-cpuimx51sd board file")
The presence of these machine ID checks prevents the removal of
machine IDs no longer used by the kernel from arch/arm/tools/mach-types,
because the machine_is_*() macros are generated from mach-types. To
resolve this issue, use of_machine_is_compatible() instead.
spi: amd: Set correct bus number in ACPI probe path
On platforms where the HID2 SPI controller (AMDI0063) is enumerated via
ACPI instead of PCI, amd_spi_probe() unconditionally sets bus_num to 0,
while the PCI probe path assigns bus_num 2 for HID2 controller.
Align the ACPI probe path to use the same bus number so that userspace
and SPI client drivers see a consistent bus assignment regardless of the
enumeration method.
Fixes: b644c2776652 ("spi: spi_amd: Add PCI-based driver for AMD HID2 SPI controller") Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Krishnamoorthi M <krishnamoorthi.m@amd.com> Link: https://patch.msgid.link/20260507180051.4158674-1-krishnamoorthi.m@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
A phandle-array is really a matrix and needs constraints on the number
of elements for both the inner and outer dimensions. Add the missing
inner constraints.
Fixes: 472d77bdc511 ("ASoC: dt-bindings: mediatek,mt8173-rt5650-rt5514: convert to DT schema") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260508182438.1757394-1-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Peter Ujfalusi [Tue, 5 May 2026 16:47:44 +0000 (19:47 +0300)]
MAINTAINERS: ASoC/ti: Remove myself and add Sen Wang as maintainer
As I cannot spend adequate time to fulfill my role as maintainer for the
TI ASoC drivers, it is for the better if I resign and hand over the role
to Sen Wang.
Tejun Heo [Sun, 10 May 2026 22:51:09 +0000 (12:51 -1000)]
Merge branch 'for-7.1-fixes' into for-7.2
Conflict between:
[1] 41e3312861ea ("sched_ext: add p->scx.tid and SCX_OPS_TID_TO_TASK lookup")
[2] c941d7391f25 ("sched_ext: Close root-enable vs sched_ext_dead() race with SCX_TASK_INIT_BEGIN")
in scx_root_enable_workfn()'s post-init block. [1] added a tid hash
insertion under a scoped_guard() for scx_tasks_lock; [2] wraps the same
region in task_rq_lock() for a DEAD recheck. A naive merge would invert the
iter's outer/inner order.
[3] f25ad1e3cbaa ("sched_ext: Add scx_task_iter_relock() and use it in scx_root_enable_workfn()")
was added to for-7.2 for a clean resolution: scx_task_iter_relock(iter, p)
takes both scx_tasks_lock and @p's rq lock in iter order.
Resolved by routing both sides through [3]'s dual-lock helper: the post-init
region runs under a single scx_task_iter_relock() acquisition, with [2]'s
state machine and [1]'s hash insert in sequence inside it.
Tejun Heo [Sun, 10 May 2026 22:21:58 +0000 (12:21 -1000)]
sched_ext: Add scx_task_iter_relock() and use it in scx_root_enable_workfn()
scx_root_enable_workfn()'s post-init block re-acquires scx_tasks_lock
briefly via a scoped_guard() for the tid hash insertion. c941d7391f25
("sched_ext: Close root-enable vs sched_ext_dead() race with
SCX_TASK_INIT_BEGIN") on for-7.1-fixes adds a post-init DEAD recheck that
holds the task's rq lock across the state-machine updates in the same
region. A naive merge would acquire scx_tasks_lock while the rq lock is
held, inverting the iter's outer/inner order (scx_tasks_lock then rq lock).
Add scx_task_iter_relock(iter, p), the counterpart to
scx_task_iter_unlock(), that re-acquires scx_tasks_lock and, if @p is
non-NULL, @p's rq lock. The locks are tracked in @iter so subsequent
iteration releases them.
Use it in scx_root_enable_workfn()'s post-init block and drop the
now-redundant scoped_guard on the hash insertion. The post-init region now
runs with both scx_tasks_lock and the task's rq lock held across the init
failure check, the state-machine updates and the hash insert.
v2: Move scx_task_iter_relock() earlier to ease the for-7.1-fixes merge.
Clippy 1.97 introduces new `useless_borrows_in_formatting` warning which
fires on the examples as we have `&*expr` where the format macro takes
reference already. Remove the extra borrow.
Gary Guo [Tue, 28 Apr 2026 13:10:59 +0000 (14:10 +0100)]
rust: pin-init: internal: turn `PhantomPinned` error into warnings
The `PhantomPinned` detection is just a lint, and is emitted as an error
because there is no `compile_warning!()` macro, and
`proc-macro-diagnostics` is not stable.
Use of `#[deprecated = ""]` attribute to approximate custom proc-macro
warnings. A new line is added before message for visual clarity.
An example warning with this trick looks like this:
warning: use of deprecated function `_::warn`:
The field `pin` of type `PhantomPinned` only has an effect if it has the `#[pin]` attribute
--> test.rs:9:5
|
9 | pin: marker::PhantomPinned,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
rust: pin-init: internal: adjust license identifier of `zeroable.rs`
The pin-init crate has been licensed under `Apache-2.0 OR MIT` since the
beginning. I introduced in commit 071cedc84e90 ("rust: add derive macro for
`Zeroable`") `zeroable.rs` with incompatible GPL-2.0 SPDX identifier. The
file has not been modified by other authors, so relicense it under the
above license.
rust: pin-init: internal: add missing where clause to projection types
`#[pin_data]` failed to propagate the struct's `where` clause to the
generated projection struct. As a result, bounds written in a `where`
clause could be dropped during expansion, causing type errors when
fields depended on those bounds.
Fix this by adding the missing `where` clause to the generated
projection struct.
Reported-by: Andreas Hindborg <a.hindborg@kernel.org> Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/561532-pin-init/topic/generic.20bounds.20and.20.60.23.5Bpin_data.5D.60/with/578381591 Signed-off-by: Mohamad Alsadhan <mo@sdhn.cc> Reviewed-by: Gary Guo <gary@garyguo.net>
[ Reworded commit message - Gary ] Link: https://patch.msgid.link/20260428-pin-init-sync-v1-5-07f9bd3859fb@garyguo.net Signed-off-by: Gary Guo <gary@garyguo.net>
rust: pin-init: extend `impl_zeroable_option` macro to handle generics
Improve impl_zeroable_option macro to handle generic impls for types
like `&T`, `&mut T`, `NonNull<T>`, and others (for which `Option<T>`
is guaranteed to be zeroable) with similar approach to
`impl_zeroable`.
Also, update old declarations to use generics e.g. `NonZeroU8` to
`NonZero<u8>`.
rust: pin-init: cleanup `Zeroable` and `ZeroableOptions`
Place definitions and implementations (incl. macro invocations) of
the `Zeroable` trait first in the relevant section of `src/lib.rs`,
followed by the ZeroableOption trait and its implementations.
Rename `impl_non_zero_int_zeroable_option` to `impl_zeroable_option`
for consistency.
This commit should not introduce any functional changes.
Gary Guo [Tue, 28 Apr 2026 13:10:51 +0000 (14:10 +0100)]
rust: pin-init: bump minimum Rust version to 1.82
Following the kernel minimum version bump in commit f32fb9c58a5b ("rust:
bump Rust minimum supported version to 1.85.0 (Debian Trixie)"), bump
pin-init's minimum Rust version to 1.82.
This removes the `lint_reasons` feature which is stabilized in 1.81 and the
`raw_ref_ops` and `new_uninit` features which are stabilized in 1.82.
Given we do not use any features that are stabilized in 1.82..=1.85 range,
and pin-init crate is useful for other projects which may have their own
MSRV requirements, the minimum version is not straightly bumped to 1.85.
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Handle SCX_TASK_NONE in disable/switched_from paths
scx_fail_parent() leaves cgroup tasks at (state=NONE, sched=parent,
sched_class=ext) until the parent itself is torn down by the scx_error() it
raised. When the later root_disable iterates them, two paths trip on NONE.
scx_disable_and_exit_task() re-enters the wrapper at NONE: the inner switch
returns early but the trailing scx_set_task_sched(p, NULL) clobbers the
parent sched left by scx_fail_parent(), and scx_set_task_state(p, NONE)
wastes a write on an already-NONE task. switched_from_scx() then calls
scx_disable_task(), which WARNs on non-ENABLED state and writes state=READY,
producing a NONE -> READY transition the validation matrix rejects.
Treat NONE as "nothing to do" in both paths. Add a NONE early-return at the
top of scx_disable_and_exit_task() and a parallel NONE check in
switched_from_scx() next to task_dead_and_done().
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Close sub-sched init race with post-init DEAD recheck
scx_sub_enable_workfn()'s init pass and scx_sub_disable() migration both
drop the rq lock to call __scx_init_task() against the other sched. A
TASK_DEAD @p can fall through sched_ext_dead() in that window.
sched_ext_dead() runs ops.exit_task() on the sched @p was attached to, not
on the sched whose init just completed, so the new allocation leaks.
Reuse the DEAD signal set by sched_ext_dead(). After __scx_init_task()
returns, take task_rq_lock(p) and check for DEAD; on hit, call
scx_sub_init_cancel_task() against the sub sched the init ran for and drop
@p; on miss, proceed as before.
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Close root-enable vs sched_ext_dead() race with SCX_TASK_INIT_BEGIN
scx_root_enable_workfn() drops the iter rq lock for ops.init_task() and a
TASK_DEAD @p can fall through sched_ext_dead() in that window. The race hits
when sched_ext_dead() observes SCX_TASK_INIT (the intermediate state before
@p->scx.sched is published) and dereferences NULL via SCX_HAS_OP(NULL,
exit_task), or observes SCX_TASK_NONE during the unlocked init window and
skips cleanup so exit_task() never runs.
Add SCX_TASK_INIT_BEGIN. The enable path writes NONE -> INIT_BEGIN under the
iter rq lock, then takes the rq lock again after init to walk INIT_BEGIN ->
INIT -> READY. sched_ext_dead() that wins the rq-lock race observes
INIT_BEGIN and sets DEAD without calling into ops; the post-init recheck
unwinds via scx_sub_init_cancel_task().
scx_fork() runs single-threaded against sched_ext_dead() (the task is not on
scx_tasks until scx_post_fork() adds it) so its INIT_BEGIN -> INIT walk
needs no rq-lock pairing; it rolls back to NONE on ops.init_task() failure.
The validation matrix grows the INIT_BEGIN row and the INIT_BEGIN -> DEAD
edge; INIT now requires INIT_BEGIN as the predecessor. scx_sub_disable()'s
migration writes INIT_BEGIN as a synthetic predecessor to satisfy the
tightened verification.
The sub-sched paths still race with sched_ext_dead() during the unlocked
init window. This will be fixed by the next patch.
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Replace SCX_TASK_OFF_TASKS flag with SCX_TASK_DEAD state
SCX_TASK_OFF_TASKS marked tasks already through sched_ext_dead() so cgroup
task iteration would skip them. This can be expressed better with a task
state. Replace the flag with SCX_TASK_DEAD.
scx_disable_and_exit_task() resets state to NONE on its way out, so
sched_ext_dead() now sets DEAD after the wrapper returns. The validation
matrix grows NONE -> DEAD, warns on DEAD -> NONE, and tightens READY's
predecessor to INIT or ENABLED so the new DEAD value cannot silently
transition to READY.
Prepares for the following enable vs dead race fix.
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Inline scx_init_task() and move RESET_RUNNABLE_AT into scx_set_task_state()
Prepare for the SCX_TASK_INIT_BEGIN/DEAD work that follows by collapsing the
scx_init_task() helper. Move the SCX_TASK_RESET_RUNNABLE_AT setting into
scx_set_task_state() on the INIT transition (it was set unconditionally at
every INIT site through the scx_init_task() helper), inline scx_init_task()
into scx_fork() and scx_root_enable_workfn(), and drop the helper.
As a side effect, scx_sub_disable() migration sequence now also sets
RESET_RUNNABLE_AT (it previously wrote INIT directly without going through
scx_init_task()). The flag triggers a runnable_at reset on the next
set_task_runnable(), which is harmless on a task that has just been moved
between scheds.
On root-enable, p->scx.flags is written without the task's rq lock. The task
isn't visible to scx yet, and a follow-up patch restores the lock-held
write.
v2: Note p->scx.flags rq-lock relaxation on root-enable path. (Andrea)
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
Tejun Heo [Sun, 10 May 2026 20:08:16 +0000 (10:08 -1000)]
sched_ext: Cleanups in preparation for the SCX_TASK_INIT_BEGIN/DEAD work
Cleanups in preparation for the state-machine work that follows:
- Convert three sub-sched call sites that open-code
rcu_assign_pointer(p->scx.sched, ...) to scx_set_task_sched().
- Move scx_get_task_state()/scx_set_task_state() above the SCX task iter
section so scx_task_iter_next_locked() can use them without a forward
declaration.
No functional change.
Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>
net/mlx5: HWS, Handle destroying table that has a miss table
If a table has a miss table that was created by
'mlx5hws_table_set_default_miss' API function, its miss_tbl
keeps the table that points to it in a list.
If such table is deleted, we need to also remove it from the
miss_tbl list, otherwise the node in miss_tbl list will contain
garbage.
Jakub Kicinski [Sun, 10 May 2026 17:20:23 +0000 (10:20 -0700)]
Merge branch 'log-clean-up-and-tap-follow-ups'
Allison Henderson says:
====================
Log clean up and TAP follow ups
This is a follow up series to the "Log collection, TAP compliance and
cleanups" set. The sashiko report had made some points that I thought
was worth addressing. This patch set fixes a few more TAP compliance
prints in the check_gcov* routines. Also since the user must now pass
in the log folder to collect logs, log clean up is tightened to only
remove rds* prefixed artifacts instead of the entire folder. Lastly a
the signal handler alarm should be disarmed after the completes to
avoid multiple calls to the stop_pcaps routine.
====================
selftests: rds: Disarm signal alarm on test completion
A race in stop_pcaps is possible if the test completes and then
times out while waiting for the tcpdump process to exit. The
signal handler may fire again and needlessly call stop_pcap a
second time. Fix this by disabling the alarm after normal
test completion.
Also if there are no tcpdump processes to wait on, stop_pcaps can
just exit. This avoids misleading prints when there are no procs
to collect dumps from.
selftests: rds: Fix TAP-prefixed prints in check_gcov*
This patch adds the # prefix to info and warning prints in
the check_gcov* routines. Since these routines do not exit,
as the other check_* routines do, the output here should be
kept TAP compliant.
Since rds self tests no longer has a default folder, users must
specify a log collection folder if they want to collect logs.
Currently the log folder is deleted and recreated, but this can
be dangerous if the user exports RDS_LOG_DIR=/tmp or /var/log.
This patch corrects the clean up to delete only rds log artifacts
from the log folder, and further prefixes rds specific logs as rds*
net: xgene: fix mdio_np leak in xgene_mdiobus_register()
The for_each_child_of_node() loop captures mdio_np via break,
holding the refcount. of_mdiobus_register() does not consume the
reference, so it leaks on success.
====================
ipv4: Flush the FIB once on multiple nexthop removal
This series optimizes multiple nexthop removal performance from having
to do a FIB flush for each nexthop being removed to only doing a single
FIB flush after all nexthops are removed.
This dramatically improves performance in scenarios where there are
many nexthops and many ipv4 routes. Please see individual patches for
more details and for a test scenario.
====================
Cosmin Ratiu [Thu, 7 May 2026 07:56:06 +0000 (10:56 +0300)]
ipv4: Add __must_check to nexthop removal functions
These functions return a signal whether FIB flushing is required which
must not be ignored. Use the compiler to help with enforcing this
requirement in the future.
Cosmin Ratiu [Thu, 7 May 2026 07:56:05 +0000 (10:56 +0300)]
ipv4: Flush the FIB once on multiple nexthop removal
When a device is going down or when a net namespace is deleted, all
nexthops on it are removed, and for each nexthop being removed the FIB
table is flushed, which does a full trie traversal looking for entries
marked RTNH_F_DEAD and removing them. This is O(N x R), with N being
number of dev nexthops and R being number of IPv4 routes.
The RTNL is held the entire time.
When there are many nexthops to be removed and many routing entries,
this can result in the RTNL being held for multiple minutes, which
causes unhappiness in other processes trying to acquire the RTNL (e.g.
systemd-networkd for DHCP renewals).
In a complicated deployment with multiple vxlan devices, each having
16K nexthops and a total of 128K ipv4 routes, this is exactly what
happens:
nexthop_flush_dev() # loops over 16K nexthops
-> remove_nexthop()
-> __remove_nexthop()
-> __remove_nexthop_fib() # marks fi->fib_flags |= RTNH_F_DEAD
-> fib_flush() # for EACH nexthop!
-> fib_table_flush() # walks the ENTIRE FIB, 128K entries
This patch makes use of the previously added FIB flushing signal to only
do a single FIB flush after all nexthops to be removed are marked as
RTNH_F_DEAD:
- __remove_nexthop_fib() no longer flushes the FIB.
- nexthop_flush_dev() and flush_all_nexthops() now keep track whether
any nexthop was removed and trigger a FIB flush at the end.
- a new wrapper is defined, remove_one_nexthop() which calls
remove_nexthop() and flushes if necessary. This is intended for places
which must remove a single nexthop and shouldn't worry about the need
to trigger a FIB flush. For now, the only caller is rtm_del_nexthop().
- The two direct callers of __remove_nexthop() get a WARN_ON_ONCE, since
the nh about to be removed should not have any FIB entries referencing
it when replacing or inserting a new one.
This dramatically improves performance from O(N x R) to O(N + R).
Releasing a nexthop reference in remove_nexthop() now no longer frees
it. Instead, it is deleted when the last fib_info pointing to it gets
freed via free_fib_info_rcu(). All routing code is already careful not
to take into consideration routes marked with RTNH_F_DEAD.
Tested with:
DEV=eth2
ip link set up dev $DEV
ip link add testnh0 link $DEV type macvlan mode bridge
ip addr add 198.51.100.1/24 dev testnh0
ip link set testnh0 up
seq 1 65536 | \
sed 's/.*/nexthop add id & via 198.51.100.2 dev testnh0/' | \
ip -batch -
i=1
for a in $(seq 0 255); do
for b in $(seq 0 255); do
echo "route add 10.${a}.${b}.0/32 nhid $i"
i=$((i + 1))
done
done | ip -batch -
time ip link set testnh0 down
ip link del testnh0
Without this patch:
real 0m32.601s
user 0m0.000s
sys 0m32.511s
With this patch:
real 0m0.209s
user 0m0.000s
sys 0m0.153s
Cosmin Ratiu [Thu, 7 May 2026 07:56:04 +0000 (10:56 +0300)]
ipv4: Provide a FIB flushing signal from nexthop removal functions
Plumb a bool value throughout the various nexthop removal functions,
determined in the innermost __remove_nexthop_fib() (which still does the
FIB flushing) and propagated up all callers.
The next patch will make use of this signal to optimize the removal of
multiple nexthops by moving the FIB flushing up the call hierarchy.
====================
net: convert four more protocols to getsockopt_iter
Continue the work to convert protocols to the new getsockopt_iter API.
Convert four additional getsockopt implementations to the new
sockopt_t/getsockopt_iter callback:
- MCTP
- LLC
- X.25
- KCM
These are mechanical, ABI-preserving conversions following the same
pattern as the previously converted protocols (af_packet, can/raw,
af_netlink, af_vsock): the (char __user *optval, int __user *optlen)
pair is replaced with a single sockopt_t *opt that carries the buffer
length on input and the returned size on output, and exposes an iov_iter
for the copy-out path. put_user()/copy_to_user() pairs are replaced with
a single copy_to_iter() per option, and the wrapper in
do_sock_getsockopt() handles writing optlen back to userspace.
I picked these four because each is small and self-contained.
====================
Breno Leitao [Thu, 7 May 2026 10:57:54 +0000 (03:57 -0700)]
selftests: net: getsockopt_iter: cleanup
Apply two cleanups suggested by Stanislav and bobby on the original
selftest series:
- Reorder local variable declarations into reverse christmas-tree
order (longest line first). Because that ordering puts socklen_t
optlen before the variable whose size it stores, the
"optlen = sizeof(...)" initializer is moved out of the declaration
to a plain assignment in the test body, as Stanislav suggested.
- Add ASSERT_EQ(optlen, ...) on every error path so the value the
kernel writes back to the userspace optlen is pinned down even
when the syscall returns -1. With do_sock_getsockopt() now writing
opt->optlen back to userspace unconditionally, asserting that the
netlink/vsock error paths leave the original input length untouched
guards against future regressions.
Bobby Eshleman pointed out that
SO_VM_SOCKETS_CONNECT_TIMEOUT_NEW/OLD return a sock_timeval-shaped
payload (16 bytes on 64-bit), which is wider than the u64 case
already covered. Add four tests that exercise this path:
Breno Leitao [Thu, 7 May 2026 10:57:53 +0000 (03:57 -0700)]
kcm: convert to getsockopt_iter
Convert KCM socket's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of put_user()/copy_to_user()
- Add linux/uio.h for copy_to_iter()
Breno Leitao [Thu, 7 May 2026 10:57:52 +0000 (03:57 -0700)]
x25: convert to getsockopt_iter
Convert X.25 socket's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of put_user()/copy_to_user()
- Add linux/uio.h for copy_to_iter()
Breno Leitao [Thu, 7 May 2026 10:57:51 +0000 (03:57 -0700)]
llc: convert to getsockopt_iter
Convert LLC socket's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of put_user()/copy_to_user()
- Add linux/uio.h for copy_to_iter()
Breno Leitao [Thu, 7 May 2026 10:57:50 +0000 (03:57 -0700)]
mctp: convert to getsockopt_iter
Convert MCTP socket's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input)
- Use copy_to_iter() instead of copy_to_user()
- Add linux/uio.h for copy_to_iter()
Jakub Kicinski [Sun, 10 May 2026 17:07:37 +0000 (10:07 -0700)]
Merge tag 'batadv-net-pullrequest-20260508' of https://git.open-mesh.org/batadv
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix integer overflow on buff_pos, by Lyes Bourennani
- fix invalid tp_meter access during teardown, by Jiexun Wang (2 patches)
- stop caching unowned originator pointers in BAT IV, by Jiexun Wang
- tp_meter: fix tp_num leak on kmalloc failure, by Sven Eckelmann
- fix BLA refcounting issues, by Sven Eckelmann (3 patches)
* tag 'batadv-net-pullrequest-20260508' of https://git.open-mesh.org/batadv:
batman-adv: bla: put backbone reference on failed claim hash insert
batman-adv: bla: only purge non-released claims
batman-adv: bla: prevent use-after-free when deleting claims
batman-adv: tp_meter: fix tp_num leak on kmalloc failure
batman-adv: stop caching unowned originator pointers in BAT IV
batman-adv: stop tp_meter sessions during mesh teardown
batman-adv: reject new tp_meter sessions during teardown
batman-adv: fix integer overflow on buff_pos
====================
net: ena: PHC: Fix potential use-after-free in get_timestamp
Move the phc->active check and resp pointer assignment to after
acquiring the spinlock. Previously, phc->active was checked without
holding the lock, and resp was cached from ena_dev->phc.virt_addr
before the lock was acquired.
If ena_com_phc_destroy() runs between the lockless active check and
the lock acquisition, it sets active=false, releases the lock, frees
the DMA memory, and sets virt_addr=NULL. The get_timestamp path would
then read a NULL virt_addr and dereference it.
With both the active check and the pointer read under the lock,
destroy cannot free the memory while get_timestamp is using it.
Fixes: e0ea34158ee8 ("net: ena: Add PHC support in the ENA driver") Cc: stable@vger.kernel.org Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20260508062126.7273-1-akiyano@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chuck Lever [Sun, 19 Apr 2026 18:52:59 +0000 (14:52 -0400)]
NFSD: Fix infinite loop in layout state revocation
find_one_sb_stid() skips stids whose sc_status is non-zero, but the
SC_TYPE_LAYOUT case in nfsd4_revoke_states() never sets sc_status
before calling nfsd4_close_layout(). The retry loop therefore finds
the same layout stid on every iteration, hanging the revoker
indefinitely.
Fixes: 1e33e1414bec ("nfsd: allow layout state to be admin-revoked.") Reported-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Sat, 11 Apr 2026 21:12:16 +0000 (17:12 -0400)]
sunrpc: start cache request seqno at 1 to fix netlink GET_REQS
sunrpc_cache_requests_snapshot() filters requests with
crq->seqno <= min_seqno. The min_seqno for the first netlink
dump call is cb->args[0] which is 0. Since next_seqno was
initialized to 0, the very first cache request got seqno=0
and was silently skipped by the snapshot (0 <= 0 is true).
This caused netlink-based GET_REQS to return 0 pending requests
even when a request was queued, preventing mountd from resolving
cache entries (particularly expkey/nfsd.fh). The unresolved
CACHE_PENDING state blocked all further notifications for the
entry, leading to permanent NFS4ERR_DELAY hangs.
Start next_seqno at 1 so all requests have seqno >= 1 and pass
the snapshot filter when min_seqno is 0.
Fixes: facc4e3c8042 ("sunrpc: split cache_detail queue into request and reader lists") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
nfsd: update mtime/ctime on COPY in presence of delegated attributes
When delegated attributes are given on open, the file is opened with
NOCMTIME and modifying operations do not update mtime/ctime as to not get
out-of-sync with the client's delegated view. However, for COPY operation,
the server should update its view of mtime/ctime and reflect that in any
GETATTR queries.
Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
nfsd: update mtime/ctime on CLONE in presense of delegated attributes
When delegated attributes are given on open, the file is opened with
NOCMTIME and modifying operations do not update mtime/ctime as to not get
out-of-sync with the client's delegated view. However, for CLONE operation,
the server should update its view of mtime/ctime and reflect that in any
GETATTR queries.
Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Scott Mayhew [Tue, 7 Apr 2026 22:08:57 +0000 (18:08 -0400)]
nfsd: fix file change detection in CB_GETATTR
RFC 8881, section 10.4.3 doesn't say anything about caching the file
size in the delegation record, nor does it say anything about comparing
a cached file size with the size reported by the client in the
CB_GETATTR reply for the purpose of determining if the client holds
modified data for the file.
What section 10.4.3 of RFC 8881 does say is that the server should
compare the *current* file size with the size reported by the client
holding the delegation in the CB_GETATTR reply, and if they differ to
treat it as a modification regardless of the change attribute retrieved
via the CB_GETATTR.
Doing otherwise would cause the server to believe the client holding the
delegation has a modified version of the file, even if the client
flushed the modifications to the server prior to the CB_GETATTR. This
would have the added side effect of subsequent CB_GETATTRs causing
updates to the mtime, ctime, and change attribute even if the client
holding the delegation makes no further updates to the file.
Modify nfsd4_deleg_getattr_conflict() to obtain the current file size
via i_size_read(). Retain the ncf_cur_fsize field, since it's a
convenient way to return the file size back to nfsd4_encode_fattr4(),
but don't use it for the purpose of detecting file changes. Remove the
unnecessary initialization of ncf_cur_fsize in nfs4_open_delegation().
Also, if we recall the delegation (because the client didn't respond to
the CB_GETATTR), then skip the logic that checks the nfs4_cb_fattr
fields.
Fixes: c5967721e106 ("NFSD: handle GETATTR conflict with write delegation") Cc: stable@vger.kernel.org Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
tools/ynl: add missing uapi header deps in Makefile.deps
ethtool.h includes linux/typelimits.h which is a relatively new header
not yet shipped in most distro kernel-header packages. Without the
explicit entry, the build silently falls through to -idirafter.
dev_energymodel.h is a new YNL family whose uapi header is not in
system paths at all and was missing a CFLAGS entry entirely.
Wolfram Sang [Thu, 7 May 2026 10:24:08 +0000 (12:24 +0200)]
watchdog: rzn1: remove now obsolete interrupt support
Previously, it was overlooked that the watchdog could reset the system
directly. So, a workaround using the interrupt which called
emergency_restart() was implemented. We now configure the controller
when booting properly to allow watchdog resets directly. Thus, remove
the interrupt workaround.
watchdog: convert the Kconfig dependency on OF_GPIO to OF
OF_GPIO is selected automatically on all OF systems. Any symbols it
controls also provide stubs so there's really no reason to select it
explicitly. We could simply remove the dependency but in order to avoid
a new symbol popping up for everyone in make config - just convert it to
requiring CONFIG_OF.