Nirmoy Das [Thu, 14 May 2026 14:42:57 +0000 (07:42 -0700)]
ovl: keep err zero after successful ovl_cache_get()
ovl_iterate_merged() stores PTR_ERR(cache) in err before checking
IS_ERR(cache). On success err holds the truncated cache pointer and
can be returned as a bogus non-zero error.
The syzbot reproducer reaches this through overlay-on-overlay readdir:
Fengnan Chang [Wed, 13 May 2026 09:13:49 +0000 (17:13 +0800)]
dm: limit target bio polling to one shot
dm_poll_bio() is the ->poll_bio() callback for a stacked dm device.
The caller only knows about the dm queue, so it may decide to do a
spinning poll if it thinks a single queue is being polled. Passing those
flags unchanged to the mapped clone lets blk_mq_poll() spin on a target
queue from inside dm_poll_bio().
With io_uring IOPOLL on a dm-stripe target this can keep a task in
dm_poll_bio() -> bio_poll() -> blk_mq_poll()
long enough to trigger an RCU CPU stall, before io_uring gets back to
io_iopoll_check() and its need_resched() check.
Keep dm's ->poll_bio() bounded by forcing one-shot polling for target
bios. The caller can invoke dm_poll_bio() again if it wants to keep
polling, and it also gets a chance to reap completions or reschedule
between passes.
Mikulas Patocka [Mon, 11 May 2026 11:04:16 +0000 (13:04 +0200)]
dm-ioctl: report an error if a device has no table
When we send a message to a device that has no table, the return code was
not set. The code would return "2", which is not considered a valid return value.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
As described in commit 2bc057692599 ("block: don't make REQ_POLLED imply
REQ_NOWAIT"), which fixed the same issue for the block device node, there
are valid cases to poll for I/O completion without REQ_NOWAIT.
Additionally, sing REQ_NOWAIT for file system writes is currently not
supported as file systems writes are not idempotent and would need a
retry of just the bio and not the entire operation to be fully supported.
Switch iomap to set REQ_POLLED and remove the now unused bio_set_polled
helper.
Gary Guo [Tue, 12 May 2026 12:09:53 +0000 (13:09 +0100)]
rust: pin-init: internal: project using full slot
Instead of projecting using pointer to a field project the full slot. This
further shifts the code generation from the initializer site to the struct
definition site, which means less code is generated overall.
It also makes the safety comment easier to justify, as now the projection
is done by the `#[pin_data]` macro which has full visibility of pinnedness
of fields.
The field alignment could also be checked on the `#[pin_data]` side;
however, since `init!()` macro works for other type of structs, we cannot
remove the alignment check from `init!`/`pin_init!` side anyway, so I opted
to still keep the alignment check in init.rs.
Gary Guo [Tue, 12 May 2026 12:09:52 +0000 (13:09 +0100)]
rust: pin-init: internal: project slots instead of references
By projecting slots, the `pin_init!` and `init!` code path can be more
unified. This also reduces the amount of macro-generated code and shifts
them to the shared infrastructure.
Gary Guo [Tue, 12 May 2026 12:09:50 +0000 (13:09 +0100)]
rust: pin-init: internal: use marker on drop guard type for pinned fields
Instead of projecting the created reference, simply create drop guards with
different marker types and have the `let_binding()` method of guards of
different marker produce different type instead.
This allows more flexible lifetime as this is now controlled by the guard.
This will be needed when implementing self-referential fields.
Gary Guo [Tue, 12 May 2026 12:09:49 +0000 (13:09 +0100)]
rust: pin-init: internal: init: handle code blocks early
`InitializerKind::Code` is a special case where it does not initialize a
field, and thus generate no guard and accessors. Handle it earlier and make
the rest of the code more linear.
Niklas Cassel [Thu, 14 May 2026 07:39:02 +0000 (09:39 +0200)]
ata: libata-scsi: do not needlessly defer commands when using PMP with FBS
The ACS specification does not allow a non-NCQ command to be issued while
an NCQ command is outstanding.
Commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
introduced a feature where a deferred non-NCQ command gets issued from a
workqueue. The design stores a single non-NCQ command per port.
However, when using Port Multipliers (PMPs), specifically PMPs that
support FIS-Based Switching (FBS), non-NCQ and NCQ commands can be mixed
on the same port, just not for the same link, see e.g. ata_std_qc_defer()
which is, and always has operated on a per-link basis.
Therefore, move the deferred_qc from struct ata_port to struct ata_link.
This way, when using a PMP with FBS, we will not needlessly defer commands
to all other links, just because one link issued a non-NCQ command while
having an NCQ command outstanding. Only commands for that specific link
will be deferred. This is in line with how PMPs with FBS worked before
commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation").
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Tested-by: Tommy Kelly <linux@tkel.ly> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>
Niklas Cassel [Thu, 14 May 2026 07:39:01 +0000 (09:39 +0200)]
ata: libata-scsi: do not use the deferred QC feature on PMPs with CBS
When using Port Multipliers (PMPs) with Command-Based Switching (CBS), you
can only issue commands to one link at a time. For PMPs with CBS, there is
already code to handle commands being sent to different links in
sata_pmp_qc_defer_cmd_switch() using ap->excl_link. sata_sil24 also makes
use of ap->excl_link.
A user on the list reported that commit 0ea84089dbf6 ("ata: libata-scsi:
avoid Non-NCQ command starvation") broke PMPs with CBS. The commit
introduced code that stores a deferred qc in ap->deferred_qc, to later be
issued via a workqueue. It turns out that this change is incompatible with
the existing ap->excl_link handling used by PMPs with CBS.
Thus, modify sata_pmp_qc_defer_cmd_switch() and sil24_qc_defer() to return
ATA_DEFER_LINK_EXCL, and make sure that the deferred QC handling via
workqueue is not used for this return value.
This way, PMPs with CBS will work once again. Note that the starvation
referenced in commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ
command starvation") can only happen on libsas ports, and libsas does not
support Port Multipliers, thus there is no harm of reverting back to the
previous way of deferring commands for PMPs with CBS.
Non-libsas ports connected to anything but a PMP with CBS (e.g. a normal
drive or a PMP with FBS) will continue using the deferred workqueue, since
it does result in lower completion latencies for non-NCQ commands, even
though the workqueue is not strictly needed to avoid starvation for
non-libsas ports.
If we want to modify the scope of the workqueue issuing to also handle
PMPs with CBS, then we should ensure that we can save both NCQ and non-NCQ
commands in ap->deferred_qc, while also removing the existing PMP CBS
handling using ap->excl_link, such that we don't duplicate features.
While at it, also add a comment explaining how the ap->excl_link mechanism
works.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Tested-by: Tommy Kelly <linux@tkel.ly> Reported-by: Tommy Kelly <linux@tkel.ly> Closes: https://lore.kernel.org/linux-ide/ce09cc21-a8e9-4845-b205-35411e22fba9@tkel.ly/ Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>
Liviu Dudau [Thu, 7 May 2026 10:50:46 +0000 (11:50 +0100)]
drm/syncobj: Fix memory leak in drm_syncobj_find_fence()
Commit 18226ba52159 ("drm/syncobj: reject invalid flags in
drm_syncobj_find_fence") forgot to take into account the fact that
drm_syncobj_find() takes a reference to syncobj and returns early
without dropping the reference, leading to memory leaks.
Fixes: 18226ba52159 ("drm/syncobj: reject invalid flags in drm_syncobj_find_fence")
Reported by: Sam Spencer <sam.spencer@arm.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Erik Kurzinger <ekurzinger@gmail.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://lore.kernel.org/all/20260507144425.2488057-1-liviu.dudau@arm.com
Niklas Cassel [Thu, 14 May 2026 07:39:00 +0000 (09:39 +0200)]
ata: libata-scsi: do not use the deferred QC feature for ATA_DEFER_PORT
The deferred QC feature was meant to handle mixed NCQ and non-NCQ commands,
i.e. for return value ATA_DEFER_LINK.
ATA_DEFER_PORT is returned by PATA drivers, but also certain SATA drivers
like sata_mv and sata_sil24 that uses ap->excl_link to workaround hardware
bugs in these HBAs. Regardless of the reason, using the deferred QC feature
for ATA_DEFER_PORT is always wrong, and will break the ap->excl_link usage
of the SATA drivers that rely on that feature.
Modify ata_scsi_qc_issue() to only use the deferred QC feature when mixing
NCQ and non-NCQ commands, i.e. ATA_DEFER_LINK.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Tested-by: Tommy Kelly <linux@tkel.ly> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>
regulator: tps65219: fix irq_data.rdev not being assigned
Commit 64a6b577490c ("regulator: tps65219: Remove debugging helper
function") removed the tps65219_get_rdev_by_name() helper along with
the irq_data.rdev assignment that depended on it. This left
irq_data.rdev uninitialized for all IRQs, causing undefined behavior
when regulator_notifier_call_chain() is called from the IRQ handler:
Internal error: Oops: 0000000096000004
pc : regulator_notifier_call_chain
lr : tps65219_regulator_irq_handler
Call trace:
regulator_notifier_call_chain
tps65219_regulator_irq_handler
handle_nested_irq
regmap_irq_thread
irq_thread_fn
irq_thread
kthread
ret_from_fork
Instead of restoring a dedicated lookup array, restructure the probe
function to combine regulator registration with IRQ registration in
the same loop. This way the rdev returned by devm_regulator_register()
is naturally available for assigning to irq_data.rdev without any
auxiliary data structure.
Non-regulator IRQs (SENSOR, TIMEOUT) that don't correspond to any
registered regulator are registered with rdev=NULL, and the IRQ handler
is protected with a NULL check to avoid crashing.
Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/aBDSTxALaOc-PD7X@gaggiata.pivistrello.it/ Reported-by: Francesco Dolcini <francesco@dolcini.it> Fixes: 64a6b577490c ("regulator: tps65219: Remove debugging helper function") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://patch.msgid.link/20260518083113.2063368-1-alexander.sverdlin@siemens.com Signed-off-by: Mark Brown <broonie@kernel.org>
Chen-Yu Tsai [Thu, 14 May 2026 10:12:52 +0000 (18:12 +0800)]
arm64: dts: mediatek: mt8195-cherry: Sort top level nodes correctly
The thermistor device nodes were added before the vbus regulator and
reserved memory nodes, when they should be after them, based on
alphabetical order of the device node _name_.
Move them to the correct position. No functional changes intended.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Chen-Yu Tsai [Thu, 14 May 2026 10:12:50 +0000 (18:12 +0800)]
arm64: dts: mediatek: mt8192-asurada: Add (BT|WIFI)_KILL_1V8_L GPIO line names
GPIO lines 59 and 61 are named BT_KILL_1V8_L and WIFI_KILL_1V8_L in the
hardware design. Add them to the gpio-line-names property to make the
names available to users and developers.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Julien Chauveau [Tue, 24 Mar 2026 19:30:11 +0000 (20:30 +0100)]
drm/bridge: it66121: acquire reset GPIO in probe
The it66121_ctx structure has a gpio_reset field, and it66121_hw_reset()
calls gpiod_set_value() on it. However, the GPIO descriptor is never
acquired via devm_gpiod_get(), leaving gpio_reset as NULL throughout
the driver lifetime.
gpiod_set_value() silently returns when passed a NULL descriptor, so
the hardware reset sequence in it66121_hw_reset() is a no-op. This
leaves the chip in an undefined state at probe time, which can prevent
it from responding on the I2C bus.
The DT binding marks reset-gpios as a required property, so all
compliant device trees provide this GPIO. Add the missing
devm_gpiod_get() call after enabling power supplies and before the
hardware reset, so the chip is properly reset with power applied.
Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver") Cc: stable@vger.kernel.org Signed-off-by: Julien Chauveau <chauveau.julien@gmail.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patch.msgid.link/20260324193011.16583-1-chauveau.julien@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Mark Brown [Mon, 18 May 2026 09:19:53 +0000 (10:19 +0100)]
spi: switch to managed controller allocation (part 3/3)
Johan Hovold <johan@kernel.org> says:
In preparation for fixing the SPI controller API so that it no longer
drops a reference when deregistering (non-managed) controllers (cf.
[1]), this series converts drivers using managed registration to also
use managed allocation.
Included is also a related cleanup of a lp8841-rtc.
This leaves us with 18 drivers using non-managed allocation, which is
few enough to be able to fix the API in tree-wide change.
Thorsten Blum [Mon, 4 May 2026 08:18:05 +0000 (10:18 +0200)]
dio: Replace deprecated strcpy with strscpy in dio_init
strcpy() has been deprecated [1] because it performs no bounds checking
on the destination buffer, which can lead to buffer overflows. While the
current code works correctly, replace strcpy() with the safer strscpy()
to follow secure coding best practices.
Johan Hovold [Fri, 24 Apr 2026 10:40:11 +0000 (12:40 +0200)]
nubus: Switch to dynamic root device
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly if a device that lacks a release function is
ever freed.
Use root_device_register() to allocate and register the root device
instead of open coding using a static device.
Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://patch.msgid.link/20260424104011.2616970-1-johan@kernel.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Prathamesh Shete [Thu, 14 May 2026 12:48:34 +0000 (12:48 +0000)]
dt-bindings: gpio: Add Tegra238 support
Extend the existing Tegra186 GPIO controller device tree bindings with
support for the GPIO controllers found on Tegra238. Tegra238 has two
GPIO controllers: the main controller and always-on (AON) controller.
The number of pins is slightly different, but the programming model
remains the same.
Add a new header, include/dt-bindings/gpio/nvidia,tegra238-gpio.h,
that defines port IDs as well as the TEGRA238_MAIN_GPIO() helper,
both of which are used in conjunction to create a unique specifier
for each pin.
Regions with a BO are checked against the BO size, but the SRAM
region is not. The SRAM region doesn't have a BO, but the command stream
region size should be checked against the SRAM size. The job's
"sram_size" isn't useful here because an evil userspace could lie about
the size.
Introduce device-managed helpers for clocksource registration.
The clocksource framework currently provides __clocksource_register_scale()
along with convenience wrappers for Hz and kHz registration. However,
drivers must handle error paths and cleanup manually, typically by pairing
registration with an explicit clocksource_unregister() call.
Add a devm-based variant, __devm_clocksource_register_scale(), along with
devm_clocksource_register_hz() and devm_clocksource_register_khz() helpers.
These helpers register the clocksource and attach a devres action to
automatically unregister it on driver detach or probe failure.
This simplifies driver code by:
* removing explicit cleanup paths
* ensuring correct teardown ordering
* aligning with the devm-based resource management model widely used
across the kernel
While drivers can open-code devm_add_action_or_reset(), providing a
dedicated helper avoids duplication, reduces boilerplate, and ensures
consistent usage across drivers, following patterns used in other
subsystems.
This is also particularly useful for drivers built as modules, where
device-managed resource handling avoids manual cleanup in remove paths and
ensures correct teardown on module unload.
This helper is self-contained and can be adopted progressively by drivers.
Crystal Wood [Tue, 12 May 2026 17:37:31 +0000 (12:37 -0500)]
rtla: Stop the record trace on interrupt
Before, when rtla got a signal, it stopped the main trace but not the
record trace. With "--on-end trace", this can lead to
save_trace_to_file() failing to keep up, especially on a debug kernel.
Plus, it adds post-stoppage noise to the trace file.
Signed-off-by: Crystal Wood <crwood@redhat.com> Fixes: c73cab9dbed0 ("rtla/timerlat_hist: Stop timerlat tracer on signal") Fixes: a4dfce7559d7 ("rtla/timerlat_top: Stop timerlat tracer on signal") Fixes: 3aadb65db5d6 ("rtla/timerlat: Add action on end feature") Link: https://lore.kernel.org/r/20260512173731.2151841-1-crwood@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Tomas Glozar [Fri, 24 Apr 2026 14:02:44 +0000 (16:02 +0200)]
rtla/tests: Add unit tests for actions module
Add unit tests covering all functions in the actions module, including
both valid and invalid inputs and all action types, except for
actions_perform(), where only shell and continue actions are tested.
To support testing multiple modules, the unit test build was modified so
that it links the entire rtla-in.o file. For this to work, the main()
function in rtla.c was declared weak, so that the unit test main is able
to override it.
Other included minor changes to unit tests are:
- Make unit test output verbose to show which tests are being run, now
that we have more than 3 tests.
- Add unit_tests file to .gitignore.
- Split unit test sources to one file per test suite, and keep only
main() function in unit_tests.c.
- Fix Makefile dependencies so that "make unit-tests" will rebuild the
binary with the changes in the commit.
Also with the linking the entire rtla-in.o file, it now has rtla's
nr_cpus symbol, so the declaration in utils unit tests is made extern.
Tomas Glozar [Thu, 23 Apr 2026 13:05:58 +0000 (15:05 +0200)]
rtla/tests: Add runtime tests for -C/--cgroup
Add a new script check-cgroup-match.sh that retrieves the cgroup of the
main rtla process and compares it to the cgroup of the rtla workload
threads.
Add a new test based on this script, for both osnoise and timerlat
tools, testing the variant of -C without argument (which sets the cgroup
of the workload to the cgroup of the rtla main process).
Note that this has to be tested in kernel mode to be significant for
timerlat tool, as user workloads inherit the parent rtla process cgroup
even without the option.
Tomas Glozar [Thu, 23 Apr 2026 13:05:57 +0000 (15:05 +0200)]
rtla/tests: Add runtime test for -k and -u options
Add runtime test for rtla-timerlat's -k/--kernel-threads and
-u/--user-threads options using get_workload_pids.sh to check whether
the appropriate threads are being created.
The tests are implemented for both top and hist. Additionally, all tests
related to timerlat threads are moved to a separate section in the test
files. The latter is also done for rtla-osnoise tests.
Tomas Glozar [Thu, 23 Apr 2026 13:05:55 +0000 (15:05 +0200)]
rtla/tests: Cover all hist options in runtime tests
Cover all options regarding histogram formatting for both
rtla-osnoise-hist and rtla-timerlat-hist tools. All options also have
output checking using positive or negative match, except for
-b/--bucket-size and -E/--entries, which cannot be tested in isolated
due to the output depending on the actual data collected.
Old -E/--entries test for rtla-osnoise was replaced with a new one
equivalent to the timerlat one.
Tomas Glozar [Thu, 23 Apr 2026 13:05:54 +0000 (15:05 +0200)]
rtla/tests: Extend timerlat top --aa-only coverage
rtla-timerlat-top's --aa-only option is currently only tested for return
value.
Extend the tests to also check that only auto-analysis is being done via
a negative match for the "Timer Latency" text in the top header, and
further split the test case into two:
- one test case for --aa-only stopping on threshold
- one test case for --aa-only exiting without threshold being hit
For both cases, the expected output ("analyzing it" or "Max latency was"
respectively) is checked against in addition to the negative match.
Tomas Glozar [Thu, 23 Apr 2026 13:05:52 +0000 (15:05 +0200)]
rtla/tests: Check -c/--cpus thread affinity
RTLA runtime tests verify the -c/--cpus options, but do not check
whether the correct affinity is actually applied.
Add a script named check-cpus.sh that retrieves the affinity of all
workload threads and use it to check the -c/--cpus option for both
osnoise and timerlat tools.
Tomas Glozar [Thu, 23 Apr 2026 13:05:51 +0000 (15:05 +0200)]
rtla/tests: Add get_workload_pids() helper
RTLA runtime tests that check workload processes (currently the test
case "verify -P/--priority" of timerlat.t and "verify the --priority/-P
param" of osnoise.t) use "pgrep timerlatu/" or "pgrep osnoise/"
respectively to identify the workload.
Make them more robust by adding a get_workload_pids() helper that
finds the main rtla process and returns the PIDs of all siblings other
than the test script itself, plus all child processes of kthreadd that
have the osnoise/timerlat kthread pattern comm.
This filters out any spurious processes not related to the running test
that happen to have "timerlatu/" or "osnoise/" in their command, for
example, a user grepping the same names at the time of the running of
the test.
Tomas Glozar [Thu, 23 Apr 2026 13:05:50 +0000 (15:05 +0200)]
rtla/tests: Cover both top and hist tools where possible
RTLA runtime tests currently do not cover both tool variants for osnoise
and timerlat properly. Many tests applicable to both tools are only
tested for one tool, selected randomly.
Introduce two new shell functions, check_top_hist() and
check_top_q_hist(). The functions use the same syntax as check() and run
check() on the arguments twice: once replacing the "TOOL" string in the
command with "top" (or "top -q"), once replacing it with "hist". The top
-q variant is used for tests relying on messages printed after aborting
the RTLA main loop with a starting new line, which only happens for top
tools in quiet mode; without -q, the top output is printed on the same
line and the matches would fail.
Tests that are applicable to both top and hist tools were modified to
the run for both; additionally, tests that were already done for both
tools were migrated to the new shell functions, unless the test command
or matches differ between the tools. Additional tests were added to test
tool-specific help messages.
RDMA/cma: Constify struct configfs_item_operations and configfs_group_operations
'struct configfs_item_operations' and 'configfs_group_operations' are not
modified in this driver.
Constifying these structures moves some data to a read-only section, so
increases overall security, especially when the structure holds some
function pointers.
On a x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
6677 2776 64 9517 252d drivers/infiniband/core/cma_configfs.o
After:
=====
text data bss dec hex filename
6901 2552 64 9517 252d drivers/infiniband/core/cma_configfs.o
Jason Gunthorpe [Tue, 12 May 2026 00:09:39 +0000 (21:09 -0300)]
RDMA: Replace memset with = {} pattern for ib_respond_udata()
Most drivers do this already, but some open-code a memset. Switch
all instances found. qedr_copy_qp_uresp() is already called with
zeroed memory so that memset is redundant.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 12 May 2026 00:09:37 +0000 (21:09 -0300)]
RDMA: Use proper driver data response structs instead of open coding
At some point the response structs were added and rdma-core is using
them, but the kernel was not changed to use them as well. Replace
the open-coded copy with the right struct and ib_respond_udata().
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 12 May 2026 00:09:36 +0000 (21:09 -0300)]
RDMA/mlx: Replace response_len with ib_respond_udata()
The Mellanox drivers have a pattern where they compute the response
length they think they need based on what the user asked for, then
blindly write that ignoring the provided size limit on the response
structure.
Drop this and just use ib_respond_udata() which caps the response
struct to the user's memory, which is fine for what mlx5 is doing.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Tue, 12 May 2026 00:09:34 +0000 (21:09 -0300)]
RDMA/cxgb4: Convert to ib_respond_udata()
These cases carefully work around 32-bit unpadded structures, but
the min integrated into ib_respond_udata() handles this
automatically. Zero-initialize data that would not have been copied.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Rohit Chavan [Tue, 5 May 2026 08:57:08 +0000 (14:27 +0530)]
RDMA/bng_re: Remove unused variable rc
The variable 'rc' is initialized to 0 and returned at the end of
bng_re_process_qp_event(), but it is never modified in between.
Simplify the function by removing the redundant variable and returning 0
directly. This cleans up the code and avoids potential compiler warnings
about unused variables.
Chengchang Tang [Thu, 7 May 2026 01:21:48 +0000 (09:21 +0800)]
RDMA/hns: Support congestion control algorithm parameter configuration
hns RoCE supports 4 congestion control algorithms. Each algorihm
involves multiple parameters. Support configuring these parameters
by debugfs.
Here are some examples of this feature:
* The directory structure:
$ ls /sys/kernel/debug/hns_roce/0000\:35\:00.0/
dcqcn_cc_param dip_cc_param hc3_cc_param ldcp_cc_param
$ ls /sys/kernel/debug/hns_roce/0000\:35\:00.0/dcqcn_cc_param/
ai al alp ashift cnp_time f g lifespan max_speed tkp tmp
* Read the value of a param:
$ cat /sys/kernel/debug/hns_roce/0000\:35\:00.0/dcqcn_cc_param/ai
1
* Set a new value for a param:
$ echo 2 > /sys/kernel/debug/hns_roce/0000\:35\:00.0/dcqcn_cc_param/ai
Junxian Huang [Thu, 7 May 2026 01:21:46 +0000 (09:21 +0800)]
RDMA/hns: Initialize seqfile before creating file
The debugfs file was created before seq->read and seq->data were set,
leaving a small window where userspace could access an uninitialized
seqfile.
Move debugfs_create_file() after the assignments to avoid this issue.
Also, inline the original init_debugfs_seqfile() since it is not a
really necessary helper.
Rohit Chavan [Tue, 5 May 2026 10:05:49 +0000 (15:35 +0530)]
RDMA/mlx5: Use max() macro for bfreg calculation
Simplify the calculation of medium blue flame registers by using the
max() macro instead of open-coded ternary logic. This improves
readability and aligns with the subsystem's preference for using
standard kernel helpers.
Sara Venkatesh [Mon, 4 May 2026 08:00:36 +0000 (01:00 -0700)]
RDMA/srpt: fix integer overflow in immediate data length check
imm_buf->len is a user-controlled uint32_t received from the network.
Adding it to imm_data_offset without overflow checking allows a
malicious initiator to send len=0xFFFFFFFF, causing req_size to wrap
around to a small value, bypassing the bounds check, and subsequently
passing a ~4GB length to sg_init_one().
Use check_add_overflow() to detect wrapping before the comparison.
Fixes: 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data") Reported-by: Carlos Bilbao (Lambda) <carlos.bilbao@kernel.org> Signed-off-by: Sara Venkatesh <sarajvenkatesh@gmail.com> Link: https://patch.msgid.link/20260504080036.3482415-1-sarajvenkatesh@gmail.com Reviewed-by: Carlos Bilbao (Lambda) <carlos.bilbao@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Rohit Chavan [Tue, 5 May 2026 07:53:07 +0000 (13:23 +0530)]
RDMA/mlx4: Use secs_to_jiffies() instead of open-coding
The conversion from seconds to jiffies is currently performed by
multiplying the value by 1000 and passing it to msecs_to_jiffies().
Use the more direct secs_to_jiffies() helper instead. This simplifies the
code, improves readability, and avoids the manual multiplication step
by using the dedicated kernel API.
Li RongQing [Sun, 3 May 2026 02:33:49 +0000 (22:33 -0400)]
IB/mlx5: Reduce spinlock contention by moving free operations outside
The functions kfree() and kvfree() can occasionally trigger a long
chain of calls or face contention in the slab allocator. Executing
these inside a spinlock increases the risk of CPU stalls and increases
lock contention under heavy event load.
Move the memory freeing logic out of the critical sections in devx.c
by using temporary lists and local flags. This narrows the lock's
scope to only protect the list integrity and state transitions.
MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT() links event_sub into sub_list
before initializing the fields used by the shared error path.
If eventfd_ctx_fdget() then fails, the unwind path dereferences
event_sub->ev_file in uverbs_uobject_put() and calls
subscribe_event_xa_dealloc() with an unset xa_key_level1.
subscribe_event_xa_alloc() creates the XA entry exactly once for a given
key_level1, on the first occurrence of that key. The unwind path must
therefore call subscribe_event_xa_dealloc() exactly once for it as well.
Enforce that by adding devx_key_in_sub_list() and calling
subscribe_event_xa_dealloc() only when the last matching pending entry is
being cleaned up.
Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX") Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com> Link: https://patch.msgid.link/20260428224319.37682-1-prathameshdeshpande7@gmail.com Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
RDMA/mlx5: Fix UMR XLT cleanup on ODP populate failure
mlx5r_umr_update_xlt() allocates and DMA maps an XLT buffer with
mlx5r_umr_create_xlt(). The buffer is released by the common cleanup path
through mlx5r_umr_unmap_free_xlt().
After mlx5_odp_populate_xlt() became fallible, its error path returned
directly and skipped that cleanup. This leaks the XLT DMA mapping and
buffer. If the emergency XLT page was used, it also leaves
xlt_emergency_page_mutex locked.
Break out of the loop so execution falls through the existing cleanup path.
RDMA/efa: Add checksum support for admin responses
EFA devices added support for CRC16 checksum on admin responses and to
expose it to the driver the API version increased to 0.2. Add a check
for support on device init and if supported validate the checksum on
each admin response the driver receives. If the checksum validation
failed, drop the CQE.
Add the CRC16 module to Kconfig to have the in-tree dependency.
Create a virtual TUN net device with RXE support, then run rping
server and client to invoke networking packets, finally compare both
*port_xmit_data* and *port_rcv_data* of such device.
zhenwei pi [Tue, 14 Apr 2026 06:29:47 +0000 (14:29 +0800)]
RDMA/rxe: support perf mgmt GET method
In RXE, hardware counters are already supported, but not in a
standardized manner. For instance, user-space monitoring tools like
atop only read from the *counters* directory. Therefore, it is
necessary to add perf management support to RXE.
Also use rxe_counter_get instead of raw atomic64_read in hw-counters.
zhenwei pi [Tue, 14 Apr 2026 06:29:45 +0000 (14:29 +0800)]
RDMA/rxe: remove rxe_ib_device_get_netdev() and RXE_PORT
Suggested by Leon, remove the rxe_ib_device_get_netdev() wrapper and
the RXE_PORT definition. These additions do not improve readability,
and RXE has always had only a single port.
RDMA/hns: Fix arithmetic overflow in calc_hem_config()
If bt_num is 3 or 2, then the expressions like
l0_idx * chunk_ba_num + l1_idx are computed in 32-bit
arithmetic before being assigned to a u64 index field,
which can lead to overflow.
Cast the first operand to u64 to ensure the arithmetic
is performed in 64-bit.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
RDMA/mlx5: Use QP port when decoding responder CQEs
The responder CQE path determines the link layer via
rdma_port_get_link_layer(). Use qp->port instead of
hardcoding port 1, which can mis-decode completions on
multi-port devices.
IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
mlx5_ib_alloc_transport_domain() allocates a transport domain and then
may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.
Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
error. Also return 0 explicitly in the no-loopback-capability success
branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().
Fixes: 146d2f1af324 ("IB/mlx5: Allocate a Transport Domain for each ucontext") Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Helen Koike [Mon, 11 May 2026 21:53:05 +0000 (18:53 -0300)]
debugobjects: Do not fill_pool() if pi_blocked_on
On RT enabled kernels, fill_pool() ends up calling rtlock_lock(), which
asserts if current::pi_blocked_on is set, because a task can obviously only
block on one lock as otherwise the priority inheritenace chain gets
corrupted.
Prevent this by expanding the conditional to take current::pi_blocked_on
into account.
Fixes: 4bedcc28469a ("debugobjects: Make them PREEMPT_RT aware") Reported-by: syzbot+b8ca586b9fc235f0c0df@syzkaller.appspotmail.com Signed-off-by: Helen Koike <koike@igalia.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260511215359.3351259-1-koike@igalia.com Closes: https://syzkaller.appspot.com/bug?extid=b8ca586b9fc235f0c0df
udf: validate free block extents against the partition length
udf_free_blocks() checks the logical block number and count against the
partition length, but drops the extent offset from that final bound. A
crafted extent can pass the guard while logicalBlockNum + offset + count
points past the partition, which later indexes past the space bitmap
array.
A single ftruncate(2) on a file backed by such an extent reliably
panics the kernel. This is a local availability issue. On desktop
systems where UDisks/polkit allows the active user to mount removable
UDF media without CAP_SYS_ADMIN, an unprivileged local user can supply
the crafted filesystem and trigger the panic by truncating a writable
file on it. Systems that require root or CAP_SYS_ADMIN to mount the
image have a higher prerequisite.
No confidentiality or integrity impact is claimed: the reproduced
primitive is an out-of-bounds read of a bitmap pointer slot followed by
a kernel panic.
Use the already computed logicalBlockNum + offset + count value for the
partition length check. Also make load_block_bitmap() reject an
out-of-range block group before indexing s_block_bitmap[], so corrupted
callers cannot walk past the flexible array.
Fixes: 56e69e59751d ("udf: prevent integer overflow in udf_bitmap_free_blocks()") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260515142327.1120767-1-michael.bommarito@gmail.com Signed-off-by: Jan Kara <jack@suse.cz>
gpio: Initialize i2c_device_id arrays using member names
While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.
The mentioned robustness is relevant for a planned change to struct
i2c_device_id that replaces .driver_data by an anonymous union.
This patch doesn't modify the compiled arrays, only their representation
in source form benefits. The former was confirmed with x86 and arm64
builds.
Chen Ni [Tue, 28 Apr 2026 07:53:29 +0000 (15:53 +0800)]
pwm: atmel-tcb: Remove unneeded semicolon
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the
semantic patch at scripts/coccinelle/misc/semicolon.cocci.
This was introduced in commit 68637b68afcc ("pwm: atmel-tcb:
Cache clock rates and mark chip as atomic") in Uwe's adaption of
Sangyun's original patch.
Jianpeng Chang [Wed, 13 May 2026 07:22:09 +0000 (15:22 +0800)]
dma-mapping: move dma_map_resource() sanity check into debug code
dma_map_resource() uses pfn_valid() to ensure the range is not RAM.
However, pfn_valid() only checks for availability of the memory map for
a PFN but it does not ensure that the PFN is actually backed by RAM. On
ARM64 with SPARSEMEM (128MB section granularity), MMIO addresses that
share a section with RAM will falsely trigger the WARN_ON_ONCE and cause
dma_map_resource() to return DMA_MAPPING_ERROR.
This causes a WARNING on Raspberry Pi 4 during spi_bcm2835 probe because
the SPI FIFO register (0xfe204004) falls in the same sparsemem section
as the end of RAM (0xf8000000-0xfbffffff), both in section 31
(0xf8000000-0xffffffff).
Move the sanity check from dma_map_resource() into debug_dma_map_phys()
and replace the unreliable pfn_valid() with pfn_valid() &&
!PageReserved(), which correctly identifies actual usable RAM without
false positives for MMIO regions that happen to have struct pages.
Since dma_map_resource() is dma_map_phys(DMA_ATTR_MMIO), the check
applies equally to both APIs. Any non-reserved page represents kernel
memory to a sufficient degree that using DMA_ATTR_MMIO on it is almost
certainly wrong and risks breaking coherency on non-coherent platforms.
ZONE_DEVICE pages used for PCI P2P DMA (MEMORY_DEVICE_PCI_P2PDMA) have
PageReserved set, so they will not trigger a false positive.
The check no longer blocks the mapping and uses err_printk() to
integrate with dma-debug filtering.
Fixes: f7326196a781 ("dma-mapping: export new dma_*map_phys() interface") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260513072209.1486986-1-jianpeng.chang.cn@windriver.com
Stepan Ionichev [Sun, 17 May 2026 16:15:30 +0000 (21:15 +0500)]
pinctrl: intel: move PWM base computation past feature check
Compute base inside intel_pinctrl_probe_pwm() only after the
PINCTRL_FEATURE_PWM and CONFIG_PWM_LPSS checks have passed. Tidy
up; no functional change.
Suggested-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/linux-gpio/aglu5jy5SbW9Wjwj@ashevche-desk.local/ Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
soc: aspeed: Move MODULE_DEVICE_TABLE next to the table itself
By convention MODULE_DEVICE_TABLE() immediately follows the ID table it
exports, because this is easier to read and verify. It also makes more
sense since #ifdef for ACPI or OF could hide both of them.
Most of the privers already have this correctly placed, so adjust
the missing ones. No functional impact.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Ender Hsieh [Tue, 5 May 2026 05:05:41 +0000 (14:05 +0900)]
ARM: dts: aspeed: msx4: enable BMC networking via MAC0
Add &mac0, &mdio3, and the ethphy3 PHY node to enable BMC networking
on the AST2600-based NVIDIA MSX4 board. The PHY is attached to MDIO3
at address 2 and uses RGMII with PHY-internal delays.
These nodes were intentionally omitted in commit f28674fab34f ("ARM:
dts: aspeed: Add NVIDIA MSX4 HPM") at Andrew Lunn's request, pending
clarification of the RGMII delay handling. Following his guidance on
linux-aspeed, the bootloader has been modified to stop enabling MAC
clock delays on the SoC side, so phy-mode = "rgmii-id" correctly
results in the PHY adding the required ~2ns delay without any
double-delay from the MAC controller.
The corresponding U-Boot change has been submitted to openbmc/u-boot.
Jouni Högander [Fri, 15 May 2026 09:57:56 +0000 (12:57 +0300)]
drm/i915/psr: Apply SDP on prior scanline workaround for Xe3p
In Xe3p there is an HW optimization done. When there is an SU triggered in
Capture state, Link will be kept ON post Capture CRC SDP. Before valid SU
pixels Intel source will transmit dummy pixels. Some TCONS are improperly
considering these dummy pixels as a valid pixel data. Prior Xe3p link was
was turned off even if there was SU triggered in capture state and no dummy
pixels were transmitted. These dummy pixels are problem only if SDP on
prior scanline is used and Early Transport is not in use. The workaround is
to start SU area always at scanline 0.
Jouni Högander [Fri, 15 May 2026 09:57:55 +0000 (12:57 +0300)]
drm/i915/psr: Apply Intel DPCD workaround when SDP on prior line used
There is Intel specific workaround DPCD address containing workaround for
case where SDP is on prior line. Apply this workaround according to values
in the offset.
Jouni Högander [Fri, 15 May 2026 09:57:54 +0000 (12:57 +0300)]
drm/i915/psr: Read Intel DPCD workaround register
Read Intel DPCD workaround register and store it into
intel_connector->dp.psr_caps. psr_caps was chosen as currently it contains
only PSR workaround for PSR2 SDP on prior scanline implementation.