dma: dw-edma: Fix build warning in dw_edma_pcie_probe()
The function dw_edma_pcie_probe() in dw-edma-pcie.c triggered a
frame size warning:
ld.lld:warning:
drivers/dma/dw-edma/dw-edma-pcie.c:162:0: stack frame size (1040) exceeds limit (1024) in function 'dw_edma_pcie_probe'
This patch reduces the stack usage by dynamically allocating the
`vsec_data` structure using kmalloc(), rather than placing it on
the stack. This eliminates the overflow warning and improves kernel
robustness.
Dan Carpenter [Tue, 1 Jul 2025 22:31:40 +0000 (17:31 -0500)]
dmaengine: nbpfaxi: Fix memory corruption in probe()
The nbpf->chan[] array is allocated earlier in the nbpf_probe() function
and it has "num_channels" elements. These three loops iterate one
element farther than they should and corrupt memory.
The changes to the second loop are more involved. In this case, we're
copying data from the irqbuf[] array into the nbpf->chan[] array. If
the data in irqbuf[i] is the error IRQ then we skip it, so the iterators
are not in sync. I added a check to ensure that we don't go beyond the
end of the irqbuf[] array. I'm pretty sure this can't happen, but it
seemed harmless to add a check.
On the other hand, after the loop has ended there is a check to ensure
that the "chan" iterator is where we expect it to be. In the original
code we went one element beyond the end of the array so the iterator
wasn't in the correct place and it would always return -EINVAL. However,
now it will always be in the correct place. I deleted the check since
we know the result.
Robert Richter [Fri, 11 Jul 2025 15:15:27 +0000 (17:15 +0200)]
cxl: Remove core/acpi.c and cxl core dependency on ACPI
From Dave [1]:
"""
It was a mistake to introduce core/acpi.c and putting ACPI dependency on
cxl_core when adding the extended linear cache support.
"""
Current implementation calls hmat_get_extended_linear_cache_size() of
the ACPI subsystem. That external reference causes issue running
cxl_test as there is no way to "mock" that function and ignore it when
using cxl test.
Instead of working around that using cxlrd ops and extensively
expanding cxl_test code [1], just move HMAT calls out of the core
module to cxl_acpi. Implement this by adding a @cache_size member to
struct cxl_root_decoder. During initialization the cache size is
determined and added to the root decoder object in cxl_acpi. Later on
in cxl_core the cache_size parameter is used to setup extended linear
caching.
Hugo Villeneuve [Mon, 14 Jul 2025 13:37:30 +0000 (09:37 -0400)]
gpio: pca953x: use regmap_update_bits() to improve performance
Using regmap_update_bits() allows to reduce the number of I2C transfers
when updating bits that haven't changed on non-volatile registers.
For example on a PCAL6416, when changing a GPIO direction from input to
output, the number of I2C transfers can be reduced from 4 to just 1 if
the pull resistors configuration hasn't changed and the output value
is the same as before.
Select PM_GENERIC_DOMAINS instead of depending on it to ensure
it is always enabled when TI_SCI_PM_DOMAINS is selected.
Since PM_GENERIC_DOMAINS is an implicit symbol, it can only be enabled
through 'select' and cannot be explicitly enabled in configuration.
This simplifies the dependency chain and prevents build issues
Daniel Almeida [Mon, 14 Jul 2025 18:52:04 +0000 (15:52 -0300)]
rust: regulator: add a bare minimum regulator abstraction
Add a bare minimum regulator abstraction to be used by Rust drivers.
This abstraction adds a small subset of the regulator API, which is
thought to be sufficient for the drivers we have now.
Regulators provide the power needed by many hardware blocks and thus are
likely to be needed by a lot of drivers.
It was tested on rk3588, where it was used to power up the "mali"
regulator in order to power up the GPU.
Kai Huang [Sun, 13 Jul 2025 22:20:20 +0000 (10:20 +1200)]
KVM: x86: Reject KVM_SET_TSC_KHZ vCPU ioctl for TSC protected guest
Reject KVM_SET_TSC_KHZ vCPU ioctl if guest's TSC is protected and not
changeable by KVM, and update the documentation to reflect it.
For such TSC protected guests, e.g. TDX guests, typically the TSC is
configured once at VM level before any vCPU are created and remains
unchanged during VM's lifetime. KVM provides the KVM_SET_TSC_KHZ VM
scope ioctl to allow the userspace VMM to configure the TSC of such VM.
After that the userspace VMM is not supposed to call the KVM_SET_TSC_KHZ
vCPU scope ioctl anymore when creating the vCPU.
The de facto userspace VMM Qemu does this for TDX guests. The upcoming
SEV-SNP guests with Secure TSC should follow.
Note, TDX support hasn't been fully released as of the "buggy" commit,
i.e. there is no established ABI to break.
Fixes: adafea110600 ("KVM: x86: Add infrastructure for secure TSC") Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Link: https://lore.kernel.org/r/71bbdf87fdd423e3ba3a45b57642c119ee2dd98c.1752444335.git.kai.huang@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
pmdomain: sunxi: sun20i-ppu: change to tristate and enable for ARCH_SUNXI
There is no reason why the sun20i-ppu cannot be built as a module. So
change it to tristate.
Also enable it by default for ARCH_SUNXI since this driver is required
for some peripherals to work, and update the help text to reflect this
requirement.
pmdomain: sunxi: add driver for Allwinner A523's PCK-600 power controller
Allwinner A523 family has a second power controller, named PCK-600 in
the datasheets and BSP. It is likely based on ARM's PCK-600 hardware
block, with some additional delay controls. The only documentation for
this hardware is the BSP driver. The standard registers defined in ARM's
Power Policy Unit Architecture Specification line up. Some extra delay
controls are found in the reserved range of registers.
Add a driver for this power controller. Delay control register values
and power domain names are from the BSP driver.
Ming Lei [Sun, 13 Jul 2025 14:34:06 +0000 (22:34 +0800)]
selftests: ublk: remove `tag` parameter of ->tgt_io_done()
The `tag` parameter can be figured out from cqe->user_data, and that is
also the only way to get the info, so remove `tag` parameter, and
let target code retrieve it from cqe->user_data.
Ming Lei [Sun, 13 Jul 2025 14:34:04 +0000 (22:34 +0800)]
ublk: remove ublk_commit_and_fetch()
Remove ublk_commit_and_fetch() and open code request completion.
Consolidate accesses to struct ublk_io in UBLK_IO_COMMIT_AND_FETCH_REQ. When
the ublk_io daemon task restriction is relaxed in the future, ublk_io will
need to be protected by a lock. Unregister the auto-registered buffer and
complete the request last, as these don't need to happen under the lock.
Ming Lei [Sun, 13 Jul 2025 14:34:03 +0000 (22:34 +0800)]
ublk: add helper ublk_check_fetch_buf()
Add a helper ublk_check_fetch_buf() to validate UBLK_IO_FETCH_REQ's addr.
This doesn't require access to the ublk_io, so it can be done before taking
the ublk_device mutex.
This way also fixes one missing return value of -EINVAL in case of early
failure from ublk_fetch().
Fixes: b69b8edfb27d ("ublk: properly serialize all FETCH_REQs") Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250713143415.2857561-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sun, 13 Jul 2025 14:34:00 +0000 (22:34 +0800)]
ublk: avoid to pass `struct ublksrv_io_cmd *` to ublk_commit_and_fetch()
Refactor ublk_commit_and_fetch() in the following way for removing
parameter of `struct ublksrv_io_cmd *`:
- return `struct request *` from ublk_fill_io_cmd(), so that we can
use request reference reliably in this way cause both request and
io_uring_cmd reference share same storage
- move ublk_fill_io_cmd() before calling into ublk_commit_and_fetch(),
so that ublk_fill_io_cmd() could be run with per-io lock held for
supporting command batch.
- pass ->zone_append_lba to ublk_commit_and_fetch() directly
The main motivation is to reproduce ublk_commit_and_fetch() for fetching
io command batch with multishot uring_cmd.
Ming Lei [Sun, 13 Jul 2025 14:33:58 +0000 (22:33 +0800)]
ublk: move fake timeout logic into __ublk_complete_rq()
Almost every block driver deals with fake timeout logic around real
request completion code.
Also the existing way may cause request reference count leak, so move the
logic into __ublk_complete_rq(), then we can skip the completion in the
last step like other drivers.
Ming Lei [Sun, 13 Jul 2025 14:33:56 +0000 (22:33 +0800)]
ublk: validate ublk server pid
ublk server pid(the `tgid` of the process opening the ublk device) is stored
in `ublk_device->ublksrv_tgid`. This `tgid` is then checked against the
`ublksrv_pid` in `ublk_ctrl_start_dev` and `ublk_ctrl_end_recovery`.
This ensures that correct ublk server pid is stored in device info.
block: split blk_zone_update_request_bio into two functions
blk_zone_update_request_bio() does two things. First it checks if the
request to be completed was written via ZONE APPEND and if yes it then
updates the sector to the one that the data was written to.
This is small enough to be an inline function. But upcoming changes adding
a tracepoint don't work if the function is inlined.
Split the function into two, the first is blk_req_bio_is_zone_append()
checking if the sector needs to be updated. This can still be an inline
function. The second is blk_zone_append_update_request_bio() doing the
sector update.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250715115324.53308-3-johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
blktrace: add zoned block commands to blk_fill_rwbs
Add zoned block commands to blk_fill_rwbs:
- ZONE APPEND will be decoded as 'ZA'
- ZONE RESET will be decoded as 'ZR'
- ZONE RESET ALL will be decoded as 'ZRA'
- ZONE FINISH will be decoded as 'ZF'
- ZONE OPEN will be decoded as 'ZO'
- ZONE CLOSE will be decoded as 'ZC'
dt-bindings: power: Add A523 PPU and PCK600 power controllers
The A523 PPU is likely the same kind of hardware seen on previous SoCs.
The A523 PCK600, as the name suggests, is likely a customized version
of ARM's PCK-600 power controller. Comparing the BSP driver against
ARM's PPU datasheet shows that the basic registers line up, but
Allwinner's hardware has some additional delay controls in the reserved
register range. As such it is likely not fully compatible with the
standard ARM version.
Document A523 PPU and PCK600 compatibles.
Also reorder the compatible string entries so they are grouped and
ordered by family first, then by SoC model.
Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Chen-Yu Tsai <wens@csie.org> Link: https://lore.kernel.org/r/20250712074021.805953-2-wens@kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Mark Brown [Mon, 14 Jul 2025 11:21:27 +0000 (12:21 +0100)]
arm64/gcs: Don't call gcs_free() when releasing task_struct
Currently we call gcs_free() when releasing task_struct but this is
redundant, it attempts to deallocate any kernel managed userspace GCS
which should no longer be relevant and resets values in the struct we're
in the process of freeing.
By the time arch_release_task_struct() is called the mm will have been
disassociated from the task so the check for a mm in gcs_free() will
always be false, for threads that are exiting leaving the mm active
deactivate_mm() will have been called previously and freed any kernel
managed GCS.
Imre Deak [Tue, 8 Jul 2025 21:23:31 +0000 (00:23 +0300)]
drm/dp: Change AUX DPCD probe address from LANE0_1_STATUS to TRAINING_PATTERN_SET
Commit a3ef3c2da675 ("drm/dp: Change AUX DPCD probe address from
DPCD_REV to LANE0_1_STATUS") stopped using the DPCD_REV register for
DPCD probing, since this results in link training failures at least when
using an Intel Barlow Ridge TBT hub at UHBR link rates (the
DP_INTRA_HOP_AUX_REPLY_INDICATION never getting cleared after the failed
link training). Since accessing DPCD_REV during link training is
prohibited by the DP Standard, LANE0_1_STATUS (0x202) was used instead,
as it falls within the Standard's valid register address range
(0x102-0x106, 0x202-0x207, 0x200c-0x200f, 0x2216) and it fixed the link
training on the above TBT hub.
However, reading the LANE0_1_STATUS register also has a side-effect at
least on a Novatek eDP panel, as reported on the Closes: link below,
resulting in screen flickering on that panel. One clear side-effect when
doing the 1-byte probe reads from LANE0_1_STATUS during link training
before reading out the full 6 byte link status starting at the same
address is that the panel will report the link training as completed
with voltage swing 0. This is different from the normal, flicker-free
scenario when no DPCD probing is done, the panel reporting the link
training complete with voltage swing 2.
Using the TRAINING_PATTERN_SET register for DPCD probing doesn't have
the above side-effect, the panel will link train with voltage swing 2 as
expected and it will stay flicker-free. This register is also in the
above valid register range and is unlikely to have a side-effect as that
of LANE0_1_STATUS: Reading LANE0_1_STATUS is part of the link training
CR/EQ sequences and so it may cause a state change in the sink - even if
inadvertently as I suspect in the case of the above Novatek panel. As
opposed to this, reading TRAINING_PATTERN_SET is not part of the link
training sequence (it must be only written once at the beginning of the
CR/EQ sequences), so it's unlikely to cause any state change in the
sink.
As a side-note, this Novatek panel also lacks support for TPS3, while
claiming support for HBR2, which violates the DP Standard (the Standard
mandating TPS3 for HBR2).
Besides the Novatek panel (PSR 1), which this change fixes, I also
verified the change on a Samsung (PSR 1) and an Analogix (PSR 2) eDP
panel as well as on the Intel Barlow Ridge TBT hub.
Note that in the drm-tip tree (targeting the v6.17 kernel version) the
i915 and xe drivers keep DPCD probing enabled only for the panel known
to require this (HP ZR24w), hence those drivers in drm-tip are not
affected by the above problem.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Fixes: a3ef3c2da675 ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS") Reported-and-tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558 Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250708212331.112898-1-imre.deak@intel.com
(cherry picked from commit bba9aa41654036534d86b198f5647a9ce15ebd7f)
[Imre: Rebased on drm-intel-fixes] Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Changed original commit hash to match with the one propagated in fixes]
irq_domain_create_simple() takes fwnode as the first argument. It can be
extracted from the struct device using dev_fwnode() helper instead of
using of_node with of_fwnode_handle().
So use the dev_fwnode() helper.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Amit Kucheria <amitk@kernel.org> Cc: Thara Gopinath <thara.gopinath@gmail.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Lukasz Luba <lukasz.luba@arm.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: linux-arm-msm@vger.kernel.org Cc: linux-tegra@vger.kernel.org Link: https://lore.kernel.org/r/20250611104348.192092-20-jirislaby@kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Paolo Abeni [Thu, 10 Jul 2025 16:04:50 +0000 (18:04 +0200)]
selftests: net: increase inter-packet timeout in udpgro.sh
The mentioned test is not very stable when running on top of
debug kernel build. Increase the inter-packet timeout to allow
more slack in such environments.
'struct thermal_zone_device_ops' are not modified in these drivers.
Constifying these structures moves some data to a read-only section, so
increases overall security, especially when the structure holds some
function pointers.
On a x86_64, with allmodconfig, as an example:
Before:
======
text data bss dec hex filename
28116 5168 128 33412 8284 drivers/thermal/armada_thermal.o
After:
=====
text data bss dec hex filename
28244 5040 128 33412 8284 drivers/thermal/armada_thermal.o
'struct thermal_zone_device_ops' could be left unmodified in this driver.
Constifying this structure moves some data to a read-only section, so
increases overall security, especially when the structure holds some
function pointers.
This partly reverts commit 734b5def91b5 ("thermal/drivers/loongson2: Add
Loongson-2K2000 support") which removed the const qualifier. Instead,
define two different structures.
On a x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
5089 1160 0 6249 1869 drivers/thermal/loongson2_thermal.o
After:
=====
text data bss dec hex filename
5464 1128 0 6592 19c0 drivers/thermal/loongson2_thermal.o
PM: runtime: Take active children into account in pm_runtime_get_if_in_use()
For all practical purposes, there is no difference between the situation
in which a given device is not ignoring children and its active child
count is nonzero and the situation in which its runtime PM usage counter
is nonzero. However, pm_runtime_get_if_in_use() will only increment the
device's usage counter and return 1 in the latter case.
For consistency, make it do so in the former case either by adjusting
pm_runtime_get_conditional() and update the related kerneldoc comments
accordingly.
Provide an unsafe functions for abstractions to convert a regular
&Device to a &Device<Bound>.
This is useful for registrations that provide certain guarantees for the
scope of their callbacks, such as IRQs or certain class device
registrations (e.g. PWM, miscdevice).
kexec_core: Drop redundant pm_restore_gfp_mask() call
Drop the direct pm_restore_gfp_mask() call from the KEXEC_JUMP flow in
kernel_kexec() because it is redundant. Namely, dpm_resume_end()
called beforehand in the same code path invokes that function and
it is sufficient to invoke it once.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/1949230.tdWV9SEqCh@rjwysocki.net
[ rjw: Rebase after fixing up previous changes ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
kexec_core: Fix error code path in the KEXEC_JUMP flow
If dpm_suspend_start() fails, dpm_resume_end() must be called to
recover devices whose suspend callbacks have been called, but this
does not happen in the KEXEC_JUMP flow's error path due to a confused
goto target label.
Address this by using the correct target label in the goto statement in
question and drop the Resume_console label that is not used any more.
Fixes: 2965faa5e03d ("kexec: split kexec_load syscall from kexec core code") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/2396879.ElGaqSPkdT@rjwysocki.net
[ rjw: Drop unused label and amend the changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PM: sleep: Clean up MAINTAINERS entries for suspend and hibernation
Since Pavel Machek and Len Brown do not actually maintain the system
suspend and hibernation code, change their records in the relevant
MAINTAINERS entries to reviewers.
While at it, use Len Brown's kernel.org address in the suspend-to-RAM
MAINTAINERS record.
PM: sleep: Update power.completion for all devices on errors
After commit aa7a9275ab81 ("PM: sleep: Suspend async parents after
suspending children"), the following scenario is possible:
1. Device A is async and it depends on device B that is sync.
2. Async suspend is scheduled for A before the processing of B is
started.
3. A is waiting for B.
4. In the meantime, an unrelated device fails to suspend and returns
an error.
5. The processing of B doesn't start at all and its power.completion is
not updated.
6. A is still waiting for B when async_synchronize_full() is called.
7. Deadlock ensues.
To prevent this from happening, update power.completion for all devices
on errors in all suspend phases, but do not do it directly for devices
that are already being processed or are waiting for the processing to
start because in those cases it may be necessary to wait for the
processing to actually complete before updating power.completion for
the device.
Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") Closes: https://lore.kernel.org/linux-pm/e13740a0-88f3-4a6f-920f-15805071a7d6@linaro.org/ Reported-and-tested-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/6191258.lOV4Wx5bFT@rjwysocki.net
PM: suspend: clean up redundant filesystems_freeze/thaw() handling
The recently introduced support for freezing filesystems during system
suspend included calls to filesystems_freeze() in both suspend_prepare()
and enter_state(), as well as calls to filesystems_thaw() in both
suspend_finish() and the Unlock path in enter_state(). These are
redundant.
Moreover, calling filesystems_freeze() twice, from both suspend_prepare()
and enter_state(), leads to a black screen and makes the system unable
to resume in some cases.
Address this as follows:
- filesystems_freeze() is already called in suspend_prepare(), which
is the proper and consistent place to handle pre-suspend operations.
The second call in enter_state() is unnecessary and so remove it.
- filesystems_thaw() is invoked in suspend_finish(), which covers
successful suspend/resume paths. In the failure case, add a call
to filesystems_thaw() only when needed, avoiding the duplicate call
in the general Unlock path.
This change simplifies the suspend code and avoids repeated freeze/thaw
calls, while preserving correct ordering and behavior.
Fixes: eacfbf74196f ("power: freeze filesystems during suspend/resume") Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn> Link: https://patch.msgid.link/20250712030824.81474-1-zhangzihuan@kylinos.cn
[ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PM: suspend: Drop a misplaced pm_restore_gfp_mask() call
The pm_restore_gfp_mask() call added by commit 12ffc3b1513e ("PM:
Restrict swap use to later in the suspend sequence") to
suspend_devices_and_enter() is done too early because it takes
place before calling dpm_resume() in dpm_resume_end() and some
swap-backing devices may not be ready at that point. Moreover,
dpm_resume_end() called subsequently in the same code path invokes
pm_restore_gfp_mask() again and calling it twice in a row is
pointless.
Drop the misplaced pm_restore_gfp_mask() call from
suspend_devices_and_enter() to address this issue.
Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/2810409.mvXUDI8C0e@rjwysocki.net
[ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dan has also reported following issues with the original patch
https://lore.kernel.org/all/33fe8fe7-719a-405a-9ed2-d9f816ce1d57@sabinyo.mountain/
Bug #1:
The zeroeth element of ctrl->pconfig[] is supposed to be unused. We
start counting at 1. However this code sets ctrl->pconfig[0].ch_mask = 128.
Bug #2:
There are SLIM_MAX_TX_PORTS (16) elements in tx_ch[] array but only
QCOM_SDW_MAX_PORTS + 1 (15) in the ctrl->pconfig[] array so it corrupts
memory like Yongqin Liu pointed out.
Bug 3:
Like Jie Gan pointed out, it erases all the tx information with the rx
information.
Brian Masney [Thu, 10 Jul 2025 15:51:12 +0000 (11:51 -0400)]
ASoC: stm: stm32_sai_sub: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 10 Jul 2025 15:51:11 +0000 (11:51 -0400)]
ASoC: stm: stm32_i2s: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 10 Jul 2025 15:51:10 +0000 (11:51 -0400)]
ASoC: qcom: qdsp6: q6dsp-lpass-clocks: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 10 Jul 2025 15:51:09 +0000 (11:51 -0400)]
ASoC: codecs: rt5682s: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 10 Jul 2025 15:51:08 +0000 (11:51 -0400)]
ASoC: codecs: rt5682: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 10 Jul 2025 15:51:07 +0000 (11:51 -0400)]
ASoC: codecs: da7219: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
spi: st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
Letting the compiler remove these functions when the kernel is built
without CONFIG_PM_SLEEP support is simpler and less error prone than the
use of #ifdef based kernel configuration guards.
Currently all filesystems which implement super_operations::shutdown()
can not afford losing a device.
Thus fs_bdev_mark_dead() will just call the ->shutdown() callback for the
involved filesystem.
But it will no longer be the case, as multi-device filesystems like
btrfs and bcachefs can handle certain device loss without the need to
shutdown the whole filesystem.
To allow those multi-device filesystems to be integrated to use
fs_holder_ops:
- Add a new super_operations::remove_bdev() callback
- Try ->remove_bdev() callback first inside fs_bdev_mark_dead()
If the callback returned 0, meaning the fs can handling the device
loss, then exit without doing anything else.
If there is no such callback or the callback returned non-zero value,
continue to shutdown the filesystem as usual.
This means the new remove_bdev() should only do the check on whether the
operation can continue, and if so do the fs specific handlings.
The shutdown handling should still be handled by the existing
->shutdown() callback.
For all existing filesystems with shutdown callback, there is no change
to the code nor behavior.
Btrfs is going to implement both the ->remove_bdev() and ->shutdown()
callbacks soon.
Merge patch series "can: Kconfig: add missing COMPILE_TEST"
Vincent Mailhol <mailhol.vincent@wanadoo.fr> says:
The ti_hecc and tscan1 CAN drivers can not be built on an x86_64
platform. Add the COMPILE_TEST dependency to allow build testing.
Doing that, a so far unnoticed W=0 warning showed up in ti_hecc. Fix
this warning. To prevent any potential noise in some future git
bisect, the warning is fixed before introducing COMPILE_TEST.
Note that the mscan and mpc5xxx drivers have the same issue but those
two use some helper functions, such as in_8() and out_8(), which are
only available on the powerpc platform. Those two drivers would
require some deeper code refactor to be built on x86_64 and are thus
left out of scope.
Vincent Mailhol [Tue, 15 Jul 2025 11:28:13 +0000 (20:28 +0900)]
can: tscan1: Kconfig: add COMPILE_TEST
tscan1 depends on ISA. It also has a hidden dependency on HAS_IOPORT
as reported by the kernel test bot [1]. That dependency is implied by
ISA which explains why this was not an issue so far.
Add both COMPILE_TEST and HAS_IOPORT to the dependency list so that
this driver can also be built on other platforms.
Vincent Mailhol [Tue, 15 Jul 2025 11:28:11 +0000 (20:28 +0900)]
can: ti_hecc: fix -Woverflow compiler warning
Fix below default (W=0) warning:
drivers/net/can/ti_hecc.c: In function 'ti_hecc_start':
drivers/net/can/ti_hecc.c:386:20: warning: conversion from 'long unsigned int' to 'u32' {aka 'unsigned int'} changes value from '18446744073709551599' to '4294967279' [-Woverflow]
386 | mbx_mask = ~BIT(HECC_RX_LAST_MBOX);
| ^
drm/panfrost: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
Panfrost can skip the reset if TDR has fired before the free-job worker.
Currently, since Panfrost doesn't take any action on these scenarios, the
job is being leaked, considering that `free_job()` won't be called.
To avoid such leaks, inform the scheduler that the job did not actually
timeout and no reset was performed through the new status code
DRM_GPU_SCHED_STAT_NO_HANG.
drm/xe: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
Xe can skip the reset if TDR has fired before the free job worker and can
also re-arm the timeout timer in some scenarios. Instead of manipulating
scheduler's internals, inform the scheduler that the job did not actually
timeout and no reset was performed through the new status code
DRM_GPU_SCHED_STAT_NO_HANG.
Note that, in the first case, there is no need to restart submission if it
hasn't been stopped.
drm/etnaviv: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
Etnaviv can skip a hardware reset in two situations:
1. TDR has fired before the free-job worker and the timeout is spurious.
2. The GPU is still making progress on the front-end and we can give
the job a chance to complete.
Instead of manipulating scheduler's internals, inform the scheduler that
the job did not actually timeout and no reset was performed through
the new status code DRM_GPU_SCHED_STAT_NO_HANG.
drm/v3d: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset
When a CL/CSD job times out, we check if the GPU has made any progress
since the last timeout. If so, instead of resetting the hardware, we skip
the reset and allow the timer to be rearmed. This gives long-running jobs
a chance to complete.
Instead of manipulating scheduler's internals, inform the scheduler that
the job did not actually timeout and no reset was performed through
the new status code DRM_GPU_SCHED_STAT_NO_HANG.
drm/sched: Add new test for DRM_GPU_SCHED_STAT_NO_HANG
Add a test to submit a single job against a scheduler with the timeout
configured and verify that if the job is still running, the timeout
handler will skip the reset and allow the job to complete.
As more KUnit tests are introduced to evaluate the basic capabilities of
the `timedout_job()` hook, the test suite will continue to increase in
duration. To reduce the overall running time of the test suite, decrease
the scheduler's timeout for the timeout tests.
drm/sched: Allow drivers to skip the reset and keep on running
When the DRM scheduler times out, it's possible that the GPU isn't hung;
instead, a job just took unusually long (longer than the timeout) but is
still running, and there is, thus, no reason to reset the hardware. This
can occur in two scenarios:
1. The job is taking longer than the timeout, but the driver determined
through a GPU-specific mechanism that the hardware is still making
progress. Hence, the driver would like the scheduler to skip the
timeout and treat the job as still pending from then onward. This
happens in v3d, Etnaviv, and Xe.
2. Timeout has fired before the free-job worker. Consequently, the
scheduler calls `sched->ops->timedout_job()` for a job that isn't
timed out.
These two scenarios are problematic because the job was removed from the
`sched->pending_list` before calling `sched->ops->timedout_job()`, which
means that when the job finishes, it won't be freed by the scheduler
though `sched->ops->free_job()` - leading to a memory leak.
To solve these problems, create a new `drm_gpu_sched_stat`, called
DRM_GPU_SCHED_STAT_NO_HANG, which allows a driver to skip the reset. The
new status will indicate that the job must be reinserted into
`sched->pending_list`, and the hardware / driver will still complete that
job.
drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET
Among the scheduler's statuses, the only one that indicates an error is
DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV
signifies that the operation succeeded and the GPU is in a nominal state.
However, to provide more information about the GPU's status, it is needed
to convey more information than just "OK".
Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this
status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has
hung, but it has been successfully reset and is now in a nominal state
again.
Merge tag 'usb-serial-6.16-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB serial device ids for 6.16-rc7
Here are some more device ids for 6.16-rc7.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.16-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition
USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI
Andrew Price [Mon, 14 Jul 2025 15:21:15 +0000 (16:21 +0100)]
gfs2: Set .migrate_folio in gfs2_{rgrp,meta}_aops
Clears up the warning added in 7ee3647243e5 ("migrate: Remove call to
->writepage") that occurs in various xfstests, causing "something found
in dmesg" failures.
[ 341.136573] gfs2_meta_aops does not implement migrate_folio
[ 341.136953] WARNING: CPU: 1 PID: 36 at mm/migrate.c:944 move_to_new_folio+0x2f8/0x300
Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
arm64: dts: rockchip: Move dsi address+size-cells from SoC to rk3399 boards
The #address-cells and #size-cells properties are not useful on the DSI
controller node; they are only useful/required on ports and panel(s).
So remove them from the controller node and add them where actually
needed on the various rk3399 based boards.
This fixes the following DTB validation warnings:
unnecessary #address-cells/#size-cells without "ranges",
"dma-ranges" or child "reg" property
arm64: dts: rockchip: Move dsi address+size-cells from SoC to px30 boards
The #address-cells and #size-cells properties are not useful on the DSI
controller node; they are only useful/required on ports and panel(s).
So remove them from the controller node and add them where actually
needed on the various px30 based boards, which includes rk3326.
This fixes the following DTB validation warnings:
unnecessary #address-cells/#size-cells without "ranges",
"dma-ranges" or child "reg" property
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:16 +0000 (21:33 +0200)]
drm/xe/pf: Invalidate LMTT after completing changes
Once we finish populating all leaf pages in the VF's LMTT we should
make sure that hardware will not access any stale data. Explicitly
force LMTT invalidation (as it was already planned in the past).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-7-michal.wajdeczko@intel.com
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:15 +0000 (21:33 +0200)]
drm/xe/pf: Invalidate LMTT during LMEM unprovisioning
Invalidate LMTT immediately after removing VF's LMTT page tables
and clearing root PTE in the LMTT PD to avoid any invalid access
by the hardware (and VF) due to stale data.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-6-michal.wajdeczko@intel.com
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:14 +0000 (21:33 +0200)]
drm/xe/pf: Force GuC virtualization mode
By default the GuC starts in the 'native' mode and enables the VGT
mode (aka 'virtualization' mode) only after it receives at least one
set of VF configuration data. While this happens naturally while PF
begins VFs provisioning, we might need this sooner as some actions,
like TLB_INVALIDATION_ALL(0x7002), is supported by the GuC only in
the VGT mode.
And this becomes a real problem if we would want to use above action
to invalidate the LMTT early during VFs auto-provisioning, before VFs
are enabled, as such H2G would be rejected:
This could be mitigated by pushing earlier a PF self-configuration
with some hard-coded values that cover unlimited access to the GGTT,
use of all GuC contexts and doorbells. This step is sufficient for
the GuC to switch into the VGT mode.
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:12 +0000 (21:33 +0200)]
drm/xe/pf: Resend PF provisioning after GT reset
If we reload the GuC due to suspend/resume or GT reset then we
have to resend not only any VFs provisioning data, but also PF
configuration, like scheduling parameters (EQ, PT), as otherwise
GuC will continue to use default values.
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:11 +0000 (21:33 +0200)]
drm/xe/pf: Prepare to stop SR-IOV support prior GT reset
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.
However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:
While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.
dt-bindings: display: rockchip,dw-mipi-dsi: Drop address/size cells
The "rockchip,dw-mipi-dsi" binding has allOf "snps,dw-mipi-dsi.yaml"
which has allOf "dsi-controller.yaml", which already has #address-cells
and #size-cells defined as '1' and '0' respectively.
So drop this re-definition.
iommu/amd: Add documentation for AMD IOMMU debugfs support
Add documentation describing how to use AMD IOMMU debugfs support to
dump IOMMU data structures - IRT table, Device table, Registers (MMIO and
Capability) and command buffer.
In cases where we have an issue in the device interrupt path with IOMMU
interrupt remapping enabled, dumping valid IRT table entries for the device
is very useful and good input for debugging the issue.
eg.
-> To dump irte entries for a particular device
#echo "c4:00.0" > /sys/kernel/debug/iommu/amd/devid
#cat /sys/kernel/debug/iommu/amd/irqtbl | less
or
#echo "0000:c4:00.0" > /sys/kernel/debug/iommu/amd/devid
#cat /sys/kernel/debug/iommu/amd/irqtbl | less
iommu/amd: Add debugfs support to dump device table
IOMMU uses device table data structure to get per-device information for
DMA remapping, interrupt remapping, and other functionalities. It's a
valuable data structure to visualize for debugging issues related to
IOMMU.
eg.
-> To dump device table entry for a particular device
#echo 0000:c4:00.0 > /sys/kernel/debug/iommu/amd/devid
#cat /sys/kernel/debug/iommu/amd/devtbl
Dumping IOMMU data structures like device table, IRT, etc., for all devices
on the system will be a lot of data dumped in a file. Also, user may want
to dump and analyze these data structures just for one or few devices. So
dumping IOMMU data structures like device table, IRT etc for all devices
is not a good approach.
Add "device id" user input to be used for dumping IOMMU data structures
like device table, IRT etc in AMD IOMMU debugfs.
iommu/amd: Add debugfs support to dump IOMMU command buffer
IOMMU driver sends command to IOMMU hardware via command buffer. In cases
where IOMMU hardware fails to process commands in command buffer, dumping
it is a valuable input to debug the issue.
IOMMU hardware processes command buffer entry at offset equals to the head
pointer. Dumping just the entry at the head pointer may not always be
useful. The current head may not be pointing to the entry of the command
buffer which is causing the issue. IOMMU Hardware may have processed the
entry and updated the head pointer. So dumping the entire command buffer
gives a broad understanding of what hardware was/is doing. The command
buffer dump will have all entries from start to end of the command buffer.
Along with that, it will have a head and tail command buffer pointer
register dump to facilitate where the IOMMU driver and hardware are in
the command buffer for injecting and processing the entries respectively.
Command buffer is a per IOMMU data structure. So dumping on per IOMMU
basis.
eg.
-> To get command buffer dump for iommu<x> (say, iommu00)
#cat /sys/kernel/debug/iommu/amd/iommu00/cmdbuf
iommu/amd: Add debugfs support to dump IOMMU Capability registers
IOMMU Capability registers defines capabilities of IOMMU and information
needed for initialising MMIO registers and device table. This is useful
to dump these registers for debugging IOMMU related issues.
e.g.
-> To get capability registers value at offset 0x10 for iommu<x> (say,
iommu00)
# echo "0x10" > /sys/kernel/debug/iommu/amd/iommu00/capability
# cat /sys/kernel/debug/iommu/amd/iommu00/capability
iommu/amd: Add debugfs support to dump IOMMU MMIO registers
Analyzing IOMMU MMIO registers gives a view of what IOMMU is
configured with on the system and is helpful to debug issues
with IOMMU.
eg.
-> To get mmio registers value at offset 0x18 for iommu<x> (say, iommu00)
# echo "0x18" > /sys/kernel/debug/iommu/amd/iommu00/mmio
# cat /sys/kernel/debug/iommu/amd/iommu00/mmio