The retbleed select function leaves the mitigation to AUTO in some cases.
Moreover, the update function can also set the mitigation to AUTO. This
is inconsistent with other mitigations and requires explicit handling of
AUTO at the end of update step.
Make sure a mitigation gets selected in the select step, and do not change
it to AUTO in the update step. When no mitigation can be selected leave it
to NONE, which is what AUTO was getting changed to in the end.
Sustained frequency when greater than or equal to 4Ghz on 64-bit devices
currently result in marking all frequencies as turbo. Address the turbo
frequency selection bug by fixing the truncation.
Fixes: a897575e79d7 ("firmware: arm_scmi: Add support for marking certain frequencies as turbo") Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Message-Id: <20250514214719.203607-1-quic_sibis@quicinc.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
An earlier patch marked one of the two CPU masks as 'static' to reduce stack
usage, but if CONFIG_NR_CPUS is large enough, the function still produces
a warning for compile testing:
drivers/cpufreq/armada-8k-cpufreq.c: In function 'armada_8k_cpufreq_init':
drivers/cpufreq/armada-8k-cpufreq.c:203:1: error: the frame size of 1416 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Normally this should be done using alloc_cpumask_var(), but since the
driver already has a static mask and the probe function is not called
concurrently, use the same trick for both.
Enable internal bias pull-ups on the SoC-side I2C_3_HDMI that do not have
external pull resistors populated on the SoM. This ensures proper
default line levels.
For the ICSSG PHYs to operate correctly, a 25 MHz reference clock must
be supplied on CLKOUT0. Previously, our bootloader configured this
clock, which is why the PRU Ethernet ports appeared to work, but the
change never made it into the device tree.
Add clock properties to make EXT_REFCLK1.CLKOUT0 output a 25MHz clock.
dtc complains with the following message for DTSes which use the ISP:
arch/arm64/boot/dts/rockchip/px30.dtsi:1272.19-1276.6: Warning (graph_child_address): /isp@ff4a0000/ports/port@0: graph node has single child node 'endpoint@0', #address-cells/#size-cells are not necessary
Typically, it is expected from the device DTS(I) to update the SoC DTSI
nodes if they have more than one endpoint, so let's assume there's only
one endpoint in port@0 by default, instead of forcing board DTS(I)s to
/delete-property/ address-cells and size-cells to make dtc happy.
Because PX30 PP1516/EVB's endpoint@0 is the only endpoint and
considering its parent node now has no address-cells property, dtc
complains (same messages for PX30 EVB):
arch/arm64/boot/dts/rockchip/px30-pp1516.dtsi:447.29-451.6: Warning (avoid_default_addr_size): /isp@ff4a0000/ports/port@0/endpoint@0: Relying on default #address-cells value
arch/arm64/boot/dts/rockchip/px30-pp1516.dtsi:447.29-451.6: Warning (avoid_default_addr_size): /isp@ff4a0000/ports/port@0/endpoint@0: Relying on default #size-cells value
arch/arm64/boot/dts/rockchip/px30-pp1516-ltk050h3146w-a2.dtb: Warning (avoid_unnecessary_addr_size): Failed prerequisite 'avoid_default_addr_size'
arch/arm64/boot/dts/rockchip/px30-pp1516-ltk050h3146w-a2.dtb: Warning (unique_unit_address_if_enabled): Failed prerequisite 'avoid_default_addr_size'
arch/arm64/boot/dts/rockchip/px30-pp1516.dtsi:447.29-451.6: Warning (graph_endpoint): /isp@ff4a0000/ports/port@0/endpoint@0: graph node '#address-cells' is -1, must be 1
arch/arm64/boot/dts/rockchip/px30-pp1516.dtsi:447.29-451.6: Warning (graph_endpoint): /isp@ff4a0000/ports/port@0/endpoint@0: graph node '#size-cells' is -1, must be 0
arch/arm64/boot/dts/rockchip/px30-pp1516-ltk050h3146w-a2.dtb: Warning (graph_child_address): Failed prerequisite 'graph_endpoint'
so we fix that by removing the reg property. dtc still complains (same
messages for PX30 EVB):
arch/arm64/boot/dts/rockchip/px30-pp1516.dtsi:447.29-450.6: Warning (unit_address_vs_reg): /isp@ff4a0000/ports/port@0/endpoint@0: node has a unit name, but no reg or ranges property
so we also remove the @0 suffix off the node name.
Fixes: 8df7b4537dfb ("arm64: dts: rockchip: add isp node for px30") Fixes: 474a77395be2 ("arm64: dts: rockchip: hook up camera on px30-evb") Fixes: 56198acdbf0d ("arm64: dts: rockchip: add px30-pp1516 base dtsi and board variants") Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de> Link: https://lore.kernel.org/r/20250610-ringneck-haikou-video-demo-cam-v2-1-de1bf87e0732@cherry.de Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When multiple Apple devices are connected concurrently, the
apple-mfi-fastcharge driver fails to probe the subsequent devices with
the following error:
sysfs: cannot create duplicate filename '/class/power_supply/apple_mfi_fastcharge'
apple-mfi-fastcharge 5-2.4.3.3: probe of 5-2.4.3.3 failed with error -17
This happens because the driver uses a fixed power supply name
("apple_mfi_fastcharge") for all devices, causing a sysfs name
conflict when a second device is connected.
Fix this by generating unique names using the USB bus and device
number (e.g., "apple_mfi_fastcharge_5-12"). This ensures each
connected device gets a unique power supply entry in sysfs.
The change requires storing a copy of the power_supply_desc structure
in the per-device mfi_device struct, since the name pointer needs to
remain valid for the lifetime of the power supply registration.
The variable `of_match` was incorrectly declared as a `bool`.
It is assigned the return value of of_match_device(), which is a pointer of
type `const struct of_device_id *`.
Address and size-cells are 1 and the ftm timer node takes two address
spaces in "reg" property, so this should be in two <> tuples. Change
has no functional impact, but original code is confusing/less readable.
After the commit 0014f65e3df0 ("pm: cpupower: remove hard-coded
topology depth values"), "cpupower monitor" output ceased to print the
CORE and the CPU fields on a multi-socket platform.
The reason for this is that the patch changed the behaviour to break
out of the switch-case after printing the PKG details, while prior to
the patch, the CORE and the CPU details would also get printed since
the "if" condition check would pass for any level whose topology depth
was lesser than that of a package.
Fix this ensuring all the details below a desired topology depth are
printed in the cpupower monitor output.
The blsp_dma controller is shared between the different subsystems,
which is why it is already initialized by the firmware. We should not
reinitialize it from Linux to avoid potential other users of the DMA
engine to misbehave.
In mainline this can be described using the "qcom,controlled-remotely"
property. In the downstream/vendor kernel from Qualcomm there is an
opposite "qcom,managed-locally" property. This property is *not* set
for the qcom,sps-dma@7884000 and qcom,sps-dma@7ac4000 [1] so adding
"qcom,controlled-remotely" upstream matches the behavior of the
downstream/vendor kernel.
In preparation for switching to the architected timer as the primary
clockevents device, mark the cpuidle nodes with the 'local-timer-stop'
property to indicate that an alternative clockevents device must be
used for waking up from the "c2" idle state.
Signed-off-by: Will Deacon <willdeacon@google.com>
[Original commit from https://android.googlesource.com/kernel/gs/+/a896fd98638047989513d05556faebd28a62b27c] Signed-off-by: Will McVicker <willmcvicker@google.com> Reviewed-by: Youngmin Nam <youngmin.nam@samsung.com> Tested-by: Youngmin Nam <youngmin.nam@samsung.com> Fixes: ea89fdf24fd9 ("arm64: dts: exynos: google: Add initial Google gs101 SoC support") Signed-off-by: Peter Griffin <peter.griffin@linaro.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Tested-by: Peter Griffin <peter.griffin@linaro.org> Link: https://lore.kernel.org/r/20250611-gs101-cpuidle-v2-1-4fa811ec404d@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Disable the CTI device of the camera block to prevent potential NoC errors
during AMBA bus device matching.
The clocks for the Qualcomm Debug Subsystem (QDSS) are managed by aoss_qmp
through a mailbox. However, the camera block resides outside the AP domain,
meaning its QDSS clock cannot be controlled via aoss_qmp.
Fixes: bf469630552a ("arm64: dts: qcom: qcs615: Add coresight nodes") Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250611030003.3801-1-jie.gan@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
An infinite loop has been created by the Coresight devices. When only a
source device is enabled, the coresight_find_activated_sysfs_sink function
is recursively invoked in an attempt to locate an active sink device,
ultimately leading to a stack overflow and system crash. Therefore, disable
the replicator1 to break the infinite loop and prevent a potential stack
overflow.
The QMI_DATA_LEN type may have different sizes. Taking the element's
address of that type and interpret it as a smaller sized ones works fine
for little endian platforms but not for big endian ones. Instead use
temporary variables of smaller sized types and cast them correctly to
support big endian platforms.
only the first syscall may fail and set errno, but the second may succeed
and keep errno intact, and the check will falsely pass.
Or if errno happened to be EINVAL before, even the first check may falsely
pass.
Also use EXPECT/ASSERT consistently. Currently there is an inconsistent mix
without obvious reasons for usage of one or another.
The AFE DMA hardware supports up to 34 bits for DMA addresses. This is
missing from the driver and prevents reserved memory regions from
working properly when the allocated region is above the 4GB line.
Fill in the related register offsets for each DAI, and also set the
DMA mask. Also fill in the LSB end register offsets for completeness.
Fixes: a94aec035a12 ("ASoC: mediatek: mt8183: add platform driver") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://patch.msgid.link/20250612074901.4023253-8-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In commit 32c9c06adb5b ("ASoC: mediatek: disable buffer pre-allocation")
buffer pre-allocation was disabled to accommodate newer platforms that
have a limited reserved memory region for the audio frontend.
Turns out disabling pre-allocation across the board impacts platforms
that don't have this reserved memory region. Buffer allocation failures
have been observed on MT8173 and MT8183 based Chromebooks under low
memory conditions, which results in no audio playback for the user.
Since some MediaTek platforms already have dedicated reserved memory
pools for the audio frontend, the plan is to enable this for all of
them. This requires device tree changes. As a fallback, reinstate the
original policy of pre-allocating audio buffers at probe time of the
reserved memory pool cannot be found or used.
This patch covers the MT8173, MT8183, MT8186 and MT8192 platforms for
now, the reason being that existing MediaTek platform drivers that
supported reserved memory were all platforms that mainly supported
ChromeOS, and is also the set of devices that I can verify.
Fixes: 32c9c06adb5b ("ASoC: mediatek: disable buffer pre-allocation") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://patch.msgid.link/20250612074901.4023253-7-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the function to dynamically allocate it instead.
There is probably a better way to do it since only two integer fields
inside of that structure are actually used, but this is the simplest
rework for the moment.
Fixes: 783db6851c18 ("ASoC: ops: Enforce platform maximum on initial value") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250610093057.2643233-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch modifies the assignment of machine structure pointers in the
acp_pci_probe function. Previously, the machine pointers were assigned
using the address-of operator (&), which caused incompatibility issues
in type assignments.
Additionally, the declarations of the machine arrays in amd.h have been
updated to reflect that they are indeed arrays (`[]`). The code is
further cleaned up by declaring the codec structures in
amd-acpi-mach.c as static, reflecting their intended usage.
error: symbol 'amp_rt1019' was not declared. Should it be static?
error: symbol 'amp_max' was not declared. Should it be static?
error: symbol 'snd_soc_acpi_amd_acp_machines' was not declared. Should it be static?
error: symbol 'snd_soc_acpi_amd_rmb_acp_machines' was not declared. Should it be static?
error: symbol 'snd_soc_acpi_amd_acp63_acp_machines' was not declared. Should it be static?
error: symbol 'snd_soc_acpi_amd_acp70_acp_machines' was not declared. Should it be static?
commit 7f1186a8d738661 ("ASoC: soc-dai: check return value at
snd_soc_dai_set_tdm_slot()") checks return value of
xlate_tdm_slot_mask() (A1)(A2).
/*
* ...
(Y) * TDM mode can be disabled by passing 0 for @slots. In this case @tx_mask,
* @rx_mask and @slot_width will be ignored.
* ...
*/
int snd_soc_dai_set_tdm_slot(...)
{
...
if (...)
(A1) ret = dai->driver->ops->xlate_tdm_slot_mask(...);
else
(A2) ret = snd_soc_xlate_tdm_slot_mask(...);
if (ret)
goto err;
...
}
snd_soc_xlate_tdm_slot_mask() (A2) will return -EINVAL if slots was 0 (X),
but snd_soc_dai_set_tdm_slot() allow to use it (Y).
(A) static int snd_soc_xlate_tdm_slot_mask(...)
{
...
if (!slots)
(X) return -EINVAL;
...
}
Call xlate_tdm_slot_mask() only if slots was non zero.
Reported-by: Giedrius Trainavičius <giedrius@blokas.io> Closes: https://lore.kernel.org/r/CAMONXLtSL7iKyvH6w=CzPTxQdBECf++hn8RKL6Y4=M_ou2YHow@mail.gmail.com Fixes: 7f1186a8d738661 ("ASoC: soc-dai: check return value at snd_soc_dai_set_tdm_slot()") Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/8734cdfx59.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add missing const qualifier to the non-CONFIG_THREAD_INFO_IN_TASK
version of end_of_stack() to match the CONFIG_THREAD_INFO_IN_TASK
version. Fixes a warning with CONFIG_KSTACK_ERASE=y on archs that don't
select THREAD_INFO_IN_TASK (such as LoongArch):
error: passing 'const struct task_struct *' to parameter of type 'struct task_struct *' discards qualifiers
The stackleak_task_low_bound() function correctly uses a const task
parameter, but the legacy end_of_stack() prototype didn't like that.
Build tested on loongarch (with CONFIG_KSTACK_ERASE=y) and m68k
(with CONFIG_DEBUG_STACK_USAGE=y).
The kmemleak reports memory leaks related to elevator resources that
were originally allocated in the ->init_hctx() method. The following
leak traces are observed after running blktests block/040:
The issue arises while we run nr_hw_queue update, Specifically, we first
reallocate hardware contexts (hctx) via __blk_mq_realloc_hw_ctxs(), and
then later invoke elevator_switch() (assuming q->elevator is not NULL).
The elevator switch code would first exit old elevator (elevator_exit)
and then switches to the new elevator. The elevator_exit loops through
each hctx and invokes the elevator’s per-hctx exit method ->exit_hctx(),
which releases resources allocated during ->init_hctx().
This memleak manifests when we reduce the num of h/w queues - for example,
when the initial update sets the number of queues to X, and a later update
reduces it to Y, where Y < X. In this case, we'd loose the access to old
hctxs while we get to elevator exit code because __blk_mq_realloc_hw_ctxs
would have already released the old hctxs. As we don't now have any
reference left to the old hctxs, we don't have any way to free the
scheduler resources (which are allocate in ->init_hctx()) and kmemleak
complains about it.
This issue was caused due to the commit 596dce110b7d ("block: simplify
elevator reattachment for updating nr_hw_queues"). That change unified
the two-stage elevator teardown and reattachment into a single call that
occurs after __blk_mq_realloc_hw_ctxs() has already freed the hctxs.
This patch restores the previous two-stage elevator switch logic during
nr_hw_queues updates. First, the elevator is switched to 'none', which
ensures all scheduler resources are properly freed. Then, the hardware
contexts (hctxs) are reallocated, and the software-to-hardware queue
mappings are updated. Finally, the original elevator is reattached. This
sequence prevents loss of references to old hctxs and avoids the scheduler
resource leaks reported by kmemleak.
Sphinx complains that ep_get_upwards_depth_proc() has a kerneldoc-style
comment without documenting its parameters.
This is an internal function that was not meant to show up in kernel
documentation, so fix the warning by changing the comment to a
non-kerneldoc one.
Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20250717173655.10ecdce6@canb.auug.org.au Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507171958.aMcW08Cn-lkp@intel.com/ Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/20250721-epoll-sphinx-fix-v1-1-b695c92bf009@google.com Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Ensure that epoll instances can never form a graph deeper than
EP_MAX_NESTS+1 links.
Currently, ep_loop_check_proc() ensures that the graph is loop-free and
does some recursion depth checks, but those recursion depth checks don't
limit the depth of the resulting tree for two reasons:
- They don't look upwards in the tree.
- If there are multiple downwards paths of different lengths, only one of
the paths is actually considered for the depth check since commit 28d82dc1c4ed ("epoll: limit paths").
Essentially, the current recursion depth check in ep_loop_check_proc() just
serves to prevent it from recursing too deeply while checking for loops.
A more thorough check is done in reverse_path_check() after the new graph
edge has already been created; this checks, among other things, that no
paths going upwards from any non-epoll file with a length of more than 5
edges exist. However, this check does not apply to non-epoll files.
As a result, it is possible to recurse to a depth of at least roughly 500,
tested on v6.15. (I am unsure if deeper recursion is possible; and this may
have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
problem").)
To fix it:
1. In ep_loop_check_proc(), note the subtree depth of each visited node,
and use subtree depths for the total depth calculation even when a subtree
has already been visited.
2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
depth of an upwards walk.
3. In ep_loop_check(), use these values to limit the total path length
between epoll nodes to EP_MAX_NESTS edges.
Commit 323ac95bce44 ("Btrfs: don't read leaf blocks containing only
checksums during truncate") changed the condition from `level == 0` to
`level == path->lowest_level`, while its original purpose was just to do
some leaf node handling (calling btrfs_item_key_to_cpu()) and skip some
code that doesn't fit leaf nodes.
After changing the condition, the code path:
1. Also handles the non-leaf nodes when path->lowest_level is nonzero,
which is wrong. However btrfs_search_forward() is never called with a
nonzero path->lowest_level, which makes this bug not found before.
2. Makes the later if block with the same condition, which was originally
used to handle non-leaf node (calling btrfs_node_key_to_cpu()) when
lowest_level is not zero, dead code.
Since btrfs_search_forward() is never called for a path with a
lowest_level different from zero, just completely remove the partial
support for a non-zero lowest_level, simplifying a bit the code, and
assert that lowest_level is zero at the start of the function.
Suggested-by: Qu Wenruo <wqu@suse.com> Fixes: 323ac95bce44 ("Btrfs: don't read leaf blocks containing only checksums during truncate") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sun YangKai <sunk67188@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add a dependency for IO_URING for the GCOV_PROFILE_URING symbol.
Without this patch the EXPERT config menu ends with
"Enable IO uring support" and the menu prompts for
GCOV_PROFILE_URING and IO_URING_MOCK_FILE are not subordinate to it.
This causes all of the EXPERT Kconfig options that follow
GCOV_PROFILE_URING to be display in the "upper" menu (General setup),
just following the EXPERT menu.
Currently we just ensure that a non-zero value in chunk_sectors aligns
with any atomic write boundary, as the blk boundary functionality uses
both these values.
However it is also improper to have atomic write unit max > chunk_sectors
(for non-zero chunk_sectors), as this would lead to splitting of atomic
write bios (which is disallowed).
Sanitize atomic write unit max against chunk_sectors to avoid any
potential problems.
Fixes: d00eea91deaf3 ("block: Add extra checks in blk_validate_atomic_write_limits()") Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250711105258.3135198-3-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Have nvmet_req_init() and req->execute() complete failed commands.
Description of the problem:
nvmet_req_init() calls __nvmet_req_complete() internally upon failure,
e.g., unsupported opcode, which calls the "queue_response" callback,
this results in nvmet_pci_epf_queue_response() being called, which will
call nvmet_pci_epf_complete_iod() if data_len is 0 or if dma_dir is
different from DMA_TO_DEVICE. This results in a double completion as
nvmet_pci_epf_exec_iod_work() also calls nvmet_pci_epf_complete_iod()
when nvmet_req_init() fails.
Steps to reproduce:
On the host send a command with an unsupported opcode with nvme-cli,
For example the admin command "security receive"
$ sudo nvme security-recv /dev/nvme0n1 -n1 -x4096
This triggers a double completion as nvmet_req_init() fails and
nvmet_pci_epf_queue_response() is called, here iod->dma_dir is still
in the default state of "DMA_NONE" as set by default in
nvmet_pci_epf_alloc_iod(), so nvmet_pci_epf_complete_iod() is called.
Because nvmet_req_init() failed nvmet_pci_epf_complete_iod() is also
called in nvmet_pci_epf_exec_iod_work() leading to a double completion.
This not only sends two completions to the host but also corrupts the
state of the PCI NVMe target leading to kernel oops.
This patch lets nvmet_req_init() and req->execute() complete all failed
commands, and removes the double completion case in
nvmet_pci_epf_exec_iod_work() therefore fixing the edge cases where
double completions occurred.
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a node withdraws and it turns out that it is the only node that has
the filesystem mounted, gfs2 currently tries to replay the local journal
to bring the filesystem back into a consistent state. Not only is that
a very bad idea, it has also never worked because gfs2_recover_func()
will refuse to do anything during a withdraw.
However, before even getting to this point, gfs2_recover_func()
dereferences sdp->sd_jdesc->jd_inode. This was a use-after-free before
commit 04133b607a78 ("gfs2: Prevent double iput for journal on error")
and is a NULL pointer dereference since then.
Simply get rid of self recovery to fix that.
Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Reported-by: Chunjie Zhu <chunjie.zhu@cloud.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/md/raid10.c: In function ‘sync_request_write’:
drivers/md/raid10.c:2441:21: error: variable ‘d’ set but not used [-Werror=unused-but-set-variable]
2441 | int d;
| ^
cc1: all warnings being treated as errors
ublk server pid(the `tgid` of the process opening the ublk device) is stored
in `ublk_device->ublksrv_tgid`. This `tgid` is then checked against the
`ublksrv_pid` in `ublk_ctrl_start_dev` and `ublk_ctrl_end_recovery`.
This ensures that correct ublk server pid is stored in device info.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250713143415.2857561-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Recently, we've observed a few cases where a ublk server is able to
complete restart more quickly than the driver can process the exit of
the previous ublk server. The new ublk server comes up, attempts
recovery of the preexisting ublk devices, and observes them still in
state UBLK_S_DEV_LIVE. While this is possible due to the asynchronous
nature of io_uring cleanup and should therefore be handled properly in
the ublk server, it is still preferable to make ublk server exit
handling faster if possible, as we should strive for it to not be a
limiting factor in how fast a ublk server can restart and provide
service again.
Analysis of the issue showed that the vast majority of the time spent in
handling the ublk server exit was in calls to blk_mq_quiesce_queue,
which is essentially just a (relatively expensive) call to
synchronize_rcu. The ublk server exit path currently issues an
unnecessarily large number of calls to blk_mq_quiesce_queue, for two
reasons:
1. It tries to call blk_mq_quiesce_queue once per ublk_queue. However,
blk_mq_quiesce_queue targets the request_queue of the underlying ublk
device, of which there is only one. So the number of calls is larger
than necessary by a factor of nr_hw_queues.
2. In practice, it calls blk_mq_quiesce_queue _more_ than once per
ublk_queue. This is because of a data race where we read
ubq->canceling without any locking when deciding if we should call
ublk_start_cancel. It is thus possible for two calls to
ublk_uring_cmd_cancel_fn against the same ublk_queue to both call
ublk_start_cancel against the same ublk_queue.
Fix this by making the "canceling" flag a per-device state. This
actually matches the existing code better, as there are several places
where the flag is set or cleared for all queues simultaneously, and
there is the general expectation that cancellation corresponds with ublk
server exit. This per-device canceling flag is then checked under a
(new) lock (addressing the data race (2) above), and the queue is only
quiesced if it is cleared (addressing (1) above). The result is just one
call to blk_mq_quiesce_queue per ublk device.
To minimize the number of cache lines that are accessed in the hot path,
the per-queue canceling flag is kept. The values of the per-device
canceling flag and all per-queue canceling flags should always match.
In our setup, where one ublk server handles I/O for 128 ublk devices,
each having 24 hardware queues of depth 4096, here are the results
before and after this patch, where teardown time is measured from the
first call to io_ring_ctx_wait_and_kill to the return from the last
ublk_ch_release:
before after
number of calls to blk_mq_quiesce_queue: 6469 256
teardown time: 11.14s 2.44s
There are still some potential optimizations here, but this takes care
of a big chunk of the ublk server exit handling delay.
Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250703-ublk_too_many_quiesce-v2-1-3527b5339eeb@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: c2c8089f325e ("ublk: validate ublk server pid") Signed-off-by: Sasha Levin <sashal@kernel.org>
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant
is coming from sizeof. Adding "volatile" back to these variables solves
this false positive without reintroducing the issues that originally led
to switching to OPTIMIZER_HIDE_VAR in the first place[1].
During RAID resync, faulty rdev cannot be removed and will result in
"Device or resource busy" error when attempting hot removal.
Reproduction steps:
mdadm -Cv /dev/md0 -l1 -n3 -e1.2 /dev/sd{b..d}
mdadm /dev/md0 -f /dev/sdb
mdadm /dev/md0 -r /dev/sdb
-> mdadm: hot remove failed for /dev/sdb: Device or resource busy
After commit 4b10a3bc67c1 ("md: ensure resync is prioritized over
recovery"), when a device becomes faulty during resync, the
md_choose_sync_action() function returns early without calling
remove_and_add_spares(), preventing faulty device removal.
This patch extracts a helper function remove_spares() to support
removing faulty devices during RAID resync operations.
Fixes: 4b10a3bc67c1 ("md: ensure resync is prioritized over recovery") Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/linux-raid/20250707075412.150301-1-zhengqixing@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Initially, conditional lock acquisition was removed to fix an xfstest bug
that was observed during internal testing. The deadlock reported by syzbot
is resolved by reintroducing conditional acquisition. The xfstest bug no
longer occurs on kernel version 6.16-rc1 during internal testing. I
assume that changes in other modules may have contributed to this.
Fixes: 69505fe98f19 ("fs/ntfs3: Replace inode_trylock with inode_lock") Reported-by: syzbot+a91fcdbd2698f99db8f4@syzkaller.appspotmail.com Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
To avoid deadlock, Commit 31651c607151 ("hfsplus: avoid deadlock
on file truncation") unlock extree before hfsplus_free_extents(),
and add check wheather extree is locked in hfsplus_free_extents().
However, when operations such as hfsplus_file_release,
hfsplus_setattr, hfsplus_unlink, and hfsplus_get_block are executed
concurrently in different files, it is very likely to trigger the
WARN_ON, which will lead syzbot and xfstest to consider it as an
abnormality.
The comment above this warning also describes one of the easy
triggering situations, which can easily trigger and cause
xfstest&syzbot to report errors.
Several threads could try to lock the shared extents tree.
And warning can be triggered in one thread when another thread
has locked the tree. This is the wrong behavior of the code and
we need to remove the warning.
struct ublk_device's __queues points to an allocation with up to
UBLK_MAX_NR_QUEUES (4096) queues, each of which have:
- struct ublk_queue (48 bytes)
- Tail array of up to UBLK_MAX_QUEUE_DEPTH (4096) struct ublk_io's,
32 bytes each
This means the full allocation can exceed 512 MB, which may well be
impossible to service with contiguous physical pages. Switch to
kvcalloc() and kvfree(), since there is no need for physically
contiguous memory.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250620151008.3976463-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
The reproducer uses a file0 on a ntfs3 file system with a corrupted i_link.
When renaming, the file0's inode is marked as a bad inode because the file
name cannot be deleted.
The underlying bug is that make_bad_inode() is called on a live inode.
In some cases it's "icache lookup finds a normal inode, d_splice_alias()
is called to attach it to dentry, while another thread decides to call
make_bad_inode() on it - that would evict it from icache, but we'd already
found it there earlier".
In some it's outright "we have an inode attached to dentry - that's how we
got it in the first place; let's call make_bad_inode() on it just for shits
and giggles".
Fixes: 78ab59fee07f ("fs/ntfs3: Rework file operations") Reported-by: syzbot+1aa90f0eb1fc3e77d969@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1aa90f0eb1fc3e77d969 Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The audit_init_filter_exe() helper incorrectly checks the readlink(2)
error because an unsigned integer is used to store the result. Use a
signed integer for this check.
The macro takes a parameter called "p" but references "fc" internally.
This happens to compile as long as callers pass a variable named fc,
but breaks otherwise. Rename the first parameter to “fc” to match the
usage and to be consistent with warnfc() / errorfc().
Fixes: a3ff937b33d9 ("prefix-handling analogues of errorf() and friends") Signed-off-by: RubenKelevra <rubenkelevra@gmail.com> Link: https://lore.kernel.org/20250617230927.1790401-1-rubenkelevra@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
... and parse_longname() is not guaranteed that. That's the reason
why it uses kmemdup_nul() to build the argument for kstrtou64();
the problem is, kstrtou64() is not the only thing that need it.
Just get a NUL-terminated copy of the entire thing and be done
with that...
Fixes: dd66df0053ef "ceph: add support for encrypted snapshot names" Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
The move of the module sanity check to earlier skipped the audit logging
call in the case of failure and to a place where the previously used
context is unavailable.
Add an audit logging call for the module loading failure case and get
the module name when possible.
Link: https://issues.redhat.com/browse/RHEL-52839 Fixes: 02da2cbab452 ("module: move check_modinfo() early to early_mod_check()") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Merge tag 'timers-urgent-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for the PTP systemcounter mechanism:
The rework of this mechanism added a 'use_nsec' member to struct
system_counterval. get_device_system_crosststamp() instantiates that
struct on the stack and hands a pointer to the driver callback.
Only the drivers which set use_nsec to true, initialize that field,
but all others ignore it. As get_device_system_crosststamp() does not
initialize the struct, the use_nsec field contains random stack
content in those cases. That causes a miscalulation usually resulting
in a failing range check in the best case.
Initialize the structure before handing it to the drivers to cure
that"
* tag 'timers-urgent-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Zero initialize system_counterval when querying time from phc drivers
Merge tag 'i2c-for-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- qup: avoid potential hang when waiting for bus idle
- tegra: improve ACPI reset error handling
- virtio: use interruptible wait to prevent hang during transfer
* tag 'i2c-for-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: qup: jump out of the loop in case of timeout
i2c: virtio: Avoid hang by using interruptible completion wait
i2c: tegra: Fix reset error handling with ACPI
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM fixes from Russell King:
- use an absolute path for asm/unified.h in KBUILD_AFLAGS to solve a
regression caused by commit d5c8d6e0fa61 ("kbuild: Update assembler
calls to use proper flags and language target")
- fix dead code elimination binutils version check again
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36
ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
Merge tag 'soc-fixes-6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"These are two fixes that came in late, one addresses a regression on a
rockchips based board, the other is for ensuring a consistent dt
binding for a device added in 6.16 before the incorrect one makes it
into a release"
* tag 'soc-fixes-6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: rockchip: Drop netdev led-triggers on NanoPi R5S
arm64: dts: allwinner: a523: Rename emac0 to gmac0
Wolfram Sang [Fri, 25 Jul 2025 22:59:39 +0000 (00:59 +0200)]
Merge tag 'i2c-host-fixes-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host-fixes for v6.16-rc8
qup: avoid potential hang when waiting for bus idle
tegra: improve ACPI reset error handling
virtio: use interruptible wait to prevent hang during transfer
Merge tag 'drm-fixes-2025-07-26' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes (part 2) from Dave Airlie:
"Just the follow up fixes for i915 and xe, all pretty minor.
i915:
- Fix DP 2.7 Gbps DP_LINK_BW value on g4x
- Fix return value on intel_atomic_commit_fence_wait
xe:
- Fix build without debugfs"
* tag 'drm-fixes-2025-07-26' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe: Fix build without debugfs
drm/i915/display: Fix dma_fence_wait_timeout() return value handling
drm/i915/dp: Fix 2.7 Gbps DP_LINK_BW value on g4x
Merge tag 'vfs-6.16-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"Two last-minute fixes for this cycle:
- Set afs vllist to NULL if addr parsing fails
- Add a missing check for reaching the end of the string in afs"
* tag 'vfs-6.16-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
afs: Set vllist to NULL if addr parsing fails
afs: Fix check for NULL terminator
Merge tag 'bcachefs-2025-07-24' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"User reported fixes:
- Fix btree node scan on encrypted filesystems by not using btree
node header fields encrypted
- Fix a race in btree write buffer flush; this caused EROs primarily
during fsck for some people"
* tag 'bcachefs-2025-07-24' of git://evilpiepirate.org/bcachefs:
bcachefs: Add missing snapshots_seen_add_inorder()
bcachefs: Fix write buffer flushing from open journal entry
bcachefs: btree_node_scan: don't re-read before initializing found_btree_node
ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36
Commit e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within
OVERLAY for DCE") accidentally broke the binutils version restriction
that was added in commit 0d437918fb64 ("ARM: 9414/1: Fix build issue
with LD_DEAD_CODE_DATA_ELIMINATION"), reintroducing the segmentation
fault addressed by that workaround.
Restore the binutils version dependency by using
CONFIG_LD_CAN_USE_KEEP_IN_OVERLAY as an additional condition to ensure
that CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION is only enabled with
binutils >= 2.36 and ld.lld >= 21.0.0.
Closes: https://lore.kernel.org/6739da7d-e555-407a-b5cb-e5681da71056@landley.net/ Closes: https://lore.kernel.org/CAFERDQ0zPoya5ZQfpbeuKVZEo_fKsonLf6tJbp32QnSGAtbi+Q@mail.gmail.com/ Cc: stable@vger.kernel.org Fixes: e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE") Reported-by: Rob Landley <rob@landley.net> Tested-by: Rob Landley <rob@landley.net> Reported-by: Martin Wetterwald <martin@wetterwald.eu> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
After commit d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper
flags and language target"), which updated as-instr to use the
'assembler-with-cpp' language option, the Kbuild version of as-instr
always fails internally for arch/arm with
<command-line>: fatal error: asm/unified.h: No such file or directory
compilation terminated.
because '-include' flags are now taken into account by the compiler
driver and as-instr does not have '$(LINUXINCLUDE)', so unified.h is not
found.
This went unnoticed at the time of the Kbuild change because the last
use of as-instr in Kbuild that arch/arm could reach was removed in 5.7
by commit 541ad0150ca4 ("arm: Remove 32bit KVM host support") but a
stable backport of the Kbuild change to before that point exposed this
potential issue if one were to be reintroduced.
Follow the general pattern of '-include' paths throughout the tree and
make unified.h absolute using '$(srctree)' to ensure KBUILD_AFLAGS can
be used independently.
Closes: https://lore.kernel.org/CACo-S-1qbCX4WAVFA63dWfHtrRHZBTyyr2js8Lx=Az03XHTTHg@mail.gmail.com/ Cc: stable@vger.kernel.org Fixes: d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target") Reported-by: KernelCI bot <bot@kernelci.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Kent Overstreet [Tue, 22 Jul 2025 03:41:50 +0000 (23:41 -0400)]
bcachefs: Fix write buffer flushing from open journal entry
When flushing the btree write buffer, we pull write buffer keys directly
from the journal instead of letting the journal write path copy them to
the write buffer.
When flushing from the currently open journal buffer, we have to block
new reservations and wait for outstanding reservations to complete.
Recheck the reservation state after blocking new reservations:
previously, we were checking the reservation count from before calling
__journal_block().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Merge tag 'mm-hotfixes-stable-2025-07-24-18-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"11 hotfixes. 9 are cc:stable and the remainder address post-6.15
issues or aren't considered necessary for -stable kernels.
7 are for MM"
* tag 'mm-hotfixes-stable-2025-07-24-18-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
sprintf.h requires stdarg.h
resource: fix false warning in __request_region()
mm/damon/core: commit damos_quota_goal->nid
kasan: use vmalloc_dump_obj() for vmalloc error reports
mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show()
mm: update MAINTAINERS entry for HMM
nilfs2: reject invalid file types when reading inodes
selftests/mm: fix split_huge_page_test for folio_split() tests
mailmap: add entry for Senozhatsky
mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n
mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
Stephen Rothwell [Mon, 21 Jul 2025 06:15:57 +0000 (16:15 +1000)]
sprintf.h requires stdarg.h
In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4:
include/linux/sprintf.h:11:54: error: unknown type name 'va_list'
11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
| ^~~~~~~
include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>'
Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Sat, 19 Jul 2025 11:26:04 +0000 (20:26 +0900)]
resource: fix false warning in __request_region()
A warning is raised when __request_region() detects a conflict with a
resource whose resource.desc is IORES_DESC_DEVICE_PRIVATE_MEMORY.
But this warning is only valid for iomem_resources.
The hmem device resource uses resource.desc as the numa node id, which can
cause spurious warnings.
This warning appeared on a machine with multiple cxl memory expanders.
One of the NUMA node id is 6, which is the same as the value of
IORES_DESC_DEVICE_PRIVATE_MEMORY.
In this environment it was just a spurious warning, but when I saw the
warning I suspected a real problem so it's better to fix it.
This change fixes this by restricting the warning to only iomem_resource.
This also adds a missing new line to the warning message.
Link: https://lkml.kernel.org/r/20250719112604.25500-1-akinobu.mita@gmail.com Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 19 Jul 2025 18:19:32 +0000 (11:19 -0700)]
mm/damon/core: commit damos_quota_goal->nid
DAMOS quota goal uses 'nid' field when the metric is
DAMOS_QUOTA_NODE_MEM_{USED,FREE}_BP. But the goal commit function is not
updating the goal's nid field. Fix it.
Link: https://lkml.kernel.org/r/20250719181932.72944-1-sj@kernel.org Fixes: 0e1c773b501f ("mm/damon/core: introduce damos quota goal metrics for memory node utilization") [6.16.x] Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge tag 'pci-v6.16-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fix from Bjorn Helgaas:
- Create pwrctrl devices only when we need them, i.e., when
CONFIG_PCI_PWRCTRL is enabled.
This allows brcmstb to work around a pwrctrl regression by
disabling CONFIG_PCI_PWRCTRL (Manivannan Sadhasivam)
* tag 'pci-v6.16-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/pwrctrl: Create pwrctrl devices only when CONFIG_PCI_PWRCTRL is enabled
Lucas De Marchi [Tue, 22 Jul 2025 19:52:08 +0000 (12:52 -0700)]
drm/xe: Fix build without debugfs
When CONFIG_DEBUG_FS is off, drivers/gpu/drm/xe/xe_gt_debugfs.o
is not built and build fails on some setups with:
ld: drivers/gpu/drm/xe/xe_gt.o: in function `xe_fault_inject_gt_reset':
drivers/gpu/drm/xe/xe_gt.h:27:(.text+0x1659): undefined reference to `gt_reset_failure'
ld: drivers/gpu/drm/xe/xe_gt.h:27:(.text+0x1c16): undefined reference to `gt_reset_failure'
collect2: error: ld returned 1 exit status
Do not use the gt_reset_failure attribute if debugfs is not enabled.
Fixes: 8f3013e0b222 ("drm/xe: Introduce fault injection for gt reset") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250722-xe-fix-build-fault-v1-1-157384d50987@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4d3bbe9dd28c0a4ca119e4b8823c5f5e9cb3ff90) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Merge tag 'sound-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Some last-minute fixes. All changes are device-specific small fixes or
quirks, safe to apply"
* tag 'sound-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: mediatek: common: fix device and OF node leak
ALSA: hda/realtek: Fix mute LED mask on HP OMEN 16 laptop
ALSA: usb-audio: qcom: Adjust mutex unlock order
ASoC: SDCA: correct the calculation of the maximum init table size
ASoC: rt5650: Eliminate the high frequency glitch
ASoC: SOF: Intel: PTL: Add the sdw_process_wakeen op
ALSA: hda/realtek - Add mute LED support for HP Pavilion 15-eg0xxx
ALSA: hda/realtek - Add mute LED support for HP Victus 15-fa0xxx
ASoC: mediatek: mt8365-dai-i2s: pass correct size to mt8365_dai_set_priv
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Two important arm64 fixes ahead of the 6.16 release.
The first fixes a regression introduced during the merge window where
the KVM UUID (which is used to advertise KVM-specific hypercalls for
things like time synchronisation in the guest) was corrupted thanks to
an endianness bug introduced when converting the code to use the
UUID_INIT() helper.
The second fixes a stack-pointer corruption issue during
context-switch which has been observed in the wild when taking a
pseudo-NMI with shadow call stack enabled.
Summary:
- Fix broken UUID value for the KVM/arm64 hypervisor SMCCC interface
- Fix stack corruption on context-switch, primarily seen on (but not
limited to) configurations with both pNMI and SCS enabled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack()
arm64: kvm, smccc: Fix vendor uuid
Merge tag 'net-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can and xfrm.
The TI regression notified last week is actually on our net-next tree,
it does not affect 6.16.
We are investigating a virtio regression which is quite hard to
reproduce - currently only our CI sporadically hits it. Hopefully it
should not be critical, and I'm not sure that an additional week would
be enough to solve it.
Current release - fix to a fix:
- sched: sch_qfq: avoid sleeping in atomic context in qfq_delete_class
Previous releases - regressions:
- xfrm:
- set transport header to fix UDP GRO handling
- delete x->tunnel as we delete x
- eth:
- mlx5: fix memory leak in cmd_exec()
- i40e: when removing VF MAC filters, avoid losing PF-set MAC
- gve: fix stuck TX queue for DQ queue format
Previous releases - always broken:
- can: fix NULL pointer deref of struct can_priv::do_set_mode
- eth:
- ice: fix a null pointer dereference in ice_copy_and_init_pkg()
- ism: fix concurrency management in ism_cmd()
- dpaa2: fix device reference count leak in MAC endpoint handling
- icssg-prueth: fix buffer allocation for ICSSG
Misc:
- selftests: mptcp: increase code coverage"
* tag 'net-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
net: hns3: default enable tx bounce buffer when smmu enabled
net: hns3: fixed vf get max channels bug
net: hns3: disable interrupt when ptp init failed
net: hns3: fix concurrent setting vlan filter issue
s390/ism: fix concurrency management in ism_cmd()
selftests: drv-net: wait for iperf client to stop sending
MAINTAINERS: Add in6.h to MAINTAINERS
selftests: netfilter: tone-down conntrack clash test
can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class
gve: Fix stuck TX queue for DQ queue format
net: appletalk: Fix use-after-free in AARP proxy probe
net: bcmasp: Restore programming of TX map vector register
selftests: mptcp: connect: also cover checksum
selftests: mptcp: connect: also cover alt modes
e1000e: ignore uninitialized checksum word on tgp
e1000e: disregard NVM checksum on tgp when valid checksum bit is not set
ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
i40e: When removing VF MAC filters, only check PF-set MAC
i40e: report VF tx_dropped with tx_errors instead of tx_discards
...
1) Premption fixes for xfrm_state_find.
From Sabrina Dubroca.
2) Initialize offload path also for SW IPsec GRO. This fixes a
performance regression on SW IPsec offload.
From Leon Romanovsky.
3) Fix IPsec UDP GRO for IKE packets.
From Tobias Brunner,
4) Fix transport header setting for IPcomp after decompressing.
From Fernando Fernandez Mancera.
5) Fix use-after-free when xfrmi_changelink tries to change
collect_md for a xfrm interface.
From Eyal Birger .
6) Delete the special IPcomp x->tunnel state along with the state x
to avoid refcount problems.
From Sabrina Dubroca.
Please pull or let me know if there are problems.
* tag 'ipsec-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
Revert "xfrm: destroy xfrm_state synchronously on net exit path"
xfrm: delete x->tunnel as we delete x
xfrm: interface: fix use-after-free after changing collect_md xfrm interface
xfrm: ipcomp: adjust transport header after decompressing
xfrm: Set transport header to fix UDP GRO handling
xfrm: always initialize offload path
xfrm: state: use a consistent pcpu_id in xfrm_state_find
xfrm: state: initialize state_ptrs earlier in xfrm_state_find
====================
net: hns3: default enable tx bounce buffer when smmu enabled
The SMMU engine on HIP09 chip has a hardware issue.
SMMU pagetable prefetch features may prefetch and use a invalid PTE
even the PTE is valid at that time. This will cause the device trigger
fake pagefaults. The solution is to avoid prefetching by adding a
SYNC command when smmu mapping a iova. But the performance of nic has a
sharp drop. Then we do this workaround, always enable tx bounce buffer,
avoid mapping/unmapping on TX path.
This issue only affects HNS3, so we always enable
tx bounce buffer when smmu enabled to improve performance.
Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-5-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, the queried maximum of vf channels is the maximum of channels
supported by each TC. However, the actual maximum of channels is
the maximum of channels supported by the device.
Fixes: 849e46077689 ("net: hns3: add ethtool_ops.get_channels support for VF") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-4-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The vport->req_vlan_fltr_en may be changed concurrently by function
hclge_sync_vlan_fltr_state() called in periodic work task and
function hclge_enable_vport_vlan_filter() called by user configuration.
It may cause the user configuration inoperative. Fixes it by protect
the vport->req_vlan_fltr by vport_lock.
Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-2-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Halil Pasic [Tue, 22 Jul 2025 16:18:17 +0000 (18:18 +0200)]
s390/ism: fix concurrency management in ism_cmd()
The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time. Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.
This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.
A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.
On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible. Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.
Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory") Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Merge tag 'drm-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"This might just be part one, but I'm sending it a bit early as it has
two sets of reverts for regressions, one is all the gem/dma-buf
handling and another was a nouveau ioctl change.
Otherwise there is an amdgpu fix, nouveau fix and a scheduler fix.
If any other changes come in I'll follow up with another more usual
Fri/Sat MR.
gem:
- revert all the dma-buf/gem changes as there as lifetime issues
with them
nouveau:
- revert an ioctl change as it causes issues
- fix NULL ptr on fermi
bridge:
- remove extra semicolon
sched:
- remove hang causing optimisation
amdgpu:
- fix garbage in cleared vram after resume"
* tag 'drm-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/kernel:
drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()
Revert "drm/nouveau: check ioctl command codes better"
drm/nouveau/nvif: fix null ptr deref on pre-fermi boards
Revert "drm/gem-dma: Use dma_buf from GEM object instance"
Revert "drm/gem-shmem: Use dma_buf from GEM object instance"
Revert "drm/gem-framebuffer: Use dma_buf from GEM object instance"
Revert "drm/prime: Use dma_buf from GEM object instance"
Revert "drm/etnaviv: Use dma_buf from GEM object instance"
Revert "drm/vmwgfx: Use dma_buf from GEM object instance"
Revert "drm/virtio: Use dma_buf from GEM object instance"
drm/sched: Remove optimization that causes hang when killing dependent jobs
drm/amdgpu: Reset the clear flag in buddy during resume
selftests: drv-net: wait for iperf client to stop sending
A few packets may still be sent out during the termination of iperf
processes. These late packets cause failures in rss_ctx.py when they
arrive on queues expected to be empty.
Yang Xiwen [Sun, 15 Jun 2025 16:01:10 +0000 (00:01 +0800)]
i2c: qup: jump out of the loop in case of timeout
Original logic only sets the return value but doesn't jump out of the
loop if the bus is kept active by a client. This is not expected. A
malicious or buggy i2c client can hang the kernel in this case and
should be avoided. This is observed during a long time test with a
PCA953x GPIO extender.
Fix it by changing the logic to not only sets the return value, but also
jumps out of the loop and return to the caller with -ETIMEDOUT.
Fixes: fbfab1ab0658 ("i2c: qup: reorganization of driver code to remove polling for qup v1") Signed-off-by: Yang Xiwen <forbidden405@outlook.com> Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250616-qca-i2c-v1-1-2a8d37ee0a30@outlook.com