Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular
kernel, while the tests consistently pass on a debug kernel. This suggests
a timing-sensitive issue.
To mitigate this, introduce a short sleep before each xvlan_over_bond
connectivity check. The delay helps ensure neighbor and route cache
have fully converged before verifying connectivity.
The sleep interval is kept minimal since check_connection() is invoked
nearly 100 times during the test.
Fixes: 246af950b940 ("selftests: bonding: add macvlan over bond testing") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Disconnected files or directories can appear when they are visible and
opened from a bind mount, but have been renamed or moved from the source
of the bind mount in a way that makes them inaccessible from the mount
point (i.e. out of scope).
Previously, access rights tied to files or directories opened through a
disconnected directory were collected by walking the related hierarchy
down to the root of the filesystem, without taking into account the
mount point because it couldn't be found. This could lead to
inconsistent access results, potential access right widening, and
hard-to-debug renames, especially since such paths cannot be printed.
For a sandboxed task to create a disconnected directory, it needs to
have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
the underlying source of the bind mount, and read access to the related
mount point. Because a sandboxed task cannot acquire more access
rights than those defined by its Landlock domain, this could lead to
inconsistent access rights due to missing permissions that should be
inherited from the mount point hierarchy, while inheriting permissions
from the filesystem hierarchy hidden by this mount point instead.
Landlock now handles files and directories opened from disconnected
directories by taking into account the filesystem hierarchy when the
mount point is not found in the hierarchy walk, and also always taking
into account the mount point from which these disconnected directories
were opened. This ensures that a rename is not allowed if it would
widen access rights [1].
The rationale is that, even if disconnected hierarchies might not be
visible or accessible to a sandboxed task, relying on the collected
access rights from them improves the guarantee that access rights will
not be widened during a rename because of the access right comparison
between the source and the destination (see LANDLOCK_ACCESS_FS_REFER).
It may look like this would grant more access on disconnected files and
directories, but the security policies are always enforced for all the
evaluated hierarchies. This new behavior should be less surprising to
users and safer from an access control perspective.
Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and
fix the related comment.
Because opened files have their access rights stored in the related file
security properties, there is no impact for disconnected or unlinked
files.
Cc: Christian Brauner <brauner@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Song Liu <song@kernel.org> Reported-by: Tingmao Wang <m@maowtm.org> Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org Closes: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control") Link: https://lore.kernel.org/r/b0f46246-f2c5-42ca-93ce-0d629702a987@maowtm.org Reviewed-by: Tingmao Wang <m@maowtm.org> Link: https://lore.kernel.org/r/20251128172200.760753-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The rodata=on security measure requires that any code path which does
vmalloc -> set_memory_ro/set_memory_rox must protect the linear map alias
too. Therefore, if such a call fails, we must abort set_memory_* and caller
must take appropriate action; currently we are suppressing the error, and
there is a real chance of such an error arising post commit a166563e7ec3
("arm64: mm: support large block mapping when rodata=full"). Therefore,
propagate any error to the caller.
Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full") Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Discovered by Atuin - Automated Vulnerability Discovery Engine.
The 'len' variable is calculated as 'min(32, trans->len + 1)',
which includes the 1-byte command header.
When copying data from 'trans->tx_buf' to 'ch341->tx_buf + 1', using 'len'
as the length is incorrect because:
1. It causes an out-of-bounds read from 'trans->tx_buf' (which has size
'trans->len', i.e., 'len - 1' in this context).
2. It can cause an out-of-bounds write to 'ch341->tx_buf' if 'len' is
CH341_PACKET_LENGTH (32). Writing 32 bytes to ch341->tx_buf + 1
overflows the buffer.
devm_pm_runtime_enable() can fail due to memory allocation failures.
The current code ignores its return value and proceeds with
pm_runtime_resume_and_get(), which may operate on incorrectly
initialized runtime PM state.
Check the return value of devm_pm_runtime_enable() and return the
error code if it fails.
Fixes: 6a2277a0ebe7 ("mtd: rawnand: renesas: Use runtime PM instead of the raw clock API") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.
However, it can change after "read_again:" label:
while (count < limit) {
...
if (count >= limit)
break;
read_again:
...
/* XSK pool expects RX frame 1:1 mapped to XSK buffer */
if (likely(status & rx_not_ls)) {
xsk_buff_free(buf->xdp);
buf->xdp = NULL;
dirty++;
count++;
goto read_again;
}
...
This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Neither sock4 nor sock6 pointers are guaranteed to be non-NULL in
vxlan_xmit_one, e.g. if the iface is brought down. This can lead to the
following NULL dereference:
Mentioned commits changed the code path in vxlan_xmit_one and as a side
effect the sock4/6 pointer validity checks in vxlan(6)_get_route were
lost. Fix this by adding back checks.
Since both commits being fixed were released in the same version (v6.7)
and are strongly related, bundle the fixes in a single commit.
Reported-by: Liang Li <liali@redhat.com> Fixes: 6f19b2c136d9 ("vxlan: use generic function for tunnel IPv4 route lookup") Fixes: 2aceb896ee18 ("vxlan: use generic function for tunnel IPv6 route lookup") Cc: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251126102627.74223-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Connlimit expression can be used for all kind of packets and not only
for packets with connection state new. See this ruleset as example:
table ip filter {
chain input {
type filter hook input priority filter; policy accept;
tcp dport 22 ct count over 4 counter
}
}
Currently, if the connection count goes over the limit the counter will
count the packets. When a connection is closed, the connection count
won't decrement as it should because it is only updated for new
connections due to an optimization on __nf_conncount_add() that prevents
updating the list if the connection is duplicated.
To solve this problem, check whether the connection was skipped and if
so, update the list. Adjust count_tree() too so the same fix is applied
for xt_connlimit.
Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup") Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When using nf_conncount infrastructure for non-confirmed connections a
duplicated track is possible due to an optimization introduced since
commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC").
In order to fix this introduce a new conncount API that receives
directly an sk_buff struct. It fetches the tuple and zone and the
corresponding ct from it. It comes with both existing conncount variants
nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove
the old API and adjust all the users to use the new one.
This way, for each sk_buff struct it is possible to check if there is a
ct present and already confirmed. If so, skip the add operation.
Commit 97523a4edb7b ("kernel/resource: remove first_lvl / siblings_only
logic") removed an optimization introduced by commit 756398750e11
("resource: avoid unnecessary lookups in find_next_iomem_res()"). That
was not called out in the message of the first commit explicitly so it's
not entirely clear whether removing the optimization happened
inadvertently or not.
As the original commit message of the optimization explains there is no
point considering the children of a subtree in find_next_iomem_res() if
the top level range does not match.
Reinstating the optimization results in performance improvements in
systems where /proc/iomem is ~5k lines long. Calling mmap() on /dev/mem
in such platforms takes 700-1500μs without the optimisation and 10-50μs
with the optimisation.
Note that even though commit 97523a4edb7b removed the 'sibling_only'
parameter from next_resource(), newer kernels have basically reinstated it
under the name 'skip_children'.
The base address of JPEG decoder core 1 should start at 0x10000, and
have a size of 0x10000, i.e. it is right after core 0.
Instead the core has the same base address as core 0, and with a crazy
large size. This looks like a mixup of address and size cells when the
ranges were converted.
This causes the kernel to fail to register the second core due to sysfs
name conflicts:
regulator_supply_alias_list was accessed without any locking in
regulator_supply_alias(), regulator_register_supply_alias(), and
regulator_unregister_supply_alias(). Concurrent registration,
unregistration and lookups can race, leading to:
1 use-after-free if an alias entry is removed while being read,
2 duplicate entries when two threads register the same alias,
3 inconsistent alias mappings observed by consumers.
Protect all traversals, insertions and deletions on
regulator_supply_alias_list with the existing regulator_list_mutex.
Fixes: a06ccd9c3785f ("regulator: core: Add ability to create a lookup alias for supply") Signed-off-by: sparkhuang <huangshaobo3@xiaomi.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20251127025716.5440-1-huangshaobo3@xiaomi.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.
Fixes: f77cbb9a3e5d ("ARM: dts: exynos: Add bcm4334 device node to Trats2") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20251126102618.3103517-5-m.szyprowski@samsung.com Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.
Fixes: a19f6efc01df ("ARM: dts: exynos: Enable WLAN support for the Trats board") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20251126102618.3103517-4-m.szyprowski@samsung.com Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.
Fixes: 8620cc2f99b7 ("ARM: dts: exynos: Add devicetree file for the Galaxy S2") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20251126102618.3103517-3-m.szyprowski@samsung.com Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.
Fixes: f1b0ffaa686f ("ARM: dts: exynos: Enable WLAN support for the UniversalC210 board") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20251126102618.3103517-2-m.szyprowski@samsung.com Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 78b72897a5c8 ("soc: samsung: exynos-pmu: Enable CPU Idle for
gs101") added system wide suspend/resume callbacks to Exynos PMU driver,
but some items used by these callbacks are initialized only on
GS101-compatible boards. Move that initialization to exynos_pmu_probe()
to avoid potential lockdep warnings like below observed during system
suspend/resume cycle:
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 0 UID: 0 PID: 2134 Comm: rtcwake Not tainted 6.18.0-rc7-next-20251126-00039-g1d656a1af243 #11794 PREEMPT
Hardware name: Samsung Exynos (Flattened Device Tree)
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x68/0x88
dump_stack_lvl from register_lock_class+0x970/0x988
register_lock_class from __lock_acquire+0xc8/0x29ec
__lock_acquire from lock_acquire+0x134/0x39c
lock_acquire from _raw_spin_lock+0x38/0x48
_raw_spin_lock from exynos_cpupm_suspend_noirq+0x18/0x34
exynos_cpupm_suspend_noirq from dpm_run_callback+0x98/0x2b8
dpm_run_callback from device_suspend_noirq+0x8c/0x310
Fixes: 78b72897a5c8 ("soc: samsung: exynos-pmu: Enable CPU Idle for gs101") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://patch.msgid.link/20251126110038.3326768-1-m.szyprowski@samsung.com
[krzk: include calltrace into commit msg] Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Airoha EN7523 specific bug
--------------------------
We found that some serial console may pull TX line to GROUND during board
boot time. Airoha uses TX line as one of its bootstrap pins. On the EN7523
SoC this may lead to booting in RESERVED boot mode.
It was found that some flashes operates incorrectly in RESERVED mode.
Micron and Skyhigh flashes are definitely affected by the issue,
Winbond flashes are not affected.
Details:
--------
DMA reading of odd pages on affected flashes operates incorrectly. Page
reading offset (start of the page) on hardware level is replaced by 0x10.
Thus results in incorrect data reading. As result OS loading becomes
impossible.
Usage of UBI make things even worse. On attaching, UBI will detects
corruptions (because of wrong reading of odd pages) and will try to
recover. For recovering UBI will erase and write 'damaged' blocks with
a valid information. This will destroy all UBI data.
Non-DMA reading is OK.
This patch detects booting in reserved mode, turn off DMA and print big
fat warning.
It's worth noting that the boot configuration is preserved across reboots.
Therefore, to boot normally, you should do the following:
- disconnect the serial console from the board,
- power cycle the board.
Fixes: a403997c12019 ("spi: airoha: add SPI-NAND Flash controller driver") Signed-off-by: Mikhail Kshevetskiy <mikhail.kshevetskiy@iopsys.eu> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Link: https://patch.msgid.link/20251125234047.1101985-2-mikhail.kshevetskiy@iopsys.eu Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
virtio pci uses word to mean "16 bits". mmio uses it to mean
"32 bits".
To avoid confusion, let's avoid the term in core virtio
altogether. Just say U64 to mean "64 bit".
Fixes: e7d4c1c5a546 ("virtio: introduce extended features") Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Message-ID: <ad53b7b6be87fc524f45abaeca0bb05fb3633397.1764225384.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use %pe instead of %ps when printing ERR_PTR() values. %ps is intended
for string pointers, while %pe correctly prints symbolic error names
for error pointers returned via ERR_PTR().
This shows the returned error value more clearly.
Fixes: 67f27b8b3a34 ("pds_vdpa: subscribe to the pds_core events") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251018174705.1511982-1-alok.a.tiwari@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we fail to attach to a cgroup we are leaking the id. This adds
a new goto to free the id.
Fixes: 7d9896e9f6d0 ("vhost: Reintroduce kthread API and add mode selection") Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251101194358.13605-1-michael.christie@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When query_virtqueues() fails, the error log prints the variable err
instead of cmd->err. Since err may still be zero at this point, the
log message can misleadingly report a success value 0 even though the
command actually failed.
Even worse, once err is set to the first failure, subsequent logs
print that same stale value. This makes the error reporting appear
one step behind the actual failing queue index, which is confusing
and misleading.
Fix the log to report cmd->err, which reflects the real failure code
returned by the firmware.
Fixes: 1fcdf43ea69e ("vdpa/mlx5: Use async API for vq query command") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20250929134258.80956-1-alok.a.tiwari@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
@free will free the map handle not sync it. Fix the doc to match.
Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core")
Message-Id: <f6ff1c7aff8401900bf362007d7fb52dfdb6a15b.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Rewrite the comment for better grammar and clarity.
Fixes: 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue")
Message-Id: <e317e91bd43b070e5eaec0ebbe60c5749d02e2dd.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove colons after "Returns" in virtio_map_ops function
documentation - both to avoid triggering an htmldoc warning
and for consistency with virtio_config_ops.
This affects map_page, alloc, need_sync, and max_mapping_size.
Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core")
Message-Id: <c262893fa21f4b1265147ef864574a9bd173348f.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix grammar issues in the virtio_map_ops docs:
- missing article before "transport"
- "implements" -> "implement" to match subject
Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core")
Message-Id: <3f7bcae5a984f14b72e67e82572b110acb06fa7e.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: c502eb85c34e ("virtio: introduce virtio_queue_info struct and find_vqs_info() config op")
Message-Id: <a5cf2b92573200bdb1c1927e559d3930d61a4af2.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The finalize_features documentation uses a tab between words.
Use space instead.
Fixes: d16c0cd27331 ("docs: driver-api: virtio: virtio on Linux")
Message-Id: <39d7685c82848dc6a876d175e33a1407f6ab3fc1.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
WARNING: ./drivers/virtio/virtio_ring.c:3174 function parameter 'vaddr' not described in 'virtqueue_map_free_coherent'
WARNING: ./drivers/virtio/virtio_ring.c:3308 expecting prototype for virtqueue_mapping_error(). Prototype was for virtqueue_map_mapping_error() instead
The kernel-doc block for virtqueue_map_free_coherent() omitted the @vaddr parameter, and
the kernel-doc header for virtqueue_map_mapping_error() used the wrong function name
(virtqueue_mapping_error) instead of the actual function name.
This change updates:
- the function name in the comment to virtqueue_map_mapping_error()
- adds the missing @vaddr description in the comment for virtqueue_map_free_coherent()
virtio_vdpa_set_status() is declared as returning void, but it used
"return vdpa_set_status()" Since vdpa_set_status() also returns
void, the return statement is unnecessary and misleading.
Remove it.
Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Message-Id: <20251001191653.1713923-1-alok.a.tiwari@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Starting with commit 69a8b62a7aa1 ("riscv: acpi: avoid errors caused by
probing DT devices when ACPI is used"), riscv images no longer populate
devicetree if ACPI is enabled. This causes unit tests to fail which require
the root node to be set.
# Subtest: of_dtb
# module: of_test
1..2
# of_dtb_root_node_found_by_path: EXPECTATION FAILED at drivers/of/of_test.c:21
Expected np is not null, but is
# of_dtb_root_node_found_by_path: pass:0 fail:1 skip:0 total:1
not ok 1 of_dtb_root_node_found_by_path
# of_dtb_root_node_populates_of_root: EXPECTATION FAILED at drivers/of/of_test.c:31
Expected of_root is not null, but is
# of_dtb_root_node_populates_of_root: pass:0 fail:1 skip:0 total:1
not ok 2 of_dtb_root_node_populates_of_root
Skip those tests for RISCV if the root node is not populated.
Fixes: 69a8b62a7aa1 ("riscv: acpi: avoid errors caused by probing DT devices when ACPI is used") Cc: Han Gao <rabenda.cn@gmail.com> Cc: Paul Walmsley <pjw@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Paul Walmsley <pjw@kernel.org> # arch/riscv Link: https://patch.msgid.link/20251023160415.705294-1-linux@roeck-us.net Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the MB_CHECK_ASSERT macro is enabled, we found that the
current validation logic in __mb_check_buddy has a gap in
detecting certain invalid buddy states, particularly related
to order-0 (bitmap) bits.
The original logic consists of three steps:
1. Validates higher-order buddies: if a higher-order bit is
set, at most one of the two corresponding lower-order bits
may be free; if a higher-order bit is clear, both lower-order
bits must be allocated (and their bitmap bits must be 0).
2. For any set bit in order-0, ensures all corresponding
higher-order bits are not free.
3. Verifies that all preallocated blocks (pa) in the group
have pa_pstart within bounds and their bitmap bits marked as
allocated.
However, this approach fails to properly validate cases where
order-0 bits are incorrectly cleared (0), allowing some invalid
configurations to pass:
corrupt integral
order 3 1 1
order 2 1 1 1 1
order 1 1 1 1 1 1 1 1 1
order 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Here we get two adjacent free blocks at order-0 with inconsistent
higher-order state, and the right one shows the correct scenario.
The root cause is insufficient validation of order-0 zero bits.
To fix this and improve completeness without significant performance
cost, we refine the logic:
1. Maintain the top-down higher-order validation, but we no longer
check the cases where the higher-order bit is 0, as this case will
be covered in step 2.
2. Enhance order-0 checking by examining pairs of bits:
- If either bit in a pair is set (1), all corresponding
higher-order bits must not be free.
- If both bits are clear (0), then exactly one of the
corresponding higher-order bits must be free
3. Keep the preallocation (pa) validation unchanged.
This change closes the validation gap, ensuring illegal buddy states
involving order-0 are correctly detected, while removing redundant
checks and maintaining efficiency.
Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4") Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251106060614.631382-3-sunyongjian@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
When excl_prog_hash is 0 and excl_prog_hash_size is non-zero, the map also
needs to be freed. Otherwise, the map memory will not be reclaimed, just
like the memory leak problem reported by syzbot [1].
Make all headers part of make's dependencies computations.
Otherwise, updating audit.h, common.h, scoped_base_variants.h,
scoped_common.h, scoped_multiple_domain_variants.h, or wrappers.h,
re-running make and running selftests could lead to testing stale headers.
ublk_ch_uring_cmd_local() may jump to the out label before
initialising the io pointer. This will cause trouble if DEBUG is
defined, because the pr_devel() call dereferences io. Clang reports:
drivers/block/ublk_drv.c:2403:6: error: variable 'io' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
2403 | if (tag >= ub->dev_info.queue_depth)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/block/ublk_drv.c:2492:32: note: uninitialized use occurs here
2492 | __func__, cmd_op, tag, ret, io->flags);
|
Fix this by initialising io to NULL and checking it before
dereferencing it.
On all AMD AM4 systems I have seen, e.g ASUS X470-i, Pro WS X570 Ace
and equivalent Gigabyte, amd-pstate does not initialize when the
x2apic is enabled in the BIOS. Kernel debug messages include:
[ 0.315438] acpi LNXCPU:00: Failed to get CPU physical ID.
[ 0.354756] ACPI CPPC: No CPC descriptor for CPU:0
[ 0.714951] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled
I tracked this down to map_x2apic_id() checking device_declaration
passed in via the type argument of acpi_get_phys_id() via
map_madt_entry() while map_lapic_id() does not.
It appears these BIOSes use Processor statements for declaring the CPUs
in the ACPI namespace instead of processor device objects (which should
have been used). CPU declarations via Processor statements were
deprecated in ACPI 6.0 that was released 10 years ago. They should not
be used any more in any contemporary platform firmware.
I tried to contact Asus support multiple times, but never received a
reply nor did any BIOS update ever change this.
Fix amd-pstate w/ x2apic on am4 by allowing map_x2apic_id() to work with
CPUs declared via Processor statements for IDs less than 255, which is
consistent with ACPI 5.0 that still allowed Processor statements to be
used for declaring CPUs.
Fixes: 7237d3de78ff ("x86, ACPI: add support for x2apic ACPI extensions") Signed-off-by: René Rebe <rene@exactco.de>
[ rjw: Changelog edits ] Link: https://patch.msgid.link/20251126.165513.1373131139292726554.rene@exactco.de Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In sy7636a_sensor_probe(), regulator_enable() is called but if
devm_hwmon_device_register_with_info() fails, the function returns
without calling regulator_disable(), leaving the regulator enabled
and leaking the reference count.
Switch to devm_regulator_get_enable() to automatically
manage the regulator resource.
XCD id is assigned to uuid, which causes some performance
drop in SPX mode, assigning AID back will resolve the
issue.
Fixes: 3a75edf93aae ("drm/amdkfd: set uuid for each partition in topology") Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The .H_SYNC_POLARITY and .V_SYNC_POLARITY variables are 1 bit bitfields
of a u32. The ATOM_HSYNC_POLARITY define is 0x2 and the
ATOM_VSYNC_POLARITY is 0x4. When we do a bitwise negate of 0, 2, or 4
then the last bit is always 1 so this code always sets .H_SYNC_POLARITY
and .V_SYNC_POLARITY to true.
This code is instead intended to check if the ATOM_HSYNC_POLARITY or
ATOM_VSYNC_POLARITY flags are set and reverse the result. In other
words, it's supposed to be a logical negate instead of a bitwise negate.
Fixes: ae79c310b1a6 ("drm/amd/display: Add DCE12 bios parser support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
FMODE_NOCMTIME used to be just a hack for the legacy XFS handle-based
"invisible I/O", but commit e5e9b24ab8fa ("nfsd: freeze c/mtime updates
with outstanding WRITE_ATTRS delegation") started using it from
generic callers.
I'm not sure other file systems are actually read for this in general,
so the above commit should get a closer look, but for it to make any
sense, file_update_time needs to respect the flag.
Lift the check from file_modified_flags to file_update_time so that
users of file_update_time inherit the behavior and so that all the
checks are done in one place.
Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251120064859.2911749-3-hch@lst.de Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the two high-level APIs use two helper functions to implement
almost all of the logic. Refactor the two helpers and the common logic
into a new file_update_time_flags routine that gets the iocb flags or
0 in case of file_update_time passed so that the entire logic is
contained in a single function and can be easily understood and modified.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251120064859.2911749-2-hch@lst.de Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 7f30e7a42371 ("fs: lift the FMODE_NOCMTIME check into file_update_time_flags") Signed-off-by: Sasha Levin <sashal@kernel.org>
wait_for_completion_timeout() returns the remaining jiffies
(at least 1) on success or 0 on timeout, but never negative
error codes. The current code incorrectly checks for negative
values, causing timeouts to be ignored and treated as success.
Check for a zero return value to correctly identify and
handle timeout events.
The use of firmware_loader is an implementation detail of drivers rather
than a dependency. FW_LOADER is typically selected rather than depended
on; the Rust abstractions should do the same thing.
memset_io() writes memory byte by byte with __raw_writeb() on the arm
platform if the size is word. but XCVR data RAM memory can't be accessed
with byte address, so with memset_io() the channel status control memory
is not really cleared, use writel_relaxed() instead.
Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20251126064509.1900974-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Function new_inode() returns a new inode with inode->i_mapping->gfp_mask
set to GFP_HIGHUSER_MOVABLE. This value includes the __GFP_FS flag, so
allocations in that address space can recurse into filesystem memory
reclaim. We don't want that to happen because it can consume a
significant amount of stack memory.
Worse than that is that it can also deadlock: for example, in several
places, gfs2_unstuff_dinode() is called inside filesystem transactions.
This calls filemap_grab_folio(), which can allocate a new folio, which
can trigger memory reclaim. If memory reclaim recurses into the
filesystem and starts another transaction, a deadlock will ensue.
To fix these kinds of problems, prevent memory reclaim from recursing
into filesystem code by making sure that the gfp_mask of inode address
spaces doesn't include __GFP_FS.
The "meta" and resource group address spaces were already using GFP_NOFS
as their gfp_mask (which doesn't include __GFP_FS). The default value
of GFP_HIGHUSER_MOVABLE is less restrictive than GFP_NOFS, though. To
avoid being overly limiting, use the default value and only knock off
the __GFP_FS flag. I'm not sure if this will actually make a
difference, but it also shouldn't hurt.
This patch is loosely based on commit ad22c7a043c2 ("xfs: prevent stack
overflows from page cache allocation").
Fixes xfstest generic/273.
Fixes: dc0b9435238c ("gfs: Don't use GFP_NOFS in gfs2_unstuff_dinode") Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This configuration was missing from the initial commit.
Found by Jiri Benc <jbenc@redhat.com>
Fixes: c0a3873b9938 ("ASoC: nau8325: new driver") Cc: Seven Lee <wtli@nuvoton.com> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://patch.msgid.link/20251126091759.2490019-3-perex@perex.cz Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
nau8325.c:430:13: error: variable 'n2_max' is uninitialized when used here [-Werror,-Wuninitialized]
which are false positive, because the variables will be always
initialized when used (guarded by mclk_max!=0 check). However
initializing them upfront makes the code more obvious and easier, plus
it silences the warning.
The i2c probe functions here don't use the id information provided in
their second argument, so the single-parameter i2c probe function
("probe_new") can be used instead.
This avoids scanning the identifier tables during probes.
Clockevents cannot be deregistered so suppress the bind attributes to
prevent the driver from being unbound and releasing the underlying
resources after registration.
Even if the driver can currently only be built-in, also switch to
builtin_platform_driver() to prevent it from being unloaded should
modular builds ever be enabled.
Fixes: cec32ac75827 ("clocksource/drivers/nxp-timer: Add the System Timer Module for the s32gx platforms") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20251111153226.579-4-johan@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Platform drivers can be probed after their init sections have been
discarded (e.g. on probe deferral or manual rebind through sysfs) so the
probe function must not live in init. Device managed resource actions
similarly cannot be discarded.
The "_probe" suffix of the driver structure name prevents modpost from
warning about this so replace it to catch any similar future issues.
Fixes: cec32ac75827 ("clocksource/drivers/nxp-timer: Add the System Timer Module for the s32gx platforms") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: stable@vger.kernel.org # 6.16 Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20251017054943.7195-1-johan@kernel.org
Stable-dep-of: 6a2416892e89 ("clocksource/drivers/nxp-stm: Prevent driver unbind") Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver does not support unbinding (e.g. as clockevents cannot be
deregistered) so suppress the bind attributes to prevent the driver from
being unbound and rebound after registration (and disabling the timer
when reprobing fails).
Even if the driver can currently only be built-in, also switch to
builtin_platform_driver() to prevent it from being unloaded should
modular builds ever be enabled.
Clockevents cannot be deregistered so suppress the bind attributes to
prevent the driver from being unbound and releasing the underlying
resources after registration.
Fixes: 4891f01527bb ("clocksource/drivers/arm_arch_timer: Add standalone MMIO driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://patch.msgid.link/20251111153226.579-2-johan@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The purpose of the devm_add_action_or_reset() helper is to call the
action function in case adding an action ever fails so drop the clock
source deregistration from the error path to avoid deregistering twice.
Fixes: cec32ac75827 ("clocksource/drivers/nxp-timer: Add the System Timer Module for the s32gx platforms") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20251017055039.7307-1-johan@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The ralink_systick_init() function does not release all acquired resources
on its error paths. If irq_of_parse_and_map() or a subsequent call fails,
the previously created I/O memory mapping and IRQ mapping are leaked.
Add goto-based error handling labels to ensure that all allocated
resources are correctly freed.
Fixes: 1f2acc5a8a0a ("MIPS: ralink: Add support for systick timer found on newer ralink SoC") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20251030090710.1603-1-vulab@iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
The kernel BOs unnecessarily got added to the external objects list
of drm_gpuvm, when mapping to GPU, which would have resulted in few
extra CPU cycles being spent at the time of job submission as
drm_exec_until_all_locked() loop iterates over all external objects.
Kernel BOs are private to a VM and so they share the dma_resv object of
the dummy GEM object created for a VM. Use of DRM_EXEC_IGNORE_DUPLICATES
flag ensured the recursive locking of the dummy GEM object was ignored.
Also no extra space got allocated to add fences to the dma_resv object
of dummy GEM object. So no other impact apart from few extra CPU cycles.
This commit sets the pointer to dma_resv object of GEM object of
kernel BOs before they are mapped to GPU, to prevent them from
being added to external objects list.
v2: Add R-bs and fixes tags
Fixes: 8a1cc07578bf ("drm/panthor: Add GEM logical block") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patch.msgid.link/20251120172118.2741724-1-akash.goel@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The original addition of cache information for the Amlogic S922X SoC
used the wrong next-level cache node for CPU cores 100 and 101,
incorrectly referencing `l2_cache_l`. These cores actually belong to
the big cluster and should reference `l2_cache_b`. Update the device
tree accordingly.
Fixes: e7f85e6c155a ("arm64: dts: amlogic: Add cache information to the Amlogic S922X SoC") Signed-off-by: Guillaume La Roque <glaroque@baylibre.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20251123-fixkhadas-v1-1-045348f0a4c2@baylibre.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In some cases, this logic can result in doorbell writes being
skipped when they should not have been (at least on GEN3 HW),
so remove it. This also means that the mb() can be safely
downgraded to dma_wmb().
The HW disables bounds checking for MRs with a length of zero, so
the driver will only allow a zero length MR if the "all_memory"
flag is set, and this flag is only set if IB_PD_UNSAFE_GLOBAL_RKEY
is set for the PD.
This means that the "get_dma_mr" method will currently fail unless
the IB_PD_UNSAFE_GLOBAL_RKEY flag is set. This has not been an issue
because the "get_dma_mr" method is only ever invoked if the device
does not support the local DMA key or if IB_PD_UNSAFE_GLOBAL_RKEY
is set, and so far, all IRDMA HW supports the local DMA lkey.
However, some new HW does not support the local DMA lkey, so the
"get_dma_mr" method needs to work without IB_PD_UNSAFE_GLOBAL_RKEY
being set.
To support HW that does not allow the local DMA lkey, the logic has
been changed to pass an explicit flag to indicate when a dma_mr is
being created so that the zero length will be allowed.
Also, the "all_memory" flag has been forced to false for normal MR
allocation since these MRs are never supposed to provide global
unsafe rkey semantics anyway; only the MR created with "get_dma_mr"
should support this.
Removes write to IRDMA_PFINT_AEQCTL register prior to destroying AEQ,
as this register does not exist in GEN3+ hardware and this kind of IRQ
configuration is no longer required.
Fixes: b800e82feba7 ("RDMA/irdma: Add GEN3 support for AEQ and CEQ") Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251125025350.180-5-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
During a refactor of the irdma GEN2 code, the kfree of the irdma_pci_f struct
in icrdma_remove(), which was originally introduced upstream as part of
commit 80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
was accidentally removed.
Adds a lock around irdma_sc_ccq_arm body to prevent inter-thread data race.
Fixes data race in irdma_sc_ccq_arm() reported by KCSAN:
BUG: KCSAN: data-race in irdma_sc_ccq_arm [irdma] / irdma_sc_ccq_arm [irdma]
read to 0xffff9d51b4034220 of 8 bytes by task 255 on cpu 11:
irdma_sc_ccq_arm+0x36/0xd0 [irdma]
irdma_cqp_ce_handler+0x300/0x310 [irdma]
cqp_compl_worker+0x2a/0x40 [irdma]
process_one_work+0x402/0x7e0
worker_thread+0xb3/0x6d0
kthread+0x178/0x1a0
ret_from_fork+0x2c/0x50
write to 0xffff9d51b4034220 of 8 bytes by task 89 on cpu 3:
irdma_sc_ccq_arm+0x7e/0xd0 [irdma]
irdma_cqp_ce_handler+0x300/0x310 [irdma]
irdma_wait_event+0xd4/0x3e0 [irdma]
irdma_handle_cqp_op+0xa5/0x220 [irdma]
irdma_hw_flush_wqes+0xb1/0x300 [irdma]
irdma_flush_wqes+0x22e/0x3a0 [irdma]
irdma_cm_disconn_true+0x4c7/0x5d0 [irdma]
irdma_disconnect_worker+0x35/0x50 [irdma]
process_one_work+0x402/0x7e0
worker_thread+0xb3/0x6d0
kthread+0x178/0x1a0
ret_from_fork+0x2c/0x50
value changed: 0x0000000000024000 -> 0x0000000000034000
Some platforms (e.g. SC8280XP and X1E) support more than 128 stream
matching groups. This is more than what is defined as maximum by the ARM
SMMU architecture specification. Commit 122611347326 ("iommu/arm-smmu-qcom:
Limit the SMR groups to 128") disabled use of the additional groups because
they don't exhibit the same behavior as the architecture supported ones.
It seems like this is just another quirk of the hypervisor: When running
bare-metal without the hypervisor, the additional groups appear to behave
just like all others. The boot firmware uses some of the additional groups,
so ignoring them in this situation leads to stream match conflicts whenever
we allocate a new SMR group for the same SID.
The workaround exists primarily because the bypass quirk detection fails
when using a S2CR register from the additional matching groups, so let's
perform the test with the last reliable S2CR (127) and then limit the
number of SMR groups only if we detect that we are running below the
hypervisor (because of the bypass quirk).
Fixes: 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add a missing struct short description and a missing leading " *" to
lp855x.h to avoid kernel-doc warnings:
Warning: include/linux/platform_data/lp855x.h:126 missing initial short
description on line:
* struct lp855x_platform_data
Warning: include/linux/platform_data/lp855x.h:131 bad line:
Only valid when mode is PWM_BASED.
Fixes: 7be865ab8634 ("backlight: new backlight driver for LP855x devices") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org> Link: https://patch.msgid.link/20251111060916.1995920-1-rdunlap@infradead.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
LED Backlight is a consumer of one or multiple LED class devices, but
devlink is currently unable to create correct supplier-producer links when
the supplier is a class device. It creates instead a link where the
supplier is the parent of the expected device.
One consequence is that removal order is not correctly enforced.
Issues happen for example with the following sections in a device tree
overlay:
// An LED driver chip
pca9632@62 {
compatible = "nxp,pca9632";
reg = <0x62>;
In this example, the devlink should be created between the backlight-addon
(consumer) and the pca9632@62 (supplier). Instead it is created between the
backlight-addon (consumer) and the parent of the pca9632@62, which is
typically the I2C bus adapter.
On removal of the above overlay, the LED driver can be removed before the
backlight device, resulting in:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
...
Call trace:
led_put+0xe0/0x140
devm_led_release+0x6c/0x98
Another way to reproduce the bug without any device tree overlays is
unbinding the LED class device (pca9632@62) before unbinding the consumer
(backlight-addon):
The FILS status codes are set to 108/109, but the IEEE 802.11-2020
spec defines them as 112/113. Update the enum so it matches the
specification and keeps the kernel consistent with standard values.
Fixes: a3caf7440ded ("cfg80211: Add support for FILS shared key authentication offload") Signed-off-by: Ria Thomas <ria.thomas@morsemicro.com> Reviewed-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Link: https://patch.msgid.link/20251124125637.3936154-1-ria.thomas@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit 222f2c7c6d14 ("iomap: always run error completions in user
context"), read error completions are deferred to s_dio_done_wq. This
means the workqueue also needs to be allocated for async reads.
Fixes: 222f2c7c6d14 ("iomap: always run error completions in user context") Reported-by: syzbot+a2b9a4ed0d61b1efb3f5@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251124140013.902853-1-hch@lst.de Tested-by: syzbot+a2b9a4ed0d61b1efb3f5@syzkaller.appspotmail.com Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
At least zonefs expects error completions to be able to sleep. Because
error completions aren't performance critical, just defer them to workqueue
context unconditionally.
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113170633.1453259-3-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In order to work around the existence of a vmap symbol in libpcap, the
UML makefile unconditionally redefines vmap to kernel_vmap. However,
this not only affects the actual vmap symbol, but also anything else
named vmap, including a number of struct members in DRM.
This would not be too much of a problem, since all uses are also
updated, except we now have Rust DRM bindings, which expect the
corresponding Rust structs to have 'vmap' names. Since the redefinition
applies in bindgen, but not to Rust code, we end up with errors such as:
error[E0560]: struct `drm_gem_object_funcs` has no fields named `vmap`
--> rust/kernel/drm/gem/mod.rs:210:9
Since libpcap support was removed in commit 12b8e7e69aa7 ("um: Remove
obsolete pcap driver"), remove the, now unnecessary, define as well.
We also take this opportunity to update the comment.
Signed-off-by: David Gow <davidgow@google.com> Acked-by: Miguel Ojeda <ojeda@kernel.org> Link: https://patch.msgid.link/20251122083213.3996586-1-davidgow@google.com Fixes: 12b8e7e69aa7 ("um: Remove obsolete pcap driver")
[adjust commmit message a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The flush page DMA address is stored in a special register that is not
associated with the GPU's standard DMA range. For example, on Turing,
the GPU's MMU can handle 47-bit addresses, but the flush page address
register is limited to 40 bits.
At the point during device initialization when the flush page is
allocated, the DMA mask is still at its default of 32 bits. So even
though it's unlikely that the flush page could exist above a 40-bit
address, the dma_map_page() call could fail, e.g. if IOMMU is disabled
and the address is above 32 bits. The simplest way to achieve all
constraints is to allocate the page in the DMA32 zone. Since the flush
page is literally just a page, this is an acceptable limitation. The
alternative is to temporarily set the DMA mask to 40 (or 52 for Hopper
and later) bits, but that could have unforseen side effects.
In situations where the flush page is allocated above 32 bits and IOMMU
is disabled, you will get an error like this:
nouveau 0000:65:00.0: DMA addr 0x0000000107c56000+4096 overflow (mask ffffffff, bus limit 0).
Fixes: 5728d064190e ("drm/nouveau/fb: handle sysmem flush page from common code") Signed-off-by: Timur Tabi <ttabi@nvidia.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patch.msgid.link/20251113230323.1271726-1-ttabi@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If the call to btrfs_del_leaf() fails we return without decrementing the
extra ref we took on the leaf, therefore leaking it. Fix this by ensuring
we drop the ref count before returning the error.
Fixes: 751a27615dda ("btrfs: do not BUG_ON() on tree mod log failures at btrfs_del_ptr()") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Unlike queue_scrub_stripe() which uses the global sctx->extent_path and
sctx->csum_path which are always released at the end of scrub_stripe(),
scrub_raid56_parity_stripe() uses local extent_path and csum_path, as
that function is going to handle the full stripe, whose bytenr may be
smaller than the bytenr in the global sctx paths.
However the cleanup of local extent/csum paths is only happening after
we have successfully submitted an rbio.
There are several error routes that we didn't release those two paths:
- scrub_find_fill_first_stripe() errored out at csum tree search
In that case extent_path is still valid, and that function itself will
not release the extent_path passed in.
And the function returns directly without releasing both paths.
- The full stripe is empty
- Some blocks failed to be recovered
- btrfs_map_block() failed
- raid56_parity_alloc_scrub_rbio() failed
The function returns directly without releasing both paths.
Fix it by covering btrfs_release_path() calls inside the out: tag.
This is just a hot fix, in the long run we will go scoped based auto
freeing for both local paths.
Fixes: 1dc4888e725d ("btrfs: scrub: avoid unnecessary extent tree search preparing stripes") Fixes: 3c771c194402 ("btrfs: scrub: avoid unnecessary csum tree search preparing stripes") Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
From the memory-barriers.txt document regarding memory barrier ordering
guarantees:
(*) These guarantees do not apply to bitfields, because compilers often
generate code to modify these using non-atomic read-modify-write
sequences. Do not attempt to use bitfields to synchronize parallel
algorithms.
(*) Even in cases where bitfields are protected by locks, all fields
in a given bitfield must be protected by one lock. If two fields
in a given bitfield are protected by different locks, the compiler's
non-atomic read-modify-write sequences can cause an update to one
field to corrupt the value of an adjacent field.
btrfs_space_info has a bitfield sharing an underlying word consisting of
the fields full, chunk_alloc, and flush:
Therefore, to be safe from parallel read-modify-writes losing a write to
one of the bitfield members protected by a lock, all writes to all the
bitfields must use the lock. They almost universally do, except for
btrfs_clear_space_info_full() which iterates over the space_infos and
writes out found->full = 0 without a lock.
Imagine that we have one thread completing a transaction in which we
finished deleting a block_group and are thus calling
btrfs_clear_space_info_full() while simultaneously the data reclaim
ticket infrastructure is running do_async_reclaim_data_space():
and now data_sinfo->flush is 1 but the reclaim worker has exited. This
breaks the invariant that flush is 0 iff there is no work queued or
running. Once this invariant is violated, future allocations that go
into __reserve_bytes() will add tickets to space_info->tickets but will
see space_info->flush is set to 1 and not queue the work. After this,
they will block forever on the resulting ticket, as it is now impossible
to kick the worker again.
I also confirmed by looking at the assembly of the affected kernel that
it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
the assembly is:
andb $0xfb,0x60(%rbx)
and similarly for setting the full (1st) bit to 0:
andb $0xfe,-0x20(%rax)
So I think this is really a bug on practical systems. I have observed
a number of systems in this exact state, but am currently unable to
reproduce it.
Rather than leaving this footgun lying around for the future, take
advantage of the fact that there is room in the struct anyway, and that
it is already quite large and simply change the three bitfield members to
bools. This avoids writes to space_info->full having any effect on
writes to space_info->flush, regardless of locking.
Fixes: 957780eb2788 ("Btrfs: introduce ticketed enospc infrastructure") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the previous code it was possible to incur into a double kfree()
scenario when calling add_delayed_ref_head(). This could happen if the
record was reported to already exist in the
btrfs_qgroup_trace_extent_nolock() call, but then there was an error
later on add_delayed_ref_head(). In this case, since
add_delayed_ref_head() returned an error, the caller went to free the
record. Since add_delayed_ref_head() couldn't set this kfree'd pointer
to NULL, then kfree() would have acted on a non-NULL 'record' object
which was pointing to memory already freed by the callee.
The problem comes from the fact that the responsibility to kfree the
object is on both the caller and the callee at the same time. Hence, the
fix for this is to shift the ownership of the 'qrecord' object out of
the add_delayed_ref_head(). That is, we will never attempt to kfree()
the given object inside of this function, and will expect the caller to
act on the 'qrecord' object on its own. The only exception where the
'qrecord' object cannot be kfree'd is if it was inserted into the
tracing logic, for which we already have the 'qrecord_inserted_ret'
boolean to account for this. Hence, the caller has to kfree the object
only if add_delayed_ref_head() reports not to have inserted it on the
tracing logic.
As a side-effect of the above, we must guarantee that
'qrecord_inserted_ret' is properly initialized at the start of the
function, not at the end, and then set when an actual insert
happens. This way we avoid 'qrecord_inserted_ret' having an invalid
value on an early exit.
The documentation from the add_delayed_ref_head() has also been updated
to reflect on the exact ownership of the 'qrecord' object.
Fixes: 6ef8fbce0104 ("btrfs: fix missing error handling when adding delayed ref with qgroups enabled") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently selftests require xxd with the "-n <name>" option
which allows the user to specify a name not derived from
the input object path. Instead of relying on this newer
feature, older xxd can be used if we link our desired name
("test_progs_verification_cert") to the input object.
Many distros ship xxd in vim-common package and do not have
the latest xxd with -n support.
Fixes: b720903e2b14d ("selftests/bpf: Enable signature verification for some lskel tests") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20251120084754.640405-3-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
ERR_get_error_all()[1] is a openssl v3 API, so to make code
compatible with openssl v1 utilize ERR_get_err_line_data
instead. Since openssl is already a build requirement for
the kernel (minimum requirement openssl 1.0.0), this will
allow bpftool to compile where opensslv3 is not available.
Signing-related BPF selftests pass with openssl v1.
fbtft_probe_common() allocates a memory chunk for "info" with
fbtft_framebuffer_alloc(). When "display->buswidth == 0" is true, the
function returns without releasing the "info", which will lead to a
memory leak.
Fix it by calling fbtft_framebuffer_release() when "display->buswidth
== 0" is true.
In mt7615_mcu_wtbl_sta_add(), an skb sskb is allocated. If the
subsequent call to mt76_connac_mcu_alloc_wtbl_req() fails, the function
returns an error without freeing sskb, leading to a memory leak.
Fix this by calling dev_kfree_skb() on sskb in the error handling path
to ensure it is properly released.
Fixes: 99c457d902cf9 ("mt76: mt7615: move mt7615_mcu_set_bmc to mt7615_mcu_ops") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20251113062415.103611-1-zilin@seu.edu.cn Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the issue skipping ieee80211_iter_keys() for scanning links in
mt7996_vif_link_remove routine since we have not uploaded any hw keys
for these links.
Move mt76_abort_scan routine out of mt76_reset_device() in order to
avoid a possible deadlock since mt76_reset_device routine is running
with mt76 mutex help and mt76_abort_scan_complete() can grab mt76 mutex
in some cases.
Fixes: b36d55610215a ("wifi: mt76: abort scan/roc on hw restart") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Tested-by: Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20251114-mt76-fix-missing-mtx-v1-3-259ebf11f654@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch fixes the following key issues:
- Pass correct link BSS to mt7996_mcu_add_key(), and use HW beacon
protection mode for mt7990 chipset
- Do not do group key deletion for GTK and IGTK due to FW design, the
delete key command will delete all group keys of a link BSS
- For deleting BIGTK, FW adds a new flow, but the "sec->add" field
should be filled with "SET_KEY". Note that if BIGTK is not deleted, it
will cause beacon decryption issue when switching from an AP interface
to a station interface
Fixes: 0c45d52276fd ("wifi: mt76: mt7996: fix setting beacon protection keys") Co-developed-by: Allen Ye <allen.ye@mediatek.com> Signed-off-by: Allen Ye <allen.ye@mediatek.com> Co-developed-by: Peter Chiu <chui-hao.chiu@mediatek.com> Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com> Signed-off-by: Shayne Chen <shayne.chen@mediatek.com> Link: https://patch.msgid.link/20251106064203.1000505-10-shayne.chen@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>