Fix up xfs_inumbers to now pass in the XFS_IBULK* flags into the flags
argument to xfs_inobt_walk, which expects the XFS_IWALK* flags.
Currently passing the wrong flags works for non-debug builds because
the only XFS_IWALK* flag has the same encoding as the corresponding
XFS_IBULK* flag, but in debug builds it can trigger an assert that no
incorrect flag is passed. Instead just extra the relevant flag.
Fixes: 5b35d922c52798 ("xfs: Decouple XFS_IBULK flags from XFS_IWALK flags") Cc: <stable@vger.kernel.org> # v5.19 Reported-by: cen zhang <zzzccc427@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The top of the function comment is outdated, and the parts still correct
duplicate information in comment inside the function. Remove the top of
the function comment and instead improve a comment inside the function.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Stable-dep-of: d2845519b072 ("xfs: fully decouple XFS_IBULK* flags from XFS_IWALK* flags") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
btrfs_subpage_set_writeback() calls folio_start_writeback() the first time
a folio is written back, and it also clears the PAGECACHE_TAG_TOWRITE tag
even if there are still dirty blocks in the folio. This can break ordering
guarantees, such as those required by btrfs_wait_ordered_extents().
That ordering breakage leads to a real failure. For example, running
generic/464 on a zoned setup will hit the following ASSERT. This happens
because the broken ordering fails to flush existing dirty pages before the
file size is truncated.
Consider process A calling writepages() with WB_SYNC_NONE. In zoned mode or
for compressed writes, it locks several folios for delalloc and starts
writing them out. Let's call the last locked folio folio X. Suppose the
write range only partially covers folio X, leaving some pages dirty.
Process A calls btrfs_subpage_set_writeback() when building a bio. This
function call clears the TOWRITE tag of folio X, whose size = 8K and
the block size = 4K. It is following state.
0 4K 8K
|/////|/////| (flag: DIRTY, tag: DIRTY)
<-----> Process A will write this range.
Now suppose process B concurrently calls writepages() with WB_SYNC_ALL. It
calls tag_pages_for_writeback() to tag dirty folios with
PAGECACHE_TAG_TOWRITE. Since folio X is still dirty, it gets tagged. Then,
B collects tagged folios using filemap_get_folios_tag() and must wait for
folio X to be written before returning from writepages().
However, between tagging and collecting, process A may call
btrfs_subpage_set_writeback() and clear folio X's TOWRITE tag.
0 4K 8K
| |/////| (flag: DIRTY|WRITEBACK, tag: DIRTY)
As a result, process B won't see folio X in its batch, and returns without
waiting for it. This breaks the WB_SYNC_ALL ordering requirement.
Fix this by using btrfs_subpage_set_writeback_keepwrite(), which retains
the TOWRITE tag. We now manually clear the tag only after the folio becomes
clean, via the xas operation.
With the incoming large data folios support, the structure name
btrfs_subpage is no longer correct, as for we can have multiple blocks
inside a large folio, and the block size is still page size.
So to follow the schema of iomap, rename btrfs_subpage to
btrfs_folio_state, along with involved enums.
There are still exported functions with "btrfs_subpage_" prefix, and I
believe for metadata the name "subpage" will stay forever as we will
never allocate a folio larger than nodesize anyway.
The full cleanup of the word "subpage" will happen in much smaller steps
in the future.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: b1511360c8ac ("btrfs: subpage: keep TOWRITE tag until folio is cleaned") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Unlike the iomap_folio_state structure, the btrfs_subpage structure has a
lot of extra sub-bitmaps, namely:
- writeback sub-bitmap
- locked sub-bitmap
iomap_folio_state uses an atomic for writeback tracking, while it has
no per-block locked tracking.
This is because iomap always locks a single folio, and submits dirty
blocks with that folio locked.
But btrfs has async delalloc ranges (for compression), which are queued
with their range locked, until the compression is done, then marks the
involved range writeback and unlocked.
This means a range can be unlocked and marked writeback at seemingly
random timing, thus it needs the extra tracking.
This needs a huge rework on the lifespan of async delalloc range
before we can remove/simplify these two sub-bitmaps.
- ordered sub-bitmap
- checked sub-bitmap
These are for COW-fixup, but as I mentioned in the past, the COW-fixup
is not really needed anymore and these two flags are already marked
deprecated, and will be removed in the near future after comprehensive
tests.
Add related comments to indicate we're actively trying to align the
sub-bitmaps to the iomap ones.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: b1511360c8ac ("btrfs: subpage: keep TOWRITE tag until folio is cleaned") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a potential deadlock that can happen in
try_release_subpage_extent_buffer() because the irq-safe xarray spin
lock fs_info->buffer_tree is being acquired before the irq-unsafe
eb->refs_lock.
This leads to the potential race:
// T1 (random eb->refs user) // T2 (release folio)
spin_lock(&eb->refs_lock);
// interrupt
end_bbio_meta_write()
btrfs_meta_folio_clear_writeback()
btree_release_folio()
folio_test_writeback() //false
try_release_extent_buffer()
try_release_subpage_extent_buffer()
xa_lock_irq(&fs_info->buffer_tree)
spin_lock(&eb->refs_lock); // blocked; held by T1
buffer_tree_clear_mark()
xas_lock_irqsave() // blocked; held by T2
I believe that the spin lock can safely be replaced by an rcu_read_lock.
The xa_for_each loop does not need the spin lock as it's already
internally protected by the rcu_read_lock. The extent buffer is also
protected by the rcu_read_lock so it won't be freed before we take the
eb->refs_lock and check the ref count.
The rcu_read_lock is taken and released every iteration, just like the
spin lock, which means we're not protected against concurrent
insertions into the xarray. This is fine because we rely on
folio->private to detect if there are any ebs remaining in the folio.
There is already some precedent for this with find_extent_buffer_nolock,
which loads an extent buffer from the xarray with only rcu_read_lock.
lockdep warning:
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
6.16.0-0_fbk701_debug_rc0_123_g4c06e63b9203 #1 Tainted: G E N
-----------------------------------------------------
kswapd0/66 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: ffff000011ffd600 (&eb->refs_lock){+.+.}-{3:3}, at: try_release_extent_buffer+0x18c/0x560
and this task is already holding: ffff0000c1d91b88 (&buffer_xa_class){-.-.}-{3:3}, at: try_release_extent_buffer+0x13c/0x560
which would create a new lock dependency:
(&buffer_xa_class){-.-.}-{3:3} -> (&eb->refs_lock){+.+.}-{3:3}
but this new dependency connects a HARDIRQ-irq-safe lock:
(&buffer_xa_class){-.-.}-{3:3}
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&eb->refs_lock);
local_irq_disable();
lock(&buffer_xa_class);
lock(&eb->refs_lock);
<Interrupt>
lock(&buffer_xa_class);
*** DEADLOCK ***
2 locks held by kswapd0/66:
#0: ffff800085506e40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xe8/0xe50
#1: ffff0000c1d91b88 (&buffer_xa_class){-.-.}-{3:3}, at: try_release_extent_buffer+0x13c/0x560
Link: https://www.kernel.org/doc/Documentation/locking/lockdep-design.rst#:~:text=Multi%2Dlock%20dependency%20rules%3A Fixes: 19d7f65f032f ("btrfs: convert the buffer_radix to an xarray") CC: stable@vger.kernel.org # 6.16+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of using a bare atomic, use the refcount_t type, which despite
being a structure that contains only an atomic, has an API that checks
for underflows and other hazards. This doesn't change the size of the
extent_buffer structure.
This removes the need to do things like this:
WARN_ON(atomic_read(&eb->refs) == 0);
if (atomic_dec_and_test(&eb->refs)) {
(...)
}
And do just:
if (refcount_dec_and_test(&eb->refs)) {
(...)
}
Since refcount_dec_and_test() already triggers a warning when we decrement
a ref count that has a value of 0 (or below zero).
Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: ad580dfa388f ("btrfs: fix subpage deadlock in try_release_subpage_extent_buffer()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's this special atomic compare and exchange logic which serves to
avoid locking the extent buffers refs_lock spinlock and therefore reduce
lock contention, so add a comment to make it more obvious.
Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: ad580dfa388f ("btrfs: fix subpage deadlock in try_release_subpage_extent_buffer()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's hard to read the logic to break out of the while loop since it's a
very long expression consisting of a logical or of two composite
expressions, each one composed by a logical and. Further each one is also
testing for the EXTENT_BUFFER_UNMAPPED bit, making it more verbose than
necessary.
If we find an unexpected generation for the extent buffer we are cloning
at btrfs_copy_root(), we just WARN_ON() and don't error out and abort the
transaction, meaning we allow to persist metadata with an unexpected
generation. Instead of warning only, abort the transaction and return
-EUCLEAN.
CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Daniel Vacek <neelx@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only one of the callers of __add_block_group_free_space() aborts the
transaction if the call fails, while the others don't do it and it's
either never done up the call chain or much higher in the call chain.
So make sure we abort the transaction at __add_block_group_free_space()
if it fails, which brings a couple benefits:
1) If some call chain never aborts the transaction, we avoid having some
metadata inconsistency because BLOCK_GROUP_FLAG_NEEDS_FREE_SPACE is
cleared when we enter __add_block_group_free_space() and therefore
__add_block_group_free_space() is never called again to add the block
group items to the free space tree, since the function is only called
when that flag is set in a block group;
2) If the call chain already aborts the transaction, then we get a better
trace that points to the exact step from __add_block_group_free_space()
which failed, which is better for analysis.
So abort the transaction at __add_block_group_free_space() if any of its
steps fails.
CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Transaction aborts should be done next to the place the error happens,
which was not done in add_block_group_free_space().
Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: 1f06c942aa70 ("btrfs: always abort transaction on failure to add block group to free space tree") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
damon_migrate_pages() tries migration even if the target node is invalid.
If users mistakenly make such invalid requests via
DAMOS_MIGRATE_{HOT,COLD} action, the below kernel BUG can happen.
Add a target node validity check in damon_migrate_pages(). The validity
check is stolen from that of do_pages_move(), which is being used for the
move_pages() system call.
Link: https://lkml.kernel.org/r/20250720185822.1451-1-sj@kernel.org Fixes: b51820ebea65 ("mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion") [6.11.x] Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Honggyu Kim <honggyu.kim@sk.com> Cc: Hyeongtak Ji <hyeongtak.ji@sk.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mptcp_inq.c:79:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
79 | int err = getaddrinfo(node, service, hints, res);
| ^
mptcp_sockopt.c:166:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
166 | int err = getaddrinfo(node, service, hints, res);
| ^
Simply declare 'err' before the label to avoid this warning.
Fixes: dd367e81b79a ("selftests: mptcp: sockopt: use IPPROTO_MPTCP for getaddrinfo") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-8-521fe9957892@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mptcp_connect.c:187:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
187 | int err = getaddrinfo(node, service, hints, res);
| ^
Simply declare 'err' before the label to avoid this warning.
Fixes: a862771d1aa4 ("selftests: mptcp: use IPPROTO_MPTCP for getaddrinfo") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-7-521fe9957892@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This modification is linked to the parent commit where the received
ADD_ADDR limit was accidentally reset when the endpoints were flushed.
To validate that, the test is now flushing endpoints after having set
new limits, and before checking them.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
When add_addr_timeout was set to 0, this caused the ADD_ADDR to be
retransmitted immediately, which looks like a buggy behaviour. Instead,
interpret 0 as "no retransmissions needed".
The documentation is updated to explicitly state that setting the timeout
to 0 disables retransmission.
A flush of the MPTCP endpoints should not affect the MPTCP limits. In
other words, 'ip mptcp endpoint flush' should not change 'ip mptcp
limits'.
But it was the case: the MPTCP_PM_ATTR_RCV_ADD_ADDRS (add_addr_accepted)
limit was reset by accident. Removing the reset of this counter during a
flush fixes this issue.
When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(),
we used to return true, letting the segment proceed through the TCP
receive path without a DSS mapping. Such segments can leave inconsistent
mapping state and trigger a mid-stream fallback to TCP, which in testing
collapsed (by artificially forcing failures in skb_ext_add) throughput
to zero.
Return false instead so the TCP input path drops the skb (see
tcp_data_queue() and step-7 processing). This is the safer choice
under memory pressure: it preserves MPTCP correctness and provides
backpressure to the sender.
Control packets remain unaffected: ACK updates and DATA_FIN handling
happen before attempting the extension allocation, and tcp_reset()
continues to ignore the return value.
With this change, MPTCP continues to work at high throughput if we
artificially inject failures into skb_ext_add.
Fixes: 6787b7e350d3 ("mptcp: avoid processing packet if a subflow reset") Cc: stable@vger.kernel.org Signed-off-by: Christoph Paasch <cpaasch@openai.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-1-521fe9957892@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The .remove() callback is also used during error handling in
faux_probe(). As einj_remove() was marked with __exit it's not linked
into the kernel if the driver is built-in, potentially resulting in
resource leaks.
Also remove the comment justifying the __exit annotation which doesn't
apply any more since the driver was converted to the faux device
interface.
Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Cc: 6.16+ <stable@vger.kernel.org> # 6.16+ Link: https://patch.msgid.link/20250814051157.35867-2-u.kleine-koenig@baylibre.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Occasionally, the exit latency of the idle state selected by the menu
governor may exceed the PM QoS CPU wakeup latency limit. Namely, if the
scheduler tick has been stopped already and predicted_ns is greater than
the tick period length, the governor may return an idle state whose exit
latency exceeds latency_req because that decision is made before
checking the current idle state's exit latency.
For instance, say that there are 3 idle states, 0, 1, and 2. For idle
states 0 and 1, the exit latency is equal to the target residency and
the values are 0 and 5 us, respectively. State 2 is deeper and has the
exit latency and target residency of 200 us and 2 ms (which is greater
than the tick period length), respectively.
Say that predicted_ns is equal to TICK_NSEC and the PM QoS latency
limit is 20 us. After the first two iterations of the main loop in
menu_select(), idx becomes 1 and in the third iteration of it the target
residency of the current state (state 2) is greater than predicted_ns.
State 2 is not a polling one and predicted_ns is not less than TICK_NSEC,
so the check on whether or not the tick has been stopped is done. Say
that the tick has been stopped already and there are no imminent timers
(that is, delta_tick is greater than the target residency of state 2).
In that case, idx becomes 2 and it is returned immediately, but the exit
latency of state 2 exceeds the latency limit.
Address this issue by modifying the code to compare the exit latency of
the current idle state (idle state i) with the latency limit before
comparing its target residency with predicted_ns, which allows one
more exit_latency_ns check that becomes redundant to be dropped.
However, after the above change, latency_req cannot take the predicted_ns
value any more, which takes place after commit 38f83090f515 ("cpuidle:
menu: Remove iowait influence"), because it may cause a polling state
to be returned prematurely.
In the context of the previous example say that predicted_ns is 3000 and
the PM QoS latency limit is still 20 us. Additionally, say that idle
state 0 is a polling one. Moving the exit_latency_ns check before the
target_residency_ns one causes the loop to terminate in the second
iteration, before the target_residency_ns check, so idle state 0 will be
returned even though previously state 1 would be returned if there were
no imminent timers.
For this reason, remove the assignment of the predicted_ns value to
latency_req from the code.
Fixes: 5ef499cd571c ("cpuidle: menu: Handle stopped tick more aggressively") Cc: 4.17+ <stable@vger.kernel.org> # 4.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/5043159.31r3eYUQgx@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
css_rstat_exit() may be called asynchronously in scenarios where preceding
calls to css_rstat_init() have not completed. One such example is this
sequence below:
When enabling a sched_ext scheduler, we may trigger invalid task state
transitions, resulting in warnings like the following (which can be
easily reproduced by running the hotplug selftest in a loop):
This happens because we skip initialization for tasks that are already
dead (with their usage counter set to zero), but we don't exclude them
during the scheduling class transition phase.
Fix this by also skipping dead tasks during class swiching, preventing
invalid task state transitions.
Fixes: a8532fac7b5d2 ("sched_ext: TASK_DEAD tasks must be switched into SCX on ops_enable") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The panic happens in br_dev_queue_push_xmit() once it receives a
corrupted skb with ETH header already pushed in linear data. When it
attempts the skb_push() call, there's not enough headroom and
skb_push() panics.
The corrupted skb is put on the queue by HSR layer, which makes a
sequence of unintended transformations when it receives a specific
corrupted HSR frame (with incomplete TAG).
Fix it by dropping and consuming frames that are not long enough to
contain both ethernet and hsr headers.
Alternative fix would be to check for enough headroom before skb_push()
in br_dev_queue_push_xmit().
In the reproducer, this is injected via AF_PACKET, but I don't easily
see why it couldn't be sent over the wire from adjacent network.
Further Details:
In the reproducer, the following network interface chain is set up:
To trigger the events leading up to crash, reproducer sends a corrupted
HSR frame with incomplete TAG, via AF_PACKET socket on 'veth0_to_hsr'.
The first HSR-layer function to process this frame is
hsr_handle_frame(). It and then checks if the
protocol is ETH_P_PRP or ETH_P_HSR. If it is, it calls
skb_set_network_header(skb, ETH_HLEN + HSR_HLEN), without checking that
the skb is long enough. For the crashing frame it is not, and hence the
skb->network_header and skb->mac_len fields are set incorrectly,
pointing after the end of the linear buffer.
I will call this a BUG#1 and it is what is addressed by this patch. In
the crashing scenario before the fix, the skb continues to go down the
hsr path as follows.
hsr_handle_frame() then calls this sequence
hsr_forward_skb()
fill_frame_info()
hsr->proto_ops->fill_frame_info()
hsr_fill_frame_info()
hsr_fill_frame_info() contains a check that intends to check whether the
skb actually contains the HSR header. But the check relies on the
skb->mac_len field which was erroneously setup due to BUG#1, so the
check passes and the execution continues back in the hsr_forward_skb():
In create_stripped_skb_hsr(), a copy of the skb is created and is
further corrupted by operation that attempts to strip the HSR tag in a
call to __pskb_copy().
The skb enters create_stripped_skb_hsr() with ethernet header pushed in
linear buffer. The skb_pull(skb_in, HSR_HLEN) thus pulls 6 bytes of
ethernet header into the headroom, creating skb_in with a headroom of
size 8. The subsequent __pskb_copy() then creates an skb with headroom
of just 2 and skb->len of just 12, this is how it looks after the copy:
gdb) p skb->len
$10 = 12
(gdb) p skb->data
$11 = (unsigned char *) 0xffff888041e45382 "\252\252\252\252\252!\210\373",
(gdb) p skb->head
$12 = (unsigned char *) 0xffff888041e45380 ""
It seems create_stripped_skb_hsr() assumes that ETH header is pulled
in the headroom when it's entered, because it just pulls HSR header on
top. But that is not the case in our code-path and we end up with the
corrupted skb instead. I will call this BUG#2
*I got confused here because it seems that under no conditions can
create_stripped_skb_hsr() work well, the assumption it makes is not true
during the processing of hsr frames - since the skb_push() in
hsr_handle_frame to skb_pull in hsr_deliver_master(). I wonder whether I
missed something here.*
Next, the execution arrives in hsr_deliver_master(). It calls
skb_pull(ETH_HLEN), which just returns NULL - the SKB does not have
enough space for the pull (as it only has 12 bytes in total at this
point).
*The skb_pull() here further suggests that ethernet header is meant
to be pushed through the whole hsr processing and
create_stripped_skb_hsr() should pull it before doing the HSR header
pull.*
hsr_deliver_master() then puts the corrupted skb on the queue, it is
then picked up from there by bridge frame handling layer and finally
lands in br_dev_queue_push_xmit where it panics.
Cc: stable@kernel.org Fixes: 48b491a5cc74 ("net: hsr: fix mac_len checks") Reported-by: syzbot+a81f2759d022496b40ab@syzkaller.appspotmail.com Signed-off-by: Jakub Acs <acsjakub@amazon.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250819082842.94378-1-acsjakub@amazon.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With pch_pic device, its register is based on MMIO address space,
different access size 1/2/4/8 is supported. And base address should
be naturally aligned with its access size, here add alignment check
in its register access emulation function.
Function kvm_io_bus_read() is called in function send_ipi_data(), buffer
size of parameter *val should be at least 8 bytes. Since some emulation
functions like loongarch_ipi_readl() and kvm_eiointc_read() will write
the buffer *val with 8 bytes signed extension regardless parameter len.
Otherwise there will be buffer overflow issue when CONFIG_STACKPROTECTOR
is enabled. The bug report is shown as follows:
dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr
with the dce_clk_mgr, causing incorrect behaviour on DCE6.
Fix it by removing the extra dce_clk_mgr_construct.
Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 66abb996999de0d440a02583a6e70c2c24deab45.
This broke custom brightness curves but it wasn't obvious because
of other related changes. Custom brightness curves are always
from a 0-255 input signal. The correct fix was to fix the default
value which was done by [1].
DIV_ROUND_CLOSEST(x, 100) returns either 0 or 1 if 0<x<=100, so the
division needs to be performed after the multiplication and not the
other way around, to properly scale the value.
Fixes: 8b5f3a229a70 ("drm/amd/display: Fix default DC and AC levels") Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi> Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/aH2Q_HJvxKbW74vU@hacktheplanet.fi Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only
be used for DP. Make sure to initialize the correct amount of PLLs
in DC for these DCE versions and use PLL0 only for DP.
Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at
initialization as opposed to DCE 6.1 and 7.x which use a different
clock source for DFS.
The following functions were used as reference from the old
radeon driver implementation of DCE 6.x:
- radeon_atom_pick_pll
- atombios_crtc_set_disp_eng_pll
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 35222b5934ec8d762473592ece98659baf6bc48e) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Why]
A null pointer dereference vulnerability exists in the AMD display driver's
(DC module) cleanup function dc_destruct().
When display control context (dc->ctx) construction fails
(due to memory allocation failure), this pointer remains NULL.
During subsequent error handling when dc_destruct() is called,
there's no NULL check before dereferencing the perf_trace member
(dc->ctx->perf_trace), causing a kernel null pointer dereference crash.
[How]
Check if dc->ctx is non-NULL before dereferencing.
Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
(Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dd8e2ba268c636c240a918e0a31e6feaee19404) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for
the primary plane. If a commit affects a CRTC but not its primary plane,
it would previously not trigger a refresh cycle or affect LFC, violating
current UAPI semantics.
Fixes e.g. atomic commits affecting only the cursor plane being limited
to the minimum refresh rate.
Don't do this for the legacy cursor ioctls though, it would break the
UAPI semantics for those.
Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and
restores all the MQDs within the partition.
Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a578f2a58c3ab38f0643b1b6e7534af860233cb1) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since KFD proc content was moved to kernel debugfs, we can't destroy KFD
debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior
to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens
when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but
kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line
debugfs_remove_recursive(entry->proc_dentry);
tries to remove /sys/kernel/debug/kfd/proc/<pid> while
/sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel
NULL pointer.
Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The field job->vm is used in function amdgpu_job_run to get the page
table re-generation counter and decide whether the job should be skipped.
Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use.
For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed,
then the gfx job should be skipped.
Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ed76936c6b10b547c6df4ca75412331e9ef6d339) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After a recent change in clang to expose uninitialized warnings from
const variables and pointers [1], there is a warning in
imu_v12_0_program_rlc_ram() because data is passed uninitialized to
program_imu_rlc_ram():
drivers/gpu/drm/amd/amdgpu/imu_v12_0.c:374:30: error: variable 'data' is uninitialized when used here [-Werror,-Wuninitialized]
374 | program_imu_rlc_ram(adev, data, (const u32)size);
| ^~~~
As this warning happens early in clang's frontend, it does not realize
that due to the assignment of r to -EINVAL, program_imu_rlc_ram() is
never actually called, and even if it were, data would not be
dereferenced because size is 0.
Just initialize data to NULL to silence the warning, as the commit that
added program_imu_rlc_ram() mentioned it would eventually be used over
the old method, at which point data can be properly initialized and
used.
Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2107 Fixes: 56159fffaab5 ("drm/amdgpu: use new method to program rlc ram") Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
HUBBUB structure is not initialized on DCE hardware, so check if it is NULL
to avoid null dereference while accessing amdgpu_dm_capabilities file in
debugfs.
Signed-off-by: Peter Shkenev <mustela@erminea.space> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Gang Ba <Gang.Ba@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We only need the fw based discovery table for sysfs. No
need to parse it. Additionally parsing some of the board
specific tables may result in incorrect data on some boards.
just load the binary and don't parse it on those boards.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: 80a0e8282933 ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62eedd150fa11aefc2d377fc746633fdb1baeb55) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the xe buffer-object shrinker allows GPU waits and write-back,
(typically from kswapd), perform multiple passes, skipping
subsequent passes if the shrinker number of scanned objects target
is reached.
1) Without GPU waits and write-back
2) Without write-back
3) With both GPU-waits and write-back
This is to avoid stalls and costly write- and readbacks unless they
are really necessary.
v2:
- Don't test for scan completion twice. (Stuart Summers)
- Update tags.
Reported-by: melvyn <melvyn2@dnsense.pub> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5557 Cc: Summers Stuart <stuart.summers@intel.com> Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250805074842.11359-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 80944d334182ce5eb27d00e2bf20a88bfc32dea1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Validate all internal buffers queued to firmware are released back to
driver on close. This helps ensure buffer lifecycle correctness and aids
in debugging any resporce leaks.
Cc: stable@vger.kernel.org Fixes: 73702f45db81 ("media: iris: allocate, initialize and queue internal buffers") Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update the width, height and buffer size of CAPTURE based on the
resolution set to OUTPUT via VIDIOC_S_FMT. This is required to set the
updated capture resolution to firmware when S_FMT is called only for
OUTPUT.
Cc: stable@vger.kernel.org Fixes: b530b95de22c ("media: iris: implement s_fmt, g_fmt and try_fmt ioctls") Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, two types of flush commands are queued to the firmware,
the first flush queued as part of sequence change, does not wait for a
response, while the second flush queued as part of stop, expects a
completion response before proceeding further.
Due to timing issue, the flush response corresponding to the first
command could arrive after the second flush is issued. This casuses the
driver to incorrectly assume that the second flush has completed,
leading to the premature signaling of flush_completion.
To address this, introduce a counter to track the number of pending
flush responses and signal flush completion only when all expected
responses are received.
Cc: stable@vger.kernel.org Fixes: 11712ce70f8e ("media: iris: implement vb2 streaming ops") Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a condition to skip the flush operation during the first sequence
change event. At this point, the capture queue is not streaming, making
the flush unnecessary.
Cc: stable@vger.kernel.org Fixes: 84e17adae3e3 ("media: iris: add support for dynamic resolution change") Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Firmware sends capture buffers with 0 filled length which are not to be
displayed and should be dropped by client.
To achieve the same, add V4L2_BUF_FLAG_ERROR to such buffers by making
sure:
- These 0 length buffers are not returned as result of flush.
- Its not a buffer with LAST flag enabled which will also have 0 filled
length.
Cc: stable@vger.kernel.org Fixes: d09100763bed ("media: iris: add support for drain sequence") Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current check only considers the core error state before allowing
writes to the HFI queues. However, the core can also transition to the
deinit state due to a system error triggered by the response thread.
In such cases, writing to the HFI queues should not be allowed.
Fix this by adding a check for the core deinit state, ensuring that
writes are rejected when core is not in a valid state.
Cc: stable@vger.kernel.org Fixes: fb583a214337 ("media: iris: introduce host firmware interface with necessary hooks") Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A warning reported by smatch indicated a possible null pointer
dereference where one of the arguments to API
"iris_hfi_gen2_handle_system_error" could sometimes be null.
To fix this, add a check to validate that the argument passed is not
null before accessing its members.
Cc: stable@vger.kernel.org Fixes: fb583a214337 ("media: iris: introduce host firmware interface with necessary hooks") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-media/634cc9b8-f099-4b54-8556-d879fb2b5169@stanley.mountain/ Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function pointers responsible for setting firmware properties were
never initialized in the instance capability structure, causing it to
remain NULL. As a result, the firmware properties were not being set
correctly.
Fix this by properly assigning the function pointers from the core
capability to the instance capability, ensuring that the properties are
correctly applied to the firmware.
Cc: stable@vger.kernel.org Fixes: 3a19d7b9e08b ("media: iris: implement set properties to firmware during streamon") Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the resolution changes, the driver internally updates the width and
height, but the client continue to queue buffers with the older
resolution until the last flag is received. This results in a mismatch
when the buffers are prepared, causing failure due to outdated size.
Introduce a check to prevent size validation during buffer preparation
if a resolution reconfiguration is in progress, to handle this.
Cc: stable@vger.kernel.org Fixes: 17f2a485ca67 ("media: iris: implement vb2 ops for buf_queue and firmware response") Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, port check enforces that session property response must
arrive only on the BITSTREAM port. However, firmware can send some
responses on other port as well.
Remove the strict port validation to correctly handle session property
responses from the firmware.
Cc: stable@vger.kernel.org Fixes: 3a19d7b9e08b ("media: iris: implement set properties to firmware during streamon") Acked-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During reconfig, the firmware sends the resolution aligned to 8 bytes.
If the driver sends the same resolution back to the firmware the resolution
will be aligned to 16 bytes not 8.
The alignment mismatch would then subsequently cause the firmware to
send another redundant sequence change event.
Fix this by not setting the resolution property during reconfig.
Cc: stable@vger.kernel.org Fixes: 3a19d7b9e08b ("media: iris: implement set properties to firmware during streamon") Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # on sa8775p-ride Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver uses "whole" fps in all its calculations (e.g. in
load_per_instance()). Those calculation expect an fps bigger than 1, and
not big enough to overflow.
Clamp the param if the user provides a value that will result in an invalid
fps.
Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Closes: https://lore.kernel.org/linux-media/f11653a7-bc49-48cd-9cdb-1659147453e4@xs4all.nl/T/#m91cd962ac942834654f94c92206e2f85ff7d97f0 Fixes: aaaa93eda64b ("[media] media: venus: venc: add video encoder files") Cc: stable@vger.kernel.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
[bod: Change "parm" to "param"] Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver uses "whole" fps in all its calculations (e.g. in
load_per_instance()). Those calculation expect an fps bigger than 1, and
not big enough to overflow.
Clamp the value if the user provides a param that will result in an invalid
fps.
Make sure the interrupt handler is initialized before the interrupt is
registered.
If the IRQ is registered before hfi_create(), it's possible that an
interrupt fires before the handler setup is complete, leading to a NULL
dereference.
This error condition has been observed during system boot on Rb3Gen2.
Add a check to ensure that the packet size does not exceed the number of
available words after reading the packet header from shared memory. This
ensures that the size provided by the firmware is safe to process and
prevent potential out-of-bounds memory access.
The moving of init sequence hook from gen2() to subdev_init() doesn't
account for gen1 devices such as SDM660 and SDM670. The switch should find
the right offset for gen2 PHYs only, not reject gen1. Remove the default
error case to restore gen1 CSIPHY support.
Cc: stable@vger.kernel.org Fixes: fbce0ca24c3a ("media: qcom: camss: csiphy-3ph: Move CSIPHY variables to data field inside csiphy struct") Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both the ACE and CSI driver are missing a mei_cldev_disable() call in
their remove() function.
This causes the mei_cl client to stay part of the mei_device->file_list
list even though its memory is freed by mei_cl_bus_dev_release() calling
kfree(cldev->cl).
This leads to a use-after-free when mei_vsc_remove() runs mei_stop()
which first removes all mei bus devices calling mei_ace_remove() and
mei_csi_remove() followed by mei_cl_bus_dev_release() and then calls
mei_cl_all_disconnect() which walks over mei_device->file_list dereferecing
the just freed cldev->cl.
And mei_vsc_remove() it self is run at shutdown because of the
platform_device_unregister(tp->pdev) in vsc_tp_shutdown()
When building a kernel with KASAN this leads to the following KASAN report:
[ 106.634504] ==================================================================
[ 106.634623] BUG: KASAN: slab-use-after-free in mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei
[ 106.634683] Read of size 4 at addr ffff88819cb62018 by task systemd-shutdow/1
[ 106.634729]
[ 106.634767] Tainted: [E]=UNSIGNED_MODULE
[ 106.634770] Hardware name: Dell Inc. XPS 16 9640/09CK4V, BIOS 1.12.0 02/10/2025
[ 106.634773] Call Trace:
[ 106.634777] <TASK>
...
[ 106.634871] kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:636)
[ 106.634901] mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei
[ 106.634921] mei_cl_all_disconnect (drivers/misc/mei/client.c:2165 (discriminator 4)) mei
[ 106.634941] mei_reset (drivers/misc/mei/init.c:163) mei
...
[ 106.635042] mei_stop (drivers/misc/mei/init.c:348) mei
[ 106.635062] mei_vsc_remove (drivers/misc/mei/mei_dev.h:784 drivers/misc/mei/platform-vsc.c:393) mei_vsc
[ 106.635066] platform_remove (drivers/base/platform.c:1424)
Add the missing mei_cldev_disable() calls so that the mei_cl gets removed
from mei_device->file_list before it is freed to fix this.
Fixes: 78876f71b3e9 ("media: pci: intel: ivsc: Add ACE submodule") Fixes: 29006e196a56 ("media: pci: intel: ivsc: Add CSI submodule") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Getting / Setting the frame interval using the V4L2 subdev pad ops
get_frame_interval/set_frame_interval causes a deadlock, as the
subdev state is locked in the [1] but also in the driver itself.
In [2] it's described that the caller is responsible to acquire and
release the lock in this case. Therefore, acquiring the lock in the
driver is wrong.
Remove the lock acquisitions/releases from mt9m114_ifp_get_frame_interval()
and mt9m114_ifp_set_frame_interval().
[1] drivers/media/v4l2-core/v4l2-subdev.c - line 1129
[2] Documentation/driver-api/media/v4l2-subdev.rst
Fixes: 24d756e914fc ("media: i2c: Add driver for onsemi MT9M114 camera sensor") Cc: stable@vger.kernel.org Signed-off-by: Mathis Foerst <mathis.foerst@mt.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ov2659_probe() doesn't properly free control handler resources in failure
paths, causing memory leaks. Add v4l2_ctrl_handler_free() to prevent these
memory leaks and reorder the ctrl_handler assignment for better code flow.
Fixes: c4c0283ab3cd ("[media] media: i2c: add support for omnivision's ov2659 sensor") Cc: stable@vger.kernel.org Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During the probe() routine, the PiSP BE driver needs to power up the
interface in order to identify and initialize the hardware.
The driver resumes the interface by calling the
pispbe_runtime_resume() function directly, without going
through the pm_runtime helpers, but later suspends it by calling
pm_runtime_put_autosuspend().
This causes a PM usage count imbalance at probe time, notified by the
runtime_pm framework with the below message in the system log:
Fix this by resuming the interface using the pm runtime helpers instead
of calling the resume function directly and use the pm_runtime framework
in the probe() error path. While at it, remove manual suspend of the
interface in the remove() function. The driver cannot be unloaded if in
use, so simply disable runtime pm.
To simplify the implementation, make the driver depend on PM as the
RPI5 platform where the ISP is integrated in uses the PM framework by
default.
In the interrupt handler rain_interrupt(), the buffer full check on
rain->buf_len is performed before acquiring rain->buf_lock. This
creates a Time-of-Check to Time-of-Use (TOCTOU) race condition, as
rain->buf_len is concurrently accessed and modified in the work
handler rain_irq_work_handler() under the same lock.
Multiple interrupt invocations can race, with each reading buf_len
before it becomes full and then proceeding. This can lead to both
interrupts attempting to write to the buffer, incrementing buf_len
beyond its capacity (DATA_SIZE) and causing a buffer overflow.
Fix this bug by moving the spin_lock() to before the buffer full
check. This ensures that the check and the subsequent buffer modification
are performed atomically, preventing the race condition. An corresponding
spin_unlock() is added to the overflow path to correctly release the
lock.
This possible bug was found by an experimental static analysis tool
developed by our team.
Fixes: 0f314f6c2e77 ("[media] rainshadow-cec: new RainShadow Tech HDMI CEC driver") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an program is streaming (ffplay) and another program (qv4l2)
changes the TV standard from NTSC to PAL, the kernel crashes due to trying
to copy to unmapped memory.
Changing from NTSC to PAL increases the resolution in the usbtv struct,
but the video plane buffer isn't adjusted, so it overflows.
Fixes: 0e0fe3958fdd13d ("[media] usbtv: Add support for PAL video source") Cc: stable@vger.kernel.org Signed-off-by: Ludwig Disterhof <ludwig@disterhof.eu> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
[hverkuil: call vb2_is_busy instead of vb2_is_streaming] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's a common pattern in drivers to free the control handler's resources
and then return the handler's error code on drivers' error handling paths.
Alas, the v4l2_ctrl_handler_free() function also zeroes the error field,
effectively indicating successful return to the caller.
There's no apparent need to touch the error field while releasing the
control handler's resources and cleaning up stale pointers. Not touching
the handler's error field is a more certain way to address this problem
than changing all the users, in which case the pattern would be likely to
re-emerge in new drivers.
Do just that, don't touch the control handler's error field in
v4l2_ctrl_handler_free().
Fixes: 0996517cf8ea ("V4L/DVB: v4l2: Add new control handling framework") Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The desired clock frequency was correctly set to 400MHz in the device tree
but was lowered by the driver to 300MHz breaking 4K 60Hz content playback.
Fix the issue by removing the driver call to clk_set_rate(), which reduce
the amount of board specific code.
Fixes: 003afda97c65 ("media: verisilicon: Enable AV1 decoder on rk3588") Cc: stable@vger.kernel.org Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pixel_array control size was calculated incorrectly:
the dimensions were swapped (dims[0] should be the height), and the
values should be the width or height divided by PIXEL_ARRAY_DIV
and rounded up. So don't use roundup, but use DIV_ROUND_UP instead.
This bug is harmless in the sense that nothing will break, except that
it consumes way too much memory for this control.
The pad argument to v4l2_subdev_state_xlate_streams() is incorrect, static
pad number is used for the source pad even though the pad number is
dependent on the stream. Fix it.
Add video_device_release() in label 'err_m2m' to release the memory
allocated by video_device_alloc() and prevent potential memory leaks.
Remove the reduntant code in label 'err_m2m'.
Fixes: a8ef0488cc59 ("media: imx: add csc/scaler mem2mem device") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The test pattern is set by a 8-bit register according to the
specification.
+--------+-------------------------------+
| BIT[0] | Solid color |
+--------+-------------------------------+
| BIT[1] | Color bar |
+--------+-------------------------------+
| BIT[2] | Fade to grey color bar |
+--------+-------------------------------+
| BIT[3] | PN9 |
+--------+-------------------------------+
| BIT[4] | Gradient horizontal |
+--------+-------------------------------+
| BIT[5] | Gradient vertical |
+--------+-------------------------------+
| BIT[6] | Check board |
+--------+-------------------------------+
| BIT[7] | Slant pattern |
+--------+-------------------------------+
Based on function above, current test pattern programming is wrong.
This patch fixes it by 'BIT(pattern - 1)'. If pattern is 0, driver
will disable the test pattern generation and set the pattern to 0.
Fixes: e62138403a84 ("media: hi556: Add support for Hi-556 sensor") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This sd_init() function reads the firmware. The firmware data holds a
series of records and the function reads each record and sends the data
to the device. The request_ihex_firmware() function
calls ihex_validate_fw() which ensures that the total length of all the
records won't read out of bounds of the fw->data[].
However, a potential issue is if there is a single very large
record (larger than PAGE_SIZE) and that would result in memory
corruption. Generally we trust the firmware, but it's always better to
double check.
Fixes: 49b61ec9b5af ("[media] gspca: Add new vicam subdriver") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
int main(int argc, char **argv)
{
unsigned long page_size = sysconf(_SC_PAGESIZE);
char *p = malloc(3 * page_size);
char *p_aligned;
/* initialize memory region. If not initialized, write syscall below will correctly return EFAULT. */
if (1)
memset(p, 'X', 3 * page_size);
p_aligned = (char *) ((((uintptr_t) p) + (2*page_size - 1)) & ~(page_size - 1));
/* Drop PROT_READ protection. Kernel and userspace should fault when accessing that memory region */
mprotect(p_aligned, page_size, PROT_NONE);
/* the following write() should return EFAULT, since PROT_READ was dropped by previous mprotect() */
int ret = write(2, p_aligned, 1);
if (!ret || errno != EFAULT)
printf("\n FAILURE: write() did not returned expected EFAULT value\n");
return 0;
}
Because of the way _PAGE_READ is handled, kernel code never generates
a read access fault when it access a page as the kernel privilege level
is always less than PL1 in the PTE.
This patch reworks the comments in the make_insert_tlb macro to try
to make this clearer.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We use load and stbys,e instructions to trigger memory reference
interruptions without writing to memory. Because of the way read
access support is implemented, read access interruptions are only
triggered at privilege levels 2 and 3. The kernel and gateway
page execute at privilege level 0, so this code never triggers
a read access interruption. Thus, it is currently possible for
user code to execute a LWS compare and swap operation at an
address that is read protected at privilege level 3 (PRIV_USER).
Fix this by probing read access rights at privilege level 3 and
branching to lws_fault if access isn't allowed.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>