Here are the outstanding miscellaneous fixes for netfslib gathered together
and with some fixes-to-fixes folded down and one rearrangement. Various
Sashiko review comments[1][2][3][4][5] are addressed:
(1) Fix subrequest cancellation cleanup in DIO read and single-read.
(2) Fix missing locking around retry adding new subrequests.
(3) Fix read and write result collection to use barriering correctly to
access a request's subrequest lists without taking a lock.
This adds list_add_tail_release() and
list_first_entry_or_null_acquire() to appropriate incorporate
barriering into some list functions.
(4) Fix netfs_read_to_pagecache() to pause on subrequest I/O failure.
(5) Fix the potential for 64-bit tearing on a 32-bit machine when reading
netfs_inode->remote_i_size and ->zero_point by using much the same
mechanism as is used for ->i_size.
(6) Fix the calculation of zero_point in netfs_release_folio() to limit it
to ->remote_i_size, not ->i_size.
(7) Fix triggering of a VM_BUG_ON_FOLIO() in netfs_write_begin().
(8) Fix a potentially uninitialised error value in
netfs_extract_user_iter().
(9) Fix error handling in netfs_extract_user_iter().
(10) Fix overrun checking in netfs_extract_user_iter().
(11) Fix netfs_invalidate_folio() to clear the folio dirty bit if all dirty
data removed.
(12) Defer the emission of trace_netfs_folio() in netfs_perform_write().
This allows the next patch to emit the correct traces.
(13) Fix the handling of a partially failed copy (ie. EFAULT) into a
streaming write folio. Also remove the netfs_folio if a streaming
write folio is entirely overwritten.
(14) Fix a potential deadlock in writethrough writing.
(15) Fix netfs_read_gaps() to remove the netfs_folio from a filled folio.
(16) Fix netfs_perform_write() to not disable streaming writes when writing
to an fd that's open O_RDWR.
(17) Fix an early put of the sink page used in netfs_read_gaps(), before
the request has completed.
(18) Fix request leak in netfs_write_begin() error handling.
(19) Fix a potential UAF in netfs_unlock_abandoned_read_pages() due to
trying to check index of each folio we're abandoning to see if that
folio is actually owned by the caller (in which case, we're not
actually allowed to dereference it).
(20) Fix incorrect adjustment of dirty region when partially invalidating a
streaming write folio.
(21) Fix the handling of folio->private in netfs_perform_write() and the
attached netfs_folio and/or group when a streaming write folio is
modified.
(22) Fix netfs_read_folio() to wait on writeback first (it holds the folio
lock) otherwise we aren't allowed to look at the netfs_folio struct as
that could be modified at any time by the writeback collector.
(23) Fix write skipping in dir/symlink writepages.
* patches from https://patch.msgid.link/20260512123404.719402-1-dhowells@redhat.com: (24 commits)
afs: Fix the locking used by afs_get_link()
netfs, afs: Fix write skipping in dir/link writepages
netfs: Fix netfs_read_folio() to wait on writeback
netfs: Fix folio->private handling in netfs_perform_write()
netfs: Fix partial invalidation of streaming-write folio
netfs: Fix potential UAF in netfs_unlock_abandoned_read_pages()
netfs: Fix leak of request in netfs_write_begin() error handling
netfs: Fix early put of sink folio in netfs_read_gaps()
netfs: Fix write streaming disablement if fd open O_RDWR
netfs: Fix read-gaps to remove netfs_folio from filled folio
netfs: Fix potential deadlock in write-through mode
netfs: Fix streaming write being overwritten
netfs: Defer the emission of trace_netfs_folio()
netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone
netfs: Fix overrun check in netfs_extract_user_iter()
netfs: fix error handling in netfs_extract_user_iter()
netfs: Fix potential uninitialised var in netfs_extract_user_iter()
netfs: fix VM_BUG_ON_FOLIO() issue in netfs_write_begin() call
netfs: Fix zeropoint update where i_size > remote_i_size
netfs: Fix potential for tearing in ->remote_i_size and ->zero_point
...
David Howells [Tue, 12 May 2026 12:34:01 +0000 (13:34 +0100)]
afs: Fix the locking used by afs_get_link()
The afs filesystem in the kernel doesn't do locking correctly for symbolic
links. There are a number of problems:
(1) It doesn't do any locking around afs_read_single() to prevent races
between multiple ->get_link() calls, thereby allowing the possibility
of leaks.
(2) It doesn't use RCU barriering when accessing the buffer pointers
during RCU pathwalk.
(3) It can race with another thread updating the contents of the symlink
if a third party updated it on the server.
Fix this by the following means:
(0) Move symlink handling into its own file as this makes it more
complicated.
(1) Take the validate_lock around afs_read_single() to prevent races
between multiple ->get_link() calls.
(2) Keep a separate copy of the symlink contents with an rcu_head. This
is always going to be a lot smaller than a page, so it can be
kmalloc'd and save quite a bit of memory. It also needs a refcount
for non-RCU pathwalk.
(3) Split the symlink read and write-to-cache routines in afs from those
for directories.
(4) Discard the I/O buffer as soon as the write-to-cache completes as this
is a full page (plus a folio_queue).
(5) If there's no cache, discard the I/O buffer immediately after reading
and copying if there is no cache.
Fixes: eae9e78951bb ("afs: Use netfslib for symlinks, allowing them to be cached") Fixes: 6698c02d64b2 ("afs: Locally initialise the contents of a new symlink on creation") Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-25-dhowells@redhat.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:34:00 +0000 (13:34 +0100)]
netfs, afs: Fix write skipping in dir/link writepages
Fix netfs_write_single() and afs_single_writepages() to better handle a
write that would be skipped due to lock contention and WB_SYNC_NONE by
returning 1 from netfs_write_single() if it skipped and making
afs_single_writepages() skip also. If a skip occurs, the inode must be
re-marked as the VFS may have cleared the mark.
This is really only theoretical for directories in netfs_write_single() as
the only path to that is through afs_single_writepages() that takes the
->validate_lock around it, thereby serialising it.
Fixes: 6dd80936618c ("afs: Use netfslib for directories") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-24-dhowells@redhat.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:59 +0000 (13:33 +0100)]
netfs: Fix netfs_read_folio() to wait on writeback
Fix netfs_read_folio() to wait for an ongoing writeback to complete so that
it can trust the dirty flag and whatever is attached to folio->private
(folio->private may get cleaned up by the collector before it clears the
writeback flag).
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-23-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:58 +0000 (13:33 +0100)]
netfs: Fix folio->private handling in netfs_perform_write()
Under some circumstances, netfs_perform_write() doesn't correctly
manipulate folio->private between NULL, NETFS_FOLIO_COPY_TO_CACHE, pointing
to a group and pointing to a netfs_folio struct, leading to potential
multiple attachments of private data with associated folio ref leaks and
also leaks of netfs_folio structs or netfs_group refs.
Fix this by consolidating the place at which a folio is marked uptodate in
one place and having that look at what's attached to folio->private and
decide how to clean it up and then set the new group. Also, the content
shouldn't be flushed if group is NULL, even if a group is specified in the
netfs_group parameter, as that would be the case for a new folio. A
filesystem should always specify netfs_group or never specify netfs_group.
The Sashiko auto-review tool noted that it was theoretically possible that
the fpos >= ctx->zero_point section might leak if it modified a streaming
write folio. This is unlikely, but with a network filesystem, third party
changes can happen. It also pointed out that __netfs_set_group() would
leak if called multiple times on the same folio from the "whole folio
modify section".
Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-22-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:57 +0000 (13:33 +0100)]
netfs: Fix partial invalidation of streaming-write folio
In netfs_invalidate_folio(), if the region of a partial invalidation
overlaps the front (but not all) of a dirty write cached in a streaming
write page (dirty, but not uptodate, with the dirty region tracked by a
netfs_folio struct), the function modifies the dirty region - but
incorrectly as it moves the region forward by setting the start to the
start, not the end, of the invalidation region.
Fix this by setting finfo->dirty_offset to the end of the invalidation
region (iend).
Fixes: cce6bfa6ca0e ("netfs: Fix trimming of streaming-write folios in netfs_inval_folio()") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-21-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:56 +0000 (13:33 +0100)]
netfs: Fix potential UAF in netfs_unlock_abandoned_read_pages()
netfs_unlock_abandoned_read_pages(rreq) accesses the index of the folios it
is wanting to unlock and compares that to rreq->no_unlock_folio so that it
doesn't unlock a folio being read for netfs_perform_write() or
netfs_write_begin().
However, given that netfs_unlock_abandoned_read_pages() is called _after_
NETFS_RREQ_IN_PROGRESS is cleared, the one folio that it's not allowed to
dereference is the one specified by ->no_unlock_folio as ownership
immediately reverts to the caller.
Fix this by storing the folio pointer instead and using that rather than
the index. Also fix netfs_unlock_read_folio() where the same applies.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-20-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:55 +0000 (13:33 +0100)]
netfs: Fix leak of request in netfs_write_begin() error handling
Fix netfs_write_begin() to not leak our ref on the request in the event
that we get an error from netfs_wait_for_read().
Fixes: 4090b31422a6 ("netfs: Add a function to consolidate beginning a read") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-19-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:54 +0000 (13:33 +0100)]
netfs: Fix early put of sink folio in netfs_read_gaps()
Fix netfs_read_gaps() to release the sink page it uses after waiting for
the request to complete. The way the sink page is used is that an
ITER_BVEC-class iterator is created that has the gaps from the target folio
at either end, but has the sink page tiled over the middle so that a single
read op can fill in both gaps.
The bug was found by KASAN detecting a UAF on the generic/075 xfstest in
the cifsd kernel thread that handles reception of data from the TCP socket:
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Steve French <sfrench@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-18-dhowells@redhat.com Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:53 +0000 (13:33 +0100)]
netfs: Fix write streaming disablement if fd open O_RDWR
In netfs_perform_write(), "write streaming" (the caching of dirty data in
dirty but !uptodate folios) is performed to avoid the need to read data
that is just going to get immediately overwritten. However, this is/will
be disabled in three circumstances: if the fd is open O_RDWR, if fscache is
in use (as we need to round out the blocks for DIO) or if content
encryption is enabled (again for rounding out purposes).
The idea behind disabling it if the fd is open O_RDWR is that we'd need to
flush the write-streaming page before we could read the data, particularly
through mmap. But netfs now fills in the gaps if ->read_folio() is called
on the page, so that is unnecessary. Further, this doesn't actually work
if a separate fd is open for reading.
Fix this by removing the check for O_RDWR, thereby allowing streaming
writes even when we might read.
This caused a number of problems with the generic/522 xfstest, but those
are now fixed.
Fixes: c38f4e96e605 ("netfs: Provide func to copy data to pagecache for buffered write") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-17-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:52 +0000 (13:33 +0100)]
netfs: Fix read-gaps to remove netfs_folio from filled folio
Fix netfs_read_gaps() to remove the netfs_folio record from the folio
record before marking the folio uptodate if it successfully fills the gaps
around the dirty data in a streaming write folio (dirty, but not uptodate).
David Howells [Tue, 12 May 2026 12:33:51 +0000 (13:33 +0100)]
netfs: Fix potential deadlock in write-through mode
Fix netfs_advance_writethrough() to always unlock the supplied folio and to
mark it dirty if it isn't yet written to the end. Unfortunately, it can't
be marked for writeback until the folio is done with as that may cause a
deadlock against mmapped reads and writes.
Even though it has been marked dirty, premature writeback can't occur as
the caller is holding both inode->i_rwsem (which will prevent concurrent
truncation, fallocation, DIO and other writes) and ictx->wb_lock (which
will cause flushing to wait and writeback to skip or wait).
Note that this may be easier to deal with once the queuing of folios is
split from the generation of subrequests.
Fixes: 288ace2f57c9 ("netfs: New writeback implementation") Closes: https://sashiko.dev/#/patchset/20260427154639.180684-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-15-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:50 +0000 (13:33 +0100)]
netfs: Fix streaming write being overwritten
In order to avoid reading whilst writing, netfslib will allow "streaming
writes" in which dirty data is stored directly into folios without reading
them first. Such folios are marked dirty but may not be marked uptodate.
If a folio is entirely written by a streaming write, uptodate will be set,
otherwise it will have a netfs_folio struct attached to ->private recording
the dirty region.
In the event that a partially written streaming write page is to be
overwritten entirely by a single write(), netfs_perform_write() will try to
copy over it, but doesn't discard the netfs_folio if it succeeds; further,
it doesn't correctly handle a partial copy that overwrites some of the
dirty data.
Fix this by the following:
(1) If the folio is successfully overwritten, free the netfs_folio struct
before marking the page uptodate.
(2) If the copy to the folio partially fails, but short of the dirty data,
just ignore the copy.
(3) If the copy partially fails and overwrites some of the dirty data,
accept the copy, update the netfs_folio struct to record the new data.
If the folio is now filled, free the netfs_folio and set uptodate,
otherwise return a partial write.
It shows folio 0x24 misbehaving if the FMODE_READ check is commented out in
netfs_perform_write():
if (//(file->f_mode & FMODE_READ) ||
netfs_is_cache_enabled(ctx)) {
and no fscache. This was initially found with the generic/522 xfstest.
Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-14-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:49 +0000 (13:33 +0100)]
netfs: Defer the emission of trace_netfs_folio()
Change netfs_perform_write() to keep the netfs_folio trace value in a
variable and emit it later to make it easier to choose the value displayed.
This is a prerequisite for a subsequent patch.
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-13-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:48 +0000 (13:33 +0100)]
netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone
If a streaming write is made, this will leave the relevant modified folio
in a not-uptodate, but dirty state with a netfs_folio struct hung off of
folio->private indicating the dirty range. Subsequently truncating the
file such that the dirty data in the folio is removed, but the first part
of the folio theoretically remains will cause the netfs_folio struct to be
discarded... but will leave the dirty flag set.
If the folio is then read via mmap(), netfs_read_folio() will see that the
page is dirty and jump to netfs_read_gaps() to fill in the missing bits.
netfs_read_gaps(), however, expects there to be a netfs_folio struct
present and can oops because truncate removed it.
Fix this by calling folio_cancel_dirty() in netfs_invalidate_folio() in the
event that all the dirty data in the folio is erased (as nfs does).
Also add some tracepoints to log modifications to a dirty page.
with fscaching disabled (otherwise streaming writes are suppressed) and a
change to netfs_perform_write() to disallow streaming writes if the fd is
open O_RDWR:
if (//(file->f_mode & FMODE_READ) || <--- comment this out
netfs_is_cache_enabled(ctx)) {
It should be reproducible even without this change, but if prevents the
above trivial xfs_io command from reproducing it.
Note that the initial dd is important: the file must start out sufficiently
large that the zero-point logic doesn't just clear the gaps because it
knows there's nothing in the file to read yet. Unmounting and mounting is
needed to clear the pagecache (there are other ways to do that that may
also work).
This was initially reproduced with the generic/522 xfstest on some patches
that remove the FMODE_READ restriction.
Fixes: 9ebff83e6481 ("netfs: Prep to use folio->private for write grouping and streaming write") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-12-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:47 +0000 (13:33 +0100)]
netfs: Fix overrun check in netfs_extract_user_iter()
Fix netfs_extract_user_iter() so that if iov_iter_extract_pages() overfills
pages[], then those pages don't get included in the iterator constructed at
the end of the function. If there was an overfill, memory corruption has
already happened.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator") Closes: https://sashiko.dev/#/patchset/20260427154639.180684-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-11-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
Paulo Alcantara [Tue, 12 May 2026 12:33:46 +0000 (13:33 +0100)]
netfs: fix error handling in netfs_extract_user_iter()
In netfs_extract_user_iter(), if iov_iter_extract_pages() failed to
extract user pages, bail out on -ENOMEM, otherwise return the error
code only if @npages == 0, allowing short DIO reads and writes to be
issued.
This fixes mmapstress02 from LTP tests against CIFS.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator") Reported-by: Xiaoli Feng <xifeng@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-10-dhowells@redhat.com Cc: netfs@lists.linux.dev Cc: stable@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:45 +0000 (13:33 +0100)]
netfs: Fix potential uninitialised var in netfs_extract_user_iter()
In netfs_extract_user_iter(), if it's given a zero-length iterator, it will
fall through the loop without setting ret, and so the error handling
behaviour will be undefined, depending on whether ret happens to be
negative. The value of ret then propagates back up the callstack.
Fix this by presetting ret to 0.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-9-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
The race sequence:
1. Read completes -> netfs_read_collection() runs
2. netfs_wake_rreq_flag(rreq, NETFS_RREQ_IN_PROGRESS, ...)
3. netfs_wait_for_read() returns -EFAULT to netfs_write_begin()
4. The netfs_unlock_abandoned_read_pages() unlocks the folio
5. netfs_write_begin() calls folio_unlock(folio) -> VM_BUG_ON_FOLIO()
The key reason of the issue that netfs_unlock_abandoned_read_pages()
doesn't check the flag NETFS_RREQ_NO_UNLOCK_FOLIO and executes
folio_unlock() unconditionally. This patch implements in
netfs_unlock_abandoned_read_pages() logic similar to
netfs_unlock_read_folio().
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-8-dhowells@redhat.com Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: Ceph Development <ceph-devel@vger.kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:43 +0000 (13:33 +0100)]
netfs: Fix zeropoint update where i_size > remote_i_size
Fix the update of the zero point[*] by netfs_release_folio() when there is
uncommitted data in the pagecache beyond the folio being released but the
on-server EOF is in this folio (ie. i_size > remote_i_size). The update
needs to limit zero_point to remote_i_size, not i_size as i_size is a local
phenomenon reflecting updates made locally to the pagecache, not stuff
written to the server. remote_i_size tracks the server's i_size.
[*] The zero point is the file position from which we can assume that the
server will just return zeros, so we can avoid generating reads.
Note that netfs_invalidate_folio() probably doesn't need fixing as
zero_point should be updated by setattr after truncation or fallocate.
David Howells [Tue, 12 May 2026 12:33:42 +0000 (13:33 +0100)]
netfs: Fix potential for tearing in ->remote_i_size and ->zero_point
Fix potential tearing in using ->remote_i_size and ->zero_point by copying
i_size_read() and i_size_write() and using the same seqcount as for i_size.
We need to make sure that netfslib and the filesystems that use it always
hold i_lock whilst updating any of the sizes to prevent i_size_seqcount
from getting corrupted.
Fixes: 4058f742105e ("netfs: Keep track of the actual remote file size") Fixes: 100ccd18bb41 ("netfs: Optimise away reads above the point at which there can be no data") Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-6-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:40 +0000 (13:33 +0100)]
netfs: Fix missing barriers when accessing stream->subrequests locklessly
The list of subrequests attached to stream->subrequests is accessed without
locks by netfs_collect_read_results() and netfs_collect_write_results(),
and then they access subreq->flags without taking a barrier after getting
the subreq pointer from the list. Relatedly, the functions that build the
list don't use any sort of write barrier when constructing the list to make
sure that the NETFS_SREQ_IN_PROGRESS flag is perceived to be set first if
no lock is taken.
Fix this by:
(1) Add a new list_add_tail_release() function that uses a release barrier
to set the pointer to the new member of the list.
(2) Add a new list_first_entry_or_null_acquire() function that uses an
acquire barrier to read the pointer to the first member in a list (or
return NULL).
(3) Use list_add_tail_release() when adding a subreq to ->subrequests.
(4) Use list_first_entry_or_null_acquire() when initially accessing the
front of the list (when an item is removed, the pointer to the new
front iterm is obtained under the same lock).
David Howells [Tue, 12 May 2026 12:33:39 +0000 (13:33 +0100)]
netfs: Fix missing locking around retry adding new subreqs
Fix netfs_retry_read_subrequests() and netfs_retry_write_stream() to take
the appropriate lock when adding extra subrequests into
stream->subrequests.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Fixes: 288ace2f57c9 ("netfs: New writeback implementation") Closes: https://sashiko.dev/#/patchset/20260425125426.3855807-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-3-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
David Howells [Tue, 12 May 2026 12:33:38 +0000 (13:33 +0100)]
netfs: Fix cancellation of a DIO and single read subrequests
When the preparation of a new subrequest for a read fails, if the
subrequest has already been added to the stream->subrequests list, it can't
simply be put and abandoned as the collector may see it. Also, if it
hasn't been queued yet, it has two outstanding refs that both need to be
put. Both DIO read and single-read dispatch fail at this; further, both
differ in the order they do things to the way buffered read works.
Fix cancellation of both DIO-read and single-read subrequests that failed
preparation by the following steps:
(1) Harmonise all three reads (buffered, dio, single) to queue the subreq
before prepping it.
(2) Make all three call netfs_queue_read() to do the queuing.
(3) Set NETFS_RREQ_ALL_QUEUED independently of the queuing as we don't
know the length of the subreq at this point.
(4) In all cases, set the error and NETFS_SREQ_FAILED flag on the subreq
and then call netfs_read_subreq_terminated() to deal with it. This
will pass responsibility off to the collector for dealing with it.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Closes: https://sashiko.dev/#/patchset/20260425125426.3855807-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20260512123404.719402-2-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
before calling poll_select_set_timeout() -> timespec64_valid(). Both
operands of the seconds sum are unbounded user-controlled signed long.
A crafted pair where tv_usec is a negative multiple of USEC_PER_SEC
drives the sum across the wrap boundary - e.g.
yields sec = LONG_MAX, nsec = 0, which passes timespec64_valid() and
then flows through timespec64_add_safe(), which saturates the absolute
deadline to TIME64_MAX (clamped further to KTIME_MAX downstream).
select(2) therefore blocks effectively forever instead of returning
-EINVAL as POSIX requires for a negative timeout.
Only the legacy __NR_select syscall takes this path. pselect6, ppoll,
poll and epoll_pwait2 all hand the user's two fields directly to
poll_select_set_timeout(), which validates *before* doing any
arithmetic:
/* fs/select.c:271 -- the validator */
int poll_select_set_timeout(struct timespec64 *to, time64_t sec, long nsec)
{
struct timespec64 ts = {.tv_sec = sec, .tv_nsec = nsec};
if (!timespec64_valid(&ts))
return -EINVAL;
...
}
/* include/linux/time64.h:97 -- timespec64_valid */
if (ts->tv_sec < 0) return false;
if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC) return false;
/* fs/select.c:744 do_pselect() (pselect6, pselect6_time32) */
if (get_timespec64(&ts, tsp)) return -EFAULT;
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;
/* fs/select.c:1097 ppoll */
if (get_timespec64(&ts, tsp)) return -EFAULT;
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;
/* fs/select.c:1065 poll -- timeout_msecs is int; >= 0 gates the math */
if (timeout_msecs >= 0)
poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC,
NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC));
/* fs/eventpoll.c:2512 epoll_pwait2 */
if (get_timespec64(&ts, timeout)) return -EFAULT;
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL;
In every one of these the wrap-prone arithmetic from kern_select()
simply does not exist; the user fields reach timespec64_valid()
unmodified. glibc routes the C-library select() through pselect6,
so the bug is reachable only via a direct syscall(__NR_select, ...).
The pre-validation negative check that used to live here was lost
when the syscall was switched to the poll_select_set_timeout() helper.
Restore it: reject tv_sec < 0 || tv_usec < 0 up front, mirroring what
glibc does in userspace. do_compat_select() has the same arithmetic
pattern but is only reachable on 32-bit compat and from a different
syscall entry; left for a follow-up so this change stays minimal.
Reproducer (returns -1/EINVAL on a fixed kernel; blocks indefinitely
on an unfixed one):
Fixes: 4d36a9e65d49 ("select: deal with math overflow from borderline valid userland data") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260429-timeval-v1-1-4448e2588bbf@debian.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
ARM: pxa: pxa27x: attach software node to its target GPIO controller
Software node describing the GPIO controller for the pxa27x platforms is
currently "dangling" - it's not actually attached to the relevant
controller and doesn't allow real fwnode lookup. Attach it once it's
registered as a firmware node before adding the platform device.
ARM: pxa: pxa25x: attach software node to its target GPIO controller
Software node describing the GPIO controller for the pxa25x platforms is
currently "dangling" - it's not actually attached to the relevant
controller and doesn't allow real fwnode lookup. Attach it once it's
registered as a firmware node before adding the platform device.
ARM: pxa: spitz: attach software nodes to their target GPIO controllers
Software nodes describing the GPIO controllers for the spitz platform
are currently "dangling" - they're not actually attached to the relevant
controllers and don't allow real fwnode lookup. Attach them either by
directly assigning them to the struct device or by using the i2c board
info struct.
Fix an incorrect operator in the SAFETY comment, changing `<` to `<=`,
since `Vec::reserve` guarantees capacity for exactly n additional elements,
so the equal case should be included.
rust: alloc: fix assert in `Vec::reserve` doc test
The assert in the doctest used `>= 10`, which only checks that the
capacity can hold `additional` elements, ignoring the existing length
of `v`. The correct check should ensure there is room for `additional`
*extra* elements on top of what is already in the vector.
Fix the assert to use `>= v.len() + 10` so the example accurately
reflects the actual semantics of the function.
Eric Biggers [Thu, 30 Apr 2026 09:52:43 +0000 (02:52 -0700)]
dm-inlinecrypt: add target for inline block device encryption
Add a new device-mapper target "dm-inlinecrypt" that is similar to
dm-crypt but uses the blk-crypto API instead of the regular crypto API.
This allows it to take advantage of inline encryption hardware such as
that commonly built into UFS host controllers.
The table syntax matches dm-crypt's, but for now only a stripped-down
set of parameters is supported. For example, for now AES-256-XTS is the
only supported cipher.
dm-inlinecrypt is based on Android's dm-default-key with the
controversial passthrough support removed. Note that due to the removal
of passthrough support, use of dm-inlinecrypt in combination with
fscrypt causes double encryption of file contents (similar to dm-crypt +
fscrypt), with the fscrypt layer not being able to use the inline
encryption hardware. This makes dm-inlinecrypt unusable on systems such
as Android that use fscrypt and where a more optimized approach is
needed. It is however suitable as a replacement for dm-crypt.
dm-inlinecrypt supports both keyring key and hex key, the former avoids
the key to be exposed in dm-table message. Similar to dm-default-key in
Android, it will fallabck to the software block crypto once the inline
crypto hardware cannot support the expected cipher.
====================
vsock/virtio: fix vsockmon tap skb construction
While reviewing the patch posted by Yiqi Sun [1] to fix an issue in
virtio_transport_build_skb(), I discovered another issue related to
the offset and length of the payload to be copied in the new skb.
This was introduced when we did the skb conversion, and fixed by
patch 1.
Patch 2 fixes the issue found by Yiqi Sun in a different way: using
iov_iter_kvec() to properly initialize all the iov_iter fields and
removing the linear vs non-linear split like we alredy do in
vhost-vsock.
It could have been a single patch, but since there were two affected
commits, I decided to keep the fixes separate.
vsock/virtio: fix empty payload in tap skb for non-linear buffers
For non-linear skbs, virtio_transport_build_skb() goes through
virtio_transport_copy_nonlinear_skb() to copy the original payload
in the new skb to be delivered to the vsockmon tap device.
This manually initializes an iov_iter but does not set iov_iter.count.
Since the iov_iter is zero-initialized, the copy length is zero and no
payload is actually copied to the monitor interface, leaving data
un-initialized.
Fix this by removing the linear vs non-linear split and using
skb_copy_datagram_iter() with iov_iter_kvec() for all cases, as
vhost-vsock already does. This handles both linear and non-linear skbs,
properly initializes the iov_iter, and removes the now unused
virtio_transport_copy_nonlinear_skb().
While touching this code, let's also check the return value of
skb_copy_datagram_iter(), even though it's unlikely to fail.
Fixes: 4b0bf10eb077 ("vsock/virtio: non-linear skb handling for tap") Reported-by: Yiqi Sun <sunyiqixm@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Arseniy Krasnov <avkrasnov@rulkc.org> Link: https://patch.msgid.link/20260508164411.261440-3-sgarzare@redhat.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
vsock/virtio: fix length and offset in tap skb for split packets
virtio_transport_build_skb() builds a new skb to be delivered to the
vsockmon tap device. To build the new skb, it uses the original skb
data length as payload length, but as the comment notes, the original
packet stored in the skb may have been split in multiple packets, so we
need to use the length in the header, which is correctly updated before
the packet is delivered to the tap, and the offset for the data.
This was also similar to what we did before commit 71dc9ec9ac7d
("virtio/vsock: replace virtio_vsock_pkt with sk_buff") where we probably
missed something during the skb conversion.
Also update the comment above, which was left stale by the skb
conversion and still mentioned a buffer pointer that no longer exists.
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Arseniy Krasnov <avkrasnov@rulkc.org> Link: https://patch.msgid.link/20260508164411.261440-2-sgarzare@redhat.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Mark Brown [Tue, 12 May 2026 10:47:14 +0000 (19:47 +0900)]
ASoC: Add a new SoundWire enumeration helper
Charles Keepax <ckeepax@opensource.cirrus.com> says:
Add a new SoundWire enumeration helper function, many drivers have
almost identical code in runtime resume so it makes sense to move this
to the core.
It is worth noting this is really step one of a larger process, there
are a few drivers that do more custom things and are not covered by this
series. But this series picks up the low hanging fruit and moves things
in a good direction.
The next step is to look at drivers that also wait at probe time, where
the unattached_request flag is not going to be valid.
Charles Keepax [Tue, 12 May 2026 10:30:05 +0000 (11:30 +0100)]
soundwire: Add a helper function to wait for device initialisation
Add a new helper function to wait for the device to enumerate
and be initialised by the SoundWire core. Most of the SoundWire
drivers have very similar boiler plate code in their runtime
resume, and that boiler plate tends to access various internals
of the SoundWire structs which is a mild layering violation.
Adding a new core helper function greatly eases both of these
issues.
Quan Sun [Fri, 8 May 2026 12:46:36 +0000 (20:46 +0800)]
net: hsr: fix NULL pointer dereference in hsr_get_node_data()
In the HSR (High-availability Seamless Redundancy) protocol, node
information is maintained in the node_db. When a supervision frame is
received, node->addr_B_port is updated to track the receiving port type
(e.g., HSR_PT_SLAVE_B).
If the underlying physical interface associated with this slave port is
removed (e.g., via `ip link del`), hsr_del_port() frees the hsr_port
object. However, the stale node->addr_B_port reference is kept in the
node_db until the node ages out.
Subsequently, if userspace queries the node status via the Netlink
command HSR_C_GET_NODE_STATUS, the kernel calls hsr_get_node_data().
This function unconditionally dereferences the pointer returned by
hsr_port_get_hsr():
if (node->addr_B_port != HSR_PT_NONE) {
port = hsr_port_get_hsr(hsr, node->addr_B_port);
*addr_b_ifindex = port->dev->ifindex; // <-- NULL deref
}
If the slave port has been deleted, hsr_port_get_hsr() returns NULL,
resulting in a kernel panic.
Oops: general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
RIP: 0010:hsr_get_node_data+0x7b6/0x9e0
Call Trace:
<TASK>
hsr_get_node_status+0x445/0xa40
Fix this by adding a proper NULL pointer check. If the port lookup fails
due to a stale port type, gracefully treat it as if no valid port exists
and assign -1 to the interface index.
Steps to reproduce:
1. Create an HSR interface with two slave devices.
2. Receive a supervision frame to populate node_db with
addr_B_port assigned to SLAVE_B.
3. Delete the underlying slave device B.
4. Send an HSR_C_GET_NODE_STATUS Netlink message.
Linus Walleij [Mon, 11 May 2026 19:43:44 +0000 (21:43 +0200)]
gpio: regmap: Don't set a fixed direction line
If a GPIO line has a fixed direction, report an error if a consumer
anyway tries to set the direction to something other than what it is
hardcoded to.
This didn't happen much before because what we supported was all lines
input or output and then the implementer would probably not specify the
direction registers, but with sparse fixed direction we can have
a mixture so let's take this into account.
As a consequence, since gpio_regmap_set_direction() can now fail, alter
the semantics in gpio_regmap_direction_output() such that we first check
if we can set the direction to output before we set the value and the
direction.
The RZ/G3L SOC has 3 watchdog timer channels:
- channel0 (wdt0) for Cortex-A55-CPU Non-Secure,
- channel1 (wdt1) for Cortex-A55 CPU Secure,
- channel2 (wdt2) for Cortex-M33 CPU.
Add pincontrol node to RZ/G3L ("R9A08G046") SoC DTSI and set the icu as
the interrupt-parent of the pin controller to route GPIO interrupts
through the IA55 interrupt controller.
Marek Vasut [Wed, 22 Apr 2026 23:36:30 +0000 (01:36 +0200)]
ARM: dts: renesas: r8a7740: Describe coresight
Describe the coresight topology on R-Mobile A1. Extend the current PTM
node with connection funnel, TPIU, ETB and replicator. Coresight on
this hardware is clocked from the ZT/ZTR trace clocks.
Add ZT trace bus and ZTR trace clocks on R-Mobile A1.
These clocks supply the coresight tracing modules, PTM, TPIU, ETB and
replicator. Without these clock, coresight tracing can not be operated.
Enable the Gigabit Ethernet Interfaces (GBETH0) populated on the RZ/G3L
SMARC EVK. The eth1, pincontrol definitions and hotplug support will be
added later.
Biju Das [Thu, 26 Mar 2026 11:19:49 +0000 (11:19 +0000)]
arm64: dts: renesas: r9a08g046: Add GBETH nodes
Renesas RZ/G3L SoC is equipped with 2x Synopsys DesignWare Ethernet
(10/100/1000 BASE) with TSN, IP block version 5.30. Add GBETH nodes
to R9A08G046 RZ/G3L SoC DTSI.
Valery Borovsky [Mon, 11 May 2026 17:12:11 +0000 (20:12 +0300)]
media: sun4i-csi: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
sun4i_csi_start_streaming() returned -EINVAL when no matching CSI
format could be found, before any setup (scratch buffer allocation,
pipeline start) had been performed. The remaining error paths already
converge on the err_clear_dma_queue label, which calls
return_all_buffers(..., VB2_BUF_STATE_QUEUED) under csi->qlock. Jump
to that label directly: the intermediate err_disable_device /
err_disable_pipeline / err_free_scratch_buffer labels are skipped,
which is correct because nothing they would undo has happened yet.
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
Valery Borovsky [Mon, 11 May 2026 17:12:10 +0000 (20:12 +0300)]
media: stm32-dcmipp: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
dcmipp_bytecap_start_streaming() returned -EINVAL when the source
subdevice could not be resolved from the media graph, before
pm_runtime_resume_and_get() and media_pipeline_start() had been called.
The remaining error paths already converge on the err_buffer_done
label, which calls dcmipp_bytecap_all_buffers_done(...,
VB2_BUF_STATE_QUEUED). Jump to that label directly: the intermediate
err_pm_put / err_media_pipeline_stop labels are skipped, which is
correct because nothing they would undo has happened yet.
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
Fixes: 28e0f3772296 ("media: stm32-dcmipp: STM32 DCMIPP camera interface driver") Cc: stable@vger.kernel.org Signed-off-by: Valery Borovsky <vebohr@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Valery Borovsky [Mon, 11 May 2026 17:12:09 +0000 (20:12 +0300)]
media: rtl2832_sdr: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
rtl2832_sdr_start_streaming() had multiple error paths that hit this
trap: two direct early returns (-ENODEV, -ERESTARTSYS), plus six
`goto err` paths covering subdev s_power, tuner setup, ADC setup,
stream-buffer allocation, urb allocation, and urb submission failures.
None of them returned the queued buffers.
The original function had no distinct success exit and fell straight
through into the err label, which previously only did mutex_unlock and
"return ret". Adding queued-buffer cleanup at err must therefore be
paired with an explicit success return; otherwise every successful
start would also drain the buffer queue and kill streaming. Add that
success return, then add rtl2832_sdr_cleanup_queued_bufs() at the err
label and before each early return.
The cleanup helper takes a vb2_buffer_state argument so that the
start_streaming error paths can pass VB2_BUF_STATE_QUEUED (as
expected by userspace on start_streaming failure) while stop_streaming
keeps its existing VB2_BUF_STATE_ERROR semantics.
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
The err label still does not roll back power_ctrl(), frontend_ctrl(),
the POWER_ON flag, or stream/URB allocations that may have happened
before the failing step. Those are pre-existing leaks of a different
class and are not addressed here.
Valery Borovsky [Mon, 11 May 2026 17:12:08 +0000 (20:12 +0300)]
media: pwc: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
pwc's start_streaming() had two early returns that hit this trap:
-ENODEV when the USB device was already disconnected, and -ERESTARTSYS
when mutex_lock_interruptible() was interrupted by a signal. Call the
existing pwc_cleanup_queued_bufs() helper with VB2_BUF_STATE_QUEUED
before returning (matching the state already used by the
pwc_isoc_init() error path in the same function).
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
Valery Borovsky [Mon, 11 May 2026 17:12:07 +0000 (20:12 +0300)]
media: msi2500: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
msi2500_start_streaming() had five error paths that all hit this trap
and were further tangled by ret-overwriting between calls:
- -ENODEV when the USB device was already disconnected
- -ERESTARTSYS when mutex_lock_interruptible() was interrupted
- msi2500_set_usb_adc() failure: ret was silently overwritten by
the next call (msi2500_isoc_init), so the error was lost entirely
- msi2500_isoc_init() failure: cleanup_queued_bufs was called, but
the function then fell through to msi2500_ctrl_msg() and again
masked the original error by overwriting ret
- msi2500_ctrl_msg(CMD_START_STREAMING) failure: no cleanup at all,
leaving isoc URBs submitted with no way for the driver to consume
them
Consolidate the error paths into a small goto chain. Every failure
now stops the function, drains the queued-buffer list, and returns
the real error code. The ctrl_msg failure path also rolls back the
preceding msi2500_isoc_init() via msi2500_isoc_cleanup() before
unlocking and draining.
The cleanup helper takes a vb2_buffer_state argument so that the
start_streaming error paths can pass VB2_BUF_STATE_QUEUED (as
expected by userspace on start_streaming failure) while stop_streaming
keeps its existing VB2_BUF_STATE_ERROR semantics.
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
Valery Borovsky [Mon, 11 May 2026 17:12:06 +0000 (20:12 +0300)]
media: airspy: Return queued buffers on start_streaming() failure
The vb2 framework hands buffers to the driver via buf_queue() before
calling start_streaming(). If start_streaming() returns an error
without first returning those buffers via vb2_buffer_done(),
vb2_start_streaming() fires WARN_ON(owned_by_drv_count) and the queued
buffers leak.
airspy_start_streaming() returned -ENODEV early when the USB device had
been disconnected (s->udev == NULL) without returning any buffers that
buf_queue() had already accepted. Take v4l2_lock first and jump to the
existing err_clear_bit label, which already drains s->queued_bufs via
vb2_buffer_done(..., VB2_BUF_STATE_QUEUED) before unlocking.
This mirrors the uvcvideo fix in commit 4cf3b6fd54eb ("media: uvcvideo:
Return queued buffers on start_streaming() failure").
Arash Golgol [Sat, 9 May 2026 16:10:13 +0000 (19:40 +0330)]
media: video-i2c: use vb2_video_unregister_device on driver removal
The driver uses vb2_fop_release() as its file release operation, so
vb2_video_unregister_device() should be used instead of
video_unregister_device() during driver removal.
This ensures that the vb2 queue is properly disconnected before the
video device is unregistered.
Signed-off-by: Arash Golgol <arash.golgol@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Ricardo Ribalda [Thu, 7 May 2026 20:58:10 +0000 (20:58 +0000)]
media: staging: ipu3-imgu: Add range check for imgu_css_cfg_acc_stripe
If the driver's stripe information is invalid it can result in an integer
underflow. Add a range check to avoid this kind of error.
This patch fixes the following smatch error:
drivers/staging/media/ipu3/ipu3-css-params.c:1792 imgu_css_cfg_acc_stripe() warn: 'acc->stripe.bds_out_stripes[0]->width - 2 * f' 4294967168 can't fit into 65535 'acc->stripe.bds_out_stripes[1]->offset'
Cc: stable@vger.kernel.org Fixes: e11110a5b744 ("media: staging/intel-ipu3: css: Compute and program ccs") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Ricardo Ribalda [Thu, 7 May 2026 20:58:07 +0000 (20:58 +0000)]
media: i2c: mt9p031: Rewrite assignment to make smatch happy
The current code makes smatch a bit uncomfortable:
drivers/media/i2c/mt9p031.c:799 mt9p031_s_ctrl() warn: assigning (-1952) to unsigned variable 'data'
Probably because smatch is not clever enough (yet). Do a simple rewrite
to make sure that smatch understands what we are doing here.