]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
5 months agoNFSv4.2: mark OFFLOAD_CANCEL MOVEABLE
Olga Kornievskaia [Fri, 13 Dec 2024 16:52:01 +0000 (11:52 -0500)] 
NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE

OFFLOAD_CANCEL should be marked MOVEABLE for when we need to move
tasks off a non-functional transport.

Fixes: c975c2092657 ("NFS send OFFLOAD_CANCEL when COPY killed")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoNFSv4.2: fix COPY_NOTIFY xdr buf size calculation
Olga Kornievskaia [Fri, 13 Dec 2024 16:52:00 +0000 (11:52 -0500)] 
NFSv4.2: fix COPY_NOTIFY xdr buf size calculation

We need to include sequence size in the compound.

Fixes: 0491567b51ef ("NFS: add COPY_NOTIFY operation")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoNFS: Rename struct nfs4_offloadcancel_data
Chuck Lever [Mon, 13 Jan 2025 15:32:38 +0000 (10:32 -0500)] 
NFS: Rename struct nfs4_offloadcancel_data

Refactor: This struct can be used unchanged for the new
OFFLOAD_STATUS implementation, so give it a more generic name.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoNFS: Fix typo in OFFLOAD_CANCEL comment
Chuck Lever [Mon, 13 Jan 2025 15:32:37 +0000 (10:32 -0500)] 
NFS: Fix typo in OFFLOAD_CANCEL comment

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoNFS: CB_OFFLOAD can return NFS4ERR_DELAY
Chuck Lever [Mon, 13 Jan 2025 15:32:36 +0000 (10:32 -0500)] 
NFS: CB_OFFLOAD can return NFS4ERR_DELAY

RFC 7862 permits the callback service to respond to a CB_OFFLOAD
operation with NFS4ERR_DELAY. Use that instead of
NFS4ERR_SERVERFAULT for temporary memory allocation failure, as that
is more consistent with how other operations report memory
allocation failure.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it
Dragan Simic [Fri, 27 Dec 2024 19:17:58 +0000 (20:17 +0100)] 
nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it

Having the NFS_FSCACHE option depend on the NETFS_SUPPORT options makes
selecting NFS_FSCACHE impossible unless another option that additionally
selects NETFS_SUPPORT is already selected.

As a result, for example, being able to reach and select the NFS_FSCACHE
option requires the CEPH_FS or CIFS option to be selected beforehand, which
obviously doesn't make much sense.

Let's correct this by making the NFS_FSCACHE option actually select the
NETFS_SUPPORT option, instead of depending on it.

Fixes: 915cd30cdea8 ("netfs, fscache: Combine fscache with netfs")
Cc: stable@vger.kernel.org
Reported-by: Diederik de Haas <didi.debian@cknow.org>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs: fix incorrect error handling in LOCALIO
Mike Snitzer [Mon, 13 Jan 2025 22:29:03 +0000 (17:29 -0500)] 
nfs: fix incorrect error handling in LOCALIO

nfs4_stat_to_errno() expects a NFSv4 error code as an argument and
returns a POSIX errno.

The problem is LOCALIO is passing nfs4_stat_to_errno() the POSIX errno
return values from filp->f_op->read_iter(), filp->f_op->write_iter()
and vfs_fsync_range().

So the POSIX errno that nfs_local_pgio_done() and
nfs_local_commit_done() are passing to nfs4_stat_to_errno() are
failing to match any NFSv4 error code, which results in
nfs4_stat_to_errno() defaulting to returning -EREMOTEIO. This causes
assertions in upper layers due to -EREMOTEIO not being a valid NFSv4
error code.

Fix this by updating nfs_local_pgio_done() and nfs_local_commit_done()
to use the new nfs_localio_errno_to_nfs4_stat() to map a POSIX errno
to an NFSv4 error code.

Care was taken to factor out nfs4_errtbl_common[] to avoid duplicating
the same NFS error to errno table. nfs4_errtbl_common[] is checked
first by both nfs4_stat_to_errno and nfs_localio_errno_to_nfs4_stat
before they check their own more specialized tables (nfs4_errtbl[] and
nfs4_errtbl_localio[] respectively).

While auditing the associated error mapping tables, the (ab)use of -1
for the last table entry was removed in favor of using ARRAY_SIZE to
iterate the nfs_errtbl[] and nfs4_errtbl[]. And 'errno_NFSERR_IO' was
removed because it caused needless obfuscation.

Fixes: 70ba381e1a431 ("nfs: add LOCALIO support")
Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs: probe for LOCALIO when v3 client reconnects to server
Mike Snitzer [Sat, 16 Nov 2024 01:41:06 +0000 (20:41 -0500)] 
nfs: probe for LOCALIO when v3 client reconnects to server

Re-enabling NFSv3 LOCALIO is made more complex (than NFSv4) because v3
is stateless.  As such, the hueristic used to identify a LOCALIO probe
point is more adhoc by nature: if/when NFSv3 client IO begins to
complete again in terms of normal RPC-based NFSv3 server IO, attempt
nfs_local_probe_async().

Care is taken to throttle the frequency of nfs_local_probe_async(),
otherwise there could be a flood of repeat calls to
nfs_local_probe_async().

The throttle is admin controlled using a new module parameter for
nfsv3, e.g.:
  echo 512 > /sys/module/nfsv3/parameters/nfs3_localio_probe_throttle

Probe for NFSv3 LOCALIO every N IO requests (512 in this case). Must
be power-of-2, defaults to 0 (probing disabled).

On systems that expect to use LOCALIO with NFSv3 the admin should
configure the 'nfs3_localio_probe_throttle' module parameter.

This commit backfills module parameter documentation in localio.rst

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs: probe for LOCALIO when v4 client reconnects to server
Mike Snitzer [Sat, 16 Nov 2024 01:41:05 +0000 (20:41 -0500)] 
nfs: probe for LOCALIO when v4 client reconnects to server

Introduce nfs_local_probe_async() for the NFS client to initiate
if/when it reconnects with server. For NFSv4 it is a simple matter to
call nfs_local_probe_async() from nfs4_do_reclaim (during NFSv4
grace).

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs/localio: remove redundant code and simplify LOCALIO enablement
Mike Snitzer [Sat, 16 Nov 2024 01:41:04 +0000 (20:41 -0500)] 
nfs/localio: remove redundant code and simplify LOCALIO enablement

Remove nfs_local_enable and nfs_local_disable, instead use
nfs_localio_enable_client and nfs_localio_disable_client.

Discontinue use of the NFS_CS_LOCAL_IO bit in the nfs_client struct's
cl_flags to reflect that LOCALIO is enabled; instead just test if the
net member of the nfs_uuid_t struct is set.

Also remove NFS_CS_LOCAL_IO.

Lastly, remove trace_nfs_local_enable and trace_nfs_local_disable
because comparable traces are available from nfs_localio.ko.

Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs_common: add nfs_localio trace events
Mike Snitzer [Sat, 16 Nov 2024 01:41:03 +0000 (20:41 -0500)] 
nfs_common: add nfs_localio trace events

The nfs_localio.ko now exposes /sys/kernel/tracing/events/nfs_localio
with nfs_localio_enable_client and nfs_localio_disable_client events.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs_common: track all open nfsd_files per LOCALIO nfs_client
Mike Snitzer [Sat, 16 Nov 2024 01:41:02 +0000 (20:41 -0500)] 
nfs_common: track all open nfsd_files per LOCALIO nfs_client

This tracking enables __nfsd_file_cache_purge() to call
nfs_localio_invalidate_clients(), upon shutdown or export change, to
nfs_close_local_fh() all open nfsd_files that are still cached by the
LOCALIO nfs clients associated with nfsd_net that is being shutdown.

Now that the client must track all open nfsd_files there was more work
than necessary being done with the global nfs_uuids_lock contended.
This manifested in various RCU issues, e.g.:
 hrtimer: interrupt took 47969440 ns
 rcu: INFO: rcu_sched detected stalls on CPUs/tasks:

Use nfs_uuid->lock to protect all nfs_uuid_t members, instead of
nfs_uuids_lock, once nfs_uuid_is_local() adds the client to
nn->local_clients.

Also add 'local_clients_lock' to 'struct nfsd_net' to protect
nn->local_clients.  And store a pointer to spinlock in the 'list_lock'
member of nfs_uuid_t so nfs_localio_disable_client() can use it to
avoid taking the global nfs_uuids_lock.

In combination, these split out locks eliminate the use of the single
nfslocalio.c global nfs_uuids_lock in the IO paths (open and close).

Also refactored associated fs/nfs_common/nfslocalio.c methods' locking
to reduce work performed with spinlocks held in general.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock
Mike Snitzer [Sat, 16 Nov 2024 01:41:01 +0000 (20:41 -0500)] 
nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock

This global spinlock protects all nfs_uuid_t relative to the global
nfs_uuids list.  A later commit will split this global spinlock so
prepare by renaming this lock to reflect its intended narrow scope.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file
Mike Snitzer [Sat, 16 Nov 2024 01:41:00 +0000 (20:41 -0500)] 
nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file

Now that LOCALIO no longer leans on NFSD's filecache for caching open
files (and instead uses NFS client-side open nfsd_file caching) there
is no need to use NFSD filecache's GC feature.  Avoiding GC will speed
up nfsd_file initial opens.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_
Mike Snitzer [Sat, 16 Nov 2024 01:40:59 +0000 (20:40 -0500)] 
nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_

Also update Documentation/filesystems/nfs/localio.rst accordingly
and reduce the technical documentation debt that was previously
captured in that document.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfsd: update percpu_ref to manage references on nfsd_net
Mike Snitzer [Sat, 16 Nov 2024 01:40:58 +0000 (20:40 -0500)] 
nfsd: update percpu_ref to manage references on nfsd_net

Holding a reference on nfsd_net is what is required, it was never
actually about ensuring nn->nfsd_serv available.

Move waiting for outstanding percpu references from
nfsd_destroy_serv() to nfsd_shutdown_net().

By moving it later it will be possible to invalidate localio clients
during nfsd_file_cache_shutdown_net() via __nfsd_file_cache_purge().

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs: cache all open LOCALIO nfsd_file(s) in client
Mike Snitzer [Sat, 16 Nov 2024 01:40:57 +0000 (20:40 -0500)] 
nfs: cache all open LOCALIO nfsd_file(s) in client

This commit switches from leaning heavily on NFSD's filecache (in
terms of GC'd nfsd_files) back to caching nfsd_files in the
client. A later commit will add the callback mechanism needed to
allow NFSD to force the NFS client to cleanup all cached nfsd_files.

Add nfs_fh_localio_init() and 'struct nfs_fh_localio' to cache opened
nfsd_file(s) (both a RO and RW nfsd_file is able to be opened and
cached for a given nfs_fh).

Update nfs_local_open_fh() to cache the nfsd_file once it is opened
using __nfs_local_open_fh().

Introduce nfs_close_local_fh() to clear the cached open nfsd_files and
call nfs_to_nfsd_file_put_local().

Refcounting is such that:
- nfs_local_open_fh() is paired with nfs_close_local_fh().
- __nfs_local_open_fh() is paired with nfs_to_nfsd_file_put_local().
- nfs_local_file_get() is paired with nfs_local_file_put().

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs_common: move localio_lock to new lock member of nfs_uuid_t
Mike Snitzer [Sat, 16 Nov 2024 01:40:56 +0000 (20:40 -0500)] 
nfs_common: move localio_lock to new lock member of nfs_uuid_t

Remove cl_localio_lock from 'struct nfs_client' in favor of adding a
lock to the nfs_uuid_t struct (which is embedded in each nfs_client).

Push nfs_local_{enable,disable} implementation down to nfs_common.
Those methods now call nfs_localio_{enable,disable}_client.

This allows implementing nfs_localio_invalidate_clients in terms of
nfs_localio_disable_client.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs_common: rename functions that invalidate LOCALIO nfs_clients
Mike Snitzer [Sat, 16 Nov 2024 01:40:55 +0000 (20:40 -0500)] 
nfs_common: rename functions that invalidate LOCALIO nfs_clients

Rename nfs_uuid_invalidate_one_client to nfs_localio_disable_client.
Rename nfs_uuid_invalidate_clients to nfs_localio_invalidate_clients.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfsd: add nfsd_file_{get,put} to 'nfs_to' nfsd_localio_operations
Mike Snitzer [Sat, 16 Nov 2024 01:40:54 +0000 (20:40 -0500)] 
nfsd: add nfsd_file_{get,put} to 'nfs_to' nfsd_localio_operations

In later a commit LOCALIO must call both nfsd_file_get and
nfsd_file_put to manage extra nfsd_file references.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agonfs/localio: add direct IO enablement with sync and async IO support
Mike Snitzer [Sat, 16 Nov 2024 01:40:53 +0000 (20:40 -0500)] 
nfs/localio: add direct IO enablement with sync and async IO support

This commit simply adds the required O_DIRECT plumbing.  It doesn't
address the fact that NFS doesn't ensure all writes are page aligned
(nor device logical block size aligned as required by O_DIRECT).

Because NFS will read-modify-write for IO that isn't aligned, LOCALIO
will not use O_DIRECT semantics by default if/when an application
requests the use of O_DIRECT.  Allow the use of O_DIRECT semantics by:
1: Adding a flag to the nfs_pgio_header struct to allow the NFS
   O_DIRECT layer to signal that O_DIRECT was used by the application
2: Adding a 'localio_O_DIRECT_semantics' NFS module parameter that
   when enabled will cause LOCALIO to use O_DIRECT semantics (this may
   cause IO to fail if applications do not properly align their IO).

This commit is derived from code developed by Weston Andros Adamson.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoNFS: Fix potential buffer overflowin nfs_sysfs_link_rpc_client()
Zichen Xie [Tue, 17 Dec 2024 16:13:12 +0000 (00:13 +0800)] 
NFS: Fix potential buffer overflowin nfs_sysfs_link_rpc_client()

name is char[64] where the size of clnt->cl_program->name remains
unknown. Invoking strcat() directly will also lead to potential buffer
overflow. Change them to strscpy() and strncat() to fix potential
issues.

Signed-off-by: Zichen Xie <zichenxie0106@gmail.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoSUNRPC: display total RPC tasks for RPC client
Dai Ngo [Tue, 19 Nov 2024 21:43:23 +0000 (13:43 -0800)] 
SUNRPC: display total RPC tasks for RPC client

Display the total number of RPC tasks, including tasks waiting
on workqueue and wait queues, for rpc_clnt.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoSUNRPC: only put task on cl_tasks list after the RPC call slot is reserved.
Dai Ngo [Tue, 19 Nov 2024 21:43:22 +0000 (13:43 -0800)] 
SUNRPC: only put task on cl_tasks list after the RPC call slot is reserved.

Under heavy write load, we've seen the cl_tasks list grows to
millions of entries. Even though the list is extremely long,
the system still runs fine until the user wants to get the
information of all active RPC tasks by doing:

When this happens, tasks_start acquires the cl_lock to walk the
cl_tasks list, returning one entry at a time to the caller. The
cl_lock is held until all tasks on this list have been processed.

While the cl_lock is held, completed RPC tasks have to spin wait
in rpc_task_release_client for the cl_lock. If there are millions
of entries in the cl_tasks list it will take a long time before
tasks_stop is called and the cl_lock is released.

The spin wait tasks can use up all the available CPUs in the system,
preventing other jobs to run, this causes the system to temporarily
lock up.

This patch fixes this problem by delaying inserting the RPC
task on the cl_tasks list until the RPC call slot is reserved.
This limits the length of the cl_tasks to the number of call
slots available in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
5 months agoLinux 6.13-rc7 v6.13-rc7
Linus Torvalds [Sun, 12 Jan 2025 22:37:56 +0000 (14:37 -0800)] 
Linux 6.13-rc7

5 months agoMerge tag 'char-misc-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sun, 12 Jan 2025 22:34:00 +0000 (14:34 -0800)] 
Merge tag 'char-misc-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc/IIO driver fixes from Greg KH:
 "Here are a bunch of small IIO and interconnect and other driver fixes
  to resolve reported issues. Included in here are:

   - loads of iio driver fixes as a result of an audit of places where
    uninitialized data would leak to userspace.

   - other smaller, and normal, iio driver fixes.

   - mhi driver fix

   - interconnect driver fixes

   - pci1xxxx driver fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (32 commits)
  misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config
  misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling
  interconnect: icc-clk: check return values of devm_kasprintf()
  interconnect: qcom: icc-rpm: Set the count member before accessing the flex array
  iio: adc: ti-ads1119: fix sample size in scan struct for triggered buffer
  iio: temperature: tmp006: fix information leak in triggered buffer
  iio: inkern: call iio_device_put() only on mapped devices
  iio: adc: ad9467: Fix the "don't allow reading vref if not available" case
  iio: adc: at91: call input_free_device() on allocated iio_dev
  iio: adc: ad7173: fix using shared static info struct
  iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep()
  iio: adc: ti-ads1119: fix information leak in triggered buffer
  iio: pressure: zpa2326: fix information leak in triggered buffer
  iio: adc: rockchip_saradc: fix information leak in triggered buffer
  iio: imu: kmx61: fix information leak in triggered buffer
  iio: light: vcnl4035: fix information leak in triggered buffer
  iio: light: bh1745: fix information leak in triggered buffer
  iio: adc: ti-ads8688: fix information leak in triggered buffer
  iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
  iio: test: Fix GTS test config
  ...

5 months agoMerge tag 'driver-core-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Jan 2025 22:26:31 +0000 (14:26 -0800)] 
Merge tag 'driver-core-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs fixes from Greg KH:
 "Here are some small driver core and debugfs fixes that resolve some
  reported problems:

   - debugfs runtime error reporting fixes

   - topology cpumask race-condition fix

   - MAINTAINERS file email update

  All of these have been in linux-next this week with no reported
  issues"

* tag 'driver-core-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  fs: debugfs: fix open proxy for unsafe files
  MAINTAINERS: align Danilo's maintainer entries
  topology: Keep the cpumask unchanged when printing cpumap
  debugfs: fix missing mutex_destroy() in short_fops case
  fs: debugfs: differentiate short fops with proxy ops

5 months agoMerge tag 'staging-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 12 Jan 2025 22:22:13 +0000 (14:22 -0800)] 
Merge tag 'staging-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are some small staging driver fixes that resolve some reported
  issues and have been in my tree for too long due to the holiday break.
  They resolve the following issues:

   - lots of gpib build-time fixes as reported by testers and 0-day

   - gpib logical fixes

   - mailmap fix

  All of these have been in linux-next for a while, with no reported
  issues other than the duplicated change"

* tag 'staging-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: gpib: mite: remove unused global functions
  staging: gpib: refer to correct config symbol in tnt4882 Makefile
  mailmap: update Bingwu Zhang's email address
  staging: gpib: fix address space mixup
  staging: gpib: use ioport_map
  staging: gpib: fix pcmcia dependencies
  staging: gpib: add module author and description fields
  staging: gpib: fix Makefiles
  staging: gpib: make global 'usec_diff' functions static
  staging: gpib: Modify mismatched function name
  staging: gpib: Add lower bound check for secondary address
  staging: gpib: Fix erroneous removal of blank before newline

5 months agoMerge tag 'tty-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 12 Jan 2025 21:27:15 +0000 (13:27 -0800)] 
Merge tag 'tty-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull serial driver fixes from Greg KH:
 "Here are three small serial driver fixes tree. They resolve some
  reported issues:

   - stm32 break control fix

   - 8250 runtime pm usage counter fix

   - imx driver locking fix

  All have been in my tree and linux-next for three weeks now, with no
  reported issues"

* tag 'tty-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: stm32: use port lock wrappers for break control
  serial: imx: Use uart_port_lock_irq() instead of uart_port_lock()
  tty: serial: 8250: Fix another runtime PM usage counter underflow

5 months agoMerge tag 'usb-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 12 Jan 2025 21:09:00 +0000 (13:09 -0800)] 
Merge tag 'usb-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 6.13-rc7.
  Included in here are:

   - usb serial new device ids

   - typec bugfixes for reported issues

   - dwc3 driver fixes

   - chipidea driver fixes

   - gadget driver fixes

   - other minor fixes for reported problems.

  All of these have been in linux-next for a while, with no reported
  issues"

* tag 'usb-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: option: add Neoway N723-EA support
  USB: serial: option: add MeiG Smart SRM815
  USB: serial: cp210x: add Phoenix Contact UPS Device
  usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()
  usb-storage: Add max sectors quirk for Nokia 208
  usb: gadget: midi2: Reverse-select at the right place
  usb: gadget: f_fs: Remove WARN_ON in functionfs_bind
  USB: core: Disable LPM only for non-suspended ports
  usb: fix reference leak in usb_new_device()
  usb: typec: tcpci: fix NULL pointer issue on shared irq case
  usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null
  usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe()
  usb: typec: ucsi: Set orientation as none when connector is unplugged
  usb: gadget: configfs: Ignore trailing LF for user strings to cdev
  USB: usblp: return error when setting unsupported protocol
  usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints
  usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm()
  usb: host: xhci-plat: set skip_phy_initialization if software node has XHCI_SKIP_PHY_INIT property
  usb: dwc3-am62: Disable autosuspend during remove
  usb: dwc3: gadget: fix writing NYET threshold

5 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 12 Jan 2025 20:04:53 +0000 (12:04 -0800)] 
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "The largest part here is for KVM/PPC, where a NULL pointer dereference
  was introduced in the 6.13 merge window and is now fixed.

  There's some "holiday-induced lateness", as the s390 submaintainer put
  it, but otherwise things looks fine.

  s390:

   - fix a latent bug when the kernel is compiled in debug mode

   - two small UCONTROL fixes and their selftests

  arm64:

   - always check page state in hyp_ack_unshare()

   - align set_id_regs selftest with the fact that ASIDBITS field is RO

   - various vPMU fixes for bugs that only affect nested virt

  PPC e500:

   - Fix a mostly impossible (but just wrong) case where IRQs were never
     re-enabled

   - Observe host permissions instead of mapping readonly host pages as
     guest-writable. This fixes a NULL-pointer dereference in 6.13

   - Replace brittle VMA-based attempts at building huge shadow TLB
     entries with PTE lookups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: e500: perform hugepage check after looking up the PFN
  KVM: e500: map readonly host pages for read
  KVM: e500: track host-writability of pages
  KVM: e500: use shadow TLB entry as witness for writability
  KVM: e500: always restore irqs
  KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest
  KVM: s390: selftests: Add ucontrol gis routing test
  KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
  KVM: s390: selftests: Add ucontrol flic attr selftests
  KVM: s390: Reject setting flic pfault attributes on ucontrol VMs
  KVM: s390: vsie: fix virtual/physical address in unpin_scb()
  KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters
  KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change
  KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change
  KVM: arm64: Add unified helper for reprogramming counters by mask
  KVM: arm64: Always check the state from hyp_ack_unshare()
  KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable

5 months agoMerge tag 'perf_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Jan 2025 19:57:45 +0000 (11:57 -0800)] 
Merge tag 'perf_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - Fix a #GP in the perf user callchain code caused by a race between
   uprobe freeing the task and the bpf profiler unwinding the task's
   user stack

* tag 'perf_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  uprobes: Fix race in uprobe_free_utask

5 months agoMerge tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Jan 2025 19:55:48 +0000 (11:55 -0800)] 
Merge tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Check whether shadow stack is active before using the ptrace regset
   getter

 - Remove a wrong BUG_ON in the early static call code which breaks Xen
   PVH when booting as dom0

* tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Ensure shadow stack is active before "getting" registers
  x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0

5 months agoMerge tag 'kvm-s390-master-6.13-1' of https://git.kernel.org/pub/scm/linux/kernel...
Paolo Bonzini [Sun, 12 Jan 2025 11:51:05 +0000 (12:51 +0100)] 
Merge tag 'kvm-s390-master-6.13-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: three small bugfixes

Fix a latent bug when the kernel is compiled in debug mode.
Two small UCONTROL fixes and their selftests.

5 months agoMerge tag 'kvmarm-fixes-6.13-3' of https://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Sun, 12 Jan 2025 11:50:39 +0000 (12:50 +0100)] 
Merge tag 'kvmarm-fixes-6.13-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 changes for 6.13, part #3

 - Always check page state in hyp_ack_unshare()

 - Align set_id_regs selftest with the fact that ASIDBITS field is RO

 - Various vPMU fixes for bugs that only affect nested virt

5 months agoMerge branch 'kvm-e500-check-writable-pfn' into HEAD
Paolo Bonzini [Sun, 12 Jan 2025 11:48:14 +0000 (12:48 +0100)] 
Merge branch 'kvm-e500-check-writable-pfn' into HEAD

The new __kvm_faultin_pfn() function is upset by the fact that e500
KVM ignores host page permissions - __kvm_faultin requires a "writable"
outgoing argument, but e500 KVM is passing NULL.

While a simple fix would be possible that simply allows writable to
be NULL, it is quite ugly to have e500 KVM ignore completely the host
permissions and map readonly host pages as guest-writable.  Merge a more
complete fix and remove the VMA-based attempts at building huge shadow TLB
entries.  Using a PTE lookup, similar to what is done for x86, is better
and works with remap_pfn_range() because it does not assume that VM_PFNMAP
areas are contiguous.  Note that the same incorrect logic is there in
ARM's get_vma_page_shift() and RISC-V's kvm_riscv_gstage_ioremap().

Fortunately, for e500 most of the code is already there; it just has to
be changed to compute the range from find_linux_pte()'s output rather
than find_vma().  The new code works for both VM_PFNMAP and hugetlb
mappings, so the latter is removed.

Patches 2-5 were tested by the reporter, Christian Zigotzky.  Since
the difference with v1 is minimal, I am going to send it to Linus
today.

5 months agoKVM: e500: perform hugepage check after looking up the PFN
Paolo Bonzini [Wed, 8 Jan 2025 15:49:50 +0000 (16:49 +0100)] 
KVM: e500: perform hugepage check after looking up the PFN

e500 KVM tries to bypass __kvm_faultin_pfn() in order to map VM_PFNMAP
VMAs as huge pages.  This is a Bad Idea because VM_PFNMAP VMAs could
become noncontiguous as a result of callsto remap_pfn_range().

Instead, use the already existing host PTE lookup to retrieve a
valid host-side mapping level after __kvm_faultin_pfn() has
returned.  Then find the largest size that will satisfy the
guest's request while staying within a single host PTE.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 months agoKVM: e500: map readonly host pages for read
Paolo Bonzini [Wed, 8 Jan 2025 15:14:55 +0000 (16:14 +0100)] 
KVM: e500: map readonly host pages for read

The new __kvm_faultin_pfn() function is upset by the fact that e500 KVM
ignores host page permissions - __kvm_faultin requires a "writable"
outgoing argument, but e500 KVM is nonchalantly passing NULL.

If the host page permissions do not include writability, the shadow
TLB entry is forcibly mapped read-only.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 months agoKVM: e500: track host-writability of pages
Paolo Bonzini [Wed, 8 Jan 2025 15:21:38 +0000 (16:21 +0100)] 
KVM: e500: track host-writability of pages

Add the possibility of marking a page so that the UW and SW bits are
force-cleared.  This is stored in the private info so that it persists
across multiple calls to kvmppc_e500_setup_stlbe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 months agoKVM: e500: use shadow TLB entry as witness for writability
Paolo Bonzini [Wed, 8 Jan 2025 15:19:28 +0000 (16:19 +0100)] 
KVM: e500: use shadow TLB entry as witness for writability

kvmppc_e500_ref_setup is returning whether the guest TLB entry is writable,
which is than passed to kvm_release_faultin_page.  This makes little sense
for two reasons: first, because the function sets up the private data for
the page and the return value feels like it has been bolted on the side;
second, because what really matters is whether the _shadow_ TLB entry is
writable.  If it is not writable, the page can be released as non-dirty.
Shift from using tlbe_is_writable(gtlbe) to doing the same check on
the shadow TLB entry.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 months agoKVM: e500: always restore irqs
Paolo Bonzini [Sun, 12 Jan 2025 09:34:44 +0000 (10:34 +0100)] 
KVM: e500: always restore irqs

If find_linux_pte fails, IRQs will not be restored.  This is unlikely
to happen in practice since it would have been reported as hanging
hosts, but it should of course be fixed anyway.

Cc: stable@vger.kernel.org
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 months agoMerge tag 'probes-fixes-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Jan 2025 04:34:12 +0000 (20:34 -0800)] 
Merge tag 'probes-fixes-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:
 "Fix to free trace_kprobe objects at a failure path in
  __trace_kprobe_create() function. This fixes a memory leak"

* tag 'probes-fixes-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/kprobes: Fix to free objects when failed to copy a symbol

5 months agoMerge tag 'hwmon-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 11 Jan 2025 19:42:48 +0000 (11:42 -0800)] 
Merge tag 'hwmon-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fix from Guenter Roeck:
 "One patch to fix error handling in drivetemp driver"

* tag 'hwmon-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur

5 months agoMerge tag 'block-6.13-20250111' of git://git.kernel.dk/linux
Linus Torvalds [Sat, 11 Jan 2025 19:17:08 +0000 (11:17 -0800)] 
Merge tag 'block-6.13-20250111' of git://git.kernel.dk/linux

Pull block fix from Jens Axboe:
 "A single fix for a use-after-free in the BFQ IO scheduler"

* tag 'block-6.13-20250111' of git://git.kernel.dk/linux:
  block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()

5 months agoMerge tag 'io_uring-6.13-20250111' of git://git.kernel.dk/linux
Linus Torvalds [Sat, 11 Jan 2025 18:59:43 +0000 (10:59 -0800)] 
Merge tag 'io_uring-6.13-20250111' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Fix for multishot timeout updates only using the updated value for
   the first invocation, not subsequent ones

 - Silence a false positive lockdep warning

 - Fix the eventfd signaling and putting RCU logic

 - Fix fault injected SQPOLL setup not clearing the task pointer in the
   error path

 - Fix local task_work looking at the SQPOLL thread rather than just
   signaling the safe variant. Again one of those theoretical issues,
   which should be closed up none the less.

* tag 'io_uring-6.13-20250111' of git://git.kernel.dk/linux:
  io_uring: don't touch sqd->thread off tw add
  io_uring/sqpoll: zero sqd->thread on tctx errors
  io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period
  io_uring: silence false positive warnings
  io_uring/timeout: fix multishot updates

5 months agoMerge tag '6.13-rc6-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 11 Jan 2025 18:49:50 +0000 (10:49 -0800)] 
Merge tag '6.13-rc6-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:

 - fix unneeded session setup retry due to stale password e.g. for DFS
   automounts

* tag '6.13-rc6-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: sync the root session and superblock context passwords before automounting

5 months agoMerge tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sat, 11 Jan 2025 18:42:05 +0000 (10:42 -0800)] 
Merge tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "Over the Christmas break a couple of devicetree fixes came in for
  Rockchips, Qualcomm and NXP/i.MX. These add some missing board
  specific properties, address build time warnings,

  The USB/TOG supoprt on X1 Elite regressed, so two earlier DT changes
  get reverted for now.

  Aside from the devicetree fixes, there is One build fix for the stm32
  firewall driver, and a defconfig change to enable SPDIF support for
  i.MX"

* tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  firewall: remove misplaced semicolon from stm32_firewall_get_firewall
  arm64: dts: rockchip: add hevc power domain clock to rk3328
  arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S
  arm64: dts: qcom: sa8775p: fix the secure device bootup issue
  Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers"
  Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports"
  arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
  Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports"
  ARM: dts: imxrt1050: Fix clocks for mmc
  ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF
  arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
  arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai
  arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B
  arm64: dts: rockchip: add reset-names for combphy on rk3568
  arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions

5 months agoMAINTAINERS: powerpc: Update my status
Michael Ellerman [Fri, 10 Jan 2025 23:57:38 +0000 (10:57 +1100)] 
MAINTAINERS: powerpc: Update my status

Maddy is taking over the day-to-day maintenance of powerpc. I will still
be around to help, and as a backup.

Re-order the main POWERPC list to put Maddy first to reflect that.

KVM/powerpc patches will be handled by Maddy via the powerpc tree with
review from Nick, so replace myself with Maddy there.

Remove myself from BPF, leaving Hari & Christophe as maintainers.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 months agosmb: client: sync the root session and superblock context passwords before automounting
Meetakshi Setiya [Wed, 8 Jan 2025 10:10:34 +0000 (05:10 -0500)] 
smb: client: sync the root session and superblock context passwords before automounting

In some cases, when password2 becomes the working password, the
client swaps the two password fields in the root session struct, but
not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit
fs context from their parent mounts. Therefore, they might end up
getting the passwords in the stale order.
The automount should succeed, because the mount function will end up
retrying with the actual password anyway. But to reduce these
unnecessary session setup retries for automounts, we can sync the
parent context's passwords with the root session's passwords before
duplicating it to the child's fs context.

Cc: stable@vger.kernel.org
Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
5 months agoMerge tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 23:11:58 +0000 (15:11 -0800)] 
Merge tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix corner case bug where ops.dispatch() couldn't extend the
   execution of the current task if SCX_OPS_ENQ_LAST is set.

 - Fix ops.cpu_release() not being called when a SCX task is preempted
   by a higher priority sched class task.

 - Fix buitin idle mask being incorrectly left as busy after an idle CPU
   is picked and kicked.

 - scx_ops_bypass() was unnecessarily using rq_lock() which comes with
   rq pinning related sanity checks which could trigger spuriously.
   Switch to raw_spin_rq_lock().

* tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: idle: Refresh idle masks during idle-to-idle transitions
  sched_ext: switch class when preempted by higher priority scheduler
  sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass()
  sched_ext: keep running prev when prev->scx.slice != 0

5 months agoMerge tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 23:03:02 +0000 (15:03 -0800)] 
Merge tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
 "Cpuset fixes:

   - Fix isolated CPUs leaking into sched domains

   - Remove now unnecessary kernfs active break which can trigger a
     warning

   - Comment updates"

* tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: remove kernfs active break
  cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains
  cgroup/cpuset: Remove stale text

5 months agoMerge tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 10 Jan 2025 22:52:30 +0000 (14:52 -0800)] 
Merge tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:

 - Add a WARN_ON_ONCE() on queue_delayed_work_on() on an offline CPU as
   such work items won't get executed till the CPU comes back online

* tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: warn if delayed_work is queued to an offlined cpu.

5 months agoMerge tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 10 Jan 2025 22:46:49 +0000 (14:46 -0800)] 
Merge tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
 "Fix an OF node leak in the code parsing thermal zone DT properties
  (Joe Hattori)"

* tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: of: fix OF node leak in of_thermal_zone_find()

5 months agoMerge tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 10 Jan 2025 22:41:46 +0000 (14:41 -0800)] 
Merge tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "Add two more ACPI IRQ override quirks and update the code using them
  to avoid unnecessary overhead (Hans de Goede)"

* tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: resource: acpi_dev_irq_override(): Check DMI match last
  ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
  ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]

5 months agosched_ext: idle: Refresh idle masks during idle-to-idle transitions
Andrea Righi [Fri, 10 Jan 2025 22:16:31 +0000 (23:16 +0100)] 
sched_ext: idle: Refresh idle masks during idle-to-idle transitions

With the consolidation of put_prev_task/set_next_task(), see
commit 436f3eed5c69 ("sched: Combine the last put_prev_task() and the
first set_next_task()"), we are now skipping the transition between
these two functions when the previous and the next tasks are the same.

As a result, the scx idle state of a CPU is updated only when
transitioning to or from the idle thread. While this is generally
correct, it can lead to uneven and inefficient core utilization in
certain scenarios [1].

A typical scenario involves proactive wake-ups: scx_bpf_pick_idle_cpu()
selects and marks an idle CPU as busy, followed by a wake-up via
scx_bpf_kick_cpu(), without dispatching any tasks. In this case, the CPU
continues running the idle thread, returns to idle, but remains marked
as busy, preventing it from being selected again as an idle CPU (until a
task eventually runs on it and releases the CPU).

For example, running a workload that uses 20% of each CPU, combined with
an scx scheduler using proactive wake-ups, results in the following core
utilization:

 CPU 0: 25.7%
 CPU 1: 29.3%
 CPU 2: 26.5%
 CPU 3: 25.5%
 CPU 4:  0.0%
 CPU 5: 25.5%
 CPU 6:  0.0%
 CPU 7: 10.5%

To address this, refresh the idle state also in pick_task_idle(), during
idle-to-idle transitions, but only trigger ops.update_idle() on actual
state changes to prevent unnecessary updates to the scx scheduler and
maintain balanced state transitions.

With this change in place, the core utilization in the previous example
becomes the following:

 CPU 0: 18.8%
 CPU 1: 19.4%
 CPU 2: 18.0%
 CPU 3: 18.7%
 CPU 4: 19.3%
 CPU 5: 18.9%
 CPU 6: 18.7%
 CPU 7: 19.3%

[1] https://github.com/sched-ext/scx/pull/1139

Fixes: 7c65ae81ea86 ("sched_ext: Don't call put_prev_task_scx() before picking the next task")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
5 months agoio_uring: don't touch sqd->thread off tw add
Pavel Begunkov [Fri, 10 Jan 2025 20:36:45 +0000 (20:36 +0000)] 
io_uring: don't touch sqd->thread off tw add

With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task,
which means that req->task should always match sqd->thread. Since
accesses to sqd->thread should be separately protected, use req->task
in io_req_normal_work_add() instead.

Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct
is always pinned and alive, and sqd->thread can either be the task or
NULL. It's only problematic if the compiler decides to reload the value
after the null check, which is not so likely.

Cc: stable@vger.kernel.org
Cc: Bui Quang Minh <minhquangbui99@gmail.com>
Reported-by: lizetao <lizetao1@huawei.com>
Fixes: 78f9b61bd8e54 ("io_uring: wake SQPOLL task when task_work is added to an empty queue")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.1736541357.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 months agoio_uring/sqpoll: zero sqd->thread on tctx errors
Pavel Begunkov [Fri, 10 Jan 2025 14:31:23 +0000 (14:31 +0000)] 
io_uring/sqpoll: zero sqd->thread on tctx errors

Syzkeller reports:

BUG: KASAN: slab-use-after-free in thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341
Read of size 8 at addr ffff88803578c510 by task syz.2.3223/27552
 Call Trace:
  <TASK>
  ...
  kasan_report+0x143/0x180 mm/kasan/report.c:602
  thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341
  thread_group_cputime_adjusted+0xa6/0x340 kernel/sched/cputime.c:639
  getrusage+0x1000/0x1340 kernel/sys.c:1863
  io_uring_show_fdinfo+0xdfe/0x1770 io_uring/fdinfo.c:197
  seq_show+0x608/0x770 fs/proc/fd.c:68
  ...

That's due to sqd->task not being cleared properly in cases where
SQPOLL task tctx setup fails, which can essentially only happen with
fault injection to insert allocation errors.

Cc: stable@vger.kernel.org
Fixes: 1251d2025c3e1 ("io_uring/sqpoll: early exit thread if task_context wasn't allocated")
Reported-by: syzbot+3d92cfcfa84070b0a470@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/efc7ec7010784463b2e7466d7b5c02c2cb381635.1736519461.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 months agoMerge tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 10 Jan 2025 20:35:46 +0000 (12:35 -0800)] 
Merge tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Regular weekly fixes, this has the usual amdgpu/xe/i915 bits.

  There is a bigger bunch of mediatek patches that I considered not
  including at this stage, but all the changes (except for one were
  obvious small fixes, and the rotation one is a few lines, and I
  suppose will help someone have their screen up the right way), I
  decided to include it since I expect it got slowed down by holidays
  etc, and it's not that mainstream a hw platform.

  i915:
   - Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from
     check_link"

  amdgpu:
   - Display interrupt fixes
   - Fix display max surface mismatches
   - Fix divide error in DM plane scale calcs
   - Display divide by 0 checks in dml helpers
   - SMU 13 AD/DC interrrupt handling fix
   - Fix locking around buddy trim handling

  amdkfd:
   - Fix page fault with shader debugger enabled
   - Fix eviction fence wq handling

  xe:
   - Avoid a NULL ptr deref when wedging
   - Fix power gate sequence on DG1

  mediatek:
   - Revert "drm/mediatek: dsi: Correct calculation formula of PHY
     Timing"
   - Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind
     returns err
   - Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb()
   - Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
   - Add support for 180-degree rotation in the display driver
   - Stop selecting foreign drivers
   - Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()"
   - Fix YCbCr422 color format issue for DP
   - Fix mode valid issue for dp
   - dp: Reference common DAI properties
   - dsi: Add registers to pdata to fix MT8186/MT8188
   - Remove unneeded semicolon
   - Add return value check when reading DPCD
   - Initialize pointer in mtk_drm_of_ddp_path_build_one()"

* tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
  drm/xe/dg1: Fix power gate sequence.
  drm/xe: Fix tlb invalidation when wedging
  Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link"
  drm/amdgpu: Add a lock when accessing the buddy trim function
  drm/amd/pm:  fix BUG: scheduling while atomic
  drm/amdkfd: wq_release signals dma_fence only when available
  drm/amd/display: Add check for granularity in dml ceil/floor helpers
  drm/amdkfd: fixed page fault when enable MES shader debugger
  drm/amd/display: fix divide error in DM plane scale calcs
  drm/amd/display: increase MAX_SURFACES to the value supported by hw
  drm/amd/display: fix page fault due to max surface definition mismatch
  drm/amd/display: Remove unnecessary amdgpu_irq_get/put
  drm/mediatek: Initialize pointer in mtk_drm_of_ddp_path_build_one()
  drm/mediatek: Add return value check when reading DPCD
  drm/mediatek: Remove unneeded semicolon
  drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188
  dt-bindings: display: mediatek: dp: Reference common DAI properties
  drm/mediatek: Fix mode valid issue for dp
  drm/mediatek: Fix YCbCr422 color format issue for DP
  Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()"
  ...

5 months agoMerge tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 18:50:30 +0000 (10:50 -0800)] 
Merge tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - a handful of selftest fixes

 - fix a memory leak in relocation processing during module loading

 - avoid sleeping in die()

 - fix kprobe instruction slot address calculations

 - fix DT node reference leak in SBI idle probing

 - avoid initializing out of bounds pages on sparse vmemmap systems with
   a gap at the start of their physical memory map

 - fix backtracing through exceptions

 - _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y

 - local labels in entry.S are now marked with ".L", which prevents them
   from trashing backtraces

 - a handful of fixes for SBI-based performance counters

* tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data
  tools: selftests: riscv: Add test count for vstate_prctl
  tools: selftests: riscv: Add pass message for v_initval_nolibc
  riscv: use local label names instead of global ones in assembly
  riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
  riscv: stacktrace: fix backtracing through exceptions
  riscv: mm: Fix the out of bound issue of vmemmap address
  cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
  riscv: kprobes: Fix incorrect address calculation
  riscv: Fix sleeping in invalid context in die()
  riscv: module: remove relocation_head rel_entry member allocation
  riscv: selftests: Fix warnings pointer masking test

5 months agoworkqueue: warn if delayed_work is queued to an offlined cpu.
Imran Khan [Thu, 9 Jan 2025 23:27:11 +0000 (10:27 +1100)] 
workqueue: warn if delayed_work is queued to an offlined cpu.

delayed_work submitted to an offlined cpu, will not get executed,
after the specified delay if the cpu remains offline. If the cpu
never comes online the work will never get executed.
checking for online cpu in __queue_delayed_work, does not sound
like a good idea because to do this reliably we need hotplug lock
and since work may be submitted from atomic contexts, we would
have to use cpus_read_trylock. But if trylock fails we would queue
the work on any cpu and this may not be optimal because our intended
cpu might still be online.

Putting a WARN_ON_ONCE for an already offlined cpu, will indicate users
of queue_delayed_work_on, if they are (wrongly) trying to queue
delayed_work on offlined cpu. Also indicate the problem of using
offlined cpu with queue_delayed_work_on, in its description.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
5 months agoMerge tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Fri, 10 Jan 2025 17:20:46 +0000 (18:20 +0100)] 
Merge tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes

Fixed card-detect on one board and some missing properties added.

* tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: add hevc power domain clock to rk3328
  arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S
  arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B
  arm64: dts: rockchip: add reset-names for combphy on rk3568

Link: https://lore.kernel.org/r/2914560.yaVYbkx8dN@diego
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
5 months agoMerge tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 10 Jan 2025 17:11:11 +0000 (09:11 -0800)] 
Merge tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "afs:

   - Fix the maximum cell name length

   - Fix merge preference rule failure condition

  fuse:

   - Fix fuse_get_user_pages() so it doesn't risk misleading the caller
     to think pages have been allocated when they actually haven't

   - Fix direct-io folio offset and length calculation

  netfs:

   - Fix async direct-io handling

   - Fix read-retry for filesystems that don't provide a
     ->prepare_read() method

  vfs:

   - Prevent truncating 64-bit offsets to 32-bits in iomap

   - Fix memory barrier interactions when polling

   - Remove MNT_ONRB to fix concurrent modification of @mnt->mnt_flags
     leading to MNT_ONRB to not be raised and invalid access to a list
     member"

* tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  poll: kill poll_does_not_wait()
  sock_poll_wait: kill the no longer necessary barrier after poll_wait()
  io_uring_poll: kill the no longer necessary barrier after poll_wait()
  poll_wait: kill the obsolete wait_address check
  poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
  afs: Fix merge preference rule failure condition
  netfs: Fix read-retry for fs with no ->prepare_read()
  netfs: Fix kernel async DIO
  fs: kill MNT_ONRB
  iomap: avoid avoid truncating 64-bit offset to 32 bits
  afs: Fix the maximum cell name length
  fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure
  fuse: fix direct io folio offset and length calculation

5 months agoMerge tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Fri, 10 Jan 2025 17:04:27 +0000 (09:04 -0800)] 
Merge tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:

 - Fix a missing lock while detaching a dquot buffer

 - Fix failure on xfs_update_last_rtgroup_size for !XFS_RT

* tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: lock dquot buffer before detaching dquot from b_li_list
  xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RT

5 months agoMerge tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 16:14:22 +0000 (08:14 -0800)] 
Merge tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:
 "Fixes and new HW support:

   - amd/pmc: Match IRQ1 wakeup disable with the enable on i8042 side

   - intel: power-domains: Clearwater Forest support

   - intel/pmc: Skip SSRAM setup when no additional devices are present

   - ISST: Clearwater Forest support"

* tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: intel/pmc: Fix ioremap() of bad address
  platform/x86: ISST: Add Clearwater Forest to support list
  platform/x86/intel: power-domains: Add Clearwater Forest support
  platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it

5 months agoMerge tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 16:05:32 +0000 (08:05 -0800)] 
Merge tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A couple of fixes for !REGULATOR and !OF configurations, adding
  missing stubs"

* tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Move OF_ API declarations/definitions outside CONFIG_REGULATOR
  regulator: Guard of_regulator_bulk_get_all() with CONFIG_OF

5 months agoMerge tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jan 2025 15:59:47 +0000 (07:59 -0800)] 
Merge tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
 "There's one small fix for real HW - gpio-loongson.

  The rest concern two virtual testing drivers in which some issues were
  recently found and addressed:

   - fix resource leaks in error path in gpio-virtuser (and one
     consistent memory leak triggered on every device removal))

   - fix the use-case of having multiple con_ids in a lookup table in
     gpio-virtuser which has never worked (despite being advertised)

   - don't allow rmdir() on configfs directories when they are in use in
     gpio-sim and gpio-virtuser

   - fix register offsets in gpio-loongson-64"

* tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offset
  gpio: sim: lock up configfs that an instantiated device depends on
  gpio: virtuser: lock up configfs that an instantiated device depends on
  gpio: virtuser: fix handling of multiple conn_ids in lookup table
  gpio: virtuser: fix missing lookup table cleanups

5 months agoMerge tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Fri, 10 Jan 2025 13:59:20 +0000 (14:59 +0100)] 
Merge tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial device ids for 6.13-rc7

Here are some new modem and cp210x device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add Neoway N723-EA support
  USB: serial: option: add MeiG Smart SRM815
  USB: serial: cp210x: add Phoenix Contact UPS Device

5 months agoMerge branch 'vfs-6.14.poll' into vfs.fixes
Christian Brauner [Fri, 10 Jan 2025 11:01:21 +0000 (12:01 +0100)] 
Merge branch 'vfs-6.14.poll' into vfs.fixes

Bring in the fixes for __pollwait() and waitqueue_active() interactions.

Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agoMerge patch series "poll_wait: add mb() to fix theoretical race between waitqueue_act...
Christian Brauner [Fri, 10 Jan 2025 10:59:08 +0000 (11:59 +0100)] 
Merge patch series "poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()"

Oleg Nesterov <oleg@redhat.com> says:

The waitqueue_active() helper can only be used if both waker and waiter
have memory barriers that pair with each other. But __pollwait() is
broken in this respect. Fix it.

* patches from https://lore.kernel.org/r/20250107162649.GA18886@redhat.com:
  poll: kill poll_does_not_wait()
  sock_poll_wait: kill the no longer necessary barrier after poll_wait()
  io_uring_poll: kill the no longer necessary barrier after poll_wait()
  poll_wait: kill the obsolete wait_address check
  poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()

Link: https://lore.kernel.org/r/20250107162649.GA18886@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agopoll: kill poll_does_not_wait()
Oleg Nesterov [Tue, 7 Jan 2025 16:27:43 +0000 (17:27 +0100)] 
poll: kill poll_does_not_wait()

It no longer has users.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162743.GA18947@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agosock_poll_wait: kill the no longer necessary barrier after poll_wait()
Oleg Nesterov [Tue, 7 Jan 2025 16:27:36 +0000 (17:27 +0100)] 
sock_poll_wait: kill the no longer necessary barrier after poll_wait()

Now that poll_wait() provides a full barrier we can remove smp_mb() from
sock_poll_wait().

Also, the poll_does_not_wait() check before poll_wait() just adds the
unnecessary confusion, kill it. poll_wait() does the same "p && p->_qproc"
check.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agoio_uring_poll: kill the no longer necessary barrier after poll_wait()
Oleg Nesterov [Tue, 7 Jan 2025 16:27:30 +0000 (17:27 +0100)] 
io_uring_poll: kill the no longer necessary barrier after poll_wait()

Now that poll_wait() provides a full barrier we can remove smp_rmb() from
io_uring_poll().

In fact I don't think smp_rmb() was correct, it can't serialize LOADs and
STOREs.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162730.GA18940@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agopoll_wait: kill the obsolete wait_address check
Oleg Nesterov [Tue, 7 Jan 2025 16:27:24 +0000 (17:27 +0100)] 
poll_wait: kill the obsolete wait_address check

This check is historical and no longer needed, wait_address is never NULL.
These days we rely on the poll_table->_qproc check. NULL if select/poll
is not going to sleep, or it already has a data to report, or all waiters
have already been registered after the 1st iteration.

However, poll_table *p can be NULL, see p9_fd_poll() for example, so we
can't remove the "p != NULL" check.

Link: https://lore.kernel.org/all/20250106180325.GF7233@redhat.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162724.GA18926@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agopoll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
Oleg Nesterov [Tue, 7 Jan 2025 16:27:17 +0000 (17:27 +0100)] 
poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()

As the comment above waitqueue_active() explains, it can only be used
if both waker and waiter have mb()'s that pair with each other. However
__pollwait() is broken in this respect.

This is not pipe-specific, but let's look at pipe_poll() for example:

poll_wait(...); // -> __pollwait() -> add_wait_queue()

LOAD(pipe->head);
LOAD(pipe->head);

In theory these LOAD()'s can leak into the critical section inside
add_wait_queue() and can happen before list_add(entry, wq_head), in this
case pipe_poll() can race with wakeup_pipe_readers/writers which do

smp_mb();
if (waitqueue_active(wq_head))
wake_up_interruptible(wq_head);

There are more __pollwait()-like functions (grep init_poll_funcptr), and
it seems that at least ep_ptable_queue_proc() has the same problem, so the
patch adds smp_mb() into poll_wait().

Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 months agoxfs: lock dquot buffer before detaching dquot from b_li_list
Darrick J. Wong [Thu, 9 Jan 2025 00:54:02 +0000 (16:54 -0800)] 
xfs: lock dquot buffer before detaching dquot from b_li_list

We have to lock the buffer before we can delete the dquot log item from
the buffer's log item list.

Cc: stable@vger.kernel.org # v6.13-rc3
Fixes: acc8f8628c3737 ("xfs: attach dquot buffer to dquot log item buffer")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
5 months agofs: debugfs: fix open proxy for unsafe files
Johannes Berg [Fri, 10 Jan 2025 07:58:14 +0000 (08:58 +0100)] 
fs: debugfs: fix open proxy for unsafe files

In the previous commit referenced below, I had to split
the short fops handling into different proxy fops. This
necessitated knowing out-of-band whether or not the ops
are short or full, when attempting to convert from fops
to allocated fsdata.

Unfortunately, I only converted full_proxy_open() which
is used for the new full_proxy_open_regular() and
full_proxy_open_short(), but forgot about the call in
open_proxy_open(), used for debugfs_create_file_unsafe().

Fix that, it never has short fops.

Fixes: f8f25893a477 ("fs: debugfs: differentiate short fops with proxy ops")
Reported-by: Suresh Kumar Kurmi <suresh.kumar.kurmi@intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202501101055.bb8bf3e7-lkp@intel.com
Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20250110085826.cd74f3b7a36b.I430c79c82ec3f954c2ff9665753bf6ac9e63eef8@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 months agouprobes: Fix race in uprobe_free_utask
Jiri Olsa [Thu, 9 Jan 2025 14:14:40 +0000 (15:14 +0100)] 
uprobes: Fix race in uprobe_free_utask

Max Makarov reported kernel panic [1] in perf user callchain code.

The reason for that is the race between uprobe_free_utask and bpf
profiler code doing the perf user stack unwind and is triggered
within uprobe_free_utask function:
  - after current->utask is freed and
  - before current->utask is set to NULL

 general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
 RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
 ...
  ? die_addr+0x36/0x90
  ? exc_general_protection+0x217/0x420
  ? asm_exc_general_protection+0x26/0x30
  ? is_uprobe_at_func_entry+0x28/0x80
  perf_callchain_user+0x20a/0x360
  get_perf_callchain+0x147/0x1d0
  bpf_get_stackid+0x60/0x90
  bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
  ? __smp_call_single_queue+0xad/0x120
  bpf_overflow_handler+0x75/0x110
  ...
  asm_sysvec_apic_timer_interrupt+0x1a/0x20
 RIP: 0010:__kmem_cache_free+0x1cb/0x350
 ...
  ? uprobe_free_utask+0x62/0x80
  ? acct_collect+0x4c/0x220
  uprobe_free_utask+0x62/0x80
  mm_release+0x12/0xb0
  do_exit+0x26b/0xaa0
  __x64_sys_exit+0x1b/0x20
  do_syscall_64+0x5a/0x80

It can be easily reproduced by running following commands in
separate terminals:

  # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
  # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'

Fixing this by making sure current->utask pointer is set to NULL
before we start to release the utask object.

[1] https://github.com/grafana/pyroscope/issues/3673

Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")
Reported-by: Max Makarov <maxpain@linux.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250109141440.2692173-1-jolsa@kernel.org
5 months agoMerge tag 'mediatek-drm-fixes-20250104' of https://git.kernel.org/pub/scm/linux/kerne...
Dave Airlie [Fri, 10 Jan 2025 06:57:45 +0000 (16:57 +1000)] 
Merge tag 'mediatek-drm-fixes-20250104' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes

Mediatek DRM Fixes - 20250104

1. Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing"
2. Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
3. Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb()
4. Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
5. Add support for 180-degree rotation in the display driver
6. Stop selecting foreign drivers
7. Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()"
8. Fix YCbCr422 color format issue for DP
9. Fix mode valid issue for dp
10. dp: Reference common DAI properties
11. dsi: Add registers to pdata to fix MT8186/MT8188
12. Remove unneeded semicolon
13. Add return value check when reading DPCD
14. Initialize pointer in mtk_drm_of_ddp_path_build_one()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250104124227.45505-1-chunkuang.hu@kernel.org
5 months agoMerge tag 'drm-xe-fixes-2025-01-09' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 10 Jan 2025 06:41:59 +0000 (16:41 +1000)] 
Merge tag 'drm-xe-fixes-2025-01-09' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Avoid a NULL ptr deref when wedging (Lucas)
- Fix power gate sequence on DG1 (Rodrigo)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z4AcqP3Io_r0pEsR@fedora
5 months agoMerge tag 'amd-drm-fixes-6.13-2025-01-09' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 10 Jan 2025 06:12:25 +0000 (16:12 +1000)] 
Merge tag 'amd-drm-fixes-6.13-2025-01-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.13-2025-01-09:

amdgpu:
- Display interrupt fixes
- Fix display max surface mismatches
- Fix divide error in DM plane scale calcs
- Display divide by 0 checks in dml helpers
- SMU 13 AD/DC interrrupt handling fix
- Fix locking around buddy trim handling

amdkfd:
- Fix page fault with shader debugger enabled
- Fix eviction fence wq handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250109164236.477295-1-alexander.deucher@amd.com
5 months agoMerge tag 'drm-intel-fixes-2025-01-08' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 10 Jan 2025 04:50:19 +0000 (14:50 +1000)] 
Merge tag 'drm-intel-fixes-2025-01-08' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" [hdcp] (Suraj Kandpal)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z37BPchEzY0ovIqF@linux
5 months agoMerge tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Fri, 10 Jan 2025 02:19:59 +0000 (18:19 -0800)] 
Merge tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
 "Four ksmbd server fixes, most also for stable:

   - fix for reporting special file type more accurately when POSIX
     extensions negotiated

   - minor cleanup

   - fix possible incorrect creation path when dirname is not present.
     In some cases, Windows apps create files without checking if they
     exist.

   - fix potential NULL pointer dereference sending interim response"

* tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: Implement new SMB3 POSIX type
  ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked
  ksmbd: Remove unneeded if check in ksmbd_rdma_capable_netdev()
  ksmbd: fix a missing return value check bug

5 months agotracing/kprobes: Fix to free objects when failed to copy a symbol
Masami Hiramatsu (Google) [Thu, 9 Jan 2025 14:29:37 +0000 (23:29 +0900)] 
tracing/kprobes: Fix to free objects when failed to copy a symbol

In __trace_kprobe_create(), if something fails it must goto error block
to free objects. But when strdup() a symbol, it returns without that.
Fix it to goto the error block to free objects correctly.

Link: https://lore.kernel.org/all/173643297743.1514810.2408159540454241947.stgit@devnote2/
Fixes: 6212dd29683e ("tracing/kprobes: Use dyn_event framework for kprobe events")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
5 months agofirewall: remove misplaced semicolon from stm32_firewall_get_firewall
guanjing [Fri, 20 Dec 2024 08:33:35 +0000 (09:33 +0100)] 
firewall: remove misplaced semicolon from stm32_firewall_get_firewall

Remove misplaced colon in stm32_firewall_get_firewall()
which results in a syntax error when the code is compiled
without CONFIG_STM32_FIREWALL.

Fixes: 5c9668cfc6d7 ("firewall: introduce stm32_firewall framework")
Signed-off-by: guanjing <guanjing@cmss.chinamobile.com>
Reviewed-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
5 months agoMerge tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawngu...
Arnd Bergmann [Thu, 9 Jan 2025 21:56:14 +0000 (22:56 +0100)] 
Merge tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 6.13:

- Add fallback for i.MX8QM ESAI compatible to fix a dt-schema warning
  caused by bindings update
- Fix uSDHC1 clock for i.MX RT1050
- Enable SND_SOC_SPDIF in imx_v6_v7_defconfig to fix a regression caused
  by an i.MX6 SPDIF sound card change in DT
- Fix address length of i.MX95 netcmix_blk_ctrl

* tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imxrt1050: Fix clocks for mmc
  ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF
  arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
  arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai

Link: https://lore.kernel.org/r/Z3Jf9zbv/xH3YzuB@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
5 months agoMerge tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 9 Jan 2025 21:54:39 +0000 (22:54 +0100)] 
Merge tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes

Qualcomm Arm64 DeviceTree fixes for v6.13

Revert the enablement of OTG support on primary and secondary USB Type-C
controllers of X1 Elite, for now, as this broke support for USB hotplug.

Disable the TPDM DCC device on SA8775P, as this is inaccessible per
current firmware configuration. Also correct the PCIe "addr_space"
region to enable larger BAR sizes.

Also fix the address space of PCIe6a found in X1 Elite.

* tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: sa8775p: fix the secure device bootup issue
  Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers"
  Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports"
  arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
  Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports"
  arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions

Link: https://lore.kernel.org/r/20250103024945.4649-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
5 months agoMerge tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 9 Jan 2025 20:40:58 +0000 (12:40 -0800)] 
Merge tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, Bluetooth and WPAN.

  No outstanding fixes / investigations at this time.

  Current release - new code bugs:

   - eth: fbnic: revert HWMON support, it doesn't work at all and revert
     is similar size as the fixes

  Previous releases - regressions:

   - tcp: allow a connection when sk_max_ack_backlog is zero

   - tls: fix tls_sw_sendmsg error handling

  Previous releases - always broken:

   - netdev netlink family:
       - prevent accessing NAPI instances from another namespace
       - don't dump Tx and uninitialized NAPIs

   - net: sysctl: avoid using current->nsproxy, fix null-deref if task
     is exiting and stick to opener's netns

   - sched: sch_cake: add bounds checks to host bulk flow fairness
     counts

  Misc:

   - annual cleanup of inactive maintainers"

* tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
  sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
  sctp: sysctl: udp_port: avoid using current->nsproxy
  sctp: sysctl: auth_enable: avoid using current->nsproxy
  sctp: sysctl: rto_min/max: avoid using current->nsproxy
  sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
  mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
  mptcp: sysctl: sched: avoid using current->nsproxy
  mptcp: sysctl: avail sched: remove write access
  MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC
  MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET
  MAINTAINERS: remove Ying Xue from TIPC
  MAINTAINERS: remove Mark Lee from MediaTek Ethernet
  MAINTAINERS: mark stmmac ethernet as an Orphan
  MAINTAINERS: remove Andy Gospodarek from bonding
  MAINTAINERS: update maintainers for Microchip LAN78xx
  MAINTAINERS: mark Synopsys DW XPCS as Orphan
  net/mlx5: Fix variable not being completed when function returns
  rtase: Fix a check for error in rtase_alloc_msix()
  net: stmmac: dwmac-tegra: Read iommu stream id from device tree
  ...

5 months agoMerge tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 9 Jan 2025 18:16:45 +0000 (10:16 -0800)] 
Merge tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes.

  Besides the one-liners in Btrfs there's fix to the io_uring and
  encoded read integration (added in this development cycle). The update
  to io_uring provides more space for the ongoing command that is then
  used in Btrfs to handle some cases.

   - io_uring and encoded read:
       - provide stable storage for io_uring command data
       - make a copy of encoded read ioctl call, reuse that in case the
         call would block and will be called again

   - properly initialize zlib context for hardware compression on s390

   - fix max extent size calculation on filesystems with non-zoned
     devices

   - fix crash in scrub on crafted image due to invalid extent tree"

* tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path
  btrfs: zoned: calculate max_extent_size properly on non-zoned setup
  btrfs: avoid NULL pointer dereference if no valid extent tree
  btrfs: don't read from userspace twice in btrfs_uring_encoded_read()
  io_uring: add io_uring_cmd_get_async_data helper
  io_uring/cmd: add per-op data to struct io_uring_cmd_data
  io_uring/cmd: rename struct uring_cache to io_uring_cmd_data

5 months agoMerge patch series "SBI PMU event related fixes"
Palmer Dabbelt [Thu, 9 Jan 2025 17:37:12 +0000 (09:37 -0800)] 
Merge patch series "SBI PMU event related fixes"

Atish Patra <atishp@rivosinc.com> says:

Here are two minor improvement/fixes in the PMU event path. The first patch
was part of the series[1]. The 2nd patch was suggested during the series
review.

While the series can only be merged once SBI v3.0 is frozen, these two
patches can be independent of SBI v3.0 and can be merged sooner. Hence, these
two patches are sent as a separate series.

* b4-shazam-merge:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data

Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agodrivers/perf: riscv: Do not allow invalid raw event config
Atish Patra [Fri, 13 Dec 2024 00:09:34 +0000 (16:09 -0800)] 
drivers/perf: riscv: Do not allow invalid raw event config

The SBI specification allows only lower 48bits of hpmeventX to be
configured via SBI PMU. Currently, the driver masks of the higher
bits but doesn't return an error. This will lead to an additional
SBI call for config matching which should return for an invalid
event error in most of the cases.

However, if a platform(i.e Rocket and sifive cores) implements a
bitmap of all bits in the event encoding this will lead to an
incorrect event being programmed leading to user confusion.

Report the error to the user if higher bits are set during the
event mapping itself to avoid the confusion and save an additional
SBI call.

Suggested-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-3-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agodrivers/perf: riscv: Return error for default case
Atish Patra [Fri, 13 Dec 2024 00:09:33 +0000 (16:09 -0800)] 
drivers/perf: riscv: Return error for default case

If the upper two bits has an invalid valid (0x1), the event mapping
is not reliable as it returns an uninitialized variable.

Return appropriate value for the default case.

Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agodrivers/perf: riscv: Fix Platform firmware event data
Atish Patra [Fri, 13 Dec 2024 00:09:32 +0000 (16:09 -0800)] 
drivers/perf: riscv: Fix Platform firmware event data

Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.

Fix the platform firmware event data field with proper mask.

Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agoMerge patch series "selftest: fix riscv/vector tests"
Palmer Dabbelt [Thu, 9 Jan 2025 17:35:42 +0000 (09:35 -0800)] 
Merge patch series "selftest: fix riscv/vector tests"

This contains a pair of fixes for the vector self tests, which avoids
some warnings and provides proper status messages.

* b4-shazam-merge:
  tools: selftests: riscv: Add test count for vstate_prctl
  tools: selftests: riscv: Add pass message for v_initval_nolibc

Link: https://lore.kernel.org/r/20241220091730.28006-1-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agotools: selftests: riscv: Add test count for vstate_prctl
Yong-Xuan Wang [Fri, 20 Dec 2024 09:17:27 +0000 (17:17 +0800)] 
tools: selftests: riscv: Add test count for vstate_prctl

Add the test count to drop the warning message.
"Planned tests != run tests (0 != 1)"

Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andy Chiu <AndybnAC@gmail.com>
Link: https://lore.kernel.org/r/20241220091730.28006-3-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agotools: selftests: riscv: Add pass message for v_initval_nolibc
Yong-Xuan Wang [Fri, 20 Dec 2024 09:17:26 +0000 (17:17 +0800)] 
tools: selftests: riscv: Add pass message for v_initval_nolibc

Add the pass message after we successfully complete the test.

Fixes: 5c93c4c72fbc ("selftests: Test RISC-V Vector's first-use handler")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andy Chiu <AndybnAC@gmail.com>
Link: https://lore.kernel.org/r/20241220091730.28006-2-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
5 months agoMerge tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 9 Jan 2025 16:54:49 +0000 (08:54 -0800)] 
Merge tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix imbalance between flowtable BIND and UNBIND calls to configure
   hardware offload, this fixes a possible kmemleak.

2) Clamp maximum conntrack hashtable size to INT_MAX to fix a possible
   WARN_ON_ONCE splat coming from kvmalloc_array(), only possible from
   init_netns.

* tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: conntrack: clamp maximum hashtable size to INT_MAX
  netfilter: nf_tables: imbalance in flowtable binding
====================

Link: https://patch.msgid.link/20250109123532.41768-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agoMerge branch 'net-sysctl-avoid-using-current-nsproxy'
Jakub Kicinski [Thu, 9 Jan 2025 16:53:37 +0000 (08:53 -0800)] 
Merge branch 'net-sysctl-avoid-using-current-nsproxy'

Matthieu Baerts says:

====================
net: sysctl: avoid using current->nsproxy

As pointed out by Al Viro and Eric Dumazet in [1], using the 'net'
structure via 'current' is not recommended for different reasons:

- Inconsistency: getting info from the reader's/writer's netns vs only
  from the opener's netns as it is usually done. This could cause
  unexpected issues when other operations are done on the wrong netns.

- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
  (null-ptr-deref), e.g. when the current task is exiting, as spotted by
  syzbot [1] using acct(2).

The 'net' or 'pernet' structure can be obtained from the table->data
using container_of().

Note that table->data could also be used directly in more places, but
that would increase the size of this fix to replace all accesses via
'net'. Probably best to avoid that for fixes.

Patches 2-9 remove access of net via current->nsproxy in sysfs handlers
in MPTCP, SCTP and RDS. There are multiple patches doing almost the same
thing, but the reason is to ease the backports.

Patch 1 is not directly linked to this, but it is a small fix for MPTCP
available_schedulers sysctl knob to explicitly mark it as read-only.

Please note that this series does not address Al's comment [2]. In SCTP,
some sysctl knobs set other sysfs-exposed variables for the min/max: two
processes could then write two linked values at the same time, resulting
in new values being outside the new boundaries. It would be great if
SCTP developers can look at this problem.

Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
Link: https://lore.kernel.org/20250105211158.GL1977892@ZenIV
====================

Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-0-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agords: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
Matthieu Baerts (NGI0) [Wed, 8 Jan 2025 15:34:37 +0000 (16:34 +0100)] 
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy

As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:

- Inconsistency: getting info from the reader's/writer's netns vs only
  from the opener's netns.

- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
  (null-ptr-deref), e.g. when the current task is exiting, as spotted by
  syzbot [1] using acct(2).

The per-netns structure can be obtained from the table->data using
container_of(), then the 'net' one can be retrieved from the listen
socket (if available).

Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-9-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agosctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
Matthieu Baerts (NGI0) [Wed, 8 Jan 2025 15:34:36 +0000 (16:34 +0100)] 
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy

As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:

- Inconsistency: getting info from the reader's/writer's netns vs only
  from the opener's netns.

- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
  (null-ptr-deref), e.g. when the current task is exiting, as spotted by
  syzbot [1] using acct(2).

The 'net' structure can be obtained from the table->data using
container_of().

Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is
used.

Fixes: d1e462a7a5f3 ("sctp: add probe_interval in sysctl and sock/asoc/transport")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agosctp: sysctl: udp_port: avoid using current->nsproxy
Matthieu Baerts (NGI0) [Wed, 8 Jan 2025 15:34:35 +0000 (16:34 +0100)] 
sctp: sysctl: udp_port: avoid using current->nsproxy

As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:

- Inconsistency: getting info from the reader's/writer's netns vs only
  from the opener's netns.

- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
  (null-ptr-deref), e.g. when the current task is exiting, as spotted by
  syzbot [1] using acct(2).

The 'net' structure can be obtained from the table->data using
container_of().

Note that table->data could also be used directly, but that would
increase the size of this fix, while 'sctp.ctl_sock' still needs to be
retrieved from 'net' structure.

Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>