git.ipfire.org Git - thirdparty/kernel/linux.git/log
5 weeks ago Merge patch series "Exposing case folding behavior"
Christian Brauner [Mon, 11 May 2026 14:50:37 +0000 (16:50 +0200)] 
Merge patch series "Exposing case folding behavior"

Chuck Lever <cel@kernel.org> says:

I'm attempting to implement enough support in the Linux VFS to
enable file services like NFSD and ksmbd (and user space
equivalents) to provide the actual status of case folding support
in local file systems. The default behavior for local file systems
not explicitly supported in this series is to reflect the usual
POSIX behaviors:

  case-insensitive = false
  case-nonpreserving = false

The case-insensitivity and case-nonpreserving booleans can be
consumed immediately by NFSD. These two attributes have been part of
the NFSv3 and NFSv4 protocols for decades, in order to support NFS
client implementations on non-POSIX systems.

Support for user space file servers is why this series exposes case
folding information via a user-space API. I don't know of any other
category of user-space application that requires access to case
folding info.

The Linux NFS community has a growing interest in supporting NFS
clients on Windows and macOS platforms, where file name behavior does
not align with traditional POSIX semantics.

One example of a Windows-based NFS client is [1]. This client
implementation explicitly requires servers to report
FATTR4_WORD0_CASE_INSENSITIVE = TRUE for proper operation, a hard
requirement for Windows client interoperability because Windows
applications expect case-insensitive behavior. When an NFS client
knows the server is case-insensitive, it can avoid issuing multiple
LOOKUP/READDIR requests to search for case variants, and applications
like Win32 programs work correctly without manual workarounds or
code changes.

Even the Linux client can take advantage of this information. Trond
merged patches 4 years ago [2] that introduce support for case
insensitivity, in support of the Hammerspace NFS server. In
particular, when a client detects a case-insensitive NFS share,
negative dentry caching must be disabled (a lookup for "FILE.TXT"
failing shouldn't cache a negative entry when "file.txt" exists)
and directory change invalidation must clear all cached case-folded
file name variants.

Hammerspace servers and several other NFS server implementations
operate in multi-protocol environments, where a single file service
instance caters to both NFS and SMB clients. In those cases, things
work more smoothly for everyone when the NFS client can see and adapt
to the case folding behavior that SMB users rely on and expect. NFSD
needs to support the case-insensitivity and case-nonpreserving
booleans properly in order to participate as a first-class citizen
in such environments.

[1] https://github.com/kofemann/ms-nfs41-client
[2] https://patchwork.kernel.org/project/linux-nfs/cover/20211217203658.439352-1-trondmy@kernel.org/

* patches from https://patch.msgid.link/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com:
  ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
  nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
  nfsd: Report export case-folding via NFSv3 PATHCONF
  isofs: Implement fileattr_get for case sensitivity
  vboxsf: Implement fileattr_get for case sensitivity
  nfs: Implement fileattr_get for case sensitivity
  cifs: Implement fileattr_get for case sensitivity
  xfs: Report case sensitivity in fileattr_get
  hfsplus: Report case sensitivity in fileattr_get
  hfs: Implement fileattr_get for case sensitivity
  ntfs3: Implement fileattr_get for case sensitivity
  exfat: Implement fileattr_get for case sensitivity
  fat: Implement fileattr_get for case sensitivity
  fs: Add case sensitivity flags to file_kattr
  fs: Move file_kattr initialization to callers

Link: https://patch.msgid.link/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
Chuck Lever [Thu, 7 May 2026 08:53:08 +0000 (04:53 -0400)] 
ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION

FS_ATTRIBUTE_INFORMATION responses have always reported
FILE_CASE_SENSITIVE_SEARCH and FILE_CASE_PRESERVED_NAMES
unconditionally. Case-insensitive filesystems like exFAT, and
casefolded directories on ext4 or f2fs, have no way to signal
their actual semantics to SMB clients.

Now that filesystems expose case behavior through ->fileattr_get,
query it via vfs_fileattr_get() and translate the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags into the corresponding SMB
attributes. Filesystems without ->fileattr_get continue reporting
default POSIX behavior (case-sensitive, case-preserving).
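
The translation described above can be sketched as follows. This is an
illustration only, not the ksmbd code: the FILE_* values are the ones
defined for FileFsAttributeInformation, while the XFLAG_* bit values and
the helper name smb_case_attrs() are placeholders invented for the
example, since the real FS_XFLAG_* values are not given in this log.

```c
#include <stdint.h>

/* SMB file system attribute bits (FileFsAttributeInformation). */
#define FILE_CASE_SENSITIVE_SEARCH 0x00000001u
#define FILE_CASE_PRESERVED_NAMES  0x00000002u

/* Placeholder bit values, NOT the real FS_XFLAG_* definitions. */
#define XFLAG_CASEFOLD          0x1u
#define XFLAG_CASENONPRESERVING 0x2u

/*
 * Hypothetical helper: each xflag describes a deviation from POSIX,
 * so each SMB attribute is advertised only when its xflag is clear.
 */
static uint32_t smb_case_attrs(uint32_t xflags)
{
	uint32_t attrs = 0;

	if (!(xflags & XFLAG_CASEFOLD))
		attrs |= FILE_CASE_SENSITIVE_SEARCH;
	if (!(xflags & XFLAG_CASENONPRESERVING))
		attrs |= FILE_CASE_PRESERVED_NAMES;
	return attrs;
}
```

With no xflags set, the helper yields both SMB bits, which matches the
POSIX default reported for filesystems without ->fileattr_get.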

SMB's FS_ATTRIBUTE_INFORMATION reports per-share attributes from
the share root, not per-file. Shares mixing casefold and
non-casefold directories report the root directory's behavior.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-15-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
Chuck Lever [Thu, 7 May 2026 08:53:07 +0000 (04:53 -0400)] 
nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING

NFSD currently provides NFSv4 clients with hard-coded responses
indicating all exported filesystems are case-sensitive and
case-preserving. This is incorrect for case-insensitive filesystems
and ext4 directories with casefold enabled.

Query the underlying filesystem's actual case sensitivity via
nfsd_get_case_info() and return accurate values to clients. This
supports per-directory settings for filesystems that allow mixing
case-sensitive and case-insensitive directories within an export.

The helper queries the parent dentry for non-directory filehandles
because case-folding is a per-directory property. That resolution
has the same corner cases here as for NFSv3 PATHCONF: single-file
exports query an unexported parent, disconnected dentries report
defaults until reconnected, and hardlinked files track whichever
alias the dcache currently holds.

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-14-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago nfsd: Report export case-folding via NFSv3 PATHCONF
Chuck Lever [Thu, 7 May 2026 08:53:06 +0000 (04:53 -0400)] 
nfsd: Report export case-folding via NFSv3 PATHCONF

The hard-coded MSDOS_SUPER_MAGIC check in nfsd3_proc_pathconf()
only recognizes FAT filesystems as case-insensitive. Modern
filesystems like F2FS, exFAT, and CIFS support case-insensitive
directories, but NFSv3 clients cannot discover this capability.

Query the export's actual case behavior through ->fileattr_get
instead. This allows NFSv3 clients to correctly handle case
sensitivity for any filesystem that implements the fileattr
interface. Filesystems without ->fileattr_get continue to report
the default POSIX behavior (case-sensitive, case-preserving).

This change depends on the earlier "fat: Implement fileattr_get
for case sensitivity" patch in this series, which ensures FAT
filesystems report their case behavior correctly via the
fileattr interface.

Case-folding is a per-directory property, so
nfsd_get_case_info() queries the parent dentry for
non-directory filehandles. Three inherent corner cases follow:
a single-file export's parent lies outside the exported
subtree, so the LSM hook evaluates against an unexported
directory; a disconnected dentry from fh_verify() has
d_parent == itself, so the file's own attributes are reported
until the dentry connects; and a hardlinked file resolves
through the alias the dcache currently holds, so when the
inode is linked into both case-folded and case-sensitive
directories the reported value tracks whichever parent is
active. These limitations are not addressable without
redefining the protocol attribute as per-parent rather than
per-object.
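
The parent-dentry resolution and its disconnected-dentry corner case
can be modelled in user space roughly as below. The struct toy_dentry
type, the placeholder XFLAG_CASEFOLD value, and case_xflags_for() are
all hypothetical stand-ins; this is not nfsd_get_case_info() itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit value, NOT the real FS_XFLAG_CASEFOLD. */
#define XFLAG_CASEFOLD 0x1u

/* Toy stand-in for a dentry: only what the sketch needs. */
struct toy_dentry {
	bool is_dir;
	uint32_t xflags;                 /* what ->fileattr_get would report */
	const struct toy_dentry *parent; /* points to itself when disconnected */
};

/*
 * Directories answer for themselves; everything else is answered by
 * its parent, because case folding is a per-directory property.
 */
static uint32_t case_xflags_for(const struct toy_dentry *d)
{
	const struct toy_dentry *target = d->is_dir ? d : d->parent;

	return target->xflags;
}
```

A disconnected file whose parent pointer refers to itself falls through
to its own (default) flags, which is the second corner case listed
above.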

RFC 1813 restricts PATHCONF errors to NFS3ERR_STALE,
NFS3ERR_BADHANDLE, and NFS3ERR_SERVERFAULT. When an LSM hook
denies the case-folding query on the parent, NFS3ERR_STALE is
the only correct mapping: NFS3ERR_SERVERFAULT misrepresents a
working server as broken, and NFS3ERR_BADHANDLE implies a
decoding failure that did not occur. A client purging the
filehandle on receipt is the desired outcome, since the server
has refused to read attributes through it. Substituting POSIX
defaults instead would let the same handle report
casefold=false now and casefold=true once policy permits,
opening a silent name-collision window on case-insensitive
exports.

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-13-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago isofs: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:53:05 +0000 (04:53 -0400)] 
isofs: Implement fileattr_get for case sensitivity

Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner so
they can provide correct semantics to remote clients. Without
this information, NFS exports of ISO 9660 filesystems cannot
advertise their filename case behavior.

Implement isofs_fileattr_get() to report ISO 9660 case handling
behavior. The 'check=r' (relaxed) mount option enables
case-insensitive lookups and is reported via FS_XFLAG_CASEFOLD.
By default, Joliet extensions operate in relaxed mode while
plain ISO 9660 uses strict (case-sensitive) mode.

Plain ISO 9660 names on the medium are uppercase. When neither
Rock Ridge nor Joliet is in effect, the default 'map=n' option
(and 'map=a') routes lookup and readdir through
isofs_name_translate(), which forces A-Z to a-z. The names
visible to userspace then differ in case from the on-disc form,
so report FS_XFLAG_CASENONPRESERVING in that configuration. Rock
Ridge and Joliet both deliver names as authored, and 'map=o'
emits the raw on-disc name unchanged, so those configurations
remain case-preserving.
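
The reporting rules above reduce to a small decision function. The
sketch below is a user-space model under stated assumptions: the mount
options are passed as plain characters and booleans, the XFLAG_* values
are placeholders, and isofs_case_xflags() is a hypothetical name rather
than the isofs_fileattr_get() implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, NOT the real FS_XFLAG_* definitions. */
#define XFLAG_CASEFOLD          0x1u
#define XFLAG_CASENONPRESERVING 0x2u

/*
 * check: 'r' (relaxed) or 's' (strict)
 * map:   'n', 'a' or 'o'
 */
static uint32_t isofs_case_xflags(char check, char map,
				  bool rock_ridge, bool joliet)
{
	uint32_t xflags = 0;

	/* Relaxed checking means case-insensitive lookups. */
	if (check == 'r')
		xflags |= XFLAG_CASEFOLD;

	/*
	 * Plain ISO 9660 names pass through name translation (A-Z to
	 * a-z) unless map=o is used, so the visible case differs from
	 * the on-disc form.
	 */
	if (!rock_ridge && !joliet && map != 'o')
		xflags |= XFLAG_CASENONPRESERVING;

	return xflags;
}
```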

Casefolding is a directory property, and the in-tree consumers
(NFSD, ksmbd) issue the query against a directory: NFSD walks
to the parent for non-directory dentries before calling
vfs_fileattr_get(), and ksmbd reports per-share attributes from
the share root. Wire .fileattr_get only on
isofs_dir_inode_operations. The CASEFOLD flag is set in both
fa->fsx_xflags and fa->flags so FS_IOC_FSGETXATTR and
FS_IOC_GETFLAGS agree.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-12-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago vboxsf: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:53:04 +0000 (04:53 -0400)] 
vboxsf: Implement fileattr_get for case sensitivity

Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner. Report
VirtualBox shared folder case handling behavior via the
FS_XFLAG_CASEFOLD flag.

The case sensitivity property is queried from the VirtualBox host
service at mount time and cached in struct vboxsf_sbi. The host
determines case sensitivity based on the underlying host filesystem
(for example, Windows NTFS is case-insensitive while Linux ext4 is
case-sensitive).

VirtualBox shared folders always preserve filename case exactly
as provided by the guest. The host interface does not expose a
separate case-preserving property; leaving
FS_XFLAG_CASENONPRESERVING unset reports the POSIX-default
case-preserving behavior, which matches vboxsf semantics.

The callback is registered in all three inode_operations
structures (directory, file, and symlink) to ensure consistent
reporting across all inode types.

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-11-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago nfs: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:53:03 +0000 (04:53 -0400)] 
nfs: Implement fileattr_get for case sensitivity

An NFS server re-exporting an NFS mount point needs to report
the case sensitivity behavior of the underlying filesystem to
its clients. NFSD's attribute encoder obtains that information
by calling vfs_fileattr_get() on the lower filesystem, so the
NFS client must implement fileattr_get to surface what it
learned from its own server.

The NFS client already retrieves case sensitivity information
from servers during mount via PATHCONF (NFSv3) or the
FATTR4_CASE_INSENSITIVE/FATTR4_CASE_PRESERVING attributes
(NFSv4). Expose this information through fileattr_get by
reporting the FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING
flags. NFSv2 lacks PATHCONF support, so mounts using that protocol
version default to standard POSIX behavior: case-sensitive and
case-preserving.

PATHCONF is now invoked unconditionally for NFSv2 and NFSv3 mounts
so the case-sensitivity capabilities are established even when the
user pins server->namelen with the namlen= mount option. That option
is orthogonal to case handling, and skipping PATHCONF because
namelen was already known would leave the caps unset.

The two capability bits carry opposite polarity because their POSIX
defaults differ. Most servers are case-sensitive and case-
preserving, matching "neither xflag set." NFS_CAP_CASE_INSENSITIVE
is set only when the server affirms case insensitivity, so "server
said no" and "server did not answer" both collapse to the case-
sensitive default. NFS_CAP_CASE_NONPRESERVING follows the same
pattern in the opposite direction: set only when the server affirms
that it does not preserve case, so that silence or a missing
attribute lands on the case-preserving default. The NFSv4 probe
checks res.attr_bitmask[0] to distinguish "server said false" from
"server omitted the attribute" before setting the bit.

Both capability bits are cleared before each probe so a remount,
an NFSv4 transparent state migration to a server with different
case semantics, or a probe whose reply does not arrive does not
retain stale capabilities from the prior probe.
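
The polarity and clearing rules can be sketched as a pure function. The
FATTR4_WORD0_* bit positions follow the NFSv4 attribute numbers 16 and
17; the CAP_* values, the parameter list, and probe_case_caps() are
assumptions made for illustration and do not mirror the actual NFS
client code.

```c
#include <stdbool.h>
#include <stdint.h>

/* NFSv4 attribute bitmap word 0: attributes 16 and 17. */
#define FATTR4_WORD0_CASE_INSENSITIVE (1u << 16)
#define FATTR4_WORD0_CASE_PRESERVING  (1u << 17)

/* Placeholder capability bits, NOT the real NFS_CAP_* values. */
#define CAP_CASE_INSENSITIVE   0x1u
#define CAP_CASE_NONPRESERVING 0x2u

static uint32_t probe_case_caps(uint32_t caps, bool got_reply,
				uint32_t attr_bitmask0,
				bool case_insensitive, bool case_preserving)
{
	/* Clear both bits first so a failed probe leaves nothing stale. */
	caps &= ~(CAP_CASE_INSENSITIVE | CAP_CASE_NONPRESERVING);
	if (!got_reply)
		return caps;

	/* Set only on an affirmative answer the server actually sent. */
	if ((attr_bitmask0 & FATTR4_WORD0_CASE_INSENSITIVE) && case_insensitive)
		caps |= CAP_CASE_INSENSITIVE;
	if ((attr_bitmask0 & FATTR4_WORD0_CASE_PRESERVING) && !case_preserving)
		caps |= CAP_CASE_NONPRESERVING;
	return caps;
}
```

Because the bitmask is checked before the value, an omitted attribute
and an explicit "false" both land on the POSIX default.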

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-10-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago cifs: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:53:02 +0000 (04:53 -0400)] 
cifs: Implement fileattr_get for case sensitivity

Upper layers such as NFSD need a way to query whether a filesystem
handles filenames in a case-sensitive manner. Report CIFS/SMB case
handling behavior via FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING.

The authoritative source is the server itself: at mount time CIFS
issues QueryFSInfo(FS_ATTRIBUTE_INFORMATION) and caches the reply
on the tcon. That reply carries FILE_CASE_SENSITIVE_SEARCH and
FILE_CASE_PRESERVED_NAMES, which reflect whatever case handling
the share actually implements after SMB3.1.1 POSIX extensions
negotiation. Translating those two bits into the VFS flags lets
cifs_fileattr_get report what the server advertises rather than
what the client was asked to pretend.

QueryFSInfo is best-effort; the mount completes even if the server
does not answer. MaxPathNameComponentLength is zero in that case
and is used as the "no reply received" sentinel. When no reply is
available, fall back to the nocase mount option so that the reported
behavior agrees with the dentry comparison operations installed on
the superblock.
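
A rough model of the sentinel and fallback logic follows. It assumes
the cached QueryFSInfo reply is available as two integers and that the
fallback affects only the case-folding flag; the XFLAG_* values and
cifs_case_xflags() are hypothetical and not taken from
cifs_fileattr_get().

```c
#include <stdbool.h>
#include <stdint.h>

/* SMB file system attribute bits (FileFsAttributeInformation). */
#define FILE_CASE_SENSITIVE_SEARCH 0x00000001u
#define FILE_CASE_PRESERVED_NAMES  0x00000002u

/* Placeholder bit values, NOT the real FS_XFLAG_* definitions. */
#define XFLAG_CASEFOLD          0x1u
#define XFLAG_CASENONPRESERVING 0x2u

static uint32_t cifs_case_xflags(uint32_t max_component_len,
				 uint32_t fs_attrs, bool nocase)
{
	uint32_t xflags = 0;

	/* Zero length is the "no reply received" sentinel. */
	if (max_component_len == 0)
		return nocase ? XFLAG_CASEFOLD : 0;

	/* Otherwise trust what the server advertised. */
	if (!(fs_attrs & FILE_CASE_SENSITIVE_SEARCH))
		xflags |= XFLAG_CASEFOLD;
	if (!(fs_attrs & FILE_CASE_PRESERVED_NAMES))
		xflags |= XFLAG_CASENONPRESERVING;
	return xflags;
}
```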

The callback is registered on cifs_dir_inode_ops so that NFSD,
ksmbd, and other consumers querying case handling against a
directory get a definitive answer, and on cifs_file_inode_ops to
preserve FS_COMPR_FL reporting on regular files. cifs_set_ops()
also installs cifs_namespace_inode_operations on DFS referral
directories that carry IS_AUTOMOUNT; register the same callback
there so the answer does not depend on whether the directory is
a referral point.

Registering fileattr_get routes FS_IOC_GETFLAGS through
vfs_fileattr_get() and short-circuits the syscall's fallback to
cifs_ioctl(). That fallback invoked CIFSGetExtAttr() under
CONFIG_CIFS_POSIX and CONFIG_CIFS_ALLOW_INSECURE_LEGACY on servers
advertising CIFS_UNIX_EXTATTR_CAP, surfacing the SMB1 Unix-extension
immutable, append, and nodump bits. cifs_fileattr_get carries over
only FS_COMPR_FL from cached cifsAttrs; the SMB1 extattr fetch is
not reproduced. SMB1 is deprecated, and acquiring a netfid from
within a dentry-only callback is not worth preserving a path tied
to an insecure legacy dialect.

Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-9-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago xfs: Report case sensitivity in fileattr_get
Chuck Lever [Thu, 7 May 2026 08:53:01 +0000 (04:53 -0400)] 
xfs: Report case sensitivity in fileattr_get

Upper layers such as NFSD need to query whether a filesystem
is case-sensitive. Add FS_XFLAG_CASEFOLD to xfs_ip2xflags()
when the filesystem is formatted with the ASCIICI feature
flag. This serves both FS_IOC_FSGETXATTR (via xfs_fill_fsxattr()
in xfs_fileattr_get()) and XFS_IOC_BULKSTAT (which populates
bs_xflags directly from xfs_ip2xflags()), so bulkstat consumers
and per-inode queries see a consistent view of the filesystem's
case-folding behavior.

FS_XFLAG_CASEFOLD is read-only: FS_XFLAG_RDONLY_MASK ensures
FS_IOC_FSSETXATTR strips it, and xfs_flags2diflags() has no
clause for CASEFOLD so the on-disk diflags are unaffected.
The legacy FS_IOC_SETFLAGS path in xfs_fileattr_set() also
allows FS_CASEFOLD_FL through its allowlist on ASCIICI
filesystems so that a chattr read-modify-write cycle does
not fail with EOPNOTSUPP.
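
Why the allowlist matters for chattr can be seen in a small model. The
FS_NOATIME_FL and FS_CASEFOLD_FL values match the UAPI header; the
one-entry base allowlist and check_setflags() are simplifications
assumed for the example, not the xfs_fileattr_set() logic.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define FS_NOATIME_FL  0x00000080u
#define FS_CASEFOLD_FL 0x40000000u

/*
 * chattr reads the current flags, changes one bit, and writes the
 * whole word back. On a case-folded filesystem the word it read
 * includes FS_CASEFOLD_FL, so the set path has to tolerate it.
 */
static int check_setflags(uint32_t flags, bool fs_is_casefolded)
{
	/* Stand-in for the filesystem's real allowlist. */
	uint32_t allowed = FS_NOATIME_FL;

	if (fs_is_casefolded)
		allowed |= FS_CASEFOLD_FL;

	return (flags & ~allowed) ? -EOPNOTSUPP : 0;
}
```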

XFS always preserves case. XFS is case-sensitive by default,
but supports ASCII case-insensitive lookups when formatted
with the ASCIICI feature flag.

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-8-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago hfsplus: Report case sensitivity in fileattr_get
Chuck Lever [Thu, 7 May 2026 08:53:00 +0000 (04:53 -0400)] 
hfsplus: Report case sensitivity in fileattr_get

Add case sensitivity reporting to the existing hfsplus_fileattr_get()
function via the FS_XFLAG_CASEFOLD flag. HFS+ always preserves case
at rest.

Case sensitivity depends on how the volume was formatted: HFSX
volumes may be either case-sensitive or case-insensitive, indicated
by the HFSPLUS_SB_CASEFOLD superblock flag.

FS_XFLAG_CASEFOLD is read-only: FS_XFLAG_RDONLY_MASK ensures
FS_IOC_FSSETXATTR strips it. The legacy FS_IOC_SETFLAGS path in
hfsplus_fileattr_set() also allows FS_CASEFOLD_FL through its
allowlist on case-insensitive volumes so that a chattr
read-modify-write cycle does not fail with EOPNOTSUPP.

Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-7-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago hfs: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:52:59 +0000 (04:52 -0400)] 
hfs: Implement fileattr_get for case sensitivity

Report HFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. HFS is always case-insensitive (using Mac OS Roman case
folding) and always preserves case at rest.

Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-6-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago ntfs3: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:52:58 +0000 (04:52 -0400)] 
ntfs3: Implement fileattr_get for case sensitivity

Report NTFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. NTFS always preserves case at rest.

Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-5-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago exfat: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:52:57 +0000 (04:52 -0400)] 
exfat: Implement fileattr_get for case sensitivity

Report exFAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. exFAT compares names through the volume's upcase table; in
practice that table folds case, and case is preserved at rest.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://github.com/exfatprogs/exfatprogs/issues/313
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-4-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago fat: Implement fileattr_get for case sensitivity
Chuck Lever [Thu, 7 May 2026 08:52:56 +0000 (04:52 -0400)] 
fat: Implement fileattr_get for case sensitivity

Report FAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags. FAT filesystems are
case-insensitive by default.

MSDOS supports a 'nocase' mount option that enables case-sensitive
behavior; check this option when reporting case sensitivity.

VFAT long filename entries preserve case; without VFAT, only
uppercased 8.3 short names are stored. MSDOS with 'nocase' also
preserves case since the name-formatting code skips upcasing when
'nocase' is set. Check both options when reporting case preservation.
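
The two checks combine as in the sketch below. It assumes 'nocase' is
only meaningful for MSDOS mounts, uses placeholder XFLAG_* values, and
names the helper fat_case_xflags() purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, NOT the real FS_XFLAG_* definitions. */
#define XFLAG_CASEFOLD          0x1u
#define XFLAG_CASENONPRESERVING 0x2u

static uint32_t fat_case_xflags(bool is_vfat, bool nocase)
{
	uint32_t xflags = 0;

	/* Case-insensitive unless MSDOS is mounted with 'nocase'. */
	if (is_vfat || !nocase)
		xflags |= XFLAG_CASEFOLD;

	/*
	 * VFAT long names keep their case, and MSDOS with 'nocase'
	 * skips upcasing; only plain MSDOS stores uppercased 8.3 names.
	 */
	if (!is_vfat && !nocase)
		xflags |= XFLAG_CASENONPRESERVING;

	return xflags;
}
```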

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-3-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago fs: Add case sensitivity flags to file_kattr
Chuck Lever [Thu, 7 May 2026 08:52:55 +0000 (04:52 -0400)] 
fs: Add case sensitivity flags to file_kattr

Enable upper layers such as NFSD to retrieve case sensitivity
information from file systems by adding FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING flags.

Filesystems report case-insensitive or case-nonpreserving behavior
by setting these flags directly in fa->fsx_xflags. The default
(flags unset) indicates POSIX semantics: case-sensitive and
case-preserving. Both flags are added to FS_XFLAG_RDONLY_MASK so
FS_IOC_FSSETXATTR silently strips them, keeping the new xflags
strictly a reporting interface. Callers that want to toggle
casefolding continue to use FS_IOC_SETFLAGS with FS_CASEFOLD_FL,
the established UAPI on filesystems that support the operation
(ext4 and f2fs on empty directories).

Case sensitivity information is exported to userspace via the
fsx_xflags field of the FS_IOC_FSGETXATTR ioctl and the fa_xflags
field of the file_getattr() system call.
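
From user space, a query through the existing FS_IOC_FSGETXATTR ioctl
might look like the sketch below. It assumes a Linux system with the
kernel UAPI headers installed; query_casefold() is a hypothetical
helper, and the flag test is compiled only when the installed
<linux/fs.h> already defines FS_XFLAG_CASEFOLD, because the flag's
numeric value is not given in this log.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Returns 1 if the path reports case folding, 0 if it does not, and
 * -1 if the query failed or the headers predate FS_XFLAG_CASEFOLD.
 */
static int query_casefold(const char *path)
{
	struct fsxattr fa;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fa) < 0) {
		close(fd);
		return -1;
	}
	close(fd);

#ifdef FS_XFLAG_CASEFOLD
	return (fa.fsx_xflags & FS_XFLAG_CASEFOLD) ? 1 : 0;
#else
	return -1; /* headers too old to know the flag */
#endif
}
```

Since the in-tree consumers query directories, passing a directory path
is the most representative use of such a helper.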

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-2-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago fs: Move file_kattr initialization to callers
Chuck Lever [Thu, 7 May 2026 08:52:54 +0000 (04:52 -0400)] 
fs: Move file_kattr initialization to callers

fileattr_fill_xflags() and fileattr_fill_flags() memset the
entire file_kattr struct before populating select fields, so
callers cannot pre-set fields in fa->fsx_xflags without having
their values clobbered. Darrick Wong noted that a function
named "fill_xflags" touching more than xflags forces callers
to know implementation details beyond its apparent scope.

Drop the memset from both fill functions and initialize at the
entry points instead: ioctl_setflags(), ioctl_fssetxattr(),
the file_setattr() syscall, and xfs_ioc_fsgetxattra() now
declare fa with an aggregate initializer. ioctl_getflags(),
ioctl_fsgetxattr(), and the file_getattr() syscall already
aggregate-initialize fa to pass flags_valid/fsx_valid hints
into vfs_fileattr_get().

Subsequent patches rely on this so that ->fileattr_get()
handlers can set case-sensitivity flags (FS_XFLAG_CASEFOLD,
FS_XFLAG_CASENONPRESERVING) in fa->fsx_xflags before the fill
functions run.

Suggested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260507-case-sensitivity-v14-1-e62cc8200435@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago mmc: core: Fix host controller programming for fixed driver type
Kamal Dasu [Thu, 23 Apr 2026 19:18:55 +0000 (15:18 -0400)] 
mmc: core: Fix host controller programming for fixed driver type

When using the fixed-emmc-driver-type device tree property, the MMC core
correctly selects the driver strength for the card but fails to program
the host controller accordingly. This causes a mismatch where the card
uses the specified driver type while the host controller defaults to
Type B (since ios->drv_type remains zero).

Split the driver type programming logic to handle both fixed and dynamic
driver type selection paths. For fixed driver types, program the host
controller with the selected drive_strength value. For dynamic selection,
use the existing drv_type as before.

This ensures both the eMMC device and host controller use matching driver
strengths, preventing potential signal integrity issues.

Fixes: 6186d06c519e ("mmc: parse new binding for eMMC fixed driver type")
Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 weeks ago s390/debug: Add s390dbf kernel parameter
Peter Oberparleiter [Wed, 6 May 2026 14:53:27 +0000 (16:53 +0200)] 
s390/debug: Add s390dbf kernel parameter

Problem determination using s390dbf logging sometimes requires changing
the default logging level or log area size. While this is possible
using sysfs interfaces, there is no easy way to adjust these parameters
for early boot code that emits logs before userspace is available.

Add an s390dbf kernel parameter to address this shortcoming. The
parameter can be used to specify log level and area size (in units of
pages). A level of '-' turns logging off for an area. Logs can be
identified by name or a shell-style pattern.

Parameter format:

  s390dbf=<name|pattern>:[<level>|-]:[<pages>][,...]

Example:

  s390dbf=cio*:6:128,sclp_err::2
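
To make the format concrete, here is a user-space sketch that parses a
single comma-separated entry. It assumes both colons are always
present, as in the example, and that names fit in 31 characters; struct
dbf_entry and parse_dbf_entry() are hypothetical and unrelated to the
kernel's actual parser.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct dbf_entry {
	bool valid;      /* entry matched the expected shape */
	char name[32];   /* log name or shell-style pattern */
	bool set_level;  /* a level field was given */
	bool off;        /* level was '-': logging turned off */
	int level;
	bool set_pages;  /* a pages field was given */
	int pages;
};

/* Parse one "<name|pattern>:[<level>|-]:[<pages>]" entry. */
static struct dbf_entry parse_dbf_entry(const char *s)
{
	struct dbf_entry e = { 0 };
	const char *c1 = strchr(s, ':');
	const char *c2 = c1 ? strchr(c1 + 1, ':') : NULL;
	size_t nlen;

	if (!c1 || !c2)
		return e;
	nlen = (size_t)(c1 - s);
	if (nlen == 0 || nlen >= sizeof(e.name))
		return e;
	memcpy(e.name, s, nlen); /* e is zeroed, so name stays terminated */

	if (c2 > c1 + 1) {       /* non-empty level field */
		e.set_level = true;
		if (c1[1] == '-')
			e.off = true;
		else
			e.level = atoi(c1 + 1);
	}
	if (c2[1] != '\0') {     /* non-empty pages field */
		e.set_pages = true;
		e.pages = atoi(c2 + 1);
	}
	e.valid = true;
	return e;
}
```

Applied to the example, "cio*:6:128" sets level 6 and 128 pages, while
"sclp_err::2" leaves the level untouched and sets 2 pages.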

Specified parameters are applied immediately during debug area
registration for regular log areas. For early, static debug areas,
log levels are changed during early_param() parsing, while size
changes are applied at arch_initcall-time.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Tested-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
5 weeks ago s390/ap: Implement SE bind and associate uevents
Harald Freudenberger [Mon, 27 Apr 2026 16:43:14 +0000 (18:43 +0200)] 
s390/ap: Implement SE bind and associate uevents

Notify userspace about two important events on AP queues
when running within a Secure Execution (SE) environment:
- Send AP CHANGE uevent with "SE_BIND=1" on successful bind
  operation on this AP queue device.
- Send AP CHANGE uevent with "SE_ASSOC=<association_index>"
  on successful association operation with the secret of the
  reported index on this AP queue device.

Note there is no SE unbind/unassociate event. Unbind/unassociate
can have different triggers and technically there is no signaling
done which the AP code could catch. A user space application can,
if this information is crucial, query the sysfs attribute se_bind
on the AP queue which runs a synchronous TAPQ. If the attribute
returns "unbound", a reset took place and the SE bind and associate
states are unbound and unassociated.

Suggested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
5 weeks ago genirq/proc: Size interrupt directory names for 10-digit interrupt numbers
Pengpeng Hou [Fri, 3 Apr 2026 08:55:56 +0000 (16:55 +0800)] 
genirq/proc: Size interrupt directory names for 10-digit interrupt numbers

/proc/irq/<n>/ directory names are built in `char name[10]` buffers
with `sprintf(name, "%u", irq)`.

Ten-digit IRQ numbers already need 11 bytes including the trailing NUL, and
current sparse-IRQ configurations allow interrupt numbers in that range.

Size the temporary name buffer for the current decimal form and switch
to bounded formatting when creating or removing the proc entry.
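
The size arithmetic can be checked in user space with snprintf(). The
sketch assumes a 32-bit unsigned int; irq_name_fits() is a hypothetical
helper, not the kernel change itself.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * snprintf() with a NULL buffer and zero size returns the number of
 * characters the formatted value needs, excluding the trailing NUL.
 */
static bool irq_name_fits(unsigned int irq, size_t buflen)
{
	int digits = snprintf(NULL, 0, "%u", irq);

	return digits >= 0 && (size_t)digits + 1 <= buflen;
}
```

A nine-digit number fits in char name[10], but a ten-digit number needs
11 bytes, which is the overflow the patch closes.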

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260404101001.1-genirq-proc-pengpeng@iscas.ac.cn
5 weeks ago ntfs: restore $MFT mirror contents check
DaeMyung Kang [Sun, 10 May 2026 17:11:14 +0000 (02:11 +0900)] 
ntfs: restore $MFT mirror contents check

check_mft_mirror() still computes the number of bytes to validate in each
mirrored MFT record, but the actual comparison against $MFTMirr was dropped
when the superblock code was updated.

As a result, mount misses a stale or inconsistent $MFTMirr as long as both
records pass the structural baad-record checks. Restore the comparison and
log an error when the primary $MFT record differs from its mirror copy.

Returning false lets the existing mount error handling mark the volume as
having NTFS errors and, with on_errors=remount-ro, continue read-only. The
default on_errors=continue mount policy still allows the mount to proceed.

Fixes: 6251f0b0de7d ("ntfs: update super block operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
5 weeks ago irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT
Jiayuan Chen [Mon, 30 Mar 2026 07:32:29 +0000 (15:32 +0800)] 
irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT

On PREEMPT_RT, non-HARD irq_work runs in per-CPU kthreads via
run_irq_workd(), so irq_work_sync() uses rcuwait() to wait for BUSY==0.

After irq_work_single() clears BUSY via atomic_cmpxchg(), it still
dereferences @work for irq_work_is_hard() and rcuwait_wake_up().

An irq_work_sync() caller on another CPU that enters after BUSY is cleared
can observe BUSY==0 immediately, return, and free the work before those
accesses complete — causing a use-after-free.

Fix this by wrapping run_irq_workd() in guard(rcu)() so that the entire
irq_work_single() execution is within an RCU read-side critical
section. Then add synchronize_rcu() in irq_work_sync() after
rcuwait_wait_event() to ensure the caller waits for the RCU grace period
before returning, preventing premature frees.

Fixes: 810979682ccc ("irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.")
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260330073234.303732-1-jiayuan.chen@linux.dev
5 weeks ago s390/cio: Restore GFP_DMA for CHSC allocation
Peter Oberparleiter [Thu, 7 May 2026 14:27:08 +0000 (16:27 +0200)] 
s390/cio: Restore GFP_DMA for CHSC allocation

Re-add GFP_DMA when allocating memory for CHSC control blocks.
On some supported machines, CHSC cannot access memory outside
the DMA zone, causing CHSC command failures.

Cc: stable@vger.kernel.org
Fixes: a3a64a4def8d ("s390/cio: remove unneeded DMA zone allocation")
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
5 weeks ago mmc: dw_mmc: exynos: increase DMA threshold value for exynos7870
Kaustabh Chakraborty [Wed, 15 Apr 2026 15:02:09 +0000 (20:32 +0530)] 
mmc: dw_mmc: exynos: increase DMA threshold value for exynos7870

Exynos 7870 compatible controllers, such as the SDIO ones, are not able to
perform DMA transfers for small sizes of data (~16 to ~512 bytes),
resulting in cache issues in subsequent transfers. Increase the DMA
transfer threshold to 512 bytes so that the shorter transfers bypass DMA.

Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 weeks ago mmc: dw_mmc: implement option for configuring DMA threshold
Kaustabh Chakraborty [Wed, 15 Apr 2026 15:02:08 +0000 (20:32 +0530)] 
mmc: dw_mmc: implement option for configuring DMA threshold

Some controllers, such as certain Exynos SDIO ones, are unable to
perform DMA transfers of a small number of bytes properly. Following the
device tree schema, implement the property that defines the DMA transfer
threshold (previously a hard-coded value of 16 bytes) so that smaller
transfers can safely skip DMA on such controllers. The value of 16 bytes
stays as the default for controllers which do not define it. This value
can be overridden by implementation-specific init sequences.
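
The threshold decision described above can be sketched roughly as follows.
This is a minimal userspace model, not the dw_mmc code: struct sk_host_cfg,
sk_effective_threshold() and sk_use_dma() are hypothetical names, and treating
a stored value of 0 as "not configured" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default taken from the commit message: 16 bytes. */
#define SK_DEFAULT_DMA_THRESHOLD 16u

/* Hypothetical stand-in for per-host state; not the real driver struct. */
struct sk_host_cfg {
	unsigned int dma_threshold; /* 0 means "not configured" in this sketch */
};

/* Return the configured threshold, or the default when none was set. */
static unsigned int sk_effective_threshold(const struct sk_host_cfg *cfg)
{
	assert(cfg != NULL);
	return cfg->dma_threshold ? cfg->dma_threshold : SK_DEFAULT_DMA_THRESHOLD;
}

/* Transfers shorter than the threshold bypass DMA. */
static bool sk_use_dma(const struct sk_host_cfg *cfg, size_t bytes)
{
	return bytes >= sk_effective_threshold(cfg);
}
```

With a threshold of 512, as chosen for exynos7870, a 256-byte transfer would
skip DMA in this model, while a 512-byte transfer would still use it.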

Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 weeks ago genirq/msi: Fix typos in msi_domain_ops comment
Miles Krause [Tue, 5 May 2026 01:46:02 +0000 (21:46 -0400)] 
genirq/msi: Fix typos in msi_domain_ops comment

Fix spelling and possessive typos in the msi_domain_ops comment.

No functional change.

Signed-off-by: Miles Krause <mileskrause5200@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260505014602.5879-1-mileskrause5200@gmail.com
5 weeks ago irqchip/gic-v3-its: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 16:29:54 +0000 (00:29 +0800)] 
irqchip/gic-v3-its: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.

No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260430162954.44156-1-18255117159@163.com
5 weeks ago platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8407AA
Paolo Pisati [Fri, 8 May 2026 07:09:56 +0000 (09:09 +0200)] 
platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8407AA

Use the existing zenbook duo keyboard quirk for the UX8407AA model too.

Signed-off-by: Paolo Pisati <p.pisati@gmail.com>
Reviewed-by: Denis Benato <denis.benato@linux.dev>
Link: https://patch.msgid.link/20260508070956.62201-1-p.pisati@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks ago soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE
Harshal Dev [Thu, 16 Apr 2026 11:59:19 +0000 (17:29 +0530)] 
soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE

Since Qualcomm inline-crypto engine (ICE) is now a dedicated driver
de-coupled from the QCOM UFS driver, it explicitly votes for its required
clocks during probe. For scenarios where the 'clk_ignore_unused' flag is
not passed on the kernel command line, to avoid potential unclocked ICE
hardware register access during probe the ICE driver should additionally
vote on the 'iface' clock.
Also update the suspend and resume callbacks to handle un-voting and voting
on the 'iface' clock.

Fixes: 2afbf43a4aec6 ("soc: qcom: Make the Qualcomm UFS/SDCC ICE a dedicated driver")
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-2-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
5 weeks ago dt-bindings: watchdog: apple,wdt: Add t8122 compatible
Janne Grunau [Thu, 7 May 2026 07:33:08 +0000 (09:33 +0200)] 
dt-bindings: watchdog: apple,wdt: Add t8122 compatible

The watchdog on the Apple silicon t8122 (M3) SoC is compatible with the
existing driver. Add "apple,t8122-wdt" as SoC specific compatible under
"apple,t8103-wdt" used by the driver.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Joshua Peisach <jpeisach@ubuntu.com>
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/r/20260507-apple-m3-initial-devicetrees-v3-2-ca07c81b5dc7@jannau.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
5 weeks ago x86/cpuid: Introduce a centralized CPUID parser
Ahmed S. Darwish [Fri, 27 Mar 2026 02:15:23 +0000 (03:15 +0100)] 
x86/cpuid: Introduce a centralized CPUID parser

Introduce a CPUID parser for populating the system's CPUID tables.

Since accessing a leaf within the CPUID table requires compile time
tokenization, split the parser into two stages:

  (a) Compile-time macros for tokenizing the leaf/subleaf offsets within
      the CPUID table.

  (b) Generic runtime code to fill the CPUID data, using a parsing table
      which collects these compile-time offsets.

For actual CPUID output parsing, support both generic and leaf-specific
read functions.
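
The two-stage split can be illustrated with a small userspace sketch. It is
not the kernel's parser: struct sk_table, SK_ENTRY(), sk_fill() and the fake
reader are hypothetical names made up for this example, and the real CPUID
instruction is replaced by a function pointer so the sketch runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_regs { uint32_t eax, ebx, ecx, edx; };

/* Hypothetical table with one slot per supported leaf. */
struct sk_table {
	struct sk_regs leaf_0x0;
	struct sk_regs leaf_0x1;
	struct sk_regs leaf_0x7;
};

struct sk_parse_entry { uint32_t leaf; size_t offset; };

/* Stage (a): compile-time token pasting yields the slot offset. */
#define SK_ENTRY(l) { .leaf = l, .offset = offsetof(struct sk_table, leaf_##l) }

static const struct sk_parse_entry sk_entries[] = {
	SK_ENTRY(0x0), SK_ENTRY(0x1), SK_ENTRY(0x7),
};

typedef void (*sk_read_fn)(uint32_t leaf, struct sk_regs *out);

/* Stage (b): generic runtime loop that fills every slot via the offsets. */
static void sk_fill(struct sk_table *t, sk_read_fn read)
{
	assert(t != NULL && read != NULL);
	for (size_t i = 0; i < sizeof(sk_entries) / sizeof(sk_entries[0]); i++) {
		struct sk_regs *dst =
			(struct sk_regs *)((char *)t + sk_entries[i].offset);
		read(sk_entries[i].leaf, dst);
	}
}

/* Fake reader standing in for the CPUID instruction. */
static void sk_fake_read(uint32_t leaf, struct sk_regs *out)
{
	out->eax = leaf;
	out->ebx = leaf + 1;
	out->ecx = leaf + 2;
	out->edx = leaf + 3;
}
```

Passing a different reader for a given entry would correspond to the
leaf-specific read functions mentioned above.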

To ensure CPUID data early availability, invoke the parser during early
boot, early Xen boot, and at early secondary CPUs bring up.

Provide call site APIs to refresh a single leaf, or a leaf range, within
the CPUID tables.  This is for sites issuing MSR writes that partially
change the CPU's CPUID layout.  Doing full CPUID table rescans in such
cases will be destructive since the CPUID tables will host all of the
kernel's X86_FEATURE flags at a later stage.

Suggested-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20260327021645.555257-1-darwi@linutronix.de
5 weeks ago Merge branch '20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm...
Bjorn Andersson [Mon, 11 May 2026 14:05:50 +0000 (09:05 -0500)] 
Merge branch '20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm.com' into drivers-fixes-for-7.1

Merge the qcom,ice DeviceTree binding update through a topic branch to
allow sharing it with the DeviceTree branch.

5 weeks ago dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk
Harshal Dev [Thu, 16 Apr 2026 11:59:18 +0000 (17:29 +0530)] 
dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk

The DT bindings for inline-crypto engine do not specify the UFS_PHY_GDSC
power-domain and iface clock. Without enabling the iface clock and the
associated power-domain the ICE hardware cannot function correctly, and
unclocked hardware accesses are observed during probe.

Fix the DT bindings for inline-crypto engine to require the UFS_PHY_GDSC
power-domain and iface clock for new devices (Eliza and Milos) introduced
in the current release (7.1) with yet-to-stabilize ABI, while preserving
backward compatibility for older devices.

Fixes: 618195a7ac3df ("dt-bindings: crypto: qcom,inline-crypto-engine: Document the Eliza ICE")
Fixes: 85faec1e85555 ("dt-bindings: crypto: qcom,inline-crypto-engine: document the Milos ICE")
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-1-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
5 weeks ago watchdog: apple: Add "apple,t8103-wdt" compatible
Janne Grunau [Wed, 31 Dec 2025 12:07:21 +0000 (13:07 +0100)] 
watchdog: apple: Add "apple,t8103-wdt" compatible

After discussion with the devicetree maintainers we agreed to not extend
lists with the generic compatible "apple,wdt" anymore [1]. Use
"apple,t8103-wdt" as base compatible as it is the SoC the driver and
bindings were written for.

[1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.org/

Fixes: 4ed224aeaf66 ("watchdog: Add Apple SoC watchdog driver")
Cc: stable@vger.kernel.org
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/r/20251231-watchdog-apple-t8103-base-compat-v1-1-1702a02e0c45@jannau.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
5 weeks ago s390/pai: Fix missing PAI counter increments under heavy load
Thomas Richter [Tue, 5 May 2026 10:34:33 +0000 (12:34 +0200)] 
s390/pai: Fix missing PAI counter increments under heavy load

Machines with a larger number of CPUs and under heavy load sometimes
lose PAI counter increments during recording using events
-e CRYPTO_ALL or -e NNPA_ALL. Counting is not affected.
This happens when several PAI crypto counters are incremented during
the same cryptographic operation.

During schedule out the functions

paiXXX_sched_task() (with XXX either crypt or ext)
+--> pai_have_samples()
     +--> pai_have_sample()
          +--> pai_copy()
          +--> pai_push_sample()

are called to read out PAI counter values.
In pai_copy() the current values of PAI counters are read from the
PMU memory mapped page and compared to the values read during last
schedule out operation, which have been saved in a backup page
named PAI_SAVE_AREA(event). For each PAI counter a delta is calculated
and when the delta is positive, that PAI counter was incremented by
hardware. This positive delta is reported as raw data record attached
to a sample.
After all deltas have been calculated, the new PAI counter values
are saved in the backup page PAI_SAVE_AREA(event). However this is
done in pai_push_sample(), leaving a small window for missing hardware
triggered updates. Here is one scenario:

  PAI counter idx:   0   1   2   3   4   5   6   7  ....  N
                   +---+---+---+---+---+---+---+---+    +---+
  PAI counter page:|   |   | X |   |   |   |   |   |....| Y |
                   +---+---+---+---+---+---+---+---+    +---+

In pai_copy() each PAI counter value is read and compared
to its old value. This is done in a loop. When PAI counter indexed
N is read, the hardware might increment PAI counter indexed 2 again,
updating its value from X to X+1.
Later pai_push_sample() simply mem-copies the complete PAI counter
page to a backup page and the increment of X+1 is lost, because the
backup page now contains the new value.

Read each PAI counter and save this value in the backup page when
there is a positive delta. This closes the time window between read
and store. It also reduces the workload, as only modified PAI
counters are saved.
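
The fixed read-out loop can be modelled in a few lines of userspace C. This is
only a sketch of the idea, not the s390 code: sk_copy_deltas() is a
hypothetical name, 'page' stands in for the hardware-updated PAI counter page,
'save' for PAI_SAVE_AREA(event), and 'deltas' for the raw data attached to a
sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read every counter exactly once and store that same value in the backup
 * area when it changed. A later hardware increment is then still visible
 * as a delta on the next read-out instead of being overwritten by a bulk
 * copy of the whole page. Returns the number of counters that changed.
 */
static size_t sk_copy_deltas(const volatile uint64_t *page, uint64_t *save,
			     uint64_t *deltas, size_t n)
{
	size_t changed = 0;

	assert(page != NULL && save != NULL && deltas != NULL);
	for (size_t i = 0; i < n; i++) {
		uint64_t now = page[i];        /* single read of the counter */
		uint64_t delta = now - save[i];

		deltas[i] = delta;
		if (delta) {
			save[i] = now;         /* save the value that was read */
			changed++;
		}
	}
	return changed;
}
```

Because only changed counters are written back, unchanged slots in the backup
area are never touched.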

Cc: stable@vger.kernel.org
Fixes: fe861b0c8d06 ("s390/pai: save PAI counter value page in event structure")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
5 weeks ago nsfs: fix wrong error code returned for pidns ioctls
Zhihao Cheng [Thu, 7 May 2026 11:23:01 +0000 (19:23 +0800)] 
nsfs: fix wrong error code returned for pidns ioctls

When executing NS_GET_PID_FROM_PIDNS (or similar pidns ioctls), if the
target task cannot be found in the corresponding pid_ns, the error code
should be ESRCH instead of ENOTTY.

This bug was introduced when the extensible ioctl handling was added.
Without proper return, ret would be overwritten by the default case in
the extensible ioctl switch statement.

Fixes: a1d220d9dafa8 ("nsfs: iterate through mount namespaces")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://patch.msgid.link/20260507112301.1042757-1-chengzhihao1@huawei.com
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation
Ming Lei [Sun, 10 May 2026 14:48:43 +0000 (22:48 +0800)] 
ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation

blk_validate_limits() requires max_hw_sectors >= PAGE_SECTORS and fires
a WARN_ON_ONCE if this invariant is violated. ublk_validate_params()
only checked the upper bound of max_sectors against max_io_buf_bytes,
allowing userspace to pass small values (including zero) that trigger
the warning when blk_mq_alloc_disk() is called from
ublk_ctrl_start_dev().

Before 494ea040bcb5, ublk used blk_queue_max_hw_sectors() which silently
clamped small values up to PAGE_SECTORS. The conversion to passing
queue_limits directly to blk_mq_alloc_disk() lost that clamping and now
hits blk_validate_limits()'s WARN_ON_ONCE instead.

Validate that max_sectors is at least PAGE_SECTORS in
ublk_validate_params() so invalid values are rejected early with
-EINVAL instead of reaching the block layer.
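
As a rough illustration of the added lower-bound check, consider the following
userspace sketch. It is not the ublk code: sk_validate_max_sectors() is a
hypothetical name, and the sketch assumes 512-byte sectors and 4 KiB pages, so
the page-sized minimum is 8 sectors.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SK_SECTOR_SHIFT 9
#define SK_PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define SK_PAGE_SECTORS (SK_PAGE_SIZE >> SK_SECTOR_SHIFT)

/* Reject values above the I/O buffer size and below one page of sectors. */
static int sk_validate_max_sectors(uint32_t max_sectors,
				   uint32_t max_io_buf_bytes)
{
	/* Upper bound: the check that already existed. */
	if (max_sectors > (max_io_buf_bytes >> SK_SECTOR_SHIFT))
		return -EINVAL;
	/* Lower bound: the check this commit describes adding. */
	if (max_sectors < SK_PAGE_SECTORS)
		return -EINVAL;
	return 0;
}
```

Rejecting the value up front means a zero or tiny max_sectors never reaches
the stricter limit validation further down the stack.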

Fixes: 494ea040bcb5 ("ublk: pass queue_limits to blk_mq_alloc_disk")
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260510144843.769031-1-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 weeks ago io_uring/fdinfo: translate SqThread PID through caller's pid_ns
Maoyi Xie [Sun, 10 May 2026 08:41:19 +0000 (16:41 +0800)] 
io_uring/fdinfo: translate SqThread PID through caller's pid_ns

SQPOLL stores current->pid (init_pid_ns view) in sqd->task_pid
at thread creation. fdinfo prints it raw via
seq_printf("SqThread:\t%d\n", sq_pid). A reader inside a
non-initial pid_ns sees the host PID, not the kthread's PID in
the reader's own pid_ns.

The SQPOLL kthread is created with CLONE_THREAD and no
CLONE_NEW*, so it lives in the submitter's pid_ns. An
unprivileged user_ns + pid_ns submitter can read fdinfo and
learn the host PID of a kthread whose in-namespace PID is
different.

Reproducer (mainline 7.0, KASAN): unshare CLONE_NEWUSER |
CLONE_NEWPID | CLONE_NEWNS, mount a private /proc, then have a
grandchild that is pid 1 in the new pid_ns open an io_uring
ring with IORING_SETUP_SQPOLL. /proc/self/task lists {1, 2};
the SQPOLL kthread is pid 2. Before: fdinfo prints
SqThread = <host pid>. After: SqThread = 2.

Use task_pid_nr_ns() against the proc inode's pid_ns to compute
sq_pid, instead of reading the stored sq->task_pid (which holds
the init_pid_ns view). pidfd_show_fdinfo() in kernel/pid.c
follows the same pattern.

Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Link: https://patch.msgid.link/20260510084119.457578-1-maoyi.xie@ntu.edu.sg
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 weeks ago iomap: add dirty page control to iomap_zero_iter
Chi Zhiling [Mon, 11 May 2026 09:40:07 +0000 (17:40 +0800)] 
iomap: add dirty page control to iomap_zero_iter

This patch prepares the iomap framework for exFAT's upcoming migration to
iomap. During testing of the exFAT iomap branch with xfstests generic/299 on
a VM with 8GB RAM and a 40GB disk, system unresponsiveness was observed.

iomap_zero_iter() lacked dirty page throttling, which could cause memory
pressure when exFAT's valid_size mechanism triggers large-scale zeroing
operations during writes beyond valid_size.

Align iomap_zero_iter() with iomap_write_iter() by adding
balance_dirty_pages_ratelimited() to throttle dirty page generation during
large zeroing operations.

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Link: https://patch.msgid.link/20260511094007.728011-1-chizhiling@163.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago iomap: avoid memset iomap when iter is done
Fengnan Chang [Mon, 20 Apr 2026 06:16:30 +0000 (14:16 +0800)] 
iomap: avoid memset iomap when iter is done

When iomap_iter() finishes its iteration (returns <= 0), it is no longer
necessary to memset the entire iomap and srcmap structures.

In high-IOPS scenarios (like 4k randread NVMe polling with io_uring),
where the majority of I/Os complete in a single extent map, this wastes
memory write bandwidth, as the caller will just discard the iterator.
Use this command to test:
taskset -c 30 ./t/io_uring -p1 -d512 -b4096 -s32 -c32 -F1 -B1 -R1 -X1
-n1 -P1 /mnt/testfile
IOPS improve about 5% on ext4 and XFS.

However, we MUST still call iomap_iter_reset_iomap() to release the
folio_batch if IOMAP_F_FOLIO_BATCH is set, otherwise we leak page
references. Therefore, split the cleanup logic: always release the
folio_batch, but skip the memset() when ret <= 0.
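
The split cleanup can be pictured with a small userspace model. The names here
(struct sk_iter, sk_iter_cleanup()) are hypothetical and the folio batch is
reduced to two flags; the sketch shows only the control flow, not the iomap
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified iterator state. */
struct sk_iter {
	int has_batch;           /* models IOMAP_F_FOLIO_BATCH being set */
	int batch_released;      /* set once the batch has been dropped */
	unsigned char iomap[64]; /* models the iomap/srcmap structures */
};

/* Always release the batch; clear the maps only if iteration continues. */
static void sk_iter_cleanup(struct sk_iter *it, int ret)
{
	assert(it != NULL);
	if (it->has_batch) {
		it->batch_released = 1; /* must happen, else references leak */
		it->has_batch = 0;
	}
	if (ret <= 0)
		return;                 /* iteration done: skip the memset */
	memset(it->iomap, 0, sizeof(it->iomap));
}
```

For an I/O that completes within a single extent map, the final call takes
the early return and never touches the map memory.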

Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Link: https://patch.msgid.link/20260420061630.62077-1-changfengnan@bytedance.com
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago fs: fix forced iversion increment on lazytime timestamp updates
Pankaj Raghav [Mon, 11 May 2026 11:19:18 +0000 (13:19 +0200)] 
fs: fix forced iversion increment on lazytime timestamp updates

When updating timestamps with lazytime enabled, if only I_DIRTY_TIME is
set (pure lazytime update), inode_maybe_inc_iversion() should not be
forced to increment i_version. The force parameter should only be true
when actual data or metadata changes require an iversion bump.

The current code uses "!!dirty" which evaluates to true whenever dirty
has any bits set, including the I_DIRTY_TIME bit alone. This forces an
iversion increment on every lazytime timestamp update, which then sets
I_DIRTY_SYNC, triggering expensive log flushes on subsequent fdatasync
calls. Andres reported this issue when he noticed a perf regression[1].

Fix this by using "dirty != I_DIRTY_TIME" as the force parameter. This
passes false for pure lazytime updates (allowing the I_VERSION_QUERIED
optimization to work), while still forcing the increment when dirty
contains other flags indicating real changes that require iversion
updates.
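
The difference between the two expressions is easy to check in isolation. In
the sketch below the flag values are made up for illustration (the kernel's
actual bit assignments differ), and sk_force_old() and sk_force_new() are
hypothetical helpers that only evaluate the two conditions quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; only their distinctness matters here. */
#define SK_I_DIRTY_SYNC (1u << 0)
#define SK_I_DIRTY_TIME (1u << 1)

/* Old condition: any dirty bit forces the increment. */
static bool sk_force_old(unsigned int dirty)
{
	return !!dirty;
}

/* New condition: a pure lazytime update no longer forces it. */
static bool sk_force_new(unsigned int dirty)
{
	return dirty != SK_I_DIRTY_TIME;
}
```

For dirty == SK_I_DIRTY_TIME the old expression yields true and the new one
false; as soon as any other flag is set alongside it, both yield true.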

[1] https://lore.kernel.org/linux-xfs/7ys6erh3nnyeerv2nybyfvp7dmaknuxrlxv74wx56ocdothkc6@ekfiadtkfn2r/

Fixes: 85c871a02b03 ("fs: add support for non-blocking timestamp updates")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://patch.msgid.link/20260511111918.1793689-1-p.raghav@samsung.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks ago irqchip/econet-en751221: Support MIPS 34Kc VEIC mode
Caleb James DeLisle [Thu, 30 Apr 2026 16:41:57 +0000 (16:41 +0000)] 
irqchip/econet-en751221: Support MIPS 34Kc VEIC mode

The Vectored External Interrupt Controller mode present in the MIPS 34Kc
and 1004Kc variants causes the CPU to stop dispatching interrupts by the
normal code path and instead it sends those interrupts to the external
interrupt controller to be prioritized, renumbered, and sent back.  When
they come back, they are handled through a different path using a dispatch
table, so plat_irq_dispatch() never sees action.

This of course subverts the traditional intc hierarchy, and on the 1004Kc
the interrupt controller is standardized (IRQ_GIC) so it can be reasonably
considered part of the CPU itself - and tighter coupling between IRQ_GIC
and arch/mips/* is tolerable. However on the 34Kc the intc is defined by
each SoC vendor, so it's required to have a modular driver - but for a
device which in fact ends up taking over the entire interrupt system.

Let the DT describe which IRQs come from the CPU and should be
routed back and handled by the CPU intc. These particularly include the
two IPI interrupts which would otherwise necessitate duplication of all
the IPI supporting infrastructure from the CPU intc.

Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260430164157.6026-3-cjd@cjdns.fr
5 weeks ago dt-bindings: interrupt-controller: econet: Add CPU interrupt mapping
Caleb James DeLisle [Thu, 30 Apr 2026 16:41:56 +0000 (16:41 +0000)] 
dt-bindings: interrupt-controller: econet: Add CPU interrupt mapping

In MIPS VEIC mode (Vectored External Interrupt Controller), the
hardware stops directly dispatching CPU interrupts such as IPIs or CPU
performance counters, and instead it communicates them to the external
interrupt controller (the hardware described here) which prioritizes,
renumbers, and integrates them with its own hardware interrupt pins.
Interrupts from the external controller are then dispatched through a
different method via a dispatch table. In effect, the external
controller subsumes the CPU controller and becomes the root.

34K Manual (MD00534) Section 6.3.1.3 rev 1.13 page 136

Since there are interrupts which ought to be controlled by the CPU
controller driver - particularly the IPI interrupts - we create a
reverse mapping where those interrupts may be sent back to the CPU
intc when they are received. This maintains the fiction that there is
still a hierarchy, and keeps the DT the same no matter whether the
processor is in VEIC mode or not. The econet,cpu-interrupt-map is
optional and if omitted, it's assumed that no interrupts need to be
mapped.

Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260430164157.6026-2-cjd@cjdns.fr
5 weeks ago mmc: dw_mmc: Move misplaced comment
Shawn Lin [Thu, 9 Apr 2026 07:48:12 +0000 (15:48 +0800)] 
mmc: dw_mmc: Move misplaced comment

It was originally part of the @cmd_status field description but became
separated and now appears between @ring_size and @dms without proper context.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 weeks ago mmc: core: Add validation for host-provided max_segs
Shawn Lin [Thu, 9 Apr 2026 07:48:11 +0000 (15:48 +0800)] 
mmc: core: Add validation for host-provided max_segs

The max_segs field is of type unsigned short, and if a host driver
sets an excessively large value, it may be truncated to zero. This
can cause mmc_alloc_sg() to call kmalloc_objs() with a zero size
allocation request, which leads to undefined behavior.

Under the SLUB allocator, kmalloc(0) returns a special pointer
(ZERO_SIZE_PTR). The subsequent 'if (sg)' check will evaluate to
true, and sg_init_table() will then attempt to access invalid memory,
resulting in a crash:

dwmmc_rockchip 2a310000.mmc: Successfully tuned phase to 133
mmc1: new UHS-I speed SDR104 SDHC card at address aaaa
Unable to handle kernel paging request at virtual address 0000001ffffffff0
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000102c88000
[0000001ffffffff0] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in:
CPU: 2 UID: 0 PID: 102 Comm: kworker/2:1 Not tainted 7.0.0-rc6-next-20260331-00013-g4d93c25963c5-dirty #80 PREEMPT
Hardware name: Rockchip RK3576 EVB V10 Board (DT)
Workqueue: events_freezable mmc_rescan
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : sg_init_table+0x2c/0x50
lr : sg_init_table+0x24/0x50
sp : ffff8000837db710
x29: ffff8000837db710 x28: 000000000000c000 x27: 0000000000000300
x26: 0000000000000000 x25: 0000000000000040 x24: ffff0000c46a0000
x23: 0000000000000000 x22: ffff0000c0c73c00 x21: 0000000000000010
x20: 0000000000000010 x19: 0000000000000000 x18: 000000000000002c
x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000000
x14: 0000000000000400 x13: ffff8000837dc000 x12: 0000000000000000
x11: ffff0000c0c73ca0 x10: 0000000000000040 x9 : 459ec1f0abbdbb00
x8 : 0000001fffffffe0 x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000035579 x4 : 0000000000000901 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000010
Call trace:
sg_init_table+0x2c/0x50 (P)
mmc_mq_init_request+0x64/0x90
blk_mq_alloc_map_and_rqs+0x3ac/0x480
blk_mq_alloc_set_map_and_rqs+0x98/0x1e0
blk_mq_alloc_tag_set+0x1c0/0x290
mmc_init_queue+0x120/0x370
mmc_blk_alloc_req+0x150/0x420

To prevent this, add a validation check in mmc_mq_init_request() to
detect when sg_len (derived from max_segs) is zero. If sg_len is zero,
we return an error and print an error message, allowing host driver
developers to identify and fix incorrect max_segs configuration.

This is a defensive measure that ensures the MMC core fails gracefully
when host drivers provide invalid max_segs values, rather than crashing
with a page fault.
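
The truncation and the defensive check can be reproduced in a small userspace
sketch. The names struct sk_host, sk_set_max_segs() and sk_init_request() are
hypothetical; only the unsigned short field type and the "reject zero"
behaviour are taken from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical host state; only the field type mirrors the description. */
struct sk_host {
	unsigned short max_segs;
};

/* Storing an out-of-range value wraps modulo USHRT_MAX + 1. */
static void sk_set_max_segs(struct sk_host *host, unsigned int requested)
{
	assert(host != NULL);
	host->max_segs = (unsigned short)requested;
}

/* Refuse to set up a request when the derived segment count is zero. */
static int sk_init_request(const struct sk_host *host)
{
	unsigned int sg_len = host->max_segs;

	if (sg_len == 0)
		return -EINVAL; /* fail early instead of allocating 0 bytes */
	return 0;
}
```

A driver that requests USHRT_MAX + 1 segments ends up with max_segs == 0 in
this model, and the check turns that into an error return rather than a
later fault.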

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 weeks ago rust: driver core: remove drvdata() and driver_type
Danilo Krummrich [Tue, 5 May 2026 15:23:09 +0000 (17:23 +0200)] 
rust: driver core: remove drvdata() and driver_type

When drvdata() was introduced in commit 6f61a2637abe ("rust: device:
introduce Device::drvdata()"), its commit message already noted that a
direct accessor to the driver's bus device private data is not commonly
required -- bus callbacks provide access through &self, and other entry
points (IRQs, workqueues, IOCTLs, etc.) carry their own private data.

The sole motivation for drvdata() was inter-driver interaction -- an
auxiliary driver deriving the parent's bus device private data from the
parent device.

However, drvdata() exposes the driver's bus device private data beyond
the driver's own scope. This creates ordering constraints; for instance
drvdata may not be set yet when the first caller of drvdata() can
appear. It also forces the driver's bus device private data to outlive
all registrations that access it, which causes unnecessary
complications.

Private data should be private to the entity that issues it, i.e. bus
device private data belongs to bus callbacks, class device private data
to class callbacks, IRQ private data to the IRQ handler, etc.

With registration-private data now available through the auxiliary bus,
there is no remaining user of drvdata(), thus remove it.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260505152400.3905096-4-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
5 weeks ago rust: auxiliary: add registration data to auxiliary devices
Danilo Krummrich [Tue, 5 May 2026 15:23:08 +0000 (17:23 +0200)] 
rust: auxiliary: add registration data to auxiliary devices

Add a registration_data pointer to struct auxiliary_device, allowing the
registering (parent) driver to attach private data to the device at
registration time and retrieve it later when called back by the
auxiliary (child) driver.

By tying the data to the device's registration, Rust drivers can bind
the lifetime of device resources to it, since the auxiliary bus
guarantees that the parent driver remains bound while the auxiliary
device is bound.

On the Rust side, Registration<T> takes ownership of the data via
ForeignOwnable. A TypeId is stored alongside the data for runtime type
checking, making Device::registration_data<T>() a safe method.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260505152400.3905096-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
5 weeks ago rust: alloc: add Box::zeroed()
Danilo Krummrich [Tue, 5 May 2026 15:23:07 +0000 (17:23 +0200)] 
rust: alloc: add Box::zeroed()

Add Box::zeroed() for T: Zeroable types.

This allocates with __GFP_ZERO directly, letting the underlying
allocator zero the memory, instead of constructing a zeroed value
first as Box::new(T::zeroed(), flags) does.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260505152400.3905096-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
5 weeks ago gpio: spear-spics: Add COMPILE_TEST support
Rosen Penev [Sun, 10 May 2026 19:55:31 +0000 (12:55 -0700)] 
gpio: spear-spics: Add COMPILE_TEST support

The SPEAr SPI chip-select GPIO driver only depends on generic platform,
OF, and MMIO interfaces, so it can be built outside SPEAr platform
configurations.

Enable compile-test coverage to catch build regressions on other
architectures.

Assisted-by: Codex:GPT-5.5
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260510195531.10561-1-rosenp@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
5 weeks ago irqchip/riscv-imsic: Clear interrupt move state during CPU offlining
Yong-Xuan Wang [Fri, 8 May 2026 09:31:21 +0000 (02:31 -0700)] 
irqchip/riscv-imsic: Clear interrupt move state during CPU offlining

Affinity changes of IMSIC interrupts have to be careful to not lose an
interrupt in the process. Each vector keeps track of an affinity change in
progress with two pointers in struct imsic_vector.

imsic_vector::move_prev points to the previous CPU target data and
imsic_vector::move_next to the designated new CPU target data.

imsic_vector::move_prev on the new CPU can only be cleared after the
previous CPU has cleared imsic_vector::move_next, which usually happens in
__imsic_remote_sync().

In case of CPU hot-unplug __imsic_remote_sync() is not invoked because the
CPU is already marked offline. That means imsic_vector::move_prev becomes
stale until the CPU is onlined again.

The stale pointer prevents further affinity changes for the affected
interrupts.

Solve this by clearing the imsic_vector::move_prev pointers in the CPU
hotplug offline path.

[ tglx: Replace word salad in change log ]

Fixes: 0f67911e821c ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260508-imsic-v2-1-e9f08dd46cf5@sifive.com
5 weeks ago irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()
Xianwei Zhao [Fri, 8 May 2026 07:36:54 +0000 (07:36 +0000)] 
irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()

meson_s4_gpio_irq_set_type() uses the both-edge trigger register for
configuring level type and single edge mode interrupts, which is not
correct.

Use REG_EDGE_POL instead.

Fixes: bbd6fcc76b39 ("irqchip: Add support for Amlogic A4 and A5 SoCs")
Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260508-a9-gpio-irqchip-v1-1-9dc5f3e022e0@amlogic.com
5 weeks agoirqchip/meson-gpio: Add support for Amlogic A9 SoCs
Xianwei Zhao [Fri, 8 May 2026 07:36:56 +0000 (07:36 +0000)] 
irqchip/meson-gpio: Add support for Amlogic A9 SoCs

The Amlogic A9 SoCs support the following GPIO interrupt lines:
A9 IRQ Number:
        - 95:86   10 pins on bank Y
        - 85:84    2 pins on bank CC
        - 83:64   20 pins on bank A
        - 63:48   16 pins on bank Z
        - 47:30   18 pins on bank X
        - 29:22    8 pins on bank H
        - 21:14    8 pins on bank M
        - 13:0    14 pins on bank B

A9 AO IRQ Number:
        - 38       1 pin on bank TESTN
        - 37:31    7 pins on bank C
        - 30:13   18 pins on bank D
        - 12:0    13 pins on bank AO

Update the driver to handle these variants.
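
The main A9 IRQ-number list above can be read as a range lookup table. The
following is a minimal sketch of that reading, not the driver's actual data
structure: the struct, table and function names are hypothetical, and only
the main (non-AO) table is modelled.

```c
#include <stddef.h>

/* Hypothetical description of one contiguous IRQ range and its GPIO bank. */
struct a9_irq_range {
	unsigned int first;	/* lowest IRQ number in the range */
	unsigned int last;	/* highest IRQ number in the range */
	const char *bank;	/* bank name as given in the commit message */
};

/* Main A9 table, transcribed from the list in the commit message. */
static const struct a9_irq_range a9_irq_ranges[] = {
	{ 86, 95, "Y"  },	/* 10 pins */
	{ 84, 85, "CC" },	/*  2 pins */
	{ 64, 83, "A"  },	/* 20 pins */
	{ 48, 63, "Z"  },	/* 16 pins */
	{ 30, 47, "X"  },	/* 18 pins */
	{ 22, 29, "H"  },	/*  8 pins */
	{ 14, 21, "M"  },	/*  8 pins */
	{  0, 13, "B"  },	/* 14 pins */
};

/* Return the bank that owns @irq, or NULL if @irq is outside 0..95. */
static const char *a9_bank_for_irq(unsigned int irq)
{
	size_t i;

	for (i = 0; i < sizeof(a9_irq_ranges) / sizeof(a9_irq_ranges[0]); i++) {
		if (irq >= a9_irq_ranges[i].first && irq <= a9_irq_ranges[i].last)
			return a9_irq_ranges[i].bank;
	}
	return NULL;
}
```

With this table, a9_bank_for_irq(84) yields "CC" and any number above 95
yields NULL. The AO table could be modelled the same way with a second array.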

Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260508-a9-gpio-irqchip-v1-3-9dc5f3e022e0@amlogic.com
5 weeks agodt-bindings: interrupt-controller: Add support for Amlogic A9 SoCs
Xianwei Zhao [Fri, 8 May 2026 07:36:55 +0000 (07:36 +0000)] 
dt-bindings: interrupt-controller: Add support for Amlogic A9 SoCs

Update dt-binding document for GPIO interrupt controller
of Amlogic A9 SoCs.

Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260508-a9-gpio-irqchip-v1-2-9dc5f3e022e0@amlogic.com
5 weeks agoMerge branch 'irq/urgent' into irq/drivers
Thomas Gleixner [Mon, 11 May 2026 13:07:23 +0000 (15:07 +0200)] 
Merge branch 'irq/urgent' into irq/drivers

to synchronize upstream fixes on which other changes depend.

5 weeks agoirqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()
Xianwei Zhao [Fri, 8 May 2026 07:36:54 +0000 (07:36 +0000)] 
irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()

meson_s4_gpio_irq_set_type() uses the both-edge trigger register for
configuring level type and single edge mode interrupts, which is not
correct.

Use REG_EDGE_POL instead.

Fixes: bbd6fcc76b39 ("irqchip: Add support for Amlogic A4 and A5 SoCs")
Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260508-a9-gpio-irqchip-v1-1-9dc5f3e022e0@amlogic.com
5 weeks agoirqchip/ath79-cpu: Remove unused function
Rosen Penev [Wed, 6 May 2026 08:55:22 +0000 (01:55 -0700)] 
irqchip/ath79-cpu: Remove unused function

ath79_cpu_irq_init() was part of the legacy pre-OF code that got removed a
while back.

Remove it to get rid of a missing prototype warning, reported by the kernel test
robot.

[ tglx: Fix the subject prefix. Sigh ... ]

Fixes: 51fa4f8912c0 ("MIPS: ath79: drop legacy IRQ code")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260506085522.1210143-1-rosenp@gmail.com
Closes: https://lore.kernel.org/oe-kbuild-all/202412011509.kGQkDr1y-lkp@intel.com/
5 weeks agogenirq/chip: Don't call add_interrupt_randomness() for NMIs
Mark Rutland [Thu, 7 May 2026 11:05:18 +0000 (12:05 +0100)] 
genirq/chip: Don't call add_interrupt_randomness() for NMIs

Recently handle_percpu_devid_irq() was changed to call
add_interrupt_randomness(). This introduced a potential deadlock when
handle_percpu_devid_irq() is used to handle an NMI, which can be
detected with lockdep, e.g.

    ================================
    WARNING: inconsistent lock state
    7.1.0-rc2-pnmi #465 Not tainted
    --------------------------------
    inconsistent {INITIAL USE} -> {IN-NMI} usage.
    perf/695 [HC1[1]:SC0[0]:HE0:SE1] takes:
    ffff00837dfd3a18 (&base->lock){-.-.}-{2:2}, at: lock_timer_base+0x6c/0xac
    {INITIAL USE} state was registered at:
      _raw_spin_lock_irqsave+0x68/0xb0
      lock_timer_base+0x6c/0xac
      __mod_timer+0x100/0x32c
      add_timer_global+0x2c/0x40
      __queue_delayed_work+0xf0/0x140
      queue_delayed_work_on+0x134/0x138
      mem_cgroup_css_online+0x30c/0x310
      online_css+0x34/0x10c
      cgroup_init_subsys+0x158/0x1c8
      cgroup_init+0x440/0x524
      start_kernel+0x888/0x998

    other info that might help us debug this:
    Possible unsafe locking scenario:
           CPU0
           ----
      lock(&base->lock);
      <Interrupt>
        lock(&base->lock);
        *** DEADLOCK ***

    Call trace:
     _raw_spin_lock_irqsave+0x68/0xb0
     lock_timer_base+0x6c/0xac
     add_timer_on+0x78/0x16c
     add_interrupt_randomness+0x124/0x134
     handle_percpu_devid_irq+0xd4/0x16c
     handle_irq_desc+0x40/0x58
     generic_handle_domain_nmi+0x28/0x50
     __gic_handle_nmi.isra.0+0x4c/0xa0
     gic_handle_irq+0x38/0x2bc
     call_on_irq_stack+0x30/0x48
     do_interrupt_handler+0x80/0x98
     el1_interrupt+0x90/0xac
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x80/0x84
     [...]

During review, Thomas pointed out it wouldn't be safe for
handle_percpu_devid_irq() to call add_interrupt_randomness() if it was
used to handle NMIs:

  https://lore.kernel.org/lkml/87bjgik042.ffs@tglx/

... but evidently people missed that handle_percpu_devid_irq() *is* used
for NMIs.

While it might seem that NMIs should be handled with a separate
handle_percpu_devid_nmi() function, for various structural reasons this was
impractical, and handle_percpu_devid_irq() has been expected to be used for
NMIs since commits:

  21bbbc50f398f ("irqchip/gic-v3: Switch high priority PPIs over to handle_percpu_devid_irq()")
  5ff78c8de9d83 ("genirq: Kill handle_percpu_devid_fasteoi_nmi()")

Taking the above into account, avoid the deadlock by not calling
add_interrupt_randomness() when handle_percpu_devid_irq() is called in an
NMI context. This is consistent with other NMI handling flows, which do not
call add_interrupt_randomness().

At the same time, update the kernel-doc comment to make it clear that
handle_percpu_devid_irq() can be called in NMI context. The rest of
handle_percpu_devid_irq() is currently NMI safe and doesn't need to change.
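
The shape of the fix can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the kernel code: every
model_* name is hypothetical, and the NMI test is reduced to a plain flag
standing in for whatever context check the real handler uses.

```c
#include <stdbool.h>

/* Stand-ins for kernel state and helpers; none of these are kernel symbols. */
static bool model_nmi_context;			/* plays the role of an in-NMI test */
static unsigned int model_entropy_calls;	/* calls to the entropy hook */
static unsigned int model_handler_calls;	/* calls to the per-CPU handler */

/* Models add_interrupt_randomness(), which may take the timer base lock. */
static void model_add_interrupt_randomness(void)
{
	model_entropy_calls++;
}

/* Models the flow handler: always run the action, feed entropy only for IRQs. */
static void model_handle_percpu_devid_irq(void)
{
	model_handler_calls++;

	/* Skipping the hook in NMI context avoids re-taking a held lock. */
	if (!model_nmi_context)
		model_add_interrupt_randomness();
}
```

Calling the model once with the flag clear and once with it set runs the
handler twice but the entropy hook only once.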

Fixes: fd7400cfcbaa ("genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()")
Reported-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260507110518.3128248-1-mark.rutland@arm.com
5 weeks agoirqchip/gic-v5: Allocate ITS parent LPIs as a range
Sascha Bischoff [Wed, 6 May 2026 09:37:43 +0000 (09:37 +0000)] 
irqchip/gic-v5: Allocate ITS parent LPIs as a range

The ITS MSI domain no longer manages LPI allocation directly. LPIs are
allocated and freed by the parent LPI domain, which can now handle a
full range of interrupts and unwind partial allocations internally.

Make the ITS domain request and release the parent IRQs as a single
range instead of iterating over each interrupt. The ITS allocation
path then only needs to reserve EventIDs, allocate the parent range,
and fill in the ITS irq_data for each MSI. Since no operation in the
per-MSI loop can fail, the partial parent-free unwind becomes
unnecessary.

On teardown, reset the ITS irq_data for the range and then release the
parent range in one call, leaving LPI teardown to the LPI domain.

Fixes: 0f0101325876 ("irqchip/gic-v5: Add GICv5 LPI/IPI support")
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260506093634.382062-4-sascha.bischoff@arm.com
5 weeks agoirqchip/gic-v5: Support range allocation for LPIs
Sascha Bischoff [Wed, 6 May 2026 09:37:23 +0000 (09:37 +0000)] 
irqchip/gic-v5: Support range allocation for LPIs

The per-IPI parent allocation loop returns immediately on failure and leaks
any parent interrupts allocated by earlier iterations.

The GICv5 LPI domain now owns LPI allocation and teardown internally,
but its irq_domain callbacks still reject requests where nr_irqs is
greater than one. This forces child domains to allocate and free LPIs
one at a time even when the interrupt core requests a contiguous
range.

Handle multi-interrupt allocation and teardown in the LPI domain by
iterating over the requested range and unwinding any partially
allocated state on failure.

Allocate the parent LPIs for the IPI domain with a single range
request as well, which cures the leakage problem.
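
The range allocation with unwinding described above can be sketched as
below. This is an illustrative userspace model, assuming a tiny fixed pool of
IDs. The model_* names and the pool size are hypothetical and do not reflect
the GICv5 driver's data structures.

```c
#include <stdbool.h>

#define MODEL_NR_LPIS 8		/* hypothetical pool size */

static bool model_lpi_used[MODEL_NR_LPIS];

/* Take the first free ID, or return -1 when the pool is exhausted. */
static int model_lpi_alloc_one(void)
{
	for (int i = 0; i < MODEL_NR_LPIS; i++) {
		if (!model_lpi_used[i]) {
			model_lpi_used[i] = true;
			return i;
		}
	}
	return -1;
}

static void model_lpi_free_one(int id)
{
	model_lpi_used[id] = false;
}

/* Allocate @nr IDs into @ids; on failure release what was taken, return -1. */
static int model_lpi_alloc_range(int *ids, int nr)
{
	for (int i = 0; i < nr; i++) {
		ids[i] = model_lpi_alloc_one();
		if (ids[i] < 0) {
			/* Unwind only the entries allocated by earlier iterations. */
			while (i-- > 0)
				model_lpi_free_one(ids[i]);
			return -1;
		}
	}
	return 0;
}

/* Number of IDs currently allocated, used to check for leaks. */
static int model_lpi_in_use(void)
{
	int n = 0;

	for (int i = 0; i < MODEL_NR_LPIS; i++)
		n += model_lpi_used[i];
	return n;
}
```

A request that fails part way through leaves the pool exactly as it found
it, which is the property the per-IPI loop was missing.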

Fixes: 0f0101325876 ("irqchip/gic-v5: Add GICv5 LPI/IPI support")
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260506093634.382062-3-sascha.bischoff@arm.com
5 weeks agoirqchip/gic-v5: Move LPI allocation into the LPI domain
Sascha Bischoff [Wed, 6 May 2026 09:37:02 +0000 (09:37 +0000)] 
irqchip/gic-v5: Move LPI allocation into the LPI domain

The IPI and ITS MSI domains currently allocate and release LPIs
directly, then pass the selected LPI ID to the parent LPI domain. This
leaks the LPI domain's allocation policy into its child domains and
forces each child to duplicate part of the parent domain's teardown.

Make the LPI domain allocate LPIs in its .alloc() callback and release
them in a matching .free() callback. Child domains can then request a
parent interrupt without passing an implementation-specific LPI ID,
and the LPI lifetime is tied to the domain that owns the LPI
namespace.

Remove the gicv5_alloc_lpi() and gicv5_free_lpi() wrappers now that no
external caller needs to manage LPIs directly.

This is a preparatory change for an actual leakage problem in the
allocation code and therefore tagged with the same Fixes tag.

Fixes: 0f0101325876 ("irqchip/gic-v5: Add GICv5 LPI/IPI support")
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260506093634.382062-2-sascha.bischoff@arm.com
5 weeks agostaging: rtl8723bs: fix buffer over-read in rtw_update_protection
Salman Alghamdi [Fri, 8 May 2026 22:26:14 +0000 (01:26 +0300)] 
staging: rtl8723bs: fix buffer over-read in rtw_update_protection

rtw_update_protection() is called with a pointer offset into the
ies buffer but the full ie_length is passed, causing a potential
buffer over-read.
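
The class of bug can be illustrated with a small userspace sketch. It is not
the driver code: the model_* helpers are hypothetical, and the element layout
(one ID byte, one length byte, then the payload) is assumed for illustration.
The point is that a parser handed a pointer advanced by some offset must also
be handed a length reduced by that offset.

```c
#include <stddef.h>

/* Bytes left in a buffer of @total bytes once @offset bytes are skipped. */
static size_t model_ies_remaining(size_t total, size_t offset)
{
	return total > offset ? total - offset : 0;
}

/* Count well-formed (id, len, payload) elements within @len bytes of @ies. */
static size_t model_count_ies(const unsigned char *ies, size_t len)
{
	size_t pos = 0, n = 0;

	/* Stop as soon as a header or payload would cross the end of the buffer. */
	while (len - pos >= 2 && (size_t)ies[pos + 1] <= len - pos - 2) {
		pos += 2 + (size_t)ies[pos + 1];
		n++;
	}
	return n;
}
```

For a 15-byte buffer whose elements start at a (hypothetical) offset of 12,
the parser should be called as model_count_ies(buf + 12,
model_ies_remaining(15, 12)), so it sees 3 bytes rather than 15.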

Fixes: e945c43df60b ("Staging: rtl8723bs: Delete dead code from update_current_network()")
Fixes: d3fcee1b78a5 ("staging: rtl8723bs: fix camel case in struct wlan_bssid_ex")
Reported-by: Luka Gejak <luka.gejak@linux.dev>
Closes: https://lore.kernel.org/linux-staging/DI2H39EAAFBZ.3KI5NWN02AQ2S@linux.dev
Cc: stable@vger.kernel.org
Signed-off-by: Salman Alghamdi <me@cipherat.com>
Reviewed-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260508222649.23989-1-me@cipherat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 weeks agoMerge tag 'ib-gpio-add-gpiod-is-single-ended-for-v7.2' of git://git.kernel.org/pub...
Bartosz Golaszewski [Mon, 11 May 2026 12:31:05 +0000 (14:31 +0200)] 
Merge tag 'ib-gpio-add-gpiod-is-single-ended-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into gpio/for-next

Immutable branch between the GPIO and I2C trees for v7.2-rc1

- add the gpiod_is_single_ended() helper function

5 weeks agogpiolib: add gpiod_is_single_ended() helper
Jie Li [Mon, 11 May 2026 11:37:25 +0000 (13:37 +0200)] 
gpiolib: add gpiod_is_single_ended() helper

The direction of a single-ended (open-drain or open-source) GPIO line
cannot always be reliably determined by reading hardware registers.
In true open-drain implementations, the "high" state is achieved by
entering a high-impedance mode, which many hardware controllers report
as "input" even if the software intends to use it as an output.

This creates issues for consumer drivers (like I2C) that rely on
gpiod_get_direction() to decide if a line can be driven.

Introduce gpiod_is_single_ended() to allow consumers to check the
software configuration (GPIO_FLAG_OPEN_DRAIN/GPIO_FLAG_OPEN_SOURCE) of
a descriptor. This provides a robust way to identify lines that are
capable of being driven, regardless of their instantaneous hardware state.
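
A minimal sketch of the idea, checking software configuration rather than
hardware state, is shown below. It is a userspace model only: the struct,
the flag bit positions and the function name are hypothetical, and the
kernel's descriptor layout and GPIO_FLAG_* values are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the open-drain/open-source flags. */
#define MODEL_FLAG_OPEN_DRAIN	(1UL << 0)
#define MODEL_FLAG_OPEN_SOURCE	(1UL << 1)

/* Simplified descriptor holding only the software configuration flags. */
struct model_gpio_desc {
	unsigned long flags;
};

/* True if the line was configured as open-drain or open-source in software. */
static bool model_is_single_ended(const struct model_gpio_desc *desc)
{
	if (!desc)
		return false;
	return (desc->flags &
		(MODEL_FLAG_OPEN_DRAIN | MODEL_FLAG_OPEN_SOURCE)) != 0;
}
```

A consumer could use such a check to treat a line as drivable even when a
direction read-back reports "input" because the line is currently
high-impedance.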

Signed-off-by: Jie Li <jie.i.li@nokia.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260511113726.49041-2-jie.i.li@nokia.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
5 weeks agofuse: fix writeback array overflow when max_pages is one
Junxi Qian [Wed, 6 May 2026 12:24:15 +0000 (20:24 +0800)] 
fuse: fix writeback array overflow when max_pages is one

fuse_iomap_writeback_range() appends one folio pointer and one
fuse_folio_desc for every dirty range that is merged into the current
writeback request.  The merge decision checks the byte budget against
fc->max_pages and fc->max_write, but it does not check whether the folio
and descriptor arrays still have another free slot.

This is not sufficient for fuseblk, where the filesystem block size can
be smaller than PAGE_SIZE.  With writeback cache enabled and max_pages
negotiated as one, contiguous sub-page dirty ranges can fit within the
byte budget while spanning more than one folio.  The next append can then
write past the one-slot folios and descs arrays.

Split the request when the number of already attached folios has reached
fc->max_pages.  This keeps the folio/descriptor slot accounting in sync
with the send decision.
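
The extra check can be sketched as follows. This is a simplified model, not
the fuse code: the model_* names are hypothetical, the byte budget is
collapsed into a single max_bytes value, and the other merge conditions of
the real function (such as contiguity) are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified view of a writeback request under construction. */
struct model_wb_req {
	size_t nr_folios;	/* slots already used in the folio/desc arrays */
	size_t nr_bytes;	/* bytes already queued in this request */
};

/* Return true when the next dirty range must start a new request. */
static bool model_wb_must_split(const struct model_wb_req *req,
				size_t add_bytes, size_t max_pages,
				size_t max_bytes)
{
	if (req->nr_bytes + add_bytes > max_bytes)
		return true;	/* byte budget exhausted */
	if (req->nr_folios >= max_pages)
		return true;	/* arrays are full even though bytes still fit */
	return false;
}
```

With max_pages set to one, a request that already holds one folio with 1024
bytes is split before a further 1024-byte range is appended, although the
byte budget alone would have allowed the merge.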

Fixes: ef7e7cbb323f ("fuse: use iomap for writeback")
Cc: stable@vger.kernel.org
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Junxi Qian <qjx1298677004@gmail.com>
Link: https://patch.msgid.link/20260506122415.205340-1-qjx1298677004@gmail.com
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agofs: Fix return in jfs_mkdir and orangefs_mkdir
Hongling Zeng [Fri, 1 May 2026 07:10:58 +0000 (15:10 +0800)] 
fs: Fix return in jfs_mkdir and orangefs_mkdir

Return NULL instead of passing zero to ERR_PTR() when err is zero.
This fixes these smatch warnings:
  - fs/jfs/namei.c:311 jfs_mkdir() warn: passing zero to 'ERR_PTR'
  - fs/orangefs/namei.c:369 orangefs_mkdir() warn: passing zero
    to 'ERR_PTR'
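
The pattern that smatch asks for can be sketched in userspace as below. The
model_* functions are hypothetical stand-ins, and the sketch assumes an
error-pointer encoding in which the errno value is stored directly in the
pointer. It shows the explicit success path only and does not reproduce
either filesystem's mkdir implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for ERR_PTR(): encode a negative errno in a pointer. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Tail of a mkdir-style function: explicit NULL on success. */
static void *model_mkdir_tail(long err)
{
	if (err)
		return model_err_ptr(err);
	return NULL;	/* success is spelled out instead of encoding zero */
}
```

Under this encoding a zero error would produce a null pointer either way, so
the change is about making the success path explicit rather than about
behavior.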

Fixes: 88d5baf69082 ("Change inode_operations.mkdir to return struct dentry *")
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Link: https://patch.msgid.link/20260501071058.1243245-1-zenghongling@kylinos.cn
Reviewed-by: Jori Koolstra <jkoolstra@xs4all.nl>
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agofs/statmount: fix slab out-of-bounds write in statmount_mnt_idmap
Junyoung Jang [Mon, 4 May 2026 11:26:49 +0000 (20:26 +0900)] 
fs/statmount: fix slab out-of-bounds write in statmount_mnt_idmap

statmount_mnt_idmap() writes one mapping with seq_printf() and then
manually advances seq->count to include the NUL separator.

If seq_printf() overflows, seq_set_overflow() sets seq->count to
seq->size. The manual seq->count++ changes this to seq->size + 1.
seq_has_overflowed() then no longer detects the overflow. The corrupted
count returns to statmount_string(), which later executes:

    seq->buf[seq->count++] = '\0';

This causes a 1-byte NUL out-of-bounds write on the dynamically
allocated seq buffer.

Fix this by checking for overflow immediately after seq_printf().
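
The overflow convention and the ordering of the check can be modelled in
userspace as follows. This is a sketch under stated assumptions: the model_*
types and functions are hypothetical stand-ins for the seq_file helpers, and
a plain string append replaces seq_printf().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified seq buffer: count == size is the "overflowed" marker. */
struct model_seq {
	char *buf;
	size_t size;
	size_t count;
};

static bool model_seq_has_overflowed(const struct model_seq *m)
{
	return m->count == m->size;
}

/* Append @s, or mark the buffer as overflowed if it does not fit. */
static void model_seq_puts(struct model_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len >= m->size) {
		m->count = m->size;	/* overflow marker, as described above */
		return;
	}
	memcpy(m->buf + m->count, s, len);
	m->count += len;
}

/* Emit one mapping plus a NUL separator; false means it did not fit. */
static bool model_emit_mapping(struct model_seq *m, const char *mapping)
{
	model_seq_puts(m, mapping);
	if (model_seq_has_overflowed(m))
		return false;	/* check first, so count never exceeds size */
	m->buf[m->count++] = '\0';
	return true;
}
```

Because the overflow test runs before the manual increment, count can reach
size but never size + 1, so the marker stays detectable by later callers.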

Fixes: 37c4a9590e1e ("statmount: allow to retrieve idmappings")
Signed-off-by: Junyoung Jang <graypanda.inzag@gmail.com>
Link: https://patch.msgid.link/20260504112649.1862936-1-graypanda.inzag@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agospi: Use FIELD_MODIFY() for bitfield operations
Mark Brown [Mon, 11 May 2026 12:05:13 +0000 (21:05 +0900)] 
spi: Use FIELD_MODIFY() for bitfield operations

Hans Zhang <18255117159@163.com> says:

Replace open-coded bitfield modifications with the standard FIELD_MODIFY()
macro across multiple SPI controller drivers. This improves readability and
adds compile-time checking without functional changes.

Each patch modifies a single driver, allowing independent review and
application.
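
The difference between the two styles can be sketched in userspace as below.
This is a simplified model, not the kernel macro: the MODEL_GENMASK macro and
the model_* functions are hypothetical, the argument order is an assumption,
and the compile-time checking mentioned above is not reproduced.

```c
#include <stdint.h>

/* Userspace stand-in for a contiguous 32-bit mask covering bits h..l. */
#define MODEL_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Position of the lowest set bit in @mask (0 for an empty mask). */
static unsigned int model_field_shift(uint32_t mask)
{
	unsigned int shift = 0;

	while (mask && !(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Open-coded style: the mask and the shift are written out separately. */
static void model_open_coded(uint32_t *reg, uint32_t val)
{
	*reg &= ~MODEL_GENMASK(7, 4);
	*reg |= (val << 4) & MODEL_GENMASK(7, 4);
}

/* Helper style: the shift is derived from the mask, so they cannot drift. */
static void model_field_modify(uint32_t mask, uint32_t *reg, uint32_t val)
{
	*reg = (*reg & ~mask) | ((val << model_field_shift(mask)) & mask);
}
```

Both functions update bits 7..4 of the register identically. The helper form
keeps the field definition in one place, which is the readability gain the
series describes.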

Link: https://patch.msgid.link/20260430155456.36998-1-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: uniphier: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:56 +0000 (23:54 +0800)] 
spi: uniphier: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-11-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: sunplus-sp7021: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:55 +0000 (23:54 +0800)] 
spi: sunplus-sp7021: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-10-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: stm32-qspi: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:54 +0000 (23:54 +0800)] 
spi: stm32-qspi: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Patrice Chotard <patrice.chotard@foss.st.com>
Link: https://patch.msgid.link/20260430155456.36998-9-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: stm32-ospi: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:53 +0000 (23:54 +0800)] 
spi: stm32-ospi: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Patrice Chotard <patrice.chotard@foss.st.com>
Link: https://patch.msgid.link/20260430155456.36998-8-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: sn-f-ospi: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:52 +0000 (23:54 +0800)] 
spi: sn-f-ospi: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-7-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: nxp-xspi: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:51 +0000 (23:54 +0800)] 
spi: nxp-xspi: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260430155456.36998-6-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: meson-spicc: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:50 +0000 (23:54 +0800)] 
spi: meson-spicc: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-5-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: cadence-xspi: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:49 +0000 (23:54 +0800)] 
spi: cadence-xspi: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-4-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: amlogic-spisg: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:48 +0000 (23:54 +0800)] 
spi: amlogic-spisg: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-3-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agospi: amlogic-spifc-a1: Use FIELD_MODIFY()
Hans Zhang [Thu, 30 Apr 2026 15:54:47 +0000 (23:54 +0800)] 
spi: amlogic-spifc-a1: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260430155456.36998-2-18255117159@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 weeks agoplatform/x86: lenovo-wmi-other: Limit adding attributes to supported devices
Derek J. Clark [Sun, 10 May 2026 04:25:39 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Limit adding attributes to supported devices

Adds lwmi_is_attr_01_supported, and only creates the attribute subfolder
if the attribute is supported by the hardware. Due to some poorly
implemented BIOS this is a multi-step sequence of events. This is
because:
- Some BIOS support getting the capability data from custom mode (0xff),
  while others only support it in no-mode (0x00).
- Some BIOS support get/set for the current value from custom mode (0xff),
  while others only support it in no-mode (0x00).
- Some BIOS report capability data for a method that is not fully
  implemented.
- Some BIOS have methods fully implemented, but no complementary
  capability data.

To ensure we only expose fully implemented methods with corresponding
capability data, we check each outcome before reporting that an
attribute can be supported.

Checking for lwmi_is_attr_01_supported during remove is not done to
ensure that we don't attempt to call cd01 or send WMI events if one of
the interfaces being removed was the cause of the driver unloading.

Fixes: edc4b183b794 ("platform/x86: Add Lenovo Other Mode WMI Driver")
Reported-by: Kurt Borja <kuurtb@gmail.com>
Closes: https://lore.kernel.org/platform-driver-x86/DG60P3SHXR8H.3NSEHMZ6J7XRC@gmail.com/
Cc: stable@vger.kernel.org
Reviewed-by: Rong Zhang <i@rong.moe>
Tested-by: Rong Zhang <i@rong.moe>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-10-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-other: Add Attribute ID helper functions
Derek J. Clark [Sun, 10 May 2026 04:25:38 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Add Attribute ID helper functions

Adds lwmi_attr_id() function. It is in the same vein as
LWMI_ATTR_ID_FAN_RPM(), but generic, to de-duplicate attribute_id
assignment boilerplate.

Adds tunable_attr_01_id() function that breaks out the members of a
tunable_attr_01 struct and passes them to lwmi_attr_id().

No functional change intended.

Cc: stable@vger.kernel.org
Reviewed-by: Rong Zhang <i@rong.moe>
Tested-by: Rong Zhang <i@rong.moe>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-9-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-helpers: Move gamezone enums to wmi-helpers
Derek J. Clark [Sun, 10 May 2026 04:25:37 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-helpers: Move gamezone enums to wmi-helpers

In a later patch in the series the thermal mode enum will be accessed
across three separate drivers (wmi-capdata, wmi-gamezone and wmi-other).
An additional patch in the series will also add a function prototype that
needs to reference this enum in wmi-helpers.h. To avoid having all these
drivers begin to import each other's headers, and to avoid declaring an
opaque enum to handle the second case, move the thermal mode enum to
helpers where it can be safely accessed by everything that needs it from
a single import.

While at it, since the gamezone_events_type enum is the only remaining
item in the header, move that as well and remove the gamezone header
entirely.

Cc: stable@vger.kernel.org
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Rong Zhang <i@rong.moe>
Tested-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-8-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo: Decouple lenovo-wmi-gamezone and lenovo-wmi-other
Rong Zhang [Sun, 10 May 2026 04:25:36 +0000 (04:25 +0000)] 
platform/x86: lenovo: Decouple lenovo-wmi-gamezone and lenovo-wmi-other

Currently, lenovo-wmi-gamezone depends on lenovo-wmi-other as the former
imports symbols from the latter. The imported symbols are just used to
register a notifier block. However, there is no runtime dependency
between both drivers, and either of them can run without the other,
which is the major purpose of using the notifier framework.

Such a link-time dependency is non-optimal. A previous attempt to "fix"
it made LENOVO_WMI_GAMEZONE select LENOVO_WMI_TUNING, which was
fundamentally broken and resulted in undefined Kconfig behavior, as
`select' cannot be used on a symbol with potentially unmet dependencies.

Decouple both drivers by moving the thermal mode notifier chain to
lenovo-wmi-helpers. Methods for notifier block (un)registration are
exported for lenovo-wmi-gamezone, while a method for querying the
current thermal mode is exported for lenovo-wmi-other.

This turns the dependency graph from

            +------------ lenovo-wmi-gamezone
            |                     |
            v                     |
    lenovo-wmi-helpers            |
            ^                     |
            |                     V
            +------------ lenovo-wmi-other

into

            +------------ lenovo-wmi-gamezone
            |
            v
    lenovo-wmi-helpers
            ^
            |
            +------------ lenovo-wmi-other

To make it clear, the name of the notifier chain is also renamed from
`om_chain_head' to `tm_chain_head', indicating that it's used to query
the current thermal mode.

No functional change intended.

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Fixes: 6e38b9fcbfa3 ("platform/x86: lenovo: gamezone needs "other mode"")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603252259.gHvJDyh3-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202603260302.X0NjQOda-lkp@intel.com/
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-7-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-other: Fix tunable_attr_01 struct members
Derek J. Clark [Sun, 10 May 2026 04:25:35 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Fix tunable_attr_01 struct members

In struct tunable_attr_01 the capdata pointer is unused and the size of
the id members is u32 when it should be u8. Fix these prior to adding
additional members.

No functional change intended.

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Cc: stable@vger.kernel.org
Reviewed-by: Rong Zhang <i@rong.moe>
Tested-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-6-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-other: Zero initialize WMI arguments
Derek J. Clark [Sun, 10 May 2026 04:25:34 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Zero initialize WMI arguments

Adds explicit initialization of wmi_method_args_32 declarations with
zero values to prevent uninitialized data from being sent to the device
BIOS when passed.
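
A minimal illustration of the initialization pattern follows. The struct
below is a hypothetical two-word argument block used only for this sketch.
The driver's real wmi_method_args_32 layout may differ.

```c
#include <stdint.h>

/* Hypothetical two-argument block; not the driver's actual definition. */
struct model_wmi_args_32 {
	uint32_t arg0;
	uint32_t arg1;
};

/* Build the arguments for a call that only needs arg0. */
static struct model_wmi_args_32 model_build_args(uint32_t attribute_id)
{
	/* Zero every member first so unused ones never carry stack garbage. */
	struct model_wmi_args_32 args = { 0 };

	args.arg0 = attribute_id;
	return args;
}
```

Without the initializer, arg1 would hold whatever happened to be on the
stack when only arg0 is assigned.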

No functional change intended.

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Fixes: 22024ac5366f ("platform/x86: Add Lenovo Gamezone WMI Driver")
Fixes: edc4b183b794 ("platform/x86: Add Lenovo Other Mode WMI Driver")
Reported-by: Rong Zhang <i@rong.moe>
Closes: https://lore.kernel.org/platform-driver-x86/95c7e7b539dd0af41189c754fcd35cec5b6fe182.camel@rong.moe/
Cc: stable@vger.kernel.org
Reviewed-by: Rong Zhang <i@rong.moe>
Tested-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-5-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-other: Balance component bind and unbind
Rong Zhang [Sun, 10 May 2026 04:25:33 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Balance component bind and unbind

When lwmi_om_master_bind() fails, the master device's components are
left bound, with the aggregate device destroyed due to the failure
(found by sashiko.dev [1]).

Balance calls to component_bind_all() and component_unbind_all() when an
error is propagated to the component framework.

No functional change intended.

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Fixes: edc4b183b794 ("platform/x86: Add Lenovo Other Mode WMI Driver")
Cc: stable@vger.kernel.org
Link: https://sashiko.dev/#/patchset/20260331181208.421552-1-derekjohn.clark%40gmail.com
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-4-derekjohn.clark@gmail.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-other: Balance IDA id allocation and free
Rong Zhang [Sun, 10 May 2026 04:25:32 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-other: Balance IDA id allocation and free

Currently, the IDA id is only freed on wmi-other device removal or
failure to create firmware-attributes device, kset, or attributes. It
leaks IDA ids if the wmi-other device is bound multiple times, as the
unbind callback never frees the previously allocated IDA id.
Additionally, if the wmi-other device has failed to create a
firmware-attributes device before it gets removed, the wmi-device
removal callback double frees the same IDA id.

These bugs were found by sashiko.dev [1].

Fix them by moving ida_free() into lwmi_om_fw_attr_remove() so it is
balanced with ida_alloc() in lwmi_om_fw_attr_add(). With them fixed,
properly set and utilize the validity of priv->ida_id to balance
firmware-attributes registration and removal, without relying on
propagating the registration error to the component framework, which is
more reliable and aligns with the hwmon device registration and removal
sequences.
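
The balancing idea can be sketched with a small userspace model. Everything
here is hypothetical: the model_* names, the negative sentinel used to mark
"no id held", and the counter that stands in for the IDA. The sketch only
shows that pairing allocation and release inside the add/remove helpers
prevents both the leak and the double free.

```c
#include <stdbool.h>

#define MODEL_NO_ID (-1)	/* assumed sentinel meaning "no id held" */

/* Simplified private data holding only the id. */
struct model_priv {
	int ida_id;
};

static int model_ids_in_use;	/* stands in for the IDA's allocation state */

/* Register: take an id, and give it back at once if registration fails. */
static int model_fw_attr_add(struct model_priv *priv, bool fail_registration)
{
	priv->ida_id = model_ids_in_use++;	/* models the id allocation */
	if (fail_registration) {
		model_ids_in_use--;		/* models the id release */
		priv->ida_id = MODEL_NO_ID;
		return -1;
	}
	return 0;
}

/* Unregister: release the id only if one is actually held. */
static void model_fw_attr_remove(struct model_priv *priv)
{
	if (priv->ida_id == MODEL_NO_ID)
		return;		/* nothing was registered, nothing to free */
	model_ids_in_use--;
	priv->ida_id = MODEL_NO_ID;
}
```

Repeated add/remove cycles leave the counter at zero, and calling remove
after a failed add is a no-op rather than a second release.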

No functional change intended.

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Fixes: edc4b183b794 ("platform/x86: Add Lenovo Other Mode WMI Driver")
Cc: stable@vger.kernel.org
Link: https://sashiko.dev/#/patchset/20260331181208.421552-1-derekjohn.clark%40gmail.com
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-3-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agoplatform/x86: lenovo-wmi-helpers: Fix memory leak in lwmi_dev_evaluate_int()
Rong Zhang [Sun, 10 May 2026 04:25:31 +0000 (04:25 +0000)] 
platform/x86: lenovo-wmi-helpers: Fix memory leak in lwmi_dev_evaluate_int()

lwmi_dev_evaluate_int() leaks output.pointer when retval == NULL (found
by sashiko.dev [1]).

Fix it by moving `ret_obj = output.pointer' outside of the `if (retval)'
block so that it is always freed by the __free cleanup callback.

No functional change intended.
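
A user-space sketch of the pattern, assuming a GCC/Clang-style cleanup attribute in place of the kernel's __free() helper. fake_evaluate(), buf_alloc(), buf_release() and the live_buffers counter are hypothetical names invented for this illustration; this is not the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Count live allocations so that a leak would be observable. */
static int live_buffers;

static void *buf_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_buffers++;
	return p;
}

/* Cleanup handler: receives a pointer to the annotated variable. */
static void buf_release(void **pp)
{
	if (*pp) {
		assert(live_buffers > 0);
		free(*pp);
		live_buffers--;
	}
}

#define auto_release __attribute__((cleanup(buf_release)))

/* Hypothetical stand-in for the firmware call: returns a buffer in *out. */
static int fake_evaluate(void **out)
{
	int *value = buf_alloc(sizeof(*value));

	if (!value)
		return -1;
	*value = 42;
	*out = value;
	return 0;
}

static int evaluate_int(int *retval)
{
	void *ret_obj auto_release = NULL;
	void *output = NULL;

	if (fake_evaluate(&output))
		return -1;

	/*
	 * Take ownership unconditionally.  Doing this only inside
	 * "if (retval)" would leave the buffer unowned, and therefore
	 * leaked, whenever the caller passes NULL.
	 */
	ret_obj = output;
	if (retval)
		*retval = *(int *)ret_obj;
	return 0;
}
```

With ownership transferred before the retval check, the scope-exit handler releases the buffer on every successful evaluation, whether or not the caller asked for the value.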

Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Fixes: e521d16e76cd ("platform/x86: Add lenovo-wmi-helpers")
Cc: stable@vger.kernel.org
Link: https://sashiko.dev/#/patchset/20260331181208.421552-1-derekjohn.clark%40gmail.com
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Derek J. Clark <derekjohn.clark@gmail.com>
Link: https://patch.msgid.link/20260510042546.436874-2-derekjohn.clark@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 weeks agox86/cpu: Introduce a centralized CPUID data model
Ahmed S. Darwish [Fri, 27 Mar 2026 02:15:22 +0000 (03:15 +0100)] 
x86/cpu: Introduce a centralized CPUID data model

** Context

The x86-cpuid-db project generates a C header file with full C99 bitfield
listings for all known CPUID leaf/subleaf query outputs.

That header is now merged by parent commits at <asm/cpuid/leaf_types.h>,
and is of the form:

    struct leaf_0x0_0 { /* CPUID(0x0).0 C99 bitfields */ };
    ...
    struct leaf_0x4_n { /* CPUID(0x4).n C99 bitfields */ };
    ...
    struct leaf_0xd_0 { /* CPUID(0xd).0 C99 bitfields */ };
    struct leaf_0xd_1 { /* CPUID(0xd).1 C99 bitfields */ };
    struct leaf_0xd_n { /* CPUID(0xd).n C99 bitfields */ };
    ...

** Goal

Introduce a structured, size-efficient, per-CPU, CPUID data repository.

Use the x86-cpuid-db auto-generated data types, and custom CPUID leaf
parsers, to build that repository.  Given a leaf, subleaf, and index,
provide direct memory access to the parsed and cached per-CPU CPUID output.

** Long-term goal

Remove the need for drivers and other areas in the kernel to invoke direct
CPUID queries.  Only one place in the kernel should be allowed to use the
CPUID instruction: the CPUID parser code.

** Implementation

Introduce CPUID_LEAF()/CPUID_LEAF_N() to build a compact CPUID storage
layout in the form:

    struct leaf_0x0_0 leaf_0x0_0[1];
    struct leaf_parse_info leaf_0x0_0_info;

    struct leaf_0x1_0 leaf_0x1_0[1];
    struct leaf_parse_info leaf_0x1_0_info;

    struct leaf_0x4_n leaf_0x4_n[8];
    struct leaf_parse_info leaf_0x4_n_info;
    ...

where each CPUID query stores its output at the designated leaf/subleaf
array and has an associated "CPUID query info" structure.

Embed the CPUID tables inside "struct cpuinfo_x86" to ensure early-boot and
per-CPU access through the CPU capability structures.

Use an array of CPUID output storage entries for each leaf/subleaf
combination to accommodate leaves which produce the same output format for
a large subleaf range.  This is typical for CPUID leaves enumerating
hierarchical objects; e.g. CPUID(0x4) cache topology enumeration,
CPUID(0xd) XSAVE enumeration, and CPUID(0x12) SGX Enclave Page Cache
enumeration.

** New CPUID APIs

Assuming a CPU capability structure 'c', provide macros to access the
parsed and cached CPUID leaf/subleaf output.  These macros resolve to a
compile-time tokenization that ensures type-safety:

    const struct leaf_0x7_0 *l7_0;

    l7_0 = cpuid_subleaf(c, 0x7, 0);
                         |   |   └────────┐
                         |   └─────────┐  |
                         *             *  *
                        &c.cpuid.leaf_0x7_0[0]

For CPUID leaves with multiple subleaves having the same output format,
provide the APIs:

    const struct leaf_0x4_n *l4_0, *l4_1;

    l4_0 = cpuid_subleaf_n(c, 0x4, 0);
                           |   |   └──────────┐
                           |   └─────────┐    |
                           *             *    v
                          &c.cpuid.leaf_0x4_n[0]

    l4_1 = cpuid_subleaf_n(c, 0x4, 1);
                           |   |   └──────────┐
                           |   └─────────┐    |
                           *             *    v
                          &c.cpuid.leaf_0x4_n[1]

where the indices 0, 1, n above can be passed dynamically; e.g., in an
enumeration for loop.
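
A compilable user-space sketch of how such token-pasting macros could be built. The macro bodies, the trimmed leaf types and their field names, the contents of struct leaf_parse_info, and struct cpuinfo are all assumptions made for illustration; the real definitions live in the kernel headers and are likely to differ (for example, by adding bounds checks).

```c
#include <assert.h>

/* Hypothetical, heavily trimmed leaf types; field names are invented. */
struct leaf_0x0_0 {
	unsigned int max_std_leaf;
};

struct leaf_0x4_n {
	unsigned int cache_type  : 5;
	unsigned int cache_level : 3;
};

/* Hypothetical per-query bookkeeping. */
struct leaf_parse_info {
	unsigned int nr_entries;
};

/* One storage entry plus parse info for a single leaf/subleaf. */
#define CPUID_LEAF(_leaf, _subleaf)					\
	struct leaf_##_leaf##_##_subleaf leaf_##_leaf##_##_subleaf[1];	\
	struct leaf_parse_info leaf_##_leaf##_##_subleaf##_info

/* _count storage entries for subleaves sharing one output format. */
#define CPUID_LEAF_N(_leaf, _count)					\
	struct leaf_##_leaf##_n leaf_##_leaf##_n[_count];		\
	struct leaf_parse_info leaf_##_leaf##_n_info

struct cpuid_table {
	CPUID_LEAF(0x0, 0);
	CPUID_LEAF_N(0x4, 8);
};

/* Stand-in for the CPU capability structure. */
struct cpuinfo {
	struct cpuid_table cpuid;
};

/*
 * The leaf and subleaf tokens select both the member and its type, so
 * assigning the result to a mismatched pointer type draws a compiler
 * diagnostic.
 */
#define cpuid_subleaf(_c, _leaf, _subleaf)				\
	(&(_c)->cpuid.leaf_##_leaf##_##_subleaf[0])

/* The index may be a run-time value; this sketch does no bounds check. */
#define cpuid_subleaf_n(_c, _leaf, _idx)				\
	(&(_c)->cpuid.leaf_##_leaf##_n[(_idx)])
```

In this sketch a call such as cpuid_subleaf(c, 0x0, 0) expands to &(c)->cpuid.leaf_0x0_0[0], so the lookup costs no more than a member access.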

Add a clear rationale on why call sites should use these new APIs
instead of directly invoking CPUID.

** Next steps

For now, define cached parse entries for CPUID(0x0) and CPUID(0x1).

Generic parser logic to fill the CPUID tables, along with more CPUID leaves
support, will be added next.

Suggested-by: Thomas Gleixner <tglx@kernel.org> # CPUID data model
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> # x86-cpuid-db schema
Suggested-by: Borislav Petkov <bp@alien8.de> # Early CPUID centralization drafts
Suggested-by: Ingo Molnar <mingo@kernel.org> # CPUID headers restructuring
Suggested-by: Sean Christopherson <seanjc@google.com> # cpuid_subleaf_n() APIs
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20260327021645.555257-1-darwi@linutronix.de
Link: https://lore.kernel.org/all/874ixernra.ffs@tglx
Link: https://gitlab.com/x86-cpuid.org/x86-cpuid-db
Link: https://lore.kernel.org/all/aBnSgu_JyEi8fvog@gmail.com
Link: https://lore.kernel.org/all/aJ9TbaNMgaplKSbH@google.com
5 weeks agoMerge tag 'ib-gpio-add-fwnode-gpiod-get-for-v7.2' of git://git.kernel.org/pub/scm...
Bartosz Golaszewski [Mon, 11 May 2026 11:02:54 +0000 (13:02 +0200)] 
Merge tag 'ib-gpio-add-fwnode-gpiod-get-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into gpio/for-next

Immutable branch between the GPIO and PCI trees for v7.2

- add fwnode_gpiod_get() helper to GPIOLIB

5 weeks agoxfs: Fix typo in comment
Md Shofiqul Islam [Wed, 6 May 2026 16:36:58 +0000 (19:36 +0300)] 
xfs: Fix typo in comment

Fix spelling mistake in comment:
 - occured -> occurred

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Md Shofiqul Islam <shofiqtest@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
5 weeks agogpio: Add fwnode_gpiod_get() helper
Krishna Chaitanya Chundru [Mon, 11 May 2026 07:25:37 +0000 (12:55 +0530)] 
gpio: Add fwnode_gpiod_get() helper

Add fwnode_gpiod_get() as a convenience wrapper around
fwnode_gpiod_get_index() for the common case where only the
first GPIO is required.

This mirrors existing gpiod_get() and devm_gpiod_get() helpers
and avoids open-coding index 0 at call sites.

Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Link: https://patch.msgid.link/20260511-wakeirq_support-v10-1-c10af9c9eb8c@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
5 weeks agodt-bindings: gpio: dwapb: allow GPIO hogs
Icenowy Zheng [Thu, 7 May 2026 08:17:05 +0000 (16:17 +0800)] 
dt-bindings: gpio: dwapb: allow GPIO hogs

GPIO hogs are described in the gpio.txt binding as automatic default
GPIO configuration items.

Allow them for GPIO ports in DesignWare APB GPIO controller nodes.

Cc: Hoan Tran <hoan@os.amperecomputing.com>
Cc: Linus Walleij <linusw@kernel.org>
Cc: Bartosz Golaszewski <brgl@kernel.org>
Cc: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260507081710.4090814-8-zhengxingda@iscas.ac.cn
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
5 weeks agoxfs: fix the "limiting open zones" message
Christoph Hellwig [Thu, 7 May 2026 05:24:57 +0000 (07:24 +0200)] 
xfs: fix the "limiting open zones" message

The xfs logging macros already append a newline, so remove the explicit
\n, which adds an extra one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
5 weeks agoMerge patch series "selftests/namespaces: Fix test hangs and false failures"
Christian Brauner [Thu, 9 Apr 2026 13:06:07 +0000 (15:06 +0200)] 
Merge patch series "selftests/namespaces: Fix test hangs and false failures"

Ricardo B. Marlière <rbm@suse.com> says:

This series addresses three reliability problems in the namespaces selftest
suite that cause tests to hang or report incorrect results.

The first patch fixes a hang in nsid_test where the grandchild process is
not reaped during fixture teardown, leaving it alive and holding the TAP
pipe write-end open so the test runner blocks indefinitely waiting for EOF.

The second and third patches fix two problems in listns_efault_test: a
waitpid(-1) race that can cause the iterator child to be consumed during
namespace cleanup (leading to an indefinite block on the subsequent targeted
waitpid), and a false FAIL verdict on kernels that do not implement listns()
(the EFAULT tests should SKIP in that case, consistent with every other
listns test that already handles ENOSYS correctly).

Link: https://patch.msgid.link/20260407-selftests-namespaces_fixes-v1-0-59109909d88b@suse.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agoselftests/namespaces: Skip efault tests when listns() is not available
Ricardo B. Marlière [Tue, 7 Apr 2026 14:35:47 +0000 (11:35 -0300)] 
selftests/namespaces: Skip efault tests when listns() is not available

When listns() is not implemented the iterator child detects ENOSYS and
exits cleanly with status PIDFD_SKIP before the parent has a chance to
signal it.  The parent sends SIGKILL (which is a harmless no-op at that
point) and then calls waitpid(), obtaining a normal-exit status.  The
subsequent ASSERT_TRUE(WIFSIGNALED(status)) therefore fails, causing the
three EFAULT-focused tests to report FAIL rather than SKIP on kernels that
do not yet carry listns() support.

After collecting the iterator's exit status, check whether it exited with
PIDFD_SKIP and issue a SKIP verdict in that case, consistent with the
behaviour of every other listns test that already handles ENOSYS correctly.
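
A minimal user-space sketch of that status check. SKIP_CODE is a hypothetical stand-in for PIDFD_SKIP, and the verdict enum and function names are invented for the example; the selftest harness itself is not used here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SKIP_CODE 3	/* hypothetical stand-in for PIDFD_SKIP */

enum verdict { VERDICT_PASS, VERDICT_SKIP, VERDICT_FAIL };

/* Inspect the exit status before insisting on death-by-signal. */
static enum verdict classify(int status)
{
	if (WIFEXITED(status) && WEXITSTATUS(status) == SKIP_CODE)
		return VERDICT_SKIP;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
		return VERDICT_PASS;
	return VERDICT_FAIL;
}

static enum verdict run_iterator(int syscall_missing)
{
	siginfo_t info;
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return VERDICT_FAIL;
	if (pid == 0) {
		if (syscall_missing)
			_exit(SKIP_CODE);	/* models the ENOSYS early exit */
		for (;;)
			pause();		/* models the tight iterator loop */
	}

	/*
	 * To make the early-exit case deterministic, wait until the child
	 * has exited without reaping it, so the SIGKILL below is a no-op.
	 */
	if (syscall_missing &&
	    waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
		return VERDICT_FAIL;

	kill(pid, SIGKILL);
	if (waitpid(pid, &status, 0) != pid)
		return VERDICT_FAIL;
	return classify(status);
}
```

An assertion on WIFSIGNALED(status) alone would report a failure in the early-exit case; checking for the skip code first turns that case into a SKIP.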

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://patch.msgid.link/20260407-selftests-namespaces_fixes-v1-3-59109909d88b@suse.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agoselftests/namespaces: Fix waitpid race in listns_efault_test cleanup
Ricardo B. Marlière [Tue, 7 Apr 2026 14:35:46 +0000 (11:35 -0300)] 
selftests/namespaces: Fix waitpid race in listns_efault_test cleanup

The efault tests spawn two categories of child processes: namespace
children (each in its own mount namespace, for concurrent destruction) and
an iterator child that calls listns() in a tight loop.  The cleanup loop
used waitpid(-1), which reaps any child in any order. If the iterator child
exits early (e.g. because listns() returned ENOSYS) before all namespace
children have been reaped, waitpid(-1) may consume it instead.  The
subsequent targeted waitpid(iter_pid) would then block indefinitely.

Track the PIDs of the namespace children explicitly and use targeted
waitpid() calls in the cleanup loop so the iterator child cannot be
inadvertently reaped during namespace cleanup.
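
The targeted-reaping approach can be sketched in plain C as follows. The worker count, the exit code 7, and the function names are arbitrary choices for the illustration; the real test's children do namespace work that is omitted here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NR_WORKERS 4

/* Fork a child that exits immediately with the given code. */
static pid_t spawn_exiting(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return pid;
}

/*
 * Reap the tracked workers by PID, then the "iterator".  Returns the
 * iterator's exit code, or -1 on error.
 */
static int reap_targeted(void)
{
	pid_t workers[NR_WORKERS];
	pid_t iter_pid;
	int status;

	/* The iterator may well exit before any worker has been reaped. */
	iter_pid = spawn_exiting(7);
	if (iter_pid < 0)
		return -1;

	for (int i = 0; i < NR_WORKERS; i++) {
		workers[i] = spawn_exiting(0);
		if (workers[i] < 0)
			return -1;
	}

	/* Targeted waits: only the recorded worker PIDs are consumed. */
	for (int i = 0; i < NR_WORKERS; i++) {
		if (waitpid(workers[i], &status, 0) != workers[i])
			return -1;
	}

	/* The iterator's status is still there to be collected. */
	if (waitpid(iter_pid, &status, 0) != iter_pid)
		return -1;
	assert(WIFEXITED(status));
	return WEXITSTATUS(status);
}
```

Had the worker loop called waitpid(-1, ...) instead, any of its iterations could have consumed the iterator's status, leaving nothing for the final targeted wait.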

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://patch.msgid.link/20260407-selftests-namespaces_fixes-v1-2-59109909d88b@suse.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agoMerge patch series "uaccess/sockptr: copy_struct_ fixes and more helpers"
Christian Brauner [Thu, 9 Apr 2026 13:04:32 +0000 (15:04 +0200)] 
Merge patch series "uaccess/sockptr: copy_struct_ fixes and more helpers"

Stefan Metzmacher <metze@samba.org> says:

Here are some patches related to
copy_struct_{from,to}_{user,sockptr}()
I collected during my work on an IPPROTO_SMBDIRECT
implementation wrapping the smbdirect code used
by cifs.ko and ksmbd.ko.

The first patch fixes copy_struct_to_user()
to behave as documented.

The 2nd patch fixes the case where
copy_struct_from_user() is called by
copy_struct_from_sockptr().

The 3rd patch introduces
copy_struct_{from,to}_bounce_buffer()
as a result of a discussion about the
IPPROTO_QUIC driver in order to
be future proof when handling msg_control
messages in sendmsg and recvmsg.
But I'll likely also use them in my
IPPROTO_SMBDIRECT driver.

The 4th patch makes copy_struct_from_sockptr()
a trivial wrapper switching between
copy_struct_from_user() and
copy_struct_from_bounce_buffer().

The 5th patch introduces copy_struct_to_sockptr()
which I'll also use in my IPPROTO_SMBDIRECT driver.

* patches from https://patch.msgid.link/cover.1775576651.git.metze@samba.org:
  sockptr: introduce copy_struct_to_sockptr()
  sockptr: let copy_struct_from_sockptr() use copy_struct_from_bounce_buffer()
  uaccess: add copy_struct_{from,to}_bounce_buffer() helpers
  sockptr: fix usize check in copy_struct_from_sockptr() for user pointers
  uaccess: fix ignored_trailing logic in copy_struct_to_user()

Link: https://patch.msgid.link/cover.1775576651.git.metze@samba.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agoselftests/namespaces: Kill grandchild in nsid fixture teardown
Ricardo B. Marlière [Tue, 7 Apr 2026 14:35:45 +0000 (11:35 -0300)] 
selftests/namespaces: Kill grandchild in nsid fixture teardown

The timens_separate and pidns_separate test cases fork a grandchild that
calls pause().  FIXTURE_TEARDOWN only kills the direct child, which is the
init process of the grandchild's namespace.  Once the child (init) exits,
the grandchild is reparented to the host init but remains alive and
continues to hold the inherited write end of the test runner's TAP pipe
open.  tap_prefix never receives EOF and blocks indefinitely, hanging the
entire test collection.

Record the grandchild PID in the fixture struct so that teardown can send
SIGKILL and reap it before dealing with the child.  The grandchild must be
reaped first because the child acts as its PID namespace init; killing the
child first would kill the grandchild without giving us a chance to
waitpid() it.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://patch.msgid.link/20260407-selftests-namespaces_fixes-v1-1-59109909d88b@suse.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agosockptr: introduce copy_struct_to_sockptr()
Stefan Metzmacher [Tue, 7 Apr 2026 16:03:17 +0000 (18:03 +0200)] 
sockptr: introduce copy_struct_to_sockptr()

We already have copy_struct_from_sockptr() as a wrapper around
copy_struct_from_user() or copy_struct_from_bounce_buffer(),
so it's good to have copy_struct_to_sockptr()
as well matching the behavior of copy_struct_to_user()
or copy_struct_to_bounce_buffer().

The world would be better without sockptr_t, but having
copy_struct_to_sockptr() is better than open-coding it
in various places.

I'll use this in my IPPROTO_SMBDIRECT work,
but maybe it will also be useful for others...
IPPROTO_QUIC will likely also use it.

Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Michal Luczaj <mhal@rbox.co>
Cc: David Wei <dw@davidwei.uk>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
CC: Kees Cook <keescook@chromium.org>
Cc: netdev@vger.kernel.org
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Link: https://patch.msgid.link/c950ee1578cb93b4411c3731010def9c1cd82f0d.1775576651.git.metze@samba.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agosockptr: let copy_struct_from_sockptr() use copy_struct_from_bounce_buffer()
Stefan Metzmacher [Tue, 7 Apr 2026 16:03:16 +0000 (18:03 +0200)] 
sockptr: let copy_struct_from_sockptr() use copy_struct_from_bounce_buffer()

The world would be better without sockptr_t, but this at least
simplifies copy_struct_from_sockptr() to be just a dispatcher for
copy_struct_from_user() or copy_struct_from_bounce_buffer() without any
special logic on its own.

Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Michal Luczaj <mhal@rbox.co>
Cc: David Wei <dw@davidwei.uk>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
CC: Kees Cook <keescook@chromium.org>
Cc: netdev@vger.kernel.org
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Link: https://patch.msgid.link/b9b7e22664a53251d7ad099b12aead8b599c1257.1775576651.git.metze@samba.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
5 weeks agouaccess: add copy_struct_{from,to}_bounce_buffer() helpers
Stefan Metzmacher [Tue, 7 Apr 2026 16:03:15 +0000 (18:03 +0200)] 
uaccess: add copy_struct_{from,to}_bounce_buffer() helpers

These are similar to copy_struct_{from,to}_user() but operate
on kernel buffers instead of user buffers.

They can be used when a temporary bounce buffer is used,
e.g. in msg_control or similar places.

It allows us to have the same logic to handle old vs. current
and current vs. new structures in the same compatible way.
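
The old-vs-new handling can be illustrated with a user-space sketch of the "from" direction, modeled on the documented behaviour of copy_struct_from_user(): zero-fill when the source is shorter, and reject a longer source unless its extra bytes are all zero. sketch_copy_struct_from() and the opts_v1/opts_v2 structures are hypothetical names for this example; the actual helper's signature is not shown in this message and may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical old and new layouts of the same extensible struct. */
struct opts_v1 {
	int a;
};

struct opts_v2 {
	int a;
	int b;
};

/* Return 1 if all n bytes at p are zero. */
static int all_zero(const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (p[i])
			return 0;
	}
	return 1;
}

/*
 * Copy a struct of size usize from a kernel-side buffer into a
 * destination of size ksize.
 */
static int sketch_copy_struct_from(void *dst, size_t ksize,
				   const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;

	/* Newer caller: fields we do not know about must be unset. */
	if (usize > ksize &&
	    !all_zero((const unsigned char *)src + ksize, usize - ksize))
		return -E2BIG;

	memcpy(dst, src, size);

	/* Older caller: fields it does not know about default to zero. */
	if (usize < ksize)
		memset((unsigned char *)dst + size, 0, ksize - size);
	return 0;
}
```

An older caller passing struct opts_v1 to a consumer expecting struct opts_v2 gets b == 0, while a newer caller passing struct opts_v2 to a consumer that only knows struct opts_v1 succeeds only if b is zero.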

copy_struct_from_sockptr() will also be able to
use copy_struct_from_bounce_buffer() for the kernel
case in a follow-up patch.

I'll use this in my IPPROTO_SMBDIRECT work,
but maybe it will also be useful for others...
IPPROTO_QUIC will likely also use it.

Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Michal Luczaj <mhal@rbox.co>
Cc: David Wei <dw@davidwei.uk>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
CC: Kees Cook <keescook@chromium.org>
Cc: netdev@vger.kernel.org
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Link: https://patch.msgid.link/f29570914590c50b9b6f451eb3a38d0fe1d954df.1775576651.git.metze@samba.org
Signed-off-by: Christian Brauner <brauner@kernel.org>