Rick Macklem [Fri, 9 Jan 2026 16:21:41 +0000 (11:21 -0500)]
NFSD: Add POSIX draft ACL support to the NFSv4 SETATTR operation
The POSIX ACL extension to NFSv4 enables clients to set access
and default ACLs via FATTR4_POSIX_ACCESS_ACL and
FATTR4_POSIX_DEFAULT_ACL attributes. Integration of these
attributes into SETATTR processing requires wiring them through
the nfsd_attrs structure and ensuring proper cleanup on all
code paths.
This patch connects the na_pacl and na_dpacl fields in
nfsd_attrs to the decoded ACL pointers from the NFSv4 SETATTR
decoder. Ownership of these ACL references transfers to attrs
immediately after initialization, with the decoder's pointers
cleared to NULL. This transfer ensures nfsd_attrs_free()
releases the ACLs on normal completion, while new error paths
call posix_acl_release() directly when cleanup occurs before
nfsd_attrs_free() runs.
Early returns in the nfsd4_setattr() function gain conversions
to goto statements that branch to proper cleanup handlers.
Error paths before fh_want_write() branch to out_err for ACL
release only; paths after fh_want_write() use the existing out
label for full cleanup via nfsd_attrs_free().
The patch adds mutual exclusion between NFSv4 ACLs (sa_acl) and
POSIX ACLs. Setting both types simultaneously returns
nfserr_inval because these ACL models cannot coexist on the
same file object.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:40 +0000 (11:21 -0500)]
NFSD: Add support for POSIX draft ACLs for file creation
NFSv4.2 clients can specify POSIX draft ACLs when creating file
objects via OPEN(CREATE) and CREATE operations. The previous patch
added POSIX ACL support to the NFSv4 SETATTR operation for modifying
existing objects, but file creation follows different code paths
that also require POSIX ACL handling.
This patch integrates POSIX ACL support into nfsd4_create() and
nfsd4_create_file(). Ownership of the decoded ACL pointers
(op_dpacl, op_pacl, cr_dpacl, cr_pacl) transfers to the nfsd_attrs
structure immediately, with the original fields cleared to NULL.
This transfer ensures nfsd_attrs_free() releases the ACLs upon
completion while preventing double-free on error paths.
Mutual exclusion between NFSv4 ACLs and POSIX ACLs is enforced:
setting both op_acl and op_dpacl/op_pacl simultaneously returns
nfserr_inval. Errors during ACL application clear the corresponding
bits in the result bitmask (fattr->bmval), signaling partial
completion to the client.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:39 +0000 (11:21 -0500)]
NFSD: Add support for XDR decoding POSIX draft ACLs
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_ACCESS_ACL
and FATTR4_POSIX_DEFAULT_ACL for setting access and default ACLs
via CREATE, OPEN, and SETATTR operations. This patch adds the XDR
decoders for those attributes.
The nfsd4_decode_fattr4() function gains two additional parameters
for receiving decoded POSIX ACLs. CREATE, OPEN, and SETATTR
decoders pass pointers to these new parameters, enabling clients
to set POSIX ACLs during object creation or modification.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Support for FATTR4_POSIX_ACCESS_ACL and FATTR4_POSIX_DEFAULT_ACL
attributes in subsequent patches allows clients to set both ACL
types simultaneously during SETATTR and file creation. Each ACL
type can succeed or fail independently, requiring the server to
clear individual attribute bits in the reply bitmap when one
fails while the other succeeds.
The existing na_aclerr field cannot distinguish which ACL type
encountered an error. Separate error fields (na_paclerr for
access ACLs, na_dpaclerr for default ACLs) enable the server to
report per-ACL-type failures accurately.
This refactoring also adds validation previously absent: default
ACL processing rejects non-directory targets with EINVAL and
passes NULL to set_posix_acl() when a_count is zero to delete
the ACL. Access ACL processing rejects zero a_count with EINVAL
for ACL_SCOPE_FILE_SYSTEM semantics (the only scope currently
supported).
The changes preserve compatibility with existing NFSv4 ACL code.
NFSv4 ACL conversion (nfs4_acl_nfsv4_to_posix()) never produces
POSIX ACLs with a_count == 0, so the new validation logic only
affects future POSIX ACL attribute handling.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:37 +0000 (11:21 -0500)]
NFSD: Do not allow NFSv4 (N)VERIFY to check POSIX ACL attributes
Section 9.3 of draft-ietf-nfsv4-posix-acls-00 prohibits use of
the POSIX ACL attributes with VERIFY and NVERIFY operations: the
server MUST reply NFS4ERR_INVAL when a client attempts this.
Beyond the protocol requirement, comparison of POSIX draft ACLs
via (N)VERIFY presents an implementation challenge. Clients are
not required to order the ACEs within a POSIX ACL in any
particular way, making reliable attribute comparison impractical.
Return nfserr_inval when the client requests FATTR4_POSIX_ACCESS_ACL
or FATTR4_POSIX_DEFAULT_ACL in a VERIFY or NVERIFY operation.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:36 +0000 (11:21 -0500)]
NFSD: Add nfsd4_encode_fattr4_posix_access_acl
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_ACCESS_ACL
for retrieving the access ACL of a file or directory. This patch
adds the XDR encoder for that attribute.
The access ACL is retrieved via get_inode_acl(). If the filesystem
provides no explicit access ACL, one is synthesized from the file
mode via posix_acl_from_mode(). Each entry is encoded as a
posixace4: tag type, permission bits, and principal name (empty
for structural entries, resolved via idmapping for USER/GROUP
entries).
Unlike the default ACL encoder which applies only to directories,
this encoder handles all inode types and ensures an access ACL is
always available through mode-based synthesis when needed.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:35 +0000 (11:21 -0500)]
NFSD: Add nfsd4_encode_fattr4_posix_default_acl
The POSIX ACL extension to NFSv4 defines FATTR4_POSIX_DEFAULT_ACL
for retrieving a directory's default ACL. This patch adds the
XDR encoder for that attribute.
For directories, the default ACL is retrieved via get_inode_acl()
and each entry is encoded as a posixace4: tag type, permission
bits, and principal name (empty for structural entries like
USER_OBJ/GROUP_OBJ/MASK/OTHER, resolved via idmapping for
USER/GROUP entries).
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:34 +0000 (11:21 -0500)]
NFSD: Add nfsd4_encode_fattr4_acl_trueform_scope
The FATTR4_ACL_TRUEFORM_SCOPE attribute indicates the granularity at
which the ACL model can vary: per file object, per file system, or
uniformly across the entire server.
In Linux, the ACL model is determined by the SB_POSIXACL superblock
flag, which applies uniformly to all files within a file system.
Different exported file systems can have different ACL models, but
individual files cannot differ from their containing file system.
ACL_SCOPE_FILE_SYSTEM accurately reflects this behavior.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Rick Macklem [Fri, 9 Jan 2026 16:21:33 +0000 (11:21 -0500)]
NFSD: Add nfsd4_encode_fattr4_acl_trueform
Mapping between NFSv4 ACLs and POSIX ACLs is semantically imprecise:
a client that sets an NFSv4 ACL and reads it back may see a different
ACL than it wrote. The proposed NFSv4 POSIX ACL extension introduces
the FATTR4_ACL_TRUEFORM attribute, which reports whether a file
object stores its access control permissions using NFSv4 ACLs or
POSIX ACLs.
A client aware of this extension can avoid lossy translation by
requesting and setting ACLs in their native format.
When NFSD is built with CONFIG_NFSD_V4_POSIX_ACLS, report
ACL_MODEL_POSIX_DRAFT for file objects on file systems with the
SB_POSIXACL flag set, and ACL_MODEL_NONE otherwise. Linux file
systems do not store NFSv4 ACLs natively, so ACL_MODEL_NFS4 is never
reported.
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 9 Jan 2026 16:21:32 +0000 (11:21 -0500)]
Add RPC language definition of NFSv4 POSIX ACL extension
The language definition was extracted from the new
draft-ietf-nfsv4-posix-acls specification. This ensures good
constant and type name alignment between the spec and the Linux
kernel source code, and brings in some basic XDR utilities for
handling NFSv4 POSIX draft ACLs.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
This draft has not yet been ratified. A build-time configuration
option allows developers and distributors to decide whether to
expose this experimental protocol extension to NFSv4 clients. The
option is disabled by default to prevent unintended deployment of
potentially unstable protocol features in production environments.
This approach mirrors the existing NFSD_V4_DELEG_TIMESTAMPS option,
which gates another experimental NFSv4 extension based on an
unratified IETF draft.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 9 Jan 2026 16:21:30 +0000 (11:21 -0500)]
xdrgen: Implement pass-through lines in specifications
XDR specification files can contain lines prefixed with '%' that
pass through unchanged to generated output. Traditional rpcgen
removes the '%' and emits the remainder verbatim, allowing direct
insertion of C includes, pragma directives, or other language-
specific content into the generated code.
Until now, xdrgen silently discarded these lines during parsing.
This prevented specifications from including necessary headers or
preprocessor directives that might be required for the generated
code to compile correctly.
The grammar now captures pass-through lines instead of ignoring
them. A new AST node type represents pass-through content, and
the AST transformer strips the leading '%' character. Definition
and source generators emit pass-through content in document order,
preserving the original placement within the specification.
This brings xdrgen closer to feature parity with traditional
rpcgen while maintaining the existing document-order processing
model.
Existing generated xdrgen source code has been regenerated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 8 Jan 2026 00:40:11 +0000 (19:40 -0500)]
nfsd: cancel async COPY operations when admin revokes filesystem state
Async COPY operations hold copy stateids that represent NFSv4 state.
Thus, when the NFS server administrator revokes all NFSv4 state for
a filesystem via the unlock_fs interface, ongoing async COPY
operations referencing that filesystem must also be canceled.
Each cancelled copy triggers a CB_OFFLOAD callback carrying the
NFS4ERR_ADMIN_REVOKED status to notify the client that the server
terminated the operation.
The static drop_client() function is renamed to nfsd4_put_client()
and exported. The function must be exported because both the new
nfsd4_cancel_copy_by_sb() and the CB_OFFLOAD release callback in
nfs4proc.c need to release client references.
Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:50 +0000 (13:59 -0500)]
nfsd: add controls to set the minimum number of threads per pool
Add a new "min_threads" variable to the nfsd_net, along with the
corresponding netlink interface, to set that value from userland.
Pass that value to svc_set_pool_threads() and svc_set_num_threads().
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:49 +0000 (13:59 -0500)]
nfsd: adjust number of running nfsd threads based on activity
nfsd() is changed to pass a timeout to svc_recv() when there is a min
number of threads set, and to handle error returns from it:
In the case of -ETIMEDOUT, if the service mutex can be taken (via
trylock), the thread becomes an RQ_VICTIM so that it will exit,
providing that the actual number of threads is above pool->sp_nrthrmin.
In the case of -EBUSY, if the actual number of threads is below
pool->sp_nrthrmax, it will attempt to start a new thread. This attempt
is gated on a new SP_TASK_STARTING pool flag that serializes thread
creation attempts within a pool, and further by mutex_trylock().
Neil says: "I think we want memory pressure to be able to push a thread
into returning -ETIMEDOUT. That can come later."
Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:48 +0000 (13:59 -0500)]
sunrpc: allow svc_recv() to return -ETIMEDOUT and -EBUSY
To dynamically adjust the thread count, nfsd requires some information
about how busy things are.
Change svc_recv() to take a timeout value, and then allow the wait for
work to time out if it's set. If a timeout is not defined, then the
schedule will be set to MAX_SCHEDULE_TIMEOUT. If the task waits for the
full timeout, then have it return -ETIMEDOUT to the caller.
If it wakes up, finds that there is more work and that no threads are
available, then attempt to set SP_TASK_STARTING. If wasn't already set,
have the task return -EBUSY to cue to the caller that the service could
use more threads.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:46 +0000 (13:59 -0500)]
sunrpc: introduce the concept of a minimum number of threads per pool
Add a new pool->sp_nrthrmin field to track the minimum number of threads
in a pool. Add min_threads parameters to both svc_set_num_threads() and
svc_set_pool_threads(). If min_threads is non-zero and less than the
max, svc_set_num_threads() will ensure that the number of running
threads is between the min and the max.
If the min is 0 or greater than the max, then it is ignored, and the
maximum number of threads will be started, and never spun down.
For now, the min_threads is always 0, but a later patch will pass the
proper value through from nfsd.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:45 +0000 (13:59 -0500)]
sunrpc: track the max number of requested threads in a pool
The kernel currently tracks the number of threads running in a pool in
the "sp_nrthreads" field. In the future, where threads are dynamically
spun up and down, it'll be necessary to keep track of the maximum number
of requested threads separately from the actual number running.
Add a pool->sp_nrthrmax parameter to track this. When userland changes
the number of threads in a pool, update that value accordingly.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:44 +0000 (13:59 -0500)]
sunrpc: remove special handling of NULL pool from svc_start/stop_kthreads()
Now that svc_set_num_threads() handles distributing the threads among
the available pools, remove the special handling of a NULL pool pointer
from svc_start_kthreads() and svc_stop_kthreads().
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Tue, 6 Jan 2026 18:59:43 +0000 (13:59 -0500)]
sunrpc: split svc_set_num_threads() into two functions
svc_set_num_threads() will set the number of running threads for a given
pool. If the pool argument is set to NULL however, it will distribute
the threads among all of the pools evenly.
These divergent codepaths complicate the move to dynamic threading.
Simplify the API by splitting these two cases into different helpers:
Add a new svc_set_pool_threads() function that sets the number of
threads in a single, given pool. Modify svc_set_num_threads() to
distribute the threads evenly between all of the pools and then call
svc_set_pool_threads() for each.
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Dec 2025 15:19:35 +0000 (10:19 -0500)]
xdrgen: Add enum value validation to generated decoders
XDR enum decoders generated by xdrgen do not verify that incoming
values are valid members of the enum. Incoming out-of-range values
from malicious or buggy peers propagate through the system
unchecked.
Add validation logic to generated enum decoders using a switch
statement that explicitly lists valid enumerator values. The
compiler optimizes this to a simple range check when enum values
are dense (contiguous), while correctly rejecting invalid values
for sparse enums with gaps in their value ranges.
The --no-enum-validation option on the source subcommand disables
this validation when not needed.
The minimum and maximum fields in _XdrEnum, which were previously
unused placeholders for a range-based validation approach, have
been removed since the switch-based validation handles both dense
and sparse enums correctly.
Because the new mechanism results in substantive changes to
generated code, existing .x files are regenerated. Unrelated white
space and semicolon changes in the generated code are due to recent
commit 1c873a2fd110 ("xdrgen: Don't generate unnecessary semicolon")
and commit 38c4df91242b ("xdrgen: Address some checkpatch whitespace
complaints").
Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Dec 2025 15:19:34 +0000 (10:19 -0500)]
xdrgen: Emit a max_arg_sz macro
struct svc_service has a .vs_xdrsize field that is filled in by
servers for each of their RPC programs. This field is supposed to
contain the size of the largest procedure argument in the RPC
program. This value is also sometimes used to size network
transport buffers.
Currently, server implementations must manually calculate and
hard-code this value, which is error-prone and requires updates
when procedure arguments change.
Update xdrgen to determine which procedure argument structure is
largest, and emit a macro with a well-known name that contains
the size of that structure. Server code then uses this macro when
initializing the .vs_xdrsize field.
For NLM version 4, xdrgen now emits:
#define NLM4_MAX_ARGS_SZ (NLM4_nlm4_lockargs_sz)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Dec 2025 15:19:33 +0000 (10:19 -0500)]
xdrgen: Extend error reporting to AST transformation phase
Commit 277df18d7df9 ("xdrgen: Improve parse error reporting") added
clean, compiler-style error messages for syntax errors detected during
parsing. However, semantic errors discovered during AST transformation
still produce verbose Python stack traces.
When an XDR specification references an undefined type, the transformer
raises a VisitError wrapping a KeyError. Before this change:
Traceback (most recent call last):
File ".../lark/visitors.py", line 124, in _call_userfunc
return f(children)
...
KeyError: 'fsh4_mode'
...
lark.exceptions.VisitError: Error trying to process rule "basic":
'fsh4_mode'
After this change:
file.x:156:2: semantic error
Undefined type 'fsh4_mode'
fsh4_mode mode;
^
The new handle_transform_error() function extracts position information
from the Lark tree node metadata and formats the error consistently with
parse error messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Dec 2025 15:15:32 +0000 (10:15 -0500)]
SUNRPC: auth_gss: fix memory leaks in XDR decoding error paths
The gssx_dec_ctx(), gssx_dec_status(), and gssx_dec_name()
functions allocate memory via gssx_dec_buffer(), which calls
kmemdup(). When a subsequent decode operation fails, these
functions return immediately without freeing previously
allocated buffers, causing memory leaks.
The leak in gssx_dec_ctx() is particularly relevant because
the caller (gssp_accept_sec_context_upcall) initializes several
buffer length fields to non-zero values, resulting in memory
allocation:
If, for example, gssx_dec_name() succeeds for src_name but
fails for targ_name, the memory allocated for
exported_context_token, mech, and src_name.display_name
remains unreferenced and cannot be reclaimed.
Add error handling with goto-based cleanup to free any
previously allocated buffers before returning an error.
Reported-by: Xingjing Deng <micro6947@gmail.com> Closes: https://lore.kernel.org/linux-nfs/CAK+ZN9qttsFDu6h1FoqGadXjMx1QXqPMoYQ=6O9RY4SxVTvKng@mail.gmail.com/ Fixes: 1d658336b05f ("SUNRPC: Add RPC based upcall mechanism for RPCGSS auth") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
nfsd: fix return error code for nfsd_map_name_to_[ug]id
idmap lookups can time out while the cache is waiting for a userspace
upcall reply. In that case cache_check() returns -ETIMEDOUT to callers.
The nfsd_map_name_to_[ug]id functions currently proceed with attempting
to map the id to a kuid despite a potentially temporary failure to
perform the idmap lookup. This results in the code returning the error
NFSERR_BADOWNER which can cause client operations to return to userspace
with failure.
Fix this by returning the failure status before attempting kuid mapping.
This will return NFSERR_JUKEBOX on idmap lookup timeout so that clients
can retry the operation instead of aborting it.
Fixes: 65e10f6d0ab0 ("nfsd: Convert idmap to use kuids and kgids") Cc: stable@vger.kernel.org Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
During v4 request compound arg decoding, some ops (e.g. SETATTR)
can trigger idmap lookup upcalls. When those upcall responses get
delayed beyond the allowed time limit, cache_check() will mark the
request for deferral and cause it to be dropped.
This prevents nfs4svc_encode_compoundres from being executed, and
thus the session slot flag NFSD4_SLOT_INUSE never gets cleared.
Subsequent client requests will fail with NFSERR_JUKEBOX, given
that the slot will be marked as in-use, making the SEQUENCE op
fail.
Fix this by making sure that the RQ_USEDEFERRAL flag is always
clear during nfs4svc_decode_compoundargs(), since no v4 request
should ever be deferred.
Fixes: 2f425878b6a7 ("nfsd: don't use the deferral service, return NFS4ERR_DELAY") Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Mon, 22 Dec 2025 14:44:59 +0000 (09:44 -0500)]
xdrgen: Improve parse error reporting
The current verbose Lark exception output makes it difficult to
quickly identify and fix syntax errors in XDR specifications. Users
must wade through hundreds of lines of cascading errors to find the
root cause.
Replace this with concise, compiler-style error messages showing
file, line, column, the unexpected token, and the source line with
a caret pointing to the error location.
Before:
Unexpected token Token('__ANON_1', '+1') at line 14, column 35.
Expected one of:
* SEMICOLON
Previous tokens: [Token('__ANON_0', 'LM_MAXSTRLEN')]
[hundreds more cascading errors...]
After:
file.x:14:35: parse error
Unexpected number '+1'
const LM_MAXNAMELEN = LM_MAXSTRLEN+1;
^
The error handler now raises XdrParseError on the first error,
preventing cascading messages that obscure the root cause.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Mon, 22 Dec 2025 14:44:29 +0000 (09:44 -0500)]
xdrgen: Remove inclusion of nlm4.h header
The client-side source code template mistakenly includes the
nlm4.h header file, which is specific to the NLM protocol and
should not be present in the generic template that generates
client stubs for all XDR-based protocols.
Fixes: 903a7d37d9ea ("xdrgen: Update the files included in client-side source code") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Sat, 20 Dec 2025 15:41:09 +0000 (10:41 -0500)]
xdrgen: Initialize data pointer for zero-length items
The xdrgen decoders for strings and opaque data had an
optimization that skipped calling xdr_inline_decode() when the
item length was zero. This left the data pointer uninitialized,
which could lead to unpredictable behavior when callers access
it.
Remove the zero-length check and always call xdr_inline_decode().
When passed a length of zero, xdr_inline_decode() returns the
current buffer position, which is valid and matches the behavior
of hand-coded XDR decoders throughout the kernel.
Fixes: 4b132aacb076 ("tools: Add xdrgen") Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NFSD: fix setting FMODE_NOCMTIME in nfs4_open_delegation
fstests generic/215 and generic/407 were failing because the server
wasn't updating mtime properly. When deleg attribute support is not
compiled in and thus no attribute delegation was given, the server
was skipping updating mtime and ctime because FMODE_NOCMTIME was
uncoditionally set for the write delegation.
Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Tue, 16 Dec 2025 16:23:09 +0000 (11:23 -0500)]
xdrgen: Implement short (16-bit) integer types
"short" and "unsigned short" types are not defined in RFC 4506, but
are supported by the rpcgen program. An upcoming protocol
specification includes at least one "unsigned short" field, so xdrgen
needs to implement support for these types.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Sat, 13 Dec 2025 18:42:00 +0000 (13:42 -0500)]
nfsd: use workqueue enable/disable APIs for v4_end_grace sync
"nfsd: provide locking for v4_end_grace" introduced a
client_tracking_active flag protected by nn->client_lock to prevent
the laundromat from being scheduled before client tracking
initialization or after shutdown begins. That commit is suitable for
backporting to LTS kernels that predate commit 86898fa6b8cd
("workqueue: Implement disable/enable for (delayed) work items").
However, the workqueue subsystem in recent kernels provides
enable_delayed_work() and disable_delayed_work_sync() for this
purpose. Using this mechanism enable us to remove the
client_tracking_active flag and associated spinlock operations
while preserving the same synchronization guarantees, which is
a cleaner long-term approach.
Signed-off-by: NeilBrown <neil@brown.name> Tested-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Wed, 10 Dec 2025 00:28:50 +0000 (19:28 -0500)]
NFS: NFSERR_INVAL is not defined by NFSv2
A documenting comment in include/uapi/linux/nfs.h claims incorrectly
that NFSv2 defines NFSERR_INVAL. There is no such definition in either
RFC 1094 or https://pubs.opengroup.org/onlinepubs/9629799/chap7.htm
NFS3ERR_INVAL is introduced in RFC 1813.
NFSD returns NFSERR_INVAL for PROC_GETACL, which has no
specification (yet).
However, nfsd_map_status() maps nfserr_symlink and nfserr_wrong_type
to nfserr_inval, which does not align with RFC 1094. This logic was
introduced only recently by commit 438f81e0e92a ("nfsd: move error
choice for incorrect object types to version-specific code."). Given
that we have no INVAL or SERVERFAULT status in NFSv2, probably the
only choice is NFSERR_IO.
Fixes: 438f81e0e92a ("nfsd: move error choice for incorrect object types to version-specific code.") Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Mon, 8 Dec 2025 16:15:32 +0000 (11:15 -0500)]
xdrgen: Fix struct prefix for typedef types in program wrappers
The program templates for decoder/argument.j2 and encoder/result.j2
unconditionally add 'struct' prefix to all types. This is incorrect
when an RPC protocol specification lists a typedef'd basic type or
an enum as a procedure argument or result (e.g., NFSv2's fhandle or
stat), resulting in compiler errors when building generated C code.
Fixes: 4b132aacb076 ("tools: Add xdrgen") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Sat, 22 Nov 2025 01:00:37 +0000 (12:00 +1100)]
locks: ensure vfs_test_lock() never returns FILE_LOCK_DEFERRED
FILE_LOCK_DEFERRED can be returned when creating or removing a lock, but
not when testing for a lock. This support was explicitly removed in
Commit 09802fd2a8ca ("lockd: rip out deferred lock handling from testlock codepath")
However the test in nlmsvc_testlock() suggests that it *can* be returned,
only nlm cannot handle it.
To aid clarity, remove the test and instead put a similar test and
warning in vfs_test_lock(). If the impossible happens, convert
FILE_LOCK_DEFERRED to -EIO.
Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Mon, 1 Dec 2025 22:19:46 +0000 (17:19 -0500)]
xdrgen: Address some checkpatch whitespace complaints
This is a roll-up of three template fixes that eliminate noise from
checkpatch output so that it's easier to spot non-trivial problems.
To follow conventional kernel C style, when a union declaration is
marked with "pragma public", there should be a blank line between
the emitted "union xxx { ... };" and the decoder and encoder
function declarations.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 20 Nov 2025 20:15:51 +0000 (15:15 -0500)]
NFSD: Add instructions on how to deal with xdrgen files
xdrgen requires a number of Python packages on the build system. We
don't want to add these to the kernel build dependency list, which
is long enough already.
The generated files are generated manually using
$ cd fs/nfsd && make xdrgen
whenever the .x files are modified, then they are checked into the
kernel repo so others do not need to rebuild them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Khushal Chitturi [Tue, 18 Nov 2025 19:52:58 +0000 (01:22 +0530)]
xdrgen: improve error reporting for invalid void declarations
RFC 4506 defines void as a zero-length type that may appear only as
union arms or as program argument/result types. It cannot be declared
with an identifier, so constructs like "typedef void temp;" are not
valid XDR.
Previously, xdrgen raised a NotImplementedError when it encountered a
void declaration in a typedef. Which was misleading, as the problem is an
invalid RPC specification rather than missing functionality in xdrgen.
This patch replaces the NotImplementedError for _XdrVoid in typedef
handling with a clearer ValueError that specifies incorrect use of void
in the XDR input, making it clear that the issue lies in the RPC
specification being parsed.
Signed-off-by: Khushal Chitturi <kc9282016@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Linus Torvalds [Sun, 25 Jan 2026 20:06:15 +0000 (12:06 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Only one core change, the rest are drivers.
The core change reorders some state operations in the error handler to
try to prevent missed wake ups of the error handler (which can halt
error processing and effectively freeze the entire system)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Sanitize payload size to prevent member overflow
scsi: target: iscsi: Fix use-after-free in iscsit_dec_session_usage_count()
scsi: target: iscsi: Fix use-after-free in iscsit_dec_conn_usage_count()
scsi: core: Wake up the error handler when final completions race against each other
scsi: storvsc: Process unsupported MODE_SENSE_10
scsi: xen: scsiback: Fix potential memory leak in scsiback_remove()
Linus Torvalds [Sun, 25 Jan 2026 18:06:23 +0000 (10:06 -0800)]
Merge tag 'keys-trusted-next-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull keys fix from Jarkko Sakkinen.
* tag 'keys-trusted-next-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
keys/trusted_keys: fix handle passed to tpm_buf_append_name during unseal
Linus Torvalds [Sun, 25 Jan 2026 17:57:31 +0000 (09:57 -0800)]
Merge tag 'char-misc-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/iio driver fixes from Greg KH:
"Here are some small char/misc/iio and some other minor driver
subsystem fixes for 6.19-rc7. Nothing huge here, just some fixes for
reported issues including:
- lots of little iio driver fixes
- comedi driver fixes
- mux driver fix
- w1 driver fixes
- uio driver fix
- slimbus driver fixes
- hwtracing bugfix
- other tiny bugfixes
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (36 commits)
comedi: dmm32at: serialize use of paged registers
mei: trace: treat reg parameter as string
uio: pci_sva: correct '-ENODEV' check logic
uacce: ensure safe queue release with state management
uacce: implement mremap in uacce_vm_ops to return -EPERM
uacce: fix isolate sysfs check condition
uacce: fix cdev handling in the cleanup path
slimbus: core: clean up of_slim_get_device()
slimbus: core: fix of_slim_get_device() kernel doc
slimbus: core: amend slim_get_device() kernel doc
slimbus: core: fix device reference leak on report present
slimbus: core: fix runtime PM imbalance on report present
slimbus: core: fix OF node leak on registration failure
intel_th: rename error label
intel_th: fix device leak on output open()
comedi: Fix getting range information for subdevices 16 to 255
mux: mmio: Fix IS_ERR() vs NULL check in probe()
interconnect: debugfs: initialize src_node and dst_node to empty strings
iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source
iio: accel: iis328dq: fix gain values
...
Linus Torvalds [Sun, 25 Jan 2026 17:53:28 +0000 (09:53 -0800)]
Merge tag 'tty-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fixes from Greg KH:
"Here are three small serial driver fixes for 6.19-rc7 that resolve
some reported issues. They include:
- tty->port race condition fix for a reported problem
- qcom_geni serial driver fix
- 8250_pci serial driver fix
All of these have been in linux-next with no reported issues"
* tag 'tty-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: Fix not set tty->port race condition
serial: 8250_pci: Fix broken RS485 for F81504/508/512
serial: qcom_geni: Fix BT failure regression on RB2 platform
Linus Torvalds [Sun, 25 Jan 2026 17:42:25 +0000 (09:42 -0800)]
Merge tag 'input-for-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a couple of quirks to i8042 to enable keyboard on a Asus and MECHREVO
laptops
* tag 'input-for-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - add quirks for MECHREVO Wujie 15X Pro
Input: i8042 - add quirk for ASUS Zenbook UX425QA_UM425QA
Srish Srinivasan [Fri, 23 Jan 2026 16:55:03 +0000 (22:25 +0530)]
keys/trusted_keys: fix handle passed to tpm_buf_append_name during unseal
TPM2_Unseal[1] expects the handle of a loaded data object, and not the
handle of the parent key. But the tpm2_unseal_cmd provides the parent
keyhandle instead of blob_handle for the session HMAC calculation. This
causes unseal to fail.
Fix this by passing blob_handle to tpm_buf_append_name().
Fixes: 6e9722e9a7bf ("tpm2-sessions: Fix out of range indexing in name_size") Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
gongqi [Thu, 22 Jan 2026 15:54:59 +0000 (23:54 +0800)]
Input: i8042 - add quirks for MECHREVO Wujie 15X Pro
The MECHREVO Wujie 15X Pro requires several i8042 quirks to function
correctly. Specifically, NOMUX, RESET_ALWAYS, NOLOOP, and NOPNP are
needed to ensure the keyboard and touchpad work reliably.
feng [Sun, 25 Jan 2026 05:44:12 +0000 (21:44 -0800)]
Input: i8042 - add quirk for ASUS Zenbook UX425QA_UM425QA
The ASUS Zenbook UX425QA_UM425QA fails to initialize the keyboard after
a cold boot.
A quirk already exists for "ZenBook UX425", but some Zenbooks report
"Zenbook" with a lowercase 'b'. Since DMI matching is case-sensitive,
the existing quirk is not applied to these "extra special" Zenbooks.
Testing confirms that this model needs the same quirks as the ZenBook
UX425 variants.
Linus Torvalds [Sun, 25 Jan 2026 02:55:48 +0000 (18:55 -0800)]
Merge tag 'riscv-for-linus-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"The notable changes here are the three RISC-V timer compare register
update sequence patches. These only apply to RV32 systems and are
related to the 64-bit timer compare value being split across two
separate 32-bit registers.
We weren't using the appropriate three-write sequence, documented in
the RISC-V ISA specifications, to avoid spurious timer interrupts
during the update sequence; so, these patches now use the recommended
sequence.
This doesn't affect 64-bit RISC-V systems, since the timer compare
value fits inside a single register and can be updated with a single
write.
- Fix the RISC-V timer compare register update sequence on RV32
systems to use the recommended sequence in the RISC-V ISA manual
This avoids spurious interrupts during updates
- Add a dependence on the new CONFIG_CACHEMAINT_FOR_DMA Kconfig
symbol for Renesas and StarFive RISC-V SoCs
- Add a temporary workaround for a Clang compiler bug caused by using
asm_goto_output for get_user()
- Clarify our documentation to specifically state a particular ISA
specification version for a chapter number reference"
* tag 'riscv-for-linus-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Add intermediate cast to 'unsigned long' in __get_user_asm
riscv: Use 64-bit variable for output in __get_user_asm
soc: renesas: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA
riscv: ERRATA_STARFIVE_JH7100: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA
riscv: suspend: Fix stimecmp update hazard on RV32
riscv: kvm: Fix vstimecmp update hazard on RV32
riscv: clocksource: Fix stimecmp update hazard on RV32
Documentation: riscv: uabi: Clarify ISA spec version for canonical order
Linus Torvalds [Sun, 25 Jan 2026 01:18:57 +0000 (17:18 -0800)]
Merge tag 'trace-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix a crash with passing a stacktrace between synthetic events
A synthetic event is an event that combines two events into a single
event that can display fields from both events as well as the time
delta that took place between the events. It can also pass a
stacktrace from the first event so that it can be displayed by the
synthetic event (this is useful to get a stacktrace of a task
scheduling out when blocked and recording the time it was blocked
for).
A synthetic event can also connect an existing synthetic event to
another event. An issue was found that if the first synthetic event
had a stacktrace as one of its fields, and that stacktrace field was
passed to the new synthetic event to be displayed, it would crash the
kernel. This was due to the stacktrace not being saved as a
stacktrace but was still marked as one. When the stacktrace was read,
it would try to read an array but instead read the integer metadata
of the stacktrace and dereferenced a bad value.
Fix this by saving the stacktrace field as a stacktrace.
- Fix possible overflow in cmp_mod_entry() compare function
A binary search is used to find a module address and if the addresses
are greater than 2GB apart it could lead to truncation and cause a
bad search result. Use normal compares instead of a subtraction
between addresses to calculate the compare value.
- Fix output of entry arguments in function graph tracer
Depending on the configurations enabled, the entry can be two
different types that hold the argument array. The macro
FGRAPH_ENTRY_ARGS() is used to find the correct arguments from the
given type. One location was missed and still referenced the
arguments directly via entry->args and could produce the wrong value
depending on how the kernel was configured.
- Fix memory leak in scripts/tracepoint-update build tool
If the array fails to allocate, the memory for the values needs to be
freed and was not. Free the allocated values if the array failed to
allocate.
* tag 'trace-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
scripts/tracepoint-update: Fix memory leak in add_string() on failure
function_graph: Fix args pointer mismatch in print_graph_retval()
tracing: Avoid possible signed 64-bit truncation
tracing: Fix crash on synthetic stacktrace field usage
Dan Williams [Sat, 24 Jan 2026 01:22:56 +0000 (17:22 -0800)]
Documentation: Project continuity
Document project continuity procedures. This is a plan for a plan for
navigating events that affect the forward progress of the canonical
Linux repository, torvalds/linux.git.
It is a follow-up from Maintainer Summit [1].
Co-developed-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jiri Kosina <jkosina@suse.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lwn.net/Articles/1050179/ Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 24 Jan 2026 18:13:22 +0000 (10:13 -0800)]
Merge tag 'driver-core-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Always inline I/O and IRQ methods using build_assert!() to avoid
false positive build errors
- Do not free the driver's device private data in I2C shutdown()
avoiding race conditions that can lead to UAF bugs
- Drop the driver's device private data after the driver has been
fully unbound from its device to avoid UAF bugs from &Device<Bound>
scopes, such as IRQ callbacks
* tag 'driver-core-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
rust: driver: drop device private data post unbind
rust: driver: add DriverData type to the DriverLayout trait
rust: driver: add DEVICE_DRIVER_OFFSET to the DriverLayout trait
rust: driver: introduce a DriverLayout trait
rust: auxiliary: add Driver::unbind() callback
rust: i2c: do not drop device private data on shutdown()
rust: irq: always inline functions using build_assert with arguments
rust: io: always inline functions using build_assert with arguments
Linus Torvalds [Sat, 24 Jan 2026 17:36:03 +0000 (09:36 -0800)]
Merge tag 'timers-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
- Fix auxiliary timekeeper update & locking bug
- Reduce the sensitivity of the clocksource watchdog,
to fix false positive measurements that marked the
TSC clocksource unstable
* tag 'timers-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Reduce watchdog readout delay limit to prevent false positives
timekeeping: Adjust the leap state for the correct auxiliary timekeeper
Linus Torvalds [Sat, 24 Jan 2026 17:24:17 +0000 (09:24 -0800)]
Merge tag 'perf-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events fixes from Ingo Molnar:
- Fix mmap_count warning & bug when creating a group member event
with the PERF_FLAG_FD_OUTPUT flag
- Disable the sample period == 1 branch events BTS optimization
on guests, because BTS is not virtualized
* tag 'perf-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Do not enable BTS for guests
perf: Fix refcount warning on event->mmap_count increment
Linus Torvalds [Sat, 24 Jan 2026 17:02:56 +0000 (09:02 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull arm64 kvm fixes from Paolo Bonzini:
- Ensure early return semantics are preserved for pKVM fault handlers
- Fix case where the kernel runs with the guest's PAN value when
CONFIG_ARM64_PAN is not set
- Make stage-1 walks to set the access flag respect the access
permission of the underlying stage-2, when enabled
- Propagate computed FGT values to the pKVM view of the vCPU at
vcpu_load()
- Correctly program PXN and UXN privilege bits for hVHE's stage-1 page
tables
- Check that the VM is actually using VGICv3 before accessing the GICv3
CPU interface
- Delete some unused code
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers
KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit
KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF
KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps()
KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate()
KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp()
KVM: arm64: Inject UNDEF for a register trap without accessor
KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load
KVM: arm64: Fix EL2 S1 XN handling for hVHE setups
KVM: arm64: gic: Check for vGICv3 when clearing TWI
Linus Torvalds [Fri, 23 Jan 2026 22:58:51 +0000 (14:58 -0800)]
Merge tag 'kbuild-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull kbuild fixes from Nicolas Schier:
- Reduce possible complications when cross-compiling by increasing use
of ${NM} in check-function-names.sh
- Fix static linking of nconf
* tag 'kbuild-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kconfig: fix static linking of nconf
kbuild: prefer ${NM} in check-function-names.sh
Linus Torvalds [Fri, 23 Jan 2026 21:56:04 +0000 (13:56 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- A set of fixes for FPSIMD/SVE/SME state management (around signal
handling and ptrace) where a task can be placed in an invalid state
- __nocfi added to swsusp_arch_resume() to avoid a data abort on
resuming from hibernate
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Set __nocfi on swsusp_arch_resume()
arm64/fpsimd: signal: Fix restoration of SVE context
arm64/fpsimd: signal: Allocate SSVE storage when restoring ZA
arm64/fpsimd: ptrace: Fix SVE writes on !SME systems
Linus Torvalds [Fri, 23 Jan 2026 21:40:55 +0000 (13:40 -0800)]
Merge tag 'v6.19-rc6-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Use the original nents value for ib_dma_unmap_sg(), preventing
potential memory corruption in the RDMA transport layer
- Fix a naming discrepancy in the kernel-doc for
ksmbd_vfs_kern_path_start_removing() as identified by sparse static
analysis
- Reset smb_direct_port to its default value during initialization to
ensure the correct port is used when switching between different RDMA
device types without module reload
* tag 'v6.19-rc6-server-fixes' of git://git.samba.org/ksmbd:
smb: server: reset smb_direct_port = SMB_DIRECT_PORT_INFINIBAND on init
smb: server: fix comment for ksmbd_vfs_kern_path_start_removing()
ksmbd: smbd: fix dma_unmap_sg() nents
Linus Torvalds [Fri, 23 Jan 2026 21:20:24 +0000 (13:20 -0800)]
Merge tag 'pci-v6.19-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix the pci_do_resource_release_and_resize() failure path, which
clobbered the intended failure return value (Ilpo Järvinen)
- Restore resizable BAR size before value because the size determines
which bits are writable; this fixes i915 and xe regressions (Ilpo
Järvinen)
* tag 'pci-v6.19-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Fix Resizable BAR restore order
PCI: Fix BAR resize rollback path overwriting ret
* tag 'platform-drivers-x86-v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (21 commits)
platform/x86: acer-wmi: Fix missing capability check
platform/x86: acer-wmi: Extend support for Acer Nitro AN515-58
platform/x86: asus-armoury: add support for GA403WW
platform/x86: asus-armoury: keep the list ordered alphabetically
platform/x86: asus-armoury: add support for G835L
platform/x86: asus-armoury: fix ppt data for FA608UM
platform/x86: hp-bioscfg: Fix automatic module loading
platform/x86: hp-bioscfg: Fix kernel panic in GET_INSTANCE_ID macro
platform/x86: hp-bioscfg: Fix kobject warnings for empty attribute names
platform/x86: asus-wmi: fix sending OOBE at probe
platform/x86: asus-armoury: add support for FA617XT
platform/x86: asus-armoury: add support for FA401UV
platform/x86: asus-armoury: add support for GV302XV
platform/x86: asus-armoury: Add power limits for Asus G513QY
platform/x86/amd: Fix memory leak in wbrf_record()
platform/mellanox: Fix SN5640/SN5610 LED platform data
docs: fix PPR for AMD EPYC broken link
docs: alienware-wmi: fix typo
platform/x86: asus-armoury: add support for GA403UV
asus-armoury: fix ppt data for GA403U* renaming to GA403UI
...
Linus Torvalds [Fri, 23 Jan 2026 21:12:49 +0000 (13:12 -0800)]
Merge tag 'pmdomain-v6.19-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- imx: Remove incorrect reset/clock mask for 8mq vpu
- rockchip: Fix initial state of PM domain
* tag 'pmdomain-v6.19-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain:rockchip: Fix init genpd as GENPD_STATE_ON before regulator ready
pmdomain: imx8m-blk-ctrl: Remove separate rst and clk mask for 8mq vpu
Linus Torvalds [Fri, 23 Jan 2026 20:53:56 +0000 (12:53 -0800)]
Merge tag 'block-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- A set of selftest fixes for ublk
- Fix for a pid mismatch in ublk, comparing PIDs in different
namespaces if run inside a namespace
- Fix for a regression added in this release with polling, where the
nvme tcp connect code would spin forever
- Zoned device error path fix
- Tweak the blkzoned uapi additions from this kernel release, making
them more easily discoverable
- Fix for a regression in bcache with bio endio handling added in this
release
* tag 'block-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
bcache: use bio cloning for detached device requests
blk-mq: use BLK_POLL_ONESHOT for synchronous poll completion
selftests/ublk: fix garbage output in foreground mode
selftests/ublk: fix error handling for starting device
selftests/ublk: fix IO thread idle check
block: make the new blkzoned UAPI constants discoverable
ublk: fix ublksrv pid handling for pid namespaces
block: Fix an error path in disk_update_zone_resources()
Linus Torvalds [Fri, 23 Jan 2026 20:51:00 +0000 (12:51 -0800)]
Merge tag 'io_uring-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Fix for a potential leak of an iovec, if a specific cleanup path is
used and the rw_cache is full at the time of the call
- Fix for a regression added in this cycle, where waitid should be
using prober release/acquire semantics for updating the wait queue
head
- Check for the cancelation bit being set for every work item processed
by io-wq, not just at the start of the loop. Has no real practical
implications other than to shut up syzbot doing crazy things that
grossly overload a system, hence slowing down ring exit
- A few selftest additions, updating the mini_liburing that selftests
use
* tag 'io_uring-6.19-20260122' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
selftests/io_uring: support NO_SQARRAY in miniliburing
selftests/io_uring: add io_uring_queue_init_params
io_uring/io-wq: check IO_WQ_BIT_EXIT inside work run loop
io_uring/waitid: fix KCSAN warning on io_waitid->head
io_uring/rw: free potentially allocated iovec on cache put failure
Linus Torvalds [Fri, 23 Jan 2026 20:46:12 +0000 (12:46 -0800)]
Merge tag 'iommu-fixes-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- AMD IOMMU: Fix potential NULL-ptr dereference in error path
of amd_iommu_probe_device()
- Generic IOMMUPT: Fix another compiler issue seen with older
compiler versions
- Fix signedness issue in ARM IO-PageTable code
* tag 'iommu-fixes-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/io-pgtable-arm: fix size_t signedness bug in unmap path
iommupt: Make it clearer to the compiler that pts.level == 0 for single page
iommu/amd: Fix error path in amd_iommu_probe_device()
Weigang He [Mon, 19 Jan 2026 11:45:42 +0000 (11:45 +0000)]
scripts/tracepoint-update: Fix memory leak in add_string() on failure
When realloc() fails in add_string(), the function returns -1 but leaves
*vals pointing to the previously allocated memory. This can cause memory
leaks in callers like make_trace_array() that return on error without
freeing the partially built array.
Fix this by freeing *vals and setting it to NULL when realloc() fails.
This makes the error handling self-contained in add_string() so callers
don't need to handle cleanup on failure.
This bug is found by my static analysis tool and my code review.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: e30f8e61e2518 ("tracing: Add a tracepoint verification check at build time") Link: https://patch.msgid.link/20260119114542.1714405-1-geoffreyhe2@gmail.com Signed-off-by: Weigang He <geoffreyhe2@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Donglin Peng [Mon, 12 Jan 2026 02:16:01 +0000 (10:16 +0800)]
function_graph: Fix args pointer mismatch in print_graph_retval()
When funcgraph-args and funcgraph-retaddr are both enabled, many kernel
functions display invalid parameters in trace logs.
The issue occurs because print_graph_retval() passes a mismatched args
pointer to print_function_args(). Fix this by retrieving the correct
args pointer using the FGRAPH_ENTRY_ARGS() macro.
Link: https://patch.msgid.link/20260112021601.1300479-1-dolinux.peng@gmail.com Fixes: f83ac7544fbf ("function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Ian Rogers [Thu, 8 Jan 2026 00:26:25 +0000 (16:26 -0800)]
tracing: Avoid possible signed 64-bit truncation
64-bit truncation to 32-bit can result in the sign of the truncated
value changing. The cmp_mod_entry is used in bsearch and so the
truncation could result in an invalid search order. This would only
happen were the addresses more than 2GB apart and so unlikely, but
let's fix the potentially broken compare anyway.
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260108002625.333331-1-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Fri, 23 Jan 2026 00:48:24 +0000 (19:48 -0500)]
tracing: Fix crash on synthetic stacktrace field usage
When creating a synthetic event based on an existing synthetic event that
had a stacktrace field and the new synthetic event used that field a
kernel crash occurred:
~# cd /sys/kernel/tracing
~# echo 's:stack unsigned long stack[];' > dynamic_events
~# echo 'hist:keys=prev_pid:s0=common_stacktrace if prev_state & 3' >> events/sched/sched_switch/trigger
~# echo 'hist:keys=next_pid:s1=$s0:onmatch(sched.sched_switch).trace(stack,$s1)' >> events/sched/sched_switch/trigger
The above creates a synthetic event that takes a stacktrace when a task
schedules out in a non-running state and passes that stacktrace to the
sched_switch event when that task schedules back in. It triggers the
"stack" synthetic event that has a stacktrace as its field (called "stack").
The above makes another synthetic event called "syscall_stack" that
attaches the first synthetic event (stack) to the sys_exit trace event and
records the stacktrace from the stack event with the id of the system call
that is exiting.
When enabling this event (or using it in a historgram):
The reason is that the stacktrace field is not labeled as such, and is
treated as a normal field and not as a dynamic event that it is.
In trace_event_raw_event_synth() the event is field is still treated as a
dynamic array, but the retrieval of the data is considered a normal field,
and the reference is just the meta data:
// Meta data is retrieved instead of a dynamic array
str_val = (char *)(long)var_ref_vals[val_idx];
// Then when it tries to process it:
len = *((unsigned long *)str_val) + 1;
It triggers a kernel page fault.
To fix this, first when defining the fields of the first synthetic event,
set the filter type to FILTER_STACKTRACE. This is used later by the second
synthetic event to know that this field is a stacktrace. When creating
the field of the new synthetic event, have it use this FILTER_STACKTRACE
to know to create a stacktrace field to copy the stacktrace into.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Link: https://patch.msgid.link/20260122194824.6905a38e@gandalf.local.home Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Fri, 23 Jan 2026 18:20:28 +0000 (10:20 -0800)]
Merge tag 'spi-fix-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"One new device ID, plus a few fixes.
The most substantial of the fixes is for the Cadence driver which in
at least some instantiations requires transmit data to drive data
through the IP"
* tag 'spi-fix-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: intel-pci: Add support for Nova Lake SPI serial flash
spi: spi-cadence: enable SPI_CONTROLLER_MUST_TX
spi: hisi-kunpeng: Fixed the wrong debugfs node name in hisi_spi debugfs initialization
spi: spi-sprd-adi: Fix double free in probe error path
Linus Torvalds [Fri, 23 Jan 2026 18:17:06 +0000 (10:17 -0800)]
Merge tag 'regmap-fix-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"A couple of small fixes, one error handling one and another for misuse
of the hwspinlock API"
* tag 'regmap-fix-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix race condition in hwspinlock irqsave routine
regmap: maple: free entry on mas_store_gfp() failure
Linus Torvalds [Fri, 23 Jan 2026 18:14:52 +0000 (10:14 -0800)]
Merge tag 'gpio-fixes-for-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"Some fixes to resource leaks in the character device handling and
another small fix for shared GPIO management:
- fix resource leaks in error paths in GPIO character device code
- return -ENOMEM and not -ENODEV on memory allocation failure
- fix an audio issue on Qualcomm platforms due to configuration not
being propagated to pinctrl from shared GPIO proxy"
* tag 'gpio-fixes-for-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: shared: propagate configuration to pinctrl
gpio: cdev: Fix resource leaks on errors in gpiolib_cdev_register()
gpio: cdev: Fix resource leaks on errors in lineinfo_changed_notify()
gpio: cdev: Correct return code on memory allocation failure
Zhaoyang Huang [Thu, 22 Jan 2026 11:49:25 +0000 (19:49 +0800)]
arm64: Set __nocfi on swsusp_arch_resume()
A DABT is reported[1] on an android based system when resume from hiberate.
This happens because swsusp_arch_suspend_exit() is marked with SYM_CODE_*()
and does not have a CFI hash, but swsusp_arch_resume() will attempt to
verify the CFI hash when calling a copy of swsusp_arch_suspend_exit().
Given that there's an existing requirement that the entrypoint to
swsusp_arch_suspend_exit() is the first byte of the .hibernate_exit.text
section, we cannot fix this by marking swsusp_arch_suspend_exit() with
SYM_FUNC_*(). The simplest fix for now is to disable the CFI check in
swsusp_arch_resume().
Mark swsusp_arch_resume() as __nocfi to disable the CFI check.
Linus Torvalds [Fri, 23 Jan 2026 17:37:35 +0000 (09:37 -0800)]
Merge tag 'sound-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of a few more small fixes for HD- and USB-audio,
including a regression fix for the OOB fix that was included
in the previous pull request"
* tag 'sound-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: ALC269 fixup for Lenovo Yoga Book 9i 13IRU8 audio
ALSA: hda/realtek: Add quirk for Samsung 730QED to fix headphone
ALSA: usb-audio: Use the right limit for PCM OOB check
ALSA: usb-audio: Fix use-after-free in snd_usb_mixer_free()
ALSA: hda/realtek: Fix headset mic for TongFang X6AR55xU
ALSA: ctxfi: Fix potential OOB access in audio mixer handling
selftests: ALSA: Remove unused variable in utimer-test
ALSA: usb-audio: Add delay quirk for MOONDROP Moonriver2 Ti
ALSA: scarlett2: Fix buffer overflow in config retrieval
ALSA: usb: Increase volume range that triggers a warning
Linus Torvalds [Fri, 23 Jan 2026 17:01:26 +0000 (09:01 -0800)]
Merge tag 'drm-fixes-2026-01-23' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Probably a good thing you decided to do an rc8 in this round. Nothing
stands out, but xe/amdgpu and mediatek all have a bunch of fixes, and
then there are a few other single patches. Hopefully next week is
calmer for release.
xe:
- Disallow bind-queue sharing across multiple VMs
- Fix xe userptr in the absence of CONFIG_DEVICE_PRIVATE
- Fix a missed page count update
- Fix a confused argument to alloc_workqueue()
- Kernel-doc fixes
- Disable a workaround on VFs
- Fix a job lock assert
- Update wedged.mode only after successful reset policy change
- Select CONFIG_DEVICE_PRIVATE when DRM_XE_GPUSVM is selected
amdgpu:
- fix color pipeline string leak
- GC 12 fix
- Misc error path fixes
- DC analog fix
- SMU 6 fixes
- TLB flush fix
- DC idle optimization fix
amdkfd:
- GC 11 cooperative launch fix
imagination:
- sync wait for logtype update completion to ensure FW trace
is available
bridge/synopsis:
- Fix error paths in dw_dp_bind
nouveau:
- Add and implement missing DSB connector types, and improve
unknown connector handling
- Set missing atomic function ops
intel:
- place 3D lut at correct place in pipeline
- fix color pipeline string leak
vkms:
- fix color pipeline string leak
mediatek:
- Fix platform_get_irq() error checking
- HDMI DDC v2 driver fixes
- dpi: Find next bridge during probe
- mtk_gem: Partial refactor and use drm_gem_dma_object
- dt-bindings: Fix typo 'hardwares' to 'hardware'"
* tag 'drm-fixes-2026-01-23' of https://gitlab.freedesktop.org/drm/kernel: (38 commits)
Revert "drm/amd/display: pause the workload setting in dm"
drm/xe: Select CONFIG_DEVICE_PRIVATE when DRM_XE_GPUSVM is selected
drm, drm/xe: Fix xe userptr in the absence of CONFIG_DEVICE_PRIVATE
drm/i915/display: Fix color pipeline enum name leak
drm/vkms: Fix color pipeline enum name leak
drm/amd/display: Fix color pipeline enum name leak
drm/i915/color: Place 3D LUT after CSC in plane color pipeline
drm/nouveau/disp: Set drm_mode_config_funcs.atomic_(check|commit)
drm/nouveau: implement missing DCB connector types; gracefully handle unknown connectors
drm/nouveau: add missing DCB connector types
drm/amdgpu: fix type for wptr in ring backup
drm/amdgpu: Fix validating flush_gpu_tlb_pasid()
drm/amd/pm: Workaround SI powertune issue on Radeon 430 (v2)
drm/amd/pm: Don't clear SI SMC table when setting power limit
drm/amd/pm: Fix si_dpm mmCG_THERMAL_INT setting
drm/xe: Update wedged.mode only after successful reset policy change
drm/xe/migrate: fix job lock assert
drm/xe/uapi: disallow bind queue sharing
drm/amd/display: Only poll analog connectors
drm/amdgpu: fix error handling in ib_schedule()
...
Revert commit bfc467db60b7 ("serial: remove redundant
tty_port_link_device()") because the tty_port_link_device() is not
redundant: the tty->port has to be confured before we call
uart_configure_port(), otherwise user-space can open console without TTY
linked to the driver.
This tty_port_link_device() was added explicitly to avoid this exact
issue in commit fb2b90014d78 ("tty: link tty and port before configuring
it as console"), so offending commit basically reverted the fix saying
it is redundant without addressing the actual race condition presented
there.
Reproducible always as tty->port warning on Qualcomm SoC with most of
devices disabled, so with very fast boot, and one serial device being
the console:
printk: legacy console [ttyMSM0] enabled
printk: legacy console [ttyMSM0] enabled
printk: legacy bootconsole [qcom_geni0] disabled
printk: legacy bootconsole [qcom_geni0] disabled
------------[ cut here ]------------
tty_init_dev: ttyMSM driver does not set tty->port. This would crash the kernel. Fix the driver!
WARNING: drivers/tty/tty_io.c:1414 at tty_init_dev.part.0+0x228/0x25c, CPU#2: systemd/1
Modules linked in: socinfo tcsrcc_eliza gcc_eliza sm3_ce fuse ipv6
CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G S 6.19.0-rc4-next-20260108-00024-g2202f4d30aa8 #73 PREEMPT
Tainted: [S]=CPU_OUT_OF_SPEC
Hardware name: Qualcomm Technologies, Inc. Eliza (DT)
...
tty_init_dev.part.0 (drivers/tty/tty_io.c:1414 (discriminator 11)) (P)
tty_open (arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 3) drivers/tty/tty_io.c:2073 (discriminator 3) drivers/tty/tty_io.c:2120 (discriminator 3))
chrdev_open (fs/char_dev.c:411)
do_dentry_open (fs/open.c:962)
vfs_open (fs/open.c:1094)
do_open (fs/namei.c:4634)
path_openat (fs/namei.c:4793)
do_filp_open (fs/namei.c:4820)
do_sys_openat2 (fs/open.c:1391 (discriminator 3))
...
Starting Network Name Resolution...
Apparently the flow with this small Yocto-based ramdisk user-space is:
driver (qcom_geni_serial.c): user-space:
============================ ===========
qcom_geni_serial_probe()
uart_add_one_port()
serial_core_register_port()
serial_core_add_one_port()
uart_configure_port()
register_console()
|
| open console
| ...
| tty_init_dev()
| driver->ports[idx] is NULL
|
tty_port_register_device_attr_serdev()
tty_port_link_device() <- set driver->ports[idx]
Vincent Guittot [Fri, 23 Jan 2026 10:28:58 +0000 (11:28 +0100)]
sched/fair: Revert force wakeup preemption
This agressively bypasses run_to_parity and slice protection with the
assumpiton that this is what waker wants but there is no garantee that
the wakee will be the next to run. It is a better choice to use
yield_to_task or WF_SYNC in such case.
This increases the number of resched and preemption because a task becomes
quickly "ineligible" when it runs; We update the task vruntime periodically
and before the task exhausted its slice or at least quantum.
Example:
2 tasks A and B wake up simultaneously with lag = 0. Both are
eligible. Task A runs 1st and wakes up task C. Scheduler updates task
A's vruntime which becomes greater than average runtime as all others
have a lag == 0 and didn't run yet. Now task A is ineligible because
it received more runtime than the other task but it has not yet
exhausted its slice nor a min quantum. We force preemption, disable
protection but Task B will run 1st not task C.
Sidenote, DELAY_ZERO increases this effect by clearing positive lag at
wake up.
Mel Gorman [Tue, 20 Jan 2026 11:33:35 +0000 (11:33 +0000)]
sched/fair: Disable scheduler feature NEXT_BUDDY
NEXT_BUDDY was disabled with the introduction of EEVDF and enabled again
after NEXT_BUDDY was rewritten for EEVDF by commit e837456fdca8 ("sched/fair:
Reimplement NEXT_BUDDY to align with EEVDF goals"). It was not expected
that this would be a universal win without a crystal ball instruction
but the reported regressions are a concern [1][2] even if gains were
also reported. Specifically;
o mysql with client/server running on different servers regresses
o specjbb reports lower peak metrics
o daytrader regresses
The mysql is realistic and a concern. It needs to be confirmed if
specjbb is simply shifting the point where peak performance is measured
but still a concern. daytrader is considered to be representative of a
real workload.
Access to test machines is currently problematic for verifying any fix to
this problem. Disable NEXT_BUDDY for now by default until the root causes
are addressed.
Linus Torvalds [Fri, 23 Jan 2026 03:39:25 +0000 (19:39 -0800)]
Merge tag 'v6.19-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
- Add assoclen check in authencesn
* tag 'v6.19-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: authencesn - reject too-short AAD (assoclen<8) to match ESP/ESN spec
riscv: Add intermediate cast to 'unsigned long' in __get_user_asm
After commit bdce162f2e57 ("riscv: Use 64-bit variable for output in
__get_user_asm"), there is a warning when building for 32-bit RISC-V:
In file included from include/linux/uaccess.h:13,
from include/linux/sched/task.h:13,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/mm.h:36,
from include/linux/migrate.h:5,
from mm/migrate.c:16:
mm/migrate.c: In function 'do_pages_move':
arch/riscv/include/asm/uaccess.h:115:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
115 | (x) = (__typeof__(x))__tmp; \
| ^
arch/riscv/include/asm/uaccess.h:198:17: note: in expansion of macro '__get_user_asm'
198 | __get_user_asm("lb", (x), __gu_ptr, label); \
| ^~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:218:9: note: in expansion of macro '__get_user_nocheck'
218 | __get_user_nocheck(x, ptr, __gu_failed); \
| ^~~~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:255:9: note: in expansion of macro '__get_user_error'
255 | __get_user_error(__gu_val, __gu_ptr, __gu_err); \
| ^~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:285:17: note: in expansion of macro '__get_user'
285 | __get_user((x), __p) : \
| ^~~~~~~~~~
mm/migrate.c:2358:29: note: in expansion of macro 'get_user'
2358 | if (get_user(p, pages + i))
| ^~~~~~~~
Add an intermediate cast to 'unsigned long', which is guaranteed to be the same
width as a pointer, before the cast to the type of the output variable to clear
up the warning.
Fixes: bdce162f2e57 ("riscv: Use 64-bit variable for output in __get_user_asm") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601210526.OT45dlOZ-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20260121-riscv-fix-int-to-pointer-cast-v1-1-b83eebe57c76@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
Cedric Xing [Fri, 23 Jan 2026 00:39:15 +0000 (18:39 -0600)]
x86: make page fault handling disable interrupts properly
There's a big comment in the x86 do_page_fault() about our interrupt
disabling code:
* User address page fault handling might have reenabled
* interrupts. Fixing up all potential exit points of
* do_user_addr_fault() and its leaf functions is just not
* doable w/o creating an unholy mess or turning the code
* upside down.
but it turns out that comment is subtly wrong, and the code as a result
is also wrong.
Because it's certainly true that we may have re-enabled interrupts when
handling user page faults. And it's most certainly true that we don't
want to bother fixing up all the cases.
But what isn't true is that it's limited to user address page faults.
The confusion stems from the fact that we have logic here that depends
on the address range of the access, but other code then depends on the
_context_ the access was done in. The two are not related, even though
both of them are about user-vs-kernel.
In other words, both user and kernel addresses can cause interrupts to
have been enabled (eg when __bad_area_nosemaphore() gets called for user
accesses to kernel addresses). As a result we should make sure to
disable interrupts again regardless of the address range before
returning to the low-level fault handling code.
The __bad_area_nosemaphore() code actually did disable interrupts again
after enabling them, just not consistently. Ironically, as noted in the
original comment, fixing up all the cases is just not worth it, when the
simple solution is to just do it unconditionally in one single place.
So remove the incomplete case that unsuccessfully tried to do what the
comment said was "not doable" in commit ca4c6a9858c2 ("x86/traps: Make
interrupt enable/disable symmetric in C code"), and just make it do the
simple and straightforward thing.
Signed-off-by: Cedric Xing <cedric.xing@intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Fixes: ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code") Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
smb: server: reset smb_direct_port = SMB_DIRECT_PORT_INFINIBAND on init
This allows testing with different devices (iwrap vs. non-iwarp) without
'rmmod ksmbd && modprobe ksmbd', but instead
'ksmbd.control -s && ksmbd.mountd' is enough.
In the long run we want to listen on iwarp and non-iwarp at the same time,
but requires more changes, most likely also in the rdma layer.
Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: server: fix comment for ksmbd_vfs_kern_path_start_removing()
This was found by sparse...
Fixes: 1ead2213dd7d ("smb/server: use end_removing_noperm for for target of smb2_create_link()") Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: NeilBrown <neil@brown.name> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Thomas Fourier [Fri, 9 Jan 2026 10:38:39 +0000 (11:38 +0100)]
ksmbd: smbd: fix dma_unmap_sg() nents
The dma_unmap_sg() functions should be called with the same nents as the
dma_map_sg(), not the value the map function returned.
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>