Merge tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:
- Fix cross-compilation for hv tools (Aditya Garg)
- Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain)
- Limit channel interrupt scan to relid high water mark (Michael
Kelley)
- Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui)
- Fix cleanup and shutdown issues for MSHV (Jork Loeser)
- Introduce more tracing support for MSHV (Stanislav Kinsburskii)
* tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Skip LP/VP creation on kexec
x86/hyperv: move stimer cleanup to hv_machine_shutdown()
Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing
mshv: Add tracepoint for GPA intercept handling
mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
tools: hv: Fix cross-compilation
Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv
mshv: Introduce tracing support
Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
Cássio Gabriel [Wed, 22 Apr 2026 01:07:41 +0000 (22:07 -0300)]
ALSA: usb-audio: Fix Audio Advantage Micro II SPDIF switch
snd_microii_spdif_switch_put() returns 0 when the requested
vendor register value differs from the cached one.
This comparison was inverted by the resume-support conversion,
so real SPDIF switch toggles are ignored while no-op writes still
issue SET_CUR and report success.
Return early only when the requested value matches the cached one.
snd_emuusb_set_samplerate() unconditionally notifies the E-MU
SampleRate Extension Unit control after issuing SET_CUR.
If snd_usb_mixer_set_ctl_value() fails, the control value has not
changed, yet snd_usb_mixer_notify_id() still invalidates the cache and
emits a value-change event to userspace.
Len Brown [Wed, 22 Apr 2026 15:13:00 +0000 (11:13 -0400)]
tools/power turbostat: Process HT siblings in CPU order
On large systems with HT sibling cpu#'s more than 32 apart,
HT siblings were processed and displayed in reverse order.
This was due to how set_thread_siblings() parsed the
sibling-bit-mask.
Update set_thread_siblings to instead parse the sibling-list,
like other cpu lists, and to thus order HT siblings
by ascending CPU number, no matter the size of the system.
Len Brown [Tue, 21 Apr 2026 22:35:15 +0000 (18:35 -0400)]
tools/power turbostat: Fix --cpu-set 1 regression on HT systems
When the "--cpu-set" option limits turbostat to run on
a higher numbered HT sibling, it exits upon dividing by zero.
This is because the HT support handles higher numbered siblings
at the same time as lower numbered siblings. But when that lower
number sibling is dis-allowed, the higher numbered sibling is
never processed. The result is a time delta of 0, which results
in a divide by 0 for any of the "per-second" metrics.
Enhance the HT enumeration code to record all siblings (up to SMT4).
Consult this complete HT sibling list to determine when
to process an HT sibling, and when to skip it.
Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#") Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Thu, 16 Apr 2026 20:17:31 +0000 (16:17 -0400)]
tools/power turbostat: Fix --cpu-set 0 regression on HT systems
"turbostat --cpu-set 0" appears to hang if cpu0 has an HT sibling.
This is because the initialization code recognizes that it does not
have to open perf files for the HT sibling, but the HT support
in the collection code sees the HT sibling and tries to read
from an uninitialized file descriptor, 0 (standard input).
Access HT siblings only when they are in the allowed set.
Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#") Signed-off-by: Len Brown <len.brown@intel.com> Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
The '-P' short option (shorthand for --no-perf) is not present in the
optstring of the second call to getopt_long_only(). This results in
the "unrecognized option" error when the tool reaches the main parsing
loop.
Add 'P' to the second getopt_long_only() call to ensure it is
consistently recognized.
Fixes: a0e86c90b83c ("tools/power turbostat: Add --no-perf option") Signed-off-by: David Arcari <darcari@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
Paolo Bonzini [Tue, 21 Apr 2026 10:04:55 +0000 (11:04 +0100)]
tracing: Make undefsyms_base.c a first-class citizen
Linus points out that dumping undefsyms_base.c form the Makefile
is rather ugly, and that a much better course of action would be
to have this file as a first-class citizen in the git tree.
This allows some extra cleanup in the Makefile, and the removal of
the .gitignore file in kernel/trace.
Hardik Phalet [Tue, 10 Mar 2026 12:30:27 +0000 (12:30 +0000)]
fbdev: hgafb: Request memory region before ioremap
The driver calls ioremap() on the HGA video memory at 0xb0000 without
first reserving the physical address range. This leaves the kernel
resource tree incomplete and can cause silent conflicts with other
drivers claiming the same range.
Add a devm_request_mem_region() call before ioremap() in
hga_card_detect() to reserve the memory region.
Maciej Strozek [Mon, 20 Apr 2026 11:48:17 +0000 (12:48 +0100)]
ASoC: sdw_utils: cs42l43: allow spk component names to be combined
Move handling of cs42l43-spk component string into SOF mechanism [1]
which will allow it to be aggregated with other speakers.
Likewise handle the cs35l56-bridge special case which should not be
combined to keep compatibility with UCM.
Eric Biggers [Sat, 18 Apr 2026 22:13:11 +0000 (15:13 -0700)]
smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()
Since the crypto library API is now being used instead of crypto_shash,
all structs for MAC computation are now just fixed-size structs
allocated on the stack; no dynamic allocations are ever required.
Besides being much more efficient, this also means that the
'allocate_crypto' argument to smb2_calc_signature() and
smb3_calc_signature() is no longer used. Remove this unused argument.
Acked-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Eric Biggers [Sat, 18 Apr 2026 22:13:10 +0000 (15:13 -0700)]
smb: client: Make generate_key() return void
Since the crypto library API is now being used instead of crypto_shash,
generate_key() can no longer fail. Make it return void and simplify the
callers accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Eric Biggers [Sat, 18 Apr 2026 22:13:09 +0000 (15:13 -0700)]
smb: client: Remove obsolete cmac(aes) allocation
Since the crypto library API is now being used instead of crypto_shash,
the "cmac(aes)" crypto_shash that is being allocated and stored in
'struct cifs_secmech' is no longer used. Remove it.
That makes the kconfig selection of CRYPTO_CMAC and the module softdep
on "cmac" unnecessary. So remove those too.
Finally, since this removes the last use of crypto_shash from the smb
client, also remove the remaining crypto_shash-related helper functions.
Note: cifs_unicode.c was relying on <linux/unaligned.h> being included
transitively via <crypto/internal/hash.h>. Since the latter include is
removed, make cifs_unicode.c include <linux/unaligned.h> explicitly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Eric Biggers [Sat, 18 Apr 2026 22:13:08 +0000 (15:13 -0700)]
smb: client: Use AES-CMAC library for SMB3 signature calculation
Convert smb3_calc_signature() to use the AES-CMAC library instead of a
"cmac(aes)" crypto_shash.
The result is simpler and faster code. With the library there's no need
to allocate memory, no need to handle errors except for key preparation,
and the AES-CMAC code is accessed directly without inefficient indirect
calls and other unnecessary API overhead.
For now a "cmac(aes)" crypto_shash is still being allocated in
'struct cifs_secmech'. Later commits will remove that, simplifying the
code even further.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).
Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
defined hot path (match finding) and a cold path (encode literals +
match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
aggressively if too few matches are being found
- name some constants for more meaningful context
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: compress: fix counting in LZ77 match finding
- lz77_match_len() increments @cur before checking for equality,
leading to off-by-one match len in some cases.
Fix by moving pointers increment to inside the loop.
Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
"buf + step < end" when the correct is "<=" for such cases.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: compress: fix buffer overrun in lz77_compress()
@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.
However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly, and even by stopping the
main loop at 7/8 of @slen, writing leftover literals could write past
the end of @dst because of LZ77 metadata.
To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and worst
cast scenario (all literals).
While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl
After validate_dacl() was factored out in commit 149822e5541c, the
local end_of_dacl in parse_dacl() is only read by the dump_ace()
call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the
variable is assigned but never used, which gcc -W=1 flags as
-Wunused-but-set-variable.
Remove the local and compute the end-of-dacl pointer inline at the
single call site inside the existing CIFS_DEBUG2 guard. No
functional change: when CIFS_DEBUG2 is enabled the argument value
is identical to what the removed local carried; when CIFS_DEBUG2
is disabled the code was already dead.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/ Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Steve French <stfrench@microsoft.com>
Instead of fixing this with some kind of "is module initialized"
approach, this patch instead moves that functionality to procfs,
setting a write op for the existing open_dirs entry, where
writing a 0 to it will drop the cached directory entries.
Also make it available only when CONFIG_CIFS_DEBUG=y.
A small change needed now is to not call flush_delayed_work()
on invalidate_all_cached_dirs() when called from procfs (can't sleep in
that context).
So add a @sync arg to invalidate_all_cached_dirs() to control when to
flush the delayed works.
Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents") Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: require a full NFS mode SID before reading mode bits
parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS
mode SID and reads sid.sub_auth[2] to recover the mode bits.
That assumes the ACE carries three subauthorities, but compare_sids()
only compares min(a, b) subauthorities. A malicious server can return
an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still
matches sid_unix_NFS_mode and then drives the sub_auth[2] read four
bytes past the end of the ACE.
Require num_subauth >= 3 before treating the ACE as an NFS mode SID.
This keeps the fix local to the special-SID mode path without changing
compare_sids() semantics for the rest of cifsacl.
Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: validate the whole DACL before rewriting it in cifsacl
build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a
server-supplied dacloffset and then use the incoming ACL to rebuild the
chmod/chown security descriptor.
The original fix only checked that the struct smb_acl header fits before
reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate
header-field OOB read, but the rewrite helpers still walk ACEs based on
pdacl->num_aces with no structural validation of the incoming DACL body.
A malicious server can return a truncated DACL that still contains a
header, claims one or more ACEs, and then drive
replace_sids_and_copy_aces() or set_chmod_dacl() past the validated
extent while they compare or copy attacker-controlled ACEs.
Factor the DACL structural checks into validate_dacl(), extend them to
validate each ACE against the DACL bounds, and use the shared validator
before the chmod/chown rebuild paths. parse_dacl() reuses the same
validator so the read-side parser and write-side rewrite paths agree on
what constitutes a well-formed incoming DACL.
Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path
smb2_ioctl_query_info() has two response-copy branches: PASSTHRU_FSCTL
and the default QUERY_INFO path. The QUERY_INFO branch clamps
qi.input_buffer_length to the server-reported OutputBufferLength and then
copies qi.input_buffer_length bytes from qi_rsp->Buffer to userspace, but
it never verifies that the flexible-array payload actually fits within
rsp_iov[1].iov_len.
A malicious server can return OutputBufferLength larger than the actual
QUERY_INFO response, causing copy_to_user() to walk past the response
buffer and expose adjacent kernel heap to userspace.
Guard the QUERY_INFO copy with a bounds check on the actual Buffer
payload. Use struct_size(qi_rsp, Buffer, qi.input_buffer_length)
rather than an open-coded addition so the guard cannot overflow on
32-bit builds.
Fixes: f5778c398713 ("SMB3: Allow SMB3 FSCTL queries to be sent to server from tools") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Steve French <stfrench@microsoft.com>
Amit Barzilai [Mon, 20 Apr 2026 13:44:23 +0000 (16:44 +0300)]
fbdev: clps711x-fb: Request memory region for MMIO
Use devm_platform_get_and_ioremap_resource() for resource 0 (the MMIO
control register range) instead of open-coding platform_get_resource()
and devm_ioremap() separately. The helper requests the memory region
before mapping it, which registers the range in /proc/iomem and prevents
another driver from mapping the same registers.
This makes resource 0 consistent with resource 1 (the framebuffer),
which already uses devm_platform_get_and_ioremap_resource().
Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Amit Barzilai <amit.barzilai22@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
Amit Barzilai [Mon, 20 Apr 2026 13:44:22 +0000 (16:44 +0300)]
fbdev: cobalt_lcdfb: Request memory region
Use devm_platform_get_and_ioremap_resource() instead of open-coding
platform_get_resource() and devm_ioremap() separately. The helper
requests the memory region before mapping it, which registers the range
in /proc/iomem and prevents another driver from mapping the same
registers.
Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Amit Barzilai <amit.barzilai22@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
Limit the digital gain and PA volumes to a combined -3 dB in the machine
driver to reduce the risk of speaker damage until we have active speaker
protection in place (or higher safe levels have been established).
Based on commit c481016bb4f8 ("ASoC: qcom: sc8280xp: limit speaker
volumes") which addressed the same issue on the sc8280x SoC with some
minor changes as explained below.
The Digital Volume behaves almost identical to sc8280x since both use
the same lpass-wsa-macro, but x1e80100 has two sets of controls prefixed
with WSA and WSA2.
For PA x1e80100 machines use wsa884x amplifiers which expose a linear
scale from -9 dB to 9 dB with a 1.5 dB step size giving us
0 dB = -9 dB + 6 * 1.5 dB.
On x1e80100 there are two different speaker topologies we need to handle:
2-Speakers: SpkrLeft, Spkr Right
4-Speakers: WooferLeft, WooferRight, TweeterLeft, TweeterRight
Johan Hovold [Tue, 21 Apr 2026 14:39:23 +0000 (16:39 +0200)]
spi: axiado: fix runtime pm imbalance on probe failure
Make sure that the controller is active before disabling clocks on late
probe failure and on driver unbind to avoid a clock disable imbalance.
Also make sure that the usage count is balanced on probe failure (e.g.
probe deferral) so that the controller can be suspended when a driver is
later bound.
Note that the runtime PM state can only be set when runtime PM is
disabled.
Fixes: e75a6b00ad79 ("spi: axiado: Add driver for Axiado SPI DB controller") Cc: stable@vger.kernel.org # 7.0 Cc: Vladimir Moravcevic <vmoravcevic@axiado.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421143925.1551781-2-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
When CONFIG_FB_SAVAGE_I2C is enabled, savagefb_probe() can build both an
EDID-derived monspecs.modedb and a modelist from it before later failing.
The normal success path frees monspecs.modedb after the initial mode selection,
but the probe error path only deletes the I2C busses and misses the
EDID-derived allocations.
Free both the modelist and monspecs.modedb on the failed: unwind path.
Co-developed-by: Myeonghun Pak <mhun512@gmail.com> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Co-developed-by: Taegyu Kim <tmk5904@psu.edu> Signed-off-by: Taegyu Kim <tmk5904@psu.edu> Signed-off-by: Yuho Choi <dbgh9129@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
fbdev: offb: fix PCI device reference leak on probe failure
offb_init_nodriver() gets a referenced PCI device with pci_get_device().
If pci_enable_device() fails, the function returns without dropping that
reference.
Release the PCI device reference before returning from the
pci_enable_device() failure path.
Fixes: 5bda8f7b5468 ("video: fbdev: offb: Call pci_enable_device() before using the PCI VGA device") Co-developed-by: Myeonghun Pak <mhun512@gmail.com> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Co-developed-by: Taegyu Kim <tmk5904@psu.edu> Signed-off-by: Taegyu Kim <tmk5904@psu.edu> Signed-off-by: Yuho Choi <dbgh9129@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
in smb2_get_info_sec, a dummy security descriptor (SD) is returned if
the requested information is not supported.
the code is currently wrong, as DACL_PROTECTED is set in the type field,
but there is no DACL is present.
instead of faking a security, report a STATUS_NOT_SUPPORTED error.
this seems to fix a "Error 0x80090006: Invalid Signature" on file
transfers with Windows 11 clients (25H2, build 26200.8246).
capturing traffic shows that the client is sending a GET_INFO/SEC_INFO
request, with the additional_info field set to 0x20
(ATTRIBUTE_SECURITY_INFORMATION). Returning an empty SD
(with only SELF_RELATIVE set) does not fix the error.
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Hyunwoo Kim [Mon, 20 Apr 2026 15:31:47 +0000 (00:31 +0900)]
ksmbd: scope conn->binding slowpath to bound sessions only
When the binding SESSION_SETUP sets conn->binding = true, the flag stays
set after the call so that the global session lookup in
ksmbd_session_lookup_all() can find the session, which was not added to
conn->sessions. Because the flag is connection-wide, the global lookup
path will also resolve any other session by id if asked.
Tighten the global lookup so that the returned session must have this
connection registered in its channel xarray (sess->ksmbd_chann_list).
The channel entry is installed by the existing binding_session path in
ntlm_authenticate()/krb5_authenticate() when a SESSION_SETUP completes
successfully, so this condition is a strict equivalent of "this
connection has been accepted as a channel of this session". Connections
that have not bound to a given session cannot reach it via the global
table.
The existing conn->binding gate for entering the slowpath is preserved
so that non-binding connections keep the fast-path-only behavior, and
the session->state check is unchanged.
Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel") Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Mon, 20 Apr 2026 17:51:25 +0000 (02:51 +0900)]
ksmbd: fix CreateOptions sanitization clobbering the whole field
smb2_open() attempts to clear conflicting CreateOptions bits
(FILE_SEQUENTIAL_ONLY_LE together with FILE_RANDOM_ACCESS_LE, and
FILE_NO_COMPRESSION_LE on a directory open), but uses a plain
assignment of the bitwise negation of the target flag:
This replaces the entire field with 0xFFFFFFFB / 0xFFFFFFEF rather
than clearing a single bit. With the SEQUENTIAL/RANDOM case, the
next check for FILE_OPEN_BY_FILE_ID_LE | CREATE_TREE_CONNECTION |
FILE_RESERVE_OPFILTER_LE then trivially matches and a legitimate
request is rejected with -EOPNOTSUPP. With the NO_COMPRESSION case,
every downstream test (FILE_DELETE_ON_CLOSE, etc.) operates on a
corrupted CreateOptions value.
Use &= ~FLAG to clear only the intended bit in both places.
Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Mon, 20 Apr 2026 18:45:11 +0000 (03:45 +0900)]
ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open
ksmbd_lookup_fd_cguid() returns a ksmbd_file with its refcount
incremented via ksmbd_fp_get(). parse_durable_handle_context() in
the DURABLE_REQ_V2 case properly releases this reference on every
path inside the ClientGUID-match branch, either by calling
ksmbd_put_durable_fd() or by transferring ownership to dh_info->fp
for a successful reconnect. However, when an entry exists in the
global file table with the same CreateGuid but a different
ClientGUID, the code simply falls through to the new-open path
without dropping the reference obtained from ksmbd_lookup_fd_cguid().
Per MS-SMB2 section 3.3.5.9.10 ("Handling the
SMB2_CREATE_DURABLE_HANDLE_REQUEST_V2 Create Context"), the server
MUST locate an Open whose Open.CreateGuid matches the request's
CreateGuid AND whose Open.ClientGuid matches the ClientGuid of the
connection that received the request. If no such Open is found, the
server MUST continue with the normal open execution phase. A
CreateGuid hit with a ClientGUID mismatch is therefore the
"Open not found" case: proceeding with a new open is correct, but
the reference obtained purely as a side effect of the lookup must
not be leaked.
Repeated requests that hit this mismatch pin global_ft entries,
prevent __ksmbd_close_fd() from ever running for the corresponding
files, and defeat the durable scavenger, leading to long-lived
resource leaks.
Release the reference in the mismatch path and clear dh_info->fp so
subsequent logic does not mistake a non-matching lookup result for
a reconnect target.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Sun, 19 Apr 2026 11:02:55 +0000 (20:02 +0900)]
ksmbd: destroy async_ida in ksmbd_conn_free()
When per-connection async_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from the connection teardown path but no matching
ida_destroy() was added. The connection is therefore freed with the
IDA's backing xarray still intact.
The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
connection is freed.
No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.
Fixes: d40012a83f87 ("cifsd: declare ida statically") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Sun, 19 Apr 2026 11:02:54 +0000 (20:02 +0900)]
ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()
When per-session tree_conn_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from ksmbd_session_destroy() but no matching ida_destroy()
was added. The session is therefore freed with the IDA's backing
xarray still intact.
The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
enclosing session is freed.
Also move ida_init() to right after the session is allocated so that
it is always paired with the destroy call even on the early error
paths of __session_create() (ksmbd_init_file_table() or
__init_smb2_session() failures), both of which jump to the error
label and invoke ksmbd_session_destroy() on a partially initialised
session.
No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.
Fixes: d40012a83f87 ("cifsd: declare ida statically") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Eric Biggers [Sat, 18 Apr 2026 22:17:07 +0000 (15:17 -0700)]
ksmbd: Use AES-CMAC library for SMB3 signature calculation
Now that AES-CMAC has a library API, convert ksmbd_sign_smb3_pdu() to
use it instead of a "cmac(aes)" crypto_shash.
The result is simpler and faster code. With the library there's no need
to dynamically allocate memory, no need to handle errors, and the
AES-CMAC code is accessed directly without inefficient indirect calls
and other unnecessary API overhead.
Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
DaeMyung Kang [Sat, 18 Apr 2026 17:28:44 +0000 (02:28 +0900)]
ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()
rcount is intended to be connection-specific: 2 for curr_conn, 1 for
every other connection sharing the same session. However, it is
initialised only once before the hash iteration and is never reset.
After the loop visits curr_conn, later sibling connections are also
checked against rcount == 2, so a sibling with req_running == 1 is
incorrectly treated as idle. This makes the outcome depend on the
hash iteration order: whether a given sibling is checked against the
loose (< 2) or the strict (< 1) threshold is decided by whether it
happens to be visited before or after curr_conn.
The function's contract is "wait until every connection sharing this
session is idle" so that destroy_previous_session() can safely tear
the session down. The latched rcount violates that contract and
reopens the teardown race window the wait logic was meant to close:
destroy_previous_session() may proceed before sibling channels have
actually quiesced, overlapping session teardown with in-flight work
on those connections.
Recompute rcount inside the loop so each connection is compared
against its own threshold regardless of iteration order.
This is a code-inspection fix for an iteration-order-dependent logic
error; a targeted reproducer would require SMB3 multichannel with
in-flight work on a sibling channel landing after curr_conn in hash
order, which is not something that can be triggered reliably.
Fixes: 76e98a158b20 ("ksmbd: fix race condition between destroy_previous_session() and smb2 operations()") Cc: stable@vger.kernel.org Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
regulator: qcom: Unify user-visible "Qualcomm" name
Various names for Qualcomm as a company are used in user-visible config
options: QCOM, Qualcomm and Qualcomm Technologies. Switch to unified
"Qualcomm" so it will be easier for users to identify the options when
for example running menuconfig.
Sean Chang [Sun, 19 Apr 2026 16:31:38 +0000 (00:31 +0800)]
NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address
The cl_xprt pointer in struct rpc_clnt is marked as __rcu. Accessing
it directly in nfs_compare_super_address() is unsafe and triggers
Sparse warnings.
Fix this by using rcu_dereference() within an RCU read-side critical
section to retrieve the transport pointer. This addresses the sparse
warning and ensures atomic access to the pointer, as the transport
can be updated via transport switching even while the superblock
remains active under sb_lock.
Fixes: 7e3fcf61abde ("nfs: don't share mounts between network namespaces") Signed-off-by: Sean Chang <seanwascoding@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Sean Chang [Sun, 19 Apr 2026 16:31:37 +0000 (00:31 +0800)]
NFS: remove redundant __private attribute from nfs_page_class
The nfs_page_class tracepoint uses a pointer for the 'req' field marked
with the __private attribute. This causes Sparse to complain about
dereferencing a private pointer within the trace ring buffer context,
specifically during the TP_fast_assign() operation.
This fixes a Sparse warning introduced in commit b6ef079fd984 ("nfs:
more in-depth tracing of writepage events") by removing the redundant
__private attribute from the 'req' field.
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com> Signed-off-by: Sean Chang <seanwascoding@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes
xfstest generic/407 is failing in 2 ways. It detects that after
doing a clone the client does not update it's mtime and it's ctime.
CLONE always sends a GETATTR operation and then calls
nfs_post_op_update_inode() based on the returned attributes.
Because of the delegated attributes the client ignores updating
the mtime. Then also, when delegated attributes are present, for
the change_attr the server replies with the same values as what
the client cached before and thus the generic/407 would flag that.
Instead, make sure we invalidate the blocks attr.
By adding updating delegated attributes in nfs42_copy_dest_done()
both COPY and CLONE would update mtime appropriately.
Fixes: e12912d94137 ("NFSv4: Add support for delegated atime and mtime attributes") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
After running xfstest generic/751, in certain conditions, can have
a writeback IO stuck while experiencing one of the two patterns.
Pattern#1: writeback IO experiences ENOSPC on an offset smaller
than the filesize. Example,
write offset=0 len=4096 how=unstable OK
write offset=8192 len=4096 how=unstable OK
write offset=12288 len=4096 how=unstable ENOSPC
write offset=4096 len=4096 how=unstable ENOSPC
client sends a commit and receives a verifier which is different
from the last successful write. It marks pages dirty and writeback
retries. But it again send writes unstable and gets into the same
pattern, running into the ENOSPC error and sending a commit because
writes were sent at unstable.
Pattern#2: an unstable write followed by a short write and ENOSPC.
write offset=0 len=4096 how=unstable OK
write offset=4096 len=4096 how=unstable returns OK but count=100
write offset=4197 len=3996 how=stable returns ENOSPC
client send a commit and receives a verifier different from
the last unstable write. The same behaviour is retried in a loop.
Instead, this patch proposes to identify those conditions and mark
requests to be done synchronously instead. Previous solution tried
to mark it in the nfs_page, however that's not persistent thus
instead mark it in the nfs_open_context.
Furthermore, the same problem occurs during localio code path so
recognize that IO needs to be done sync in that case as well.
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Johan Hovold [Tue, 21 Apr 2026 12:56:32 +0000 (14:56 +0200)]
spi: imx: fix runtime pm leak on probe deferral
Make sure to balance the runtime PM usage count before returning on
probe failure (e.g. probe deferral) so that the controller can be
suspended when a driver is later bound.
Fixes: 43b6bf406cd0 ("spi: imx: fix runtime pm support for !CONFIG_PM") Cc: stable@vger.kernel.org # 5.10 Cc: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260421125632.1537235-1-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
ntfs: use page allocation for resident attribute inline data
The current kmemdup() based allocation for IOMAP_INLINE can result in
inline_data pointer having a non-zero page offset. This causes
iomap_inline_data_valid() to fail the check:
and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.
This particularly affects workloads with frequent small file access
(e.g. Firefox Nightly profile on NTFS with bind mount) when using the
new ntfs. This fix this by allocating a full page with alloc_page() so that
page_address() always returns a page-aligned address.
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
ntfs: fix mmap_prepare writable check for shared mappings
Linus pointed out that checking only VMA_WRITE_BIT is incorrect.
Private writable mappings (MAP_PRIVATE) set VM_WRITE but do not
write back to the filesystem. Also, mappings that can become
writable via mprotect() (VM_MAYWRITE) must be handled.
Use vma_desc_test_all(VMA_SHARED_BIT, VMA_MAYWRITE_BIT) instead,
which matches what other filesystems do.
Tiezhu Yang [Wed, 22 Apr 2026 07:45:34 +0000 (15:45 +0800)]
LoongArch: BPF: Support up to 12 function arguments for trampoline
Currently, LoongArch bpf trampoline supports up to 8 function arguments.
According to the statistics from commit 473e3150e30a ("bpf, x86: allow
function arguments up to 12 for TRACING"), there are over 200 functions
accept 9 to 12 arguments, so add 12 arguments support for trampoline.
With this patch, the following related testcases passed:
sudo ./test_progs -a tracing_struct/struct_many_args
sudo ./test_progs -a fentry_test/fentry_many_args
sudo ./test_progs -a fexit_test/fexit_many_args
Tiezhu Yang [Wed, 22 Apr 2026 07:45:34 +0000 (15:45 +0800)]
LoongArch: BPF: Support small struct arguments for trampoline
In the current BPF code, the struct argument size is at most 16 bytes,
enforced by the verifier. According to the Procedure Call Standard for
LoongArch, the struct argument size below 16 bytes are provided as part
of the 8 argument registers, that is to say, the struct argument may be
passed in a pair of registers if its size is more than 8 bytes and no
more than 16 bytes.
Extend the BPF trampoline JIT to support attachment to functions that
take small structures (up to 16 bytes) as argument, save and restore a
number of "argument registers" rather than a number of arguments.
With this patch, the following related testcases passed:
sudo ./test_progs -a tracing_struct/struct_args
sudo ./test_progs -a tracing_struct/union_args
Tiezhu Yang [Wed, 22 Apr 2026 07:45:34 +0000 (15:45 +0800)]
LoongArch: BPF: Open code and remove invoke_bpf_mod_ret()
invoke_bpf_mod_ret() is a small wrapper over invoke_bpf_prog(), it
should check the return value of invoke_bpf_prog() and then return
immediately if invoke_bpf_prog() failed, just open code and remove
it due to it is called only once.
Tiezhu Yang [Wed, 22 Apr 2026 07:45:34 +0000 (15:45 +0800)]
LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions
The 8 and 16 bit read-modify-write instructions {amadd/amswap}.{b/h}
were newly added in the latest LoongArch Reference Manual, use them to
avoid the error of unknown opcode if possible.
Tiezhu Yang [Wed, 22 Apr 2026 07:45:34 +0000 (15:45 +0800)]
LoongArch: BPF: Add the default case in emit_atomic() and rename it
Like the other archs such as x86 and riscv, add the default case
in emit_atomic() to print an error message for the invalid opcode
and return -EINVAL, then make its return type as int.
While at it, given that all of the instructions in emit_atomic()
are only read-modify-write instructions, rename emit_atomic() to
emit_atomic_rmw() to make it clear, because there will be a new
function emit_atomic_ld_st() for load-acquire and store-release
instructions in the later patch.
Tiezhu Yang [Wed, 22 Apr 2026 07:45:13 +0000 (15:45 +0800)]
LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR
The 8 and 16 bit read-modify-write atomic instructions amadd.{b/h} and
amswap.{b/h} were newly added in the latest LoongArch Reference Manual,
define the instruction format and check whether support via CPUCFG.
Furthermore, define the instruction format for DBAR which will be used
to support BPF load-acquire and store-release instructions.
LoongArch: Add spectre boundry for syscall dispatch table
The LoongArch syscall number is directly controlled by userspace, but
does not have a array_index_nospec() boundry to prevent access past the
syscall function pointer tables.
Most LoongArch processors are vulnerable to Spectre-V1 Proof-of-Concept
(PoC). And the generic mechanism, __user pointer sanitization, can be
used as a mitigation. This means to use array_index_nospec() to prevent
out of boundry access in syscall and other critical paths.
Implement the arch-specific cpu_show_spectre_v1() to show CPU Spectre-V1
vulnerabilites correctly.
LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist
After commit 7c405fb3279b3924 ("rcu: Use an intermediate irq_work to
start process_srcu()"), Loongson-2K0300/2K0500 fail to boot. Because
IRQ_WORK need IPI but Loongson-2K0300/2K0500 don't have IPI HW.
So make arch_irq_work_has_interrupt() return true only if IPI HW exist.
Luo Qiu [Wed, 22 Apr 2026 07:45:12 +0000 (15:45 +0800)]
LoongArch: Use get_random_canary() for stack canary init
Like others, replace the custom stack canary initialization with the
get_random_canary() helper, following the pattern established in commit 622754e84b10 ("stackprotector: actually use get_random_canary()").
Signed-off-by: Luo Qiu <luoqiu@kylinsec.com.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Yuqian Yang [Wed, 22 Apr 2026 07:45:11 +0000 (15:45 +0800)]
LoongArch: Improve the logging of disabling KASLR
Whether KASLR is disabled is not handled in nokaslr() which is the early
param "nokaslr" setup function, but in kaslr_disabled(). However, the
logging was previously done in nokaslr() and lack detail. So we move the
logging to the right place and add more specific infomation about why it
is disabled.
Lisa Robinson [Wed, 22 Apr 2026 07:45:11 +0000 (15:45 +0800)]
LoongArch: Align FPU register state to 32 bytes
Move fpr to the beginning of struct loongarch_fpu so it is naturally
aligned to FPU_ALIGN (32 bytes), improving 256-bit SIMD (LASX) context
switch performance.
Also adjust process.c and fpu.S to work well with the new loongarch_fpu
layout.
Signed-off-by: Lisa Robinson <lisa@bytefly.space> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support
Add HIGHMEM (High Memory) support for LoongArch, mostly needed by 32BIT
kernel because the size of kernel virtual memory space is only 512MB and
the size of usable physical memory is only 256MB in this case.
HIGHMEM adds permanent kernel mapping (PKMAP) and fixed kernel mapping
(FIX_KMAP), which increase usable physical memory up to 2.25GB (2304MB).
We can just use the generic copy_user_highpage(), so remove the custom
version.
After a kexec the logical processors and virtual processors already
exist in the hypervisor because they were created by the previous
kernel. Attempting to add them again causes either a BUG_ON or
corrupted VP state leading to MCEs in the new kernel.
Add hv_lp_exists() to probe whether an LP is already present by
calling HVCALL_GET_LOGICAL_PROCESSOR_RUN_TIME. When it succeeds the
LP exists and we skip the add-LP and create-VP loops entirely.
Also add hv_call_notify_all_processors_started() which informs the
hypervisor that all processors are online. This is required after
adding LPs (fresh boot) and is a no-op on kexec since we skip that
path.
Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Co-developed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
x86/hyperv: move stimer cleanup to hv_machine_shutdown()
Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to
hv_machine_shutdown() in the platform code. This ensures stimer cleanup
happens before the vmbus unload, which is required for root partition
kexec to work correctly.
Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
vmbus_alloc_synic_and_connect() declares a local 'int
hyperv_cpuhp_online' that shadows the file-scope global of the same
name. The cpuhp state returned by cpuhp_setup_state() is stored in
the local, leaving the global at 0 (CPUHP_OFFLINE). When
hv_kexec_handler() or hv_machine_shutdown() later call
cpuhp_remove_state(hyperv_cpuhp_online) they pass 0, which hits the
BUG_ON in __cpuhp_remove_state_cpuslocked().
Remove the local declaration so the cpuhp state is stored in the
file-scope global where hv_kexec_handler() and hv_machine_shutdown()
expect it.
Fixes: 2647c96649ba ("Drivers: hv: Support establishing the confidential VMBus connection") Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
Sangyun Kim [Sun, 19 Apr 2026 08:08:38 +0000 (17:08 +0900)]
pwm: atmel-tcb: Cache clock rates and mark chip as atomic
atmel_tcb_pwm_apply() holds tcbpwmc->lock as a spinlock via
guard(spinlock)() and then calls atmel_tcb_pwm_config(), which calls
clk_get_rate() twice. clk_get_rate() acquires clk_prepare_lock (a
mutex), so this is a sleep-in-atomic-context violation.
On CONFIG_DEBUG_ATOMIC_SLEEP kernels every pwm_apply_state() that
enables or reconfigures the PWM triggers a "BUG: sleeping function
called from invalid context" warning.
Acquire exclusive control over the clock rates with
clk_rate_exclusive_get() at probe time and cache the rates in struct
atmel_tcb_pwm_chip, then read the cached rates from
atmel_tcb_pwm_config(). This keeps the spinlock-based mutual exclusion
introduced in commit 37f7707077f5 ("pwm: atmel-tcb: Fix race condition
and convert to guards") and removes the sleeping calls from the atomic
section.
With no sleeping calls left in .apply() and the regmap-mmio bus already
running with fast_io=true, also mark the chip as atomic so consumers
can use pwm_apply_atomic() from atomic context.
Fixes: 37f7707077f5 ("pwm: atmel-tcb: Fix race condition and convert to guards") Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr> Link: https://patch.msgid.link/20260419080838.3192357-1-sangyun.kim@snu.ac.kr
[ukleinek: Ensure .clk is enabled before calling clk_get_rate on it.] Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
io_uring: take page references for NOMMU pbuf_ring mmaps
Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel
virtual address of the io_mapped_region's backing pages directly;
the user's VMA aliases the kernel allocation. io_uring_mmap() then
just returns 0 -- it takes no page references.
The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on
each inserted page. Those references are released when the VMA is torn
down (zap_pte_range -> put_page). io_free_region() -> release_pages()
drops the io_uring-side references, but the pages survive until munmap
drops the VMA-side references.
Under NOMMU there are no VMA-side references. io_unregister_pbuf_ring ->
io_put_bl -> io_free_region -> release_pages drops the only references
and the pages return to the buddy allocator while the user's VMA still
has vm_start pointing into them. The user can then write into whatever
the allocator hands out next.
Mirror the MMU lifetime: take get_page references in io_uring_mmap() and
release them via vm_ops->close. NOMMU's delete_vma() calls vma_close()
which runs ->close on munmap.
This also incidentally addresses the duplicate-vm_start case: two mmaps
of SQ_RING and CQ_RING resolve to the same ctx->ring_region pointer.
With page refs taken per mmap, the second mmap takes its own refs and
the pages survive until both mmaps are closed. The nommu rb-tree BUG_ON
on duplicate vm_start is a separate mm/nommu.c concern (it should share
the existing region rather than BUG), but the page lifetime is now
correct.
Cc: Jens Axboe <axboe@kernel.dk> Reported-by: Anthropic Assisted-by: gkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026042115-body-attention-d15b@gregkh
[axboe: get rid of region lookup, just iterate pages in vma] Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
"fprobe bug fixes:
- Prevent re-registration
Add an earlier check to reject re-registering an already active
fprobe before its state is modified during the initialization phase
- Robustness in failure paths:
- Ensure fprobes are correctly removed from all internal tables
and properly RCU-freed during registration failure
- Make unregister_fprobe() proceed with unregistration even if
temporary memory allocation fails
- RCU safety in module unloading
Avoid a potential "sleep in RCU" warning by removing a kcalloc()
call in the module notifier path. This also tries to remove
fprobe_hash_node even if memory allocation fails.
- Type-aware unregistration
Fix a bug where unregistering an fprobe did not account for
different types (entry-only vs entry-exit) at the same address,
which previously left "junk" entries in the underlying
ftrace/fgraph ops
- Unregistration of empty ftrace_ops
Avoid unneeded performance overhead due to making registered
ftrace_ops empty - which means 'trace all functions'. This counts
remaining entries and unregister ftrace_ops when it becomes empty.
Two new selftests to check above fixes:
- Module Unloading Test:
Specifically verifies that fprobe events on a module are correctly
cleaned up and do not trigger 'trace-all' behavior when the module
is removed.
- Multiple Fprobe Events Test:
Ensure that having multiple fprobes on the same function correctly
manages the ftrace hash map during removal"
* tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/ftrace: Add a testcase for multiple fprobe events
selftests/ftrace: Add a testcase for fprobe events on module
tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading
tracing/fprobe: Check the same type fprobe on table as the unregistered one
tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
tracing/fprobe: Remove fprobe from hash in failure path
tracing/fprobe: Unregister fprobe even if memory allocation fails
tracing/fprobe: Reject registration of a registered fprobe before init
fixed an issue where poll->events and req->apoll_events weren't
synchronized, but then when the commit referenced in Fixes got added,
it didn't ensure the same thing.
If we mask in EPOLLONESHOT in the regular EPOLL_URING_WAKE path, then
ensure it's done for both. Including a link to the original report
below, even though it's mostly nonsense. But it includes a reproducer
that does show that IORING_CQE_F_MORE is set in the previous CQE,
while no more CQEs will be generated for this request. Just ignore
anything that pretends this is security related in any way, it's just
the typical AI nonsense.
Merge tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm updates from Dave Airlie:
"This is a followup which is mostly next material with some fixes.
Alex pointed out I missed one of his AMD MRs from last week, so I
added that, then Jani sent the pipe reordering stuff, otherwise it's
just some minor i915 fixes and a dma-buf fix.
drm:
- Add support for AMD VSDB parsing to drm_edid
dma-buf:
- fix documentation formatting
i915:
- add support for reordered pipes to support joined pipes better
- Fix VESA backlight possible check condition
- Verify the correct plane DDB entry
amdgpu:
- Audio regression fix
- Use drm edid parser for AMD VSDB
- Misc cleanups
- VCE cs parse fixes
- VCN cs parse fixes
- RAS fixes
- Clean up and unify vram reservation handling
- GPU Partition updates
- system_wq cleanups
- Add CONFIG_GCOV_PROFILE_AMDGPU kconfig option
- SMU vram copy updates
- SMU 13/14/15 fixes
- UserQ fixes
- Replace pasid idr with an xarray
- Dither handling fix
- Enable amdgpu by default for CIK APUs
- Add IBs to devcoredump
amdkfd:
- system_wq cleanups
radeon:
- system_wq cleanups"
* tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel: (62 commits)
drm/i915/display: change pipe allocation order for discrete platforms
drm/i915/wm: Verify the correct plane DDB entry
drm/i915/backlight: Fix VESA backlight possible check condition
drm/i915: Walk crtcs in pipe order
drm/i915/joiner: Make joiner "nomodeset" state copy independent of pipe order
dma-buf: fix htmldocs error for dma_buf_attach_revocable
drm/amdgpu: dump job ibs in the devcoredump
drm/amdgpu: store ib info for devcoredump
drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault
drm/amdgpu: Use amdgpu by default for CIK APUs too
drm/amd/display: Remove unused NUM_ELEMENTS macros
drm/amd/display: Replace inline NUM_ELEMENTS macro with ARRAY_SIZE
drm/amdgpu: save ring content before resetting the device
drm/amdgpu: make userq fence_drv drop explicit in queue destroy
drm/amdgpu: rework userq fence driver alloc/destroy
drm/amdgpu/userq: use dma_fence_wait_timeout without test for signalled
drm/amdgpu/userq: call dma_resv_wait_timeout without test for signalled
drm/amdgpu/userq: add the return code too in error condition
drm/amdgpu/userq: fence wait for max time in amdgpu_userq_wait_for_signal
drm/amd/display: Change dither policy for 10 bpc output back to dithering
...
selftests/ftrace: Add a testcase for fprobe events on module
Add a testcase for fprobe events on module, which unloads a kernel
module on which fprobe events are probing and ensure the ftrace
hash map is cleared correctly.
tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading
Fix fprobe to unregister ftrace_ops if corresponding type of fprobe
does not exist on the fprobe_ip_table and it is expected to be empty
when unloading modules.
Since ftrace thinks that the empty hash means everything to be traced,
if we set fprobes only on the unloaded module, all functions are traced
unexpectedly after unloading module.
e.g.
Alex Markuze [Tue, 10 Feb 2026 09:06:26 +0000 (09:06 +0000)]
ceph: add subvolume metrics collection and reporting
Add complete infrastructure for per-subvolume I/O metrics collection
and reporting to the MDS. This enables administrators to monitor I/O
patterns at the subvolume granularity, which is useful for multi-tenant
CephFS deployments.
This patch adds:
- CEPHFS_FEATURE_SUBVOLUME_METRICS feature flag for MDS negotiation
- CEPH_SUBVOLUME_ID_NONE constant (0) for unknown/unset state
- Red-black tree based metrics tracker for efficient per-subvolume
aggregation with kmem_cache for entry allocations
- Wire format encoding matching the MDS C++ AggregatedIOMetrics struct
- Integration with the existing CLIENT_METRICS message
- Recording of I/O operations from file read/write and writeback paths
- Debugfs interfaces for monitoring (metrics/subvolumes, metrics/metric_features)
Metrics tracked per subvolume include:
- Read/write operation counts
- Read/write byte counts
- Read/write latency sums (for average calculation)
The metrics are periodically sent to the MDS as part of the existing
metrics reporting infrastructure when the MDS advertises support for
the SUBVOLUME_METRICS feature.
CEPH_SUBVOLUME_ID_NONE enforces subvolume_id immutability. Following
the FUSE client convention, 0 means unknown/unset. Once an inode has
a valid (non-zero) subvolume_id, it should not change during the
inode's lifetime.
Alex Markuze [Tue, 10 Feb 2026 09:06:25 +0000 (09:06 +0000)]
ceph: parse subvolume_id from InodeStat v9 and store in inode
Add support for parsing the subvolume_id field from InodeStat v9 and
storing it in the inode for later use by subvolume metrics tracking.
The subvolume_id identifies which CephFS subvolume an inode belongs to,
enabling per-subvolume I/O metrics collection and reporting.
This patch:
- Adds subvolume_id field to struct ceph_mds_reply_info_in
- Adds i_subvolume_id field to struct ceph_inode_info
- Parses subvolume_id from v9 InodeStat in parse_reply_info_in()
- Adds ceph_inode_set_subvolume() helper to propagate the ID to inodes
- Initializes i_subvolume_id in inode allocation and clears on destroy