git.ipfire.org Git - thirdparty/kernel/stable.git/log

apparmor: Remove redundant if check in sk_peer_get_label

Remove the redundant if check in sk_peer_get_label() and return
ERR_PTR(-ENOPROTOOPT) directly.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: John Johansen <john.johansen@canonical.com>

apparmor: Replace memcpy + NUL termination with kmemdup_nul in do_setattr

Use kmemdup_nul() to copy 'value' instead of using memcpy() followed by
a manual NUL termination. No functional changes.

Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: John Johansen <john.johansen@canonical.com>

Merge tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

- Fix cross-compilation for hv tools (Aditya Garg)

- Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain)

- Limit channel interrupt scan to relid high water mark (Michael
   Kelley)

- Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui)

- Fix cleanup and shutdown issues for MSHV (Jork Loeser)

- Introduce more tracing support for MSHV (Stanislav Kinsburskii)

* tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Skip LP/VP creation on kexec
  x86/hyperv: move stimer cleanup to hv_machine_shutdown()
  Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing
  mshv: Add tracepoint for GPA intercept handling
  mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
  tools: hv: Fix cross-compilation
  Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv
  mshv: Introduce tracing support
  Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark

ALSA: usb-audio: Fix Audio Advantage Micro II SPDIF switch

snd_microii_spdif_switch_put() returns 0 when the requested
vendor register value differs from the cached one.

This comparison was inverted by the resume-support conversion,
so real SPDIF switch toggles are ignored while no-op writes still
issue SET_CUR and report success.

Return early only when the requested value matches the cached one.

Fixes: 288673beae6c ("ALSA: usb-audio: Add resume support for MicroII SPDIF ctls")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260421-microii-spdif-switch-fix-v1-1-5c50dc28b88f@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: usb-audio: Avoid false E-MU sample-rate notifications

snd_emuusb_set_samplerate() unconditionally notifies the E-MU
SampleRate Extension Unit control after issuing SET_CUR.

If snd_usb_mixer_set_ctl_value() fails, the control value has not
changed, yet snd_usb_mixer_notify_id() still invalidates the cache and
emits a value-change event to userspace.

Notify the control only after a successful write.

Fixes: 7d2b451e65d2 ("ALSA: usb-audio - Added functionality for E-mu 0404USB/0202USB/TrackerPre")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260421-alsa-emuusb-samplerate-notify-v1-1-8b63bbc1d7f1@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

tools/power turbostat: v2026.04.21

Since v2026.02.14

Display HT siblings in cpu# order.
Add Module-ID column.
Print Core-ID and APIC-ID in hex.
Fix misc bugs.

Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Process HT siblings in CPU order

On large systems with HT sibling cpu#'s more than 32 apart,
HT siblings were processed and displayed in reverse order.

This was due to how set_thread_siblings() parsed the
sibling-bit-mask.

Update set_thread_siblings to instead parse the sibling-list,
like other cpu lists, and to thus order HT siblings
by ascending CPU number, no matter the size of the system.

Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Show module_id column

Get the "module_id" from the Linux topology "cluster_id".
If the there is more than one id, show it by default.

Module joins Die etc. in the "topology" group.

Display in hex, as it is usually based mask of the APIC-id

Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Print core_id and apic_id in hex

The core_id is based on a mask of the apic_id.
Print them both in hex, rather than decimal,
to make this relationship visibly clear.

Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Cleanup print helper functions

Make printer helper functions more readable by factoring
out a local 'sep' variable.

Remove the redundant parentheses around sprintf() calls.

Remove an unnecessary cast to "unsigned int" by using the '%08llx' instead
of '%08x'.

No functional changes.

[lenb: fix typos, simplify]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Fix --cpu-set 1 regression on HT systems

When the "--cpu-set" option limits turbostat to run on
a higher numbered HT sibling, it exits upon dividing by zero.

This is because the HT support handles higher numbered siblings
at the same time as lower numbered siblings. But when that lower
number sibling is dis-allowed, the higher numbered sibling is
never processed. The result is a time delta of 0, which results
in a divide by 0 for any of the "per-second" metrics.

Enhance the HT enumeration code to record all siblings (up to SMT4).
Consult this complete HT sibling list to determine when
to process an HT sibling, and when to skip it.

Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Fix --cpu-set 0 regression on HT systems

"turbostat --cpu-set 0" appears to hang if cpu0 has an HT sibling.

This is because the initialization code recognizes that it does not
have to open perf files for the HT sibling, but the HT support
in the collection code sees the HT sibling and tries to read
from an uninitialized file descriptor, 0 (standard input).

Access HT siblings only when they are in the allowed set.

Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Len Brown <len.brown@intel.com>
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

tools/power turbostat: Fix unrecognized option '-P'

The '-P' short option (shorthand for --no-perf) is not present in the
optstring of the second call to getopt_long_only(). This results in
the "unrecognized option" error when the tool reaches the main parsing
loop.

Add 'P' to the second getopt_long_only() call to ensure it is
consistently recognized.

Fixes: a0e86c90b83c ("tools/power turbostat: Add --no-perf option")
Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>

tracing: Make undefsyms_base.c a first-class citizen

Linus points out that dumping undefsyms_base.c form the Makefile
is rather ugly, and that a much better course of action would be
to have this file as a first-class citizen in the git tree.

This allows some extra cleanup in the Makefile, and the removal of
the .gitignore file in kernel/trace.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/CAHk-=wieqGd_XKpu8UxDoyADZx8TDe8CF3RmkUXt5N_9t5Pf_w@mail.gmail.com
Link: https://lore.kernel.org/all/20260421095446.2951646-1-maz@kernel.org/
Link: https://patch.msgid.link/20260421100455.324333-1-pbonzini@redhat.com
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

fbdev: hgafb: Request memory region before ioremap

The driver calls ioremap() on the HGA video memory at 0xb0000 without
first reserving the physical address range. This leaves the kernel
resource tree incomplete and can cause silent conflicts with other
drivers claiming the same range.

Add a devm_request_mem_region() call before ioremap() in
hga_card_detect() to reserve the memory region.

Signed-off-by: Hardik Phalet <hardik.phalet@pm.me>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>

ASoC: sdw_utils: cs42l43: allow spk component names to be combined

Move handling of cs42l43-spk component string into SOF mechanism [1]
which will allow it to be aggregated with other speakers.
Likewise handle the cs35l56-bridge special case which should not be
combined to keep compatibility with UCM.

Link: https://github.com/thesofproject/linux/pull/5445
Link: https://github.com/alsa-project/alsa-ucm-conf/pull/747
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Maciej Strozek <mstrozek@opensource.cirrus.com>
Suggested-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Aaron Ma <aaron.ma@canonical.com>
Link: https://patch.msgid.link/20260420114823.194226-1-mstrozek@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>

smb: client: Drop 'allocate_crypto' arg from smb*_calc_signature()

Since the crypto library API is now being used instead of crypto_shash,
all structs for MAC computation are now just fixed-size structs
allocated on the stack; no dynamic allocations are ever required.
Besides being much more efficient, this also means that the
'allocate_crypto' argument to smb2_calc_signature() and
smb3_calc_signature() is no longer used. Remove this unused argument.

Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: Make generate_key() return void

Since the crypto library API is now being used instead of crypto_shash,
generate_key() can no longer fail. Make it return void and simplify the
callers accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: Remove obsolete cmac(aes) allocation

Since the crypto library API is now being used instead of crypto_shash,
the "cmac(aes)" crypto_shash that is being allocated and stored in
'struct cifs_secmech' is no longer used.  Remove it.

That makes the kconfig selection of CRYPTO_CMAC and the module softdep
on "cmac" unnecessary.  So remove those too.

Finally, since this removes the last use of crypto_shash from the smb
client, also remove the remaining crypto_shash-related helper functions.

Note: cifs_unicode.c was relying on <linux/unaligned.h> being included
transitively via <crypto/internal/hash.h>.  Since the latter include is
removed, make cifs_unicode.c include <linux/unaligned.h> explicitly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: Use AES-CMAC library for SMB3 signature calculation

Convert smb3_calc_signature() to use the AES-CMAC library instead of a
"cmac(aes)" crypto_shash.

The result is simpler and faster code. With the library there's no need
to allocate memory, no need to handle errors except for key preparation,
and the AES-CMAC code is accessed directly without inefficient indirect
calls and other unnecessary API overhead.

For now a "cmac(aes)" crypto_shash is still being allocated in
'struct cifs_secmech'. Later commits will remove that, simplifying the
code even further.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: common: add SMB3_COMPRESS_MAX_ALGS

Set it to number of currently defined algorithms (6 as of now).

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: compress: add code docs to lz77.c

Document parts of the code, especially the apparently
non-sense parts.

Other:
- change pointer increment constants to sizeof() values

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: compress: LZ77 optimizations

This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).

Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
  function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
  defined hot path (match finding) and a cold path (encode literals +
  match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
  aggressively if too few matches are being found
- name some constants for more meaningful context

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: compress: increase LZ77_MATCH_MAX_DIST

Increase max distance (i.e. window size) from 1k to 8k.
This allows better compression and is just as fast.

Other:
- drop LZ77_MATCH_MIN_DIST as it's nused -- main loop
already checks if dist > 0

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: compress: fix counting in LZ77 match finding

- lz77_match_len() increments @cur before checking for equality,
  leading to off-by-one match len in some cases.

  Fix by moving pointers increment to inside the loop.
  Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
  "buf + step < end" when the correct is "<=" for such cases.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: compress: fix buffer overrun in lz77_compress()

@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.

However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly, and even by stopping the
main loop at 7/8 of @slen, writing leftover literals could write past
the end of @dst because of LZ77 metadata.

To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and worst
cast scenario (all literals).

While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: scope end_of_dacl to CIFS_DEBUG2 use in parse_dacl

After validate_dacl() was factored out in commit 149822e5541c, the
local end_of_dacl in parse_dacl() is only read by the dump_ace()
call under #ifdef CONFIG_CIFS_DEBUG2. With CIFS_DEBUG2 off the
variable is assigned but never used, which gcc -W=1 flags as
-Wunused-but-set-variable.

Remove the local and compute the end-of-dacl pointer inline at the
single call site inside the existing CIFS_DEBUG2 guard. No
functional change: when CIFS_DEBUG2 is enabled the argument value
is identical to what the removed local carried; when CIFS_DEBUG2
is disabled the code was already dead.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604220046.tGkRxVtS-lkp@intel.com/
Fixes: 149822e5541c ("smb: client: validate the whole DACL before rewriting it in cifsacl")
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: fix (remove) drop_dir_cache module parameter

Being a module parameter, it's possible to do:

  # modprobe cifs drop_dir_cache=1

Which will lead to a crash, because cifs_tcp_ses_list hasn't been
initialized yet:

  [  168.242624] BUG: kernel NULL pointer dereference, address: 0000000000000010
  [  168.242952] #PF: supervisor read access in kernel mode
  [  168.243175] #PF: error_code(0x0000) - not-present page
  [  168.243394] PGD 0 P4D 0
  [  168.243524] Oops: Oops: 0000 [#1] SMP NOPTI
  [  168.243703] CPU: 2 UID: 0 PID: 1105 Comm: modprobe Not tainted 7.0.0-lku #5 PREEMPT(lazy)
  [  168.244054] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014
  [  168.244557] RIP: 0010:cifs_param_set_drop_dir_cache+0x7c/0x100 [cifs]
  ...
  [  168.248785] Call Trace:
  [  168.248915]  <TASK>
  [  168.249023]  parse_args+0x285/0x3a0
  [  168.249204]  ? __pfx_unknown_module_param_cb+0x10/0x10
  [  168.249448]  load_module+0x192b/0x1bb0
  [  168.249637]  ? __pfx_unknown_module_param_cb+0x10/0x10
  [  168.249882]  ? kernel_read_file+0x27d/0x2b0
  [  168.250088]  init_module_from_file+0xce/0xf0
  [  168.250291]  idempotent_init_module+0xfb/0x2f0
  [  168.250496]  __x64_sys_finit_module+0x5a/0xa0
  [  168.250694]  do_syscall_64+0xe0/0x5a0
  [  168.250863]  ? exc_page_fault+0x65/0x160
  [  168.251050]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
  [  168.251284] RIP: 0033:0x7fcaa12b774d

Instead of fixing this with some kind of "is module initialized"
approach, this patch instead moves that functionality to procfs,
setting a write op for the existing open_dirs entry, where
writing a 0 to it will drop the cached directory entries.

Also make it available only when CONFIG_CIFS_DEBUG=y.

A small change needed now is to not call flush_delayed_work()
on invalidate_all_cached_dirs() when called from procfs (can't sleep in
that context).
So add a @sync arg to invalidate_all_cached_dirs() to control when to
flush the delayed works.

Fixes: dde6667fa3c8 ("smb: client: add drop_dir_cache module parameter to invalidate cached dirents")
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: require a full NFS mode SID before reading mode bits

parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS
mode SID and reads sid.sub_auth[2] to recover the mode bits.

That assumes the ACE carries three subauthorities, but compare_sids()
only compares min(a, b) subauthorities. A malicious server can return
an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still
matches sid_unix_NFS_mode and then drives the sub_auth[2] read four
bytes past the end of the ACE.

Require num_subauth >= 3 before treating the ACE as an NFS mode SID.
This keeps the fix local to the special-SID mode path without changing
compare_sids() semantics for the rest of cifsacl.

Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: validate the whole DACL before rewriting it in cifsacl

build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a
server-supplied dacloffset and then use the incoming ACL to rebuild the
chmod/chown security descriptor.

The original fix only checked that the struct smb_acl header fits before
reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate
header-field OOB read, but the rewrite helpers still walk ACEs based on
pdacl->num_aces with no structural validation of the incoming DACL body.

A malicious server can return a truncated DACL that still contains a
header, claims one or more ACEs, and then drive
replace_sids_and_copy_aces() or set_chmod_dacl() past the validated
extent while they compare or copy attacker-controlled ACEs.

Factor the DACL structural checks into validate_dacl(), extend them to
validate each ACE against the DACL bounds, and use the shared validator
before the chmod/chown rebuild paths. parse_dacl() reuses the same
validator so the read-side parser and write-side rewrite paths agree on
what constitutes a well-formed incoming DACL.

Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO path

smb2_ioctl_query_info() has two response-copy branches: PASSTHRU_FSCTL
and the default QUERY_INFO path. The QUERY_INFO branch clamps
qi.input_buffer_length to the server-reported OutputBufferLength and then
copies qi.input_buffer_length bytes from qi_rsp->Buffer to userspace, but
it never verifies that the flexible-array payload actually fits within
rsp_iov[1].iov_len.

A malicious server can return OutputBufferLength larger than the actual
QUERY_INFO response, causing copy_to_user() to walk past the response
buffer and expose adjacent kernel heap to userspace.

Guard the QUERY_INFO copy with a bounds check on the actual Buffer
payload. Use struct_size(qi_rsp, Buffer, qi.input_buffer_length)
rather than an open-coded addition so the guard cannot overflow on
32-bit builds.

Fixes: f5778c398713 ("SMB3: Allow SMB3 FSCTL queries to be sent to server from tools")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Steve French <stfrench@microsoft.com>

fbdev: clps711x-fb: Request memory region for MMIO

Use devm_platform_get_and_ioremap_resource() for resource 0 (the MMIO
control register range) instead of open-coding platform_get_resource()
and devm_ioremap() separately. The helper requests the memory region
before mapping it, which registers the range in /proc/iomem and prevents
another driver from mapping the same registers.

This makes resource 0 consistent with resource 1 (the framebuffer),
which already uses devm_platform_get_and_ioremap_resource().

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Amit Barzilai <amit.barzilai22@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>

fbdev: cobalt_lcdfb: Request memory region

Use devm_platform_get_and_ioremap_resource() instead of open-coding
platform_get_resource() and devm_ioremap() separately. The helper
requests the memory region before mapping it, which registers the range
in /proc/iomem and prevents another driver from mapping the same
registers.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Amit Barzilai <amit.barzilai22@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>

ASoC: qcom: x1e80100: limit speaker volumes

Limit the digital gain and PA volumes to a combined -3 dB in the machine
driver to reduce the risk of speaker damage until we have active speaker
protection in place (or higher safe levels have been established).

Based on commit c481016bb4f8 ("ASoC: qcom: sc8280xp: limit speaker
volumes") which addressed the same issue on the sc8280x SoC with some
minor changes as explained below.

The Digital Volume behaves almost identical to sc8280x since both use
the same lpass-wsa-macro, but x1e80100 has two sets of controls prefixed
with WSA and WSA2.
For PA x1e80100 machines use wsa884x amplifiers which expose a linear
scale from -9 dB to 9 dB with a 1.5 dB step size giving us
0 dB = -9 dB + 6 * 1.5 dB.

On x1e80100 there are two different speaker topologies we need to handle:
2-Speakers: SpkrLeft, Spkr Right
4-Speakers: WooferLeft, WooferRight, TweeterLeft, TweeterRight

Signed-off-by: Tobias Heider <tobias.heider@canonical.com>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260422-x1e80100-audio-limit-v2-1-333258b97697@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: fix resource leaks on device setup failure

Johan Hovold <johan@kernel.org> says:

Make sure to call controller cleanup() if spi_setup() fails while
registering a device to avoid leaking any resources allocated by
setup().

spi: fix controller cleanup() documentation

The controller cleanup() callback is no longer called when releasing a
device, but rather when deregistering it (and on registration failures).

Fixes: c7299fea6769 ("spi: Fix spi device unregister flow")
Cc: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260410154907.129248-3-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: fix resource leaks on device setup failure

Make sure to call controller cleanup() if spi_setup() fails while
registering a device to avoid leaking any resources allocated by
setup().

Fixes: c7299fea6769 ("spi: Fix spi device unregister flow")
Cc: stable@vger.kernel.org # 5.13
Cc: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260410154907.129248-2-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: axiado: spi: axiado: fix runtime pm imbalance on probe failure

Johan Hovold <johan@kernel.org> says:

The series fixes some runtime PM related issues in the axiado driver.

Included is also a couple of related cleanups.

spi: axiado: clean up probe return value

Drop the redundant initialisation and return explicit zero on successful
probe to make the code more readable.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421143925.1551781-4-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: axiado: rename probe error labels

Rename the probe error labels after what they do.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421143925.1551781-3-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: axiado: fix runtime pm imbalance on probe failure

Make sure that the controller is active before disabling clocks on late
probe failure and on driver unbind to avoid a clock disable imbalance.

Also make sure that the usage count is balanced on probe failure (e.g.
probe deferral) so that the controller can be suspended when a driver is
later bound.

Note that the runtime PM state can only be set when runtime PM is
disabled.

Fixes: e75a6b00ad79 ("spi: axiado: Add driver for Axiado SPI DB controller")
Cc: stable@vger.kernel.org # 7.0
Cc: Vladimir Moravcevic <vmoravcevic@axiado.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421143925.1551781-2-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

fbdev: atyfb: Fix spelling mistake "enfore" -> "enforce"

Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Helge Deller <deller@gmx.de>

fbdev: savage: fix probe-path EDID cleanup leaks

When CONFIG_FB_SAVAGE_I2C is enabled, savagefb_probe() can build both an
EDID-derived monspecs.modedb and a modelist from it before later failing.

The normal success path frees monspecs.modedb after the initial mode selection,
but the probe error path only deletes the I2C busses and misses the
EDID-derived allocations.

Free both the modelist and monspecs.modedb on the failed: unwind path.

Co-developed-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Co-developed-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>

fbdev: offb: fix PCI device reference leak on probe failure

offb_init_nodriver() gets a referenced PCI device with pci_get_device().
If pci_enable_device() fails, the function returns without dropping that
reference.

Release the PCI device reference before returning from the
pci_enable_device() failure path.

Fixes: 5bda8f7b5468 ("video: fbdev: offb: Call pci_enable_device() before using the PCI VGA device")
Co-developed-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Co-developed-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>

smb: server: stop sending fake security descriptors

in smb2_get_info_sec, a dummy security descriptor (SD) is returned if
the requested information is not supported.

the code is currently wrong, as DACL_PROTECTED is set in the type field,
but there is no DACL is present.

instead of faking a security, report a STATUS_NOT_SUPPORTED error.

this seems to fix a "Error 0x80090006: Invalid Signature" on file
transfers with Windows 11 clients (25H2, build 26200.8246).

capturing traffic shows that the client is sending a GET_INFO/SEC_INFO
request, with the additional_info field set to 0x20
(ATTRIBUTE_SECURITY_INFORMATION). Returning an empty SD
(with only SELF_RELATIVE set) does not fix the error.

Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: scope conn->binding slowpath to bound sessions only

When the binding SESSION_SETUP sets conn->binding = true, the flag stays
set after the call so that the global session lookup in
ksmbd_session_lookup_all() can find the session, which was not added to
conn->sessions. Because the flag is connection-wide, the global lookup
path will also resolve any other session by id if asked.

Tighten the global lookup so that the returned session must have this
connection registered in its channel xarray (sess->ksmbd_chann_list).
The channel entry is installed by the existing binding_session path in
ntlm_authenticate()/krb5_authenticate() when a SESSION_SETUP completes
successfully, so this condition is a strict equivalent of "this
connection has been accepted as a channel of this session". Connections
that have not bound to a given session cannot reach it via the global
table.

The existing conn->binding gate for entering the slowpath is preserved
so that non-binding connections keep the fast-path-only behavior, and
the session->state check is unchanged.

Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix CreateOptions sanitization clobbering the whole field

smb2_open() attempts to clear conflicting CreateOptions bits
(FILE_SEQUENTIAL_ONLY_LE together with FILE_RANDOM_ACCESS_LE, and
FILE_NO_COMPRESSION_LE on a directory open), but uses a plain
assignment of the bitwise negation of the target flag:

req->CreateOptions = ~(FILE_SEQUENTIAL_ONLY_LE);
req->CreateOptions = ~(FILE_NO_COMPRESSION_LE);

This replaces the entire field with 0xFFFFFFFB / 0xFFFFFFEF rather
than clearing a single bit. With the SEQUENTIAL/RANDOM case, the
next check for FILE_OPEN_BY_FILE_ID_LE | CREATE_TREE_CONNECTION |
FILE_RESERVE_OPFILTER_LE then trivially matches and a legitimate
request is rejected with -EOPNOTSUPP. With the NO_COMPRESSION case,
every downstream test (FILE_DELETE_ON_CLOSE, etc.) operates on a
corrupted CreateOptions value.

Use &= ~FLAG to clear only the intended bit in both places.

Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open

ksmbd_lookup_fd_cguid() returns a ksmbd_file with its refcount
incremented via ksmbd_fp_get(). parse_durable_handle_context() in
the DURABLE_REQ_V2 case properly releases this reference on every
path inside the ClientGUID-match branch, either by calling
ksmbd_put_durable_fd() or by transferring ownership to dh_info->fp
for a successful reconnect. However, when an entry exists in the
global file table with the same CreateGuid but a different
ClientGUID, the code simply falls through to the new-open path
without dropping the reference obtained from ksmbd_lookup_fd_cguid().

Per MS-SMB2 section 3.3.5.9.10 ("Handling the
SMB2_CREATE_DURABLE_HANDLE_REQUEST_V2 Create Context"), the server
MUST locate an Open whose Open.CreateGuid matches the request's
CreateGuid AND whose Open.ClientGuid matches the ClientGuid of the
connection that received the request. If no such Open is found, the
server MUST continue with the normal open execution phase. A
CreateGuid hit with a ClientGUID mismatch is therefore the
"Open not found" case: proceeding with a new open is correct, but
the reference obtained purely as a side effect of the lookup must
not be leaked.

Repeated requests that hit this mismatch pin global_ft entries,
prevent __ksmbd_close_fd() from ever running for the corresponding
files, and defeat the durable scavenger, leading to long-lived
resource leaks.

Release the reference in the mismatch path and clear dh_info->fp so
subsequent logic does not mistake a non-matching lookup result for
a reconnect target.

Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount

smb2_lock() performs O(N^2) conflict detection with no cap on LockCount.
Cap lock_count at 64 to prevent CPU exhaustion from a single request.

Signed-off-by: Akif Sait <akif.sait111@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: destroy async_ida in ksmbd_conn_free()

When per-connection async_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from the connection teardown path but no matching
ida_destroy() was added. The connection is therefore freed with the
IDA's backing xarray still intact.

The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
connection is freed.

No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.

Fixes: d40012a83f87 ("cifsd: declare ida statically")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()

When per-session tree_conn_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from ksmbd_session_destroy() but no matching ida_destroy()
was added. The session is therefore freed with the IDA's backing
xarray still intact.

The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
enclosing session is freed.

Also move ida_init() to right after the session is allocated so that
it is always paired with the destroy call even on the early error
paths of __session_create() (ksmbd_init_file_table() or
__init_smb2_session() failures), both of which jump to the error
label and invoke ksmbd_session_destroy() on a partially initialised
session.

No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.

Fixes: d40012a83f87 ("cifsd: declare ida statically")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: Use AES-CMAC library for SMB3 signature calculation

Now that AES-CMAC has a library API, convert ksmbd_sign_smb3_pdu() to
use it instead of a "cmac(aes)" crypto_shash.

The result is simpler and faster code. With the library there's no need
to dynamically allocate memory, no need to handle errors, and the
AES-CMAC code is accessed directly without inefficient indirect calls
and other unnecessary API overhead.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()

rcount is intended to be connection-specific: 2 for curr_conn, 1 for
every other connection sharing the same session.  However, it is
initialised only once before the hash iteration and is never reset.
After the loop visits curr_conn, later sibling connections are also
checked against rcount == 2, so a sibling with req_running == 1 is
incorrectly treated as idle.  This makes the outcome depend on the
hash iteration order: whether a given sibling is checked against the
loose (< 2) or the strict (< 1) threshold is decided by whether it
happens to be visited before or after curr_conn.

The function's contract is "wait until every connection sharing this
session is idle" so that destroy_previous_session() can safely tear
the session down.  The latched rcount violates that contract and
reopens the teardown race window the wait logic was meant to close:
destroy_previous_session() may proceed before sibling channels have
actually quiesced, overlapping session teardown with in-flight work
on those connections.

Recompute rcount inside the loop so each connection is compared
against its own threshold regardless of iteration order.

This is a code-inspection fix for an iteration-order-dependent logic
error; a targeted reproducer would require SMB3 multichannel with
in-flight work on a sibling channel landing after curr_conn in hash
order, which is not something that can be triggered reliably.

Fixes: 76e98a158b20 ("ksmbd: fix race condition between destroy_previous_session() and smb2 operations()")
Cc: stable@vger.kernel.org
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

regulator: qcom: Unify user-visible "Qualcomm" name

Various names for Qualcomm as a company are used in user-visible config
options: QCOM, Qualcomm and Qualcomm Technologies. Switch to unified
"Qualcomm" so it will be easier for users to identify the options when
for example running menuconfig.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260422083338.84343-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address

The cl_xprt pointer in struct rpc_clnt is marked as __rcu. Accessing
it directly in nfs_compare_super_address() is unsafe and triggers
Sparse warnings.

Fix this by using rcu_dereference() within an RCU read-side critical
section to retrieve the transport pointer. This addresses the sparse
warning and ensures atomic access to the pointer, as the transport
can be updated via transport switching even while the superblock
remains active under sb_lock.

Fixes: 7e3fcf61abde ("nfs: don't share mounts between network namespaces")
Signed-off-by: Sean Chang <seanwascoding@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: remove redundant __private attribute from nfs_page_class

The nfs_page_class tracepoint uses a pointer for the 'req' field marked
with the __private attribute. This causes Sparse to complain about
dereferencing a private pointer within the trace ring buffer context,
specifically during the TP_fast_assign() operation.

This fixes a Sparse warning introduced in commit b6ef079fd984 ("nfs:
more in-depth tracing of writepage events") by removing the redundant
__private attribute from the 'req' field.

Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Signed-off-by: Sean Chang <seanwascoding@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes

xfstest generic/407 is failing in 2 ways. It detects that after
doing a clone the client does not update it's mtime and it's ctime.
CLONE always sends a GETATTR operation and then calls
nfs_post_op_update_inode() based on the returned attributes.
Because of the delegated attributes the client ignores updating
the mtime. Then also, when delegated attributes are present, for
the change_attr the server replies with the same values as what
the client cached before and thus the generic/407 would flag that.
Instead, make sure we invalidate the blocks attr.

By adding updating delegated attributes in nfs42_copy_dest_done()
both COPY and CLONE would update mtime appropriately.

Fixes: e12912d94137 ("NFSv4: Add support for delegated atime and mtime attributes")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: fix writeback in presence of errors

After running xfstest generic/751, in certain conditions, can have
a writeback IO stuck while experiencing one of the two patterns.

Pattern#1: writeback IO experiences ENOSPC on an offset smaller
than the filesize. Example,
write offset=0 len=4096 how=unstable OK
write offset=8192 len=4096 how=unstable OK
write offset=12288 len=4096 how=unstable ENOSPC
write offset=4096 len=4096 how=unstable ENOSPC
client sends a commit and receives a verifier which is different
from the last successful write. It marks pages dirty and writeback
retries. But it again send writes unstable and gets into the same
pattern, running into the ENOSPC error and sending a commit because
writes were sent at unstable.

Pattern#2: an unstable write followed by a short write and ENOSPC.
write offset=0 len=4096 how=unstable OK
write offset=4096 len=4096 how=unstable returns OK but count=100
write offset=4197 len=3996 how=stable returns ENOSPC
client send a commit and receives a verifier different from
the last unstable write. The same behaviour is retried in a loop.

Instead, this patch proposes to identify those conditions and mark
requests to be done synchronously instead. Previous solution tried
to mark it in the nfs_page, however that's not persistent thus
instead mark it in the nfs_open_context.

Furthermore, the same problem occurs during localio code path so
recognize that IO needs to be done sync in that case as well.

Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

nfs: use memcpy_and_pad in decode_fh

Use memcpy_and_pad() instead of memcpy() followed by memset() to
simplify decode_fh().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

spi: orion: runtime PM fixes

Johan Hovold <johan@kernel.org> says:

This series fixes some runtime PM related issues in the orion driver.

Included is also a related clean up.

spi: orion: clean up probe return value

Drop the redundant initialisation and return explicit zero on successful
probe to make the code more readable.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421130211.1537628-4-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: orion: fix clock imbalance on registration failure

Make sure that the controller is not runtime suspended before disabling
clocks on probe failure.

Also restore the autosuspend setting.

Fixes: 5c6786945b4e ("spi: spi-orion: add runtime PM support")
Cc: stable@vger.kernel.org # 3.17
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421130211.1537628-3-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: orion: fix runtime pm leak on unbind

Make sure to balance the runtime PM usage count on driver unbind so that
the controller can be suspended when a driver is rebound.

Also restore the autosuspend setting.

This issue was flagged by Sashiko when reviewing a controller
deregistration fix.

Fixes: 5c6786945b4e ("spi: spi-orion: add runtime PM support")
Cc: stable@vger.kernel.org # 3.17
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Link: https://sashiko.dev/#/patchset/20260414134319.978196-1-johan%40kernel.org?part=6
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421130211.1537628-2-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: imx: fix runtime pm leak on probe deferral

Make sure to balance the runtime PM usage count before returning on
probe failure (e.g. probe deferral) so that the controller can be
suspended when a driver is later bound.

Fixes: 43b6bf406cd0 ("spi: imx: fix runtime pm support for !CONFIG_PM")
Cc: stable@vger.kernel.org # 5.10
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421125632.1537235-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: mpc52xx: fix use-after-free on registration failure

Make sure to disable and free the interrupts in case controller
registration fails to avoid a potential use-after-free and resource
leak.

This issue was flagged by Sashiko when reviewing a controller
deregistration fix.

Fixes: 42bbb70980f3 ("powerpc/5200: Add mpc5200-spi (non-PSC) device driver")
Cc: stable@vger.kernel.org # 2.6.33
Cc: Grant Likely <grant.likely@secretlab.ca>
Link: https://sashiko.dev/#/patchset/20260414134319.978196-1-johan%40kernel.org?part=3
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260421125800.1537361-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ntfs: use page allocation for resident attribute inline data

The current kmemdup() based allocation for IOMAP_INLINE can result in
inline_data pointer having a non-zero page offset. This causes
iomap_inline_data_valid() to fail the check:

iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data)

and triggers the kernel BUG at fs/iomap/buffered-io.c:1061.

This particularly affects workloads with frequent small file access
(e.g. Firefox Nightly profile on NTFS with bind mount) when using the
new ntfs. This fix this by allocating a full page with alloc_page() so that
page_address() always returns a page-aligned address.

Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>

ntfs: fix mmap_prepare writable check for shared mappings

Linus pointed out that checking only VMA_WRITE_BIT is incorrect.
Private writable mappings (MAP_PRIVATE) set VM_WRITE but do not
write back to the filesystem. Also, mappings that can become
writable via mprotect() (VM_MAYWRITE) must be handled.

Use vma_desc_test_all(VMA_SHARED_BIT, VMA_MAYWRITE_BIT) instead,
which matches what other filesystems do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>

LoongArch: BPF: Support up to 12 function arguments for trampoline

Currently, LoongArch bpf trampoline supports up to 8 function arguments.
According to the statistics from commit 473e3150e30a ("bpf, x86: allow
function arguments up to 12 for TRACING"), there are over 200 functions
accept 9 to 12 arguments, so add 12 arguments support for trampoline.

With this patch, the following related testcases passed:

  sudo ./test_progs -a tracing_struct/struct_many_args
  sudo ./test_progs -a fentry_test/fentry_many_args
  sudo ./test_progs -a fexit_test/fexit_many_args

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Support small struct arguments for trampoline

In the current BPF code, the struct argument size is at most 16 bytes,
enforced by the verifier. According to the Procedure Call Standard for
LoongArch, the struct argument size below 16 bytes are provided as part
of the 8 argument registers, that is to say, the struct argument may be
passed in a pair of registers if its size is more than 8 bytes and no
more than 16 bytes.

Extend the BPF trampoline JIT to support attachment to functions that
take small structures (up to 16 bytes) as argument, save and restore a
number of "argument registers" rather than a number of arguments.

With this patch, the following related testcases passed:

sudo ./test_progs -a tracing_struct/struct_args
sudo ./test_progs -a tracing_struct/union_args

Link: https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#structures
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Open code and remove invoke_bpf_mod_ret()

invoke_bpf_mod_ret() is a small wrapper over invoke_bpf_prog(), it
should check the return value of invoke_bpf_prog() and then return
immediately if invoke_bpf_prog() failed, just open code and remove
it due to it is called only once.

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Support load-acquire and store-release instructions

Use the LoongArch common memory access instructions with the barrier
'dbar' to support the BPF load-acquire and store-release instructions.

With this patch, the following testcases passed on LoongArch if the
macro CAN_USE_LOAD_ACQ_STORE_REL is usable in bpf selftests:

  sudo ./test_progs -t verifier_load_acquire
  sudo ./test_progs -t verifier_store_release
  sudo ./test_progs -t verifier_precision/bpf_load_acquire
  sudo ./test_progs -t verifier_precision/bpf_store_release
  sudo ./test_progs -t compute_live_registers/atomic_load_acq_store_rel

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions

The 8 and 16 bit read-modify-write instructions {amadd/amswap}.{b/h}
were newly added in the latest LoongArch Reference Manual, use them to
avoid the error of unknown opcode if possible.

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Add the default case in emit_atomic() and rename it

Like the other archs such as x86 and riscv, add the default case
in emit_atomic() to print an error message for the invalid opcode
and return -EINVAL, then make its return type as int.

While at it, given that all of the instructions in emit_atomic()
are only read-modify-write instructions, rename emit_atomic() to
emit_atomic_rmw() to make it clear, because there will be a new
function emit_atomic_ld_st() for load-acquire and store-release
instructions in the later patch.

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR

The 8 and 16 bit read-modify-write atomic instructions amadd.{b/h} and
amswap.{b/h} were newly added in the latest LoongArch Reference Manual,
define the instruction format and check whether support via CPUCFG.

Furthermore, define the instruction format for DBAR which will be used
to support BPF load-acquire and store-release instructions.

This is preparation for later patches.

Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Batch the icache maintenance for jump_label

Switch to the batched version of the jump label update functions so
instruction cache maintenance is deferred until the end of the update.

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Add flush_icache_all()/local_flush_icache_all()

LoongArch maintains ICache/DCache coherency by hardware, so we just need
"ibar 0" to avoid instruction hazard here.

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Add spectre boundry for syscall dispatch table

The LoongArch syscall number is directly controlled by userspace, but
does not have a array_index_nospec() boundry to prevent access past the
syscall function pointer tables.

Cc: stable@vger.kernel.org
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Show CPU vulnerabilites correctly

Most LoongArch processors are vulnerable to Spectre-V1 Proof-of-Concept
(PoC). And the generic mechanism, __user pointer sanitization, can be
used as a mitigation. This means to use array_index_nospec() to prevent
out of boundry access in syscall and other critical paths.

Implement the arch-specific cpu_show_spectre_v1() to show CPU Spectre-V1
vulnerabilites correctly.

Cc: stable@vger.kernel.org
Link: https://cc-sw.com/chinese-loongarch-architecture-evaluation-part-3-of-3/
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist

After commit 7c405fb3279b3924 ("rcu: Use an intermediate irq_work to
start process_srcu()"), Loongson-2K0300/2K0500 fail to boot. Because
IRQ_WORK need IPI but Loongson-2K0300/2K0500 don't have IPI HW.

So make arch_irq_work_has_interrupt() return true only if IPI HW exist.

Cc: stable@vger.kernel.org
Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Use get_random_canary() for stack canary init

Like others, replace the custom stack canary initialization with the
get_random_canary() helper, following the pattern established in commit
622754e84b10 ("stackprotector: actually use get_random_canary()").

Signed-off-by: Luo Qiu <luoqiu@kylinsec.com.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Improve the logging of disabling KASLR

Whether KASLR is disabled is not handled in nokaslr() which is the early
param "nokaslr" setup function, but in kaslr_disabled(). However, the
logging was previously done in nokaslr() and lack detail. So we move the
logging to the right place and add more specific infomation about why it
is disabled.

Suggested-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuqian Yang <yangyuqian@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Align FPU register state to 32 bytes

Move fpr to the beginning of struct loongarch_fpu so it is naturally
aligned to FPU_ALIGN (32 bytes), improving 256-bit SIMD (LASX) context
switch performance.

Also adjust process.c and fpu.S to work well with the new loongarch_fpu
layout.

Signed-off-by: Lisa Robinson <lisa@bytefly.space>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Handle CONFIG_32BIT in syscall_get_arch()

If CONFIG_32BIT is set, it should return AUDIT_ARCH_LOONGARCH32 instead
of AUDIT_ARCH_LOONGARCH64 in syscall_get_arch().

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support

Add HIGHMEM (High Memory) support for LoongArch, mostly needed by 32BIT
kernel because the size of kernel virtual memory space is only 512MB and
the size of usable physical memory is only 256MB in this case.

HIGHMEM adds permanent kernel mapping (PKMAP) and fixed kernel mapping
(FIX_KMAP), which increase usable physical memory up to 2.25GB (2304MB).

We can just use the generic copy_user_highpage(), so remove the custom
version.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: Adjust build infrastructure for 32BIT/64BIT

Adjust build infrastructure (Kconfig, Makefile and ld scripts) to let
us enable both 32BIT/64BIT kernel build.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

x86/hyperv: Skip LP/VP creation on kexec

After a kexec the logical processors and virtual processors already
exist in the hypervisor because they were created by the previous
kernel. Attempting to add them again causes either a BUG_ON or
corrupted VP state leading to MCEs in the new kernel.

Add hv_lp_exists() to probe whether an LP is already present by
calling HVCALL_GET_LOGICAL_PROCESSOR_RUN_TIME. When it succeeds the
LP exists and we skip the add-LP and create-VP loops entirely.

Also add hv_call_notify_all_processors_started() which informs the
hypervisor that all processors are online. This is required after
adding LPs (fresh boot) and is a no-op on kexec since we skip that
path.

Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Co-developed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/hyperv: move stimer cleanup to hv_machine_shutdown()

Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to
hv_machine_shutdown() in the platform code. This ensures stimer cleanup
happens before the vmbus unload, which is required for root partition
kexec to work correctly.

Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing

vmbus_alloc_synic_and_connect() declares a local 'int
hyperv_cpuhp_online' that shadows the file-scope global of the same
name. The cpuhp state returned by cpuhp_setup_state() is stored in
the local, leaving the global at 0 (CPUHP_OFFLINE). When
hv_kexec_handler() or hv_machine_shutdown() later call
cpuhp_remove_state(hyperv_cpuhp_online) they pass 0, which hits the
BUG_ON in __cpuhp_remove_state_cpuslocked().

Remove the local declaration so the cpuhp state is stored in the
file-scope global where hv_kexec_handler() and hv_machine_shutdown()
expect it.

Fixes: 2647c96649ba ("Drivers: hv: Support establishing the confidential VMBus connection")
Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

mshv: Add tracepoint for GPA intercept handling

Provide visibility into GPA intercept operations for debugging and
performance analysis of Microsoft Hypervisor guest memory management.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

pwm: atmel-tcb: Cache clock rates and mark chip as atomic

atmel_tcb_pwm_apply() holds tcbpwmc->lock as a spinlock via
guard(spinlock)() and then calls atmel_tcb_pwm_config(), which calls
clk_get_rate() twice. clk_get_rate() acquires clk_prepare_lock (a
mutex), so this is a sleep-in-atomic-context violation.

On CONFIG_DEBUG_ATOMIC_SLEEP kernels every pwm_apply_state() that
enables or reconfigures the PWM triggers a "BUG: sleeping function
called from invalid context" warning.

Acquire exclusive control over the clock rates with
clk_rate_exclusive_get() at probe time and cache the rates in struct
atmel_tcb_pwm_chip, then read the cached rates from
atmel_tcb_pwm_config(). This keeps the spinlock-based mutual exclusion
introduced in commit 37f7707077f5 ("pwm: atmel-tcb: Fix race condition
and convert to guards") and removes the sleeping calls from the atomic
section.

With no sleeping calls left in .apply() and the regmap-mmio bus already
running with fast_io=true, also mark the chip as atomic so consumers
can use pwm_apply_atomic() from atomic context.

Fixes: 37f7707077f5 ("pwm: atmel-tcb: Fix race condition and convert to guards")
Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr>
Link: https://patch.msgid.link/20260419080838.3192357-1-sangyun.kim@snu.ac.kr
[ukleinek: Ensure .clk is enabled before calling clk_get_rate on it.]
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

io_uring: take page references for NOMMU pbuf_ring mmaps

Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel
virtual address of the io_mapped_region's backing pages directly;
the user's VMA aliases the kernel allocation. io_uring_mmap() then
just returns 0 -- it takes no page references.

The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on
each inserted page.  Those references are released when the VMA is torn
down (zap_pte_range -> put_page). io_free_region() -> release_pages()
drops the io_uring-side references, but the pages survive until munmap
drops the VMA-side references.

Under NOMMU there are no VMA-side references. io_unregister_pbuf_ring ->
io_put_bl -> io_free_region -> release_pages drops the only references
and the pages return to the buddy allocator while the user's VMA still
has vm_start pointing into them.  The user can then write into whatever
the allocator hands out next.

Mirror the MMU lifetime: take get_page references in io_uring_mmap() and
release them via vm_ops->close.  NOMMU's delete_vma() calls vma_close()
which runs ->close on munmap.

This also incidentally addresses the duplicate-vm_start case: two mmaps
of SQ_RING and CQ_RING resolve to the same ctx->ring_region pointer.
With page refs taken per mmap, the second mmap takes its own refs and
the pages survive until both mmaps are closed.  The nommu rb-tree BUG_ON
on duplicate vm_start is a separate mm/nommu.c concern (it should share
the existing region rather than BUG), but the page lifetime is now
correct.

Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: Anthropic
Assisted-by: gkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026042115-body-attention-d15b@gregkh
[axboe: get rid of region lookup, just iterate pages in vma]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:
"fprobe bug fixes:

   - Prevent re-registration

     Add an earlier check to reject re-registering an already active
     fprobe before its state is modified during the initialization phase

   - Robustness in failure paths:
      - Ensure fprobes are correctly removed from all internal tables
        and properly RCU-freed during registration failure
      - Make unregister_fprobe() proceed with unregistration even if
        temporary memory allocation fails

   - RCU safety in module unloading

     Avoid a potential "sleep in RCU" warning by removing a kcalloc()
     call in the module notifier path. This also tries to remove
     fprobe_hash_node even if memory allocation fails.

   - Type-aware unregistration

     Fix a bug where unregistering an fprobe did not account for
     different types (entry-only vs entry-exit) at the same address,
     which previously left "junk" entries in the underlying
     ftrace/fgraph ops

   - Unregistration of empty ftrace_ops

     Avoid unneeded performance overhead due to making registered
     ftrace_ops empty - which means 'trace all functions'. This counts
     remaining entries and unregister ftrace_ops when it becomes empty.

  Two new selftests to check above fixes:

   - Module Unloading Test:

     Specifically verifies that fprobe events on a module are correctly
     cleaned up and do not trigger 'trace-all' behavior when the module
     is removed.

   - Multiple Fprobe Events Test:

     Ensure that having multiple fprobes on the same function correctly
     manages the ftrace hash map during removal"

* tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  selftests/ftrace: Add a testcase for multiple fprobe events
  selftests/ftrace: Add a testcase for fprobe events on module
  tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading
  tracing/fprobe: Check the same type fprobe on table as the unregistered one
  tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
  tracing/fprobe: Remove fprobe from hash in failure path
  tracing/fprobe: Unregister fprobe even if memory allocation fails
  tracing/fprobe: Reject registration of a registered fprobe before init

io_uring/poll: ensure EPOLL_ONESHOT is propagated for EPOLL_URING_WAKE

Commit:

aacf2f9f382c ("io_uring: fix req->apoll_events")

fixed an issue where poll->events and req->apoll_events weren't
synchronized, but then when the commit referenced in Fixes got added,
it didn't ensure the same thing.

If we mask in EPOLLONESHOT in the regular EPOLL_URING_WAKE path, then
ensure it's done for both. Including a link to the original report
below, even though it's mostly nonsense. But it includes a reproducer
that does show that IORING_CQE_F_MORE is set in the previous CQE,
while no more CQEs will be generated for this request. Just ignore
anything that pretends this is security related in any way, it's just
the typical AI nonsense.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/io-uring/CAM0zi7yQzF3eKncgHo4iVM5yFLAjsiob_ucqyWKs=hyd_GqiMg@mail.gmail.com/
Reported-by: Azizcan Daştan <azizcan.d@mileniumsec.com>
Fixes: 4464853277d0 ("io_uring: pass in EPOLL_URING_WAKE for eventfd signaling and wakeups")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'amd-drm-next-7.1-2026-04-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-7.1-2026-04-17:

amdgpu:
- SMU 14 fixes
- Partition fixes
- SMUIO 15.x fix
- SR-IOV fixes
- JPEG fix
- PSP 15.x fix
- NBIF fix
- Devcoredump fixes
- DPC fix
- RAS fixes
- Aldebaran smu fix
- IP discovery fix
- SDMA 7.1 fix
- Runtime pm fix
- MES 12.1 fix
- DML2 fixes
- DCN 4.2 fixes
- YCbCr fixes
- Freesync fixes
- ISM fixes
- Overlay cursor fix
- DC FP fixes
- UserQ locking fixes

amdkfd:
- Fix memory clear handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260417225351.8714-1-alexander.deucher@amd.com

Merge tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel

Pull more drm updates from Dave Airlie:
"This is a followup which is mostly next material with some fixes.

  Alex pointed out I missed one of his AMD MRs from last week, so I
  added that, then Jani sent the pipe reordering stuff, otherwise it's
  just some minor i915 fixes and a dma-buf fix.

  drm:
   - Add support for AMD VSDB parsing to drm_edid

  dma-buf:
   - fix documentation formatting

  i915:
   - add support for reordered pipes to support joined pipes better
   - Fix VESA backlight possible check condition
   - Verify the correct plane DDB entry

  amdgpu:
   - Audio regression fix
   - Use drm edid parser for AMD VSDB
   - Misc cleanups
   - VCE cs parse fixes
   - VCN cs parse fixes
   - RAS fixes
   - Clean up and unify vram reservation handling
   - GPU Partition updates
   - system_wq cleanups
   - Add CONFIG_GCOV_PROFILE_AMDGPU kconfig option
   - SMU vram copy updates
   - SMU 13/14/15 fixes
   - UserQ fixes
   - Replace pasid idr with an xarray
   - Dither handling fix
   - Enable amdgpu by default for CIK APUs
   - Add IBs to devcoredump

  amdkfd:
   - system_wq cleanups

  radeon:
   - system_wq cleanups"

* tag 'drm-next-2026-04-22' of https://gitlab.freedesktop.org/drm/kernel: (62 commits)
  drm/i915/display: change pipe allocation order for discrete platforms
  drm/i915/wm: Verify the correct plane DDB entry
  drm/i915/backlight: Fix VESA backlight possible check condition
  drm/i915: Walk crtcs in pipe order
  drm/i915/joiner: Make joiner "nomodeset" state copy independent of pipe order
  dma-buf: fix htmldocs error for dma_buf_attach_revocable
  drm/amdgpu: dump job ibs in the devcoredump
  drm/amdgpu: store ib info for devcoredump
  drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault
  drm/amdgpu: Use amdgpu by default for CIK APUs too
  drm/amd/display: Remove unused NUM_ELEMENTS macros
  drm/amd/display: Replace inline NUM_ELEMENTS macro with ARRAY_SIZE
  drm/amdgpu: save ring content before resetting the device
  drm/amdgpu: make userq fence_drv drop explicit in queue destroy
  drm/amdgpu: rework userq fence driver alloc/destroy
  drm/amdgpu/userq: use dma_fence_wait_timeout without test for signalled
  drm/amdgpu/userq: call dma_resv_wait_timeout without test for signalled
  drm/amdgpu/userq: add the return code too in error condition
  drm/amdgpu/userq: fence wait for max time in amdgpu_userq_wait_for_signal
  drm/amd/display: Change dither policy for 10 bpc output back to dithering
  ...

selftests/ftrace: Add a testcase for multiple fprobe events

Add a testcase for multiple fprobe events on the same function
so that it clears ftrace hash map correctly when removing the
events.

Link: https://lore.kernel.org/all/177669370353.132053.16801520791509406141.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

selftests/ftrace: Add a testcase for fprobe events on module

Add a testcase for fprobe events on module, which unloads a kernel
module on which fprobe events are probing and ensure the ftrace
hash map is cleared correctly.

Link: https://lore.kernel.org/all/177669369564.132053.623527664540176496.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading

Fix fprobe to unregister ftrace_ops if corresponding type of fprobe
does not exist on the fprobe_ip_table and it is expected to be empty
when unloading modules.

Since ftrace thinks that the empty hash means everything to be traced,
if we set fprobes only on the unloaded module, all functions are traced
unexpectedly after unloading module.
e.g.

# modprobe xt_LOG.ko
# echo 'f:test log_tg*' > dynamic_events
# echo 1 > events/fprobes/test/enable
# cat enabled_functions
log_tg [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
log_tg_check [xt_LOG] (1)               tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
log_tg_destroy [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
# rmmod xt_LOG
# wc -l enabled_functions
34085 enabled_functions

Link: https://lore.kernel.org/all/177669368776.132053.10042301916765771279.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

ceph: add subvolume metrics collection and reporting

Add complete infrastructure for per-subvolume I/O metrics collection
and reporting to the MDS. This enables administrators to monitor I/O
patterns at the subvolume granularity, which is useful for multi-tenant
CephFS deployments.

This patch adds:
- CEPHFS_FEATURE_SUBVOLUME_METRICS feature flag for MDS negotiation
- CEPH_SUBVOLUME_ID_NONE constant (0) for unknown/unset state
- Red-black tree based metrics tracker for efficient per-subvolume
aggregation with kmem_cache for entry allocations
- Wire format encoding matching the MDS C++ AggregatedIOMetrics struct
- Integration with the existing CLIENT_METRICS message
- Recording of I/O operations from file read/write and writeback paths
- Debugfs interfaces for monitoring (metrics/subvolumes, metrics/metric_features)

Metrics tracked per subvolume include:
- Read/write operation counts
- Read/write byte counts
- Read/write latency sums (for average calculation)

The metrics are periodically sent to the MDS as part of the existing
metrics reporting infrastructure when the MDS advertises support for
the SUBVOLUME_METRICS feature.

CEPH_SUBVOLUME_ID_NONE enforces subvolume_id immutability. Following
the FUSE client convention, 0 means unknown/unset. Once an inode has
a valid (non-zero) subvolume_id, it should not change during the
inode's lifetime.

Signed-off-by: Alex Markuze <amarkuze@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: parse subvolume_id from InodeStat v9 and store in inode

Add support for parsing the subvolume_id field from InodeStat v9 and
storing it in the inode for later use by subvolume metrics tracking.

The subvolume_id identifies which CephFS subvolume an inode belongs to,
enabling per-subvolume I/O metrics collection and reporting.

This patch:
- Adds subvolume_id field to struct ceph_mds_reply_info_in
- Adds i_subvolume_id field to struct ceph_inode_info
- Parses subvolume_id from v9 InodeStat in parse_reply_info_in()
- Adds ceph_inode_set_subvolume() helper to propagate the ID to inodes
- Initializes i_subvolume_id in inode allocation and clears on destroy

Signed-off-by: Alex Markuze <amarkuze@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>