git.ipfire.org Git - thirdparty/linux.git/log

tracing: Remove the backup instance automatically after read

Since the backup instance is readonly, after reading all data via pipe, no
data is left on the instance. Thus it can be removed safely after closing
all files. This also removes it if user resets the ring buffer manually
via 'trace' file.

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/177502547711.1311542.12572973358010839400.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracing: Make the backup instance non-reusable

Since there is no reason to reuse the backup instance, make it readonly
(but erasable). Note that only backup instances are readonly, because
other trace instances will be empty unless it is writable. Only backup
instances have copy entries from the original.

With this change, most of the trace control files are removed from the
backup instance, including eventfs enable/filter etc.

# find /sys/kernel/tracing/instances/backup/events/ | wc -l
4093
# find /sys/kernel/tracing/instances/boot_map/events/ | wc -l
9573

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/177502546939.1311542.1826814401724828930.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

ring-buffer: Enforce read ordering of trace_buffer cpumask and buffers

On CPU hotplug, if it is the first time a trace_buffer sees a CPU, a
ring_buffer_per_cpu will be allocated and its corresponding bit toggled
in the cpumask. Many readers check this cpumask to know if they can
safely read the ring_buffer_per_cpu but they are doing so without memory
ordering and may observe the cpumask bit set while having NULL buffer
pointer.

Enforce the memory read ordering by sending an IPI to all online CPUs.
The hotplug path is a slow-path anyway and it saves us from adding read
barriers in numerous call sites.

Link: https://patch.msgid.link/20260401053659.3458961-1-vdonnefort@google.com
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

selftests/bpf: Add more precision tracking tests for atomics

Add verifier precision tracking tests for BPF atomic fetch operations.
Validate that backtrack_insn correctly propagates precision from the
fetch dst_reg to the stack slot for {fetch_add,xchg,cmpxchg} atomics.
For the first two src_reg gets the old memory value, and for the last
one r0. The fetched register is used for pointer arithmetic to trigger
backtracking. Also add coverage for fetch_{or,and,xor} flavors which
exercises the bitwise atomic fetch variants going through the same
insn->imm & BPF_FETCH check but with different imm values.

Add dual-precision regression tests for fetch_add and cmpxchg where
both the fetched value and a reread of the same stack slot are tracked
for precision. After the atomic operation, the stack slot is STACK_MISC,
so the ldx does not set INSN_F_STACK_ACCESS. These tests verify that
stack precision propagates solely through the atomic fetch's load side.

Add map-based tests for fetch_add and cmpxchg which validate that non-
stack atomic fetch completes precision tracking without falling back
to mark_all_scalars_precise. Lastly, add 32-bit variants for {fetch_add,
cmpxchg} on map values to cover the second valid atomic operand size.

  # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_precision
  [...]
  + /etc/rcS.d/S50-startup
  ./test_progs -t verifier_precision
  [    1.697105] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.700220] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.777043] tsc: Refined TSC clocksource calibration: 3407.986 MHz
  [    1.777619] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc6d7268, max_idle_ns: 440795260133 ns
  [    1.778658] clocksource: Switched to clocksource tsc
  #633/1   verifier_precision/bpf_neg:OK
  #633/2   verifier_precision/bpf_end_to_le:OK
  #633/3   verifier_precision/bpf_end_to_be:OK
  #633/4   verifier_precision/bpf_end_bswap:OK
  #633/5   verifier_precision/bpf_load_acquire:OK
  #633/6   verifier_precision/bpf_store_release:OK
  #633/7   verifier_precision/state_loop_first_last_equal:OK
  #633/8   verifier_precision/bpf_cond_op_r10:OK
  #633/9   verifier_precision/bpf_cond_op_not_r10:OK
  #633/10  verifier_precision/bpf_atomic_fetch_add_precision:OK
  #633/11  verifier_precision/bpf_atomic_xchg_precision:OK
  #633/12  verifier_precision/bpf_atomic_fetch_or_precision:OK
  #633/13  verifier_precision/bpf_atomic_fetch_and_precision:OK
  #633/14  verifier_precision/bpf_atomic_fetch_xor_precision:OK
  #633/15  verifier_precision/bpf_atomic_cmpxchg_precision:OK
  #633/16  verifier_precision/bpf_atomic_fetch_add_dual_precision:OK
  #633/17  verifier_precision/bpf_atomic_cmpxchg_dual_precision:OK
  #633/18  verifier_precision/bpf_atomic_fetch_add_map_precision:OK
  #633/19  verifier_precision/bpf_atomic_cmpxchg_map_precision:OK
  #633/20  verifier_precision/bpf_atomic_fetch_add_32bit_precision:OK
  #633/21  verifier_precision/bpf_atomic_cmpxchg_32bit_precision:OK
  #633/22  verifier_precision/bpf_neg_2:OK
  #633/23  verifier_precision/bpf_neg_3:OK
  #633/24  verifier_precision/bpf_neg_4:OK
  #633/25  verifier_precision/bpf_neg_5:OK
  #633     verifier_precision:OK
  Summary: 1/25 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260331222020.401848-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Fix incorrect pruning due to atomic fetch precision tracking

When backtrack_insn encounters a BPF_STX instruction with BPF_ATOMIC
and BPF_FETCH, the src register (or r0 for BPF_CMPXCHG) also acts as
a destination, thus receiving the old value from the memory location.

The current backtracking logic does not account for this. It treats
atomic fetch operations the same as regular stores where the src
register is only an input. This leads the backtrack_insn to fail to
propagate precision to the stack location, which is then not marked
as precise!

Later, the verifier's path pruning can incorrectly consider two states
equivalent when they differ in terms of stack state. Meaning, two
branches can be treated as equivalent and thus get pruned when they
should not be seen as such.

Fix it as follows: Extend the BPF_LDX handling in backtrack_insn to
also cover atomic fetch operations via is_atomic_fetch_insn() helper.
When the fetch dst register is being tracked for precision, clear it,
and propagate precision over to the stack slot. For non-stack memory,
the precision walk stops at the atomic instruction, same as regular
BPF_LDX. This covers all fetch variants.

Before:

  0: (b7) r1 = 8                        ; R1=8
  1: (7b) *(u64 *)(r10 -8) = r1         ; R1=8 R10=fp0 fp-8=8
  2: (b7) r2 = 0                        ; R2=0
  3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)          ; R2=8 R10=fp0 fp-8=mmmmmmmm
  4: (bf) r3 = r10                      ; R3=fp0 R10=fp0
  5: (0f) r3 += r2
  mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
  mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10
  mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)
  mark_precise: frame0: regs=r2 stack= before 2: (b7) r2 = 0
  6: R2=8 R3=fp8
  6: (b7) r0 = 0                        ; R0=0
  7: (95) exit

After:

  0: (b7) r1 = 8                        ; R1=8
  1: (7b) *(u64 *)(r10 -8) = r1         ; R1=8 R10=fp0 fp-8=8
  2: (b7) r2 = 0                        ; R2=0
  3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)          ; R2=8 R10=fp0 fp-8=mmmmmmmm
  4: (bf) r3 = r10                      ; R3=fp0 R10=fp0
  5: (0f) r3 += r2
  mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
  mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10
  mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)
  mark_precise: frame0: regs= stack=-8 before 2: (b7) r2 = 0
  mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1
  mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8
  6: R2=8 R3=fp8
  6: (b7) r0 = 0                        ; R0=0
  7: (95) exit

Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Fixes: 5ca419f2864a ("bpf: Add BPF_FETCH field / create atomic_fetch_add instruction")
Reported-by: STAR Labs SG <info@starlabs.sg>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260331222020.401848-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"With fixes from wireless, bluetooth and netfilter included we're back
  to each PR carrying 30%+ more fixes than in previous era.

  The good news is that so far none of the "extra" fixes are themselves
  causing real regressions. Not sure how much comfort that is.

  Current release - fix to a fix:

   - netdevsim: fix build if SKB_EXTENSIONS=n

   - eth: stmmac: skip VLAN restore when VLAN hash ops are missing

  Previous releases - regressions:

   - wifi: iwlwifi: mvm: don't send a 6E related command when
     not supported

  Previous releases - always broken:

   - some info leak fixes

   - add missing clearing of skb->cb[] on ICMP paths from tunnels

   - ipv6:
      - flowlabel: defer exclusive option free until RCU teardown
      - avoid overflows in ip6_datagram_send_ctl()

   - mpls: add seqcount to protect platform_labels from OOB access

   - bridge: improve safety of parsing ND options

   - bluetooth: fix leaks, overflows and races in hci_sync

   - netfilter: add more input validation, some to address bugs directly
     some to prevent exploits from cooking up broken configurations

   - wifi:
      - ath: avoid poor performance due to stopping the wrong
        aggregation session
      - virt_wifi: remove SET_NETDEV_DEV to avoid use-after-free

   - eth:
      - fec: fix the PTP periodic output sysfs interface
      - enetc: safely reinitialize TX BD ring when it has unsent frames"

* tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
  eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64
  ipv6: avoid overflows in ip6_datagram_send_ctl()
  net: hsr: fix VLAN add unwind on slave errors
  net: hsr: serialize seq_blocks merge across nodes
  vsock: initialize child_ns_mode_locked in vsock_net_init()
  selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
  net/sched: cls_flow: fix NULL pointer dereference on shared blocks
  net/sched: cls_fw: fix NULL pointer dereference on shared blocks
  net/x25: Fix overflow when accumulating packets
  net/x25: Fix potential double free of skb
  bnxt_en: Restore default stat ctxs for ULP when resource is available
  bnxt_en: Don't assume XDP is never enabled in bnxt_init_dflt_ring_mode()
  bnxt_en: Refactor some basic ring setup and adjustment logic
  net/mlx5: Fix switchdev mode rollback in case of failure
  net/mlx5: Avoid "No data available" when FW version queries fail
  net/mlx5: lag: Check for LAG device before creating debugfs
  net: macb: properly unregister fixed rate clocks
  net: macb: fix clk handling on PCI glue driver removal
  virtio_net: clamp rss_max_key_size to NETDEV_RSS_KEY_LEN
  net/sched: sch_netem: fix out-of-bounds access in packet corruption
  ...

Merge tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:

- IOMMU-PT related compile breakage in for AMD driver

- IOTLB flushing behavior when unmapped region is larger than requested
   due to page-sizes

- Fix IOTLB flush behavior with empty gathers

* tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline
  iommupt: Fix short gather if the unmap goes into a large mapping
  iommu: Do not call drivers for empty gathers

bpf: Reject sleepable kprobe_multi programs at attach time

kprobe.multi programs run in atomic/RCU context and cannot sleep.
However, bpf_kprobe_multi_link_attach() did not validate whether the
program being attached had the sleepable flag set, allowing sleepable
helpers such as bpf_copy_from_user() to be invoked from a non-sleepable
context.

This causes a "sleeping function called from invalid context" splat:

  BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:169
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1787, name: sudo
  preempt_count: 1, expected: 0
  RCU nest depth: 2, expected: 0

Fix this by rejecting sleepable programs early in
bpf_kprobe_multi_link_attach(), before any further processing.

Fixes: 0dcac2725406 ("bpf: Add multi kprobe link")
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260401191126.440683-1-varunrmallya@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: reject direct access to nullable PTR_TO_BUF pointers

check_mem_access() matches PTR_TO_BUF via base_type() which strips
PTR_MAYBE_NULL, allowing direct dereference without a null check.

Map iterator ctx->key and ctx->value are PTR_TO_BUF | PTR_MAYBE_NULL.
On stop callbacks these are NULL, causing a kernel NULL dereference.

Add a type_may_be_null() guard to the PTR_TO_BUF branch, matching the
existing PTR_TO_BTF_ID pattern.

Fixes: 20b2aff4bc15 ("bpf: Introduce MEM_RDONLY flag")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260402092923.38357-2-tpluszz77@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'sound-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"People have been so busy for hunting and we're still getting more
  changes than wished for, but it doesn't look too scary; almost all
  changes are device-specific small fixes.

  I guess it's rather a casual bump, and no more Easter eggs are left
  for 7.0 (hopefully)...

   - Fixes for the recent regression on ctxfi driver

   - Fix missing INIT_LIST_HEAD() for ASoC card_aux_list

   - Usual HD- and USB-audio, and ASoC AMD quirk updates

   - ASoC fixes for AMD and Intel"

* tag 'sound-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
  ASoC: amd: ps: Fix missing leading zeros in subsystem_device SSID log
  ALSA: usb-audio: Exclude Scarlett 2i2 1st Gen (8016) from SKIP_IFACE_SETUP
  ALSA: hda/realtek: add quirk for Acer Swift SFG14-73
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IMH9
  ASoC: Intel: boards: fix unmet dependency on PINCTRL
  ASoC: Intel: ehl_rt5660: Use the correct rtd->dev device in hw_params
  ALSA: ctxfi: Don't enumerate SPDIF1 at DAIO initialization
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Slim 7 14AKP10
  ALSA: hda/realtek: add quirk for HP Laptop 15-fc0xxx
  ASoC: ep93xx: Fix unchecked clk_prepare_enable() and add rollback on failure
  ASoC: soc-core: call missing INIT_LIST_HEAD() for card_aux_list
  ALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)
  ASoC: amd: yc: Add DMI entry for HP Laptop 15-fc0xxx
  ASoC: amd: yc: Add DMI quirk for ASUS Vivobook Pro 16X OLED M7601RM
  ALSA: hda/realtek: Add quirk for ASUS ROG Strix SCAR 15
  ALSA: usb-audio: Exclude Scarlett Solo 1st Gen from SKIP_IFACE_SETUP
  ALSA: caiaq: fix stack out-of-bounds read in init_card
  ALSA: ctxfi: Check the error for index mapping
  ALSA: ctxfi: Fix missing SPDIFI1 index handling
  ALSA: hda/realtek: add quirk for HP Victus 15-fb0xxx
  ...

Merge tag 'auxdisplay-v7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay

Pull auxdisplay fixes from Andy Shevchenko:

- Fix NULL dereference in linedisp_release()

- Fix ht16k33 DT bindings to avoid warnings

- Handle errors in I²C transfers in lcd2s driver

* tag 'auxdisplay-v7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
  auxdisplay: line-display: fix NULL dereference in linedisp_release
  auxdisplay: lcd2s: add error handling for i2c transfers
  dt-bindings: auxdisplay: ht16k33: Use unevaluatedProperties to fix common property warning

Merge tag 'reset-fixes-for-v7.0-2' into reset/next

Reset controller fixes for v7.0, part 2

* Decouple spacemit K3 reset lines that were incorrectly coupled
  together as one, but are in fact separate resets in hardware.
* Fix a double free in the reset_add_gpio_aux_device() error path.
  This has already been fixed on reset/next by commit a9b95ce36de4
  ("reset: gpio: add a devlink between reset-gpio and its consumer").
* Fix the MODULE_AUTHOR string in the rzg2l-usbphy-ctrl driver.

We merge this into reset/next to resolve a conflict between commits
a9b95ce36de4 ("reset: gpio: add a devlink between reset-gpio and its
consumer") and fbffb8c7c7bb ("reset: gpio: fix double free in
reset_add_gpio_aux_device() error path").

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

Merge branch 'bpf-migrate-bpf_task_work-and-file-dynptr-to-kmalloc_nolock'

Mykyta Yatsenko says:

====================
bpf: Migrate bpf_task_work and file dynptr to kmalloc_nolock

Now that kmalloc can be used from NMI context via kmalloc_nolock(),
migrate BPF internal allocations away from bpf_mem_alloc to use the
standard slab allocator.

Use kfree_rcu() for deferred freeing, which waits for a regular RCU
grace period before the memory is reclaimed. Sleepable BPF programs
hold rcu_read_lock_trace but not regular rcu_read_lock, so patch 1
adds explicit rcu_read_lock/unlock around the pointer-to-refcount
window to prevent kfree_rcu from freeing memory while a sleepable
program is still between reading the pointer and acquiring a
reference.

Patch 1 migrates bpf_task_work_ctx from bpf_mem_alloc/bpf_mem_free to
kmalloc_nolock/kfree_rcu.

Patch 2 migrates bpf_dynptr_file_impl from bpf_mem_alloc/bpf_mem_free
to kmalloc_nolock/kfree.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
Changes in v2:
- Switch to scoped_guard in patch 1 (Kumar)
- Remove rcu gp wait in patch 2 (Kumar)
- Defer to irq_work when irqs disabled in patch 1
- use bpf_map_kmalloc_nolock() for bpf_task_work
- use kmalloc_nolock() for file dynptr
- Link to v1: https://lore.kernel.org/all/20260325-kmalloc_special-v1-0-269666afb1ea@meta.com/
====================

Link: https://patch.msgid.link/20260330-kmalloc_special-v2-0-c90403f92ff0@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Migrate dynptr file to kmalloc_nolock

Replace bpf_mem_alloc/bpf_mem_free with kmalloc_nolock/kfree_nolock for
bpf_dynptr_file_impl, continuing the migration away from bpf_mem_alloc
now that kmalloc can be used from NMI context.

freader_cleanup() runs before kfree_nolock() while the dynptr still
holds exclusive access, so plain kfree_nolock() is safe — no concurrent
readers can access the object.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260330-kmalloc_special-v2-2-c90403f92ff0@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Migrate bpf_task_work to kmalloc_nolock

Replace bpf_mem_alloc/bpf_mem_free with
kmalloc_nolock/kfree_rcu for bpf_task_work_ctx.

Replace guard(rcu_tasks_trace)() with guard(rcu)() in
bpf_task_work_irq(). The function only accesses ctx struct members
(not map values), so tasks trace protection is not needed - regular
RCU is sufficient since ctx is freed via kfree_rcu. The guard in
bpf_task_work_callback() remains as tasks trace since it accesses map
values from process context.

Sleepable BPF programs hold rcu_read_lock_trace but not
regular rcu_read_lock. Since kfree_rcu
waits for a regular RCU grace period, the ctx memory can be freed
while a sleepable program is still running. Add scoped_guard(rcu)
around the pointer read and refcount tryget in
bpf_task_work_acquire_ctx to close this race window.

Since kfree_rcu uses call_rcu internally which is not safe from
NMI context, defer destruction via irq_work when irqs are disabled.

For the lost-cmpxchg path the ctx was never published, so
kfree_nolock is safe.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260330-kmalloc_special-v2-1-c90403f92ff0@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer

Mario Limonciello has led amd-pstate maintenance in recent years and
has done excellent work. The amd-pstate driver is in good hands with
him. I am stepping down as co-maintainer as I move on to other things.

Add K Prateek Nayak as a reviewer. He has been actively contributing
to the driver including preferred-core and ITMT improvements, and has
been helping review amd-pstate patches for a while now.

Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://lore.kernel.org/r/20260402102611.16519-1-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq: Pass the policy to cpufreq_driver->adjust_perf()

cpufreq_cpu_get() can sleep on PREEMPT_RT in presence of concurrent
writer(s), however amd-pstate depends on fetching the cpudata via the
policy's driver data which necessitates grabbing the reference.

Since schedutil governor can call "cpufreq_driver->update_perf()"
during sched_tick/enqueue/dequeue with rq_lock held and IRQs disabled,
fetching the policy object using the cpufreq_cpu_get() helper in the
scheduler fast-path leads to "BUG: scheduling while atomic" on
PREEMPT_RT [1].

Pass the cached cpufreq policy object in sg_policy to the update_perf()
instead of just the CPU. The CPU can be inferred using "policy->cpu".

The lifetime of cpufreq_policy object outlasts that of the governor and
the cpufreq driver (allocated when the CPU is onlined and only reclaimed
when the CPU is offlined / the CPU device is removed) which makes it
safe to be referenced throughout the governor's lifetime.

Closes:https://lore.kernel.org/all/20250731092316.3191-1-spasswolf@web.de/ [1]

Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Gary Guo <gary@garyguo.net> # Rust
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260316081849.19368-3-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate: Pass the policy to amd_pstate_update()

All callers of amd_pstate_update() already have a reference to the
cpufreq_policy object.

Pass the entire policy object and grab the cpudata using
"policy->driver_data" instead of passing the cpudata and unnecessarily
grabbing another read-side reference to the cpufreq policy object when
it is already available in the caller.

No functional changes intended.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20260316081849.19368-2-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate-ut: Add a unit test for raw EPP

Ensure that all supported raw EPP values work properly.

Export the driver helpers used by the test module so the test can drive
raw EPP writes and temporarily disable dynamic EPP while it runs.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Merge branch 'bpf-fix-abuse-of-kprobe_write_ctx-via-freplace'

Leon Hwang says:

====================
bpf: Fix abuse of kprobe_write_ctx via freplace

The potential issue of kprobe_write_ctx+freplace was mentioned in
"bpf: Disallow !kprobe_write_ctx progs tail-calling kprobe_write_ctx progs" [1].

It is true issue, that the test in patch #2 verifies that kprobe_write_ctx=false
kprobe progs can be abused to modify struct pt_regs via kprobe_write_ctx=true
freplace progs.

When struct pt_regs is modified, bpf_prog_test_run_opts() gets -EFAULT instead
of 0.

test_freplace_kprobe_write_ctx:FAIL:bpf_prog_test_run_opts unexpected error: -14 (errno 14)

We will disallow attaching freplace programs on kprobe programs with different
kprobe_write_ctx values.

Links:
[1] https://lore.kernel.org/bpf/CAP01T74w4KVMn9bEwpQXrk+bqcUxzb6VW1SQ_QvNy0A4EY-9Jg@mail.gmail.com/

Changes:
v2 -> v3:
* Add comment to the rejection of kprobe_write_ctx (per Jiri).
* Use libbpf_get_error() instead of errno in test (per Jiri).
* Collect Acked-by tags from Jiri and Song, thanks.
v2: https://lore.kernel.org/bpf/20260326141718.17731-1-leon.hwang@linux.dev/

v1 -> v2:
* Drop patch #1 in v1, as it wasn't an issue (per Toke).
* Check kprobe_write_ctx value at attach time instead of at load time, to
prevent attaching kprobe_write_ctx=true freplace progs on
kprobe_write_ctx=false kprobe progs (per Gemini/sashiko).
* Move kprobe_write_ctx test code to attach_probe.c and kprobe_write_ctx.c.
v1: https://lore.kernel.org/bpf/20260324150444.68166-1-leon.hwang@linux.dev/
====================

Link: https://patch.msgid.link/20260331145353.87606-1-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add test to verify the fix of kprobe_write_ctx abuse

Add a test to verify the issue: kprobe_write_ctx can be abused to modify
struct pt_regs of kernel functions via kprobe_write_ctx=true freplace
progs.

Without the fix, the issue is verified:

kprobe_write_ctx=true freplace prog is allowed to attach to
kprobe_write_ctx=false kprobe prog. Then, the first arg of
bpf_fentry_test1 will be set as 0, and bpf_prog_test_run_opts() gets
-EFAULT instead of 0.

With the fix, the issue is rejected at attach time.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260331145353.87606-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Fix abuse of kprobe_write_ctx via freplace

uprobe programs are allowed to modify struct pt_regs.

Since the actual program type of uprobe is KPROBE, it can be abused to
modify struct pt_regs via kprobe+freplace when the kprobe attaches to
kernel functions.

For example,

SEC("?kprobe")
int kprobe(struct pt_regs *regs)
{
return 0;
}

SEC("?freplace")
int freplace_kprobe(struct pt_regs *regs)
{
regs->di = 0;
return 0;
}

freplace_kprobe prog will attach to kprobe prog.
kprobe prog will attach to a kernel function.

Without this patch, when the kernel function runs, its first arg will
always be set as 0 via the freplace_kprobe prog.

To fix the abuse of kprobe_write_ctx=true via kprobe+freplace, disallow
attaching freplace programs on kprobe programs with different
kprobe_write_ctx values.

Fixes: 7384893d970e ("bpf: Allow uprobe program to change context registers")
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260331145353.87606-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

cpufreq/amd-pstate: Add support for raw EPP writes

The energy performance preference field of the CPPC request MSR
supports values from 0 to 255, but the strings only offer 4 values.

The other values are useful for tuning the performance of some
workloads.

Add support for writing the raw energy performance preference value
to the sysfs file. If the last value written was an integer then
an integer will be returned. If the last value written was a string
then a string will be returned.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate: Add support for platform profile class

The platform profile core allows multiple drivers and devices to
register platform profile support.

When the legacy platform profile interface is used all drivers will
adjust the platform profile as well.

Add support for registering every CPU with the platform profile handler
when dynamic EPP is enabled.

The end result will be that changing the platform profile will modify
EPP accordingly.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate: add kernel command line to override dynamic epp

Add `amd_dynamic_epp=enable` and `amd_dynamic_epp=disable` to override
the kernel configuration option `CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`
locally.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate: Add dynamic energy performance preference

Dynamic energy performance preference changes the EPP profile based on
whether the machine is running on AC or DC power.

A notification chain from the power supply core is used to adjust EPP
values on plug in or plug out events.

When enabled, the driver exposes a sysfs toggle for dynamic EPP, blocks
manual writes to energy_performance_preference while it "owns" the EPP
updates.

For non-server systems:
* the default EPP for AC mode is `performance`.
* the default EPP for DC mode is `balance_performance`.

For server systems dynamic EPP is mostly a no-op.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Documentation: amd-pstate: fix dead links in the reference section

The links for AMD64 Architecture Programmer's Manual and PPR for AMD
Family 19h Model 51h, Revision A1 Processors redirect to a generic page.
Update the links to the working ones.

Signed-off-by: Ninad Naik <ninadnaik07@gmail.com>
Link: https://lore.kernel.org/r/20260330190855.1115304-1-ninadnaik07@gmail.com
Acked-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

cpufreq/amd-pstate: Cache the max frequency in cpudata

The value of maximum frequency is fixed and never changes. Doing
calculations every time based off of perf is unnecessary.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20260326193620.649441-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}

Add documentation for the sysfs files
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq
and
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_count.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file

Add the missing amd_pstate_prefcore_ranking filenames in the sysfs
listing example leading to the descriptions of these
parameters. Clarify when will the file be visible.

Fixes: 15a2b764ea7c ("amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file

Add the missing amd_pstate_hw_prefcore filenames in the sysfs listing
example leading to the descriptions of these parameters. Clarify when
will the file be visible.

Fixes: b96b82d1af7f ("cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate-ut: Add a testcase to validate the visibility of driver attributes

amd-pstate driver has per-attribute visibility functions to
dynamically control which sysfs freq_attrs are exposed based on the
platform capabilities and the current amd_pstate mode. However, there
is no test coverage to validate that the driver's live attribute list
matches the expected visibility for each mode.

Add amd_pstate_ut_check_freq_attrs() to the amd-pstate unit test
module. For each enabled mode (passive, active, guided), the test
independently derives the expected visibility of each attribute:
  - Core attributes (max_freq, lowest_nonlinear_freq, highest_perf)
    are always expected.
  - Prefcore attributes (prefcore_ranking, hw_prefcore) are expected
    only when cpudata->hw_prefcore indicates platform support.
  - EPP attributes (energy_performance_preference,
    energy_performance_available_preferences) are expected only in
    active mode.
  - Floor frequency attributes (floor_freq, floor_count) are expected
    only when X86_FEATURE_CPPC_PERF_PRIO is present.

Compare these independent expectations against the live driver's attr
array, catching bugs such as attributes leaking into wrong modes or
visibility functions checking incorrect conditions.

Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate-ut: Add module parameter to select testcases

Currently when amd-pstate-ut test module is loaded, it runs all the
tests from amd_pstate_ut_cases[] array.

Add a module parameter named "test_list" that accepts a
comma-delimited list of test names, allowing users to run a
selected subset of tests. When the parameter is omitted or empty, all
tests are run as before.

Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()

Introduce a new tracepoint trace_amd_pstate_cppc_req2() to track
updates to MSR_AMD_CPPC_REQ2.

Invoke this while changing the Floor Perf.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Add sysfs support for floor_freq and floor_count

When Floor Performance feature is supported by the platform, expose
two sysfs files:

   * amd_pstate_floor_freq to allow userspace to request the floor
     frequency for each CPU.

   * amd_pstate_floor_count which advertises the number of distinct
     levels of floor frequencies supported on this platform.

Reset the floor_perf to bios_floor_perf in the suspend, offline, and
exit paths, and restore the value to the cached user-request
floor_freq on the resume and online paths mirroring how bios_min_perf
is handled for MSR_AMD_CPPC_REQ.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF

Some future AMD processors have feature named "CPPC Performance
Priority" which lets userspace specify different floor performance
levels for different CPUs. The platform firmware takes these different
floor performance levels into consideration while throttling the CPUs
under power/thermal constraints. The presence of this feature is
indicated by bit 16 of the EDX register for CPUID leaf
0x80000007. More details can be found in AMD Publication titled "AMD64
Collaborative Processor Performance Control (CPPC) Performance
Priority" Revision 1.10.

The number of distinct floor performance levels supported on the
platform will be advertised through the bits 32:39 of the
MSR_AMD_CPPC_CAP1. Bits 0:7 of a new MSR MSR_AMD_CPPC_REQ2
(0xc00102b5) will be used to specify the desired floor performance
level for that CPU.

Add support for the aforementioned MSR_AMD_CPPC_REQ2, and macros for
parsing and updating the relevant bits from MSR_AMD_CPPC_CAP1 and
MSR_AMD_CPPC_REQ2.

On boot if the default value of the MSR_AMD_CPPC_REQ2[7:0] (Floor
Perf) is lower than CPPC.lowest_perf, and thus invalid, initialize it
to MSR_AMD_CPPC_CAP1.nominal_perf which is a sane default value.

Save the boot-time floor_perf during amd_pstate_init_floor_perf(). In
a subsequent patch it will be restored in the suspend, offline, and
exit paths, mirroring how bios_min_perf is handled for
MSR_AMD_CPPC_REQ.

Link: https://docs.amd.com/v/u/en-US/69206_1.10_AMD64_CPPC_PUB
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

x86/cpufeatures: Add AMD CPPC Performance Priority feature.

Some future AMD processors have feature named "CPPC Performance
Priority" which lets userspace specify different floor performance
levels for different CPUs. The platform firmware takes these different
floor performance levels into consideration while throttling the CPUs
under power/thermal constraints. The presence of this feature is
indicated by bit 16 of the EDX register for CPUID leaf
0x80000007. More details can be found in AMD Publication titled "AMD64
Collaborative Processor Performance Control (CPPC) Performance
Priority" Revision 1.10.

Define a new feature bit named X86_FEATURE_CPPC_PERF_PRIO to map to
CPUID 0x80000007.EDX[16].

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Make certain freq_attrs conditionally visible

Certain amd_pstate freq_attrs such as amd_pstate_hw_prefcore and
amd_pstate_prefcore_ranking are enabled even when preferred core is
not supported on the platform.

Similarly there are common freq_attrs between the amd-pstate and the
amd-pstate-epp drivers (eg: amd_pstate_max_freq,
amd_pstate_lowest_nonlinear_freq, etc.) but are duplicated in two
different freq_attr structs.

Unify all the attributes in a single place and associate each of them
with a visibility function that determines whether the attribute
should be visible based on the underlying platform support and the
current amd_pstate mode.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Update cppc_req_cached in fast_switch case

The function msr_update_perf() does not cache the new value that is
written to MSR_AMD_CPPC_REQ into the variable cpudata->cppc_req_cached
when the update is happening from the fast path.

Fix that by caching the value everytime the MSR_AMD_CPPC_REQ gets
updated.

This issue was discovered by Claude Opus 4.6 with the aid of Chris
Mason's AI review-prompts
(https://github.com/masoncl/review-prompts/tree/main/kernel).

Assisted-by: Claude:claude-opus-4.6 review-prompts/linux
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Fixes: fff395796917 ("cpufreq/amd-pstate: Always write EPP value when updating perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

amd-pstate: Fix memory leak in amd_pstate_epp_cpu_init()

On failure to set the epp, the function amd_pstate_epp_cpu_init()
returns with an error code without freeing the cpudata object that was
allocated at the beginning of the function.

Ensure that the cpudata object is freed before returning from the
function.

This memory leak was discovered by Claude Opus 4.6 with the aid of
Chris Mason's AI review-prompts
(https://github.com/masoncl/review-prompts/tree/main/kernel).

Assisted-by: Claude:claude-opus-4.6 review-prompts/linux
Fixes: f9a378ff6443 ("cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

KVM: riscv: selftests: Implement kvm_arch_has_default_irqchip

kvm_arch_has_default_irqchip is required for irqfd_test and returns
true if an in-kernel interrupt controller is supported.

Fixes: a133052666bed ("KVM: selftests: Fix irqfd_test for non-x86 architectures")
Signed-off-by: Mayuresh Chitale <mayuresh.chitale@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260402101818.2982071-1-mayuresh.chitale@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>

kbuild: vdso_install: drop build ID architecture allow-list

Many architectures which do generate build IDs are missing from this
list. For example arm64, riscv, loongarch, mips.

Now that errors from readelf and binaries without any build ID are
handled gracefully, the allow-list is not necessary anymore, drop it.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260331-kbuild-vdso-install-v2-4-606d0dc6beca@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>

kbuild: vdso_install: gracefully handle images without build ID

If the vDSO does not contain a build ID, skip the symlink step.
This will allow the removal of the explicit list of architectures.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260331-kbuild-vdso-install-v2-3-606d0dc6beca@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>

kbuild: vdso_install: hide readelf warnings

If 'readelf -n' encounters a note it does not recognize it emits a
warning. This for example happens when inspecting a compat vDSO for
which the main kernel toolchain was not used.
However the relevant build ID note is always readable, so the
warnings are pointless.

Hide the warnings to make it possible to extract build IDs for more
architectures in the future.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260331-kbuild-vdso-install-v2-2-606d0dc6beca@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>

kbuild: vdso_install: split out the readelf invocation

Split up the logic as some upcoming changes to the readelf invocation
would create a very long line otherwise.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260331-kbuild-vdso-install-v2-1-606d0dc6beca@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>

eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64

On systems with 64K pages, RX queues will be wedged if users set the
descriptor count to the current minimum (16). Fbnic fragments large
pages into 4K chunks, and scales down the ring size accordingly. With
64K pages and 16 descriptors, the ring size mask is 0 and will never
be filled.

32 descriptors is another special case that wedges the RX rings.
Internally, the rings track pages for the head/tail pointers, not page
fragments. So with 32 descriptors, there's only 1 usable page as one
ring slot is kept empty to disambiguate between an empty/full ring.
As a result, the head pointer never advances and the HW stalls after
consuming 16 page fragments.

Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free")
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
Link: https://patch.msgid.link/20260401162848.2335350-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ASoC: qcom: q6dsp: few fixes and enhancements

Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> says:

This patchset contains few fixes for the bugs hit during testing with
Monza EVK platform
- around array out of bounds access on dai ids which keep extending but
  the drivers seems to have hardcoded some numbers, fix this and clean
the mess up
- fix few issues discovered while trying to shut down dsp.
- flooding rpmsg with write requests due to not resetting queue pointer,
  fix this resetting the pointer in trigger stop.
- possible multiple graph opens which can result in open failures.

Apart from this few new enhancements to the dsp side
- add new LPI MI2S and senary dai entries
- handle pipewire and Displayport issues by moving graph start to
  trigger level, which should fix outstanding pipewire and DP issues on
Qualcomm SoCs.
- remove some unnessary loops in hot path
- support early memory map on DSP.

Tested this on top of linux-next on VENTUNO-Q platform.

ASoC: qcom: q6apm: Add support for early buffer mapping on DSP

Buffers are allocated on pcm_new and mapped in the dsp on every
prepare call, which is inefficient and unnecessary.

Add new functions q6apm_[un]map_memory_fixed_region to map it on
to dsp only once after allocation.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-14-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: qdsp6: remove search for module iid in hot path

Remove searching for Shared Memory module instance id on every
read/write call, this is un-necessary if we can cache the shared
memory module instance id per PCM graph.

Add new member to graph struct to store shared memory module
instance id to avoid searching for this in hot path.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-13-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6apm-lpass-dai: move graph start to trigger

Start the graph at trigger callback. Staring the graph at prepare does
not make sense as there is no data transfer at this point.
Moving this to trigger will also help cope situation where pipewire
is not happy if display port is not connected during start.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-12-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: qdapm-lpass-dai: correct the error message

Fix the error message to reflect the actual graph stop error
instead of graph close error.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-11-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6dsp: Add Senary MI2S audio interface support

Introduces support for the Senary MI2S audio interface in the Qualcomm
q6dsp. Add new AFE port IDs for Senary MI2S RX and TX and include the
necessary mappings in the port configuration to allow audio routing
over the Senary MI2S interface.

Signed-off-by: Mohammad Rafi Shaik <mohammad.rafi.shaik@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Tested-by: Val Packett <val@packett.cool> # sm7325-motorola-dubai
Link: https://patch.msgid.link/20260402081118.348071-10-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: qdsp6: lpass-ports: add support for LPASS LPI MI2S dais

Add support for LPASS LPI MI2S dais in the dai-driver, these dais are
used in Monaco based platform devices.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-9-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: common: validate cpu dai id during parsing

lpass ports numbers have been added but the afe/apm driver never got
updated with new max port value that it uses to store dai specific data.
There are more than one places these values are cached and always become
out of sync.

This will result in array out of bounds and weird driver behaviour.

To catch such issues, first add a single place where we can define max
port and second add a check in common parsing code which can error
out before corrupting the memory with out of bounds array access.

This should help both avoid and catch these type of mistakes in future.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-8-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: dt-bindings: qcom: add LPASS LPI MI2S dai ids

Add new dai ids entries for LPASS LPI MI2S and SENARY MI2S audio lines.

Co-developed-by: Mohammad Rafi Shaik <mohammad.rafi.shaik@oss.qualcomm.com>
Signed-off-by: Mohammad Rafi Shaik <mohammad.rafi.shaik@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-7-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6apm-dai: reset queue ptr on trigger stop

Reset queue pointer on SNDRV_PCM_TRIGGER_STOP event to be inline
with resetting appl_ptr. Without this we will end up with a queue_ptr
out of sync and driver could try to send data that is not ready yet.

Fix this by resetting the queue_ptr.

Fixes: 3d4a4411aa8bb ("ASoC: q6apm-dai: schedule all available frames to avoid dsp under-runs")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-6-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6apm-lpass-dai: Fix multiple graph opens

As prepare can be called mulitple times, this can result in multiple
graph opens for playback path.

This will result in a memory leaks, fix this by adding a check before
opening.

Fixes: be1fae62cf25 ("ASoC: q6apm-lpass-dai: close graph on prepare errors")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-5-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: qdsp6: topology: check widget type before accessing data

Check widget type before accessing the private data, as this could a
virtual widget which is no associated with a dsp graph, container and
module. Accessing witout check could lead to incorrect memory access.

Fixes: 36ad9bf1d93d ("ASoC: qdsp6: audioreach: add topology support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-4-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6apm: remove child devices when apm is removed

looks like q6apm driver does not remove the child driver q6apm-dai and
q6apm-bedais when the this driver is removed.

Fix this by depopulating them in remove callback.

With this change when the dsp is shutdown all the devices associated with
q6apm will now be removed.

Fixes: 5477518b8a0e ("ASoC: qdsp6: audioreach: add q6apm support")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-3-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: qcom: q6apm: move component registration to unmanaged version

q6apm component registers dais dynamically from ASoC toplology, which
are allocated using device managed version apis. Allocating both
component and dynamic dais using managed version could lead to incorrect
free ordering, dai will be freed while component still holding references
to it.

Fix this issue by moving component to unmanged version so
that the dai pointers are only freeded after the component is removed.

==================================================================
BUG: KASAN: slab-use-after-free in snd_soc_del_component_unlocked+0x3d4/0x400 [snd_soc_core]
Read of size 8 at addr ffff00084493a6e8 by task kworker/u48:0/3426
Tainted: [W]=WARN
Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET57W (1.31 ) 08/08/2024
Workqueue: pdr_notifier_wq pdr_notifier_work [pdr_interface]
Call trace:
show_stack+0x28/0x7c (C)
dump_stack_lvl+0x60/0x80
print_report+0x160/0x4b4
kasan_report+0xac/0xfc
__asan_report_load8_noabort+0x20/0x34
snd_soc_del_component_unlocked+0x3d4/0x400 [snd_soc_core]
snd_soc_unregister_component_by_driver+0x50/0x88 [snd_soc_core]
devm_component_release+0x30/0x5c [snd_soc_core]
devres_release_all+0x13c/0x210
device_unbind_cleanup+0x20/0x190
device_release_driver_internal+0x350/0x468
device_release_driver+0x18/0x30
bus_remove_device+0x1a0/0x35c
device_del+0x314/0x7f0
device_unregister+0x20/0xbc
apr_remove_device+0x5c/0x7c [apr]
device_for_each_child+0xd8/0x160
apr_pd_status+0x7c/0xa8 [apr]
pdr_notifier_work+0x114/0x240 [pdr_interface]
process_one_work+0x500/0xb70
worker_thread+0x630/0xfb0
kthread+0x370/0x6c0
ret_from_fork+0x10/0x20

Allocated by task 77:
kasan_save_stack+0x40/0x68
kasan_save_track+0x20/0x40
kasan_save_alloc_info+0x44/0x58
__kasan_kmalloc+0xbc/0xdc
__kmalloc_node_track_caller_noprof+0x1f4/0x620
devm_kmalloc+0x7c/0x1c8
snd_soc_register_dai+0x50/0x4f0 [snd_soc_core]
soc_tplg_pcm_elems_load+0x55c/0x1eb8 [snd_soc_core]
snd_soc_tplg_component_load+0x4f8/0xb60 [snd_soc_core]
audioreach_tplg_init+0x124/0x1fc [snd_q6apm]
q6apm_audio_probe+0x10/0x1c [snd_q6apm]
snd_soc_component_probe+0x5c/0x118 [snd_soc_core]
soc_probe_component+0x44c/0xaf0 [snd_soc_core]
snd_soc_bind_card+0xad0/0x2370 [snd_soc_core]
snd_soc_register_card+0x3b0/0x4c0 [snd_soc_core]
devm_snd_soc_register_card+0x50/0xc8 [snd_soc_core]
x1e80100_platform_probe+0x208/0x368 [snd_soc_x1e80100]
platform_probe+0xc0/0x188
really_probe+0x188/0x804
__driver_probe_device+0x158/0x358
driver_probe_device+0x60/0x190
__device_attach_driver+0x16c/0x2a8
bus_for_each_drv+0x100/0x194
__device_attach+0x174/0x380
device_initial_probe+0x14/0x20
bus_probe_device+0x124/0x154
deferred_probe_work_func+0x140/0x220
process_one_work+0x500/0xb70
worker_thread+0x630/0xfb0
kthread+0x370/0x6c0
ret_from_fork+0x10/0x20

Freed by task 3426:
kasan_save_stack+0x40/0x68
kasan_save_track+0x20/0x40
__kasan_save_free_info+0x4c/0x80
__kasan_slab_free+0x78/0xa0
kfree+0x100/0x4a4
devres_release_all+0x144/0x210
device_unbind_cleanup+0x20/0x190
device_release_driver_internal+0x350/0x468
device_release_driver+0x18/0x30
bus_remove_device+0x1a0/0x35c
device_del+0x314/0x7f0
device_unregister+0x20/0xbc
apr_remove_device+0x5c/0x7c [apr]
device_for_each_child+0xd8/0x160
apr_pd_status+0x7c/0xa8 [apr]
pdr_notifier_work+0x114/0x240 [pdr_interface]
process_one_work+0x500/0xb70
worker_thread+0x630/0xfb0
kthread+0x370/0x6c0
ret_from_fork+0x10/0x20

Fixes: 5477518b8a0e ("ASoC: qdsp6: audioreach: add q6apm support")
Cc: Stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-2-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>

batman-adv: reject oversized global TT response buffers

batadv_tt_prepare_tvlv_global_data() builds the allocation length for a
global TT response in 16-bit temporaries. When a remote originator
advertises a large enough global TT, the TT payload length plus the VLAN
header offset can exceed 65535 and wrap before kmalloc().

The full-table response path still uses the original TT payload length when
it fills tt_change, so the wrapped allocation is too small and
batadv_tt_prepare_tvlv_global_data() writes past the end of the heap object
before the later packet-size check runs.

Fix this by rejecting TT responses whose TVLV value length cannot fit in
the 16-bit TVLV payload length field.

Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific")
Cc: stable@vger.kernel.org
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Ruide Cao <caoruide123@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>

ipv6: avoid overflows in ip6_datagram_send_ctl()

Yiming Qian reported :
<quote>
I believe I found a locally triggerable kernel bug in the IPv6 sendmsg
ancillary-data path that can panic the kernel via `skb_under_panic()`
(local DoS).

The core issue is a mismatch between:

- a 16-bit length accumulator (`struct ipv6_txoptions::opt_flen`, type
`__u16`) and
- a pointer to the *last* provided destination-options header (`opt->dst1opt`)

when multiple `IPV6_DSTOPTS` control messages (cmsgs) are provided.

- `include/net/ipv6.h`:
   - `struct ipv6_txoptions::opt_flen` is `__u16` (wrap possible).
(lines 291-307, especially 298)
- `net/ipv6/datagram.c:ip6_datagram_send_ctl()`:
   - Accepts repeated `IPV6_DSTOPTS` and accumulates into `opt_flen`
without rejecting duplicates. (lines 909-933)
- `net/ipv6/ip6_output.c:__ip6_append_data()`:
   - Uses `opt->opt_flen + opt->opt_nflen` to compute header
sizes/headroom decisions. (lines 1448-1466, especially 1463-1465)
- `net/ipv6/ip6_output.c:__ip6_make_skb()`:
   - Calls `ipv6_push_frag_opts()` if `opt->opt_flen` is non-zero.
(lines 1930-1934)
- `net/ipv6/exthdrs.c:ipv6_push_frag_opts()` / `ipv6_push_exthdr()`:
   - Push size comes from `ipv6_optlen(opt->dst1opt)` (based on the
pointed-to header). (lines 1179-1185 and 1206-1211)

1. `opt_flen` is a 16-bit accumulator:

- `include/net/ipv6.h:298` defines `__u16 opt_flen; /* after fragment hdr */`.

2. `ip6_datagram_send_ctl()` accepts *repeated* `IPV6_DSTOPTS` cmsgs
and increments `opt_flen` each time:

- In `net/ipv6/datagram.c:909-933`, for `IPV6_DSTOPTS`:
   - It computes `len = ((hdr->hdrlen + 1) << 3);`
   - It checks `CAP_NET_RAW` using `ns_capable(net->user_ns,
CAP_NET_RAW)`. (line 922)
   - Then it does:
     - `opt->opt_flen += len;` (line 927)
     - `opt->dst1opt = hdr;` (line 928)

There is no duplicate rejection here (unlike the legacy
`IPV6_2292DSTOPTS` path which rejects duplicates at
`net/ipv6/datagram.c:901-904`).

If enough large `IPV6_DSTOPTS` cmsgs are provided, `opt_flen` wraps
while `dst1opt` still points to a large (2048-byte)
destination-options header.

In the attached PoC (`poc.c`):

- 32 cmsgs with `hdrlen=255` => `len = (255+1)*8 = 2048`
- 1 cmsg with `hdrlen=0` => `len = 8`
- Total increment: `32*2048 + 8 = 65544`, so `(__u16)opt_flen == 8`
- The last cmsg is 2048 bytes, so `dst1opt` points to a 2048-byte header.

3. The transmit path sizes headers using the wrapped `opt_flen`:

- In `net/ipv6/ip6_output.c:1463-1465`:
  - `headersize = sizeof(struct ipv6hdr) + (opt ? opt->opt_flen +
opt->opt_nflen : 0) + ...;`

With wrapped `opt_flen`, `headersize`/headroom decisions underestimate
what will be pushed later.

4. When building the final skb, the actual push length comes from
`dst1opt` and is not limited by wrapped `opt_flen`:

- In `net/ipv6/ip6_output.c:1930-1934`:
   - `if (opt->opt_flen) proto = ipv6_push_frag_opts(skb, opt, proto);`
- In `net/ipv6/exthdrs.c:1206-1211`, `ipv6_push_frag_opts()` pushes
`dst1opt` via `ipv6_push_exthdr()`.
- In `net/ipv6/exthdrs.c:1179-1184`, `ipv6_push_exthdr()` does:
   - `skb_push(skb, ipv6_optlen(opt));`
   - `memcpy(h, opt, ipv6_optlen(opt));`

With insufficient headroom, `skb_push()` underflows and triggers
`skb_under_panic()` -> `BUG()`:

- `net/core/skbuff.c:2669-2675` (`skb_push()` calls `skb_under_panic()`)
- `net/core/skbuff.c:207-214` (`skb_panic()` ends in `BUG()`)

- The `IPV6_DSTOPTS` cmsg path requires `CAP_NET_RAW` in the target
netns user namespace (`ns_capable(net->user_ns, CAP_NET_RAW)`).
- Root (or any task with `CAP_NET_RAW`) can trigger this without user
namespaces.
- An unprivileged `uid=1000` user can trigger this if unprivileged
user namespaces are enabled and it can create a userns+netns to obtain
namespaced `CAP_NET_RAW` (the attached PoC does this).

- Local denial of service: kernel BUG/panic (system crash).
- Reproducible with a small userspace PoC.
</quote>

This patch does not reject duplicated options, as this might break
some user applications.

Instead, it makes sure to adjust opt_flen and opt_nflen to correctly
reflect the size of the current option headers, preventing the overflows
and the potential for panics.

This applies to IPV6_DSTOPTS, IPV6_HOPOPTS, and IPV6_RTHDR.

Specifically:

When a new IPV6_DSTOPTS is processed, the length of the old opt->dst1opt
is subtracted from opt->opt_flen before adding the new length.

When a new IPV6_HOPOPTS is processed, the length of the old opt->dst0opt
is subtracted from opt->opt_nflen.

When a new Routing Header (IPV6_RTHDR or IPV6_2292RTHDR) is processed,
the length of the old opt->srcrt is subtracted from opt->opt_nflen.

In the special case within IPV6_2292RTHDR handling where dst1opt is moved
to dst0opt, the length of the old opt->dst0opt is subtracted from
opt->opt_nflen before the new one is added.

Fixes: 333fad5364d6 ("[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542).")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8JNzawgr5OX5m+3jnQDHry2XxhQT5=jThW1zDPtUikRYA@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260401154721.3740056-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-hsr-fixes-for-prp-duplication-and-vlan-unwind'

Luka Gejak says:

====================
net: hsr: fixes for PRP duplication and VLAN unwind

This series addresses two logic bugs in the HSR/PRP implementation
identified during a protocol audit. These are targeted for the 'net'
tree as they fix potential memory corruption and state inconsistency.

The primary change resolves a race condition in the node merging path by
implementing address-based lock ordering. This ensures that concurrent
mutations of sequence blocks do not lead to state corruption or
deadlocks.

An additional fix corrects asymmetric VLAN error unwinding by
implementing a centralized unwind path on slave errors.
====================

Link: https://patch.msgid.link/20260401092243.52121-1-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hsr: fix VLAN add unwind on slave errors

When vlan_vid_add() fails for a secondary slave, the error path calls
vlan_vid_del() on the failing port instead of the peer slave that had
already succeeded. This results in asymmetric VLAN state across the HSR
pair.

Fix this by switching to a centralized unwind path that removes the VID
from any slave device that was already programmed.

Fixes: 1a8a63a5305e ("net: hsr: Add VLAN CTAG filter support")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260401092243.52121-3-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hsr: serialize seq_blocks merge across nodes

During node merging, hsr_handle_sup_frame() walks node_curr->seq_blocks
to update node_real without holding node_curr->seq_out_lock. This
allows concurrent mutations from duplicate registration paths, risking
inconsistent state or XArray/bitmap corruption.

Fix this by locking both nodes' seq_out_lock during the merge.
To prevent ABBA deadlocks, locks are acquired in order of memory
address.

Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Fixes: 415e6367512b ("hsr: Implement more robust duplicate discard for PRP")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260401092243.52121-2-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock: initialize child_ns_mode_locked in vsock_net_init()

The `child_ns_mode_locked` field lives in `struct net`, which persists
across vsock module reloads. When the module is unloaded and reloaded,
`vsock_net_init()` resets `mode` and `child_ns_mode` back to their
default values, but does not reset `child_ns_mode_locked`.

The stale lock from the previous module load causes subsequent writes
to `child_ns_mode` to silently fail: `vsock_net_set_child_mode()` sees
the old lock, skips updating the actual value, and returns success
when the requested mode matches the stale lock. The sysctl handler
reports no error, but `child_ns_mode` remains unchanged.

Steps to reproduce:
    $ modprobe vsock
    $ echo local > /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    local
    $ modprobe -r vsock
    $ modprobe vsock
    $ echo local > /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    global    <--- expected "local"

Fix this by initializing `child_ns_mode_locked` to 0 (unlocked) in
`vsock_net_init()`, so the write-once mechanism works correctly after
module reload.

Fixes: 102eab95f025 ("vsock: lock down child_ns_mode as write-once")
Reported-by: Jin Liu <jinl@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260401092153.28462-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools/nolibc: explicitly list architecture headers

Relying on $(wildcard) is brittle and non-deterministic.

similar to all the other headers.
Switch the list of architecture headers to an explicit list,

Link: https://patch.msgid.link/20260401-nolibc-cleanup-v1-4-bcf4c9f5c1be@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>

tools/nolibc: drop superfluous definition of Q

Q is already defined by tools/scripts/Makefile.include which is included
at the top of tools/include/nolibc/Makefile.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260401-nolibc-cleanup-v1-3-bcf4c9f5c1be@weissschuh.net

tools/nolibc: drop superfluous invocation of mkdir

The call to 'mkdir -p $(OUTPUT)sysroot/include' will also create the
sysroot directory.

Drop the unnecessary explicit invocation of mkdir.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260401-nolibc-cleanup-v1-2-bcf4c9f5c1be@weissschuh.net

tools/nolibc: drop superfluous invocation of 'make headers'

The headers_install target of the toplevel Makefile will already make
sure that the headers are up-to-date.

Drop the superfluous explicit invocation.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260401-nolibc-cleanup-v1-1-bcf4c9f5c1be@weissschuh.net

drivers/base/memory: fix stale reference to memory_block_add_nid()

The function memory_block_add_nid() was renamed to
memory_block_add_nid_early() by commit 0a947c14e48c
("drivers/base: move memory_block_add_nid() into the
caller"). Update the stale reference in add_memory_block().

Assisted-by: unnamed:deepseek-v3.2 coccinelle
Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/20260321105704.6093-1-kexinsun@smail.nju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

device property: Document how to check for the property presence

Currently it's unclear if one may or may not rely on the error codes
returned from the property getters to check for the property presence.
Clarify this by updating kernel-doc for fwnode_property_*() and
device_property_*() where it's applicable.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/r/4b24f1f4-b395-467a-81b7-1334a2d48845@roeck-us.net
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://patch.msgid.link/20260318142404.2526642-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

.get_maintainer.ignore: add myself

I don't want get_maintainer.pl to automatically print my email.

Signed-off-by: Askar Safin <safinaskar@gmail.com>
Link: https://patch.msgid.link/20260324082928.3473789-1-safinaskar@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nvmem: zynqmp_nvmem: Fix buffer size in DMA and memcpy

Buffer size used in dma allocation and memcpy is wrong.
It can lead to undersized DMA buffer access and possible
memory corruption. use correct buffer size in dma_alloc_coherent
and memcpy.

Fixes: 737c0c8d07b5 ("nvmem: zynqmp_nvmem: Add support to access efuse")
Cc: stable@vger.kernel.org
Signed-off-by: Ivan Vera <ivanverasantos@gmail.com>
Signed-off-by: Harish Ediga <harish.ediga@amd.com>
Signed-off-by: Harsh Jain <h.jain@amd.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260327131645.3025781-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nvmem: imx: assign nvmem_cell_info::raw_len

Avoid getting error messages at startup like the following on i.MX6ULL:

nvmem imx-ocotp0: cell mac-addr raw len 6 unaligned to nvmem word size 4
nvmem imx-ocotp0: cell mac-addr raw len 6 unaligned to nvmem word size 4

This shouldn't cause any functional change as this alignment would
otherwise be done in nvmem_cell_info_to_nvmem_cell_entry_nodup().

Cc: stable@vger.kernel.org
Fixes: 13bcd440f2ff ("nvmem: core: verify cell's raw_len")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260327131645.3025781-2-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soundwire: debugfs: initialize firmware_file to empty string

Passing NULL to debugfs_create_str() causes a NULL pointer dereference,
and creating debugfs nodes with NULL string pointers is no longer
permitted.

Additionally, firmware_file is a global pointer. Previously, adding every
new slave blindly overwrote it with NULL.

Fix these issues by initializing firmware_file to an allocated empty
string once in the subsystem init path (sdw_debugfs_init), and freeing
it in the exit path. Existing driver code handles empty strings
correctly.

Fixes: fe46d2a4301d ("soundwire: debugfs: add interface to read/write commands")
Reported-by: yangshiguang <yangshiguang@xiaomi.com>
Closes: https://lore.kernel.org/lkml/17647e4c.d461.19b46144a4e.Coremail.yangshiguang1011@163.com/
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-4-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str()

The EXPORT_SYMBOL_GPL() for debugfs_create_str was placed incorrectly
away from the function definition. Move it immediately below the
debugfs_create_str() function where it belongs.

Fixes: d60b59b96795 ("debugfs: Export debugfs_create_str symbol")
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-3-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

debugfs: check for NULL pointer in debugfs_create_str()

Passing a NULL pointer to debugfs_create_str() leads to a NULL pointer
dereference when the debugfs file is read. Following upstream
discussions, forbid the creation of debugfs string files with NULL
pointers. Add a WARN_ON() to expose offending callers and return early.

Fixes: 9af0440ec86e ("debugfs: Implement debugfs_create_str()")
Reported-by: yangshiguang <yangshiguang@xiaomi.com>
Closes: https://lore.kernel.org/lkml/2025122221-gag-malt-75ba@gregkh/
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-2-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty: serial: ip22zilog: Fix section mispatch warning

ip22zilog_prepare() is now called by driver probe routine, so it
shouldn't be in the __init section any longer.

Fixes: 3fc36ae6abd2 ("tty: serial: ip22zilog: Use platform device for probing")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604020945.c9jAvCPs-lkp@intel.com/
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://patch.msgid.link/20260402102154.136620-1-tbogendoerfer@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: Make usb_host_endpoint.hcpriv survive endpoint_disable()

xHCI hardware maintains its endpoint state between add_endpoint()
and drop_endpoint() calls followed by successful check_bandwidth().
So does the driver.

Core may call endpoint_disable() during xHCI endpoint life, so don't
clear host_ep->hcpriv then, because this breaks endpoint_reset().

If a driver calls usb_set_interface(), submits URBs which make host
sequence state non-zero and calls usb_clear_halt(), the device clears
its sequence state but xhci_endpoint_reset() bails out. The next URB
malfunctions: USB2 loses one packet, USB3 gets Transaction Error or
may not complete at all on some (buggy?) HCs from ASMedia and AMD.
This is triggered by uvcvideo on bulk video devices.

The code was copied from ehci_endpoint_disable() but it isn't needed
here - hcpriv should only be NULL on emulated root hub endpoints.
It might prevent resetting and inadvertently enabling a disabled and
dropped endpoint, but core shouldn't try to reset dropped endpoints.

Document xhci requirements regarding hcpriv. They are currently met.

Fixes: 18b74067ac78 ("xhci: Fix use-after-free regression in xhci clear hub TT implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-26-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: remove duplicate '0x' prefix

Prefix "0x" is automatically added by '%pad'.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-25-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: move roothub port limit validation

Function xhci_setup_port_arrays() limits the number of roothub ports
for both USB 2 and 3, this causes code repetition.

Solve this by moving roothub port limits validation to
xhci_create_rhub_port_array().

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-24-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: simpilfy resume root hub code

Resume roothubs without checking 'retval' value, as it is always '0'.
Due to changes made in commit 79989bd4ab86 ("xhci: always resume roothubs
if xHC was reset during resume") the check is redundant.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-23-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: cleanup xhci_hub_report_usb3_link_state()

Improve readability of xhci_hub_report_usb3_link_state().

Comments are shortened and clarified, and the code now makes it explicit
when the Port Link State (PLS) value is modified versus when other status
bits are updated.

No functional changes.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-22-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: rename parameter to match argument 'portsc'

A previous patch renamed the temporary variable holding the value read
from the PORTSC register from 'temp' to 'portsc'. This patch follows up
by updating the parameter names of all helper functions called from
xhci_hub_control() that receive a PORTSC value, as well as the functions
they call.

Function changed:
xhci_get_port_status()
L xhci_get_usb3_port_status()
L xhci_hub_report_usb3_link_state()
L xhci_del_comp_mod_timer()
xhci_get_ext_port_status()
xhci_port_state_to_neutral()
xhci_clear_port_change_bit()
xhci_port_speed()

The reason for the rename is to differentiate between port
status/change bit to be written to PORTSC and replying to hub-class
USB requests. Each of them use their specific macros.

Use "portsc" name for PORTSC values and "status" for values intended
for replying to hub-class USB request.

A dedicated structure for USB hub port status responses
('struct usb_port_status' from ch11.h) exists and will be integrated in
a later patch.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-21-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: add PORTSC variable to xhci_hub_control()

The variable 'temp' is used multiple times throughout xhci_hub_control()
for holding only PORTSC register values.

As a follow-up to introducing a dedicated variable for PORTPMSC, rename
all remaining 'temp' to 'portsc'. This improves readability and clarifies
what is being modified.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-20-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: add PORTPMSC variable to xhci_hub_control()

The code handling U1/U2 timeout updates reads and modifies the PORTPMSC
register using the generic 'temp' variable, which is also used for
PORTSC. This makes the code hard to read and increases the risk of mixing
up register contents.

Introduce a dedicated 'portpmsc' variable for PORTPMSC accesses and use
it in both U1 and U2 timeout handlers. This makes the intent clearer and
keeps register operations logically separated.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-19-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: separate use of USB Chapter 11 PLS macros from xHCI-specific PLS macros

The xhci driver uses two different sources for Port Link State (PLS):
  1. The PLS field in the PORTSC register (bits 8:5).
  2. The PLS value encoded in bits 15:8 of the USB request wIndex,
     received by xhci_hub_control().

While both represent similar link states, they differ in a few details,
for example, xHCI's Resume State. Because of these differences, the xhci
driver defines its own set of PLS macros in xhci-port.h, which are intended
to be used when reading and writing  PORTSC. The generic USB Chapter 11
macros in ch11.h should only be used when parsing or replying to hub-class
USB requests.

To avoid mixing these two representations and prevent incorrect state
reporting, replace all uses of Chapter 11 PLS macros with the xHCI
versions when interacting with the PORTSC register.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-18-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: clean up 'wValue' handling in xhci_hub_control()

Several hub control requests encode a descriptor type in the upper byte
of 'wValue'. Clean this up by extracting the descriptor type into a local
variable and using it for all relevant requests.

Replace magic value (0x02) with the appropriate macro (HUB_EXT_PORT_STATUS)

This improves readability and makes the handling of 'wValue' consistent.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-17-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: clean up handling of upper bits in SetPortFeature wIndex

In Set Port Feature requests, the upper byte of 'wIndex' encodes
feature-specific parameters. The current code reads these upper bits in
an early pre-processing block, and then the same feature is handled again
later in the main switch statement. This results in duplicated condition
checks and makes the control flow harder to follow.

Move all feature-specific extraction of 'wIndex' upper bits into the
main SetPortFeature logic so that each feature is handled in exactly one
place. This reduces duplication, makes the handling clearer, and keeps
'wIndex' parsing local to the code that actually uses the values.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-16-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: rename 'wIndex' parameters to 'portnum'

Several helper functions take a parameter named 'wIndex', but the
value they receive is not the raw USB request wIndex field. The only
function that actually processes the USB hub request parameter is
xhci_hub_control(), which extracts the relevant port number (and other
upper-byte fields) before passing them down.

To avoid confusion between the USB request parameter and the derived
0-based port index, rename all such function parameters from 'wIndex'
to 'portnum'. This improves readability and makes the call intentions
clearer.

When a function accept struct 'xhci_port' pointer, use its port number
instead.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-15-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: stop treating 'wIndex' as a mutable port number

The USB request parameter 'wIndex' is a 16-bit field whose meaning depends
on the request type. For hub port operations, only bits 7:0 encode the port
number (1..MaxPorts). Despite this, the current code extracts the port
number into 'portnum1' while also modifying and using 'wIndex' directly as
a 0-based port index. This dual use is both confusing and error-prone,
since 'wIndex' is not always a pure port number.

Clean this up by deriving a single 0-based 'portnum' from 'wIndex' and
using it throughout the function. The original 'wIndex' value is no longer
modified or treated as a port number. This also matches existing xhci code.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-14-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: optimize resuming from S4 (suspend-to-disk)

On resume from S4 (power loss after suspend/hibernation), the xHCI
driver previously freed, reallocated, and fully reinitialized all
data structures. Most of this is unnecessary because the data is
restored from a saved image; only the xHCI registers lose their values.

This patch optimizes S4 resume by performing only a host controller
reset, which includes:
* Freeing or clearing runtime-created data.
* Rewriting xHCI registers.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-13-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: improve debug messages during suspend

Improve debug output for suspend failures, particularly when the controller
handshake does not complete. This will become important as upcoming patches
significantly rework the resume path, making more detailed suspend-side
messages valuable for debugging.

Add an explicit check of the Save/Restore Error (SRE) flag after a
successful Save State (CSS) operation. The xHCI specification
(note in section 4.23.2) states:

"After a Save or Restore State operation completes, the
Save/Restore Error (SRE) flag in USBSTS should be checked to
ensure the operation completed successfully."

Currently, the SRE error is only observed and warning is printed.
This patch does not introduce deeper error handling, as the correct
response is unclear and changes to suspend behavior may risk regressions
once the resume path is updated.

Additionally, simplify and clean up the suspend USBSTS CSS/SSS
handling code, improving readability and quirk handling for AMD
SNPS xHC controllers that occasionally do not clear the SSS bit.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-12-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: split core allocation and initialization

Separate allocation and initialization in the xHCI core:
* xhci_mem_init() now only handles memory allocation.
* xhci_init() now only handles initialization.

This split allows xhci_init() to be reused when resuming from S4
suspend-to-disk.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-11-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: move initialization for lifetime objects

Initialize objects that exist for the lifetime of the driver only once,
rather than repeatedly. These objects do not require re-initialization
after events such as S4 (suspend-to-disk).

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-10-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: move ring initialization

Move ring initialization from xhci_ring_alloc() to xhci_ring_init().
Call xhci_ring_init() after xhci_ring_alloc(); in the future,
it can also be used to re-initialize the ring during resume.

Additionally, remove xhci_dbg_trace() from xhci_mem_init(). The command
ring's first segment DMA address is now printed during the trace call in
xhci_ring_init().

This refactoring lays also the groundwork for eventually replacing:
* xhci_dbc_ring_init()
* xhci_clear_command_ring()

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-9-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: move reserving command ring trb

Move the command ring TRB reservation from xhci_mem_init() to xhci_init().

Function xhci_mem_init() is intended for memory allocation,
while xhci_init() is for initialization.

This split allows xhci_init() to be reused when resuming from S4
suspend-to-disk.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-8-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: factor out roothub bandwidth cleanup

Introduce xhci_rh_bw_cleanup() to release all bandwidth tracking
structures associated with xHCI roothub ports.

The new helper clears:
* TT bandwidth entries
* Per-interval endpoint lists

This refactors and consolidates the existing per-port cleanup logic
previously embedded in xhci_mem_cleanup(), reducing duplication and
making the teardown sequence easier to follow.

The helper will also be reused for upcoming S4 resume handling.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-7-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: xhci: relocate Restore/Controller error check

A Restore Error or Host Controller Error indicates that the host controller
failed to resume after suspend. In such cases, the xhci driver is fully
re-initialized, similar to a post-hibernation scenario.

The existing error check is only relevant when 'power_lost' is false.
If 'power_lost' is true, a Restore or Controller error has no effect:
no warning is printed and the 'power_lost' state remains unchanged.

Move the entire error check into the if '!power_lost' condition
to make this dependency explicit and simplify the resume logic.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-6-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>