git.ipfire.org Git - thirdparty/linux.git/log

KVM: x86: Add support for RDMSR/WRMSRNS w/ immediate on Intel

Add support for the immediate forms of RDMSR and WRMSRNS (currently
Intel-only).  The immediate variants are only valid in 64-bit mode, and
use a single general purpose register for the data (the register is also
encoded in the instruction, i.e. not implicit like regular RDMSR/WRMSR).

The immediate variants are primarily motivated by performance, not code
size: by having the MSR index in an immediate, it is available *much*
earlier in the CPU pipeline, which allows hardware much more leeway about
how a particular MSR is handled.

Intel VMX support for the immediate forms of MSR accesses communicates
exit information to the host as follows:

  1) The immediate form of RDMSR uses VM-Exit Reason 84.

  2) The immediate form of WRMSRNS uses VM-Exit Reason 85.

  3) For both VM-Exit reasons 84 and 85, the Exit Qualification field is
     set to the MSR index that triggered the VM-Exit.

  4) Bits 3 ~ 6 of the VM-Exit Instruction Information field are set to
     the register encoding used by the immediate form of the instruction,
     i.e. the destination register for RDMSR, and the source for WRMSRNS.

  5) The VM-Exit Instruction Length field records the size of the
     immediate form of the MSR instruction.

To deal with userspace RDMSR exits, stash the destination register in a
new kvm_vcpu_arch field, similar to cui_linear_rip, pio, etc.
Alternatively, the register could be saved in kvm_run.msr or re-retrieved
from the VMCS, but the former would require sanitizing the value to ensure
userspace doesn't clobber the value to an out-of-bounds index, and the
latter would require a new one-off kvm_x86_ops hook.

Don't bother adding support for the instructions in KVM's emulator, as the
only way for RDMSR/WRMSR to be encountered is if KVM is emulating large
swaths of code due to invalid guest state, and a vCPU cannot have invalid
guest state while in 64-bit mode.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
[sean: minor tweaks, massage and expand changelog]
Link: https://lore.kernel.org/r/20250805202224.1475590-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Rename handle_fastpath_set_msr_irqoff() to handle_fastpath_wrmsr()

Rename the WRMSR fastpath API to drop "irqoff", as that information is
redundant (the fastpath always runs with IRQs disabled), and to prepare
for adding a fastpath for the immediate variant of WRMSRNS.

No functional change intended.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
[sean: split to separate patch, write changelog]
Link: https://lore.kernel.org/r/20250805202224.1475590-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Rename local "ecx" variables to "msr" and "pmc" as appropriate

Rename "ecx" variables in {RD,WR}MSR and RDPMC helpers to "msr" and "pmc"
respectively, in anticipation of adding support for the immediate variants
of RDMSR and WRMSRNS, and to better document what the variables hold
(versus where the data originated).

No functional change intended.

Link: https://lore.kernel.org/r/20250805202224.1475590-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

x86/cpufeatures: Add a CPU feature bit for MSR immediate form instructions

The immediate form of MSR access instructions are primarily motivated
by performance, not code size: by having the MSR number in an immediate,
it is available *much* earlier in the pipeline, which allows the
hardware much more leeway about how a particular MSR is handled.

Use a scattered CPU feature bit for MSR immediate form instructions.

Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20250805202224.1475590-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Add a fastpath handler for INVD

Add a fastpath handler for INVD so that the common fastpath logic can be
trivially tested on both Intel and AMD. Under KVM, INVD is always:
(a) intercepted, (b) available to the guest, and (c) emulated as a nop,
with no side effects. Combined with INVD not having any inputs or outputs,
i.e. no register constraints, INVD is the perfect instruction for
exercising KVM's fastpath as it can be inserted into practically any
guest-side code stream.

Link: https://lore.kernel.org/r/20250805190526.1453366-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Push acquisition of SRCU in fastpath into kvm_pmu_trigger_event()

Acquire SRCU in the VM-Exit fastpath if and only if KVM needs to check the
PMU event filter, to further trim the amount of code that is executed with
SRCU protection in the fastpath.  Counter-intuitively, holding SRCU can do
more harm than good due to masking potential bugs, and introducing a new
SRCU-protected asset to code reachable via kvm_skip_emulated_instruction()
would be quite notable, i.e. definitely worth auditing.

E.g. the primary user of kvm->srcu is KVM's memslots, accessing memslots
all but guarantees guest memory may be accessed, accessing guest memory
can fault, and page faults might sleep, which isn't allowed while IRQs are
disabled.  Not acquiring SRCU means the (hypothetical) illegal sleep would
be flagged when running with PROVE_RCU=y, even if DEBUG_ATOMIC_SLEEP=n.

Note, performance is NOT a motivating factor, as SRCU lock/unlock only
adds ~15 cycles of latency to fastpath VM-Exits.  I.e. overhead isn't a
concern _if_ SRCU protection needs to be extended beyond PMU events, e.g.
to honor userspace MSR filters.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Rename check_pmu_event_filter() to pmc_is_event_allowed()

Rename check_pmu_event_filter() to make its polarity more obvious, and to
connect the dots to is_gp_event_allowed() and is_fixed_event_allowed().

No functional change intended.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Drop redundant check on PMC being locally enabled for emulation

Drop the check on a PMC being locally enabled when triggering emulated
events, as the bitmap of passed-in PMCs only contains locally enabled PMCs.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Drop redundant check on PMC being globally enabled for emulation

When triggering PMC events in response to emulation, drop the redundant
checks on a PMC being globally and locally enabled, as the passed in bitmap
contains only PMCs that are locally enabled (and counting the right event),
and the local copy of the bitmap has already been masked with global_ctrl.

No true functional change intended.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Open code pmc_event_is_allowed() in its callers

Open code pmc_event_is_allowed() in its callers, as kvm_pmu_trigger_event()
only needs to check the event filter (both global and local enables are
consulted outside of the loop).

No functional change intended.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Rename pmc_speculative_in_use() to pmc_is_locally_enabled()

Rename pmc_speculative_in_use() to pmc_is_locally_enabled() to better
capture what it actually tracks, and to show its relationship to
pmc_is_globally_enabled(). While neither AMD nor Intel refer to event
selectors or the fixed counter control MSR as "local", it's the obvious
name to pair with "global".

As for "speculative", there's absolutely nothing speculative about the
checks. E.g. for PMUs without PERF_GLOBAL_CTRL, from the guest's
perspective, the counters are "in use" without any qualifications.

No functional change intended.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Calculate set of to-be-emulated PMCs at time of WRMSRs

Calculate and track PMCs that are counting instructions/branches retired
when the PMC's event selector (or fixed counter control) is modified
instead evaluating the event selector on-demand.  Immediately recalc a
PMC's configuration on writes to avoid false negatives/positives when
KVM skips an emulated WRMSR, which is guaranteed to occur before the
main run loop processes KVM_REQ_PMU.

Out of an abundance of caution, and because it's relatively cheap, recalc
reprogrammed PMCs in kvm_pmu_handle_event() as well.  Recalculating in
response to KVM_REQ_PMU _should_ be unnecessary, but for now be paranoid
to avoid introducing easily-avoidable bugs in edge cases.  The code can be
removed in the future if necessary, e.g. in the unlikely event that the
overhead of recalculating to-be-emulated PMCs is noticeable.

Note!  Deliberately don't check the PMU event filters, as doing so could
result in KVM consuming stale information.

Tracking which PMCs are counting branches/instructions will allow grabbing
SRCU in the fastpath VM-Exit handlers if and only if a PMC event might be
triggered (to consult the event filters), and will also allow the upcoming
mediated PMU to do the right thing with respect to counting instructions
(the mediated PMU won't be able to update PMCs in the VM-Exit fastpath).

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Add wrappers for counting emulated instructions/branches

Add wrappers for triggering instruction retired and branch retired PMU
events in anticipation of reworking the internal mechanisms to track
which PMCs need to be evaluated, e.g. to avoid having to walk and check
every PMC.

Opportunistically bury "struct kvm_pmu_emulated_event_selectors" in pmu.c.

No functional change intended.

Link: https://lore.kernel.org/r/20250805190526.1453366-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/pmu: Move kvm_init_pmu_capability() to pmu.c

Move kvm_init_pmu_capability() to pmu.c so that future changes can access
variables that have no business being visible outside of pmu.c.
kvm_init_pmu_capability() is called once per module load, there's is zero
reason it needs to be inlined.

No functional change intended.

Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Fold WRMSR fastpath helpers into the main handler

Fold the per-MSR WRMSR fastpath helpers into the main handler now that the
IPI path in particular is relatively tiny. In addition to eliminating a
decent amount of boilerplate, this removes the ugly -errno/1/0 => bool
conversion (which is "necessitated" by kvm_x2apic_icr_write_fast()).

Opportunistically drop the comment about IPIs, as the purpose of the
fastpath is hopefully self-evident, and _if_ it needs more documentation,
the documentation (and rules!) should be placed in a more central location.

No functional change intended.

Link: https://lore.kernel.org/r/20250805190526.1453366-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Unconditionally grab data from EDX:EAX in WRMSR fastpath

Always grab EDX:EAX in the WRMSR fastpath to deduplicate and simplify the
case statements, and to prepare for handling immediate variants of WRMSRNS
in the fastpath (the data register is explicitly provided in that case).
There's no harm in reading the registers, as their values are always
available, i.e. don't require VMREADs (or similarly slow operations).

No real functional change intended.

Cc: Xin Li <xin@zytor.com>
Link: https://lore.kernel.org/r/20250805190526.1453366-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Acquire SRCU in WRMSR fastpath iff instruction needs to be skipped

Acquire SRCU in the WRMSR fastpath if and only if an instruction needs to
be skipped, i.e. only if the fastpath succeeds. The reasoning in commit
3f2739bd1e0b ("KVM: x86: Acquire SRCU read lock when handling fastpath MSR
writes") about "avoid having to play whack-a-mole" seems sound, but in
hindsight unconditionally acquiring SRCU does more harm than good.

While acquiring/releasing SRCU isn't slow per se, the things that are
_protected_ by kvm->srcu are generally safe to access only in the "slow"
VM-Exit path. E.g. accessing memslots in generic helpers is never safe,
because accessing guest memory with IRQs disabled is unless unsafe (except
when kvm_vcpu_read_guest_atomic() is used, but that API should never be
used in emulation helpers).

In other words, playing whack-a-mole is actually desirable in this case,
because every access to an asset protected by kvm->srcu warrants further
scrutiny.

Link: https://lore.kernel.org/r/20250805190526.1453366-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Unconditionally handle MSR_IA32_TSC_DEADLINE in fastpath exits

Drop the fastpath VM-Exit requirement that KVM can use the hypervisor
timer to emulate the APIC timer in TSC deadline mode.  I.e. unconditionally
handle MSR_IA32_TSC_DEADLINE WRMSRs in the fastpath.  Restricting the
fastpath to *maybe* using the VMX preemption timer is ineffective and
unnecessary.

If the requested deadline can't be programmed into the VMX preemption
timer, KVM will fall back to hrtimers, i.e. the restriction is ineffective
as far as preventing any kind of worst case scenario.

But guarding against a worst case scenario is completely unnecessary as
the "slow" path, start_sw_tscdeadline() => hrtimer_start(), explicitly
disables IRQs.  In fact, the worst case scenario is when KVM thinks it
can use the VMX preemption timer, as KVM will eat the overhead of calling
into vmx_set_hv_timer() and falling back to hrtimers.

Opportunistically limit kvm_can_use_hv_timer() to lapic.c as the fastpath
code was the only external user.

Stating the obvious, this allows handling MSR_IA32_TSC_DEADLINE writes in
the fastpath on AMD CPUs.

Link: https://lore.kernel.org/r/20250805190526.1453366-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Drop semi-arbitrary restrictions on IPI type in fastpath

Drop the restrictions on fastpath IPIs only working for fixed IRQs with a
physical destination now that the fastpath is explicitly limited to "fast"
delivery.  Limiting delivery to a single physical APIC ID guarantees only
one vCPU will receive the event, but that isn't necessary "fast", e.g. if
the targeted vCPU is the last of 4096 vCPUs.  And logical destination mode
or shorthand (to self) can also be fast, e.g. if only a few vCPUs are
being targeted.  Lastly, there's nothing inherently slow about delivering
an NMI, INIT, SIPI, SMI, etc., i.e. there's no reason to artificially
limit fastpath delivery to fixed vector IRQs.

Link: https://lore.kernel.org/r/20250805190526.1453366-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Only allow "fast" IPIs in fastpath WRMSR(X2APIC_ICR) handler

Explicitly restrict fastpath ICR writes to IPIs that are "fast", i.e. can
be delivered without having to walk all vCPUs, and that target at most 16
vCPUs. Artificially restricting ICR writes to physical mode guarantees
at most one vCPU will receive in IPI (because x2APIC IDs are read-only),
but that delivery might not be "fast". E.g. even if the vCPU exists, KVM
might have to iterate over 4096 vCPUs to find the right one.

Limiting delivery to fast IPIs aligns the WRMSR fastpath with
kvm_arch_set_irq_inatomic() (which also runs with IRQs disabled), and will
allow dropping the semi-arbitrary restrictions on delivery mode and type.

Link: https://lore.kernel.org/r/20250805190526.1453366-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Add kvm_icr_to_lapic_irq() helper to allow for fastpath IPIs

Extract the code for converting an ICR message into a kvm_lapic_irq
structure into a local helper so that a fast-only IPI path can share the
conversion logic.

No functional change intended.

Link: https://lore.kernel.org/r/20250805190526.1453366-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

kvm: x86: simplify kvm_vector_to_index()

Use find_nth_bit() and make the function almost a one-liner.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: allow CPUID 0xC000_0000 to proceed on Zhaoxin CPUs

Bypass the Centaur-only filter for the CPUID signature leaf so that
processing continues when the CPU vendor is Zhaoxin.

Signed-off-by: Ewan Hai <ewanhai-oc@zhaoxin.com>
Link: https://lore.kernel.org/r/20250818083034.93935-1-ewanhai-oc@zhaoxin.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

arch/x86/kvm/ioapic: Remove license boilerplate with bad FSF address

The Free Software Foundation does not reside in "59 Temple Place"
anymore, so we should not mention that address in the source code here.
But instead of updating the address to their current location, let's
rather drop the license boilerplate text here and use a proper SPDX
license identifier instead. The text talks about the "GNU *Lesser*
General Public License" and "any later version", so LGPL-2.1+ is the
right choice here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20250728152843.310260-1-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: SVM: Skip fastpath emulation on VM-Exit if next RIP isn't valid

Skip the WRMSR and HLT fastpaths in SVM's VM-Exit handler if the next RIP
isn't valid, e.g. because KVM is running with nrips=false.  SVM must
decode and emulate to skip the instruction if the CPU doesn't provide the
next RIP, and getting the instruction bytes to decode requires reading
guest memory.  Reading guest memory through the emulator can fault, i.e.
can sleep, which is disallowed since the fastpath handlers run with IRQs
disabled.

BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:106
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 32611, name: qemu
preempt_count: 1, expected: 0
INFO: lockdep is turned off.
irq event stamp: 30580
hardirqs last  enabled at (30579): [<ffffffffc08b2527>] vcpu_run+0x1787/0x1db0 [kvm]
hardirqs last disabled at (30580): [<ffffffffb4f62e32>] __schedule+0x1e2/0xed0
softirqs last  enabled at (30570): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210
softirqs last disabled at (30568): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210
CPU: 298 UID: 0 PID: 32611 Comm: qemu Tainted: G     U              6.16.0-smp--e6c618b51cfe-sleep #782 NONE
Tainted: [U]=USER
Hardware name: Google Astoria-Turin/astoria, BIOS 0.20241223.2-0 01/17/2025
Call Trace:
  <TASK>
  dump_stack_lvl+0x7d/0xb0
  __might_resched+0x271/0x290
  __might_fault+0x28/0x80
  kvm_vcpu_read_guest_page+0x8d/0xc0 [kvm]
  kvm_fetch_guest_virt+0x92/0xc0 [kvm]
  __do_insn_fetch_bytes+0xf3/0x1e0 [kvm]
  x86_decode_insn+0xd1/0x1010 [kvm]
  x86_emulate_instruction+0x105/0x810 [kvm]
  __svm_skip_emulated_instruction+0xc4/0x140 [kvm_amd]
  handle_fastpath_invd+0xc4/0x1a0 [kvm]
  vcpu_run+0x11a1/0x1db0 [kvm]
  kvm_arch_vcpu_ioctl_run+0x5cc/0x730 [kvm]
  kvm_vcpu_ioctl+0x578/0x6a0 [kvm]
  __se_sys_ioctl+0x6d/0xb0
  do_syscall_64+0x8a/0x2c0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f479d57a94b
  </TASK>

Note, this is essentially a reapply of commit 5c30e8101e8d ("KVM: SVM:
Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"), but with
different justification (KVM now grabs SRCU when skipping the instruction
for other reasons).

Fixes: b439eb8ab578 ("Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250805190526.1453366-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: SVM: Emulate PERF_CNTR_GLOBAL_STATUS_SET for PerfMonV2

Emulate PERF_CNTR_GLOBAL_STATUS_SET when PerfMonV2 is enumerated to the
guest, as the MSR is supposed to exist in all AMD v2 PMUs.

Fixes: 4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
Cc: stable@vger.kernel.org
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20250711172746.1579423-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Don't (re)check L1 intercepts when completing userspace I/O

When completing emulation of instruction that generated a userspace exit
for I/O, don't recheck L1 intercepts as KVM has already finished that
phase of instruction execution, i.e. has already committed to allowing L2
to perform I/O.  If L1 (or host userspace) modifies the I/O permission
bitmaps during the exit to userspace,  KVM will treat the access as being
intercepted despite already having emulated the I/O access.

Pivot on EMULTYPE_NO_DECODE to detect that KVM is completing emulation.
Of the three users of EMULTYPE_NO_DECODE, only complete_emulated_io() (the
intended "recipient") can reach the code in question.  gp_interception()'s
use is mutually exclusive with is_guest_mode(), and
complete_emulated_insn_gp() unconditionally pairs EMULTYPE_NO_DECODE with
EMULTYPE_SKIP.

The bad behavior was detected by a syzkaller program that toggles port I/O
interception during the userspace I/O exit, ultimately resulting in a WARN
on vcpu->arch.pio.count being non-zero due to KVM no completing emulation
of the I/O instruction.

  WARNING: CPU: 23 PID: 1083 at arch/x86/kvm/x86.c:8039 emulator_pio_in_out+0x154/0x170 [kvm]
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 23 UID: 1000 PID: 1083 Comm: repro Not tainted 6.16.0-rc5-c1610d2d66b1-next-vm #74 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:emulator_pio_in_out+0x154/0x170 [kvm]
  PKRU: 55555554
  Call Trace:
   <TASK>
   kvm_fast_pio+0xd6/0x1d0 [kvm]
   vmx_handle_exit+0x149/0x610 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xda8/0x1ac0 [kvm]
   kvm_vcpu_ioctl+0x244/0x8c0 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0x5d/0xc60
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>

Reported-by: syzbot+cc2032ba16cc2018ca25@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68790db4.a00a0220.3af5df.0020.GAE@google.com
Fixes: 8a76d7f25f8f ("KVM: x86: Add x86 callback for intercept check")
Cc: stable@vger.kernel.org
Cc: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20250715190638.1899116-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

Linux 6.17-rc2

Merge tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Remove a transitional asm/cpuid.h header which was added only as a
   fallback during cpuid helpers reorg

- Initialize reserved fields in the SVSM page validation calls
   structure to zero in order to allow for future structure extensions

- Have the sev-guest driver's buffers used in encryption operations be
   in linear mapping space as the encryption operation can be offloaded
   to an accelerator

- Have a read-only MSR write when in an AMD SNP guest trap to the
   hypervisor as it is usually done. This makes the guest user
   experience better by simply raising a #GP instead of terminating said
   guest

- Do not output AVX512 elapsed time for kernel threads because the data
   is wrong and fix a NULL pointer dereferencing in the process

- Adjust the SRSO mitigation selection to the new attack vectors

* tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpuid: Remove transitional <asm/cpuid.h> header
  x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero
  virt: sev-guest: Satisfy linear mapping requirement in get_derived_key()
  x86/sev: Improve handling of writes to intercepted TSC MSRs
  x86/fpu: Fix NULL dereference in avx512_status()
  x86/bugs: Select best SRSO mitigation

Merge tag 'locking_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Borislav Petkov:

- Make sure sanity checks down in the mutex lock path happen on the
   correct type of task so that they don't trigger falsely

- Use the write unsafe user access pairs when writing a futex value to
   prevent an error on PowerPC which does user read and write accesses
   differently

* tag 'locking_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path
  futex: Use user_write_access_begin/_end() in futex_put_value()

Merge tag 'rust-fixes-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rust fixes from Miguel Ojeda:

- Workaround 'rustdoc' target modifiers bug in Rust >= 1.88.0. It will
   be fixed in Rust 1.90.0 (expected 2025-09-18).

- Clean 'rustdoc' output before running it to avoid confusing the tool
   when files from previous versions remain.

* tag 'rust-fixes-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: kbuild: clean output before running `rustdoc`
  rust: workaround `rustdoc` target modifiers bug

Merge tag 'ata-ata-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fixes from Damien Le Moal:

- Fix a regression affecting old IDE/PATA device scan and introduced by
   the recent link power management cleanups & fixes. The regression
   prevented devices from being properly detected (me)

- Fix command duration limits (CDL) feature control: attempting to
   enable the feature while NCQ commands are being executed resulted in
   a silent failure to enable CDL when needed (Igor)

* tag 'ata-ata-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-scsi: Fix CDL control
  ata: libata-eh: Fix link state check for IDE/PATA ports

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"One core change removing the 'w' access flag of attributes that don't
  have a set routine (and therefore can't be written to) which should
  have no practical impact. The big scsi_debug update is caused by
  reformatting lots of arrays and the rest of the bug fixes in drivers
  are trivial"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Remove error print for devm_add_action_or_reset()
  scsi: ufs: mediatek: Fix out-of-bounds access in MCQ IRQ mapping
  scsi: lpfc: Remove redundant assignment to avoid memory leak
  scsi: lpfc: Fix wrong function reference in a comment
  scsi: ufs: core: Fix interrupt handling for MCQ Mode
  scsi: scsi_debug: Make read-only arrays static const
  scsi: core: sysfs: Correct sysfs attributes access rights

Merge tag 'drm-fixes-2025-08-16' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Relatively quiet week, usual amdgpu/i915/xe fixes along with a set of
  fixes for fbdev format info, which fix some regressions seen in with
  rc1.

  bridge:
   - fix OF-node leak
   - fix documentation

  fbdev-emulation:
   - pass correct format info to drm_helper_mode_fill_fb_struct()

  panfrost:
   - print correct RSS size

  amdgpu:
   - PSP fix
   - VRAM reservation fix
   - CSA fix
   - Process kill fix

  i915:
   - Fix the implementation of wa_18038517565 [fbc]
   - Do not trigger Frame Change events from frontbuffer flush [psr]

  xe:
   - Some more xe_migrate_access_memory fixes (Auld)
   - Defer buffer object shrinker write-backs and GPU waits (Thomas)
   - HWMON fix for clamping limits (Karthik)
   - SRIOV-PF: Set VF LMEM BAR size (Michal)"

* tag 'drm-fixes-2025-08-16' of https://gitlab.freedesktop.org/drm/kernel:
  drm/xe/pf: Set VF LMEM BAR size
  drm/amdgpu: fix task hang from failed job submission during process kill
  drm/amdgpu: fix incorrect vm flags to map bo
  drm/amdgpu: fix vram reservation issue
  drm/amdgpu: Add PSP fw version check for fw reserve GFX command
  drm/xe/hwmon: Add SW clamp for power limits writes
  drm/xe: Defer buffer object shrinker write-backs and GPU waits
  drm/xe/migrate: prevent potential UAF
  drm/xe/migrate: don't overflow max copy size
  drm/xe/migrate: prevent infinite recursion
  drm/i915/psr: Do not trigger Frame Change events from frontbuffer flush
  drm/i915/fbc: fix the implementation of wa_18038517565
  drm/panfrost: Print RSS for tiler heap BO's in debugfs GEMS file
  drm/radeon: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct()
  drm/nouveau: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct()
  drm/omap: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct()
  drm/bridge: document HDMI CEC callbacks
  drm/bridge: Describe the newly introduced drm_connector parameter for drm_bridge_detect
  drm/bridge: fix OF node leak

Merge tag 'xfs-fixes-6.17-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:

- Fix an assert trigger introduced during the merge window

- Prevent atomic writes to be used with DAX

- Prevent users from using the max_atomic_write mount option without
   reflink, as atomic writes > 1block are not supported without reflink

- Fix a null-pointer-deref in a tracepoint

* tag 'xfs-fixes-6.17-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: split xfs_zone_record_blocks
  xfs: fix scrub trace with null pointer in quotacheck
  xfs: reject max_atomic_write mount option for no reflink
  xfs: disallow atomic writes on DAX
  fs/dax: Reject IOCB_ATOMIC in dax_iomap_rw()
  xfs: remove XFS_IBULK_SAME_AG
  xfs: fully decouple XFS_IBULK* flags from XFS_IWALK* flags
  xfs: fix frozen file system assert in xfs_trans_alloc

Merge tag 'block-6.17-20250815' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- Fix for unprivileged daemons in ublk

- Speedup ublk release by removing unnecessary quiesce

- Fix for blk-wbt, where a regression caused it to not be possible to
   enable at runtime

- blk-wbt cleanups

- Kill the page pool from drbd

- Remove redundant __GFP_NOWARN uses in a few spots

- Fix for a kobject double initialization issues

* tag 'block-6.17-20250815' of git://git.kernel.dk/linux:
  block: restore default wbt enablement
  Docs: admin-guide: Correct spelling mistake
  blk-wbt: doc: Update the doc of the wbt_lat_usec interface
  blk-wbt: Eliminate ambiguity in the comments of struct rq_wb
  blk-wbt: Optimize wbt_done() for non-throttled writes
  block: fix kobject double initialization in add_disk
  blk-cgroup: remove redundant __GFP_NOWARN
  block, bfq: remove redundant __GFP_NOWARN
  ublk: check for unprivileged daemon on each I/O fetch
  ublk: don't quiesce in ublk_ch_release
  drbd: Remove the open-coded page pool

x86/cpuid: Remove transitional <asm/cpuid.h> header

All CPUID call sites were updated at commit:

968e30006807 ("x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header")

to include <asm/cpuid/api.h> instead of <asm/cpuid.h>.

The <asm/cpuid.h> header was still retained as a wrapper, just in case
some new code in -next started using it. Now that everything is merged
to Linus' tree, remove the header.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250815070227.19981-2-darwi@linutronix.de

x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero

In order to support future versions of the SVSM_CORE_PVALIDATE call, all
reserved fields within a PVALIDATE entry must be set to zero as an SVSM should
be ensuring all reserved fields are zero in order to support future usage of
reserved areas based on the protocol version.

Fixes: fcd042e86422 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/7cde412f8b057ea13a646fb166b1ca023f6a5031.1755098819.git.thomas.lendacky@amd.com

virt: sev-guest: Satisfy linear mapping requirement in get_derived_key()

Commit

7ffeb2fc2670 ("x86/sev: Document requirement for linear mapping of guest request buffers")

added a check that requires the guest request buffers to be in the linear
mapping. The get_derived_key() function was passing a buffer that was
allocated on the stack, resulting in the call to snp_send_guest_request()
returning an error.

Update the get_derived_key() function to use an allocated buffer instead
of a stack buffer.

Fixes: 7ffeb2fc2670 ("x86/sev: Document requirement for linear mapping of guest request buffers")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/9b764ca9fc79199a091aac684c4926e2080ca7a8.1752698495.git.thomas.lendacky@amd.com

Merge tag 'io_uring-6.17-20250815' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

- Tweak for the fairly recent changes of minimizing io-wq worker
   creations when it's pointless to create them.

- Fix for an issue with ring provided buffers, which could cause issues
   with reuse or corrupt application data.

* tag 'io_uring-6.17-20250815' of git://git.kernel.dk/linux:
  io_uring/io-wq: add check free worker before create new worker
  io_uring/net: commit partial buffers on retry

Merge tag 'sound-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small fixes:

   - Potential OOB access fixes in USB-audio driver

   - ASoC kconfig menu fix for improving the generic drivers

   - HD-audio quirks and a fix revert

   - Codec and platform-specific small fixes for ASoC"

* tag 'sound-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/tas2781: Normalize the volume kcontrol name
  ALSA: usb-audio: Validate UAC3 cluster segment descriptors
  ALSA: usb-audio: Validate UAC3 power domain descriptors, too
  Revert "ALSA: hda: Add ASRock X670E Taichi to denylist"
  ALSA: azt3328: Put __maybe_unused for inline functions for gameport
  ASoC: tas2781: Normalize the volume kcontrol name
  ASoC: stm: stm32_i2s: Fix calc_clk_div() error handling in determine_rate()
  ASoC: codecs: Call strscpy() with correct size argument
  ALSA: hda/realtek: Fix headset mic on HONOR BRB-X
  ALSA: hda/realtek: Add Framework Laptop 13 (AMD Ryzen AI 300) to quirks
  ASoC: tas2781: Fix spelling mistake "dismatch" -> "mismatch"
  ASoC: rt1320: fix random cycle mute issue
  ASoC: rt721: fix FU33 Boost Volume control not working
  ASoC: generic: tidyup standardized ASoC menu for generic
  ASoC: codec: sma1307: replace spelling mistake with new error message
  ASoC: codecs: tx-macro: correct tx_macro_component_drv name
  ASoC: fsl_sai: replace regmap_write with regmap_update_bits

Merge tag 'gpio-fixes-for-v6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:

- fix the way optional interrupts are retrieved from firmware in
   gpio-mlxbf3

* tag 'gpio-fixes-for-v6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mlxbf3: use platform_get_irq_optional()
  Revert "gpio: mlxbf3: only get IRQ for device instance 0"

Merge tag 'pmdomain-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain fix from Ulf Hansson:

- tegra: Ensure pmc power-domains are in a known state

* tag 'pmdomain-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
soc/tegra: pmc: Ensure power-domains are in a known state

Merge tag '6.17-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- Fix unlink race and rename races

- SMB3.1.1 compression fix

- Avoid unneeded strlen calls in cifs_get_spnego_key

- Fix slab out of bounds in parse_server_interfaces()

- Fix mid leak and server buffer leak

- smbdirect send error path fix

- update internal version #

- Fix unneeded response time update in negotiate protocol

* tag '6.17-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: remove redundant lstrp update in negotiate protocol
  cifs: update internal version number
  smb: client: don't wait for info->send_pending == 0 on error
  smb: client: fix mid_q_entry memleak leak with per-mid locking
  smb3: fix for slab out of bounds on mount to ksmbd
  cifs: avoid extra calls to strlen() in cifs_get_spnego_key()
  cifs: Fix collect_sample() to handle any iterator type
  smb: client: fix race with concurrent opens in rename(2)
  smb: client: fix race with concurrent opens in unlink(2)

Merge tag 'firewire-fixes-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394

Pull firewire fixes from Takashi Sakamoto:
"This fixes a potential call to schedule() within an RCU read-side
  critical section. The solution applies reference counting to ensure
  that handlers which may call schedule() are invoked safely outside of
  the critical section"

* tag 'firewire-fixes-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: core: reallocate buffer for FCP address handlers when more than 4 are registered
  firewire: core: call FCP address handlers outside RCU read-side critical section
  firewire: core: call handler for exclusive regions outside RCU read-side critical section
  firewire: core: use reference counting to invoke address handlers safely

Merge tag 'drm-xe-fixes-2025-08-14' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Some more xe_migrate_access_memory fixes (Auld)
- Defer buffer object shrinker write-backs and GPU waits (Thomas)
- HWMON fix for clamping limits (Karthik)
- SRIOV-PF: Set VF LMEM BAR size (Michal)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aJ4MIZQurSo0uNxn@intel.com

Merge tag 'drm-intel-fixes-2025-08-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Fix the implementation of wa_18038517565 [fbc] (Vinod Govindapillai)
- Do not trigger Frame Change events from frontbuffer flush [psr] (Jouni Högander)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://lore.kernel.org/r/aJ0HAh06VHWVdv63@linux

Merge tag 'acpi-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"These restore corner case behavior of the EC driver related to the
  handling of defective ACPI tables and fix a recent regression in the
  ACPI processor driver:

   - Prevent the ACPI EC driver from ignoring ECDT information in the
     cases when the ID string in the ECDT is invalid, but not empty, to
     fix thouchpad detection on ThinkBook 14 G7 IML (Armin Wolf)

   - Rearrange checks in acpi_processor_ppc_init() to restore the
     handling of frequency QoS requests related to _PPC limits
     inadvertently broken by a recent update (Rafael Wysocki)"

* tag 'acpi-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: EC: Relax sanity check of the ECDT ID string
  ACPI: processor: perflib: Move problematic pr->performance check

Merge tag 'pm-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These remove an artificial limitation from the intel_idle driver,
  update the menu cpuidle governor to restore its previous behavior in a
  corner case and add one more supported platform configuration to the
  intel_pstate driver:

   - Allow intel_idle to use _CST information from ACPI tables for idle
     states enumeration on any family of processors (Len Brown)

   - Restore corner case behavior of the menu cpuidle governor, related
     to the handling of systems where idle states selected by the
     governor are rejected by the cpuidle driver, inadvertently changed
     during the 6.15 development cycle (Rafael Wysocki)

   - Add support for Clearwater Forest in the out-of-band (OOB) mode to
     the intel_pstate driver (Srinivas Pandruvada)"

* tag 'pm-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Support Clearwater Forest OOB mode
  cpuidle: governors: menu: Avoid using invalid recent intervals data
  intel_idle: Allow loading ACPI tables for any family

drm/xe/pf: Set VF LMEM BAR size

LMEM is partitioned between multiple VFs and we expect that the more
VFs we have, the less LMEM is assigned to each VF.
This means that we can achieve full LMEM BAR access without the need to
attempt full VF LMEM BAR resize via pci_resize_resource().

Always try to set the largest possible BAR size that allows to fit the
number of enabled VFs and inform the user in case the resize attempt is
not successful.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250527120637.665506-7-michal.winiarski@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 32a4d1b98e6663101fd0abfaf151c48feea7abb1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from Netfilter and IPsec.

  Current release - regressions:

   - netfilter: nft_set_pipapo:
      - don't return bogus extension pointer
      - fix null deref for empty set

  Current release - new code bugs:

   - core: prevent deadlocks when enabling NAPIs with mixed kthread
     config

   - eth: netdevsim: Fix wild pointer access in nsim_queue_free().

  Previous releases - regressions:

   - page_pool: allow enabling recycling late, fix false positive
     warning

   - sched: ets: use old 'nbands' while purging unused classes

   - xfrm:
      - restore GSO for SW crypto
      - bring back device check in validate_xmit_xfrm

   - tls: handle data disappearing from under the TLS ULP

   - ptp: prevent possible ABBA deadlock in ptp_clock_freerun()

   - eth:
      - bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
      - hv_netvsc: fix panic during namespace deletion with VF

  Previous releases - always broken:

   - netfilter: fix refcount leak on table dump

   - vsock: do not allow binding to VMADDR_PORT_ANY

   - sctp: linearize cloned gso packets in sctp_rcv

   - eth:
      - hibmcge: fix the division by zero issue
      - microchip: fix KSZ8863 reset problem"

* tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  net: usb: asix_devices: add phy_mask for ax88772 mdio bus
  net: kcm: Fix race condition in kcm_unattach()
  selftests: net/forwarding: test purge of active DWRR classes
  net/sched: ets: use old 'nbands' while purging unused classes
  bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
  netdevsim: Fix wild pointer access in nsim_queue_free().
  net: mctp: Fix bad kfree_skb in bind lookup test
  netfilter: nf_tables: reject duplicate device on updates
  ipvs: Fix estimator kthreads preferred affinity
  netfilter: nft_set_pipapo: fix null deref for empty set
  selftests: tls: test TCP stealing data from under the TLS socket
  tls: handle data disappearing from under the TLS ULP
  ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
  ixgbe: prevent from unwanted interface name changes
  devlink: let driver opt out of automatic phys_port_name generation
  net: prevent deadlocks when enabling NAPIs with mixed kthread config
  net: update NAPI threaded config even for disabled NAPIs
  selftests: drv-net: don't assume device has only 2 queues
  docs: Fix name for net.ipv4.udp_child_hash_entries
  riscv: dts: thead: Add APB clocks for TH1520 GMACs
  ...

Merge branches 'acpi-ec' and 'acpi-processor'

* acpi-ec:
ACPI: EC: Relax sanity check of the ECDT ID string

* acpi-processor:
ACPI: processor: perflib: Move problematic pr->performance check

Merge branches 'pm-cpuidle' and 'pm-cpufreq'

* pm-cpuidle:
  cpuidle: governors: menu: Avoid using invalid recent intervals data
  intel_idle: Allow loading ACPI tables for any family

* pm-cpufreq:
  cpufreq: intel_pstate: Support Clearwater Forest OOB mode

ata: libata-scsi: Fix CDL control

Delete extra checks for the ATA_DFLAG_CDL_ENABLED flag that prevent
SET FEATURES command from being issued to a drive when NCQ commands
are active.

ata_mselect_control_ata_feature() sets / clears the ATA_DFLAG_CDL_ENABLED
flag during the translation of MODE SELECT to SET FEATURES. If SET FEATURES
gets deferred due to outstanding NCQ commands, the original MODE SELECT
command will be re-queued. When the re-queued MODE SELECT goes through
the ata_mselect_control_ata_feature() translation again, SET FEATURES
will not be issued because ATA_DFLAG_CDL_ENABLED has been already set or
cleared by the initial translation of MODE SELECT.

The ATA_DFLAG_CDL_ENABLED checks in ata_mselect_control_ata_feature()
are safe to remove because scsi_cdl_enable() implements a similar logic
that avoids enabling CDL if it has been enabled already.

Fixes: 17e897a45675 ("ata: libata-scsi: Improve CDL control")
Cc: stable@vger.kernel.org
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

ata: libata-eh: Fix link state check for IDE/PATA ports

Commit 4371fe1ba400 ("ata: libata-eh: Avoid unnecessary resets when
revalidating devices") replaced the call to ata_phys_link_offline() in
ata_eh_revalidate_and_attach() with the new function
ata_eh_link_established() which relaxes the checks on a device link
state to account for low power mode transitions. However, this change
assumed that the device port has a valid scr_read method to obtain the
SStatus register for the port. This is not always the case, especially
with older IDE/PATA adapters (e.g. PATA/IDE devices emulated with QEMU).
For such adapter, ata_eh_link_established() will always return false,
causing ata_eh_revalidate_and_attach() to go into its error path and
ultimately to the device being disabled.

Avoid this by restoring the previous behavior, which is to assume that
the link is online if reading the port SStatus register fails.

While at it, also fix the spelling of SStatus in the comment describing
the function ata_eh_link_established().

Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 4371fe1ba400 ("ata: libata-eh: Avoid unnecessary resets when revalidating devices")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>

ALSA: hda/tas2781: Normalize the volume kcontrol name

Change the name of the kcontrol from "Gain" to "Volume".

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20250813100842.12224-1-baojun.xu@ti.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: usb-audio: Validate UAC3 cluster segment descriptors

UAC3 class segment descriptors need to be verified whether their sizes
match with the declared lengths and whether they fit with the
allocated buffer sizes, too. Otherwise malicious firmware may lead to
the unexpected OOB accesses.

Fixes: 11785ef53228 ("ALSA: usb-audio: Initial Power Domain support")
Reported-and-tested-by: Youngjun Lee <yjjuny.lee@samsung.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250814081245.8902-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: usb-audio: Validate UAC3 power domain descriptors, too

UAC3 power domain descriptors need to be verified with its variable
bLength for avoiding the unexpected OOB accesses by malicious
firmware, too.

Fixes: 9a2fe9b801f5 ("ALSA: usb: initial USB Audio Device Class 3.0 support")
Reported-and-tested-by: Youngjun Lee <yjjuny.lee@samsung.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250814081245.8902-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>

net: usb: asix_devices: add phy_mask for ax88772 mdio bus

Without setting phy_mask for ax88772 mdio bus, current driver may create
at most 32 mdio phy devices with phy address range from 0x00 ~ 0x1f.
DLink DUB-E100 H/W Ver B1 is such a device. However, only one main phy
device will bind to net phy driver. This is creating issue during system
suspend/resume since phy_polling_mode() in phy_state_machine() will
directly deference member of phydev->drv for non-main phy devices. Then
NULL pointer dereference issue will occur. Due to only external phy or
internal phy is necessary, add phy_mask for ax88772 mdio bus to workarnoud
the issue.

Closes: https://lore.kernel.org/netdev/20250806082931.3289134-1-xu.yang_2@nxp.com
Fixes: e532a096be0e ("net: usb: asix: ax88772: add phylib support")
Cc: stable@vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250811092931.860333-1-xu.yang_2@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'asoc-fix-v6.17-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.17

A reasonably small collection of fixes that came in since the merge
window, mostly small and driver specific plus a cleanup of the menu
reorganisation to address some user confusion with the way the generic
drivers had been handled.

Merge tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:

- MAINTAINERS: Remove bouncing kprobes maintainer

* tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: Remove bouncing kprobes maintainer

MAINTAINERS: Remove bouncing kprobes maintainer

The kprobes MAINTAINERS entry includes anil.s.keshavamurthy@intel.com.
That address is bouncing. Remove it.

This still leaves three other listed maintainers.

Link: https://lore.kernel.org/all/20250808180124.7DDE2ECD@davehans-spike.ostc.intel.com/
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: linux-trace-kernel@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

net: kcm: Fix race condition in kcm_unattach()

syzbot found a race condition when kcm_unattach(psock)
and kcm_release(kcm) are executed at the same time.

kcm_unattach() is missing a check of the flag
kcm->tx_stopped before calling queue_work().

If the kcm has a reserved psock, kcm_unattach() might get executed
between cancel_work_sync() and unreserve_psock() in kcm_release(),
requeuing kcm->tx_work right before kcm gets freed in kcm_done().

Remove kcm->tx_stopped and replace it by the less
error-prone disable_work_sync().

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+e62c9db591c30e174662@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e62c9db591c30e174662
Reported-by: syzbot+d199b52665b6c3069b94@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d199b52665b6c3069b94
Reported-by: syzbot+be6b1fdfeae512726b4e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=be6b1fdfeae512726b4e
Signed-off-by: Sven Stegemann <sven@stegemann.de>
Link: https://patch.msgid.link/20250812191810.27777-1-sven@stegemann.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'ets-use-old-nbands-while-purging-unused-classes'

Davide Caratti says:

====================
ets: use old 'nbands' while purging unused classes

- patch 1/2 fixes a NULL dereference in the control path of sch_ets qdisc
- patch 2/2 extends kselftests to verify effectiveness of the above fix
====================

Link: https://patch.msgid.link/cover.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net/forwarding: test purge of active DWRR classes

Extend sch_ets.sh to add a reproducer for problematic list deletions when
active DWRR class are purged by ets_qdisc_change() [1] [2].

[1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
[2] https://lore.kernel.org/netdev/f3b9bacc73145f265c19ab80785933da5b7cbdec.1754581577.git.dcaratti@redhat.com/

Suggested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/489497cb781af7389011ca1591fb702a7391f5e7.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/sched: ets: use old 'nbands' while purging unused classes

Shuang reported sch_ets test-case [1] crashing in ets_class_qlen_notify()
after recent changes from Lion [2]. The problem is: in ets_qdisc_change()
we purge unused DWRR queues; the value of 'q->nbands' is the new one, and
the cleanup should be done with the old one. The problem is here since my
first attempts to fix ets_qdisc_change(), but it surfaced again after the
recent qdisc len accounting fixes. Fix it purging idle DWRR queues before
assigning a new value of 'q->nbands', so that all purge operations find a
consistent configuration:

- old 'q->nbands' because it's needed by ets_class_find()
- old 'q->nstrict' because it's needed by ets_class_is_strict()

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 62 UID: 0 PID: 39457 Comm: tc Kdump: loaded Not tainted 6.12.0-116.el10.x86_64 #1 PREEMPT(voluntary)
Hardware name: Dell Inc. PowerEdge R640/06DKY5, BIOS 2.12.2 07/09/2021
RIP: 0010:__list_del_entry_valid_or_report+0x4/0x80
Code: ff 4c 39 c7 0f 84 39 19 8e ff b8 01 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 0f 84 56 19 8e ff 48 85 c9 0f 84 ab
RSP: 0018:ffffba186009f400 EFLAGS: 00010202
RAX: 00000000000000d6 RBX: 0000000000000000 RCX: 0000000000000004
RDX: ffff9f0fa29b69c0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffffc12c2400 R08: 0000000000000008 R09: 0000000000000004
R10: ffffffffffffffff R11: 0000000000000004 R12: 0000000000000000
R13: ffff9f0f8cfe0000 R14: 0000000000100005 R15: 0000000000000000
FS:  00007f2154f37480(0000) GS:ffff9f269c1c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001530be001 CR4: 00000000007726f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
  <TASK>
  ets_class_qlen_notify+0x65/0x90 [sch_ets]
  qdisc_tree_reduce_backlog+0x74/0x110
  ets_qdisc_change+0x630/0xa40 [sch_ets]
  __tc_modify_qdisc.constprop.0+0x216/0x7f0
  tc_modify_qdisc+0x7c/0x120
  rtnetlink_rcv_msg+0x145/0x3f0
  netlink_rcv_skb+0x53/0x100
  netlink_unicast+0x245/0x390
  netlink_sendmsg+0x21b/0x470
  ____sys_sendmsg+0x39d/0x3d0
  ___sys_sendmsg+0x9a/0xe0
  __sys_sendmsg+0x7a/0xd0
  do_syscall_64+0x7d/0x160
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f2155114084
Code: 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d 25 f0 0c 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
RSP: 002b:00007fff1fd7a988 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000560ec063e5e0 RCX: 00007f2155114084
RDX: 0000000000000000 RSI: 00007fff1fd7a9f0 RDI: 0000000000000003
RBP: 00007fff1fd7aa60 R08: 0000000000000010 R09: 000000000000003f
R10: 0000560ee9b3a010 R11: 0000000000000202 R12: 00007fff1fd7aae0
R13: 000000006891ccde R14: 0000560ec063e5e0 R15: 00007fff1fd7aad0
  </TASK>

[1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
[2] https://lore.kernel.org/netdev/d912cbd7-193b-4269-9857-525bee8bbb6a@gmail.com/

Cc: stable@vger.kernel.org
Fixes: 103406b38c60 ("net/sched: Always pass notifications when child class becomes empty")
Fixes: c062f2a0b04d ("net/sched: sch_ets: don't remove idle classes from the round-robin list")
Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
Reported-by: Li Shuang <shuali@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-108026
Reviewed-by: Petr Machata <petrm@nvidia.com>
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/7928ff6d17db47a2ae7cc205c44777b1f1950545.1755016081.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
ixgbe: bypass devlink phys_port_name generation

Jedrzej adds option to skip phys_port_name generation and opts
ixgbe into it as some configurations rely on pre-devlink naming
which could end up broken as a result.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: prevent from unwanted interface name changes
devlink: let driver opt out of automatic phys_port_name generation
====================

Link: https://patch.msgid.link/20250812205226.1984369-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE

The data page pool always fills the HW rx ring with pages. On arm64 with
64K pages, this will waste _at least_ 32K of memory per entry in the rx
ring.

Fix by fragmenting the pages if PAGE_SIZE > BNXT_RX_PAGE_SIZE. This
makes the data page pool the same as the header pool.

Tested with iperf3 with a small (64 entries) rx ring to encourage buffer
circulation.

Fixes: cd1fafe7da1f ("eth: bnxt: add support rx side device memory TCP")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250812182907.1540755-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netdevsim: Fix wild pointer access in nsim_queue_free().

syzbot reported the splat below. [0]

When nsim_queue_uninit() is called from nsim_init_netdevsim(),
register_netdevice() has not been called, thus dev->dstats has
not been allocated.

Let's not call dev_dstats_rx_dropped_add() in such a case.

[0]
BUG: unable to handle page fault for address: ffff88809782c020
PF: supervisor write access in kernel mode
PF: error_code(0x0002) - not-present page
PGD 1b401067 P4D 1b401067 PUD 0
Oops: Oops: 0002 [#1] SMP KASAN NOPTI
CPU: 3 UID: 0 PID: 8476 Comm: syz.1.251 Not tainted 6.16.0-syzkaller-06699-ge8d780dcd957 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:local_add arch/x86/include/asm/local.h:33 [inline]
RIP: 0010:u64_stats_add include/linux/u64_stats_sync.h:89 [inline]
RIP: 0010:dev_dstats_rx_dropped_add include/linux/netdevice.h:3027 [inline]
RIP: 0010:nsim_queue_free+0xba/0x120 drivers/net/netdevsim/netdev.c:714
Code: 07 77 6c 4a 8d 3c ed 20 7e f1 8d 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 46 4a 03 1c ed 20 7e f1 8d <4c> 01 63 20 be 00 02 00 00 48 8d 3d 00 00 00 00 e8 61 2f 58 fa 48
RSP: 0018:ffffc900044af150 EFLAGS: 00010286
RAX: dffffc0000000000 RBX: ffff88809782c000 RCX: 00000000000079c3
RDX: 1ffffffff1be2fc7 RSI: ffffffff8c15f380 RDI: ffffffff8df17e38
RBP: ffff88805f59d000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000003 R14: ffff88806ceb3d00 R15: ffffed100dfd308e
FS: 0000000000000000(0000) GS:ffff88809782c000(0063) knlGS:00000000f505db40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: ffff88809782c020 CR3: 000000006fc6a000 CR4: 0000000000352ef0
Call Trace:
<TASK>
nsim_queue_uninit drivers/net/netdevsim/netdev.c:993 [inline]
nsim_init_netdevsim drivers/net/netdevsim/netdev.c:1049 [inline]
nsim_create+0xd0a/0x1260 drivers/net/netdevsim/netdev.c:1101
__nsim_dev_port_add+0x435/0x7d0 drivers/net/netdevsim/dev.c:1438
nsim_dev_port_add_all drivers/net/netdevsim/dev.c:1494 [inline]
nsim_dev_reload_create drivers/net/netdevsim/dev.c:1546 [inline]
nsim_dev_reload_up+0x5b8/0x860 drivers/net/netdevsim/dev.c:1003
devlink_reload+0x322/0x7c0 net/devlink/dev.c:474
devlink_nl_reload_doit+0xe31/0x1410 net/devlink/dev.c:584
genl_family_rcv_msg_doit+0x206/0x2f0 net/netlink/genetlink.c:1115
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x55c/0x800 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x155/0x420 net/netlink/af_netlink.c:2552
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg net/socket.c:729 [inline]
____sys_sendmsg+0xa95/0xc70 net/socket.c:2614
___sys_sendmsg+0x134/0x1d0 net/socket.c:2668
__sys_sendmsg+0x16d/0x220 net/socket.c:2700
do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
__do_fast_syscall_32+0x7c/0x3a0 arch/x86/entry/syscall_32.c:306
do_fast_syscall_32+0x32/0x80 arch/x86/entry/syscall_32.c:331
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
RIP: 0023:0xf708e579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f505d55c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 0000000080000080
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
CR2: ffff88809782c020

Fixes: 2a68a22304f9 ("netdevsim: account dropped packet length in stats on queue free")
Reported-by: syzbot+8aa80c6232008f7b957d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/688bb9ca.a00a0220.26d0e1.0050.GAE@google.com/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250812162130.4129322-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mctp: Fix bad kfree_skb in bind lookup test

The kunit test's skb_pkt is consumed by mctp_dst_input() so shouldn't be
freed separately.

Fixes: e6d8e7dbc5a3 ("net: mctp: Add bind lookup test")
Reported-by: Alexandre Ghiti <alex@ghiti.fr>
Closes: https://lore.kernel.org/all/734b02a3-1941-49df-a0da-ec14310d41e4@ghiti.fr/
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://patch.msgid.link/20250812-fix-mctp-bind-test-v1-1-5e2128664eb3@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'amd-drm-fixes-6.17-2025-08-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.17-2025-08-13:

amdgpu:
- PSP fix
- VRAM reservation fix
- CSA fix
- Process kill fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250813151905.2040816-1-alexander.deucher@amd.com

Merge tag 'nf-25-08-13' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for *net*:

1) I managed to add a null dereference crash in nft_set_pipapo
   in the current development cycle, was not caught by CI
   because the avx2 implementation is fine, but selftest
   splats when run on non-avx2 host.

2) Fix the ipvs estimater kthread affinity, was incorrect
   since 6.14. From Frederic Weisbecker.

3) nf_tables should not allow to add a device to a flowtable
   or netdev chain more than once -- reject this.
   From Pablo Neira Ayuso.  This has been broken for long time,
   blamed commit dates from v5.8.

* tag 'nf-25-08-13' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: reject duplicate device on updates
  ipvs: Fix estimator kthreads preferred affinity
  netfilter: nft_set_pipapo: fix null deref for empty set
====================

Link: https://patch.msgid.link/20250813113800.20775-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'drm-misc-next-fixes-2025-08-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

bridge:
- fix OF-node leak
- fix documentation

fbdev-emulation:
- pass correct format info to drm_helper_mode_fill_fb_struct()

panfrost:
- print correct RSS size

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250812064712.GA14554@2a02-2454-fd5e-fd00-2c49-c639-c55f-a125.dyn6.pyur.net

Merge tag 'erofs-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

- Align FSDAX enablement among multiple devices

- Fix EROFS_FS_ZIP_ACCEL build dependency again to prevent forcing
   CRYPTO{,_DEFLATE}=y even if EROFS=m

- Fix atomic context detection to properly launch kworkers on demand

- Fix block count statistics for 48-bit addressing support

* tag 'erofs-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix block count report when 48-bit layout is on
  erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC
  erofs: Do not select tristate symbols from bool symbols
  erofs: Fallback to normal access if DAX is not supported on extra device

Merge tag 'rcu.fixes.6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU fix from Neeraj Upadhyay:
"Fix a regression introduced by commit b41642c87716 ("rcu: Fix
  rcu_read_unlock() deadloop due to IRQ work") which results in boot
  hang as reported by kernel test bot at [1].

  This issue happens because RCU re-initializes the deferred QS IRQ work
  everytime it is queued. With commit b41642c87716, the IRQ work
  re-initialization can happen while it is already queued. This results
  in IRQ work being requeued to itself. When IRQ work finally fires, as
  it is requeued to itself, it is repeatedly executed and results in
  hang.

  Fix this with initializing the IRQ work only once before the CPU
  boots"

Link: https://lore.kernel.org/rcu/202508071303.c1134cce-lkp@intel.com/
* tag 'rcu.fixes.6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
  rcu: Fix racy re-initialization of irq_work causing hangs

smb: client: remove redundant lstrp update in negotiate protocol

Commit 34331d7beed7 ("smb: client: fix first command failure during
re-negotiation") addressed a race condition by updating lstrp before
entering negotiate state. However, this approach may have some unintended
side effects.

The lstrp field is documented as "when we got last response from this
server", and updating it before actually receiving a server response
could potentially affect other mechanisms that rely on this timestamp.
For example, the SMB echo detection logic also uses lstrp as a reference
point. In scenarios with frequent user operations during reconnect states,
the repeated calls to cifs_negotiate_protocol() might continuously
update lstrp, which could interfere with the echo detection timing.

Additionally, commit 266b5d02e14f ("smb: client: fix race condition in
negotiate timeout by using more precise timing") introduced a dedicated
neg_start field specifically for tracking negotiate start time. This
provides a more precise solution for the original race condition while
preserving the intended semantics of lstrp.

Since the race condition is now properly handled by the neg_start
mechanism, the lstrp update in cifs_negotiate_protocol() is no longer
necessary and can be safely removed.

Fixes: 266b5d02e14f ("smb: client: fix race condition in negotiate timeout by using more precise timing")
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

cifs: update internal version number

to 2.56

Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: don't wait for info->send_pending == 0 on error

We already called ib_drain_qp() before and that makes sure
send_done() was called with IB_WC_WR_FLUSH_ERR, but
didn't called atomic_dec_and_test(&sc->send_io.pending.count)

So we may never reach the info->send_pending == 0 condition.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Fixes: 5349ae5e05fa ("smb: client: let send_done() cleanup before calling smbd_disconnect_rdma_connection()")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: fix mid_q_entry memleak leak with per-mid locking

This is step 4/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.

In compound_send_recv(), when wait_for_response() is interrupted by
signals, the code attempts to cancel pending requests by changing
their callbacks to cifs_cancelled_callback. However, there's a race
condition between signal interruption and network response processing
that causes both mid_q_entry and server buffer leaks:

```
User foreground process                    cifsd
cifs_readdir
open_cached_dir
  cifs_send_recv
   compound_send_recv
    smb2_setup_request
     smb2_mid_entry_alloc
      smb2_get_mid_entry
       smb2_mid_entry_alloc
        mempool_alloc // alloc mid
        kref_init(&temp->refcount); // refcount = 1
     mid[0]->callback = cifs_compound_callback;
     mid[1]->callback = cifs_compound_last_callback;
     smb_send_rqst
     rc = wait_for_response
      wait_event_state TASK_KILLABLE
                                  cifs_demultiplex_thread
                                    allocate_buffers
                                      server->bigbuf = cifs_buf_get()
                                    standard_receive3
                                      ->find_mid()
                                        smb2_find_mid
                                          __smb2_find_mid
                                           kref_get(&mid->refcount) // +1
                                      cifs_handle_standard
                                        handle_mid
                                         /* bigbuf will also leak */
                                         mid->resp_buf = server->bigbuf
                                         server->bigbuf = NULL;
                                         dequeue_mid
                                     /* in for loop */
                                    mids[0]->callback
                                      cifs_compound_callback
    /* Signal interrupts wait: rc = -ERESTARTSYS */
    /* if (... || midQ[i]->mid_state == MID_RESPONSE_RECEIVED) *?
    midQ[0]->callback = cifs_cancelled_callback;
    cancelled_mid[i] = true;
                                       /* The change comes too late */
                                       mid->mid_state = MID_RESPONSE_READY
                                    release_mid  // -1
    /* cancelled_mid[i] == true causes mid won't be released
       in compound_send_recv cleanup */
    /* cifs_cancelled_callback won't executed to release mid */
```

The root cause is that there's a race between callback assignment and
execution.

Fix this by introducing per-mid locking:

- Add spinlock_t mid_lock to struct mid_q_entry
- Add mid_execute_callback() for atomic callback execution
- Use mid_lock in cancellation paths to ensure atomicity

This ensures that either the original callback or the cancellation
callback executes atomically, preventing reference count leaks when
requests are interrupted by signals.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220404
Fixes: ee258d79159a ("CIFS: Move credit processing to mid callbacks for SMB3")
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb3: fix for slab out of bounds on mount to ksmbd

With KASAN enabled, it is possible to get a slab out of bounds
during mount to ksmbd due to missing check in parse_server_interfaces()
(see below):

BUG: KASAN: slab-out-of-bounds in
parse_server_interfaces+0x14ee/0x1880 [cifs]
Read of size 4 at addr ffff8881433dba98 by task mount/9827

CPU: 5 UID: 0 PID: 9827 Comm: mount Tainted: G
OE       6.16.0-rc2-kasan #2 PREEMPT(voluntary)
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. Precision Tower 3620/0MWYPT,
BIOS 2.13.1 06/14/2019
Call Trace:
  <TASK>
dump_stack_lvl+0x9f/0xf0
print_report+0xd1/0x670
__virt_addr_valid+0x22c/0x430
? parse_server_interfaces+0x14ee/0x1880 [cifs]
? kasan_complete_mode_report_info+0x2a/0x1f0
? parse_server_interfaces+0x14ee/0x1880 [cifs]
   kasan_report+0xd6/0x110
   parse_server_interfaces+0x14ee/0x1880 [cifs]
   __asan_report_load_n_noabort+0x13/0x20
   parse_server_interfaces+0x14ee/0x1880 [cifs]
? __pfx_parse_server_interfaces+0x10/0x10 [cifs]
? trace_hardirqs_on+0x51/0x60
SMB3_request_interfaces+0x1ad/0x3f0 [cifs]
? __pfx_SMB3_request_interfaces+0x10/0x10 [cifs]
? SMB2_tcon+0x23c/0x15d0 [cifs]
smb3_qfs_tcon+0x173/0x2b0 [cifs]
? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs]
? cifs_get_tcon+0x105d/0x2120 [cifs]
? do_raw_spin_unlock+0x5d/0x200
? cifs_get_tcon+0x105d/0x2120 [cifs]
? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs]
cifs_mount_get_tcon+0x369/0xb90 [cifs]
? dfs_cache_find+0xe7/0x150 [cifs]
dfs_mount_share+0x985/0x2970 [cifs]
? check_path.constprop.0+0x28/0x50
? save_trace+0x54/0x370
? __pfx_dfs_mount_share+0x10/0x10 [cifs]
? __lock_acquire+0xb82/0x2ba0
? __kasan_check_write+0x18/0x20
cifs_mount+0xbc/0x9e0 [cifs]
? __pfx_cifs_mount+0x10/0x10 [cifs]
? do_raw_spin_unlock+0x5d/0x200
? cifs_setup_cifs_sb+0x29d/0x810 [cifs]
cifs_smb3_do_mount+0x263/0x1990 [cifs]

Reported-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>

Revert "ALSA: hda: Add ASRock X670E Taichi to denylist"

On a motherboard with an AMD Granite Ridge CPU there is a report
that 3.5mm microphone and headphones aren't working. In the
log it's observed:

snd_hda_intel 0000:02:00.6: Skipping the device on the denylist

This was because of commit df42ee7e22f03 ("ALSA: hda: Add ASRock
X670E Taichi to denylist"). Reverting this commit allows the
microphone and headphones to work again. As at least some combinations
of this motherboard do have applicable devices, revert so that they
can be probed.

Cc: Richard Gong <richard.gong@amd.com>
Cc: Juan Martinez <juan.martinez@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20250813140427.1577172-1-superm1@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: azt3328: Put __maybe_unused for inline functions for gameport

Some inline functions are unused depending on kconfig, and the recent
change for clang builds made those handled as errors with W=1.
For avoiding pitfalls, mark those with __maybe_unused attributes.

Link: https://patch.msgid.link/20250813153628.12303-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"12 hotfixes. 5 are cc:stable and the remainder address post-6.16
  issues or aren't considered necessary for -stable kernels.

  10 of these fixes are for MM"

* tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  proc: proc_maps_open allow proc_mem_open to return NULL
  mm/mremap: avoid expensive folio lookup on mremap folio pte batch
  userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
  mm: pass page directly instead of using folio_page
  selftests/proc: fix string literal warning in proc-maps-race.c
  fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats
  mm/smaps: fix race between smaps_hugetlb_range and migration
  mm: fix the race between collapse and PT_RECLAIM under per-vma lock
  mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
  MAINTAINERS: add Masami as a reviewer of hung task detector
  mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
  kasan/test: fix protection against compiler elision

ASoC: tas2781: Normalize the volume kcontrol name

Change the name of the kcontrol from "Gain" to "Volume".

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20250813100708.12197-1-baojun.xu@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>

io_uring/io-wq: add check free worker before create new worker

After commit 0b2b066f8a85 ("io_uring/io-wq: only create a new worker
if it can make progress"), in our produce environment, we still
observe that part of io_worker threads keeps creating and destroying.
After analysis, it was confirmed that this was due to a more complex
scenario involving a large number of fsync operations, which can be
abstracted as frequent write + fsync operations on multiple files in
a single uring instance. Since write is a hash operation while fsync
is not, and fsync is likely to be suspended during execution, the
action of checking the hash value in
io_wqe_dec_running cannot handle such scenarios.
Similarly, if hash-based work and non-hash-based work are sent at the
same time, similar issues are likely to occur.
Returning to the starting point of the issue, when a new work
arrives, io_wq_enqueue may wake up free worker A, while
io_wq_dec_running may create worker B. Ultimately, only one of A and
B can obtain and process the task, leaving the other in an idle
state. In the end, the issue is caused by inconsistent logic in the
checks performed by io_wq_enqueue and io_wq_dec_running.
Therefore, the problem can be resolved by checking for available
workers in io_wq_dec_running.

Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Reviewed-by: Diangang Li <lidiangang@bytedance.com>
Link: https://lore.kernel.org/r/20250813120214.18729-1-changfengnan@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: restore default wbt enablement

The commit 245618f8e45f ("block: protect wbt_lat_usec using
q->elevator_lock") protected wbt_enable_default() with
q->elevator_lock; however, it also placed wbt_enable_default()
before blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);, resulting
in wbt failing to be enabled.

Moreover, the protection of wbt_enable_default() by q->elevator_lock
was removed in commit 78c271344b6f ("block: move wbt_enable_default()
out of queue freezing from sched ->exit()"), so we can directly fix
this issue by placing wbt_enable_default() after
blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);.

Additionally, this issue also causes the inability to read the
wbt_lat_usec file, and the scenario is as follows:

root@q:/sys/block/sda/queue# cat wbt_lat_usec
cat: wbt_lat_usec: Invalid argument

root@q:/data00/sjc/linux# ls /sys/kernel/debug/block/sda/rqos
cannot access '/sys/kernel/debug/block/sda/rqos': No such file or directory

root@q:/data00/sjc/linux# find /sys -name wbt
/sys/kernel/debug/tracing/events/wbt

After testing with this patch, wbt can be enabled normally.

Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Cc: stable@vger.kernel.org
Fixes: 245618f8e45f ("block: protect wbt_lat_usec using q->elevator_lock")
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250812154257.57540-1-sunjunchao@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Docs: admin-guide: Correct spelling mistake

Fix spelling mistake directoy to directory

Reported-by: codespell
Signed-off-by: Erick Karanja <karanja99erick@gmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250813071837.668613-1-karanja99erick@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path

The __clear_task_blocked_on() helper added a number of sanity
checks ensuring we hold the mutex wait lock and that the task
we are clearing blocked_on pointer (if set) matches the mutex.

However, there is an edge case in the _ww_mutex_wound() logic
where we need to clear the blocked_on pointer for the task that
owns the mutex, not the task that is waiting on the mutex.

For this case the sanity checks aren't valid, so handle this
by allowing a NULL lock to skip the additional checks.

K Prateek Nayak and Maarten Lankhorst also pointed out that in
this case where we don't hold the owner's mutex wait_lock, we
need to be a bit more careful using READ_ONCE/WRITE_ONCE in both
the __clear_task_blocked_on() and __set_task_blocked_on()
implementations to avoid accidentally tripping WARN_ONs if two
instances race. So do that here as well.

This issue was easier to miss, I realized, as the test-ww_mutex
driver only exercises the wait-die class of ww_mutexes. I've
sent a patch[1] to address this so the logic will be easier to
test.

[1]: https://lore.kernel.org/lkml/20250801023358.562525-2-jstultz@google.com/

Fixes: a4f0b6fef4b0 ("locking/mutex: Add p->blocked_on wrappers for correctness checks")
Closes: https://lore.kernel.org/lkml/68894443.a00a0220.26d0e1.0015.GAE@google.com/
Reported-by: syzbot+602c4720aed62576cd79@syzkaller.appspotmail.com
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20250805001026.2247040-1-jstultz@google.com

netfilter: nf_tables: reject duplicate device on updates

A chain/flowtable update with duplicated devices in the same batch is
possible. Unfortunately, netdev event path only removes the first
device that is found, leaving unregistered the hook of the duplicated
device.

Check if a duplicated device exists in the transaction batch, bail out
with EEXIST in such case.

WARNING is hit when unregistering the hook:

[49042.221275] WARNING: CPU: 4 PID: 8425 at net/netfilter/core.c:340 nf_hook_entry_head+0xaa/0x150
[49042.221375] CPU: 4 UID: 0 PID: 8425 Comm: nft Tainted: G S 6.16.0+ #170 PREEMPT(full)
[...]
[49042.221382] RIP: 0010:nf_hook_entry_head+0xaa/0x150

Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable")
Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

ipvs: Fix estimator kthreads preferred affinity

The estimator kthreads' affinity are defined by sysctl overwritten
preferences and applied through a plain call to the scheduler's affinity
API.

However since the introduction of managed kthreads preferred affinity,
such a practice shortcuts the kthreads core code which eventually
overwrites the target to the default unbound affinity.

Fix this with using the appropriate kthread's API.

Fixes: d1a89197589c ("kthread: Default affine kthread to its preferred NUMA node")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>

netfilter: nft_set_pipapo: fix null deref for empty set

Blamed commit broke the check for a null scratch map:
- if (unlikely(!m || !*raw_cpu_ptr(m->scratch)))
+ if (unlikely(!raw_cpu_ptr(m->scratch)))

This should have been "if (!*raw_ ...)".
Use the pattern of the avx2 version which is more readable.

This can only be reproduced if avx2 support isn't available.

Fixes: d8d871a35ca9 ("netfilter: nft_set_pipapo: merge pipapo_get/lookup")
Signed-off-by: Florian Westphal <fw@strlen.de>

selftests: tls: test TCP stealing data from under the TLS socket

Check a race where data disappears from the TCP socket after
TLS signaled that its ready to receive.

  ok 6 global.data_steal
  #  RUN           tls_basic.base_base ...
  #            OK  tls_basic.base_base

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250807232907.600366-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tls: handle data disappearing from under the TLS ULP

TLS expects that it owns the receive queue of the TCP socket.
This cannot be guaranteed in case the reader of the TCP socket
entered before the TLS ULP was installed, or uses some non-standard
read API (eg. zerocopy ones). Replace the WARN_ON() and a buggy
early exit (which leaves anchor pointing to a freed skb) with real
error handling. Wipe the parsing state and tell the reader to retry.

We already reload the anchor every time we (re)acquire the socket lock,
so the only condition we need to avoid is an out of bounds read
(not having enough bytes in the socket for previously parsed record len).

If some data was read from under TLS but there's enough in the queue
we'll reload and decrypt what is most likely not a valid TLS record.
Leading to some undefined behavior from TLS perspective (corrupting
a stream? missing an alert? missing an attack?) but no kernel crash
should take place.

Reported-by: William Liu <will@willsroot.io>
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://lore.kernel.org/tFjq_kf7sWIG3A7CrCg_egb8CVsT_gsmHAK0_wxDPJXfIzxFAMxqmLwp3MlU5EHiet0AwwJldaaFdgyHpeIUCS-3m3llsmRzp9xIOBR4lAI=@syst3mfailure.io
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250807232907.600366-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '6.17/scsi-queue' into 6.17/scsi-fixes

Pull in outstanding commits for 6.17.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ptp: prevent possible ABBA deadlock in ptp_clock_freerun()

syzbot reported the following ABBA deadlock:

       CPU0                           CPU1
       ----                           ----
  n_vclocks_store()
    lock(&ptp->n_vclocks_mux) [1]
        (physical clock)
                                     pc_clock_adjtime()
                                       lock(&clk->rwsem) [2]
                                        (physical clock)
                                       ...
                                       ptp_clock_freerun()
                                         ptp_vclock_in_use()
                                           lock(&ptp->n_vclocks_mux) [3]
                                              (physical clock)
    ptp_clock_unregister()
      posix_clock_unregister()
        lock(&clk->rwsem) [4]
          (virtual clock)

Since ptp virtual clock is registered only under ptp physical clock, both
ptp_clock and posix_clock must be physical clocks for ptp_vclock_in_use()
to lock &ptp->n_vclocks_mux and check ptp->n_vclocks.

However, when unregistering vclocks in n_vclocks_store(), the locking
ptp->n_vclocks_mux is a physical clock lock, but clk->rwsem of
ptp_clock_unregister() called through device_for_each_child_reverse()
is a virtual clock lock.

Therefore, clk->rwsem used in CPU0 and clk->rwsem used in CPU1 are
different locks, but in lockdep, a false positive occurs because the
possibility of deadlock is determined through lock-class.

To solve this, lock subclass annotation must be added to the posix_clock
rwsem of the vclock.

Reported-by: syzbot+7cfb66a237c4a5fb22ad@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7cfb66a237c4a5fb22ad
Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250728062649.469882-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ixgbe: prevent from unwanted interface name changes

Users of the ixgbe driver report that after adding devlink support by
the commit a0285236ab93 ("ixgbe: add initial devlink support") their
configs got broken due to unwanted changes of interface names. It's
caused by automatic phys_port_name generation during devlink port
initialization flow.

To prevent from that set no_phys_port_name flag for ixgbe devlink ports.

Reported-by: David Howells <dhowells@redhat.com>
Closes: https://lore.kernel.org/netdev/3452224.1745518016@warthog.procyon.org.uk/
Reported-by: David Kaplan <David.Kaplan@amd.com>
Closes: https://lore.kernel.org/netdev/LV3PR12MB92658474624CCF60220157199470A@LV3PR12MB9265.namprd12.prod.outlook.com/
Fixes: a0285236ab93 ("ixgbe: add initial devlink support")
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

devlink: let driver opt out of automatic phys_port_name generation

Currently when adding devlink port, phys_port_name is automatically
generated within devlink port initialization flow. As a result adding
devlink port support to driver may result in forced changes of interface
names, which breaks already existing network configs.

This is an expected behavior but in some scenarios it would not be
preferable to provide such limitation for legacy driver not being able to
keep 'pre-devlink' interface name.

Add flag no_phys_port_name to devlink_port_attrs struct which indicates
if devlink should not alter name of interface.

Suggested-by: Jiri Pirko <jiri@resnulli.us>
Link: https://lore.kernel.org/all/nbwrfnjhvrcduqzjl4a2jafnvvud6qsbxlvxaxilnryglf4j7r@btuqrimnfuly/
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

drm/amdgpu: fix task hang from failed job submission during process kill

During process kill, drm_sched_entity_flush() will kill the vm
entities. The following job submissions of this process will fail, and
the resources of these jobs have not been released, nor have the fences
been signalled, causing tasks to hang and timeout.

Fix by check entity status in amdgpu_vm_ready() and avoid submit jobs to
stopped entity.

v2: add amdgpu_vm_ready() check before amdgpu_vm_clear_freed() in
function amdgpu_cs_vm_handling().

Fixes: 1f02f2044bda ("drm/amdgpu: Avoid extra evict-restore process.")
Signed-off-by: Liu01 Tong <Tong.Liu01@amd.com>
Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f101c13a8720c73e67f8f9d511fbbeda95bcedb1)

ASoC: stm: stm32_i2s: Fix calc_clk_div() error handling in determine_rate()

calc_clk_div() will only return a non-zero value (-EINVAL)
in case of error. On the other hand, req->rate is an unsigned long.
It seems quite odd that req->rate would be assigned a negative value,
which is clearly not a rate, and success would be returned.

Reinstate previous logic, which would just return error.

Fixes: afd529d74002 ("ASoC: stm: stm32_i2s: convert from round_rate() to determine_rate()")
Link: https://scan7.scan.coverity.com/#/project-view/53936/11354?selectedIssue=1647702
Signed-off-by: Sergio Perez Gonzalez <sperezglz@gmail.com>
Link: https://patch.msgid.link/20250729020052.404617-1-sperezglz@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amdgpu: fix incorrect vm flags to map bo

It should use vm flags instead of pte flags
to specify bo vm attributes.

Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file")
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b08425fa77ad2f305fe57a33dceb456be03b653f)