git.ipfire.org Git - thirdparty/linux.git/log
2 weeks ago Merge tag 'spi-fix-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Fri, 3 Apr 2026 17:19:52 +0000 (10:19 -0700)] 
Merge tag 'spi-fix-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small collection of fixes, mostly probe/remove issues that are the
  result of Felix Gu going and auditing those areas, plus one error
  handling fix for the Cadence QSPI driver"

* tag 'spi-fix-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: cadence-qspi: Fix exec_mem_op error handling
  spi: amlogic: spifc-a4: unregister ECC engine on probe failure and remove() callback
  spi: stm32-ospi: Fix DMA channel leak on stm32_ospi_dma_setup() failure
  spi: stm32-ospi: Fix reset control leak on probe error
  spi: stm32-ospi: Fix resource leak in remove() callback

2 weeks ago sched_ext: Fix stale direct dispatch state in ddsp_dsq_id
Andrea Righi [Fri, 3 Apr 2026 06:57:20 +0000 (08:57 +0200)] 
sched_ext: Fix stale direct dispatch state in ddsp_dsq_id

@p->scx.ddsp_dsq_id can be left set (non-SCX_DSQ_INVALID), triggering a
spurious warning in mark_direct_dispatch() when the next wakeup's
ops.select_cpu() calls scx_bpf_dsq_insert(), such as:

 WARNING: kernel/sched/ext.c:1273 at scx_dsq_insert_commit+0xcd/0x140

The root cause is that ddsp_dsq_id was only cleared in dispatch_enqueue(),
which is not reached in all paths that consume or cancel a direct dispatch
verdict.

Fix it by clearing it at the right places:

 - direct_dispatch(): cache the direct dispatch state in local variables
   and clear it before dispatch_enqueue() on the synchronous path. For
   the deferred path, the direct dispatch state must remain set until
   process_ddsp_deferred_locals() consumes them.

 - process_ddsp_deferred_locals(): cache the dispatch state in local
   variables and clear it before calling dispatch_to_local_dsq(), which
   may migrate the task to another rq.

 - do_enqueue_task(): clear the dispatch state on the enqueue path
   (local/global/bypass fallbacks), where the direct dispatch verdict is
   ignored.

 - dequeue_task_scx(): clear the dispatch state after dispatch_dequeue()
   to handle both the deferred dispatch cancellation and the holding_cpu
   race, covering all cases where a pending direct dispatch is
   cancelled.

 - scx_disable_task(): clear the direct dispatch state when
   transitioning a task out of the current scheduler. Waking tasks may
   have had the direct dispatch state set by the outgoing scheduler's
   ops.select_cpu() and then been queued on a wake_list via
   ttwu_queue_wakelist(), when SCX_OPS_ALLOW_QUEUED_WAKEUP is set. Such
   tasks are not on the runqueue and are not iterated by scx_bypass(),
   so their direct dispatch state won't be cleared. Without this clear,
   any subsequent SCX scheduler that tries to direct dispatch the task
   will trigger the WARN_ON_ONCE() in mark_direct_dispatch().
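
The cache-then-clear pattern used on the synchronous path can be sketched in
plain userspace C. This is only an illustration: struct task_sketch,
DSQ_INVALID_SKETCH and enqueue_sketch() are hypothetical stand-ins, not the
kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for SCX_DSQ_INVALID. */
#define DSQ_INVALID_SKETCH UINT64_MAX

/* Hypothetical stand-in for the per-task direct dispatch state. */
struct task_sketch {
	uint64_t ddsp_dsq_id;
	uint64_t ddsp_enq_flags;
	uint64_t last_dsq_id;	/* records where the task was enqueued */
};

/* Stand-in for dispatch_enqueue(); it may touch the task freely. */
static void enqueue_sketch(struct task_sketch *p, uint64_t dsq_id,
			   uint64_t enq_flags)
{
	(void)enq_flags;
	p->last_dsq_id = dsq_id;
}

/*
 * Copy the verdict into locals, reset the task's state, and only then act
 * on the cached copy, so no stale verdict survives the call.
 */
static void direct_dispatch_sketch(struct task_sketch *p)
{
	uint64_t dsq_id = p->ddsp_dsq_id;
	uint64_t enq_flags = p->ddsp_enq_flags;

	p->ddsp_dsq_id = DSQ_INVALID_SKETCH;
	p->ddsp_enq_flags = 0;

	enqueue_sketch(p, dsq_id, enq_flags);
}
```

Clearing before the call matters because the callee may move the task
elsewhere, after which touching the state again would be unsafe.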

Fixes: 5b26f7b920f7 ("sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches")
Cc: stable@vger.kernel.org # v6.12+
Cc: Daniel Hodges <hodgesd@meta.com>
Cc: Patrick Somaru <patsomaru@meta.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 weeks ago workqueue: avoid unguarded 64-bit division
Arnd Bergmann [Thu, 2 Apr 2026 20:59:03 +0000 (22:59 +0200)] 
workqueue: avoid unguarded 64-bit division

The printk() requires a division that is not allowed on 32-bit architectures:

x86_64-linux-ld: lib/test_workqueue.o: in function `test_workqueue_init':
test_workqueue.c:(.init.text+0x36f): undefined reference to `__udivdi3'

Use div_u64() to print the resulting elapsed microseconds.
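
As a rough userspace sketch of the idea: div_u64_sketch() below is a
hypothetical stand-in that mirrors the argument shape of the kernel's
div_u64() (64-bit dividend, 32-bit divisor). The nanosecond input is an
assumption made for illustration; the commit does not say what is divided.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for div_u64(). In userspace a plain '/'
 * is fine; the kernel helper exists so that 32-bit builds do not emit a
 * call to __udivdi3, which the kernel does not provide.
 */
static uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Assumed example: convert an elapsed time in nanoseconds to microseconds. */
static uint64_t elapsed_us_sketch(uint64_t elapsed_ns)
{
	return div_u64_sketch(elapsed_ns, 1000);
}
```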

Fixes: 24b2e73f9700 ("workqueue: add test_workqueue benchmark module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 weeks ago Merge tag 'pm-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 3 Apr 2026 16:56:32 +0000 (09:56 -0700)] 
Merge tag 'pm-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a potential NULL pointer dereference in the energy model
  netlink interface and a potential double free in an error path in
  the common cpufreq governor management code:

   - Fix a NULL pointer dereference in the energy model netlink
     interface that may occur if a given perf domain ID is not
     recognized (Changwoo Min)

   - Avoid double free in the cpufreq_dbs_governor_init() error
     path when kobject_init_and_add() fails (Guangshuo Li)"

* tag 'pm-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: governor: fix double free in cpufreq_dbs_governor_init() error path
  PM: EM: Fix NULL pointer dereference when perf domain ID is not found

2 weeks ago Merge tag 'thermal-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 3 Apr 2026 16:49:06 +0000 (09:49 -0700)] 
Merge tag 'thermal-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "Address potential races between thermal zone removal and system
  resume that may lead to a use-after-free (in two different ways)
  and a potential use-after-free in the thermal zone unregistration
  path (Rafael Wysocki)"

* tag 'thermal-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Fix thermal zone device registration error path
  thermal: core: Address thermal zone removal races with resume

2 weeks ago KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being created
Sean Christopherson [Tue, 10 Mar 2026 23:48:12 +0000 (16:48 -0700)] 
KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being created

Reject LAUNCH_FINISH for SEV-ES and SNP VMs if KVM is actively creating
one or more vCPUs, as KVM needs to process and encrypt each vCPU's VMSA.
Letting userspace create vCPUs while LAUNCH_FINISH is in-progress is
"fine", at least in the current code base, as kvm_for_each_vcpu() operates
on online_vcpus, LAUNCH_FINISH (all SEV+ sub-ioctls) holds kvm->mutex, and
fully onlining a vCPU in kvm_vm_ioctl_create_vcpu() is done under
kvm->mutex.  I.e. there's no difference between an in-progress vCPU and a
vCPU that is created entirely after LAUNCH_FINISH.

However, given that concurrent LAUNCH_FINISH and vCPU creation can't
possibly work (for any reasonable definition of "work"), since userspace
can't guarantee whether a particular vCPU will be encrypted or not,
disallow the combination as a hardening measure, to reduce the probability
of introducing bugs in the future, and to avoid having to reason about the
safety of future changes related to LAUNCH_FINISH.
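
One way such a rejection could look, as a userspace sketch only. The two
counters, struct launch_vm_sketch and the -EBUSY return value are assumptions
for illustration; the commit message does not describe the actual check.

```c
#include <errno.h>

/* Hypothetical VM bookkeeping; the field names are assumptions. */
struct launch_vm_sketch {
	int created_vcpus;	/* vCPUs whose creation has started */
	int online_vcpus;	/* vCPUs that are fully onlined */
};

/*
 * Refuse to finish the launch while any vCPU is only partially created,
 * i.e. while the two counters disagree.
 */
static int launch_finish_sketch(const struct launch_vm_sketch *vm)
{
	if (vm->created_vcpus != vm->online_vcpus)
		return -EBUSY;

	/* ... process and encrypt each online vCPU's VMSA here ... */
	return 0;
}
```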

Cc: Jethro Beekman <jethro@fortanix.com>
Closes: https://lore.kernel.org/all/b31f7c6e-2807-4662-bcdd-eea2c1e132fa@fortanix.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260310234829.2608037-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock
Sean Christopherson [Tue, 10 Mar 2026 23:48:11 +0000 (16:48 -0700)] 
KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock

Take and hold kvm->lock before checking sev_guest() in
sev_mem_enc_register_region(), as sev_guest() isn't stable unless kvm->lock
is held (or KVM can guarantee KVM_SEV_INIT{2} has completed and can't
roll back state).  If KVM_SEV_INIT{2} fails, KVM can end up trying to add to
a not-yet-initialized sev->regions_list, e.g. triggering a #GP:

  Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  CPU: 110 UID: 0 PID: 72717 Comm: syz.15.11462 Tainted: G     U  W  O        6.16.0-smp-DEV #1 NONE
  Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
  Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024
  RIP: 0010:sev_mem_enc_register_region+0x3f0/0x4f0 ../include/linux/list.h:83
  Code: <41> 80 3c 04 00 74 08 4c 89 ff e8 f1 c7 a2 00 49 39 ed 0f 84 c6 00
  RSP: 0018:ffff88838647fbb8 EFLAGS: 00010256
  RAX: dffffc0000000000 RBX: 1ffff92015cf1e0b RCX: dffffc0000000000
  RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff888367870000
  RBP: ffffc900ae78f050 R08: ffffea000d9e0007 R09: 1ffffd4001b3c000
  R10: dffffc0000000000 R11: fffff94001b3c001 R12: 0000000000000000
  R13: ffff8982ab0bde00 R14: ffffc900ae78f058 R15: 0000000000000000
  FS:  00007f34e9dc66c0(0000) GS:ffff89ee64d33000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fe180adef98 CR3: 000000047210e000 CR4: 0000000000350ef0
  Call Trace:
   <TASK>
   kvm_arch_vm_ioctl+0xa72/0x1240 ../arch/x86/kvm/x86.c:7371
   kvm_vm_ioctl+0x649/0x990 ../virt/kvm/kvm_main.c:5363
   __se_sys_ioctl+0x101/0x170 ../fs/ioctl.c:51
   do_syscall_x64 ../arch/x86/entry/syscall_64.c:63 [inline]
   do_syscall_64+0x6f/0x1f0 ../arch/x86/entry/syscall_64.c:94
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f34e9f7e9a9
  Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007f34e9dc6038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007f34ea1a6080 RCX: 00007f34e9f7e9a9
  RDX: 0000200000000280 RSI: 000000008010aebb RDI: 0000000000000007
  RBP: 00007f34ea000d69 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 0000000000000000 R14: 00007f34ea1a6080 R15: 00007ffce77197a8
   </TASK>

with a syzlang reproducer that looks like:

  syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000040)={0x0, &(0x7f0000000180)=ANY=[], 0x70}) (async)
  syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000080)={0x0, &(0x7f0000000180)=ANY=[@ANYBLOB="..."], 0x4f}) (async)
  r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000200), 0x0, 0x0)
  r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
  r2 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000240), 0x0, 0x0)
  r3 = ioctl$KVM_CREATE_VM(r2, 0xae01, 0x0)
  ioctl$KVM_SET_CLOCK(r3, 0xc008aeba, &(0x7f0000000040)={0x1, 0x8, 0x0, 0x5625e9b0}) (async)
  ioctl$KVM_SET_PIT2(r3, 0x8010aebb, &(0x7f0000000280)={[...], 0x5}) (async)
  ioctl$KVM_SET_PIT2(r1, 0x4070aea0, 0x0) (async)
  r4 = ioctl$KVM_CREATE_VM(0xffffffffffffffff, 0xae01, 0x0)
  openat$kvm(0xffffffffffffff9c, 0x0, 0x0, 0x0) (async)
  ioctl$KVM_SET_USER_MEMORY_REGION(r4, 0x4020ae46, &(0x7f0000000400)={0x0, 0x0, 0x0, 0x2000, &(0x7f0000001000/0x2000)=nil}) (async)
  r5 = ioctl$KVM_CREATE_VCPU(r4, 0xae41, 0x2)
  close(r0) (async)
  openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x8000, 0x0) (async)
  ioctl$KVM_SET_GUEST_DEBUG(r5, 0x4048ae9b, &(0x7f0000000300)={0x4376ea830d46549b, 0x0, [0x46, 0x0, 0x0, 0x0, 0x0, 0x1000]}) (async)
  ioctl$KVM_RUN(r5, 0xae80, 0x0)

Opportunistically use guard() to avoid having to define a new error label
and goto usage.
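
For readers unfamiliar with scope-based locking, the effect of guard() can be
approximated in userspace with the GCC/Clang cleanup attribute. Everything
below (GUARD_SKETCH, struct lock_sketch, struct guest_sketch, the -ENOTTY
return value) is hypothetical and only illustrates why no error label is
needed; it is not the kernel's guard() implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock that only counts, so the sketch needs no threads. */
struct lock_sketch {
	int depth;
};

static void lock_sketch_acquire(struct lock_sketch *l) { l->depth++; }
static void lock_sketch_release(struct lock_sketch *l) { l->depth--; }

/* Cleanup handler: receives a pointer to the guarded variable. */
static void guard_release(struct lock_sketch **l)
{
	lock_sketch_release(*l);
}

/* Take the lock now and drop it automatically when the scope ends. */
#define GUARD_SKETCH(l) \
	struct lock_sketch *guard_var \
		__attribute__((cleanup(guard_release), unused)) = \
		(lock_sketch_acquire(l), (l))

struct guest_sketch {
	struct lock_sketch lock;
	bool is_sev_guest;
	int nr_regions;
};

static int register_region_sketch(struct guest_sketch *g)
{
	GUARD_SKETCH(&g->lock);

	/* The guest-type check happens under the lock, so it is stable. */
	if (!g->is_sev_guest)
		return -ENOTTY;	/* lock released here, no goto needed */

	g->nr_regions++;
	return 0;		/* ... and released here as well */
}
```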

Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active")
Cc: stable@vger.kernel.org
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://patch.msgid.link/20260310234829.2608037-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU
Sean Christopherson [Tue, 10 Mar 2026 23:48:10 +0000 (16:48 -0700)] 
KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU

Reject synchronizing vCPU state to its associated VMSA if the vCPU has
already been launched, i.e. if the VMSA has already been encrypted.  On a
host with SNP enabled, accessing guest-private memory generates an RMP #PF
and panics the host.

  BUG: unable to handle page fault for address: ff1276cbfdf36000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x80000003) - RMP violation
  PGD 5a31801067 P4D 5a31802067 PUD 40ccfb5063 PMD 40e5954063 PTE 80000040fdf36163
  SEV-SNP: PFN 0x40fdf36, RMP entry: [0x6010fffffffff001 - 0x000000000000001f]
  Oops: Oops: 0003 [#1] SMP NOPTI
  CPU: 33 UID: 0 PID: 996180 Comm: qemu-system-x86 Tainted: G           OE
  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
  Hardware name: Dell Inc. PowerEdge R7625/0H1TJT, BIOS 1.5.8 07/21/2023
  RIP: 0010:sev_es_sync_vmsa+0x54/0x4c0 [kvm_amd]
  Call Trace:
   <TASK>
   snp_launch_update_vmsa+0x19d/0x290 [kvm_amd]
   snp_launch_finish+0xb6/0x380 [kvm_amd]
   sev_mem_enc_ioctl+0x14e/0x720 [kvm_amd]
   kvm_arch_vm_ioctl+0x837/0xcf0 [kvm]
   kvm_vm_ioctl+0x3fd/0xcc0 [kvm]
   __x64_sys_ioctl+0xa3/0x100
   x64_sys_call+0xfe0/0x2350
   do_syscall_64+0x81/0x10f0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7ffff673287d
   </TASK>

Note, the KVM flaw has been present since commit ad73109ae7ec ("KVM: SVM:
Provide support to launch and run an SEV-ES guest"), but has only been
actively dangerous for the host since SNP support was added.  With SEV-ES,
KVM would "just" clobber guest state, which is totally fine from a host
kernel perspective since userspace can clobber guest state any time before
sev_launch_update_vmsa().

Fixes: ad27ce155566 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command")
Reported-by: Jethro Beekman <jethro@fortanix.com>
Closes: https://lore.kernel.org/all/d98692e2-d96b-4c36-8089-4bc1e5cc3d57@fortanix.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260310234829.2608037-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test
Sean Christopherson [Tue, 10 Mar 2026 23:48:09 +0000 (16:48 -0700)] 
KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test

Drop the explicit KVM_SEV_LAUNCH_UPDATE_VMSA call when creating an SEV-ES
VM in the SEV migration test, as sev_vm_create() automatically updates the
VMSA pages for SEV-ES guests.  The only reason the duplicate call doesn't
cause visible problems is because the test doesn't actually try to run the
vCPUs.  That will change when KVM adds a check to prevent userspace from
re-launching a VMSA (which corrupts the VMSA page due to KVM writing
encrypted private memory).

Fixes: 69f8e15ab61f ("KVM: selftests: Use the SEV library APIs in the intra-host migration test")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260310234829.2608037-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Use kvzalloc_objs() when pinning userpages
Sean Christopherson [Fri, 13 Mar 2026 00:33:02 +0000 (17:33 -0700)] 
KVM: SEV: Use kvzalloc_objs() when pinning userpages

Use kvzalloc_objs() instead of sev_pin_memory()'s open coded (rough)
equivalent to harden the code and

Note!  This sanity check in __kvmalloc_node_noprof()

  /* Don't even allow crazy sizes */
  if (unlikely(size > INT_MAX)) {
          WARN_ON_ONCE(!(flags & __GFP_NOWARN));
          return NULL;
  }

will artificially limit the maximum size of any single pinned region to
just under 1TiB.  While there do appear to be providers that support SEV
VMs with more than 1TiB of _total_ memory, it's unlikely any KVM-based
providers pin 1TiB in a single request.
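
The "just under 1TiB" figure can be reproduced with a little arithmetic. The
sketch assumes 4 KiB pages, 8-byte page pointers and a 32-bit int; the helper
names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH	4096ULL	/* assumed 4 KiB pages */
#define PTR_SIZE_SKETCH		8ULL	/* assumed 64-bit page pointers */

/* Largest page-pointer array that fits in an INT_MAX-byte allocation. */
static uint64_t max_pinned_pages_sketch(void)
{
	return (uint64_t)INT_MAX / PTR_SIZE_SKETCH;
}

/* Bytes of guest memory that many pages cover. */
static uint64_t max_pinned_bytes_sketch(void)
{
	return max_pinned_pages_sketch() * PAGE_SIZE_SKETCH;
}
```

Under those assumptions the limit works out to 2^28 - 1 pages, i.e. one page
short of 1TiB.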

Allocate with NOWARN so that fuzzers can't trip the WARN_ON_ONCE() when
they inevitably run on systems with copious amounts of RAM, i.e. when they
can get by KVM's "total_npages > totalram_pages()" restriction.

Note #2, KVM's usage of vmalloc()+kmalloc() instead of kvmalloc() predates
commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") by 4+
years (see commit 89c505809052 ("KVM: SVM: Add support for
KVM_SEV_LAUNCH_UPDATE_DATA command")).  I.e. the open coded behavior wasn't
intended to avoid the aforementioned sanity check.  The implementation
appears to be pure oversight at the time the code was written, as it showed
up in v3[1] of the early RFCs, whereas v2[2] simply used kmalloc().

Cc: Liam Merwick <liam.merwick@oracle.com>
Link: https://lore.kernel.org/all/20170724200303.12197-17-brijesh.singh@amd.com
Link: https://lore.kernel.org/all/148846786714.2349.17724971671841396908.stgit__25299.4950431914$1488470940$gmane$org@brijesh-build-machine
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://patch.msgid.link/20260313003302.3136111-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Use PFN_DOWN() to simplify "number of pages" math when pinning memory
Sean Christopherson [Fri, 13 Mar 2026 00:33:01 +0000 (17:33 -0700)] 
KVM: SEV: Use PFN_DOWN() to simplify "number of pages" math when pinning memory

Use PFN_DOWN() instead of open coded equivalents in sev_pin_memory() to
simplify the code and make it easier to read.

No functional change intended (verified before and after versions of the
generated code are identical).
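
A minimal sketch of this kind of page-count math, assuming 4 KiB pages.
PFN_DOWN_SKETCH() has the same shape as the kernel's PFN_DOWN(), but the
surrounding helper is hypothetical and is not the code in sev_pin_memory().

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4 KiB pages */

/* Byte address -> page frame number, like the kernel's PFN_DOWN(). */
#define PFN_DOWN_SKETCH(x) ((uint64_t)(x) >> PAGE_SHIFT_SKETCH)

/*
 * Count the pages touched by [uaddr, uaddr + ulen). Shifting ulen alone
 * would miss a page when the range straddles a page boundary.
 * Assumes ulen > 0 and that uaddr + ulen does not wrap.
 */
static uint64_t npages_sketch(uint64_t uaddr, uint64_t ulen)
{
	uint64_t first = PFN_DOWN_SKETCH(uaddr);
	uint64_t last = PFN_DOWN_SKETCH(uaddr + ulen - 1);

	return last - first + 1;
}
```

For example, a 16-byte range starting 8 bytes before a page boundary touches
two pages, even though 16 >> 12 is zero.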

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://patch.msgid.link/20260313003302.3136111-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Disallow pinning more pages than exist in the system
Sean Christopherson [Fri, 13 Mar 2026 00:33:00 +0000 (17:33 -0700)] 
KVM: SEV: Disallow pinning more pages than exist in the system

Explicitly disallow pinning more pages for an SEV VM than exist in the
system to defend against absurd userspace requests without relying on
somewhat arbitrary kernel functionality to prevent truly stupid KVM
behavior.  E.g. even with the INT_MAX check, userspace can request that
KVM pin nearly 8TiB of memory, regardless of how much RAM exists in the
system.
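
The "nearly 8TiB" figure and the shape of the new check can be sketched as
follows, assuming 4 KiB pages and a 32-bit int. The helper names and the
-EINVAL return value are hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

#define PIN_PAGE_SIZE_SKETCH 4096ULL	/* assumed 4 KiB pages */

/* Bytes covered when the page count is limited only by INT_MAX. */
static uint64_t int_max_pin_bytes_sketch(void)
{
	return (uint64_t)INT_MAX * PIN_PAGE_SIZE_SKETCH;
}

/* Hypothetical form of the check: never pin more pages than exist. */
static int check_pin_request_sketch(uint64_t total_npages,
				    uint64_t totalram_pages)
{
	if (total_npages > totalram_pages)
		return -EINVAL;
	return 0;
}
```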

Opportunistically rename "locked" to a more descriptive "total_npages".

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://patch.msgid.link/20260313003302.3136111-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Drop useless sanity checks in sev_mem_enc_register_region()
Sean Christopherson [Fri, 13 Mar 2026 00:32:59 +0000 (17:32 -0700)] 
KVM: SEV: Drop useless sanity checks in sev_mem_enc_register_region()

Drop sev_mem_enc_register_region()'s sanity checks on the incoming address
and size, as SEV is 64-bit only, making ULONG_MAX a 64-bit, all-ones value,
and thus making it impossible for kvm_enc_region.{addr,size} to be greater
than ULONG_MAX.

Note, sev_pin_memory() verifies the incoming address is non-NULL (which
isn't strictly required, but whatever), and that addr+size don't wrap to
zero (which _is_ needed and what really needs to be guarded against).
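
The two checks described above can be illustrated with a userspace sketch.
check_range_sketch() and its -EINVAL return value are hypothetical and are
not the code in sev_pin_memory().

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reject a NULL start address and a range whose end wraps around the
 * address space. With unsigned arithmetic, a wrapped sum is smaller than
 * its first operand, which also covers a sum that wraps to exactly zero.
 */
static int check_range_sketch(uint64_t uaddr, uint64_t ulen)
{
	if (uaddr == 0)
		return -EINVAL;
	if (uaddr + ulen < uaddr)
		return -EINVAL;
	return 0;
}
```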

Note #2, pin_user_pages_fast() guards against the end address walking into
kernel address space, so lack of an access_ok() check is also safe (maybe
not ideal, but safe).

No functional change intended (the generated code is literally the same,
i.e. the compiler was smart enough to know the checks were useless).

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://patch.msgid.link/20260313003302.3136111-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago KVM: SEV: Drop WARN on large size for KVM_MEMORY_ENCRYPT_REG_REGION
Sean Christopherson [Fri, 13 Mar 2026 00:32:58 +0000 (17:32 -0700)] 
KVM: SEV: Drop WARN on large size for KVM_MEMORY_ENCRYPT_REG_REGION

Drop the WARN in sev_pin_memory() on npages overflowing an int, as the
WARN is comically trivial to trigger from userspace, e.g. by doing:

  struct kvm_enc_region range = {
          .addr = 0,
          .size = -1ul,
  };

  __vm_ioctl(vm, KVM_MEMORY_ENCRYPT_REG_REGION, &range);

Note, the checks in sev_mem_enc_register_region() that presumably exist to
verify the incoming address+size are completely worthless, as both "addr"
and "size" are u64s and SEV is 64-bit only, i.e. they _can't_ be greater
than ULONG_MAX.  That wart will be cleaned up in the near future.

  if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
          return -EINVAL;

Opportunistically add a comment to explain why the code calculates the
number of pages the "hard" way, e.g. instead of just shifting @ulen.

Fixes: 78824fabc72e ("KVM: SVM: fix svn_pin_memory()'s use of get_user_pages_fast()")
Cc: stable@vger.kernel.org
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://patch.msgid.link/20260313003302.3136111-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago misc: pci_endpoint_test: Use -EINVAL for small subrange size
Koichiro Den [Fri, 20 Mar 2026 14:01:39 +0000 (23:01 +0900)] 
misc: pci_endpoint_test: Use -EINVAL for small subrange size

The sub_size check ensures that each subrange is large enough for 32-bit
accesses. Subranges smaller than sizeof(u32) do not satisfy this
assumption, so this is a local sanity check rather than a resource
exhaustion case.

Return -EINVAL instead of -ENOSPC for this case.

Suggested-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260320140139.2415480-1-den@valinux.co.jp
2 weeks ago KVM: x86: Suppress WARNs on nested_run_pending after userspace exit
Sean Christopherson [Thu, 12 Mar 2026 23:48:23 +0000 (16:48 -0700)] 
KVM: x86: Suppress WARNs on nested_run_pending after userspace exit

To end an ongoing game of whack-a-mole between KVM and syzkaller, WARN on
illegally cancelling a pending nested VM-Enter if and only if userspace
has NOT gained control of the vCPU since the nested run was initiated.  As
proven time and time again by syzkaller, userspace can clobber vCPU state
so as to force a VM-Exit that violates KVM's architectural modelling of
VMRUN/VMLAUNCH/VMRESUME.

To detect that userspace has gained control, while minimizing the risk of
operating on stale data, convert nested_run_pending from a pure boolean to
a tri-state of sorts, where '0' is still "not pending", '1' is "pending",
and '2' is "pending but untrusted".  Then on KVM_RUN, if the flag is in
the "trusted pending" state, move it to "untrusted pending".

Note, moving the state to "untrusted" even if KVM_RUN is ultimately
rejected is a-ok, because for the "untrusted" state to matter, KVM must
get past kvm_x86_vcpu_pre_run() at some point for the vCPU.
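
The tri-state described above can be sketched as a small state machine. The
enum and function names are hypothetical; only the values 0, 1 and 2 and
their meanings come from the description.

```c
/* Hypothetical names for the three states. */
enum nested_run_pending_sketch {
	NRP_NOT_PENDING = 0,
	NRP_PENDING = 1,		/* trusted: userspace has not run since */
	NRP_PENDING_UNTRUSTED = 2,	/* userspace has had control since */
};

/* On KVM_RUN, demote a trusted pending run to untrusted. */
static void on_kvm_run_sketch(enum nested_run_pending_sketch *state)
{
	if (*state == NRP_PENDING)
		*state = NRP_PENDING_UNTRUSTED;
}

/* Warn about a cancelled nested VM-Enter only if the state is trusted. */
static int should_warn_on_cancel_sketch(enum nested_run_pending_sketch state)
{
	return state == NRP_PENDING;
}
```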

Reviewed-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260312234823.3120658-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago Merge tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 3 Apr 2026 16:33:38 +0000 (09:33 -0700)] 
Merge tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix kerneldocs for gpio-timberdale and gpio-nomadik

 - clear the "requested" flag in error path in gpiod_request_commit()

 - call of_xlate() if provided when setting up shared GPIOs

 - handle pins shared by child firmware nodes of consumer devices

 - fix return value check in gpio-qixis-fpga

 - fix suspend on gpio-mxc

 - fix gpio-microchip DT bindings

* tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  dt-bindings: gpio: fix microchip #interrupt-cells
  gpio: shared: shorten the critical section in gpiochip_setup_shared()
  gpio: mxc: map Both Edge pad wakeup to Rising Edge
  gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio()
  gpio: shared: handle pins shared by child nodes of devices
  gpio: shared: call gpio_chip::of_xlate() if set
  gpiolib: clear requested flag if line is invalid
  gpio: nomadik: repair some kernel-doc comments
  gpio: timberdale: repair kernel-doc comments
  gpio: Fix resource leaks on errors in gpiochip_add_data_with_key()

2 weeks ago KVM: x86: Move nested_run_pending to kvm_vcpu_arch
Yosry Ahmed [Thu, 12 Mar 2026 23:48:22 +0000 (16:48 -0700)] 
KVM: x86: Move nested_run_pending to kvm_vcpu_arch

Move the nested_run_pending field, present in both svm_nested_state and
nested_vmx, to the common kvm_vcpu_arch. This allows common code to use
it without plumbing it through per-vendor helpers.

nested_run_pending remains zero-initialized, as the entire kvm_vcpu
struct is, and all further accesses are done through vcpu->arch instead
of svm->nested or vmx->nested.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
[sean: expand the comment in the field declaration]
Link: https://patch.msgid.link/20260312234823.3120658-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2 weeks ago Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 3 Apr 2026 15:47:13 +0000 (08:47 -0700)] 
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:

 - Implement a basic static call trampoline to fix CFI failures with the
   generic implementation

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Use static call trampolines when kCFI is enabled

2 weeks ago selftests/seccomp: Add hard-coded __NR_uprobe for x86_64
Oleg Nesterov [Fri, 3 Apr 2026 13:30:40 +0000 (15:30 +0200)] 
selftests/seccomp: Add hard-coded __NR_uprobe for x86_64

This complements the commit 18f7686a1ce6 ("selftests/seccomp:
Add hard-coded __NR_uretprobe for x86_64").

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/ac_BAMSggw-_ABPE@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks ago Merge branch 'bpf-prep-patches-for-static-stack-liveness'
Alexei Starovoitov [Fri, 3 Apr 2026 15:33:48 +0000 (08:33 -0700)] 
Merge branch 'bpf-prep-patches-for-static-stack-liveness'

Alexei Starovoitov says:

====================
bpf: Prep patches for static stack liveness.

v4->v5:
- minor test fixup

v3->v4:
- fixed invalid recursion detection when callback is called multiple times

v3: https://lore.kernel.org/bpf/20260402212856.86606-1-alexei.starovoitov@gmail.com/

v2->v3:
- added recursive call detection
- fixed ubsan warning
- removed double declaration in the header
- added Acks

v2: https://lore.kernel.org/bpf/20260402061744.10885-1-alexei.starovoitov@gmail.com/

v1->v2:
. fixed bugs spotted by Eduard, Mykyta, claude and gemini
. fixed selftests that were failing in unpriv
. gemini(sashiko) found several precision improvements in patch 6,
  but they made no difference in real programs.

v1: https://lore.kernel.org/bpf/20260401021635.34636-1-alexei.starovoitov@gmail.com/
First 6 prep patches for static stack liveness.

. do src/dst_reg validation early and remove defensive checks

. sort subprog in topo order. We wanted to do this long ago
  to process global subprogs this way and in other cases.

. Add constant folding pass that computes map_ptr, subprog_idx,
  loads from readonly maps, and other constants that fit into 32-bit

. Use these constants to eliminate dead code. Replace predicted
  conditional branches with "jmp always". That reduces JIT prog size.

. Add two helpers that return access size from their arguments.
====================

Link: https://patch.msgid.link/20260403024422.87231-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago bpf: Add helper and kfunc stack access size resolution
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:21 +0000 (19:44 -0700)] 
bpf: Add helper and kfunc stack access size resolution

The static stack liveness analysis needs to know how many bytes a
helper or kfunc accesses through a stack pointer argument, so it can
precisely mark the affected stack slots as stack 'def' or 'use'.

Add bpf_helper_stack_access_bytes() and bpf_kfunc_stack_access_bytes()
which resolve the access size for a given call argument.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-7-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago bpf: Move verifier helpers to header
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:20 +0000 (19:44 -0700)] 
bpf: Move verifier helpers to header

Move several helpers to a header as preparation for
the subsequent stack liveness patches.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-6-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago bpf: Add bpf_compute_const_regs() and bpf_prune_dead_branches() passes
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:19 +0000 (19:44 -0700)] 
bpf: Add bpf_compute_const_regs() and bpf_prune_dead_branches() passes

Add two passes before the main verifier pass:

bpf_compute_const_regs() is a forward dataflow analysis that tracks
register values in R0-R9 across the program using fixed-point
iteration in reverse postorder. Each register is tracked with
a six-state lattice:

  UNVISITED -> CONST(val) / MAP_PTR(map_index) /
               MAP_VALUE(map_index, offset) / SUBPROG(num) -> UNKNOWN

At merge points, if two paths produce the same state and value for
a register, it stays; otherwise it becomes UNKNOWN.
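
The merge rule can be sketched in userspace C. The enum, struct and function
names are hypothetical, and treating UNVISITED as "no information yet" (so the
other side wins) is an assumption that is common for such lattices but is not
spelled out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of the six lattice states. */
enum const_state_sketch {
	CS_UNVISITED,
	CS_CONST,
	CS_MAP_PTR,
	CS_MAP_VALUE,
	CS_SUBPROG,
	CS_UNKNOWN,
};

struct reg_const_sketch {
	enum const_state_sketch state;
	uint64_t val;	/* constant, map index, or subprog number */
	uint32_t off;	/* only meaningful for CS_MAP_VALUE */
};

static bool same_sketch(struct reg_const_sketch a, struct reg_const_sketch b)
{
	if (a.state != b.state || a.val != b.val)
		return false;
	return a.state != CS_MAP_VALUE || a.off == b.off;
}

/* Join two incoming states at a control-flow merge point. */
static struct reg_const_sketch merge_sketch(struct reg_const_sketch a,
					    struct reg_const_sketch b)
{
	struct reg_const_sketch unknown = { CS_UNKNOWN, 0, 0 };

	if (a.state == CS_UNVISITED)
		return b;	/* assumed: nothing known yet from this side */
	if (b.state == CS_UNVISITED)
		return a;
	if (a.state == CS_UNKNOWN || b.state == CS_UNKNOWN)
		return unknown;
	return same_sketch(a, b) ? a : unknown;
}
```

Because a state can only move towards UNKNOWN, repeated merging reaches a
fixed point after a bounded number of steps.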

The analysis handles:
 - MOV, ADD, SUB, AND with immediate or register operands
 - LD_IMM64 for plain constants, map FDs, map values, and subprogs
 - LDX from read-only maps: constant-folds the load by reading the
   map value directly via bpf_map_direct_read()

Results that fit in 32 bits are stored per-instruction in
insn_aux_data and bitmasks.

bpf_prune_dead_branches() uses the computed constants to evaluate
conditional branches. When both operands of a conditional jump are
known constants, the branch outcome is determined statically and the
instruction is rewritten to an unconditional jump.
The CFG postorder is then recomputed to reflect new control flow.
This eliminates dead edges so that subsequent liveness analysis
doesn't propagate through dead code.
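
A toy version of the branch rewrite, to make the idea concrete. The opcode
subset, struct insn_sketch and the choice of offset 0 for a never-taken branch
are assumptions; this is not the BPF instruction encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of jump opcodes. */
enum jmp_op_sketch { JMP_ALWAYS, JMP_EQ, JMP_NE, JMP_GT };

struct insn_sketch {
	enum jmp_op_sketch op;
	int16_t off;	/* branch target, relative to the next instruction */
};

static bool eval_sketch(enum jmp_op_sketch op, uint64_t dst, uint64_t src)
{
	switch (op) {
	case JMP_EQ:
		return dst == src;
	case JMP_NE:
		return dst != src;
	case JMP_GT:
		return dst > src;
	default:
		return true;	/* JMP_ALWAYS */
	}
}

/*
 * Both operands are known constants, so the outcome is fixed: keep the
 * offset if the branch is taken, otherwise fall through via offset 0.
 */
static void prune_branch_sketch(struct insn_sketch *insn, uint64_t dst,
				uint64_t src)
{
	if (!eval_sketch(insn->op, dst, src))
		insn->off = 0;
	insn->op = JMP_ALWAYS;
}
```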

Also add a runtime sanity check to validate that precomputed
constants match the verifier's tracked state.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-5-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago selftests/bpf: Add tests for subprog topological ordering
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:18 +0000 (19:44 -0700)] 
selftests/bpf: Add tests for subprog topological ordering

Add a few tests for topo sort:
- linear chain: main -> A -> B
- diamond: main -> A, main -> B, A -> C, B -> C
- mixed global/static: main -> global -> static leaf
- shared callee: main -> leaf, main -> global -> leaf
- duplicate calls: main calls same subprog twice
- no calls: single subprog

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-4-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago bpf: Sort subprogs in topological order after check_cfg()
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:17 +0000 (19:44 -0700)] 
bpf: Sort subprogs in topological order after check_cfg()

Add a pass that sorts subprogs in topological order so that iterating
subprog_topo_order[] walks leaf subprogs first, then their callers.
This is computed as a DFS post-order traversal of the CFG.

The pass runs after check_cfg() to ensure the CFG has been validated
before traversing and after postorder has been computed to avoid
walking dead code.
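
A DFS post-order over a small call graph shows why leaves come out first. The
adjacency-matrix representation and all names below are hypothetical, and the
sketch assumes there are no recursive calls.

```c
#include <stdbool.h>

#define MAX_SUBPROGS_SKETCH 8

/* Hypothetical call graph: calls[i][j] is true if subprog i calls j. */
struct callgraph_sketch {
	int nr;
	bool calls[MAX_SUBPROGS_SKETCH][MAX_SUBPROGS_SKETCH];
};

static void visit_sketch(const struct callgraph_sketch *g, int i,
			 bool *seen, int *order, int *n)
{
	seen[i] = true;
	for (int j = 0; j < g->nr; j++)
		if (g->calls[i][j] && !seen[j])
			visit_sketch(g, j, seen, order, n);
	order[(*n)++] = i;	/* post-order: emitted after all callees */
}

/*
 * Fill order[] so that callees precede their callers and return the
 * number of entries. Only subprogs reachable from subprog 0 (taken to be
 * the main program) are visited.
 */
static int topo_order_sketch(const struct callgraph_sketch *g, int *order)
{
	bool seen[MAX_SUBPROGS_SKETCH] = { false };
	int n = 0;

	visit_sketch(g, 0, seen, order, &n);
	return n;
}
```

For a diamond (main -> A, main -> B, A -> C, B -> C) the shared callee C is
emitted first and main last.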

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks ago bpf: Do register range validation early
Alexei Starovoitov [Fri, 3 Apr 2026 02:44:16 +0000 (19:44 -0700)] 
bpf: Do register range validation early

Instead of checking the src/dst register range multiple times during
the main verifier pass, do it once up front.
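
As a rough illustration of the idea, a single pre-pass can reject any
instruction whose register fields are out of range, so later passes can
index register state without re-checking. The struct insn layout and
check_reg_ranges() below are hypothetical simplifications, not the
verifier's actual data structures; the register count of 11 (R0..R10) is
the one assumption taken from BPF itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_REGS 11	/* R0..R10 */

/* Simplified, hypothetical instruction: only the fields this check needs. */
struct insn {
	uint8_t code;		/* opcode, not inspected here */
	uint8_t dst_reg;
	uint8_t src_reg;
};

/* One early pass over the program; returns 0 or -EINVAL. */
static int check_reg_ranges(const struct insn *prog, size_t len)
{
	assert(prog != NULL || len == 0);
	for (size_t i = 0; i < len; i++)
		if (prog[i].dst_reg >= NR_REGS || prog[i].src_reg >= NR_REGS)
			return -EINVAL;
	return 0;
}
```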

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoMerge tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 3 Apr 2026 15:23:51 +0000 (08:23 -0700)] 
Merge tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Hopefully no Easter eggs in this bunch of fixes. Usual stuff across
  the amd/intel with some misc bits. Thanks to Thorsten and Alex for
  making sure a regression fix that was hanging around in process land
  finally made it in, that is probably the biggest change in here.

  core:
   - revert unplug/framebuffer fix as it caused problems
   - compat ioctl speculation fix

  bridge:
   - refcounting fix

  sysfb:
   - error handling fix

  amdgpu:
   - fix renoir audio regression
   - UserQ fixes
   - PASID handling fix
   - S4 fix for smu11 chips
   - Misc small fixes

  amdkfd:
   - Non-4K page fixes

  i915:
   - Fix for #12045: Huawei Matebook E (DRR-WXX): Persistent Black
     Screen on Boot with i915 and Gen11: Modesetting and Backlight
     Control Malfunction
   - Fix for #15826: i915: Raptor Lake-P [UHD Graphics] display
     flicker/corruption on eDP panel
   - Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP

  xe:
   - uapi: Accept canonical GPU addresses in xe_vm_madvise_ioctl
   - Disallow writes to read-only VMAs
   - PXP fixes
   - Disable garbage collector work item on SVM close
   - Avoid memory allocations in xe_device_declare_wedged

  qaic:
   - hang fix

  ast:
   - initialisation fix"

* tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel: (28 commits)
  drm/amd/display: Wire up dcn10_dio_construct() for all pre-DCN401 generations
  drm/ioc32: stop speculation on the drm_compat_ioctl path
  drm/sysfb: Fix efidrm error handling and memory type mismatch
  drm/i915/dp: Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
  drm/i915/cdclk: Do the full CDCLK dance for min_voltage_level changes
  drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
  drm/amdgpu: Fix wait after reset sequence in S4
  drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()
  drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
  drm/amdgpu/userq: fix memory leak in MQD creation error paths
  drm/amd: Fix MQD and control stack alignment for non-4K
  drm/amdkfd: Align expected_queue_size to PAGE_SIZE
  drm/amdgpu: fix the idr allocation flags
  drm/amdgpu: validate doorbell_offset in user queue creation
  drm/amdgpu/pm: drop SMU driver if version not matched messages
  drm/xe: Avoid memory allocations in xe_device_declare_wedged()
  drm/xe: Disable garbage collector work item on SVM close
  drm/xe/pxp: Don't allow PXP on older PTL GSC FWs
  drm/xe/pxp: Clear restart flag in pxp_start after jumping back
  drm/xe/pxp: Remove incorrect handling of impossible state during suspend
  ...

2 weeks agoipmi: ssif_bmc: add unit test for state machine
Jian Zhang [Fri, 3 Apr 2026 14:39:38 +0000 (22:39 +0800)] 
ipmi: ssif_bmc: add unit test for state machine

Add unit tests for the state machine when it is in the SSIF_ABORTING state.

Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403143939.434017-1-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc6+
Alexei Starovoitov [Fri, 3 Apr 2026 15:12:58 +0000 (08:12 -0700)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc6+

Cross-merge BPF and other fixes after downstream PR.

Minor conflict in kernel/bpf/verifier.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agox86/alternative: delay freeing of smp_locks section
Mike Rapoport (Microsoft) [Mon, 30 Mar 2026 19:10:00 +0000 (22:10 +0300)] 
x86/alternative: delay freeing of smp_locks section

On SMP systems alternative_instructions() frees memory occupied by
smp_locks section immediately after patching the lock instructions.

The memory is freed using free_init_pages() that calls free_reserved_area()
that essentially does __free_page() for every page in the range.

Up until recently it didn't update memblock state so in cases when
CONFIG_ARCH_KEEP_MEMBLOCK is enabled (on x86 it is selected by
INTEL_TDX_HOST), the state of memblock and the memory map would be
inconsistent.

Additionally, with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled, freeing of
smp_locks happens before the memory map is fully initialized and freeing
reserved memory may cause an access to not-yet-initialized struct page when
__free_page() searches for a buddy page.

Following the discussion in [1], implementation of memblock_free_late() and
free_reserved_area() was unified to ensure that reserved memory that's
freed after memblock transfers the pages to the buddy allocator is actually
freed and that the memblock and the memory map are consistent. As a part of
these changes, free_reserved_area() now WARN()s when it is called before
the initialization of the memory map is complete.

The memory map is fully initialized in page_alloc_init_late() that
completes before initcalls are executed, so it is safe to free reserved
memory in any initcall except early_initcall().

Move freeing of smp_locks section to an initcall to ensure it will happen
after the memory map is fully initialized. Since it does not matter exactly
which initcall is used and the code lives in arch/, pick arch_initcall.

[1] https://lore.kernel.org/all/ec2aaef14783869b3be6e3c253b2dcbf67dbc12a.camel@kernel.crashing.org

Reported-By: Bert Karwatzki <spasswolf@web.de>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202603302154.b50adaf1-lkp@intel.com
Tested-By: Bert Karwatzki <spasswolf@web.de>
Link: https://lore.kernel.org/r/20260327140109.7561-1-spasswolf@web.de
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: b2129a39511b ("memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2 weeks agoASoC: Intel: Fix MCLK leaks and clean up error
Mark Brown [Fri, 3 Apr 2026 14:15:04 +0000 (15:15 +0100)] 
ASoC: Intel: Fix MCLK leaks and clean up error

aravindanilraj0702@gmail.com <aravindanilraj0702@gmail.com> says:

From: Aravind Anilraj <aravindanilraj0702@gmail.com>

This series fixes MCLK resource leaks in the platform_clock_control()
implementations for bytcr_rt5640, bytcr_rt5651, and cht_bsw_rt5672.

In the SND_SOC_DAPM_EVENT_ON() path, clk_prepare_enable() is called to
enable MCLK, but subsequent failures in codec clock configuration (e.g.
*_prepare_and_enable_pll1() or snd_soc_dai_set_sysclk()) return without
disabling the clock, leaking a reference.

Patches 1-3 fix this by adding the missing clk_disable_unprepare() calls
in the relevant error paths, ensuring proper symmetry between enable and
disable operations within the EVENT_ON scope.
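
The enable/disable symmetry can be sketched in userspace C with a mock
clock that merely counts outstanding enables. All mock_* names and
clock_event_on() are hypothetical stand-ins for the clk API and the board
drivers' platform_clock_control(); the sketch only shows the shape of the
error path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a clock: counts outstanding enables. */
struct mock_clk {
	int enable_count;
};

static int mock_clk_prepare_enable(struct mock_clk *clk)
{
	clk->enable_count++;
	return 0;
}

static void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	assert(clk->enable_count > 0);
	clk->enable_count--;
}

/* Stand-in for codec PLL/sysclk setup; fails when asked to. */
static int mock_configure_codec(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/* EVENT_ON path: the clock stays enabled only if every step succeeds. */
static int clock_event_on(struct mock_clk *mclk, int codec_should_fail)
{
	int ret;

	ret = mock_clk_prepare_enable(mclk);
	if (ret)
		return ret;

	ret = mock_configure_codec(codec_should_fail);
	if (ret) {
		/* The previously missing step: undo the enable before returning. */
		mock_clk_disable_unprepare(mclk);
		return ret;
	}
	return 0;
}
```

After a failed call the enable count is back to zero, which is the
invariant the series restores.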

Patch 4 moves unrelated logging changes into a separate patch and
standardizes error messages.

2 weeks agoASoC: Intel: Standardize MCLK error logs across RT boards
Aravind Anilraj [Wed, 1 Apr 2026 22:05:07 +0000 (18:05 -0400)] 
ASoC: Intel: Standardize MCLK error logs across RT boards

Standardize the error logging in platform_clock_control() by adding
missing newline characters to dev_err() strings. Additionally, include
the return code in the error messages to assist with debugging.

Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-5-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: Intel: cht_bsw_rt5672: Fix MCLK leak on platform_clock_control error
Aravind Anilraj [Wed, 1 Apr 2026 22:05:06 +0000 (18:05 -0400)] 
ASoC: Intel: cht_bsw_rt5672: Fix MCLK leak on platform_clock_control error

If snd_soc_dai_set_pll() or snd_soc_dai_set_sysclk() fail inside the
EVENT_ON path, the function returns without calling
clk_disable_unprepare() on ctx->mclk, which was already enabled earlier
in the same code path. Add the missing clk_disable_unprepare() calls
before returning the error.

Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-4-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: Intel: bytcr_rt5651: Fix MCLK leak on platform_clock_control error
Aravind Anilraj [Wed, 1 Apr 2026 22:05:05 +0000 (18:05 -0400)] 
ASoC: Intel: bytcr_rt5651: Fix MCLK leak on platform_clock_control error

If byt_rt5651_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.

Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-3-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: Intel: bytcr_rt5640: Fix MCLK leak on platform_clock_control error
Aravind Anilraj [Wed, 1 Apr 2026 22:05:04 +0000 (18:05 -0400)] 
ASoC: Intel: bytcr_rt5640: Fix MCLK leak on platform_clock_control error

If byt_rt5640_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.

Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-2-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: imx-rpmsg: Add DSD format support with dynamic DAI format switching
Chancel Liu [Thu, 26 Mar 2026 05:56:14 +0000 (14:56 +0900)] 
ASoC: imx-rpmsg: Add DSD format support with dynamic DAI format switching

Add hw_params callback to dynamically switch DAI format between I2S
and PDM based on audio stream format. When DSD formats are detected,
the DAI format is switched to PDM mode.

Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Link: https://patch.msgid.link/20260326055614.3614104-1-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: SDCA: Export Q7.8 volume control helpers
Niranjan H Y [Wed, 1 Apr 2026 13:21:45 +0000 (18:51 +0530)] 
ASoC: SDCA: Export Q7.8 volume control helpers

Export the Q7.8 volume control helpers to allow reuse
by other ASoC drivers. These functions handle 16-bit
signed Q7.8 fixed-point format values for volume controls.

Changes include:
- Rename q78_get_volsw to sdca_asoc_q78_get_volsw
- Rename q78_put_volsw to sdca_asoc_q78_put_volsw
- Add convenience macros SDCA_SINGLE_Q78_TLV and
  SDCA_DOUBLE_Q78_TLV for creating mixer controls

This allows other ASoC drivers to easily implement controls
using the Q7.8 fixed-point format without duplicating code.
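
For readers unfamiliar with the format: a Q7.8 value is a signed 16-bit
integer whose low 8 bits are the fraction, so the represented number is
raw / 256. The helpers below are a hypothetical userspace sketch of that
arithmetic (scaled to hundredths of a unit) and are not the exported
sdca_asoc_q78_* functions.

```c
#include <assert.h>
#include <stdint.h>

#define Q78_FRAC_BITS 8

/* Converts a Q7.8 value to hundredths of a unit (for example centi-dB). */
static int q78_to_centi(int16_t raw)
{
	return ((int)raw * 100) / (1 << Q78_FRAC_BITS);
}

/* Converts hundredths of a unit back to Q7.8; the result must fit in 16 bits. */
static int16_t centi_to_q78(int centi)
{
	int raw = (centi * (1 << Q78_FRAC_BITS)) / 100;

	assert(raw >= INT16_MIN && raw <= INT16_MAX);
	return (int16_t)raw;
}
```

For example, the raw value 0x0180 (384) stands for 1.5, and -128 stands
for -0.5.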

Signed-off-by: Niranjan H Y <niranjan.hy@ti.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260401132148.2367-1-niranjan.hy@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: codecs: tlv320dac33: remove kmemdup_array
Rosen Penev [Thu, 2 Apr 2026 02:50:40 +0000 (19:50 -0700)] 
ASoC: codecs: tlv320dac33: remove kmemdup_array

Use a flexible array member and struct_size to use one allocation.
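
A userspace sketch of the pattern, assuming a hypothetical struct
reg_cache rather than the driver's real private data: the header and the
trailing array come from one allocation instead of a struct plus a
separately duplicated array. In the kernel, struct_size() would compute
the size with overflow handling; here the check is written out by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a header followed by the register values. */
struct reg_cache {
	size_t count;
	uint8_t regs[];		/* flexible array member */
};

/* One allocation covers the header and a copy of the defaults. */
static struct reg_cache *reg_cache_alloc(const uint8_t *defaults, size_t count)
{
	struct reg_cache *cache;

	/* Manual overflow guard for the size computation below. */
	assert(count <= (SIZE_MAX - sizeof(*cache)) / sizeof(cache->regs[0]));

	cache = malloc(sizeof(*cache) + count * sizeof(cache->regs[0]));
	if (!cache)
		return NULL;
	cache->count = count;
	memcpy(cache->regs, defaults, count * sizeof(cache->regs[0]));
	return cache;
}
```

A single free() then releases both the header and the array.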

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260402025040.93569-1-rosenp@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: SDCA: Add RJ support to class driver
Charles Keepax [Fri, 27 Mar 2026 16:27:32 +0000 (16:27 +0000)] 
ASoC: SDCA: Add RJ support to class driver

Add the retaskable jack Function to the list of Functions supported by
the class driver; it shouldn't require anything that isn't already
supported.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260327162732.877257-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoexfat: use exfat_chain_advance helper
Chi Zhiling [Fri, 3 Apr 2026 08:05:38 +0000 (16:05 +0800)] 
exfat: use exfat_chain_advance helper

Replace open-coded cluster chain walking logic with exfat_chain_advance()
across exfat_readdir, exfat_find_dir_entry, exfat_count_dir_entries,
exfat_search_empty_slot and exfat_check_dir_empty.

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoexfat: introduce exfat_chain_advance helper
Chi Zhiling [Fri, 3 Apr 2026 08:05:37 +0000 (16:05 +0800)] 
exfat: introduce exfat_chain_advance helper

Introduce exfat_chain_advance() to walk an exfat_chain structure by a
given step, updating both ->dir and ->size fields atomically. This
helper handles both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes with
proper boundary checking.
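
A simplified userspace sketch of such a helper follows. The struct chain
layout, the in-memory fat[] table and chain_advance() are hypothetical and
only model the two modes: contiguous allocation advances by arithmetic,
while a FAT chain follows one table entry per step. Both fields are
updated only when the whole walk succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define EOF_CLUSTER 0xFFFFFFFFu

enum alloc_mode { NO_FAT_CHAIN, FAT_CHAIN };

struct chain {
	unsigned int dir;	/* current cluster */
	unsigned int size;	/* clusters remaining, counting dir */
	enum alloc_mode mode;
};

/* Advances the chain by step clusters; returns 0 or -EIO, leaving *c untouched on error. */
static int chain_advance(struct chain *c, const unsigned int *fat,
			 size_t fat_len, unsigned int step)
{
	unsigned int clu = c->dir;

	assert(c->mode == NO_FAT_CHAIN || fat != NULL);
	if (step > c->size)
		return -EIO;	/* would run past the end of the chain */

	if (c->mode == NO_FAT_CHAIN) {
		clu += step;	/* clusters are contiguous */
	} else {
		for (unsigned int i = 0; i < step; i++) {
			if (clu >= fat_len)
				return -EIO;	/* EOF marker or corrupt entry */
			clu = fat[clu];
		}
	}

	c->dir = clu;
	c->size -= step;
	return 0;
}
```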

Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoexfat: remove NULL cache pointer case in exfat_ent_get
Chi Zhiling [Fri, 3 Apr 2026 08:05:36 +0000 (16:05 +0800)] 
exfat: remove NULL cache pointer case in exfat_ent_get

Since exfat_get_next_cluster has been updated, no callers pass a NULL
pointer to exfat_ent_get, so remove the handling logic for this case.

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoexfat: use exfat_cluster_walk helper
Chi Zhiling [Fri, 3 Apr 2026 08:05:35 +0000 (16:05 +0800)] 
exfat: use exfat_cluster_walk helper

Replace the custom exfat_walk_fat_chain() function and open-coded
FAT chain walking logic with the exfat_cluster_walk() helper across
exfat_find_location, __exfat_get_dentry_set, and exfat_map_cluster.

Suggested-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoexfat: introduce exfat_cluster_walk helper
Chi Zhiling [Fri, 3 Apr 2026 08:05:34 +0000 (16:05 +0800)] 
exfat: introduce exfat_cluster_walk helper

Introduce exfat_cluster_walk() to walk the FAT chain by a given step,
handling both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes. Also
redefine exfat_get_next_cluster as a thin wrapper around it for
backward compatibility.

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoexfat: fix incorrect directory checksum after rename to shorter name
Chi Zhiling [Fri, 3 Apr 2026 08:05:33 +0000 (16:05 +0800)] 
exfat: fix incorrect directory checksum after rename to shorter name

When renaming a file in-place to a shorter name, exfat_remove_entries
marks excess entries as DELETED, but es->num_entries is not updated
accordingly. As a result, exfat_update_dir_chksum iterates over the
deleted entries and computes an incorrect checksum.

This does not lead to persistent corruption because mark_inode_dirty()
is called afterward, and __exfat_write_inode later recomputes the
checksum using the correct num_entries value.

Fix by setting es->num_entries = num_entries in exfat_init_ext_entry.
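
To see why a stale entry count matters, here is a userspace sketch of the
entry-set checksum as defined in the exFAT specification (rotate right by
one bit, add the next byte, skip the checksum field at bytes 2 and 3).
The function name and the test buffer are illustrative, not the kernel's
exfat_update_dir_chksum().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DENTRY_SIZE 32

/* Checksum over the first num_entries directory entries of a set. */
static uint16_t entry_set_checksum(const uint8_t *entries,
				   unsigned int num_entries)
{
	size_t len = (size_t)num_entries * DENTRY_SIZE;
	uint16_t sum = 0;

	assert(num_entries >= 1);
	for (size_t i = 0; i < len; i++) {
		if (i == 2 || i == 3)	/* skip the checksum field itself */
			continue;
		/* 16-bit rotate right by one, then add the byte. */
		sum = (uint16_t)(((sum & 1) ? 0x8000 : 0) + (sum >> 1) + entries[i]);
	}
	return sum;
}
```

If a set shrinks from three entries to two but the count still says
three, the bytes of the leftover third entry are folded into the sum and
the result no longer matches the two-entry checksum.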

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2 weeks agoNFSD: Docs: clean up pnfs server timeout docs
Randy Dunlap [Wed, 18 Mar 2026 22:21:05 +0000 (15:21 -0700)] 
NFSD: Docs: clean up pnfs server timeout docs

Make various changes to the documentation formatting to avoid docs
build errors and otherwise improve the produced output format:

- use bullets for lists
- don't use a '.' at the end of echo commands
- fix indentation

Documentation/admin-guide/nfs/pnfs-block-server.rst:55: ERROR: Unexpected indentation. [docutils]
Documentation/admin-guide/nfs/pnfs-scsi-server.rst:37: ERROR: Unexpected indentation. [docutils]

Fixes: 6a97f70b45e7 ("NFSD: Enforce timeout on layout recall and integrate lease manager fencing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agonfsd: fix comment typo in nfsxdr
Joseph Salisbury [Mon, 16 Mar 2026 18:28:45 +0000 (14:28 -0400)] 
nfsd: fix comment typo in nfsxdr

The file contains a spelling error in a source comment (occured).

Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.

Replace 'occured' with 'occurred' in the affected comment. This is a
comment-only cleanup and does not change behavior.

Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agonfsd: fix comment typo in nfs3xdr
Joseph Salisbury [Mon, 16 Mar 2026 18:25:16 +0000 (14:25 -0400)] 
nfsd: fix comment typo in nfs3xdr

The file contains a spelling error in a source comment (occured).

Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.

Replace 'occured' with 'occurred' in the affected comment. This is a
comment-only cleanup and does not change behavior.

Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agoNFSD: convert callback RPC program to per-net namespace
Dai Ngo [Fri, 13 Mar 2026 16:31:48 +0000 (12:31 -0400)] 
NFSD: convert callback RPC program to per-net namespace

The callback channel's rpc_program, rpc_version, rpc_stat,
and per-procedure counts are declared as file-scope statics in
nfs4callback.c, shared across all network namespaces.
Forechannel RPC statistics are already maintained per-netns
(via nfsd_svcstats in struct nfsd_net); the backchannel
has no such separation. When backchannel statistics are
eventually surfaced to userspace, the global counters would
expose cross-namespace data.

Allocate per-netns copies of these structures through a new
opaque struct nfsd_net_cb, managed by nfsd_net_cb_init()
and nfsd_net_cb_shutdown(). The struct definition is private
to nfs4callback.c; struct nfsd_net holds only a pointer.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agoNFSD: use per-operation statidx for callback procedures
Chuck Lever [Fri, 13 Mar 2026 16:31:47 +0000 (12:31 -0400)] 
NFSD: use per-operation statidx for callback procedures

The callback RPC procedure table uses NFSPROC4_CB_##call for
p_statidx, which maps CB_NULL to index 0 and every
compound-based callback (CB_RECALL, CB_LAYOUT, CB_OFFLOAD,
etc.) to index 1. All compound callback operations therefore
share a single statistics counter, making per-operation
accounting impossible.

Assign p_statidx from the NFSPROC4_CLNT_##proc enum instead,
giving each callback operation its own counter slot. The
counts array is already sized by ARRAY_SIZE(nfs4_cb_procedures),
so no allocation change is needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agosvcrdma: Use contiguous pages for RDMA Read sink buffers
Chuck Lever [Tue, 10 Mar 2026 19:39:25 +0000 (15:39 -0400)] 
svcrdma: Use contiguous pages for RDMA Read sink buffers

svc_rdma_build_read_segment() constructs RDMA Read sink
buffers by consuming pages one-at-a-time from rq_pages[]
and building one bvec per page. A 64KB NFS READ payload
produces 16 separate bvecs, 16 DMA mappings, and
potentially multiple RDMA Read WRs (on platforms with
4KB pages).

A single higher-order allocation followed by split_page()
yields physically contiguous memory while preserving
per-page refcounts. A single bvec spanning the contiguous
range causes rdma_rw_ctx_init_bvec() to take the
rdma_rw_init_single_wr_bvec() fast path: one DMA mapping,
one SGE, one WR.

The split sub-pages replace the original rq_pages[] entries,
so all downstream page tracking, completion handling, and
xdr_buf assembly remain unchanged.

Allocation uses __GFP_NORETRY | __GFP_NOWARN and falls back
through decreasing orders. If even order-1 fails, the
existing per-page path handles the segment.

When nr_pages is not a power of two, get_order() rounds up
and the allocation yields more pages than needed. The extra
split pages replace existing rq_pages[] entries (freed via
put_page() first), so there is no net increase in per-
request page consumption. Successive segments reuse the
same padding slots, preventing accumulation. The
rq_maxpages guard rejects any allocation that would
overrun the array, falling back to the per-page path.
Under memory pressure, __GFP_NORETRY causes the higher-
order allocation to fail without stalling.
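
The order arithmetic and the fallback through decreasing orders can be
sketched as follows. order_for_pages() works on a page count (the
kernel's get_order() takes a size in bytes), and the try_alloc callback,
alloc_contig_order() and the two sample policies are hypothetical names
standing in for the real page allocator.

```c
#include <assert.h>

/* Smallest order such that (1 << order) >= nr_pages. */
static unsigned int order_for_pages(unsigned int nr_pages)
{
	unsigned int order = 0;

	assert(nr_pages > 0 && nr_pages <= (1u << 20));
	while ((1u << order) < nr_pages)
		order++;
	return order;
}

/* Hypothetical allocator hook: nonzero when an order-N block is available. */
typedef int (*try_alloc_fn)(unsigned int order);

/*
 * Tries the ideal order first, then smaller ones down to order 1.
 * Returns the order obtained, or -1 to signal the per-page fallback.
 */
static int alloc_contig_order(unsigned int nr_pages, try_alloc_fn try_alloc)
{
	if (nr_pages < 2)
		return -1;
	for (int order = (int)order_for_pages(nr_pages); order >= 1; order--)
		if (try_alloc((unsigned int)order))
			return order;
	return -1;
}

/* Sample policies for experimenting with the fallback. */
static int only_small_blocks(unsigned int order) { return order <= 2; }
static int nothing_free(unsigned int order) { (void)order; return 0; }
```

With 4KB pages, a 64KB payload is 16 pages and maps to order 4, while a
5-page segment rounds up to order 3 (8 pages), which is the padding case
discussed above.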

The contiguous path is attempted when the segment starts
page-aligned (rc_pageoff == 0) and spans at least two
pages. NFS WRITE segments carry application-modified byte
ranges of arbitrary length, so the optimization is not
restricted to power-of-two page counts.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agoSUNRPC: Add svc_rqst_page_release() helper
Chuck Lever [Wed, 11 Mar 2026 16:18:54 +0000 (12:18 -0400)] 
SUNRPC: Add svc_rqst_page_release() helper

svc_rqst_replace_page() releases displaced pages through a
per-rqst folio batch, but exposes the add-or-flush sequence
directly. svc_tcp_restore_pages() releases displaced pages
individually with put_page().

Introduce svc_rqst_page_release() to encapsulate the
batched release mechanism. Convert svc_rqst_replace_page()
and svc_tcp_restore_pages() to use it. The latter now
benefits from the same batched release that
svc_rqst_replace_page() already uses.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 weeks agoipmi: ssif_bmc: change log level to dbg in irq callback
Jian Zhang [Fri, 3 Apr 2026 09:06:01 +0000 (17:06 +0800)] 
ipmi: ssif_bmc: change log level to dbg in irq callback

Long-running tests indicate that this logging can occasionally disrupt
timing and lead to request/response corruption.

The IRQ handler needs to run as fast as possible. Most I2C slave IRQ
implementations are byte-level, so logging here can significantly affect
transfer behavior and timing. Use dev_dbg() for these messages instead.

Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-4-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2 weeks agoipmi: ssif_bmc: fix message desynchronization after truncated response
Jian Zhang [Fri, 3 Apr 2026 09:06:00 +0000 (17:06 +0800)] 
ipmi: ssif_bmc: fix message desynchronization after truncated response

A truncated response, caused by host power-off or other conditions,
can lead to message desynchronization.

Raw trace data (STOP loss scenario, with state transition comments added):

1. T-1: Read response phase (SSIF_RES_SENDING)
8271.955342  WR_RCV [03]                          <- Read polling cmd
8271.955348  RD_REQ [04]  <== SSIF_RES_SENDING    <- start sending response
8271.955436  RD_PRO [b4]
8271.955527  RD_PRO [00]
8271.955618  RD_PRO [c1]
8271.955707  RD_PRO [00]
8271.955814  RD_PRO [ad]  <== SSIF_RES_SENDING     <- last byte
<- !! STOP lost (truncated response)

2. T: New Write request arrives, BMC still in SSIF_RES_SENDING
8271.967973  WR_REQ []    <== SSIF_RES_SENDING >> SSIF_ABORTING  <- log: unexpected WR_REQ in RES_SENDING
8271.968447  WR_RCV [02]  <== SSIF_ABORTING  <- do nothing
8271.968452  WR_RCV [02]  <== SSIF_ABORTING  <- do nothing
8271.968454  WR_RCV [18]  <== SSIF_ABORTING  <- do nothing
8271.968456  WR_RCV [01]  <== SSIF_ABORTING  <- do nothing
8271.968458  WR_RCV [66]  <== SSIF_ABORTING  <- do nothing
8271.978714  STOP []      <== SSIF_ABORTING >> SSIF_READY  <- log: unexpected SLAVE STOP in state=SSIF_ABORTING

3. T+1: Next Read polling, treated as a fresh transaction
8271.979125  WR_REQ []    <== SSIF_READY >> SSIF_START
8271.979326  WR_RCV [03]  <== SSIF_START >> SSIF_SMBUS_CMD        <- smbus_cmd=0x03
8271.979331  RD_REQ [04]  <== SSIF_RES_SENDING      <- sending response
8271.979427  RD_PRO [b4]                            <- !! this is T's stale response -> desynchronization

When in SSIF_ABORTING state, a newly arrived command should still be
handled to avoid dropping the request or causing message
desynchronization.

Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-3-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2 weeks agoipmi: ssif_bmc: fix missing check for copy_to_user() partial failure
Jian Zhang [Fri, 3 Apr 2026 09:05:59 +0000 (17:05 +0800)] 
ipmi: ssif_bmc: fix missing check for copy_to_user() partial failure

copy_to_user() returns the number of bytes that could not be copied,
with a non-zero value indicating a partial or complete failure. The
current code only checks for negative return values and treats all
non-negative results as success.

Treat any non-zero return value from copy_to_user() as
an error and return -EFAULT.
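
The return convention is easy to get wrong, so here is a userspace
sketch with a mock that follows it. mock_copy_to_user(), its extra
writable parameter and read_message() are hypothetical; only the
convention itself (the return value is the number of bytes NOT copied)
comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Mock with copy_to_user()'s return convention: bytes NOT copied.
 * writable is a test knob: how many bytes of dst are accessible.
 */
static unsigned long mock_copy_to_user(void *dst, const void *src,
				       unsigned long len,
				       unsigned long writable)
{
	unsigned long done = len < writable ? len : writable;

	memcpy(dst, src, done);
	return len - done;
}

/* Returns the number of bytes delivered, or -EFAULT on any shortfall. */
static long read_message(void *ubuf, const void *msg, unsigned long len,
			 unsigned long writable)
{
	assert(ubuf != NULL && msg != NULL);
	/* The result is unsigned and never negative, so test for non-zero. */
	if (mock_copy_to_user(ubuf, msg, len, writable))
		return -EFAULT;
	return (long)len;
}
```

A "less than zero" test on that return value can never fire, which is
why partial copies were previously reported as success.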

Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-2-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2 weeks agoipmi: ssif_bmc: cancel response timer on remove
Jian Zhang [Fri, 3 Apr 2026 09:05:58 +0000 (17:05 +0800)] 
ipmi: ssif_bmc: cancel response timer on remove

The response timer can stay armed across device teardown. If it fires after
remove, the callback dereferences the SSIF context and the i2c client after
teardown has started.

Cancel the timer in remove so the callback cannot run after the device is
unregistered.

Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-1-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
2 weeks agoASoC: rsnd: Fix potential out-of-bounds access of component_dais[]
Denis Rastyogin [Fri, 27 Mar 2026 10:33:11 +0000 (13:33 +0300)] 
ASoC: rsnd: Fix potential out-of-bounds access of component_dais[]

component_dais[RSND_MAX_COMPONENT] is initially zero-initialized
and later populated in rsnd_dai_of_node(). However, the existing boundary check:
  if (i >= RSND_MAX_COMPONENT)

does not guarantee that the last valid element remains zero. As a result,
the loop can rely on component_dais[RSND_MAX_COMPONENT] being zero,
which may lead to an out-of-bounds access.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 547b02f74e4a ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2")
Signed-off-by: Denis Rastyogin <gerben@altlinux.org>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/20260327103311.459239-1-gerben@altlinux.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: amd: acp-sdw-legacy: remove unnecessary condition check
Vijendar Mukunda [Fri, 3 Apr 2026 06:34:25 +0000 (12:04 +0530)] 
ASoC: amd: acp-sdw-legacy: remove unnecessary condition check

Currently there is no mechanism to read dmic_num in the mach_params
structure, so the mach_params->dmic_num check always evaluates to 0.
Remove the unnecessary condition check for mach_params->dmic_num.

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://patch.msgid.link/20260403063452.159800-1-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoperf/x86/msr: Make SMI and PPERF on by default
Kan Liang [Fri, 27 Mar 2026 05:28:44 +0000 (13:28 +0800)] 
perf/x86/msr: Make SMI and PPERF on by default

The MSRs, SMI_COUNT and PPERF, are model-specific MSRs. A very long
CPU ID list is maintained to indicate the supported platforms. As more
and more platforms are introduced, new CPU IDs have to keep being added.
Also, older kernels have to be updated to pick up the new CPU IDs.

The MSRs have been around for a long time, and there is no plan to
change them in the near future. Furthermore, the current code uses
rdmsr_safe() to check the availability of the MSRs before using them.

Enable them by default. Relying only on rdmsr_safe() to check their
availability should be good enough for both existing and future
platforms.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260327052844.818218-1-dapeng1.mi@linux.intel.com
2 weeks agosched/fair: Prevent negative lag increase during delayed dequeue
Vincent Guittot [Tue, 31 Mar 2026 16:23:52 +0000 (18:23 +0200)] 
sched/fair: Prevent negative lag increase during delayed dequeue

The delayed dequeue feature aims to reduce the negative lag of a dequeued
task while it sleeps, but newly enqueued tasks can move the avg vruntime
backward and increase its negative lag.
When the delayed dequeued task wakes up, it then has more negative lag
than if it had been dequeued immediately, or than other tasks that were
dequeued just before these new enqueues.

Ensure that the negative lag of a delayed dequeued task doesn't
increase during its delayed dequeue phase while waiting for its negative
lag to disappear. Similarly, remove any positive lag that the
delayed dequeued task could have gained during this period.
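
One way to read the rule above is as a clamp of the task's lag to the
interval between the lag it had when the delayed dequeue started and
zero. The function below is a hypothetical sketch of that reading and
not the scheduler's implementation; clamp_delayed_lag() and its
parameters are invented for illustration.

```c
#include <assert.h>

/*
 * saved_lag:   lag (<= 0) recorded when the delayed dequeue began
 * current_lag: lag as it would be computed now
 * Returns the lag to keep: never more negative than saved_lag and
 * never positive.
 */
static long clamp_delayed_lag(long saved_lag, long current_lag)
{
	assert(saved_lag <= 0);
	if (current_lag < saved_lag)
		return saved_lag;	/* negative lag must not grow */
	if (current_lag > 0)
		return 0;		/* drop positive lag gained meanwhile */
	return current_lag;
}
```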

Short slice tasks are particularly impacted on overloaded systems.

Test on snapdragon rb5:

hackbench -T -p -l 16000000 -g 2 1> /dev/null &
cyclictest -t 1 -i 2777 -D 333 --policy=fair --mlock  -h 20000 -q

The scheduling latency of cyclictest is:

                       tip/sched/core  tip/sched/core    +this patch
cyclictest slice  (ms) (default)2.8             8               8
hackbench slice   (ms) (default)2.8            20              20
Total Samples          |   115632          119733          119806
Average           (us) |      364              64(-82%)        61(- 5%)
Median (P50)      (us) |       60              56(- 7%)        56(  0%)
90th Percentile   (us) |     1166              62(-95%)        62(  0%)
99th Percentile   (us) |     4192              73(-98%)        72(- 1%)
99.9th Percentile (us) |     8528            2707(-68%)      1300(-52%)
Maximum           (us) |    17735           14273(-20%)     13525(- 5%)

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260331162352.551501-1-vincent.guittot@linaro.org
2 weeks agosched/fair: Use sched_energy_enabled()
Vincent Guittot [Fri, 27 Mar 2026 13:20:13 +0000 (14:20 +0100)] 
sched/fair: Use sched_energy_enabled()

Use the sched_energy_enabled() helper everywhere we want to test if EAS is
enabled instead of mixing sched_energy_enabled() and direct calls to
static_branch_unlikely().

No functional change

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260327132013.2800517-1-vincent.guittot@linaro.org
2 weeks agosched: Handle blocked-waiter migration (and return migration)
John Stultz [Tue, 24 Mar 2026 19:13:25 +0000 (19:13 +0000)] 
sched: Handle blocked-waiter migration (and return migration)

Add logic to handle migrating a blocked waiter to a remote
cpu where the lock owner is runnable.

Additionally, as the blocked task may not be able to run
on the remote cpu, add logic to handle return migration once
the waiting task is given the mutex.

Because tasks may get migrated to where they cannot run, also
modify the scheduling classes to avoid sched class migrations on
mutex blocked tasks, leaving find_proxy_task() and related logic
to do the migrations and return migrations.

This was split out from the larger proxy patch, and
significantly reworked.

Credits for the original patch go to:
  Peter Zijlstra (Intel) <peterz@infradead.org>
  Juri Lelli <juri.lelli@redhat.com>
  Valentin Schneider <valentin.schneider@arm.com>
  Connor O'Brien <connoro@google.com>

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260324191337.1841376-11-jstultz@google.com
2 weeks agosched: Move attach_one_task and attach_task helpers to sched.h
John Stultz [Tue, 24 Mar 2026 19:13:24 +0000 (19:13 +0000)] 
sched: Move attach_one_task and attach_task helpers to sched.h

The fair scheduler locally introduced attach_one_task() and
attach_task() helpers, but these could be generically useful so
move this code to sched.h so we can use them elsewhere.

One minor tweak is made: use guard(rq_lock)(rq) to simplify
the function.

Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-10-jstultz@google.com
2 weeks agosched: Add logic to zap balance callbacks if we pick again
John Stultz [Tue, 24 Mar 2026 19:13:23 +0000 (19:13 +0000)] 
sched: Add logic to zap balance callbacks if we pick again

With proxy-exec, a task is selected to run via pick_next_task(),
and then if it is a mutex blocked task, we call find_proxy_task()
to find a runnable owner. If the runnable owner is on another
cpu, we will need to migrate the selected donor task away, after
which we will pick_again and call pick_next_task() to choose
something else.

However, in the first call to pick_next_task(), we may have
had a balance_callback set up by the class scheduler. After we
pick again, it's possible pick_next_task_fair() will be called
which calls sched_balance_newidle() and sched_balance_rq().

This will throw a warning:
[    8.796467] rq->balance_callback && rq->balance_callback != &balance_push_callback
[    8.796467] WARNING: CPU: 32 PID: 458 at kernel/sched/sched.h:1750 sched_balance_rq+0xe92/0x1250
...
[    8.796467] Call Trace:
[    8.796467]  <TASK>
[    8.796467]  ? __warn.cold+0xb2/0x14e
[    8.796467]  ? sched_balance_rq+0xe92/0x1250
[    8.796467]  ? report_bug+0x107/0x1a0
[    8.796467]  ? handle_bug+0x54/0x90
[    8.796467]  ? exc_invalid_op+0x17/0x70
[    8.796467]  ? asm_exc_invalid_op+0x1a/0x20
[    8.796467]  ? sched_balance_rq+0xe92/0x1250
[    8.796467]  sched_balance_newidle+0x295/0x820
[    8.796467]  pick_next_task_fair+0x51/0x3f0
[    8.796467]  __schedule+0x23a/0x14b0
[    8.796467]  ? lock_release+0x16d/0x2e0
[    8.796467]  schedule+0x3d/0x150
[    8.796467]  worker_thread+0xb5/0x350
[    8.796467]  ? __pfx_worker_thread+0x10/0x10
[    8.796467]  kthread+0xee/0x120
[    8.796467]  ? __pfx_kthread+0x10/0x10
[    8.796467]  ret_from_fork+0x31/0x50
[    8.796467]  ? __pfx_kthread+0x10/0x10
[    8.796467]  ret_from_fork_asm+0x1a/0x30
[    8.796467]  </TASK>

This is because if an RT task was originally picked, it will
set up the rq->balance_callback with push_rt_tasks() via
set_next_task_rt().

Once the task is migrated away and we pick again, we haven't
processed any balance callbacks, so rq->balance_callback is not
in the same state as it was the first time pick_next_task was
called.

To handle this, add a zap_balance_callbacks() helper function
which cleans up the balance callbacks without running them. This
should be ok, as we are effectively undoing the state set in
the first call to pick_next_task(), and when we pick again,
the new callback can be configured for the donor task actually
selected.
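
As a rough illustration of "clean up without running", here is a
minimal userspace sketch. All names in it (struct callback,
struct runqueue, queue_callback(), zap_callbacks(), demo_callback())
are hypothetical stand-ins and do not mirror the kernel's actual
structures or the real zap_balance_callbacks() implementation. It only
shows the idea: unlink every pending callback and never invoke it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list of pending callbacks. */
struct callback {
	struct callback *next;
	void (*func)(void *arg);
};

/* Hypothetical runqueue holding the head of the pending list. */
struct runqueue {
	struct callback *balance_callback;
};

/* Demo callback that counts invocations, so we can check it never ran. */
static int demo_calls;

static void demo_callback(void *arg)
{
	(void)arg;
	demo_calls++;
}

/* Push a callback onto the head of the pending list. */
static void queue_callback(struct runqueue *rq, struct callback *cb,
			   void (*func)(void *arg))
{
	assert(rq && cb && func);
	cb->func = func;
	cb->next = rq->balance_callback;
	rq->balance_callback = cb;
}

/* Unlink every pending callback without calling it; return the count. */
static int zap_callbacks(struct runqueue *rq)
{
	struct callback *cb = rq->balance_callback;
	int zapped = 0;

	while (cb) {
		struct callback *next = cb->next;

		cb->next = NULL; /* mark as no longer queued */
		cb = next;
		zapped++;
	}
	rq->balance_callback = NULL;
	return zapped;
}
```

After zapping, the list head is empty again, so a later pick can queue
a fresh callback for whichever task ends up selected.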

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-9-jstultz@google.com
2 weeks agosched: Add assert_balance_callbacks_empty helper
John Stultz [Tue, 24 Mar 2026 19:13:22 +0000 (19:13 +0000)] 
sched: Add assert_balance_callbacks_empty helper

With proxy-exec utilizing pick-again logic, we can end up having
balance callbacks set by the previous pick_next_task() call left
on the list.

So pull the warning out into a helper function, and make sure we
check it when we pick again.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-8-jstultz@google.com
2 weeks agosched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration
John Stultz [Tue, 24 Mar 2026 19:13:21 +0000 (19:13 +0000)] 
sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration

As we add functionality to proxy execution, we may migrate a
donor task to a runqueue where it can't run due to cpu affinity.
Thus, we must be careful to ensure we return-migrate the task
back to a cpu in its cpumask when it becomes unblocked.

Peter helpfully provided the following example with pictures:
"Suppose we have a ww_mutex cycle:

                  ,-+-* Mutex-1 <-.
        Task-A ---' |             | ,-- Task-B
                    `-> Mutex-2 *-+-'

Where Task-A holds Mutex-1 and tries to acquire Mutex-2, and
where Task-B holds Mutex-2 and tries to acquire Mutex-1.

Then the blocked_on->owner chain will go in circles.

        Task-A  -> Mutex-2
          ^          |
          |          v
        Mutex-1 <- Task-B

We need two things:

 - find_proxy_task() to stop iterating the circle;

 - the woken task to 'unblock' and run, such that it can
   back-off and re-try the transaction.

Now, the current code [without this patch] does:
        __clear_task_blocked_on();
        wake_q_add();

And surely clearing ->blocked_on is sufficient to break the
cycle.

Suppose it is Task-B that is made to back-off, then we have:

  Task-A -> Mutex-2 -> Task-B (no further blocked_on)

and it would attempt to run Task-B. Or worse, it could directly
pick Task-B and run it, without ever getting into
find_proxy_task().

Now, here is a problem because Task-B might not be runnable on
the CPU it is currently on; and because !task_is_blocked() we
don't get into the proxy paths, so nobody is going to fix this
up.

Ideally we would have dequeued Task-B alongside of clearing
->blocked_on, but alas, [the lock ordering prevents us from
getting the task_rq_lock() and] spoils things."

Thus we need more than just a binary concept of the task being
blocked on a mutex or not.

So allow setting blocked_on to PROXY_WAKING as a special value
which specifies the task is no longer blocked, but needs to
be evaluated for return migration *before* it can be run.
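
A minimal sketch of the resulting three-way state follows. It assumes
a sentinel pointer value; the message does not say how the kernel
encodes PROXY_WAKING, so the marker object, PROXY_WAKING_SKETCH,
struct mutex_stub, struct task and classify() below are all
hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex. */
struct mutex_stub {
	int unused;
};

/*
 * Assumption: the special value is a pointer that can never be a real
 * mutex. Here it is the address of a private static object.
 */
static struct mutex_stub proxy_waking_marker;
#define PROXY_WAKING_SKETCH (&proxy_waking_marker)

enum blocked_state {
	TASK_NOT_BLOCKED,        /* blocked_on == NULL */
	TASK_BLOCKED_ON_MUTEX,   /* blocked_on points at a real mutex */
	TASK_NEEDS_RETURN_CHECK, /* blocked_on == special waking value */
};

/* Hypothetical, simplified task. */
struct task {
	struct mutex_stub *blocked_on;
};

/* Classify a task by its blocked_on pointer. */
static enum blocked_state classify(const struct task *p)
{
	assert(p != NULL);

	if (!p->blocked_on)
		return TASK_NOT_BLOCKED;
	if (p->blocked_on == PROXY_WAKING_SKETCH)
		return TASK_NEEDS_RETURN_CHECK;
	return TASK_BLOCKED_ON_MUTEX;
}
```

The point of the third state is that a woken task is distinguishable
from a task that was never blocked, so it can be checked for return
migration before it is allowed to run.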

This will then be used in a later patch to handle proxy
return-migration.

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-7-jstultz@google.com
2 weeks agosched: Fix modifying donor->blocked on without proper locking
John Stultz [Tue, 24 Mar 2026 19:13:20 +0000 (19:13 +0000)] 
sched: Fix modifying donor->blocked on without proper locking

Introduce an action enum in find_proxy_task() which allows
us to handle work that needs to be done outside the mutex.wait_lock
and task.blocked_lock guard scopes.

This ensures proper locking when we clear the donor's blocked_on
pointer in proxy_deactivate(), and the switch statement will be
useful as we add more cases to handle later in this series.
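
The "decide under the locks, act after dropping them" pattern can be
sketched as follows. This is an illustration only: the enum values,
struct ctx and the helper names are hypothetical, and the lock is
modelled by a plain flag rather than by the kernel's guard scopes.

```c
#include <assert.h>

/* Hypothetical actions; the real enum values are not given above. */
enum action {
	ACT_FOUND,            /* a runnable owner was found */
	ACT_DEACTIVATE_DONOR, /* the donor must be deactivated */
};

/* Hypothetical context; lock_held models the guard scope. */
struct ctx {
	int lock_held;
	int owner_runnable;
	int donor_blocked;
};

/* Inspect state inside the "locked" region and only record a decision. */
static enum action decide_locked(struct ctx *c)
{
	enum action act;

	c->lock_held = 1; /* enter the guard scope */
	act = c->owner_runnable ? ACT_FOUND : ACT_DEACTIVATE_DONOR;
	c->lock_held = 0; /* leave the guard scope */
	return act;
}

/* Work that must not run while the guard scope is active. */
static void deactivate_donor(struct ctx *c)
{
	assert(!c->lock_held);
	c->donor_blocked = 0;
}

/* Decide first, then carry out the action outside the guard scope. */
static enum action find_and_act(struct ctx *c)
{
	enum action act = decide_locked(c);

	switch (act) {
	case ACT_FOUND:
		break;
	case ACT_DEACTIVATE_DONOR:
		deactivate_donor(c);
		break;
	}
	return act;
}
```

Adding a new case later only needs a new enum value and a new switch
arm, which matches the motivation given for the switch statement.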

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-6-jstultz@google.com
2 weeks agolocking: Add task::blocked_lock to serialize blocked_on state
John Stultz [Tue, 24 Mar 2026 19:13:19 +0000 (19:13 +0000)] 
locking: Add task::blocked_lock to serialize blocked_on state

So far, we have been able to utilize the mutex::wait_lock
for serializing the blocked_on state, but when we move to
proxying across runqueues, we will need to add more state
and a way to serialize changes to this state in contexts
where we don't hold the mutex::wait_lock.

So introduce the task::blocked_lock, which nests under the
mutex::wait_lock in the locking order, and rework the locking
to use it.

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-5-jstultz@google.com
2 weeks agosched: Fix potentially missing balancing with Proxy Exec
John Stultz [Tue, 24 Mar 2026 19:13:18 +0000 (19:13 +0000)] 
sched: Fix potentially missing balancing with Proxy Exec

K Prateek pointed out that with Proxy Exec, we may have cases
where we context switch in __schedule(), while the donor remains
the same. This could cause balancing issues, since the
put_prev_set_next() logic short-cuts if (prev == next). With
proxy-exec prev is the previous donor, and next is the next
donor. Should the donor remain the same, but different tasks are
picked to actually run, the shortcut will have avoided enqueuing
the sched class balance callback.

So, if we are context switching, add logic to catch the
same-donor case, and trigger the put_prev/set_next calls to
ensure the balance callbacks get enqueued.

Closes: https://lore.kernel.org/lkml/20ea3670-c30a-433b-a07f-c4ff98ae2379@amd.com/
Reported-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260324191337.1841376-4-jstultz@google.com
2 weeks agosched: Minimise repeated sched_proxy_exec() checking
John Stultz [Tue, 24 Mar 2026 19:13:17 +0000 (19:13 +0000)] 
sched: Minimise repeated sched_proxy_exec() checking

Peter noted: Compilers are really bad at optimizing the static
branch things (as in, they utterly refuse to, even when marked with
__pure), and will happily emit multiple identical checks in a row.

So pull out the one obvious sched_proxy_exec() branch in
__schedule() and remove some of the 'implicit' ones in that
path.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260324191337.1841376-3-jstultz@google.com
2 weeks agosched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
John Stultz [Tue, 24 Mar 2026 19:13:16 +0000 (19:13 +0000)] 
sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()

With proxy-execution, the scheduler selects the donor, but for
blocked donors, we end up running the lock owner.

This caused some complexity, because the class schedulers make
sure to remove the task they pick from their pushable task
lists, which prevents the donor from being migrated, but there
wasn't then anything to prevent rq->curr from being migrated
if rq->curr != rq->donor.

This was sort of hacked around by calling proxy_tag_curr() on
the rq->curr task if we were running something other than the
donor. proxy_tag_curr() did a dequeue/enqueue pair on the
rq->curr task, allowing the class schedulers to remove it from
their pushable list.

The dequeue/enqueue pair was wasteful, and additionally K Prateek
highlighted that we didn't properly undo things when we stopped
proxying, leaving the lock owner off the pushable list.

After some alternative approaches were considered, Peter
suggested just having the RT/DL classes avoid migrating
when task_on_cpu().

So rework pick_next_pushable_dl_task() and the rt
pick_next_pushable_task() functions so that they skip over the
first pushable task if it is on_cpu.
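
A minimal sketch of the skip, using a hypothetical linked list in
place of the kernel's pushable task lists (struct task, its fields and
pick_next_pushable() are invented for illustration). It also assumes
that at most one task on the list can be on the CPU, so skipping a
single entry is enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task on a pushable list. */
struct task {
	int on_cpu;        /* non-zero while the task is running on a CPU */
	struct task *next; /* next entry in the pushable list */
};

/*
 * Return the first candidate for pushing, skipping over the head of
 * the list if that task is currently running.
 */
static struct task *pick_next_pushable(struct task *head)
{
	struct task *p = head;

	if (p && p->on_cpu)
		p = p->next;

	/* Assumption: only one listed task can be on the CPU at a time. */
	assert(!p || !p->on_cpu);
	return p;
}
```

If the running task is the only entry, the function returns NULL and
nothing is pushed.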

Then just drop all of the proxy_tag_curr() logic.

Fixes: be39617e38e0 ("sched: Fix proxy/current (push,pull)ability")
Closes: https://lore.kernel.org/lkml/e735cae0-2cc9-4bae-b761-fcb082ed3e94@amd.com/
Reported-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260324191337.1841376-2-jstultz@google.com
2 weeks agopower: sequencing: pcie-m2: add SERIAL_DEV_BUS dependency
Arnd Bergmann [Wed, 1 Apr 2026 19:10:13 +0000 (21:10 +0200)] 
power: sequencing: pcie-m2: add SERIAL_DEV_BUS dependency

The newly added serdev code fails to link when serdev is turned off:

arm-linux-gnueabi-ld: drivers/power/sequencing/pwrseq-pcie-m2.o: in function `pwrseq_pcie_m2_remove_serdev':
pwrseq-pcie-m2.c:(.text+0xc8): undefined reference to `serdev_device_remove'
arm-linux-gnueabi-ld: drivers/power/sequencing/pwrseq-pcie-m2.o: in function `pwrseq_m2_pcie_notify':
pwrseq-pcie-m2.c:(.text+0x69c): undefined reference to `of_find_serdev_controller_by_node'
arm-linux-gnueabi-ld: pwrseq-pcie-m2.c:(.text+0x6f8): undefined reference to `serdev_device_alloc'
arm-linux-gnueabi-ld: pwrseq-pcie-m2.c:(.text+0x724): undefined reference to `serdev_device_add'

Add another Kconfig dependency for this

Fixes: 3f736aecbdc8 ("power: sequencing: pcie-m2: Create serdev device for WCN7850 bluetooth")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260401191030.948046-1-arnd@kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agopower: sequencing: pcie-m2: enforce PCI and OF dependencies
Arnd Bergmann [Wed, 1 Apr 2026 09:16:25 +0000 (11:16 +0200)] 
power: sequencing: pcie-m2: enforce PCI and OF dependencies

The driver fails to build when PCI is disabled:

drivers/power/sequencing/pwrseq-pcie-m2.c: In function 'pwrseq_pcie_m2_register_notifier':
drivers/power/sequencing/pwrseq-pcie-m2.c:368:54: error: 'pci_bus_type' undeclared (first use in this function); did you mean 'pci_pcie_type'?
  368 |                         ret = bus_register_notifier(&pci_bus_type, &ctx->nb);
      |                                                      ^~~~~~~~~~~~
      |                                                      pci_pcie_type

Similarly, when CONFIG_OF is disabled:

drivers/power/sequencing/pwrseq-pcie-m2.c: In function 'pwrseq_m2_pcie_create_bt_node':
drivers/power/sequencing/pwrseq-pcie-m2.c:191:9: error: implicit declaration of function 'of_changeset_init' [-Wimplicit-function-declaration]
  191 |         of_changeset_init(ctx->ocs);
      |         ^~~~~~~~~~~~~~~~~

Make both dependencies unconditional to prevent compile-testing
in either configuration.

Fixes: 3f736aecbdc8 ("power: sequencing: pcie-m2: Create serdev device for WCN7850 bluetooth")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20260401091847.305294-1-arnd@kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agoRISC-V: KVM: Don't check hstateen0 when updating sstateen0 CSR
Anup Patel [Tue, 20 Jan 2026 07:59:55 +0000 (13:29 +0530)] 
RISC-V: KVM: Don't check hstateen0 when updating sstateen0 CSR

The hstateen0 will be programmed differently for guest HS-mode
and guest VS/VU-mode so don't check hstateen0.SSTATEEN0 bit when
updating sstateen0 CSR in kvm_riscv_vcpu_swap_in_guest_state()
and kvm_riscv_vcpu_swap_in_host_state().

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-10-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoRISC-V: KVM: Factor-out VCPU config into separate sources
Anup Patel [Tue, 20 Jan 2026 07:59:54 +0000 (13:29 +0530)] 
RISC-V: KVM: Factor-out VCPU config into separate sources

The VCPU config deals with hideleg, hedeleg, henvcfg, and hstateenX
CSR configuration for each VCPU. Factor-out VCPU config into separate
sources so that VCPU config can do things differently for guest HS-mode
and guest VS/VU-mode.

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-9-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoMerge branch 'pm-em'
Rafael J. Wysocki [Fri, 3 Apr 2026 12:15:06 +0000 (14:15 +0200)] 
Merge branch 'pm-em'

Fix a NULL pointer dereference in the energy model netlink interface
that may occur if a given perf domain ID is not recognized (Changwoo Min).

* pm-em:
  PM: EM: Fix NULL pointer dereference when perf domain ID is not found

2 weeks agoASoC: intel: sof_sdw: Prepare for configuration without a jack
Maciej Strozek [Fri, 3 Apr 2026 08:23:35 +0000 (09:23 +0100)] 
ASoC: intel: sof_sdw: Prepare for configuration without a jack

In certain cs42l43 setups, the UAJ function may be removed from ACPI
and left physically unconnected. Prepare the driver for that
configuration by setting a system clock in the speaker path too.

Signed-off-by: Maciej Strozek <mstrozek@opensource.cirrus.com>
Link: https://patch.msgid.link/20260403082335.40798-1-mstrozek@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoRevert "dt-bindings: usb: cdns,usb3: document USBSSP controller support"
Greg Kroah-Hartman [Fri, 3 Apr 2026 12:06:20 +0000 (14:06 +0200)] 
Revert "dt-bindings: usb: cdns,usb3: document USBSSP controller support"

This reverts commit fb14e7f7cbb4abbcde5576282d91352deaff2887.

There were some build issues as reported by Arnd, so revert this for
now.

Cc: Peter Chen <peter.chen@cixtech.com>
Cc: Pawel Laszczak <pawell@cadence.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Link: https://lore.kernel.org/r/ac+LEWMCQpLSnfoD@nchen-desktop
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoRISC-V: KVM: Add hideleg to struct kvm_vcpu_config
Anup Patel [Tue, 20 Jan 2026 07:59:53 +0000 (13:29 +0530)] 
RISC-V: KVM: Add hideleg to struct kvm_vcpu_config

The hideleg CSR state when VCPU is running in guest VS/VU-mode will
be different from when it is running in guest HS-mode. To achieve
this, add hideleg to struct kvm_vcpu_config and re-program hideleg
CSR upon every kvm_arch_vcpu_load().

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-8-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoRISC-V: KVM: Move timer state defines closer to struct in UAPI header
Anup Patel [Tue, 20 Jan 2026 07:59:52 +0000 (13:29 +0530)] 
RISC-V: KVM: Move timer state defines closer to struct in UAPI header

The KVM_RISCV_TIMER_STATE_xyz defines specify possible values of the
"state" member in struct kvm_riscv_timer so move these defines closer
to struct kvm_riscv_timer in uapi/asm/kvm.h.

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-7-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoRevert "usb: cdns3: Add USBSSP platform driver support"
Greg Kroah-Hartman [Fri, 3 Apr 2026 12:05:13 +0000 (14:05 +0200)] 
Revert "usb: cdns3: Add USBSSP platform driver support"

This reverts commit 6076388ca1eda808b95f9479f3b04839d348a2f7.

There were some build issues as reported by Arnd, so revert this for
now.

Cc: Peter Chen <peter.chen@cixtech.com>
Cc: Pawel Laszczak <pawell@cadence.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Link: https://lore.kernel.org/r/ac+LEWMCQpLSnfoD@nchen-desktop
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoRISC-V: KVM: Factor-out ISA checks into separate sources
Anup Patel [Tue, 20 Jan 2026 07:59:51 +0000 (13:29 +0530)] 
RISC-V: KVM: Factor-out ISA checks into separate sources

The KVM ISA extension related checks are not VCPU specific and
should be factored out of vcpu_onereg.c into separate sources.

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-6-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoRevert "usb: cdnsp: Add support for device-only configuration"
Greg Kroah-Hartman [Fri, 3 Apr 2026 12:03:37 +0000 (14:03 +0200)] 
Revert "usb: cdnsp: Add support for device-only configuration"

This reverts commit 7b7f2dd913829e06705035dfc41ca25fa6ec68d3.

There were some problems with an earlier cdns3 change, so this one needs
to be backed out as well.

Cc: Pawel Laszczak <pawell@cadence.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Reported-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/ac+LEWMCQpLSnfoD@nchen-desktop
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoRISC-V: KVM: Introduce common kvm_riscv_isa_check_host()
Anup Patel [Tue, 20 Jan 2026 07:59:50 +0000 (13:29 +0530)] 
RISC-V: KVM: Introduce common kvm_riscv_isa_check_host()

Rename kvm_riscv_vcpu_isa_check_host() to kvm_riscv_isa_check_host()
and use it as a common function within KVM RISC-V to check ISA
extensions supported by the host.

Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-5-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoDocumentation: clarify the mandatory and desirable info for security reports
Willy Tarreau [Fri, 3 Apr 2026 06:20:18 +0000 (08:20 +0200)] 
Documentation: clarify the mandatory and desirable info for security reports

A significant part of the effort of the security team consists in begging
reporters for patch proposals, or asking them to provide them in regular
format, and most of the time they're willing to provide this, they just
didn't know that it would help. So let's add a section detailing the
required and desirable contents in a security report to help reporters
write more actionable reports which do not require round trips.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260403062018.31080-4-w@1wt.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoDocumentation: explain how to find maintainers addresses for security reports
Willy Tarreau [Fri, 3 Apr 2026 06:20:17 +0000 (08:20 +0200)] 
Documentation: explain how to find maintainers addresses for security reports

These days, 80% of the work done by the security team consists in
locating the affected subsystem in a report, running get_maintainers on
it, forwarding the report to these persons and responding to the reporter
with them in Cc. This is a huge and unneeded overhead that we must try to
lower for a better overall efficiency. This patch adds a complete section
explaining how to figure out the list of recipients to send the report to.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260403062018.31080-3-w@1wt.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoDocumentation: minor updates to the security contacts
Willy Tarreau [Fri, 3 Apr 2026 06:20:16 +0000 (08:20 +0200)] 
Documentation: minor updates to the security contacts

This clarifies the fact that the bug reporters must use a valid
e-mail address to send their report, and that the security team
assists developers working on a fix but doesn't always produce
fixes on its own.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260403062018.31080-2-w@1wt.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agobcache: fix uninitialized closure object
Mingzhe Zou [Fri, 3 Apr 2026 04:21:35 +0000 (12:21 +0800)] 
bcache: fix uninitialized closure object

In the previous patch ("bcache: fix cached_dev.sb_bio use-after-free and
crash"), we adopted a simple modification suggestion from AI to fix the
use-after-free.

But in actual testing, we found an extreme case where the device is
stopped before calling bch_write_bdev_super().

At this point, struct closure sb_write has not been initialized yet.
For this patch, we ensure that sb_bio has been completed via
sb_write_mutex.

Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Signed-off-by: Coly Li <colyli@fnnas.com>
Link: https://patch.msgid.link/20260403042135.2221247-1-colyli@fnnas.com
Fixes: fec114a98b87 ("bcache: fix cached_dev.sb_bio use-after-free and crash")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agobcache: fix cached_dev.sb_bio use-after-free and crash
Mingzhe Zou [Sun, 22 Mar 2026 13:41:02 +0000 (21:41 +0800)] 
bcache: fix cached_dev.sb_bio use-after-free and crash

In our production environment, we have received multiple crash reports
regarding libceph, which have caught our attention:

```
[6888366.280350] Call Trace:
[6888366.280452]  blk_update_request+0x14e/0x370
[6888366.280561]  blk_mq_end_request+0x1a/0x130
[6888366.280671]  rbd_img_handle_request+0x1a0/0x1b0 [rbd]
[6888366.280792]  rbd_obj_handle_request+0x32/0x40 [rbd]
[6888366.280903]  __complete_request+0x22/0x70 [libceph]
[6888366.281032]  osd_dispatch+0x15e/0xb40 [libceph]
[6888366.281164]  ? inet_recvmsg+0x5b/0xd0
[6888366.281272]  ? ceph_tcp_recvmsg+0x6f/0xa0 [libceph]
[6888366.281405]  ceph_con_process_message+0x79/0x140 [libceph]
[6888366.281534]  ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[6888366.281661]  ceph_con_workfn+0x329/0x680 [libceph]
```

After analyzing the coredump file, we found that the memory backing
dc->sb_bio had been freed. We know that cached_dev is only freed when it
is stopped.

sb_bio is embedded in struct cached_dev rather than allocated for each
write, so if the device is stopped while the superblock is being
written, the freed memory will be accessed at endio.

This patch waits for sb_write to complete in cached_dev_free().

It should be noted that we analyzed the cause of the problem, then gave
all the details to QWEN and adopted the modifications it made.

Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Fixes: cafe563591446 ("bcache: A block layer cache")
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Coly Li <colyli@fnnas.com>
Link: https://patch.msgid.link/20260322134102.480107-1-colyli@fnnas.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoblock: use sysfs_emit in sysfs show functions
Thorsten Blum [Thu, 2 Apr 2026 16:50:00 +0000 (18:50 +0200)] 
block: use sysfs_emit in sysfs show functions

Replace sprintf() with sysfs_emit() in sysfs show functions.
sysfs_emit() is preferred for formatting sysfs output because it
provides safer bounds checking.
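
To illustrate why a bounded helper is safer than sprintf(), here is a
userspace sketch. It is not the kernel's sysfs_emit(); emit_bounded()
and SKETCH_PAGE_SIZE are hypothetical names, and the 4096-byte buffer
size is an assumption made for the example. The helper never writes
past the buffer and returns the number of characters actually stored.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumed size of the output buffer handed to a show function. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Format into buf, which must be at least SKETCH_PAGE_SIZE bytes.
 * Returns the number of characters stored, excluding the trailing NUL.
 */
static int emit_bounded(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf && fmt);

	va_start(args, fmt);
	len = vsnprintf(buf, SKETCH_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0; /* formatting error: report nothing written */
	if (len >= SKETCH_PAGE_SIZE)
		len = SKETCH_PAGE_SIZE - 1; /* output was truncated */
	return len;
}
```

Unlike sprintf(), an oversized value is truncated at the buffer limit
instead of overrunning it, and the return value reflects what was
really written.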

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://patch.msgid.link/20260402164958.894879-4-thorsten.blum@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agokbuild: rust: allow `clippy::uninlined_format_args`
Miguel Ojeda [Tue, 31 Mar 2026 20:58:48 +0000 (22:58 +0200)] 
kbuild: rust: allow `clippy::uninlined_format_args`

Clippy in Rust 1.88.0 (only) reports [1]:

    warning: variables can be used directly in the `format!` string
       --> rust/macros/module.rs:112:23
        |
    112 |         let content = format!("{param}:{content}", param = param, content = content);
        |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
        = note: `-W clippy::uninlined-format-args` implied by `-W clippy::all`
        = help: to override `-W clippy::all` add `#[allow(clippy::uninlined_format_args)]`
    help: change this to
        |
    112 -         let content = format!("{param}:{content}", param = param, content = content);
    112 +         let content = format!("{param}:{content}");

    warning: variables can be used directly in the `format!` string
       --> rust/macros/module.rs:198:14
        |
    198 |         t => panic!("Unsupported parameter type {}", t),
        |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
        = note: `-W clippy::uninlined-format-args` implied by `-W clippy::all`
        = help: to override `-W clippy::all` add `#[allow(clippy::uninlined_format_args)]`
    help: change this to
        |
    198 -         t => panic!("Unsupported parameter type {}", t),
    198 +         t => panic!("Unsupported parameter type {t}"),
        |

The reason it only triggers in that version is that the lint was moved
from `pedantic` to `style` in Rust 1.88.0 and then back to `pedantic`
in Rust 1.89.0 [2][3].

In the first case, the suggestion is fair and a pure simplification, thus
we will clean it up separately.

To keep the behavior the same across all versions, and since the lint
does not work for all macros (e.g. custom ones like `pr_info!`), disable
it globally.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://lore.kernel.org/rust-for-linux/CANiq72=drAtf3y_DZ-2o4jb6Az9J3Yj4QYwWnbRui4sm4AJD3Q@mail.gmail.com/
Link: https://github.com/rust-lang/rust-clippy/pull/15287
Link: https://github.com/rust-lang/rust-clippy/issues/15151
Link: https://patch.msgid.link/20260331205849.498295-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks agorust_binder: override crate name to rust_binder
Alice Ryhl [Thu, 2 Apr 2026 10:55:34 +0000 (10:55 +0000)] 
rust_binder: override crate name to rust_binder

The Rust Binder object file is called rust_binder_main.o because the
name rust_binder.o is used for the result of linking together
rust_binder_main.o with rust_binderfs.o and a few others.

However, the crate name is supposed to be rust_binder without a _main
suffix. Thus, override the crate name accordingly.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Gary Guo <gary@garyguo.net>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260402-binder-crate-name-v4-2-ec3919b87909@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks agorust: support overriding crate_name
Alice Ryhl [Thu, 2 Apr 2026 10:55:33 +0000 (10:55 +0000)] 
rust: support overriding crate_name

Currently you cannot filter out the crate-name argument using
RUSTFLAGS_REMOVE_stem.o because the Rust filter-out invocation does not
include that particular argument. Since --crate-name is an argument that
can't be passed multiple times, this means that it's currently not
possible to override the crate name. Thus, remove the --crate-name
argument for drivers. This allows them to override the crate name using
the #![crate_name] annotation.

This affects symbol names, but has no effect on the filenames of object
files and other things generated by the build, as we always use --emit
with a fixed output filename.

The --crate-name argument is kept for the crates under rust/ for
simplicity and to avoid changing many of them by adding #![crate_name].

The rust-analyzer script is updated to use rustc to obtain the crate
name of the driver crates, which picks up the right name whether it is
configured via #![crate_name] or not. For readability, the logic to
invoke 'rustc' is extracted to its own function.

Note that the crate name in the Python script is not that important:
the only place where the name affects anything is in the 'deps' array,
which specifies an index and name for each dependency and determines
what that dependency is called in *this* crate. (The same crate may be
called different things by each crate that depends on it.) Since driver
crates are leaf crates, this doesn't apply and the rustc invocation only
affects the 'display_name' parameter.

Acked-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Jesung Yang <y.jems.n@gmail.com>
Acked-by: Tamir Duberstein <tamird@kernel.org>
Link: https://patch.msgid.link/20260402-binder-crate-name-v4-1-ec3919b87909@google.com
[ Applied Python type hints. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago pwm: th1520: remove impl Send/Sync for Th1520PwmDriverData
Alice Ryhl [Mon, 23 Feb 2026 10:08:27 +0000 (10:08 +0000)] 
pwm: th1520: remove impl Send/Sync for Th1520PwmDriverData

Now that clk implements Send and Sync, we no longer need to manually
implement these traits for Th1520PwmDriverData. Thus remove the
implementations.

Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: Michal Wilczynski <m.wilczynski@samsung.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260223-clk-send-sync-v5-3-181bf2f35652@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago tyr: remove impl Send/Sync for TyrData
Alice Ryhl [Mon, 23 Feb 2026 10:08:26 +0000 (10:08 +0000)] 
tyr: remove impl Send/Sync for TyrData

Now that clk implements Send and Sync, we no longer need to manually
implement these traits for TyrData. Thus remove the implementations.

The comment also mentions the regulator. However, the regulator had the
traits added in commit 9a200cbdb543 ("rust: regulator: implement Send
and Sync for Regulator<T>"), which is already in mainline.

Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260223-clk-send-sync-v5-2-181bf2f35652@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago rust: clk: implement Send and Sync
Alice Ryhl [Mon, 23 Feb 2026 10:08:25 +0000 (10:08 +0000)] 
rust: clk: implement Send and Sync

These traits are required for drivers to embed the Clk type in their own
data structures because driver data structures are usually required to
be Send. Since the Clk type is thread-safe, implement the relevant
traits.
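
A minimal userspace sketch of the pattern described above, not the
kernel's actual code: `RawClk`, `Clk` and `DriverData` here are
hypothetical stand-ins. A wrapper around a raw pointer is neither Send
nor Sync by default, so any struct embedding it loses those traits
until the wrapper implements them.

```rust
use std::ptr::NonNull;

/// Hypothetical stand-in for the C-side clock object.
#[allow(dead_code)]
struct RawClk {
    rate_hz: u64,
}

/// Hypothetical wrapper holding a raw pointer. `NonNull<T>` is neither
/// `Send` nor `Sync`, so `Clk` is not either unless we say so.
#[allow(dead_code)]
struct Clk(NonNull<RawClk>);

// SAFETY (assumption of this sketch): the underlying object may be used
// from any thread, as the commit message states for the real `Clk`.
unsafe impl Send for Clk {}
// SAFETY (assumption of this sketch): shared references only allow
// operations that are safe to call concurrently.
unsafe impl Sync for Clk {}

/// Hypothetical driver data embedding the clock. It needs no manual
/// `unsafe impl`: the auto traits follow from its fields.
#[allow(dead_code)]
struct DriverData {
    clk: Clk,
    period_ns: u64,
}

/// Compile-time check: only compiles if `T` is `Send + Sync`.
fn assert_send_sync<T: Send + Sync>() {}
```

Calling `assert_send_sync::<DriverData>()` compiles only because `Clk`
carries the two impls; removing them turns the call into a compile
error, which is the situation drivers previously worked around with
their own manual impls.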

Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Brian Masney <bmasney@redhat.com> # Active contributor to clk
Link: https://patch.msgid.link/20260223-clk-send-sync-v5-1-181bf2f35652@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago rust: ptr: add const_align_up()
John Hubbard [Thu, 26 Mar 2026 01:38:47 +0000 (18:38 -0700)] 
rust: ptr: add const_align_up()

Add const_align_up() to kernel::ptr as the const-compatible equivalent
of Alignable::align_up().
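
To illustrate what a const-compatible align-up can look like, here is a
sketch under stated assumptions. The signature below (plain `usize`
alignment, `Option<usize>` result) is hypothetical and is not claimed to
match the function added to kernel::ptr; it only shows why a `const fn`
is useful where a trait method cannot be called in a const context.

```rust
/// Hypothetical sketch: round `value` up to the next multiple of
/// `align`. Returns `None` if `align` is not a power of two or if the
/// rounded value would overflow `usize`.
const fn const_align_up(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    // `match` is used instead of `?` or `map`, which are not usable in
    // a `const fn` on stable Rust.
    match value.checked_add(mask) {
        Some(v) => Some(v & !mask),
        None => None,
    }
}

/// Evaluated at compile time, so it can size arrays or statics.
const PADDED_LEN: usize = match const_align_up(13, 8) {
    Some(v) => v,
    None => panic!("alignment failed"),
};
```

Because the result is a constant, an invalid alignment or an overflow
in `PADDED_LEN` would fail the build rather than surface at runtime.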

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Suggested-by: Gary Guo <gary@garyguo.net>
Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260326013902.588242-17-jhubbard@nvidia.com
[ Adjusted imports style. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago rust: error: clarify that `from_err_ptr` can return `Ok(NULL)`
Mirko Adzic [Sun, 29 Mar 2026 10:41:10 +0000 (12:41 +0200)] 
rust: error: clarify that `from_err_ptr` can return `Ok(NULL)`

Improve the doc comment of `from_err_ptr` by explicitly stating that it
will return `Ok(NULL)` when passed a null pointer, as it isn't an error
value.

Add a doctest case that tests the behavior described above, as well as
other scenarios (non-null/non-error pointer, error value).
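
The three scenarios can be modelled in userspace as follows. This is a
sketch, not the kernel's implementation: `from_err_ptr_model` is a
hypothetical name, the error is returned as a bare `i32` instead of the
kernel's `Error` type, and it assumes the usual C convention that the
top `MAX_ERRNO` (4095) address values encode negative errno codes.

```rust
/// Largest errno value that can be encoded in a pointer (assumed to
/// follow the C-side convention).
const MAX_ERRNO: usize = 4095;

/// Hypothetical model: decode a pointer that may carry an errno.
/// Addresses in the range [-MAX_ERRNO, -1] (as unsigned values) are
/// errors; everything else, including null, is returned as `Ok`.
fn from_err_ptr_model<T>(ptr: *mut T) -> Result<*mut T, i32> {
    let addr = ptr as usize;
    if addr >= 0usize.wrapping_sub(MAX_ERRNO) {
        // Reinterpret the address as a signed value to recover the
        // negative errno.
        Err(addr as isize as i32)
    } else {
        Ok(ptr)
    }
}
```

A null pointer has address 0, which lies outside the error range, so
the model returns `Ok(null)` for it. Callers that treat null as a
failure therefore need their own check.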

Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://lore.kernel.org/rust-for-linux/20260322193830.89324-1-ojeda@kernel.org/
Link: https://github.com/Rust-for-Linux/linux/issues/1231
Signed-off-by: Mirko Adzic <adzicmirko97@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260329104319.131057-1-adzicmirko97@gmail.com
[ - Added `expect` for `clippy::missing_safety_doc`.
  - Simplified and removed unsafe block using `Error::to_ptr()`.
  - Added intra-doc link.
      - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks ago gpiolib: Make deferral warnings debug messages
Jon Hunter [Wed, 1 Apr 2026 13:34:41 +0000 (14:34 +0100)] 
gpiolib: Make deferral warnings debug messages

With the recent addition of the shared GPIO support, warning messages
such as the following are being observed ...

 reg-fixed-voltage regulator-vdd-3v3-pcie: cannot find GPIO chip
  gpiolib_shared.proxy.6, deferring

These are seen even with GPIO_SHARED_PROXY=y.

Given that the GPIOs are successfully found a bit later during boot and
the code is intentionally returning -EPROBE_DEFER when they are not
found, downgrade these messages to debug prints to avoid unnecessary
warnings being observed.

Note that although the 'cannot find GPIO line' warning has not been
observed in this case, it seems reasonable to make this print a debug
print for consistency too.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260401133441.47641-1-jonathanh@nvidia.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>