git.ipfire.org Git - thirdparty/kernel/linux.git/log
3 weeks ago bpf: Improve bounds when s64 crosses sign boundary
Paul Chaignon [Mon, 28 Jul 2025 09:50:53 +0000 (11:50 +0200)] 
bpf: Improve bounds when s64 crosses sign boundary

__reg64_deduce_bounds currently improves the s64 range using the u64
range and vice versa, but only if it doesn't cross the sign boundary.

This patch improves __reg64_deduce_bounds to cover the case where the
s64 range crosses the sign boundary but overlaps with the u64 range on
only one end. In that case, we can improve both ranges. Consider the
following example, with the s64 range crossing the sign boundary:

    0                                                   U64_MAX
    |  [xxxxxxxxxxxxxx u64 range xxxxxxxxxxxxxx]              |
    |----------------------------|----------------------------|
    |xxxxx s64 range xxxxxxxxx]                       [xxxxxxx|
    0                     S64_MAX S64_MIN                    -1

The u64 range overlaps only with the positive portion of the s64 range.
We can thus derive the following new s64 and u64 ranges.

    0                                                   U64_MAX
    |  [xxxxxx u64 range xxxxx]                               |
    |----------------------------|----------------------------|
    |  [xxxxxx s64 range xxxxx]                               |
    0                     S64_MAX S64_MIN                    -1
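
The derivation above can be sketched as a small userspace model. This is
only an illustration under stated assumptions: `struct range64` and
`deduce_cross_sign()` are hypothetical names rather than the verifier's
own, and the sketch assumes the two input ranges actually intersect.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit bounds tracked per register. */
struct range64 {
	int64_t  smin, smax;	/* signed bounds   */
	uint64_t umin, umax;	/* unsigned bounds */
};

/*
 * Tighten both ranges when the s64 range crosses the sign boundary and
 * the u64 range overlaps only one of its two halves.
 * Assumption: the input ranges intersect.
 */
void deduce_cross_sign(struct range64 *r)
{
	if (!(r->smin < 0 && r->smax >= 0))
		return;	/* s64 range does not cross the sign boundary */

	if (r->umax < (uint64_t)r->smin) {
		/* u64 range overlaps only the non-negative half */
		r->smin = (int64_t)r->umin;
		if (r->umax > (uint64_t)r->smax)
			r->umax = (uint64_t)r->smax;
	} else if ((uint64_t)r->smax < r->umin) {
		/* u64 range overlaps only the negative half */
		r->smax = (int64_t)r->umax;
		if (r->umin < (uint64_t)r->smin)
			r->umin = (uint64_t)r->smin;
	}
}
```

For instance, with s64 = [-10, 50] and u64 = [5, 100], the sketch narrows
both ranges to [5, 50], matching the second diagram.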

The same logic can probably apply to the s32/u32 ranges, but this patch
doesn't implement that change.

In addition to the selftests, the __reg64_deduce_bounds change was
also tested with Agni, the formal verification tool for the range
analysis [1].

Link: https://github.com/bpfverif/agni [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/933bd9ce1f36ded5559f92fdc09e5dbc823fa245.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 weeks ago RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()
Quan Zhou [Wed, 11 Jun 2025 09:51:40 +0000 (17:51 +0800)] 
RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()

The caller has already passed in the memslot, yet `kvm_riscv_gstage_map`
retrieves it again in two places, `{kvm_faultin_pfn/mark_page_dirty}`.
Replace these with `{__kvm_faultin_pfn/mark_page_dirty_in_slot}`, which
take the memslot as a parameter.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/50989f0a02790f9d7dc804c2ade6387c4e7fbdbc.1749634392.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs
Quan Zhou [Tue, 17 Jun 2025 13:04:23 +0000 (21:04 +0800)] 
RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs

There is already a helper function find_vma_intersection() for
searching intersecting VMAs; use it directly in KVM.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/230d6c8c8b8dd83081fcfd8d83a4d17c8245fa2f.1731552790.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: perf/kvm: Add reporting of interrupt events
Quan Zhou [Fri, 13 Jun 2025 07:53:38 +0000 (15:53 +0800)] 
RISC-V: perf/kvm: Add reporting of interrupt events

For `perf kvm stat` on RISC-V, interrupts should be reported in
addition to exceptions in order to avoid `UNKNOWN` event names.

Testing without the patch:

Event name                    Samples  Sample%       Time(ns)
---------------------------  --------  --------  ------------
STORE_GUEST_PAGE_FAULT        1496461   53.00%    889612544
UNKNOWN                        887514   31.00%    272857968
LOAD_GUEST_PAGE_FAULT          305164   10.00%    189186331
VIRTUAL_INST_FAULT              70625    2.00%    134114260
SUPERVISOR_SYSCALL              32014    1.00%     58577110
INST_GUEST_PAGE_FAULT               1    0.00%         2545

Testing with the patch:

Event name                    Samples  Sample%       Time(ns)
---------------------------  --------  --------  ------------
IRQ_S_TIMER                   211271    58.00%  738298680600
EXC_STORE_GUEST_PAGE_FAULT    111279    30.00%  130725914800
EXC_LOAD_GUEST_PAGE_FAULT      22039     6.00%   25441480600
EXC_VIRTUAL_INST_FAULT          8913     2.00%   21015381600
IRQ_VS_EXT                      4748     1.00%   10155464300
IRQ_S_EXT                       2802     0.00%   13288775800
IRQ_S_SOFT                      1998     0.00%    4254129300

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/9693132df4d0f857b8be3a75750c36b40213fcc0.1726211632.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Enable ring-based dirty memory tracking
Quan Zhou [Fri, 13 Jun 2025 11:29:57 +0000 (19:29 +0800)] 
RISC-V: KVM: Enable ring-based dirty memory tracking

Enable ring-based dirty memory tracking on riscv:

- Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL as riscv is weakly
  ordered.
- Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page
  offset.
- Add a check to kvm_riscv_check_vcpu_requests for checking
  whether the dirty ring is soft full.

To handle vCPU requests that cause exits to userspace, modify
`kvm_riscv_check_vcpu_requests` to return a value (currently only
0 or 1).
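
The soft-full check and the 0/1 return convention can be sketched with a
simplified userspace model. Everything here is an assumption made for
illustration: `struct dirty_ring`, `dirty_ring_soft_full()` and
`check_vcpu_requests()` are hypothetical names, and the meaning assigned
to each return value is a guess, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a per-vCPU dirty ring. */
struct dirty_ring {
	uint32_t dirty_index;	/* next slot the producer will fill   */
	uint32_t reset_index;	/* next slot userspace has not reaped */
	uint32_t soft_limit;	/* in-use entries that count as "soft full" */
};

bool dirty_ring_soft_full(const struct dirty_ring *ring)
{
	/* Unsigned subtraction stays correct across index wrap-around. */
	return ring->dirty_index - ring->reset_index >= ring->soft_limit;
}

/*
 * Assumed convention: 1 means "keep running the guest", 0 means
 * "exit to userspace so that it can harvest the ring".
 */
int check_vcpu_requests(const struct dirty_ring *ring)
{
	if (dirty_ring_soft_full(ring))
		return 0;
	return 1;
}
```

Returning a value lets the caller of the request check decide whether to
re-enter the guest, which a `void` function could not express.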

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20e116efb1f7aff211dd8e3cf8990c5521ed5f34.1749810735.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap
Samuel Holland [Sat, 11 Jan 2025 00:46:58 +0000 (16:46 -0800)] 
RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap

The Smnpm extension requires special handling because the guest ISA
extension maps to a different extension (Ssnpm) on the host side.
Commit 1851e7836212 ("RISC-V: KVM: Allow Smnpm and Ssnpm extensions for
guests") missed that the vcpu->arch.isa bit is based only on the host
extension, so currently both KVM_RISCV_ISA_EXT_{SMNPM,SSNPM} map to
vcpu->arch.isa[RISCV_ISA_EXT_SSNPM]. This does not cause any problems
for the guest, because both extensions are force-enabled anyway when the
host supports Ssnpm, but prevents checking for (guest) Smnpm in the SBI
FWFT logic.

Redefine kvm_isa_ext_arr to look up the guest extension, since only the
guest -> host mapping is unambiguous. Factor out the logic for checking
for host support of an extension, so this special case only needs to be
handled in one place, and be explicit about which variables hold a host
vs a guest ISA extension.
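
The idea of keying the lookup by guest extension and resolving the host
extension in one helper can be sketched as below. This is a toy model
under stated assumptions: the enums, the `guest_to_host` table and both
functions are hypothetical and only mirror the shape of the change, not
the kernel's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical extension IDs; the real ones live in kernel headers. */
enum guest_ext { GUEST_EXT_SMNPM, GUEST_EXT_SSNPM, GUEST_EXT_MAX };
enum host_ext  { HOST_EXT_SMNPM, HOST_EXT_SSNPM, HOST_EXT_MAX };

/* Guest -> host mapping; this direction is unambiguous. */
static const enum host_ext guest_to_host[GUEST_EXT_MAX] = {
	[GUEST_EXT_SMNPM] = HOST_EXT_SSNPM,	/* guest Smnpm is backed by host Ssnpm */
	[GUEST_EXT_SSNPM] = HOST_EXT_SSNPM,
};

/* The one place that knows about the Smnpm special case. */
bool host_supports_guest_ext(const bool host_isa[HOST_EXT_MAX],
			     enum guest_ext g)
{
	return host_isa[guest_to_host[g]];
}

/* The guest bitmap is indexed by guest extension, so bits stay distinct. */
void guest_isa_enable(bool guest_isa[GUEST_EXT_MAX],
		      const bool host_isa[HOST_EXT_MAX], enum guest_ext g)
{
	if (host_supports_guest_ext(host_isa, g))
		guest_isa[g] = true;
}
```

With this layout, enabling guest Smnpm sets only the Smnpm bit in the
guest bitmap, so later code can test for it separately from Ssnpm.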

Fixes: 1851e7836212 ("RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250111004702.2813013-2-samuel.holland@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Delegate illegal instruction fault to VS mode
Xu Lu [Mon, 14 Jul 2025 09:45:54 +0000 (17:45 +0800)] 
RISC-V: KVM: Delegate illegal instruction fault to VS mode

Delegate illegal instruction fault to VS mode by default to avoid such
exceptions being trapped to HS and redirected back to VS.

The delegation of illegal instruction faults is particularly important
to guest applications that use vector instructions frequently. In such
cases, an illegal instruction fault is raised when a guest user thread
uses a vector instruction for the first time, and the guest kernel then
enables the user thread to execute subsequent vector instructions.

The fw pmu event counter is kept so that the guest can still query
illegal instruction events via an sbi call. The guest will only see a
zero count for illegal instruction faults and thus know that 'firmware'
has delegated them.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Link: https://lore.kernel.org/r/20250714094554.89151-1-luxu.kernel@bytedance.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs
Anup Patel [Wed, 18 Jun 2025 11:35:32 +0000 (17:05 +0530)] 
RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs

Currently, all kvm_riscv_hfence_xyz() APIs assume VMID to be the
host VMID of the Guest/VM, which restricts use of these APIs to
host TLB maintenance only. Let's allow passing VMID as a parameter
to all kvm_riscv_hfence_xyz() APIs so that they can be re-used
for nested virtualization related TLB maintenance.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-13-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Factor-out g-stage page table management
Anup Patel [Wed, 18 Jun 2025 11:35:31 +0000 (17:05 +0530)] 
RISC-V: KVM: Factor-out g-stage page table management

The upcoming nested virtualization can share g-stage page table
management with the current host g-stage implementation, hence
factor out g-stage page table management as separate sources
and also use the "kvm_riscv_mmu_" prefix for host g-stage functions.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-12-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence
Anup Patel [Wed, 18 Jun 2025 11:35:30 +0000 (17:05 +0530)] 
RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence

Currently, struct kvm_riscv_hfence does not have a vmid field
and various hfence processing functions always pick the vmid assigned
to the guest/VM. This prevents us from doing an hfence operation on
an arbitrary vmid, hence add a vmid field to struct kvm_riscv_hfence
and use it wherever applicable.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-11-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Introduce struct kvm_gstage_mapping
Anup Patel [Wed, 18 Jun 2025 11:35:29 +0000 (17:05 +0530)] 
RISC-V: KVM: Introduce struct kvm_gstage_mapping

Introduce struct kvm_gstage_mapping which represents a g-stage
mapping at a particular g-stage page table level. Also, update
the kvm_riscv_gstage_map() to return the g-stage mapping upon
success.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-10-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Factor-out MMU related declarations into separate headers
Anup Patel [Wed, 18 Jun 2025 11:35:28 +0000 (17:05 +0530)] 
RISC-V: KVM: Factor-out MMU related declarations into separate headers

The MMU, TLB, and VMID management for KVM RISC-V already exists as
separate sources, so create separate headers along these lines. This
further simplifies the asm/kvm_host.h header.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-9-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect()
Anup Patel [Wed, 18 Jun 2025 11:35:27 +0000 (17:05 +0530)] 
RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect()

The H-extension CSRs accessed by kvm_riscv_vcpu_trap_redirect() will
trap when KVM RISC-V is running as Guest/VM hence remove these traps
by using ncsr_xyz() instead of csr_xyz().

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-8-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()
Anup Patel [Wed, 18 Jun 2025 11:35:26 +0000 (17:05 +0530)] 
RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()

The kvm_arch_flush_remote_tlbs_range() expected by KVM core can be
easily implemented for RISC-V using kvm_riscv_hfence_gvma_vmid_gpa()
hence provide it.

Also with kvm_arch_flush_remote_tlbs_range() available for RISC-V, the
mmu_wp_memory_region() can happily use kvm_flush_remote_tlbs_memslot()
instead of kvm_flush_remote_tlbs().

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-7-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Don't flush TLB when PTE is unchanged
Anup Patel [Wed, 18 Jun 2025 11:35:25 +0000 (17:05 +0530)] 
RISC-V: KVM: Don't flush TLB when PTE is unchanged

The gstage_set_pte() and gstage_op_pte() should flush TLB only when
a leaf PTE changes so that unnecessary TLB flushes can be avoided.
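
The "flush only on change" rule can be sketched as a tiny userspace
model. The names below (`pte_t` as a plain integer, `flush_tlb_for()`,
`set_leaf_pte()`, `tlb_flush_count`) are hypothetical and stand in for
the real g-stage helpers; a counter replaces the actual TLB flush.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;	/* simplified: a PTE is just a 64-bit value */

/* Counts simulated TLB flushes. */
unsigned int tlb_flush_count;

void flush_tlb_for(const pte_t *ptep)
{
	(void)ptep;
	tlb_flush_count++;
}

/*
 * Write a leaf PTE, flushing only if the stored value actually changes.
 * Returns true if a flush was issued.
 */
bool set_leaf_pte(pte_t *ptep, pte_t new_pte)
{
	if (*ptep == new_pte)
		return false;	/* unchanged: skip the write and the flush */

	*ptep = new_pte;
	flush_tlb_for(ptep);
	return true;
}
```

Writing the same value twice issues one flush rather than two in this
model, which is the saving the commit is after.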

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-6-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH
Anup Patel [Wed, 18 Jun 2025 11:35:24 +0000 (17:05 +0530)] 
RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH

The KVM_REQ_HFENCE_GVMA_VMID_ALL is the same as KVM_REQ_TLB_FLUSH so
to avoid confusion let's replace KVM_REQ_HFENCE_GVMA_VMID_ALL with
KVM_REQ_TLB_FLUSH. Also, rename kvm_riscv_hfence_gvma_vmid_all_process()
to kvm_riscv_tlb_flush_process().

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-5-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()
Anup Patel [Wed, 18 Jun 2025 11:35:23 +0000 (17:05 +0530)] 
RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()

The kvm_riscv_local_tlb_sanitize() deals with sanitizing current
VMID related TLB mappings when a VCPU is moved from one host CPU
to another.

Let's move kvm_riscv_local_tlb_sanitize() to VMID management
sources and rename it to kvm_riscv_gstage_vmid_sanitize().

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-4-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init()
Anup Patel [Wed, 18 Jun 2025 11:35:22 +0000 (17:05 +0530)] 
RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init()

The kvm_riscv_vcpu_aia_init() does not return any failure so drop
the return value which is always zero.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-3-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value
Anup Patel [Wed, 18 Jun 2025 11:35:21 +0000 (17:05 +0530)] 
RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value

The kvm_riscv_vcpu_alloc_vector_context() does return an error code
upon failure so don't ignore this in kvm_arch_vcpu_create().

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250618113532.471448-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks ago Merge tag 'pull-rpc_pipefs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Mon, 28 Jul 2025 16:56:09 +0000 (09:56 -0700)] 
Merge tag 'pull-rpc_pipefs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull rpc_pipefs updates from Al Viro:
 "Massage rpc_pipefs to use saner primitives and clean up the APIs
  provided to the rest of the kernel"

* tag 'pull-rpc_pipefs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  rpc_create_client_dir(): return 0 or -E...
  rpc_create_client_dir(): don't bother with rpc_populate()
  rpc_new_dir(): the last argument is always NULL
  rpc_pipe: expand the calls of rpc_mkdir_populate()
  rpc_gssd_dummy_populate(): don't bother with rpc_populate()
  rpc_mkpipe_dentry(): switch to simple_start_creating()
  rpc_pipe: saner primitive for creating regular files
  rpc_pipe: saner primitive for creating subdirectories
  rpc_pipe: don't overdo directory locking
  rpc_mkpipe_dentry(): saner calling conventions
  rpc_unlink(): saner calling conventions
  rpc_populate(): lift cleanup into callers
  rpc_unlink(): use simple_recursive_removal()
  rpc_{rmdir_,}depopulate(): use simple_recursive_removal() instead
  rpc_pipe: clean failure exits in fill_super
  new helper: simple_start_creating()

3 weeks ago Merge tag 'pull-simple_recursive_removal' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Mon, 28 Jul 2025 16:43:51 +0000 (09:43 -0700)] 
Merge tag 'pull-simple_recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull simple_recursive_removal() update from Al Viro:
 "Removing subtrees of kernel filesystems is done in quite a few places;
  unfortunately, it's easy to get wrong. A number of open-coded attempts
  are out there, with varying amount of bogosities.

  simple_recursive_removal() had been introduced for doing that with all
  precautions needed; it does an equivalent of rm -rf, with sufficient
  locking, eviction of anything mounted on top of the subtree, etc.

  This series converts a bunch of open-coded instances to using that"

* tag 'pull-simple_recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  functionfs, gadgetfs: use simple_recursive_removal()
  kill binderfs_remove_file()
  fuse_ctl: use simple_recursive_removal()
  pstore: switch to locked_recursive_removal()
  binfmt_misc: switch to locked_recursive_removal()
  spufs: switch to locked_recursive_removal()
  add locked_recursive_removal()
  better lockdep annotations for simple_recursive_removal()
  simple_recursive_removal(): saner interaction with fsnotify

3 weeks ago f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
Chao Yu [Thu, 24 Jul 2025 08:01:44 +0000 (16:01 +0800)] 
f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode

With the "mode=lfs" mount option, generic/299 causes a system panic as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.c:2835!
Call Trace:
 <TASK>
 f2fs_allocate_data_block+0x6f4/0xc50
 f2fs_map_blocks+0x970/0x1550
 f2fs_iomap_begin+0xb2/0x1e0
 iomap_iter+0x1d6/0x430
 __iomap_dio_rw+0x208/0x9a0
 f2fs_file_write_iter+0x6b3/0xfa0
 aio_write+0x15d/0x2e0
 io_submit_one+0x55e/0xab0
 __x64_sys_io_submit+0xa5/0x230
 do_syscall_64+0x84/0x2f0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0010:new_curseg+0x70f/0x720

The root cause of running out of space is that, in f2fs_map_blocks(),
f2fs may trigger foreground gc only after it has allocated a physical
block. That is too late when multiple threads write data w/
aio/dio/bufio in parallel: since we always use OPU (out-of-place update)
in lfs mode, f2fs_map_blocks() allocates blocks aggressively.

In order to fix this issue, let's give foreground gc a chance to be
triggered prior to block allocation in f2fs_map_blocks().

Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks ago f2fs: fix to calculate dirty data during has_not_enough_free_secs()
Chao Yu [Thu, 24 Jul 2025 08:01:43 +0000 (16:01 +0800)] 
f2fs: fix to calculate dirty data during has_not_enough_free_secs()

In lfs mode, dirty data needs OPU, so we had better account for it when
calculating lower_p and upper_p in has_not_enough_free_secs(); otherwise
we may run out of space because foreground gc did not reclaim enough
free sections.

Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks ago f2fs: fix to update upper_p in __get_secs_required() correctly
Chao Yu [Thu, 24 Jul 2025 08:01:42 +0000 (16:01 +0800)] 
f2fs: fix to update upper_p in __get_secs_required() correctly

Commit 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()")
failed to account for data_secs when calculating upper_p; fix it.

Fixes: 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks ago f2fs: directly add newly allocated pre-dirty nat entry to dirty set list
wangzijie [Mon, 28 Jul 2025 05:02:36 +0000 (13:02 +0800)] 
f2fs: directly add newly allocated pre-dirty nat entry to dirty set list

When we need to allocate a nat entry and set it dirty, we can directly
add it to the dirty set list (or initialize its list_head for new_ne)
instead of adding it to the clean list and then moving it. Introduce an
init_dirty flag to do so.

Signed-off-by: wangzijie <wangzijie1@honor.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks ago f2fs: avoid redundant clean nat entry move in lru list
wangzijie [Mon, 28 Jul 2025 05:02:35 +0000 (13:02 +0800)] 
f2fs: avoid redundant clean nat entry move in lru list

__lookup_nat_cache moves clean nat entries in LRU fashion. When nat
entries are about to become dirty, there is no need to move them to the
tail of the lru list. Introduce a parameter 'for_dirty' to avoid it.
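
A rough userspace model of the 'for_dirty' idea follows. It is a sketch
under stated assumptions: the structures and `lookup_nat_cache()` are
hypothetical, and a monotonically increasing stamp stands in for moving
an entry to the tail of a real LRU list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified nat entry and cache. */
struct nat_entry {
	unsigned int nid;
	unsigned long lru_stamp;	/* larger value == more recently used */
	bool dirty;
};

struct nat_cache {
	struct nat_entry *entries;
	size_t count;
	unsigned long clock;
};

struct nat_entry *lookup_nat_cache(struct nat_cache *c, unsigned int nid,
				   bool for_dirty)
{
	for (size_t i = 0; i < c->count; i++) {
		struct nat_entry *ne = &c->entries[i];

		if (ne->nid != nid)
			continue;
		/*
		 * Refresh the LRU position only for clean entries that
		 * will stay clean; an entry about to be dirtied leaves
		 * the clean list anyway.
		 */
		if (!ne->dirty && !for_dirty)
			ne->lru_stamp = ++c->clock;
		return ne;
	}
	return NULL;
}
```

A caller that is about to dirty the entry passes `for_dirty = true` and
so skips the LRU update that would be undone immediately afterwards.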

Signed-off-by: wangzijie <wangzijie1@honor.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks ago Merge tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Mon, 28 Jul 2025 16:17:57 +0000 (09:17 -0700)] 
Merge tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull dentry d_flags updates from Al Viro:
 "The current exclusion rules for dentry->d_flags stores are rather
  unpleasant. The basic rules are simple:

   - stores to dentry->d_flags are OK under dentry->d_lock

   - stores to dentry->d_flags are OK in the dentry constructor, before
     it becomes potentially visible to other threads

  Unfortunately, there's a couple of exceptions to that, and that's
  where the headache comes from.

  The main PITA comes from d_set_d_op(); that primitive sets ->d_op of
  dentry and adjusts the flags that correspond to presence of individual
  methods. It's very easy to misuse; existing uses _are_ safe, but proof
  of correctness is brittle.

  Use in __d_alloc() is safe (we are within a constructor), but we might
  as well precalculate the initial value of 'd_flags' when we set the
  default ->d_op for given superblock and set 'd_flags' directly instead
  of messing with that helper.

  The reasons why other uses are safe are bloody convoluted; I'm not
  going to reproduce it here. See [1] for gory details, if you care. The
  critical part is using d_set_d_op() only just prior to
  d_splice_alias(), which makes a combination of d_splice_alias() with
  setting ->d_op, etc a natural replacement primitive.

  Better yet, if we go that way, it's easy to take setting ->d_op and
  modifying 'd_flags' under ->d_lock, which eliminates the headache as
  far as 'd_flags' exclusion rules are concerned. Other exceptions are
  minor and easy to deal with.

  What this series does:

   - d_set_d_op() is no longer available; instead a new primitive
     (d_splice_alias_ops()) is provided, equivalent to combination of
     d_set_d_op() and d_splice_alias().

   - new field of struct super_block - 's_d_flags'. This sets the
     default value of 'd_flags' to be used when allocating dentries on
     this filesystem.

   - new primitive for setting 's_d_op': set_default_d_op(). This
     replaces stores to 's_d_op' at mount time.

     All in-tree filesystems converted; out-of-tree ones will get caught
     by the compiler ('s_d_op' is renamed, so stores to it will be
     caught). 's_d_flags' is set by the same primitive to match the
     's_d_op'.

   - a lot of filesystems had sb->s_d_op->d_delete equal to
     always_delete_dentry; that is equivalent to setting
     DCACHE_DONTCACHE in 'd_flags', so such filesystems can bloody well
     set that bit in 's_d_flags' and drop 'd_delete()' from
     dentry_operations.

     In quite a few cases that results in empty dentry_operations, which
     means that we can get rid of those.

   - kill simple_dentry_operations - not needed anymore

   - massage d_alloc_parallel() to get rid of the other exception wrt
     'd_flags' stores - we can set DCACHE_PAR_LOOKUP as soon as we
     allocate the new dentry; no need to delay that until we commit to
     using the sucker.

  As the result, 'd_flags' stores are all either under ->d_lock or done
  before the dentry becomes visible in any shared data structures"

Link: https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/ [1]
* tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
  configfs: use DCACHE_DONTCACHE
  debugfs: use DCACHE_DONTCACHE
  efivarfs: use DCACHE_DONTCACHE instead of always_delete_dentry()
  9p: don't bother with always_delete_dentry
  ramfs, hugetlbfs, mqueue: set DCACHE_DONTCACHE
  kill simple_dentry_operations
  devpts, sunrpc, hostfs: don't bother with ->d_op
  shmem: no dentry retention past the refcount reaching zero
  d_alloc_parallel(): set DCACHE_PAR_LOOKUP earlier
  make d_set_d_op() static
  simple_lookup(): just set DCACHE_DONTCACHE
  tracefs: Add d_delete to remove negative dentries
  set_default_d_op(): calculate the matching value for ->d_flags
  correct the set of flags forbidden at d_set_d_op() time
  split d_flags calculation out of d_set_d_op()
  new helper: set_default_d_op()
  fuse: no need for special dentry_operations for root dentry
  switch procfs from d_set_d_op() to d_splice_alias_ops()
  new helper: d_splice_alias_ops()
  procfs: kill ->proc_dops
  ...

3 weeks ago fsnotify: optimize FMODE_NONOTIFY_PERM for the common cases
Amir Goldstein [Tue, 8 Jul 2025 14:36:41 +0000 (16:36 +0200)] 
fsnotify: optimize FMODE_NONOTIFY_PERM for the common cases

The most unlikely watched permission event is FAN_ACCESS_PERM, because
at the time that it was introduced there were no evictable ignore marks,
so subscribing to FAN_ACCESS_PERM would have incurred a very high
overhead.

Yet, when we set the fmode to FMODE_NOTIFY_HSM(), we never skip trying
to send FAN_ACCESS_PERM, which is almost always a waste of cycles.

We got to this logic because of bundling FAN_OPEN*_PERM and
FAN_ACCESS_PERM in the same category and because FAN_OPEN_PERM is a
commonly used event.

By open coding fsnotify_open_perm() in fsnotify_open_perm_and_set_mode(),
we no longer need to regard FAN_OPEN*_PERM when calculating fmode.

This leaves the case of having pre-content events and not having any
other permission event in the object masks a more likely case than the
other way around.

Rework the fmode macros and code so that their meaning now refers only
to hooks on an already open file:

- FMODE_NOTIFY_NONE() skip all events
- FMODE_NOTIFY_ACCESS_PERM() send all permission events including
   FAN_ACCESS_PERM
- FMODE_NOTIFY_HSM() send pre-content permission events

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250708143641.418603-3-amir73il@gmail.com
3 weeks ago fsnotify: merge file_set_fsnotify_mode_from_watchers() with open perm hook
Amir Goldstein [Tue, 8 Jul 2025 14:36:40 +0000 (16:36 +0200)] 
fsnotify: merge file_set_fsnotify_mode_from_watchers() with open perm hook

Create helper fsnotify_open_perm_and_set_mode() that moves the
fsnotify_open_perm() hook into file_set_fsnotify_mode_from_watchers().

This will allow some more optimizations.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250708143641.418603-2-amir73il@gmail.com
3 weeks ago samples: fix building fs-monitor on musl systems
Brahmajit Das [Mon, 30 Jun 2025 11:02:41 +0000 (13:02 +0200)] 
samples: fix building fs-monitor on musl systems

samples/fanotify/fs-monitor.c:22:9: error: unknown type name '__s32'
   22 |         __s32 error;
      |         ^~~~~
samples/fanotify/fs-monitor.c:23:9: error: unknown type name '__u32'
   23 |         __u32 error_count;
      |         ^~~~~
samples/fanotify/fs-monitor.c: In function 'handle_notifications':
samples/fanotify/fs-monitor.c:98:50: error: 'fsid_t' has no member named 'val'; did you mean '__val'?
   98 |                                        fid->fsid.val[0], fid->fsid.val[1]);
      |                                                  ^~~
      |                                                  __val
samples/fanotify/fs-monitor.c:98:68: error: 'fsid_t' has no member named 'val'; did you mean '__val'?
   98 |                                        fid->fsid.val[0], fid->fsid.val[1]);
      |                                                                    ^~~
      |                                                                    __val

This is because sys/fanotify.h on musl does not include
linux/fanotify.h [0], unlike the glibc header, which does. As a result,
fsid is not of type __kernel_fsid_t but of the libc's own type, which
does not have a val member but __val instead.

[0]: https://git.musl-libc.org/cgit/musl/tree/include/sys/fanotify.h
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250630103011.27484-1-listout@listout.xyz
3 weeks ago Merge tag 'pull-headers_param' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Mon, 28 Jul 2025 16:03:37 +0000 (09:03 -0700)] 
Merge tag 'pull-headers_param' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull asm/param cleanup from Al Viro:
 "This massages asm/param.h to simpler and more uniform shape:

   - all arch/*/include/uapi/asm/param.h are either generated includes
     of <asm-generic/param.h> or a #define or two followed by such
     include

   - no arch/*/include/asm/param.h anywhere, generated or not

   - include <asm/param.h> resolves to arch/*/include/uapi/asm/param.h
     of the architecture in question (or that of host in case of uml)

   - include/asm-generic/param.h pulls uapi/asm-generic/param.h and
     deals with USER_HZ, CLOCKS_PER_SEC and with HZ redefinition after
     that"

* tag 'pull-headers_param' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  loongarch, um, xtensa: get rid of generated arch/$ARCH/include/asm/param.h
  alpha: regularize the situation with asm/param.h
  xtensa: get rid uapi/asm/param.h

3 weeks agoNFS: Fixup allocation flags for nfsiod's __GFP_NORETRY
Benjamin Coddington [Thu, 10 Jul 2025 01:47:43 +0000 (21:47 -0400)] 
NFS: Fixup allocation flags for nfsiod's __GFP_NORETRY

If the NFS client is doing writeback from a workqueue context, avoid using
__GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
PF_MEMALLOC_NOFS.  The combination of these flags makes memory allocation
failures much more likely.

We've seen those allocation failures show up when the loopback driver is
doing writeback from a workqueue to a file on NFS, where memory allocation
failure results in errors or corruption within the loopback device's
filesystem.
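
The rule described above can be sketched as a small userspace model. The bit values below are hypothetical, the function name io_gfp_mask() is made up for illustration, and returning plain GFP_KERNEL outside workqueue context is an assumption; this is not the kernel's implementation:

```c
#include <assert.h>

/* Hypothetical bit values for illustration; the kernel's real values differ. */
#define PF_WQ_WORKER		0x01u
#define PF_MEMALLOC_NOIO	0x02u
#define PF_MEMALLOC_NOFS	0x04u

#define GFP_KERNEL		0x10u
#define __GFP_NORETRY		0x20u

/*
 * Model of the decision: a workqueue worker gets __GFP_NORETRY, unless the
 * task already restricts reclaim with PF_MEMALLOC_NOIO or PF_MEMALLOC_NOFS.
 */
static unsigned int io_gfp_mask(unsigned int task_flags)
{
	if (!(task_flags & PF_WQ_WORKER))
		return GFP_KERNEL;
	if (task_flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))
		return GFP_KERNEL;
	return GFP_KERNEL | __GFP_NORETRY;
}
```

The point of the model is only the second branch: the restricted-reclaim flags and __GFP_NORETRY are never combined.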

Suggested-by: Trond Myklebust <trondmy@kernel.org>
Fixes: 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/f83ac1155a4bc670f2663959a7a068571e06afd9.1752111622.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
3 weeks agoMerge tag 'nfsd-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Mon, 28 Jul 2025 16:01:09 +0000 (09:01 -0700)] 
Merge tag 'nfsd-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "NFSD is finally able to offer write delegations to clients that open
  files with O_WRONLY, thanks to patches from Dai Ngo. We're expecting
  this to accelerate a few interesting corner cases.

  The cap on the number of operations per NFSv4 COMPOUND has been
  lifted. Now, clients that send COMPOUNDs containing dozens of
  operations (for example, a long stream of LOOKUP operations to walk a
  pathname in a single round trip) will no longer be rejected.

  This release re-enables the ability for NFSD to perform NFSv4.2 COPY
  operations asynchronously. This feature has been disabled to mitigate
  the risk of denial-of-service when too many such requests arrive.

  Many thanks to the contributors, reviewers, testers, and bug reporters
  who participated during the v6.17 development cycle"

* tag 'nfsd-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (32 commits)
  nfsd: Drop dprintk in blocklayout xdr functions
  sunrpc: make svc_tcp_sendmsg() take a signed sentp pointer
  sunrpc: rearrange struct svc_rqst for fewer cachelines
  sunrpc: return better error in svcauth_gss_accept() on alloc failure
  sunrpc: reset rq_accept_statp when starting a new RPC
  sunrpc: remove SVC_SYSERR
  sunrpc: fix handling of unknown auth status codes
  NFSD: Simplify struct knfsd_fh
  NFSD: Access a knfsd_fh's fsid by pointer
  Revert "NFSD: Force all NFSv4.2 COPY requests to be synchronous"
  NFSD: Avoid multiple -Wflex-array-member-not-at-end warnings
  NFSD: Use vfs_iocb_iter_write()
  NFSD: Use vfs_iocb_iter_read()
  NFSD: Clean up kdoc for nfsd_open_local_fh()
  NFSD: Clean up kdoc for nfsd_file_put_local()
  NFSD: Remove definition for trace_nfsd_ctl_maxconn
  NFSD: Remove definition for trace_nfsd_file_gc_recent
  NFSD: Remove definitions for unused trace_nfsd_file_lru trace points
  NFSD: Remove definition for trace_nfsd_file_unhash_and_queue
  nfsd: Use correct error code when decoding extents
  ...

3 weeks agoMerge tag 'gfs2-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux...
Linus Torvalds [Mon, 28 Jul 2025 15:58:58 +0000 (08:58 -0700)] 
Merge tag 'gfs2-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Prevent cluster nodes from trying to recover their own filesystems
   during a withdraw

 - Add two missing migrate_folio aops and an additional exhash directory
   consistency check (both triggered by syzbot bug reports)

 - Sanitize how dlm results are processed and clean up a few quirks in
   the glock code

 - Minor stuff: Get rid of the GIF_ALLOC_FAILED flag; use SECTOR_SIZE
   and SECTOR_SHIFT

* tag 'gfs2-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: No more self recovery
  gfs2: Validate i_depth for exhash directories
  gfs2: Set .migrate_folio in gfs2_{rgrp,meta}_aops
  gfs2: a minor finish_xmote cleanup
  gfs2: simplify finish_xmote
  gfs2: sanitize the gdlm_ast -> finish_xmote interface
  gfs2: Minor do_xmote cancelation fix
  gfs2: Remove GIF_ALLOC_FAILED flag
  gfs2: Use SECTOR_SIZE and SECTOR_SHIFT

3 weeks agoMerge tag 'xfs-merge-6.17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Mon, 28 Jul 2025 15:55:53 +0000 (08:55 -0700)] 
Merge tag 'xfs-merge-6.17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Carlos Maiolino:
 "This doesn't contain any new features. It mostly is a collection of
  clean ups and code refactoring that I preferred to postpone to the
  merge window.

  It includes removal of several unused tracepoints, refactoring key
  comparing routines under the B-Trees management and cleanup of xfs
  journaling code"

* tag 'xfs-merge-6.17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (44 commits)
  xfs: don't use a xfs_log_iovec for ri_buf in log recovery
  xfs: don't use a xfs_log_iovec for attr_item names and values
  xfs: use better names for size members in xfs_log_vec
  xfs: cleanup the ordered item logic in xlog_cil_insert_format_items
  xfs: don't pass the old lv to xfs_cil_prepare_item
  xfs: remove unused trace event xfs_reflink_cow_enospc
  xfs: remove unused trace event xfs_discard_rtrelax
  xfs: remove unused trace event xfs_log_cil_return
  xfs: remove unused trace event xfs_dqreclaim_dirty
  fs/xfs: replace strncpy with memtostr_pad()
  xfs: Remove unused label in xfs_dax_notify_dev_failure
  xfs: improve the comments in xfs_select_zone_nowait
  xfs: improve the comments in xfs_max_open_zones
  xfs: stop passing an inode to the zone space reservation helpers
  xfs: rename oz_write_pointer to oz_allocated
  xfs: use a uint32_t to cache i_used_blocks in xfs_init_zone
  xfs: improve the xg_active_ref check in xfs_group_free
  xfs: remove the xlog_ticket_t typedef
  xfs: remove xrep_trans_{alloc,cancel}_hook_dummy
  xfs: return the allocated transaction from xchk_trans_alloc_empty
  ...

3 weeks agoNFSv4.2: another fix for listxattr
Olga Kornievskaia [Tue, 22 Jul 2025 20:56:41 +0000 (16:56 -0400)] 
NFSv4.2: another fix for listxattr

Currently, when the server supports NFS4.1 security labels, the
security.selinux label is included twice. Instead, only add it
when the server doesn't possess security label support.

Fixes: 243fea134633 ("NFSv4.2: fix listxattr to return selinux security label")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Link: https://lore.kernel.org/r/20250722205641.79394-1-okorniev@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
3 weeks agoMerge tag 'erofs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang...
Linus Torvalds [Mon, 28 Jul 2025 15:49:32 +0000 (08:49 -0700)] 
Merge tag 'erofs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "We now support metadata compression. It can be useful for embedded use
  cases or archiving a large number of small files.

  Additionally, readdir performance has been improved by enabling
  readahead (note that it was already common practice for ext3/4 non-dx
  and f2fs directories). We may consider further improvements later to
  align with ext4's s_inode_readahead_blks behavior for slow devices
  too.

  The remaining commits are minor.

  Summary:

   - Add support for metadata compression

   - Enable readahead for directories to improve readdir performance

   - Minor fixes and cleanups"

* tag 'erofs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: support to readahead dirent blocks in erofs_readdir()
  erofs: implement metadata compression
  erofs: add on-disk definition for metadata compression
  erofs: fix build error with CONFIG_EROFS_FS_ZIP_ACCEL=y
  erofs: remove ENOATTR definition
  erofs: refine erofs_iomap_begin()
  erofs: unify meta buffers in z_erofs_fill_inode()
  erofs: remove need_kmap in erofs_read_metabuf()
  erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()
  erofs: get rid of {get,put}_page() for ztailpacking data

3 weeks agoMerge tag 'ntfs3_for_6.17' of https://github.com/Paragon-Software-Group/linux-ntfs3
Linus Torvalds [Mon, 28 Jul 2025 15:46:55 +0000 (08:46 -0700)] 
Merge tag 'ntfs3_for_6.17' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
 "Added:
   - sanity check for file name
   - mark live inode as bad and avoid any operations

  Fixed:
   - handling of symlinks created in windows
   - creation of symlinks for relative path

  Changed:
   - cancel setting inode as bad after removing name fails
   - revert 'replace inode_trylock with inode_lock'"

* tag 'ntfs3_for_6.17' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  Revert "fs/ntfs3: Replace inode_trylock with inode_lock"
  fs/ntfs3: Exclude call make_bad_inode for live nodes.
  fs/ntfs3: cancle set bad inode after removing name fails
  fs/ntfs3: Add sanity check for file name
  fs/ntfs3: correctly create symlink for relative path
  fs/ntfs3: fix symlinks cannot be handled correctly

3 weeks agoMerge tag 'for-6.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds [Mon, 28 Jul 2025 15:42:29 +0000 (08:42 -0700)] 
Merge tag 'for-6.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "A number of usability and feature updates, scattered performance
  improvements and fixes. Highlight of the core changes is getting
  closer to enabling large folios (now behind a config option).

  User visible changes:

   - update defrag ioctl, add new flag to request no compression on
     existing extents

   - restrict writes to block devices after mount

   - in experimental config, enable large folios for data, almost
     complete but not widely tested

   - add stats tracking duration of critical section in transaction
     commit to /sys/fs/btrfs/FSID/commit_stats

  Performance improvements:

   - caching of lookup results of free space bitmap (20% runtime
     improvement on an empty file creation benchmark)

   - accessors to metadata (b-tree items) simplified and optimized,
     minor improvement in metadata-heavy workloads

   - readahead on compressed data improves sequential read

   - the xarray for extent buffers is indexed by denser keys, leading to
     better packing of the nodes (50-70% reduction of leaf nodes)

  Notable fixes:

   - stricter compression mount option parsing

   - send properly emits fallocate command for file holes when protocol
     v2 is used

   - fix overallocation of chunks with mount option 'ssd_spread', due to
     interaction with size classes not finding the right chunk
     (workaround: manual reclaim by 'usage' balance filter)

   - various quota enable/disable races with rescan, more verbose
     notifications about inconsistent state

   - populate otime in tree-log during log replay

   - handle ENOSPC when NOCOW file is used with mmap()

  Core:

   - large data folios enabled in experimental config

   - improved error handling, transaction abort call sites

   - in zoned mode, allocate reloc block group on mount to make sure
     there's always one available for zone reclaim under heavy load

   - rework device opening, they're always open as read-only and delayed
     until the super block is created, allowing the restricted writes
     after mount

   - preparatory work for adding blk_holder_ops, allowing device
     freeze/thaw in the future

  Cleanups, refactoring:

   - type and naming unifications (int/bool, return variables)

   - rb-tree helper refactoring and simplifications

   - reorder memory allocations to less critical places

   - RCU string (used for device name) refactoring and API removal

   - replace all remaining use of strcpy()"

* tag 'for-6.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (209 commits)
  btrfs: send: use fallocate for hole punching with send stream v2
  btrfs: unfold transaction aborts when writing dirty block groups
  btrfs: use saner variable type and name to indicate extrefs at add_inode_ref()
  btrfs: don't skip remaining extrefs if dir not found during log replay
  btrfs: don't ignore inode missing when replaying log tree
  btrfs: enable large data folios for data reloc inode
  btrfs: output more info when btrfs_subpage_assert() failed
  btrfs: reloc: unconditionally invalidate the page cache for each cluster
  btrfs: defrag: add flag to force no-compression
  btrfs: fix ssd_spread overallocation
  btrfs: zoned: requeue to unused block group list if zone finish failed
  btrfs: zoned: do not remove unwritten non-data block group
  btrfs: remove btrfs_clear_extent_bits()
  btrfs: use cached state when falling back from NOCoW write to CoW write
  btrfs: set EXTENT_NORESERVE before range unlock in btrfs_truncate_block()
  btrfs: don't print relocation messages from auto reclaim
  btrfs: remove redundant auto reclaim log message
  btrfs: make btrfs_check_nocow_lock() check more than one extent
  btrfs: assert we can NOCOW the range in btrfs_truncate_block()
  btrfs: update function comment for btrfs_check_nocow_lock()
  ...

3 weeks agoKVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
Oliver Upton [Mon, 28 Jul 2025 15:26:03 +0000 (08:26 -0700)] 
KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list

VDISR_EL2 and VSESR_EL2 are now visible to userspace for nested VMs. Add
them to get-reg-list.

Link: https://lore.kernel.org/r/20250728152603.2823699-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
3 weeks agoMerge branch 'kvm-arm64/vgic-v4-ctl' into kvmarm/next
Oliver Upton [Mon, 28 Jul 2025 15:11:34 +0000 (08:11 -0700)] 
Merge branch 'kvm-arm64/vgic-v4-ctl' into kvmarm/next

* kvm-arm64/vgic-v4-ctl:
  : Userspace control of nASSGIcap, courtesy of Raghavendra Rao Ananta
  :
  : Allow userspace to decide if support for SGIs without an active state is
  : advertised to the guest, allowing VMs from GICv3-only hardware to be
  : migrated to GICv4.1 capable machines.
  Documentation: KVM: arm64: Describe VGICv3 registers writable pre-init
  KVM: arm64: selftests: Add test for nASSGIcap attribute
  KVM: arm64: vgic-v3: Allow userspace to write GICD_TYPER2.nASSGIcap
  KVM: arm64: vgic-v3: Allow access to GICD_IIDR prior to initialization
  KVM: arm64: vgic-v3: Consolidate MAINT_IRQ handling
  KVM: arm64: Disambiguate support for vSGIs v. vLPIs

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
3 weeks agoiommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
Nicolin Chen [Thu, 24 Jul 2025 22:10:02 +0000 (15:10 -0700)] 
iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size

It's more flexible to have a get_viommu_size op. Replace static vsmmu_size
and vsmmu_type with that.

Link: https://patch.msgid.link/r/20250724221002.1883034-3-nicolinc@nvidia.com
Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 weeks agoiommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
Nicolin Chen [Thu, 24 Jul 2025 22:10:01 +0000 (15:10 -0700)] 
iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3

When viommu type is IOMMU_VIOMMU_TYPE_ARM_SMMUV3, always return or init the
standard struct arm_vsmmu, instead of going through impl_ops, which must
have its own viommu type other than the standard IOMMU_VIOMMU_TYPE_ARM_SMMUV3.

Given that arm_vsmmu_init() is called after arm_smmu_get_viommu_size(), any
unsupported viommu->type must be a corruption. And it must be a driver bug
if its vsmmu_size and vsmmu_init ops aren't paired. Warn in both cases.

Link: https://patch.msgid.link/r/20250724221002.1883034-2-nicolinc@nvidia.com
Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 weeks agoMerge branch 'kvm-arm64/el2-reg-visibility' into kvmarm/next
Oliver Upton [Mon, 28 Jul 2025 15:06:27 +0000 (08:06 -0700)] 
Merge branch 'kvm-arm64/el2-reg-visibility' into kvmarm/next

* kvm-arm64/el2-reg-visibility:
  : Fixes to EL2 register visibility, courtesy of Marc Zyngier
  :
  :  - Expose EL2 VGICv3 registers via the VGIC attributes accessor, not the
  :    KVM_{GET,SET}_ONE_REG ioctls
  :
  :  - Condition visibility of FGT registers on the presence of FEAT_FGT in
  :    the VM
  KVM: arm64: selftest: vgic-v3: Add basic GICv3 sysreg userspace access test
  KVM: arm64: Enforce the sorting of the GICv3 system register table
  KVM: arm64: Clarify the check for reset callback in check_sysreg_table()
  KVM: arm64: vgic-v3: Fix ordering of ICH_HCR_EL2
  KVM: arm64: Document registers exposed via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: selftests: get-reg-list: Add base EL2 registers
  KVM: arm64: selftests: get-reg-list: Simplify feature dependency
  KVM: arm64: Advertise FGT2 registers to userspace
  KVM: arm64: Condition FGT registers on feature availability
  KVM: arm64: Expose GICv3 EL2 registers via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: Let GICv3 save/restore honor visibility attribute
  KVM: arm64: Define helper for ICH_VTR_EL2
  KVM: arm64: Define constant value for ICC_SRE_EL2
  KVM: arm64: Don't advertise ICH_*_EL2 registers through GET_ONE_REG
  KVM: arm64: Make RVBAR_EL2 accesses UNDEF

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
3 weeks agoMerge branch 'kvm-arm64/config-masks' into kvmarm/next
Oliver Upton [Mon, 28 Jul 2025 15:03:03 +0000 (08:03 -0700)] 
Merge branch 'kvm-arm64/config-masks' into kvmarm/next

* kvm-arm64/config-masks:
  : More config-driven mask computation, courtesy of Marc Zyngier
  :
  : Converts more system registers to the config-driven computation of RESx
  : masks based on the advertised feature set
  KVM: arm64: Tighten the definition of FEAT_PMUv3p9
  KVM: arm64: Convert MDCR_EL2 to config-driven sanitisation
  KVM: arm64: Convert SCTLR_EL1 to config-driven sanitisation
  KVM: arm64: Convert TCR2_EL2 to config-driven sanitisation
  arm64: sysreg: Add THE/ASID2 controls to TCR2_ELx

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
3 weeks agosmb: client: get rid of kstrdup() when parsing iocharset mount option
Paulo Alcantara [Sat, 26 Jul 2025 16:47:51 +0000 (13:47 -0300)] 
smb: client: get rid of kstrdup() when parsing iocharset mount option

Steal string reference from @param->string rather than duplicating it.
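
A minimal userspace sketch of this ownership transfer follows. The struct and function names are hypothetical stand-ins for the parsed parameter and the mount context, and malloc()/free() stand in for the kernel allocators:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the parsed parameter and the mount context. */
struct param { char *string; };
struct ctx { char *iocharset; };

/*
 * Take ownership of the parsed string instead of copying it. Clearing
 * p->string ensures the parser's cleanup does not free it a second time.
 */
static void ctx_steal_iocharset(struct ctx *c, struct param *p)
{
	assert(c && p);
	free(c->iocharset);		/* drop any previously set value */
	c->iocharset = p->string;	/* move the pointer, no allocation */
	p->string = NULL;		/* the parameter no longer owns it */
}
```

Compared with duplicating the string, this removes one allocation and one failure path.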

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: get rid of kstrdup() when parsing domain mount option
Paulo Alcantara [Sat, 26 Jul 2025 16:45:43 +0000 (13:45 -0300)] 
smb: client: get rid of kstrdup() when parsing domain mount option

Steal string reference from @param->string rather than duplicating it.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: get rid of kstrdup() when parsing pass2 mount option
Paulo Alcantara [Sat, 26 Jul 2025 16:40:28 +0000 (13:40 -0300)] 
smb: client: get rid of kstrdup() when parsing pass2 mount option

Steal string reference from @param->string rather than duplicating it.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: get rid of kstrdup() when parsing pass mount option
Paulo Alcantara [Sat, 26 Jul 2025 16:38:52 +0000 (13:38 -0300)] 
smb: client: get rid of kstrdup() when parsing pass mount option

Steal string reference from @param->string rather than duplicating it.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: get rid of kstrdup() when parsing user mount option
Paulo Alcantara [Sat, 26 Jul 2025 16:36:51 +0000 (13:36 -0300)] 
smb: client: get rid of kstrdup() when parsing user mount option

Steal string reference from @param->string rather than duplicating it.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agorv: Fix wrong type cast in reactors_show() and monitor_reactor_show()
Nam Cao [Sun, 27 Jul 2025 17:31:13 +0000 (19:31 +0200)] 
rv: Fix wrong type cast in reactors_show() and monitor_reactor_show()

Argument 'p' of reactors_show() and monitor_reactor_show() is not a pointer
to struct rv_reactor, it is actually a pointer to the list_head inside
struct rv_reactor. Therefore it's wrong to cast 'p' to struct rv_reactor *.

This wrong type cast has been there since the beginning. But it still
worked because the list_head was the first field in struct rv_reactor_def.
This is no longer true since commit 3d3c376118b5 ("rv: Merge struct
rv_reactor_def into struct rv_reactor") moved the list_head, and this wrong
type cast became a functional problem.

Properly use container_of() instead.
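
The difference can be sketched in userspace. The macro below is a simplified re-creation of the kernel's container_of() without its type checking, and struct reactor is a hypothetical layout in which the list node is not the first member:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Simplified userspace re-creation of the kernel macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical layout: the list node is no longer the first member. */
struct reactor {
	const char *name;
	struct list_head list;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
static struct reactor *reactor_from_node(struct list_head *p)
{
	assert(p);
	return container_of(p, struct reactor, list);
}
```

A plain cast of the node pointer to struct reactor * is only correct while the node sits at offset 0, which is why the old code worked until the member moved.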

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/b4febbd6844311209e4c8768b65d508b81bd8c9b.1753625621.git.namcao@linutronix.de
Fixes: 3d3c376118b5 ("rv: Merge struct rv_reactor_def into struct rv_reactor")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 weeks agorv: Fix wrong type cast in monitors_show()
Nam Cao [Sun, 27 Jul 2025 17:31:12 +0000 (19:31 +0200)] 
rv: Fix wrong type cast in monitors_show()

Argument 'p' of monitors_show() is not a pointer to struct rv_monitor, it
is actually a pointer to the list_head inside struct rv_monitor. Therefore
it is wrong to cast 'p' to struct rv_monitor *.

This wrong type cast has been there since the beginning. But it still
worked because the list_head was the first field in struct rv_monitor_def.
This is no longer true since commit 24cbfe18d55a ("rv: Merge struct
rv_monitor_def into struct rv_monitor") moved the list_head, and this wrong
type cast became a functional problem.

Properly use container_of() instead.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/35e49e97696007919ceacf73796487a2e15a3d02.1753625621.git.namcao@linutronix.de
Fixes: 24cbfe18d55a ("rv: Merge struct rv_monitor_def into struct rv_monitor")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 weeks agodrm/xe/configfs: Fix pci_dev reference leak
Michal Wajdeczko [Tue, 22 Jul 2025 14:10:54 +0000 (16:10 +0200)] 
drm/xe/configfs: Fix pci_dev reference leak

We are using the pci_get_domain_bus_and_slot() function to verify that
the given config directory name matches an existing PCI device,
but we failed to call the matching pci_dev_put() to release the reference.

While at it, also change the error code returned when no device matches,
to make it more specific than the generic formatting error.
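
The get/put pairing can be modelled with a plain counter. This is a sketch with hypothetical names (struct dev, dev_get(), dev_put(), dev_exists()); it does not use the PCI API and leaves out the error-code change:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device; models the get/put pairing only. */
struct dev { int refcount; };

static struct dev *dev_get(struct dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void dev_put(struct dev *d)
{
	if (d)
		d->refcount--;
}

/*
 * Existence check: the lookup takes a reference, so that reference must be
 * dropped again on every path where a device was found.
 */
static int dev_exists(struct dev *candidate)
{
	struct dev *d = dev_get(candidate);	/* stands in for the lookup */

	if (!d)
		return 0;
	dev_put(d);
	return 1;
}
```

Without the dev_put() call, every successful check would leave the counter one higher than before.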

Fixes: 16280ded45fb ("drm/xe: Add configfs to enable survivability mode")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://lore.kernel.org/r/20250722141059.30707-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0bdd05c2a82bbf2419415d012fd4f5faeca7f1af)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agodrm/xe/hw_engine_group: Avoid call kfree() for drmm_kzalloc()
Shuicheng Lin [Thu, 24 Jul 2025 19:38:55 +0000 (19:38 +0000)] 
drm/xe/hw_engine_group: Avoid call kfree() for drmm_kzalloc()

Memory allocated with drmm_kzalloc() should not be freed using
kfree(), as it is managed by the DRM subsystem. The memory will
be automatically freed when the associated drm_device is released.
These 3 group pointers are allocated using drmm_kzalloc() in
hw_engine_group_alloc(), so they don't require manual deallocation.
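
A loose userspace model of owner-managed allocation is shown below. struct owner, owner_kzalloc() and owner_release() are hypothetical names, and the fixed-size table is a simplification; the DRM managed-resource code is more general:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner that tracks its allocations. */
#define MAX_MANAGED 8

struct owner {
	void *managed[MAX_MANAGED];
	size_t count;
};

/* Allocate zeroed memory whose lifetime is tied to the owner. */
static void *owner_kzalloc(struct owner *o, size_t size)
{
	void *p;

	assert(o);
	if (o->count == MAX_MANAGED)
		return NULL;
	p = calloc(1, size);
	if (p)
		o->managed[o->count++] = p;
	return p;
}

/* Release everything at once; callers must not free() these pointers. */
static size_t owner_release(struct owner *o)
{
	size_t freed = o->count;

	while (o->count)
		free(o->managed[--o->count]);
	return freed;
}
```

If a caller freed one of these pointers by hand, owner_release() would free it a second time, which is the class of bug the commit removes.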

Fixes: 67979060740f ("drm/xe/hw_engine_group: Fix potential leak")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250724193854.1124510-2-shuicheng.lin@intel.com
(cherry picked from commit f98de826b418885a21ece67f0f5b921ae759b7bf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agortla/tests: Test timerlat -P option using actions
Tomas Glozar [Fri, 25 Jul 2025 13:38:17 +0000 (15:38 +0200)] 
rtla/tests: Test timerlat -P option using actions

The -P option is used to set priority of osnoise and timerlat threads.

Extend the test for -P with --on-threshold calling a script that looks
for running timerlat threads and checks if their priority is set
correctly.

As --on-threshold is only supported by timerlat at the moment, this is
only implemented there so far.

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250725133817.59237-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 weeks agodrm/xe/guc: Clear whole g2h_fence during initialization
Michal Wajdeczko [Wed, 23 Jul 2025 17:56:39 +0000 (19:56 +0200)] 
drm/xe/guc: Clear whole g2h_fence during initialization

The struct g2h_fence must be explicitly initialized using the
g2h_fence_init() function to avoid garbage values in its members,
but we failed to update this helper function for the new member.

To fix that and avoid any future mistakes, memset the whole struct
first, then update remaining non-zero members.
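
The pattern can be sketched as follows. The struct layout, the member names and the sentinel value are hypothetical and chosen for illustration; they are not taken from the xe driver:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fence struct; member names are illustrative only. */
struct fence {
	unsigned int *response_buffer;
	unsigned short seqno;
	int cancel;	/* imagine this member was added later */
	int done;
};

#define FENCE_SEQNO_UNASSIGNED 0xFFFFu	/* hypothetical sentinel */

static void fence_init(struct fence *f, unsigned int *response_buffer)
{
	assert(f);
	memset(f, 0, sizeof(*f));		/* covers present and future members */
	f->response_buffer = response_buffer;	/* then set only the non-zero ones */
	f->seqno = FENCE_SEQNO_UNASSIGNED;
}
```

With this shape, adding a member to the struct needs no change to the init helper unless its initial value is non-zero.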

Fixes: 94de94d24ea8 ("drm/xe/guc: Cancel ongoing H2G requests when stopping CT")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250723175639.206875-1-michal.wajdeczko@intel.com
(cherry picked from commit 159afd92bae8153bdd8d8b34aea0d463fe19c978)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agortla/tests: Add grep checks for base test cases
Tomas Glozar [Fri, 25 Jul 2025 13:38:16 +0000 (15:38 +0200)] 
rtla/tests: Add grep checks for base test cases

Checking for patterns in rtla output with grep was added to test rtla
actions. Add grep checks also for base tests where applicable.

Also fix trace event histogram trigger check to use the correct syntax
for the command-line option so that the test passes with the grep check.

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250725133817.59237-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 weeks agodrm/xe/vf: Don't register I2C devices if VF
Lukasz Laguna [Thu, 17 Jul 2025 15:54:20 +0000 (17:54 +0200)] 
drm/xe/vf: Don't register I2C devices if VF

VF drivers can't access I2C devices, so skip their registration when
running as VF.

Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Fixes: f0e53aadd702 ("drm/xe: Support for I2C attached MCUs")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250717155420.25298-1-lukasz.laguna@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 9a220e065914b67b55d3d0ab91c3e215742fdd73)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agodrm/xe/uc: Fix missing unwind goto
Zhanjun Dong [Mon, 21 Jul 2025 21:45:20 +0000 (17:45 -0400)] 
drm/xe/uc: Fix missing unwind goto

Fix missing unwind goto on error handling.

Fixes: b2c4ac219fa4 ("drm/xe/uc: Disable GuC communication on hardware initialization error")
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250721214520.954014-1-zhanjun.dong@intel.com
(cherry picked from commit 176f44a5ec0b074aaf44852db77d0c183c36696d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agodrm/xe: Fix a NULL vs IS_ERR() bug in xe_i2c_register_adapter()
Dan Carpenter [Tue, 15 Jul 2025 22:59:44 +0000 (17:59 -0500)] 
drm/xe: Fix a NULL vs IS_ERR() bug in xe_i2c_register_adapter()

The fwnode_create_software_node() function returns error pointers.  It
never returns NULL.  Update the checks to match.
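
Why a NULL check misses the failure can be shown with simplified userspace versions of the error-pointer helpers. create_node() and register_node() are hypothetical and only mimic a function that reports failure through an error pointer:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical creator that reports failure as an error pointer, never NULL. */
static void *create_node(int fail)
{
	static int node;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&node;
}

/* Correct check: a NULL test would let ERR_PTR(-ENOMEM) through as valid. */
static long register_node(int fail)
{
	void *n = create_node(fail);

	if (IS_ERR(n))
		return PTR_ERR(n);
	return 0;
}
```

An error pointer is a non-NULL value near the top of the address space, so `if (!n)` never fires for it and the caller would go on to dereference it.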

Fixes: f0e53aadd702 ("drm/xe: Support for I2C attached MCUs")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/65825d00-81ab-4665-af51-4fff6786a250@sabinyo.mountain
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 2f264d58cc805a3cefc6b98097f90fbc388136ef)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agodrm/xe/oa: Fix static checker warning about null gt
Ashutosh Dixit [Tue, 15 Jul 2025 18:14:22 +0000 (11:14 -0700)] 
drm/xe/oa: Fix static checker warning about null gt

There is a static checker warning that gt returned by xe_device_get_gt can
be NULL and that is being dereferenced. Use xe_root_mmio_gt instead, which
is equivalent and cannot return a NULL gt 0.

Fixes: 10d42ef34bce ("drm/xe/oa: Assign hwe for OAM_SAG")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/r/20250715181422.2807624-1-ashutosh.dixit@intel.com
(cherry picked from commit 308dc9b27874d0e8a0258869b9e681b0fdd2e579)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agodrm/xe: Don't fail probe on unsupported mailbox command
Raag Jadav [Mon, 14 Jul 2025 21:55:03 +0000 (03:25 +0530)] 
drm/xe: Don't fail probe on unsupported mailbox command

If the device is running older pcode firmware, it is possible that newer
mailbox commands are not supported by it. The sysfs attributes aren't
useful in that case, but we shouldn't fail driver probe because of it.
As of now, it is unknown if we can distinguish unsupported commands before
attempting them. But until we figure out a way to do that, fix the
regressions.

v2: Add debug message (Lucas)

Fixes: cdc36b66cd41 ("drm/xe: Expose fan control and voltage regulator version")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Tested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://lore.kernel.org/r/20250714215503.2897748-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit ed5461daa150b037e36b8202381da1ef85d6b16b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
3 weeks agoMerge tag 'asoc-v6.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Takashi Iwai [Mon, 28 Jul 2025 12:28:21 +0000 (14:28 +0200)] 
Merge tag 'asoc-v6.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: More updates for v6.17

A few more updates, mostly fixes and device IDs plus some small
enhancements for the FSL xcvr driver.

3 weeks agomtd: map: Don't use "proxy" headers
Andy Shevchenko [Thu, 26 Jun 2025 16:08:12 +0000 (19:08 +0300)] 
mtd: map: Don't use "proxy" headers

Update header inclusions to follow IWYU (Include What You Use)
principle.

Note that including kernel.h is discouraged, as stated at the top of
that file.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
3 weeks agowatchdog: sbsa: Adjust keepalive timeout to avoid MediaTek WS0 race condition
Aaron Plattner [Mon, 21 Jul 2025 23:06:39 +0000 (16:06 -0700)] 
watchdog: sbsa: Adjust keepalive timeout to avoid MediaTek WS0 race condition

The MediaTek implementation of the sbsa_gwdt watchdog has a race
condition where a write to SBSA_GWDT_WRR is ignored if it occurs while
the hardware is processing a timeout refresh that asserts WS0.

Detect this based on the hardware implementer and adjust
wdd->min_hw_heartbeat_ms to avoid the race by forcing the keepalive ping
to be one second later.

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Acked-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250721230640.2244915-1-aplattner@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
3 weeks agoALSA: scarlett2: Add retry on -EPROTO from scarlett2_usb_tx()
Geoffrey D. Bennett [Mon, 28 Jul 2025 09:30:35 +0000 (19:00 +0930)] 
ALSA: scarlett2: Add retry on -EPROTO from scarlett2_usb_tx()

During communication with Focusrite Scarlett Gen 2/3/4 USB audio
interfaces, -EPROTO is sometimes returned from scarlett2_usb_tx()
(via snd_usb_ctl_msg()), which can cause initialisation and control
operations to fail intermittently.

This patch adds up to 5 retries in scarlett2_usb(), with a delay
starting at 5ms and doubling each time. This follows the same approach
as the fix for usb_set_interface() in endpoint.c (commit f406005e162b
("ALSA: usb-audio: Add retry on -EPROTO from usb_set_interface()")),
which resolved similar -EPROTO issues during device initialisation,
and is the same approach as in fcp.c:fcp_usb().
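
A minimal userspace sketch of the retry policy described above, for
illustration only. The names tx_with_retry(), flaky_tx() and
record_delay() are hypothetical, and the delay is passed in as a
callback so the example does not depend on any particular sleep API;
this is not the scarlett2_usb() code.

```c
#include <errno.h>

#define RETRY_MAX      5	/* retries after the first attempt */
#define RETRY_DELAY_MS 5	/* first delay; doubled after each retry */

typedef int (*tx_fn)(void *ctx);		/* returns 0 or a negative errno */
typedef void (*delay_fn)(unsigned int ms);	/* waits for ms milliseconds */

static int tx_with_retry(tx_fn tx, void *ctx, delay_fn delay_ms)
{
	unsigned int delay = RETRY_DELAY_MS;
	int retries = 0;

	for (;;) {
		int err = tx(ctx);

		/* Only -EPROTO is retried, and only RETRY_MAX times. */
		if (err != -EPROTO || retries >= RETRY_MAX)
			return err;
		delay_ms(delay);
		delay *= 2;
		retries++;
	}
}

/*
 * Test doubles: a transfer that fails a set number of times, and a
 * delay function that records the requested time instead of sleeping.
 */
static unsigned int total_delay_ms;

static void record_delay(unsigned int ms)
{
	total_delay_ms += ms;
}

static int flaky_tx(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EPROTO;
	}
	return 0;
}
```

With three initial failures the call succeeds after delays of 5, 10 and
20 ms; with persistent failures it gives up after 5 + 10 + 20 + 40 + 80
= 155 ms of accumulated delay.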

Fixes: 9e4d5c1be21f ("ALSA: usb-audio: Scarlett Gen 2 mixer interface")
Closes: https://github.com/geoffreybennett/linux-fcp/issues/41
Cc: stable@vger.kernel.org
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/aIdDO6ld50WQwNim@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: hda/realtek - Fix mute LED for HP Victus 16-r1xxx
Edip Hazuri [Fri, 25 Jul 2025 15:14:37 +0000 (18:14 +0300)] 
ALSA: hda/realtek - Fix mute LED for HP Victus 16-r1xxx

The mute LED on this laptop uses ALC245 but requires a quirk to work.
This patch enables the existing quirk for the device.

Tested on Victus 16-r1xxx Laptop. The LED behaviour works
as intended.

Cc: <stable@vger.kernel.org>
Signed-off-by: Edip Hazuri <edip@medip.dev>
Link: https://patch.msgid.link/20250725151436.51543-2-edip@medip.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoi2c: core: Fix double-free of fwnode in i2c_unregister_device()
Hans de Goede [Sat, 19 Jul 2025 18:01:04 +0000 (20:01 +0200)] 
i2c: core: Fix double-free of fwnode in i2c_unregister_device()

Before commit df6d7277e552 ("i2c: core: Do not dereference fwnode in struct
device"), i2c_unregister_device() only called fwnode_handle_put() on
of_node-s in the form of calling of_node_put(client->dev.of_node).

But after this commit the i2c_client's fwnode now unconditionally gets
fwnode_handle_put() on it.

When the i2c_client has no primary (ACPI / OF) fwnode but it does have
a software fwnode, the software-node will be the primary node and
fwnode_handle_put() will put() it.

But for the software fwnode device_remove_software_node() will also put()
it leading to a double free:

[   82.665598] ------------[ cut here ]------------
[   82.665609] refcount_t: underflow; use-after-free.
[   82.665808] WARNING: CPU: 3 PID: 1502 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x11
...
[   82.666830] RIP: 0010:refcount_warn_saturate+0xba/0x110
...
[   82.666962]  <TASK>
[   82.666971]  i2c_unregister_device+0x60/0x90

Fix this by not calling fwnode_handle_put() when the primary fwnode is
a software-node.
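
The ownership rule can be sketched with a toy reference count. This is
an illustration under simplifying assumptions: struct node, node_put(),
remove_software_node() and unregister_device() are hypothetical
stand-ins, not the i2c core or fwnode API.

```c
#include <stdbool.h>

struct node {
	int refcount;
	bool is_software;	/* reference is owned by the software-node layer */
};

static void node_put(struct node *n)
{
	n->refcount--;
}

/* Stand-in for the software-node teardown, which drops its own reference. */
static void remove_software_node(struct node *n)
{
	if (n->is_software)
		node_put(n);
}

static void unregister_device(struct node *n)
{
	/* Skip the generic put for software nodes to avoid a second put. */
	if (!n->is_software)
		node_put(n);
	remove_software_node(n);
}
```

Either kind of node ends with exactly one put; without the is_software
check the software node would be put twice and its count would go
negative.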

Fixes: df6d7277e552 ("i2c: core: Do not dereference fwnode in struct device")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hansg@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
3 weeks agoMerge tag 'i2c-host-6.17-pt1' of git://git.kernel.org/pub/scm/linux/kernel/git/andi...
Wolfram Sang [Mon, 28 Jul 2025 08:24:40 +0000 (10:24 +0200)] 
Merge tag 'i2c-host-6.17-pt1' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow

i2c-host for v6.17, part 1

Cleanups and refactorings:
- lpi2c, riic, st, stm32f7: general improvements
- riic: support more flexible IRQ configurations
- tegra: fix documentation

Improvements:
- lpi2c: improve register polling and add atomic transfer
- imx: use guarded spinlocks

New hardware support:
- Samsung Exynos 2200
- Renesas RZ/T2H (R9A09G077), RZ/N2H (R9A09G087)

DT binding:
- rk3x: enable power domains
- nxp: support clock property

3 weeks agoMIPS: Don't use %pK through printk
Thomas Weißschuh [Fri, 18 Jul 2025 13:18:24 +0000 (15:18 +0200)] 
MIPS: Don't use %pK through printk

Restricted pointers ("%pK") are not meant to be used through printk().
It can unintentionally expose security sensitive, raw pointer values.

Use regular pointer formatting instead.

Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
3 weeks agoMIPS: Update Joshua Kinard's e-mail address
Joshua Kinard [Mon, 21 Jul 2025 16:57:15 +0000 (12:57 -0400)] 
MIPS: Update Joshua Kinard's e-mail address

I am switching my address to a personal domain, so some of the SGI IP30
and IOC3 files need to be updated.  I will send updates for the
MAINTAINERS file and rtc-ds1685 separately to linux-rtc.

Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
3 weeks agoMIPS: mobileye: dts: eyeq5,eyeq6h: rename the emmc controller
Benoît Monin [Tue, 22 Jul 2025 15:15:20 +0000 (17:15 +0200)] 
MIPS: mobileye: dts: eyeq5,eyeq6h: rename the emmc controller

The name should match the pattern defined in the mmc-controller binding.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507220336.JhvVLL7k-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202507220215.wVoUMK5B-lkp@intel.com/
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
3 weeks agoMIPS: alchemy: gpio: use new GPIO line value setter callbacks for the remaining chips
Bartosz Golaszewski [Sun, 27 Jul 2025 08:24:42 +0000 (10:24 +0200)] 
MIPS: alchemy: gpio: use new GPIO line value setter callbacks for the remaining chips

The previous commit missed two other places that need converting; this
only came out in tests on autobuilders now. Convert the rest of the driver.

Fixes: 68bdc4dc1130 ("MIPS: alchemy: gpio: use new line value setter callbacks")
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lore.kernel.org/r/20250727082442.13182-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
3 weeks agoMerge tag 'v6.16' into x86/cpu, to resolve conflict
Ingo Molnar [Mon, 28 Jul 2025 05:12:53 +0000 (07:12 +0200)] 
Merge tag 'v6.16' into x86/cpu, to resolve conflict

Resolve overlapping context conflict between this upstream fix:

  d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")

And this pending commit in tip:x86/cpu:

  65f55a301766 ("x86/CPU/AMD: Add CPUID faulting support")

  Conflicts:
        arch/x86/kernel/cpu/amd.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
3 weeks agopowerpc64/bpf: Add jit support for load_acquire and store_release
Puranjay Mohan [Thu, 17 Jul 2025 20:29:17 +0000 (20:29 +0000)] 
powerpc64/bpf: Add jit support for load_acquire and store_release

Add JIT support for the load_acquire and store_release instructions. The
implementation is similar to the kernel where:

        load_acquire  => plain load -> lwsync
        store_release => lwsync -> plain store
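
For readers less familiar with these semantics, a C11 analogue is
sketched below. It only illustrates the ordering contract of an acquire
load and a release store; publish() and try_consume() are hypothetical
names, and how a compiler lowers these operations on powerpc64 is not
shown here and may differ from the JIT's lwsync sequences.

```c
#include <stdatomic.h>
#include <stdint.h>

static uint64_t payload;	/* plain data published by the writer */
static atomic_int ready;	/* flag guarding the payload */

/* Writer: the release store orders the payload write before the flag. */
static void publish(uint64_t value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: the acquire load orders the flag read before the payload read. */
static int try_consume(uint64_t *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return 0;
	*out = payload;
	return 1;
}
```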

To test the correctness of the implementation, the following selftests
were run:

  [fedora@linux-kernel bpf]$ sudo ./test_progs -a \
  verifier_load_acquire,verifier_store_release,atomics
  #11/1    atomics/add:OK
  #11/2    atomics/sub:OK
  #11/3    atomics/and:OK
  #11/4    atomics/or:OK
  #11/5    atomics/xor:OK
  #11/6    atomics/cmpxchg:OK
  #11/7    atomics/xchg:OK
  #11      atomics:OK
  #519/1   verifier_load_acquire/load-acquire, 8-bit:OK
  #519/2   verifier_load_acquire/load-acquire, 8-bit @unpriv:OK
  #519/3   verifier_load_acquire/load-acquire, 16-bit:OK
  #519/4   verifier_load_acquire/load-acquire, 16-bit @unpriv:OK
  #519/5   verifier_load_acquire/load-acquire, 32-bit:OK
  #519/6   verifier_load_acquire/load-acquire, 32-bit @unpriv:OK
  #519/7   verifier_load_acquire/load-acquire, 64-bit:OK
  #519/8   verifier_load_acquire/load-acquire, 64-bit @unpriv:OK
  #519/9   verifier_load_acquire/load-acquire with uninitialized
  src_reg:OK
  #519/10  verifier_load_acquire/load-acquire with uninitialized src_reg
  @unpriv:OK
  #519/11  verifier_load_acquire/load-acquire with non-pointer src_reg:OK
  #519/12  verifier_load_acquire/load-acquire with non-pointer src_reg
  @unpriv:OK
  #519/13  verifier_load_acquire/misaligned load-acquire:OK
  #519/14  verifier_load_acquire/misaligned load-acquire @unpriv:OK
  #519/15  verifier_load_acquire/load-acquire from ctx pointer:OK
  #519/16  verifier_load_acquire/load-acquire from ctx pointer @unpriv:OK
  #519/17  verifier_load_acquire/load-acquire with invalid register R15:OK
  #519/18  verifier_load_acquire/load-acquire with invalid register R15
  @unpriv:OK
  #519/19  verifier_load_acquire/load-acquire from pkt pointer:OK
  #519/20  verifier_load_acquire/load-acquire from flow_keys pointer:OK
  #519/21  verifier_load_acquire/load-acquire from sock pointer:OK
  #519     verifier_load_acquire:OK
  #556/1   verifier_store_release/store-release, 8-bit:OK
  #556/2   verifier_store_release/store-release, 8-bit @unpriv:OK
  #556/3   verifier_store_release/store-release, 16-bit:OK
  #556/4   verifier_store_release/store-release, 16-bit @unpriv:OK
  #556/5   verifier_store_release/store-release, 32-bit:OK
  #556/6   verifier_store_release/store-release, 32-bit @unpriv:OK
  #556/7   verifier_store_release/store-release, 64-bit:OK
  #556/8   verifier_store_release/store-release, 64-bit @unpriv:OK
  #556/9   verifier_store_release/store-release with uninitialized
  src_reg:OK
  #556/10  verifier_store_release/store-release with uninitialized src_reg
  @unpriv:OK
  #556/11  verifier_store_release/store-release with uninitialized
  dst_reg:OK
  #556/12  verifier_store_release/store-release with uninitialized dst_reg
  @unpriv:OK
  #556/13  verifier_store_release/store-release with non-pointer
  dst_reg:OK
  #556/14  verifier_store_release/store-release with non-pointer dst_reg
  @unpriv:OK
  #556/15  verifier_store_release/misaligned store-release:OK
  #556/16  verifier_store_release/misaligned store-release @unpriv:OK
  #556/17  verifier_store_release/store-release to ctx pointer:OK
  #556/18  verifier_store_release/store-release to ctx pointer @unpriv:OK
  #556/19  verifier_store_release/store-release, leak pointer to stack:OK
  #556/20  verifier_store_release/store-release, leak pointer to stack
  @unpriv:OK
  #556/21  verifier_store_release/store-release, leak pointer to map:OK
  #556/22  verifier_store_release/store-release, leak pointer to map
  @unpriv:OK
  #556/23  verifier_store_release/store-release with invalid register
  R15:OK
  #556/24  verifier_store_release/store-release with invalid register R15
  @unpriv:OK
  #556/25  verifier_store_release/store-release to pkt pointer:OK
  #556/26  verifier_store_release/store-release to flow_keys pointer:OK
  #556/27  verifier_store_release/store-release to sock pointer:OK
  #556     verifier_store_release:OK
  Summary: 3/55 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250717202935.29018-2-puranjay@kernel.org
3 weeks agodocs: powerpc: add htm.rst to toctree
Vishal Parmar [Sun, 27 Jul 2025 11:01:45 +0000 (16:31 +0530)] 
docs: powerpc: add htm.rst to toctree

The file Documentation/arch/powerpc/htm.rst is not included in the
index.rst toctree. This results in a warning when building the docs:

  WARNING: document isn't included in any toctree: htm.rst

Add it to the index.rst file so that it is properly included in the
PowerPC documentation TOC.

Signed-off-by: Vishal Parmar <vishistriker@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250727110145.839906-1-vishistriker@gmail.com
3 weeks agodt-bindings: hwmon: Replace bouncing Alexandru Tachici emails
Krzysztof Kozlowski [Thu, 24 Jul 2025 11:37:36 +0000 (13:37 +0200)] 
dt-bindings: hwmon: Replace bouncing Alexandru Tachici emails

Emails to alexandru.tachici@analog.com bounce permanently:

  Remote Server returned '550 5.1.10 RESOLVER.ADR.RecipientNotFound; Recipient not found by SMTP address lookup'

so replace him with Cedric Encarnacion from Analog.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250724113735.59148-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
3 weeks agohwmon: (ina238) Add support for INA228
Jonas Rebmann [Fri, 18 Jul 2025 14:12:50 +0000 (16:12 +0200)] 
hwmon: (ina238) Add support for INA228

Add support for the Texas Instruments INA228 Ultra-Precise
Power/Energy/Charge Monitor.

The INA228 is very similar to the INA238 but offers four bits of extra
precision in the temperature, voltage and current measurement fields.
It also supports energy and charge monitoring, the latter of which is
not supported through this patch.

While it seems in the datasheet that some constants such as LSB values
differ between the 228 and the 238, they differ only for those registers
where four bits of precision have been added and they differ by a factor
of 16 (VBUS, VSHUNT, DIETEMP, CURRENT).

Therefore, the INA238 constants are still applicable with regard
to the bit of the same significance.
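
The factor-of-16 argument can be sketched as plain arithmetic. The LSB
value below is an illustrative number chosen for the example, not a
constant taken from this patch, and raw16_to_uv() and raw20_to_uv() are
hypothetical helpers.

```c
#include <stdint.h>

/* Illustrative LSB of a 16-bit reading, in microvolts. */
#define LSB16_UV 3125

/* Convert a 16-bit raw reading to microvolts. */
static int64_t raw16_to_uv(int32_t raw16)
{
	return (int64_t)raw16 * LSB16_UV;
}

/*
 * Convert a 20-bit raw reading to microvolts with the same constant.
 * The four extra bits sit below the 16-bit LSB, so divide by 16.
 */
static int64_t raw20_to_uv(int32_t raw20)
{
	return (int64_t)raw20 * LSB16_UV / 16;
}
```

A 16-bit reading of 100 and a 20-bit reading of 100 << 4 convert to the
same voltage, so one set of constants serves both register widths.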

Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Link: https://lore.kernel.org/r/20250718-ina228-v2-3-227feb62f709@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
3 weeks agodt-bindings: Add INA228 to ina2xx devicetree bindings
Jonas Rebmann [Fri, 18 Jul 2025 14:12:49 +0000 (16:12 +0200)] 
dt-bindings: Add INA228 to ina2xx devicetree bindings

Add the ina228 to ina2xx bindings.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Link: https://lore.kernel.org/r/20250718-ina228-v2-2-227feb62f709@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
3 weeks agohwmon: (ina238) Fix inconsistent whitespace
Jonas Rebmann [Fri, 18 Jul 2025 14:12:48 +0000 (16:12 +0200)] 
hwmon: (ina238) Fix inconsistent whitespace

Some purely cosmetic changes in ina238.c:

 - When aligning definitions, do so consistently with tab stop of 8.
 - Use spaces instead of tabs around operators.
 - Align wrapped lines.

Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Link: https://lore.kernel.org/r/20250718-ina228-v2-1-227feb62f709@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
3 weeks agocifs: Add support for creating reparse points over SMB1
Pali Rohár [Wed, 25 Dec 2024 23:43:22 +0000 (00:43 +0100)] 
cifs: Add support for creating reparse points over SMB1

SMB1 already supports querying reparse points and detecting types of
symlink, fifo, socket, block and char.

This change implements the missing part - the ability to create new reparse
points over SMB1. This includes everything which SMB2+ already supports:
- native SMB symlinks and sockets
- NFS style of special files (symlinks, fifos, sockets, char/block devs)
- WSL style of special files (symlinks, fifos, sockets, char/block devs)

Attaching a reparse point to an existing file or directory is done via
SMB1 SMB_COM_NT_TRANSACT/NT_TRANSACT_IOCTL/FSCTL_SET_REPARSE_POINT command
and implemented in a new cifs_create_reparse_inode() function.

This change introduces a new callback ->create_reparse_inode() which creates
a new reparse point file or directory and returns its inode. For SMB1 it is
provided via the new cifs_create_reparse_inode() function.

The existing reparse.c code was only slightly updated to call the new protocol
callback ->create_reparse_inode() instead of the hardcoded SMB2+ function.
This makes the whole reparse.c code work with every SMB dialect.

The original callback ->create_reparse_symlink() is not needed anymore, as
the implementation of the new create_reparse_symlink() function is dialect
agnostic too. So the link.c code was updated to call that function directly
(and not via a callback).

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agocifs: Do not query WSL EAs for native SMB symlink
Pali Rohár [Thu, 30 Jan 2025 21:33:27 +0000 (22:33 +0100)] 
cifs: Do not query WSL EAs for native SMB symlink

WSL EAs are not required for native SMB symlinks, so do not query them from server.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agocifs: Optimize CIFSFindFirst() response when not searching
Pali Rohár [Mon, 30 Dec 2024 19:55:53 +0000 (20:55 +0100)] 
cifs: Optimize CIFSFindFirst() response when not searching

When not searching for child entries with an msearch wildcard pattern, ask
the server for just one output entry. There is no need to ask for more
entries, since a path query is interested in only one search result.

CIFSFindFirst() with msearch=false is called by the cifs_query_path_info()
function.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agocifs: Fix calling CIFSFindFirst() for root path without msearch
Pali Rohár [Mon, 30 Dec 2024 19:54:11 +0000 (20:54 +0100)] 
cifs: Fix calling CIFSFindFirst() for root path without msearch

To query the root path (without an msearch wildcard), the pattern '\' must
be sent instead of '' (empty string).

This allows CIFSFindFirst() to be used to query information about the root
path, which is used in follow-up changes.

This change fixes the stat() syscall when called on the root path of the
mount: stat() uses the cifs_query_path_info() function, which can fall back
to CIFSFindFirst() with msearch=false.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: fix session setup against servers that require SPN
Paulo Alcantara [Fri, 25 Jul 2025 03:04:44 +0000 (00:04 -0300)] 
smb: client: fix session setup against servers that require SPN

Some servers might enforce the SPN to be set in the target info
blob (AV pairs) when sending NTLMSSP_AUTH message.  In Windows Server,
this could be enforced with SmbServerNameHardeningLevel set to 2.

Fix this by always appending SPN (cifs/<hostname>) to the existing
list of target infos when setting up NTLMv2 response blob.

Cc: linux-cifs@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-by: Pierguido Lambri <plambri@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: allow parsing zero-length AV pairs
Paulo Alcantara [Fri, 25 Jul 2025 03:04:43 +0000 (00:04 -0300)] 
smb: client: allow parsing zero-length AV pairs

Zero-length AV pairs should be considered as valid target infos.
Don't skip the next AV pairs that follow them.
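
A minimal sketch of a parser that treats zero-length pairs as valid,
assuming the usual AV pair layout of a 16-bit little-endian id, a
16-bit little-endian length and then the value, with id 0 ending the
list. find_av_pair() and get_le16() are hypothetical helpers, not the
client's actual parsing code.

```c
#include <stddef.h>
#include <stdint.h>

#define AV_EOL 0	/* terminates the list */

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Find the first AV pair with the given id.  Returns a pointer to its
 * value (which may be zero-length) and stores the length in *out_len,
 * or returns NULL if the id is absent or the buffer is malformed.
 */
static const uint8_t *find_av_pair(const uint8_t *buf, size_t len,
				   uint16_t id, uint16_t *out_len)
{
	size_t off = 0;

	while (len - off >= 4) {
		uint16_t av_id = get_le16(buf + off);
		uint16_t av_len = get_le16(buf + off + 2);

		off += 4;
		if (av_id == AV_EOL)
			break;
		if (av_len > len - off)
			return NULL;	/* value runs past the buffer */
		if (av_id == id) {
			*out_len = av_len;
			return buf + off;
		}
		/* A zero-length value is valid: just step to the next header. */
		off += av_len;
	}
	return NULL;
}
```

The scan stops only at the terminator or at a malformed length, so a
pair that follows a zero-length one is still found.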

Cc: linux-cifs@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Fixes: 0e8ae9b953bc ("smb: client: parse av pair type 4 in CHALLENGE_MESSAGE")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agocifs: add new field to track the last access time of cfid
Shyam Prasad N [Fri, 25 Jul 2025 03:23:53 +0000 (22:23 -0500)] 
cifs: add new field to track the last access time of cfid

The handlecache code today tracks the time at which the dir lease was
acquired, and the laundromat thread uses that to check for old
entries to clean up.

However, if a directory is actively accessed, it should not
be chosen to expire first.

This change adds a new last_access_time field to cfid and
uses that to decide expiry of the cfid.
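
The idea can be sketched as follows, for illustration only. struct
cached_dir, cached_dir_touch() and cached_dir_expired() are hypothetical
names using time_t values, whereas the real code lives in the cifs
handle cache and is not reproduced here.

```c
#include <stdbool.h>
#include <time.h>

struct cached_dir {
	time_t lease_time;		/* when the lease was acquired */
	time_t last_access_time;	/* refreshed on every use */
};

/* Record a use of the cached directory. */
static void cached_dir_touch(struct cached_dir *cfid, time_t now)
{
	cfid->last_access_time = now;
}

/* Expire based on idle time rather than on lease age. */
static bool cached_dir_expired(const struct cached_dir *cfid, time_t now,
			       time_t timeout)
{
	return now - cfid->last_access_time > timeout;
}
```

An entry that is touched regularly never looks idle, however old its
lease is.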

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: change return type of cached_dir_lease_break() to bool
Bharath SM [Mon, 30 Jun 2025 18:49:32 +0000 (00:19 +0530)] 
smb: change return type of cached_dir_lease_break() to bool

cached_dir_lease_break() has a return type of int but only ever
returns true or false. Change the return type of this function
to bool for clarity.

Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agocifs: reset iface weights when we cannot find a candidate
Shyam Prasad N [Thu, 17 Jul 2025 12:06:13 +0000 (17:36 +0530)] 
cifs: reset iface weights when we cannot find a candidate

We now do a weighted selection of server interfaces when allocating
new channels. The weights are decided based on the speed advertised.
The fulfilled weight for an interface is a counter that is used to
track the interface selection. It should be reset back to zero once
all interfaces have fulfilled their weight.

In cifs_chan_update_iface, this reset logic was missing. As a result
when the server interface list changes, the client may not be able
to find a new candidate for other channels after all interfaces have
been fulfilled.
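
A simplified sketch of weighted selection with the reset step, for
illustration only. struct iface and pick_iface() are hypothetical and
much simpler than cifs_chan_update_iface(), which also has to deal with
interface state and locking.

```c
#include <stddef.h>

struct iface {
	unsigned int weight;	/* derived from the advertised speed */
	unsigned int fulfilled;	/* channels assigned in this round */
};

/*
 * Pick the next interface with spare weight.  If every interface has
 * fulfilled its weight, reset the counters and start a new round.
 * Returns the chosen index, or -1 if no interface has any weight.
 */
static int pick_iface(struct iface *ifaces, size_t n)
{
	for (int pass = 0; pass < 2; pass++) {
		for (size_t i = 0; i < n; i++) {
			if (ifaces[i].fulfilled < ifaces[i].weight) {
				ifaces[i].fulfilled++;
				return (int)i;
			}
		}
		/* No candidate: reset the fulfilled counters and retry once. */
		for (size_t i = 0; i < n; i++)
			ifaces[i].fulfilled = 0;
	}
	return -1;
}
```

Without the reset loop, the fourth pick over weights {2, 1} would find
no candidate even though both interfaces are usable.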

Fixes: a6d8fb54a515 ("cifs: distribute channels across interfaces based on speed")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agosmb: client: fix netns refcount leak after net_passive changes
Wang Zhaolong [Thu, 17 Jul 2025 13:29:26 +0000 (21:29 +0800)] 
smb: client: fix netns refcount leak after net_passive changes

After commit 5c70eb5c593d ("net: better track kernel sockets lifetime"),
kernel sockets now use net_passive reference counting. However, commit
95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"")
restored the manual socket refcount manipulation without adapting to this
new mechanism, causing a memory leak.

The issue can be reproduced by [1]:
1. Creating a network namespace
2. Mounting and Unmounting CIFS within the namespace
3. Deleting the namespace

Some memory leaks may appear after a period of time following step 3.

unreferenced object 0xffff9951419f6b00 (size 256):
  comm "ip", pid 447, jiffies 4294692389 (age 14.730s)
  hex dump (first 32 bytes):
    1b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 80 77 c2 44 51 99 ff ff  .........w.DQ...
  backtrace:
    __kmem_cache_alloc_node+0x30e/0x3d0
    __kmalloc+0x52/0x120
    net_alloc_generic+0x1d/0x30
    copy_net_ns+0x86/0x200
    create_new_namespaces+0x117/0x300
    unshare_nsproxy_namespaces+0x60/0xa0
    ksys_unshare+0x148/0x360
    __x64_sys_unshare+0x12/0x20
    do_syscall_64+0x59/0x110
    entry_SYSCALL_64_after_hwframe+0x78/0xe2
...
unreferenced object 0xffff9951442e7500 (size 32):
  comm "mount.cifs", pid 475, jiffies 4294693782 (age 13.343s)
  hex dump (first 32 bytes):
    40 c5 38 46 51 99 ff ff 18 01 96 42 51 99 ff ff  @.8FQ......BQ...
    01 00 00 00 6f 00 c5 07 6f 00 d8 07 00 00 00 00  ....o...o.......
  backtrace:
    __kmem_cache_alloc_node+0x30e/0x3d0
    kmalloc_trace+0x2a/0x90
    ref_tracker_alloc+0x8e/0x1d0
    sk_alloc+0x18c/0x1c0
    inet_create+0xf1/0x370
    __sock_create+0xd7/0x1e0
    generic_ip_connect+0x1d4/0x5a0 [cifs]
    cifs_get_tcp_session+0x5d0/0x8a0 [cifs]
    cifs_mount_get_session+0x47/0x1b0 [cifs]
    dfs_mount_share+0xfa/0xa10 [cifs]
    cifs_mount+0x68/0x2b0 [cifs]
    cifs_smb3_do_mount+0x10b/0x760 [cifs]
    smb3_get_tree+0x112/0x2e0 [cifs]
    vfs_get_tree+0x29/0xf0
    path_mount+0x2d4/0xa00
    __se_sys_mount+0x165/0x1d0

Root cause:
When creating kernel sockets, sk_alloc() calls net_passive_inc() for
sockets with sk_net_refcnt=0. The CIFS code manually converts kernel
sockets to user sockets by setting sk_net_refcnt=1, but doesn't call
the corresponding net_passive_dec(). This creates an imbalance in the
net_passive counter, which prevents the network namespace from being
destroyed when its last user reference is dropped. As a result, the
entire namespace and all its associated resources remain allocated.

Timeline of patches leading to this issue:
- commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
  namespace.") in v6.12 fixed the original netns UAF by manually
  managing socket refcounts
- commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after
  rmmod") in v6.13 attempted to use kernel sockets but introduced
  TCP timer issues
- commit 5c70eb5c593d ("net: better track kernel sockets lifetime")
  in v6.14-rc5 introduced the net_passive mechanism with
  sk_net_refcnt_upgrade() for proper socket conversion
- commit 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock
  after rmmod"") in v6.15-rc3 reverted to manual refcount management
  without adapting to the new net_passive changes

Fix this by using sk_net_refcnt_upgrade() which properly handles the
net_passive counter when converting kernel sockets to user sockets.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220343
Fixes: 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"")
Cc: stable@vger.kernel.org
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 weeks agoLinux 6.16 v6.16
Linus Torvalds [Sun, 27 Jul 2025 21:26:38 +0000 (14:26 -0700)] 
Linux 6.16

3 weeks agofbcon: Use 'bool' where appropriate
Ville Syrjälä [Mon, 23 Sep 2024 20:50:16 +0000 (23:50 +0300)] 
fbcon: Use 'bool' where appropriate

Use 'bool' type where it makes more sense than 'int'.

v2: Rebase due to corrected 'fbcon_cursor_blink' initial value

Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbcon: Introduce get_{fg,bg}_color()
Ville Syrjälä [Mon, 23 Sep 2024 15:57:48 +0000 (18:57 +0300)] 
fbcon: Introduce get_{fg,bg}_color()

Make the code more legible by adding get_{fg,bg}_color()
which hide the obscure 'is_fg' parameter of get_color()
from the caller.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbcon: fbcon_is_inactive() -> fbcon_is_active()
Ville Syrjälä [Mon, 23 Sep 2024 15:57:47 +0000 (18:57 +0300)] 
fbcon: fbcon_is_inactive() -> fbcon_is_active()

Invert fbcon_is_inactive() into fbcon_is_active(). Much easier
on the poor brain when you don't have to do double negations
all over the place.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbcon: fbcon_cursor_noblink -> fbcon_cursor_blink
Ville Syrjälä [Mon, 23 Sep 2024 20:48:53 +0000 (23:48 +0300)] 
fbcon: fbcon_cursor_noblink -> fbcon_cursor_blink

Invert fbcon_cursor_noblink into fbcon_cursor_blink so that:
- it matches the sysfs attribute exactly
- avoids having to do these NOT operations all over the place
- use bool instead of int

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbdev: Fix typo in Kconfig text for FB_DEVICE
Daniel Palmer [Fri, 25 Jul 2025 05:30:57 +0000 (14:30 +0900)] 
fbdev: Fix typo in Kconfig text for FB_DEVICE

Seems like someone hit 'c' when they meant to hit 'd'.

Signed-off-by: Daniel Palmer <daniel.palmer@sony.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbdev: imxfb: Check fb_add_videomode to prevent null-ptr-deref
Chenyuan Yang [Thu, 24 Jul 2025 03:25:34 +0000 (22:25 -0500)] 
fbdev: imxfb: Check fb_add_videomode to prevent null-ptr-deref

fb_add_videomode() can fail with -ENOMEM when its internal kmalloc() cannot
allocate a struct fb_modelist.  If that happens, the modelist stays empty but
the driver continues to register.  Add a check for its return value to prevent
a potential null-ptr-deref, similar to commit 17186f1f90d3 ("fbdev:
Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var").

Fixes: 1b6c79361ba5 ("video: imxfb: Add DT support")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbdev: svgalib: Clean up coding style
Darshan R. [Mon, 21 Jul 2025 12:56:47 +0000 (12:56 +0000)] 
fbdev: svgalib: Clean up coding style

This patch addresses various coding style issues in `svgalib.c` to improve
readability and better align the code with the Linux kernel's formatting
standards.

The changes primarily consist of:
- Adjusting whitespace around operators and after keywords.
- Standardizing brace placement for control flow statements.
- Removing unnecessary braces on single-statement if/else blocks.
- Deleting extraneous blank lines throughout the file.

These changes are purely stylistic and introduce no functional modifications.

Signed-off-by: Darshan R. <rathod.darshan.0896@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbdev: kyro: Use devm_ioremap_wc() for screen mem
Giovanni Di Santi [Wed, 9 Jul 2025 09:53:54 +0000 (11:53 +0200)] 
fbdev: kyro: Use devm_ioremap_wc() for screen mem

Replace the manual pci_ioremap_wc() call for mapping screen memory with the
device-managed devm_ioremap_wc() variant.

This simplifies the driver's resource management by ensuring the memory is
automatically unmapped when the driver detaches from the device.

Signed-off-by: Giovanni Di Santi <giovanni.disanti.lkl@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 weeks agofbdev: kyro: Use devm_ioremap() for mmio registers
Giovanni Di Santi [Wed, 9 Jul 2025 09:53:53 +0000 (11:53 +0200)] 
fbdev: kyro: Use devm_ioremap() for mmio registers

Replace the manual ioremap() call for the MMIO registers with the
device-managed devm_ioremap() variant.

This simplifies the driver's resource management by ensuring the memory is
automatically unmapped when the driver detaches from the device.

Signed-off-by: Giovanni Di Santi <giovanni.disanti.lkl@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>