]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
12 days agoslimbus: qcom-ngd-ctrl: Initialize controller resources in controller
Bjorn Andersson [Sat, 30 May 2026 20:44:19 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: Initialize controller resources in controller

The work structs and work queue are controller resources, create and
destroy them in the controller context. Creating them as part of the
child device's probe path seems to be okay now that the controller's
probe has been updated, but if for some reason the child does not probe
successfully a SSR or PDR notification will schedule_work() on an
uninitialized "ngd_up_work".

Move the initialization of these controller resources to the controller
probe function to avoid any issues, and to clarify the ownership.

Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-7-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoslimbus: qcom-ngd-ctrl: Register callbacks after creating the ngd
Bjorn Andersson [Sat, 30 May 2026 20:44:18 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: Register callbacks after creating the ngd

When the remoteproc starts in parallel with the NGD driver being probed,
or the remoteproc is already up when the PDR lookup is being registered,
or in the theoretical event that we get an interrupt from the hardware,
these callbacks will operate on uninitialized data. This result in
issues to boot the affected boards.

One such example can be seen in the following fault, where
qcom_slim_ngd_ssr_pdr_notify() schedules work on the NULL ngd_up_work.

[   21.858578] ------------[ cut here ]------------
[   21.858745] WARNING: kernel/workqueue.c:2338 at __queue_work+0x5e0/0x790, CPU#2: kworker/2:2/116
...
[   21.859251] Call trace:
[   21.859255]  __queue_work+0x5e0/0x790 (P)
[   21.859265]  queue_work_on+0x6c/0xf0
[   21.859273]  qcom_slim_ngd_ssr_pdr_notify+0x110/0x150 [slim_qcom_ngd_ctrl]
[   21.859304]  qcom_slim_ngd_ssr_notify+0x24/0x40 [slim_qcom_ngd_ctrl]
[   21.859318]  notifier_call_chain+0xa4/0x230
[   21.859329]  srcu_notifier_call_chain+0x64/0xb8
[   21.859338]  ssr_notify_start+0x40/0x78 [qcom_common]
[   21.859355]  rproc_start+0x130/0x230
[   21.859367]  rproc_boot+0x3d4/0x518
...

Move the enablement of interrupts, and the registration of SSR and PDR
until after the NGD device has been registered.

This could be further refined by moving initialization to the control
driver probe and by removing the platform driver model from the picture.

Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: stable@vger.kernel.org
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-6-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoslimbus: qcom-ngd-ctrl: Correct PDR and SSR cleanup ownership
Bjorn Andersson [Sat, 30 May 2026 20:44:17 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: Correct PDR and SSR cleanup ownership

PDR and SSR callbacks are registred from the controller probe function,
but currently released from the child device's remove function.

The remove() function should only be unwinding what was done in the
same device's probe() function.

Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-5-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoslimbus: qcom-ngd-ctrl: Fix probe error path ordering
Bjorn Andersson [Sat, 30 May 2026 20:44:16 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: Fix probe error path ordering

qcom_slim_ngd_ctrl_probe() first registers the SSR callback then
allocates the PDR context, as such the error path needs to come in
opposite order to allow us to unroll each step.

Fixes: 16f14551d0df ("slimbus: qcom-ngd: cleanup in probe error path")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-4-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoslimbus: qcom-ngd-ctrl: Fix up platform_driver registration
Bjorn Andersson [Sat, 30 May 2026 20:44:15 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: Fix up platform_driver registration

Device drivers should not invoke platform_driver_register()/unregister()
in their probe and remove paths. They should further not rely on
platform_driver_unregister() as their only means of "deleting" their
child devices.

Introduce a helper to unregister the child device and move the
platform_driver_register()/unregister() to module_init()/exit().

Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoslimbus: qcom-ngd-ctrl: fix OF node refcount
Bartosz Golaszewski [Sat, 30 May 2026 20:44:14 +0000 (21:44 +0100)] 
slimbus: qcom-ngd-ctrl: fix OF node refcount

Platform devices created with platform_device_alloc() call
platform_device_release() when the last reference to the device's
kobject is dropped. This function calls of_node_put() unconditionally.
This works fine for devices created with platform_device_register_full()
but users of the split approach (platform_device_alloc() +
platform_device_add()) must bump the reference of the of_node they
assign manually. Add the missing call to of_node_get().

Cc: stable@vger.kernel.org
Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204421.116824-2-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agonvmem: core: fix use-after-free bugs in error paths
Bartosz Golaszewski [Sat, 30 May 2026 20:43:40 +0000 (21:43 +0100)] 
nvmem: core: fix use-after-free bugs in error paths

Fix several instances of error paths in which we call
__nvmem_device_put() - which may end up freeing the underlying memory
and other resources - and then keep on using the nvmem structure. Always
put the reference to the nvmem device as the last step before returning
the error code.

Cc: stable@vger.kernel.org
Fixes: 7ae6478b304b ("nvmem: core: rework nvmem cell instance creation")
Fixes: e888d445ac33 ("nvmem: resolve cells from DT at registration time")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204340.116743-3-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agonvmem: layouts: onie-tlv: fix hang on unknown types
Andre Heider [Sat, 30 May 2026 20:43:39 +0000 (21:43 +0100)] 
nvmem: layouts: onie-tlv: fix hang on unknown types

The EEPROM on my board has a vendor specific entry of type 0x41. When
stumbling upon that, this driver hangs in an endless loop.

Fix it by keep incrementing the offset on unknown entries, so the loop
will eventually stop.

Fixes: d3c0d12f6474 ("nvmem: layouts: onie-tlv: Add new layout driver")
Cc: Stable@vger.kernel.org
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://patch.msgid.link/20260530204340.116743-2-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
12 days agoMerge tag 'svc_fixes_for_v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Fri, 5 Jun 2026 15:17:30 +0000 (17:17 +0200)] 
Merge tag 'svc_fixes_for_v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into char-misc-linus

Dinh writes:

firmware: stratix10-svc and stratix10-rsu: fixes for v7.1
- Return -EOPNOTSUPP when ATF async is not supported
- Fix SVC driver from loading entirely when asynchronous ops is not
  supported in older ATF.
- Fix a NULL pointer dereference on a timeout in rsu_send_msg()

* tag 'svc_fixes_for_v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  firmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe
  firmware: stratix10-svc: Don't fail probe when async ops unsupported
  firmware: stratix10-svc: Return -EOPNOTSUPP when ATF async unsupported

12 days agoMerge tag 'usb-serial-7.1-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Fri, 5 Jun 2026 15:16:27 +0000 (17:16 +0200)] 
Merge tag 'usb-serial-7.1-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB serial fixes for 7.1-rc7

Here are two fixes for buffer overflows in the io_ti driver and a new
modem device id.

All have been in linux-next with no reported issues.

* tag 'usb-serial-7.1-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add usb-id for Dell Wireless DW5826e-m
  USB: serial: io_ti: fix heap overflow in build_i2c_fw_hdr()
  USB: serial: io_ti: fix heap overflow in get_manuf_info()

12 days agobpf: Reject registration of duplicated kfunc
Song Chen [Wed, 3 Jun 2026 09:19:10 +0000 (17:19 +0800)] 
bpf: Reject registration of duplicated kfunc

Search for duplicated kfunc in btf_vmlinux and btf_modules
before a kernel module attempts to register a kfunc.
If kfunc would shadow existing kfunc then pr_err() and
reject module loading.

Reviewed-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Song Chen <chensong_2000@126.com>
Link: https://lore.kernel.org/r/20260603091910.7212-1-chensong_2000@126.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agoMAINTAINERS: BPF: Add self as reviewer and run parse_maintainers.pl
Emil Tsalapatis [Thu, 4 Jun 2026 18:42:52 +0000 (14:42 -0400)] 
MAINTAINERS: BPF: Add self as reviewer and run parse_maintainers.pl

Add myself as a reviewer for the BPF subsystem. While at it, run
./scripts/parse_maintainers.pl --order and reorder the BPF-related
entries in the file accordingly.

Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260604184252.9917-1-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agoMerge branch 'bpf-introduce-resizable-hash-map'
Alexei Starovoitov [Fri, 5 Jun 2026 15:00:09 +0000 (08:00 -0700)] 
Merge branch 'bpf-introduce-resizable-hash-map'

Mykyta Yatsenko says:

====================
bpf: Introduce resizable hash map

This patch series introduces BPF_MAP_TYPE_RHASH, a new hash map type that
leverages the kernel's rhashtable to provide resizable hash map for BPF.

The existing BPF_MAP_TYPE_HASH uses a fixed number of buckets determined at
map creation time. While this works well for many use cases, it presents
challenges when:

1. The number of elements is unknown at creation time
2. The element count varies significantly during runtime
3. Memory efficiency is important (over-provisioning wastes memory,
 under-provisioning hurts performance)

BPF_MAP_TYPE_RHASH addresses these issues by using rhashtable, which
automatically grows and shrinks based on load factor.

The implementation wraps the kernel's rhashtable with BPF map operations:

- Uses bpf_mem_alloc for RCU-safe memory management
- Supports all standard map operations (lookup, update, delete, get_next_key)
- Supports batch operations (lookup_batch, lookup_and_delete_batch)
- Supports BPF iterators for traversal
- Supports BPF_F_LOCK for spin locks in values
- Requires BPF_F_NO_PREALLOC flag (elements allocated on demand)
- In-place updates for improved performance.
- max_entries serves as a hard limit, not bucket count
- Uses bit_spin_lock() + local_irq_save() for bucket locking,
similar to existing BPF hashmap's raw_spin_lock_irqsave(), insertions and
deletes may fail.
- Iterations are best-effort, if resize, insertions, deletions take place
concurrently, iterations may visit same elements multiple times or skip
elements.
- Lock out insertions, when running special fields destructor to guarantee
its completion.

The series includes comprehensive tests:
- Basic operations in test_maps (lookup, update, delete, get_next_key)
- BPF program tests for lookup/update/delete semantics
- Seq file tests

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
---
Update implementation
---------------------
Current implementation of the BPF_MAP_TYPE_RHASH does not provide
the same strong guarantees on the values consistency under concurrent
reads/writes as BPF_MAP_TYPE_HASH.
BPF_MAP_TYPE_HASH allocates a new element and atomically swaps the
pointer. BPF_MAP_TYPE_RHASH does memcpy in place with no lock held.
rhash trades consistency for speed, concurrent readers can observe
partially updated data. Two concurrent writers to the same key can
also interleave, producing mixed values. This is similar to arraymap
update implementation, including handling of the special fields.
As a solution, user may use BPF_F_LOCK to guarantee consistent reads
and write serialization.

Summary of the read consistency guarantees:

  map type     |  write mechanism |  read consistency
  -------------+------------------+--------------------------
  htab         |  alloc, swap ptr |  always consistent (RCU)
  htab  F_LOCK |  in-place + lock |  consistent if reader locks
  -------------+------------------+--------------------------
  rhtab        |  in-place memcpy |  torn reads
  rhtab F_LOCK |  in-place + lock |  consistent if reader locks

Benchmarks
----------
1. LOOKUP  (single producer, M events/sec)
  key | max | nr    |    htab |   rhtab | ratio | delta
  ----+-----+-------+---------+---------+-------+-------
    8 |  1K |   750 |   99.85 |   81.92 | 0.82x |  -18 %
    8 |  1K |    1K |  100.71 |   80.19 | 0.80x |  -20 %
    8 |  1M |  750K |   23.37 |   72.09 | 3.08x | +208 %
    8 |  1M |    1M |   13.39 |   53.72 | 4.01x | +301 %
   32 |  1K |   750 |   51.57 |   42.78 | 0.83x |  -17 %
   32 |  1K |    1K |   50.81 |   45.83 | 0.90x |  -10 %
   32 |  1M |  750K |   11.27 |   15.29 | 1.36x |  +36 %
   32 |  1M |    1M |    7.32 |    8.75 | 1.19x |  +19 %
  256 |  1K |   750 |    7.58 |    7.88 | 1.04x |   +4 %
  256 |  1K |    1K |    7.43 |    7.81 | 1.05x |   +5 %
  256 |  1M |  750K |    3.69 |    4.27 | 1.16x |  +16 %
  256 |  1M |    1M |    2.60 |    3.12 | 1.20x |  +20 %

Pattern:
  * Small map (1K): htab wins for 8 / 32 byte keys by 10-20 %
    because the preallocated bucket array fits in L1.  Equalises
    at 256 byte keys.
  * Large map (1M): rhtab wins everywhere, up to 4x at high load
    factor with 8 byte keys.
  * Higher load factor amplifies rhtab's lead: rhtab grows the
    bucket array; htab stays at user-declared max.

2. FULL UPDATE  (M events/sec per producer, -p 7)

  htab  per-producer:
    20.33   22.02   19.27   23.61   24.18   23.17   21.07
    mean  21.94   range  19.27 - 24.18

  rhtab per-producer:
   133.51  129.47   74.52  129.29  102.26  129.98  107.64
    mean 115.24   range  74.52 - 133.51

  speedup (mean): 5.25x   (+425 %)

In-place memcpy avoids the per-update alloc + RCU pointer swap
that htab pays.

3. MEMORY  (overwrite, -p 8, no --preallocated)

  value_size |  htab ops/s | rhtab ops/s | htab mem | rhtab mem
  -----------+-------------+-------------+----------+----------
       32 B  |  122.87 k/s |  133.04 k/s | 2.47 MiB | 2.49 MiB
     4096 B  |   64.43 k/s |   65.38 k/s | 6.74 MiB | 6.44 MiB
  rhtab/htab :  +8 % ops, +0.8 % mem   (32 B)
                +1 % ops,  -4  % mem (4096 B)

SUMMARY

  * Small / well-fitting map: htab is faster (cache-friendly
    fixed bucket array), but only by ~10-20 %.
  * Large / high-load-factor map: rhtab is dramatically faster
    (1.2x to 4x) because rhashtable resizes to keep the load
    factor sane while htab stays stuck at user-declared max.
  * Update-heavy workloads: rhtab is ~5x faster per producer
    via in-place memcpy.
  * Memory benchmark: effectively on par

---
Changes in v7:
- rhashtable_next_key: move into lib/rhashtable.c, drop params argument
  (Herbert).
- rhashtable_next_key: kdoc clarifies that behavior on tables with
  duplicate keys is undefined (sashiko).
- rhashtable: include Herbert's "Use irq work for shrinking" patch so
  __rhashtable_remove_fast_one() can fire the shrink path from NMI
  context (Herbert).
- hashtab: fix u32 multiply overflow in __rhtab_map_lookup_and_delete_batch
  copy_to_user; cast total to size_t before multiplying by key_size /
  value_size (sashiko, bot+bpf-ci).
- hashtab: allow kptr/refcount fields in rhtab values (same model as
  array map).
- Link to v6: https://patch.msgid.link/20260602-rhash-v6-0-1bfd35a4184f@meta.com

Changes in v6:
- rhashtable_next_key: advance past duplicate keys in the main bucket
  chain to avoid an infinite loop when there are duplicate keys
  (sashiko).
- rhashtable_next_key: return ERR_PTR(-EOPNOTSUPP) on rhltable (sashiko).
- rhashtable: selftest pre-sizes the table to avoid concurrent rehash
  triggering spurious failures (sashiko).
- hashtab: real rhtab_map_mem_usage in the basic commit; move
  bpf_map_free_internal_structs from rhtab_free_elem into the
  special-fields commit where it does meaningful work (bot+bpf-ci).
- bpf_iter (seq_file): switch to rhashtable_walk_* for stronger
  coverage under concurrent rehash; get_next_key and batch keep
  rhashtable_next_key (sashiko).
- iter ops: rhtab_map_get_next_key adds IS_ERR check
  before dereferencing the element pointer (sashiko).
- iter ops: bpf_each_rhash_elem removes cond_resched() (sashiko).
- iter ops: batch returns -EAGAIN (not -ENOENT) on cursor delete,
  so userspace can distinguish lost cursor from end-of-iteration
  and restart from NULL (sashiko).

- Link to v5: https://patch.msgid.link/20260528-rhash-v5-0-7205191b6c57@meta.com

Changes in v5:
- rhashtable_next_key: add kdoc WARNING to highlight lack of rehash
  detection and unbounded iteration (Herbert).
- rhashtable: selftest now checks IS_ERR() before PTR_ERR comparison
  on the missing-key path (bot+bpf-ci).
- hashtab: drop dead stub bodies and unused map_ops registrations
  from the basic commit; iteration commit adds bodies, structs, and
  registrations together. .map_get_next_key keeps a stub registration
  in the basic commit because the syscall dispatcher does not
  NULL-check it; iteration commit replaces the stub body with the
  real implementation (bot+bpf-ci).
- hashtab: fix batch cursor advancement. v4 stashed the lookahead
  element key but then resumed via next_key(cursor), skipping that
  element across batch boundaries and orphaning it on
  lookup_and_delete_batch. v5 stashes the lookahead key and looks
  it up directly on the next batch entry (bot+bpf-ci, sashiko v3).
- hashtab: document torn-read race in rhtab_map_update_existing,
  matching arraymap semantics (bot+bpf-ci).
- Link to v4: https://patch.msgid.link/20260513-rhash-v4-0-dd3d541ccb0b@meta.com

Changes in v4:
- rhashtable: introduce rhashtable_next_key(), drop walker-based
  iteration for BPF (also drops earlier rhashtable_walk_enter_from()
  proposal).
- map_extra: presize hint via lower 32 bits (nelem_hint), capped at
  U16_MAX.
- Automatic shrinking enabled (was missing despite being advertised).
- Reject key_size > U16_MAX (rhashtable_params.key_len is u16).
- Replace irqs_disabled() guard with bpf_disable_instrumentation around
  bucket-lock paths: closes same-CPU NMI tracing recursion without
  rejecting legitimate IRQ-context callers.
- lookup_and_delete reordered: unlink before copy to avoid populating
  user buffer on concurrent-unlink -ENOENT.
- update_existing reordered: copy then free_fields, matching arraymap.
- Word-sized key fast path (sizeof(long) bytes), inlined hashfn/cmpfn
  via static-const rhashtable_params; works on both 32-bit and 64-bit.
- check_and_init_map_value() on insert (zero special-field bytes from
  recycled bpf_mem_alloc memory; previously bpf_spin_lock could read
  garbage and qspinlock would deadlock).
- BPF_SPIN_LOCK / BPF_RES_SPIN_LOCK allowlist moved to the special-
  fields commit so each commit is bisect-safe.
- Link to v3: https://patch.msgid.link/20260424-rhash-v3-0-d0fa0ce4379b@meta.com

Changes in v3:
- Squash all commits implementing basic functions into one (Alexei)
- Remove selftests that were not necessary (Alexei)
- Resize detection for kernel full iterations, error out on resize (Alexei)
- Remove second lookup in get_next_key() (Emil)
- __acquires(RCU)/__releases(RCU) on seq_start/seq_stop (Emil)
- Use bpf_map_check_op_flags() where it makes sense (Leon)
- Benchmarks refresh, experiment with alternative hash functions
- Rely on iterator invalidation during rehash to handle table resizes:
fail on resize where we fully iterate on table inside kernel, dont fail on
resize where iteration goes through userspace. Exception -
rhtab_map_free_internal_structs() should be just safe to iterate fully
in kernel, no risk of infinite loop, because no user holding reference.
- Handle special fields during in-place updates (Emil, sashiko)
- Link to v2: https://lore.kernel.org/all/20260408-rhash-v2-0-3b3675da1f6e@meta.com/

Changes in v2:
- Added benchmarks
- Reworked all functions that walk the rhashtable, use walk API, instead
of directly accessing tbl and future_tbl
- Added rhashtable_walk_enter_from() into rhashtable to support O(1)
iteration continuations
- Link to v1: https://lore.kernel.org/r/20260205-rhash-v1-0-30dd6d63c462@meta.com

---
====================

Link: https://patch.msgid.link/20260605-rhash-v7-0-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agoselftests/bpf: Add resizable hashmap to benchmarks
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:29 +0000 (04:41 -0700)] 
selftests/bpf: Add resizable hashmap to benchmarks

Support resizable hashmap in BPF map benchmarks.

1. LOOKUP  (single producer, M events/sec)

  key | max | nr    |    htab |   rhtab | ratio | delta
  ----+-----+-------+---------+---------+-------+-------
    8 |  1K |   750 |   99.85 |   81.92 | 0.82x |  -18 %
    8 |  1K |    1K |  100.71 |   80.19 | 0.80x |  -20 %
    8 |  1M |  750K |   23.37 |   72.09 | 3.08x | +208 %
    8 |  1M |    1M |   13.39 |   53.72 | 4.01x | +301 %
   32 |  1K |   750 |   51.57 |   42.78 | 0.83x |  -17 %
   32 |  1K |    1K |   50.81 |   45.83 | 0.90x |  -10 %
   32 |  1M |  750K |   11.27 |   15.29 | 1.36x |  +36 %
   32 |  1M |    1M |    7.32 |    8.75 | 1.19x |  +19 %
  256 |  1K |   750 |    7.58 |    7.88 | 1.04x |   +4 %
  256 |  1K |    1K |    7.43 |    7.81 | 1.05x |   +5 %
  256 |  1M |  750K |    3.69 |    4.27 | 1.16x |  +16 %
  256 |  1M |    1M |    2.60 |    3.12 | 1.20x |  +20 %

Pattern:
  * Small map (1K): htab wins for 8 / 32 byte keys by 10-20%
  * Large map (1M): rhtab wins everywhere, up to 4x at high load
    factor with 8 byte keys.
  * Higher load factor amplifies rhtab's lead: rhtab grows the
    bucket array; htab stays at user-declared max.

2. FULL UPDATE  (M events/sec per producer)

  htab  per-producer:
    20.33   22.02   19.27   23.61   24.18   23.17   21.07
    mean  21.94   range  19.27 - 24.18

  rhtab per-producer:
   133.51  129.47   74.52  129.29  102.26  129.98  107.64
    mean 115.24   range  74.52 - 133.51

  speedup (mean): 5.25x   (+425 %)

In-place memcpy avoids the per-update alloc + RCU pointer swap
that htab pays.

3. MEMORY

  value_size |  htab ops/s | rhtab ops/s | htab mem | rhtab mem
  -----------+-------------+-------------+----------+----------
       32 B  |  122.87 k/s |  133.04 k/s | 2.47 MiB | 2.49 MiB
     4096 B  |   64.43 k/s |   65.38 k/s | 6.74 MiB | 6.44 MiB
  rhtab/htab :  +8 % ops, +0.8 % mem   (32 B)
                +1 % ops,  -4  % mem (4096 B)

Throughput effectively tied

SUMMARY

  * Small / well-fitting map: htab is faster (cache-friendly
    fixed bucket array), but only by ~10-20 %.
  * Large / high-load-factor map: rhtab is dramatically faster
    (1.2x to 4x) because rhashtable resizes to keep the load
    factor sane while htab stays stuck at user-declared max.
  * Update-heavy workloads: rhtab is ~5x faster per producer
    via in-place memcpy.
  * Memory benchmark: effectively on par.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-12-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agobpftool: Add rhash map documentation
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:28 +0000 (04:41 -0700)] 
bpftool: Add rhash map documentation

Make bpftool documentation aware of the resizable hash map.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-11-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agoselftests/bpf: Add BPF iterator tests for resizable hash map
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:27 +0000 (04:41 -0700)] 
selftests/bpf: Add BPF iterator tests for resizable hash map

Test basic BPF iterator functionality for BPF_MAP_TYPE_RHASH,
verifying all elements are visited.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-10-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agoselftests/bpf: Add basic tests for resizable hash map
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:26 +0000 (04:41 -0700)] 
selftests/bpf: Add basic tests for resizable hash map

Test basic map operations (lookup, update, delete) for
BPF_MAP_TYPE_RHASH including boundary conditions like duplicate
key insertion and deletion of nonexistent keys.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-9-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agolibbpf: Support resizable hashtable
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:25 +0000 (04:41 -0700)] 
libbpf: Support resizable hashtable

Add BPF_MAP_TYPE_RHASH to libbpf's map type name table and feature
probing so that libbpf-based tools can create and identify resizable
hash maps.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-8-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agobpf: Optimize word-sized keys for resizable hashtable
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:24 +0000 (04:41 -0700)] 
bpf: Optimize word-sized keys for resizable hashtable

Specialize the lookup/update/delete paths for keys whose size matches
sizeof(long) (4 bytes on 32-bit, 8 bytes on 64-bit). A static-const
rhashtable_params lets the compiler inline a custom XOR-fold hashfn and
a single-word equality cmpfn, eliminating the indirect jhash dispatch.
The same hashfn and cmpfn are installed into rhashtable's stored params
at rhashtable_init time, so the rehash worker, slow-path inserts, and
rhashtable_next_key() all agree with the inlined fast paths.

The seq_file BPF iterator uses rhashtable_walk_* and is unaffected.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-7-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agobpf: Allow special fields in resizable hashtab
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:23 +0000 (04:41 -0700)] 
bpf: Allow special fields in resizable hashtab

Add support for timers, workqueues, task work, spin locks and kptrs.
Without this, users needing deferred callbacks, BPF_F_LOCK, or
refcounted kernel pointers in a dynamically-sized map have no option -
fixed-size htab is the only map supporting these field types.
Resizable hashtab should offer the same capability.

kptr semantics under in-place updates are identical to array map.

Properly clean up BTF record fields on element delete and map
teardown by wiring up bpf_obj_free_fields through a memory allocator
destructor, matching the pattern used by htab for non-prealloc maps.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-6-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agobpf: Implement iteration ops for resizable hashtab
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:22 +0000 (04:41 -0700)] 
bpf: Implement iteration ops for resizable hashtab

Implement get_next_key, batch lookup/lookup-and-delete, for_each_map_elem
callback, and the seq_file BPF iterator for BPF_MAP_TYPE_RHASH.

get_next_key() and batch use rhashtable_next_key() — stateless,
matches the syscall UAPI shape (no kernel-side iterator state).
get_next_key falls back to the first key when prev_key was
concurrently deleted (matches htab semantics). Batch reports
cursor loss as -EAGAIN so userspace can distinguish it from
end-of-iteration (-ENOENT) and restart from NULL.

The seq_file BPF iterator uses rhashtable_walk_* instead. It runs
only from read() syscall context, so the walker's spin_lock is
safe, and seq_file's per-fd state lets the walker handle rehash
correctly (retry on -EAGAIN) for stronger coverage than the
stateless API can provide.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-5-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agobpf: Implement resizable hashmap basic functions
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:21 +0000 (04:41 -0700)] 
bpf: Implement resizable hashmap basic functions

Use rhashtable_lookup_likely() for lookups, rhashtable_remove_fast()
for deletes, and rhashtable_lookup_get_insert_fast() for inserts.

Updates modify values in place under RCU rather than allocating a
new element and swapping the pointer (as regular htab does). This
trades read consistency for performance: concurrent readers may
see partial updates. BPF_F_LOCK support and special-field
handling (timers, kptrs, etc.) follow in a later commit.

Initialize rhashtable with bpf_mem_alloc element cache. Require
BPF_F_NO_PREALLOC. Limit max_entries to 2^31. Free elements via
rhashtable_free_and_destroy().

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-4-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agorhashtable: Use irq work for shrinking
Herbert Xu [Fri, 5 Jun 2026 11:41:20 +0000 (04:41 -0700)] 
rhashtable: Use irq work for shrinking

Use irq work for automatic shrinking so that this may be called in NMI
context.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260605-rhash-v7-3-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agorhashtable: Add selftest for rhashtable_next_key()
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:19 +0000 (04:41 -0700)] 
rhashtable: Add selftest for rhashtable_next_key()

Insert n elements, then verify:
  - NULL prev_key walks from the beginning, visiting all n
  - non-existing prev_key returns ERR_PTR(-ENOENT)

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20260605-rhash-v7-2-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agorhashtable: Add rhashtable_next_key() API
Mykyta Yatsenko [Fri, 5 Jun 2026 11:41:18 +0000 (04:41 -0700)] 
rhashtable: Add rhashtable_next_key() API

Introduce a simpler iteration mechanism for rhashtable that lets
the caller continue from an arbitrary position by supplying the
previous key, without the per-iterator state of the
rhashtable_walk_* API.

  void *rhashtable_next_key(struct rhashtable *ht,
                            const void *prev_key);

Caller holds RCU; passes NULL prev_key for the first element or
the previously returned key to advance. Walks tbl->future_tbl
chain so in-flight rehashes are observed.

Best-effort: in case of concurrent resize, provides no guarantees:
 - may produce duplicate elements
 - may skip any amount of elements
 - termination of the loop is not guaranteed in case of
 sustained rehash. Callers are advised to bound loop externally
 or avoid inserting new elements during such loop.

Returns ERR_PTR(-ENOENT) if prev_key is not found.
Behavior on tables with duplicate keys is undefined.
rhltable is not supported — returns ERR_PTR(-EOPNOTSUPP).

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20260605-rhash-v7-1-5b8e05f8630d@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
12 days agonetfilter: conntrack: call nf_ct_gre_keymap_destroy() if master helper is pptp
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:13 +0000 (08:21 +0200)] 
netfilter: conntrack: call nf_ct_gre_keymap_destroy() if master helper is pptp

For GRE flows, validate that the ct master helper (if any) is pptp
before calling nf_ct_gre_keymap_destroy(), so the helper data area
can be accessed safely. Note that only the pptp helper provides a
.destroy callback.

Fixes: e56894356f60 ("netfilter: conntrack: remove l4proto destroy hook")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: conntrack: revert ct extension genid infrastructure
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:12 +0000 (08:21 +0200)] 
netfilter: conntrack: revert ct extension genid infrastructure

This infrastructure is not used anymore after moving ct timeout and
helper to use datapath refcount to track object use.

Revert commit c56716c69ce1 ("netfilter: extensions: introduce extension
genid count") this patch disables all ct extensions (leading to NULL)
for unconfirmed conntracks, when this is only targeted at ct helper and
ct timeout. There is also codebase that dereferences the ct extension
without checking for NULL which could lead to crash.

Fixes: c56716c69ce1 ("netfilter: extensions: introduce extension genid count")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: nf_conntrack_helper: add refcounting from datapath
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:11 +0000 (08:21 +0200)] 
netfilter: nf_conntrack_helper: add refcounting from datapath

This patch adds a new ->ct_refcnt field to struct nf_conntrack_helper
which is bumped when the helper is used by the ct helper extension. Drop
this reference count when the conntrack entry is released. This is a
packet path refcount which ensures that struct nf_conntrack_helper
remains in place for tricky scenarios where a packet sits in nfqueue, or
elsewhere, with a conntrack that refers to this helper.

For simplicity, this leaves a single refcount for helper objects in
place, remove the existing refcount for control plane that ensures that
the helper does not go away if it is used by ruleset.

On helper removal, the help callback is set to NULL to disable it from
packet path and, after rcu grace period, existing expectations are
removed. Update ctnetlink to disable access to .to_nlattr and
.from_nlattr if the helper is going away.

Remove nf_queue_nf_hook_drop() since it has proven not to be effective
because packets with unconfirmed conntracks which are still flying to
sit in nfqueue.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: nf_conntrack_pptp: move GRE specific cleanup to GRE tracker
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:10 +0000 (08:21 +0200)] 
netfilter: nf_conntrack_pptp: move GRE specific cleanup to GRE tracker

Move the GRE specific cleanup to nf_conntrack_proto_gre.c to ensure that
the .destroy callback for the pptp helper is still reachable by existing
conntrack entries while pptp module is being removed.

This is a preparation patch, no functional changes are intended.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agowifi: mac80211: Add 802.3 multicast encapsulation offload support
Tamizh Chelvam Raja [Thu, 4 Jun 2026 16:24:03 +0000 (21:54 +0530)] 
wifi: mac80211: Add 802.3 multicast encapsulation offload support

mac80211 converts 802.3 multicast packets to 802.11 format
before driver TX, even when Ethernet encapsulation offload
is enabled. This prevents drivers that support multicast
Ethernet encapsulation offload from receiving frames in
native 802.3 format.

Introduce the IEEE80211_OFFLOAD_ENCAP_MCAST flag to bypass the
802.11 encapsulation step and pass the multicast packet to the
driver in 802.3 format. Drivers that support multicast Ethernet
encapsulation offload can advertise this flag.

Disable multicast encapsulation offload in MLO case for drivers not
advertising MLO_MCAST_MULTI_LINK_TX support for AP mode and for
3-address AP_VLAN multicast packets.

Signed-off-by: Tamizh Chelvam Raja <tamizh.raja@oss.qualcomm.com>
Link: https://patch.msgid.link/20260604162403.1563729-4-tamizh.raja@oss.qualcomm.com
[fix unlikely(), indentation]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
12 days agowifi: mac80211: Add multicast to unicast support for 802.3 path
Tamizh Chelvam Raja [Thu, 4 Jun 2026 16:24:02 +0000 (21:54 +0530)] 
wifi: mac80211: Add multicast to unicast support for 802.3 path

mac80211 already supports multicast-to-unicast conversion for
native 802.11 TX paths, but this handling is missing for the
802.3 transmit path. Due to that the packet never converted to
unicast and directly pass it to 802.11 Tx path by checking the
destination address as multicast.

Extend ieee80211_subif_start_xmit_8023() to honor the
multicast_to_unicast setting by cloning and converting multicast
Ethernet frames into per-station unicast transmissions, following
the same behavior of the native 802.11 TX path and allow it
to take 802.3 path.

Signed-off-by: Tamizh Chelvam Raja <tamizh.raja@oss.qualcomm.com>
Link: https://patch.msgid.link/20260604162403.1563729-3-tamizh.raja@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
12 days agowifi: mac80211: Add sta pointer sanity check in ieee80211_8023_xmit()
Tamizh Chelvam Raja [Thu, 4 Jun 2026 16:24:01 +0000 (21:54 +0530)] 
wifi: mac80211: Add sta pointer sanity check in ieee80211_8023_xmit()

Currently ieee80211_8023_xmit() accesses the sta pointer without any
sanity check, assuming that only unicast packets for an authorized
station are processed. But the sta pointer could become NULL when
a framework to support 802.3 offload for the multicast packets is
added in the follow-up patches. Add the valid sta pointer sanity
check to avoid the invalid pointer access.

This aligns with some of the subordinate functions called by
ieee80211_8023_xmit() that already NULL-check 'sta' such as
ieee80211_select_queue() and ieee80211_aggr_check().

Signed-off-by: Tamizh Chelvam Raja <tamizh.raja@oss.qualcomm.com>
Link: https://patch.msgid.link/20260604162403.1563729-2-tamizh.raja@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
12 days agoiommufd/selftest: Cover invalid read counts on vEVENTQ FD
Nicolin Chen [Mon, 1 Jun 2026 20:42:38 +0000 (13:42 -0700)] 
iommufd/selftest: Cover invalid read counts on vEVENTQ FD

The vEVENTQ file descriptor must reject reads whose buffer cannot hold
even one event record. Add selftest coverage that exercises both the
empty-queue path (the upfront size check) and the non-empty path (the
in-loop check that fires only after an event is fetched).

For iommufd_veventq_fops_read():
 - count == 0 and count < sizeof(header) on an empty vEVENTQ both
   return -EINVAL.
 - count == 0 and count == sizeof(header) on a non-empty vEVENTQ
   (event has trailing payload) both return -EINVAL.

Link: https://patch.msgid.link/r/7bcd153d306f2cf04c094c728c0ebe146855072a.1780343944.git.nicolinc@nvidia.com
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Avoid partial fault group delivery in iommufd_fault_fops_read()
Nicolin Chen [Mon, 1 Jun 2026 20:42:37 +0000 (13:42 -0700)] 
iommufd: Avoid partial fault group delivery in iommufd_fault_fops_read()

The cookie returned by xa_alloc() in iommufd_fault_fops_read() is per fault
group, but the inner copy_to_user() runs per fault inside the group. If a
copy fails mid-group, xa_erase clears the cookie and the group is restored
to the deliver list, yet done is not rolled back. The function returns the
partial byte count, with the successfully copied faults sitting at offsets
below done carrying the now-erased cookie. The next read() then re-fetches
the group, allocates a fresh cookie, and re-delivers every fault including
the ones already copied; userspace sees duplicates carrying the new cookie,
and a stale cookie that can never be responded to.

Use a local group_done variable that tracks the per-group progress inside
the inner loop, and only commit done = group_done after the inner loop has
finished successfully. On a copy_to_user failure the outer break skips the
commit, so done remains at its prior start-of-group baseline; the partial
bytes already written past done are undefined to userspace per the read(2)
contract, and the next read re-delivers the whole group atomically.

Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Link: https://patch.msgid.link/r/360cab4d4aeccb0bae275a970e2b3c340a71e0e0.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Break the loop on failure in iommufd_fault_fops_read()
Nicolin Chen [Mon, 1 Jun 2026 20:42:36 +0000 (13:42 -0700)] 
iommufd: Break the loop on failure in iommufd_fault_fops_read()

On a copy_to_user() failure inside the inner list_for_each_entry, only the
inner loop breaks; the outer while re-fetches the just-restored fault group
and retries the failing copy_to_user() forever, spinning the reader at 100%
CPU with fault->mutex held.

Check rc after the inner loop and break the outer while as well.

Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Link: https://patch.msgid.link/r/336a9b6e44fe66a24199d3be777c405c85c98622.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Reject invalid read count in iommufd_fault_fops_read()
Nicolin Chen [Mon, 1 Jun 2026 20:42:35 +0000 (13:42 -0700)] 
iommufd: Reject invalid read count in iommufd_fault_fops_read()

The read count must be large enough to hold one fault or a group's faults.

iommufd_fault_fops_read() does not validate the count, but returns 0 as if
the read had succeeded while leaving the pending fault in the queue.

Return -EINVAL in the undersize cases.

Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Link: https://patch.msgid.link/r/85c118a606fbedc5c132a1f5ec223a5ba23b92d2.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Propagate allocation failure in iommufd_veventq_deliver_fetch()
Nicolin Chen [Mon, 1 Jun 2026 20:42:34 +0000 (13:42 -0700)] 
iommufd: Propagate allocation failure in iommufd_veventq_deliver_fetch()

When the kzalloc_obj() fails in iommufd_veventq_deliver_fetch(), it returns
NULL, falsely advertising to userspace that the queue is empty.

Propagate the -ENOMEM properly to the caller.

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Link: https://patch.msgid.link/r/25d29feac909e36f78c145fa99ef2d4cb7a415da.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Reject invalid read count in iommufd_veventq_fops_read()
Nicolin Chen [Mon, 1 Jun 2026 20:42:33 +0000 (13:42 -0700)] 
iommufd: Reject invalid read count in iommufd_veventq_fops_read()

The read count must be large enough to hold a vEVENT header. For a normal
vEVENT, it must also hold the trailing data following the header.

iommufd_veventq_fops_read() does not validate the count, but returns 0 as
if the read had succeeded while leaving the pending event in the queue.

Return -EINVAL in both undersize cases.

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Link: https://patch.msgid.link/r/e1111adcc8a8882fbfd84accd6674dc846dc5689.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoiommufd: Rewind header length in done if iommufd_veventq_fops_read() fails
Nicolin Chen [Mon, 1 Jun 2026 20:42:32 +0000 (13:42 -0700)] 
iommufd: Rewind header length in done if iommufd_veventq_fops_read() fails

When the first event copy fails, rc = -EFAULT will not be reported as done
is set to the length of the copied header.

Rewind it to report rc correctly.

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Link: https://patch.msgid.link/r/78f8caeb6a5d667a26b870e3068cec47dd4b5be1.1780343944.git.nicolinc@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agox86/cpu: Remove obsolete aperfmperf_get_khz() declaration
Junxiao Chang [Sat, 6 Jun 2026 02:15:14 +0000 (10:15 +0800)] 
x86/cpu: Remove obsolete aperfmperf_get_khz() declaration

aperfmperf_get_khz() was replaced by arch_freq_get_on_cpu().
The remaining declaration in the header file is no longer used
and should be removed.

Fixes: f3eca381bd49 ("x86/aperfmperf: Replace arch_freq_get_on_cpu()")
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://patch.msgid.link/20260606021514.1433619-1-junxiao.chang@intel.com
12 days agoKVM: arm64: Correctly identify executable PTEs at stage-2
Oliver Upton [Tue, 2 Jun 2026 16:59:01 +0000 (09:59 -0700)] 
KVM: arm64: Correctly identify executable PTEs at stage-2

KVM invalidates the I-cache before installing an executable PTE on
implementations without DIC. Unfortunately, support for FEAT_XNX
broke this check as KVM_PTE_LEAF_ATTR_HI_S2_XN was expanded to a
bitfield.

Fix it by reusing kvm_pgtable_stage2_pte_prot() and testing the abstract
permission bits instead.

Fixes: 2608563b466b ("KVM: arm64: Add support for FEAT_XNX stage-2 permissions")
Reported-by: Sashiko (gemini/gemini-3.1-pro-preview)
Signed-off-by: Oliver Upton <oupton@kernel.org>
Reviewed-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260602165901.52800-3-oupton@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
12 days agoKVM: arm64: nv: Fix handling of XN[0] when !FEAT_XNX
Oliver Upton [Tue, 2 Jun 2026 16:59:00 +0000 (09:59 -0700)] 
KVM: arm64: nv: Fix handling of XN[0] when !FEAT_XNX

XN has already been extracted from its bitfield position so using
FIELD_PREP() on the mask that clears XN[0] is completely broken, having
the effect of unconditionally granting execute permissions...

Fix the obvious mistake by manipulating the right bit.

Cc: stable@vger.kernel.org
Fixes: d93febe2ed2e ("KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2")
Reviewed-by: Wei-Lin Chang <weilin.chang@arm.com>
Signed-off-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/20260602165901.52800-2-oupton@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
12 days agocleanup: Specify nonnull argument index
Dmitry Ilvokhin [Fri, 5 Jun 2026 10:06:22 +0000 (03:06 -0700)] 
cleanup: Specify nonnull argument index

The guard constructors were annotated with an empty __nonnull_args(),
relying on __nonnull__() marking every pointer parameter as non-NULL.
Sparse cannot parse the empty argument list.

Both constructors take the lock pointer as their first parameter, so
specify the index explicitly: __nonnull_args(1).

Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/all/aiJi0WcYE8FZt-jO@stanley.mountain/
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/aiKpH3cLBEj3TF2Q@shell.ilvokhin.com
12 days agofs/read_write: Do not export __kernel_write() to the entire world
Andy Shevchenko [Thu, 4 Jun 2026 09:52:02 +0000 (11:52 +0200)] 
fs/read_write: Do not export __kernel_write() to the entire world

Since we have EXPORT_SYMBOL_FOR_MODULES(), we may narrow
the __kernel_write() export to the only which really needs it.
With that being done, update the respective comment.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260604095233.284067-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agoptp: vmclock: Use hw_cycles from snapshot for precise TSC pairing
David Woodhouse [Thu, 4 Jun 2026 09:35:18 +0000 (10:35 +0100)] 
ptp: vmclock: Use hw_cycles from snapshot for precise TSC pairing

When the system clocksource is kvmclock or Hyper-V (not the TSC directly),
vmclock_get_crosststamp() falls through to a separate get_cycles() call,
losing the atomic pairing between the system time snapshot and the TSC
reading.

Now that ktime_get_snapshot_id() populates hw_cycles with the underlying
TSC value for derived clocksources, use it when available.  This gives a
perfect (system_time, tsc) pairing for the device time calculation.

The SUPPORT_KVMCLOCK wrapper is still needed to convert the TSC into
kvmclock nanoseconds for system_counter->cycles, because otherwise
get_device_system_crosststamp() can't interpret the result against the
system clock.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Assisted-by: Kiro:claude-opus-4.6-1m
Link: https://patch.msgid.link/20260604095755.64849-4-dwmw2@infradead.org
12 days agox86/kvmclock: Implement read_snapshot() for kvmclock clocksource
David Woodhouse [Thu, 4 Jun 2026 09:35:17 +0000 (10:35 +0100)] 
x86/kvmclock: Implement read_snapshot() for kvmclock clocksource

Implement the read_snapshot() callback for the kvmclock clocksource.  This
returns the kvmclock nanosecond value (for timekeeping) while also
providing the raw TSC value that was used to compute it.

The TSC is read inside the pvclock seqlock-protected region, ensuring the
raw TSC and derived kvmclock value are atomically paired.

This enables ktime_get_snapshot_id() to provide the raw TSC to consumers
like the vmclock PTP driver, which currently has to do a separate call to
get_cycles() to obtain a value at *approximately* the same time, to feed
through the vmclock calculation.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Assisted-by: Kiro:claude-opus-4.6-1m
Link: https://patch.msgid.link/20260604095755.64849-3-dwmw2@infradead.org
12 days agoclocksource/hyperv: Implement read_snapshot() for TSC page clocksource
David Woodhouse [Thu, 4 Jun 2026 09:35:16 +0000 (10:35 +0100)] 
clocksource/hyperv: Implement read_snapshot() for TSC page clocksource

Implement the read_snapshot() callback for the Hyper-V TSC page clock-
source. This returns the derived 10MHz reference time (for timekeeping)
while also providing the raw TSC value that was used to compute it.

When the TSC page is valid, hv_read_tsc_page_tsc() atomically captures both
values from a single RDTSC inside the sequence-counter protected read. When
the TSC page is invalid (sequence == 0), the hw_csid and hw_cycles are set
to zero indicating no value is available.

This enables ktime_get_snapshot_id() to provide the raw TSC to consumers
like KVM's master clock when running nested guests under Hyper-V.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Assisted-by: Kiro:claude-opus-4.6-1m
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20260604095755.64849-2-dwmw2@infradead.org
12 days agopwm: th1520: Remove requirement for mul_u64_u64_div_u64_roundup
Maurice Hieronymus [Fri, 5 Jun 2026 07:03:59 +0000 (09:03 +0200)] 
pwm: th1520: Remove requirement for mul_u64_u64_div_u64_roundup

The cycle register is always u32, so cycles_to_ns() can take a u32
instead of a u64. With that narrowing, cycles * NSEC_PER_SEC is at most
u32::MAX * 1e9 (~4.3e18), which fits in u64 without overflow. The
saturating arithmetic is therefore no longer needed, and the ceiling
division can use Rust's u64::div_ceil() directly instead of the
open-coded numerator/denominator form.

This also drops the TODO referring to a future
mul_u64_u64_div_u64_roundup kernel helper, which is no longer required.

Reviewed-by: Michal Wilczynski <m.wilczynski@samsung.com>
Signed-off-by: Maurice Hieronymus <mhi@mailbox.org>
Link: https://patch.msgid.link/20260605-pwm-th1520-fix-v2-1-5921e3a595f7@mailbox.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
12 days agomm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Shengming Hu [Thu, 4 Jun 2026 12:27:32 +0000 (20:27 +0800)] 
mm/slub: preserve original size in _kmalloc_nolock_noprof retry path

_kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
initial allocation fails. The retry currently reuses `size` as the
bucket selector and overwrites it with s->object_size + 1.

That value is later passed as the original allocation size to
__slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
successful retry this makes KASAN/slub-debug observe the retry bucket
selector rather than the caller requested size, potentially widening the
valid kmalloc range and hiding overflows.

Keep the caller requested size separately as orig_size and pass it to
the allocation/debug/KASAN paths. Continue using `size` as the retry cache
selector.

Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/202606042027323804pk3MRY42Jy7y42OHAhQZ@zte.com.cn
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
12 days agoiomap: Add IOMAP_F_ZERO_TAIL flag to trace event strings
Namjae Jeon [Wed, 3 Jun 2026 14:40:31 +0000 (23:40 +0900)] 
iomap: Add IOMAP_F_ZERO_TAIL flag to trace event strings

Add IOMAP_F_ZERO_TAIL to the flag string mapping in iomap trace
events. This allows the new flag to be properly displayed in
ftrace output when iomap operations use it.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Link: https://patch.msgid.link/20260603144031.7370-1-linkinjeon@kernel.org
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agocrypto: qat - simplify adf_service_mask_to_string helper
Thorsten Blum [Wed, 27 May 2026 17:46:55 +0000 (19:46 +0200)] 
crypto: qat - simplify adf_service_mask_to_string helper

Use a single scnprintf() for each set bit and drop the offset in the
else branch to simplify adf_service_mask_to_string().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agocrypto: powerpc/aes - use min in ppc_{ecb,cbc,ctr,xts}_crypt
Thorsten Blum [Wed, 27 May 2026 14:11:47 +0000 (16:11 +0200)] 
crypto: powerpc/aes - use min in ppc_{ecb,cbc,ctr,xts}_crypt

Replace min_t() with the simpler min() macro since the values are
unsigned and compatible.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agocrypto: chacha20poly1305 - validate poly1305 template argument
Xiaonan Zhao [Tue, 26 May 2026 10:11:43 +0000 (18:11 +0800)] 
crypto: chacha20poly1305 - validate poly1305 template argument

chachapoly_create() still accepts the compatibility poly1305 parameter
in the template name, but it assumes the second template argument is
always present and immediately passes it to strcmp().

When the argument is missing, crypto_attr_alg_name() returns an error
pointer. Check for that before comparing the name so malformed template
instantiations fail with an error instead of dereferencing the error
pointer in strcmp().

This matches the surrounding Crypto API template pattern where
crypto_attr_alg_name() results are validated before string-specific use.

Fixes: a298765e28ad ("crypto: chacha20poly1305 - Use lib/crypto poly1305")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Xiaonan Zhao <ngochuongbui67@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agocrypto: qat - add KPT support for GEN6 devices
Junyuan Wang [Tue, 26 May 2026 09:28:39 +0000 (09:28 +0000)] 
crypto: qat - add KPT support for GEN6 devices

Add support for Intel Key Protection Technology (KPT) on QAT GEN6
devices.

KPT protects private keys from exposure by keeping them wrapped
(encrypted) while in use, in-flight, and at rest. Keys remain in wrapped
form and are not exposed in plaintext in host memory. This feature
operates outside of the Linux crypto framework and kernel keyring.

Extend the firmware admin interface to enable and configure KPT. During
device initialisation, if KPT is enabled, the driver sends an admin
message to firmware to enable KPT mode and configure parameters such as
the maximum number of SWK (Symmetric Wrapping Key) slots and the SWK
time-to-live (TTL).

Expose KPT configuration via a new sysfs attribute group, "qat_kpt", and
add ABI documentation.

Co-developed-by: Nitesh Venkatesh <nitesh.venkatesh@intel.com>
Signed-off-by: Nitesh Venkatesh <nitesh.venkatesh@intel.com>
Signed-off-by: Junyuan Wang <junyuan.wang@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agocrypto: pcrypt - restore callback for non-parallel fallback
Ruijie Li [Mon, 25 May 2026 11:45:21 +0000 (19:45 +0800)] 
crypto: pcrypt - restore callback for non-parallel fallback

pcrypt installs pcrypt_aead_done() on the child AEAD request before
trying to submit it through padata.  If padata_do_parallel() returns
-EBUSY, pcrypt falls back to calling the child AEAD directly.

That fallback must not keep the padata completion callback.  Otherwise
an asynchronous completion runs pcrypt_aead_done() even though the
request was never enrolled in padata.

Restore the original request callback and callback data before calling
the child AEAD directly.  This keeps the fallback path aligned with a
direct AEAD request while leaving the parallel path unchanged.

Fixes: 662f2f13e66d ("crypto: pcrypt - Call crypto layer directly when padata_do_parallel() return -EBUSY")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:gpt-5.4
Signed-off-by: Ruijie Li <ruijieli51@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agocrypto: nx - fix nx_crypto_ctx_exit argument
Sam James [Mon, 25 May 2026 07:56:19 +0000 (08:56 +0100)] 
crypto: nx - fix nx_crypto_ctx_exit argument

nx_crypto_ctx_shash_exit calls nx_crypto_ctx_exit with crypto_shash_ctx(...)
but crypto_shash_ctx gives a nx_crypto_ctx *, not a crypto_tfm *.

Fix the type in nx_crypto_ctx_exit and drop the bogus crypto_tfm_ctx
call.

This fixes the following oops:

  BUG: Unable to handle kernel data access at 0xc0403effffffffc8
  Faulting instruction address: 0xc000000000396cb4
  Oops: Kernel access of bad area, sig: 11 [#15]
  Call Trace:
   nx_crypto_ctx_shash_exit+0x24/0x60
   crypto_shash_exit_tfm+0x28/0x40
   crypto_destroy_tfm+0x98/0x140
   crypto_exit_ahash_using_shash+0x20/0x40
   crypto_destroy_tfm+0x98/0x140
   hash_release+0x1c/0x30
   alg_sock_destruct+0x38/0x60
   __sk_destruct+0x48/0x2b0
   af_alg_release+0x58/0xb0
   __sock_release+0x68/0x150
   sock_close+0x20/0x40
   __fput+0x110/0x3a0
   sys_close+0x48/0xa0
   system_call_exception+0x140/0x2d0
   system_call_common+0xf4/0x258

.. which came from hardlink(1) opportunistically using AF_ALG.

The same problem exists with nx_crypto_ctx_skcipher_exit getting a context
it wasn't expecting, but apparently nobody hit that for years.

Cc: Eric Biggers <ebiggers@kernel.org>
Cc: stable@vger.kernel.org
Fixes: bfd9efddf990 ("crypto: nx - convert AES-ECB to skcipher API")
Fixes: 9420e628e7d8 ("crypto: nx - Use API partial block handling")
Acked-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reported-by: Calvin Buckley <calvin@cmpct.info>
Tested-by: Calvin Buckley <calvin@cmpct.info>
Suggested-by: Brad Spengler <brad.spengler@opensrcsec.com>
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agodt-bindings: crypto: qcom,inline-crypto-engine: Document Hawi ICE
Manivannan Sadhasivam [Thu, 21 May 2026 12:36:21 +0000 (12:36 +0000)] 
dt-bindings: crypto: qcom,inline-crypto-engine: Document Hawi ICE

The Inline Crypto Engine found in Hawi SoC is compatible with the common
baseline IP 'qcom,inline-crypto-engine'. Hence, document the compatible as
such.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agodt-bindings: crypto: qcom,prng: Document Hawi TRNG
Manivannan Sadhasivam [Thu, 21 May 2026 12:36:20 +0000 (12:36 +0000)] 
dt-bindings: crypto: qcom,prng: Document Hawi TRNG

Hawi SoC has the True Random Number Generator (TRNG) which is compatible
with the baseline IP "qcom,trng". Hence, document the compatible as such.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 days agoMerge patch series "vfs infrastructure for fs-verity support for XFS with post EOF...
Christian Brauner [Thu, 4 Jun 2026 11:47:27 +0000 (13:47 +0200)] 
Merge patch series "vfs infrastructure for fs-verity support for XFS with post EOF merkle tree"

Christian Brauner <brauner@kernel.org> says:

This brings in the vfs infrastructure required to implement fs-verity
support for XFS.

* patches from https://patch.msgid.link/20260520123722.405752-1-aalbersh@kernel.org:
  iomap: introduce iomap_fsverity_write() for writing fsverity metadata
  iomap: teach iomap to read files with fsverity
  iomap: introduce IOMAP_F_FSVERITY and teach writeback to handle fsverity
  fsverity: generate and store zero-block hash

Link: https://patch.msgid.link/20260520123722.405752-1-aalbersh@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agoiomap: introduce iomap_fsverity_write() for writing fsverity metadata
Andrey Albershteyn [Wed, 20 May 2026 12:37:07 +0000 (14:37 +0200)] 
iomap: introduce iomap_fsverity_write() for writing fsverity metadata

This is just a wrapper around iomap_file_buffered_write() to create
necessary iterator over metadata.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260520123722.405752-10-aalbersh@kernel.org
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agoiomap: teach iomap to read files with fsverity
Andrey Albershteyn [Wed, 20 May 2026 12:37:06 +0000 (14:37 +0200)] 
iomap: teach iomap to read files with fsverity

Obtain fsverity info for folios with file data and fsverity metadata.
Filesystem can pass vi down to ioend and then to fsverity for
verification. This is different from other filesystems ext4, f2fs, btrfs
supporting fsverity, these filesystems don't need fsverity_info for
reading fsverity metadata. While reading merkle tree iomap requires
fsverity info to synthesize hashes for zeroed data block.

fsverity metadata has two kinds of holes - ones in merkle tree and one
after fsverity descriptor.

Merkle tree holes are blocks full of hashes of zeroed data blocks. These
are not stored on the disk but synthesized on the fly. This saves a bit
of space for sparse files. Due to this iomap also need to lookup
fsverity_info for folios with fsverity metadata. ->vi has a hash of the
zeroed data block which will be used to fill the merkle tree block.

The hole past descriptor is interpreted as end of metadata region. As we
don't have EOF here we use this hole as an indication that rest of the
folio is empty. This patch marks rest of the folio beyond fsverity
descriptor as uptodate.

For file data, fsverity needs to verify consistency of the whole file
against the root hash, hashes of holes are included in the merkle tree.
Verify them too.

Issue reading of fsverity merkle tree on the fsverity inodes. This way
metadata will be available at I/O completion time.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260520123722.405752-9-aalbersh@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agoiomap: introduce IOMAP_F_FSVERITY and teach writeback to handle fsverity
Andrey Albershteyn [Wed, 20 May 2026 12:37:05 +0000 (14:37 +0200)] 
iomap: introduce IOMAP_F_FSVERITY and teach writeback to handle fsverity

This flag indicates that I/O is for fsverity metadata.

In the write path skip i_size check and i_size updates as metadata is
past EOF. In writeback don't update i_size and continue writeback if
even folio is beyond EOF. In read path don't zero fsverity folios, again
they are past EOF.

The iomap_block_needs_zeroing() is also called from write path. For
folios of larger order we don't want to zero out pages in the folio as
these could contain other merkle tree blocks. For fsverity, filesystem
will request to read PAGE_SIZE memory regions. For data folios, iomap
will zero the rest of the folio for anything which is beyond EOF. We
don't want this for fsverity folios.

Christian Brauner <brauner@kernel.org> says:
Changed IOMAP_F_FSVERITY from (1U << 10) to (1U << 11) to avoid colliding
with IOMAP_F_ZERO_TAIL, which already uses (1U << 10).

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260520123722.405752-8-aalbersh@kernel.org
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agofsverity: generate and store zero-block hash
Andrey Albershteyn [Wed, 20 May 2026 12:37:02 +0000 (14:37 +0200)] 
fsverity: generate and store zero-block hash

Compute the hash of one filesystem block's worth of zeros. A filesystem
implementation can decide to elide merkle tree blocks containing only
this hash and synthesize the contents at read time.

Let's pretend that there's a file containing 131 data block and whose
merkle tree looks roughly like this:

root
 +--leaf0
 |   +--data0
 |   +--data1
 |   +--...
 |   `--data128
 `--leaf1
     +--data129
     +--data130
     `--data131

If data[0-128] are sparse holes, then leaf0 will contain a repeating
sequence of @zero_digest.  Therefore, leaf0 need not be written to disk
because its contents can be synthesized.

A subsequent xfs patch will use this to reduce the size of the merkle
tree when dealing with sparse gold master disk images and the like.

Note that this works only on the first-level (data holes). fsverity
doesn't store/generate zero_digest for any higher levels.

Add a helper to pre-fill folio with hashes of empty blocks. This will be
used by iomap to synthesize blocks full of zero hashes on the fly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260520123722.405752-5-aalbersh@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agonetfilter: nf_conntrack_helper: dynamically allocate struct nf_conntrack_helper
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:09 +0000 (08:21 +0200)] 
netfilter: nf_conntrack_helper: dynamically allocate struct nf_conntrack_helper

Adapt all existing helpers to use a modified version of
nf_ct_helper_init(), to dynamically allocate struct nf_conntrack_helper.

Allocate expect_policy[] built-in into the helper to ensure this area is
reachable after helper removal since a follow up patch adds refcount to
track use of the nf_conntrack_helper structure from packet path so it
remains around until last reference from ct helper extension is dropped.

Export __nf_conntrack_helper_register() which allows to register
nfnetlink_cthelper dynamically allocated helper. Adapt nfnetlink_cthelper
to use the built-in expect_policy[].

This is a preparation patch to add packet path refcounting to helpers.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agoio_uring/net: inherit IORING_CQE_F_BUF_MORE across bundle recv retries
Clément Léger [Thu, 4 Jun 2026 16:07:13 +0000 (09:07 -0700)] 
io_uring/net: inherit IORING_CQE_F_BUF_MORE across bundle recv retries

When a bundle recv retries inside io_recv_finish(), the merge logic OR
the saved cflags from the previous iteration with the cflags returned by
the new iteration:
  cflags = req->cqe.flags | (cflags & CQE_F_MASK);

Bits listed in CQE_F_MASK are inherited from the new iteration, and all
other bits (notably IORING_CQE_F_BUFFER and the buffer ID) come from the
saved cflags. Before this change CQE_F_MASK covered only
IORING_CQE_F_SOCK_NONEMPTY and IORING_CQE_F_MORE.

When using provided buffer rings (IOU_PBUF_RING_INC) with incremental
mode, and bundle recv, io_kbuf_inc_commit() can leave the head ring
entry partially consumed, __io_put_kbufs() then sets
IORING_CQE_F_BUF_MORE on the returned cflags so userspace knows the
buffer ID will be reused for subsequent completions.

Because IORING_CQE_F_BUF_MORE was not in CQE_F_MASK, the merge above
silently dropped it whenever the final retry iteration partially
consumed the buffer, and the subsequent req->cqe.flags = cflags &
~CQE_F_MASK save would have left a stale IORING_CQE_F_BUF_MORE in the
carried-over cflags had one been present. Userspace would then
wrongfully advance it ring head past an entry the kernel still uses.

Add IORING_CQE_F_BUF_MORE to CQE_F_MASK so it is both inherited from the
new iteration into the user-visible CQE and stripped from the saved
cflags between iterations.

Cc: stable@vger.kernel.org
Signed-off-by: Clément Léger <cleger@meta.com>
Assisted-by: Claude:claude-opus-4.6
Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://patch.msgid.link/20260604160715.2482972-1-cleger@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 days agoxfrm: espintcp: do not reuse an in-progress partial send
Wyatt Feng [Tue, 2 Jun 2026 16:46:27 +0000 (00:46 +0800)] 
xfrm: espintcp: do not reuse an in-progress partial send

espintcp keeps a single in-flight transmit in ctx->partial.
Before building a new sk_msg, espintcp_sendmsg() first tries to flush
that state through espintcp_push_msgs().

For blocking callers, espintcp_push_msgs() may return success even when
the previous partial send is still pending. espintcp_sendmsg() would
then reinitialize emsg->skmsg and reuse ctx->partial while the old
transfer still owns that state.

Do not rebuild the send message when ctx->partial is still in progress.
If espintcp_push_msgs() returns with emsg->len still set, fail the new
send instead of overwriting the live partial state.

This is a memory-safety fix: reusing the live partial-send state can
leave a stale offset attached to a new sk_msg and lead to an out-of-
bounds read in the send path.

tcp_sendmsg_locked() already handles waiting for send buffer memory, so
the fix here is just to preserve espintcp's one-message-at-a-time
transmit state.

Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
12 days agoMerge tag 'nvme-7.2-2026-06-04' of git://git.infradead.org/nvme into for-7.2/block
Jens Axboe [Fri, 5 Jun 2026 11:18:58 +0000 (05:18 -0600)] 
Merge tag 'nvme-7.2-2026-06-04' of git://git.infradead.org/nvme into for-7.2/block

Pull NVMe updates from Keith:

"- Per-controller timeouts
 - Multipath telemetry
 - Namespace format validation
 - Various other fixes"

* tag 'nvme-7.2-2026-06-04' of git://git.infradead.org/nvme: (34 commits)
  nvme: export controller reconnect event count via sysfs
  nvme: export controller reset event count via sysfs
  nvme: export I/O failure count when no path is available via sysfs
  nvme: export I/O requeue count when no path is usable via sysfs
  nvme: export command error counters via sysfs
  nvme: export multipath failover count via sysfs
  nvme: export command retry count via sysfs
  nvme: add diag attribute group under sysfs
  nvme-tcp: lockdep: use dynamic lockdep keys per socket instance
  nvme-tcp: move nvme_tcp_reclassify_socket()
  nvme: validate FDP configuration descriptor sizes
  nvmet-auth: validate reply message payload bounds against transfer length
  nvme: refresh multipath head zoned limits from path limits
  nvme: fix FDP fdpcidx bounds check
  nvme-tcp: Use WQ_PERCPU explicitly if wq_unbound is false.
  nvmet: fix pre-auth out-of-bounds heap read in Discovery Get Log Page
  nvme-multipath: set BIO_REMAPPED on bios remapped to per-path namespace disks
  nvme-multipath: require exact iopolicy names for module parameter
  nvme-multipath: pass NS head to nvme_mpath_revalidate_paths()
  nvme-pci: fix out-of-bounds access in nvme_setup_descriptor_pools
  ...

12 days agoALSA: usb-audio: qcom: Initialize offload control return value
Cássio Gabriel [Fri, 5 Jun 2026 04:14:40 +0000 (01:14 -0300)] 
ALSA: usb-audio: qcom: Initialize offload control return value

snd_usb_offload_create_ctl() returns ret after walking the USB PCM list,
but ret is only assigned after a playback stream passes the endpoint and
PCM-index filters.

If all playback streams are skipped, for example because there is no
playback endpoint or because all PCM indexes exceed the 0xff control
range, the function returns an uninitialized stack value.

Initialize ret to 0 so the no-control-created path returns deterministic
success, while preserving the existing negative error return when
snd_ctl_add() fails.

Fixes: a67656f011d1 ("ALSA: usb-audio: qcom: Add USB offload route kcontrol")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260605-alsa-usb-qcom-offload-ret-init-v1-1-dc72fcc4bd3b@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 days agonetfilter: cttimeout: detach dataplane timeout policy and repurpose refcount
Pablo Neira Ayuso [Thu, 4 Jun 2026 06:21:08 +0000 (08:21 +0200)] 
netfilter: cttimeout: detach dataplane timeout policy and repurpose refcount

Add a refcount for struct nf_ct_timeout which is used by ct extension to
set the custom ct timeout policy, this tells us that the ct timeout is
being used by a conntrack entry. When the last conntrack entry drops the
refcount on the ct timeout, the ct timeout is released.

Remove the refcount for control plane which controls if the ruleset
refers to the timeout policy. After this update, it is possible to
remove the ct timeout policy from nfnetlink_cttimeout immediately.
This is for simplicity not to handle two refcounts on a single object.

Remove nf_queue_nf_hook_drop(): a packet sitting in nfqueue will just
hold a reference to the nf_ct_timeout object until packet is reinjected,
since this is part of the ct extension, this will be released by the
time the conntrack is freed.

nf_ct_untimeout() is still called to clean up in a best effort basis:
the ct timeout on existing entries gets removed when the ct timeout goes
away, but as long as the iptables ruleset still refers to the ct timeout
through a template, new conntracks may keep attaching it and extend its
lifetime until the rule is removed.

nf_ct_untimeout() is not called anymore from module removal path, this
is unlikely to find timeouts give module refcount is bumped, and the new
refcount already tracks the ct timeout policy use so it is released when
unused.

Fixes: 50978462300f ("netfilter: add cttimeout infrastructure for fine timeout tuning")
Fixes: 7e0b2b57f01d ("netfilter: nft_ct: add ct timeout support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: synproxy: protect nf_ct_seqadj_init() with conntrack lock
Fernando Fernandez Mancera [Tue, 26 May 2026 21:58:30 +0000 (23:58 +0200)] 
netfilter: synproxy: protect nf_ct_seqadj_init() with conntrack lock

nf_ct_seqadj_init() is called without holding the ct lock. This can race
with nf_ct_seq_adjust() when a connection is in CLOSE state due to an
RST or connection reopening. In addition for SYN_RECV state, concurrent
processing of packets can trigger nf_ct_seq_adjust() too. These
situations create a read/write data race.

As synproxy is the only user of nf_ct_seqadj_init() at the moment, fix
this by holding ct->lock inside nf_ct_seqadj_init() until all is done.

Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: synproxy: fix unaligned memory access in timestamp adjustment
Fernando Fernandez Mancera [Tue, 26 May 2026 21:58:29 +0000 (23:58 +0200)] 
netfilter: synproxy: fix unaligned memory access in timestamp adjustment

Use get_unaligned_be32() and put_unaligned_be32() to safely read and
write the timestamp fields. This prevents performance degradation due to
unaligned memory access or even a crash on strict alignment
architectures.

This follows the implementation of timestamp parsing in the networking
stack at tcp_parse_options() and synproxy_parse_options().

Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: synproxy: adjust duplicate timestamp options
Fernando Fernandez Mancera [Tue, 26 May 2026 21:58:28 +0000 (23:58 +0200)] 
netfilter: synproxy: adjust duplicate timestamp options

RFC 9293 does not mention anything about duplicated options and each
networking stack handles it in their own way. Currently, Linux kernel is
processing options sequentially and in case of duplicated timestamp
options, the value from the latest one overrides the others.

As SYNPROXY is modifying only the first timestamp option found, a packet
can reach the backend server and it might parse the wrong timestamp
value. Let's just continue parsing the following options and in case a
duplicated timestamp is found, adjust it too.

Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: synproxy: drop packets if timestamp adjustment fails
Fernando Fernandez Mancera [Tue, 26 May 2026 21:58:27 +0000 (23:58 +0200)] 
netfilter: synproxy: drop packets if timestamp adjustment fails

If a packet was malformed or if skb_ensure_writable() failed, the
synproxy_tstamp_adjust() function returned 0 indicating an error but it
was ignored on the callers.

Make the function return a boolean instead to clarify the result and
drop the packet if synproxy_tstamp_adjust() failed due to ENOMEM from
skb_ensure_writable(). In addition, if there are malformed options, skip
the tstamp update but do not drop the packet as that should be done by
the policy directly.

Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: nfnetlink_cthelper: use {READ,WRITE}_ONCE for accessing helper flags
Pablo Neira Ayuso [Tue, 26 May 2026 16:40:44 +0000 (18:40 +0200)] 
netfilter: nfnetlink_cthelper: use {READ,WRITE}_ONCE for accessing helper flags

Conntrack helper flags are accessed from packet and netlink dump path.
Concurrent update of userspace helper flags is not possible, because the
nfnl_mutex in held on updates. These flags are only used by userspace
helpers. Use {READ,WRITE}_ONCE() to access this flags from lockless
paths.

Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agonetfilter: nfnetlink_osf: fix mss parsing on big-endian architectures
Fernando Fernandez Mancera [Mon, 25 May 2026 15:35:54 +0000 (17:35 +0200)] 
netfilter: nfnetlink_osf: fix mss parsing on big-endian architectures

The MSS calculation in nf_osf_match_one() manually shifts bytes to
construct a 16-bit value before passing it to ntohs().

This works on little-endian hosts but it does not work on big-endian as
the bytes are being always shifted and set in the same way for all
architectures.

Use get_unaligned_be16() to fix this on big-endian systems. It also
simplifies the code.

Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agoipvs: add conn_max sysctl to limit connections
Julian Anastasov [Mon, 25 May 2026 03:54:39 +0000 (06:54 +0300)] 
ipvs: add conn_max sysctl to limit connections

Currently, we are using atomic_t to track the number of
connections. On 64-bit setups with large memory there is
a risk this counter to overflow. Also, setups with many
containers may need to tune the limit for connections.

Add sysctl control to limit the number of connections to
1,073,741,824 (64-bit) and 16,777,216 (32-bit).
Depending on the admin's privilege, the value is
used to change a soft or hard limit allowing
unprivileged admins to change the soft limit in
range determined by privileged admins.

Link: https://sashiko.dev/#/patchset/20260523172715.94795-1-ja%40ssi.bg
Link: https://sashiko.dev/#/patchset/20260430074420.26697-7-ja%40ssi.bg
Link: https://sashiko.dev/#/patchset/20260522105546.13732-1-ja%40ssi.bg
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 days agoxfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state()
Tristan Madani [Tue, 2 Jun 2026 17:16:41 +0000 (17:16 +0000)] 
xfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state()

iptfs_destroy_state() calls hrtimer_cancel() while holding a spinlock
that the timer callback also acquires, leading to an ABBA deadlock on
SMP systems.

For the output timer (iptfs_timer):
  - iptfs_destroy_state() holds x->lock, calls hrtimer_cancel()
  - iptfs_delay_timer() callback takes x->lock

For the drop timer (drop_timer):
  - iptfs_destroy_state() holds drop_lock, calls hrtimer_cancel()
  - iptfs_drop_timer() callback takes drop_lock

Both timers use HRTIMER_MODE_REL_SOFT, so their callbacks run in softirq
context.  When hrtimer_cancel() is called for a soft timer that is
currently executing on another CPU, hrtimer_cancel_wait_running() spins
on softirq_expiry_lock -- the same lock held by the softirq running the
callback.  If the callback is blocked waiting for the spinlock held by
the caller of hrtimer_cancel(), a circular dependency forms:

  CPU 0: holds lock_A -> waits for softirq_expiry_lock
  CPU 1: holds softirq_expiry_lock -> waits for lock_A

Fix by calling hrtimer_cancel() before acquiring the respective locks.
hrtimer_cancel() is safe to call without holding any lock and will wait
for any in-progress callback to complete.  For the output timer, the
lock is still acquired afterwards to drain the packet queue.  For the
drop timer, the lock/unlock pair is removed entirely since it only
existed to serialize with the timer callback, which hrtimer_cancel()
already guarantees.

Found by source code audit.

Fixes: 4b3faf610cc6 ("xfrm: iptfs: add new iptfs xfrm mode impl")
Cc: Christian Hopps <chopps@labn.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
12 days agoarm64: arch_timer: reuse arch_timer_read_cnt{p,v}ct_el0() helpers
Breno Leitao [Sat, 23 May 2026 16:59:26 +0000 (12:59 -0400)] 
arm64: arch_timer: reuse arch_timer_read_cnt{p,v}ct_el0() helpers

__arch_counter_get_cntpct() and __arch_counter_get_cntvct() open-code
the same ECV-aware ALTERNATIVE block that arch_timer_read_cntpct_el0()
and arch_timer_read_cntvct_el0() already provide in the same header.
The two pairs are byte-for-byte identical except for the trailing
arch_counter_enforce_ordering() the __arch_counter_get_* variants add.

Replace the duplicated inline assembly in __arch_counter_get_cntpct()
and __arch_counter_get_cntvct() with calls to the corresponding helpers.
This mirrors commit 00b39d150986 ("arm64: vdso: Use
__arch_counter_get_cntvct()"), which removed similar duplication from
the vDSO, and keeps the system-counter read sequence in a single place,
reducing assembly code in the kernell

No functional change: the resulting inline assembly, alternatives, and
clobbers are unchanged; only the source-level expression of the read
moves into the existing helper.

Verified by rebuilding the consumers of these helpers before and after
the change and comparing the resulting disassembly:

  - arch/arm64/kernel/vdso/vdso.so (final linked vDSO):
      bit-identical (same sha256 across rebuilds)
  - arch/arm64/kernel/vdso/vgettimeofday.o:    identical disassembly
  - arch/arm64/lib/delay.o:                    identical disassembly
  - drivers/clocksource/arm_arch_timer.o:      same 50 functions with
      byte-identical instruction streams; only difference is function
      ordering inside .text and NOP padding, with no opcodes added or
      removed.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoKVM: arm64: Reassign nested_mmus array behind mmu_lock
Hyunwoo Kim [Fri, 5 Jun 2026 08:27:01 +0000 (17:27 +0900)] 
KVM: arm64: Reassign nested_mmus array behind mmu_lock

kvm->arch.nested_mmus[] is walked under kvm->mmu_lock, including from the
MMU notifier path (kvm_unmap_gfn_range() -> kvm_nested_s2_unmap()), which
can run at any time. kvm_vcpu_init_nested() reallocates the array and frees
the old buffer while holding only kvm->arch.config_lock, so such a walker
can reference the freed array.

Allocate the new array outside of mmu_lock, as the allocation can sleep.
Under the lock, copy the existing entries, fix up the back pointers and
reassign the array. Free the old buffer after dropping the lock, as
kvfree() can sleep as well.

Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/aiKIVVeIr1aAB1yp@v4bel
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger,kernel.org
12 days agoKVM: arm64: Restore POR_EL0 access to host EL0
Joey Gouly [Thu, 4 Jun 2026 10:54:34 +0000 (11:54 +0100)] 
KVM: arm64: Restore POR_EL0 access to host EL0

CPTR_EL2.E0POE was being cleared in __deactivate_cptr_traps_vhe(), which meant
that any accesses to POR_EL0 from host EL0 would trap and be reported to
userspace as an Illegal instruction. This would happen after running any VM,
regardless if it used POE or not.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Link: https://sashiko.dev/#/patchset/20260602155430.2088142-1-maz@kernel.org?part=1
Link: https://patch.msgid.link/20260604105434.2297268-1-joey.gouly@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger,kernel.org
12 days agoarm64/mm: Rename ptdesc_t
Anshuman Khandual [Wed, 20 May 2026 06:34:17 +0000 (07:34 +0100)] 
arm64/mm: Rename ptdesc_t

ptdesc_t sounds very similar to the core MM struct ptdesc which is actually
the memory descriptor for page table allocations. Hence rename this typedef
element as ptval_t instead for better clarity and separation.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: linux-efi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Suggested-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoarm64: mm: Defer remap of linear alias of data/bss
Ard Biesheuvel [Thu, 4 Jun 2026 15:11:57 +0000 (17:11 +0200)] 
arm64: mm: Defer remap of linear alias of data/bss

Marking the linear alias of data/bss invalid involves calling
set_memory_valid(), which calls split_kernel_leaf_mapping() under the
hood.

On BBML2_NOABORT capable systems, this may result in the need to
allocate page tables at a time when the generic memory allocation APIs
are not yet available, resulting in a splat like

   WARNING: arch/arm64/mm/mmu.c:821 at split_kernel_leaf_mapping+0x15c/0x170, CPU#0: swapper/0
   Modules linked in:
   CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 7.1.0-rc6 #1 PREEMPT(undef)
   pstate: a04000c9 (NzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : split_kernel_leaf_mapping+0x15c/0x170
   lr : update_range_prot+0x40/0x128
   sp : ffffc99ad3863c80
   ...
   Call trace:
    split_kernel_leaf_mapping+0x15c/0x170 (P)
    update_range_prot+0x40/0x128
    set_memory_valid+0x94/0xe0
    mark_linear_data_alias_valid+0x54/0x68
    map_mem+0x1fc/0x240
    paging_init+0x48/0x210
    setup_arch+0x274/0x338
    start_kernel+0x98/0x538
    __primary_switched+0x88/0x98

as reported by CKI automated testing.

So defer the boot-time call to mark_linear_data_alias_valid() to a later
time when page allocations can be made normally.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoKVM: arm64: Omit tag sync on stage-2 mappings of the zero page
Ard Biesheuvel [Thu, 4 Jun 2026 15:11:56 +0000 (17:11 +0200)] 
KVM: arm64: Omit tag sync on stage-2 mappings of the zero page

Commit

   f620d66af316 ("arm64: mte: Do not flag the zero page as PG_mte_tagged")

removed the PG_mte_tagged flag from the zero page, but missed a KVM code
path that may set this flag on the zero page when it is used in a
stage-2 CoW mapping of anonymous memory.

So disregard the zero page explicitly in sanitise_mte_tags().

Fixes: f620d66af316 ("arm64: mte: Do not flag the zero page as PG_mte_tagged")
Cc: stable@vger.kernel.org # 5.10.x
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoarm64: Avoid double evaluation of __ptep_get()
Ard Biesheuvel [Thu, 4 Jun 2026 15:11:55 +0000 (17:11 +0200)] 
arm64: Avoid double evaluation of __ptep_get()

Sashiko warns that the new pte_valid_noncont() macro is used in a manner
where the argument (which performs a READ_ONCE() of the descriptor) is
evaluated twice.

Drop the macro that we just added, and move the check into the newly
added users.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agokasan: Move generic KASAN page tables out of BSS too
Ard Biesheuvel [Thu, 4 Jun 2026 15:11:54 +0000 (17:11 +0200)] 
kasan: Move generic KASAN page tables out of BSS too

Make sure that all KASAN page tables are emitted into the .pgtbl section
(provided that the arch has one - otherwise, fall back to page aligned
BSS)

This is needed because BSS itself is no longer accessible via the linear
map on arm64.

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: kasan-dev@googlegroups.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoarm64: Rename page table BSS section to .bss..pgtbl
Ard Biesheuvel [Thu, 4 Jun 2026 15:11:53 +0000 (17:11 +0200)] 
arm64: Rename page table BSS section to .bss..pgtbl

Rename the .pgdir.bss section to .bss..pgtbl so that the compiler will
notice the leading ".bss" and mark it as NOBITS by default (rather than
PROGBITS, which would take up space in Image binary, forcing all of the
preceding BSS to be emitted into the image as well). This supersedes the
NOLOAD linker directive, which achieves the same thing, and can be
therefore be dropped.

Also, rename .pgdir to .pgtbl to be more generic, as page tables of
various levels will reside here.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
12 days agoRevert "drm/i915/backlight: Remove try_vesa_interface"
Suraj Kandpal [Sun, 17 May 2026 02:47:09 +0000 (08:17 +0530)] 
Revert "drm/i915/backlight: Remove try_vesa_interface"

This reverts commit 40d2f5820951dee818d05c14677277048bd85f9f.

Removing the try_vesa_interface gate caused a backlight regression on
panels whose VBT correctly reports INTEL_BACKLIGHT_DISPLAY_DDI and whose
PWM path is the actual backlight control, but whose DPCD optimistically
advertises DP_EDP_BACKLIGHT_AUX_ENABLE_CAP / _BRIGHTNESS_AUX_SET_CAP.
After the commit such panels silently bind to the VESA AUX backlight
funcs; AUX writes complete but the panel ignores them, leaving
brightness stuck (no-op backlight). Observed on at least KBL and TGL
eDP setups.

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260517024709.1016121-1-suraj.kandpal@intel.com
(cherry picked from commit f30fddb4402313aa5301a74d721638d343395269)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
12 days agox86/process: Convert rdmsr() to rdmsrq() in arch_post_acpi_subsys_init() to address...
HyeongJun An [Thu, 4 Jun 2026 15:00:52 +0000 (00:00 +0900)] 
x86/process: Convert rdmsr() to rdmsrq() in arch_post_acpi_subsys_init() to address W=1 warning

arch_post_acpi_subsys_init() reads MSR_K8_INT_PENDING_MSG with rdmsr()
into a lo/hi pair but only uses the low 32 bits: K8_INTP_C1E_ACTIVE_MASK
(0x18000000) lies entirely within them. The 'hi' half is never consumed,
which triggers a -Wunused-but-set-variable warning under W=1:

  arch/x86/kernel/process.c: In function 'arch_post_acpi_subsys_init':
  arch/x86/kernel/process.c:972:17: warning: variable 'hi' set but not used

Read the full MSR into a single u64 with rdmsrq() and test the mask
against it, dropping the now-unnecessary lo/hi variables.

No functional change intended.

Signed-off-by: HyeongJun An <sammiee5311@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jürgen Groß <jgross@suse.com>
Link: https://patch.msgid.link/20260604150052.3337246-1-sammiee5311@gmail.com
12 days agoKVM: arm64: Take the SRCU lock for page table walks in fault injection and AT emulation
Hyunwoo Kim [Wed, 3 Jun 2026 12:09:33 +0000 (21:09 +0900)] 
KVM: arm64: Take the SRCU lock for page table walks in fault injection and AT emulation

walk_s1() and kvm_walk_nested_s2() expect to be called while holding
kvm->srcu to guard against memslot changes. While this is generally
the case, __kvm_at_s12() and __kvm_find_s1_desc_level() call into the
respective walkers without taking kvm->srcu.

Fix by acquiring kvm->srcu prior to the table walk in both instances.

Cc: stable@vger.kernel.org
Fixes: 50f77dc87f13 ("KVM: arm64: Populate level on S1PTW SEA injection")
Fixes: be04cebf3e78 ("KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}")
Suggested-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/aiAZfdeyanIvP8SD@v4bel
Signed-off-by: Marc Zyngier <maz@kernel.org>
12 days agoKVM: arm64: vgic-its: Drop the translation cache reference only for the erased entry
Hyunwoo Kim [Mon, 1 Jun 2026 14:53:26 +0000 (23:53 +0900)] 
KVM: arm64: vgic-its: Drop the translation cache reference only for the erased entry

vgic_its_invalidate_cache() walks the per-ITS translation cache with
xa_for_each() and drops the cache's reference on each entry with
vgic_put_irq(). It puts the iterated pointer, though, rather than the
value returned by xa_erase().

The function is called from contexts that do not exclude one another: the
ITS command handlers hold its_lock, the GITS_CTLR write path holds
cmd_lock, and the path that clears EnableLPIs in a redistributor's
GICR_CTLR holds neither. Two or more of them can drain the same cache
concurrently, and if each one observes the same entry, erases it and then
puts it, the single reference the cache holds on that entry is dropped
more than once. The entry can then be freed while an ITE still maps it.

xa_erase() is atomic and returns the previous entry, so put only the entry
that this context actually removed. The cache reference is then dropped
exactly once per entry even when the invalidations run concurrently, and
the behavior is unchanged when only one context runs.

Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/ah2c5lu4JbUg7dj-@v4bel
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
12 days agoirqchip/irq-realtek-rtl: Add multicore support
Markus Stockhausen [Thu, 4 Jun 2026 18:25:06 +0000 (20:25 +0200)] 
irqchip/irq-realtek-rtl: Add multicore support

The Realtek interrupt driver currently supports only single core
systems. So the higher end devices like RTL839x and RTL930x with
dual VPEs must be driven with NR_CPU=1. Enhance the driver to
support multicore (dual VPE) systems. For this:

  - Extend the register map for multiple cores
  - Search for multiple CPU cores in the devicetree
  - Improve the register helpers to support multiple cores
  - Add an affinity setter
  - Enhance the IRQ handler for multiple cores

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260604182506.1113440-3-markus.stockhausen@gmx.de
12 days agoirqchip/irq-realtek-rtl: Add/simplify register helpers
Markus Stockhausen [Thu, 4 Jun 2026 18:25:05 +0000 (20:25 +0200)] 
irqchip/irq-realtek-rtl: Add/simplify register helpers

The Realtek interrupt controller has two important registers that are used
by the driver in several places

 - GIMR: global interrupt mask register
 - IRR: Interrupt routing registers

The usage of these registers is very inconsistent. GIMR is addressed
directly while IRR has a helper that needs a macro as an input. Harmonize
this by providing consistent helpers that improve code readability.

The callers of these helpers use classic lock/unlock functions and
sometimes use the wrong locking helper. E.g. irqsave variants are used in
mask/unmask although not needed. Adapt and fix the surrounding call
locations.

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260604182506.1113440-2-markus.stockhausen@gmx.de
12 days agox86/resctrl: Only check Intel systems for SNC
Tony Luck [Fri, 5 Jun 2026 04:46:49 +0000 (21:46 -0700)] 
x86/resctrl: Only check Intel systems for SNC

topology_num_nodes_per_package() reports values greater than one on certain
AMD systems resulting in resctrl's Intel model specific SNC detection
printing the confusing message:

   "CoD enabled system? Resctrl not supported"

Add a check for Intel systems before looking at the topology.

[ reinette: Add Closes tag, fix tag typos, rework changelog ]

Fixes: 59674fc9d0bf ("x86/resctrl: Fix SNC detection")
Reported-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://patch.msgid.link/9849330f45ac86344cc5ac54df2d313906d70bc4.1780634584.git.reinette.chatre@intel.com
Closes: https://lore.kernel.org/lkml/37ac0376-43a3-4283-a3d5-4d57b3bec578@amd.com/
12 days agoMerge tag 'ib-gpio-add-gpiod-is-single-ended-for-v7.2' into i2c/i2c-host
Andi Shyti [Fri, 5 Jun 2026 08:46:58 +0000 (10:46 +0200)] 
Merge tag 'ib-gpio-add-gpiod-is-single-ended-for-v7.2' into i2c/i2c-host

Immutable branch between the GPIO and I2C trees for v7.2-rc1

- add the gpiod_is_single_ended() helper function

12 days agoiomap: introduce IOMAP_F_ZERO_TAIL flag
Namjae Jeon [Mon, 18 May 2026 11:46:55 +0000 (20:46 +0900)] 
iomap: introduce IOMAP_F_ZERO_TAIL flag

In filesystems that maintain a separate Valid Data Length, such as exFAT
and NTFS, a partial write may start at or beyond the current valid_size and
extend it. In this case, the region after the previous valid_size but
within the same filesystem block is considered unwritten.

This patch introduces IOMAP_F_ZERO_TAIL. When this flag is set in iomap,
__iomap_write_begin() will zero only the tail portion while preserving any
valid data before it in the same block.

Without this tail zeroing, stale data in the unwritten portion of the block
can remain in the page cache. Subsequent reads can then return incorrect
contents from that region.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Link: https://patch.msgid.link/20260518114705.9601-2-linkinjeon@kernel.org
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
12 days agorust: ptr: remove implicit index projection syntax
Gary Guo [Tue, 2 Jun 2026 14:17:57 +0000 (15:17 +0100)] 
rust: ptr: remove implicit index projection syntax

All users have been converted to use keyworded index projection syntax to
explicitly state their intention when doing index projection.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-6-6989470f5440@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
12 days agogpu: nova-core: convert to keyworded projection syntax
Gary Guo [Tue, 2 Jun 2026 14:17:56 +0000 (15:17 +0100)] 
gpu: nova-core: convert to keyworded projection syntax

Use "build" to denote that the index bounds checking here is performed at
build time.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-5-6989470f5440@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
12 days agorust: dma: update to keyworded index projection syntax
Gary Guo [Tue, 2 Jun 2026 14:17:55 +0000 (15:17 +0100)] 
rust: dma: update to keyworded index projection syntax

Demonstrate the preferred syntax of index projection in DMA documentation
and examples. A few `[i]?` cases are converted to demonstrate the new
variant.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-4-6989470f5440@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
12 days agorust: ptr: add panicking index projection variant
Gary Guo [Tue, 2 Jun 2026 14:17:54 +0000 (15:17 +0100)] 
rust: ptr: add panicking index projection variant

There have been a few cases where the programmer knows that the indices are
in bounds but the compiler cannot deduce that. This is also
compiler-version-dependent, so using build indexing here can be
problematic. On the other hand, it is also not ideal to use the fallible
variant, as it adds an error handling path that is never hit.

Add a new panicking index projection for this scenario. Like all panicking
operations, this should be used carefully only in cases where the user
knows the index is going to be in bounds, and panicking would indicate
something is catastrophically wrong.

To signify this, require users to explicitly denote the type of index being
used. The existing two types of index projections also gain the keyworded
version, which will be the recommended way going forward.

The keyworded syntax also paves the way of perhaps adding more flavors in
the future, e.g. `unsafe` index projection. However, unless the code is
extremely performance sensitive and bounds checking cannot be tolerated,
the panicking variant is safer and should be preferred, so it will be left
to the future when demand arises.

Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-3-6989470f5440@garyguo.net
[ Fixed broken intra-doc link. Added a few extra intra-doc links. Reworded
  some docs slightly. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
12 days agorust: ptr: use `match` instead of `unwrap_or_else` for `build_index`
Gary Guo [Tue, 2 Jun 2026 14:17:53 +0000 (15:17 +0100)] 
rust: ptr: use `match` instead of `unwrap_or_else` for `build_index`

Use `match` to avoid potential inlining issues of the `unwrap_or_else`
function.

Suggested-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/rust-for-linux/aeCKlut-88SbNsyW@google.com/
Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260602-projection-syntax-rework-v2-2-6989470f5440@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>