SeongJae Park [Wed, 11 Mar 2026 05:29:22 +0000 (22:29 -0700)]
mm/damon/core: fix wrong end address assignment on walk_system_ram()
Patch series "mm/damon: support addr_unit on default monitoring targets
for modules".
DAMON_RECLAIM and DAMON_LRU_SORT support 'addr_unit' parameters only when
the monitoring target address range is explicitly set. This was
intentional, to keep the initial 'addr_unit' support change small. Now
that 'addr_unit' support has stabilized, keeping this corner case only
makes the code inconsistent with implicit rules, and the inconsistency
easily confuses [1] readers. After all, there is no real reason to keep
'addr_unit' support incomplete. Add support for this case to improve
readability and support 'addr_unit' more completely.
This series consists of five patches. The first one (patch 1) fixes a
small bug, found during this work, that mistakenly assigns an inclusive
end address to an open (exclusive) end address. The second and third ones
(patches 2 and 3) extend the default monitoring target setting functions
in the core layer one by one, to support the 'addr_unit' while making no
visible changes. The final two patches (patches 4 and 5) update
DAMON_RECLAIM and DAMON_LRU_SORT to support 'addr_unit' for the default
monitoring target address ranges, by passing the user input to the core
functions.
This patch (of 5):
'struct damon_addr_range' and 'struct resource' represent different types
of address ranges. 'damon_addr_range' is for end-open ranges ([start,
end)). 'resource' is for fully-closed ranges ([start, end]). But
walk_system_ram() is assigning resource->end to damon_addr_range->end
without the inclusiveness adjustment. As a result, the function returns
an address range that is missing its last byte.
The function is being used to find and set the biggest system ram as the
default monitoring target for DAMON_RECLAIM and DAMON_LRU_SORT. Missing
the last byte of such a big range shouldn't be a problem for real use
cases. That said, the loss is definitely unintended behavior. Do the
correct adjustment.
mm/mremap: check map count under mmap write lock and abstract
We are checking the map count in check_mremap_params(), prior to
obtaining an mmap write lock, which means that accesses to
current->mm->map_count might race with this field being updated.
Resolve this by only checking this field after the mmap write lock is held.
Additionally, abstract this check into a helper function with extensive
ASCII documentation of what's going on.
Link: https://lkml.kernel.org/r/18be0b48eaa8e8804eb745974ee729c3ade0c687.1773249037.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reported-by: Jianzhou Zhao <luckd0g@163.com> Closes: https://lore.kernel.org/all/1a7d4c26.6b46.19cdbe7eaf0.Coremail.luckd0g@163.com/ Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: abstract reading sysctl_max_map_count, and READ_ONCE()
Concurrent reads and writes of sysctl_max_map_count are possible, so we
should READ_ONCE() and WRITE_ONCE().
The sysctl procfs logic already enforces WRITE_ONCE(), so abstract the
read side with get_sysctl_max_map_count().
While we're here, also move the field to mm/internal.h and add the getter
there since only mm interacts with it, there's no need for anybody else to
have access.
Finally, update the VMA userland tests to reflect the change.
Link: https://lkml.kernel.org/r/0715259eb37cbdfde4f9e5db92a20ec7110a1ce5.1773249037.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Jann Horn <jannh@google.com> Cc: Jianzhou Zhao <luckd0g@163.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Firstly, in mremap(), it appears that our map count checks have been
overly conservative - there is simply no reason to require headroom of 4
mappings prior to moving the VMA; we have only needed headroom of 2 VMAs
since commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
temporarily exceeded in munmap()").
Likely the original headroom of 4 mappings was a mistake, and 3 was
actually intended.
Next, we access sysctl_max_map_count in a number of places without being
all that careful about how we do so.
We introduce a simple helper that READ_ONCE()'s the field
(get_sysctl_max_map_count()) to ensure that the field is accessed
correctly. The WRITE_ONCE() side is already handled by the sysctl procfs
code in proc_int_conv().
We also move this field to internal.h as there's no reason for anybody
else to access it outside of mm. Unfortunately we have to maintain the
extern variable, as mmap.c implements the procfs code.
Finally, we are accessing current->mm->map_count without holding the mmap
write lock, which is also not correct, so this series ensures the lock is
held before we access it.
We also abstract the check to a helper function, and add ASCII diagrams to
explain why we're doing what we're doing.
This patch (of 3):
We currently check, when moving a VMA in mremap(), whether doing so might
violate the vm.max_map_count sysctl limit.
This was introduced in the mists of time prior to 2.6.12.
At this point in time, as now, the move_vma() operation would copy the VMA
(+1 mapping if not merged), then potentially split the source VMA upon
unmap.
Prior to commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
temporarily exceeded in munmap()"), a VMA split would check whether
mm->map_count >= sysctl_max_map_count before it ran.
On unmap of the source VMA, if we are moving a partial VMA, we might split
the VMA twice.
This would mean, on invocation of split_vma() (as was), we'd check whether
mm->map_count >= sysctl_max_map_count with a map count elevated by one,
then again with a map count elevated by two, ending up with a map count
elevated by three.
At this point we'd reduce the map count on unmap.
At the start of move_vma(), there was a check that has remained throughout
mremap()'s history of mm->map_count >= sysctl_max_map_count - 3 (which
implies mm->map_count + 4 > sysctl_max_map_count - that is, we must have
headroom for 4 additional mappings).
After mm->map_count is elevated by 3, it is decremented by one once the
unmap completes. The mmap write lock is held, so nothing else will observe
mm->map_count > sysctl_max_map_count.
It appears this check was always incorrect - it should have been either
'mm->map_count > sysctl_max_map_count - 3' or 'mm->map_count >=
sysctl_max_map_count - 2'.
After commit 659ace584e7a ("mmap: don't return ENOMEM when mapcount is
temporarily exceeded in munmap()"), the map count check on split was
eliminated from the newly introduced __split_vma(), which the unmap path
uses; instead, the unmap path itself checks whether mm->map_count >=
sysctl_max_map_count.
This is valid since, net, an unmap can only cause an increase in map count
of 1 (split both sides, unmap middle).
Since we only copy a VMA and (if MREMAP_DONTUNMAP is not set) unmap
afterwards, the maximum number of additional mappings that will actually be
subject to any check will be 2.
Therefore, update the check to assert this corrected value. Additionally,
update the check introduced by commit ea2c3f6f5545 ("mm,mremap: bail out
earlier in mremap_to under map pressure") to account for this.
While we're here, clean up the comment prior to that.
Link: https://lkml.kernel.org/r/cover.1773249037.git.ljs@kernel.org Link: https://lkml.kernel.org/r/73e218c67dcd197c5331840fb011e2c17155bfb0.1773249037.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Jianzhou Zhao <luckd0g@163.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kexin Sun [Thu, 12 Mar 2026 05:38:12 +0000 (13:38 +0800)]
kasan: update outdated comment
kmalloc_large() was renamed kmalloc_large_noprof() by commit 7bd230a26648
("mm/slab: enable slab allocation tagging for kmalloc and friends"), and
subsequently renamed __kmalloc_large_noprof() by commit a0a44d9175b3 ("mm,
slab: don't wrap internal functions with alloc_hooks()"), making it an
internal implementation detail.
Large kmalloc allocations are now performed through the public kmalloc()
interface directly, making the reference to KMALLOC_MAX_SIZE also stale
(KMALLOC_MAX_CACHE_SIZE would be more accurate). Remove the references to
kmalloc_large() and KMALLOC_MAX_SIZE, and rephrase the description for
large kmalloc allocations.
Usama Arif [Thu, 12 Mar 2026 10:47:23 +0000 (03:47 -0700)]
mm: migrate: requeue destination folio on deferred split queue
During folio migration, __folio_migrate_mapping() removes the source folio
from the deferred split queue, but the destination folio is never
re-queued. This causes underutilized THPs to escape the shrinker after
NUMA migration, since they silently drop off the deferred split list.
Fix this by recording whether the source folio was on the deferred split
queue and its partially mapped state before move_to_new_folio() unqueues
it, and re-queuing the destination folio after a successful migration if
it was.
By the time migrate_folio_move() runs, partially mapped folios without a
pin have already been split by migrate_pages_batch(). So only two cases
remain on the deferred list at this point:
1. Partially mapped folios with a pin (split failed).
2. Fully mapped but potentially underused folios.
The recorded partially_mapped state is forwarded to
deferred_split_folio() so that the destination folio is correctly
re-queued in both cases.
Without this fix, because THPs are removed from the deferred_list, the
THP shrinker cannot split the underutilized THPs in time. As a result,
users will see less free memory than before.
Link: https://lkml.kernel.org/r/20260312104723.1351321-1-usama.arif@linux.dev Fixes: dafff3f4c850 ("mm: split underused THPs") Signed-off-by: Usama Arif <usama.arif@linux.dev> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: SeongJae Park <sj@kernel.org> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nico Pache <npache@redhat.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Waiman Long [Wed, 11 Mar 2026 20:05:26 +0000 (16:05 -0400)]
selftest: memcg: skip memcg_sock test if address family not supported
The test_memcg_sock test in memcontrol.c sets up an IPv6 socket and sends
data over it to consume memory, then verifies that the memory.stat.sock
and memory.current values are close.
On systems where IPv6 isn't enabled or isn't configured to support
SOCK_STREAM, the test_memcg_sock test always fails. When the socket()
call fails, there is no way we can test the memory consumption and verify
the above claim. I believe it is better to just skip the test in this
case instead of reporting a test failure hinting that there may be
something wrong with the memcg code.
Link: https://lkml.kernel.org/r/20260311200526.885899-1-longman@redhat.com Fixes: 5f8f019380b8 ("selftests: cgroup/memcontrol: add basic test for socket accounting") Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Michal Koutný <mkoutny@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Tue, 17 Mar 2026 15:33:59 +0000 (08:33 -0700)]
mm: ratelimit min_free_kbytes adjustment messages
The "raising min_free_kbytes" pr_info message in
set_recommended_min_free_kbytes() and the "min_free_kbytes is not updated
to" pr_warn in calculate_min_free_kbytes() can spam the kernel log when
called repeatedly.
Switch the pr_info in set_recommended_min_free_kbytes() and the pr_warn in
calculate_min_free_kbytes() to their _ratelimited variants to prevent log
spam from these messages.
Link: https://lkml.kernel.org/r/20260317-thp_logs-v7-4-31eb98fa5a8b@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Barry Song <baohua@kernel.org> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Tue, 17 Mar 2026 15:33:58 +0000 (08:33 -0700)]
mm: huge_memory: refactor enabled_store() with set_global_enabled_mode()
Refactor enabled_store() to use a new set_global_enabled_mode() helper.
Introduce a separate enum global_enabled_mode and
global_enabled_mode_strings[], mirroring the anon_enabled_mode pattern
from the previous commit.
A separate enum is necessary because the global THP setting does not
support "inherit", only "always", "madvise", and "never". Reusing
anon_enabled_mode would leave a NULL gap in the string array, causing
sysfs_match_string() to stop early and fail to match entries after the
gap.
The helper uses the same loop pattern as set_anon_enabled_mode(),
iterating over an array of flag bit positions and using
test_and_set_bit()/test_and_clear_bit() to track whether the state
actually changed.
Link: https://lkml.kernel.org/r/20260317-thp_logs-v7-3-31eb98fa5a8b@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Barry Song <baohua@kernel.org> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Tue, 17 Mar 2026 15:33:57 +0000 (08:33 -0700)]
mm: huge_memory: refactor anon_enabled_store() with set_anon_enabled_mode()
Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
anon_enabled_store() into a new set_anon_enabled_mode() helper that loops
over an orders[] array, setting the bit for the selected mode and clearing
the others.
Introduce enum anon_enabled_mode and anon_enabled_mode_strings[] for the
per-order anon THP setting.
Use sysfs_match_string() with the anon_enabled_mode_strings[] table to
replace the if/else chain of sysfs_streq() calls.
The helper uses __test_and_set_bit()/__test_and_clear_bit() to track
whether the state actually changed, so start_stop_khugepaged() is only
called when needed. When the mode is unchanged,
set_recommended_min_free_kbytes() is called directly to preserve the
watermark recalculation behavior of the original code.
Link: https://lkml.kernel.org/r/20260317-thp_logs-v7-2-31eb98fa5a8b@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: thp: reduce unnecessary start_stop_khugepaged()", v7.
Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
changed. start_stop_khugepaged() then spams the printk ring buffer with
the exact same message, even when nothing changes.
For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:
# sysctl -w vm.min_free_kbytes=112382
# for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
and you have 100 WARN messages like the following, which is pretty dull:
khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
A similar message shows up when setting thp to "always":
# for i in $(seq 100); do
# echo 1024 > /proc/sys/vm/min_free_kbytes
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
# done
And then, we have 100 messages like:
khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
This is common when a configuration management system writes the THP
configuration without first reading it, assuming that nothing will happen
if there is no change in the configuration - yet these annoying messages
get printed anyway.
For instance, at Meta's fleet, ~10K servers were producing 3.5M of these
messages per day.
Fix this by making the sysfs _store helpers easier to digest and
ratelimiting the message.
This patch (of 4):
Make set_recommended_min_free_kbytes() callable from outside khugepaged.c
by removing the static qualifier and adding a declaration in
mm/internal.h.
This allows callers that change THP settings to recalculate watermarks
without going through start_stop_khugepaged().
Link: https://lkml.kernel.org/r/20260317-thp_logs-v7-0-31eb98fa5a8b@debian.org Link: https://lkml.kernel.org/r/20260317-thp_logs-v7-1-31eb98fa5a8b@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a new DAMON sysfs interface file, namely 'goal_tuner' under the DAMOS
quotas directory. It is connected to the damos_quota->goal_tuner field.
Users can therefore select their favorite goal-based quotas tuning
algorithm by writing the name of the tuner to the file. Reading the file
returns the name of the currently selected tuner.
Introduce a new goal-based DAMOS quota auto-tuning algorithm, namely
DAMOS_QUOTA_GOAL_TUNER_TEMPORAL ('temporal' for short). The algorithm
aims to trigger the DAMOS action only for a limited period of time, to
achieve the goal as soon as possible. During that period, it uses as
much quota as allowed. Once the goal is achieved, it sets the quota to
zero, effectively deactivating the scheme.
SeongJae Park [Tue, 10 Mar 2026 01:05:18 +0000 (18:05 -0700)]
mm/damon/core: allow quota goals set zero effective size quota
A zero value for user-explicit quotas (the size and time quotas) means
the quotas are unset. The effective size quota is set to the minimum of
the explicit quotas, and when quota goals are set, the goal-based quota
tuner can lower it further. But the only existing tuner never sets the
effective size quota to zero, so DAMON core assumes a zero effective
quota means the user has set no quota.
Multiple tuners are now allowed, though, and in the future some tuners
might want to set a zero effective size quota; there is no reason to
restrict that. With the current implementation, however, a zero effective
quota would deactivate all quotas and make the scheme run at full speed.
Introduce a dedicated function for checking whether no quota is set,
based on whether the user-set explicit quotas are zero and no goal is
installed. Since the check is decoupled from the effective quota, future
tuners can set a zero effective quota to deactivate the scheme on
purpose.
SeongJae Park [Tue, 10 Mar 2026 01:05:17 +0000 (18:05 -0700)]
mm/damon/core: introduce damos_quota_goal_tuner
Patch series "mm/damon: support multiple goal-based quota tuning
algorithms".
Aim-oriented DAMOS quota auto-tuning uses a single tuning algorithm. The
algorithm is designed to find a quota value that can be kept consistently
to achieve the aimed goal over the long term. It is useful and reliable
for automatically operating systems in dynamic environments.
As always, however, no single algorithm fits all. When the environment
has static characteristics, or when control towers exist not only in the
kernel space but also in the user space, the algorithm shows some
limitations. In such environments, users want the kernel to work in a
more short-term, deterministic way. There have actually been at least two
reports [1,2] of such cases.
Extend the DAMOS quota goal to support multiple quota tuning algorithms
that users can select from. Keep the current algorithm as the default, so
as not to break existing users, and give it a name, "consist", as it is
designed to "consistently" apply the DAMOS action. Also introduce a new
tuning algorithm, namely "temporal", designed to apply the DAMOS action
only temporarily, in a deterministic way. In more detail, as long as the
goal is under-achieved, it uses the maximum quota available. Once the
goal is over-achieved, it sets the quota to zero.
Tests
=====
I confirmed the feature is working as expected using the latest version of
DAMON user-space tool, like below.
Note that DAMON user-space tool versions >= 3.1.8 support this feature
(--damos_quota_goal_tuner). As expected, with the 'temporal' tuner, DAMOS
stops reclaiming memory as soon as the goal amount of free memory is
reached. When the 'consist' tuner is used, reclamation continues even
after the goal amount of free memory is reached, resulting in more than
the goal amount of free memory, as expected.
Patch Sequence
==============
The first four patches implement the feature. Patch 1 extends the core
API to allow multiple tuners and makes the current tuner the default and
only available one, namely 'consist'. Patch 2 allows future tuners to set
a zero effective quota. Patch 3 introduces the second tuner, namely
'temporal'. Patch 4 further extends the DAMON sysfs API to let users use
that.
The following three patches (patches 5-7) update the design, usage, and
ABI documents, respectively.
The final four patches (patches 8-11) add tests. The eighth patch
(patch 8) extends the kunit test for online parameters commit for
validating the goal_tuner. The ninth and the tenth patches (patches 9-10)
extend the testing-purpose DAMON sysfs control helper and DAMON status
dumping tool to support the newly added feature. The final eleventh one
(patch 11) extends the existing online commit selftest to cover the new
feature.
This patch (of 11):
The DAMOS quota goal feature uses a single feedback-loop-based algorithm
for automatic tuning of the effective quota. It is useful in dynamic
environments where systems are operated by only the kernel over the long
term. But no one size fits all. The algorithm is not easy to control in
environments having more controlled characteristics and user-space
control towers. We actually received multiple reports [1,2] of use cases
for which the algorithm is not optimal.
Introduce a new field of 'struct damos_quota', namely 'goal_tuner'. It
specifies which tuning algorithm the given scheme should use, and allows
DAMON API callers to set it as they want. Nonetheless, this commit
introduces no new tuning algorithm, only the interface, and hence makes
no behavioral change. A new algorithm will be added by the following
commit.
Hui Zhu [Tue, 10 Mar 2026 01:56:57 +0000 (09:56 +0800)]
mm/swap: strengthen locking assertions and invariants in cluster allocation
swap_cluster_alloc_table() requires several locks to be held by its
callers: ci->lock, the per-CPU swap_cluster lock, and, for non-solid-state
devices (non-SWP_SOLIDSTATE), the si->global_cluster_lock.
While most call paths (e.g., via cluster_alloc_swap_entry() or
alloc_swap_scan_list()) correctly acquire these locks before invocation,
the path through swap_reclaim_work() -> swap_reclaim_full_clusters() ->
isolate_lock_cluster() is distinct. This path operates exclusively on
si->full_clusters, where the swap allocation tables are guaranteed to be
already allocated. Consequently, isolate_lock_cluster() should never
trigger a call to swap_cluster_alloc_table() for these clusters.
Strengthen the locking and state assertions to formalize these invariants:
1. Add a lockdep_assert_held() for si->global_cluster_lock in
swap_cluster_alloc_table() for non-SWP_SOLIDSTATE devices.
2. Reorder existing lockdep assertions in swap_cluster_alloc_table() to
match the actual lock acquisition order (per-CPU lock, then global lock,
then cluster lock).
3. Add a VM_WARN_ON_ONCE() in isolate_lock_cluster() to ensure that table
allocations are only attempted for clusters being isolated from the
free list. Attempting to allocate a table for a cluster from other
lists (like the full list during reclaim) indicates a violation of
subsystem invariants.
These changes ensure locking consistency and help catch potential
synchronization or logic issues during development.
Anthony Yznaga [Tue, 10 Mar 2026 15:58:20 +0000 (08:58 -0700)]
mm: prevent droppable mappings from being locked
Droppable mappings must not be lockable. There is a check for VMAs with
VM_DROPPABLE set in mlock_fixup() along with checks for other types of
unlockable VMAs which ensures this when calling mlock()/mlock2().
For mlockall(MCL_FUTURE), the check for unlockable VMAs is different. In
apply_mlockall_flags(), if the flags parameter has MCL_FUTURE set, the
current task's mm's default VMA flag field mm->def_flags has VM_LOCKED
applied to it. VM_LOCKONFAULT is also applied if MCL_ONFAULT is also set.
When these flags are set as default in this manner they are cleared in
__mmap_complete() for new mappings that do not support mlock. A check
for VM_DROPPABLE in __mmap_complete() is missing, resulting in droppable
mappings being created with VM_LOCKED set. To fix this and reduce the
chance of similar bugs in the future, introduce and use
vma_supports_mlock().
Link: https://lkml.kernel.org/r/20260310155821.17869-1-anthony.yznaga@oracle.com Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Suggested-by: David Hildenbrand <david@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Tested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
zram: unify and harden algo/priority params handling
We have two functions that accept algo= and priority= params -
algorithm_params_store() and recompress_store(). This patch unifies and
hardens handling of those parameters.
There are 4 possible cases:
- only priority= provided [recommended]
  We need to verify that the provided priority value is within the
  permitted range for each particular function.
- both algo= and priority= provided
  We cannot prioritize one over the other. All we should do is verify
  that zram is configured the way user-space expects it to be, namely
  that zram indeed has compressor algo= set up at the given priority=.
- only algo= provided [not recommended]
  We should look up the priority in the compressors list.
- none provided [not recommended]
  Just use the function's defaults.
Link: https://lkml.kernel.org/r/20260311084312.1766036-7-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Suggested-by: Minchan Kim <minchan@kernel.org> Cc: Brian Geffon <bgeffon@google.com> Cc: gao xu <gaoxu2@honor.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chained recompression has unpredictable behavior and is not useful in
practice.
First, systems usually configure just one alternative recompression
algorithm, which has slower compression/decompression but better
compression ratio. A single alternative algorithm doesn't need chaining.
Second, even with multiple recompression algorithms, chained recompression
is suboptimal. If a lower priority algorithm succeeds, the page is never
attempted with a higher priority algorithm, leading to worse memory
savings. If a lower priority algorithm fails, the page is still attempted
with a higher priority algorithm, wasting resources on the failed lower
priority attempt.
In either case, the system would be better off targeting a specific
priority directly.
Chained recompression also significantly complicates the code. Remove it.
Link: https://lkml.kernel.org/r/20260311084312.1766036-6-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Brian Geffon <bgeffon@google.com> Cc: gao xu <gaoxu2@honor.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Emphasize usage of the `priority` parameter for recompression and explain
why `algo` parameter can lead to unexpected behavior and thus is not
recommended.
Link: https://lkml.kernel.org/r/20260311084312.1766036-5-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Brian Geffon <bgeffon@google.com> Cc: gao xu <gaoxu2@honor.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
It's not entirely correct to use ->num_active_comps as the max-prio limit,
as ->num_active_comps only tells us the number of configured algorithms,
not the max configured priority. For instance, in the following theoretical
example:
[lz4] [nil] [nil] [deflate]
->num_active_comps is 2, while the actual max-prio is 3.
Drop ->num_active_comps and use ZRAM_MAX_COMPS instead.
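The theoretical example above can be reproduced in a standalone sketch
(the slot count and helper names are made up for illustration; only the
counting logic matters):

```c
#include <assert.h>
#include <stddef.h>

#define ZRAM_MAX_COMPS 4	/* assumed slot count for this sketch */

/* What ->num_active_comps reported: the number of configured algorithms. */
static int num_active_comps(const char *comps[ZRAM_MAX_COMPS])
{
	int n = 0;

	for (int i = 0; i < ZRAM_MAX_COMPS; i++)
		if (comps[i])
			n++;
	return n;
}

/* The actual max configured priority: the highest non-empty slot. */
static int max_prio(const char *comps[ZRAM_MAX_COMPS])
{
	int max = -1;

	for (int i = 0; i < ZRAM_MAX_COMPS; i++)
		if (comps[i])
			max = i;
	return max;
}
```

With the [lz4] [nil] [nil] [deflate] layout, num_active_comps() returns 2
while max_prio() returns 3, which is why bounding priorities by the former
is wrong.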
Link: https://lkml.kernel.org/r/20260311084312.1766036-4-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Suggested-by: Minchan Kim <minchan@kernel.org> Cc: Brian Geffon <bgeffon@google.com> Cc: gao xu <gaoxu2@honor.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "zram: recompression cleanups and tweaks", v2.
This series is a somewhat random mix of fixups, recompression cleanups and
improvements partly based on internal conversations. A few patches in the
series remove unexpected or confusing behaviour, e.g. the auto-correction
of a bad priority= param for recompression, which should always have been
just an error. The series also removes "chain recompression", which at
times has tricky, unexpected and confusing behaviour. We also unify and harden the
handling of algo/priority params. There is also an addition of missing
device lock in algorithm_params_store() which previously permitted
modification of algo params while the device is active.
This patch (of 6):
First, algorithm_params_store(), like any sysfs handler, should grab
device lock.
Second, like any write() sysfs handler, it should grab device lock in
exclusive mode.
Third, it should not permit changes to algorithms' parameters after device
init, as this doesn't make sense: we cannot compress with one C/D dict and
then simply switch to a different C/D dict, for example.
Another thing to notice is that algorithm_params_store() accesses device's
->comp_algs for algo priority lookup, which should be protected by device
lock in exclusive mode in general.
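A rough userspace model of the three rules, with made-up names
(struct model_dev, the 'claimed' flag); the real code takes the zram
device lock in exclusive mode rather than tracking a lock depth:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_dev {
	bool claimed;		/* set once the device is initialized */
	int lock_depth;		/* stand-in for the exclusive device lock */
};

static int algorithm_params_store(struct model_dev *dev, int new_param)
{
	int ret = 0;

	dev->lock_depth++;	/* rules 1 + 2: take the lock, exclusively */
	if (dev->claimed) {
		ret = -EBUSY;	/* rule 3: no param changes after init */
	} else {
		/* ... apply the parameter while holding the lock ... */
		(void)new_param;
	}
	dev->lock_depth--;	/* release on all paths */
	return ret;
}
```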
Pratyush Yadav [Mon, 9 Mar 2026 12:34:07 +0000 (12:34 +0000)]
kho: drop restriction on maximum page order
KHO currently restricts the maximum order of a restored page to the
maximum order supported by the buddy allocator. While this works fine for
much of the data passed across kexec, it is possible to have pages larger
than MAX_PAGE_ORDER.
For one, it is possible to get a larger order when using
kho_preserve_pages() if the number of pages is large enough, since it
tries to combine multiple aligned 0-order preservations into one higher
order preservation.
For another, upcoming support for hugepages can have gigantic hugepages
being preserved over KHO.
There is no real reason for this limit. The KHO preservation machinery
can handle any page order. Remove this artificial restriction on max page
order.
kho: make sure preservations do not span multiple NUMA nodes
The KHO restoration machinery is not capable of dealing with preservations
that span multiple NUMA nodes. kho_preserve_folio() guarantees the
preservation will only span one NUMA node since folios can't span multiple
nodes.
This leaves kho_preserve_pages(). Semantically, kho_preserve_pages() only
deals with 0-order pages, so all preservations should be single pages. In
practice, however, it combines preservations into higher orders for
efficiency. This can result in a preservation spanning multiple nodes.
Break up the preservations into a smaller order if that happens.
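The order-reduction fallback can be sketched standalone; nid_of_pfn() and
the node boundary at pfn 1024 are invented for illustration:

```c
#include <assert.h>

/* Assumed layout: node 0 owns pfns [0, 1024), node 1 owns the rest. */
static int nid_of_pfn(unsigned long pfn)
{
	return pfn < 1024 ? 0 : 1;
}

/*
 * Largest order <= max_order such that all 2^order pages starting at pfn
 * sit on a single NUMA node, mirroring the break-up described above.
 */
static int max_single_node_order(unsigned long pfn, int max_order)
{
	int order = max_order;

	while (order > 0 &&
	       nid_of_pfn(pfn) != nid_of_pfn(pfn + (1UL << order) - 1))
		order--;	/* shrink until the range stops spanning */
	return order;
}
```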
vma_mmu_pagesize() is also queried on non-hugetlb VMAs and does not really
belong in hugetlb.c.
PPC64 provides a custom override with CONFIG_HUGETLB_PAGE, see
arch/powerpc/mm/book3s64/slice.c, so we cannot easily make this a static
inline function.
So let's move it to vma.c and add some proper kerneldoc.
To make vma tests happy, add a simple vma_kernel_pagesize() stub in
tools/testing/vma/include/custom.h.
Link: https://lkml.kernel.org/r/20260309151901.123947-3-david@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: move vma_kernel_pagesize() from hugetlb to mm.h
Patch series "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c", v2.
Looking into vma_(kernel|mmu)_pagesize(), I realized that there is one
scenario where DAX would not do the right thing when the kernel is not
compiled with hugetlb support.
Without hugetlb support, vma_(kernel|mmu)_pagesize() will always return
PAGE_SIZE instead of using the ->pagesize() result provided by dax-device
code.
Fix that by moving vma_kernel_pagesize() to core MM code, where it
belongs. I don't think this is stable material, but am not 100% sure.
Also, move vma_mmu_pagesize() while at it. Remove the unnecessary
hugetlb.h inclusion from KVM code.
This patch (of 4):
In the past, only hugetlb had special "vma_kernel_pagesize()"
requirements, so it provided its own implementation.
In commit 05ea88608d4e ("mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct") we generalized that approach by providing a
vm_ops->pagesize() callback to be used by device-dax.
Once device-dax started using that callback in commit c1d53b92b95c
("device-dax: implement ->pagesize() for smaps to report MMUPageSize") it
was missed that CONFIG_DEV_DAX does not depend on hugetlb support.
So building a kernel with CONFIG_DEV_DAX but without CONFIG_HUGETLBFS
would not pick up that value.
Fix it by moving vma_kernel_pagesize() to mm.h, providing only a single
implementation. While at it, improve the kerneldoc a bit.
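A minimal sketch of the generic behaviour described here, with stubbed
types (an illustration, not the exact mm.h code): prefer the driver's
->pagesize() callback, else fall back to PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumed base page size */

struct vm_area_struct;

struct vm_operations_struct {
	unsigned long (*pagesize)(struct vm_area_struct *vma);
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

/*
 * Single generic implementation: use the driver-provided ->pagesize()
 * callback (hugetlbfs, device-dax) when there is one, else PAGE_SIZE.
 */
static unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
{
	if (vma->vm_ops && vma->vm_ops->pagesize)
		return vma->vm_ops->pagesize(vma);
	return PAGE_SIZE;
}

/* A device-dax-like mapping backed by 2 MiB pages. */
static unsigned long dax_pagesize(struct vm_area_struct *vma)
{
	(void)vma;
	return 2UL << 20;
}
```

With the old hugetlb-only implementation, a CONFIG_DEV_DAX kernel without
hugetlb support would take the PAGE_SIZE branch even for the dax-like VMA.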
Ideally, we'd move vma_mmu_pagesize() to the header as well. However, its
__weak symbol might be overridden by a PPC variant in hugetlb code. So
let's leave it in there for now, as it really only matters for some
hugetlb oddities.
This was found by code inspection.
Link: https://lkml.kernel.org/r/20260309151901.123947-1-david@kernel.org Link: https://lkml.kernel.org/r/20260309151901.123947-2-david@kernel.org Fixes: c1d53b92b95c ("device-dax: implement ->pagesize() for smaps to report MMUPageSize") Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Akinobu Mita [Tue, 10 Mar 2026 15:18:37 +0000 (00:18 +0900)]
docs: mm: fix typo in numa_memory_policy.rst
Fix a typo: MPOL_INTERLEAVED -> MPOL_INTERLEAVE.
Link: https://lkml.kernel.org/r/20260310151837.5888-1-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 7 Mar 2026 19:53:54 +0000 (11:53 -0800)]
Docs/mm/damon/maintainer-profile: use flexible review cadence
The document mentions that the maintainer works in the usual 9-5 fashion.
The maintainer nowadays prefers working in a more flexible way. Update
the document so that contributors do not form wrong expectations about
response times.
Link: https://lkml.kernel.org/r/20260307195356.203753-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Acked-by: wang lian <lianux.mm@gmail.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 7 Mar 2026 19:53:51 +0000 (11:53 -0800)]
mm/damon/core: clarify damon_set_attrs() usages
damon_set_attrs() is called for multiple purposes from multiple places.
Calling it in an unsafe context can pollute DAMON's internal state and
result in unexpected behaviors. Clarify when it is safe to call, and
where it is being called.
Link: https://lkml.kernel.org/r/20260307195356.203753-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Acked-by: wang lian <lianux.mm@gmail.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 7 Mar 2026 19:53:49 +0000 (11:53 -0800)]
mm/damon/core: use mult_frac()
Patch series "mm/damon: improve/fixup/update ratio calculation, test and
documentation".
Yet another batch of misc/minor improvements and fixups. Use mult_frac()
instead of open-coded arithmetic for rate calculations (patch 1). Add a
test for a previously found and fixed bug (patch 2). Improve and update
comments and documentations for easier code review and up-to-date
information (patches 3-6). Finally, fix an obvious typo (patch 7).
This patch (of 7):
There are multiple places in the core code that open-code rate
calculations. Use mult_frac(), which was made for doing exactly that, and
is safer against overflow and precision loss.
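mult_frac()'s idea can be shown as a plain C function (mult_frac_ul() here
is a sketch of what the kernel macro does, not the macro itself): split x
by the divisor first, so the multiplication never sees the full x * n
product, while the result stays exact.

```c
#include <assert.h>

/* Compute x * n / d without the intermediate x * n overflowing. */
static unsigned long mult_frac_ul(unsigned long x, unsigned long n,
				  unsigned long d)
{
	unsigned long quot = x / d;
	unsigned long rem = x % d;

	return quot * n + rem * n / d;
}
```

Since x = quot * d + rem, quot * n + (rem * n) / d equals (x * n) / d
exactly, and the only product involving a large operand is rem * n with
rem < d.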
Link: https://lkml.kernel.org/r/20260307195356.203753-1-sj@kernel.org Link: https://lkml.kernel.org/r/20260307195356.203753-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Acked-by: wang lian <lianux.mm@gmail.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 7 Mar 2026 19:49:14 +0000 (11:49 -0800)]
mm/damon/core: use time_after_eq() in kdamond_fn()
damon_ctx->passed_sample_intervals and damon_ctx->next_*_sis are unsigned
long. They are compared in kdamond_fn() using plain comparison operators,
which are not overflow-safe. Use time_after_eq() instead, which is safe
against overflow when correctly used.
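The pattern behind time_after_eq() can be shown in standalone form
(counter_after_eq() is illustrative; the real helpers live in
include/linux/jiffies.h): the unsigned subtraction wraps, and the signed
cast makes nearby values compare correctly across an overflow.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Overflow-safe "a >= b" for a free-running unsigned counter: a - b
 * wraps modulo 2^BITS, and interpreting the difference as signed gives
 * the right answer as long as a and b are within half the range.
 */
static bool counter_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}
```

A plain `a >= b` would return false for a counter that just wrapped past
zero while the deadline sits near ULONG_MAX; the cast form returns true.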
SeongJae Park [Sat, 7 Mar 2026 19:49:13 +0000 (11:49 -0800)]
mm/damon/core: use time_before() for next_apply_sis
damon_ctx->passed_sample_intervals and damos->next_apply_sis are unsigned
long, and are compared via plain comparison operators, which are not
overflow-safe. Use time_before() instead, which is safe against overflow
when correctly used.
Patch series "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe".
DAMON accounts time using its own jiffies-like time counter, namely
damon_ctx->passed_sample_intervals. The counter is incremented on each
iteration of the kdamond_fn() main loop, which sleeps for at least one
sampling interval; hence the name.
DAMON has time-periodic operations including monitoring results
aggregation and DAMOS action application. DAMON sets the next time to do
each such operation in passed_sample_intervals units. It performs the
operation when the counter becomes equal to or larger than the pre-set
value, then updates the next time for the operation. Note that the
operation is done not only when the values exactly match but also when the
time has passed, because the values can be updated for online-committed
DAMON parameters.
The counter is of 'unsigned long' type, and the comparison is done using
plain comparison operators, which are not overflow-safe. This can cause
rare and limited but odd situations.
Let's suppose there is an operation that should be executed every 20
sampling intervals, and the passed_sample_intervals value for next
execution of the operation is ULONG_MAX - 3. Once the
passed_sample_intervals reaches ULONG_MAX - 3, the operation will be
executed, and the next time value for the operation becomes 16
(ULONG_MAX - 3 + 20 wraps around), since overflow happens. In the next iteration of
the kdamond_fn() main loop, passed_sample_intervals is larger than the
next operation time value, so the operation will be executed again. It
will continue executing the operation for each iteration, until the
passed_sample_intervals also overflows.
Note that this will not be common or problematic in the real world. The
sampling interval, which is how long each passed_sample_intervals
increment takes, is 5 ms by default, and is usually [auto-]tuned to
hundreds of milliseconds. That means it takes about 248 days or 4,971
days to hit the overflow on 32 bit machines when the sampling interval is
5 ms or 100 ms, respectively (1<<32 * sampling_interval_in_seconds / 3600
/ 24). On 64 bit machines, the numbers become 2924712086.77536 and
58494241735.5072 years. So the real user impact is negligible. Still,
this is worth fixing as long as the fix is simple and efficient.
Fix this by simply replacing the overflow-unsafe native comparison
operators with the existing overflow-safe time comparison helpers.
The first patch only cleans up the next DAMOS action application time
setup, for consistency and less code. The second and third patches update
the DAMOS action application time setup and the rest, respectively.
This patch (of 3):
There is a helper function for damos->next_apply_sis setup, but some
places open-code it. Consistently use the helper.
SeongJae Park [Sat, 7 Mar 2026 19:42:21 +0000 (11:42 -0800)]
Docs/mm/damon/design: document the power-of-two limitation for addr_unit
The min_region_sz is set as max(DAMON_MIN_REGION_SZ / addr_unit, 1).
DAMON_MIN_REGION_SZ is the same as PAGE_SIZE, and addr_unit is a value the
user can arbitrarily set. Commit c80f46ac228b ("mm/damon/core: disallow
non-power of two min_region_sz") made min_region_sz always a power of
two. Hence, addr_unit must be a power of two when it is smaller than
PAGE_SIZE. Although 'addr_unit' is a user-exposed parameter, the rule is not
documented. This can confuse users. Specifically, if the user sets
addr_unit as a value that is smaller than PAGE_SIZE and not a power of
two, the setup will explicitly fail.
Document the rule on the design document. Usage documents reference the
design document for detail, so updating only the design document should
suffice.
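The rule can be checked with a standalone sketch (the helper names are
invented for illustration; DAMON_MIN_REGION_SZ is taken to equal
PAGE_SIZE, as stated above):

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE 4096UL
#define DAMON_MIN_REGION_SZ PAGE_SIZE	/* same as PAGE_SIZE, per above */

static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1), which must be
 * a power of two.  The check fails e.g. for addr_unit values that are
 * smaller than PAGE_SIZE and not a power of two.
 */
static bool min_region_sz(unsigned long addr_unit, unsigned long *out)
{
	unsigned long sz = DAMON_MIN_REGION_SZ / addr_unit;

	if (sz == 0)
		sz = 1;
	if (!is_power_of_2(sz))
		return false;	/* the setup explicitly fails */
	*out = sz;
	return true;
}
```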
Link: https://lkml.kernel.org/r/20260307194222.202075-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Sat, 7 Mar 2026 19:42:20 +0000 (11:42 -0800)]
mm/damon/tests/core-kunit: add a test for damon_commit_ctx()
Patch series "mm/damon: test and document power-of-2 min_region_sz
requirement".
Since commit c80f46ac228b ("mm/damon/core: disallow non-power of two
min_region_sz"), min_region_sz is always restricted to be a power of two.
Add a kunit test to confirm the functionality. Also, the change adds a
restriction to addr_unit parameter. Clarify it on the document.
This patch (of 2):
Add a kunit test confirming that the change made in commit c80f46ac228b
("mm/damon/core: disallow non-power of two min_region_sz") functions as
expected.
Link: https://lkml.kernel.org/r/20260307194222.202075-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
By the time damon_reset_aggregated() runs, aggregation for the interval
should be complete, and hence nr_accesses and nr_accesses_bp should match.
A few bugs broke this in the past, from online parameter updates and
complicated nr_accesses handling changes. Add a sanity check for that
under CONFIG_DAMON_DEBUG_SANITY.
Link: https://lkml.kernel.org/r/20260306152914.86303-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
damon_merge_regions_of() should be called only after aggregation is
finished and therefore each region's nr_accesses and nr_accesses_bp match.
There were bugs that broke this assumption during development of online
DAMON parameter updates and monitoring results handling changes. Add a
sanity check for that under CONFIG_DAMON_DEBUG_SANITY.
Link: https://lkml.kernel.org/r/20260306152914.86303-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A data corruption could cause damon_merge_two_regions() to create
zero-length DAMON regions. Add a sanity check for that under
CONFIG_DAMON_DEBUG_SANITY.
Link: https://lkml.kernel.org/r/20260306152914.86303-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
damon_target->nr_regions was introduced to get the number of regions
quickly, without always having to iterate over them. Add a sanity check
for that under
CONFIG_DAMON_DEBUG_SANITY.
Link: https://lkml.kernel.org/r/20260306152914.86303-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <davidgow@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Fri, 6 Mar 2026 15:29:04 +0000 (07:29 -0800)]
mm/damon: add CONFIG_DAMON_DEBUG_SANITY
Patch series "mm/damon: add optional debugging-purpose sanity checks".
DAMON code has a few assumptions that can be critical if violated.
Validating the assumptions in code can be useful for finding such critical
bugs. I was actually adding some such additional sanity checks in my
personal tree, and they were useful for finding bugs that I introduced
during the development of new patches. We also found [1] that the
assumptions are sometimes misunderstood. The validation can work as good
documentation for such cases.
Add some such debugging-purpose sanity checks. Because these additional
checks can impose extra overhead, make them optional via a new config,
CONFIG_DAMON_DEBUG_SANITY, which is recommended only for development and
test setups. Accordingly, enable it for DAMON kunit tests and selftests.
Note that the verification only does a WARN_ON() for each violation.
Developers or testers may want to also set panic_on_oops, as
damon-tests/corr did [2].
This patch (of 10):
Add a new build config that enables additional DAMON sanity checks. It is
recommended to enable it only on development and test setups, since it can
impose additional overhead.
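A userspace sketch of the pattern such a config enables, with hypothetical
names (damon_warn_on() is not the kernel macro): the check compiles away
unless the debug option is set, and only warns rather than aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define DAMON_DEBUG_SANITY 1	/* only development/test builds set this */

#if DAMON_DEBUG_SANITY
/* WARN_ON-like: report the violation and yield it, but do not abort. */
#define damon_warn_on(cond) \
	((cond) ? (fprintf(stderr, "sanity: %s\n", #cond), true) : false)
#else
#define damon_warn_on(cond) (false)
#endif

/* Example check: a cached counter must match the freshly counted value. */
static int checked_nr_regions(int counted, int cached)
{
	if (damon_warn_on(counted != cached))
		return counted;	/* prefer ground truth on mismatch */
	return cached;
}
```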
Usama Arif [Mon, 9 Mar 2026 21:25:02 +0000 (14:25 -0700)]
mm/migrate_device: document folio_get requirement before frozen PMD split
split_huge_pmd_address() with freeze=true splits a PMD migration entry
into PTE migration entries, consuming one folio reference in the process.
The folio_get() before it provides this reference.
Add a comment explaining this relationship. The expected folio refcount
at the start of migrate_vma_split_unmapped_folio() is 1.
Link: https://lkml.kernel.org/r/20260309212502.3922825-1-usama.arif@linux.dev Signed-off-by: Usama Arif <usama.arif@linux.dev> Suggested-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Nico Pache <npache@redhat.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Arnd Bergmann [Fri, 6 Mar 2026 15:05:49 +0000 (16:05 +0100)]
ubsan: turn off kmsan inside of ubsan instrumentation
The structure initialization in the two type mismatch handling functions
causes a call to __msan_memset() to be generated inside of a UACCESS
block, which in turn leads to an objtool warning about possibly leaking
uaccess-enabled state:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch+0xda: call to __msan_memset() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1+0xf4: call to __msan_memset() with UACCESS enabled
Most likely __msan_memset() is safe to call here and could be added to the
uaccess_safe_builtin[] list of safe functions. But seeing that the ubsan
file already disables kasan, ubsan and kcsan instrumentation for itself,
it is probably a good idea to also turn off kmsan here. In particular,
this also avoids the risk of recursion between the ubsan and kmsan checks
in other functions of this file.
I saw this happen while testing randconfig builds with clang-22, but did
not try older versions, or attempt to see which kernel change introduced
the warning.
Byungchul Park [Tue, 24 Feb 2026 05:13:47 +0000 (14:13 +0900)]
mm: introduce a new page type for page pool in page type
Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
determine if a page belongs to a page pool. However, with the planned
removal of @pp_magic, we should instead leverage the page_type in struct
page, such as PGTY_netpp, for this purpose.
Introduce and use the page type APIs e.g. PageNetpp(), __SetPageNetpp(),
and __ClearPageNetpp() instead, and remove the existing APIs accessing
@pp_magic e.g. page_pool_page_is_pp(), netmem_or_pp_magic(), and
netmem_clear_pp_magic().
Plus, add @page_type to struct net_iov at the same offset as struct page
so as to use the page_type APIs for struct net_iov as well. While at it,
reorder @type and @owner in struct net_iov to avoid a hole and increasing
the struct size.
Chengkaitao [Sun, 1 Feb 2026 06:35:31 +0000 (14:35 +0800)]
sparc: use vmemmap_populate_hugepages for vmemmap_populate
Change sparc's implementation of vmemmap_populate() using
vmemmap_populate_hugepages() to streamline the code. Another benefit is
that it allows us to eliminate the external declarations of
vmemmap_p?d_populate functions and convert them to static functions.
Note that vmemmap_populate_hugepages() may fall back to
vmemmap_populate_basepages(), which differs from sparc's original
implementation. Per the v1 discussion with Mike Rapoport, sparc uses base
pages in the kernel page tables, so it should be able to use them in
vmemmap as well. Consequently, no additional special handling is
required.
1. In the SPARC architecture, reimplement vmemmap_populate() using
vmemmap_populate_hugepages().
2. Allow the SPARC arch to fall back to vmemmap_populate_basepages()
when vmemmap_alloc_block() returns NULL.
Link: https://lkml.kernel.org/r/20260201063532.44807-2-pilgrimtao@gmail.com Signed-off-by: Chengkaitao <chengkaitao@kylinos.cn> Tested-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Andreas Larsson <andreas@gaisler.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
tools/testing/vma: add test for vma_flags_test(), vma_desc_test()
Now that we have helpers which test singular VMA flags - vma_flags_test()
and vma_desc_test() - add tests to explicitly assert that they behave as
expected.
[ljs@kernel.org: test_vma_flags_test(): use struct initializer, per David] Link: https://lkml.kernel.org/r/f6f396d2-1ba2-426f-b756-d8cc5985cc7c@lucifer.local Link: https://lkml.kernel.org/r/376a39eb9e134d2c8ab10e32720dd292970b080a.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: reintroduce vma_desc_test() as a singular flag test
Similar to vma_flags_test(), we previously renamed vma_desc_test() to
vma_desc_test_any(). With that in place, we can reintroduce
vma_desc_test() to explicitly check for a single VMA flag.
As with vma_flags_test(), this is useful as often flag tests are against a
single flag, and vma_desc_test_any(flags, VMA_READ_BIT) reads oddly and
potentially causes confusion.
As with vma_flags_test() a combination of sparse and vma_flags_t being a
struct means that users cannot misuse this function without it getting
flagged.
Also update the VMA tests to reflect this change.
Link: https://lkml.kernel.org/r/3a65ca23defb05060333f0586428fe279a484564.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: reintroduce vma_flags_test() as a singular flag test
Since we've now renamed vma_flags_test() to vma_flags_test_any() to be
very clear as to what we are in fact testing, we now have the opportunity
to bring vma_flags_test() back, but for explicitly testing a single VMA
flag.
This is useful, as often flag tests are against a single flag, and
vma_flags_test_any(flags, VMA_READ_BIT) reads oddly and potentially causes
confusion.
We use sparse so that users who accidentally pass vm_flags_t to this
function get flagged, which should make it harder to get this wrong.
Of course, passing vma_flags_t to the function is impossible, as it is a
struct.
Also update the VMA tests to reflect this change.
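The singular-test idea can be modelled in plain C (flag names, bit values,
and the struct layout are illustrative, not the kernel's): wrapping the
flags word in a struct makes it impossible to pass a raw unsigned long by
accident, and the singular variant reads naturally at call sites.

```c
#include <assert.h>
#include <stdbool.h>

/* Struct wrapper: a bare unsigned long cannot be passed here by mistake. */
typedef struct { unsigned long word; } vma_flags_t;

enum { VMA_READ_BIT = 0, VMA_WRITE_BIT = 1, VMA_EXEC_BIT = 2 };

/* Singular test: exactly one bit. */
static bool vma_flags_test(vma_flags_t flags, int bit)
{
	return (flags.word & (1UL << bit)) != 0;
}

/* Any-of test: kept for genuine multi-flag masks. */
static bool vma_flags_test_any(vma_flags_t flags, unsigned long mask)
{
	return (flags.word & mask) != 0;
}
```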
Link: https://lkml.kernel.org/r/f33f8d7f16c3f3d286a1dc2cba12c23683073134.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: always inline __mk_vma_flags() and invoked functions
Be explicit about __mk_vma_flags() (which is used by the mk_vma_flags()
macro) always being inline, as we rely on the compiler to evaluate the
loop in this function and determine that it can replace the code with an
equivalent constant value.
Most likely an 'inline' will suffice for this, but be as explicit as we
can.
Also update all of the functions __mk_vma_flags() ultimately invokes to be
always inline too.
Note that test_bitmap_const_eval() asserts that the relevant bitmap
functions result in build time constant values.
Additionally, vma_flag_set() operates on a vma_flags_t type, so it is
inconsistently named versus other VMA flags functions.
We only use vma_flag_set() in __mk_vma_flags() so we don't need to worry
about its new name being rather cumbersome, so rename it to
vma_flags_set_flag() to disambiguate it from vma_flags_set().
Also update the VMA test headers to reflect the changes.
Link: https://lkml.kernel.org/r/241f49c52074d436edbb9c6a6662a8dc142a8f43.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
erofs and zonefs are using vma_desc_test_any() twice to check whether all
of VMA_SHARED_BIT and VMA_MAYWRITE_BIT are set. This is silly, so add
vma_desc_test_all() to test all flags, and update erofs and zonefs to use
it.
While we're here, update the helper function comments to be more
consistent.
Also add the same to the VMA test headers.
Link: https://lkml.kernel.org/r/568c8f8d6a84ff64014f997517cba7a629f7eed6.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The ongoing work around introducing non-system word VMA flags has
introduced a number of helper functions and macros to make life easier
when working with these flags and to make conversions from the legacy use
of VM_xxx flags more straightforward.
This series improves these to reduce confusion as to what they do and to
improve consistency and readability.
Firstly the series renames vma_flags_test() to vma_flags_test_any() to
make it abundantly clear that this function tests whether any of the flags
are set (as opposed to vma_flags_test_all()).
It then renames vma_desc_test_flags() to vma_desc_test_any() for the same
reason. Note that we drop the 'flags' suffix here, as
vma_desc_test_any_flags() would be cumbersome and 'test' implies a flag
test.
Similarly, we rename vma_test_all_flags() to vma_test_all() for
consistency.
Next, we have a couple of instances (erofs, zonefs) where we are now
testing for vma_desc_test_any(desc, VMA_SHARED_BIT) &&
vma_desc_test_any(desc, VMA_MAYWRITE_BIT).
This is silly, so this series introduces vma_desc_test_all() so these
callers can instead invoke vma_desc_test_all(desc, VMA_SHARED_BIT,
VMA_MAYWRITE_BIT).
We then observe that quite a few instances of vma_flags_test_any() and
vma_desc_test_any() are in fact only testing against a single flag.
Using the _any() variant here is just confusing - 'any' of a single item
reads strangely and is liable to cause confusion.
So in these instances the series reintroduces vma_flags_test() and
vma_desc_test() as helpers which test against a single flag.
The fact that vma_flags_t is a struct and that vma_flag_t utilises sparse
to avoid confusion with vm_flags_t makes it impossible for a user to
misuse these helpers without it getting flagged somewhere.
The series also updates __mk_vma_flags() and functions invoked by it to
explicitly mark them always inline to match expectation and to be
consistent with other VMA flag helpers.
It also renames vma_flag_set() to vma_flags_set_flag() (a function only
used by __mk_vma_flags()) to be consistent with other VMA flag helpers.
Finally it updates the VMA tests for each of these changes, and introduces
explicit tests for vma_flags_test() and vma_desc_test() to assert that
they behave as expected.
This patch (of 6):
On reflection, it's confusing to have vma_flags_test() and
vma_desc_test_flags() test whether any comma-separated VMA flag bit is
set, while also having vma_flags_test_all() and vma_test_all_flags()
separately test whether all flags are set.
Firstly, rename vma_flags_test() to vma_flags_test_any() to eliminate this
confusion.
Secondly, since the VMA descriptor flag functions are becoming rather
cumbersome, prefer vma_desc_test*() to vma_desc_test_flags*(), and also
rename vma_desc_test_flags() to vma_desc_test_any().
Finally, rename vma_test_all_flags() to vma_test_all() to keep the
VMA-specific helper consistent with the VMA descriptor naming convention
and to help avoid confusion vs. vma_flags_test_all().
While we're here, also update whitespace to be consistent in helper
functions.
Link: https://lkml.kernel.org/r/cover.1772704455.git.ljs@kernel.org Link: https://lkml.kernel.org/r/0f9cb3c511c478344fac0b3b3b0300bb95be95e9.1772704455.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Suggested-by: Pedro Falcato <pfalcato@suse.de> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Chao Yu <chao@kernel.org> Cc: Chatre, Reinette <reinette.chatre@intel.com> Cc: Chunhai Guo <guochunhai@vivo.com> Cc: Damien Le Maol <dlemoal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hongbo Li <lihongbo22@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morse <james.morse@arm.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naohiro Aota <naohiro.aota@wdc.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Yue Hu <zbestahu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrey Ryabinin [Thu, 5 Mar 2026 18:56:59 +0000 (19:56 +0100)]
kasan: fix bug type classification for SW_TAGS mode
kasan_non_canonical_hook() derives orig_addr from kasan_shadow_to_mem(),
but the pointer tag may remain in the top byte. In SW_TAGS mode this
tagged address is compared against PAGE_SIZE and TASK_SIZE, which leads to
incorrect bug classification.
As a result, NULL pointer dereferences may be reported as
"wild-memory-access".
Strip the tag before performing these range checks and use the untagged
value when reporting addresses in these ranges.
Before:
[ ] Unable to handle kernel paging request at virtual address ffef800000000000
[ ] KASAN: maybe wild-memory-access in range [0xff00000000000000-0xff0000000000000f]
After:
[ ] Unable to handle kernel paging request at virtual address ffef800000000000
[ ] KASAN: null-ptr-deref in range [0x0000000000000000-0x000000000000000f]
Link: https://lkml.kernel.org/r/20260305185659.20807-1-ryabinin.a.a@gmail.com Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bing Jiao [Tue, 3 Mar 2026 05:25:17 +0000 (05:25 +0000)]
mm/vmscan: fix unintended mtc->nmask mutation in alloc_demote_folio()
In alloc_demote_folio(), mtc->nmask is set to NULL for the first
allocation. If that succeeds, it returns without restoring mtc->nmask to
allowed_mask. For subsequent allocations from the migrate_pages() batch,
mtc->nmask will be NULL. If the target node then becomes full, the
fallback allocation will use nmask = NULL, allocating from any node
allowed by the task cpuset, which for kswapd is all nodes.
To address this issue, use a local copy of the mtc structure with nmask =
NULL for the first allocation attempt specifically, ensuring the original
mtc remains unmodified.
Link: https://lkml.kernel.org/r/20260303052519.109244-1-bingjiao@google.com Fixes: 320080272892 ("mm/demotion: demote pages according to allocation fallback order") Signed-off-by: Bing Jiao <bingjiao@google.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yuvraj Sakshith [Tue, 3 Mar 2026 11:30:31 +0000 (03:30 -0800)]
mm/page_reporting: change PAGE_REPORTING_ORDER_UNSPECIFIED to -1
PAGE_REPORTING_ORDER_UNSPECIFIED is currently set to zero. This means
pages of order zero cannot be reported to a client/driver, as zero is used
to signal a fallback to MAX_PAGE_ORDER.
Change PAGE_REPORTING_ORDER_UNSPECIFIED to (-1), so that zero can be used
as a valid order with which pages can be reported.
Patch series "Allow order zero pages in page reporting", v4.
Today, page reporting sets page_reporting_order in two ways:
(1) page_reporting.page_reporting_order cmdline parameter
(2) Driver can pass order while registering itself.
In both cases, order zero is ignored by free page reporting because it is
used to set page_reporting_order to a default value, like MAX_PAGE_ORDER.
In some cases we might want page_reporting_order to be zero.
For instance, when virtio-balloon runs inside a guest with tiny memory
(say, 16MB), it might not be able to find an order-1 page (or, in the
worst case, an order-MAX_PAGE_ORDER page) after some uptime. Page
reporting should be able to return order zero pages for optimal memory
relinquishment.
This series changes the default fallback value from '0' to '-1' in all
possible clients of free page reporting (hv_balloon and virtio-balloon)
together with allowing '0' as a valid order in page_reporting_register().
This patch (of 5):
Drivers can pass the order of pages to be reported when registering
themselves. Today, this is a magic number, 0.
Label this with PAGE_REPORTING_ORDER_UNSPECIFIED and check for it when the
driver is being registered.
Johannes Weiner [Mon, 2 Mar 2026 19:50:18 +0000 (14:50 -0500)]
mm: memcg: separate slab stat accounting from objcg charge cache
Cgroup slab metrics are cached per-cpu the same way as the sub-page charge
cache. However, the intertwined code to manage those dependent caches
right now is quite difficult to follow.
Specifically, cached slab stat updates occur in consume() if there was
enough charge cache to satisfy the new object. If that fails, whole pages
are reserved, and slab stats are updated when the remainder of those
pages, after subtracting the size of the new slab object, are put into the
charge cache. This already juggles a delicate mix of the object size, the
page charge size, and the remainder to put into the byte cache. Doing
slab accounting in this path as well is fragile, and has recently caused a
bug where the input parameters between the two caches were mixed up.
Refactor the consume() and refill() paths into unlocked and locked
variants that only do charge caching. Then let the slab path manage its
own lock section and open-code charging and accounting.
This makes the slab stat cache subordinate to the charge cache:
__refill_obj_stock() is called first to prepare it; __account_obj_stock()
follows to hitch a ride.
This results in a minor behavioral change: previously, a mismatching
percpu stock would always be drained for the purpose of setting up slab
account caching, even if there was no byte remainder to put into the
charge cache. Now, the stock is left alone, and slab accounting takes the
uncached path if there is a mismatch. This is exceedingly rare, and it
was probably never worth draining the whole stock just to cache the slab
stat update.
Link: https://lkml.kernel.org/r/20260302195305.620713-6-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: Hao Li <hao.li@linux.dev> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Johannes Weiner <jweiner@meta.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Mon, 2 Mar 2026 19:50:14 +0000 (14:50 -0500)]
mm: memcg: factor out trylock_stock() and unlock_stock()
Patch series "memcg: obj stock and slab stat caching cleanups".
This is a follow-up to `[PATCH] memcg: fix slab accounting in
refill_obj_stock() trylock path`. The way the slab stat cache and the
objcg charge cache interact appears a bit too fragile. This series
factors those paths apart as much as practical.
This patch (of 5):
Consolidate the local lock acquisition and the local stock lookup. This
allows subsequent patches to use !!stock as an easy way to disambiguate
the locked vs. contended cases through the callstack.
Link: https://lkml.kernel.org/r/20260302195305.620713-1-hannes@cmpxchg.org Link: https://lkml.kernel.org/r/20260302195305.620713-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Reviewed-by: Hao Li <hao.li@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Fri, 6 Mar 2026 06:43:42 +0000 (14:43 +0800)]
arm64: mm: implement the architecture-specific test_and_clear_young_ptes()
Implement the Arm64 architecture-specific test_and_clear_young_ptes() to
enable batched checking of young flags, improving performance during large
folio reclamation when MGLRU is enabled.
While we're at it, simplify ptep_test_and_clear_young() by calling
test_and_clear_young_ptes(). Since callers guarantee that PTEs are
present before calling these functions, we can use pte_cont() to check the
CONT_PTE flag instead of pte_valid_cont().
Performance testing:
Enable MGLRU, then allocate 10G clean file-backed folios by mmap() in a
memory cgroup, and try to reclaim 8G file-backed folios via the
memory.reclaim interface. I can observe 60%+ performance improvement on
my Arm64 32-core server (and about 15% improvement on my X86 machine).
W/o patchset:
real 0m0.470s
user 0m0.000s
sys 0m0.470s
W/ patchset:
real 0m0.180s
user 0m0.001s
sys 0m0.179s
Link: https://lkml.kernel.org/r/7f891d42a720cc2e57862f3b79e4f774404f313c.1772778858.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Fri, 6 Mar 2026 06:43:41 +0000 (14:43 +0800)]
mm: support batched checking of the young flag for MGLRU
Use the batched helper test_and_clear_young_ptes_notify() to check and
clear the young flag to improve the performance during large folio
reclamation when MGLRU is enabled.
Meanwhile, we can also support batched checking the young and dirty flag
when MGLRU walks the mm's pagetable to update the folios' generation
counter. Since MGLRU also checks the PTE dirty bit, use
folio_pte_batch_flags() with FPB_MERGE_YOUNG_DIRTY set to detect batches
of PTEs for a large folio.
Then we can remove ptep_test_and_clear_young_notify() since it has no
users now.
Note that we also update the 'young' counter and 'mm_stats[MM_LEAF_YOUNG]'
counter with the batched count in the lru_gen_look_around() and
walk_pte_range(). However, the batched operations may inflate these two
counters, because in a large folio not all PTEs may have been accessed.
(Additionally, tracking how many PTEs have been accessed within a large
folio is not very meaningful, since the mm core actually tracks
access/dirty on a per-folio basis, not per page). The impact analysis is
as follows:
1. The 'mm_stats[MM_LEAF_YOUNG]' counter has no functional impact and
is mainly for debugging.
2. The 'young' counter is used to decide whether to place the current
PMD entry into the bloom filters by suitable_to_scan() (so that next
time we can check whether it has been accessed again), which may set
the hash bit in the bloom filters for a PMD entry that hasn't seen much
access. However, bloom filters inherently allow some error, so this
effect appears negligible.
Link: https://lkml.kernel.org/r/378f4acf7d07410aa7c2e4b49d56bb165918eb34.1772778858.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Fri, 6 Mar 2026 06:43:40 +0000 (14:43 +0800)]
mm: add a batched helper to clear the young flag for large folios
Currently, MGLRU calls ptep_test_and_clear_young_notify() to check and
clear the young flag for each PTE sequentially, which is inefficient for
large folio reclamation.
Moreover, on the Arm64 architecture, which supports contiguous PTEs, the
Arm64-specific ptep_test_and_clear_young() already implements an
optimization to clear the young flags for PTEs within a contiguous range.
However, this is not sufficient. Similar to the Arm64 specific
clear_flush_young_ptes(), we can extend this to perform batched operations
for the entire large folio (which might exceed the contiguous range:
CONT_PTE_SIZE).
Thus, we can introduce a new batched helper, test_and_clear_young_ptes(),
and its wrapper test_and_clear_young_ptes_notify(), which are consistent
with the existing functions, to perform batched checking of the young
flags for large folios. This helps improve performance during large folio
reclamation when MGLRU is enabled. These helpers can be overridden by
architectures that implement a more efficient batched operation in the
following patches.
Link: https://lkml.kernel.org/r/23ec671bfcc06cd24ee0fbff8e329402742274a0.1772778858.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand (Arm) <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Fri, 6 Mar 2026 06:43:39 +0000 (14:43 +0800)]
mm: rmap: add a ZONE_DEVICE folio warning in folio_referenced()
folio_referenced() is used to test whether a folio was referenced
during reclaim. However, ZONE_DEVICE folios are controlled by their
device driver, have a lifetime tied to that driver, and are never placed
on the LRU list. That means we should never try to reclaim ZONE_DEVICE
folios, so add a warning to catch this unexpected behavior in
folio_referenced() to avoid confusion, as discussed in the previous
thread[1].
[1] https://lore.kernel.org/all/16fb7985-ec0f-4b56-91e7-404c5114f899@kernel.org/ Link: https://lkml.kernel.org/r/64d6fb2a33f7101e1d4aca2c9052e0758b76d492.1772778858.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baolin Wang [Fri, 6 Mar 2026 06:43:37 +0000 (14:43 +0800)]
mm: use inline helper functions instead of ugly macros
Patch series "support batched checking of the young flag for MGLRU", v3.
This is a follow-up to the previous work [1], to support batched checking
of the young flag for MGLRU.
Similarly, batched checking of the young flag for large folios can improve
performance during large-folio reclamation when MGLRU is enabled. I
observed noticeable performance improvements (see patch 5) on an Arm64
machine that supports contiguous PTEs. All mm-selftests are passed.
Patch 1 - 3: cleanup patches.
Patch 4: add a new generic batched PTE helper: test_and_clear_young_ptes().
Patch 5: support batched young flag checking for MGLRU.
Patch 6: implement the Arm64 arch-specific test_and_clear_young_ptes().
This patch (of 6):
People have already complained that these *_clear_young_notify() related
macros are very ugly, so let's use inline helpers to make them more
readable.
In addition, we cannot implement these inline helper functions in the
mmu_notifier.h file, because some arch-specific files (e.g.,
arch/arm64/include/asm/tlbflush.h) include mmu_notifier.h, which
introduces header compilation dependencies and causes build errors. Moreover,
since these functions are only used in the mm, implementing these inline
helpers in the mm/internal.h header seems reasonable.
Link: https://lkml.kernel.org/r/cover.1772778858.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/ea14af84e7967ccebb25082c28a8669d6da8fe57.1772778858.git.baolin.wang@linux.alibaba.com Link: https://lore.kernel.org/all/cover.1770645603.git.baolin.wang@linux.alibaba.com/ Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Barry Song <baohua@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Alistair Popple <apopple@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: rename zap_vma_ptes() to zap_special_vma_range()
zap_vma_ptes() is the only zapping function we export to modules.
It's essentially a wrapper around zap_vma_range(), however, with some
safety checks:
* That the passed range fits fully into the VMA
* That it's only used for VM_PFNMAP
We will add support for VM_MIXEDMAP next, so use the more-generic term
"special vma", although "special" is a bit overloaded. Maybe we'll later
just support any VM_SPECIAL flag.
While at it, improve the kerneldoc.
Link: https://lkml.kernel.org/r/20260227200848.114019-16-david@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Leon Romanovsky <leon@kernel.org> [drivers/infiniband] Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Arve <arve@android.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dave Airlie <airlied@gmail.com> Cc: David Ahern <dsahern@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jann Horn <jannh@google.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Namhyung kim <namhyung@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: 
Paolo Abeni <pabeni@redhat.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/memory: convert details->even_cows into details->skip_cows
The current semantics are confusing: merely specifying an empty
zap_details struct suddenly makes should_zap_cows() behave differently.
The default should be to also zap CoW'ed anonymous pages.
Really only unmap_mapping_pages() and friends want to skip zapping of
these anon folios.
So let's invert the meaning; turn the confusing "reclaim_pt" check that
overrides other properties in should_zap_cows() into a safety check.
Note that the only caller that sets reclaim_pt=true is
madvise_dontneed_single_vma(), which wants to zap any pages.
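The inverted default can be illustrated with a user-space sketch (plain C, not the kernel code; the struct and helper are simplified stand-ins mirroring the names in the commit):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct zap_details. */
struct zap_details {
	bool skip_cows;   /* inverted flag: set only by unmap_mapping_pages() and friends */
	bool reclaim_pt;
};

/*
 * With the inversion, the default (NULL details, or a zero-initialized
 * struct) is to also zap CoW'ed anonymous pages; skipping is opt-in.
 */
static bool should_zap_cows(const struct zap_details *details)
{
	if (!details)
		return true;            /* no details given: zap everything */
	return !details->skip_cows; /* zap unless the caller opted out */
}
```

With the old even_cows semantics, a zero-initialized struct implicitly meant "skip CoW pages"; here, an empty struct behaves exactly like NULL details, which is the consistency the patch is after.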
Link: https://lkml.kernel.org/r/20260227200848.114019-10-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Arve <arve@android.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/memory: move adjusting of address range to unmap_vmas()
__zap_vma_range() has two callers, of which zap_page_range_single_batched()
documents that the range must fit into the VMA range.
So move adjusting the range to unmap_vmas(), where it is actually required,
and add a safety check in __zap_vma_range() instead. In unmap_vmas() we'd
never expect to see empty ranges (otherwise, why have the VMA in there in
the first place?).
__zap_vma_range() will no longer be called with start == end, so clean up
the function a bit. While at it, simplify the overly long comment down to
its core message.
We will no longer call uprobe_munmap() for start == end, which actually
seems to be the right thing to do.
Note that hugetlb_zap_begin()->...->adjust_range_if_pmd_sharing_possible()
cannot result in the range exceeding the vma range.
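The per-VMA clamping that moves into unmap_vmas(), and the invariants __zap_vma_range() can then assume, can be sketched in plain C (a user-space illustration, not the kernel code; the struct and field names are simplified stand-ins for vm_area_struct):

```c
#include <assert.h>

/* Stand-in for vm_area_struct; the VMA covers [vm_start, vm_end). */
struct vm_area {
	unsigned long vm_start, vm_end;
};

/* Clamp the requested zap range to the VMA, as unmap_vmas() now does per VMA. */
static void clamp_to_vma(const struct vm_area *vma,
			 unsigned long *start, unsigned long *end)
{
	if (*start < vma->vm_start)
		*start = vma->vm_start;
	if (*end > vma->vm_end)
		*end = vma->vm_end;
}

/* The safety checks __zap_vma_range() can now rely on. */
static void zap_vma_range_checks(const struct vm_area *vma,
				 unsigned long start, unsigned long end)
{
	assert(start < end);                /* never called with start == end */
	assert(start >= vma->vm_start);     /* range fits the VMA... */
	assert(end <= vma->vm_end);         /* ...on both sides */
}
```

This mirrors the commit's division of labor: the caller that iterates VMAs does the adjusting once, and the per-VMA zap helper only verifies its precondition.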
Link: https://lkml.kernel.org/r/20260227200848.114019-9-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Arve <arve@android.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>