]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
2 weeks agobatman-adv: replace non-atomic vlan config fields with (READ|WRITE)_ONCE
Sven Eckelmann [Tue, 12 May 2026 17:37:05 +0000 (19:37 +0200)] 
batman-adv: replace non-atomic vlan config fields with (READ|WRITE)_ONCE

The vlan configuration values are only accessed as plain loads/stores and
do not require full atomic_t semantics. Convert these fields to native
integer types and replace their users with READ_ONCE()/WRITE_ONCE() to
avoid load/store tearing.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agobatman-adv: replace non-atomic hardif config fields with (READ|WRITE)_ONCE
Sven Eckelmann [Tue, 12 May 2026 17:37:05 +0000 (19:37 +0200)] 
batman-adv: replace non-atomic hardif config fields with (READ|WRITE)_ONCE

The hardif configuration values are only accessed as plain loads/stores and
do not require full atomic_t semantics. Convert these fields to native
integer types and replace their users with READ_ONCE()/WRITE_ONCE() to
avoid load/store tearing.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agobatman-adv: replace non-atomic meshif config fields with (READ|WRITE)_ONCE
Sven Eckelmann [Tue, 12 May 2026 17:37:05 +0000 (19:37 +0200)] 
batman-adv: replace non-atomic meshif config fields with (READ|WRITE)_ONCE

The meshif configuration values are only accessed as plain loads/stores and
do not require full atomic_t semantics. Convert these fields to native
integer types and replace their users with READ_ONCE()/WRITE_ONCE() to
avoid load/store tearing.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agobatman-adv: extract netdev wifi detection information object
Sven Eckelmann [Wed, 13 May 2026 19:37:46 +0000 (21:37 +0200)] 
batman-adv: extract netdev wifi detection information object

Previously, wifi_flags were stored directly in batadv_hard_iface, which is
created for every network interface on the system (including those never
attached to a mesh interface). This wastes memory and complicates the
long-term goal of lazily allocating batadv_hard_iface only for interfaces
that actually join a mesh.

The problem is that several batman-adv features need wifi detection for
net_devices (and their underlying devices) regardless of whether a
batadv_hard_iface exists for them:

* B.A.T.M.A.N. IV TQ hop penalty calculation
* B.A.T.M.A.N. V ELP probing / throughput estimation
* AP isolation

To decouple wifi detection from batadv_hard_iface lifetime, introduce a
global rhashtable (batadv_wifi_net_devices) mapping net_device pointers to
batadv_wifi_net_device_state objects. Only net_devices that are actually
detected as (indirect) wifi interfaces occupy an entry, keeping the common
(non-wifi) case allocation-free.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2 weeks agonet/mlx5: Reorder completion before putting command entry in cmd_work_handler
Nikolay Kuratov [Tue, 26 May 2026 16:29:32 +0000 (19:29 +0300)] 
net/mlx5: Reorder completion before putting command entry in cmd_work_handler

Assuming callback != NULL && !page_queue, cmd_work_handler takes
command entry with refcnt == 1 from mlx5_cmd_invoke.
If either semaphore timeout or index allocation error happens,
it does final cmd_ent_put(ent). To avoid access to freed memory,
notify slotted completion before cmd_ent_put.

This is theoretical issue found by Svace static analyser.

Cc: stable@vger.kernel.org
Fixes: 485d65e135712 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Fixes: 0e2909c6bec90 ("net/mlx5: Fix variable not being completed when function returns")
Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru>
Reviewed-by: Md Haris Iqbal <haris.iqbal@linux.dev>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260526162932.501584-1-kniv@yandex-team.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonetfilter: nft_byteorder: remove multi-register support
Florian Westphal [Tue, 12 May 2026 13:36:14 +0000 (15:36 +0200)] 
netfilter: nft_byteorder: remove multi-register support

64bit byteorder conversion is broken when several registers need to be
converted because the source register array advances in steps for 4 bytes
instead of 8:

  for (i = ...
      src64 = nft_reg_load64(&src[i]);
                             ~~~~~ u32 *src
      nft_reg_store64(&dst64[i],

Remove the multi-register support, it has other issues as well:

Pablo points out that commit
caf3ef7468f7 ("netfilter: nf_tables: prevent OOB access in nft_byteorder_eval")
alters semantics: before the loop operated on registers, i.e.
 for ( ... )
   dst32[i] = htons((u16)src32[i])

 .. but after the patch it will operate on bytes, which makes this
 useless to convert e.g. concatenations, which store each compound
 in its own register.

Multi-convert of u32 has one theoretical application:

ct mark . meta mark . tcp dport @intervalset

Because ct mark and meta mark are host byte order, use with
intervals has to convert the byteorder for ct/meta mark value
to network byte order (bigendian).

nftables emits this:
 [ meta load mark => reg 1 ]
 [ byteorder reg 1 = hton(reg 1, 4, 4) ]
 [ ct load mark => reg 9 ]
 [ byteorder reg 9 = hton(reg 9, 4, 4) ]
 ...

I.e. two separate calls.  Theoretically it could be changed to do:
 [ meta load mark => reg 1 ]
 [ ct load mark => reg 9 ]
 [ byteorder reg 1 = htonl(reg 1, 4, 8) ]
 ...

But then all it would take to change the set to
meta mark . tcp dport . ct mark

... and we'd be back to two "byteorder" calls. IOW, support to
convert a range of registers is both dysfunctional and dubious.

Simplify this: remove the feature.

Pablo Neira Ayuso points out that nftables before 1.1.0 can generate
incorrect byteorder conversions, see 9fe58952c45a,
"evaluate: skip byteorder conversion for selector smaller than 2 bytes"
in nftables.git).  Affected rulesets fail to load with this change and
old userspace due to 'len != size' check.

Fixes: c301f0981fdd ("netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()")
Cc: <stable+noautosel@kernel.org> # may break rule load with old nftables versions
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/netfilter-devel/20240206104336.ctigqpkunom2ufmn@lion.mk-sys.cz/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: bridge: make ebt_snat ARP rewrite writable
Yiming Qian [Sat, 23 May 2026 12:29:10 +0000 (12:29 +0000)] 
netfilter: bridge: make ebt_snat ARP rewrite writable

The ebtables SNAT target keeps the Ethernet source address rewrite
behind skb_ensure_writable(skb, 0).  This is intentional: at the bridge
ebtables hooks the Ethernet header is addressed through
skb_mac_header()/eth_hdr(), while skb->data points at the Ethernet
payload.  Asking skb_ensure_writable() for ETH_HLEN bytes would check
the payload, not the Ethernet header, and would reintroduce the small
packet regression fixed by commit 63137bc5882a.

However, the optional ARP sender hardware address rewrite is different.
It writes through skb_store_bits() at an offset relative to skb->data:

        skb_store_bits(skb, sizeof(struct arphdr), info->mac, ETH_ALEN)

skb_header_pointer() only safely reads the ARP header; it does not make
the later sender hardware address range writable.  If that range is
still held in a nonlinear skb fragment backed by a splice-imported file
page, skb_store_bits() maps the frag page and copies the new MAC address
directly into it.

Ensure the ARP SHA range is writable before reading the ARP header and
before calling skb_store_bits().

Fixes: 63137bc5882a ("netfilter: ebtables: Fixes dropping of small packets in bridge nat")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: nft_ct: bail out on template ct in get eval
Jiayuan Chen [Thu, 28 May 2026 11:09:19 +0000 (19:09 +0800)] 
netfilter: nft_ct: bail out on template ct in get eval

I noticed this issue while looking at a historic syzbot report [1].

A rule like the one below is enough to trigger the bug:

    table ip t {
        chain pre {
            type filter hook prerouting priority raw;
            ct zone set 1
            ct original saddr 1.2.3.4 accept
        }
    }

The first expression attaches a per-cpu template ct via
nft_ct_set_zone_eval() (nf_ct_tmpl_alloc -> kzalloc, tuple is all
zero, nf_ct_l3num(ct) == 0). The next expression then calls
nft_ct_get_eval() on the same skb, treats the template as a real ct
and hits the 16-byte memcpy path. With dreg at NFT_REG32_15 this
overflows past struct nft_regs on the kernel stack; with smaller
dreg values it silently clobbers adjacent registers.

Reject template ct at the eval entry and in nft_ct_get_fast_eval(),
mirroring the check nft_ct_set_eval() already has. Additionally,
bound the address copy in NFT_CT_SRC / NFT_CT_DST by priv->len
instead of by nf_ct_l3num(ct): nf_ct_get_tuple() zeroes the tuple
before pkt_to_tuple() fills in only the protocol-relevant leading
bytes, so the trailing bytes of tuple->{src,dst}.u3.all are
well-defined zero. priv->len is validated at rule load, so the
copy size is now bounded by the destination register rather than
by an untrusted field on the conntrack.

[1]: https://syzkaller.appspot.com/bug?id=389cf09cb72926114fce90dc85a2c3231dcb647c

Fixes: 45d9bcda21f4 ("netfilter: nf_tables: validate len in nft_validate_data_load()")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: nft_tunnel: fix use-after-free on object destroy
Tristan Madani [Wed, 27 May 2026 13:57:50 +0000 (13:57 +0000)] 
netfilter: nft_tunnel: fix use-after-free on object destroy

nft_tunnel_obj_destroy() calls metadata_dst_free() which directly
kfree()s the metadata_dst, ignoring the dst_entry refcount. Packets
that took a reference via dst_hold() in nft_tunnel_obj_eval() and
are still queued (e.g. in a netem qdisc) are left with a dangling
pointer. When these packets are eventually dequeued, dst_release()
operates on freed memory.

Replace metadata_dst_free() with dst_release() so the metadata_dst
is freed only after all references are dropped. The dst subsystem
already handles metadata_dst cleanup in dst_destroy() when
DST_METADATA is set.

Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Cc: stable@vger.kernel.org
Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: conntrack_irc: fix possible out-of-bounds read
Florian Westphal [Wed, 27 May 2026 10:20:19 +0000 (12:20 +0200)] 
netfilter: conntrack_irc: fix possible out-of-bounds read

When parsing fails after we've matched the command string we
should bail out instead of trying to match a different command.

This helper should be deprecated, given prevalence of TLS I doubt it has
any relevance in 2026.

Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Closes: https://sashiko.dev/#/patchset/20260525182924.28456-1-fw%40strlen.de
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: synproxy: add mutex to guard hook reference counting
Fernando Fernandez Mancera [Tue, 26 May 2026 21:58:31 +0000 (23:58 +0200)] 
netfilter: synproxy: add mutex to guard hook reference counting

As the synproxy infrastructure register netfilter hooks on-demand when a
user adds the first iptables target or nftables expression, if done
concurrently they can race each other.

Introduce a mutex to serialize the refcount control blocks access from
both frontends. While a per namespace mutex might be more efficient, it
is not needed for target/expression like SYNPROXY.

Fixes: ad49d86e07a4 ("netfilter: nf_tables: Add synproxy support")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: nft_fib_ipv6: bail out of sibling walk if rt got unlinked
Jiayuan Chen [Tue, 26 May 2026 02:02:27 +0000 (10:02 +0800)] 
netfilter: nft_fib_ipv6: bail out of sibling walk if rt got unlinked

This was reported by Sashiko [1].

The RCU walk over rt->fib6_siblings can spin forever if rt is unlinked
mid-iteration: rt->fib6_siblings.next still points into the old ring,
so the loop never meets &rt->fib6_siblings as its terminator.

fib6_purge_rt() always does WRITE_ONCE(rt->fib6_nsiblings, 0) before
list_del_rcu(), so readers can use rt->fib6_nsiblings == 0 as the
detach signal. The same pattern is used in fib6_info_uses_dev() and
rt6_nlmsg_size().

[1]: https://sashiko.dev/#/patchset/20260520023411.391233-1-jiayuan.chen%40linux.dev
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 1c32b24c234b ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agoipvs: clear the svc scheduler ptr early on edit
Julian Anastasov [Mon, 25 May 2026 04:07:44 +0000 (07:07 +0300)] 
ipvs: clear the svc scheduler ptr early on edit

ip_vs_edit_service() while unbinding the old scheduler clears
the svc->scheduler ptr after the scheduler module initiates
RCU callbacks. This can cause packets to use the old
scheduler at the time when svc->sched_data is already freed
after RCU grace period.

Fix it by clearing the ptr early in ip_vs_unbind_scheduler(),
before the done_service method schedules any RCU callbacks.

Also, if the new scheduler fails to initialize when replacing
the old scheduler, try to restore the old scheduler while still
returning the error code.

Link: https://sashiko.dev/#/patchset/20260519015506.634185-1-rosenp%40gmail.com
Fixes: 05f00505a89a ("ipvs: fix crash if scheduler is changed")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agonetfilter: xt_NFQUEUE: prefer raw_smp_processor_id
Fernando Fernandez Mancera [Fri, 22 May 2026 10:47:17 +0000 (12:47 +0200)] 
netfilter: xt_NFQUEUE: prefer raw_smp_processor_id

With PREEMPT_RCU this triggers a splat because smp_processor_id() can be
preempted while inside a RCU critical section. If xt_NFQUEUE target is
invoked via nft_compat_eval() path, we are inside a RCU critical
section.

Just use the raw version instead.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Paolo Abeni [Mon, 1 Jun 2026 11:35:51 +0000 (13:35 +0200)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Conflicts:

drivers/net/ethernet/microsoft/mana/mana_en.c:
  17bfe0a8c014e ("net: mana: Add NULL guards in teardown path to prevent panic on attach failure")
  d07efe5a6e641 ("net: mana: Use per-queue allocation for tx_qp to reduce allocation size")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agoarm64: dts: rockchip: Add Bluetooth support for Khadas Edge 2L
Gray Huang [Thu, 7 May 2026 03:35:41 +0000 (11:35 +0800)] 
arm64: dts: rockchip: Add Bluetooth support for Khadas Edge 2L

Enable Bluetooth support for the Ampak AP6275P module on the
Khadas Edge 2L. This involves enabling the UART5 interface for
HCI communication and defining the required regulators and
power-sequence pins.

Signed-off-by: Gray Huang <gray.huang@wesion.com>
Link: https://patch.msgid.link/20260507033541.2576335-3-gray.huang@wesion.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2 weeks agoarm64: dts: rockchip: Enable USB for Khadas Edge 2L
Gray Huang [Thu, 7 May 2026 03:35:40 +0000 (11:35 +0800)] 
arm64: dts: rockchip: Enable USB for Khadas Edge 2L

The Khadas Edge 2L board provides one USB 3.0 Host port and
one USB 2.0 port (connected via an internal hub). Enable the
corresponding DWC3 controllers and PHYs.

Signed-off-by: Gray Huang <gray.huang@wesion.com>
Link: https://patch.msgid.link/20260507033541.2576335-2-gray.huang@wesion.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2 weeks agoarm64: dts: rockchip: Disable removed devices from rk3399-nanopi-r4s
Chen-Yu Tsai [Tue, 5 May 2026 17:29:02 +0000 (01:29 +0800)] 
arm64: dts: rockchip: Disable removed devices from rk3399-nanopi-r4s

While the design of the NanoPi R4S is based on the common NanoPi 4
family, it is trimmed down a lot.

Disable all the peripherals on the SoC that are not used, and delete
all the external components that are not present.

Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
[feels like the cleaner option, than to move those peripherals into a new
 rk3399-nanopi-allothers.dtsi, as the r4s variants are not as many ]
Link: https://patch.msgid.link/20260505172903.33271-1-wens@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2 weeks agoarm64: dts: rockchip: Fix EEPROM compatible on rk3399-nanopi-r4s-enterprise
Chen-Yu Tsai [Tue, 5 May 2026 16:52:43 +0000 (00:52 +0800)] 
arm64: dts: rockchip: Fix EEPROM compatible on rk3399-nanopi-r4s-enterprise

The EEPROM used on the R4S (enterprise) is the 24AA025E48T-I/OT from
MicroChip. This is a 2-Kbit EEPROM with 16-byte page size. The latter
half of the EEPROM is read-only, and the last 48 bits contain a globally
unique MAC address. That is to say this is not an ordinary EEPROM.

The compatible for this type of EEPROM was introduced later that the
board. Switch over to the correct compatible now that it is available.

Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
Link: https://patch.msgid.link/20260505165244.1902-1-wens@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2 weeks agoALSA: core: Use flexible array for card private data
Cássio Gabriel [Mon, 1 Jun 2026 01:23:35 +0000 (22:23 -0300)] 
ALSA: core: Use flexible array for card private data

snd_card_new() and snd_devm_card_new() allocate struct snd_card
together with optional driver-private storage. The storage is currently
described only by open-coded sizeof(*card) + extra_size arithmetic, and
snd_card_init() reaches it by manually adding sizeof(struct snd_card) to
the card pointer.

Make the trailing storage explicit with a flexible array member. Use
kzalloc_flex() for the regular allocation path and struct_size() for the
devres allocation size. This documents the layout and avoids open-coded
variable-size object arithmetic.

Align the flexible array to unsigned long long so the driver-private area
does not become less aligned than the old sizeof(struct snd_card) tail
address on 32-bit ABIs.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260531-alsa-card-private-flex-array-v2-1-e4ff67f5bd23@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 weeks agoALSA: seq: Use flexible array for device arguments
Cássio Gabriel [Sun, 31 May 2026 23:41:41 +0000 (20:41 -0300)] 
ALSA: seq: Use flexible array for device arguments

snd_seq_device_new() allocates struct snd_seq_device together with a
caller-specific argument area. SNDRV_SEQ_DEVICE_ARGPTR() reaches that
area by adding sizeof(struct snd_seq_device) to the object pointer.

Make the trailing storage explicit with a flexible array and allocate it
with kzalloc_flex(). This makes the object layout self-describing and
avoids open-coded size arithmetic in the allocation and accessor.

Reject negative argsize values before calculating the allocation size.
Current in-tree callers pass either zero or sizeof() values, but the
function takes an int size argument and should not let a negative value
flow into unsigned allocation arithmetic.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260531-alsa-seq-flex-args-v2-1-6e068d4ed9b0@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 weeks agoext2: Remove deprecated DAX support
Ashwin Gundarapu [Sun, 24 May 2026 15:35:27 +0000 (21:05 +0530)] 
ext2: Remove deprecated DAX support

DAX support in ext2 was deprecated in commit d5a2693f93e4 ("ext2:
Deprecate DAX") with a removal deadline of end of 2025.  Remove all DAX
code from ext2 as scheduled.

This removes the DAX mount option, IOMAP DAX support, DAX file
operations, DAX address_space_operations, and the DAX fault handler.

[JK: Fixup some whitespace damage]

Signed-off-by: Ashwin Gundarapu <linuxuser509@zohomail.in>
Link: https://patch.msgid.link/19e5aa07c9b.3a2e576d130187.5289857983023045470@zohomail.in
Signed-off-by: Jan Kara <jack@suse.cz>
2 weeks agomm/slub: detach and reattach partial slabs in batch
Hao Li [Fri, 29 May 2026 03:50:52 +0000 (11:50 +0800)] 
mm/slub: detach and reattach partial slabs in batch

get_partial_node_bulk() moves each selected slab from the node's
partial list to the local pc->slabs list using a remove_partial() and
list_add() pair. In practice, the loop often detaches several adjacent
slabs. Doing this individually repeatedly manipulates list pointers
while holding n->list_lock, which causes unnecessary churn.

To demonstrate this, the counts below show how often single vs. multiple
consecutive slabs are retrieved during a will-it-scale mmap stress test:

consecutive_slabs_count        frequency
= 1                            277345324
= 2                            335238023
= 3                            175717884
>= 4                           88862337

The data confirms that retrieving multiple contiguous slabs is highly
frequent.

To optimize this, track contiguous runs of matching slabs and move each
run in a single operation using list_bulk_move_tail(). This reduces list
pointer churn inside the lock critical section.

Apply the same optimization to __refill_objects_node() when reattaching
leftover partial slabs back to the node's partial list.

The will-it-scale mmap benchmark shows a 2% ~ 5% performance improvement
after applying this patch.

Signed-off-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260529035120.81304-3-hao.li@linux.dev
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
2 weeks agomm/slub: introduce helpers for node partial slab state
Hao Li [Fri, 29 May 2026 03:50:51 +0000 (11:50 +0800)] 
mm/slub: introduce helpers for node partial slab state

Wrap partial slab count inc/dec and flag set/clear into
helper functions to reduce code duplication.

Note that __add_partial() is called locklessly in
early_kmem_cache_node_alloc(), but since there is no such use case for
removal, __remove_partial() does not exist.

Suggested-by: Harry Yoo <harry@kernel.org>
Signed-off-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260529035120.81304-2-hao.li@linux.dev
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
2 weeks agomm/slub: use empty sheaf helpers for oversized sheaves
Shengming Hu [Thu, 28 May 2026 11:35:37 +0000 (19:35 +0800)] 
mm/slub: use empty sheaf helpers for oversized sheaves

Oversized prefilled sheaves are allocated separately because their
capacity can be larger than the cache's regular sheaf capacity. After
they are flushed, however, they are empty sheaves as well, and should be
released through the same empty-sheaf helper.

Allocate oversized prefilled sheaves with __alloc_empty_sheaf() and free
them with free_empty_sheaf() after a failed prefill or after they are
returned and flushed. This keeps the oversized and pfmemalloc return paths
consistent, including the SLAB_KMALLOC-specific __GFP_NO_OBJ_EXT and
mark_obj_codetag_empty() handling.

Keep the caller-GFP filtering in alloc_empty_sheaf() instead of
__alloc_empty_sheaf(). In particular, do not clear OBJCGS_CLEAR_MASK in
the raw helper, so the oversized prefill path does not unexpectedly drop
caller-provided flags such as __GFP_NOFAIL. The SLAB_KMALLOC-specific
addition of __GFP_NO_OBJ_EXT remains in __alloc_empty_sheaf(), matching
the free_empty_sheaf() assumption.

Since oversized sheaves are now allocated and freed through the empty
sheaf helpers, SHEAF_ALLOC and SHEAF_FREE also account for oversized
sheaves. Update the stat comments accordingly.

Keep the capacity initialization in the oversized prefill path, since
capacity is currently only used for prefilled sheaves

Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
Link: https://patch.msgid.link/20260528193537623nAo-xYBNYBysGKSBjREuO@zte.com.cn
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Hao Li <hao.li@linux.dev>
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
2 weeks agoARM: orion5x: update board check in mss2_pci_init() to use the DT
Ethan Nelson-Moore [Sun, 17 May 2026 02:37:20 +0000 (19:37 -0700)] 
ARM: orion5x: update board check in mss2_pci_init() to use the DT

The mss2_pci_init() function contains a check for the ARM machine ID
via the machine_is_mss2() macro. The board concerned now supports only
FDT booting, which does not use machine IDs, and therefore the code
should be updated to check the DT compatible property instead. The
machine was converted to FDT booting in commit fbf04d814d0a ("ARM:
orion5x: convert Maxtor Shared Storage II to the Device Tree"). The
presence of this machine ID check prevents the removal of machine IDs
no longer used by the kernel from arch/arm/tools/mach-types, because
the machine_is_*() macros are generated from mach-types. To resolve
this issue, use of_machine_is_compatible() instead.

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2 weeks agoRISC-V: KVM: Use flexible array for APLIC IRQ state
Rosen Penev [Mon, 11 May 2026 03:21:44 +0000 (20:21 -0700)] 
RISC-V: KVM: Use flexible array for APLIC IRQ state

Store the per-source APLIC IRQ state in the APLIC allocation instead
of allocating it separately.

This ties the IRQ state lifetime directly to the APLIC state, removes a
separate allocation failure path, and lets __counted_by() describe the
array bounds.

Assisted-by: Codex:GPT-5.5
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260511032144.361520-1-rosenp@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2 weeks agoxfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags()
Takao Sato [Tue, 26 May 2026 16:09:57 +0000 (13:09 -0300)] 
xfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags()

iptfs_consume_frags() transfers paged fragments from one socket buffer
to another but fails to propagate the SKBFL_SHARED_FRAG flag. This is
the same class of bug that was fixed in skb_try_coalesce() for
CVE-2026-46300: when fragments backed by read-only page-cache pages are
merged, the marker indicating their shared nature must be preserved so
that ESP can decide correctly whether in-place encryption is safe.

Apply the same two-line fix used in skb_try_coalesce() to
iptfs_consume_frags().

Fixes: b96ba312e21c ("xfrm: iptfs: share page fragments of inner packets")
Cc: stable@vger.kernel.org # 6.14+
Signed-off-by: Takao Sato <takaosato1997@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2 weeks agoarm: mvebu_v5_defconfig: remove stale MACH_LINKSTATION_LSCHL reference
Ethan Nelson-Moore [Sat, 9 May 2026 02:27:43 +0000 (19:27 -0700)] 
arm: mvebu_v5_defconfig: remove stale MACH_LINKSTATION_LSCHL reference

The legacy board file for MACH_LINKSTATION_LSCHL was removed in
commit ecfe69639157 ("ARM: orion5x: remove legacy support of ls-chl")
after it was converted to DT booting, but a reference to it remained in
mvebu_v5_defconfig. Drop this unused code.

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2 weeks agoriscv: dts: spacemit: enable PMIC on OrangePi R2S
Chukun Pan [Wed, 20 May 2026 10:00:00 +0000 (18:00 +0800)] 
riscv: dts: spacemit: enable PMIC on OrangePi R2S

Enable the i2c8 interface and add the connected SpacemiT P1 PMIC and
its associated regulators to support voltage regulation on the board.

Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
Reviewed-by: Yixun lan <dlan@kernel.org>
Link: https://patch.msgid.link/20260520100000.575719-1-amadeus@jmu.edu.cn
Signed-off-by: Yixun Lan <dlan@kernel.org>
2 weeks agorust: cpufreq: clean new `clippy::map_or_identity` lint for Rust 1.98.0
Miguel Ojeda [Sat, 30 May 2026 09:58:09 +0000 (11:58 +0200)] 
rust: cpufreq: clean new `clippy::map_or_identity` lint for Rust 1.98.0

Starting with Rust 1.98.0 (expected 2026-08-20), Clippy is likely
introducing a new lint `clippy::map_or_identity` [1][2], which currently
triggers in a single case:

    warning: expression can be simplified using `Result::unwrap_or()`
        --> rust/kernel/cpufreq.rs:1326:60
         |
    1326 |         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).map_or(0, |f| f))
         |                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         |
         = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#map_or_identity
         = note: `-W clippy::map-or-identity` implied by `-W clippy::all`
         = help: to override `-W clippy::all` add `#[allow(clippy::map_or_identity)]`
    help: consider using `unwrap_or`
         |
    1326 -         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).map_or(0, |f| f))
    1326 +         PolicyCpu::from_cpu(cpu_id).map_or(0, |mut policy| T::get(&mut policy).unwrap_or(0))
         |

The suggestion is valid, thus clean it up.

Cc: stable@vger.kernel.org # Needed in 6.18.y and later.
Link: https://github.com/rust-lang/rust-clippy/issues/15801
Link: https://github.com/rust-lang/rust-clippy/pull/16052
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260530095809.213611-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks agoARM: mvebu: simplify of_node_put calls
Martin Kaiser [Sat, 2 May 2026 12:14:27 +0000 (14:14 +0200)] 
ARM: mvebu: simplify of_node_put calls

In armada_370_coherency_init, cpu_config_np is no longer needed after
of_iomap. We can call of_node_put earlier and summarize the two calls.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2 weeks agoARM: mvebu: drop unnecessary NULL check
Martin Kaiser [Sat, 2 May 2026 12:14:26 +0000 (14:14 +0200)] 
ARM: mvebu: drop unnecessary NULL check

Don't check the returned pointer from of_find_compatible_node.
We pass this pointer to of_iomap, which handles np==NULL  correctly.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2 weeks agouserfaultfd: remove redundant check in vm_uffd_ops()
Mike Rapoport (Microsoft) [Wed, 27 May 2026 18:47:51 +0000 (21:47 +0300)] 
userfaultfd: remove redundant check in vm_uffd_ops()

Lorenzo says:

  static const struct vm_uffd_ops *vma_uffd_ops(struct vm_area_struct *vma)
  {
          if (vma_is_anonymous(vma))
                  return &anon_uffd_ops;
          return vma->vm_ops ? vma->vm_ops->uffd_ops : NULL;
  }

  This is doing a redundant check _and_ making life confusing, as if
  !vma->vm_ops is a condition that can be reached there, it can't, as
  vma_is_anonymous() is literally a !vma->vm_ops check :)

Remove the redundant check.

Link: https://lore.kernel.org/20260527184751.4147364-4-rppt@kernel.org
Fixes: 0f48947c4232 ("userfaultfd: introduce vm_uffd_ops")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: David Carlier <devnexen@gmail.com>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agouserfaultfd: refuse to __mfill_atomic_pte() for unsupported VMAs
Mike Rapoport (Microsoft) [Wed, 27 May 2026 18:47:50 +0000 (21:47 +0300)] 
userfaultfd: refuse to __mfill_atomic_pte() for unsupported VMAs

__mfill_atomic_pte() unconditionally dereferences ops because there is an
assumption that VMAs that can undergo mfill_* operations are vetted on
registration and must have valid vm_uffd_ops.

Add a guard against potential bugs and make sure __mfill_atomic_pte()
bails out if ops is NULL.

Link: https://lore.kernel.org/20260527184751.4147364-3-rppt@kernel.org
Fixes: ad9ac3081332 ("userfaultfd: introduce vm_uffd_ops->alloc_folio()")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: David CARLIER <devnexen@gmail.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michael Bommarito <michael.bommarito@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agouserfaultfd: verify VMA state across UFFDIO_COPY retry
Mike Rapoport (Microsoft) [Wed, 27 May 2026 18:47:49 +0000 (21:47 +0300)] 
userfaultfd: verify VMA state across UFFDIO_COPY retry

Patch series "userfaultfd: verify VMA state across UFFDIO_COPY retry", v2.

... and two more small fixes.

This patch (of 3):

mfill_copy_folio_retry() drops the VMA lock for copy_from_user() and
reacquires it afterwards.  The destination VMA can be replaced during that
window.

The existing check compares vma_uffd_ops() before and after the retry, but
if a shmem VMA with MAP_SHARED is replaced with a shmem VMA with
MAP_PRIVATE (or vice versa) the replacement goes undetected.

The change from MAP_PRIVATE to MAP_SHARED will treat the folio allocated
with shmem_alloc_folio() as anonymous and this will cause BUG() when
mfill_atomic_install_pte() will try to folio_add_new_anon_rmap().

The change from MAP_SHARED to MAP_PRIVATE allows injection of folios into
the page cache of the original VMA.

There is no need to change for hugetlb because it never uses
mfill_copy_folio_retry().

Introduce helpers for more comprehensive comparison of VMA state:
- mfill_retry_state_save() to save the relevant VMA state into a struct
  mfill_retry_state (original uffd_ops, relevant VMA flags, vm_file and
  pgoff) before dropping the lock
- mfill_retry_state_changed() to compare the saved state with the state
  of the VMA acquired after retaking the locks
- mfill_retry_state_put() to release vm_file pinning.

Use DEFINE_FREE() cleanup to wrap mfill_retry_state_put() to avoid
complicating error handling paths in mfill_copy_folio_retry().

Link: https://lore.kernel.org/20260527184751.4147364-1-rppt@kernel.org
Link: https://lore.kernel.org/20260527184751.4147364-2-rppt@kernel.org
Fixes: 292411fda25b ("mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()")
Fixes: 6ab703034f14 ("userfaultfd: mfill_atomic(): remove retry logic")
Co-developed-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Co-developed-by: David Carlier <devnexen@gmail.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomm/huge_memory: update file PMD counter before folio_put()
Yin Tirui [Tue, 26 May 2026 10:13:37 +0000 (18:13 +0800)] 
mm/huge_memory: update file PMD counter before folio_put()

__split_huge_pmd_locked() updates the file/shmem RSS counter after
dropping the PMD mapping's folio reference.  If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().

Move the counter update before folio_put().

Link: https://lore.kernel.org/20260526101337.1984081-1-yintirui@huawei.com
Fixes: fadae2953072 ("thp: use mm_file_counter to determine update which rss counter")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomm/huge_memory: update file PUD counter before folio_put()
Yin Tirui [Tue, 26 May 2026 10:13:55 +0000 (18:13 +0800)] 
mm/huge_memory: update file PUD counter before folio_put()

__split_huge_pud_locked() updates the file/shmem RSS counter after
dropping the PUD mapping's folio reference.  If folio_put() drops the last
reference, mm_counter_file() can later read freed folio state via
folio_test_swapbacked().

Move the counter update before folio_put().

Link: https://lore.kernel.org/20260526101355.1984244-1-yintirui@huawei.com
Fixes: dbe54153296d ("mm/huge_memory: add vmf_insert_folio_pud()")
Signed-off-by: Yin Tirui <yintirui@huawei.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback
Muchun Song [Mon, 25 May 2026 02:52:13 +0000 (10:52 +0800)] 
mm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback

vmemmap_restore_pte() rebuilds restored vmemmap pages from a tail-page
template derived from compound_head().  This is wrong when the current PTE
already maps a page whose contents are not tail-page metadata.

In the rollback path of vmemmap_remap_free(), the first restored PTE is
backed by vmemmap_head and contains head-page metadata.  Reconstructing
that page from a tail-page template overwrites the head-page state and
corrupts the restored vmemmap page.

Fix this by copying the full page from the page currently mapped by the
PTE.  Also pass vmemmap_tail to the rollback walk so only PTEs backed by
the shared tail page are restored, while the head PTE remains mapped to
vmemmap_head.  Add VM_WARN_ON_ONCE() checks for unexpected cases.

Link: https://lore.kernel.org/20260525025213.2229628-1-songmuchun@bytedance.com
Fixes: c0b495b91a47 ("mm/hugetlb: refactor code around vmemmap_walk")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Kiryl Shutsemau <kas@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomm/damon/ops-common: call folio_test_lru() after folio_get()
SeongJae Park [Mon, 25 May 2026 16:22:55 +0000 (09:22 -0700)] 
mm/damon/ops-common: call folio_test_lru() after folio_get()

damon_get_folio() speculatively calls folio_test_lru() before
folio_try_get().  The folio can get freed and reallocated to a tail page.
In the case, VM_BUG_ON_PGFLAGS() in const_folio_flags() can be triggered.
Remove the speculative call.

Also mark folio_test_lru() check right after folio_try_get() success as no
more unlikely.

The race should be rare.  Also the problem can happen only if the kernel
has enabled CONFIG_DEBUG_VM_PGFLAGS.  No real world report of this issue
has been made so far.  This fix is based on only theoretical analysis.
That said, a bug is a bug.  A similar issue was also fixed via commit
3203b3ab0fcf ("mm/filemap: don't call folio_test_locked() without a
reference in next_uptodate_folio()").  I don't expect this change will
make a meaningful impact to DAMON performance in the real world, though I
will be happy to be corrected from the real world reports.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/20260525162256.8317-1-sj@kernel.org
Link: https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Fernand Sieber <sieberf@amazon.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agodt-bindings: trivial-devices: add fsl,mc1323
Frank Li [Fri, 22 May 2026 20:38:08 +0000 (16:38 -0400)] 
dt-bindings: trivial-devices: add fsl,mc1323

Add freescale 2.4 GHz IEEE® 802.15.4/ZigBee mc1323 to fix the below
CHECK_DTBS warnings.
  arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: /soc/bus@50000000/spba-bus@50000000/spi@50010000/mc1323@0: failed to match any schema with compatible: ['fsl,mc1323']

Since the i.MX53 platform is more than 20 years old, it is difficult to
find detailed information about how the MC1323 was used on the i.MX53 SMD
board, as the functionality depended on firmware.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260522203810.832631-1-Frank.Li@oss.nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2 weeks agodt-bindings: display: imx: Add television encoder (TVE) for imx53
Frank Li [Thu, 21 May 2026 19:37:32 +0000 (15:37 -0400)] 
dt-bindings: display: imx: Add television encoder (TVE) for imx53

Add television encoder (TVE) for legacy i.MX53 (over 15 years) to fix below
DTB_CHECK warnings:
  arch/arm/boot/dts/nxp/imx/imx53-ard.dtb: /soc/bus@60000000/tve@63ff0000: failed to match any schema with compatible: ['fsl,imx53-tve']

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260521193734.1496372-1-Frank.Li@oss.nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2 weeks agorbd: check snap_count against RBD_MAX_SNAP_COUNT
Rosen Penev [Sat, 30 May 2026 01:12:55 +0000 (18:12 -0700)] 
rbd: check snap_count against RBD_MAX_SNAP_COUNT

snap_count is u32 but the comparison is against a SIZE_MAX-derived value
(~2^61 on 64-bit), which clang flags as always false with
-Wtautological-constant-out-of-range-compare.

The proper check here should be that snap_count does not go over
RBD_MAX_SNAP_COUNT.

Assisted-by: Opencode:Big-pickle
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Alex Elder <elder@riscstar.com>
Link: https://patch.msgid.link/20260530011255.52916-1-rosenp@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agorust: block: fix GenDisk cleanup paths
Haoze Xie [Sat, 30 May 2026 06:11:54 +0000 (14:11 +0800)] 
rust: block: fix GenDisk cleanup paths

GenDiskBuilder::build() still has fallible work after
__blk_mq_alloc_disk(), but its error path only recovers the
foreign queue data. That leaks the temporary gendisk and
request_queue until later teardown. If the caller moved the last
Arc<TagSet<T>> into build(), the leaked queue can retain blk-mq
state after the tag set is dropped.

Fix the pre-registration failure path by dropping the temporary
gendisk reference with put_disk() before recovering queue_data,
so disk_release() can tear down the owned queue.

Also pair GenDisk::drop() with put_disk() after del_gendisk().
Once a Rust GenDisk has been added with device_add_disk(),
del_gendisk() only unregisters it; the final gendisk reference
still has to be dropped to complete the release path.

Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Link: https://patch.msgid.link/b70aff9a920cc42110fe5cf454c3099561863519.1780063368.git.royenheart@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agobpf: Fix security_bpf_prog_load() error handling
Paul Moore [Sat, 23 May 2026 16:00:26 +0000 (12:00 -0400)] 
bpf: Fix security_bpf_prog_load() error handling

If security_bpf_prog_load() fails there is no need to call into
security_bpf_prog_free() as the LSM will handle the cleanup of any partial
LSM state before returning to the caller with an error.  Thankfully this
isn't an issue with any of the existing code as the LSMs which currently
provide BPF hook callback implementations don't allocate any internal
state, but this is something we want to fix for potential future users.

Signed-off-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20260523160025.16363-2-paul@paul-moore.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agobpf: reject overlarge global subprog argument sizes
Taegu Ha [Thu, 28 May 2026 06:21:55 +0000 (15:21 +0900)] 
bpf: reject overlarge global subprog argument sizes

Global subprogram argument checking derives generic pointer sizes from BTF
and passes the resolved size to check_mem_reg() as a u32. The access-size
validation path then uses a signed int, and stack pointers negate the value
before calling check_helper_mem_access().

This creates a wrap when BTF describes a pointee size larger than S32_MAX.
For example, a global subprogram argument of type:

  int (*p)[0x3fffffff]

has a BTF-resolved pointee size of 0xfffffffc bytes. At a call site the
caller can pass a pointer to a 4-byte stack slot at fp-4. The current
PTR_TO_STACK path computes:

  size = -(int)mem_size

so 0xfffffffc becomes -4 as a signed int and the negation validates only
a 4-byte stack range. That range is covered by the caller's stack slot,
so the call is accepted.

The callee is then verified independently with R1 as PTR_TO_MEM and
mem_size 0xfffffffc. A small instruction such as:

  r0 = *(u32 *)(r1 + 4)

is accepted as being inside that BTF-described memory region. At run time,
however, the actual argument value is still fp-4, so r1 + 4 addresses fp+0,
outside the 4-byte object that the caller provided.

Reject sizes that cannot be represented by the verifier's signed
access-size API before the stack-specific negation. Add a verifier
regression test for the oversized BTF argument.

Fixes: 2cb27158adb3 ("bpf: poison dead stack slots")
Signed-off-by: Taegu Ha <hataegu0826@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260528062155.3988156-1-hataegu0826@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoMerge branch 'bpf-arm64-stack-argument-fixes'
Alexei Starovoitov [Mon, 1 Jun 2026 00:49:21 +0000 (17:49 -0700)] 
Merge branch 'bpf-arm64-stack-argument-fixes'

Puranjay Mohan says:

====================
bpf, arm64: Stack argument fixes

Patch 1 fixes a redundant MOV in the arm64 JIT's
emit_stack_arg_store_imm() and clarifies the stack layout comments. This
is not a bug fix but an improvement.

Patch 2 bumps the stack argument tests from 6-8 args to at least 10 so
they actually exercise the native stack on arm64, where x0-x7 cover the
first 8 arguments.
====================

Link: https://patch.msgid.link/20260528161750.1900674-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoselftests/bpf: Use at least 10 args in stack argument tests
Puranjay Mohan [Thu, 28 May 2026 16:17:48 +0000 (09:17 -0700)] 
selftests/bpf: Use at least 10 args in stack argument tests

On arm64, the first 8 arguments are passed in registers (x0-x7), so
tests with 8 or fewer arguments never exercise the native stack argument
path in the JIT. Increase argument counts to at least 10 across all
BPF-to-BPF subprog and kfunc stack argument tests so that at least 2
arguments land on the arm64 stack.

For the two-callees test, bump foo1 from 8 to 10 and foo2 from 10 to 12
args to preserve the different-stack-depth flavor of the test.

The bpf_kfunc_call_stack_arg_mem kfunc is left unchanged at 7 args to
avoid breaking the precision backtracking test which relies on hardcoded
verifier log instruction indices.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260528161750.1900674-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agobpf, arm64: Fix redundant MOV and clarify stack arg comments
Puranjay Mohan [Thu, 28 May 2026 16:17:47 +0000 (09:17 -0700)] 
bpf, arm64: Fix redundant MOV and clarify stack arg comments

emit_stack_arg_store_imm() materializes the immediate into tmp and
then moves tmp to the target register (x5-x7).  Emit the immediate
directly into the target register to avoid the redundant MOV.

While here, qualify the bare "FP" in the stack-layout ASCII art as
"A64_FP" so it is not confused with BPF_FP, and note that incoming
stack arguments sit above the FP/LR pair pushed by the callee
prologue.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260528161750.1900674-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agolibbpf: Skip endianness swap when loader generation failed
Daniel Borkmann [Fri, 29 May 2026 16:28:29 +0000 (18:28 +0200)] 
libbpf: Skip endianness swap when loader generation failed

bpf_gen__prog_load() byte-swaps the program insns and the {func,line}_info
and CO-RE relo blobs in place for cross-endian targets. The blob offsets
come from add_data(), which returns 0 on failure: realloc_data_buf() either
frees and NULLs gen->data_start (realloc OOM) or returns early on an
already-latched gen->error, leaving a stale, possibly too-small buffer.

Neither bswap site checked for this. With gen->swapped_endian set and a
failed generation, "gen->data_start + off" becomes NULL + 0. Guard the
same way via !gen->error so they are skipped once generation has failed.

Fixes: 8ca3323dce43 ("libbpf: Support creating light skeleton of either endianness")
Reported-by: sashiko <sashiko@sashiko.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260529162829.315921-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agolibbpf: Also reset {insn,data}_cur on realloc failure
Daniel Borkmann [Fri, 29 May 2026 09:41:18 +0000 (11:41 +0200)] 
libbpf: Also reset {insn,data}_cur on realloc failure

realloc_insn_buf() as well as realloc_data_buf() free and NULL
gen->insn_start / gen->data_start on -ENOMEM but leave gen->insn_cur /
gen->data_cur pointing into the old, freed buffer. Just reset the
cursors to NULL alongside the base pointers so the freed state is
coherent.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260529094119.307264-3-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agolibbpf: Skip hash computation when loader generation failed
Daniel Borkmann [Fri, 29 May 2026 09:41:17 +0000 (11:41 +0200)] 
libbpf: Skip hash computation when loader generation failed

bpf_gen__finish() calls compute_sha_update_offsets() gated only on
the gen_hash option, without first consulting gen->error. On a failed
generation this is buggy: a failed realloc_data_buf() sets gen->data_start
to NULL (leaving gen->data_cur dangling), so compute_sha_update_offsets()
runs libbpf_sha256() over a NULL buffer with a bogus length; a failed
realloc_insn_buf() likewise sets gen->insn_start to NULL and the hash
immediates get patched through that NULL base.

The computed program is discarded in either case, since the following
"if (!gen->error)" block does not publish opts->insns once an error is
set. Thus, skip the hash pass when generation has already failed.

Fixes: ea923080c145 ("libbpf: Embed and verify the metadata hash in the loader")
Reported-by: sashiko <sashiko@sashiko.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260529094119.307264-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agolibbpf: Drop redundant self-loop in emit_check_err
Daniel Borkmann [Fri, 29 May 2026 09:41:16 +0000 (11:41 +0200)] 
libbpf: Drop redundant self-loop in emit_check_err

When the cleanup-label jump offset does not fit in s16, emit_check_err()
sets gen->error = -ERANGE and then emits a BPF_JMP_IMM(BPF_JA, 0, 0, -1)
self-loop.

The latter emit() is dead: gen->error is assigned on the preceding line,
and emit() then bails out early in realloc_insn_buf() the moment gen->error
is set, so the jump is never written into the instruction stream.

gen->error alone already marks the generation as failed. This is a follow-up
to 7dd62566e0d1 ("libbpf: fix off-by-one in emit_signature_match jump offset")
which removed the jump in emit_signature_match() but not in other locations.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260529094119.307264-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agobpf: Update bpf maintainers
Martin KaFai Lau [Fri, 29 May 2026 20:39:09 +0000 (13:39 -0700)] 
bpf: Update bpf maintainers

I am making a life change and will take a long break
from my current work, so I will step down from the "M:" responsibility.

I am currently a "R:" in "BPF [GENERAL]", this part stays unchanged.
I am folding most of the parts into "BPF [GENERAL]".

For "BPF [BTF]", it is long overdue as I am no longer involved.
It is folded into the "BPF [GENERAL]".

The "BPF [STORAGE & CGROUPS]" will also be covered by "BPF [GENERAL]".

For struct_ops, its usage is no longer limited to networking,
so this naturally should move back to "BPF [GENERAL]".

For the reuseport, it will continue to be maintained together
by "BPF [GENERAL]" and the "NETWORKING [SOCKETS]".

For other "BPF [NETWORKING]...", I am moving myself to "R:".

Thanks!

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260529203909.1222164-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoring-buffer: Better comment the use of RB_MISSED_EVENTS
Steven Rostedt [Fri, 29 May 2026 02:37:38 +0000 (22:37 -0400)] 
ring-buffer: Better comment the use of RB_MISSED_EVENTS

If the persistent ring buffer is detected on boot up to have a corrupted
sub-buffer, that sub-buffer is cleared to zero and its commit value has
the RB_MISSED_EVENTS bit set. That bit is to allow the "trace",
"trace_pipe" and "trace_pipe_raw" files know that events were dropped by
outputting "[LOST EVENTS]".

Only in this case does that bit get set in the writeable portion of the
ring buffer. When events are dropped in the normal ring buffer, that
information is stored in the cpu_buffer descriptor and the
RB_MISSED_EVENTS is set in the buffer page at the time the page is
consumed. It is never set in the writeable portion of the buffer.

Add comments to describe this better as it can be confusing to know when
the RB_MISSED_EVENTS are set in the commit portion of the buffer page.

Link: https://lore.kernel.org/all/20260529001500.14178455a046a5cbc6180861@kernel.org/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://patch.msgid.link/20260528223738.41276c0e@fedora
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2 weeks agoksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL
Gil Portnoy [Sun, 31 May 2026 23:27:56 +0000 (08:27 +0900)] 
ksmbd: fix use-after-free of a deferred file_lock on double SMB2_CANCEL

A deferred byte-range lock (an SMB2_LOCK that blocks) registers an async work on
conn->async_requests via setup_async_work(), with cancel_fn =
smb2_remove_blocked_lock and cancel_argv[0] pointing at the struct file_lock.

When the request is cancelled, the worker frees the file_lock with
locks_free_lock() and takes the cancelled early-exit, which "goto out"s and never
reaches release_async_work() -- the only site that unlinks the work from
conn->async_requests and clears cancel_fn/cancel_argv. The work therefore stays
matchable on async_requests with a live cancel_fn pointing at the freed file_lock,
until connection teardown finally runs release_async_work().

smb2_cancel() fires cancel_fn unconditionally with no state guard, so a second
SMB2_CANCEL for the same AsyncId, arriving in that window, re-runs
smb2_remove_blocked_lock() on the freed file_lock -- a slab use-after-free:

  BUG: KASAN: slab-use-after-free in __locks_delete_block
    __locks_delete_block
    locks_delete_block
    ksmbd_vfs_posix_lock_unblock
    smb2_remove_blocked_lock
    smb2_cancel                 <- 2nd SMB2_CANCEL fires cancel_fn
    handle_ksmbd_work
  Allocated by ...: locks_alloc_lock <- smb2_lock
  Freed by ...:     locks_free_lock  <- smb2_lock (cancelled branch)
  ... cache file_lock_cache of size 192

Reproduced on mainline with KASAN by an authenticated SMB client.

Skip a work whose state is already KSMBD_WORK_CANCELLED so its cancel callback
cannot be fired a second time.

Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 weeks agoksmbd: fix durable reconnect double-bind race in ksmbd_reopen_durable_fd
Gil Portnoy [Thu, 28 May 2026 00:00:00 +0000 (00:00 +0000)] 
ksmbd: fix durable reconnect double-bind race in ksmbd_reopen_durable_fd

Two concurrent same-user DHnC reconnects can both observe fp->conn == NULL
before either sets it. ksmbd_reopen_durable_fd() checks fp->conn to guard
against a handle already being reconnected, but the check and the binding
assignment are not atomic: both threads pass the guard, both call
ksmbd_conn_get() on the same fp, and both eventually reach
kfree(fp->owner.name) -- a double-free of the owner.name slab object.
The double-bound ksmbd_file also causes a write-UAF on the 344-byte
ksmbd_file_cache object when a concurrent smb2_close() spins on fp->f_lock
after the object has been freed by the losing reconnect path.

KASAN on 7.1-rc5 (48-thread concurrent reconnect, 3000 cycles):
  BUG: KASAN: double-free in ksmbd_reopen_durable_fd+0x268/0x308
  BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xac/0x150
    Write of size 4 at offset 24 into freed ksmbd_file_cache object
Five double-bind windows observed; 63 total KASAN reports triggered.

Fix: validate and claim fp->conn under write_lock(&global_ft.lock) so the
check-and-claim is atomic. ksmbd_lookup_durable_fd() already treats
fp->conn != NULL as "in use" and skips such an fp; setting fp->conn before
dropping the lock closes the race. ksmbd_conn_get() is a non-sleeping
refcount increment, safe under the rwlock. The rollback path on __open_id()
failure also clears fp->conn/tcon under the lock so concurrent readers see
a consistent state.

Fixes: b1f1e80620de ("ksmbd: centralize ksmbd_conn final release to plug transport leak")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 weeks agoksmbd: fix NULL-deref of opinfo->conn in oplock/lease break notifiers
Gil Portnoy [Thu, 28 May 2026 00:00:00 +0000 (00:00 +0000)] 
ksmbd: fix NULL-deref of opinfo->conn in oplock/lease break notifiers

smb2_oplock_break_noti() and smb2_lease_break_noti() read opinfo->conn
into a local with neither READ_ONCE() nor a NULL check.  Both run from
oplock_break() after opinfo_get_list() has dropped ci->m_lock, so a
concurrent SMB2 LOGOFF (session_fd_check()) can set op->conn = NULL
under ci->m_lock within that window.  ksmbd_conn_r_count_inc(conn) then
writes through NULL at offset 0xc4 -- a remotely triggerable oops.

Guard both reads the way compare_guid_key() already does: read
opinfo->conn with READ_ONCE() and return early if it is NULL, before
allocating the work struct so nothing leaks.  A NULL conn means the
client is gone and the break is moot, so return 0; oplock_break() treats
that as success and runs the normal teardown.

Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Assisted-by: Henry (Claude):claude-opus-4
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 weeks agoLinux 7.1-rc6 v7.1-rc6
Linus Torvalds [Sun, 31 May 2026 22:14:24 +0000 (15:14 -0700)] 
Linux 7.1-rc6

2 weeks agoMerge tag 'media/v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Sun, 31 May 2026 18:50:39 +0000 (11:50 -0700)] 
Merge tag 'media/v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - rc: igorplugusb: fix control request setup packet

 - vsp1: revert a couple patches to fix regressions when setting DRM
   pipelines

* tag 'media/v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: rc: igorplugusb: fix control request setup packet
  Revert "media: renesas: vsp1: brx: Fix format propagation"
  Revert "media: renesas: vsp1: Initialize format on all pads"

2 weeks agosched_ext: Guard BPF arena helper calls to fix 32-bit build
Tejun Heo [Sun, 31 May 2026 18:01:47 +0000 (08:01 -1000)] 
sched_ext: Guard BPF arena helper calls to fix 32-bit build

BPF arena (kernel/bpf/arena.c) is compiled only on MMU && 64BIT, while
SCHED_CLASS_EXT depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF with no
64BIT requirement. On a 32-bit arch with a BPF JIT, SCX builds while the
arena helpers are absent, so the cid-form code's unconditional calls to
bpf_prog_arena() and bpf_arena_map_kern_vm_start() fail to link:

  build_policy.o: undefined reference to `bpf_prog_arena'
  build_policy.o: undefined reference to `bpf_arena_map_kern_vm_start'

Guard the three call sites with the same MMU && 64BIT condition that gates
arena.o. A cid-form scheduler needs a BPF arena, which isn't available on
such builds, so it can't run there regardless. cpu-form schedulers don't
touch the arena and are unaffected.

This is a quick workaround to get past the build errors. A fuller fix may
make the whole cid-form path conditional on the same condition, or drop
32-bit support outright.

Fixes: 0e2819cba977 ("sched_ext: Require an arena for cid-form schedulers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605310454.U9iByL2n-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202605310926.APXMc0RJ-lkp@intel.com/
Signed-off-by: Tejun Heo <tj@kernel.org>
3 weeks agodocs: cgroup: Fix stale source file paths
Costa Shulyupin [Sun, 31 May 2026 14:00:45 +0000 (17:00 +0300)] 
docs: cgroup: Fix stale source file paths

Update two references to files that were moved:
- kernel/cgroup.c -> kernel/cgroup/cgroup.c
- tools/cgroup/cgroup_event_listener.c ->
  samples/cgroup/cgroup_event_listener.c

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
3 weeks agoMerge branch 'bpf-align-syscall-writeback-behavior-with-user-declared-size'
Alexei Starovoitov [Sun, 31 May 2026 16:16:55 +0000 (09:16 -0700)] 
Merge branch 'bpf-align-syscall-writeback-behavior-with-user-declared-size'

Yuyang Huang says:

====================
bpf: Align syscall writeback behavior with user-declared size

This series fixes an out-of-bounds write vulnerability in BPF_PROG_QUERY
while maintaining backward compatibility for older userspace applications.

BPF_PROG_QUERY unconditionally writes back the 'query.revision' field
to userspace. If userspace passes a smaller 'bpf_attr' structure (e.g. 40
bytes, which was the cgroup query layout before 'query.revision' was
added), the kernel performs an out-of-bounds write.

We address this by propagating the user-provided 'uattr_size' down to
the cgroup query handlers and conditionally skipping the write-back of
'query.revision' if the buffer is too small. This allows legacy cgroup
queries to succeed safely.

tcx and netkit queries are left unchanged since they were introduced in
the same merge window as 'query.revision' and have no legacy callers.

Finally, we add a selftest to verify these boundary behaviors.

Changes since v2:
- Propagate uattr_size to __cgroup_bpf_query() and conditionally write
  revision (instead of unconditionally rejecting smaller sizes in front-gate).
- Update BPF selftests to verify that cgroup queries succeed with
  OLD_QUERY_SIZE without writing revision, and succeed with FULL_QUERY_SIZE.
- Remove early size checks in the front-gate to keep the patch minimal.

Changes since v1:
- Simplify the kernel fix to checking the size only in bpf_prog_query().
- Revert all other subsystem query plumbing changes.
- Update BPF selftest to target BPF_CGROUP_INET_INGRESS cgroup query, and
  add verification for attr size boundaries.
====================

Link: https://patch.msgid.link/20260531075600.4058207-1-yuyanghuang@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 weeks agoselftests/bpf: add verification for BPF_PROG_QUERY attr size boundaries
Yuyang Huang [Sun, 31 May 2026 07:56:00 +0000 (15:56 +0800)] 
selftests/bpf: add verification for BPF_PROG_QUERY attr size boundaries

Add a new selftest to verify that the BPF syscall (specifically
BPF_PROG_QUERY) correctly handles different user-declared attribute sizes.

Specifically, verify that:
- For cgroup queries, a query with a size that covers 'prog_cnt' but is
  smaller than 'revision' (OLD_QUERY_SIZE) succeeds, but does not write
  to 'revision' (verifying backward compatibility).
- A query with full size (FULL_QUERY_SIZE) succeeds and writes both
  'prog_cnt' and 'revision'.

Fixes: 120933984460 ("bpf: Implement mprog API on top of existing cgroup progs")
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
Link: https://lore.kernel.org/r/20260531075600.4058207-3-yuyanghuang@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 weeks agobpf: fix BPF_PROG_QUERY OOB write and cgroup backward compat
Yuyang Huang [Sun, 31 May 2026 07:55:59 +0000 (15:55 +0800)] 
bpf: fix BPF_PROG_QUERY OOB write and cgroup backward compat

BPF_PROG_QUERY writes back the 'query.revision' field unconditionally to
userspace. If userspace passes a smaller 'bpf_attr' structure (e.g. 40
bytes, which was the layout before the addition of 'query.revision'),
the kernel performs an out-of-bounds write.

Fix this by propagating the user-provided attribute size 'uattr_size'
down to the cgroup query handlers, and conditionally skipping writing
the revision field to userspace when the provided buffer size is
insufficient.

query.revision in bpf_mprog_query is structurally identical to the
cgroup case: a late tail field, written unconditionally.

But the backward-compat hazard is not the same.

The min-historical-size test is per command, and bpf_mprog_query only
serves attach types that were born with revision in the struct:

- tcx_prog_query -> BPF_TCX_INGRESS/EGRESS
- netkit_prog_query -> BPF_NETKIT_PRIMARY/PEER

tcx, netkit, the revision field, and bpf_mprog_query itself all landed in
the same v6.6 merge window (053c8e1f235d added the mprog query API +
revision; tcx in e420bed02507, netkit in 35dfaad7188c). There has never
been a tcx/netkit BPF_PROG_QUERY userspace that doesn't know about
revision. So for these commands the minimum legitimate struct already
covers offset 56-64 — no old binary can be broken here.

Contrast with cgroup: BPF_PROG_QUERY on cgroup attach types shipped in
2017; revision write-back was bolted on years later (120933984460). That
path has a real population of pre-revision callers.

Fixes: 120933984460 ("bpf: Implement mprog API on top of existing cgroup progs")
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
Link: https://lore.kernel.org/r/20260531075600.4058207-2-yuyanghuang@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 weeks agoMerge tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 31 May 2026 15:52:16 +0000 (08:52 -0700)] 
Merge tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Make the clearcpuid= boot parameter less prominent
   and warn about its dangers & caveats (Borislav Petkov)

 - Do not access the (new) PLATFORM_ID MSR when running
   as a guest (Borislav Petkov)

 - x86 ftrace: Relocate %rip-relative percpu refs in dynamic
   trampolines, to fix crash when using such trampolines
   (Alexis Lothoré)

 - Fix x86-64 CFI build error (Peter Zijlstra)

 - Revert FPU signal return magic number check optimization,
   because it broke CRIU and gVisor in certain FPU configurations
   (Andrei Vagin)

* tag 'x86-urgent-2026-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "x86/fpu: Refine and simplify the magic number check during signal return"
  x86/kvm/vmx: Fix x86_64 CFI build
  x86/ftrace: Relocate %rip-relative percpu refs in dynamic trampolines
  x86/microcode: Do not access MSR_IA32_PLATFORM_ID when running as a guest
  Documentation/arch/x86: Hide clearcpuid=

3 weeks agoALSA: usb-audio: Add quirk flag for Edifier MF200
Rong Zhang [Sun, 31 May 2026 15:45:22 +0000 (23:45 +0800)] 
ALSA: usb-audio: Add quirk flag for Edifier MF200

The UAC mixer of Edifier MF200 works fine except that its volume GET_CUR
method is somehow stubbed and returns a constant value. Since commit
86aa1ea1f15c ("ALSA: usb-audio: Do not expose sticky mixers"), the
sticky check considers the mixer to be sticky and unnecessarily disables
the mixer.

Add a quirk table entry matching VID/PID=0x2d99/0xa024 and applying
the MIXER_SKIP_GET_CUR_VOL quirk flag, so that the mixer is usable
again.

Quirky device sample:

  usb 1-3.2: new full-speed USB device number 7 using xhci_hcd
  usb 1-3.2: New USB device found, idVendor=2d99, idProduct=a024, bcdDevice= 0.00
  usb 1-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
  usb 1-3.2: Product: EDIFIER MF200
  usb 1-3.2: Manufacturer: EDIFIER
  usb 1-3.2: SerialNumber: EDI00000X06
  input: EDIFIER EDIFIER MF200 Consumer Control as /devices/pci0000:00/0000:00:02.1/0000:05:00.0/0000:06:0c.0/0000:0e:00.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:2D99:A024.0003/input/input8
  input: EDIFIER EDIFIER MF200 Mouse as /devices/pci0000:00/0000:00:02.1/0000:05:00.0/0000:06:0c.0/0000:0e:00.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:2D99:A024.0003/input/input9
  input: EDIFIER EDIFIER MF200 Keyboard as /devices/pci0000:00/0000:00:02.1/0000:05:00.0/0000:06:0c.0/0000:0e:00.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:2D99:A024.0003/input/input10
  input: EDIFIER EDIFIER MF200 as /devices/pci0000:00/0000:00:02.1/0000:05:00.0/0000:06:0c.0/0000:0e:00.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:2D99:A024.0003/input/input11
  input: EDIFIER EDIFIER MF200 as /devices/pci0000:00/0000:00:02.1/0000:05:00.0/0000:06:0c.0/0000:0e:00.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:2D99:A024.0003/input/input12
  hid-generic 0003:2D99:A024.0003: input,hiddev1,hidraw2: USB HID v1.10 Mouse [EDIFIER EDIFIER MF200] on usb-0000:0e:00.0-3.2/input0
  usb 1-3.2: 9:1: sticky mixer values (-32768/-32513/1 => -32702), disabling

Reported-by: Steve Smith <tarkasteve@gmail.com>
Closes: https://lore.kernel.org/r/CAHLWS5FJCx66GQ-O10pu+nEudEo_QgQAM9vt76T7vT0zGPPC1g@mail.gmail.com
Tested-by: Steve Smith <tarkasteve@gmail.com>
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260531-uac-quirk-get-cur-vol-v4-3-ede643dca151@rong.moe
3 weeks agoALSA: usb-audio: Add quirk flag for Sennheiser MOMENTUM 3
Rong Zhang [Sun, 31 May 2026 15:45:21 +0000 (23:45 +0800)] 
ALSA: usb-audio: Add quirk flag for Sennheiser MOMENTUM 3

The Sennheiser MOMENTUM 3 is a wireless around-ear headphones featuring
ANC, which can be connected via Bluetooth or USB-C.

When connecting via USB-C, its UAC mixer works fine and precisely
corresponds to the reported dB range. However, the mixer's volume
GET_CUR method is somehow stubbed and returns a constant value (15dB).
Since commit 86aa1ea1f15c ("ALSA: usb-audio: Do not expose sticky
mixers"), the sticky check considers the mixer to be sticky and
unnecessarily disables the mixer.

Add a quirk table entry matching VID/PID=0x1377/0x6004 and applying
the MIXER_GET_CUR_BROKEN quirk flag, so that the mixer is usable again.

Quirky device sample:

  usb 7-1.4.4.1.1.1: new full-speed USB device number 30 using xhci_hcd
  usb 7-1.4.4.1.1.1: New USB device found, idVendor=1377, idProduct=6004, bcdDevice=38.85
  usb 7-1.4.4.1.1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
  usb 7-1.4.4.1.1.1: Product: MOMENTUM 3
  usb 7-1.4.4.1.1.1: Manufacturer: Sennheiser electronic GmbH & Co. KG
  usb 7-1.4.4.1.1.1: SerialNumber: <REDACTED>
  usb 7-1.4.4.1.1.1: Found last interface = 0
  usb 7-1.4.4.1.1.1: 1:1: add audio endpoint 0x3
  usb 7-1.4.4.1.1.1: Creating new data endpoint #3
  usb 7-1.4.4.1.1.1: 1:1 Set sample rate 48000, clock 0
  usb 7-1.4.4.1.1.1: 6:0: sticky mixer values (0/11520/768 => 3840), disabling
  usb 7-1.4.4.1.1.1: [6] FU [PCM Playback Volume] skipped due to invalid volume
  input: Sennheiser electronic GmbH & Co. KG MOMENTUM 3 as /devices/pci0000:00/0000:00:08.3/0000:67:00.4/usb7/7-1/7-1.4/7-1.4.4/7-1.4.4.1/7-1.4.4.1.1/7-1.4.4.1.1.1/7-1.4.4.1.1.1:1.2/0003:1377:6004.002B/input/input208
  input: Sennheiser electronic GmbH & Co. KG MOMENTUM 3 Consumer Control as /devices/pci0000:00/0000:00:08.3/0000:67:00.4/usb7/7-1/7-1.4/7-1.4.4/7-1.4.4.1/7-1.4.4.1.1/7-1.4.4.1.1.1/7-1.4.4.1.1.1:1.2/0003:1377:6004.002B/input/input209
  hid-generic 0003:1377:6004.002B: input,hiddev99,hidraw12: USB HID v1.11 Device [Sennheiser electronic GmbH & Co. KG MOMENTUM 3] on usb-0000:67:00.4-1.4.4.1.1.1/input2

Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260531-uac-quirk-get-cur-vol-v4-2-ede643dca151@rong.moe
3 weeks agoALSA: usb-audio: Add QUIRK_FLAG_MIXER_GET_CUR_BROKEN
Rong Zhang [Sun, 31 May 2026 15:45:20 +0000 (23:45 +0800)] 
ALSA: usb-audio: Add QUIRK_FLAG_MIXER_GET_CUR_BROKEN

Since commit 86aa1ea1f15c ("ALSA: usb-audio: Do not expose sticky
mixers"), the UAC mixer core utilizes volume SET_CUR and GET_CUR to
identify devices with sticky mixers. Unfortunately, even though most
devices with sticky GET_CUR also have corresponding sticky SET_CUR,
which I actually met more since the commit had been merged, there is
also a rare case that some devices may have volume mixers that responds
to SET_CUR properly but with its GET_CUR stubbed. This cause the sticky
check to consider the mixer to be sticky and unnecessarily disable it.

As the sticky check can't distinguish between sticky mixers and working
SET_CUR but broken GET_CUR, add QUIRK_FLAG_MIXER_GET_CUR_BROKEN to tell
that the device should fall into the second category when GET_CUR
returns a constant value. In this case, the sticky check becomes
non-fatal and only disables GET_CUR instead of the whole mixer. The
current volume will then be provided by the internal cache that stores
the last set volume.

An info message prompting users to check MIXER_GET_CUR_BROKEN for
potential sticky mixers is also added, so that users can learn how to do
some experiments to determine what's going on. If the mixer surprisingly
turns out to be non-sticky, they can submit a patch for a new quirk
table entry.

Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260531-uac-quirk-get-cur-vol-v4-1-ede643dca151@rong.moe
3 weeks agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 31 May 2026 15:45:08 +0000 (08:45 -0700)] 
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two core changes, the only one of significance being the change to
  kick queues in SDEV_CANCEL which had a small window for stuck
  requests.

  The major driver fixes are the one to the FC transport class to widen
  the FPIN counter to counter a theoretical (and privileged) fabric
  traffic injection attack and the other is an iscsi fix where a
  malicious target could trick the kernel into an output buffer overrun.

  Both the driver fixes were AI assisted"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: iscsi: Validate CHAP_R length before base64 decode
  scsi: target: iscsi: Bound iscsi_encode_text_output() appends to rsp_buf
  scsi: target: iscsi: Fix CRC overread and double-free in iscsit_handle_text_cmd()
  scsi: fcoe: Reject FIP descriptors with zero fip_dlen in CVL walker
  scsi: scsi_transport_fc: Widen FPIN pname walker counter to u32
  scsi: scsi_debug: Add missing newline in scsi_debug_device_reset()
  scsi: megaraid_sas: Fix NULL pointer dereference on firmware duplicate completion
  scsi: devinfo: Add BLIST_NO_RSOC for Promise VTrak E310f
  scsi: core: Run queues for all non-SDEV_DEL devices from scsi_run_host_queues

3 weeks agoMerge tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 31 May 2026 15:33:08 +0000 (08:33 -0700)] 
Merge tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - davinci: fix fallback bus frequency on missing clock-frequency

 - virtio: mark device ready initially

* tag 'i2c-for-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: virtio: mark device ready before registering the adapter
  i2c: davinci: fix division by zero on missing clock-frequency

3 weeks agoMerge tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor...
Linus Torvalds [Sun, 31 May 2026 15:27:18 +0000 (08:27 -0700)] 
Merge tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - updates to Elan I2C touchpad driver to handle a new IC type and to
   validate size of supplied firmware to prevent OOB access

 - updates to Xpad controller driver to recognize ASUS ROG RAIKIRI II
   and "Nova 2 Lite" from GameSir controllers as well as a fix to
   prevent a potential OOB access when handling "Share" button

 - an update to Synaptics touchpad driver to use RMI mode for touchpad
   in Thinkpad E490

 - updates to Atmel MXT driver adding checks to prevent potential OOB
   accesses

 - a fix to IMS PCU driver to free correct amount of memory when tearing
   it down

 - a fixup to the recent change to Atlas buttons driver

 - a small cleanup in fm801-fp for PCI IDs table initialisation

* tag 'input-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ims-pcu - fix usb_free_coherent() size in ims_pcu_buffers_free()
  Input: synaptics - add LEN2058 to SMBus passlist for ThinkPad E490
  Input: atlas - check ACPI_COMPANION() against NULL
  Input: atmel_mxt_ts - check mem_size before calculating config memory size
  Input: atmel_mxt_ts - fix boundary check in mxt_prepare_cfg_mem
  Input: fm801-gp - simplify initialisation of pci_device_id array
  Input: xpad - add "Nova 2 Lite" from GameSir
  Input: xpad - add support for ASUS ROG RAIKIRI II
  Input: elan_i2c - validate firmware size before use
  Input: xpad - fix out-of-bounds access for Share button
  Input: usbtouchscreen - clamp NEXIO data_len/x_len to URB buffer size
  Input: elan_i2c - increase device reset wait timeout after update FW
  Input: elan_i2c - add ic type 0x19

3 weeks agoRISC-V: KVM: Batch stage-2 TLB flushes
Jinyu Tang [Sun, 12 Apr 2026 02:38:22 +0000 (10:38 +0800)] 
RISC-V: KVM: Batch stage-2 TLB flushes

KVM RISC-V triggers a TLB flush for every single stage-2 PTE
modification (unmap or write-protect) now. Although KVM coalesces the
hardware IPIs, the software overhead of executing the flush work
for every page is large, especially during dirty page tracking.

Following the approach used in x86 and arm64, this patch optimizes
the MMU logic by making the PTE manipulation functions return a boolean
indicating if a leaf PTE was actually changed. The outer MMU functions
bubble up this flag to batch the remote TLB flushes.

Consequently, the flush operation is executed only once per batch.
Moving it outside of the `mmu_lock` also reduces lock contention.

Tested with tools/testing/selftests/kvm on a 4-vCPU guest (Host
environment: QEMU 10.2.1 RISC-V)
1. demand_paging_test (1GB memory)
  time ./demand_paging_test -b 1G -v 4
- Total execution time reduced from ~2m39s to ~2m31s
2. dirty_log_perf_test (1GB memory)
  ./dirty_log_perf_test -b 1G -v 4
- "Clear dirty log time" per iteration dropped significantly from
   ~3.40s to ~0.18s

Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Jinyu Tang <tjytimi@163.com>
Link: https://lore.kernel.org/r/20260412023822.83341-1-tjytimi@163.com
Signed-off-by: Anup Patel <anup@brainfault.org>
3 weeks agoMerge branch 'for-linus' into for-next
Takashi Iwai [Sun, 31 May 2026 14:49:30 +0000 (16:49 +0200)] 
Merge branch 'for-linus' into for-next

Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: hda/tas2781: Fix spelling mistake: "Froce" -. "Force"
Colin Ian King [Sun, 31 May 2026 10:13:39 +0000 (11:13 +0100)] 
ALSA: hda/tas2781: Fix spelling mistake: "Froce" -. "Force"

There is a spelling mistake in a snprintf statement. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20260531101339.42155-1-colin.i.king@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agoALSA: usb-audio: Set the value of potential sticky mixers to maximum
Rong Zhang [Sat, 30 May 2026 19:52:49 +0000 (03:52 +0800)] 
ALSA: usb-audio: Set the value of potential sticky mixers to maximum

It makes no sense to restore the saved value for a sticky mixer, since
setting any value is a no-op.

However, in some rare cases, SET_CUR is effective despite GET_CUR always
returns a constant value. These mixers are not sticky, but there's no
way to distinguish them. Without any additional information, the best
thing we can do is to set the mixer value to the maximum before bailing
out, so that a soft mixer can still reach the maximum hardware volume if
the mixer turns out to be non-sticky. Meanwhile, all channels must be
synchronized to prevent imbalance volume.

Fixes: 86aa1ea1f15c ("ALSA: usb-audio: Do not expose sticky mixers")
Signed-off-by: Rong Zhang <i@rong.moe>
Link: https://patch.msgid.link/20260531-uac-sticky-error-path-v1-1-12c2329d17ef@rong.moe
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 weeks agowifi: iwlwifi: pcie: simplify the resume flow if fast resume is not used
Emmanuel Grumbach [Sun, 31 May 2026 10:30:19 +0000 (13:30 +0300)] 
wifi: iwlwifi: pcie: simplify the resume flow if fast resume is not used

In most distributions, NetworkManager shuts the device down before
entering system suspend, so fast suspend is typically not used.

On older devices, resume currently tries to grab NIC access to infer
whether the device was powered off while suspended. That probe is only
meaningful for the fast-suspend path where the device is expected to
remain alive.

Unfortunately, for unclear reasons, grabbing NIC access was harmful as
reported in the bugzilla ticket below.

Workaround this issue by simply not grabbing NIC access if fast suspend
is not used.

Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221501
Assisted-by: GitHub Copilot:gpt-5.3-codex
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://patch.msgid.link/20260531133005.e2ed9e0cd44f.If283625983a843933e0c01561a421daff184e9e9@changeid
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
3 weeks agomedia: qcom: camss: vfe-340: Support for PIX client
Loic Poulain [Tue, 14 Apr 2026 18:52:02 +0000 (20:52 +0200)] 
media: qcom: camss: vfe-340: Support for PIX client

Add support for the vfe-340 PIX write engine, enabling frame capture
through the PIX video device (e.g. msm_vfe0_pix). The PIX path requires
a separate configuration flow from RDI, including cropping setup, line-
based write engine configuration, and the correct packer format based
on the input pixel format.

In contrast to RDI, the PIX interface embeds a lightweight processing
engine we can use for cropping, configuring custom stride/alignment,
and, in the future, extracting frame statistics.

The functionality has been validated on Arduino-Uno-Q with:
media-ctl -d /dev/media0 --reset
media-ctl -d /dev/media0 -l '"msm_csiphy0":1->"msm_csid0":0[1],"msm_csid0":4->"msm_vfe0_pix":0[1]'
media-ctl -d /dev/media0 -V '"imx219 1-0010":0[fmt:SRGGB8_1X8/640x480 field:none]'
media-ctl -d /dev/media0 -V '"msm_csiphy0":0[fmt:SRGGB8_1X8/640x480 field:none]'
media-ctl -d /dev/media0 -V '"msm_csid0":0[fmt:SRGGB8_1X8/640x480 field:none]'
media-ctl -d /dev/media0 -V '"msm_vfe0_pix":0[fmt:SRGGB8_1X8/640x480 field:none]'
yavta -B capture-mplane --capture=3 -n 3 -f SRGGB8 -s 640x480 /dev/video3

Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
[bod: Squash down fix for bpp unused in vfe_packer_format]
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
3 weeks agomd/raid0: use str_plural helper in dump_zones
Thorsten Blum [Wed, 27 May 2026 14:19:33 +0000 (16:19 +0200)] 
md/raid0: use str_plural helper in dump_zones

Replace the manual ternary "s" pluralization with str_plural() to
simplify the code.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260527141932.1243503-2-thorsten.blum@linux.dev
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agoraid1: fix nr_pending leak in REQ_ATOMIC bad-block error path
Abd-Alrhman Masalkhi [Sat, 30 May 2026 15:14:11 +0000 (15:14 +0000)] 
raid1: fix nr_pending leak in REQ_ATOMIC bad-block error path

In raid1_write_request(), each per-mirror loop iteration begins by
incrementing rdev->nr_pending. If a REQ_ATOMIC write encounters a
badblock within the requested range, the code jumps to err_handle
without dropping the reference taken for the current mirror.

err_handle's cleanup loop will only decrements for k < i and
r1_bio->bios[k] is non-NULL. The current slot is therefore skipped,
leaving its nr_pending reference leaked permanently. The reference
prevents the rdev from ever being removed, since raid1_remove_conf()
refuses to remove an rdev with nr_pending > 0.

Fix this by calling rdev_dec_pending() before jumping to err_handle.

Fixes: f2a38abf5f1c ("md/raid1: Atomic write support")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Link: https://patch.msgid.link/20260530151411.4119-1-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid1: move the exceed_read_errors condition out of fix_read_error
Christoph Hellwig [Fri, 29 May 2026 05:43:00 +0000 (07:43 +0200)] 
md/raid1: move the exceed_read_errors condition out of fix_read_error

This condition much better fits into the only caller, limiting
fix_read_error to actually fix up data devices after a read error.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260529054308.2720300-3-hch@lst.de
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid1: cleanup handle_read_error
Christoph Hellwig [Fri, 29 May 2026 05:42:59 +0000 (07:42 +0200)] 
md/raid1: cleanup handle_read_error

Unwind the main conditional with duplicate conditions and initialize
variables at initialization time where possible.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260529054308.2720300-2-hch@lst.de
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid1,raid10: fix bio accounting for split md cloned bios
Abd-Alrhman Masalkhi [Fri, 1 May 2026 11:46:51 +0000 (13:46 +0200)] 
md/raid1,raid10: fix bio accounting for split md cloned bios

Use md_cloned_bio() to control bio accounting instead of relying
on r1bio_existed in raid1 or the io_accounting flag in raid10.

The previous logic does not reliably reflect whether a bio is an
md cloned bio. When a failed bio is split and resubmitted via
bio_submit_split_bioset() on the error path, this can lead to either
double accounting for md cloned bios, or missing accounting for bios
returned from bio_submit_split_bioset()

Fix this by using md_cloned_bio() to detect md cloned bios and
skip accounting accordingly.

Fixes: bb2a9acefaf9 ("md/raid1: switch to use md_account_bio() for io accounting")
Fixes: 820455238366 ("md/raid10: switch to use md_account_bio() for io accounting")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260501114652.590037-4-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid1,raid10: fix error-path detection with md_cloned_bio()
Abd-Alrhman Masalkhi [Fri, 1 May 2026 11:46:50 +0000 (13:46 +0200)] 
md/raid1,raid10: fix error-path detection with md_cloned_bio()

Detect the error path using md_cloned_bio() instead of relying
on r1_bio in raid1 or r10_bio->read_slot in raid10, which may be
NULL or -1 after splitting and resubmitting a failed bio.

As a result, the error path may not be recognized and memory
allocations can incorrectly use GFP_NOIO instead of
(GFP_NOIO | __GFP_HIGH), which can lead to a deadlock under
memory pressure.

Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
Fixes: 545250f24809 ("md/raid10: simplify handle_read_error()")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260501114652.590037-3-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid1,raid10: fix deadlock in read error recovery path
Abd-Alrhman Masalkhi [Fri, 1 May 2026 11:46:49 +0000 (13:46 +0200)] 
md/raid1,raid10: fix deadlock in read error recovery path

raid1d and raid10d may resubmit a split md cloned bio while handling
a read error. In this case, resubmitting the bio can lead to a deadlock
if the array is suspended before md_handle_request() acquires an
active_io reference via percpu_ref_tryget_live().

Since the cloned bio already holds an active_io reference,
trying to acquire another reference via percpu_ref_tryget_live()
can lead to a deadlock while the array is suspended.

Fix this by using percpu_ref_get() for md cloned bios.

Fixes: bb2a9acefaf9 ("md/raid1: switch to use md_account_bio() for io accounting")
Fixes: 820455238366 ("md/raid10: switch to use md_account_bio() for io accounting")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Reviewed-by: Yu Kuai <yukuai@fygo.io>
Link: https://patch.msgid.link/20260501114652.590037-2-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd/raid10: reset read_slot when reusing r10bio for discard
Chen Cheng [Fri, 15 May 2026 09:30:19 +0000 (17:30 +0800)] 
md/raid10: reset read_slot when reusing r10bio for discard

put_all_bios() always drops devs[i].bio, but it only drops
devs[i].repl_bio when r10_bio->read_slot < 0. If discard reuses an
r10bio that was previously used for a read, read_slot can still be
non-negative, and discard cleanup can skip bio_put() on repl_bio.

Reset read_slot to -1 when preparing an r10bio for discard so the
replacement bio is always released correctly.

Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260515093019.3436882-1-chencheng@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agomd: skip redundant raid_disks update when value is unchanged
Abd-Alrhman Masalkhi [Tue, 28 Apr 2026 13:05:24 +0000 (15:05 +0200)] 
md: skip redundant raid_disks update when value is unchanged

Calling update_raid_disks() with the same value as the current one
can trigger unnecessary work. For example, RAID1 will reallocate
resources such as the mempool for r1bio.

Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Link: https://patch.msgid.link/20260428130524.448063-1-abd.masalkhi@gmail.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agodm-raid: only requeue bios when dm is suspending
Benjamin Marzinski [Tue, 28 Apr 2026 23:20:10 +0000 (19:20 -0400)] 
dm-raid: only requeue bios when dm is suspending

Returning DM_MAPIO_REQUEUE from the target map() function only requeues
the bio during noflush suspends. During regular operations or during
flushing suspends, it fails the bio. Failing the bio during flushing
suspends is the correct behavior here. The bio cannot be handled, and
dm-raid cannot suspend while it is outstanding. But during normal
operations, dm-raid should not push the bio back to dm. Instead, wait
for the reshape to be resumed.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Xiao Ni <xiao@kernel.org>
Link: https://patch.msgid.link/20260428232010.2785514-1-bmarzins@redhat.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agoMAINTAINERS: Update Li Nan's E-mail address
Li Nan [Fri, 8 May 2026 09:55:13 +0000 (17:55 +0800)] 
MAINTAINERS: Update Li Nan's E-mail address

Change to my new email address on didiglobal.com.

Signed-off-by: Li Nan <magiclinan@didiglobal.com>
Link: https://patch.msgid.link/tencent_8F8173BEDF20E98550D5429DF802F34A7108@qq.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agoMAINTAINERS: update Yu Kuai's email address
Yu Kuai [Wed, 20 May 2026 11:21:48 +0000 (19:21 +0800)] 
MAINTAINERS: update Yu Kuai's email address

Update Yu Kuai's maintainer entries to use the new fygo.io address.

Link: https://patch.msgid.link/20260520112627.1264368-1-yukuai@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fygo.io>
3 weeks agoarm64: dts: renesas: r9a08g046l48-smarc: Enable audio
Biju Das [Thu, 28 May 2026 07:45:45 +0000 (08:45 +0100)] 
arm64: dts: renesas: r9a08g046l48-smarc: Enable audio

Enable audio on the RZ/G3L SMARC EVK by linking SSI0 with the DA7212
audio CODEC.  The SSI0 signals are multiplexed with SD2 and are selected
by switch SW_SD2_EN#.  Add regulator nodes regulator-{1p8v,3p3v} to the
SoM DTSI for reuse by eMMC.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528074615.91110-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: rzg3l-smarc-som: Enable Versa clock generator
Biju Das [Thu, 28 May 2026 07:45:44 +0000 (08:45 +0100)] 
arm64: dts: renesas: rzg3l-smarc-som: Enable Versa clock generator

The RZ/G3L SMARC SoM has a Versa 5P35023B clock generator to generate
the following clocks:
  - ref: Not connected,
  - se1: AUDIO_MCK (11.2896 or 12.2880 MHz),
  - se2: RZ_AUDIO_CLK_B (11.2896 MHz),
  - se3: RZ_AUDIO_CLK_C (12.2880 MHz),
  - diff{1,1B}: ET{0,1}_PHY_CLK (25 MHz),
  - diff2{2,2B}: Not connected.

Enable the Vversa 5P35023B clock generator on the RZ/G3L SoM DTSI.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528074615.91110-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r9a08g046l48-smarc: Enable I2C{2,3} devices
Biju Das [Thu, 28 May 2026 07:02:35 +0000 (08:02 +0100)] 
arm64: dts: renesas: r9a08g046l48-smarc: Enable I2C{2,3} devices

Enable I2C{2,3} on the RZ/G3L SMARC EVK board.  I2C3 is enabled by
setting SW SYS.2 to the OFF position.

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patch.msgid.link/20260528070239.33352-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r9a08g046l48-smarc: Add gpio keys
Biju Das [Thu, 28 May 2026 07:02:34 +0000 (08:02 +0100)] 
arm64: dts: renesas: r9a08g046l48-smarc: Add gpio keys

RZ/G3L SMARC EVK  has 3 user buttons called USER_SW1, USER_SW2 and
USER_SW3.  Instantiate the gpio-keys driver for these buttons by
removing place holders and replacing proper pins for the buttons.

USER_SW{1,2,3} are configured as wakeup-sources, so they can wake up the
system during s2idle.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528070239.33352-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: rzt2h-n2h-evk: Enable xSPI nodes
Lad Prabhakar [Wed, 27 May 2026 20:24:30 +0000 (21:24 +0100)] 
arm64: dts: renesas: rzt2h-n2h-evk: Enable xSPI nodes

Enable the xSPI0 and xSPI1 controllers on the RZ/T2H N2H EVK board.

Configure the xSPI0 controller interface to 1-bit (x1) mode, even though
the connected MX25LW51245 octal flash device supports octal mode.  Add a
corresponding inline hardware comment detailing this restriction;
operating in octal mode causes the BootROM to fail loading the
first-stage bootloader following a Watchdog Timer (WDT) reset.

Configure the xSPI1 controller interface connected to the AT25SF128A
flash device for 4-bit (x4) mode to utilize all available data lines.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260527202430.606341-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r9a09g087: Add xSPI nodes
Lad Prabhakar [Tue, 26 May 2026 20:40:44 +0000 (21:40 +0100)] 
arm64: dts: renesas: r9a09g087: Add xSPI nodes

Add device tree nodes for the two xSPI (Expanded SPI) controllers
integrated into the RZ/N2H (R9A09G087) SoC.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260526204045.3481604-3-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r9a09g077: Add xSPI nodes
Lad Prabhakar [Tue, 26 May 2026 20:40:43 +0000 (21:40 +0100)] 
arm64: dts: renesas: r9a09g077: Add xSPI nodes

Add device tree nodes for the two xSPI (Expanded SPI) controllers
integrated into the RZ/T2H (R9A09G077) SoC.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260526204045.3481604-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: rzg3e-smarc-som: Sort GMAC pinmux entries
Biju Das [Sun, 24 May 2026 09:20:11 +0000 (10:20 +0100)] 
arm64: dts: renesas: rzg3e-smarc-som: Sort GMAC pinmux entries

Sort the pinmux entries for both GMAC ctrl nodes in port order (A/B/C and
D/E/F respectively) and remove the extra blank line before the second
pinmux assignment.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260524092016.46346-1-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r8a779md: Add support for R-Car M3Le R8A779MD Geist
Nguyen Tran [Fri, 22 May 2026 17:19:57 +0000 (19:19 +0200)] 
arm64: dts: renesas: r8a779md: Add support for R-Car M3Le R8A779MD Geist

Add support for the Geist board based on the Renesas R-Car R8A779MD (M3Le)
SoC, a register-compatible variant of the R8A77965 (M3-N) with reduced set
of peripherals.

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Nguyen Tran <nguyen.tran.pz@bp.renesas.com>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://patch.msgid.link/20260522172000.15096-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
3 weeks agoarm64: dts: renesas: r9a07g044: Add DMA properties to serial nodes
Claudiu Beznea [Wed, 20 May 2026 13:23:15 +0000 (16:23 +0300)] 
arm64: dts: renesas: r9a07g044: Add DMA properties to serial nodes

Add DMA properties to the serial nodes on the RZ/G2L SoC.

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260520132315.944117-1-claudiu.beznea@kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>