media: Use named initializers for arrays of i2c_device_data
While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.
The mentioned robustness is relevant for a planned change to struct
i2c_device_id that replaces .driver_data by an anonymous union.
While touching all these arrays, unify usage of whitespace and commas.
This patch doesn't modify the compiled arrays, only their representation
in source form benefits. The former was confirmed with x86 and arm64
builds.
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Myeonghun Pak [Wed, 13 May 2026 07:02:37 +0000 (16:02 +0900)]
media: radio-si476x: Unregister v4l2_device on probe failure
si476x_radio_probe() registers radio->v4l2dev before allocating the V4L2
controls and before registering the video device. If any of those later
steps fails, probe returns through the exit label after freeing only the
control handler.
A failed probe does not call si476x_radio_remove(), so the
v4l2_device_unregister() there is not reached. This leaves the parent
device reference taken by v4l2_device_register() behind on the error path.
Unregister the V4L2 device in the probe error path after freeing the
controls.
Fixes: b879a9c2a755 ("[media] v4l2: Add a V4L2 driver for SI476X MFD") Cc: stable@vger.kernel.org Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Valery Borovsky [Wed, 13 May 2026 05:42:44 +0000 (08:42 +0300)]
media: pwc: Drain fill_buf on start_streaming() failure
pwc_isoc_init() submits its isochronous URBs with
usb_submit_urb(.., GFP_KERNEL) in a loop. After the first URB is
submitted, its completion handler pwc_isoc_handler() can run on another
CPU before the loop finishes:
pwc_get_next_fill_buf() detaches a buffer from pdev->queued_bufs and
stores it in pdev->fill_buf. The error path in start_streaming() only
drains pdev->queued_bufs, so the buffer parked in pdev->fill_buf is
leaked. vb2_start_streaming() then triggers
WARN_ON(owned_by_drv_count).
stop_streaming() already handles this since commit 80b0963e1698
("[media] pwc: fix WARN_ON"), which added the fill_buf drain in the
teardown path but not in the start_streaming() error path. Mirror that
handling on failure so start_streaming() returns with no buffer owned
by the driver.
Issue identified by automated review of the INV-003 series at
https://sashiko.dev/
Jia Zhu [Wed, 20 May 2026 04:46:07 +0000 (12:46 +0800)]
erofs: fix metabuf leak in inode xattr initialization
commit bb88e8da0025 ("erofs: use meta buffers for xattr operations")
converted xattr operations to use on-stack erofs_buf instances.
erofs_init_inode_xattrs() uses such a metabuf while reading the inline
xattr header and shared xattr id array.
Some error paths after erofs_read_metabuf() leave through out_unlock
without dropping the metabuf, so the folio reference can leak.
Consolidate the cleanup at out_unlock. erofs_put_metabuf() is a
no-op if no folio has been acquired, and this keeps all paths after
taking EROFS_I_BL_XATTR_BIT covered by a single cleanup site.
Fixes: bb88e8da0025 ("erofs: use meta buffers for xattr operations") Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Fixes: bb88e8da0025 ("erofs: use meta buffers for xattr operations") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
erofs: fix managed cache race for unaligned extents
After unaligned compressed extents were introduced, the following race
could occur:
[Thread 1] [Thread 2]
(z_erofs_fill_bio_vec)
<handle a Z_EROFS_PREALLOCATED_FOLIO folio>
...
filemap_add_folio (1)
(z_erofs_bind_cache)
<the same folio is found..>
..
..
folio_attach_private (2)
filemap_add_folio (3) again
Since (1) is executed but (2) hasn't been executed yet, it's possible
that another thread finds the same managed folio in z_erofs_bind_cache()
for a different pcluster and calls filemap_add_folio() again since
folio->private is still Z_EROFS_PREALLOCATED_FOLIO.
Fix this by explicitly clearing folio->private before making the folio
visible in the managed cache so that another pcluster can simply wait
on the locked managed folio as what we did for other shared cases [1].
This only impacts unaligned data compression (`-E48bit` with zstd,
for example).
[1] Commit 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of
crafted images properly") was originally introduced to handle crafted
overlapped extents, but it addresses unaligned extents as well.
Qihang [Thu, 7 May 2026 15:39:17 +0000 (23:39 +0800)]
tee: fix params_from_user() error path in tee_ioctl_supp_recv
params_from_user() may acquire tee_shm references for MEMREF parameters
before failing after partially processing the supplied parameter array.
In tee_ioctl_supp_recv(), those references are currently not released on
that error path.
Fix this by freeing MEMREF references before returning when
params_from_user() fails.
Keep the final cleanup path in tee_ioctl_supp_recv() unchanged since
supp_recv() may consume and replace the supplied parameters, unlike the
other TEE ioctl callback paths.
Arnd Bergmann [Thu, 4 Dec 2025 10:17:23 +0000 (11:17 +0100)]
tee: fix tee_ioctl_object_invoke_arg padding
The tee_ioctl_object_invoke_arg structure has padding on some
architectures but not on x86-32 and a few others:
include/linux/tee.h:474:32: error: padding struct to align 'params' [-Werror=padded]
I expect that all current users of this are on architectures that do
have implicit padding here (arm64, arm, x86, riscv), so make the padding
explicit in order to avoid surprises if this later gets used elsewhere.
Cássio Gabriel [Tue, 19 May 2026 14:46:19 +0000 (11:46 -0300)]
ALSA: scarlett2: Allow flash writes ending at segment boundary
scarlett2_hwdep_write() rejects writes when offset + count is greater than
or equal to the selected flash segment size. That incorrectly treats a
write ending exactly at the end of the segment as out of space, although
the last byte written is still within the segment.
Split invalid argument checks from the segment-space check, keep
zero-length writes as no-ops, and compare count against the remaining
segment size. This permits exact-end writes and avoids relying on
offset + count before deciding whether the request is in bounds.
Marius Hoch [Tue, 19 May 2026 14:01:29 +0000 (16:01 +0200)]
ALSA: hda/realtek: Add LED quirk for HP ProBook 430 G6
Like the HP ProBook 440 G6, the HP ProBook 430 G6 needs
the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk for its
mute and microphone mute LEDs.
Tested on a HP ProBook 430 G6.
Takashi Iwai [Tue, 19 May 2026 12:11:53 +0000 (14:11 +0200)]
ALSA: hda/intel: Make sure to cancel irq-pending work at closing PCM stream
The pending irq work might be still floating while the assigned stream
has been already closed, which may lead to UAF, especially when
another async work for fasync is involved.
For addressing this, extend the hda_controller_ops for allowing the
extra cleanup procedure that is specific to the controller driver, and
make sure to cancel and sync the pending irq work at each PCM close
before releasing the resources.
Takashi Iwai [Tue, 19 May 2026 12:11:52 +0000 (14:11 +0200)]
ALSA: hda: Move irq pending work into hda-intel stream
Currently, the delayed IRQ handling for PCM streams is managed in a
single work embedded in hda_intel, but this is basically a per-stream
thing. Due to the single work, we can't cancel the work properly at
closing each stream, for example.
For making the IRQ pending work to be stream-based, this patch changes
the following:
- An extended version of azx_dev (i.e. the hd-audio stream object) is
defined for snd-hda-intel
- The irq_pending flag and irq_pending_work are moved to
hda_intel_stream, so that they can be hda-intel stream specific
- The stream creation and assignment are refactored so that
snd-hda-intel can handle individually;
the snd-hda-intel specific workaround for stream tags is also moved
to snd-hda-intel itself instead of the common code
- The irq pending work is canceled properly at free / shutdown
While we're at it, changed the bit field flag to bool, as the bit
field doesn't help much in our case.
Takashi Iwai [Tue, 19 May 2026 09:42:52 +0000 (11:42 +0200)]
ALSA: seq: Register kernel port with full information
The current ALSA sequencer core tries to register the new kernel
sequencer port on the list at first, then fill up the port
information. This means that user-space may sneak the wrong
information before the actual data is filled, which isn't ideal.
Although the user-space should try to query the port info after the
port registration notification is sent out, it'd be still better to
have a port available with the full info from the beginning.
This patch changes the sequencer port creation and registration
procedure; now split to two steps, for creation and insertion, and the
port is registered after the information is filled.
platform/chrome: Use named initializers for struct i2c_device_id
While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.
This patch doesn't modify the compiled arrays, only their representation
in source form benefits. The former was confirmed with x86 and arm64
builds.
KP Singh [Wed, 20 May 2026 02:40:59 +0000 (04:40 +0200)]
bpf: Reject NULL data/sig in bpf_verify_pkcs7_signature
__bpf_dynptr_data() can return NULL (FILE dynptrs, any non-contiguous
backing). bpf_verify_pkcs7_signature() forwards the pointer to
verify_pkcs7_signature() unchecked, causing a NULL deref in
asn1_ber_decoder() reachable from a sleepable BPF LSM at lsm.s/bpf.
NULL-check both pointers and reject with -EINVAL. Mirrors the guards
already in kernel/bpf/crypto.c.
Follow the comment for the macrotile_mode and introduce separate
revision for UBWC 3.0 + 8-channel macrotiling mode. It is not used by
the database (since the drivers are not yet changed to handle it yet).
Avinash Duduskar [Sat, 16 May 2026 10:11:09 +0000 (15:41 +0530)]
net: socket: clean up __sys_accept4 comment
Fix a typo and a redundant phrase in the block comment above
__sys_accept4(): "thats" -> "that's", and drop the trailing
"to recvmsg" that repeats the recvmsg() reference earlier in
the same sentence.
Nikhil P. Rao [Fri, 15 May 2026 21:29:07 +0000 (21:29 +0000)]
pds_core: fix debugfs_lookup dentry leak and error handling
debugfs_lookup() returns a dentry with an elevated reference count that
must be released with dput(). The current code discards the returned
dentry without calling dput(), causing a reference leak on every
firmware reset recovery.
Additionally, when CONFIG_DEBUG_FS is disabled, debugfs_lookup()
returns ERR_PTR(-ENODEV), not NULL. The current check passes for error
pointers and would call dput() on an invalid pointer, causing a crash.
Nikhil P. Rao [Fri, 15 May 2026 21:29:05 +0000 (21:29 +0000)]
pds_core: fix error handling in pdsc_devcmd_wait
Fix two cases where pdsc_devcmd_wait() returns stale success from
the completion register instead of an error:
1. FW crash: If firmware stops running, the wait loop breaks early with
running=false. The condition "if ((!done || timeout) && running)" is
false, so error handling is bypassed and stale status is returned.
Check !running first and return -ENXIO.
2. Timeout: If a command times out, err is set to -ETIMEDOUT but then
overwritten by pdsc_err_to_errno(status) which reads stale status.
Return -ETIMEDOUT immediately after cleaning up.
Both errors now propagate to pdsc_devcmd_locked() which queues
health_work for recovery.
In an internal review from Airoha, it was notice that the RX DMA descriptor
bits and mask are wrong. These values probably refer to an old NPU firmware
never published. The previous value works correctly but it was reported
that in some specific condition in mixed scenario with both Ethernet and
WiFi offload it's possible that RX DMA descriptor signal wrong value with
the problem to the RX ring or packets getting dropped.
To handle these specific scenario, apply the new suggested bits mask from
Airoha.
Correct functionality of both AN7581 NPU and MT7996 variant were verified
and confirmed working.
Fixes: a7fc8c641cab ("net: airoha: Fix npu rx DMA definitions") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260518134530.3683-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jann Horn [Mon, 18 May 2026 16:51:30 +0000 (18:51 +0200)]
af_unix: Fix UAF read of tail->len in unix_stream_data_wait()
unix_stream_data_wait() does skb_peek_tail(&sk->sk_receive_queue) without
holding any lock that prevents SKBs on that queue from being dequeued and
freed.
This has been the case since commit 79f632c71bea ("unix/stream: fix
peeking with an offset larger than data in queue").
The first consequence of this is that the pointer comparison
`tail != last` can be false even if `last` semantically refers to an
already-freed SKB while `tail` is a new SKB allocated at the same address;
which can cause unix_stream_data_wait() to wrongly keep blocking after new
data has arrived, but only in a weird scenario where a peeking recv() and
a normal recv() on the same socket are racing, which is probably not a
real problem.
But since commit 2b514574f7e8 ("net: af_unix: implement splice for stream
af_unix sockets"), `tail` is actually dereferenced, which can cause UAF in
the following race scenario (where test_setup() runs single-threaded,
and afterwards, test_thread1() and test_thread2() run concurrently in
two threads:
```
static int socks[2];
void test_setup(void) {
socketpair(AF_UNIX, SOCK_STREAM, 0, socks);
send(socks[1], "A", 1, 0);
int peekoff = 1;
setsockopt(socks[0], SOL_SOCKET, SO_PEEK_OFF, &peekoff, sizeof(peekoff));
}
void test_thread1(void) {
char dummy;
recv(socks[0], &dummy, 1, MSG_PEEK);
}
void test_thread2(void) {
char dummy;
recv(socks[0], &dummy, 1, 0);
shutdown(socks[1], SHUT_WR);
}
```
Fix the UAF by removing the read of tail->len; checking tail->len would
only make sense if SKBs in the receive queue of a UNIX socket could grow,
which can no longer happen.
Kuniyuki explained:
> When commit 869e7c62486e ("net: af_unix: implement stream sendpage
> support") added sendpage() support, data could be appended to the last
> skb in the receiver's queue.
>
> That's why we needed to check if the length of the last skb was changed
> while waiting for new data in unix_stream_data_wait().
>
> However, commit a0dbf5f818f9 ("af_unix: Support MSG_SPLICE_PAGES") and
> commit 57d44a354a43 ("unix: Convert unix_stream_sendpage() to use
> MSG_SPLICE_PAGES") refactored sendmsg(), and now data is always added
> to a new skb.
That means this fix is not suitable for kernels before 6.5.
Justin Iurman [Sun, 17 May 2026 18:30:59 +0000 (20:30 +0200)]
ipv6: ioam: add NULL check for idev in ipv6_hop_ioam()
Reported by Sashiko:
The function ipv6_hop_ioam() accesses
__in6_dev_get(skb->dev)->cnf.ioam6_enabled without validating the returned
idev pointer. Because addrconf_ifdown() can concurrently clear dev->ip6_ptr
via RCU, __in6_dev_get() can return NULL during interface teardown, which
could cause a NULL pointer dereference when processing an IOAM Hop-by-Hop
option.
Let's add a check and use SKB_DROP_REASON_IPV6DISABLED accordingly.
Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace") Cc: stable@vger.kernel.org Signed-off-by: Justin Iurman <justin.iurman@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260517183059.29140-1-justin.iurman@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: phy: honor eee_disabled_modes when advertising EEE
While debugging why ethtool --show-eee reports "not supported" on a
Raspberry Pi CM4 with eee-broken-1000t / eee-broken-100tx set on the
PHY node, I noticed two phylib helpers copy phydev->supported_eee
into phydev->advertising_eee without applying
phydev->eee_disabled_modes: phy_support_eee() and
phy_advertise_eee_all(). That undoes the filtering phy_probe() set
up after of_set_phy_eee_broken(), so the PHY ends up advertising EEE
for modes that were marked broken in DT (or by the driver via
eee_disabled_modes).
The visible effect on MAC drivers that call phy_support_eee() after
probe (bcmgenet, fec, lan743x, lan78xx, r8169) is that ethtool on the
local interface reports "not supported" (because supported is masked
by eee_disabled_modes and ends up empty), while the link partner
happily sees EEE negotiated and active.
Patch 1 fixes phy_support_eee(). Patch 2 fixes phy_advertise_eee_all(),
which is also reached from genphy_c45_ethtool_set_eee() when user
space passes an empty advertisement.
I went through the other users of supported_eee as suggested by Andrew
and they look fine:
- phy_probe() already masks via eee_disabled_modes after
of_set_phy_eee_broken().
- genphy_c45_ethtool_get_eee() masks supported_eee with
eee_disabled_modes when reporting to user space.
- genphy_c45_ethtool_set_eee() masks user-supplied adv against
eee_disabled_modes, and the empty-adv path is now covered by
patch 2.
- genphy_c45_read_eee_abilities(), read_eee_cap1/cap2 populate
supported_eee from PHY registers (source of truth).
- genphy_c45_read_eee_adv(), read_eee_lpa() and write_eee_adv() use
supported_eee only to gate which MMD registers to access, not to
construct an advertisement.
====================
Nicolai Buchwitz [Mon, 18 May 2026 08:23:10 +0000 (10:23 +0200)]
net: phy: honor eee_disabled_modes in phy_advertise_eee_all()
phy_advertise_eee_all() copies supported_eee into advertising_eee
unconditionally, overwriting any filtering applied during phy_probe()
based on DT eee-broken-* properties or driver-populated
eee_disabled_modes. genphy_c45_ethtool_set_eee() calls this helper
when user space passes an empty advertisement, undoing the filtering.
Apply the same eee_disabled_modes mask in phy_advertise_eee_all() so
the filtering survives the copy, matching the pattern in phy_probe()
and phy_support_eee().
Nicolai Buchwitz [Mon, 18 May 2026 08:23:09 +0000 (10:23 +0200)]
net: phy: honor eee_disabled_modes in phy_support_eee()
phy_support_eee() copies supported_eee into advertising_eee
unconditionally, overwriting any filtering applied during phy_probe()
based on DT eee-broken-* properties or driver-populated
eee_disabled_modes. MAC drivers that call phy_support_eee() after
probe (e.g. bcmgenet, fec, lan743x, lan78xx, r8169) then cause the PHY
to advertise EEE for modes the user marked as broken.
The symptom is that ethtool --show-eee on the local interface reports
"not supported" (supported & ~eee_disabled_modes is empty) while the
link partner sees EEE negotiated and active.
phy_probe() already filters advertising_eee via eee_disabled_modes
after calling of_set_phy_eee_broken(). Apply the same mask in
phy_support_eee() so the filtering survives the copy.
net: phy: skip EEE advertisement write when autoneg is disabled
genphy_c45_an_config_eee_aneg() writes the EEE advertisement to the
auto-negotiation device's MMD register space (MDIO_MMD_AN, register
MDIO_AN_EEE_ADV). These registers are read by the link partner only
during auto-negotiation, so writing them while autoneg is disabled
cannot influence the link. On some PHYs (e.g. Broadcom BCM54213PE)
the write nevertheless reaches the chip and disturbs the receive
datapath.
Concretely, running
ethtool -s eth0 speed 100 duplex full autoneg off
ethtool --set-eee eth0 eee off
leaves eth0 with TX working and RX completely silent on a
Raspberry Pi 4 / CM4 board (bcmgenet + BCM54213PE in rgmii-rxid).
Switching back to autoneg recovers the link.
Prior to commit f26a29a038ee ("net: phy: ensure that genphy_c45_an_config_eee_aneg() sees new value of phydev->eee_cfg.eee_enabled"),
the disable path was effectively a no-op because the helper read
the stale eee_cfg.eee_enabled, so the underlying PHY behavior never
surfaced.
Eric Dumazet [Mon, 18 May 2026 09:05:18 +0000 (09:05 +0000)]
net/sched: sch_htb: fix htb_dump_class_stats() vs offload mode
htb_dump_class_stats() and htb_offload_aggregate_stats()
call gnet_stats_basic_sync_init(&cl->bstats) which
is wrong on 32bit arches when syncp is cleared.
Make sure to acquire qdisc spinlock and use
_bstats_set() to ease future lockless dumps.
Fixes: 83271586249c ("sch_htb: Stats for offloaded HTB") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maxim Mikityanskiy <maximmi@mellanox.com> Cc: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260518090518.629245-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After enabling GENERIC_ENTRY some functions are left unused.
Cleanup all those functions which includes:
- do_syscall_trace_enter
- do_syscall_trace_leave
- do_notify_resume
- do_seccomp
Enable the generic IRQ entry/exit infrastructure on PowerPC by selecting
GENERIC_ENTRY and integrating the architecture-specific interrupt and
syscall handlers with the generic entry/exit APIs.
This change replaces PowerPC’s local interrupt entry/exit handling with
calls to the generic irqentry_* helpers, aligning the architecture with
the common kernel entry model. The macros that define interrupt, async,
and NMI handlers are updated to use irqentry_enter()/irqentry_exit()
and irqentry_nmi_enter()/irqentry_nmi_exit() where applicable also
convert the PowerPC syscall entry and exit paths to use the generic
entry/exit framework and integrating with the common syscall handling
routines.
Key updates include:
- The architecture now selects GENERIC_ENTRY in Kconfig.
- Replace interrupt_enter/exit_prepare() with arch_interrupt_* helpers.
- Integrate irqentry_enter()/exit() in standard and async interrupt paths.
- Integrate irqentry_nmi_enter()/exit() in NMI handlers.
- Remove redundant irq_enter()/irq_exit() calls now handled generically.
- Use irqentry_exit_cond_resched() for preemption checks.
- interrupt.c and syscall.c are simplified to delegate context
management and user exit handling to the generic entry path.
- The new pt_regs field `exit_flags` introduced earlier is now used
to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).
- Remove unused code.
This change establishes the necessary wiring for PowerPC to use the
generic IRQ entry/exit framework while maintaining existing semantics.
This aligns PowerPC with the common entry code used by other
architectures and reduces duplicated logic around syscall tracing,
context tracking, and signal handling.
The performance benchmarks from perf bench basic syscall are below:
Move interrupt entry and exit helper routines from interrupt.h into the
PowerPC-specific entry-common.h header as a preparatory step for enabling
the generic entry/exit framework.
This consolidation places all PowerPC interrupt entry/exit handling in a
single common header, aligning with the generic entry infrastructure.
The helpers provide architecture-specific handling for interrupt and NMI
entry/exit sequences, including:
- arch_interrupt_enter/exit_prepare()
- arch_interrupt_async_enter/exit_prepare()
- arch_interrupt_nmi_enter/exit_prepare()
- Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
debug register maintenance, and soft mask handling.
The functions are copied verbatim from interrupt.h.Subsequent patches will
integrate these routines into the generic entry/exit flow.
Add a new field `exit_flags` in the pt_regs structure. This field will hold
the flags set during interrupt or syscall execution that are required during
exit to user mode.
Specifically, the `TIF_RESTOREALL` flag, stored in this field, helps the
exit routine determine if any NVGPRs were modified and need to be restored
before returning to userspace.
This addition ensures a clean and architecture-specific mechanism to track
per-syscall or per-interrupt state transitions related to register restore.
Changes:
- Add `exit_flags` and `__pt_regs_pad` to maintain 16-byte stack alignment
- Update asm-offsets.c and ptrace.c for offset and validation
- Update PT_* constants in uapi header to reflect the new layout
These helpers handle user state restoration when returning from the
kernel to userspace, including FPU/VMX/VSX state, transactional memory,
KUAP restore, and per-CPU accounting.
Additionally, move check_return_regs_valid() from interrupt.c to
interrupt.h so it can be shared by the new entry/exit logic.
Implement the arch_enter_from_user_mode() hook required by the generic
entry/exit framework. This helper prepares the CPU state when entering
the kernel from userspace, ensuring correct handling of KUAP/KUEP,
transactional memory, and debug register state.
This patch contains no functional changes, it is purely preparatory for
enabling the generic syscall and interrupt entry path on PowerPC.
powerpc: Prepare to build with generic entry/exit framework
This patch introduces preparatory changes needed to support building
PowerPC with the generic entry/exit (irqentry) framework.
The following infrastructure updates are added:
- Add a syscall_work field to struct thread_info to hold SYSCALL_WORK_* flags.
- Provide a stub implementation of arch_syscall_is_vdso_sigreturn(),
returning false for now.
- Introduce on_thread_stack() helper to detect if the current stack pointer
lies within the task’s kernel stack.
These additions enable later integration with the generic entry/exit
infrastructure while keeping existing PowerPC behavior unchanged.
Rename arch_irq_disabled_regs() to regs_irqs_disabled() to align with the
naming used in the generic irqentry framework. This makes the function
available for use both in the PowerPC architecture code and in the
common entry/exit paths shared with other architectures.
This is a preparatory change for enabling the generic irqentry framework
on PowerPC.
Marco Crivellari [Fri, 15 May 2026 13:51:36 +0000 (15:51 +0200)]
ipmr: Replace use of system_unbound_wq with system_dfl_wq
This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.
Before that to happen, workqueue users must be converted to the better named
new workqueues with no intended behaviour changes:
Jakub Kicinski [Wed, 20 May 2026 01:17:30 +0000 (18:17 -0700)]
Merge branch 'gve-add-support-for-ptp-gettimex64'
Jordan Rhee says:
====================
gve: add support for PTP gettimex64
This patch series adds support to obtain near-simultaneous NIC and
system timestamps with gettimex64. This enables daemons like
chrony and phc2sys to synchronize the system clock to the NIC clock.
GVE does not have direct register access to the NIC hardware clock, so
it must issue an AdminQ command to read the NIC clock. Two paths
for obtaining a cross-timestamp are implemented: a precise path using
system counter values sampled by the device, and a fallback path using
system counter values sampled in the driver using
ptp_read_system_prets()/postts().
To use the precise path, the current system clocksource must match the
units returned by the device, which on x86 is X86_TSC and on ARM64 is
ARM_ARCH_COUNTER. The clockid requested for the cross-timestamp must
be either CLOCK_REALTIME or CLOCK_MONOTONIC_RAW. These conditions hold
by default on GCP VMs using Chrony, so we expect the precise path to be
used the vast majority of the time. If the system clocksource is changed
to kvm-clock, it activates the fallback path. Ethtool counters have been
added to count how many times each path is used.
The uncertainty window in the precise path is typically around 1-2us,
while in the fallback path is around 60-80us. This table shows a
comparison in chrony tracking statistics between the precise path and
fallback path. The RMS offset is nearly 4 orders of magnitude smaller
in the precise path.
| | Fallback Path | Precise path |
| --------------- | --------------------- | ------------------------ |
| System time | 0.000000005 s slow | 0.000000001 s fast |
| Last offset | +0.000005606 seconds | +0.000000001 seconds |
| RMS offset | 0.000009020 seconds | 0.000000002 seconds |
| Frequency | 4.115 ppm fast | 0.362 ppm fast |
| Residual freq | +2.515 ppm | +0.000 ppm |
| Skew | 18.480 ppm | 0.001 ppm |
| Root delay | 0.000000001 seconds | 0.000000001 seconds |
| Root dispersion | 0.000081905 seconds | 0.000001169 seconds |
| Update interval | 0.5 seconds | 0.5 seconds |
| Leap status | Normal | Normal |
The first two patches pave the way for the PTP implementation by
quieting excessive logging and refactoring an existing routine for
thread safety.
====================
Jordan Rhee [Thu, 14 May 2026 22:58:42 +0000 (22:58 +0000)]
gve: implement PTP gettimex64
Enable chrony and phc2sys to synchronize system clock to NIC clock.
Two paths are implemented: a precise path using system counter values
sampled by the device, and a fallback path using system counter values
sampled in the driver using ptp_read_system_prets()/postts().
To use the precise path, the current system clocksource must match the
units returned by the device, which on x86 is X86_TSC and on ARM64 is
ARM_ARCH_COUNTER. The clockid requested for the cross-timestamp must
be either CLOCK_REALTIME or CLOCK_MONOTONIC_RAW. These conditions hold
by default on GCP VMs using Chrony, so we expect the precise path to be
used the vast majority of the time. If the system clocksource is changed
to kvm-clock, it activates the fallback path. Ethtool counters have been
added to count how many times each path is used.
The uncertainty window in the precise path is typically around 1-2us,
while in the fallback path is around 60-80us.
Stub implementions of adjfine and adjtime are added to avoid NULL
dereference when phc2sys tries to adjust the clock.
Cc: John Stultz <jstultz@google.com> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Stephen Boyd <sboyd@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kevin Yang <yyd@google.com> Reviewed-by: Naman Gulati <namangulati@google.com> Signed-off-by: Jordan Rhee <jordanrhee@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260514225842.110706-4-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ankit Garg [Thu, 14 May 2026 22:58:41 +0000 (22:58 +0000)]
gve: make nic clock reads thread safe
Add a mutex to protect the shared DMA buffer that receives NIC
timestamp reports. The NIC timestamp will be read from two different
threads: the periodic worker and upcoming `gettimex64`.
Move clock registration to the last step of initialization to ensure
that all data needed by the clock module is initialized before
the clock is exposed to usermode.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Ankit Garg <nktgrg@google.com> Signed-off-by: Jordan Rhee <jordanrhee@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260514225842.110706-3-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jordan Rhee [Thu, 14 May 2026 22:58:40 +0000 (22:58 +0000)]
gve: skip error logging for retryable AdminQ commands
AdminQ commands may return -EAGAIN under certain transient conditions.
These commands are intended to be retried by the driver, so logging
a formal error to the system log is misleading and creates
unnecessary noise.
Modify the logging logic to skip the error message when the result
is -EAGAIN, and move logging to dev_err_ratelimited() to avoid
spamming the log.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jordan Rhee <jordanrhee@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260514225842.110706-2-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
bridge: mcast: Fix a possible use-after-free when removing a bridge port
Patch #1 fixes a possible use-after-free when removing a bridge port.
Patch #2 adds a test case that triggers the problem.
In net-next we can:
1. Add DEBUG_NET_WARN_ON_ONCE() when a port multicast context is
de-initialized while enabled.
2. When de-initializing a port multicast context, synchronously shutdown
all the timers that were initialized when the context was initialized.
====================
Ido Schimmel [Sun, 17 May 2026 12:11:22 +0000 (15:11 +0300)]
selftests: bridge_vlan_mcast: Test toggling of multicast snooping
Test toggling of multicast snooping when per-VLAN multicast snooping is
enabled. The test always passes, but without "bridge: mcast: Fix
possible use-after-free when removing a bridge port" it results in a
splat.
Ido Schimmel [Sun, 17 May 2026 12:11:21 +0000 (15:11 +0300)]
bridge: mcast: Fix a possible use-after-free when removing a bridge port
When per-VLAN multicast snooping is enabled, the bridge iterates over
all the bridge ports, disables the per-port multicast context on each
port and enables the per-{port, VLAN} multicast contexts instead. The
reverse happens when per-VLAN multicast snooping is disabled.
When global multicast snooping is enabled, the bridge iterates over all
the bridge ports and enables the per-port multicast context on each
port. The reverse happens when multicast snooping is disabled.
The above scheme can result in a situation where both types of contexts
(per-port and per-{port, VLAN}) are enabled on a single bridge port:
# ip link add name br1 up type bridge mcast_snooping 1 mcast_querier 1 vlan_filtering 1
# ip link add name dummy1 up master br1 type dummy
# ip link set dev br1 type bridge mcast_vlan_snooping 1
# ip link set dev br1 type bridge mcast_snooping 0
# ip link set dev br1 type bridge mcast_snooping 1
This is not intended and it is a problem since the commit cited below.
Prior to this commit, when removing a bridge port,
br_multicast_disable_port() would disable the per-port multicast context
and the per-{port, VLAN} multicast contexts would get disabled when
flushing VLANs.
After this commit, br_multicast_disable_port() only disables the
per-port multicast context if per-VLAN multicast snooping is disabled.
If both types of contexts were enabled on the port when it was removed,
the per-port multicast context would remain enabled when freeing the
bridge port, leading to a use-after-free [1].
Fix by preventing the bridge from enabling / disabling the per-port
multicast contexts when toggling global multicast snooping if per-VLAN
multicast snooping is enabled.
Ido Schimmel [Sun, 17 May 2026 11:50:09 +0000 (14:50 +0300)]
bridge: Add missing READ_ONCE() annotations around FDB destination port
When roaming, the FDB destination port can change without holding the
bridge's hash lock. Therefore, add missing READ_ONCE() annotations in
both RCU readers and readers that hold the lock. In the latter case, the
annotation is not needed in places where the FDB entry was already
validated to be a local entry since such entries cannot roam.
Dawei Feng [Fri, 15 May 2026 15:18:26 +0000 (23:18 +0800)]
octeontx2-pf: avoid double free of pool->stack on AQ init failure
otx2_pool_aq_init() frees pool->stack when mailbox sync or retry
allocation fails, but leaves the pointer unchanged. Later,
otx2_sq_aura_pool_init() unwinds the partial setup through
otx2_aura_pool_free(), which frees pool->stack again. The CN20K-specific
cn20k_pool_aq_init() implementation has the same bug in
its corresponding error path.
Set pool->stack to NULL immediately after the local free so the shared
cleanup path does not free the same stack again while cleaning up
partially initialized pool state.
The bug was first flagged by an experimental analysis tool we are
developing for kernel memory-management bugs while analyzing
v6.13-rc1. The tool is still under development and is not yet publicly
available. Manual inspection confirms that the bug is still present in
v7.1-rc3.
Runtime validation was not performed because reproducing this path
requires OcteonTX2/CN20K hardware.
Fixes: caa2da34fd25 ("octeontx2-pf: Initialize and config queues") Fixes: d322fbd17203 ("octeontx2-pf: Initialize cn20k specific aura and pool contexts") Cc: stable@vger.kernel.org Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260515151826.1005397-1-dawei.feng@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonas Jelonek [Fri, 15 May 2026 14:31:03 +0000 (14:31 +0000)]
net: pse-pd: fix sign on -ENOENT check in of_load_pse_pis()
of_count_phandle_with_args() returns the count on success and a negative
errno on failure, including -ENOENT when the "pairsets" property is
absent. The existing comparison in of_load_pse_pis() checks against
ENOENT (positive 2) instead of -ENOENT, so the branch is taken for any
error return: legitimate DTs that omit "pairsets" trigger a spurious
"wrong number of pairsets" error and probe fails with -EINVAL.
Compare against -ENOENT so a missing "pairsets" property is correctly
treated as "this PI has no pairsets, continue".
Fixes: 9be9567a7c59 ("net: pse-pd: Add support for PSE PIs") Cc: stable@vger.kernel.org Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20260515143103.1721888-1-jelonek.jonas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gary Guo [Mon, 20 Apr 2026 16:16:36 +0000 (17:16 +0100)]
rust: doc: disable doc inlining for all prelude items
Somehow the rustdoc heuristics determined that a large chunk of the items
found in prelude should have documentation inlined. This bloats the
generated documentation size.
Also, for crates that optimize documentation with `cfg(doc)`, as the
documentation inlining makes use of the metadata compiled by just rustc, it
will not pick up the `cfg(doc)` attributes from the inlined documentation.
pin-init for example optimizes tuple/fn rendering using the nightly
fake_variadic feature [1], but this is missing from the inlined version
[2].
Thus, mark all prelude items as `#[doc(no_inline)]`.
Guangshuo Li [Thu, 14 May 2026 11:38:34 +0000 (19:38 +0800)]
RDMA/rtrs: Fix use-after-free in path file creation cleanup
In the error path of rtrs_srv_create_path_files(), the sysfs root folders
may already have been created and srv_path->kobj may already have been
initialized. If a later step fails, the cleanup currently calls
kobject_put(&srv_path->kobj) before
rtrs_srv_destroy_once_sysfs_root_folders(srv_path).
kobject_put() may drop the last reference to srv_path->kobj and invoke the
release callback, rtrs_srv_release(), which frees srv_path. The following
call to rtrs_srv_destroy_once_sysfs_root_folders(srv_path) then
dereferences srv_path internally to access srv_path->srv, resulting in a
use-after-free.
This failure path is reached before rtrs_srv_create_path_files() returns
success, so the successful-path lifetime handling is not involved.
Fix this by destroying the sysfs root folders before calling
kobject_put(&srv_path->kobj), so srv_path is still valid while the helper
accesses it.
This issue was found by a static analysis tool I am developing.
Shiraz Saleem [Tue, 12 May 2026 09:42:09 +0000 (02:42 -0700)]
RDMA/mana_ib: Report max_msg_sz in mana_ib_query_port
Report max_msg_sz for mana_ib, which is 16MB.
Fixes: 4bda1d5332ec ("RDMA/mana_ib: Implement port parameters") Signed-off-by: Shiraz Saleem <shirazsaleem@microsoft.com> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://patch.msgid.link/20260512094209.264955-1-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Jason Gunthorpe [Wed, 13 May 2026 15:00:16 +0000 (12:00 -0300)]
RDMA/core: Do not read wild stack memory in uverbs_get_handler_fn()
Sashiko points out the legacy write path in ib_uverbs_write() does
allocate a struct uverbs_attr_bundle, but it doesn't wrap it in a
bundle_priv so downcasting here isn't safe.
Instead lift the method_elm out of the bundle_priv and use it for the
debug function. The legacy write path will leave it set as NULL since the
write method_elm uses a different type.
Cc: stable@vger.kernel.org Fixes: 1de9287ece44 ("RDMA: Add ib_copy_validate_udata_in()") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Jason Gunthorpe [Wed, 13 May 2026 17:33:23 +0000 (14:33 -0300)]
RDMA/core: Move the _ib_copy_validate_udata* functions to ib_core_uverbs
It was incorrect to place them in uverbs_ioctl because that makes every
driver depends on ib_uverbs.ko, which is undesired. ib_core_uverbs.c is
for functions used by alot of drivers that are linked into ib_core
instead.
output mismatch
QA output created by 003
ERROR: access time has changed for file1 after remount
ERROR: access time has changed after modifying file1
ERROR: change time has not been updated after changing file1
ERROR: access time has changed for file in read-only filesystem
Silence is golden
This patch fixes the issue with change time by
adding inode_set_ctime_current() and mark_inode_dirty()
in hfs_rename(). Also, it reworks hfs_inode_setattr() by
changing simple_inode_init_ts() on inode_set_mtime_to_ts()
and inode_set_ctime_current() calls.
HFS hasn't any field in on-disk layout that can keep
the file/folder access times (atime). It was added
setting of SB_NOATIME in SB_NOATIME.
Finally, we have only atime related errors in
generic/003 output:
QA output created by 003
ERROR: access time has not been updated after accessing file1 first time
ERROR: access time has not been updated after accessing file2
ERROR: access time has not been updated after accessing file3 second time
ERROR: access time has not been updated after accessing file3 third time
Silence is golden
The generic/003 test-case needs to be disabled for HFS case
because it cannot support the file/folder access times (atime).
Closes: https://github.com/hfs-linux-kernel/hfs-linux-kernel/issues/3
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20260514195630.354206-2-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
The final assigned CNID was 21 but fsck correct it on 22.
It is possible to see that the reason of the issue is
incrementing the next_id value at first and assigning
already incremented value to the inode->i_ino:
This patch fixes the issue by assigning the decremented
value to inode->i_ino.
Fixes: a06ec283e125 ("hfs: add logic of correcting a next unused CNID")
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20260514195518.354108-2-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Sherry Sun [Sat, 9 May 2026 01:54:11 +0000 (09:54 +0800)]
arm64: dts: imx95-19x19-evk: Fix PCIe EP vpcie-supply
The vpcie-supply property should reference the regulator that controls
the actual M.2 power supply, not the W_DISABLE1# signal.
On imx95-19x19-evk:
- reg_pcie0 controls M.2 W_DISABLE1# signal
- reg_m2_pwr controls the actual M.2 power supply
Fix the vpcie-supply to use reg_m2_pwr for proper power control in
PCIe endpoint mode.
Fixes: 58bea81052d0 ("arm64: dts: imx95: add pcie1 ep overlay file and create pcie-ep dtb files") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Sat, 9 May 2026 01:54:10 +0000 (09:54 +0800)]
arm64: dts: imx8qxp-mek: Remove unnecessary PCIe EP vpcie-supply
For PCIe endpoint mode, only M.2 power supply needs to be ensured.
On imx8qxp-mek, the M.2 power is always on and cannot be controlled,
while reg_pcieb only controls the M.2 W_DISABLE1# signal. Remove the
unnecessary vpcie-supply property from pcie0_ep node.
Fixes: 1c9b0c6044c2 ("arm64: dts: imx8: use common imx-pcie0-ep.dtso to enable PCI ep function") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Sat, 9 May 2026 01:54:09 +0000 (09:54 +0800)]
arm64: dts: imx8dxl-evk: Remove unnecessary PCIe EP properties
For PCIe endpoint mode, only M.2 power supply needs to be ensured.
On imx8dxl-evk, the M.2 power is always on and cannot be controlled,
while reg_pcieb only controls the M.2 W_DISABLE1# signal. Remove the
unnecessary vpcie-supply property from pcie0_ep node.
Also remove reset-gpio as PCIe endpoint mode doesn't require reset
control.
Fixes: c1c4820b60d7 ("arm64: dts: imx8dxl-evk: Add pcie0-ep node and use unified pcie0 label") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Replace deprecated "gpio" property with "gpios" in
regulator-vmmc-usdhc2 fixed regulator node.
Signed-off-by: Antoine Gouby <antoine.gouby@toradex.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
ACPI: battery: Fix system wakeup on critical battery status
Commit 0a869409a981 ("ACPI: battery: Convert the driver to a platform
one") changed the parent of the battery wakeup source to the platform
device used for driver binding, but it forgot to update the
acpi_pm_wakeup_event() call in acpi_battery_update() accordingly.
Do it now to unbreak waking up the system on critical battery status
during suspend-to-idle and during transitions to ACPI S3/S4.
Fixes: 0a869409a981 ("ACPI: battery: Convert the driver to a platform one") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 7.0+ <stable@vger.kernel.org> # 7.0+ Link: https://patch.msgid.link/12898712.O9o76ZdvQC@rafael.j.wysocki
Linus Torvalds [Tue, 19 May 2026 20:31:35 +0000 (15:31 -0500)]
Merge tag 'lsm-pr-20260519' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm fix from Paul Moore:
"A single LSM patch to add a missing credential mutex lock to the
lsm_set_self_attr(2) syscall so it behaves similar to the associated
procfs API and avoids issues with ptrace"
* tag 'lsm-pr-20260519' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: hold cred_guard_mutex for lsm_set_self_attr()
x86/sev: Remove redundant ghcbs_initialized checks around __sev_{get,put}_ghcb()
After
3645eb7e3915 ("x86/fred: Fix early boot failures on SEV-ES/SNP guests"),
__sev_{get,put}_ghcb() handle the early-boot GHCB fallback internally, making
the ghcbs_initialized guards in __set_pages_state() and
svsm_perform_call_protocol() redundant.
Remove them.
Also initialize state->ghcb to NULL in the early-boot path of
__sev_get_ghcb() so that the ghcb_state is well-defined for all callers,
even though __sev_put_ghcb() currently returns early before reading it.
No functional change intended.
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/20260518102230.3394603-1-nikunj@amd.com
Linus Torvalds [Tue, 19 May 2026 19:00:48 +0000 (14:00 -0500)]
Merge tag 'ata-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Make sure that the issuing of a deferred non-NCQ command via
workqueue feature is only used when mixing NCQ and non-NCQ commands
to the same link (i.e. return value ATA_DEFER_LINK), and nothing
else. This way we will not incorrectly try to use the feature for
e.g. PATA drivers
- The deferred non-NCQ command was stored in a per-port struct. When
using Port Multipliers with FIS-Based Switching, we would thus
needlessly defer commands to all other links. Store the deferred QC
in a per-link struct, such that Port Multipliers with FBS will get
the same performance as before
- The issuing of a deferred non-NCQ command via workqueue feature broke
support for Port Multipliers using Command-Based Switching. The
issuing of a deferred non-NCQ command via workqueue feature is not
compatible with the use of ap->excl_link, which PMPs with CBS use for
fairness (using implicit round robin)
* tag 'ata-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-scsi: do not needlessly defer commands when using PMP with FBS
ata: libata-scsi: do not use the deferred QC feature on PMPs with CBS
ata: libata-scsi: do not use the deferred QC feature for ATA_DEFER_PORT
ata: libata-scsi: improve readability of ata_scsi_qc_issue()
Apply improved drive-strength values and pull-up/down configurations as
devised from hardware measurements to improve signal quality on PHYTEC
phyCORE-i.MX 91/93 SoM based boards. Also improve eMMC HS400 mode by
setting property "fsl,strobe-dll-delay-target" which shifts the strobe
DLL sampling window to the optimal position.
Signed-off-by: Christoph Stoidner <c.stoidner@phytec.de> Signed-off-by: Primoz Fiser <primoz.fiser@norik.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
According to measurements, the PHY reset signal shows an overshoot on
the rising edge that exceeds the specified limits (max 2.1V) when using
X4 strength on ENET2_RXC. Reduce drive-strength to X1 to decrease the
overshoot and bring signal within specification limits.
Signed-off-by: Primoz Fiser <primoz.fiser@norik.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Primoz Fiser [Thu, 7 May 2026 06:20:56 +0000 (08:20 +0200)]
arm64: dts: freescale: imx{91,93}-phycore-som: Set BUCK5 in FPWM mode
Set PMIC BUCK5 mode to forced PWM (Pulse Width Modulation) mode instead
of the default automatic PFM and PWM transition mode. FPWM mode produces
less ripple on the output voltage rail under light load conditions. And
since BUCK5 supplies SoC internal ADC reference voltage we need to keep
voltage ripple to a minimum. This solves issues with the occasional ADC
calibration procedure failures on phyCORE-i.MX91/93 SoM based boards.
Signed-off-by: Primoz Fiser <primoz.fiser@norik.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Peter Zijlstra [Thu, 23 Apr 2026 15:56:13 +0000 (17:56 +0200)]
x86/kvm/vmx: Fix VMX vs hrtimer_rearm_deferred()
Vishal reported that KVM unit test 'x2apic' started failing after commit 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming").
The reason is that KVM/VMX is injecting interrupts while it has interrupts
disabled, for a context that will enable interrupts, this means that
regs->flags.X86_EFLAGS_IF == 0 and irqentry_exit() will not do the right
thing.
Notably, irqentry_exit() must not call hrtimer_rearm_deferred() when the return
context does not have IF set, because this will cause problems vs NMIs.
Therefore, fix up the state after the injection.
Fixes: 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming") Reported-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Suggested-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Link: https://patch.msgid.link/20260423155936.957351833@infradead.org Closes: https://lore.kernel.org/r/70cd3e97fbb796e2eb2ff8cd4b7614ada05a5f24.camel%40intel.com
Peter Zijlstra [Fri, 8 May 2026 09:18:29 +0000 (11:18 +0200)]
x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core
Move the VMX interrupt dispatch magic into the x86 core code. This
isolates KVM from the FRED/IDT decisions and reduces the amount of
EXPORT_SYMBOL_FOR_KVM().
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linxu.intel.com> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20260508091829.GO3126523@noisy.programming.kicks-ass.net
The macro HAS_DMC_WAKELOCK() is currently only used inside
intel_dmc_wl.c and doesn't need to be exposed to the rest of the driver.
Furthermore, there is a distinction between the display IP having
support for the feature and the driver actually using it, so using
HAS_DMC_WAKELOCK() outside of intel_dmc_wl.c would potentially be wrong
anyway.
Let's drop that macro. If other part of the driver needs to check if
the driver is using the DMC wakelock feature, we would need actually to
expose the function __intel_dmc_wl_supported().
Since HAS_DMC_WAKELOCK() was kind of self-documenting in the sense that
it tells us what display IPs have support for the feature and we are now
dropping it, let's also take this opportunity to add a documentation
note on the subject.
Support tianma,tm050rdh03 DPI panel on i.MX93 9x9 QSB.
Move the common parts from imx93-9x9-qsb-ontat-kd50g21-40nt-a1.dtso into
imx93-9x9-qsb-ontat-kd50g21-40nt-a1.dtsi for reuse by the tm050rdh03
overlay file.
The panel connects with the QSB board through an adapter board[1]
designed by NXP.
Sherry Sun [Thu, 7 May 2026 06:53:30 +0000 (14:53 +0800)]
arm64: dts: imx: Add common imx-m2-pcie.dtso to enable PCIe on M.2 connector
Some i.MX boards (i.MX8MP EVK and i.MX95-15x15 EVK) have M.2 connectors
that are physically wired to both USDHC and PCIe controllers. The
default device tree enables USDHC for SDIO WiFi modules and disables
PCIe to avoid regulator conflicts.
Add a common imx-m2-pcie.dtso that can be applied to enable PCIe and
disable USDHC when a PCIe module is installed in the M.2 connector.
This creates the following DTB files:
- imx8mp-evk-pcie.dtb: i.MX8MP EVK with PCIe enabled
- imx95-15x15-evk-pcie.dtb: i.MX95-15x15 EVK with PCIe enabled
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Thu, 7 May 2026 06:53:29 +0000 (14:53 +0800)]
arm64: dts: imx95-15x15-evk: Disable PCIe bus in the default dts
Disable the PCIe bus in the default device tree to avoid shared
regulator conflicts between SDIO and PCIe buses. The non-deterministic
probe order between these two buses can break the PCIe initialization
sequence, causing PCIe devices to fail detection intermittently.
On i.MX95-15x15 EVK board, the M.2 connector is physically wired to both
USDHC3 and PCIe0, however the out-of-box module is SDIO IW612 WiFi, so
enable SDIO WiFi in the default imx95-15x15-evk.dts.
Add 'm2_usdhc' label to USDHC3 to support device tree overlay for PCIe
modules. Users who need PCIe can use imx95-15x15-evk-pcie.dtb (added in
a follow-up patch) which applies an overlay to enable PCIe and disable
USDHC3.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Thu, 7 May 2026 06:53:28 +0000 (14:53 +0800)]
arm64: dts: imx8mp-evk: Disable PCIe bus in the default dts
Disable the PCIe bus in the default device tree to avoid shared
regulator conflicts between SDIO and PCIe buses. The non-deterministic
probe order between these two buses can break the PCIe initialization
sequence, causing PCIe devices to fail detection intermittently.
On i.MX8MP EVK board, the M.2 connector is physically wired to both
USDHC1 and PCIe0, however the out-of-box module is SDIO IW612 WiFi, so
enable the SDIO WiFi in the default imx8mp-evk.dts.
Add 'm2_usdhc' label to USDHC1 to support device tree overlay for PCIe
modules. Users who need PCIe can use imx8mp-evk-pcie.dtb (added in a
follow-up patch) which applies an overlay to enable PCIe and disable
USDHC1.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
2 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
Add Zinnia Carrier Board mated with Verdin iMX8M Plus.
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
2 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
Add Zinnia Carrier Board mated with Verdin iMX8M Mini.
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
1 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
arm64: dts: imx8ulp-evk: Correct Type-C int GPIO flags
IRQ_TYPE_xxx flags are not correct in the context of GPIO flags.
These are simple defines so they could be used in DTS but they will not
have the same meaning: IRQ_TYPE_EDGE_FALLING = 2 = GPIO_SINGLE_ENDED.
Correct the Type-C int-gpios to use proper flags, assuming the author of
the code wanted similar logical behavior:
IRQ_TYPE_EDGE_FALLING => GPIO_ACTIVE_LOW
Fixes: c4b4593ecb0b ("arm64: dts: imx8ulp-evk: enable usb nodes and add ptn5150 nodes") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Jamie Nguyen [Mon, 18 May 2026 20:31:16 +0000 (13:31 -0700)]
firmware: arm_ffa: Honor partition info descriptor size
FFA_PARTITION_INFO_GET_REGS reports the size of each partition
information descriptor in x2[63:48]. However, __ffa_partition_info_get_regs()
walks the returned register payload with a hardcoded 24-byte stride
(regs += 3), even though the size is already read into buf_sz.
That works for the FF-A v1.1/v1.2 24-byte descriptor layout, where each
descriptor consumes three registers. Newer FF-A revisions can extend the
descriptor while keeping the existing fields at the front. For example, a
48-byte descriptor consumes six registers, so advancing by only three
registers desynchronises the parser and can make it read subsequent entries
from the middle of a descriptor.
Use the advertised descriptor size to derive the register stride. Validate
that the size is register-aligned, large enough for the fields parsed by the
driver, and that the requested number of descriptors fits in the returned
x3..x17 register window. The driver still copies only the fields it
understands, but now skips over any trailing descriptor fields correctly.
Fixes: ba85c644ac8d ("firmware: arm_ffa: Add support for FFA_PARTITION_INFO_GET_REGS") Suggested-by: Sudeep Holla <sudeep.holla@kernel.org> Signed-off-by: Jamie Nguyen <jamien@nvidia.com> Link: https://patch.msgid.link/20260518203116.42624-1-jamien@nvidia.com
(sudeep.holla: Minor rewordng of the commit message and subject) Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
bio-integrity-fs: pass data iter to bio_integrity_verify()
bio_integrity_verify() expects the passed struct bvec_iter to be an
iterator over bio data, not integrity. So construct a separate data
bvec_iter without the bio_integrity_bytes() conversion and pass it to
bio_integrity_verify() instead of bip_iter.
Gustavo Sousa [Mon, 18 May 2026 16:14:04 +0000 (13:14 -0300)]
drm/i915/bw: Extract get_display_bw_params()
Just like it is done for the platform-specific bandwidth parameters, use
a separate function named get_display_bw_params() to return the display
IP-specific parameters. This simplifies intel_bw_init_hw() by having
just one call for each of the *_get_bw_info() functions.
v2:
- Prefer to call get_display_bw_params() only once in
intel_bw_init_hw() instead of having multiple calls in each of the
affected *_get_bw_info() functions. (Jani)
v3:
- Call get_display_bw_params() only after the check on
HAS_DISPLAY(display). (Jani)
- Return &gen11_bw_params only if display version is 11. (Matt)
v4:
- Like done with get_soc_bw_params(), drop drm_WARN() when no display
IP is matched.
Gustavo Sousa [Mon, 18 May 2026 16:14:03 +0000 (13:14 -0300)]
drm/i915/bw: Rename struct intel_sa_info to intel_display_bw_params
To align with struct intel_platform_bw_params, rename struct
intel_sa_info to intel_display_bw_params. Also add comments to contrast
their purposes.
v2:
- Use gen11 and gen12 as prefixes for ICL's and TGL's display-specific
parameters variables. (Matt)
- Prefer to use "display" instead of "disp" in variable names. (Jani)
- Drop the redundant "disp" from the variable names.
Gustavo Sousa [Mon, 18 May 2026 16:14:02 +0000 (13:14 -0300)]
drm/i915/bw: Deduplicate intel_sa_info instances
Now that intel_sa_info contains bandwidth parameters specific to the
display IP, we can drop many duplicates and reuse from previous
releases.
Let's do that and also simplify intel_bw_init_hw() while at it.
v2:
- Drop rkl_sa_info and reuse icl_sa_info. (Matt)
- Add comment explaining RKL's display's peculiarity on using ICL's
parameters. (Matt)
- Don't rename xelpdp_sa_info to mtl_sa_info. Renaming of instances
to use IP names will be done in upcoming changes.
Gustavo Sousa [Mon, 18 May 2026 16:14:01 +0000 (13:14 -0300)]
drm/i915/bw: Extract platform-specific parameters
We got confirmation from the hardware team that the bandwidth parameters
deprogbwlimit and derating are platform-specific and not tied to the
display IP. As such, let's make sure that we use platform checks for
those.
The rest of the members of struct intel_sa_info are tied to the display
IP and we will deal with them as a follow-up.
v2:
- Use good old if-ladder instead of weird-looking pattern "assign ret,
check platform, then return ret". (Jani, Matt)
- Have a single call site for get_platform_bw_params() and pass the
result as parameter to the *_get_bw_info() functions. (Jani)
- Avoid using "plat" as abbreviation for "platform". (Jani)
- s/_plat_bw_params/_bw_params/, since all of the instances are
prefixed with platform names. (Jani)
- s/struct intel_platform_bw_params/struct intel_soc_bw_params/.
(Matt)
- Do not return a default value; prefer to return NULL and
intentionally cause a NULL pointer dereference if a platform is
missing. (Gustavo)
v3:
- Call get_soc_bw_params() only after the check on
HAS_DISPLAY(display). (Jani)
- Combine if-ladder branches for adl_s_bw_params into a single one.
(Matt)
- Flatten if-ladder by checking for WCL before PTL (as opposed to
checking for WCL inside the brace for PTL). (Matt)
- Bail out of intel_bw_init_hw() if display version is below 11.
(Gustavo)
v4:
- Drop drm_WARN() when no platform was matched to avoid
special-casing DG2 and any other platform that doesn't use
SoC-specific parameters. (Jani)
- Pass dram_info to get_soc_bw_params() to keep a single call to
intel_dram_info(). (Jani)
- Don't use 2 separate if-ladders (one for client and another for
discrete platforms) and keep a single one for simplicity. (Gustavo)
Gustavo Sousa [Mon, 18 May 2026 16:14:00 +0000 (13:14 -0300)]
drm/i915/bw: Don't call intel_dram_info() too early
If we end-up bailing early from intel_bw_init_hw() due to
!HAS_DISPLAY(display), the call to intel_dram_info() to initialize
dram_info will be meaningless. Move the call to be done after that
check.