Casey Connolly [Fri, 6 Mar 2026 17:47:07 +0000 (18:47 +0100)]
ASoC: detect empty DMI strings
Some bootloaders like recent versions of U-Boot may install some DMI
properties with empty values rather than not populate them. This manages
to make its way through the validator and cleanup resulting in a rogue
hyphen being appended to the card longname.
The description mixes two nodes. There is the controller, and there is
the flash. Describe the flash (which itself can be considered an mtd
device, unlike the top level controller), and move the st,smi-fast-mode
property inside, as this property is flash specific and should not live
in the parent controller node.
Fixes: 68cd8ef48452 ("dt-bindings: mtd: st,spear600-smi: convert to DT schema") Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
gpu: nova-core: gsp: fix UB in DmaGspMem pointer accessors
The DmaGspMem pointer accessor methods (gsp_write_ptr, gsp_read_ptr,
cpu_read_ptr, cpu_write_ptr, advance_cpu_read_ptr,
advance_cpu_write_ptr) dereference a raw pointer to DMA memory, creating
an intermediate reference before calling volatile read/write methods.
This is undefined behavior since DMA memory can be concurrently modified
by the device.
Fix this by moving the implementations into a gsp_mem module in fw.rs
that uses the dma_read!() / dma_write!() macros, making the original
methods on DmaGspMem thin forwarding wrappers.
An alternative approach would have been to wrap the shared memory in
Opaque, but that would have required even more unsafe code.
Since the gsp_mem module lives in fw.rs (to access firmware-specific
binding field names), GspMem, Msgq and their relevant fields are
temporarily widened to pub(super). This will be reverted once IoView
projections are available.
Xu Yang [Mon, 9 Mar 2026 07:43:13 +0000 (15:43 +0800)]
usb: roles: get usb role switch from parent only for usb-b-connector
usb_role_switch_is_parent() was walking up to the parent node and checking
for the "usb-role-switch" property regardless of the type of the passed
fwnode. This could cause unrelated device nodes to be probed as potential
role switch parent, leading to spurious matches and "-EPROBE_DEFER" being
returned infinitely.
Till now only Type-B connector node will have a parent node which may
present "usb-role-switch" property and register the role switch device.
For Type-C connector node, its parent node will always be a Type-C chip
device which will never register the role switch device. However, it may
still present a non-boolean "usb-role-switch = <&usb_controller>" property
for historical compatibility.
So restrict the helper to only operate on Type-B connector when attempting
to get the role switch from parent node.
Fixes: 6fadd72943b8 ("usb: roles: get usb-role-switch from parent") Cc: stable <stable@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Tested-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://patch.msgid.link/20260309074313.2809867-3-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fwnode_usb_role_switch_get() returns NULL only if no connection is
found, returns ERR_PTR(-EPROBE_DEFER) if connection is found but deferred
probe is needed, or a valid pointer of usb_role_switch.
When switching from a NULL check to IS_ERR_OR_NULL(), usb_role_switch_get()
returns NULL and overwrites the ERR_PTR(-EPROBE_DEFER) returned by
fwnode_usb_role_switch_get(). This causes the deferred probe indication to
be lost, preventing the USB role switch from ever being retrieved.
Fixes: 1366cd228b0c ("tcpm: allow looking for role_sw device in the main node") Cc: stable <stable@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Tested-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://patch.msgid.link/20260309074313.2809867-2-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Mon, 9 Mar 2026 12:04:52 +0000 (20:04 +0800)]
usb: gadget: f_ncm: Fix net_device lifecycle with device_move
The network device outlived its parent gadget device during
disconnection, resulting in dangling sysfs links and null pointer
dereference problems.
A prior attempt to solve this by removing SET_NETDEV_DEV entirely [1]
was reverted due to power management ordering concerns and a NO-CARRIER
regression.
A subsequent attempt to defer net_device allocation to bind [2] broke
1:1 mapping between function instance and network device, making it
impossible for configfs to report the resolved interface name. This
results in a regression where the DHCP server fails on pmOS.
Use device_move to reparent the net_device between the gadget device and
/sys/devices/virtual/ across bind/unbind cycles. This preserves the
network interface across USB reconnection, allowing the DHCP server to
retain their binding.
Introduce gether_attach_gadget()/gether_detach_gadget() helpers and use
__free(detach_gadget) macro to undo attachment on bind failure. The
bind_count ensures device_move executes only on the first bind.
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
This commit is being reverted as part of a series-wide revert.
By deferring the net_device allocation to the bind() phase, a single
function instance will spawn multiple network devices if it is symlinked
to multiple USB configurations.
This causes regressions for userspace tools (like the postmarketOS DHCP
daemon) that rely on reading the interface name (e.g., "usb0") from
configfs. Currently, configfs returns the template "usb%d", causing the
userspace network setup to fail.
Crucially, because this patch breaks the 1:1 mapping between the
function instance and the network device, this naming issue cannot
simply be patched. Configfs only exposes a single 'ifname' attribute per
instance, making it impossible to accurately report the actual interface
name when multiple underlying network devices can exist for that single
instance.
All configurations tied to the same function instance are meant to share
a single network device. Revert this change to restore the 1:1 mapping
by allocating the network device at the instance level (alloc_inst).
RD Babiera [Tue, 10 Mar 2026 20:41:05 +0000 (20:41 +0000)]
usb: typec: altmode/displayport: set displayport signaling rate in configure message
dp_altmode_configure sets the signaling rate to the current
configuration's rate and then shifts the value to the Select
Configuration bitfield. On the initial configuration, dp->data.conf
is 0 to begin with, so the signaling rate field is never set, which
leads to some DisplayPort Alt Mode partners sending NAK to the
Configure message.
Set the signaling rate to the capabilities supported by both the
port and the port partner. If the cable supports DisplayPort Alt Mode,
then include its capabilities as well.
Mathias Nyman [Wed, 4 Mar 2026 22:36:39 +0000 (00:36 +0200)]
xhci: Fix NULL pointer dereference when reading portli debugfs files
Michal reported and debgged a NULL pointer dereference bug in the
recently added portli debugfs files
Oops is caused when there are more port registers counted in
xhci->max_ports than ports reported by Supported Protocol capabilities.
This is possible if max_ports is more than maximum port number, or
if there are gaps between ports of different speeds the 'Supported
Protocol' capabilities.
In such cases port->rhub will be NULL so we can't reach xhci behind it.
Add an explicit NULL check for this case, and print portli in hex
without dereferencing port->rhub.
Reported-by: Michal Pecio <michal.pecio@gmail.com> Closes: https://lore.kernel.org/linux-usb/20260304103856.48b785fd.michal.pecio@gmail.com Fixes: 384c57ec7205 ("usb: xhci: Add debugfs support for xHCI Port Link Info (PORTLI) register.") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://patch.msgid.link/20260304223639.3882398-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dayu Jiang [Wed, 4 Mar 2026 22:36:38 +0000 (00:36 +0200)]
usb: xhci: Prevent interrupt storm on host controller error (HCE)
The xHCI controller reports a Host Controller Error (HCE) in UAS Storage
Device plug/unplug scenarios on Android devices. HCE is checked in
xhci_irq() function and causes an interrupt storm (since the interrupt
isn’t cleared), leading to severe system-level faults.
When the xHC controller reports HCE in the interrupt handler, the driver
only logs a warning and assumes xHC activity will stop as stated in xHCI
specification. An interrupt storm does however continue on some hosts
even after HCE, and only ceases after manually disabling xHC interrupt
and stopping the controller by calling xhci_halt().
Add xhci_halt() to xhci_irq() function where STS_HCE status is checked,
mirroring the existing error handling pattern used for STS_FATAL errors.
This only fixes the interrupt storm. Proper HCE recovery requires resetting
and re-initializing the xHC.
Zilin Guan [Wed, 4 Mar 2026 22:36:37 +0000 (00:36 +0200)]
usb: xhci: Fix memory leak in xhci_disable_slot()
xhci_alloc_command() allocates a command structure and, when the
second argument is true, also allocates a completion structure.
Currently, the error handling path in xhci_disable_slot() only frees
the command structure using kfree(), causing the completion structure
to leak.
Use xhci_free_command() instead of kfree(). xhci_free_command() correctly
frees both the command structure and the associated completion structure.
Since the command structure is allocated with zero-initialization,
command->in_ctx is NULL and will not be erroneously freed by
xhci_free_command().
This bug was found using an experimental static analysis tool we are
developing. The tool is based on the LLVM framework and is specifically
designed to detect memory management issues. It is currently under
active development and not yet publicly available, but we plan to
open-source it after our research is published.
The bug was originally detected on v6.13-rc1 using our static analysis
tool, and we have verified that the issue persists in the latest mainline
kernel.
We performed build testing on x86_64 with allyesconfig using GCC=11.4.0.
Since triggering these error paths in xhci_disable_slot() requires specific
hardware conditions or abnormal state, we were unable to construct a test
case to reliably trigger these specific error paths at runtime.
Oliver Neukum [Wed, 4 Mar 2026 13:01:12 +0000 (14:01 +0100)]
usb: class: cdc-wdm: fix reordering issue in read code path
Quoting the bug report:
Due to compiler optimization or CPU out-of-order execution, the
desc->length update can be reordered before the memmove. If this
happens, wdm_read() can see the new length and call copy_to_user() on
uninitialized memory. This also violates LKMM data race rules [1].
Fix it by using WRITE_ONCE and memory barriers.
Fixes: afba937e540c9 ("USB: CDC WDM driver") Cc: stable <stable@kernel.org> Signed-off-by: Oliver Neukum <oneukum@suse.com> Closes: https://lore.kernel.org/linux-usb/CALbr=LbrUZn_cfp7CfR-7Z5wDTHF96qeuM=3fO2m-q4cDrnC4A@mail.gmail.com/ Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260304130116.1721682-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fan Wu [Tue, 3 Mar 2026 07:33:44 +0000 (07:33 +0000)]
usb: renesas_usbhs: fix use-after-free in ISR during device removal
In usbhs_remove(), the driver frees resources (including the pipe array)
while the interrupt handler (usbhs_interrupt) is still registered. If an
interrupt fires after usbhs_pipe_remove() but before the driver is fully
unbound, the ISR may access freed memory, causing a use-after-free.
Fix this by calling devm_free_irq() before freeing resources. This ensures
the interrupt handler is both disabled and synchronized (waits for any
running ISR to complete) before usbhs_pipe_remove() is called.
Fixes: f1407d5c6624 ("usb: renesas_usbhs: Add Renesas USBHS common code") Cc: stable <stable@kernel.org> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Fan Wu <fanwu01@zju.edu.cn> Link: https://patch.msgid.link/20260303073344.34577-1-fanwu01@zju.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marc Zyngier [Sun, 1 Mar 2026 12:44:40 +0000 (12:44 +0000)]
usb: cdc-acm: Restore CAP_BRK functionnality to CH343
The CH343 USB/serial adapter is as buggy as it is popular (very).
One of its quirks is that despite being capable of signalling a
BREAK condition, it doesn't advertise it.
This used to work nonetheless until 66aad7d8d3ec5 ("usb: cdc-acm:
return correct error code on unsupported break") applied some
reasonable restrictions, preventing breaks from being emitted on
devices that do not advertise CAP_BRK.
Add a quirk for this particular device, so that breaks can still
be produced on some of my machines attached to my console server.
Fixes: 66aad7d8d3ec5 ("usb: cdc-acm: return correct error code on unsupported break") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable <stable@kernel.org> Cc: Oliver Neukum <oneukum@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20260301124440.1192752-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Seungjin Bae [Sat, 28 Feb 2026 10:43:25 +0000 (05:43 -0500)]
usb: gadget: f_mass_storage: Fix potential integer overflow in check_command_size_in_blocks()
The `check_command_size_in_blocks()` function calculates the data size
in bytes by left shifting `common->data_size_from_cmnd` by the block
size (`common->curlun->blkbits`). However, it does not validate whether
this shift operation will cause an integer overflow.
Initially, the block size is set up in `fsg_lun_open()` , and the
`common->data_size_from_cmnd` is set up in `do_scsi_command()`. During
initialization, there is no integer overflow check for the interaction
between two variables.
So if a malicious USB host sends a SCSI READ or WRITE command
requesting a large amount of data (`common->data_size_from_cmnd`), the
left shift operation can wrap around. This results in a truncated data
size, which can bypass boundary checks and potentially lead to memory
corruption or out-of-bounds accesses.
Fix this by using the check_shl_overflow() macro to safely perform the
shift and catch any overflows.
Fixes: 144974e7f9e3 ("usb: gadget: mass_storage: support multi-luns with different logic block size") Signed-off-by: Seungjin Bae <eeodqql09@gmail.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://patch.msgid.link/20260228104324.1696455-2-eeodqql09@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Keeping [Fri, 27 Feb 2026 11:15:39 +0000 (11:15 +0000)]
usb: gadget: f_hid: fix SuperSpeed descriptors
When adding dynamic configuration for bInterval, the value was removed
from the static SuperSpeed endpoint descriptors but was not set from the
configured value in hidg_bind(). Thus at SuperSpeed the interrupt
endpoints have bInterval as zero which is not valid per the USB
specification.
Add the missing setting for SuperSpeed endpoints.
Fixes: ea34925f5b2ee ("usb: gadget: hid: allow dynamic interval configuration via configfs") Cc: stable <stable@kernel.org> Signed-off-by: John Keeping <jkeeping@inmusicbrands.com> Acked-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260227111540.431521-1-jkeeping@inmusicbrands.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jie Deng [Fri, 27 Feb 2026 08:49:31 +0000 (16:49 +0800)]
usb: core: new quirk to handle devices with zero configurations
Some USB devices incorrectly report bNumConfigurations as 0 in their
device descriptor, which causes the USB core to reject them during
enumeration.
logs:
usb 1-2: device descriptor read/64, error -71
usb 1-2: no configurations
usb 1-2: can't read configurations, error -22
However, these devices actually work correctly when
treated as having a single configuration.
Add a new quirk USB_QUIRK_FORCE_ONE_CONFIG to handle such devices.
When this quirk is set, assume the device has 1 configuration instead
of failing with -EINVAL.
This quirk is applied to the device with VID:PID 5131:2007 which
exhibits this behavior.
usb: misc: uss720: properly clean up reference in uss720_probe()
If get_1284_register() fails, the usb device reference count is
incorrect and needs to be properly dropped before returning. That will
happen when the kref is dropped in the call to destroy_priv(), so jump
to that error path instead of returning directly.
Alan Stern [Wed, 18 Feb 2026 03:10:32 +0000 (22:10 -0500)]
USB: core: Limit the length of unkillable synchronous timeouts
The usb_control_msg(), usb_bulk_msg(), and usb_interrupt_msg() APIs in
usbcore allow unlimited timeout durations. And since they use
uninterruptible waits, this leaves open the possibility of hanging a
task for an indefinitely long time, with no way to kill it short of
unplugging the target device.
To prevent this sort of problem, enforce a maximum limit on the length
of these unkillable timeouts. The limit chosen here, somewhat
arbitrarily, is 60 seconds. On many systems (although not all) this
is short enough to avoid triggering the kernel's hung-task detector.
In addition, clear up the ambiguity of negative timeout values by
treating them the same as 0, i.e., using the maximum allowed timeout.
Alan Stern [Wed, 18 Feb 2026 03:09:22 +0000 (22:09 -0500)]
USB: usbtmc: Use usb_bulk_msg_killable() with user-specified timeouts
The usbtmc driver accepts timeout values specified by the user in an
ioctl command, and uses these timeouts for some usb_bulk_msg() calls.
Since the user can specify arbitrarily long timeouts and
usb_bulk_msg() uses unkillable waits, call usb_bulk_msg_killable()
instead to avoid the possibility of the user hanging a kernel thread
indefinitely.
Reported-by: syzbot+25ba18e2c5040447585d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/8e1c7ac5-e076-44b0-84b8-1b34b20f0ae1@suse.com/T/#t Tested-by: syzbot+25ba18e2c5040447585d@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: 048c6d88a021 ("usb: usbtmc: Add ioctls to set/get usb timeout") CC: stable@vger.kernel.org Link: https://patch.msgid.link/81c6fc24-0607-40f1-8c20-5270dab2fad5@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alan Stern [Wed, 18 Feb 2026 03:07:47 +0000 (22:07 -0500)]
USB: usbcore: Introduce usb_bulk_msg_killable()
The synchronous message API in usbcore (usb_control_msg(),
usb_bulk_msg(), and so on) uses uninterruptible waits. However,
drivers may call these routines in the context of a user thread, which
means it ought to be possible to at least kill them.
For this reason, introduce a new usb_bulk_msg_killable() function
which behaves the same as usb_bulk_msg() except for using
wait_for_completion_killable_timeout() instead of
wait_for_completion_timeout(). The same can be done later for
usb_control_msg() later on, if it turns out to be needed.
Gabor Juhos [Wed, 18 Feb 2026 20:21:07 +0000 (21:21 +0100)]
usb: core: don't power off roothub PHYs if phy_set_mode() fails
Remove the error path from the usb_phy_roothub_set_mode() function.
The code is clearly wrong, because phy_set_mode() calls can't be
balanced with phy_power_off() calls.
Additionally, the usb_phy_roothub_set_mode() function is called only
from usb_add_hcd() before it powers on the PHYs, so powering off those
makes no sense anyway.
Presumably, the code is copy-pasted from the phy_power_on() function
without adjusting the error handling.
Hans de Goede [Sat, 28 Feb 2026 14:52:58 +0000 (15:52 +0100)]
HID: input: Add HID_BATTERY_QUIRK_DYNAMIC for Elan touchscreens
Elan touchscreens have a HID-battery device for the stylus which is always
there even if there is no stylus.
This is causing upower to report an empty battery for the stylus and some
desktop-environments will show a notification about this, which is quite
annoying.
Because of this the HID-battery is being ignored on all Elan I2c and USB
touchscreens, but this causes there to be no battery reporting for
the stylus at all.
This adds a new HID_BATTERY_QUIRK_DYNAMIC and uses these for the Elan
touchscreens.
This new quirks causes the present value of the battery to start at 0,
which will make userspace ignore it and only sets present to 1 after
receiving a battery input report which only happens when the stylus
gets in range.
Reported-by: ggrundik@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221118 Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Hans de Goede [Sat, 28 Feb 2026 14:52:57 +0000 (15:52 +0100)]
HID: input: Drop Asus UX550* touchscreen ignore battery quirks
Drop the Asus UX550* touchscreen ignore battery quirks, there is a blanket
HID_BATTERY_QUIRK_IGNORE for all USB_VENDOR_ID_ELAN USB touchscreens now,
so these are just a duplicate of those.
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Long Li [Tue, 10 Mar 2026 12:32:33 +0000 (20:32 +0800)]
xfs: fix integer overflow in bmap intent sort comparator
xfs_bmap_update_diff_items() sorts bmap intents by inode number using
a subtraction of two xfs_ino_t (uint64_t) values, with the result
truncated to int. This is incorrect when two inode numbers differ by
more than INT_MAX (2^31 - 1), which is entirely possible on large XFS
filesystems.
Fix this by replacing the subtraction with cmp_int().
Cc: <stable@vger.kernel.org> # v4.9 Fixes: 9f3afb57d5f1 ("xfs: implement deferred bmbt map/unmap operations") Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chen-Yu Tsai [Mon, 2 Mar 2026 15:35:56 +0000 (23:35 +0800)]
spi: dt-bindings: sun6i: Allow Dual SPI and Quad SPI for newer SoCs
Support for Dual SPI and Quad SPI was added to the Linux driver in
commit 0605d9fb411f ("spi: sun6i: add quirk for dual and quad SPI modes
support") and commit 25453d797d7a ("spi: sun6i: add dual and quad SPI
modes support for R329/D1/R528/T113s").
However the binding was never updated to allow these modes. Allow them
by adding 2 and 4 to the allowed bus widths for the newer variants.
While at it, also add 0 to the allowed bus widths. This signals that
RX or TX is not available, i.e. the MISO or MOSI pin is disconnected.
Ben Dooks [Wed, 11 Mar 2026 10:58:35 +0000 (10:58 +0000)]
ACPI: OSL: fix __iomem type on return from acpi_os_map_generic_address()
The pointer returned from acpi_os_map_generic_address() is
tagged with __iomem, so make the rv it is returned to also
of void __iomem * type.
Fixes the following sparse warning:
drivers/acpi/osl.c:1686:20: warning: incorrect type in assignment (different address spaces)
drivers/acpi/osl.c:1686:20: expected void *rv
drivers/acpi/osl.c:1686:20: got void [noderef] __iomem *
Fixes: 6915564dc5a8 ("ACPI: OSL: Change the type of acpi_os_map_generic_address() return value") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
[ rjw: Subject tweak, added Fixes tag ] Link: https://patch.msgid.link/20260311105835.463030-1-ben.dooks@codethink.co.uk Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thomas Gleixner [Tue, 10 Mar 2026 20:29:09 +0000 (21:29 +0100)]
sched/mmcid: Avoid full tasklist walks
Chasing vfork()'ed tasks on a CID ownership mode switch requires a full
task list walk, which is obviously expensive on large systems.
Avoid that by keeping a list of tasks using a mm MMCID entity in mm::mm_cid
and walk this list instead. This removes the proven to be flaky counting
logic and avoids a full task list walk in the case of vfork()'ed tasks.
Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260310202526.183824481@kernel.org
Thomas Gleixner [Tue, 10 Mar 2026 20:29:04 +0000 (21:29 +0100)]
sched/mmcid: Remove pointless preempt guard
This is a leftover from the early versions of this function where it could
be invoked without mm::mm_cid::lock held.
Remove it and add lockdep asserts instead.
Fixes: 653fda7ae73d ("sched/mmcid: Switch over to the new mechanism") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260310202526.116363613@kernel.org
Thomas Gleixner [Tue, 10 Mar 2026 20:28:58 +0000 (21:28 +0100)]
sched/mmcid: Handle vfork()/CLONE_VM correctly
Matthieu and Jiri reported stalls where a task endlessly loops in
mm_get_cid() when scheduling in.
It turned out that the logic which handles vfork()'ed tasks is broken. It
is invoked when the number of tasks associated to a process is smaller than
the number of MMCID users. It then walks the task list to find the
vfork()'ed task, but accounts all the already processed tasks as well.
If that double processing brings the number of to be handled tasks to 0,
the walk stops and the vfork()'ed task's CID is not fixed up. As a
consequence a subsequent schedule in fails to acquire a (transitional) CID
and the machine stalls.
Cure this by removing the accounting condition and make the fixup always
walk the full task list if it could not find the exact number of users in
the process' thread list.
Thomas Gleixner [Tue, 10 Mar 2026 20:28:53 +0000 (21:28 +0100)]
sched/mmcid: Prevent CID stalls due to concurrent forks
A newly forked task is accounted as MMCID user before the task is visible
in the process' thread list and the global task list. This creates the
following problem:
CPU1 CPU2
fork()
sched_mm_cid_fork(tnew1)
tnew1->mm.mm_cid_users++;
tnew1->mm_cid.cid = getcid()
-> preemption
fork()
sched_mm_cid_fork(tnew2)
tnew2->mm.mm_cid_users++;
// Reaches the per CPU threshold
mm_cid_fixup_tasks_to_cpus()
for_each_other(current, p)
....
As tnew1 is not visible yet, this fails to fix up the already allocated CID
of tnew1. As a consequence a subsequent schedule in might fail to acquire a
(transitional) CID and the machine stalls.
Move the invocation of sched_mm_cid_fork() after the new task becomes
visible in the thread and the task list to prevent this.
This also makes it symmetrical vs. exit() where the task is removed as CID
user before the task is removed from the thread and task lists.
Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260310202525.969061974@kernel.org
Merge tag 'stratix10_svc_fix_for_v7.0' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into char-misc-linus
Dinh writes:
firmware: stratix10-svc: add multiple svc clients support
- Add a dedicated thread for each svc client to fix a timeout issue when the svc
driver is handling multiple clients.
* tag 'stratix10_svc_fix_for_v7.0' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
firmware: stratix10-svc: Add Multi SVC clients support
Steven Rostedt [Sat, 7 Mar 2026 02:24:03 +0000 (21:24 -0500)]
time/jiffies: Mark jiffies_64_to_clock_t() notrace
The trace_clock_jiffies() function that handles the "uptime" clock for
tracing calls jiffies_64_to_clock_t(). This causes the function tracer to
constantly recurse when the tracing clock is set to "uptime". Mark it
notrace to prevent unnecessary recursion when using the "uptime" clock.
Fixes: 58d4e21e50ff3 ("tracing: Fix wraparound problems in "uptime" trace clock") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260306212403.72270bb2@robin
Raphael Zimmer [Tue, 10 Mar 2026 14:28:15 +0000 (15:28 +0100)]
libceph: Fix potential out-of-bounds access in ceph_handle_auth_reply()
This patch fixes an out-of-bounds access in ceph_handle_auth_reply()
that can be triggered by a message of type CEPH_MSG_AUTH_REPLY. In
ceph_handle_auth_reply(), the value of the payload_len field of such a
message is stored in a variable of type int. A value greater than
INT_MAX leads to an integer overflow and is interpreted as a negative
value. This leads to decrementing the pointer address by this value and
subsequently accessing it because ceph_decode_need() only checks that
the memory access does not exceed the end address of the allocation.
This patch fixes the issue by changing the data type of payload_len to
u32. Additionally, the data type of result_msg_len is changed to u32,
as it is also a variable holding a non-negative length.
Also, an additional layer of sanity checks is introduced, ensuring that
directly after reading it from the message, payload_len and
result_msg_len are not greater than the overall segment length.
BUG: KASAN: slab-out-of-bounds in ceph_handle_auth_reply+0x642/0x7a0 [libceph]
Read of size 4 at addr ffff88811404df14 by task kworker/20:1/262
Raphael Zimmer [Thu, 26 Feb 2026 15:07:31 +0000 (16:07 +0100)]
libceph: Use u32 for non-negative values in ceph_monmap_decode()
This patch fixes unnecessary implicit conversions that change signedness
of blob_len and num_mon in ceph_monmap_decode().
Currently blob_len and num_mon are (signed) int variables. They are used
to hold values that are always non-negative and get assigned in
ceph_decode_32_safe(), which is meant to assign u32 values. Both
variables are subsequently used as unsigned values, and the value of
num_mon is further assigned to monmap->num_mon, which is of type u32.
Therefore, both variables should be of type u32. This is especially
relevant for num_mon. If the value read from the incoming message is
very large, it is interpreted as a negative value, and the check for
num_mon > CEPH_MAX_MON does not catch it. This leads to the attempt to
allocate a very large chunk of memory for monmap, which will most likely
fail. In this case, an unnecessary attempt to allocate memory is
performed, and -ENOMEM is returned instead of -EINVAL.
Barnabás Pőcze [Tue, 10 Mar 2026 20:44:03 +0000 (20:44 +0000)]
gpiolib: clear requested flag if line is invalid
If `gpiochip_line_is_valid()` fails, then `-EINVAL` is returned, but
`desc->flags` will have `GPIOD_FLAG_REQUESTED` set, which will result
in subsequent calls misleadingly returning `-EBUSY`.
Lianqin Hu [Wed, 11 Mar 2026 07:22:38 +0000 (07:22 +0000)]
ALSA: usb-audio: Add iface reset and delay quirk for SPACETOUCH USB Audio
Setting up the interface when suspended/resumeing fail on this card.
Adding a reset and delay quirk will eliminate this problem.
usb 1-1: New USB device found, idVendor=0666, idProduct=0880
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1: Product: USB Audio
usb 1-1: Manufacturer: SPACETOUCH
usb 1-1: SerialNumber: 000000000
Daniel Gomez [Tue, 10 Mar 2026 11:36:23 +0000 (12:36 +0100)]
scripts: kconfig: merge_config.sh: pass output file as awk variable
The refactoring commit 5fa9b82cbcfc5 ("scripts: kconfig:
merge_config.sh: refactor from shell/sed/grep to awk") passes
$TMP_FILE.new as ARGV[3] to awk, using it as both an output destination
and an input file argument. When the base file is empty, nothing is
written to ARGV[3] during processing, so awk fails trying to open it
for reading:
awk: cmd. line:52: fatal: cannot open file
`./.tmp.config.grcQin34jb.new' for reading: No such file or directory
mv: cannot stat './.tmp.config.grcQin34jb.new': No such file or directory
Pass the output path via -v outfile instead and drop the FILENAME ==
ARGV[3] { nextfile }.
Linus Torvalds [Wed, 11 Mar 2026 03:30:52 +0000 (20:30 -0700)]
Merge tag 'v7.0-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix potential use after free errors
- Fix refcount leak in smb2 open error path
- Prevent allowing logging signing or encryption keys
* tag 'v7.0-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Don't log keys in SMB3 signing and encryption key generation
smb: server: fix use-after-free in smb2_open()
ksmbd: fix use-after-free in smb_lazy_parent_lease_break_close()
ksmbd: fix use-after-free by using call_rcu() for oplock_info
ksmbd: fix use-after-free in proc_show_files due to early rcu_read_unlock
smb/server: Fix another refcount leak in smb2_open()
Nicolai Buchwitz [Tue, 10 Mar 2026 05:49:35 +0000 (06:49 +0100)]
net: bcmgenet: fix broken EEE by converting to phylib-managed state
The bcmgenet EEE implementation is broken in several ways.
phy_support_eee() is never called, so the PHY never advertises EEE
and phylib never sets phydev->enable_tx_lpi. bcmgenet_mac_config()
checks priv->eee.eee_enabled to decide whether to enable the MAC
LPI logic, but that field is never initialised to true, so the MAC
never enters Low Power Idle even when EEE is negotiated - wasting
the power savings EEE is designed to provide. The only way to get
EEE working at all is a manual 'ethtool --set-eee eth0 eee on' after
every link-up, and even then bcmgenet_get_eee() immediately clobbers
the reported state because phy_ethtool_get_eee() overwrites
eee_enabled and tx_lpi_enabled with the uninitialised PHY eee_cfg
values. Finally, bcmgenet_mac_config() is only called on link-up,
so EEE is never disabled in hardware on link-down.
Fix all of this by removing the MAC-side EEE state tracking
(priv->eee) and aligning with the pattern used by other non-phylink
MAC drivers such as FEC.
Call phy_support_eee() in bcmgenet_mii_probe() so the PHY advertises
EEE link modes and phylib tracks negotiation state. Move the EEE
hardware control to bcmgenet_mii_setup(), which is called on every
link event, and drive it directly from phydev->enable_tx_lpi - the
flag phylib sets when EEE is negotiated and the user has not disabled
it. This enables EEE automatically once the link partner agrees and
disables it cleanly on link-down.
Make bcmgenet_get_eee() and bcmgenet_set_eee() pure passthroughs to
phy_ethtool_get_eee() and phy_ethtool_set_eee(), with the MAC
hardware register read/written for tx_lpi_timer. Drop struct
ethtool_keee eee from struct bcmgenet_priv.
Paul Moses [Mon, 9 Mar 2026 17:35:10 +0000 (17:35 +0000)]
net-shapers: don't free reply skb after genlmsg_reply()
genlmsg_reply() hands the reply skb to netlink, and
netlink_unicast() consumes it on all return paths, whether the
skb is queued successfully or freed on an error path.
net_shaper_nl_get_doit() and net_shaper_nl_cap_get_doit()
currently jump to free_msg after genlmsg_reply() fails and call
nlmsg_free(msg), which can hit the same skb twice.
Return the genlmsg_reply() error directly and keep free_msg
only for pre-reply failures.
Fixes: 4b623f9f0f59 ("net-shapers: implement NL get operation") Fixes: 553ea9f1efd6 ("net: shaper: implement introspection support") Cc: stable@vger.kernel.org Signed-off-by: Paul Moses <p@1g4.org> Link: https://patch.msgid.link/20260309173450.538026-2-p@1g4.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Golle [Tue, 10 Mar 2026 00:41:56 +0000 (00:41 +0000)]
net: dsa: mxl862xx: don't set user_mii_bus
The PHY addresses in the MII bus are not equal to the port addresses,
so the bus cannot be assigned as user_mii_bus. Falling back on the
user_mii_bus in case a PHY isn't declared in device tree will result in
using the wrong (in this case: off-by-+1) PHY.
Remove the wrong assignment.
Fan Wu [Mon, 9 Mar 2026 13:24:09 +0000 (13:24 +0000)]
net: ethernet: arc: emac: quiesce interrupts before requesting IRQ
Normal RX/TX interrupts are enabled later, in arc_emac_open(), so probe
should not see interrupt delivery in the usual case. However, hardware may
still present stale or latched interrupt status left by firmware or the
bootloader.
If probe later unwinds after devm_request_irq() has installed the handler,
such a stale interrupt can still reach arc_emac_intr() during teardown and
race with release of the associated net_device.
Avoid that window by putting the device into a known quiescent state before
requesting the IRQ: disable all EMAC interrupt sources and clear any
pending EMAC interrupt status bits. This keeps the change hardware-focused
and minimal, while preventing spurious IRQ delivery from leftover state.
Jakub Kicinski [Tue, 10 Mar 2026 00:39:07 +0000 (17:39 -0700)]
page_pool: store detach_time as ktime_t to avoid false-negatives
While testing other changes in vng I noticed that
nl_netdev.page_pool_check flakes. This never happens in real CI.
Turns out vng may boot and get to that test in less than a second.
page_pool_detached() records the detach time in seconds, so if
vng is fast enough detach time is set to 0. Other code treats
0 as "not detached". detach_time is only used to report the state
to the user, so it's not a huge deal in practice but let's fix it.
Store the raw ktime_t (nanoseconds) instead. A nanosecond value
of 0 is practically impossible.
Kevin Hao [Sat, 7 Mar 2026 07:08:54 +0000 (15:08 +0800)]
net: macb: Shuffle the tx ring before enabling tx
Quanyang observed that when using an NFS rootfs on an AMD ZynqMp board,
the rootfs may take an extended time to recover after a suspend.
Upon investigation, it was determined that the issue originates from a
problem in the macb driver.
According to the Zynq UltraScale TRM [1], when transmit is disabled,
the transmit buffer queue pointer resets to point to the address
specified by the transmit buffer queue base address register.
In the current implementation, the code merely resets `queue->tx_head`
and `queue->tx_tail` to '0'. This approach presents several issues:
- Packets already queued in the tx ring are silently lost,
leading to memory leaks since the associated skbs cannot be released.
- Concurrent write access to `queue->tx_head` and `queue->tx_tail` may
occur from `macb_tx_poll()` or `macb_start_xmit()` when these values
are reset to '0'.
- The transmission may become stuck on a packet that has already been sent
out, with its 'TX_USED' bit set, but has not yet been processed. However,
due to the manipulation of 'queue->tx_head' and 'queue->tx_tail',
`macb_tx_poll()` incorrectly assumes there are no packets to handle
because `queue->tx_head == queue->tx_tail`. This issue is only resolved
when a new packet is placed at this position. This is the root cause of
the prolonged recovery time observed for the NFS root filesystem.
To resolve this issue, shuffle the tx ring and tx skb array so that
the first unsent packet is positioned at the start of the tx ring.
Additionally, ensure that updates to `queue->tx_head` and
`queue->tx_tail` are properly protected with the appropriate lock.
Fixes: bf9cf80cab81 ("net: macb: Fix tx/rx malfunction after phy link down and up") Reported-by: Quanyang Wang <quanyang.wang@windriver.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260307-zynqmp-v2-1-6ef98a70e1d0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wei Yang [Thu, 5 Mar 2026 01:50:06 +0000 (01:50 +0000)]
mm/huge_memory: fix early failure try_to_migrate() when split huge pmd for shared THP
Commit 60fbb14396d5 ("mm/huge_memory: adjust try_to_migrate_one() and
split_huge_pmd_locked()") return false unconditionally after
split_huge_pmd_locked(). This may fail try_to_migrate() early when
TTU_SPLIT_HUGE_PMD is specified.
The reason is the above commit adjusted try_to_migrate_one() to, when a
PMD-mapped THP entry is found, and TTU_SPLIT_HUGE_PMD is specified (for
example, via unmap_folio()), return false unconditionally. This breaks
the rmap walk and fail try_to_migrate() early, if this PMD-mapped THP is
mapped in multiple processes.
The user sensible impact of this bug could be:
* On memory pressure, shrink_folio_list() may split partially mapped
folio with split_folio_to_list(). Then free unmapped pages without IO.
If failed, it may not be reclaimed.
* On memory failure, memory_failure() would call try_to_split_thp_page()
to split folio contains the bad page. If succeed, the PG_has_hwpoisoned
bit is only set in the after-split folio contains @split_at. By doing
so, we limit bad memory. If failed to split, the whole folios is not
usable.
One way to reproduce:
Create an anonymous THP range and fork 512 children, so we have a
THP shared mapped in 513 processes. Then trigger folio split with
/sys/kernel/debug/split_huge_pages debugfs to split the THP folio to
order 0.
Without the above commit, we can successfully split to order 0. With the
above commit, the folio is still a large folio.
And currently there are two core users of TTU_SPLIT_HUGE_PMD:
* try_to_unmap_one()
* try_to_migrate_one()
try_to_unmap_one() would restart the rmap walk, so only
try_to_migrate_one() is affected.
We can't simply revert commit 60fbb14396d5 ("mm/huge_memory: adjust
try_to_migrate_one() and split_huge_pmd_locked()"), since it removed some
duplicated check covered by page_vma_mapped_walk().
This patch fixes this by restart page_vma_mapped_walk() after
split_huge_pmd_locked(). Since we cannot simply return "true" to fix the
problem, as that would affect another case:
When invoking folio_try_share_anon_rmap_pmd() from
split_huge_pmd_locked(), the latter can fail and leave a large folio
mapped through PTEs, in which case we ought to return true from
try_to_migrate_one(). This might result in unnecessary walking of the
rmap but is relatively harmless.
Link: https://lkml.kernel.org/r/20260305015006.27343-1-richard.weiyang@gmail.com Fixes: 60fbb14396d5 ("mm/huge_memory: adjust try_to_migrate_one() and split_huge_pmd_locked()") Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Tested-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Gavin Guo <gavinguo@igalia.com> Acked-by: David Hildenbrand (arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dev Jain [Tue, 3 Mar 2026 06:15:28 +0000 (11:45 +0530)]
mm/rmap: fix incorrect pte restoration for lazyfree folios
We batch unmap anonymous lazyfree folios by folio_unmap_pte_batch. If the
batch has a mix of writable and non-writable bits, we may end up setting
the entire batch writable. Fix this by respecting writable bit during
batching.
Although on a successful unmap of a lazyfree folio, the soft-dirty bit is
lost, preserve it on pte restoration by respecting the bit during
batching, to make the fix consistent w.r.t both writable bit and
soft-dirty bit.
I was able to write the below reproducer and crash the kernel.
Explanation of reproducer (set 64K mTHP to always):
Fault in a 64K large folio. Split the VMA at mid-point with
MADV_DONTFORK. fork() - parent points to the folio with 8 writable ptes
and 8 non-writable ptes. Merge the VMAs with MADV_DOFORK so that
folio_unmap_pte_batch() can determine all the 16 ptes as a batch. Do
MADV_FREE on the range to mark the folio as lazyfree. Write to the memory
to dirty the pte, eventually rmap will dirty the folio. Then trigger
reclaim, we will hit the pte restoration path, and the kernel will crash
with the trace given below.
The code path is asking for anonymous page to be mapped writable into the
pagetable. The BUG_ON() firing implies that such a writable page has been
mapped into the pagetables of more than one process, which breaks
anonymous memory/CoW semantics.
Link: https://lkml.kernel.org/r/20260303061528.2429162-1-dev.jain@arm.com Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation") Signed-off-by: Dev Jain <dev.jain@arm.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Tested-by: Lance Yang <lance.yang@linux.dev> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chris Down [Tue, 3 Mar 2026 07:21:21 +0000 (07:21 +0000)]
mm/huge_memory: fix use of NULL folio in move_pages_huge_pmd()
move_pages_huge_pmd() handles UFFDIO_MOVE for both normal THPs and huge
zero pages. For the huge zero page path, src_folio is explicitly set to
NULL, and is used as a sentinel to skip folio operations like lock and
rmap.
In the huge zero page branch, src_folio is NULL, so folio_mk_pmd(NULL,
pgprot) passes NULL through folio_pfn() and page_to_pfn(). With
SPARSEMEM_VMEMMAP this silently produces a bogus PFN, installing a PMD
pointing to non-existent physical memory. On other memory models it is a
NULL dereference.
Use page_folio(src_page) to obtain the valid huge zero folio from the
page, which was obtained from pmd_page() and remains valid throughout.
After commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge
zero folio special"), moved huge zero PMDs must remain special so
vm_normal_page_pmd() continues to treat them as special mappings.
move_pages_huge_pmd() currently reconstructs the destination PMD in the
huge zero page branch, which drops PMD state such as pmd_special() on
architectures with CONFIG_ARCH_HAS_PTE_SPECIAL. As a result,
vm_normal_page_pmd() can treat the moved huge zero PMD as a normal page
and corrupt its refcount.
Instead of reconstructing the PMD from the folio, derive the destination
entry from src_pmdval after pmdp_huge_clear_flush(), then handle the PMD
metadata the same way move_huge_pmd() does for moved entries by marking it
soft-dirty and clearing uffd-wp.
Link: https://lkml.kernel.org/r/a1e787dd-b911-474d-8570-f37685357d86@lucifer.local Fixes: e3981db444a0 ("mm: add folio_mk_pmd()") Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Tested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Randy Dunlap [Mon, 2 Mar 2026 00:51:44 +0000 (16:51 -0800)]
build_bug.h: correct function parameters names in kernel-doc
Use the correct function (or macro) names to avoid kernel-doc warnings:
Warning: include/linux/build_bug.h:38 function parameter 'cond' not
described in 'BUILD_BUG_ON_MSG'
Warning: include/linux/build_bug.h:38 function parameter 'msg' not
described in 'BUILD_BUG_ON_MSG'
Warning: include/linux/build_bug.h:76 function parameter 'expr' not
described in 'static_assert'
Thorsten Blum [Fri, 27 Feb 2026 23:00:09 +0000 (00:00 +0100)]
crash_dump: don't log dm-crypt key bytes in read_key_from_user_keying
When debug logging is enabled, read_key_from_user_keying() logs the first
8 bytes of the key payload and partially exposes the dm-crypt key. Stop
logging any key bytes.
Link: https://lkml.kernel.org/r/20260227230008.858641-2-thorsten.blum@linux.dev Fixes: 479e58549b0f ("crash_dump: store dm crypt keys in kdump reserved memory") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ye Bin [Tue, 10 Mar 2026 13:08:47 +0000 (21:08 +0800)]
smb/client: only export symbol for 'smb2maperror-test' module
Only export smb2_get_err_map_test smb2_error_map_table_test and
smb2_error_map_num symbol for 'smb2maperror-test' module.
Fixes: 7d0bf050a587 ("smb/client: make SMB2 maperror KUnit tests a separate module") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Bharath SM [Mon, 9 Mar 2026 10:30:49 +0000 (16:00 +0530)]
smb: client: fix in-place encryption corruption in SMB2_write()
SMB2_write() places write payload in iov[1..n] as part of rq_iov.
smb3_init_transform_rq() pointer-shares rq_iov, so crypt_message()
encrypts iov[1] in-place, replacing the original plaintext with
ciphertext. On a replayable error, the retry sends the same iov[1]
which now contains ciphertext instead of the original data,
resulting in corruption.
The corruption is most likely to be observed when connections are
unstable, as reconnects trigger write retries that re-send the
already-encrypted data.
This affects SFU mknod, MF symlinks, etc. On kernels before
6.10 (prior to the netfs conversion), sync writes also used
this path and were similarly affected. The async write path
wasn't unaffected as it uses rq_iter which gets deep-copied.
Fix by moving the write payload into rq_iter via iov_iter_kvec(),
so smb3_init_transform_rq() deep-copies it before encryption.
Cc: stable@vger.kernel.org #6.3+ Acked-by: Henrique Carvalho <henrique.carvalho@suse.com> Acked-by: Shyam Prasad N <sprasad@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Arnd Bergmann [Fri, 6 Mar 2026 15:07:13 +0000 (16:07 +0100)]
smb: client: fix sbflags initialization
The newly introduced variable is initialized in an #ifdef block
but used outside of it, leading to undefined behavior when
CONFIG_CIFS_ALLOW_INSECURE_LEGACY is disabled:
fs/smb/client/dir.c:417:9: error: variable 'sbflags' is uninitialized when used here [-Werror,-Wuninitialized]
417 | if (sbflags & CIFS_MOUNT_DYNPERM)
| ^~~~~~~
Move the initialization into the declaration, the same way as the
other similar function do it.
Fixes: 4fc3a433c139 ("smb: client: use atomic_t for mnt_cifs_flags") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Sat, 7 Mar 2026 21:20:16 +0000 (18:20 -0300)]
smb: client: fix atomic open with O_DIRECT & O_SYNC
When user application requests O_DIRECT|O_SYNC along with O_CREAT on
open(2), CREATE_NO_BUFFER and CREATE_WRITE_THROUGH bits were missed in
CREATE request when performing an atomic open, thus leading to
potentially data integrity issues.
Fix this by setting those missing bits in CREATE request when
O_DIRECT|O_SYNC has been specified in cifs_do_create().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Henrique Carvalho <henrique.carvalho@suse.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Matt Vollrath [Tue, 24 Feb 2026 23:28:33 +0000 (18:28 -0500)]
e1000/e1000e: Fix leak in DMA error cleanup
If an error is encountered while mapping TX buffers, the driver should
unmap any buffers already mapped for that skb.
Because count is incremented after a successful mapping, it will always
match the correct number of unmappings needed when dma_error is reached.
Decrementing count before the while loop in dma_error causes an
off-by-one error. If any mapping was successful before an unsuccessful
mapping, exactly one DMA mapping would leak.
In these commits, a faulty while condition caused an infinite loop in
dma_error:
Commit 03b1320dfcee ("e1000e: remove use of skb_dma_map from e1000e
driver")
Commit 602c0554d7b0 ("e1000: remove use of skb_dma_map from e1000 driver")
Commit c1fa347f20f1 ("e1000/e1000e/igb/igbvf/ixgb/ixgbe: Fix tests of
unsigned in *_tx_map()") fixed the infinite loop, but introduced the
off-by-one error.
This issue may still exist in the igbvf driver, but I did not address it
in this patch.
Fixes: c1fa347f20f1 ("e1000/e1000e/igb/igbvf/ixgb/ixgbe: Fix tests of unsigned in *_tx_map()") Assisted-by: Claude:claude-4.6-opus Signed-off-by: Matt Vollrath <tactii@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Alok Tiwari [Mon, 10 Nov 2025 19:13:38 +0000 (11:13 -0800)]
i40e: fix src IP mask checks and memcpy argument names in cloud filter
Fix following issues in the IPv4 and IPv6 cloud filter handling logic in
both the add and delete paths:
- The source-IP mask check incorrectly compares mask.src_ip[0] against
tcf.dst_ip[0]. Update it to compare against tcf.src_ip[0]. This likely
goes unnoticed because the check is in an "else if" path that only
executes when dst_ip is not set, most cloud filter use cases focus on
destination-IP matching, and the buggy condition can accidentally
evaluate true in some cases.
- memcpy() for the IPv4 source address incorrectly uses
ARRAY_SIZE(tcf.dst_ip) instead of ARRAY_SIZE(tcf.src_ip), although
both arrays are the same size.
- The IPv4 memcpy operations used ARRAY_SIZE(tcf.dst_ip) and ARRAY_SIZE
(tcf.src_ip), Update these to use sizeof(cfilter->ip.v4.dst_ip) and
sizeof(cfilter->ip.v4.src_ip) to ensure correct and explicit copy size.
- In the IPv6 delete path, memcmp() uses sizeof(src_ip6) when comparing
dst_ip6 fields. Replace this with sizeof(dst_ip6) to make the intent
explicit, even though both fields are struct in6_addr.
Fixes: e284fc280473 ("i40e: Add and delete cloud filter") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 10 Mar 2026 19:47:56 +0000 (12:47 -0700)]
Merge tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"15 hotfixes. 6 are cc:stable. 14 are for MM.
Singletons, with one doubleton - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS, mailmap: update email address for Lorenzo Stoakes
mm/mmu_notifier: clean up mmu_notifier.h kernel-doc
uaccess: correct kernel-doc parameter format
mm/huge_memory: fix a folio_split() race condition with folio_try_get()
MAINTAINERS: add co-maintainer and reviewer for SLAB ALLOCATOR
MAINTAINERS: add RELAY entry
memcg: fix slab accounting in refill_obj_stock() trylock path
mm/hugetlb.c: use __pa() instead of virt_to_phys() in early bootmem alloc code
zram: rename writeback_compressed device attr
tools/testing: fix testing/vma and testing/radix-tree build
Revert "ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor()"
mm/cma: move put_page_testzero() out of VM_WARN_ON in cma_release()
mm/damon/core: clear walk_control on inactive context in damos_walk()
mm: memfd_luo: always dirty all folios
mm: memfd_luo: always make all folios uptodate
Peter Ujfalusi [Tue, 10 Mar 2026 06:53:50 +0000 (08:53 +0200)]
ASoC: codecs: rt1011: Use component to get the dapm context in spk_mode_put
The correct helper to use in rt1011_recv_spk_mode_put() to retrieve the
DAPM context is snd_soc_component_to_dapm(), from kcontrol we will
receive NULL pointer.
Closes: https://github.com/thesofproject/linux/issues/5691 Fixes: 5b35bb517f27 ("ASoC: codecs: rt1011: convert to snd_soc_dapm_xxx()") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20260310065350.18921-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
Paul Chaignon [Tue, 10 Mar 2026 11:39:51 +0000 (12:39 +0100)]
selftests/bpf: Fix pkg-config call on static builds
For commit b0dcdcb9ae75 ("resolve_btfids: Fix linker flags detection"),
I suggested setting HOSTPKG_CONFIG to $PKG_CONFIG when compiling
resolve_btfids, but I forgot the quotes around that variable.
As a result, when running vmtest.sh with static linking, it fails as
follows:
This worked when I tested it because HOSTPKG_CONFIG didn't have a
default value in the resolve_btfids Makefile, but once it does, the
quotes aren't preserved and it fails on the next make call.
Zhang Rui [Wed, 4 Mar 2026 00:30:12 +0000 (08:30 +0800)]
tools/power turbostat: Fix illegal memory access when SMT is present and disabled
When SMT is present and disabled, turbostat may under-size
the thread_data array. This can corrupt results or
cause turbostat to exit with a segmentation fault.
[lenb: commit message] Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Sachin Kumar [Mon, 9 Mar 2026 18:25:42 +0000 (18:25 +0000)]
bpf: Fix constant blinding for PROBE_MEM32 stores
BPF_ST | BPF_PROBE_MEM32 immediate stores are not handled by
bpf_jit_blind_insn(), allowing user-controlled 32-bit immediates to
survive unblinded into JIT-compiled native code when bpf_jit_harden >= 1.
The root cause is that convert_ctx_accesses() rewrites BPF_ST|BPF_MEM
to BPF_ST|BPF_PROBE_MEM32 for arena pointer stores during verification,
before bpf_jit_blind_constants() runs during JIT compilation. The
blinding switch only matches BPF_ST|BPF_MEM (mode 0x60), not
BPF_ST|BPF_PROBE_MEM32 (mode 0xa0). The instruction falls through
unblinded.
Add BPF_ST|BPF_PROBE_MEM32 cases to bpf_jit_blind_insn() alongside the
existing BPF_ST|BPF_MEM cases. The blinding transformation is identical:
load the blinded immediate into BPF_REG_AX via mov+xor, then convert
the immediate store to a register store (BPF_STX).
The rewritten STX instruction must preserve the BPF_PROBE_MEM32 mode so
the architecture JIT emits the correct arena addressing (R12-based on
x86-64). Cannot use the BPF_STX_MEM() macro here because it hardcodes
BPF_MEM mode; construct the instruction directly instead.
Lizhi Hou [Tue, 10 Mar 2026 18:00:58 +0000 (11:00 -0700)]
accel/amdxdna: Fix runtime suspend deadlock when there is pending job
The runtime suspend callback drains the running job workqueue before
suspending the device. If a job is still executing and calls
pm_runtime_resume_and_get(), it can deadlock with the runtime suspend
path.
Fix this by moving pm_runtime_resume_and_get() from the job execution
routine to the job submission routine, ensuring the device is resumed
before the job is queued and avoiding the deadlock during runtime
suspend.
Yazhou Tang [Wed, 4 Mar 2026 08:32:28 +0000 (16:32 +0800)]
selftests/bpf: Add test for BPF_END register ID reset
Add a test case to ensure that BPF_END operations correctly break
register's scalar ID ties.
The test creates a scenario where r1 is a copy of r0, r0 undergoes a
byte swap, and then r0 is checked against a constant.
- Without the fix in the verifier, the bounds learned from r0 are
incorrectly propagated to r1, making the verifier believe r1 is
bounded and wrongly allowing subsequent pointer arithmetic.
- With the fix, r1 remains an unbounded scalar, and the verifier
correctly rejects the arithmetic operation between the frame pointer
and the unbounded register.
Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260304083228.142016-3-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Yazhou Tang [Wed, 4 Mar 2026 08:32:27 +0000 (16:32 +0800)]
bpf: Reset register ID for BPF_END value tracking
When a register undergoes a BPF_END (byte swap) operation, its scalar
value is mutated in-place. If this register previously shared a scalar ID
with another register (e.g., after an `r1 = r0` assignment), this tie must
be broken.
Currently, the verifier misses resetting `dst_reg->id` to 0 for BPF_END.
Consequently, if a conditional jump checks the swapped register, the
verifier incorrectly propagates the learned bounds to the linked register,
leading to false confidence in the linked register's value and potentially
allowing out-of-bounds memory accesses.
Fix this by explicitly resetting `dst_reg->id` to 0 in the BPF_END case
to break the scalar tie, similar to how BPF_NEG handles it via
`__mark_reg_known`.
Jessica Liu [Tue, 10 Mar 2026 06:17:31 +0000 (14:17 +0800)]
irqchip/riscv-aplic: Register syscore operations only once
Since commit 95a8ddde3660 ("irqchip/riscv-aplic: Preserve APLIC
states across suspend/resume"), when multiple NUMA nodes exist
and AIA is not configured as "none", aplic_probe() is called
multiple times. This leads to register_syscore(&aplic_syscore)
being invoked repeatedly, causing the following Oops:
Jessica Liu [Tue, 10 Mar 2026 06:16:00 +0000 (14:16 +0800)]
irqchip/riscv-aplic: Do not clear ACPI dependencies on probe failure
aplic_probe() calls acpi_dev_clear_dependencies() unconditionally at the
end, even when the preceding setup (MSI or direct mode) has failed. This is
incorrect because if the device failed to probe, it should not be
considered as active and should not clear dependencies for other devices
waiting on it.
Fix this by returning immediately when the setup fails, skipping the ACPI
dependency cleanup. Also, explicitly return 0 on success instead of relying
on the value of 'rc' to make the success path clear.
Tim Kovalenko [Mon, 9 Mar 2026 16:34:21 +0000 (12:34 -0400)]
gpu: nova-core: fix stack overflow in GSP memory allocation
The `Cmdq::new` function was allocating a `PteArray` struct on the stack
and was causing a stack overflow with 8216 bytes.
Modify the `PteArray` to calculate and write the Page Table Entries
directly into the coherent DMA buffer one-by-one. This reduces the stack
usage quite a lot.
Yihan Ding [Fri, 6 Mar 2026 02:16:51 +0000 (10:16 +0800)]
landlock: Clean up interrupted thread logic in TSYNC
In landlock_restrict_sibling_threads(), when the calling thread is
interrupted while waiting for sibling threads to prepare, it executes
a recovery path.
Previously, this path included a wait_for_completion() call on
all_prepared to prevent a Use-After-Free of the local shared_ctx.
However, this wait is redundant. Exiting the main do-while loop
already leads to a bottom cleanup section that unconditionally waits
for all_finished. Therefore, replacing the wait with a simple break
is safe, prevents UAF, and correctly unblocks the remaining task_works.
Clean up the error path by breaking the loop and updating the
surrounding comments to accurately reflect the state machine.
Yihan Ding [Fri, 6 Mar 2026 02:16:50 +0000 (10:16 +0800)]
landlock: Serialize TSYNC thread restriction
syzbot found a deadlock in landlock_restrict_sibling_threads().
When multiple threads concurrently call landlock_restrict_self() with
sibling thread restriction enabled, they can deadlock by mutually
queueing task_works on each other and then blocking in kernel space
(waiting for the other to finish).
Fix this by serializing the TSYNC operations within the same process
using the exec_update_lock. This prevents concurrent invocations
from deadlocking.
We use down_write_trylock() and restart the syscall if the lock
cannot be acquired immediately. This ensures that if a thread fails
to get the lock, it will return to userspace, allowing it to process
any pending TSYNC task_works from the lock holder, and then
transparently restart the syscall.
firmware: stratix10-svc: Add Multi SVC clients support
In the current implementation, SVC client drivers such as socfpga-hwmon,
intel_fcs, stratix10-soc, stratix10-rsu each send an SMC command that
triggers a single thread in the stratix10-svc driver. Upon receiving a
callback, the initiating client driver sends a stratix10-svc-done signal,
terminating the thread without waiting for other pending SMC commands to
complete. This leads to a timeout issue in the firmware SVC mailbox service
when multiple client drivers send SMC commands concurrently.
To resolve this issue, a dedicated thread is now created per channel. The
stratix10-svc driver will support up to the number of channels defined by
SVC_NUM_CHANNEL. Thread synchronization is handled using a mutex to prevent
simultaneous issuance of SMC commands by multiple threads.
SVC_NUM_DATA_IN_FIFO is reduced from 32 to 8, since each channel now has
its own dedicated FIFO and the SDM processes commands one at a time.
8 entries per channel is sufficient while keeping the total aggregate
capacity the same (4 channels x 8 = 32 entries).
Additionally, a thread task is now validated before invoking kthread_stop
when the user aborts, ensuring safe termination.
Timeout values have also been adjusted to accommodate the increased load
from concurrent client driver activity.
Fixes: 7ca5ce896524 ("firmware: add Intel Stratix10 service layer driver") Cc: stable@vger.kernel.org Signed-off-by: Ang Tien Sung <tien.sung.ang@altera.com> Signed-off-by: Fong, Yan Kei <yankei.fong@altera.com> Signed-off-by: Muhammad Amirul Asyraf Mohamad Jamian <muhammad.amirul.asyraf.mohamad.jamian@altera.com> Link: https://lore.kernel.org/all/20260305093151.2678-1-muhammad.amirul.asyraf.mohamad.jamian@altera.com Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Petr Oros [Wed, 11 Feb 2026 19:18:55 +0000 (20:18 +0100)]
iavf: fix incorrect reset handling in callbacks
Three driver callbacks schedule a reset and wait for its completion:
ndo_change_mtu(), ethtool set_ringparam(), and ethtool set_channels().
Waiting for reset in ndo_change_mtu() and set_ringparam() was added by
commit c2ed2403f12c ("iavf: Wait for reset in callbacks which trigger
it") to fix a race condition where adding an interface to bonding
immediately after MTU or ring parameter change failed because the
interface was still in __RESETTING state. The same commit also added
waiting in iavf_set_priv_flags(), which was later removed by commit 53844673d555 ("iavf: kill "legacy-rx" for good").
Waiting in set_channels() was introduced earlier by commit 4e5e6b5d9d13
("iavf: Fix return of set the new channel count") to ensure the PF has
enough time to complete the VF reset when changing channel count, and to
return correct error codes to userspace.
Commit ef490bbb2267 ("iavf: Add net_shaper_ops support") added
net_shaper_ops to iavf, which required reset_task to use _locked NAPI
variants (napi_enable_locked, napi_disable_locked) that need the netdev
instance lock.
Later, commit 7e4d784f5810 ("net: hold netdev instance lock during
rtnetlink operations") and commit 2bcf4772e45a ("net: ethtool: try to
protect all callback with netdev instance lock") started holding the
netdev instance lock during ndo and ethtool callbacks for drivers with
net_shaper_ops.
Finally, commit 120f28a6f314 ("iavf: get rid of the crit lock")
replaced the driver's crit_lock with netdev_lock in reset_task, causing
incorrect behavior: the callback holds netdev_lock and waits for
reset_task, but reset_task needs the same lock:
Thread 1 (callback) Thread 2 (reset_task)
------------------- ---------------------
netdev_lock() [blocked on workqueue]
ndo_change_mtu() or ethtool op
iavf_schedule_reset()
iavf_wait_for_reset() iavf_reset_task()
waiting... netdev_lock() <- blocked
This does not strictly deadlock because iavf_wait_for_reset() uses
wait_event_interruptible_timeout() with a 5-second timeout. The wait
eventually times out, the callback returns an error to userspace, and
after the lock is released reset_task completes the reset. This leads to
incorrect behavior: userspace sees an error even though the configuration
change silently takes effect after the timeout.
Fix this by extracting the reset logic from iavf_reset_task() into a new
iavf_reset_step() function that expects netdev_lock to be already held.
The three callbacks now call iavf_reset_step() directly instead of
scheduling the work and waiting, performing the reset synchronously in
the caller's context which already holds netdev_lock. This eliminates
both the incorrect error reporting and the need for
iavf_wait_for_reset(), which is removed along with the now-unused
reset_waitqueue.
The workqueue-based iavf_reset_task() becomes a thin wrapper that
acquires netdev_lock and calls iavf_reset_step(), preserving its use
for PF-initiated resets.
The callbacks may block for several seconds while iavf_reset_step()
polls hardware registers, but this is acceptable since netdev_lock is a
per-device mutex and only serializes operations on the same interface.
v3:
- Remove netif_running() guard from iavf_set_channels(). Unlike
set_ringparam where descriptor counts are picked up by iavf_open()
directly, num_req_queues is only consumed during
iavf_reinit_interrupt_scheme() in the reset path. Skipping the reset
on a down device would silently discard the channel count change.
- Remove dead reset_waitqueue code (struct field, init, and all
wake_up calls) since iavf_wait_for_reset() was the only consumer.
Fixes: 120f28a6f314 ("iavf: get rid of the crit lock") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Petr Oros [Thu, 29 Jan 2026 09:57:23 +0000 (10:57 +0100)]
iavf: fix PTP use-after-free during reset
Commit 7c01dbfc8a1c5f ("iavf: periodically cache PHC time") introduced a
worker to cache PHC time, but failed to stop it during reset or disable.
This creates a race condition where `iavf_reset_task()` or
`iavf_disable_vf()` free adapter resources (AQ) while the worker is still
running. If the worker triggers `iavf_queue_ptp_cmd()` during teardown, it
accesses freed memory/locks, leading to a crash.
Fix this by calling `iavf_ptp_release()` before tearing down the adapter.
This ensures `ptp_clock_unregister()` synchronously cancels the worker and
cleans up the chardev before the backing resources are destroyed.
Fixes: 7c01dbfc8a1c5f ("iavf: periodically cache PHC time") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
drivers: net: ice: fix devlink parameters get without irdma
If CONFIG_IRDMA isn't enabled but there are ice NICs in the system, the
driver will prevent full devlink dev param show dump because its rdma get
callbacks return ENODEV and stop the dump. For example:
$ devlink dev param show
pci/0000:82:00.0:
name msix_vec_per_pf_max type generic
values:
cmode driverinit value 2
name msix_vec_per_pf_min type generic
values:
cmode driverinit value 2
kernel answers: No such device
Returning EOPNOTSUPP allows the dump to continue so we can see all devices'
devlink parameters.
Fixes: c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Thorsten Blum [Tue, 3 Mar 2026 21:31:01 +0000 (22:31 +0100)]
nvme: Annotate struct nvme_dhchap_key with __counted_by
Add the __counted_by() compiler attribute to the flexible array member
'key' to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Keith Busch <kbusch@kernel.org>
nvme-core: do not pass empty queue_limits to blk_mq_alloc_queue()
In nvme_alloc_admin_tag_set(), an empty queue_limits struct is
currently allocated on the stack and passed by reference to
blk_mq_alloc_queue().
This is redundant because blk_mq_alloc_queue() already handles
a NULL limits pointer by internally substituting it with a default
empty queue_limits struct.
Remove the unnecessary local variable and pass a NULL value.
Sungwoo Kim [Sat, 7 Mar 2026 19:46:36 +0000 (14:46 -0500)]
nvme-pci: Fix race bug in nvme_poll_irqdisable()
In the following scenario, pdev can be disabled between (1) and (3) by
(2). This sets pdev->msix_enabled = 0. Then, pci_irq_vector() will
return MSI-X IRQ(>15) for (1) whereas return INTx IRQ(<=15) for (2).
This causes IRQ warning because it tries to enable INTx IRQ that has
never been disabled before.
To fix this, save IRQ number into a local variable and ensure
disable_irq() and enable_irq() operate on the same IRQ number. Even if
pci_free_irq_vectors() frees the IRQ concurrently, disable_irq() and
enable_irq() on a stale IRQ number is still valid and safe, and the
depth accounting reamins balanced.
For target nvmet_ctrl_free() flushes ctrl->async_event_work.
If nvmet_ctrl_free() runs on nvmet-wq, the flush re-enters workqueue
completion for the same worker:-
A. Async event work queued on nvmet-wq (prior to disconnect):
nvmet_execute_async_event()
queue_work(nvmet_wq, &ctrl->async_event_work)
============================================
WARNING: possible recursive locking detected
6.19.0-rc3nvme+ #14 Tainted: G N
--------------------------------------------
kworker/u192:42/44933 is trying to acquire lock: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
but task is already holding lock: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
3 locks held by kworker/u192:42/44933:
#0: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
#1: ffffc9000e6cbe28 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1c5/0x660
#2: ffffffff82d4db60 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x62/0x530
Sungwoo Kim [Sun, 8 Mar 2026 18:20:59 +0000 (14:20 -0400)]
nvme-pci: Fix slab-out-of-bounds in nvme_dbbuf_set
dev->online_queues is a count incremented in nvme_init_queue. Thus,
valid indices are 0 through dev->online_queues − 1.
This patch fixes the loop condition to ensure the index stays within the
valid range. Index 0 is excluded because it is the admin queue.
KASAN splat:
==================================================================
BUG: KASAN: slab-out-of-bounds in nvme_dbbuf_free drivers/nvme/host/pci.c:377 [inline]
BUG: KASAN: slab-out-of-bounds in nvme_dbbuf_set+0x39c/0x400 drivers/nvme/host/pci.c:404
Read of size 2 at addr ffff88800592a574 by task kworker/u8:5/74
The buggy address belongs to the object at ffff88800592a000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 244 bytes to the right of
allocated 1152-byte region [ffff88800592a000, ffff88800592a480)
Memory state around the buggy address: ffff88800592a400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88800592a480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88800592a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^ ffff88800592a580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800592a600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Fixes: 0f0d2c876c96 (nvme: free sq/cq dbbuf pointers when dbbuf set fails) Acked-by: Chao Shi <cshi008@fiu.edu> Acked-by: Weidong Zhu <weizhu@fiu.edu> Acked-by: Dave Tian <daveti@purdue.edu> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Keith Busch <kbusch@kernel.org>
Darrick J. Wong [Thu, 5 Mar 2026 04:26:20 +0000 (20:26 -0800)]
xfs: fix undersized l_iclog_roundoff values
If the superblock doesn't list a log stripe unit, we set the incore log
roundoff value to 512. This leads to corrupt logs and unmountable
filesystems in generic/617 on a disk with 4k physical sectors...
XFS (sda1): Mounting V5 Filesystem ff3121ca-26e6-4b77-b742-aaff9a449e1c
XFS (sda1): Torn write (CRC failure) detected at log block 0x318e. Truncating head block from 0x3197.
XFS (sda1): failed to locate log tail
XFS (sda1): log mount/recovery failed: error -74
XFS (sda1): log mount failed
XFS (sda1): Mounting V5 Filesystem ff3121ca-26e6-4b77-b742-aaff9a449e1c
XFS (sda1): Ending clean mount
...on the current xfsprogs for-next which has a broken mkfs. xfs_info
shows this...
...observe that the log section has sectsz=4096 sunit=0, which means
that the roundoff factor is 512, not 4096 as you'd expect. We should
fix mkfs not to generate broken filesystems, but anyone can fuzz the
ondisk superblock so we should be more cautious. I think the inadequate
logic predates commit a6a65fef5ef8d0, but that's clearly going to
require a different backport.
Cc: stable@vger.kernel.org # v5.14 Fixes: a6a65fef5ef8d0 ("xfs: log stripe roundoff is a property of the log") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Ian Rogers [Fri, 6 Mar 2026 18:53:06 +0000 (10:53 -0800)]
perf annotate loongarch: Fix off-by-one bug in outside check
A copy-paste of a similar issue fixed by Peter Collingbourne in:
https://lore.kernel.org/linux-perf-users/20260304190613.2507582-1-pcc@google.com/
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sched: idle: Make skipping governor callbacks more consistent
If the cpuidle governor .select() callback is skipped because there
is only one idle state in the cpuidle driver, the .reflect() callback
should be skipped as well, at least for consistency (if not for
correctness), so do it.
Fixes: e5c9ffc6ae1b ("cpuidle: Skip governor when only one idle state is available") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://patch.msgid.link/12857700.O9o76ZdvQC@rafael.j.wysocki
Stefan Haberland [Tue, 10 Mar 2026 14:23:30 +0000 (15:23 +0100)]
s390/dasd: Copy detected format information to secondary device
During online processing for a DASD device an IO operation is started to
determine the format of the device. CDL format contains specifically
sized blocks at the beginning of the disk.
For a PPRC secondary device no real IO operation is possible therefore
this IO request can not be started and this step is skipped for online
processing of secondary devices. This is generally fine since the
secondary is a copy of the primary device.
In case of an additional partition detection that is run after a swap
operation the format information is needed to properly drive partition
detection IO.
Currently the information is not passed leading to IO errors during
partition detection and a wrongly detected partition table which in turn
might lead to data corruption on the disk with the wrong partition table.
Fix by passing the format information from primary to secondary device.
Fixes: 413862caad6f ("s390/dasd: add copy pair swap capability") Cc: stable@vger.kernel.org #6.1 Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Acked-by: Eduard Shishkin <edward6@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Link: https://patch.msgid.link/20260310142330.4080106-3-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>