If cs35l56_component_probe() fails, call cs35l56_component_remove() to
clean up.
All the cleanup in cs35l56_component_remove() is the same cleanup that
would need to be done (at least partially) if cs35l56_component_probe()
fails. So calling cs35l56_component_remove() avoids convoluted cleanup
gotos and duplicated code in cs35l56_component_probe().
The only action in cs35l56_component_remove() that is nominally
dependent on having completed the component_probe() action is the call
to wm_adsp2_component_remove(). Though it is currently safe to call that
even if wm_adsp2_component_probe() was not called. However,
wm_adsp2_component_probe() has been trivially updated to check itself
whether it needs to cleanup.
Invalidate the debugfs pointer after debugfs_remove_recursive() in
cs35l56_remove_cal_debugfs(). This prevents a double-free situation when
a future commit adds proper failure cleanup in cs35l56_component_probe().
As described by Sashiko (including the future cs35l56_component_probe()
cleanup commit):
During a normal component unbind, cs35l56_component_remove() calls
cs35l56_remove_cal_debugfs() which removes the directory but leaves
a dangling pointer.
If the component is later bound again, but _cs35l56_component_probe()
fails early (for example, if the init_completion times out), this new
error path will call cs35l56_component_remove(). This causes
cs35l56_remove_cal_debugfs() to be called again with the dangling
cs35l56_base->debugfs pointer from the previous lifecycle, resulting in
a use-after-free in debugfs_remove_recursive().
ASoC: cs35l56: Fix missing calls to wm_adsp2_remove()
Call wm_adsp2_remove() in cs35l56_remove() and the error path of
cs35l56_common_probe().
Depends on commit 7d3fb78b5503 ("ASoC: wm_adsp: Fix NULL dereference
when removing firmware controls").
The call to wm_halo_init() during driver probe should be paired with
a call to wm_adsp2_remove() but this was missing. The consequence
would be a memory leak of the control lists in the cs_dsp driver.
xfs_growfs_compute_deltas has an odd calling conventions, and looks
very convoluted due to the use of do_div and strangely named and typed
variables.
Rename it, make it return the agcount and let the caller calculate the
delta. The internally use the better div_u64_rem helper and descriptive
variable names and types. Also add a comment describing what the
function is used for.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs: pass back updated nb from xfs_growfs_compute_deltas
xfs_growfs_compute_deltas can update nb for corner cases like a number
of blocks that would create a less the minimal sized AG, or running
past the max AG limit. Pass back the calculated value to the caller,
as it relies on to calculate the new number of perag structures.
Note that the grown file system size is not affected by this
miscalculation as it uses the passed back delta value.
Fixes: a49b7ff63f98 ("xfs: Refactoring the nagcount and delta calculation") Cc: stable@vger.kernel.org # v7.0 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Darrick J. Wong [Wed, 10 Jun 2026 04:57:24 +0000 (21:57 -0700)]
xfs: fix pointer arithmetic error on 32-bit systems
The translation of the old XFS_BMBT_KEY_ADDR macro into a static
function is not correct on 32-bit systems because the sizeof() argument
went from being a xfs_bmbt_key_t (i.e. a struct) to a (struct
xfs_bmbt_key *) (i.e. a pointer to the same struct). On 64-bit systems
this turns out ok because they are the same size, but on 32-bit systems
this is catastrophic because they are not the same size. So far there
have been no complaints, most likely because the xfs developers urge
against running it on 32-bit systems. But this needs fixing asap.
Cc: stable@vger.kernel.org # v6.12 Fixes: 79124b37400635 ("xfs: replace shouty XFS_BM{BT,DR} macros") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs: initialize iomap->flags earlier in xfs_bmbt_to_iomap
Otherwise we lose the IOMAP_IOEND_BOUNDARY assingment for writes to the
first block in a realtime group, and could cause incorrect merges for
such writes.
Fixes: b91afef72471 ("xfs: don't merge ioends across RTGs") Cc: <stable@vger.kernel.org> # v6.13 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs: only log freed extents for the current RTG in zoned growfs
Otherwise a power fail or crash during growfs could lead to an
elevated sb_rblocks counter.
Note that the step function is much simpler compared to the classic RT
allocator as zoned RT sections must be aligned to real time group
boundaries.
Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems") Cc: <stable@vger.kernel.org> # v6.15 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Melissa Wen [Tue, 9 Jun 2026 10:20:21 +0000 (12:20 +0200)]
drm/amd/display: use plane color_mgmt_changed to track colorop changes
Ensure the driver tracks changes in any colorop property of a plane
color pipeline by using the same mechanism of CRTC color management and
update plane color blocks when any colorop property changes. It fixes an
issue observed on gamescope settings for night mode which is done via
shaper/3D-LUT updates.
Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patch.msgid.link/20260609110420.1298352-5-mwen@igalia.com
Melissa Wen [Tue, 9 Jun 2026 10:20:20 +0000 (12:20 +0200)]
drm/atomic: track individual colorop updates
As we do for CRTC color mgmt properties, use color_mgmt_changed flag to
track any value changes in the color pipeline of a given plane, so that
drivers can update color blocks as soon as plane color pipeline or
individual colorop values change. Since we're here, only announce and
track changes to plane COLOR_PIPELINE prop if its value is actually
changing.
Fixes: 8c5ea1745f4c ("drm/colorop: Add BYPASS property") Fixes: 7fa3ee8c0a79 ("drm/colorop: Define LUT_1D interpolation") Fixes: 41651f9d42eb ("drm/colorop: Add 1D Curve subtype") Fixes: 3410108037d5 ("drm/colorop: Add multiplier type") Fixes: db971856bbe0 ("drm/colorop: Add 3D LUT support to color pipeline") Fixes: e5719e7f1900 ("drm/colorop: Add 3x4 CTM type") Fixes: 99a4e4f08abe ("drm/colorop: Add 1D Curve Custom LUT type") Fixes: 2afc3184f3b3 ("drm/plane: Add COLOR PIPELINE property") Reviewed-by: Harry Wentland <harry.wentland@amd.com> #v1 Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block") Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patch.msgid.link/20260609110420.1298352-4-mwen@igalia.com
Melissa Wen [Tue, 9 Jun 2026 10:20:19 +0000 (12:20 +0200)]
drm/colorop: make lut(1/3)d_interpolation props correctly behave as mutable
As interpolation props are actually mutable props, any changes should be
handled by drm_colorop_state. Move their enum and make it correctly
behaves as mutable.
Fixes: 7fa3ee8c0a79 ("drm/colorop: Define LUT_1D interpolation") Fixes: db971856bbe0 ("drm/colorop: Add 3D LUT support to color pipeline") Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block") Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patch.msgid.link/20260609110420.1298352-3-mwen@igalia.com
Alex Hung [Tue, 9 Jun 2026 10:20:18 +0000 (12:20 +0200)]
drm/colorop: Remove read-only comments from interpolation fields
The lut1d_interpolation and lut3d_interpolation fields and their
associated properties were marked as read-only, but userspace
can set them via drm_atomic_colorop_set_property().
Fixes: 7fa3ee8c0a79 ("drm/colorop: Define LUT_1D interpolation") Fixes: db971856bbe0 ("drm/colorop: Add 3D LUT support to color pipeline") Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block") Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patch.msgid.link/20260609110420.1298352-2-mwen@igalia.com
AnonymeMeow [Sun, 7 Jun 2026 00:33:43 +0000 (08:33 +0800)]
fanotify: allow reporting pidfds for reaped tasks
Fanotify used to refuse to report pidfds for reaped tasks by applying a
pid_has_task() check before calling pidfd_prepare(). This prevented
userspace from obtaining information about the task.
Register the event pid with pidfs when creating the fanotify event if
pidfd reporting was requested, so pidfd_prepare() can later create a
pidfd for the reaped task.
AnonymeMeow [Sun, 7 Jun 2026 00:33:42 +0000 (08:33 +0800)]
fanotify: report thread pidfds for FAN_REPORT_TID
The FAN_REPORT_PIDFD and FAN_REPORT_TID flags used to be mutually
exclusive because by the time the pidfd support was introduced to
fanotify, pidfds could only be created for thread group leaders. Now
that the pidfd API supports thread-specific pidfds via PIDFD_THREAD,
this restriction can be lifted.
The do_bad_IRQ() macro simply calls handle_bad_irq() with a lock around
it. It also carries a comment stating that uses of it should be
replaced. According to commit aec0095653cd ("irqchip: gic: Call
handle_bad_irq() directly"), which replaced another use of
do_bad_IRQ(), locking the IRQ descriptor is not necessary for error
reporting. Therefore, replace all uses of do_bad_IRQ() with calls to
handle_bad_irq() and remove do_bad_IRQ().
Alessio Ferri [Thu, 28 May 2026 17:31:41 +0000 (19:31 +0200)]
b43: add RF power offset for N-PHY r8 + radio 2057 r8
Add the 2.4 GHz RF power offset table for N-PHY rev 8 paired with
radio 2057 rev 8 and wire it to the existing dispatcher.
b43_ntab_get_rf_pwr_offset_table() currently dispatches on phy->rev
== 17 (radio_rev 14) and phy->rev == 16 (radio_rev 9) for 2.4 GHz.
phy->rev == 8 falls through and the function logs:
b43-phyX ERROR: No 2GHz RF power table available for this device
Add a phy->rev == 8 / radio_rev == 8 case returning the new table.
The values are sourced from the proprietary Broadcom wl driver's
nphy_papd_padgain_dlt_2g_2057rev5 array. Reusing the rev 5 values
is structurally appropriate: the IPA TX gain table added by the
preceding patch in this series shares the low 24 bits of every
entry with rev 5 - same gain step amplitudes, only the PAD-gain
selector byte differs. b43's pad_gain extraction in
b43_nphy_tx_pwr_ctl_init() reads bits 19..23 of the gain entry,
which sit in the shared low-24-bit range; the same gain index
therefore maps to the same physical PAD gain code on both
revisions and warrants the same per-index dB offset.
Note that b43_nphy_tx_gain_table_upload() currently has a "TODO:
Enable this once we have gains configured" early-return for
phy->rev >= 7. With that early-return in place, this table is
fetched (silencing the b43err that would otherwise abort PHY
init) but its values are not yet written to MMIO. Resolving the
TODO is a future, separate task.
Alessio Ferri [Thu, 28 May 2026 17:31:40 +0000 (19:31 +0200)]
b43: add channel info table for N-PHY r8 + radio 2057 r8
Add the 2.4 GHz channel info table for N-PHY rev 8 paired with
radio 2057 rev 8 and wire it to the existing dispatcher in
r2057_get_chantabent_rev7().
The dispatcher's case 8 currently handles radio_rev == 5 only.
For radio_rev == 8 both output pointers stay NULL,
b43_nphy_set_channel() returns an error and channel switch to
the default channel fails.
The new b43_nphy_chantab_phy_rev8_radio_rev8[] is 14 entries
covering the standard 2.4 GHz channel set (2412..2472 in 5 MHz
steps, plus 2484 for channel 14).
Values extracted from an MMIO dump of the proprietary Broadcom wl
driver running on BCM6362 silicon (wl driver 6.30.102.7).
Alessio Ferri [Thu, 28 May 2026 17:31:39 +0000 (19:31 +0200)]
b43: add IPA TX gain table for N-PHY r8 + radio 2057 r8
Add the 2.4 GHz IPA TX gain table for N-PHY rev 8 paired with radio
2057 rev 8 and wire it to the existing dispatcher.
b43_nphy_get_ipa_gain_table() in tables_nphy.c currently handles
case 8 only for radio_rev == 5; radio_rev == 8 falls through and
the function logs:
b43-phyX ERROR: No 2GHz IPA gain table available for this device
b43-phyX ERROR: PHY init: Channel switch to default failed
leaving b43_phy_init() to return an error and core_init to abort
before the MAC is enabled.
The high byte of every entry differs from the rev 5 sibling (0x40
vs 0x30): different PAD-gain code prefix for the rev 8 front-end.
The low 24 bits coincide with rev 5 across the whole table - the
gain step amplitudes are the same, only the PAD-gain selector
prefix changes.
Values extracted from an MMIO dump of the proprietary Broadcom wl
driver running on BCM6362 silicon (wl driver 6.30.102.7).
Alessio Ferri [Thu, 28 May 2026 17:31:38 +0000 (19:31 +0200)]
b43: support radio 2057 rev 8
Add support for radio 2057 revision 8, paired with N-PHY rev 8 on
the Broadcom BCM6362 single-die integrated 2.4 GHz wireless block.
Three correlated changes are needed for the same chip:
- main.c: the radio_rev allow-list under B43_PHYTYPE_N currently
accepts radio 2057 revisions 9 and 14 only; extend to include
rev 8.
- radio_2057.c: the existing r2057_rev8_init[] is a 54-entry stub
declared inside a TODO comment block and never referenced
from r2057_upload_inittabs().
Replace it with the full 412-entry register set actually
programmed by the proprietary Broadcom wl driver on this radio.
I couldn't find the origin of the original 54-entry stub - 8
of its entries do not appear at all in the rev 8 register set
and 7 more carry different values.
Loading it instead of using the real table leaves the radio
hanging producing a "Microcode not responding" timeout.
- radio_2057.c: r2057_upload_inittabs() case 8 handles radio_rev
5 and 7 only; add the radio_rev == 8 branch pointing at the
new table.
The init table is extracted from an MMIO dump of the radio
register set programmed during proprietary driver initialisation
on BCM6362 silicon (Broadcom wl driver 6.30.102.7).
Alessio Ferri [Thu, 28 May 2026 17:31:37 +0000 (19:31 +0200)]
b43: route d11 corerev 22 to 24-bit indirect radio access
Rev 22 backports the older 802.11 core but pairs it with a radio
in the 2057 family, which requires the 24-bit indirect path. With
the current dispatch, corerev 22 falls into the legacy 4-wire branch,
reads garbage for radio_id, and bails out with -EOPNOTSUPP at the
"FOUND UNSUPPORTED RADIO" branch below.
brcmsmac handles the same silicon family with the equivalent
dispatch in drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/
phy_cmn.c read_radio_reg() and write_radio_reg():
b43 does not support SSN/SSLPN PHYs - they are rejected earlier in
b43_phy_versioning() at the "unsupported PHY type" switch - so just
adding the check corerev == 22 will do.
Alessio Ferri [Thu, 28 May 2026 17:31:36 +0000 (19:31 +0200)]
b43: add d11 core revision 0x16 to id table
Add d11 core revision 0x16 (= 22) to the b43 bcma device id table.
The b43 bcma id table covers d11 revisions 0x11, 0x15, 0x17, 0x18,
0x1C, 0x1D, 0x1E, 0x28 and 0x2A. Revision 0x16 belongs to the same
N-PHY family as revisions 0x17 and 0x18 (radio 2057) and needs no
new PHY or radio code beyond the radio_rev 8 dispatcher entries
added later in this series - only the device id entry is missing.
Without it bcma scan enumerates the 802.11 core but no driver binds.
The revision is used by the Broadcom BCM6362 single-die integrated
2.4 GHz wireless block found in xDSL SoCs.
Joonas Lahtinen [Wed, 10 Jun 2026 06:03:14 +0000 (09:03 +0300)]
drm/i915/gem: Fix phys BO pread/pwrite with offset
sg_page() returns struct page pointer not (void *) so the scaling
of pread/pwrite is wrong for phys BO and wrong parts of BO would be
accessed if non-zero offset is used.
Last impacted platform with overlay or cursor planes using phys
mapping was Gen3/945G/Lakeport.
Daniel Drake [Mon, 8 Jun 2026 21:01:08 +0000 (22:01 +0100)]
gpiolib: handle gpio-hogs only once
Commit d1d564ec49929 ("gpio: move hogs into GPIO core") introduced a
behaviour change that breaks boot on Raspberry Pi 5 when using the
firmware-supplied device tree:
gpiochip_add_data_with_key: GPIOs 544..575
(/soc@107c000000/gpio@7d517c00) failed to register, -22
brcmstb-gpio 107d517c00.gpio: Could not add gpiochip for bank 1
brcmstb-gpio 107d517c00.gpio: probe with driver brcmstb-gpio failed
with error -22
gpio-brcmstb registers two gpio_chips against the device tree
node gpio@7d517c00, one for each bank. The firmware-supplied DT includes
a gpio-hog on RP1 RUN, and this gpio-hog is attempted to be applied to
*both* gpio_chips. This succeeds against bank 0 (which hosts the GPIO)
and fails for bank 1 (which does not).
In the previous implementation, failures to apply gpio-hogs were
quietly ignored. In the new code, the error code propagates and causes
probe to fail.
Closely approximate the previous behaviour by using the OF_POPULATED flag
to ensure that each gpio-hog is processed only once. The flag was
previously being set before the gpio-hogs were processed, so as part
of this change, the flag now gets set only after the gpio-hog is actioned.
The handling of gpio-hogs on a DT node with multiple gpio_chips remains a
bit incomplete/unclear, but this at least retains the ability to apply
hogs to the first gpio_chip per node.
The kerneldoc for backing_file_open() documented a @user_path argument,
but the function takes const struct file *user_file. The user
path is derived as &user_file->f_path.
Update the @-tag to @user_file and adjust the description accordingly.
Also fix the "reuqested" typo to 'requested' in the old comment.
Takashi Iwai [Tue, 9 Jun 2026 11:50:53 +0000 (13:50 +0200)]
ALSA: timer: Manage timer object with kref
So far we've tried to address UAFs in ALSA timer code by applying the
locks at various places, but the fundamental problem is that the timer
object may be released while the belonging timer instance objects are
still present and accessing to it. This patch is a more proper fix to
address that issue, namely, by refcounting and keeping the timer
object.
The basic implementation is to use kref for the refcount of the timer
object, and take/release the reference at assigning/releasing the
instance, as well as at referring from ioctls or ALSA sequencer code.
The reference from ioctl or ALSA sequencer is abstracted with
snd_timeri_timer auto-cleanup.
Note that this change assumes that the code already took the fix
commit da3039e91d1f ("ALSA: timer: Forcibly close timer instances at
closing"); otherwise the refcount may be unbalanced when the timer is
freed while slave instances are still present.
If gpiochip_hog_lines() successfully processes some hogs but fails on
a later one, the error handling path in gpiochip_add_data_with_key()
jumps directly to err_remove_of_chip. This leaks resources allocated
earlier for ACPI, interrupts and hogs that were successfully processed.
Use the right label in error path.
Closes: https://sashiko.dev/#/patchset/20260608210108.36248-1-dan%40reactivated.net Fixes: d1d564ec4992 ("gpio: move hogs into GPIO core") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://patch.msgid.link/20260609-gpio-hogs-fixes-v1-2-b4064f8070e7@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Furst Blumier [Tue, 9 Jun 2026 20:17:06 +0000 (22:17 +0200)]
ALSA: hda/realtek: Add quirk for HP 255 15.6 inch G9 Notebook PC
The HP 255 15.6 inch G9 Notebook PC (PCI SSID 103c:8a1b) uses the
ALC236 codec but lacks an entry in the quirk table, causing the kernel
to fall back to a null SSID match (103c:0000) and skip the necessary
fixup. Add a quirk entry using ALC236_FIXUP_HP_MUTE_LED_COEFBIT2,
matching the HP 255 G8 which uses the same codec and fixup. This fixes
the mute-button LED and fixes an issue with unplugging and replugging a
headset jack not being recognized as an audio sink.
Baojun Xu [Tue, 9 Jun 2026 10:52:53 +0000 (18:52 +0800)]
ALSA: hda/tas2781: Fix device-0 reset issue and handle -EXDEV in block data processing
Fix reset for device-0: In older projects (e.g., Merino), the hardware
reset pin for the first SPI device (device-0) is ineffective, causing
initialization failures. Added a software reset sequence for device-0
to ensure proper initialization.
Handle -EXDEV correctly: When processing block data, if the data does
not belong to the current SPI device, the driver returned -EXDEV.
This error code is now ignored to allow the driver to continue iterating
through the block data and correctly calculate the total block size.
Moritz Baron [Tue, 9 Jun 2026 14:16:48 +0000 (16:16 +0200)]
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IRH8
The Lenovo Yoga Pro 7 14IRH8 (ALC287 codec, subsystem ID 0x17aa:0x38b1)
has bass speakers on pin 0x17 that are not routed through a DAC with
volume control. This causes the bass speakers to play at full volume
regardless of the volume slider position.
Apply ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN which corrects the DAC
routing for pin 0x17, enabling proper volume control. This is the same
fix used for other Yoga Pro 7 models with identical audio topology
(14APH8, 14AHP9, 14ASP10, 14IAH10).
Gary Guo [Tue, 9 Jun 2026 14:26:33 +0000 (15:26 +0100)]
rust: make `build_assert` module the home of related macros
Given the macro scoping rules, all macros are rendered twice, in the
module and in the top-level of kernel crate.
Add `#[doc(hidden)]` to the macro definition and `#[doc(inline)]` to the
re-export inside `build_assert` module so the top-level items are hidden.
[ Sadly, because the definition is hidden, `rustdoc` decides to not list
them as re-exports in the `prelude` page anymore, even if we refer to
the not-actually-hidden item.
- Miguel ]
Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Acked-by: Boqun Feng <boqun@kernel.org> Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260609142637.373347-1-gary@kernel.org
[ Kept a single declaration in the prelude, and reworded since they
already had `no_inline`. Removed other imports from `predefine` since
we now use the prelude. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Miguel Ojeda [Tue, 9 Jun 2026 10:41:52 +0000 (12:41 +0200)]
rust: str: clean unused import for Rust >= 1.98
Starting with Rust 1.98.0 (expected 2026-08-20), the compiler has changed
how the resolution algorithm works [1] in upstream commit c4d84db5f184
("Resolver: Batched import resolution."), and it now spots:
xfs: add newly added RTGs to the free pool in growfs
When growing a zoned RT section, the newly added RTGs also need to be
tagged as free in the radix tree and add to the nr_free_zones counters.
Call xfs_add_free_zone to do that, otherwise using up the newly added
space will wait for free zones forever.
Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems") Cc: stable@vger.kernel.org # v6.15 Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Add a helper for adding a zone to the free pool in preparation of adding
another caller.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Lizhi Hou [Thu, 4 Jun 2026 20:28:15 +0000 (13:28 -0700)]
accel/amdxdna: Clear sva pointer after unbind
Add client->sva = NULL after the unbind makes it consistent with how
amdxdna_sva_fini() already clears the pointer after unbinding. The
IS_ERR_OR_NULL guard in sva_fini will then correctly skip the second
unbind.
When compile testing the UAPI headers with clang, there is an warning turned
error for using a C++ style ('//') comment, which is explicitly forbidden for
UAPI headers.
In file included from <built-in>:1:
./usr/include/linux/virtio_can.h:29:35: error: // comments are not allowed in this language [-Werror,-Wcomment]
29 | #define VIRTIO_CAN_MAX_DLEN 64 // this is like CANFD_MAX_DLEN
| ^
1 error generated.
Switch to a standard C style comment.
Fixes: 2b6b4bb7d96f ("can: virtio: Add virtio CAN driver") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260604-virtio_can-fix-uapi-comment-v1-1-199fa96ec5f0@kernel.org>
Add virtio CAN driver based on Virtio 1.4 specification (see
https://github.com/oasis-tcs/virtio-spec/tree/virtio-1.4). The driver
implements a complete CAN bus interface over Virtio transport,
supporting both CAN Classic and CAN-FD Ids. In term of frames, it
supports classic and CAN FD. RTR frames are only supported with classic
CAN.
Usage:
- "ip link set up can0" - start controller
- "ip link set down can0" - stop controller
- "candump can0" - receive frames
- "cansend can0 123#DEADBEEF" - send frames
Signed-off-by: Harald Mommer <harald.mommer@oss.qualcomm.com> Co-developed-by: Harald Mommer <harald.mommer@oss.qualcomm.com> Signed-off-by: Mikhail Golubev-Ciuchea <mikhail.golubev-ciuchea@oss.qualcomm.com> Co-developed-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Damir Shaikhutdinov <Damir.Shaikhutdinov@opensynergy.com> Reviewed-by: Francesco Valla <francesco@valla.it> Tested-by: Francesco Valla <francesco@valla.it> Signed-off-by: Matias Ezequiel Vara Larsen <mvaralar@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <ahXNb+KzuHYbS24+@fedora>
Yui Washizu [Tue, 10 Mar 2026 06:14:52 +0000 (15:14 +0900)]
virtio: add num_vf callback to virtio_bus
Recent QEMU versions added support for virtio SR-IOV emulation,
allowing virtio devices to expose SR-IOV VFs to the guest.
However, virtio_bus does not implement the num_vf callback of bus_type,
causing dev_num_vf() to return 0 for virtio devices even when
SR-IOV VFs are active.
net/core/rtnetlink.c calls dev_num_vf(dev->dev.parent) to populate
IFLA_NUM_VF in RTM_GETLINK responses. For a virtio-net device,
dev.parent points to the virtio_device, whose busis virtio_bus.
Without num_vf, SR-IOV VF information is silently
omitted from tools that rely on rtnetlink, such as 'ip link show'.
Add a num_vf callback that delegates to dev_num_vf(dev->parent),
which in turn reaches the underlying transport (pci_bus_type for
virtio-pci) where the actual VF count is tracked. Non-PCI transports
are unaffected as dev_num_vf() returns 0 when no num_vf callback is
present.
Signed-off-by: Yui Washizu <yui.washidu@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260310061454.683894-1-yui.washidu@gmail.com>
Srujana Challa [Tue, 24 Feb 2026 09:52:26 +0000 (15:22 +0530)]
vdpa/octeon_ep: fix IRQ-to-ring mapping in interrupt handler
Look up the IRQ index in oct_hw->irqs instead of assuming
irq - irqs[0]. This supports non-contiguous IRQ numbers and
avoids incorrect ring indexing when irqs[0] is not the base.
Fixes: 26f8ce06af64 ("vdpa/octeon_ep: enable support for multiple interrupts per device") Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260224095226.1001151-5-schalla@marvell.com>
Srujana Challa [Tue, 24 Feb 2026 09:52:23 +0000 (15:22 +0530)]
vdpa/octeon_ep: Fix PF->VF mailbox data address calculation
The mailbox address was computed assuming 1 ring per VF. Read the
actual rings-per-VF from OCTEP_EPF_RINFO and use it when calculating
OCTEP_PF_MBOX_DATA offsets, fixing VF initialization when rings
per VF > 1.
Fixes: 8b6c724cdab8 ("virtio: vdpa: vDPA driver for Marvell OCTEON DPU devices") Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260224095226.1001151-2-schalla@marvell.com>
The only reason for this janitorial change is that this initialization
adds unnecessary noise to "git grep exit_signal".
args.exit_signal has no effect with CLONE_THREAD, not to mention it is
zero-initialized by the compiler anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <acAIm732QPFZs15C@redhat.com>
The vhost driver has unnecessary empty module_init and
module_exit functions. Remove them. Note that if a module_init function
exists, a module_exit function must also exist; otherwise, the module
cannot be unloaded.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260131020010.45647-1-enelsonmoore@gmail.com>
Rosen Penev [Fri, 8 May 2026 05:18:37 +0000 (22:18 -0700)]
vdpa/mlx5: Use kvzalloc_flex() for MTT command memory
The create mkey command memory embeds the MTT array as a flexible array
member. Use kvzalloc_flex() to allocate it directly instead of open-coding
the struct_size() calculation with kvcalloc().
The MTT allocation still needs to be aligned to MLX5_VDPA_MTT_ALIGN bytes.
Since each MTT entry is __be64, align the entry count directly and avoid
carrying a separate byte length variable.
Assisted-by: Codex:GPT-5.5 Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260508051837.1744409-1-rosenp@gmail.com>
Johan Hovold [Fri, 24 Apr 2026 10:47:03 +0000 (12:47 +0200)]
vdpa_sim_net: switch to dynamic root device
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly when no release function has been provided.
Use root_device_register() to allocate and register the root device
instead of open coding using a static device.
Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260424104703.2619093-3-johan@kernel.org>
Johan Hovold [Fri, 24 Apr 2026 10:47:02 +0000 (12:47 +0200)]
vdpa_sim_blk: switch to dynamic root device
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly when no release function has been provided.
Use root_device_register() to allocate and register the root device
instead of open coding using a static device.
Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260424104703.2619093-2-johan@kernel.org>
virtio-mem: Destroy mutex before freeing virtio_mem
Add a call to mutex_destroy in the error code path as well as in the
virtio_mem_remove code path.
Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20251123175750.445461-3-mhi@mailbox.org>
virtio-balloon: Destroy mutex before freeing virtio_balloon
Add a call to mutex_destroy in the error code path as well as in the
virtballoon_remove code path.
Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20251123175750.445461-2-mhi@mailbox.org>
tools/virtio: fix build for kmalloc_obj API and missing stubs
Add stubs for kmalloc_obj() and kmalloc_objs() to the tools/virtio
test harness, matching the new kernel allocator API. Also add the
DMA_ATTR_CPU_CACHE_CLEAN definition and include kernel.h from err.h
for the unlikely() macro.
Alexander Graf [Sat, 31 Jan 2026 10:28:09 +0000 (11:28 +0100)]
virtio_ring: Add READ_ONCE annotations for device-writable fields
KCSAN reports data races when accessing virtio ring fields that are
concurrently written by the device (host). These are legitimate
concurrent accesses where the CPU reads fields that the device updates
via DMA-like mechanisms.
Add accessor functions that use READ_ONCE() to properly annotate these
device-writable fields and prevent compiler optimizations that could in
theory break the code. This also serves as documentation showing which
fields are shared with the device.
This patch was partially written using the help of Kiro, an
AI coding assistant, to automate the mechanical work of generating the
inline function definition.
Signed-off-by: Alexander Graf <graf@amazon.com>
[jth: Add READ_ONCE in virtqueue_kick_prepare_split ] Co-developed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260131102810.1254845-1-johannes.thumshirn@wdc.com>
Arnd Bergmann [Fri, 13 Feb 2026 15:40:46 +0000 (16:40 +0100)]
vduse: fix compat handling for VDUSE_IOTLB_GET_FD/VDUSE_VQ_GET_INFO
These two ioctls are incompatible on 32-bit x86 userspace, because
the data structures are shorter than they are on 64-bit.
Add a proper .compat_ioctl handler for x86 that reads the structures
with the smaller padding before calling the internal handlers. On
all other architectures, CONFIG_COMPAT_FOR_U64_ALIGNMENT is disabled
and no special handling is required.
Fixes: ad146355bfad ("vduse: Support querying information of IOVA regions") Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260213154051.4172275-1-arnd@kernel.org>
longlong yan [Fri, 5 Jun 2026 02:14:45 +0000 (10:14 +0800)]
tools/virtio: check mmap return value in vringh_test
In parallel_test(), the return values of mmap() for both host_map and
guest_map are not checked against MAP_FAILED. If mmap() fails, the
subsequent code will dereference the invalid pointer, leading to a
segmentation fault.
Add MAP_FAILED checks after both mmap() calls, using err() to report
the error and exit, consistent with the existing error handling style
in this file (e.g., the open() call on line 149).
Fixes: 1515c5ce26ae ("tools/virtio: add vring_test.") Signed-off-by: longlong yan <yanlonglong@kylinos.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260605021446.1611-1-yanlonglong@kylinos.cn>
Qing Ming [Mon, 1 Jun 2026 10:43:00 +0000 (18:43 +0800)]
vhost/net: complete zerocopy ubufs only once
vhost-net initializes one ubuf_info per outstanding zerocopy TX
descriptor and hands it to the backend socket. The networking stack may
then clone a zerocopy skb before all skb references are released. For
example, batman-adv fragmentation reaches skb_split(), which calls
skb_zerocopy_clone() and increments the same ubuf_info refcount.
vhost_zerocopy_complete() currently treats every ubuf callback as a
completed vhost descriptor. It dereferences ubuf->ctx, writes the
descriptor completion state, and drops the vhost_net_ubuf_ref even when
the callback only releases a cloned skb reference. A backend reset can
therefore wait for and free the vhost_net_ubuf_ref while another cloned
skb still carries the same ubuf_info. A later completion then
dereferences the freed ubufs pointer.
KASAN reports the stale completion as:
BUG: KASAN: slab-use-after-free in vhost_zerocopy_complete+0x1d7/0x1f0
BUG: KASAN: slab-use-after-free in vhost_zerocopy_complete+0x101/0x1f0
vhost_zerocopy_complete
skb_copy_ubufs
__dev_forward_skb2
veth_xmit
The freed object was allocated from vhost_net_ioctl() while setting the
backend and freed through kfree_rcu()/kvfree_rcu_bulk after backend
removal, while delayed skb completion still reached
vhost_zerocopy_complete().
Honor the generic ubuf_info refcount before touching vhost state, and run
the vhost descriptor completion only for the final ubuf reference. This
matches the msg_zerocopy_complete() ownership rule for cloned zerocopy
skbs.
Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Signed-off-by: Qing Ming <a0yami@mailbox.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260601104300.197210-1-a0yami@mailbox.org>
Jason Wang [Fri, 30 Jan 2026 05:07:50 +0000 (13:07 +0800)]
VDUSE: avoid leaking information to userspace
The bounceing is not necessarily page aligned, so current VDUSE can
leak kernel information through mapping bounce pages to
userspace. Allocate bounce pages with __GFP_ZERO to avoid leaking
information to userspace.
Fixes: 8c773d53fb7b ("vduse: Implement an MMU-based software IOTLB") Cc: stable@vger.kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260130050750.4050-1-jasowang@redhat.com>
Zhang Tianci [Thu, 26 Feb 2026 11:55:49 +0000 (19:55 +0800)]
vduse: Requeue failed read to send_list head
When copy_to_iter() fails in vduse_dev_read_iter(), put the message back
at the head of send_list to preserve FIFO ordering and retry the oldest
pending request first.
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reported-by: Michael S. Tsirkin <mst@redhat.com> Suggested-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260226115550.1814-2-zhangtianci.1997@bytedance.com>
Cindy Lu [Mon, 26 Jan 2026 09:45:38 +0000 (17:45 +0800)]
vdpa/mlx5: update MAC address handling in mlx5_vdpa_set_attr()
Improve MAC address handling in mlx5_vdpa_set_attr() to ensure that
old MAC entries are properly removed from the MPFS table before
adding a new one. The new MAC address is then added to both the MPFS
and VLAN tables.
This change fixes an issue where the updated MAC address would not
take effect until QEMU was rebooted.
Signed-off-by: Cindy Lu <lulu@redhat.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260126094848.9601-4-lulu@redhat.com>
Cindy Lu [Mon, 26 Jan 2026 09:45:36 +0000 (17:45 +0800)]
vdpa/mlx5: update mlx_features with driver state check
Add logic in mlx5_vdpa_set_attr() to ensure the VIRTIO_NET_F_MAC
feature bit is properly set only when the device is not yet in
the DRIVER_OK (running) state.
This makes the MAC address visible in the output of:
vdpa dev config show -jp
when the device is created without an initial MAC address.
Signed-off-by: Cindy Lu <lulu@redhat.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260126094848.9601-2-lulu@redhat.com>
vdpa/ifcvf: handle dev_set_name() failure in ifcvf_vdpa_dev_add()
dev_set_name() may fail and return an error, but its return value
is currently ignored and overwritten by _vdpa_register_device().
Abort device creation if dev_set_name() fails and release the
device reference to avoid continuing with an improperly initialized
struct device.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Evgenii Burenchev <evg28bur@yandex.ru> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Zhu Lingshan <lingshan.zhu@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260226152924.38790-1-evg28bur@yandex.ru>
Filip Hejsek [Mon, 23 Feb 2026 17:37:02 +0000 (18:37 +0100)]
virtio_console: read size from config space during device init
Previously, the size was only read upon receiving the config interrupt.
This interrupt is sent when the size changes. However, we also need to
read the initial size.
Also make sure to only read the size from config if F_SIZE is enabled.
Fixes: 9778829cffd4 ("virtio: console: Store each console's size in the console structure") Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260223-virtio-console-fix-v1-1-0cf08303b428@gmail.com>
There is a spelling mistake in a struct description. Fix it.
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260418-virtio-typo-v1-1-0df6f943a79d@ethancedwards.com>
Jia Jia [Thu, 7 May 2026 12:08:01 +0000 (20:08 +0800)]
virtio: rtc: tear down old virtqueues before restore
virtio_device_restore() resets the device and restores the negotiated
features before calling ->restore(). viortc_freeze() intentionally
leaves the existing virtqueues in place so the alarm queue can still
wake the system, but viortc_restore() immediately calls
viortc_init_vqs() without first deleting those old queues.
If virtqueue reinitialization fails on virtio-pci, the transport error
path can run vp_del_vqs() against a newly allocated vp_dev->vqs array
while vdev->vqs still contains the old virtqueues. vp_del_vqs() then
looks up queue state through the new array and can dereference a NULL
info pointer in vp_del_vq(), crashing the guest kernel during restore.
This can also happen during a non-faulty reinitialization, when one of
the vp_find_vqs_msix() attempts is unsuccessful before a later attempt
would succeed.
Delete the stale virtqueues before rebuilding them. If restore fails
before virtio_device_ready(), reuse the remove path to stop the device.
Once the device is ready, return errors directly instead of deleting the
virtqueues again.
Fixes: 0623c7592768 ("virtio_rtc: Add module and driver core") Signed-off-by: Jia Jia <physicalmtea@gmail.com> Reviewed-by: Peter Hilber <peter.hilber@oss.qualcomm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260507120801.3677552-1-physicalmtea@gmail.com>
Johan Hovold [Mon, 27 Apr 2026 14:37:10 +0000 (16:37 +0200)]
virtio-mmio: fix device release warning on module unload
Driver core expects devices to be allocated dynamically and complains
loudly when a device that lacks a release function is freed.
Use __root_device_register() to allocate and register the root device
instead of open coding using a static device.
Note that root_device_register(), which also creates a link to the
module, cannot be used as the device is registered when parsing the
module parameters which happens before the module kobject has been set
up.
Fixes: 81a054ce0b46 ("virtio-mmio: Devices parameter parsing") Cc: stable@vger.kernel.org # 3.5 Cc: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260427143710.14702-1-johan@kernel.org>
Qihang Tang [Fri, 8 May 2026 07:58:21 +0000 (15:58 +0800)]
vhost/vdpa: validate virtqueue index in mmap and fault paths
vhost_vdpa_mmap() and vhost_vdpa_fault() use vma->vm_pgoff as a
virtqueue index for get_vq_notification(), but they do not validate
that the index is smaller than v->nvqs.
The ioctl path already performs both a bounds check and
array_index_nospec(), but the mmap/fault path only checks that the
index fits in u16. This allows an out-of-range queue index to reach
driver-specific get_vq_notification() callbacks.
Fix this by extracting a unified vhost_vdpa_get_vq_notification()
helper that validates the queue index against v->nvqs and applies
array_index_nospec() before calling the driver callback. Both the
mmap and fault paths use this helper, and the bounds checking is
consolidated into a single location.
From source inspection, the most defensible impact is out-of-bounds
access in the callback path, potentially leading to invalid PFN
remaps and crash/DoS.
Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap") Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Qihang Tang <q.h.hack.winter@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260508075821.92656-1-q.h.hack.winter@gmail.com>
Qihang Tang [Fri, 8 May 2026 09:46:59 +0000 (17:46 +0800)]
vduse: hold vduse_lock across IDR lookup in open path
vduse_dev_open() looks up struct vduse_dev through the IDR and then
acquires dev->lock only after vduse_lock has been dropped.
This leaves a window where a concurrent VDUSE_DESTROY_DEV can remove the
same object from the IDR and free it before the open path locks the
device, leading to a use-after-free.
Close this race by keeping vduse_lock held until dev->lock has been
acquired in the open path, matching the lock ordering already used by
the destroy path.
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Qihang Tang <q.h.hack.winter@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260508094659.94647-1-q.h.hack.winter@gmail.com>
Denis V. Lunev [Wed, 13 May 2026 14:50:04 +0000 (16:50 +0200)]
vhost/vsock: Refuse the connection immediately when guest isn't ready
When the host initiates an AF_VSOCK connect() to a guest that has not
yet loaded the virtio-vsock transport (i.e. still booting), the caller
blocks for VSOCK_DEFAULT_CONNECT_TIMEOUT.
A caller that wants to know if the guest is up yet instead of waiting
could theoretically tune SO_VM_SOCKETS_CONNECT_TIMEOUT, but it's tricky
to find the right timeout, if not impossible: there's no way to
distinguish "guest won't reply because it's not up yet" vs "guest is up
and tried to reply, but was too slow".
Furthermore, this delay is pointless:
- If the guest doesn't initialize within this timeout, connect()
returns ETIMEDOUT.
- If the guest **does** initialize, it'll reply with RST immediately,
because there won't be a listener on the port yet; connect() returns
ECONNRESET.
That's also inconsistent with the behavior at other initialization
stages: if a connection is attempted when the guest driver is already
loaded, but nothing is listening yet, we return ECONNRESET immediately
without waiting.
Fix this by checking the RX virtqueue backend in
vhost_transport_send_pkt() before queuing. If it's NULL, return
-EHOSTUNREACH immediately.
Callers that used to get ETIMEDOUT will now usually get EHOSTUNREACH.
Signed-off-by: Denis V. Lunev <den@openvz.org> Co-developed-by: Polina Vishneva <polina.vishneva@virtuozzo.com> Signed-off-by: Polina Vishneva <polina.vishneva@virtuozzo.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260513145842.809404-1-polina.vishneva@virtuozzo.com>
virtio: add missing kernel-doc for map and vmap members
Commit bee8c7c24b73 ("virtio: introduce map ops in virtio core") and
commit b16060c5c7d5 ("virtio: introduce virtio_map container union")
added 'map' and 'vmap' members to struct virtio_device but did not
update the kernel-doc comment block. This caused 'make htmldocs' to
emit warnings:
./include/linux/virtio.h:188 struct member 'map' not described in 'virtio_device'
./include/linux/virtio.h:188 struct member 'vmap' not described in 'virtio_device'
Add the missing entries in struct-declaration order to match the
existing convention in the file. After this patch, 'make htmldocs'
no longer emits these warnings.
Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core") Fixes: b16060c5c7d5 ("virtio: introduce virtio_map container union") Reported-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com> Signed-off-by: Christian Fontanez <christfontanez@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20260519013321.32511-1-christfontanez@gmail.com>
clocksource: move NXP timer selection to drivers/clocksource
The Kconfig logic for selecting the scheduler clocksource on
NXP Vybrid (VF610) uses a `choice` block restricted to 32-bit ARM. This
prevents 64-bit architectures, such as the NXP S32 family, from enabling
the NXP Periodic Interrupt Timer (PIT) driver (CONFIG_NXP_PIT_TIMER).
Relocate the NXP clocksource selection from arch/arm/mach-imx/Kconfig to
drivers/clocksource/Kconfig. This allows the configuration to be shared
across different architectures.
Update the selection to include support for ARCH_S32 and add a "None"
option restricted to ARCH_S32, since Vybrid lacks the ARM Architected
Timer. The Vybrid Global Timer option is restricted to ARCH_MULTI_V7
SOC_VF610 platforms to prevent it from being visible on Cortex-M4 builds,
which lack the ARM Global Timer hardware.
Fixes: bee33f22d7c3 ("clocksource/drivers/nxp-pit: Add NXP Automotive s32g2 / s32g3 support") Signed-off-by: Enric Balletbo i Serra <eballetb@redhat.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260514-fix-nxp-timer-v3-1-a3e68fdb505e@redhat.com
Kartik Rajput [Thu, 7 May 2026 15:45:57 +0000 (21:15 +0530)]
clocksource/drivers/timer-tegra186: Reserve and service a kernel watchdog
Tegra SoCs supports multiple watchdog timers. If the kernel crashes or
hangs before userspace enables a watchdog, the system cannot recover and
may remain bricked, e.g. after a failed OTA update. The driver currently
leaves all watchdogs disabled until userspace configures them.
Reserve first available watchdog as a kernel-only watchdog for Tegra186
and Tegra234. Arm it during probe (120s timeout) and keep it alive in
the driver IRQ handler. Do not register it to userspace. Other available
watchdogs remain exposed to userspace. This guarantees the system can
reset itself in case of a hang or crash even when userspace never starts.
Kartik Rajput [Thu, 7 May 2026 15:45:56 +0000 (21:15 +0530)]
clocksource/drivers/timer-tegra186: Register all accessible watchdog timers
Tegra186+ SoCs expose multiple watchdog timers, but the driver only
registers WDT(0).
Iterate over num_wdts and, for each WDT, check the SCR (firewall) registers
in the TKE block to determine whether Linux has read and write access.
Register the watchdogs that are accessible.
Kartik Rajput [Thu, 7 May 2026 15:45:55 +0000 (21:15 +0530)]
clocksource/drivers/timer-tegra186: Correct num_wdts for Tegra186 and Tegra234
On Tegra186 and Tegra234, WDT2 is connected to the Audio Processing
Engine (APE) and cannot be accessed from Linux. Only WDT0 and WDT1
are accessible to Linux.
Update num_wdts from 3 to 2 for both Tegra186 and Tegra234 to reflect
the actual number of watchdogs available to Linux.
Kartik Rajput [Thu, 7 May 2026 15:45:54 +0000 (21:15 +0530)]
clocksource/drivers/timer-tegra186: Fix support for multiple watchdog instances
Tegra SoCs support multiple watchdogs; currently only one (WDT0) is
used. When multiple watchdogs are registered, tegra186_wdt_enable()
overwrites the TKEIE(x) register, discarding any existing watchdog
interrupt enable bits. As a result, enabling one watchdog inadvertently
disables interrupts for the others.
Fix this by preserving the existing TKEIE(x) value and updating it
using a read-modify-write sequence.
Referenced kptr destruction can run from tracing/NMI contexts through
bpf_obj_drop() and map value update/delete paths, reaching NMI-unsafe
special field teardown and deadlocks. Justin reported the issue and
iterated on fixes in [0]-[2], and also confirmed the bpf_obj_drop()
reproducer in [3].
This series rejects unsafe obj drops from non-iterator tracing programs,
limits map value recycle to NMI-safe field cancellation, and adds
focused selftests for the obj_drop(), NMI delete, and recycle teardown
cases.
Add focused map_kptr coverage for BPF-side map updates that touch values
containing referenced kptrs.
The new syscall programs stash the testmod refcounted object in an array
map, a preallocated hash map, and a no-prealloc hash map, then update the
same map from BPF. The refcount must remain elevated after the update,
while the userspace runner destroys the skeleton and reuses the existing
refcount wait to confirm map teardown releases the kptr.
selftests/bpf: Exercise unsafe obj drops from tracing progs
Add task_kfunc failure cases for bpf_obj_drop() on local objects with
referenced kptr fields from tracing and NMI tracing programs. These programs
must be rejected because dropping the object would run full special-field
destruction synchronously in an unsafe context.
Justin Suess [Tue, 9 Jun 2026 20:25:44 +0000 (22:25 +0200)]
bpf: Cancel special fields on map value recycle
Map update and delete paths currently call bpf_obj_free_fields() when a
value is being replaced or recycled. That makes field destruction depend
on the context of the update/delete operation. For tracing programs this
can include NMI context, where referenced kptr destructors, uptr
unpinning, and graph root destruction are not generally safe.
Introduce bpf_obj_cancel_fields() for the reusable-value path. It only
performs NMI-safe cleanup for timer, workqueue, and task_work fields.
Fields that need full destruction are left attached to the recycled value
and are destroyed by the final cleanup path instead.
Switch array and hashtab update/delete/recycle paths to this cancel
helper. Keep bpf_obj_free_fields() for final map destruction and for
bpf_mem_alloc destructors. Preallocated hashtabs do not have allocator
destructors, so teardown continues to walk the normal and extra elements
and fully destroy their fields.
This deliberately relaxes the eager-free semantics of map update/delete
for special fields. Programs that relied on a recycled map slot becoming
empty immediately after update/delete were relying on behavior that
cannot be implemented safely from every BPF execution context without
offloading arbitrary destructors.
There is a chance this change breaks programs making assumptions
regarding the eager freeing of fields. If so, we can relax semantics to
cancellation only when irqs_disabled() is true in the future. However,
theoretically, map values that get reused eagerly already have weaker
guarantees as parallel users can recreate freed fields before the new
element becomes visible again.
Justin Suess [Tue, 9 Jun 2026 20:25:43 +0000 (22:25 +0200)]
bpf: Reject bpf_obj_drop() from tracing progs
bpf_obj_drop() runs bpf_obj_free_fields() synchronously for
program-allocated objects. When such an object contains NMI unsafe
fields, tracing programs that can run from arbitrary instrumented
context can reach that destruction from unsafe contexts, including NMI.
NMI is likely one instance of this problem, and other instances would
include possible unsafe reentrancy. Deferring bpf_obj_drop() is not
appealing either: it would add delayed-free machinery to a release
operation that otherwise has straightforward synchronous ownership
semantics.
Reject bpf_obj_drop() and bpf_percpu_obj_drop() from tracing programs
that may run from unsafe contexts unless every field in the object's BTF
record is explicitly NMI safe. Do not reject sleepable
BPF_PROG_TYPE_TRACING programs, since they are not the arbitrary/NMI
contexts that motivate the restriction.
Note that while bpf_rb_root and bpf_list_head would be NMI safe on their
own to free, the objects recursively held by them may not be; be
conservative and just mark them as not NMI safe for now.
Use a whitelist for the NMI-safe field set instead of listing only known
NMI unsafe fields. Locks, async fields, unreferenced kptrs, and
refcounts are known to be NMI safe because their destruction is either a
no-op, simple state reset, or async cancellation. Referenced kptrs,
percpu referenced kptrs, uptrs, graph roots, graph nodes, and any future
field type are rejected until audited for arbitrary tracing and NMI
contexts. This is less susceptible to future changes in fields that were
previously safe by exclusion, and to new fields being added without
updating this check.
Convert the existing recursive local-object drop success case to a
syscall program in the same commit, since this verifier change makes the
old tracing program form invalid. The test still exercises
bpf_obj_drop() releasing a referenced task kptr from a safe program
type.
====================
selftests/bpf: Fix tests for llvm23 true signature
LLVM23 ([1]) records the 'true' function signature in BTF, i.e. the
signature inferred after optimization rather than the one written in C.
This caused two kinds of selftest failures (see below).
Case 1: keep int return type for tailcall subprogs
The verifier requires any subprog that issues a bpf_tail_call to return
an 'int' (see check_btf_func() in kernel/bpf/check_btf.c, which rejects
it with "tail_call is only allowed in functions that return 'int'").
Several tailcall subprogs do 'return 0' (or another constant) whose
result no caller uses. With llvm23 the compiler folds the constant and,
since the return value is dead, optimizes the subprog to effectively
return 'void' and records 'void' in BTF, so the program fails to load.
Use barrier_var() and __sink() to prevent returned value from being
optimized.
Case 2: adjust tracing prog ctx layout for the true signature
test_pkt_access_subprog2() has an unused argument that llvm optimizes
away. Before llvm23 the BTF signature did not match the optimized
assembly, so the verifier fell back to MAX_BPF_FUNC_REG_ARGS (5) u64
arguments and the fexit return value sat after args[5]. With llvm23 the
true signature has a single argument, so the return value moves to the
slot after args[1]. Select the matching ctx struct based on __clang_major__
so the test works with both old and new llvm.
Changelogs:
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260609163947.1717694-1-yonghong.song@linux.dev/
- Do not use bpf array map or bpf global var. Use __sink() instead.
====================
Yonghong Song [Tue, 9 Jun 2026 23:34:12 +0000 (16:34 -0700)]
selftests/bpf: Adjust fexit_bpf2bpf ctx layout for llvm23 true signature
test_pkt_access_subprog2() is defined in C as
int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
but llvm optimizes away the unused 'int val' argument. Before llvm23 the
BTF signature did not match the optimized assembly, so the verifier set
attach_func_proto to NULL and fell back to MAX_BPF_FUNC_REG_ARGS (5) u64
arguments (see btf_ctx_access()). The fexit ctx struct therefore placed
the return value after args[5].
With llvm23 the 'true' signature
int test_pkt_access_subprog2(volatile struct __sk_buff *skb)
is recorded in BTF, so nr_args becomes 1 and the return value moves to
the slot right after args[1]. Select the matching args_subprog2 layout
based on __clang_major__ so the test works with both old and new llvm.
Yonghong Song [Tue, 9 Jun 2026 23:34:07 +0000 (16:34 -0700)]
selftests/bpf: Keep int return type for tailcall subprogs
LLVM23 ([1]) supports 'true' function signature in BTF. The return type
of the caller of a tailcall must be an 'int'. Otherwise, verification will
fail (see check_btf_func() in check_btf.c). So with llvm23, it is possible
that the compiler may change the caller's return type from 'int' to 'void'.
To prevent this, barrier_var() and __sink() are used to avoid returning
a constant prone to be optimized.
====================
net: dsa: realtek: rtl8365mb: bridge offloading and VLAN support
This series introduces bridge offloading, FDB management, and VLAN support
for the Realtek rtl8365mb DSA switch driver. The primary goal is to
enable hardware frame forwarding between bridge ports, reducing CPU
overhead and providing advanced features like VLAN and FDB isolation.
Some of these patches are based on original work by Alvin Šipraga,
subsequently adapted and updated for the current net-next state.
I attempted to reach Alvin for review of the final version but was
unable to establish contact. Any regressions in this version are my
responsibility.
====================
net: dsa: realtek: rtl8365mb: add bridge port flags
Implement support for bridge port flags to control learning and flooding
behavior. This patch maps hardware functionalities to the following
bridge flags:
Implement hardware offloading of bridge functionality. This is achieved
by using the per-port isolation registers, which contain a forwarding
port mask. The switch will refuse to forward packets ingressed on a
given port to a port which is not in its forwarding mask.
For each bridge that is offloaded, use the DSA-provided bridge number
for the Extended Filtering ID (EFID). When using Independent VLAN
Learning (IVL), the forwarding database is keyed with the tuple
{VID, MAC, EFID}. There are 8 EFIDs available (0~7), but we reserve the
default EFID 0 for standalone ports where learning is disabled. This
fits nicely because DSA indexes the bridge number starting from 1.
Because of the limited number of EFIDs, we have to set the
max_num_bridges property of our switch to 7: we can't offload more than
that or we will fail to offer IVL as at least two bridges would end up
having to share an EFID.
All ports start isolated, forwarding exclusively to CPU ports, and
with VLAN transparent, ignoring VLAN membership. Once a member in a
bridge, the port isolation is expanded to include the bridge members.
When that bridge enables VLAN filtering, the VLAN transparent feature is
disabled, letting the switch filter based on VLAN setup.
Alvin Šipraga [Sat, 6 Jun 2026 08:29:31 +0000 (05:29 -0300)]
net: dsa: realtek: rtl8365mb: add FDB support
Implement support for FDB and MDB management for the RTL8365MB series
switches.
The hardware supports IVL by keying the unicast forwarding database with
the {MAC, VID, EFID} tuple. The Extended Filtering ID (EFID) is 3 bits
wide, providing 8 unique filtering domains. This driver reserves EFID 0
for standalone ports, effectively limiting the hardware offload to a
maximum of 7 bridges. The multicast database uses a {MAC, VID} key, with
ports from different bridges sharing the same multicast group.
Introduce a mutex lock (l2_lock) to protect concurrent L2 table updates.
Add support for forwarding database operations, including unicast and
multicast entry handling as well as fast aging support.
Set DSA switch flags assisted_learning_on_cpu_port and fdb_isolation.