]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
13 days agovdpa/mlx5: update mlx_features with driver state check
Cindy Lu [Mon, 26 Jan 2026 09:45:36 +0000 (17:45 +0800)] 
vdpa/mlx5: update mlx_features with driver state check

Add logic in mlx5_vdpa_set_attr() to ensure the VIRTIO_NET_F_MAC
feature bit is properly set only when the device is not yet in
the DRIVER_OK (running) state.

This makes the MAC address visible in the output of:

 vdpa dev config show -jp

when the device is created without an initial MAC address.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260126094848.9601-2-lulu@redhat.com>

2 weeks agocrypto: virtio: Replace package id with numa node id
Bibo Mao [Tue, 13 Jan 2026 03:05:56 +0000 (11:05 +0800)] 
crypto: virtio: Replace package id with numa node id

With multiple virtio crypto devices supported with different NUMA
nodes, when crypto session is created, it will search virtio crypto
device with the same numa node of current CPU.

Here API topology_physical_package_id() is replaced with cpu_to_node()
since package id is physical concept, and one package id have multiple
memory numa id.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113030556.3522533-4-maobibo@loongson.cn>

2 weeks agocrypto: virtio: Remove duplicated virtqueue_kick in virtio_crypto_skcipher_crypt_req
Bibo Mao [Tue, 13 Jan 2026 03:05:55 +0000 (11:05 +0800)] 
crypto: virtio: Remove duplicated virtqueue_kick in virtio_crypto_skcipher_crypt_req

With function virtio_crypto_skcipher_crypt_req(), there is already
virtqueue_kick() call with spinlock held in function
__virtio_crypto_skcipher_do_req(). Remove duplicated virtqueue_kick()
function call here.

Fixes: d79b5d0bbf2e ("crypto: virtio - support crypto engine framework")
Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113030556.3522533-3-maobibo@loongson.cn>

2 weeks agocrypto: virtio: Add spinlock protection with virtqueue notification
Bibo Mao [Tue, 13 Jan 2026 03:05:54 +0000 (11:05 +0800)] 
crypto: virtio: Add spinlock protection with virtqueue notification

When VM boots with one virtio-crypto PCI device and builtin backend,
run openssl benchmark command with multiple processes, such as
  openssl speed -evp aes-128-cbc -engine afalg  -seconds 10 -multi 32

openssl processes will hangup and there is error reported like this:
 virtio_crypto virtio0: dataq.0:id 3 is not a head!

It seems that the data virtqueue need protection when it is handled
for virtio done notification. If the spinlock protection is added
in virtcrypto_done_task(), openssl benchmark with multiple processes
works well.

Fixes: fed93fb62e05 ("crypto: virtio - Handle dataq logic with tasklet")
Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260113030556.3522533-2-maobibo@loongson.cn>

2 weeks agoDocumentation: Add documentation for VDUSE Address Space IDs
Eugenio Pérez [Mon, 19 Jan 2026 14:33:06 +0000 (15:33 +0100)] 
Documentation: Add documentation for VDUSE Address Space IDs

Address Space IDs allows the VDUSE framework to support devices able to
expose different virtqueues to different part of the drivers.  For
example, to let QEMU handle the net device control virtqueue, so QEMU
always knows the state of the device like mac address or number of
queues enabled, while leaving the dataplane passthrough to the guest
intact.  This enables live migration.

Expands the VDUSE documentation to explain how to use the new ioctls or
the new struct members of old ioctls.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-14-eperezma@redhat.com>

2 weeks agovduse: bump version number
Eugenio Pérez [Mon, 19 Jan 2026 14:33:05 +0000 (15:33 +0100)] 
vduse: bump version number

Finalize the series by advertising VDUSE API v1 support to userspace.

Now that all required infrastructure for v1 (ASIDs, VQ groups,
update_iotlb_v2) is in place, VDUSE devices can opt in to the new
features.

Assume API version 0 if the VDUSE instance does not call
VDUSE_GET_API_VERSION to maintain compatibility.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-13-eperezma@redhat.com>

2 weeks agovduse: add vq group asid support
Eugenio Pérez [Mon, 19 Jan 2026 14:33:04 +0000 (15:33 +0100)] 
vduse: add vq group asid support

Add support for assigning Address Space Identifiers (ASIDs) to each VQ
group.  This enables mapping each group into a distinct memory space.

The vq group to ASID association is protected by a rwlock now.  But the
mutex domain_lock keeps protecting the domains of all ASIDs, as some
operations like the one related with the bounce buffer size still
requires to lock all the ASIDs.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-12-eperezma@redhat.com>

2 weeks agovduse: merge tree search logic of IOTLB_GET_FD and IOTLB_GET_INFO ioctls
Eugenio Pérez [Mon, 19 Jan 2026 14:33:03 +0000 (15:33 +0100)] 
vduse: merge tree search logic of IOTLB_GET_FD and IOTLB_GET_INFO ioctls

The next patch adds new ioctl with the ASID member per entry.  Abstract
these two so it can be build on top easily.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-11-eperezma@redhat.com>

2 weeks agovduse: take out allocations from vduse_dev_alloc_coherent
Eugenio Pérez [Mon, 19 Jan 2026 14:33:02 +0000 (15:33 +0100)] 
vduse: take out allocations from vduse_dev_alloc_coherent

The function vduse_dev_alloc_coherent will be called under rwlock in
next patches.  Make it out of the lock to avoid increasing its fail
rate.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-10-eperezma@redhat.com>

2 weeks agovduse: remove unused vaddr parameter of vduse_domain_free_coherent
Eugenio Pérez [Mon, 19 Jan 2026 14:33:01 +0000 (15:33 +0100)] 
vduse: remove unused vaddr parameter of vduse_domain_free_coherent

We will modify the function in next patches so let's clean it first.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-9-eperezma@redhat.com>

2 weeks agovduse: refactor vdpa_dev_add for goto err handling
Eugenio Pérez [Mon, 19 Jan 2026 14:33:00 +0000 (15:33 +0100)] 
vduse: refactor vdpa_dev_add for goto err handling

Next patches introduce more error paths in this function.  Refactor it
so they can be accommodated through gotos.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-8-eperezma@redhat.com>

2 weeks agovhost: forbid change vq groups ASID if DRIVER_OK is set
Eugenio Pérez [Mon, 19 Jan 2026 14:32:59 +0000 (15:32 +0100)] 
vhost: forbid change vq groups ASID if DRIVER_OK is set

Only vdpa_sim support it.  Forbid this behaviour as there is no use for
it right now, we can always enable it in the future with a feature flag.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-7-eperezma@redhat.com>

2 weeks agovdpa: document set_group_asid thread safety
Eugenio Pérez [Mon, 19 Jan 2026 14:32:58 +0000 (15:32 +0100)] 
vdpa: document set_group_asid thread safety

Document that the function races with the check of DRIVER_OK.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-6-eperezma@redhat.com>

2 weeks agovduse: return internal vq group struct as map token
Eugenio Pérez [Mon, 19 Jan 2026 14:32:57 +0000 (15:32 +0100)] 
vduse: return internal vq group struct as map token

Return the internal struct that represents the vq group as virtqueue map
token, instead of the device.  This allows the map functions to access
the information per group.

At this moment all the virtqueues share the same vq group, that only
can point to ASID 0.  This change prepares the infrastructure for actual
per-group address space handling

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-5-eperezma@redhat.com>

2 weeks agovduse: add vq group support
Eugenio Pérez [Mon, 19 Jan 2026 14:32:56 +0000 (15:32 +0100)] 
vduse: add vq group support

This allows separate the different virtqueues in groups that shares the
same address space.  Asking the VDUSE device for the groups of the vq at
the beginning as they're needed for the DMA API.

Allocating 3 vq groups as net is the device that need the most groups:
* Dataplane (guest passthrough)
* CVQ
* Shadowed vrings.

Future versions of the series can include dynamic allocation of the
groups array so VDUSE can declare more groups.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-4-eperezma@redhat.com>

2 weeks agovduse: add v1 API definition
Eugenio Pérez [Mon, 19 Jan 2026 14:32:55 +0000 (15:32 +0100)] 
vduse: add v1 API definition

This allows the kernel to detect whether the userspace VDUSE device
supports the VQ group and ASID features.  VDUSE devices that don't set
the V1 API will not receive the new messages, and vdpa device will be
created with only one vq group and asid.

The next patches implement the new feature incrementally, only enabling
the VDUSE device to set the V1 API version by the end of the series.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-3-eperezma@redhat.com>

2 weeks agovhost: move vdpa group bound check to vhost_vdpa
Eugenio Pérez [Mon, 19 Jan 2026 14:32:54 +0000 (15:32 +0100)] 
vhost: move vdpa group bound check to vhost_vdpa

Remove duplication by consolidating these here.  This reduces the
posibility of a parent driver missing them.

While we're at it, fix a bug in vdpa_sim where a valid ASID can be
assigned to a group equal to ngroups, causing an out of bound write.

Cc: stable@vger.kernel.org
Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20260119143306.1818855-2-eperezma@redhat.com>

2 weeks agocheckpatch: special-case cacheline group macros
Michael S. Tsirkin [Mon, 5 Jan 2026 21:05:42 +0000 (16:05 -0500)] 
checkpatch: special-case cacheline group macros

Currently, cacheline group macros trigger checkpatch warnings.
For example:

  $ ./scripts/checkpatch.pl -g ba7e025a6c84aed012421468d83639e5dae982b0
  WARNING: Missing a blank line after declarations
  #58: FILE: drivers/gpio/gpio-virtio.c:32:
  + struct virtio_gpio_response res;
  + __dma_from_device_group_end();

  $ ./scripts/checkpatch.pl -g 5d4cc87414c5d11345c4b11d61377d351b5c28a2
  WARNING: Missing a blank line after declarations
  #267: FILE: include/net/sock.h:431:
  + int sk_rcvlowat;
  + __cacheline_group_end(sock_read_rx);

But these are not actually statements - the following macros
all expand to zero-length fields:
  __cacheline_group_begin()
  __cacheline_group_end()
  __cacheline_group_begin_aligned()
  __cacheline_group_end_aligned()
  __dma_from_device_group_begin()
  __dma_from_device_group_end()

Add them to $declaration_macros so checkpatch recognizes this fact.

Message-ID: <b345bb7e2d4e23672e3e5d1b283754dc11c7d8cd.1767647872.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agogpio: virtio: reorder fields to reduce struct padding
Michael S. Tsirkin [Mon, 29 Dec 2025 23:58:05 +0000 (18:58 -0500)] 
gpio: virtio: reorder fields to reduce struct padding

Reorder struct virtio_gpio_line fields to place the DMA buffers
(req/res) last.

This eliminates the padding from aligning struct size on
ARCH_DMA_MINALIGN.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-ID: <f1221bbc120df6adaba9006710a517f1e84a10b2.1767601130.git.mst@redhat.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agogpio: virtio: fix DMA alignment
Michael S. Tsirkin [Tue, 30 Dec 2025 13:04:15 +0000 (08:04 -0500)] 
gpio: virtio: fix DMA alignment

The res and ires buffers in struct virtio_gpio_line and struct
vgpio_irq_line respectively are used for DMA_FROM_DEVICE via
virtqueue_add_sgs().  However, within these structs, even though these
elements are tagged as ____cacheline_aligned, adjacent struct elements
can share DMA cachelines on platforms where ARCH_DMA_MINALIGN >
L1_CACHE_BYTES (e.g., arm64 with 128-byte DMA alignment but 64-byte
cache lines).

The existing ____cacheline_aligned annotation aligns to L1_CACHE_BYTES
which is not always sufficient for DMA alignment. For example, with
L1_CACHE_BYTES = 32 and ARCH_DMA_MINALIGN = 128
  - irq_lines[0].ires at offset 128
  - irq_lines[1].type at offset 192
both in same 128-byte DMA cacheline [128-256)

When the device writes to irq_lines[0].ires and the CPU concurrently
modifies one of irq_lines[1].type/disabled/masked/queued flags,
corruption can occur on non-cache-coherent platforms.

Fix by using __dma_from_device_group_begin()/end() annotations on the
DMA buffers. Drop ____cacheline_aligned - it's not required to isolate
request and response, and keeping them would increase the memory cost.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-ID: <ba7e025a6c84aed012421468d83639e5dae982b0.1767601130.git.mst@redhat.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agovsock/virtio: reorder fields to reduce padding
Michael S. Tsirkin [Mon, 29 Dec 2025 23:58:05 +0000 (18:58 -0500)] 
vsock/virtio: reorder fields to reduce padding

Reorder struct virtio_vsock fields to place the DMA buffer (event_list)
last. This eliminates the padding from aligning the struct size on
ARCH_DMA_MINALIGN.

Message-ID: <ce44f61af415521e00ab7492aa16d3d19f00bd5e.1769632071.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovirtio_input: use virtqueue_add_inbuf_cache_clean for events
Michael S. Tsirkin [Mon, 29 Dec 2025 23:28:28 +0000 (18:28 -0500)] 
virtio_input: use virtqueue_add_inbuf_cache_clean for events

The evts array contains 64 small (8-byte) input events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

Previous patch isolated the array in its own cachelines,
so the warnings are now spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these cache lines, suppressing these warnings.

Message-ID: <4c885b4046323f68cf5cadc7fbfb00216b11dd20.1767601130.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovirtio-rng: fix DMA alignment for data buffer
Michael S. Tsirkin [Mon, 29 Dec 2025 23:27:21 +0000 (18:27 -0500)] 
virtio-rng: fix DMA alignment for data buffer

The data buffer in struct virtrng_info is used for DMA_FROM_DEVICE via
virtqueue_add_inbuf() and shares cachelines with the adjacent
CPU-written fields (data_avail, data_idx).

The device writing to the DMA buffer and the CPU writing to adjacent
fields could corrupt each other's data on non-cache-coherent platforms.

Add __dma_from_device_group_begin()/end() annotations to place these
in distinct cache lines.

Message-ID: <157a63b6324d1f1307ddd4faa3b62a8b90a79423.1767601130.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovirtio_scsi: fix DMA cacheline issues for events
Michael S. Tsirkin [Mon, 29 Dec 2025 23:25:16 +0000 (18:25 -0500)] 
virtio_scsi: fix DMA cacheline issues for events

Current struct virtio_scsi_event_node layout has two problems:

The event (DMA_FROM_DEVICE) and work (CPU-written via
INIT_WORK/queue_work) fields share a cacheline.
On non-cache-coherent platforms, CPU writes to work can
corrupt device-written event data.

If ARCH_DMA_MINALIGN is large enough, the 8 events in event_list share
cachelines, triggering CONFIG_DMA_API_DEBUG warnings.

Fix the corruption by moving event buffers to a separate array and
aligning using __dma_from_device_group_begin()/end().

Suppress the (now spurious) DMA debug warnings using
virtqueue_add_inbuf_cache_clean().

Message-ID: <8801aeef7576a155299f19b6887682dd3a272aba.1767601130.git.mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovirtio_input: fix DMA alignment for evts
Michael S. Tsirkin [Mon, 29 Dec 2025 23:24:36 +0000 (18:24 -0500)] 
virtio_input: fix DMA alignment for evts

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The evts array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent lock and ready fields are written
by the CPU during normal operation. If these share cachelines with evts,
CPU writes can corrupt DMA data.

Add __dma_from_device_group_begin()/end() annotations to ensure evts is
isolated in its own cachelines.

Message-ID: <cd328233198a76618809bb5cd9a6ddcaa603a8a1.1767601130.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovsock/virtio: use virtqueue_add_inbuf_cache_clean for events
Michael S. Tsirkin [Mon, 29 Dec 2025 23:27:54 +0000 (18:27 -0500)] 
vsock/virtio: use virtqueue_add_inbuf_cache_clean for events

The event_list array contains 8 small (4-byte) events that share
cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled,
this can trigger warnings about overlapping DMA mappings within
the same cacheline.

The previous patch isolated event_list in its own cache lines
so the warnings are spurious.

Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not
write into these fields, suppressing the warnings.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Message-ID: <4b5bf63a7ebb782d87f643466b3669df567c9fe1.1767601130.git.mst@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 weeks agovsock/virtio: fix DMA alignment for event_list
Michael S. Tsirkin [Mon, 29 Dec 2025 23:23:53 +0000 (18:23 -0500)] 
vsock/virtio: fix DMA alignment for event_list

On non-cache-coherent platforms, when a structure contains a buffer
used for DMA alongside fields that the CPU writes to, cacheline sharing
can cause data corruption.

The event_list array is used for DMA_FROM_DEVICE operations via
virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are
written by the CPU while the buffer is available, so mapped for the
device. If these share cachelines with event_list, CPU writes can
corrupt DMA data.

Add __dma_from_device_group_begin()/end() annotations to ensure event_list
is isolated in its own cachelines.

Message-ID: <f19ebd74f70c91cab4b0178df78cf6a6e107a96b.1767601130.git.mst@redhat.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovirtio: add virtqueue_add_inbuf_cache_clean API
Michael S. Tsirkin [Mon, 29 Dec 2025 18:25:23 +0000 (13:25 -0500)] 
virtio: add virtqueue_add_inbuf_cache_clean API

Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN
to virtqueue operations. This suppresses DMA debug cacheline overlap
warnings for buffers where proper cache management is ensured by the
caller.

Message-ID: <e50d38c974859e731e50bda7a0ee5691debf5bc4.1767601130.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodma-debug: track cache clean flag in entries
Michael S. Tsirkin [Mon, 29 Dec 2025 19:38:31 +0000 (14:38 -0500)] 
dma-debug: track cache clean flag in entries

If a driver is buggy and has 2 overlapping mappings but only
sets cache clean flag on the 1st one of them, we warn.
But if it only does it for the 2nd one, we don't.

Fix by tracking cache clean flag in the entry.

Message-ID: <0ffb3513d18614539c108b4548cdfbc64274a7d1.1767601130.git.mst@redhat.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodocs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN
Michael S. Tsirkin [Mon, 29 Dec 2025 13:11:41 +0000 (08:11 -0500)] 
docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN

Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the
previous patch.

Message-ID: <0720b4be31c1b7a38edca67fd0c97983d2a56936.1767601130.git.mst@redhat.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN
Michael S. Tsirkin [Mon, 29 Dec 2025 12:28:43 +0000 (07:28 -0500)] 
dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN

When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a
cacheline, and DMA_API_DEBUG is enabled, we get this warning:
cacheline tracking EEXIST, overlapping mappings aren't supported.

This is because when one of the mappings is removed, while another one
is active, CPU might write into the buffer.

Add an attribute for the driver to promise not to do this, making the
overlapping safe, and suppressing the warning.

Message-ID: <2d5d091f9d84b68ea96abd545b365dd1d00bbf48.1767601130.git.mst@redhat.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodocs: dma-api: document __dma_from_device_group_begin()/end()
Michael S. Tsirkin [Mon, 29 Dec 2025 09:01:21 +0000 (04:01 -0500)] 
docs: dma-api: document __dma_from_device_group_begin()/end()

Document the __dma_from_device_group_begin()/end() annotations.

Message-ID: <01ea88055ded4d70cac70ba557680fd5fa7d9ff5.1767601130.git.mst@redhat.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodma-mapping: add __dma_from_device_group_begin()/end()
Michael S. Tsirkin [Mon, 29 Dec 2025 08:53:39 +0000 (03:53 -0500)] 
dma-mapping: add __dma_from_device_group_begin()/end()

When a structure contains a buffer that DMA writes to alongside fields
that the CPU writes to, cache line sharing between the DMA buffer and
CPU-written fields can cause data corruption on non-cache-coherent
platforms.

Add __dma_from_device_group_begin()/end() annotations to ensure proper
alignment to prevent this:

struct my_device {
spinlock_t lock1;
__dma_from_device_group_begin();
char dma_buffer1[16];
char dma_buffer2[16];
__dma_from_device_group_end();
spinlock_t lock2;
};

Message-ID: <19163086d5e4704c316f18f6da06bc1c72968904.1767601130.git.mst@redhat.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovirtio_ring: add in order support
Jason Wang [Tue, 30 Dec 2025 06:46:49 +0000 (14:46 +0800)] 
virtio_ring: add in order support

This patch implements in order support for both split virtqueue and
packed virtqueue. Performance could be gained for the device where the
memory access could be expensive (e.g vhost-net or a real PCI device):

Benchmark with KVM guest:

Vhost-net on the host: (pktgen + XDP_DROP):

         in_order=off | in_order=on | +%
    TX:  4.51Mpps     | 5.30Mpps    | +17%
    RX:  3.47Mpps     | 3.61Mpps    | + 4%

Vhost-user(testpmd) on the host: (pktgen/XDP_DROP):

For split virtqueue:

         in_order=off | in_order=on | +%
    TX:  5.60Mpps     | 5.60Mpps    | +0.0%
    RX:  9.16Mpps     | 9.61Mpps    | +4.9%

For packed virtqueue:

         in_order=off | in_order=on | +%
    TX:  5.60Mpps     | 5.70Mpps    | +1.7%
    RX:  10.6Mpps     | 10.8Mpps    | +1.8%

Benchmark also shows no performance impact for in_order=off for queue
size with 256 and 1024.

Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-20-jasowang@redhat.com>

6 weeks agovirtio_ring: factor out split detaching logic
Jason Wang [Tue, 30 Dec 2025 06:46:48 +0000 (14:46 +0800)] 
virtio_ring: factor out split detaching logic

This patch factors out the split core detaching logic that could be
reused by in order feature into a dedicated function.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-19-jasowang@redhat.com>

6 weeks agovirtio_ring: factor out split indirect detaching logic
Jason Wang [Tue, 30 Dec 2025 06:46:47 +0000 (14:46 +0800)] 
virtio_ring: factor out split indirect detaching logic

Factor out the split indirect descriptor detaching logic in order to
allow it to be reused by the in order support.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-18-jasowang@redhat.com>

6 weeks agovirtio_ring: factor out core logic for updating last_used_idx
Jason Wang [Tue, 30 Dec 2025 06:46:46 +0000 (14:46 +0800)] 
virtio_ring: factor out core logic for updating last_used_idx

Factor out the core logic for updating last_used_idx to be reused by
the packed in order implementation.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-17-jasowang@redhat.com>

6 weeks agovirtio_ring: factor out core logic of buffer detaching
Jason Wang [Tue, 30 Dec 2025 06:46:45 +0000 (14:46 +0800)] 
virtio_ring: factor out core logic of buffer detaching

Factor out core logic of buffer detaching and leave the free list
management to the caller so in_order can just call the core logic.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-16-jasowang@redhat.com>

6 weeks agovirtio_ring: determine descriptor flags at one time
Jason Wang [Tue, 30 Dec 2025 06:46:44 +0000 (14:46 +0800)] 
virtio_ring: determine descriptor flags at one time

Let's determine the last descriptor by counting the number of sg. This
would be consistent with packed virtqueue implementation and ease the
future in-order implementation.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-15-jasowang@redhat.com>

6 weeks agovirtio_ring: introduce virtqueue ops
Jason Wang [Tue, 30 Dec 2025 06:46:43 +0000 (14:46 +0800)] 
virtio_ring: introduce virtqueue ops

This patch introduces virtqueue ops which is a set of callbacks
that will be called for different queue layout or features. This would
help to avoid branches for split/packed and will ease the future
implementation like in order.

Note that in order to eliminate the indirect calls this patch uses
global array of const ops to allow compiler to avoid indirect
branches.

Tested with CONFIG_MITIGATION_RETPOLINE, no performance differences
were noticed.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-14-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use unsigned int for virtqueue_poll_packed()
Jason Wang [Tue, 30 Dec 2025 06:46:42 +0000 (14:46 +0800)] 
virtio_ring: switch to use unsigned int for virtqueue_poll_packed()

Switch to use unsigned int for virtqueue_poll_packed() to match
virtqueue_poll() and virtqueue_poll_split() and to ease the
abstraction of the virtqueue ops.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-13-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for detach_unused_buf variants
Jason Wang [Tue, 30 Dec 2025 06:46:41 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for detach_unused_buf variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-12-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for disable_cb variants
Jason Wang [Tue, 30 Dec 2025 06:46:40 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for disable_cb variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-11-jasowang@redhat.com>

6 weeks agovirtio_ring: use vring_virtqueue for enable_cb_delayed variants
Jason Wang [Tue, 30 Dec 2025 06:46:39 +0000 (14:46 +0800)] 
virtio_ring: use vring_virtqueue for enable_cb_delayed variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-10-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for enable_cb_prepare variants
Jason Wang [Tue, 30 Dec 2025 06:46:38 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-9-jasowang@redhat.com>

6 weeks agovirtio: switch to use vring_virtqueue for virtqueue_get variants
Jason Wang [Tue, 30 Dec 2025 06:46:37 +0000 (14:46 +0800)] 
virtio: switch to use vring_virtqueue for virtqueue_get variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-8-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for virtqueue_add variants
Jason Wang [Tue, 30 Dec 2025 06:46:36 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for virtqueue_add variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-7-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variants
Jason Wang [Tue, 30 Dec 2025 06:46:35 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-6-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue for virtqueue resize variants
Jason Wang [Tue, 30 Dec 2025 06:46:34 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue for virtqueue resize variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-5-jasowang@redhat.com>

6 weeks agovirtio_ring: unify logic of virtqueue_poll() and more_used()
Jason Wang [Tue, 30 Dec 2025 06:46:33 +0000 (14:46 +0800)] 
virtio_ring: unify logic of virtqueue_poll() and more_used()

This patch unifies the logic of virtqueue_poll() and more_used() for
better code reusing and ease the future in order implementation.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-4-jasowang@redhat.com>

6 weeks agovirtio_ring: switch to use vring_virtqueue in virtqueue_poll variants
Jason Wang [Tue, 30 Dec 2025 06:46:32 +0000 (14:46 +0800)] 
virtio_ring: switch to use vring_virtqueue in virtqueue_poll variants

Those variants are used internally so let's switch to use
vring_virtqueue as parameter to be consistent with other internal
virtqueue helpers.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-3-jasowang@redhat.com>

6 weeks agovirtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()
Jason Wang [Tue, 30 Dec 2025 06:46:31 +0000 (14:46 +0800)] 
virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()

To be consistent with virtqueue_reset().

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251230064649.55597-2-jasowang@redhat.com>

7 weeks agovhost: use "checked" versions of get_user() and put_user()
Jon Kohler [Thu, 13 Nov 2025 00:55:28 +0000 (17:55 -0700)] 
vhost: use "checked" versions of get_user() and put_user()

vhost_get_user and vhost_put_user leverage __get_user and __put_user,
respectively, which were both added in 2016 by commit 6b1e6cc7855b
("vhost: new device IOTLB API"). In a heavy UDP transmit workload on a
vhost-net backed tap device, these functions showed up as ~11.6% of
samples in a flamegraph of the underlying vhost worker thread.

Quoting Linus from [1]:
    Anyway, every single __get_user() call I looked at looked like
    historical garbage. [...] End result: I get the feeling that we
    should just do a global search-and-replace of the __get_user/
    __put_user users, replace them with plain get_user/put_user instead,
    and then fix up any fallout (eg the coco code).

Switch to plain get_user/put_user in vhost, which results in a slight
throughput speedup. get_user now about ~8.4% of samples in flamegraph.

Basic iperf3 test on a Intel 5416S CPU with Ubuntu 25.10 guest:
TX: taskset -c 2 iperf3 -c <rx_ip> -t 60 -p 5200 -b 0 -u -i 5
RX: taskset -c 2 iperf3 -s -p 5200 -D
Before: 6.08 Gbits/sec
After:  6.32 Gbits/sec

As to what drives the speedup, Sean's patch [2] explains:
Use the normal, checked versions for get_user() and put_user() instead of
the double-underscore versions that omit range checks, as the checked
versions are actually measurably faster on modern CPUs (12%+ on Intel,
25%+ on AMD).

The performance hit on the unchecked versions is almost entirely due to
the added LFENCE on CPUs where LFENCE is serializing (which is effectively
all modern CPUs), which was added by commit 304ec1b05031 ("x86/uaccess:
Use __uaccess_begin_nospec() and uaccess_try_nospec").  The small
optimizations done by commit b19b74bc99b1 ("x86/mm: Rework address range
check in get_user() and put_user()") likely shave a few cycles off, but
the bulk of the extra latency comes from the LFENCE.

[1] https://lore.kernel.org/all/CAHk-=wiJiDSPZJTV7z3Q-u4DfLgQTNWqUqqrwSBHp0+Dh016FA@mail.gmail.com/
[2] https://lore.kernel.org/all/20251106210206.221558-1-seanjc@google.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Message-Id: <20251113005529.2494066-1-jon@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agovirtio_ring: code cleanup in detach_buf_split
zhangdongchuan@eswincomputing.com [Wed, 26 Nov 2025 03:40:16 +0000 (11:40 +0800)] 
virtio_ring: code cleanup in detach_buf_split

Since the return value of vring_unmap_one_split() is exactly
vq->split.desc_extra[i].next, 'i = vq->split.desc_extra[i].next' is
redundant. Assign vring_unmap_one_split() to i instead.

Since vq->split.desc_extra is assigned to extra, use extra[i].next
instead of vq->split.desc_extra[i].next to improve readability.

No change in functionality.

Signed-off-by: zhangdongchuan <zhangdongchuan@eswincomputing.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <202511261140162936986@eswincomputing.com>

7 weeks agovirtio: uapi: avoid usage of libc types
Thomas Weißschuh [Mon, 22 Dec 2025 08:00:33 +0000 (09:00 +0100)] 
virtio: uapi: avoid usage of libc types

Using libc types and headers from the UAPI headers is problematic as it
introduces a dependency on a full C toolchain.

On Linux 'unsigned long' works as a replacement for 'uintptr_t' and does
not depend on libc.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251222-uapi-virtio-v1-1-29390f87bcad@linutronix.de>

7 weeks agovhost/vsock: improve RCU read sections around vhost_vsock_get()
Stefano Garzarella [Wed, 26 Nov 2025 13:38:26 +0000 (14:38 +0100)] 
vhost/vsock: improve RCU read sections around vhost_vsock_get()

vhost_vsock_get() uses hash_for_each_possible_rcu() to find the
`vhost_vsock` associated with the `guest_cid`. hash_for_each_possible_rcu()
should only be called within an RCU read section, as mentioned in the
following comment in include/linux/rculist.h:

/**
 * hlist_for_each_entry_rcu - iterate over rcu list of given type
 * @pos: the type * to use as a loop cursor.
 * @head: the head for your list.
 * @member: the name of the hlist_node within the struct.
 * @cond: optional lockdep expression if called from non-RCU protection.
 *
 * This list-traversal primitive may safely run concurrently with
 * the _rcu list-mutation primitives such as hlist_add_head_rcu()
 * as long as the traversal is guarded by rcu_read_lock().
 */

Currently, all calls to vhost_vsock_get() are between rcu_read_lock()
and rcu_read_unlock() except for calls in vhost_vsock_set_cid() and
vhost_vsock_reset_orphans(). In both cases, the current code is safe,
but we can make improvements to make it more robust.

About vhost_vsock_set_cid(), when building the kernel with
CONFIG_PROVE_RCU_LIST enabled, we get the following RCU warning when the
user space issues `ioctl(dev, VHOST_VSOCK_SET_GUEST_CID, ...)` :

  WARNING: suspicious RCU usage
  6.18.0-rc7 #62 Not tainted
  -----------------------------
  drivers/vhost/vsock.c:74 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by rpc-libvirtd/3443:
   #0: ffffffffc05032a8 (vhost_vsock_mutex){+.+.}-{4:4}, at: vhost_vsock_dev_ioctl+0x2ff/0x530 [vhost_vsock]

  stack backtrace:
  CPU: 2 UID: 0 PID: 3443 Comm: rpc-libvirtd Not tainted 6.18.0-rc7 #62 PREEMPT(none)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-7.fc42 06/10/2025
  Call Trace:
   <TASK>
   dump_stack_lvl+0x75/0xb0
   dump_stack+0x14/0x1a
   lockdep_rcu_suspicious.cold+0x4e/0x97
   vhost_vsock_get+0x8f/0xa0 [vhost_vsock]
   vhost_vsock_dev_ioctl+0x307/0x530 [vhost_vsock]
   __x64_sys_ioctl+0x4f2/0xa00
   x64_sys_call+0xed0/0x1da0
   do_syscall_64+0x73/0xfa0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   ...
   </TASK>

This is not a real problem, because the vhost_vsock_get() caller, i.e.
vhost_vsock_set_cid(), holds the `vhost_vsock_mutex` used by the hash
table writers. Anyway, to prevent that warning, add lockdep_is_held()
condition to hash_for_each_possible_rcu() to verify that either the
caller is in an RCU read section or `vhost_vsock_mutex` is held when
CONFIG_PROVE_RCU_LIST is enabled; and also clarify the comment for
vhost_vsock_get() to better describe the locking requirements and the
scope of the returned pointer validity.

About vhost_vsock_reset_orphans(), currently this function is only
called via vsock_for_each_connected_socket(), which holds the
`vsock_table_lock` spinlock (which is also an RCU read-side critical
section). However, add an explicit RCU read lock there to make the code
more robust and explicit about the RCU requirements, and to prevent
issues if the calling context changes in the future or if
vhost_vsock_reset_orphans() is called from other contexts.

Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers")
Cc: stefanha@redhat.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20251126133826.142496-1-sgarzare@redhat.com>
Message-ID: <20251126210313.GA499503@fedora>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: add device, device_driver stubs
Michael S. Tsirkin [Thu, 4 Dec 2025 18:37:07 +0000 (13:37 -0500)] 
tools/virtio: add device, device_driver stubs

Add stubs needed by virtio.h

Message-ID: <0fabf13f6ea812ebc73b1c919fb17d4dec1545db.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: fix up oot build
Michael S. Tsirkin [Thu, 4 Dec 2025 17:55:11 +0000 (12:55 -0500)] 
tools/virtio: fix up oot build

oot build tends to help uncover bugs so it's worth keeping around,
as long as it's low effort.
add stubs for a couple of macros virtio gained recently,
and disable vdpa in the test build.

Message-ID: <33968faa7994b86d1f78057358a50b8f460c7a23.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agovirtio_features: make it self-contained
Michael S. Tsirkin [Thu, 4 Dec 2025 17:49:34 +0000 (12:49 -0500)] 
virtio_features: make it self-contained

virtio_features.h uses WARN_ON_ONCE and memset so it must
include linux/bug.h and linux/string.h

Message-ID: <579986aa9b8d023844990d2a0e267382f8ad85d5.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: switch to kernel's virtio_config.h
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:38 +0000 (12:22 -0500)] 
tools/virtio: switch to kernel's virtio_config.h

Drops stubs in virtio_config.h, use the kernel's version instead - we
are now activly developing it, so the stub became too hard to maintain.

Message-ID: <8e5c85dc8aad001f161f7e2d8799ffbccfc31381.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: stub might_sleep and synchronize_rcu
Michael S. Tsirkin [Thu, 4 Dec 2025 17:25:17 +0000 (12:25 -0500)] 
tools/virtio: stub might_sleep and synchronize_rcu

Add might_sleep() and synchronize_rcu() stubs needed by virtio_config.h.

might_sleep() is a no-op, synchronize_rcu doesn't work but we don't
need it to.

Created using Cursor CLI.

Message-ID: <5557e026335d808acd7b890693ee1382e73dd33a.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: add struct cpumask to cpumask.h
Michael S. Tsirkin [Thu, 4 Dec 2025 17:25:15 +0000 (12:25 -0500)] 
tools/virtio: add struct cpumask to cpumask.h

Add struct cpumask stub used by virtio_config.h.

Created using Cursor CLI.

Message-ID: <eacf56399ba220513ebcd610f4a5115dc768db80.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: pass KCFLAGS to module build
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:43 +0000 (12:22 -0500)] 
tools/virtio: pass KCFLAGS to module build

Update the mod target to pass KCFLAGS with the in-tree vhost driver
include path. This way vhost_test can find vhost headers.

Created using Cursor CLI.

Message-ID: <5473e5a5dfd2fcd261a778f2017cac669c031f23.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: add ucopysize.h stub
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:40 +0000 (12:22 -0500)] 
tools/virtio: add ucopysize.h stub

Add ucopysize.h with stub implementations of check_object_size,
copy_overflow, and check_copy_size.

Created using Cursor CLI.

Message-ID: <5046df90002bb744609248404b81d33b559fe813.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: add dev_WARN_ONCE and is_vmalloc_addr stubs
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:36 +0000 (12:22 -0500)] 
tools/virtio: add dev_WARN_ONCE and is_vmalloc_addr stubs

Add dev_WARN_ONCE and is_vmalloc_addr stubs needed by virtio_ring.c.
is_vmalloc_addr stub always returns false - that's fine since it's
merely a sanity check.

Created using Cursor CLI.

Message-ID: <749e7a03b7cd56baf50a27efc3b05e50cf8f36b6.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: stub DMA mapping functions
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:34 +0000 (12:22 -0500)] 
tools/virtio: stub DMA mapping functions

Add dma_map_page_attrs and dma_unmap_page_attrs stubs.
Follow the same pattern as existing DMA mapping stubs.

Created using Cursor CLI.

Message-ID: <3512df1fe0e2129ea493434a21c940c50381cc93.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: add struct module forward declaration
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:32 +0000 (12:22 -0500)] 
tools/virtio: add struct module forward declaration

Declarate struct module in our linux/module.h stub.

Created using Cursor CLI.

Message-ID: <c01b8d24159664cc8c49354088efa342ae9e7321.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: use kernel's virtio.h
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:31 +0000 (12:22 -0500)] 
tools/virtio: use kernel's virtio.h

Replace virtio stubs with an include of the kernel header.

Message-ID: <33daf1033fc447eb8e3e54d21013ccfd99550e37.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agovirtio: make it self-contained
Michael S. Tsirkin [Thu, 4 Dec 2025 18:31:52 +0000 (13:31 -0500)] 
virtio: make it self-contained

virtio.h uses struct module, add a forward declaration to
make the header self-contained.

Message-ID: <9171b5cac60793eb59ab044c96ee038bf1363bee.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 weeks agotools/virtio: fix up compiler.h stub
Michael S. Tsirkin [Thu, 4 Dec 2025 17:22:28 +0000 (12:22 -0500)] 
tools/virtio: fix up compiler.h stub

Add #undef __user before and after including compiler_types.h to avoid
redefinition warnings when compiling with system headers that also
define __user. This allows tools/virtio to build without warnings.

Additionally, stub out __must_check

Created using Cursor CLI.

Message-ID: <56424ce95c72cb4957070a7cd3c3c40ad5addaee.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
8 weeks agoLinux 6.19-rc2 v6.19-rc2
Linus Torvalds [Sun, 21 Dec 2025 23:52:04 +0000 (15:52 -0800)] 
Linux 6.19-rc2

8 weeks agoMerge tag 'coccinelle-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall...
Linus Torvalds [Sun, 21 Dec 2025 23:28:59 +0000 (15:28 -0800)] 
Merge tag 'coccinelle-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull Coccinelle fixes from Julia Lawall:
 "These fix a typo and make the coccicheck script more robust by
  ensuring that only compatible semantic patches are executed for the
  chosen mode"

* tag 'coccinelle-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Coccinelle: pm_runtime: Fix typo in report message
  scripts: coccicheck: filter *.cocci files by MODE

8 weeks agoMerge tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Dec 2025 23:21:10 +0000 (15:21 -0800)] 
Merge tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - a quirk for i8042 to better handle another TUXEDO model

 - a quirk to atkbd to handle incorcet behavior of HONOR FMB-P internal
   keyboard

 - a definition for a new ABS_SND_PROFILE event

 - fixes to alps and lkkbd drivers to reliably shut down pending work on
   removal

 - a fix to apple_z2 driver tightening input report parsing

 - a fix for "off-by-one" error when validating config in ti_am335x_tsc
   driver

 - addition of CRKD Guitars device IDs to xpad driver.

* tag 'input-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ti_am335x_tsc - fix off-by-one error in wire_order validation
  Input: xpad - add support for CRKD Guitars
  Input: add ABS_SND_PROFILE
  Input: apple_z2 - fix reading incorrect reports after exiting sleep
  Input: alps - fix use-after-free bugs caused by dev3_register_work
  Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table
  Input: atkbd - skip deactivate for HONOR FMB-P's internal keyboard
  Input: lkkbd - disable pending work before freeing device

8 weeks agoMerge tag 'i2c-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 21 Dec 2025 23:05:47 +0000 (15:05 -0800)] 
Merge tag 'i2c-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - bcm, pxa, rcar: fix void-pointer-to-enum-cast warning

 - new hardware IDs / DT bindings for
    - Intel Nova Lake-S
    - Mobileye
    - Qualcomm SM8750

* tag 'i2c-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  dt-bindings: i2c: qcom-cci: Document SM8750 compatible
  i2c: i801: Add support for Intel Nova Lake-S
  dt-bindings: i2c: dw: Add Mobileye I2C controllers
  i2c: rcar: Fix Wvoid-pointer-to-enum-cast warning
  i2c: pxa: Fix Wvoid-pointer-to-enum-cast warning
  i2c: bcm-iproc: Fix Wvoid-pointer-to-enum-cast warning

8 weeks agoMerge tag 'x86-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Dec 2025 22:41:29 +0000 (14:41 -0800)] 
Merge tag 'x86-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Fix FPU core dumps on certain CPU models

 - Fix htmldocs build warning

 - Export TLB tracing event name via header

 - Remove unused constant from <linux/mm_types.h>

 - Fix comments

 - Fix whitespace noise in documentation

 - Fix variadic structure's definition to un-confuse UBSAN

 - Fix posted MSI interrupts irq_retrigger() bug

 - Fix asm build failure with older GCC builds

* tag 'x86-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bug: Fix old GCC compile fails
  x86/msi: Make irq_retrigger() functional for posted MSI
  x86/platform/uv: Fix UBSAN array-index-out-of-bounds
  mm: Remove tlb_flush_reason::NR_TLB_FLUSH_REASONS from <linux/mm_types.h>
  x86/mm/tlb/trace: Export the TLB_REMOTE_WRONG_CPU enum in <trace/events/tlb.h>
  x86/sgx: Remove unmatched quote in __sgx_encl_extend function comment
  x86/boot/Documentation: Fix whitespace noise in boot.rst
  x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeatures
  x86/boot/Documentation: Fix htmldocs build warning due to malformed table in boot.rst

8 weeks agoMerge tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Dec 2025 22:34:13 +0000 (14:34 -0800)] 
Merge tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix IRQ thread affinity flags setup regression"

* tag 'irq-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Don't overwrite interrupt thread flags on setup

8 weeks agoCoccinelle: pm_runtime: Fix typo in report message
Thorsten Blum [Sat, 22 Nov 2025 11:48:04 +0000 (12:48 +0100)] 
Coccinelle: pm_runtime: Fix typo in report message

s/Unecessary/Unnecessary/

Reviewed-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
8 weeks agoscripts: coccicheck: filter *.cocci files by MODE
Songwei Chai [Fri, 6 Jun 2025 06:09:36 +0000 (14:09 +0800)] 
scripts: coccicheck: filter *.cocci files by MODE

Enhance the coccicheck script to filter *.cocci files based on the
specified MODE (e.g., report, patch). This ensures that only compatible
semantic patch files are executed, preventing errors such as:

    "virtual rule report not supported"

This error occurs when a .cocci file does not define a 'virtual <MODE>'
rule, yet is executed in that mode.

For example:

    make coccicheck M=drivers/hwtracing/coresight/ MODE=report

In this case, running "secs_to_jiffies.cocci" would trigger the error
because it lacks support for 'report' mode. With this change, such files
are skipped automatically, improving robustness and developer
experience.

Signed-off-by: Songwei Chai <quic_songchai@quicinc.com>
Reviewed-by: Julia Lawall <Julia.Lawall@inria.fr>
8 weeks agoMerge tag 'ata-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Sun, 21 Dec 2025 06:58:14 +0000 (22:58 -0800)] 
Merge tag 'ata-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fix from Damien Le Moal:

 - Disable link power management (LPM) for a Seagate drive that is
   misbehaving when LPM is enabled

* tag 'ata-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: Disable LPM on ST2000DM008-2FR102

8 weeks agoMerge tag 'spi-fix-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Sun, 21 Dec 2025 00:54:42 +0000 (16:54 -0800)] 
Merge tag 'spi-fix-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small collection of fixes for various SPI drivers, plus a relaxation
  of constraints in the DT for the DesignWare controller to reflect
  hardware that's been seen.

  There's several fixes for the Cadence QuadSPI driver since a fix
  during the last release made some existing issues with error handling
  during probe more readily visible"

* tag 'spi-fix-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: mt65xx: Use IRQF_ONESHOT with threaded IRQ
  spi: dt-bindings: snps,dw-abp-ssi: Allow up to 16 chip-selects
  spi: cadence-quadspi: Fix clock disable on probe failure path
  spi: cadence-quadspi: Add error logging for DMA request failure
  spi: fsl-cpm: Check length parity before switching to 16 bit mode
  spi: mpfs: Fix an error handling path in mpfs_spi_probe()

8 weeks agox86/irqflags: Use ASM_OUTPUT_RM in native_save_fl()
Eric Dumazet [Fri, 19 Dec 2025 11:20:07 +0000 (11:20 +0000)] 
x86/irqflags: Use ASM_OUTPUT_RM in native_save_fl()

clang is generating very inefficient code for native_save_fl() which is
used for local_irq_save() in critical spots.

Allowing the "pop %0" to use memory:

 1) forces the compiler to add annoying stack canaries when
    CONFIG_STACKPROTECTOR_STRONG=y in many places.

 2) Almost always is followed by an immediate "move memory,register"

One good example is _raw_spin_lock_irqsave, with 8 extra instructions

  ffffffff82067a30 <_raw_spin_lock_irqsave>:
  ffffffff82067a30: ...
  ffffffff82067a39: 53 push   %rbx

  // Three instructions to ajust the stack, read the per-cpu canary
  // and copy it to 8(%rsp)
  ffffffff82067a3a: 48 83 ec 10  sub    $0x10,%rsp
  ffffffff82067a3e: 65 48 8b 05 da 15 45 02 mov    %gs:0x24515da(%rip),%rax     # <__stack_chk_guard>
  ffffffff82067a46: 48 89 44 24 08 mov    %rax,0x8(%rsp)

  ffffffff82067a4b: 9c pushf

  // instead of pop %rbx, compiler uses 2 instructions.
  ffffffff82067a4c: 8f 04 24 pop    (%rsp)
  ffffffff82067a4f: 48 8b 1c 24  mov    (%rsp),%rbx

  ffffffff82067a53: fa cli
  ffffffff82067a54: b9 01 00 00 00 mov    $0x1,%ecx
  ffffffff82067a59: 31 c0 xor    %eax,%eax
  ffffffff82067a5b: f0 0f b1 0f  lock cmpxchg %ecx,(%rdi)
  ffffffff82067a5f: 75 1d jne    ffffffff82067a7e <_raw_spin_lock_irqsave+0x4e>

  // three instructions to check the stack canary
  ffffffff82067a61: 65 48 8b 05 b7 15 45 02 mov    %gs:0x24515b7(%rip),%rax     # <__stack_chk_guard>
  ffffffff82067a69: 48 3b 44 24 08 cmp    0x8(%rsp),%rax
  ffffffff82067a6e: 75 17 jne    ffffffff82067a87

  ...

  // One extra instruction to adjust the stack.
  ffffffff82067a73: 48 83 c4 10  add    $0x10,%rsp
  ...

  // One more instruction in case the stack was mangled.
  ffffffff82067a87: e8 a4 35 ff ff call   ffffffff8205b030 <__stack_chk_fail>

This patch changes nothing for gcc, but for clang saves ~20000 bytes of text
even though more functions are inlined.

  $ size vmlinux.gcc.before vmlinux.gcc.after vmlinux.clang.before vmlinux.clang.after
     text    data bss dec hex filename
  45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.before
  45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.after
  45121072 24638617 5533040 75292729 47ce039 vmlinux.clang.before
  45093887 24638633 5536808 75269328 47c84d0 vmlinux.clang.after

  $ scripts/bloat-o-meter -t vmlinux.clang.before vmlinux.clang.after
  add/remove: 1/2 grow/shrink: 21/533 up/down: 2250/-22112 (-19862)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 weeks agoclang: work around asm output constraint problems
Eric Dumazet [Fri, 19 Dec 2025 11:20:06 +0000 (11:20 +0000)] 
clang: work around asm output constraint problems

Work around clang problems with "=rm" asm constraint.

clang seems to always chose the memory output, while it is almost
always the worst choice.

Add ASM_OUTPUT_RM so that we can replace "=rm" constraint
where it matters for clang, while not penalizing gcc.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 weeks agoMerge tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sat, 20 Dec 2025 20:45:35 +0000 (12:45 -0800)] 
Merge tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:
 "This contains a few fixes for zoned devices support, an UAF and a
  compiler warning, and some cleaning up"

* tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix the zoned RT growfs check for zone alignment
  xfs: validate that zoned RT devices are zone aligned
  xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system
  xfs: fix a memory leak in xfs_buf_item_init()
  xfs: fix stupid compiler warning
  xfs: fix a UAF problem in xattr repair
  xfs: ignore discard return value

8 weeks agoMerge tag 'hwmon-for-v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Dec 2025 20:22:53 +0000 (12:22 -0800)] 
Merge tag 'hwmon-for-v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - ltc4282: Fix reset_history file permissions

 - ds620: Update broken Datasheet URL in driver documentation

 - tmp401: Fix overflow caused by default conversion rate value

 - ibmpex: Fix use-after-free in high/low store

 - dell-smm: Limit fan multiplier to avoid overflow

* tag 'hwmon-for-v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (ltc4282): Fix reset_history file permissions
  hwmon: (DS620) Update broken Datasheet URL in driver documentation
  hwmon: (tmp401) fix overflow caused by default conversion rate value
  hwmon: (ibmpex) fix use-after-free in high/low store
  hwmon: (dell-smm) Limit fan multiplier to avoid overflow

8 weeks agoMerge tag 'mmc-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Sat, 20 Dec 2025 20:18:32 +0000 (12:18 -0800)] 
Merge tag 'mmc-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

 - sdhci-esdhc-imx: Fix build problem dependency

 - sdhci-of-arasan: Increase card-detect stable timeout to 2 seconds

 - sdhci-of-aspeed: Fix DT doc for missing properties

* tag 'mmc-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-esdhc-imx: add alternate ARCH_S32 dependency to Kconfig
  mmc: sdhci-of-arasan: Increase CD stable timeout to 2 seconds
  dt-bindings: mmc: sdhci-of-aspeed: Switch ref to sdhci-common.yaml

8 weeks agoMerge tag 'drm-fixes-2025-12-20' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Sat, 20 Dec 2025 20:08:02 +0000 (12:08 -0800)] 
Merge tag 'drm-fixes-2025-12-20' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "rc2 fixes for the week, mostly xe, with amdgpu as usual. Then a
  smattering of small fixes across the core/tests/panel and amdxdna.

  I expect things will be quiet for rc3/4 as teams take a break, and I'm
  travelling but will keep an eye on things.

  core:
   - fix gem handle leak on DRM_IOCTL_GEM_CHANGE_HANDLE

  tests:
   - add EDEADLK handling

  amdgpu:
   - Fix no_console_suspend handling
   - DCN 3.5.x seamless boot fixes
   - DP audio fix
   - Fix race in GPU recovery
   - SMU 14 OD fix

  amdkfd:
   - Event fix

  xe:
   - Limit num_syncs to prevent oversized kernel allocations
   - Disallow 0 OA property values
   - Disallow 0 EU stall property values
   - Fix kobject leak
   - Workaround
   - Loop variable reference fix
   - Fix a CONFIG corner-case incorrect number of argument
   - Skip reason prefix while emitting array
   - VF migration fix
   - Fix context in mei interrupt top half
   - Don't include the CCS metadata in the dma-buf sg-table
   - VF queueing recovery work fix
   - Increase TDF timeout
   - GT reset registers vs scheduler ordering fix
   - Adjust long-running workload timeslices
   - Always set OA_OAGLBCTXCTRL_COUNTER_RESUME
   - Fix a return value
   - Drop preempt-fences when destroying imported dma-bufs
   - Use usleep_range for accurate long-running workload timeslicing

  amdxdna:
   - don't load virtualized

  panel:
   - fix visionox-rm69299 Kconfig dependency
   - sony-td4353-jdi probing fix"

* tag 'drm-fixes-2025-12-20' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
  drm/xe: Use usleep_range for accurate long-running workload timeslicing
  drm/xe: Drop preempt-fences when destroying imported dma-bufs.
  drm/xe/eustall: Disallow 0 EU stall property values
  drm/xe/oa: Disallow 0 OA property values
  drm/xe/xe_sriov_vfio: Fix return value in xe_sriov_vfio_migration_supported()
  drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME
  drm/xe: Adjust long-running workload timeslices to reasonable values
  drm/xe/oa: Limit num_syncs to prevent oversized allocations
  drm/xe: Limit num_syncs to prevent oversized allocations
  drm/amdkfd: Fix improper NULL termination of queue restore SMI event string
  drm/amd/pm: restore SCLK settings after S0ix resume
  drm/amdgpu: fix a job->pasid access race in gpu recovery
  drm/amd/display: Fix DP no audio issue
  drm/amd/display: Fix scratch registers offsets for DCN351
  drm/amd/display: Fix scratch registers offsets for DCN35
  drm/amd: Resume the device in thaw() callback when console suspend is disabled
  drm/panel: visionox-rm69299: Depend on BACKLIGHT_CLASS_DEVICE
  accel/amdxdna: Block running under a hypervisor
  drm/panel: sony-td4353-jdi: Enable prepare_prev_first
  drm/xe: Restore engine registers before restarting schedulers after GT reset
  ...

8 weeks agoMerge tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 20 Dec 2025 19:59:06 +0000 (11:59 -0800)] 
Merge tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "Drop unused parameter from kunit_device_register_internal and make
  FAULT_TEST default to n when PANIC_ON_OOPS"

* tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: make FAULT_TEST default to n when PANIC_ON_OOPS
  kunit: Drop unused parameter from kunit_device_register_internal

8 weeks agoMerge tag 'devicetree-fixes-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Dec 2025 19:49:49 +0000 (11:49 -0800)] 
Merge tag 'devicetree-fixes-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Fix warnings for Mediatek overlays not getting applied

 - Fix regression in handling elfcorehdr region

 - Fix creating cpufreq device on OPPv1 platforms

 - Add GE7800 GPU in Renesas R-Car V3U

 - Simplify dma-coherent property in TI display bindings

 - Allow "reg" in sprd,sc9860-clk binding

 - Update Linus Walleij's email

* tag 'devicetree-fixes-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  arm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time
  arm64: dts: mediatek: mt7988: add dtbs with applied overlays for bpi-r4 (pro)
  arm64: dts: mediatek: mt7986: add dtbs with applied overlays for bpi-r3
  dt-bindings: Updates Linus Walleij's mail address
  dt-bindings: gpu: img,powervr-rogue: Document GE7800 GPU in Renesas R-Car V3U
  cpufreq: dt-platdev: Fix creating device on OPPv1 platforms
  dt-bindings: clock: sprd,sc9860-clk: Allow "reg" for gate clocks
  dt-bindings: display/ti: Simplify dma-coherent property
  arm64: kdump: Fix elfcorehdr overlap caused by reserved memory processing reorder

8 weeks agoMerge tag 'mips-fixes_6.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Sat, 20 Dec 2025 19:40:51 +0000 (11:40 -0800)] 
Merge tag 'mips-fixes_6.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix build error for Alchemy

 - Fix reference leak

* tag 'mips-fixes_6.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Fix a reference leak bug in ip22_check_gio()
  MIPS: Alchemy: Remove bogus static/inline specifiers

8 weeks agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 20 Dec 2025 19:34:37 +0000 (11:34 -0800)] 
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "Two left-over updates that could not go into -rc1 due to conflicts
  with other series:

   - Simplify checks in arch_kfence_init_pool() since
     force_pte_mapping() already takes BBML2-noabort (break-before-make
     Level 2 with no aborts generated) into account

   - Remove unneeded SVE/SME fallback preserve/store handling in the
     arm64 EFI. With the recent updates, the fallback path is only taken
     for EFI runtime calls from hardirq or NMI contexts. In practice,
     this only happens under panic/oops/emergency_restart() and no
     restoring of the user state expected.

     There's a corresponding lkdtm update to trigger a BUG() or panic()
     from hardirq context together with a fixup not to confuse
     clang/objtool about the control flow

  GCS (guarded control stacks) fix: flush the GCS locking state on exec,
  otherwise the new task will not be able to enable GCS (locked as
  disabled)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  lkdtm/bugs: Do not confuse the clang/objtool with busy wait loop
  arm64/gcs: Flush the GCS locking state on exec
  arm64/efi: Remove unneeded SVE/SME fallback preserve/store handling
  lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context
  arm64: mm: Simplify check in arch_kfence_init_pool()

8 weeks agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 20 Dec 2025 19:31:37 +0000 (11:31 -0800)] 
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull x86 kvm fixes from Paolo Bonzini:
 "x86 fixes.  Everyone else is already in holiday mood apparently.

   - Add a missing 'break' to fix param parsing in the rseq selftest

   - Apply runtime updates to the _current_ CPUID when userspace is
     setting CPUID, e.g. as part of vCPU hotplug, to fix a false
     positive and to avoid dropping the pending update

   - Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as
     it's not supported by KVM and leads to a use-after-free due to KVM
     failing to unbind the memslot from the previously-associated
     guest_memfd instance

   - Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for
     supporting flags-only changes on KVM_MEM_GUEST_MEMFD memlslots,
     e.g. for dirty logging

   - Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
     SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is
     defined as -1ull (a 64-bit value)

   - Update SVI when activating APICv to fix a bug where a
     post-activation EOI for an in-service IRQ would effective be lost
     due to SVI being stale

   - Immediately refresh APICv controls (if necessary) on a nested
     VM-Exit instead of deferring the update via KVM_REQ_APICV_UPDATE,
     as the request is effectively ignored because KVM thinks the vCPU
     already has the correct APICv settings"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-Exit
  KVM: VMX: Update SVI during runtime APICv activation
  KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN)
  KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits
  KVM: Harden and prepare for modifying existing guest_memfd memslots
  KVM: Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot
  KVM: selftests: Add a CPUID testcase for KVM_SET_CPUID2 with runtime updates
  KVM: x86: Apply runtime updates to current CPUID during KVM_SET_CPUID{,2}
  KVM: selftests: Add missing "break" in rseq_test's param parsing

8 weeks agoMerge tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Dec 2025 19:29:32 +0000 (11:29 -0800)] 
Merge tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "Just a single patch fixing a sparse warning"

* tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: Fix sparse warning in enlighten_pv.c

8 weeks agoMerge tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka...
Linus Torvalds [Sat, 20 Dec 2025 19:24:42 +0000 (11:24 -0800)] 
Merge tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:

 - A stable fix for a missing tag reset that can happen in
   kfree_nolock() with KASAN+SLUB_TINY configs (Deepanshu Kartikey)

* tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: reset KASAN tag in defer_free() before accessing freed memory

8 weeks agoMerge tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Dec 2025 19:18:32 +0000 (11:18 -0800)] 
Merge tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:

 - iommupt: Fix an oops found by syzcaller in the new generic
   IO-page-table code.

 - AMD-Vi: Fix IO_PAGE_FAULTs in kdump kernels triggered by re-using
   domain-ids from previous kernel.

* tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  amd/iommu: Make protection domain ID functions non-static
  amd/iommu: Preserve domain ids inside the kdump kernel
  iommupt: Return ERR_PTR from _table_alloc()

8 weeks agoMerge tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Dec 2025 17:48:56 +0000 (09:48 -0800)] 
Merge tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - ublk selftests for missing coverage

 - two fixes for the block integrity code

 - fix for the newly added newly added PR read keys ioctl, limiting the
   memory that can be allocated

 - work around for a deadlock that can occur with ublk, where partition
   scanning ends up recursing back into file closure, which needs the
   same mutex grabbed. Not the prettiest thing in the world, but an
   acceptable work-around until we can eliminate the reliance on
   disk->open_mutex for this

 - fix for a race between enabling writeback throttling and new IO
   submissions

 - move a bit of bio flag handling code. No changes, but needed for a
   patchset for a future kernel

 - fix for an init time id leak failure in rnbd

 - loop/zloop state check fix

* tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block: validate interval_exp integrity limit
  block: validate pi_offset integrity limit
  block: rnbd-clt: Fix leaked ID in init_dev()
  ublk: fix deadlock when reading partition table
  block: add allocation size check in blkdev_pr_read_keys()
  Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
  zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
  loop: use READ_ONCE() to read lo->lo_state without locking
  block: fix race between wbt_enable_default and IO submission
  selftests: ublk: add user copy test cases
  selftests: ublk: add support for user copy to kublk
  selftests: ublk: forbid multiple data copy modes
  selftests: ublk: don't share backing files between ublk servers
  selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04
  selftests: ublk: fix fio arguments in run_io_and_recover()
  selftests: ublk: remove unused ios map in seq_io.bt
  selftests: ublk: correct last_rw map type in seq_io.bt
  selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
  block: move around bio flagging helpers

8 weeks agoMerge tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Dec 2025 17:38:56 +0000 (09:38 -0800)] 
Merge tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fix from Jens Axboe:
 "Just a single fix this week, for an issue with the calculation of the
  number of segments in the ublk kbuf import path"

* tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring: fix nr_segs calculation in io_import_kbuf

8 weeks agohwmon: (ltc4282): Fix reset_history file permissions
Nuno Sá [Fri, 19 Dec 2025 16:11:05 +0000 (16:11 +0000)] 
hwmon: (ltc4282): Fix reset_history file permissions

The reset_history attributes are write only. Hence don't report them as
readable just to return -EOPNOTSUPP later on.

Fixes: cbc29538dbf7 ("hwmon: Add driver for LTC4282")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20251219-ltc4282-fix-reset-history-v1-1-8eab974c124b@analog.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
8 weeks agoarm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time
Rob Herring (Arm) [Fri, 5 Dec 2025 21:59:38 +0000 (22:59 +0100)] 
arm64: dts: mediatek: Apply mt8395-radxa DT overlay at build time

It's a requirement that DT overlays be applied at build time in order to
validate them as overlays are not validated on their own.

Add missing target for mt8395-radxa hd panel overlay.

Fixes: 4c8ff61199a7 ("arm64: dts: mediatek: mt8395-radxa-nio-12l: Add Radxa 8 HD panel")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: AngeloGioacchino Del Regno <angelogiaocchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251205215940.19287-1-linux@fw-web.de
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
8 weeks agoarm64: dts: mediatek: mt7988: add dtbs with applied overlays for bpi-r4 (pro)
Frank Wunderlich [Wed, 19 Nov 2025 17:51:23 +0000 (18:51 +0100)] 
arm64: dts: mediatek: mt7988: add dtbs with applied overlays for bpi-r4 (pro)

Build devicetree binaries for testing overlays and providing users
full dtb without using overlays for Bananapi R4 (pro) variants.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://patch.msgid.link/20251119175124.48947-3-linux@fw-web.de
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
8 weeks agoarm64: dts: mediatek: mt7986: add dtbs with applied overlays for bpi-r3
Frank Wunderlich [Wed, 19 Nov 2025 17:51:22 +0000 (18:51 +0100)] 
arm64: dts: mediatek: mt7986: add dtbs with applied overlays for bpi-r3

Build devicetree binaries for testing overlays and providing users
full dtb without using overlays.

Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Fixes: a58c36806741 ("arm64: dts: mediatek: mt7988a-bpi-r4pro: Add mmc overlays")
Fixes: dec929e61a42 ("arm64: dts: mediatek: mt7988a-bpi-r4-pro: Add PCIe overlays")
Fixes: 714a80ced07a ("arm64: dts: mediatek: mt7988a-bpi-r4: Add dt overlays for sd + emmc")
Fixes: 312189ebb802 ("arm64: dts: mt7986: add overlay for SATA power socket on BPI-R3")
Fixes: 8e01fb15b815 ("arm64: dts: mt7986: add Bananapi R3")
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20251119175124.48947-2-linux@fw-web.de
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>