Eric Biggers [Mon, 2 Mar 2026 07:59:39 +0000 (23:59 -0800)]
nvme-auth: add NVME_AUTH_MAX_DIGEST_SIZE constant
Define a NVME_AUTH_MAX_DIGEST_SIZE constant and use it in the
appropriate places.
Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
Vasily Gorbik [Sun, 22 Mar 2026 02:35:10 +0000 (03:35 +0100)]
block: fix bio_alloc_bioset slowpath GFP handling
bio_alloc_bioset() first strips __GFP_DIRECT_RECLAIM from the optimistic
fast allocation attempt with try_alloc_gfp(). If that fast path fails,
the slowpath checks saved_gfp to decide whether blocking allocation is
allowed, but then still calls mempool_alloc() with the stripped gfp mask.
That can lead to a NULL bio pointer being passed into bio_init().
Fix the slowpath by using saved_gfp for the bio and bvec mempool
allocations.
Fixes: b520c4eef83d ("block: split bio_alloc_bioset more clearly into a fast and slowpath") Reported-by: syzbot+09ddb593eea76a158f42@syzkaller.appspotmail.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/p01.gc6e9ad5845ad.ttca29g@ub.hpns Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Wed, 18 Mar 2026 01:41:12 +0000 (09:41 +0800)]
ublk: move cold paths out of __ublk_batch_dispatch() for icache efficiency
Mark ublk_filter_unused_tags() as noinline since it is only called from
the unlikely(needs_filter) branch. Extract the error-handling block from
__ublk_batch_dispatch() into a new noinline ublk_batch_dispatch_fail()
function to keep the hot path compact and icache-friendly. This also
makes __ublk_batch_dispatch() more readable by separating the error
recovery logic from the normal dispatch flow.
Before: __ublk_batch_dispatch is ~1419 bytes
After: __ublk_batch_dispatch is ~1090 bytes (-329 bytes, -23%)
Jens Axboe [Sun, 22 Mar 2026 19:37:45 +0000 (13:37 -0600)]
Merge tag 'md-7.1-20260323' of git://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into for-7.1/block
Pull MD changes from Yu Kuia:
"Bug Fixes:
- md: suppress spurious superblock update error message for dm-raid
(Chen Cheng)
- md/raid1: fix the comparing region of interval tree (Xiao Ni)
- md/raid10: fix deadlock with check operation and nowait requests
(Josh Hunt)
- md/raid5: skip 2-failure compute when other disk is R5_LOCKED
(FengWei Shih)
- md/md-llbitmap: raise barrier before state machine transition
(Yu Kuai)
- md/md-llbitmap: skip reading rdevs that are not in_sync (Yu Kuai)
Improvements:
- md/raid5: set chunk_sectors to enable full stripe I/O splitting
(Yu Kuai)
Cleanups:
- md: remove unused mddev argument from export_rdev (Chen Cheng)
- md/raid5: remove stale md_raid5_kick_device() declaration
(Chen Cheng)
- md/raid5: move handle_stripe() comment to correct location
(Chen Cheng)"
* tag 'md-7.1-20260323' of git://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
md: remove unused mddev argument from export_rdev
md/raid5: move handle_stripe() comment to correct location
md/raid5: remove stale md_raid5_kick_device() declaration
md/raid1: fix the comparing region of interval tree
md/raid5: skip 2-failure compute when other disk is R5_LOCKED
md/md-llbitmap: raise barrier before state machine transition
md/md-llbitmap: skip reading rdevs that are not in_sync
md/raid5: set chunk_sectors to enable full stripe I/O splitting
md/raid10: fix deadlock with check operation and nowait requests
md: suppress spurious superblock update error message for dm-raid
Xiao Ni [Thu, 5 Mar 2026 01:18:33 +0000 (09:18 +0800)]
md/raid1: fix the comparing region of interval tree
Interval tree uses [start, end] as a region which stores in the tree.
In raid1, it uses the wrong end value. For example:
bio(A,B) is too big and needs to be split to bio1(A,C-1), bio2(C,B).
The region of bio1 is [A,C] and the region of bio2 is [C,B]. So bio1 and
bio2 overlap which is not right.
Fix this problem by using right end value of the region.
FengWei Shih [Thu, 19 Mar 2026 05:33:51 +0000 (13:33 +0800)]
md/raid5: skip 2-failure compute when other disk is R5_LOCKED
When skip_copy is enabled on a doubly-degraded RAID6, a device that is
being written to will be in R5_LOCKED state with R5_UPTODATE cleared.
If a new read triggers fetch_block() while the write is still in
flight, the 2-failure compute path may select this locked device as a
compute target because it is not R5_UPTODATE.
Because skip_copy makes the device page point directly to the bio page,
reconstructing data into it might be risky. Also, since the compute
marks the device R5_UPTODATE, it triggers WARN_ON in ops_run_io()
which checks that R5_SkipCopy and R5_UPTODATE are not both set.
This can be reproduced by running small-range concurrent read/write on
a doubly-degraded RAID6 with skip_copy enabled, for example:
Kees Cook [Sat, 21 Mar 2026 00:48:44 +0000 (17:48 -0700)]
block: partitions: Replace pp_buf with struct seq_buf
In preparation for removing the strlcat API[1], replace the char *pp_buf
with a struct seq_buf, which tracks the current write position and
remaining space internally. This allows for:
- Direct use of seq_buf_printf() in place of snprintf()+strlcat()
pairs, eliminating local tmp buffers throughout.
- Adjacent strlcat() calls that build strings piece-by-piece
(e.g., strlcat("["); strlcat(name); strlcat("]")) to be collapsed
into single seq_buf_printf() calls.
- Simpler call sites: seq_buf_puts() takes only the buffer and string,
with no need to pass PAGE_SIZE at every call.
The backing buffer allocation is unchanged (__get_free_page), and the
output path uses seq_buf_str() to NUL-terminate before passing to
printk().
Yang Xiuwei [Tue, 17 Mar 2026 07:22:26 +0000 (15:22 +0800)]
scsi: bsg: add io_uring passthrough handler
Implement the SCSI-specific io_uring command handler for BSG using
struct bsg_uring_cmd.
The handler builds a SCSI request from the io_uring command, maps user
buffers (including fixed buffers), and completes asynchronously via a
request end_io callback and task_work. Completion returns a 32-bit
status and packed residual/sense information via CQE res and res2, and
supports IO_URING_F_NONBLOCK.
Yang Xiuwei [Tue, 17 Mar 2026 07:22:25 +0000 (15:22 +0800)]
bsg: add io_uring command support to generic layer
Add an io_uring command handler to the generic BSG layer. The new
.uring_cmd file operation validates io_uring features and delegates
handling to a per-queue bsg_uring_cmd_fn callback.
Extend bsg_register_queue() so transport drivers can register both
sg_io and io_uring command handlers.
Yu Kuai [Mon, 23 Feb 2026 02:40:35 +0000 (10:40 +0800)]
md/md-llbitmap: raise barrier before state machine transition
Move the barrier raise operation before calling llbitmap_state_machine()
in both llbitmap_start_write() and llbitmap_start_discard(). This
ensures the barrier is in place before any state transitions occur,
preventing potential race conditions where the state machine could
complete before the barrier is properly raised.
Yu Kuai [Mon, 23 Feb 2026 02:40:34 +0000 (10:40 +0800)]
md/md-llbitmap: skip reading rdevs that are not in_sync
When reading bitmap pages from member disks, the code iterates through
all rdevs and attempts to read from the first available one. However,
it only checks for raid_disk assignment and Faulty flag, missing the
In_sync flag check.
This can cause bitmap data to be read from spare disks that are still
being rebuilt and don't have valid bitmap information yet. Reading
stale or uninitialized bitmap data from such disks can lead to
incorrect dirty bit tracking, potentially causing data corruption
during recovery or normal operation.
Add the In_sync flag check to ensure bitmap pages are only read from
fully synchronized member disks that have valid bitmap data.
Yu Kuai [Mon, 23 Feb 2026 03:58:34 +0000 (11:58 +0800)]
md/raid5: set chunk_sectors to enable full stripe I/O splitting
Set chunk_sectors to the full stripe width (io_opt) so that the block
layer splits I/O at full stripe boundaries. This ensures that large
writes are aligned to full stripes, avoiding the read-modify-write
overhead that occurs with partial stripe writes in RAID-5/6.
When chunk_sectors is set, the block layer's bio splitting logic in
get_max_io_size() uses blk_boundary_sectors_left() to limit I/O size
to the boundary. This naturally aligns split bios to full stripe
boundaries, enabling more efficient full stripe writes.
Test results with 24-disk RAID5 (chunk_size=64k):
dd if=/dev/zero of=/dev/md0 bs=10M oflag=direct
Before: 461 MB/s
After: 520 MB/s (+12.8%)
Josh Hunt [Tue, 3 Mar 2026 00:56:19 +0000 (19:56 -0500)]
md/raid10: fix deadlock with check operation and nowait requests
When an array check is running it will raise the barrier at which point
normal requests will become blocked and increment the nr_pending value to
signal there is work pending inside of wait_barrier(). NOWAIT requests
do not block and so will return immediately with an error, and additionally
do not increment nr_pending in wait_barrier(). Upstream change commit 43806c3d5b9b ("raid10: cleanup memleak at raid10_make_request") added a
call to raid_end_bio_io() to fix a memory leak when NOWAIT requests hit
this condition. raid_end_bio_io() eventually calls allow_barrier() and
it will unconditionally do an atomic_dec_and_test(&conf->nr_pending) even
though the corresponding increment on nr_pending didn't happen in the
NOWAIT case.
This can be easily seen by starting a check operation while an application
is doing nowait IO on the same array. This results in a deadlocked state
due to nr_pending value underflowing and so the md resync thread gets stuck
waiting for nr_pending to == 0.
Output of r10conf state of the array when we hit this condition:
Chen Cheng [Tue, 10 Feb 2026 13:38:47 +0000 (21:38 +0800)]
md: suppress spurious superblock update error message for dm-raid
dm-raid has external metadata management (mddev->external = 1) and
no persistent superblock (mddev->persistent = 0). For these arrays,
there's no superblock to update, so the error message is spurious.
The error appears as:
md_update_sb: can't update sb for read-only array md0
Fixes: 8c9e376b9d1a ("md: warn about updating super block failure") Reported-by: Tj <tj.iam.tj@proton.me> Closes: https://lore.kernel.org/all/20260128082430.96788-1-tj.iam.tj@proton.me/ Signed-off-by: Chen Cheng <chencheng@fnnas.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/linux-raid/20260210133847.269986-1-chencheng@fnnas.com Signed-off-by: Yu Kuai <yukuai@fnnas.com>
The ublk driver doesn't access request integrity buffers directly, it
only copies them to/from the ublk server in ublk_copy_user_integrity().
ublk_copy_user_integrity() uses bio_for_each_integrity_vec() to walk all
the integrity segments. ublk devices are therefore capable of handling
requests with integrity intervals split across segments. Set
BLK_SPLIT_INTERVAL_CAPABLE in the struct blk_integrity flags for ublk
devices to opt out of the integrity-interval dma_alignment limit.
Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://patch.msgid.link/20260313144701.1221652-3-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Keith Busch [Fri, 13 Mar 2026 14:47:00 +0000 (07:47 -0700)]
blk-integrity: support arbitrary buffer alignment
A bio segment may have partial interval block data with the rest
continuing into the next segments because direct-io data payloads only
need to align in memory to the device's DMA limits.
At the same time, the protection information may also be split in
multiple segments. The most likely way that may happen is if two
requests merge, or if we're directly using the io_uring user metadata.
The generate/verify, however, only ever accessed the first bip_vec.
Further, it may be possible to unalign the protection fields from the
user space buffer, or if there are odd additional opaque bytes in front
or in back of the protection information metadata region.
Change up the iteration to allow spanning multiple segments. This patch
is mostly a re-write of the protection information handling to allow any
arbitrary alignments, so it's probably easier to review the end result
rather than the diff.
Many controllers are not able to handle interval data composed of
multiple segments when PI is used, so this patch introduces a new
integrity limit that a low level driver can set to notify that it is
capable, default to false. The nvme driver is the first one to enable it
in this patch. Everyone else will force DMA alignment to the logical
block size as before to ensure interval data is always aligned within a
single segment.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://patch.msgid.link/20260313144701.1221652-2-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Wed, 11 Mar 2026 03:28:37 +0000 (11:28 +0800)]
blk-cgroup: wait for blkcg cleanup before initializing new disk
When a queue is shared across disk rebind (e.g., SCSI unbind/bind), the
previous disk's blkcg state is cleaned up asynchronously via
disk_release() -> blkcg_exit_disk(). If the new disk's blkcg_init_disk()
runs before that cleanup finishes, we may overwrite q->root_blkg while
the old one is still alive, and radix_tree_insert() in blkg_create()
fails with -EEXIST because the old blkg entries still occupy the same
queue id slot in blkcg->blkg_tree. This causes the sd probe to fail
with -ENOMEM.
Fix it by waiting in blkcg_init_disk() for root_blkg to become NULL,
which indicates the previous disk's blkcg cleanup has completed.
Fixes: 1059699f87eb ("block: move blkcg initialization/destroy into disk allocation/release handler") Cc: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260311032837.2368714-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
When a bio goes through the rq_qos infrastructure on a path's request
queue, it gets BIO_QOS_THROTTLED or BIO_QOS_MERGED flags set. These
flags indicate that rq_qos_done_bio() should be called on completion
to update rq_qos accounting.
During path failover in nvme_failover_req(), the bio's bi_bdev is
redirected from the failed path's disk to the multipath head's disk
via bio_set_dev(). However, the BIO_QOS flags are not cleared.
When the bio eventually completes (either successfully via a new path
or with an error via bio_io_error()), rq_qos_done_bio() checks for
these flags and calls __rq_qos_done_bio(q->rq_qos, bio) where q is
obtained from the bio's current bi_bdev - which is now the multipath
head's queue, not the original path's queue.
The multipath head's queue does not have rq_qos enabled (q->rq_qos is
NULL), but the code assumes that if BIO_QOS_* flags are set, q->rq_qos
must be valid.
This breaks when a bio is moved between queues during NVMe multipath
failover, leading to a NULL pointer dereference.
Execution Context timeline :-
* =====> dd process context
[USER] dd process
[SYSCALL] write() - dd process context
submit_bio()
nvme_ns_head_submit_bio() - path selection
blk_mq_submit_bio() #### QOS FLAGS SET HERE
[USER] dd waits or returns
==== I/O in flight on NVMe hardware =====
===== End of submission path ====
------------------------------------------------------
* dd ====> Interrupt context;
[IRQ] NVMe completion interrupt
nvme_irq()
nvme_complete_rq()
nvme_failover_req() ### BIO MOVED TO HEAD
spin_lock_irqsave (atomic section)
bio_set_dev() changes bi_bdev
### BUG: QOS flags NOT cleared
kblockd_schedule_work()
Fix this by clearing both BIO_QOS_THROTTLED and BIO_QOS_MERGED flags
when bios are redirected to the multipath head in nvme_failover_req().
This is consistent with the existing code that clears REQ_POLLED and
REQ_NOWAIT flags when the bio changes queues.
block: move bio queue-transition flag fixups into blk_steal_bios()
blk_steal_bios() transfers bios from a request to a bio_list when the
request is requeued to a different queue. The NVMe multipath failover
path (nvme_failover_req) currently open-codes clearing of REQ_POLLED,
bi_cookie, and REQ_NOWAIT on each bio before calling blk_steal_bios().
Move these fixups into blk_steal_bios() itself so that any caller
automatically gets correct flag state when bios cross queue boundaries.
Simplify nvme_failover_req() accordingly.
Jens Axboe [Mon, 9 Mar 2026 20:30:14 +0000 (14:30 -0600)]
Merge branch 'for-7.1/block-integrity' into for-7.1/block
Merge in integrity changes which are also landing in the VFS tree as
dependencies for fs related changes.
* for-7.1/block-integrity:
block: pass a maxlen argument to bio_iov_iter_bounce
block: add fs_bio_integrity helpers
block: make max_integrity_io_size public
block: prepare generation / verification helpers for fs usage
block: add a bdev_has_integrity_csum helper
block: factor out a bio_integrity_setup_default helper
block: factor out a bio_integrity_action helper
Damien Le Moal [Thu, 26 Feb 2026 07:54:48 +0000 (16:54 +0900)]
block: remove bdev_nonrot()
bdev_nonrot() is simply the negative return value of bdev_rot().
So replace all call sites of bdev_nonrot() with calls to bdev_rot()
and remove bdev_nonrot().
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:51 +0000 (22:19 +0900)]
Documentation: ABI: stable: document the zoned_qd1_writes attribute
Update the documentation file Documentation/ABI/stable/sysfs-block to
describe the zoned_qd1_writes sysfs queue attribute file.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:50 +0000 (22:19 +0900)]
block: default to QD=1 writes for blk-mq rotational zoned devices
For blk-mq rotational zoned block devices (e.g. SMR HDDs), default to
having zone write plugging limit write operations to a maximum queue
depth of 1 for all zones. This significantly reduce write seek overhead
and improves SMR HDD write throughput.
For remotely connected disks with a very high network latency this
features might not be useful. However, remotely connected zoned devices
are rare at the moment, and we cannot know the round trip latency to
pick a good default for network attached devices. System administrators
can however disable this feature in that case.
For BIO based (non blk-mq) rotational zoned block devices, the device
driver (e.g. a DM target driver) can directly set an appropriate
default.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:49 +0000 (22:19 +0900)]
block: allow submitting all zone writes from a single context
In order to maintain sequential write patterns per zone with zoned block
devices, zone write plugging issues only a single write BIO per zone at
any time. This works well but has the side effect that when large
sequential write streams are issued by the user and these streams cross
zone boundaries, the device ends up receiving a discontiguous set of
write commands for different zones. The same also happens when a user
writes simultaneously at high queue depth multiple zones: the device
does not see all sequential writes per zone and receives discontiguous
writes to different zones. While this does not affect the performance of
solid state zoned block devices, when using an SMR HDD, this pattern
change from sequential writes to discontiguous writes to different zones
significantly increases head seek which results in degraded write
throughput.
In order to reduce this seek overhead for rotational media devices,
introduce a per disk zone write plugs kernel thread to issue all write
BIOs to zones. This single zone write issuing context is enabled for
any zoned block device that has a request queue flagged with the new
QUEUE_ZONED_QD1_WRITES flag.
The flag QUEUE_ZONED_QD1_WRITES is visible as the sysfs queue attribute
zoned_qd1_writes for zoned devices. For regular block devices, this
attribute is not visible. For zoned block devices, a user can override
the default value set to force the global write maximum queue depth of
1 for a zoned block device, or clear this attribute to fallback to the
default behavior of zone write plugging which limits writes to QD=1 per
sequential zone.
Writing to a zoned block device flagged with QUEUE_ZONED_QD1_WRITES is
implemented using a list of zone write plugs that have a non-empty BIO
list. Listed zone write plugs are processed by the disk zone write plugs
worker kthread in FIFO order, and all BIOs of a zone write plug are all
processed before switching to the next listed zone write plug. A newly
submitted BIO for a non-FULL zone write plug that is not yet listed
causes the addition of the zone write plug at the end of the disk list
of zone write plugs.
Since the write BIOs queued in a zone write plug BIO list are
necessarilly sequential, for rotational media, using the single zone
write plugs kthread to issue all BIOs maintains a sequential write
pattern and thus reduces seek overhead and improves write throughput.
This processing essentially result in always writing to HDDs at QD=1,
which is not an issue for HDDs operating with write caching enabled.
Performance with write cache disabled is also not degraded thanks to
the efficient write handling of modern SMR HDDs.
A disk list of zone write plugs is defined using the new struct gendisk
zone_wplugs_list, and accesses to this list is protected using the
zone_wplugs_list_lock spinlock. The per disk kthread
(zone_wplugs_worker) code is implemented by the function
disk_zone_wplugs_worker(). A reference on listed zone write plugs is
always held until all BIOs of the zone write plug are processed by the
worker kthread. BIO issuing at QD=1 is driven using a completion
structure (zone_wplugs_worker_bio_done) and calls to blk_io_wait().
With this change, performance when sequentially writing the zones of a
30 TB SMR SATA HDD connected to an AHCI adapter changes as follows
(1MiB direct I/Os, results in MB/s unit):
With the current code (baseline), as the sequential write stream crosses
a zone boundary, higher queue depth creates a gap between the
last IO to the previous zone and the first IOs to the following zones,
causing head seeks and degrading performance. Using the disk zone
write plugs worker thread, this pattern disappears and the maximum
throughput of the drive is maintained, leading to over 100%
improvements in throughput for high queue depth write.
Using 16 fio jobs all writing to randomly chosen zones at QD=32 with 1
MiB direct IOs, write throughput also increases significantly.
Performance gains are even more significant when using an HBA that
limits the maximum size of commands to a small value, e.g. HBAs
controlled with the mpi3mr driver limit commands to a maximum of 1 MiB.
In such case, the write throughput gains are over 40%.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:48 +0000 (22:19 +0900)]
block: rename struct gendisk zone_wplugs_lock field
Rename struct gendisk zone_wplugs_lock field to zone_wplugs_hash_lock to
clearly indicates that this is the spinlock used for manipulating the
hash table of zone write plugs.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:47 +0000 (22:19 +0900)]
block: remove disk_zone_is_full()
The helper function disk_zone_is_full() is only used in
disk_zone_wplug_is_full(). So remove it and open code it directly in
this single caller.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:46 +0000 (22:19 +0900)]
block: rename and simplify disk_get_and_lock_zone_wplug()
disk_get_and_lock_zone_wplug() always returns a zone write plug with the
plug lock held. This is unnecessary since this function does not look at
the fields of existing plugs, and new plugs need to be locked only after
their insertion in the disk hash table, when they are being used.
Remove the zone write plug locking from disk_get_and_lock_zone_wplug()
and rename this function disk_get_or_alloc_zone_wplug().
blk_zone_wplug_handle_write() is modified to add locking of the zone
write plug after calling disk_get_or_alloc_zone_wplug() and before
starting to use the plug. This change also simplifies
blk_revalidate_seq_zone() as unlocking the plug becomes unnecessary.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:45 +0000 (22:19 +0900)]
block: fix zone write plugs refcount handling in disk_zone_wplug_schedule_bio_work()
The function disk_zone_wplug_schedule_bio_work() always takes a
reference on the zone write plug of the BIO work being scheduled. This
ensures that the zone write plug cannot be freed while the BIO work is
being scheduled but has not run yet. However, this unconditional
reference taking is fragile since the reference taken is released by the
BIO work blk_zone_wplug_bio_work() function, which implies that there
always must be a 1:1 relation between the work being scheduled and the
work running.
Make sure to drop the reference taken when scheduling the BIO work if
the work is already scheduled, that is, when queue_work() returns false.
Fixes: 9e78c38ab30b ("block: Hold a reference on zone write plugs to schedule submission") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 27 Feb 2026 13:19:44 +0000 (22:19 +0900)]
block: fix zone write plug removal
Commit 7b295187287e ("block: Do not remove zone write plugs still in
use") modified disk_should_remove_zone_wplug() to add a check on the
reference count of a zone write plug to prevent removing zone write
plugs from a disk hash table when the plugs are still being referenced
by BIOs or requests in-flight. However, this check does not take into
account that a BIO completion may happen right after its submission by
a zone write plug BIO work, and before the zone write plug BIO work
releases the zone write plug reference count. This situation leads to
disk_should_remove_zone_wplug() returning false as in this case the zone
write plug reference count is at least equal to 3. If the BIO that
completes in such manner transitioned the zone to the FULL condition,
the zone write plug for the FULL zone will remain in the disk hash
table.
Furthermore, relying on a particular value of a zone write plug
reference count to set the BLK_ZONE_WPLUG_UNHASHED flag is fragile as
reading the atomic reference count and doing a comparison with some
value is not overall atomic at all.
Address these issues by reworking the reference counting of zone write
plugs so that removing plugs from a disk hash table can be done
directly from disk_put_zone_wplug() when the last reference on a plug
is dropped.
To do so, replace the function disk_remove_zone_wplug() with
disk_mark_zone_wplug_dead(). This new function sets the zone write plug
flag BLK_ZONE_WPLUG_DEAD (which replaces BLK_ZONE_WPLUG_UNHASHED) and
drops the initial reference on the zone write plug taken when the plug
was added to the disk hash table. This function is called either for
zones that are empty or full, or directly in the case of a forced plug
removal (e.g. when the disk hash table is being destroyed on disk
removal). With this change, disk_should_remove_zone_wplug() is also
removed.
disk_put_zone_wplug() is modified to call the function
disk_free_zone_wplug() to remove a zone write plug from a disk hash
table and free the plug structure (with a call_rcu()), when the last
reference on a zone write plug is dropped. disk_free_zone_wplug()
always checks that the BLK_ZONE_WPLUG_DEAD flag is set.
In order to avoid having multiple zone write plugs for the same zone in
the disk hash table, disk_get_and_lock_zone_wplug() checked for the
BLK_ZONE_WPLUG_UNHASHED flag. This check is removed and a check for
the new BLK_ZONE_WPLUG_DEAD flag is added to
blk_zone_wplug_handle_write(). With this change, we continue preventing
adding multiple zone write plugs for the same zone and at the same time
re-inforce checks on the user behavior by failing new incoming write
BIOs targeting a zone that is marked as dead. This case can happen only
if the user erroneously issues write BIOs to zones that are full, or to
zones that are currently being reset or finished.
Fixes: 7b295187287e ("block: Do not remove zone write plugs still in use") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bill Wendling [Wed, 25 Feb 2026 20:51:05 +0000 (20:51 +0000)]
block: annotate struct request_queue with __counted_by_ptr
The queue_hw_ctx field in struct request_queue is an array of pointers
to struct blk_mq_hw_ctx. The number of elements in this array is tracked
by the nr_hw_queues field.
The array is allocated in __blk_mq_realloc_hw_ctxs() using
kcalloc_node() with set->nr_hw_queues elements. q->nr_hw_queues is
subsequently updated to set->nr_hw_queues.
When growing the array, the new array is assigned to queue_hw_ctx before
nr_hw_queues is updated. This is safe because nr_hw_queues (the old
smaller count) is used for bounds checking, which is within the new
larger allocation.
When shrinking the array, nr_hw_queues is updated to the smaller value,
while queue_hw_ctx retains the larger allocation. This is also safe as
the count is within the allocation bounds.
Annotating queue_hw_ctx with __counted_by_ptr(nr_hw_queues) allows the
compiler (with kSAN) to verify that accesses to queue_hw_ctx are within
the valid range defined by nr_hw_queues.
This patch was generated by CodeMender and reviewed by Bill Wendling.
Tested by running blktests.
Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bill Wendling <morbo@google.com>
[axboe: massage commit message] Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ondrej Kozina [Fri, 6 Feb 2026 14:18:03 +0000 (15:18 +0100)]
sed-opal: add IOC_OPAL_GET_SUM_STATUS ioctl.
This adds a function for retrieving the set of Locking objects enabled
for Single User Mode (SUM) and the value of the
RangeStartRangeLengthPolicy parameter.
It retrieves data from the LockingInfo table, specifically the
columns SingleUserModeRanges and RangeStartLengthPolicy, which
were added according to the TCG Opal Feature Set: Single User Mode,
as described in chapters 4.4.3.1 and 4.4.3.2.
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-and-tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ondrej Kozina [Fri, 6 Feb 2026 14:18:01 +0000 (15:18 +0100)]
sed-opal: add IOC_OPAL_ENABLE_DISABLE_LR.
This ioctl is used to set up RLE (read lock enabled) and WLE (write
lock enabled) parameters of the Locking object.
In Single User Mode (SUM), if the RangeStartRangeLengthPolicy parameter
is set in the 'Reactivate' method, only Admin authority maintains the
locking range length and start (offset) attributes of Locking objects
set up for SUM. All other attributes from struct opal_user_lr_setup
(RLE - read locking enabled, WLE - write locking enabled) shall
remain in possession of the User authority associated with the Locking
object set for SUM.
With the IOC_OPAL_ENABLE_DISABLE_LR ioctl, the opal_user_lr_setup
members 'range_start' and 'range_length' of the ioctl argument are
ignored.
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-and-tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ondrej Kozina [Fri, 6 Feb 2026 14:18:00 +0000 (15:18 +0100)]
sed-opal: add IOC_OPAL_LR_SET_START_LEN ioctl.
This ioctl is used to set up locking range start (offset)
and locking range length attributes only.
In Single User Mode (SUM), if the RangeStartRangeLengthPolicy parameter
is set in the 'Reactivate' method, only Admin authority maintains the
locking range length and start (offset) attributes of Locking objects
set up for SUM. All other attributes from struct opal_user_lr_setup
(RLE - read locking enabled, WLE - write locking enabled) shall
remain in possession of the User authority associated with the Locking
object set for SUM.
Therefore, we need a separate function for setting up locking range
start and locking range length because it may require two different
authorities (and sessions) if the RangeStartRangeLengthPolicy attribute
is set.
With the IOC_OPAL_LR_SET_START_LEN ioctl, the opal_user_lr_setup
members 'RLE' and 'WLE' of the ioctl argument are ignored.
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-and-tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
IOC_OPAL_LR_SETUP is used to set up a locking range entirely under a
single authority (usually Admin1), but for Single User Mode (SUM),
the permissions for attributes (RangeStart, RangeLength)
and (ReadLockEnable, WriteLockEnable, ReadLocked, WriteLocked)
may be split between two different authorities. Typically, it is Admin1
for the former and the User associated with the LockingRange in SUM
for the latter.
This commit only splits the internals in preparation for the introduction
of separate ioctls for setting RangeStart, RangeLength and the rest
using new ioctl calls.
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-and-tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
block: pass a maxlen argument to bio_iov_iter_bounce
Allow the file system to limit the size processed in a single
bounce operation. This is needed when generating integrity data
so that the size of a single integrity segment can't overflow.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a set of helpers for file system initiated integrity information.
These include mempool backed allocations and verifying based on a passed
in sector and size which is often available from file system completion
routines.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
block: prepare generation / verification helpers for fs usage
Return the status from verify instead of directly stashing it in the bio,
and rename the helpers to use the usual bio_ prefix for things operating
on a bio.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 8 Mar 2026 19:13:09 +0000 (12:13 -0700)]
Merge tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
"Fix for the x86 EFI workaround keeping boot services code and data
regions reserved until after SetVirtualAddressMap() completes:
deferred struct page initialization may result in some of this memory
being lost permanently"
* tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efi: defer freeing of boot services memory
Linus Torvalds [Sun, 8 Mar 2026 01:12:06 +0000 (17:12 -0800)]
Merge tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix SEV guest boot failures in certain circumstances, due to
very early code relying on a BSS-zeroed variable that isn't
actually zeroed yet an may contain non-zero bootup values
Move the variable into the .data section go gain even earlier
zeroing
- Expose & allow the IBPB-on-Entry feature on SNP guests, which
was not properly exposed to guests due to initial implementational
caution
- Fix O= build failure when CONFIG_EFI_SBAT_FILE is using relative
file paths
- Fix the various SNC (Sub-NUMA Clustering) topology enumeration
bugs/artifacts (sched-domain build errors mostly).
SNC enumeration data got more complicated with Granite Rapids X
(GNR) and Clearwater Forest X (CWF), which exposed these bugs
and made their effects more serious
- Also use the now sane(r) SNC code to fix resctrl SNC detection bugs
- Work around a historic libgcc unwinder bug in the vdso32 sigreturn
code (again), which regressed during an overly aggressive recent
cleanup of DWARF annotations
* tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/vdso32: Work around libgcc unwinder bug
x86/resctrl: Fix SNC detection
x86/topo: Fix SNC topology mess
x86/topo: Replace x86_has_numa_in_package
x86/topo: Add topology_num_nodes_per_package()
x86/numa: Store extra copy of numa_nodes_parsed
x86/boot: Handle relative CONFIG_EFI_SBAT_FILE file paths
x86/sev: Allow IBPB-on-Entry feature for SNP guests
x86/boot/sev: Move SEV decompressor variables into the .data section
Linus Torvalds [Sun, 8 Mar 2026 01:09:15 +0000 (17:09 -0800)]
Merge tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Make clock_adjtime() syscall timex validation slightly more permissive
for auxiliary clocks, to not reject syscalls based on the status field
that do not try to modify the status field.
This makes the ABI behavior in clock_adjtime() consistent with
CLOCK_REALTIME"
* tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Fix timex status validation for auxiliary clocks
Linus Torvalds [Sun, 8 Mar 2026 01:07:13 +0000 (17:07 -0800)]
Merge tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a DL scheduler bug that may corrupt internal metrics during PI and
setscheduler() syscalls, resulting in kernel warnings and misbehavior.
Found during stress-testing"
* tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Fix missing ENQUEUE_REPLENISH during PI de-boosting
Linus Torvalds [Sat, 7 Mar 2026 22:04:50 +0000 (14:04 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two core changes and the rest in drivers, one core change to quirk the
behaviour of the Iomega Zip drive and one to fix a hang caused by tag
reallocation problems, which has mostly been seen by the iscsi client.
Note the latter fixes the problem but still has a slight sysfs memory
leak, so will be amended in the next pull request (once we've run the
fix for the fix through our testing)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: Fix recursive locking in __configfs_open_file()
scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP
scsi: mpi3mr: Clear reset history on ready and recheck state after timeout
scsi: core: Fix refcount leak for tagset_refcnt
Linus Torvalds [Sat, 7 Mar 2026 20:38:16 +0000 (12:38 -0800)]
Merge tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"While testing Sasha Levin's 'kallsyms: embed source file:line info in
kernel stack traces' patch series, which increases the typical kernel
image size, I found some issues with the parisc initial kernel mapping
which may prevent the kernel to boot.
The three small patches here fix this"
* tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix initial page table creation for boot
parisc: Check kernel mapping earlier at bootup
parisc: Increase initial mapping to 64 MB with KALLSYMS
- Fix precision backtracking with linked registers (Eduard Zingerman)
- Fix linker flags detection for resolve_btfids (Ihor Solodrai)
- Fix race in update_ftrace_direct_add/del (Jiri Olsa)
- Fix UAF in bpf_trampoline_link_cgroup_shim (Lang Xu)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
resolve_btfids: Fix linker flags detection
selftests/bpf: add reproducer for spurious precision propagation through calls
bpf: collect only live registers in linked regs
Revert "selftests/bpf: Update reg_bound range refinement logic"
selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
bpf: Fix u32/s32 bounds when ranges cross min/max boundary
bpf: Fix a UAF issue in bpf_trampoline_link_cgroup_shim
ftrace: Add missing ftrace_lock to update_ftrace_direct_add/del
Linus Torvalds [Sat, 7 Mar 2026 19:56:55 +0000 (11:56 -0800)]
Merge tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU selftest fixes from Boqun Feng:
"Fix a regression in RCU torture test pre-defined scenarios caused by
commit 7dadeaa6e851 ("sched: Further restrict the preemption modes")
which limits PREEMPT_NONE to architectures that do not support
preemption at all and PREEMPT_VOLUNTARY to those architectures that do
not yet have PREEMPT_LAZY support.
Since major architectures (e.g. x86 and arm64) no longer support
CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_VOLUNTARY, using them in
rcutorture, rcuscale, refscale, and scftorture pre-defined scenarios
causes config checking errors.
Switch these kconfigs to PREEMPT_LAZY"
* tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
scftorture: Update due to x86 not supporting none/voluntary preemption
refscale: Update due to x86 not supporting none/voluntary preemption
rcuscale: Update due to x86 not supporting none/voluntary preemption
rcutorture: Update due to x86 not supporting none/voluntary preemption
Linus Torvalds [Sat, 7 Mar 2026 17:50:54 +0000 (09:50 -0800)]
Merge tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix possible NULL pointer dereference in trace_data_alloc()
On the trace_data_alloc() error path, it can call trigger_data_free()
with a NULL pointer. This used to be a kfree() but was changed to
trigger_data_free() to clean up any partial initialization. The issue
is that trigger_data_free() does not expect a NULL pointer. Have
trigger_data_free() return safely on NULL pointer.
- Fix multiple events on the command line and bootconfig
If multiple events are enabled on the command line separately and not
grouped, only the last event gets enabled. That is:
trace_event=sched_switch trace_event=sched_waking
will only enable sched_waking whereas:
trace_event=sched_switch,sched_waking
will enable both.
The bootconfig makes it even worse as the second way is the more
common method.
The issue is that a temporary buffer is used to store the events to
enable later in boot. Each time the cmdline callback is called, it
overwrites what was previously there.
Have the callback append the next value (delimited by a comma) if the
temporary buffer already has content.
- Fix command line trace_buffer_size if >= 2G
The logic to allocate the trace buffer uses "int" for the size
parameter in the command line code causing overflow issues if more
that 2G is specified.
* tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix trace_buf_size= cmdline parameter with sizes >= 2G
tracing: Fix enabling multiple events on the kernel command line and bootconfig
tracing: Add NULL pointer check to trigger_data_free()
Ihor Solodrai [Thu, 5 Mar 2026 01:47:30 +0000 (17:47 -0800)]
resolve_btfids: Fix linker flags detection
The "|| echo -lzstd" default makes zstd an unconditional link
dependency of resolve_btfids. On systems where libzstd-dev is not
installed and pkg-config fails, the linker fails:
ld: cannot find -lzstd: No such file or directory
libzstd is a transitive dependency of libelf, so the -lzstd flag is
strictly necessary only for static builds [1].
Remove ZSTD_LIBS variable, and instead set LIBELF_LIBS depending on
whether the build is static or not. Use $(HOSTPKG_CONFIG) as primary
source of the flags list.
Also add a default value for HOSTPKG_CONFIG in case it's not built via
the toplevel Makefile. Pass it from selftests/bpf too.
Linus Torvalds [Sat, 7 Mar 2026 16:16:48 +0000 (08:16 -0800)]
Merge tag 'driver-core-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fix from Danilo Krummrich:
- Revert "driver core: enforce device_lock for driver_match_device()":
When a device is already present in the system and a driver is
registered on the same bus, we iterate over all devices registered on
this bus to see if one of them matches. If we come across an already
bound one where the corresponding driver crashed while holding the
device lock (e.g. in probe()) we can't make any progress anymore.
Thus, revert and clarify that an implementer of struct bus_type must
not expect match() to be called with the device lock held.
* tag 'driver-core-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
Revert "driver core: enforce device_lock for driver_match_device()"
====================
bpf: Fix precision backtracking bug with linked registers
Emil Tsalapatis reported a verifier bug hit by the scx_lavd sched_ext
scheduler. The essential part of the verifier log looks as follows:
436: ...
// checkpoint hit for 438: (1d) if r7 == r8 goto ...
frame 3: propagating r2,r7,r8
frame 2: propagating r6
mark_precise: frame3: last_idx ...
mark_precise: frame3: regs=r2,r7,r8 stack= before 436: ...
mark_precise: frame3: regs=r2,r7 stack= before 435: ...
mark_precise: frame3: regs=r2,r7 stack= before 434: (85) call bpf_trace_vprintk#177
verifier bug: backtracking call unexpected regs 84
The log complains that registers r2 and r7 are tracked as precise
while processing the bpf_trace_vprintk() call in precision backtracking.
This can't be right, as r2 is reset by the call and there is nothing
to backtrack it to. The precision propagation is triggered when
a checkpoint is hit at instruction 438, r2 is dead at that instruction.
This happens because of the following sequence of events:
- Instruction 438 is first reached with registers r2 and r7 having
the same id via a path that does not call bpf_trace_vprintk():
- Checkpoint is created at 438.
- The jump at 438 is predicted, hence r7 and registers linked to it
(r2) are propagated as precise, marking r2 and r7 precise in the
checkpoint.
- Instruction 438 is reached a second time with r2 undefined and via
a path that calls bpf_trace_vprintk():
- Checkpoint is hit.
- propagate_precision() picks registers r2 and r7 and propagates
precision marks for those up to the helper call.
The root cause is the fact that states_equal() and
propagate_precision() assume that the precision flag can't be set for a
dead register (as computed by compute_live_registers()).
However, this is not the case when linked registers are at play.
Fix this by accounting for live register flags in
collect_linked_regs().
---
====================
selftests/bpf: add reproducer for spurious precision propagation through calls
Add a test for the scenario described in the previous commit:
an iterator loop with two paths where one ties r2/r7 via
shared scalar id and skips a call, while the other goes
through the call. Precision marks from the linked registers
get spuriously propagated to the call path via
propagate_precision(), hitting "backtracking call unexpected
regs" in backtrack_insn().
Fix an inconsistency between func_states_equal() and
collect_linked_regs():
- regsafe() uses check_ids() to verify that cached and current states
have identical register id mapping.
- func_states_equal() calls regsafe() only for registers computed as
live by compute_live_registers().
- clean_live_states() is supposed to remove dead registers from cached
states, but it can skip states belonging to an iterator-based loop.
- collect_linked_regs() collects all registers sharing the same id,
ignoring the marks computed by compute_live_registers().
Linked registers are stored in the state's jump history.
- backtrack_insn() marks all linked registers for an instruction
as precise whenever one of the linked registers is precise.
The above might lead to a scenario:
- There is an instruction I with register rY known to be dead at I.
- Instruction I is reached via two paths: first A, then B.
- On path A:
- There is an id link between registers rX and rY.
- Checkpoint C is created at I.
- Linked register set {rX, rY} is saved to the jump history.
- rX is marked as precise at I, causing both rX and rY
to be marked precise at C.
- On path B:
- There is no id link between registers rX and rY,
otherwise register states are sub-states of those in C.
- Because rY is dead at I, check_ids() returns true.
- Current state is considered equal to checkpoint C,
propagate_precision() propagates spurious precision
mark for register rY along the path B.
- Depending on a program, this might hit verifier_bug()
in the backtrack_insn(), e.g. if rY ∈ [r1..r5]
and backtrack_insn() spots a function call.
The reproducer program is in the next patch.
This was hit by sched_ext scx_lavd scheduler code.
Changes in tests:
- verifier_scalar_ids.c selftests need modification to preserve
some registers as live for __msg() checks.
- exceptions_assert.c adjusted to match changes in the verifier log,
R0 is dead after conditional instruction and thus does not get
range.
- precise.c adjusted to match changes in the verifier log, register r9
is dead after comparison and it's range is not important for test.
Linus Torvalds [Sat, 7 Mar 2026 04:27:13 +0000 (20:27 -0800)]
Merge tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Split out .modinfo section from ELF_DETAILS macro, as that macro may
be used in other areas that expect to discard .modinfo, breaking
certain image layouts
- Adjust genksyms parser to handle optional attributes in certain
declarations, necessary after commit 07919126ecfc ("netfilter:
annotate NAT helper hook pointers with __rcu")
- Include resolve_btfids in external module build created by
scripts/package/install-extmod-build when it may be run on external
modules
- Avoid removing objtool binary with 'make clean', as it is required
for external module builds
* tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Leave objtool binary around with 'make clean'
kbuild: install-extmod-build: Package resolve_btfids if necessary
genksyms: Fix parsing a declarator with a preceding attribute
kbuild: Split .modinfo out from ELF_DETAILS
Linus Torvalds [Sat, 7 Mar 2026 03:57:03 +0000 (19:57 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main changes are a fix to the way in which we manage the access
flag setting for mappings using the contiguous bit and a fix for a
hang on the kexec/hibernation path.
Summary:
- Fix kexec/hibernation hang due to bogus read-only mappings
- Fix sparse warnings in our cmpxchg() implementation
- Prevent runtime-const being used in modules, just like x86
- Fix broken elision of access flag modifications for contiguous
entries on systems without support for hardware updates
- Fix a broken SVE selftest that was testing the wrong instruction"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
selftest/arm64: Fix sve2p1_sigill() to hwcap test
arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults
arm64: make runtime const not usable by modules
arm64: mm: Add PTE_DIRTY back to PAGE_KERNEL* to fix kexec/hibernation
arm64: Silence sparse warnings caused by the type casting in (cmp)xchg
Calvin Owens [Sat, 7 Mar 2026 03:19:25 +0000 (19:19 -0800)]
tracing: Fix trace_buf_size= cmdline parameter with sizes >= 2G
Some of the sizing logic through tracer_alloc_buffers() uses int
internally, causing unexpected behavior if the user passes a value that
does not fit in an int (on my x86 machine, the result is uselessly tiny
buffers).
Fix by plumbing the parameter's real type (unsigned long) through to the
ring buffer allocation functions, which already use unsigned long.
It has always been possible to create larger ring buffers via the sysfs
interface: this only affects the cmdline parameter.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/bff42a4288aada08bdf74da3f5b67a2c28b761f8.1772852067.git.calvin@wbinvd.org Fixes: 73c5162aa362 ("tracing: keep ring buffer to minimum size till used") Signed-off-by: Calvin Owens <calvin@wbinvd.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
====================
bpf: Fix u32/s32 bounds when ranges cross min/max boundary
Cover the following cases in range refinement logic for 32-bit ranges:
- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
overlaps with u32 range.
- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
overlaps with u32 range.
These cases are already handled for 64-bit range refinement.
Without the fix the test in patch 2 is rejected by the verifier.
The test was reduced from sched-ext program.
Changelog:
- v2 -> v3:
- Reverted da653de268d3 (Paul)
- Removed !BPF_F_TEST_REG_INVARIANTS flag from
crossing_32_bit_signed_boundary_2() (Paul)
- v1 -> v2:
- Extended commit message and comments (Emil)
- Targeting 'bpf' tree instead of bpf-next (Alexei)
Revert "selftests/bpf: Update reg_bound range refinement logic"
This reverts commit da653de268d32a80e135c9eb960a8147c186f1bc.
Removed logic is now covered by range_refine_in_halves()
which handles both 32-bit and 64-bit refinements.
selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
Two test cases for signed/unsigned 32-bit bounds refinement
when s32 range crosses the sign boundary:
- s32 range [S32_MIN..1] overlapping with u32 range [3..U32_MAX],
s32 range tail before sign boundary overlaps with u32 range.
- s32 range [-3..5] overlapping with u32 range [0..S32_MIN+3],
s32 range head after the sign boundary overlaps with u32 range.
This covers both branches added in the __reg32_deduce_bounds().
Also, crossing_32_bit_signed_boundary_2() no longer triggers invariant
violations.
bpf: Fix u32/s32 bounds when ranges cross min/max boundary
Same as in __reg64_deduce_bounds(), refine s32/u32 ranges
in __reg32_deduce_bounds() in the following situations:
- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
overlaps with u32 range:
0 U32_MAX
| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] |
|----------------------------|----------------------------|
|xxxxx s32 range xxxxxxxxx] [xxxxxxx|
0 S32_MAX S32_MIN -1
- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
overlaps with u32 range:
0 U32_MAX
| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] |
|----------------------------|----------------------------|
|xxxxxxxxx] [xxxxxxxxxxxx s32 range |
0 S32_MAX S32_MIN -1
- No refinement if ranges overlap in two intervals.
This helps for e.g. consider the following program:
call %[bpf_get_prandom_u32];
w0 &= 0xffffffff;
if w0 < 0x3 goto 1f; // on fall-through u32 range [3..U32_MAX]
if w0 s> 0x1 goto 1f; // on fall-through s32 range [S32_MIN..1]
if w0 s< 0x0 goto 1f; // range can be narrowed to [S32_MIN..-1]
r10 = 0;
1: ...;
The reg_bounds.c selftest is updated to incorporate identical logic,
refinement based on non-overflowing range halves:
Linus Torvalds [Sat, 7 Mar 2026 00:07:22 +0000 (16:07 -0800)]
Merge tag 'v7.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix potential oops on open failure
- Fix unmount to better free deferred closes
- Use proper constant-time MAC comparison function
- Two buffer allocation size fixes
- Two minor cleanups
- make SMB2 kunit tests a distinct module
* tag 'v7.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix oops due to uninitialised var in smb2_unlink()
cifs: open files should not hold ref on superblock
smb: client: Compare MACs in constant time
smb/client: remove unused SMB311_posix_query_info()
smb/client: fix buffer size for smb311_posix_qinfo in SMB311_posix_query_info()
smb/client: fix buffer size for smb311_posix_qinfo in smb2_compound_op()
smb: update some doc references
smb/client: make SMB2 maperror KUnit tests a separate module
tracing: Fix enabling multiple events on the kernel command line and bootconfig
Multiple events can be enabled on the kernel command line via a comma
separator. But if the are specified one at a time, then only the last
event is enabled. This is because the event names are saved in a temporary
buffer, and each call by the init cmdline code will reset that buffer.
This also affects names in the boot config file, as it may call the
callback multiple times with an example of:
Linus Torvalds [Fri, 6 Mar 2026 21:37:52 +0000 (13:37 -0800)]
Merge tag 'pci-v7.0-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Initialize msi_addr_mask for OF-created PCI devices to fix sparc and
powerpc probe regressions (Nilay Shroff)
- Orphan the Altera PCIe controller driver (Dave Hansen)
* tag 'pci-v7.0-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
MAINTAINERS: Orphan Altera PCIe controller driver
sparc/PCI: Initialize msi_addr_mask for OF-created PCI devices
powerpc/pci: Initialize msi_addr_mask for OF-created PCI devices
Linus Torvalds [Fri, 6 Mar 2026 21:29:12 +0000 (13:29 -0800)]
Merge tag 'drm-fixes-2026-03-07' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes pull.
There is one mm fix in here for a HMM livelock triggered by the xe
driver tests. Otherwise it's a pretty wide range of fixes across the
board, ttm UAF regression fix, amdgpu fixes, nouveau doesn't crash my
laptop anymore fix, and a fair bit of misc.
Seems about right for rc3.
mm:
- mm: Fix a hmm_range_fault() livelock / starvation problem
amdxdna:
- fix invalid payload for failed command
- fix NULL ptr dereference
- fix major fw version check
- avoid inconsistent fw state on error
i915/display:
- Fix for Lenovo T14 G7 display not refreshing
xe:
- Do not preempt fence signaling CS instructions
- Some leak and finalization fixes
- Workaround fix
nouveau:
- avoid runtime suspend oops when using dp aux
panthor:
- fix gem_sync argument ordering
solomon:
- fix incorrect display output
renesas:
- fix DSI divider programming
ethosu:
- fix job submit error clean-up refcount
- fix NPU_OP_ELEMENTWISE validation
- handle possible underflows in IFM size calcs"
* tag 'drm-fixes-2026-03-07' of https://gitlab.freedesktop.org/drm/kernel: (38 commits)
accel: ethosu: Handle possible underflow in IFM size calculations
accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar
accel: ethosu: Fix job submit error clean-up refcount underflows
accel/amdxdna: Split mailbox channel create function
drm/panthor: Correct the order of arguments passed to gem_sync
Revert "drm/syncobj: Fix handle <-> fd ioctls with dirty stack"
drm/ttm: Fix bo resource use-after-free
nouveau/dpcd: return EBUSY for aux xfer if the device is asleep
accel/amdxdna: Fix major version check on NPU1 platform
drm/amdgpu/userq: refcount userqueues to avoid any race conditions
drm/amdgpu/userq: Consolidate wait ioctl exit path
drm/amdgpu/psp: Use Indirect access address for GFX to PSP mailbox
drm/amdgpu: Fix use-after-free race in VM acquire
drm/amd/pm: remove invalid gpu_metrics.energy_accumulator on smu v13.0.x
drm/xe: Fix memory leak in xe_vm_madvise_ioctl
drm/xe/reg_sr: Fix leak on xa_store failure
drm/xe/xe2_hpg: Correct implementation of Wa_16025250150
drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure
Revert "drm/pagemap: Disable device-to-device migration"
drm/i915/psr: Fix for Panel Replay X granularity DPCD register handling
...
Linus Torvalds [Fri, 6 Mar 2026 20:34:49 +0000 (12:34 -0800)]
Merge tag 'linux_kselftest-kunit-fixes-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
- Fix rust warnings when CONFIG_PRINTK is disabled
- Reduce stack usage in kunit_run_tests() to fix warnings when
CONFIG_FRAME_WARN is set to a relatively low value
- Update email address for David Gow
- Copy caller args in kunit tool in run_kernel to prevent mutation
* tag 'linux_kselftest-kunit-fixes-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: reduce stack usage in kunit_run_tests()
kunit: tool: copy caller args in run_kernel to prevent mutation
rust: kunit: fix warning when !CONFIG_PRINTK
MAINTAINERS: Update email address for David Gow
Linus Torvalds [Fri, 6 Mar 2026 18:33:32 +0000 (10:33 -0800)]
Merge tag 'spi-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"One device specific fix here, it was possible we might end up trying
to dereference an invalid pointer while reporting a transfer timeout
on DesignWare controllers"
* tag 'spi-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-dw-dma: fix print error log when wait finish transaction
Linus Torvalds [Fri, 6 Mar 2026 18:27:45 +0000 (10:27 -0800)]
Merge tag 'regulator-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of small, driver specific fixes which might not even have
much impact if you have the affected devices depending on your setup"
* tag 'regulator-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: pf9453: Respect IRQ trigger settings from firmware
regulator: mt6363: Fix incorrect and redundant IRQ disposal in probe
Linus Torvalds [Fri, 6 Mar 2026 18:06:04 +0000 (10:06 -0800)]
Merge tag 'sound-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Again a collection of device-specific fixes. Most of changes are
fairly small device-specific quirks of fixes for HD- and USB-audio,
ASoC Intel, AMD, fsl, Cirrus and co.
The only large LOC is for plumbing ASoC ACP driver to add the Cirrus
Logic codec support, so this one is also just adding some tables"
* tag 'sound-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
ALSA: us122l: drop redundant interface references
ASoC: amd: yc: Add DMI quirk for ASUS EXPERTBOOK PM1503CDA
ASoC: dt-bindings: renesas,rz-ssi: Document RZ/G3L SoC
ASoC: SDCA: Add allocation failure check for Entity name
ALSA: hda/senary: Ensure EAPD is enabled during init
ALSA: hda/senary: Use codec->core.afg for GPIO access
ALSA: doc: usb-audio: Add doc for QUIRK_FLAG_SKIP_IFACE_SETUP
ASoC: dt-bindings: tegra: Add compatible for Tegra238 sound card
ALSA: hda/hdmi: Add Tegra238 HDA codec device ID
ASoC: cs35l56: Suppress pointless warning about number of GPIO pulls
ASoC: amd: acp: Add ACP6.3 match entries for Cirrus Logic parts
ASoC: Intel: sof_sdw: Add quirk for Alienware Area 51 (2025) 0CCD SKU
ASoC: rt1321: fix DMIC ch2/3 mask issue
ASoC: cs35l56: Only patch ASP registers if the DAI is part of a DAIlink
ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_set_reg()
ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_put_bits()
ALSA: firewire: dice: Fix printf warning with W=1
ALSA: hda/tas2781: A workaround solution to lower-vol issue among lower calibrated-impedance micro-speaker on TAS2781
ALSA: hda/realtek: Add quirk for HP Pavilion 15-eh1xxx to enable mute LED
ALSA: usb-audio: Add iface reset and delay quirk for AB13X USB Audio
...
Guenter Roeck [Thu, 5 Mar 2026 19:33:39 +0000 (11:33 -0800)]
tracing: Add NULL pointer check to trigger_data_free()
If trigger_data_alloc() fails and returns NULL, event_hist_trigger_parse()
jumps to the out_free error path. While kfree() safely handles a NULL
pointer, trigger_data_free() does not. This causes a NULL pointer
dereference in trigger_data_free() when evaluating
data->cmd_ops->set_filter.
Fix the problem by adding a NULL pointer check to trigger_data_free().
The problem was found by an experimental code review agent based on
gemini-3.1-pro while reviewing backports into v6.18.y.
Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://patch.msgid.link/20260305193339.2810953-1-linux@roeck-us.net Fixes: 0550069cc25f ("tracing: Properly process error handling in event_hist_trigger_parse()") Assisted-by: Gemini:gemini-3.1-pro Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Fri, 6 Mar 2026 18:00:58 +0000 (10:00 -0800)]
Merge tag 'hid-for-linus-2026030601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- fix a few memory leaks (Günther Noack)
- fix potential kernel crashes in cmedia, creative-sb0540 and zydacron
(Greg Kroah-Hartman)
- fix NULL pointer dereference in pidff (Tomasz Pakuła)
- fix battery reporting for Apple Magic Trackpad 2 (Julius Lehmann)
- mcp2221 proper handling of failed read operation (Romain Sioen)
- various device quirks / device ID additions
* tag 'hid-for-linus-2026030601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: mcp2221: cancel last I2C command on read error
HID: asus: add xg mobile 2023 external hardware support
HID: multitouch: Keep latency normal on deactivate for reactivation gesture
HID: apple: Add EPOMAKER TH87 to the non-apple keyboards list
HID: intel-ish-hid: ipc: Add Nova Lake-H/S PCI device IDs
selftests: hid: tests: test_wacom_generic: add tests for display devices and opaque devices
HID: multitouch: new class MT_CLS_EGALAX_P80H84
HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2
HID: pidff: Fix condition effect bit clearing
HID: Add HID_CLAIMED_INPUT guards in raw_event callbacks missing them
HID: asus: avoid memory leak in asus_report_fixup()
HID: magicmouse: avoid memory leak in magicmouse_report_fixup()
HID: apple: avoid memory leak in apple_report_fixup()
HID: Document memory allocation properties of report_fixup()
- touchscreen_dmi: Add quirk for y-inverted Goodix touchscreen on SUPI
S10
- uniwill-laptop:
- FN lock/super key lock attributes rename
- Fix crash on unexpected battery event
- A special key combination can alter FN lock status so mark it
volatile
- Handle FN lock event
* tag 'platform-drivers-x86-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (27 commits)
platform/x86: dell-wmi-sysman: Don't hex dump plaintext password data
platform_data/mlxreg: mlxreg.h: fix all kernel-doc warnings
platform/x86: asus-armoury: add support for FA401UM
platform/x86: asus-armoury: add support for GX650RX
platform/x86: hp-bioscfg: Support allocations of larger data
platform/x86: oxpec: Add support for Aokzoe A2 Pro
platform/x86: oxpec: Add support for OneXPlayer X1 Air
platform/x86: oxpec: Add support for OneXPlayer X1z
platform/x86: oxpec: Add support for OneXPlayer APEX
platform/x86: uniwill-laptop: Handle FN lock event
platform/x86: uniwill-laptop: Mark FN lock status as being volatile
platform/x86: uniwill-laptop: Fix crash on unexpected battery event
platform/x86: uniwill-laptop: Rename FN lock and super key lock attrs
platform/x86: redmi-wmi: Add more hotkey mappings
platform/x86: alienware-wmi-wmax: Add G-Mode support to m18 laptops
platform/x86: hp-wmi: add Omen 14-fb1xxx (board 8E41) support
platform/x86: dell-wmi: Add audio/mic mute key codes
platform/x86: hp-wmi: Add Victus 16-d0xxx support
platform/x86: intel-hid: Enable 5-button array on ThinkPad X1 Fold 16 Gen 1
platform/x86: int3472: Handle GPIO type 0x10 (DOVDD)
...
Linus Torvalds [Fri, 6 Mar 2026 17:22:51 +0000 (09:22 -0800)]
Merge tag 'slab-for-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
- Fix for slab->stride truncation on 64k page systems due to short
type. It was not due to races and lack of barriers in the end. (Harry
Yoo)
- Fix for severe performance regression due to unnecessary sheaf refill
restrictions exposed by mempool allocation strategy. (Vlastimil
Babka)
- Stable fix for potential silent percpu sheaf flushing failures on
PREEMPT_RT. (Vlastimil Babka)
* tag 'slab-for-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: change stride type from unsigned short to unsigned int
mm/slab: allow sheaf refill if blocking is not allowed
slab: distinguish lock and trylock for sheaf_flush_main()
Linus Torvalds [Fri, 6 Mar 2026 17:16:39 +0000 (09:16 -0800)]
Merge tag 'pmdomain-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- rockchip: Fix PD_VCODEC for RK3588
- bcm: Fix broken reset status read for bcm2835
* tag 'pmdomain-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: rockchip: Fix PD_VCODEC for RK3588
pmdomain: bcm: bcm2835-power: Fix broken reset status read