Linus Torvalds [Fri, 20 Mar 2026 16:34:32 +0000 (09:34 -0700)]
Merge tag 'mtd/fixes-for-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
- In SPI NOR, there was an issue with the RDCR capability, leading to
several platforms no longer capable of using it for wrong reasons
(the follow-up commit renames the helper to avoid future confusion)
- NAND controller drivers needed to be improved to fix some timings, a
locking schenario and avoid certain operations during panic writes
- The Spear600 DT binding conversion was done partially, leading to
several warnings which have individually been fixed
- Tudor gets replaced by Takahiro for the SPI NOR maintainance
- Plus two more misc fixes
* tag 'mtd/fixes-for-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: pl353: make sure optimal timings are applied
mtd: spi-nor: Rename spi_nor_spimem_check_op()
mtd: spi-nor: Fix RDCR controller capability core check
mtd: rawnand: brcmnand: skip DMA during panic write
mtd: rawnand: serialize lock/unlock against other NAND operations
dt-bindings: mtd: st,spear600-smi: Fix example
dt-bindings: mtd: st,spear600-smi: #address/size-cells is mandatory
dt-bindings: mtd: st,spear600-smi: Fix description
mtd: rawnand: cadence: Fix error check for dma_alloc_coherent() in cadence_nand_init()
mtd: Avoid boot crash in RedBoot partition table parser
MAINTAINERS: add Takahiro Kuwano as SPI NOR reviewer
MAINTAINERS: remove Tudor Ambarus as SPI NOR maintainer
Linus Torvalds [Fri, 20 Mar 2026 16:29:03 +0000 (09:29 -0700)]
Merge tag 'iommu-fixes-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
"Intel VT-d:
- Abort all pending requests on dev_tlb_inv timeout to avoid
hardlockup
- Limit IOPF handling to PRI-capable device to avoid SVA attach
failure
AMD-Vi:
- Make sure identity domain is not used when SNP is active
Core fixes:
- Handle mapping IOVA 0x0 correctly
- Fix crash in SVA code
- Kernel-doc fix in IO-PGTable code"
* tag 'iommu-fixes-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/amd: Block identity domain when SNP enabled
iommu/sva: Fix crash in iommu_sva_unbind_device()
iommu/io-pgtable: fix all kernel-doc warnings in io-pgtable.h
iommu: Fix mapping check for 0x0 to avoid re-mapping it
iommu/vt-d: Only handle IOPF for SVA when PRI is supported
iommu/vt-d: Fix intel iommu iotlb sync hardlockup and retry
Linus Torvalds [Fri, 20 Mar 2026 16:23:01 +0000 (09:23 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's a small crop of fixes for the MPAM resctrl driver, a fix for
SCS/PAC patching with the AMDGPU driver and a page-table fix for
realms running with 52-bit physical addresses:
- Fix DWARF parsing for SCS/PAC patching to work with very large
modules (such as the amdgpu driver)
- Fixes to the mpam resctrl driver
- Fix broken handling of 52-bit physical addresses when sharing
memory from within a realm"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: realm: Fix PTE_NS_SHARED for 52bit PA support
arm_mpam: Force __iomem casts
arm_mpam: Disable preemption when making accesses to fake MSC in kunit test
arm_mpam: Fix null pointer dereference when restoring bandwidth counters
arm64/scs: Fix handling of advance_loc4
- Update maintainers for Hyper-V DRM driver (Saurabh Sengar)
- Misc clean up in MSHV crashdump code (Ard Biesheuvel, Uros Bizjak)
- Minor improvements to MSHV code (Mukesh R, Wei Liu)
- Revert not yet released MSHV scrub partition hypercall (Wei Liu)
* tag 'hyperv-fixes-signed-20260319' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
mshv: Fix error handling in mshv_region_pin
MAINTAINERS: Update maintainers for Hyper-V DRM driver
mshv: Fix use-after-free in mshv_map_user_memory error path
mshv: pass struct mshv_user_mem_region by reference
x86/hyperv: Use any general-purpose register when saving %cr2 and %cr8
x86/hyperv: Use current_stack_pointer to avoid asm() in hv_hvcrash_ctxt_save()
x86/hyperv: Save segment registers directly to memory in hv_hvcrash_ctxt_save()
x86/hyperv: Use __naked attribute to fix stackless C function
Revert "mshv: expose the scrub partition hypercall"
mshv: add arm64 support for doorbell & intercept SINTs
mshv: refactor synic init and cleanup
x86/hyperv: print out reserved vectors in hexadecimal
Dave Airlie [Fri, 20 Mar 2026 16:12:41 +0000 (02:12 +1000)]
Merge tag 'drm-xe-fixes-2026-03-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- A number of teardown fixes (Daniele, Matt Brost, Zhanjun, Ashutosh)
- Skip over non-leaf PTE for PRL generation (Brian)
- Fix an unitialized variable (Umesh)
- Fix a missing runtime PM reference (Sanjay)
Linus Torvalds [Fri, 20 Mar 2026 16:07:29 +0000 (09:07 -0700)]
Merge tag 'v7.0-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix reporting of i_blocks
- Fix Kerberos mounts with different usernames to same server
- Trivial comment cleanup
* tag 'v7.0-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix generic/694 due to wrong ->i_blocks
cifs: smb1: fix comment typo
smb: client: fix krb5 mount with username option
Linus Torvalds [Fri, 20 Mar 2026 16:03:37 +0000 (09:03 -0700)]
Merge tag 'v7.0-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Three use after free fixes (in close, in compounded ops, and in tree
disconnect)
- Multichannel fix
- return proper volume identifier (superblock uuid if available) in
FS_OBJECT_ID queries
* tag 'v7.0-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix use-after-free in durable v2 replay of active file handles
ksmbd: fix use-after-free of share_conf in compound request
ksmbd: use volume UUID in FS_OBJECT_ID_INFORMATION
ksmbd: unset conn->binding on failed binding request
ksmbd: fix share_conf UAF in tree_conn disconnect
Dave Airlie [Fri, 20 Mar 2026 15:52:29 +0000 (01:52 +1000)]
Merge tag 'drm-misc-fixes-2026-03-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A doc warning fix and a memory leak fix for vmwgfx, a deadlock fix and
interrupt handling fixes for imagination, a locking fix for
pagemap_until, a UAF fix for drm_dev_unplug, and a multi-channel audio
handling fix for dw-hdmi-qp.
Dave Airlie [Fri, 20 Mar 2026 15:43:57 +0000 (01:43 +1000)]
Merge tag 'drm-intel-fixes-2026-03-19' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Fix #15771: Screen corruption and stuttering on P14s w/ 3K display
- Fix for PSR entry setup frames count on rejected commit
- Fix OOPS if firmware is not loaded and suspend is attempted
- Fix unlikely NULL deref due to DC6 on probe
Dave Jiang [Thu, 19 Mar 2026 15:25:41 +0000 (08:25 -0700)]
cxl: Add endpoint decoder flags clear when PCI reset happens
When a PCI reset happens, the lock and enable flags of the CXL device
should be cleared to avoid stale state flags after reset. Add flag
clearing during cxl_reset_done() to clear the relevant endpoint
decoder flags for all decoders of the endpoint device.
Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319152541.2739343-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
ranges_to_free array should have enough room to store the entire EFI
memmap plus an extra element for NULL entry.
The calculation of this array size wrongly adds 1 to the overall size
instead of adding 1 to the number of elements.
Merge patch series "pid_namespace: make init creation more flexible"
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> says:
The first patch properly annotates accesses to ->child_reaper with
_ONCE macroses, to protect unlocked accesses from possible cpu/compiler
optimization problems.
The second patch makes sure that the init is always a first process in
the pid namespace, previously this was only checked for set_tid case.
The third patch allows to join pid namespace before pid namespace init
is created, that allows to create pid namespace by one process and then
create pid namespace init from another process after setns(). Please see
the detailed description in the patch commit message. It depends on the
second patch.
The forth and the final patch is a comprehansive test, that tests both
basic usecase of creating pid namespace and init separately, and a more
specific usecase which shows how we can improve clone3(set_tid)
usability after this change.
This change is generally useful as it makes clone3(set_tid) more
universal, and let's it work in all the cases evenly. Also it is highly
useful to CRIU to handle nested containers.
* patches from https://patch.msgid.link/20260318122157.280595-1-ptikhomirov@virtuozzo.com:
MAINTAINERS: add a new entry for testing pidns init creation via setns
selftests: Add tests for creating pidns init via setns
pid_namespace: allow opening pid_for_children before init was created
pid: check init is created first after idr alloc
pid_namespace: avoid optimization of accesses to ->child_reaper
Pavel Tikhomirov [Wed, 18 Mar 2026 12:21:52 +0000 (13:21 +0100)]
selftests: Add tests for creating pidns init via setns
First testcase "pidns_init_via_setns" checks that a process can become
Pid 1 (init) in a new Pid namespace created via unshare() and joined via
setns().
Second testcase "pidns_init_via_setns_set_tid" checks that during this
process we can use clone3() + set_tid and set the pid in both the new
and old pid namespaces (owned by different user namespaces). This test
requires root to run to avoid complex setup for wrapper userns.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
--
pidns_init_via_setns. Make pidns_init_via_setns_set_tid require root.
Pavel Tikhomirov [Wed, 18 Mar 2026 12:21:51 +0000 (13:21 +0100)]
pid_namespace: allow opening pid_for_children before init was created
This effectively gives us an ability to create the pid namespace init as
a child of the process (setns-ed to the pid namespace) different to the
process which created the pid namespace itself.
Original problem:
There is a cool set_tid feature in clone3() syscall, it allows you to
create process with desired pids on multiple pid namespace levels. Which
is useful to restore processes in CRIU for nested pid namespace case.
In nested container case we can potentially see this kind of pid/user
namespace tree:
So to create the "Process" and set pids {p0, p1, ... pn} for it on all
pid namespace levels we can use clone3() syscall set_tid feature, BUT
the syscall does not allow you to set pid on pid namespace levels you
don't have permission to. So basically you have to be in "User NS0" when
creating the "Process" to actually be able to set pids on all levels.
It is ok for almost any process, but with pid namespace init this does
not work, as currently we can only create pid namespace init and the pid
namespace itself simultaneously, so to make "Pid NSn" owned by "User
NSn" we have to be in the "User NSn".
We can't possibly be in "User NS0" and "User NSn" at the same time,
hence the problem.
Alternative solution:
Yes, for the case of pid namespace init we can use old and gold
/proc/sys/kernel/ns_last_pid interface on the levels lower than n. But
it is much more complicated and introduces tons of extra code to do. It
would be nice to make clone3() set_tid interface also aplicable to this
corner case.
Implementation:
Now when anyone can setns to the pid namespace before the creation of
init, and thus multiple processes can fork children to the pid
namespace, it is important that we enforce the first process created is
always pid namespace init. (Note that this was done by the previous
preparational patch as a standalon useful change.) We only allow other
processes after the init sets pid_namespace->child_reaper.
Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andrei Vagin <avagin@google.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
--
v2: Use *_ONCE for ->child_reaper accesses atomicity, and avoid taking
task_list lock for reading it. Rebase to master, and thus remove
now excess pidns_ready variable.
v3: Separate *_ONCE change and "init is first" checks into separate
commits.
v5: Add Andrei's review tag.
->child_reaper which can influence the pid namespace, so it looks like
the pid namespace is fully setup at the point when init sets
->child_reaper to receive more processes. Thus tasklist lock looks
excess in pidns_for_children_get()'s ->child_reaper check and it should
be safe not to have it in the corresponding check in alloc_pid()
(introduced earlier in this series).
Pavel Tikhomirov [Wed, 18 Mar 2026 12:21:50 +0000 (13:21 +0100)]
pid: check init is created first after idr alloc
This moves the condition (tid != 1 && !tmp->child_reaper) to after idr
alloc, so it not only covers that first process in pid namespace has pid
1 in case of clone3(set_tid) requesting wrong pid, but also if idr
itself gives wrong pid for some reason.
This could've been the case before this patch, when creating first
process the alloc_pid()->pidfs_add_pid() code path fails, so that the
idr->idr_next is non zero anymore and next process calling to
alloc_pid(), will get 2 as a pid from idr_alloc_cyclic(). Though thanks
to PIDNS_ADDING logic, free_pid() disables further pid allocation in
this case and it does not lead to any real problem.
Note: This is also a preparation for the next patch in the series, which
will introduce an ability of creating init from the task different to
the task which had created the pid namespace. Needed to make sure that
init is always first, even in this new case.
--
Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andrei Vagin <avagin@google.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Link: https://patch.msgid.link/20260318122157.280595-3-ptikhomirov@virtuozzo.com
v3: Split from main commit. Merge two checks of ->child_reaper into one.
v4: Update commit message about PIDNS_ADDING.
v5: Add Andrei's review tag. Signed-off-by: Christian Brauner <brauner@kernel.org>
Pavel Tikhomirov [Wed, 18 Mar 2026 12:21:49 +0000 (13:21 +0100)]
pid_namespace: avoid optimization of accesses to ->child_reaper
To avoid potential problems related to cpu/compiler optimizations around
->child_reaper, let's use WRITE_ONCE (additional to task_list lock)
everywhere we write it and use READ_ONCE where we read it without
explicit lock. Note: It also pairs with existing READ_ONCE with no lock
in nsfs_fh_to_dentry().
Also let's add ASSERT_EXCLUSIVE_WRITER before write to identify to KCSAN
that we don't expect any concurrent ->child_reaper modifications, and
those must be detected.
--
Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Link: https://patch.msgid.link/20260318122157.280595-2-ptikhomirov@virtuozzo.com
v3: Split from main commit. Add ASSERT_EXCLUSIVE_WRITER.
v5: Add one more READ_ONCE for access without lock in free_pid(). Signed-off-by: Christian Brauner <brauner@kernel.org>
Tomas Glozar [Tue, 10 Mar 2026 16:07:25 +0000 (17:07 +0100)]
rtla: Fix segfault on multiple SIGINTs
Detach stop_trace() from SIGINT/SIGALRM on tool clean-up to prevent it
from crashing RTLA by accessing freed memory.
This prevents a crash when multiple SIGINTs are received.
Fixes: d6899e560366 ("rtla/timerlat_hist: Abort event processing on second signal") Fixes: 80967b354a76 ("rtla/timerlat_top: Abort event processing on second signal") Reviewed-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260310160725.144443-1-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Joanne Koong [Fri, 20 Mar 2026 00:51:45 +0000 (17:51 -0700)]
writeback: don't block sync for filesystems with no data integrity guarantees
Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse). For superblocks with this
flag set, sync kicks off writeback of dirty inodes but does not wait
for the flusher threads to complete the writeback.
This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one.
Having this flag at the superblock level also allows us to skip having
to iterate every dirty inode in wait_sb_inodes() only to skip each inode
individually.
Prior to this commit, mappings with no data integrity guarantees skipped
waiting on writeback completion but still waited on the flusher threads
to finish initiating the writeback. Waiting on the flusher threads is
unnecessary. This commit kicks off writeback but does not wait on the
flusher threads. This change properly addresses a recent report [1] for
a suspend-to-RAM hang seen on fuse-overlayfs that was caused by waiting
on the flusher threads to finish:
On fuse this is problematic because there are paths that may cause the
flusher thread to block (eg if systemd freezes the user session cgroups
first, which freezes the fuse daemon, before invoking the kernel
suspend. The kernel suspend triggers ->write_node() which on fuse issues
a synchronous setattr request, which cannot be processed since the
daemon is frozen. Or if the daemon is buggy and cannot properly complete
writeback, initiating writeback on a dirty folio already under writeback
leads to writeback_get_folio() -> folio_prepare_writeback() ->
unconditional wait on writeback to finish, which will cause a hang).
This commit restores fuse to its prior behavior before tmp folios were
removed, where sync was essentially a no-op.
Yuto Ohnuki [Thu, 26 Feb 2026 20:18:58 +0000 (20:18 +0000)]
fs: remove stale and duplicate forward declarations
Remove the following unnecessary forward declarations from fs.h, which
improves maintainability.
- struct hd_geometry: became unused in fs.h when
block_device_operations was moved to blkdev.h in commit 08f858512151
("[PATCH] move block_device_operations to blkdev.h"). The forward
declaration is now added to blkdev.h where it is actually used.
- struct iovec: became unused when aio_read/aio_write were removed in
commit 8436318205b9 ("->aio_read and ->aio_write removed")
- struct iov_iter: duplicate forward declaration. This removes the
redundant second declaration, added in commit 293bc9822fa9
("new methods: ->read_iter() and ->write_iter()")
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512301303.s7YWTZHA-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202512302139.Wl0soAlz-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202512302105.pmzYfmcV-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202512302125.FNgHwu5z-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202512302108.nIV8r5ES-lkp@intel.com/ Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Link: https://patch.msgid.link/20260226201857.27310-2-ytohnuki@amazon.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
This patchset adds new snd_soc_component_regmap_val_bytes(),
and remove component->val_bytes / snd_soc_component_setup_regmap().
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
ASoC: soc-ops: use snd_soc_component_regmap_val_bytes()
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
ASoC: tegra: use snd_soc_component_regmap_val_bytes()
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
sof_parse_token_sets() accepts array->size values that can be invalid
for a vendor tuple array header. In particular, a zero size does not
advance the parser state and can lead to non-progress parsing on
malformed topology data.
Validate array->size against the minimum header size and reject values
smaller than sizeof(*array) before parsing. This preserves behavior for
valid topologies and hardens malformed-input handling.
Thomas Gleixner [Tue, 17 Mar 2026 09:01:54 +0000 (10:01 +0100)]
clocksource: Rewrite watchdog code completely
The clocksource watchdog code has over time reached the state of an
impenetrable maze of duct tape and staples. The original design, which was
made in the context of systems far smaller than today, is based on the
assumption that the to be monitored clocksource (TSC) can be trivially
compared against a known to be stable clocksource (HPET/ACPI-PM timer).
Over the years it turned out that this approach has major flaws:
- Long delays between watchdog invocations can result in wrap arounds
of the reference clocksource
- Scalability of the reference clocksource readout can degrade on large
multi-socket systems due to interconnect congestion
This was addressed with various heuristics which degraded the accuracy of
the watchdog to the point that it fails to detect actual TSC problems on
older hardware which exposes slow inter CPU drifts due to firmware
manipulating the TSC to hide SMI time.
To address this and bring back sanity to the watchdog, rewrite the code
completely with a different approach:
1) Restrict the validation against a reference clocksource to the boot
CPU, which is usually the CPU/Socket closest to the legacy block which
contains the reference source (HPET/ACPI-PM timer). Validate that the
reference readout is within a bound latency so that the actual
comparison against the TSC stays within 500ppm as long as the clocks
are stable.
2) Compare the TSCs of the other CPUs in a round robin fashion against
the boot CPU in the same way the TSC synchronization on CPU hotplug
works. This still can suffer from delayed reaction of the remote CPU
to the SMP function call and the latency of the control variable cache
line. But this latency is not affecting correctness. It only affects
the accuracy. With low contention the readout latency is in the low
nanoseconds range, which detects even slight skews between CPUs. Under
high contention this becomes obviously less accurate, but still
detects slow skews reliably as it solely relies on subsequent readouts
being monotonically increasing. It just can take slightly longer to
detect the issue.
3) Rewrite the watchdog test so it tests the various mechanisms one by
one and validating the result against the expectation.
Kit Dallege [Sun, 15 Mar 2026 17:10:01 +0000 (18:10 +0100)]
dma-mapping: fix false kernel-doc comment marker
Change /** to /* for the DMA attributes list comment in dma-mapping.h.
The comment is not a kernel-doc structured comment and should not use
the kernel-doc opening marker.
Juergen Gross [Tue, 14 Oct 2025 11:28:15 +0000 (13:28 +0200)]
xen/privcmd: add boot control for restricted usage in domU
When running in an unprivileged domU under Xen, the privcmd driver
is restricted to allow only hypercalls against a target domain, for
which the current domU is acting as a device model.
Add a boot parameter "unrestricted" to allow all hypercalls (the
hypervisor will still refuse destructive hypercalls affecting other
guests).
Make this new parameter effective only in case the domU wasn't started
using secure boot, as otherwise hypercalls targeting the domU itself
might result in violating the secure boot functionality.
This is achieved by adding another lockdown reason, which can be
tested to not being set when applying the "unrestricted" option.
This is part of XSA-482
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
Leon Romanovsky [Mon, 16 Mar 2026 19:06:52 +0000 (21:06 +0200)]
mm/hmm: Indicate that HMM requires DMA coherency
HMM is fundamentally about allowing a sophisticated device to perform DMA
directly to a process’s memory while the CPU accesses that same memory at
the same time. It is similar to SVA but does not rely on IOMMU support.
Because the entire model depends on concurrent access to shared memory, it
fails as a uAPI if SWIOTLB substitutes the memory or if the CPU caches are
not coherent with DMA.
Until now, there has been no reliable way to report this, and various
approximations have been used:
int hmm_dma_map_alloc(struct device *dev, struct hmm_dma_map *map,
size_t nr_entries, size_t dma_entry_size)
{
<...>
/*
* The HMM API violates our normal DMA buffer ownership rules and can't
* transfer buffer ownership. The dma_addressing_limited() check is a
* best approximation to ensure no swiotlb buffering happens.
*/
dma_need_sync = !dev->dma_skip_sync;
if (dma_need_sync || dma_addressing_limited(dev))
return -EOPNOTSUPP;
So let's mark mapped buffers with DMA_ATTR_REQUIRE_COHERENT attribute
to prevent silent data corruption if someone tries to use hmm in a system
with swiotlb or incoherent DMA
Leon Romanovsky [Mon, 16 Mar 2026 19:06:51 +0000 (21:06 +0200)]
RDMA/umem: Tell DMA mapping that UMEM requires coherency
The RDMA subsystem exposes DMA regions through the verbs interface, which
assumes a coherent system. Use the DMA_ATTR_REQUIRE_COHERENCE attribute
to ensure coherency and avoid taking the SWIOTLB path.
The RDMA verbs programming model resembles HMM and assumes concurrent DMA
and CPU access to userspace memory. The hardware and programming model
support "one-sided" operations initiated remotely without any local CPU
involvement or notification. These include ATOMIC compare/swap, READ, and
WRITE. A remote CPU can use these operations to traverse data structures,
manipulate locks, and perform similar tasks without the host CPU’s
awareness. If SWIOTLB substitutes memory or DMA is not cache coherent,
these use cases break entirely.
In-kernel RDMA is fine with incoherent mappings because kernel users do
not rely on one-sided operations in ways that would expose these issues.
A given region may also be exported multiple times, which can trigger
warnings about cacheline overlaps. These warnings are suppressed when the
new attribute is used.
Leon Romanovsky [Mon, 16 Mar 2026 19:06:50 +0000 (21:06 +0200)]
iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute
Add support for the DMA_ATTR_REQUIRE_COHERENT attribute to the exported
functions. This attribute indicates that the SWIOTLB path must not be
used and that no sync operations should be performed.
Juergen Gross [Thu, 9 Oct 2025 14:54:58 +0000 (16:54 +0200)]
xen/privcmd: restrict usage in unprivileged domU
The Xen privcmd driver allows to issue arbitrary hypercalls from
user space processes. This is normally no problem, as access is
usually limited to root and the hypervisor will deny any hypercalls
affecting other domains.
In case the guest is booted using secure boot, however, the privcmd
driver would be enabling a root user process to modify e.g. kernel
memory contents, thus breaking the secure boot feature.
The only known case where an unprivileged domU is really needing to
use the privcmd driver is the case when it is acting as the device
model for another guest. In this case all hypercalls issued via the
privcmd driver will target that other guest.
Fortunately the privcmd driver can already be locked down to allow
only hypercalls targeting a specific domain, but this mode can be
activated from user land only today.
The target domain can be obtained from Xenstore, so when not running
in dom0 restrict the privcmd driver to that target domain from the
beginning, resolving the potential problem of breaking secure boot.
This is XSA-482
Reported-by: Teddy Astie <teddy.astie@vates.tech> Fixes: 1c5de1939c20 ("xen: add privcmd driver") Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- defer reading from Xenstore if Xenstore isn't ready yet (Jan Beulich)
- wait in open() if target domain isn't known yet
- issue message in case no target domain found (Jan Beulich)
The mapping buffers which carry this attribute require DMA coherent system.
This means that they can't take SWIOTLB path, can perform CPU cache overlap
and doesn't perform cache flushing.
Sources already have SPDX-FileCopyrightText (~40 instances) and more
appear on the mailing list, so document that it is allowed. On the
other hand SPDX defines several other tags like SPDX-FileType, so add
checkpatch rule to narrow desired tags only to two of them - license and
copyright. That way no new tags would sneak in to the kernel unnoticed.
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Joe Perches <joe@perches.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Mon, 16 Mar 2026 19:06:47 +0000 (21:06 +0200)]
dma-mapping: Clarify valid conditions for CPU cache line overlap
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to better reflect that it
is debugging aid to inform DMA core code that CPU cache line overlaps are
allowed, and refine the documentation describing its use.
Leon Romanovsky [Mon, 16 Mar 2026 19:06:46 +0000 (21:06 +0200)]
dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output
Tracing prints decoded DMA attribute flags, but it does not yet
include the recently added DMA_ATTR_CPU_CACHE_CLEAN. Add support
for decoding and displaying this attribute in the trace output.
Leon Romanovsky [Mon, 16 Mar 2026 19:06:45 +0000 (21:06 +0200)]
dma-debug: Allow multiple invocations of overlapping entries
Repeated DMA mappings with DMA_ATTR_CPU_CACHE_CLEAN trigger the
following splat. This prevents using the attribute in cases where a DMA
region is shared and reused more than seven times.
arm64: dts: renesas: rzt2h-rzn2h-evk: Fix GMAC pins sort order
Restore alphabetical sort order of the pin control subnodes by
exchanging the gmac1-pins and gmac2-pins nodes.
While at it, fix the index in an incorrect "GMAC2" comment.
Add device tree overlay to support the MayQueen PixPaper e-paper display
on the Renesas RZ/V2H EVK (KAKIP board). The display is connected via
SPI0 interface and uses GPIO pins for reset, busy, and DC control.
The overlay configures:
- RSPI0 pinmux for SPI communication (MOSI, MISO, CLK, CE0),
- PixPaper display device with proper GPIO assignments,
- SPI frequency set to 1MHz for stable operation.
This enables support for the Open-EP Community pixpaper-213-c module on
the RZ/V2H platform.
Lad Prabhakar [Fri, 23 Jan 2026 22:59:55 +0000 (22:59 +0000)]
arm64: dts: renesas: r9a09g077m44-rzt2h-evk: Clarify SD0 power jumpers
Clarify the board setup requirements for using SDHI0 on the RZ/T2H EVK by
documenting the CN78 jumper positions needed to supply SD0 power for
either the default eMMC configuration or the SD card slot configuration.
Fix a malformed MODULE_AUTHOR macro in the RZ/G2L USBPHY control driver
where the author's name and opening angle bracket were missing, leaving
only the email address with a stray closing >. Correct it to the standard
Name <email> format.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Nitin Gote [Tue, 17 Mar 2026 08:00:59 +0000 (13:30 +0530)]
drm/xe: Extend Wa_14026781792 for xe3lpg
Wa_14026781792 applies to all graphics versions from 30.00
through 35.10 (inclusive). Since there are no IPs between
30.05 and 35.10, consolidate the RTP rules into a single
GRAPHICS_VERSION_RANGE(3000, 3510).
v2: (Matt)
- There are no IPs between 30.05 and 35.10 either,
So, consolidate this into a single GRAPHICS_VERSION_RANGE(3000, 3510)
- Also move it up to the top part of the table
Varun Gupta [Tue, 17 Mar 2026 04:04:47 +0000 (09:34 +0530)]
drm/xe/xe3p_lpg: Add Wa_16029437861
Wa_16029437861 requires disabling COAMA atomics by setting bit 22
(SQ_DISABLE_COAMA) of L3SQCREG2 (0xb104) for Xe3p_LPG graphics
version 35.10 stepping A0..B0. This bit is already set by the existing
Wa_14026144927 entry, so add the new WA ID to the same implementation.
Luca Ceresoli [Tue, 10 Mar 2026 12:13:23 +0000 (13:13 +0100)]
drm/bridge: add drm_bridge_clear_and_put()
Drivers having a struct drm_bridge pointer pointing to a bridge in many
cases hold that reference until the owning device is removed. In those
cases the reference to the bridge can be put in the .remove callback
(possibly using devm actions) or in the .destroy func (possibly with the
help of struct drm_bridge::next_bridge). At those moments the driver should
not be operating anymore and won't dereference the bridge pointer after it
is put.
However there are cases when drivers need to stop holding a reference to a
bridge even when their device is not being removed. This is the case for
bridge hot-unplug, when a bridge is removed but the previous entity (bridge
or encoder) is staying. In such case the "previous entity" needs to put it
but cannot do it via devm or .destroy, because it is not being removed.
However this is risky because there is a time window between the two lines
where the reference is put, and thus the bridge could be deallocated, but
the pointer is still assigned. If other functions of the same driver were
invoked concurrently they might dereference my_priv->some_bridge during
that window, resulting in use-after-free.
A correct solution is to clear the pointer before putting the reference,
but that needs a temporary variable:
Thomas Hellström [Tue, 17 Mar 2026 14:18:55 +0000 (15:18 +0100)]
drm/ttm: Avoid invoking the OOM killer when reading back swapped content
In situations where the system is very short on RAM, the shmem
readback from swap-space may invoke the OOM killer.
However, since this might be a recoverable situation where the caller
is indicating this by setting
struct ttm_operation_ctx::gfp_retry_mayfail to true, adjust the gfp
value used by the allocation accordingly.
Thomas Hellström [Tue, 17 Mar 2026 14:18:54 +0000 (15:18 +0100)]
drm/ttm: Don't spam the log on buffer object backing store allocation failure
If the struct ttm_operation_ctx::gfp_retry_mayfail is true,
buffer object backing store allocation failures are expected to
silently fail with an error code to the caller. But currently an
elaborate warning is printed to the system log.
Maxime Ripard [Tue, 24 Feb 2026 16:10:29 +0000 (17:10 +0100)]
drm/atomic: Remove state argument to drm_atomic_private_obj_init
Now that all drm_private_objs users have been converted to use
atomic_create_state instead of the old ad-hoc initialization, we can
remove the state parameter from drm_private_obj_init and the fallback
code.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260224-drm-private-obj-reset-v5-4-5a72f8ec9934@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
Maxime Ripard [Tue, 24 Feb 2026 16:10:28 +0000 (17:10 +0100)]
drm/tegra: Switch private_obj initialization to atomic_create_state
The tegra driver relies on a drm_private_obj, that is initialized by
allocating and initializing a state, and then passing it to
drm_private_obj_init.
Since we're gradually moving away from that pattern to the more
established one relying on a atomic_create_state implementation, let's
migrate this instance to the new pattern.
Maxime Ripard [Tue, 24 Feb 2026 16:10:27 +0000 (17:10 +0100)]
drm/omapdrm: Switch private_obj initialization to atomic_create_state
The omapdrm driver relies on a drm_private_obj, that is initialized by
allocating and initializing a state, and then passing it to
drm_private_obj_init.
Since we're gradually moving away from that pattern to the more
established one relying on a atomic_create_state implementation, let's
migrate this instance to the new pattern.
Maxime Ripard [Tue, 24 Feb 2026 16:10:26 +0000 (17:10 +0100)]
drm/amdgpu: Switch private_obj initialization to atomic_create_state
The amdgpu driver relies on a drm_private_obj, that is initialized by
allocating and initializing a state, and then passing it to
drm_private_obj_init.
Since we're gradually moving away from that pattern to the more
established one relying on a atomic_create_state implementation, let's
migrate this instance to the new pattern.
Damien Le Moal [Fri, 20 Mar 2026 03:48:01 +0000 (12:48 +0900)]
ata: libata-scsi: report correct sense field pointer in ata_scsiop_maint_in()
Commit 4ab7bb976343 ("ata: libata-scsi: Refactor ata_scsiop_maint_in()")
modified ata_scsiop_maint_in() to directly call
ata_scsi_set_invalid_field() to set the field pointer of the sense data
of a failed MAINTENANCE IN command. However, in the case of an invalid
command format, the sense data field incorrectly indicates byte 1 of
the CDB. Fix this to indicate byte 2 of the command.
drm/shmem-helper: Fix huge page mapping in fault handler
When running ./tools/testing/selftests/mm/split_huge_page_test multiple
times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
/sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
occurs:
This happens when two concurrent page faults occur within the same PMD range.
One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
triggered because a pmd_trans_huge is not expected when walking the page
table inside vmf_insert_pfn.
Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
PMD-sized mappings are handled through the appropriate huge page fault path.
Fixes: 211b9a39f261 ("drm/shmem-helper: Map huge pages in fault handler") Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patch.msgid.link/20260319015224.46896-1-pedrodemargomes@gmail.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Karol Wachowski [Wed, 18 Mar 2026 09:39:27 +0000 (10:39 +0100)]
accel/ivpu: Perform engine reset instead of device recovery on TDR
Replace full device recovery on TDR timeout with per-context abort,
allowing individual context handling instead of resetting the entire
device.
Extend ivpu_jsm_reset_engine() to return the list of contexts impacted
by the engine reset and use that information to abort only the affected
contexts.
Only check for potentially faulty contexts when the engine reset was not
triggered by an MMU fault or a job completion error status. This prevents
misidentifying non-guilty contexts that happened to be running at the
time of the fault.
Trigger full device recovery if no contexts were marked by engine reset
if triggered by job completion timeout, as there is no way to identify
guilty one.
Add engine reset counter to debugfs for engine resets bookkeeping
for debugging/testing purposes.
platform/chrome: cros_usbpd_logger: Simplify with devm
Simplify the driver by using devm interfaces, which allow to drop
probe() error paths and the remove() callback.
Change is not equivalent in the workqueue itself: use non-legacy API
which does not set (__WQ_LEGACY | WQ_MEM_RECLAIM). The workqueue is
used to update logs, thus there is no point to run it for memory
reclaim.
Yannis Bolliger [Fri, 17 Oct 2025 19:30:01 +0000 (19:30 +0000)]
extcon: usbc-tusb320: Make typec-power-opmode optional
The driver returned an error in the probe function when a usb c
connector is configured in the DT without a "typec-power-opmode"
property. This property is used to initialize the CURRENT_MODE_ADVERTISE
register of the TUSB320, which is unused when operating as a UFP.
Requiring this property causes unnecessary configuration overhead and
inconsistency with the USB connector DT bindings, which do not specify
it as required.
This change makes typec-power-opmode optional. When the property is not
present, the driver will skip programming the CURRENT_MODE_ADVERTISE
register and rely on the hardware default.
Xu Yang [Fri, 26 Sep 2025 02:53:09 +0000 (10:53 +0800)]
extcon: ptn5150: Support USB role switch via connector fwnode
Since the PTN5150 is a Type-C chip, it's common to describe related
properties under the connector node. To align with this, the port
node will be located under the connector node in the future.
To support this layout, retrieve the USB role switch using the
connector's fwnode. For compatibility with existing device trees,
keep the usb_role_switch_get() function.
Michael Wu [Fri, 24 Oct 2025 02:49:46 +0000 (10:49 +0800)]
extcon: Fixed sysfs duplicate filename issue
With current extcon_dev_unregister() timing, ida_free is before
device_unregister(), that may cause current id re-alloc to another
device in extcon_dev_register() context but sysfs filename path not
removal completed yet.
The right timing shows below:
on extcon_dev_register: ida_alloc() -> device_register()
on extcon_dev_unregister: device_unregister() -> ida_free()
stack information when an error occurs:
sysfs: cannot create duplicate filename '/class/extcon/extcon1'
Call trace:
sysfs_warn_dup+0x68/0x88
sysfs_do_create_link_sd+0x94/0xdc
sysfs_create_link+0x30/0x48
device_add_class_symlinks+0xb4/0x12c
device_add+0x1e0/0x48c
device_register+0x20/0x34
extcon_dev_register+0x3b8/0x5c4
Fixes: 7bba9e81a6fb ("extcon: Use unique number for the extcon device ID") Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Wu <michael@allwinnertech.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Link: https://lore.kernel.org/lkml/20251024024946.16618-1-michael@allwinnertech.com/
extcon: int3496: replace use of system_wq with system_percpu_wq
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
This patch continues the effort to refactor worqueue APIs, which has begun
with the change introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
system_wq should be the per-cpu workqueue, yet in this name nothing makes
that clear, so replace system_wq with system_percpu_wq.
The old wq (system_wq) will be kept for a few release cycles.
Xu Yang [Sat, 15 Nov 2025 02:59:05 +0000 (10:59 +0800)]
extcon: ptn5150: handle pending IRQ events during system resume
When the system is suspended and ptn5150 wakeup interrupt is disabled,
any changes on ptn5150 will only be record in interrupt status
registers and won't fire an IRQ since its trigger type is falling
edge. So the HW interrupt line will keep at low state and any further
changes won't trigger IRQ anymore. To fix it, this will schedule a
work to check whether any IRQ are pending and handle it accordingly.
Fixes: 4ed754de2d66 ("extcon: Add support for ptn5150 extcon driver") Cc: stable@vger.kernel.org Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Link: https://lore.kernel.org/lkml/20251115025905.1395347-1-xu.yang_2@nxp.com/
Thinh Nguyen [Sat, 14 Mar 2026 01:17:40 +0000 (01:17 +0000)]
scsi: target: file: Use kzalloc_flex for aio_cmd
The target_core_file doesn't initialize the aio_cmd->iocb for the
ki_write_stream. When a write command fd_execute_rw_aio() is executed,
we may get a bogus ki_write_stream value, causing unintended write
failure status when checking iocb->ki_write_stream > max_write_streams
in the block device.
Let's just use kzalloc_flex when allocating the aio_cmd and let
ki_write_stream=0 to fix this issue.
Yihang Li [Tue, 17 Mar 2026 06:31:47 +0000 (14:31 +0800)]
scsi: scsi_transport_sas: Fix the maximum channel scanning issue
After commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard
and multi-channel scans"), if the device supports multiple channels (0 to
shost->max_channel), user_scan() invokes updated sas_user_scan() to perform
the scan behavior for a specific transfer. However, when the user
specifies shost->max_channel, it will return -EINVAL, which is not
expected.
Fix and support specifying the scan shost->max_channel for scanning.
Fixes: 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans") Signed-off-by: Yihang Li <liyihang9@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20260317063147.2182562-1-liyihang9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Josef Bacik [Tue, 17 Mar 2026 00:23:29 +0000 (20:23 -0400)]
scsi: target: tcm_loop: Drain commands in target_reset handler
tcm_loop_target_reset() violates the SCSI EH contract: it returns SUCCESS
without draining any in-flight commands. The SCSI EH documentation
(scsi_eh.rst) requires that when a reset handler returns SUCCESS the driver
has made lower layers "forget about timed out scmds" and is ready for new
commands. Every other SCSI LLD (virtio_scsi, mpt3sas, ipr, scsi_debug,
mpi3mr) enforces this by draining or completing outstanding commands before
returning SUCCESS.
Because tcm_loop_target_reset() doesn't drain, the SCSI EH reuses in-flight
scsi_cmnd structures for recovery commands (e.g. TUR) while the target core
still has async completion work queued for the old se_cmd. The memset in
queuecommand zeroes se_lun and lun_ref_active, causing
transport_lun_remove_cmd() to skip its percpu_ref_put(). The leaked LUN
reference prevents transport_clear_lun_ref() from completing, hanging
configfs LUN unlink forever in D-state:
INFO: task rm:264 blocked for more than 122 seconds.
rm D 0 264 258 0x00004000
Call Trace:
__schedule+0x3d0/0x8e0
schedule+0x36/0xf0
transport_clear_lun_ref+0x78/0x90 [target_core_mod]
core_tpg_remove_lun+0x28/0xb0 [target_core_mod]
target_fabric_port_unlink+0x50/0x60 [target_core_mod]
configfs_unlink+0x156/0x1f0 [configfs]
vfs_unlink+0x109/0x290
do_unlinkat+0x1d5/0x2d0
Fix this by making tcm_loop_target_reset() actually drain commands:
1. Issue TMR_LUN_RESET via tcm_loop_issue_tmr() to drain all commands that
the target core knows about (those not yet CMD_T_COMPLETE).
2. Use blk_mq_tagset_busy_iter() to iterate all started requests and
flush_work() on each se_cmd — this drains any deferred completion work
for commands that already had CMD_T_COMPLETE set before the TMR (which
the TMR skips via __target_check_io_state()). This is the same pattern
used by mpi3mr, scsi_debug, and libsas to drain outstanding commands
during reset.
Eric Biggers [Mon, 16 Mar 2026 22:36:31 +0000 (15:36 -0700)]
scsi: lpfc: Use the crc32c() function
lpfc_cgn_calc_crc32(data, size, crc) is really just an open-coded version
of ~crc32c(bitrev32(crc), data, size). However, all callers pass crc ==
~0, so it can be simplified even further to just ~crc32c(~0, data, size).
Remove the crc argument and implement it that way.
While we're at it, also use proper types in the function prototype.
Tyllis Xu [Sat, 14 Mar 2026 17:01:50 +0000 (12:01 -0500)]
scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
A malicious or compromised VIO server can return a num_written value in the
discover targets MAD response that exceeds max_targets. This value is
stored directly in vhost->num_targets without validation, and is then used
as the loop bound in ibmvfc_alloc_targets() to index into disc_buf[], which
is only allocated for max_targets entries. Indices at or beyond max_targets
access kernel memory outside the DMA-coherent allocation. The
out-of-bounds data is subsequently embedded in Implicit Logout and PLOGI
MADs that are sent back to the VIO server, leaking kernel memory.
Fix by clamping num_written to max_targets before storing it.
Fixes: 072b91f9c651 ("[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Tyllis Xu <LivelyCarpet87@gmail.com> Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com> Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Link: https://patch.msgid.link/20260314170151.548614-1-LivelyCarpet87@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
vamshi gajjela [Tue, 10 Mar 2026 19:03:08 +0000 (00:33 +0530)]
scsi: ufs: core: Handle MCQ IAG events
Add support for handling aggregation-based interrupts when operating in MCQ
mode.
In legacy interrupt mode, an IE.IAGES is triggered when the counter or
timer threshold is reached. To manage this, the handler now resets the
aggregation counter and timer by writing to the MCQIACRy.CTR register.
Since the register layout of MCQIACRy is identical to the existing UTRIACR
register, this implementation reuses the previously defined bitfield masks
to maintain consistency and reduce code duplication.
Extend ufshcd_handle_mcq_cq_events() with a boolean iag parameter. If set,
the handler resets the MCQ IAG counter and timer.
Define MCQ_IAG_EVENT_STATUS (0x200000) and include it in
UFSHCD_ENABLE_MCQ_INTRS to ensure the interrupt is unmasked during
initialization.
scsi: ses: Handle positive SCSI error from ses_recv_diag()
ses_recv_diag() can return a positive value, which also means that an
error happened, so do not only test for negative values.
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable <stable@kernel.org> Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://patch.msgid.link/2026022301-bony-overstock-a07f@gregkh Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
net: fjes: Drop fjes_acpi_driver and rework initialization
The ACPI driver interface used by the Fujitsu Extended Socket (fjes)
Network Device driver is redundant because its only role is to create
a platform device the fjes platform driver can bind to, which can be
done already at the module initialization time.
Namely, acpi_find_extended_socket_device() looks for the requisite ACPI
device object anyway and it may as well check its resources, and the
platform device can be created when the ACPI object in question
has been found (and it can be freed when the module is unloaded).
Moreover, as a rule, it is better to avoid binding drivers directly to
ACPI device objects [1].
Accordingly, drop fjes_acpi_driver, adjust the module initialization
and exit code as per the above and set the fwnode for the fjes platform
device to point to the corresponding ACPI device object as its ACPI
companion.
While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.
====================
net: stmmac: descriptor cleanups part 2
Part 2 of the stmmac descriptor cleanups.
- rename "priv->mode" to be more descriptive, and do the same in
function arguments.
- simplify descriptor allocation/initialisation/freeing
- use more descriptive local variable names in stmmac_xmit()
- STMMAC_GET_ENTRY() doesn't get an entry, it moves to the next one.
Describe this in the macro name.
====================
STMMAC_GET_ENTRY() doesn't describe what this macro is doing - it is
incrementing the provided index for the circular array of descriptors.
Replace "GET" with "NEXT" as this better describes the action here.
Rather than having separate branches to handle the different types of
descriptors, use the helper functions to calculate the total size of
the DMA descriptors.
Use this to allocate or free the descriptor array, and use a local
variable to hold the address of the descriptor array, so we only need
one dma_alloc_coherent() or dma_free_coherent() call in these paths.
Also do the same for the receive ring initialisation. The transmit
ring can't be converted as there is a case where stmmac_mode_init()
is not called.
priv->mode doesn't describe what it refers to, it is whether we operate
the DMA descriptors as a ring or chain. It is also difficult to grep for
as there are several "mode" struct members. Add "descriptor_" prefix.
net: ntb_netdev: add SPDX tag and remove boilerplate license text
Add SPDX-License-Identifier tag to reflect the dual
GPL-2.0-only/BSD-3-Clause license, remove the redundant
boilerplate license text and contact information, keeping
only the driver description.