Long Li [Wed, 25 Mar 2026 19:40:57 +0000 (12:40 -0700)]
RDMA/mana_ib: Disable RX steering on RSS QP destroy
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss()
destroys the RX WQ objects but does not disable vPort RX steering in
firmware. This leaves stale steering configuration that still points to
the destroyed RX objects.
If traffic continues to arrive (e.g. peer VM is still transmitting) and
the VF interface is subsequently brought up (mana_open), the firmware
may deliver completions using stale CQ IDs from the old RX objects.
These CQ IDs can be reused by the ethernet driver for new TX CQs,
causing RX completions to land on TX CQs.
Fix this by disabling vPort RX steering before destroying RX WQ objects.
Note that mana_fence_rqs() cannot be used here because the fence
completion is delivered on the CQ, which is polled by user-mode (e.g.
DPDK) and not visible to the kernel driver.
Refactor the disable logic into a shared mana_disable_vport_rx() in
mana_en, exported for use by mana_ib, replacing the duplicate code.
The ethernet driver's mana_dealloc_queues() is also updated to call
this common function.
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Leon Romanovsky [Wed, 25 Mar 2026 18:16:03 +0000 (20:16 +0200)]
RDMA/mlx4: Restrict external umem for CQ when copy_to_user() is used
When the mlx4 firmware reports the MLX4_DEV_CAP_FLAG2_SW_CQ_INIT capability,
libmlx4 from the rdma-core package expects the driver to initialize memory
at the address provided in the buf_addr parameter of ucmd.
This behavior cannot be supported by any external umem implementation, so
restrict it accordingly.
The function hfi1_destroy_qp() was removed in commit 75261cc6ab66 ("staging/rdma/hfi1: Remove destroy qp verb") in
favor of the rdmavt generic rvt_destroy_qp(). Two comments
still reference hfi1_destroy_qp() as the waiter that
rvt_put_qp() will wake up. As Leon Romanovsky noted, these
comments add no value. Remove them.
Leon Romanovsky [Wed, 18 Mar 2026 10:08:50 +0000 (12:08 +0200)]
RDMA/bnxt_re: Simplify bnxt_re_init_depth() callers and implementation
All callers of bnxt_re_init_depth() compute the minimum between its return
value and another internal variable, often mixing variable types in the
process. Clean this up by making the logic simpler and more readable.
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Kexin Sun [Sat, 21 Mar 2026 10:58:59 +0000 (18:58 +0800)]
RDMA/uverbs: Update outdated reference to remove_commit_idr_uobject()
The function remove_commit_idr_uobject() was split into
destroy_hw_idr_uobject() and remove_handle_idr_uobject() by
commit 0f50d88a6e9a ("IB/uverbs: Allow all DESTROY commands
to succeed after disassociate"). The kref put that the
comment refers to now lives in remove_handle_idr_uobject().
Update the stale reference.
Also update "allocated this IDR with a NULL object" to
"allocated this XArray entry with a NULL pointer" to match
the actual data structure (xa_store) and the wording already
used two lines below ("transfers our kref on uobj to the
XArray").
Leon Romanovsky [Thu, 19 Mar 2026 15:22:21 +0000 (17:22 +0200)]
RDMA: Properly propagate the number of CQEs as unsigned int
Instead of checking whether the number of CQEs is negative or zero, fix the
.resize_user_cq() declaration to use unsigned int. This better reflects the
expected value range. The sanity check is then handled correctly in ib_uverbs.
Zhu Yanjun [Fri, 13 Mar 2026 02:30:57 +0000 (19:30 -0700)]
RDMA/rxe: Support RDMA link creation and destruction per net namespace
After introducing dellink handling and per-net namespace management
for IPv4 and IPv6 sockets, extend rxe to create and destroy RDMA links
within each network namespace.
With this change, RDMA links can be instantiated both in init_net and
in other network namespaces. The lifecycle of the RDMA link is now tied
to the corresponding namespace and is properly cleaned up when the
namespace or link is removed.
This ensures rxe behaves correctly in multi-namespace environments and
keeps socket and RDMA link resources consistent across namespace
creation and teardown.
Zhu Yanjun [Fri, 13 Mar 2026 02:30:56 +0000 (19:30 -0700)]
RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
Add a net namespace implementation file to rxe to manage the
lifecycle of IPv4 and IPv6 sockets per network namespace.
This implementation handles the creation and destruction of the
sockets both for init_net and for dynamically created network
namespaces. The sockets are initialized when a namespace becomes
active and are properly released when the namespace is removed.
This change provides the infrastructure needed for rxe to operate
correctly in environments using multiple network namespaces.
RDMA/mana_ib: cleanup the usage of mana_gd_send_request()
Do not check the status of the response header returned by mana_gd_send_request(),
as the returned error code already indicates the request status.
mana_gd_send_request() may return success while the response status is
GDMA_STATUS_MORE_ENTRIES, which also denotes successful completion. That
status is used to check the correctness of multi-request operations, such as
creating a DMA region with mana_ib_gd_create_dma_region().
Marco Crivellari [Wed, 18 Mar 2026 15:27:48 +0000 (16:27 +0100)]
RDMA/rxe: Replace use of system_unbound_wq with rxe_wq
This patch continues the effort to refactor workqueue APIs, which began
with the changes introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.
Before that can happen, workqueue users must be converted to the better-named
new workqueues with no intended behaviour changes.
Sanman Pradhan [Mon, 30 Mar 2026 15:56:40 +0000 (15:56 +0000)]
hwmon: (tps53679) Fix device ID comparison and printing in tps53676_identify()
tps53676_identify() uses strncmp() to compare the device ID buffer
against a byte sequence containing embedded non-printable bytes
(\x53\x67\x60). strncmp() is semantically wrong for binary data
comparison; use memcmp() instead.
Additionally, the buffer from i2c_smbus_read_block_data() is not
NUL-terminated, so printing it with "%s" in the error path is
undefined behavior and may read past the buffer. Use "%*ph" to
hex-dump the actual bytes returned.
Per the datasheet, the expected device ID is the 6-byte sequence
54 49 53 67 60 00 ("TI\x53\x67\x60\x00"), so compare all 6 bytes
including the trailing NUL.
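As an illustration of why strncmp() is the wrong tool for binary IDs, the userspace sketch below contrasts it with memcmp(). Only the 6-byte ID is taken from the commit message; the helper names id_matches() and strncmp_says_equal() are made up for this example and are not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expected device ID from the commit message: 54 49 53 67 60 00. */
static const unsigned char expected_id[6] = {
	0x54, 0x49, 0x53, 0x67, 0x60, 0x00
};

/*
 * Hypothetical helper (not the driver's function): the block read back
 * from the device matches only if the length and all six bytes agree.
 */
int id_matches(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	return len == sizeof(expected_id) &&
	       memcmp(buf, expected_id, sizeof(expected_id)) == 0;
}

/*
 * strncmp() stops at the first NUL in its inputs, so bytes that follow an
 * embedded NUL are never compared. Returns 1 when strncmp() considers the
 * two buffers equal over n bytes.
 */
int strncmp_says_equal(const unsigned char *a, const unsigned char *b,
		       size_t n)
{
	return strncmp((const char *)a, (const char *)b, n) == 0;
}
```

With two buffers that differ only after an embedded NUL, strncmp_says_equal() reports a match while memcmp() reports a difference, which is the hazard the commit avoids.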
Dmitry Baryshkov [Sun, 25 Jan 2026 11:30:04 +0000 (13:30 +0200)]
soc: qcom: ubwc: add helpers to get programmable values
Currently the database stores macrotile_mode in the data. However it
can be derived from the rest of the data: it should be used for UBWC
encoding >= 3.0 except for several corner cases (SM8150 and SC8180X).
The ubwc_bank_spread field seems to be based on the imprecise data we
had for the MDSS and DPU programming. In some cases UBWC engine inside
the display controller doesn't need to program it, although bank spread
is to be enabled.
Bank swizzle is also currently stored as is, but it is almost standard
(banks 1-3 for UBWC 1.0 and 2-3 for other versions), the only exception
being Lemans (it uses only bank 3).
Add helpers returning values from the config for now. They will be
rewritten later, in a separate series, but having the helper now
simplifies refactoring the code later.
Dmitry Baryshkov [Sun, 25 Jan 2026 11:30:03 +0000 (13:30 +0200)]
soc: qcom: ubwc: add helper to get min_acc length
MDSS and GPU drivers use different approaches to get min_acc length.
Add helper function that can be used by all the drivers.
The helper reflects our current best guess: it blindly copies the
approach adopted by the MDSS drivers and it matches current values
selected by the GPU driver.
Commit 2e5449f4f21a ("profiling: Remove create_prof_cpu_mask().") said that
no one would create /proc/irq/prof_cpu_mask since commit 1f44a225777e
("s390: convert interrupt handling to use generic hardirq", 2013). Remove
the outdated description.
Mark Brown [Mon, 30 Mar 2026 16:59:52 +0000 (17:59 +0100)]
ASoC: Merge up fixes
Merge branch 'for-7.0' of
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into
asoc-7.1 for both ASoC and general bug fixes to support testing.
Akira Yokosawa [Thu, 26 Mar 2026 11:46:37 +0000 (20:46 +0900)]
docs/ja_JP: submitting-patches: Amend "Describe your changes"
To make the translation of "Describe your changes" (into
"変更内容を記述する") easier to follow, do some rewording and
rephrasing, as well as fixing a couple of mistranslations.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260326114637.144601-1-akiyks@gmail.com>
Tomas Glozar [Mon, 30 Mar 2026 09:12:07 +0000 (11:12 +0200)]
rtla: Fix build without libbpf header
rtla supports building without libbpf. However, BPF actions
patchset [1] adds an include of bpf/libbpf.h into timerlat_bpf.h,
which breaks build on systems that don't have libbpf headers
installed.
This is a leftover from a draft version of the patchset where
timerlat_bpf_set_action() (which takes a struct bpf_program * argument)
was defined in the header. timerlat_bpf.c already includes bpf/libbpf.h
via timerlat.skel.h when libbpf is present.
Remove the redundant include to fix build on systems without libbpf
headers.
Manuel Ebner [Wed, 25 Mar 2026 19:48:12 +0000 (20:48 +0100)]
docs: changes.rst and ver_linux: sort the lists
Sort the lists of tools in both scripts/ver_linux and
Documentation/process/changes.rst into alphabetical order, facilitating
comparison between the two.
Signed-off-by: Manuel Ebner <manuelebner@mailbox.org>
[jc: rewrote changelog]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260325194811.78509-2-manuelebner@mailbox.org>
Manuel Ebner [Wed, 25 Mar 2026 19:46:17 +0000 (20:46 +0100)]
docs: changes/ver_linux: fix entries and add several tools
Some of the entries in both Documentation/process/changes.rst and
scripts/ver_linux were obsolete; update them to reflect the current way of
getting version information.
Many were missing altogether; add the relevant information for those tools.
Leonard Lausen [Fri, 27 Mar 2026 22:25:15 +0000 (22:25 +0000)]
ALSA: hda: cs35l41: Fix boost type for HP Dragonfly 13.5 inch G4
The HP Dragonfly 13.5 inch G4 (SSID 103C8B63) has _DSD properties in
ACPI firmware with valid reset-gpios and cs-gpios for the four CS35L41
amplifiers on SPI.
However, the _DSD specifies cirrus,boost-type as Internal (0), while
the hardware requires External Boost. With Internal Boost configured,
the amplifiers trigger "Amp short error" when audio is played at
moderate-to-high volume, eventually shutting down entirely.
Add a configuration table entry to override the boost type to
External, similar to the existing workaround for 103C89C6. All GPIO
indices are set to -1 since the _DSD provides valid reset-gpios and
cs-gpios.
Confirmed on BIOS V90 01.11.00 (January 2026), the latest available.
Takashi Iwai [Mon, 30 Mar 2026 16:22:20 +0000 (18:22 +0200)]
ALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)
There is another Book2 Pro model (NP950QED) that seems equipped with
the same speaker module as the non-360 model, which requires the
ALC298_FIXUP_SAMSUNG_AMP_V2_2_AMPS quirk.
Florian Fainelli [Thu, 26 Mar 2026 23:32:24 +0000 (16:32 -0700)]
Documentation: Provide hints on how to debug Python GDB scripts
By default GDB does not print a full stack of its integrated Python
interpreter, thus making the debugging of GDB scripts more painful than
it has to be.
Suggested-by: Radu Rendec <radu@rendec.net>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Radu Rendec <radu@rendec.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260326233226.2248817-1-florian.fainelli@broadcom.com>
Harry Wentland [Fri, 27 Mar 2026 15:41:57 +0000 (11:41 -0400)]
scripts/checkpatch: add Assisted-by: tag validation
The coding-assistants.rst documentation defines the Assisted-by: tag
format for AI-assisted contributions as:
Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
This format does not use an email address, so checkpatch currently
reports a false positive about an invalid email when encountering this
tag.
Add Assisted-by: to the recognized signature tags and standard signature
list. When an Assisted-by: tag is found, validate it instead of checking
for an email address.
Examples of passing tags:
- Claude:claude-3-opus coccinelle sparse
- FOO:BAR.baz
- Copilot Github:claude-3-opus
- GitHub Copilot:Claude Opus 4.6
- My Cool Agent:v1.2.3 coccinelle sparse
Examples of tags triggering the new warning:
- Claude coccinelle sparse
- JustAName
- :missing-agent
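The passing and failing examples above are consistent with a simple rule: a non-empty agent name, a colon, and a model version that starts right after the colon. The C sketch below encodes that assumed rule only; checkpatch itself is a Perl script and its actual pattern may differ, and assisted_by_value_ok() is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical validator for the text that follows "Assisted-by:".
 * Assumed rule: AGENT_NAME, a colon, then MODEL_VERSION directly after
 * the colon. Optional tool names after the model are not inspected.
 */
bool assisted_by_value_ok(const char *value)
{
	const char *colon;

	assert(value != NULL);

	/* Skip whitespace left over after the tag itself. */
	while (isspace((unsigned char)*value))
		value++;

	colon = strchr(value, ':');
	if (!colon || colon == value)
		return false;	/* no colon, or nothing before it */

	/* The model version must begin immediately after the colon. */
	return colon[1] != '\0' && !isspace((unsigned char)colon[1]);
}
```

Agent names containing spaces (for example "GitHub Copilot") pass because only the first colon is used as the separator.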
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Assisted-by: Claude:claude-opus-4.6
Co-developed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260327154157.162962-1-harry.wentland@amd.com>
Will Deacon [Mon, 30 Mar 2026 14:48:39 +0000 (15:48 +0100)]
drivers/virt: pkvm: Add Kconfig dependency on DMA_RESTRICTED_POOL
pKVM guests practically rely on CONFIG_DMA_RESTRICTED_POOL=y in order
to establish shared memory regions with the host for virtio buffers.
Make CONFIG_ARM_PKVM_GUEST depend on CONFIG_DMA_RESTRICTED_POOL to avoid
the inevitable segmentation faults experienced if you have the former but
not the latter.
Will Deacon [Mon, 30 Mar 2026 14:48:38 +0000 (15:48 +0100)]
KVM: arm64: Rename PKVM_PAGE_STATE_MASK
Rename PKVM_PAGE_STATE_MASK to PKVM_PAGE_STATE_VMEMMAP_MASK to make it
clear that the mask applies to the page state recorded in the entries
of the 'hyp_vmemmap', rather than page states stored elsewhere (e.g. in
the ptes).
Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20260330144841.26181-38-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Now that the guest can share and unshare memory with the host using
hypercalls, extend the pKVM page ownership selftest to exercise these
new transitions.
Will Deacon [Mon, 30 Mar 2026 14:48:35 +0000 (15:48 +0100)]
KVM: arm64: Register 'selftest_vm' in the VM table
In preparation for extending the pKVM page ownership selftests to cover
forceful reclaim of donated pages, rework the creation of the
'selftest_vm' so that it is registered in the VM table while the tests
are running.
Will Deacon [Mon, 30 Mar 2026 14:48:33 +0000 (15:48 +0100)]
KVM: arm64: Add some initial documentation for pKVM
Add some initial documentation for pKVM to help people understand what
is supported, the limitations of protected VMs when compared to
non-protected VMs and also what is left to do.
Will Deacon [Mon, 30 Mar 2026 14:48:31 +0000 (15:48 +0100)]
KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs
Implement the ARM_SMCCC_KVM_FUNC_MEM_UNSHARE hypercall to allow
protected VMs to unshare memory that was previously shared with the host
using the ARM_SMCCC_KVM_FUNC_MEM_SHARE hypercall.
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20260330144841.26181-31-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Will Deacon [Mon, 30 Mar 2026 14:48:29 +0000 (15:48 +0100)]
KVM: arm64: Add hvc handler at EL2 for hypercalls from protected VMs
Add a hypercall handler at EL2 for hypercalls originating from protected
VMs. For now, this implements only the FEATURES and MEMINFO calls, but
subsequent patches will implement the SHARE and UNSHARE functions
necessary for virtio.
Unhandled hypercalls (including PSCI) are passed back to the host.
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20260330144841.26181-29-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Will Deacon [Mon, 30 Mar 2026 14:48:28 +0000 (15:48 +0100)]
KVM: arm64: Return -EFAULT from VCPU_RUN on access to a poisoned pte
If a protected vCPU faults on an IPA which appears to be mapped, query
the hypervisor to determine whether or not the faulting pte has been
poisoned by a forceful reclaim. If the pte has been poisoned, return
-EFAULT back to userspace rather than retrying the instruction forever.
Will Deacon [Mon, 30 Mar 2026 14:48:27 +0000 (15:48 +0100)]
KVM: arm64: Reclaim faulting page from pKVM in spurious fault handler
Host kernel accesses to pages that are inaccessible at stage-2 result in
the injection of a translation fault, which is fatal unless an exception
table fixup is registered for the faulting PC (e.g. for user access
routines). This is undesirable, since a get_user_pages() call could be
used to obtain a reference to a donated page and then a subsequent
access via a kernel mapping would lead to a panic().
Rework the spurious fault handler so that stage-2 faults injected back
into the host result in the target page being forcefully reclaimed when
no exception table fixup handler is registered.
Will Deacon [Mon, 30 Mar 2026 14:48:26 +0000 (15:48 +0100)]
KVM: arm64: Introduce hypercall to force reclaim of a protected page
Introduce a new hypercall, __pkvm_force_reclaim_guest_page(), to allow
the host to forcefully reclaim a physical page that was previously donated
to a protected guest. This results in the page being zeroed and the
previous guest mapping being poisoned so that new pages cannot be
subsequently donated at the same IPA.
Will Deacon [Mon, 30 Mar 2026 14:48:25 +0000 (15:48 +0100)]
KVM: arm64: Annotate guest donations with handle and gfn in host stage-2
Handling host kernel faults arising from accesses to donated guest
memory will require an rmap-like mechanism to identify the guest mapping
of the faulting page.
Extend the page donation logic to encode the guest handle and gfn
alongside the owner information in the host stage-2 pte.
Will Deacon [Mon, 30 Mar 2026 14:48:24 +0000 (15:48 +0100)]
KVM: arm64: Change 'pkvm_handle_t' to u16
'pkvm_handle_t' doesn't need to be a 32-bit type and subsequent patches
will rely on it being no more than 16 bits so that it can be encoded
into a pte annotation.
Change 'pkvm_handle_t' to a u16 and add a compile-time check that the
maximum handle fits into the reduced type.
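A compile-time check of this kind can be sketched in plain C11 as follows. The names handle_t, HANDLE_OFFSET and MAX_VMS and their values are assumptions chosen to make the example concrete; they are not taken from the kernel sources, which would use the kernel's own build-time assertion helpers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t handle_t;	/* stands in for pkvm_handle_t */

/* Assumed values, chosen only for illustration. */
#define HANDLE_OFFSET	0x1000u
#define MAX_VMS		255u

/* The build fails if the largest handle no longer fits the 16-bit type. */
_Static_assert(HANDLE_OFFSET + MAX_VMS - 1 <= UINT16_MAX,
	       "maximum handle must fit in handle_t");

/* Map a table index to a handle; the cast is safe given the check above. */
handle_t idx_to_handle(unsigned int idx)
{
	assert(idx < MAX_VMS);
	return (handle_t)(HANDLE_OFFSET + idx);
}
```

Raising HANDLE_OFFSET or MAX_VMS past the 16-bit limit turns a silent truncation into a build error, which is the point of pairing the narrower type with the check.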
Rework host_stage2_set_owner_locked() to add a new helper function,
host_stage2_set_owner_metadata_locked(), which will allow us to store
additional metadata alongside a 3-bit owner ID for invalid host stage-2
entries.
kvm_pgtable_stage2_set_owner() can be generalised into a way to store
up to 59 bits in the page tables alongside a 4-bit 'type' identifier
specific to the format of the 59-bit payload.
Introduce kvm_pgtable_stage2_annotate() and move the existing invalid
ptes (for locked ptes and donated pages) over to the new scheme.
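One way to picture such an annotation is as a 64-bit invalid entry that carries a 4-bit type and a 59-bit payload. The bit positions below (bit 0 clear, type in bits 4:1, payload in bits 63:5) are an assumed layout for illustration; the commit message gives only the field widths, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 0 clear (invalid), bits 4:1 type, bits 63:5 payload. */
#define ANNOT_TYPE_SHIFT	1
#define ANNOT_TYPE_MASK		UINT64_C(0xf)
#define ANNOT_DATA_SHIFT	5
#define ANNOT_DATA_MASK		((UINT64_C(1) << 59) - 1)

/* Pack a type and payload into an entry whose valid bit stays clear. */
uint64_t annot_encode(unsigned int type, uint64_t payload)
{
	assert(type <= ANNOT_TYPE_MASK);
	assert(payload <= ANNOT_DATA_MASK);
	return (payload << ANNOT_DATA_SHIFT) |
	       ((uint64_t)type << ANNOT_TYPE_SHIFT);
}

/* Extract the 4-bit type identifier. */
unsigned int annot_type(uint64_t pte)
{
	return (unsigned int)((pte >> ANNOT_TYPE_SHIFT) & ANNOT_TYPE_MASK);
}

/* Extract the 59-bit payload. */
uint64_t annot_payload(uint64_t pte)
{
	return pte >> ANNOT_DATA_SHIFT;
}
```

The type field tells a reader how to interpret the payload, so different kinds of invalid entry (for example locked ptes and donated pages) can share one encoding.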
Will Deacon [Mon, 30 Mar 2026 14:48:21 +0000 (15:48 +0100)]
KVM: arm64: Avoid pointless annotation when mapping host-owned pages
When a page is transitioned to host ownership, we can eagerly map it
into the host stage-2 page-table rather than going via the convoluted
step of a faulting annotation to trigger the mapping.
Call host_stage2_idmap_locked() directly when transitioning a page to
be owned by the host.
Quentin Perret [Mon, 30 Mar 2026 14:48:20 +0000 (15:48 +0100)]
KVM: arm64: Inject SIGSEGV on illegal accesses
The pKVM hypervisor will currently panic if the host tries to access
memory that it doesn't own (e.g. protected guest memory). Sadly, as
guest memory can still be mapped into the VMM's address space, userspace
can trivially crash the kernel/hypervisor by poking into guest memory.
To prevent this, inject the abort back in the host with S1PTW set in the
ESR, hence allowing the host to differentiate this abort from normal
userspace faults and inject a SIGSEGV cleanly.
Will Deacon [Mon, 30 Mar 2026 14:48:19 +0000 (15:48 +0100)]
KVM: arm64: Support translation faults in inject_host_exception()
Extend inject_host_exception() to support the injection of translation
faults on both the data and instruction side to 32-bit and 64-bit EL0
as well as 64-bit EL1. This will be used in a subsequent patch when
resolving an unhandled host stage-2 abort.
Will Deacon [Mon, 30 Mar 2026 14:48:18 +0000 (15:48 +0100)]
KVM: arm64: Factor out pKVM host exception injection logic
inject_undef64() open-codes the logic to inject an exception into the
pKVM host. In preparation for reusing this logic to inject a data abort
on an unhandled stage-2 fault from the host, factor out the meat and
potatoes of the function into a new inject_host_exception() function
which takes the ESR as a parameter.
Will Deacon [Mon, 30 Mar 2026 14:48:17 +0000 (15:48 +0100)]
KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()
During teardown of a protected guest, its memory pages must be reclaimed
from the hypervisor by issuing the '__pkvm_reclaim_dying_guest_page'
hypercall.
Add a new helper, __pkvm_pgtable_stage2_reclaim(), which is called
during the VM teardown operation to reclaim pages from the hypervisor
and drop the GUP pin on the host.
To enable reclaim of pages from a protected VM during teardown,
introduce a new hypercall to reclaim a single page from a protected
guest that is in the dying state.
Since the EL2 code is non-preemptible, the new hypercall deliberately
acts on a single page at a time so as to allow EL1 to reschedule
frequently during the teardown operation.
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Co-developed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20260330144841.26181-16-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Will Deacon [Mon, 30 Mar 2026 14:48:15 +0000 (15:48 +0100)]
KVM: arm64: Handle aborts from protected VMs
Introduce a new abort handler for resolving stage-2 page faults from
protected VMs by pinning and donating anonymous memory. This is
considerably simpler than the infamous user_mem_abort() as we only have
to deal with translation faults at the pte level.
Will Deacon [Mon, 30 Mar 2026 14:48:14 +0000 (15:48 +0100)]
KVM: arm64: Hook up donation hypercall to pkvm_pgtable_stage2_map()
Mapping pages into a protected guest requires the donation of memory
from the host.
Extend pkvm_pgtable_stage2_map() to issue a donate hypercall when the
target VM is protected. Since the hypercall only handles a single page,
the splitting logic used for the share path is not required.
Will Deacon [Mon, 30 Mar 2026 14:48:13 +0000 (15:48 +0100)]
KVM: arm64: Introduce __pkvm_host_donate_guest()
In preparation for supporting protected VMs, whose memory pages are
isolated from the host, introduce a new pKVM hypercall to allow the
donation of pages to a guest.
Will Deacon [Mon, 30 Mar 2026 14:48:12 +0000 (15:48 +0100)]
KVM: arm64: Split teardown hypercall into two phases
In preparation for reclaiming protected guest VM pages from the host
during teardown, split the current 'pkvm_teardown_vm' hypercall into
separate 'start' and 'finalise' calls.
The 'pkvm_start_teardown_vm' hypercall puts the VM into a new 'is_dying'
state, which is a point of no return past which no vCPU of the pVM is
allowed to run any more. Once in this new state,
'pkvm_finalize_teardown_vm' can be used to reclaim meta-data and
page-table pages from the VM. A subsequent patch will add support for
reclaiming the individual guest memory pages.
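The two-phase scheme can be modelled as a small state machine. The sketch below is a userspace model under stated assumptions: struct vm_state, the function names and the -EINVAL return values are hypothetical and do not reflect the hypervisor's actual data structures or error codes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the teardown-related VM state. */
struct vm_state {
	bool is_dying;		/* set by the start phase, never cleared */
	bool torn_down;		/* set once the finalise phase has run */
};

/* Phase 1: point of no return. */
int vm_start_teardown(struct vm_state *vm)
{
	assert(vm != NULL);
	if (vm->is_dying)
		return -EINVAL;
	vm->is_dying = true;
	return 0;
}

/* vCPUs may no longer run once the VM is dying. */
int vm_vcpu_run(const struct vm_state *vm)
{
	assert(vm != NULL);
	return vm->is_dying ? -EINVAL : 0;
}

/* Phase 2: only legal after phase 1, and only once. */
int vm_finalize_teardown(struct vm_state *vm)
{
	assert(vm != NULL);
	if (!vm->is_dying || vm->torn_down)
		return -EINVAL;
	vm->torn_down = true;
	return 0;
}
```

Separating the phases leaves a window between "no vCPU can run" and "metadata is freed" in which per-page reclaim can take place.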
Will Deacon [Mon, 30 Mar 2026 14:48:11 +0000 (15:48 +0100)]
KVM: arm64: Ignore -EAGAIN when mapping in pages for the pKVM host
If the host takes a stage-2 translation fault on two CPUs at the same
time, one of them will get back -EAGAIN from the page-table mapping code
when it runs into the mapping installed by the other.
Rather than handle this explicitly in handle_host_mem_abort(), pass the
new KVM_PGTABLE_WALK_IGNORE_EAGAIN flag to kvm_pgtable_stage2_map() from
__host_stage2_idmap() and return -EEXIST if host_stage2_adjust_range()
finds a valid pte. This will avoid having to test for -EAGAIN on the
reclaim path in subsequent patches.
Will Deacon [Mon, 30 Mar 2026 14:48:09 +0000 (15:48 +0100)]
KVM: arm64: Ignore MMU notifier callbacks for protected VMs
In preparation for supporting the donation of pinned pages to protected
VMs, return early from the MMU notifiers when called for a protected VM,
as the necessary hypercalls are exposed only for non-protected guests.
Will Deacon [Mon, 30 Mar 2026 14:48:08 +0000 (15:48 +0100)]
KVM: arm64: Remove is_protected_kvm_enabled() checks from hypercalls
When pKVM is not enabled, the host shouldn't issue pKVM-specific
hypercalls and so there's no point checking for this in the pKVM
hypercall handlers.
Remove the redundant is_protected_kvm_enabled() checks from each
hypercall and instead rejig the hypercall table so that the
pKVM-specific hypercalls are unreachable when pKVM is not being used.
Fuad Tabba [Mon, 30 Mar 2026 14:48:07 +0000 (15:48 +0100)]
KVM: arm64: Expose self-hosted debug regs as RAZ/WI for protected guests
Debug and trace are not currently supported for protected guests, so
trap accesses to the related registers and emulate them as RAZ/WI for
now. Although this isn't strictly compatible with the architecture, it's
sufficient for Linux guests and means that debug support can be added
later on.
Will Deacon [Mon, 30 Mar 2026 14:48:06 +0000 (15:48 +0100)]
KVM: arm64: Don't advertise unsupported features for protected guests
Both SVE and PMUv3 are treated as "restricted" features for protected
guests and attempts to access their corresponding architectural state
from a protected guest result in an undefined exception being injected
by the hypervisor.
Since these exceptions are unexpected and typically fatal for the guest,
don't advertise these features for protected guests.
Will Deacon [Mon, 30 Mar 2026 14:48:05 +0000 (15:48 +0100)]
KVM: arm64: Rename __pkvm_pgtable_stage2_unmap()
In preparation for adding support for protected VMs, where pages are
donated rather than shared, rename __pkvm_pgtable_stage2_unmap() to
__pkvm_pgtable_stage2_unshare() to make it clearer what is going on.
Will Deacon [Mon, 30 Mar 2026 14:48:04 +0000 (15:48 +0100)]
KVM: arm64: Move handle check into pkvm_pgtable_stage2_destroy_range()
When pKVM is enabled, a VM has a 'handle' allocated by the hypervisor
in kvm_arch_init_vm() and released later by kvm_arch_destroy_vm().
Consequently, the only time __pkvm_pgtable_stage2_unmap() can run into
an uninitialised 'handle' is on the kvm_arch_init_vm() failure path,
where we destroy the empty stage-2 page-table if we fail to allocate a
handle.
Move the handle check into pkvm_pgtable_stage2_destroy_range(), which
will additionally handle protected VMs in subsequent patches.
Will Deacon [Mon, 30 Mar 2026 14:48:02 +0000 (15:48 +0100)]
KVM: arm64: Remove unused PKVM_ID_FFA definition
Commit 7cbf7c37718e ("KVM: arm64: Drop pkvm_mem_transition for host/hyp
sharing") removed the last users of PKVM_ID_FFA, so drop the definition
altogether.
Simon Richter [Sat, 7 Mar 2026 17:35:37 +0000 (02:35 +0900)]
PCI/VGA: Fail pci_set_vga_state() if VGA decoding not supported
PCI bridges are allowed to refuse activating VGA decoding, by simply
ignoring attempts to set the bit that enables it, so after setting the bit,
read it back to verify.
One example of such a bridge is the root bridge in IBM PowerNV, but this is
also useful for GPU passthrough into virtual machines, where it is
difficult to set up routing for legacy IO through IOMMU.
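The set-then-read-back pattern can be sketched against a simulated register. In the example below, struct fake_bridge and its helpers are invented for illustration, and -EIO is an arbitrary choice of error code; the commit message does not say which error pci_set_vga_state() returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BRIDGE_CTL_VGA	0x08	/* VGA enable bit in the bridge control register */

/* Simulated bridge: only bits set in 'writable' can be changed. */
struct fake_bridge {
	uint16_t ctl;
	uint16_t writable;
};

uint16_t bridge_read_ctl(const struct fake_bridge *b)
{
	return b->ctl;
}

/* Writes to read-only bits are silently ignored, as a real bridge may do. */
void bridge_write_ctl(struct fake_bridge *b, uint16_t val)
{
	b->ctl = (uint16_t)((b->ctl & ~b->writable) | (val & b->writable));
}

/* Set the VGA bit, then read it back to confirm the bridge accepted it. */
int bridge_enable_vga(struct fake_bridge *b)
{
	assert(b != NULL);

	bridge_write_ctl(b, (uint16_t)(bridge_read_ctl(b) | BRIDGE_CTL_VGA));
	if (!(bridge_read_ctl(b) & BRIDGE_CTL_VGA))
		return -EIO;	/* assumed error code: the bit did not stick */
	return 0;
}
```

Without the read-back, a caller would assume VGA decoding is routed through a bridge that never enabled it.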
Nicolas Pitre [Sat, 28 Mar 2026 03:09:47 +0000 (23:09 -0400)]
vt: resize saved unicode buffer on alt screen exit after resize
Instead of discarding the saved unicode buffer when the console was
resized while in the alternate screen, resize it to the current
dimensions using vc_uniscr_copy_area() to preserve its content. This
properly restores the unicode screen on alt screen exit rather than
lazily rebuilding it from a lossy reverse glyph translation.
On allocation failure the stale buffer is freed and vc_uni_lines is
set to NULL so it gets lazily rebuilt via vc_uniscr_check() when next
needed.
Fixes: 40014493cece ("vt: discard stale unicode buffer on alt screen exit after resize")
Cc: stable <stable@kernel.org>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Link: https://patch.msgid.link/3nsr334n-079q-125n-7807-n4nq818758ns@syhkavp.arg
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Liav Mordouch [Fri, 27 Mar 2026 17:02:04 +0000 (20:02 +0300)]
vt: discard stale unicode buffer on alt screen exit after resize
When enter_alt_screen() saves vc_uni_lines into vc_saved_uni_lines and
sets vc_uni_lines to NULL, a subsequent console resize via vc_do_resize()
skips reallocating the unicode buffer because vc_uni_lines is NULL.
However, vc_saved_uni_lines still points to the old buffer allocated for
the original dimensions.
When leave_alt_screen() later restores vc_saved_uni_lines, the buffer
dimensions no longer match vc_rows/vc_cols. Any operation that iterates
over the unicode buffer using the current dimensions (e.g. csi_J clearing
the screen) will access memory out of bounds, causing a kernel oops:
BUG: unable to handle page fault for address: 0x0000002000000020
RIP: 0010:csi_J+0x133/0x2d0
The faulting address 0x0000002000000020 is two adjacent u32 space
characters (0x20) interpreted as a pointer, read from the row data area
past the end of the 25-entry pointer array in a buffer allocated for
80x25 but accessed with 240x67 dimensions.
Fix this by checking whether the console dimensions changed while in the
alternate screen. If they did, free the stale saved buffer instead of
restoring it. The unicode screen will be lazily rebuilt via
vc_uniscr_check() when next needed.
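The dimension check described in this commit can be modelled in userspace as follows. struct screen, its field names and leave_alt_screen_sketch() are hypothetical stand-ins for the console state and are not the kernel's definitions; freeing the alternate screen's own buffer first is a simplification made so the model does not leak.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the relevant console state. */
struct screen {
	unsigned int rows, cols;		/* current dimensions */
	unsigned int saved_rows, saved_cols;	/* dimensions at save time */
	void *uni_lines;			/* live buffer, NULL if not built */
	void *saved_uni_lines;			/* stashed on alt screen entry */
};

/* Restore the saved buffer only if it still matches the screen size. */
void leave_alt_screen_sketch(struct screen *s)
{
	assert(s != NULL);

	/* Drop whatever buffer the alternate screen built (simplification). */
	free(s->uni_lines);

	if (s->rows != s->saved_rows || s->cols != s->saved_cols) {
		/* Stale: sized for the old geometry, so discard it. */
		free(s->saved_uni_lines);
		s->uni_lines = NULL;	/* rebuilt lazily when next needed */
	} else {
		s->uni_lines = s->saved_uni_lines;
	}
	s->saved_uni_lines = NULL;
}
```

Leaving uni_lines as NULL after a resize is safe in this model because consumers are expected to rebuild the buffer on demand.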
Jason Andryuk [Wed, 18 Mar 2026 23:53:26 +0000 (19:53 -0400)]
hvc/xen: Check console connection flag
When the console out buffer is filled, __write_console() will return 0
as it cannot send any data. domU_write_console() will then spin in
`while (len)` as len doesn't decrement until xenconsoled attaches. This
would block a domU and nullify the parallelism of Hyperlaunch until dom0
userspace starts xenconsoled, which empties the buffer.
Xen 4.21 added a connection field to the xen console page. This is set
to XENCONSOLE_DISCONNECTED (1) when a domain is built, and xenconsoled
will set it to XENCONSOLE_CONNECTED (0) when it connects.
Update the hvc_xen driver to check the field. When the field is
disconnected, drop the write with -ENOTCONN. We only drop the write
when the field is XENCONSOLE_DISCONNECTED (1) to try for maximum
compatibility. The Xen toolstack has historically zero initialized the
console, so it should see XENCONSOLE_CONNECTED (0) by default. If an
implemenation used uninitialized memory, only checking for
XENCONSOLE_DISCONNECTED could have the lowest chance of not connecting.
This lets the hyperlaunched domU boot without stalling. Once dom0
starts xenconsoled, xl console can be used to access the domU's hvc0.
Partially sync console.h from xen.git to bring in the new field.
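The write-path check can be sketched as below. The two flag values come from the commit message; struct console_page, its free_space field and console_write_sketch() are simplified, hypothetical stand-ins for the shared console page and the driver's write routine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define XENCONSOLE_CONNECTED	0
#define XENCONSOLE_DISCONNECTED	1

/* Simplified stand-in for the shared console page. */
struct console_page {
	unsigned char connection;	/* flag written by the backend */
	size_t free_space;		/* bytes the output ring can still take */
};

/* Returns bytes accepted, or -ENOTCONN if the backend is not attached. */
long console_write_sketch(struct console_page *pg, size_t len)
{
	size_t n;

	assert(pg != NULL);

	/* Only the explicit "disconnected" value drops the write. */
	if (pg->connection == XENCONSOLE_DISCONNECTED)
		return -ENOTCONN;

	n = len < pg->free_space ? len : pg->free_space;
	pg->free_space -= n;
	return (long)n;
}
```

Any value other than XENCONSOLE_DISCONNECTED, including garbage from uninitialized memory, is treated as connected, which matches the compatibility reasoning in the message.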
Biju Das [Thu, 12 Mar 2026 08:26:59 +0000 (08:26 +0000)]
serial: sh-sci: Add support for RZ/G3L RSCI
Add support for RZ/G3L RSCI. The RSCI IP found on the RZ/G3L SoC is
similar to RZ/G3E, but it has 3 clocks (2 module clocks + 1 external
clock) instead of 6 clocks (5 module clocks + 1 external clock) on the
RZ/G3E. Both RZ/G3L and RZ/G3E have a 32-bit FIFO, but RZ/G3L has a
single TCLK with internal dividers, whereas the RZ/G3E has explicit
clocks for TCLK and its dividers. Add a new port type
RSCI_PORT_SCIF32_SINGLE_TCLK to handle this clock difference.
Document the serial communication interface (RSCI) used on the Renesas
RZ/G3L (R9A08G046) SoC. This SoC integrates the same RSCI IP block as
the RZ/G3E (R9A09G047), but it has 3 clocks compared to 6 clocks on
the RZ/G3E SoC. The RZ/G3L has a single TCLK with internal dividers,
whereas the RZ/G3E has explicit clocks for TCLK and its dividers.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260312082708.98835-2-biju.das.jz@bp.renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kexin Sun [Tue, 24 Mar 2026 02:48:57 +0000 (10:48 +0800)]
tty: atmel_serial: update outdated reference to atmel_tasklet_func()
The modem-status comparison that used irq_status_prev was
moved from atmel_tasklet_func() into atmel_handle_status() in
commit d033e82db9a5 ("tty/serial: at91: handle IRQ status
more safely"). Update the comment accordingly.
The UART controller on Loongson 3A4000 is compatible with Loongson
2K1500, which is NS16550A-compatible with an additional fractional
frequency divisor register.
Add loongson,ls3a4000-uart as compatible with loongson,ls2k1500-uart.
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Rong Zhang <rongrong@oss.cipunited.com> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Link: https://patch.msgid.link/20260315184301.412844-2-rongrong@oss.cipunited.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:50 +0000 (16:54 +0800)]
usb: gadget: f_rndis: Fix net_device lifecycle with device_move
The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:
console:/ # ls -l /sys/class/net/usb0
lrwxrwxrwx ... /sys/class/net/usb0 ->
/sys/devices/platform/.../gadget.0/net/usb0
console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
ls: .../gadget.0/net/usb0: No such file or directory
Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.
To maintain compatibility with legacy composite drivers (e.g., multi.c),
the borrowed_net flag is used to indicate whether the network device is
shared and pre-registered during the legacy driver's bind phase.
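The reparenting idea can be shown with a toy device tree in user space. This is a minimal sketch; struct toy_device, toy_device_move(), toy_bind() and toy_unbind() are hypothetical and only imitate the behavior described above, where a NULL new parent places the device under the virtual tree:

```c
#include <stddef.h>

/* Hypothetical stand-in for a device with a sysfs parent. */
struct toy_device {
	const char *name;
	struct toy_device *parent;
};

/* Models /sys/devices/virtual. */
static struct toy_device toy_virtual_root = { "virtual", NULL };

/* Reparent dev; a NULL new_parent means "move to the virtual tree". */
static void toy_device_move(struct toy_device *dev,
			    struct toy_device *new_parent)
{
	dev->parent = new_parent ? new_parent : &toy_virtual_root;
}

/* Bind: place the net device under the current gadget. */
static void toy_bind(struct toy_device *net, struct toy_device *gadget)
{
	toy_device_move(net, gadget);
}

/* Final unbind: detach before the gadget device goes away. */
static void toy_unbind(struct toy_device *net)
{
	toy_device_move(net, NULL);
}
```

After toy_unbind() the net device never points at a destroyed gadget, so no dangling link remains; a later toy_bind() attaches it to the new gadget.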
Fixes: f466c6353819 ("usb: gadget: f_rndis: convert to new function interface with backward compatibility") Cc: stable@vger.kernel.org Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-7-4886b578161b@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:49 +0000 (16:54 +0800)]
usb: gadget: f_subset: Fix net_device lifecycle with device_move
The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:
console:/ # ls -l /sys/class/net/usb0
lrwxrwxrwx ... /sys/class/net/usb0 ->
/sys/devices/platform/.../gadget.0/net/usb0
console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
ls: .../gadget.0/net/usb0: No such file or directory
Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.
To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.
Fixes: 8cedba7c73af ("usb: gadget: f_subset: convert to new function interface with backward compatibility") Cc: stable@vger.kernel.org Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-6-4886b578161b@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:48 +0000 (16:54 +0800)]
usb: gadget: f_eem: Fix net_device lifecycle with device_move
The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:
console:/ # ls -l /sys/class/net/usb0
lrwxrwxrwx ... /sys/class/net/usb0 ->
/sys/devices/platform/.../gadget.0/net/usb0
console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
ls: .../gadget.0/net/usb0: No such file or directory
Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.
To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.
Fixes: b29002a15794 ("usb: gadget: f_eem: convert to new function interface with backward compatibility") Cc: stable@vger.kernel.org Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-5-4886b578161b@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:47 +0000 (16:54 +0800)]
usb: gadget: f_ecm: Fix net_device lifecycle with device_move
The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:
console:/ # ls -l /sys/class/net/usb0
lrwxrwxrwx ... /sys/class/net/usb0 ->
/sys/devices/platform/.../gadget.0/net/usb0
console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
ls: .../gadget.0/net/usb0: No such file or directory
Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.
To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.
Fixes: fee562a6450b ("usb: gadget: f_ecm: convert to new function interface with backward compatibility") Cc: stable@vger.kernel.org Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-4-4886b578161b@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:44 +0000 (16:54 +0800)]
usb: gadget: f_subset: Fix unbalanced refcnt in geth_free
geth_alloc() increments the reference count, but geth_free() fails to
decrement it. This prevents the configuration of attributes via configfs
after unlinking the function.
Decrement the reference count in geth_free() to ensure proper cleanup.
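The effect of the imbalance can be sketched in user space. The names below (struct toy_opts, toy_geth_alloc(), toy_geth_free(), toy_attr_store()) are hypothetical, and the assumption that attribute writes are refused with -EBUSY while the count is nonzero is an illustration of the symptom described above, not a quote of the driver:

```c
#include <errno.h>

/* Hypothetical stand-in for the function instance options. */
struct toy_opts {
	int refcnt;
};

/* Linking the function takes a reference. */
static void toy_geth_alloc(struct toy_opts *opts)
{
	opts->refcnt++;
}

/* Unlinking must drop it again; omitting this line is the bug. */
static void toy_geth_free(struct toy_opts *opts)
{
	opts->refcnt--;
}

/* Assumed policy: attributes are writable only while unreferenced. */
static int toy_attr_store(const struct toy_opts *opts)
{
	return opts->refcnt ? -EBUSY : 0;
}
```

With a balanced alloc/free pair the count returns to zero and attribute writes succeed again after unlinking.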