Huacai Chen [Mon, 4 May 2026 01:00:00 +0000 (09:00 +0800)]
LoongArch: Make CONFIG_64BIT as the default option
CONFIG_64BIT was mandatory before v7.0, but in v7.1-rc1 both
CONFIG_32BIT and CONFIG_64BIT are selectable and CONFIG_32BIT became the
default option. This breaks existing configurations, so explicitly make
CONFIG_64BIT the default option to keep the existing behavior.
Linus Torvalds [Sun, 3 May 2026 22:25:47 +0000 (15:25 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Three bug fixes for x86:
- Check that nEPT/nNPT is enabled in slow flush hypercalls. If it is
not, the hypercalls can be processed as usual even while running a
nested guest
- Fix shadow paging use-after-free due to page tables changing
outside execution of the guest. A bug that is 16 years old and
stems from an imprecision in the very first KVM series
- Scan IRR whenever PID.ON is true, even if PIR is empty, which
avoids a somewhat rare WARN"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow
KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty
KVM: x86: check for nEPT/nNPT in slow flush hypercalls
Andrew Davis [Mon, 2 Mar 2026 18:08:53 +0000 (12:08 -0600)]
watchdog: bcm2835_wdt: Switch to new sys-off handler API
Kernel now supports chained power-off handlers. Use
devm_register_sys_off_handler() that registers a power-off handler. Legacy
pm_power_off() will be removed once all drivers and archs are converted to
the new sys-off API.
Signed-off-by: Andrew Davis <afd@ti.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20260302180853.224112-1-afd@ti.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
The shadow MMU computes GFNs for direct shadow pages using sp->gfn plus
the SPTE index. This assumption breaks for shadow paging if the guest
page tables are modified between VM entries (similar to commit
aad885e77496, "KVM: x86/mmu: Drop/zap existing present SPTE even when
creating an MMIO SPTE", 2026-03-27). The flow is as follows:
- a PDE is installed for a 2MB mapping, and a page in that area is
accessed. KVM creates a kvm_mmu_page consisting of 512 4KB pages;
the kvm_mmu_page is marked by FNAME(fetch) as direct-mapped because
the guest's mapping is a huge page (and thus contiguous).
- the PDE mapping is changed from outside the guest.
- the guest accesses another page in the same 2MB area. KVM installs
a new leaf SPTE and rmap entry; the SPTE uses the "correct" GFN
(i.e. based on the new mapping, as changed in the previous step) but
that GFN is outside of the [sp->gfn, sp->gfn + 511] range; therefore
the rmap entry cannot be found and removed when the kvm_mmu_page
is zapped.
- the memslot that covers the first 2MB mapping is deleted, and the
kvm_mmu_page for the now-invalid GPA is zapped. However, rmap_remove()
only looks at the [sp->gfn, sp->gfn + 511] range established in step 1,
and fails to find the rmap entry that was recorded by step 3.
- any operation that causes an rmap walk for the same page accessed
by step 3 then walks a stale rmap and dereferences a freed kvm_mmu_page.
This includes dirty logging or MMU notifier invalidations (e.g., from
MADV_DONTNEED).
The underlying issue is that KVM's walking of shadow PTEs assumes that
if a SPTE is present when KVM wants to install a non-leaf SPTE, then the
existing kvm_mmu_page must be for the correct gfn. Because the only way
for the gfn to be wrong is if KVM messed up and failed to zap a SPTE...
which shouldn't happen, but *actually* only happens in response to a
guest write.
That bug dates back literally forever, as even the first version of KVM
assumes that the GFN matches and walks into the "wrong" shadow page.
However, that was only an imprecision until 2032a93d66fa ("KVM: MMU:
Don't allocate gfns page for direct mmu pages") came along.
Fix it by checking for a target gfn mismatch and zapping the existing
SPTE. That way the old SP and rmap entries are gone, KVM installs
the rmap in the right location, and everyone is happy.
Fixes: 2032a93d66fa ("KVM: MMU: Don't allocate gfns page for direct mmu pages")
Fixes: 6aa8b732ca01 ("kvm: userspace interface")
Reported-by: Alexander Bulekov <bkov@amazon.com>
Reported-by: Fred Griffoul <fgriffo@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260503201029.106481-1-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
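The flow described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions, not KVM code: the names ShadowPage, install_spte, zap_page and scenario are hypothetical, the rmap is a plain dict keyed by GFN, and the fix is simplified to "zap the old page and start a new one" when the target GFN does not match sp.gfn plus the index.

```python
# Toy model, not KVM code. A "direct" shadow page does not store per-entry
# GFNs; removal recomputes them as sp.gfn + index.

class ShadowPage:
    def __init__(self, gfn):
        self.gfn = gfn      # base GFN of the 2MB region (512 x 4KB pages)
        self.sptes = {}     # index -> GFN the leaf SPTE actually maps


def zap_page(rmap, sp):
    """Drop rmap entries, searching only the computed [gfn, gfn + 511] range."""
    for index in list(sp.sptes):
        computed_gfn = sp.gfn + index
        rmap.get(computed_gfn, set()).discard((sp, index))
    sp.sptes.clear()


def install_spte(rmap, sp, index, target_gfn, zap_on_mismatch):
    """Install a leaf SPTE and record it in the reverse map."""
    if zap_on_mismatch and target_gfn != sp.gfn + index:
        # Simplified version of the fix: the existing page is for the
        # wrong GFN, so zap it and use a page with a matching base.
        zap_page(rmap, sp)
        sp = ShadowPage(target_gfn - index)
    sp.sptes[index] = target_gfn
    rmap.setdefault(target_gfn, set()).add((sp, index))
    return sp


def scenario(zap_on_mismatch):
    """Replay the flow; return rmap entries still pointing at the old page."""
    rmap = {}
    old = ShadowPage(gfn=0x1000)
    sp = install_spte(rmap, old, 0, 0x1000, zap_on_mismatch)
    # The PDE is changed from outside the guest to point at 0x8000, then
    # the guest touches page 5 of the same 2MB area.
    sp = install_spte(rmap, sp, 5, 0x8000 + 5, zap_on_mismatch)
    # The memslot is deleted and the old page is zapped.
    zap_page(rmap, old)
    return [(gfn, index)
            for gfn, entries in rmap.items()
            for (page, index) in entries if page is old]
```

In this model, scenario(False) leaves one stale entry behind at GFN 0x8005, which stands in for the rmap entry that later points at a freed kvm_mmu_page; scenario(True) leaves none.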
KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow
Rename kvm_apic_update_irr()'s "irr_updated" and vmx_sync_pir_to_irr()'s
"got_posted_interrupt" to a more accurate "max_irr_is_from_pir", as neither
"irr_updated" nor "got_posted_interrupt" is accurate.
__kvm_apic_update_irr() and thus kvm_apic_update_irr() specifically return
true if and only if the highest priority IRQ, i.e. max_irr, is a "new"
pending IRQ from the PIR. I.e. it's possible for the IRR to be updated,
i.e. for a posted IRQ to be "got", *without* the APIs returning true.
Expand vmx_sync_pir_to_irr()'s comment to explain why it's necessary to
set KVM_REQ_EVENT only if a "new" IRQ was found, and to explain why it's
safe to do so only if a new IRQ is also the highest priority pending IRQ.
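The return-value contract described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the KVM implementation: the IRR and PIR are modelled as plain integers used as bitmaps, and the function name apic_update_irr is a hypothetical stand-in.

```python
def apic_update_irr(irr, pir):
    """Merge PIR into IRR.

    Returns (new_irr, max_irr, max_irr_is_from_pir). The flag is True only
    when the highest pending vector was newly set from the PIR.
    """
    newly_set = pir & ~irr              # vectors not already pending in IRR
    irr |= pir
    max_irr = irr.bit_length() - 1      # -1 when nothing is pending
    max_irr_is_from_pir = max_irr >= 0 and bool((newly_set >> max_irr) & 1)
    return irr, max_irr, max_irr_is_from_pir
```

With vector 200 already pending in the IRR and vector 50 posted in the PIR, the IRR is updated but the flag stays False, because the highest-priority vector did not come from the PIR. That is the case the old names "irr_updated" and "got_posted_interrupt" described inaccurately.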
Paolo Bonzini [Sun, 3 May 2026 17:19:32 +0000 (19:19 +0200)]
KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty
Fall back to apic_find_highest_vector() when PID.ON is set but PIR
turns out to be empty, to correctly report the highest pending interrupt
from the existing IRR.
In a nested VM stress test, the following WARNING fires in
vmx_check_nested_events() when kvm_cpu_has_interrupt() reports a pending
interrupt but the subsequent kvm_apic_has_interrupt() (which invokes
vmx_sync_pir_to_irr() again) returns -1:
The root cause is a race between vmx_sync_pir_to_irr() on the target vCPU
and __vmx_deliver_posted_interrupt() on a sender vCPU. The sender
performs two individually-atomic operations that are not a single
transaction:
1. pi_test_and_set_pir(vector) -- sets the PIR bit
2. pi_test_and_set_on() -- sets PID.ON
The following interleaving triggers the bug:
  Sender vCPU (IPI):        Target vCPU (1st sync_pir_to_irr):
  B1: set PIR[vector]
                            A1: pi_clear_on()
                            A2: pi_harvest_pir() -> sees B1 bit
                            A3: xchg() -> consumes bit, PIR=0
                            (1st sync returns correct max_irr)
  B2: set PID.ON = 1
                            Target vCPU (2nd sync_pir_to_irr):
                            C1: pi_test_on() -> TRUE (from B2)
                            C2: pi_clear_on() -> ON=0
                            C3: pi_harvest_pir() -> PIR empty
                            C4: *max_irr = -1, early return
                                IRR NOT SCANNED
The interrupt is not lost (it resides in the IRR from the first sync and
is recovered on the next vcpu_enter_guest() iteration), but the incorrect
max_irr causes a spurious WARNING and a wasted L2 VM-Enter/VM-Exit cycle.
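The interleaving can be replayed sequentially in a small model. This is a sketch, not the VMX code: the class and function names are hypothetical, bitmaps are plain integers, and it assumes that an earlier notification left PID.ON set before the first sync, so that the first sync takes the PIR path.

```python
class PostedInterruptDescriptor:
    def __init__(self):
        self.pir = 0        # posted-interrupt requests, one bit per vector
        self.on = False     # PID.ON (outstanding notification)


def highest_vector(bitmap):
    return bitmap.bit_length() - 1      # -1 when the bitmap is empty


def sync_pir_to_irr(pid, irr, scan_irr_when_pir_empty):
    """One sync on the target vCPU; returns (irr, max_irr)."""
    if not pid.on:
        return irr, highest_vector(irr)
    pid.on = False                      # pi_clear_on()
    pir, pid.pir = pid.pir, 0           # harvest and consume the PIR
    if pir == 0 and not scan_irr_when_pir_empty:
        return irr, -1                  # old early return, IRR not scanned
    irr |= pir
    return irr, highest_vector(irr)     # fixed: fall back to scanning IRR


def replay(scan_irr_when_pir_empty, vector=0x31):
    pid, irr = PostedInterruptDescriptor(), 0
    pid.on = True                       # assumption: ON set by an earlier notification
    pid.pir |= 1 << vector              # B1
    irr, first = sync_pir_to_irr(pid, irr, scan_irr_when_pir_empty)    # A1-A3
    pid.on = True                       # B2
    irr, second = sync_pir_to_irr(pid, irr, scan_irr_when_pir_empty)   # C1-C4
    return first, second
```

replay(False) returns (49, -1): the second sync reports no pending interrupt although vector 0x31 is still in the IRR. replay(True) returns (49, 49), which matches the behaviour this commit aims for.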
Paolo Bonzini [Mon, 27 Apr 2026 12:25:40 +0000 (14:25 +0200)]
KVM: x86: check for nEPT/nNPT in slow flush hypercalls
Checking is_guest_mode(vcpu) is incorrect, because translate_nested_gpa()
is only valid if an L2 guest is running *with nested EPT/NPT enabled*.
Instead use the same condition as translate_nested_gpa() itself.
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Fixes: aee738236dca ("KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs", 2022-11-18)
Link: https://patch.msgid.link/20260503200905.106077-1-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For eDP low vdiff, the LDO setting depends on the PHY version rather
than being a simple 0x0 or 0x1 value. Introduce a PHY callback to program
the correct LDO setting according to the HPG.
Since SC7280/SC8180X uses different LDO settings from SA8775P/SC8280XP,
introduce qcom_edp_phy_ops_v3 to keep the LDO setting correct.
phy: qcom: edp: Fix AUX_CFG8 programming for DP mode
AUX_CFG8 depends on whether the PHY is operating in eDP or DP mode, not
the selected swing/pre-emphasis table. All supported platforms already
have the proper tables, so remove the unnecessary check.
SC7280 and SC8180X previously shared the same cfg because they did not use
swing/pre-emphasis tables. Add the corresponding tables for these
platforms. Since they have different PHY sub-versions, their eDP/DP mode
tables also differ, so move SC8180X to its own cfg instead of reusing the
SC7280 one.
The eDP PHY supports both eDP/DP modes, each requiring a different
swing/pre-emphasis table. However, the driver currently uses a fixed
static table for eDP programming rather than selecting the appropriate
table based on the current mode. Add separate tables for eDP and DP
modes, and select the appropriate table dynamically based on the
current mode.
Glymur's DP mode table differs from the other platforms, add a
dedicated table for it.
This also fixes the table mismatch for X1E80100 (eDP) and SA8775P (DP).
Cc: stable@vger.kernel.org
Fixes: 3f12bf16213c ("phy: qcom: edp: Add support for eDP PHY on SA8775P")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Yongxing Mou <yongxing.mou@oss.qualcomm.com>
Link: https://patch.msgid.link/20260427-edp_phy-v5-2-3bb876824475@oss.qualcomm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Fabrizio Castro [Tue, 3 Feb 2026 12:42:45 +0000 (12:42 +0000)]
dt-bindings: watchdog: renesas,r9a09g057-wdt: Rework example
When the bindings for the Renesas RZ/V2H(P) SoC were factored
out, IP WDT0 was selected for the example; however, the HW user
manual states that only IP WDT1 can be used by Linux.
This commit is part of a series that removes WDT{0,2,3} support
from the kernel, therefore the example from the bindings has
lost its meaning.
Linus Torvalds [Sun, 3 May 2026 15:58:42 +0000 (08:58 -0700)]
Merge tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh fix from John Paul Adrian Glaubitz:
"The ZERO_PAGE consolidation in v7.1 introduced a regression on sh
which made these systems unbootable.
The problem was that on sh, the initial boot parameters were
previously referenced as an array and after 6215d9f4470f ("arch, mm:
consolidate empty_zero_page"), they were referenced as a pointer which
caused wrong code generation and boot hang.
This changes the declaration back to being an array which fixes the
boot hang"
* tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: Fix fallout from ZERO_PAGE consolidation
Michael Walle [Mon, 2 Mar 2026 12:24:51 +0000 (13:24 +0100)]
dt-bindings: watchdog: Drop SMARC-sAM67 support
I was just informed that this product is discontinued (without ever
being released to the market). Pull the plug, let's not waste any more
maintainers' time, and revert commit 354f31e9d2a3 ("dt-bindings: watchdog:
Add SMARC-sAM67 support").
watchdog: sama5d4_wdt: Fix WDDIS detection on SAM9X60 and SAMA7G5
The driver hardcoded AT91_WDT_WDDIS (bit 15) in wdt_enabled and the
probe initial state readout. SAM9X60 and SAMA7G5 use bit 12
(AT91_SAM9X60_WDDIS), causing incorrect WDDIS detection.
Introduce a per-device wddis_mask field to select the correct WDDIS
bit based on the compatible string.
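A per-device mask lookup of this kind might look as follows. This is a sketch only: the two bit positions come from the text above, while the compatible strings and the helper name wdt_enabled are assumptions made for illustration.

```python
AT91_WDT_WDDIS = 1 << 15        # bit 15, as stated in the commit message
AT91_SAM9X60_WDDIS = 1 << 12    # bit 12, as stated in the commit message

# Assumed compatible strings, used here only to illustrate the lookup.
WDDIS_MASK_BY_COMPATIBLE = {
    "atmel,sama5d4-wdt": AT91_WDT_WDDIS,
    "microchip,sam9x60-wdt": AT91_SAM9X60_WDDIS,
    "microchip,sama7g5-wdt": AT91_SAM9X60_WDDIS,
}


def wdt_enabled(mode_register, compatible):
    """The watchdog counts as enabled when the device's WDDIS bit is clear."""
    wddis_mask = WDDIS_MASK_BY_COMPATIBLE[compatible]
    return not (mode_register & wddis_mask)
```

With only bit 12 set in the mode register, a SAM9X60 watchdog is reported as disabled, whereas testing the same value against bit 15 would wrongly report it as enabled.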
Jonathan Corbet [Sun, 3 May 2026 15:49:52 +0000 (09:49 -0600)]
Merge branch 'mauro' into docs-mw
Mauro says:
This patch series changes the way maintainer entry profile links
are added to the documentation. Instead of having an entry for
each of them in a ReST file, get them from the MAINTAINERS content.
That should make them easier to maintain, as there will be a single
point to place all such profiles.
The output is a per-subsystem sorted (*) series of links shown as a
list like this:
- Arm And Arm64 Soc Sub-Architectures (Common Parts)
- Arm/Samsung S3C, S5P And Exynos Arm Architectures
- Arm/Tesla Fsd Soc Support
...
- Xfs Filesystem
Please note that the series makes one logical change per patch.
I could have merged some changes together, but I opted to do it
in small steps to help reviews. If you prefer, feel free to merge
the maintainers_include changes on merge.
There is one interesting side effect of this series: there is no
need to add rst files containing profiles inside a TOC tree: Just
creating the file anywhere inside Documentation and adding a P entry
is enough. Adding them to a TOC won't hurt.
Hrishabh Rajput [Wed, 11 Mar 2026 05:46:31 +0000 (11:16 +0530)]
watchdog: Add driver for Gunyah Watchdog
On Qualcomm SoCs running under the Gunyah hypervisor, access to watchdog
through MMIO is not available on all platforms. Depending on the
hypervisor configuration, the watchdog is either fully emulated or
exposed via ARM's SMC Calling Conventions (SMCCC) through the Vendor
Specific Hypervisor Service Calls space.
Add driver to support the SMC-based watchdog provided by the Gunyah
Hypervisor. Device registration is done in the QCOM SCM driver after
checks to restrict the watchdog initialization to Qualcomm devices
running under Gunyah.
The Gunyah watchdog is not hardware but an SMC-based vendor-specific
hypervisor interface provided by the Gunyah hypervisor. The design
involving the QCOM SCM driver for registering the platform device has
been devised to avoid adding non-hardware nodes to devicetree.
Flavio Suligoi [Mon, 23 Mar 2026 12:52:04 +0000 (13:52 +0100)]
watchdog: gpio_wdt: add ACPI support
The gpio_wdt device driver uses the device property APIs, so it is
firmware agnostic. For this reason we can now add the ACPI support in
Kconfig.
In this way it can be used seamlessly in ACPI and DT systems.
For example, a typical GPIO watchdog device configuration, in an ACPI
SSDT table, could be:
Steve Wahl [Wed, 18 Mar 2026 15:50:05 +0000 (10:50 -0500)]
watchdog/hpwdt: Refine hpwdt message for UV platform
The watchdog hardware the hpwdt driver uses was added to the UV
platform for UV_5, but the logging mentioned by this driver was not
added to the BIOS. When the watchdog fires, the printed message had
the administrators and developers looking for non-existent log files,
and confused about whether a watchdog actually tripped.
Change the message that prints on UV platforms so it doesn't send the
user looking for non-existent logs.
To aid in any future debugging, include all 8 bits of the NMISTAT
register in the output, not just the two bits being used to determine
this was "mynmi". And provide names to the bits in NMISTAT so the
code is easier to understand.
Since commit 1f4ea4838b13 ("mcb: Add missing modpost build support")
the MODULE_ALIAS() is redundant as the module alias is now
automatically generated from the MODULE_DEVICE_TABLE().
watchdog: sp5100_tco: Use EFCH MMIO for newer Hygon FCH
Commit 009637de1f65 ("watchdog: sp5100_tco: support Hygon FCH/SCH
(Server Controller Hub)") added Hygon vendor matching to the efch
layout selection, but newer Hygon 0x790b SMBus devices still need the
efch_mmio path.
The efch_mmio path enables EFCH_PM_DECODEEN_WDT_TMREN before probing the
watchdog MMIO block. If firmware leaves that bit clear and the driver
picks the legacy efch path instead, probe falls back to the alternate
window and fails with "Watchdog hardware is disabled".
Select efch_mmio for Hygon 0x790b devices with revision 0x51 or later,
matching the equivalent AMD behavior and allowing the watchdog to
initialize on those systems.
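The selection rule can be sketched as a small function. This is an illustration, not the driver's code: the device ID and revision threshold are taken from the text above, the Hygon vendor ID value is an assumption, and the function name and the string return values are hypothetical.

```python
PCI_VENDOR_ID_HYGON = 0x1D94        # assumed value, for illustration
HYGON_SMBUS_DEVICE_ID = 0x790B      # device ID named in the commit message
EFCH_MMIO_MIN_REVISION = 0x51       # revision threshold named in the commit message


def hygon_layout(vendor, device, revision):
    """Pick the register layout for a Hygon SMBus device (None if not Hygon)."""
    if vendor != PCI_VENDOR_ID_HYGON:
        return None
    if device == HYGON_SMBUS_DEVICE_ID and revision >= EFCH_MMIO_MIN_REVISION:
        return "efch_mmio"
    return "efch"
```

A 0x790b device at revision 0x51 or later selects efch_mmio, while an earlier revision stays on the efch path.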
watchdog: qcom: add support to get the bootstatus from IMEM
When the system boots up after a watchdog reset, the EXPIRED_STATUS bit
in the WDT_STS register is cleared. To identify whether the system was
restarted due to WDT expiry, XBL updates the information in the IMEM region.
Update the driver to read the restart reason from IMEM and populate the
bootstatus accordingly.
With CONFIG_WATCHDOG_SYSFS enabled, the user can extract the information
as below:
Document the "sram" property for the watchdog device on Qualcomm
IPQ platforms. Use this property to extract the restart reason from
IMEM, which is updated by XBL. Populate the watchdog's bootstatus sysfs
entry with this information when the system reboots due to a watchdog
timeout.
Describe this property for the IPQ5424 watchdog device and extend support
to other targets subsequently.
watchdog: qcom: Unify user-visible "Qualcomm" name
Various names for Qualcomm as a company are used in user-visible config
options: QCOM, Qualcomm and Qualcomm Technologies. Switch to unified
"Qualcomm" so it will be easier for users to identify the options when
for example running menuconfig.
watchdog: remove driver for integrated WDT of ZFx86 486-based SoC
The machzwd driver supports the integrated watchdog of the ZF Micro
ZFx86 SoC, which contains a 486-compatible core [1]. Since 486
support was removed in commit 8b793a92d862 ("x86/cpu: Remove
M486/M486SX/ELAN support"), the driver is no longer useful. Remove it.
There are three "types" of profiles:
1. Profiles already included inside subsystem-specific documentation.
This is the most common case;
2. Profiles that are hosted externally;
3. Profiles that are at the same location as maintainer-handbooks.rst.
For (3), we need to create a TOC, as they don't exist elsewhere.
Change the logic to create TOC just for (3), prepending the
content of maintainer-handbooks with a sorted entry of all types,
before the TOC.
With this change, we can have a unique sorted list of profiles,
listing the subsystem names used there.
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <10d06a3f530e07ad981aba93617a9a7f4d63c408.1777295258.git.mchehab+huawei@kernel.org>
docs: maintainers_include: preserve names for files under process/
When a maintainer's profile is stored outside process/, it is
already included in some other book and the name of the filesystem
may not be there. That's why the logic picks the name from the
subsystem's name.
However, files placed directly alongside maintainers-handbooks.rst
(e.g. under Documentation/process/) are a different story: those
aren't included anywhere else, so we can keep using their own names,
letting Sphinx do its thing.
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <46961bc932be804cec19f06d202c23423d4aa12a.1777295258.git.mchehab+huawei@kernel.org>
Instead of manually creating a TOC tree for them, use the new
tag to auto-generate its TOC.
Co-developed-by: Dan Williams <djbw@kernel.org>
Signed-off-by: Dan Williams <djbw@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Message-ID: <9228f77b0339b8e5dea4a201ab6d4feb30cef5c2.1776176108.git.mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <65e3f1d51eda0984ac945f50128b593f848584bc.1777295258.git.mchehab+huawei@kernel.org>
Add a feature to allow auto-generating media entry profiles from the
corresponding field inside MAINTAINERS file(s).
Suggested-by: Dan Williams <djbw@kernel.org>
Closes: https://lore.kernel.org/linux-doc/69dd6299440be_147c801005b@djbw-dev.notmuch/
Acked-by: Dan Williams <djbw@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Message-ID: <4e9512a3d05942c98361d06d60a118d7c78762b6.1776176108.git.mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <eac5fb166bcbf0f267e762d72b150bbf248ff057.1777295258.git.mchehab+huawei@kernel.org>
Linus Torvalds [Sun, 3 May 2026 15:19:57 +0000 (08:19 -0700)]
Merge tag 'slab-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
- Stable fixes for CONFIG_SMP=n where _nolock() allocations in NMI both
at kmalloc and page allocator levels are not properly protected by
the spin_trylock() semantics on !SMP (Harry Yoo)
* tag 'slab-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: return NULL early from kmalloc_nolock() in NMI on UP
mm/page_alloc: return NULL early from alloc_frozen_pages_nolock() in NMI on UP
Randy Dunlap [Sat, 28 Feb 2026 01:04:01 +0000 (17:04 -0800)]
docs: watchdog-kernel-api: general cleanups
Fix grammar and punctuation.
Add a missing struct member (pm_nb) and its description.
Add a subheading for Helper Functions between the struct descriptions
and just pure helper functions.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260228010402.2389343-5-rdunlap@infradead.org>
Linus Torvalds [Sun, 3 May 2026 15:17:09 +0000 (08:17 -0700)]
Merge tag 'locking-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix lockup in requeue-PI during signal/timeout wakeups, by Sebastian
Andrzej Siewior"
* tag 'locking-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Prevent lockup in requeue-PI during signal/ timeout wakeup
Nicolas Pitre [Wed, 22 Apr 2026 21:10:39 +0000 (17:10 -0400)]
cramfs: drop obsolete Future Development notes and update tools URL
fs/cramfs/README still carries a "Future Development" section written
around the Linux 2.3.39 era discussing design decisions that have long
since been settled:
- Block size is 4096 and matches PAGE_SIZE on the reader.
- Endianness is host-endian; mkcramfs -B / -L handles cross-builds.
- Block-pointer extensions (CRAMFS_FLAG_EXT_BLOCK_POINTERS) ended up
being the answer to layout flexibility rather than inode growth.
Drop the stale forward-looking section -- the factual format
description above remains accurate and is kept as-is.
While here, replace the dead sourceforge URL in the "Tools" section
with the current location of the user-space tools on github.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260422211039.270552-2-nico@fluxnic.net>
Nicolas Pitre [Wed, 22 Apr 2026 21:10:38 +0000 (17:10 -0400)]
Documentation: filesystems: cramfs: correct stale hard-link and endianness claims
Two paragraphs in cramfs.rst have been misleading for a long time:
- "Hard links are supported, but hard linked files will still have
a link count of 1": mkcramfs does not preserve hard links; it
deduplicates by content (eliminate_doubles()). Two names for
the same on-disk inode in the source tree become two separate
(content-shared) entries in the image, and cramfs always reports
a link count of 1.
- "Currently, cramfs must be written and read with architectures of
the same endianness ... PAGE_SIZE == 4096 ... is a bug, but it
hasn't been decided what the best fix is": the endianness
situation has been settled for years -- the kernel checks for
CRAMFS_MAGIC_WEND in cramfs_fill_super() and refuses the mount,
and mkcramfs has gained -B / -L for producing images of the
opposite endianness from the build host (useful for cross-builds,
but the reader still needs to match). Restate this accurately.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260422211039.270552-1-nico@fluxnic.net>
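The endianness check described in the second item can be sketched as follows. This is an illustration, not the code in cramfs_fill_super(): it only classifies the first four bytes of an image, the function name classify_superblock and its return strings are hypothetical, and CRAMFS_MAGIC_WEND is taken to be CRAMFS_MAGIC with its bytes swapped.

```python
import sys

CRAMFS_MAGIC = 0x28CD3D45
CRAMFS_MAGIC_WEND = 0x453DCD28      # CRAMFS_MAGIC as seen with the wrong endianness


def classify_superblock(first4, reader_byteorder=sys.byteorder):
    """Classify an image by its first four bytes, read in the reader's byte order."""
    magic = int.from_bytes(first4, reader_byteorder)
    if magic == CRAMFS_MAGIC:
        return "ok"
    if magic == CRAMFS_MAGIC_WEND:
        return "wrong endianness"   # the kernel refuses the mount in this case
    return "not cramfs"
```

An image written on a little-endian host reads as "ok" on a little-endian reader and as "wrong endianness" on a big-endian one, which is why an image built with mkcramfs -B or -L still has to match the reader.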
The sched monitor page was linking to Daniel's website which is now
down. The main purpose of the link was to point to a source for the
models from the original author and that can be found also in his
published paper.
Replace the link with a reference to Daniel's "A thread synchronization
model for the PREEMPT_RT Linux kernel", which can be found online and
includes the model definitions as well as the work behind them (not the
original patches, but since they're based on a 5.0 kernel and are mostly
included upstream, there's little value in keeping them in the docs).
Fixes: 03abeaa63c08 ("Documentation/rv: Add docs for the sched monitors")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Acked-by: Matteo Martelli <matteo.martelli@codethink.co.uk>
Tested-by: Matteo Martelli <matteo.martelli@codethink.co.uk>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260427131709.170505-2-gmonaco@redhat.com>
Documentation/kernel-parameters: Remove "Deprecated" from isolcpus=
The isolcpus= option was marked as deprecated in 2017. Back then it
was desired for the domain sub option to be configured dynamically at
runtime instead of using this boot command line option, which provides
a static configuration. In the meantime this option was extended by
other sub options which don't have a runtime counterpart or for which
it does not make sense to provide one.
The deprecated part always referred to the default `domain' sub option
but it was not obvious. Also the reasoning behind the deprecation is
sort of dubious: There is nothing wrong with a static configuration if
there is no desire to reconfigure. This is useful on systems which
have one purpose and whose CPU partition configuration is not changed
for the entire lifetime.
Remove the "Deprecated" note. Remove the part of the description which
suggests using cpuset.sched_load_balance and instead point to the
documentation file which explains how to use cpusets to configure this
at runtime.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260427150739.bwVmmkj2@linutronix.de>
Daniel Pereira [Tue, 28 Apr 2026 18:02:05 +0000 (15:02 -0300)]
docs/pt_BR: process: link maintainer-kvm-x86 in maintainer-handbooks
The Portuguese translation of maintainer-kvm-x86.rst exists in the
directory, but it was not listed in the toctree of
maintainer-handbooks.rst.
Add the missing entry to ensure the document is properly indexed and
reachable through the main maintainer handbook page.
Signed-off-by: Daniel Pereira <danielmaraboo@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260428180208.175472-1-danielmaraboo@gmail.com>
Linus Torvalds [Sun, 3 May 2026 15:05:23 +0000 (08:05 -0700)]
Merge tag 'sched-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix the delayed dequeue negative lag increase fix in the
fair scheduler (Peter Zijlstra)
- Fix wakeup_preempt_fair() to do proper delayed dequeue
(Vincent Guittot)
- Clear sched_entity::rel_deadline when initializing
forked entities, which bug can cause all tasks to be
EEVDF-ineligible, causing a NULL pointer dereference
crash in pick_next_entity() (Zicheng Qu)
* tag 'sched-urgent-2026-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Clear rel_deadline when initializing forked entities
sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue
sched/fair: Fix the negative lag increase fix
Signed-off-by: Manuel Ebner <manuelebner@mailbox.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260429072320.310817-2-manuelebner@mailbox.org>
Miles Krause [Wed, 29 Apr 2026 22:24:35 +0000 (18:24 -0400)]
Documentation/scheduler: Fix duplicated word in sched-deadline
The SCHED_DEADLINE documentation has a duplicated "the" in the CPU
affinity section.
Remove the extra word.
Signed-off-by: Miles Krause <mileskrause5200@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260429222435.2041-1-mileskrause5200@gmail.com>
Akiyoshi Kurita [Sat, 2 May 2026 07:01:43 +0000 (16:01 +0900)]
docs/ja_JP: translate more of submitting-patches.rst
Translate the "Separate your changes", "Style-check your changes",
and "Select the recipients for your patch" sections in
Documentation/translations/ja_JP/process/submitting-patches.rst.
Keep the wording close to the English text and wrap lines to match
the style used in the surrounding Japanese translation.
selftests/nolibc: avoid function pointer comparisons
The upcoming parisc support would require libgcc to implement function
pointer comparisons. As we try to avoid the libgcc dependency, rework
the logic to work without such comparisons.
Costa Shulyupin [Sat, 2 May 2026 12:02:05 +0000 (15:02 +0300)]
docs: Remove stale ISDN parameters
The icn= and pcbit= parameters referenced drivers removed in
commit 02bbd9802da7 ("staging: i4l: delete the whole thing").
Remove the stale parameter entries and the now-unused ISDN tag
from the legend.
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Assisted-by: Claude:claude-opus-4-6
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260502120206.1289126-1-costa.shul@redhat.com>
Consolidation of empty_zero_page declarations broke boot on sh.
sh stores its initial boot parameters in a page reserved in
arch/sh/kernel/head_32.S. Before commit 6215d9f4470f ("arch, mm:
consolidate empty_zero_page") this page was referenced in C code
as an array and after that commit it is referenced as a pointer.
This causes wrong code generation and boot hang.
Declare boot_params_page as an array to fix the issue.
Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Fixes: 6215d9f4470f ("arch, mm: consolidate empty_zero_page")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Thomas Weißschuh [Wed, 29 Apr 2026 20:58:37 +0000 (22:58 +0200)]
selftests/nolibc: use QEMU_ARCH for QEMU_ARCH_USER
The current logic forces the XARCH to QEMU_ARCH mapping to contain
entries for all architectures. This will change. To avoid duplication
of that logic, reuse the already computed QEMU_ARCH variable.
Eliot Courtney [Fri, 1 May 2026 10:49:37 +0000 (19:49 +0900)]
rust: drm: fix unsound initialization in drm::Device::new
If pinned initialization of drm::Device::Data fails, it calls
drm::Device::release via drm_dev_put. This materializes a reference to
&drm::Device, but it's not fully constructed yet, because initializing
`data` failed. It should not be dropped either. Instead, if pinned
initialization fails, make sure drm::Device::release isn't called.
Fixes: 2e9fdbe5ec7a ("rust: drm: device: drop_in_place() the drm::Device in release()")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260501-fix-drm-1-v2-1-5c4f681837bc@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
firewire: core: move allocation/reallocation paths into specific branch after isoc resource management in cdev
After managing the actual isochronous resources, there is
post-processing logic to determine what type of event should be
notified. However, there is room for improvement.
firewire: core: refactor notification type determination after isoc resource management in cdev
After managing the actual isochronous resources, there is
post-processing logic to determine what type of event should be
notified. However, there is room for improvement.
firewire: core: use switch statement for post-processing of isoc resource management in cdev
The iso_resource_auto structure object has three states. The current
implementation of state evaluation before managing the actual isochronous
resources can be improved.
This commit refactors the evaluation logic using a switch statement.
firewire: core: reduce critical section duration in pre-processing of isoc resource management in cdev
It is preferable for the critical section to be as small as possible.
The current implementation of the iso_resource_auto_work() function uses a
spinlock to control concurrent access to members of the fw_card, fw_device,
and iso_resource_auto structures; however, the locking duration could be
reduced.
Guangshuo Li [Mon, 13 Apr 2026 13:46:04 +0000 (21:46 +0800)]
counter: Fix refcount leak in counter_alloc() error path
After device_initialize(), the lifetime of the embedded struct device
is expected to be managed through the device core reference counting.
In counter_alloc(), if dev_set_name() fails after device_initialize(),
the error path removes the chrdev, frees the ID, and frees the backing
allocation directly instead of releasing the device reference with
put_device(). This bypasses the normal device lifetime rules and may
leave the reference count of the embedded struct device unbalanced,
resulting in a refcount leak.
The issue was identified by a static analysis tool I developed and
confirmed by manual review.
Fix this by using put_device() in the dev_set_name() failure path and
let counter_device_release() handle the final cleanup.
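The lifetime rule described above can be illustrated outside the kernel. This is a minimal user-space sketch, not the counter subsystem code: struct fake_device, fake_device_initialize(), fake_put_device() and fake_alloc() are hypothetical stand-ins for struct device, device_initialize(), put_device() and counter_alloc().

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release hook. */
struct fake_device {
    int refcount;
    void (*release)(struct fake_device *dev);
};

int release_calls; /* counts how often the release hook ran */

void fake_release(struct fake_device *dev)
{
    release_calls++;
    free(dev);
}

/* Models device_initialize(): the object starts with one reference. */
void fake_device_initialize(struct fake_device *dev)
{
    dev->refcount = 1;
    dev->release = fake_release;
}

/* Models put_device(): drop a reference, release on the last one. */
void fake_put_device(struct fake_device *dev)
{
    if (--dev->refcount == 0)
        dev->release(dev);
}

/* Returns 0 on success, -1 when allocation or the simulated naming step fails. */
int fake_alloc(int name_fails, struct fake_device **out)
{
    struct fake_device *dev = malloc(sizeof(*dev));

    if (!dev)
        return -1;
    fake_device_initialize(dev);
    if (name_fails) {
        /* After initialization, do not free directly: drop the reference. */
        fake_put_device(dev);
        return -1;
    }
    *out = dev;
    return 0;
}
```

On the failure path the only cleanup is the reference drop, so all teardown stays in the release hook and cannot be duplicated or skipped.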
====================
Intel Wired LAN Updates 2026-04-30 (ixgbe, i40e, ice)
This series includes updates to support Energy-Efficient Ethernet (EEE) on
E610 devices in the ixgbe driver, support for an unmanaged DPLL output on
E830, as well as some other minor cleanups and improvements across ixgbe,
i40e, and ice.
Jedrzej begins with the first six patches preparing the ixgbe driver to
support EEE, adding an EEE capability flag, updating the supported EEE
speeds, updating the ACI command structures with the fields related to
EEE, moving the EEE config validation out for re-use, and finally
implementing the EEE support for E610 hardware.
Aleksandr fixes the ixgbe_update_flash_X550() logic to prevent unaligned
access in ixgbe_host_interface_command(). Note: this has no functional
change on x86, and is being sent through net-next as it is considered a
minor cleanup.
Jacob (hi!) modifies the i40e driver to only timestamp PTP event packets,
instead of timestamping every V2 frame. This avoids wasting the
limited number of timestamp slots for frames which the PTP protocol does
not care about.
Jacob also extends the devlink flash notification message reporting that
users can activate the new firmware via devlink reload to explicitly
indicate the required "fw_activate" action.
Byungchul Park fixes the ice_lbtest_receive_frames() function to use
netmem_desc instead of the page structure.
Przemyslaw Korba fixes a truncation warning in ice_dpll_init_fwnode_pins()
by increasing the allowed length of the pin_name string on the stack to 16.
Ivan Vecera adds some bounds checking to ice_dpll_rclk_state_on_pin_get/set()
and moves the CGU register macros to be under the header guard ifdef in
ice_dpll.h.
====================
Introduced by commit ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and
dynamic pin discovery"):
ice_dpll.c: In function ‘ice_dpll_init’:
ice_dpll.c:3588:59: error: ‘%u’ directive output may be truncated
writing between 1 and 10 bytes into a region of size 4
[-Werror=format-truncation=] snprintf(pin_name, sizeof(pin_name),
"rclk%u", i);
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-13-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
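The arithmetic behind the warning can be checked in user space. This is a sketch that assumes a 32-bit unsigned int; rclk_name_bytes() and rclk_name_fits() are hypothetical helpers, not driver functions. "rclk" takes 4 bytes, "%u" up to 10 digits, plus the terminating NUL, so 15 bytes are the worst case and a 16-byte buffer is sufficient, while an 8-byte buffer leaves only 4 bytes after the prefix.

```c
#include <limits.h>
#include <stdio.h>

/* Bytes needed to hold "rclk<i>" including the terminating NUL. */
int rclk_name_bytes(unsigned int i)
{
    /* snprintf(NULL, 0, ...) returns the length the output would have. */
    return snprintf(NULL, 0, "rclk%u", i) + 1;
}

/* Returns 1 if the name fit into a buffer of the given size, 0 if truncated. */
int rclk_name_fits(char *buf, size_t size, unsigned int i)
{
    int n = snprintf(buf, size, "rclk%u", i);

    return n >= 0 && (size_t)n < size;
}
```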
Byungchul Park [Fri, 1 May 2026 06:37:23 +0000 (23:37 -0700)]
ice: access @pp through netmem_desc instead of page
To eliminate the use of struct page in page pool, the page pool users
should use the netmem descriptor and APIs instead.
Make the ice driver access @pp through netmem_desc instead of page.
Signed-off-by: Byungchul Park <byungchul@sk.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-12-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacob Keller [Fri, 1 May 2026 06:37:22 +0000 (23:37 -0700)]
ice: mention fw_activate action along with devlink reload
The ice driver reports a helpful status message when updating firmware
indicating what action is necessary to enable the new firmware. This is
done because some updates require power cycling or rebooting the machine
but some can be activated via devlink.
The ice driver only supports activating firmware with the specific action
of "fw_activate"; a bare "devlink dev reload" will *not* update the
firmware, and will only perform driver reinitialization.
Update the status message to explicitly reflect that the reload must use
the fw_activate action.
I considered modifying the text to spell out the full command, but felt
that was both overkill and something that would belong better as part of
the user space program and not hard coded into the kernel driver output.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-11-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacob Keller [Fri, 1 May 2026 06:37:21 +0000 (23:37 -0700)]
i40e: only timestamp PTP event packets
The i40e_ptp_set_timestamp_mode() function is responsible for configuring
hardware timestamping. When programming receive timestamping, the logic
must determine how to configure the PRTTSYN_CTL1 register for receive
timestamping.
The i40e hardware does not support timestamping all frames. Instead,
timestamps are captured into one of the four PRTTSYN_RXTIME registers.
Currently, the driver configures hardware to timestamp all V2 packets on
ports 319 and 320, including all message types. This timestamps
significantly more packets than is actually requested by the
HWTSTAMP_FILTER_PTP_V2_EVENT filter type.
The documentation for HWTSTAMP_FILTER_PTP_V2_EVENT indicates that it should
timestamp PTP v2 messages on any layer, including any kind of event
packets.
Timestamping other packets is acceptable, but not required by the filter.
Doing so wastes valuable slots in the Rx timestamp registers. For most
applications this doesn't cause a problem. However, for extremely high
rates of messages, it becomes possible that one of the critical event
packets is not timestamped.
The PTP protocol only requires timestamps for event messages on port 319,
but hardware is timestamping on both 319 and 320, and timestamping message
types which do not need a timestamp value.
The i40e hardware actually has a more strict filtering option. First, only
timestamp layer 4 messages on port 319 instead of both 319 and 320. Second,
note that hardware has a specific mode to timestamp only event packets
(those with message type < 8).
Update the configuration to use the strict mode that only timestamps event
messages, switching the TSYNTYPE field from 10b to 11b, which limits
timestamping to event packets with a Message Type of < 8. Note that the
X700 series datasheet seems to indicate that the V2MESSTYPE field is no
longer relevant. However, we only tested and validated with the
V2MESSTYPE field left set to 0xF for the "wildcard" behavior it documents. This
might not be required, but in that case setting it appears harmless, so
leave it as is.
This avoids wasting the valuable Rx timestamp register slots on non-event
frames, and may reduce faults when operating under high event rates.
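The strict filter described above can be modeled in software. This is only an illustrative sketch of the selection rule (port 319 and message type < 8), not the register programming; the function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTP_EVENT_PORT   319 /* UDP port for PTP event messages */
#define PTP_GENERAL_PORT 320 /* UDP port for PTP general messages */

/* PTPv2 event messages have a messageType below 8 (Sync, Delay_Req, ...). */
bool ptp_msg_is_event(uint8_t message_type)
{
    /* messageType is the low nibble of the first PTP header byte. */
    return (message_type & 0x0F) < 8;
}

/* Hypothetical software model of the strict filter: event port, event type. */
bool strict_filter_wants_timestamp(uint16_t udp_dst_port, uint8_t message_type)
{
    return udp_dst_port == PTP_EVENT_PORT && ptp_msg_is_event(message_type);
}
```

Under this model a Sync (type 0x0) on port 319 consumes a timestamp slot, while a Follow_Up (type 0x8) or Announce (type 0xB) on port 320 does not.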
ixgbe: fix unaligned u32 access in ixgbe_update_flash_X550()
ixgbe_host_interface_command() treats its buffer as a u32 array. The
local buffer we pass in was a union of byte-sized fields, which gives
it 1-byte alignment on the stack. On strict-align architectures this
can cause unaligned 32-bit accesses.
Add a u32 member to union ixgbe_hic_hdr2 so the object is 4-byte
aligned, and pass the u32 member when calling
ixgbe_host_interface_command().
No functional change on x86; prevents unaligned accesses on
architectures that enforce natural alignment.
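The effect of adding a u32 member can be seen with C11 _Alignof. The unions below are hypothetical and much smaller than union ixgbe_hic_hdr2; they only demonstrate that a union takes the alignment of its most strictly aligned member.

```c
#include <stdint.h>

/* Hypothetical union built only from byte-sized fields (typically 1-byte aligned). */
union hdr_bytes_only {
    struct { uint8_t cmd; uint8_t len; uint8_t a; uint8_t b; } req;
    struct { uint8_t cmd; uint8_t len; uint8_t status; uint8_t c; } rsp;
};

/* Same layout with a u32 member added: alignment rises to that of uint32_t. */
union hdr_with_u32 {
    struct { uint8_t cmd; uint8_t len; uint8_t a; uint8_t b; } req;
    struct { uint8_t cmd; uint8_t len; uint8_t status; uint8_t c; } rsp;
    uint32_t u32;
};

/* Compile-time check that the u32 member dictates the union's alignment. */
_Static_assert(_Alignof(union hdr_with_u32) >= _Alignof(uint32_t),
               "u32 member must set the minimum alignment");
```

A pointer to the u32 member of such an object can then be passed to a function that reads the buffer as 32-bit words without risking a misaligned access.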
Fixes: 49425dfc7451 ("ixgbe: Add support for x550em_a 10G MAC type") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers") Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-9-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add E610 specific implementation of .get_eee() and .set_eee() ethtool
callbacks.
Introduce ixgbe_setup_eee_e610(), which is used to set the EEE config
on an E610 device via ixgbe_aci_set_phy_cfg() (0x0601 ACI command).
Assign it to a dedicated MAC operation.
E610 devices support the EEE feature specifically for 2.5G, 5G and 10G link
speeds. When the user tries to set EEE for unsupported speeds, log it.
Setting the timer and setting EEE advertised speeds are not yet supported.
EEE shall be enabled by default for E610 devices.
Add EEE status logging during the link watchdog run.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-6-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ixgbe: E610: update ACI command structs with EEE fields
There were recent changes in some of the ACI commands,
which have been extended with EEE related fields.
Set PHY Config, Get PHY Caps and Get Link Info have been
affected.
Align SW structs to the recent FW changes.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-4-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ixgbe: E610: use new version of 0x601 ACI command buffer
Since FW version 1.40, the buffer size of the 0x601 command has been increased
by 2B, from 24B to 26B. The buffer has been extended with a new field
which can be used to configure the EEE entry delay.
Pre-1.40 FW versions still expect a 24B buffer and throw an error when
they receive a 26B buffer. To keep compatibility, check whether the EEE
device capability flag is set and, based on it, use the appropriate
size of the command buffer.
Additionally, place the Set PHY Config capabilities defines outside of
the struct definitions.
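The compatibility rule above amounts to choosing one of two buffer lengths. This is a minimal sketch under that reading; the macro names and phy_cfg_buf_len() are hypothetical and do not appear in the driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define PHY_CFG_BUF_LEN_LEGACY 24 /* expected by FW older than 1.40 */
#define PHY_CFG_BUF_LEN_EEE    26 /* FW 1.40 and later, adds the EEE entry delay field */

/* Hypothetical helper: choose the 0x601 buffer length from the EEE capability. */
size_t phy_cfg_buf_len(bool fw_has_eee_cap)
{
    return fw_has_eee_cap ? PHY_CFG_BUF_LEN_EEE : PHY_CFG_BUF_LEN_LEGACY;
}
```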
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-3-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Although there was no EEE (Energy Efficient Ethernet) feature
support for E610 adapters, the eee_speeds_supported variable was
defined and even initialized with some EEE speeds.
As the E610 adapter supports EEE only for 10G, 5G and 2.5G speeds,
update hw.phy.eee_speeds_supported. Remove the unsupported speeds:
10M, 100M and 1G.
Also add an entry for the 5G speed in the EEE speeds mapping array used
by ethtool callbacks.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-2-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
EEE functionality support has recently been introduced to E610 FW.
Currently the ixgbe driver has no way to detect whether the NVM
loaded on a given adapter supports EEE.
There is a dedicated device capability element reflecting FW support
for a given EEE link speed.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260430-jk-iwl-net-next-2026-04-30-v1-1-6f27ae1cd073@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
firmware: google: Skip failing entries instead of aborting populate
coreboot_table_populate() registers devices one by one. If
device_register() fails for one entry, the current code returns
immediately, leaving previously registered devices orphaned on the
coreboot bus with no cleanup path.
Since coreboot table entries are independent of each other, a failure
on one entry should not prevent the others from being registered.
This mirrors the strategy used by of_platform_populate(), which skips
individual failures rather than aborting.
Move ptr_entry increment before device_register(), log a warning on
failure, and continue the loop rather than aborting.
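The skip-and-continue strategy can be sketched independently of the coreboot bus. Everything here is hypothetical: populate_entries() stands in for the populate loop and the register_fn callback for device_register(); the point is only that a failed entry is counted and skipped rather than ending the loop.

```c
#include <stddef.h>

/* Hypothetical registration callback: returns 0 on success, nonzero on failure. */
typedef int (*register_fn)(int entry);

/* Walks all entries, skips failures, and returns how many were registered. */
size_t populate_entries(const int *entries, size_t count, register_fn reg,
                        size_t *failed)
{
    size_t ok = 0;

    *failed = 0;
    for (size_t i = 0; i < count; i++) {
        if (reg(entries[i]) != 0) {
            (*failed)++; /* the kernel code logs a warning at this point */
            continue;    /* keep going instead of returning early */
        }
        ok++;
    }
    return ok;
}

/* Example callback that rejects negative entries. */
int reject_negative(int entry)
{
    return entry < 0 ? -1 : 0;
}
```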
Signed-off-by: Titouan Ameline de Cadeville <titouan.ameline@gmail.com> Reviewed-by: Julius Werner <jwerner@chromium.org> Acked-by: Brian Norris <briannorris@chromium.org> Link: https://lore.kernel.org/r/20260501094322.123160-1-titouan.ameline@gmail.com Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Jakub Kicinski [Wed, 29 Apr 2026 22:29:38 +0000 (15:29 -0700)]
net: tls: fix silent data drop under pipe back-pressure
tls_sw_splice_read() uses len when advancing rxm->offset / rxm->full_len
after skb_splice_bits(), rather than copied (the actual number of bytes
successfully spliced into the pipe). When the destination pipe cannot
accept all the requested bytes, splice_to_pipe() returns fewer bytes
than len, and 'len - copied' of data is effectively skipped over.
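The bookkeeping error can be shown with a small model. This is a sketch, not the TLS code: struct rec_state stands in for the rxm->offset / rxm->full_len pair, and splice_some() is a hypothetical function that models a pipe with limited room. The record is advanced by the bytes actually moved, so nothing is skipped when the pipe accepts less than requested.

```c
#include <stddef.h>

/* Hypothetical stand-in for the record bookkeeping (offset and remaining length). */
struct rec_state {
    size_t offset;   /* start of unread data in the record */
    size_t full_len; /* bytes of unread data remaining */
};

/* Advance by the bytes actually consumed, never by the bytes requested. */
void rec_consume(struct rec_state *rec, size_t copied)
{
    rec->offset += copied;
    rec->full_len -= copied;
}

/* Models a splice into a pipe that has room for only pipe_room bytes. */
size_t splice_some(struct rec_state *rec, size_t len, size_t pipe_room)
{
    size_t copied = len < pipe_room ? len : pipe_room;

    if (copied > rec->full_len)
        copied = rec->full_len;
    rec_consume(rec, copied);
    return copied;
}
```

With a 100-byte record and room for 40 bytes, the first call moves 40 bytes and leaves 60 for a later call; advancing by the requested length instead would have dropped those 60 bytes.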