Johan Hovold [Fri, 10 Apr 2026 08:17:44 +0000 (10:17 +0200)]
spi: sifive: fix controller deregistration
Make sure to deregister the controller before disabling underlying
resources like interrupts during driver unbind.
Note that clocks were also disabled before the recent commit 140039c23aca ("spi: sifive: Simplify clock handling with
devm_clk_get_enabled()").
Fixes: 484a9a68d669 ("spi: sifive: Add driver for the SiFive SPI controller") Cc: stable@vger.kernel.org # 5.1 Cc: Yash Shah <yash.shah@sifive.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260410081757.503099-15-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Fri, 10 Apr 2026 08:17:42 +0000 (10:17 +0200)]
spi: sh-hspi: fix controller deregistration
Make sure to deregister the controller before releasing underlying
resources like clocks during driver unbind.
Fixes: 49e599b8595f ("spi: sh-hspi: control spi clock more correctly") Cc: stable@vger.kernel.org # 3.4 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260410081757.503099-13-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Fri, 10 Apr 2026 08:17:40 +0000 (10:17 +0200)]
spi: rspi: fix controller deregistration
Make sure to deregister the controller before releasing underlying
resources like DMA during driver unbind.
Fixes: 9e03d05eee4c ("spi: rcar: Use devm_spi_register_master()") Cc: stable@vger.kernel.org # 3.14 Cc: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260410081757.503099-11-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Johan Hovold [Fri, 10 Apr 2026 08:17:33 +0000 (10:17 +0200)]
spi: mxs: fix controller deregistration
Make sure to deregister the controller before releasing underlying
resources like DMA during driver unbind.
Fixes: 33e195acf268 ("spi: mxs: use devm_spi_register_master()") Cc: stable@vger.kernel.org # 3.13 Cc: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260410081757.503099-4-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
spi: tegra210-quad: Fix false positive WARN on interrupt timeout with transfer complete
The WARN_ON_ONCE/WARN_ON fired unconditionally on any completion
timeout, including the recoverable case where the interrupt was lost but
the hardware actually finished the transfer. This produced a noisy splat
with a full call trace even though the driver successfully recovered via
tegra_qspi_handle_timeout().
Since tegra210 uses threaded interrupts, the transfer completion can be
signaled before the interrupt fires, making this false positive case
common in practice.
Almost all the hosts I sysadmin in my fleet produce the following splat:
WARNING: CPU: 47 PID: 844 at drivers/spi/spi-tegra210-quad.c:1226 tegra_qspi_transfer_one_message+0x8a4/0xba8
....
tegra-qspi NVDA1513:00: QSPI interrupt timeout, but transfer complete
Move WARN_ON_ONCE/WARN_ON to fire only on real unrecoverable timeouts,
i.e., when tegra_qspi_handle_timeout() confirms the hardware did NOT
complete. This makes the warning actionable instead of just polluting
the metrics.
Document the system_cc_shared dma-buf heap that was introduced
recently. Describe its purpose, availability conditions and
relation to confidential computing VMs.
Mark Brown [Fri, 10 Apr 2026 12:20:01 +0000 (13:20 +0100)]
spi: rzv2h-rspi: Fix max_speed_hz and clock configuration issues
Prabhakar <prabhakar.csengg@gmail.com> says:
This patch series addresses three issues in the RZV2H RSPI driver:
1. The max_speed_hz field was advertising a prohibited bit rate, which
could lead to incorrect behavior when userspace applications attempt
to set the SPI clock speed.
2. The clock configuration logic allowed for an invalid combination of
SPR=0 and BRDV=0, which is not supported by the hardware.
3. Simplified the clock rate search function as min/max speed parameters
are not needed.
spi: rzv2h-rspi: Simplify clock rate search function signatures
The spr_min and spr_max parameters passed to
rzv2h_rspi_find_rate_variable() and rzv2h_rspi_find_rate_fixed() were
always called with RSPI_SPBR_SPR_MIN and RSPI_SPBR_SPR_MAX respectively.
There is no need to pass these as parameters since the valid SPR range
is fixed by the hardware.
The combination of SPR=0 and BRDV=0 results in the minimum division
ratio of 2, producing the maximum possible bit rate for a given clock
source. This combination is not supported in two cases:
- On RZ/G3E, RZ/G3L, RZ/V2H(P) and RZ/V2N, RSPI_n_TCLK is fixed at
200MHz, which would yield 100Mbps. The next hardware manual update
will explicitly state that since the maximum frequency of the
RSPICKn clock signal is 50MHz, settings with N=0 and n=0 resulting
in 100Mbps are prohibited.
- On RZ/T2H and RZ/N2H, when PCLK (125MHz) is used as the clock
source, SPR=0 and BRDV=0 is explicitly listed as unsupported in
the hardware manual (Table 36.7).
Skip the SPR=0/BRDV=0 combination in rzv2h_rspi_find_rate_fixed() to
prevent the driver from selecting an invalid clock configuration on the
affected SoCs.
Additionally, remove the now redundant RSPI_SPBR_SPR_PCLK_MIN define
which was previously set to 1 to work around the PCLK restriction, but
was overly broad as it incorrectly blocked valid combinations such as
SPR=0/BRDV=1 (31.25Mbps on PCLK=125MHz).
Fixes: 8b61c8919dff ("spi: Add driver for the RZ/V2H(P) RSPI IP") Fixes: 1ce3e8adc7d0 ("spi: rzv2h-rspi: add support for using PCLK for transfer clock") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://patch.msgid.link/20260410080517.2405700-3-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
spi: rzv2h-rspi: Fix max_speed_hz advertising prohibited bit rate
On RZ/V2H(P), RZ/G3E and RZ/G3L, RSPI_n_TCLK is fixed at 200MHz.
The max_speed_hz was computed using clk_round_rate(tclk, ULONG_MAX)
with SPR=0 and BRDV=0, resulting in 100Mbps - the exact combination
prohibited on these SoCs. This could cause the SPI framework to request
a speed that rzv2h_rspi_find_rate_fixed() would skip, potentially
leading to a clock selection failure.
On RZ/T2H and RZ/N2H the max_speed_hz was correctly calculated as
50Mbps for both the variable PCLKSPIn and fixed PCLK clock sources.
Since the maximum supported bit rate is 50Mbps across all supported SoC
variants, replace the clk_round_rate() based calculation with a define
RSPI_MAX_SPEED_HZ set to 50MHz and use it directly for max_speed_hz.
Johan Hovold [Fri, 10 Apr 2026 06:47:49 +0000 (08:47 +0200)]
spi: fsl: fix controller deregistration
Make sure to deregister the controller before releasing underlying
resources like DMA during driver unbind.
Fixes: 4178b6b1b595 ("spi: fsl-(e)spi: migrate to using devm_ functions to simplify cleanup") Cc: stable@vger.kernel.org # 4.3 Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260410064749.496888-1-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
Ben Guo [Fri, 10 Apr 2026 02:41:13 +0000 (10:41 +0800)]
docs/zh_CN: update rust/index.rst translation
Update the translation of .../rust/index.rst into Chinese.
Update the translation through commit a592a36e4937
("Documentation: use a source-read extension for the index link boilerplate")
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Ben Guo <ben.guo@openatom.club> Signed-off-by: Alex Shi <alexs@kernel.org>
Update the translation of .../rust/quick-start.rst into Chinese.
Update the translation through commit 5935461b4584
("docs: rust: quick-start: add Debian 13 (Trixie)")
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Ben Guo <ben.guo@openatom.club> Acked-by: Gary Guo <gary@garyguo.net> # Rust Signed-off-by: Alex Shi <alexs@kernel.org>
Update the translation of .../rust/coding-guidelines.rst into Chinese.
Update the translation through commit 4a9cb2eecc78
("docs: rust: add section on imports formatting")
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Ben Guo <ben.guo@openatom.club> Signed-off-by: Alex Shi <alexs@kernel.org>
Update the translation of .../rust/arch-support.rst into Chinese.
Update the translation through commit ccb8ce526807
("ARM: 9441/1: rust: Enable Rust support for ARMv7")
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Ben Guo <ben.guo@openatom.club> Signed-off-by: Alex Shi <alexs@kernel.org>
Song Hongyi [Wed, 11 Mar 2026 12:31:03 +0000 (20:31 +0800)]
docs/zh_CN: sync process/2.Process.rst with English version
The Chinese translation of the development process documentation was
outdated. Sync it with the current English version to ensure consistency.
Key changes include:
- Update versioning examples from 5.x to the 9.x placeholder.
- Add footnote [1] to explain the non-semantic versioning scheme.
- Replace the obsolete LTS kernel table with a link to kernel.org.
- Add a cross-reference for the "interleaved replies" section.
Update the translation through commit 5ce70894f6ca
("Doc: correct spelling and wording mistakes")
Signed-off-by: Song Hongyi <szpcq123@gmail.com> Signed-off-by: Alex Shi <alexs@kernel.org>
LIU Haoyang [Thu, 5 Mar 2026 20:10:58 +0000 (04:10 +0800)]
docs/zh_CN: fix an inconsistent statement in dev-tools/testing-overview
This patch fixes an inconsistent describtion in testing-overview.rst,
which should be ``kmalloc`` instead of ``kmalloc_arry`` according to
the original text.
Signed-off-by: LIU Haoyang <tttturtleruss@gmail.com> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Signed-off-by: Alex Shi <alexs@kernel.org>
Jack Yu [Fri, 10 Apr 2026 06:42:25 +0000 (14:42 +0800)]
ASoC: rt1320-sdw: Add an approach to get new hardware advance gain
Add an approach to get new hardware advance gain,
and if there is no advance gain with this approach,
we can still get advance gain with original method.
Merge tag 'thunderbolt-for-v7.1-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next
Mika writes:
thunderbolt: Changes for v7.1 merge window
This includes following USB4/Thunderbolt changes for the v7.1 merge
window:
- Disable CL-states for Titan Ridge based devices with older firmware.
- MAINTAINER update.
- Simplify allocation of various structures with kzalloc_flex().
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v7.1-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: tunnel: Simplify allocation
thunderbolt: Use kzalloc_flex() for struct tb_path allocation
thunderbolt: dma_port: kmalloc_array + kzalloc to flex
MAINTAINERS: Remove bouncing maintainer, Mika takes over DMA test driver
thunderbolt: Disable CLx on Titan Ridge-based devices with old firmware
thunderbolt: Read router NVM version before applying quirks
Charles Keepax [Fri, 10 Apr 2026 10:45:00 +0000 (11:45 +0100)]
ASoC: SDCA: Update text of FIXME
A couple of attempts to correct this FIXME have been sent upstream but
the situation is not quite a simple as the FIXME implies. Update the
FIXME to include a better description of the situation.
Merge tag 'thermal-v7.1-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Merge updates of assorted thermal drivers for 7.1-rc1 from Daniel
Lezcano:
"- Added an OF node address to output message to make sensor names more
distinguishable (Alexander Stein)
- Added hwmon support for the i.MX97 thermal sensor (Alexander Stein)
- Clamped correctly the results when doing value/temperature conversion
in the Spreadtrum driver (Thorsten Blum)
- Added the SDM670 compatible DT bindings for the Tsens and the lMH
drivers (Richard Acayan)
- Added the SM8750 compatible DT bindings for the Tsens (Manaf
Meethalavalappu Pallikunhi)
- Added the Eliza SoC compatible DT bindings for the Tsens (Krzysztof
Kozlowski)
- Fixed inverted condition check on error in the Spear driver (Gopi
Krishna Menon)
- Converted the DT bindings documentation into DT schema (Gopi Krishna
Menon)
- Used max() macro to increase readibility in the Broadcom STB thermal
sensor (Thorsten Blum)
- Removed stale @trim_offset kernel-doc entry (John Madieu)"
* tag 'thermal-v7.1-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal: renesas: rzg3e: Remove stale @trim_offset kernel-doc entry
thermal/drivers/brcmstb_thermal: Use max to simplify brcmstb_get_temp
dt-bindings: thermal: st,thermal-spear1340: convert to dtschema
thermal/drivers/spear: Fix error condition for reading st,thermal-flags
dt-bindings: thermal: qcom-tsens: Add Eliza SoC TSENS
dt-bindings: thermal: qcom-tsens: Document the SM8750 Temperature Sensor
thermal/drivers/sprd: Use min instead of clamp in sprd_thm_temp_to_rawdata
dt-bindings: thermal: lmh: Add SDM670 compatible
dt-bindings: thermal: tsens: add SDM670 compatible
thermal/drivers/sprd: Fix raw temperature clamping in sprd_thm_rawdata_to_temp
thermal/drivers/sprd: Fix temperature clamping in sprd_thm_temp_to_rawdata
thermal/drivers/imx91: Add hwmon support
thermal/of: Add OF node address to output message
Alexey Charkov [Tue, 31 Mar 2026 15:43:39 +0000 (19:43 +0400)]
regulator: bq257xx: Remove reference to the parent MFD's dev
Drop the ->bq field from the platform data of the bq257xx regulator driver,
which was only used to get the regmap of the parent MFD device, and use the
regmap from the regulator_dev instead, slimming down the code a bit.
arm64: rsi: use linear-map alias for realm config buffer
rsi_get_realm_config() passes its argument to virt_to_phys(), but
&config is a kernel image address and not a linear-map alias.
On arm64 this triggers the below warning:
virt_to_phys used for non-linear address: (____ptrval____) (config+0x0/0x1000)
WARNING: arch/arm64/mm/physaddr.c:15 at __virt_to_phys+0x50/0x70, CPU#0: swapper/0
Modules linked in:
.....
Hardware name: linux,dummy-virt (DT)
pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __virt_to_phys+0x50/0x70
lr : __virt_to_phys+0x4c/0x70
.....
......
Call trace:
__virt_to_phys+0x50/0x70 (P)
arm64_rsi_init+0xa0/0x1b8
setup_arch+0x13c/0x1a0
start_kernel+0x68/0x398
__primary_switched+0x88/0x90
Pass lm_alias(&config) instead so the RSI call uses the linear-map
alias of the same buffer and avoids the boot-time warning.
Add __regmap_init_i3c() and the corresponding regmap_init_i3c() macro to
allow creating a regmap for I3C devices without using the device-managed
version. This mirrors the pattern already established for other buses
such as I2C, SPI and so on, giving drivers more flexibility when
the regmap lifetime is not directly tied to the device.
- Clean up and rearrange the intel_rapl power capping driver to make
the respective interface drivers (TPMI, MSR, and MMOI) hold their
own settings and primitives and consolidate PL4 and PMU support
flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
- Correct kernel-doc function parameter names in the power capping core
code (Randy Dunlap)
* pm-powercap:
powercap: intel_rapl: Consolidate PL4 and PMU support flags into rapl_defaults
powercap: intel_rapl: Move MSR primitives to MSR driver
thermal: intel: int340x: processor: Move MMIO primitives to MMIO driver
powercap: intel_rapl: Move TPMI primitives to TPMI driver
powercap: intel_rapl: Move primitive info to header for interface drivers
powercap: intel_rapl: Remove unused macro definitions
powercap: intel_rapl: Move MSR default settings into MSR interface driver
powercap: intel_rapl: Remove unused AVERAGE_POWER primitive
powercap: correct kernel-doc function parameter names
thermal: intel: int340x: processor: Move RAPL defaults to MMIO driver
powercap: intel_rapl: Move TPMI default settings into TPMI interface driver
powercap: intel_rapl: Allow interface drivers to configure rapl_defaults
powercap: intel_rapl: Use unit conversion macros from units.h
powercap: intel_rapl: Use GENMASK() and BIT() macros
powercap: intel_rapl: Use shifts for power-of-2 operations
powercap: intel_rapl: Simplify rapl_compute_time_window_atom()
powercap: intel_rapl: Remove unused TIME_WINDOW macros
powercap: intel_rapl: Cleanup coding style
powercap: intel_rapl: Add a symbol namespace for intel_rapl exports
Merge branches 'pm-cpuidle', 'pm-opp' and 'pm-sleep'
Merge cpuidle updates, OPP (operating performance points) library
updates, and updates related to system suspend and hibernation for
7.1-rc1:
- Refine stopped tick handling in the menu cpuidle governor and
rearrange stopped tick handling in the teo cpuidle governor (Rafael
Wysocki)
- Add Panther Lake C-states table to the intel_idle driver (Artem
Bityutskiy)
- Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
- Simplify cpuidle_register_device() with guard() (Huisong Li)
- Use performance level if available to distinguish between rates in
OPP debugfs (Manivannan Sadhasivam)
- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
- Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
- Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
Biggers)
* pm-cpuidle:
cpuidle: Simplify cpuidle_register_device() with guard()
cpuidle: clean up dead dependencies on CPU_IDLE in Kconfig
intel_idle: Add Panther Lake C-states table
cpuidle: governors: teo: Rearrange stopped tick handling
cpuidle: governors: menu: Refine stopped tick handling
* pm-opp:
OPP: Move break out of scoped_guard in dev_pm_opp_xlate_required_opp()
OPP: debugfs: Use performance level if available to distinguish between rates
* pm-sleep:
PM: hibernate: return -ENODATA if the snapshot image is not loaded
PM: hibernate: x86: Remove inclusion of crypto/hash.h
Add the MSI Vector A16 HX A8WHG (board MS-15MM) to the DMI quirk table
to enable DMIC support. This laptop uses an AMD Ryzen 9 7945HX (Dragon
Range) with the ACP6x audio coprocessor (rev 0x62) and a Realtek ALC274
codec. The built-in digital microphone is connected via the ACP PDM
interface and requires this DMI entry to be activated.
Tested on MSI Vector A16 HX A8WHG with kernel 6.8.0-107 (Ubuntu 24.04).
DMIC capture device appears as 'acp6x' and records audio correctly.
This codec driver depended on the legacy GPIO API, and nothing
in the kernel is defining the platform data, so get rid of this.
Two in-kernel device trees are defining this codec using
undocumented device tree properties, so support these for now.
The same properties can be defined using software nodes if board
files are desired. The device tree use the "-gpio" rather than
"-gpios" suffix but the GPIO DT parser will deal with that.
Since there may be out of tree users, migrate to GPIO descriptors,
drop the platform data that is unused, and assign the dac_clk the
value that was used in all platforms found in a historical dig,
and support setting the clock to the PLL using the undocumented
device tree property.
Add some menuconfig so the codec can be selected and tested.
netfilter: require Ethernet MAC header before using eth_hdr()
`ip6t_eui64`, `xt_mac`, the `bitmap:ip,mac`, `hash:ip,mac`, and
`hash:mac` ipset types, and `nf_log_syslog` access `eth_hdr(skb)`
after either assuming that the skb is associated with an Ethernet
device or checking only that the `ETH_HLEN` bytes at
`skb_mac_header(skb)` lie between `skb->head` and `skb->data`.
Make these paths first verify that the skb is associated with an
Ethernet device, that the MAC header was set, and that it spans at
least a full Ethernet header before accessing `eth_hdr(skb)`.
netfilter: x_tables: Avoid a couple -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
Use the TRAILING_OVERLAP() helper to fix the following warnings:
1 net/netfilter/x_tables.c:816:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
1 net/netfilter/x_tables.c:811:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
This helper creates a union between a flexible-array member (FAM)
and a set of members that would otherwise follow it. This overlays
the trailing members onto the FAM while preserving the original
memory layout.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: conntrack: remove UDP-Lite conntrack support
UDP-Lite (RFC 3828) socket support was recently retired from the core
networking stack. As a follow-up of that, drop the connection tracker
and NAT support for UDP-Lite in Netfilter.
This patch removes CONFIG_NF_CT_PROTO_UDPLITE and scrubs UDP-Lite
awareness from the conntrack core, NAT core, nft_ct, and ctnetlink.
Please note that stateless packet inspection, matching, ipsets or
logging support for IPPROTO_UDPLITE is preserved.
As conntrack no longer extracts UDP-Lite ports or tracks its L4 state,
when performing NAT the UDP-Lite checksum cannot be updated anymore.
That is an expected and acceptable consequence of removing UDP-Lite
conntrack module.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: xt_socket: enable defrag after all other checks
Originally this did not matter because defrag was enabled once per netns
and only disabled again on netns dismantle. When this got changed I should
have adjusted checkentry to not leave defrag enabled on error.
Fixes: de8c12110a13 ("netfilter: disable defrag once its no longer needed") Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: xt_HL: add pr_fmt and checkentry validation
Add pr_fmt to prefix log messages with the module name for
easier debugging in dmesg.
Add checkentry functions for IPv4 (ttl_mt_check) and IPv6
(hl_mt6_check) to validate the match mode at rule registration
time, rejecting invalid modes with -EINVAL.
The evaluation function returns false in case the mode is
unknown, so this is a cleanup, not a bug fix.
Florian Westphal [Sat, 28 Mar 2026 22:00:31 +0000 (23:00 +0100)]
netfilter: x_physdev: reject empty or not-nul terminated device names
Reject names that lack a \0 character and reject the empty string as
well. iptables allows this but it fails to re-parse iptables-save output
that contain such rules.
Add /proc/net/ip_vs_status to show current state of IPVS.
The motivation for this new /proc interface is to provide the output
for the users to help them decide when to tune the load factor for
hash tables, which is possible with the new sysctl knobs coming in
followup patch.
The output also includes information for the kthreads used for stats.
- Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
Rosen Penev)
- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
- Add support for new features: CPPC performance priority, Dynamic EPP,
Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
Mario Limonciello)
- Fix sysfs files being present when HW missing and broken/outdated
documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
- Pass the policy to cpufreq_driver->adjust_perf() to avoid using
cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
leads to a scheduling-while-atomic bug (K Prateek Nayak)
- Clean up dead code in Kconfig for cpufreq (Julian Braha)
- Remove max_freq_req update for pre-existing cpufreq policy and add a
boost_freq_req QoS request to save the boost constraint instead of
overwriting the last scaling_max_freq constraint (Pierre Gondois)
- Embed cpufreq QoS freq_req objects in cpufreq policy so they all
are allocated in one go along with the policy to simplify lifetime
rules and avoid error handling issues (Viresh Kumar)
- Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
scaling driver (Henry Tseng)
- Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
of cpumask_weight() because the former is more efficient (Yury Norov)
- Use sysfs_emit() in sysfs show functions for cpufreq governor
attributes (Thorsten Blum)
- Update intel_pstate to stop returning an error when "off" is written
to its status sysfs attribute while the driver is already off (Fabio
De Francesco)
- Include current frequency in the debug message printed by
__cpufreq_driver_target() (Pengjie Zhang)
* pm-cpufreq: (38 commits)
cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
cpufreq/amd-pstate-ut: Add a unit test for raw EPP
cpufreq/amd-pstate: Add support for raw EPP writes
cpufreq/amd-pstate: Add support for platform profile class
cpufreq/amd-pstate: add kernel command line to override dynamic epp
cpufreq/amd-pstate: Add dynamic energy performance preference
Documentation: amd-pstate: fix dead links in the reference section
cpufreq/amd-pstate: Cache the max frequency in cpudata
Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
amd-pstate-ut: Add module parameter to select testcases
amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
amd-pstate: Add sysfs support for floor_freq and floor_count
amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
x86/cpufeatures: Add AMD CPPC Performance Priority feature.
...
Pengpeng Hou [Tue, 10 Mar 2026 08:08:00 +0000 (08:08 +0000)]
xen/grant-table: guard gnttab_suspend/resume with CONFIG_HIBERNATE_CALLBACKS
In current linux.git, gnttab_suspend() and gnttab_resume() are defined
and declared unconditionally. However, their only in-tree callers reside
in drivers/xen/manage.c, which are guarded by CONFIG_HIBERNATE_CALLBACKS.
Match the helper scope to their callers by wrapping the definitions in
CONFIG_HIBERNATE_CALLBACKS and providing no-op stubs in the header. This
fixes the config-scope mismatch and reduces the code footprint when
hibernation callbacks are disabled.
Jason Andryuk [Wed, 18 Mar 2026 23:53:26 +0000 (19:53 -0400)]
hvc/xen: Check console connection flag
When the console out buffer is filled, __write_console() will return 0
as it cannot send any data. domU_write_console() will then spin in
`while (len)` as len doesn't decrement until xenconsoled attaches. This
would block a domU and nullify the parallelism of Hyperlaunch until dom0
userspace starts xenconsoled, which empties the buffer.
Xen 4.21 added a connection field to the xen console page. This is set
to XENCONSOLE_DISCONNECTED (1) when a domain is built, and xenconsoled
will set it to XENCONSOLE_CONNECTED (0) when it connects.
Update the hvc_xen driver to check the field. When the field is
disconnected, drop the write with -ENOTCONN. We only drop the write
when the field is XENCONSOLE_DISCONNECTED (1) to try for maximum
compatibility. The Xen toolstack has historically zero initialized the
console, so it should see XENCONSOLE_CONNECTED (0) by default. If an
implemenation used uninitialized memory, only checking for
XENCONSOLE_DISCONNECTED could have the lowest chance of not connecting.
This lets the hyperlaunched domU boot without stalling. Once dom0
starts xenconsoled, xl console can be used to access the domU's hvc0.
Paritally sync console.h from xen.git to bring in the new field.
Kexin Sun [Sat, 21 Mar 2026 11:00:39 +0000 (19:00 +0800)]
xen/swiotlb: fix stale reference to swiotlb_unmap_page()
Commit af85de5a9f00 ("xen: swiotlb: Switch to physical
address mapping callbacks") renamed xen_swiotlb_unmap_page()
to xen_swiotlb_unmap_phys(). The comment in
xen_swiotlb_unmap_sg() had already been missing the xen_
prefix (reading swiotlb_unmap_page()), and the rename only
changed _page to _phys without correcting this, leaving it
as swiotlb_unmap_phys(). Fix the reference to use the
correct function name xen_swiotlb_unmap_phys().
xen/manage: unwind partial shutdown watcher setup on error
setup_shutdown_watcher() registers shutdown_watch first, then the sysrq
watch, and finally publishes the supported feature-* nodes in xenstore.
If sysrq watch registration fails, or xenbus_printf() fails after one or
more feature nodes were created, the function returns immediately without
undoing the earlier setup.
This leaves the system in a partially initialized state, with registered
watches and/or stale xenstore entries despite the function reporting
failure.
Unwind the partial setup before returning an error by unregistering any
watches that were already registered and removing feature nodes that were
already published.
selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message
The error path after scx_bpf_create_dsq(real_dsq_id, ...) was reporting
test_dsq_id instead of real_dsq_id in the error message, which would
mislead debugging.
- Remove EROFS_MAP_ENCODED since it was always set together with
EROFS_MAP_MAPPED for compressed extents and checked redundantly;
- Replace the EROFS_MAP_FULL_MAPPED flag with the opposite
EROFS_MAP_PARTIAL_MAPPED flag so that extents are implicitly
fully mapped initially to simplify the logic;
- Make fragment extents independent of EROFS_MAP_MAPPED since
they are not directly allocated on disk; thus fragment extents
are no longer twisted with mapped extents.
ARM: xen: validate hypervisor compatible before parsing its version
fdt_find_hyper_node() reads the raw compatible property and then derives
hyper_node.version from a prefix match before later printing it with %s.
Flat DT properties are external boot input, and this path does not prove
that the first compatible entry is NUL-terminated within the returned
property length.
Keep the existing flat-DT lookup path, but verify that the first
compatible entry terminates within the returned property length before
deriving the version suffix from it.
Kuba Piecuch [Thu, 9 Apr 2026 16:57:44 +0000 (16:57 +0000)]
sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code
* Add ops.quiescent() and ops.runnable() to the sched_change path.
When a queued task has one of its scheduling properties changed
(e.g. nice, affinity), it goes through dequeue() -> quiescent() ->
(property change callback, e.g. ops.set_weight()) -> runnable() ->
enqueue().
* Change && to || in ops.enqueue() condition. We want to enqueue tasks
that have a non-zero slice and are not in any DSQ.
* Call ops.dispatch() and ops.dequeue() only for tasks that have had
ops.enqueue() called. This is to account for tasks direct-dispatched
from ops.select_cpu().
* Add a note explaining that the pseudo-code provides a simplified view
of the task lifecycle and list some examples of cases that the
pseudo-code does not account for.
Fixes: a4f61f0a1afd ("sched_ext: Documentation: Add ops.dequeue() to task lifecycle") Signed-off-by: Kuba Piecuch <jpiecuch@google.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Neeraj Soni [Fri, 10 Apr 2026 06:58:33 +0000 (12:28 +0530)]
mmc: sdhci-msm: Fix the wrapped key handling
Inline Crypto Engine (ICE) supports wrapped key generation. While
registering crypto profile the supported key types are queried from ICE
driver. So the explicit check for RAW key is not needed.
Fixes: fd78e2b582a0 ("mmc: sdhci-msm: Add support for wrapped keys") Signed-off-by: Neeraj Soni <neeraj.soni@oss.qualcomm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21
Clang 20 and 21 miscompute __builtin_object_size() when -fprofile-arcs
is active on 32-bit UML targets, which passes incorrect object size
calculations for local variables through always_inline copy_to_user()
and check_copy_size(), causing spurious compile-time errors:
include/linux/ucopysize.h:52:4: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small
The regression was introduced in LLVM commit 02b8ee281947 ("[llvm]
Improve llvm.objectsize computation by computing GEP, alloca and malloc
parameters bound"), which shipped in Clang 20. It was fixed in LLVM
by commit 45b697e610fd ("[MemoryBuiltins] Consider index type size
when aggregating gep offsets"), which was backported to the LLVM 22.x
release branch.
The bug requires 32-bit UML + GCOV_PROFILE_ALL (which uses -fprofile-arcs),
though the exact trigger depends on optimizer decisions influenced by other
enabled configs.
Prevent the bad combination by disabling UML's ARCH_HAS_GCOV_PROFILE_ALL
on 32-bit when using Clang 20.x or 21.x.
gpio: tegra: return -ENOMEM on allocation failure in probe
devm_kzalloc() failure in tegra_gpio_probe() returns -ENODEV, which
indicates "no such device". The correct error code for a memory
allocation failure is -ENOMEM.
Hangbin Liu [Wed, 8 Apr 2026 07:19:05 +0000 (15:19 +0800)]
tools: ynl: tests: fix leading space on Makefile target
The ../generated/protos.a rule had a spurious leading space before the
target name. In make, target rules must start at column 0; only recipe
lines are indented with a tab. The extra space caused make to misparse
the rule.
Remove the leading space to match the style of the adjacent
../lib/ynl.a rule.
People (do people still write code or is it all AI?) seem to not
get that ksft_run() can only be called once. If we call it
multiple times KTAP parsers will likely cut off after the first
batch has finished.
fbnic_up() calls netif_tx_start_all_queues(), which only clears
__QUEUE_STATE_DRV_XOFF. If qdisc backlog has accumulated on any TX
queue before the reconfiguration (e.g. ring resize via ethtool -G),
start does not call __netif_schedule() to kick the qdisc, so the
pending backlog is never drained and the queue stalls.
Switch to netif_tx_wake_all_queues(), which clears DRV_XOFF and also
calls __netif_schedule() on every queue, ensuring any backlog that
built up before the down/up cycle is promptly dequeued.
net: airoha: Add dma_rmb() and READ_ONCE() in airoha_qdma_rx_process()
Add missing dma_rmb() in airoha_qdma_rx_process routine to make sure the
DMA read operations are completed when the NIC reports the processing on
the current descriptor is done. Moreover, add missing READ_ONCE() in
airoha_qdma_rx_process() for DMA descriptor control fields in order to
avoid any compiler reordering.
net: txgbe: fix RTNL assertion warning when remove module
For the copper NIC with external PHY, the driver called
phylink_connect_phy() during probe and phylink_disconnect_phy() during
remove. It caused an RTNL assertion warning in phylink_disconnect_phy()
upon module remove.
To fix this, add rtnl_lock() and rtnl_unlock() around the
phylink_disconnect_phy() in remove function.
Fixes: 02b2a6f91b90 ("net: txgbe: support copper NIC with external PHY") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/8B47A5872884147D+20260407094041.4646-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 10 Apr 2026 03:19:47 +0000 (20:19 -0700)]
Merge branch 'net-bcmgenet-fix-queue-lock-up'
Justin Chen says:
====================
net: bcmgenet: fix queue lock up
We have been seeing reports of logs like this.
[ 41.761198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 10039 ms
[ 43.745198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 12023 ms
[ 45.729198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 14007 ms
We have two issues. The persistent queue timeouts and the eventual
lock up of the entire transmit.
We address the lock up issue first. The queue timeouts are due to
a fundamental design issue not a bug perse. Timeouts still persist,
but we should no longer lock up.
====================
The bcmgenet_timeout handler tries to take down all tx queues when
a single queue times out. This is over zealous and causes many race
conditions with queues that are still chugging along. Instead lets
only restart the timed out queue.
While reclaiming the tx queue we fast forward the write pointer to
drop any data in flight. These dropped frames are not added back
to the pool of free bds. We also need to tell the netdev that we
are dropping said data.
net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
The write_ptr points to the next open tx_cb. We want to return the
tx_cb that gets rewinded, so we must rewind the pointer first then
return the tx_cb that it points to. That way the txcb can be correctly
cleaned up.
Fixes: 876dbadd53a7 ("net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit()") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260406175756.134567-2-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kevin Hao [Tue, 7 Apr 2026 00:45:39 +0000 (08:45 +0800)]
net: macb: Use napi_schedule_irqoff() in IRQ handler
For non-PREEMPT_RT kernels, the IRQ handler runs with interrupts
disabled, allowing the use of napi_schedule_irqoff() to save a pair of
local_irq_{save,restore} operations. For PREEMPT_RT kernels,
napi_schedule_irqoff() behaves identically to napi_schedule().
Qingfang Deng [Tue, 7 Apr 2026 09:40:56 +0000 (17:40 +0800)]
ppp: consolidate refcount decrements
ppp_destroy_{channel,interface} are always called after
refcount_dec_and_test().
To reduce boilerplate code, consolidate the decrements by moving them
into the two functions. To reflect this change in semantics, rename the
functions to ppp_release_*.
Marek Vasut [Sun, 5 Apr 2026 23:29:58 +0000 (01:29 +0200)]
net: phy: realtek: Add property to enable SSC
Add support for spread spectrum clocking (SSC) on RTL8211F(D)(I)-CG,
RTL8211FS(I)(-VS)-CG, RTL8211FG(I)(-VS)-CG PHYs. The implementation
follows EMI improvement application note Rev. 1.2 for these PHYs.
The current implementation enables SSC for both RXC and SYSCLK clock
signals. Introduce DT properties 'realtek,clkout-ssc-enable',
'realtek,rxc-ssc-enable' and 'realtek,sysclk-ssc-enable' which control
CLKOUT, RXC and SYSCLK SSC spread spectrum clocking enablement on these
signals.
Document support for spread spectrum clocking (SSC) on RTL8211F(D)(I)-CG,
RTL8211FS(I)(-VS)-CG, RTL8211FG(I)(-VS)-CG PHYs. Introduce DT properties
'realtek,clkout-ssc-enable', 'realtek,rxc-ssc-enable' and
'realtek,sysclk-ssc-enable' which control CLKOUT, RXC and SYSCLK
SSC spread spectrum clocking enablement on these signals. These
clock are not exposed via the clock API, therefore assigned-clock-sscs
property does not apply.
====================
macsec: Add support for VLAN filtering in offload mode
This short series adds support for VLANs in MACsec devices when offload
mode is enabled. This allows VLAN netdevs on top of MACsec netdevs to
function, which accidentally used to be the case in the past, but was
broken. This series adds back proper support.
As part of this, the existing nsim-only MACsec offload tests were
translated to Python so they can run against real HW and new
traffic-based tests were added for VLAN filter propagation, since
there's currently no uAPI to check VLAN filters.
====================
VLAN-filtering is done through two netdev features
(NETIF_F_HW_VLAN_CTAG_FILTER and NETIF_F_HW_VLAN_STAG_FILTER) and two
netdev ops (ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid).
Implement these and advertise the features if the lower device supports
them. This allows proper VLAN filtering to work on top of MACsec
devices, when the lower device is capable of VLAN filtering.
As a concrete example, having this chain of interfaces now works:
vlan_filtering_capable_dev(1) -> macsec_dev(2) -> macsec_vlan_dev(3)
Before the mentioned commit this used to accidentally work because the
MACsec device (and thus the lower device) was put in promiscuous mode
and the VLAN filter was not used. But after commit [1] correctly made
the macsec driver expose the IFF_UNICAST_FLT flag, promiscuous mode was
no longer used and VLAN filters on dev 1 kicked in. Without support in
dev 2 for propagating VLAN filters down, the register_vlan_dev ->
vlan_vid_add -> __vlan_vid_add -> vlan_add_rx_filter_info call from dev
3 is silently eaten (because vlan_hw_filter_capable returns false and
vlan_add_rx_filter_info silently succeeds).
For MACsec, VLAN filters are only relevant for offload, otherwise
the VLANs are encrypted and the lower devices don't care about them. So
VLAN filters are only passed on to lower devices in offload mode.
Flipping between offload modes now needs to offload/unoffload the
filters with vlan_{get,drop}_rx_*_filter_info().
To avoid the back-and-forth filter updating during rollback, the setting
of macsec->offload is moved after the add/del secy ops. This is safe
since none of the code called from those requires macsec->offload.
In case adding the filters fails, the added ones are rolled back and an
error is returned to the operation toggling the offload state.
selftests: Add MACsec VLAN propagation traffic test
Add VLAN filter propagation tests through offloaded MACsec devices via
actual traffic.
The tests create MACsec tunnels with matching SAs on both endpoints,
stack VLANs on top, and verify connectivity with ping. Covered:
- Offloaded MACsec with VLAN (filters propagate to HW)
- Software MACsec with VLAN (no HW filter propagation)
- Offload on/off toggle and verifying traffic still works
On netdevsim this makes use of the VLAN filter debugfs file to actually
validate that filters are applied/removed correctly.
On real hardware the traffic should validate actual VLAN filter
propagation.
selftests: Migrate nsim-only MACsec tests to Python
Move MACsec offload API and ethtool feature tests from
tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh to
tools/testing/selftests/drivers/net/macsec.py using the NetDrvEnv
framework so tests can run against both netdevsim (default) and real
hardware (NETIF=ethX). As some real hardware requires MACsec to use
encryption, add that to the tests.
Netdevsim-specific limit checks (max SecY, max RX SC) were moved into
separate test cases to avoid failures on real hardware.
IFA_F_PERMANENT addresses require the allocation of a bunch of percpu
pointers, currently in atomic scope.
Similar to commit 51454ea42c1a ("ipv6: fix locking issues with loops
over idev->addr_list"), move fixup_permanent_addr() outside the
&idev->lock scope, and do the allocations with GFP_KERNEL. With such
change fixup_permanent_addr() is invoked with the BH enabled, and the
ifp lock acquired there needs the BH variant.
Note that we don't need to acquire a reference to the permanent
addresses before releasing the mentioned write lock, because
addrconf_permanent_addr() runs under RTNL and ifa removal always happens
under RTNL, too.
Also the PERMANENT flag is constant in the relevant scope, as it can be
cleared only by inet6_addr_modify() under the RTNL lock.
David Carlier [Tue, 7 Apr 2026 15:07:58 +0000 (16:07 +0100)]
net: use get_random_u{16,32,64}() where appropriate
Use the typed random integer helpers instead of
get_random_bytes() when filling a single integer variable.
The helpers return the value directly, require no pointer
or size argument, and better express intent.
Skipped sites writing into __be16 (netdevsim) and __le64
(ceph) fields where a direct assignment would trigger
sparse endianness warnings.
Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407150758.5889-1-devnexen@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhang Yi [Thu, 9 Apr 2026 11:42:03 +0000 (19:42 +0800)]
jbd2: fix deadlock in jbd2_journal_cancel_revoke()
Commit f76d4c28a46a ("fs/jbd2: use sleeping version of
__find_get_block()") changed jbd2_journal_cancel_revoke() to use
__find_get_block_nonatomic() which holds the folio lock instead of
i_private_lock. This breaks the lock ordering (folio -> buffer) and
causes an ABBA deadlock when the filesystem blocksize < pagesize:
T1 T2
ext4_mkdir()
ext4_init_new_dir()
ext4_append()
ext4_getblk()
lock_buffer() <- A
sync_blockdev()
blkdev_writepages()
writeback_iter()
writeback_get_folio()
folio_lock() <- B
ext4_journal_get_create_access()
jbd2_journal_cancel_revoke()
__find_get_block_nonatomic()
folio_lock() <- B
block_write_full_folio()
lock_buffer() <- A
This can occasionally cause generic/013 to hang.
Fix by only calling __find_get_block_nonatomic() when the passed
buffer_head doesn't belong to the bdev, which is the only case that we
need to look up its bdev alias. Otherwise, the lookup is redundant since
the found buffer_head is equal to the one we passed in.
Ye Bin [Mon, 30 Mar 2026 13:30:35 +0000 (21:30 +0800)]
ext4: fix possible null-ptr-deref in mbt_kunit_exit()
There's issue as follows:
# test_new_blocks_simple: failed to initialize: -12
KASAN: null-ptr-deref in range [0x0000000000000638-0x000000000000063f]
Tainted: [E]=UNSIGNED_MODULE, [N]=TEST
RIP: 0010:mbt_kunit_exit+0x5e/0x3e0 [ext4_test]
Call Trace:
<TASK>
kunit_try_run_case_cleanup+0xbc/0x100 [kunit]
kunit_generic_run_threadfn_adapter+0x89/0x100 [kunit]
kthread+0x408/0x540
ret_from_fork+0xa76/0xdf0
ret_from_fork_asm+0x1a/0x30
If mbt_kunit_init() init testcase failed will lead to null-ptr-deref.
So add test if 'sb' is inited success in mbt_kunit_exit().
Fixes: 7c9fa399a369 ("ext4: add first unit test for ext4_mb_new_blocks_simple in mballoc") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20260330133035.287842-6-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ye Bin [Mon, 30 Mar 2026 13:30:33 +0000 (21:30 +0800)]
ext4: fix the error handling process in extents_kunit_init).
The error processing in extents_kunit_init() is improper, causing
resource leakage.
Reconstruct the error handling process to prevent potential resource
leaks
Fixes: cb1e0c1d1fad ("ext4: kunit tests for extent splitting and conversion") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20260330133035.287842-4-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ye Bin [Mon, 30 Mar 2026 13:30:31 +0000 (21:30 +0800)]
ext4: fix miss unlock 'sb->s_umount' in extents_kunit_init()
There's warning as follows when do ext4 kunit test:
WARNING: kunit_try_catch/15923 still has locks held! 7.0.0-rc3-next-20260309-00028-g73f965a1bbb1-dirty #281 Tainted: G E N
1 lock held by kunit_try_catch/15923:
#0: ffff888139f860e0 (&type->s_umount_key#70/1){+.+.}-{4:4}, at: alloc_super.constprop.0+0x172/0xa90
Call Trace:
<TASK>
dump_stack_lvl+0x180/0x1b0
debug_check_no_locks_held+0xc8/0xd0
do_exit+0x1502/0x2b20
kthread+0x3a9/0x540
ret_from_fork+0xa76/0xdf0
ret_from_fork_asm+0x1a/0x30
As sget() will return 'sb' which holds 's->s_umount' lock. However,
"extents-test" miss unlock this lock.
So unlock 's->s_umount' in the end of extents_kunit_init().
Fixes: cb1e0c1d1fad ("ext4: kunit tests for extent splitting and conversion") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20260330133035.287842-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4: fix bounds check in check_xattrs() to prevent out-of-bounds access
The bounds check for the next xattr entry in check_xattrs() uses
(void *)next >= end, which allows next to point within sizeof(u32)
bytes of end. On the next loop iteration, IS_LAST_ENTRY() reads 4
bytes via *(__u32 *)(entry), which can overrun the valid xattr region.
For example, if next lands at end - 1, the check passes since
next < end, but IS_LAST_ENTRY() reads 4 bytes starting at end - 1,
accessing 3 bytes beyond the valid region.
Fix this by changing the check to (void *)next + sizeof(u32) > end,
ensuring there is always enough space for the IS_LAST_ENTRY() read
on the subsequent iteration.
Zhang Yi [Fri, 27 Mar 2026 10:29:39 +0000 (18:29 +0800)]
ext4: zero post-EOF partial block before appending write
In cases of appending write beyond EOF, ext4_zero_partial_blocks() is
called within ext4_*_write_end() to zero out the partial block beyond
EOF. This prevents exposing stale data that might be written through
mmap.
However, supporting only the regular buffered write path is
insufficient. It is also necessary to support the DAX path as well as
the upcoming iomap buffered write path. Therefore, move this operation
to ext4_write_checks().
In addition, this may introduce a race window in which a post-EOF
buffered write can race with an mmap write after the old EOF block has
been zeroed. As a result, the data in this block written by the
buffer-write and the data written by the mmap-write may be mixed.
However, this is safe because users should not rely on the result of the
race condition.
Zhang Yi [Fri, 27 Mar 2026 10:29:38 +0000 (18:29 +0800)]
ext4: move pagecache_isize_extended() out of active handle
In ext4_alloc_file_blocks(), pagecache_isize_extended() is called under
an active handle and may also hold folio lock if the block size is
smaller than the folio size. This also breaks the "folio lock ->
transaction start" lock ordering for the upcoming iomap buffered I/O
path.
Therefore, move pagecache_isize_extended() outside of an active handle.
Additionally, it is unnecessary to update the file length during each
iteration of the allocation loop. Instead, update the file length only
to the position where the allocation is successful. Postpone updating
the inode size until after the allocation loop completes or is
interrupted due to an error.