git.ipfire.org Git - thirdparty/linux.git/log

platform/x86: ISST: Check HWP support before MSR access

On some systems, HWP can be explicitly disabled in the BIOS settings
When HWP is disabled by firmware, the HWP CPUID bit is not set, and
attempting to read MSR_PM_ENABLE will result in a General Protection
(GP) fault.

  unchecked MSR access error: RDMSR from 0x770 at rIP: 0xffffffffc33db92e (disable_dynamic_sst_features+0xe/0x50 [isst_tpmi_core])
  Call Trace:
   <TASK>
   ? ex_handler_msr+0xf6/0x150
   ? fixup_exception+0x1ad/0x340
   ? gp_try_fixup_and_notify+0x1e/0xb0
   ? exc_general_protection+0xc9/0x390
   ? terminate_walk+0x64/0x100
   ? asm_exc_general_protection+0x22/0x30
   ? disable_dynamic_sst_features+0xe/0x50 [isst_tpmi_core]
   isst_if_def_ioctl+0xece/0x1050 [isst_tpmi_core]
   ? ioctl_has_perm.constprop.42+0xe0/0x130
   isst_if_def_ioctl+0x10d/0x1a0 [isst_if_common]
   __se_sys_ioctl+0x86/0xc0
   do_syscall_64+0x8a/0x100
   entry_SYSCALL_64_after_hwframe+0x78/0xe2
  RIP: 0033:0x7f36eaef54a7

Add a check for X86_FEATURE_HWP before accessing the MSR. If HWP is
not available, return true safely.

Fixes: 12a7d2cb811d ("platform/x86: ISST: Add SST-CP support via TPMI")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260303074635.2218-1-lirongqing@baidu.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: hp-wmi: Add support for Omen 16-k0xxx (8A4D)

The HP Omen 16-k0xxx (board ID: 8A4D) has the same WMI interface as
other Victus S boards, but requires additional quirks for correctly
switching thermal profile.

Create a new quirk omen_v1_legacy_thermal_params which allows a board to
use Omen V1 thermal values, but rely on the older legacy
HP_OMEN_EC_THERMAL_PROFILE_OFFSET. Add the DMI board name to
victus_s_thermal_profile_boards[] table and map it to the newly added
quirk.

Testing on board 8A4D confirmed that platform profile is registered
successfully and fan RPMs are readable and controllable.

Tested-by: Qinfeng Wu <qwqgong@gmail.com>
Reported-by: Qinfeng Wu <qwqgong@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221150
Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com>
Link: https://patch.msgid.link/20260302073525.71037-1-krishna.chomal108@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: hp-wmi: Add support for Omen 16-wf1xxx (8C76)

The HP Omen 16-wf1xxx (board ID: 8C76) has the same WMI interface as
other Victus S boards, but requires quirks for correctly switching
thermal profile (similar to board 8C78).

Add the DMI board name to victus_s_thermal_profile_boards[] table and
map it to omen_v1_thermal_params.

Testing on board 8C76 confirmed that platform profile is registered
successfully and fan RPMs are readable and controllable.

Tested-by: WJ Enderlava <jie7172585@gmail.com>
Reported-by: WJ Enderlava <jie7172585@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221149
Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com>
Link: https://patch.msgid.link/20260227154106.226809-1-krishna.chomal108@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: hp-wmi: Add Omen 16-xf0xxx (8BCA) support

The HP Omen 16-xf0xxx board 8BCA uses the same Victus-S fan and
thermal WMI path as other recently supported Omen/Victus boards,
but it requires Omen v1 thermal profile parameters for correct
platform profile behavior.

Add board 8BCA to victus_s_thermal_profile_boards[] and map it
to omen_v1_thermal_params.

Validated on HP Omen 16-xf0xxx (board 8BCA):
- /sys/firmware/acpi/platform_profile exposes
low-power/balanced/performance
- fan RPM reporting works (fan1_input/fan2_input)
- manual fan control works through hp-wmi hwmon (pwm1/pwm1_enable)

Signed-off-by: Raed <thisisraed@outlook.com>
Link: https://patch.msgid.link/20260311131338.965249-1-youaretalkingtoraed@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: asus-armoury: add support for G614FP

Add TDP data for laptop model G614FP.

Signed-off-by: Denis Benato <denis.benato@linux.dev>
Link: https://patch.msgid.link/20260309183559.433555-3-denis.benato@linux.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: asus-armoury: add support for GA503QM

Add TDP data for laptop model GA503QM.

Signed-off-by: Denis Benato <denis.benato@linux.dev>
Link: https://patch.msgid.link/20260309183559.433555-2-denis.benato@linux.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

MAINTAINERS: change email address of Denis Benato

I have been using a linux.dev email since that is hugely better than gmail.

Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Denis Benato <benato.denis96@gmail.com>
Link: https://patch.msgid.link/20260304141102.63732-1-denis.benato@linux.dev
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PM: hibernate: Drain trailing zero pages on userspace restore

Commit 005e8dddd497 ("PM: hibernate: don't store zero pages in the
image file") added an optimization to skip zero-filled pages in the
hibernation image. On restore, zero pages are handled internally by
snapshot_write_next() in a loop that processes them without returning
to the caller.

With the userspace restore interface, writing the last non-zero page
to /dev/snapshot is followed by the SNAPSHOT_ATOMIC_RESTORE ioctl. At
this point there are no more calls to snapshot_write_next() so any
trailing zero pages are not processed, snapshot_image_loaded() fails
because handle->cur is smaller than expected, the ioctl returns -EPERM
and the image is not restored.

The in-kernel restore path is not affected by this because the loop in
load_image() in swap.c calls snapshot_write_next() until it returns 0.
It is this final call that drains any trailing zero pages.

Fixed by calling snapshot_write_next() in snapshot_write_finalize(),
giving the kernel the chance to drain any trailing zero pages.

Fixes: 005e8dddd497 ("PM: hibernate: don't store zero pages in the image file")
Signed-off-by: Alberto Garcia <berto@igalia.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Link: https://patch.msgid.link/ef5a7c5e3e3dbd17dcb20efaa0c53a47a23498bb.1773075892.git.berto@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: conservative: Reset requested_freq on limits change

A recently reported issue highlighted that the cached requested_freq
is not guaranteed to stay in sync with policy->cur. If the platform
changes the actual CPU frequency after the governor sets one (e.g.
due to platform-specific frequency scaling) and a re-sync occurs
later, policy->cur may diverge from requested_freq.

This can lead to incorrect behavior in the conservative governor.
For example, the governor may assume the CPU is already running at
the maximum frequency and skip further increases even though there
is still headroom.

Avoid this by resetting the cached requested_freq to policy->cur on
detecting a change in policy limits.

Reported-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://lore.kernel.org/all/20260210115458.3493646-1-zhenglifeng1@huawei.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/d846a141a98ac0482f20560fcd7525c0f0ec2f30.1773999467.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Don't skip cpufreq_frequency_table_cpuinfo()

The commit 6db0f533d320 ("cpufreq: preserve freq_table_sorted
across suspend/hibernate") unintentionally made a change where
cpufreq_frequency_table_cpuinfo() isn't getting called anymore
for old policies getting re-initialized.

This leads to potentially invalid values of policy->max and
policy->cpuinfo_max_freq.

Fix the issue by reverting the original commit and adding the condition
for just the sorting function.

Fixes: 6db0f533d320 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
Link: https://patch.msgid.link/65ba5c45749267c82e8a87af3dc788b37a0b3f48.1773998611.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()

Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling()
to ensure it is enabled before any exceptions can occur on both boot and
secondary CPUs.

== Background ==

Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on
X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions
or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE.

On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4.
When the feature is available, the code is permanently patched to use
RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering

== Boot Sequence ==

Boot CPU (with CR pinning enabled):
  trap_init()
    cpu_init()                   <- Uses unpatched code (RDMSR/SWAPGS)
      x2apic_setup()
  ...
  arch_cpu_finalize_init()
    identify_boot_cpu()
      identify_cpu()
        cr4_set_bits(X86_CR4_FSGSBASE)  # Enables the feature
# This becomes part of cr4_pinned_bits
    ...
    alternative_instructions()   <- Patches code to use RDGSBASE/WRGSBASE

Secondary CPUs (with CR pinning enabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=1
                                    set implicitly via cr4_pinned_bits

    cpu_init()                   <- exceptions work because FSGSBASE is
                                    already enabled

Secondary CPU (with CR pinning disabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=0
    cpu_init()
      x2apic_setup()
        rdmsrq(MSR_IA32_APICBASE)  <- Triggers #VC in SNP guests
          exc_vmm_communication()
            paranoid_entry()       <- Uses RDGSBASE with CR4.FSGSBASE=0
                                      (patched code)
    ...
    ap_starting()
      identify_secondary_cpu()
        identify_cpu()
  cr4_set_bits(X86_CR4_FSGSBASE)  <- Enables the feature, which is
                                             too late

== CR Pinning ==

Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through
CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of
cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs.
This works but creates an undocumented dependency between cr4_init() and the
pinning mechanism.

== Problem ==

Secondary CPUs boot after alternatives have been applied globally. They
execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE
instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning
behavior will break the implicit dependency, causing secondary CPUs to
generate #UD.

This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in
x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC
handler (exc_vmm_communication()) executes the patched paranoid_entry() path.
Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD.

== Fix ==

Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading
exception handlers. This makes the dependency explicit and ensures both
boot and secondary CPUs have FSGSBASE enabled before paranoid_entry()
executes.

Fixes: c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260318075654.1792916-2-nikunj@amd.com

coredump: add tracepoint for coredump events

Coredump is a generally useful and interesting event in the lifetime
of a process. Add a tracepoint so it can be monitored through the
standard kernel tracing infrastructure.

BPF-based crash monitoring is an advanced approach that
allows real-time crash interception: by attaching a BPF program at
this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to
capture the user-space stack trace at the exact moment of the crash,
before the process is fully terminated, without waiting for a
coredump file to be written and parsed.

However, there is currently no stable kernel API for this use case.
Existing tools rely on attaching fentry probes to do_coredump(),
which is an internal function whose signature changes across kernel
versions, breaking these tools.

Add a stable tracepoint that fires at the beginning of
do_coredump(), providing BPF programs a reliable attachment point.
At tracepoint time, the crashing process context is still live, so
BPF programs can call bpf_get_stack() with BPF_F_USER_STACK to
extract the user-space backtrace.

The tracepoint records:
  - sig: signal number that triggered the coredump
  - comm: process name

Example output:

  $ echo 1 > /sys/kernel/tracing/events/coredump/coredump/enable
  $ sleep 999 &
  $ kill -SEGV $!
  $ cat /sys/kernel/tracing/trace
  #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
  #              | |         |   |||||     |         |
             sleep-634     [036] .....   145.222206: coredump: sig=11 comm=sleep

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260323-coredump_tracepoint-v2-1-afced083b38d@debian.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

Merge patch series "fix architecture-specific compat_ftruncate64 implementations"

Christoph Hellwig <hch@lst.de> says:

This series fixes a really old bug found by code inspection, where the
architecture-specific 32-bit compat ftruncate64 implementations enforce
the non-LFS file size limit unless opened with O_LARGEFILE.

* patches from https://patch.msgid.link/20260323070205.2939118-1-hch@lst.de:
  fs: remove do_sys_truncate
  fs: pass on FTRUNCATE_* flags to do_truncate
  fs: fix archiecture-specific compat_ftruncate64

Link: https://patch.msgid.link/20260323070205.2939118-1-hch@lst.de
Signed-off-by: Christian Brauner <brauner@kernel.org>

fs: remove do_sys_truncate

do_sys_truncate ist only used to implement ksys_truncate and the native
truncate syscalls. Merge do_sys_truncate into ksys_truncate and return
int from it as it only returns 0 or negative errnos.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260323070205.2939118-4-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>

fs: pass on FTRUNCATE_* flags to do_truncate

Pass the flags one level down to replace the somewhat confusing small
argument, and clean up do_truncate as a result.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260323070205.2939118-3-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>

fs: fix archiecture-specific compat_ftruncate64

The "small" argument to do_sys_ftruncate indicates if > 32-bit size
should be reject, but all the arch-specific compat ftruncate64
implementations get this wrong. Merge do_sys_ftruncate and
ksys_ftruncate, replace the integer as boolean small flag with a
descriptive one about LFS semantics, and use it correctly in the
architecture-specific ftruncate64 implementations.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 3dd681d944f6 ("arm64: 32-bit (compat) applications support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260323070205.2939118-2-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>

reset: spacemit: k3: Decouple composite reset lines

Instead of grouping several different reset lines into one composite
reset, decouple them to individual ones which make it more aligned
with underlying hardware. And for DWC USB driver, it will match well
with the number of the reset property in the DT bindings.

The DWC3 USB host controller in K3 SoC has three reset lines - AHB, VCC,
PHY. The PCIe controller also has three reset lines - DBI, Slave, Master.
Also three reset lines each for UCIE and RCPU block.

As an agreement with maintainer, the reset IDs has been rearranged as
contiguous number but keep most part unchanged to avoid break patches
which already sent to mailing list. The changes of DT binding header file
and reset driver are merged together as one single commit to avoid
git-bisect breakage.

Fixes: 938ce3b16582 ("reset: spacemit: Add SpacemiT K3 reset driver")
Fixes: 216e0a5e98e5 ("dt-bindings: soc: spacemit: Add K3 reset support and IDs")
Signed-off-by: Yixun Lan <dlan@kernel.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

reset: gpio: fix double free in reset_add_gpio_aux_device() error path

When __auxiliary_device_add() fails, reset_add_gpio_aux_device()
calls auxiliary_device_uninit(adev).

The device release callback reset_gpio_aux_device_release() frees
adev, but the current error path then calls kfree(adev) again,
causing a double free.

Keep kfree(adev) for the auxiliary_device_init() failure path, but
avoid freeing adev after auxiliary_device_uninit().

Fixes: 5fc4e4cf7a22 ("reset: gpio: use software nodes to setup the GPIO lookup")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

KVM: arm64: Remove extra ISBs when using msr_hcr_el2

The msr_hcr_el2 macro is slightly awkward, as it provides an ISB
when CONFIG_AMPERE_ERRATUM_AC04_CPU_23 is present, and none
otherwise. Note that this this option is 'default y', meaning that
it is likely to be selected.

Most instances of msr_hcr_el2 are also immediately followed by an ISB,
meaning that in most cases, you end-up with two back-to-back ISBs.
This isn't a big deal, but once you have seen that, you can't unsee it.

Rework the msr_hcr_el2 macro to always provide the ISB, and drop
the superfluous ISBs everywhere else.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260321212419.2803972-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Use direct function pointers for cpu_{on,resume}

Instead of using a boolean to decide whether a CPU is booting or
resuming, just pass an actual function pointer around.

This makes the code a bit more straightforward to understand.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260321212419.2803972-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Turn __kvm_hyp_init_cpu into an inner label

__kvm_hyp_init_cpu really is an internal label for kvm_hyp_cpu_entry
and kvm_hyp_cpu_resume.

Make it clear that this is what it is, and drop a pointless branch
in kvm_hyp_cpu_resume.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260321212419.2803972-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Simplify BTI handling on CPU boot

In order to perform an indirect branch to kvm_host_psci_cpu_entry()
on a BTI-aware system, we first branch to a 'BTI j' landing pad,
and from there branch again to the target.

While this works, this is really not required:

- BLR works with 'BTI c' and 'PACIASP' as the landing pad

- Even if LR gets clobbered by BLR, we are going to restore the
host's registers, so it is pointless to try and avoid touching
LR

Given the above, drop the veneer and directly call into C code.
If we were to come back from it, we'd directly enter the error
handler.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260321212419.2803972-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Move error handling to the end of kvm_hyp_cpu_entry

We currently handle CPUs having booted at EL1 in the middle of
the kvm_hyp_cpu_entry function. Not only this adversely affects
readability, but this is also at a bizarre spot should more
error handling be added (which we're about to do).

Move the WFE/WFI loop to the end of the function and fix a comment.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260321212419.2803972-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

ata: libata-transport: remove redundant dynamic sysfs attributes

Use the static sysfs attributes directly, this allows to significantly
simplify the code. See attribute_container_add_attrs() for why member
grp can be used instead of attrs.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>

ARM: dts: microchip: sam9x7: fix gpio-lines count for pioB

The pioB controller on the SAM9X7 SoC actually supports 27 GPIO lines.
The previous value of 26 was incorrect, leading to the last pin being
unavailable for use by the GPIO subsystem.
Update the #gpio-lines property to reflect
the correct hardware specification.

Fixes: 41af45af8bc3 ("ARM: dts: at91: sam9x7: add device tree for SoC")
Signed-off-by: Mihai Sain <mihai.sain@microchip.com>
Link: https://lore.kernel.org/r/20260209090735.2016-1-mihai.sain@microchip.com
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>

ata: libata-scsi: refactor ata_scsiop_maint_in()

ata_scsiop_maint_in() is currently quite confusing to read, because it
currently only implements support for the service action REPORT SUPPORTED
OPERATION CODES.

Thus, when this function is checking for "invalid command format", it is
not very clear if it is an invalid command format for the MAINTENANCE IN
command itself, or an invalid command format for the (currently one and
only) service action/subcommand implemented for this command.

Move the service action to a separate function, so it is more clear that
the "invalid command format" check is actually specific for the REPORT
SUPPORTED OPERATION CODES service action.

This also makes it easier and less confusing to add support for additional
service actions in the future.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>

arm64: dts: ti: k3-j722s: Add main_i2c4 device node

Add missing device tree node for main_i2c4, and the corresponding ranges
in cbass_main. Interrupt for this i2c controller is routed through the
Main GPIOMUX Router.
Base address, Interrupt IDs are taken from J722S TRM [0].
Device, Clock IDs are taken from TISCI docs [1].

Additionally, the I2C4 is the only interrupt source to the GPIOMUX INTR
router that generates level interrupts, while all other sources generate
edge interrupts. Due to this, the router needs to handle interrupt-type
on a per-line basis. Modify the router node and its consumers to
specify the interrupt type corresponding to each interrupt line.

[0]: https://www.ti.com/lit/zip/sprujb3
[1]:
https://software-dl.ti.com/tisci/esd/latest/5_soc_doc/index.html#j722s

Signed-off-by: Jared McArthur <j-mcarthur@ti.com>
Signed-off-by: Aniket Limaye <a-limaye@ti.com>
Tested-by: Nora Schiffer <nora.schiffer@ew.tq-group.com>
Link: https://patch.msgid.link/20260304-j722s-main-i2c4-dt-v1-1-03f79f0cdf97@ti.com
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>

media: amphion: Fix race between m2m job_abort and device_run

Fix kernel panic caused by race condition where v4l2_m2m_ctx_release()
frees m2m_ctx while v4l2_m2m_try_run() is about to call device_run
with the same context.

Race sequence:
  v4l2_m2m_try_run():           v4l2_m2m_ctx_release():
    lock/unlock                   v4l2_m2m_cancel_job()
                                    job_abort()
                                      v4l2_m2m_job_finish()
                                  kfree(m2m_ctx)  <- frees ctx
    device_run()  <- use-after-free crash at 0x538

Crash trace:
  Unable to handle kernel read from unreadable memory at virtual address
  0000000000000538
  v4l2_m2m_try_run+0x78/0x138
  v4l2_m2m_device_run_work+0x14/0x20

The amphion vpu driver does not rely on the m2m framework's device_run
callback to perform encode/decode operations.

Fix the race by preventing m2m framework job scheduling entirely:
- Add job_ready callback returning 0 (no jobs ready for m2m framework)
- Remove job_abort callback to avoid the race condition

Fixes: 3cd084519c6f ("media: amphion: add vpu v4l2 m2m support")
Cc: stable@vger.kernel.org
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: dt-bindings: rockchip,vdec: Add alternative reg-names order for RK35{76,88}

With the introduction of the RK3588 SoC, and RK3576 afterwards, three
register blocks have been provided for the video decoder unit instead of
just one, which are further referenced in vendor's datasheet by 'link
table', 'function' and 'cache'. The former is present at the top of the
listing, starting at video decoder unit base address.

However, while documenting RK3588, the binding broke the convention
expecting the unit address to indicate the start of the primary register
range, i.e. the 'function' block got listed before the 'link' one.

Since the binding changes have been already released and a fix would
bring up an ABI break, mark the current 'reg-names' ordering as
deprecated and introduce an alternative 'link,function,cache' listing
which follows the address-based ordering according to the TRM.

Additionally, drop the 'reg' description items as the order is not fixed
anymore, while the information they offer is not very relevant anyway.

It's worth noting there are currently no (known) users impacted by these
binding changes, since the video decoder support for the aforementioned
SoCs in mainline driver and devicetrees hasn't been released yet - it
landed in v7.0-rc1 while all DTS updates resulting from this will be
handled before v7.0 is out.

Fixes: c6ffb7e1fb90 ("media: dt-bindings: rockchip: Document RK3588 Video Decoder bindings")
Fixes: a5c4a6526476 ("media: dt-bindings: rockchip: Add RK3576 Video Decoder bindings")
Cc: stable@vger.kernel.org
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: dt-bindings: rockchip,vdec: Mark reg-names required for RK35{76,88}

The Rockchip Video Decoder driver expects reg-names to be mandatory for
RK3576 and RK3588 SoCs, however the binding does not currently require
the use of them.

As a consequence, driver would fail to probe with a hypothetical
devicetree that doesn't provide the reg-names for these SoCs, but which
is otherwise a perfectly valid DT from the binding perspective.

Update the binding and make reg-names required for the aforementioned
SoCs. While this change introduces an ABI break, the expected impact on
potential users would be minimal, if any, since the old SoCs are
unaffected, while the video decoder support for these newer variants in
mainline driver and devicetrees hasn't been released yet.

Moreover, this is also a prerequisite for a subsequent binding update
introducing an alternative reg-names order, according to the
address-based listing in the vendor's datasheet.

Reported-by: Conor Dooley <conor@kernel.org>
Closes: https://lore.kernel.org/all/20260227-urologist-gratitude-7984733f2d41@spud/
Fixes: c6ffb7e1fb90 ("media: dt-bindings: rockchip: Document RK3588 Video Decoder bindings")
Fixes: a5c4a6526476 ("media: dt-bindings: rockchip: Add RK3576 Video Decoder bindings")
Cc: stable@vger.kernel.org
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: mediatek: vcodec: fix use-after-free in encoder release path

The fops_vcodec_release() function frees the context structure (ctx)
without first cancelling any pending or running work in ctx->encode_work.
This creates a race window where the workqueue handler (mtk_venc_worker)
may still be accessing the context memory after it has been freed.

Race condition:

    CPU 0 (release path)               CPU 1 (workqueue)
    ---------------------               ------------------
    fops_vcodec_release()
      v4l2_m2m_ctx_release()
        v4l2_m2m_cancel_job()
        // waits for m2m job "done"
                                        mtk_venc_worker()
                                          v4l2_m2m_job_finish()
                                          // m2m job "done"
                                          // BUT worker still running!
                                          // post-job_finish access:
                                        other ctx dereferences
                                          // UAF if ctx already freed
        // returns (job "done")
      kfree(ctx)  // ctx freed

Root cause: The v4l2_m2m_ctx_release() only waits for the m2m job
lifecycle (via TRANS_RUNNING flag), not the workqueue lifecycle.
After v4l2_m2m_job_finish() is called, the m2m framework considers
the job complete and v4l2_m2m_ctx_release() returns, but the worker
function continues executing and may still access ctx.

The work is queued during encode operations via:
  queue_work(ctx->dev->encode_workqueue, &ctx->encode_work)
The worker function accesses ctx->m2m_ctx, ctx->dev, and other ctx
fields even after calling v4l2_m2m_job_finish().

This vulnerability was confirmed with KASAN by running an instrumented
test module that widens the post-job_finish race window. KASAN detected:

  BUG: KASAN: slab-use-after-free in mtk_venc_worker+0x159/0x180
  Read of size 4 at addr ffff88800326e000 by task kworker/u8:0/12

  Workqueue: mtk_vcodec_enc_wq mtk_venc_worker

  Allocated by task 47:
    __kasan_kmalloc+0x7f/0x90
    fops_vcodec_open+0x85/0x1a0

  Freed by task 47:
    __kasan_slab_free+0x43/0x70
    kfree+0xee/0x3a0
    fops_vcodec_release+0xb7/0x190

Fix this by calling cancel_work_sync(&ctx->encode_work) before kfree(ctx).
This ensures the workqueue handler is both cancelled (if pending) and
synchronized (waits for any running handler to complete) before the
context is freed.

Placement rationale: The fix is placed after v4l2_ctrl_handler_free()
and before list_del_init(&ctx->list). At this point, all m2m operations
are done (v4l2_m2m_ctx_release() has returned), and we need to ensure
the workqueue is synchronized before removing ctx from the list and
freeing it.

Note: The open error path does NOT need cancel_work_sync() because
INIT_WORK() only initializes the work structure - it does not schedule
it. Work is only scheduled later during device_run() operations.

Fixes: 0934d3759615 ("media: mediatek: vcodec: separate decoder and encoder")
Cc: stable@vger.kernel.org
Signed-off-by: Fan Wu <fanwu01@zju.edu.cn>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: mtk-jpeg: fix use-after-free in release path due to uncancelled work

The mtk_jpeg_release() function frees the context structure (ctx) without
first cancelling any pending or running work in ctx->jpeg_work. This
creates a race window where the workqueue callback may still be accessing
the context memory after it has been freed.

Race condition:

    CPU 0 (release)                    CPU 1 (workqueue)
    ----------------                   ------------------
    close()
      mtk_jpeg_release()
                                       mtk_jpegenc_worker()
                                         ctx = work->data
                                         // accessing ctx

        kfree(ctx)  // freed!
                                         access ctx  // UAF!

The work is queued via queue_work() during JPEG encode/decode operations
(via mtk_jpeg_device_run). If the device is closed while work is pending
or running, the work handler will access freed memory.

Fix this by calling cancel_work_sync() BEFORE acquiring the mutex. This
ordering is critical: if cancel_work_sync() is called after mutex_lock(),
and the work handler also tries to acquire the same mutex, it would cause
a deadlock.

Note: The open error path does NOT need cancel_work_sync() because
INIT_WORK() only initializes the work structure - it does not schedule
it. Work is only scheduled later during ioctl operations.

Fixes: 5fb1c2361e56 ("mtk-jpegenc: add jpeg encode worker interface")
Cc: stable@vger.kernel.org
Signed-off-by: Fan Wu <fanwu01@zju.edu.cn>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: imx-jpeg: Add support for encoder v1 descriptor configuration

Support the upgraded JPEG encoder v1 found on i.MX952 SoC.

Detect the encoder hardware version via the version register.

The v1 encoder uses an expanded descriptor format that allows all
encoding parameters, including JPEG quality, to be configured directly
in the descriptor.

This removes the manual register-based configuration step required by v0
and reduces the interrupt count from two to one per frame.

V0 encoding flow:
  1. Write quality to registers -> trigger config interrupt
  2. Start encoding -> trigger completion interrupt

V1 encoding flow:
  1. Configure descriptor with all parameters including quality
  2. Start encoding -> trigger completion interrupt

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: imx-jpeg: Add encoder ops layer for hardware abstraction

Introduce mxc_jpeg_enc_ops function pointer structure to abstract
encoder configuration differences between hardware versions.

Extract the existing two-phase manual configuration into dedicated
functions (enter_config_mode/exit_config_mode) for v0 hardware.
Add setup_desc callback placeholder for future v1 hardware support
which will use descriptor-based configuration.

Store the extended sequential mode flag in the context to avoid
recalculating it during configuration phases.

No functional change.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: imx-jpeg: Use devm_pm_runtime_enable() helper

Use devm_pm_runtime_enable() to simplify probe and exit paths.

No functional change.

Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: imx-jpeg: Simplify descriptor initialization with memset

Use memset() to zero-initialize desc and cfg_desc structures instead of
assigning individual fields to zero. This is cleaner and ensures all
descriptor fields are properly initialized.

No functional change.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: chips-media: wave5: add missing spinlock protection for handle_dynamic_resolution_change()

Add spin_lock_irqsave()/spin_unlock_irqrestore() around the
handle_dynamic_resolution_change() call in initialize_sequence() to fix
the missing lock protection.

initialize_sequence() calls handle_dynamic_resolution_change() without
holding inst->state_spinlock. However, handle_dynamic_resolution_change()
has lockdep_assert_held(&inst->state_spinlock) indicating that callers
must hold this lock.

Other callers of handle_dynamic_resolution_change() properly acquire the
spinlock:
- wave5_vpu_dec_finish_decode()
- wave5_vpu_dec_device_run()

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Fixes: 9707a6254a8a6b ("media: chips-media: wave5: Add the v4l2 layer")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: chips-media: wave5: add missing spinlock protection for send_eos_event()

Add spin_lock_irqsave()/spin_unlock_irqrestore() around send_eos_event()
calls in the VB2 buffer queue and streamoff callbacks to fix the missing
lock protection.

wave5_vpu_dec_buf_queue_dst() and streamoff_output() call send_eos_event()
without holding inst->state_spinlock. However, send_eos_event() has
lockdep_assert_held(&inst->state_spinlock) indicating that callers must
hold this lock.

Other callers of send_eos_event() properly acquire the spinlock:
- wave5_vpu_dec_finish_decode() acquires lock at line 431
- wave5_vpu_dec_encoder_cmd() acquires lock at line 821
- wave5_vpu_dec_device_run() acquires lock at line 1592

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Fixes: 9707a6254a8a6b ("media: chips-media: wave5: Add the v4l2 layer")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

media: chips-media: wave5: fix a potential memory leak in wave5_vdi_init()

Add wave5_vdi_free_dma_memory() in the error path of
wave5_vdi_init() to prevent a potential memory leak.

Fixes: 45d1a2b93277 ("media: chips-media: wave5: Add vpuapi layer")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>

platform/x86: barco-p50-gpio: convert to guard() notation

Using guard notation simplifies control flow and makes the code clearer.

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260318-barco-p50-gpio-set-v2-2-c0a4a6416163@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: barco-p50-gpio: normalize return value of gpio_get

The GPIO get callback is expected to return 0 or 1 (or a negative error
code). Ensure that the value returned by p50_gpio_get() is normalized
to the [0, 1] range.

Fixes: 86ef402d805d606a ("gpiolib: sanitize the return value of gpio_chip::get()")
Reviewed-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260318-barco-p50-gpio-set-v2-1-c0a4a6416163@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Merge branch 'xfs-7.1-merge' into for-next

Signed-off-by: Carlos Maiolino <cem@kernel.org>

Merge branch 'xfs-7.0-fixes' into for-next

Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: report cow mappings with dirty pagecache for iomap zero range

XFS has long supported the case where it is possible to have dirty
data in pagecache backed by COW fork blocks and a hole in the data
fork. This occurs for two reasons. On reflink enabled files, COW
fork blocks are allocated with preallocation to help avoid
fragmention. Second, if a mapping lookup for a write finds blocks in
the COW fork, it consumes those blocks unconditionally. This might
mean that COW fork blocks are backed by non-shared blocks or even a
hole in the data fork, both of which are perfectly fine.

This leaves an odd corner case for zero range, however, because it
needs to distinguish between ranges that are sparse and thus do not
require zeroing and those that are not. A range backed by COW fork
blocks and a data fork hole might either be a legitimate hole in the
file or a range with pending buffered writes that will be written
back (which will remap COW fork blocks into the data fork).

This "COW fork blocks over data fork hole" situation has
historically been reported as a hole to iomap, which then has grown
a flush hack as a workaround to ensure zeroing occurs correctly. Now
that this has been lifted into the filesystem and replaced by the
dirty folio lookup mechanism, we can do better and use the pagecache
state to decide how to report the mapping. If a COW fork range
exists with dirty folios in cache, then report a typical shared
mapping. If the range is clean in cache, then we can consider the
COW blocks preallocation and call it a hole.

This doesn't fundamentally change behavior, but makes mapping
reporting more accurate. Note that this does require splitting
across the EOF boundary (similar to normal zero range) to ensure we
don't spuriously perform post-eof zeroing. iomap will warn about
zeroing beyond EOF because folios beyond i_size may not be written
back.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: replace zero range flush with folio batch

Now that the zero range pagecache flush is purely isolated to
providing zeroing correctness in this case, we can remove it and
replace it with the folio batch mechanism that is used for handling
unwritten extents.

This is still slightly odd in that XFS reports a hole vs. a mapping
that reflects the COW fork extents, but that has always been the
case in this situation and so a separate issue. We drop the iomap
warning that assumes the folio batch is always associated with
unwritten mappings, but this is mainly a development assertion as
otherwise the core iomap fbatch code doesn't care much about the
mapping type if it's handed the set of folios to process.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: only flush when COW fork blocks overlap data fork holes

The zero range hole mapping flush case has been lifted from iomap
into XFS. Now that we have more mapping context available from the
->iomap_begin() handler, we can isolate the flush further to when we
know a hole is fronted by COW blocks.

Rather than purely rely on pagecache dirty state, explicitly check
for the case where a range is a hole in both forks. Otherwise trim
to the range where there does happen to be overlap and use that for
the pagecache writeback check. This might prevent some spurious
zeroing, but more importantly makes it easier to remove the flush
entirely.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: look up cow fork extent earlier for buffered iomap_begin

To further isolate the need for flushing for zero range, we need to
know whether a hole in the data fork is fronted by blocks in the COW
fork or not. COW fork lookup currently occurs further down in the
function, after the zero range case is handled.

As a preparation step, lift the COW fork extent lookup to earlier in
the function, at the same time as the data fork lookup. Only the
lookup logic is lifted. The COW fork branch/reporting logic remains
as is to avoid any observable behavior change from an iomap
reporting perspective.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: flush eof folio before insert range size update

The flush in xfs_buffered_write_iomap_begin() for zero range over a
data fork hole fronted by COW fork prealloc is primarily designed to
provide correct zeroing behavior in particular pagecache conditions.
As it turns out, this also partially masks some odd behavior in
insert range (via zero range via setattr).

Insert range bumps i_size the length of the new range, flushes,
unmaps pagecache and cancels COW prealloc, and then right shifts
extents from the end of the file back to the target offset of the
insert. Since the i_size update occurs before the pagecache flush,
this creates a transient situation where writeback around EOF can
behave differently.

This appears to be corner case situation, but if happens to be
fronted by COW fork speculative preallocation and a large, dirty
folio that contains at least one full COW block beyond EOF, the
writeback after i_size is bumped may remap that COW fork block into
the data fork within EOF. The block is zeroed and then shifted back
out to post-eof, but this is unexpected in that it leads to a
written post-eof data fork block. This can cause a zero range
warning on a subsequent size extension, because we should never find
blocks that require physical zeroing beyond i_size.

To avoid this quirk, flush the EOF folio before the i_size update
during insert range. The entire range will be flushed, unmapped and
invalidated anyways, so this should be relatively unnoticeable.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

iomap, xfs: lift zero range hole mapping flush into xfs

iomap zero range has a wart in that it also flushes dirty pagecache
over hole mappings (rather than only unwritten mappings). This was
included to accommodate a quirk in XFS where COW fork preallocation
can exist over a hole in the data fork, and the associated range is
reported as a hole. This is because the range actually is a hole,
but XFS also has an optimization where if COW fork blocks exist for
a range being written to, those blocks are used regardless of
whether the data fork blocks are shared or not. For zeroing, COW
fork blocks over a data fork hole are only relevant if the range is
dirty in pagecache, otherwise the range is already considered
zeroed.

The easiest way to deal with this corner case is to flush the
pagecache to trigger COW remapping into the data fork, and then
operate on the updated on-disk state. The problem is that ext4
cannot accommodate a flush from this context due to being a
transaction deadlock vector.

Outside of the hole quirk, ext4 can avoid the flush for zero range
by using the recently introduced folio batch lookup mechanism for
unwritten mappings. Therefore, take the next logical step and lift
the hole handling logic into the XFS iomap_begin handler. iomap will
still flush on unwritten mappings without a folio batch, and XFS
will flush and retry mapping lookups in the case where it would
otherwise report a hole with dirty pagecache during a zero range.

Note that this is intended to be a fairly straightforward lift and
otherwise not change behavior. Now that the flush exists within XFS,
follow on patches can further optimize it.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: flush dirty pagecache over hole in zoned mode zero range

For zoned filesystems a window exists between the first write to a
sparse range (i.e. data fork hole) and writeback completion where we
might spuriously observe holes in both the COW and data forks. This
occurs because a buffered write populates the COW fork with
delalloc, writeback submission removes the COW fork delalloc blocks
and unlocks the inode, and then writeback completion remaps the
physically allocated blocks into the data fork. If a zero range
operation does a lookup during this window where both forks show a
hole, it incorrectly reports a hole mapping for a range that
contains data.

This currently works because iomap checks for dirty pagecache over
holes and unwritten mappings. If found, it flushes and retries the
lookup. We plan to remove the hole flush logic from iomap, however,
so lift the flush into xfs_zoned_buffered_write_iomap_begin() to
preserve behavior and document the purpose for it. Zoned XFS
filesystems don't support unwritten extents, so if zoned mode can
come up with a way to close this transient hole window in the
future, this flush can likely be removed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix iomap hole map reporting for zoned zero range

The hole mapping logic for zero range in zoned mode is not quite
correct. It currently reports a hole whenever one exists in the data
fork. If the first write to a sparse range has completed and not yet
written back, the blocks exist in the COW fork as delalloc until
writeback completes, at which point they are allocated and mapped
into the data fork. If a zero range occurs on a range that has not
yet populated the data fork, we will incorrectly report it as a
hole.

Note that this currently functions correctly because we are bailed
out by the pagecache flush in iomap_zero_range(). If a hole or
unwritten mapping is reported with dirty pagecache, it assumes there
is pending data, flushes to induce any pending block
allocations/remaps, and retries the lookup. We want to remove this
hack from iomap, however, so update iomap_begin() to only report a
hole for zeroing when one exists in both forks.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

KVM: arm64: nv: Expose shadow page tables in debugfs

Exposing shadow page tables in debugfs improves the debugability and
testability of NV. With this patch a new directory "nested" is created
for each VM created if the host is NV capable. Within the directory each
valid s2 mmu will have its shadow page table exposed as a readable file
with the file name formatted as 0x<vttbr>-0x<vtcr>-s2-{en,dis}abled. The
creation and removal of the files happen at the points when an s2 mmu
becomes valid, or the context it represents change. In the future the
"nested" directory can also hold other NV related information.

This is gated behind CONFIG_PTDUMP_STAGE2_DEBUGFS.

Suggested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Sebastian Ene <sebastianene@google.com>
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://patch.msgid.link/20260317182638.1592507-3-weilin.chang@arm.com
[maz: minor refactor, full 16 chars addresses]
Signed-off-by: Marc Zyngier <maz@kernel.org>

gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio()

devm_regmap_init_mmio() returns an ERR_PTR() on failure, not NULL.
The original code checked for NULL which would never trigger on error,
potentially leading to an invalid pointer dereference.
Use IS_ERR() and PTR_ERR() to properly handle the error case.

Fixes: e88500247dc3 ("gpio: add QIXIS FPGA GPIO controller")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260320-qixis-v1-1-a8efc22e8945@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

arm64: dts: arm/corstone1000: Add corstone-1000-a320

The Corstone-1000-a320 is a Corstone-1000 derivative with Cortex-A320 cores,
GIC-600, and Ethos-U85 NPU.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Message-Id: <20260320-dt-corstone1000-a320-v1-5-a549dfcfe8da@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>

arm64: dts: arm/corstone1000: Move FVP peripherals to separate .dtsi

The FVPs have a common set of peripherals specific to the FVP. Move
these to a separate .dtsi so they can be shared across FVP platforms.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Message-Id: <20260320-dt-corstone1000-a320-v1-4-a549dfcfe8da@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>

arm64: dts: arm/corstone1000: Move cpu nodes

In preparation to add a new Corstone-1000 variation with different CPUs,
move the CPU nodes into the specific platforms and out of the common
corstone1000.dtsi.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Message-Id: <20260320-dt-corstone1000-a320-v1-3-a549dfcfe8da@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>

dt-bindings: npu: arm,ethos: Add "arm,corstone1000-ethos-u85"

The Corstone-1000-A320 platform contains an Ethos-U85 NPU. Add a
specific compatible for it.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Message-Id: <20260320-dt-corstone1000-a320-v1-2-a549dfcfe8da@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>

dt-bindings: arm,corstone1000: Add "arm,corstone1000-a320-fvp"

The Arm Corstone1000-A320 is a variation of the Corstone1000 with
Cortex-A320 cores and an Ethos-U85 NPU. An FVP for the platform is
available here[1].

[1] https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms/IoT%20FVPs

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Message-Id: <20260320-dt-corstone1000-a320-v1-1-a549dfcfe8da@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>

xfs: remove redundant validation in xlog_recover_attri_commit_pass2

Remove the redundant post-parse validation switch. By the time that
block is reached, xfs_attri_validate() has already guaranteed all name
lengths are non-zero via xfs_attri_validate_namelen(), and
xfs_attri_validate_name_iovec() has already returned -EFSCORRUPTED for
NULL names. For the REMOVE case, attr_value and value_len are
structurally guaranteed to be NULL/zero because the parsing loop only
populates them when value_len != 0. All checks in that switch are
therefore dead code.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: fix ri_total validation in xlog_recover_attri_commit_pass2

The ri_total checks for SET/REPLACE operations are hardcoded to 3,
but xfs_attri_item_size() only emits a value iovec when value_len > 0,
so ri_total is 2 when value_len == 0.

For PPTR_SET/PPTR_REMOVE/PPTR_REPLACE, value_len is validated by
xfs_attri_validate() to be exactly sizeof(struct xfs_parent_rec) and
is never zero, so their hardcoded checks remain correct.

This problem may cause log recovery failures. The following script can be
used to reproduce the problem:

#!/bin/bash
mkfs.xfs -f /dev/sda
mount /dev/sda /mnt/test/
touch /mnt/test/file
for i in {1..200}; do
attr -s "user.attr_$i" -V "value_$i" /mnt/test/file > /dev/null
done
echo 1 > /sys/fs/xfs/debug/larp
echo 1 > /sys/fs/xfs/sda/errortag/larp
attr -s "user.zero" -V "" /mnt/test/file
echo 0 > /sys/fs/xfs/sda/errortag/larp
umount /mnt/test
mount /dev/sda /mnt/test/ # mount failed

Fix this by deriving the expected count dynamically as "2 + !!value_len"
for SET/REPLACE operations.

Cc: stable@vger.kernel.org # v6.9
Fixes: ad206ae50eca ("xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

dt-bindings: gpio: fix microchip,mpfs-gpio interrupt documentation

The microchip,mpfs-gpio binding suffered greatly due to being written
with a narrow minded view of the controller, and the interrupt bits
ended up incorrect. It was mistakenly assumed that the interrupt
configuration was set by platform firmware, based on the FPGA
configuration, and that the GPIO DT nodes were the only way to really
communicate interrupt configuration to software.

Instead, the mux should be a device in its own right, and the GPIO
controllers should be connected to it, rather than to the PLIC.
Now that a binding exists for that mux, try to fix the misconceptions
in the GPIO controller binding.

Firstly, it's not possible for this controller to have fewer than 14
GPIOs, and thus 14 interrupts also. There are three controllers, with
14, 24 & 32 GPIOs each. The fabric core, CoreGPIO, can of course have
a customisable number of GPIOs.

The example is wacky too - it follows from the incorrect understanding
that the GPIO controllers are connected to the PLIC directly. They are
not however, with a mux sitting in between. Update the example to use
the mux as a parent, and the interrupt numbers at the mux for GPIO2 as
the example - rather than the strange looking, repeated <53>.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260318-fondly-tradition-90b8241f0cc8@spud
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

gpio: dwapb: reduce allocation to single kzalloc

Instead of kzalloc + kcalloc, Combine the two using a flexible array
member.

Allows using __counted_by for extra runtime analysis. Move counting
variable to right after allocation as required by __counted_by.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260320005338.30355-1-rosenp@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

platform/mellanox: nvsw-sn2201: Drop unused include

This driver includes the legacy header <linux/gpio.h> but does
not use any symbols from it. Drop the inclusion.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260320222033.3238317-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/surface: hotplug: Correct inclusion for GPIO APIs

The modern GPIO APIs are available for users via linux/gpio/consumer.h.
The linux/gpio.h is legacy header that is subject to remove. Hence
replace the latter by the former in the driver.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260320221143.3237791-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

drm/xe/xe3p: Skip TD flush

Xe3p has HW ability to do transient display flush so the xe driver can
enable this HW feature by default and skip the software TD flush.

Bspec: 60002
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-10-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/xe3p_lpg: Restrict UAPI to enable L2 flush optimization

When set, starting xe3p_lpg, the L2 flush optimization
feature will control whether L2 is in Persistent or
Transient mode through monitoring of media activity.

To enable L2 flush optimization include new feature flag
GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when
media type is detected.

Tighten UAPI validation to restrict userptr, svm and
dmabuf mappings to be either 2WAY or XA+1WAY

V5(Thomas): logic correction
V4(MattA): Modify uapi doc and commit
V3(MattA): check valid op and pat_index value
V2(MattA): validate dma-buf bos and madvise pat-index

Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Carl Zhang <carl.zhang@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-9-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

platform/x86: dell/dell-rbtn: Convert ACPI driver to a platform one

In all cases in which a struct acpi_driver is used for binding a driver
to an ACPI device object, a corresponding platform device is created by
the ACPI core and that device is regarded as a proper representation of
underlying hardware. Accordingly, a struct platform_driver should be
used by driver code to bind to that device. There are multiple reasons
why drivers should not bind directly to ACPI device objects [1].

Overall, it is better to bind drivers to platform devices than to their
ACPI companions, so convert the Dell Airplane Mode Switch (rbtn) ACPI
driver to a platform one.

After this change, the subordinate rfkill device will be registered
under the platform device used for driver binding instead of its ACPI
companion.

While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.

Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/6141354.MhkbZ0Pkbq@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: dell/dell-rbtn: Register ACPI notify handler directly

To facilitate subsequent conversion of the driver to a platform one,
make it install an ACPI notify handler directly instead of using
a .notify() callback in struct acpi_driver.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/9591832.CDJkKcVGEf@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

drm/xe/pat: define coh_mode 2way

Defining 2way (two-way coherency) is critical for
Xe3p_LPG (Nova Lake P) platforms to support L2 flush
optimization safely.

This mode allows the driver to skip certain manual cache
flushes (L2 flush optimization) without risking memory
corruption because the hardware ensures the most recent
data is visible to both entities.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-8-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/xe3p_lpg: flush shrinker bo cachelines manually

XA, new pat_index introduced post xe3p_lpg, is memory shared between the
CPU and GPU is treated differently from other GPU memory when the Media
engine is power-gated.

XA is *always* flushed, like at the end-of-submssion (and maybe other
places), just that internally as an optimisation hw doesn't need to make
that a full flush (which will also include XA) when Media is
off/powergated, since it doesn't need to worry about GT caches vs Media
coherency, and only CPU vs GPU coherency, so can make that flush a
targeted XA flush, since stuff tagged with XA now means it's shared with
the CPU. The main implication is that we now need to somehow flush non-XA
before freeing system memory pages, otherwise dirty cachelines could be
flushed after the free (like if Media suddenly turns on and does a full
flush)

V4: Add comments for L2 flush path
V3(Thomas/MattA/MattR): Restrict userptr with non-xa, then no need to
flush manually
V2(MattA): Expand commit description

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-7-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

xfs: close crash window in attr dabtree inactivation

When inactivating an inode with node-format extended attributes,
xfs_attr3_node_inactive() invalidates all child leaf/node blocks via
xfs_trans_binval(), but intentionally does not remove the corresponding
entries from their parent node blocks.  The implicit assumption is that
xfs_attr_inactive() will truncate the entire attr fork to zero extents
afterwards, so log recovery will never reach the root node and follow
those stale pointers.

However, if a log shutdown occurs after the leaf/node block cancellations
commit but before the attr bmap truncation commits, this assumption
breaks.  Recovery replays the attr bmap intact (the inode still has
attr fork extents), but suppresses replay of all cancelled leaf/node
blocks, maybe leaving them as stale data on disk.  On the next mount,
xlog_recover_process_iunlinks() retries inactivation and attempts to
read the root node via the attr bmap. If the root node was not replayed,
reading the unreplayed root block triggers a metadata verification
failure immediately; if it was replayed, following its child pointers
to unreplayed child blocks triggers the same failure:

XFS (pmem0): Metadata corruption detected at
xfs_da3_node_read_verify+0x53/0x220, xfs_da3_node block 0x78
XFS (pmem0): Unmount and run xfs_repair
XFS (pmem0): First 128 bytes of corrupted metadata buffer:
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
XFS (pmem0): metadata I/O error in "xfs_da_read_buf+0x104/0x190" at daddr 0x78 len 8 error 117

Fix this in two places:

In xfs_attr3_node_inactive(), after calling xfs_trans_binval() on a
child block, immediately remove the entry that references it from the
parent node in the same transaction.  This eliminates the window where
the parent holds a pointer to a cancelled block.  Once all children are
removed, the now-empty root node is converted to a leaf block within the
same transaction. This node-to-leaf conversion is necessary for crash
safety. If the system shutdown after the empty node is written to the
log but before the second-phase bmap truncation commits, log recovery
will attempt to verify the root block on disk. xfs_da3_node_verify()
does not permit a node block with count == 0; such a block will fail
verification and trigger a metadata corruption shutdown. on the other
hand, leaf blocks are allowed to have this transient state.

In xfs_attr_inactive(), split the attr fork truncation into two explicit
phases.  First, truncate all extents beyond the root block (the child
extents whose parent references have already been removed above).
Second, invalidate the root block and truncate the attr bmap to zero in
a single transaction.  The two operations in the second phase must be
atomic: as long as the attr bmap has any non-zero length, recovery can
follow it to the root block, so the root block invalidation must commit
together with the bmap-to-zero truncation.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: factor out xfs_attr3_leaf_init

Factor out wrapper xfs_attr3_leaf_init function, which exported for
external use.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: factor out xfs_attr3_node_entry_remove

Factor out wrapper xfs_attr3_node_entry_remove function, which
exported for external use.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: only assert new size for datafork during truncate extents

The assertion functions properly because we currently only truncate the
attr to a zero size. Any other new size of the attr is not preempted.
Make this assertion is specific to the datafork, preparing for
subsequent patches to truncate the attribute to a non-zero size.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

xfs: Use xarray to track SB UUIDs instead of plain array.

Removing the plain array to track the UUIDs and switch
to xarray makes it more readable.

Signed-off-by: Lukas Herbolt <lukas@herbolt.com>
[cem: remove unneeded return from xfs_uuid_unmount]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

drm/xe/vf: Improve getting clean NULL context

There is a small risk that when fetching a NULL context image the
VF may get a tweaked context image prepared by another VF that was
previously running on the engine before the GuC scheduler switched
the VFs.

To avoid that risk, without forcing GuC scheduler to trigger costly
engine reset on every VF switch, use a watchdog mechanism that when
configured with impossible condition, triggers an interrupt, which
GuC will handle by doing an engine reset. Also adjust job size to
account for additional dwords with watchdog setup.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-4-michal.wajdeczko@intel.com

drm/xe: Add MI_SEMAPHORE_WAIT command definition

This command supports memory based Semaphore WAIT. Memory based
semaphores will be used for synchronization between the Producer
and the Consumer contexts. Producer and Consumer Contexts could
be running on different engines or on the same engine inside GT.

Bspec: 45749, 60244
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-3-michal.wajdeczko@intel.com

drm/xe: Add PR_CTR_CTRL/THRSH register definitions

The Watchdog Counter Control and Watchdog Counter Threshold
registers are needed for watchdog programming. This watchdog
will generate the "Media Hang Notify" interrupt.

Bspec: 45999, 46000
Bspec: 60373, 60374
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-2-michal.wajdeczko@intel.com

platform/x86: system76: Convert ACPI driver to a platform one

In all cases in which a struct acpi_driver is used for binding a driver
to an ACPI device object, a corresponding platform device is created by
the ACPI core and that device is regarded as a proper representation of
underlying hardware. Accordingly, a struct platform_driver should be
used by driver code to bind to that device. There are multiple reasons
why drivers should not bind directly to ACPI device objects [1].

Overall, it is better to bind drivers to platform devices than to their
ACPI companions, so convert the System76 ACPI driver to a platform one.

After this change, the subordinate hwmon, input and LED class devices
will be registered under the platform device used for driver binding
instead of its ACPI companion.

While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.

Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3401648.aeNJFYEL58@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: system76: Register ACPI notify handler directly

To facilitate subsequent conversion of the driver to a platform one,
make it install an ACPI notify handler directly instead of using
a .notify() callback in struct acpi_driver.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/13970743.uLZWGnKmhe@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: system76: Drop redundant devm_led_classdev_unregister()

Drop two redundant devm_led_classdev_unregister() calls from
system76_remove().

No intentional functional impact.

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5057164.GXAFRqVoOG@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: panasonic-laptop: Convert ACPI driver to a platform one

In all cases in which a struct acpi_driver is used for binding a driver
to an ACPI device object, a corresponding platform device is created by
the ACPI core and that device is regarded as a proper representation of
underlying hardware.  Accordingly, a struct platform_driver should be
used by driver code to bind to that device.  There are multiple reasons
why drivers should not bind directly to ACPI device objects [1].

Overall, it is better to bind drivers to platform devices than to their
ACPI companions, so convert the Panasonic laptop ACPI driver to a platform
one.

While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.

To maintain backwards compatibility with possibly existing user space,
the sysfs attributes created by the driver under the ACPI device object
used by it are not relocated.  Accordingly, the driver will continue to
use the driver_data pointer in struct acpi_device which needs to be
cleared on driver removal.

Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/8664183.T7Z3S40VBb@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: panasonic-laptop: Register ACPI notify handler directly

To facilitate subsequent conversion of the driver to a platform one,
make it install an ACPI notify handler directly instead of using
a .notify() callback in struct acpi_driver.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/13979037.uLZWGnKmhe@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: panasonic-laptop: Remove redundant checks from 3 functions

The device pointer cannot be NULL in acpi_pcc_hotkey_add() and
acpi_pcc_hotkey_remove() because these functions are ACPI driver
callbacks and NULL is never passed to any of them as an argument.

Likewise, acpi_pcc_hotkey_resume() is a resume callback of a
device driver and NULL is never passed to it as an argument, so
the dev pointer in it cannot be NULL.

Moreover, since acpi_pcc_hotkey_remove() and acpi_pcc_hotkey_resume()
can only run after acpi_pcc_hotkey_add() has completed successfully,
the acpi_driver_data() of the device object used by them cannot be
NULL when they run.

Drop all of the redundant NULL checks of the pointers mentioned above.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/1957824.tdWV9SEqCh@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: panasonic-laptop: Fix OPTD notifier registration and cleanup

An ACPI notify handler is leaked if device_create_file() returns an
error in acpi_pcc_hotkey_add().

Also, it is pointless to call pcc_unregister_optd_notifier() in
acpi_pcc_hotkey_remove() if pcc->platform is NULL and it is better
to arrange the cleanup code in that function in the same order as
the rollback code in acpi_pcc_hotkey_add().

Address the above by placing the pcc_register_optd_notifier() call in
acpi_pcc_hotkey_add() after the device_create_file() return value
check and placing the pcc_unregister_optd_notifier() call in
acpi_pcc_hotkey_remove() right before the device_remove_file() call.

Fixes: d5a81d8e864b ("platform/x86: panasonic-laptop: Add support for optical driver power in Y and W series")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2411055.ElGaqSPkdT@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: panasonic-laptop: Make pcc_register_optd_notifier() void

Convert pcc_register_optd_notifier() whose return value is never used to
a void function.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5093613.31r3eYUQgx@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

arm64: dts: renesas: r9a09g047e57-smarc: Enable RSPI0

Enable RSPI0 on the RZ/G3E SMARC EVK, where it is accessible on the
PMOD0 connector.

Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/b634c10e632fed07b5652c11de060deca27ead90.1771344527.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

arm64: dts: renesas: r9a09g047: Add RSPI nodes

Add nodes for the RSPI IPs found in the Renesas RZ/G3E SoC.

Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/c8df5202caf4e36ee5beafe78ad0940643edcbb6.1771344527.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

wifi: mt76: fix backoff fields and max_power calculation

The maximum power value may exist in either the data or backoff field.
Previously, backoff power limits were not considered in txpower reporting.
This patch ensures mt76 also considers backoff values in the SKU table.

Also, each RU entry (RU26, RU52, RU106, BW20, ...) in the DTS corresponds
to 10 stream combinations (1T1ss, 2T1ss, 3T1ss, 4T1ss, 2T2ss, 3T2ss,
4T2ss, 3T3ss, 4T3ss, 4T4ss).

For beamforming tables:
- In connac2, beamforming entries for BW20~BW160, and OFDM do not include
1T1ss.
- In connac3, beamforming entries for BW20~BW160, and RU include 1T1ss,
but OFDM beamforming does not include 1T1ss.

Non-beamforming and RU entries for both connac2 and connac3 include 1T1ss.

Fixes: b05ab4be9fd7 ("wifi: mt76: mt7915: add bf backoff limit table support")
Signed-off-by: Allen Ye <allen.ye@mediatek.com>
Co-developed-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Link: https://patch.msgid.link/8fa8ec500b3d4de7b1966c6887f1dfbe5c46a54c.1771205424.git.ryder.lee@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7921: Replace deprecated PCI function

pcim_iomap_table() and pcim_iomap_regions() have been
deprecated.
Replace them with pcim_iomap_region().

Signed-off-by: Madhur Kumar <madhurkumar004@gmail.com>
Link: https://patch.msgid.link/20251208172331.89705-1-madhurkumar004@gmail.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7996: increase txq memory limit to 32 MiB

Prior to this change, both 2G and 6G radios would fall back to the
mac80211 default of 4MB which is not enough for high data rates.

Signed-off-by: Chad Monroe <chad@monroe.io>
Link: https://patch.msgid.link/acfe2e25768b414518be2db22b1d3ba6f5db6fa1.1765203249.git.chad@monroe.io
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7996: reset device after MCU message timeout

Trigger a full reset after MCU message timeout.

Signed-off-by: Chad Monroe <chad@monroe.io>
Link: https://patch.msgid.link/6e05ed063f3763ad3457633c56b60a728a49a6f0.1765203753.git.chad@monroe.io
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: fix deadlock in remain-on-channel

mt76_remain_on_channel() and mt76_roc_complete() call mt76_set_channel()
while already holding dev->mutex. Since mt76_set_channel() also acquires
dev->mutex, this results in a deadlock.

Use __mt76_set_channel() instead of mt76_set_channel().
Add cancel_delayed_work_sync() for mac_work before acquiring the mutex
in mt76_remain_on_channel() to prevent a secondary deadlock with the
mac_work workqueue.

Fixes: a8f424c1287c ("wifi: mt76: add multi-radio remain_on_channel functions")
Signed-off-by: Chad Monroe <chad@monroe.io>
Link: https://patch.msgid.link/ace737e7b621af7c2adb33b0188011a5c1de2166.1765204256.git.chad@monroe.io
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7921: fix potential deadlock in mt7921_roc_abort_sync

roc_abort_sync() can deadlock with roc_work(). roc_work() holds
dev->mt76.mutex, while cancel_work_sync() waits for roc_work()
to finish. If the caller already owns the same mutex, both
sides block and no progress is possible.

This deadlock can occur during station removal when
mt76_sta_state() -> mt76_sta_remove() -> mt7921_mac_sta_remove() ->
mt7921_roc_abort_sync() invokes cancel_work_sync() while
roc_work() is still running and holding dev->mt76.mutex.

This avoids the mutex deadlock and preserves exactly-once
work ownership.

Fixes: 352d966126e6 ("wifi: mt76: mt7921: fix a potential association failure upon resuming")
Co-developed-by: Quan Zhou <quan.zhou@mediatek.com>
Signed-off-by: Quan Zhou <quan.zhou@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/20260126180013.8167-1-sean.wang@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7921: fix ROC abort flow interruption in mt7921_roc_work

The mt7921_set_roc API may be executed concurrently with mt7921_roc_work,
specifically between the following code paths:

- The check and clear of MT76_STATE_ROC in mt7921_roc_work:
if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state))
return;

- The execution of ieee80211_iterate_active_interfaces.

This race condition can interrupt the ROC abort flow, resulting in
the ROC process failing to abort as expected.

To address this defect, the modification of MT76_STATE_ROC is now
protected by mt792x_mutex_acquire(phy->dev). This ensures that
changes to the ROC state are properly synchronized, preventing
race conditions and ensuring the ROC abort flow is not interrupted.

Fixes: 034ae28b56f1 ("wifi: mt76: mt7921: introduce remain_on_channel support")
Cc: stable@vger.kernel.org
Signed-off-by: Quan Zhou <quan.zhou@mediatek.com>
Reviewed-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/2568ece8b557e5dda79391414c834ef3233049b6.1769133724.git.quan.zhou@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7925: fix tx power setting failure after chip reset

After the chip reset, the procedure to set the tx power will not be
successful because the previous region setting is still remains.
Clear the region setting during MAC initialization and allow it to be
reset to finalize the TX power setting.

Fixes: 3bc62aa4484d ("wifi: mt76: mt7925: add auto regdomain switch support")
Signed-off-by: Leon Yen <leon.yen@mediatek.com>
Link: https://patch.msgid.link/20260120163152.3694116-1-leon.yen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: Fix memory leak after mt76_connac_mcu_alloc_sta_req()

mt76_connac_mcu_alloc_sta_req() allocates an skb which is expected to
be freed eventually by mt76_mcu_skb_send_msg(). However, currently if
an intermediate function fails before sending, the allocated skb is
leaked.

Specifically, mt76_connac_mcu_sta_wed_update() and
mt76_connac_mcu_sta_key_tlv() may fail, leading to an immediate memory
leak in the error path.

Fix this by explicitly freeing the skb in these error paths.
Commit 7c0f63fe37a5 ("wifi: mt76: mt7996: fix memory leak on
mt7996_mcu_sta_key_tlv error") made a similar change.

Compile tested only. Issue found using a prototype static analysis tool
and code review.

Fixes: d1369e515efe ("wifi: mt76: connac: introduce mt76_connac_mcu_sta_wed_update utility routine")
Fixes: 6683d988089c ("mt76: connac: move mt76_connac_mcu_add_key in connac module")
Fixes: 4f831d18d12d ("wifi: mt76: mt7915: enable WED RX support")
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Link: https://patch.msgid.link/20260116144919.1482558-1-zilin@seu.edu.cn
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync

roc_abort_sync() can deadlock with roc_work(). roc_work() holds
dev->mt76.mutex, while cancel_work_sync() waits for roc_work()
to finish. If the caller already owns the same mutex, both
sides block and no progress is possible.

This deadlock can occur during station removal when
mt76_sta_state() -> mt76_sta_remove() ->
mt7925_mac_sta_remove_link() -> mt7925_mac_link_sta_remove() ->
mt7925_roc_abort_sync() invokes cancel_work_sync() while
roc_work() is still running and holding dev->mt76.mutex.

This avoids the mutex deadlock and preserves exactly-once
work ownership.

Fixes: 45064d19fd3a ("wifi: mt76: mt7925: fix a potential association failure upon resuming")
Co-developed-by: Quan Zhou <quan.zhou@mediatek.com>
Signed-off-by: Quan Zhou <quan.zhou@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/20251216013849.17976-1-sean.wang@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7925: Skip scan process during suspend.

We are experiencing command timeouts because an upper layer triggers
an unexpected scan while the system/device is in suspend.
The upper layer should not initiate scans until the NIC has fully resumed.
We want to prevent scans during suspend and avoid timeouts without harming
power management or user experience.

Signed-off-by: Michael Lo <michael.lo@mediatek.com>
Link: https://patch.msgid.link/20260112114007.2115873-1-leon.yen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>

wifi: mt76: mt7925: drop puncturing handling from BSS change path

IEEE80211_CHANCTX_CHANGE_PUNCTURING is a channel context change
flag and should not be checked in the BSS change handler, where
the changed mask represents enum ieee80211_bss_change.

Remove the puncturing handling from the BSS path and rely on
mt7925_change_chanctx() to update puncturing configuration.

Fixes: cadebdad959b ("wifi: mt76: mt7925: add EHT preamble puncturing")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/20251216022017.23870-1-sean.wang@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>