git.ipfire.org Git - thirdparty/kernel/linux.git/log

pmdomain: qcom: rpmhpd: Simplify locking with guard()

Simplify error handling (less gotos) over locks with guard().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konradybcio@kernel.org>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-7-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: qcom: cpr: Simplify locking with guard()

Simplify error handling (less gotos) over locks with guard().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konradybcio@kernel.org>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-6-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: qcom: cpr: Simplify with dev_err_probe()

Use dev_err_probe() to make defer code handling simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konradybcio@kernel.org>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-5-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: imx: gpcv2: Simplify with scoped for each OF child loop

Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make code a bit simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-4-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: imx: gpc: Simplify with scoped for each OF child loop

Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make code a bit simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-3-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: rockchip: Simplify locking with guard()

Simplify error handling (smaller error handling) over locks with
guard().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-2-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: rockchip: Simplify with scoped for each OF child loop

Use scoped for_each_available_child_of_node_scoped() and
for_each_child_of_node_scoped() when iterating over
device nodes to make code a bit simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-1-8320722eaf39@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: qcom-cpr: Use scope based of_node_put() to simplify code.

Use scope based of_node_put() to simplify the code logic, and we don't
need to call of_node_put(). This will simplify the code a lot.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Link: https://lore.kernel.org/r/20240821034022.27394-3-zhangzekun11@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: qcom-cpr: Use helper function for_each_available_child_of_node()

for_each_available_child_of_node() can help to iterate through the
device_node, and we don't need to use while loop. Besides, the purpose
of the while loop is to find a device_node which fits the condition
"child_req_np == ref_np", we can just read the property of "child_np"
directly in for_each_available_child_of_node(). No functional change
with such conversion.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Link: https://lore.kernel.org/r/20240821034022.27394-2-zhangzekun11@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: Merge branch fixes into next

Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

OPP: Fix support for required OPPs for multiple PM domains

It has turned out that having _set_required_opps() to recursively call
dev_pm_opp_set_opp() to set the required OPPs, doesn't really work as well
as we expected.

More precisely, at each recursive call to dev_pm_opp_set_opp() we are
changing an OPP for a required_dev that belongs to a required-OPP table.
The problem with this, is that we may have several devices sharing the same
required-OPP table, which leads to an incorrect behaviour in regards to
aggregating the per device votes.

To fix the problem for a required-OPP table belonging to a PM domain, which
is the only existing usecase for now, let's simply replace the call to
dev_pm_opp_set_opp() in _set_required_opps() by a call to _set_opp_level().

Moving forward we may potentially need to add support for other types of
required-OPP tables. In this case, the aggregation needs to be thought of.

Fixes: e37440e7e2c2 ("OPP: Call dev_pm_opp_set_opp() for required OPPs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20240822224547.385095-2-ulf.hansson@linaro.org

cpuidle: dt_idle_genpd: Simplify with scoped for each OF child loop

Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make code a bit simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240816150931.142208-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

cpuidle: psci: Simplify with scoped for each OF child loop

Use scoped for_each_child_of_node_scoped() when iterating over device
nodes to make code a bit simpler.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240816150931.142208-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: apple: Make apple_pmgr_reset_ops static

The sparse tool complains as follows:

drivers/pmdomain/apple/pmgr-pwrstate.c:180:32: warning:
symbol 'apple_pmgr_reset_ops' was not declared. Should it be static:

This symbol is not used outside of pmgr-pwrstate.c, so marks it static.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20240819115956.3884847-1-ruanjinjie@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

media: venus: use device managed APIs for power domains

Use devres-enabled version of power domain attach APIs.

Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Link: https://lore.kernel.org/r/1724063350-11993-3-git-send-email-quic_dikshita@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

PM: domains: add device managed version of dev_pm_domain_attach|detach_list()

Add the devres-enabled version of dev_pm_domain_attach|detach_list.
If client drivers use devm_pm_domain_attach_list() to attach the PM domains,
devm_pm_domain_detach_list() will be invoked implicitly during remove phase.

Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Link: https://lore.kernel.org/r/1724063350-11993-2-git-send-email-quic_dikshita@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: rockchip: Add support for RK3576 SoC

Add configuration for RK3576 SoC and list the power domains.

Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Reviewed-by: Elaine Zhang<zhangqing@rock-chips.com>
Link: https://lore.kernel.org/r/20240814222824.3170-3-detlev.casanova@collabora.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: Merge branch dt into next

Merge the immutable branch dt into next, to allow the DT bindings to be
tested together with changes that are targeted for v6.12.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

dt-bindings: power: Add support for RK3576 SoC

Define power domain IDs as described in the TRM and add compatible for
rockchip,rk3576-power-controller

Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Co-Developed-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240814222824.3170-2-detlev.casanova@collabora.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: Merge branch fixes into next

Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: imx: wait SSAR when i.MX93 power domain on

With "quiet" set in bootargs, there is power domain failure:
"imx93_power_domain 44462400.power-domain: pd_off timeout: name:
44462400.power-domain, stat: 4"

The current power on opertation takes ISO state as power on finished
flag, but it is wrong. Before powering on operation really finishes,
powering off comes and powering off will never finish because the last
powering on still not finishes, so the following powering off actually
not trigger hardware state machine to run. SSAR is the last step when
powering on a domain, so need to wait SSAR done when powering on.

Since EdgeLock Enclave(ELE) handshake is involved in the flow, enlarge
the waiting time to 10ms for both on and off to avoid timeout.

Cc: stable@vger.kernel.org
Fixes: 0a0f7cc25d4a ("soc: imx: add i.MX93 SRC power domain driver")
Reviewed-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20240814124740.2778952-1-peng.fan@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: raspberrypi-power: set flag GENPD_FLAG_ACTIVE_WAKEUP

Set flag GENPD_FLAG_ACTIVE_WAKEUP to rpi_power genpd, then when a device
is set as wakeup source using device_set_wakeup_enable, the power
domain could be kept on to make sure the device could wakeup the system.

Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20240728114200.75559-6-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: raspberrypi-power: Add logging to rpi_firmware_set_power

The Raspberry Pi power driver heavily relies on the logging of the
underlying firmware driver. This results in disadvantages like unspecific
error messages or limited debugging options. So implement the logging for
the most important function rpi_firmware_set_power.

Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20240728114200.75559-5-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: raspberrypi-power: Adjust packet definition

According to the official Mailbox property interface the second part
of RPI_FIRMWARE_SET_POWER_STATE ( and so on ...) is named state because
it represent u32 flags and just the lowest bit is for on/off. So rename it
to align with documentation and prepare the driver for further changes.

Link: https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240728114200.75559-4-wahrenst@gmx.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: Merge branch fixes into next

Merge the pmdomain fixes for v6.11-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.12.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

cpuidle: psci: Enable the hierarchical topology for s2idle on PREEMPT_RT

To enable the domain-idle-states to be used during s2idle on a PREEMPT_RT
based configuration, let's allow the re-assignment of the ->enter_s2idle()
callback to psci_enter_s2idle_domain_idle_state().

Similar to s2ram, let's leave the support for CPU hotplug outside
PREEMPT_RT, as it's depending on using runtime PM. For s2idle, this means
that an offline CPU's PM domain will remain powered-on. In practise this
may lead to that a shallower idle-state than necessary gets selected, which
shouldn't be an issue (besides wasting power).

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-8-ulf.hansson@linaro.org

cpuidle: psci: Enable the hierarchical topology for s2ram on PREEMPT_RT

The hierarchical PM domain topology are currently disabled on a PREEMPT_RT
based configuration. As a first step to enable it to be used, let's try to
attach the CPU devices to their PM domains on PREEMPT_RT. In this way the
syscore ops becomes available, allowing the PM domain topology to be
managed during s2ram.

For the moment let's leave the support for CPU hotplug outside PREEMPT_RT,
as it's depending on using runtime PM. For s2ram, this isn't a problem as
all CPUs are managed via the syscore ops.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-7-ulf.hansson@linaro.org

cpuidle: psci: Drop redundant assignment of CPUIDLE_FLAG_RCU_IDLE

When using the hierarchical topology and PSCI OSI-mode we may end up
overriding the deepest idle-state's ->enter|enter_s2idle() callbacks, but
there is no point to also re-assign the CPUIDLE_FLAG_RCU_IDLE for the
idle-state in question, as that has already been set when parsing the
states from DT. See init_state_node().

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-6-ulf.hansson@linaro.org

cpuidle: psci-domain: Enable system-wide suspend on PREEMPT_RT

The domain-idle-states are currently disabled on a PREEMPT_RT based
configuration for the cpuidle-psci-domain. To enable them to be used for
system-wide suspend and in particular during s2idle, let's set the
GENPD_FLAG_RPM_ALWAYS_ON instead of GENPD_FLAG_ALWAYS_ON for the
corresponding genpd provider.

In this way, the runtime PM path remains disabled in genpd for its attached
devices, while powering-on/off the PM domain during system-wide suspend
becomes allowed.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-5-ulf.hansson@linaro.org

pmdomain: core: Use dev_name() instead of kobject_get_path() in debugfs

Using kobject_get_path() means a dynamic memory allocation gets done, which
doesn't work on a PREEMPT_RT based configuration while holding genpd's raw
spinlock.

To fix the problem, let's convert into using the simpler dev_name(). This
means the information about the path doesn't get presented in debugfs, but
hopefully this shouldn't be an issue.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-4-ulf.hansson@linaro.org

pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set()

There is no need to hold the genpd-lock, while assigning the
dev->pm_domain. In fact, it becomes a problem on a PREEMPT_RT based
configuration as the genpd-lock may be a raw spinlock, while the lock
acquired through the call to dev_pm_domain_set() is a regular spinlock.

To fix the problem, let's simply move the calls to dev_pm_domain_set()
outside the genpd-lock.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-3-ulf.hansson@linaro.org

pmdomain: core: Enable s2idle for CPU PM domains on PREEMPT_RT

To allow a genpd provider for a CPU PM domain to enter a domain-idle-state
during s2idle on a PREEMPT_RT based configuration, we can't use the regular
spinlock, as they are turned into sleepable locks on PREEMPT_RT.

To address this problem, let's convert into using the raw spinlock, but
only for genpd providers that have the GENPD_FLAG_CPU_DOMAIN bit set. In
this way, the lock can still be acquired/released in atomic context, which
is needed in the idle-path for PREEMPT_RT.

Do note that the genpd power-on/off notifiers may also be fired during
s2idle, but these are already prepared for PREEMPT_RT as they are based on
the raw notifiers. However, consumers of them may need to adopt accordingly
to work properly on PREEMPT_RT.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240527142557.321610-2-ulf.hansson@linaro.org

pmdomain: amlogic: remove obsolete vpu domain driver

meson-gx-pwrc-vpu has been superseded by meson-ee-pwrc since
commit 53773f2dfd9c ("soc: amlogic: meson-ee-pwrc: add support for the Meson GX SoCs"),
so v5.8.

This driver is obsolete and no longer used or tested.
There is no reason to keep it around so remove it.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240730130227.712894-1-jbrunet@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: imx: scu-pd: Remove duplicated clocks

These clocks are already added to the list. Remove the duplicates ones.

Fixes: a67d780720ff ("genpd: imx: scu-pd: add more PDs")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240717080334.2210988-1-alexander.stein@ew.tq-group.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Linux 6.11-rc2

profiling: remove profile=sleep support

The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f86dc19 ("sched: Add wrapper for get_wchan()
to keep task blocked")

Booting with the 'profile=sleep' kernel command line option added or
executing

  # echo -n sleep > /sys/kernel/profiling

after boot causes the system to lock up.

Lockdep reports

  kthreadd/3 is trying to acquire lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70

  but task is already holding lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370

with the call trace being

   lock_acquire+0xc8/0x2f0
   get_wchan+0x32/0x70
   __update_stats_enqueue_sleeper+0x151/0x430
   enqueue_entity+0x4b0/0x520
   enqueue_task_fair+0x92/0x6b0
   ttwu_do_activate+0x73/0x140
   try_to_wake_up+0x213/0x370
   swake_up_locked+0x20/0x50
   complete+0x2f/0x40
   kthread+0xfb/0x180

However, since nobody noticed this regression for more than two years,
let's remove 'profile=sleep' support based on the assumption that nobody
needs this functionality.

Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

- Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver.

   A recent change in the ACPI code which consolidated code pathes moved
   the invocation of init_freq_invariance_cppc() to be moved to a CPU
   hotplug handler. The first invocation on AMD CPUs ends up enabling a
   static branch which dead locks because the static branch enable tries
   to acquire cpu_hotplug_lock but that lock is already held write by
   the hotplug machinery.

   Use static_branch_enable_cpuslocked() instead and take the hotplug
   lock read for the Intel code path which is invoked from the
   architecture code outside of the CPU hotplug operations.

- Fix the number of reserved bits in the sev_config structure bit field
   so that the bitfield does not exceed 64 bit.

- Add missing Zen5 model numbers

- Fix the alignment assumptions of pti_clone_pgtable() and
   clone_entry_text() on 32-bit:

   The code assumes PMD aligned code sections, but on 32-bit the kernel
   entry text is not PMD aligned. So depending on the code size and
   location, which is configuration and compiler dependent, entry text
   can cross a PMD boundary. As the start is not PMD aligned adding PMD
   size to the start address is larger than the end address which
   results in partially mapped entry code for user space. That causes
   endless recursion on the first entry from userspace (usually #PF).

   Cure this by aligning the start address in the addition so it ends up
   at the next PMD start address.

   clone_entry_text() enforces PMD mapping, but on 32-bit the tail might
   eventually be PTE mapped, which causes a map fail because the PMD for
   the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for
   the clone() invocation which resolves to PTE on 32-bit and PMD on
   64-bit.

- Zero the 8-byte case for get_user() on range check failure on 32-bit

   The recend consolidation of the 8-byte get_user() case broke the
   zeroing in the failure case again. Establish it by clearing ECX
   before the range check and not afterwards as that obvioulsy can't be
   reached when the range check fails

* tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
  x86/mm: Fix pti_clone_entry_text() for i386
  x86/mm: Fix pti_clone_pgtable() alignment assumption
  x86/setup: Parse the builtin command line before merging
  x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range
  x86/sev: Fix __reserved field in sev_config
  x86/aperfmperf: Fix deadlock on cpu_hotplug_lock

Merge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer/clocksource code:

   - The recent fix to make the take over of the broadcast timer more
     reliable retrieves a per CPU pointer in preemptible context.

     This went unnoticed in testing as some compilers hoist the access
     into the non-preemotible section where the pointer is actually
     used, but obviously compilers can rightfully invoke it where the
     code put it.

     Move it into the non-preemptible section right to the actual usage
     side to cure it.

   - The clocksource watchdog is supposed to emit a warning when the
     retry count is greater than one and the number of retries reaches
     the limit.

     The condition is backwards and warns always when the count is
     greater than one. Fixup the condition to prevent spamming dmesg"

* tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
  tick/broadcast: Move per CPU pointer access into the atomic section

Merge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:

- When stime is larger than rtime due to accounting imprecision, then
   utime = rtime - stime becomes negative. As this is unsigned math, the
   result becomes a huge positive number.

   Cure it by resetting stime to rtime in that case, so utime becomes 0.

- Restore consistent state when sched_cpu_deactivate() fails.

   When offlining a CPU fails in sched_cpu_deactivate() after the SMT
   present counter has been decremented, then the function aborts but
   fails to increment the SMT present counter and leaves it imbalanced.
   Consecutive operations cause it to underflow. Add the missing fixup
   for the error path.

   For SMT accounting the runqueue needs to marked online again in the
   error exit path to restore consistent state.

* tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
  sched/core: Introduce sched_set_rq_on/offline() helper
  sched/smt: Fix unbalance sched_smt_present dec/inc
  sched/smt: Introduce sched_smt_present_inc/dec() helper
  sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime

Merge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Thomas Gleixner:

- Move the smp_processor_id() invocation back into the non-preemtible
   region, so that the result is valid to use

- Add the missing package C2 residency counters for Sierra Forest CPUs
   to make the newly added support actually useful

* tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix smp_processor_id()-in-preemptible warnings
  perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest

Merge tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"A couple of fixes for interrupt chip drivers:

   - Make sure to skip the clear register space in the MBIGEN driver
     when calculating the node register index. Otherwise the clear
     register is clobbered and the wrong node registers are accessed.

   - Fix a signed/unsigned confusion in the loongarch CPU driver which
     converts an error code to a huge "valid" interrupt number.

   - Convert the mesion GPIO interrupt controller lock to a raw spinlock
     so it works on RT.

   - Add a missing static to a internal function in the pic32 EVIC
     driver"

* tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mbigen: Fix mbigen node address layout
  irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t'
  irqchip/irq-pic32-evic: Add missing 'static' to internal function
  irqchip/loongarch-cpu: Fix return value of lpic_gsi_to_irq()

Merge tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
"Two fixes for locking and jump labels:

   - Ensure that the atomic_cmpxchg() conditions are correct and
     evaluating to true on any non-zero value except 1. The missing
     check of the return value leads to inconsisted state of the jump
     label counter.

   - Add a missing type conversion in the paravirt spinlock code which
     makes loongson build again"

* tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  jump_label: Fix the fix, brown paper bags galore
  locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()

arm: dts: arm: versatile-ab: Fix duplicate clock node name

Commit 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and
regulator nodenames") renamed nodes and created 2 "clock-24000000" nodes
(at different paths).

The kernel can't handle these duplicate names even though they are at
different paths. Fix this by renaming one of the nodes to "clock-pclk".

This name is aligned with other Arm boards (those didn't have a known
frequency to use in the node name).

Fixes: 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and regulator nodenames")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- two reparse point fixes

- minor cleanup

- additional trace point (to help debug a recent problem)

* tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  smb: client: fix FSCTL_GET_REPARSE_POINT against NetApp
  smb3: add dynamic tracepoints for shutdown ioctl
  cifs: Remove cifs_aio_ctx
  smb: client: handle lack of FSCTL_GET_REPARSE_POINT support

Merge tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

- two Kconfig fixes

- one fix for the UVC driver addressing probing time detection of a UVC
   custom controls

- one fix related to PDF generation

* tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: v4l: Fix missing tabular column hint for Y14P format
  media: intel/ipu6: select AUXILIARY_BUS in Kconfig
  media: ipu-bridge: fix ipu6 Kconfig dependencies
  media: uvcvideo: Fix custom control mapping probing

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"One core change that reverts the double message print patch in sd.c
  (it was causing regressions on embedded systems).

  The rest are driver fixes in ufs, mpt3sas and mpi3mr"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: exynos: Don't resume FMP when crypto support is disabled
  scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES
  scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES
  scsi: ufs: core: Do not set link to OFF state while waking up from hibernation
  scsi: Revert "scsi: sd: Do not repeat the starting disk message"
  scsi: ufs: core: Fix deadlock during RTC update
  scsi: ufs: core: Bypass quick recovery if force reset is needed
  scsi: ufs: core: Check LSDBS cap when !mcq

Merge tag 'xfs-6.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Chandan Babu:

- Fix memory leak when corruption is detected during scrubbing parent
   pointers

- Allow SECURE namespace xattrs to use reserved block pool to in order
   to prevent ENOSPC

- Save stack space by passing tracepoint's char array to file_path()
   instead of another stack variable

- Remove unused parameter in macro XFS_DQUOT_LOGRES

- Replace comma with semicolon in a couple of places

* tag 'xfs-6.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: convert comma to semicolon
  xfs: convert comma to semicolon
  xfs: remove unused parameter in macro XFS_DQUOT_LOGRES
  xfs: fix file_path handling in tracepoints
  xfs: allow SECURE namespace xattrs to use reserved block pool
  xfs: fix a memory leak

Merge tag 'parisc-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

- fix unaligned memory accesses when calling BPF functions

- adjust memory size constants to fix possible DMA corruptions

* tag 'parisc-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix a possible DMA corruption
parisc: fix unaligned accesses in BPF

runtime constants: deal with old decrepit linkers

The runtime constants linker script depended on documented linker
behavior [1]:

"If an output section’s name is the same as the input section’s name
  and is representable as a C identifier, then the linker will
  automatically PROVIDE two symbols: __start_SECNAME and __stop_SECNAME,
  where SECNAME is the name of the section. These indicate the start
  address and end address of the output section respectively"

to just automatically define the symbol names for the bounds of the
runtime constant arrays.

It turns out that this isn't actually something we can rely on, with old
linkers not generating these automatic symbols.  It looks to have been
introduced in binutils-2.29 back in 2017, and we still support building
with versions all the way back to binutils-2.25 (from 2015).

And yes, Oleg actually seems to be using such ancient versions of
binutils.

So instead of depending on the implicit symbols from "section names
match and are representable C identifiers", just do this all manually.
It's not like it causes us any extra pain, we already have to do that
for all the other sections that we use that often have special
characters in them.

Reported-and-tested-by: Oleg Nesterov <oleg@redhat.com>
Link: https://sourceware.org/binutils/docs/ld/Input-Section-Example.html
Link: https://lore.kernel.org/all/20240802114518.GA20924@redhat.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'tags/fixes-media-uvc-20230722' of git://git.kernel.org/pub/scm/linux/kernel/git/pinchartl/linux.git

uvcvideo v6.11 regression fix: fix custom control mapping probing

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Merge tag 'io_uring-6.11-20240802' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
"Two minor tweaks for the NAPI handling, both from Olivier:

   - Kill two unused list definitions

   - Ensure that multishot NAPI doesn't age away"

* tag 'io_uring-6.11-20240802' of git://git.kernel.dk/linux:
  io_uring: remove unused local list heads in NAPI functions
  io_uring: keep multishot request NAPI timeout current

Merge tag 'thermal-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
"These fix a few issues related to the MSI IRQs management in the
  int340x thermal driver, fix a thermal core issue that may lead to
  missing trip point crossing events and update the thermal core
  documentation.

  Specifics:

   - Fix MSI error path cleanup in int340x, allow it to work with a
     subset of thermal MSI IRQs if some of them are not working and make
     it free all MSI IRQs on module exit (Srinivas Pandruvada)

   - Fix a thermal core issue that may lead to missing trip point
     crossing events in some cases when thermal_zone_set_trips() is used
     and update the thermal core documentation (Rafael Wysocki)"

* tag 'thermal-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Update thermal zone registration documentation
  thermal: trip: Avoid skipping trips in thermal_zone_set_trips()
  thermal: intel: int340x: Free MSI IRQ vectors on module exit
  thermal: intel: int340x: Allow limited thermal MSI support
  thermal: intel: int340x: Fix kernel warning during MSI cleanup

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Expand the speculative SSBS errata workaround to more CPUs

- Ensure jump label changes are visible to all CPUs with a
   kick_all_cpus_sync() (and also enable jump label batching as part of
   the fix)

- The shadow call stack sanitiser is currently incompatible with Rust,
   make CONFIG_RUST conditional on !CONFIG_SHADOW_CALL_STACK

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
  rust: SHADOW_CALL_STACK is incompatible with Rust
  arm64: errata: Expand speculative SSBS workaround (again)
  arm64: cputype: Add Cortex-A725 definitions
  arm64: cputype: Add Cortex-X1C definitions

Merge tag 'ceph-for-6.11-rc2' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
"A fix for a potential hang in the MDS when cap revocation races with
the client releasing the caps in question, marked for stable"

* tag 'ceph-for-6.11-rc2' of https://github.com/ceph/ceph-client:
ceph: force sending a cap update msg back to MDS for revoke op

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
"The bulk of the changes here is a largish change to guest_memfd,
  delaying the clearing and encryption of guest-private pages until they
  are actually added to guest page tables. This started as "let's make
  it impossible to misuse the API" for SEV-SNP; but then it ballooned a
  bit.

  The new logic is generally simpler and more ready for hugepage support
  in guest_memfd.

  Summary:

   - fix latent bug in how usage of large pages is determined for
     confidential VMs

   - fix "underline too short" in docs

   - eliminate log spam from limited APIC timer periods

   - disallow pre-faulting of memory before SEV-SNP VMs are initialized

   - delay clearing and encrypting private memory until it is added to
     guest page tables

   - this change also enables another small cleanup: the checks in
     SNP_LAUNCH_UPDATE that limit it to non-populated, private pages can
     now be moved in the common kvm_gmem_populate() function

   - fix compilation error that the RISC-V merge introduced in selftests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: fix determination of max NPT mapping level for private pages
  KVM: riscv: selftests: Fix compile error
  KVM: guest_memfd: abstract how prepared folios are recorded
  KVM: guest_memfd: let kvm_gmem_populate() operate only on private gfns
  KVM: extend kvm_range_has_memory_attributes() to check subset of attributes
  KVM: cleanup and add shortcuts to kvm_range_has_memory_attributes()
  KVM: guest_memfd: move check for already-populated page to common code
  KVM: remove kvm_arch_gmem_prepare_needed()
  KVM: guest_memfd: make kvm_gmem_prepare_folio() operate on a single struct kvm
  KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is passed to the guest
  KVM: guest_memfd: return locked folio from __kvm_gmem_get_pfn
  KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_*
  KVM: guest_memfd: do not go through struct page
  KVM: guest_memfd: delay folio_mark_uptodate() until after successful preparation
  KVM: guest_memfd: return folio from __kvm_gmem_get_pfn()
  KVM: x86: disallow pre-fault for SNP VMs before initialization
  KVM: Documentation: Fix title underline too short warning
  KVM: x86: Eliminate log spam from limited APIC timer periods

Merge branch 'kvm-fixes' into HEAD

* fix latent bug in how usage of large pages is determined for
  confidential VMs

* fix "underline too short" in docs

* eliminate log spam from limited APIC timer periods

* disallow pre-faulting of memory before SEV-SNP VMs are initialized

* delay clearing and encrypting private memory until it is added to
  guest page tables

* this change also enables another small cleanup: the checks in
  SNP_LAUNCH_UPDATE that limit it to non-populated, private pages
  can now be moved in the common kvm_gmem_populate() function

Merge tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

- A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing

- The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot

- The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT

- The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean

- A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters

- VM_FAULT_SIGSEGV is now handled in the page fault code

- A fix for stalls during CPU hotplug due to IPIs being disabled

- A fix for memblock bounds checking. This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map

* tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix linear mapping checks for non-contiguous memory regions
  RISC-V: Enable the IPI before workqueue_online_cpu()
  riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
  perf: riscv: Fix selecting counters in legacy mode
  cache: StarFive: Require a 64-bit system
  perf arch events: Fix duplicate RISC-V SBI firmware event name
  riscv/purgatory: align riscv_kernel_entry
  riscv: cpufeature: Do not drop Linux-internal extensions

Merge tag 'kvm-riscv-fixes-6.11-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.11, take #1

- Fix compile error in get-reg-list selftests

Merge tag 's390-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- remove unused empty CPU alternatives header file

- fix recently and erroneously removed exception handling when loading
   an invalid floating point register

- ptdump fixes to reflect the recent changes due to the uncoupling of
   physical vs virtual kernel address spaces

- changes to avoid the unnecessary splitting of large pages in kernel
   mappings

- add the missing MODULE_DESCRIPTION for the CIO modules

* tag 's390-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: Keep inittext section writable
  s390/vmlinux.lds.S: Move ro_after_init section behind rodata section
  s390/mm: Get rid of RELOC_HIDE()
  s390/mm/ptdump: Improve sorting of markers
  s390/mm/ptdump: Add support for relocated lowcore mapping
  s390/mm/ptdump: Fix handling of identity mapping area
  s390/cio: Add missing MODULE_DESCRIPTION() macros
  s390/alternatives: Remove unused empty header file
  s390/fpu: Re-add exception handling in load_fpu_state()

clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()

The current "nretries > 1 || nretries >= max_retries" check in
cs_watchdog_read() will always evaluate to true, and thus pr_warn(), if
nretries is greater than 1. The intent is instead to never warn on the
first try, but otherwise warn if the successful retry was the last retry.

Therefore, change that "||" to "&&".

Fixes: db3a34e17433 ("clocksource: Retry clock read if long delays detected")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240802154618.4149953-2-paulmck@kernel.org

Merge tag 'asm-generic-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic fixes from Arnd Bergmann:
"These are three important bug fixes for the cross-architecture tree,
  fixing a regression with the new syscall.tbl file, the inconsistent
  numbering for the new uretprobe syscall and a bug with iowrite64be on
  alpha"

* tag 'asm-generic-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  syscalls: fix syscall macros for newfstat/newfstatat
  uretprobe: change syscall number, again
  alpha: fix ioread64be()/iowrite64be() helpers

Merge tag 'sound-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A small collection of fixes:

   - Revert of FireWire changes that caused a long-time regression

   - Another long-time regression fix for AMD HDMI

   - MIDI2 UMP fixes

   - HD-audio Conexant codec fixes and a quirk"

* tag 'sound-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda: Conditionally use snooping for AMD HDMI
  ALSA: usb-audio: Correct surround channels in UAC1 channel map
  ALSA: seq: ump: Explicitly reset RPN with Null RPN
  ALSA: seq: ump: Transmit RPN/NRPN message at each MSB/LSB data reception
  ALSA: seq: ump: Use the common RPN/bank conversion context
  ALSA: ump: Explicitly reset RPN with Null RPN
  ALSA: ump: Transmit RPN/NRPN message at each MSB/LSB data reception
  Revert "ALSA: firewire-lib: operate for period elapse event in process context"
  Revert "ALSA: firewire-lib: obsolete workqueue for period update"
  ALSA: hda/realtek: Add quirk for Acer Aspire E5-574G
  ALSA: seq: ump: Optimize conversions from SysEx to UMP
  ALSA: hda/conexant: Mute speakers at suspend / shutdown
  ALSA: hda/generic: Add a helper to mute speakers at suspend/shutdown
  ALSA: hda: conexant: Fix headset auto detect fail in the polling mode

Merge tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Regular weekly fixes. This is a bit larger than usual but doesn't seem
  too crazy.

  Most of it is vmwgfx changes that fix a bunch of issues with wayland
  userspaces with dma-buf/external buffers and modesetting fixes.

  Otherwise it's kinda spread out, v3d fixes some new ioctls, nouveau
  has regression revert and fixes, amdgpu, i915 and ast have some small
  fixes, and some core fixes spread about.

  client:
   - fix error code

  atomic:
   - allow damage clips with async flips
   - allow explicit sync with async flips

  kselftests:
   - fix dmabuf-heaps test

  panic:
   - fix schedule_work in panic paths

  panel:
   - fix OrangePi Neo orientation

  gpuvm:
   - fix missing dependency

  amdgpu:
   - SMU 14.x update
   - Fix contiguous VRAM handling for IB parsing
   - GFX 12 fix
   - Regression fix for old APUs

  i915:
   - Static analysis fix for int overflow
   - Fix for HDCP2_STREAM_STATUS macro and removal of PWR_CLK_STATE for gen12

  nouveau:
   - revert busy wait change that caused a resume regression
   - fix buffer placement fault on dynamic pm s/r
   - fix refcount underflow

  ast:
   - fix black screen on resume
   - wake during connector status detect

  v3d:
   - fix issues with perf/timestamp ioctls

  vmwgfx:
   - fix deadlock in dma-buf fence polling
   - fix screen surface refcounting
   - fix dumb buffer handling
   - fix support for external buffers
   - fix overlay with screen targets
   - trigger modeset on screen moves"

* tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel: (31 commits)
  Revert "nouveau: rip out busy fence waits"
  nouveau: set placement to original placement on uvmm validate.
  drm/atomic: Allow userspace to use damage clips with async flips
  drm/atomic: Allow userspace to use explicit sync with atomic async flips
  drm/i915: Fix possible int overflow in skl_ddi_calculate_wrpll()
  drm/i915/hdcp: Fix HDCP2_STREAM_STATUS macro
  drm/ast: astdp: Wake up during connector status detection
  i915/perf: Remove code to update PWR_CLK_STATE for gen12
  kselftests: dmabuf-heaps: Ensure the driver name is null-terminated
  drm/client: Fix error code in drm_client_buffer_vmap_local()
  drm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()
  drm/amdgpu: increase mes log buffer size for gfx12
  drm/amdgpu: fix contiguous handling for IB parsing v2
  drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3
  drm/vmwgfx: Trigger a modeset when the screen moves
  drm/vmwgfx: Fix overlay when using Screen Targets
  drm/vmwgfx: Add basic support for external buffers
  drm/vmwgfx: Fix handling of dumb buffers
  drm/vmwgfx: Make sure the screen surface is ref counted
  drm/vmwgfx: Fix a deadlock in dma buf fence polling
  ...

cifs: update internal version number

To 2.50

Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: fix FSCTL_GET_REPARSE_POINT against NetApp

NetApp server requires the file to be open with FILE_READ_EA access in
order to support FSCTL_GET_REPARSE_POINT, otherwise it will return
STATUS_INVALID_DEVICE_REQUEST. It doesn't make any sense because
there's no requirement for FILE_READ_EA bit to be set nor
STATUS_INVALID_DEVICE_REQUEST being used for something other than
"unsupported reparse points" in MS-FSA.

To fix it and improve compatibility, set FILE_READ_EA & SYNCHRONIZE
bits to match what Windows client currently does.

Tested-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb3: add dynamic tracepoints for shutdown ioctl

For debugging an umount failure in xfstests generic/043 generic/044 in some
configurations, we needed more information on the shutdown ioctl which
was suspected of being related to the cause, so tracepoints are added
in this patch e.g.

  "trace-cmd record -e smb3_shutdown_enter -e smb3_shutdown_done -e smb3_shutdown_err"

Sample output:
  godown-47084   [011] .....  3313.756965: smb3_shutdown_enter: flags=0x1 tid=0x733b3e75
  godown-47084   [011] .....  3313.756968: smb3_shutdown_done: flags=0x1 tid=0x733b3e75

Tested-by: Anthony Nandaa (Microsoft) <profnandaa@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

cifs: Remove cifs_aio_ctx

Remove struct cifs_aio_ctx and its associated alloc/release functions as it
is no longer used, the functions being taken over by netfslib.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: handle lack of FSCTL_GET_REPARSE_POINT support

As per MS-FSA 2.1.5.10.14, support for FSCTL_GET_REPARSE_POINT is
optional and if the server doesn't support it,
STATUS_INVALID_DEVICE_REQUEST must be returned for the operation.

If we find files with reparse points and we can't read them due to
lack of client or server support, just ignore it and then treat them
as regular files or junctions.

Fixes: 5f71ebc41294 ("smb: client: parse reparse point flag in create response")
Reported-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Tested-by: Sebastian Steinbeisser <Sebastian.Steinbeisser@lrz.de>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'ata-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fix from Damien Le Moal:

- Add missing power-domains property to the device tree bindings for
the Rockchip Designware AHCI adapter (from Heiko)

* tag 'ata-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
dt-bindings: ata: rockchip-dwc-ahci: add missing power-domains

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fix from Al Viro:
"do_dup2() out-of-bounds array speculation fix"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
protect the fetch of ->fd[fd] in do_dup2() from mispredictions

arm64: jump_label: Ensure patched jump_labels are visible to all CPUs

Although the Arm architecture permits concurrent modification and
execution of NOP and branch instructions, it still requires some
synchronisation to ensure that other CPUs consistently execute the newly
written instruction:

>  When the modified instructions are observable, each PE that is
>  executing the modified instructions must execute an ISB or perform a
>  context synchronizing event to ensure execution of the modified
>  instructions

Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
patching jump labels"), the arm64 jump_label patching machinery
performed synchronisation using stop_machine() after each modification,
however this was problematic when flipping static keys from atomic
contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
so we switched to the _nosync() patching routines to avoid "scheduling
while atomic" BUG()s during boot.

In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
right: it cites the use of IPIs in the default patching routines as the
cause of the lockup, whereas stop_machine() does not rely on IPIs and
the I-cache invalidation is performed using __flush_icache_range(),
which elides the call to kick_all_cpus_sync(). In fact, the blocking
wait for other CPUs is what triggers the BUG() and the problem remains
even after f6cc0c501649, for example because we could block on the
jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
avoid the static key entirely in commit a862fc2254bd
("clocksource/arm_arch_timer: Remove use of workaround static key").

This all leaves the jump_label patching code in a funny situation on
arm64 as we do not synchronise with other CPUs to reduce the likelihood
of a bug which no longer exists. Consequently, toggling a static key on
one CPU cannot be assumed to take effect on other CPUs, leading to
potential issues, for example with missing preempt notifiers.

Rather than revert f6cc0c501649 and go back to stop_machine() for each
patch site, implement arch_jump_label_transform_apply() and kick all
the other CPUs with an IPI at the end of patching.

Cc: Alexander Potapenko <glider@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240731133601.3073-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

syscalls: fix syscall macros for newfstat/newfstatat

The __NR_newfstat and __NR_newfstatat macros accidentally got renamed
in the conversion to the syscall.tbl format, dropping the 'new' portion
of the name.

In an unrelated change, the two syscalls are no longer architecture
specific but are once more defined on all 64-bit architectures, so the
'newstat' ABI keyword can be dropped from the table as a simplification.

Fixes: Fixes: 4fe53bf2ba0a ("syscalls: add generic scripts/syscall.tbl")
Closes: https://lore.kernel.org/lkml/838053e0-b186-4e9f-9668-9a3384a71f23@app.fastmail.com/T/#t
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

uretprobe: change syscall number, again

Despite multiple attempts to get the syscall number assignment right
for the newly added uretprobe syscall, we ended up with a bit of a mess:

- The number is defined as 467 based on the assumption that the
   xattrat family of syscalls would use 463 through 466, but those
   did not make it into 6.11.

- The include/uapi/asm-generic/unistd.h file still lists the number
   463, but the new scripts/syscall.tbl that was supposed to have the
   same data lists 467 instead as the number for arc, arm64, csky,
   hexagon, loongarch, nios2, openrisc and riscv. None of these
   architectures actually provide a uretprobe syscall.

- All the other architectures (powerpc, arm, mips, ...) don't list
   this syscall at all.

There are two ways to make it consistent again: either list it with
the same syscall number on all architectures, or only list it on x86
but not in scripts/syscall.tbl and asm-generic/unistd.h.

Based on the most recent discussion, it seems like we won't need it
anywhere else, so just remove the inconsistent assignment and instead
move the x86 number to the next available one in the architecture
specific range, which is 335.

Fixes: 5c28424e9a34 ("syscalls: Fix to add sys_uretprobe to syscall.tbl")
Fixes: 190fec72df4a ("uprobe: Wire up uretprobe system call")
Fixes: 63ded110979b ("uprobe: Change uretprobe syscall scope and number")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

thermal: core: Update thermal zone registration documentation

The thermal sysfs API document is outdated. One of the problems with
it is that is still documents thermal_zone_device_register() which
does not exit any more and it does not reflect the current thermal
zone operations definition.

Replace the thermal_zone_device_register() description in it with
a thermal_zone_device_register_with_trips() description, including
an update of the thermal zone operations list.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2767845.mvXUDI8C0e@rjwysocki.net

Revert "nouveau: rip out busy fence waits"

This reverts commit d45bb9c5f7a6f7b6e47939856b28cb1da0cdc119.

Just got a report that this causes some suspend/resume issues,
so back it out and I'll investigate it later.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge tag 'drm-misc-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

A couple drm_panic fixes, several v3d fixes to increase the new timestamp API
safety, several fixes for vmwgfx for various modesetting issues, PM fixes
for ast, async flips improvements and two fixes for nouveau to fix
resource refcounting and buffer placement.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801-interesting-antique-bat-2fe4c0@houat

Merge tag 'drm-intel-fixes-2024-08-01' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Static analysis fix for int overflow
- Fix for HDCP2_STREAM_STATUS macro and removal of PWR_CLK_STATE for gen12

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZqslBkcZlInYdYgm@jlahtine-mobl.ger.corp.intel.com

Merge tag 'amd-drm-fixes-6.11-2024-07-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.11-2024-07-27:

amdgpu:
- SMU 14.x update
- Fix contiguous VRAM handling for IB parsing
- GFX 12 fix
- Regression fix for old APUs

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240728025407.2115881-1-alexander.deucher@amd.com

protect the fetch of ->fd[fd] in do_dup2() from mispredictions

both callers have verified that fd is not greater than ->max_fds;
however, misprediction might end up with
tofree = fdt->fd[fd];
being speculatively executed. That's wrong for the same reasons
why it's wrong in close_fd()/file_close_fd_locked(); the same
solution applies - array_index_nospec(fd, fdt->max_fds) could differ
from fd only in case of speculative execution on mispredicted path.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit

While zeroing the upper 32 bits of an 8-byte getuser on 32-bit x86 was
fixed by commit 8c860ed825cb ("x86/uaccess: Fix missed zeroing of ia32 u64
get_user() range checking") it was broken again in commit 8a2462df1547
("x86/uaccess: Improve the 8-byte getuser() case").

This is because the register which holds the upper 32 bits (%ecx) is being
cleared _after_ the check_range, so if the range check fails, %ecx is never
cleared.

This can be reproduced with:
./tools/testing/kunit/kunit.py run --arch i386 usercopy

Instead, clear %ecx _before_ check_range in the 8-byte case. This
reintroduces a bit of the ugliness we were trying to avoid by adding
another #ifndef CONFIG_X86_64, but at least keeps check_range from needing
a separate bad_get_user_8 jump.

Fixes: 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case")
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/20240731073031.4045579-1-davidgow@google.com

riscv: Fix linear mapping checks for non-contiguous memory regions

The RISC-V kernel already has checks to ensure that memory which would
lie outside of the linear mapping is not used. However those checks
use memory_limit, which is used to implement the mem= kernel command
line option (to limit the total amount of memory, not its address
range). When memory is made up of two or more non-contiguous memory
banks this check is incorrect.

Two changes are made here:
- add a call in setup_bootmem() to memblock_cap_memory_range() which
   will cause any memory which falls outside the linear mapping to be
   removed from the memory regions.
- remove the check in create_linear_mapping_page_table() which was
   intended to remove memory which is outside the liner mapping based
   on memory_limit, as it is no longer needed. Note a check for
   mapping more memory than memory_limit (to implement mem=) is
   unnecessary because of the existing call to
   memblock_enforce_memory_limit().

This issue was seen when booting on a SV39 platform with two memory
banks:
  0x00,80000000 1GiB
  0x20,00000000 32GiB
This memory range is 158GiB from top to bottom, but the linear mapping
is limited to 128GiB, so the lower block of RAM will be mapped at
PAGE_OFFSET, and the upper block straddles the top of the linear
mapping.

This causes the following Oops:
[    0.000000] Linux version 6.10.0-rc2-gd3b8dd5b51dd-dirty (stuart.menefy@codasip.com) (riscv64-codasip-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41.0.20231213) #20 SMP Sat Jun 22 11:34:22 BST 2024
[    0.000000] memblock_add: [0x0000000080000000-0x00000000bfffffff] early_init_dt_add_memory_arch+0x4a/0x52
[    0.000000] memblock_add: [0x0000002000000000-0x00000027ffffffff] early_init_dt_add_memory_arch+0x4a/0x52
...
[    0.000000] memblock_alloc_try_nid: 23724 bytes align=0x8 nid=-1 from=0x0000000000000000 max_addr=0x0000000000000000 early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] memblock_reserve: [0x00000027ffff5350-0x00000027ffffaffb] memblock_alloc_range_nid+0xb8/0x132
[    0.000000] Unable to handle kernel paging request at virtual address fffffffe7fff5350
[    0.000000] Oops [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.10.0-rc2-gd3b8dd5b51dd-dirty #20
[    0.000000] Hardware name: codasip,a70x (DT)
[    0.000000] epc : __memset+0x8c/0x104
[    0.000000]  ra : memblock_alloc_try_nid+0x74/0x84
[    0.000000] epc : ffffffff805e88c8 ra : ffffffff806148f6 sp : ffffffff80e03d50
[    0.000000]  gp : ffffffff80ec4158 tp : ffffffff80e0bec0 t0 : fffffffe7fff52f8
[    0.000000]  t1 : 00000027ffffb000 t2 : 5f6b636f6c626d65 s0 : ffffffff80e03d90
[    0.000000]  s1 : 0000000000005cac a0 : fffffffe7fff5350 a1 : 0000000000000000
[    0.000000]  a2 : 0000000000005cac a3 : fffffffe7fffaff8 a4 : 000000000000002c
[    0.000000]  a5 : ffffffff805e88c8 a6 : 0000000000005cac a7 : 0000000000000030
[    0.000000]  s2 : fffffffe7fff5350 s3 : ffffffffffffffff s4 : 0000000000000000
[    0.000000]  s5 : ffffffff8062347e s6 : 0000000000000000 s7 : 0000000000000001
[    0.000000]  s8 : 0000000000002000 s9 : 00000000800226d0 s10: 0000000000000000
[    0.000000]  s11: 0000000000000000 t3 : ffffffff8080a928 t4 : ffffffff8080a928
[    0.000000]  t5 : ffffffff8080a928 t6 : ffffffff8080a940
[    0.000000] status: 0000000200000100 badaddr: fffffffe7fff5350 cause: 000000000000000f
[    0.000000] [<ffffffff805e88c8>] __memset+0x8c/0x104
[    0.000000] [<ffffffff8062349c>] early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] [<ffffffff8043e892>] __unflatten_device_tree+0x52/0x114
[    0.000000] [<ffffffff8062441e>] unflatten_device_tree+0x9e/0xb8
[    0.000000] [<ffffffff806046fe>] setup_arch+0xd4/0x5bc
[    0.000000] [<ffffffff806007aa>] start_kernel+0x76/0x81a
[    0.000000] Code: b823 02b2 bc23 02b2 b023 04b2 b423 04b2 b823 04b2 (bc23) 04b2
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

The problem is that memblock (unaware that some physical memory cannot
be used) has allocated memory from the top of memory but which is
outside the linear mapping region.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Fixes: c99127c45248 ("riscv: Make sure the linear mapping does not use the kernel mapping")
Reviewed-by: David McKay <david.mckay@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240622114217.2158495-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

Merge tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

- Fix a pci_intx() regression that caused driver reload to fail with
   "Resources present before probing" (Philipp Stanner)

- Fix a pciehp regression that clobbered the upper bits of RAID status
   LEDs on NVMe devices behind an Intel VMD (Blazej Kucman)

* tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: pciehp: Retain Power Indicator bits for userspace indicators
  PCI: Fix devres regression in pci_intx()

KVM: x86/mmu: fix determination of max NPT mapping level for private pages

The `if (req_max_level)` test was meant ignore req_max_level if
PG_LEVEL_NONE was returned. Hence, this function should return
max_level instead of the ignored req_max_level.

This is only a latent issue for now, since guest_memfd does not
support large pages.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Message-ID: <20240801173955.1975034-1-ackerleytng@google.com>
Fixes: f32fb32820b1 ("KVM: x86: Add hook for determining max NPT mapping level")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

PCI: pciehp: Retain Power Indicator bits for userspace indicators

The sysfs "attention" file normally controls the Slot Control Attention
Indicator with 0 (off), 1 (on), 2 (blink) settings.

576243b3f9ea ("PCI: pciehp: Allow exclusive userspace control of
indicators") added pciehp_set_raw_indicator_status() to allow userspace to
directly control all four bits in both the Attention Indicator and the
Power Indicator fields via the "attention" file.

This is used on Intel VMD bridges so utilities like "ledmon" can use sysfs
"attention" to control up to 16 indicators for NVMe device RAID status.

abaaac4845a0 ("PCI: hotplug: Use FIELD_GET/PREP()") broke this by masking
the sysfs data with PCI_EXP_SLTCTL_AIC, which discards the upper two bits
intended for the Power Indicator Control field (PCI_EXP_SLTCTL_PIC).

For NVMe devices behind an Intel VMD, ledmon settings that use the
PCI_EXP_SLTCTL_PIC bits, i.e., ATTENTION_REBUILD (0x5), ATTENTION_LOCATE
(0x7), ATTENTION_FAILURE (0xD), ATTENTION_OFF (0xF), no longer worked
correctly.

Mask with PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC to retain both the
Attention Indicator and the Power Indicator bits.

Fixes: abaaac4845a0 ("PCI: hotplug: Use FIELD_GET/PREP()")
Link: https://lore.kernel.org/r/20240722141440.7210-1-blazej.kucman@intel.com
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.7+

PCI: Fix devres regression in pci_intx()

pci_intx() becomes managed if pcim_enable_device() has been called in
advance. Commit 25216afc9db5 ("PCI: Add managed pcim_intx()") changed this
behavior so that pci_intx() always leads to creation of a separate device
resource for itself, whereas earlier, a shared resource was used for all
PCI devres operations.

Unfortunately, pci_intx() seems to be used in some drivers' remove() paths;
in the managed case this causes a device resource to be created on driver
detach, which causes .probe() to fail if the driver is reloaded:

pci 0000:00:1f.2: Resources present before probing

Fix the regression by only redirecting pci_intx() to its managed twin
pcim_intx() if the pci_command changes.

Link: https://lore.kernel.org/r/20240725120729.59788-2-pstanner@redhat.com
Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()")
Reported-by: Damien Le Moal <dlemoal@kernel.org>
Closes: https://lore.kernel.org/all/b8f4ba97-84fc-4b7e-ba1a-99de2d9f0118@kernel.org/
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[bhelgaas: add error message to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Damien Le Moal <dlemoal@kernel.org>

Merge tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from wireless, bleutooth, BPF and netfilter.

  Current release - regressions:

   - core: drop bad gso csum_start and offset in virtio_net_hdr

   - wifi: mt76: fix null pointer access in mt792x_mac_link_bss_remove

   - eth: tun: add missing bpf_net_ctx_clear() in do_xdp_generic()

   - phy: aquantia: only poll GLOBAL_CFG regs on aqr113, aqr113c and
     aqr115c

  Current release - new code bugs:

   - smc: prevent UAF in inet_create()

   - bluetooth: btmtk: fix kernel crash when entering btmtk_usb_suspend

   - eth: bnxt: reject unsupported hash functions

  Previous releases - regressions:

   - sched: act_ct: take care of padding in struct zones_ht_key

   - netfilter: fix null-ptr-deref in iptable_nat_table_init().

   - tcp: adjust clamping window for applications specifying SO_RCVBUF

  Previous releases - always broken:

   - ethtool: rss: small fixes to spec and GET

   - mptcp:
      - fix signal endpoint re-add
      - pm: fix backup support in signal endpoints

   - wifi: ath12k: fix soft lockup on suspend

   - eth: bnxt_en: fix RSS logic in __bnxt_reserve_rings()

   - eth: ice: fix AF_XDP ZC timeout and concurrency issues

   - eth: mlx5:
      - fix missing lock on sync reset reload
      - fix error handling in irq_pool_request_irq"

* tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  mptcp: fix duplicate data handling
  mptcp: fix bad RCVPRUNED mib accounting
  ipv6: fix ndisc_is_useropt() handling for PIO
  igc: Fix double reset adapter triggered from a single taprio cmd
  net: MAINTAINERS: Demote Qualcomm IPA to "maintained"
  net: wan: fsl_qmc_hdlc: Discard received CRC
  net: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex
  net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys
  net/mlx5e: Fix CT entry update leaks of modify header context
  net/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability
  net/mlx5: Fix missing lock on sync reset reload
  net/mlx5: Lag, don't use the hardcoded value of the first port
  net/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule
  net/mlx5: Fix error handling in irq_pool_request_irq
  net/mlx5: Always drain health in shutdown callback
  net: Add skbuff.h to MAINTAINERS
  r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY
  netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().
  netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().
  net: drop bad gso csum_start and offset in virtio_net_hdr
  ...

rust: SHADOW_CALL_STACK is incompatible with Rust

When using the shadow call stack sanitizer, all code must be compiled
with the -ffixed-x18 flag, but this flag is not currently being passed
to Rust. This results in crashes that are extremely difficult to debug.

To ensure that nobody else has to go through the same debugging session
that I had to, prevent configurations that enable both SHADOW_CALL_STACK
and RUST.

It is rather common for people to backport 724a75ac9542 ("arm64: rust:
Enable Rust support for AArch64"), so I recommend applying this fix all
the way back to 6.1.

Cc: stable@vger.kernel.org # 6.1 and later
Fixes: 724a75ac9542 ("arm64: rust: Enable Rust support for AArch64")
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://lore.kernel.org/r/20240729-shadow-call-stack-v4-1-2a664b082ea4@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: errata: Expand speculative SSBS workaround (again)

A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time.

We worked around this for a number of CPUs in commits:

* 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
* 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")

Since then, similar errata have been published for a number of other Arm
Ltd CPUs, for which the same mitigation is sufficient. This is described
in their respective Software Developer Errata Notice (SDEN) documents:

* Cortex-A76 (MP052) SDEN v31.0, erratum 3324349
  https://developer.arm.com/documentation/SDEN-885749/3100/

* Cortex-A77 (MP074) SDEN v19.0, erratum 3324348
  https://developer.arm.com/documentation/SDEN-1152370/1900/

* Cortex-A78 (MP102) SDEN v21.0, erratum 3324344
  https://developer.arm.com/documentation/SDEN-1401784/2100/

* Cortex-A78C (MP138) SDEN v16.0, erratum 3324346
  https://developer.arm.com/documentation/SDEN-1707916/1600/

* Cortex-A78C (MP154) SDEN v10.0, erratum 3324347
  https://developer.arm.com/documentation/SDEN-2004089/1000/

* Cortex-A725 (MP190) SDEN v5.0, erratum 3456106
  https://developer.arm.com/documentation/SDEN-2832921/0500/

* Cortex-X1 (MP077) SDEN v21.0, erratum 3324344
  https://developer.arm.com/documentation/SDEN-1401782/2100/

* Cortex-X1C (MP136) SDEN v16.0, erratum 3324346
  https://developer.arm.com/documentation/SDEN-1707914/1600/

* Neoverse-N1 (MP050) SDEN v32.0, erratum 3324349
  https://developer.arm.com/documentation/SDEN-885747/3200/

* Neoverse-V1 (MP076) SDEN v19.0, erratum 3324341
  https://developer.arm.com/documentation/SDEN-1401781/1900/

Note that due to the manner in which Arm develops IP and tracks errata,
some CPUs share a common erratum number and some CPUs have multiple
erratum numbers for the same HW issue.

On parts without SB, it is necessary to use ISB for the workaround. The
spec_bar() macro used in the mitigation will expand to a "DSB SY; ISB"
sequence in this case, which is sufficient on all affected parts.

Enable the existing mitigation by adding the relevant MIDRs to
erratum_spec_ssbs_list. The list is sorted alphanumerically (involving
moving Neoverse-V3 after Neoverse-V2) so that this is easy to audit and
potentially extend again in future. The Kconfig text is also updated to
clarify the set of affected parts and the mitigation.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240801101803.1982459-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: cputype: Add Cortex-A725 definitions

Add cputype definitions for Cortex-A725. These will be used for errata
detection in subsequent patches.

These values can be found in the Cortex-A725 TRM:

https://developer.arm.com/documentation/107652/0001/

... in table A-247 ("MIDR_EL1 bit descriptions").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240801101803.1982459-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: cputype: Add Cortex-X1C definitions

Add cputype definitions for Cortex-X1C. These will be used for errata
detection in subsequent patches.

These values can be found in the Cortex-X1C TRM:

https://developer.arm.com/documentation/101968/0002/

... in section B2.107 ("MIDR_EL1, Main ID Register, EL1").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240801101803.1982459-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

RISC-V: Enable the IPI before workqueue_online_cpu()

Sometimes the hotplug cpu stalls at the arch_cpu_idle() for a while after
workqueue_online_cpu(). When cpu stalls at the idle loop, the reschedule
IPI is pending. However the enable bit is not enabled yet so the cpu stalls
at WFI until watchdog timeout. Therefore enable the IPI before the
workqueue_online_cpu() to fix the issue.

Fixes: 63c5484e7495 ("workqueue: Add multiple affinity scopes and interface to select them")
Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240717031714.1946036-1-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()

Handle VM_FAULT_SIGSEGV in the page fault path so that we correctly
kill the process and we don't BUG() the kernel.

Fixes: 07037db5d479 ("RISC-V: Paging and MMU")
Signed-off-by: Zhe Qiao <qiaozhe@iscas.ac.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240731084547.85380-1-qiaozhe@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

perf: riscv: Fix selecting counters in legacy mode

It is required to check event type before checking event config.
Events with the different types can have the same config.
This check is missed for legacy mode code

For such perf usage:
sysctl -w kernel.perf_user_access=2
perf stat -e cycles,L1-dcache-loads --
driver will try to force both events to CYCLE counter.

This commit implements event type check before forcing
events on the special counters.

Signed-off-by: Shifrin Dmitry <dmitry.shifrin@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend")
Link: https://lore.kernel.org/r/20240729125858.630653-1-dmitry.shifrin@syntacore.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

cache: StarFive: Require a 64-bit system

This has a bunch of {read,write}q() calls, so it won't work on 32-bit
systems. I don't think there's any 32-bit StarFive systems, so for now
just require 64-bit.

Fixes: cabff60ca77d ("cache: Add StarFive StarLink cache management")
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20240722154519.25375-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

perf arch events: Fix duplicate RISC-V SBI firmware event name

Currently, the RISC-V firmware JSON file has duplicate event name
"FW_SFENCE_VMA_RECEIVED". According to the RISC-V SBI PMU extension[1],
the event name should be "FW_SFENCE_VMA_ASID_SENT".

Before this patch:
$ perf list

firmware:
  fw_access_load
       [Load access trap event. Unit: cpu]
  fw_access_store
       [Store access trap event. Unit: cpu]
....
fw_set_timer
       [Set timer event. Unit: cpu]
  fw_sfence_vma_asid_received
       [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu]
  fw_sfence_vma_received
       [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu]

After this patch:
$ perf list

firmware:
  fw_access_load
       [Load access trap event. Unit: cpu]
  fw_access_store
       [Store access trap event. Unit: cpu]
.....
  fw_set_timer
       [Set timer event. Unit: cpu]
  fw_sfence_vma_asid_received
       [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu]
  fw_sfence_vma_asid_sent
       [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu]
  fw_sfence_vma_received
       [Received SFENCE.VMA request from other HART event. Unit: cpu]

Link: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc#event-firmware-events-type-15
Fixes: 8f0dcb4e7364 ("perf arch events: riscv sbi firmware std event files")
Fixes: c4f769d4093d ("perf vendor events riscv: add Sifive U74 JSON file")
Fixes: acbf6de674ef ("perf vendor events riscv: Add StarFive Dubhe-80 JSON file")
Fixes: 7340c6df49df ("perf vendor events riscv: add T-HEAD C9xx JSON file")
Fixes: f5102e31c209 ("riscv: andes: Support specifying symbolic firmware and hardware raw event")
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Nikita Shubin <n.shubin@yadro.com>
Reviewed-by: Inochi Amaoto <inochiama@outlook.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240719115018.27356-1-eric.lin@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv/purgatory: align riscv_kernel_entry

When alignment handling is delegated to the kernel, everything must be
word-aligned in purgatory, since the trap handler is then set to the
kexec one. Without the alignment, hitting the exception would
ultimately crash. On other occasions, the kernel's handler would take
care of exceptions.
This has been tested on a JH7110 SoC with oreboot and its SBI delegating
unaligned access exceptions and the kernel configured to handle them.

Fixes: 736e30af583fb ("RISC-V: Add purgatory")
Signed-off-by: Daniel Maslowski <cyrevolt@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240719170437.247457-1-cyrevolt@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

alpha: fix ioread64be()/iowrite64be() helpers

Compile-testing the crypto/caam driver on alpha showed a pre-existing
problem on alpha with iowrite64be() missing:

ERROR: modpost: "iowrite64be" [drivers/crypto/caam/caam_jr.ko] undefined!

The prototypes were added a while ago when we started using asm-generic/io.h,
but the implementation was still missing. At some point the ioread64/iowrite64
helpers were added, but the big-endian versions are still missing, and
the generic version (using readq/writeq) is would not work here.

Change it to wrap ioread64()/iowrite64() instead.

Fixes: beba3771d9e0 ("crypto: caam: Make CRYPTO_DEV_FSL_CAAM dependent of COMPILE_TEST")
Fixes: e19d4ebc536d ("alpha: add full ioread64/iowrite64 implementation")
Fixes: 7e772dad9913 ("alpha: Use generic <asm-generic/io.h>")
Closes: https://lore.kernel.org/all/CAHk-=wgEyzSxTs467NDOVfBSzWvUS6ztcwhiy=M3xog==KBmTw@mail.gmail.com/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

x86/mm: Fix pti_clone_entry_text() for i386

While x86_64 has PMD aligned text sections, i386 does not have this
luxery. Notably ALIGN_ENTRY_TEXT_END is empty and _etext has PAGE
alignment.

This means that text on i386 can be page granular at the tail end,
which in turn means that the PTI text clones should consistently
account for this.

Make pti_clone_entry_text() consistent with pti_clone_kernel_text().

Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for 32 bit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

x86/mm: Fix pti_clone_pgtable() alignment assumption

Guenter reported dodgy crashes on an i386-nosmp build using GCC-11
that had the form of endless traps until entry stack exhaust and then
#DF from the stack guard.

It turned out that pti_clone_pgtable() had alignment assumptions on
the start address, notably it hard assumes start is PMD aligned. This
is true on x86_64, but very much not true on i386.

These assumptions can cause the end condition to malfunction, leading
to a 'short' clone. Guess what happens when the user mapping has a
short copy of the entry text?

Use the correct increment form for addr to avoid alignment
assumptions.

Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for 32 bit")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20240731163105.GG33588@noisy.programming.kicks-ass.net

ceph: force sending a cap update msg back to MDS for revoke op

If a client sends out a cap update dropping caps with the prior 'seq'
just before an incoming cap revoke request, then the client may drop
the revoke because it believes it's already released the requested
capabilities.

This causes the MDS to wait indefinitely for the client to respond
to the revoke. It's therefore always a good idea to ack the cap
revoke request with the bumped up 'seq'.

Currently if the cap->issued equals to the newcaps the check_caps()
will do nothing, we should force flush the caps.

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/61782
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>