git.ipfire.org Git - thirdparty/linux.git/log

RISC-V: KVM: Fix double-free of sdata in kvm_pmu_clear_snapshot_area()

In kvm_riscv_vcpu_pmu_snapshot_set_shmem(), when kvm_vcpu_write_guest()
fails, kvpmu->sdata is freed but not set to NULL. This leaves a dangling
pointer that will be freed again when kvm_pmu_clear_snapshot_area() is
called during vcpu teardown, triggering a KASAN double-free report.

First free occurs in kvm_riscv_vcpu_pmu_snapshot_set_shmem():
kvm_riscv_vcpu_pmu_snapshot_set_shmem arch/riscv/kvm/vcpu_pmu.c:443
kvm_sbi_ext_pmu_handler arch/riscv/kvm/vcpu_sbi_pmu.c:74
kvm_riscv_vcpu_sbi_ecall arch/riscv/kvm/vcpu_sbi.c:608
kvm_riscv_vcpu_exit arch/riscv/kvm/vcpu_exit.c:240
kvm_arch_vcpu_ioctl_run arch/riscv/kvm/vcpu.c:1008
kvm_vcpu_ioctl virt/kvm/kvm_main.c:4476

Second free (double-free) occurs in kvm_pmu_clear_snapshot_area():
kvm_pmu_clear_snapshot_area arch/riscv/kvm/vcpu_pmu.c:403 [inline]
kvm_riscv_vcpu_pmu_deinit.part arch/riscv/kvm/vcpu_pmu.c:905
kvm_riscv_vcpu_pmu_deinit arch/riscv/kvm/vcpu_pmu.c:893
kvm_arch_vcpu_destroy arch/riscv/kvm/vcpu.c:199
kvm_vcpu_destroy virt/kvm/kvm_main.c:469 [inline]
kvm_destroy_vcpus virt/kvm/kvm_main.c:489
kvm_arch_destroy_vm arch/riscv/kvm/vm.c:54
kvm_destroy_vm virt/kvm/kvm_main.c:1301 [inline]
kvm_put_kvm virt/kvm/kvm_main.c:1338
kvm_vm_release virt/kvm/kvm_main.c:1361

Fix it by setting kvpmu->sdata to NULL after kfree() in
kvm_riscv_vcpu_pmu_snapshot_set_shmem(), so that the subsequent
kfree(NULL) in kvm_pmu_clear_snapshot_area() becomes a safe no-op.
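
As a minimal userspace sketch of the pattern (not the kernel code
itself), the following uses free() in place of kfree() and a
hypothetical struct pmu_ctx in place of the real PMU context. Both
function names are illustrative. Clearing the pointer right after the
first free is what turns the teardown free into a no-op:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-vCPU PMU state; only the field
 * relevant to the bug is modelled. */
struct pmu_ctx {
	void *sdata;
};

/* Stand-in for the failure path in the set_shmem handler: release the
 * snapshot buffer and clear the pointer so that a later teardown cannot
 * free it a second time. */
static void snapshot_setup_failed(struct pmu_ctx *pmu)
{
	assert(pmu != NULL);
	free(pmu->sdata);
	pmu->sdata = NULL;
}

/* Stand-in for the teardown path: free(NULL) is defined to do nothing,
 * just as kfree(NULL) is in the kernel. */
static void snapshot_clear(struct pmu_ctx *pmu)
{
	assert(pmu != NULL);
	free(pmu->sdata);
	pmu->sdata = NULL;
}
```

Without the assignment in snapshot_setup_failed(), the second call
would pass a stale pointer to free().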

This bug was found by fuzzing the KVM RISC-V PMU interface.

Fixes: c2f41ddbcdd756 ("RISC-V: KVM: Implement SBI PMU Snapshot feature")
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260318092956.708246-1-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>

riscv: kvm: add null pointer check for vector datap

Add a WARN_ON check before accessing cntx->vector.datap in
kvm_riscv_vcpu_vreg_addr() to detect potential null pointer
dereferences early, consistent with the pattern used in
kvm_riscv_vcpu_vector_reset().

This helps catch initialization issues where vector context
allocation may have failed.

Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260317114759.53165-1-r4o5m6e8o@163.com
Signed-off-by: Anup Patel <anup@brainfault.org>

drm/i915/dsi: Don't do DSC horizontal timing adjustments in command mode

Stop adjusting the horizontal timing values based on the
compression ratio in command mode. Bspec seems to be telling
us to do this only in video mode, and this is also how the
Windows driver does things.

This should also fix a div-by-zero on some machines because
the adjusted htotal ends up being so small that we end up with
line_time_us==0 when trying to determine the vtotal value in
command mode.

Note that this doesn't actually make the display on the
Huawei Matebook E work, but at least the kernel no longer
explodes when the driver loads.

Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com
Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 0b475e91ecc2313207196c6d7fd5c53e1a878525)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

xfrm: account XFRMA_IF_ID in aevent size calculation

xfrm_get_ae() allocates the reply skb with xfrm_aevent_msgsize(), then
build_aevent() appends attributes including XFRMA_IF_ID when x->if_id is
set.

xfrm_aevent_msgsize() does not include space for XFRMA_IF_ID. For states
with if_id, build_aevent() can fail with -EMSGSIZE and hit BUG_ON(err < 0)
in xfrm_get_ae(), turning a malformed netlink interaction into a kernel
panic.

Account XFRMA_IF_ID in the size calculation unconditionally and replace
the BUG_ON with normal error unwinding.
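
The idea can be sketched in userspace C as follows. This is an
illustration under stated assumptions, not the xfrm code: the
attribute header size and alignment follow the usual netlink layout,
while struct buf, put_attr(), build_reply(), reply_size() and the
attribute types 1 and 2 are hypothetical. The size estimate counts the
optional attribute unconditionally, and the builder reports -EMSGSIZE
to its caller rather than asserting:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_ALIGN 4u
#define ATTR_HDR   4u /* 16-bit length + 16-bit type, as in netlink */

/* Space one attribute occupies, including header and padding. */
static size_t attr_space(size_t payload)
{
	return (ATTR_HDR + payload + ATTR_ALIGN - 1) & ~(size_t)(ATTR_ALIGN - 1);
}

/* Upper bound for the reply: every attribute the builder might emit is
 * counted, including the optional interface ID. */
static size_t reply_size(void)
{
	return attr_space(sizeof(uint64_t))  /* hypothetical mandatory attribute */
	     + attr_space(sizeof(uint32_t)); /* optional if_id */
}

struct buf {
	unsigned char *data;
	size_t len;
	size_t cap;
};

/* Append one attribute, or fail with -EMSGSIZE if it does not fit. */
static int put_attr(struct buf *b, uint16_t type, const void *payload,
		    size_t plen)
{
	size_t need = attr_space(plen);
	uint16_t alen = (uint16_t)(ATTR_HDR + plen);

	if (b->cap - b->len < need)
		return -EMSGSIZE;

	memset(b->data + b->len, 0, need);
	memcpy(b->data + b->len, &alen, sizeof(alen));
	memcpy(b->data + b->len + sizeof(alen), &type, sizeof(type));
	memcpy(b->data + b->len + ATTR_HDR, payload, plen);
	b->len += need;
	return 0;
}

/* Build the reply; the caller unwinds on error instead of asserting. */
static int build_reply(struct buf *b, uint64_t counter, uint32_t if_id)
{
	int err = put_attr(b, 1, &counter, sizeof(counter));

	if (err)
		return err;
	if (if_id)
		err = put_attr(b, 2, &if_id, sizeof(if_id));
	return err;
}
```

Counting the optional attribute unconditionally over-allocates by a few
bytes when if_id is unset, which keeps the size function independent of
per-state details.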

Fixes: 7e6526404ade ("xfrm: Add a new lookup key to match xfrm interfaces.")
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: clear trailing padding in build_polexpire()

build_expire() clears the trailing padding bytes of struct
xfrm_user_expire after setting the hard field via memset_after(),
but the analogous function build_polexpire() does not do this for
struct xfrm_user_polexpire.

The padding bytes after the __u8 hard field are left
uninitialized from the heap allocation, and are then sent to
userspace via netlink multicast to XFRMNLGRP_EXPIRE listeners,
leaking kernel heap memory contents.

Add the missing memset_after() call, matching build_expire().
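
A userspace sketch of what clearing the tail amounts to, assuming a
hypothetical struct expire_msg whose last member is a single byte
(the real xfrm structures differ). The helper zeroes every byte after
that member, which is the effect the kernel's memset_after() provides:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout: an 8-byte-aligned body followed by a
 * single-byte flag, which leaves trailing padding in the struct. */
struct expire_msg {
	uint64_t body;
	uint8_t hard;
};

/* Zero every byte that follows `hard`, up to the end of the struct. */
static void clear_tail_after_hard(struct expire_msg *msg)
{
	size_t start = offsetof(struct expire_msg, hard) + sizeof(msg->hard);

	memset((unsigned char *)msg + start, 0, sizeof(*msg) - start);
}

/* Fill a message that lives in caller-provided, possibly dirty memory.
 * The tail is cleared last so that no stale bytes survive. */
static void build_msg(struct expire_msg *msg, uint64_t body, int hard)
{
	msg->body = body;
	msg->hard = hard ? 1 : 0;
	clear_tail_after_hard(msg);
}
```

If the tail were left alone, whatever the allocator last stored in
those bytes would be copied out along with the message.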

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Yasuaki Torimaru <yasuakitorimaru@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

modpost: Declare extra_warn with unused attribute

A recent strengthening of -Wunused-but-set-variable (enabled with -Wall)
in clang under a new subwarning, -Wunused-but-set-global, points out an
unused static global variable in scripts/mod/modpost.c:

  scripts/mod/modpost.c:59:13: error: variable 'extra_warn' set but not used [-Werror,-Wunused-but-set-global]
     59 | static bool extra_warn;
        |             ^

This variable has been unused since commit 6c6c1fc09de3 ("modpost:
require a MODULE_DESCRIPTION()"), but that is expected, as there are
currently no extra warnings at W=1. Declare the variable with
the unused attribute to make it clear to the compiler that this variable
may be unused.
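
A small sketch of the annotation, using the GCC/clang
__attribute__((unused)) spelling. The handle_option() function is a
hypothetical stand-in for modpost's option parsing and is not taken
from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Set when extra warnings are requested, but currently never read.
 * The attribute tells the compiler this is intentional. */
static bool extra_warn __attribute__((unused));

/* Hypothetical option handler: records that 'W' was passed. */
static void handle_option(int opt)
{
	if (opt == 'W')
		extra_warn = true;
}
```

The variable keeps its storage and its assignment; only the diagnostic
changes.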

Cc: stable@vger.kernel.org
Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
Link: https://patch.msgid.link/20260325-modpost-extra_warn-unused-but-set-global-v1-1-2e84003b7e81@kernel.org
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

kbuild: modules-cpio-pkg: Respect INSTALL_MOD_PATH

The modules-cpio-pkg target added in commit 2a9c8c0b59d3 ("kbuild: add
target to build a cpio containing modules") is incompatible with
initramfs with merged /lib and /usr/lib directories [1]. "/lib" cannot
be a link and directory at the same time.

Respect a non-empty INSTALL_MOD_PATH in the modules-cpio-pkg target so
that `make INSTALL_MOD_PATH=/usr modules-cpio-pkg` results in the same
module install location as `make INSTALL_MOD_PATH=/usr modules_install`.

Tested with Fedora distribution initramfs produced by dracut.

Link: https://systemd.io/THE_CASE_FOR_THE_USR_MERGE/ [1]
Fixes: 2a9c8c0b59d3 ("kbuild: add target to build a cpio containing modules")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Tested-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://patch.msgid.link/20260327-kbuild-modules-cpio-pkg-usr-merge-v3-1-ef507dfa006c@jannau.net
Signed-off-by: Nathan Chancellor <nathan@kernel.org>

gpu: nova-core: firmware: factor out an elf_str() function

Factor out a chunk of complexity into a new subroutine. This is an
incremental step in adding ELF32 support to the existing ELF64 section
support, for handling GPU firmware.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260326013902.588242-9-jhubbard@nvidia.com
[acourbot: use fuller prefix in commit message.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>

gpu: nova-core: firmware: move firmware image parsing code to firmware.rs

Up until now, only the GSP required parsing of its firmware headers.
However, upcoming support for Hopper/Blackwell+ adds another firmware
image (FMC), along with another format (ELF32).

Therefore, the current ELF64 section parsing support needs to be moved
up a level, so that both of the above can use it.

There are no functional changes. This is pure code movement.

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260326013902.588242-8-jhubbard@nvidia.com
[acourbot: use fuller prefix in commit message.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>

dts: riscv: spacemit: k3: add P1 PMIC regulator tree

Add the P1 PMIC's regulator topology tree for the pico-itx board.

Link: https://lore.kernel.org/r/20260327-02-k3-i2c-v2-1-9c6b374470c6@kernel.org
Signed-off-by: Yixun Lan <dlan@kernel.org>

dts: riscv: spacemit: k3: Add i2c nodes

Populate all I2C devicetree nodes for the SpacemiT K3 SoC. The i2c3
controller is reserved for the secure domain and is not available from
Linux. The i2c7 controller does not exist in hardware, as the vendor
directly names the I2C controller used for the PMIC i2c8.

Reviewed-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Link: https://lore.kernel.org/r/20260327-02-k3-i2c-v2-1-2119c0918868@kernel.org
Signed-off-by: Yixun Lan <dlan@kernel.org>

ksmbd: fix OOB write in QUERY_INFO for compound requests

When a compound request such as READ + QUERY_INFO(Security) is received,
and the first command (READ) consumes most of the response buffer,
ksmbd could write beyond the allocated buffer while building a security
descriptor.

The root cause was that smb2_get_info_sec() checked buffer space using
ppntsd_size from xattr, while build_sec_desc() often synthesized a
significantly larger descriptor from POSIX ACLs.

This patch introduces smb_acl_sec_desc_scratch_len() to accurately
compute the final descriptor size beforehand, performs proper buffer
checking with smb2_calc_max_out_buf_len(), and uses exact-sized
allocation + iov pinning.

Cc: stable@vger.kernel.org
Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound")
Signed-off-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

erofs: harden h_shared_count in erofs_init_inode_xattrs()

`u8 h_shared_count` indicates the shared xattr count of an inode. It is
read from the on-disk xattr ibody header, which should be considered
corrupted if the size of the shared xattr array exceeds the space
available in `xattr_isize`.

Since the image is already considered corrupted, this does not cause
harmful consequences (e.g. crashes), but it does result in the silent
processing of garbage metadata.

Let's harden it to report -EFSCORRUPTED earlier.
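
The check can be sketched as below. The 12-byte header size and 4-byte
shared xattr ID size are assumptions made for illustration, as are the
function name and the local SKETCH_EFSCORRUPTED constant (the kernel
defines EFSCORRUPTED as EUCLEAN, which is 117 on Linux):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration: a fixed ibody header followed by
 * `shared_count` 32-bit shared-xattr IDs. */
#define IBODY_HEADER_SIZE 12u
#define SHARED_ID_SIZE    4u

/* Local stand-in for the kernel's EFSCORRUPTED (EUCLEAN on Linux). */
#define SKETCH_EFSCORRUPTED 117

/* Reject a shared xattr count whose ID array would not fit inside the
 * inline xattr area of the inode. */
static int check_shared_count(uint8_t shared_count, uint32_t xattr_isize)
{
	uint32_t needed = IBODY_HEADER_SIZE +
			  (uint32_t)shared_count * SHARED_ID_SIZE;

	if (needed > xattr_isize)
		return -SKETCH_EFSCORRUPTED;
	return 0;
}
```

Because the count is a u8, the multiplication cannot overflow a 32-bit
value, so a single comparison suffices.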

Signed-off-by: Utkal Singh <singhutkal015@gmail.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

wifi: rtw88: coex: Ignore BT info byte 5 from RTL8821A

Sometimes while watching a YouTube video with Bluetooth headphones the
audio has a lot of interruptions, because the 5th byte of the BT info
sent by RTL8821AU has strange values, which result in
coex_stat->bt_hid_pair_num being 2 or 3. When this happens
rtw_coex_freerun_check() returns true, which causes
rtw_coex_action_wl_connected() to call rtw_coex_action_freerun() instead
of rtw_coex_action_bt_a2dp().

The RTL8821AU vendor driver doesn't do anything with the 5th byte of the
BT info, so ignore it here as well.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/bbf06c83-d2ee-4205-8fbb-829e2347586f@gmail.com

wifi: rtw89: fw: load TX power elements according to AID

Different A-dies have different TX power parameters. The FW element
header describes the corresponding A-die ID, so compare the runtime AID
with it to load the matching TX power parameters.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-9-pkshih@realtek.com

wifi: rtw89: phy: load RF parameters relying on ACV for RTL8922D

RF parameters are conditional formats with RFE type and CV as arguments,
but RTL8922D has many variants and uses ACV as the argument instead of
CV. Add ACV handling to select the proper register values.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-8-pkshih@realtek.com

wifi: rtw89: phy: expand PHY page for RTL8922D

The PHY page range defines the offset from PHY0 to PHY1, and RTL8922D
needs the page expanded to 0x2E0.

Signed-off-by: Eric Huang <echuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-7-pkshih@realtek.com

wifi: rtw89: mac: disable pre-load function for RTL8922DE

The pre-load function is a MAC function that pre-loads TX packets into
the WiFi device's memory to enhance performance. However, this function
has not been fully verified and fine-tuned on RTL8922DE, so disable it
temporarily.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-6-pkshih@realtek.com

wifi: rtw89: mac: add specific case to dump mac memory for RTL8922D

The RTL8922D can reuse most MAC memory addresses; only
RTW89_MAC_MEM_SECURITY_CAM differs from the existing one. Add a function
to return the specific memory address for RTL8922D.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-5-pkshih@realtek.com

wifi: rtw89: pci: clear SER ISR when initial and leaving WoWLAN for WiFi 7 chips

The PCIE SER diagnoses abnormal PCIE conditions, relying on IMR settings
to trigger an interrupt when the status is abnormal. Update the settings
to disable PHY error flag 9, and clear the ISR on initialization and
when leaving WoWLAN.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-4-pkshih@realtek.com

wifi: rtw89: wow: enable MLD address for Magic packet wakeup

Under MLO connections, the original Magic Packet configuration
only supported Link Addresses for wakeup. Update the setting
to support both MLD Address and Link Addresses for wakeup process.

Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-3-pkshih@realtek.com

wifi: rtw89: wow: use struct style to fill WOW wakeup control H2C command

The WOW wakeup control command is used to tell firmware the content
of wakeup feature. Use struct instead of macros to fill the data.

Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260325072130.41751-2-pkshih@realtek.com

wifi: rtw89: 8922d: add set channel of RF part

The RF part of set channel configures the channel and bandwidth in a
register. The function that encodes channel and bandwidth into the
register value will be implemented by an upcoming patch.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-8-pkshih@realtek.com

wifi: rtw89: 8922d: add set channel of BB part

The BB part of set channel is the main part. According to channel and
bandwidth, it configures CCK SCO, the RX gain of LNA and TIA programmed
in efuse for CCK and OFDM rates, and spur elimination of CSI and NBI
tones.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-7-pkshih@realtek.com

wifi: rtw89: 8922d: add set channel of MAC part

Set channel is a key function that switches to a specific operating
channel. For the MAC part, configure hardware according to channel
bandwidth, and enable CCK rate for the 2GHz band only.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-6-pkshih@realtek.com

wifi: rtw89: 8922d: read and configure RF by calibration data from efuse physical map

The calibration data comes from the physical map and includes 1) thermal
trim, to align the output thermal value across chips, and 2) PA bias, to
transmit the power expected by the controller.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-5-pkshih@realtek.com

wifi: rtw89: 8922d: define efuse map and read necessary fields

Define the specific efuse map for RTL8922D, including TSSI, RX gain, MAC
address, RFE type, etc. The additional fields compared to existing
chips are BT setting (defines BT switch GPIO, antenna number, etc.) and
gain offset2 (defines more fields like the existing RX gain offset).

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-4-pkshih@realtek.com

hwmon: (ltc4286) Add missing MODULE_IMPORT_NS("PMBUS")

ltc4286.c uses PMBus core symbols exported in the PMBUS namespace,
such as pmbus_do_probe(), but does not declare MODULE_IMPORT_NS("PMBUS").

Add the missing namespace import to avoid modpost warnings.

Fixes: 0c459759ca97 ("hwmon: (pmbus) Add ltc4286 driver")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260329170925.34581-5-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (pxe1610) Check return value of page-select write in probe

pxe1610_probe() writes PMBUS_PAGE to select page 0 but does not check
the return value. If the write fails, subsequent register reads operate
on an indeterminate page, leading to silent misconfiguration.

Check the return value and propagate the error using dev_err_probe(),
which also handles -EPROBE_DEFER correctly without log spam.

Fixes: 344757bac526 ("hwmon: (pmbus) Add Infineon PXE1610 VR driver")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260329170925.34581-4-sanman.pradhan@hpe.com
[groeck: Fix "Fixes" SHA]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (tps53679) Fix array access with zero-length block read

i2c_smbus_read_block_data() can return 0, indicating a zero-length
read. When this happens, tps53679_identify_chip() accesses buf[ret - 1]
which is buf[-1], reading one byte before the buffer on the stack.

Fix by changing the check from "ret < 0" to "ret <= 0", treating a
zero-length read as an error (-EIO), which prevents the out-of-bounds
array access.
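
A minimal sketch of the corrected check, assuming a hypothetical helper
device_id_len() that receives the return value of the block read (the
driver's actual code is structured differently). With ret >= 1
guaranteed, indexing buf[ret - 1] stays inside the buffer:

```c
#include <assert.h>
#include <errno.h>

/* `ret` is the result of a block read: negative on error, otherwise the
 * number of bytes placed in `buf`. Returns the ID length with any
 * trailing NUL dropped, or a negative errno. */
static int device_id_len(const unsigned char *buf, int ret)
{
	/* Treat a zero-length read as an I/O error. */
	if (ret <= 0)
		return ret < 0 ? ret : -EIO;

	/* Safe: ret >= 1, so buf[ret - 1] is inside the buffer. */
	return buf[ret - 1] != '\0' ? ret : ret - 1;
}
```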

Also fix a typo in the adjacent comment: "if present" instead of
duplicate "if".

Fixes: 75ca1e5875fe ("hwmon: (pmbus/tps53679) Add support for TPS53685")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260329170925.34581-2-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

wifi: rtw89: 8922d: add power on/off functions

The power on function is the first entry point. It powers on the
hardware, including all MAC/BB/RF circuits, after which high-level
operations such as WiFi scan and connection become possible.

If the connection becomes unavailable, the device enters idle mode and
calls the power off function to cut power.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-3-pkshih@realtek.com

wifi: rtw89: 8922d: add definition of quota, registers and efuse block

The quota is used to configure memory size for TX/RX, and the definition
of registers includes H2C command, C2H event, WoWLAN reason, IMR of CMAC
and DMAC, ACK rate selector, RF kill GPIO, and BB functions of dynamic
initial gain and EDCCA. The layout of efuse block is to define logic
map of efuse, such as MAC address and RF calibration values.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324062049.52266-2-pkshih@realtek.com

wifi: rtw88: validate RX rate to prevent out-of-bound

The reported RX rate might be unexpected, causing kernel warnings:

  Rate marked as a VHT rate but data is invalid: MCS: 0, NSS: 0
  WARNING: net/mac80211/rx.c:5491 at ieee80211_rx_list+0x183/0x1020 [mac80211]

As the RX rate can be used as an array index under certain conditions,
validate it to prevent potential out-of-bounds array access.
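
The general shape of such a check is sketched below. The table
contents, rate_lookup() and RATE_COUNT are hypothetical and do not
reflect rtw88's actual rate tables:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lookup table indexed by a hardware-reported rate code. */
static const unsigned int rate_kbps[] = { 1000, 2000, 5500, 11000 };
#define RATE_COUNT (sizeof(rate_kbps) / sizeof(rate_kbps[0]))

/* Look up the rate only if the reported index is in range; otherwise
 * report failure without touching the table or the output. */
static bool rate_lookup(unsigned int reported, unsigned int *kbps)
{
	if (reported >= RATE_COUNT)
		return false;

	*kbps = rate_kbps[reported];
	return true;
}
```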

Tested on HP Notebook P3S95EA#ACB (kernel 6.19.9-1-cachyos):

  - No WARNING: net/mac80211/rx.c:5491 observed after the v2 patch.
    The unexpected `NSS: 0, MCS: 0` VHT rate warnings are successfully
    mitigated.
  - The system remains fully stable through prolonged idle periods,
    high network load, active Bluetooth A2DP usage, and multiple deep
    suspend/resume cycles.
  - Zero h2c timeouts or firmware lps state errors observed in dmesg.

Reported-by: Oleksandr Havrylov <goainwo@gmail.com>
Closes: https://lore.kernel.org/linux-wireless/CALdGYqSMUPnPfW-_q1RgYr0_SjoXUejAaJJr-o+jpwCk1S7ndQ@mail.gmail.com/
Tested-by: Oleksandr Havrylov <goainwo@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260324011001.5742-1-pkshih@realtek.com

wifi: rtw89: phy: fix uninitialized variable access in rtw89_phy_cfo_set_crystal_cap()

In the rtw89_phy_cfo_set_crystal_cap() function, for chips other than
RTL8852A/RTL8851B, the values read by rtw89_mac_read_xtal_si() are
stored into the local variables sc_xi_val and sc_xo_val. If either
read fails, these variables remain uninitialized, yet they are later
used to update cfo->crystal_cap and in debug print statements. This
can lead to undefined behavior.

Fix the issue by initializing sc_xi_val and sc_xo_val to zero,
as is done in the vendor driver.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 8379fa611536 ("rtw89: 8852c: add write/read crystal function in CFO tracking")
Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260323140613.1615574-1-a.velichayshiy@ispras.ru

wifi: rtw89: Add support for Buffalo WI-U3-2400XE2

Add the ID 0411:03a6 to the table to support an additional RTL8832CU
adapter: Buffalo WI-U3-2400XE2.

Link: https://github.com/morrownr/rtw89/commit/506d193b8cb7d6394509aebcf8de1531629f6100
Signed-off-by: Zenm Chen <zenmchen@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260320154136.5750-1-zenmchen@gmail.com

wifi: rtw89: Add support for TP-Link Archer TX50U

Add the ID 37ad:0103 to the table to support an additional RTL8832CU
adapter: TP-Link Archer TX50U.

Link: https://github.com/morrownr/rtl8852cu-20251113/issues/2
Signed-off-by: Zenm Chen <zenmchen@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260320093122.6754-1-zenmchen@gmail.com

wifi: rtw89: retry efuse physical map dump on transient failure

On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
physical map dump intermittently fails with -EBUSY during probe.
The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
bit after 1 second.

The root cause is a timing race during boot: the WiFi driver's
chip initialization (firmware download via PCIe) overlaps with
Bluetooth firmware download to the same combo chip via USB. This
can leave the efuse controller temporarily unavailable when the
WiFi driver attempts to read the efuse map.

The firmware download path retries up to 5 times, but the efuse
read that follows has no similar logic. Address this by adding
retry loop logic (also up to 5 attempts) around physical efuse
map dump.
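
A bounded retry loop of this kind can be sketched as follows. The
callback type, dump_with_retry() and the flaky_dump() test double are
hypothetical; only the attempt count of 5 comes from the description
above:

```c
#include <assert.h>
#include <errno.h>

#define EFUSE_DUMP_RETRIES 5 /* attempt count named in the description */

/* Call `dump` until it succeeds or the attempts run out. Returns 0 on
 * success, otherwise the error from the last attempt. */
static int dump_with_retry(int (*dump)(void *ctx), void *ctx)
{
	int ret = -EBUSY;
	int attempt;

	for (attempt = 0; attempt < EFUSE_DUMP_RETRIES; attempt++) {
		ret = dump(ctx);
		if (ret == 0)
			break;
	}
	return ret;
}

/* Hypothetical flaky operation for demonstration: fails with -EBUSY
 * until the counter in `ctx` reaches zero. */
static int flaky_dump(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EBUSY;
	}
	return 0;
}
```

The sketch retries on any error and adds no delay between attempts;
whether the driver does either is not stated here.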

Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260317112155.1939569-1-christianshewitt@gmail.com

wifi: rtw88: TX QOS Null data the same way as Null data

When filling out the TX descriptor, Null data frames are treated like
management frames, but QOS Null data frames are treated like normal
data frames. Somehow this causes a problem for the firmware.

When connected to a network in the 2.4 GHz band, wpa_supplicant (or
NetworkManager?) triggers a scan every five minutes. During these scans
mac80211 transmits many QOS Null frames in quick succession. Because
these frames are marked with IEEE80211_TX_CTL_REQ_TX_STATUS, rtw88
asks the firmware to report the TX ACK status for each of these frames.
Sometimes the firmware can't process the TX status requests quickly
enough, they add up, it only processes some of them, and then marks
every subsequent TX status report with the wrong number.

The symptom is that after a while the warning "failed to get tx report
from firmware" appears every five minutes.

This problem apparently happens only with the older RTL8723D, RTL8821A,
RTL8812A, and probably RTL8703B chips.

Treat QOS Null data frames the same way as Null data frames. This seems
to avoid the problem.

Tested with RTL8821AU, RTL8723DU, RTL8811CU, and RTL8812BU.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/2b53fb0d-b1ed-47b6-8caa-2bb9ae2acb80@gmail.com

wifi: rtw88: add quirks to disable PCI ASPM and deep LPS for HP P3S95EA#ACB

On an HP laptop (P3S95EA#ACB) equipped with a Realtek RTL8821CE 802.11ac
PCIe adapter (PCI ID: 10ec:c821), the system experiences a hard lockup
(complete freeze of the UI and kernel, sysrq doesn't work, requires
holding the power button) when the WiFi adapter enters the power
saving state. Disable PCI ASPM to avoid system freeze.

In addition, the driver periodically prints the messages below. Though
this doesn't always cause an unstable connection, missing H2C commands
might cause unpredictable results. Disable deep LPS to avoid this as
well.

rtw88_8821ce 0000:13:00.0: firmware failed to leave lps state
rtw88_8821ce 0000:13:00.0: failed to send h2c command
rtw88_8821ce 0000:13:00.0: failed to send h2c command

Tested on HP Notebook P3S95EA#ACB (kernel 6.19.7-1-cachyos):

  - No hard freeze observed during idle or active usage.
  - Zero h2c or lps errors in dmesg across idle (10 min),
    load stress (100MB download), and suspend/resume cycle.
  - Both quirk flags confirmed active via sysfs without any
    manual modprobe parameters.

Reported-by: Oleksandr Havrylov <goainwo@gmail.com>
Closes: https://lore.kernel.org/linux-wireless/CALdGYqSQ1Ko2TTBhUizMu_FvLMUAuQfFrVwS10n_C-LSQJQQkQ@mail.gmail.com/
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Tested-by: Oleksandr Havrylov <goainwo@gmail.com>
Link: https://patch.msgid.link/20260316035635.16550-1-pkshih@realtek.com

SUNRPC: xdr.h: fix all kernel-doc warnings

Correct a function parameter name (s/page/folio/) and add function
return value sections for multiple functions to eliminate
kernel-doc warnings:

Warning: include/linux/sunrpc/xdr.h:298 function parameter 'folio' not
described in 'xdr_set_scratch_folio'
Warning: include/linux/sunrpc/xdr.h:337 No description found for return
value of 'xdr_stream_remaining'
Warning: include/linux/sunrpc/xdr.h:357 No description found for return
value of 'xdr_align_size'
Warning: include/linux/sunrpc/xdr.h:374 No description found for return
value of 'xdr_pad_size'
Warning: include/linux/sunrpc/xdr.h:387 No description found for return
value of 'xdr_stream_encode_item_present'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Factor out WR chain linking into helper

svc_rdma_prepare_write_chunk() and svc_rdma_prepare_reply_chunk()
contain identical code for linking RDMA R/W work requests onto a
Send context's WR chain. This duplication increases maintenance
burden and risks divergent bug fixes.

Introduce svc_rdma_cc_link_wrs() to consolidate the WR chain
linking logic. The helper walks the chunk context's rwctxts list,
chains each WR via rdma_rw_ctx_wrs(), and updates the Send
context's chain head and SQE count. Completion signaling is
requested only for the tail WR (posted first).

No functional change.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add Write chunk WRs to the RPC's Send WR chain

Previously, Write chunk RDMA Writes were posted via a separate
ib_post_send() call with their own completion handler. Each Write
chunk incurred a doorbell and generated a completion event.

Link Write chunk WRs onto the RPC Reply's Send WR chain so that a
single ib_post_send() call posts both the RDMA Writes and the Send
WR. A single completion event signals that all operations have
finished. This reduces both doorbell rate and completion rate, as
well as eliminating the latency of a round-trip between the Write
chunk completion and the subsequent Send WR posting.

The lifecycle of Write chunk resources changes: previously, the
svc_rdma_write_done() completion handler released Write chunk
resources when RDMA Writes completed. With WR chaining, resources
remain live until the Send completion. A new sc_write_info_list
tracks Write chunk metadata attached to each Send context, and
svc_rdma_write_chunk_release() frees these resources when the
Send context is released.

The svc_rdma_write_done() handler now handles only error cases.
On success it returns immediately since the Send completion handles
resource release. On failure (WR flush), it closes the connection
to signal to the client that the RPC Reply is incomplete.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up use of rdma->sc_pd->device

I can't think of a reason why svcrdma is using the PD's device. Most
other consumers of the IB DMA API use the ib_device pointer from the
connection's rdma_cm_id.

I don't think there's any functional difference between the two, but
it is a little confusing to see some uses of rdma_cm_id and some of
ib_pd.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up use of rdma->sc_pd->device in Receive paths

I can't think of a reason why svcrdma is using the PD's device. Most
other consumers of the IB DMA API use the ib_device pointer from the
connection's rdma_cm_id.

I don't believe there's any functional difference between the two,
but it is a little confusing to see some uses of rdma_cm_id->device
and some of ib_pd->device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add fair queuing for Send Queue access

When the Send Queue fills, multiple threads may wait for SQ slots.
The previous implementation had no ordering guarantee, allowing
starvation when one thread repeatedly acquires slots while others
wait indefinitely.

Introduce a ticket-based fair queuing system. Each waiter takes a
ticket number and is served in FIFO order. This ensures forward
progress for all waiters when SQ capacity is constrained.

The implementation has two phases:
1. Fast path: attempt to reserve SQ slots without waiting
2. Slow path: take a ticket, wait for turn, then wait for slots

The ticket system adds two atomic counters to the transport:
- sc_sq_ticket_head: next ticket to issue
- sc_sq_ticket_tail: ticket currently being served

A dedicated wait queue (sc_sq_ticket_wait) handles ticket
ordering, separate from sc_send_wait which handles SQ capacity.
This separation ensures that send completions (the high-frequency
wake source) wake only the current ticket holder rather than all
queued waiters. Ticket handoff wakes only the ticket wait queue,
and each ticket holder that exits via connection close propagates
the wake to the next waiter in line.

When a waiter successfully reserves slots, it advances the tail
counter and wakes the next waiter. This creates an orderly handoff
that prevents starvation while maintaining good throughput on the
fast path when contention is low.
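
The counter handling can be sketched with C11 atomics as below. This is
a simplified, non-blocking illustration: struct sq_gate and all
function names are hypothetical, and the wait queues and wake-ups
described above are left out, so a caller would poll sq_is_my_turn()
where the real code sleeps:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical queue state; field names are illustrative. */
struct sq_gate {
	atomic_int avail;        /* free Send Queue slots */
	atomic_uint ticket_head; /* next ticket to issue */
	atomic_uint ticket_tail; /* ticket currently being served */
};

static void sq_gate_init(struct sq_gate *g, int slots)
{
	atomic_init(&g->avail, slots);
	atomic_init(&g->ticket_head, 0);
	atomic_init(&g->ticket_tail, 0);
}

/* Fast path: try to take `n` slots; undo if too few are free. */
static bool sq_try_reserve(struct sq_gate *g, int n)
{
	if (atomic_fetch_sub(&g->avail, n) >= n)
		return true;
	atomic_fetch_add(&g->avail, n);
	return false;
}

/* Slow path, step 1: join the FIFO. */
static unsigned int sq_take_ticket(struct sq_gate *g)
{
	return atomic_fetch_add(&g->ticket_head, 1);
}

/* Slow path, step 2: only the holder of the tail ticket may proceed. */
static bool sq_is_my_turn(struct sq_gate *g, unsigned int ticket)
{
	return atomic_load(&g->ticket_tail) == ticket;
}

/* Called by the ticket holder once it has its slots (or gives up):
 * hand the turn to the next waiter. */
static void sq_pass_turn(struct sq_gate *g)
{
	atomic_fetch_add(&g->ticket_tail, 1);
}

/* Return `n` slots, as a send completion would. */
static void sq_release(struct sq_gate *g, int n)
{
	atomic_fetch_add(&g->avail, n);
}
```

Keeping the ticket check separate from the slot check mirrors the two
wait conditions described above: whose turn it is, and whether capacity
exists.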

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Optimize rq_respages allocation in svc_alloc_arg

svc_alloc_arg() invokes alloc_pages_bulk() with the full rq_maxpages
count (~259 for 1MB messages) for the rq_respages array, causing a
full-array scan despite most slots holding valid pages.

svc_rqst_release_pages() NULLs only the range

[rq_respages, rq_next_page)

after each RPC, so only that range contains NULL entries. Limit the
rq_respages fill in svc_alloc_arg() to that range instead of
scanning the full array.

svc_init_buffer() initializes rq_next_page to span the entire
rq_respages array, so the first svc_alloc_arg() call fills all
slots.
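
A rough userspace model of the idea, for illustration only: the
refill loop visits just the half-open range that was released. The
helper refill_range() and PAGE_SIZE_MODEL are hypothetical, and
malloc() stands in for the kernel's bulk page allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_MODEL 4096	/* assumed page size for the model */

/*
 * Fill every NULL slot in [first, next) and leave populated slots
 * alone. Returns the number of slots filled; stops early if an
 * allocation fails.
 */
static size_t refill_range(void **first, void **next)
{
	size_t filled = 0;

	for (void **pp = first; pp < next; pp++) {
		if (*pp)
			continue;
		*pp = malloc(PAGE_SIZE_MODEL);
		if (!*pp)
			break;
		filled++;
	}
	return filled;
}
```

When the range initially spans the whole array, the first call
populates every slot; later calls touch only the slots that the
previous request released.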

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Track consumed rq_pages entries

The rq_pages array holds pages allocated for incoming RPC requests.
Two transport receive paths NULL entries in rq_pages to prevent
svc_rqst_release_pages() from freeing pages that the transport has
taken ownership of:

- svc_tcp_save_pages() moves partial request data pages to
svsk->sk_pages during multi-fragment TCP reassembly.

- svc_rdma_clear_rqst_pages() moves request data pages to
head->rc_pages because they are targets of active RDMA Read WRs.

A new rq_pages_nfree field in struct svc_rqst records how many
entries were NULLed. svc_alloc_arg() uses it to refill only those
entries rather than scanning the full rq_pages array. In steady
state, the transport NULLs a handful of entries per RPC, so the
allocator visits only those entries instead of the full ~259 slots
(for 1MB messages).

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: preserve rq_next_page in svc_rdma_save_io_pages

svc_rdma_save_io_pages() transfers response pages to the send
context and sets those slots to NULL. It then resets rq_next_page to
equal rq_respages, hiding the NULL region from
svc_rqst_release_pages().

Now that svc_rqst_release_pages() handles NULL entries, this reset
is no longer necessary. Removing it preserves the invariant that the
range [rq_respages, rq_next_page) accurately describes how many
response pages were consumed, enabling a subsequent optimization in
svc_alloc_arg() that refills only the consumed range.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Handle NULL entries in svc_rqst_release_pages

svc_rqst_release_pages() releases response pages between rq_respages
and rq_next_page. It currently passes the entire range to
release_pages(), which does not expect NULL entries.

A subsequent patch preserves the rq_next_page pointer in
svc_rdma_save_io_pages() so that it accurately records how many
response pages were consumed. After that change, the range

[rq_respages, rq_next_page)

can contain NULL entries where pages have already been transferred
to a send context.

Iterate through the range entry by entry, skipping NULLs, to handle
this case correctly.
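
As a sketch of the loop shape (userspace model, not the kernel
code): page_model, put_page_model() and release_range() are
hypothetical stand-ins for struct page, the page reference drop, and
the release path.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; only a refcount is modeled. */
struct page_model {
	int refcount;
};

static void put_page_model(struct page_model *p)
{
	p->refcount--;
}

/*
 * Walk [first, next) entry by entry. NULL slots (pages already handed
 * to a send context in the real code) are skipped; every other slot is
 * released and then cleared. Returns the number of pages released.
 */
static size_t release_range(struct page_model **first,
			    struct page_model **next)
{
	size_t released = 0;

	for (struct page_model **pp = first; pp < next; pp++) {
		if (!*pp)
			continue;
		put_page_model(*pp);
		*pp = NULL;
		released++;
	}
	return released;
}
```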

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Allocate a separate Reply page array

struct svc_rqst uses a single dynamically-allocated page array
(rq_pages) for both the incoming RPC Call message and the outgoing
RPC Reply message. rq_respages is a sliding pointer into rq_pages
that each transport receive path must compute based on how many
pages the Call consumed. This boundary tracking is a source of
confusion and bugs, and prevents an RPC transaction from having
both a large Call and a large Reply simultaneously.

Allocate rq_respages as its own page array, eliminating the boundary
arithmetic. This decouples Call and Reply buffer lifetimes,
following the precedent set by rq_bvec (a separate dynamically-
allocated array for I/O vectors).

Each svc_rqst now pins twice as many pages as before. For a server
running 16 threads with a 1MB maximum payload, the additional cost
is roughly 16MB of pinned memory. The new dynamic svc thread count
facility keeps this overhead minimal on an idle server. A subsequent
patch in this series limits per-request repopulation to only the
pages released during the previous RPC, avoiding a full-array scan
on each call to svc_alloc_arg().

Note: We've considered several alternatives to maintaining a full
second array. Each alternative reintroduces either boundary logic
complexity or I/O-path allocation pressure.

rq_next_page is initialized in svc_alloc_arg() and svc_process()
during Reply construction, and in svc_rdma_recvfrom() as a
precaution on error paths. Transport receive paths no longer compute
it from the Call size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Tighten bounds checking in svc_rqst_replace_page

svc_rqst_replace_page() builds the Reply buffer by advancing
rq_next_page through the response page range. The bounds
check validates rq_next_page against the full rq_pages array,
but the valid range for rq_next_page is

[rq_respages, rq_page_end].

Use those bounds instead.

This is correct today because rq_respages and rq_page_end
both point into rq_pages, and it prepares for a subsequent
change that separates the Reply page array from rq_pages.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Sign filehandles

NFS clients may bypass restrictive directory permissions by using
open_by_handle() (or another available OS system call) to guess the
filehandles for files below that directory.

In order to harden knfsd servers against this attack, create a method to
sign and verify filehandles using SipHash-2-4 as a MAC (Message
Authentication Code).  According to
https://cr.yp.to/siphash/siphash-20120918.pdf, SipHash can be used as a
MAC, and our use of SipHash-2-4 provides a low 1 in 2^64 chance of forgery.

Filehandles that have been signed cannot be tampered with, nor can
clients reasonably guess correct filehandles and hashes that may exist in
parts of the filesystem they cannot access due to directory permissions.

Append the 8-byte SipHash to encoded filehandles for exports that have set
the "sign_fh" export option.  Filehandles received from clients are
verified by comparing the appended hash to the expected hash.  If the MAC
does not match, the server responds with NFS error _STALE.  Unsigned
filehandles received for an export with "sign_fh" are likewise rejected
with NFS error _STALE.
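
The append-and-verify flow can be sketched in portable C as follows.
This is an illustration under assumptions, not the NFSD code: the
kernel has its own siphash helpers, and fh_sign(), fh_verify() and
FH_MAC_LEN are hypothetical names. Which filehandle bytes the real
MAC covers, and its byte order on the wire, are also assumed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FH_MAC_LEN 8	/* 8-byte SipHash appended to the filehandle */

static uint64_t rotl64(uint64_t x, int b)
{
	return (x << b) | (x >> (64 - b));
}

static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
	v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* SipHash-2-4 with a 128-bit key and a 64-bit result. */
static uint64_t siphash24(const uint8_t *in, size_t len, const uint8_t key[16])
{
	uint64_t k0 = load_le64(key), k1 = load_le64(key + 8);
	uint64_t v[4] = {
		0x736f6d6570736575ULL ^ k0, 0x646f72616e646f6dULL ^ k1,
		0x6c7967656e657261ULL ^ k0, 0x7465646279746573ULL ^ k1,
	};
	uint64_t b = (uint64_t)len << 56;
	size_t i, full = len & ~(size_t)7;

	for (i = 0; i < full; i += 8) {		/* whole 8-byte words */
		uint64_t m = load_le64(in + i);

		v[3] ^= m; sipround(v); sipround(v); v[0] ^= m;
	}
	for (i = full; i < len; i++)		/* trailing bytes */
		b |= (uint64_t)in[i] << (8 * (i - full));
	v[3] ^= b; sipround(v); sipround(v); v[0] ^= b;
	v[2] ^= 0xff;
	sipround(v); sipround(v); sipround(v); sipround(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}

/* Append the MAC to fh; returns the signed length, or 0 if it won't fit. */
static size_t fh_sign(uint8_t *fh, size_t len, size_t cap, const uint8_t key[16])
{
	uint64_t mac;

	if (len + FH_MAC_LEN > cap)
		return 0;
	mac = siphash24(fh, len, key);
	for (int i = 0; i < FH_MAC_LEN; i++)
		fh[len + i] = (uint8_t)(mac >> (8 * i));
	return len + FH_MAC_LEN;
}

/* Recompute the MAC over the body and compare with the appended one. */
static bool fh_verify(const uint8_t *fh, size_t len, const uint8_t key[16])
{
	if (len <= FH_MAC_LEN)
		return false;	/* the caller would answer with _STALE */
	return load_le64(fh + len - FH_MAC_LEN) ==
	       siphash24(fh, len - FH_MAC_LEN, key);
}
```

A failed fh_verify() corresponds to the _STALE response described
above. The plain integer comparison keeps the sketch short; whether
the real code needs a constant-time comparison is not addressed here.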

Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD/export: Add sign_fh export option

In order to signal that filehandles on this export should be signed, add a
"sign_fh" export option. Filehandle signing can help the server defend
against certain filehandle guessing attacks.

Setting the "sign_fh" export option sets NFSEXP_SIGN_FH. A subsequent patch
makes NFSD use this flag to append a MAC to filehandles for that export.

While we're in here, tidy a few stray expflags so that they align more
closely with the export flag order.

Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com
Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Add a key for signing filehandles

A future patch will enable NFSD to sign filehandles by appending a Message
Authentication Code (MAC).  To do this, NFSD requires a secret 128-bit key
that can persist across reboots.  A persisted key allows the server to
accept filehandles after a restart.  Enable NFSD to be configured with this
key via the netlink interface.

Link: https://lore.kernel.org/linux-nfs/cover.1772022373.git.bcodding@hammerspace.com
Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd: use dynamic allocation for oversized NFSv4.0 replay cache

Commit 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK
replay cache") capped the replay cache copy at NFSD4_REPLAY_ISIZE
to prevent a heap overflow, but set rp_buflen to zero when the
encoded response exceeded the inline buffer. A retransmitted LOCK
reaching the replay path then produced only a status code with no
operation body, resulting in a malformed XDR response.

When the encoded response exceeds the 112-byte inline rp_ibuf, a
buffer is kmalloc'd to hold it. If the allocation fails, rp_buflen
remains zero, preserving the behavior from the capped-copy fix.
The buffer is freed when the stateowner is released or when a
subsequent operation's response fits in the inline buffer.
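
A simplified userspace model of that buffer handling, for
illustration only: replay_model, replay_store() and replay_release()
are hypothetical names, and malloc()/free() stand in for the kernel
allocator. The sketch drops any earlier heap buffer on every store,
which is a simplification of the lifetime rules described above.

```c
#include <stdlib.h>
#include <string.h>

#define REPLAY_ISIZE 112	/* size of the inline buffer */

struct replay_model {
	unsigned int buflen;	/* 0 means nothing is cached */
	char *buf;		/* points at ibuf or at a heap buffer */
	char ibuf[REPLAY_ISIZE];
};

static void replay_init(struct replay_model *rp)
{
	rp->buflen = 0;
	rp->buf = rp->ibuf;
}

/* Drop any heap buffer and fall back to the inline one. */
static void replay_release(struct replay_model *rp)
{
	if (rp->buf != rp->ibuf)
		free(rp->buf);
	rp->buf = rp->ibuf;
	rp->buflen = 0;
}

/*
 * Cache an encoded response. Oversized responses get a heap buffer;
 * if that allocation fails, buflen stays zero, so a replay carries
 * only the status, as it did after the capped-copy fix.
 */
static void replay_store(struct replay_model *rp, const void *data,
			 unsigned int len)
{
	replay_release(rp);
	if (len > REPLAY_ISIZE) {
		char *big = malloc(len);

		if (!big)
			return;
		rp->buf = big;
	}
	memcpy(rp->buf, data, len);
	rp->buflen = len;
}
```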

Fixes: 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK replay cache")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd: convert global state_lock to per-net deleg_lock

Replace the global state_lock spinlock with a per-nfsd_net deleg_lock.
The state_lock was only used to protect delegation lifecycle operations
(the del_recall_lru list and delegation hash/unhash), all of which are
scoped to a single network namespace. Making the lock per-net removes
a source of unnecessary contention between containers.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: split cache_detail queue into request and reader lists

Replace the single interleaved queue (which mixed cache_request and
cache_reader entries distinguished by a ->reader flag) with two
dedicated lists: cd->requests for upcall requests and cd->readers
for open file handles.

Readers now track their position via a monotonically increasing
sequence number (next_seqno) rather than by their position in the
shared list. Each cache_request is assigned a seqno when enqueued,
and a new cache_next_request() helper finds the next request at or
after a given seqno.
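
The seqno-based lookup can be modeled in a few lines of userspace C.
This is a sketch under assumptions, not the sunrpc code: the
model_* names, the seq_counter field and the singly linked list are
hypothetical, and locking is omitted.

```c
#include <stddef.h>
#include <stdint.h>

struct req_model {
	uint64_t seqno;			/* assigned at enqueue time */
	struct req_model *next;
};

struct reader_model {
	uint64_t next_seqno;		/* first seqno not yet consumed */
};

struct cache_model {
	struct req_model *head, *tail;	/* pending upcall requests */
	uint64_t seq_counter;		/* hypothetical seqno source */
};

static void model_enqueue(struct cache_model *cd, struct req_model *rq)
{
	rq->seqno = cd->seq_counter++;
	rq->next = NULL;
	if (cd->tail)
		cd->tail->next = rq;
	else
		cd->head = rq;
	cd->tail = rq;
}

/* Remove the oldest request, e.g. once it has been answered. */
static struct req_model *model_dequeue(struct cache_model *cd)
{
	struct req_model *rq = cd->head;

	if (rq) {
		cd->head = rq->next;
		if (!cd->head)
			cd->tail = NULL;
	}
	return rq;
}

/* Find the first pending request at or after the given seqno. */
static struct req_model *model_next_request(struct cache_model *cd,
					    uint64_t seqno)
{
	for (struct req_model *rq = cd->head; rq; rq = rq->next)
		if (rq->seqno >= seqno)
			return rq;
	return NULL;
}

/* A reader consumes one request and advances past it. */
static struct req_model *model_read(struct cache_model *cd,
				    struct reader_model *rd)
{
	struct req_model *rq = model_next_request(cd, rd->next_seqno);

	if (rq)
		rd->next_seqno = rq->seqno + 1;
	return rq;
}
```

Because a reader holds only a number, requests can leave the list
between reads without the reader needing a placeholder entry in it.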

This eliminates the cache_queue wrapper struct entirely, simplifies
the reader-skipping loops in cache_read/cache_poll/cache_ioctl/
cache_release, and makes the data flow easier to reason about.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: convert queue_wait from global to per-cache-detail waitqueue

The queue_wait waitqueue is currently a file-scoped global, so a
wake_up for one cache_detail wakes pollers on all caches. Convert it
to a per-cache-detail field so that only pollers on the relevant cache
are woken.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: convert queue_lock from global spinlock to per-cache-detail lock

The global queue_lock serializes all upcall queue operations across
every cache_detail instance. Convert it to a per-cache-detail spinlock
so that different caches (e.g. auth.unix.ip vs nfsd.fh) no longer
contend with each other on queue operations.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: Add XPT flags missing from SVC_XPRT_FLAG_LIST

Commit eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection
limiting calculations.") and commit 898374fdd7f0 ("nfsd: unregister
with rpcbind when deleting a transport") added new XPT flags but
neglected to update the show_svc_xprt_flags() macro.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Remove dead code from fs/lockd/xdr4.c

Now that all NLMv4 server-side procedures use XDR encoder and
decoder functions generated by xdrgen, the hand-written code in
fs/lockd/xdr4.c is no longer needed. This file contained the
original XDR processing logic that has been systematically
replaced throughout this series.

Remove the file and its Makefile reference to eliminate the
dead code. The helper function nlm4svc_set_file_lock_range()
is still needed by the generated code, so move it to xdr4.h
as an inline function where it remains accessible.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Remove C macros that are no longer used

The conversion of all NLMv4 procedures to xdrgen-generated
XDR functions is complete. The hand-rolled XDR size
calculation macros (Ck, No, St, Rg) and the nlm_void
structure definition served only the older implementations
and are now unused.

Also remove NLMDBG_FACILITY, which was set to the client
debug flag in server-side code but never referenced, and
correct a comment to specify "NLMv4 Server procedures".

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Add LOCKD_SHARE_SVID constant for DOS sharing mode

Replace the magic value ~(u32)0 with a named constant. This value
is used as a synthetic svid when looking up lockowners for DOS
share operations, which have no real process ID associated with
them.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 FREE_ALL procedure

With all other NLMv4 procedures now converted to xdrgen-generated
XDR functions, the FREE_ALL procedure can be converted as well.
This conversion allows the removal of nlm4svc_retrieve_args(),
a 79-line helper function that was used only by FREE_ALL to
retrieve client information from lockd's internal data
structures.

Replace the NLMPROC4_FREE_ALL entry in the nlm_procedures4
array with an entry that uses xdrgen-built XDR decoders and
encoders. The procedure handler is updated to use the new
wrapper structure (nlm4_notify_wrapper) and call
nlm4svc_lookup_host() directly, eliminating the need for the
now-removed helper function.

The .pc_argzero field is set to zero because xdrgen decoders
fully initialize all fields in argp->xdrgen, making the early
defensive memset unnecessary. The remaining argp fields that
fall outside the xdrgen structures are cleared explicitly as
needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 NM_LOCK procedure

Now that nlm4svc_do_lock() has been introduced to handle both
monitored and non-monitored lock requests, the NLMv4 NM_LOCK
procedure can be converted to use xdrgen-generated XDR
functions. This conversion allows the removal of
__nlm4svc_proc_lock(), a helper function that was previously
shared between the LOCK and NM_LOCK procedures.

Replace the NLMPROC4_NM_LOCK entry in the nlm_procedures4
array with an entry that uses xdrgen-built XDR decoders and
encoders. The procedure handler is updated to call
nlm4svc_do_lock() directly and access arguments through the
argp->xdrgen hierarchy.

The .pc_argzero field is set to zero because xdrgen decoders
fully initialize all fields in argp->xdrgen, making the early
defensive memset unnecessary. The remaining argp fields that
fall outside the xdrgen structures are cleared explicitly as
needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 UNSHARE procedure

Now that the share helpers have been decoupled from the
NLMv3-specific struct nlm_args and file_lock initialization
has been hoisted into the procedure handler, the NLMv4 UNSHARE
procedure can be converted to use xdrgen-generated XDR
functions.

Replace the NLMPROC4_UNSHARE entry in the nlm_procedures4
array with an entry that uses xdrgen-built XDR decoders and
encoders. The procedure handler is updated to use the new
wrapper structures (nlm4_shareargs_wrapper and
nlm4_shareres_wrapper) and access arguments through the
argp->xdrgen hierarchy.

The .pc_argzero field is set to zero because xdrgen decoders
fully initialize all fields in argp->xdrgen, making the early
defensive memset unnecessary. The remaining argp fields that
fall outside the xdrgen structures are cleared explicitly as
needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 SHARE procedure

Now that the share helpers have been decoupled from the
NLMv3-specific struct nlm_args and file_lock initialization
has been hoisted into the procedure handler, the NLMv4 SHARE
procedure can be converted to use xdrgen-generated XDR
functions.

Replace the NLMPROC4_SHARE entry in the nlm_procedures4 array
with an entry that uses xdrgen-built XDR decoders and encoders.
The procedure handler is updated to use the new wrapper
structures (nlm4_shareargs_wrapper and nlm4_shareres_wrapper)
and access arguments through the argp->xdrgen hierarchy.

The .pc_argzero field is set to zero because xdrgen decoders
fully initialize all fields in argp->xdrgen, making the early
defensive memset unnecessary. The remaining argp fields that
fall outside the xdrgen structures are cleared explicitly as
needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Prepare share helpers for xdrgen conversion

In order to convert the NLMv4 server-side XDR functions to use
xdrgen, the internal share helpers need to be decoupled from the
NLMv3-specific struct nlm_args. NLMv4 procedures will use
different argument structures once they are converted.

Refactor nlmsvc_share_file() and nlmsvc_unshare_file() to accept
individual arguments (oh, access, mode) instead of the common
struct nlm_args. This allows both protocol versions to call these
helpers without forcing a common argument structure.

While here, add kernel-doc comments to both functions and fix a
comment typo in the unshare path.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Hoist file_lock init out of nlm4svc_decode_shareargs()

The xdrgen-generated XDR decoders cannot initialize the
file_lock structure because it is an internal kernel type,
not part of the wire protocol. To prepare for converting
SHARE and UNSHARE procedures to use xdrgen, the file_lock
initialization must be moved from nlm4svc_decode_shareargs()
into the procedure handlers themselves.

This change removes one more dependency on the "struct
nlm_lock::fl" field in fs/lockd/xdr4.c, allowing the XDR
decoder to focus solely on unmarshalling wire data.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Convert server-side undefined procedures to xdrgen

The NLMv4 protocol defines several procedure slots that are
not implemented. These undefined procedures need proper
handling to return rpc_proc_unavail to clients that
mistakenly invoke them.

This patch converts the three undefined procedure entries
(slots 17, 18, and 19) to use xdrgen functions
nlm4_svc_decode_void and nlm4_svc_encode_void. The
nlm4svc_proc_unused function is also moved earlier in the
file to follow the convention of placing procedure
implementations before the procedure table.

The pc_argsize, pc_ressize, and pc_argzero fields are now
correctly set to zero since no arguments or results are
processed. The pc_xdrressize field is updated to XDR_void
to accurately reflect the response size.

This conversion completes the migration of all NLMv4
server-side procedures to use xdrgen-generated XDR
functions, improving type safety and eliminating
hand-written XDR code.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 SM_NOTIFY procedure

Convert the SM_NOTIFY procedure to use xdrgen functions
nlm4_svc_decode_nlm4_notifyargs and nlm4_svc_encode_void.
SM_NOTIFY is a private callback from statd to notify lockd
when a remote host has rebooted.

This patch introduces struct nlm4_notifyargs_wrapper to
bridge between the xdrgen-generated nlm4_notifyargs and
the nlm_reboot structure expected by nlm_host_rebooted().
The wrapper contains both the xdrgen-decoded arguments
and a reboot field for the existing API.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

This change also corrects the pc_xdrressize field, which
previously contained a placeholder value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 GRANTED_RES procedure

Convert the GRANTED_RES procedure to use xdrgen functions
nlm4_svc_decode_nlm4_res and nlm4_svc_encode_void.
GRANTED_RES is a callback procedure where the client sends
granted lock results back to the server after an async
GRANTED request.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

This change also corrects the pc_xdrressize field, which
previously contained a placeholder value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 UNLOCK_RES procedure

Update the NLMPROC4_UNLOCK_RES entry in nlm_procedures4 to invoke
xdrgen-generated XDR functions.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 CANCEL_RES procedure

Convert the CANCEL_RES procedure to use xdrgen functions
nlm4_svc_decode_nlm4_res and nlm4_svc_encode_void.
CANCEL_RES is a callback procedure where the client sends
cancel results back to the server after an async CANCEL
request.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

This change also corrects the pc_xdrressize field, which
previously contained a placeholder value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 LOCK_RES procedure

Convert the LOCK_RES procedure to use xdrgen functions
nlm4_svc_decode_nlm4_res and nlm4_svc_encode_void.
LOCK_RES is a callback procedure where the client sends
lock results back to the server after an async LOCK
request.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

This change also corrects the pc_xdrressize field, which
previously contained a placeholder value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 TEST_RES procedure

Convert the TEST_RES procedure to use xdrgen functions
nlm4_svc_decode_nlm4_testres and nlm4_svc_encode_void.
TEST_RES is a callback procedure where the client sends
test lock results back to the server after an async TEST
request.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

This change also corrects the pc_xdrressize field, which
previously contained a placeholder value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 GRANTED_MSG procedure

Convert the GRANTED_MSG procedure to use xdrgen functions
nlm4_svc_decode_nlm4_testargs and nlm4_svc_encode_void.
The procedure handler uses the nlm4_testargs_wrapper
structure that bridges between xdrgen types and the legacy
nlm_lock representation.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

The NLM async callback mechanism uses client-side functions
which continue to take legacy struct nlm_res, preventing
GRANTED and GRANTED_MSG from sharing code for now.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 UNLOCK_MSG procedure

Convert the UNLOCK_MSG procedure to use xdrgen functions
nlm4_svc_decode_nlm4_unlockargs and nlm4_svc_encode_void.
The procedure handler uses the nlm4_unlockargs_wrapper
structure that bridges between xdrgen types and the legacy
nlm_lock representation.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments, making the early
defensive memset unnecessary.

The NLM async callback mechanism uses client-side functions
which continue to take legacy struct nlm_res, preventing
UNLOCK and UNLOCK_MSG from sharing code for now.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 CANCEL_MSG procedure

The CANCEL_MSG procedure is part of NLM's asynchronous lock
request flow, where clients send CANCEL_MSG to cancel pending
lock requests. This patch continues the xdrgen migration by
converting CANCEL_MSG to use generated XDR functions.

This patch converts the CANCEL_MSG procedure to use xdrgen
functions nlm4_svc_decode_nlm4_cancargs and
nlm4_svc_encode_void generated from the NLM version 4 protocol
specification. The procedure handler uses xdrgen types through
the nlm4_cancargs_wrapper structure that bridges between
generated code and the legacy nlm_lock representation.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining argp
fields are cleared as needed.

The NLM async callback mechanism uses client-side functions
which continue to take legacy results like struct nlm_res,
preventing CANCEL and CANCEL_MSG from sharing code for now.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 LOCK_MSG procedure

The LOCK_MSG procedure is part of NLM's asynchronous lock
request flow, where clients send LOCK_MSG to request locks
that may block. This patch continues the xdrgen migration by
converting LOCK_MSG to use generated XDR functions.

This patch converts the LOCK_MSG procedure to use xdrgen
functions nlm4_svc_decode_nlm4_lockargs and
nlm4_svc_encode_void generated from the NLM version 4
protocol specification. The procedure handler uses xdrgen
types through the nlm4_lockargs_wrapper structure that
bridges between generated code and the legacy nlm_lock
representation.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining
argp fields are cleared as needed.

The NLM async callback mechanism uses client-side functions
which continue to take legacy results like struct nlm_res,
preventing LOCK and LOCK_MSG from sharing code for now.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 TEST_MSG procedure

The TEST_MSG procedure is part of NLM's asynchronous lock request
flow, where clients send TEST_MSG to check lock availability
without blocking. This patch continues the xdrgen migration by
converting TEST_MSG to use generated XDR functions.

This patch converts the TEST_MSG procedure to use xdrgen
functions nlm4_svc_decode_nlm4_testargs and
nlm4_svc_encode_void generated from the NLM version 4 protocol
specification. The procedure handler uses xdrgen types through
the nlm4_testargs_wrapper structure that bridges between
generated code and the legacy nlm_lock representation.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining argp
fields are cleared as needed.

The NLM async callback mechanism uses client-side functions
which continue to take legacy results like struct nlm_res,
preventing TEST and TEST_MSG from sharing code for now.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Refactor nlm4svc_callback()

The xdrgen-based XDR conversion requires each RPC procedure
to handle its own argument extraction, since xdrgen generates
distinct argument structures for each procedure rather than
using a single shared type.

This patch moves the host lookup logic from nlm4svc_callback()
into each of the five MSG procedure handlers (TEST_MSG,
LOCK_MSG, CANCEL_MSG, UNLOCK_MSG, and GRANTED_MSG). Each
handler now performs its own host lookup from rqstp->rq_argp
and passes the resulting host pointer to nlm4svc_callback(),
which is reduced to a simpler helper that only dispatches
the callback.

This refactoring enables the subsequent xdrgen conversion
patches by establishing the pattern where each procedure
handles its own argument extraction, while preserving
existing callback behavior unchanged.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 GRANTED procedure

The NLM GRANTED procedure provides server-to-client notification
when a previously blocked lock request has been granted,
completing the asynchronous lock request flow. This patch
completes the xdrgen migration for basic NLMv4 procedures by
converting the GRANTED procedure, the final one in this
conversion series.

This patch converts the GRANTED procedure to use xdrgen
functions nlm4_svc_decode_nlm4_testargs and
nlm4_svc_encode_nlm4_res generated from the NLM version 4
protocol specification. The procedure handler uses xdrgen types
through a wrapper structure that bridges between generated code
and the legacy nlm_lock representation still used by the core
lockd logic.

A new helper function nlm4_lock_to_nlm_lock() is introduced to
convert xdrgen nlm4_lock structures to the legacy nlm_lock
format. This helper complements the existing
nlm4svc_lookup_host() and nlm4svc_lookup_file() functions used
throughout this series.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining argp
fields are cleared as needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 UNLOCK procedure

UNLOCK releases locks acquired via the LOCK procedure. Conversion
of TEST, LOCK, CANCEL, and UNLOCK provides the complete set of
lock lifecycle operations required by the NLM protocol, enabling
clients to test for conflicts, acquire locks, abort pending lock
requests, and release held locks.

The procedure handler converts arguments from the xdrgen-generated
nlm4_unlockargs structure to the legacy nlm_lock representation
through nlm4_unlockargs_wrapper. This maintains compatibility with
core lockd logic while using XDR decoders and encoders generated
from the NLMv4 protocol specification.

The original __nlm4svc_proc_unlock function is retained because
the asynchronous callback path invokes it directly, bypassing
the RPC dispatch mechanism.

The pc_argzero field is zero because nlm4_svc_decode_nlm4_unlockargs
initializes all fields in argp->xdrgen, eliminating the need for
early memset of the argument buffer. Remaining argp fields outside
the xdrgen structure are cleared explicitly where needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 CANCEL procedure

The NLM CANCEL procedure allows clients to cancel outstanding
blocked lock requests, completing the set of lock-related
operations that share common lookup patterns. This patch
continues the xdrgen migration by converting the CANCEL
procedure, leveraging the same nlm4svc_lookup_host() and
nlm4svc_lookup_file() helpers established in the TEST procedure
conversion to maintain consistency across the series.

This patch converts the CANCEL procedure to use xdrgen functions
nlm4_svc_decode_nlm4_cancargs and nlm4_svc_encode_nlm4_res
generated from the NLM version 4 protocol specification. The
procedure handler uses xdrgen types through a wrapper structure
that bridges between generated code and the legacy nlm_lock
representation still used by the core lockd logic.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining argp
fields are cleared as needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 LOCK procedure

Replace legacy XDR handling in the LOCK procedure with xdrgen-
generated functions nlm4_svc_decode_lockargs and
nlm4_svc_encode_res.

The new nlm4svc_do_lock() handler replaces __nlm4svc_proc_lock()
at the NLMPROC4_LOCK entry point. Wrapper structures bridge
xdrgen types to the legacy nlm_lock representation used by core
lockd. The nlm4svc_lookup_host() and nlm4svc_lookup_file()
helpers from the TEST conversion handle host and file lookup.

The pc_argzero field is set to zero: xdrgen-generated decoders
initialize all fields in argp->xdrgen, so a defensive memset is
unnecessary. The wrapper's cookie and lock fields are cleared
by nlm4svc_do_lock() before use.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 TEST procedure

The NLM TEST procedure requires host and file lookups to check
lock state, operations that will be common across multiple NLM
procedures being migrated to xdrgen. By introducing the helper
functions nlm4svc_lookup_host() and nlm4svc_lookup_file() now,
we establish reusable patterns for subsequent conversions in
this series.

This patch converts the TEST procedure to use xdrgen functions
nlm4_svc_decode_testargs and nlm4_svc_encode_testres generated
from the NLM version 4 protocol specification. The procedure
handler is rewritten to use xdrgen types through wrapper
structures that bridge between generated code and the legacy
nlm_lock representation still used by the core lockd logic.
TEST_MSG is to be converted in a subsequent patch.

The pc_argzero field is set to zero because xdrgen decoders
reliably initialize all arguments in the argp->xdrgen field,
making the early defensive memset unnecessary. Remaining argp
fields are cleared as needed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Use xdrgen XDR functions for the NLMv4 NULL procedure

Hand-written XDR encoders and decoders are difficult to maintain
and can inadvertently diverge from protocol specifications. By
migrating to xdrgen-generated code, we improve type safety and
ensure the implementation exactly matches the NLM version 4
protocol specification.

This patch begins the migration by converting the NULL procedure
to use nlm4_svc_decode_void and nlm4_svc_encode_void generated
from Documentation/sunrpc/xdr/nlm4.x. The NULL procedure is
straightforward as it has no arguments or results, making it an
ideal starting point for this series.

The pc_xdrressize field is set to XDR_void (zero) to reflect
that this procedure returns no XDR-encoded data. The pc_argzero
field is also set to zero since xdrgen decoders reliably
initialize all decoded values.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Documentation: Add the RPC language description of NLM version 4

In order to generate source code to encode and decode NLMv4 protocol
elements, include a copy of the RPC language description of NLMv4
for xdrgen to process. The language description is an amalgam of
RFC 1813 and the Open Group's XNFS specification:

https://pubs.opengroup.org/onlinepubs/9629799/chap10.htm

The C code committed here was generated from the new nlm4.x file
using tools/net/sunrpc/xdrgen/xdrgen.

The goals of replacing hand-written XDR functions with ones that
are tool-generated are to improve memory safety and make XDR
encoding and decoding less brittle to maintain.

The xdrgen utility derives both the type definitions and the
encode/decode functions directly from protocol specifications,
using names and symbols familiar to anyone who knows those specs.
Unlike hand-written code that can inadvertently diverge from the
specification, xdrgen guarantees that the generated code matches
the specification exactly.

We would eventually like xdrgen to generate Rust code as well,
making the conversion of the kernel's NFS stacks to use Rust just
a little easier for us.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Enforce timeout on layout recall and integrate lease manager fencing

When a layout conflict triggers a recall, enforcing a timeout is
necessary to prevent too many nfsd threads from blocking in
__break_lease, so that the server can continue servicing incoming
requests efficiently.

This patch adds a new callback to lease_manager_operations:

lm_breaker_timedout: Invoked when a lease recall times out and is
about to be disposed of. This function enables the lease manager
to inform the caller whether the file_lease should remain on the
flc_list or be disposed of.

For the NFSD lease manager, this function now handles layout recall
timeouts. If the layout type supports fencing and the client has not
been fenced, a fence operation is triggered to prevent the client
from accessing the block device.

While the fencing operation is in progress, the conflicting file_lease
remains on the flc_list until fencing is complete. This guarantees
that no other clients can access the file, and the client with
exclusive access is properly blocked before disposal.
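
A minimal user-space sketch of the callback flow described above.
The boolean return convention (true means "keep the lease on the
list") and all types and fields here are assumptions made for
illustration; the actual signature of lm_breaker_timedout is not
given in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_lease;

/* Hypothetical stand-in for lease_manager_operations. */
struct sketch_lm_ops {
	/* Assumed convention: true = keep on list, false = dispose. */
	bool (*lm_breaker_timedout)(struct sketch_lease *fl);
};

/* Hypothetical stand-in for a file_lease backing a layout. */
struct sketch_lease {
	const struct sketch_lm_ops *ops;
	bool supports_fencing;	/* layout type can fence clients */
	bool fenced;		/* client has already been fenced */
	bool fence_in_progress;	/* a fence operation was started */
};

/* Lease manager side: decide what happens when the recall times out. */
static bool sketch_layout_timedout(struct sketch_lease *fl)
{
	if (!fl->supports_fencing || fl->fenced)
		return false;		/* nothing to wait for: dispose */

	fl->fence_in_progress = true;	/* start fencing the client */
	return true;			/* keep the lease until fencing ends */
}

/* Caller side: ask the lease manager, if it provides the callback. */
static bool sketch_keep_after_timeout(struct sketch_lease *fl)
{
	assert(fl != NULL);

	if (fl->ops && fl->ops->lm_breaker_timedout)
		return fl->ops->lm_breaker_timedout(fl);
	return false;			/* default: dispose on timeout */
}
```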

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: fix nfs4_file access extra count in nfsd4_add_rdaccess_to_wrdeleg

In nfsd4_add_rdaccess_to_wrdeleg, if fp->fi_fds[O_RDONLY] is already
set by another thread, __nfs4_file_get_access should not be called
to increment the nfs4_file access count since that was already done
by the thread that added READ access to the file. The extra fi_access
count in nfs4_file can prevent the corresponding nfsd_file from being
freed.
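
The accounting rule above can be sketched as follows. The structure
and function are simplified, hypothetical stand-ins, and locking is
omitted; the point is only that the access count is taken by
whichever caller installs the read-only file, and by nobody else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct nfs4_file. */
struct sketch_nfs4_file {
	void *fi_fds_rdonly;		/* stand-in for fp->fi_fds[O_RDONLY] */
	unsigned int fi_access_rd;	/* stand-in for the READ access count */
};

/*
 * Install nf as the read-only file if the slot is still empty.
 * Returns true if this caller installed it and took the access
 * count, false if another thread got there first. In the false
 * case the count must not be incremented again.
 */
static bool sketch_add_rdaccess(struct sketch_nfs4_file *fp, void *nf)
{
	assert(fp != NULL && nf != NULL);

	if (fp->fi_fds_rdonly)
		return false;	/* already set; count already taken */

	fp->fi_fds_rdonly = nf;
	fp->fi_access_rd++;
	return true;
}
```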

When the nfs-server service is stopped, these extra access counts
trigger a BUG in kmem_cache_destroy() that reports nfsd_file objects
remaining on __kmem_cache_shutdown().

This problem can be reproduced by running the Git project's test
suite over NFS.

Fixes: 8072e34e1387 ("nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg()")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: Fix compilation error (`make W=1`) when dprintk() is no-op

Clang complains about variables that are set but never used:

.../flexfilelayout/flexfilelayoutdev.c:56:9: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable]
.../flexfilelayout/flexfilelayout.c:1505:6: error: variable 'err' set but not used [-Werror,-Wunused-but-set-variable]
.../nfs4proc.c:9244:12: error: variable 'ptr' set but not used [-Werror,-Wunused-but-set-variable]

Fix these by forwarding the parameters of dprintk() to no_printk().
A positive side effect is that format strings are now checked even
when dprintk() is a no-op.
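
A user-space analogue of the technique, for illustration only. The
sketch_ macros below are hypothetical and use printf() in place of
the kernel's printing functions; they are not the sunrpc definitions.

```c
#include <assert.h>
#include <stdio.h>

/*
 * The call is placed in dead code: it is never executed, but the
 * compiler still type-checks the format string and counts the
 * arguments as used.
 */
#define sketch_no_printk(...)			\
	do {					\
		if (0)				\
			printf(__VA_ARGS__);	\
	} while (0)

#ifdef SKETCH_DEBUG
#define sketch_dprintk(...)	printf(__VA_ARGS__)
#else
/* Debugging disabled: forward the arguments instead of dropping them. */
#define sketch_dprintk(...)	sketch_no_printk(__VA_ARGS__)
#endif

static int sketch_lookup(int key)
{
	int ret;

	assert(key >= 0);
	ret = key * 2;	/* only consumed by the debug message */

	/* Without forwarding, 'ret' would be set but never used. */
	sketch_dprintk("lookup(%d) = %d\n", key, ret);
	return key;
}
```

Had the disabled variant expanded to nothing, 'ret' would draw a
-Wunused-but-set-variable warning and a mismatched format string
would go unnoticed.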

Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Fixes: fc931582c260 ("nfs41: create_session operation")
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: Kill RPC_IFDEBUG()

RPC_IFDEBUG() is used in only two places. In the first, the code
that uses the definition is already guarded by ifdeffery; in the
second, the guard is implied by the use of dprintk(). Kill the
macro. In the first case, move the ifdeffery into a regular
condition with the variable defined inside; in the second, add the
same conditional and move the respective code there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfs/blocklayout: Fix compilation error (`make W=1`) in bl_write_pagelist()

Clang complains about a variable that is set but never used
(when dprintk() is a no-op):

.../blocklayout/blocklayout.c:384:9: error: variable 'count' set but not used [-Werror,-Wunused-but-set-variable]

Remove the variable, which is a leftover from the previous cleanup.

Fixes: 3a6fd1f004fc ("pnfs/blocklayout: remove read-modify-write handling in bl_write_pagelist")
Acked-by: Anna Schumaker <anna.schumkaer@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Relocate svc_version definitions to XDR layer

Public RPC server interfaces become cluttered when internal
XDR implementation details leak into them. The procedure count,
maximum XDR buffer size, and per-CPU call counters serve no
purpose outside the code that encodes and decodes NLM protocol
messages. Exposing these values through global headers creates
unnecessary coupling between the RPC dispatch logic and the
XDR layer.

Relocating the svc_version structure definitions confines this
implementation information to the files where XDR encoding and
decoding occur. In svc.c, the buffer size computation now reads
vs_xdrsize from the version structures rather than relying on a
preprocessor constant. This calculation occurs at service
initialization, after the linker has resolved the version
structure definitions. The dispatch function becomes non-static
because the version structures and the dispatcher now reside in
different translation units.
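
A sketch of a run-time buffer size computation over version
structures. Taking the maximum vs_xdrsize across the registered
versions is an assumption of this sketch, as are the trimmed-down
structure and the function name; the message only states that
vs_xdrsize is read from the version structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct svc_version. */
struct sketch_version {
	unsigned int vs_vers;	/* protocol version number */
	size_t vs_xdrsize;	/* largest XDR argument or result size */
};

/* Return the largest vs_xdrsize among the registered versions. */
static size_t sketch_max_xdrsize(const struct sketch_version *const *vers,
				 size_t nvers)
{
	size_t i, max = 0;

	assert(vers != NULL);

	for (i = 0; i < nvers; i++) {
		/* Slots for unregistered versions may be NULL. */
		if (vers[i] && vers[i]->vs_xdrsize > max)
			max = vers[i]->vs_xdrsize;
	}
	return max;
}
```

Because the sizes are read through pointers at initialization time,
the code doing the computation needs only declarations of the version
structures, not a shared preprocessor constant.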

The NLMSVC_XDRSIZE macro is removed from xdr.h because buffer
size is now computed from the union of XDR argument and result
structures, matching the pattern used in other RPC services.
Versions 1 and 3 share the same procedure table but maintain
separate counter arrays. Version 4 remains separate due to its
distinct procedure definitions.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Move nlm4svc_set_file_lock_range()

Both client-side and server-side NLMv4 code convert lock byte ranges
from the wire format (start, length) to the kernel's file_lock format
(start, end). The current nlm4svc_set_file_lock_range() performs this
conversion, but the "svc" prefix incorrectly suggests server-only use,
and client code must include server-internal headers to access it.

Rename to lockd_set_file_lock_range4() and relocate to the shared
lockd.h header, making it accessible to both client and server code.
This eliminates the need for client code to include xdr4.h, reducing
coupling between the XDR implementation files.

While relocating the function, add input validation: clamp the
starting offset to OFFSET_MAX before use. Without this, a malformed
lock request with off > OFFSET_MAX results in fl_start > fl_end,
violating file_lock invariants and potentially causing incorrect
lock conflict detection.
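
The conversion and clamp can be sketched in user-space C as below.
This is not the kernel function: the structure, the names, and the
SKETCH_OFFSET_MAX constant are stand-ins, and treating a zero length
as "to the end of the file" follows the usual NLM convention.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's OFFSET_MAX (largest file offset). */
#define SKETCH_OFFSET_MAX INT64_MAX

/* Hypothetical, trimmed-down stand-in for struct file_lock. */
struct sketch_lock {
	int64_t fl_start;
	int64_t fl_end;		/* inclusive */
};

/*
 * Convert a wire-format (offset, length) pair into an inclusive
 * (start, end) range, clamping both ends to SKETCH_OFFSET_MAX.
 */
static void sketch_set_lock_range(struct sketch_lock *fl,
				  uint64_t off, uint64_t len)
{
	const uint64_t max = (uint64_t)SKETCH_OFFSET_MAX;
	uint64_t start = off > max ? max : off;	/* clamp the start */

	fl->fl_start = (int64_t)start;

	/* len == 0 means "to EOF"; also catch ranges that would overflow. */
	if (len == 0 || len - 1 > max - start)
		fl->fl_end = SKETCH_OFFSET_MAX;
	else
		fl->fl_end = (int64_t)(start + len - 1);

	/* The invariant the clamp is meant to preserve. */
	assert(fl->fl_start <= fl->fl_end);
}
```

With the clamp in place, an offset above the maximum collapses to a
single-byte range at the maximum offset instead of producing a start
that lies past the end.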

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Make linux/lockd/nlm.h an internal header

The NLM protocol constants and status codes in nlm.h are needed
only by lockd's internal implementation. NFS client code and
NFSD interact with lockd through the stable API in bind.h and
have no direct use for protocol-level definitions.

Exposing these definitions globally via bind.h creates unnecessary
coupling between lockd internals and its consumers. Moving nlm.h
from include/linux/lockd/ to fs/lockd/ clarifies the API boundary:
bind.h provides the lockd service interface, while nlm.h remains
available only to code within fs/lockd/ that implements the
protocol.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Move xdr.h from include/linux/lockd/ to fs/lockd/

The lockd subsystem unnecessarily exposes internal NLM XDR type
definitions through the global include path. These definitions
are not used by any code outside fs/lockd/, making them
inappropriate for include/linux/lockd/.

Moving xdr.h to fs/lockd/ narrows the API surface and clarifies
that these types are internal implementation details. The
comment in linux/lockd/bind.h stating xdr.h was needed for
"xdr-encoded error codes" is stale: no lockd API consumers use
those codes.

Forward declarations for struct nfs_fh and struct file_lock are
added to bind.h because their definitions were previously pulled
in transitively through xdr.h. Additionally, nfs3proc.c and
proc.c need explicit includes of filelock.h for FL_CLOSE and
for accessing struct file_lock members, respectively.

Built and tested with lockd client/server operations. No
functional change.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Remove lockd/debug.h

The lockd include structure has unnecessary indirection. The header
include/linux/lockd/debug.h is consumed only by fs/lockd/lockd.h,
creating an extra compilation dependency and making the code harder
to navigate.

Fold the debug.h definitions directly into lockd.h and remove the
now-redundant header. This reduces the include tree depth and makes
the debug-related definitions easier to find when working on lockd
internals.

Build-tested with lockd built as module and built-in.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

lockd: Relocate include/linux/lockd/lockd.h

Headers placed in include/linux/ form part of the kernel's
internal API and signal to subsystem maintainers that other
parts of the kernel may depend on them. By moving lockd.h
into fs/lockd/, lockd becomes a more self-contained module
whose internal interfaces are clearly distinguished from its
public contract with the rest of the kernel. This relocation
addresses a long-standing XXX comment in the header itself
that acknowledged the file's misplacement. Future changes to
lockd internals can now proceed with confidence that external
consumers are not inadvertently coupled to implementation
details.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>