Ian Rogers [Sat, 4 Apr 2026 03:43:02 +0000 (20:43 -0700)]
perf sample: Make sure perf_sample__init/exit are used
The deferred stack trace code wasn't using perf_sample__init/exit. Add
the deferred stack trace clean up to perf_sample__exit which requires
proper NULL initialization in perf_sample__init. Make the
perf_sample__exit robust to being called more than once by using
zfree. Make the error paths in evsel__parse_sample exit the
sample. Add a merged_callchain boolean to record that the callchain is
allocated; deferred_callchain alone doesn't suffice for this. Pack the
struct variables so the new field adds no padding bytes.
Similarly, powerpc_vpadtl_sample wasn't using perf_sample__init/exit;
use it for consistency and to avoid potential issues with uninitialized
variables.
Similarly, guest_session__inject_events in builtin-inject wasn't using
perf_sample__init/exit. The lifetime management for fetched events is
somewhat complex there, but when an event is fetched the sample should
be initialized and needs exiting on error. The sample may be left in
place so that future injects have access to it.
Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
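The cleanup pattern described above (NULL initialization, an ownership flag, and a zfree-style release that clears the pointer) can be sketched outside perf. This is a minimal Python model under stated assumptions, not the perf code: the Sample class, the FREED list and the helper names are hypothetical and only mirror the fields named in the commit message.

```python
class Sample:
    """Hypothetical stand-in for struct perf_sample (illustration only)."""
    def __init__(self):
        # Models perf_sample__init(): every owned pointer starts as NULL.
        self.callchain = None
        self.merged_callchain = False

FREED = []  # records each release so a double free would be visible

def zfree(obj, attr):
    """Release the buffer held in obj.attr, then clear the reference."""
    buf = getattr(obj, attr)
    if buf is not None:
        FREED.append(buf)
    setattr(obj, attr, None)

def sample_exit(sample):
    """Models an exit routine that is safe to call more than once."""
    if sample.merged_callchain:
        zfree(sample, "callchain")
        sample.merged_callchain = False
```

Because the flag and the pointer are both cleared on the first call, a second call finds nothing left to release.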
Ian Rogers [Sat, 21 Mar 2026 06:14:48 +0000 (23:14 -0700)]
perf tests sched stats: Write output to temp file
Writing to the perf.data file can fail in various contexts such as
continuous testing. Other tests write to a mktemp-ed file; make the
"perf sched stats tests" follow this convention.
Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Namhyung Kim [Mon, 6 Apr 2026 05:18:16 +0000 (22:18 -0700)]
perf sched: Avoid crash for unexpected perf sched stats report
Doing a `perf sched record` then `perf sched stats report` crashes as
the tp_handler isn't set. Add a dummy tp_handler for it rather than
adding an extra check.
Reported-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
RISC-V: KVM: Fix shift-out-of-bounds in make_xfence_request()
The make_xfence_request() function uses a shift operation to check if a
vCPU is in the hart mask:
if (!(hmask & (1UL << (vcpu->vcpu_id - hbase))))
However, when the difference between vcpu_id and hbase
is >= BITS_PER_LONG, the shift operation causes undefined behavior.
This was detected by UBSAN:
UBSAN: shift-out-of-bounds in arch/riscv/kvm/tlb.c:343:23
shift exponent 256 is too large for 64-bit type 'long unsigned int'
Fix this by adding a bounds check before the shift operation.
This bug was found by fuzzing the KVM RISC-V interface.
Fixes: 13acfec2dbcc ("RISC-V: KVM: Add remote HFENCE functions based on VCPU requests") Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260403232011.2394966-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
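The bounds check can be illustrated outside the kernel. The sketch below is a minimal Python model, not the kernel patch: the helper name vcpu_in_hart_mask and the BITS_PER_LONG value of 64 are assumptions made for illustration. Python integers never overflow on a shift, so the explicit range test is what stands in for the fix.

```python
BITS_PER_LONG = 64  # assumption: 64-bit unsigned long, as in the UBSAN report

def vcpu_in_hart_mask(hmask, hbase, vcpu_id):
    """Return True if vcpu_id is selected by (hmask, hbase).

    Hypothetical helper modelling the test in make_xfence_request();
    the offset is range-checked before it is used as a shift count.
    """
    offset = vcpu_id - hbase
    # A negative offset models the unsigned wrap-around that would occur in C.
    if offset < 0 or offset >= BITS_PER_LONG:
        # In C, shifting by this amount would be undefined behaviour.
        return False
    return bool(hmask & (1 << offset))
```

With this ordering, an offset of 256 (the exponent from the UBSAN report) is rejected before any shift takes place.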
Yu Kuai [Mon, 30 Mar 2026 05:52:13 +0000 (13:52 +0800)]
md: fix array_state=clear sysfs deadlock
When "clear" is written to array_state, md_attr_store() breaks sysfs
active protection so the array can delete itself from its own sysfs
store method.
However, md_attr_store() currently drops the mddev reference before
calling sysfs_unbreak_active_protection(). Once do_md_stop(..., 0)
has made the mddev eligible for delayed deletion, the temporary
kobject reference taken by sysfs_break_active_protection() can become
the last kobject reference protecting the md kobject.
That allows sysfs_unbreak_active_protection() to drop the last
kobject reference from the current sysfs writer context. kobject
teardown then recurses into kernfs removal while the current sysfs
node is still being unwound, and lockdep reports recursive locking on
kn->active with kernfs_drain() in the call chain.
Reproducer on an existing level:
1. Create an md0 linear array and activate it:
mknod /dev/md0 b 9 0
echo none > /sys/block/md0/md/metadata_version
echo linear > /sys/block/md0/md/level
echo 1 > /sys/block/md0/md/raid_disks
echo "$(cat /sys/class/block/sdb/dev)" > /sys/block/md0/md/new_dev
echo "$(($(cat /sys/class/block/sdb/size) / 2))" > \
/sys/block/md0/md/dev-sdb/size
echo 0 > /sys/block/md0/md/dev-sdb/slot
echo active > /sys/block/md0/md/array_state
2. Wait briefly for the array to settle, then clear it:
sleep 2
echo clear > /sys/block/md0/md/array_state
The warning looks like:
WARNING: possible recursive locking detected
bash/588 is trying to acquire lock:
(kn->active#65) at __kernfs_remove+0x157/0x1d0
but task is already holding lock:
(kn->active#65) at sysfs_unbreak_active_protection+0x1f/0x40
...
Call Trace:
kernfs_drain
__kernfs_remove
kernfs_remove_by_name_ns
sysfs_remove_group
sysfs_remove_groups
__kobject_del
kobject_put
md_attr_store
kernfs_fop_write_iter
vfs_write
ksys_write
Restore active protection before mddev_put() so the extra sysfs
kobject reference is dropped while the mddev is still held alive. The
actual md kobject deletion is then deferred until after the sysfs
write path has fully returned.
bpf: Fix stale offload->prog pointer after constant blinding
When a dev-bound-only BPF program (BPF_F_XDP_DEV_BOUND_ONLY) undergoes
JIT compilation with constant blinding enabled (bpf_jit_harden >= 2),
bpf_jit_blind_constants() clones the program. The original prog is then
freed in bpf_jit_prog_release_other(), which updates aux->prog to point
to the surviving clone, but fails to update offload->prog.
This leaves offload->prog pointing to the freed original program. When
the network namespace is subsequently destroyed, cleanup_net() triggers
bpf_dev_bound_netdev_unregister(), which iterates ondev->progs and calls
__bpf_prog_offload_destroy(offload->prog). Accessing the freed prog
causes a page fault. To reproduce:
1. Set net.core.bpf_jit_harden=2 (echo 2 > /proc/sys/net/core/bpf_jit_harden)
2. Run xdp_metadata selftest, which creates a dev-bound-only XDP
program on a veth inside a netns (./test_progs -t xdp_metadata)
3. cleanup_net -> page fault in __bpf_prog_offload_destroy
Dev-bound-only programs are unique in that they have an offload structure
but go through the normal JIT path instead of bpf_prog_offload_compile().
This means they are subject to constant blinding's prog clone-and-replace,
while also having offload->prog that must stay in sync.
Fix this by updating offload->prog in bpf_jit_prog_release_other(),
alongside the existing aux->prog update. Both are back-pointers to
the prog that must be kept in sync when the prog is replaced.
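The back-pointer problem can be modelled in a few lines. This is a sketch under stated assumptions, not the kernel code: the Prog, Aux and Offload classes and the release_other signature are hypothetical simplifications of the structures named above.

```python
class Prog:
    """Hypothetical stand-in for struct bpf_prog."""
    def __init__(self, name):
        self.name = name
        self.freed = False

class Aux:
    """Holds a back-pointer to the owning prog (models aux->prog)."""
    def __init__(self, prog):
        self.prog = prog

class Offload:
    """Holds a second back-pointer (models offload->prog)."""
    def __init__(self, prog):
        self.prog = prog

def release_other(clone, original, aux, offload):
    """Keep `clone`, free `original`, and repoint every back-pointer."""
    aux.prog = clone
    if offload is not None:
        # The update the fix adds: without it, offload.prog would still
        # reference the freed original.
        offload.prog = clone
    original.freed = True
```

Any structure that stores a pointer back to the prog has to be updated in the same place where the prog is swapped, otherwise later teardown code dereferences freed memory.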
The tc_tunnel test is based on a send_and_test_data function which takes
a subtest configuration, and a boolean indicating whether the connection
is supposed to fail or not. This boolean is always passed as true, and
is a remnant from the first (not integrated) attempts to convert
tc_tunnel to test_progs: those versions validated, for example, that a
connection properly fails when only one side of the connection has
tunneling enabled. This specific testing has not been integrated
because it involved large timeouts which considerably increased the
test duration, for little added value.
Remove the unused boolean from send_and_test_data to simplify the
generic part of subtests.
====================
bpf: fix end-of-list detection in cgroup_storage_get_next_key()
list_next_entry() never returns NULL, so the NULL check in
cgroup_storage_get_next_key() is dead code. When iterating past the last
element, the function reads storage->key from a bogus pointer that aliases
internal map fields and copies the result to userspace.
Patch 1 replaces the NULL check with list_entry_is_head() so the function
correctly returns -ENOENT when there are no more entries.
Patch 2 adds a selftest to cover this corner case, as suggested by Sun Jian
and Paul Chaignon.
Weiming Shi [Fri, 3 Apr 2026 13:29:51 +0000 (21:29 +0800)]
selftests/bpf: add get_next_key boundary test for cgroup_storage
Verify that bpf_map__get_next_key() correctly returns -ENOENT when
called on the last (and only) key in a cgroup_storage map. Before the
fix in the previous patch, this would succeed with bogus key data
instead of failing.
Suggested-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260403132951.43533-3-bestswngs@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Weiming Shi [Fri, 3 Apr 2026 13:29:50 +0000 (21:29 +0800)]
bpf: fix end-of-list detection in cgroup_storage_get_next_key()
list_next_entry() never returns NULL -- when the current element is the
last entry it wraps to the list head via container_of(). The subsequent
NULL check is therefore dead code and get_next_key() never returns
-ENOENT for the last element, instead reading storage->key from a bogus
pointer that aliases internal map fields and copying the result to
userspace.
Replace it with list_entry_is_head() so the function correctly returns
-ENOENT when there are no more entries.
Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260403132951.43533-2-bestswngs@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
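Why the NULL check can never fire is easiest to see on a small circular list. The following is a minimal Python model, not the kernel implementation: Node, list_add_tail and get_next_key are hypothetical stand-ins for the list_head machinery and for cgroup_storage_get_next_key().

```python
import errno

class Node:
    """Hypothetical stand-in for an entry with an embedded list_head."""
    def __init__(self, key=None):
        self.key = key
        self.next = self  # an empty circular list points at itself
        self.prev = self

def list_add_tail(new, head):
    """Insert `new` just before `head`, i.e. at the tail of the list."""
    new.prev = head.prev
    new.next = head
    head.prev.next = new
    head.prev = new

def get_next_key(head, current):
    """Return the key after `current`, or -ENOENT at the end of the list."""
    nxt = current.next
    # The successor of the last entry is the head itself, never None,
    # so the end-of-list test must compare against the head.
    if nxt is head:
        return -errno.ENOENT
    return nxt.key
```

In this model the head carries no key at all; in C the equivalent read goes through container_of() into unrelated fields, which is the bogus data the commit message describes.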
====================
bpf: Fix torn writes in non-prealloc htab with BPF_F_LOCK
A torn write issue was reported in htab_map_update_elem() with
BPF_F_LOCK on hash maps. The BPF_F_LOCK fast path performs
a lockless lookup and copies the value under the element's embedded
spin_lock. A concurrent delete can free the element via
bpf_mem_cache_free(), which allows immediate reuse. When
alloc_htab_elem() recycles the same memory, it writes the value with
plain copy_map_value() without taking the spin_lock, racing with the
stale lock holder and producing torn writes.
Patch 1 fixes alloc_htab_elem() to use copy_map_value_locked() when
BPF_F_LOCK is set.
Patch 2 adds a selftest that reliably detects the torn writes on an
unpatched kernel.
selftests/bpf: Add torn write detection test for htab BPF_F_LOCK
Add a consistency subtest to htab_reuse that detects torn writes
caused by the BPF_F_LOCK lockless update racing with element
reallocation in alloc_htab_elem().
The test uses three thread roles started simultaneously via a pipe:
- locked updaters: BPF_F_LOCK|BPF_EXIST in-place updates
- delete+update workers: delete then BPF_ANY|BPF_F_LOCK insert
- locked readers: BPF_F_LOCK lookup checking value consistency
bpf: Use copy_map_value_locked() in alloc_htab_elem() for BPF_F_LOCK
When a BPF_F_LOCK update races with a concurrent delete, the freed
element can be immediately recycled by alloc_htab_elem(). The fast path
in htab_map_update_elem() performs a lockless lookup and then calls
copy_map_value_locked() under the element's spin_lock. If
alloc_htab_elem() recycles the same memory, it overwrites the value
with plain copy_map_value(), without taking the spin_lock, causing
torn writes.
Use copy_map_value_locked() when BPF_F_LOCK is set so the new element's
value is written under the embedded spin_lock, serializing against any
stale lock holders.
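The locking discipline behind the fix can be sketched with ordinary threads. This is an illustrative Python model only: Elem, copy_value_locked, read_value_locked and run are hypothetical names, and the sketch does not model htab, bpf_mem_cache_free() or memory reuse. It shows just that when every writer and reader takes the element's lock, no reader observes a half-written value.

```python
import threading

class Elem:
    """Hypothetical map element: a value guarded by an embedded lock."""
    def __init__(self, size):
        self.lock = threading.Lock()
        self.value = [0] * size

def copy_value_locked(elem, new_value):
    """Write the whole value while holding the element's lock."""
    with elem.lock:
        for i, v in enumerate(new_value):
            elem.value[i] = v

def read_value_locked(elem):
    """Snapshot the whole value while holding the element's lock."""
    with elem.lock:
        return list(elem.value)

def run(iterations=2000, size=8):
    """Race two writers against a reader; return any torn snapshots seen."""
    elem = Elem(size)
    torn = []

    def writer(tag):
        for _ in range(iterations):
            copy_value_locked(elem, [tag] * size)

    def reader():
        for _ in range(iterations):
            snap = read_value_locked(elem)
            if len(set(snap)) != 1:  # mixed tags would mean a torn write
                torn.append(snap)

    threads = [
        threading.Thread(target=writer, args=(1,)),
        threading.Thread(target=writer, args=(2,)),
        threading.Thread(target=reader),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return torn
```

If one of the writers skipped the lock, as the plain copy_map_value() path did, the reader could see a snapshot mixing both tags.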
The i.MX6SX LCDIF is not fully compatible with the i.MX28 LCDIF. The
i.MX6SX controller provides additional overlay registers (AS_CTRL) which
are not present on i.MX28.
Linux has supported the dedicated compatible string since commit 45d59d704080 ("drm: Add new driver for MXSFB controller").
Other known DT users such as U-Boot and Barebox already support
"fsl,imx6sx-lcdif", so removing the fallback compatible string is low risk
since this device is used for display output only.
Fix the following CHECK_DTBS warning:
/arch/arm/boot/dts/nxp/imx/imx6sx-nitrogen6sx.dtb: lcdif@2220000 (fsl,imx6sx-lcdif): compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6sx-lcdif', 'fsl,imx28-lcdif'] is too long
Frank Li [Wed, 11 Feb 2026 21:41:06 +0000 (16:41 -0500)]
ARM: dts: imx25: rename node name tcq to touchscreen
Rename node name tcq to touchscreen to fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx25-karo-tx25.dtb: tscadc@50030000 (fsl,imx25-tsadc): 'tcq@50030400' does not match any of the regexes: '^adc@[0-9a-f]+$', '^pinctrl-[0-9]+$', '^touchscreen@[0-9a-f]+$'
from schema $id: http://devicetree.org/schemas/mfd/fsl,imx25-tsadc.yaml
Ian Ray [Tue, 17 Feb 2026 13:55:17 +0000 (15:55 +0200)]
ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning
Set `phy-mode' on network switch CPU ports to eliminate a warning:
mv88e6085 gpio-0:00: OF node /mdio-gpio/switch@0/ports/port@4 of CPU port 4 lacks the required "phy-mode" property
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Peng Fan [Mon, 2 Mar 2026 15:07:42 +0000 (23:07 +0800)]
ARM: dts: imx7ulp: Add CPU clock and OPP table support
Add missing CPU clock definitions and operating-points-v2 table for the
Cortex-A7 on i.MX7ULP to enable proper CPU frequency scaling and
integration with the cpufreq/OPP frameworks.
Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Max Merchel [Fri, 20 Feb 2026 14:30:02 +0000 (15:30 +0100)]
ARM: dts: imx6qdl-tqma6: add missing labels
Add the missing labels for the temperature sensor and the EEPROM.
In SoM variants A and B, the components are connected to different
I2C buses. These labels are needed to reference them in subsequent
device trees.
Signed-off-by: Max Merchel <Max.Merchel@ew.tq-group.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Wed, 21 Jan 2026 18:04:17 +0000 (13:04 -0500)]
ARM: dts: imx: add required clocks and clock-names for ccm
Add required clocks and clock-names for ccm to fix below CHECK_DTBS
warnings:
arch/arm/boot/dts/nxp/imx/imx6dl-alti6p.dtb: clock-controller@20c4000 (fsl,imx6q-ccm): clock-names:0: 'osc' was expected
from schema $id: http://devicetree.org/schemas/clock/imx6q-clock.yaml#
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Thu, 12 Feb 2026 16:19:50 +0000 (11:19 -0500)]
ARM: dts: imx28-tx28: remove undocumented aliases
Remove undocumented aliases, which are not used in the kernel, to fix
the below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/mxs/imx28-tx28.dtb: aliases: 'lcdif_23bit_pins', 'lcdif_24bit_pins', 'reg_can_xcvr', 'spi_gpio', 'spi_mxs' do not match any of the regexes: '^[a-z][a-z0-9\\-]*$', '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/aliases.yaml
Frank Li [Thu, 12 Feb 2026 16:19:49 +0000 (11:19 -0500)]
ARM: dts: imx28-tx28: rename compatible to "edt,edt-ft5206"
The compatible string "edt,edt-ft5x06" is neither documented nor used.
According to drivers/input/touchscreen/edt-ft5x06.c, ft5206, ft5306 and
ft5406 are compatible.
Use "edt,edt-ft5206" instead, as the datasheet does not specify the
exact touchscreen model.
Remove the undocumented fallback compatible string "mr25h256", as the
SPI core strips the vendor prefix.
Fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/mxs/imx28-sps1.dtb: /apb@80000000/apbh-bus@80000000/spi@80014000/flash@0: failed to match any schema with compatible: ['everspin,mr25h256', 'mr25h256']
Frank Li [Thu, 12 Feb 2026 16:19:47 +0000 (11:19 -0500)]
ARM: dts: imx28: rename gpios-reset to reset-gpios of hx8357
Rename gpios-reset to reset-gpios of hx8357 node to fix below CHECK_DTBS
warnings:
arch/arm/boot/dts/nxp/mxs/imx28-cfa10055.dtb: hx8357@0 (himax,hx8357b): Unevaluated properties are not allowed ('gpios-reset' was unexpected)
Frank Li [Thu, 12 Feb 2026 16:19:46 +0000 (11:19 -0500)]
ARM: dts: imx23/28: add "led-" prefix to LED subnodes
Add the "led-" prefix to LED subnodes to fix the below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/mxs/imx23-olinuxino.dtb: leds (gpio-leds): 'user' does not match any of the regexes: '(^led-[0-9a-f]$|led)', '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/leds/leds-gpio.yaml
Frank Li [Thu, 12 Feb 2026 16:19:45 +0000 (11:19 -0500)]
ARM: dts: imx23: fix interrupt names for dma-controller@80024000
There are duplicate "empty" entries in the interrupt-names property of
the DMA controller. Rename them to "empty<n>" to fix below CHECK_DTBS
warnings.
arch/arm/boot/dts/nxp/mxs/imx23-olinuxino.dtb: dma-controller@80024000 (fsl,imx23-dma-apbx): interrupt-names:15: 'empty5' was expected
Frank Li [Wed, 11 Feb 2026 23:12:57 +0000 (18:12 -0500)]
ARM: dts: imx27: remove fsl,imx-osc26m from fixed-clock node
Remove fsl,imx-osc26m from the fixed-clock node to fix the below
CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx27-apf27.dtb: osc26m (fsl,imx-osc26m): compatible: ['fsl,imx-osc26m', 'fixed-clock'] is too long
from schema $id: http://devicetree.org/schemas/clock/fixed-clock.yaml
Frank Li [Wed, 11 Feb 2026 23:12:56 +0000 (18:12 -0500)]
ARM: dts: imx27-eukrea-cpuimx27: rename uart8250 to serial
Rename node name uart8250 to serial to fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx27-eukrea-mbimxsd27-baseboard.dtb: uart8250@3,200000 (ns8250): $nodename:0: 'uart8250@3,200000' does not match '^serial(@.*)?$'
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Wed, 11 Feb 2026 21:00:03 +0000 (16:00 -0500)]
ARM: dts: imx: remove redundant intermediate node in pinmux hierarchy
Remove the redundant intermediate node between the pinmux and group nodes,
and add the missing "grp" suffix to the group node names.
Fix below CHECK_DTBS warnings:
arm/boot/dts/nxp/imx/imx27-apf27dev.dtb: iomuxc@10015000 (fsl,imx27-iomuxc): Unevaluated properties are not allowed ('imx27-apf27', 'imx27-apf27dev' were unexpected)
from schema $id: http://devicetree.org/schemas/pinctrl/fsl,imx27-iomuxc.yaml
Frank Li [Wed, 11 Feb 2026 21:00:02 +0000 (16:00 -0500)]
ARM: dts: imx: rename iomuxc to pinmux
Rename node name iomuxc to pinmux. Fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx1-apf9328.dtb: iomuxc@21c000 (fsl,imx1-iomuxc): $nodename:0: 'iomuxc@21c000' does not match '^(pinctrl|pinmux)(@[0-9a-f]+)?$'
from schema $id: http://devicetree.org/schemas/pinctrl/fsl,imx27-iomuxc.yaml
Marek Vasut [Mon, 9 Feb 2026 17:07:04 +0000 (18:07 +0100)]
ARM: dts: imx6ull-dhcor: Handle both 1DX and 1YN WiFi on i.MX6ULL DHCOR
The muRata 1DX WiFi/BT chip is mounted on the DHCOM i.MX6ULL. This chip
has been discontinued and replaced by the muRata 1YN chip. The new chip
is a drop-in replacement of the old chip. To support both chips for the
i.MX6ULL DHCOR, drop the more specific compatible string and let the
driver auto-detect the chip type. Currently, there are no known quirks
that would apply only to one or the other chip.
Signed-off-by: Marek Vasut <marex@nabladev.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Mon, 2 Feb 2026 19:43:27 +0000 (14:43 -0500)]
ARM: dts: imx7s-warp: Remove data-lanes and clock-lanes for ov2680
The ov2680 only supports 1 lane, so no additional properties are needed
to describe it. Remove them to fix the below CHECK_DTBS warnings:
camera@36 (ovti,ov2680): port:endpoint: 'clock-lanes', 'data-lanes' do not match any of the regexes: '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/media/i2c/ovti,ov2680.yaml
Frank Li [Mon, 2 Feb 2026 19:43:26 +0000 (14:43 -0500)]
ARM: dts: imx53-smd: Add power supply node for fsl,sgtl5000
Add power supply, #sound-dai-cells and clock nodes for fsl,sgtl5000 to
fix the below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): '#sound-dai-cells' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'clocks' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'VDDA-supply' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'VDDIO-supply' is a required property
David Carlier [Tue, 31 Mar 2026 10:37:44 +0000 (11:37 +0100)]
gpu: nova-core: fix missing colon in SEC2 boot debug message
The SEC2 mailbox debug output formats MBOX1 without a colon separator,
producing "MBOX10xdead" instead of "MBOX1: 0xdead". The GSP debug
message a few lines above uses the correct format.
smb/client: move smb2maperror declarations to smb2proto.h
For `smb2_error_map_table_test` and `smb2_error_map_num`, if their types
are changed in `smb2maperror.c` but the corresponding extern declarations
in `smb2maperror_test.c` are not updated, the compiler will not report an
error. Moving them to a common header file allows the compiler to catch
type mismatches.
Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
smb/client: check if SMB1 DOS/SRV error mapping arrays are sorted
Although the arrays are sorted at build time, verify the ordering again
when cifs.ko is loaded to avoid potential regressions introduced by
future script changes.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:37 +0000 (14:18 +0000)]
smb/client: use binary search for SMB1 DOS/SRV error mapping
Currently, map_smb_to_linux_error() uses linear searches for both
mapping_table_ERRDOS[] and mapping_table_ERRSRV[].
Refactor this by introducing search_mapping_table_ERRDOS() and
search_mapping_table_ERRSRV(), which implement binary search (as the
tables are sorted). This improves lookup performance and reduces code
duplication.
Also remove the sentinel entries from the mapping tables as they are no
longer needed with ARRAY_SIZE().
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
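The lookup strategy can be sketched as follows. This is a minimal Python illustration, not the cifs code: search_mapping_table is a hypothetical name, and the table values are illustrative placeholders rather than entries copied from smberr.h. The length of the table bounds the search, which is why no sentinel entry is required.

```python
import errno

# Illustrative table only: (SMB error code, POSIX errno), sorted by code.
MAPPING_TABLE = [
    (1, -errno.EINVAL),
    (2, -errno.ENOENT),
    (5, -errno.EACCES),
    (80, -errno.EEXIST),
]

def search_mapping_table(table, code):
    """Binary-search a table sorted by code; return the errno or None."""
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        entry_code, posix = table[mid]
        if entry_code == code:
            return posix
        if entry_code < code:
            lo = mid + 1
        else:
            hi = mid - 1
    return None  # not found; the caller chooses the fallback error
```

The search is only correct if the table really is sorted, which is what the build-time generation and the load-time check are meant to guarantee.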
Huiwen He [Thu, 2 Apr 2026 14:18:36 +0000 (14:18 +0000)]
smb/client: autogenerate SMB1 DOS/SRV to POSIX error mapping
Extend the `gen_smb1_mapping` script to support generating sorted POSIX
error mapping tables for both ERRDOS and ERRSRV classes at compile time.
The script parses annotations from smberr.h to generate smb1_err_dos_map.c
and smb1_err_srv_map.c, which are included as the contents of the arrays
mapping_table_ERRDOS[] and mapping_table_ERRSRV[], respectively.
This ensures that the mapping logic remains synchronized with the source
headers and prepares for faster error lookups using binary search in the
future.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:35 +0000 (14:18 +0000)]
smb/client: annotate smberr.h with POSIX error codes
Annotate SMB1 error definitions in smberr.h with their corresponding
POSIX error codes.
To facilitate automated processing and ensure consistent formatting,
existing inline comments (/* ... */) in smberr.h were first moved to
the lines preceding the #define statements.
This provides the source data for generating sorted mapping tables,
allowing the implementation of binary search for faster error mapping
lookups in later commits.
The annotations were performed based on the manual
mapping_table_ERRDOS[] and mapping_table_ERRSRV[] arrays in
smb1maperror.c using the following python script:
import os
import re
# Assumed file locations (not shown in the commit message); adjust as needed.
MAP_FILE = "smb1maperror.c"
SMBERR_FILE = "smberr.h"
def get_mappings():
    mappings = {}
    if not os.path.exists(MAP_FILE):
        return mappings
    with open(MAP_FILE, "r") as f:
        content = f.read()
    for table in ["mapping_table_ERRDOS", "mapping_table_ERRSRV"]:
        pattern = (
            rf'static const struct smb_to_posix_error {table}\[\] = '
            r'\{([\s\S]+?)\};'
        )
        match = re.search(pattern, content)
        if match:
            entry_pattern = (
                r'\{\s*([A-Za-z0-9_]+)\s*,\s*'
                r'(-[A-Z0-9_]+)\s*\}'
            )
            entries = re.findall(entry_pattern, match.group(1))
            for name, posix in entries:
                if name != "0":
                    mappings[name] = posix
    return mappings
def format_comment(comment_lines):
    """
    Formats comment lines to comply with Linux kernel coding style.
    Single-line comments remain on one line.
    Multi-line comments use the standard block format.
    """
    raw_text = []
    for line in comment_lines:
        line = line.strip()
        if line.startswith('/*'):
            line = line[2:]
        if line.endswith('*/'):
            line = line[:-2]
        line = line.lstrip(' *').strip()
        if line:
            raw_text.append(line)
    if not raw_text:
        return []
    # If it's a single line of text, keep it simple
    if len(raw_text) == 1:
        return [f"/* {raw_text[0]} */"]
    # Multi-line: Standard Kernel Block Comment Format
    formatted = ["/*"]
    for text in raw_text:
        formatted.append(f" * {text}")
    formatted.append(" */")
    return formatted
def fix_content(content, mappings):
    lines = content.splitlines()
    new_lines, i = [], 0
    while i < len(lines):
        line = lines[i]
        # Match #define with inline comment
        define_re = (
            r'^(\s*#define\s+([A-Za-z0-9_]+)\s+'
            r'[^\s/]+)\s*/\*'
        )
        match = re.match(define_re, line)
        if match:
            prefix, name = match.group(1), match.group(2)
            # Extract full comment block
            comment_block = [line[line.find('/*'):].strip()]
            if '*/' not in line:
                while i + 1 < len(lines):
                    i += 1
                    comment_block.append(lines[i].strip())
                    if '*/' in lines[i]:
                        break
            # Format and add comment
            new_lines.extend(format_comment(comment_block))
            # Add define with tab-separated POSIX code
            new_define = prefix.rstrip()
            if name in mappings:
                new_define += '\t// ' + mappings[name]
            new_lines.append(new_define)
        else:
            no_comment_re = (
                r'^(\s*#define\s+([A-Za-z0-9_]+)\s+'
                r'[^\s/]+)\s*$'
            )
            match_no_comment = re.match(no_comment_re, line)
            if match_no_comment:
                prefix = match_no_comment.group(1)
                name = match_no_comment.group(2)
                new_define = prefix.rstrip()
                if name in mappings:
                    new_define += '\t// ' + mappings[name]
                new_lines.append(new_define)
            else:
                new_lines.append(line)
        i += 1
    return '\n'.join(new_lines)
if __name__ == "__main__":
    m = get_mappings()
    if os.path.exists(SMBERR_FILE):
        with open(SMBERR_FILE, "r") as f:
            content = f.read()
        fixed = fix_content(content, m)
        with open(SMBERR_FILE, "w") as f:
            f.write(fixed + '\n')
        print(f"Successfully processed {SMBERR_FILE}")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:34 +0000 (14:18 +0000)]
smb/client: move ERRnetlogonNotStarted to DOS error class
In smb1maperror.c, ERRnetlogonNotStarted is included in the
mapping_table_ERRDOS array. However, in the smberr.h header file,
this macro was incorrectly placed under the ERRSRV (server)
error class section.
Move the macro definition to the ERRDOS section in smberr.h to maintain
consistency between the error classification in the header file and its
actual usage in the mapping tables.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
smb/client: check if ntstatus_to_dos_map is sorted
Although the array is sorted at build time, verify the ordering again
when cifs.ko is loaded to avoid potential regressions introduced by
future script changes.
Three functions will be needed to check the sort results, so introduce
the macro DEFINE_CHECK_SORT_FUNC() to reduce duplicated code.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
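The idea of stamping out several sort checkers from one template can be sketched as a small factory. This is a Python analogue under stated assumptions, not the kernel macro: define_check_sort_func is a hypothetical name, and requiring strictly ascending keys (so duplicates are rejected) is an assumption about what "sorted" means here.

```python
def define_check_sort_func(name, key):
    """Build a checker reporting whether a table is strictly ascending by key."""
    def check(table):
        for prev, cur in zip(table, table[1:]):
            if key(prev) >= key(cur):
                return False  # out of order, or a duplicate key
        return True
    check.__name__ = name
    return check

# One checker per table layout; the tuple layouts are illustrative only.
check_ntstatus_sorted = define_check_sort_func(
    "check_ntstatus_sorted", key=lambda entry: entry[0])
check_errdos_sorted = define_check_sort_func(
    "check_errdos_sorted", key=lambda entry: entry[0])
```

Running such a check once at load time is cheap compared with the cost of a binary search silently missing entries in an unsorted table.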
Huiwen He [Thu, 2 Apr 2026 14:18:31 +0000 (14:18 +0000)]
smb/client: use binary search for NT status to DOS mapping
The ntstatus_to_dos_map[] table is sorted now. Replace the linear search
with binary search to improve lookup performance.
Also remove the sentinel entry as it is no longer needed with ARRAY_SIZE().
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:30 +0000 (14:18 +0000)]
smb/client: refactor ntstatus_to_dos() to return mapping entry
Refactor ntstatus_to_dos() to return a pointer to the mapping entry
instead of using output parameters. This allows callers to access all
fields of the entry directly.
In map_smb_to_linux_error(), integrate the printing logic directly
to avoid redundant lookups previously performed by cifs_print_status(),
which is now removed.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:29 +0000 (14:18 +0000)]
smb/client: replace nt_errs with ntstatus_to_dos_map
The ntstatus_to_dos_map[] array now contains the NT error strings,
making the nt_errs[] array redundant.
Introduce `struct ntstatus_to_dos_err` instead of an anonymous struct.
This allows cifs_print_status() to look up error strings directly
from a single table.
Remove nterr.c, as nt_errs[] was its only functional content.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:28 +0000 (14:18 +0000)]
smb/client: autogenerate SMB1 NT status to DOS error mapping
Introduce `gen_smb1_mapping` script to autogenerate the NT status to
DOS error mapping table for SMB1. This script parses nterr.h to
generate smb1_mapping_table.c, which is then directly included as
the content of the ntstatus_to_dos_map[] array at compile time.
The generated array is numerically sorted during the build process to
ensure a consistent structure, providing the necessary groundwork for
future introduction of binary search lookups.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
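How annotated defines could be turned into a numerically sorted table is sketched below. This is not the gen_smb1_mapping script: the function names, the accepted value syntax (hex terms optionally joined by '|') and the emitted entry layout are assumptions for illustration. Only the trailing "// CLASS, CODE" annotation format is taken from the annotation commit in this series.

```python
import re

# Matches: #define NT_STATUS_<NAME> <value expr> // <CLASS>, <CODE>
DEFINE_RE = re.compile(
    r'^#define\s+(NT_STATUS_[A-Z0-9_]+)\s+(.*?)\s*'
    r'//\s*([A-Za-z0-9_]+)\s*,\s*([A-Za-z0-9_]+)\s*$'
)

def parse_value(expr):
    """Evaluate integer terms joined by '|' (assumed value syntax)."""
    value = 0
    for term in expr.strip("() ").split("|"):
        value |= int(term.strip(), 0)
    return value

def generate_entries(header_text):
    """Return table entry strings sorted by numeric NT status value."""
    entries = []
    for line in header_text.splitlines():
        m = DEFINE_RE.match(line.strip())
        if m:
            name, expr, dos_class, dos_code = m.groups()
            entries.append((parse_value(expr), name, dos_class, dos_code))
    entries.sort(key=lambda e: e[0])
    # Assumed output layout: {class, code, NT status name},
    return ["{%s, %s, %s}," % (c, d, n) for _, n, c, d in entries]
```

Sorting by the evaluated value rather than by name is what makes the generated array usable for a later binary search.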
Huiwen He [Thu, 2 Apr 2026 14:18:27 +0000 (14:18 +0000)]
smb/client: annotate nterr.h with DOS error codes
Add comments to NT_STATUS definitions in nterr.h indicating the
corresponding DOS error class and code.
To ensure formatting consistency and facilitate automated processing,
existing human-readable comments in nterr.h were first moved to the
line preceding the #define statements.
This provides the source data for generating sorted mapping tables,
allowing the implementation of binary search for faster error mapping
lookups in later commits.
The mapping data is extracted from the existing manual
ntstatus_to_dos_map[] array in smb1maperror.c using the following
python script:
import os
import re

# Assumed paths: the original definitions of these two constants are not
# shown in this message.
NTERR_FILE = "fs/smb/client/nterr.h"
MAP_FILE = "fs/smb/client/smb1maperror.c"

def move_comments(file_path):
    """
    Moves existing inline comments (/* ... */ or // ...) to
    the preceding line to ensure formatting consistency.
    """
    if not os.path.exists(file_path):
        return
    with open(file_path, "r") as f:
        lines = f.readlines()
    new_lines = []
    # Match #define statements with inline comments
    re_str = r'^(\s*#define\s+[A-Za-z0-9_]+\s+.*?)\s*(/\*.*?\*/|//.*)$'
    pattern = re.compile(re_str)
    for line in lines:
        match = pattern.match(line.rstrip())
        if match:
            define_part, comment_part = match.groups()
            # Do not move if it's already an auto-generated mapping comment
            if re.search(r'//\s*[A-Z0-9_]+\s*,\s*[A-Za-z0-9_]+', comment_part):
                new_lines.append(line)
                continue
            indent = " " * (len(line) - len(line.lstrip()))
            # Move old comment to previous line
            new_lines.append(indent + comment_part + "\n")
            # Keep the define part
            new_lines.append(define_part.rstrip() + "\n")
        else:
            new_lines.append(line)
    with open(file_path, "w") as f:
        f.writelines(new_lines)

def annotate_nterr():
    """
    Extracts DOS error mappings from smb1maperror.c and appends them
    as comments to NT_STATUS defines in nterr.h, ensuring proper alignment.
    """
    mapping = {}
    if not os.path.exists(MAP_FILE) or not os.path.exists(NTERR_FILE):
        return
    # Extract mappings from the source mapping table
    with open(MAP_FILE, "r") as f:
        content = f.read()
    # Strip comments from source to ensure robust parsing
    content = re.sub(r'/\*.*?\*/', '', content, flags=re.DOTALL)
    content = re.sub(r'//.*', '', content)
    # Match [Class], [Code], [NT_STATUS] triplets using regex
    map_re = r'([A-Z0-9_]+)\s*,\s*([A-Za-z0-9_]+)\s*,\s*(NT_STATUS_[A-Z0-9_]+)'
    matches = re.findall(map_re, content)
    for m in matches:
        mapping[m[2]] = (m[0], m[1])
    with open(NTERR_FILE, "r") as f:
        lines = f.readlines()
    new_lines = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#define NT_STATUS_"):
            # Remove any existing // comments before re-annotating
            base_line = re.sub(r'\s*//.*$', '', line.rstrip())
            parts = base_line.split()
            if len(parts) >= 2:
                name = parts[1]
                # Append comment, ensuring proper alignment
                if name == "NT_STATUS_OK":
                    line = f"{base_line}\t// SUCCESS, 0\n"
                elif name in mapping:
                    d_class, d_code = mapping[name]
                    line = f"{base_line}\t// {d_class}, {d_code}\n"
                else:
                    line = f"{base_line}\t// ERRHRD, ERRgeneral\n"
        new_lines.append(line)
    with open(NTERR_FILE, "w") as f:
        f.writelines(new_lines)

if __name__ == "__main__":
    # Step 1: Clean existing inline comments and move them to separate lines
    move_comments(NTERR_FILE)
    # Step 2: Annotate with DOS codes, ensuring proper DOS codes comments
    annotate_nterr()
    print("Successfully processed nterr.h with DOS codes comments.")
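To see what the two passes do without touching any files, the following
self-contained sketch applies the same regular expressions as the script
above to in-memory sample strings. The sample #define line and the sample
mapping entry are made up for illustration and are not taken from the real
nterr.h or smb1maperror.c.

```python
import re

# Hypothetical sample inputs, for illustration only.
define_line = "#define NT_STATUS_FOO 0xC0000001 /* example comment */"
map_source = "{ERRDOS, ERRbadfile, NT_STATUS_FOO},"

# Pass 1: split an inline comment from its #define
# (same pattern as move_comments()).
pattern = re.compile(
    r'^(\s*#define\s+[A-Za-z0-9_]+\s+.*?)\s*(/\*.*?\*/|//.*)$')
define_part, comment_part = pattern.match(define_line).groups()

# Pass 2: collect (class, code) pairs keyed by NT_STATUS name
# (same pattern as annotate_nterr()).
map_re = r'([A-Z0-9_]+)\s*,\s*([A-Za-z0-9_]+)\s*,\s*(NT_STATUS_[A-Z0-9_]+)'
mapping = {m[2]: (m[0], m[1]) for m in re.findall(map_re, map_source)}

# Build the annotated line the way the script does.
d_class, d_code = mapping["NT_STATUS_FOO"]
annotated = f"{define_part}\t// {d_class}, {d_code}"
```

Here define_part keeps only the #define and its value, comment_part holds
the old comment that would move to the preceding line, and annotated carries
the new trailing class and code comment.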
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Fredric Cover [Sun, 29 Mar 2026 01:47:53 +0000 (18:47 -0700)]
fs/smb/client: add verbose error logging for UNC parsing
Add cifs_dbg(VFS, ...) statements to smb3_parse_devname() to provide
explicit feedback when parsing fails. Currently, the function returns
-EINVAL silently, making it difficult to debug mount failures caused
by malformed paths or missing share names.
Signed-off-by: Fredric Cover <FredTheDude@proton.me> Acked-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
parse_probe_arg() accepts quoted immediate strings and passes the body
after the opening quote to __parse_imm_string(). That helper currently
computes strlen(str) and immediately dereferences str[len - 1], which
underflows when the body is empty and not closed with a double quote.
Reject empty non-closed immediate strings before checking for the closing quote.
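The order of the checks can be modelled in a few lines of Python. This is
only a sketch of the control flow, not the kernel code: parse_imm_body() is
a hypothetical name and None stands in for an error return. In C the failure
is an out-of-bounds read at str[len - 1] with len == 0; the closest Python
analogue is indexing into an empty string.

```python
def parse_imm_body(body):
    """Return the string content without the closing quote, or None on error."""
    # Reject an empty body first, so the last-character check below is safe.
    if len(body) == 0:
        return None
    # Only now inspect the final character for the closing quote.
    if body[len(body) - 1] != '"':
        return None
    return body[:-1]
```

With the length check in front, an empty body is rejected before any
character is read.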
Daniel Palmer [Sat, 4 Apr 2026 02:31:08 +0000 (11:31 +0900)]
m68k: Fix task info flags handling for 68000
The logic for deciding what to do after a syscall should be checking
if any of the lower byte bits are set and then checking if the reschedule
bit is set.
Currently we are loading the top word, checking if any bits are set
(which never seems to be true) and thus jumping over loading the
whole long and checking if the reschedule bit is set.
We get the thread info in two places so split that logic out in
a macro and then fix the code so that it loads the byte of the flags
we need to check, checks if anything is set and then checks if
the reschedule bit in particular is set.
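The difference between the old and the new check can be shown with a small
Python model of a 32-bit flags word. The bit position chosen for the
reschedule flag and the function names are assumptions for illustration;
the real code is m68k assembly. On a big-endian machine a word load from
the address of a long reads its most significant 16 bits, which is what the
model calls the top word.

```python
# Assumed bit position, for illustration only.
NEED_RESCHED_BIT = 7
NEED_RESCHED = 1 << NEED_RESCHED_BIT


def old_check(flags):
    """Model of the buggy path: test the top 16 bits of the long first."""
    top_word = (flags >> 16) & 0xFFFF
    if top_word == 0:
        return "no work"  # the reschedule test is skipped entirely
    return "reschedule" if flags & NEED_RESCHED else "other work"


def new_check(flags):
    """Model of the fixed path: test the low byte, then the reschedule bit."""
    low_byte = flags & 0xFF
    if low_byte == 0:
        return "no work"
    return "reschedule" if low_byte & NEED_RESCHED else "other work"
```

In this model a flags value with only the reschedule bit set is ignored by
old_check() but handled by new_check().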
Reported-by: Christoph Plattner <christoph.plattner@gmx.at> Signed-off-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Greg Ungerer <gerg@kernel.org>
Merge tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page()
- Prevent runtime const infrastructure from being used by modules,
similar to what was done for x86
- Avoid problems when shutting down ACPI systems with IOMMUs by adding
a device dependency between IOMMU and devices that use it
- Fix a bug where the CPU pointer masking state isn't properly reset
when tagged addresses aren't enabled for a task
- Fix some incorrect register assignments, and add some missing ones,
in kgdb support code
- Fix compilation of non-kernel code that uses the ptrace uapi header
by replacing BIT() with _BITUL()
- Fix compilation of the validate_v_ptrace kselftest by working around
kselftest macro expansion issues
* tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
ACPI: RIMT: Add dependency between iommu and devices
selftests: riscv: Add braces around EXPECT_EQ()
riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests
riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set
riscv: make runtime const not usable by modules
riscv: patch: Avoid early phys_to_page()
riscv: kgdb: fix several debug register assignment bugs
x86/split_lock: Don't warn about unknown split_lock_detect parameter
The split_lock_detect command line parameter is handled in sld_setup() shortly
after cpu_parse_early_param() but still before parse_early_param().
Add a dummy parsing function so that parse_early_param() doesn't later
complain about the "unknown" parameter split_lock_detect=, and pass it along
to init.
Lance Yang [Wed, 1 Apr 2026 13:10:32 +0000 (21:10 +0800)]
mm: fix deferred split queue races during migration
migrate_folio_move() records the deferred split queue state from src and
replays it on dst. Replaying it after remove_migration_ptes(src, dst, 0)
makes dst visible before it is requeued, so a concurrent rmap-removal path
can mark dst partially mapped and trip the WARN in deferred_split_folio().
Move the requeue before remove_migration_ptes() so dst is back on the
deferred split queue before it becomes visible again.
Because migration still holds dst locked at that point, teach
deferred_split_scan() to requeue a folio when folio_trylock() fails.
Otherwise a fully mapped underused folio can be dequeued by the shrinker
and silently lost from split_queue.
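The requeue-on-trylock-failure behaviour can be sketched with a toy queue in
Python. Everything here (the Folio class, scan(), the locked flag) is a
hypothetical model of the idea and not the kernel data structures.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Folio:
    name: str
    locked: bool = False  # True models a folio still locked by migration


def scan(split_queue):
    """Dequeue every folio; split the unlocked ones, requeue the locked ones."""
    pending = list(split_queue)
    split_queue.clear()
    split = []
    for folio in pending:
        if folio.locked:
            # trylock failed: put it back so it is not lost from the queue
            split_queue.append(folio)
        else:
            split.append(folio.name)
    return split
```

Without the requeue branch, a locked folio would leave the queue during the
scan and never return to it.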
[ziy@nvidia.com: move the comment] Link: https://lkml.kernel.org/r/FB71A764-0F10-4E5A-B4A0-BA4C7F138408@nvidia.com Link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 Link: https://lkml.kernel.org/r/20260401131032.13011-1-lance.yang@linux.dev Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") Signed-off-by: Lance Yang <lance.yang@linux.dev> Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ Suggested-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@kernel.org> Cc: Deepanshu Kartikey <kartikey406@gmail.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nico Pache <npache@redhat.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Usama Arif <usama.arif@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add and use has_deposited_pgtable()
Rather than thread has_deposited through zap_huge_pmd(), make things
clearer by adding has_deposited_pgtable() with comments describing why in
each case.
[ljs@kernel.org: fix folio_put()-before-recheck issue, per Sashiko] Link: https://lkml.kernel.org/r/0a917f80-902f-49b0-a75f-1bbaf23d7f94@lucifer.local Link: https://lkml.kernel.org/r/f9db59ca90937e39913d50ecb4f662e2bad17bbb.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
Now that we have pmd_to_softleaf_folio() available to us, which also raises
a CONFIG_DEBUG_VM warning if it unexpectedly encounters an invalid softleaf
entry, we can abstract folio handling altogether.
vm_normal_folio() deals with the huge zero page (which is present), as well
as PFN map/mixed map mappings, in both cases returning NULL.
Otherwise, we try to obtain the softleaf folio.
This makes the logic far easier to comprehend and has it use the standard
vm_normal_folio_pmd() path for decoding of present entries.
Finally, we have to update the flushing logic to only do so if a folio is
established.
This patch also makes the 'is_present' value more accurate - because PFN
map, mixed map and zero huge pages are present, just not present and
'normal'.
[ljs@kernel.org: avoid bisection hazard] Link: https://lkml.kernel.org/r/d0cc6161-77a4-42ba-a411-96c23c78df1b@lucifer.local Link: https://lkml.kernel.org/r/c2be872d64ef9573b80727d9ab5446cf002f17b5.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Separate pmd_is_valid_softleaf() into separate components, then use the
pmd_is_valid_softleaf() predicate to implement pmd_to_softleaf_folio().
This returns the folio associated with a softleaf entry at PMD level. It
expects this to be valid for a PMD entry.
If CONFIG_DEBUG_VM is set, then assert on this being an invalid entry, and
either way return NULL in this case.
This lays the ground for further refactorings.
Link: https://lkml.kernel.org/r/b677592596274fa3fd701890497948e4b0e07cec.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: separate out the folio part of zap_huge_pmd()
Place the part of the logic that manipulates counters and possibly updates
the accessed bit of the folio into its own function to make zap_huge_pmd()
more readable.
Also rename flush_needed to is_present as we only require a flush for
present entries.
Additionally add comments as to why we're doing what we're doing with
respect to softleaf entries.
This also lays the ground for further refactoring.
Link: https://lkml.kernel.org/r/6c4db67952f5529da4db102a6149b9050b5dda4e.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reduce the repetition, and lay the ground for further refactorings by
keeping this variable separate.
Link: https://lkml.kernel.org/r/98104cde87e4b2aabeb16f236b8731591594457f.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
These checks have been in place since 2014, I think we can safely assume
that we are in a place where we don't need these as runtime checks.
In addition there are 4 other invocations of folio_remove_rmap_pmd(), none
of which make this assertion.
If we need to add this assertion, it should be in folio_remove_rmap_pmd(),
and as a VM_WARN_ON_ONCE(), however these seem superfluous so just remove
them.
Link: https://lkml.kernel.org/r/0c4c5ab247c90f80cf44718e8124b217d6a22544.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rather than having separate logic for each case determining whether to zap
the deposited table, simply track this via a boolean.
We default this to whether the architecture requires it, and update it as
required elsewhere.
Link: https://lkml.kernel.org/r/71f576a1fbcd27a86322d12caa937bcdacf75407.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This has been around since the beginnings of the THP implementation. I
think we can safely assume that, if we have a THP folio, it will have a
head page.
Link: https://lkml.kernel.org/r/f3fa8eb4634ccb2e78209f570cc1a769a02ce93e.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add a common exit path to zap_huge_pmd()
Other than when we fail to acquire the PTL, we always need to unlock the
PTL, and optionally need to flush on exit.
The code is currently very duplicated in this respect, so default
flush_needed to false, set it true in the case in which it's required,
then share the same logic for all exit paths.
This also makes flush_needed make more sense as a function-scope value (we
don't need to flush for the PFN map/mixed map, zero huge, error cases for
instance).
Link: https://lkml.kernel.org/r/6b281d8ed972dff0e89bdcbdd810c96c7ae8c9dc.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
A recent bug I analysed managed to, through a bug in the userfaultfd
implementation, reach an invalid point in the zap_huge_pmd() code where
the PMD was none of:
- A non-DAX, PFN or mixed map.
- The huge zero folio
- A present PMD entry
- A softleaf entry
The code at this point calls folio_test_anon() on a known-NULL folio.
Having the code explicitly dereference NULL like this is hard to
understand, and makes debugging potentially more difficult.
Add an else branch to handle this case and WARN().
No functional change intended.
Link: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/ Link: https://lkml.kernel.org/r/fcf1f6de84a2ace188b6bf103fa15dde695f1ed8.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
There's no need to use the ancient approach of returning an integer here,
just return a boolean.
Also update flush_needed to be a boolean, similarly.
Also add a kdoc comment describing the function.
No functional change intended.
Link: https://lkml.kernel.org/r/132274566cd49d2960a2294c36dd2450593dfc55.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We don't need to have an extra level of indentation, we can simply exit
early in the first two branches.
No functional change intended.
Link: https://lkml.kernel.org/r/6b4d5efdbf5554b8fe788f677d0b50f355eec999.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm/huge_memory: refactor zap_huge_pmd()", v3.
zap_huge_pmd() is overly complicated, clean it up and also add an assert
in the case that we encounter a buggy PMD entry that doesn't match
expectations.
This is motivated by a bug discovered [0] where the PMD entry was none of:
* A non-DAX, PFN or mixed map.
* The huge zero folio
* A present PMD entry
* A softleaf entry
In zap_huge_pmd(), but due to the bug we managed to reach this code.
It is useful to explicitly call this out rather than have an arbitrary
NULL pointer dereference happen, which also improves understanding of
what's going on.
The series goes further to make use of vm_normal_folio_pmd() rather than
implementing custom logic for retrieving the folio, and extends softleaf
functionality to provide and use an equivalent softleaf function.
This patch (of 13):
This function is confused - it overloads the term 'special' yet again,
checks for DAX but in many cases the code explicitly excludes DAX before
invoking the predicate.
It also unnecessarily checks for vma->vm_file - this has to be present for
a driver to have set VMA_MIXEDMAP_BIT or VMA_PFNMAP_BIT.
In fact, a far simpler form of this is to reverse the DAX predicate and
return false if DAX is set.
This makes sense from the point of view of 'special' as in
vm_normal_page(), as DAX actually does potentially have retrievable
folios.
Also there's no need to have this in mm.h so move it to huge_memory.c.
mm: on remap assert that input range within the proposed VMA
Now we have range_in_vma_desc(), update remap_pfn_range_prepare() to check
whether the input range is contained within the specified VMA, so we can
fail at prepare time if an invalid range is specified.
This covers the I/O remap mmap actions also which ultimately call into
this function, and other mmap action types either already span the full
VMA or check this already.
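The containment test itself is simple. A minimal Python sketch, assuming
half-open [start, end) ranges as used for VMAs; the function is named after
the range_is_subset() helper mentioned in this series, but its signature and
parameter names here are assumptions:

```python
def range_is_subset(outer_start, outer_end, start, end):
    """True if [start, end) lies entirely inside [outer_start, outer_end)."""
    return outer_start <= start and end <= outer_end
```

A remap request whose end address runs past the end of the VMA fails this
test, so it can be refused at prepare time instead of later.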
Link: https://lkml.kernel.org/r/0fc1092f4b74f3f673a58e4e3942dc83f336dd85.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A user can invoke mmap_action_map_kernel_pages() to specify that the
mapping should map a given number of kernel pages, supplied in an array,
starting from desc->start.
In order to implement this, adjust mmap_action_prepare() to be able to
return an error code, as it makes sense to assert that the specified
parameters are valid as quickly as possible as well as updating the VMA
flags to include VMA_MIXEDMAP_BIT as necessary.
This provides an mmap_prepare equivalent of vm_insert_pages(). We
additionally update the existing vm_insert_pages() code to use
range_in_vma() and add a new range_in_vma_desc() helper function for the
mmap_prepare case, sharing the code between the two in range_is_subset().
We add both mmap_action_map_kernel_pages() and
mmap_action_map_kernel_pages_full() to allow for both partial and full VMA
mappings.
We update the documentation to reflect the new features.
Finally, we update the VMA tests accordingly to reflect the changes.
Link: https://lkml.kernel.org/r/926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
uio: replace deprecated mmap hook with mmap_prepare in uio_info
The f_op->mmap interface is deprecated, so update uio_info to use its
successor, mmap_prepare.
Therefore, replace the uio_info->mmap hook with a new
uio_info->mmap_prepare hook, and update its one user, target_core_user,
to both specify this new mmap_prepare hook and also to use the new
vm_ops->mapped() hook to continue to maintain a correct udev->kref
refcount.
Then update uio_mmap() to utilise the mmap_prepare compatibility layer to
invoke this callback from the uio mmap invocation.
Link: https://lkml.kernel.org/r/157583e4477705b496896c7acd4ac88a937b8fa6.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
The f_op->mmap interface is deprecated, so update the vmbus driver to use
its successor, mmap_prepare.
This updates all callbacks which referenced the function pointer
hv_mmap_ring_buffer to instead reference hv_mmap_prepare_ring_buffer,
utilising the newly introduced compat_set_desc_from_vma() and
__compat_vma_mmap() to be able to implement this change.
The UIO HV generic driver is the only user of hv_create_ring_sysfs(),
which is the only function which references
vmbus_channel->mmap_prepare_ring_buffer which, in turn, is the only
external interface to hv_mmap_prepare_ring_buffer.
This patch therefore updates this caller to use mmap_prepare instead. The
caller previously used vm_iomap_memory(), so this change replaces it with
its mmap_prepare equivalent, mmap_action_simple_ioremap().
[akpm@linux-foundation.org: restore struct vmbus_channel comment, per Michael Kelley] Link: https://lkml.kernel.org/r/05467cb62267d750e5c770147517d4df0246cda6.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: allow handling of stacked mmap_prepare hooks in more drivers
While the conversion of mmap hooks to mmap_prepare is underway, we will
encounter situations where mmap hooks need to invoke nested mmap_prepare
hooks.
The nesting of mmap hooks is termed 'stacking'. In order to flexibly
facilitate the conversion of custom mmap hooks in drivers which stack, we
must split up the existing __compat_vma_mmap() function into two separate
functions:
* compat_set_desc_from_vma() - This allows the setting of a vm_area_desc
object's fields to the relevant fields of a VMA.
* __compat_vma_mmap() - Once an mmap_prepare hook has been executed upon a
vm_area_desc object, this function performs any mmap actions specified by
the mmap_prepare hook and then invokes its vm_ops->mapped() hook if one
was specified.
In ordinary cases, where a file's f_op->mmap_prepare() hook simply needs
to be invoked in a stacked mmap() hook, compat_vma_mmap() can be used.
However, some drivers define their own nested hooks, which are invoked in
turn by another hook.
A concrete example is vmbus_channel->mmap_ring_buffer(), which is invoked
in turn by bin_attribute->mmap().
vmbus_channel->mmap_ring_buffer() has a signature of:
int (*mmap_ring_buffer)(struct vmbus_channel *channel,
struct vm_area_struct *vma);
And so compat_vma_mmap() cannot be used here for incremental conversion of
hooks from mmap() to mmap_prepare().
There are many instances like this, where conversion to mmap_prepare
would otherwise cascade to a huge change set due to nesting of this kind.
The changes in this patch mean we could now instead convert
vmbus_channel->mmap_ring_buffer() to
vmbus_channel->mmap_prepare_ring_buffer(), and invoke it from the existing
mmap() hook using the helpers described above.
This allows us to incrementally update this logic, and other logic like it.
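As a rough illustration of the stacking pattern described above, the
following is a userspace toy model, not kernel code. Every toy_ name, the
struct fields, and the flag value are hypothetical stand-ins chosen for this
sketch; the real compat_set_desc_from_vma() and __compat_vma_mmap()
signatures are not given in this message and are not reproduced here. It
only shows the shape of the flow: fill a descriptor from the VMA, run the
nested prepare-style hook on the descriptor, then apply the result.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; field names are illustrative only. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
	int mapped_calls;	/* counts runs of the toy "mapped" step */
};

struct toy_desc {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
	int want_mapped;	/* set by a prepare hook to request "mapped" */
};

struct toy_channel {
	/* Hypothetical converted hook: takes a descriptor, not a VMA. */
	int (*mmap_prepare_ring_buffer)(struct toy_channel *channel,
					struct toy_desc *desc);
};

/* Step 1 (assumed role): copy the relevant VMA fields into the descriptor. */
void toy_set_desc_from_vma(struct toy_desc *desc, const struct toy_vma *vma)
{
	assert(desc != NULL && vma != NULL);
	desc->start = vma->start;
	desc->end = vma->end;
	desc->flags = vma->flags;
	desc->want_mapped = 0;
}

/* Step 2 (assumed role): apply what the prepare hook requested to the VMA. */
int toy_apply_desc(const struct toy_desc *desc, struct toy_vma *vma)
{
	assert(desc != NULL && vma != NULL);
	vma->flags = desc->flags;
	if (desc->want_mapped)
		vma->mapped_calls++;
	return 0;
}

/* Example nested hook: sets one flag bit and requests the "mapped" step. */
int toy_ring_prepare(struct toy_channel *channel, struct toy_desc *desc)
{
	(void)channel;
	desc->flags |= 0x4UL;
	desc->want_mapped = 1;
	return 0;
}

/* Outer hook, still VMA-based, that stacks the nested prepare hook. */
int toy_outer_mmap(struct toy_channel *channel, struct toy_vma *vma)
{
	struct toy_desc desc;
	int err;

	toy_set_desc_from_vma(&desc, vma);
	err = channel->mmap_prepare_ring_buffer(channel, &desc);
	if (err)
		return err;	/* nothing is applied if the nested hook fails */
	return toy_apply_desc(&desc, vma);
}
```

In this model the outer hook keeps its VMA-based signature while the nested
hook only ever sees the descriptor, which is what would let the two be
converted independently.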
Unfortunately, as part of this change, we need to be able to flexibly
assign to the VMA descriptor, so we have to remove some of the const
declarations within the structure.
Also update the VMA tests to reflect the changes.
Link: https://lkml.kernel.org/r/24aac3019dd34740e788d169fccbe3c62781e648.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mtdchar: replace deprecated mmap hook with mmap_prepare, clean up
Replace the deprecated mmap callback with mmap_prepare.
Commit f5cf8f07423b ("mtd: Disable mtdchar mmap on MMU systems") commented
out the CONFIG_MMU part of this function back in 2012, so after ~14 years
it's probably reasonable to remove this altogether rather than updating
dead code.
Link: https://lkml.kernel.org/r/d036855c21962c58ace0eb24ecd6d973d77424fe.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>