selftests/bpf: Add torn write detection test for htab BPF_F_LOCK
Add a consistency subtest to htab_reuse that detects torn writes
caused by the BPF_F_LOCK lockless update racing with element
reallocation in alloc_htab_elem().
The test uses three thread roles started simultaneously via a pipe:
- locked updaters: BPF_F_LOCK|BPF_EXIST in-place updates
- delete+update workers: delete then BPF_ANY|BPF_F_LOCK insert
- locked readers: BPF_F_LOCK lookup checking value consistency
bpf: Use copy_map_value_locked() in alloc_htab_elem() for BPF_F_LOCK
When a BPF_F_LOCK update races with a concurrent delete, the freed
element can be immediately recycled by alloc_htab_elem(). The fast path
in htab_map_update_elem() performs a lockless lookup and then calls
copy_map_value_locked() under the element's spin_lock. If
alloc_htab_elem() recycles the same memory, it overwrites the value
with plain copy_map_value(), without taking the spin_lock, causing
torn writes.
Use copy_map_value_locked() when BPF_F_LOCK is set so the new element's
value is written under the embedded spin_lock, serializing against any
stale lock holders.
The i.MX6SX LCDIF is not fully compatible with the i.MX28 LCDIF. The
i.MX6SX controller provides additional overlay registers (AS_CTRL) which
are not present on i.MX28.
Linux has supported the dedicated compatible string since commit 45d59d704080 ("drm: Add new driver for MXSFB controller").
Other known DT users such as U-Boot and Barebox already support
"fsl,imx6sx-lcdif", so removing the fallback compatible string is low risk
since this device is used for display output only.
Fix the following CHECK_DTB warning:
/arch/arm/boot/dts/nxp/imx/imx6sx-nitrogen6sx.dtb: lcdif@2220000 (fsl,imx6sx-lcdif): compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6sx-lcdif', 'fsl,imx28-lcdif'] is too long
Frank Li [Wed, 11 Feb 2026 21:41:06 +0000 (16:41 -0500)]
ARM: dts: imx25: rename node name tcq to touchscreen
Rename node name tcq to touchscreen to fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx25-karo-tx25.dtb: tscadc@50030000 (fsl,imx25-tsadc): 'tcq@50030400' does not match any of the regexes: '^adc@[0-9a-f]+$', '^pinctrl-[0-9]+$', '^touchscreen@[0-9a-f]+$'
from schema $id: http://devicetree.org/schemas/mfd/fsl,imx25-tsadc.yaml
Ian Ray [Tue, 17 Feb 2026 13:55:17 +0000 (15:55 +0200)]
ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning
Set `phy-mode' on network switch CPU ports to eliminate a warning:
mv88e6085 gpio-0:00: OF node /mdio-gpio/switch@0/ports/port@4 of CPU port 4 lacks the required "phy-mode" property
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Peng Fan [Mon, 2 Mar 2026 15:07:42 +0000 (23:07 +0800)]
ARM: dts: imx7ulp: Add CPU clock and OPP table support
Add missing CPU clock definitions and operating-points-v2 table for the
Cortex-A7 on i.MX7ULP to enable proper CPU frequency scaling and
integration with the cpufreq/OPP frameworks.
Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Max Merchel [Fri, 20 Feb 2026 14:30:02 +0000 (15:30 +0100)]
ARM: dts: imx6qdl-tqma6: add missing labels
Add the missing labels for the temperature sensor and the EEPROM.
In SoM variants A and B, the components are connected to different
I2C buses. These labels are needed to reference them in subsequent
device trees.
Signed-off-by: Max Merchel <Max.Merchel@ew.tq-group.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Wed, 21 Jan 2026 18:04:17 +0000 (13:04 -0500)]
ARM: dts: imx: add required clocks and clock-names for ccm
Add required clocks and clock-names for ccm to fix below CHECK_DTBS
warnings:
arch/arm/boot/dts/nxp/imx/imx6dl-alti6p.dtb: clock-controller@20c4000 (fsl,imx6q-ccm): clock-names:0: 'osc' was expected
from schema $id: http://devicetree.org/schemas/clock/imx6q-clock.yaml#
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Thu, 12 Feb 2026 16:19:50 +0000 (11:19 -0500)]
ARM: dts: imx28-tx28: remove undocumented aliases
Remove undocumented aliases, which is not used in kernel to fix
CHECK_DTBS warnings.
arch/arm/boot/dts/nxp/mxs/imx28-tx28.dtb: aliases: 'lcdif_23bit_pins', 'lcdif_24bit_pins', 'reg_can_xcvr', 'spi_gpio', 'spi_mxs' do not match any of the regexes: '^[a-z][a-z0-9\\-]*$', '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/aliases.yaml
Frank Li [Thu, 12 Feb 2026 16:19:49 +0000 (11:19 -0500)]
ARM: dts: imx28-tx28: rename compatible to "edt,edt-ft5206"
The compatible string "edt,edt-ft5x06" is neither documented nor used.
According to drivers/input/touchscreen/edt-ft5x06.c, ft5206, ft5306 and
ft5406 are compatible.
Use "edt,edt-ft5206" instead, as the datasheet does not specify the
exact touchscreen model.
Remove the undocumented fallback compatible string "mr25h256", as the
SPI core strips the vendor prefix.
Fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/mxs/imx28-sps1.dtb: /apb@80000000/apbh-bus@80000000/spi@80014000/flash@0: failed to match any schema with compatible: ['everspin,mr25h256', 'mr25h256']
Frank Li [Thu, 12 Feb 2026 16:19:47 +0000 (11:19 -0500)]
ARM: dts: imx28: rename gpios-reset to reset-gpios of hx8357
Rename gpios-reset to reset-gpios of hx8357 node to fix below CHECK_DTBS
warnings:
arch/arm/boot/dts/nxp/mxs/imx28-cfa10055.dtb: hx8357@0 (himax,hx8357b): Unevaluated properties are not allowed ('gpios-reset' was unexpected)
Frank Li [Thu, 12 Feb 2026 16:19:46 +0000 (11:19 -0500)]
ARM: dts: imx23/28: add "led-" prefix to LED subnodes
Add the "led-" prefix to LED subnodes to fix the below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/mxs/imx23-olinuxino.dtb: leds (gpio-leds): 'user' does not match any of the regexes: '(^led-[0-9a-f]$|led)', '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/leds/leds-gpio.yaml
Frank Li [Thu, 12 Feb 2026 16:19:45 +0000 (11:19 -0500)]
ARM: dts: imx23: fix interrupt names for dma-controller@80024000
There are duplicate "empty" entries in the interrupt-names property of
the DMA controller. Rename them to "empty<n>" to fix below CHECK_DTBS
warnings.
arch/arm/boot/dts/nxp/mxs/imx23-olinuxino.dtb: dma-controller@80024000 (fsl,imx23-dma-apbx): interrupt-names:15: 'empty5' was expected
Frank Li [Wed, 11 Feb 2026 23:12:57 +0000 (18:12 -0500)]
ARM: dts: imx27: remove fsl,imx-osc26m from fixed-clock node
Remove fsl,imx-osc26m from fixed-clock node to fix below CHECK_DTB
warnings:
arch/arm/boot/dts/nxp/imx/imx27-apf27.dtb: osc26m (fsl,imx-osc26m): compatible: ['fsl,imx-osc26m', 'fixed-clock'] is too long
from schema $id: http://devicetree.org/schemas/clock/fixed-clock.yaml
Frank Li [Wed, 11 Feb 2026 23:12:56 +0000 (18:12 -0500)]
ARM: dts: imx27-eukrea-cpuimx27: rename uart8250 to serial
Rename node name uart8250 to serial to fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx27-eukrea-mbimxsd27-baseboard.dtb: uart8250@3,200000 (ns8250): $nodename:0: 'uart8250@3,200000' does not match '^serial(@.*)?$'
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Wed, 11 Feb 2026 21:00:03 +0000 (16:00 -0500)]
ARM: dts: imx: remove redundant intermediate node in pinmux hierarchy
Remove the redundant intermediate node between the pinmux and group nodes,
and add the missing "grp" suffix to the group node names.
Fix below CHECK_DTBS warnings:
arm/boot/dts/nxp/imx/imx27-apf27dev.dtb: iomuxc@10015000 (fsl,imx27-iomuxc): Unevaluated properties are not allowed ('imx27-apf27', 'imx27-apf27dev' were unexpected)
from schema $id: http://devicetree.org/schemas/pinctrl/fsl,imx27-iomuxc.yaml
Frank Li [Wed, 11 Feb 2026 21:00:02 +0000 (16:00 -0500)]
ARM: dts: imx: rename iomuxc to pinmux
Rename node name iomuxc to pinmux. Fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx1-apf9328.dtb: iomuxc@21c000 (fsl,imx1-iomuxc): $nodename:0: 'iomuxc@21c000' does not match '^(pinctrl|pinmux)(@[0-9a-f]+)?$'
from schema $id: http://devicetree.org/schemas/pinctrl/fsl,imx27-iomuxc.yaml
Marek Vasut [Mon, 9 Feb 2026 17:07:04 +0000 (18:07 +0100)]
ARM: dts: imx6ull-dhcor: Handle both 1DX and 1YN WiFi on i.MX6ULL DHCOR
The muRata 1DX WiFi/BT chip is mounted on the DHCOM i.MX6ULL. This chip
has been discontinued and replaced by the muRata 1YN chip. The new chip
is a drop-in replacement of the old chip. To support both chips for the
i.MX6ULL DHCOR, drop the more specific compatible string and let the
driver auto-detect the chip type. Currently, there are no known quirks
that would apply only to one or the other chip.
Signed-off-by: Marek Vasut <marex@nabladev.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Mon, 2 Feb 2026 19:43:27 +0000 (14:43 -0500)]
ARM: dts: imx7s-warp: Remove data-lanes and clock-lanes for ov2680
The ov2680 only support 1 lane. Needn't additional property to descript it.
Remove it to fix below DTB_CHECK warnings:
camera@36 (ovti,ov2680): port:endpoint: 'clock-lanes', 'data-lanes' do not match any of the regexes: '^pinctrl-[0-9]+$'
from schema $id: http://devicetree.org/schemas/media/i2c/ovti,ov2680.yaml
Frank Li [Mon, 2 Feb 2026 19:43:26 +0000 (14:43 -0500)]
ARM: dts: imx53-smd: Add power supply node for fsl,sgtl5000
Add power supply, #sound-dai-cells and clock nodes for fsl,sgtl5000 to
fix below CHECK_DTB warnings:
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): '#sound-dai-cells' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'clocks' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'VDDA-supply' is a required property
from schema $id: http://devicetree.org/schemas/sound/fsl,sgtl5000.yaml#
arch/arm/boot/dts/nxp/imx/imx53-smd.dtb: sgtl5000@a (fsl,sgtl5000): 'VDDIO-supply' is a required property
David Carlier [Tue, 31 Mar 2026 10:37:44 +0000 (11:37 +0100)]
gpu: nova-core: fix missing colon in SEC2 boot debug message
The SEC2 mailbox debug output formats MBOX1 without a colon separator,
producing "MBOX10xdead" instead of "MBOX1: 0xdead". The GSP debug
message a few lines above uses the correct format.
smb/client: move smb2maperror declarations to smb2proto.h
For `smb2_error_map_table_test` and `smb2_error_map_num`, if their types
are changed in `smb2maperror.c` but the corresponding extern declarations
in `smb2maperror_test.c` are not updated, the compiler will not report an
error. Moving them to a common header file allows the compiler to catch
type mismatches.
Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
smb/client: check if SMB1 DOS/SRV error mapping arrays are sorted
Although the arrays are sorted at build time, verify the ordering again
when cifs.ko is loaded to avoid potential regressions introduced by
future script changes.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:37 +0000 (14:18 +0000)]
smb/client: use binary search for SMB1 DOS/SRV error mapping
Currently, map_smb_to_linux_error() uses linear searches for both
mapping_table_ERRDOS[] and mapping_table_ERRSRV[].
Refactor this by introducing search_mapping_table_ERRDOS() and
search_mapping_table_ERRSRV() that implements binary search(as the tables
are sorted).This improves lookup performance and reduces code duplication.
Also remove the sentinel entries from the mapping tables as they are no
longer needed with ARRAY_SIZE().
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:36 +0000 (14:18 +0000)]
smb/client: autogenerate SMB1 DOS/SRV to POSIX error mapping
Extend the `gen_smb1_mapping` script to support generating sorted POSIX
error mapping tables for both ERRDOS and ERRSRV classes at compile time.
The script parses annotations from smberr.h to generate smb1_err_dos_map.c
and smb1_err_srv_map.c, which are included as the contents of the arrays
mapping_table_ERRDOS[] and mapping_table_ERRSRV[], respectively.
This ensures that the mapping logic remains synchronized with the source
headers and prepares for faster error lookups using binary search in the
future.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:35 +0000 (14:18 +0000)]
smb/client: annotate smberr.h with POSIX error codes
Annotate SMB1 error definitions in smberr.h with their corresponding
POSIX error codes.
To facilitate automated processing and ensure consistent formatting,
existing inline comments (/* ... */) in smberr.h were first moved to
the lines preceding the #define statements.
This provides the source data for generating sorted mapping tables,
allowing the implementation of binary search for faster error mapping
lookups in later commits.
The annotations were performed based on the manual
mapping_table_ERRDOS[] and mapping_table_ERRSRV[] arrays in
smb1maperror.c using the following python script:
def get_mappings():
mappings = {}
if not os.path.exists(MAP_FILE):
return mappings
with open(MAP_FILE, "r") as f:
content = f.read()
for table in ["mapping_table_ERRDOS", "mapping_table_ERRSRV"]:
pattern = (
rf'static const struct smb_to_posix_error {table}\[\] = '
r'\{([\s\S]+?)\};'
)
match = re.search(pattern, content)
if match:
entry_pattern = (
r'\{\s*([A-Za-z0-9_]+)\s*,\s*'
r'(-[A-Z0-9_]+)\s*\}'
)
entries = re.findall(entry_pattern, match.group(1))
for name, posix in entries:
if name != "0":
mappings[name] = posix
return mappings
def format_comment(comment_lines):
"""
Formats comment lines to comply with Linux kernel coding style.
Single-line comments remain on one line.
Multi-line comments use the standard block format.
"""
raw_text = []
for line in comment_lines:
line = line.strip()
if line.startswith('/*'):
line = line[2:]
if line.endswith('*/'):
line = line[:-2]
line = line.lstrip(' *').strip()
if line:
raw_text.append(line)
if not raw_text:
return []
# If it's a single line of text, keep it simple
if len(raw_text) == 1:
return [f"/* {raw_text[0]} */"]
# Multi-line: Standard Kernel Block Comment Format
formatted = ["/*"]
for text in raw_text:
formatted.append(f" * {text}")
formatted.append(" */")
return formatted
def fix_content(content, mappings):
lines = content.splitlines()
new_lines, i = [], 0
while i < len(lines):
line = lines[i]
# Match #define with inline comment
define_re = (
r'^(\s*#define\s+([A-Za-z0-9_]+)\s+'
r'[^\s/]+)\s*/\*'
)
match = re.match(define_re, line)
if match:
prefix, name = match.group(1), match.group(2)
# Extract full comment block
comment_block = [line[line.find('/*'):].strip()]
if '*/' not in line:
while i + 1 < len(lines):
i += 1
comment_block.append(lines[i].strip())
if '*/' in lines[i]:
break
# Format and add comment
new_lines.extend(format_comment(comment_block))
# Add define with tab-separated POSIX code
new_define = prefix.rstrip()
if name in mappings:
new_define += '\t// ' + mappings[name]
new_lines.append(new_define)
else:
no_comment_re = (
r'^(\s*#define\s+([A-Za-z0-9_]+)\s+'
r'[^\s/]+)\s*$'
)
match_no_comment = re.match(no_comment_re, line)
if match_no_comment:
prefix = match_no_comment.group(1)
name = match_no_comment.group(2)
new_define = prefix.rstrip()
if name in mappings:
new_define += '\t// ' + mappings[name]
new_lines.append(new_define)
else:
new_lines.append(line)
i += 1
return '\n'.join(new_lines)
if __name__ == "__main__":
m = get_mappings()
if os.path.exists(SMBERR_FILE):
with open(SMBERR_FILE, "r") as f:
content = f.read()
fixed = fix_content(content, m)
with open(SMBERR_FILE, "w") as f:
f.write(fixed + '\n')
print(f"Successfully processed {SMBERR_FILE}")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:34 +0000 (14:18 +0000)]
smb/client: move ERRnetlogonNotStarted to DOS error class
In smb1maperror.c, ERRnetlogonNotStarted is included in the
mapping_table_ERRDOS array. However, in the smberr.h header file,
this macro was incorrectly placed under the ERRSRV (server)
error class section.
Move the macro definition to the ERRDOS section in smberr.h to maintain
consistency between the error classification in the header file and its
actual usage in the mapping tables.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
smb/client: check if ntstatus_to_dos_map is sorted
Although the array is sorted at build time, verify the ordering again
when cifs.ko is loaded to avoid potential regressions introduced by
future script changes.
We are going to define 3 functions to check the sort results, introduce the
macro DEFINE_CHECK_SORT_FUNC() to reduce duplicate code.
Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:31 +0000 (14:18 +0000)]
smb/client: use binary search for NT status to DOS mapping
The ntstatus_to_dos_map[] table is sorted now. Replace the linear search
with binary search to improve lookup performance.
Also remove the sentinel entry as it is no longer needed with ARRAY_SIZE().
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:30 +0000 (14:18 +0000)]
smb/client: refactor ntstatus_to_dos() to return mapping entry
Refactor ntstatus_to_dos() to return a pointer to the mapping entry
instead of using output parameters. This allows callers to access all
fields of the entry directly.
In map_smb_to_linux_error(), integrate the printing logic directly
to avoid redundant lookups previously performed by cifs_print_status(),
which is now removed.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:29 +0000 (14:18 +0000)]
smb/client: replace nt_errs with ntstatus_to_dos_map
The ntstatus_to_dos_map[] array now contains the NT error strings,
making the nt_errs[] array redundant.
Introduce `struct ntstatus_to_dos_err` instead of an anonymous struct.
This allows cifs_print_status() to look up error strings directly
from a single table.
Remove nterr.c, as nt_errs[] was its only functional content.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:28 +0000 (14:18 +0000)]
smb/client: autogenerate SMB1 NT status to DOS error mapping
Introduce `gen_smb1_mapping` script to autogenerate the NT status to
DOS error mapping table for SMB1. This script parses nterr.h to
generate smb1_mapping_table.c, which is then directly included as
the content of the ntstatus_to_dos_map[] array at compile time.
The generated array is numerically sorted during the build process to
ensure a consistent structure, providing the necessary groundwork for
future introduction of binary search lookups.
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Huiwen He [Thu, 2 Apr 2026 14:18:27 +0000 (14:18 +0000)]
smb/client: annotate nterr.h with DOS error codes
Add comments to NT_STATUS definitions in nterr.h indicating the
corresponding DOS error class and code.
To ensure formatting consistency and facilitate automated processing,
existing human-readable comments in nterr.h were first moved to the
line preceding the #define statements.
This provides the source data for generating sorted mapping tables,
allowing the implementation of binary search for faster error mapping
lookups in later commits.
The mapping data is extracted from the existing manual
ntstatus_to_dos_map[] array in smb1maperror.c using the following
python script:
def move_comments(file_path):
"""
Moves existing inline comments (/* ... */ or // ...) to
the preceding line to ensure formatting consistency.
"""
if not os.path.exists(file_path):
return
with open(file_path, "r") as f:
lines = f.readlines()
new_lines = []
# Match #define statements with inline comments
re_str = r'^(\s*#define\s+[A-Za-z0-9_]+\s+.*?)\s*(/\*.*?\*/|//.*)$'
pattern = re.compile(re_str)
for line in lines:
match = pattern.match(line.rstrip())
if match:
define_part, comment_part = match.groups()
# Do not move if it's already an auto-generated mapping comment
if re.search(r'//\s*[A-Z0-9_]+\s*,\s*[A-Za-z0-9_]+', comment_part):
new_lines.append(line)
continue
indent = " " * (len(line) - len(line.lstrip()))
# Move old comment to previous line
new_lines.append(indent + comment_part + "\n")
# Keep the define part
new_lines.append(define_part.rstrip() + "\n")
else:
new_lines.append(line)
with open(file_path, "w") as f:
f.writelines(new_lines)
def annotate_nterr():
"""
Extracts DOS error mappings from smb1maperror.c and appends them
as comments to NT_STATUS defines in nterr.h, ensuring proper alignment.
"""
mapping = {}
if not os.path.exists(MAP_FILE) or not os.path.exists(NTERR_FILE):
return
# Extract mappings from the source mapping table
with open(MAP_FILE, "r") as f:
content = f.read()
# Strip comments from source to ensure robust parsing
content = re.sub(r'/\*.*?\*/', '', content, flags=re.DOTALL)
content = re.sub(r'//.*', '', content)
# Match [Class], [Code], [NT_STATUS] triplets using regex
map_re = r'([A-Z0-9_]+)\s*,\s*([A-Za-z0-9_]+)\s*,\s*(NT_STATUS_[A-Z0-9_]+)'
matches = re.findall(map_re, content)
for m in matches:
mapping[m[2]] = (m[0], m[1])
with open(NTERR_FILE, "r") as f:
lines = f.readlines()
new_lines = []
for line in lines:
stripped = line.strip()
if stripped.startswith("#define NT_STATUS_"):
# Remove any existing // comments before re-annotating
base_line = re.sub(r'\s*//.*$', '', line.rstrip())
parts = base_line.split()
if len(parts) >= 2:
name = parts[1]
# Append comment, ensuring proper alignment
if name == "NT_STATUS_OK":
line = f"{base_line}\t// SUCCESS, 0\n"
elif name in mapping:
d_class, d_code = mapping[name]
line = f"{base_line}\t// {d_class}, {d_code}\n"
else:
line = f"{base_line}\t// ERRHRD, ERRgeneral\n"
new_lines.append(line)
with open(NTERR_FILE, "w") as f:
f.writelines(new_lines)
if __name__ == "__main__":
# Step 1: Clean existing inline comments and move them to separate lines
move_comments(NTERR_FILE)
# Step 2: Annotate with DOS codes, ensuring proper DOS codes comments
annotate_nterr()
print("Successfully processed nterr.h with DOS codes comments.")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
Fredric Cover [Sun, 29 Mar 2026 01:47:53 +0000 (18:47 -0700)]
fs/smb/client: add verbose error logging for UNC parsing
Add cifs_dbg(VFS, ...) statements to smb3_parse_devname() to provide
explicit feedback when parsing fails. Currently, the function returns
-EINVAL silently, making it difficult to debug mount failures caused
by malformed paths or missing share names.
Signed-off-by: Fredric Cover <FredTheDude@proton.me> Acked-by: Henrique Carvalho <[2]henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
parse_probe_arg() accepts quoted immediate strings and passes the body
after the opening quote to __parse_imm_string(). That helper currently
computes strlen(str) and immediately dereferences str[len - 1], which
underflows when the body is empty and not closed with double-quotation.
Reject empty non-closed immediate strings before checking for the closing quote.
Daniel Palmer [Sat, 4 Apr 2026 02:31:08 +0000 (11:31 +0900)]
m68k: Fix task info flags handling for 68000
The logic for deciding what to do after a syscall should be checking
if any of the lower byte bits are set and then checking if the reschedule
bit is set.
Currently we are loading the top word, checking if any bits are set
(which never seems to be true) and thus jumping over loading the
whole long and checking if the reschedule bit is set.
We get the thread info in two places so split that logic out in
a macro and then fix the code so that it loads the byte of the flags
we need to check, checks if anything is set and then checks if
the reschedule bit in particular is set.
Reported-by: Christoph Plattner <christoph.plattner@gmx.at> Signed-off-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Greg Ungerer <gerg@kernel.org>
Merge tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page()
- Prevent runtime const infrastructure from being used by modules,
similar to what was done for x86
- Avoid problems when shutting down ACPI systems with IOMMUs by adding
a device dependency between IOMMU and devices that use it
- Fix a bug where the CPU pointer masking state isn't properly reset
when tagged addresses aren't enabled for a task
- Fix some incorrect register assignments, and add some missing ones,
in kgdb support code
- Fix compilation of non-kernel code that uses the ptrace uapi header
by replacing BIT() with _BITUL()
- Fix compilation of the validate_v_ptrace kselftest by working around
kselftest macro expansion issues
* tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
ACPI: RIMT: Add dependency between iommu and devices
selftests: riscv: Add braces around EXPECT_EQ()
riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests
riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set
riscv: make runtime const not usable by modules
riscv: patch: Avoid early phys_to_page()
riscv: kgdb: fix several debug register assignment bugs
x86/split_lock: Don't warn about unknown split_lock_detect parameter
The split_lock_detect command line parameter is handled in sld_setup() shortly
after cpu_parse_early_param() but still before parse_early_param().
Add a dummy parsing function so that parse_early_param() doesn't later
complain about the "unknown" parameter split_lock_detect=, and pass it along
to init.
Lance Yang [Wed, 1 Apr 2026 13:10:32 +0000 (21:10 +0800)]
mm: fix deferred split queue races during migration
migrate_folio_move() records the deferred split queue state from src and
replays it on dst. Replaying it after remove_migration_ptes(src, dst, 0)
makes dst visible before it is requeued, so a concurrent rmap-removal path
can mark dst partially mapped and trip the WARN in deferred_split_folio().
Move the requeue before remove_migration_ptes() so dst is back on the
deferred split queue before it becomes visible again.
Because migration still holds dst locked at that point, teach
deferred_split_scan() to requeue a folio when folio_trylock() fails.
Otherwise a fully mapped underused folio can be dequeued by the shrinker
and silently lost from split_queue.
[ziy@nvidia.com: move the comment] Link: https://lkml.kernel.org/r/FB71A764-0F10-4E5A-B4A0-BA4C7F138408@nvidia.com Link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 Link: https://lkml.kernel.org/r/20260401131032.13011-1-lance.yang@linux.dev Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") Signed-off-by: Lance Yang <lance.yang@linux.dev> Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ Suggested-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@kernel.org> Cc: Deepanshu Kartikey <kartikey406@gmail.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nico Pache <npache@redhat.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Usama Arif <usama.arif@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add and use has_deposited_pgtable()
Rather than thread has_deposited through zap_huge_pmd(), make things
clearer by adding has_deposited_pgtable() with comments describing why in
each case.
[ljs@kernel.org: fix folio_put()-before-recheck issue, per Sashiko] Link: https://lkml.kernel.org/r/0a917f80-902f-49b0-a75f-1bbaf23d7f94@lucifer.local Link: https://lkml.kernel.org/r/f9db59ca90937e39913d50ecb4f662e2bad17bbb.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
Now we have pmd_to_softleaf_folio() available to us which also raises a
CONFIG_DEBUG_VM warning if unexpectedly an invalid softleaf entry, we can
now abstract folio handling altogether.
vm_normal_folio() deals with the huge zero page (which is present), as well
as PFN map/mixed map mappings in both cases returning NULL.
Otherwise, we try to obtain the softleaf folio.
This makes the logic far easier to comprehend and has it use the standard
vm_normal_folio_pmd() path for decoding of present entries.
Finally, we have to update the flushing logic to only do so if a folio is
established.
This patch also makes the 'is_present' value more accurate - because PFN
map, mixed map and zero huge pages are present, just not present and
'normal'.
[ljs@kernel.org: avoid bisection hazard] Link: https://lkml.kernel.org/r/d0cc6161-77a4-42ba-a411-96c23c78df1b@lucifer.local Link: https://lkml.kernel.org/r/c2be872d64ef9573b80727d9ab5446cf002f17b5.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Separate pmd_is_valid_softleaf() into separate components, then use the
pmd_is_valid_softleaf() predicate to implement pmd_to_softleaf_folio().
This returns the folio associated with a softleaf entry at PMD level. It
expects this to be valid for a PMD entry.
If CONFIG_DEBUG_VM is set, then assert on this being an invalid entry, and
either way return NULL in this case.
This lays the ground for further refactorings.
Link: https://lkml.kernel.org/r/b677592596274fa3fd701890497948e4b0e07cec.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: separate out the folio part of zap_huge_pmd()
Place the part of the logic that manipulates counters and possibly updates
the accessed bit of the folio into its own function to make zap_huge_pmd()
more readable.
Also rename flush_needed to is_present as we only require a flush for
present entries.
Additionally add comments as to why we're doing what we're doing with
respect to softleaf entries.
This also lays the ground for further refactoring.
Link: https://lkml.kernel.org/r/6c4db67952f5529da4db102a6149b9050b5dda4e.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reduce the repetition, and lay the ground for further refactorings by
keeping this variable separate.
Link: https://lkml.kernel.org/r/98104cde87e4b2aabeb16f236b8731591594457f.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
These checks have been in place since 2014, I think we can safely assume
that we are in a place where we don't need these as runtime checks.
In addition there are 4 other invocations of folio_remove_rmap_pmd(), none
of which make this assertion.
If we need to add this assertion, it should be in folio_remove_rmap_pmd(),
and as a VM_WARN_ON_ONCE(), however these seem superfluous so just remove
them.
Link: https://lkml.kernel.org/r/0c4c5ab247c90f80cf44718e8124b217d6a22544.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rather than having separate logic for each case determining whether to zap
the deposited table, simply track this via a boolean.
We default this to whether the architecture requires it, and update it as
required elsewhere.
Link: https://lkml.kernel.org/r/71f576a1fbcd27a86322d12caa937bcdacf75407.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This has been around since the beginnings of the THP implementation. I
think we can safely assume that, if we have a THP folio, it will have a
head page.
Link: https://lkml.kernel.org/r/f3fa8eb4634ccb2e78209f570cc1a769a02ce93e.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: add a common exit path to zap_huge_pmd()
Other than when we acquire the PTL, we always need to unlock the PTL, and
optionally need to flush on exit.
The code is currently very duplicated in this respect, so default
flush_needed to false, set it true in the case in which it's required,
then share the same logic for all exit paths.
This also makes flush_needed make more sense as a function-scope value (we
don't need to flush for the PFN map/mixed map, zero huge, error cases for
instance).
Link: https://lkml.kernel.org/r/6b281d8ed972dff0e89bdcbdd810c96c7ae8c9dc.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
A recent bug I analysed managed to, through a bug in the userfaultfd
implementation, reach an invalid point in the zap_huge_pmd() code where
the PMD was none of:
- A non-DAX, PFN or mixed map.
- The huge zero folio
- A present PMD entry
- A softleaf entry
The code at this point calls folio_test_anon() on a known-NULL folio.
Having logic like this explicitly NULL dereference in the code is hard to
understand, and makes debugging potentially more difficult.
Add an else branch to handle this case and WARN().
No functional change intended.
Link: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/ Link: https://lkml.kernel.org/r/fcf1f6de84a2ace188b6bf103fa15dde695f1ed8.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
There's no need to use the ancient approach of returning an integer here,
just return a boolean.
Also update flush_needed to be a boolean, similarly.
Also add a kdoc comment describing the function.
No functional change intended.
Link: https://lkml.kernel.org/r/132274566cd49d2960a2294c36dd2450593dfc55.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We don't need to have an extra level of indentation, we can simply exit
early in the first two branches.
No functional change intended.
Link: https://lkml.kernel.org/r/6b4d5efdbf5554b8fe788f677d0b50f355eec999.1774029655.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm/huge_memory: refactor zap_huge_pmd()", v3.
zap_huge_pmd() is overly complicated, clean it up and also add an assert
in the case that we encounter a buggy PMD entry that doesn't match
expectations.
This is motivated by a bug discovered [0] where the PMD entry was none of:
* A non-DAX, PFN or mixed map.
* The huge zero folio
* A present PMD entry
* A softleaf entry
In zap_huge_pmd(), but due to the bug we manged to reach this code.
It is useful to explicitly call this out rather than have an arbitrary
NULL pointer dereference happen, which also improves understanding of
what's going on.
The series goes further to make use of vm_normal_folio_pmd() rather than
implementing custom logic for retrieving the folio, and extends softleaf
functionality to provide and use an equivalent softleaf function.
This patch (of 13):
This function is confused - it overloads the term 'special' yet again,
checks for DAX but in many cases the code explicitly excludes DAX before
invoking the predicate.
It also unnecessarily checks for vma->vm_file - this has to be present for
a driver to have set VMA_MIXEDMAP_BIT or VMA_PFNMAP_BIT.
In fact, a far simpler form of this is to reverse the DAX predicate and
return false if DAX is set.
This makes sense from the point of view of 'special' as in
vm_normal_page(), as DAX actually does potentially have retrievable
folios.
Also there's no need to have this in mm.h so move it to huge_memory.c.
mm: on remap assert that input range within the proposed VMA
Now we have range_in_vma_desc(), update remap_pfn_range_prepare() to check
whether the input range in contained within the specified VMA, so we can
fail at prepare time if an invalid range is specified.
This covers the I/O remap mmap actions also which ultimately call into
this function, and other mmap action types either already span the full
VMA or check this already.
Link: https://lkml.kernel.org/r/0fc1092f4b74f3f673a58e4e3942dc83f336dd85.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A user can invoke mmap_action_map_kernel_pages() to specify that the
mapping should map kernel pages starting from desc->start of a specified
number of pages specified in an array.
In order to implement this, adjust mmap_action_prepare() to be able to
return an error code, as it makes sense to assert that the specified
parameters are valid as quickly as possible as well as updating the VMA
flags to include VMA_MIXEDMAP_BIT as necessary.
This provides an mmap_prepare equivalent of vm_insert_pages(). We
additionally update the existing vm_insert_pages() code to use
range_in_vma() and add a new range_in_vma_desc() helper function for the
mmap_prepare case, sharing the code between the two in range_is_subset().
We add both mmap_action_map_kernel_pages() and
mmap_action_map_kernel_pages_full() to allow for both partial and full VMA
mappings.
We update the documentation to reflect the new features.
Finally, we update the VMA tests accordingly to reflect the changes.
Link: https://lkml.kernel.org/r/926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
uio: replace deprecated mmap hook with mmap_prepare in uio_info
The f_op->mmap interface is deprecated, so update uio_info to use its
successor, mmap_prepare.
Therefore, replace the uio_info->mmap hook with a new
uio_info->mmap_prepare hook, and update its one user, target_core_user,
to both specify this new mmap_prepare hook and also to use the new
vm_ops->mapped() hook to continue to maintain a correct udev->kref
refcount.
Then update uio_mmap() to utilise the mmap_prepare compatibility layer to
invoke this callback from the uio mmap invocation.
Link: https://lkml.kernel.org/r/157583e4477705b496896c7acd4ac88a937b8fa6.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
The f_op->mmap interface is deprecated, so update the vmbus driver to use
its successor, mmap_prepare.
This updates all callbacks which referenced the function pointer
hv_mmap_ring_buffer to instead reference hv_mmap_prepare_ring_buffer,
utilising the newly introduced compat_set_desc_from_vma() and
__compat_vma_mmap() to be able to implement this change.
The UIO HV generic driver is the only user of hv_create_ring_sysfs(),
which is the only function which references
vmbus_channel->mmap_prepare_ring_buffer which, in turn, is the only
external interface to hv_mmap_prepare_ring_buffer.
This patch therefore updates this caller to use mmap_prepare instead,
which also previously used vm_iomap_memory(), so this change replaces it
with its mmap_prepare equivalent, mmap_action_simple_ioremap().
[akpm@linux-foundation.org: restore struct vmbus_channel comment, per Michael Kelley] Link: https://lkml.kernel.org/r/05467cb62267d750e5c770147517d4df0246cda6.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: allow handling of stacked mmap_prepare hooks in more drivers
While the conversion of mmap hooks to mmap_prepare is underway, we will
encounter situations where mmap hooks need to invoke nested mmap_prepare
hooks.
The nesting of mmap hooks is termed 'stacking'. In order to flexibly
facilitate the conversion of custom mmap hooks in drivers which stack, we
must split up the existing __compat_vma_mmap() function into two separate
functions:
* compat_set_desc_from_vma() - This allows the setting of a vm_area_desc
object's fields to the relevant fields of a VMA.
* __compat_vma_mmap() - Once an mmap_prepare hook has been executed upon a
vm_area_desc object, this function performs any mmap actions specified by
the mmap_prepare hook and then invokes its vm_ops->mapped() hook if any
were specified.
In ordinary cases, where a file's f_op->mmap_prepare() hook simply needs
to be invoked in a stacked mmap() hook, compat_vma_mmap() can be used.
However some drivers define their own nested hooks, which are invoked in
turn by another hook.
A concrete example is vmbus_channel->mmap_ring_buffer(), which is invoked
in turn by bin_attribute->mmap():
vmbus_channel->mmap_ring_buffer() has a signature of:
int (*mmap_ring_buffer)(struct vmbus_channel *channel,
struct vm_area_struct *vma);
And so compat_vma_mmap() cannot be used here for incremental conversion of
hooks from mmap() to mmap_prepare().
There are many such instances like this, where conversion to mmap_prepare
would otherwise cascade to a huge change set due to nesting of this kind.
The changes in this patch mean we could now instead convert
vmbus_channel->mmap_ring_buffer() to
vmbus_channel->mmap_prepare_ring_buffer(), and implement something like:
Allowing us to incrementally update this logic, and other logic like it.
Unfortunately, as part of this change, we need to be able to flexibly
assign to the VMA descriptor, so have to remove some of the const
declarations within the structure.
Also update the VMA tests to reflect the changes.
Link: https://lkml.kernel.org/r/24aac3019dd34740e788d169fccbe3c62781e648.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mtdchar: replace deprecated mmap hook with mmap_prepare, clean up
Replace the deprecated mmap callback with mmap_prepare.
Commit f5cf8f07423b ("mtd: Disable mtdchar mmap on MMU systems") commented
out the CONFIG_MMU part of this function back in 2012, so after ~14 years
it's probably reasonable to remove this altogether rather than updating
dead code.
Link: https://lkml.kernel.org/r/d036855c21962c58ace0eb24ecd6d973d77424fe.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently drivers use vm_iomap_memory() as a simple helper function for
I/O remapping memory over a range starting at a specified physical address
over a specified length.
In order to utilise this from mmap_prepare, separate out the core logic
into __simple_ioremap_prep(), update vm_iomap_memory() to use it, and add
simple_ioremap_prepare() to do the same with a VMA descriptor object.
We also add MMAP_SIMPLE_IO_REMAP and relevant fields to the struct
mmap_action type to permit this operation also.
We use mmap_action_ioremap() to set up the actual I/O remap operation once
we have checked and figured out the parameters, which makes
simple_ioremap_prepare() easy to implement.
We then add mmap_action_simple_ioremap() to allow drivers to make use of
this mode.
We update the mmap_prepare documentation to describe this mode. Finally,
we update the VMA tests to reflect this change.
Link: https://lkml.kernel.org/r/a08ef1c4542202684da63bb37f459d5dbbeddd91.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 9d5403b1036c ("fs: convert most other generic_file_*mmap() users to
.mmap_prepare()") updated AFS to use the mmap_prepare callback in favour
of the deprecated mmap callback.
However, it did not account for the fact that mmap_prepare is called
pre-merge, and may then be merged, nor that mmap_prepare can fail to map
due to an out of memory error.
This change was therefore since reverted.
Both of those are cases in which we should not be incrementing a reference
count.
With the newly added vm_ops->mapped callback available, we can simply
defer this operation to that callback which is only invoked once the
mapping is successfully in place (but not yet visible to userspace as the
mmap and VMA write locks are held).
This allows us to once again reimplement the .mmap_prepare implementation
for this file system.
Therefore add afs_mapped() to implement this callback for AFS, and remove
the code doing so in afs_mmap_prepare().
Also update afs_vm_open(), afs_vm_close() and afs_vm_map_pages() to be
consistent in how the vnode is accessed.
Link: https://lkml.kernel.org/r/ad9a94350a9c7d2bdab79fc397ef0f64d3412d71.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Partially reverts commit 9d5403b1036c ("fs: convert most other
generic_file_*mmap() users to .mmap_prepare()").
This is because the .mmap invocation establishes a refcount, but
.mmap_prepare is called at a point where a merge or an allocation failure
might happen after the call, which would leak the refcount increment.
Functionality is being added to permit the use of .mmap_prepare in this
case, but in the interim, we need to fix this.
Link: https://lkml.kernel.org/r/08804c94e39d9102a3a8fbd12385e8aa079ba1d3.1774045440.git.ljs@kernel.org Fixes: 9d5403b1036c ("fs: convert most other generic_file_*mmap() users to .mmap_prepare()") Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Previously, when a driver needed to do something like establish a
reference count, it could do so in the mmap hook in the knowledge that the
mapping would succeed.
With the introduction of f_op->mmap_prepare this is no longer the case, as
it is invoked prior to actually establishing the mapping.
mmap_prepare is not appropriate for this kind of thing as it is called
before any merge might take place, and after which an error might occur
meaning resources could be leaked.
To take this into account, introduce a new vm_ops->mapped callback which
is invoked when the VMA is first mapped (though notably - not when it is
merged - which is correct and mirrors existing mmap/open/close behaviour).
We do better that vm_ops->open() here, as this callback can return an
error, at which point the VMA will be unmapped.
Note that vm_ops->mapped() is invoked after any mmap action is complete
(such as I/O remapping).
We intentionally do not expose the VMA at this point, exposing only the
fields that could be used, and an output parameter in case the operation
needs to update the vma->vm_private_data field.
In order to deal with stacked filesystems which invoke inner filesystem's
mmap() invocations, add __compat_vma_mapped() and invoke it on vfs_mmap()
(via compat_vma_mmap()) to ensure that the mapped callback is handled when
an mmap() caller invokes a nested filesystem's mmap_prepare() callback.
Update the mmap_prepare documentation to describe the mapped hook and make
it clear what its intended use is.
The vm_ops->mapped() call is handled by the mmap complete logic to ensure
the same code paths are handled by both the compatibility and VMA layers.
Additionally, update VMA userland test headers to reflect the change.
Link: https://lkml.kernel.org/r/4c5e98297eb0aae9565c564e1c296a112702f144.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: have mmap_action_complete() handle the rmap lock and unmap
Rather than have the callers handle this both the rmap lock release and
unmapping the VMA on error, handle it within the mmap_action_complete()
logic where it makes sense to, being careful not to unlock twice.
This simplifies the logic and makes it harder to make mistake with this,
while retaining correct behaviour with regard to avoiding deadlocks.
Also replace the call_action_complete() function with a direct invocation
of mmap_action_complete() as the abstraction is no longer required.
Also update the VMA tests to reflect this change.
Link: https://lkml.kernel.org/r/8d1ee8ebd3542d006a47e8382fb80cf5b57ecf10.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We don't need to reference this field, it's confusing as it duplicates
mmap_action->hide_from_rmap_until_complete, so thread the mmap_action
through to __mmap_new_vma() instead and use the same field consistently.
Link: https://lkml.kernel.org/r/42c3fbb701e361a17193ecda0d2dabcc326288a5.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: switch the rmap lock held option off in compat layer
In the mmap_prepare compatibility layer, we don't need to hold the rmap
lock, as we are being called from an .mmap handler.
The .mmap_prepare hook, when invoked in the VMA logic, is called prior to
the VMA being instantiated, but the completion hook is called after the VMA
is linked into the maple tree, meaning rmap walkers can reach it.
The mmap hook does not link the VMA into the tree, so this cannot happen.
Therefore it's safe to simply disable this in the mmap_prepare
compatibility layer.
Also update VMA tests code to reflect current compatibility layer state.
[akpm@linux-foundation.org: fix comment typo, per Vlastimil] Link: https://lkml.kernel.org/r/dda74230d26a1fcd79a3efab61fa4101dd1cac64.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: avoid deadlock when holding rmap on mmap_prepare error
Commit ac0a3fc9c07d ("mm: add ability to take further action in
vm_area_desc") added the ability for drivers to instruct mm to take actions
after the .mmap_prepare callback is complete.
To make life simpler and safer, this is done before the VMA/mmap write lock
is dropped but when the VMA is completely established.
So on error, we simply munmap() the VMA.
As part of this implementation, unfortunately a horrible hack had to be
implemented to support some questionable behaviour hugetlb relies upon -
that is that the file rmap lock is held until the operation is complete.
The implementation, for convenience, did this in mmap_action_finish() so
both the VMA and mmap_prepare compatibility layer paths would have this
correctly handled.
However, it turns out there is a mistake here - the rmap lock cannot be
held on munmap, as free_pgtables() -> unlink_file_vma_batch_add() ->
unlink_file_vma_batch_process() takes the file rmap lock.
We therefore currently have a deadlock issue that might arise.
Resolve this by leaving it to callers to handle the unmap.
The compatibility layer does not support this rmap behaviour, so we simply
have it unmap on error after calling mmap_action_complete().
In the VMA implementation, we only perform the unmap after the rmap lock is
dropped.
This resolves the issue by ensuring the rmap lock is always dropped when
the unmap occurs.
Link: https://lkml.kernel.org/r/d44248be9da68258b07c2c59d4e73485ee0ca943.1774045440.git.ljs@kernel.org Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc") Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: add documentation for the mmap_prepare file operation callback
This documentation makes it easier for a driver/file system implementer to
correctly use this callback.
It covers the fundamentals, whilst intentionally leaving the less lovely
possible actions one might take undocumented (for instance - the
success_hook, error_hook fields in mmap_action).
The document also covers the new VMA flags implementation which is the
only one which will work correctly with mmap_prepare.
Link: https://lkml.kernel.org/r/3aebf918c213fa2aecf00a31a444119b5bdd7801.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: expand mmap_prepare functionality and usage", v4.
This series expands the mmap_prepare functionality, which is intended to
replace the deprecated f_op->mmap hook which has been the source of bugs
and security issues for some time.
This series starts with some cleanup of existing mmap_prepare logic, then
adds documentation for the mmap_prepare call to make it easier for
filesystem and driver writers to understand how it works.
It then importantly adds a vm_ops->mapped hook, a key feature that was
missing from mmap_prepare previously - this is invoked when a driver which
specifies mmap_prepare has successfully been mapped but not merged with
another VMA.
mmap_prepare is invoked prior to a merge being attempted, so you cannot
manipulate state such as reference counts as if it were a new mapping.
The vm_ops->mapped hook allows a driver to perform tasks required at this
stage, and provides symmetry against subsequent vm_ops->open,close calls.
The series uses this to correct the afs implementation which wrongly
manipulated reference count at mmap_prepare time.
It then adds an mmap_prepare equivalent of vm_iomap_memory() -
mmap_action_simple_ioremap(), then uses this to update a number of drivers.
It then splits out the mmap_prepare compatibility layer (which allows for
invocation of mmap_prepare hooks in an mmap() hook) in such a way as to
allow for more incremental implementation of mmap_prepare hooks.
It then uses this to extend mmap_prepare usage in drivers.
Finally it adds an mmap_prepare equivalent of vm_map_pages(), which lays
the foundation for future work which will extend mmap_prepare to DMA
coherent mappings.
This patch (of 21):
Rather than passing arbitrary fields, pass a vm_area_desc pointer to mmap
prepare functions to mmap prepare, and an action and vma pointer to mmap
complete in order to put all the action-specific logic in the function
actually doing the work.
Additionally, allow mmap prepare functions to return an error so we can
error out as soon as possible if there is something logically incorrect in
the input.
Update remap_pfn_range_prepare() to properly check the input range for the
CoW case.
Also remove io_remap_pfn_range_complete(), as we can simply set up the
fields correctly in io_remap_pfn_range_prepare() and use
remap_pfn_range_complete() for this.
While we're here, make remap_pfn_range_prepare_vma() a little neater, and
pass mmap_action directly to call_action_complete().
Then, update compat_vma_mmap() to perform its logic directly, as
__compat_vma_map() is not used by anything so we don't need to export it.
Also update compat_vma_mmap() to use vfs_mmap_prepare() rather than
calling the mmap_prepare op directly.
Finally, update the VMA userland tests to reflect the changes.
Link: https://lkml.kernel.org/r/cover.1774045440.git.ljs@kernel.org Link: https://lkml.kernel.org/r/99f408e4694f44ab12bdc55fe0bd9685d3bd1117.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We have implemented flag mask comparisons of the form:
if ((vm_flags & (VM_FOO|VM_BAR|VM_BAZ) == VM_FOO) { ... }
Like-for-like in the code using a bitwise-and mask via vma_flags_and() and
using vma_flags_same() to ensure the final result equals only the required
flag value.
This is fine but confusing, make things clearer by instead explicitly
excluding undesired flags and including the desired one via tests of the
form:
if (vma_flags_test(&flags, VMA_FOO_BIT) &&
!vma_flags_test_any(&flags, VMA_BAR_BIT, VMA_BAZ_BIT)) { ... }
Which makes it easier to understand what is going on.
No functional change intended.
Link: https://lkml.kernel.org/r/d395c5dd837a9864f5efcec42175910afbe3ce73.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Suggested-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/vma: convert __mmap_region() to use vma_flags_t
Update the mmap() implementation logic implemented in __mmap_region() and
functions invoked by it. The mmap_region() function converts its input
vm_flags_t parameter to a vma_flags_t value which it then passes to
__mmap_region() which uses the vma_flags_t value consistently from then
on.
As part of the change, we convert map_deny_write_exec() to using
vma_flags_t (it was incorrectly using unsigned long before), and place it
in vma.h, as it is only used internal to mm.
With this change, we eliminate the legacy is_shared_maywrite_vm_flags()
helper function which is now no longer required.
We are also able to update the MMAP_STATE() and VMG_MMAP_STATE() macros to
use the vma_flags_t value.
Finally, we update the VMA tests to reflect the change.
Link: https://lkml.kernel.org/r/1fc33a404c962f02da778da100387cc19bd62153.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>