From: Linus Torvalds Date: Sun, 14 Jun 2026 22:37:58 +0000 (+0530) Subject: Merge tag 'vfs-7.2-rc1.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs X-Git-Url: http://git.ipfire.org/gitweb/?a=commitdiff_plain;h=79169a1624253363fed3e9a447b77e50bb226206;p=thirdparty%2Fkernel%2Flinux.git Merge tag 'vfs-7.2-rc1.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull procfs updates from Christian Brauner: - Revamp fs/filesystems.c The file was a mess with a hand-rolled linked list in desperate need of a cleanup. The filesystems list is now RCU-ified, /proc files can be marked permanent from outside fs/proc/, and the string emitted when reading /proc/filesystems is pre-generated and cached instead of pointer-chasing and printfing entry by entry on every read. The file is read frequently because libselinux reads it and is linked into numerous frequently used programs (even ones you would not suspect, like sed!). Scalability also improves since reference maintenance on open/close is bypassed. open+read+close cycle single-threaded (ops/s): before: 442732 after: 1063462 (+140%) open+read+close cycle with 20 processes (ops/s): before: 606177 after: 3300576 (+444%) A follow-up patch adds missing unlocks in some corner cases and tidies things up. - Relax the mount visibility check for subset=pid mounts When procfs is mounted with subset=pid, all static files become unavailable and only the dynamic pid information is accessible. In that case there is no point in imposing the full mount visibility restrictions on the mounter - everything that can be hidden in procfs is already inaccessible. These restrictions prevented procfs from being mounted inside rootless containers since almost all container implementations overmount parts of procfs to hide certain directories. As part of this /proc/self/net is only shown in subset=pid mounts for CAP_NET_ADMIN, reconfiguring subset=pid is rejected, the SB_I_USERNS_VISIBLE superblock flag is replaced with an FS_USERNS_MOUNT_RESTRICTED filesystem flag, fully visible mounts are recorded in a list, and the mount restrictions are finally documented. - Protect ptrace_may_access() with exec_update_lock in procfs Most uses of ptrace_may_access() in procfs should hold exec_update_lock to avoid TOCTOU issues with concurrent privileged execve() (like setuid binary execution). This fixes the easy cases - the owner and visibility checks and the FD link permission checks - with the gnarlier ones to follow later. * tag 'vfs-7.2-rc1.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: fix ups and tidy ups to /proc/filesystems caching proc: protect ptrace_may_access() with exec_update_lock (FD links) proc: protect ptrace_may_access() with exec_update_lock (part 1) docs: proc: add documentation about mount restrictions proc: handle subset=pid separately in userns visibility checks proc: prevent reconfiguring subset=pid proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN fs: cache the string generated by reading /proc/filesystems sysfs: remove trivial sysfs_get_tree() wrapper fs: RCU-ify filesystems list fs: move SB_I_USERNS_VISIBLE to FS_USERNS_MOUNT_RESTRICTED proc: allow to mark /proc files permanent outside of fs/proc/ namespace: record fully visible mounts in list --- 79169a1624253363fed3e9a447b77e50bb226206 diff --cc include/linux/fs.h index 9674c3d1cb3f4,e7ff9f8b14855..6da44573ce450 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@@ -2294,7 -2281,7 +2294,8 @@@ struct file_system_type #define FS_MGTIME 64 /* FS uses multigrain timestamps */ #define FS_LBS 128 /* FS supports LBS */ #define FS_POWER_FREEZE 256 /* Always freeze on suspend/hibernate */ + #define FS_USERNS_MOUNT_RESTRICTED 512 /* Restrict mount in userns if not already visible */ +#define FS_USERNS_DELEGATABLE 1024 /* Can be mounted inside userns from outside */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters;