Christian Brauner <brauner@kernel.org> says:

Currently, creating a new mount namespace always copies the entire mount
tree from the caller's namespace. For containers and sandboxes that
intend to build their mount table from scratch this is wasteful: they
inherit a potentially large mount tree only to immediately tear it down.

This series adds support for creating a mount namespace that contains
only a clone of the root mount, with none of the child mounts. Two new
flags are introduced:

- CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag
  space.
- UNSHARE_EMPTY_MNTNS (0x00100000) for unshare(), reusing the
  CLONE_PARENT_SETTID bit, which has no meaning for unshare().
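
As a rough illustration of the clone3() variant, the sketch below spawns a
child in an empty mount namespace and reaps it. The flag value is the one
quoted above; the helper name spawn_in_empty_mntns() is made up for this
example, and the local #define is only a fallback for UAPI headers that
predate the series. The error codes named in the comments are assumptions
about how kernels without the series or unprivileged callers would behave.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>        /* static_assert (C11) */
#include <errno.h>
#include <linux/sched.h>   /* struct clone_args */
#include <signal.h>        /* SIGCHLD */
#include <sys/syscall.h>   /* SYS_clone3 */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback for headers that predate the series; value from the cover letter. */
#ifndef CLONE_EMPTY_MNTNS
#define CLONE_EMPTY_MNTNS 0x400000000ULL
#endif

/* The flag lives above bit 31, so it only fits in clone3()'s 64-bit flags. */
static_assert(CLONE_EMPTY_MNTNS > 0xffffffffULL,
              "CLONE_EMPTY_MNTNS must not fit in a 32-bit flag word");

/*
 * Hypothetical helper: create a child in an empty mount namespace and wait
 * for it. Returns 0 on success or -errno on failure.
 */
static int spawn_in_empty_mntns(void)
{
	struct clone_args args = {
		.flags       = CLONE_EMPTY_MNTNS, /* implies CLONE_NEWNS per the cover letter */
		.exit_signal = SIGCHLD,
	};
	int status;
	long pid;

	/* Assumed failures: EINVAL for an unknown flag, EPERM without privilege. */
	pid = syscall(SYS_clone3, &args, sizeof(args));
	if (pid < 0)
		return -errno;

	if (pid == 0) {
		/* Child: a real caller would build its mount table here. */
		_exit(0);
	}

	if (waitpid((pid_t)pid, &status, 0) < 0)
		return -errno;

	return 0;
}
```

A caller would treat a negative return value as "feature unavailable or not
permitted" and fall back to a plain CLONE_NEWNS setup.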

Both flags imply CLONE_NEWNS. The resulting namespace contains a single
nullfs root mount with an immutable empty directory. The intended
workflow is to then mount a real filesystem (e.g., tmpfs) over the root
and build the mount table from there.
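
The workflow described here could look roughly like the following sketch for
the unshare() variant. The helper name enter_empty_mntns() is invented for
illustration, the local #define is a fallback for headers that predate the
series, and the error codes in the comments are assumptions. Any further
setup after the tmpfs mount (populating it, switching into it) is
deliberately left out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>      /* static_assert (C11) */
#include <errno.h>
#include <sched.h>       /* unshare(), CLONE_PARENT_SETTID */
#include <stddef.h>      /* NULL */
#include <sys/mount.h>   /* mount() */

/* Fallback for headers that predate the series; value from the cover letter. */
#ifndef UNSHARE_EMPTY_MNTNS
#define UNSHARE_EMPTY_MNTNS 0x00100000
#endif

/* The cover letter states that the flag reuses the CLONE_PARENT_SETTID bit. */
static_assert(UNSHARE_EMPTY_MNTNS == CLONE_PARENT_SETTID,
              "UNSHARE_EMPTY_MNTNS should alias CLONE_PARENT_SETTID");

/*
 * Hypothetical helper: move the caller into an empty mount namespace and
 * mount tmpfs over its root. Returns 0 on success or -errno on failure.
 */
static int enter_empty_mntns(void)
{
	/* Assumed failures: EINVAL on kernels without the series, EPERM without privilege. */
	if (unshare(UNSHARE_EMPTY_MNTNS) < 0)
		return -errno;

	/* Mount a real filesystem over the root, as the cover letter suggests. */
	if (mount("tmpfs", "/", "tmpfs", 0, NULL) < 0)
		return -errno;

	return 0;
}
```

If the helper fails at the unshare() step, the caller is still in its
original mount namespace, so falling back to unshare(CLONE_NEWNS) remains
possible.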

* patches from https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-0-6eb30529bbb0@kernel.org:
  selftests/filesystems: add clone3 tests for empty mount namespaces
  selftests/filesystems: add tests for empty mount namespaces
  namespace: allow creating empty mount namespaces

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-0-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>