From cc970310a47d553c1232be579caf1ae2a54640db Mon Sep 17 00:00:00 2001
From: Luca Boccassi
Date: Tue, 16 Nov 2021 22:44:06 +0000
Subject: [PATCH] CONTAINER_INTERFACE: clarify that /proc/sys can be writable
 with namespacing

When user and network namespaces are enabled, the kernel makes the global
keys read-only, and makes the namespaced ones available for the guest
already.
---
 docs/CONTAINER_INTERFACE.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/CONTAINER_INTERFACE.md b/docs/CONTAINER_INTERFACE.md
index 54b94e23424..9ca991cab50 100644
--- a/docs/CONTAINER_INTERFACE.md
+++ b/docs/CONTAINER_INTERFACE.md
@@ -22,10 +22,12 @@ manager, please consider supporting the following interfaces.
    (that file overrides whatever is pre-initialized by the container manager).
 
 2. Make sure to pre-mount `/proc/`, `/sys/`, and `/sys/fs/selinux/` before
-   invoking systemd, and mount `/proc/sys/`, `/sys/`, and `/sys/fs/selinux/`
-   read-only in order to prevent the container from altering the host kernel's
-   configuration settings. (As a special exception, if your container has
-   network namespaces enabled, feel free to make `/proc/sys/net/` writable).
+   invoking systemd, and mount `/sys/`, `/sys/fs/selinux/` and `/proc/sys/`
+   read-only (the latter via e.g. a read-only bind mount on itself) in order
+   to prevent the container from altering the host kernel's configuration
+   settings. (As a special exception, if your container has network namespaces
+   enabled, feel free to make `/proc/sys/net/` writable. If it also has user, ipc,
+   uts and pid namespaces enabled, the entire `/proc/sys` can be left writable).
    systemd and various other subsystems (such as the SELinux userspace) have
    been modified to behave accordingly when these file systems are read-only.
    (It's OK to mount `/sys/` as `tmpfs` btw, and only mount a subset of its
-- 
2.47.3