namespaces.7: Rework discussion of cgroup namespaces

author Michael Kerrisk <mtk.manpages@gmail.com>

Fri, 6 May 2016 13:01:11 +0000 (15:01 +0200)

committer Michael Kerrisk <mtk.manpages@gmail.com>

Mon, 9 May 2016 21:08:54 +0000 (23:08 +0200)
diff --git a/man7/namespaces.7 b/man7/namespaces.7

index bb37fedb6fbb61c0a307895d1d8d32c52755147e..7b0b9e2dde06a5eefd71a20165c32bca43cd1cf7 100644 (file)
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -193,10 +193,10 @@ This file is a handle for the UTS namespace of the process.
  .\" ==================== Cgroup namespaces ====================
  .\"
  .SS Cgroup namespaces (CLONE_NEWCGROUP)
-Cgroup namespaces virtualize the view of a process's cgroups as seen via
-.IR /proc/[pid]/cgroup
-(see
-.BR cgroups (7)).
+Cgroup namespaces virtualize the view of a process's cgroups (see
+.BR cgroups (7))
+as seen via
+.IR /proc/[pid]/cgroup .
  
  Each cgroup namespace has its own set of cgroup root directories,
  which are the base points for the relative locations displayed in
@@ -209,7 +209,7 @@ with the
  .BR CLONE_NEWCGROUP
  flag, then its current cgroups directories become its cgroup root directories.
  (This applies both for the cgroups version 1 hierarchies
-as well as the cgroups version 2 unified hierarchy.)
+and the cgroups version 2 unified hierarchy.)
  
  When viewing
  .IR /proc/[pid]/cgroup ,
@@ -223,28 +223,28 @@ entries for each ancestor level in the cgroup hierarchy.
  
  The following shell session demonstrates the effect of creating
  a new cgroup namespace.
-First, we create child cgroup in the
+First, (as superuser) we create a child cgroup in the
  .I freezer
  hierarchy, and put the shell into that cgroup:
  
  .nf
  .in +4n
-$ \fBsudo mkdir \-p /sys/fs/cgroup/freezer/sub\fP
-$ \fBecho $$\fP                      # Show PID of this shell
+# \fBmkdir \-p /sys/fs/cgroup/freezer/sub\fP
+# \fBecho $$\fP                      # Show PID of this shell
  30655
-$ \fBsudo sh \-c 'echo 30655 > /sys/fs/cgroup/freezer/sub/cgroup.procs'\fP
-$ \fBcat /proc/self/cgroup | grep freezer\fP
+# \fBsh \-c 'echo 30655 > /sys/fs/cgroup/freezer/sub/cgroup.procs'\fP
+# \fBcat /proc/self/cgroup | grep freezer\fP
  7:freezer:/sub
  .in
  .fi
  
  Next, we use
  .BR unshare (1)
-to create a process running a shell in new user and cgroup namespaces:
+to create a process running a shell in a new cgroup namespace:
  
  .nf
  .in +4n
-$ \fBunshare -U -C bash\fP
+# \fBunshare \-C bash\fP
  .in
  .fi
  
@@ -267,26 +267,65 @@ $ \fBcat /proc/20124/cgroup | grep freezer\fP
  .in
  .fi
  
-The virtualization provided by cgroup namespaces serves at least two purposes.
-First, it can be used to prevent
-information leaks whereby cgroup directory paths outside of
+Use of cgroup namespaces requires a kernel that is configured with the
+.B CONFIG_CGROUPS
+option.
+
+Among the purposes served by the
+virtualization provided by cgroup namespaces are the following:
+.IP * 2
+It prevents information leaks whereby cgroup directory paths outside of
  a container would otherwise be visible to processes in the container.
-More importantly, this allows easier and more flexible
+Such leakages could, for example,
+reveal information about the container framework
+to containerized applications.
+.IP *
+It allows easier and more flexible
  confinement of container root tasks, because they can mount
-their own cgroup filesystems without needing to gain access to ancestor
+their own cgroup filesystems without gaining access to ancestor
  cgroup directories.
-So, for example, even if
-.I /cg/1
-is owned by uid 100000, a task namespaced under
-.I /cg/1/2
-owned by UID 100000 can mount that cgroup but not change settings in
+Consider, for example, the following scenario:
+.RS 4
+.IP \(bu 2
+We have a cgroup directory,
+.IR /cg/1 ,
+that is owned by user ID 9000.
+.IP \(bu
+We have a process,
+.IR X ,
+also owned by user ID 9000,
+that is namespaced under the cgroup
+.IR /cg/1/2
+(i.e.,
+.I X
+was placed in a new cgroup namespace via
+.BR clone (2)
+or
+.BR unshare (2)
+with the
+.BR CLONE_NEWCGROUP
+flag).
+.RE
+.IP
+In the absence of cgroup namespacing, because the cgroup directory
+.IR /cg/1
+is owned (and writable) by user ID 9000 and process X is also owned
+by user ID 9000, process X would be able to modify the contents
+of cgroup files (i.e., change cgroup settings) not only in
+.IR /cg/1/2
+but also in the ancestor cgroup directory
  .IR /cg/1 .
+Namespacing process
+.IR X
+under the cgroup directory
+.IR /cg/1/2
+prevents it from modifying files in
+.IR /cg/1 ,
+since it cannot even see the contents of that directory
+(or of further removed cgroup ancestor directories).
  Combined with correct enforcement of hierarchical limits,
-this prevents that task from escaping its limits.
-
-Use of cgroup namespaces requires a kernel that is configured with the
-.B CONFIG_CGROUPS
-option.
+this prevents process X from escaping the limits imposed
+by ancestor cgroups.
  .\"
  .\" ==================== IPC namespaces ====================
  .\"
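
The patched text says that each cgroup namespace has its own cgroup root directories, and that the paths shown in /proc/[pid]/cgroup are relative to those roots, with "../" entries for each ancestor level. The sketch below illustrates that rule in Python. It is not kernel code and not part of the man page: the function names namespaced_cgroup_path and render_cgroup_line are hypothetical, and the sketch assumes that cgroup paths are plain absolute POSIX-style paths within a single hierarchy.

```python
import posixpath


def namespaced_cgroup_path(cgroup_path, ns_root):
    """Return cgroup_path as it would appear to a reader whose cgroup
    namespace root (for this hierarchy) is ns_root.

    Hypothetical helper: it models the rule described in the page,
    it does not query the kernel.
    """
    rel = posixpath.relpath(cgroup_path, start=ns_root)
    # relpath() yields "." for the root itself; the root is shown as "/".
    return "/" if rel == "." else "/" + rel


def render_cgroup_line(line, ns_root):
    """Rewrite one 'hierarchy-ID:controller-list:path' line, in the
    format used by /proc/[pid]/cgroup, relative to ns_root."""
    hier_id, controllers, path = line.rstrip("\n").split(":", 2)
    return ":".join([hier_id, controllers, namespaced_cgroup_path(path, ns_root)])


if __name__ == "__main__":
    # The shell from the session above is in /sub of the freezer hierarchy;
    # after unshare -C, /sub is its cgroup root for that hierarchy.
    print(render_cgroup_line("7:freezer:/sub", "/sub"))  # 7:freezer:/
    # A process left in the hierarchy's real root is one level above it.
    print(render_cgroup_line("7:freezer:/", "/sub"))     # 7:freezer:/..
```

Under these assumptions, a process that resides in the namespace's root cgroup is shown at "/", and a process outside the namespace's subtree is shown through one ".." component per ancestor level that must be climbed.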