man7/user_namespaces.7

   1 .\" Copyright (c) 2013, 2014 by Michael Kerrisk <mtk.manpages@gmail.com>
   2 .\" and Copyright (c) 2012, 2014 by Eric W. Biederman <ebiederm@xmission.com>
   3 .\"
   4 .\" %%%LICENSE_START(VERBATIM)
   5 .\" Permission is granted to make and distribute verbatim copies of this
   6 .\" manual provided the copyright notice and this permission notice are
   7 .\" preserved on all copies.
   8 .\"
   9 .\" Permission is granted to copy and distribute modified versions of this
  10 .\" manual under the conditions for verbatim copying, provided that the
  11 .\" entire resulting derived work is distributed under the terms of a
  12 .\" permission notice identical to this one.
  13 .\"
  14 .\" Since the Linux kernel and libraries are constantly changing, this
  15 .\" manual page may be incorrect or out-of-date.  The author(s) assume no
  16 .\" responsibility for errors or omissions, or for damages resulting from
  17 .\" the use of the information contained herein.  The author(s) may not
  18 .\" have taken the same level of care in the production of this manual,
  19 .\" which is licensed free of charge, as they might when working
  20 .\" professionally.
  21 .\"
  22 .\" Formatted or processed versions of this manual, if unaccompanied by
  23 .\" the source, must acknowledge the copyright and authors of this work.
  24 .\" %%%LICENSE_END
  25 .\"
  26 .\"
  27 .TH USER_NAMESPACES 7 2015-03-29 "Linux" "Linux Programmer's Manual"
  28 .SH NAME
  29 user_namespaces \- overview of Linux user namespaces
  30 .SH DESCRIPTION
  31 For an overview of namespaces, see
  32 .BR namespaces (7).
  33
  34 User namespaces isolate security-related identifiers and attributes,
  35 in particular,
  36 user IDs and group IDs (see
  37 .BR credentials (7)),
  38 the root directory,
  39 keys (see
  40 .BR keyctl (2)),
  41 .\" FIXME: This page says very little about the interaction
  42 .\" of user namespaces and keys. Add something on this topic.
  43 and capabilities (see
  44 .BR capabilities (7)).
  45 A process's user and group IDs can be different
  46 inside and outside a user namespace.
  47 In particular,
  48 a process can have a normal unprivileged user ID outside a user namespace
  49 while at the same time having a user ID of 0 inside the namespace;
  50 in other words,
  51 the process has full privileges for operations inside the user namespace,
  52 but is unprivileged for operations outside the namespace.
  53 .\"
  54 .\" ============================================================
  55 .\"
  56 .SS Nested namespaces, namespace membership
  57 User namespaces can be nested;
  58 that is, each user namespace\(emexcept the initial ("root")
  59 namespace\(emhas a parent user namespace,
  60 and can have zero or more child user namespaces.
  61 The parent user namespace is the user namespace
  62 of the process that creates the user namespace via a call to
  63 .BR unshare (2)
  64 or
  65 .BR clone (2)
  66 with the
  67 .BR CLONE_NEWUSER
  68 flag.
  69
  70 The kernel imposes (since version 3.11) a limit of 32 nested levels of
  71 .\" commit 8742f229b635bf1c1c84a3dfe5e47c814c20b5c8
  72 user namespaces.
  73 .\" FIXME Explain the rationale for this limit. (What is the rationale?)
  74 Calls to
  75 .BR unshare (2)
  76 or
  77 .BR clone (2)
  78 that would cause this limit to be exceeded fail with the error
  79 .BR EUSERS .
  80
  81 Each process is a member of exactly one user namespace.
  82 A process created via
  83 .BR fork (2)
  84 or
  85 .BR clone (2)
  86 without the
  87 .BR CLONE_NEWUSER
  88 flag is a member of the same user namespace as its parent.
  89 A single-threaded process can join another user namespace with
  90 .BR setns (2)
  91 if it has the
  92 .BR CAP_SYS_ADMIN
  93 capability in that namespace;
  94 upon doing so, it gains a full set of capabilities in that namespace.
  95
  96 A call to
  97 .BR clone (2)
  98 or
  99 .BR unshare (2)
 100 with the
 101 .BR CLONE_NEWUSER
 102 flag makes the new child process (for
 103 .BR clone (2))
 104 or the caller (for
 105 .BR unshare (2))
 106 a member of the new user namespace created by the call.
 107 .\"
 108 .\" ============================================================
 109 .\"
 110 .SS Capabilities
 111 The child process created by
 112 .BR clone (2)
 113 with the
 114 .BR CLONE_NEWUSER
 115 flag starts out with a complete set
 116 of capabilities in the new user namespace.
 117 Likewise, a process that creates a new user namespace using
 118 .BR unshare (2)
 119 or joins an existing user namespace using
 120 .BR setns (2)
 121 gains a full set of capabilities in that namespace.
 122 On the other hand,
 123 that process has no capabilities in the parent (in the case of
 124 .BR clone (2))
 125 or previous (in the case of
 126 .BR unshare (2)
 127 and
 128 .BR setns (2))
 129 user namespace,
 130 even if the new namespace is created or joined by the root user
 131 (i.e., a process with user ID 0 in the root namespace).
 132
 133 Note that a call to
 134 .BR execve (2)
 135 will cause a process's capabilities to be recalculated in the usual way (see
 136 .BR capabilities (7)).
 137 Consequently,
 138 unless the process has a user ID of 0 within the namespace,
 139 or the executable file has a nonempty inheritable capabilities mask,
 140 the process will lose all capabilities.
 141 See the discussion of user and group ID mappings, below.
 142
 143 A call to
 144 .BR clone (2),
 145 .BR unshare (2),
 146 or
 147 .BR setns (2)
 148 using the
 149 .BR CLONE_NEWUSER
 150 flag sets the "securebits" flags
 151 (see
 152 .BR capabilities (7))
 153 to their default values (all flags disabled) in the child (for
 154 .BR clone (2))
 155 or caller (for
 156 .BR unshare (2),
 157 or
 158 .BR setns (2)).
 159 Note that because the caller no longer has capabilities
 160 in its original user namespace after a call to
 161 .BR setns (2),
 162 it is not possible for a process to reset its "securebits" flags while
 163 retaining its user namespace membership by using a pair of
 164 .BR setns (2)
 165 calls to move to another user namespace and then return to
 166 its original user namespace.
 167
 168 The rules for determining whether or not a process has a capability
 169 in a particular user namespace are as follows:
 170 .IP 1. 3
 171 A process has a capability inside a user namespace
 172 if it is a member of that namespace and
 173 it has the capability in its effective capability set.
 174 A process can gain capabilities in its effective capability
 175 set in various ways.
 176 For example, it may execute a set-user-ID program or an
 177 executable with associated file capabilities.
 178 In addition,
 179 a process may gain capabilities via the effect of
 180 .BR clone (2),
 181 .BR unshare (2),
 182 or
 183 .BR setns (2),
 184 as already described.
 185 .\" In the 3.8 sources, see security/commoncap.c::cap_capable():
 186 .IP 2.
 187 If a process has a capability in a user namespace,
 188 then it has that capability in all child (and further removed descendant)
 189 namespaces as well.
 190 .IP 3.
 191 .\" * The owner of the user namespace in the parent of the
 192 .\" * user namespace has all caps.
 193 When a user namespace is created, the kernel records the effective
 194 user ID of the creating process as being the "owner" of the namespace.
 195 .\" (and likewise associates the effective group ID of the creating process
 196 .\" with the namespace).
 197 A process that resides
 198 in the parent of the user namespace
 199 .\" See kernel commit 520d9eabce18edfef76a60b7b839d54facafe1f9 for a fix
 200 .\" on this point
 201 and whose effective user ID matches the owner of the namespace
 202 has all capabilities in the namespace.
 203 .\"     This includes the case where the process executes a set-user-ID
 204 .\"     program that confers the effective UID of the creator of the namespace.
 205 By virtue of the previous rule,
 206 this means that the process has all capabilities in all
 207 further removed descendant user namespaces as well.
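.PP
The three rules above can be condensed into a short model.
The following Python sketch is illustrative only and is not kernel code:
the classes
.I UserNS
and
.I Process
and the function
.I has_capability
are hypothetical names,
and user IDs are treated as plain integers that are compared
without any mapping.
.PP
.in +4n
.nf
```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)
class UserNS:
    """Hypothetical model of a user namespace (not a kernel structure)."""
    parent: Optional["UserNS"] = None   # None for the initial namespace
    owner_euid: int = 0                 # effective UID of the creator


@dataclass(eq=False)
class Process:
    """Hypothetical model of a process and its credentials."""
    userns: UserNS                      # the one namespace it belongs to
    euid: int                           # effective UID (unmapped integer)
    effective: Set[str] = field(default_factory=set)


def has_capability(proc: Process, target: UserNS, cap: str) -> bool:
    """Apply rules 1-3 while walking from target up toward the root."""
    ns: Optional[UserNS] = target
    while ns is not None:
        # Rule 1: member of this namespace, capability is effective.
        # When ns is an ancestor of target, this is also rule 2.
        if ns is proc.userns:
            return cap in proc.effective
        # Rule 3: resides in the parent of ns and matches its owner.
        if ns.parent is proc.userns and ns.owner_euid == proc.euid:
            return True
        ns = ns.parent
    return False
```
.fi
.in
.PP
Under this model, a process whose effective user ID matches the owner
of a namespace, and that resides in the parent of that namespace,
has every capability in the namespace and in its descendants,
but gains nothing in the parent namespace itself.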
 208 .\"
 209 .\" ============================================================
 210 .\"
 211 .SS Effect of capabilities within a user namespace
 212 Having a capability inside a user namespace
 213 permits a process to perform operations (that require privilege)
 214 only on resources governed by that namespace.
 215 In other words, having a capability in a user namespace permits a process
 216 to perform privileged operations on resources that are governed by (nonuser)
 217 namespaces associated with the user namespace (see the next subsection).
 218
 219 On the other hand, there are many privileged operations that affect
 220 resources that are not associated with any namespace type,
 221 for example, changing the system time (governed by
 222 .BR CAP_SYS_TIME ),
 223 loading a kernel module (governed by
 224 .BR CAP_SYS_MODULE ),
 225 and creating a device (governed by
 226 .BR CAP_MKNOD ).
 227 Only a process with privileges in the
 228 .I initial
 229 user namespace can perform such operations.
 230
 231 Holding
 232 .B CAP_SYS_ADMIN
 233 within a (noninitial) user namespace allows the creation of bind mounts,
 234 and mounting of the following types of filesystems:
 235 .\" fs_flags = FS_USERNS_MOUNT in kernel sources
 236
 237 .RS 4
 238 .PD 0
 239 .IP * 2
 240 .IR /proc
 241 (since Linux 3.8)
 242 .IP *
 243 .IR /sys
 244 (since Linux 3.8)
 245 .IP *
 246 .IR devpts
 247 (since Linux 3.9)
 248 .IP *
 249 .IR tmpfs
 250 (since Linux 3.9)
 251 .IP *
 252 .IR ramfs
 253 (since Linux 3.9)
 254 .IP *
 255 .IR mqueue
 256 (since Linux 3.9)
 257 .IP *
 258 .IR bpf
 259 .\" commit b2197755b2633e164a439682fb05a9b5ea48f706
 260 (since Linux 4.4)
 261 .PD
 262 .RE
 263 .PP
 264 Note, however, that mounting block-based filesystems can be done
 265 only by a process that holds
 266 .BR CAP_SYS_ADMIN
 267 in the initial user namespace.
 268 .\"
 269 .\" ============================================================
 270 .\"
 271 .SS Interaction of user namespaces and other types of namespaces
 272 Starting in Linux 3.8, unprivileged processes can create user namespaces,
 273 and the other types of namespaces can be created with just the
 274 .B CAP_SYS_ADMIN
 275 capability in the caller's user namespace.
 276
 277 When a non-user-namespace is created,
 278 it is owned by the user namespace in which the creating process
 279 was a member at the time of the creation of the namespace.
 280 Actions on the non-user-namespace
 281 require capabilities in the corresponding user namespace.
 282
 283 If
 284 .BR CLONE_NEWUSER
 285 is specified along with other
 286 .B CLONE_NEW*
 287 flags in a single
 288 .BR clone (2)
 289 or
 290 .BR unshare (2)
 291 call, the user namespace is guaranteed to be created first,
 292 giving the child
 293 .RB ( clone (2))
 294 or caller
 295 .RB ( unshare (2))
 296 privileges over the remaining namespaces created by the call.
 297 Thus, it is possible for an unprivileged caller to specify this combination
 298 of flags.
 299
 300 When a new namespace (other than a user namespace) is created via
 301 .BR clone (2)
 302 or
 303 .BR unshare (2),
 304 the kernel records the user namespace of the creating process against
 305 the new namespace.
 306 (This association can't be changed.)
 307 When a process in the new namespace subsequently performs
 308 privileged operations that operate on global
 309 resources isolated by the namespace,
 310 the permission checks are performed according to the process's capabilities
 311 in the user namespace that the kernel associated with the new namespace.
 312 For example, suppose that a process attempts to change the hostname
 313 .RB ( sethostname (2)),
 314 a resource governed by the UTS namespace.
 315 In this case,
 316 the kernel will determine which user namespace is associated with
 317 the process's UTS namespace, and check whether the process has the
 318 required capability
 319 .RB ( CAP_SYS_ADMIN )
 320 in that user namespace.
 321 .\"
 322 .\" ============================================================
 323 .\"
 324 .SS Restrictions on mount namespaces
 325
 326 Note the following points with respect to mount namespaces:
 327 .IP * 3
 328 A mount namespace has an owner user namespace.
 329 A mount namespace whose owner user namespace is different from
 330 the owner user namespace of its parent mount namespace is
 331 considered a less privileged mount namespace.
 332 .IP *
 333 When creating a less privileged mount namespace,
 334 shared mounts are reduced to slave mounts.
 335 This ensures that mappings performed in less
 336 privileged mount namespaces will not propagate to more privileged
 337 mount namespaces.
 338 .IP *
 339 .\" FIXME .
 340 .\"     What does "come as a single unit from more privileged mount" mean?
 341 Mounts that come as a single unit from a more privileged mount namespace are
 342 locked together and may not be separated in a less privileged mount
 343 namespace.
 344 (The
 345 .BR unshare (2)
 346 .B CLONE_NEWNS
 347 operation brings across all of the mounts from the original
 348 mount namespace as a single unit,
 349 and recursive mounts that propagate between
 350 mount namespaces propagate as a single unit.)
 351 .IP *
 352 The
 353 .BR mount (2)
 354 flags
 355 .BR MS_RDONLY ,
 356 .BR MS_NOSUID ,
 357 .BR MS_NOEXEC ,
 358 and the "atime" flags
 359 .RB ( MS_NOATIME ,
 360 .BR MS_NODIRATIME ,
 361 .BR MS_RELATIME )
 362 settings become locked
 363 .\" commit 9566d6742852c527bf5af38af5cbb878dad75705
 364 .\" Author: Eric W. Biederman <ebiederm@xmission.com>
 365 .\" Date:   Mon Jul 28 17:26:07 2014 -0700
 366 .\"
 367 .\"      mnt: Correct permission checks in do_remount
 368 .\"
 369 when propagated from a more privileged to
 370 a less privileged mount namespace,
 371 and may not be changed in the less privileged mount namespace.
 372 .IP *
 373 .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree))
 374 A file or directory that is a mount point in one namespace but is not
 375 a mount point in another namespace may be renamed, unlinked, or removed
 376 .RB ( rmdir (2))
 377 in the mount namespace in which it is not a mount point
 378 (subject to the usual permission checks).
 379 .IP
 380 Previously, attempting to unlink, rename, or remove a file or directory
 381 that was a mount point in another mount namespace would result in the error
 382 .BR EBUSY .
 383 That behavior had technical problems of enforcement (e.g., for NFS)
 384 and permitted denial-of-service attacks against more privileged users
 385 (i.e., preventing individual files from being updated
 386 by bind mounting on top of them).
 387 .\"
 388 .\" ============================================================
 389 .\"
 390 .SS User and group ID mappings: uid_map and gid_map
 391 When a user namespace is created,
 392 it starts out without a mapping of user IDs (group IDs)
 393 to the parent user namespace.
 394 The
 395 .IR /proc/[pid]/uid_map
 396 and
 397 .IR /proc/[pid]/gid_map
 398 files (available since Linux 3.5)
 399 .\" commit 22d917d80e842829d0ca0a561967d728eb1d6303
 400 expose the mappings for user and group IDs
 401 inside the user namespace for the process
 402 .IR pid .
 403 These files can be read to view the mappings in a user namespace and
 404 written to (once) to define the mappings.
 405
 406 The description in the following paragraphs explains the details for
 407 .IR uid_map ;
 408 .IR gid_map
 409 is exactly the same,
 410 but each instance of "user ID" is replaced by "group ID".
 411
 412 The
 413 .I uid_map
 414 file exposes the mapping of user IDs from the user namespace
 415 of the process
 416 .IR pid
 417 to the user namespace of the process that opened
 418 .IR uid_map
 419 (but see a qualification to this point below).
 420 In other words, processes that are in different user namespaces
 421 will potentially see different values when reading from a particular
 422 .I uid_map
 423 file, depending on the user ID mappings for the user namespaces
 424 of the reading processes.
 425
 426 Each line in the
 427 .I uid_map
 428 file specifies a 1-to-1 mapping of a range of contiguous
 429 user IDs between two user namespaces.
 430 (When a user namespace is first created, this file is empty.)
 431 The specification in each line takes the form of
 432 three numbers delimited by white space.
 433 The first two numbers specify the starting user ID in
 434 each of the two user namespaces.
 435 The third number specifies the length of the mapped range.
 436 In detail, the fields are interpreted as follows:
 437 .IP (1) 4
 438 The start of the range of user IDs in
 439 the user namespace of the process
 440 .IR pid .
 441 .IP (2)
 442 The start of the range of user
 443 IDs to which the user IDs specified by field one map.
 444 How field two is interpreted depends on whether the process that opened
 445 .I uid_map
 446 and the process
 447 .IR pid
 448 are in the same user namespace, as follows:
 449 .RS
 450 .IP a) 3
 451 If the two processes are in different user namespaces:
 452 field two is the start of a range of
 453 user IDs in the user namespace of the process that opened
 454 .IR uid_map .
 455 .IP b)
 456 If the two processes are in the same user namespace:
 457 field two is the start of the range of
 458 user IDs in the parent user namespace of the process
 459 .IR pid .
 460 This case enables the opener of
 461 .I uid_map
 462 (the common case here is opening
 463 .IR /proc/self/uid_map )
 464 to see the mapping of user IDs into the user namespace of the process
 465 that created this user namespace.
 466 .RE
 467 .IP (3)
 468 The length of the range of user IDs that is mapped between the two
 469 user namespaces.
 470 .PP
 471 System calls that return user IDs (group IDs)\(emfor example,
 472 .BR getuid (2),
 473 .BR getgid (2),
 474 and the credential fields in the structure returned by
 475 .BR stat (2)\(emreturn
 476 the user ID (group ID) mapped into the caller's user namespace.
 477
 478 When a process accesses a file, its user and group IDs
 479 are mapped into the initial user namespace for the purpose of permission
 480 checking and assigning IDs when creating a file.
 481 When a process retrieves file user and group IDs via
 482 .BR stat (2),
 483 the IDs are mapped in the opposite direction,
 484 to produce values relative to the process user and group ID mappings.
 485
 486 The initial user namespace has no parent namespace,
 487 but, for consistency, the kernel provides dummy user and group
 488 ID mapping files for this namespace.
 489 Looking at the
 490 .I uid_map
 491 file
 492 .RI ( gid_map
 493 is the same) from a shell in the initial namespace shows:
 494
 495 .in +4n
 496 .nf
 497 $ \fBcat /proc/$$/uid_map\fP
 498          0          0 4294967295
 499 .fi
 500 .in
 501
 502 This mapping tells us
 503 that the range starting at user ID 0 in this namespace
 504 maps to a range starting at 0 in the (nonexistent) parent namespace,
 505 and the length of the range is the largest 32-bit unsigned integer.
 506 This leaves 4294967295 (the 32-bit signed \-1 value) unmapped.
 507 This is deliberate:
 508 .IR "(uid_t)\ \-1"
 509 is used in several interfaces (e.g.,
 510 .BR setreuid (2))
 511 as a way to specify "no user ID".
 512 Leaving
 513 .IR "(uid_t)\ \-1"
 514 unmapped and unusable guarantees that there will be no
 515 confusion when using these interfaces.
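.PP
The three-field format and the arithmetic behind the initial namespace's
mapping can be checked with a small sketch.
The Python code below is an illustration, not part of any library:
.I parse_id_map
and
.I map_id_up
are hypothetical helper names,
and the map text is passed in as a string rather than read from
.IR /proc .
.PP
.in +4n
.nf
```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int, int]   # (start inside, start outside, length)


def parse_id_map(text: str) -> List[Extent]:
    """Parse the contents of a uid_map or gid_map file."""
    extents = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(f) for f in line.split())
            extents.append((inside, outside, length))
    return extents


def map_id_up(extents: List[Extent], inside_id: int) -> Optional[int]:
    """Translate an ID from field one to field two; None if unmapped."""
    for inside, outside, length in extents:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None


# The line shown for the initial user namespace.
initial = parse_id_map("         0          0 4294967295")
print(map_id_up(initial, 4294967294))   # the last ID inside the range
print(map_id_up(initial, 4294967295))   # (uid_t) -1 lies outside it
```
.fi
.in
.PP
A range of length 4294967295 starting at 0 ends at 4294967294,
which is why the value 4294967295 has no mapping in this sketch.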
 516 .\"
 517 .\" ============================================================
 518 .\"
 519 .SS Defining user and group ID mappings: writing to uid_map and gid_map
 520 .PP
 521 After the creation of a new user namespace, the
 522 .I uid_map
 523 file of
 524 .I one
 525 of the processes in the namespace may be written to
 526 .I once
 527 to define the mapping of user IDs in the new user namespace.
 528 An attempt to write more than once to a
 529 .I uid_map
 530 file in a user namespace fails with the error
 531 .BR EPERM .
 532 Similar rules apply for
 533 .I gid_map
 534 files.
 535
 536 The lines written to
 537 .IR uid_map
 538 .RI ( gid_map )
 539 must conform to the following rules:
 540 .IP * 3
 541 The three fields must be valid numbers,
 542 and the last field must be greater than 0.
 543 .IP *
 544 Lines are terminated by newline characters.
 545 .IP *
 546 There is an (arbitrary) limit on the number of lines in the file.
 547 As at Linux 3.18, the limit is five lines.
 548 In addition, the number of bytes written to
 549 the file must be less than the system page size,
 550 .\" FIXME(Eric): the restriction "less than" rather than "less than or equal"
 551 .\" seems strangely arbitrary. Furthermore, the comment does not agree
 552 .\" with the code in kernel/user_namespace.c. Which is correct?
 553 and the write must be performed at the start of the file (i.e.,
 554 .BR lseek (2)
 555 and
 556 .BR pwrite (2)
 557 can't be used to write to nonzero offsets in the file).
 558 .IP *
 559 The range of user IDs (group IDs)
 560 specified in each line cannot overlap with the ranges
 561 in any other lines.
 562 In the initial implementation (Linux 3.8), this requirement was
 563 satisfied by a simplistic implementation that imposed the further
 564 requirement that
 565 the values in both field 1 and field 2 of successive lines must be
 566 in ascending numerical order,
 567 which prevented some otherwise valid maps from being created.
 568 Linux 3.9 and later
 569 .\" commit 0bd14b4fd72afd5df41e9fd59f356740f22fceba
 570 fix this limitation, allowing any valid set of nonoverlapping maps.
 571 .IP *
 572 At least one line must be written to the file.
 573 .PP
 574 Writes that violate the above rules fail with the error
 575 .BR EINVAL .
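.PP
The format rules above can be expressed as a checker that a program
might run before attempting the write.
This Python sketch is an approximation of the rules as listed here,
not a copy of the kernel's parser:
.I check_map_text
and
.I overlaps
are hypothetical names,
the five-line limit is the figure quoted above for Linux 3.18,
and testing both the first and the second field for overlap
is an assumption.
.PP
.in +4n
.nf
```python
import errno
import os
from typing import List, Tuple

MAX_LINES = 5   # limit quoted in the text for Linux 3.18


def overlaps(start_a: int, len_a: int, start_b: int, len_b: int) -> bool:
    """True if two half-open ranges share at least one ID."""
    return start_a < start_b + len_b and start_b < start_a + len_a


def check_map_text(text: str) -> List[Tuple[int, int, int]]:
    """Return the parsed extents, or raise OSError(EINVAL)."""
    def invalid(reason: str) -> OSError:
        return OSError(errno.EINVAL, reason)

    if len(text.encode()) >= os.sysconf("SC_PAGE_SIZE"):
        raise invalid("write must be smaller than the page size")
    lines = text.splitlines()
    if not lines:
        raise invalid("at least one line is required")
    if len(lines) > MAX_LINES:
        raise invalid("too many lines")
    extents: List[Tuple[int, int, int]] = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            raise invalid("each line needs three fields")
        if not all(f.isascii() and f.isdigit() for f in fields):
            raise invalid("fields must be decimal numbers")
        inside, outside, length = (int(f) for f in fields)
        if length == 0:
            raise invalid("length must be greater than 0")
        # Assumption: neither column may overlap an earlier line.
        for prev_in, prev_out, prev_len in extents:
            if (overlaps(inside, length, prev_in, prev_len)
                    or overlaps(outside, length, prev_out, prev_len)):
                raise invalid("ranges overlap")
        extents.append((inside, outside, length))
    return extents
```
.fi
.in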
 576
 577 In order for a process to write to the
 578 .I /proc/[pid]/uid_map
 579 .RI ( /proc/[pid]/gid_map )
 580 file, all of the following requirements must be met:
 581 .IP 1. 3
 582 The writing process must have the
 583 .BR CAP_SETUID
 584 .RB ( CAP_SETGID )
 585 capability in the user namespace of the process
 586 .IR pid .
 587 .IP 2.
 588 The writing process must either be in the user namespace of the process
 589 .I pid
 590 or be in the parent user namespace of the process
 591 .IR pid .
 592 .IP 3.
 593 The mapped user IDs (group IDs) must in turn have a mapping
 594 in the parent user namespace.
 595 .IP 4.
 596 One of the following two cases applies:
 597 .RS
 598 .IP * 3
 599 .IR Either
 600 the writing process has the
 601 .BR CAP_SETUID
 602 .RB ( CAP_SETGID )
 603 capability in the
 604 .I parent
 605 user namespace.
 606 .RS
 607 .IP + 3
 608 No further restrictions apply:
 609 the process can make mappings to arbitrary user IDs (group IDs)
 610 in the parent user namespace.
 611 .RE
 612 .IP * 3
 613 .IR Or
 614 otherwise all of the following restrictions apply:
 615 .RS
 616 .IP + 3
 617 The data written to
 618 .I uid_map
 619 .RI ( gid_map )
 620 must consist of a single line that maps
 621 the writing process's effective user ID
 622 (group ID) in the parent user namespace to a user ID (group ID)
 623 in the user namespace.
 624 .IP +
 625 The writing process must have the same effective user ID as the process
 626 that created the user namespace.
 627 .IP +
 628 In the case of
 629 .IR gid_map ,
 630 use of the
 631 .BR setgroups (2)
 632 system call must first be denied by writing
 633 .RI \(dq deny \(dq
 634 to the
 635 .I /proc/[pid]/setgroups
 636 file (see below) before writing to
 637 .IR gid_map .
 638 .RE
 639 .RE
 640 .PP
 641 Writes that violate the above rules fail with the error
 642 .BR EPERM .
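.PP
For the restricted case (no
.BR CAP_SETUID
or
.BR CAP_SETGID
in the parent user namespace),
the order of the writes matters.
The following Python sketch shows one possible ordering;
.I write_single_id_maps
and its
.I proc_root
parameter are hypothetical,
and the function performs no privilege checks of its own,
so the kernel would still reject the writes if the rules above are not met.
.PP
.in +4n
.nf
```python
import os


def write_single_id_maps(proc_root: str, pid: int,
                         inside_uid: int, outside_uid: int,
                         inside_gid: int, outside_gid: int) -> None:
    """Write one-line uid_map and gid_map files for process pid.

    proc_root is normally "/proc"; it is a parameter so that the
    ordering can be exercised against an ordinary directory.
    outside_uid and outside_gid are expected to be the writer's
    effective IDs in the parent user namespace.
    """
    base = os.path.join(proc_root, str(pid))

    def write_once(name: str, content: str) -> None:
        # print() appends the terminating newline; the small buffered
        # payload reaches the kernel as one write at offset 0.
        with open(os.path.join(base, name), "w") as f:
            print(content, file=f)

    # setgroups(2) must be denied before gid_map is written.
    write_once("setgroups", "deny")
    write_once("gid_map", "%d %d 1" % (inside_gid, outside_gid))
    write_once("uid_map", "%d %d 1" % (inside_uid, outside_uid))
```
.fi
.in
.PP
A typical call would pass 0 as the inside IDs and the results of
.BR geteuid (2)
and
.BR getegid (2),
obtained in the parent user namespace, as the outside IDs.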
 643 .\"
 644 .\" ============================================================
 645 .\"
 646 .SS Interaction with system calls that change process UIDs or GIDs
 647 In a user namespace where the
 648 .I uid_map
 649 file has not been written, the system calls that change user IDs will fail.
 650 Similarly, if the
 651 .I gid_map
 652 file has not been written, the system calls that change group IDs will fail.
 653 After the
 654 .I uid_map
 655 and
 656 .I gid_map
 657 files have been written, only the mapped values may be used in
 658 system calls that change user and group IDs.
 659
 660 For user IDs, the relevant system calls include
 661 .BR setuid (2),
 662 .BR setfsuid (2),
 663 .BR setreuid (2),
 664 and
 665 .BR setresuid (2).
 666 For group IDs, the relevant system calls include
 667 .BR setgid (2),
 668 .BR setfsgid (2),
 669 .BR setregid (2),
 670 .BR setresgid (2),
 671 and
 672 .BR setgroups (2).
 673
 674 Writing
 675 .RI \(dq deny \(dq
 676 to the
 677 .I /proc/[pid]/setgroups
 678 file before writing to
 679 .I /proc/[pid]/gid_map
 680 .\" Things changed in Linux 3.19
 681 .\" commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8
 682 .\" commit 66d2f338ee4c449396b6f99f5e75cd18eb6df272
 683 .\" http://lwn.net/Articles/626665/
 684 will permanently disable
 685 .BR setgroups (2)
 686 in a user namespace and allow writing to
 687 .I /proc/[pid]/gid_map
 688 without having the
 689 .BR CAP_SETGID
 690 capability in the parent user namespace.
 691 .\"
 692 .\" ============================================================
 693 .\"
 694 .SS The /proc/[pid]/setgroups file
 695 .\"
 696 .\" commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8
 697 .\" commit 66d2f338ee4c449396b6f99f5e75cd18eb6df272
 698 .\" http://lwn.net/Articles/626665/
 699 .\" http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-8989
 700 .\"
 701 The
 702 .I /proc/[pid]/setgroups
 703 file displays the string
 704 .RI \(dq allow \(dq
 705 if processes in the user namespace that contains the process
 706 .I pid
 707 are permitted to employ the
 708 .BR setgroups (2)
 709 system call; it displays
 710 .RI \(dq deny \(dq
 711 if
 712 .BR setgroups (2)
 713 is not permitted in that user namespace.
 714 Note that regardless of the value in the
 715 .I /proc/[pid]/setgroups
 716 file (and regardless of the process's capabilities), calls to
 717 .BR setgroups (2)
 718 are also not permitted if
 719 .IR /proc/[pid]/gid_map
 720 has not yet been set.
 721
 722 A privileged process (one with the
 723 .BR CAP_SYS_ADMIN
 724 capability in the namespace) may write either of the strings
 725 .RI \(dq allow \(dq
 726 or
 727 .RI \(dq deny \(dq
 728 to this file
 729 .I before
 730 writing a group ID mapping
 731 for this user namespace to the file
 732 .IR /proc/[pid]/gid_map .
 733 Writing the string
 734 .RI \(dq deny \(dq
 735 prevents any process in the user namespace from employing
 736 .BR setgroups (2).
 737
 738 The essence of the restrictions described in the preceding
 739 paragraph is that it is permitted to write to
 740 .I /proc/[pid]/setgroups
 741 only so long as calling
 742 .BR setgroups (2)
 743 is disallowed because
 744 .I /proc/[pid]/gid_map
 745 has not been set.
 746 This ensures that a process cannot transition from a state where
 747 .BR setgroups (2)
 748 is allowed to a state where
 749 .BR setgroups (2)
 750 is denied;
 751 a process can only transition from
 752 .BR setgroups (2)
 753 being disallowed to
 754 .BR setgroups (2)
 755 being allowed.
 756
 757 The default value of this file in the initial user namespace is
 758 .RI \(dq allow \(dq.
 759
 760 Once
 761 .IR /proc/[pid]/gid_map
 762 has been written to
 763 (which has the effect of enabling
 764 .BR setgroups (2)
 765 in the user namespace),
 766 it is no longer possible to disallow
 767 .BR setgroups (2)
 768 by writing
 769 .RI \(dq deny \(dq
 770 to
 771 .IR /proc/[pid]/setgroups
 772 (the write fails with the error
 773 .BR EPERM ).
 774
 775 A child user namespace inherits the
 776 .IR /proc/[pid]/setgroups
 777 setting from its parent.
 778
 779 If the
 780 .I setgroups
 781 file has the value
 782 .RI \(dq deny \(dq,
 783 then the
 784 .BR setgroups (2)
 785 system call can't subsequently be reenabled (by writing
 786 .RI \(dq allow \(dq
 787 to the file) in this user namespace.
 788 (Attempts to do so will fail with the error
 789 .BR EPERM .)
 790 This restriction also propagates down to all child user namespaces of
 791 this user namespace.
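.PP
The permitted transitions described above can be summarized as a small
state model.
The Python sketch below is a simplification:
.I SetgroupsState
is a hypothetical class,
the capability checks for the writer are not modeled,
and returning
.B EINVAL
for an unrecognized string is an assumption.
.PP
.in +4n
.nf
```python
import errno
from typing import Optional


class SetgroupsState:
    """Hypothetical model of one user namespace's setgroups setting."""

    def __init__(self, parent: Optional["SetgroupsState"] = None):
        # A child namespace inherits the setting; the initial
        # namespace defaults to "allow" and already has a mapping.
        self.value = parent.value if parent is not None else "allow"
        self.gid_map_written = parent is None

    def write_setgroups(self, value: str) -> None:
        if value not in ("allow", "deny"):
            raise OSError(errno.EINVAL, "expected allow or deny")
        if value == "deny" and self.gid_map_written:
            raise OSError(errno.EPERM, "gid_map is already set")
        if value == "allow" and self.value == "deny":
            raise OSError(errno.EPERM, "deny cannot be undone")
        self.value = value

    def write_gid_map(self) -> None:
        if self.gid_map_written:
            raise OSError(errno.EPERM, "gid_map may be written once")
        self.gid_map_written = True

    def setgroups_permitted(self) -> bool:
        # Both conditions are needed: a mapping and the value "allow".
        return self.gid_map_written and self.value == "allow"
```
.fi
.in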
 792
 793 The
 794 .I /proc/[pid]/setgroups
 795 file was added in Linux 3.19,
 796 but was backported to many earlier stable kernel series,
 797 because it addresses a security issue.
 798 The issue concerned files with permissions such as "rwx\-\-\-rwx".
 799 Such files give fewer permissions to "group" than they do to "other".
 800 This means that dropping groups using
 801 .BR setgroups (2)
 802 might allow a process file access that it did not formerly have.
 803 Before the existence of user namespaces this was not a concern,
 804 since only a privileged process (one with the
 805 .BR CAP_SETGID
 806 capability) could call
 807 .BR setgroups (2).
 808 However, with the introduction of user namespaces,
 809 it became possible for an unprivileged process to create
 810 a new namespace in which the user had all privileges.
 811 This then allowed formerly unprivileged
 812 users to drop groups and thus gain file access
 813 that they did not previously have.
 814 The
 815 .I /proc/[pid]/setgroups
 816 file was added to address this security issue,
 817 by denying any pathway for an unprivileged process to drop groups with
 818 .BR setgroups (2).
 819 .\"
 820 .\" /proc/PID/setgroups
 821 .\"     [allow == setgroups() is allowed, "deny" == setgroups() is disallowed]
 822 .\"     * Can write if have CAP_SYS_ADMIN in NS
 823 .\"     * Must write BEFORE writing to /proc/PID/gid_map
 824 .\"
 825 .\" setgroups()
 826 .\"     * Must already have written to gid_maps
 827 .\"     * /proc/PID/setgroups must be "allow"
 828 .\"
 829 .\" /proc/PID/gid_map -- writing
 830 .\"     * Must already have written "deny" to /proc/PID/setgroups
 831 .\"
 832 .\" ============================================================
 833 .\"
 834 .SS Unmapped user and group IDs
 835 .PP
 836 There are various places where an unmapped user ID (group ID)
 837 may be exposed to user space.
 838 For example, the first process in a new user namespace may call
 839 .BR getuid (2)
 840 before a user ID mapping has been defined for the namespace.
 841 In most such cases, an unmapped user ID is converted
 842 .\" from_kuid_munged(), from_kgid_munged()
 843 to the overflow user ID (group ID);
 844 the default value for the overflow user ID (group ID) is 65534.
 845 See the descriptions of
 846 .IR /proc/sys/kernel/overflowuid
 847 and
 848 .IR /proc/sys/kernel/overflowgid
 849 in
 850 .BR proc (5).
 851
 852 The cases where unmapped IDs are mapped in this fashion include
 853 system calls that return user IDs
 854 .RB ( getuid (2),
 855 .BR getgid (2),
 856 and similar),
 857 credentials passed over a UNIX domain socket,
 858 .\" also SO_PEERCRED
 859 credentials returned by
 860 .BR stat (2),
 861 .BR waitid (2),
 862 and the System V IPC "ctl"
 863 .B IPC_STAT
 864 operations,
 865 credentials exposed by
 866 .IR /proc/PID/status
 867 and the files in
 868 .IR /proc/sysvipc/* ,
 869 credentials returned via the
 870 .I si_uid
 871 field in the
 872 .I siginfo_t
 873 received with a signal (see
 874 .BR sigaction (2)),
 875 credentials written to the process accounting file (see
 876 .BR acct (5)),
 877 and credentials returned with POSIX message queue notifications (see
 878 .BR mq_notify (3)).
 879
 880 There is one notable case where unmapped user and group IDs are
 881 .I not
 882 .\" from_kuid(), from_kgid()
 883 .\" Also F_GETOWNER_UIDS is an exception
 884 converted to the corresponding overflow ID value.
 885 When viewing a
 886 .I uid_map
 887 or
 888 .I gid_map
 889 file in which there is no mapping for the second field,
 890 that field is displayed as 4294967295 (\-1 as an unsigned integer).
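.PP
The difference between the two conversions can be illustrated as follows.
This Python sketch uses hypothetical helper names
.RI ( map_id_down ,
.IR id_for_user_space ,
.IR id_for_map_file )
and hard-codes the default overflow ID of 65534;
a real program would read the current value from
.IR /proc/sys/kernel/overflowuid .
.PP
.in +4n
.nf
```python
from typing import List, Optional, Tuple

OVERFLOW_ID = 65534          # default overflow user ID and group ID
NO_ID = 4294967295           # (uid_t) -1

Extent = Tuple[int, int, int]   # (start inside, start outside, length)


def map_id_down(extents: List[Extent], outside_id: int) -> Optional[int]:
    """Translate an ID into the namespace (field two to field one)."""
    for inside, outside, length in extents:
        if outside <= outside_id < outside + length:
            return inside + (outside_id - outside)
    return None


def id_for_user_space(extents: List[Extent], outside_id: int) -> int:
    """Value that getuid(2), stat(2), and similar would report."""
    mapped = map_id_down(extents, outside_id)
    return OVERFLOW_ID if mapped is None else mapped


def id_for_map_file(extents: List[Extent], outside_id: int) -> int:
    """Value that field two of a uid_map or gid_map would display."""
    mapped = map_id_down(extents, outside_id)
    return NO_ID if mapped is None else mapped
```
.fi
.in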
 891 .\"
 892 .\" ============================================================
 893 .\"
 894 .SS Set-user-ID and set-group-ID programs
 895 .PP
 896 When a process inside a user namespace executes
 897 a set-user-ID (set-group-ID) program,
 898 the process's effective user (group) ID inside the namespace is changed
 899 to whatever value is mapped for the user (group) ID of the file.
 900 However, if either the user
 901 .I or
 902 the group ID of the file has no mapping inside the namespace,
 903 the set-user-ID (set-group-ID) bit is silently ignored:
 904 the new program is executed,
 905 but the process's effective user (group) ID is left unchanged.
 906 (This mirrors the semantics of executing a set-user-ID or set-group-ID
 907 program that resides on a filesystem that was mounted with the
 908 .BR MS_NOSUID
 909 flag, as described in
 910 .BR mount (2).)
 911 .\"
 912 .\" ============================================================
 913 .\"
 914 .SS Miscellaneous
 915 .PP
 916 When a process's user and group IDs are passed over a UNIX domain socket
 917 to a process in a different user namespace (see the description of
 918 .B SCM_CREDENTIALS
 919 in
 920 .BR unix (7)),
 921 they are translated into the corresponding values as per the
 922 receiving process's user and group ID mappings.
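.PP
The following sketch shows one way to observe the credentials that
the kernel attaches to a message on a UNIX domain socket.
It uses a socket pair within a single process, so the sender and
receiver are in the same user namespace and no translation is visible;
a receiver in a different user namespace would see the IDs as
translated by its own mappings.
The function name fetch_sender_creds() is hypothetical.

```c
#define _GNU_SOURCE             /* For 'struct ucred' */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send one byte over a UNIX domain socket pair and copy the sender's
   credentials, as delivered to the receiving socket, into 'cred'.
   Returns 0 on success, or -1 on error. */

int
fetch_sender_creds(struct ucred *cred)
{
    int sv[2], on = 1, result = -1;
    char data = 'x';
    union {                     /* Ensure suitable alignment */
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(cred != NULL);

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    /* Ask for SCM_CREDENTIALS to be attached to received messages */

    if (setsockopt(sv[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) == -1)
        goto out;

    if (write(sv[0], &data, 1) != 1)
        goto out;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sv[1], &msg, 0) == -1)
        goto out;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
            cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
                cmsg->cmsg_type == SCM_CREDENTIALS) {
            memcpy(cred, CMSG_DATA(cmsg), sizeof(*cred));
            result = 0;
            break;
        }
    }

out:
    close(sv[0]);
    close(sv[1]);
    return result;
}
```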
 923 .\"
 924 .SH CONFORMING TO
 925 Namespaces are a Linux-specific feature.
 926 .\"
 927 .SH NOTES
 928 Over the years, many features have been added to the Linux kernel
 929 that are available only to privileged users
 930 because of their potential to confuse set-user-ID-root applications.
 931 In general, it is safe to allow the root user in a user namespace to
 932 use those features because it is impossible, while in a user namespace,
 933 to gain more privilege than the root user of that user namespace has.
 934 .\"
 935 .\" ============================================================
 936 .\"
 937 .SS Availability
 938 Use of user namespaces requires a kernel that is configured with the
 939 .B CONFIG_USER_NS
 940 option.
 941 User namespaces require support in a range of subsystems across
 942 the kernel.
 943 When an unsupported subsystem is configured into the kernel,
 944 it is not possible to configure user namespaces support.
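.PP
One possible run-time check for user namespace support is to test for
the existence of /proc/self/ns/user, on the assumption that this file
is present only on kernels built with user namespace support.
The sketch below takes that approach; the function names
ns_file_present() and user_ns_supported() are hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* Return 1 if 'path' exists, 0 otherwise */

int
ns_file_present(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}

/* Return 1 if the running kernel appears to support user namespaces.
   This assumes that /proc is mounted; if it is not, the function
   returns 0 even on a kernel that supports user namespaces. */

int
user_ns_supported(void)
{
    return ns_file_present("/proc/self/ns/user");
}
```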
 945
 946 As at Linux 3.8, most relevant subsystems supported user namespaces,
 947 but a number of filesystems did not have the infrastructure needed
 948 to map user and group IDs between user namespaces.
 949 Linux 3.9 added the required infrastructure support for many of
 950 the remaining unsupported filesystems
 951 (Plan 9 (9P), Andrew File System (AFS), Ceph, CIFS, CODA, NFS, and OCFS2).
 952 Linux 3.12 added support for the last of the unsupported major filesystems,
 953 .\" commit d6970d4b726cea6d7a9bc4120814f95c09571fc3
 954 XFS.
 955 .\"
 956 .SH EXAMPLE
 957 The program below is designed to allow experimenting with
 958 user namespaces, as well as other types of namespaces.
 959 It creates namespaces as specified by command-line options and then executes
 960 a command inside those namespaces.
 961 The comments and
 962 .I usage()
 963 function inside the program provide a full explanation of the program.
 964 The following shell session demonstrates its use.
 965
 966 First, we look at the run-time environment:
 967
 968 .in +4n
 969 .nf
 970 $ \fBuname -rs\fP     # Need Linux 3.8 or later
 971 Linux 3.8.0
 972 $ \fBid -u\fP         # Running as unprivileged user
 973 1000
 974 $ \fBid -g\fP
 975 1000
 976 .fi
 977 .in
 978
 979 Now start a new shell in new user
 980 .RI ( \-U ),
 981 mount
 982 .RI ( \-m ),
 983 and PID
 984 .RI ( \-p )
 985 namespaces, with user ID
 986 .RI ( \-M )
 987 and group ID
 988 .RI ( \-G )
 989 1000 mapped to 0 inside the user namespace:
 990
 991 .in +4n
 992 .nf
 993 $ \fB./userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' bash\fP
 994 .fi
 995 .in
 996
 997 The shell has PID 1, because it is the first process in the new
 998 PID namespace:
 999
1000 .in +4n
1001 .nf
1002 bash$ \fBecho $$\fP
1003 1
1004 .fi
1005 .in
1006
1007 Inside the user namespace, the shell has user and group ID 0,
1008 and a full set of permitted and effective capabilities:
1009
1010 .in +4n
1011 .nf
1012 bash$ \fBcat /proc/$$/status | egrep '^[UG]id'\fP
1013 Uid:    0       0       0       0
1014 Gid:    0       0       0       0
1015 bash$ \fBcat /proc/$$/status | egrep '^Cap(Prm|Inh|Eff)'\fP
1016 CapInh: 0000000000000000
1017 CapPrm: 0000001fffffffff
1018 CapEff: 0000001fffffffff
1019 .fi
1020 .in
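.PP
The capability sets in /proc/PID/status are displayed as hexadecimal
bit masks, with one bit per capability.
As a small sketch, the following helper counts the bits that are set
in such a mask; for the value 0000001fffffffff shown above,
the count is 37.
The function name count_caps() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Count the capabilities that are set in a hexadecimal mask string
   taken from a CapInh, CapPrm, or CapEff line of /proc/PID/status */

int
count_caps(const char *hex_mask)
{
    unsigned long long mask;
    int count = 0;

    assert(hex_mask != NULL);

    mask = strtoull(hex_mask, NULL, 16);
    while (mask != 0) {
        count += mask & 1;      /* Add the lowest bit */
        mask >>= 1;
    }
    return count;
}
```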
1021
1022 Mounting a new
1023 .I /proc
1024 filesystem and listing all of the processes visible
1025 in the new PID namespace shows that the shell can't see
1026 any processes outside the PID namespace:
1027
1028 .in +4n
1029 .nf
1030 bash$ \fBmount -t proc proc /proc\fP
1031 bash$ \fBps ax\fP
1032   PID TTY      STAT   TIME COMMAND
1033     1 pts/3    S      0:00 bash
1034    22 pts/3    R+     0:00 ps ax
1035 .fi
1036 .in
1037 .SS Program source
1038 \&
1039 .nf
1040 /* userns_child_exec.c
1041
1042    Licensed under GNU General Public License v2 or later
1043
1044    Create a child process that executes a shell command in new
1045    namespace(s); allow UID and GID mappings to be specified when
1046    creating a user namespace.
1047 */
1048 #define _GNU_SOURCE
1049 #include <sched.h>
1050 #include <unistd.h>
1051 #include <stdlib.h>
1052 #include <sys/wait.h>
1053 #include <signal.h>
1054 #include <fcntl.h>
1055 #include <stdio.h>
1056 #include <string.h>
1057 #include <limits.h>
1058 #include <errno.h>
1059
1060 /* A simple error\-handling function: print an error message based
1061    on the value in \(aqerrno\(aq and terminate the calling process */
1062
1063 #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
1064                         } while (0)
1065
1066 struct child_args {
1067     char **argv;        /* Command to be executed by child, with args */
1068     int    pipe_fd[2];  /* Pipe used to synchronize parent and child */
1069 };
1070
1071 static int verbose;
1072
1073 static void
1074 usage(char *pname)
1075 {
1076     fprintf(stderr, "Usage: %s [options] cmd [arg...]\\n\\n", pname);
1077     fprintf(stderr, "Create a child process that executes a shell "
1078             "command in a new user namespace,\\n"
1079             "and possibly also other new namespace(s).\\n\\n");
1080     fprintf(stderr, "Options can be:\\n\\n");
1081 #define fpe(str) fprintf(stderr, "    %s", str);
1082     fpe("\-i          New IPC namespace\\n");
1083     fpe("\-m          New mount namespace\\n");
1084     fpe("\-n          New network namespace\\n");
1085     fpe("\-p          New PID namespace\\n");
1086     fpe("\-u          New UTS namespace\\n");
1087     fpe("\-U          New user namespace\\n");
1088     fpe("\-M uid_map  Specify UID map for user namespace\\n");
1089     fpe("\-G gid_map  Specify GID map for user namespace\\n");
1090     fpe("\-z          Map user\(aqs UID and GID to 0 in user namespace\\n");
1091     fpe("            (equivalent to: \-M \(aq0 <uid> 1\(aq \-G \(aq0 <gid> 1\(aq)\\n");
1092     fpe("\-v          Display verbose messages\\n");
1093     fpe("\\n");
1094     fpe("If \-z, \-M, or \-G is specified, \-U is required.\\n");
1095     fpe("It is not permitted to specify both \-z and either \-M or \-G.\\n");
1096     fpe("\\n");
1097     fpe("Map strings for \-M and \-G consist of records of the form:\\n");
1098     fpe("\\n");
1099     fpe("    ID\-inside\-ns   ID\-outside\-ns   len\\n");
1100     fpe("\\n");
1101     fpe("A map string can contain multiple records, separated"
1102         " by commas;\\n");
1103     fpe("the commas are replaced by newlines before writing"
1104         " to map files.\\n");
1105
1106     exit(EXIT_FAILURE);
1107 }
1108
1109 /* Update the mapping file \(aqmap_file\(aq, with the value provided in
1110    \(aqmapping\(aq, a string that defines a UID or GID mapping. A UID or
1111    GID mapping consists of one or more newline\-delimited records
1112    of the form:
1113
1114        ID\-inside\-ns    ID\-outside\-ns   length
1115
1116    Requiring the user to supply a string that contains newlines is
1117    of course inconvenient for command\-line use. Thus, we permit the
1118    use of commas to delimit records in this string, and replace them
1119    with newlines before writing the string to the file. */
1120
1121 static void
1122 update_map(char *mapping, char *map_file)
1123 {
1124     int fd, j;
1125     size_t map_len;     /* Length of \(aqmapping\(aq */
1126
1127     /* Replace commas in mapping string with newlines */
1128
1129     map_len = strlen(mapping);
1130     for (j = 0; j < map_len; j++)
1131         if (mapping[j] == \(aq,\(aq)
1132             mapping[j] = \(aq\\n\(aq;
1133
1134     fd = open(map_file, O_RDWR);
1135     if (fd == \-1) {
1136         fprintf(stderr, "ERROR: open %s: %s\\n", map_file,
1137                 strerror(errno));
1138         exit(EXIT_FAILURE);
1139     }
1140
1141     if (write(fd, mapping, map_len) != map_len) {
1142         fprintf(stderr, "ERROR: write %s: %s\\n", map_file,
1143                 strerror(errno));
1144         exit(EXIT_FAILURE);
1145     }
1146
1147     close(fd);
1148 }
1149
1150 /* Linux 3.19 made a change in the handling of setgroups(2) and the
1151    \(aqgid_map\(aq file to address a security issue. The issue allowed
1152    *unprivileged* users to employ user namespaces in order to drop
1153    groups. The upshot of the 3.19 changes is that in order to update the
1154    \(aqgid_map\(aq file, use of the setgroups() system call in this
1155    user namespace must first be disabled by writing "deny" to one of
1156    the /proc/PID/setgroups files for this namespace.  That is the
1157    purpose of the following function. */
1158
1159 static void
1160 proc_setgroups_write(pid_t child_pid, char *str)
1161 {
1162     char setgroups_path[PATH_MAX];
1163     int fd;
1164
1165     snprintf(setgroups_path, PATH_MAX, "/proc/%ld/setgroups",
1166             (long) child_pid);
1167
1168     fd = open(setgroups_path, O_RDWR);
1169     if (fd == \-1) {
1170
1171         /* We may be on a system that doesn\(aqt support
1172            /proc/PID/setgroups. In that case, the file won\(aqt exist,
1173            and the system won\(aqt impose the restrictions that Linux 3.19
1174            added. That\(aqs fine: we don\(aqt need to do anything in order
1175            to permit \(aqgid_map\(aq to be updated.
1176
1177            However, if the error from open() was something other than
1178            the ENOENT error that is expected for that case, let the
1179            user know. */
1180
1181         if (errno != ENOENT)
1182             fprintf(stderr, "ERROR: open %s: %s\\n", setgroups_path,
1183                 strerror(errno));
1184         return;
1185     }
1186
1187     if (write(fd, str, strlen(str)) == \-1)
1188         fprintf(stderr, "ERROR: write %s: %s\\n", setgroups_path,
1189             strerror(errno));
1190
1191     close(fd);
1192 }
1193
1194 static int              /* Start function for cloned child */
1195 childFunc(void *arg)
1196 {
1197     struct child_args *args = (struct child_args *) arg;
1198     char ch;
1199
1200     /* Wait until the parent has updated the UID and GID mappings.
1201        See the comment in main(). We wait for end of file on a
1202        pipe that will be closed by the parent process once it has
1203        updated the mappings. */
1204
1205     close(args\->pipe_fd[1]);    /* Close our descriptor for the write
1206                                    end of the pipe so that we see EOF
1207                                    when parent closes its descriptor */
1208     if (read(args\->pipe_fd[0], &ch, 1) != 0) {
1209         fprintf(stderr,
1210                 "Failure in child: read from pipe returned != 0\\n");
1211         exit(EXIT_FAILURE);
1212     }
1213
1214     /* Execute a shell command */
1215
1216     printf("About to exec %s\\n", args\->argv[0]);
1217     execvp(args\->argv[0], args\->argv);
1218     errExit("execvp");
1219 }
1220
1221 #define STACK_SIZE (1024 * 1024)
1222
1223 static char child_stack[STACK_SIZE];    /* Space for child\(aqs stack */
1224
1225 int
1226 main(int argc, char *argv[])
1227 {
1228     int flags, opt, map_zero;
1229     pid_t child_pid;
1230     struct child_args args;
1231     char *uid_map, *gid_map;
1232     const int MAP_BUF_SIZE = 100;
1233     char map_buf[MAP_BUF_SIZE];
1234     char map_path[PATH_MAX];
1235
1236     /* Parse command\-line options. The initial \(aq+\(aq character in
1237        the final getopt() argument prevents GNU\-style permutation
1238        of command\-line options. That\(aqs useful, since sometimes
1239        the \(aqcommand\(aq to be executed by this program itself
1240        has command\-line options. We don\(aqt want getopt() to treat
1241        those as options to this program. */
1242
1243     flags = 0;
1244     verbose = 0;
1245     gid_map = NULL;
1246     uid_map = NULL;
1247     map_zero = 0;
1248     while ((opt = getopt(argc, argv, "+imnpuUM:G:zv")) != \-1) {
1249         switch (opt) {
1250         case \(aqi\(aq: flags |= CLONE_NEWIPC;        break;
1251         case \(aqm\(aq: flags |= CLONE_NEWNS;         break;
1252         case \(aqn\(aq: flags |= CLONE_NEWNET;        break;
1253         case \(aqp\(aq: flags |= CLONE_NEWPID;        break;
1254         case \(aqu\(aq: flags |= CLONE_NEWUTS;        break;
1255         case \(aqv\(aq: verbose = 1;                  break;
1256         case \(aqz\(aq: map_zero = 1;                 break;
1257         case \(aqM\(aq: uid_map = optarg;             break;
1258         case \(aqG\(aq: gid_map = optarg;             break;
1259         case \(aqU\(aq: flags |= CLONE_NEWUSER;       break;
1260         default:  usage(argv[0]);
1261         }
1262     }
1263
1264     /* \-z, \-M, or \-G without \-U, or \-z with \-M or \-G, is nonsensical */
1265
1266     if (((uid_map != NULL || gid_map != NULL || map_zero) &&
1267                 !(flags & CLONE_NEWUSER)) ||
1268             (map_zero && (uid_map != NULL || gid_map != NULL)))
1269         usage(argv[0]);
1270
1271     args.argv = &argv[optind];
1272
1273     /* We use a pipe to synchronize the parent and child, in order to
1274        ensure that the parent sets the UID and GID maps before the child
1275        calls execve(). This ensures that the child maintains its
1276        capabilities during the execve() in the common case where we
1277        want to map the child\(aqs effective user ID to 0 in the new user
1278        namespace. Without this synchronization, the child would lose
1279        its capabilities if it performed an execve() with nonzero
1280        user IDs (see the capabilities(7) man page for details of the
1281        transformation of a process\(aqs capabilities during execve()). */
1282
1283     if (pipe(args.pipe_fd) == \-1)
1284         errExit("pipe");
1285
1286     /* Create the child in new namespace(s) */
1287
1288     child_pid = clone(childFunc, child_stack + STACK_SIZE,
1289                       flags | SIGCHLD, &args);
1290     if (child_pid == \-1)
1291         errExit("clone");
1292
1293     /* Parent falls through to here */
1294
1295     if (verbose)
1296         printf("%s: PID of child created by clone() is %ld\\n",
1297                 argv[0], (long) child_pid);
1298
1299     /* Update the UID and GID maps in the child */
1300
1301     if (uid_map != NULL || map_zero) {
1302         snprintf(map_path, PATH_MAX, "/proc/%ld/uid_map",
1303                 (long) child_pid);
1304         if (map_zero) {
1305             snprintf(map_buf, MAP_BUF_SIZE, "0 %ld 1", (long) getuid());
1306             uid_map = map_buf;
1307         }
1308         update_map(uid_map, map_path);
1309     }
1310
1311     if (gid_map != NULL || map_zero) {
1312         proc_setgroups_write(child_pid, "deny");
1313
1314         snprintf(map_path, PATH_MAX, "/proc/%ld/gid_map",
1315                 (long) child_pid);
1316         if (map_zero) {
1317             snprintf(map_buf, MAP_BUF_SIZE, "0 %ld 1", (long) getgid());
1318             gid_map = map_buf;
1319         }
1320         update_map(gid_map, map_path);
1321     }
1322
1323     /* Close the write end of the pipe, to signal to the child that we
1324        have updated the UID and GID maps */
1325
1326     close(args.pipe_fd[1]);
1327
1328     if (waitpid(child_pid, NULL, 0) == \-1)      /* Wait for child */
1329         errExit("waitpid");
1330
1331     if (verbose)
1332         printf("%s: terminating\\n", argv[0]);
1333
1334     exit(EXIT_SUCCESS);
1335 }
1336 .fi
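.PP
The string handling performed by the program can be examined in isolation.
The following sketch separates the comma-to-newline conversion done in
update_map() and the construction of the map string used for the \-z
option into two small functions that can be tested without creating
any namespaces.
The function names commas_to_newlines() and format_zero_map() are
hypothetical and do not appear in the program above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace each comma in 'mapping' with a newline, in place, giving
   the newline-delimited form expected by uid_map and gid_map */

void
commas_to_newlines(char *mapping)
{
    char *p;

    assert(mapping != NULL);

    for (p = mapping; *p != '\0'; p++)
        if (*p == ',')
            *p = '\n';
}

/* Build the single-record map "0 <id> 1" in 'buf'.
   Returns 0 on success, or -1 if the result did not fit. */

int
format_zero_map(char *buf, size_t size, long id)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, size, "0 %ld 1", id);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

Checking the return value of snprintf() in this way guards against
writing a truncated record to a map file.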
1337 .SH SEE ALSO
1338 .BR newgidmap (1),      \" From the shadow package
1339 .BR newuidmap (1),      \" From the shadow package
1340 .BR clone (2),
1341 .BR setns (2),
1342 .BR unshare (2),
1343 .BR proc (5),
1344 .BR subgid (5),         \" From the shadow package
1345 .BR subuid (5),         \" From the shadow package
1346 .BR credentials (7),
1347 .BR capabilities (7),
1348 .BR namespaces (7),
1349 .BR cgroup_namespaces (7),
1350 .BR pid_namespaces (7)
1351 .sp
1352 The kernel source file
1353 .IR Documentation/namespaces/resource-control.txt .