1 .\" Copyright (c) 2002 by Michael Kerrisk <mtk.manpages@gmail.com>
2 .\"
3 .\" Permission is granted to make and distribute verbatim copies of this
4 .\" manual provided the copyright notice and this permission notice are
5 .\" preserved on all copies.
6 .\"
7 .\" Permission is granted to copy and distribute modified versions of this
8 .\" manual under the conditions for verbatim copying, provided that the
9 .\" entire resulting derived work is distributed under the terms of a
10 .\" permission notice identical to this one.
11 .\"
12 .\" Since the Linux kernel and libraries are constantly changing, this
13 .\" manual page may be incorrect or out-of-date. The author(s) assume no
14 .\" responsibility for errors or omissions, or for damages resulting from
15 .\" the use of the information contained herein. The author(s) may not
16 .\" have taken the same level of care in the production of this manual,
17 .\" which is licensed free of charge, as they might when working
18 .\" professionally.
19 .\"
20 .\" Formatted or processed versions of this manual, if unaccompanied by
21 .\" the source, must acknowledge the copyright and authors of this work.
22 .\"
23 .\" 6 Aug 2002 - Initial Creation
24 .\" Modified 2003-05-23, Michael Kerrisk, <mtk.manpages@gmail.com>
25 .\" Modified 2004-05-27, Michael Kerrisk, <mtk.manpages@gmail.com>
26 .\" 2004-12-08, mtk Added O_NOATIME for CAP_FOWNER
27 .\" 2005-08-16, mtk, Added CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE
28 .\" 2008-07-15, Serge Hallyn <serue@us.ibm.com>
29 .\" Document file capabilities, per-process capability
30 .\" bounding set, changed semantics for CAP_SETPCAP,
31 .\" and other changes in 2.6.2[45].
32 .\" Add CAP_MAC_ADMIN, CAP_MAC_OVERRIDE, CAP_SETFCAP.
33 .\" 2008-07-15, mtk
34 .\" Add text describing circumstances in which CAP_SETPCAP
35 .\" (theoretically) permits a thread to change the
36 .\" capability sets of another thread.
37 .\" Add section describing rules for programmatically
38 .\" adjusting thread capability sets.
39 .\" Describe rationale for capability bounding set.
40 .\" Document "securebits" flags.
41 .\" Add text noting that if we set the effective flag for one file
42 .\" capability, then we must also set the effective flag for all
43 .\" other capabilities where the permitted or inheritable bit is set.
44 .\"
45 .TH CAPABILITIES 7 2008-07-15 "Linux" "Linux Programmer's Manual"
46 .SH NAME
47 capabilities \- overview of Linux capabilities
48 .SH DESCRIPTION
49 For the purpose of performing permission checks,
50 traditional Unix implementations distinguish two categories of processes:
51 .I privileged
52 processes (whose effective user ID is 0, referred to as superuser or root),
53 and
54 .I unprivileged
55 processes (whose effective UID is non-zero).
56 Privileged processes bypass all kernel permission checks,
57 while unprivileged processes are subject to full permission
58 checking based on the process's credentials
59 (usually: effective UID, effective GID, and supplementary group list).
60
61 Starting with kernel 2.2, Linux divides the privileges traditionally
62 associated with superuser into distinct units, known as
63 .IR capabilities ,
64 which can be independently enabled and disabled.
65 Capabilities are a per-thread attribute.
66 .\"
67 .SS Capabilities List
68 The following list shows the capabilities implemented on Linux,
69 and the operations or behaviors that each capability permits:
70 .TP
71 .BR CAP_AUDIT_CONTROL " (since Linux 2.6.11)"
72 Enable and disable kernel auditing; change auditing filter rules;
73 retrieve auditing status and filtering rules.
74 .TP
75 .BR CAP_AUDIT_WRITE " (since Linux 2.6.11)"
76 Write records to kernel auditing log.
77 .TP
78 .B CAP_CHOWN
79 Make arbitrary changes to file UIDs and GIDs (see
80 .BR chown (2)).
81 .TP
82 .B CAP_DAC_OVERRIDE
83 Bypass file read, write, and execute permission checks.
84 (DAC is an abbreviation of "discretionary access control".)
85 .TP
86 .B CAP_DAC_READ_SEARCH
87 Bypass file read permission checks and
88 directory read and execute permission checks.
89 .TP
90 .B CAP_FOWNER
91 .PD 0
92 .RS
93 .IP * 2
94 Bypass permission checks on operations that normally
95 require the file system UID of the process to match the UID of
96 the file (e.g.,
97 .BR chmod (2),
98 .BR utime (2)),
99 excluding those operations covered by
100 .B CAP_DAC_OVERRIDE
101 and
102 .BR CAP_DAC_READ_SEARCH ;
103 .IP *
104 set extended file attributes (see
105 .BR chattr (1))
106 on arbitrary files;
107 .IP *
108 set Access Control Lists (ACLs) on arbitrary files;
109 .IP *
110 ignore directory sticky bit on file deletion;
111 .IP *
112 specify
113 .B O_NOATIME
114 for arbitrary files in
115 .BR open (2)
116 and
117 .BR fcntl (2).
118 .RE
119 .PD
120 .TP
121 .B CAP_FSETID
122 Don't clear set-user-ID and set-group-ID permission
123 bits when a file is modified;
124 set the set-group-ID bit for a file whose GID does not match
125 the file system GID or any of the supplementary GIDs of the calling process.
126 .TP
127 .B CAP_IPC_LOCK
128 Lock memory
129 .RB ( mlock (2),
130 .BR mlockall (2),
131 .BR mmap (2),
132 .BR shmctl (2)).
133 .TP
134 .B CAP_IPC_OWNER
135 Bypass permission checks for operations on System V IPC objects.
136 .TP
137 .B CAP_KILL
138 Bypass permission checks for sending signals (see
139 .BR kill (2)).
140 This includes use of the
141 .BR ioctl (2)
142 .B KDSIGACCEPT
143 operation.
144 .\" FIXME CAP_KILL also has an effect for threads + setting child
145 .\" termination signal to other than SIGCHLD: without this
146 .\" capability, the termination signal reverts to SIGCHLD
147 .\" if the child does an exec(). What is the rationale
148 .\" for this?
149 .TP
150 .BR CAP_LEASE " (since Linux 2.4)"
151 Establish leases on arbitrary files (see
152 .BR fcntl (2)).
153 .TP
154 .B CAP_LINUX_IMMUTABLE
155 Set the
156 .B FS_APPEND_FL
157 and
158 .B FS_IMMUTABLE_FL
159 .\" These attributes are now available on ext2, ext3, Reiserfs, XFS, JFS
160 i-node flags (see
161 .BR chattr (1)).
162 .TP
163 .BR CAP_MAC_ADMIN " (since Linux 2.6.25)"
164 Allow MAC configuration or state changes.
165 Implemented for the Smack Linux Security Module (LSM).
166 .TP
167 .BR CAP_MAC_OVERRIDE " (since Linux 2.6.25)"
168 Override Mandatory Access Control (MAC).
169 Implemented for the Smack LSM.
170 .TP
171 .BR CAP_MKNOD " (since Linux 2.4)"
172 Create special files using
173 .BR mknod (2).
174 .TP
175 .B CAP_NET_ADMIN
176 Perform various network-related operations
177 (e.g., setting privileged socket options,
178 enabling multicasting, interface configuration,
179 modifying routing tables).
180 .TP
181 .B CAP_NET_BIND_SERVICE
182 Bind a socket to Internet domain reserved ports
183 (port numbers less than 1024).
184 .TP
185 .B CAP_NET_BROADCAST
186 (Unused) Make socket broadcasts, and listen to multicasts.
187 .TP
188 .B CAP_NET_RAW
189 Use RAW and PACKET sockets.
190 .\" Also various IP options and setsockopt(SO_BINDTODEVICE)
191 .TP
192 .B CAP_SETGID
193 Make arbitrary manipulations of process GIDs and supplementary GID list;
194 forge GID when passing socket credentials via Unix domain sockets.
195 .TP
196 .BR CAP_SETFCAP " (since Linux 2.6.24)"
197 Set file capabilities.
198 .TP
199 .B CAP_SETPCAP
200 If file capabilities are not supported:
201 grant or remove any capability in the
202 caller's permitted capability set to or from any other process.
203 (This property of
204 .B CAP_SETPCAP
205 is not available when the kernel is configured to support
206 file capabilities, since
207 .B CAP_SETPCAP
208 has entirely different semantics for such kernels.)
209
210 If file capabilities are supported:
211 add any capability from the calling thread's bounding set
212 to its inheritable set;
213 drop capabilities from the bounding set (via
214 .BR prctl (2)
215 .BR PR_CAPBSET_DROP );
216 make changes to the
217 .I securebits
218 flags.
219 .TP
220 .B CAP_SETUID
221 Make arbitrary manipulations of process UIDs
222 .RB ( setuid (2),
223 .BR setreuid (2),
224 .BR setresuid (2),
225 .BR setfsuid (2));
226 forge UID when passing socket credentials via Unix domain sockets.
227 .\" FIXME CAP_SETUID also an effect in exec(); document this.
228 .TP
229 .B CAP_SYS_ADMIN
230 .PD 0
231 .RS
232 .IP * 2
233 Perform a range of system administration operations including:
234 .BR quotactl (2),
235 .BR mount (2),
236 .BR umount (2),
237 .BR swapon (2),
238 .BR swapoff (2),
239 .BR sethostname (2),
240 .BR setdomainname (2);
241 .IP *
242 perform
243 .B IPC_SET
244 and
245 .B IPC_RMID
246 operations on arbitrary System V IPC objects;
247 .IP *
248 perform operations on
249 .I trusted
250 and
251 .I security
252 Extended Attributes (see
253 .BR attr (5));
254 .IP *
255 use
256 .BR lookup_dcookie (2);
257 .IP *
258 use
259 .BR ioprio_set (2)
260 to assign
261 .B IOPRIO_CLASS_RT
262 and (before Linux 2.6.25)
263 .B IOPRIO_CLASS_IDLE
264 I/O scheduling classes;
272 .IP *
273 forge UID when passing socket credentials;
274 .IP *
275 exceed
276 .IR /proc/sys/fs/file-max ,
277 the system-wide limit on the number of open files,
278 in system calls that open files (e.g.,
279 .BR accept (2),
280 .BR execve (2),
281 .BR open (2),
282 .BR pipe (2))
283 (without this capability these system calls will fail with the error
284 .B ENFILE
285 if this limit is encountered);
286 .IP *
287 employ
288 .B CLONE_NEWNS
289 flag with
290 .BR clone (2)
291 and
292 .BR unshare (2);
293 .IP *
294 perform
295 .B KEYCTL_CHOWN
296 and
297 .B KEYCTL_SETPERM
298 .BR keyctl (2)
299 operations.
300 .RE
301 .PD
302 .TP
303 .B CAP_SYS_BOOT
304 Use
305 .BR reboot (2)
306 and
307 .BR kexec_load (2).
308 .TP
309 .B CAP_SYS_CHROOT
310 Use
311 .BR chroot (2).
312 .TP
313 .B CAP_SYS_MODULE
314 Load and unload kernel modules
315 (see
316 .BR init_module (2)
317 and
318 .BR delete_module (2));
319 in kernels before 2.6.25:
320 drop capabilities from the system-wide capability bounding set.
321 .TP
322 .B CAP_SYS_NICE
323 .PD 0
324 .RS
325 .IP * 2
326 Raise process nice value
327 .RB ( nice (2),
328 .BR setpriority (2))
329 and change the nice value for arbitrary processes;
330 .IP *
331 set real-time scheduling policies for calling process,
332 and set scheduling policies and priorities for arbitrary processes
333 .RB ( sched_setscheduler (2),
334 .BR sched_setparam (2));
335 .IP *
336 set CPU affinity for arbitrary processes
337 .RB ( sched_setaffinity (2));
338 .IP *
339 set I/O scheduling class and priority for arbitrary processes
340 .RB ( ioprio_set (2));
341 .IP *
342 apply
343 .BR migrate_pages (2)
344 to arbitrary processes and allow processes
345 to be migrated to arbitrary nodes;
346 .\" FIXME CAP_SYS_NICE also has the following effect for
347 .\" migrate_pages(2):
348 .\" do_migrate_pages(mm, &old, &new,
349 .\" capable(CAP_SYS_NICE) ? MPOL_MF_MOVE_ALL : MPOL_MF_MOVE);
350 .IP *
351 apply
352 .BR move_pages (2)
353 to arbitrary processes;
354 .IP *
355 use the
356 .B MPOL_MF_MOVE_ALL
357 flag with
358 .BR mbind (2)
359 and
360 .BR move_pages (2).
361 .RE
362 .PD
363 .TP
364 .B CAP_SYS_PACCT
365 Use
366 .BR acct (2).
367 .TP
368 .B CAP_SYS_PTRACE
369 Trace arbitrary processes using
370 .BR ptrace (2).
371 .TP
372 .B CAP_SYS_RAWIO
373 Perform I/O port operations
374 .RB ( iopl (2)
375 and
376 .BR ioperm (2));
377 access
378 .IR /proc/kcore .
379 .TP
380 .B CAP_SYS_RESOURCE
381 .PD 0
382 .RS
383 .IP * 2
384 Use reserved space on ext2 file systems;
385 .IP *
386 make
387 .BR ioctl (2)
388 calls controlling ext3 journaling;
389 .IP *
390 override disk quota limits;
391 .IP *
392 increase resource limits (see
393 .BR setrlimit (2));
394 .IP *
395 override
396 .B RLIMIT_NPROC
397 resource limit;
398 .IP *
399 raise
400 .I msg_qbytes
401 limit for a System V message queue above the limit in
402 .I /proc/sys/kernel/msgmnb
403 (see
404 .BR msgop (2)
405 and
406 .BR msgctl (2)).
407 .RE
408 .PD
409 .TP
410 .B CAP_SYS_TIME
411 Set system clock
412 .RB ( settimeofday (2),
413 .BR stime (2),
414 .BR adjtimex (2));
415 set real-time (hardware) clock.
416 .TP
417 .B CAP_SYS_TTY_CONFIG
418 Use
419 .BR vhangup (2).
420 .\"
421 .SS Past and Current Implementation
422 A full implementation of capabilities requires that:
423 .IP 1. 3
424 For all privileged operations,
425 the kernel must check whether the thread has the required
426 capability in its effective set.
427 .IP 2.
428 The kernel must provide
429 system calls allowing a thread's capability sets to
430 be changed and retrieved.
431 .IP 3.
432 The file system must support attaching capabilities to an executable file,
433 so that a process gains those capabilities when the file is executed.
434 .PP
435 Before kernel 2.6.24, only the first two of these requirements are met;
436 since kernel 2.6.24, all three requirements are met.
437 .\"
438 .SS Thread Capability Sets
439 Each thread has three capability sets containing zero or more
440 of the above capabilities:
441 .TP
442 .IR Permitted :
443 This is a limiting superset for the effective
444 capabilities that the thread may assume.
445 It is also a limiting superset for the capabilities that
446 may be added to the inheritable set by a thread that does not have the
447 .B CAP_SETPCAP
448 capability in its effective set.
449
450 If a thread drops a capability from its permitted set,
451 it can never re-acquire that capability (unless it
452 .BR execve (2)s
453 either a set-user-ID-root program, or
454 a program whose associated file capabilities grant that capability).
455 .TP
456 .IR Inheritable :
457 This is a set of capabilities preserved across an
458 .BR execve (2).
459 It provides a mechanism for a process to assign capabilities
460 to the permitted set of the new program during an
461 .BR execve (2).
462 .TP
463 .IR Effective :
464 This is the set of capabilities used by the kernel to
465 perform permission checks for the thread.
466 .PP
467 A child created via
468 .BR fork (2)
469 inherits copies of its parent's capability sets.
470 See below for a discussion of the treatment of capabilities during
471 .BR execve (2).
472 .PP
473 Using
474 .BR capset (2),
475 a thread may manipulate its own capability sets (see below).
476 .\"
477 .SS File Capabilities
478 Since kernel 2.6.24, the kernel supports
479 associating capability sets with an executable file using
480 .BR setcap (8).
481 The file capability sets are stored in an extended attribute (see
482 .BR setxattr (2))
483 named
484 .IR "security.capability" .
485 Writing to this extended attribute requires the
486 .BR CAP_SETFCAP
487 capability.
488 The file capability sets,
489 in conjunction with the capability sets of the thread,
490 determine the capabilities of a thread after an
491 .BR execve (2).
492
493 The three file capability sets are:
494 .TP
495 .IR Permitted " (formerly known as " forced ):
496 These capabilities are automatically permitted to the thread,
497 regardless of the thread's inheritable capabilities.
498 .TP
499 .IR Inheritable " (formerly known as " allowed ):
500 This set is ANDed with the thread's inheritable set to determine which
501 inheritable capabilities are enabled in the permitted set of
502 the thread after the
503 .BR execve (2).
504 .TP
505 .IR Effective :
506 This is not a set, but rather just a single bit.
507 If this bit is set, then during an
508 .BR execve (2)
509 all of the new permitted capabilities for the thread are
510 also raised in the effective set.
511 If this bit is not set, then after an
512 .BR execve (2),
513 none of the new permitted capabilities is in the new effective set.
514
515 Enabling the file effective capability bit implies
516 that, for any file permitted or inheritable capability that causes a
517 thread to acquire the corresponding permitted capability during an
518 .BR execve (2)
519 (see the transformation rules described below), the thread will also
520 acquire that capability in its effective set.
521 Therefore, when assigning capabilities to a file
522 .RB ( setcap (8),
523 .BR cap_set_file (3),
524 .BR cap_set_fd (3)),
525 if we specify the effective flag as being enabled for any capability,
526 then the effective flag must also be specified as enabled
527 for all other capabilities for which the corresponding permitted or
528 inheritable flag is enabled.
529 .\"
530 .SS Transformation of Capabilities During execve()
531 .PP
532 During an
533 .BR execve (2),
534 the kernel calculates the new capabilities of
535 the process using the following algorithm:
536 .in +4n
537 .nf
538
539 P'(permitted) = (P(inheritable) & F(inheritable)) |
540 (F(permitted) & cap_bset)
541
542 P'(effective) = F(effective) ? P'(permitted) : 0
543
544 P'(inheritable) = P(inheritable) [i.e., unchanged]
545
546 .fi
547 .in
548 where:
549 .RS 4
550 .IP P 10
551 denotes the value of a thread capability set before the
552 .BR execve (2)
553 .IP P'
554 denotes the value of a capability set after the
555 .BR execve (2)
556 .IP F
557 denotes a file capability set
558 .IP cap_bset
559 is the value of the capability bounding set (described below).
560 .RE
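.PP
As an illustration only, the algorithm above can be sketched in C,
with each capability set represented as a 64-bit mask.
The type and function names used here
.RI ( "struct thread_caps" ,
.IR "struct file_caps" ,
and
.IR caps_after_execve ())
are hypothetical and are not part of any kernel or library interface:
.in +4n
.nf

```c
#include <stdint.h>

/* Hypothetical container for the three thread capability sets. */
struct thread_caps {
    uint64_t permitted;
    uint64_t inheritable;
    uint64_t effective;
};

/* Hypothetical container for the capabilities attached to a file. */
struct file_caps {
    uint64_t permitted;
    uint64_t inheritable;
    int effective;              /* a single bit, not a set */
};

/* Compute P' from P, F and cap_bset using the rules given above. */
struct thread_caps
caps_after_execve(struct thread_caps p, struct file_caps f,
                  uint64_t cap_bset)
{
    struct thread_caps n;

    n.permitted = (p.inheritable & f.inheritable) |
                  (f.permitted & cap_bset);
    n.effective = f.effective ? n.permitted : 0;
    n.inheritable = p.inheritable;      /* unchanged */
    return n;
}
```
.fi
.in
.PP
For example, with P(inheritable) = 0x5, F(inheritable) = 0x4,
F(permitted) = 0x3, cap_bset = 0x1, and the file effective bit set,
the sketch yields P'(permitted) = P'(effective) = 0x5.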
561 .\"
562 .SS Capabilities and execution of programs by root
563 In order to provide an all-powerful
564 .I root
565 using capability sets, during an
566 .BR execve (2):
567 .IP 1. 3
568 If a set-user-ID-root program is being executed,
569 or the real user ID of the process is 0 (root),
570 then the file inheritable and permitted sets are defined to be all ones
571 (i.e., all capabilities enabled).
572 .IP 2.
573 If a set-user-ID-root program is being executed, or the effective user ID
574 of the process is 0 (root), then the file effective bit is defined to be one (enabled).
575 .PP
576 The upshot of the above rules,
577 combined with the capabilities transformations described above,
578 is that when a process
579 .BR execve (2)s
580 a set-user-ID-root program, or when a process with an effective UID of 0
581 .BR execve (2)s
582 a program,
583 it gains all capabilities in its permitted and effective capability sets,
584 except those masked out by the capability bounding set.
585 .\" If a process with real UID 0, and non-zero effective UID does an
586 .\" exec(), then it gets all capabilities in its
587 .\" permitted set, and no effective capabilities
588 This provides semantics that are the same as those provided by
589 traditional Unix systems.
590 .SS Capability bounding set
591 The capability bounding set is a security mechanism that can be used
592 to limit the capabilities that can be gained during an
593 .BR execve (2).
594 The bounding set is used in the following ways:
595 .IP * 2
596 During an
597 .BR execve (2),
598 the capability bounding set is ANDed with the file permitted
599 capability set, and the result of this operation is assigned to the
600 thread's permitted capability set.
601 The capability bounding set thus places a limit on the permitted
602 capabilities that may be granted by an executable file.
603 .IP *
604 (Since Linux 2.6.25)
605 The capability bounding set acts as a limiting superset for
606 the capabilities that a thread can add to its inheritable set using
607 .BR capset (2).
608 This means that if a capability is not in the bounding set,
609 then a thread can't add it to its inheritable set, even if the capability
610 is in its permitted set, and so can't have it preserved in its
611 permitted set when it
612 .BR execve (2)s
613 a file that has the capability in its inheritable set.
614 .PP
615 Note that the bounding set masks the file permitted capabilities,
616 but not the inheritable capabilities.
617 If a thread maintains a capability in its inheritable set
618 that is not in its bounding set,
619 then it can still gain that capability in its permitted set
620 by executing a file that has the capability in its inheritable set.
621 .PP
622 Depending on the kernel version, the capability bounding set is either
623 a system-wide attribute, or a per-process attribute.
624 .PP
625 .B "Capability bounding set prior to Linux 2.6.25"
626 .PP
627 In kernels before 2.6.25, the capability bounding set is a system-wide
628 attribute that affects all threads on the system.
629 The bounding set is accessible via the file
630 .IR /proc/sys/kernel/cap-bound .
631 (Confusingly, this bit mask parameter is expressed as a
632 signed decimal number in
633 .IR /proc/sys/kernel/cap-bound .)
634
635 Only the
636 .B init
637 process may set capabilities in the capability bounding set;
638 other than that, the superuser (more precisely: programs with the
639 .B CAP_SYS_MODULE
640 capability) may only clear capabilities from this set.
641
642 On a standard system the capability bounding set always masks out the
643 .B CAP_SETPCAP
644 capability.
645 To remove this restriction (dangerous!), modify the definition of
646 .B CAP_INIT_EFF_SET
647 in
648 .I include/linux/capability.h
649 and rebuild the kernel.
650
651 The system-wide capability bounding set feature was added
652 to Linux starting with kernel version 2.2.11.
653 .\"
654 .PP
655 .B "Capability bounding set from Linux 2.6.25 onwards"
656 .PP
657 From Linux 2.6.25, the
658 .I "capability bounding set"
659 is a per-thread attribute.
660 (There is no longer a system-wide capability bounding set.)
661
662 The bounding set is inherited at
663 .BR fork (2)
664 from the thread's parent, and is preserved across an
665 .BR execve (2).
666
667 A thread may remove capabilities from its capability bounding set using the
668 .BR prctl (2)
669 .B PR_CAPBSET_DROP
670 operation, provided it has the
671 .B CAP_SETPCAP
672 capability.
673 Once a capability has been dropped from the bounding set,
674 it cannot be restored to that set.
675 A thread can determine if a capability is in its bounding set using the
676 .BR prctl (2)
677 .B PR_CAPBSET_READ
678 operation.
679
680 Removing capabilities from the bounding set is only supported if file
681 capabilities are compiled into the kernel
682 (CONFIG_SECURITY_FILE_CAPABILITIES).
683 In that case, the
684 .B init
685 process (the ancestor of all processes) begins with a full bounding set.
686 If file capabilities are not compiled into the kernel, then
687 .B init
688 begins with a full bounding set minus
689 .BR CAP_SETPCAP ,
690 because this capability has a different meaning when there are
691 no file capabilities.
692
693 Removing a capability from the bounding set does not remove it
694 from the thread's inheritable set.
695 However, it does prevent the capability from being added
696 back into the thread's inheritable set in the future.
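.PP
A minimal sketch of querying the per-thread bounding set with the
.BR prctl (2)
.B PR_CAPBSET_READ
operation follows.
The wrapper name
.IR cap_in_bounding_set ()
is hypothetical; the sketch assumes a kernel (2.6.25 or later) and
C library headers that define
.BR PR_CAPBSET_READ :
.in +4n
.nf

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Return 1 if capability cap is in the calling thread's bounding set,
 * 0 if it is not, and -1 on error (for example, if cap is not a valid
 * capability number or the operation is not supported).
 */
int
cap_in_bounding_set(int cap)
{
    return prctl(PR_CAPBSET_READ, (unsigned long) cap, 0UL, 0UL, 0UL);
}
```
.fi
.in
.PP
A caller might use
.I cap_in_bounding_set(CAP_NET_RAW)
to decide whether a later
.BR execve (2)
of a file with that permitted capability could grant it.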
697 .\"
698 .\"
699 .SS Effect of User ID Changes on Capabilities
700 To preserve the traditional semantics for transitions between
701 0 and non-zero user IDs,
702 the kernel makes the following changes to a thread's capability
703 sets on changes to the thread's real, effective, saved set,
704 and file system user IDs (using
705 .BR setuid (2),
706 .BR setresuid (2),
707 or similar):
708 .IP 1. 3
709 If one or more of the real, effective or saved set user IDs
710 was previously 0, and as a result of the UID changes all of these IDs
711 have a non-zero value,
712 then all capabilities are cleared from the permitted and effective
713 capability sets.
714 .IP 2.
715 If the effective user ID is changed from 0 to non-zero,
716 then all capabilities are cleared from the effective set.
717 .IP 3.
718 If the effective user ID is changed from non-zero to 0,
719 then the permitted set is copied to the effective set.
720 .IP 4.
721 If the file system user ID is changed from 0 to non-zero (see
722 .BR setfsuid (2))
723 then the following capabilities are cleared from the effective set:
724 .BR CAP_CHOWN ,
725 .BR CAP_DAC_OVERRIDE ,
726 .BR CAP_DAC_READ_SEARCH ,
727 .BR CAP_FOWNER ,
728 .BR CAP_FSETID ,
729 and
730 .BR CAP_MAC_OVERRIDE .
731 If the file system UID is changed from non-zero to 0,
732 then any of these capabilities that are enabled in the permitted set
733 are enabled in the effective set.
734 .PP
735 If a thread that has a 0 value for one or more of its user IDs wants
736 to prevent its permitted capability set being cleared when it resets
737 all of its user IDs to non-zero values, it can do so using the
738 .BR prctl (2)
739 .B PR_SET_KEEPCAPS
740 operation.
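.PP
Rules 1 to 3 above can be modelled as shown below.
This is a sketch for illustration, not the kernel's code:
the names
.IR "struct uids" ,
.IR "struct caps" ,
and
.IR caps_after_uid_change ()
are hypothetical, and the
.I keep_caps
argument stands for the effect of
.BR PR_SET_KEEPCAPS ,
on the assumption that it suppresses rule 1 as a whole while
leaving rules 2 and 3 in force:
.in +4n
.nf

```c
#include <stdint.h>

/* Real, effective and saved set user IDs of a thread. */
struct uids {
    unsigned int ruid, euid, suid;
};

/* Permitted and effective sets as 64-bit masks. */
struct caps {
    uint64_t permitted, effective;
};

struct caps
caps_after_uid_change(struct caps c, struct uids before,
                      struct uids after, int keep_caps)
{
    int had_zero = before.ruid == 0 || before.euid == 0 ||
                   before.suid == 0;
    int all_nonzero = after.ruid != 0 && after.euid != 0 &&
                      after.suid != 0;

    /* Rule 1: no user ID is 0 any longer. */
    if (had_zero && all_nonzero && !keep_caps) {
        c.permitted = 0;
        c.effective = 0;
    }
    /* Rule 2: effective UID goes from 0 to non-zero. */
    if (before.euid == 0 && after.euid != 0)
        c.effective = 0;
    /* Rule 3: effective UID goes from non-zero to 0. */
    if (before.euid != 0 && after.euid == 0)
        c.effective = c.permitted;
    return c;
}
```
.fi
.in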
741 .\"
742 .SS Programmatically adjusting capability sets
743 A thread can retrieve and change its capability sets using the
744 .BR capget (2)
745 and
746 .BR capset (2)
747 system calls.
748 However, the use of
749 .BR cap_get_proc (3)
750 and
751 .BR cap_set_proc (3),
752 both provided in the
753 .I libcap
754 package,
755 is preferred for this purpose.
756 The following rules govern changes to the thread capability sets:
757 .IP 1. 3
758 If the caller does not have the
759 .B CAP_SETPCAP
760 capability,
761 the new inheritable set must be a subset of the combination
762 of the existing inheritable and permitted sets.
763 .IP 2.
764 (Since kernel 2.6.25)
765 The new inheritable set must be a subset of the combination of the
766 existing inheritable set and the capability bounding set.
767 .IP 3.
768 The new permitted set must be a subset of the existing permitted set
769 (i.e., it is not possible to acquire permitted capabilities
770 that the thread does not currently have).
771 .IP 4.
772 The new effective set must be a subset of the new permitted set.
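.PP
The four rules above amount to a series of subset tests.
The following sketch expresses them over 64-bit masks;
the names
.IR "struct capsets" ,
.IR is_subset (),
and
.IR capset_change_allowed ()
are hypothetical, and the function only models the rules as stated here
rather than reproducing the kernel's implementation:
.in +4n
.nf

```c
#include <stdint.h>

struct capsets {
    uint64_t permitted, inheritable, effective;
};

/* Return 1 if every capability in a is also in b. */
static int
is_subset(uint64_t a, uint64_t b)
{
    return (a & ~b) == 0;
}

/* Return 1 if changing from cur to want satisfies rules 1 to 4. */
int
capset_change_allowed(struct capsets cur, struct capsets want,
                      uint64_t cap_bset, int has_setpcap)
{
    /* Rule 1 */
    if (!has_setpcap &&
        !is_subset(want.inheritable, cur.inheritable | cur.permitted))
        return 0;
    /* Rule 2 (since kernel 2.6.25) */
    if (!is_subset(want.inheritable, cur.inheritable | cap_bset))
        return 0;
    /* Rule 3 */
    if (!is_subset(want.permitted, cur.permitted))
        return 0;
    /* Rule 4 */
    if (!is_subset(want.effective, want.permitted))
        return 0;
    return 1;
}
```
.fi
.in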
773 .SS The """securebits"" flags: establishing a capabilities-only environment
774 .\" For some background:
775 .\" see http://lwn.net/Articles/280279/ and
776 .\" http://article.gmane.org/gmane.linux.kernel.lsm/5476/
777 Starting with kernel 2.6.26,
778 and with a kernel in which file capabilities are enabled,
779 Linux implements a set of per-thread
780 .I securebits
781 flags that can be used to disable special handling of capabilities for UID 0
782 .RI ( root ).
783 These flags are as follows:
784 .TP
785 .B SECURE_KEEP_CAPS
786 Setting this flag allows a thread that has one or more 0 UIDs to retain
787 its capabilities when it switches all of its UIDs to a non-zero value.
788 If this flag is not set,
789 then such a UID switch causes the thread to lose all capabilities.
790 This flag is always cleared on an
791 .BR execve (2).
792 (This flag provides the same functionality as the older
793 .BR prctl (2)
794 .B PR_SET_KEEPCAPS
795 operation.)
796 .TP
797 .B SECURE_NO_SETUID_FIXUP
798 Setting this flag stops the kernel from adjusting capability sets when
799 the threads's effective and file system UIDs are switched between
800 zero and non-zero values.
801 (See the subsection
802 .IR "Effect of User ID Changes on Capabilities" .)
803 .TP
804 .B SECURE_NOROOT
805 If this bit is set, then the kernel does not grant capabilities
806 when a set-user-ID-root program is executed, or when a process with
807 an effective or real UID of 0 calls
808 .BR execve (2).
809 (See the subsection
810 .IR "Capabilities and execution of programs by root" .)
811 .PP
812 Each of the above "base" flags has a companion "locked" flag.
813 Setting any of the "locked" flags is irreversible,
814 and has the effect of preventing further changes to the
815 corresponding "base" flag.
816 The locked flags are:
817 .BR SECURE_KEEP_CAPS_LOCKED ,
818 .BR SECURE_NO_SETUID_FIXUP_LOCKED ,
819 and
820 .BR SECURE_NOROOT_LOCKED .
821 .PP
822 The
823 .I securebits
824 flags can be modified and retrieved using the
825 .BR prctl (2)
826 .B PR_SET_SECUREBITS
827 and
828 .B PR_GET_SECUREBITS
829 operations.
830 The
831 .B CAP_SETPCAP
832 capability is required to modify the flags.
833
834 The
835 .I securebits
836 flags are inherited by child processes.
837 During an
838 .BR execve (2),
839 all of the flags are preserved, except
840 .B SECURE_KEEP_CAPS
841 which is always cleared.
842
843 An application can use the following call to lock itself,
844 and all of its descendants,
845 into an environment where the only way of gaining capabilities
846 is by executing a program with associated file capabilities:
847 .in +4n
848 .nf
849
850 prctl(PR_SET_SECUREBITS,
851         1 << SECURE_KEEP_CAPS_LOCKED |
852         1 << SECURE_NO_SETUID_FIXUP |
853         1 << SECURE_NO_SETUID_FIXUP_LOCKED |
854         1 << SECURE_NOROOT |
855         1 << SECURE_NOROOT_LOCKED);
856 .fi
857 .in
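.PP
To inspect the flags rather than set them, a thread can use the
.BR prctl (2)
.B PR_GET_SECUREBITS
operation.
The following is a minimal sketch;
the wrapper name
.IR securebit_is_set ()
is hypothetical, and the sketch assumes that
.I <linux/securebits.h>
supplies the
.B SECURE_*
bit positions used in the call above:
.in +4n
.nf

```c
#include <sys/prctl.h>
#include <linux/securebits.h>

/*
 * Return 1 if the given securebits flag (for example, SECURE_NOROOT)
 * is set for the calling thread, 0 if it is not, and -1 on error.
 */
int
securebit_is_set(int flag)
{
    int bits = prctl(PR_GET_SECUREBITS, 0UL, 0UL, 0UL, 0UL);

    if (bits == -1)
        return -1;
    return (bits & (1 << flag)) != 0;
}
```
.fi
.in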
858 .SH "CONFORMING TO"
859 .PP
860 No standards govern capabilities, but the Linux capability implementation
861 is based on the withdrawn POSIX.1e draft standard; see
862 .IR http://wt.xpilot.org/publications/posix.1e/ .
863 .SH NOTES
864 Since kernel 2.5.27, capabilities are an optional kernel component,
865 and can be enabled/disabled via the CONFIG_SECURITY_CAPABILITIES
866 kernel configuration option.
867
868 The
869 .I /proc/PID/task/TID/status
870 file can be used to view the capability sets of a thread.
871 The
872 .I /proc/PID/status
873 file shows the capability sets of a process's main thread.
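.PP
In the status file, the inheritable, permitted, and effective sets
appear as hexadecimal bit masks in the fields
.IR CapInh ,
.IR CapPrm ,
and
.IR CapEff .
The following sketch reads them; the function name
.IR read_cap_sets ()
is hypothetical, and error handling is kept to a minimum:
.in +4n
.nf

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read the CapInh, CapPrm and CapEff fields from a status file such
 * as /proc/self/status. Returns 0 if all three fields were found,
 * and -1 otherwise.
 */
int
read_cap_sets(const char *path, unsigned long long *inh,
              unsigned long long *prm, unsigned long long *eff)
{
    char line[256];
    int found = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* strtoull() skips the white space after the label. */
        if (strncmp(line, "CapInh:", 7) == 0) {
            *inh = strtoull(line + 7, NULL, 16);
            found++;
        } else if (strncmp(line, "CapPrm:", 7) == 0) {
            *prm = strtoull(line + 7, NULL, 16);
            found++;
        } else if (strncmp(line, "CapEff:", 7) == 0) {
            *eff = strtoull(line + 7, NULL, 16);
            found++;
        }
    }
    fclose(fp);
    return (found == 3) ? 0 : -1;
}
```
.fi
.in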
874
875 The
876 .I libcap
877 package provides a suite of routines for setting and
878 getting capabilities that is more comfortable and less likely
879 to change than the interface provided by
880 .BR capset (2)
881 and
882 .BR capget (2).
883 This package also provides the
884 .BR setcap (8)
885 and
886 .BR getcap (8)
887 programs.
888 It can be found at
889 .br
890 .IR http://www.kernel.org/pub/linux/libs/security/linux-privs .
891
892 Before kernel 2.6.24, and since kernel 2.6.24 if
893 file capabilities are not enabled, a thread with the
894 .B CAP_SETPCAP
895 capability can manipulate the capabilities of threads other than itself.
896 However, this is only theoretically possible,
897 since no thread ever has
898 .BR CAP_SETPCAP
899 in either of these cases:
900 .IP * 2
901 In the pre-2.6.25 implementation the system-wide capability bounding set,
902 .IR /proc/sys/kernel/cap-bound ,
903 always masks out this capability, and this can not be changed
904 without modifying the kernel source and rebuilding.
905 .IP *
906 If file capabilities are disabled in the current implementation, then
907 .B init
908 starts out with this capability removed from its per-process bounding
909 set, and that bounding set is inherited by all other processes
910 created on the system.
911 .SH "SEE ALSO"
912 .BR capget (2),
913 .BR prctl (2),
914 .BR setfsuid (2),
915 .BR cap_clear (3),
916 .BR cap_copy_ext (3),
917 .BR cap_from_text (3),
918 .BR cap_get_file (3),
919 .BR cap_get_proc (3),
920 .BR cap_init (3),
921 .BR capgetp (3),
922 .BR capsetp (3),
923 .BR credentials (7),
924 .BR pthreads (7),
925 .BR getcap (8),
926 .BR setcap (8)
927 .PP
928 .I include/linux/capability.h
929 in the kernel source