.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
-.TH SECCOMP 2 2018-02-02 "Linux" "Linux Programmer's Manual"
+.TH SECCOMP 2 2019-11-19 "Linux" "Linux Programmer's Manual"
.SH NAME
seccomp \- operate on Secure Computing state of the process
.SH SYNOPSIS
.IP
This operation is functionally identical to the call:
.IP
- prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
+.in +4n
+.EX
+prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
+.EE
+.in
.TP
.BR SECCOMP_SET_MODE_FILTER
The system calls allowed are defined by a pointer to a Berkeley Packet
.IP
In order to use the
.BR SECCOMP_SET_MODE_FILTER
-operation, either the caller must have the
+operation, either the calling thread must have the
.BR CAP_SYS_ADMIN
capability in its user namespace, or the thread must already have the
.I no_new_privs
If that bit was not already set by an ancestor of this thread,
the thread must make the following call:
.IP
- prctl(PR_SET_NO_NEW_PRIVS, 1);
+.in +4n
+.EX
+prctl(PR_SET_NO_NEW_PRIVS, 1);
+.EE
+.in
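As an illustration of the call above, the following is a minimal sketch of setting and then verifying the no_new_privs bit with error checking. The helper names set_no_new_privs() and get_no_new_privs() are hypothetical and are not part of any library; the sketch assumes a kernel that supports PR_SET_NO_NEW_PRIVS (Linux 3.5 or later).

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Hypothetical helper (name is an assumption): set the no_new_privs
   bit for the calling thread.
   Returns 0 on success, or -1 on error with errno set by prctl(2). */
int
set_no_new_privs(void)
{
    /* The unused prctl() arguments must be 0 for this operation */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
        perror("prctl(PR_SET_NO_NEW_PRIVS)");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: query the no_new_privs bit.
   Returns 1 if the bit is set, 0 if it is not, or -1 on error. */
int
get_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Once set, the bit cannot be unset, and it is inherited across fork(2), clone(2), and execve(2), so a caller would typically invoke set_no_new_privs() once, shortly before installing its filter.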
.IP
Otherwise, the
.BR SECCOMP_SET_MODE_FILTER
.IR flags
is 0, this operation is functionally identical to the call:
.IP
- prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, args);
+.in +4n
+.EX
+prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, args);
+.EE
+.in
.IP
The recognized
.IR flags
actions from being logged via the
.IR /proc/sys/kernel/seccomp/actions_logged
file.
+.TP
+.BR SECCOMP_FILTER_FLAG_SPEC_ALLOW " (since Linux 4.17)"
+.\" commit 00a02d0c502a06d15e07b857f8ff921e3e402675
+Disable Speculative Store Bypass mitigation.
.RE
.TP
.BR SECCOMP_GET_ACTION_AVAIL " (since Linux 4.14)"
.PP
Because numbering of system calls varies between architectures and
some architectures (e.g., x86-64) allow user-space code to use
-the calling conventions of multiple architectures, it is usually
-necessary to verify the value of the
+the calling conventions of multiple architectures
+(and the convention being used may vary over the life of a process that uses
+.BR execve (2)
+to execute binaries that employ the different conventions),
+it is usually necessary to verify the value of the
.IR arch
field.
.PP
-It is strongly recommended to use a whitelisting approach whenever
+It is strongly recommended to use an allow-list approach whenever
possible because such an approach is more robust and simple.
-A blacklist will have to be updated whenever a potentially
+A deny-list will have to be updated whenever a potentially
dangerous system call is added (or a dangerous flag or option if those
-are blacklisted), and it is often possible to alter the
+are deny-listed), and it is often possible to alter the
representation of a value without altering its meaning, leading to
-a blacklist bypass.
+a deny-list bypass.
See also
.IR Caveats
below.
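The allow-list approach recommended above might be sketched as the following filter. This is only an illustration under stated assumptions: the program is assumed to be built for and run on x86-64, the choice of permitted system calls (read, write, exit_group) is arbitrary, and the ALLOW_SYSCALL macro and the names allow_list_filter and allow_list_prog are hypothetical.

```c
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Hypothetical convenience macro (not part of any kernel header):
   allow the system call 'nr' if it matches, else fall through to
   the next pair of instructions. */
#define ALLOW_SYSCALL(nr) \
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (nr), 0, 1), \
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)

/* Assumption: built for x86-64, so that the SYS_* numbers match
   the AUDIT_ARCH_X86_64 check below. */
struct sock_filter allow_list_filter[] = {
    /* [0] Load architecture from 'seccomp_data' buffer */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
             (offsetof(struct seccomp_data, arch))),
    /* [1] Skip the kill at [2] if the architecture matches */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    /* [2] Unexpected architecture: kill the process */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    /* [3] Load system call number from 'seccomp_data' buffer */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
             (offsetof(struct seccomp_data, nr))),
    /* [4]..[9] The allow list itself. No separate x32 check is
       made: x32 system call numbers have X32_SYSCALL_BIT set and
       so cannot equal any of the numbers listed here. */
    ALLOW_SYSCALL(SYS_read),
    ALLOW_SYSCALL(SYS_write),
    ALLOW_SYSCALL(SYS_exit_group),
    /* [10] Anything not explicitly allowed: kill the process */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
};

struct sock_fprog allow_list_prog = {
    .len = sizeof(allow_list_filter) / sizeof(allow_list_filter[0]),
    .filter = allow_list_filter,
};
```

A caller with the no_new_privs bit set could then pass a pointer to allow_list_prog as the args argument of a SECCOMP_SET_MODE_FILTER operation. A real program would almost certainly need a longer list than the three calls shown here.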
.\" so that the syscall table indexing still works.
.PP
This means that in order to create a seccomp-based
-blacklist for system calls performed through the x86-64 ABI,
+deny-list for system calls performed through the x86-64 ABI,
it is necessary to not only check that
.IR arch
equals
.PP
When checking values from
.IR args
-against a blacklist, keep in mind that arguments are often
+against a deny-list, keep in mind that arguments are often
silently truncated before being processed, but after the seccomp check.
For example, this happens if the i386 ABI is used on an
x86-64 kernel: although the kernel will normally not look beyond
.BR seccomp ()
can fail for the following reasons:
.TP
-.BR EACCESS
+.BR EACCES
The caller did not have the
.BR CAP_SYS_ADMIN
capability in its user namespace, or had not set
.IR flags ).
.PP
Since Linux 4.4, the
-.BR prctl (2)
+.BR ptrace (2)
.B PTRACE_SECCOMP_GET_FILTER
operation can be used to dump a process's seccomp filters.
.\"
+.SS Architecture support for seccomp BPF
+Architecture support for seccomp BPF filtering
+.\" Check by grepping for HAVE_ARCH_SECCOMP_FILTER in Kconfig files in
+.\" kernel source. Last checked in Linux 4.16-rc source.
+is available on the following architectures:
+.IP * 3
+x86-64, i386, x32 (since Linux 3.5)
+.PD 0
+.IP *
+ARM (since Linux 3.8)
+.IP *
+s390 (since Linux 3.8)
+.IP *
+MIPS (since Linux 3.16)
+.IP *
+ARM-64 (since Linux 3.19)
+.IP *
+PowerPC (since Linux 4.3)
+.IP *
+Tile (since Linux 4.3)
+.IP *
+PA-RISC (since Linux 4.6)
+.\" User mode Linux since Linux 4.6
+.PD
+.\"
.SS Caveats
There are various subtleties to consider when applying seccomp filters
to a program, including the following:
that the application might need to perform.
Such bugs may not easily be discovered when testing the seccomp
filters if the bugs occur in rarely used application code paths.
-.RS 3
.\"
.SS Seccomp-specific BPF details
Note the following BPF details specific to seccomp filters:
$ \fBuname -m\fP
x86_64
$ \fBsyscall_nr() {
- cat /usr/src/linux/arch/x86/syscalls/syscall_64.tbl | \\
+ cat /usr/src/linux/arch/x86/syscalls/syscall_64.tbl | \e
awk '$2 != "x32" && $3 == "'$1'" { print $1 }'
}\fP
.EE
{
unsigned int upper_nr_limit = 0xffffffff;
- /* Assume that AUDIT_ARCH_X86_64 means the normal x86-64 ABI */
+ /* Assume that AUDIT_ARCH_X86_64 means the normal x86-64 ABI
+ (in the x32 ABI, all system calls have bit 30 set in the
+ 'nr' field, meaning the numbers are >= X32_SYSCALL_BIT) */
if (t_arch == AUDIT_ARCH_X86_64)
upper_nr_limit = X32_SYSCALL_BIT - 1;
BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
(offsetof(struct seccomp_data, nr))),
- /* [3] Check ABI - only needed for x86-64 in blacklist use
+ /* [3] Check ABI - only needed for x86-64 in deny-list use
cases. Use BPF_JGT instead of checking against the bit
mask to avoid having to reload the syscall number. */
BPF_JUMP(BPF_JMP | BPF_JGT | BPF_K, upper_nr_limit, 3, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, syscall_nr, 0, 1),
/* [5] Matching architecture and system call: don't execute
- the system call, and return 'f_errno' in 'errno' */
+ the system call, and return 'f_errno' in 'errno' */
BPF_STMT(BPF_RET | BPF_K,
SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),
{
if (argc < 5) {
fprintf(stderr, "Usage: "
- "%s <syscall_nr> <arch> <errno> <prog> [<args>]\\n"
- "Hint for <arch>: AUDIT_ARCH_I386: 0x%X\\n"
- " AUDIT_ARCH_X86_64: 0x%X\\n"
- "\\n", argv[0], AUDIT_ARCH_I386, AUDIT_ARCH_X86_64);
+ "%s <syscall_nr> <arch> <errno> <prog> [<args>]\en"
+ "Hint for <arch>: AUDIT_ARCH_I386: 0x%X\en"
+ " AUDIT_ARCH_X86_64: 0x%X\en"
+ "\en", argv[0], AUDIT_ARCH_I386, AUDIT_ARCH_X86_64);
exit(EXIT_FAILURE);
}
}
.EE
.SH SEE ALSO
+.BR bpfc (1),
.BR strace (1),
.BR bpf (2),
.BR prctl (2),
.I libseccomp
library, including:
.BR scmp_sys_resolver (1),
+.BR seccomp_export_bpf (3),
.BR seccomp_init (3),
.BR seccomp_load (3),
-.BR seccomp_rule_add (3),
and
-.BR seccomp_export_bpf (3).
+.BR seccomp_rule_add (3).
.PP
The kernel source files
.IR Documentation/networking/filter.txt
.IR Documentation/prctl/seccomp_filter.txt
before Linux 4.13).
.PP
-McCanne, S. and Jacobson, V. (1992)
+McCanne, S.\& and Jacobson, V.\& (1992)
.IR "The BSD Packet Filter: A New Architecture for User-level Packet Capture" ,
Proceedings of the USENIX Winter 1993 Conference
.UR http://www.tcpdump.org/papers/bpf\-usenix93.pdf