.\" Copyright (C) 1998 Andries Brouwer (aeb@cwi.nl)
.\" and Copyright (C) 2002, 2006, 2008, 2012, 2013, 2015 Michael Kerrisk <mtk.manpages@gmail.com>
.\" and Copyright Guillem Jover <guillem@hadrons.org>
.\" and Copyright (C) 2010 Andi Kleen <andi@firstfloor.org>
.\" and Copyright (C) 2012 Cyrill Gorcunov <gorcunov@openvz.org>
.\" and Copyright (C) 2014 Dave Hansen / Intel
.\" and Copyright (c) 2016 Eugene Syromyatnikov <evgsyr@gmail.com>
.\" and Copyright (c) 2018 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
.\" and Copyright (c) 2020 Dave Martin <Dave.Martin@arm.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\" Modified Thu Nov 11 04:19:42 MET 1999, aeb: added PR_GET_PDEATHSIG
.\" Modified 27 Jun 02, Michael Kerrisk
.\" Added PR_SET_DUMPABLE, PR_GET_DUMPABLE,
.\" PR_SET_KEEPCAPS, PR_GET_KEEPCAPS
.\" Modified 2006-08-30 Guillem Jover <guillem@hadrons.org>
.\" Updated Linux versions where the options were introduced.
.\" Added PR_SET_TIMING, PR_GET_TIMING, PR_SET_NAME, PR_GET_NAME,
.\" PR_SET_UNALIGN, PR_GET_UNALIGN, PR_SET_FPEMU, PR_GET_FPEMU,
.\" PR_SET_FPEXC, PR_GET_FPEXC
.\" 2008-04-29 Serge Hallyn, Document PR_CAPBSET_READ and PR_CAPBSET_DROP
.\" 2008-06-13 Erik Bosman, <ejbosman@cs.vu.nl>
.\" Document PR_GET_TSC and PR_SET_TSC.
.\" 2008-06-15 mtk, Document PR_SET_SECCOMP, PR_GET_SECCOMP
.\" 2009-10-03 Andi Kleen, document PR_MCE_KILL
.\" 2012-04 Cyrill Gorcunov, Document PR_SET_MM
.\" 2012-04-25 Michael Kerrisk, Document PR_TASK_PERF_EVENTS_DISABLE and
.\" PR_TASK_PERF_EVENTS_ENABLE
.\" 2012-09-20 Kees Cook, update PR_SET_SECCOMP for mode 2
.\" 2012-09-20 Kees Cook, document PR_SET_NO_NEW_PRIVS, PR_GET_NO_NEW_PRIVS
.\" 2012-10-25 Michael Kerrisk, Document PR_SET_TIMERSLACK and
.\" PR_GET_TIMERSLACK
.\" 2013-01-10 Kees Cook, document PR_SET_PTRACER
.\" 2012-02-04 Michael Kerrisk, document PR_{SET,GET}_CHILD_SUBREAPER
.\" 2014-11-10 Dave Hansen, document PR_MPX_{EN,DIS}ABLE_MANAGEMENT
.\"
.\"
.TH PRCTL 2 2021-03-22 "Linux" "Linux Programmer's Manual"
.SH NAME
prctl \- operations on a process or thread
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <sys/prctl.h>
.PP
.BI "int prctl(int " option ", unsigned long " arg2 ", unsigned long " arg3 ,
.BI "          unsigned long " arg4 ", unsigned long " arg5 );
.fi
.SH DESCRIPTION
.BR prctl ()
manipulates various aspects of the behavior
of the calling thread or process.
.PP
Note that careless use of some
.BR prctl ()
operations can confuse the user-space run-time environment,
so these operations should be used with care.
.PP
.BR prctl ()
is called with a first argument describing what to do
(with values defined in \fI<linux/prctl.h>\fP), and further
arguments with a significance depending on the first one.
The first argument can be:
.\"
.\" prctl PR_CAP_AMBIENT
.TP
.BR PR_CAP_AMBIENT " (since Linux 4.3)"
.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08
Reads or changes the ambient capability set of the calling thread,
according to the value of
.IR arg2 ,
which must be one of the following:
.RS
.\"
.TP
.B PR_CAP_AMBIENT_RAISE
The capability specified in
.I arg3
is added to the ambient set.
The specified capability must already be present in
both the permitted and the inheritable sets of the process.
This operation is not permitted if the
.B SECBIT_NO_CAP_AMBIENT_RAISE
securebit is set.
.TP
.B PR_CAP_AMBIENT_LOWER
The capability specified in
.I arg3
is removed from the ambient set.
.TP
.B PR_CAP_AMBIENT_IS_SET
The
.BR prctl ()
call returns 1 if the capability in
.I arg3
is in the ambient set and 0 if it is not.
.TP
.B PR_CAP_AMBIENT_CLEAR_ALL
All capabilities will be removed from the ambient set.
This operation requires setting
.I arg3
to zero.
.RE
.IP
In all of the above operations,
.I arg4
and
.I arg5
must be specified as 0.
.IP
Higher-level interfaces layered on top of the above operations are
provided in the
.BR libcap (3)
library in the form of
.BR cap_get_ambient (3),
.BR cap_set_ambient (3),
and
.BR cap_reset_ambient (3).
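.IP
The ambient-set operations can be sketched with two thin wrappers.
This is an illustrative sketch only:
the helper names ambient_is_set and ambient_clear_all are hypothetical
and not part of any library,
and the unused arguments are passed as 0 as required above.

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: returns 1 if cap is in the ambient set,
   0 if it is not, and -1 on error (errno is set by prctl()). */
int ambient_is_set(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, cap, 0UL, 0UL);
}

/* Hypothetical helper: removes every capability from the ambient set.
   arg3 must be zero for this operation. Returns 0 on success, -1 on error. */
int ambient_clear_all(void)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0UL, 0UL, 0UL);
}
```

.IP
After a successful ambient_clear_all(),
a query such as ambient_is_set(CAP_CHOWN) should report 0.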
.\" prctl PR_CAPBSET_READ
.TP
.BR PR_CAPBSET_READ " (since Linux 2.6.25)"
Return (as the function result) 1 if the capability specified in
.I arg2
is in the calling thread's capability bounding set,
or 0 if it is not.
(The capability constants are defined in
.IR <linux/capability.h> .)
The capability bounding set dictates
whether the process can receive the capability through a
file's permitted capability set on a subsequent call to
.BR execve (2).
.IP
If the capability specified in
.I arg2
is not valid, then the call fails with the error
.BR EINVAL .
.IP
A higher-level interface layered on top of this operation is provided in the
.BR libcap (3)
library in the form of
.BR cap_get_bound (3).
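.IP
A minimal sketch of a bounding-set query follows.
The wrapper name bounding_set_has is hypothetical;
it simply forwards the capability number to
PR_CAPBSET_READ and passes 0 for the unused arguments.

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: returns 1 if cap is in the calling thread's
   bounding set, 0 if it is not, and -1 (errno == EINVAL) if cap is
   not a valid capability number. */
int bounding_set_has(unsigned long cap)
{
    return prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);
}
```

.IP
For example, bounding_set_has(CAP_NET_RAW) reports whether the thread
could still gain CAP_NET_RAW from a file's permitted set on a later execve(2).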
.\" prctl PR_CAPBSET_DROP
.TP
.BR PR_CAPBSET_DROP " (since Linux 2.6.25)"
If the calling thread has the
.B CAP_SETPCAP
capability within its user namespace, then drop the capability specified by
.I arg2
from the calling thread's capability bounding set.
Any children of the calling thread will inherit the newly
reduced bounding set.
.IP
The call fails with the error:
.B EPERM
if the calling thread does not have the
.B CAP_SETPCAP
capability;
.B EINVAL
if
.I arg2
does not represent a valid capability; or
.B EINVAL
if file capabilities are not enabled in the kernel,
in which case bounding sets are not supported.
.IP
A higher-level interface layered on top of this operation is provided in the
.BR libcap (3)
library in the form of
.BR cap_drop_bound (3).
.\" prctl PR_SET_CHILD_SUBREAPER
.TP
.BR PR_SET_CHILD_SUBREAPER " (since Linux 3.4)"
.\" commit ebec18a6d3aa1e7d84aab16225e87fd25170ec2b
If
.I arg2
is nonzero,
set the "child subreaper" attribute of the calling process;
if
.I arg2
is zero, unset the attribute.
.IP
A subreaper fulfills the role of
.BR init (1)
for its descendant processes.
When a process becomes orphaned
(i.e., its immediate parent terminates),
then that process will be reparented to
the nearest still living ancestor subreaper.
Subsequently, calls to
.BR getppid (2)
in the orphaned process will now return the PID of the subreaper process,
and when the orphan terminates, it is the subreaper process that
will receive a
.B SIGCHLD
signal and will be able to
.BR wait (2)
on the process to discover its termination status.
.IP
The setting of the "child subreaper" attribute
is not inherited by children created by
.BR fork (2)
and
.BR clone (2).
The setting is preserved across
.BR execve (2).
.IP
Establishing a subreaper process is useful in session management frameworks
where a hierarchical group of processes is managed by a subreaper process
that needs to be informed when one of the processes\(emfor example,
a double-forked daemon\(emterminates
(perhaps so that it can restart that process).
Some
.BR init (1)
frameworks (e.g.,
.BR systemd (1))
employ a subreaper process for similar reasons.
.\" prctl PR_GET_CHILD_SUBREAPER
.TP
.BR PR_GET_CHILD_SUBREAPER " (since Linux 3.4)"
Return the "child subreaper" setting of the caller,
in the location pointed to by
.IR "(int\~*) arg2" .
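.IP
The set and get operations differ in how they carry the value:
the setter takes it in arg2,
while the getter writes it through a pointer passed in arg2.
The following sketch illustrates that difference;
the helper names are hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (enable != 0) or clear (enable == 0) the
   "child subreaper" attribute. Returns 0 on success, -1 on error. */
int set_child_subreaper(int enable)
{
    unsigned long value = enable ? 1UL : 0UL;

    return prctl(PR_SET_CHILD_SUBREAPER, value, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns the current setting (0 or 1),
   or -1 on error. The kernel stores the result in an int that
   arg2 points to, rather than returning it directly. */
int get_child_subreaper(void)
{
    int flag = 0;

    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long) &flag,
              0UL, 0UL, 0UL) == -1)
        return -1;
    return flag;
}
```

.IP
A supervisor would typically call set_child_subreaper(1) once at startup,
before creating the children whose orphaned descendants it wants to reap.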
.\" prctl PR_SET_DUMPABLE
.TP
.BR PR_SET_DUMPABLE " (since Linux 2.3.20)"
Set the state of the "dumpable" attribute,
which determines whether core dumps are produced for the calling process
upon delivery of a signal whose default behavior is to produce a core dump.
.IP
In kernels up to and including 2.6.12,
.I arg2
must be either 0
.RB ( SUID_DUMP_DISABLE ,
process is not dumpable) or 1
.RB ( SUID_DUMP_USER ,
process is dumpable).
Between kernels 2.6.13 and 2.6.17,
.\" commit abf75a5033d4da7b8a7e92321d74021d1fcfb502
the value 2 was also permitted,
which caused any binary which normally would not be dumped
to be dumped readable by root only;
for security reasons, this feature has been removed.
.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=115270289030630&w=2
.\" Subject: Fix prctl privilege escalation (CVE-2006-2451)
.\" From: Marcel Holtmann <marcel () holtmann ! org>
.\" Date: 2006-07-12 11:12:00
(See also the description of
.I /proc/sys/fs/\:suid_dumpable
in
.BR proc (5).)
.IP
Normally, the "dumpable" attribute is set to 1.
However, it is reset to the current value contained in the file
.I /proc/sys/fs/\:suid_dumpable
(which by default has the value 0),
in the following circumstances:
.\" See kernel/cred.c::commit_creds() (Linux 3.18 sources)
.RS
.IP * 3
The process's effective user or group ID is changed.
.IP *
The process's filesystem user or group ID is changed (see
.BR credentials (7)).
.IP *
The process executes
.RB ( execve (2))
a set-user-ID or set-group-ID program, resulting in a change
of either the effective user ID or the effective group ID.
.IP *
The process executes
.RB ( execve (2))
a program that has file capabilities (see
.BR capabilities (7)),
.\" See kernel/cred.c::commit_creds()
but only if the permitted capabilities
gained exceed those already permitted for the process.
.\" Also certain namespace operations;
.RE
.IP
Processes that are not dumpable cannot be attached via
.BR ptrace (2)
.BR PTRACE_ATTACH ;
see
.BR ptrace (2)
for further details.
.IP
If a process is not dumpable,
the ownership of files in the process's
.IR /proc/ pid
directory is affected as described in
.BR proc (5).
.\" prctl PR_GET_DUMPABLE
.TP
.BR PR_GET_DUMPABLE " (since Linux 2.3.20)"
Return (as the function result) the current state of the calling
process's dumpable attribute.
.\" Since Linux 2.6.13, the dumpable flag can have the value 2,
.\" but in 2.6.13 PR_GET_DUMPABLE simply returns 1 if the dumpable
.\" flag has a nonzero value. This was fixed in 2.6.14.
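.IP
A process that holds secrets in memory may want to clear the "dumpable"
attribute and later verify the setting.
The following is a minimal sketch with hypothetical helper names;
it restricts arg2 to the values 0 and 1 described above.

```c
#include <sys/prctl.h>

/* Hypothetical helper: make the calling process dumpable (on != 0)
   or not dumpable (on == 0). Returns 0 on success, -1 on error. */
int set_dumpable(int on)
{
    unsigned long value = on ? 1UL : 0UL;

    return prctl(PR_SET_DUMPABLE, value, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns the current state of the "dumpable"
   attribute as the function result, or -1 on error. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
Note that clearing the attribute also affects PTRACE_ATTACH and the
ownership of the process's /proc/pid files, as described above,
so it is not a change to make casually.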
.\" prctl PR_SET_ENDIAN
.TP
.BR PR_SET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
Set the endian-ness of the calling process to the value given
in \fIarg2\fP, which should be one of the following:
.\" Respectively 0, 1, 2
.BR PR_ENDIAN_BIG ,
.BR PR_ENDIAN_LITTLE ,
or
.B PR_ENDIAN_PPC_LITTLE
(PowerPC pseudo little endian).
.\" prctl PR_GET_ENDIAN
.TP
.BR PR_GET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
Return the endian-ness of the calling process,
in the location pointed to by
.IR "(int\~*) arg2" .
.\" prctl PR_SET_FP_MODE
.TP
.BR PR_SET_FP_MODE " (since Linux 4.0, only on MIPS)"
.\" commit 9791554b45a2acc28247f66a5fd5bbc212a6b8c8
On the MIPS architecture,
user-space code can be built using an ABI which permits linking
with code that has more restrictive floating-point (FP) requirements.
For example, user-space code may be built to target the O32 FPXX ABI
and linked with code built for either one of the more restrictive
FP32 or FP64 ABIs.
When more restrictive code is linked in,
the overall requirement for the process is to use the more
restrictive floating-point mode.
.IP
Because the kernel has no means of knowing in advance
which mode the process should be executed in,
and because these restrictions can
change over the lifetime of the process, the
.B PR_SET_FP_MODE
operation is provided to allow control of the floating-point mode
from user space.
.IP
.\" https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
The
.I (unsigned int) arg2
argument is a bit mask describing the floating-point mode used:
.RS
.TP
.B PR_FP_MODE_FR
When this bit is
.I unset
(so called
.BR FR=0 " or " FR0
mode), the 32 floating-point registers are 32 bits wide,
and 64-bit registers are represented as a pair of registers
(even- and odd- numbered,
with the even-numbered register containing the lower 32 bits,
and the odd-numbered register containing the higher 32 bits).
.IP
When this bit is
.I set
(on supported hardware),
the 32 floating-point registers are 64 bits wide (so called
.BR FR=1 " or " FR1
mode).
Note that modern MIPS implementations (MIPS R6 and newer) support
.B FR=1
mode only.
.IP
Applications that use the O32 FP32 ABI can operate only when this bit is
.I unset
.RB ( FR=0 ;
or they can be used with FRE enabled, see below).
Applications that use the O32 FP64 ABI
(and the O32 FP64A ABI, which exists to
provide the ability to operate with existing FP32 code; see below)
can operate only when this bit is
.I set
.RB ( FR=1 ).
Applications that use the O32 FPXX ABI can operate with either
.B FR=0
or
.BR FR=1 .
.TP
.B PR_FP_MODE_FRE
Enable emulation of 32-bit floating-point mode.
When this mode is enabled,
it emulates 32-bit floating-point operations
by raising a reserved-instruction exception
on every instruction that uses 32-bit formats and
the kernel then handles the instruction in software.
(The problem lies in the discrepancy of handling odd-numbered registers
which are the high 32 bits of 64-bit registers with even numbers in
.B FR=0
mode and the lower 32-bit parts of odd-numbered 64-bit registers in
.B FR=1
mode.)
Enabling this bit is necessary when code with the O32 FP32 ABI should operate
with code with the compatible O32 FPXX or O32 FP64A ABIs (which require
.B FR=1
FPU mode) or when it is executed on newer hardware (MIPS R6 onwards)
which lacks
.B FR=0
mode support when a binary with the FP32 ABI is used.
.IP
Note that this mode makes sense only when the FPU is in 64-bit mode
.RB ( FR=1 ).
.IP
Note that the use of emulation inherently has a significant performance hit
and should be avoided if possible.
.RE
.IP
In the N32/N64 ABI, 64-bit floating-point mode is always used,
so FPU emulation is not required and the FPU always operates in
.B FR=1
mode.
.IP
This option is mainly intended for use by the dynamic linker
.RB ( ld.so (8)).
.IP
The arguments
.IR arg3 ,
.IR arg4 ,
and
.I arg5
are ignored.
.\" prctl PR_GET_FP_MODE
.TP
.BR PR_GET_FP_MODE " (since Linux 4.0, only on MIPS)"
Return (as the function result)
the current floating-point mode (see the description of
.B PR_SET_FP_MODE
for details).
.IP
On success,
the call returns a bit mask which represents the current floating-point mode.
.IP
The arguments
.IR arg2 ,
.IR arg3 ,
.IR arg4 ,
and
.I arg5
are ignored.
.\" prctl PR_SET_FPEMU
.TP
.BR PR_SET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
Set floating-point emulation control bits to \fIarg2\fP.
Pass
.B PR_FPEMU_NOPRINT
to silently emulate floating-point operation accesses, or
.B PR_FPEMU_SIGFPE
to not emulate floating-point operations and send
.B SIGFPE
instead.
.\" prctl PR_GET_FPEMU
.TP
.BR PR_GET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
Return floating-point emulation control bits,
in the location pointed to by
.IR "(int\~*) arg2" .
.\" prctl PR_SET_FPEXC
.TP
.BR PR_SET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
Set floating-point exception mode to \fIarg2\fP.
Pass \fBPR_FP_EXC_SW_ENABLE\fP to use FPEXC for FP exception enables,
\fBPR_FP_EXC_DIV\fP for floating-point divide by zero,
\fBPR_FP_EXC_OVF\fP for floating-point overflow,
\fBPR_FP_EXC_UND\fP for floating-point underflow,
\fBPR_FP_EXC_RES\fP for floating-point inexact result,
\fBPR_FP_EXC_INV\fP for floating-point invalid operation,
\fBPR_FP_EXC_DISABLED\fP for FP exceptions disabled,
\fBPR_FP_EXC_NONRECOV\fP for async nonrecoverable exception mode,
\fBPR_FP_EXC_ASYNC\fP for async recoverable exception mode,
\fBPR_FP_EXC_PRECISE\fP for precise exception mode.
.\" prctl PR_GET_FPEXC
.TP
.BR PR_GET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
Return floating-point exception mode,
in the location pointed to by
.IR "(int\~*) arg2" .
.\" prctl PR_SET_IO_FLUSHER
.TP
.BR PR_SET_IO_FLUSHER " (since Linux 5.6)"
If a user process is involved in the block layer or filesystem I/O path,
and can allocate memory while processing I/O requests, it must set
\fIarg2\fP to 1.
This will put the process in the IO_FLUSHER state,
which allows it special treatment to make progress when allocating memory.
If \fIarg2\fP is 0, the process will clear the IO_FLUSHER state, and
the default behavior will be used.
.IP
The calling process must have the
.B CAP_SYS_RESOURCE
capability.
.IP
.IR arg3 ,
.IR arg4 ,
and
.I arg5
must be zero.
.IP
The IO_FLUSHER state is inherited by a child process created via
.BR fork (2)
and is preserved across
.BR execve (2).
.IP
Examples of IO_FLUSHER applications are FUSE daemons, SCSI device
emulation daemons, and daemons that perform error handling like multipath
path recovery applications.
.\" prctl PR_GET_IO_FLUSHER
.TP
.BR PR_GET_IO_FLUSHER " (since Linux 5.6)"
Return (as the function result) the IO_FLUSHER state of the caller.
A value of 1 indicates that the caller is in the IO_FLUSHER state;
0 indicates that the caller is not in the IO_FLUSHER state.
.IP
The calling process must have the
.B CAP_SYS_RESOURCE
capability.
.IP
.IR arg2 ,
.IR arg3 ,
.IR arg4 ,
and
.I arg5
must be zero.
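.IP
A daemon in the I/O path might request the IO_FLUSHER state at startup
and tolerate failure when it lacks CAP_SYS_RESOURCE.
The sketch below assumes hypothetical helper names;
the fallback definitions of the two constants (57 and 58) are an assumption
for building against headers older than Linux 5.6
and should be checked against <linux/prctl.h>.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Assumed fallback values for older headers; verify against
   <linux/prctl.h> before relying on them. */
#ifndef PR_SET_IO_FLUSHER
#define PR_SET_IO_FLUSHER 57
#endif
#ifndef PR_GET_IO_FLUSHER
#define PR_GET_IO_FLUSHER 58
#endif

/* Hypothetical helper: enter (on != 0) or leave (on == 0) the
   IO_FLUSHER state. Returns 0 on success, or -1 with errno set
   (for example, EPERM without CAP_SYS_RESOURCE). */
int set_io_flusher(int on)
{
    unsigned long value = on ? 1UL : 0UL;

    return prctl(PR_SET_IO_FLUSHER, value, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns 1 if the caller is in the IO_FLUSHER
   state, 0 if it is not, and -1 on error. */
int get_io_flusher(void)
{
    return prctl(PR_GET_IO_FLUSHER, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
Because the state is inherited across fork(2) and preserved across execve(2),
a daemon that spawns unrelated helper programs may want to call
set_io_flusher(0) in the child before executing them.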
.\" prctl PR_SET_KEEPCAPS
.TP
.BR PR_SET_KEEPCAPS " (since Linux 2.2.18)"
Set the state of the calling thread's "keep capabilities" flag.
The effect of this flag is described in
.BR capabilities (7).
.I arg2
must be either 0 (clear the flag)
or 1 (set the flag).
The "keep capabilities" value will be reset to 0 on subsequent calls to
.BR execve (2).
.\" prctl PR_GET_KEEPCAPS
.TP
.BR PR_GET_KEEPCAPS " (since Linux 2.2.18)"
Return (as the function result) the current state of the calling thread's
"keep capabilities" flag.
See
.BR capabilities (7)
for a description of this flag.
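.IP
The flag is a simple boolean that is set through arg2
and read back as the function result.
A minimal sketch, using hypothetical helper names:

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (keep != 0) or clear (keep == 0) the
   calling thread's "keep capabilities" flag.
   Returns 0 on success, -1 on error. */
int set_keepcaps(int keep)
{
    unsigned long value = keep ? 1UL : 0UL;

    return prctl(PR_SET_KEEPCAPS, value, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns the current flag (0 or 1),
   or -1 on error. */
int get_keepcaps(void)
{
    return prctl(PR_GET_KEEPCAPS, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
A typical caller sets the flag immediately before changing user IDs,
because the flag is reset to 0 by execve(2).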
.\" prctl PR_MCE_KILL
.TP
.BR PR_MCE_KILL " (since Linux 2.6.32)"
Set the machine check memory corruption kill policy for the calling thread.
If
.I arg2
is
.BR PR_MCE_KILL_CLEAR ,
clear the thread memory corruption kill policy and use the system-wide default.
(The system-wide default is defined by
.IR /proc/sys/vm/memory_failure_early_kill ;
see
.BR proc (5).)
If
.I arg2
is
.BR PR_MCE_KILL_SET ,
use a thread-specific memory corruption kill policy.
In this case,
.I arg3
defines whether the policy is
.I early kill
.RB ( PR_MCE_KILL_EARLY ),
.I late kill
.RB ( PR_MCE_KILL_LATE ),
or the system-wide default
.RB ( PR_MCE_KILL_DEFAULT ).
Early kill means that the thread receives a
.B SIGBUS
signal as soon as hardware memory corruption is detected inside
its address space.
In late kill mode, the process is killed only when it accesses a corrupted page.
See
.BR sigaction (2)
for more information on the
.B SIGBUS
signal.
The policy is inherited by children.
The remaining unused
.BR prctl ()
arguments must be zero for future compatibility.
.\" prctl PR_MCE_KILL_GET
.TP
.BR PR_MCE_KILL_GET " (since Linux 2.6.32)"
Return (as the function result)
the current per-process machine check kill policy.
All unused
.BR prctl ()
arguments must be zero.
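.IP
The two-level argument scheme (arg2 selects set or clear,
arg3 selects the policy) can be sketched as follows.
The helper names are hypothetical,
and all unused arguments are passed as zero as required above.

```c
#include <sys/prctl.h>

/* Hypothetical helper: install a thread-specific policy, where policy
   is PR_MCE_KILL_EARLY, PR_MCE_KILL_LATE, or PR_MCE_KILL_DEFAULT.
   Returns 0 on success, -1 on error. */
int mce_kill_set_policy(unsigned long policy)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, policy, 0UL, 0UL);
}

/* Hypothetical helper: drop the thread-specific policy and fall back
   to the system-wide default. Returns 0 on success, -1 on error. */
int mce_kill_clear(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_CLEAR, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns the current policy as one of the
   PR_MCE_KILL_* policy constants, or -1 on error. */
int mce_kill_get_policy(void)
{
    return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
For instance, a program that can recover from lost pages might call
mce_kill_set_policy(PR_MCE_KILL_EARLY) and install a SIGBUS handler.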
.\" prctl PR_SET_MM
.TP
.BR PR_SET_MM " (since Linux 3.3)"
.\" commit 028ee4be34a09a6d48bdf30ab991ae933a7bc036
Modify certain kernel memory map descriptor fields
of the calling process.
Usually these fields are set by the kernel and dynamic loader (see
.BR ld.so (8)
for more information) and a regular application should not use this feature.
However, there are cases, such as self-modifying programs,
where a program might find it useful to change its own memory map.
.IP
The calling process must have the
.B CAP_SYS_RESOURCE
capability.
The value in
.I arg2
is one of the options below, while
.I arg3
provides a new value for the option.
The
.I arg4
and
.I arg5
arguments must be zero if unused.
.IP
Before Linux 3.10,
.\" commit 52b3694157e3aa6df871e283115652ec6f2d31e0
this feature is available only if the kernel is built with the
.B CONFIG_CHECKPOINT_RESTORE
option enabled.
.RS
.TP
.B PR_SET_MM_START_CODE
Set the address above which the program text can run.
The corresponding memory area must be readable and executable,
but not writable or shareable (see
.BR mprotect (2)
and
.BR mmap (2)
for more information).
.TP
.B PR_SET_MM_END_CODE
Set the address below which the program text can run.
The corresponding memory area must be readable and executable,
but not writable or shareable.
.TP
.B PR_SET_MM_START_DATA
Set the address above which initialized and
uninitialized (bss) data are placed.
The corresponding memory area must be readable and writable,
but not executable or shareable.
.TP
.B PR_SET_MM_END_DATA
Set the address below which initialized and
uninitialized (bss) data are placed.
The corresponding memory area must be readable and writable,
but not executable or shareable.
.TP
.B PR_SET_MM_START_STACK
Set the start address of the stack.
The corresponding memory area must be readable and writable.
.TP
.B PR_SET_MM_START_BRK
Set the address above which the program heap can be expanded with
.BR brk (2)
call.
The address must be greater than the ending address of
the current program data segment.
In addition, the combined size of the resulting heap and
the size of the data segment can't exceed the
.B RLIMIT_DATA
resource limit (see
.BR setrlimit (2)).
.TP
.B PR_SET_MM_BRK
Set the current
.BR brk (2)
value.
The requirements for the address are the same as for the
.B PR_SET_MM_START_BRK
option.
.PP
The following options are available since Linux 3.5.
.\" commit fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7
.TP
.B PR_SET_MM_ARG_START
Set the address above which the program command line is placed.
.TP
.B PR_SET_MM_ARG_END
Set the address below which the program command line is placed.
.TP
.B PR_SET_MM_ENV_START
Set the address above which the program environment is placed.
.TP
.B PR_SET_MM_ENV_END
Set the address below which the program environment is placed.
.IP
The address passed with
.BR PR_SET_MM_ARG_START ,
.BR PR_SET_MM_ARG_END ,
.BR PR_SET_MM_ENV_START ,
and
.B PR_SET_MM_ENV_END
should belong to a process stack area.
Thus, the corresponding memory area must be readable, writable, and
(depending on the kernel configuration) have the
.B MAP_GROWSDOWN
attribute set (see
.BR mmap (2)).
.TP
.B PR_SET_MM_AUXV
Set a new auxiliary vector.
The
.I arg3
argument should provide the address of the vector.
The
.I arg4
is the size of the vector.
.TP
.B PR_SET_MM_EXE_FILE
.\" commit b32dfe377102ce668775f8b6b1461f7ad428f8b6
Supersede the
.IR /proc/ pid /exe
symbolic link with a new one pointing to a new executable file
identified by the file descriptor provided in
.I arg3
argument.
The file descriptor should be obtained with a regular
.BR open (2)
call.
.IP
To change the symbolic link, one needs to unmap all existing
executable memory areas, including those created by the kernel itself
(for example the kernel usually creates at least one executable
memory area for the ELF
.I \.text
section).
.IP
In Linux 4.9 and earlier, the
.\" commit 3fb4afd9a504c2386b8435028d43283216bf588e
.B PR_SET_MM_EXE_FILE
operation can be performed only once in a process's lifetime;
attempting to perform the operation a second time results in the error
.BR EPERM .
This restriction was enforced for security reasons that were subsequently
deemed specious,
and the restriction was removed in Linux 4.10 because some
user-space applications needed to perform this operation more than once.
.PP
The following options are available since Linux 3.18.
.\" commit f606b77f1a9e362451aca8f81d8f36a3a112139e
.TP
.B PR_SET_MM_MAP
Provides one-shot access to all the addresses by passing in a
.I struct prctl_mm_map
(as defined in \fI<linux/prctl.h>\fP).
The
.I arg4
argument should provide the size of the struct.
.IP
This feature is available only if the kernel is built with the
.B CONFIG_CHECKPOINT_RESTORE
option enabled.
.TP
.B PR_SET_MM_MAP_SIZE
Returns the size of the
.I struct prctl_mm_map
the kernel expects.
This allows user space to find a compatible struct.
The
.I arg4
argument should be a pointer to an unsigned int.
.IP
This feature is available only if the kernel is built with the
.B CONFIG_CHECKPOINT_RESTORE
option enabled.
.RE
667eb3ac 772.\" prctl PR_MPX_ENABLE_MANAGEMENT
03547431 773.TP
77ca5b1d 774.BR PR_MPX_ENABLE_MANAGEMENT ", " PR_MPX_DISABLE_MANAGEMENT " (since Linux 3.19, removed in Linux 5.4; only on x86)"
03547431
MK
775.\" commit fe3d197f84319d3bce379a9c0dc17b1f48ad358c
776.\" See also http://lwn.net/Articles/582712/
777.\" See also https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler
778Enable or disable kernel management of Memory Protection eXtensions (MPX)
779bounds tables.
780The
781.IR arg2 ,
782.IR arg3 ,
783.IR arg4 ,
784and
1ae6b2c7 785.I arg5
03547431
MK
786.\" commit e9d1b4f3c60997fe197bf0243cb4a41a44387a88
787arguments must be zero.
efeece04 788.IP
03547431
MK
789MPX is a hardware-assisted mechanism for performing bounds checking on
790pointers.
791It consists of a set of registers storing bounds information
792and a set of special instruction prefixes that tell the CPU on which
793instructions it should do bounds enforcement.
794There is a limited number of these registers and
795when there are more pointers than registers,
796their contents must be "spilled" into a set of tables.
797These tables are called "bounds tables" and the MPX
798.BR prctl ()
799operations control
800whether the kernel manages their allocation and freeing.
efeece04 801.IP
03547431
MK
802When management is enabled, the kernel will take over allocation
803and freeing of the bounds tables.
804It does this by trapping the #BR exceptions that result
805at first use of missing bounds tables and
806instead of delivering the exception to user space,
807it allocates the table and populates the bounds directory
808with the location of the new table.
809For freeing, the kernel checks to see if bounds tables are
810present for memory which is not allocated, and frees them if so.
efeece04 811.IP
03547431
MK
812Before enabling MPX management using
813.BR PR_MPX_ENABLE_MANAGEMENT ,
814the application must first have allocated a user-space buffer for
815the bounds directory and placed the location of that directory in the
816.I bndcfgu
817register.
.IP
These calls fail if the CPU or kernel does not support MPX.
Kernel support for MPX is enabled via the
.B CONFIG_X86_INTEL_MPX
configuration option.
You can check whether the CPU supports MPX by looking for the
.I mpx
CPUID bit, for example with the following command:
.IP
.in +4n
.EX
cat /proc/cpuinfo | grep \(aq mpx \(aq
.EE
.in
.IP
A thread may not switch in or out of long (64-bit) mode while MPX is
enabled.
.IP
All threads in a process are affected by these calls.
.IP
The child of a
.BR fork (2)
inherits the state of MPX management.
During
.BR execve (2),
MPX management is reset to a state as if
.B PR_MPX_DISABLE_MANAGEMENT
had been called.
.IP
For further information on Intel MPX, see the kernel source file
.IR Documentation/x86/intel_mpx.txt .
.IP
.\" commit f240652b6032b48ad7fa35c5e701cc4c8d697c0b
.\" See also https://lkml.kernel.org/r/20190705175321.DB42F0AD@viggo.jf.intel.com
Due to a lack of toolchain support,
.BR PR_MPX_ENABLE_MANAGEMENT " and " PR_MPX_DISABLE_MANAGEMENT
are not supported in Linux 5.4 and later.
.\" prctl PR_SET_NAME
.TP
.BR PR_SET_NAME " (since Linux 2.6.9)"
Set the name of the calling thread,
using the value in the location pointed to by
.IR "(char\~*) arg2" .
The name can be up to 16 bytes long,
.\" TASK_COMM_LEN in include/linux/sched.h
including the terminating null byte.
(If the length of the string, including the terminating null byte,
exceeds 16 bytes, the string is silently truncated.)
This is the same attribute that can be set via
.BR pthread_setname_np (3)
and retrieved using
.BR pthread_getname_np (3).
The attribute is likewise accessible via
.IR /proc/self/task/ tid /comm
(see
.BR proc (5)),
where
.I tid
is the thread ID of the calling thread, as returned by
.BR gettid (2).
.\" prctl PR_GET_NAME
.TP
.BR PR_GET_NAME " (since Linux 2.6.11)"
Return the name of the calling thread,
in the buffer pointed to by
.IR "(char\~*) arg2" .
The buffer should allow space for up to 16 bytes;
the returned string will be null-terminated.
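.IP
As a minimal sketch of how
.B PR_SET_NAME
and
.B PR_GET_NAME
fit together,
the following fragment sets a name and reads it back.
The helper
.IR rename_thread ()
and the macro
.I THREAD_NAME_LEN
are hypothetical names chosen for this illustration;
only the two
.BR prctl ()
operations come from the descriptions above.
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>

/* Mirrors TASK_COMM_LEN: 16 bytes, including the null byte. */
#define THREAD_NAME_LEN 16

/*
 * Hypothetical helper: set the name of the calling thread, then
 * read it back into out, which must hold THREAD_NAME_LEN bytes.
 * Returns 0 on success, or -1 with errno set by prctl().
 */
int
rename_thread(const char *name, char out[THREAD_NAME_LEN])
{
    assert(name != NULL && out != NULL);

    /* Longer names are silently truncated by the kernel. */
    if (prctl(PR_SET_NAME, (unsigned long) name, 0L, 0L, 0L) == -1)
        return -1;

    memset(out, 0, THREAD_NAME_LEN);
    if (prctl(PR_GET_NAME, (unsigned long) out, 0L, 0L, 0L) == -1)
        return -1;

    return 0;
}
```
.EE
.in
.IP
Passing a name longer than 15 characters is not an error in this sketch;
the name read back is the truncated form.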
.\" prctl PR_SET_NO_NEW_PRIVS
.TP
.BR PR_SET_NO_NEW_PRIVS " (since Linux 3.5)"
Set the calling thread's
.I no_new_privs
attribute to the value in
.IR arg2 .
With
.I no_new_privs
set to 1,
.BR execve (2)
promises not to grant privileges to do anything
that could not have been done without the
.BR execve (2)
call (for example,
rendering the set-user-ID and set-group-ID mode bits,
and file capabilities non-functional).
Once set, the
.I no_new_privs
attribute cannot be unset.
The setting of this attribute is inherited by children created by
.BR fork (2)
and
.BR clone (2),
and preserved across
.BR execve (2).
.IP
Since Linux 4.10,
the value of a thread's
.I no_new_privs
attribute can be viewed via the
.I NoNewPrivs
field in the
.IR /proc/ pid /status
file.
.IP
For more information, see the kernel source file
.I Documentation/userspace\-api/no_new_privs.rst
.\" commit 40fde647ccb0ae8c11d256d271e24d385eed595b
(or
.I Documentation/prctl/no_new_privs.txt
before Linux 4.13).
See also
.BR seccomp (2).
.\" prctl PR_GET_NO_NEW_PRIVS
.TP
.BR PR_GET_NO_NEW_PRIVS " (since Linux 3.5)"
Return (as the function result) the value of the
.I no_new_privs
attribute for the calling thread.
A value of 0 indicates the regular
.BR execve (2)
behavior.
A value of 1 indicates that
.BR execve (2)
will operate in the privilege-restricting mode described above.
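.IP
The set-then-query pattern for the
.I no_new_privs
attribute can be sketched as follows.
The helper name
.IR lock_down_privs ()
is hypothetical;
the sketch assumes that the caller really wants the attribute set,
since the change cannot be reverted.
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: set no_new_privs for the calling thread and
 * return the value then reported by PR_GET_NO_NEW_PRIVS
 * (1 on success), or -1 with errno set if either call fails.
 */
int
lock_down_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1L, 0L, 0L, 0L) == -1)
        return -1;

    /* Once set, the attribute cannot be unset. */
    return prctl(PR_GET_NO_NEW_PRIVS, 0L, 0L, 0L, 0L);
}
```
.EE
.in
.IP
A typical caller would invoke such a helper before installing a
.BR seccomp (2)
filter or before executing an untrusted program.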
.\" prctl PR_PAC_RESET_KEYS
.\" commit ba830885656414101b2f8ca88786524d4bb5e8c1
.TP
.BR PR_PAC_RESET_KEYS " (since Linux 5.0, only on arm64)"
Securely reset the thread's pointer authentication keys
to fresh random values generated by the kernel.
.IP
The set of keys to be reset is specified by
.IR arg2 ,
which must be a logical OR of zero or more of the following:
.RS
.TP
.B PR_PAC_APIAKEY
instruction authentication key A
.TP
.B PR_PAC_APIBKEY
instruction authentication key B
.TP
.B PR_PAC_APDAKEY
data authentication key A
.TP
.B PR_PAC_APDBKEY
data authentication key B
.TP
.B PR_PAC_APGAKEY
generic authentication \(lqA\(rq key.
.IP
(Yes folks, there really is no generic B key.)
.RE
.IP
As a special case, if
.I arg2
is zero, then all the keys are reset.
Since new keys could be added in future,
this is the recommended way to completely wipe the existing keys
when establishing a clean execution context.
Note that there is no need to use
.B PR_PAC_RESET_KEYS
in preparation for calling
.BR execve (2),
since
.BR execve (2)
resets all the pointer authentication keys.
.IP
The remaining arguments
.IR arg3 ", " arg4 ", and " arg5
must all be zero.
.IP
If the arguments are invalid,
and in particular if
.I arg2
contains set bits that are unrecognized
or that correspond to a key not available on this platform,
then the call fails with error
.BR EINVAL .
.IP
.B Warning:
Because the compiler or run-time environment
may be using some or all of the keys,
a successful
.B PR_PAC_RESET_KEYS
may crash the calling process.
The conditions for using it safely are complex and system-dependent.
Don't use it unless you know what you are doing.
.IP
For more information, see the kernel source file
.I Documentation/arm64/pointer\-authentication.rst
.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
(or
.I Documentation/arm64/pointer\-authentication.txt
before Linux 5.3).
.\" prctl PR_SET_PDEATHSIG
.TP
.BR PR_SET_PDEATHSIG " (since Linux 2.1.57)"
Set the parent-death signal
of the calling process to \fIarg2\fP (either a signal value
in the range 1..\c
.BR NSIG "\-1" ,
or 0 to clear).
This is the signal that the calling process will get when its
parent dies.
.IP
.IR Warning :
.\" https://bugzilla.kernel.org/show_bug.cgi?id=43300
the "parent" in this case is considered to be the
.I thread
that created this process.
In other words, the signal will be sent when that thread terminates
(via, for example,
.BR pthread_exit (3)),
rather than after all of the threads in the parent process terminate.
.IP
The parent-death signal is sent upon subsequent termination of the parent
thread and also upon termination of each subreaper process
(see the description of
.B PR_SET_CHILD_SUBREAPER
above) to which the caller is subsequently reparented.
If the parent thread and all ancestor subreapers have already terminated
by the time of the
.B PR_SET_PDEATHSIG
operation, then no parent-death signal is sent to the caller.
.IP
The parent-death signal is process-directed (see
.BR signal (7))
and, if the child installs a handler using the
.BR sigaction (2)
.B SA_SIGINFO
flag, the
.I si_pid
field of the
.I siginfo_t
argument of the handler contains the PID of the terminating parent process.
.IP
The parent-death signal setting is cleared for the child of a
.BR fork (2).
It is also
(since Linux 2.4.36 / 2.6.23)
.\" commit d2d56c5f51028cb9f3d800882eb6f4cbd3f9099f
cleared when executing a set-user-ID or set-group-ID binary,
or a binary that has associated capabilities (see
.BR capabilities (7));
otherwise, this value is preserved across
.BR execve (2).
The parent-death signal setting is also cleared upon changes to
any of the following thread credentials:
.\" FIXME capability changes can also trigger this; see
.\" kernel/cred.c::commit_creds in the Linux 5.6 source.
effective user ID, effective group ID, filesystem user ID,
or filesystem group ID.
.\" prctl PR_GET_PDEATHSIG
.TP
.BR PR_GET_PDEATHSIG " (since Linux 2.3.15)"
Return the current value of the parent-death signal,
in the location pointed to by
.IR "(int\~*) arg2" .
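.IP
Because no signal is sent for a parent that has already terminated,
a child that sets the parent-death signal after
.BR fork (2)
usually also checks whether its parent is still the expected one.
The following sketch illustrates that pattern;
the helper names
.IR arm_parent_death_signal ()
and
.IR parent_death_signal ()
are hypothetical,
and the
.BR getppid (2)
comparison is an assumption about how the caller detects the race.
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, meant to be called in a child right after
 * fork(2).  expected_parent is the PID that the parent obtained
 * from getpid(2) before forking.
 * Returns 0 on success, 1 if the parent is already gone,
 * or -1 with errno set if prctl() fails.
 */
int
arm_parent_death_signal(int sig, pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig, 0L, 0L, 0L) == -1)
        return -1;

    /* A parent that terminated earlier will never trigger sig. */
    if (getppid() != expected_parent)
        return 1;

    return 0;
}

/* Read the current setting; returns the signal number or -1. */
int
parent_death_signal(void)
{
    int sig = 0;

    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &sig, 0L, 0L, 0L) == -1)
        return -1;
    return sig;
}
```
.EE
.in
.IP
A return value of 1 from the first helper tells the child
to act as if the signal had already been delivered.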
.\" prctl PR_SET_PTRACER
.TP
.BR PR_SET_PTRACER " (since Linux 3.4)"
.\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
.\" commit bf06189e4d14641c0148bea16e9dd24943862215
This is meaningful only when the Yama LSM is enabled and in mode 1
("restricted ptrace", visible via
.IR /proc/sys/kernel/yama/ptrace_scope ).
When a "ptracer process ID" is passed in \fIarg2\fP,
the caller is declaring that the ptracer process can
.BR ptrace (2)
the calling process as if it were a direct process ancestor.
Each
.B PR_SET_PTRACER
operation replaces the previous "ptracer process ID".
Employing
.B PR_SET_PTRACER
with
.I arg2
set to 0 clears the caller's "ptracer process ID".
If
.I arg2
is
.BR PR_SET_PTRACER_ANY ,
the ptrace restrictions introduced by Yama are effectively disabled for the
calling process.
.IP
For further information, see the kernel source file
.I Documentation/admin\-guide/LSM/Yama.rst
.\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22
(or
.I Documentation/security/Yama.txt
before Linux 4.13).
.\" prctl PR_SET_SECCOMP
.TP
.BR PR_SET_SECCOMP " (since Linux 2.6.23)"
.\" See http://thread.gmane.org/gmane.linux.kernel/542632
.\" [PATCH 0 of 2] seccomp updates
.\" andrea@cpushare.com
Set the secure computing (seccomp) mode for the calling thread, to limit
the available system calls.
The more recent
.BR seccomp (2)
system call provides a superset of the functionality of
.BR PR_SET_SECCOMP ,
and is the preferred interface for new applications.
.IP
The seccomp mode is selected via
.IR arg2 .
(The seccomp constants are defined in
.IR <linux/seccomp.h> .)
The following values can be specified:
.RS
.TP
.BR SECCOMP_MODE_STRICT " (since Linux 2.6.23)"
See the description of
.B SECCOMP_SET_MODE_STRICT
in
.BR seccomp (2).
.IP
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.TP
.BR SECCOMP_MODE_FILTER " (since Linux 3.5)"
The allowed system calls are defined by a pointer
to a Berkeley Packet Filter passed in
.IR arg3 .
This argument is a pointer to
.IR "struct sock_fprog" ;
it can be designed to filter
arbitrary system calls and system call arguments.
See the description of
.B SECCOMP_SET_MODE_FILTER
in
.BR seccomp (2).
.IP
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP_FILTER
enabled.
.RE
.IP
For further details on seccomp filtering, see
.BR seccomp (2).
.\" prctl PR_GET_SECCOMP
.TP
.BR PR_GET_SECCOMP " (since Linux 2.6.23)"
Return (as the function result)
the secure computing mode of the calling thread.
If the caller is not in secure computing mode, this operation returns 0;
if the caller is in strict secure computing mode, then the
.BR prctl ()
call will cause a
.B SIGKILL
signal to be sent to the process.
If the caller is in filter mode, and this system call is allowed by the
seccomp filters, it returns 2; otherwise, the process is killed with a
.B SIGKILL
signal.
.IP
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.IP
Since Linux 3.8, the
.I Seccomp
field of the
.IR /proc/ pid /status
file provides a method of obtaining the same information,
without the risk that the process is killed; see
.BR proc (5).
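.IP
Reading the
.I Seccomp
field might look like the following sketch.
The helper name
.IR seccomp_mode_from_status ()
is hypothetical,
and the sketch assumes that the field appears on a line of its own
in the form "Seccomp:" followed by white space and a decimal number.
Note that a process in strict mode cannot open files,
so this approach is mainly of use in filter mode
or when inspecting another process.
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the value of the Seccomp field
 * (0 = disabled, 1 = strict, 2 = filter) from a status file such
 * as /proc/self/status, or -1 if the file cannot be opened or
 * the field is not present.
 */
int
seccomp_mode_from_status(const char *path)
{
    char line[256];
    int mode = -1;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* The space in the format also matches a tab. */
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }

    fclose(fp);
    return mode;
}
```
.EE
.in
.IP
Other fields whose names merely begin with "Seccomp"
do not match the format, because the colon must follow immediately.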
.\" prctl PR_SET_SECUREBITS
.TP
.BR PR_SET_SECUREBITS " (since Linux 2.6.26)"
Set the "securebits" flags of the calling thread to the value supplied in
.IR arg2 .
See
.BR capabilities (7).
.\" prctl PR_GET_SECUREBITS
.TP
.BR PR_GET_SECUREBITS " (since Linux 2.6.26)"
Return (as the function result)
the "securebits" flags of the calling thread.
See
.BR capabilities (7).
.\" prctl PR_GET_SPECULATION_CTRL
.TP
.BR PR_GET_SPECULATION_CTRL " (since Linux 4.17)"
Return (as the function result)
the state of the speculation misfeature specified in
.IR arg2 .
Currently, the only permitted value for this argument is
.B PR_SPEC_STORE_BYPASS
(otherwise the call fails with the error
.BR ENODEV ).
.IP
The return value uses bits 0-4 with the following meaning:
.RS
.TP
.B PR_SPEC_PRCTL
Mitigation can be controlled per thread by
.BR PR_SET_SPECULATION_CTRL .
.TP
.B PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.B PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.B PR_SPEC_FORCE_DISABLE
Same as
.B PR_SPEC_DISABLE
but cannot be undone.
.TP
.BR PR_SPEC_DISABLE_NOEXEC " (since Linux 5.1)"
Same as
.BR PR_SPEC_DISABLE ,
but the state will be cleared on
.BR execve (2).
.RE
.IP
If all bits are 0,
then the CPU is not affected by the speculation misfeature.
.IP
If
.B PR_SPEC_PRCTL
is set, then per-thread control of the mitigation is available.
If it is not set, a
.B PR_SET_SPECULATION_CTRL
call for the speculation misfeature will fail.
.IP
The
.IR arg3 ,
.IR arg4 ,
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
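.IP
The bit assignments above can be decoded as in the following sketch.
The enumeration
.I spec_state
and the three helper functions are hypothetical names;
the fallback definitions are an assumption for older header files
and repeat the values used in
.IR <linux/prctl.h> .
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in <linux/prctl.h>. */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS 0
#endif
#ifndef PR_SPEC_PRCTL
#define PR_SPEC_PRCTL (1UL << 0)
#define PR_SPEC_ENABLE (1UL << 1)
#define PR_SPEC_DISABLE (1UL << 2)
#define PR_SPEC_FORCE_DISABLE (1UL << 3)
#endif
#ifndef PR_SPEC_DISABLE_NOEXEC
#define PR_SPEC_DISABLE_NOEXEC (1UL << 4)
#endif

/* Hypothetical summary of a PR_GET_SPECULATION_CTRL result. */
enum spec_state {
    SPEC_NOT_AFFECTED,      /* all bits are 0 */
    SPEC_MITIGATION_OFF,    /* PR_SPEC_ENABLE */
    SPEC_MITIGATION_ON,     /* PR_SPEC_DISABLE */
    SPEC_MITIGATION_FORCED, /* PR_SPEC_FORCE_DISABLE */
    SPEC_MITIGATION_NOEXEC, /* PR_SPEC_DISABLE_NOEXEC */
    SPEC_UNKNOWN
};

enum spec_state
decode_spec_state(unsigned long bits)
{
    if (bits == 0)
        return SPEC_NOT_AFFECTED;
    if (bits & PR_SPEC_FORCE_DISABLE)
        return SPEC_MITIGATION_FORCED;
    if (bits & PR_SPEC_DISABLE_NOEXEC)
        return SPEC_MITIGATION_NOEXEC;
    if (bits & PR_SPEC_DISABLE)
        return SPEC_MITIGATION_ON;
    if (bits & PR_SPEC_ENABLE)
        return SPEC_MITIGATION_OFF;
    return SPEC_UNKNOWN;
}

/* Nonzero if per-thread control of the mitigation is available. */
int
spec_is_controllable(unsigned long bits)
{
    return (bits & PR_SPEC_PRCTL) != 0;
}

/*
 * Query the speculative store bypass state of the calling thread.
 * Returns the raw bits, or -1 with errno set if prctl() fails.
 */
long
store_bypass_bits(void)
{
    return prctl(PR_GET_SPECULATION_CTRL,
                 (unsigned long) PR_SPEC_STORE_BYPASS, 0L, 0L, 0L);
}
```
.EE
.in
.IP
A caller would check the result of
.IR store_bypass_bits ()
for \-1 before passing it to the decoding helpers.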
.\" prctl PR_SET_SPECULATION_CTRL
.TP
.BR PR_SET_SPECULATION_CTRL " (since Linux 4.17)"
.\" commit b617cfc858161140d69cc0b5cc211996b557a1c7
.\" commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee
Set the state of the speculation misfeature specified in
.IR arg2 .
The speculation-misfeature settings are per-thread attributes.
.IP
Currently,
.I arg2
must be one of:
.RS
.TP
.B PR_SPEC_STORE_BYPASS
Set the state of the speculative store bypass misfeature.
.\" commit 9137bb27e60e554dab694eafa4cca241fa3a694f
.TP
.BR PR_SPEC_INDIRECT_BRANCH " (since Linux 4.20)"
Set the state of the indirect branch speculation misfeature.
.RE
.IP
If
.I arg2
does not have one of the above values,
then the call fails with the error
.BR ENODEV .
.IP
The
.I arg3
argument specifies the control value,
which is one of the following:
.RS
.TP
.B PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.B PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.B PR_SPEC_FORCE_DISABLE
Same as
.BR PR_SPEC_DISABLE ,
but cannot be undone.
A subsequent
.B PR_SET_SPECULATION_CTRL
call with the same value for
.I arg2
and with
.I arg3
set to
.B PR_SPEC_ENABLE
will fail with the error
.BR EPERM .
.\" commit 71368af9027f18fe5d1c6f372cfdff7e4bde8b48
.TP
.BR PR_SPEC_DISABLE_NOEXEC " (since Linux 5.1)"
Same as
.BR PR_SPEC_DISABLE ,
but the state will be cleared on
.BR execve (2).
Currently only supported for
.I arg2
equal to
.BR PR_SPEC_STORE_BYPASS .
.RE
.IP
Any unsupported value in
.I arg3
will result in the call failing with the error
.BR ERANGE .
.IP
The
.I arg4
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
.IP
The speculation feature can also be controlled by the
.B spec_store_bypass_disable
boot parameter.
This parameter may enforce a read-only policy which will result in the
.BR prctl ()
call failing with the error
.BR ENXIO .
For further details, see the kernel source file
.IR Documentation/admin\-guide/kernel\-parameters.txt .
.\" prctl PR_SVE_SET_VL
.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
.\" linux-5.6/Documentation/arm64/sve.rst
.TP
.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
Configure the thread's SVE vector length,
as specified by
.IR "(int) arg2" .
Arguments
.IR arg3 ,
.IR arg4 ,
and
.I arg5
are ignored.
.IP
The bits of
.I arg2
corresponding to
.B PR_SVE_VL_LEN_MASK
must be set to the desired vector length in bytes.
This is interpreted as an upper bound:
the kernel will select the greatest available vector length
that does not exceed the value specified.
In particular, specifying
.B SVE_VL_MAX
(defined in
.IR <asm/sigcontext.h> )
for the
.B PR_SVE_VL_LEN_MASK
bits requests the maximum supported vector length.
.IP
In addition, the other bits of
.I arg2
must be set to one of the following combinations of flags:
.RS
.TP
.B 0
Perform the change immediately.
At the next
.BR execve (2)
in the thread,
the vector length will be reset to the value configured in
.IR /proc/sys/abi/sve_default_vector_length .
.TP
.B PR_SVE_VL_INHERIT
Perform the change immediately.
Subsequent
.BR execve (2)
calls will preserve the new vector length.
.TP
.B PR_SVE_SET_VL_ONEXEC
Defer the change, so that it is performed at the next
.BR execve (2)
in the thread.
Further
.BR execve (2)
calls will reset the vector length to the value configured in
.IR /proc/sys/abi/sve_default_vector_length .
.TP
.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
Defer the change, so that it is performed at the next
.BR execve (2)
in the thread.
Further
.BR execve (2)
calls will preserve the new vector length.
.RE
.IP
In all cases,
any previously pending deferred change is canceled.
.IP
The call fails with error
.B EINVAL
if SVE is not supported on the platform, if
.I arg2
is unrecognized or invalid, or if the value in the bits of
.I arg2
corresponding to
.B PR_SVE_VL_LEN_MASK
is outside the range
.BR SVE_VL_MIN .. SVE_VL_MAX
or is not a multiple of 16.
.IP
On success,
a nonnegative value is returned that describes the
.I selected
configuration.
If
.B PR_SVE_SET_VL_ONEXEC
was included in
.IR arg2 ,
then the configuration described by the return value
will take effect at the next
.BR execve (2).
Otherwise, the configuration is already in effect when the
.B PR_SVE_SET_VL
call returns.
In either case, the value is encoded in the same way as the return value of
.BR PR_SVE_GET_VL .
Note that there is no explicit flag in the return value
corresponding to
.BR PR_SVE_SET_VL_ONEXEC .
.IP
The configuration (including any pending deferred change)
is inherited across
.BR fork (2)
and
.BR clone (2).
.IP
For more information, see the kernel source file
.I Documentation/arm64/sve.rst
.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
(or
.I Documentation/arm64/sve.txt
before Linux 5.3).
.IP
.B Warning:
Because the compiler or run-time environment
may be using SVE, using this call without the
.B PR_SVE_SET_VL_ONEXEC
flag may crash the calling process.
The conditions for using it safely are complex and system-dependent.
Don't use it unless you really know what you are doing.
.\" prctl PR_SVE_GET_VL
.TP
.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
Get the thread's current SVE vector length configuration.
.IP
Arguments
.IR arg2 ", " arg3 ", " arg4 ", and " arg5
are ignored.
.IP
Provided that the kernel and platform support SVE,
this operation always succeeds,
returning a nonnegative value that describes the
.I current
configuration.
The bits corresponding to
.B PR_SVE_VL_LEN_MASK
contain the currently configured vector length in bytes.
The bit corresponding to
.B PR_SVE_VL_INHERIT
indicates whether the vector length will be inherited
across
.BR execve (2).
.IP
Note that there is no way to determine whether there is
a pending vector length change that has not yet taken effect.
.IP
For more information, see the kernel source file
.I Documentation/arm64/sve.rst
.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
(or
.I Documentation/arm64/sve.txt
before Linux 5.3).
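.IP
The value returned by
.B PR_SVE_GET_VL
(and by a successful
.BR PR_SVE_SET_VL )
can be split into its two parts as in the following sketch.
The structure
.I sve_config
and the helper names are hypothetical;
the fallback definitions are an assumption for older header files
and repeat the values used in
.IR <linux/prctl.h> .
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in <linux/prctl.h>. */
#ifndef PR_SVE_GET_VL
#define PR_SVE_GET_VL 51
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK 0xffff
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT (1 << 17)
#endif

/* Hypothetical container for a decoded SVE configuration. */
struct sve_config {
    int vl_bytes;   /* vector length in bytes */
    int inherit;    /* nonzero if preserved across execve(2) */
};

/* Split a nonnegative PR_SVE_GET_VL or PR_SVE_SET_VL result. */
struct sve_config
decode_sve_config(int ret)
{
    struct sve_config cfg;

    assert(ret >= 0);
    cfg.vl_bytes = ret & PR_SVE_VL_LEN_MASK;
    cfg.inherit = (ret & PR_SVE_VL_INHERIT) != 0;
    return cfg;
}

/*
 * Query the configuration of the calling thread.
 * Returns 0 on success, or -1 with errno set if the kernel or
 * platform does not support SVE.
 */
int
current_sve_config(struct sve_config *cfg)
{
    int ret = prctl(PR_SVE_GET_VL, 0L, 0L, 0L, 0L);

    if (ret == -1)
        return -1;
    *cfg = decode_sve_config(ret);
    return 0;
}
```
.EE
.in
.IP
On systems without SVE, the second helper is expected to return \-1,
while the decoding helper can still be exercised with synthetic values.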
.\" prctl PR_SET_SYSCALL_USER_DISPATCH
.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
.TP
.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
Configure the Syscall User Dispatch mechanism
for the calling thread.
This mechanism allows an application
to selectively intercept system calls
so that they can be handled within the application itself.
Interception takes the form of a thread-directed
.B SIGSYS
signal that is delivered to the thread
when it makes a system call.
If intercepted,
the system call is not executed by the kernel.
.IP
To enable this mechanism,
.I arg2
should be set to
.BR PR_SYS_DISPATCH_ON .
Once enabled, further system calls will be selectively intercepted,
depending on a control variable provided by user space.
In this case,
.I arg3
and
.I arg4
respectively identify the
.I offset
and
.I length
of a single contiguous memory region in the process address space
from where system calls are always allowed to be executed,
regardless of the control variable.
(Typically, this area would include the area of memory
containing the C library.)
.IP
.I arg5
points to a char-sized variable
that is a fast switch to allow/block system call execution
without the overhead of doing another system call
to re-configure Syscall User Dispatch.
This control variable can either be set to
.B SYSCALL_DISPATCH_FILTER_BLOCK
to block system calls from executing
or to
.B SYSCALL_DISPATCH_FILTER_ALLOW
to temporarily allow them to be executed.
This value is checked by the kernel
on every system call entry,
and any unexpected value will raise
an uncatchable
.B SIGSYS
at that time,
killing the application.
.IP
When a system call is intercepted,
the kernel sends a thread-directed
.B SIGSYS
signal to the triggering thread.
Various fields will be set in the
.I siginfo_t
structure (see
.BR sigaction (2))
associated with the signal:
.RS
.IP * 3
.I si_signo
will contain
.BR SIGSYS .
.IP *
.I si_call_addr
will show the address of the system call instruction.
.IP *
.I si_syscall
and
.I si_arch
will indicate which system call was attempted.
.IP *
.I si_code
will contain
.BR SYS_USER_DISPATCH .
.IP *
.I si_errno
will be set to 0.
.RE
.IP
The program counter will be as though the system call happened
(i.e., the program counter will not point to the system call instruction).
.IP
When the signal handler returns to the kernel,
the system call completes immediately
and returns to the calling thread,
without actually being executed.
If necessary
(i.e., when emulating the system call in user space),
the signal handler should set the system call return value
to a sane value,
by modifying the register context stored in the
.I ucontext
argument of the signal handler.
See
.BR sigaction (2),
.BR sigreturn (2),
and
.BR getcontext (3)
for more information.
.IP
If
.I arg2
is set to
.BR PR_SYS_DISPATCH_OFF ,
Syscall User Dispatch is disabled for that thread.
The remaining arguments must be set to 0.
.IP
The setting is not preserved across
.BR fork (2),
.BR clone (2),
or
.BR execve (2).
.IP
For more information,
see the kernel source file
.IR Documentation/admin\-guide/syscall\-user\-dispatch.rst .
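.IP
The enable and disable steps can be sketched as follows.
The variable
.I sud_selector
and the helpers
.IR sud_enable ()
and
.IR sud_disable ()
are hypothetical names,
and the sketch assumes an empty always-allowed region
(offset 0, length 0),
which a real program would replace with the range of its C library.
The fallback definitions are an assumption for older header files
and repeat the values used in
.IR <linux/prctl.h> .
No
.B SIGSYS
handler is shown.
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in <linux/prctl.h>. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#define PR_SYS_DISPATCH_OFF 0
#define PR_SYS_DISPATCH_ON 1
#define SYSCALL_DISPATCH_FILTER_ALLOW 0
#define SYSCALL_DISPATCH_FILTER_BLOCK 1
#endif

/*
 * Control variable read by the kernel on every system call entry.
 * It must remain valid while the mechanism is enabled, hence the
 * static storage duration.
 */
static volatile char sud_selector = SYSCALL_DISPATCH_FILTER_ALLOW;

/*
 * Hypothetical helper: enable Syscall User Dispatch for the calling
 * thread.  Nothing is intercepted until the caller stores
 * SYSCALL_DISPATCH_FILTER_BLOCK in sud_selector, by which time a
 * SIGSYS handler must be installed.
 * Returns 0 on success, or -1 with errno set.
 */
int
sud_enable(void)
{
    sud_selector = SYSCALL_DISPATCH_FILTER_ALLOW;
    return prctl(PR_SET_SYSCALL_USER_DISPATCH,
                 (unsigned long) PR_SYS_DISPATCH_ON, 0L, 0L,
                 (unsigned long) &sud_selector);
}

/* Disable the mechanism; the remaining arguments must be 0. */
int
sud_disable(void)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH,
                 (unsigned long) PR_SYS_DISPATCH_OFF, 0L, 0L, 0L);
}
```
.EE
.in
.IP
On kernels or architectures without this mechanism,
.IR sud_enable ()
is expected to fail, and the caller should fall back accordingly.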
.\" prctl PR_SET_TAGGED_ADDR_CTRL
.\" commit 63f0c60379650d82250f22e4cf4137ef3dc4f43d
.TP
.BR PR_SET_TAGGED_ADDR_CTRL " (since Linux 5.4, only on arm64)"
Control support for passing tagged user-space addresses to the kernel
(i.e., addresses where bits 56\(en63 are not all zero).
.IP
The level of support is selected by
.IR arg2 ,
which can be one of the following:
.RS
.TP
.B 0
Addresses that are passed
for the purpose of being dereferenced by the kernel
must be untagged.
.TP
.B PR_TAGGED_ADDR_ENABLE
Addresses that are passed
for the purpose of being dereferenced by the kernel
may be tagged, with the exceptions summarized below.
.RE
.IP
The remaining arguments
.IR arg3 ", " arg4 ", and " arg5
must all be zero.
.\" Enforcement added in
.\" commit 3e91ec89f527b9870fe42dcbdb74fd389d123a95
.IP
On success, the mode specified in
.I arg2
is set for the calling thread and the return value is 0.
If the arguments are invalid,
the mode specified in
.I arg2
is unrecognized,
or if this feature is unsupported by the kernel
or disabled via
.IR /proc/sys/abi/tagged_addr_disabled ,
the call fails with the error
.BR EINVAL .
.IP
In particular, if
.BR prctl ( PR_SET_TAGGED_ADDR_CTRL ,
0, 0, 0, 0)
fails with
.BR EINVAL ,
then all addresses passed to the kernel must be untagged.
.IP
Irrespective of which mode is set,
addresses passed to certain interfaces
must always be untagged:
.RS
.IP \(bu 2
.BR brk (2),
.BR mmap (2),
.BR shmat (2),
.BR shmdt (2),
and the
.I new_address
argument of
.BR mremap (2).
.IP
(Prior to Linux 5.6 these accepted tagged addresses,
but the behaviour may not be what you expect.
Don't rely on it.)
.IP \(bu
\(oqpolymorphic\(cq interfaces
that accept pointers to arbitrary types cast to a
.I void\~*
or other generic type, specifically
.BR prctl (),
.BR ioctl (2),
and in general
.BR setsockopt (2)
(only certain specific
.BR setsockopt (2)
options allow tagged addresses).
.RE
.IP
This list of exclusions may shrink
when moving from one kernel version to a later kernel version.
While the kernel may make some guarantees
for backwards compatibility reasons,
for the purposes of new software
the effect of passing tagged addresses to these interfaces
is unspecified.
.IP
The mode set by this call is inherited across
.BR fork (2)
and
.BR clone (2).
The mode is reset by
.BR execve (2)
to 0
(i.e., tagged addresses not permitted in the user/kernel ABI).
.IP
For more information, see the kernel source file
.IR Documentation/arm64/tagged\-address\-abi.rst .
.IP
.B Warning:
This call is primarily intended for use by the run-time environment.
A successful
.B PR_SET_TAGGED_ADDR_CTRL
call elsewhere may crash the calling process.
The conditions for using it safely are complex and system-dependent.
Don't use it unless you know what you are doing.
.\" prctl PR_GET_TAGGED_ADDR_CTRL
.\" commit 63f0c60379650d82250f22e4cf4137ef3dc4f43d
.TP
.BR PR_GET_TAGGED_ADDR_CTRL " (since Linux 5.4, only on arm64)"
Return the current tagged address mode
for the calling thread.
.IP
Arguments
.IR arg2 ", " arg3 ", " arg4 ", and " arg5
must all be zero.
.IP
If the arguments are invalid
or this feature is disabled or unsupported by the kernel,
the call fails with
.BR EINVAL .
In particular, if
.BR prctl ( PR_GET_TAGGED_ADDR_CTRL ,
0, 0, 0, 0)
fails with
.BR EINVAL ,
then this feature is definitely either unsupported,
or disabled via
.IR /proc/sys/abi/tagged_addr_disabled .
In this case,
all addresses passed to the kernel must be untagged.
.IP
Otherwise, the call returns a nonnegative value
describing the current tagged address mode,
encoded in the same way as the
.I arg2
argument of
.BR PR_SET_TAGGED_ADDR_CTRL .
.IP
For more information, see the kernel source file
.IR Documentation/arm64/tagged\-address\-abi.rst .
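.IP
The probing rule described above
(failure with
.B EINVAL
means that all addresses must be untagged)
can be sketched as follows.
The helper name
.IR tagged_addresses_allowed ()
is hypothetical;
the fallback definitions are an assumption for older header files
and repeat the values used in
.IR <linux/prctl.h> .
.IP
.in +4n
.EX
```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in <linux/prctl.h>. */
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical probe: return 1 if the calling thread may pass
 * tagged addresses to the kernel, or 0 if all addresses must be
 * untagged (mode 0, or the feature is unsupported or disabled).
 */
int
tagged_addresses_allowed(void)
{
    int mode = prctl(PR_GET_TAGGED_ADDR_CTRL, 0L, 0L, 0L, 0L);

    if (mode == -1)
        return 0;
    return (mode & PR_TAGGED_ADDR_ENABLE) != 0;
}
```
.EE
.in
.IP
The probe only reads the mode, so it does not carry the risks described for
.BR PR_SET_TAGGED_ADDR_CTRL .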
03547431 1764.\"
667eb3ac 1765.\" prctl PR_TASK_PERF_EVENTS_DISABLE
06afe673
MK
1766.TP
1767.BR PR_TASK_PERF_EVENTS_DISABLE " (since Linux 2.6.31)"
1768Disable all performance counters attached to the calling process,
1769regardless of whether the counters were created by
1770this process or another process.
1771Performance counters created by the calling process for other
1772processes are unaffected.
66a9882e 1773For more information on performance counters, see the Linux kernel source file
06afe673
MK
1774.IR tools/perf/design.txt .
1775.IP
03547431
MK
1776Originally called
1777.BR PR_TASK_PERF_COUNTERS_DISABLE ;
1778.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
b0ea1ea3 1779renamed (retaining the same numerical value)
03547431
MK
1780in Linux 2.6.32.
.\"
.\" prctl PR_TASK_PERF_EVENTS_ENABLE
.TP
.BR PR_TASK_PERF_EVENTS_ENABLE " (since Linux 2.6.31)"
The converse of
.BR PR_TASK_PERF_EVENTS_DISABLE ;
enable performance counters attached to the calling process.
.IP
Originally called
.BR PR_TASK_PERF_COUNTERS_ENABLE ;
.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
renamed
.\" commit cdd6c482c9ff9c55475ee7392ec8f672eddb7be6
in Linux 2.6.32.
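.IP
One possible use of this pair of operations is to exclude
a region of code from measurement.
The sketch below assumes a hypothetical helper,
.IR run_uncounted (),
and a placeholder workload,
.IR warm_up ();
neither name comes from the kernel or from glibc.

```c
#include <sys/prctl.h>

/* Placeholder workload used only for illustration. */
static void
warm_up(void)
{
    volatile unsigned long sum = 0;

    for (unsigned long i = 0; i < 1000; i++)
        sum += i;
}

/* Hypothetical helper: runs fn() while the performance counters attached
   to the calling process are disabled, then enables them again.
   Returns 0 on success, or -1 if either prctl() call fails. */
static int
run_uncounted(void (*fn)(void))
{
    if (prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0) == -1)
        return -1;
    fn();
    return prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
}
```

For example,
.I run_uncounted(warm_up)
keeps the warm-up loop out of the counts gathered for the rest of the program.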
.\"
.\" prctl PR_SET_THP_DISABLE
.TP
.BR PR_SET_THP_DISABLE " (since Linux 3.15)"
.\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
Set the state of the "THP disable" flag for the calling thread.
If
.I arg2
has a nonzero value, the flag is set; otherwise, it is cleared.
Setting this flag provides a method
for disabling transparent huge pages
for jobs where the code cannot be modified, and using a malloc hook with
.BR madvise (2)
is not an option (i.e., statically allocated data).
The setting of the "THP disable" flag is inherited by a child created via
.BR fork (2)
and is preserved across
.BR execve (2).
.\" prctl PR_GET_THP_DISABLE
.TP
.BR PR_GET_THP_DISABLE " (since Linux 3.15)"
Return (as the function result) the current setting of the "THP disable"
flag for the calling thread:
either 1, if the flag is set, or 0, if it is not.
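.IP
The following sketch shows how the flag might be set and read back.
The wrapper names
.IR set_thp_disable ()
and
.IR get_thp_disable ()
are hypothetical,
and the fallback constant definitions are an assumption
for building against headers that predate Linux 3.15.

```c
#include <sys/prctl.h>

#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41   /* fallback for older headers */
#define PR_GET_THP_DISABLE 42
#endif

/* Hypothetical wrapper: sets the "THP disable" flag if disable is
   nonzero, and clears it otherwise.  Returns 0 on success, -1 on error. */
static int
set_thp_disable(int disable)
{
    return prctl(PR_SET_THP_DISABLE, disable ? 1UL : 0UL, 0, 0, 0);
}

/* Hypothetical wrapper: returns 1 if the flag is set, 0 if it is not,
   or -1 on error. */
static int
get_thp_disable(void)
{
    return prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0);
}
```

Because the flag is preserved across
.BR execve (2),
a small launcher could call
.I set_thp_disable(1)
and then execute an unmodified program.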
.\" prctl PR_GET_TID_ADDRESS
.TP
.BR PR_GET_TID_ADDRESS " (since Linux 3.5)"
.\" commit 300f786b2683f8bb1ec0afb6e1851183a479c86d
Return the
.I clear_child_tid
address set by
.BR set_tid_address (2)
and the
.BR clone (2)
.B CLONE_CHILD_CLEARTID
flag, in the location pointed to by
.IR "(int\~**)\~arg2" .
This feature is available only if the kernel is built with the
.B CONFIG_CHECKPOINT_RESTORE
option enabled.
Note that since the
.BR prctl ()
system call does not have a compat implementation for
the AMD64 x32 and MIPS n32 ABIs,
and the kernel writes out a pointer using the kernel's pointer size,
this operation expects a user-space buffer of 8 (not 4) bytes on these ABIs.
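.IP
One way to stay safe on the ABIs mentioned above is to always
supply an 8-byte buffer.
The sketch below does so in a hypothetical helper,
.IR get_tid_address ();
it assumes a 64-bit or little-endian system, so that a pointer
narrower than 8 bytes lands in the low-order bytes of the buffer.

```c
#include <stdint.h>
#include <sys/prctl.h>

#ifndef PR_GET_TID_ADDRESS
#define PR_GET_TID_ADDRESS 40   /* fallback for older headers */
#endif

/* Hypothetical helper: stores the clear_child_tid address in *addr and
   returns 0, or returns -1 on error (for example, when the kernel
   lacks the required configuration option). */
static int
get_tid_address(uint64_t *addr)
{
    uint64_t buf = 0;   /* 8 bytes, even where user pointers are 4 bytes */

    if (prctl(PR_GET_TID_ADDRESS, (unsigned long) &buf, 0, 0, 0) == -1)
        return -1;
    *addr = buf;
    return 0;
}
```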
.\" prctl PR_SET_TIMERSLACK
.TP
.BR PR_SET_TIMERSLACK " (since Linux 2.6.28)"
.\" See https://lwn.net/Articles/369549/
.\" commit 6976675d94042fbd446231d1bd8b7de71a980ada
Each thread has two associated timer slack values:
a "default" value, and a "current" value.
This operation sets the "current" timer slack value for the calling thread.
.I arg2
is an unsigned long value, so the maximum "current" value is ULONG_MAX and
the minimum "current" value is 1.
If the nanosecond value supplied in
.I arg2
is greater than zero, then the "current" value is set to this value.
If
.I arg2
is equal to zero,
the "current" timer slack is reset to the
thread's "default" timer slack value.
.IP
The "current" timer slack is used by the kernel to group timer expirations
for the calling thread that are close to one another;
as a consequence, timer expirations for the thread may be
up to the specified number of nanoseconds late (but will never expire early).
Grouping timer expirations can help reduce system power consumption
by minimizing CPU wake-ups.
.IP
The timer expirations affected by timer slack are those set by
.BR select (2),
.BR pselect (2),
.BR poll (2),
.BR ppoll (2),
.BR epoll_wait (2),
.BR epoll_pwait (2),
.BR clock_nanosleep (2),
.BR nanosleep (2),
and
.BR futex (2)
(and thus the library functions implemented via futexes, including
.\" List obtained by grepping for futex usage in glibc source
.BR pthread_cond_timedwait (3),
.BR pthread_mutex_timedlock (3),
.BR pthread_rwlock_timedrdlock (3),
.BR pthread_rwlock_timedwrlock (3),
and
.BR sem_timedwait (3)).
.IP
Timer slack is not applied to threads that are scheduled under
a real-time scheduling policy (see
.BR sched_setscheduler (2)).
.IP
When a new thread is created,
the two timer slack values are made the same as the "current" value
of the creating thread.
Thereafter, a thread can adjust its "current" timer slack value via
.BR PR_SET_TIMERSLACK .
The "default" value can't be changed.
The timer slack values of
.I init
(PID 1), the ancestor of all processes,
are 50,000 nanoseconds (50 microseconds).
The timer slack value is inherited by a child created via
.BR fork (2),
and is preserved across
.BR execve (2).
.IP
Since Linux 4.6, the "current" timer slack value of any process
can be examined and changed via the file
.IR /proc/ pid /timerslack_ns .
See
.BR proc (5).
.\" prctl PR_GET_TIMERSLACK
.TP
.BR PR_GET_TIMERSLACK " (since Linux 2.6.28)"
Return (as the function result)
the "current" timer slack value of the calling thread.
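.IP
The interplay of the two timer slack operations can be sketched as follows.
The helper name
.IR set_timer_slack ()
is hypothetical, and the code assumes a thread that is not scheduled
under a real-time policy.

```c
#include <sys/prctl.h>

/* Hypothetical helper: sets the calling thread's "current" timer slack
   to ns nanoseconds (0 resets it to the "default" value), then returns
   the value reported by PR_GET_TIMERSLACK, or -1 on error. */
static int
set_timer_slack(unsigned long ns)
{
    if (prctl(PR_SET_TIMERSLACK, ns, 0, 0, 0) == -1)
        return -1;
    return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```

A thread that tolerates late wake-ups might call
.I set_timer_slack(100000)
to request 100 microseconds of slack, and later call
.I set_timer_slack(0)
to return to its "default" value.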
.\" prctl PR_SET_TIMING
.TP
.BR PR_SET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Set whether to use (normal, traditional) statistical process timing or
accurate timestamp-based process timing, by passing
.B PR_TIMING_STATISTICAL
.\" 0
or
.B PR_TIMING_TIMESTAMP
.\" 1
to \fIarg2\fP.
.B PR_TIMING_TIMESTAMP
is not currently implemented
(attempting to set this mode will yield the error
.BR EINVAL ).
.\" PR_TIMING_TIMESTAMP doesn't do anything in 2.6.26-rc8,
.\" and looking at the patch history, it appears
.\" that it never did anything.
.\" prctl PR_GET_TIMING
.TP
.BR PR_GET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Return (as the function result) which process timing method is currently
in use.
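.IP
The behavior described for the two timing modes can be probed
with a sketch such as the following.
The helper name
.IR set_timing ()
is hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: tries to select the given process timing method.
   Returns 0 on success, or the errno value on failure. */
static int
set_timing(unsigned long mode)
{
    if (prctl(PR_SET_TIMING, mode, 0, 0, 0) == -1)
        return errno;
    return 0;
}
```

Given the text above,
.I set_timing(PR_TIMING_TIMESTAMP)
would be expected to return
.BR EINVAL ,
and a subsequent
.B PR_GET_TIMING
call would be expected to report
.BR PR_TIMING_STATISTICAL .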
.\" prctl PR_SET_TSC
.TP
.BR PR_SET_TSC " (since Linux 2.6.26, x86 only)"
Set the state of the flag determining whether the timestamp counter
can be read by the process.
Pass
.B PR_TSC_ENABLE
to
.I arg2
to allow it to be read, or
.B PR_TSC_SIGSEGV
to generate a
.B SIGSEGV
when the process tries to read the timestamp counter.
.\" prctl PR_GET_TSC
.TP
.BR PR_GET_TSC " (since Linux 2.6.26, x86 only)"
Return the state of the flag determining whether the timestamp counter
can be read,
in the location pointed to by
.IR "(int\~*) arg2" .
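.IP
Since the state is returned through a pointer rather than
as the function result, a caller might wrap the operation as sketched below.
The helper name
.IR tsc_state ()
is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: returns PR_TSC_ENABLE or PR_TSC_SIGSEGV,
   or -1 if the state cannot be read (for example, on a system
   that is not x86). */
static int
tsc_state(void)
{
    int state;

    if (prctl(PR_GET_TSC, (unsigned long) &state, 0, 0, 0) == -1)
        return -1;
    return state;
}
```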
.\" prctl PR_SET_UNALIGN
.TP
.B PR_SET_UNALIGN
(Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22;
.\" sh: 94ea5e449ae834af058ef005d16a8ad44fcf13d6
.\" tile: 2f9ac29eec71a696cb0dcc5fb82c0f8d4dac28c9
sh, since Linux 2.6.34; tile, since Linux 3.12)
Set unaligned access control bits to \fIarg2\fP.
Pass
\fBPR_UNALIGN_NOPRINT\fP to silently fix up unaligned user accesses,
or \fBPR_UNALIGN_SIGBUS\fP to generate
.B SIGBUS
on unaligned user access.
Alpha also supports an additional flag with the value
of 4 and no corresponding named constant,
which instructs the kernel not to fix up
unaligned accesses (it is analogous to providing the
.B UAC_NOFIX
flag in the
.B SSI_NVPAIRS
operation of the
.BR setsysinfo ()
system call on Tru64).
.\" prctl PR_GET_UNALIGN
.TP
.B PR_GET_UNALIGN
(See
.B PR_SET_UNALIGN
for information on versions and architectures.)
Return unaligned access control bits, in the location pointed to by
.IR "(unsigned int\~*) arg2" .
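.IP
As a rough sketch, the control bits could be fetched as shown below.
The helper name
.IR get_unalign ()
is hypothetical, and the call is expected to fail on architectures
other than those listed under
.BR PR_SET_UNALIGN .

```c
#include <sys/prctl.h>

/* Hypothetical helper: stores the unaligned access control bits in
   *bits and returns 0, or returns -1 on error (as is expected on
   architectures that do not provide this operation). */
static int
get_unalign(unsigned int *bits)
{
    return prctl(PR_GET_UNALIGN, (unsigned long) bits, 0, 0, 0);
}
```

On success, the caller can test the stored value against
.B PR_UNALIGN_NOPRINT
and
.BR PR_UNALIGN_SIGBUS .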
.SH RETURN VALUE
On success,
.BR PR_CAP_AMBIENT + PR_CAP_AMBIENT_IS_SET ,
.BR PR_CAPBSET_READ ,
.BR PR_GET_DUMPABLE ,
.BR PR_GET_FP_MODE ,
.BR PR_GET_IO_FLUSHER ,
.BR PR_GET_KEEPCAPS ,
.BR PR_MCE_KILL_GET ,
.BR PR_GET_NO_NEW_PRIVS ,
.BR PR_GET_SECUREBITS ,
.BR PR_GET_SPECULATION_CTRL ,
.BR PR_SVE_GET_VL ,
.BR PR_SVE_SET_VL ,
.BR PR_GET_TAGGED_ADDR_CTRL ,
.BR PR_GET_THP_DISABLE ,
.BR PR_GET_TIMING ,
.BR PR_GET_TIMERSLACK ,
and (if it returns)
.B PR_GET_SECCOMP
return the nonnegative values described above.
All other
.I option
values return 0 on success.
On error, \-1 is returned, and
.I errno
is set to indicate the error.
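.PP
Because some operations return a meaningful nonnegative value,
callers need to compare the result with \-1 rather than with 0.
The sketch below folds the two cases into a single return value.
The helper name
.IR prctl_value ()
and the constant
.B BOGUS_PRCTL_OPTION
are hypothetical;
the latter is merely assumed not to be a recognized option.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical value, assumed not to match any recognized option. */
#define BOGUS_PRCTL_OPTION 0x7fffffff

/* Hypothetical helper for options that take no further arguments:
   returns the nonnegative result on success, or the negated errno
   value on failure. */
static int
prctl_value(int option)
{
    int ret = prctl(option, 0, 0, 0, 0);

    return (ret == -1) ? -errno : ret;
}
```

For example,
.I prctl_value(PR_GET_DUMPABLE)
yields a nonnegative value, whereas
.I prctl_value(BOGUS_PRCTL_OPTION)
would be expected to yield the negated value of
.BR EINVAL .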
.SH ERRORS
.TP
.B EACCES
.I option
is
.B PR_SET_SECCOMP
and
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
but the process neither has the
.B CAP_SYS_ADMIN
capability nor has set the
.I no_new_privs
attribute (see the discussion of
.B PR_SET_NO_NEW_PRIVS
above).
.TP
.B EACCES
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file is not executable.
.TP
.B EBADF
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file descriptor passed in
.I arg4
is not valid.
.TP
.B EBUSY
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and this is the second attempt to change the
.IR /proc/ pid /exe
symbolic link, which is prohibited.
.TP
.B EFAULT
.I arg2
is an invalid address.
.TP
.B EFAULT
.I option
is
.BR PR_SET_SECCOMP ,
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
the system was built with
.BR CONFIG_SECCOMP_FILTER ,
and
.I arg3
is an invalid address.
.TP
.B EFAULT
.I option
is
.B PR_SET_SYSCALL_USER_DISPATCH
and
.I arg5
is an invalid address.
.TP
.B EINVAL
The value of
.I option
is not recognized,
or not supported on this system.