prctl.2: Explain the circumstances in which the parent-death signal is sent
fea681da 1.\" Copyright (C) 1998 Andries Brouwer (aeb@cwi.nl)
73d3ac53 2.\" and Copyright (C) 2002, 2006, 2008, 2012, 2013 Michael Kerrisk <mtk.manpages@gmail.com>
af5f9508 3.\" and Copyright Guillem Jover <guillem@hadrons.org>
3cd5e983 4.\" and Copyright (C) 2014 Dave Hansen / Intel
fea681da 5.\"
93015253 6.\" %%%LICENSE_START(VERBATIM)
fea681da
MK
7.\" Permission is granted to make and distribute verbatim copies of this
8.\" manual provided the copyright notice and this permission notice are
9.\" preserved on all copies.
10.\"
11.\" Permission is granted to copy and distribute modified versions of this
12.\" manual under the conditions for verbatim copying, provided that the
13.\" entire resulting derived work is distributed under the terms of a
14.\" permission notice identical to this one.
c13182ef 15.\"
fea681da
MK
16.\" Since the Linux kernel and libraries are constantly changing, this
17.\" manual page may be incorrect or out-of-date. The author(s) assume no
18.\" responsibility for errors or omissions, or for damages resulting from
19.\" the use of the information contained herein. The author(s) may not
20.\" have taken the same level of care in the production of this manual,
21.\" which is licensed free of charge, as they might when working
22.\" professionally.
c13182ef 23.\"
fea681da
MK
24.\" Formatted or processed versions of this manual, if unaccompanied by
25.\" the source, must acknowledge the copyright and authors of this work.
4b72fb64 26.\" %%%LICENSE_END
fea681da
MK
27.\"
28.\" Modified Thu Nov 11 04:19:42 MET 1999, aeb: added PR_GET_PDEATHSIG
29.\" Modified 27 Jun 02, Michael Kerrisk
c13182ef 30.\" Added PR_SET_DUMPABLE, PR_GET_DUMPABLE,
fea681da 31.\" PR_SET_KEEPCAPS, PR_GET_KEEPCAPS
e87fdd92
MK
32.\" Modified 2006-08-30 Guillem Jover <guillem@hadrons.org>
33.\" Updated Linux versions where the options were introduced.
34.\" Added PR_SET_TIMING, PR_GET_TIMING, PR_SET_NAME, PR_GET_NAME,
35.\" PR_SET_UNALIGN, PR_GET_UNALIGN, PR_SET_FPEMU, PR_GET_FPEMU,
36.\" PR_SET_FPEXC, PR_GET_FPEXC
8ab8b43f
MK
37.\" 2008-04-29 Serge Hallyn, Document PR_CAPBSET_READ and PR_CAPBSET_DROP
38.\" 2008-06-13 Erik Bosman, <ejbosman@cs.vu.nl>
39.\" Document PR_GET_TSC and PR_SET_TSC.
40.\" 2008-06-15 mtk, Document PR_SET_SECCOMP, PR_GET_SECCOMP
bc02b3ea 41.\" 2009-10-03 Andi Kleen, document PR_MCE_KILL
06afe673 42.\" 2012-04 Cyrill Gorcunov, Document PR_SET_MM
bc02b3ea
MK
43.\" 2012-04-25 Michael Kerrisk, Document PR_TASK_PERF_EVENTS_DISABLE and
44.\" PR_TASK_PERF_EVENTS_ENABLE
34447828 45.\" 2012-09-20 Kees Cook, update PR_SET_SECCOMP for mode 2
f83fe154 46.\" 2012-09-20 Kees Cook, document PR_SET_NO_NEW_PRIVS, PR_GET_NO_NEW_PRIVS
934487a0
MK
47.\" 2012-10-25 Michael Kerrisk, Document PR_SET_TIMERSLACK and
48.\" PR_GET_TIMERSLACK
491b2e75 49.\" 2013-01-10 Kees Cook, document PR_SET_PTRACER
31cc8387 50.\" 2012-02-04 Michael Kerrisk, document PR_{SET,GET}_CHILD_SUBREAPER
03979794 51.\" 2014-11-10 Dave Hansen, document PR_MPX_{EN,DIS}ABLE_MANAGEMENT
fea681da 52.\"
e14baeeb 53.\"
8538a62b 54.TH PRCTL 2 2018-02-02 "Linux" "Linux Programmer's Manual"
fea681da
MK
55.SH NAME
56prctl \- operations on a process
57.SH SYNOPSIS
521bf584 58.nf
fea681da 59.B #include <sys/prctl.h>
68e4db0a 60.PP
521bf584
MK
61.BI "int prctl(int " option ", unsigned long " arg2 ", unsigned long " arg3 ,
62.BI " unsigned long " arg4 ", unsigned long " arg5 );
63.fi
fea681da 64.SH DESCRIPTION
e511ffb6 65.BR prctl ()
fea681da 66is called with a first argument describing what to do
1a329b56 67(with values defined in \fI<linux/prctl.h>\fP), and further
c4bb193f 68arguments with a significance depending on the first one.
fea681da 69The first argument can be:
03547431
MK
70.\"
71.TP
72.BR PR_CAP_AMBIENT " (since Linux 4.3)"
73.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08
1a52f4f6
MK
74Reads or changes the ambient capability set of the calling thread,
75according to the value of
03547431
MK
76.IR arg2 ,
77which must be one of the following:
78.RS
79.\"
80.TP
81.B PR_CAP_AMBIENT_RAISE
82The capability specified in
83.I arg3
84is added to the ambient set.
85The specified capability must already be present in
86both the permitted and the inheritable sets of the process.
87This operation is not permitted if the
88.B SECBIT_NO_CAP_AMBIENT_RAISE
89securebit is set.
90.TP
91.B PR_CAP_AMBIENT_LOWER
92The capability specified in
93.I arg3
94is removed from the ambient set.
95.TP
96.B PR_CAP_AMBIENT_IS_SET
97The
bf7bc8b8 98.BR prctl ()
03547431
MK
99call returns 1 if the capability in
100.I arg3
101is in the ambient set and 0 if it is not.
102.TP
103.BR PR_CAP_AMBIENT_CLEAR_ALL
104All capabilities will be removed from the ambient set.
105This operation requires setting
106.I arg3
107to zero.
108.RE
269e3b97
MK
109.IP
110In all of the above operations,
111.I arg4
112and
113.I arg5
114must be specified as 0.
cf086650
MK
115.IP
116Higher-level interfaces layered on top of the above operations are
117provided in the
118.BR libcap (3)
119library in the form of
120.BR cap_get_ambient (3),
121.BR cap_set_ambient (3),
122and
123.BR cap_reset_ambient (3).
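As a rough illustration of the ambient-set operations described above, the following C sketch wraps PR_CAP_AMBIENT_IS_SET and PR_CAP_AMBIENT_RAISE. The helper names ambient_has() and ambient_raise() are hypothetical and are not part of libcap or any other library; arg4 and arg5 are passed as 0, as required above.

```c
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helper: returns 1 if `cap` is in the calling thread's
 * ambient set, 0 if it is not, or -1 (with errno set) if the call fails,
 * for example because `cap` is not a valid capability number. */
int ambient_has(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, (unsigned long)PR_CAP_AMBIENT_IS_SET,
                 cap, 0UL, 0UL);
}

/* Hypothetical helper: tries to add `cap` to the ambient set.
 * Returns 0 on success or -1 (with errno set) on failure; as described
 * above, the capability must already be both permitted and inheritable. */
int ambient_raise(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, (unsigned long)PR_CAP_AMBIENT_RAISE,
                 cap, 0UL, 0UL);
}
```

A caller might check ambient_has(CAP_NET_BIND_SERVICE) before deciding whether ambient_raise() is needed ahead of an execve(2).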
fea681da 124.TP
2e781e20 125.BR PR_CAPBSET_READ " (since Linux 2.6.25)"
8ab8b43f
MK
126Return (as the function result) 1 if the capability specified in
127.I arg2
128is in the calling thread's capability bounding set,
129or 0 if it is not.
130(The capability constants are defined in
131.IR <linux/capability.h> .)
132The capability bounding set dictates
133whether the process can receive the capability through a
2914a14d 134file's permitted capability set on a subsequent call to
8ab8b43f 135.BR execve (2).
efeece04 136.IP
8ab8b43f
MK
137If the capability specified in
138.I arg2
139is not valid, then the call fails with the error
140.BR EINVAL .
d9a0d1d7
MK
141.IP
142A higher-level interface layered on top of this operation is provided in the
143.BR libcap (3)
144library in the form of
145.BR cap_get_bound (3).
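The bounding-set query above can be sketched in C as follows. This is only a minimal example: the helper names bounding_set_has() and count_bounding_caps() are hypothetical, and the loop assumes (as stated above) that the call fails once the capability number is no longer valid.

```c
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helper: 1 if `cap` is in the bounding set, 0 if not,
 * -1 (with errno set) if `cap` is not a valid capability. */
int bounding_set_has(unsigned long cap)
{
    return prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: count the capabilities present in the bounding
 * set by probing capability numbers upward from 0 until the call fails. */
int count_bounding_caps(void)
{
    int count = 0;

    for (unsigned long cap = 0; ; cap++) {
        int r = bounding_set_has(cap);
        if (r < 0)
            break;          /* past the last capability the kernel knows */
        count += r;
    }
    return count;
}
```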
8ab8b43f
MK
146.TP
147.BR PR_CAPBSET_DROP " (since Linux 2.6.25)"
148If the calling thread has the
149.B CAP_SETPCAP
af53fcb5 150capability within its user namespace, then drop the capability specified by
8ab8b43f
MK
151.I arg2
152from the calling thread's capability bounding set.
153Any children of the calling thread will inherit the newly
154reduced bounding set.
efeece04 155.IP
8ab8b43f
MK
156The call fails with the error:
157.B EPERM
2914a14d 158if the calling thread does not have the
8ab8b43f
MK
159.BR CAP_SETPCAP ;
160.BR EINVAL
161if
162.I arg2
163does not represent a valid capability; or
164.BR EINVAL
165if file capabilities are not enabled in the kernel,
166in which case bounding sets are not supported.
d9a0d1d7
MK
167.IP
168A higher-level interface layered on top of this operation is provided in the
169.BR libcap (3)
170library in the form of
171.BR cap_drop_bound (3).
73d3ac53
MK
172.TP
173.BR PR_SET_CHILD_SUBREAPER " (since Linux 3.4)"
174.\" commit ebec18a6d3aa1e7d84aab16225e87fd25170ec2b
175If
176.I arg2
177is nonzero,
178set the "child subreaper" attribute of the calling process;
179if
180.I arg2
181is zero, unset the attribute.
efeece04 182.IP
fbc63931 183A subreaper fulfills the role of
73d3ac53
MK
184.BR init (1)
185for its descendant processes.
fbc63931
MK
186When a process becomes orphaned
187(i.e., its immediate parent terminates)
188then that process will be reparented to
189the nearest still living ancestor subreaper.
190Subsequently, calls to
191.BR getppid ()
192in the orphaned process will now return the PID of the subreaper process,
193and when the orphan terminates, it is the subreaper process that
73d3ac53
MK
194will receive a
195.BR SIGCHLD
1a8e1c2f 196signal and will be able to
73d3ac53
MK
197.BR wait (2)
198on the process to discover its termination status.
efeece04 199.IP
300a9c78
MK
200The setting of the "child subreaper" attribute
201is not inherited by children created by
d59a7572
MK
202.BR fork (2)
203and
204.BR clone (2).
205The setting is preserved across
206.BR execve (2).
efeece04 207.IP
94e460d4
MK
208Establishing a subreaper process is useful in session management frameworks
209where a hierarchical group of processes is managed by a subreaper process
210that needs to be informed when one of the processes\(emfor example,
211a double-forked daemon\(emterminates
212(perhaps so that it can restart that process).
213Some
214.BR init (1)
215frameworks (e.g.,
216.BR systemd (1))
217employ a subreaper process for similar reasons.
73d3ac53
MK
218.TP
219.BR PR_GET_CHILD_SUBREAPER " (since Linux 3.4)"
220Return the "child subreaper" setting of the caller,
221in the location pointed to by
222.IR "(int\ *) arg2" .
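A minimal C sketch of setting and reading back the "child subreaper" attribute is shown here. The helper names are hypothetical; note that, unlike several other "get" operations, PR_GET_CHILD_SUBREAPER returns its result through the pointer in arg2 rather than as the function result.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (enable != 0) or unset (enable == 0) the
 * "child subreaper" attribute. Returns 0 on success, -1 on error. */
int set_child_subreaper(int enable)
{
    return prctl(PR_SET_CHILD_SUBREAPER, (unsigned long)(enable != 0),
                 0UL, 0UL, 0UL);
}

/* Hypothetical helper: return the current setting (0 or 1),
 * or -1 if the prctl() call fails. */
int get_child_subreaper(void)
{
    int value = 0;

    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long)&value,
              0UL, 0UL, 0UL) == -1)
        return -1;
    return value;
}
```

A session manager would typically call set_child_subreaper(1) before forking the processes it supervises, then collect orphaned descendants with wait(2).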
8ab8b43f 223.TP
88989295 224.BR PR_SET_DUMPABLE " (since Linux 2.3.20)"
2d7fc98d
MK
225Set the state of the "dumpable" flag,
226which determines whether core dumps are produced for the calling process
227upon delivery of a signal whose default behavior is to produce a core dump.
efeece04 228.IP
88989295 229In kernels up to and including 2.6.12,
8ab8b43f 230.I arg2
8aad30d7
MK
231must be either 0
232.RB ( SUID_DUMP_DISABLE ,
233process is not dumpable) or 1
234.RB ( SUID_DUMP_USER ,
235process is dumpable).
0de51ed1
MK
236Between kernels 2.6.13 and 2.6.17,
237.\" commit abf75a5033d4da7b8a7e92321d74021d1fcfb502
238the value 2 was also permitted,
88989295
MK
239which caused any binary which normally would not be dumped
240to be dumped readable by root only;
241for security reasons, this feature has been removed.
242.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=115270289030630&w=2
243.\" Subject: Fix prctl privilege escalation (CVE-2006-2451)
244.\" From: Marcel Holtmann <marcel () holtmann ! org>
245.\" Date: 2006-07-12 11:12:00
246(See also the description of
2d7fc98d 247.I /proc/sys/fs/\:suid_dumpable
88989295
MK
248in
249.BR proc (5).)
efeece04 250.IP
2d7fc98d
MK
251Normally, this flag is set to 1.
252However, it is reset to the current value contained in the file
253.IR /proc/sys/fs/\:suid_dumpable
254(which by default has the value 0),
a644bc48 255in the following circumstances:
2d7fc98d
MK
256.\" See kernel/cred.c::commit_creds() (Linux 3.18 sources)
257.RS
41f90bb7 258.IP * 3
a644bc48 259The process's effective user or group ID is changed.
2d7fc98d 260.IP *
a644bc48 261The process's filesystem user or group ID is changed (see
2d7fc98d
MK
262.BR credentials (7)).
263.IP *
a644bc48 264The process executes
2d7fc98d 265.RB ( execve (2))
41f90bb7
MK
266a set-user-ID or set-group-ID program, resulting in a change
267of either the effective user ID or the effective group ID.
27ce08bf
KF
268.IP *
269The process executes
270.RB ( execve (2))
271a program that has file capabilities (see
272.BR capabilities (7)),
41f90bb7 273.\" See kernel/cred.c::commit_creds()
27ce08bf 274but only if the permitted capabilities
41f90bb7 275gained exceed those already permitted for the process.
5d28ea3e 276.\" Also certain namespace operations;
2d7fc98d
MK
277.RE
278.IP
cadcf1b1 279Processes that are not dumpable cannot be attached via
6fdbc779 280.BR ptrace (2)
982d8cf7
MK
281.BR PTRACE_ATTACH ;
282see
283.BR ptrace (2)
284for further details.
efeece04 285.IP
161946a2
MK
286If a process is not dumpable,
287the ownership of files in the process's
288.IR /proc/[pid]
289directory is affected as described in
290.BR proc (5).
64536a1b 291.TP
88989295
MK
292.BR PR_GET_DUMPABLE " (since Linux 2.3.20)"
293Return (as the function result) the current state of the calling
294process's dumpable flag.
295.\" Since Linux 2.6.13, the dumpable flag can have the value 2,
296.\" but in 2.6.13 PR_GET_DUMPABLE simply returns 1 if the dumpable
c7094399 297.\" flag has a nonzero value. This was fixed in 2.6.14.
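The "dumpable" flag can be toggled and inspected as in the following C sketch. The helper names are hypothetical; only the values 0 and 1 are used, since the value 2 is no longer accepted as described above.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the dumpable flag to 0 (not dumpable)
 * or 1 (dumpable). Returns 0 on success, -1 on error. */
int set_dumpable(int dumpable)
{
    return prctl(PR_SET_DUMPABLE, (unsigned long)(dumpable != 0),
                 0UL, 0UL, 0UL);
}

/* Hypothetical helper: return the current state of the dumpable flag
 * (delivered as the function result of prctl()). */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

A program handling secrets might call set_dumpable(0) early in main() so that no core dump is produced and ptrace(2) PTRACE_ATTACH is refused.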
64536a1b 298.TP
8ab8b43f 299.BR PR_SET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
c13182ef 300Set the endian-ness of the calling process to the value given
64536a1b 301in \fIarg2\fP, which should be one of the following:
8ab8b43f 302.\" Respectively 0, 1, 2
64536a1b
MK
303.BR PR_ENDIAN_BIG ,
304.BR PR_ENDIAN_LITTLE ,
305or
0daa9e92 306.B PR_ENDIAN_PPC_LITTLE
64536a1b 307(PowerPC pseudo little endian).
e87fdd92 308.TP
8ab8b43f
MK
309.BR PR_GET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
310Return the endian-ness of the calling process,
311in the location pointed to by
312.IR "(int\ *) arg2" .
64a53a67
ES
313.TP
314.BR PR_SET_FP_MODE " (since Linux 4.0, only on MIPS)"
89507305
MK
315.\" commit 9791554b45a2acc28247f66a5fd5bbc212a6b8c8
316On the MIPS architecture,
317user-space code can be built using an ABI which permits linking
318with code that has more restrictive floating-point (FP) requirements.
319For example, user-space code may be built to target the O32 FPXX ABI
b3073df8 320and linked with code built for either one of the more restrictive
89507305 321FP32 or FP64 ABIs.
b3073df8 322When more restrictive code is linked in,
89507305
MK
323the overall requirement for the process is to use the more
324restrictive floating-point mode.
efeece04 325.IP
07d6076e 326Because the kernel has no means of knowing in advance
89507305 327which mode the process should be executed in,
07d6076e
MK
328and because these restrictions can
329change over the lifetime of the process, the
330.B PR_SET_FP_MODE
331operation is provided to allow control of the floating-point mode
332from user space.
efeece04 333.IP
64a53a67
ES
334.\" https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
335The
336.I (unsigned int) arg2
89507305 337argument is a bit mask describing the floating-point mode used:
64a53a67
ES
338.RS
339.TP
fb90e0c7 340.BR PR_FP_MODE_FR
64a53a67
ES
341When this bit is
342.I unset
343(so called
344.BR FR=0 " or " FR0
41a926bf
MK
345mode), the 32 floating-point registers are 32 bits wide,
346and 64-bit registers are represented as a pair of registers
b3073df8 347(even- and odd- numbered,
89507305
MK
348with the even-numbered register containing the lower 32 bits,
349and the odd-numbered register containing the higher 32 bits).
efeece04 350.IP
64a53a67
ES
351When this bit is
352.I set
07d6076e 353(on supported hardware),
41a926bf 354the 32 floating-point registers are 64 bits wide (so called
64a53a67 355.BR FR=1 " or " FR1
89507305 356mode).
b3073df8 357Note that modern MIPS implementations (MIPS R6 and newer) support
64a53a67
ES
358.B FR=1
359mode only.
efeece04
MK
360.IP
89507305 362Applications that use the O32 FP32 ABI can operate only when this bit is
64a53a67
ES
363.I unset
364.RB ( FR=0 ;
365or they can be used with FRE enabled, see below).
89507305
MK
366Applications that use the O32 FP64 ABI
367(and the O32 FP64A ABI, which exists to
368provide the ability to operate with existing FP32 code; see below)
369can operate only when this bit is
64a53a67
ES
370.I set
371.RB ( FR=1 ).
ffb0dafc 372Applications that use the O32 FPXX ABI can operate with either
07d6076e
MK
373.BR FR=0
374or
375.BR FR=1 .
64a53a67 376.TP
fb90e0c7 377.BR PR_FP_MODE_FRE
07d6076e 378Enable emulation of 32-bit floating-point mode.
b3073df8 379When this mode is enabled,
07d6076e
MK
380it emulates 32-bit floating-point operations
381by raising a reserved-instruction exception
b3073df8 382on every instruction that uses 32-bit formats and
89507305
MK
383the kernel then handles the instruction in software.
384(The problem lies in the discrepancy of handling odd-numbered registers
385which are the high 32 bits of 64-bit registers with even numbers in
64a53a67 386.B FR=0
89507305 387mode and the lower 32-bit parts of odd-numbered 64-bit registers in
64a53a67 388.B FR=1
89507305
MK
389mode.)
390Enabling this bit is necessary when code with the O32 FP32 ABI should operate
391with code with the compatible O32 FPXX or O32 FP64A ABIs (which require
64a53a67 392.B FR=1
b3073df8
MK
393FPU mode) or when it is executed on newer hardware (MIPS R6 onwards)
394which lacks
64a53a67 395.B FR=0
89507305 396mode support when a binary with the FP32 ABI is used.
64a53a67 397.IP
89507305
MK
398Note that this mode makes sense only when the FPU is in 64-bit mode
399.RB ( FR=1 ).
64a53a67 400.IP
89507305 401Note that the use of emulation inherently has a significant performance hit
b3073df8 402and should be avoided if possible.
64a53a67
ES
403.RE
404.IP
07d6076e
MK
405In the N32/N64 ABI, 64-bit floating-point mode is always used,
406so FPU emulation is not required and the FPU always operates in
64a53a67
ES
407.B FR=1
408mode.
409.IP
07d6076e
MK
410This option is mainly intended for use by the dynamic linker
411.RB ( ld.so (8)).
64a53a67 412.IP
89507305
MK
413The arguments
414.IR arg3 ,
415.IR arg4 ,
416and
417.IR arg5
64a53a67
ES
418are ignored.
419.TP
420.BR PR_GET_FP_MODE " (since Linux 4.0, only on MIPS)"
89507305 421Get the current floating-point mode (see the description of
64a53a67
ES
422.B PR_SET_FP_MODE
423for details).
efeece04 424.IP
89507305 425On success,
07d6076e 426the call returns a bit mask which represents the current floating-point mode.
efeece04 427.IP
89507305
MK
428The arguments
429.IR arg2 ,
430.IR arg3 ,
431.IR arg4 ,
432and
433.IR arg5
64a53a67 434are ignored.
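The bit mask used by PR_SET_FP_MODE and returned by PR_GET_FP_MODE can be interpreted as in the C sketch below. The two helper functions are hypothetical and only decode a mask value; they make no prctl() call, so they compile on any architecture whose headers define the PR_FP_MODE_* constants.

```c
#include <sys/prctl.h>

/* Hypothetical helper: width in bits of each floating-point register
 * implied by the mask (64 when PR_FP_MODE_FR is set, otherwise 32). */
int fp_register_width(unsigned int mode)
{
    return (mode & PR_FP_MODE_FR) ? 64 : 32;
}

/* Hypothetical helper: nonzero if the FRE setting in the mask is
 * meaningful. As noted above, 32-bit emulation (PR_FP_MODE_FRE) makes
 * sense only when the FPU is in 64-bit mode (FR=1). */
int fre_setting_is_meaningful(unsigned int mode)
{
    if (mode & PR_FP_MODE_FRE)
        return (mode & PR_FP_MODE_FR) != 0;
    return 1;
}
```

On MIPS, the value to decode would come from prctl(PR_GET_FP_MODE, 0, 0, 0, 0).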
8ab8b43f 435.TP
8ab8b43f 436.BR PR_SET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
e87fdd92 437Set floating-point emulation control bits to \fIarg2\fP.
7626d2ce
MK
438Pass
439.B PR_FPEMU_NOPRINT
440to silently emulate floating-point operation accesses, or
441.B PR_FPEMU_SIGFPE
442to not emulate floating-point operations and send
8bd58774
MK
443.B SIGFPE
444instead.
e87fdd92 445.TP
8ab8b43f
MK
446.BR PR_GET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
447Return floating-point emulation control bits,
448in the location pointed to by
449.IR "(int\ *) arg2" .
e87fdd92 450.TP
8ab8b43f 451.BR PR_SET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
1c44bd5b
MK
452Set floating-point exception mode to \fIarg2\fP.
453Pass \fBPR_FP_EXC_SW_ENABLE\fP to use FPEXC for FP exception enables,
c45bd688
MK
454\fBPR_FP_EXC_DIV\fP for floating-point divide by zero,
455\fBPR_FP_EXC_OVF\fP for floating-point overflow,
456\fBPR_FP_EXC_UND\fP for floating-point underflow,
457\fBPR_FP_EXC_RES\fP for floating-point inexact result,
458\fBPR_FP_EXC_INV\fP for floating-point invalid operation,
e87fdd92 459\fBPR_FP_EXC_DISABLED\fP for FP exceptions disabled,
b28f6e56 460\fBPR_FP_EXC_NONRECOV\fP for async nonrecoverable exception mode,
e87fdd92
MK
461\fBPR_FP_EXC_ASYNC\fP for async recoverable exception mode,
462\fBPR_FP_EXC_PRECISE\fP for precise exception mode.
463.TP
8ab8b43f
MK
464.BR PR_GET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
465Return floating-point exception mode,
466in the location pointed to by
467.IR "(int\ *) arg2" .
468.TP
88989295 469.BR PR_SET_KEEPCAPS " (since Linux 2.2.18)"
03361448
MK
470Set the state of the calling thread's "keep capabilities" flag.
471The effect of this flag is described in
472.BR capabilities (7).
88989295 473.I arg2
03361448
MK
474must be either 0 (clear the flag)
475or 1 (set the flag).
028cb080 476The "keep capabilities" value will be reset to 0 on subsequent calls to
88989295
MK
477.BR execve (2).
478.TP
479.BR PR_GET_KEEPCAPS " (since Linux 2.2.18)"
88ee5c1c 480Return (as the function result) the current state of the calling thread's
88989295 481"keep capabilities" flag.
03361448
MK
482See
483.BR capabilities (7)
484for a description of this flag.
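A short C sketch of the "keep capabilities" flag follows. The helper names are hypothetical; the sketch assumes the flag has not been locked through the securebits mechanism, in which case the set operation would fail.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (keep != 0) or clear (keep == 0) the calling
 * thread's "keep capabilities" flag. Returns 0 on success, -1 on error. */
int set_keepcaps(int keep)
{
    return prctl(PR_SET_KEEPCAPS, (unsigned long)(keep != 0),
                 0UL, 0UL, 0UL);
}

/* Hypothetical helper: return the current flag value (0 or 1) as the
 * function result of prctl(), or -1 on error. */
int get_keepcaps(void)
{
    return prctl(PR_GET_KEEPCAPS, 0UL, 0UL, 0UL, 0UL);
}
```

A privileged daemon would typically call set_keepcaps(1) just before changing its user IDs, then trim its capability sets afterwards.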
88989295 485.TP
03547431 486.BR PR_MCE_KILL " (since Linux 2.6.32)"
eb359b3e 487Set the machine check memory corruption kill policy for the calling thread.
03547431
MK
488If
489.I arg2
490is
491.BR PR_MCE_KILL_CLEAR ,
492clear the thread memory corruption kill policy and use the system-wide default.
493(The system-wide default is defined by
494.IR /proc/sys/vm/memory_failure_early_kill ;
495see
496.BR proc (5).)
497If
498.I arg2
499is
500.BR PR_MCE_KILL_SET ,
501use a thread-specific memory corruption kill policy.
502In this case,
503.I arg3
504defines whether the policy is
505.I early kill
506.RB ( PR_MCE_KILL_EARLY ),
507.I late kill
508.RB ( PR_MCE_KILL_LATE ),
509or the system-wide default
510.RB ( PR_MCE_KILL_DEFAULT ).
511Early kill means that the thread receives a
512.B SIGBUS
513signal as soon as hardware memory corruption is detected inside
514its address space.
515In late kill mode, the process is killed only when it accesses a corrupted page.
516See
517.BR sigaction (2)
518for more information on the
519.BR SIGBUS
520signal.
521The policy is inherited by children.
522The remaining unused
523.BR prctl ()
524arguments must be zero for future compatibility.
88989295 525.TP
03547431
MK
526.BR PR_MCE_KILL_GET " (since Linux 2.6.32)"
527Return the current per-process machine check kill policy.
528All unused
529.BR prctl ()
530arguments must be zero.
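The machine-check kill policy operations above might be used as in this C sketch. The helper names are hypothetical; all unused arguments are passed as zero, as the text requires.

```c
#include <sys/prctl.h>

/* Hypothetical helper: request the "early kill" policy for the caller.
 * Returns 0 on success, -1 on error. */
int set_mce_early_kill(void)
{
    return prctl(PR_MCE_KILL, (unsigned long)PR_MCE_KILL_SET,
                 (unsigned long)PR_MCE_KILL_EARLY, 0UL, 0UL);
}

/* Hypothetical helper: drop any per-thread policy and fall back to the
 * system-wide default. Returns 0 on success, -1 on error. */
int clear_mce_policy(void)
{
    return prctl(PR_MCE_KILL, (unsigned long)PR_MCE_KILL_CLEAR,
                 0UL, 0UL, 0UL);
}

/* Hypothetical helper: return the current policy (PR_MCE_KILL_EARLY,
 * PR_MCE_KILL_LATE, or PR_MCE_KILL_DEFAULT), or -1 on error. */
int get_mce_policy(void)
{
    return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}
```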
88989295 531.TP
03547431
MK
532.BR PR_SET_MM " (since Linux 3.3)"
533.\" commit 028ee4be34a09a6d48bdf30ab991ae933a7bc036
534Modify certain kernel memory map descriptor fields
535of the calling process.
536Usually these fields are set by the kernel and dynamic loader (see
537.BR ld.so (8)
538for more information) and a regular application should not use this feature.
539However, there are cases, such as self-modifying programs,
540where a program might find it useful to change its own memory map.
efeece04 541.IP
03547431
MK
542The calling process must have the
543.BR CAP_SYS_RESOURCE
544capability.
545The value in
546.I arg2
547is one of the options below, while
548.I arg3
549provides a new value for the option.
a87d0921
MF
550The
551.I arg4
552and
553.I arg5
554arguments must be zero if unused.
efeece04 555.IP
261c7e1d 556Before Linux 3.10,
d2eeb68f 557.\" commit 52b3694157e3aa6df871e283115652ec6f2d31e0
261c7e1d
MF
558this feature is available only if the kernel is built with the
559.BR CONFIG_CHECKPOINT_RESTORE
560option enabled.
03547431
MK
561.RS
562.TP
563.BR PR_SET_MM_START_CODE
564Set the address above which the program text can run.
565The corresponding memory area must be readable and executable,
997d21e1 566but not writable or shareable (see
03547431 567.BR mprotect (2)
0fcc276f 568and
03547431
MK
569.BR mmap (2)
570for more information).
f83fe154 571.TP
03547431
MK
572.BR PR_SET_MM_END_CODE
573Set the address below which the program text can run.
574The corresponding memory area must be readable and executable,
997d21e1 575but not writable or shareable.
f83fe154 576.TP
03547431
MK
577.BR PR_SET_MM_START_DATA
578Set the address above which initialized and
579uninitialized (bss) data are placed.
580The corresponding memory area must be readable and writable,
997d21e1 581but not executable or shareable.
88989295 582.TP
03547431
MK
583.B PR_SET_MM_END_DATA
584Set the address below which initialized and
585uninitialized (bss) data are placed.
586The corresponding memory area must be readable and writable,
997d21e1 587but not executable or shareable.
88989295 588.TP
03547431
MK
589.BR PR_SET_MM_START_STACK
590Set the start address of the stack.
591The corresponding memory area must be readable and writable.
491b2e75 592.TP
03547431
MK
593.BR PR_SET_MM_START_BRK
594Set the address above which the program heap can be expanded with a
595.BR brk (2)
596call.
597The address must be greater than the ending address of
598the current program data segment.
599In addition, the combined size of the resulting heap and
600the size of the data segment can't exceed the
601.BR RLIMIT_DATA
602resource limit (see
603.BR setrlimit (2)).
604.TP
605.BR PR_SET_MM_BRK
606Set the current
607.BR brk (2)
608value.
609The requirements for the address are the same as for the
610.BR PR_SET_MM_START_BRK
611option.
11ac5b51 612.PP
03547431
MK
613The following options are available since Linux 3.5.
614.\" commit fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7
615.TP
616.BR PR_SET_MM_ARG_START
617Set the address above which the program command line is placed.
618.TP
619.BR PR_SET_MM_ARG_END
620Set the address below which the program command line is placed.
621.TP
622.BR PR_SET_MM_ENV_START
623Set the address above which the program environment is placed.
624.TP
625.BR PR_SET_MM_ENV_END
626Set the address below which the program environment is placed.
627.IP
628The address passed with
629.BR PR_SET_MM_ARG_START ,
630.BR PR_SET_MM_ARG_END ,
631.BR PR_SET_MM_ENV_START ,
632and
633.BR PR_SET_MM_ENV_END
634should belong to a process stack area.
635Thus, the corresponding memory area must be readable, writable, and
636(depending on the kernel configuration) have the
637.BR MAP_GROWSDOWN
638attribute set (see
639.BR mmap (2)).
640.TP
641.BR PR_SET_MM_AUXV
642Set a new auxiliary vector.
643The
644.I arg3
645argument should provide the address of the vector.
646The
647.I arg4
648argument is the size of the vector.
649.TP
650.BR PR_SET_MM_EXE_FILE
651.\" commit b32dfe377102ce668775f8b6b1461f7ad428f8b6
652Supersede the
653.IR /proc/pid/exe
654symbolic link with a new one pointing to a new executable file
655identified by the file descriptor provided in the
656.I arg3
657argument.
658The file descriptor should be obtained with a regular
659.BR open (2)
660call.
661.IP
662To change the symbolic link, one needs to unmap all existing
663executable memory areas, including those created by the kernel itself
664(for example the kernel usually creates at least one executable
665memory area for the ELF
666.IR \.text
667section).
668.IP
642df17c 669In Linux 4.9 and earlier, the
47bc9cec 670.\" commit 3fb4afd9a504c2386b8435028d43283216bf588e
47bc9cec 671.BR PR_SET_MM_EXE_FILE
642df17c
MK
672operation can be performed only once in a process's lifetime;
673attempting to perform the operation a second time results in the error
674.BR EPERM .
675This restriction was enforced for security reasons that were subsequently
676deemed specious,
677and the restriction was removed in Linux 4.10 because some
678user-space applications needed to perform this operation more than once.
11ac5b51 679.PP
7e3236a5
MF
680The following options are available since Linux 3.18.
681.\" commit f606b77f1a9e362451aca8f81d8f36a3a112139e
682.TP
683.BR PR_SET_MM_MAP
684Provides one-shot access to all the addresses by passing in a
685.I struct prctl_mm_map
686(as defined in \fI<linux/prctl.h>\fP).
687The
688.I arg4
689argument should provide the size of the struct.
efeece04 690.IP
7e3236a5
MF
691This feature is available only if the kernel is built with the
692.BR CONFIG_CHECKPOINT_RESTORE
693option enabled.
694.TP
695.BR PR_SET_MM_MAP_SIZE
696Returns the size of the
697.I struct prctl_mm_map
698the kernel expects.
699This allows user space to find a compatible struct.
700The
701.I arg4
702argument should be a pointer to an unsigned int.
efeece04 703.IP
7e3236a5
MF
704This feature is available only if the kernel is built with the
705.BR CONFIG_CHECKPOINT_RESTORE
706option enabled.
03547431
MK
707.RE
708.TP
709.BR PR_MPX_ENABLE_MANAGEMENT ", " PR_MPX_DISABLE_MANAGEMENT " (since Linux 3.19) "
710.\" commit fe3d197f84319d3bce379a9c0dc17b1f48ad358c
711.\" See also http://lwn.net/Articles/582712/
712.\" See also https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler
713Enable or disable kernel management of Memory Protection eXtensions (MPX)
714bounds tables.
715The
716.IR arg2 ,
717.IR arg3 ,
718.IR arg4 ,
719and
720.IR arg5
721.\" commit e9d1b4f3c60997fe197bf0243cb4a41a44387a88
722arguments must be zero.
efeece04 723.IP
03547431
MK
724MPX is a hardware-assisted mechanism for performing bounds checking on
725pointers.
726It consists of a set of registers storing bounds information
727and a set of special instruction prefixes that tell the CPU on which
728instructions it should do bounds enforcement.
729There is a limited number of these registers and
730when there are more pointers than registers,
731their contents must be "spilled" into a set of tables.
732These tables are called "bounds tables" and the MPX
733.BR prctl ()
734operations control
735whether the kernel manages their allocation and freeing.
efeece04 736.IP
03547431
MK
737When management is enabled, the kernel will take over allocation
738and freeing of the bounds tables.
739It does this by trapping the #BR exceptions that result
740at first use of missing bounds tables and
741instead of delivering the exception to user space,
742it allocates the table and populates the bounds directory
743with the location of the new table.
744For freeing, the kernel checks to see if bounds tables are
745present for memory which is not allocated, and frees them if so.
efeece04 746.IP
03547431
MK
747Before enabling MPX management using
748.BR PR_MPX_ENABLE_MANAGEMENT ,
749the application must first have allocated a user-space buffer for
750the bounds directory and placed the location of that directory in the
751.I bndcfgu
752register.
efeece04 753.IP
a23d8efa 754These calls fail if the CPU or kernel does not support MPX.
03547431
MK
755Kernel support for MPX is enabled via the
756.BR CONFIG_X86_INTEL_MPX
757configuration option.
758You can check whether the CPU supports MPX by looking for the 'mpx'
759CPUID bit, for example with the following command:
efeece04 760.IP
e256205a
MK
761.in +4n
762.EX
763grep ' mpx ' /proc/cpuinfo
764.EE
765.in
efeece04 766.IP
03547431
MK
767A thread may not switch in or out of long (64-bit) mode while MPX is
768enabled.
efeece04 769.IP
03547431 770All threads in a process are affected by these calls.
efeece04 771.IP
03547431
MK
772The child of a
773.BR fork (2)
774inherits the state of MPX management.
775During
776.BR execve (2),
777MPX management is reset to a state as if
778.BR PR_MPX_DISABLE_MANAGEMENT
779had been called.
efeece04 780.IP
03547431
MK
781For further information on Intel MPX, see the kernel source file
782.IR Documentation/x86/intel_mpx.txt .
783.TP
784.BR PR_SET_NAME " (since Linux 2.6.9)"
785Set the name of the calling thread,
786using the value in the location pointed to by
787.IR "(char\ *) arg2" .
788The name can be up to 16 bytes long,
789.\" TASK_COMM_LEN in include/linux/sched.h
790including the terminating null byte.
791(If the length of the string, including the terminating null byte,
792exceeds 16 bytes, the string is silently truncated.)
793This is the same attribute that can be set via
794.BR pthread_setname_np (3)
795and retrieved using
796.BR pthread_getname_np (3).
797The attribute is likewise accessible via
798.IR /proc/self/task/[tid]/comm ,
799where
800.I tid
801is the thread ID of the calling thread.
802.TP
803.BR PR_GET_NAME " (since Linux 2.6.11)"
804Return the name of the calling thread,
805in the buffer pointed to by
806.IR "(char\ *) arg2" .
807The buffer should allow space for up to 16 bytes;
808the returned string will be null-terminated.
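The naming operations above, including the silent truncation to 16 bytes, can be sketched in C as follows. The helper names and the THREAD_NAME_LEN macro are hypothetical; the macro simply restates the 16-byte limit given above.

```c
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical constant: the 16-byte limit (including the terminating
 * null byte) described above. */
#define THREAD_NAME_LEN 16

/* Hypothetical helper: set the calling thread's name. Longer strings
 * are silently truncated by the kernel. Returns 0 or -1. */
int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long)name, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: copy the calling thread's name into `buf`,
 * which must have room for THREAD_NAME_LEN bytes. Returns 0 or -1. */
int get_thread_name(char buf[THREAD_NAME_LEN])
{
    memset(buf, 0, THREAD_NAME_LEN);
    return prctl(PR_GET_NAME, (unsigned long)buf, 0UL, 0UL, 0UL);
}
```

After set_thread_name("a-name-longer-than-16-bytes"), get_thread_name() yields only the first 15 characters followed by a null byte.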
809.TP
810.BR PR_SET_NO_NEW_PRIVS " (since Linux 3.5)"
40dfb5ba 811Set the calling thread's
03547431 812.I no_new_privs
fdda9363 813attribute to the value in
03547431
MK
814.IR arg2 .
815With
816.I no_new_privs
817set to 1,
818.BR execve (2)
819promises not to grant privileges to do anything
820that could not have been done without the
821.BR execve (2)
822call (for example,
823rendering the set-user-ID and set-group-ID mode bits,
824and file capabilities non-functional).
fdda9363
MK
825Once set, the
826.I no_new_privs
827attribute cannot be unset.
828The setting of this attribute is inherited by children created by
03547431
MK
829.BR fork (2)
830and
831.BR clone (2),
832and preserved across
833.BR execve (2).
efeece04 834.IP
c70fea6e
MK
835Since Linux 4.10,
836the value of a thread's
837.I no_new_privs
fdda9363 838attribute can be viewed via the
c70fea6e
MK
839.I NoNewPrivs
840field in the
841.IR /proc/[pid]/status
842file.
efeece04 843.IP
03547431 844For more information, see the kernel source file
a84a5830
ES
845.IR Documentation/userspace\-api/no_new_privs.rst
846.\" commit 40fde647ccb0ae8c11d256d271e24d385eed595b
847(or
848.IR Documentation/prctl/no_new_privs.txt
849before Linux 4.13).
4d850396
MK
850See also
851.BR seccomp (2).
03547431
MK
.TP
.BR PR_GET_NO_NEW_PRIVS " (since Linux 3.5)"
Return (as the function result) the value of the
.I no_new_privs
attribute for the calling thread.
A value of 0 indicates the regular
.BR execve (2)
behavior.
A value of 1 indicates
.BR execve (2)
will operate in the privilege-restricting mode described above.
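.IP
As an illustration, the sketch below sets the
.I no_new_privs
attribute and then reads it back.
The function name
.I enable_no_new_privs
is hypothetical; note that the change cannot be undone for the calling thread.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the calling thread and
   return the value reported by PR_GET_NO_NEW_PRIVS (expected: 1),
   or -1 on error. The attribute cannot be cleared afterwards. */
int enable_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;

    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

A sandboxing program would typically call such a helper once, early,
before installing a seccomp filter as an unprivileged process.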
.TP
.BR PR_SET_PDEATHSIG " (since Linux 2.1.57)"
Set the parent-death signal
of the calling process to \fIarg2\fP (either a signal value
in the range 1..maxsig, or 0 to clear).
This is the signal that the calling process will get when its
parent dies.
.IP
.IR Warning :
.\" https://bugzilla.kernel.org/show_bug.cgi?id=43300
the "parent" in this case is considered to be the
.I thread
that created this process.
In other words, the signal will be sent when that thread terminates
(via, for example,
.BR pthread_exit (3)),
rather than after all of the threads in the parent process terminate.
.IP
The parent-death signal is sent upon subsequent termination of the parent
thread and also upon termination of each subreaper process
(see the description of
.B PR_SET_CHILD_SUBREAPER
above) to which the caller is subsequently reparented.
If the parent thread and all ancestor subreapers have already terminated
by the time of the
.BR PR_SET_PDEATHSIG
operation, then no parent-death signal is sent to the caller.
.IP
The parent-death signal is process-directed (see
.BR signal (7))
and, if the child installs a handler using the
.BR sigaction (2)
.B SA_SIGINFO
flag, the
.I si_pid
field of the
.I siginfo_t
argument of the handler contains the PID of the terminating parent process.
.IP
The parent-death signal setting is cleared for the child of a
.BR fork (2).
It is also
(since Linux 2.4.36 / 2.6.23)
.\" commit d2d56c5f51028cb9f3d800882eb6f4cbd3f9099f
cleared when executing a set-user-ID or set-group-ID binary,
or a binary that has associated capabilities (see
.BR capabilities (7));
otherwise, this value is preserved across
.BR execve (2).
.TP
.BR PR_GET_PDEATHSIG " (since Linux 2.3.15)"
Return the current value of the parent-death signal,
in the location pointed to by
.IR "(int\ *) arg2" .
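.IP
Because no signal is sent if the parent has already terminated when
.B PR_SET_PDEATHSIG
is called, a child may want to check for that case explicitly.
The sketch below shows one common idiom, under the assumption that comparing
.BR getppid (2)
against a PID recorded before the
.BR fork (2)
is adequate for the application (it does not account for subreapers or
PID namespaces).
Both function names are hypothetical.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: request 'sig' on parent death.
   Returns 0 if armed and the parent still appears to be alive,
   1 if the parent seems to have gone away already (no signal will
   arrive, so the caller should handle this itself), -1 on error. */
int arm_parent_death_signal(int sig, pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig, 0, 0, 0) == -1)
        return -1;

    /* Assumption: a changed parent PID means we were reparented. */
    if (getppid() != expected_parent)
        return 1;

    return 0;
}

/* Hypothetical helper: read back the current parent-death signal
   (0 means none is set); returns -1 on error. */
int current_parent_death_signal(void)
{
    int sig = -1;

    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &sig, 0, 0, 0) == -1)
        return -1;

    return sig;
}
```

In a typical use, the parent records
.I getpid()
before calling
.BR fork (2),
and the child passes that value as
.IR expected_parent .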
.TP
.BR PR_SET_PTRACER " (since Linux 3.4)"
.\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
.\" commit bf06189e4d14641c0148bea16e9dd24943862215
This is meaningful only when the Yama LSM is enabled and in mode 1
("restricted ptrace", visible via
.IR /proc/sys/kernel/yama/ptrace_scope ).
When a "ptracer process ID" is passed in \fIarg2\fP,
the caller is declaring that the ptracer process can
.BR ptrace (2)
the calling process as if it were a direct process ancestor.
Each
.B PR_SET_PTRACER
operation replaces the previous "ptracer process ID".
Employing
.B PR_SET_PTRACER
with
.I arg2
set to 0 clears the caller's "ptracer process ID".
If
.I arg2
is
.BR PR_SET_PTRACER_ANY ,
the ptrace restrictions introduced by Yama are effectively disabled for the
calling process.
.IP
For further information, see the kernel source file
.IR Documentation/admin\-guide/LSM/Yama.rst
.\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22
(or
.IR Documentation/security/Yama.txt
before Linux 4.13).
.TP
.BR PR_SET_SECCOMP " (since Linux 2.6.23)"
.\" See http://thread.gmane.org/gmane.linux.kernel/542632
.\" [PATCH 0 of 2] seccomp updates
.\" andrea@cpushare.com
Set the secure computing (seccomp) mode for the calling thread, to limit
the available system calls.
The more recent
.BR seccomp (2)
system call provides a superset of the functionality of
.BR PR_SET_SECCOMP .
.IP
The seccomp mode is selected via
.IR arg2 .
(The seccomp constants are defined in
.IR <linux/seccomp.h> .)
.IP
With
.IR arg2
set to
.BR SECCOMP_MODE_STRICT ,
the only system calls that the thread is permitted to make are
.BR read (2),
.BR write (2),
.BR _exit (2)
(but not
.BR exit_group (2)),
and
.BR sigreturn (2).
Other system calls result in the delivery of a
.BR SIGKILL
signal.
Strict secure computing mode is useful for number-crunching applications
that may need to execute untrusted byte code,
perhaps obtained by reading from a pipe or socket.
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.IP
With
.IR arg2
set to
.BR SECCOMP_MODE_FILTER " (since Linux 3.5),"
the system calls allowed are defined by a pointer
to a Berkeley Packet Filter passed in
.IR arg3 .
This argument is a pointer to
.IR "struct sock_fprog" ;
it can be designed to filter
arbitrary system calls and system call arguments.
This mode is available only if the kernel is configured with
.B CONFIG_SECCOMP_FILTER
enabled.
.IP
If
.BR SECCOMP_MODE_FILTER
filters permit
.BR fork (2),
then the seccomp mode is inherited by children created by
.BR fork (2);
if
.BR execve (2)
is permitted, then the seccomp mode is preserved across
.BR execve (2).
If the filters permit
.BR prctl ()
calls, then additional filters can be added;
they are run in order until the first non-allow result is seen.
.IP
For further information, see the kernel source file
.IR Documentation/userspace\-api/seccomp_filter.rst
.\" commit c061f33f35be0ccc80f4b8e0aea5dfd2ed7e01a3
(or
.IR Documentation/prctl/seccomp_filter.txt
before Linux 4.13).
.TP
.BR PR_GET_SECCOMP " (since Linux 2.6.23)"
Return (as the function result)
the secure computing mode of the calling thread.
If the caller is not in secure computing mode, this operation returns 0;
if the caller is in strict secure computing mode, then the
.BR prctl ()
call will cause a
.B SIGKILL
signal to be sent to the process.
If the caller is in filter mode, and this system call is allowed by the
seccomp filters, it returns 2; otherwise, the process is killed with a
.BR SIGKILL
signal.
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.IP
Since Linux 3.8, the
.IR Seccomp
field of the
.IR /proc/[pid]/status
file provides a method of obtaining the same information,
without the risk that the process is killed; see
.BR proc (5).
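.IP
The following sketch illustrates strict mode by entering it in a forked
child, so that the caller itself stays unrestricted.
It is only an illustration under stated assumptions: the function name
.I run_strict_child
and the exit codes 2 and 3 are hypothetical, and the child terminates
with the raw exit system call because the C library's usual exit
wrappers are assumed to use
.BR exit_group (2),
which strict mode does not permit.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that enters strict seccomp mode,
   writes a short message to out_fd, and exits.
   Returns the child's exit status (0 on success, 2 if strict mode
   could not be entered, 3 if the write failed), or -1 if the child
   could not be created or was killed by a signal. */
int run_strict_child(int out_fd)
{
    static const char msg[] = "hello from strict mode\n";
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) == -1)
            syscall(SYS_exit, 2);

        /* write(2) is one of the few permitted system calls. */
        if (write(out_fd, msg, sizeof(msg) - 1) == -1)
            syscall(SYS_exit, 3);

        /* Raw exit system call; exit_group(2) would draw SIGKILL. */
        syscall(SYS_exit, 0);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system where the caller is already running under a seccomp filter
(for example, inside some containers), entering strict mode may fail,
in which case this sketch reports exit status 2.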
.TP
.BR PR_SET_SECUREBITS " (since Linux 2.6.26)"
Set the "securebits" flags of the calling thread to the value supplied in
.IR arg2 .
See
.BR capabilities (7).
.TP
.BR PR_GET_SECUREBITS " (since Linux 2.6.26)"
Return (as the function result)
the "securebits" flags of the calling thread.
See
.BR capabilities (7).
.TP
.BR PR_GET_SPECULATION_CTRL " (since Linux 4.17)"
Return the state of the speculation misfeature specified in
.IR arg2 .
Currently, the only permitted value for this argument is
.BR PR_SPEC_STORE_BYPASS
(otherwise the call fails with the error
.BR ENODEV ).
.IP
The return value uses bits 0-3 with the following meaning:
.RS
.TP
.BR PR_SPEC_PRCTL
Mitigation can be controlled per thread by
.BR PR_SET_SPECULATION_CTRL .
.TP
.BR PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.BR PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.BR PR_SPEC_FORCE_DISABLE
Same as
.B PR_SPEC_DISABLE
but cannot be undone.
.RE
.IP
If all bits are 0,
then the CPU is not affected by the speculation misfeature.
.IP
If
.B PR_SPEC_PRCTL
is set, then per-thread control of the mitigation is available.
If not set,
.BR prctl ()
for the speculation misfeature will fail.
.IP
The
.IR arg3 ,
.IR arg4 ,
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
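.IP
The bit layout above can be decoded as in the following sketch.
The function names and the returned strings are hypothetical;
the sketch assumes that the C library headers expose the
.B PR_SPEC_*
constants (true for headers matching Linux 4.17 or later).

```c
#include <sys/prctl.h>

/* Hypothetical helper: query the Speculative Store Bypass state of the
   calling thread. Returns the bit mask described above, or -1 on error. */
int query_ssb_state(void)
{
    return prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);
}

/* Hypothetical helper: turn a PR_GET_SPECULATION_CTRL result into a
   short description. The strings are illustrative only. */
const char *ssb_state_name(int state)
{
    if (state < 0)
        return "error";
    if (state == 0)
        return "not affected";
    if (state & PR_SPEC_FORCE_DISABLE)
        return "mitigation force-enabled";
    if (state & PR_SPEC_DISABLE)
        return "mitigation enabled";
    if (state & PR_SPEC_ENABLE)
        return "mitigation disabled";
    return "unknown";
}
```

A caller would typically also test the
.B PR_SPEC_PRCTL
bit before attempting
.BR PR_SET_SPECULATION_CTRL .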
.TP
.BR PR_SET_SPECULATION_CTRL " (since Linux 4.17)"
.\" commit b617cfc858161140d69cc0b5cc211996b557a1c7
.\" commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee
Set the state of the speculation misfeature specified in
.IR arg2 .
Currently, the only permitted value for this argument is
.B PR_SPEC_STORE_BYPASS
(otherwise the call fails with the error
.BR ENODEV ).
This setting is a per-thread attribute.
The
.IR arg3
argument is used to hand in the control value,
which is one of the following:
.RS
.TP
.BR PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.BR PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.BR PR_SPEC_FORCE_DISABLE
Same as
.B PR_SPEC_DISABLE
but cannot be undone.
A subsequent
.B
prctl(..., PR_SPEC_ENABLE)
will fail with the error
.BR EPERM .
.RE
.IP
Any other value in
.IR arg3
will result in the call failing with the error
.BR ERANGE .
.IP
The
.I arg4
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
.IP
The speculation feature can also be controlled by the
.B spec_store_bypass_disable
boot parameter.
This parameter may enforce a read-only policy which will result in the
.BR prctl (2)
call failing with the error
.BR ENXIO .
For further details, see the kernel source file
.IR Documentation/admin-guide/kernel-parameters.txt .
.TP
.BR PR_SET_THP_DISABLE " (since Linux 3.15)"
.\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
Set the state of the "THP disable" flag for the calling thread.
If
.I arg2
has a nonzero value, the flag is set; otherwise, it is cleared.
Setting this flag provides a method
for disabling transparent huge pages
for jobs where the code cannot be modified, and using a malloc hook with
.BR madvise (2)
is not an option (i.e., statically allocated data).
The setting of the "THP disable" flag is inherited by a child created via
.BR fork (2)
and is preserved across
.BR execve (2).
.\"
.TP
.BR PR_TASK_PERF_EVENTS_DISABLE " (since Linux 2.6.31)"
Disable all performance counters attached to the calling process,
regardless of whether the counters were created by
this process or another process.
Performance counters created by the calling process for other
processes are unaffected.
For more information on performance counters, see the Linux kernel source file
.IR tools/perf/design.txt .
.IP
Originally called
.BR PR_TASK_PERF_COUNTERS_DISABLE ;
.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
renamed (retaining the same numerical value)
in Linux 2.6.32.
.\"
.TP
.BR PR_TASK_PERF_EVENTS_ENABLE " (since Linux 2.6.31)"
The converse of
.BR PR_TASK_PERF_EVENTS_DISABLE ;
enable performance counters attached to the calling process.
.IP
Originally called
.BR PR_TASK_PERF_COUNTERS_ENABLE ;
.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
renamed
.\" commit cdd6c482c9ff9c55475ee7392ec8f672eddb7be6
in Linux 2.6.32.
.\"
.TP
.BR PR_GET_THP_DISABLE " (since Linux 3.15)"
Return (as the function result) the current setting of the "THP disable"
flag for the calling thread:
either 1, if the flag is set, or 0, if it is not.
.TP
.BR PR_GET_TID_ADDRESS " (since Linux 3.5)"
.\" commit 300f786b2683f8bb1ec0afb6e1851183a479c86d
Retrieve the
.I clear_child_tid
address set by
.BR set_tid_address (2)
and the
.BR clone (2)
.B CLONE_CHILD_CLEARTID
flag, in the location pointed to by
.IR "(int\ **)\ arg2" .
This feature is available only if the kernel is built with the
.BR CONFIG_CHECKPOINT_RESTORE
option enabled.
Note that since the
.BR prctl ()
system call does not have a compat implementation for
the AMD64 x32 and MIPS n32 ABIs,
and the kernel writes out a pointer using the kernel's pointer size,
this operation expects a user-space buffer of 8 (not 4) bytes on these ABIs.
.TP
.BR PR_SET_TIMERSLACK " (since Linux 2.6.28)"
.\" See https://lwn.net/Articles/369549/
.\" commit 6976675d94042fbd446231d1bd8b7de71a980ada
Each thread has two associated timer slack values:
a "default" value, and a "current" value.
This operation sets the "current" timer slack value for the calling thread.
If the nanosecond value supplied in
.IR arg2
is greater than zero, then the "current" value is set to this value.
If
.I arg2
is less than or equal to zero,
.\" It seems that it's not possible to set the timer slack to zero;
.\" The minimum value is 1? Seems a little strange.
the "current" timer slack is reset to the
thread's "default" timer slack value.
.IP
The "current" timer slack is used by the kernel to group timer expirations
for the calling thread that are close to one another;
as a consequence, timer expirations for the thread may be
up to the specified number of nanoseconds late (but will never expire early).
Grouping timer expirations can help reduce system power consumption
by minimizing CPU wake-ups.
.IP
The timer expirations affected by timer slack are those set by
.BR select (2),
.BR pselect (2),
.BR poll (2),
.BR ppoll (2),
.BR epoll_wait (2),
.BR epoll_pwait (2),
.BR clock_nanosleep (2),
.BR nanosleep (2),
and
.BR futex (2)
(and thus the library functions implemented via futexes, including
.\" List obtained by grepping for futex usage in glibc source
.BR pthread_cond_timedwait (3),
.BR pthread_mutex_timedlock (3),
.BR pthread_rwlock_timedrdlock (3),
.BR pthread_rwlock_timedwrlock (3),
and
.BR sem_timedwait (3)).
.IP
Timer slack is not applied to threads that are scheduled under
a real-time scheduling policy (see
.BR sched_setscheduler (2)).
.IP
When a new thread is created,
the two timer slack values are made the same as the "current" value
of the creating thread.
Thereafter, a thread can adjust its "current" timer slack value via
.BR PR_SET_TIMERSLACK .
The "default" value can't be changed.
The timer slack values of
.IR init
(PID 1), the ancestor of all processes,
are 50,000 nanoseconds (50 microseconds).
The timer slack values are preserved across
.BR execve (2).
.IP
Since Linux 4.6, the "current" timer slack value of any process
can be examined and changed via the file
.IR /proc/[pid]/timerslack_ns .
See
.BR proc (5).
.TP
.BR PR_GET_TIMERSLACK " (since Linux 2.6.28)"
Return (as the function result)
the "current" timer slack value of the calling thread.
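.IP
As a sketch of how the two timer slack operations fit together,
the hypothetical helper below sets the "current" value and returns
what the kernel then reports.
It assumes that the calling thread is not scheduled under a real-time policy.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the calling thread's "current" timer slack
   to slack_ns nanoseconds (0 resets it to the thread's "default"
   value) and return the resulting "current" value, or -1 on error. */
int set_timer_slack_ns(unsigned long slack_ns)
{
    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0) == -1)
        return -1;

    return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```

For example, calling the helper with 1000000 requests a slack of one
millisecond, and calling it with 0 afterwards restores the default.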
.TP
.BR PR_SET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Set whether to use (normal, traditional) statistical process timing or
accurate timestamp-based process timing, by passing
.B PR_TIMING_STATISTICAL
.\" 0
or
.B PR_TIMING_TIMESTAMP
.\" 1
to \fIarg2\fP.
.B PR_TIMING_TIMESTAMP
is not currently implemented
(attempting to set this mode will yield the error
.BR EINVAL ).
.\" PR_TIMING_TIMESTAMP doesn't do anything in 2.6.26-rc8,
.\" and looking at the patch history, it appears
.\" that it never did anything.
.TP
.BR PR_GET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Return (as the function result) which process timing method is currently
in use.
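.IP
The behavior described for the two timing operations can be observed
with a sketch such as the following.
The function names are hypothetical, and the expected results assume
a kernel that, as stated above, does not implement
.BR PR_TIMING_TIMESTAMP .

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: try to select timestamp-based timing.
   Returns 0 on success, or the errno value on failure
   (EINVAL is expected, since the mode is not implemented). */
int try_timestamp_timing(void)
{
    if (prctl(PR_SET_TIMING, PR_TIMING_TIMESTAMP, 0, 0, 0) == -1)
        return errno;

    return 0;
}

/* Hypothetical helper: return the timing method currently in use
   (PR_TIMING_STATISTICAL is expected), or -1 on error. */
int current_timing_method(void)
{
    return prctl(PR_GET_TIMING, 0, 0, 0, 0);
}
```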
.TP
.BR PR_SET_TSC " (since Linux 2.6.26, x86 only)"
Set the state of the flag determining whether the timestamp counter
can be read by the process.
Pass
.B PR_TSC_ENABLE
to
.I arg2
to allow it to be read, or
.B PR_TSC_SIGSEGV
to generate a
.B SIGSEGV
when the process tries to read the timestamp counter.
.TP
.BR PR_GET_TSC " (since Linux 2.6.26, x86 only)"
Return the state of the flag determining whether the timestamp counter
can be read,
in the location pointed to by
.IR "(int\ *) arg2" .
.TP
.B PR_SET_UNALIGN
(Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22;
.\" sh: 94ea5e449ae834af058ef005d16a8ad44fcf13d6
.\" tile: 2f9ac29eec71a696cb0dcc5fb82c0f8d4dac28c9
sh, since Linux 2.6.34; tile, since Linux 3.12)
Set unaligned access control bits to \fIarg2\fP.
Pass
\fBPR_UNALIGN_NOPRINT\fP to silently fix up unaligned user accesses,
or \fBPR_UNALIGN_SIGBUS\fP to generate
.B SIGBUS
on unaligned user access.
Alpha also supports an additional flag with the value
of 4 and no corresponding named constant,
which instructs the kernel not to fix up
unaligned accesses (it is analogous to providing the
.BR UAC_NOFIX
flag in the
.BR SSI_NVPAIRS
operation of the
.BR setsysinfo ()
system call on Tru64).
.TP
.B PR_GET_UNALIGN
(see
.B PR_SET_UNALIGN
for information on versions and architectures)
Return unaligned access control bits, in the location pointed to by
.IR "(unsigned int\ *) arg2" .
.SH RETURN VALUE
On success,
.BR PR_GET_DUMPABLE ,
.BR PR_GET_KEEPCAPS ,
.BR PR_GET_NO_NEW_PRIVS ,
.BR PR_GET_THP_DISABLE ,
.BR PR_CAPBSET_READ ,
.BR PR_GET_TIMING ,
.BR PR_GET_TIMERSLACK ,
.BR PR_GET_SECUREBITS ,
.BR PR_GET_SPECULATION_CTRL ,
.BR PR_MCE_KILL_GET ,
.BR PR_CAP_AMBIENT + PR_CAP_AMBIENT_IS_SET ,
and (if it returns)
.BR PR_GET_SECCOMP
return the nonnegative values described above.
All other
.I option
values return 0 on success.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
.TP
.B EACCES
.I option
is
.BR PR_SET_SECCOMP
and
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
but the process does not have the
.BR CAP_SYS_ADMIN
capability or has not set the
.IR no_new_privs
attribute (see the discussion of
.BR PR_SET_NO_NEW_PRIVS
above).
.TP
.B EACCES
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file is not executable.
.TP
.B EBADF
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file descriptor passed in
.I arg4
is not valid.
.TP
.B EBUSY
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and this is the second attempt to change the
.I /proc/[pid]/exe
symbolic link, which is prohibited.
.TP
.B EFAULT
.I arg2
is an invalid address.
.TP
.B EFAULT
.I option
is
.BR PR_SET_SECCOMP ,
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
the system was built with
.BR CONFIG_SECCOMP_FILTER ,
and
.I arg3
is an invalid address.
.TP
.B EINVAL
The value of
.I option