.\" Copyright (C) 1998 Andries Brouwer (aeb@cwi.nl)
.\" and Copyright (C) 2002, 2006, 2008, 2012, 2013 Michael Kerrisk <mtk.manpages@gmail.com>
.\" and Copyright Guillem Jover <guillem@hadrons.org>
.\" and Copyright (C) 2014 Dave Hansen / Intel
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.\" Modified Thu Nov 11 04:19:42 MET 1999, aeb: added PR_GET_PDEATHSIG
.\" Modified 27 Jun 02, Michael Kerrisk
.\" Added PR_SET_DUMPABLE, PR_GET_DUMPABLE,
.\" PR_SET_KEEPCAPS, PR_GET_KEEPCAPS
.\" Modified 2006-08-30 Guillem Jover <guillem@hadrons.org>
.\" Updated Linux versions where the options were introduced.
.\" Added PR_SET_TIMING, PR_GET_TIMING, PR_SET_NAME, PR_GET_NAME,
.\" PR_SET_UNALIGN, PR_GET_UNALIGN, PR_SET_FPEMU, PR_GET_FPEMU,
.\" PR_SET_FPEXC, PR_GET_FPEXC
.\" 2008-04-29 Serge Hallyn, Document PR_CAPBSET_READ and PR_CAPBSET_DROP
.\" 2008-06-13 Erik Bosman, <ejbosman@cs.vu.nl>
.\" Document PR_GET_TSC and PR_SET_TSC.
.\" 2008-06-15 mtk, Document PR_SET_SECCOMP, PR_GET_SECCOMP
.\" 2009-10-03 Andi Kleen, document PR_MCE_KILL
.\" 2012-04 Cyrill Gorcunov, Document PR_SET_MM
.\" 2012-04-25 Michael Kerrisk, Document PR_TASK_PERF_EVENTS_DISABLE and
.\" PR_TASK_PERF_EVENTS_ENABLE
.\" 2012-09-20 Kees Cook, update PR_SET_SECCOMP for mode 2
.\" 2012-09-20 Kees Cook, document PR_SET_NO_NEW_PRIVS, PR_GET_NO_NEW_PRIVS
.\" 2012-10-25 Michael Kerrisk, Document PR_SET_TIMERSLACK and
.\" PR_GET_TIMERSLACK
.\" 2013-01-10 Kees Cook, document PR_SET_PTRACER
.\" 2012-02-04 Michael Kerrisk, document PR_{SET,GET}_CHILD_SUBREAPER
.\" 2014-11-10 Dave Hansen, document PR_MPX_{EN,DIS}ABLE_MANAGEMENT
.\"
.\"
.TH PRCTL 2 2019-08-02 "Linux" "Linux Programmer's Manual"
.SH NAME
prctl \- operations on a process
.SH SYNOPSIS
.nf
.B #include <sys/prctl.h>
.PP
.BI "int prctl(int " option ", unsigned long " arg2 ", unsigned long " arg3 ,
.BI " unsigned long " arg4 ", unsigned long " arg5 );
.fi
.SH DESCRIPTION
.BR prctl ()
is called with a first argument describing what to do
(with values defined in \fI<linux/prctl.h>\fP), and further
arguments with a significance depending on the first one.
The first argument can be:
03547431
MK
70.\"
71.TP
72.BR PR_CAP_AMBIENT " (since Linux 4.3)"
73.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08
1a52f4f6
MK
74Reads or changes the ambient capability set of the calling thread,
75according to the value of
03547431
MK
76.IR arg2 ,
77which must be one of the following:
78.RS
79.\"
80.TP
81.B PR_CAP_AMBIENT_RAISE
82The capability specified in
83.I arg3
84is added to the ambient set.
85The specified capability must already be present in
86both the permitted and the inheritable sets of the process.
87This operation is not permitted if the
88.B SECBIT_NO_CAP_AMBIENT_RAISE
89securebit is set.
90.TP
91.B PR_CAP_AMBIENT_LOWER
92The capability specified in
93.I arg3
94is removed from the ambient set.
95.TP
96.B PR_CAP_AMBIENT_IS_SET
97The
bf7bc8b8 98.BR prctl ()
03547431
MK
99call returns 1 if the capability in
100.I arg3
101is in the ambient set and 0 if it is not.
102.TP
103.BR PR_CAP_AMBIENT_CLEAR_ALL
104All capabilities will be removed from the ambient set.
105This operation requires setting
106.I arg3
107to zero.
108.RE
269e3b97
MK
109.IP
110In all of the above operations,
111.I arg4
112and
113.I arg5
114must be specified as 0.
cf086650
MK
115.IP
116Higher-level interfaces layered on top of the above operations are
117provided in the
118.BR libcap (3)
119library in the form of
120.BR cap_get_ambient (3),
121.BR cap_set_ambient (3),
122and
123.BR cap_reset_ambient (3).
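.IP
The ambient-set operations above can be sketched with thin wrappers.
This is a minimal illustration, not part of any library:
the helper names are hypothetical, and each helper passes 0 for the
arguments that the text above requires to be 0.

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: returns 1 if cap is in the ambient set,
 * 0 if it is not, or -1 on error (errno is set by prctl()). */
int ambient_is_set(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, cap, 0UL, 0UL);
}

/* Hypothetical helper: tries to add cap to the ambient set.
 * The call fails unless cap is already in both the permitted
 * and the inheritable sets. */
int ambient_raise(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0UL, 0UL);
}

/* Hypothetical helper: removes cap from the ambient set. */
int ambient_lower(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_LOWER, cap, 0UL, 0UL);
}

/* Hypothetical helper: empties the ambient set (arg3 must be 0). */
int ambient_clear_all(void)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0UL, 0UL, 0UL);
}
```

A caller would typically check ambient_is_set() after ambient_raise(),
since a failed raise leaves the set unchanged.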
fea681da 124.TP
2e781e20 125.BR PR_CAPBSET_READ " (since Linux 2.6.25)"
8ab8b43f
MK
126Return (as the function result) 1 if the capability specified in
127.I arg2
128is in the calling thread's capability bounding set,
129or 0 if it is not.
130(The capability constants are defined in
131.IR <linux/capability.h> .)
132The capability bounding set dictates
133whether the process can receive the capability through a
2914a14d 134file's permitted capability set on a subsequent call to
8ab8b43f 135.BR execve (2).
efeece04 136.IP
8ab8b43f
MK
137If the capability specified in
138.I arg2
139is not valid, then the call fails with the error
140.BR EINVAL .
d9a0d1d7
MK
141.IP
142A higher-level interface layered on top of this operation is provided in the
143.BR libcap (3)
144library in the form of
145.BR cap_get_bound (3).
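.IP
As a rough sketch of the call described above, the following helper
(a hypothetical name, not an existing API) reports whether a capability
is in the bounding set and converts the failure case into a negative
errno value.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: returns 1 if cap is in the calling thread's
 * capability bounding set, 0 if it is not, or a negative errno value
 * (for example -EINVAL when cap is not a valid capability). */
int bounding_set_has(unsigned long cap)
{
    int ret = prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);

    if (ret == -1)
        return -errno;
    return ret;
}
```

For example, bounding_set_has(CAP_CHOWN) yields 0 or 1, depending on
how the process was started.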
8ab8b43f
MK
146.TP
147.BR PR_CAPBSET_DROP " (since Linux 2.6.25)"
148If the calling thread has the
149.B CAP_SETPCAP
af53fcb5 150capability within its user namespace, then drop the capability specified by
8ab8b43f
MK
151.I arg2
152from the calling thread's capability bounding set.
153Any children of the calling thread will inherit the newly
154reduced bounding set.
efeece04 155.IP
8ab8b43f
MK
156The call fails with the error:
157.B EPERM
2914a14d 158if the calling thread does not have the
8ab8b43f
MK
159.BR CAP_SETPCAP ;
160.BR EINVAL
161if
162.I arg2
163does not represent a valid capability; or
164.BR EINVAL
165if file capabilities are not enabled in the kernel,
166in which case bounding sets are not supported.
d9a0d1d7
MK
167.IP
168A higher-level interface layered on top of this operation is provided in the
169.BR libcap (3)
170library in the form of
171.BR cap_drop_bound (3).
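.IP
The drop operation and its error cases might be wrapped as follows.
This is only a sketch under the assumptions stated in the comments;
the helper name is hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: drops cap from the calling thread's bounding set.
 * Returns 0 on success or a negative errno value: -EPERM when the caller
 * lacks CAP_SETPCAP, -EINVAL when cap is not a valid capability. */
int drop_bounding_cap(unsigned long cap)
{
    if (prctl(PR_CAPBSET_DROP, cap, 0UL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}
```

An unprivileged caller should expect -EPERM;
a privileged caller can confirm the drop with
.BR PR_CAPBSET_READ .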
73d3ac53
MK
172.TP
173.BR PR_SET_CHILD_SUBREAPER " (since Linux 3.4)"
174.\" commit ebec18a6d3aa1e7d84aab16225e87fd25170ec2b
175If
176.I arg2
177is nonzero,
178set the "child subreaper" attribute of the calling process;
179if
180.I arg2
181is zero, unset the attribute.
efeece04 182.IP
fbc63931 183A subreaper fulfills the role of
73d3ac53
MK
184.BR init (1)
185for its descendant processes.
fbc63931 186When a process becomes orphaned
b6088873 187(i.e., its immediate parent terminates),
fbc63931
MK
188then that process will be reparented to
189the nearest still living ancestor subreaper.
190Subsequently, calls to
191.BR getppid ()
192in the orphaned process will now return the PID of the subreaper process,
193and when the orphan terminates, it is the subreaper process that
73d3ac53
MK
194will receive a
195.BR SIGCHLD
1a8e1c2f 196signal and will be able to
73d3ac53
MK
197.BR wait (2)
198on the process to discover its termination status.
efeece04 199.IP
4a5a783d 200The setting of the "child subreaper" attribute
300a9c78 201is not inherited by children created by
d59a7572
MK
202.BR fork (2)
203and
204.BR clone (2).
205The setting is preserved across
206.BR execve (2).
efeece04 207.IP
94e460d4
MK
208Establishing a subreaper process is useful in session management frameworks
209where a hierarchical group of processes is managed by a subreaper process
210that needs to be informed when one of the processes\(emfor example,
211a double-forked daemon\(emterminates
212(perhaps so that it can restart that process).
213Some
214.BR init (1)
215frameworks (e.g.,
216.BR systemd (1))
217employ a subreaper process for similar reasons.
73d3ac53
MK
218.TP
219.BR PR_GET_CHILD_SUBREAPER " (since Linux 3.4)"
220Return the "child subreaper" setting of the caller,
221in the location pointed to by
222.IR "(int\ *) arg2" .
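.IP
The set and get operations, and the rule that the attribute is not
inherited across
.BR fork (2),
can be illustrated with the sketch below.
All helper names are hypothetical.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set (nonzero) or unset (zero) the attribute. */
int set_child_subreaper(int enable)
{
    return prctl(PR_SET_CHILD_SUBREAPER, enable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns 1 or 0, or -1 on error.
 * The setting is returned through the int pointed to by arg2. */
int get_child_subreaper(void)
{
    int flag = 0;

    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long) &flag, 0UL, 0UL, 0UL) == -1)
        return -1;
    return flag;
}

/* Hypothetical helper: forks a child and returns the setting as seen
 * by that child (0 or 1), or -1 on error. */
int child_subreaper_in_fork_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        int v = get_child_subreaper();
        _exit(v < 0 ? 2 : v);       /* exit status 2 signals an error */
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

With the attribute set in the parent,
child_subreaper_in_fork_child() is expected to report 0.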
8ab8b43f 223.TP
88989295 224.BR PR_SET_DUMPABLE " (since Linux 2.3.20)"
2d7fc98d
MK
225Set the state of the "dumpable" flag,
226which determines whether core dumps are produced for the calling process
227upon delivery of a signal whose default behavior is to produce a core dump.
efeece04 228.IP
88989295 229In kernels up to and including 2.6.12,
8ab8b43f 230.I arg2
8aad30d7
MK
231must be either 0
232.RB ( SUID_DUMP_DISABLE ,
233process is not dumpable) or 1
234.RB ( SUID_DUMP_USER ,
235process is dumpable).
0de51ed1
MK
236Between kernels 2.6.13 and 2.6.17,
237.\" commit abf75a5033d4da7b8a7e92321d74021d1fcfb502
238the value 2 was also permitted,
88989295
MK
239which caused any binary which normally would not be dumped
240to be dumped readable by root only;
241for security reasons, this feature has been removed.
242.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=115270289030630&w=2
243.\" Subject: Fix prctl privilege escalation (CVE-2006-2451)
244.\" From: Marcel Holtmann <marcel () holtmann ! org>
245.\" Date: 2006-07-12 11:12:00
246(See also the description of
2d7fc98d 247.I /proc/sys/fs/\:suid_dumpable
88989295
MK
248in
249.BR proc (5).)
efeece04 250.IP
2d7fc98d
MK
251Normally, this flag is set to 1.
252However, it is reset to the current value contained in the file
253.IR /proc/sys/fs/\:suid_dumpable
254(which by default has the value 0),
a644bc48 255in the following circumstances:
2d7fc98d
MK
256.\" See kernel/cred.c::commit_creds() (Linux 3.18 sources)
257.RS
41f90bb7 258.IP * 3
a644bc48 259The process's effective user or group ID is changed.
2d7fc98d 260.IP *
a644bc48 261The process's filesystem user or group ID is changed (see
2d7fc98d
MK
262.BR credentials (7)).
263.IP *
a644bc48 264The process executes
2d7fc98d 265.RB ( execve (2))
41f90bb7
MK
266a set-user-ID or set-group-ID program, resulting in a change
267of either the effective user ID or the effective group ID.
27ce08bf
KF
268.IP *
269The process executes
270.RB ( execve (2))
271a program that has file capabilities (see
272.BR capabilities (7)),
41f90bb7 273.\" See kernel/cred.c::commit_creds()
27ce08bf 274but only if the permitted capabilities
41f90bb7 275gained exceed those already permitted for the process.
5d28ea3e 276.\" Also certain namespace operations;
2d7fc98d
MK
277.RE
278.IP
Processes that are not dumpable cannot be attached via
.BR ptrace (2)
.BR PTRACE_ATTACH ;
see
.BR ptrace (2)
for further details.
efeece04 285.IP
161946a2
MK
286If a process is not dumpable,
287the ownership of files in the process's
288.IR /proc/[pid]
289directory is affected as described in
290.BR proc (5).
64536a1b 291.TP
88989295
MK
292.BR PR_GET_DUMPABLE " (since Linux 2.3.20)"
293Return (as the function result) the current state of the calling
294process's dumpable flag.
295.\" Since Linux 2.6.13, the dumpable flag can have the value 2,
296.\" but in 2.6.13 PR_GET_DUMPABLE simply returns 1 if the dumpable
c7094399 297.\" flags has a nonzero value. This was fixed in 2.6.14.
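.IP
A minimal sketch of toggling and reading the dumpable flag follows.
The helper names are hypothetical, and the literals 0 and 1 stand for
.B SUID_DUMP_DISABLE
and
.B SUID_DUMP_USER
as described above.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: sets the dumpable flag to value (0 or 1).
 * Returns 0 on success or a negative errno value; the value 2 is
 * no longer accepted and is expected to yield -EINVAL. */
int set_dumpable(unsigned long value)
{
    if (prctl(PR_SET_DUMPABLE, value, 0UL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}

/* Hypothetical helper: returns the current flag as the function
 * result, or -1 on error. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

A process that handles secrets might call set_dumpable(0) early in
its startup, keeping in mind the
.BR ptrace (2)
consequences noted above.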
64536a1b 298.TP
8ab8b43f 299.BR PR_SET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
c13182ef 300Set the endian-ness of the calling process to the value given
64536a1b 301in \fIarg2\fP, which should be one of the following:
8ab8b43f 302.\" Respectively 0, 1, 2
64536a1b
MK
303.BR PR_ENDIAN_BIG ,
304.BR PR_ENDIAN_LITTLE ,
305or
0daa9e92 306.B PR_ENDIAN_PPC_LITTLE
64536a1b 307(PowerPC pseudo little endian).
e87fdd92 308.TP
8ab8b43f
MK
309.BR PR_GET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
310Return the endian-ness of the calling process,
311in the location pointed to by
312.IR "(int\ *) arg2" .
64a53a67
ES
313.TP
314.BR PR_SET_FP_MODE " (since Linux 4.0, only on MIPS)"
89507305
MK
315.\" commit 9791554b45a2acc28247f66a5fd5bbc212a6b8c8
316On the MIPS architecture,
317user-space code can be built using an ABI which permits linking
318with code that has more restrictive floating-point (FP) requirements.
319For example, user-space code may be built to target the O32 FPXX ABI
b3073df8 320and linked with code built for either one of the more restrictive
89507305 321FP32 or FP64 ABIs.
b3073df8 322When more restrictive code is linked in,
89507305
MK
323the overall requirement for the process is to use the more
324restrictive floating-point mode.
efeece04 325.IP
07d6076e 326Because the kernel has no means of knowing in advance
89507305 327which mode the process should be executed in,
07d6076e
MK
328and because these restrictions can
329change over the lifetime of the process, the
330.B PR_SET_FP_MODE
331operation is provided to allow control of the floating-point mode
332from user space.
efeece04 333.IP
64a53a67
ES
334.\" https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
335The
336.I (unsigned int) arg2
89507305 337argument is a bit mask describing the floating-point mode used:
64a53a67
ES
338.RS
339.TP
fb90e0c7 340.BR PR_FP_MODE_FR
64a53a67
ES
341When this bit is
342.I unset
343(so called
344.BR FR=0 " or " FR0
41a926bf
MK
345mode), the 32 floating-point registers are 32 bits wide,
346and 64-bit registers are represented as a pair of registers
b3073df8 347(even- and odd- numbered,
89507305
MK
348with the even-numbered register containing the lower 32 bits,
349and the odd-numbered register containing the higher 32 bits).
efeece04 350.IP
64a53a67
ES
351When this bit is
352.I set
07d6076e 353(on supported hardware),
41a926bf 354the 32 floating-point registers are 64 bits wide (so called
64a53a67 355.BR FR=1 " or " FR1
89507305 356mode).
b3073df8 357Note that modern MIPS implementations (MIPS R6 and newer) support
64a53a67
ES
358.B FR=1
359mode only.
.IP
89507305 362Applications that use the O32 FP32 ABI can operate only when this bit is
64a53a67
ES
363.I unset
364.RB ( FR=0 ;
365or they can be used with FRE enabled, see below).
89507305
MK
366Applications that use the O32 FP64 ABI
367(and the O32 FP64A ABI, which exists to
368provide the ability to operate with existing FP32 code; see below)
369can operate only when this bit is
64a53a67
ES
370.I set
371.RB ( FR=1 ).
ffb0dafc 372Applications that use the O32 FPXX ABI can operate with either
07d6076e
MK
373.BR FR=0
374or
375.BR FR=1 .
64a53a67 376.TP
fb90e0c7 377.BR PR_FP_MODE_FRE
07d6076e 378Enable emulation of 32-bit floating-point mode.
b3073df8 379When this mode is enabled,
07d6076e
MK
380it emulates 32-bit floating-point operations
381by raising a reserved-instruction exception
b3073df8 382on every instruction that uses 32-bit formats and
89507305
MK
383the kernel then handles the instruction in software.
384(The problem lies in the discrepancy of handling odd-numbered registers
385which are the high 32 bits of 64-bit registers with even numbers in
64a53a67 386.B FR=0
89507305 387mode and the lower 32-bit parts of odd-numbered 64-bit registers in
64a53a67 388.B FR=1
89507305
MK
389mode.)
Enabling this bit is necessary when code with the O32 FP32 ABI should operate
with code with the compatible O32 FPXX or O32 FP64A ABIs (which require
.B FR=1
FPU mode) or when it is executed on newer hardware (MIPS R6 onwards)
which lacks
.B FR=0
mode support when a binary with the FP32 ABI is used.
64a53a67 397.IP
89507305
MK
398Note that this mode makes sense only when the FPU is in 64-bit mode
399.RB ( FR=1 ).
64a53a67 400.IP
89507305 401Note that the use of emulation inherently has a significant performance hit
b3073df8 402and should be avoided if possible.
64a53a67
ES
403.RE
404.IP
07d6076e
MK
405In the N32/N64 ABI, 64-bit floating-point mode is always used,
406so FPU emulation is not required and the FPU always operates in
64a53a67
ES
407.B FR=1
408mode.
409.IP
07d6076e
MK
410This option is mainly intended for use by the dynamic linker
411.RB ( ld.so (8)).
64a53a67 412.IP
89507305
MK
413The arguments
414.IR arg3 ,
415.IR arg4 ,
416and
417.IR arg5
64a53a67
ES
418are ignored.
419.TP
420.BR PR_GET_FP_MODE " (since Linux 4.0, only on MIPS)"
39466029
MK
421Return (as the function result)
422the current floating-point mode (see the description of
64a53a67
ES
423.B PR_SET_FP_MODE
424for details).
efeece04 425.IP
89507305 426On success,
07d6076e 427the call returns a bit mask which represents the current floating-point mode.
efeece04 428.IP
89507305
MK
429The arguments
430.IR arg2 ,
431.IR arg3 ,
432.IR arg4 ,
433and
434.IR arg5
64a53a67 435are ignored.
8ab8b43f 436.TP
8ab8b43f 437.BR PR_SET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
e87fdd92 438Set floating-point emulation control bits to \fIarg2\fP.
7626d2ce
MK
439Pass
440.B PR_FPEMU_NOPRINT
441to silently emulate floating-point operation accesses, or
442.B PR_FPEMU_SIGFPE
443to not emulate floating-point operations and send
8bd58774
MK
444.B SIGFPE
445instead.
e87fdd92 446.TP
8ab8b43f
MK
447.BR PR_GET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
448Return floating-point emulation control bits,
449in the location pointed to by
450.IR "(int\ *) arg2" .
e87fdd92 451.TP
8ab8b43f 452.BR PR_SET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
1c44bd5b
MK
453Set floating-point exception mode to \fIarg2\fP.
454Pass \fBPR_FP_EXC_SW_ENABLE\fP to use FPEXC for FP exception enables,
c45bd688
MK
455\fBPR_FP_EXC_DIV\fP for floating-point divide by zero,
456\fBPR_FP_EXC_OVF\fP for floating-point overflow,
457\fBPR_FP_EXC_UND\fP for floating-point underflow,
458\fBPR_FP_EXC_RES\fP for floating-point inexact result,
459\fBPR_FP_EXC_INV\fP for floating-point invalid operation,
e87fdd92 460\fBPR_FP_EXC_DISABLED\fP for FP exceptions disabled,
b28f6e56 461\fBPR_FP_EXC_NONRECOV\fP for async nonrecoverable exception mode,
e87fdd92
MK
462\fBPR_FP_EXC_ASYNC\fP for async recoverable exception mode,
463\fBPR_FP_EXC_PRECISE\fP for precise exception mode.
464.TP
8ab8b43f
MK
465.BR PR_GET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
466Return floating-point exception mode,
467in the location pointed to by
468.IR "(int\ *) arg2" .
469.TP
88989295 470.BR PR_SET_KEEPCAPS " (since Linux 2.2.18)"
03361448 471Set the state of the calling thread's "keep capabilities" flag.
cb7c96bf 472The effect of this flag is described in
03361448 473.BR capabilities (7).
88989295 474.I arg2
03361448
MK
475must be either 0 (clear the flag)
476or 1 (set the flag).
028cb080 477The "keep capabilities" value will be reset to 0 on subsequent calls to
88989295
MK
478.BR execve (2).
479.TP
480.BR PR_GET_KEEPCAPS " (since Linux 2.2.18)"
88ee5c1c 481Return (as the function result) the current state of the calling thread's
88989295 482"keep capabilities" flag.
03361448
MK
483See
484.BR capabilities (7)
485for a description of this flag.
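.IP
The flag can be set and read back as in the sketch below.
The helper names are hypothetical; the accepted values are the
0 and 1 given above.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: sets (1) or clears (0) the "keep capabilities"
 * flag. Returns 0 on success or a negative errno value. */
int set_keepcaps(unsigned long value)
{
    if (prctl(PR_SET_KEEPCAPS, value, 0UL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}

/* Hypothetical helper: returns the flag (0 or 1) as the function
 * result, or -1 on error. */
int get_keepcaps(void)
{
    return prctl(PR_GET_KEEPCAPS, 0UL, 0UL, 0UL, 0UL);
}
```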
88989295 486.TP
03547431 487.BR PR_MCE_KILL " (since Linux 2.6.32)"
eb359b3e 488Set the machine check memory corruption kill policy for the calling thread.
03547431
MK
489If
490.I arg2
491is
492.BR PR_MCE_KILL_CLEAR ,
493clear the thread memory corruption kill policy and use the system-wide default.
494(The system-wide default is defined by
495.IR /proc/sys/vm/memory_failure_early_kill ;
496see
497.BR proc (5).)
498If
499.I arg2
500is
501.BR PR_MCE_KILL_SET ,
502use a thread-specific memory corruption kill policy.
503In this case,
504.I arg3
505defines whether the policy is
506.I early kill
507.RB ( PR_MCE_KILL_EARLY ),
508.I late kill
509.RB ( PR_MCE_KILL_LATE ),
510or the system-wide default
511.RB ( PR_MCE_KILL_DEFAULT ).
512Early kill means that the thread receives a
513.B SIGBUS
514signal as soon as hardware memory corruption is detected inside
515its address space.
516In late kill mode, the process is killed only when it accesses a corrupted page.
517See
518.BR sigaction (2)
519for more information on the
520.BR SIGBUS
521signal.
522The policy is inherited by children.
523The remaining unused
524.BR prctl ()
525arguments must be zero for future compatibility.
88989295 526.TP
03547431
MK
527.BR PR_MCE_KILL_GET " (since Linux 2.6.32)"
Return (as the function result) the current per-process machine check kill policy.
529All unused
530.BR prctl ()
531arguments must be zero.
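.IP
The following sketch shows one way to set, clear, and query the
policy while passing zero for every unused argument.
The helper names are hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: selects a thread-specific policy; policy is
 * PR_MCE_KILL_EARLY, PR_MCE_KILL_LATE, or PR_MCE_KILL_DEFAULT. */
int mce_set_policy(unsigned long policy)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, policy, 0UL, 0UL);
}

/* Hypothetical helper: reverts to the system-wide default. */
int mce_clear_policy(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_CLEAR, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: returns the current policy as the function
 * result, or -1 on error. */
int mce_get_policy(void)
{
    return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}
```

After mce_clear_policy(), mce_get_policy() is expected to report
.BR PR_MCE_KILL_DEFAULT .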
88989295 532.TP
03547431
MK
533.BR PR_SET_MM " (since Linux 3.3)"
534.\" commit 028ee4be34a09a6d48bdf30ab991ae933a7bc036
535Modify certain kernel memory map descriptor fields
536of the calling process.
537Usually these fields are set by the kernel and dynamic loader (see
538.BR ld.so (8)
539for more information) and a regular application should not use this feature.
540However, there are cases, such as self-modifying programs,
541where a program might find it useful to change its own memory map.
efeece04 542.IP
03547431
MK
543The calling process must have the
544.BR CAP_SYS_RESOURCE
545capability.
546The value in
547.I arg2
548is one of the options below, while
549.I arg3
550provides a new value for the option.
a87d0921
MF
551The
552.I arg4
553and
554.I arg5
555arguments must be zero if unused.
efeece04 556.IP
261c7e1d 557Before Linux 3.10,
d2eeb68f 558.\" commit 52b3694157e3aa6df871e283115652ec6f2d31e0
261c7e1d
MF
559this feature is available only if the kernel is built with the
560.BR CONFIG_CHECKPOINT_RESTORE
561option enabled.
03547431
MK
562.RS
563.TP
564.BR PR_SET_MM_START_CODE
565Set the address above which the program text can run.
566The corresponding memory area must be readable and executable,
997d21e1 567but not writable or shareable (see
03547431 568.BR mprotect (2)
0fcc276f 569and
03547431
MK
570.BR mmap (2)
571for more information).
f83fe154 572.TP
03547431
MK
573.BR PR_SET_MM_END_CODE
574Set the address below which the program text can run.
575The corresponding memory area must be readable and executable,
997d21e1 576but not writable or shareable.
f83fe154 577.TP
03547431
MK
578.BR PR_SET_MM_START_DATA
579Set the address above which initialized and
580uninitialized (bss) data are placed.
581The corresponding memory area must be readable and writable,
997d21e1 582but not executable or shareable.
88989295 583.TP
03547431
MK
584.B PR_SET_MM_END_DATA
585Set the address below which initialized and
586uninitialized (bss) data are placed.
587The corresponding memory area must be readable and writable,
997d21e1 588but not executable or shareable.
88989295 589.TP
03547431
MK
590.BR PR_SET_MM_START_STACK
591Set the start address of the stack.
592The corresponding memory area must be readable and writable.
491b2e75 593.TP
03547431
MK
594.BR PR_SET_MM_START_BRK
Set the address above which the program heap can be expanded with a
.BR brk (2)
call.
598The address must be greater than the ending address of
599the current program data segment.
600In addition, the combined size of the resulting heap and
601the size of the data segment can't exceed the
602.BR RLIMIT_DATA
603resource limit (see
604.BR setrlimit (2)).
605.TP
606.BR PR_SET_MM_BRK
607Set the current
608.BR brk (2)
609value.
610The requirements for the address are the same as for the
611.BR PR_SET_MM_START_BRK
612option.
11ac5b51 613.PP
03547431
MK
614The following options are available since Linux 3.5.
615.\" commit fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7
616.TP
617.BR PR_SET_MM_ARG_START
618Set the address above which the program command line is placed.
619.TP
620.BR PR_SET_MM_ARG_END
621Set the address below which the program command line is placed.
622.TP
623.BR PR_SET_MM_ENV_START
624Set the address above which the program environment is placed.
625.TP
626.BR PR_SET_MM_ENV_END
627Set the address below which the program environment is placed.
628.IP
629The address passed with
630.BR PR_SET_MM_ARG_START ,
631.BR PR_SET_MM_ARG_END ,
632.BR PR_SET_MM_ENV_START ,
633and
634.BR PR_SET_MM_ENV_END
635should belong to a process stack area.
636Thus, the corresponding memory area must be readable, writable, and
637(depending on the kernel configuration) have the
638.BR MAP_GROWSDOWN
639attribute set (see
640.BR mmap (2)).
641.TP
642.BR PR_SET_MM_AUXV
643Set a new auxiliary vector.
644The
645.I arg3
646argument should provide the address of the vector.
The
.I arg4
argument is the size of the vector.
650.TP
651.BR PR_SET_MM_EXE_FILE
652.\" commit b32dfe377102ce668775f8b6b1461f7ad428f8b6
653Supersede the
654.IR /proc/pid/exe
655symbolic link with a new one pointing to a new executable file
identified by the file descriptor provided in the
.I arg3
argument.
659The file descriptor should be obtained with a regular
660.BR open (2)
661call.
662.IP
663To change the symbolic link, one needs to unmap all existing
664executable memory areas, including those created by the kernel itself
665(for example the kernel usually creates at least one executable
666memory area for the ELF
667.IR \.text
668section).
669.IP
642df17c 670In Linux 4.9 and earlier, the
47bc9cec 671.\" commit 3fb4afd9a504c2386b8435028d43283216bf588e
47bc9cec 672.BR PR_SET_MM_EXE_FILE
642df17c
MK
673operation can be performed only once in a process's lifetime;
674attempting to perform the operation a second time results in the error
675.BR EPERM .
676This restriction was enforced for security reasons that were subsequently
677deemed specious,
678and the restriction was removed in Linux 4.10 because some
679user-space applications needed to perform this operation more than once.
11ac5b51 680.PP
7e3236a5
MF
681The following options are available since Linux 3.18.
682.\" commit f606b77f1a9e362451aca8f81d8f36a3a112139e
683.TP
684.BR PR_SET_MM_MAP
685Provides one-shot access to all the addresses by passing in a
686.I struct prctl_mm_map
687(as defined in \fI<linux/prctl.h>\fP).
688The
689.I arg4
690argument should provide the size of the struct.
efeece04 691.IP
7e3236a5
MF
692This feature is available only if the kernel is built with the
693.BR CONFIG_CHECKPOINT_RESTORE
694option enabled.
695.TP
696.BR PR_SET_MM_MAP_SIZE
697Returns the size of the
698.I struct prctl_mm_map
699the kernel expects.
700This allows user space to find a compatible struct.
701The
702.I arg4
703argument should be a pointer to an unsigned int.
efeece04 704.IP
7e3236a5
MF
705This feature is available only if the kernel is built with the
706.BR CONFIG_CHECKPOINT_RESTORE
707option enabled.
03547431
MK
708.RE
709.TP
710.BR PR_MPX_ENABLE_MANAGEMENT ", " PR_MPX_DISABLE_MANAGEMENT " (since Linux 3.19) "
711.\" commit fe3d197f84319d3bce379a9c0dc17b1f48ad358c
712.\" See also http://lwn.net/Articles/582712/
713.\" See also https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler
714Enable or disable kernel management of Memory Protection eXtensions (MPX)
715bounds tables.
716The
717.IR arg2 ,
718.IR arg3 ,
719.IR arg4 ,
720and
721.IR arg5
722.\" commit e9d1b4f3c60997fe197bf0243cb4a41a44387a88
723arguments must be zero.
efeece04 724.IP
03547431
MK
725MPX is a hardware-assisted mechanism for performing bounds checking on
726pointers.
727It consists of a set of registers storing bounds information
728and a set of special instruction prefixes that tell the CPU on which
729instructions it should do bounds enforcement.
730There is a limited number of these registers and
731when there are more pointers than registers,
732their contents must be "spilled" into a set of tables.
733These tables are called "bounds tables" and the MPX
734.BR prctl ()
735operations control
736whether the kernel manages their allocation and freeing.
efeece04 737.IP
03547431
MK
738When management is enabled, the kernel will take over allocation
739and freeing of the bounds tables.
740It does this by trapping the #BR exceptions that result
741at first use of missing bounds tables and
742instead of delivering the exception to user space,
743it allocates the table and populates the bounds directory
744with the location of the new table.
745For freeing, the kernel checks to see if bounds tables are
746present for memory which is not allocated, and frees them if so.
efeece04 747.IP
03547431
MK
748Before enabling MPX management using
749.BR PR_MPX_ENABLE_MANAGEMENT ,
750the application must first have allocated a user-space buffer for
751the bounds directory and placed the location of that directory in the
752.I bndcfgu
753register.
efeece04 754.IP
a23d8efa 755These calls fail if the CPU or kernel does not support MPX.
03547431
MK
756Kernel support for MPX is enabled via the
757.BR CONFIG_X86_INTEL_MPX
758configuration option.
You can check whether the CPU supports MPX by looking for the 'mpx'
CPUID bit, for example with the following command:
efeece04 761.IP
e256205a
MK
762.in +4n
763.EX
764cat /proc/cpuinfo | grep ' mpx '
765.EE
766.in
efeece04 767.IP
03547431
MK
768A thread may not switch in or out of long (64-bit) mode while MPX is
769enabled.
efeece04 770.IP
03547431 771All threads in a process are affected by these calls.
efeece04 772.IP
03547431
MK
773The child of a
774.BR fork (2)
775inherits the state of MPX management.
776During
777.BR execve (2),
778MPX management is reset to a state as if
779.BR PR_MPX_DISABLE_MANAGEMENT
780had been called.
efeece04 781.IP
03547431
MK
782For further information on Intel MPX, see the kernel source file
783.IR Documentation/x86/intel_mpx.txt .
784.TP
785.BR PR_SET_NAME " (since Linux 2.6.9)"
786Set the name of the calling thread,
787using the value in the location pointed to by
788.IR "(char\ *) arg2" .
789The name can be up to 16 bytes long,
790.\" TASK_COMM_LEN in include/linux/sched.h
791including the terminating null byte.
792(If the length of the string, including the terminating null byte,
793exceeds 16 bytes, the string is silently truncated.)
794This is the same attribute that can be set via
795.BR pthread_setname_np (3)
796and retrieved using
797.BR pthread_getname_np (3).
798The attribute is likewise accessible via
799.IR /proc/self/task/[tid]/comm ,
800where
801.I tid
is the thread ID of the calling thread.
803.TP
804.BR PR_GET_NAME " (since Linux 2.6.11)"
805Return the name of the calling thread,
806in the buffer pointed to by
807.IR "(char\ *) arg2" .
808The buffer should allow space for up to 16 bytes;
809the returned string will be null-terminated.
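.IP
The 16-byte limit and the silent truncation can be seen with a small
sketch such as the following.
The helper names and the THREAD_NAME_LEN macro are hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical constant: maximum name size, including the
 * terminating null byte. */
#define THREAD_NAME_LEN 16

/* Hypothetical helper: sets the calling thread's name; a longer
 * string is silently truncated by the kernel. */
int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long) name, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: copies the calling thread's name into buf,
 * which must provide room for THREAD_NAME_LEN bytes. */
int get_thread_name(char buf[THREAD_NAME_LEN])
{
    return prctl(PR_GET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}
```

Setting the 23-character name "a-very-long-thread-name" and reading
it back is expected to return only its first 15 characters.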
810.TP
811.BR PR_SET_NO_NEW_PRIVS " (since Linux 3.5)"
40dfb5ba 812Set the calling thread's
03547431 813.I no_new_privs
fdda9363 814attribute to the value in
03547431
MK
815.IR arg2 .
816With
817.I no_new_privs
818set to 1,
819.BR execve (2)
820promises not to grant privileges to do anything
821that could not have been done without the
822.BR execve (2)
823call (for example,
824rendering the set-user-ID and set-group-ID mode bits,
825and file capabilities non-functional).
fdda9363
MK
Once set, the
827.I no_new_privs
828attribute cannot be unset.
829The setting of this attribute is inherited by children created by
03547431
MK
830.BR fork (2)
831and
832.BR clone (2),
833and preserved across
834.BR execve (2).
efeece04 835.IP
c70fea6e
MK
836Since Linux 4.10,
837the value of a thread's
838.I no_new_privs
fdda9363 839attribute can be viewed via the
c70fea6e
MK
840.I NoNewPrivs
841field in the
842.IR /proc/[pid]/status
843file.
efeece04 844.IP
03547431 845For more information, see the kernel source file
a84a5830
ES
846.IR Documentation/userspace\-api/no_new_privs.rst
847.\" commit 40fde647ccb0ae8c11d256d271e24d385eed595b
848(or
849.IR Documentation/prctl/no_new_privs.txt
850before Linux 4.13).
4d850396
MK
851See also
852.BR seccomp (2).
.TP
.BR PR_GET_NO_NEW_PRIVS " (since Linux 3.5)"
Return (as the function result) the value of the
.I no_new_privs
attribute for the calling thread.
A value of 0 indicates the regular
.BR execve (2)
behavior.
A value of 1 indicates
.BR execve (2)
will operate in the privilege-restricting mode described above.
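.IP
As a minimal sketch of the set-then-query pattern for this attribute
(the helper name
.I enable_no_new_privs
is hypothetical):

```c
#include <sys/prctl.h>

/* Hypothetical helper: enable no_new_privs for the calling thread and
   return the value reported by PR_GET_NO_NEW_PRIVS (1 when set),
   or -1 on error.  The change cannot be undone for this thread. */
int
enable_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;

    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Because the attribute cannot be unset, a later attempt to pass 0 as
.I arg2
to
.B PR_SET_NO_NEW_PRIVS
is expected to fail.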
.TP
.BR PR_SET_PDEATHSIG " (since Linux 2.1.57)"
Set the parent-death signal
of the calling process to \fIarg2\fP (either a signal value
in the range 1..maxsig, or 0 to clear).
This is the signal that the calling process will get when its
parent dies.
.IP
.IR Warning :
.\" https://bugzilla.kernel.org/show_bug.cgi?id=43300
the "parent" in this case is considered to be the
.I thread
that created this process.
In other words, the signal will be sent when that thread terminates
(via, for example,
.BR pthread_exit (3)),
rather than after all of the threads in the parent process terminate.
.IP
The parent-death signal is sent upon subsequent termination of the parent
thread and also upon termination of each subreaper process
(see the description of
.B PR_SET_CHILD_SUBREAPER
above) to which the caller is subsequently reparented.
If the parent thread and all ancestor subreapers have already terminated
by the time of the
.BR PR_SET_PDEATHSIG
operation, then no parent-death signal is sent to the caller.
.IP
The parent-death signal is process-directed (see
.BR signal (7))
and, if the child installs a handler using the
.BR sigaction (2)
.B SA_SIGINFO
flag, the
.I si_pid
field of the
.I siginfo_t
argument of the handler contains the PID of the terminating parent process.
.IP
The parent-death signal setting is cleared for the child of a
.BR fork (2).
It is also
(since Linux 2.4.36 / 2.6.23)
.\" commit d2d56c5f51028cb9f3d800882eb6f4cbd3f9099f
cleared when executing a set-user-ID or set-group-ID binary,
or a binary that has associated capabilities (see
.BR capabilities (7));
otherwise, this value is preserved across
.BR execve (2).
.TP
.BR PR_GET_PDEATHSIG " (since Linux 2.3.15)"
Return the current value of the parent process death signal,
in the location pointed to by
.IR "(int\ *) arg2" .
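.IP
The following sketch sets a parent-death signal and reads it back
through the pointer argument.
The helper name
.I set_and_get_pdeathsig
is hypothetical.

```c
#include <signal.h>
#include <sys/prctl.h>

/* Hypothetical helper: request signal `sig` on parent death (0 clears
   the setting), then read the setting back.  Returns the stored
   signal number, or -1 on error. */
int
set_and_get_pdeathsig(int sig)
{
    int current = -1;

    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig, 0, 0, 0) == -1)
        return -1;

    /* PR_GET_PDEATHSIG stores its result via arg2, not the return value */
    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &current, 0, 0, 0) == -1)
        return -1;

    return current;
}
```

Since no signal is sent if the parent has already terminated when
.B PR_SET_PDEATHSIG
is called, a caller may also want to compare
.BR getppid (2)
against the expected parent PID after the call;
that check is a common idiom and is not shown here.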
.TP
.BR PR_SET_PTRACER " (since Linux 3.4)"
.\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
.\" commit bf06189e4d14641c0148bea16e9dd24943862215
This is meaningful only when the Yama LSM is enabled and in mode 1
("restricted ptrace", visible via
.IR /proc/sys/kernel/yama/ptrace_scope ).
When a "ptracer process ID" is passed in \fIarg2\fP,
the caller is declaring that the ptracer process can
.BR ptrace (2)
the calling process as if it were a direct process ancestor.
Each
.B PR_SET_PTRACER
operation replaces the previous "ptracer process ID".
Employing
.B PR_SET_PTRACER
with
.I arg2
set to 0 clears the caller's "ptracer process ID".
If
.I arg2
is
.BR PR_SET_PTRACER_ANY ,
the ptrace restrictions introduced by Yama are effectively disabled for the
calling process.
.IP
For further information, see the kernel source file
.IR Documentation/admin\-guide/LSM/Yama.rst
.\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22
(or
.IR Documentation/security/Yama.txt
before Linux 4.13).
.TP
.BR PR_SET_SECCOMP " (since Linux 2.6.23)"
.\" See http://thread.gmane.org/gmane.linux.kernel/542632
.\" [PATCH 0 of 2] seccomp updates
.\" andrea@cpushare.com
Set the secure computing (seccomp) mode for the calling thread, to limit
the available system calls.
The more recent
.BR seccomp (2)
system call provides a superset of the functionality of
.BR PR_SET_SECCOMP .
.IP
The seccomp mode is selected via
.IR arg2 .
(The seccomp constants are defined in
.IR <linux/seccomp.h> .)
.IP
With
.IR arg2
set to
.BR SECCOMP_MODE_STRICT ,
the only system calls that the thread is permitted to make are
.BR read (2),
.BR write (2),
.BR _exit (2)
(but not
.BR exit_group (2)),
and
.BR sigreturn (2).
Other system calls result in the delivery of a
.BR SIGKILL
signal.
Strict secure computing mode is useful for number-crunching applications
that may need to execute untrusted byte code,
perhaps obtained by reading from a pipe or socket.
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.IP
With
.IR arg2
set to
.BR SECCOMP_MODE_FILTER " (since Linux 3.5),"
the system calls allowed are defined by a pointer
to a Berkeley Packet Filter passed in
.IR arg3 .
This argument is a pointer to
.IR "struct sock_fprog" ;
it can be designed to filter
arbitrary system calls and system call arguments.
This mode is available only if the kernel is configured with
.B CONFIG_SECCOMP_FILTER
enabled.
.IP
If
.BR SECCOMP_MODE_FILTER
filters permit
.BR fork (2),
then the seccomp mode is inherited by children created by
.BR fork (2);
if
.BR execve (2)
is permitted, then the seccomp mode is preserved across
.BR execve (2).
If the filters permit
.BR prctl ()
calls, then additional filters can be added;
they are run in order until the first non-allow result is seen.
.IP
For further information, see the kernel source file
.IR Documentation/userspace\-api/seccomp_filter.rst
.\" commit c061f33f35be0ccc80f4b8e0aea5dfd2ed7e01a3
(or
.IR Documentation/prctl/seccomp_filter.txt
before Linux 4.13).
.TP
.BR PR_GET_SECCOMP " (since Linux 2.6.23)"
Return (as the function result)
the secure computing mode of the calling thread.
If the caller is not in secure computing mode, this operation returns 0;
if the caller is in strict secure computing mode, then the
.BR prctl ()
call will cause a
.B SIGKILL
signal to be sent to the process.
If the caller is in filter mode, and this system call is allowed by the
seccomp filters, it returns 2; otherwise, the process is killed with a
.BR SIGKILL
signal.
This operation is available only
if the kernel is configured with
.B CONFIG_SECCOMP
enabled.
.IP
Since Linux 3.8, the
.IR Seccomp
field of the
.IR /proc/[pid]/status
file provides a method of obtaining the same information,
without the risk that the process is killed; see
.BR proc (5).
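.IP
A sketch of the
.I /proc
approach follows.
The helper name
.I seccomp_mode_from_proc
is hypothetical, and the parsing assumes the
.I Seccomp
field appears on its own line as "Seccomp:" followed by a decimal number.

```c
#include <stdio.h>

/* Hypothetical helper: parse the "Seccomp:" field of /proc/self/status.
   Returns 0 (disabled), 1 (strict), 2 (filter), or -1 if the field
   could not be read (for example, on kernels before Linux 3.8). */
int
seccomp_mode_from_proc(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    int mode = -1;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Matches only the exact "Seccomp:" label; mode is left
           unchanged when the line does not match */
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }

    fclose(fp);
    return mode;
}
```

Unlike
.BR PR_GET_SECCOMP ,
this query needs
.BR open (2)
and
.BR read (2)
to be permitted, so it is of use mainly outside strict mode
or when inspecting another process.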
.TP
.BR PR_SET_SECUREBITS " (since Linux 2.6.26)"
Set the "securebits" flags of the calling thread to the value supplied in
.IR arg2 .
See
.BR capabilities (7).
.TP
.BR PR_GET_SECUREBITS " (since Linux 2.6.26)"
Return (as the function result)
the "securebits" flags of the calling thread.
See
.BR capabilities (7).
.TP
.BR PR_GET_SPECULATION_CTRL " (since Linux 4.17)"
Return (as the function result)
the state of the speculation misfeature specified in
.IR arg2 .
Currently, the only permitted value for this argument is
.BR PR_SPEC_STORE_BYPASS
(otherwise the call fails with the error
.BR ENODEV ).
.IP
The return value uses bits 0-3 with the following meaning:
.RS
.TP
.BR PR_SPEC_PRCTL
Mitigation can be controlled per thread by
.BR PR_SET_SPECULATION_CTRL .
.TP
.BR PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.BR PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.BR PR_SPEC_FORCE_DISABLE
Same as
.B PR_SPEC_DISABLE
but cannot be undone.
.RE
.IP
If all bits are 0,
then the CPU is not affected by the speculation misfeature.
.IP
If
.B PR_SPEC_PRCTL
is set, then per-thread control of the mitigation is available.
If not set,
.BR prctl ()
for the speculation misfeature will fail.
.IP
The
.IR arg3 ,
.IR arg4 ,
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
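.IP
The bit layout above can be decoded along the following lines.
The helper names are hypothetical.
The fallback macro values are an assumption taken from
.I <linux/prctl.h>
and are used only when the installed headers predate Linux 4.17.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from <linux/prctl.h> */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS 0
#endif
#ifndef PR_SPEC_PRCTL
#define PR_SPEC_PRCTL         (1UL << 0)
#define PR_SPEC_ENABLE        (1UL << 1)
#define PR_SPEC_DISABLE       (1UL << 2)
#define PR_SPEC_FORCE_DISABLE (1UL << 3)
#endif

/* Query the store-bypass state of the calling thread.
   Returns the bit mask described above, or -1 on error. */
int
get_store_bypass_state(void)
{
    return prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);
}

/* 1 if the mitigation is active (speculation disabled), 0 if it is
   not, -1 if `state` is an error return. */
int
mitigation_active(int state)
{
    if (state < 0)
        return -1;
    return (state & (PR_SPEC_DISABLE | PR_SPEC_FORCE_DISABLE)) != 0;
}

/* 1 if per-thread control via PR_SET_SPECULATION_CTRL is available. */
int
per_thread_control(int state)
{
    return state >= 0 && (state & PR_SPEC_PRCTL) != 0;
}
```

A caller would typically pass the result of
.I get_store_bypass_state()
to the two decoding helpers and treat a result of 0 as
"CPU not affected".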
.TP
.BR PR_SET_SPECULATION_CTRL " (since Linux 4.17)"
.\" commit b617cfc858161140d69cc0b5cc211996b557a1c7
.\" commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee
Set the state of the speculation misfeature specified in
.IR arg2 .
Currently, the only permitted value for this argument is
.B PR_SPEC_STORE_BYPASS
(otherwise the call fails with the error
.BR ENODEV ).
This setting is a per-thread attribute.
The
.IR arg3
argument is used to hand in the control value,
which is one of the following:
.RS
.TP
.BR PR_SPEC_ENABLE
The speculation feature is enabled, mitigation is disabled.
.TP
.BR PR_SPEC_DISABLE
The speculation feature is disabled, mitigation is enabled.
.TP
.BR PR_SPEC_FORCE_DISABLE
Same as
.B PR_SPEC_DISABLE
but cannot be undone.
A subsequent
.B
prctl(..., PR_SPEC_ENABLE)
will fail with the error
.BR EPERM .
.RE
.IP
Any other value in
.IR arg3
will result in the call failing with the error
.BR ERANGE .
.IP
The
.I arg4
and
.I arg5
arguments must be specified as 0; otherwise the call fails with the error
.BR EINVAL .
.IP
The speculation feature can also be controlled by the
.B spec_store_bypass_disable
boot parameter.
This parameter may enforce a read-only policy which will result in the
.BR prctl ()
call failing with the error
.BR ENXIO .
For further details, see the kernel source file
.IR Documentation/admin\-guide/kernel\-parameters.txt .
.TP
.BR PR_SET_THP_DISABLE " (since Linux 3.15)"
.\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
Set the state of the "THP disable" flag for the calling thread.
If
.I arg2
has a nonzero value, the flag is set; otherwise, it is cleared.
Setting this flag provides a method
for disabling transparent huge pages
for jobs where the code cannot be modified, and using a malloc hook with
.BR madvise (2)
is not an option (i.e., statically allocated data).
The setting of the "THP disable" flag is inherited by a child created via
.BR fork (2)
and is preserved across
.BR execve (2).
.\"
.TP
.BR PR_TASK_PERF_EVENTS_DISABLE " (since Linux 2.6.31)"
Disable all performance counters attached to the calling process,
regardless of whether the counters were created by
this process or another process.
Performance counters created by the calling process for other
processes are unaffected.
For more information on performance counters, see the Linux kernel source file
.IR tools/perf/design.txt .
.IP
Originally called
.BR PR_TASK_PERF_COUNTERS_DISABLE ;
.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
renamed (retaining the same numerical value)
in Linux 2.6.32.
.\"
.TP
.BR PR_TASK_PERF_EVENTS_ENABLE " (since Linux 2.6.31)"
The converse of
.BR PR_TASK_PERF_EVENTS_DISABLE ;
enable performance counters attached to the calling process.
.IP
Originally called
.BR PR_TASK_PERF_COUNTERS_ENABLE ;
.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
renamed
.\" commit cdd6c482c9ff9c55475ee7392ec8f672eddb7be6
in Linux 2.6.32.
.\"
.TP
.BR PR_GET_THP_DISABLE " (since Linux 3.15)"
Return (via the function result) the current setting of the "THP disable"
flag for the calling thread:
either 1, if the flag is set, or 0, if it is not.
.TP
.BR PR_GET_TID_ADDRESS " (since Linux 3.5)"
.\" commit 300f786b2683f8bb1ec0afb6e1851183a479c86d
Return the
.I clear_child_tid
address set by
.BR set_tid_address (2)
and the
.BR clone (2)
.B CLONE_CHILD_CLEARTID
flag, in the location pointed to by
.IR "(int\ **)\ arg2" .
This feature is available only if the kernel is built with the
.BR CONFIG_CHECKPOINT_RESTORE
option enabled.
Note that since the
.BR prctl ()
system call does not have a compat implementation for
the AMD64 x32 and MIPS n32 ABIs,
and the kernel writes out a pointer using the kernel's pointer size,
this operation expects a user-space buffer of 8 (not 4) bytes on these ABIs.
.TP
.BR PR_SET_TIMERSLACK " (since Linux 2.6.28)"
.\" See https://lwn.net/Articles/369549/
.\" commit 6976675d94042fbd446231d1bd8b7de71a980ada
Each thread has two associated timer slack values:
a "default" value, and a "current" value.
This operation sets the "current" timer slack value for the calling thread.
.I arg2
is an unsigned long value, so the maximum "current" value is ULONG_MAX and
the minimum "current" value is 1.
If the nanosecond value supplied in
.IR arg2
is greater than zero, then the "current" value is set to this value.
If
.I arg2
is equal to zero,
the "current" timer slack is reset to the
thread's "default" timer slack value.
.IP
The "current" timer slack is used by the kernel to group timer expirations
for the calling thread that are close to one another;
as a consequence, timer expirations for the thread may be
up to the specified number of nanoseconds late (but will never expire early).
Grouping timer expirations can help reduce system power consumption
by minimizing CPU wake-ups.
.IP
The timer expirations affected by timer slack are those set by
.BR select (2),
.BR pselect (2),
.BR poll (2),
.BR ppoll (2),
.BR epoll_wait (2),
.BR epoll_pwait (2),
.BR clock_nanosleep (2),
.BR nanosleep (2),
and
.BR futex (2)
(and thus the library functions implemented via futexes, including
.\" List obtained by grepping for futex usage in glibc source
.BR pthread_cond_timedwait (3),
.BR pthread_mutex_timedlock (3),
.BR pthread_rwlock_timedrdlock (3),
.BR pthread_rwlock_timedwrlock (3),
and
.BR sem_timedwait (3)).
.IP
Timer slack is not applied to threads that are scheduled under
a real-time scheduling policy (see
.BR sched_setscheduler (2)).
.IP
When a new thread is created,
the two timer slack values are made the same as the "current" value
of the creating thread.
Thereafter, a thread can adjust its "current" timer slack value via
.BR PR_SET_TIMERSLACK .
The "default" value can't be changed.
The timer slack values of
.IR init
(PID 1), the ancestor of all processes,
are 50,000 nanoseconds (50 microseconds).
The timer slack value is inherited by a child created via
.BR fork (2),
and is preserved across
.BR execve (2).
.IP
Since Linux 4.6, the "current" timer slack value of any process
can be examined and changed via the file
.IR /proc/[pid]/timerslack_ns .
See
.BR proc (5).
.TP
.BR PR_GET_TIMERSLACK " (since Linux 2.6.28)"
Return (as the function result)
the "current" timer slack value of the calling thread.
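.IP
A minimal sketch of adjusting and reading back the "current" value
(the helper name
.I set_timer_slack_ns
is hypothetical):

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the "current" timer slack (in nanoseconds)
   of the calling thread and return the value the kernel reports
   afterwards, or -1 on error.  Passing 0 resets the slack to the
   thread's "default" value. */
int
set_timer_slack_ns(unsigned long ns)
{
    if (prctl(PR_SET_TIMERSLACK, ns, 0, 0, 0) == -1)
        return -1;

    return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```

Note that the result is returned as an
.IR int ,
so this sketch is suitable only for slack values that fit in that type.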
.TP
.BR PR_SET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Set whether to use (normal, traditional) statistical process timing or
accurate timestamp-based process timing, by passing
.B PR_TIMING_STATISTICAL
.\" 0
or
.B PR_TIMING_TIMESTAMP
.\" 1
to \fIarg2\fP.
.B PR_TIMING_TIMESTAMP
is not currently implemented
(attempting to set this mode will yield the error
.BR EINVAL ).
.\" PR_TIMING_TIMESTAMP doesn't do anything in 2.6.26-rc8,
.\" and looking at the patch history, it appears
.\" that it never did anything.
.TP
.BR PR_GET_TIMING " (since Linux 2.6.0)"
.\" Precisely: Linux 2.6.0-test4
Return (as the function result) which process timing method is currently
in use.
.TP
.BR PR_SET_TSC " (since Linux 2.6.26, x86 only)"
Set the state of the flag determining whether the timestamp counter
can be read by the process.
Pass
.B PR_TSC_ENABLE
to
.I arg2
to allow it to be read, or
.B PR_TSC_SIGSEGV
to generate a
.B SIGSEGV
when the process tries to read the timestamp counter.
.TP
.BR PR_GET_TSC " (since Linux 2.6.26, x86 only)"
Return the state of the flag determining whether the timestamp counter
can be read,
in the location pointed to by
.IR "(int\ *) arg2" .
.TP
.B PR_SET_UNALIGN
(Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22;
.\" sh: 94ea5e449ae834af058ef005d16a8ad44fcf13d6
.\" tile: 2f9ac29eec71a696cb0dcc5fb82c0f8d4dac28c9
sh, since Linux 2.6.34; tile, since Linux 3.12)
Set unaligned access control bits to \fIarg2\fP.
Pass
\fBPR_UNALIGN_NOPRINT\fP to silently fix up unaligned user accesses,
or \fBPR_UNALIGN_SIGBUS\fP to generate
.B SIGBUS
on unaligned user access.
Alpha also supports an additional flag with the value
of 4 and no corresponding named constant,
which instructs the kernel not to fix up
unaligned accesses (it is analogous to providing the
.BR UAC_NOFIX
flag in the
.BR SSI_NVPAIRS
operation of the
.BR setsysinfo ()
system call on Tru64).
.TP
.B PR_GET_UNALIGN
(See
.B PR_SET_UNALIGN
for information on versions and architectures.)
Return unaligned access control bits, in the location pointed to by
.IR "(unsigned int\ *) arg2" .
.SH RETURN VALUE
On success,
.BR PR_GET_DUMPABLE ,
.BR PR_GET_FP_MODE ,
.BR PR_GET_KEEPCAPS ,
.BR PR_GET_NO_NEW_PRIVS ,
.BR PR_GET_THP_DISABLE ,
.BR PR_CAPBSET_READ ,
.BR PR_GET_TIMING ,
.BR PR_GET_TIMERSLACK ,
.BR PR_GET_SECUREBITS ,
.BR PR_GET_SPECULATION_CTRL ,
.BR PR_MCE_KILL_GET ,
.BR PR_CAP_AMBIENT + PR_CAP_AMBIENT_IS_SET ,
and (if it returns)
.BR PR_GET_SECCOMP
return the nonnegative values described above.
All other
.I option
values return 0 on success.
On error, \-1 is returned, and
.I errno
is set appropriately.
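.PP
Because several operations return a meaningful nonnegative value,
callers need to test for \-1 specifically rather than for "nonzero".
One possible wrapper is sketched below.
The name
.I checked_prctl
and its negative-errno return convention are illustrative choices,
not part of the
.BR prctl ()
interface.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical wrapper: call prctl() with arg3..arg5 set to 0 and
   report failures on stderr.  Returns the nonnegative result on
   success, or -errno on failure, so that a legitimate result of 0
   is not confused with an error. */
int
checked_prctl(int option, unsigned long arg2)
{
    int ret = prctl(option, arg2, 0, 0, 0);

    if (ret == -1) {
        int saved = errno;      /* fprintf() may modify errno */

        fprintf(stderr, "prctl(%d): %s\n", option, strerror(saved));
        return -saved;
    }

    return ret;
}
```

For example, a call with an unrecognized
.I option
value is expected to yield
.RB \- EINVAL
from this wrapper.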
.SH ERRORS
.TP
.B EACCES
.I option
is
.BR PR_SET_SECCOMP
and
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
but the process does not have the
.BR CAP_SYS_ADMIN
capability or has not set the
.IR no_new_privs
attribute (see the discussion of
.BR PR_SET_NO_NEW_PRIVS
above).
.TP
.B EACCES
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file is not executable.
.TP
.B EBADF
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and the file descriptor passed in
.I arg4
is not valid.
.TP
.B EBUSY
.I option
is
.BR PR_SET_MM ,
.I arg3
is
.BR PR_SET_MM_EXE_FILE ,
and this is the second attempt to change the
.I /proc/[pid]/exe
symbolic link, which is prohibited.
.TP
.B EFAULT
.I arg2
is an invalid address.
.TP
.B EFAULT
.I option
is
.BR PR_SET_SECCOMP ,
.I arg2
is
.BR SECCOMP_MODE_FILTER ,
the system was built with
.BR CONFIG_SECCOMP_FILTER ,
and
.I arg3
is an invalid address.
.TP
.B EINVAL
The value of
.I option