git.ipfire.org Git - thirdparty/man-pages.git/blame - man2/prctl.2
Commit Line Data
fea681da 1.\" Copyright (C) 1998 Andries Brouwer (aeb@cwi.nl)
73d3ac53 2.\" and Copyright (C) 2002, 2006, 2008, 2012, 2013 Michael Kerrisk <mtk.manpages@gmail.com>
af5f9508 3.\" and Copyright Guillem Jover <guillem@hadrons.org>
3cd5e983 4.\" and Copyright (C) 2014 Dave Hansen / Intel
fea681da 5.\"
93015253 6.\" %%%LICENSE_START(VERBATIM)
fea681da
MK
7.\" Permission is granted to make and distribute verbatim copies of this
8.\" manual provided the copyright notice and this permission notice are
9.\" preserved on all copies.
10.\"
11.\" Permission is granted to copy and distribute modified versions of this
12.\" manual under the conditions for verbatim copying, provided that the
13.\" entire resulting derived work is distributed under the terms of a
14.\" permission notice identical to this one.
c13182ef 15.\"
fea681da
MK
16.\" Since the Linux kernel and libraries are constantly changing, this
17.\" manual page may be incorrect or out-of-date. The author(s) assume no
18.\" responsibility for errors or omissions, or for damages resulting from
19.\" the use of the information contained herein. The author(s) may not
20.\" have taken the same level of care in the production of this manual,
21.\" which is licensed free of charge, as they might when working
22.\" professionally.
c13182ef 23.\"
fea681da
MK
24.\" Formatted or processed versions of this manual, if unaccompanied by
25.\" the source, must acknowledge the copyright and authors of this work.
4b72fb64 26.\" %%%LICENSE_END
fea681da
MK
27.\"
28.\" Modified Thu Nov 11 04:19:42 MET 1999, aeb: added PR_GET_PDEATHSIG
29.\" Modified 27 Jun 02, Michael Kerrisk
c13182ef 30.\" Added PR_SET_DUMPABLE, PR_GET_DUMPABLE,
fea681da 31.\" PR_SET_KEEPCAPS, PR_GET_KEEPCAPS
e87fdd92
MK
32.\" Modified 2006-08-30 Guillem Jover <guillem@hadrons.org>
33.\" Updated Linux versions where the options were introduced.
34.\" Added PR_SET_TIMING, PR_GET_TIMING, PR_SET_NAME, PR_GET_NAME,
35.\" PR_SET_UNALIGN, PR_GET_UNALIGN, PR_SET_FPEMU, PR_GET_FPEMU,
36.\" PR_SET_FPEXC, PR_GET_FPEXC
8ab8b43f
MK
37.\" 2008-04-29 Serge Hallyn, Document PR_CAPBSET_READ and PR_CAPBSET_DROP
38.\" 2008-06-13 Erik Bosman, <ejbosman@cs.vu.nl>
39.\" Document PR_GET_TSC and PR_SET_TSC.
40.\" 2008-06-15 mtk, Document PR_SET_SECCOMP, PR_GET_SECCOMP
bc02b3ea 41.\" 2009-10-03 Andi Kleen, document PR_MCE_KILL
06afe673 42.\" 2012-04 Cyrill Gorcunov, Document PR_SET_MM
bc02b3ea
MK
43.\" 2012-04-25 Michael Kerrisk, Document PR_TASK_PERF_EVENTS_DISABLE and
44.\" PR_TASK_PERF_EVENTS_ENABLE
34447828 45.\" 2012-09-20 Kees Cook, update PR_SET_SECCOMP for mode 2
f83fe154 46.\" 2012-09-20 Kees Cook, document PR_SET_NO_NEW_PRIVS, PR_GET_NO_NEW_PRIVS
934487a0
MK
47.\" 2012-10-25 Michael Kerrisk, Document PR_SET_TIMERSLACK and
48.\" PR_GET_TIMERSLACK
491b2e75 49.\" 2013-01-10 Kees Cook, document PR_SET_PTRACER
31cc8387 50.\" 2012-02-04 Michael Kerrisk, document PR_{SET,GET}_CHILD_SUBREAPER
03979794 51.\" 2014-11-10 Dave Hansen, document PR_MPX_{EN,DIS}ABLE_MANAGEMENT
fea681da 52.\"
e14baeeb 53.\"
e8426ca2 54.TH PRCTL 2 2020-04-11 "Linux" "Linux Programmer's Manual"
fea681da
MK
55.SH NAME
56prctl \- operations on a process
57.SH SYNOPSIS
521bf584 58.nf
fea681da 59.B #include <sys/prctl.h>
68e4db0a 60.PP
521bf584
MK
61.BI "int prctl(int " option ", unsigned long " arg2 ", unsigned long " arg3 ,
62.BI " unsigned long " arg4 ", unsigned long " arg5 );
63.fi
fea681da 64.SH DESCRIPTION
e511ffb6 65.BR prctl ()
fea681da 66is called with a first argument describing what to do
1a329b56 67(with values defined in \fI<linux/prctl.h>\fP), and further
c4bb193f 68arguments with a significance depending on the first one.
fea681da 69The first argument can be:
03547431
MK
70.\"
71.TP
72.BR PR_CAP_AMBIENT " (since Linux 4.3)"
73.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08
1a52f4f6
MK
74Reads or changes the ambient capability set of the calling thread,
75according to the value of
03547431
MK
76.IR arg2 ,
77which must be one of the following:
78.RS
79.\"
80.TP
81.B PR_CAP_AMBIENT_RAISE
82The capability specified in
83.I arg3
84is added to the ambient set.
85The specified capability must already be present in
86both the permitted and the inheritable sets of the process.
87This operation is not permitted if the
88.B SECBIT_NO_CAP_AMBIENT_RAISE
89securebit is set.
90.TP
91.B PR_CAP_AMBIENT_LOWER
92The capability specified in
93.I arg3
94is removed from the ambient set.
95.TP
96.B PR_CAP_AMBIENT_IS_SET
97The
bf7bc8b8 98.BR prctl ()
03547431
MK
99call returns 1 if the capability in
100.I arg3
101is in the ambient set and 0 if it is not.
102.TP
103.BR PR_CAP_AMBIENT_CLEAR_ALL
104All capabilities will be removed from the ambient set.
105This operation requires setting
106.I arg3
107to zero.
108.RE
269e3b97
MK
109.IP
110In all of the above operations,
111.I arg4
112and
113.I arg5
114must be specified as 0.
cf086650
MK
115.IP
116Higher-level interfaces layered on top of the above operations are
117provided in the
118.BR libcap (3)
119library in the form of
120.BR cap_get_ambient (3),
121.BR cap_set_ambient (3),
122and
123.BR cap_reset_ambient (3).
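.IP
The ambient-set operations above can be sketched directly with
.BR prctl ().
This is a minimal illustration, not a replacement for the libcap wrappers;
the helper names are hypothetical, and raising a capability succeeds only
if it is already in both the permitted and inheritable sets.

```c
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Returns 1 if cap is in the ambient set, 0 if not, -1 on error. */
int ambient_is_set(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, cap, 0UL, 0UL);
}

/* Tries to add cap to the ambient set; returns 0 on success, -1 on error
   (for example EPERM if cap is not both permitted and inheritable). */
int ambient_raise(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0UL, 0UL);
}

/* Removes every capability from the ambient set; arg3 must be zero. */
int ambient_clear_all(void)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0UL, 0UL, 0UL);
}
```

A caller might invoke ambient_raise(CAP_NET_BIND_SERVICE) before
.BR execve (2)
and check the result, since the call is expected to fail in an
unprivileged process.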
fea681da 124.TP
2e781e20 125.BR PR_CAPBSET_READ " (since Linux 2.6.25)"
8ab8b43f
MK
126Return (as the function result) 1 if the capability specified in
127.I arg2
128is in the calling thread's capability bounding set,
129or 0 if it is not.
130(The capability constants are defined in
131.IR <linux/capability.h> .)
132The capability bounding set dictates
133whether the process can receive the capability through a
2914a14d 134file's permitted capability set on a subsequent call to
8ab8b43f 135.BR execve (2).
efeece04 136.IP
8ab8b43f
MK
137If the capability specified in
138.I arg2
139is not valid, then the call fails with the error
140.BR EINVAL .
d9a0d1d7
MK
141.IP
142A higher-level interface layered on top of this operation is provided in the
143.BR libcap (3)
144library in the form of
145.BR cap_get_bound (3).
8ab8b43f
MK
146.TP
147.BR PR_CAPBSET_DROP " (since Linux 2.6.25)"
148If the calling thread has the
149.B CAP_SETPCAP
af53fcb5 150capability within its user namespace, then drop the capability specified by
8ab8b43f
MK
151.I arg2
152from the calling thread's capability bounding set.
153Any children of the calling thread will inherit the newly
154reduced bounding set.
efeece04 155.IP
8ab8b43f
MK
156The call fails with the error:
157.B EPERM
2914a14d 158if the calling thread does not have the
8ab8b43f
MK
159.BR CAP_SETPCAP ;
160.BR EINVAL
161if
162.I arg2
163does not represent a valid capability; or
164.BR EINVAL
165if file capabilities are not enabled in the kernel,
166in which case bounding sets are not supported.
d9a0d1d7
MK
167.IP
168A higher-level interface layered on top of this operation is provided in the
169.BR libcap (3)
170library in the form of
171.BR cap_drop_bound (3).
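.IP
As a rough sketch of the two bounding-set operations, the following
wrappers (hypothetical names) query and drop a capability.
Dropping is expected to fail with
.B EPERM
unless the caller has
.BR CAP_SETPCAP .

```c
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Returns 1 if cap is in the bounding set, 0 if not,
   -1 (errno EINVAL) if cap is not a valid capability. */
int bounding_set_has(unsigned long cap)
{
    return prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);
}

/* Drops cap from the calling thread's bounding set.
   Returns 0 on success, -1 on error (EPERM without CAP_SETPCAP). */
int bounding_set_drop(unsigned long cap)
{
    return prctl(PR_CAPBSET_DROP, cap, 0UL, 0UL, 0UL);
}
```

A drop cannot be undone, so a program would normally query with
bounding_set_has() first and drop only what it is sure it never needs.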
73d3ac53
MK
172.TP
173.BR PR_SET_CHILD_SUBREAPER " (since Linux 3.4)"
174.\" commit ebec18a6d3aa1e7d84aab16225e87fd25170ec2b
175If
176.I arg2
177is nonzero,
178set the "child subreaper" attribute of the calling process;
179if
180.I arg2
181is zero, unset the attribute.
efeece04 182.IP
fbc63931 183A subreaper fulfills the role of
73d3ac53
MK
184.BR init (1)
185for its descendant processes.
fbc63931 186When a process becomes orphaned
b6088873 187(i.e., its immediate parent terminates),
fbc63931
MK
188then that process will be reparented to
189the nearest still living ancestor subreaper.
190Subsequently, calls to
191.BR getppid ()
192in the orphaned process will now return the PID of the subreaper process,
193and when the orphan terminates, it is the subreaper process that
73d3ac53
MK
194will receive a
195.BR SIGCHLD
1a8e1c2f 196signal and will be able to
73d3ac53
MK
197.BR wait (2)
198on the process to discover its termination status.
efeece04 199.IP
4a5a783d 200The setting of the "child subreaper" attribute
300a9c78 201is not inherited by children created by
d59a7572
MK
202.BR fork (2)
203and
204.BR clone (2).
205The setting is preserved across
206.BR execve (2).
efeece04 207.IP
94e460d4
MK
208Establishing a subreaper process is useful in session management frameworks
209where a hierarchical group of processes is managed by a subreaper process
210that needs to be informed when one of the processes\(emfor example,
211a double-forked daemon\(emterminates
212(perhaps so that it can restart that process).
213Some
214.BR init (1)
215frameworks (e.g.,
216.BR systemd (1))
217employ a subreaper process for similar reasons.
73d3ac53
MK
218.TP
219.BR PR_GET_CHILD_SUBREAPER " (since Linux 3.4)"
220Return the "child subreaper" setting of the caller,
221in the location pointed to by
222.IR "(int\ *) arg2" .
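.IP
A minimal sketch of toggling and reading back the attribute follows.
The helper names are hypothetical; note that the "get" operation stores
its result through the pointer in
.I arg2
rather than returning it.

```c
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Sets (enable != 0) or clears (enable == 0) the "child subreaper"
   attribute of the calling process. Returns 0 on success, -1 on error. */
int set_child_subreaper(int enable)
{
    return prctl(PR_SET_CHILD_SUBREAPER, enable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Returns the current setting (0 or 1), or -1 on error. */
int get_child_subreaper(void)
{
    int value = 0;

    /* The kernel writes the setting to the int that arg2 points to. */
    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long) &value,
              0UL, 0UL, 0UL) == -1)
        return -1;
    return value;
}
```

A supervisor would typically call set_child_subreaper(1) once at startup
and then reap orphaned descendants with
.BR wait (2).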
8ab8b43f 223.TP
88989295 224.BR PR_SET_DUMPABLE " (since Linux 2.3.20)"
d4492caa 225Set the state of the "dumpable" attribute,
2d7fc98d
MK
226which determines whether core dumps are produced for the calling process
227upon delivery of a signal whose default behavior is to produce a core dump.
efeece04 228.IP
88989295 229In kernels up to and including 2.6.12,
8ab8b43f 230.I arg2
8aad30d7
MK
231must be either 0
232.RB ( SUID_DUMP_DISABLE ,
233process is not dumpable) or 1
234.RB ( SUID_DUMP_USER ,
235process is dumpable).
0de51ed1
MK
236Between kernels 2.6.13 and 2.6.17,
237.\" commit abf75a5033d4da7b8a7e92321d74021d1fcfb502
238the value 2 was also permitted,
88989295
MK
239which caused any binary which normally would not be dumped
240to be dumped readable by root only;
241for security reasons, this feature has been removed.
242.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=115270289030630&w=2
243.\" Subject: Fix prctl privilege escalation (CVE-2006-2451)
244.\" From: Marcel Holtmann <marcel () holtmann ! org>
245.\" Date: 2006-07-12 11:12:00
246(See also the description of
2d7fc98d 247.I /proc/sys/fs/\:suid_dumpable
88989295
MK
248in
249.BR proc (5).)
efeece04 250.IP
3076b3d9 251Normally, the "dumpable" attribute is set to 1.
2d7fc98d
MK
252However, it is reset to the current value contained in the file
253.IR /proc/sys/fs/\:suid_dumpable
254(which by default has the value 0),
a644bc48 255in the following circumstances:
2d7fc98d
MK
256.\" See kernel/cred.c::commit_creds() (Linux 3.18 sources)
257.RS
41f90bb7 258.IP * 3
a644bc48 259The process's effective user or group ID is changed.
2d7fc98d 260.IP *
a644bc48 261The process's filesystem user or group ID is changed (see
2d7fc98d
MK
262.BR credentials (7)).
263.IP *
a644bc48 264The process executes
2d7fc98d 265.RB ( execve (2))
41f90bb7
MK
266a set-user-ID or set-group-ID program, resulting in a change
267of either the effective user ID or the effective group ID.
27ce08bf
KF
268.IP *
269The process executes
270.RB ( execve (2))
271a program that has file capabilities (see
272.BR capabilities (7)),
41f90bb7 273.\" See kernel/cred.c::commit_creds()
27ce08bf 274but only if the permitted capabilities
41f90bb7 275gained exceed those already permitted for the process.
5d28ea3e 276.\" Also certain namespace operations;
2d7fc98d
MK
277.RE
278.IP
cadcf1b1 279Processes that are not dumpable cannot be attached via
6fdbc779 280.BR ptrace (2)
982d8cf7
MK
281.BR PTRACE_ATTACH ;
282see
283.BR ptrace (2)
284for further details.
efeece04 285.IP
161946a2
MK
286If a process is not dumpable,
287the ownership of files in the process's
288.IR /proc/[pid]
289directory is affected as described in
290.BR proc (5).
64536a1b 291.TP
88989295
MK
292.BR PR_GET_DUMPABLE " (since Linux 2.3.20)"
293Return (as the function result) the current state of the calling
d4492caa 294process's dumpable attribute.
88989295
MK
295.\" Since Linux 2.6.13, the dumpable flag can have the value 2,
296.\" but in 2.6.13 PR_GET_DUMPABLE simply returns 1 if the dumpable
c7094399 297.\" flag has a nonzero value. This was fixed in 2.6.14.
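.IP
The following sketch shows one way to clear and restore the "dumpable"
attribute, using the literal values 0 and 1 described above.
The helper names are hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Sets the "dumpable" attribute: 1 = dumpable, 0 = not dumpable.
   Returns 0 on success, -1 on error. */
int set_dumpable(int dumpable)
{
    return prctl(PR_SET_DUMPABLE, dumpable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Returns the current "dumpable" state as the function result,
   or -1 on error. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

A process handling secrets might call set_dumpable(0) early, which
prevents core dumps and also prevents attachment via
.BR ptrace (2)
.BR PTRACE_ATTACH .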
64536a1b 298.TP
8ab8b43f 299.BR PR_SET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
c13182ef 300Set the endian-ness of the calling process to the value given
64536a1b 301in \fIarg2\fP, which should be one of the following:
8ab8b43f 302.\" Respectively 0, 1, 2
64536a1b
MK
303.BR PR_ENDIAN_BIG ,
304.BR PR_ENDIAN_LITTLE ,
305or
0daa9e92 306.B PR_ENDIAN_PPC_LITTLE
64536a1b 307(PowerPC pseudo little endian).
e87fdd92 308.TP
8ab8b43f
MK
309.BR PR_GET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
310Return the endian-ness of the calling process,
311in the location pointed to by
312.IR "(int\ *) arg2" .
64a53a67
ES
313.TP
314.BR PR_SET_FP_MODE " (since Linux 4.0, only on MIPS)"
89507305
MK
315.\" commit 9791554b45a2acc28247f66a5fd5bbc212a6b8c8
316On the MIPS architecture,
317user-space code can be built using an ABI which permits linking
318with code that has more restrictive floating-point (FP) requirements.
319For example, user-space code may be built to target the O32 FPXX ABI
b3073df8 320and linked with code built for either one of the more restrictive
89507305 321FP32 or FP64 ABIs.
b3073df8 322When more restrictive code is linked in,
89507305
MK
323the overall requirement for the process is to use the more
324restrictive floating-point mode.
efeece04 325.IP
07d6076e 326Because the kernel has no means of knowing in advance
89507305 327which mode the process should be executed in,
07d6076e
MK
328and because these restrictions can
329change over the lifetime of the process, the
330.B PR_SET_FP_MODE
331operation is provided to allow control of the floating-point mode
332from user space.
efeece04 333.IP
64a53a67
ES
334.\" https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
335The
336.I (unsigned int) arg2
89507305 337argument is a bit mask describing the floating-point mode used:
64a53a67
ES
338.RS
339.TP
fb90e0c7 340.BR PR_FP_MODE_FR
64a53a67
ES
341When this bit is
342.I unset
343(so-called
344.BR FR=0 " or " FR0
41a926bf
MK
345mode), the 32 floating-point registers are 32 bits wide,
346and 64-bit registers are represented as a pair of registers
b3073df8 347(even- and odd-numbered,
89507305
MK
348with the even-numbered register containing the lower 32 bits,
349and the odd-numbered register containing the higher 32 bits).
efeece04 350.IP
64a53a67
ES
351When this bit is
352.I set
07d6076e 353(on supported hardware),
41a926bf 354the 32 floating-point registers are 64 bits wide (so-called
64a53a67 355.BR FR=1 " or " FR1
89507305 356mode).
b3073df8 357Note that modern MIPS implementations (MIPS R6 and newer) support
64a53a67
ES
358.B FR=1
359mode only.
efeece04
MK
360.IP
89507305 362Applications that use the O32 FP32 ABI can operate only when this bit is
64a53a67
ES
363.I unset
364.RB ( FR=0 ;
365or they can be used with FRE enabled, see below).
89507305
MK
366Applications that use the O32 FP64 ABI
367(and the O32 FP64A ABI, which exists to
368provide the ability to operate with existing FP32 code; see below)
369can operate only when this bit is
64a53a67
ES
370.I set
371.RB ( FR=1 ).
ffb0dafc 372Applications that use the O32 FPXX ABI can operate with either
07d6076e
MK
373.BR FR=0
374or
375.BR FR=1 .
64a53a67 376.TP
fb90e0c7 377.BR PR_FP_MODE_FRE
07d6076e 378Enable emulation of 32-bit floating-point mode.
b3073df8 379When this mode is enabled,
07d6076e
MK
380it emulates 32-bit floating-point operations
381by raising a reserved-instruction exception
b3073df8 382on every instruction that uses 32-bit formats and
89507305
MK
383the kernel then handles the instruction in software.
384(The problem lies in the discrepancy of handling odd-numbered registers
385which are the high 32 bits of 64-bit registers with even numbers in
64a53a67 386.B FR=0
89507305 387mode and the lower 32-bit parts of odd-numbered 64-bit registers in
64a53a67 388.B FR=1
89507305
MK
389mode.)
390Enabling this bit is necessary when code with the O32 FP32 ABI should operate
391with code with the compatible O32 FPXX or O32 FP64A ABIs (which require
64a53a67 392.B FR=1
b3073df8
MK
393FPU mode) or when it is executed on newer hardware (MIPS R6 onwards)
394which lacks
64a53a67 395.B FR=0
89507305 396mode support when a binary with the FP32 ABI is used.
64a53a67 397.IP
89507305
MK
398Note that this mode makes sense only when the FPU is in 64-bit mode
399.RB ( FR=1 ).
64a53a67 400.IP
89507305 401Note that the use of emulation inherently has a significant performance hit
b3073df8 402and should be avoided if possible.
64a53a67
ES
403.RE
404.IP
07d6076e
MK
405In the N32/N64 ABI, 64-bit floating-point mode is always used,
406so FPU emulation is not required and the FPU always operates in
64a53a67
ES
407.B FR=1
408mode.
409.IP
07d6076e
MK
410This option is mainly intended for use by the dynamic linker
411.RB ( ld.so (8)).
64a53a67 412.IP
89507305
MK
413The arguments
414.IR arg3 ,
415.IR arg4 ,
416and
417.IR arg5
64a53a67
ES
418are ignored.
419.TP
420.BR PR_GET_FP_MODE " (since Linux 4.0, only on MIPS)"
39466029
MK
421Return (as the function result)
422the current floating-point mode (see the description of
64a53a67
ES
423.B PR_SET_FP_MODE
424for details).
efeece04 425.IP
89507305 426On success,
07d6076e 427the call returns a bit mask which represents the current floating-point mode.
efeece04 428.IP
89507305
MK
429The arguments
430.IR arg2 ,
431.IR arg3 ,
432.IR arg4 ,
433and
434.IR arg5
64a53a67 435are ignored.
8ab8b43f 436.TP
8ab8b43f 437.BR PR_SET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
e87fdd92 438Set floating-point emulation control bits to \fIarg2\fP.
7626d2ce
MK
439Pass
440.B PR_FPEMU_NOPRINT
441to silently emulate floating-point operation accesses, or
442.B PR_FPEMU_SIGFPE
443to not emulate floating-point operations and send
8bd58774
MK
444.B SIGFPE
445instead.
e87fdd92 446.TP
8ab8b43f
MK
447.BR PR_GET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
448Return floating-point emulation control bits,
449in the location pointed to by
450.IR "(int\ *) arg2" .
e87fdd92 451.TP
8ab8b43f 452.BR PR_SET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
1c44bd5b
MK
453Set floating-point exception mode to \fIarg2\fP.
454Pass \fBPR_FP_EXC_SW_ENABLE\fP to use FPEXC for FP exception enables,
c45bd688
MK
455\fBPR_FP_EXC_DIV\fP for floating-point divide by zero,
456\fBPR_FP_EXC_OVF\fP for floating-point overflow,
457\fBPR_FP_EXC_UND\fP for floating-point underflow,
458\fBPR_FP_EXC_RES\fP for floating-point inexact result,
459\fBPR_FP_EXC_INV\fP for floating-point invalid operation,
e87fdd92 460\fBPR_FP_EXC_DISABLED\fP for FP exceptions disabled,
b28f6e56 461\fBPR_FP_EXC_NONRECOV\fP for async nonrecoverable exception mode,
e87fdd92
MK
462\fBPR_FP_EXC_ASYNC\fP for async recoverable exception mode,
463\fBPR_FP_EXC_PRECISE\fP for precise exception mode.
464.TP
8ab8b43f
MK
465.BR PR_GET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
466Return floating-point exception mode,
467in the location pointed to by
468.IR "(int\ *) arg2" .
469.TP
88989295 470.BR PR_SET_KEEPCAPS " (since Linux 2.2.18)"
03361448 471Set the state of the calling thread's "keep capabilities" flag.
cb7c96bf 472The effect of this flag is described in
03361448 473.BR capabilities (7).
88989295 474.I arg2
03361448
MK
475must be either 0 (clear the flag)
476or 1 (set the flag).
028cb080 477The "keep capabilities" value will be reset to 0 on subsequent calls to
88989295
MK
478.BR execve (2).
479.TP
480.BR PR_GET_KEEPCAPS " (since Linux 2.2.18)"
88ee5c1c 481Return (as the function result) the current state of the calling thread's
88989295 482"keep capabilities" flag.
03361448
MK
483See
484.BR capabilities (7)
485for a description of this flag.
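.IP
A short sketch of setting and querying the flag follows; the helper names
are hypothetical.
The flag is reset by
.BR execve (2),
so it matters only for UID changes made before the next exec.

```c
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Sets (keep != 0) or clears (keep == 0) the "keep capabilities" flag.
   Returns 0 on success, -1 on error. */
int set_keepcaps(int keep)
{
    return prctl(PR_SET_KEEPCAPS, keep ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Returns the flag (0 or 1) as the function result, or -1 on error. */
int get_keepcaps(void)
{
    return prctl(PR_GET_KEEPCAPS, 0UL, 0UL, 0UL, 0UL);
}
```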
88989295 486.TP
03547431 487.BR PR_MCE_KILL " (since Linux 2.6.32)"
eb359b3e 488Set the machine check memory corruption kill policy for the calling thread.
03547431
MK
489If
490.I arg2
491is
492.BR PR_MCE_KILL_CLEAR ,
493clear the thread memory corruption kill policy and use the system-wide default.
494(The system-wide default is defined by
495.IR /proc/sys/vm/memory_failure_early_kill ;
496see
497.BR proc (5).)
498If
499.I arg2
500is
501.BR PR_MCE_KILL_SET ,
502use a thread-specific memory corruption kill policy.
503In this case,
504.I arg3
505defines whether the policy is
506.I early kill
507.RB ( PR_MCE_KILL_EARLY ),
508.I late kill
509.RB ( PR_MCE_KILL_LATE ),
510or the system-wide default
511.RB ( PR_MCE_KILL_DEFAULT ).
512Early kill means that the thread receives a
513.B SIGBUS
514signal as soon as hardware memory corruption is detected inside
515its address space.
516In late kill mode, the process is killed only when it accesses a corrupted page.
517See
518.BR sigaction (2)
519for more information on the
520.BR SIGBUS
521signal.
522The policy is inherited by children.
523The remaining unused
524.BR prctl ()
525arguments must be zero for future compatibility.
88989295 526.TP
03547431 527.BR PR_MCE_KILL_GET " (since Linux 2.6.32)"
1ff5960b
MK
528Return (as the function result)
529the current per-process machine check kill policy.
03547431
MK
530All unused
531.BR prctl ()
532arguments must be zero.
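.IP
The policy operations can be sketched as below.
The helper names are hypothetical, and every unused argument is passed
explicitly as zero, as the text above requires.

```c
#include <sys/prctl.h>

/* Hypothetical helpers for illustration; not part of any library. */

/* Installs a thread-specific policy: PR_MCE_KILL_EARLY,
   PR_MCE_KILL_LATE, or PR_MCE_KILL_DEFAULT.
   Returns 0 on success, -1 on error. */
int mce_kill_set_policy(unsigned long policy)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, policy, 0UL, 0UL);
}

/* Clears the thread-specific policy so the system-wide default applies. */
int mce_kill_clear(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_CLEAR, 0UL, 0UL, 0UL);
}

/* Returns the current policy as the function result, or -1 on error. */
int mce_kill_get(void)
{
    return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}
```

A program that can recover from lost pages (for example, by discarding a
cache) might select PR_MCE_KILL_EARLY and install a
.B SIGBUS
handler.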
88989295 533.TP
03547431
MK
534.BR PR_SET_MM " (since Linux 3.3)"
535.\" commit 028ee4be34a09a6d48bdf30ab991ae933a7bc036
536Modify certain kernel memory map descriptor fields
537of the calling process.
538Usually these fields are set by the kernel and dynamic loader (see
539.BR ld.so (8)
540for more information) and a regular application should not use this feature.
541However, there are cases, such as self-modifying programs,
542where a program might find it useful to change its own memory map.
efeece04 543.IP
03547431
MK
544The calling process must have the
545.BR CAP_SYS_RESOURCE
546capability.
547The value in
548.I arg2
549is one of the options below, while
550.I arg3
551provides a new value for the option.
a87d0921
MF
552The
553.I arg4
554and
555.I arg5
556arguments must be zero if unused.
efeece04 557.IP
261c7e1d 558Before Linux 3.10,
d2eeb68f 559.\" commit 52b3694157e3aa6df871e283115652ec6f2d31e0
261c7e1d
MF
560this feature is available only if the kernel is built with the
561.BR CONFIG_CHECKPOINT_RESTORE
562option enabled.
03547431
MK
563.RS
564.TP
565.BR PR_SET_MM_START_CODE
566Set the address above which the program text can run.
567The corresponding memory area must be readable and executable,
997d21e1 568but not writable or shareable (see
03547431 569.BR mprotect (2)
0fcc276f 570and
03547431
MK
571.BR mmap (2)
572for more information).
f83fe154 573.TP
03547431
MK
574.BR PR_SET_MM_END_CODE
575Set the address below which the program text can run.
576The corresponding memory area must be readable and executable,
997d21e1 577but not writable or shareable.
f83fe154 578.TP
03547431
MK
579.BR PR_SET_MM_START_DATA
580Set the address above which initialized and
581uninitialized (bss) data are placed.
582The corresponding memory area must be readable and writable,
997d21e1 583but not executable or shareable.
88989295 584.TP
03547431
MK
585.B PR_SET_MM_END_DATA
586Set the address below which initialized and
587uninitialized (bss) data are placed.
588The corresponding memory area must be readable and writable,
997d21e1 589but not executable or shareable.
88989295 590.TP
03547431
MK
591.BR PR_SET_MM_START_STACK
592Set the start address of the stack.
593The corresponding memory area must be readable and writable.
491b2e75 594.TP
03547431
MK
595.BR PR_SET_MM_START_BRK
596Set the address above which the program heap can be expanded with the
597.BR brk (2)
598call.
599The address must be greater than the ending address of
600the current program data segment.
601In addition, the combined size of the resulting heap and
602the size of the data segment can't exceed the
603.BR RLIMIT_DATA
604resource limit (see
605.BR setrlimit (2)).
606.TP
607.BR PR_SET_MM_BRK
608Set the current
609.BR brk (2)
610value.
611The requirements for the address are the same as for the
612.BR PR_SET_MM_START_BRK
613option.
11ac5b51 614.PP
03547431
MK
615The following options are available since Linux 3.5.
616.\" commit fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7
617.TP
618.BR PR_SET_MM_ARG_START
619Set the address above which the program command line is placed.
620.TP
621.BR PR_SET_MM_ARG_END
622Set the address below which the program command line is placed.
623.TP
624.BR PR_SET_MM_ENV_START
625Set the address above which the program environment is placed.
626.TP
627.BR PR_SET_MM_ENV_END
628Set the address below which the program environment is placed.
629.IP
630The address passed with
631.BR PR_SET_MM_ARG_START ,
632.BR PR_SET_MM_ARG_END ,
633.BR PR_SET_MM_ENV_START ,
634and
635.BR PR_SET_MM_ENV_END
636should belong to a process stack area.
637Thus, the corresponding memory area must be readable, writable, and
638(depending on the kernel configuration) have the
639.BR MAP_GROWSDOWN
640attribute set (see
641.BR mmap (2)).
642.TP
643.BR PR_SET_MM_AUXV
644Set a new auxiliary vector.
645The
646.I arg3
647argument should provide the address of the vector.
648The
649.I arg4
650argument is the size of the vector.
651.TP
652.BR PR_SET_MM_EXE_FILE
653.\" commit b32dfe377102ce668775f8b6b1461f7ad428f8b6
654Supersede the
655.IR /proc/[pid]/exe
656symbolic link with a new one pointing to a new executable file
657identified by the file descriptor provided in the
658.I arg3
659argument.
660The file descriptor should be obtained with a regular
661.BR open (2)
662call.
663.IP
664To change the symbolic link, one needs to unmap all existing
665executable memory areas, including those created by the kernel itself
666(for example the kernel usually creates at least one executable
667memory area for the ELF
668.IR \.text
669section).
670.IP
642df17c 671In Linux 4.9 and earlier, the
47bc9cec 672.\" commit 3fb4afd9a504c2386b8435028d43283216bf588e
47bc9cec 673.BR PR_SET_MM_EXE_FILE
642df17c
MK
674operation can be performed only once in a process's lifetime;
675attempting to perform the operation a second time results in the error
676.BR EPERM .
677This restriction was enforced for security reasons that were subsequently
678deemed specious,
679and the restriction was removed in Linux 4.10 because some
680user-space applications needed to perform this operation more than once.
11ac5b51 681.PP
7e3236a5
MF
682The following options are available since Linux 3.18.
683.\" commit f606b77f1a9e362451aca8f81d8f36a3a112139e
684.TP
685.BR PR_SET_MM_MAP
686Provides one-shot access to all the addresses by passing in a
687.I struct prctl_mm_map
688(as defined in \fI<linux/prctl.h>\fP).
689The
690.I arg4
691argument should provide the size of the struct.
efeece04 692.IP
7e3236a5
MF
693This feature is available only if the kernel is built with the
694.BR CONFIG_CHECKPOINT_RESTORE
695option enabled.
696.TP
697.BR PR_SET_MM_MAP_SIZE
698Returns the size of the
699.I struct prctl_mm_map
700the kernel expects.
701This allows user space to find a compatible struct.
702The
703.I arg3
704argument should be a pointer to an unsigned int.
efeece04 705.IP
7e3236a5
MF
706This feature is available only if the kernel is built with the
707.BR CONFIG_CHECKPOINT_RESTORE
708option enabled.
03547431
MK
709.RE
710.TP
711.BR PR_MPX_ENABLE_MANAGEMENT ", " PR_MPX_DISABLE_MANAGEMENT " (since Linux 3.19) "
712.\" commit fe3d197f84319d3bce379a9c0dc17b1f48ad358c
713.\" See also http://lwn.net/Articles/582712/
714.\" See also https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler
715Enable or disable kernel management of Memory Protection eXtensions (MPX)
716bounds tables.
717The
718.IR arg2 ,
719.IR arg3 ,
720.IR arg4 ,
721and
722.IR arg5
723.\" commit e9d1b4f3c60997fe197bf0243cb4a41a44387a88
724arguments must be zero.
efeece04 725.IP
03547431
MK
726MPX is a hardware-assisted mechanism for performing bounds checking on
727pointers.
728It consists of a set of registers storing bounds information
729and a set of special instruction prefixes that tell the CPU on which
730instructions it should do bounds enforcement.
731There is a limited number of these registers and
732when there are more pointers than registers,
733their contents must be "spilled" into a set of tables.
734These tables are called "bounds tables" and the MPX
735.BR prctl ()
736operations control
737whether the kernel manages their allocation and freeing.
efeece04 738.IP
03547431
MK
739When management is enabled, the kernel will take over allocation
740and freeing of the bounds tables.
741It does this by trapping the #BR exceptions that result
742at first use of missing bounds tables and
743instead of delivering the exception to user space,
744it allocates the table and populates the bounds directory
745with the location of the new table.
746For freeing, the kernel checks to see if bounds tables are
747present for memory which is not allocated, and frees them if so.
efeece04 748.IP
03547431
MK
749Before enabling MPX management using
750.BR PR_MPX_ENABLE_MANAGEMENT ,
751the application must first have allocated a user-space buffer for
752the bounds directory and placed the location of that directory in the
753.I bndcfgu
754register.
efeece04 755.IP
a23d8efa 756These calls fail if the CPU or kernel does not support MPX.
03547431
MK
757Kernel support for MPX is enabled via the
758.BR CONFIG_X86_INTEL_MPX
759configuration option.
760You can check whether the CPU supports MPX by looking for the 'mpx'
761CPUID bit, for example with the following command:
efeece04 762.IP
e256205a
MK
763.in +4n
764.EX
765grep ' mpx ' /proc/cpuinfo
766.EE
767.in
efeece04 768.IP
03547431
MK
769A thread may not switch in or out of long (64-bit) mode while MPX is
770enabled.
efeece04 771.IP
03547431 772All threads in a process are affected by these calls.
efeece04 773.IP
03547431
MK
774The child of a
775.BR fork (2)
776inherits the state of MPX management.
777During
778.BR execve (2),
779MPX management is reset to a state as if
780.BR PR_MPX_DISABLE_MANAGEMENT
781had been called.
efeece04 782.IP
03547431
MK
783For further information on Intel MPX, see the kernel source file
784.IR Documentation/x86/intel_mpx.txt .
785.TP
786.BR PR_SET_NAME " (since Linux 2.6.9)"
787Set the name of the calling thread,
788using the value in the location pointed to by
789.IR "(char\ *) arg2" .
790The name can be up to 16 bytes long,
791.\" TASK_COMM_LEN in include/linux/sched.h
792including the terminating null byte.
793(If the length of the string, including the terminating null byte,
794exceeds 16 bytes, the string is silently truncated.)
795This is the same attribute that can be set via
796.BR pthread_setname_np (3)
797and retrieved using
798.BR pthread_getname_np (3).
799The attribute is likewise accessible via
800.IR /proc/self/task/[tid]/comm ,
801where
802.I tid
803is the thread ID of the calling thread.
804.TP
805.BR PR_GET_NAME " (since Linux 2.6.11)"
806Return the name of the calling thread,
807in the buffer pointed to by
808.IR "(char\ *) arg2" .
809The buffer should allow space for up to 16 bytes;
810the returned string will be null-terminated.
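.IP
A minimal sketch of setting and reading back the thread name follows.
The helper names and the THREAD_NAME_LEN macro are hypothetical; the
value 16 is the buffer size stated above.

```c
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical constant: 16 bytes, including the terminating null byte. */
#define THREAD_NAME_LEN 16

/* Sets the calling thread's name; longer strings are silently truncated.
   Returns 0 on success, -1 on error. */
int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long) name, 0UL, 0UL, 0UL);
}

/* Copies the calling thread's name into buf, which must have room for
   THREAD_NAME_LEN bytes. Returns 0 on success, -1 on error. */
int get_thread_name(char *buf)
{
    memset(buf, 0, THREAD_NAME_LEN);
    return prctl(PR_GET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}
```

Because of the truncation, a name longer than 15 characters reads back
as only its first 15 characters.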
811.TP
812.BR PR_SET_NO_NEW_PRIVS " (since Linux 3.5)"
40dfb5ba 813Set the calling thread's
03547431 814.I no_new_privs
fdda9363 815attribute to the value in
03547431
MK
816.IR arg2 .
817With
818.I no_new_privs
819set to 1,
820.BR execve (2)
821promises not to grant privileges to do anything
822that could not have been done without the
823.BR execve (2)
824call (for example,
825rendering the set-user-ID and set-group-ID mode bits,
826and file capabilities non-functional).
97caa19c 827Once set, the
fdda9363
MK
828.I no_new_privs
829attribute cannot be unset.
830The setting of this attribute is inherited by children created by
03547431
MK
831.BR fork (2)
832and
833.BR clone (2),
834and preserved across
835.BR execve (2).
efeece04 836.IP
c70fea6e
MK
837Since Linux 4.10,
838the value of a thread's
839.I no_new_privs
fdda9363 840attribute can be viewed via the
c70fea6e
MK
841.I NoNewPrivs
842field in the
843.IR /proc/[pid]/status
844file.
efeece04 845.IP
03547431 846For more information, see the kernel source file
a84a5830
ES
847.IR Documentation/userspace\-api/no_new_privs.rst
848.\" commit 40fde647ccb0ae8c11d256d271e24d385eed595b
849(or
850.IR Documentation/prctl/no_new_privs.txt
851before Linux 4.13).
4d850396
MK
852See also
853.BR seccomp (2).
03547431
MK
854.TP
855.BR PR_GET_NO_NEW_PRIVS " (since Linux 3.5)"
856Return (as the function result) the value of the
857.I no_new_privs
fdda9363 858attribute for the calling thread.
03547431
MK
859A value of 0 indicates the regular
860.BR execve (2)
861behavior.
862A value of 1 indicates
863.BR execve (2)
864will operate in the privilege-restricting mode described above.
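.IP
The one-way nature of the attribute can be sketched as follows.
The helper name enable_no_new_privs() is hypothetical; the sketch only sets the attribute and reads it back, and passes the unused arguments as 0UL because the kernel rejects nonzero values there.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the calling thread and
   return the value reported by PR_GET_NO_NEW_PRIVS (1 on success),
   or -1 with errno set if either prctl() call fails. */
int
enable_no_new_privs(void)
{
    int value;

    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;

    value = prctl(PR_GET_NO_NEW_PRIVS, 0UL, 0UL, 0UL, 0UL);

    /* Once set, the attribute cannot be unset, so a successful query
       is expected to report 1 from here on. */
    assert(value == -1 || value == 1);
    return value;
}
```

Because the attribute cannot be unset, a later PR_SET_NO_NEW_PRIVS call with arg2 set to 0 is expected to fail rather than restore the regular execve(2) behavior.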
865.TP
866.BR PR_SET_PDEATHSIG " (since Linux 2.1.57)"
29b249db 867Set the parent-death signal
03547431
MK
868of the calling process to \fIarg2\fP (either a signal value
869in the range 1..maxsig, or 0 to clear).
870This is the signal that the calling process will get when its
871parent dies.
c5236575 872.IP
03547431
MK
873.IR Warning :
874.\" https://bugzilla.kernel.org/show_bug.cgi?id=43300
875the "parent" in this case is considered to be the
876.I thread
877that created this process.
878In other words, the signal will be sent when that thread terminates
879(via, for example,
880.BR pthread_exit (3)),
881rather than after all of the threads in the parent process terminate.
910b0689 882.IP
a32c96b8
MK
883The parent-death signal is sent upon subsequent termination of the parent
884thread and also upon termination of each subreaper process
885(see the description of
886.B PR_SET_CHILD_SUBREAPER
887above) to which the caller is subsequently reparented.
888If the parent thread and all ancestor subreapers have already terminated
889by the time of the
890.BR PR_SET_PDEATHSIG
891operation, then no parent-death signal is sent to the caller.
892.IP
a09b5995
MK
893The parent-death signal is process-directed (see
894.BR signal (7))
895and, if the child installs a handler using the
896.BR sigaction (2)
897.B SA_SIGINFO
898flag, the
899.I si_pid
900field of the
901.I siginfo_t
902argument of the handler contains the PID of the terminating parent.
903.IP
29b249db 904The parent-death signal setting is cleared for the child of a
910b0689
MK
905.BR fork (2).
906It is also
907(since Linux 2.4.36 / 2.6.23)
908.\" commit d2d56c5f51028cb9f3d800882eb6f4cbd3f9099f
909cleared when executing a set-user-ID or set-group-ID binary,
910or a binary that has associated capabilities (see
911.BR capabilities (7));
912otherwise, this value is preserved across
913.BR execve (2).
03547431
MK
914.TP
915.BR PR_GET_PDEATHSIG " (since Linux 2.3.15)"
916Return the current value of the parent process death signal,
917in the location pointed to by
918.IR "(int\ *) arg2" .
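.IP
Setting and querying the parent-death signal can be sketched as below.
The helper name set_parent_death_signal() is hypothetical.
In a real program the call would typically be made in the child right after fork(2), since the setting is cleared for the child of a fork(2).

```c
#include <assert.h>
#include <signal.h>
#include <sys/prctl.h>

/* Hypothetical helper: set the parent-death signal of the caller to sig
   (0 clears it) and return the value read back via PR_GET_PDEATHSIG,
   or -1 with errno set on error. */
int
set_parent_death_signal(int sig)
{
    int current = -1;

    assert(sig >= 0);

    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig, 0UL, 0UL, 0UL) == -1)
        return -1;

    /* PR_GET_PDEATHSIG stores its result through the pointer in arg2
       rather than returning it as the function result. */
    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &current,
              0UL, 0UL, 0UL) == -1)
        return -1;

    return current;
}
```

For example, set_parent_death_signal(SIGTERM) asks for SIGTERM when the creating thread terminates, and set_parent_death_signal(0) clears the setting again.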
919.TP
920.BR PR_SET_PTRACER " (since Linux 3.4)"
921.\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
922.\" commit bf06189e4d14641c0148bea16e9dd24943862215
923This is meaningful only when the Yama LSM is enabled and in mode 1
924("restricted ptrace", visible via
925.IR /proc/sys/kernel/yama/ptrace_scope ).
926When a "ptracer process ID" is passed in \fIarg2\fP,
927the caller is declaring that the ptracer process can
928.BR ptrace (2)
929the calling process as if it were a direct process ancestor.
930Each
931.B PR_SET_PTRACER
932operation replaces the previous "ptracer process ID".
933Employing
934.B PR_SET_PTRACER
935with
936.I arg2
937set to 0 clears the caller's "ptracer process ID".
938If
939.I arg2
940is
941.BR PR_SET_PTRACER_ANY ,
942the ptrace restrictions introduced by Yama are effectively disabled for the
943calling process.
efeece04 944.IP
03547431 945For further information, see the kernel source file
6744a500
ES
946.IR Documentation/admin\-guide/LSM/Yama.rst
947.\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22
948(or
949.IR Documentation/security/Yama.txt
950before Linux 4.13).
03547431
MK
951.TP
952.BR PR_SET_SECCOMP " (since Linux 2.6.23)"
953.\" See http://thread.gmane.org/gmane.linux.kernel/542632
954.\" [PATCH 0 of 2] seccomp updates
955.\" andrea@cpushare.com
956Set the secure computing (seccomp) mode for the calling thread, to limit
957the available system calls.
958The more recent
959.BR seccomp (2)
960system call provides a superset of the functionality of
961.BR PR_SET_SECCOMP .
efeece04 962.IP
03547431
MK
963The seccomp mode is selected via
964.IR arg2 .
965(The seccomp constants are defined in
966.IR <linux/seccomp.h> .)
efeece04 967.IP
34447828 968With
8ab8b43f 969.IR arg2
34447828 970set to
b1248a9d 971.BR SECCOMP_MODE_STRICT ,
8ab8b43f
MK
972the only system calls that the thread is permitted to make are
973.BR read (2),
974.BR write (2),
85fbef74
MK
975.BR _exit (2)
976(but not
977.BR exit_group (2)),
fea681da 978and
8ab8b43f
MK
979.BR sigreturn (2).
980Other system calls result in the delivery of a
981.BR SIGKILL
982signal.
34447828 983Strict secure computing mode is useful for number-crunching applications
8ab8b43f
MK
984that may need to execute untrusted byte code,
985perhaps obtained by reading from a pipe or socket.
33a0ccb2 986This operation is available only
d6ef3d57
MK
987if the kernel is configured with
988.B CONFIG_SECCOMP
989enabled.
efeece04 990.IP
34447828
KC
991With
992.IR arg2
993set to
b1248a9d 994.BR SECCOMP_MODE_FILTER " (since Linux 3.5),"
6239dfb2
MK
995the system calls allowed are defined by a pointer
996to a Berkeley Packet Filter passed in
997.IR arg3 .
998This argument is a pointer to
999.IR "struct sock_fprog" ;
1000it can be designed to filter
d6ef3d57 1001arbitrary system calls and system call arguments.
33a0ccb2 1002This mode is available only if the kernel is configured with
d6ef3d57
MK
1003.B CONFIG_SECCOMP_FILTER
1004enabled.
efeece04 1005.IP
1733db35
MK
1006If
1007.BR SECCOMP_MODE_FILTER
1008filters permit
1009.BR fork (2),
990e3887 1010then the seccomp mode is inherited by children created by
1733db35
MK
1011.BR fork (2);
1012if
1013.BR execve (2)
fa1d2749 1014is permitted, then the seccomp mode is preserved across
1733db35
MK
1015.BR execve (2).
1016If the filters permit
a26ec136 1017.BR prctl ()
1733db35
MK
1018calls, then additional filters can be added;
1019they are run in order until the first non-allow result is seen.
efeece04 1020.IP
6239dfb2 1021For further information, see the kernel source file
28d96036
ES
1022.IR Documentation/userspace\-api/seccomp_filter.rst
1023.\" commit c061f33f35be0ccc80f4b8e0aea5dfd2ed7e01a3
1024(or
1025.IR Documentation/prctl/seccomp_filter.txt
1026before Linux 4.13).
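.IP
As an illustration of filter mode, the sketch below installs a tiny filter that makes getppid(2) fail with EPERM and allows every other system call.
It is a minimal example under stated assumptions, not a production filter: the helper name deny_getppid() is hypothetical, the choice of getppid(2) is arbitrary, and the filter omits the architecture check that a real filter should perform before trusting the system call number.
The no_new_privs attribute is set first so that the call does not require CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: install a seccomp filter for the calling thread.
   Returns 0 on success, or -1 with errno set on error. */
int
deny_getppid(void)
{
    struct sock_filter insns[] = {
        /* Load the system call number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* getppid: fall through to the ERRNO return;
           anything else: skip one instruction to the ALLOW return. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short) (sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    assert(sizeof(insns) / sizeof(insns[0]) <= BPF_MAXINSNS);

    /* Without CAP_SYS_ADMIN, no_new_privs must be set first. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;

    return prctl(PR_SET_SECCOMP, (unsigned long) SECCOMP_MODE_FILTER,
                 (unsigned long) &prog, 0UL, 0UL);
}
```

After a successful call, syscall(SYS_getppid) returns \-1 with errno set to EPERM, while other system calls proceed as before.
The filter cannot be removed once installed.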
8ab8b43f
MK
1027.TP
1028.BR PR_GET_SECCOMP " (since Linux 2.6.23)"
5e91816c
MK
1029Return (as the function result)
1030the secure computing mode of the calling thread.
34447828
KC
1031If the caller is not in secure computing mode, this operation returns 0;
1032if the caller is in strict secure computing mode, then the
8ab8b43f
MK
1033.BR prctl ()
1034call will cause a
1035.B SIGKILL
1036signal to be sent to the process.
d6ef3d57 1037If the caller is in filter mode, and this system call is allowed by the
8eeb062d
MK
1038seccomp filters, it returns 2; otherwise, the process is killed with a
1039.BR SIGKILL
1040signal.
33a0ccb2 1041This operation is available only
d6ef3d57
MK
1042if the kernel is configured with
1043.B CONFIG_SECCOMP
1044enabled.
efeece04 1045.IP
787843e7
MK
1046Since Linux 3.8, the
1047.IR Seccomp
1048field of the
1049.IR /proc/[pid]/status
1050file provides a method of obtaining the same information,
1051without the risk that the process is killed; see
1052.BR proc (5).
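.IP
Reading the Seccomp field avoids the risk of being killed by the query itself.
The following sketch parses the field for the calling process; the helper name seccomp_mode_from_proc() is hypothetical, and the return value \-1 is this sketch's own convention for "file or field not available".

```c
#include <stdio.h>

/* Hypothetical helper: return the seccomp mode of the caller (0, 1, or 2)
   as shown in the Seccomp field of /proc/self/status, or -1 if the file
   cannot be opened or the field is not present. */
int
seccomp_mode_from_proc(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    int mode = -1;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* The blank in the format matches the tab after the colon;
           a line such as "Seccomp_filters:" does not match. */
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }

    fclose(fp);
    return mode;
}
```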
88989295
MK
1053.TP
1054.BR PR_SET_SECUREBITS " (since Linux 2.6.26)"
1055Set the "securebits" flags of the calling thread to the value supplied in
03547431
MK
1056.IR arg2 .
1057See
1058.BR capabilities (7).
88989295 1059.TP
03547431
MK
1060.BR PR_GET_SECUREBITS " (since Linux 2.6.26)"
1061Return (as the function result)
1062the "securebits" flags of the calling thread.
1063See
1064.BR capabilities (7).
1065.TP
dd08fcca 1066.BR PR_GET_SPECULATION_CTRL " (since Linux 4.17)"
1cea09b3
MK
1067Return (as the function result)
1068the state of the speculation misfeature specified in
a01c1cbc
MK
1069.IR arg2 .
1070Currently, the only permitted value for this argument is
2feab5d3
MK
1071.BR PR_SPEC_STORE_BYPASS
1072(otherwise the call fails with the error
1073.BR ENODEV ).
1074.IP
1075The return value uses bits 0-3 with the following meaning:
e23acd79
KRW
1076.RS
1077.TP
1078.BR PR_SPEC_PRCTL
2feab5d3 1079Mitigation can be controlled per thread by
e23acd79
KRW
1080.B PR_SET_SPECULATION_CTRL
1081.TP
1082.BR PR_SPEC_ENABLE
1083The speculation feature is enabled, mitigation is disabled.
1084.TP
1085.BR PR_SPEC_DISABLE
1086The speculation feature is disabled, mitigation is enabled.
1087.TP
1088.BR PR_SPEC_FORCE_DISABLE
1089Same as
1090.B PR_SPEC_DISABLE
1091but cannot be undone.
1092.RE
1093.IP
2feab5d3 1094If all bits are 0,
e23acd79
KRW
1095then the CPU is not affected by the speculation misfeature.
1096.IP
1097If
1098.B PR_SPEC_PRCTL
2feab5d3 1099is set, then per-thread control of the mitigation is available.
ac3756bc 1100If not set,
e36dfb81 1101.BR prctl ()
e23acd79 1102for the speculation misfeature will fail.
a01c1cbc
MK
1103.IP
1104The
e36dfb81
MK
1105.IR arg3 ,
1106.IR arg4 ,
e23acd79
KRW
1107and
1108.I arg5
a01c1cbc 1109arguments must be specified as 0; otherwise the call fails with the error
e36dfb81 1110.BR EINVAL .
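.IP
The bit layout of the return value can be decoded with a few small predicates.
This is a sketch only: the function names are hypothetical, and it assumes that <sys/prctl.h> provides the PR_SPEC_* constants (kernel headers from Linux 4.17 or later).

```c
#include <sys/prctl.h>

/* Hypothetical helper: query the store-bypass state of the calling
   thread. Returns the state bits, or -1 with errno set on error. */
int
query_store_bypass(void)
{
    return prctl(PR_GET_SPECULATION_CTRL,
                 (unsigned long) PR_SPEC_STORE_BYPASS, 0UL, 0UL, 0UL);
}

/* Nonzero if per-thread control via PR_SET_SPECULATION_CTRL is possible. */
int
per_thread_control_available(unsigned long state)
{
    return (state & PR_SPEC_PRCTL) != 0;
}

/* Nonzero if the speculation feature is disabled, that is, the
   mitigation is enabled (whether or not that can be undone). */
int
mitigation_enabled(unsigned long state)
{
    return (state & (PR_SPEC_DISABLE | PR_SPEC_FORCE_DISABLE)) != 0;
}

/* Nonzero if any bit is set; all bits 0 means the CPU is not affected. */
int
cpu_affected(unsigned long state)
{
    return state != 0;
}
```

The predicates expect a nonnegative value from query_store_bypass(); a result of \-1 should be handled as an error before decoding.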
e23acd79 1111.TP
dd08fcca
MK
1112.BR PR_SET_SPECULATION_CTRL " (since Linux 4.17)"
1113.\" commit b617cfc858161140d69cc0b5cc211996b557a1c7
1114.\" commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee
a01c1cbc
MK
1115Set the state of the speculation misfeature specified in
1116.IR arg2 .
1117Currently, the only permitted value for this argument is
2feab5d3
MK
1118.B PR_SPEC_STORE_BYPASS
1119(otherwise the call fails with the error
1120.BR ENODEV ).
a01c1cbc 1121This setting is a per-thread attribute.
ac3756bc 1122The
e23acd79 1123.IR arg3
a01c1cbc
MK
1124argument is used to hand in the control value,
1125which is one of the following:
e23acd79
KRW
1126.RS
1127.TP
1128.BR PR_SPEC_ENABLE
1129The speculation feature is enabled, mitigation is disabled.
1130.TP
1131.BR PR_SPEC_DISABLE
1132The speculation feature is disabled, mitigation is enabled.
1133.TP
1134.BR PR_SPEC_FORCE_DISABLE
1135Same as
1136.B PR_SPEC_DISABLE
ac3756bc
MK
1137but cannot be undone.
1138A subsequent
e23acd79
KRW
1139.B
1140prctl(..., PR_SPEC_ENABLE)
2feab5d3 1141will fail with the error
e36dfb81 1142.BR EPERM .
e23acd79
KRW
1143.RE
1144.IP
1145Any other value in
1146.IR arg3
2feab5d3 1147will result in the call failing with the error
e23acd79 1148.BR ERANGE .
a01c1cbc
MK
1149.IP
1150The
2feab5d3 1151.I arg4
e23acd79
KRW
1152and
1153.I arg5
a01c1cbc 1154arguments must be specified as 0; otherwise the call fails with the error
e36dfb81 1155.BR EINVAL .
e23acd79 1156.IP
a01c1cbc
MK
1157The speculation feature can also be controlled by the
1158.B spec_store_bypass_disable
1159boot parameter.
1160This parameter may enforce a read-only policy which will result in the
549597a8 1161.BR prctl ()
a01c1cbc 1162call failing with the error
e23acd79 1163.BR ENXIO .
a01c1cbc
MK
1164For further details, see the kernel source file
1165.IR Documentation/admin\-guide/kernel\-parameters.txt .
e23acd79 1166.TP
03547431
MK
1167.BR PR_SET_THP_DISABLE " (since Linux 3.15)"
1168.\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
1169Set the state of the "THP disable" flag for the calling thread.
1170If
1171.I arg2
1172has a nonzero value, the flag is set, otherwise it is cleared.
1173Setting this flag provides a method
1174for disabling transparent huge pages
1175for jobs where the code cannot be modified, and using a malloc hook with
1176.BR madvise (2)
1177is not an option (i.e., statically allocated data).
1178The setting of the "THP disable" flag is inherited by a child created via
1179.BR fork (2)
1180and is preserved across
1181.BR execve (2).
1182.\"
06afe673
MK
1183.TP
1184.BR PR_TASK_PERF_EVENTS_DISABLE " (since Linux 2.6.31)"
1185Disable all performance counters attached to the calling process,
1186regardless of whether the counters were created by
1187this process or another process.
1188Performance counters created by the calling process for other
1189processes are unaffected.
66a9882e 1190For more information on performance counters, see the Linux kernel source file
06afe673
MK
1191.IR tools/perf/design.txt .
1192.IP
03547431
MK
1193Originally called
1194.BR PR_TASK_PERF_COUNTERS_DISABLE ;
1195.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
b0ea1ea3 1196renamed (retaining the same numerical value)
03547431
MK
1197in Linux 2.6.32.
1198.\"
03979794 1199.TP
03547431
MK
1200.BR PR_TASK_PERF_EVENTS_ENABLE " (since Linux 2.6.31)"
1201The converse of
1202.BR PR_TASK_PERF_EVENTS_DISABLE ;
1203enable performance counters attached to the calling process.
1204.IP
1205Originally called
1206.BR PR_TASK_PERF_COUNTERS_ENABLE ;
1207.\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
1208renamed
1209.\" commit cdd6c482c9ff9c55475ee7392ec8f672eddb7be6
1210in Linux 2.6.32.
1211.\"
1212.TP
1213.BR PR_GET_THP_DISABLE " (since Linux 3.15)"
035a7bf1 1214Return (as the function result) the current setting of the "THP disable"
03547431
MK
1215flag for the calling thread:
1216either 1, if the flag is set, or 0, if it is not.
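.IP
Together with PR_SET_THP_DISABLE, the flag can be toggled and read back as in the following sketch.
The helper name set_thp_disable() is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (disable != 0) or clear (disable == 0) the
   "THP disable" flag, then return the value reported by
   PR_GET_THP_DISABLE (1 or 0), or -1 with errno set on error. */
int
set_thp_disable(int disable)
{
    if (prctl(PR_SET_THP_DISABLE, disable ? 1UL : 0UL,
              0UL, 0UL, 0UL) == -1)
        return -1;

    /* The unused arguments are passed explicitly as zero. */
    return prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);
}
```

A wrapper program could call set_thp_disable(1) before executing an unmodified job, relying on the flag being preserved across execve(2).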
1217.TP
1218.BR PR_GET_TID_ADDRESS " (since Linux 3.5)"
1219.\" commit 300f786b2683f8bb1ec0afb6e1851183a479c86d
f1ba3ad2 1220Return the
03547431
MK
1221.I clear_child_tid
1222address set by
1223.BR set_tid_address (2)
1224and the
1225.BR clone (2)
1226.B CLONE_CHILD_CLEARTID
1227flag, in the location pointed to by
1228.IR "(int\ **)\ arg2" .
1229This feature is available only if the kernel is built with the
1230.BR CONFIG_CHECKPOINT_RESTORE
c7f2f9ed
MK
1231option enabled.
1232Note that since the
1233.BR prctl ()
1234system call does not have a compat implementation for
1235the AMD64 x32 and MIPS n32 ABIs,
1236and the kernel writes out a pointer using the kernel's pointer size,
1237this operation expects a user-space buffer of 8 (not 4) bytes on these ABIs.
03547431
MK
1238.TP
1239.BR PR_SET_TIMERSLACK " (since Linux 2.6.28)"
1240.\" See https://lwn.net/Articles/369549/
1241.\" commit 6976675d94042fbd446231d1bd8b7de71a980ada
3780f8a5
MK
1242Each thread has two associated timer slack values:
1243a "default" value, and a "current" value.
1244This operation sets the "current" timer slack value for the calling thread.
c14f7930
YX
1245.I arg2
1246is an unsigned long value; the maximum "current" value is ULONG_MAX and
1247the minimum "current" value is 1.
3780f8a5
MK
1248If the nanosecond value supplied in
1249.IR arg2
1250is greater than zero, then the "current" value is set to this value.
03547431
MK
1251If
1252.I arg2
c14f7930 1253is equal to zero,
3780f8a5
MK
1254the "current" timer slack is reset to the
1255thread's "default" timer slack value.
efeece04 1256.IP
3780f8a5 1257The "current" timer slack is used by the kernel to group timer expirations
03547431
MK
1258for the calling thread that are close to one another;
1259as a consequence, timer expirations for the thread may be
1260up to the specified number of nanoseconds late (but will never expire early).
1261Grouping timer expirations can help reduce system power consumption
1262by minimizing CPU wake-ups.
efeece04 1263.IP
03547431
MK
1264The timer expirations affected by timer slack are those set by
1265.BR select (2),
1266.BR pselect (2),
1267.BR poll (2),
1268.BR ppoll (2),
1269.BR epoll_wait (2),
1270.BR epoll_pwait (2),
1271.BR clock_nanosleep (2),
1272.BR nanosleep (2),
1273and
1274.BR futex (2)
1275(and thus the library functions implemented via futexes, including
1276.\" List obtained by grepping for futex usage in glibc source
1277.BR pthread_cond_timedwait (3),
1278.BR pthread_mutex_timedlock (3),
1279.BR pthread_rwlock_timedrdlock (3),
1280.BR pthread_rwlock_timedwrlock (3),
1281and
1282.BR sem_timedwait (3)).
efeece04 1283.IP
03547431
MK
1284Timer slack is not applied to threads that are scheduled under
1285a real-time scheduling policy (see
1286.BR sched_setscheduler (2)).
efeece04 1287.IP
03547431 1288When a new thread is created,
3780f8a5 1289the two timer slack values are made the same as the "current" value
03547431 1290of the creating thread.
3780f8a5
MK
1291Thereafter, a thread can adjust its "current" timer slack value via
1292.BR PR_SET_TIMERSLACK .
1293The "default" value can't be changed.
03547431
MK
1294The timer slack values of
1295.IR init
1296(PID 1), the ancestor of all processes,
1297are 50,000 nanoseconds (50 microseconds).
c14f7930 1298The timer slack value is inherited by a child created via
0b9a7995 1299.BR fork (2),
c14f7930 1300and is preserved across
03547431 1301.BR execve (2).
efeece04 1302.IP
c1f78aba
MK
1303Since Linux 4.6, the "current" timer slack value of any process
1304can be examined and changed via the file
1305.IR /proc/[pid]/timerslack_ns .
1306See
1307.BR proc (5).
e81a96ec 1308.TP
03547431
MK
1309.BR PR_GET_TIMERSLACK " (since Linux 2.6.28)"
1310Return (as the function result)
3780f8a5 1311the "current" timer slack value of the calling thread.
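.IP
A round trip through both timer slack operations might look like the sketch below.
The helper name set_timer_slack_ns() is hypothetical.
Since the result of PR_GET_TIMERSLACK is delivered as the int function result, the sketch is only meaningful for slack values that fit in an int.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the "current" timer slack of the calling
   thread to slack_ns nanoseconds (0 resets it to the "default" value)
   and return the value reported by PR_GET_TIMERSLACK, or -1 with errno
   set if the set operation fails. */
int
set_timer_slack_ns(unsigned long slack_ns)
{
    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0UL, 0UL, 0UL) == -1)
        return -1;

    return prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}
```

For instance, set_timer_slack_ns(200000UL) allows timer expirations for the thread to be up to 200 microseconds late, and set_timer_slack_ns(0UL) restores the thread's "default" value.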
4bf25b89 1312.TP
d6bec36e
MK
1313.BR PR_SET_TIMING " (since Linux 2.6.0)"
1314.\" Precisely: Linux 2.6.0-test4
03547431
MK
1315Set whether to use (normal, traditional) statistical process timing or
1316accurate timestamp-based process timing, by passing
1317.B PR_TIMING_STATISTICAL
1318.\" 0
1319or
1320.B PR_TIMING_TIMESTAMP
1321.\" 1
1322to \fIarg2\fP.
1323.B PR_TIMING_TIMESTAMP
1324is not currently implemented
1325(attempting to set this mode will yield the error
1326.BR EINVAL ).
1327.\" PR_TIMING_TIMESTAMP doesn't do anything in 2.6.26-rc8,
1328.\" and looking at the patch history, it appears
1329.\" that it never did anything.
4bf25b89 1330.TP
d6bec36e
MK
1331.BR PR_GET_TIMING " (since Linux 2.6.0)"
1332.\" Precisely: Linux 2.6.0-test4
03547431
MK
1333Return (as the function result) which process timing method is currently
1334in use.
4bf25b89 1335.TP
03547431
MK
1336.BR PR_SET_TSC " (since Linux 2.6.26, x86 only)"
1337Set the state of the flag determining whether the timestamp counter
1338can be read by the process.
1339Pass
1340.B PR_TSC_ENABLE
1341to
1342.I arg2
1343to allow it to be read, or
1344.B PR_TSC_SIGSEGV
1345to generate a
1346.B SIGSEGV
1347when the process tries to read the timestamp counter.
4bf25b89 1348.TP
03547431
MK
1349.BR PR_GET_TSC " (since Linux 2.6.26, x86 only)"
1350Return the state of the flag determining whether the timestamp counter
1351can be read,
1352in the location pointed to by
1353.IR "(int\ *) arg2" .
1354.TP
1355.B PR_SET_UNALIGN
1356(Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
0e2c6b8c
ES
1357PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22;
1358.\" sh: 94ea5e449ae834af058ef005d16a8ad44fcf13d6
1359.\" tile: 2f9ac29eec71a696cb0dcc5fb82c0f8d4dac28c9
1360sh, since Linux 2.6.34; tile, since Linux 3.12)
03547431
MK
1361Set unaligned access control bits to \fIarg2\fP.
1362Pass
1363\fBPR_UNALIGN_NOPRINT\fP to silently fix up unaligned user accesses,
1364or \fBPR_UNALIGN_SIGBUS\fP to generate
1365.B SIGBUS
2da72a43
MK
1366on unaligned user access.
1367Alpha also supports an additional flag with the value
1368of 4 and no corresponding named constant,
1369which instructs the kernel not to fix up
0e2c6b8c 1370unaligned accesses (it is analogous to providing the
2da72a43
MK
1371.BR UAC_NOFIX
1372flag in the
1373.BR SSI_NVPAIRS
1374operation of the
1375.BR setsysinfo ()
1376system call on Tru64).
03547431
MK
1377.TP
1378.B PR_GET_UNALIGN
f1bb5798 1379(See
03547431 1380.B PR_SET_UNALIGN
f1bb5798 1381for information on versions and architectures.)
03547431 1382Return unaligned access control bits, in the location pointed to by
0e2c6b8c 1383.IR "(unsigned int\ *) arg2" .
308eb2f6 1384.TP
91e01506
MK
1385.BR PR_SET_IO_FLUSHER " (since Linux 5.6)"
1386If a user process is involved in the block layer or filesystem I/O path,
1387and can allocate memory while processing I/O requests, it must set
1388\fIarg2\fP to 1.
1389This will put the process in the IO_FLUSHER state,
1390which gives it special treatment so that it can make progress when allocating memory.
308eb2f6
MC
1391If \fIarg2\fP is 0, the process will clear the IO_FLUSHER state, and
1392the default behavior will be used.
4222606d 1393.IP
308eb2f6
MC
1394The calling process must have the
1395.BR CAP_SYS_RESOURCE
1396capability.
4222606d 1397.IP
3872a3d6
MK
1398.IR arg3 ,
1399.IR arg4 ,
1400and
1401.IR arg5
1402must be zero.
1403.IP
f3c29937
MK
1404The IO_FLUSHER state is inherited by a child process created via
1405.BR fork (2)
1406and is preserved across
1407.BR execve (2).
1408.IP
308eb2f6
MC
1409Examples of IO_FLUSHER applications are FUSE daemons, SCSI device
1410emulation daemons, and daemons that perform error handling like multipath
1411path recovery applications.
308eb2f6
MC
1412.TP
1413.BR PR_GET_IO_FLUSHER " (since Linux 5.6)"
b5b0b288
MK
1414Return (as the function result) the IO_FLUSHER state of the caller.
1415A value of 1 indicates that the caller is in the IO_FLUSHER state;
14160 indicates that the caller is not in the IO_FLUSHER state.
4222606d 1417.IP
308eb2f6
MC
1418The calling process must have the
1419.BR CAP_SYS_RESOURCE
1420capability.
3872a3d6
MK
1421.IP
1422.IR arg2 ,
1423.IR arg3 ,
1424.IR arg4 ,
1425and
1426.IR arg5
1427must be zero.
47297adb 1428.SH RETURN VALUE
8ab8b43f
MK
1429On success,
1430.BR PR_GET_DUMPABLE ,
7f5d8442 1431.BR PR_GET_FP_MODE ,
8ab8b43f 1432.BR PR_GET_KEEPCAPS ,
f83fe154 1433.BR PR_GET_NO_NEW_PRIVS ,
5745985f 1434.BR PR_GET_THP_DISABLE ,
8ab8b43f
MK
1435.BR PR_CAPBSET_READ ,
1436.BR PR_GET_TIMING ,
c42db321 1437.BR PR_GET_TIMERSLACK ,
8ab8b43f 1438.BR PR_GET_SECUREBITS ,
7f5d8442 1439.BR PR_GET_SPECULATION_CTRL ,
ed31c572 1440.BR PR_MCE_KILL_GET ,
0c3e75cb 1441.BR PR_CAP_AMBIENT + PR_CAP_AMBIENT_IS_SET ,
308eb2f6 1442.BR PR_GET_IO_FLUSHER ,
8ab8b43f
MK
1443and (if it returns)
1444.BR PR_GET_SECCOMP
2fda57bd 1445return the nonnegative values described above.
fea681da
MK
1446All other
1447.I option
1448values return 0 on success.
1449On error, \-1 is returned, and
1450.I errno
1451is set appropriately.
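.PP
The error convention can be sketched with a call that is expected to fail.
The helper name failing_prctl_errno() is hypothetical; it passes a null pointer to PR_GET_PDEATHSIG, which is an invalid address for the result, and reports the resulting errno value.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: make a prctl() call that should fail, print the
   error, and return the errno value (or 0 if the call succeeded). */
int
failing_prctl_errno(void)
{
    /* arg2 should point to an int; a null pointer is an invalid address. */
    if (prctl(PR_GET_PDEATHSIG, 0UL, 0UL, 0UL, 0UL) == -1) {
        /* Save errno before calling anything that might change it. */
        int saved_errno = errno;

        fprintf(stderr, "prctl(PR_GET_PDEATHSIG): %s\n",
                strerror(saved_errno));
        return saved_errno;
    }

    return 0;
}
```

Operations that return a meaningful function result on success, such as PR_GET_TIMERSLACK, need the same check against \-1 before the result is used.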
1452.SH ERRORS
1453.TP
0478944d
MK
1454.B EACCES
1455.I option
1456is
4ab9f1db
MK
1457.BR PR_SET_SECCOMP
1458and
1459.I arg2
1460is
1461.BR SECCOMP_MODE_FILTER ,
1462but the process does not have the
1463.BR CAP_SYS_ADMIN
1464capability or has not set the
1465.IR no_new_privs
1466attribute (see the discussion of
1467.BR PR_SET_NO_NEW_PRIVS
1468above).
1469.TP
1470.B EACCES
1471.I option
1472is
0478944d
MK
1473.BR PR_SET_MM ,
1474and
1475.I arg3
1476is
1477.BR PR_SET_MM_EXE_FILE ,
1478but the file is not executable.
1479.TP
1480.B EBADF
1481.I option
1482is
1483.BR PR_SET_MM ,
1484.I arg3
1485is
1486.BR PR_SET_MM_EXE_FILE ,
1487and the file descriptor passed in
1488.I arg4
1489is not valid.
1490.TP
1491.B EBUSY
1492.I option
1493is
1494.BR PR_SET_MM ,
1495.I arg3
1496is
1497.BR PR_SET_MM_EXE_FILE ,
1498and this is the second attempt to change the
1499.I /proc/[pid]/exe
1500symbolic link, which is prohibited.
1501.TP
8ab8b43f
MK
1502.B EFAULT
1503.I arg2
1504is an invalid address.
1505.TP
e35a0512
KC
1506.B EFAULT
1507.I option
1508is
1509.BR PR_SET_SECCOMP ,
1510.I arg2
1511is
1512.BR SECCOMP_MODE_FILTER ,
1513the system was built with
64c626f7 1514.BR CONFIG_SECCOMP_FILTER ,
e35a0512
KC
1515and
1516.I arg3
1517is an invalid address.
1518.TP
fea681da
MK
1519.B EINVAL
1520The value of
1521.I option