1 .\" Copyright (C) 1998 Andries Brouwer (aeb@cwi.nl)
2 .\" and Copyright (C) 2002, 2006, 2008, 2012, 2013 Michael Kerrisk <mtk.manpages@gmail.com>
3 .\" and Copyright Guillem Jover <guillem@hadrons.org>
4 .\" and Copyright (C) 2014 Dave Hansen / Intel
5 .\"
6 .\" %%%LICENSE_START(VERBATIM)
7 .\" Permission is granted to make and distribute verbatim copies of this
8 .\" manual provided the copyright notice and this permission notice are
9 .\" preserved on all copies.
10 .\"
11 .\" Permission is granted to copy and distribute modified versions of this
12 .\" manual under the conditions for verbatim copying, provided that the
13 .\" entire resulting derived work is distributed under the terms of a
14 .\" permission notice identical to this one.
15 .\"
16 .\" Since the Linux kernel and libraries are constantly changing, this
17 .\" manual page may be incorrect or out-of-date. The author(s) assume no
18 .\" responsibility for errors or omissions, or for damages resulting from
19 .\" the use of the information contained herein. The author(s) may not
20 .\" have taken the same level of care in the production of this manual,
21 .\" which is licensed free of charge, as they might when working
22 .\" professionally.
23 .\"
24 .\" Formatted or processed versions of this manual, if unaccompanied by
25 .\" the source, must acknowledge the copyright and authors of this work.
26 .\" %%%LICENSE_END
27 .\"
28 .\" Modified Thu Nov 11 04:19:42 MET 1999, aeb: added PR_GET_PDEATHSIG
29 .\" Modified 27 Jun 02, Michael Kerrisk
30 .\" Added PR_SET_DUMPABLE, PR_GET_DUMPABLE,
31 .\" PR_SET_KEEPCAPS, PR_GET_KEEPCAPS
32 .\" Modified 2006-08-30 Guillem Jover <guillem@hadrons.org>
33 .\" Updated Linux versions where the options were introduced.
34 .\" Added PR_SET_TIMING, PR_GET_TIMING, PR_SET_NAME, PR_GET_NAME,
35 .\" PR_SET_UNALIGN, PR_GET_UNALIGN, PR_SET_FPEMU, PR_GET_FPEMU,
36 .\" PR_SET_FPEXC, PR_GET_FPEXC
37 .\" 2008-04-29 Serge Hallyn, Document PR_CAPBSET_READ and PR_CAPBSET_DROP
38 .\" 2008-06-13 Erik Bosman, <ejbosman@cs.vu.nl>
39 .\" Document PR_GET_TSC and PR_SET_TSC.
40 .\" 2008-06-15 mtk, Document PR_SET_SECCOMP, PR_GET_SECCOMP
41 .\" 2009-10-03 Andi Kleen, document PR_MCE_KILL
42 .\" 2012-04 Cyrill Gorcunov, Document PR_SET_MM
43 .\" 2012-04-25 Michael Kerrisk, Document PR_TASK_PERF_EVENTS_DISABLE and
44 .\" PR_TASK_PERF_EVENTS_ENABLE
45 .\" 2012-09-20 Kees Cook, update PR_SET_SECCOMP for mode 2
46 .\" 2012-09-20 Kees Cook, document PR_SET_NO_NEW_PRIVS, PR_GET_NO_NEW_PRIVS
47 .\" 2012-10-25 Michael Kerrisk, Document PR_SET_TIMERSLACK and
48 .\" PR_GET_TIMERSLACK
49 .\" 2013-01-10 Kees Cook, document PR_SET_PTRACER
50 .\" 2012-02-04 Michael Kerrisk, document PR_{SET,GET}_CHILD_SUBREAPER
51 .\" 2014-11-10 Dave Hansen, document PR_MPX_{EN,DIS}ABLE_MANAGEMENT
52 .\"
53 .\"
54 .TH PRCTL 2 2015-05-07 "Linux" "Linux Programmer's Manual"
55 .SH NAME
56 prctl \- operations on a process
57 .SH SYNOPSIS
58 .nf
59 .B #include <sys/prctl.h>
60 .sp
61 .BI "int prctl(int " option ", unsigned long " arg2 ", unsigned long " arg3 ,
62 .BI " unsigned long " arg4 ", unsigned long " arg5 );
63 .fi
64 .SH DESCRIPTION
65 .BR prctl ()
66 is called with a first argument describing what to do
67 (with values defined in \fI<linux/prctl.h>\fP), and further
68 arguments with a significance depending on the first one.
69 The first argument can be:
70 .TP
71 .BR PR_CAPBSET_READ " (since Linux 2.6.25)"
72 Return (as the function result) 1 if the capability specified in
73 .I arg2
74 is in the calling thread's capability bounding set,
75 or 0 if it is not.
76 (The capability constants are defined in
77 .IR <linux/capability.h> .)
78 The capability bounding set dictates
79 whether the process can receive the capability through a
80 file's permitted capability set on a subsequent call to
81 .BR execve (2).
82
83 If the capability specified in
84 .I arg2
85 is not valid, then the call fails with the error
86 .BR EINVAL .
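.IP
As a minimal sketch of the check described above,
a program might wrap the call as follows.
The helper name cap_in_bounding_set is hypothetical
and is used here only for illustration.

```c
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper (not part of any API): returns 1 if cap is in
   the calling thread's capability bounding set, 0 if it is not, and
   -1 on error (for example, EINVAL for an invalid capability). */
int
cap_in_bounding_set(unsigned long cap)
{
    return prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);
}
```

.IP
A caller could, for example, pass CAP_NET_RAW and treat a result of 0
as meaning that the capability cannot be regained through a file's
permitted set on a later execve(2).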
87 .TP
88 .BR PR_CAPBSET_DROP " (since Linux 2.6.25)"
89 If the calling thread has the
90 .B CAP_SETPCAP
91 capability, then drop the capability specified by
92 .I arg2
93 from the calling thread's capability bounding set.
94 Any children of the calling thread will inherit the newly
95 reduced bounding set.
96
97 The call fails with the error:
98 .B EPERM
99 if the calling thread does not have the
100 .BR CAP_SETPCAP " capability;"
101 .BR EINVAL
102 if
103 .I arg2
104 does not represent a valid capability; or
105 .BR EINVAL
106 if file capabilities are not enabled in the kernel,
107 in which case bounding sets are not supported.
108 .TP
109 .BR PR_SET_CHILD_SUBREAPER " (since Linux 3.4)"
110 .\" commit ebec18a6d3aa1e7d84aab16225e87fd25170ec2b
111 If
112 .I arg2
113 is nonzero,
114 set the "child subreaper" attribute of the calling process;
115 if
116 .I arg2
117 is zero, unset the attribute.
118 When a process is marked as a child subreaper,
119 all of the children that it creates, and their descendants,
120 will be marked as having a subreaper.
121 In effect, a subreaper fulfills the role of
122 .BR init (1)
123 for its descendant processes.
124 Upon termination of a process
125 that is orphaned (i.e., its immediate parent has already terminated)
126 and marked as having a subreaper,
127 the nearest still living ancestor subreaper
128 will receive a
129 .BR SIGCHLD
130 signal and be able to
131 .BR wait (2)
132 on the process to discover its termination status.
133 .TP
134 .BR PR_GET_CHILD_SUBREAPER " (since Linux 3.4)"
135 Return the "child subreaper" setting of the caller,
136 in the location pointed to by
137 .IR "(int\ *) arg2" .
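.IP
The two subreaper operations differ in how they report the value:
the set operation takes a flag in arg2,
while the get operation writes through a pointer passed in arg2.
A minimal sketch, in which both helper names are hypothetical:

```c
#include <sys/prctl.h>

/* Hypothetical helper: mark (on != 0) or unmark (on == 0) the calling
   process as a child subreaper.  Returns 0 on success, -1 on error. */
int
set_child_subreaper(int on)
{
    return prctl(PR_SET_CHILD_SUBREAPER, (unsigned long) on,
                 0UL, 0UL, 0UL);
}

/* Hypothetical helper: store the current setting in *status.
   The value is delivered through the pointer, not as the
   function result.  Returns 0 on success, -1 on error. */
int
get_child_subreaper(int *status)
{
    return prctl(PR_GET_CHILD_SUBREAPER, (unsigned long) status,
                 0UL, 0UL, 0UL);
}
```

.IP
A service manager would typically set the attribute once at startup,
before creating any children, and then reap orphaned descendants
with wait(2).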
138 .TP
139 .BR PR_SET_DUMPABLE " (since Linux 2.3.20)"
140 Set the state of the "dumpable" flag,
141 which determines whether core dumps are produced for the calling process
142 upon delivery of a signal whose default behavior is to produce a core dump.
143
144 In kernels up to and including 2.6.12,
145 .I arg2
146 must be either 0
147 .RB ( SUID_DUMP_DISABLE ,
148 process is not dumpable) or 1
149 .RB ( SUID_DUMP_USER ,
150 process is dumpable).
151 Between kernels 2.6.13 and 2.6.17,
152 .\" commit abf75a5033d4da7b8a7e92321d74021d1fcfb502
153 the value 2 was also permitted,
154 which caused any binary which normally would not be dumped
155 to be dumped readable by root only;
156 for security reasons, this feature has been removed.
157 .\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=115270289030630&w=2
158 .\" Subject: Fix prctl privilege escalation (CVE-2006-2451)
159 .\" From: Marcel Holtmann <marcel () holtmann ! org>
160 .\" Date: 2006-07-12 11:12:00
161 (See also the description of
162 .I /proc/sys/fs/\:suid_dumpable
163 in
164 .BR proc (5).)
165
166 Normally, this flag is set to 1.
167 However, it is reset to the current value contained in the file
168 .IR /proc/sys/fs/\:suid_dumpable
169 (which by default has the value 0),
170 if any of the following attributes of the process
171 are changed by the operations listed below:
172 .\" See kernel/cred.c::commit_creds() (Linux 3.18 sources)
173 .RS
174 .IP * 3
175 The effective user or group ID is changed.
176 .IP *
177 The filesystem user or group ID is changed (see
178 .BR credentials (7)).
179 .IP *
180 The process's set of permitted capabilities (see
181 .BR capabilities (7))
182 is changed such that its new set of capabilities is
183 not a subset of its previous set of capabilities.
184 .RE
185 .IP
186 The operations that may trigger changes to the dumpable flag include:
187 .\" Look for uses of commit_creds() in the kernel source code
188 .RS
189 .IP * 3
190 execution
191 .RB ( execve (2))
192 of a set-user-ID or set-group-ID program,
193 or a program that has capabilities (see
194 .BR capabilities (7));
195 .IP *
196 .BR capset (2);
197 and
198 .IP *
199 system calls that change process credentials
200 .RB ( setuid (2),
201 .BR setgid (2),
202 .BR setresuid (2),
203 .BR setresgid (2),
204 .BR setgroups (2),
205 and so on).
206 .\" Also certain namespace operations;
207 .RE
208 .IP
209 Processes that are not dumpable cannot be attached via
210 .BR ptrace (2)
211 .BR PTRACE_ATTACH .
212 .TP
213 .BR PR_GET_DUMPABLE " (since Linux 2.3.20)"
214 Return (as the function result) the current state of the calling
215 process's dumpable flag.
216 .\" Since Linux 2.6.13, the dumpable flag can have the value 2,
217 .\" but in 2.6.13 PR_GET_DUMPABLE simply returns 1 if the dumpable
218 .\" flag has a nonzero value. This was fixed in 2.6.14.
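.IP
A process that handles secrets may want to clear its dumpable flag.
The following is a minimal sketch of that idea;
the helper name disable_core_dumps is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: clear the dumpable flag, then read it back.
   Returns the flag value now in effect (expected to be 0),
   or -1 on error. */
int
disable_core_dumps(void)
{
    if (prctl(PR_SET_DUMPABLE, 0UL, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
Because the flag can be reset by the credential changes listed above,
a program would need to make this call after such changes,
not before them.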
219 .TP
220 .BR PR_SET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
221 Set the endian-ness of the calling process to the value given
222 in \fIarg2\fP, which should be one of the following:
223 .\" Respectively 0, 1, 2
224 .BR PR_ENDIAN_BIG ,
225 .BR PR_ENDIAN_LITTLE ,
226 or
227 .B PR_ENDIAN_PPC_LITTLE
228 (PowerPC pseudo little endian).
229 .TP
230 .BR PR_GET_ENDIAN " (since Linux 2.6.18, PowerPC only)"
231 Return the endian-ness of the calling process,
232 in the location pointed to by
233 .IR "(int\ *) arg2" .
234 .TP
235 .BR PR_SET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
236 Set floating-point emulation control bits to \fIarg2\fP.
237 Pass
238 .B PR_FPEMU_NOPRINT
239 to silently emulate floating-point operations, or
240 .B PR_FPEMU_SIGFPE
241 to not emulate floating-point operations and send
242 .B SIGFPE
243 instead.
244 .TP
245 .BR PR_GET_FPEMU " (since Linux 2.4.18, 2.5.9, only on ia64)"
246 Return floating-point emulation control bits,
247 in the location pointed to by
248 .IR "(int\ *) arg2" .
249 .TP
250 .BR PR_SET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
251 Set floating-point exception mode to \fIarg2\fP.
252 Pass \fBPR_FP_EXC_SW_ENABLE\fP to use FPEXC for FP exception enables,
253 \fBPR_FP_EXC_DIV\fP for floating-point divide by zero,
254 \fBPR_FP_EXC_OVF\fP for floating-point overflow,
255 \fBPR_FP_EXC_UND\fP for floating-point underflow,
256 \fBPR_FP_EXC_RES\fP for floating-point inexact result,
257 \fBPR_FP_EXC_INV\fP for floating-point invalid operation,
258 \fBPR_FP_EXC_DISABLED\fP for FP exceptions disabled,
259 \fBPR_FP_EXC_NONRECOV\fP for async nonrecoverable exception mode,
260 \fBPR_FP_EXC_ASYNC\fP for async recoverable exception mode,
261 \fBPR_FP_EXC_PRECISE\fP for precise exception mode.
262 .TP
263 .BR PR_GET_FPEXC " (since Linux 2.4.21, 2.5.32, only on PowerPC)"
264 Return floating-point exception mode,
265 in the location pointed to by
266 .IR "(int\ *) arg2" .
267 .TP
268 .BR PR_SET_KEEPCAPS " (since Linux 2.2.18)"
269 Set the state of the thread's "keep capabilities" flag,
270 which determines whether the thread's permitted
271 capability set is cleared when a change is made to the thread's user IDs
272 such that the thread's real UID, effective UID, and saved set-user-ID
273 all become nonzero when at least one of them previously had the value 0.
274 By default, the permitted capability set is cleared when such a change is made;
275 setting the "keep capabilities" flag prevents it from being cleared.
276 .I arg2
277 must be either 0 (permitted capabilities are cleared)
278 or 1 (permitted capabilities are kept).
279 (A thread's
280 .I effective
281 capability set is always cleared when such a credential change is made,
282 regardless of the setting of the "keep capabilities" flag.)
283 The "keep capabilities" value will be reset to 0 on subsequent calls to
284 .BR execve (2).
285 .TP
286 .BR PR_GET_KEEPCAPS " (since Linux 2.2.18)"
287 Return (as the function result) the current state of the calling thread's
288 "keep capabilities" flag.
289 .TP
290 .BR PR_SET_NAME " (since Linux 2.6.9)"
291 Set the name of the calling thread,
292 using the value in the location pointed to by
293 .IR "(char\ *) arg2" .
294 The name can be up to 16 bytes long,
295 .\" TASK_COMM_LEN in include/linux/sched.h
296 including the terminating null byte.
297 (If the length of the string, including the terminating null byte,
298 exceeds 16 bytes, the string is silently truncated.)
299 This is the same attribute that can be set via
300 .BR pthread_setname_np (3)
301 and retrieved using
302 .BR pthread_getname_np (3).
303 The attribute is likewise accessible via
304 .IR /proc/self/task/[tid]/comm ,
305 where
306 .I tid
307 is the thread ID of the calling thread.
308 .TP
309 .BR PR_GET_NAME " (since Linux 2.6.11)"
310 Return the name of the calling thread,
311 in the buffer pointed to by
312 .IR "(char\ *) arg2" .
313 The buffer should allow space for up to 16 bytes;
314 the returned string will be null-terminated.
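.IP
The 16-byte limit and the silent truncation can be seen
by setting a name and reading it back.
This is a minimal sketch;
the helper name rename_thread and the macro THREAD_NAME_LEN
are hypothetical.

```c
#include <string.h>
#include <sys/prctl.h>

#define THREAD_NAME_LEN 16  /* the 16-byte limit, including the null byte */

/* Hypothetical helper: set the calling thread's name, then read it
   back into out, which must have room for THREAD_NAME_LEN bytes.
   Returns 0 on success, -1 on error. */
int
rename_thread(const char *name, char out[THREAD_NAME_LEN])
{
    if (prctl(PR_SET_NAME, (unsigned long) name, 0UL, 0UL, 0UL) == -1)
        return -1;
    memset(out, 0, THREAD_NAME_LEN);
    return prctl(PR_GET_NAME, (unsigned long) out, 0UL, 0UL, 0UL);
}
```

.IP
With a name longer than 15 characters,
the buffer is expected to hold only the first 15 characters
followed by the terminating null byte.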
315 .TP
316 .BR PR_SET_NO_NEW_PRIVS " (since Linux 3.5)"
317 Set the calling process's
318 .I no_new_privs
319 bit to the value in
320 .IR arg2 .
321 With
322 .I no_new_privs
323 set to 1,
324 .BR execve (2)
325 promises not to grant privileges to do anything
326 that could not have been done without the
327 .BR execve (2)
328 call (for example,
329 rendering the set-user-ID and set-group-ID mode bits,
330 and file capabilities non-functional).
331 Once set, this bit cannot be unset.
332 The setting of this bit is inherited by children created by
333 .BR fork (2)
334 and
335 .BR clone (2),
336 and preserved across
337 .BR execve (2).
338
339 For more information, see the kernel source file
340 .IR Documentation/prctl/no_new_privs.txt .
341 .TP
342 .BR PR_GET_NO_NEW_PRIVS " (since Linux 3.5)"
343 Return (as the function result) the value of the
344 .I no_new_privs
345 bit for the current process.
346 A value of 0 indicates the regular
347 .BR execve (2)
348 behavior.
349 A value of 1 indicates
350 .BR execve (2)
351 will operate in the privilege-restricting mode described above.
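.IP
A minimal sketch of setting the no_new_privs bit
and confirming the result follows.
The helper name enable_no_new_privs is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the caller and return
   the value read back (expected to be 1), or -1 on error.
   Once set, the bit cannot be cleared again. */
int
enable_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_GET_NO_NEW_PRIVS, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
Since the setting is inherited and preserved across execve(2),
a launcher could call this just before executing an untrusted program.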
352 .TP
353 .BR PR_SET_PDEATHSIG " (since Linux 2.1.57)"
354 Set the parent death signal
355 of the calling process to \fIarg2\fP (either a signal value
356 in the range 1..maxsig, or 0 to clear).
357 This is the signal that the calling process will get when its
358 parent dies.
359 This value is cleared for the child of a
360 .BR fork (2)
361 and (since Linux 2.4.36 / 2.6.23)
362 when executing a set-user-ID or set-group-ID binary,
363 or a binary that has associated capabilities (see
364 .BR capabilities (7)).
365 This value is preserved across
366 .BR execve (2).
367
368 .IR Warning :
369 .\" https://bugzilla.kernel.org/show_bug.cgi?id=43300
370 the "parent" in this case is considered to be the
371 .I thread
372 that created this process.
373 In other words, the signal will be sent when that thread terminates
374 (via, for example,
375 .BR pthread_exit (3)),
376 rather than after all of the threads in the parent process terminate.
377 .TP
378 .BR PR_GET_PDEATHSIG " (since Linux 2.3.15)"
379 Return the current value of the parent process death signal,
380 in the location pointed to by
381 .IR "(int\ *) arg2" .
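.IP
The set and get operations for the parent death signal can be combined
as in the following minimal sketch.
The helper name set_parent_death_signal is hypothetical.

```c
#include <signal.h>
#include <sys/prctl.h>

/* Hypothetical helper: request that sig be delivered when the parent
   dies (0 clears the setting), then return the value the kernel
   reports via PR_GET_PDEATHSIG, or -1 on error. */
int
set_parent_death_signal(int sig)
{
    int current = 0;

    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig,
              0UL, 0UL, 0UL) == -1)
        return -1;
    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &current,
              0UL, 0UL, 0UL) == -1)
        return -1;
    return current;
}
```

.IP
Because the setting is cleared for the child of a fork(2),
a child that wants the signal has to make this call itself
after it has been created.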
382 .TP
383 .BR PR_SET_PTRACER " (since Linux 3.4)"
384 .\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
385 .\" commit bf06189e4d14641c0148bea16e9dd24943862215
386 This is meaningful only when the Yama LSM is enabled and in mode 1
387 ("restricted ptrace", visible via
388 .IR /proc/sys/kernel/yama/ptrace_scope ).
389 When a "ptracer process ID" is passed in \fIarg2\fP,
390 the caller is declaring that the ptracer process can
391 .BR ptrace (2)
392 the calling process as if it were a direct process ancestor.
393 Each
394 .B PR_SET_PTRACER
395 operation replaces the previous "ptracer process ID".
396 Employing
397 .B PR_SET_PTRACER
398 with
399 .I arg2
400 set to 0 clears the caller's "ptracer process ID".
401 If
402 .I arg2
403 is
404 .BR PR_SET_PTRACER_ANY ,
405 the ptrace restrictions introduced by Yama are effectively disabled for the
406 calling process.
407
408 For further information, see the kernel source file
409 .IR Documentation/security/Yama.txt .
410 .TP
411 .BR PR_SET_SECCOMP " (since Linux 2.6.23)"
412 .\" See http://thread.gmane.org/gmane.linux.kernel/542632
413 .\" [PATCH 0 of 2] seccomp updates
414 .\" andrea@cpushare.com
415 Set the secure computing (seccomp) mode for the calling thread, to limit
416 the available system calls.
417 The more recent
418 .BR seccomp (2)
419 system call provides a superset of the functionality of
420 .BR PR_SET_SECCOMP .
421
422 The seccomp mode is selected via
423 .IR arg2 .
424 (The seccomp constants are defined in
425 .IR <linux/seccomp.h> .)
426
427 With
428 .IR arg2
429 set to
430 .BR SECCOMP_MODE_STRICT ,
431 the only system calls that the thread is permitted to make are
432 .BR read (2),
433 .BR write (2),
434 .BR _exit (2)
435 (but not
436 .BR exit_group (2)),
437 and
438 .BR sigreturn (2).
439 Other system calls result in the delivery of a
440 .BR SIGKILL
441 signal.
442 Strict secure computing mode is useful for number-crunching applications
443 that may need to execute untrusted byte code,
444 perhaps obtained by reading from a pipe or socket.
445 This operation is available only
446 if the kernel is configured with
447 .B CONFIG_SECCOMP
448 enabled.
449
450 With
451 .IR arg2
452 set to
453 .BR SECCOMP_MODE_FILTER " (since Linux 3.5),"
454 the system calls allowed are defined by a pointer
455 to a Berkeley Packet Filter passed in
456 .IR arg3 .
457 This argument is a pointer to
458 .IR "struct sock_fprog" ;
459 it can be designed to filter
460 arbitrary system calls and system call arguments.
461 This mode is available only if the kernel is configured with
462 .B CONFIG_SECCOMP_FILTER
463 enabled.
464
465 If
466 .BR SECCOMP_MODE_FILTER
467 filters permit
468 .BR fork (2),
469 then the seccomp mode is inherited by children created by
470 .BR fork (2);
471 if
472 .BR execve (2)
473 is permitted, then the seccomp mode is preserved across
474 .BR execve (2).
475 If the filters permit
476 .BR prctl ()
477 calls, then additional filters can be added;
478 they are run in order until the first non-allow result is seen.
479
480 For further information, see the kernel source file
481 .IR Documentation/prctl/seccomp_filter.txt .
482 .TP
483 .BR PR_GET_SECCOMP " (since Linux 2.6.23)"
484 Return (as the function result)
485 the secure computing mode of the calling thread.
486 If the caller is not in secure computing mode, this operation returns 0;
487 if the caller is in strict secure computing mode, then the
488 .BR prctl ()
489 call will cause a
490 .B SIGKILL
491 signal to be sent to the process.
492 If the caller is in filter mode, and this system call is allowed by the
493 seccomp filters, it returns 2; otherwise, the process is killed with a
494 .BR SIGKILL
495 signal.
496 This operation is available only
497 if the kernel is configured with
498 .B CONFIG_SECCOMP
499 enabled.
500
501 Since Linux 3.8, the
502 .IR Seccomp
503 field of the
504 .IR /proc/[pid]/status
505 file provides a method of obtaining the same information,
506 without the risk that the process is killed; see
507 .BR proc (5).
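.IP
The effect of strict mode can be observed safely from a parent process.
In the following minimal sketch,
a child enters strict mode and then makes a system call
that is not on the permitted list.
The function name strict_seccomp_kills_child is hypothetical,
and the sketch assumes that fork(2) is available to the caller.

```c
#include <signal.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <linux/seccomp.h>

/* Hypothetical demonstration: returns 1 if the child was killed by
   SIGKILL after entering strict mode, 0 if it ended any other way
   (for example, because seccomp is unavailable), and -1 if
   fork() or waitpid() failed. */
int
strict_seccomp_kills_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, (unsigned long) SECCOMP_MODE_STRICT,
                  0UL, 0UL, 0UL) == -1)
            _exit(2);           /* strict mode could not be entered */
        syscall(SYS_getpid);    /* not permitted in strict mode */
        syscall(SYS_exit, 3);   /* reached only if the child survived */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

.IP
The child uses the raw exit system call rather than the library
wrapper because, as noted above, exit_group(2) is not permitted
in strict mode.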
508 .TP
509 .BR PR_SET_SECUREBITS " (since Linux 2.6.26)"
510 Set the "securebits" flags of the calling thread to the value supplied in
511 .IR arg2 .
512 See
513 .BR capabilities (7).
514 .TP
515 .BR PR_GET_SECUREBITS " (since Linux 2.6.26)"
516 Return (as the function result)
517 the "securebits" flags of the calling thread.
518 See
519 .BR capabilities (7).
520 .TP
521 .BR PR_SET_THP_DISABLE " (since Linux 3.15)"
522 .\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
523 Set the state of the "THP disable" flag for the calling thread.
524 If
525 .I arg2
526 has a nonzero value, the flag is set, otherwise it is cleared.
527 Setting this flag provides a method
528 for disabling transparent huge pages
529 for jobs where the code cannot be modified, and using a malloc hook with
530 .BR madvise (2)
531 is not an option (i.e., statically allocated data).
532 The setting of the "THP disable" flag is inherited by a child created via
533 .BR fork (2)
534 and is preserved across
535 .BR execve (2).
536 .TP
537 .BR PR_GET_THP_DISABLE " (since Linux 3.15)"
538 Return (as the function result) the current setting of the "THP disable"
539 flag for the calling thread:
540 either 1, if the flag is set, or 0, if it is not.
541 .TP
542 .BR PR_GET_TID_ADDRESS " (since Linux 3.5)"
543 .\" commit 300f786b2683f8bb1ec0afb6e1851183a479c86d
544 Retrieve the
545 .I clear_child_tid
546 address set by
547 .BR set_tid_address (2)
548 and the
549 .BR clone (2)
550 .B CLONE_CHILD_CLEARTID
551 flag, in the location pointed to by
552 .IR "(int\ **)\ arg2" .
553 This feature is available only if the kernel is built with the
554 .BR CONFIG_CHECKPOINT_RESTORE
555 option enabled.
556 .TP
557 .BR PR_SET_TIMERSLACK " (since Linux 2.6.28)"
558 .\" See https://lwn.net/Articles/369549/
559 .\" commit 6976675d94042fbd446231d1bd8b7de71a980ada
560 Set the current timer slack for the calling thread to the nanosecond value
561 supplied in
562 .IR arg2 .
563 If
564 .I arg2
565 is less than or equal to zero,
566 .\" It seems that it's not possible to set the timer slack to zero;
567 .\" The minimum value is 1? Seems a little strange.
568 reset the current timer slack to the thread's default timer slack value.
569 The timer slack is used by the kernel to group timer expirations
570 for the calling thread that are close to one another;
571 as a consequence, timer expirations for the thread may be
572 up to the specified number of nanoseconds late (but will never expire early).
573 Grouping timer expirations can help reduce system power consumption
574 by minimizing CPU wake-ups.
575
576 The timer expirations affected by timer slack are those set by
577 .BR select (2),
578 .BR pselect (2),
579 .BR poll (2),
580 .BR ppoll (2),
581 .BR epoll_wait (2),
582 .BR epoll_pwait (2),
583 .BR clock_nanosleep (2),
584 .BR nanosleep (2),
585 and
586 .BR futex (2)
587 (and thus the library functions implemented via futexes, including
588 .\" List obtained by grepping for futex usage in glibc source
589 .BR pthread_cond_timedwait (3),
590 .BR pthread_mutex_timedlock (3),
591 .BR pthread_rwlock_timedrdlock (3),
592 .BR pthread_rwlock_timedwrlock (3),
593 and
594 .BR sem_timedwait (3)).
595
596 Timer slack is not applied to threads that are scheduled under
597 a real-time scheduling policy (see
598 .BR sched_setscheduler (2)).
599
600 Each thread has two associated timer slack values:
601 a "default" value, and a "current" value.
602 The current value is the one that governs grouping
603 of timer expirations.
604 When a new thread is created,
605 the two timer slack values are made the same as the current value
606 of the creating thread.
607 Thereafter, a thread can adjust its current timer slack value via
608 .BR PR_SET_TIMERSLACK
609 (the default value can't be changed).
610 The timer slack values of
611 .IR init
612 (PID 1), the ancestor of all processes,
613 are 50,000 nanoseconds (50 microseconds).
614 The timer slack values are preserved across
615 .BR execve (2).
616 .TP
617 .BR PR_GET_TIMERSLACK " (since Linux 2.6.28)"
618 Return (as the function result)
619 the current timer slack value of the calling thread.
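.IP
A minimal sketch of adjusting the current timer slack
and reading it back follows.
The helper name set_timer_slack is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the calling thread's current timer slack
   to slack_ns nanoseconds and return the value read back via
   PR_GET_TIMERSLACK, or -1 on error. */
int
set_timer_slack(unsigned long slack_ns)
{
    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}
```

.IP
For example, a thread that tolerates late wake-ups might request
100000 nanoseconds here,
trading timer accuracy for fewer CPU wake-ups.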
620 .TP
621 .BR PR_SET_TIMING " (since Linux 2.6.0-test4)"
622 Set whether to use (normal, traditional) statistical process timing or
623 accurate timestamp-based process timing, by passing
624 .B PR_TIMING_STATISTICAL
625 .\" 0
626 or
627 .B PR_TIMING_TIMESTAMP
628 .\" 1
629 to \fIarg2\fP.
630 .B PR_TIMING_TIMESTAMP
631 is not currently implemented
632 (attempting to set this mode will yield the error
633 .BR EINVAL ).
634 .\" PR_TIMING_TIMESTAMP doesn't do anything in 2.6.26-rc8,
635 .\" and looking at the patch history, it appears
636 .\" that it never did anything.
637 .TP
638 .BR PR_GET_TIMING " (since Linux 2.6.0-test4)"
639 Return (as the function result) which process timing method is currently
640 in use.
641 .TP
642 .BR PR_TASK_PERF_EVENTS_DISABLE " (since Linux 2.6.31)"
643 Disable all performance counters attached to the calling process,
644 regardless of whether the counters were created by
645 this process or another process.
646 Performance counters created by the calling process for other
647 processes are unaffected.
648 For more information on performance counters, see the Linux kernel source file
649 .IR tools/perf/design.txt .
650 .IP
651 Originally called
652 .BR PR_TASK_PERF_COUNTERS_DISABLE ;
653 .\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
654 renamed (with same numerical value)
655 in Linux 2.6.32.
656 .TP
657 .BR PR_TASK_PERF_EVENTS_ENABLE " (since Linux 2.6.31)"
658 The converse of
659 .BR PR_TASK_PERF_EVENTS_DISABLE ;
660 enable performance counters attached to the calling process.
661 .IP
662 Originally called
663 .BR PR_TASK_PERF_COUNTERS_ENABLE ;
664 .\" commit 1d1c7ddbfab358445a542715551301b7fc363e28
665 renamed
666 .\" commit cdd6c482c9ff9c55475ee7392ec8f672eddb7be6
667 in Linux 2.6.32.
668 .TP
669 .BR PR_SET_TSC " (since Linux 2.6.26, x86 only)"
670 Set the state of the flag determining whether the timestamp counter
671 can be read by the process.
672 Pass
673 .B PR_TSC_ENABLE
674 to
675 .I arg2
676 to allow it to be read, or
677 .B PR_TSC_SIGSEGV
678 to generate a
679 .B SIGSEGV
680 when the process tries to read the timestamp counter.
681 .TP
682 .BR PR_GET_TSC " (since Linux 2.6.26, x86 only)"
683 Return the state of the flag determining whether the timestamp counter
684 can be read,
685 in the location pointed to by
686 .IR "(int\ *) arg2" .
687 .TP
688 .B PR_SET_UNALIGN
689 (Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
690 PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22)
691 Set unaligned access control bits to \fIarg2\fP.
692 Pass
693 \fBPR_UNALIGN_NOPRINT\fP to silently fix up unaligned user accesses,
694 or \fBPR_UNALIGN_SIGBUS\fP to generate
695 .B SIGBUS
696 on unaligned user access.
697 .TP
698 .B PR_GET_UNALIGN
699 (see
700 .B PR_SET_UNALIGN
701 for information on versions and architectures)
702 Return unaligned access control bits, in the location pointed to by
703 .IR "(int\ *) arg2" .
704 .TP
705 .BR PR_MCE_KILL " (since Linux 2.6.32)"
706 Set the machine check memory corruption kill policy for the current thread.
707 If
708 .I arg2
709 is
710 .BR PR_MCE_KILL_CLEAR ,
711 clear the thread memory corruption kill policy and use the system-wide default.
712 (The system-wide default is defined by
713 .IR /proc/sys/vm/memory_failure_early_kill ;
714 see
715 .BR proc (5).)
716 If
717 .I arg2
718 is
719 .BR PR_MCE_KILL_SET ,
720 use a thread-specific memory corruption kill policy.
721 In this case,
722 .I arg3
723 defines whether the policy is
724 .I early kill
725 .RB ( PR_MCE_KILL_EARLY ),
726 .I late kill
727 .RB ( PR_MCE_KILL_LATE ),
728 or the system-wide default
729 .RB ( PR_MCE_KILL_DEFAULT ).
730 Early kill means that the thread receives a
731 .B SIGBUS
732 signal as soon as hardware memory corruption is detected inside
733 its address space.
734 In late kill mode, the process is killed only when it accesses a corrupted page.
735 See
736 .BR sigaction (2)
737 for more information on the
738 .BR SIGBUS
739 signal.
740 The policy is inherited by children.
741 The remaining unused
742 .BR prctl ()
743 arguments must be zero for future compatibility.
744 .TP
745 .BR PR_MCE_KILL_GET " (since Linux 2.6.32)"
746 Return the current per-process machine check kill policy.
747 All unused
748 .BR prctl ()
749 arguments must be zero.
750 .TP
751 .BR PR_SET_MM " (since Linux 3.3)"
752 .\" commit 028ee4be34a09a6d48bdf30ab991ae933a7bc036
753 Modify certain kernel memory map descriptor fields
754 of the calling process.
755 Usually these fields are set by the kernel and dynamic loader (see
756 .BR ld.so (8)
757 for more information) and a regular application should not use this feature.
758 However, there are cases, such as self-modifying programs,
759 where a program might find it useful to change its own memory map.
760 This feature is available only if the kernel is built with the
761 .BR CONFIG_CHECKPOINT_RESTORE
762 option enabled.
763 The calling process must have the
764 .BR CAP_SYS_RESOURCE
765 capability.
766 The value in
767 .I arg2
768 is one of the options below, while
769 .I arg3
770 provides a new value for the option.
771 .RS
772 .TP
773 .BR PR_SET_MM_START_CODE
774 Set the address above which the program text can run.
775 The corresponding memory area must be readable and executable,
776 but not writable or sharable (see
777 .BR mprotect (2)
778 and
779 .BR mmap (2)
780 for more information).
781 .TP
782 .BR PR_SET_MM_END_CODE
783 Set the address below which the program text can run.
784 The corresponding memory area must be readable and executable,
785 but not writable or sharable.
786 .TP
787 .BR PR_SET_MM_START_DATA
788 Set the address above which initialized and
789 uninitialized (bss) data are placed.
790 The corresponding memory area must be readable and writable,
791 but not executable or sharable.
792 .TP
793 .B PR_SET_MM_END_DATA
794 Set the address below which initialized and
795 uninitialized (bss) data are placed.
796 The corresponding memory area must be readable and writable,
797 but not executable or sharable.
798 .TP
799 .BR PR_SET_MM_START_STACK
800 Set the start address of the stack.
801 The corresponding memory area must be readable and writable.
802 .TP
803 .BR PR_SET_MM_START_BRK
804 Set the address above which the program heap can be expanded with the
805 .BR brk (2)
806 call.
807 The address must be greater than the ending address of
808 the current program data segment.
809 In addition, the combined size of the resulting heap and
810 the size of the data segment can't exceed the
811 .BR RLIMIT_DATA
812 resource limit (see
813 .BR setrlimit (2)).
814 .TP
815 .BR PR_SET_MM_BRK
816 Set the current
817 .BR brk (2)
818 value.
819 The requirements for the address are the same as for the
820 .BR PR_SET_MM_START_BRK
821 option.
822 .P
823 The following options are available since Linux 3.5.
824 .\" commit fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7
825 .TP
826 .BR PR_SET_MM_ARG_START
827 Set the address above which the program command line is placed.
828 .TP
829 .BR PR_SET_MM_ARG_END
830 Set the address below which the program command line is placed.
831 .TP
832 .BR PR_SET_MM_ENV_START
833 Set the address above which the program environment is placed.
834 .TP
835 .BR PR_SET_MM_ENV_END
836 Set the address below which the program environment is placed.
837 .IP
838 The address passed with
839 .BR PR_SET_MM_ARG_START ,
840 .BR PR_SET_MM_ARG_END ,
841 .BR PR_SET_MM_ENV_START ,
842 and
843 .BR PR_SET_MM_ENV_END
844 should belong to a process stack area.
845 Thus, the corresponding memory area must be readable, writable, and
846 (depending on the kernel configuration) have the
847 .BR MAP_GROWSDOWN
848 attribute set (see
849 .BR mmap (2)).
850 .TP
851 .BR PR_SET_MM_AUXV
852 Set a new auxiliary vector.
853 The
854 .I arg3
855 argument should provide the address of the vector.
856 The
857 .I arg4
858 argument is the size of the vector.
859 .TP
860 .BR PR_SET_MM_EXE_FILE
861 .\" commit b32dfe377102ce668775f8b6b1461f7ad428f8b6
862 Supersede the
863 .IR /proc/pid/exe
864 symbolic link with a new one pointing to a new executable file
865 identified by the file descriptor provided in the
866 .I arg3
867 argument.
868 The file descriptor should be obtained with a regular
869 .BR open (2)
870 call.
871 .IP
872 To change the symbolic link, one needs to unmap all existing
873 executable memory areas, including those created by the kernel itself
874 (for example the kernel usually creates at least one executable
875 memory area for the ELF
876 .IR \.text
877 section).
878 .IP
879 The second limitation is that such transitions can be done only once
880 in the lifetime of a process.
881 Any further attempts will be rejected.
882 This should help system administrators monitor unusual
883 symbolic-link transitions over all processes running on a system.
884 .RE
885 .TP
886 .BR PR_MPX_ENABLE_MANAGEMENT ", " PR_MPX_DISABLE_MANAGEMENT " (since Linux 3.19)"
887 .\" commit fe3d197f84319d3bce379a9c0dc17b1f48ad358c
888 .\" See also http://lwn.net/Articles/582712/
889 .\" See also https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler
890 Enable or disable kernel management of Memory Protection eXtensions (MPX)
891 bounds tables.
892 The
893 .IR arg2 ,
894 .IR arg3 ,
895 .IR arg4 ,
896 and
897 .IR arg5
898 .\" commit e9d1b4f3c60997fe197bf0243cb4a41a44387a88
899 arguments must be zero.
900
901 MPX is a hardware-assisted mechanism for performing bounds checking on
902 pointers.
903 It consists of a set of registers storing bounds information
904 and a set of special instruction prefixes that tell the CPU on which
905 instructions it should do bounds enforcement.
906 There is a limited number of these registers and
907 when there are more pointers than registers,
908 their contents must be "spilled" into a set of tables.
909 These tables are called "bounds tables" and the MPX
910 .BR prctl ()
911 operations control
912 whether the kernel manages their allocation and freeing.
913
914 When management is enabled, the kernel will take over allocation
915 and freeing of the bounds tables.
916 It does this by trapping the #BR exceptions that result
917 at first use of missing bounds tables and,
918 instead of delivering the exception to user space,
919 it allocates the table and populates the bounds directory
920 with the location of the new table.
921 For freeing, the kernel checks to see if bounds tables are
922 present for memory which is not allocated, and frees them if so.
923
924 Before enabling MPX management using
925 .BR PR_MPX_ENABLE_MANAGEMENT ,
926 the application must first have allocated a user-space buffer for
927 the bounds directory and placed the location of that directory in the
928 .I bndcfgu
929 register.
930
931 These calls will fail if the CPU or kernel does not support MPX.
932 Kernel support for MPX is enabled via the
933 .BR CONFIG_X86_INTEL_MPX
934 configuration option.
935 You can check whether the CPU supports MPX by looking for the 'mpx'
936 CPUID bit, for example with the following command:
937
938 grep -w mpx /proc/cpuinfo
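
The same check can be sketched in C.
This is a minimal example, not a definitive implementation:
the function names
.I line_has_flag
and
.I cpu_has_flag
are hypothetical, and the code assumes the x86 layout of
.IR /proc/cpuinfo ,
in which CPU features appear as whitespace-separated words on a line
beginning with "flags".

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if 'flag' occurs in 'line' as a whole word delimited by
   spaces, tabs, a newline, or the ends of the string; otherwise 0. */
int line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' ||
                   p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/* Return 1 if the first "flags" line of /proc/cpuinfo lists 'flag',
   0 if it does not (or no such line exists), -1 if the file cannot
   be opened. */
int cpu_has_flag(const char *flag)
{
    char line[8192];
    int found = 0;
    FILE *fp = fopen("/proc/cpuinfo", "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            found = line_has_flag(line, flag);
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Matching whole words avoids false positives from longer flag names
and still finds a flag that is the last word on the line.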
939
940 A thread may not switch in or out of long (64-bit) mode while MPX is
941 enabled.
942
943 All threads in a process are affected by these calls.
944
945 The child of a
946 .BR fork (2)
947 inherits the state of MPX management.
948 During
949 .BR execve (2),
950 MPX management is reset to a state as if
951 .BR PR_MPX_DISABLE_MANAGEMENT
952 had been called.
953
954 For further information on Intel MPX, see the kernel source file
955 .IR Documentation/x86/intel_mpx.txt .
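
The following minimal sketch shows one way a program might probe for
MPX management support at run time.
The helper names
.I mpx_enable_management
and
.I mpx_status_text
are hypothetical, and the fallback constant definitions assume the values in
.IR <linux/prctl.h> .
A real application would first have to set up the bounds directory
as described above; this sketch only illustrates the call and the
handling of its result.

```c
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for headers that lack these; values assumed from
   <linux/prctl.h>. */
#ifndef PR_MPX_ENABLE_MANAGEMENT
#define PR_MPX_ENABLE_MANAGEMENT 43
#endif
#ifndef PR_MPX_DISABLE_MANAGEMENT
#define PR_MPX_DISABLE_MANAGEMENT 44
#endif

/* Ask the kernel to manage MPX bounds tables.
   Returns 0 on success or the errno value on failure.
   All four remaining arguments must be zero. */
int mpx_enable_management(void)
{
    if (prctl(PR_MPX_ENABLE_MANAGEMENT, 0UL, 0UL, 0UL, 0UL) == -1)
        return errno;
    return 0;
}

/* Map the result of mpx_enable_management() to a short message. */
const char *mpx_status_text(int err)
{
    if (err == 0)
        return "MPX management enabled";
    if (err == ENXIO)
        return "kernel or CPU lacks MPX support";
    return strerror(err);   /* for example EINVAL: option not recognized */
}
```

A caller could print
.I mpx_status_text(mpx_enable_management())
and fall back to running without kernel-managed bounds tables
when the result is nonzero.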
956 .\"
957 .SH RETURN VALUE
958 On success,
959 .BR PR_GET_DUMPABLE ,
960 .BR PR_GET_KEEPCAPS ,
961 .BR PR_GET_NO_NEW_PRIVS ,
962 .BR PR_GET_THP_DISABLE ,
963 .BR PR_CAPBSET_READ ,
964 .BR PR_GET_TIMING ,
965 .BR PR_GET_TIMERSLACK ,
966 .BR PR_GET_SECUREBITS ,
967 .BR PR_MCE_KILL_GET ,
968 and (if it returns)
969 .BR PR_GET_SECCOMP
970 return the nonnegative values described above.
971 All other
972 .I option
973 values return 0 on success.
974 On error, \-1 is returned, and
975 .I errno
976 is set appropriately.
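.PP
The two return conventions can be sketched as follows.
This is an illustrative example only; the wrapper names
.I get_dumpable
and
.I set_thread_name
are hypothetical.
The first wraps an option whose successful result is a nonnegative value,
the second an option that returns 0 on success.

```c
#include <errno.h>
#include <sys/prctl.h>

/* PR_GET_DUMPABLE reports its answer in the return value:
   a nonnegative flag value on success, -1 (with errno set) on error. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}

/* PR_SET_NAME has nothing to report, so success is simply 0;
   on error the result is -1 and errno describes the failure. */
int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long) name, 0UL, 0UL, 0UL);
}
```

In both cases a result of \-1 is the only indication of failure, so
.I errno
should be examined only after seeing that value.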
977 .SH ERRORS
978 .TP
979 .B EFAULT
980 .I arg2
981 is an invalid address.
982 .TP
983 .B EFAULT
984 .I option
985 is
986 .BR PR_SET_SECCOMP ,
987 .I arg2
988 is
989 .BR SECCOMP_MODE_FILTER ,
990 the system was built with
991 .BR CONFIG_SECCOMP_FILTER ,
992 and
993 .I arg3
994 is an invalid address.
995 .TP
996 .B EINVAL
997 The value of
998 .I option
999 is not recognized.
1000 .TP
1001 .B EINVAL
1002 .I option
1003 is
1004 .BR PR_MCE_KILL
1005 or
1006 .BR PR_MCE_KILL_GET
1007 or
1008 .BR PR_SET_MM ,
1009 and unused
1010 .BR prctl ()
1011 arguments were not specified as zero.
1012 .TP
1013 .B EINVAL
1014 .I arg2
1015 is not a valid value for this
1016 .IR option .
1017 .TP
1018 .B EINVAL
1019 .I option
1020 is
1021 .BR PR_SET_SECCOMP
1022 or
1023 .BR PR_GET_SECCOMP ,
1024 and the kernel was not configured with
1025 .BR CONFIG_SECCOMP .
1026 .TP
1027 .B EINVAL
1028 .I option
1029 is
1030 .BR PR_SET_SECCOMP ,
1031 .I arg2
1032 is
1033 .BR SECCOMP_MODE_FILTER ,
1034 and the kernel was not configured with
1035 .BR CONFIG_SECCOMP_FILTER .
1036 .TP
1037 .B EINVAL
1038 .I option
1039 is
1040 .BR PR_SET_MM ,
1041 and one of the following is true:
1042 .RS
1043 .IP * 3
1044 .I arg4
1045 or
1046 .I arg5
1047 is nonzero;
1048 .IP *
1049 .I arg3
1050 is greater than
1051 .B TASK_SIZE
1052 (the limit on the size of the user address space for this architecture);
1053 .IP *
1054 .I arg2
1055 is
1056 .BR PR_SET_MM_START_CODE ,
1057 .BR PR_SET_MM_END_CODE ,
1058 .BR PR_SET_MM_START_DATA ,
1059 .BR PR_SET_MM_END_DATA ,
1060 or
1061 .BR PR_SET_MM_START_STACK ,
1062 and the permissions of the corresponding memory area are not as required;
1063 .IP *
1064 .I arg2
1065 is
1066 .BR PR_SET_MM_START_BRK
1067 or
1068 .BR PR_SET_MM_BRK ,
1069 and
1070 .I arg3
1071 is less than or equal to the end of the data segment
1072 or specifies a value that would cause the
1073 .B RLIMIT_DATA
1074 resource limit to be exceeded.
1075 .RE
1076 .TP
1077 .B EINVAL
1078 .I option
1079 is
1080 .BR PR_SET_PTRACER
1081 and
1082 .I arg2
1083 is not 0,
1084 .BR PR_SET_PTRACER_ANY ,
1085 or the PID of an existing process.
1086 .TP
1087 .B EINVAL
1088 .I option
1089 is
1090 .B PR_SET_PDEATHSIG
1091 and
1092 .I arg2
1093 is not a valid signal number.
1094 .TP
1095 .B EINVAL
1096 .I option
1097 is
1098 .BR PR_SET_DUMPABLE
1099 and
1100 .I arg2
1101 is neither
1102 .B SUID_DUMP_DISABLE
1103 nor
1104 .BR SUID_DUMP_USER .
1105 .TP
1106 .B EINVAL
1107 .I option
1108 is
1109 .BR PR_SET_TIMING
1110 and
1111 .I arg2
1112 is not
1113 .BR PR_TIMING_STATISTICAL .
1114 .TP
1115 .B EINVAL
1116 .I option
1117 is
1118 .BR PR_SET_NO_NEW_PRIVS
1119 and
1120 .I arg2
1121 is not equal to 1
1122 or
1123 .IR arg3 ,
1124 .IR arg4 ,
1125 or
1126 .IR arg5
1127 is nonzero.
1128 .TP
1129 .B EINVAL
1130 .I option
1131 is
1132 .BR PR_GET_NO_NEW_PRIVS
1133 and
1134 .IR arg2 ,
1135 .IR arg3 ,
1136 .IR arg4 ,
1137 or
1138 .IR arg5
1139 is nonzero.
1140 .TP
1141 .B EINVAL
1142 .I option
1143 is
1144 .BR PR_SET_THP_DISABLE
1145 and
1146 .IR arg3 ,
1147 .IR arg4 ,
1148 or
1149 .IR arg5
1150 is nonzero.
1151 .TP
1152 .B EINVAL
1153 .I option
1154 is
1155 .BR PR_GET_THP_DISABLE
1156 and
1157 .IR arg2 ,
1158 .IR arg3 ,
1159 .IR arg4 ,
1160 or
1161 .IR arg5
1162 is nonzero.
1163 .TP
1164 .B EPERM
1165 .I option
1166 is
1167 .BR PR_SET_SECUREBITS ,
1168 and the caller does not have the
1169 .B CAP_SETPCAP
1170 capability,
1171 or tried to unset a "locked" flag,
1172 or tried to set a flag whose corresponding locked flag was set
1173 (see
1174 .BR capabilities (7)).
1175 .TP
1176 .B EPERM
1177 .I option
1178 is
1179 .BR PR_SET_KEEPCAPS ,
1180 and the caller's
1181 .B SECURE_KEEP_CAPS_LOCKED
1182 flag is set
1183 (see
1184 .BR capabilities (7)).
1185 .TP
1186 .B EPERM
1187 .I option
1188 is
1189 .BR PR_CAPBSET_DROP ,
1190 and the caller does not have the
1191 .B CAP_SETPCAP
1192 capability.
1193 .TP
1194 .B EPERM
1195 .I option
1196 is
1197 .BR PR_SET_MM ,
1198 and the caller does not have the
1199 .B CAP_SYS_RESOURCE
1200 capability.
1201 .TP
1202 .B EACCES
1203 .I option
1204 is
1205 .BR PR_SET_MM ,
1206 .I arg2
1207 is
1208 .BR PR_SET_MM_EXE_FILE ,
1209 and the file
1210 is not executable.
1211 .TP
1212 .B EBUSY
1213 .I option
1214 is
1215 .BR PR_SET_MM ,
1216 .I arg2
1217 is
1218 .BR PR_SET_MM_EXE_FILE ,
1219 and this is the second attempt to change the
1220 .I /proc/pid/exe
1221 symbolic link, which is prohibited.
1222 .TP
1223 .B EBADF
1224 .I option
1225 is
1226 .BR PR_SET_MM ,
1227 .I arg2
1228 is
1229 .BR PR_SET_MM_EXE_FILE ,
1230 and the file descriptor passed in
1231 .I arg3
1232 is not valid.
1233 .\" The following can't actually happen, because prctl() in
1234 .\" seccomp mode will cause SIGKILL.
1235 .\" .TP
1236 .\" .B EPERM
1237 .\" .I option
1238 .\" is
1239 .\" .BR PR_SET_SECCOMP ,
1240 .\" and secure computing mode is already 1.
1241 .TP
1242 .B ENXIO
1243 .I option
1244 is
1245 .BR PR_MPX_ENABLE_MANAGEMENT
1246 or
1247 .BR PR_MPX_DISABLE_MANAGEMENT
1248 and the kernel or the CPU does not support MPX management.
1249 Check that the kernel and processor have MPX support.
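.PP
Several of the
.B EINVAL
cases above can be observed directly.
The following minimal sketch uses a hypothetical helper,
.IR prctl_errno ,
that reduces a
.BR prctl ()
call to either 0 or the resulting
.I errno
value; it is an illustration, not a recommended interface.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: perform one prctl() call and return 0 if it
 * did not fail, or the errno value if it returned -1.  Any
 * nonnegative result from prctl() is reported as 0 here.
 */
int prctl_errno(int option, unsigned long arg2, unsigned long arg3,
                unsigned long arg4, unsigned long arg5)
{
    if (prctl(option, arg2, arg3, arg4, arg5) == -1)
        return errno;
    return 0;
}
```

For example,
.I prctl_errno(PR_SET_PDEATHSIG, 100000, 0, 0, 0)
passes a value that is not a valid signal number,
.I prctl_errno(PR_SET_DUMPABLE, 2, 0, 0, 0)
passes a value that is neither
.B SUID_DUMP_DISABLE
nor
.BR SUID_DUMP_USER ,
and
.I prctl_errno(PR_GET_NO_NEW_PRIVS, 1, 0, 0, 0)
passes a nonzero
.IR arg2 ;
each is expected to yield
.BR EINVAL .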
1250 .SH VERSIONS
1251 The
1252 .BR prctl ()
1253 system call was introduced in Linux 2.1.57.
1254 .\" The library interface was added in glibc 2.0.6
1255 .SH CONFORMING TO
1256 This call is Linux-specific.
1257 IRIX has a
1258 .BR prctl ()
1259 system call (also introduced in Linux 2.1.44
1260 as irix_prctl on the MIPS architecture),
1261 with prototype
1262 .sp
1263 .BI "ptrdiff_t prctl(int " option ", int " arg2 ", int " arg3 );
1264 .sp
1265 and options to get the maximum number of processes per user,
1266 get the maximum number of processors the calling process can use,
1267 find out whether a specified process is currently blocked,
1268 get or set the maximum stack size, and so on.
1269 .SH SEE ALSO
1270 .BR signal (2),
1271 .BR core (5)