1 .\" Copyright (C) Tom Bjorkholm, Markus Kuhn & David A. Wheeler 1996-1999
2 .\" and Copyright (C) 2007 Carsten Emde <Carsten.Emde@osadl.org>
3 .\" and Copyright (C) 2008 Michael Kerrisk <mtk.manpages@gmail.com>
4 .\"
5 .\" %%%LICENSE_START(GPLv2+_doc_full)
6 .\" This is free documentation; you can redistribute it and/or
7 .\" modify it under the terms of the GNU General Public License as
8 .\" published by the Free Software Foundation; either version 2 of
9 .\" the License, or (at your option) any later version.
10 .\"
11 .\" The GNU General Public License's references to "object code"
12 .\" and "executables" are to be interpreted as the output of any
13 .\" document formatting or typesetting system, including
14 .\" intermediate and printed output.
15 .\"
16 .\" This manual is distributed in the hope that it will be useful,
17 .\" but WITHOUT ANY WARRANTY; without even the implied warranty of
18 .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
19 .\" GNU General Public License for more details.
20 .\"
21 .\" You should have received a copy of the GNU General Public
22 .\" License along with this manual; if not, see
23 .\" <http://www.gnu.org/licenses/>.
24 .\" %%%LICENSE_END
25 .\"
26 .\" 1996-04-01 Tom Bjorkholm <tomb@mydata.se>
27 .\" First version written
28 .\" 1996-04-10 Markus Kuhn <mskuhn@cip.informatik.uni-erlangen.de>
29 .\" revision
30 .\" 1999-08-18 David A. Wheeler <dwheeler@ida.org> added Note.
31 .\" Modified, 25 Jun 2002, Michael Kerrisk <mtk.manpages@gmail.com>
32 .\" Corrected description of queue placement by sched_setparam() and
33 .\" sched_setscheduler()
34 .\" A couple of grammar clean-ups
35 .\" Modified 2004-05-27 by Michael Kerrisk <mtk.manpages@gmail.com>
36 .\" 2005-03-23, mtk, Added description of SCHED_BATCH.
37 .\" 2007-07-10, Carsten Emde <Carsten.Emde@osadl.org>
38 .\" Add text on real-time features that are currently being
39 .\" added to the mainline kernel.
40 .\" 2008-05-07, mtk; Rewrote and restructured various parts of the page to
41 .\" improve readability.
42 .\" 2010-06-19, mtk, documented SCHED_RESET_ON_FORK
43 .\"
44 .\" Worth looking at: http://rt.wiki.kernel.org/index.php
45 .\"
46 .TH SCHED_SETSCHEDULER 2 2013-02-12 "Linux" "Linux Programmer's Manual"
47 .SH NAME
48 sched_setscheduler, sched_getscheduler \-
49 set and get scheduling policy/parameters
50 .SH SYNOPSIS
51 .nf
52 .B #include <sched.h>
53 .sp
54 .BI "int sched_setscheduler(pid_t " pid ", int " policy ,
55 .br
56 .BI " const struct sched_param *" param );
57 .sp
58 .BI "int sched_getscheduler(pid_t " pid );
59 .sp
60 \fBstruct sched_param {
61 ...
62 int \fIsched_priority\fB;
63 ...
64 };
65 .fi
66 .SH DESCRIPTION
67 .BR sched_setscheduler ()
68 sets both the scheduling policy and the associated parameters for the
69 process whose ID is specified in \fIpid\fP.
70 If \fIpid\fP equals zero, the
71 scheduling policy and parameters of the calling process will be set.
72 The interpretation of
73 the argument \fIparam\fP depends on the selected policy.
74 Currently, Linux supports the following "normal"
75 (i.e., non-real-time) scheduling policies:
76 .TP 14
77 .BR SCHED_OTHER
78 the standard round-robin time-sharing policy;
79 .\" In the 2.6 kernel sources, SCHED_OTHER is actually called
80 .\" SCHED_NORMAL.
81 .TP
82 .BR SCHED_BATCH
83 for "batch" style execution of processes; and
84 .TP
85 .BR SCHED_IDLE
86 for running
87 .I very
88 low priority background jobs.
89 .PP
90 The following "real-time" policies are also supported,
91 for special time-critical applications that need precise control over
92 the way in which runnable processes are selected for execution:
93 .TP 14
94 .BR SCHED_FIFO
95 a first-in, first-out policy; and
96 .TP
97 .BR SCHED_RR
98 a round-robin policy.
99 .PP
100 The semantics of each of these policies are detailed below.
101
102 .BR sched_getscheduler ()
103 queries the scheduling policy currently applied to the process
104 identified by \fIpid\fP.
105 If \fIpid\fP equals zero, the policy of the
106 calling process will be retrieved.
107 .\"
108 .SS Scheduling policies
109 The scheduler is the kernel component that decides which runnable process
110 will be executed by the CPU next.
111 Each process has an associated scheduling policy and a \fIstatic\fP
112 scheduling priority, \fIsched_priority\fP; these are the settings
113 that are modified by
114 .BR sched_setscheduler ().
115 The scheduler makes its decisions based on knowledge of the scheduling
116 policy and static priority of all processes on the system.
117
118 For processes scheduled under one of the normal scheduling policies
119 (\fBSCHED_OTHER\fP, \fBSCHED_IDLE\fP, \fBSCHED_BATCH\fP),
120 \fIsched_priority\fP is not used in scheduling
121 decisions (it must be specified as 0).
122
123 Processes scheduled under one of the real-time policies
124 (\fBSCHED_FIFO\fP, \fBSCHED_RR\fP) have a
125 \fIsched_priority\fP value in the range 1 (low) to 99 (high).
126 (As the numbers imply, real-time processes always have higher priority
127 than normal processes.)
128 Note well: POSIX.1-2001 requires an implementation to support only a
129 minimum of 32 distinct priority levels for the real-time policies,
130 and some systems supply just this minimum.
131 Portable programs should use
132 .BR sched_get_priority_min (2)
133 and
134 .BR sched_get_priority_max (2)
135 to find the range of priorities supported for a particular policy.
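.PP
As a rough sketch of the portable approach just described,
the helper below asks the system for the valid priority range
of a policy instead of hard-coding 1 and 99.
The name rt_priority_range is hypothetical and is not part of any library.

```c
#include <sched.h>

/*
 * Hypothetical helper: store the valid static-priority range for `policy`
 * in *lo and *hi. Returns 0 on success, -1 (errno set) on failure.
 */
int rt_priority_range(int policy, int *lo, int *hi)
{
    int min = sched_get_priority_min(policy);
    int max = sched_get_priority_max(policy);

    if (min == -1 || max == -1)
        return -1;
    *lo = min;
    *hi = max;
    return 0;
}
```

A caller would typically pass SCHED_FIFO or SCHED_RR and then
pick a priority between the two returned bounds.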
136
137 Conceptually, the scheduler maintains a list of runnable
138 processes for each possible \fIsched_priority\fP value.
139 In order to determine which process runs next, the scheduler looks for
140 the nonempty list with the highest static priority and selects the
141 process at the head of this list.
142
143 A process's scheduling policy determines
144 where it will be inserted into the list of processes
145 with equal static priority and how it will move inside this list.
146
147 All scheduling is preemptive: if a process with a higher static
148 priority becomes ready to run, the currently running process
149 will be preempted and
150 returned to the wait list for its static priority level.
151 The scheduling policy only determines the
152 ordering within the list of runnable processes with equal static
153 priority.
154 .SS SCHED_FIFO: First in-first out scheduling
155 \fBSCHED_FIFO\fP can only be used with static priorities higher than
156 0, which means that when a \fBSCHED_FIFO\fP process becomes runnable,
157 it will always immediately preempt any currently running
158 \fBSCHED_OTHER\fP, \fBSCHED_BATCH\fP, or \fBSCHED_IDLE\fP process.
159 \fBSCHED_FIFO\fP is a simple scheduling
160 algorithm without time slicing.
161 For processes scheduled under the
162 \fBSCHED_FIFO\fP policy, the following rules apply:
163 .IP * 3
164 A \fBSCHED_FIFO\fP process that has been preempted by another process of
165 higher priority will stay at the head of the list for its priority and
166 will resume execution as soon as all processes of higher priority are
167 blocked again.
168 .IP *
169 When a \fBSCHED_FIFO\fP process becomes runnable, it
170 will be inserted at the end of the list for its priority.
171 .IP *
172 A call to
173 .BR sched_setscheduler ()
174 or
175 .BR sched_setparam (2)
176 will put the
177 \fBSCHED_FIFO\fP (or \fBSCHED_RR\fP) process identified by
178 \fIpid\fP at the start of the list if it was runnable.
179 As a consequence, it may preempt the currently running process if
180 it has the same priority.
181 (POSIX.1-2001 specifies that the process should go to the end
182 of the list.)
183 .\" In 2.2.x and 2.4.x, the process is placed at the front of the queue
184 .\" In 2.0.x, the Right Thing happened: the process went to the back -- MTK
185 .IP *
186 A process calling
187 .BR sched_yield (2)
188 will be put at the end of the list.
189 .PP
190 No other events will move a process
191 scheduled under the \fBSCHED_FIFO\fP policy in the wait list of
192 runnable processes with equal static priority.
193
194 A \fBSCHED_FIFO\fP
195 process runs until either it is blocked by an I/O request, it is
196 preempted by a higher priority process, or it calls
197 .BR sched_yield (2).
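.PP
The following minimal sketch shows one way a process might place itself
under the SCHED_FIFO policy.
The helper name become_fifo is hypothetical.
The call is expected to fail with EINVAL for a priority of 0,
and with EPERM when the caller lacks the privileges described
under "Privileges and resource limits".

```c
#include <errno.h>      /* callers inspect errno after a failure */
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: place the calling process under SCHED_FIFO at the
 * given static priority. Returns 0 on success, -1 (errno set) on failure.
 */
int become_fifo(int priority)
{
    struct sched_param param;

    memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    /* A pid of 0 means "the calling process". */
    return sched_setscheduler(0, SCHED_FIFO, &param);
}
```

Because a runaway SCHED_FIFO process can starve everything below it,
a sketch like this is best tried with a higher-priority shell available.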
198 .SS SCHED_RR: Round-robin scheduling
199 \fBSCHED_RR\fP is a simple enhancement of \fBSCHED_FIFO\fP.
200 Everything
201 described above for \fBSCHED_FIFO\fP also applies to \fBSCHED_RR\fP,
202 except that each process is only allowed to run for a maximum time
203 quantum.
204 If a \fBSCHED_RR\fP process has been running for a time
205 period equal to or longer than the time quantum, it will be put at the
206 end of the list for its priority.
207 A \fBSCHED_RR\fP process that has
208 been preempted by a higher priority process and subsequently resumes
209 execution as a running process will complete the unexpired portion of
210 its round-robin time quantum.
211 The length of the time quantum can be
212 retrieved using
213 .BR sched_rr_get_interval (2).
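.PP
As an illustration only, the quantum can be read and converted
to nanoseconds as sketched below.
The helper name rr_quantum_ns is hypothetical,
and the result is only meaningful for a process
that is actually scheduled under SCHED_RR.

```c
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper: return the round-robin quantum of `pid` in
 * nanoseconds, or -1 on failure (errno set). A pid of 0 is the caller.
 */
long long rr_quantum_ns(pid_t pid)
{
    struct timespec ts;

    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```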
214 .\" On Linux 2.4, the length of the RR interval is influenced
215 .\" by the process nice value -- MTK
216 .\"
217 .SS SCHED_OTHER: Default Linux time-sharing scheduling
218 \fBSCHED_OTHER\fP can only be used at static priority 0.
219 \fBSCHED_OTHER\fP is the standard Linux time-sharing scheduler that is
220 intended for all processes that do not require the special
221 real-time mechanisms.
222 The process to run is chosen from the static
223 priority 0 list based on a \fIdynamic\fP priority that is determined only
224 inside this list.
225 The dynamic priority is based on the nice value (set by
226 .BR nice (2)
227 or
228 .BR setpriority (2))
229 and is increased for each time quantum in which the process is ready to run
230 but is denied the CPU by the scheduler.
231 This ensures fair progress among all \fBSCHED_OTHER\fP processes.
232 .\"
233 .SS SCHED_BATCH: Scheduling batch processes
234 (Since Linux 2.6.16.)
235 \fBSCHED_BATCH\fP can only be used at static priority 0.
236 This policy is similar to \fBSCHED_OTHER\fP in that it schedules
237 the process according to its dynamic priority
238 (based on the nice value).
239 The difference is that this policy
240 will cause the scheduler to always assume
241 that the process is CPU-intensive.
242 Consequently, the scheduler will apply a small scheduling
243 penalty with respect to wakeup behavior,
244 so that this process is mildly disfavored in scheduling decisions.
245
246 .\" The following paragraph is drawn largely from the text that
247 .\" accompanied Ingo Molnar's patch for the implementation of
248 .\" SCHED_BATCH.
249 This policy is useful for workloads that are noninteractive,
250 but do not want to lower their nice value,
251 and for workloads that want a deterministic scheduling policy without
252 interactivity causing extra preemptions (between the workload's tasks).
253 .\"
254 .SS SCHED_IDLE: Scheduling very low priority jobs
255 (Since Linux 2.6.23.)
256 \fBSCHED_IDLE\fP can only be used at static priority 0;
257 the process nice value has no influence on this policy.
258
259 This policy is intended for running jobs at extremely low
260 priority (lower even than a +19 nice value with the
261 .B SCHED_OTHER
262 or
263 .B SCHED_BATCH
264 policies).
265 .\"
266 .SS Resetting scheduling policy for child processes
267 Since Linux 2.6.32, the
268 .B SCHED_RESET_ON_FORK
269 flag can be ORed in
270 .I policy
271 when calling
272 .BR sched_setscheduler ().
273 As a result of including this flag, children created by
274 .BR fork (2)
275 do not inherit privileged scheduling policies.
276 This feature is intended for media-playback applications,
277 and can be used to prevent applications from evading the
278 .BR RLIMIT_RTTIME
279 resource limit (see
280 .BR getrlimit (2))
281 by creating multiple child processes.
282
283 More precisely, if the
284 .BR SCHED_RESET_ON_FORK
285 flag is specified,
286 the following rules apply for subsequently created children:
287 .IP * 3
288 If the calling process has a scheduling policy of
289 .B SCHED_FIFO
290 or
291 .BR SCHED_RR ,
292 the policy is reset to
293 .BR SCHED_OTHER
294 in child processes.
295 .IP *
296 If the calling process has a negative nice value,
297 the nice value is reset to zero in child processes.
298 .PP
299 After the
300 .BR SCHED_RESET_ON_FORK
301 flag has been enabled,
302 it can only be reset if the process has the
303 .BR CAP_SYS_NICE
304 capability.
305 This flag is disabled in child processes created by
306 .BR fork (2).
307
308 The
309 .B SCHED_RESET_ON_FORK
310 flag is visible in the policy value returned by
311 .BR sched_getscheduler ().
312 .\"
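.PP
Since the flag shares the returned integer with the policy,
a caller that wants to compare the result against SCHED_FIFO and friends
has to mask it out first.
The sketch below shows one way to do that.
The helper name split_policy is hypothetical,
and the fallback definition is an assumption for C libraries
that expose the macro only when _GNU_SOURCE is defined.

```c
#include <sched.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000  /* value used by the Linux kernel */
#endif

/*
 * Hypothetical helper: split a value returned by sched_getscheduler()
 * into the base policy (return value) and the reset-on-fork flag
 * (stored in *reset_on_fork as 0 or 1).
 */
int split_policy(int raw, int *reset_on_fork)
{
    *reset_on_fork = (raw & SCHED_RESET_ON_FORK) != 0;
    return raw & ~SCHED_RESET_ON_FORK;
}
```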
313 .SS Privileges and resource limits
314 In Linux kernels before 2.6.12, only privileged
315 .RB ( CAP_SYS_NICE )
316 processes can set a nonzero static priority (i.e., set a real-time
317 scheduling policy).
318 The only change that an unprivileged process can make is to set the
319 .B SCHED_OTHER
320 policy, and this can only be done if the effective user ID of the caller of
321 .BR sched_setscheduler ()
322 matches the real or effective user ID of the target process
323 (i.e., the process specified by
324 .IR pid )
325 whose policy is being changed.
326
327 Since Linux 2.6.12, the
328 .B RLIMIT_RTPRIO
329 resource limit defines a ceiling on an unprivileged process's
330 static priority for the
331 .B SCHED_RR
332 and
333 .B SCHED_FIFO
334 policies.
335 The rules for changing scheduling policy and priority are as follows:
336 .IP * 3
337 If an unprivileged process has a nonzero
338 .B RLIMIT_RTPRIO
339 soft limit, then it can change its scheduling policy and priority,
340 subject to the restriction that the priority cannot be set to a
341 value higher than the maximum of its current priority and its
342 .B RLIMIT_RTPRIO
343 soft limit.
344 .IP *
345 If the
346 .B RLIMIT_RTPRIO
347 soft limit is 0, then the only permitted changes are to lower the priority,
348 or to switch to a non-real-time policy.
349 .IP *
350 Subject to the same rules,
351 another unprivileged process can also make these changes,
352 as long as the effective user ID of the process making the change
353 matches the real or effective user ID of the target process.
354 .IP *
355 Special rules apply for the
356 .BR SCHED_IDLE " policy."
357 In Linux kernels before 2.6.39,
358 an unprivileged process operating under this policy cannot
359 change its policy, regardless of the value of its
360 .BR RLIMIT_RTPRIO
361 resource limit.
362 In Linux kernels since 2.6.39,
363 .\" commit c02aa73b1d18e43cfd79c2f193b225e84ca497c8
364 an unprivileged process can switch to either the
365 .BR SCHED_BATCH
366 or the
367 .BR SCHED_OTHER
368 policy so long as its nice value falls within the range permitted by its
369 .BR RLIMIT_NICE
370 resource limit (see
371 .BR getrlimit (2)).
372 .PP
373 Privileged
374 .RB ( CAP_SYS_NICE )
375 processes ignore the
376 .B RLIMIT_RTPRIO
377 limit; as with older kernels,
378 they can make arbitrary changes to scheduling policy and priority.
379 See
380 .BR getrlimit (2)
381 for further information on
382 .BR RLIMIT_RTPRIO .
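.PP
The ceiling rule for unprivileged processes (since Linux 2.6.12)
can be sketched as follows.
Both helper names are hypothetical.
Clamping the limit to 99 is an assumption based on the Linux
priority range given earlier in this page.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: highest real-time priority an unprivileged process
 * may request, given its current priority and RLIMIT_RTPRIO soft limit.
 */
int rtprio_ceiling(int current_priority, rlim_t soft_limit)
{
    /* Assumption: cap at 99; this also covers RLIM_INFINITY. */
    if (soft_limit > 99)
        soft_limit = 99;
    return (int)soft_limit > current_priority ? (int)soft_limit
                                              : current_priority;
}

/* Same ceiling for the calling process; returns -1 if getrlimit() fails. */
int my_rtprio_ceiling(int current_priority)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_RTPRIO, &rl) == -1)
        return -1;
    return rtprio_ceiling(current_priority, rl.rlim_cur);
}
```

With a soft limit of 0 the ceiling equals the current priority,
which matches the rule that only lowering is permitted in that case.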
383 .SS Response time
384 A blocked high priority process waiting for I/O has a certain
385 response time before it is scheduled again.
386 The device driver writer
387 can greatly reduce this response time by using a "slow interrupt"
388 interrupt handler.
389 .\" as described in
390 .\" .BR request_irq (9).
391 .SS Miscellaneous
392 Child processes inherit the scheduling policy and parameters across a
393 .BR fork (2).
394 The scheduling policy and parameters are preserved across
395 .BR execve (2).
396
397 Memory locking is usually needed for real-time processes to avoid
398 paging delays; this can be done with
399 .BR mlock (2)
400 or
401 .BR mlockall (2).
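.PP
A minimal sketch of the memory-locking step, typically performed
before switching to a real-time policy, might look like this.
The helper name lock_all_memory is hypothetical.
The call can fail, for example because of resource limits,
so the return value should be checked.

```c
#include <sys/mman.h>

/*
 * Hypothetical helper: lock all current and future pages of the calling
 * process into RAM. Returns 0 on success, -1 (errno set) on failure.
 */
int lock_all_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}
```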
402
403 Since a nonblocking infinite loop in a process scheduled under
404 \fBSCHED_FIFO\fP or \fBSCHED_RR\fP will block all processes with lower
405 priority forever, a software developer should always keep available on
406 the console a shell scheduled under a higher static priority than the
407 tested application.
408 This will allow an emergency kill of tested
409 real-time applications that do not block or terminate as expected.
410 See also the description of the
411 .BR RLIMIT_RTTIME
412 resource limit in
413 .BR getrlimit (2).
414
415 POSIX systems on which
416 .BR sched_setscheduler ()
417 and
418 .BR sched_getscheduler ()
419 are available define
420 .B _POSIX_PRIORITY_SCHEDULING
421 in \fI<unistd.h>\fP.
422 .SH RETURN VALUE
423 On success,
424 .BR sched_setscheduler ()
425 returns zero.
426 On success,
427 .BR sched_getscheduler ()
428 returns the policy for the process (a nonnegative integer).
429 On error, \-1 is returned, and
430 .I errno
431 is set appropriately.
432 .SH ERRORS
433 .TP
434 .B EINVAL
435 The scheduling \fIpolicy\fP is not one of the recognized policies,
436 \fIparam\fP is NULL,
437 or \fIparam\fP does not make sense for the \fIpolicy\fP.
438 .TP
439 .B EPERM
440 The calling process does not have appropriate privileges.
441 .TP
442 .B ESRCH
443 The process whose ID is \fIpid\fP could not be found.
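.PP
One possible way to surface these errors to a caller is sketched below.
The helper name policy_or_error and its negated-errno convention
are assumptions made for illustration, not part of the API.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: query the policy of `pid`. Returns the policy
 * (a nonnegative integer) on success, or the negated errno value on
 * failure so the caller can tell ESRCH, EPERM, and EINVAL apart.
 */
int policy_or_error(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    return policy == -1 ? -errno : policy;
}
```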
444 .SH CONFORMING TO
445 POSIX.1-2001 (but see BUGS below).
446 The \fBSCHED_BATCH\fP and \fBSCHED_IDLE\fP policies are Linux-specific.
447 .SH NOTES
448 POSIX.1 does not detail the permissions that an unprivileged
449 process requires in order to call
450 .BR sched_setscheduler (),
451 and details vary across systems.
452 For example, the Solaris 7 manual page says that
453 the real or effective user ID of the calling process must
454 match the real user ID or the saved set-user-ID of the target process.
455 .PP
456 The scheduling policy and parameters are in fact per-thread
457 attributes on Linux.
458 The value returned from a call to
459 .BR gettid (2)
460 can be passed in the argument
461 .IR pid .
462 Specifying
463 .I pid
464 as 0 will operate on the attribute for the calling thread,
465 and passing the value returned from a call to
466 .BR getpid (2)
467 will operate on the attribute for the main thread of the thread group.
468 (If you are using the POSIX threads API, then use
469 .BR pthread_setschedparam (3),
470 .BR pthread_getschedparam (3),
471 and
472 .BR pthread_setschedprio (3),
473 instead of the
474 .BR sched_* (2)
475 system calls.)
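.PP
To illustrate the per-thread behavior, the sketch below queries the policy
of the calling thread through its kernel thread ID.
The helper name calling_thread_policy is hypothetical.
The raw syscall() form is used on the assumption that the C library
may not provide a gettid() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the scheduling policy of the calling
 * thread, addressed by its kernel thread ID rather than by 0.
 * Returns -1 (errno set) on failure.
 */
int calling_thread_policy(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);

    return sched_getscheduler(tid);
}
```

In a single-threaded program this should give the same result as
passing 0 or the value of getpid().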
476 .PP
477 Originally, standard Linux was intended as a general-purpose operating
478 system able to handle background processes, interactive
479 applications, and less demanding real-time applications (applications that
480 usually need to meet timing deadlines).
481 Although the Linux kernel 2.6
482 allowed for kernel preemption and the newly introduced O(1) scheduler
483 ensures that the time needed to schedule is fixed and deterministic
484 irrespective of the number of active tasks, true real-time computing
485 was not possible up to kernel version 2.6.17.
486 .SS Real-time features in the mainline Linux kernel
487 .\" FIXME . Probably this text will need some minor tweaking
488 .\" by about the time of 2.6.30; ask Carsten Emde about this then.
489 From kernel version 2.6.18 onward, however, Linux is gradually
490 becoming equipped with real-time capabilities,
491 most of which are derived from the former
492 .I realtime-preempt
493 patches developed by Ingo Molnar, Thomas Gleixner,
494 Steven Rostedt, and others.
495 Until the patches have been completely merged into the
496 mainline kernel
497 (this is expected to be around kernel version 2.6.30),
498 they must be installed to achieve the best real-time performance.
499 These patches are named:
500 .in +4n
501 .nf
502
503 patch-\fIkernelversion\fP-rt\fIpatchversion\fP
504 .fi
505 .in
506 .PP
507 and can be downloaded from
508 .UR http://www.kernel.org\:/pub\:/linux\:/kernel\:/projects\:/rt/
509 .UE .
510
511 Without the patches and prior to their full inclusion into the mainline
512 kernel, the kernel configuration offers only the three preemption classes
513 .BR CONFIG_PREEMPT_NONE ,
514 .BR CONFIG_PREEMPT_VOLUNTARY ,
515 and
516 .B CONFIG_PREEMPT_DESKTOP
517 which respectively provide no, some, and considerable
518 reduction of the worst-case scheduling latency.
519
520 With the patches applied or after their full inclusion into the mainline
521 kernel, the additional configuration item
522 .B CONFIG_PREEMPT_RT
523 becomes available.
524 If this is selected, Linux is transformed into a regular
525 real-time operating system.
526 The FIFO and RR scheduling policies that can be selected using
527 .BR sched_setscheduler ()
528 are then used to run a process
529 with true real-time priority and a minimum worst-case scheduling latency.
530 .SH BUGS
531 POSIX says that on success,
532 .BR sched_setscheduler ()
533 should return the previous scheduling policy.
534 Linux
535 .BR sched_setscheduler ()
536 does not conform to this requirement,
537 since it always returns 0 on success.
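.PP
A program that needs the previous policy can query it before
making the change, as in the sketch below.
The helper name set_policy_return_previous is hypothetical.
The two calls are not atomic,
so another process could change the policy in between.

```c
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: change the policy of `pid` and return the policy
 * that was in effect beforehand. Returns -1 (errno set) on failure.
 */
int set_policy_return_previous(pid_t pid, int policy,
                               const struct sched_param *param)
{
    int previous = sched_getscheduler(pid);

    if (previous == -1)
        return -1;
    if (sched_setscheduler(pid, policy, param) == -1)
        return -1;
    return previous;
}
```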
538 .SH SEE ALSO
539 .ad l
540 .nh
541 .BR chrt (1),
542 .BR getpriority (2),
543 .BR mlock (2),
544 .BR mlockall (2),
545 .BR munlock (2),
546 .BR munlockall (2),
547 .BR nice (2),
548 .BR sched_get_priority_max (2),
549 .BR sched_get_priority_min (2),
550 .BR sched_getaffinity (2),
551 .BR sched_getparam (2),
552 .BR sched_rr_get_interval (2),
553 .BR sched_setaffinity (2),
554 .BR sched_setparam (2),
555 .BR sched_yield (2),
556 .BR setpriority (2),
557 .BR capabilities (7),
558 .BR cpuset (7)
559 .ad j
560 .PP
561 .I Programming for the real world \- POSIX.4
562 by Bill O. Gallmeister, O'Reilly & Associates, Inc., ISBN 1-56592-074-0.
563 .PP
564 .I Documentation/scheduler/sched-rt-group.txt
565 in the Linux kernel source tree
566 (since kernel 2.6.25).