Commit | Line | Data |
---|---|---|
fea681da MK |
1 | .\" Hey Emacs! This file is -*- nroff -*- source. |
2 | .\" | |
3 | .\" Copyright (C) Tom Bjorkholm, Markus Kuhn & David A. Wheeler 1996-1999 | |
cf9c27a6 | 4 | .\" and Copyright (C) 2007 Carsten Emde <Carsten.Emde@osadl.org> |
fea681da MK |
5 | .\" |
6 | .\" This is free documentation; you can redistribute it and/or | |
7 | .\" modify it under the terms of the GNU General Public License as | |
8 | .\" published by the Free Software Foundation; either version 2 of | |
9 | .\" the License, or (at your option) any later version. | |
10 | .\" | |
11 | .\" The GNU General Public License's references to "object code" | |
12 | .\" and "executables" are to be interpreted as the output of any | |
13 | .\" document formatting or typesetting system, including | |
14 | .\" intermediate and printed output. | |
15 | .\" | |
16 | .\" This manual is distributed in the hope that it will be useful, | |
17 | .\" but WITHOUT ANY WARRANTY; without even the implied warranty of | |
18 | .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
19 | .\" GNU General Public License for more details. | |
20 | .\" | |
21 | .\" You should have received a copy of the GNU General Public | |
22 | .\" License along with this manual; if not, write to the Free | |
23 | .\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, | |
24 | .\" USA. | |
25 | .\" | |
26 | .\" 1996-04-01 Tom Bjorkholm <tomb@mydata.se> | |
27 | .\" First version written | |
28 | .\" 1996-04-10 Markus Kuhn <mskuhn@cip.informatik.uni-erlangen.de> | |
29 | .\" revision | |
30 | .\" 1999-08-18 David A. Wheeler <dwheeler@ida.org> added Note. | |
c11b1abf | 31 | .\" Modified, 25 Jun 2002, Michael Kerrisk <mtk.manpages@gmail.com> |
c13182ef | 32 | .\" Corrected description of queue placement by sched_setparam() and |
fea681da MK |
33 | .\" sched_setscheduler() |
34 | .\" A couple of grammar clean-ups | |
c11b1abf | 35 | .\" Modified 2004-05-27 by Michael Kerrisk <mtk.manpages@gmail.com> |
92c37d8c | 36 | .\" 2005-03-23, mtk, Added description of SCHED_BATCH. |
cf9c27a6 MK |
37 | .\" 2007-07-10, Carsten Emde <Carsten.Emde@osadl.org> |
38 | .\" Add text on real-time features that are currently being | |
39 | .\" added to the mainline kernel. | |
fea681da | 40 | .\" |
531d750b | 41 | .TH SCHED_SETSCHEDULER 2 2007-11-25 "Linux" "Linux Programmer's Manual" |
fea681da MK |
42 | .SH NAME |
43 | sched_setscheduler, sched_getscheduler \- | |
44 | set and get scheduling algorithm/parameters | |
45 | .SH SYNOPSIS | |
92c37d8c | 46 | .nf |
fea681da MK |
47 | .B #include <sched.h> |
48 | .sp | |
49 | .BI "int sched_setscheduler(pid_t " pid ", int " policy , | |
92c37d8c MK |
50 | .br |
51 | .BI " const struct sched_param *" param ); | |
fea681da MK |
52 | .sp |
53 | .BI "int sched_getscheduler(pid_t " pid ); | |
54 | .sp | |
fea681da | 55 | \fBstruct sched_param { |
92c37d8c MK |
56 | ... |
57 | int \fIsched_priority\fB; | |
58 | ... | |
fea681da | 59 | }; |
fea681da MK |
60 | .fi |
61 | .SH DESCRIPTION | |
e511ffb6 | 62 | .BR sched_setscheduler () |
fea681da | 63 | sets both the scheduling policy and the associated parameters for the |
c13182ef MK |
64 | process identified by \fIpid\fP. |
65 | If \fIpid\fP equals zero, the | |
66 | scheduler of the calling process will be set. | |
67 | The interpretation of | |
68 | the parameter \fIparam\fP depends on the selected policy. | |
69 | Currently, the | |
fea681da | 70 | following four scheduling policies are supported under Linux:
5917ad3d MK |
71 | .BR SCHED_FIFO , |
72 | .BR SCHED_RR , | |
73 | .BR SCHED_OTHER , | |
c13182ef | 74 | .\" In the 2.6 kernel sources, SCHED_OTHER is actually called |
92c37d8c MK |
75 | .\" SCHED_NORMAL. |
76 | and | |
5917ad3d | 77 | .BR SCHED_BATCH ; |
fea681da MK |
78 | their respective semantics are described below. |
79 | ||
e511ffb6 | 80 | .BR sched_getscheduler () |
fea681da | 81 | queries the scheduling policy currently applied to the process |
c13182ef MK |
82 | identified by \fIpid\fP. |
83 | If \fIpid\fP equals zero, the policy of the | |
fea681da | 84 | calling process will be retrieved. |
fea681da MK |
85 | .SS Scheduling Policies |
86 | The scheduler is the kernel part that decides which runnable process | |
c13182ef MK |
87 | will be executed by the CPU next. |
88 | The Linux scheduler offers four | |
fea681da | 89 | different scheduling policies, two for normal processes and two for
c13182ef MK |
90 | real-time applications. |
91 | A static priority value \fIsched_priority\fP | |
fea681da | 92 | is assigned to each process and this value can be changed only via |
c13182ef MK |
93 | system calls. |
94 | Conceptually, the scheduler maintains a list of runnable | |
fea681da | 95 | processes for each possible \fIsched_priority\fP value, and |
c13182ef MK |
96 | \fIsched_priority\fP can have a value in the range 0 to 99. |
97 | In order | |
fea681da MK |
98 | to determine the process that runs next, the Linux scheduler looks for |
99 | the non-empty list with the highest static priority and takes the | |
c13182ef MK |
100 | process at the head of this list. |
101 | The scheduling policy determines for | |
fea681da MK |
102 | each process, where it will be inserted into the list of processes |
103 | with equal static priority and how it will move inside this list. | |
104 | ||
5917ad3d | 105 | \fBSCHED_OTHER\fP is the default universal time-sharing scheduler |
92c37d8c | 106 | policy used by most processes. |
5917ad3d MK |
107 | \fBSCHED_BATCH\fP is intended for "batch" style execution of processes. |
108 | \fBSCHED_FIFO\fP and \fBSCHED_RR\fP are | |
fea681da MK |
109 | intended for special time-critical applications that need precise |
110 | control over the way in which runnable processes are selected for | |
c13182ef | 111 | execution. |
92c37d8c | 112 | |
5917ad3d | 113 | Processes scheduled with \fBSCHED_OTHER\fP or \fBSCHED_BATCH\fP |
92c37d8c | 114 | must be assigned the static priority 0. |
5917ad3d MK |
115 | Processes scheduled under \fBSCHED_FIFO\fP or |
116 | \fBSCHED_RR\fP can have a static priority in the range 1 to 99. | |
60a90ecd MK |
117 | The system calls |
118 | .BR sched_get_priority_min (2) | |
119 | and | |
120 | .BR sched_get_priority_max (2) | |
121 | can be used to find out the valid | |
fea681da | 122 | priority range for a scheduling policy in a portable way on all |
a7fadb55 | 123 | POSIX.1-2001 conforming systems. |
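The portable range query described above can be sketched in C as follows. This is a minimal illustration, not part of the original page: the helper name priority_range() is hypothetical, and only sched_get_priority_min(2) and sched_get_priority_max(2) are real interfaces.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not part of any library): store the valid static
 * priority range for `policy` in *min and *max.
 * Returns 0 on success, -1 if either query fails (errno is set). */
int priority_range(int policy, int *min, int *max)
{
    assert(min != NULL && max != NULL);

    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (lo == -1 || hi == -1)
        return -1;

    *min = lo;
    *max = hi;
    return 0;
}
```

On Linux, calling priority_range(SCHED_FIFO, &lo, &hi) would be expected to report a range within 1 to 99, and priority_range(SCHED_OTHER, &lo, &hi) the single value 0, matching the ranges given above.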
fea681da MK |
124 | |
125 | All scheduling is preemptive: If a process with a higher static | |
126 | priority gets ready to run, the current process will be preempted and | |
c13182ef MK |
127 | returned into its wait list. |
128 | The scheduling policy only determines the | |
fea681da MK |
129 | ordering within the list of runnable processes with equal static |
130 | priority. | |
fea681da | 131 | .SS SCHED_FIFO: First In-First Out scheduling |
5917ad3d MK |
132 | \fBSCHED_FIFO\fP can only be used with static priorities higher than |
133 | 0, which means that when a \fBSCHED_FIFO\fP process becomes runnable, |
92c37d8c | 134 | it will always immediately preempt any currently running |
5917ad3d MK |
135 | \fBSCHED_OTHER\fP or \fBSCHED_BATCH\fP process. |
136 | \fBSCHED_FIFO\fP is a simple scheduling | |
c13182ef MK |
137 | algorithm without time slicing. |
138 | For processes scheduled under the | |
5917ad3d MK |
139 | \fBSCHED_FIFO\fP policy, the following rules are applied: A |
140 | \fBSCHED_FIFO\fP process that has been preempted by another process of | |
fea681da MK |
141 | higher priority will stay at the head of the list for its priority and |
142 | will resume execution as soon as all processes of higher priority are | |
c13182ef | 143 | blocked again. |
5917ad3d | 144 | When a \fBSCHED_FIFO\fP process becomes runnable, it |
c13182ef MK |
145 | will be inserted at the end of the list for its priority. |
146 | A call to | |
60a90ecd MK |
147 | .BR sched_setscheduler () |
148 | or | |
149 | .BR sched_setparam (2) | |
150 | will put the | |
5917ad3d | 151 | \fBSCHED_FIFO\fP (or \fBSCHED_RR\fP) process identified by |
fea681da MK |
152 | \fIpid\fP at the start of the list if it was runnable. |
153 | As a consequence, it may preempt the currently running process if | |
154 | it has the same priority. | |
a7fadb55 | 155 | (POSIX.1-2001 specifies that the process should go to the end |
fea681da MK |
156 | of the list.) |
157 | .\" In 2.2.x and 2.4.x, the process is placed at the front of the queue | |
158 | .\" In 2.0.x, the Right Thing happened: the process went to the back -- MTK | |
60a90ecd MK |
159 | A process calling |
160 | .BR sched_yield (2) | |
161 | will be | |
c13182ef MK |
162 | put at the end of the list. |
163 | No other events will move a process | |
5917ad3d | 164 | scheduled under the \fBSCHED_FIFO\fP policy in the wait list of |
c13182ef | 165 | runnable processes with equal static priority. |
5917ad3d | 166 | A \fBSCHED_FIFO\fP |
fea681da | 167 | process runs until either it is blocked by an I/O request, it is |
60a90ecd MK |
168 | preempted by a higher priority process, or it calls |
169 | .BR sched_yield (2). | |
fea681da | 170 | .SS SCHED_RR: Round Robin scheduling |
5917ad3d | 171 | \fBSCHED_RR\fP is a simple enhancement of \fBSCHED_FIFO\fP. |
c13182ef | 172 | Everything |
5917ad3d | 173 | described above for \fBSCHED_FIFO\fP also applies to \fBSCHED_RR\fP, |
fea681da | 174 | except that each process is only allowed to run for a maximum time |
c13182ef | 175 | quantum. |
5917ad3d | 176 | If a \fBSCHED_RR\fP process has been running for a time |
fea681da | 177 | period equal to or longer than the time quantum, it will be put at the |
c13182ef | 178 | end of the list for its priority. |
5917ad3d | 179 | A \fBSCHED_RR\fP process that has |
fea681da MK |
180 | been preempted by a higher priority process and subsequently resumes |
181 | execution as a running process will complete the unexpired portion of | |
c13182ef MK |
182 | its round robin time quantum. |
183 | The length of the time quantum can be | |
60a90ecd MK |
184 | retrieved using |
185 | .BR sched_rr_get_interval (2). | |
fea681da MK |
186 | .\" On Linux 2.4, the length of the RR interval is influenced |
187 | .\" by the process nice value -- MTK | |
af800319 | 188 | .\" |
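As a rough sketch of how the round-robin time quantum mentioned above might be read, the following C fragment converts the result of sched_rr_get_interval(2) to nanoseconds. The helper name rr_quantum_ns() is hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: fetch the round-robin quantum of `pid`
 * (0 = calling process) in nanoseconds.
 * Returns 0 on success, -1 on error (errno is set). */
int rr_quantum_ns(pid_t pid, long long *ns)
{
    assert(ns != NULL);

    struct timespec ts;
    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1;

    *ns = (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    return 0;
}
```

The value reported is only meaningful for the round-robin behavior described above when the target process is actually scheduled under SCHED_RR.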
fea681da | 189 | .SS SCHED_OTHER: Default Linux time-sharing scheduling |
5917ad3d MK |
190 | \fBSCHED_OTHER\fP can only be used at static priority 0. |
191 | \fBSCHED_OTHER\fP is the standard Linux time-sharing scheduler that is | |
fea681da | 192 | intended for all processes that do not require special static priority |
c13182ef MK |
193 | real-time mechanisms. |
194 | The process to run is chosen from the static | |
fea681da | 195 | priority 0 list based on a dynamic priority that is determined only |
c13182ef MK |
196 | inside this list. |
197 | The dynamic priority is based on the nice level (set | |
60a90ecd MK |
198 | by |
199 | .BR nice (2) | |
200 | or | |
201 | .BR setpriority (2)) | |
202 | and increased for | |
fea681da | 203 | each time quantum the process is ready to run, but denied to run by |
c13182ef | 204 | the scheduler. |
5917ad3d | 205 | This ensures fair progress among all \fBSCHED_OTHER\fP |
fea681da | 206 | processes. |
92c37d8c MK |
207 | .SS SCHED_BATCH: Scheduling batch processes |
208 | (Since Linux 2.6.16.) | |
5917ad3d MK |
209 | \fBSCHED_BATCH\fP can only be used at static priority 0. |
210 | This policy is similar to \fBSCHED_OTHER\fP, except that | |
92c37d8c MK |
211 | this policy will cause the scheduler to always assume |
212 | that the process is CPU-intensive. | |
213 | Consequently, the scheduler will apply a small scheduling | |
d9bfdb9c | 214 | penalty so that this process is mildly disfavored in scheduling |
92c37d8c | 215 | decisions. |
c13182ef MK |
216 | .\" The following paragraph is drawn largely from the text that |
217 | .\" accompanied Ingo Molnar's patch for the implementation of | |
92c37d8c MK |
218 | .\" SCHED_BATCH. |
219 | This policy is useful for workloads that are non-interactive, | |
c13182ef MK |
220 | but do not want to lower their nice value, |
221 | and for workloads that want a deterministic scheduling policy without | |
92c37d8c | 222 | interactivity causing extra preemptions (between the workload's tasks). |
fd04afa8 | 223 | .SS Privileges and resource limits |
fd04afa8 MK |
224 | In Linux kernels before 2.6.12, only privileged |
225 | .RB ( CAP_SYS_NICE ) | |
226 | processes can set a non-zero static priority. | |
afdee10d MK |
227 | The only change that an unprivileged process can make is to set the |
228 | .B SCHED_OTHER | |
229 | policy, and this can only be done if the effective user ID of the caller of | |
e511ffb6 | 230 | .BR sched_setscheduler () |
c13182ef | 231 | matches the real or effective user ID of the target process |
afdee10d MK |
232 | (i.e., the process specified by |
233 | .IR pid ) | |
234 | whose policy is being changed. | |
235 | ||
fd04afa8 MK |
236 | Since Linux 2.6.12, the |
237 | .B RLIMIT_RTPRIO | |
afdee10d MK |
238 | resource limit defines a ceiling on an unprivileged process's |
239 | priority for the | |
fd04afa8 MK |
240 | .B SCHED_RR |
241 | and | |
0daa9e92 | 242 | .B SCHED_FIFO |
fd04afa8 | 243 | policies. |
c13182ef | 244 | If an unprivileged process has a non-zero |
afdee10d | 245 | .B RLIMIT_RTPRIO |
c13182ef MK |
246 | soft limit, then it can change its scheduling policy and priority, |
247 | subject to the restriction that the priority cannot be set to a | |
248 | value higher than the | |
afdee10d MK |
249 | .B RLIMIT_RTPRIO |
250 | soft limit. | |
c13182ef | 251 | If the |
afdee10d MK |
252 | .B RLIMIT_RTPRIO |
253 | soft limit is 0, then the only permitted change is to lower the priority. | |
c13182ef MK |
254 | Subject to the same rules, |
255 | another unprivileged process can also make these changes, | |
256 | as long as the effective user ID of the process making the change | |
afdee10d | 257 | matches the real or effective user ID of the target process. |
fd04afa8 MK |
258 | See |
259 | .BR getrlimit (2) | |
260 | for further information on | |
261 | .BR RLIMIT_RTPRIO . | |
afdee10d MK |
262 | Privileged |
263 | .RB ( CAP_SYS_NICE ) | |
1954b6a9 | 264 | processes ignore this limit; as with older kernels, |
afdee10d | 265 | they can make arbitrary changes to scheduling policy and priority. |
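The ceiling that RLIMIT_RTPRIO places on an unprivileged process can be inspected with getrlimit(2). The following C sketch is illustrative only: the helper name rtprio_ceiling() is hypothetical, and clamping the limit to the maximum SCHED_FIFO priority is a choice made for this example.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: store in *ceiling the highest SCHED_FIFO priority
 * that the RLIMIT_RTPRIO soft limit lets an unprivileged caller request
 * (0 means no real-time priority may be requested).
 * Returns 0 on success, -1 on error (errno is set). */
int rtprio_ceiling(int *ceiling)
{
    assert(ceiling != NULL);

    struct rlimit rl;
    if (getrlimit(RLIMIT_RTPRIO, &rl) == -1)
        return -1;

    int max = sched_get_priority_max(SCHED_FIFO);
    if (max == -1)
        return -1;

    /* Clamp the limit to the valid priority range; the clamp is an
     * assumption of this sketch, not something the limit itself does. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)max)
        *ceiling = max;
    else
        *ceiling = (int)rl.rlim_cur;
    return 0;
}
```

A result of 0 corresponds to the case above in which the soft limit is 0. Privileged (CAP_SYS_NICE) processes are not bound by the value this helper reports.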
fea681da MK |
266 | .SS Response time |
267 | A blocked high priority process waiting for I/O has a certain | |
c13182ef MK |
268 | response time before it is scheduled again. |
269 | The device driver writer | |
fea681da MK |
270 | can greatly reduce this response time by using a "slow interrupt" |
271 | interrupt handler. | |
272 | .\" as described in | |
273 | .\" .BR request_irq (9). | |
fea681da MK |
274 | .SS Miscellaneous |
275 | Child processes inherit the scheduling algorithm and parameters across a | |
0bfa087b | 276 | .BR fork (2). |
c13182ef | 277 | The scheduling algorithm and parameters are preserved across |
ddb51c37 | 278 | .BR execve (2). |
fea681da | 279 | |
c13182ef | 280 | Memory locking is usually needed for real-time processes to avoid |
fea681da | 281 | paging delays; this can be done with
0bfa087b | 282 | .BR mlock (2) |
c13182ef | 283 | or |
0bfa087b | 284 | .BR mlockall (2). |
fea681da MK |
285 | |
286 | As a non-blocking endless loop in a process scheduled under | |
5917ad3d | 287 | \fBSCHED_FIFO\fP or \fBSCHED_RR\fP will block all processes with lower |
fea681da MK |
288 | priority forever, a software developer should always keep available on |
289 | the console a shell scheduled under a higher static priority than the | |
c13182ef MK |
290 | tested application. |
291 | This will allow an emergency kill of tested | |
afdee10d | 292 | real-time applications that do not block or terminate as expected. |
fea681da MK |
293 | |
294 | POSIX systems on which | |
e511ffb6 | 295 | .BR sched_setscheduler () |
fea681da | 296 | and |
e511ffb6 | 297 | .BR sched_getscheduler () |
fea681da | 298 | are available define |
f25eaea8 | 299 | .B _POSIX_PRIORITY_SCHEDULING |
c84371c6 | 300 | in \fI<unistd.h>\fP. |
fea681da MK |
301 | .SH "RETURN VALUE" |
302 | On success, | |
e511ffb6 | 303 | .BR sched_setscheduler () |
c13182ef | 304 | returns zero. |
fea681da | 305 | On success, |
e511ffb6 | 306 | .BR sched_getscheduler () |
c13182ef | 307 | returns the policy for the process (a non-negative integer). |
92c37d8c | 308 | On error, \-1 is returned, and |
fea681da MK |
309 | .I errno |
310 | is set appropriately. | |
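A minimal C sketch of checking these return values is shown below. The helper name set_fifo() is hypothetical. Because the Linux sched_setscheduler() returns 0 on success, the sketch reads the current policy with sched_getscheduler() before changing it. The two calls are not atomic, so another process could change the policy in between.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: switch `pid` (0 = calling process) to SCHED_FIFO at
 * `priority`, saving the policy that was in effect beforehand.
 * Returns 0 on success, or the errno value of the failing call. */
int set_fifo(pid_t pid, int priority, int *old_policy)
{
    assert(old_policy != NULL);

    /* Query the current policy first, because the Linux
     * sched_setscheduler() returns 0 rather than the old policy. */
    int policy = sched_getscheduler(pid);
    if (policy == -1)
        return errno;
    *old_policy = policy;

    struct sched_param param;
    memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
        return errno;
    return 0;
}
```

A caller can compare the result against the values listed under ERRORS. For example, a priority of 0 is not valid for SCHED_FIFO, so set_fifo(0, 0, &old) is expected to fail.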
311 | .SH ERRORS | |
312 | .TP | |
313 | .B EINVAL | |
314 | The scheduling \fIpolicy\fP is not one of the recognized policies, | |
92c37d8c | 315 | or the parameter \fIparam\fP does not make sense for the \fIpolicy\fP. |
fea681da MK |
316 | .TP |
317 | .B EPERM | |
afdee10d | 318 | The calling process does not have appropriate privileges. |
fea681da MK |
319 | .TP |
320 | .B ESRCH | |
321 | The process whose ID is \fIpid\fP could not be found. | |
322 | .SH "CONFORMING TO" | |
d61ff56a | 323 | POSIX.1-2001 (but see BUGS below). |
8382f16d | 324 | The \fBSCHED_BATCH\fP policy is Linux-specific. |
92c37d8c | 325 | .SH NOTES |
97bc0094 MK |
326 | POSIX.1 does not detail the permissions that an unprivileged |
327 | process requires in order to call | |
f902bebc | 328 | .BR sched_setscheduler (), |
97bc0094 MK |
329 | and details vary across systems. |
330 | For example, the Solaris 7 manual page says that | |
bfd91cc8 | 331 | the real or effective user ID of the calling process must |
97bc0094 | 332 | match the real user ID or the saved set-user-ID of the target process.
cf9c27a6 MK |
333 | .PP |
334 | Originally, Standard Linux was intended as a general-purpose operating | |
335 | system able to handle background processes, interactive | |
336 | applications, and less demanding real-time applications (applications that | |
337 | usually need to meet timing deadlines). | |
338 | Although the Linux kernel 2.6 | |
339 | allows for kernel preemption and the newly introduced O(1) scheduler | |
340 | ensures that the time needed to schedule is fixed and deterministic | |
341 | irrespective of the number of active tasks, true real-time computing | |
342 | was not possible up to kernel version 2.6.17. | |
343 | .SS Real-time features in the mainline Linux kernel | |
c5871a7d | 344 | .\" FIXME . Probably this text will need some minor tweaking |
cf9c27a6 MK |
345 | .\" by about the time of 2.6.25; ask Carsten Emde about this then. |
346 | From kernel version 2.6.18 onwards, however, Linux is gradually | |
347 | becoming equipped with real-time capabilities, | |
348 | most of which are derived from the former | |
349 | realtime-preempt patches developed by Ingo Molnar, Thomas Gleixner and | |
350 | others. | |
bf1c0ede | 351 | Until the patches have been completely merged into the |
cf9c27a6 MK |
352 | mainline kernel |
353 | (this is expected to be around kernel version 2.6.24 or 2.6.25), | |
354 | the realtime-preempt patches must be installed to achieve the best | |
355 | realtime performance. | |
356 | These patches are named: | |
088a639b | 357 | .in +4n |
cf9c27a6 | 358 | .nf |
97bc0094 | 359 | |
cf9c27a6 MK |
360 | patch-\fIkernelversion\fP-rt\fIpatchversion\fP |
361 | .fi | |
362 | .in | |
fea681da | 363 | .PP |
cf9c27a6 MK |
364 | and can be downloaded from |
365 | .IR http://people.redhat.com/mingo/realtime-preempt/ . | |
366 | ||
367 | Without the patches and prior to their full inclusion into the mainline | |
368 | kernel, the kernel configuration offers only the three preemption classes | |
369 | .BR CONFIG_PREEMPT_NONE , | |
370 | .BR CONFIG_PREEMPT_VOLUNTARY , | |
371 | and | |
0daa9e92 | 372 | .B CONFIG_PREEMPT_DESKTOP |
cf9c27a6 MK |
373 | which respectively provide no, some, and considerable |
374 | reduction of the worst-case scheduling latency. | |
375 | ||
376 | With the patches applied or after their full inclusion into the mainline | |
377 | kernel, the additional configuration item | |
0daa9e92 | 378 | .B CONFIG_PREEMPT_RT |
cf9c27a6 MK |
379 | becomes available. |
380 | If this is selected, Linux is transformed into a regular | |
381 | real-time operating system. | |
382 | The FIFO and RR scheduling policies that can be selected using | |
383 | .BR sched_setscheduler () | |
384 | are then used to run a process | |
385 | with true real-time priority and a minimum worst-case scheduling latency. | |
d61ff56a MK |
386 | .SH BUGS |
387 | POSIX says that on success, | |
388 | .BR sched_setscheduler () | |
389 | should return the previous scheduling policy. | |
390 | Linux | |
391 | .BR sched_setscheduler () | |
392 | does not conform to this requirement, | |
393 | since it always returns 0 on success. | |
fea681da MK |
394 | .SH "SEE ALSO" |
395 | .BR getpriority (2), | |
396 | .BR mlock (2), | |
397 | .BR mlockall (2), | |
398 | .BR munlock (2), | |
399 | .BR munlockall (2), | |
400 | .BR nice (2), | |
401 | .BR sched_get_priority_max (2), | |
402 | .BR sched_get_priority_min (2), | |
403 | .BR sched_getaffinity (2), | |
404 | .BR sched_getparam (2), | |
405 | .BR sched_rr_get_interval (2), | |
406 | .BR sched_setaffinity (2), | |
407 | .BR sched_setparam (2), | |
408 | .BR sched_yield (2), | |
409 | .BR setpriority (2), | |
410 | .BR capabilities (7) | |
411 | .PP | |
412 | .I Programming for the real world \- POSIX.4 | |
413 | by Bill O. Gallmeister, O'Reilly & Associates, Inc., ISBN 1-56592-074-0 |