Commit | Line | Data |
---|---|---|
8f0aff2a | 1 | .\" Page by b.hubert |
1abce893 MK |
2 | .\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de> |
3 | .\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com> | |
2297bf0e | 4 | .\" |
2e46a6e7 | 5 | .\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE) |
8f0aff2a | 6 | .\" may be freely modified and distributed |
8ff7380d | 7 | .\" %%%LICENSE_END |
fea681da MK |
8 | .\" |
9 | .\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com) | |
10 | .\" added ERRORS section. | |
11 | .\" | |
12 | .\" Modified 2004-06-17 mtk | |
13 | .\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE | |
14 | .\" | |
c13182ef MK |
15 | .\" 2.6.18 adds (Ingo Molnar) priority inheritance support: |
16 | .\" FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI. These need | |
34f7665a MK |
17 | .\" to be documented in the manual page. Probably there is sufficient |
18 | .\" material in the kernel source file Documentation/pi-futex.txt. | |
4f58b197 MK |
19 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc |
20 | .\" Author: Ingo Molnar <mingo@elte.hu> | |
21 | .\" Date: Tue Jun 27 02:54:58 2006 -0700 | |
22 | .\" | |
23 | .\" commit e2970f2fb6950183a34e8545faa093eb49d186e1 | |
24 | .\" Author: Ingo Molnar <mingo@elte.hu> | |
25 | .\" Date: Tue Jun 27 02:54:47 2006 -0700 | |
26 | .\" | |
27b38e1c | 27 | .\" See Documentation/pi-futex.txt |
4f58b197 | 28 | .\" |
bea08fec | 29 | .\" FIXME . |
40d5cf23 | 30 | .\" 2.6.25 adds FUTEX_WAKE_BITSET, FUTEX_WAIT_BITSET |
4f58b197 MK |
31 | .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d |
32 | .\" Author: Thomas Gleixner <tglx@linutronix.de> | |
33 | .\" Date: Fri Feb 1 17:45:14 2008 +0100 | |
34 | .\" | |
bea08fec | 35 | .\" FIXME . |
4f58b197 MK |
36 | .\" 2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI |
37 | .\" commit 52400ba946759af28442dee6265c5c0180ac7122 | |
38 | .\" Author: Darren Hart <dvhltc@us.ibm.com> | |
39 | .\" Date: Fri Apr 3 13:40:49 2009 -0700 | |
40 | .\" | |
41 | .\" commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358 | |
42 | .\" Author: Darren Hart <dvhltc@us.ibm.com> | |
43 | .\" Date: Mon Apr 20 22:22:22 2009 -0700 | |
44 | .\" | |
45 | .\" See Documentation/futex-requeue-pi.txt | |
34f7665a | 46 | .\" |
3d155313 | 47 | .TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual" |
fea681da | 48 | .SH NAME |
ce154705 | 49 | futex \- fast user-space locking |
fea681da | 50 | .SH SYNOPSIS |
9d9dc1e8 | 51 | .nf |
fea681da MK |
52 | .sp |
53 | .B "#include <linux/futex.h>" | |
fea681da MK |
54 | .B "#include <sys/time.h>" |
55 | .sp | |
d33602c4 MK |
56 | .BI "int futex(int *" uaddr ", int " futex_op ", int " val , |
57 | .BI " const struct timespec *" timeout , | |
9d9dc1e8 | 58 | .BI " int *" uaddr2 ", int " val3 ); |
fea681da | 59 | .\" int *? void *? u32 *? |
9d9dc1e8 | 60 | .fi |
409f08b0 | 61 | |
b939d6e4 MK |
62 | .IR Note : |
63 | There is no glibc wrapper for this system call; see NOTES. | |
47297adb | 64 | .SH DESCRIPTION |
fea681da MK |
65 | .PP |
66 | The | |
e511ffb6 | 67 | .BR futex () |
fea681da MK |
68 | system call provides a method for |
69 | a program to wait for a value at a given address to change, and a | |
70 | method to wake up anyone waiting on a particular address (while the | |
71 | addresses for the same memory in separate processes may not be | |
72 | equal, the kernel maps them internally so the same memory mapped in | |
73 | different locations will correspond for | |
e511ffb6 | 74 | .BR futex () |
c13182ef | 75 | calls). |
fd3fa7ef | 76 | This system call is typically used to |
fea681da MK |
77 | implement the contended case of a lock in shared memory, as |
78 | described in | |
a8bda636 | 79 | .BR futex (7). |
fea681da | 80 | .PP |
f388ba70 MK |
81 | When a futex operation did not finish uncontended in user space, a |
82 | .BR futex () | |
83 | call needs to be made to the kernel to arbitrate. | |
c13182ef | 84 | Arbitration can either mean putting the calling |
fea681da MK |
85 | process to sleep or, conversely, waking a waiting process. |
86 | .PP | |
f388ba70 MK |
87 | Callers of |
88 | .BR futex () | |
89 | are expected to adhere to the semantics described in | |
a8bda636 | 90 | .BR futex (7). |
fea681da | 91 | As these |
d603cc27 | 92 | semantics involve writing nonportable assembly instructions, this in turn |
fea681da MK |
93 | probably means that most users will in fact be library authors and not |
94 | general application developers. | |
95 | .PP | |
96 | The | |
97 | .I uaddr | |
f388ba70 MK |
98 | argument points to an integer which stores the counter (futex). |
99 | On all platforms, futexes are four-byte integers that | |
100 | must be aligned on a four-byte boundary. | |
101 | The operation to perform on the futex is specified in the | |
102 | .I futex_op | |
103 | argument; | |
104 | .IR val | |
105 | is a value whose meaning and purpose depend on | |
106 | .IR futex_op . | |
36ab2074 MK |
107 | |
108 | The remaining arguments | |
109 | .RI ( timeout , | |
110 | .IR uaddr2 , | |
111 | and | |
112 | .IR val3 ) | |
113 | are required only for certain of the futex operations described below. | |
114 | Where one of these arguments is not required, it is ignored. | |
115 | For several blocking operations, the | |
116 | .I timeout | |
117 | argument is a pointer to a | |
118 | .IR timespec | |
119 | structure that specifies a timeout for the operation. | |
120 | However, notwithstanding the prototype shown above, for some operations, | |
121 | this argument is instead a four-byte integer whose meaning | |
122 | is determined by the operation. | |
123 | Where it is required, | |
124 | .IR uaddr2 | |
125 | is a pointer to a second futex that is employed by the operation. | |
126 | The interpretation of the final integer argument, | |
127 | .IR val3 , | |
128 | depends on the operation. | |
129 | ||
6be4bad7 | 130 | The |
d33602c4 | 131 | .I futex_op |
6be4bad7 MK |
132 | argument consists of two parts: |
133 | a command that specifies the operation to be performed, | |
134 | bit-wise ORed with zero or more options that | |
135 | modify the behaviour of the operation. | |
fc30eb79 | 136 | The options that may be included in |
d33602c4 | 137 | .I futex_op |
fc30eb79 TG |
138 | are as follows: |
139 | .TP | |
140 | .BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)" | |
141 | .\" commit 34f01cc1f512fa783302982776895c73714ebbc2 | |
142 | This option bit can be employed with all futex operations. | |
143 | It tells the kernel that the futex is process private and not shared | |
144 | with another process. | |
145 | This allows the kernel to choose the fast path for validating | |
146 | the user-space address and avoids expensive VMA lookups, | |
147 | taking reference counts on file backing store, and so on. | |
ae2c1774 MK |
148 | |
149 | As a convenience, | |
150 | .IR <linux/futex.h> | |
151 | defines a set of constants with the suffix | |
152 | .BR _PRIVATE | |
153 | that are equivalents of all of the operations listed below, | |
154 | .\" except the obsolete FUTEX_FD for which the "private" flag was | |
155 | .\" meaningless | |
156 | but with the | |
157 | .BR FUTEX_PRIVATE_FLAG | |
158 | ORed into the constant value. | |
159 | Thus, there are | |
160 | .BR FUTEX_WAIT_PRIVATE , | |
161 | .BR FUTEX_WAKE_PRIVATE , | |
162 | and so on. | |
2e98bbc2 TG |
163 | .TP |
164 | .BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)" | |
165 | .\" commit 1acdac104668a0834cfa267de9946fac7764d486 | |
4a7e5b05 | 166 | This option bit can be employed only with the |
2e98bbc2 TG |
167 | .BR FUTEX_WAIT_BITSET |
168 | and | |
169 | .BR FUTEX_WAIT_REQUEUE_PI | |
170 | operations (described below). | |
171 | ||
172 | If this option is set, | |
173 | the kernel treats the user space supplied timeout as an absolute | |
174 | time based on | |
175 | .BR CLOCK_REALTIME . | |
176 | ||
177 | If this option is not set, | |
1c952cf5 MK |
178 | the kernel treats the user space supplied timeout as relative time, |
179 | .\" FIXME I added CLOCK_MONOTONIC here. Is it correct? | |
180 | measured against the | |
181 | .BR CLOCK_MONOTONIC | |
182 | clock. | |
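The two-part structure of futex_op described above can be sketched in C as follows. This is only an illustration: the helper names futex_command() and futex_options() are hypothetical and are not part of any kernel or glibc interface. The sketch assumes that the only option bits are FUTEX_PRIVATE_FLAG and FUTEX_CLOCK_REALTIME, as listed on this page.

```c
#include <linux/futex.h>

/* Option bits documented on this page (assumed to be the only ones). */
#define SKETCH_OPTION_BITS (FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

/* Hypothetical helper: extract the command part of futex_op. */
static int
futex_command(int futex_op)
{
    return futex_op & ~SKETCH_OPTION_BITS;
}

/* Hypothetical helper: extract the option bits of futex_op. */
static int
futex_options(int futex_op)
{
    return futex_op & SKETCH_OPTION_BITS;
}
```

For example, futex_command(FUTEX_WAIT_PRIVATE) yields FUTEX_WAIT, because the _PRIVATE constants are simply the base operation with FUTEX_PRIVATE_FLAG ORed in.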
6be4bad7 MK |
183 | .PP |
184 | The operation specified in | |
d33602c4 | 185 | .I futex_op |
6be4bad7 | 186 | is one of the following: |
fea681da | 187 | .TP |
81c9d87e MK |
188 | .BR FUTEX_WAIT " (since Linux 2.6.0)" |
189 | .\" Strictly speaking, since some time in 2.5.x | |
f065673c MK |
190 | This operation tests that the value at the |
191 | location pointed to by the futex address | |
fea681da MK |
192 | .I uaddr |
193 | still contains the value | |
194 | .IR val , | |
f065673c | 195 | and then sleeps awaiting |
682edefb | 196 | .B FUTEX_WAKE |
f065673c MK |
197 | on the futex address. |
198 | The test and sleep steps are performed atomically. | |
199 | If the futex value does not match | |
200 | .IR val , | |
4710334a | 201 | then the call fails immediately with the error |
f065673c MK |
202 | .BR EWOULDBLOCK . |
203 | .\" FIXME I added the following sentence. Please confirm that it is correct. | |
204 | The purpose of the test step is to detect races where | |
205 | another process changes the value of the futex between | |
206 | the time it was last checked and the time of the | |
207 | .BR FUTEX_WAIT | |
208 | operation. | |
209 | ||
1909e523 | 210 | |
c13182ef | 211 | If the |
fea681da | 212 | .I timeout |
1c952cf5 MK |
213 | argument is non-NULL, its contents specify a relative timeout for the wait |
214 | .\" FIXME I added CLOCK_MONOTONIC here. Is it correct? | |
215 | measured according to the | |
216 | .BR CLOCK_MONOTONIC | |
217 | clock. | |
82a6092b MK |
218 | (This interval will be rounded up to the system clock granularity, |
219 | and kernel scheduling delays mean that the | |
220 | blocking interval may overrun by a small amount.) | |
221 | If | |
222 | .I timeout | |
223 | is NULL, the call blocks indefinitely. | |
4798a7f3 | 224 | |
c13182ef | 225 | The arguments |
fea681da MK |
226 | .I uaddr2 |
227 | and | |
228 | .I val3 | |
229 | are ignored. | |
230 | ||
231 | For | |
a8bda636 | 232 | .BR futex (7), |
fea681da MK |
233 | this call is executed if decrementing the count gave a negative value |
234 | (indicating contention), and will sleep until another process releases | |
682edefb MK |
235 | the futex and executes the |
236 | .B FUTEX_WAKE | |
237 | operation. | |
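The test-and-sleep behavior of FUTEX_WAIT can be observed with a small sketch. Because glibc provides no wrapper, the sketch defines its own futex() function on top of syscall(2). The demo function wait_result() is hypothetical and exists only for illustration. It always passes a timeout so that it cannot block forever.

```c
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Local wrapper (an assumption of this sketch; glibc has none). */
static long
futex(int *uaddr, int futex_op, int val,
      const struct timespec *timeout, int *uaddr2, int val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}

/* Hypothetical demo: wait on a local futex word that holds 'current',
   telling the kernel that we expect it to hold 'expected'.
   Returns 0 if woken, otherwise the errno value of the failure. */
static int
wait_result(int current, int expected, long timeout_ms)
{
    int word = current;
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    if (futex(&word, FUTEX_WAIT, expected, &ts, NULL, 0) == -1)
        return errno;
    return 0;
}
```

When the futex word does not match the expected value, the call fails at once with EWOULDBLOCK (the same value as EAGAIN on Linux). When it matches and nobody performs a FUTEX_WAKE, the call should fail with ETIMEDOUT once the relative timeout expires.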
fea681da | 238 | .TP |
d67e21f5 MK |
239 | .BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)" |
240 | .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d | |
241 | .\" FIXME TO complete | |
242 | [As yet undocumented] | |
243 | .TP | |
81c9d87e MK |
244 | .BR FUTEX_WAKE " (since Linux 2.6.0)" |
245 | .\" Strictly speaking, since Linux 2.5.x | |
f065673c MK |
246 | This operation wakes at most |
247 | .I val | |
248 | processes waiting (i.e., inside | |
249 | .BR FUTEX_WAIT ) | |
250 | on the futex at the address | |
251 | .IR uaddr . | |
252 | Most commonly, | |
253 | .I val | |
254 | is specified as either 1 (wake up a single waiter) or | |
255 | .BR INT_MAX | |
256 | (wake up all waiters). | |
730bfbda MK |
257 | .\" FIXME Please confirm that the following is correct: |
258 | No guarantee is provided about which waiters are awoken | |
259 | (e.g., a waiter with a higher scheduling priority is not guaranteed | |
260 | to be awoken in preference to a waiter with a lower priority). | |
4798a7f3 | 261 | |
fea681da MK |
262 | The arguments |
263 | .IR timeout , | |
264 | .I uaddr2 | |
265 | and | |
266 | .I val3 | |
267 | are ignored. | |
268 | ||
269 | For | |
a8bda636 | 270 | .BR futex (7), |
fea681da MK |
271 | this is executed if incrementing |
272 | the count showed that there were waiters, once the futex value has been set | |
273 | to 1 (indicating that it is available). | |
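The two common choices for val described above can be sketched as a pair of helpers. The names wake_one() and wake_all() are hypothetical; they call the system call directly through syscall(2), since glibc provides no wrapper.

```c
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake at most one waiter on the futex at uaddr.
   Returns the number of waiters woken, or -1 with errno set. */
static long
wake_one(int *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* Hypothetical helper: wake every waiter on the futex at uaddr. */
static long
wake_all(int *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}
```

If no process is waiting on the futex, both helpers return 0.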
6bac3b85 MK |
274 | .\" |
275 | .\" FIXME I added some FUTEX_WAKE_OP text, and I'd be happy if someone | |
276 | .\" checked it. | |
fea681da | 277 | .TP |
d67e21f5 MK |
278 | .BR FUTEX_WAKE_OP " (since Linux 2.6.14)" |
279 | .\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721 | |
6bac3b85 MK |
280 | .\" Author: Jakub Jelinek <jakub@redhat.com> |
281 | .\" Date: Tue Sep 6 15:16:25 2005 -0700 | |
282 | This operation was added to support some user-space use cases | |
283 | where more than one futex must be handled at the same time. | |
284 | The most notable example is the implementation of | |
285 | .BR pthread_cond_signal (3), | |
286 | which requires operations on two futexes, | |
287 | the one used to implement the mutex and the one used in the implementation | |
288 | of the wait queue associated with the condition variable. | |
289 | .BR FUTEX_WAKE_OP | |
290 | allows such cases to be implemented without leading to | |
291 | high rates of contention and context switching. | |
292 | ||
293 | The | |
294 | .BR FUTEX_WAKE_OP | |
295 | operation is equivalent to atomically executing the following code: | |
296 | ||
297 | .in +4n | |
298 | .nf | |
299 | int oldval = *(int *) uaddr2; | |
300 | *(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP; | |
301 | futex(uaddr, FUTEX_WAKE, val, 0, 0, 0); | |
302 | if (oldval \fIcmp\fP \fIcmparg\fP) | |
303 | futex(uaddr2, FUTEX_WAKE, nr_wake2, 0, 0, 0); | |
304 | .fi | |
305 | .in | |
306 | ||
307 | In other words, | |
308 | .BR FUTEX_WAKE_OP | |
309 | does the following: | |
310 | .RS | |
311 | .IP * 3 | |
312 | saves the original value of the futex at | |
313 | .IR uaddr2 ; | |
314 | .IP * | |
315 | performs an operation to modify the value of the futex at | |
316 | .IR uaddr2 ; | |
317 | .IP * | |
318 | wakes up a maximum of | |
319 | .I val | |
320 | waiters on the futex | |
321 | .IR uaddr ; | |
322 | and | |
323 | .IP * | |
324 | dependent on the results of a test of the original value of the futex at | |
325 | .IR uaddr2 , | |
326 | wakes up a maximum of | |
327 | .I nr_wake2 | |
328 | waiters on the futex | |
329 | .IR uaddr2 . | |
330 | .RE | |
331 | .IP | |
332 | The | |
333 | .I nr_wake2 | |
334 | value is actually the | |
335 | .BR futex () | |
336 | .I timeout | |
337 | argument (ab)used to specify how many of the waiters on the futex at | |
338 | .IR uaddr2 | |
339 | are to be woken up; | |
340 | the kernel casts the | |
341 | .I timeout | |
342 | value to | |
343 | .IR u32 . | |
344 | ||
345 | The operation and comparison that are to be performed are encoded | |
346 | in the bits of the argument | |
347 | .IR val3 . | |
348 | Pictorially, the encoding is: | |
349 | ||
350 | .in +4n | |
351 | .nf | |
352 | +-----+-----+---------------+---------------+ | |
353 | | op | cmp | oparg | cmparg | | |
354 | +-----+-----+---------------+---------------+ | |
355 | # of bits: 4 4 12 12 | |
356 | ||
357 | .fi | |
358 | .in | |
359 | ||
360 | Expressed in code, the encoding is: | |
361 | ||
362 | .in +4n | |
363 | .nf | |
364 | #define FUTEX_OP(op, oparg, cmp, cmparg) \\ | |
365 | (((op & 0xf) << 28) | \\ | |
366 | ((cmp & 0xf) << 24) | \\ | |
367 | ((oparg & 0xfff) << 12) | \\ | |
368 | (cmparg & 0xfff)) | |
369 | .fi | |
370 | .in | |
371 | ||
372 | In the above, | |
373 | .I op | |
374 | and | |
375 | .I cmp | |
376 | are each one of the codes listed below. | |
377 | The | |
378 | .I oparg | |
379 | and | |
380 | .I cmparg | |
381 | components are literal numeric values, except as noted below. | |
382 | ||
383 | The | |
384 | .I op | |
385 | component has one of the following values: | |
386 | ||
387 | .in +4n | |
388 | .nf | |
389 | FUTEX_OP_SET 0 /* uaddr2 = oparg; */ | |
390 | FUTEX_OP_ADD 1 /* uaddr2 += oparg; */ | |
391 | FUTEX_OP_OR 2 /* uaddr2 |= oparg; */ | |
392 | FUTEX_OP_ANDN 3 /* uaddr2 &= ~oparg; */ | |
393 | FUTEX_OP_XOR 4 /* uaddr2 ^= oparg; */ | |
394 | .fi | |
395 | .in | |
396 | ||
397 | In addition, bit-wise ORing the following value into | |
398 | .I op | |
399 | causes | |
400 | .IR "(1\ <<\ oparg)" | |
401 | to be used as the operand: | |
402 | ||
403 | .in +4n | |
404 | .nf | |
405 | FUTEX_OP_ARG_SHIFT 8 /* Use (1 << oparg) as operand */ | |
406 | .fi | |
407 | .in | |
408 | ||
409 | The | |
410 | .I cmp | |
411 | field is one of the following: | |
412 | ||
413 | .in +4n | |
414 | .nf | |
415 | FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */ | |
416 | FUTEX_OP_CMP_NE 1 /* if (oldval != cmparg) wake */ | |
417 | FUTEX_OP_CMP_LT 2 /* if (oldval < cmparg) wake */ | |
418 | FUTEX_OP_CMP_LE 3 /* if (oldval <= cmparg) wake */ | |
419 | FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */ | |
420 | FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */ | |
421 | .fi | |
422 | .in | |
423 | ||
424 | The return value of | |
425 | .BR FUTEX_WAKE_OP | |
426 | is the sum of the number of waiters woken on the futex | |
427 | .IR uaddr | |
428 | plus the number of waiters woken on the futex | |
429 | .IR uaddr2 . | |
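As an illustration of the encoding and of the (ab)used timeout argument, the following sketch builds a val3 value and issues a FUTEX_WAKE_OP call. The function names encode_wake_op(), wake_op(), and add_one_via_wake_op() are hypothetical. The encoder mirrors the FUTEX_OP() macro shown above but uses unsigned arithmetic, so that ORing FUTEX_OP_ARG_SHIFT into op does not shift into the sign bit.

```c
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical encoder: same layout as FUTEX_OP(), unsigned arithmetic. */
static uint32_t
encode_wake_op(uint32_t op, uint32_t oparg, uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* Hypothetical wrapper: nr_wake2 travels in the timeout argument slot. */
static long
wake_op(int *uaddr, int val, int *uaddr2, uint32_t nr_wake2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP, val,
                   (unsigned long) nr_wake2, uaddr2, val3);
}

/* Hypothetical demo: atomically add 1 to a second futex word that
   starts at 'start'; wake waiters on it if the old value was > 0.
   Returns the new value of the second futex word. */
static int
add_one_via_wake_op(int start)
{
    int f1 = 0, f2 = start;

    wake_op(&f1, 1, &f2, 1,
            encode_wake_op(FUTEX_OP_ADD, 1, FUTEX_OP_CMP_GT, 0));
    return f2;
}
```

With no waiters on either futex, the call wakes nobody, but the arithmetic operation on the futex at uaddr2 is still performed.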
d67e21f5 MK |
430 | .TP |
431 | .BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)" | |
432 | .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d | |
433 | .\" FIXME TO complete | |
434 | [As yet undocumented] | |
435 | .TP | |
436 | .BR FUTEX_LOCK_PI " (since Linux 2.6.18)" | |
437 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc | |
438 | .\" FIXME to complete | |
439 | [As yet undocumented] | |
440 | .TP | |
441 | .BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)" | |
442 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc | |
443 | .\" FIXME to complete | |
444 | [As yet undocumented] | |
445 | .TP | |
446 | .BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)" | |
447 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc | |
448 | .\" FIXME to complete | |
449 | [As yet undocumented] | |
450 | .TP | |
81c9d87e MK |
451 | .BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)" |
452 | .\" Strictly speaking, from Linux 2.5.x to 2.6.25 | |
ff597681 MK |
453 | This operation creates a file descriptor that is associated with the futex at |
454 | .IR uaddr . | |
fea681da | 455 | .\" , suitable for .BR poll (2). |
ff597681 MK |
456 | The calling process must close the returned file descriptor after use. |
457 | When another process performs a | |
458 | .BR FUTEX_WAKE | |
459 | on the futex, the file descriptor is reported as readable by | |
460 | .BR select (2), | |
461 | .BR poll (2), | |
462 | and | |
463 | .BR epoll (7). | |
464 | ||
465 | The file descriptor can be used to obtain asynchronous notifications: | |
466 | if | |
467 | .I val | |
468 | is nonzero, then when another process executes a | |
682edefb | 469 | .BR FUTEX_WAKE , |
ff597681 | 470 | the caller will receive the signal number that was passed in |
fea681da | 471 | .IR val . |
4798a7f3 | 472 | |
fea681da MK |
473 | The arguments |
474 | .IR timeout , | |
475 | .I uaddr2 | |
476 | and | |
477 | .I val3 | |
478 | are ignored. | |
479 | ||
c13182ef | 480 | To prevent race conditions, the caller should test if the futex has |
682edefb MK |
481 | been upped after |
482 | .B FUTEX_FD | |
483 | returns. | |
266a5e91 | 484 | |
da36351e | 485 | Because it was inherently racy, |
682edefb | 486 | .B FUTEX_FD |
75bc6c11 MK |
487 | has been removed |
488 | .\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80 | |
489 | from Linux 2.6.26 onward. | |
fea681da | 490 | .TP |
81c9d87e MK |
491 | .BR FUTEX_REQUEUE " (since Linux 2.6.0)" |
492 | .\" Strictly speaking: from Linux 2.5.70 | |
4ac63a6c MK |
493 | .\" |
494 | .\" FIXME I added this warning. Okay? | |
495 | .IR "Avoid using this operation" . | |
dd05d612 | 496 | It is broken (unavoidably racy) for its intended purpose. |
4ac63a6c MK |
497 | Use |
498 | .BR FUTEX_CMP_REQUEUE | |
499 | instead. | |
500 | ||
dd05d612 MK |
501 | This operation performs the same task as |
502 | .BR FUTEX_CMP_REQUEUE , | |
503 | except that no check is made using the value in | |
504 | .IR val3 . | |
505 | (The argument | |
fea681da | 506 | .I val3 |
dd05d612 | 507 | is ignored.) |
fea681da MK |
508 | .TP |
509 | .BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)" | |
3dfcc11d | 510 | This operation was added as a replacement for the earlier |
682edefb | 511 | .BR FUTEX_REQUEUE , |
3dfcc11d MK |
512 | because that operation was racy for its intended use. |
513 | ||
514 | As with | |
682edefb | 515 | .BR FUTEX_REQUEUE , |
3dfcc11d MK |
516 | the |
517 | .BR FUTEX_CMP_REQUEUE | |
518 | operation is used to avoid a "thundering herd" effect when | |
519 | .B FUTEX_WAKE | |
520 | is used and all processes woken up need to acquire another futex. | |
521 | It differs from | |
522 | .BR FUTEX_REQUEUE | |
523 | in that it first checks whether the location | |
fea681da MK |
524 | .I uaddr |
525 | still contains the value | |
526 | .IR val3 . | |
e808bba0 MK |
527 | If not, the operation fails with the error |
528 | .BR EAGAIN . | |
3dfcc11d MK |
529 | .\" FIXME I added the following sentence on rational for FUTEX_CMP_REQUEUE. |
530 | .\" Is it correct? SHould it be expanded? | |
531 | This additional feature of | |
532 | .BR FUTEX_CMP_REQUEUE | |
533 | can be used by the caller to (atomically) detect changes | |
534 | in the value of the source futex at | |
535 | .IR uaddr . | |
4798a7f3 | 536 | |
3dfcc11d MK |
537 | The operation wakes up a maximum of |
538 | .I val | |
539 | waiters that are waiting on the futex at | |
540 | .IR uaddr . | |
541 | If there are more than | |
542 | .I val | |
543 | waiters, then the remaining waiters are removed | |
544 | from the wait queue of the source futex at | |
545 | .I uaddr | |
546 | and added to the wait queue of the target futex at | |
547 | .IR uaddr2 . | |
548 | The | |
549 | .I timeout | |
550 | argument is (ab)used to specify a cap on the number of waiters | |
551 | that are requeued to the futex at | |
552 | .IR uaddr2 ; | |
553 | the kernel casts the | |
fea681da | 554 | .I timeout |
3dfcc11d MK |
555 | value to |
556 | .IR u32 . | |
557 | ||
558 | .\" FIXME Please review the following new paragraph to see if it is | |
559 | .\" accurate. | |
560 | Typical values to specify for | |
561 | .I val | |
562 | are 0 or 1. | |
563 | (Specifying | |
564 | .BR INT_MAX | |
565 | is not useful, because it would make the | |
566 | .BR FUTEX_CMP_REQUEUE | |
567 | operation equivalent to | |
568 | .BR FUTEX_WAKE .) | |
569 | The cap value specified via the (abused) | |
570 | .I timeout | |
571 | argument is typically either 1 or | |
572 | .BR INT_MAX . | |
573 | (Specifying the argument as 0 is not useful, because it would make the | |
574 | .BR FUTEX_CMP_REQUEUE | |
575 | operation equivalent to | |
576 | .BR FUTEX_WAKE .) | |
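The val3 check and the use of the timeout slot as a requeue cap can be sketched as follows. The names cmp_requeue() and requeue_result() are hypothetical, and the call goes through syscall(2) because glibc provides no wrapper.

```c
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: wake up to nr_wake waiters on uaddr and requeue
   up to nr_requeue of the rest to uaddr2, provided that *uaddr still
   equals 'expected'. The cap travels in the timeout argument slot. */
static long
cmp_requeue(int *uaddr, int nr_wake, int nr_requeue,
            int *uaddr2, int expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                   (unsigned long) (unsigned int) nr_requeue,
                   uaddr2, expected);
}

/* Hypothetical demo on two local futex words with no waiters.
   Returns 0 on success, otherwise the errno value of the failure. */
static int
requeue_result(int current, int expected)
{
    int src = current, dst = 0;

    if (cmp_requeue(&src, 1, INT_MAX, &dst, expected) == -1)
        return errno;
    return 0;
}
```

If the source futex no longer holds the expected value, the call fails with EAGAIN and no waiter is woken or requeued.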
d67e21f5 MK |
577 | .TP |
578 | .BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)" | |
579 | .\" commit 52400ba946759af28442dee6265c5c0180ac7122 | |
580 | .\" FIXME to complete | |
581 | [As yet undocumented] | |
582 | .TP | |
583 | .BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)" | |
584 | .\" commit 52400ba946759af28442dee6265c5c0180ac7122 | |
585 | .\" FIXME to complete | |
586 | [As yet undocumented] | |
47297adb | 587 | .SH RETURN VALUE |
fea681da | 588 | .PP |
e808bba0 MK |
589 | In the event of an error, all operations return \-1, and set |
590 | .I errno | |
591 | to indicate the error. | |
592 | The return value on success depends on the operation, | |
593 | as described in the following list: | |
fea681da MK |
594 | .TP |
595 | .B FUTEX_WAIT | |
682edefb MK |
596 | Returns 0 if the process was woken by a |
597 | .B FUTEX_WAKE | |
598 | call. | |
e808bba0 | 599 | See ERRORS for the various possible error returns. |
fea681da MK |
600 | .TP |
601 | .B FUTEX_WAKE | |
602 | Returns the number of processes woken up. | |
603 | .TP | |
604 | .B FUTEX_FD | |
605 | Returns the new file descriptor associated with the futex. | |
606 | .TP | |
607 | .B FUTEX_REQUEUE | |
608 | Returns the number of processes woken up. | |
609 | .TP | |
610 | .B FUTEX_CMP_REQUEUE | |
3dfcc11d MK |
611 | Returns the total number of processes woken up or requeued to the futex at |
612 | .IR uaddr2 . | |
613 | If this value is greater than | |
614 | .IR val , | |
615 | then the difference is the number of waiters requeued to the futex at | |
616 | .IR uaddr2 . | |
519f2c3d MK |
617 | .\" |
618 | .\" FIXME Add success returns for other operations | |
fea681da MK |
619 | .SH ERRORS |
620 | .TP | |
621 | .B EACCES | |
622 | No read access to futex memory. | |
623 | .TP | |
624 | .B EAGAIN | |
682edefb | 625 | .B FUTEX_CMP_REQUEUE |
e808bba0 | 626 | detected that the value pointed to by |
9f6c40c0 MK |
627 | .I uaddr |
628 | is not equal to the expected value | |
629 | .IR val3 . | |
fd1dc4c2 | 630 | .\" FIXME: Is the following sentence correct? |
fea681da | 631 | (This probably indicates a race; |
682edefb MK |
632 | use the safe |
633 | .B FUTEX_WAKE | |
634 | now.) | |
fea681da MK |
635 | .TP |
636 | .B EFAULT | |
1ea901e8 MK |
637 | A required pointer argument (i.e., |
638 | .IR uaddr , | |
639 | .IR uaddr2 , | |
640 | or | |
641 | .IR timeout ) | |
496df304 | 642 | did not point to a valid user-space address. |
fea681da | 643 | .TP |
9f6c40c0 | 644 | .B EINTR |
e808bba0 | 645 | A |
9f6c40c0 | 646 | .B FUTEX_WAIT |
2674f781 MK |
647 | or |
648 | .B FUTEX_WAIT_BITSET | |
e808bba0 MK |
649 | operation was interrupted by a signal (see |
650 | .BR signal (7)) | |
651 | or a spurious wakeup. | |
9f6c40c0 | 652 | .TP |
fea681da | 653 | .B EINVAL |
fb2f4c27 MK |
654 | .RB ( FUTEX_WAIT , |
655 | .BR FUTEX_WAIT_REQUEUE_PI ) | |
656 | The supplied | |
657 | .I timeout | |
658 | argument was invalid | |
659 | .RI ( tv_sec | |
660 | was less than zero, or | |
661 | .IR tv_nsec | |
662 | was not less than 1,000,000,000). | |
663 | .TP | |
664 | .B EINVAL | |
ea355b7f | 665 | .RB ( FUTEX_WAIT , |
caf1ff25 | 666 | .BR FUTEX_WAKE , |
6bac3b85 | 667 | .BR FUTEX_WAKE_OP , |
a1f47699 MK |
668 | .BR FUTEX_REQUEUE , |
669 | .BR FUTEX_CMP_REQUEUE ) | |
51ee94be | 670 | .I uaddr |
caf1ff25 | 671 | or (for |
a1f47699 MK |
672 | .BR FUTEX_REQUEUE |
673 | and | |
674 | .BR FUTEX_CMP_REQUEUE ) | |
caf1ff25 | 675 | .I uaddr2 |
51ee94be MK |
676 | does not point to a valid object\(emthat is, |
677 | the address is not 4-byte-aligned. | |
678 | .TP | |
679 | .B EINVAL | |
bae14b6c | 680 | .RB ( FUTEX_WAKE , |
e169277f MK |
681 | .BR FUTEX_REQUEUE , |
682 | .BR FUTEX_CMP_REQUEUE ) | |
496df304 | 683 | The kernel detected an inconsistency between the user-space state at |
9534086b TG |
684 | .I uaddr |
685 | and the kernel state\(emthat is, it detected a waiter which waits in | |
686 | .BR FUTEX_LOCK_PI . | |
687 | .TP | |
688 | .B EINVAL | |
add875c0 MK |
689 | .RB ( FUTEX_REQUEUE ) |
690 | .\" FIXME tglx suggested adding this, but does this error really | |
691 | .\" occur for FUTEX_REQUEUE? | |
692 | .I uaddr | |
693 | equals | |
694 | .IR uaddr2 | |
695 | (i.e., an attempt was made to requeue to the same futex). | |
696 | .TP | |
697 | .B EINVAL | |
6bac3b85 MK |
698 | .RB ( FUTEX_WAKE_OP ) |
699 | The kernel detected an inconsistency between the user-space state at | |
700 | .I uaddr | |
701 | and the kernel state; that is, it detected a waiter which waits in | |
702 | .B FUTEX_LOCK_PI | |
703 | on | |
704 | .IR uaddr . | |
705 | .TP | |
ff597681 MK |
706 | .BR EINVAL |
707 | .RB ( FUTEX_FD ) | |
708 | The signal number supplied in | |
709 | .I val | |
710 | is invalid. | |
711 | .TP | |
6bac3b85 | 712 | .B EINVAL |
4832b48a | 713 | Invalid argument. |
fea681da MK |
714 | .TP |
715 | .B ENFILE | |
ff597681 | 716 | .RB ( FUTEX_FD ) |
fea681da | 717 | The system limit on the total number of open files has been reached. |
4701fc28 MK |
718 | .TP |
719 | .B ENOSYS | |
720 | Invalid operation specified in | |
d33602c4 | 721 | .IR futex_op . |
9f6c40c0 | 722 | .TP |
4a7e5b05 MK |
723 | .B ENOSYS |
724 | The | |
725 | .BR FUTEX_CLOCK_REALTIME | |
726 | option was specified in | |
d33602c4 | 727 | .IR futex_op ,
4a7e5b05 MK |
728 | but the accompanying operation was neither |
729 | .BR FUTEX_WAIT_BITSET | |
730 | nor | |
731 | .BR FUTEX_WAIT_REQUEUE_PI . | |
732 | .TP | |
9f6c40c0 | 733 | .B ETIMEDOUT |
d1926d78 MK |
734 | .RB ( FUTEX_WAIT ) |
735 | The operation timed out. | |
9f6c40c0 MK |
736 | .TP |
737 | .B EWOULDBLOCK | |
d33602c4 | 738 | .I futex_op |
e808bba0 MK |
739 | was |
740 | .BR FUTEX_WAIT | |
741 | and the value pointed to by | |
9f6c40c0 MK |
742 | .I uaddr |
743 | was not equal to the expected value | |
744 | .I val | |
e808bba0 | 745 | at the time of the call. |
47297adb | 746 | .SH VERSIONS |
a1d5f77c | 747 | .PP |
81c9d87e MK |
748 | Futexes were first made available in a stable kernel release |
749 | with Linux 2.6.0. | |
750 | ||
a1d5f77c MK |
751 | Initial futex support was merged in Linux 2.5.7 but with different semantics |
752 | from what was described above. | |
c4bb193f | 753 | A 4-argument system call with the semantics |
fd3fa7ef | 754 | described in this page was introduced in Linux 2.5.40. |
11b520ed | 755 | In Linux 2.5.70, one argument |
a1d5f77c | 756 | was added. |
11b520ed | 757 | In Linux 2.6.7, a sixth argument was added\(emmessy, especially |
a1d5f77c | 758 | on the s390 architecture. |
47297adb | 759 | .SH CONFORMING TO |
8382f16d | 760 | This system call is Linux-specific. |
47297adb | 761 | .SH NOTES |
fea681da | 762 | .PP |
fcdad7d6 | 763 | To reiterate, bare futexes are not intended as an easy-to-use abstraction |
c13182ef | 764 | for end-users. |
fcdad7d6 | 765 | (There is no wrapper function for this system call in glibc.) |
c13182ef | 766 | Implementors are expected to be assembly literate and to have |
7fac88a9 | 767 | read the sources of the futex user-space library referenced below. |
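To give an idea of what such library code looks like, here is a minimal sketch of a lock built on FUTEX_WAIT and FUTEX_WAKE. It is not the scheme described in futex(7); it loosely follows the three-state design discussed in "Futexes Are Tricky" and uses C11 atomics instead of assembly. All names (sketch_mutex, sketch_lock(), sketch_unlock(), futex_call()) are hypothetical. The sketch assumes that atomic_int has the same size and representation as int, and that the lock is shared only between threads of one process (hence the _PRIVATE operations).

```c
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked, waiters possible */
typedef struct { atomic_int state; } sketch_mutex;

static long
futex_call(atomic_int *uaddr, int futex_op, int val)
{
    return syscall(SYS_futex, (int *) uaddr, futex_op, val, NULL, NULL, 0);
}

static void
sketch_lock(sketch_mutex *m)
{
    int c = 0;

    /* Uncontended case: 0 -> 1 entirely in user space. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Contended case: mark the lock as having waiters, then sleep
       for as long as the state is still 2. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        futex_call(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void
sketch_unlock(sketch_mutex *m)
{
    /* If the old state was 2, there may be waiters: wake one. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex_call(&m->state, FUTEX_WAKE_PRIVATE, 1);
    }
}
```

In the uncontended case neither function enters the kernel, which is the point of the design: the system call is made only when the user-space atomic operation reveals contention.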
d282bb24 | 768 | .\" .SH AUTHORS |
fea681da MK |
769 | .\" .PP |
770 | .\" Futexes were designed and worked on by | |
771 | .\" Hubertus Franke (IBM Thomas J. Watson Research Center), | |
772 | .\" Matthew Kirkwood, Ingo Molnar (Red Hat) | |
773 | .\" and Rusty Russell (IBM Linux Technology Center). | |
774 | .\" This page written by bert hubert. | |
47297adb | 775 | .SH SEE ALSO |
d806bc05 | 776 | .BR restart_syscall (2), |
14d8dd3b | 777 | .BR futex (7) |
fea681da | 778 | .PP |
52087dd3 | 779 | \fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP |
9b936e9e MK |
780 | (proceedings of the Ottawa Linux Symposium 2002), online at |
781 | .br | |
608bf950 SK |
782 | .UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf |
783 | .UE | |
f42eb21b MK |
784 | |
785 | \fIFutexes Are Tricky\fP (updated in 2011), Ulrich Drepper | |
786 | .UR http://www.akkadia.org/drepper/futex.pdf | |
787 | .UE | |
9b936e9e MK |
788 | .PP |
789 | Futex example library, futex-*.tar.bz2 at | |
790 | .br | |
a605264d | 791 | .UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/ |
608bf950 | 792 | .UE |