Commit | Line | Data |
---|---|---|
8f0aff2a | 1 | .\" Page by b.hubert |
1abce893 MK |
2 | .\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de> |
3 | .\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com> | |
2297bf0e | 4 | .\" |
2e46a6e7 | 5 | .\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE) |
8f0aff2a | 6 | .\" may be freely modified and distributed |
8ff7380d | 7 | .\" %%%LICENSE_END |
fea681da MK |
8 | .\" |
9 | .\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com) | |
10 | .\" added ERRORS section. | |
11 | .\" | |
12 | .\" Modified 2004-06-17 mtk | |
13 | .\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE | |
14 | .\" | |
3d155313 | 15 | .TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual" |
fea681da | 16 | .SH NAME |
ce154705 | 17 | futex \- fast user-space locking |
fea681da | 18 | .SH SYNOPSIS |
9d9dc1e8 | 19 | .nf |
fea681da MK |
20 | .sp |
21 | .B "#include <linux/futex.h>" | |
fea681da MK |
22 | .B "#include <sys/time.h>" |
23 | .sp | |
d33602c4 | 24 | .BI "int futex(int *" uaddr ", int " futex_op ", int " val , |
768d3c23 MK |
25 | .BI " const struct timespec *" timeout , \ |
26 | " \fR /* or: \fBu32 \fIval2\fP */ | |
9d9dc1e8 | 27 | .BI " int *" uaddr2 ", int " val3 ); |
9d9dc1e8 | 28 | .fi |
409f08b0 | 29 | |
b939d6e4 MK |
30 | .IR Note : |
31 | There is no glibc wrapper for this system call; see NOTES. | |
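Because glibc provides no wrapper, callers usually invoke the system call through syscall(2). The following is a minimal sketch of such a wrapper, not a definitive implementation. The function name futex() and the alignment check are assumptions made for illustration; the argument list simply mirrors the prototype shown in the SYNOPSIS.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper (glibc does not provide one): forwards its
   arguments unchanged to the raw futex system call.  Returns the
   system call result, or -1 with errno set on error. */
static int
futex(int *uaddr, int futex_op, int val,
      const struct timespec *timeout, int *uaddr2, int val3)
{
    /* Futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);

    return (int) syscall(SYS_futex, uaddr, futex_op, val,
                         timeout, uaddr2, val3);
}
```

With this helper in place, a call such as futex(&word, FUTEX_WAKE, 1, NULL, NULL, 0) wakes at most one waiter on word. For operations that interpret the fourth argument as val2, the caller would need to cast the integer to a pointer type.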
47297adb | 32 | .SH DESCRIPTION |
fea681da MK |
33 | .PP |
34 | The | |
e511ffb6 | 35 | .BR futex () |
fea681da MK |
36 | system call provides a method for |
37 | a program to wait for a value at a given address to change, and a | |
f19904c0 | 38 | method to wake up anyone waiting on a particular address. |
a5956430 MK |
39 | (While the virtual addresses for the same memory in separate |
40 | processes may not be equal, | |
41 | the kernel maps them internally so that the same memory mapped | |
42 | in different locations will correspond for | |
e511ffb6 | 43 | .BR futex () |
f19904c0 | 44 | calls.) |
fd3fa7ef | 45 | This system call is typically used to |
fea681da MK |
46 | implement the contended case of a lock in shared memory, as |
47 | described in | |
a8bda636 | 48 | .BR futex (7). |
809ca3ae MK |
49 | |
50 | In the uncontended case, | |
51 | all operations on the futex memory location are performed | |
52 | in user space using atomic machine-language instructions, | |
53 | and the kernel maintains no information about the futex. | |
54 | The kernel allocates state information for the futex only | |
55 | in the contended case, when operations such as | |
56 | .BR FUTEX_WAIT , | |
57 | described below, are performed. | |
58 | ||
f388ba70 MK |
59 | When a futex operation did not finish uncontended in user space, a |
60 | .BR futex () | |
61 | call needs to be made to the kernel to arbitrate. | |
bdc5957a MK |
62 | Arbitration can either mean putting the caller |
63 | to sleep or, conversely, waking a waiting process or thread. | |
fea681da | 64 | .PP |
f388ba70 MK |
65 | Callers of |
66 | .BR futex () | |
67 | are expected to adhere to the semantics described in | |
a8bda636 | 68 | .BR futex (7). |
ed44c7c0 MK |
69 | As these semantics involve writing nonportable assembly instructions |
70 | (see the example library referred to in SEE ALSO), | |
71 | this in turn probably means that most users will in fact be | |
72 | library authors and not general application developers. | |
a663ca5a MK |
73 | .\" |
74 | .SS Arguments | |
fea681da MK |
75 | The |
76 | .I uaddr | |
f388ba70 MK |
77 | argument points to an integer which stores the counter (futex). |
78 | On all platforms, futexes are four-byte integers that | |
79 | must be aligned on a four-byte boundary. | |
80 | The operation to perform on the futex is specified in the | |
81 | .I futex_op | |
82 | argument; | |
83 | .IR val | |
84 | is a value whose meaning and purpose depends on | |
85 | .IR futex_op . | |
36ab2074 MK |
86 | |
87 | The remaining arguments | |
88 | .RI ( timeout , | |
89 | .IR uaddr2 , | |
90 | and | |
91 | .IR val3 ) | |
92 | are required only for certain of the futex operations described below. | |
93 | Where one of these arguments is not required, it is ignored. | |
768d3c23 | 94 | |
36ab2074 MK |
95 | For several blocking operations, the |
96 | .I timeout | |
97 | argument is a pointer to a | |
98 | .IR timespec | |
99 | structure that specifies a timeout for the operation. | |
100 | However, notwithstanding the prototype shown above, for some operations, | |
101 | this argument is instead a four-byte integer whose meaning | |
102 | is determined by the operation. | |
768d3c23 MK |
103 | For these operations, the kernel casts the |
104 | .I timeout | |
105 | value to | |
106 | .IR u32 , | |
107 | and in the remainder of this page, this argument is referred to as | |
108 | .I val2 | |
109 | when interpreted in this fashion. | |
110 | ||
de5a3bb4 | 111 | Where it is required, the |
36ab2074 | 112 | .IR uaddr2 |
de5a3bb4 | 113 | argument is a pointer to a second futex that is employed by the operation. |
36ab2074 MK |
114 | The interpretation of the final integer argument, |
115 | .IR val3 , | |
116 | depends on the operation. | |
a663ca5a MK |
117 | .\" |
118 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
119 | .\" | |
120 | .SS Futex operations | |
6be4bad7 | 121 | The |
d33602c4 | 122 | .I futex_op |
6be4bad7 MK |
123 | argument consists of two parts: |
124 | a command that specifies the operation to be performed, | |
125 | bit-wise ORed with zero or more options that | |
126 | modify the behaviour of the operation. | |
fc30eb79 | 127 | The options that may be included in |
d33602c4 | 128 | .I futex_op |
fc30eb79 TG |
129 | are as follows: |
130 | .TP | |
131 | .BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)" | |
132 | .\" commit 34f01cc1f512fa783302982776895c73714ebbc2 | |
133 | This option bit can be employed with all futex operations. | |
e45f9735 MK |
134 | It tells the kernel that the futex is process-private and not shared |
135 | with another process | |
136 | (i.e., it is being used for synchronization between threads). | |
fc30eb79 TG |
137 | This allows the kernel to choose the fast path for validating |
138 | the user-space address and avoids expensive VMA lookups, | |
139 | taking reference counts on file backing store, and so on. | |
ae2c1774 MK |
140 | |
141 | As a convenience, | |
142 | .IR <linux/futex.h> | |
143 | defines a set of constants with the suffix | |
144 | .BR _PRIVATE | |
145 | that are equivalents of all of the operations listed below, | |
dcdfde26 | 146 | .\" except the obsolete FUTEX_FD, for which the "private" flag was |
ae2c1774 MK |
147 | .\" meaningless |
148 | but with the | |
149 | .BR FUTEX_PRIVATE_FLAG | |
150 | ORed into the constant value. | |
151 | Thus, there are | |
152 | .BR FUTEX_WAIT_PRIVATE , | |
153 | .BR FUTEX_WAKE_PRIVATE , | |
154 | and so on. | |
2e98bbc2 TG |
155 | .TP |
156 | .BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)" | |
157 | .\" commit 1acdac104668a0834cfa267de9946fac7764d486 | |
4a7e5b05 | 158 | This option bit can be employed only with the |
2e98bbc2 TG |
159 | .BR FUTEX_WAIT_BITSET |
160 | and | |
161 | .BR FUTEX_WAIT_REQUEUE_PI | |
c84cf68c | 162 | operations. |
2e98bbc2 | 163 | |
f2103b26 MK |
164 | If this option is set, the kernel treats |
165 | .I timeout | |
166 | as an absolute time based on | |
2e98bbc2 TG |
167 | .BR CLOCK_REALTIME . |
168 | ||
f2103b26 MK |
169 | If this option is not set, the kernel treats |
170 | .I timeout | |
171 | as relative time, | |
f1d2171d | 172 | .\" FIXME XXX I added CLOCK_MONOTONIC here. Okay? |
1c952cf5 MK |
173 | measured against the |
174 | .BR CLOCK_MONOTONIC | |
175 | clock. | |
6be4bad7 MK |
176 | .PP |
177 | The operation specified in | |
d33602c4 | 178 | .I futex_op |
6be4bad7 | 179 | is one of the following: |
70b06b90 MK |
180 | .\" |
181 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
182 | .\" | |
fea681da | 183 | .TP |
81c9d87e MK |
184 | .BR FUTEX_WAIT " (since Linux 2.6.0)" |
185 | .\" Strictly speaking, since some time in 2.5.x | |
f065673c MK |
186 | This operation tests that the value at the |
187 | location pointed to by the futex address | |
fea681da MK |
188 | .I uaddr |
189 | still contains the value | |
190 | .IR val , | |
f065673c | 191 | and then sleeps awaiting |
682edefb | 192 | .B FUTEX_WAKE |
f065673c MK |
193 | on the futex address. |
194 | The test and sleep steps are performed atomically. | |
195 | If the futex value does not match | |
196 | .IR val , | |
4710334a | 197 | then the call fails immediately with the error |
badbf70c | 198 | .BR EAGAIN . |
f065673c MK |
199 | .\" FIXME I added the following sentence. Please confirm that it is correct. |
200 | The purpose of the test step is to detect races where | |
bdc5957a | 201 | another process or thread changes the value of the futex between |
f065673c MK |
202 | the time it was last checked and the time of the |
203 | .BR FUTEX_WAIT | |
63d3f911 | 204 | operation. |
1909e523 | 205 | |
c13182ef | 206 | If the |
fea681da | 207 | .I timeout |
53ba4030 | 208 | argument is non-NULL, its contents specify a relative timeout for the wait, |
f1d2171d | 209 | .\" FIXME XXX I added CLOCK_MONOTONIC here. Okay? |
1c952cf5 MK |
210 | measured according to the |
211 | .BR CLOCK_MONOTONIC | |
212 | clock. | |
82a6092b MK |
213 | (This interval will be rounded up to the system clock granularity, |
214 | and kernel scheduling delays mean that the | |
215 | blocking interval may overrun by a small amount.) | |
216 | If | |
217 | .I timeout | |
218 | is NULL, the call blocks indefinitely. | |
4798a7f3 | 219 | |
c13182ef | 220 | The arguments |
fea681da MK |
221 | .I uaddr2 |
222 | and | |
223 | .I val3 | |
224 | are ignored. | |
225 | ||
226 | For | |
a8bda636 | 227 | .BR futex (7), |
fea681da | 228 | this call is executed if decrementing the count gave a negative value |
bdc5957a MK |
229 | (indicating contention), |
230 | and will sleep until another process or thread releases | |
682edefb MK |
231 | the futex and executes the |
232 | .B FUTEX_WAKE | |
233 | operation. | |
70b06b90 MK |
234 | .\" |
235 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
236 | .\" | |
fea681da | 237 | .TP |
81c9d87e MK |
238 | .BR FUTEX_WAKE " (since Linux 2.6.0)" |
239 | .\" Strictly speaking, since Linux 2.5.x | |
f065673c MK |
240 | This operation wakes at most |
241 | .I val | |
bdc5957a | 242 | of the waiters that are waiting (i.e., inside |
f065673c MK |
243 | .BR FUTEX_WAIT ) |
244 | on the futex at the address | |
245 | .IR uaddr . | |
246 | Most commonly, | |
247 | .I val | |
248 | is specified as either 1 (wake up a single waiter) or | |
249 | .BR INT_MAX | |
250 | (wake up all waiters). | |
730bfbda MK |
251 | .\" FIXME Please confirm that the following is correct: |
252 | No guarantee is provided about which waiters are awoken | |
253 | (e.g., a waiter with a higher scheduling priority is not guaranteed | |
254 | to be awoken in preference to a waiter with a lower priority). | |
4798a7f3 | 255 | |
fea681da MK |
256 | The arguments |
257 | .IR timeout , | |
c8b921bd | 258 | .IR uaddr2 , |
fea681da MK |
259 | and |
260 | .I val3 | |
261 | are ignored. | |
262 | ||
263 | For | |
a8bda636 | 264 | .BR futex (7), |
f2bf5121 | 265 | this is executed if incrementing the count showed that there were waiters, |
64191e8f | 266 | .\" FIXME How does "incrementing the count showed that there were waiters"? |
f2bf5121 | 267 | once the futex value has been set to 1 (indicating that it is available). |
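The interplay of FUTEX_WAIT and FUTEX_WAKE described above can be sketched as a one-shot flag. This is an illustrative sketch only: the helper names wait_for_flag() and set_flag() are hypothetical, and the code assumes that atomic_int has the same size and representation as the four-byte futex word, which holds on common Linux platforms but is not guaranteed by the C standard.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: block until *flag becomes nonzero. */
static void
wait_for_flag(atomic_int *flag)
{
    while (atomic_load(flag) == 0) {
        /* The kernel sleeps only if *flag still contains 0; if the
           value has changed, the call fails with EAGAIN and the
           loop rechecks the flag. */
        long r = syscall(SYS_futex, flag, FUTEX_WAIT, 0, NULL, NULL, 0);

        /* EAGAIN (value changed) and EINTR (signal) are expected. */
        assert(r == 0 || errno == EAGAIN || errno == EINTR);
        (void) r;
    }
}

/* Hypothetical helper: publish the flag, then wake all waiters. */
static void
set_flag(atomic_int *flag)
{
    atomic_store(flag, 1);
    syscall(SYS_futex, flag, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}
```

The loop around FUTEX_WAIT matters: a return from the call does not by itself prove that the flag was set, so the value is always rechecked in user space.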
70b06b90 MK |
268 | .\" |
269 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
270 | .\" | |
a7c2bf45 MK |
271 | .TP |
272 | .BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)" | |
273 | .\" Strictly speaking, from Linux 2.5.x to 2.6.25 | |
274 | This operation creates a file descriptor that is associated with the futex at | |
275 | .IR uaddr . | |
bdc5957a MK |
276 | The caller must close the returned file descriptor after use. |
277 | When another process or thread performs a | |
a7c2bf45 MK |
278 | .BR FUTEX_WAKE |
279 | on the futex, the file descriptor is indicated as being readable with | |
280 | .BR select (2), | |
281 | .BR poll (2), | |
282 | and | |
283 | .BR epoll (7). | |
284 | ||
f1d2171d | 285 | The file descriptor can be used to obtain asynchronous notifications: if |
a7c2bf45 | 286 | .I val |
bdc5957a | 287 | is nonzero, then when another process or thread executes a |
a7c2bf45 MK |
288 | .BR FUTEX_WAKE , |
289 | the caller will receive the signal number that was passed in | |
290 | .IR val . | |
291 | ||
292 | The arguments | |
293 | .IR timeout , | |
294 | .I uaddr2 | |
295 | and | |
296 | .I val3 | |
297 | are ignored. | |
298 | ||
299 | To prevent race conditions, the caller should test if the futex has | |
300 | been upped after | |
301 | .B FUTEX_FD | |
302 | returns. | |
303 | ||
304 | Because it was inherently racy, | |
305 | .B FUTEX_FD | |
306 | has been removed | |
307 | .\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80 | |
308 | from Linux 2.6.26 onward. | |
70b06b90 MK |
309 | .\" |
310 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
311 | .\" | |
a7c2bf45 MK |
312 | .TP |
313 | .BR FUTEX_REQUEUE " (since Linux 2.6.0)" | |
314 | .\" Strictly speaking: from Linux 2.5.70 | |
315 | .\" | |
f1d2171d | 316 | .\" FIXME XXX I added this warning. Okay? |
a7c2bf45 MK |
317 | .IR "Avoid using this operation" . |
318 | It is broken (unavoidably racy) for its intended purpose. | |
319 | Use | |
320 | .BR FUTEX_CMP_REQUEUE | |
321 | instead. | |
322 | ||
323 | This operation performs the same task as | |
324 | .BR FUTEX_CMP_REQUEUE , | |
325 | except that no check is made using the value in | |
326 | .IR val3 . | |
327 | (The argument | |
328 | .I val3 | |
329 | is ignored.) | |
70b06b90 MK |
330 | .\" |
331 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
332 | .\" | |
a7c2bf45 MK |
333 | .TP |
334 | .BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)" | |
335 | This operation was added as a replacement for the earlier | |
336 | .BR FUTEX_REQUEUE , | |
337 | because that operation was racy for its intended use. | |
338 | ||
339 | As with | |
340 | .BR FUTEX_REQUEUE , | |
341 | the | |
342 | .BR FUTEX_CMP_REQUEUE | |
343 | operation is used to avoid a "thundering herd" effect when | |
344 | .B FUTEX_WAKE | |
bdc5957a MK |
345 | is used and all of the waiters that are woken up |
346 | need to acquire another futex. | |
a7c2bf45 MK |
347 | It differs from |
348 | .BR FUTEX_REQUEUE | |
349 | in that it first checks whether the location | |
350 | .I uaddr | |
351 | still contains the value | |
352 | .IR val3 . | |
353 | If not, the operation fails with the error | |
354 | .BR EAGAIN . | |
70b06b90 MK |
355 | .\" FIXME I added the following sentence on the rationale for |
356 | .\" FUTEX_CMP_REQUEUE. Is it correct? Should it be expanded? | |
a7c2bf45 MK |
357 | This additional feature of |
358 | .BR FUTEX_CMP_REQUEUE | |
359 | can be used by the caller to (atomically) detect changes | |
360 | in the value of the source futex at | |
361 | .IR uaddr . | |
362 | ||
363 | The operation wakes up a maximum of | |
364 | .I val | |
365 | waiters that are waiting on the futex at | |
366 | .IR uaddr . | |
367 | If there are more than | |
368 | .I val | |
369 | waiters, then the remaining waiters are removed | |
370 | from the wait queue of the source futex at | |
371 | .I uaddr | |
372 | and added to the wait queue of the target futex at | |
373 | .IR uaddr2 . | |
936876a9 | 374 | |
a7c2bf45 | 375 | The |
768d3c23 | 376 | .I val2 |
936876a9 | 377 | argument specifies an upper limit on the number of waiters |
a7c2bf45 | 378 | that are requeued to the futex at |
768d3c23 | 379 | .IR uaddr2 . |
a7c2bf45 MK |
380 | |
381 | .\" FIXME Please review the following new paragraph to see if it is | |
382 | .\" accurate. | |
383 | Typical values to specify for | |
384 | .I val | |
385 | are 0 or 1. | |
386 | (Specifying | |
387 | .BR INT_MAX | |
388 | is not useful, because it would make the | |
389 | .BR FUTEX_CMP_REQUEUE | |
390 | operation equivalent to | |
391 | .BR FUTEX_WAKE .) | |
936876a9 | 392 | The limit value specified via |
768d3c23 MK |
393 | .I val2 |
394 | is typically either 1 or | |
a7c2bf45 MK |
395 | .BR INT_MAX . |
396 | (Specifying the argument as 0 is not useful, because it would make the | |
397 | .BR FUTEX_CMP_REQUEUE | |
398 | operation equivalent to | |
399 | .BR FUTEX_WAKE .) | |
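As a hedged illustration of the argument layout, the sketch below wakes at most one waiter on a source futex and requeues any others to a target futex, provided that the source still holds an expected value. The function name wake_one_requeue_rest() is hypothetical. Note how val2 is passed in the position of the timeout argument, as described under Arguments.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns the system call result, or -1 with
   errno set (EAGAIN if *src no longer contains 'expected'). */
static long
wake_one_requeue_rest(int *src, int *dst, int expected)
{
    /* Both futex words must be four-byte aligned. */
    assert(((uintptr_t) src & 3) == 0 && ((uintptr_t) dst & 3) == 0);

    return syscall(SYS_futex, src, FUTEX_CMP_REQUEUE,
                   1,                             /* val: wake at most one */
                   (void *) (uintptr_t) INT_MAX,  /* val2: requeue limit */
                   dst,                           /* uaddr2: target futex */
                   expected);                     /* val3: value expected at src */
}
```

A caller that receives EAGAIN would typically reread the source futex and decide whether to retry. Compared with waking every waiter via FUTEX_WAKE, this keeps the requeued waiters asleep until the target futex is released, which is the "thundering herd" avoidance described above.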
6bac3b85 | 400 | .\" |
43d16602 MK |
401 | .\" FIXME Here, it would be helpful to have an example of how |
402 | .\" FUTEX_CMP_REQUEUE might be used, at the same time illustrating | |
403 | .\" why FUTEX_WAKE is unsuitable for the same use case. | |
404 | .\" | |
70b06b90 MK |
405 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
406 | .\" | |
a5956430 MK |
407 | .\" FIXME I added a lengthy piece of text on FUTEX_WAKE_OP text, |
408 | .\" and I'd be happy if someone checked it. | |
fea681da | 409 | .TP |
d67e21f5 MK |
410 | .BR FUTEX_WAKE_OP " (since Linux 2.6.14)" |
411 | .\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721 | |
6bac3b85 MK |
412 | .\" Author: Jakub Jelinek <jakub@redhat.com> |
413 | .\" Date: Tue Sep 6 15:16:25 2005 -0700 | |
414 | This operation was added to support some user-space use cases | |
415 | where more than one futex must be handled at the same time. | |
416 | The most notable example is the implementation of | |
417 | .BR pthread_cond_signal (3), | |
418 | which requires operations on two futexes, | |
419 | the one used to implement the mutex and the one used in the implementation | |
420 | of the wait queue associated with the condition variable. | |
421 | .BR FUTEX_WAKE_OP | |
422 | allows such cases to be implemented without leading to | |
423 | high rates of contention and context switching. | |
424 | ||
425 | The | |
426 | .BR FUTEX_WAKE_OP | |
427 | operation is equivalent to atomically executing the following code: | |
428 | ||
429 | .in +4n | |
430 | .nf | |
431 | int oldval = *(int *) uaddr2; | |
432 | *(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP; | |
433 | futex(uaddr, FUTEX_WAKE, val, 0, 0, 0); | |
434 | if (oldval \fIcmp\fP \fIcmparg\fP) | |
768d3c23 | 435 | futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0); |
6bac3b85 MK |
436 | .fi |
437 | .in | |
438 | ||
439 | In other words, | |
440 | .BR FUTEX_WAKE_OP | |
441 | does the following: | |
442 | .RS | |
443 | .IP * 3 | |
444 | saves the original value of the futex at | |
445 | .IR uaddr2 ; | |
446 | .IP * | |
447 | performs an operation to modify the value of the futex at | |
448 | .IR uaddr2 ; | |
449 | .IP * | |
450 | wakes up a maximum of | |
451 | .I val | |
452 | waiters on the futex | |
453 | .IR uaddr ; | |
454 | and | |
455 | .IP * | |
456 | dependent on the results of a test of the original value of the futex at | |
457 | .IR uaddr2 , | |
458 | wakes up a maximum of | |
768d3c23 | 459 | .I val2 |
6bac3b85 MK |
460 | waiters on the futex |
461 | .IR uaddr2 . | |
462 | .RE | |
463 | .IP | |
6bac3b85 MK |
464 | The operation and comparison that are to be performed are encoded |
465 | in the bits of the argument | |
466 | .IR val3 . | |
467 | Pictorially, the encoding is: | |
468 | ||
f6af90e7 | 469 | .in +8n |
6bac3b85 | 470 | .nf |
f6af90e7 MK |
471 | +---+---+-----------+-----------+ |
472 | |op |cmp| oparg | cmparg | | |
473 | +---+---+-----------+-----------+ | |
474 | 4 4 12 12 <== # of bits | |
6bac3b85 MK |
475 | .fi |
476 | .in | |
477 | ||
478 | Expressed in code, the encoding is: | |
479 | ||
480 | .in +4n | |
481 | .nf | |
482 | #define FUTEX_OP(op, oparg, cmp, cmparg) \\ | |
483 | (((op & 0xf) << 28) | \\ | |
484 | ((cmp & 0xf) << 24) | \\ | |
485 | ((oparg & 0xfff) << 12) | \\ | |
486 | (cmparg & 0xfff)) | |
487 | .fi | |
488 | .in | |
489 | ||
490 | In the above, | |
491 | .I op | |
492 | and | |
493 | .I cmp | |
494 | are each one of the codes listed below. | |
495 | The | |
496 | .I oparg | |
497 | and | |
498 | .I cmparg | |
499 | components are literal numeric values, except as noted below. | |
500 | ||
501 | The | |
502 | .I op | |
503 | component has one of the following values: | |
504 | ||
505 | .in +4n | |
506 | .nf | |
507 | FUTEX_OP_SET 0 /* *uaddr2 = oparg; */ | |
508 | FUTEX_OP_ADD 1 /* *uaddr2 += oparg; */ | |
509 | FUTEX_OP_OR 2 /* *uaddr2 |= oparg; */ | |
510 | FUTEX_OP_ANDN 3 /* *uaddr2 &= ~oparg; */ | |
511 | FUTEX_OP_XOR 4 /* *uaddr2 ^= oparg; */ | |
512 | .fi | |
513 | .in | |
514 | ||
515 | In addition, bit-wise ORing the following value into | |
516 | .I op | |
517 | causes | |
518 | .IR "(1\ <<\ oparg)" | |
519 | to be used as the operand: | |
520 | ||
521 | .in +4n | |
522 | .nf | |
523 | FUTEX_OP_OPARG_SHIFT 8 /* Use (1 << oparg) as operand */ | |
524 | .fi | |
525 | .in | |
526 | ||
527 | The | |
528 | .I cmp | |
529 | field is one of the following: | |
530 | ||
531 | .in +4n | |
532 | .nf | |
533 | FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */ | |
534 | FUTEX_OP_CMP_NE 1 /* if (oldval != cmparg) wake */ | |
535 | FUTEX_OP_CMP_LT 2 /* if (oldval < cmparg) wake */ | |
536 | FUTEX_OP_CMP_LE 3 /* if (oldval <= cmparg) wake */ | |
537 | FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */ | |
538 | FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */ | |
539 | .fi | |
540 | .in | |
541 | ||
542 | The return value of | |
543 | .BR FUTEX_WAKE_OP | |
544 | is the sum of the number of waiters woken on the futex | |
545 | .IR uaddr | |
546 | plus the number of waiters woken on the futex | |
547 | .IR uaddr2 . | |
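To make the bit layout of val3 concrete, the following sketch packs the four fields by hand and checks one worked value. It is an illustration under stated assumptions: the function name encode_wake_op() and the enum names are local stand-ins that mirror the numeric codes listed above, not the definitions from the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the op and cmp codes listed above. */
enum { OP_SET = 0, OP_ADD = 1, OP_OR = 2, OP_ANDN = 3, OP_XOR = 4 };
enum { CMP_EQ = 0, CMP_NE = 1, CMP_LT = 2, CMP_LE = 3, CMP_GT = 4, CMP_GE = 5 };

/* Hypothetical helper: pack the fields as 4 + 4 + 12 + 12 bits,
   from the most significant bit downward. */
static uint32_t
encode_wake_op(uint32_t op, uint32_t oparg, uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) |
           ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) |
           (cmparg & 0xfff);
}

/* Worked example: "add 1 to the futex at uaddr2, and wake waiters
   on uaddr2 if the old value was greater than 0". */
static void
check_encoding(void)
{
    assert(encode_wake_op(OP_ADD, 1, CMP_GT, 0) == 0x14001000u);
}
```

Reading the worked value from the left: the top nibble 1 is the add operation, the next nibble 4 is the greater-than comparison, 0x001 is oparg, and 0x000 is cmparg.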
70b06b90 MK |
548 | .\" |
549 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
550 | .\" | |
d67e21f5 | 551 | .TP |
79c9b436 TG |
552 | .BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)" |
553 | .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d | |
fd9e59d4 | 554 | This operation is like |
79c9b436 TG |
555 | .BR FUTEX_WAIT |
556 | except that | |
557 | .I val3 | |
558 | is used to provide a 32-bit bitset to the kernel. | |
559 | This bitset is stored in the kernel-internal state of the waiter. | |
560 | See the description of | |
561 | .BR FUTEX_WAKE_BITSET | |
562 | for further details. | |
563 | ||
fd9e59d4 MK |
564 | The |
565 | .BR FUTEX_WAIT_BITSET | |
566 | also interprets the | |
567 | .I timeout | |
568 | argument differently from | |
569 | .BR FUTEX_WAIT . | |
570 | See the discussion of | |
571 | .BR FUTEX_CLOCK_REALTIME , | |
572 | above. | |
573 | ||
79c9b436 TG |
574 | The |
575 | .I uaddr2 | |
576 | argument is ignored. | |
70b06b90 MK |
577 | .\" |
578 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
579 | .\" | |
79c9b436 | 580 | .TP |
d67e21f5 MK |
581 | .BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)" |
582 | .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d | |
55cc422d TG |
583 | This operation is the same as |
584 | .BR FUTEX_WAKE | |
585 | except that the | |
586 | .I val3 | |
587 | argument is used to provide a 32-bit bitset to the kernel. | |
98d769c0 MK |
588 | This bitset is used to select which waiters should be woken up. |
589 | The selection is done by a bit-wise AND of the "wake" bitset | |
590 | (i.e., the value in | |
591 | .IR val3 ) | |
592 | and the bitset which is stored in the kernel-internal | |
09cb4ce7 | 593 | state of the waiter (the "wait" bitset that is set using |
98d769c0 MK |
594 | .BR FUTEX_WAIT_BITSET ). |
595 | All of the waiters for which the result of the AND is nonzero are woken up; | |
596 | the remaining waiters are left sleeping. | |
597 | ||
f1d2171d | 598 | .\" FIXME XXX Is this paragraph that I added okay? |
e9d4496b MK |
599 | The effect of |
600 | .BR FUTEX_WAIT_BITSET | |
601 | and | |
602 | .BR FUTEX_WAKE_BITSET | |
603 | is to allow selective wake-ups among multiple waiters that are waiting | |
604 | on the same futex; | |
605 | since a futex has a size of 32 bits, | |
606 | these operations provide 32 wakeup "channels". | |
607 | (The | |
608 | .BR FUTEX_WAIT | |
609 | and | |
610 | .BR FUTEX_WAKE | |
611 | operations correspond to | |
612 | .BR FUTEX_WAIT_BITSET | |
613 | and | |
614 | .BR FUTEX_WAKE_BITSET | |
615 | operations where the bitsets are all ones.) | |
09cb4ce7 | 616 | Note, however, that using this bitset multiplexing feature on a |
e9d4496b MK |
617 | futex is less efficient than simply using multiple futexes, |
618 | because employing bitset multiplexing requires the kernel | |
619 | to check all waiters on a futex, | |
620 | including those that are not interested in being woken up | |
621 | (i.e., they do not have the relevant bit set in their "wait" bitset). | |
622 | .\" According to http://locklessinc.com/articles/futex_cheat_sheet/: | |
623 | .\" | |
624 | .\" "The original reason for the addition of these extensions | |
625 | .\" was to improve the performance of pthread read-write locks | |
626 | .\" in glibc. However, the pthreads library no longer uses the | |
627 | .\" same locking algorithm, and these extensions are not used | |
628 | .\" without the bitset parameter being all ones. | |
629 | .\" | |
630 | .\" The page goes on to note that the FUTEX_WAIT_BITSET operation | |
631 | .\" is nevertheless used (with a bitset of all ones) in order to | |
632 | .\" obtain the absolute timeout functionality that is useful | |
633 | .\" for efficiently implementing Pthreads APIs (which use absolute | |
634 | .\" timeouts); FUTEX_WAIT provides only relative timeouts. | |
635 | ||
98d769c0 MK |
636 | The |
637 | .I uaddr2 | |
638 | and | |
639 | .I timeout | |
640 | arguments are ignored. | |
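The selection rule can be modeled in user space as follows. This is only a sketch of the bit-wise AND test described above, not kernel code: the function name count_selected() is hypothetical, and the order in which the model visits waiters says nothing about the order in which the kernel wakes them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: given the "wait" bitsets of the waiters on one
   futex, count how many a wake with 'wake_bitset' would select,
   capped at 'val' waiters. */
static int
count_selected(const uint32_t *wait_bitsets, size_t nwaiters,
               uint32_t wake_bitset, int val)
{
    int woken = 0;

    for (size_t i = 0; i < nwaiters && woken < val; i++) {
        /* A waiter is selected if the AND of the bitsets is nonzero. */
        if ((wait_bitsets[i] & wake_bitset) != 0)
            woken++;
    }
    return woken;
}

/* Worked example with four waiters on the same futex. */
static void
check_model(void)
{
    const uint32_t waiters[] = { 0x1, 0x2, 0x3, 0xffffffffu };

    assert(count_selected(waiters, 4, 0x1, 10) == 3);
    assert(count_selected(waiters, 4, 0x4, 10) == 1);
}
```

The model also shows the cost noted above: every waiter is examined, including those whose bitset does not match.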
bd90a5f9 | 641 | .\" |
70b06b90 | 642 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
bd90a5f9 MK |
643 | .\" |
644 | .SS Priority-inheritance futexes | |
b52e1cd4 MK |
645 | Linux supports priority-inheritance (PI) futexes in order to handle |
646 | priority-inversion problems that can be encountered with | |
647 | normal futex locks. | |
b565548b | 648 | Priority inversion is the problem that occurs when a high-priority |
bdc5957a MK |
649 | task is blocked waiting to acquire a lock held by a low-priority task, |
650 | while tasks at an intermediate priority continuously preempt | |
651 | the low-priority task from the CPU. | |
652 | Consequently, the low-priority task makes no progress toward | |
653 | releasing the lock, and the high-priority task remains blocked. | |
7f315ae3 | 654 | |
7d20efd7 MK |
655 | Priority inheritance is a mechanism for dealing with |
656 | the priority-inversion problem. | |
bdc5957a MK |
657 | With this mechanism, when a high-priority task becomes blocked |
658 | by a lock held by a low-priority task, | |
7d20efd7 | 659 | the latter's priority is temporarily raised to that of the former, |
bdc5957a | 660 | so that it is not preempted by any intermediate-level tasks,
7d20efd7 MK |
661 | and can thus make progress toward releasing the lock. |
662 | To be effective, priority inheritance must be transitive, | |
bdc5957a MK |
663 | meaning that if a high-priority task blocks on a lock |
664 | held by a lower-priority task that is itself blocked by a lock | |
665 | held by another intermediate-priority task | |
7d20efd7 | 666 | (and so on, for chains of arbitrary length), |
bdc5957a MK |
667 | then both of those tasks |
668 | (or more generally, all of the tasks in a lock chain) | |
669 | have their priorities raised to be the same as the high-priority task. | |
7d20efd7 | 670 | |
9e2b90ee MK |
671 | .\" FIXME XXX The following is my attempt at a definition of PI futexes, |
672 | .\" based on mail discussions with Darren Hart. Does it seem okay? | |
673 | From a user-space perspective, | |
674 | what makes a futex PI-aware is a policy agreement between user space | |
675 | and the kernel about the value of the futex (described in a moment), | |
676 | coupled with the use of the PI futex operations described below | |
677 | (in particular, | |
678 | .BR FUTEX_LOCK_PI , | |
679 | .BR FUTEX_TRYLOCK_PI , | |
680 | and | |
681 | .BR FUTEX_CMP_REQUEUE_PI ). | |
682 | .\" Quoting Darren Hart: | |
683 | .\" These opcodes paired with the PI futex value policy (described below) | |
684 | .\" defines a "futex" as PI aware. These were created very specifically | |
685 | .\" in support of PI pthread_mutexes, so it makes a lot more sense to | |
686 | .\" talk about a PI aware pthread_mutex, than a PI aware futex, since | |
687 | .\" there is a lot of policy and scaffolding that has to be built up | |
688 | .\" around it to use it properly (this is what a PI pthread_mutex is). | |
689 | ||
f1d2171d | 690 | .\" FIXME XXX ===== Start of adapted Hart/Guniguntala text ===== |
79d918c7 MK |
691 | .\" The following text is drawn from the Hart/Guniguntala paper, |
692 | .\" but I have reworded some pieces significantly. Please check it. | |
693 | .\" | |
694 | The PI futex operations described below differ from the other | |
695 | futex operations in that they impose policy on the use of the futex value: | |
696 | .IP * 3 | |
7c16fbff | 697 | If the lock is unowned, the futex value shall be 0. |
79d918c7 MK |
698 | .IP * |
699 | If the lock is owned, the futex value shall be the thread ID (TID; see | |
700 | .BR gettid (2)) | |
701 | of the owning thread. | |
702 | .IP * | |
f1d2171d | 703 | .\" FIXME XXX In the following line, I added "the lock is owned and". Okay? |
79d918c7 MK |
704 | If the lock is owned and there are threads contending for the lock, |
705 | then the | |
706 | .B FUTEX_WAITERS | |
707 | bit shall be set in the futex value; in other words, the futex value is: | |
708 | ||
709 | FUTEX_WAITERS | TID | |
9e2b90ee | 710 | |
79d918c7 | 711 | .PP |
9e2b90ee MK |
712 | Note that a PI futex never just has the value |
713 | .BR FUTEX_WAITERS , | |
714 | which is a permissible state for non-PI futexes. | |
715 | ||
79d918c7 MK |
716 | With this policy in place, |
717 | a user-space application can acquire an unowned | |
21b060ba | 718 | lock or release an uncontended lock using atomic |
21b060ba | 719 | instructions executed in user space (e.g.,
b52e1cd4 MK |
720 | .I cmpxchg |
721 | on the x86 architecture). | |
722 | Locking an unowned lock simply consists of setting | |
723 | the futex value to the caller's TID. | |
724 | Releasing an uncontended lock simply requires setting the futex value to 0. | |
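The user-space fast paths just described can be sketched with C11 atomics. This is a minimal illustration under stated assumptions: the function names are hypothetical, the caller is assumed to pass its own TID as obtained from gettid(2), and the slow paths (FUTEX_LOCK_PI and FUTEX_UNLOCK_PI) that a real lock would fall back to are deliberately omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fast path: acquire an unowned lock by atomically
   changing the futex value from 0 to the caller's TID.  Returns
   false if the lock is owned; the caller would then enter the
   kernel with FUTEX_LOCK_PI. */
static bool
pi_lock_fast(_Atomic uint32_t *futex_word, uint32_t tid)
{
    uint32_t expected = 0;

    assert(tid != 0);    /* 0 is reserved for "unowned" */
    return atomic_compare_exchange_strong(futex_word, &expected, tid);
}

/* Hypothetical fast path: release an uncontended lock by atomically
   changing the futex value from the caller's TID to 0.  Returns
   false if the value is not exactly the TID (for example, because
   the FUTEX_WAITERS bit is set); the caller would then enter the
   kernel with FUTEX_UNLOCK_PI. */
static bool
pi_unlock_fast(_Atomic uint32_t *futex_word, uint32_t tid)
{
    uint32_t expected = tid;

    return atomic_compare_exchange_strong(futex_word, &expected, 0);
}
```

Using compare-and-exchange for the release, rather than a plain store of 0, is what lets the owner notice that waiters have appeared since the lock was taken.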
725 | ||
726 | If a futex is currently owned (i.e., has a nonzero value), | |
727 | waiters must employ the | |
79d918c7 MK |
728 | .B FUTEX_LOCK_PI |
729 | operation to acquire the lock. | |
b52e1cd4 | 730 | If a lock is contended (i.e., the |
79d918c7 | 731 | .B FUTEX_WAITERS |
b52e1cd4 | 732 | bit is set in the futex value), the lock owner must employ the |
79d918c7 | 733 | .B FUTEX_UNLOCK_PI |
b52e1cd4 MK |
734 | operation to release the lock. |
735 | ||
79d918c7 MK |
736 | In the cases where callers are forced into the kernel |
737 | (i.e., required to perform a | |
738 | .BR futex () | |
739 | operation), | |
740 | they then deal directly with a so-called RT-mutex, | |
741 | a kernel locking mechanism which implements the required | |
742 | priority-inheritance semantics. | |
743 | After the RT-mutex is acquired, the futex value is updated accordingly, | |
744 | before the calling thread returns to user space. | |
745 | .\" FIXME ===== End of adapted Hart/Guniguntala text ===== | |
746 | ||
a59fca75 MK |
747 | It is important to note |
748 | .\" FIXME We need some explanation here of *why* it is important to | |
70b06b90 | 749 | .\" note this |
a59fca75 | 750 | that the kernel will update the futex value prior |
79d918c7 MK |
751 | to returning to user space. |
752 | Unlike the other futex operations described above, | |
753 | the PI futex operations are designed | |
d9d5be6b | 754 | for the implementation of very specific IPC mechanisms. |
fc57e6bb | 755 | .\" |
7bd3ffbc | 756 | .\" FIXME XXX In discussing errors for FUTEX_CMP_REQUEUE_PI, Darren Hart |
99c0ac69 MK |
757 | .\" made the observation that "EINVAL is returned if the non-pi |
758 | .\" to pi or op pairing semantics are violated." | |
759 | .\" Probably there needs to be a general statement about this | |
760 | .\" requirement, probably located at about this point in the page. | |
7bd3ffbc | 761 | .\" Darren, care to take a shot at this? |
dd003bef MK |
762 | .\" |
763 | .\" FIXME Somewhere on this page (I guess under the discussion of PI | |
764 | .\" futexes) we need a discussion of the FUTEX_OWNER_DIED bit. | |
765 | .\" Can someone propose a text? | |
bd90a5f9 MK |
766 | |
767 | PI futexes are operated on by specifying one of the following values in | |
768 | .IR futex_op : | |
70b06b90 MK |
769 | .\" |
770 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
771 | .\" | |
d67e21f5 MK |
772 | .TP |
773 | .BR FUTEX_LOCK_PI " (since Linux 2.6.18)" | |
774 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc | |
67833bec MK |
775 | .\" |
776 | .\" FIXME I did some significant rewording of tglx's text. | |
777 | .\" Please check, in case I injected errors. | |
778 | .\" | |
779 | This operation is used after an attempt to acquire | |
780 | the futex lock via an atomic user-space instruction failed | |
781 | because the futex has a nonzero value\(emspecifically, | |
782 | because it contained the namespace-specific TID of the lock owner. | |
67259526 | 783 | .\" FIXME In the preceding line, what does "namespace-specific" mean? |
67833bec | 784 | .\" (I kept those words from tglx.) |
67259526 | 785 | .\" That is, what kind of namespace are we talking about? |
67833bec MK |
786 | .\" (I suppose we are talking PID namespaces here, but I want to |
787 | .\" be sure.) | |
788 | ||
789 | The operation checks the value of the futex at the address | |
790 | .IR uaddr . | |
70b06b90 MK |
791 | If the value is 0, then the kernel tries to atomically set |
792 | the futex value to the caller's TID. | |
67833bec MK |
793 | If that fails, |
794 | .\" FIXME What would be the cause of failure? | |
795 | or the futex value is nonzero, | |
796 | the kernel atomically sets the | |
e0547e70 | 797 | .B FUTEX_WAITERS |
67833bec MK |
798 | bit, which signals the futex owner that it cannot unlock the futex in |
799 | user space atomically by setting the futex value to 0. | |
800 | After that, the kernel tries to find the thread which is | |
801 | associated with the owner TID, | |
802 | .\" FIXME Could I get a bit more detail on the next two lines? | |
803 | .\" What is "creates or reuses kernel state" about? | |
804 | creates or reuses kernel state on behalf of the owner | |
805 | and attaches the waiter to it. | |
67259526 MK |
806 | .\" FIXME In the next line, what type of "priority" are we talking about? |
807 | .\" Realtime priorities for SCHED_FIFO and SCHED_RR? | |
808 | .\" Or something else? | |
1f043693 | 809 | If more than one waiter exists, |
e0547e70 | 810 | the waiters are enqueued in descending priority order. |
67259526 | 811 | .\" FIXME What does "bandwidth" refer to in the next line? |
e0547e70 | 812 | The owner inherits either the priority or the bandwidth of the waiter. |
67259526 MK |
813 | .\" FIXME In the preceding line, what determines whether the |
814 | .\" owner inherits the priority versus the bandwidth? | |
67833bec MK |
815 | .\" |
816 | .\" FIXME Could I get some help translating the next sentence into | |
817 | .\" something that user-space developers (and I) can understand? | |
70b06b90 | 818 | .\" In particular, what are "nested locks" in this context? |
e0547e70 TG |
819 | This inheritance follows the lock chain in the case of |
820 | nested locking and performs deadlock detection. | |
821 | ||
9ce19cf1 MK |
822 | .\" FIXME tglx says "The timeout argument is handled as described in |
823 | .\" FUTEX_WAIT." However, it appears to me that this is not right. | |
70b06b90 | 824 | .\" Is the following formulation correct? |
e0547e70 TG |
825 | The |
826 | .I timeout | |
9ce19cf1 MK |
827 | argument provides a timeout for the lock attempt. |
828 | It is interpreted as an absolute time, measured against the | |
829 | .BR CLOCK_REALTIME | |
830 | clock. | |
831 | If | |
832 | .I timeout | |
833 | is NULL, the operation will block indefinitely. | |
e0547e70 | 834 | |
a449c634 | 835 | The |
e0547e70 TG |
836 | .IR uaddr2 , |
837 | .IR val , | |
838 | and | |
839 | .IR val3 | |
a449c634 | 840 | arguments are ignored. |
67833bec | 841 | .\" |
70b06b90 MK |
842 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
843 | .\" | |
d67e21f5 | 844 | .TP |
12fdbe23 | 845 | .BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)" |
d67e21f5 | 846 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc |
12fdbe23 MK |
847 | This operation tries to acquire the futex at |
848 | .IR uaddr . | |
0b761826 | 849 | .\" FIXME I think it would be helpful here to say a few more words about |
70b06b90 MK |
850 | .\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI. |
851 | .\" Can someone propose something? | |
852 | .\" | |
fa0388c3 | 853 | It deals with the situation where the TID value at |
12fdbe23 MK |
854 | .I uaddr |
855 | is 0, but the | |
b52e1cd4 | 856 | .B FUTEX_WAITERS |
12fdbe23 | 857 | bit is set. |
fa0388c3 MK |
858 | .\" FIXME How does the situation in the previous sentence come about? |
859 | .\" Probably it would be helpful to say something about that in | |
860 | .\" the man page. | |
badbf70c | 861 | .\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation? |
a282e5b0 | 862 | User space cannot handle this condition in a race-free manner. |
084744ef MK |
863 | |
864 | The | |
865 | .IR uaddr2 , | |
866 | .IR val , | |
867 | .IR timeout , | |
868 | and | |
869 | .IR val3 | |
870 | arguments are ignored. | |
70b06b90 MK |
871 | .\" |
872 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
873 | .\" | |
d67e21f5 | 874 | .TP |
12fdbe23 | 875 | .BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)" |
d67e21f5 | 876 | .\" commit c87e2837be82df479a6bae9f155c43516d2feebc |
d4ba4328 | 877 | This operation wakes the highest-priority waiter that is waiting in |
ecae2099 TG |
878 | .B FUTEX_LOCK_PI |
879 | on the futex address provided by the | |
880 | .I uaddr | |
881 | argument. | |
882 | ||
883 | This operation is called when the user-space value at | |
884 | .I uaddr | |
885 | cannot be changed atomically from a TID (of the owner) to 0. | |
886 | ||
887 | The | |
888 | .IR uaddr2 , | |
889 | .IR val , | |
890 | .IR timeout , | |
891 | and | |
892 | .IR val3 | |
11a194bf | 893 | arguments are ignored. |
70b06b90 MK |
894 | .\" |
895 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
896 | .\" | |
d67e21f5 | 897 | .TP |
d67e21f5 MK |
898 | .BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)" |
899 | .\" commit 52400ba946759af28442dee6265c5c0180ac7122 | |
f812a08b DH |
900 | This operation is a PI-aware variant of |
901 | .BR FUTEX_CMP_REQUEUE . | |
902 | It requeues waiters that are blocked via | |
903 | .B FUTEX_WAIT_REQUEUE_PI | |
904 | on | |
905 | .I uaddr | |
906 | from a non-PI source futex | |
907 | .RI ( uaddr ) | |
908 | to a PI target futex | |
909 | .RI ( uaddr2 ). | |
910 | ||
9e54d26d MK |
911 | As with |
912 | .BR FUTEX_CMP_REQUEUE , | |
913 | this operation wakes up a maximum of | |
914 | .I val | |
915 | waiters that are waiting on the futex at | |
916 | .IR uaddr . | |
917 | However, for | |
918 | .BR FUTEX_CMP_REQUEUE_PI , | |
919 | .I val | |
6fbeb8f4 | 920 | is required to be 1 |
939ca89f | 921 | (since the main point is to avoid a thundering herd). |
9e54d26d MK |
922 | The remaining waiters are removed from the wait queue of the source futex at |
923 | .I uaddr | |
924 | and added to the wait queue of the target futex at | |
925 | .IR uaddr2 . | |
f812a08b | 926 | |
9e54d26d | 927 | The |
768d3c23 | 928 | .I val2 |
c6d8cf21 MK |
929 | .\" val2 is the cap on the number of requeued waiters. |
930 | .\" In the glibc pthread_cond_broadcast() implementation, this argument | |
931 | .\" is specified as INT_MAX, and for pthread_cond_signal() it is 0. | |
9e54d26d | 932 | and |
768d3c23 | 933 | .I val3 |
9e54d26d MK |
934 | arguments serve the same purposes as for |
935 | .BR FUTEX_CMP_REQUEUE . | |
70b06b90 | 936 | .\" |
be376673 MK |
937 | .\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/ |
938 | .\" notes that "priority-inheritance Futex to priority-inheritance | |
939 | .\" Futex requeues are currently unsupported". Do we need to say | |
940 | .\" something in the man page about that? | |
70b06b90 MK |
941 | .\" |
942 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
943 | .\" | |
d67e21f5 MK |
944 | .TP |
945 | .BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)" | |
946 | .\" commit 52400ba946759af28442dee6265c5c0180ac7122 | |
70b06b90 MK |
947 | .\" |
948 | .\" FIXME I find the next sentence (from tglx) pretty hard to grok. | |
949 | .\" Could someone explain it a bit more. | |
6ff1b4c0 TG |
950 | This operation waits on a non-PI futex at |
951 | .IR uaddr ; | |
952 | the caller may then be requeued onto a PI futex at | |
953 | .IR uaddr2 . | |
954 | The wait operation on | |
955 | .I uaddr | |
956 | is the same as | |
957 | .BR FUTEX_WAIT . | |
70b06b90 | 958 | .\" |
f1d2171d MK |
959 | .\" FIXME I'm not quite clear on the meaning of the following sentence. |
960 | .\" Is this trying to say that while blocked in a | |
961 | .\" FUTEX_WAIT_REQUEUE_PI, it could happen that another | |
962 | .\" task does a FUTEX_WAKE on uaddr that simply causes | |
963 | .\" a normal wake, with the result that the FUTEX_WAIT_REQUEUE_PI | |
964 | .\" does not complete? What happens then to the FUTEX_WAIT_REQUEUE_PI | |
965 | .\" operation? Does it remain blocked, or does it unblock? | |
966 | .\" In which case, what does user space see? | |
6ff1b4c0 TG |
967 | The waiter can be removed from the wait on |
968 | .I uaddr | |
969 | via | |
970 | .BR FUTEX_WAKE | |
971 | without requeueing on | |
972 | .IR uaddr2 . | |
a4e69912 | 973 | |
63bea7dc MK |
974 | .\" FIXME Please check the following. tglx said "The timeout argument |
975 | .\" is handled as described in FUTEX_WAIT.", but the truth is | |
976 | .\" as below, AFAICS | |
977 | If | |
978 | .I timeout | |
979 | is not NULL, it specifies a timeout for the wait operation; | |
980 | this timeout is interpreted as outlined above in the description of the | |
981 | .BR FUTEX_CLOCK_REALTIME | |
982 | option. | |
983 | If | |
984 | .I timeout | |
985 | is NULL, the operation can block indefinitely. | |
986 | ||
a4e69912 MK |
987 | The |
988 | .I val3 | |
989 | argument is ignored. | |
70b06b90 | 990 | .\" FIXME Re the preceding sentence... Actually 'val3' is internally set to |
a4e69912 MK |
991 | .\" FUTEX_BITSET_MATCH_ANY before calling futex_wait_requeue_pi(). |
992 | .\" I'm not sure we need to say anything about this though. | |
993 | .\" Comments? | |
abb571e8 MK |
994 | |
995 | The | |
996 | .BR FUTEX_WAIT_REQUEUE_PI | |
997 | and | |
998 | .BR FUTEX_CMP_REQUEUE_PI | |
999 | operations were added to support a fairly specific use case: | |
1000 | support for priority-inheritance-aware POSIX threads condition variables. | |
1001 | The idea is that these operations should always be paired, | |
1002 | in order to ensure that user space and the kernel remain in sync. | |
1003 | Thus, in the | |
1004 | .BR FUTEX_WAIT_REQUEUE_PI | |
1005 | operation, the user-space application pre-specifies the target | |
1006 | of the requeue that takes place in the | |
1007 | .BR FUTEX_CMP_REQUEUE_PI | |
1008 | operation. | |
1009 | .\" | |
1010 | .\" Darren Hart notes that a patch to allow glibc to fully support | |
1011 | .\" PI-aware pthreads condition variables has not yet been accepted into | |
1012 | .\" glibc. The story is complex, and can be found at | |
1013 | .\" https://sourceware.org/bugzilla/show_bug.cgi?id=11588 | |
1014 | .\" Darren notes that in the meantime, the patch is shipped with various | |
1015 | .\" PREEMPT_RT enabled Linux systems. | |
1016 | .\" | |
1017 | .\" Related to the preceding, Darren proposed that somewhere, man-pages | |
1018 | .\" should document the following point: | |
1019 | .\" While the Linux kernel, since 2.6.31, supports requeueing of | |
1020 | .\" priority-inheritance (PI) aware mutexes via the | |
1021 | .\" FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI futex operations, | |
1022 | .\" the glibc implementation does not yet take full advantage of this. | |
1023 | .\" Specifically, the condvar internal data lock remains a non-PI aware | |
1024 | .\" mutex, regardless of the type of the pthread_mutex associated with | |
1025 | .\" the condvar. This can lead to an unbounded priority inversion on | |
1026 | .\" the internal data lock even when associating a PI aware | |
1027 | .\" pthread_mutex with a condvar during a pthread_cond*_wait | |
1028 | .\" operation. For this reason, it is not recommended to rely on | |
1029 | .\" priority inheritance when using pthread condition variables. | |
1030 | .\" The problem is that the obvious somewhere to place this text | |
1031 | .\" is the pthread_cond*wait(3) man page. However, such a man page | |
1032 | .\" does not currently exist. | |
70b06b90 | 1033 | .\" |
6700de24 | 1034 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
70b06b90 | 1035 | .\" |
47297adb | 1036 | .SH RETURN VALUE |
fea681da | 1037 | .PP |
6f147f79 | 1038 | In the event of an error, all operations return \-1 and set |
e808bba0 | 1039 | .I errno |
6f147f79 | 1040 | to indicate the cause of the error. |
e808bba0 MK |
1041 | The return value on success depends on the operation, |
1042 | as described in the following list: | |
fea681da MK |
1043 | .TP |
1044 | .B FUTEX_WAIT | |
bdc5957a | 1045 | Returns 0 if the caller was woken by a |
682edefb | 1046 | .B FUTEX_WAKE |
7446a837 MK |
1047 | or |
1048 | .B FUTEX_WAKE_BITSET | |
682edefb | 1049 | call. |
fea681da MK |
1050 | .TP |
1051 | .B FUTEX_WAKE | |
bdc5957a | 1052 | Returns the number of waiters that were woken up. |
fea681da MK |
1053 | .TP |
1054 | .B FUTEX_FD | |
1055 | Returns the new file descriptor associated with the futex. | |
1056 | .TP | |
1057 | .B FUTEX_REQUEUE | |
bdc5957a | 1058 | Returns the number of waiters that were woken up. |
fea681da MK |
1059 | .TP |
1060 | .B FUTEX_CMP_REQUEUE | |
bdc5957a MK |
1061 | Returns the total number of waiters that were woken up or |
1062 | requeued to the futex at | |
3dfcc11d MK |
1063 | .IR uaddr2 . |
1064 | If this value is greater than | |
1065 | .IR val , | |
1066 | then the difference is the number of waiters requeued to the futex at | |
1067 | .IR uaddr2 . | |
dcad19c0 MK |
1068 | .TP |
1069 | .B FUTEX_WAKE_OP | |
f1d2171d | 1070 | .\" FIXME XXX Is the following correct? |
a8b5b324 MK |
1071 | Returns the total number of waiters that were woken up. |
1072 | This is the sum of the woken waiters on the two futexes at | |
1073 | .I uaddr | |
1074 | and | |
1075 | .IR uaddr2 . | |
dcad19c0 MK |
1076 | .TP |
1077 | .B FUTEX_WAIT_BITSET | |
f1d2171d | 1078 | .\" FIXME XXX Is the following correct? |
bdc5957a | 1079 | Returns 0 if the caller was woken by a |
7bcc5351 MK |
1080 | .B FUTEX_WAKE |
1081 | or | |
1082 | .B FUTEX_WAKE_BITSET | |
1083 | call. | |
dcad19c0 MK |
1084 | .TP |
1085 | .B FUTEX_WAKE_BITSET | |
f1d2171d | 1086 | .\" FIXME XXX Is the following correct? |
bdc5957a | 1087 | Returns the number of waiters that were woken up. |
dcad19c0 MK |
1088 | .TP |
1089 | .B FUTEX_LOCK_PI | |
f1d2171d | 1090 | .\" FIXME XXX Is the following correct? |
bf02a260 | 1091 | Returns 0 if the futex was successfully locked. |
dcad19c0 MK |
1092 | .TP |
1093 | .B FUTEX_TRYLOCK_PI | |
f1d2171d | 1094 | .\" FIXME XXX Is the following correct? |
5c716eef | 1095 | Returns 0 if the futex was successfully locked. |
dcad19c0 MK |
1096 | .TP |
1097 | .B FUTEX_UNLOCK_PI | |
f1d2171d | 1098 | .\" FIXME XXX Is the following correct? |
52bb928f | 1099 | Returns 0 if the futex was successfully unlocked. |
dcad19c0 MK |
1100 | .TP |
1101 | .B FUTEX_CMP_REQUEUE_PI | |
f1d2171d | 1102 | .\" FIXME XXX Is the following correct? |
bdc5957a MK |
1103 | Returns the total number of waiters that were woken up or |
1104 | requeued to the futex at | |
dddd395a MK |
1105 | .IR uaddr2 . |
1106 | If this value is greater than | |
1107 | .IR val , | |
1108 | then the difference is the number of waiters requeued to the futex at | |
1109 | .IR uaddr2 . | |
dcad19c0 MK |
1110 | .TP |
1111 | .B FUTEX_WAIT_REQUEUE_PI | |
f1d2171d | 1112 | .\" FIXME XXX Is the following correct? |
22c15de9 MK |
1113 | Returns 0 if the caller was successfully requeued to the futex at |
1114 | .IR uaddr2 . | |
70b06b90 MK |
1115 | .\" |
1116 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
1117 | .\" | |
fea681da MK |
1118 | .SH ERRORS |
1119 | .TP | |
1120 | .B EACCES | |
1121 | No read access to futex memory. | |
1122 | .TP | |
1123 | .B EAGAIN | |
f48516d1 MK |
1124 | .RB ( FUTEX_WAIT , |
1125 | .BR FUTEX_WAIT_REQUEUE_PI ) | |
badbf70c MK |
1126 | The value pointed to by |
1127 | .I uaddr | |
1128 | was not equal to the expected value | |
1129 | .I val | |
1130 | at the time of the call. | |
1131 | .TP | |
1132 | .B EAGAIN | |
8f2068bb MK |
1133 | .RB ( FUTEX_CMP_REQUEUE , |
1134 | .BR FUTEX_CMP_REQUEUE_PI ) | |
ce5602fd | 1135 | The value pointed to by |
9f6c40c0 MK |
1136 | .I uaddr |
1137 | is not equal to the expected value | |
1138 | .IR val3 . | |
fd1dc4c2 | 1139 | .\" FIXME: Is the following sentence correct? |
fea681da | 1140 | (This probably indicates a race; |
682edefb MK |
1141 | use the safe |
1142 | .B FUTEX_WAKE | |
1143 | now.) | |
c0091dd3 | 1144 | .\" |
f1d2171d | 1145 | .\" FIXME XXX Should there be an EAGAIN case for FUTEX_TRYLOCK_PI? |
c0091dd3 MK |
1146 | .\" It seems so, looking at the handling of the rt_mutex_trylock() |
1147 | .\" call in futex_lock_pi() | |
1148 | .\" | |
fea681da | 1149 | .TP |
5662f56a MK |
1150 | .BR EAGAIN |
1151 | .RB ( FUTEX_LOCK_PI , | |
aaec9032 MK |
1152 | .BR FUTEX_TRYLOCK_PI , |
1153 | .BR FUTEX_CMP_REQUEUE_PI ) | |
1154 | The futex owner thread ID of | |
1155 | .I uaddr | |
1156 | (for | |
1157 | .BR FUTEX_CMP_REQUEUE_PI : | |
1158 | .IR uaddr2 ) | |
1159 | is about to exit, | |
5662f56a MK |
1160 | but has not yet handled the internal state cleanup. |
1161 | Try again. | |
61f8c1d1 | 1162 | .\" |
f1d2171d | 1163 | .\" FIXME XXX Is there not also an EAGAIN error case on 'uaddr2' for |
61f8c1d1 MK |
1164 | .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via |
1165 | .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==> | |
1166 | .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EAGAIN? | |
5662f56a | 1167 | .TP |
7a39e745 MK |
1168 | .BR EDEADLK |
1169 | .RB ( FUTEX_LOCK_PI , | |
1170 | .BR FUTEX_TRYLOCK_PI ) | |
1171 | The futex at | |
1172 | .I uaddr | |
1173 | is already locked by the caller. | |
d08ce5dd | 1174 | .\" |
f1d2171d | 1175 | .\" FIXME XXX Is there not also an EDEADLK error case on 'uaddr2' for |
d08ce5dd MK |
1176 | .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via |
1177 | .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==> | |
1178 | .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EDEADLK? | |
7a39e745 | 1179 | .TP |
662c0da8 MK |
1180 | .BR EDEADLK |
1181 | .\" FIXME I reworded tglx's text somewhat; is the following okay? | |
f1d2171d MK |
1182 | .\" FIXME XXX I see that kernel/locking/rtmutex.c uses EDEADLK in some places, |
1183 | .\" and EDEADLOCK in others. On almost all architectures these | |
1184 | .\" constants are synonymous. Is there a reason that both names | |
1185 | .\" are used? | |
662c0da8 MK |
1186 | .RB ( FUTEX_CMP_REQUEUE_PI ) |
1187 | While requeueing a waiter to the PI futex at | |
1188 | .IR uaddr2 , | |
1189 | the kernel detected a deadlock. | |
1190 | .TP | |
fea681da | 1191 | .B EFAULT |
1ea901e8 MK |
1192 | A required pointer argument (i.e., |
1193 | .IR uaddr , | |
1194 | .IR uaddr2 , | |
1195 | or | |
1196 | .IR timeout ) | |
496df304 | 1197 | did not point to a valid user-space address. |
fea681da | 1198 | .TP |
9f6c40c0 | 1199 | .B EINTR |
e808bba0 | 1200 | A |
9f6c40c0 | 1201 | .B FUTEX_WAIT |
2674f781 MK |
1202 | or |
1203 | .B FUTEX_WAIT_BITSET | |
e808bba0 MK |
1204 | operation was interrupted by a signal (see |
1205 | .BR signal (7)) | |
1206 | or a spurious wakeup. | |
5eeca856 MK |
1207 | .\" FIXME |
1208 | .\" Regarding the words "spurious wakeup" above, I received this | |
1209 | .\" bug report from Rich Felker: | |
1210 | .\" | |
1211 | .\" I see no code in the kernel whereby a "spurious wakeup", or anything | |
1212 | .\" other than interruption by a signal handler that's not SA_RESTART, | |
1213 | .\" can cause futex to fail with EINTR. In general, overloading of EINTR | |
1214 | .\" and/or spurious EINTRs from a syscall make it impossible to use that | |
1215 | .\" syscall for implementing any function where EINTR is a mandatory | |
1216 | .\" failure on interruption-by-signal, since there is no way for | |
1217 | .\" userspace to distinguish whether the EINTR occurred as a result of | |
1218 | .\" an interrupting signal or some other reason. The kernel folks have | |
1219 | .\" gone to great lengths to fix spurious EINTRs (see signal(7) for | |
1220 | .\" history), especially by non-interrupting signal handlers, including | |
1221 | .\" in futex, and allowing EINTR here would be contrary to that goal. | |
1222 | .\" | |
1223 | .\" It's my belief that the "or a spurious wakeup" text should simply be | |
1224 | .\" removed. | |
1225 | .\" | |
1226 | .\" The reason I'm raising this topic is its relevance to a thread on | |
1227 | .\" libc-alpha: | |
1228 | .\" [RFC] mutex destruction (#13690): problem description and workarounds | |
1229 | .\" | |
1230 | .\" The bug and mailing list discussions to which Rich refers are: | |
1231 | .\" https://sourceware.org/bugzilla/show_bug.cgi?id=13690 | |
1232 | .\" https://sourceware.org/ml/libc-alpha/2014-12/threads.html#0001 | |
1233 | .\" | |
1234 | .\" Can anyone comment on whether the words "spurious wakeup" are correct? | |
1235 | .\" | |
9f6c40c0 | 1236 | .TP |
fea681da | 1237 | .B EINVAL |
180f97b7 MK |
1238 | The operation in |
1239 | .IR futex_op | |
1240 | is one of those that employs a timeout, but the supplied | |
fb2f4c27 MK |
1241 | .I timeout |
1242 | argument was invalid | |
1243 | .RI ( tv_sec | |
1244 | was less than zero, or | |
1245 | .IR tv_nsec | |
1246 | was not less than 1,000,000,000). | |
1247 | .TP | |
1248 | .B EINVAL | |
0c74df0b | 1249 | The operation specified in |
025e1374 | 1250 | .IR futex_op |
0c74df0b | 1251 | employs one or both of the pointers |
51ee94be | 1252 | .I uaddr |
a1f47699 | 1253 | and |
0c74df0b MK |
1254 | .IR uaddr2 , |
1255 | but one of these does not point to a valid object\(emthat is, | |
1256 | the address is not four-byte-aligned. | |
51ee94be MK |
1257 | .TP |
1258 | .B EINVAL | |
55cc422d TG |
1259 | .RB ( FUTEX_WAIT_BITSET , |
1260 | .BR FUTEX_WAKE_BITSET ) | |
79c9b436 TG |
1261 | The bitset supplied in |
1262 | .IR val3 | |
1263 | is zero. | |
1264 | .TP | |
1265 | .B EINVAL | |
2043f2c1 | 1266 | .RB ( FUTEX_REQUEUE , |
f1d2171d | 1267 | .\" FIXME XXX tglx suggested adding this, but does this error really occur for |
2043f2c1 MK |
1268 | .\" FUTEX_REQUEUE? (The case where it occurs for FUTEX_CMP_REQUEUE_PI |
1269 | .\" is obvious at the start of futex_requeue().) | |
f1d2171d MK |
1270 | .\" Darren Hart seems to agree with me that it does not occur for |
1271 | .\" FUTEX_REQUEUE. If Darren and I turn out to be wrong, then | |
1272 | .\" FUTEX_CMP_REQUEUE probably also needs to be added here. | |
2043f2c1 | 1273 | .BR FUTEX_CMP_REQUEUE_PI ) |
add875c0 MK |
1274 | .I uaddr |
1275 | equals | |
1276 | .IR uaddr2 | |
1277 | (i.e., an attempt was made to requeue to the same futex). | |
1278 | .TP | |
ff597681 MK |
1279 | .BR EINVAL |
1280 | .RB ( FUTEX_FD ) | |
1281 | The signal number supplied in | |
1282 | .I val | |
1283 | is invalid. | |
1284 | .TP | |
6bac3b85 | 1285 | .B EINVAL |
476debd7 MK |
1286 | .RB ( FUTEX_WAKE , |
1287 | .BR FUTEX_WAKE_OP , | |
1288 | .BR FUTEX_WAKE_BITSET , | |
1289 | .BR FUTEX_REQUEUE , | |
1290 | .BR FUTEX_CMP_REQUEUE ) | |
1291 | The kernel detected an inconsistency between the user-space state at | |
1292 | .I uaddr | |
1293 | and the kernel state\(emthat is, it detected a waiter which waits in | |
1294 | .BR FUTEX_LOCK_PI | |
1295 | on | |
1296 | .IR uaddr . | |
1297 | .TP | |
1298 | .B EINVAL | |
a218ef20 | 1299 | .RB ( FUTEX_LOCK_PI , |
ce022f18 MK |
1300 | .BR FUTEX_TRYLOCK_PI , |
1301 | .BR FUTEX_UNLOCK_PI ) | |
a218ef20 MK |
1302 | The kernel detected an inconsistency between the user-space state at |
1303 | .I uaddr | |
1304 | and the kernel state. | |
ce022f18 MK |
1305 | This indicates either state corruption |
1306 | .\" FIXME tglx did not mention the "state corruption" for FUTEX_UNLOCK_PI. | |
1307 | .\" Does that case also apply for FUTEX_UNLOCK_PI? | |
1308 | or that the kernel found a waiter on | |
a218ef20 MK |
1309 | .I uaddr |
1310 | which is waiting via | |
1311 | .BR FUTEX_WAIT | |
1312 | or | |
1313 | .BR FUTEX_WAIT_BITSET . | |
1314 | .TP | |
1315 | .B EINVAL | |
f9250b1a MK |
1316 | .RB ( FUTEX_CMP_REQUEUE_PI ) |
1317 | The kernel detected an inconsistency between the user-space state at | |
99c0041d MK |
1318 | .I uaddr2 |
1319 | and the kernel state; | |
1320 | that is, the kernel detected a waiter which waits via | |
1321 | .BR FUTEX_WAIT | |
1322 | .\" FIXME tglx did not mention FUTEX_WAIT_BITSET here, | |
1323 | .\" but should that not also be included here? | |
1324 | on | |
1325 | .IR uaddr2 . | |
1326 | .TP | |
1327 | .B EINVAL | |
1328 | .RB ( FUTEX_CMP_REQUEUE_PI ) | |
1329 | The kernel detected an inconsistency between the user-space state at | |
f9250b1a MK |
1330 | .I uaddr |
1331 | and the kernel state; | |
1332 | that is, the kernel detected a waiter which waits via | |
75299c8d | 1333 | .BR FUTEX_WAIT |
99c0041d | 1334 | or |
75299c8d | 1335 | .BR FUTEX_WAIT_BITSET |
f9250b1a MK |
1336 | on |
1337 | .IR uaddr . | |
1338 | .TP | |
1339 | .B EINVAL | |
99c0041d | 1340 | .RB ( FUTEX_CMP_REQUEUE_PI ) |
75299c8d MK |
1341 | The kernel detected an inconsistency between the user-space state at |
1342 | .I uaddr | |
1343 | and the kernel state; | |
1344 | that is, the kernel detected a waiter which waits on | |
1345 | .I uaddr | |
1346 | via | |
1347 | .BR FUTEX_LOCK_PI | |
1348 | (instead of | |
1349 | .BR FUTEX_WAIT_REQUEUE_PI ). | |
99c0041d MK |
1350 | .TP |
1351 | .B EINVAL | |
9786b3ca | 1352 | .RB ( FUTEX_CMP_REQUEUE_PI ) |
f1d2171d | 1353 | .\" FIXME XXX The following is a reworded version of Darren Hart's text. |
9786b3ca MK |
1354 | .\" Please check that I did not introduce any errors. |
1355 | An attempt was made to requeue a waiter to a futex other than that | |
1356 | specified by the matching | |
1357 | .B FUTEX_WAIT_REQUEUE_PI | |
1358 | call for that waiter. | |
1359 | .TP | |
1360 | .B EINVAL | |
f0c0d61c MK |
1361 | .RB ( FUTEX_CMP_REQUEUE_PI ) |
1362 | The | |
1363 | .I val | |
1364 | argument is not 1. | |
1365 | .TP | |
1366 | .B EINVAL | |
4832b48a | 1367 | Invalid argument. |
fea681da | 1368 | .TP |
a449c634 MK |
1369 | .BR ENOMEM |
1370 | .RB ( FUTEX_LOCK_PI , | |
e34a8fb6 MK |
1371 | .BR FUTEX_TRYLOCK_PI , |
1372 | .BR FUTEX_CMP_REQUEUE_PI ) | |
a449c634 MK |
1373 | The kernel could not allocate memory to hold state information. |
1374 | .TP | |
fea681da | 1375 | .B ENFILE |
ff597681 | 1376 | .RB ( FUTEX_FD ) |
fea681da | 1377 | The system limit on the total number of open files has been reached. |
4701fc28 MK |
1378 | .TP |
1379 | .B ENOSYS | |
1380 | Invalid operation specified in | |
d33602c4 | 1381 | .IR futex_op . |
9f6c40c0 | 1382 | .TP |
4a7e5b05 MK |
1383 | .B ENOSYS |
1384 | The | |
1385 | .BR FUTEX_CLOCK_REALTIME | |
1386 | option was specified in | |
1afcee7c | 1387 | .IR futex_op , |
4a7e5b05 MK |
1388 | but the accompanying operation was neither |
1389 | .BR FUTEX_WAIT_BITSET | |
1390 | nor | |
1391 | .BR FUTEX_WAIT_REQUEUE_PI . | |
1392 | .TP | |
a9dcb4d1 MK |
1393 | .BR ENOSYS |
1394 | .RB ( FUTEX_LOCK_PI , | |
f2424fae | 1395 | .BR FUTEX_TRYLOCK_PI , |
4945ff19 | 1396 | .BR FUTEX_UNLOCK_PI , |
4cf92894 | 1397 | .BR FUTEX_CMP_REQUEUE_PI , |
794bb106 | 1398 | .BR FUTEX_WAIT_REQUEUE_PI ) |
a9dcb4d1 | 1399 | A run-time check determined that the operation is not available. |
a2ebebcd MK |
1400 | The PI futex operations are not implemented on all architectures and |
1401 | are not supported on some CPU variants. | |
a9dcb4d1 | 1402 | .TP |
c7589177 MK |
1403 | .BR EPERM |
1404 | .RB ( FUTEX_LOCK_PI , | |
dc2742a8 MK |
1405 | .BR FUTEX_TRYLOCK_PI , |
1406 | .BR FUTEX_CMP_REQUEUE_PI ) | |
04331c3f | 1407 | The caller is not allowed to attach itself to the futex at |
dc2742a8 MK |
1408 | .I uaddr |
1409 | (for | |
1410 | .BR FUTEX_CMP_REQUEUE_PI : | |
1411 | the futex at | |
1412 | .IR uaddr2 ). | |
c7589177 | 1413 | (This may be caused by a state corruption in user space.) |
61f8c1d1 | 1414 | .\" |
f1d2171d | 1415 | .\" FIXME XXX Is there not also an EPERM error case on 'uaddr2' for |
61f8c1d1 MK |
1416 | .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via |
1417 | .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==> | |
1418 | .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EPERM? | |
c7589177 | 1419 | .TP |
76f347ba | 1420 | .BR EPERM |
87276709 | 1421 | .RB ( FUTEX_UNLOCK_PI ) |
76f347ba MK |
1422 | The caller does not own the futex. |
1423 | .TP | |
0b0e4934 MK |
1424 | .BR ESRCH |
1425 | .RB ( FUTEX_LOCK_PI , | |
1426 | .BR FUTEX_TRYLOCK_PI ) | |
1427 | .\" FIXME I reworded the following sentence a bit differently from | |
1428 | .\" tglx's formulation. Is it okay? | |
1429 | The thread ID in the futex at | |
1430 | .I uaddr | |
1431 | does not exist. | |
61f8c1d1 | 1432 | .\" |
f1d2171d | 1433 | .\" FIXME XXX Is there not also an ESRCH error case on 'uaddr2' for |
61f8c1d1 MK |
1434 | .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via |
1435 | .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==> | |
1436 | .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> ESRCH? | |
0b0e4934 | 1437 | .TP |
360f773c MK |
1438 | .BR ESRCH |
1439 | .RB ( FUTEX_CMP_REQUEUE_PI ) | |
1440 | .\" FIXME I reworded the following sentence a bit differently from | |
1441 | .\" tglx's formulation. Is it okay? | |
1442 | The thread ID in the futex at | |
1443 | .I uaddr2 | |
1444 | does not exist. | |
1445 | .TP | |
9f6c40c0 | 1446 | .B ETIMEDOUT |
4d85047f MK |
1447 | The operation in |
1448 | .IR futex_op | |
1449 | employed the timeout specified in | |
1450 | .IR timeout , | |
1451 | and the timeout expired before the operation completed. | |
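As an illustration of how a caller might observe some of the errors listed above, the following sketch wraps FUTEX_WAIT with a relative timeout and returns the errno value instead of \-1. The helper name wait_ms() and its millisecond interface are hypothetical, and the timeout is assumed to be nonnegative.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait on 'uaddr' while *uaddr == expected,
   for at most 'timeout_ms' milliseconds (assumed >= 0).
   Returns 0 if woken, otherwise the errno value of the failure. */
static int wait_ms(uint32_t *uaddr, uint32_t expected, long timeout_ms)
{
    /* Split into seconds and nanoseconds so that tv_nsec stays
       below 1,000,000,000 and the EINVAL timeout check passes. */
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &ts, NULL, 0) == -1)
        return errno;
    return 0;
}
```

A caller would typically re-read the futex word after EAGAIN, retry after EINTR, and give up or report a timeout after ETIMEDOUT.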
70b06b90 MK |
1452 | .\" |
1453 | .\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" | |
1454 | .\" | |
47297adb | 1455 | .SH VERSIONS |
a1d5f77c | 1456 | .PP |
81c9d87e MK |
1457 | Futexes were first made available in a stable kernel release |
1458 | with Linux 2.6.0. | |
1459 | ||
a1d5f77c MK |
1460 | Initial futex support was merged in Linux 2.5.7 but with different semantics |
1461 | from what was described above. | |
52dee70e | 1462 | A four-argument system call with the semantics |
fd3fa7ef | 1463 | described in this page was introduced in Linux 2.5.40. |
11b520ed | 1464 | In Linux 2.5.70, a fifth argument |
a1d5f77c | 1465 | was added. |
11b520ed | 1466 | In Linux 2.6.7, a sixth argument was added\(emmessy, especially |
a1d5f77c | 1467 | on the s390 architecture. |
47297adb | 1468 | .SH CONFORMING TO |
8382f16d | 1469 | This system call is Linux-specific. |
47297adb | 1470 | .SH NOTES |
baf0f1f4 MK |
1471 | Glibc does not provide a wrapper for this system call; call it using |
1472 | .BR syscall (2). | |
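A wrapper along the following lines is a common way to do this. It is a minimal sketch that mirrors the prototype shown in the SYNOPSIS; the function is defined by the application, not by glibc.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Application-defined wrapper; the argument order follows the
   SYNOPSIS. Returns the raw result of syscall(2): \-1 with errno
   set on error, otherwise an operation-specific value. */
static long futex(int *uaddr, int futex_op, int val,
                  const struct timespec *timeout,
                  int *uaddr2, int val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val,
                   timeout, uaddr2, val3);
}
```

For example, futex(&word, FUTEX_WAKE, 1, NULL, NULL, 0) wakes at most one waiter on word and returns the number of waiters woken.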
47297adb | 1473 | .SH SEE ALSO |
4c222281 | 1474 | .ad l |
9913033c | 1475 | .BR get_robust_list (2), |
d806bc05 | 1476 | .BR restart_syscall (2), |
14d8dd3b | 1477 | .BR futex (7) |
fea681da | 1478 | .PP |
f5ad572f MK |
1479 | The following kernel source files: |
1480 | .IP * 2 | |
1481 | .I Documentation/pi-futex.txt | |
1482 | .IP * | |
1483 | .I Documentation/futex-requeue-pi.txt | |
1484 | .IP * | |
1485 | .I Documentation/locking/rt-mutex.txt | |
1486 | .IP * | |
1487 | .I Documentation/locking/rt-mutex-design.txt | |
8fe019c7 MK |
1488 | .IP * |
1489 | .I Documentation/robust-futex-ABI.txt | |
43b99089 | 1490 | .PP |
4c222281 | 1491 | Franke, H., Russell, R., and Kirkwood, M., 2002. |
52087dd3 | 1492 | \fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP |
4c222281 | 1493 | (from proceedings of the Ottawa Linux Symposium 2002), |
9b936e9e | 1494 | .br |
608bf950 SK |
1495 | .UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf |
1496 | .UE | |
f42eb21b | 1497 | |
4c222281 | 1498 | Hart, D., 2009. \fIA futex overview and update\fP, |
2ed26199 MK |
1499 | .UR http://lwn.net/Articles/360699/ |
1500 | .UE | |
1501 | ||
4c222281 | 1502 | Hart, D. and Guniguntala, D., 2009. |
0483b6cc | 1503 | \fIRequeue-PI: Making Glibc Condvars PI-Aware\fP |
4c222281 | 1504 | (from proceedings of the 2009 Real-Time Linux Workshop), |
0483b6cc MK |
1505 | .UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf |
1506 | .UE | |
1507 | ||
4c222281 | 1508 | Drepper, U., 2011. \fIFutexes Are Tricky\fP, |
f42eb21b MK |
1509 | .UR http://www.akkadia.org/drepper/futex.pdf |
1510 | .UE | |
9b936e9e MK |
1511 | .PP |
1512 | Futex example library, futex-*.tar.bz2 at | |
1513 | .br | |
a605264d | 1514 | .UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/ |
608bf950 | 1515 | .UE |
34f14794 MK |
1516 | .\" |
1517 | .\" FIXME Are there any other resources that should be listed | |
1518 | .\" in the SEE ALSO section? |