.\" Page by b.hubert
.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
.\" may be freely modified and distributed
.\" %%%LICENSE_END
.\"
.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
.\" added ERRORS section.
.\"
.\" Modified 2004-06-17 mtk
.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
.\"
.\" FIXME Still to integrate are some points from Torvald Riegel's mail of
.\" 2015-01-23:
.\" http://thread.gmane.org/gmane.linux.kernel/1703405/focus=7977
.\"
.\" FIXME Do we need to add some text regarding Torvald Riegel's 2015-01-24 mail
.\" at http://thread.gmane.org/gmane.linux.kernel/1703405/focus=1873242
.\"
.TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
.SH NAME
futex \- fast user-space locking
.SH SYNOPSIS
.nf
.sp
.B "#include <linux/futex.h>"
.B "#include <sys/time.h>"
.sp
.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
.BI "          const struct timespec *" timeout , \
"   \fR  /* or: \fBuint32_t \fIval2\fP */
.BI "          int *" uaddr2 ", int " val3 );
.fi

.IR Note :
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
.PP
The
.BR futex ()
system call provides a method for waiting until a certain condition becomes
true.
It is typically used as a blocking construct in the context of
shared-memory synchronization: The program implements the majority of
the synchronization in user space, and uses one of the operations of
the system call when it is likely that it has to block for
a longer time until the condition becomes true.
The program uses another operation of the system call to wake
anyone waiting for a particular condition.

A futex is a 32-bit value\(emreferred to below as a
.IR "futex word" \(emwhose
address is supplied to the
.BR futex ()
system call.
(Futexes are 32 bits in size on all platforms, including 64-bit systems.)
All futex operations are governed by this value.
In order to share a futex between processes,
the futex is placed in a region of shared memory,
created using (for example)
.BR mmap (2)
or
.BR shmat (2).
(Thus the futex word may have different
virtual addresses in different processes,
but these addresses all refer to the same location in physical memory.)

When executing a futex operation that requests to block a thread,
the kernel will block only if the futex word has the value that the
calling thread supplied as the expected value.
The load from the futex word, the comparison with
the expected value,
and the actual blocking will happen atomically and totally
ordered with respect to concurrently executing futex
operations on the same futex word.
Thus, the futex word is used to connect the synchronization in user space
with the implementation of blocking by the kernel; similar to an atomic
compare-and-exchange operation that potentially changes shared memory,
blocking via a futex is an atomic compare-and-block operation.
.\" FIXME(Torvald Riegel):
.\" Eventually we want to have some text in NOTES to satisfy
.\" the reference in the following sentence
.\"       See NOTES for
.\"       a detailed specification of the synchronization semantics.

One example use of futexes is implementing locks.
The state of the lock (i.e.,
acquired or not acquired) can be represented as an atomically accessed
flag in shared memory.
In the uncontended case,
a thread can access or modify the lock state with atomic instructions,
for example atomically changing it from not acquired to acquired
using an atomic compare-and-exchange instruction.
A thread may be unable to acquire a lock because
it is already acquired by another thread.
It then may pass the lock's flag as the futex word and the value
representing the acquired state as the expected value to a
.BR futex ()
wait operation.
The call to
.BR futex ()
will block if and only if the lock is still acquired.
When releasing the lock, a thread has to first reset the
lock state to not acquired and then execute a futex
operation that wakes threads blocked on the lock flag used as the futex word
(this can be further optimized to avoid unnecessary wake-ups).
See
.BR futex (7)
for more detail on how to use futexes.
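.PP
The lock protocol just described can be sketched roughly as follows.
This is a minimal illustration under stated assumptions, not a production lock:
the names sketch_lock, sketch_futex, sketch_lock_acquire, and
sketch_lock_release are hypothetical, the state encoding
(0 for not acquired, 1 for acquired) is chosen only for this example,
and the release path always issues a wake-up instead of avoiding
unnecessary ones.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: atomic_int has the same size as the 32-bit futex word. */
static_assert(sizeof(atomic_int) == 4, "futex words are 32 bits");

/* Hypothetical lock type: 0 = not acquired, 1 = acquired. */
typedef struct {
    atomic_int state;
} sketch_lock;

/* Hypothetical helper; glibc provides no futex() wrapper. */
static long sketch_futex(atomic_int *uaddr, int futex_op, int val)
{
    return syscall(SYS_futex, uaddr, futex_op, val, NULL, NULL, 0);
}

static void sketch_lock_acquire(sketch_lock *l)
{
    for (;;) {
        int expected = 0;

        /* Uncontended case: flip 0 -> 1 entirely in user space. */
        if (atomic_compare_exchange_strong(&l->state, &expected, 1))
            return;

        /* Contended case: block only if the word still holds 1.
           If it changed in the meantime, the call returns at once
           and the loop retries the compare-and-exchange. */
        sketch_futex(&l->state, FUTEX_WAIT, 1);
    }
}

static void sketch_lock_release(sketch_lock *l)
{
    /* First reset the state, then wake one blocked thread (if any). */
    atomic_store(&l->state, 0);
    sketch_futex(&l->state, FUTEX_WAKE, 1);
}
```

A caller would place a sketch_lock in memory visible to all
participating threads, initialize its state to 0, and bracket the
critical section with the acquire and release calls.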
.PP
Besides the basic wait and wake-up futex functionality, there are further
futex operations aimed at supporting more complex use cases.
Also note that
no explicit initialization or destruction is necessary to use futexes;
the kernel maintains a futex
(i.e., the kernel-internal implementation artifact)
only while operations such as
.BR FUTEX_WAIT ,
described below, are being performed on a particular futex word.
.\"
.SS Arguments
The
.I uaddr
argument points to the futex word.
On all platforms, futexes are four-byte
integers that must be aligned on a four-byte boundary.
The operation to perform on the futex is specified in the
.I futex_op
argument;
.IR val
is a value whose meaning and purpose depend on
.IR futex_op .

The remaining arguments
.RI ( timeout ,
.IR uaddr2 ,
and
.IR val3 )
are required only for certain of the futex operations described below.
Where one of these arguments is not required, it is ignored.

For several blocking operations, the
.I timeout
argument is a pointer to a
.IR timespec
structure that specifies a timeout for the operation.
However, notwithstanding the prototype shown above, for some operations,
the least significant four bytes of this argument are used as an integer
whose meaning is determined by the operation.
For these operations, the kernel casts the
.I timeout
value first to
.IR "unsigned long" ,
then to
.IR uint32_t ,
and in the remainder of this page, this argument is referred to as
.I val2
when interpreted in this fashion.

Where it is required, the
.IR uaddr2
argument is a pointer to a second futex word that is employed
by the operation.
The interpretation of the final integer argument,
.IR val3 ,
depends on the operation.
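.PP
Because glibc provides no wrapper, a program has to supply its own.
The following is one possible sketch built on
syscall(2); the names futex_call and futex_call_val2 are hypothetical,
and the second helper merely illustrates how an integer val2 can be
passed through the timeout slot for operations that interpret it that way.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper mirroring the prototype in the SYNOPSIS. */
static long futex_call(int *uaddr, int futex_op, int val,
                       const struct timespec *timeout,
                       int *uaddr2, int val3)
{
    /* Futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);

    return syscall(SYS_futex, uaddr, futex_op, val,
                   timeout, uaddr2, val3);
}

/* Hypothetical variant for operations that treat the fourth
   argument as the integer val2 rather than as a timeout. */
static long futex_call_val2(int *uaddr, int futex_op, int val,
                            uint32_t val2, int *uaddr2, int val3)
{
    return futex_call(uaddr, futex_op, val,
                      (const struct timespec *) (uintptr_t) val2,
                      uaddr2, val3);
}
```

Going through uintptr_t keeps the integer-to-pointer conversion explicit;
the kernel then recovers val2 from the low-order bytes as described above.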
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.SS Futex operations
The
.I futex_op
argument consists of two parts:
a command that specifies the operation to be performed,
bit-wise ORed with zero or more options that
modify the behaviour of the operation.
The options that may be included in
.I futex_op
are as follows:
.TP
.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
This option bit can be employed with all futex operations.
It tells the kernel that the futex is process-private and not shared
with another process (i.e., it is being used for synchronization
only between threads of the same process).
This allows the kernel to make some additional performance optimizations.
.\" I.e., It allows the kernel choose the fast path for validating
.\" the user-space address and avoids expensive VMA lookups,
.\" taking reference counts on file backing store, and so on.

As a convenience,
.IR <linux/futex.h>
defines a set of constants with the suffix
.BR _PRIVATE
that are equivalents of all of the operations listed below,
.\" except the obsolete FUTEX_FD, for which the "private" flag was
.\" meaningless
but with the
.BR FUTEX_PRIVATE_FLAG
ORed into the constant value.
Thus, there are
.BR FUTEX_WAIT_PRIVATE ,
.BR FUTEX_WAKE_PRIVATE ,
and so on.
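.IP
The relationship between a command and its private variant can be
checked directly.
In this small sketch the helper names futex_op_private and
futex_op_command are hypothetical;
FUTEX_CMD_MASK is a constant from the kernel header that is not otherwise
discussed on this page and is used here only to strip the option bits.

```c
#include <assert.h>
#include <linux/futex.h>

/* Each _PRIVATE constant is the plain command with the flag ORed in. */
static_assert(FUTEX_WAIT_PRIVATE == (FUTEX_WAIT | FUTEX_PRIVATE_FLAG),
              "FUTEX_WAIT_PRIVATE carries FUTEX_PRIVATE_FLAG");

/* Hypothetical helper: add the process-private option to a command. */
static int futex_op_private(int command)
{
    return command | FUTEX_PRIVATE_FLAG;
}

/* Hypothetical helper: drop the option bits, leaving only the command. */
static int futex_op_command(int futex_op)
{
    return futex_op & FUTEX_CMD_MASK;
}
```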
.TP
.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
.\" commit 1acdac104668a0834cfa267de9946fac7764d486
This option bit can be employed only with the
.BR FUTEX_WAIT_BITSET
and
.BR FUTEX_WAIT_REQUEUE_PI
operations.

If this option is set, the kernel treats
.I timeout
as an absolute time based on
.BR CLOCK_REALTIME .
.IP
If this option is not set, the kernel treats
.I timeout
as an absolute time
.\" FIXME XXX I added CLOCK_MONOTONIC below. Okay?
measured against the
.BR CLOCK_MONOTONIC
clock.
.PP
The operation specified in
.I futex_op
is one of the following:
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_WAIT " (since Linux 2.6.0)"
.\" Strictly speaking, since some time in 2.5.x
This operation tests that the value at the
futex word pointed to by the address
.I uaddr
still contains the expected value
.IR val ,
and if so, then sleeps waiting for a
.B FUTEX_WAKE
operation on the futex word.
The load of the value of the futex word is an atomic memory
access (i.e., using atomic machine instructions of the respective
architecture).
This load, the comparison with the expected value, and
starting to sleep are performed atomically and totally ordered with respect
to other futex operations on the same futex word.
If the thread starts to
sleep, it is considered a waiter on this futex word.
If the futex value does not match
.IR val ,
then the call fails immediately with the error
.BR EAGAIN .

The purpose of the comparison with the expected value is to prevent lost
wake-ups: If another thread changed the value of the futex word after the
calling thread decided to block based on the prior value, and if the other
thread executed a
.BR FUTEX_WAKE
operation (or similar wake-up) after the value change and before this
.BR FUTEX_WAIT
operation, then the latter will observe the value change and will not start
to sleep.

If the
.I timeout
argument is non-NULL, its contents specify a relative timeout for the wait,
.\" FIXME XXX I added CLOCK_MONOTONIC below. Okay?
measured according to the
.BR CLOCK_MONOTONIC
clock.
(This interval will be rounded up to the system clock granularity,
and kernel scheduling delays mean that the
blocking interval may overrun by a small amount.)
If
.I timeout
is NULL, the call blocks indefinitely.

The arguments
.I uaddr2
and
.I val3
are ignored.
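.IP
The two ways in which a wait can end without a wake-up can be
demonstrated from a single thread.
In the sketch below, the helper name futex_wait_ms and its
millisecond interface are hypothetical conveniences;
the expiry of the relative timeout is assumed to be reported as
ETIMEDOUT, as listed among the errors for this system call.

```c
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep on *uaddr while it still equals
   'expected', for at most timeout_ms milliseconds (relative).
   A negative timeout_ms means "no timeout".
   Returns 0 if woken, or -1 with errno set. */
static int futex_wait_ms(int *uaddr, int expected, long timeout_ms)
{
    struct timespec ts;
    struct timespec *tsp = NULL;

    /* Futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);

    if (timeout_ms >= 0) {
        ts.tv_sec = timeout_ms / 1000;
        ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
        tsp = &ts;
    }

    return (int) syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                         tsp, NULL, 0);
}
```

A caller normally re-checks its own condition after any return,
whether the call reported a wake-up, a value mismatch, or a timeout.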
.\" FIXME(Torvald) I think we should remove this. Or maybe adapt to a
.\" different example.
.\" For
.\" .BR futex (7),
.\" this call is executed if decrementing the count gave a negative value
.\" (indicating contention),
.\" and will sleep until another process or thread releases
.\" the futex and executes the
.\" .B FUTEX_WAKE
.\" operation.
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_WAKE " (since Linux 2.6.0)"
.\" Strictly speaking, since Linux 2.5.x
This operation wakes at most
.I val
of the waiters that are waiting (e.g., inside
.BR FUTEX_WAIT )
on the futex word at the address
.IR uaddr .
Most commonly,
.I val
is specified as either 1 (wake up a single waiter) or
.BR INT_MAX
(wake up all waiters).
No guarantee is provided about which waiters are awoken
(e.g., a waiter with a higher scheduling priority is not guaranteed
to be awoken in preference to a waiter with a lower priority).

The arguments
.IR timeout ,
.IR uaddr2 ,
and
.I val3
are ignored.
.\" FIXME(Torvald) I think we should remove this. Or maybe adapt to
.\" a different example.
.\" For
.\" .BR futex (7),
.\" this is executed if incrementing the count showed that
.\" there were waiters,
.\" once the futex value has been set to 1
.\" (indicating that it is available).
.\"
.\" FIXME How does "incrementing the count show that there were waiters"?
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
.\" Strictly speaking, from Linux 2.5.x to 2.6.25
This operation creates a file descriptor that is associated with
the futex at
.IR uaddr .
The caller must close the returned file descriptor after use.
When another process or thread performs a
.BR FUTEX_WAKE
on the futex word, the file descriptor is indicated as being readable by
.BR select (2),
.BR poll (2),
and
.BR epoll (7).

The file descriptor can be used to obtain asynchronous notifications: if
.I val
is nonzero, then when another process or thread executes a
.BR FUTEX_WAKE ,
the caller will receive the signal number that was passed in
.IR val .

The arguments
.IR timeout ,
.IR uaddr2 ,
and
.I val3
are ignored.

.\" FIXME(Torvald) We never define "upped". Maybe just remove the
.\" following sentence?
To prevent race conditions, the caller should test if the futex has
been upped after
.B FUTEX_FD
returns.

Because it was inherently racy,
.B FUTEX_FD
has been removed
.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
from Linux 2.6.26 onward.
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
.\" Strictly speaking: from Linux 2.5.70
.\" FIXME(Torvald) Is there some indication that FUTEX_REQUEUE is broken
.\" in general, or is this comment implicitly speaking about the
.\" condvar (?) use case? If the latter we might want to weaken the
.\" advice below a little.
.\" [Anyone else have input on this?]
.\"
.IR "Avoid using this operation" .
It is broken for its intended purpose.
Use
.BR FUTEX_CMP_REQUEUE
instead.

This operation performs the same task as
.BR FUTEX_CMP_REQUEUE ,
except that no check is made using the value in
.IR val3 .
(The argument
.I val3
is ignored.)
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
This operation first checks whether the location
.I uaddr
still contains the value
.IR val3 .
If not, the operation fails with the error
.BR EAGAIN .
Otherwise, the operation wakes up a maximum of
.I val
waiters that are waiting on the futex at
.IR uaddr .
If there are more than
.I val
waiters, then the remaining waiters are removed
from the wait queue of the source futex at
.I uaddr
and added to the wait queue of the target futex at
.IR uaddr2 .
The
.I val2
argument specifies an upper limit on the number of waiters
that are requeued to the futex at
.IR uaddr2 .

.\" FIXME(Torvald) Is the following correct? Or is just the decision
.\" which threads to wake or requeue part of the atomic operation?
The load from
.I uaddr
is an atomic memory access (i.e., using atomic machine instructions of
the respective architecture).
This load, the comparison with
.IR val3 ,
and the requeueing of any waiters are performed atomically and totally
ordered with respect to other operations on the same futex word.

This operation was added as a replacement for the earlier
.BR FUTEX_REQUEUE .
The difference is that the check of the value at
.I uaddr
can be used to ensure that requeueing happens only under certain
conditions.
Both operations can be used to avoid a "thundering herd" effect when
.B FUTEX_WAKE
is used and all of the waiters that are woken need to acquire
another futex.

.\" FIXME Please review the following new paragraph to see if it is
.\" accurate.
Typical values to specify for
.I val
are 0 or 1.
(Specifying
.BR INT_MAX
is not useful, because it would make the
.BR FUTEX_CMP_REQUEUE
operation equivalent to
.BR FUTEX_WAKE .)
The limit value specified via
.I val2
is typically either 1 or
.BR INT_MAX .
(Specifying the argument as 0 is not useful, because it would make the
.BR FUTEX_CMP_REQUEUE
operation equivalent to
.BR FUTEX_WAKE .)
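.IP
As an illustration of the argument layout, the following sketch wraps the
operation in a helper.
The name futex_cmp_requeue and its parameter names are hypothetical;
the scenario in the usage note (a condition-variable word and a mutex word)
is an assumed example of the thundering-herd case, not a description of
any particular library.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: if *from still equals 'expected' (val3),
   wake at most nr_wake (val) waiters on 'from' and move at most
   nr_requeue (val2) of the remaining waiters to 'to' (uaddr2).
   Returns -1 with errno set to EAGAIN if the value check fails. */
static long futex_cmp_requeue(int *from, int *to, int nr_wake,
                              uint32_t nr_requeue, int expected)
{
    /* Both futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) from & 3) == 0);
    assert(((uintptr_t) to & 3) == 0);

    return syscall(SYS_futex, from, FUTEX_CMP_REQUEUE, nr_wake,
                   /* val2 travels in the timeout slot */
                   (const struct timespec *) (uintptr_t) nr_requeue,
                   to, expected);
}
```

For example, a broadcast on a condition variable could call
futex_cmp_requeue(&cond, &mutex, 1, INT_MAX, seen_value):
one waiter is woken to contend for the mutex, while the others are moved
to the mutex word and are woken one at a time as the mutex is released,
instead of all waking at once only to block again.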
.\"
.\" FIXME Here, it would be helpful to have an example of how
.\" FUTEX_CMP_REQUEUE might be used, at the same time illustrating
.\" why FUTEX_WAKE is unsuitable for the same use case.
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.\" FIXME I added a lengthy piece of text on FUTEX_WAKE_OP text,
.\" and I'd be happy if someone checked it.
.TP
.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
.\" Author: Jakub Jelinek <jakub@redhat.com>
.\" Date: Tue Sep 6 15:16:25 2005 -0700
.\" FIXME(Torvald) The glibc condvar implementation is currently being
.\" revised (e.g., to not use an internal lock anymore).
.\" It is probably more future-proof to remove this paragraph.
.\" [Torvald, do you have an update here?]
This operation was added to support some user-space use cases
where more than one futex must be handled at the same time.
The most notable example is the implementation of
.BR pthread_cond_signal (3),
which requires operations on two futexes,
the one used to implement the mutex and the one used in the implementation
of the wait queue associated with the condition variable.
.BR FUTEX_WAKE_OP
allows such cases to be implemented without leading to
high rates of contention and context switching.

The
.BR FUTEX_WAKE_OP
operation is equivalent to executing the following code atomically
and totally ordered with respect to other futex operations on
either of the two supplied futex words:

.in +4n
.nf
int oldval = *(int *) uaddr2;
*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
if (oldval \fIcmp\fP \fIcmparg\fP)
    futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
.fi
.in

In other words,
.BR FUTEX_WAKE_OP
does the following:
.RS
.IP * 3
saves the original value of the futex word at
.IR uaddr2
and performs an operation to modify the value of the futex at
.IR uaddr2 ;
this is an atomic read-modify-write memory access (i.e., using atomic
machine instructions of the respective architecture)
.IP *
wakes up a maximum of
.I val
waiters on the futex for the futex word at
.IR uaddr ;
and
.IP *
dependent on the results of a test of the original value of the
futex word at
.IR uaddr2 ,
wakes up a maximum of
.I val2
waiters on the futex for the futex word at
.IR uaddr2 .
.RE
.IP
The operation and comparison that are to be performed are encoded
in the bits of the argument
.IR val3 .
Pictorially, the encoding is:

.in +8n
.nf
+---+---+-----------+-----------+
|op |cmp|   oparg   |  cmparg   |
+---+---+-----------+-----------+
  4   4       12          12    <== # of bits
.fi
.in

Expressed in code, the encoding is:

.in +4n
.nf
#define FUTEX_OP(op, oparg, cmp, cmparg) \\
                (((op & 0xf) << 28) | \\
                ((cmp & 0xf) << 24) | \\
                ((oparg & 0xfff) << 12) | \\
                (cmparg & 0xfff))
.fi
.in

In the above,
.I op
and
.I cmp
are each one of the codes listed below.
The
.I oparg
and
.I cmparg
components are literal numeric values, except as noted below.

The
.I op
component has one of the following values:

.in +4n
.nf
FUTEX_OP_SET        0  /* *uaddr2 = oparg; */
FUTEX_OP_ADD        1  /* *uaddr2 += oparg; */
FUTEX_OP_OR         2  /* *uaddr2 |= oparg; */
FUTEX_OP_ANDN       3  /* *uaddr2 &= ~oparg; */
FUTEX_OP_XOR        4  /* *uaddr2 ^= oparg; */
.fi
.in

In addition, bit-wise ORing the following value into
.I op
causes
.IR "(1\ <<\ oparg)"
to be used as the operand:

.in +4n
.nf
FUTEX_OP_ARG_SHIFT  8  /* Use (1 << oparg) as operand */
.fi
.in

The
.I cmp
field is one of the following:

.in +4n
.nf
FUTEX_OP_CMP_EQ     0  /* if (oldval == cmparg) wake */
FUTEX_OP_CMP_NE     1  /* if (oldval != cmparg) wake */
FUTEX_OP_CMP_LT     2  /* if (oldval < cmparg) wake */
FUTEX_OP_CMP_LE     3  /* if (oldval <= cmparg) wake */
FUTEX_OP_CMP_GT     4  /* if (oldval > cmparg) wake */
FUTEX_OP_CMP_GE     5  /* if (oldval >= cmparg) wake */
.fi
.in

The return value of
.BR FUTEX_WAKE_OP
is the sum of the number of waiters woken on the futex
.IR uaddr
plus the number of waiters woken on the futex
.IR uaddr2 .
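.IP
The encoding can be exercised with a small sketch.
The helper names encode_wake_op and futex_wake_op are hypothetical;
encode_wake_op simply repeats the bit layout shown above as a function,
and futex_wake_op passes val2 through the timeout slot, as described
under "Arguments".

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical encoder following the val3 layout:
   op (4 bits) | cmp (4 bits) | oparg (12 bits) | cmparg (12 bits). */
static uint32_t encode_wake_op(uint32_t op, uint32_t oparg,
                               uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* Hypothetical helper: modify *uaddr2 as encoded in val3, wake up to
   'val' waiters on uaddr, and, if the comparison on the old value of
   *uaddr2 holds, wake up to 'val2' waiters on uaddr2. */
static long futex_wake_op(int *uaddr, int val, uint32_t val2,
                          int *uaddr2, uint32_t val3)
{
    /* Both futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);
    assert(((uintptr_t) uaddr2 & 3) == 0);

    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP, val,
                   (const struct timespec *) (uintptr_t) val2,
                   uaddr2, val3);
}
```

For instance,
encode_wake_op(FUTEX_OP_ADD, 3, FUTEX_OP_CMP_GT, 0)
asks the kernel to add 3 to the word at uaddr2 and to perform the second
wake-up only if the old value of that word was greater than 0.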
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
This operation is like
.BR FUTEX_WAIT
except that
.I val3
is used to provide a 32-bit bitset to the kernel.
This bitset is stored in the kernel-internal state of the waiter.
See the description of
.BR FUTEX_WAKE_BITSET
for further details.

The
.BR FUTEX_WAIT_BITSET
operation also interprets the
.I timeout
argument differently from
.BR FUTEX_WAIT .
See the discussion of
.BR FUTEX_CLOCK_REALTIME ,
above.

The
.I uaddr2
argument is ignored.
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.TP
.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
This operation is the same as
.BR FUTEX_WAKE
except that the
.I val3
argument is used to provide a 32-bit bitset to the kernel.
This bitset is used to select which waiters should be woken up.
The selection is done by a bit-wise AND of the "wake" bitset
(i.e., the value in
.IR val3 )
and the bitset which is stored in the kernel-internal
state of the waiter (the "wait" bitset that is set using
.BR FUTEX_WAIT_BITSET ).
All of the waiters for which the result of the AND is nonzero are woken up;
the remaining waiters are left sleeping.

.\" FIXME XXX Is this next paragraph that I added okay?
The effect of
.BR FUTEX_WAIT_BITSET
and
.BR FUTEX_WAKE_BITSET
is to allow selective wake-ups among multiple waiters that are blocked
on the same futex.
Note, however, that using this bitset multiplexing feature on a
futex is less efficient than simply using multiple futexes,
because employing bitset multiplexing requires the kernel
to check all waiters on a futex,
including those that are not interested in being woken up
(i.e., they do not have the relevant bit set in their "wait" bitset).
.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
.\"
.\"    "The original reason for the addition of these extensions
.\"     was to improve the performance of pthread read-write locks
.\"     in glibc. However, the pthreads library no longer uses the
.\"     same locking algorithm, and these extensions are not used
.\"     without the bitset parameter being all ones.
.\"
.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
.\" is nevertheless used (with a bitset of all ones) in order to
.\" obtain the absolute timeout functionality that is useful
.\" for efficiently implementing Pthreads APIs (which use absolute
.\" timeouts); FUTEX_WAIT provides only relative timeouts.

The
.I uaddr2
and
.I timeout
arguments are ignored.

The
.BR FUTEX_WAIT
and
.BR FUTEX_WAKE
operations correspond to
.BR FUTEX_WAIT_BITSET
and
.BR FUTEX_WAKE_BITSET
operations where the bitsets are all ones.
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.SS Priority-inheritance futexes
Linux supports priority-inheritance (PI) futexes in order to handle
priority-inversion problems that can be encountered with
normal futex locks.
Priority inversion is the problem that occurs when a high-priority
task is blocked waiting to acquire a lock held by a low-priority task,
while tasks at an intermediate priority continuously preempt
the low-priority task from the CPU.
Consequently, the low-priority task makes no progress toward
releasing the lock, and the high-priority task remains blocked.

Priority inheritance is a mechanism for dealing with
the priority-inversion problem.
With this mechanism, when a high-priority task becomes blocked
by a lock held by a low-priority task,
the latter's priority is temporarily raised to that of the former,
so that it is not preempted by any intermediate-level tasks,
and can thus make progress toward releasing the lock.
To be effective, priority inheritance must be transitive,
meaning that if a high-priority task blocks on a lock
held by a lower-priority task that is itself blocked by a lock
held by another intermediate-priority task
(and so on, for chains of arbitrary length),
then both of those tasks
(or more generally, all of the tasks in a lock chain)
have their priorities raised to be the same as the high-priority task.

.\" FIXME XXX The following is my attempt at a definition of PI futexes,
.\" based on mail discussions with Darren Hart. Does it seem okay?
From a user-space perspective,
what makes a futex PI-aware is a policy agreement between user space
and the kernel about the value of the futex word (described in a moment),
coupled with the use of the PI futex operations described below
(in particular,
.BR FUTEX_LOCK_PI ,
.BR FUTEX_TRYLOCK_PI ,
and
.BR FUTEX_CMP_REQUEUE_PI ).
.\" Quoting Darren Hart:
.\" These opcodes paired with the PI futex value policy (described below)
.\" defines a "futex" as PI aware. These were created very specifically
.\" in support of PI pthread_mutexes, so it makes a lot more sense to
.\" talk about a PI aware pthread_mutex, than a PI aware futex, since
.\" there is a lot of policy and scaffolding that has to be built up
.\" around it to use it properly (this is what a PI pthread_mutex is).

.\" FIXME XXX ===== Start of adapted Hart/Guniguntala text =====
.\" The following text is drawn from the Hart/Guniguntala paper
.\" (listed in SEE ALSO), but I have reworded some pieces
.\" significantly. Please check it.
.\"
The PI futex operations described below differ from the other
futex operations in that they impose policy on the use of the value of the
futex word:
.IP * 3
If the lock is not acquired, the futex word's value shall be 0.
.IP *
If the lock is acquired, the futex word's value shall
be the thread ID (TID;
see
.BR gettid (2))
of the owning thread.
.IP *
.\" FIXME XXX In the following line, I added "the lock is owned and". Okay?
If the lock is owned and there are threads contending for the lock,
then the
.B FUTEX_WAITERS
bit shall be set in the futex word's value; in other words, this value is:

    FUTEX_WAITERS | TID

.PP
Note that a PI futex word never just has the value
.BR FUTEX_WAITERS ,
which is a permissible state for non-PI futexes.

With this policy in place,
a user-space application can acquire a not-acquired
lock or release a lock that no other threads try to acquire using atomic
instructions executed in user space (e.g., a compare-and-swap operation
such as
.I cmpxchg
on the x86 architecture).
Acquiring a lock simply consists of using compare-and-swap to atomically
set the futex word's value to the caller's TID if its previous value was 0.
Releasing a lock requires using compare-and-swap to set the futex word's
value to 0 if the previous value was the expected TID.
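.PP
The user-space fast paths just described could look roughly like the
following sketch.
The function names current_tid, pi_try_acquire_fast, and pi_release_fast
are hypothetical, and C11 atomics stand in for the architecture-specific
compare-and-swap instruction;
when either fast path fails, a real implementation would fall back to the
PI futex operations, which this sketch does not attempt.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: the atomic type has the size of a 32-bit futex word. */
static_assert(sizeof(_Atomic uint32_t) == 4, "futex words are 32 bits");

/* Kernel thread ID of the caller, as returned by gettid(2). */
static uint32_t current_tid(void)
{
    return (uint32_t) syscall(SYS_gettid);
}

/* Fast-path acquire: compare-and-swap 0 -> TID.
   Returns false if the lock is already owned; the caller would then
   have to enter the kernel with FUTEX_LOCK_PI. */
static bool pi_try_acquire_fast(_Atomic uint32_t *word)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong(word, &expected, current_tid());
}

/* Fast-path release: compare-and-swap TID -> 0.
   Returns false if the word is not exactly the caller's TID, for
   example because FUTEX_WAITERS is set; the caller would then have
   to enter the kernel with FUTEX_UNLOCK_PI. */
static bool pi_release_fast(_Atomic uint32_t *word)
{
    uint32_t expected = current_tid();

    return atomic_compare_exchange_strong(word, &expected, 0);
}
```

The release check against the exact TID is what prevents user space from
clearing a word that has the
FUTEX_WAITERS bit set.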
b52e1cd4 814
4b35dc5d 815If a futex is already acquired (i.e., has a nonzero value),
b52e1cd4 816waiters must employ the
79d918c7
MK
817.B FUTEX_LOCK_PI
818operation to acquire the lock.
4b35dc5d 819If other threads are waiting for the lock, then the
79d918c7 820.B FUTEX_WAITERS
4c8cb0ff
MK
821bit is set in the futex value;
822in this case, the lock owner must employ the
79d918c7 823.B FUTEX_UNLOCK_PI
b52e1cd4
MK
824operation to release the lock.
825
79d918c7
MK
826In the cases where callers are forced into the kernel
827(i.e., required to perform a
828.BR futex ()
0c3ec26b 829call),
79d918c7
MK
830they then deal directly with a so-called RT-mutex,
831a kernel locking mechanism which implements the required
832priority-inheritance semantics.
833After the RT-mutex is acquired, the futex value is updated accordingly,
834before the calling thread returns to user space.
835.\" FIXME ===== End of adapted Hart/Guniguntala text =====
836
a59fca75 837It is important to note
d6bb5a38
MK
838.\" FIXME We need some explanation in the following paragraph of *why*
839.\" it is important to note that "the kernel will update the
840.\" futex word's value prior
841.\" to returning to user space". Can someone explain?
4b35dc5d 842that the kernel will update the futex word's value prior
79d918c7
MK
843to returning to user space.
844Unlike the other futex operations described above,
845the PI futex operations are designed
d9d5be6b 846for the implementation of very specific IPC mechanisms.
fc57e6bb 847.\"
7bd3ffbc 848.\" FIXME XXX In discussing errors for FUTEX_CMP_REQUEUE_PI, Darren Hart
99c0ac69
MK
849.\" made the observation that "EINVAL is returned if the non-pi
850.\" to pi or op pairing semantics are violated."
851.\" Probably there needs to be a general statement about this
852.\" requirement, probably located at about this point in the page.
d6bb5a38 853.\" Darren (or someone else), care to take a shot at this?
dd003bef
MK
854.\"
855.\" FIXME Somewhere on this page (I guess under the discussion of PI
856.\" futexes) we need a discussion of the FUTEX_OWNER_DIED bit.
857.\" Can someone propose a text?
bd90a5f9
MK
858
859PI futexes are operated on by specifying one of the following values in
860.IR futex_op :
70b06b90
MK
861.\"
862.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
863.\"
d67e21f5
MK
864.TP
865.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
866.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
67833bec 867.\"
d6bb5a38
MK
868.\" FIXME I did some significant rewording of tglx's text to create
869.\" the text below.
870.\" Please check the following paragraph, in case I injected
871.\" errors.
67833bec
MK
872.\"
873This operation is used after an attempt to acquire
4b35dc5d
TR
874the lock via an atomic user-space instruction failed
875because the futex word has a nonzero value\(emspecifically,
67833bec 876because it contained the namespace-specific TID of the lock owner.
67259526 877.\" FIXME In the preceding line, what does "namespace-specific" mean?
67833bec 878.\" (I kept those words from tglx.)
67259526 879.\" That is, what kind of namespace are we talking about?
67833bec
MK
880.\" (I suppose we are talking PID namespaces here, but I want to
881.\" be sure.)
882
4b35dc5d 883The operation checks the value of the futex word at the address
67833bec 884.IR uaddr .
70b06b90
MK
885If the value is 0, then the kernel tries to atomically set
886the futex value to the caller's TID.
d6bb5a38
MK
887.\" FIXME What would be the cause(s) of failure referred to
888.\" in the following sentence?
67833bec 889If that fails,
4b35dc5d 890or the futex word's value is nonzero,
67833bec 891the kernel atomically sets the
e0547e70 892.B FUTEX_WAITERS
67833bec
MK
893bit, which signals the futex owner that it cannot unlock the futex in
894user space atomically by setting the futex value to 0.
895After that, the kernel tries to find the thread which is
896associated with the owner TID,
897.\" FIXME Could I get a bit more detail on the next two lines?
898.\" What is "creates or reuses kernel state" about?
d6bb5a38 899.\" (I think this needs to be clearer in the page)
67833bec
MK
900creates or reuses kernel state on behalf of the owner
901and attaches the waiter to it.
67259526
MK
902.\" FIXME In the next line, what type of "priority" are we talking about?
903.\" Realtime priorities for SCHED_FIFO and SCHED_RR?
904.\" Or something else?
1f043693 905The enqueueing of the waiter is in descending priority order if more
e0547e70 906than one waiter exists.
67259526 907.\" FIXME What does "bandwidth" refer to in the next line?
e0547e70 908The owner inherits either the priority or the bandwidth of the waiter.
67259526
MK
909.\" FIXME In the preceding line, what determines whether the
910.\" owner inherits the priority versus the bandwidth?
67833bec
MK
911.\"
912.\" FIXME Could I get some help translating the next sentence into
913.\" something that user-space developers (and I) can understand?
70b06b90 914.\" In particular, what are "nested locks" in this context?
e0547e70
TG
915This inheritance follows the lock chain in the case of
916nested locking and performs deadlock detection.
917
d6bb5a38 918.\" FIXME tglx said "The timeout argument is handled as described in
9ce19cf1 919.\" FUTEX_WAIT." However, it appears to me that this is not right.
70b06b90 920.\" Is the following formulation correct?
e0547e70
TG
921The
922.I timeout
9ce19cf1
MK
923argument provides a timeout for the lock attempt.
924It is interpreted as an absolute time, measured against the
925.BR CLOCK_REALTIME
926clock.
927If
928.I timeout
929is NULL, the operation will block indefinitely.
e0547e70 930
a449c634 931The
e0547e70
TG
932.IR uaddr2 ,
933.IR val ,
934and
935.IR val3
a449c634 936arguments are ignored.
67833bec 937.\"
70b06b90
MK
938.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
939.\"
d67e21f5 940.TP
12fdbe23 941.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
d67e21f5 942.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
12fdbe23
MK
943This operation tries to acquire the futex at
944.IR uaddr .
0b761826 945.\" FIXME I think it would be helpful here to say a few more words about
70b06b90
MK
946.\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI.
947.\" Can someone propose something?
948.\"
74f58a64
MK
949.\" FIXME(Torvald) Additionally, we claim above that just FUTEX_WAITERS
950.\" is never an allowed state.
fa0388c3 951It deals with the situation where the TID value at
12fdbe23
MK
952.I uaddr
953is 0, but the
b52e1cd4 954.B FUTEX_WAITERS
12fdbe23 955bit is set.
fa0388c3
MK
956.\" FIXME How does the situation in the previous sentence come about?
957.\" Probably it would be helpful to say something about that in
958.\" the man page.
badbf70c 959.\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?
a282e5b0 960User space cannot handle this condition in a race-free manner.
084744ef
MK
961
962The
963.IR uaddr2 ,
964.IR val ,
965.IR timeout ,
966and
967.IR val3
968arguments are ignored.
70b06b90
MK
969.\"
970.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
971.\"
d67e21f5 972.TP
12fdbe23 973.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
d67e21f5 974.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
d4ba4328 975This operation wakes the highest-priority waiter that is waiting in
ecae2099
TG
976.B FUTEX_LOCK_PI
977on the futex address provided by the
978.I uaddr
979argument.
980
981This operation is called when the user-space value at
982.I uaddr
983cannot be changed atomically from a TID (of the owner) to 0.
984
985The
986.IR uaddr2 ,
987.IR val ,
988.IR timeout ,
989and
990.IR val3
11a194bf 991arguments are ignored.
70b06b90
MK
992.\"
993.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
994.\"
d67e21f5 995.TP
d67e21f5
MK
996.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
997.\" commit 52400ba946759af28442dee6265c5c0180ac7122
f812a08b
DH
998This operation is a PI-aware variant of
999.BR FUTEX_CMP_REQUEUE .
1000It requeues waiters that are blocked via
1001.B FUTEX_WAIT_REQUEUE_PI
1002on
1003.I uaddr
1004from a non-PI source futex
1005.RI ( uaddr )
1006to a PI target futex
1007.RI ( uaddr2 ).
1008
9e54d26d
MK
1009As with
1010.BR FUTEX_CMP_REQUEUE ,
1011this operation wakes up a maximum of
1012.I val
1013waiters that are waiting on the futex at
1014.IR uaddr .
1015However, for
1016.BR FUTEX_CMP_REQUEUE_PI ,
1017.I val
6fbeb8f4 1018is required to be 1
939ca89f 1019(since the main point is to avoid a thundering herd).
9e54d26d
MK
1020The remaining waiters are removed from the wait queue of the source futex at
1021.I uaddr
1022and added to the wait queue of the target futex at
1023.IR uaddr2 .
f812a08b 1024
9e54d26d 1025The
768d3c23 1026.I val2
c6d8cf21
MK
1027.\" val2 is the cap on the number of requeued waiters.
1028.\" In the glibc pthread_cond_broadcast() implementation, this argument
1029.\" is specified as INT_MAX, and for pthread_cond_signal() it is 0.
9e54d26d 1030and
768d3c23 1031.I val3
9e54d26d
MK
1032arguments serve the same purposes as for
1033.BR FUTEX_CMP_REQUEUE .
70b06b90 1034.\"
be376673
MK
1035.\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/
1036.\" notes that "priority-inheritance Futex to priority-inheritance
1037.\" Futex requeues are currently unsupported". Do we need to say
1038.\" something in the man page about that?
70b06b90
MK
1039.\"
1040.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1041.\"
d67e21f5
MK
1042.TP
1043.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
1044.\" commit 52400ba946759af28442dee6265c5c0180ac7122
70b06b90
MK
1045.\"
1046.\" FIXME I find the next sentence (from tglx) pretty hard to grok.
1af427a4 1047.\" Could someone explain it a bit more?
6ff1b4c0
TG
1048This operation waits on a non-PI futex at
1049.I uaddr
1050and allows the waiter to be requeued onto a PI futex at
1051.IR uaddr2 .
1052The wait operation on
1053.I uaddr
1054is the same as
1055.BR FUTEX_WAIT .
70b06b90 1056.\"
f1d2171d
MK
1057.\" FIXME I'm not quite clear on the meaning of the following sentence.
1058.\" Is this trying to say that while blocked in a
1059.\" FUTEX_WAIT_REQUEUE_PI, it could happen that another
1060.\" task does a FUTEX_WAKE on uaddr that simply causes
1061.\" a normal wake, with the result that the FUTEX_WAIT_REQUEUE_PI
1062.\" does not complete? What happens then to the FUTEX_WAIT_REQUEUE_PI
1063.\" operation? Does it remain blocked, or does it unblock
1064.\" In which case, what does user space see?
6ff1b4c0
TG
1065The waiter can be removed from the wait on
1066.I uaddr
1067via
1068.BR FUTEX_WAKE
1069without requeueing on
1070.IR uaddr2 .
a4e69912 1071
63bea7dc
MK
1072.\" FIXME Please check the following. tglx said "The timeout argument
1073.\" is handled as described in FUTEX_WAIT.", but the truth is
1074.\" as below, AFAICS
1075If
1076.I timeout
1077is not NULL, it specifies a timeout for the wait operation;
1078this timeout is interpreted as outlined above in the description of the
1079.BR FUTEX_CLOCK_REALTIME
1080option.
1081If
1082.I timeout
1083is NULL, the operation can block indefinitely.
1084
a4e69912
MK
1085The
1086.I val3
1087argument is ignored.
70b06b90 1088.\" FIXME Re the preceding sentence... Actually 'val3' is internally set to
a4e69912
MK
1089.\" FUTEX_BITSET_MATCH_ANY before calling futex_wait_requeue_pi().
1090.\" I'm not sure we need to say anything about this though.
1091.\" Comments?
abb571e8
MK
1092
1093The
1094.BR FUTEX_WAIT_REQUEUE_PI
1095and
1096.BR FUTEX_CMP_REQUEUE_PI
1097operations were added to support a fairly specific use case:
1098support for priority-inheritance-aware POSIX threads condition variables.
1099The idea is that these operations should always be paired,
1100in order to ensure that user space and the kernel remain in sync.
1101Thus, in the
1102.BR FUTEX_WAIT_REQUEUE_PI
1103operation, the user-space application pre-specifies the target
1104of the requeue that takes place in the
1105.BR FUTEX_CMP_REQUEUE_PI
1106operation.
1107.\"
1108.\" Darren Hart notes that a patch to allow glibc to fully support
1af427a4 1109.\" PI-aware pthreads condition variables has not yet been accepted into
abb571e8
MK
1110.\" glibc. The story is complex, and can be found at
1111.\" https://sourceware.org/bugzilla/show_bug.cgi?id=11588
1112.\" Darren notes that in the meantime, the patch is shipped with various
1af427a4 1113.\" PREEMPT_RT-enabled Linux systems.
abb571e8
MK
1114.\"
1115.\" Related to the preceding, Darren proposed that somewhere, man-pages
1116.\" should document the following point:
1af427a4 1117.\"
4c8cb0ff
MK
1118.\" While the Linux kernel, since 2.6.31, supports requeueing of
1119.\" priority-inheritance (PI) aware mutexes via the
1120.\" FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI futex operations,
1121.\" the glibc implementation does not yet take full advantage of this.
1122.\" Specifically, the condvar internal data lock remains a non-PI aware
1123.\" mutex, regardless of the type of the pthread_mutex associated with
1124.\" the condvar. This can lead to an unbounded priority inversion on
1125.\" the internal data lock even when associating a PI aware
1126.\" pthread_mutex with a condvar during a pthread_cond*_wait
1127.\" operation. For this reason, it is not recommended to rely on
1128.\" priority inheritance when using pthread condition variables.
1af427a4
MK
1129.\"
1130.\" The problem is that the obvious location for this text is
1131.\" the pthread_cond*wait(3) man page. However, such a man page
abb571e8 1132.\" does not currently exist.
70b06b90 1133.\"
6700de24 1134.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
70b06b90 1135.\"
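.PP
Because the
.I timeout
for FUTEX_LOCK_PI is an absolute CLOCK_REALTIME value rather than an interval,
a caller that thinks in terms of "wait at most N milliseconds" must first compute a deadline.
The following sketch shows one way to do that.
The names deadline_after() and pi_lock_timed() are hypothetical,
the futex word is declared as uint32_t,
and the millisecond count is assumed to be nonnegative.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Add 'ms' milliseconds (assumed >= 0) to 'now', keeping tv_nsec
   within the range 0..999,999,999 */
struct timespec
deadline_after(struct timespec now, long ms)
{
    struct timespec ts = now;

    ts.tv_sec += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Hypothetical helper: block in FUTEX_LOCK_PI for at most 'ms'
   milliseconds. Returns the raw system call result. */
long
pi_lock_timed(uint32_t *word, long ms)
{
    struct timespec now, deadline;

    if (clock_gettime(CLOCK_REALTIME, &now) == -1)
        return -1;

    deadline = deadline_after(now, ms);

    /* val, uaddr2, and val3 are ignored by FUTEX_LOCK_PI */
    return syscall(SYS_futex, word, FUTEX_LOCK_PI, 0,
                   &deadline, NULL, 0);
}
```

Normalizing tv_nsec matters here:
a value of 1,000,000,000 or more would be rejected with EINVAL rather than treated as a later deadline.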
47297adb 1136.SH RETURN VALUE
fea681da 1137.PP
a5c5a06a
MK
1138In the event of an error (and assuming that
1139.BR futex ()
1140was invoked via
1141.BR syscall (2)),
1142all operations return \-1 and set
e808bba0 1143.I errno
6f147f79 1144to indicate the cause of the error.
e808bba0
MK
1145The return value on success depends on the operation,
1146as described in the following list:
fea681da
MK
1147.TP
1148.B FUTEX_WAIT
077981d4 1149Returns 0 if the caller was woken up.
4c8cb0ff
MK
1150Note that a wake-up can also be caused by common futex usage patterns
1151in unrelated code that happened to have previously used the futex word's
1152memory location (e.g., typical futex-based implementations of
1153Pthreads mutexes can cause this under some conditions).
1154Therefore, callers should always conservatively assume that a return
1155value of 0 can mean a spurious wake-up, and use the futex word's value
1156(i.e., the user space synchronization scheme)
1157to decide whether to continue to block or not.
fea681da
MK
1158.TP
1159.B FUTEX_WAKE
bdc5957a 1160Returns the number of waiters that were woken up.
fea681da
MK
1161.TP
1162.B FUTEX_FD
1163Returns the new file descriptor associated with the futex.
1164.TP
1165.B FUTEX_REQUEUE
bdc5957a 1166Returns the number of waiters that were woken up.
fea681da
MK
1167.TP
1168.B FUTEX_CMP_REQUEUE
bdc5957a 1169Returns the total number of waiters that were woken up or
4b35dc5d 1170requeued to the futex for the futex word at
3dfcc11d
MK
1171.IR uaddr2 .
1172If this value is greater than
1173.IR val ,
4c8cb0ff
MK
1174then the difference is the number of waiters requeued to the futex for the
1175futex word at
3dfcc11d 1176.IR uaddr2 .
dcad19c0
MK
1177.TP
1178.B FUTEX_WAKE_OP
a8b5b324 1179Returns the total number of waiters that were woken up.
4c8cb0ff
MK
1180This is the sum of the woken waiters on the two futexes for
1181the futex words at
a8b5b324
MK
1182.I uaddr
1183and
1184.IR uaddr2 .
dcad19c0
MK
1185.TP
1186.B FUTEX_WAIT_BITSET
077981d4
MK
1187Returns 0 if the caller was woken up.
1188See
4b35dc5d
TR
1189.B FUTEX_WAIT
1190for how to interpret this correctly in practice.
dcad19c0
MK
1191.TP
1192.B FUTEX_WAKE_BITSET
bdc5957a 1193Returns the number of waiters that were woken up.
dcad19c0
MK
1194.TP
1195.B FUTEX_LOCK_PI
bf02a260 1196Returns 0 if the futex was successfully locked.
dcad19c0
MK
1197.TP
1198.B FUTEX_TRYLOCK_PI
5c716eef 1199Returns 0 if the futex was successfully locked.
dcad19c0
MK
1200.TP
1201.B FUTEX_UNLOCK_PI
52bb928f 1202Returns 0 if the futex was successfully unlocked.
dcad19c0
MK
1203.TP
1204.B FUTEX_CMP_REQUEUE_PI
bdc5957a 1205Returns the total number of waiters that were woken up or
4b35dc5d 1206requeued to the futex for the futex word at
dddd395a
MK
1207.IR uaddr2 .
1208If this value is greater than
1209.IR val ,
4c8cb0ff
MK
1210then the difference is the number of waiters requeued to the futex for
1211the futex word at
dddd395a 1212.IR uaddr2 .
dcad19c0
MK
1213.TP
1214.B FUTEX_WAIT_REQUEUE_PI
4c8cb0ff
MK
1215Returns 0 if the caller was successfully requeued to the futex for
1216the futex word at
22c15de9 1217.IR uaddr2 .
70b06b90
MK
1218.\"
1219.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1220.\"
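.PP
Since a return value of 0 from FUTEX_WAIT may be a spurious wake-up,
callers normally wrap the call in a loop that rechecks the futex word.
The following is a minimal sketch of such a loop.
The name wait_while_equal() is hypothetical,
the futex word is declared as uint32_t,
and the GCC __atomic_load_n() built-in is used to read it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Block until *word differs from 'unwanted'.
   Returns 0 on success, or -1 with errno set on an unexpected error. */
int
wait_while_equal(uint32_t *word, uint32_t unwanted)
{
    while (__atomic_load_n(word, __ATOMIC_ACQUIRE) == unwanted) {
        long s = syscall(SYS_futex, word, FUTEX_WAIT, unwanted,
                         NULL, NULL, 0);

        /* 0: woken up, possibly spuriously.
           EAGAIN: the value had already changed.
           EINTR: interrupted by a signal.
           In all three cases the loop condition decides what to do. */
        if (s == -1 && errno != EAGAIN && errno != EINTR)
            return -1;
    }
    return 0;
}
```

The decision to keep blocking is taken from the futex word alone;
the return value of the system call is used only to detect unexpected errors.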
fea681da
MK
1221.SH ERRORS
1222.TP
1223.B EACCES
4b35dc5d 1224No read access to the memory of a futex word.
fea681da
MK
1225.TP
1226.B EAGAIN
f48516d1 1227.RB ( FUTEX_WAIT ,
4b35dc5d 1228.BR FUTEX_WAIT_BITSET ,
f48516d1 1229.BR FUTEX_WAIT_REQUEUE_PI )
badbf70c
MK
1230The value pointed to by
1231.I uaddr
1232was not equal to the expected value
1233.I val
1234at the time of the call.
9732dd8b
MK
1235
1236.BR Note :
1237on Linux, the symbolic names
1238.B EAGAIN
1239and
1240.B EWOULDBLOCK
77da5feb 1241(both of which appear in different parts of the kernel futex code)
9732dd8b 1242have the same value.
badbf70c
MK
1243.TP
1244.B EAGAIN
8f2068bb
MK
1245.RB ( FUTEX_CMP_REQUEUE ,
1246.BR FUTEX_CMP_REQUEUE_PI )
ce5602fd 1247The value pointed to by
9f6c40c0
MK
1248.I uaddr
1249is not equal to the expected value
1250.IR val3 .
fd1dc4c2 1251.\" FIXME: Is the following sentence correct?
d6bb5a38 1252.\" [I would prefer to remove this sentence. --triegel@redhat.com]
fea681da 1253(This probably indicates a race;
682edefb
MK
1254use the safe
1255.B FUTEX_WAKE
1256now.)
c0091dd3 1257.\"
f1d2171d 1258.\" FIXME XXX Should there be an EAGAIN case for FUTEX_TRYLOCK_PI?
c0091dd3
MK
1259.\" It seems so, looking at the handling of the rt_mutex_trylock()
1260.\" call in futex_lock_pi()
9732dd8b 1261.\" (Davidlohr also thinks so.)
c0091dd3 1262.\"
fea681da 1263.TP
5662f56a
MK
1264.BR EAGAIN
1265.RB ( FUTEX_LOCK_PI ,
aaec9032
MK
1266.BR FUTEX_TRYLOCK_PI ,
1267.BR FUTEX_CMP_REQUEUE_PI )
1268The futex owner thread ID of
1269.I uaddr
1270(for
1271.BR FUTEX_CMP_REQUEUE_PI :
1272.IR uaddr2 )
1273is about to exit,
5662f56a
MK
1274but has not yet handled the internal state cleanup.
1275Try again.
1276.TP
7a39e745
MK
1277.BR EDEADLK
1278.RB ( FUTEX_LOCK_PI ,
9732dd8b
MK
1279.BR FUTEX_TRYLOCK_PI ,
1280.BR FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1281The futex word at
7a39e745
MK
1282.I uaddr
1283is already locked by the caller.
1284.TP
662c0da8 1285.BR EDEADLK
4c8cb0ff 1286.\" FIXME XXX I see that kernel/locking/rtmutex.c uses EDEADLK in some
d6bb5a38 1287.\" places, and EDEADLOCK in others. On almost all architectures
4c8cb0ff
MK
1288.\" these constants are synonymous. Is there a reason that both
1289.\" names are used?
d6bb5a38 1290.\" FIXME I reworded tglx's text somewhat; is the following okay?
662c0da8 1291.RB ( FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1292While requeueing a waiter to the PI futex for the futex word at
662c0da8
MK
1293.IR uaddr2 ,
1294the kernel detected a deadlock.
1295.TP
fea681da 1296.B EFAULT
1ea901e8
MK
1297A required pointer argument (i.e.,
1298.IR uaddr ,
1299.IR uaddr2 ,
1300or
1301.IR timeout )
496df304 1302did not point to a valid user-space address.
fea681da 1303.TP
9f6c40c0 1304.B EINTR
e808bba0 1305A
9f6c40c0 1306.B FUTEX_WAIT
2674f781
MK
1307or
1308.B FUTEX_WAIT_BITSET
e808bba0 1309operation was interrupted by a signal (see
f529fd20
MK
1310.BR signal (7)).
1311In kernels before Linux 2.6.22, this error could also be returned
1312on a spurious wakeup; since Linux 2.6.22, this no longer happens.
9f6c40c0 1313.TP
fea681da 1314.B EINVAL
180f97b7
MK
1315The operation in
1316.IR futex_op
1317is one of those that employs a timeout, but the supplied
fb2f4c27
MK
1318.I timeout
1319argument was invalid
1320.RI ( tv_sec
1321was less than zero, or
1322.IR tv_nsec
cabee29d 1323was not less than 1,000,000,000).
fb2f4c27
MK
1324.TP
1325.B EINVAL
0c74df0b 1326The operation specified in
025e1374 1327.IR futex_op
0c74df0b 1328employs one or both of the pointers
51ee94be 1329.I uaddr
a1f47699 1330and
0c74df0b
MK
1331.IR uaddr2 ,
1332but one of these does not point to a valid object\(emthat is,
1333the address is not four-byte-aligned.
51ee94be
MK
1334.TP
1335.B EINVAL
55cc422d
TG
1336.RB ( FUTEX_WAIT_BITSET ,
1337.BR FUTEX_WAKE_BITSET )
79c9b436
TG
1338The bitset supplied in
1339.IR val3
1340is zero.
1341.TP
1342.B EINVAL
2abcba67 1343.RB ( FUTEX_CMP_REQUEUE_PI )
add875c0
MK
1344.I uaddr
1345equals
1346.IR uaddr2
1347(i.e., an attempt was made to requeue to the same futex).
1348.TP
ff597681
MK
1349.BR EINVAL
1350.RB ( FUTEX_FD )
1351The signal number supplied in
1352.I val
1353is invalid.
1354.TP
6bac3b85 1355.B EINVAL
476debd7
MK
1356.RB ( FUTEX_WAKE ,
1357.BR FUTEX_WAKE_OP ,
1358.BR FUTEX_WAKE_BITSET ,
1359.BR FUTEX_REQUEUE ,
1360.BR FUTEX_CMP_REQUEUE )
1361The kernel detected an inconsistency between the user-space state at
1362.I uaddr
1363and the kernel state\(emthat is, it detected a waiter which waits in
1364.BR FUTEX_LOCK_PI
1365on
1366.IR uaddr .
1367.TP
1368.B EINVAL
a218ef20 1369.RB ( FUTEX_LOCK_PI ,
ce022f18
MK
1370.BR FUTEX_TRYLOCK_PI ,
1371.BR FUTEX_UNLOCK_PI )
a218ef20
MK
1372The kernel detected an inconsistency between the user-space state at
1373.I uaddr
1374and the kernel state.
ce022f18 1375This indicates either state corruption
d6bb5a38
MK
1376.\" FIXME tglx did not mention the "state corruption" case for
1377.\" FUTEX_UNLOCK_PI, but I have added it, since I'm estimating
1378.\" that it also applied for FUTEX_UNLOCK_PI.
1379.\" So, does that case also apply for FUTEX_UNLOCK_PI?
ce022f18 1380or that the kernel found a waiter on
a218ef20
MK
1381.I uaddr
1382which is waiting via
1383.BR FUTEX_WAIT
1384or
1385.BR FUTEX_WAIT_BITSET .
1386.TP
1387.B EINVAL
f9250b1a
MK
1388.RB ( FUTEX_CMP_REQUEUE_PI )
1389The kernel detected an inconsistency between the user-space state at
99c0041d
MK
1390.I uaddr2
1391and the kernel state;
1392that is, the kernel detected a waiter which waits via
1393.BR FUTEX_WAIT
1394.\" FIXME tglx did not mention FUTEX_WAIT_BITSET here,
1395.\" but should that not also be included here?
1396on
1397.IR uaddr2 .
1398.TP
1399.B EINVAL
1400.RB ( FUTEX_CMP_REQUEUE_PI )
1401The kernel detected an inconsistency between the user-space state at
f9250b1a
MK
1402.I uaddr
1403and the kernel state;
1404that is, the kernel detected a waiter which waits via
75299c8d 1405.BR FUTEX_WAIT
99c0041d 1406or
75299c8d 1407.BR FUTEX_WAIT_BITSET
f9250b1a
MK
1408on
1409.IR uaddr .
1410.TP
1411.B EINVAL
99c0041d 1412.RB ( FUTEX_CMP_REQUEUE_PI )
75299c8d
MK
1413The kernel detected an inconsistency between the user-space state at
1414.I uaddr
1415and the kernel state;
1416that is, the kernel detected a waiter which waits on
1417.I uaddr
1418via
1419.BR FUTEX_LOCK_PI
1420(instead of
1421.BR FUTEX_WAIT_REQUEUE_PI ).
99c0041d
MK
1422.TP
1423.B EINVAL
9786b3ca 1424.RB ( FUTEX_CMP_REQUEUE_PI )
f1d2171d 1425.\" FIXME XXX The following is a reworded version of Darren Hart's text.
9786b3ca
MK
1426.\" Please check that I did not introduce any errors.
1427An attempt was made to requeue a waiter to a futex other than that
1428specified by the matching
1429.B FUTEX_WAIT_REQUEUE_PI
1430call for that waiter.
1431.TP
1432.B EINVAL
f0c0d61c
MK
1433.RB ( FUTEX_CMP_REQUEUE_PI )
1434The
1435.I val
1436argument is not 1.
1437.TP
1438.B EINVAL
4832b48a 1439Invalid argument.
fea681da 1440.TP
a449c634
MK
1441.BR ENOMEM
1442.RB ( FUTEX_LOCK_PI ,
e34a8fb6
MK
1443.BR FUTEX_TRYLOCK_PI ,
1444.BR FUTEX_CMP_REQUEUE_PI )
a449c634
MK
1445The kernel could not allocate memory to hold state information.
1446.TP
fea681da 1447.B ENFILE
ff597681 1448.RB ( FUTEX_FD )
fea681da 1449The system limit on the total number of open files has been reached.
4701fc28
MK
1450.TP
1451.B ENOSYS
1452Invalid operation specified in
d33602c4 1453.IR futex_op .
9f6c40c0 1454.TP
4a7e5b05
MK
1455.B ENOSYS
1456The
1457.BR FUTEX_CLOCK_REALTIME
1458option was specified in
1afcee7c 1459.IR futex_op ,
4a7e5b05
MK
1460but the accompanying operation was neither
1461.BR FUTEX_WAIT_BITSET
1462nor
1463.BR FUTEX_WAIT_REQUEUE_PI .
1464.TP
a9dcb4d1
MK
1465.BR ENOSYS
1466.RB ( FUTEX_LOCK_PI ,
f2424fae 1467.BR FUTEX_TRYLOCK_PI ,
4945ff19 1468.BR FUTEX_UNLOCK_PI ,
4cf92894 1469.BR FUTEX_CMP_REQUEUE_PI ,
794bb106 1470.BR FUTEX_WAIT_REQUEUE_PI )
4b35dc5d 1471A run-time check determined that the operation is not available.
a2ebebcd 1472The PI futex operations are not implemented on all architectures and
077981d4 1473are not supported on some CPU variants.
a9dcb4d1 1474.TP
c7589177
MK
1475.BR EPERM
1476.RB ( FUTEX_LOCK_PI ,
dc2742a8
MK
1477.BR FUTEX_TRYLOCK_PI ,
1478.BR FUTEX_CMP_REQUEUE_PI )
04331c3f 1479The caller is not allowed to attach itself to the futex at
dc2742a8
MK
1480.I uaddr
1481(for
1482.BR FUTEX_CMP_REQUEUE_PI :
1483the futex at
1484.IR uaddr2 ).
c7589177
MK
1485(This may be caused by a state corruption in user space.)
1486.TP
76f347ba 1487.BR EPERM
87276709 1488.RB ( FUTEX_UNLOCK_PI )
4b35dc5d 1489The caller does not own the lock represented by the futex word.
76f347ba 1490.TP
0b0e4934
MK
1491.BR ESRCH
1492.RB ( FUTEX_LOCK_PI ,
9732dd8b
MK
1493.BR FUTEX_TRYLOCK_PI ,
1494.BR FUTEX_CMP_REQUEUE_PI )
0b0e4934
MK
1495.\" FIXME I reworded the following sentence a bit differently from
1496.\" tglx's formulation. Is it okay?
4b35dc5d 1497The thread ID in the futex word at
0b0e4934
MK
1498.I uaddr
1499does not exist.
1500.TP
360f773c
MK
1501.BR ESRCH
1502.RB ( FUTEX_CMP_REQUEUE_PI )
1503.\" FIXME I reworded the following sentence a bit differently from
1504.\" tglx's formulation. Is it okay?
4b35dc5d 1505The thread ID in the futex word at
360f773c
MK
1506.I uaddr2
1507does not exist.
1508.TP
9f6c40c0 1509.B ETIMEDOUT
4d85047f
MK
1510The operation in
1511.IR futex_op
1512employed the timeout specified in
1513.IR timeout ,
1514and the timeout expired before the operation completed.
70b06b90
MK
1515.\"
1516.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1517.\"
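.PP
Two of the errors listed above, EAGAIN and ETIMEDOUT, are part of normal operation rather than failures.
The sketch below makes that concrete for a FUTEX_WAIT call with a timeout,
which for this operation is a relative interval.
The name futex_wait_ms() is hypothetical,
the futex word is declared as uint32_t,
and the millisecond count is assumed to be nonnegative.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Wait on 'word' for at most 'ms' milliseconds while it holds 'expected'.
   Returns 0 if the caller was woken up, otherwise the errno value
   that explains why the wait ended. */
int
futex_wait_ms(uint32_t *word, uint32_t expected, long ms)
{
    struct timespec rel = {
        .tv_sec = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L
    };

    if (syscall(SYS_futex, word, FUTEX_WAIT, expected,
                &rel, NULL, 0) == -1)
        return errno;   /* For example: EAGAIN, ETIMEDOUT, EINTR */

    return 0;
}
```

A caller would typically treat EAGAIN and EINTR as a reason to recheck the futex word,
ETIMEDOUT as the deadline having passed,
and anything else as a programming error.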
47297adb 1518.SH VERSIONS
a1d5f77c 1519.PP
81c9d87e
MK
1520Futexes were first made available in a stable kernel release
1521with Linux 2.6.0.
1522
4c8cb0ff
MK
1523Initial futex support was merged in Linux 2.5.7 but with different
1524semantics from what was described above.
52dee70e 1525A four-argument system call with the semantics
fd3fa7ef 1526described in this page was introduced in Linux 2.5.40.
11b520ed 1527In Linux 2.5.70, one argument
a1d5f77c 1528was added.
11b520ed 1529In Linux 2.6.7, a sixth argument was added\(emmessy, especially
a1d5f77c 1530on the s390 architecture.
47297adb 1531.SH CONFORMING TO
8382f16d 1532This system call is Linux-specific.
47297adb 1533.SH NOTES
baf0f1f4
MK
1534Glibc does not provide a wrapper for this system call; call it using
1535.BR syscall (2).
cf44281c 1536
02f7b623
MK
1537Several higher-level programming abstractions are implemented via futexes,
1538including POSIX semaphores and
1539various POSIX threads synchronization mechanisms
1540(mutexes, condition variables, read-write locks, and barriers).
74f58a64
MK
1541.\" TODO FIXME(Torvald) Above, we cite this section and claim it contains
1542.\" details on the synchronization semantics; add the C11 equivalents
1543.\" here (or whatever we find consensus for).
305cc415
MK
1544.\"
1545.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1546.\"
1547.SH EXAMPLE
1548.\" FIXME Is it worth having an example program?
1549.\" FIXME Anything obviously broken in the example program?
1550.\"
77da5feb 1551The program below demonstrates use of futexes in a program
305cc415
MK
1552where parent and child use a pair of futexes located inside a
1553shared anonymous mapping to synchronize access to a shared resource:
1554the terminal.
1555The two processes each write
1556.IR nloops
1557(a command-line argument that defaults to 5 if omitted)
1558messages to the terminal and employ a synchronization protocol
1559that ensures that they alternate in writing messages.
1560Upon running this program we see output such as the following:
1561
1562.in +4n
1563.nf
1564$ \fB./futex_demo\fP
1565Parent (18534) 0
1566Child (18535) 0
1567Parent (18534) 1
1568Child (18535) 1
1569Parent (18534) 2
1570Child (18535) 2
1571Parent (18534) 3
1572Child (18535) 3
1573Parent (18534) 4
1574Child (18535) 4
1575.fi
1576.in
1577.SS Program source
1578\&
1579.nf
1580/* futex_demo.c
1581
1582 Usage: futex_demo [nloops]
1583 (Default: 5)
1584
1585 Demonstrate the use of futexes in a program where parent and child
1586 use a pair of futexes located inside a shared anonymous mapping to
1587 synchronize access to a shared resource: the terminal. The two
1588 processes each write \(aqnum\-loops\(aq messages to the terminal and employ
1589 a synchronization protocol that ensures that they alternate in
1590 writing messages.
1591*/
1592#define _GNU_SOURCE
1593#include <stdio.h>
1594#include <errno.h>
1595#include <stdlib.h>
1596#include <unistd.h>
1597#include <sys/wait.h>
1598#include <sys/mman.h>
1599#include <sys/syscall.h>
1600#include <linux/futex.h>
1601#include <sys/time.h>
1602
1603#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\
1604 } while (0)
1605
1606static int *futex1, *futex2, *iaddr;
1607
1608static int
1609futex(int *uaddr, int futex_op, int val,
1610 const struct timespec *timeout, int *uaddr2, int val3)
1611{
1612 return syscall(SYS_futex, uaddr, futex_op, val,
1613 timeout, uaddr2, val3);
1614}
1615
1616/* Acquire the futex pointed to by \(aqfutexp\(aq: wait for its value to
1617 become 1, and then set the value to 0. */
1618
1619static void
1620fwait(int *futexp)
1621{
1622 int s;
1623
1624 /* __sync_bool_compare_and_swap(ptr, oldval, newval) is a gcc
1625 built\-in function. It atomically performs the equivalent of:
1626
1627 if (*ptr == oldval)
1628 *ptr = newval;
1629
1630 It returns true if the test yielded true and *ptr was updated.
1631 The alternative here would be to employ the equivalent atomic
1632 machine\-language instructions. For further information, see
1633 the GCC Manual. */
1634
305cc415 1635 while (1) {
83e80dda 1636
63ad44cb 1637 /* Is the futex available? */
83e80dda 1638
305cc415
MK
1639 if (__sync_bool_compare_and_swap(futexp, 1, 0))
1640 break; /* Yes */
1641
63ad44cb 1642 /* Futex is not available; wait */
83e80dda 1643
63ad44cb
HS
1644 s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
1645 if (s == \-1 && errno != EAGAIN)
1646 errExit("futex\-FUTEX_WAIT");
305cc415
MK
1647 }
1648}
1649
1650/* Release the futex pointed to by \(aqfutexp\(aq: if the futex currently
1651 has the value 0, set its value to 1 and then wake any futex waiters,
1652 so that if the peer is blocked in fwait(), it can proceed. */
1653
1654static void
1655fpost(int *futexp)
1656{
1657 int s;
1658
1659 /* __sync_bool_compare_and_swap() was described in comments above */
1660
1661 if (__sync_bool_compare_and_swap(futexp, 0, 1)) {
1662
1663 s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
1664 if (s == \-1)
1665 errExit("futex\-FUTEX_WAKE");
1666 }
1667}
1668
1669int
1670main(int argc, char *argv[])
1671{
1672 pid_t childPid;
1673 int j, nloops;
1674
1675 setbuf(stdout, NULL);
1676
1677 nloops = (argc > 1) ? atoi(argv[1]) : 5;
1678
1679 /* Create a shared anonymous mapping that will hold the futexes.
1680 Since the futexes are being shared between processes, we
1681 subsequently use the "shared" futex operations (i.e., not the
1682 ones suffixed "_PRIVATE") */
1683
1684 iaddr = mmap(NULL, sizeof(int) * 2, PROT_READ | PROT_WRITE,
1685 MAP_ANONYMOUS | MAP_SHARED, \-1, 0);
1686 if (iaddr == MAP_FAILED)
1687 errExit("mmap");
1688
1689 futex1 = &iaddr[0];
1690 futex2 = &iaddr[1];
1691
1692 *futex1 = 0; /* State: unavailable */
1693 *futex2 = 1; /* State: available */
1694
1695 /* Create a child process that inherits the shared anonymous
35764662 1696 mapping */
305cc415
MK
1697
1698 childPid = fork();
92a46690 1699 if (childPid == \-1)
305cc415
MK
1700 errExit("fork");
1701
1702 if (childPid == 0) { /* Child */
1703 for (j = 0; j < nloops; j++) {
1704 fwait(futex1);
1705 printf("Child (%ld) %d\\n", (long) getpid(), j);
1706 fpost(futex2);
1707 }
1708
1709 exit(EXIT_SUCCESS);
1710 }
1711
1712 /* Parent falls through to here */
1713
1714 for (j = 0; j < nloops; j++) {
1715 fwait(futex2);
1716 printf("Parent (%ld) %d\\n", (long) getpid(), j);
1717 fpost(futex1);
1718 }
1719
1720 wait(NULL);
1721
1722 exit(EXIT_SUCCESS);
1723}
1724.fi
47297adb 1725.SH SEE ALSO
4c222281 1726.ad l
9913033c 1727.BR get_robust_list (2),
d806bc05 1728.BR restart_syscall (2),
e0074751 1729.BR pthread_mutexattr_getprotocol (3),
14d8dd3b 1730.BR futex (7)
fea681da 1731.PP
f5ad572f
MK
1732The following kernel source files:
1733.IP * 2
1734.I Documentation/pi-futex.txt
1735.IP *
1736.I Documentation/futex-requeue-pi.txt
1737.IP *
1738.I Documentation/locking/rt-mutex.txt
1739.IP *
1740.I Documentation/locking/rt-mutex-design.txt
8fe019c7
MK
1741.IP *
1742.I Documentation/robust-futex-ABI.txt
43b99089 1743.PP
4c222281 1744Franke, H., Russell, R., and Kirkwood, M., 2002.
52087dd3 1745\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
4c222281 1746(from proceedings of the Ottawa Linux Symposium 2002),
9b936e9e 1747.br
608bf950
SK
1748.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
1749.UE
f42eb21b 1750
4c222281 1751Hart, D., 2009. \fIA futex overview and update\fP,
2ed26199
MK
1752.UR http://lwn.net/Articles/360699/
1753.UE
1754
4c222281 1755Hart, D. and Guniguntala, D., 2009.
0483b6cc 1756\fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
4c222281 1757(from proceedings of the 2009 Real-Time Linux Workshop),
0483b6cc
MK
1758.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
1759.UE
1760
4c222281 1761Drepper, U., 2011. \fIFutexes Are Tricky\fP,
f42eb21b
MK
1762.UR http://www.akkadia.org/drepper/futex.pdf
1763.UE
9b936e9e
MK
1764.PP
1765Futex example library, futex-*.tar.bz2 at
1766.br
a605264d 1767.UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
608bf950 1768.UE
34f14794
MK
1769.\"
1770.\" FIXME Are there any other resources that should be listed
1771.\" in the SEE ALSO section?
74f58a64 1772.\" FIXME(Torvald) We should probably refer to the glibc code here, in
4c8cb0ff
MK
1773.\" particular the glibc-internal futex wrapper functions that are
1774.\" WIP, and the generic pthread_mutex_t and perhaps condvar
1775.\" implementations.