]> git.ipfire.org Git - thirdparty/man-pages.git/blame - man2/futex.2
futex.2: ffix
[thirdparty/man-pages.git] / man2 / futex.2
CommitLineData
8f0aff2a 1.\" Page by b.hubert
1abce893
MK
2.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
3.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
2297bf0e 4.\"
2e46a6e7 5.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
8f0aff2a 6.\" may be freely modified and distributed
8ff7380d 7.\" %%%LICENSE_END
fea681da
MK
8.\"
9.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
10.\" added ERRORS section.
11.\"
12.\" Modified 2004-06-17 mtk
13.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
14.\"
47f5c4ba
MK
15.\" FIXME Still to integrate are some points from Torvald Riegel's mail of
16.\" 2015-01-23:
17.\" http://thread.gmane.org/gmane.linux.kernel/1703405/focus=7977
18.\"
78e85692 19.\" FIXME Do we need to add some text regarding Torvald Riegel's 2015-01-24 mail
02182e7c
MK
20.\" at http://thread.gmane.org/gmane.linux.kernel/1703405/focus=1873242
21.\"
3d155313 22.TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
fea681da 23.SH NAME
ce154705 24futex \- fast user-space locking
fea681da 25.SH SYNOPSIS
9d9dc1e8 26.nf
fea681da
MK
27.sp
28.B "#include <linux/futex.h>"
fea681da
MK
29.B "#include <sys/time.h>"
30.sp
d33602c4 31.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
768d3c23 32.BI " const struct timespec *" timeout , \
c6dc40a2 33" \fR /* or: \fBuint32_t \fIval2\fP */
9d9dc1e8 34.BI " int *" uaddr2 ", int " val3 );
9d9dc1e8 35.fi
409f08b0 36
b939d6e4
MK
37.IR Note :
38There is no glibc wrapper for this system call; see NOTES.
47297adb 39.SH DESCRIPTION
fea681da
MK
40.PP
41The
e511ffb6 42.BR futex ()
4b35dc5d 43system call provides a method for waiting until a certain condition becomes
077981d4
MK
44true.
45It is typically used as a blocking construct in the context of
d45f244c
MK
46shared-memory synchronization.
47When using futexes, the majority of
48the synchronization operations are performed in user space.
49The user-space program employs
50.BR futex ()
51operations only when it is likely that the program has to block for
4c8cb0ff 52a longer time until the condition becomes true.
d45f244c
MK
53The program uses another
54.BR futex ()
55operation to wake anyone waiting for a particular condition.
4b35dc5d 56
7e8dcabc
MK
57A futex is a 32-bit value\(emreferred to below as a
58.IR "futex word" \(emwhose
59address is supplied to the
4b35dc5d 60.BR futex ()
7e8dcabc
MK
61system call.
62(Futexes are 32-bits in size on all platforms, including 64-bit systems.)
63All futex operations are governed by this value.
64In order to share a futex between processes,
65the futex is placed in a region of shared memory,
66created using (for example)
67.BR mmap (2)
68or
69.BR shmat (2).
70(Thus the futex word may have different
71virtual addresses in different processes,
72but these addresses all refer to the same location in physical memory.)
809ca3ae 73
0c3ec26b
MK
74When executing a futex operation that requests to block a thread,
75the kernel will block only if the futex word has the value that the
55f9e85e
MK
76calling thread supplied (as one of the arguments of the
77.BR futex ()
78call) as the expected value of the futex word.
9d32a39b
MK
79The loading of the futex word's value,
80the comparison of that value with the expected value,
077981d4 81and the actual blocking will happen atomically and totally
0c3ec26b 82ordered with respect to concurrently executing futex
b80daba2
HS
83operations on the same futex word.
84Thus, the futex word is used to connect the synchronization in user space
9d32a39b 85with the implementation of blocking by the kernel.
55f9e85e 86Analogously to an atomic
4b35dc5d 87compare-and-exchange operation that potentially changes shared memory,
077981d4 88blocking via a futex is an atomic compare-and-block operation.
d6bb5a38
MK
89.\" FIXME(Torvald Riegel):
90.\" Eventually we want to have some text in NOTES to satisfy
91.\" the reference in the following sentence
92.\" See NOTES for
93.\" a detailed specification of the synchronization semantics.
4b35dc5d 94
55f9e85e 95One example use of futexes is for implementing locks.
c0dc758e
MK
96The state of the lock (i.e., acquired or not acquired)
97can be represented as an atomically accessed flag in shared memory.
4c8cb0ff 98In the uncontended case,
55f9e85e 99a thread can access or modify the lock state with atomic instructions
4c8cb0ff
MK
100for example atomically changing it from not acquired to acquired
101using an atomic compare-and-exchange instruction.
55f9e85e
MK
102(Such instructions are performed entirely in user mode,
103and the kernel maintains no information about the lock state.)
104On the other hand, a thread may be unable to acquire a lock because
8e754e12 105it is already acquired by another thread.
55f9e85e 106It then may pass the lock's flag as a futex word and the value
0c3ec26b 107representing the acquired state as the expected value to a
8e754e12
HS
108.BR futex ()
109wait operation.
55f9e85e 110This
8e754e12 111.BR futex ()
55f9e85e 112call will block if and only if the lock is still acquired.
077981d4 113When releasing the lock, a thread has to first reset the
0c3ec26b 114lock state to not acquired and then execute a futex
55f9e85e 115operation that wakes threads blocked on the lock flag used as a futex word
4c8cb0ff 116(this can be be further optimized to avoid unnecessary wake-ups).
077981d4 117See
4b35dc5d
TR
118.BR futex (7)
119for more detail on how to use futexes.
120
121Besides the basic wait and wake-up futex functionality, there are further
077981d4
MK
122futex operations aimed at supporting more complex use cases.
123Also note that
4c8cb0ff
MK
124no explicit initialization or destruction are necessary to use futexes;
125the kernel maintains a futex
126(i.e., the kernel-internal implementation artifact)
4b35dc5d
TR
127only while operations such as
128.BR FUTEX_WAIT ,
129described below, are being performed on a particular futex word.
a663ca5a
MK
130.\"
131.SS Arguments
fea681da
MK
132The
133.I uaddr
077981d4
MK
134argument points to the futex word.
135On all platforms, futexes are four-byte
4b35dc5d 136integers that must be aligned on a four-byte boundary.
f388ba70
MK
137The operation to perform on the futex is specified in the
138.I futex_op
139argument;
140.IR val
141is a value whose meaning and purpose depends on
142.IR futex_op .
36ab2074
MK
143
144The remaining arguments
145.RI ( timeout ,
146.IR uaddr2 ,
147and
148.IR val3 )
149are required only for certain of the futex operations described below.
150Where one of these arguments is not required, it is ignored.
768d3c23 151
36ab2074
MK
152For several blocking operations, the
153.I timeout
154argument is a pointer to a
155.IR timespec
156structure that specifies a timeout for the operation.
157However, notwithstanding the prototype shown above, for some operations,
10022b8e 158the least significant four bytes are used as an integer whose meaning
36ab2074 159is determined by the operation.
768d3c23
MK
160For these operations, the kernel casts the
161.I timeout
10022b8e
HS
162value first to
163.IR "unsigned long",
164then to
c6dc40a2 165.IR uint32_t ,
768d3c23
MK
166and in the remainder of this page, this argument is referred to as
167.I val2
168when interpreted in this fashion.
169
de5a3bb4 170Where it is required, the
36ab2074 171.IR uaddr2
4c8cb0ff
MK
172argument is a pointer to a second futex word that is employed
173by the operation.
36ab2074
MK
174The interpretation of the final integer argument,
175.IR val3 ,
176depends on the operation.
a663ca5a
MK
177.\"
178.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
179.\"
180.SS Futex operations
6be4bad7 181The
d33602c4 182.I futex_op
6be4bad7
MK
183argument consists of two parts:
184a command that specifies the operation to be performed,
185bit-wise ORed with zero or or more options that
186modify the behaviour of the operation.
fc30eb79 187The options that may be included in
d33602c4 188.I futex_op
fc30eb79
TG
189are as follows:
190.TP
191.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
192.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
193This option bit can be employed with all futex operations.
e45f9735 194It tells the kernel that the futex is process-private and not shared
0c3ec26b
MK
195with another process (i.e., it is being used for synchronization
196only between threads of the same process).
943ccc52
MK
197This allows the kernel to make some additional performance optimizations.
198.\" I.e., It allows the kernel choose the fast path for validating
199.\" the user-space address and avoids expensive VMA lookups,
200.\" taking reference counts on file backing store, and so on.
ae2c1774
MK
201
202As a convenience,
203.IR <linux/futex.h>
204defines a set of constants with the suffix
205.BR _PRIVATE
206that are equivalents of all of the operations listed below,
dcdfde26 207.\" except the obsolete FUTEX_FD, for which the "private" flag was
ae2c1774
MK
208.\" meaningless
209but with the
210.BR FUTEX_PRIVATE_FLAG
211ORed into the constant value.
212Thus, there are
213.BR FUTEX_WAIT_PRIVATE ,
214.BR FUTEX_WAKE_PRIVATE ,
215and so on.
2e98bbc2
TG
216.TP
217.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
218.\" commit 1acdac104668a0834cfa267de9946fac7764d486
4a7e5b05 219This option bit can be employed only with the
2e98bbc2
TG
220.BR FUTEX_WAIT_BITSET
221and
222.BR FUTEX_WAIT_REQUEUE_PI
c84cf68c 223operations.
2e98bbc2 224
f2103b26
MK
225If this option is set, the kernel treats
226.I timeout
227as an absolute time based on
2e98bbc2
TG
228.BR CLOCK_REALTIME .
229
f2103b26
MK
230If this option is not set, the kernel treats
231.I timeout
232as relative time,
1c952cf5
MK
233measured against the
234.BR CLOCK_MONOTONIC
235clock.
6be4bad7
MK
236.PP
237The operation specified in
d33602c4 238.I futex_op
6be4bad7 239is one of the following:
70b06b90
MK
240.\"
241.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
242.\"
fea681da 243.TP
81c9d87e
MK
244.BR FUTEX_WAIT " (since Linux 2.6.0)"
245.\" Strictly speaking, since some time in 2.5.x
f065673c 246This operation tests that the value at the
4b35dc5d 247futex word pointed to by the address
fea681da 248.I uaddr
4b35dc5d 249still contains the expected value
fea681da 250.IR val ,
fd105614 251and if so, then sleeps waiting for a
682edefb 252.B FUTEX_WAKE
fd105614 253operation on the futex word.
077981d4 254The load of the value of the futex word is an atomic memory
4b35dc5d 255access (i.e., using atomic machine instructions of the respective
077981d4
MK
256architecture).
257This load, the comparison with the expected value, and
fd105614 258starting to sleep are performed atomically
da56650a
MK
259.\" FIXME: Torvald, I think we may need to add some explanation of
260.\" "totally odered" here.
fd105614
MK
261and totally ordered
262with respect to other futex operations on the same futex word.
c0dc758e
MK
263If the thread starts to sleep,
264it is considered a waiter on this futex word.
f065673c
MK
265If the futex value does not match
266.IR val ,
4710334a 267then the call fails immediately with the error
badbf70c 268.BR EAGAIN .
4b35dc5d
TR
269
270The purpose of the comparison with the expected value is to prevent lost
fd105614
MK
271wake-ups.
272If another thread changed the value of the futex word after the
c0dc758e
MK
273calling thread decided to block based on the prior value,
274and if the other thread executed a
4b35dc5d
TR
275.BR FUTEX_WAKE
276operation (or similar wake-up) after the value change and before this
f065673c 277.BR FUTEX_WAIT
4b35dc5d
TR
278operation, then the latter will observe the value change and will not start
279to sleep.
1909e523 280
c13182ef 281If the
fea681da 282.I timeout
53ba4030 283argument is non-NULL, its contents specify a relative timeout for the wait,
1c952cf5
MK
284measured according to the
285.BR CLOCK_MONOTONIC
286clock.
82a6092b 287(This interval will be rounded up to the system clock granularity,
8297383e 288and is guaranteed not to expire early.)
82a6092b
MK
289If
290.I timeout
291is NULL, the call blocks indefinitely.
4798a7f3 292
c13182ef 293The arguments
fea681da
MK
294.I uaddr2
295and
296.I val3
297are ignored.
298
74f58a64
MK
299.\" FIXME(Torvald) I think we should remove this. Or maybe adapt to a
300.\" different example.
4b35dc5d
TR
301.\" For
302.\" .BR futex (7),
303.\" this call is executed if decrementing the count gave a negative value
304.\" (indicating contention),
305.\" and will sleep until another process or thread releases
306.\" the futex and executes the
307.\" .B FUTEX_WAKE
308.\" operation.
70b06b90
MK
309.\"
310.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
311.\"
fea681da 312.TP
81c9d87e
MK
313.BR FUTEX_WAKE " (since Linux 2.6.0)"
314.\" Strictly speaking, since Linux 2.5.x
f065673c
MK
315This operation wakes at most
316.I val
4b35dc5d 317of the waiters that are waiting (e.g., inside
f065673c 318.BR FUTEX_WAIT )
4b35dc5d 319on the futex word at the address
f065673c
MK
320.IR uaddr .
321Most commonly,
322.I val
323is specified as either 1 (wake up a single waiter) or
324.BR INT_MAX
325(wake up all waiters).
730bfbda
MK
326No guarantee is provided about which waiters are awoken
327(e.g., a waiter with a higher scheduling priority is not guaranteed
328to be awoken in preference to a waiter with a lower priority).
4798a7f3 329
fea681da
MK
330The arguments
331.IR timeout ,
c8b921bd 332.IR uaddr2 ,
fea681da
MK
333and
334.I val3
335are ignored.
336
74f58a64
MK
337.\" FIXME(Torvald) I think we should remove this. Or maybe adapt to
338.\" a different example.
4c8cb0ff
MK
339.\" For
340.\" .BR futex (7),
341.\" this is executed if incrementing the count showed that
342.\" there were waiters,
343.\" once the futex value has been set to 1
344.\" (indicating that it is available).
345.\"
346.\" FIXME How does "incrementing the count show that there were waiters"?
70b06b90
MK
347.\"
348.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
349.\"
a7c2bf45
MK
350.TP
351.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
352.\" Strictly speaking, from Linux 2.5.x to 2.6.25
4c8cb0ff
MK
353This operation creates a file descriptor that is associated with
354the futex at
a7c2bf45 355.IR uaddr .
bdc5957a
MK
356The caller must close the returned file descriptor after use.
357When another process or thread performs a
a7c2bf45 358.BR FUTEX_WAKE
4b35dc5d 359on the futex word, the file descriptor indicates as being readable with
a7c2bf45
MK
360.BR select (2),
361.BR poll (2),
362and
363.BR epoll (7)
364
f1d2171d 365The file descriptor can be used to obtain asynchronous notifications: if
a7c2bf45 366.I val
bdc5957a 367is nonzero, then when another process or thread executes a
a7c2bf45
MK
368.BR FUTEX_WAKE ,
369the caller will receive the signal number that was passed in
370.IR val .
371
372The arguments
373.IR timeout ,
374.I uaddr2
375and
376.I val3
377are ignored.
378
a7c2bf45
MK
379Because it was inherently racy,
380.B FUTEX_FD
381has been removed
382.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
383from Linux 2.6.26 onward.
70b06b90
MK
384.\"
385.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
386.\"
a7c2bf45
MK
387.TP
388.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
a7c2bf45 389This operation performs the same task as
27dd3a6e
MK
390.BR FUTEX_CMP_REQUEUE
391(see below), except that no check is made using the value in
a7c2bf45
MK
392.IR val3 .
393(The argument
394.I val3
395is ignored.)
70b06b90
MK
396.\"
397.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
398.\"
a7c2bf45
MK
399.TP
400.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
4b35dc5d 401This operation first checks whether the location
a7c2bf45
MK
402.I uaddr
403still contains the value
404.IR val3 .
405If not, the operation fails with the error
406.BR EAGAIN .
4b35dc5d 407Otherwise, the operation wakes up a maximum of
a7c2bf45
MK
408.I val
409waiters that are waiting on the futex at
410.IR uaddr .
411If there are more than
412.I val
413waiters, then the remaining waiters are removed
414from the wait queue of the source futex at
415.I uaddr
416and added to the wait queue of the target futex at
417.IR uaddr2 .
418The
768d3c23 419.I val2
936876a9 420argument specifies an upper limit on the number of waiters
a7c2bf45 421that are requeued to the futex at
768d3c23 422.IR uaddr2 .
a7c2bf45 423
d6bb5a38
MK
424.\" FIXME(Torvald) Is the following correct? Or is just the decision
425.\" which threads to wake or requeue part of the atomic operation?
4b35dc5d
TR
426The load from
427.I uaddr
4c8cb0ff
MK
428is an atomic memory access (i.e., using atomic machine instructions of
429the respective architecture).
077981d4 430This load, the comparison with
4b35dc5d 431.IR val3 ,
4c8cb0ff
MK
432and the requeueing of any waiters are performed atomically and totally
433ordered with respect to other operations on the same futex word.
4b35dc5d 434
a7c2bf45
MK
435Typical values to specify for
436.I val
437are 0 or or 1.
438(Specifying
439.BR INT_MAX
440is not useful, because it would make the
441.BR FUTEX_CMP_REQUEUE
442operation equivalent to
443.BR FUTEX_WAKE .)
936876a9 444The limit value specified via
768d3c23
MK
445.I val2
446is typically either 1 or
a7c2bf45
MK
447.BR INT_MAX .
448(Specifying the argument as 0 is not useful, because it would make the
449.BR FUTEX_CMP_REQUEUE
450operation equivalent to
451.BR FUTEX_WAIT .)
627b50ce
MK
452
453The
454.B FUTEX_CMP_REQUEUE
455operation was added as a replacement for the earlier
456.BR FUTEX_REQUEUE .
457The difference is that the check of the value at
458.I uaddr
459can be used to ensure that requeueing happens only under certain
460conditions, which allows race conditions to be avoided in certain use cases.
dcb410c3 461.\" But, as Rich Felker points out, there remain valid use cases for
627b50ce
MK
462.\" FUTEX_REQUEUE, for example, when the calling thread is requeuing
463.\" the target(s) to a lock that the calling thread owns
464.\" From: Rich Felker <dalias@libc.org>
465.\" Date: Wed, 29 Oct 2014 22:43:17 -0400
466.\" To: Darren Hart <dvhart@infradead.org>
467.\" CC: libc-alpha@sourceware.org, ...
468.\" Subject: Re: Add futex wrapper to glibc?
469
470Both
471.BR FUTEX_REQUEUE
472and
473.BR FUTEX_CMP_REQUEUE
474can be used to avoid "thundering herd" wake-ups that could occur when using
475.B FUTEX_WAKE
476in cases where all of the waiters that are woken need to acquire
477another futex.
478Consider the following scenario,
479where multiple waiter threads are waiting on B,
480a wait queue implemented using a futex:
481
482.in +4n
483.nf
484lock(A)
485while (!check_value(V)) {
486 unlock(A);
487 block_on(B);
488 lock(A);
489};
490unlock(A);
491.fi
492.in
493
494If a waker thread used
495.BR FUTEX_WAKE ,
496then all waiters waiting on B would be woken up,
497and they would would all try to acquire lock A.
498However, waking all of the threads in this manner would be pointless because
499all except one of the threads would immediately block on lock A again.
500By contrast, a requeue operation wakes just one waiter and moves
501the other waiters to lock A,
502and when the woken waiter unlocks A then the next waiter can proceed.
43d16602 503.\"
70b06b90
MK
504.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
505.\"
a5956430
MK
506.\" FIXME I added a lengthy piece of text on FUTEX_WAKE_OP text,
507.\" and I'd be happy if someone checked it.
fea681da 508.TP
d67e21f5
MK
509.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
510.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
6bac3b85
MK
511.\" Author: Jakub Jelinek <jakub@redhat.com>
512.\" Date: Tue Sep 6 15:16:25 2005 -0700
4c8cb0ff
MK
513.\" FIXME(Torvald) The glibc condvar implementation is currently being
514.\" revised (e.g., to not use an internal lock anymore).
515.\" It is probably more future-proof to remove this paragraph.
d6bb5a38 516.\" [Torvald, do you have an update here?]
6bac3b85
MK
517This operation was added to support some user-space use cases
518where more than one futex must be handled at the same time.
519The most notable example is the implementation of
520.BR pthread_cond_signal (3),
521which requires operations on two futexes,
522the one used to implement the mutex and the one used in the implementation
523of the wait queue associated with the condition variable.
524.BR FUTEX_WAKE_OP
525allows such cases to be implemented without leading to
526high rates of contention and context switching.
527
528The
529.BR FUTEX_WAIT_OP
e61abc20 530operation is equivalent to executing the following code atomically
4c8cb0ff
MK
531and totally ordered with respect to other futex operations on
532any of the two supplied futex words:
6bac3b85
MK
533
534.in +4n
535.nf
536int oldval = *(int *) uaddr2;
537*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
538futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
539if (oldval \fIcmp\fP \fIcmparg\fP)
768d3c23 540 futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
6bac3b85
MK
541.fi
542.in
543
544In other words,
545.BR FUTEX_WAIT_OP
546does the following:
547.RS
548.IP * 3
4b35dc5d
TR
549saves the original value of the futex word at
550.IR uaddr2
551and performs an operation to modify the value of the futex at
6bac3b85 552.IR uaddr2 ;
4c8cb0ff
MK
553this is an atomic read-modify-write memory access (i.e., using atomic
554machine instructions of the respective architecture)
6bac3b85
MK
555.IP *
556wakes up a maximum of
557.I val
4b35dc5d 558waiters on the futex for the futex word at
6bac3b85
MK
559.IR uaddr ;
560and
561.IP *
4c8cb0ff
MK
562dependent on the results of a test of the original value of the
563futex word at
6bac3b85
MK
564.IR uaddr2 ,
565wakes up a maximum of
768d3c23 566.I val2
4b35dc5d 567waiters on the futex for the futex word at
6bac3b85
MK
568.IR uaddr2 .
569.RE
570.IP
6bac3b85
MK
571The operation and comparison that are to be performed are encoded
572in the bits of the argument
573.IR val3 .
574Pictorially, the encoding is:
575
f6af90e7 576.in +8n
6bac3b85 577.nf
f6af90e7
MK
578+---+---+-----------+-----------+
579|op |cmp| oparg | cmparg |
580+---+---+-----------+-----------+
581 4 4 12 12 <== # of bits
6bac3b85
MK
582.fi
583.in
584
585Expressed in code, the encoding is:
586
587.in +4n
588.nf
589#define FUTEX_OP(op, oparg, cmp, cmparg) \\
590 (((op & 0xf) << 28) | \\
591 ((cmp & 0xf) << 24) | \\
592 ((oparg & 0xfff) << 12) | \\
593 (cmparg & 0xfff))
594.fi
595.in
596
597In the above,
598.I op
599and
600.I cmp
601are each one of the codes listed below.
602The
603.I oparg
604and
605.I cmparg
606components are literal numeric values, except as noted below.
607
608The
609.I op
610component has one of the following values:
611
612.in +4n
613.nf
614FUTEX_OP_SET 0 /* uaddr2 = oparg; */
615FUTEX_OP_ADD 1 /* uaddr2 += oparg; */
616FUTEX_OP_OR 2 /* uaddr2 |= oparg; */
617FUTEX_OP_ANDN 3 /* uaddr2 &= ~oparg; */
618FUTEX_OP_XOR 4 /* uaddr2 ^= oparg; */
619.fi
620.in
621
622In addition, bit-wise ORing the following value into
623.I op
624causes
625.IR "(1\ <<\ oparg)"
626to be used as the operand:
627
628.in +4n
629.nf
630FUTEX_OP_ARG_SHIFT 8 /* Use (1 << oparg) as operand */
631.fi
632.in
633
634The
635.I cmp
636field is one of the following:
637
638.in +4n
639.nf
640FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */
641FUTEX_OP_CMP_NE 1 /* if (oldval != cmparg) wake */
642FUTEX_OP_CMP_LT 2 /* if (oldval < cmparg) wake */
643FUTEX_OP_CMP_LE 3 /* if (oldval <= cmparg) wake */
644FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */
645FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */
646.fi
647.in
648
649The return value of
650.BR FUTEX_WAKE_OP
651is the sum of the number of waiters woken on the futex
652.IR uaddr
653plus the number of waiters woken on the futex
654.IR uaddr2 .
70b06b90
MK
655.\"
656.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
657.\"
d67e21f5 658.TP
79c9b436
TG
659.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
660.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
fd9e59d4 661This operation is like
79c9b436
TG
662.BR FUTEX_WAIT
663except that
664.I val3
665is used to provide a 32-bit bitset to the kernel.
666This bitset is stored in the kernel-internal state of the waiter.
667See the description of
668.BR FUTEX_WAKE_BITSET
669for further details.
670
fd9e59d4
MK
671The
672.BR FUTEX_WAIT_BITSET
9732dd8b 673operation also interprets the
fd9e59d4
MK
674.I timeout
675argument differently from
676.BR FUTEX_WAIT .
677See the discussion of
678.BR FUTEX_CLOCK_REALTIME ,
679above.
680
79c9b436
TG
681The
682.I uaddr2
683argument is ignored.
70b06b90
MK
684.\"
685.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
686.\"
79c9b436 687.TP
d67e21f5
MK
688.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
689.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
55cc422d
TG
690This operation is the same as
691.BR FUTEX_WAKE
692except that the
693.I val3
694argument is used to provide a 32-bit bitset to the kernel.
98d769c0
MK
695This bitset is used to select which waiters should be woken up.
696The selection is done by a bit-wise AND of the "wake" bitset
697(i.e., the value in
698.IR val3 )
699and the bitset which is stored in the kernel-internal
09cb4ce7 700state of the waiter (the "wait" bitset that is set using
98d769c0
MK
701.BR FUTEX_WAIT_BITSET ).
702All of the waiters for which the result of the AND is nonzero are woken up;
703the remaining waiters are left sleeping.
704
e9d4496b
MK
705The effect of
706.BR FUTEX_WAIT_BITSET
707and
708.BR FUTEX_WAKE_BITSET
9732dd8b
MK
709is to allow selective wake-ups among multiple waiters that are blocked
710on the same futex.
ac894879
MK
711However, note that, depending on the use case,
712employing this bitset multiplexing feature on a
713futex can be less efficient than simply using multiple futexes,
e9d4496b
MK
714because employing bitset multiplexing requires the kernel
715to check all waiters on a futex,
716including those that are not interested in being woken up
717(i.e., they do not have the relevant bit set in their "wait" bitset).
718.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
719.\"
720.\" "The original reason for the addition of these extensions
721.\" was to improve the performance of pthread read-write locks
722.\" in glibc. However, the pthreads library no longer uses the
723.\" same locking algorithm, and these extensions are not used
724.\" without the bitset parameter being all ones.
725.\"
726.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
727.\" is nevertheless used (with a bitset of all ones) in order to
728.\" obtain the absolute timeout functionality that is useful
729.\" for efficiently implementing Pthreads APIs (which use absolute
730.\" timeouts); FUTEX_WAIT provides only relative timeouts.
731
98d769c0
MK
732The
733.I uaddr2
734and
735.I timeout
736arguments are ignored.
9732dd8b
MK
737
738The
739.BR FUTEX_WAIT
740and
741.BR FUTEX_WAKE
742operations correspond to
743.BR FUTEX_WAIT_BITSET
744and
745.BR FUTEX_WAKE_BITSET
746operations where the bitsets are all ones.
bd90a5f9 747.\"
70b06b90 748.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
bd90a5f9
MK
749.\"
750.SS Priority-inheritance futexes
b52e1cd4
MK
751Linux supports priority-inheritance (PI) futexes in order to handle
752priority-inversion problems that can be encountered with
753normal futex locks.
b565548b 754Priority inversion is the problem that occurs when a high-priority
bdc5957a
MK
755task is blocked waiting to acquire a lock held by a low-priority task,
756while tasks at an intermediate priority continuously preempt
757the low-priority task from the CPU.
758Consequently, the low-priority task makes no progress toward
759releasing the lock, and the high-priority task remains blocked.
7f315ae3 760
7d20efd7
MK
761Priority inheritance is a mechanism for dealing with
762the priority-inversion problem.
bdc5957a
MK
763With this mechanism, when a high-priority task becomes blocked
764by a lock held by a low-priority task,
9cee832c
MK
765the priority of the low-priority task is temporarily raised
766to that of the high-priority task,
bdc5957a 767so that it is not preempted by any intermediate level tasks,
7d20efd7
MK
768and can thus make progress toward releasing the lock.
769To be effective, priority inheritance must be transitive,
bdc5957a
MK
770meaning that if a high-priority task blocks on a lock
771held by a lower-priority task that is itself blocked by lock
772held by another intermediate-priority task
7d20efd7 773(and so on, for chains of arbitrary length),
b0f35fbb 774then both of those tasks
bdc5957a
MK
775(or more generally, all of the tasks in a lock chain)
776have their priorities raised to be the same as the high-priority task.
7d20efd7 777
9e2b90ee 778From a user-space perspective,
39e9b2e1
MK
779what makes a futex PI-aware is a policy agreement (described below)
780between user space and the kernel about the value of the futex word,
f0a9e8f4 781coupled with the use of the PI-futex operations described below
9e2b90ee
MK
782(in particular,
783.BR FUTEX_LOCK_PI ,
784.BR FUTEX_TRYLOCK_PI ,
785and
786.BR FUTEX_CMP_REQUEUE_PI ).
787.\" Quoting Darren Hart:
788.\" These opcodes paired with the PI futex value policy (described below)
789.\" defines a "futex" as PI aware. These were created very specifically
790.\" in support of PI pthread_mutexes, so it makes a lot more sense to
791.\" talk about a PI aware pthread_mutex, than a PI aware futex, since
792.\" there is a lot of policy and scaffolding that has to be built up
793.\" around it to use it properly (this is what a PI pthread_mutex is).
794
ac894879 795.\" mtk: The following text is drawn from the Hart/Guniguntala paper
1af427a4 796.\" (listed in SEE ALSO), but I have reworded some pieces
f0a9e8f4 797.\" significntly.
79d918c7 798.\"
f0a9e8f4 799The PI-futex operations described below differ from the other
4b35dc5d
TR
800futex operations in that they impose policy on the use of the value of the
801futex word:
79d918c7 802.IP * 3
4b35dc5d 803If the lock is not acquired, the futex word's value shall be 0.
79d918c7 804.IP *
4c8cb0ff
MK
805If the lock is acquired, the futex word's value shall
806be the thread ID (TID;
4b35dc5d 807see
79d918c7
MK
808.BR gettid (2))
809of the owning thread.
810.IP *
79d918c7
MK
811If the lock is owned and there are threads contending for the lock,
812then the
813.B FUTEX_WAITERS
4b35dc5d 814bit shall be set in the futex word's value; in other words, this value is:
79d918c7
MK
815
816 FUTEX_WAITERS | TID
9e2b90ee 817
79d918c7 818.PP
aab5ca4b 819Note that a PI futex word never has just the value
9e2b90ee
MK
820.BR FUTEX_WAITERS ,
821which is a permissible state for non-PI futexes.
822
79d918c7 823With this policy in place,
fd105614 824a user-space application can acquire an unacquired
4b35dc5d 825lock or release a lock that no other threads try to acquire using atomic
fd105614
MK
826instructions executed in user mode
827(e.g., a compare-and-swap operation such as
b52e1cd4
MK
828.I cmpxchg
829on the x86 architecture).
4c8cb0ff
MK
830Acquiring a lock simply consists of using compare-and-swap to atomically
831set the futex word's value to the caller's TID if its previous value was 0.
4b35dc5d
TR
832Releasing a lock requires using compare-and-swap to set the futex word's
833value to 0 if the previous value was the expected TID.
b52e1cd4 834
4b35dc5d 835If a futex is already acquired (i.e., has a nonzero value),
b52e1cd4 836waiters must employ the
79d918c7
MK
837.B FUTEX_LOCK_PI
838operation to acquire the lock.
4b35dc5d 839If other threads are waiting for the lock, then the
79d918c7 840.B FUTEX_WAITERS
4c8cb0ff
MK
841bit is set in the futex value;
842in this case, the lock owner must employ the
79d918c7 843.B FUTEX_UNLOCK_PI
b52e1cd4
MK
844operation to release the lock.
845
79d918c7
MK
846In the cases where callers are forced into the kernel
847(i.e., required to perform a
848.BR futex ()
0c3ec26b 849call),
79d918c7
MK
850they then deal directly with a so-called RT-mutex,
851a kernel locking mechanism which implements the required
852priority-inheritance semantics.
853After the RT-mutex is acquired, the futex value is updated accordingly,
854before the calling thread returns to user space.
79d918c7 855
a59fca75 856It is important to note
ac894879 857.\" tglx (July 2015):
30239c10
MK
858.\" If there are multiple waiters on a pi futex then a wake pi operation
859.\" will wake the first waiter and hand over the lock to this waiter. This
860.\" includes handing over the rtmutex which represents the futex in the
861.\" kernel. The strict requirement is that the futex owner and the rtmutex
862.\" owner must be the same, except for the update period which is
863.\" serialized by the futex internal locking. That means the kernel must
864.\" update the user space value prior to returning to user space
4b35dc5d 865that the kernel will update the futex word's value prior
79d918c7
MK
866to returning to user space.
867Unlike the other futex operations described above,
f0a9e8f4 868the PI-futex operations are designed
d9d5be6b 869for the implementation of very specific IPC mechanisms.
fc57e6bb 870.\"
7bd3ffbc 871.\" FIXME XXX In discussing errors for FUTEX_CMP_REQUEUE_PI, Darren Hart
99c0ac69
MK
872.\" made the observation that "EINVAL is returned if the non-pi
873.\" to pi or op pairing semantics are violated."
874.\" Probably there needs to be a general statement about this
875.\" requirement, probably located at about this point in the page.
d6bb5a38 876.\" Darren (or someone else), care to take a shot at this?
30239c10
MK
877.\" tglx (July 2015):
878.\" Well, that's hard to describe because the kernel only has a limited
879.\" way of detecting such mismatches. It only can detect it when there are
880.\" non PI waiters on a futex and a PI function is called or vice versa.
881
882If a futex has an associated RT-mutex in the kernel
883(i.e., there are blocked waiters)
884and the owner of the futex/RT-mutex dies unexpectedly,
885then the kernel cleans up the RT-mutex and hands it over to the next waiter.
886This in turn requires that the user-space value is updated accordingly.
887To indicate that this is required, the kernel sets the
888.B FUTEX_OWNER_DIED
889bit in the futex word along with the thread ID of the new owner.
890User space is then responsible for cleaning this up
891(though there are cases where the kernel does the cleanup).
892.\" tglx (July 2015):
893.\" The FUTEX_OWNER_DIED bit can also be set on uncontended futexes, where
894.\" the kernel has no state associated. This happens via the robust futex
895.\" mechanism. In that case the futex value will be set to
896.\" FUTEX_OWNER_DIED. The robust futex mechanism is also available for non
897.\" PI futexes.
bd90a5f9
MK
898
899PI futexes are operated on by specifying one of the following values in
900.IR futex_op :
70b06b90
MK
901.\"
902.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
903.\"
d67e21f5
MK
904.TP
905.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
906.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
67833bec 907This operation is used after after an attempt to acquire
fd105614 908the lock via an atomic user-mode instruction failed
4b35dc5d 909because the futex word has a nonzero value\(emspecifically,
8297383e 910because it contained the (PID-namespace-specific) TID of the lock owner.
67833bec 911
4b35dc5d 912The operation checks the value of the futex word at the address
67833bec 913.IR uaddr .
70b06b90
MK
914If the value is 0, then the kernel tries to atomically set
915the futex value to the caller's TID.
c3875d1d 916If the futex word's value is nonzero,
67833bec 917the kernel atomically sets the
e0547e70 918.B FUTEX_WAITERS
67833bec
MK
919bit, which signals the futex owner that it cannot unlock the futex in
920user space atomically by setting the futex value to 0.
c3875d1d
MK
921.\" tglx (July 2015):
922.\" The operation here is similar to the FUTEX_WAIT logic. When the user
923.\" space atomic acquire does not succeed because the futex value was non
924.\" zero, then the waiter goes into the kernel, takes the kernel internal
925.\" lock and retries the acquisition under the lock. If the acquisition
926.\" does not succeed either, then it sets the FUTEX_WAITERS bit, to signal
927.\" the lock owner that it needs to go into the kernel. Here is the pseudo
928.\" code:
929.\"
930.\" lock(kernel_lock);
931.\" retry:
932.\"
933.\" /*
934.\" * Owner might have unlocked in userspace before we
935.\" * were able to set the waiter bit.
936.\" */
937.\" if (atomic_acquire(futex) == SUCCESS) {
938.\" unlock(kernel_lock());
939.\" return 0;
940.\" }
941.\"
942.\" /*
943.\" * Owner might have unlocked after the above atomic_acquire()
944.\" * attempt.
945.\" */
946.\" if (atomic_set_waiters_bit(futex) != SUCCESS)
947.\" goto retry;
948.\"
949.\" queue_waiter();
950.\" unlock(kernel_lock);
951.\" block();
952.\"
953After that, the kernel:
954.RS
955.IP 1. 3
956Tries to find the thread which is associated with the owner TID.
957.IP 2.
958Creates or reuses kernel state on behalf of the owner.
959(If this is the first waiter, there is no kernel state for this
960futex, so kernel state is created by locking the RT-mutex
961and the futex owner is made the owner of the RT-mutex.
962If there are existing waiters, then the existing state is reused.)
963.IP 3.
964Attaches the waiter to it
965(i.e., the waiter is enqueued on the RT-mutex waiter list).
966.RE
967.IP
ac894879
MK
968If more than one waiter exists,
969the enqueueing of the waiter is in descending priority order.
970(For information on priority ordering, see the discussion of the
971.BR SCHED_DEADLINE ,
972.BR SCHED_FIFO ,
973and
974.BR SCHED_RR
975scheduling policies in
976.BR sched (7).)
977The owner inherits either the waiter's CPU bandwidth
978(if the waiter is scheduled under the
979.BR SCHED_DEADLINE
980policy) or the waiter's priority (if the waiter is scheduled under the
981.BR SCHED_RR
982or
983.BR SCHED_FIFO
984policy).
c3875d1d
MK
985This inheritance follows the lock chain in the case of nested locking
986(i.e., task 1 blocks on lock A, held by task 2,
987while task 2 blocks on lock B, held by task 3)
988and performs deadlock detection.
989
e0547e70
TG
990The
991.I timeout
9ce19cf1
MK
992argument provides a timeout for the lock attempt.
993It is interpreted as an absolute time, measured against the
994.BR CLOCK_REALTIME
995clock.
996If
997.I timeout
998is NULL, the operation will block indefinitely.
e0547e70 999
a449c634 1000The
e0547e70
TG
1001.IR uaddr2 ,
1002.IR val ,
1003and
1004.IR val3
a449c634 1005arguments are ignored.
67833bec 1006.\"
70b06b90
MK
1007.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1008.\"
d67e21f5 1009.TP
12fdbe23 1010.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
d67e21f5 1011.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
12fdbe23
MK
1012This operation tries to acquire the futex at
1013.IR uaddr .
c3875d1d
MK
1014It is invoked when a user-space atomic acquire did not
1015succeed because the futex word was not 0.
1016
41b6c6a0
MK
1017.\" FIXME The wording "The trylock in kernel" below needs clarification.
1018.\" Suggestions?
1019.\"
c3875d1d
MK
1020The trylock in kernel might succeed because the futex word
1021contains stale state
1022.RB ( FUTEX_WAITERS
1023and/or
1024.BR FUTEX_OWNER_DIED ).
1025This can happen when the owner of the futex died.
0b761826 1026.\" FIXME I think it would be helpful here to say a few more words about
70b06b90
MK
1027.\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI.
1028.\" Can someone propose something?
1029.\"
a282e5b0 1030User space cannot handle this condition in a race-free manner
084744ef
MK
1031
1032The
1033.IR uaddr2 ,
1034.IR val ,
1035.IR timeout ,
1036and
1037.IR val3
1038arguments are ignored.
70b06b90
MK
1039.\"
1040.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1041.\"
d67e21f5 1042.TP
12fdbe23 1043.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
d67e21f5 1044.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
d4ba4328 1045This operation wakes the top priority waiter that is waiting in
ecae2099
TG
1046.B FUTEX_LOCK_PI
1047on the futex address provided by the
1048.I uaddr
1049argument.
1050
1051This is called when the user space value at
1052.I uaddr
1053cannot be changed atomically from a TID (of the owner) to 0.
1054
1055The
1056.IR uaddr2 ,
1057.IR val ,
1058.IR timeout ,
1059and
1060.IR val3
11a194bf 1061arguments are ignored.
70b06b90
MK
1062.\"
1063.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1064.\"
d67e21f5 1065.TP
d67e21f5
MK
1066.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
1067.\" commit 52400ba946759af28442dee6265c5c0180ac7122
f812a08b
DH
1068This operation is a PI-aware variant of
1069.BR FUTEX_CMP_REQUEUE .
1070It requeues waiters that are blocked via
1071.B FUTEX_WAIT_REQUEUE_PI
1072on
1073.I uaddr
1074from a non-PI source futex
1075.RI ( uaddr )
1076to a PI target futex
1077.RI ( uaddr2 ).
1078
9e54d26d
MK
1079As with
1080.BR FUTEX_CMP_REQUEUE ,
1081this operation wakes up a maximum of
1082.I val
1083waiters that are waiting on the futex at
1084.IR uaddr .
1085However, for
1086.BR FUTEX_CMP_REQUEUE_PI ,
1087.I val
6fbeb8f4 1088is required to be 1
939ca89f 1089(since the main point is to avoid a thundering herd).
9e54d26d
MK
1090The remaining waiters are removed from the wait queue of the source futex at
1091.I uaddr
1092and added to the wait queue of the target futex at
1093.IR uaddr2 .
f812a08b 1094
9e54d26d 1095The
768d3c23 1096.I val2
c6d8cf21
MK
1097.\" val2 is the cap on the number of requeued waiters.
1098.\" In the glibc pthread_cond_broadcast() implementation, this argument
1099.\" is specified as INT_MAX, and for pthread_cond_signal() it is 0.
9e54d26d 1100and
768d3c23 1101.I val3
9e54d26d
MK
1102arguments serve the same purposes as for
1103.BR FUTEX_CMP_REQUEUE .
70b06b90 1104.\"
8297383e 1105.\" The page at http://locklessinc.com/articles/futex_cheat_sheet/
be376673 1106.\" notes that "priority-inheritance Futex to priority-inheritance
8297383e
MK
1107.\" Futex requeues are currently unsupported". However, probably
1108.\" the page does not need to say nothing about this, since
1109.\" Thomas Gleixner commented (July 2015): "they never will be
1110.\" supported because they make no sense at all"
70b06b90
MK
1111.\"
1112.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1113.\"
d67e21f5
MK
1114.TP
1115.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
1116.\" commit 52400ba946759af28442dee6265c5c0180ac7122
70b06b90 1117.\"
c3875d1d 1118Wait on a non-PI futex at
6ff1b4c0 1119.I uaddr
c3875d1d
MK
1120and potentially be requeued (via a
1121.BR FUTEX_CMP_REQUEUE_PI
1122operation in another task) onto a PI futex at
6ff1b4c0
TG
1123.IR uaddr2 .
1124The wait operation on
1125.I uaddr
c3875d1d 1126is the same as for
6ff1b4c0 1127.BR FUTEX_WAIT .
c3875d1d 1128
6ff1b4c0
TG
1129The waiter can be removed from the wait on
1130.I uaddr
6ff1b4c0 1131without requeueing on
c3875d1d
MK
1132.IR uaddr2
1133via a
1134.BR FUTEX_WAIT
1135operation in another task.
1136In this case, the
1137.BR FUTEX_WAIT_REQUEUE_PI
1138operation returns with the error
1139.BR EWOULDBLOCK .
a4e69912 1140
63bea7dc
MK
1141If
1142.I timeout
1143is not NULL, it specifies a timeout for the wait operation;
1144this timeout is interpreted as outlined above in the description of the
1145.BR FUTEX_CLOCK_REALTIME
1146option.
1147If
1148.I timeout
1149is NULL, the operation can block indefinitely.
1150
a4e69912
MK
1151The
1152.I val3
1153argument is ignored.
abb571e8
MK
1154
1155The
1156.BR FUTEX_WAIT_REQUEUE_PI
1157and
1158.BR FUTEX_CMP_REQUEUE_PI
1159were added to support a fairly specific use case:
1160support for priority-inheritance-aware POSIX threads condition variables.
1161The idea is that these operations should always be paired,
1162in order to ensure that user space and the kernel remain in sync.
1163Thus, in the
1164.BR FUTEX_WAIT_REQUEUE_PI
1165operation, the user-space application pre-specifies the target
1166of the requeue that takes place in the
1167.BR FUTEX_CMP_REQUEUE_PI
1168operation.
1169.\"
1170.\" Darren Hart notes that a patch to allow glibc to fully support
1af427a4 1171.\" PI-aware pthreads condition variables has not yet been accepted into
abb571e8
MK
1172.\" glibc. The story is complex, and can be found at
1173.\" https://sourceware.org/bugzilla/show_bug.cgi?id=11588
1174.\" Darren notes that in the meantime, the patch is shipped with various
1af427a4 1175.\" PREEMPT_RT-enabled Linux systems.
abb571e8
MK
1176.\"
1177.\" Related to the preceding, Darren proposed that somewhere, man-pages
1178.\" should document the following point:
1af427a4 1179.\"
4c8cb0ff
MK
1180.\" While the Linux kernel, since 2.6.31, supports requeueing of
1181.\" priority-inheritance (PI) aware mutexes via the
1182.\" FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI futex operations,
1183.\" the glibc implementation does not yet take full advantage of this.
1184.\" Specifically, the condvar internal data lock remains a non-PI aware
1185.\" mutex, regardless of the type of the pthread_mutex associated with
1186.\" the condvar. This can lead to an unbounded priority inversion on
1187.\" the internal data lock even when associating a PI aware
1188.\" pthread_mutex with a condvar during a pthread_cond*_wait
1189.\" operation. For this reason, it is not recommended to rely on
1190.\" priority inheritance when using pthread condition variables.
1af427a4
MK
1191.\"
1192.\" The problem is that the obvious location for this text is
1193.\" the pthread_cond*wait(3) man page. However, such a man page
abb571e8 1194.\" does not currently exist.
70b06b90 1195.\"
6700de24 1196.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
70b06b90 1197.\"
47297adb 1198.SH RETURN VALUE
fea681da 1199.PP
a5c5a06a
MK
1200In the event of an error (and assuming that
1201.BR futex ()
1202was invoked via
1203.BR syscall (2)),
1204all operations return \-1 and set
e808bba0 1205.I errno
6f147f79 1206to indicate the cause of the error.
adc89a97 1207
e808bba0
MK
1208The return value on success depends on the operation,
1209as described in the following list:
fea681da
MK
1210.TP
1211.B FUTEX_WAIT
077981d4 1212Returns 0 if the caller was woken up.
4c8cb0ff
MK
1213Note that a wake-up can also be caused by common futex usage patterns
1214in unrelated code that happened to have previously used the futex word's
1215memory location (e.g., typical futex-based implementations of
1216Pthreads mutexes can cause this under some conditions).
1217Therefore, callers should always conservatively assume that a return
1218value of 0 can mean a spurious wake-up, and use the futex word's value
1219(i.e., the user space synchronization scheme)
1220 to decide whether to continue to block or not.
fea681da
MK
1221.TP
1222.B FUTEX_WAKE
bdc5957a 1223Returns the number of waiters that were woken up.
fea681da
MK
1224.TP
1225.B FUTEX_FD
1226Returns the new file descriptor associated with the futex.
1227.TP
1228.B FUTEX_REQUEUE
bdc5957a 1229Returns the number of waiters that were woken up.
fea681da
MK
1230.TP
1231.B FUTEX_CMP_REQUEUE
bdc5957a 1232Returns the total number of waiters that were woken up or
4b35dc5d 1233requeued to the futex for the futex word at
3dfcc11d
MK
1234.IR uaddr2 .
1235If this value is greater than
1236.IR val ,
fd105614 1237then the difference is the number of waiters requeued to the futex for the
4c8cb0ff 1238futex word at
3dfcc11d 1239.IR uaddr2 .
dcad19c0
MK
1240.TP
1241.B FUTEX_WAKE_OP
a8b5b324 1242Returns the total number of waiters that were woken up.
4c8cb0ff
MK
1243This is the sum of the woken waiters on the two futexes for
1244the futex words at
a8b5b324
MK
1245.I uaddr
1246and
1247.IR uaddr2 .
dcad19c0
MK
1248.TP
1249.B FUTEX_WAIT_BITSET
077981d4
MK
1250Returns 0 if the caller was woken up.
1251See
4b35dc5d
TR
1252.B FUTEX_WAIT
1253for how to interpret this correctly in practice.
dcad19c0
MK
1254.TP
1255.B FUTEX_WAKE_BITSET
bdc5957a 1256Returns the number of waiters that were woken up.
dcad19c0
MK
1257.TP
1258.B FUTEX_LOCK_PI
bf02a260 1259Returns 0 if the futex was successfully locked.
dcad19c0
MK
1260.TP
1261.B FUTEX_TRYLOCK_PI
5c716eef 1262Returns 0 if the futex was successfully locked.
dcad19c0
MK
1263.TP
1264.B FUTEX_UNLOCK_PI
52bb928f 1265Returns 0 if the futex was successfully unlocked.
dcad19c0
MK
1266.TP
1267.B FUTEX_CMP_REQUEUE_PI
bdc5957a 1268Returns the total number of waiters that were woken up or
4b35dc5d 1269requeued to the futex for the futex word at
dddd395a
MK
1270.IR uaddr2 .
1271If this value is greater than
1272.IR val ,
4c8cb0ff
MK
1273then difference is the number of waiters requeued to the futex for
1274the futex word at
dddd395a 1275.IR uaddr2 .
dcad19c0
MK
1276.TP
1277.B FUTEX_WAIT_REQUEUE_PI
4c8cb0ff
MK
1278Returns 0 if the caller was successfully requeued to the futex for
1279the futex word at
22c15de9 1280.IR uaddr2 .
70b06b90
MK
1281.\"
1282.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1283.\"
fea681da
MK
1284.SH ERRORS
1285.TP
1286.B EACCES
4b35dc5d 1287No read access to the memory of a futex word.
fea681da
MK
1288.TP
1289.B EAGAIN
f48516d1 1290.RB ( FUTEX_WAIT ,
4b35dc5d 1291.BR FUTEX_WAIT_BITSET ,
f48516d1 1292.BR FUTEX_WAIT_REQUEUE_PI )
badbf70c
MK
1293The value pointed to by
1294.I uaddr
1295was not equal to the expected value
1296.I val
1297at the time of the call.
9732dd8b
MK
1298
1299.BR Note :
1300on Linux, the symbolic names
1301.B EAGAIN
1302and
1303.B EWOULDBLOCK
77da5feb 1304(both of which appear in different parts of the kernel futex code)
9732dd8b 1305have the same value.
badbf70c
MK
1306.TP
1307.B EAGAIN
8f2068bb
MK
1308.RB ( FUTEX_CMP_REQUEUE ,
1309.BR FUTEX_CMP_REQUEUE_PI )
ce5602fd 1310The value pointed to by
9f6c40c0
МК
1311.I uaddr
1312is not equal to the expected value
1313.IR val3 .
fea681da 1314.TP
5662f56a
MK
1315.BR EAGAIN
1316.RB ( FUTEX_LOCK_PI ,
aaec9032
MK
1317.BR FUTEX_TRYLOCK_PI ,
1318.BR FUTEX_CMP_REQUEUE_PI )
1319The futex owner thread ID of
1320.I uaddr
1321(for
1322.BR FUTEX_CMP_REQUEUE_PI :
1323.IR uaddr2 )
1324is about to exit,
5662f56a
MK
1325but has not yet handled the internal state cleanup.
1326Try again.
1327.TP
7a39e745
MK
1328.BR EDEADLK
1329.RB ( FUTEX_LOCK_PI ,
9732dd8b
MK
1330.BR FUTEX_TRYLOCK_PI ,
1331.BR FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1332The futex word at
7a39e745
MK
1333.I uaddr
1334is already locked by the caller.
1335.TP
662c0da8 1336.BR EDEADLK
c3875d1d 1337.\" FIXME . I see that kernel/locking/rtmutex.c uses EDEADLK in some
d6bb5a38 1338.\" places, and EDEADLOCK in others. On almost all architectures
4c8cb0ff
MK
1339.\" these constants are synonymous. Is there a reason that both
1340.\" names are used?
8297383e
MK
1341.\"
1342.\" tglx (July 2015): "No. We should probably fix that."
1343.\"
662c0da8 1344.RB ( FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1345While requeueing a waiter to the PI futex for the futex word at
662c0da8
MK
1346.IR uaddr2 ,
1347the kernel detected a deadlock.
1348.TP
fea681da 1349.B EFAULT
1ea901e8
MK
1350A required pointer argument (i.e.,
1351.IR uaddr ,
1352.IR uaddr2 ,
1353or
1354.IR timeout )
496df304 1355did not point to a valid user-space address.
fea681da 1356.TP
9f6c40c0 1357.B EINTR
e808bba0 1358A
9f6c40c0 1359.B FUTEX_WAIT
2674f781
MK
1360or
1361.B FUTEX_WAIT_BITSET
e808bba0 1362operation was interrupted by a signal (see
f529fd20
MK
1363.BR signal (7)).
1364In kernels before Linux 2.6.22, this error could also be returned for
1365on a spurious wakeup; since Linux 2.6.22, this no longer happens.
9f6c40c0 1366.TP
fea681da 1367.B EINVAL
180f97b7
MK
1368The operation in
1369.IR futex_op
1370is one of those that employs a timeout, but the supplied
fb2f4c27
MK
1371.I timeout
1372argument was invalid
1373.RI ( tv_sec
1374was less than zero, or
1375.IR tv_nsec
cabee29d 1376was not less than 1,000,000,000).
fb2f4c27
MK
1377.TP
1378.B EINVAL
0c74df0b 1379The operation specified in
025e1374 1380.IR futex_op
0c74df0b 1381employs one or both of the pointers
51ee94be 1382.I uaddr
a1f47699 1383and
0c74df0b
MK
1384.IR uaddr2 ,
1385but one of these does not point to a valid object\(emthat is,
1386the address is not four-byte-aligned.
51ee94be
MK
1387.TP
1388.B EINVAL
55cc422d
TG
1389.RB ( FUTEX_WAIT_BITSET ,
1390.BR FUTEX_WAKE_BITSET )
79c9b436
TG
1391The bitset supplied in
1392.IR val3
1393is zero.
1394.TP
1395.B EINVAL
2abcba67 1396.RB ( FUTEX_CMP_REQUEUE_PI )
add875c0
MK
1397.I uaddr
1398equals
1399.IR uaddr2
1400(i.e., an attempt was made to requeue to the same futex).
1401.TP
ff597681
MK
1402.BR EINVAL
1403.RB ( FUTEX_FD )
1404The signal number supplied in
1405.I val
1406is invalid.
1407.TP
6bac3b85 1408.B EINVAL
476debd7
MK
1409.RB ( FUTEX_WAKE ,
1410.BR FUTEX_WAKE_OP ,
1411.BR FUTEX_WAKE_BITSET ,
1412.BR FUTEX_REQUEUE ,
1413.BR FUTEX_CMP_REQUEUE )
1414The kernel detected an inconsistency between the user-space state at
1415.I uaddr
1416and the kernel state\(emthat is, it detected a waiter which waits in
1417.BR FUTEX_LOCK_PI
1418on
1419.IR uaddr .
1420.TP
1421.B EINVAL
a218ef20 1422.RB ( FUTEX_LOCK_PI ,
ce022f18
MK
1423.BR FUTEX_TRYLOCK_PI ,
1424.BR FUTEX_UNLOCK_PI )
a218ef20
MK
1425The kernel detected an inconsistency between the user-space state at
1426.I uaddr
1427and the kernel state.
ce022f18 1428This indicates either state corruption
ce022f18 1429or that the kernel found a waiter on
a218ef20
MK
1430.I uaddr
1431which is waiting via
1432.BR FUTEX_WAIT
1433or
1434.BR FUTEX_WAIT_BITSET .
1435.TP
1436.B EINVAL
f9250b1a
MK
1437.RB ( FUTEX_CMP_REQUEUE_PI )
1438The kernel detected an inconsistency between the user-space state at
99c0041d
MK
1439.I uaddr2
1440and the kernel state;
1441that is, the kernel detected a waiter which waits via
1442.BR FUTEX_WAIT
8297383e
MK
1443or
1444.BR FUTEX_WAIT_BITSET
99c0041d
MK
1445on
1446.IR uaddr2 .
1447.TP
1448.B EINVAL
1449.RB ( FUTEX_CMP_REQUEUE_PI )
1450The kernel detected an inconsistency between the user-space state at
f9250b1a
MK
1451.I uaddr
1452and the kernel state;
1453that is, the kernel detected a waiter which waits via
75299c8d 1454.BR FUTEX_WAIT
99c0041d 1455or
75299c8d 1456.BR FUTEX_WAIT_BITESET
f9250b1a
MK
1457on
1458.IR uaddr .
1459.TP
1460.B EINVAL
99c0041d 1461.RB ( FUTEX_CMP_REQUEUE_PI )
75299c8d
MK
1462The kernel detected an inconsistency between the user-space state at
1463.I uaddr
1464and the kernel state;
1465that is, the kernel detected a waiter which waits on
1466.I uaddr
1467via
1468.BR FUTEX_LOCK_PI
1469(instead of
1470.BR FUTEX_WAIT_REQUEUE_PI ).
99c0041d
MK
1471.TP
1472.B EINVAL
9786b3ca 1473.RB ( FUTEX_CMP_REQUEUE_PI )
8297383e
MK
1474.\" This deals with the case:
1475.\" wait_requeue_pi(A, B);
1476.\" requeue_pi(A, C);
9786b3ca
MK
1477An attempt was made to requeue a waiter to a futex other than that
1478specified by the matching
1479.B FUTEX_WAIT_REQUEUE_PI
1480call for that waiter.
1481.TP
1482.B EINVAL
f0c0d61c
MK
1483.RB ( FUTEX_CMP_REQUEUE_PI )
1484The
1485.I val
1486argument is not 1.
1487.TP
1488.B EINVAL
4832b48a 1489Invalid argument.
fea681da 1490.TP
a449c634
MK
1491.BR ENOMEM
1492.RB ( FUTEX_LOCK_PI ,
e34a8fb6
MK
1493.BR FUTEX_TRYLOCK_PI ,
1494.BR FUTEX_CMP_REQUEUE_PI )
a449c634
MK
1495The kernel could not allocate memory to hold state information.
1496.TP
fea681da 1497.B ENFILE
ff597681 1498.RB ( FUTEX_FD )
fea681da 1499The system limit on the total number of open files has been reached.
4701fc28
MK
1500.TP
1501.B ENOSYS
1502Invalid operation specified in
d33602c4 1503.IR futex_op .
9f6c40c0 1504.TP
4a7e5b05
MK
1505.B ENOSYS
1506The
1507.BR FUTEX_CLOCK_REALTIME
1508option was specified in
1afcee7c 1509.IR futex_op ,
4a7e5b05
MK
1510but the accompanying operation was neither
1511.BR FUTEX_WAIT_BITSET
1512nor
1513.BR FUTEX_WAIT_REQUEUE_PI .
1514.TP
a9dcb4d1
MK
1515.BR ENOSYS
1516.RB ( FUTEX_LOCK_PI ,
f2424fae 1517.BR FUTEX_TRYLOCK_PI ,
4945ff19 1518.BR FUTEX_UNLOCK_PI ,
4cf92894 1519.BR FUTEX_CMP_REQUEUE_PI ,
794bb106 1520.BR FUTEX_WAIT_REQUEUE_PI )
4b35dc5d 1521A run-time check determined that the operation is not available.
f0a9e8f4 1522The PI-futex operations are not implemented on all architectures and
077981d4 1523are not supported on some CPU variants.
a9dcb4d1 1524.TP
c7589177
MK
1525.BR EPERM
1526.RB ( FUTEX_LOCK_PI ,
dc2742a8
MK
1527.BR FUTEX_TRYLOCK_PI ,
1528.BR FUTEX_CMP_REQUEUE_PI )
04331c3f 1529The caller is not allowed to attach itself to the futex at
dc2742a8
MK
1530.I uaddr
1531(for
1532.BR FUTEX_CMP_REQUEUE_PI :
1533the futex at
1534.IR uaddr2 ).
c7589177
MK
1535(This may be caused by a state corruption in user space.)
1536.TP
76f347ba 1537.BR EPERM
87276709 1538.RB ( FUTEX_UNLOCK_PI )
4b35dc5d 1539The caller does not own the lock represented by the futex word.
76f347ba 1540.TP
0b0e4934
MK
1541.BR ESRCH
1542.RB ( FUTEX_LOCK_PI ,
9732dd8b
MK
1543.BR FUTEX_TRYLOCK_PI ,
1544.BR FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1545The thread ID in the futex word at
0b0e4934
MK
1546.I uaddr
1547does not exist.
1548.TP
360f773c
MK
1549.BR ESRCH
1550.RB ( FUTEX_CMP_REQUEUE_PI )
4b35dc5d 1551The thread ID in the futex word at
360f773c
MK
1552.I uaddr2
1553does not exist.
1554.TP
9f6c40c0 1555.B ETIMEDOUT
4d85047f
MK
1556The operation in
1557.IR futex_op
1558employed the timeout specified in
1559.IR timeout ,
1560and the timeout expired before the operation completed.
70b06b90
MK
1561.\"
1562.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1563.\"
47297adb 1564.SH VERSIONS
a1d5f77c 1565.PP
81c9d87e
MK
1566Futexes were first made available in a stable kernel release
1567with Linux 2.6.0.
1568
4c8cb0ff
MK
1569Initial futex support was merged in Linux 2.5.7 but with different
1570semantics from what was described above.
52dee70e 1571A four-argument system call with the semantics
fd3fa7ef 1572described in this page was introduced in Linux 2.5.40.
11b520ed 1573In Linux 2.5.70, one argument
a1d5f77c 1574was added.
11b520ed 1575In Linux 2.6.7, a sixth argument was added\(emmessy, especially
a1d5f77c 1576on the s390 architecture.
47297adb 1577.SH CONFORMING TO
8382f16d 1578This system call is Linux-specific.
47297adb 1579.SH NOTES
baf0f1f4
MK
1580Glibc does not provide a wrapper for this system call; call it using
1581.BR syscall (2).
cf44281c 1582
02f7b623
MK
1583Several higher-level programming abstractions are implemented via futexes,
1584including POSIX semaphores and
1585various POSIX threads synchronization mechanisms
1586(mutexes, condition variables, read-write locks, and barriers).
74f58a64
MK
1587.\" TODO FIXME(Torvald) Above, we cite this section and claim it contains
1588.\" details on the synchronization semantics; add the C11 equivalents
1589.\" here (or whatever we find consensus for).
305cc415
MK
1590.\"
1591.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
1592.\"
1593.SH EXAMPLE
1594.\" FIXME Is it worth having an example program?
1595.\" FIXME Anything obviously broken in the example program?
1596.\"
77da5feb 1597The program below demonstrates use of futexes in a program
305cc415
MK
1598where parent and child use a pair of futexes located inside a
1599shared anonymous mapping to synchronize access to a shared resource:
1600the terminal.
1601The two processes each write
1602.IR nloops
1603(a command-line argument that defaults to 5 if omitted)
1604messages to the terminal and employ a synchronization protocol
1605that ensures that they alternate in writing messages.
1606Upon running this program we see output such as the following:
1607
1608.in +4n
1609.nf
1610$ \fB./futex_demo\fP
1611Parent (18534) 0
1612Child (18535) 0
1613Parent (18534) 1
1614Child (18535) 1
1615Parent (18534) 2
1616Child (18535) 2
1617Parent (18534) 3
1618Child (18535) 3
1619Parent (18534) 4
1620Child (18535) 4
1621.fi
1622.in
1623.SS Program source
1624\&
1625.nf
1626/* futex_demo.c
1627
1628 Usage: futex_demo [nloops]
1629 (Default: 5)
1630
1631 Demonstrate the use of futexes in a program where parent and child
1632 use a pair of futexes located inside a shared anonymous mapping to
1633 synchronize access to a shared resource: the terminal. The two
1634 processes each write \(aqnum\-loops\(aq messages to the terminal and employ
1635 a synchronization protocol that ensures that they alternate in
1636 writing messages.
1637*/
1638#define _GNU_SOURCE
1639#include <stdio.h>
1640#include <errno.h>
1641#include <stdlib.h>
1642#include <unistd.h>
1643#include <sys/wait.h>
1644#include <sys/mman.h>
1645#include <sys/syscall.h>
1646#include <linux/futex.h>
1647#include <sys/time.h>
1648
1649#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\
1650 } while (0)
1651
1652static int *futex1, *futex2, *iaddr;
1653
1654static int
1655futex(int *uaddr, int futex_op, int val,
1656 const struct timespec *timeout, int *uaddr2, int val3)
1657{
1658 return syscall(SYS_futex, uaddr, futex_op, val,
1659 timeout, uaddr, val3);
1660}
1661
1662/* Acquire the futex pointed to by \(aqfutexp\(aq: wait for its value to
1663 become 1, and then set the value to 0. */
1664
1665static void
1666fwait(int *futexp)
1667{
1668 int s;
1669
1670 /* __sync_bool_compare_and_swap(ptr, oldval, newval) is a gcc
1671 built\-in function. It atomically performs the equivalent of:
1672
1673 if (*ptr == oldval)
1674 *ptr = newval;
1675
1676 It returns true if the test yielded true and *ptr was updated.
1677 The alternative here would be to employ the equivalent atomic
1678 machine\-language instructions. For further information, see
1679 the GCC Manual. */
1680
305cc415 1681 while (1) {
83e80dda 1682
63ad44cb 1683 /* Is the futex available? */
83e80dda 1684
305cc415
MK
1685 if (__sync_bool_compare_and_swap(futexp, 1, 0))
1686 break; /* Yes */
1687
63ad44cb 1688 /* Futex is not available; wait */
83e80dda 1689
63ad44cb
HS
1690 s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
1691 if (s == \-1 && errno != EAGAIN)
1692 errExit("futex\-FUTEX_WAIT");
305cc415
MK
1693 }
1694}
1695
1696/* Release the futex pointed to by \(aqfutexp\(aq: if the futex currently
1697 has the value 0, set its value to 1 and the wake any futex waiters,
1698 so that if the peer is blocked in fpost(), it can proceed. */
1699
1700static void
1701fpost(int *futexp)
1702{
1703 int s;
1704
1705 /* __sync_bool_compare_and_swap() was described in comments above */
1706
1707 if (__sync_bool_compare_and_swap(futexp, 0, 1)) {
1708
1709 s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
1710 if (s == \-1)
1711 errExit("futex\-FUTEX_WAKE");
1712 }
1713}
1714
1715int
1716main(int argc, char *argv[])
1717{
1718 pid_t childPid;
1719 int j, nloops;
1720
1721 setbuf(stdout, NULL);
1722
1723 nloops = (argc > 1) ? atoi(argv[1]) : 5;
1724
1725 /* Create a shared anonymous mapping that will hold the futexes.
1726 Since the futexes are being shared between processes, we
1727 subsequently use the "shared" futex operations (i.e., not the
1728 ones suffixed "_PRIVATE") */
1729
1730 iaddr = mmap(NULL, sizeof(int) * 2, PROT_READ | PROT_WRITE,
1731 MAP_ANONYMOUS | MAP_SHARED, \-1, 0);
1732 if (iaddr == MAP_FAILED)
1733 errExit("mmap");
1734
1735 futex1 = &iaddr[0];
1736 futex2 = &iaddr[1];
1737
1738 *futex1 = 0; /* State: unavailable */
1739 *futex2 = 1; /* State: available */
1740
1741 /* Create a child process that inherits the shared anonymous
35764662 1742 mapping */
305cc415
MK
1743
1744 childPid = fork();
92a46690 1745 if (childPid == \-1)
305cc415
MK
1746 errExit("fork");
1747
1748 if (childPid == 0) { /* Child */
1749 for (j = 0; j < nloops; j++) {
1750 fwait(futex1);
1751 printf("Child (%ld) %d\\n", (long) getpid(), j);
1752 fpost(futex2);
1753 }
1754
1755 exit(EXIT_SUCCESS);
1756 }
1757
1758 /* Parent falls through to here */
1759
1760 for (j = 0; j < nloops; j++) {
1761 fwait(futex2);
1762 printf("Parent (%ld) %d\\n", (long) getpid(), j);
1763 fpost(futex1);
1764 }
1765
1766 wait(NULL);
1767
1768 exit(EXIT_SUCCESS);
1769}
1770.fi
47297adb 1771.SH SEE ALSO
4c222281 1772.ad l
9913033c 1773.BR get_robust_list (2),
d806bc05 1774.BR restart_syscall (2),
e0074751 1775.BR pthread_mutexattr_getprotocol (3),
ac894879
MK
1776.BR futex (7),
1777.BR sched (7)
fea681da 1778.PP
f5ad572f
MK
1779The following kernel source files:
1780.IP * 2
1781.I Documentation/pi-futex.txt
1782.IP *
1783.I Documentation/futex-requeue-pi.txt
1784.IP *
1785.I Documentation/locking/rt-mutex.txt
1786.IP *
1787.I Documentation/locking/rt-mutex-design.txt
8fe019c7
MK
1788.IP *
1789.I Documentation/robust-futex-ABI.txt
43b99089 1790.PP
4c222281 1791Franke, H., Russell, R., and Kirwood, M., 2002.
52087dd3 1792\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
4c222281 1793(from proceedings of the Ottawa Linux Symposium 2002),
9b936e9e 1794.br
608bf950
SK
1795.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
1796.UE
f42eb21b 1797
4c222281 1798Hart, D., 2009. \fIA futex overview and update\fP,
2ed26199
MK
1799.UR http://lwn.net/Articles/360699/
1800.UE
1801
4c222281 1802Hart, D. and Guniguntala, D., 2009.
0483b6cc 1803\fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
4c222281 1804(from proceedings of the 2009 Real-Time Linux Workshop),
0483b6cc
MK
1805.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
1806.UE
1807
4c222281 1808Drepper, U., 2011. \fIFutexes Are Tricky\fP,
f42eb21b
MK
1809.UR http://www.akkadia.org/drepper/futex.pdf
1810.UE
9b936e9e
MK
1811.PP
1812Futex example library, futex-*.tar.bz2 at
1813.br
a605264d 1814.UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
608bf950 1815.UE
34f14794
MK
1816.\"
1817.\" FIXME Are there any other resources that should be listed
1818.\" in the SEE ALSO section?
74f58a64 1819.\" FIXME(Torvald) We should probably refer to the glibc code here, in
4c8cb0ff
MK
1820.\" particular the glibc-internal futex wrapper functions that are
1821.\" WIP, and the generic pthread_mutex_t and perhaps condvar
1822.\" implementations.