.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
.\" may be freely modified and distributed
.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
.\" added ERRORS section.
.\" Modified 2004-06-17 mtk
.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
.\" 2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
.\" commit 52400ba946759af28442dee6265c5c0180ac7122
.\" Author: Darren Hart <dvhltc@us.ibm.com>
.\" Date: Fri Apr 3 13:40:49 2009 -0700
.\" commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
.\" Author: Darren Hart <dvhltc@us.ibm.com>
.\" Date: Mon Apr 20 22:22:22 2009 -0700
.\" See Documentation/futex-requeue-pi.txt
26 .TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
28 futex \- fast user-space locking
.B "#include <linux/futex.h>"
.B "#include <sys/time.h>"
.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
.BI "          const struct timespec *" timeout ,
.BI "          int *" uaddr2 ", int " val3 );
38 .\" int *? void *? u32 *?
42 There is no glibc wrapper for this system call; see NOTES.
47 system call provides a method for
48 a program to wait for a value at a given address to change, and a
49 method to wake up anyone waiting on a particular address (while the
50 addresses for the same memory in separate processes may not be
51 equal, the kernel maps them internally so the same memory mapped in
52 different locations will correspond for
55 This system call is typically used to
56 implement the contended case of a lock in shared memory, as
60 When a futex operation did not finish uncontended in user space, a
62 call needs to be made to the kernel to arbitrate.
63 Arbitration can either mean putting the calling
64 process to sleep or, conversely, waking a waiting process.
68 are expected to adhere to the semantics described in
71 semantics involve writing nonportable assembly instructions, this in turn
72 probably means that most users will in fact be library authors and not
73 general application developers.
77 argument points to an integer which stores the counter (futex).
78 On all platforms, futexes are four-byte integers that
79 must be aligned on a four-byte boundary.
80 The operation to perform on the futex is specified in the
is a value whose meaning and purpose depend on
87 The remaining arguments
92 are required only for certain of the futex operations described below.
93 Where one of these arguments is not required, it is ignored.
94 For several blocking operations, the
96 argument is a pointer to a
98 structure that specifies a timeout for the operation.
99 However, notwithstanding the prototype shown above, for some operations,
100 this argument is instead a four-byte integer whose meaning
101 is determined by the operation.
102 Where it is required,
104 is a pointer to a second futex that is employed by the operation.
105 The interpretation of the final integer argument,
107 depends on the operation.
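Because glibc provides no wrapper, callers usually reach this system call through
syscall(2). The following is a minimal sketch, assuming a Linux system that
provides <linux/futex.h>. The helper names futex_call() and futex_wake_one() are
hypothetical and are not part of any library.

```c
#include <linux/futex.h>   /* FUTEX_WAKE, FUTEX_PRIVATE_FLAG */
#include <stddef.h>        /* NULL */
#include <stdint.h>
#include <sys/syscall.h>   /* SYS_futex */
#include <time.h>          /* struct timespec */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: forwards its arguments unchanged to futex(2).
   Returns the raw result; on error that is -1 with errno set. */
static long
futex_call(uint32_t *uaddr, int futex_op, uint32_t val,
           const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}

/* Wake at most one waiter on a process-private futex word.
   The unused arguments are passed as NULL/0 because they are ignored. */
static long
futex_wake_one(uint32_t *uaddr)
{
    return futex_call(uaddr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, 1,
                      NULL, NULL, 0);
}
```

When no thread is waiting on the word, the wake call should report that zero
waiters were woken.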
111 argument consists of two parts:
112 a command that specifies the operation to be performed,
bit-wise ORed with zero or more options that
modify the behaviour of the operation.
115 The options that may be included in
119 .BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
120 .\" commit 34f01cc1f512fa783302982776895c73714ebbc2
121 This option bit can be employed with all futex operations.
122 It tells the kernel that the futex is process private and not shared
123 with another process.
124 This allows the kernel to choose the fast path for validating
125 the user-space address and avoids expensive VMA lookups,
126 taking reference counts on file backing store, and so on.
130 defines a set of constants with the suffix
132 that are equivalents of all of the operations listed below,
133 .\" except the obsolete FUTEX_FD, for which the "private" flag was
136 .BR FUTEX_PRIVATE_FLAG
137 ORed into the constant value.
139 .BR FUTEX_WAIT_PRIVATE ,
140 .BR FUTEX_WAKE_PRIVATE ,
143 .BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
144 .\" commit 1acdac104668a0834cfa267de9946fac7764d486
145 This option bit can be employed only with the
146 .BR FUTEX_WAIT_BITSET
148 .BR FUTEX_WAIT_REQUEUE_PI
151 If this option is set, the kernel treats
153 as an absolute time based on
156 If this option is not set, the kernel treats
159 .\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
164 The operation specified in
166 is one of the following:
168 .BR FUTEX_WAIT " (since Linux 2.6.0)"
169 .\" Strictly speaking, since some time in 2.5.x
170 This operation tests that the value at the
171 location pointed to by the futex address
173 still contains the value
175 and then sleeps awaiting
177 on the futex address.
178 The test and sleep steps are performed atomically.
179 If the futex value does not match
181 then the call fails immediately with the error
183 .\" FIXME I added the following sentence. Please confirm that it is correct.
184 The purpose of the test step is to detect races where
another process changes the value of the futex between
186 the time it was last checked and the time of the
192 argument is non-NULL, its contents specify a relative timeout for the wait
193 .\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
194 measured according to the
197 (This interval will be rounded up to the system clock granularity,
198 and kernel scheduling delays mean that the
199 blocking interval may overrun by a small amount.)
202 is NULL, the call blocks indefinitely.
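The test-then-sleep behaviour and the relative timeout can be observed from a
single thread. This is a minimal sketch, assuming a Linux system; the helper
futex_wait_ms() is hypothetical. It returns the errno value reported by the
kernel: EAGAIN when the futex word does not hold the expected value, and
ETIMEDOUT when the timeout expires without a wake-up.

```c
#include <errno.h>
#include <linux/futex.h>   /* FUTEX_WAIT */
#include <stddef.h>        /* NULL */
#include <stdint.h>
#include <sys/syscall.h>   /* SYS_futex */
#include <time.h>          /* struct timespec */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: sleep while *uaddr == expected, for at most
   timeout_ms milliseconds (a relative timeout).
   Returns 0 if woken, otherwise the errno value set by the kernel. */
static int
futex_wait_ms(uint32_t *uaddr, uint32_t expected, long timeout_ms)
{
    struct timespec ts;

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &ts, NULL, 0) == -1)
        return errno;
    return 0;
}
```

A caller would normally re-read the futex word after any return and loop,
since a wake-up alone does not prove that the awaited condition holds.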
212 this call is executed if decrementing the count gave a negative value
213 (indicating contention), and will sleep until another process releases
214 the futex and executes the
218 .BR FUTEX_WAKE " (since Linux 2.6.0)"
219 .\" Strictly speaking, since Linux 2.5.x
220 This operation wakes at most
222 processes waiting (i.e., inside
224 on the futex at the address
228 is specified as either 1 (wake up a single waiter) or
230 (wake up all waiters).
231 .\" FIXME Please confirm that the following is correct:
232 No guarantee is provided about which waiters are awoken
233 (e.g., a waiter with a higher scheduling priority is not guaranteed
234 to be awoken in preference to a waiter with a lower priority).
245 this is executed if incrementing
246 the count showed that there were waiters, once the futex value has been set
247 to 1 (indicating that it is available).
249 .BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
250 .\" Strictly speaking, from Linux 2.5.x to 2.6.25
251 This operation creates a file descriptor that is associated with the futex at
253 .\" , suitable for .BR poll (2).
254 The calling process must close the returned file descriptor after use.
255 When another process performs a
on the futex, the file descriptor is indicated as being readable with
263 The file descriptor can be used to obtain asynchronous notifications:
266 is nonzero, then when another process executes a
268 the caller will receive the signal number that was passed in
278 To prevent race conditions, the caller should test if the futex has
283 Because it was inherently racy,
286 .\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
287 from Linux 2.6.26 onward.
289 .BR FUTEX_REQUEUE " (since Linux 2.6.0)"
290 .\" Strictly speaking: from Linux 2.5.70
292 .\" FIXME I added this warning. Okay?
293 .IR "Avoid using this operation" .
294 It is broken (unavoidably racy) for its intended purpose.
296 .BR FUTEX_CMP_REQUEUE
299 This operation performs the same task as
300 .BR FUTEX_CMP_REQUEUE ,
301 except that no check is made using the value in
307 .BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
308 This operation was added as a replacement for the earlier
310 because that operation was racy for its intended use.
315 .BR FUTEX_CMP_REQUEUE
316 operation is used to avoid a "thundering herd" effect when
318 is used and all processes woken up need to acquire another futex.
321 in that it first checks whether the location
323 still contains the value
325 If not, the operation fails with the error
.\" FIXME I added the following sentence on rationale for FUTEX_CMP_REQUEUE.
.\" Is it correct? Should it be expanded?
329 This additional feature of
330 .BR FUTEX_CMP_REQUEUE
331 can be used by the caller to (atomically) detect changes
332 in the value of the target futex at
335 The operation wakes up a maximum of
337 waiters that are waiting on the futex at
339 If there are more than
341 waiters, then the remaining waiters are removed
342 from the wait queue of the source futex at
344 and added to the wait queue of the target futex at
348 argument is (ab)used to specify a cap on the number of waiters
349 that are requeued to the futex at
356 .\" FIXME Please review the following new paragraph to see if it is
358 Typical values to specify for
363 is not useful, because it would make the
364 .BR FUTEX_CMP_REQUEUE
365 operation equivalent to
367 The cap value specified via the (abused)
369 argument is typically either 1 or
371 (Specifying the argument as 0 is not useful, because it would make the
372 .BR FUTEX_CMP_REQUEUE
373 operation equivalent to
376 .\" FIXME I added some FUTEX_WAKE_OP text, and I'd be happy if someone
379 .BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
380 .\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
381 .\" Author: Jakub Jelinek <jakub@redhat.com>
382 .\" Date: Tue Sep 6 15:16:25 2005 -0700
383 This operation was added to support some user-space use cases
384 where more than one futex must be handled at the same time.
385 The most notable example is the implementation of
386 .BR pthread_cond_signal (3),
387 which requires operations on two futexes,
388 the one used to implement the mutex and the one used in the implementation
389 of the wait queue associated with the condition variable.
391 allows such cases to be implemented without leading to
392 high rates of contention and context switching.
396 operation is equivalent to atomically executing the following code:
int oldval = *(int *) uaddr2;
*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
if (oldval \fIcmp\fP \fIcmparg\fP)
    futex(uaddr2, FUTEX_WAKE, nr_wake2, 0, 0, 0);
413 saves the original value of the futex at
416 performs an operation to modify the value of the futex at
419 wakes up a maximum of
425 dependent on the results of a test of the original value of the futex at
427 wakes up a maximum of
435 value is actually the
438 argument (ab)used to specify how many of the waiters on the futex at
446 The operation and comparison that are to be performed are encoded
447 in the bits of the argument
449 Pictorially, the encoding is:
+---+---+-----------+-----------+
|op |cmp|   oparg   |  cmparg   |
+---+---+-----------+-----------+
  4   4       12          12    <== # of bits
460 Expressed in code, the encoding is:
#define FUTEX_OP(op, oparg, cmp, cmparg) \\
                (((op & 0xf) << 28) | \\
                ((cmp & 0xf) << 24) | \\
                ((oparg & 0xfff) << 12) | \\
                (cmparg & 0xfff))
476 are each one of the codes listed below.
481 components are literal numeric values, except as noted below.
485 component has one of the following values:
FUTEX_OP_SET        0  /* uaddr2 = oparg; */
FUTEX_OP_ADD        1  /* uaddr2 += oparg; */
FUTEX_OP_OR         2  /* uaddr2 |= oparg; */
FUTEX_OP_ANDN       3  /* uaddr2 &= ~oparg; */
FUTEX_OP_XOR        4  /* uaddr2 ^= oparg; */
497 In addition, bit-wise ORing the following value into
501 to be used as the operand:
FUTEX_OP_ARG_SHIFT  8  /* Use (1 << oparg) as operand */
511 field is one of the following:
FUTEX_OP_CMP_EQ     0  /* if (oldval == cmparg) wake */
FUTEX_OP_CMP_NE     1  /* if (oldval != cmparg) wake */
FUTEX_OP_CMP_LT     2  /* if (oldval < cmparg) wake */
FUTEX_OP_CMP_LE     3  /* if (oldval <= cmparg) wake */
FUTEX_OP_CMP_GT     4  /* if (oldval > cmparg) wake */
FUTEX_OP_CMP_GE     5  /* if (oldval >= cmparg) wake */
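The encoding and the modify/test steps described above can be modelled in plain
user-space code. This is only an illustrative sketch: the names wake_op_encode(),
wake_op_apply(), and wake_op_test() and the OP_*/CMP_* enumerators are
hypothetical stand-ins for the kernel's FUTEX_OP_* constants, and the model does
not cover FUTEX_OP_ARG_SHIFT or any sign extension of the 12-bit fields.

```c
#include <stdint.h>

/* Operation and comparison codes, numbered as in the lists above. */
enum { OP_SET = 0, OP_ADD = 1, OP_OR = 2, OP_ANDN = 3, OP_XOR = 4 };
enum { CMP_EQ = 0, CMP_NE = 1, CMP_LT = 2,
       CMP_LE = 3, CMP_GT = 4, CMP_GE = 5 };

/* Pack the four fields: op in bits 28-31, cmp in bits 24-27,
   oparg in bits 12-23, cmparg in bits 0-11. */
static uint32_t
wake_op_encode(uint32_t op, uint32_t oparg, uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* Model of the modify step: returns the new value for the second futex.
   Unsigned arithmetic is used so that ADD wraps instead of overflowing. */
static int32_t
wake_op_apply(int32_t oldval, uint32_t op, int32_t oparg)
{
    uint32_t o = (uint32_t) oldval, a = (uint32_t) oparg;

    switch (op) {
    case OP_SET:  return (int32_t) a;
    case OP_ADD:  return (int32_t) (o + a);
    case OP_OR:   return (int32_t) (o | a);
    case OP_ANDN: return (int32_t) (o & ~a);
    case OP_XOR:  return (int32_t) (o ^ a);
    default:      return oldval;      /* unknown code: leave unchanged */
    }
}

/* Model of the test step: nonzero means "also wake waiters on uaddr2". */
static int
wake_op_test(int32_t oldval, uint32_t cmp, int32_t cmparg)
{
    switch (cmp) {
    case CMP_EQ: return oldval == cmparg;
    case CMP_NE: return oldval != cmparg;
    case CMP_LT: return oldval <  cmparg;
    case CMP_LE: return oldval <= cmparg;
    case CMP_GT: return oldval >  cmparg;
    case CMP_GE: return oldval >= cmparg;
    default:     return 0;
    }
}
```

For example, "add 1 to the second futex and wake its waiters if the old value
was greater than 0" is encoded as wake_op_encode(OP_ADD, 1, CMP_GT, 0).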
526 is the sum of the number of waiters woken on the futex
528 plus the number of waiters woken on the futex
531 .BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
532 .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
533 This operation is like
537 is used to provide a 32-bit bitset to the kernel.
538 This bitset is stored in the kernel-internal state of the waiter.
539 See the description of
540 .BR FUTEX_WAKE_BITSET
544 .BR FUTEX_WAIT_BITSET
547 argument differently from
549 See the discussion of
550 .BR FUTEX_CLOCK_REALTIME ,
557 .BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
558 .\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
559 This operation is the same as
563 argument is used to provide a 32-bit bitset to the kernel.
564 This bitset is used to select which waiters should be woken up.
565 The selection is done by a bit-wise AND of the "wake" bitset
568 and the bitset which is stored in the kernel-internal
569 state of the waiter (the "wait" bitset that is set using
570 .BR FUTEX_WAIT_BITSET ).
571 All of the waiters for which the result of the AND is nonzero are woken up;
572 the remaining waiters are left sleeping.
574 .\" FIXME please review this paragraph that I added
576 .BR FUTEX_WAIT_BITSET
578 .BR FUTEX_WAKE_BITSET
579 is to allow selective wake-ups among multiple waiters that are waiting
581 since a futex has a size of 32 bits,
582 these operations provide 32 wakeup "channels".
587 operations correspond to
588 .BR FUTEX_WAIT_BITSET
590 .BR FUTEX_WAKE_BITSET
591 operations where the bitsets are all ones.)
592 Note, however, that using this bitset multiplexing feature on a
593 futex is less efficient than simply using multiple futexes,
594 because employing bitset multiplexing requires the kernel
595 to check all waiters on a futex,
596 including those that are not interested in being woken up
597 (i.e., they do not have the relevant bit set in their "wait" bitset).
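The selection rule can be illustrated with a small user-space model. This is a
sketch under stated assumptions, not kernel code: struct waiter and
wake_bitset_model() are hypothetical, and the array stands in for the wait
queue of a single futex.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of one sleeping waiter and the "wait" bitset it supplied. */
struct waiter {
    uint32_t wait_bitset;
    int      woken;
};

/* Mark at most max_wake waiters whose wait bitset shares at least one
   bit with wake_bitset. Returns the number of waiters marked as woken.
   Waiters that do not match are still examined, which is the cost the
   text above attributes to bitset multiplexing. */
static int
wake_bitset_model(struct waiter *w, size_t n, int max_wake,
                  uint32_t wake_bitset)
{
    int count = 0;

    for (size_t i = 0; i < n && count < max_wake; i++) {
        if (!w[i].woken && (w[i].wait_bitset & wake_bitset) != 0) {
            w[i].woken = 1;
            count++;
        }
    }
    return count;
}
```

A waiter that supplied an all-ones bitset matches every nonzero wake bitset,
which mirrors the plain wait and wake operations.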
598 .\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
600 .\" "The original reason for the addition of these extensions
601 .\" was to improve the performance of pthread read-write locks
602 .\" in glibc. However, the pthreads library no longer uses the
603 .\" same locking algorithm, and these extensions are not used
604 .\" without the bitset parameter being all ones.
606 .\" The page goes on to note that the FUTEX_WAIT_BITSET operation
607 .\" is nevertheless used (with a bitset of all ones) in order to
608 .\" obtain the absolute timeout functionality that is useful
609 .\" for efficiently implementing Pthreads APIs (which use absolute
610 .\" timeouts); FUTEX_WAIT provides only relative timeouts.
616 arguments are ignored.
619 .SS Priority-inheritance futexes
620 Linux supports priority-inheritance (PI) futexes in order to handle
621 priority-inversion problems that can be encountered with
624 .\" FIXME ===== Start of adapted Hart/Guniguntala text =====
625 .\" The following text is drawn from the Hart/Guniguntala paper,
626 .\" but I have reworded some pieces significantly. Please check it.
628 The PI futex operations described below differ from the other
629 futex operations in that they impose policy on the use of the futex value:
631 If the lock is unowned, the futex value shall be 0.
633 If the lock is owned, the futex value shall be the thread ID (TID; see
635 of the owning thread.
637 .\" FIXME In the following line, I added "the lock is owned and". Okay?
638 If the lock is owned and there are threads contending for the lock,
641 bit shall be set in the futex value; in other words, the futex value is:
645 With this policy in place,
646 a user-space application can acquire an unowned
lock or release an uncontended lock using atomic
648 .\" FIXME In the following line, I added "user-space". Okay?
649 user-space instructions (e.g.,
651 on the x86 architecture).
652 Locking an unowned lock simply consists of setting
653 the futex value to the caller's TID.
654 Releasing an uncontended lock simply requires setting the futex value to 0.
656 If a futex is currently owned (i.e., has a nonzero value),
657 waiters must employ the
659 operation to acquire the lock.
660 If a lock is contended (i.e., the
662 bit is set in the futex value), the lock owner must employ the
664 operation to release the lock.
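The policy above suggests the following shape for a lock built on a PI futex.
This is a minimal sketch, assuming GCC or Clang (for the __atomic builtins) and
a Linux system; the helpers my_tid(), pi_lock(), and pi_unlock() are
hypothetical. The fast paths stay in user space, and the kernel is entered only
when the compare-and-exchange fails.

```c
#include <linux/futex.h>   /* FUTEX_LOCK_PI, FUTEX_UNLOCK_PI */
#include <stddef.h>        /* NULL */
#include <stdint.h>
#include <sys/syscall.h>   /* SYS_futex, SYS_gettid */
#include <unistd.h>        /* syscall() */

/* Kernel thread ID of the caller. */
static uint32_t
my_tid(void)
{
    return (uint32_t) syscall(SYS_gettid);
}

/* Acquire: try to change the futex value from 0 to the caller's TID.
   If the lock is already owned, fall back to FUTEX_LOCK_PI.
   Returns 0 on success, -1 on error (errno set by the kernel). */
static int
pi_lock(uint32_t *futex_word)
{
    uint32_t expected = 0;

    if (__atomic_compare_exchange_n(futex_word, &expected, my_tid(), 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;
    return (int) syscall(SYS_futex, futex_word, FUTEX_LOCK_PI,
                         0, NULL, NULL, 0);
}

/* Release: try to change the futex value from the caller's TID to 0.
   If other bits are set (the lock is contended), fall back to
   FUTEX_UNLOCK_PI so that the kernel can hand the lock over. */
static int
pi_unlock(uint32_t *futex_word)
{
    uint32_t expected = my_tid();

    if (__atomic_compare_exchange_n(futex_word, &expected, 0, 0,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;
    return (int) syscall(SYS_futex, futex_word, FUTEX_UNLOCK_PI,
                         0, NULL, NULL, 0);
}
```

In the uncontended case neither function issues a futex call, so the futex
value simply moves between 0 and the owner's TID.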
666 In the cases where callers are forced into the kernel
667 (i.e., required to perform a
670 they then deal directly with a so-called RT-mutex,
671 a kernel locking mechanism which implements the required
672 priority-inheritance semantics.
673 After the RT-mutex is acquired, the futex value is updated accordingly,
674 before the calling thread returns to user space.
675 .\" FIXME ===== End of adapted Hart/Guniguntala text =====
678 .\" FIXME We need some explanation here of why it is important to note this
679 to note that the kernel will update the futex value prior
680 to returning to user space.
(Unlike the other futex operations described above,
the PI futex operations are designed
for the implementation of very specific IPC mechanisms.)
685 .\" FIXME We don't quite have a definition anywhere of what a PI futex
686 .\" is (vs a non-PI futex). Below, we have the information of
687 .\" FUTEX_CMP_REQUEUE_PI requeues from a non-PI futex to a
688 .\" PI futex, but what determines whether the futex is of one
689 .\" kind of the other? We should have such a definition somewhere
692 PI futexes are operated on by specifying one of the following values in
695 .BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
696 .\" commit c87e2837be82df479a6bae9f155c43516d2feebc
698 .\" FIXME I did some significant rewording of tglx's text.
699 .\" Please check, in case I injected errors.
This operation is used after an attempt to acquire
702 the futex lock via an atomic user-space instruction failed
703 because the futex has a nonzero value\(emspecifically,
704 because it contained the namespace-specific TID of the lock owner.
705 .\" FIXME In the preceding line, what does "namespace-specific" mean?
706 .\" (I kept those words from tglx.)
707 .\" That is, what kind of namespace are we talking about?
708 .\" (I suppose we are talking PID namespaces here, but I want to
711 The operation checks the value of the futex at the address
713 If the value is 0, then the kernel tries to atomically set the futex value to
716 .\" FIXME What would be the cause of failure?
717 or the futex value is nonzero,
718 the kernel atomically sets the
720 bit, which signals the futex owner that it cannot unlock the futex in
721 user space atomically by setting the futex value to 0.
722 After that, the kernel tries to find the thread which is
723 associated with the owner TID,
724 .\" FIXME Could I get a bit more detail on the next two lines?
725 .\" What is "creates or reuses kernel state" about?
726 creates or reuses kernel state on behalf of the owner
727 and attaches the waiter to it.
728 .\" FIXME In the next line, what type of "priority" are we talking about?
729 .\" Realtime priorities for SCHED_FIFO and SCHED_RR?
730 .\" Or something else?
The enqueueing of the waiter is in descending priority order if more
732 than one waiter exists.
733 .\" FIXME What does "bandwidth" refer to in the next line?
734 The owner inherits either the priority or the bandwidth of the waiter.
735 .\" FIXME In the preceding line, what determines whether the
736 .\" owner inherits the priority versus the bandwidth?
738 .\" FIXME Could I get some help translating the next sentence into
739 .\" something that user-space developers (and I) can understand?
.\" In particular, what are "nested locks" in this context?
741 This inheritance follows the lock chain in the case of
742 nested locking and performs deadlock detection.
744 .\" FIXME tglx says "The timeout argument is handled as described in
745 .\" FUTEX_WAIT." However, it appears to me that this is not right.
746 .\" Is the following formulation correct.
749 argument provides a timeout for the lock attempt.
750 It is interpreted as an absolute time, measured against the
755 is NULL, the operation will block indefinitely.
762 arguments are ignored.
764 .\" tglx noted the following "ERROR" case for FUTEX_LOCK_PI and
766 .\" > [EOWNERDIED] The owner of the futex died and the kernel made the
767 .\" > caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit
768 .\" > in the futex userspace value. Caller is responsible for cleanup
770 .\" However, there is no such thing as an EOWNERDIED error. I had a look
771 .\" through the kernel source for the FUTEX_OWNER_DIED cases and didn't
772 .\" see an obvious error associated with them. Can you clarify? (I think
773 .\" the point is that this condition, which is described in
774 .\" Documentation/robust-futexes.txt, is not an error as such. However,
775 .\" I'm not yet sure of how to describe it in the man page.)
778 .BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
779 .\" commit c87e2837be82df479a6bae9f155c43516d2feebc
780 This operation tries to acquire the futex at
782 .\" FIXME I think it would be helpful here to say a few more words about
783 .\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI
784 It deals with the situation where the TID value at
789 .\" FIXME How does the situation in the previous sentence come about?
790 .\" Probably it would be helpful to say something about that in
792 .\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?
User space cannot handle this condition in a race-free manner.
801 arguments are ignored.
803 .BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
804 .\" commit c87e2837be82df479a6bae9f155c43516d2feebc
This operation wakes the top-priority waiter that is waiting in
807 on the futex address provided by the
This is called when the user-space value at
813 cannot be changed atomically from a TID (of the owner) to 0.
821 arguments are ignored.
823 .BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
824 .\" commit 52400ba946759af28442dee6265c5c0180ac7122
825 .\" FIXME to complete
826 This operation is a PI-aware variant of
827 .BR FUTEX_CMP_REQUEUE .
828 It requeues waiters that are blocked via
829 .B FUTEX_WAIT_REQUEUE_PI
832 from a non-PI source futex
838 .BR FUTEX_CMP_REQUEUE ,
839 this operation wakes up a maximum of
841 waiters that are waiting on the futex at
844 .BR FUTEX_CMP_REQUEUE_PI ,
847 The remaining waiters are removed from the wait queue of the source futex at
849 and added to the wait queue of the target futex at
856 arguments serve the same purposes as for
857 .BR FUTEX_CMP_REQUEUE .
858 .\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/
859 .\" notes that "priority-inheritance Futex to priority-inheritance
860 .\" Futex requeues are currently unsupported". Do we need to say
861 .\" something in the man page about that?
863 .BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
864 .\" commit 52400ba946759af28442dee6265c5c0180ac7122
865 .\" FIXME to complete
867 .\" FIXME Employs 'timeout' argument, supports FUTEX_CLOCK_REALTIME
868 .\" 'timeout' can be NULL
870 [As yet undocumented]
873 In the event of an error, all operations return \-1 and set
875 to indicate the cause of the error.
876 The return value on success depends on the operation,
877 as described in the following list:
880 Returns 0 if the process was woken by a
887 Returns the number of processes woken up.
890 Returns the new file descriptor associated with the futex.
893 Returns the number of processes woken up.
896 Returns the total number of processes woken up or requeued to the futex at
898 If this value is greater than
then the difference is the number of waiters requeued to the futex at
903 .\" FIXME Add success returns for other operations
906 .\" FIXME Is the following correct?
907 Returns the total number of waiters that were woken up.
908 This is the sum of the woken waiters on the two futexes at
914 .\" FIXME Is the following correct?
915 Returns 0 if the process was woken by a
922 .\" FIXME Is the following correct?
923 Returns the number of processes woken up.
926 .\" FIXME Is the following correct?
927 Returns 0 if the futex was successfully locked.
930 .\" FIXME Is the following correct?
931 Returns 0 if the futex was successfully locked.
934 .\" FIXME Is the following correct?
935 Returns 0 if the futex was successfully unlocked.
937 .B FUTEX_CMP_REQUEUE_PI
938 .\" FIXME Is the following correct?
939 Returns the total number of processes woken up or requeued to the futex at
941 If this value is greater than
then the difference is the number of waiters requeued to the futex at
946 .B FUTEX_WAIT_REQUEUE_PI
947 .\" FIXME Is the following correct?
948 Returns 0 if the caller was successfully requeued to the futex at
953 No read access to futex memory.
957 The value pointed to by
959 was not equal to the expected value
961 at the time of the call.
965 detected that the value pointed to by
967 is not equal to the expected value
969 .\" FIXME: Is the following sentence correct?
970 (This probably indicates a race;
975 .\" FIXME Should there be an EAGAIN case for FUTEX_TRYLOCK_PI?
976 .\" It seems so, looking at the handling of the rt_mutex_trylock()
977 .\" call in futex_lock_pi()
981 .RB ( FUTEX_LOCK_PI ,
982 .BR FUTEX_TRYLOCK_PI )
The thread that owns the futex is about to exit,
but has not yet handled the internal state cleanup.
987 .\" FIXME Is there not also an EAGAIN error case on 'uaddr2' for
988 .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
989 .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
990 .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EAGAIN?
993 .RB ( FUTEX_LOCK_PI ,
994 .BR FUTEX_TRYLOCK_PI )
997 is already locked by the caller.
999 .\" FIXME Is there not also an EDEADLK error case on 'uaddr2' for
1000 .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1001 .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1002 .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EDEADLK?
1005 A required pointer argument (i.e.,
1010 did not point to a valid user-space address.
1016 .B FUTEX_WAIT_BITSET
1017 operation was interrupted by a signal (see
1019 or a spurious wakeup.
1024 is one of those that employs a timeout, but the supplied
1026 argument was invalid
1028 was less than zero, or
was not less than 1,000,000,000).
1033 The operation specified in
1035 employs one or both of the pointers
1039 but one of these does not point to a valid object\(emthat is,
1040 the address is not four-byte-aligned.
1045 .BR FUTEX_WAKE_BITSET ,
1047 .BR FUTEX_CMP_REQUEUE )
1048 The kernel detected an inconsistency between the user-space state at
1050 and the kernel state\(emthat is, it detected a waiter which waits in
1056 .RB ( FUTEX_WAIT_BITSET ,
1057 .BR FUTEX_WAKE_BITSET )
1058 The bitset supplied in
1063 .RB ( FUTEX_REQUEUE )
1064 .\" FIXME tglx suggested adding this, but does this error really
1065 .\" occur for FUTEX_REQUEUE?
1069 (i.e., an attempt was made to requeue to the same futex).
1073 The signal number supplied in
1078 .RB ( FUTEX_LOCK_PI ,
1079 .BR FUTEX_TRYLOCK_PI ,
1080 .BR FUTEX_UNLOCK_PI )
1081 The kernel detected an inconsistency between the user-space state at
1083 and the kernel state.
1084 This indicates either state corruption
1085 .\" FIXME tglx did not mention the "state corruption" for FUTEX_UNLOCK_PI.
1086 .\" Does that case also apply for FUTEX_UNLOCK_PI?
1087 or that the kernel found a waiter on
1089 which is waiting via
1092 .BR FUTEX_WAIT_BITSET .
1098 .RB ( FUTEX_LOCK_PI ,
1099 .BR FUTEX_TRYLOCK_PI ,
1100 .BR FUTEX_CMP_REQUEUE_PI )
1101 The kernel could not allocate memory to hold state information.
1105 The system limit on the total number of open files has been reached.
1108 Invalid operation specified in
1113 .BR FUTEX_CLOCK_REALTIME
1114 option was specified in
1116 but the accompanying operation was neither
1117 .BR FUTEX_WAIT_BITSET
1119 .BR FUTEX_WAIT_REQUEUE_PI .
1122 .RB ( FUTEX_LOCK_PI ,
1123 .BR FUTEX_TRYLOCK_PI ,
1124 .BR FUTEX_UNLOCK_PI )
A run-time check determined that the operation is not available.
1128 .BR FUTEX_TRYLOCK_PI
are not implemented on all architectures and
are not supported on some CPU variants.
1133 .RB ( FUTEX_LOCK_PI ,
1134 .BR FUTEX_TRYLOCK_PI )
1135 The caller is not allowed to attach itself to the futex.
1136 (This may be caused by a state corruption in user space.)
1138 .\" FIXME Is there not also an EPERM error case on 'uaddr2' for
1139 .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1140 .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1141 .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EPERM?
1145 The caller does not own the futex.
1148 .RB ( FUTEX_LOCK_PI ,
1149 .BR FUTEX_TRYLOCK_PI )
1150 .\" FIXME I reworded the following sentence a bit differently from
1151 .\" tglx's formulation. Is it okay?
1152 The thread ID in the futex at
1156 .\" FIXME Is there not also an ESRCH error case on 'uaddr2' for
1157 .\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1158 .\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1159 .\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> ESRCH?
1164 employed the timeout specified in
1166 and the timeout expired before the operation completed.
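The timeout checks listed in the error descriptions above can be mirrored in
user space before making the call. This is a minimal sketch; the helper
futex_timeout_valid() is hypothetical, and its rejection of a negative
nanoseconds field is an assumption that goes beyond the wording above.

```c
#include <stddef.h>   /* NULL */
#include <time.h>     /* struct timespec */

/* Returns 1 if the timeout looks acceptable, 0 otherwise.
   A NULL pointer is accepted because it means "no timeout". */
static int
futex_timeout_valid(const struct timespec *ts)
{
    if (ts == NULL)
        return 1;
    return ts->tv_sec >= 0 &&
           ts->tv_nsec >= 0 &&            /* assumption: negative is invalid */
           ts->tv_nsec < 1000000000L;     /* must be below one second */
}
```

Checking the structure first lets a caller distinguish a malformed timeout from
other causes of a failed call.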
1169 Futexes were first made available in a stable kernel release
Initial futex support was merged in Linux 2.5.7, but with semantics
that differ from those described above.
1174 A four-argument system call with the semantics
1175 described in this page was introduced in Linux 2.5.40.
1176 In Linux 2.5.70, one argument
1178 In Linux 2.6.7, a sixth argument was added\(emmessy, especially
1179 on the s390 architecture.
1181 This system call is Linux-specific.
1184 To reiterate, bare futexes are not intended as an easy-to-use abstraction
1186 (There is no wrapper function for this system call in glibc.)
1187 Implementors are expected to be assembly literate and to have
1188 read the sources of the futex user-space library referenced below.
1191 .\" Futexes were designed and worked on by
1192 .\" Hubertus Franke (IBM Thomas J. Watson Research Center),
1193 .\" Matthew Kirkwood, Ingo Molnar (Red Hat)
1194 .\" and Rusty Russell (IBM Linux Technology Center).
1195 .\" This page written by bert hubert.
1197 .BR get_robust_list (2),
1198 .BR restart_syscall (2),
1201 The following kernel source files:
1203 .I Documentation/pi-futex.txt
1205 .I Documentation/futex-requeue-pi.txt
1207 .I Documentation/locking/rt-mutex.txt
1209 .I Documentation/locking/rt-mutex-design.txt
1211 \fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
1212 (proceedings of the Ottawa Linux Symposium 2002), online at
1214 .UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
1217 \fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
1218 (2009 Real-Time Linux Workshop)
1219 .UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
1222 \fIFutexes Are Tricky\fP (updated in 2011), Ulrich Drepper
1223 .UR http://www.akkadia.org/drepper/futex.pdf
1226 Futex example library, futex-*.tar.bz2 at
1228 .UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
1231 .\" FIXME Are there any other resources that should be listed
1232 .\" in the SEE ALSO section?