8f0aff2a 1.\" Page by b.hubert
1abce893
MK
2.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
3.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
2297bf0e 4.\"
2e46a6e7 5.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
8f0aff2a 6.\" may be freely modified and distributed
8ff7380d 7.\" %%%LICENSE_END
fea681da
MK
8.\"
9.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
10.\" added ERRORS section.
11.\"
12.\" Modified 2004-06-17 mtk
13.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
14.\"
c13182ef
MK
15.\" 2.6.18 adds (Ingo Molnar) priority inheritance support:
16.\" FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, and FUTEX_TRYLOCK_PI. These need
34f7665a
MK
17.\" to be documented in the manual page. Probably there is sufficient
18.\" material in the kernel source file Documentation/pi-futex.txt.
4f58b197
MK
19.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
20.\" Author: Ingo Molnar <mingo@elte.hu>
21.\" Date: Tue Jun 27 02:54:58 2006 -0700
22.\"
23.\" commit e2970f2fb6950183a34e8545faa093eb49d186e1
24.\" Author: Ingo Molnar <mingo@elte.hu>
25.\" Date: Tue Jun 27 02:54:47 2006 -0700
26.\"
27b38e1c 27.\" See Documentation/pi-futex.txt
4f58b197 28.\"
4f58b197
MK
29.\" 2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
30.\" commit 52400ba946759af28442dee6265c5c0180ac7122
31.\" Author: Darren Hart <dvhltc@us.ibm.com>
32.\" Date: Fri Apr 3 13:40:49 2009 -0700
33.\"
34.\" commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
35.\" Author: Darren Hart <dvhltc@us.ibm.com>
36.\" Date: Mon Apr 20 22:22:22 2009 -0700
37.\"
38.\" See Documentation/futex-requeue-pi.txt
34f7665a 39.\"
3d155313 40.TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
fea681da 41.SH NAME
ce154705 42futex \- fast user-space locking
fea681da 43.SH SYNOPSIS
9d9dc1e8 44.nf
fea681da
MK
45.sp
46.B "#include <linux/futex.h>"
fea681da
MK
47.B "#include <sys/time.h>"
48.sp
d33602c4
MK
49.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
50.BI " const struct timespec *" timeout ,
9d9dc1e8 51.BI " int *" uaddr2 ", int " val3 );
fea681da 52.\" int *? void *? u32 *?
9d9dc1e8 53.fi
409f08b0 54
b939d6e4
MK
55.IR Note :
56There is no glibc wrapper for this system call; see NOTES.
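Because glibc provides no wrapper, callers usually reach the system call through syscall(2).
The following is a minimal sketch under that assumption.
The names futex and futex_wake_one, and the use of uint32_t for the futex word, are illustrative choices and not part of any library interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Futex words are four bytes wide on all platforms. */
static_assert(sizeof(uint32_t) == 4, "futex word must be four bytes");

/* Illustrative wrapper: the name and argument types are assumptions. */
static long
futex(uint32_t *uaddr, int futex_op, uint32_t val,
      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}

/* Wake at most one waiter; returns the number of waiters woken. */
static long
futex_wake_one(uint32_t *uaddr)
{
    return futex(uaddr, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

Calling futex_wake_one() on a word that nobody is waiting on is harmless and returns 0.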
47297adb 57.SH DESCRIPTION
fea681da
MK
58.PP
59The
e511ffb6 60.BR futex ()
fea681da
MK
61system call provides a method for
62a program to wait for a value at a given address to change, and a
63method to wake up anyone waiting on a particular address (while the
64addresses for the same memory in separate processes may not be
65equal, the kernel maps them internally so the same memory mapped in
66different locations will correspond for
e511ffb6 67.BR futex ()
c13182ef 68calls).
fd3fa7ef 69This system call is typically used to
fea681da
MK
70implement the contended case of a lock in shared memory, as
71described in
a8bda636 72.BR futex (7).
fea681da 73.PP
f388ba70
MK
74When a futex operation did not finish uncontended in user space, a
75.BR futex ()
76call needs to be made to the kernel to arbitrate.
c13182ef 77Arbitration can either mean putting the calling
fea681da
MK
78process to sleep or, conversely, waking a waiting process.
79.PP
f388ba70
MK
80Callers of
81.BR futex ()
82are expected to adhere to the semantics described in
a8bda636 83.BR futex (7).
fea681da 84As these
d603cc27 85semantics involve writing nonportable assembly instructions, this in turn
fea681da
MK
86probably means that most users will in fact be library authors and not
87general application developers.
88.PP
89The
90.I uaddr
f388ba70
MK
91argument points to an integer which stores the counter (futex).
92On all platforms, futexes are four-byte integers that
93must be aligned on a four-byte boundary.
94The operation to perform on the futex is specified in the
95.I futex_op
96argument;
97.IR val
98is a value whose meaning and purpose depends on
99.IR futex_op .
36ab2074
MK
100
101The remaining arguments
102.RI ( timeout ,
103.IR uaddr2 ,
104and
105.IR val3 )
106are required only for certain of the futex operations described below.
107Where one of these arguments is not required, it is ignored.
108For several blocking operations, the
109.I timeout
110argument is a pointer to a
111.IR timespec
112structure that specifies a timeout for the operation.
113However, notwithstanding the prototype shown above, for some operations,
114this argument is instead a four-byte integer whose meaning
115is determined by the operation.
116Where it is required,
117.IR uaddr2
118is a pointer to a second futex that is employed by the operation.
119The interpretation of the final integer argument,
120.IR val3 ,
121depends on the operation.
122
6be4bad7 123The
d33602c4 124.I futex_op
6be4bad7
MK
125argument consists of two parts:
126a command that specifies the operation to be performed,
127bit-wise ORed with zero or more options that
128modify the behaviour of the operation.
fc30eb79 129The options that may be included in
d33602c4 130.I futex_op
fc30eb79
TG
131are as follows:
132.TP
133.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
134.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
135This option bit can be employed with all futex operations.
136It tells the kernel that the futex is process private and not shared
137with another process.
138This allows the kernel to choose the fast path for validating
139the user-space address and avoids expensive VMA lookups,
140taking reference counts on file backing store, and so on.
ae2c1774
MK
141
142As a convenience,
143.IR <linux/futex.h>
144defines a set of constants with the suffix
145.BR _PRIVATE
146that are equivalents of all of the operations listed below,
dcdfde26 147.\" except the obsolete FUTEX_FD, for which the "private" flag was
ae2c1774
MK
148.\" meaningless
149but with the
150.BR FUTEX_PRIVATE_FLAG
151ORed into the constant value.
152Thus, there are
153.BR FUTEX_WAIT_PRIVATE ,
154.BR FUTEX_WAKE_PRIVATE ,
155and so on.
2e98bbc2
TG
156.TP
157.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
158.\" commit 1acdac104668a0834cfa267de9946fac7764d486
4a7e5b05 159This option bit can be employed only with the
2e98bbc2
TG
160.BR FUTEX_WAIT_BITSET
161and
162.BR FUTEX_WAIT_REQUEUE_PI
163operations (described below).
164
f2103b26
MK
165If this option is set, the kernel treats
166.I timeout
167as an absolute time based on
2e98bbc2
TG
168.BR CLOCK_REALTIME .
169
f2103b26
MK
170If this option is not set, the kernel treats
171.I timeout
172as an absolute time
1c952cf5
MK
173.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
174measured against the
175.BR CLOCK_MONOTONIC
176clock.
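As a sketch of how the option bits combine with a command, the fragment below waits on a process-private futex until an absolute CLOCK_REALTIME deadline.
The helper names are hypothetical.
FUTEX_BITSET_MATCH_ANY is taken from <linux/futex.h> and is assumed here to select all wake-up bits.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* The _PRIVATE constants are the plain operations with the flag ORed in. */
static_assert(FUTEX_WAIT_PRIVATE == (FUTEX_WAIT | FUTEX_PRIVATE_FLAG),
              "FUTEX_WAIT_PRIVATE includes FUTEX_PRIVATE_FLAG");

/*
 * Wait on a process-private futex until an absolute CLOCK_REALTIME
 * deadline.  Returns 0 on wake-up, or the errno value on failure.
 */
static int
wait_until(uint32_t *uaddr, uint32_t expected, const struct timespec *deadline)
{
    int op = FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME;
    long r = syscall(SYS_futex, uaddr, op, expected, deadline, NULL,
                     FUTEX_BITSET_MATCH_ANY);

    return r == -1 ? errno : 0;
}

/* A deadline that has already passed makes the wait time out at once. */
static int
wait_until_one_second_ago(void)
{
    uint32_t word = 0;
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec -= 1;
    return wait_until(&word, 0, &deadline);
}
```

Because the deadline is absolute, a caller that retries after EINTR can pass the same timespec again without recomputing the remaining time.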
6be4bad7
MK
177.PP
178The operation specified in
d33602c4 179.I futex_op
6be4bad7 180is one of the following:
fea681da 181.TP
81c9d87e
MK
182.BR FUTEX_WAIT " (since Linux 2.6.0)"
183.\" Strictly speaking, since some time in 2.5.x
f065673c
MK
184This operation tests that the value at the
185location pointed to by the futex address
fea681da
MK
186.I uaddr
187still contains the value
188.IR val ,
f065673c 189and then sleeps awaiting
682edefb 190.B FUTEX_WAKE
f065673c
MK
191on the futex address.
192The test and sleep steps are performed atomically.
193If the futex value does not match
194.IR val ,
4710334a 195then the call fails immediately with the error
f065673c
MK
196.BR EWOULDBLOCK .
197.\" FIXME I added the following sentence. Please confirm that it is correct.
198The purpose of the test step is to detect races where
199another process changes the value of the futex between
200the time it was last checked and the time of the
201.BR FUTEX_WAIT
63d3f911 202operation.
1909e523 203
c13182ef 204If the
fea681da 205.I timeout
1c952cf5
MK
206argument is non-NULL, its contents specify a relative timeout for the wait
207.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
208measured according to the
209.BR CLOCK_MONOTONIC
210clock.
82a6092b
MK
211(This interval will be rounded up to the system clock granularity,
212and kernel scheduling delays mean that the
213blocking interval may overrun by a small amount.)
214If
215.I timeout
216is NULL, the call blocks indefinitely.
4798a7f3 217
c13182ef 218The arguments
fea681da
MK
219.I uaddr2
220and
221.I val3
222are ignored.
223
224For
a8bda636 225.BR futex (7),
fea681da
MK
226this call is executed if decrementing the count gave a negative value
227(indicating contention), and will sleep until another process releases
682edefb
MK
228the futex and executes the
229.B FUTEX_WAKE
230operation.
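The two failure modes described above can be observed without a second thread.
This is a minimal sketch that assumes no signal arrives during the call; wait_if_equal and futex_wait_demo are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Returns 0 after a wake-up, otherwise the errno value from FUTEX_WAIT. */
static int
wait_if_equal(uint32_t *uaddr, uint32_t expected, const struct timespec *rel)
{
    long r = syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, rel, NULL, 0);

    return r == -1 ? errno : 0;
}

static void
futex_wait_demo(void)
{
    uint32_t word = 1;
    struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000 };

    /* Value mismatch: the kernel refuses to sleep (EWOULDBLOCK == EAGAIN). */
    assert(wait_if_equal(&word, 0, NULL) == EWOULDBLOCK);

    /* Value matches and nobody wakes us: the relative timeout expires. */
    assert(wait_if_equal(&word, 1, &one_ms) == ETIMEDOUT);
}
```

In a real lock, both results simply send the caller back to re-read the futex word and decide whether to wait again.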
fea681da 231.TP
81c9d87e
MK
232.BR FUTEX_WAKE " (since Linux 2.6.0)"
233.\" Strictly speaking, since Linux 2.5.x
f065673c
MK
234This operation wakes at most
235.I val
236processes waiting (i.e., inside
237.BR FUTEX_WAIT )
238on the futex at the address
239.IR uaddr .
240Most commonly,
241.I val
242is specified as either 1 (wake up a single waiter) or
243.BR INT_MAX
244(wake up all waiters).
730bfbda
MK
245.\" FIXME Please confirm that the following is correct:
246No guarantee is provided about which waiters are awoken
247(e.g., a waiter with a higher scheduling priority is not guaranteed
248to be awoken in preference to a waiter with a lower priority).
4798a7f3 249
fea681da
MK
250The arguments
251.IR timeout ,
252.I uaddr2
253and
254.I val3
255are ignored.
256
257For
a8bda636 258.BR futex (7),
fea681da
MK
259this is executed if incrementing
260the count showed that there were waiters, once the futex value has been set
261to 1 (indicating that it is available).
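A wait and a wake can be paired across processes through a shared mapping.
The sketch below is one possible arrangement, not a reference implementation: a child process sets the futex word and wakes the parent.
The helper names are hypothetical, and the GCC/Clang __atomic builtins are assumed to be available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static long
futex_simple(uint32_t *uaddr, int futex_op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, futex_op, val, NULL, NULL, 0);
}

/*
 * The parent sleeps on a shared futex word until a child sets it to 1.
 * Returns the final value of the word, or -1 if setup failed.
 */
static int
wait_for_child(void)
{
    uint32_t *word = mmap(NULL, sizeof(*word), PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (word == MAP_FAILED)
        return -1;

    /* Page-aligned memory satisfies the four-byte alignment rule. */
    assert(((uintptr_t) word % sizeof(*word)) == 0);
    *word = 0;

    pid_t pid = fork();
    if (pid == -1) {
        munmap(word, sizeof(*word));
        return -1;
    }
    if (pid == 0) {                     /* child: publish, then wake */
        __atomic_store_n(word, 1, __ATOMIC_SEQ_CST);
        futex_simple(word, FUTEX_WAKE, 1);
        _exit(0);
    }

    /* Recheck after every return: the wait may fail or wake spuriously. */
    while (__atomic_load_n(word, __ATOMIC_SEQ_CST) == 0)
        futex_simple(word, FUTEX_WAIT, 0);

    int result = (int) *word;
    waitpid(pid, NULL, 0);
    munmap(word, sizeof(*word));
    return result;
}
```

The loop is safe even if the child runs first: the parent then either sees the new value immediately or has its FUTEX_WAIT rejected because the word no longer equals 0.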
a7c2bf45
MK
262.TP
263.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
264.\" Strictly speaking, from Linux 2.5.x to 2.6.25
265This operation creates a file descriptor that is associated with the futex at
266.IR uaddr .
267.\" , suitable for .BR poll (2).
268The calling process must close the returned file descriptor after use.
269When another process performs a
270.BR FUTEX_WAKE
271on the futex, the file descriptor is indicated as being readable with
272.BR select (2),
273.BR poll (2),
274and
275.BR epoll (7).
276
277The file descriptor can be used to obtain asynchronous notifications:
278if
279.I val
280is nonzero, then when another process executes a
281.BR FUTEX_WAKE ,
282the caller will receive the signal number that was passed in
283.IR val .
284
285The arguments
286.IR timeout ,
287.I uaddr2
288and
289.I val3
290are ignored.
291
292To prevent race conditions, the caller should test if the futex has
293been upped after
294.B FUTEX_FD
295returns.
296
297Because it was inherently racy,
298.B FUTEX_FD
299has been removed
300.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
301from Linux 2.6.26 onward.
302.TP
303.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
304.\" Strictly speaking: from Linux 2.5.70
305.\"
306.\" FIXME I added this warning. Okay?
307.IR "Avoid using this operation" .
308It is broken (unavoidably racy) for its intended purpose.
309Use
310.BR FUTEX_CMP_REQUEUE
311instead.
312
313This operation performs the same task as
314.BR FUTEX_CMP_REQUEUE ,
315except that no check is made using the value in
316.IR val3 .
317(The argument
318.I val3
319is ignored.)
320.TP
321.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
322This operation was added as a replacement for the earlier
323.BR FUTEX_REQUEUE ,
324because that operation was racy for its intended use.
325
326As with
327.BR FUTEX_REQUEUE ,
328the
329.BR FUTEX_CMP_REQUEUE
330operation is used to avoid a "thundering herd" effect when
331.B FUTEX_WAKE
332is used and all processes woken up need to acquire another futex.
333It differs from
334.BR FUTEX_REQUEUE
335in that it first checks whether the location
336.I uaddr
337still contains the value
338.IR val3 .
339If not, the operation fails with the error
340.BR EAGAIN .
341.\" FIXME I added the following sentence on rational for FUTEX_CMP_REQUEUE.
342.\" Is it correct? SHould it be expanded?
343This additional feature of
344.BR FUTEX_CMP_REQUEUE
345can be used by the caller to (atomically) detect changes
346in the value of the source futex at
347.IR uaddr .
348
349The operation wakes up a maximum of
350.I val
351waiters that are waiting on the futex at
352.IR uaddr .
353If there are more than
354.I val
355waiters, then the remaining waiters are removed
356from the wait queue of the source futex at
357.I uaddr
358and added to the wait queue of the target futex at
359.IR uaddr2 .
360The
361.I timeout
362argument is (ab)used to specify a cap on the number of waiters
363that are requeued to the futex at
364.IR uaddr2 ;
365the kernel casts the
366.I timeout
367value to
368.IR u32 .
369
370.\" FIXME Please review the following new paragraph to see if it is
371.\" accurate.
372Typical values to specify for
373.I val
374are 0 or 1.
375(Specifying
376.BR INT_MAX
377is not useful, because it would make the
378.BR FUTEX_CMP_REQUEUE
379operation equivalent to
380.BR FUTEX_WAKE .)
381The cap value specified via the (abused)
382.I timeout
383argument is typically either 1 or
384.BR INT_MAX .
385(Specifying the argument as 0 is not useful, because it would make the
386.BR FUTEX_CMP_REQUEUE
387operation equivalent to
388.BR FUTEX_WAIT .)
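The argument passing for this operation is easy to get wrong, because the requeue cap travels in the timeout slot.
The following sketch shows one way to pass it; cmp_requeue and cmp_requeue_demo are hypothetical names, and the demo runs with no waiters, so only the value check is exercised.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Wake up to nr_wake waiters on uaddr and requeue up to nr_requeue more
 * to uaddr2, but only if *uaddr still equals expected (passed as val3).
 */
static long
cmp_requeue(uint32_t *uaddr, uint32_t nr_wake, uint32_t nr_requeue,
            uint32_t *uaddr2, uint32_t expected)
{
    /* The cap on requeued waiters is passed where timeout normally goes. */
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                   (void *) (uintptr_t) nr_requeue, uaddr2, expected);
}

static void
cmp_requeue_demo(void)
{
    uint32_t src = 5, dst = 0;

    /* Value matches and there are no waiters: nothing woken or requeued. */
    assert(cmp_requeue(&src, 1, INT_MAX, &dst, 5) == 0);

    /* Stale expected value: the kernel refuses with EAGAIN. */
    assert(cmp_requeue(&src, 1, INT_MAX, &dst, 4) == -1 && errno == EAGAIN);
}
```

On EAGAIN, a caller would normally re-read the word at uaddr and retry with the fresh value.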
6bac3b85
MK
389.\"
390.\" FIXME I added some FUTEX_WAKE_OP text, and I'd be happy if someone
391.\" checked it.
fea681da 392.TP
d67e21f5
MK
393.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
394.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
6bac3b85
MK
395.\" Author: Jakub Jelinek <jakub@redhat.com>
396.\" Date: Tue Sep 6 15:16:25 2005 -0700
397This operation was added to support some user-space use cases
398where more than one futex must be handled at the same time.
399The most notable example is the implementation of
400.BR pthread_cond_signal (3),
401which requires operations on two futexes,
402the one used to implement the mutex and the one used in the implementation
403of the wait queue associated with the condition variable.
404.BR FUTEX_WAKE_OP
405allows such cases to be implemented without leading to
406high rates of contention and context switching.
407
408The
409.BR FUTEX_WAKE_OP
410operation is equivalent to atomically executing the following code:
411
412.in +4n
413.nf
414int oldval = *(int *) uaddr2;
415*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
416futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
417if (oldval \fIcmp\fP \fIcmparg\fP)
418 futex(uaddr2, FUTEX_WAKE, nr_wake2, 0, 0, 0);
419.fi
420.in
421
422In other words,
423.BR FUTEX_WAKE_OP
424does the following:
425.RS
426.IP * 3
427saves the original value of the futex at
428.IR uaddr2 ;
429.IP *
430performs an operation to modify the value of the futex at
431.IR uaddr2 ;
432.IP *
433wakes up a maximum of
434.I val
435waiters on the futex
436.IR uaddr ;
437and
438.IP *
439dependent on the results of a test of the original value of the futex at
440.IR uaddr2 ,
441wakes up a maximum of
442.I nr_wake2
443waiters on the futex
444.IR uaddr2 .
445.RE
446.IP
447The
448.I nr_wake2
449value is actually the
450.BR futex ()
451.I timeout
452argument (ab)used to specify how many of the waiters on the futex at
453.IR uaddr2
454are to be woken up;
455the kernel casts the
456.I timeout
457value to
458.IR u32 .
459
460The operation and comparison that are to be performed are encoded
461in the bits of the argument
462.IR val3 .
463Pictorially, the encoding is:
464
465.in +4n
466.nf
467 +-----+-----+---------------+---------------+
468 | op | cmp | oparg | cmparg |
469 +-----+-----+---------------+---------------+
470# of bits: 4 4 12 12
471
472.fi
473.in
474
475Expressed in code, the encoding is:
476
477.in +4n
478.nf
479#define FUTEX_OP(op, oparg, cmp, cmparg) \\
480 (((op & 0xf) << 28) | \\
481 ((cmp & 0xf) << 24) | \\
482 ((oparg & 0xfff) << 12) | \\
483 (cmparg & 0xfff))
484.fi
485.in
486
487In the above,
488.I op
489and
490.I cmp
491are each one of the codes listed below.
492The
493.I oparg
494and
495.I cmparg
496components are literal numeric values, except as noted below.
497
498The
499.I op
500component has one of the following values:
501
502.in +4n
503.nf
504FUTEX_OP_SET 0 /* *uaddr2 = oparg; */
505FUTEX_OP_ADD 1 /* *uaddr2 += oparg; */
506FUTEX_OP_OR 2 /* *uaddr2 |= oparg; */
507FUTEX_OP_ANDN 3 /* *uaddr2 &= ~oparg; */
508FUTEX_OP_XOR 4 /* *uaddr2 ^= oparg; */
509.fi
510.in
511
512In addition, bit-wise ORing the following value into
513.I op
514causes
515.IR "(1\ <<\ oparg)"
516to be used as the operand:
517
518.in +4n
519.nf
520FUTEX_OP_ARG_SHIFT 8 /* Use (1 << oparg) as operand */
521.fi
522.in
523
524The
525.I cmp
526field is one of the following:
527
528.in +4n
529.nf
530FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */
531FUTEX_OP_CMP_NE 1 /* if (oldval != cmparg) wake */
532FUTEX_OP_CMP_LT 2 /* if (oldval < cmparg) wake */
533FUTEX_OP_CMP_LE 3 /* if (oldval <= cmparg) wake */
534FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */
535FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */
536.fi
537.in
538
539The return value of
540.BR FUTEX_WAKE_OP
541is the sum of the number of waiters woken on the futex
542.IR uaddr
543plus the number of waiters woken on the futex
544.IR uaddr2 .
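As a worked instance of the encoding, the sketch below packs an operation with the FUTEX_OP() macro from <linux/futex.h> and issues one call.
The helper names are hypothetical, and the demo assumes that nobody is waiting on either futex.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* nr_wake2 is passed in the timeout slot, as described above. */
static long
wake_op(uint32_t *uaddr, uint32_t nr_wake, uint32_t *uaddr2,
        uint32_t nr_wake2, uint32_t encoded)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP, nr_wake,
                   (void *) (uintptr_t) nr_wake2, uaddr2, encoded);
}

static void
wake_op_demo(void)
{
    uint32_t a = 0, b = 0;

    /* op = ADD (1), cmp = GT (4), oparg = 2, cmparg = 3 in one word. */
    assert(FUTEX_OP(FUTEX_OP_ADD, 2, FUTEX_OP_CMP_GT, 3) == 0x14002003);

    /* Set *uaddr2 to 1; wake on uaddr2 only if its old value was 0. */
    long woken = wake_op(&a, 1, &b, 1,
                         FUTEX_OP(FUTEX_OP_SET, 1, FUTEX_OP_CMP_EQ, 0));

    assert(woken == 0);     /* nobody was waiting on either futex */
    assert(b == 1);         /* the store to *uaddr2 happened regardless */
}
```

The store to the second futex is performed whether or not any waiter is woken, which is what lets a condition-variable implementation update its state and wake in a single call.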
d67e21f5 545.TP
79c9b436
TG
546.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
547.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
fd9e59d4 548This operation is like
79c9b436
TG
549.BR FUTEX_WAIT
550except that
551.I val3
552is used to provide a 32-bit bitset to the kernel.
553This bitset is stored in the kernel-internal state of the waiter.
554See the description of
555.BR FUTEX_WAKE_BITSET
556for further details.
557
fd9e59d4
MK
558The
559.BR FUTEX_WAIT_BITSET
560also interprets the
561.I timeout
562argument differently from
563.BR FUTEX_WAIT .
564See the discussion of
565.BR FUTEX_CLOCK_REALTIME ,
566above.
567
79c9b436
TG
568The
569.I uaddr2
570argument is ignored.
571.TP
d67e21f5
MK
572.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
573.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
55cc422d
TG
574This operation is the same as
575.BR FUTEX_WAKE
576except that the
577.I val3
578argument is used to provide a 32-bit bitset to the kernel.
98d769c0
MK
579This bitset is used to select which waiters should be woken up.
580The selection is done by a bit-wise AND of the "wake" bitset
581(i.e., the value in
582.IR val3 )
583and the bitset which is stored in the kernel-internal
09cb4ce7 584state of the waiter (the "wait" bitset that is set using
98d769c0
MK
585.BR FUTEX_WAIT_BITSET ).
586All of the waiters for which the result of the AND is nonzero are woken up;
587the remaining waiters are left sleeping.
588
e9d4496b
MK
589.\" FIXME please review this paragraph that I added
590The effect of
591.BR FUTEX_WAIT_BITSET
592and
593.BR FUTEX_WAKE_BITSET
594is to allow selective wake-ups among multiple waiters that are waiting
595on the same futex;
596since a futex has a size of 32 bits,
597these operations provide 32 wakeup "channels".
598(The
599.BR FUTEX_WAIT
600and
601.BR FUTEX_WAKE
602operations correspond to
603.BR FUTEX_WAIT_BITSET
604and
605.BR FUTEX_WAKE_BITSET
606operations where the bitsets are all ones.)
09cb4ce7 607Note, however, that using this bitset multiplexing feature on a
e9d4496b
MK
608futex is less efficient than simply using multiple futexes,
609because employing bitset multiplexing requires the kernel
610to check all waiters on a futex,
611including those that are not interested in being woken up
612(i.e., they do not have the relevant bit set in their "wait" bitset).
613.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
614.\"
615.\" "The original reason for the addition of these extensions
616.\" was to improve the performance of pthread read-write locks
617.\" in glibc. However, the pthreads library no longer uses the
618.\" same locking algorithm, and these extensions are not used
619.\" without the bitset parameter being all ones.
620.\"
621.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
622.\" is nevertheless used (with a bitset of all ones) in order to
623.\" obtain the absolute timeout functionality that is useful
624.\" for efficiently implementing Pthreads APIs (which use absolute
625.\" timeouts); FUTEX_WAIT provides only relative timeouts.
626
98d769c0
MK
627The
628.I uaddr2
629and
630.I timeout
631arguments are ignored.
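The channel idea can be sketched with two small helpers.
The names are hypothetical, and the demo runs without any sleeping waiter, so it only exercises the argument checks.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sleep on uaddr, registering interest in the given channel bits. */
static long
wait_on_channels(uint32_t *uaddr, uint32_t expected, uint32_t channels)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, expected,
                   NULL, NULL, channels);
}

/* Wake up to nr_wake waiters whose wait bitset intersects channels. */
static long
wake_channels(uint32_t *uaddr, uint32_t nr_wake, uint32_t channels)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_BITSET, nr_wake,
                   NULL, NULL, channels);
}

static void
bitset_demo(void)
{
    uint32_t word = 0;

    /* No waiters yet, so waking channel 3 reports 0 woken. */
    assert(wake_channels(&word, INT_MAX, UINT32_C(1) << 3) == 0);

    /* An empty bitset can never match any waiter and is rejected. */
    assert(wake_channels(&word, 1, 0) == -1 && errno == EINVAL);

    /* As with FUTEX_WAIT, a stale expected value fails without sleeping. */
    assert(wait_on_channels(&word, 1, UINT32_C(1) << 3) == -1
           && errno == EAGAIN);
}
```

A waiter registered with bit 3 would be woken by any wake bitset that includes bit 3 and left asleep by all others.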
d67e21f5
MK
632.TP
633.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
634.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
6b060884 635.\"
dd218aaa
MK
636.\" FIXME Employs 'timeout' argument, with absolute time value on
637.\" CLOCK_REALTIME clock; 'timeout' can be NULL
6b060884 638.\"
e0547e70
TG
639This operation reads from the futex address provided by the
640.I uaddr
641argument, which contains the namespace-specific thread ID (TID)
67259526
MK
642.\" FIXME In the preceding line, what does "namespace-specific" mean?
643.\" That is, what kind of namespace are we talking about?
e0547e70
TG
644of the lock owner.
645If the TID is 0, then the kernel tries to set the waiter's TID atomically.
646If the TID is nonzero or the takeover fails,
647the kernel atomically sets the
648.B FUTEX_WAITERS
649bit, which signals the owner that it cannot unlock the futex in
650user space atomically by transitioning from TID to 0.
651After that, the kernel tries to find the task which is
652associated with the owner TID, creates or reuses kernel state on behalf
653of the owner and attaches the waiter to it.
67259526
MK
654.\" FIXME In the next line, what type of "priority" are we talking about?
655.\" Realtime priorities for SCHED_FIFO and SCHED_RR?
656.\" Or something else?
e0547e70
TG
657The enqueuing of the waiter is in descending priority order if more
658than one waiter exists.
67259526 659.\" FIXME What does "bandwidth" refer to in the next line?
e0547e70 660The owner inherits either the priority or the bandwidth of the waiter.
67259526
MK
661.\" FIXME In the preceding line, what determines whether the
662.\" owner inherits the priority versus the bandwidth?
e0547e70
TG
663This inheritance follows the lock chain in the case of
664nested locking and performs deadlock detection.
665
666The
667.I timeout
668.\" FIXME Is this true??????????????????????
669argument is handled as described in
670.BR FUTEX_WAIT .
671
a449c634 672The
e0547e70
TG
673.IR uaddr2 ,
674.IR val ,
675and
676.IR val3
a449c634 677arguments are ignored.
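The uncontended path can be sketched in a single thread.
This example rests on assumptions beyond this page: it uses FUTEX_UNLOCK_PI, which is not yet documented here, and assumes that it clears the futex word when the caller owns the lock and no waiter is queued.
The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static long
lock_pi(uint32_t *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Assumption: releases a PI futex owned by the calling thread. */
static long
unlock_pi(uint32_t *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}

static void
pi_lock_demo(void)
{
    uint32_t lock = 0;                          /* 0 means unowned */
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* The word holds 0, so the kernel installs the caller's TID. */
    assert(lock_pi(&lock) == 0);
    assert((lock & FUTEX_TID_MASK) == tid);

    /* Locking again from the owner is reported as a deadlock. */
    assert(lock_pi(&lock) == -1 && errno == EDEADLK);

    assert(unlock_pi(&lock) == 0);
    assert(lock == 0);
}
```

Real implementations normally try to acquire and release the lock in user space with an atomic compare-and-swap first, and enter the kernel only when that fails.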
d67e21f5
MK
678.TP
679.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
680.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
681.\" FIXME to complete
682[As yet undocumented]
683.TP
684.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
685.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
686.\" FIXME to complete
687[As yet undocumented]
688.TP
d67e21f5
MK
689.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
690.\" commit 52400ba946759af28442dee6265c5c0180ac7122
691.\" FIXME to complete
692[As yet undocumented]
693.TP
694.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
695.\" commit 52400ba946759af28442dee6265c5c0180ac7122
696.\" FIXME to complete
dd218aaa
MK
697.\"
698.\" FIXME Employs 'timeout' argument, supports FUTEX_CLOCK_REALTIME
699.\" 'timeout' can be NULL
700.\"
d67e21f5 701[As yet undocumented]
47297adb 702.SH RETURN VALUE
fea681da 703.PP
6f147f79 704In the event of an error, all operations return \-1 and set
e808bba0 705.I errno
6f147f79 706to indicate the cause of the error.
e808bba0
MK
707The return value on success depends on the operation,
708as described in the following list:
fea681da
MK
709.TP
710.B FUTEX_WAIT
682edefb
MK
711Returns 0 if the process was woken by a
712.B FUTEX_WAKE
713call.
e808bba0 714See ERRORS for the various possible error returns.
fea681da
MK
715.TP
716.B FUTEX_WAKE
717Returns the number of processes woken up.
718.TP
719.B FUTEX_FD
720Returns the new file descriptor associated with the futex.
721.TP
722.B FUTEX_REQUEUE
723Returns the number of processes woken up.
724.TP
725.B FUTEX_CMP_REQUEUE
3dfcc11d
MK
726Returns the total number of processes woken up or requeued to the futex at
727.IR uaddr2 .
728If this value is greater than
729.IR val ,
730then the difference is the number of waiters requeued to the futex at
731.IR uaddr2 .
519f2c3d
MK
732.\"
733.\" FIXME Add success returns for other operations
fea681da
MK
734.SH ERRORS
735.TP
736.B EACCES
737No read access to futex memory.
738.TP
739.B EAGAIN
682edefb 740.B FUTEX_CMP_REQUEUE
e808bba0 741detected that the value pointed to by
742.I uaddr
743is not equal to the expected value
744.IR val3 .
fd1dc4c2 745.\" FIXME: Is the following sentence correct?
fea681da 746(This probably indicates a race;
682edefb
MK
747use the safe
748.B FUTEX_WAKE
749now.)
fea681da 750.TP
5662f56a
MK
751.BR EAGAIN
752.RB ( FUTEX_LOCK_PI ,
753.BR FUTEX_TRYLOCK_PI )
754The futex owner thread ID is about to exit,
755but has not yet handled the internal state cleanup.
756Try again.
61f8c1d1
MK
757.\"
758.\" FIXME Is there not also an EAGAIN error case on 'uaddr2' for
759.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
760.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
761.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EAGAIN?
5662f56a 762.TP
7a39e745
MK
763.BR EDEADLK
764.RB ( FUTEX_LOCK_PI ,
765.BR FUTEX_TRYLOCK_PI )
766The futex at
767.I uaddr
768is already locked by the caller.
769.TP
fea681da 770.B EFAULT
1ea901e8
MK
771A required pointer argument (i.e.,
772.IR uaddr ,
773.IR uaddr2 ,
774or
775.IR timeout )
496df304 776did not point to a valid user-space address.
fea681da 777.TP
9f6c40c0 778.B EINTR
e808bba0 779A
9f6c40c0 780.B FUTEX_WAIT
2674f781
MK
781or
782.B FUTEX_WAIT_BITSET
e808bba0
MK
783operation was interrupted by a signal (see
784.BR signal (7))
785or a spurious wakeup.
9f6c40c0 786.TP
fea681da 787.B EINVAL
180f97b7
MK
788The operation in
789.IR futex_op
790is one of those that employs a timeout, but the supplied
fb2f4c27
MK
791.I timeout
792argument was invalid
793.RI ( tv_sec
794was less than zero, or
795.IR tv_nsec
796was not less than 1,000,000,000).
797.TP
798.B EINVAL
0c74df0b
MK
799The operation specified in
800.IR futex_op
801employs one or both of the pointers
51ee94be 802.I uaddr
a1f47699 803and
0c74df0b
MK
804.IR uaddr2 ,
805but one of these does not point to a valid object\(emthat is,
806the address is not four-byte-aligned.
51ee94be
MK
807.TP
808.B EINVAL
bae14b6c 809.RB ( FUTEX_WAKE ,
5447735d 810.BR FUTEX_WAKE_OP ,
98d769c0 811.BR FUTEX_WAKE_BITSET ,
e169277f
MK
812.BR FUTEX_REQUEUE ,
813.BR FUTEX_CMP_REQUEUE )
496df304 814The kernel detected an inconsistency between the user-space state at
9534086b
TG
815.I uaddr
816and the kernel state\(emthat is, it detected a waiter which waits in
5447735d
MK
817.BR FUTEX_LOCK_PI
818on
819.IR uaddr .
9534086b
TG
820.TP
821.B EINVAL
55cc422d
TG
822.RB ( FUTEX_WAIT_BITSET ,
823.BR FUTEX_WAKE_BITSET )
79c9b436
TG
824The bitset supplied in
825.IR val3
826is zero.
827.TP
828.B EINVAL
add875c0
MK
829.RB ( FUTEX_REQUEUE )
830.\" FIXME tglx suggested adding this, but does this error really
831.\" occur for FUTEX_REQUEUE?
832.I uaddr
833equals
834.IR uaddr2
835(i.e., an attempt was made to requeue to the same futex).
836.TP
ff597681
MK
837.BR EINVAL
838.RB ( FUTEX_FD )
839The signal number supplied in
840.I val
841is invalid.
842.TP
6bac3b85 843.B EINVAL
a218ef20
MK
844.RB ( FUTEX_LOCK_PI ,
845.BR FUTEX_TRYLOCK_PI )
846The kernel detected an inconsistency between the user-space state at
847.I uaddr
848and the kernel state.
849This indicates either state corruption or that the kernel found a waiter on
850.I uaddr
851which is waiting via
852.BR FUTEX_WAIT
853or
854.BR FUTEX_WAIT_BITSET .
855.TP
856.B EINVAL
4832b48a 857Invalid argument.
fea681da 858.TP
a449c634
MK
859.BR ENOMEM
860.RB ( FUTEX_LOCK_PI ,
861.BR FUTEX_TRYLOCK_PI )
862The kernel could not allocate memory to hold state information.
863.TP
fea681da 864.B ENFILE
ff597681 865.RB ( FUTEX_FD )
fea681da 866The system limit on the total number of open files has been reached.
4701fc28
MK
867.TP
868.B ENOSYS
869Invalid operation specified in
d33602c4 870.IR futex_op .
9f6c40c0 871.TP
4a7e5b05
MK
872.B ENOSYS
873The
874.BR FUTEX_CLOCK_REALTIME
875option was specified in
1afcee7c 876.IR futex_op ,
4a7e5b05
MK
877but the accompanying operation was neither
878.BR FUTEX_WAIT_BITSET
879nor
880.BR FUTEX_WAIT_REQUEUE_PI .
881.TP
c7589177
MK
882.BR EPERM
883.RB ( FUTEX_LOCK_PI ,
884.BR FUTEX_TRYLOCK_PI )
885The caller is not allowed to attach itself to the futex.
886(This may be caused by a state corruption in user space.)
61f8c1d1
MK
887.\"
888.\" FIXME Is there not also an EPERM error case on 'uaddr2' for
889.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
890.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
891.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EPERM?
c7589177 892.TP
0b0e4934
MK
893.BR ESRCH
894.RB ( FUTEX_LOCK_PI ,
895.BR FUTEX_TRYLOCK_PI )
896.\" FIXME I reworded the following sentence a bit differently from
897.\" tglx's formulation. Is it okay?
898The thread ID in the futex at
899.I uaddr
900does not exist.
61f8c1d1
MK
901.\"
902.\" FIXME Is there not also an ESRCH error case on 'uaddr2' for
903.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
904.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
905.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> ESRCH?
0b0e4934 906.TP
9f6c40c0 907.B ETIMEDOUT
4d85047f
MK
908The operation in
909.IR futex_op
910employed the timeout specified in
911.IR timeout ,
912and the timeout expired before the operation completed.
913.TP
914.B EWOULDBLOCK
0582b19d
MK
915.RB ( FUTEX_WAIT )
916The value pointed to by
917.I uaddr
918was not equal to the expected value
919.I val
e808bba0 920at the time of the call.
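Several of the errors listed above are easy to provoke deliberately, which can help when testing error-handling paths.
The following is a sketch only; futex_errno and futex_errors_demo are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Returns 0 on success, otherwise the errno value set by the call. */
static int
futex_errno(void *uaddr, int futex_op, uint32_t val,
            const struct timespec *timeout)
{
    long r = syscall(SYS_futex, uaddr, futex_op, val, timeout, NULL, 0);

    return r == -1 ? errno : 0;
}

static void
futex_errors_demo(void)
{
    uint32_t words[2] = { 0, 0 };
    void *misaligned = (unsigned char *) words + 1;
    struct timespec bad = { .tv_sec = 0, .tv_nsec = 1000000000 };

    /* The address is not aligned on a four-byte boundary. */
    assert(futex_errno(misaligned, FUTEX_WAKE, 1, NULL) == EINVAL);

    /* tv_nsec is not less than 1,000,000,000. */
    assert(futex_errno(words, FUTEX_WAIT, 0, &bad) == EINVAL);

    /* The futex word (0) does not equal the expected value (1). */
    assert(futex_errno(words, FUTEX_WAIT, 1, NULL) == EWOULDBLOCK);

    /* uaddr does not point to a valid user-space address. */
    assert(futex_errno(NULL, FUTEX_WAIT, 0, NULL) == EFAULT);
}
```

Of these, only EWOULDBLOCK is part of normal operation; the others indicate a programming error in the caller.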
47297adb 921.SH VERSIONS
a1d5f77c 922.PP
81c9d87e
MK
923Futexes were first made available in a stable kernel release
924with Linux 2.6.0.
925
a1d5f77c
MK
926Initial futex support was merged in Linux 2.5.7 but with different semantics
927from what was described above.
52dee70e 928A four-argument system call with the semantics
fd3fa7ef 929described in this page was introduced in Linux 2.5.40.
11b520ed 930In Linux 2.5.70, a fifth argument
a1d5f77c 931was added.
11b520ed 932In Linux 2.6.7, a sixth argument was added\(emmessy, especially
a1d5f77c 933on the s390 architecture.
47297adb 934.SH CONFORMING TO
8382f16d 935This system call is Linux-specific.
47297adb 936.SH NOTES
fea681da 937.PP
fcdad7d6 938To reiterate, bare futexes are not intended as an easy-to-use abstraction
c13182ef 939for end-users.
fcdad7d6 940(There is no wrapper function for this system call in glibc.)
c13182ef 941Implementors are expected to be assembly literate and to have
7fac88a9 942read the sources of the futex user-space library referenced below.
d282bb24 943.\" .SH AUTHORS
fea681da
MK
944.\" .PP
945.\" Futexes were designed and worked on by
946.\" Hubertus Franke (IBM Thomas J. Watson Research Center),
947.\" Matthew Kirkwood, Ingo Molnar (Red Hat)
948.\" and Rusty Russell (IBM Linux Technology Center).
949.\" This page written by bert hubert.
47297adb 950.SH SEE ALSO
9913033c 951.BR get_robust_list (2),
d806bc05 952.BR restart_syscall (2),
14d8dd3b 953.BR futex (7)
fea681da 954.PP
43b99089
MK
955The kernel source files
956.IR Documentation/pi-futex.txt
957and
958.IR Documentation/futex-requeue-pi.txt .
959.PP
52087dd3 960\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
9b936e9e
MK
961(proceedings of the Ottawa Linux Symposium 2002), online at
962.br
608bf950
SK
963.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
964.UE
f42eb21b
MK
965
966\fIFutexes Are Tricky\fP (updated in 2011), Ulrich Drepper
967.UR http://www.akkadia.org/drepper/futex.pdf
968.UE
9b936e9e
MK
969.PP
970Futex example library, futex-*.tar.bz2 at
971.br
a605264d 972.UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
608bf950 973.UE