git.ipfire.org Git - thirdparty/man-pages.git/blame - man2/futex.2
futex.2: ERRORS: added EAGAIN case for FUTEX_WAIT_REQUEUE_PI
Commit Line Data
8f0aff2a 1.\" Page by b.hubert
1abce893
MK
2.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
3.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
2297bf0e 4.\"
2e46a6e7 5.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
8f0aff2a 6.\" may be freely modified and distributed
8ff7380d 7.\" %%%LICENSE_END
fea681da
MK
8.\"
9.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
10.\" added ERRORS section.
11.\"
12.\" Modified 2004-06-17 mtk
13.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
14.\"
4f58b197
MK
15.\" 2.6.31 adds FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI
16.\" commit 52400ba946759af28442dee6265c5c0180ac7122
17.\" Author: Darren Hart <dvhltc@us.ibm.com>
18.\" Date: Fri Apr 3 13:40:49 2009 -0700
19.\"
20.\" commit ba9c22f2c01cf5c88beed5a6b9e07d42e10bd358
21.\" Author: Darren Hart <dvhltc@us.ibm.com>
22.\" Date: Mon Apr 20 22:22:22 2009 -0700
23.\"
24.\" See Documentation/futex-requeue-pi.txt
34f7665a 25.\"
3d155313 26.TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
fea681da 27.SH NAME
ce154705 28futex \- fast user-space locking
fea681da 29.SH SYNOPSIS
9d9dc1e8 30.nf
fea681da
MK
31.sp
32.B "#include <linux/futex.h>"
fea681da
MK
33.B "#include <sys/time.h>"
34.sp
d33602c4 35.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
768d3c23
MK
36.BI " const struct timespec *" timeout , \
37" \fR /* or: \fBu32 \fIval2\fP */
9d9dc1e8 38.BI " int *" uaddr2 ", int " val3 );
9d9dc1e8 39.fi
409f08b0 40
b939d6e4
MK
41.IR Note :
42There is no glibc wrapper for this system call; see NOTES.
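Because no wrapper is provided, callers usually reach the system call through
syscall(2). The following is a minimal sketch of such a wrapper; the name
futex_call() is hypothetical, and the use of uint32_t for the futex word is an
assumption made here for clarity rather than something this page prescribes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper; the arguments mirror the prototype in the SYNOPSIS. */
static long futex_call(uint32_t *uaddr, int futex_op, uint32_t val,
                       const struct timespec *timeout,
                       uint32_t *uaddr2, uint32_t val3)
{
    /* Futexes are four-byte integers aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);

    /* Returns -1 and sets errno on error, like other system calls. */
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}
```

For example, futex_call(&word, FUTEX_WAKE, 1, NULL, NULL, 0) asks the kernel to
wake at most one waiter on word and returns the number of waiters woken.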
47297adb 43.SH DESCRIPTION
fea681da
MK
44.PP
45The
e511ffb6 46.BR futex ()
fea681da
MK
47system call provides a method for
48a program to wait for a value at a given address to change, and a
49method to wake up anyone waiting on a particular address (while the
50addresses for the same memory in separate processes may not be
51equal, the kernel maps them internally so the same memory mapped in
52different locations will correspond for
e511ffb6 53.BR futex ()
c13182ef 54calls).
fd3fa7ef 55This system call is typically used to
fea681da
MK
56implement the contended case of a lock in shared memory, as
57described in
a8bda636 58.BR futex (7).
fea681da 59.PP
f388ba70
MK
60When a futex operation did not finish uncontended in user space, a
61.BR futex ()
62call needs to be made to the kernel to arbitrate.
c13182ef 63Arbitration can either mean putting the calling
fea681da
MK
64process to sleep or, conversely, waking a waiting process.
65.PP
f388ba70
MK
66Callers of
67.BR futex ()
68are expected to adhere to the semantics described in
a8bda636 69.BR futex (7).
fea681da 70As these
d603cc27 71semantics involve writing nonportable assembly instructions, this in turn
fea681da
MK
72probably means that most users will in fact be library authors and not
73general application developers.
74.PP
75The
76.I uaddr
f388ba70
MK
77argument points to an integer which stores the counter (futex).
78On all platforms, futexes are four-byte integers that
79must be aligned on a four-byte boundary.
80The operation to perform on the futex is specified in the
81.I futex_op
82argument;
83.IR val
84is a value whose meaning and purpose depends on
85.IR futex_op .
36ab2074
MK
86
87The remaining arguments
88.RI ( timeout ,
89.IR uaddr2 ,
90and
91.IR val3 )
92are required only for certain of the futex operations described below.
93Where one of these arguments is not required, it is ignored.
768d3c23 94
36ab2074
MK
95For several blocking operations, the
96.I timeout
97argument is a pointer to a
98.IR timespec
99structure that specifies a timeout for the operation.
100However, notwithstanding the prototype shown above, for some operations,
101this argument is instead a four-byte integer whose meaning
102is determined by the operation.
768d3c23
MK
103For these operations, the kernel casts the
104.I timeout
105value to
106.IR u32 ,
107and in the remainder of this page, this argument is referred to as
108.I val2
109when interpreted in this fashion.
110
de5a3bb4 111Where it is required, the
36ab2074 112.IR uaddr2
de5a3bb4 113argument is a pointer to a second futex that is employed by the operation.
36ab2074
MK
114The interpretation of the final integer argument,
115.IR val3 ,
116depends on the operation.
117
6be4bad7 118The
d33602c4 119.I futex_op
6be4bad7
MK
120argument consists of two parts:
121a command that specifies the operation to be performed,
122bit-wise ORed with zero or more options that
123modify the behaviour of the operation.
fc30eb79 124The options that may be included in
d33602c4 125.I futex_op
fc30eb79
TG
126are as follows:
127.TP
128.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
129.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
130This option bit can be employed with all futex operations.
131It tells the kernel that the futex is process private and not shared
132with another process.
133This allows the kernel to choose the fast path for validating
134the user-space address and avoids expensive VMA lookups,
135taking reference counts on file backing store, and so on.
ae2c1774
MK
136
137As a convenience,
138.IR <linux/futex.h>
139defines a set of constants with the suffix
140.BR _PRIVATE
141that are equivalents of all of the operations listed below,
dcdfde26 142.\" except the obsolete FUTEX_FD, for which the "private" flag was
ae2c1774
MK
143.\" meaningless
144but with the
145.BR FUTEX_PRIVATE_FLAG
146ORed into the constant value.
147Thus, there are
148.BR FUTEX_WAIT_PRIVATE ,
149.BR FUTEX_WAKE_PRIVATE ,
150and so on.
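The relationship between the plain and the private constants can be sketched
in C as follows. The helper names futex_private() and futex_command() are
hypothetical; FUTEX_CMD_MASK is a constant from the same header that masks
off the option bits.

```c
#include <assert.h>
#include <linux/futex.h>

/* Hypothetical helper: form the process-private variant of an operation. */
static int futex_private(int futex_op)
{
    return futex_op | FUTEX_PRIVATE_FLAG;
}

/* Hypothetical helper: strip the option bits to recover the command part. */
static int futex_command(int futex_op)
{
    return futex_op & FUTEX_CMD_MASK;
}
```

Under these definitions, futex_private(FUTEX_WAIT) yields the same value as
FUTEX_WAIT_PRIVATE, and futex_command(FUTEX_WAIT_PRIVATE) yields FUTEX_WAIT.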
2e98bbc2
TG
151.TP
152.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
153.\" commit 1acdac104668a0834cfa267de9946fac7764d486
4a7e5b05 154This option bit can be employed only with the
2e98bbc2
TG
155.BR FUTEX_WAIT_BITSET
156and
157.BR FUTEX_WAIT_REQUEUE_PI
c84cf68c 158operations.
2e98bbc2 159
f2103b26
MK
160If this option is set, the kernel treats
161.I timeout
162as an absolute time based on
2e98bbc2
TG
163.BR CLOCK_REALTIME .
164
f2103b26
MK
165If this option is not set, the kernel treats
166.I timeout
167as relative time,
1c952cf5
MK
168.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
169measured against the
170.BR CLOCK_MONOTONIC
171clock.
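As an illustration of an absolute timeout, the sketch below computes a
CLOCK_REALTIME deadline and waits with FUTEX_WAIT_BITSET. The helper name
wait_until_realtime() and its millisecond parameter are hypothetical, and
the bitset FUTEX_BITSET_MATCH_ANY (all ones) is used so that any wake-up
matches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait while *uaddr == expected, until an absolute
   CLOCK_REALTIME deadline that lies `ms` milliseconds in the future.
   Returns 0 on wake-up, otherwise the errno value of the failed call. */
static int wait_until_realtime(uint32_t *uaddr, uint32_t expected, long ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {      /* normalize nanoseconds */
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    long rc = syscall(SYS_futex, uaddr,
                      FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
                      expected, &deadline, NULL, FUTEX_BITSET_MATCH_ANY);
    return rc == 0 ? 0 : errno;
}
```

If nobody wakes the caller before the deadline, the call is expected to fail
with ETIMEDOUT.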
6be4bad7
MK
172.PP
173The operation specified in
d33602c4 174.I futex_op
6be4bad7 175is one of the following:
fea681da 176.TP
81c9d87e
MK
177.BR FUTEX_WAIT " (since Linux 2.6.0)"
178.\" Strictly speaking, since some time in 2.5.x
f065673c
MK
179This operation tests that the value at the
180location pointed to by the futex address
fea681da
MK
181.I uaddr
182still contains the value
183.IR val ,
f065673c 184and then sleeps awaiting
682edefb 185.B FUTEX_WAKE
f065673c
MK
186on the futex address.
187The test and sleep steps are performed atomically.
188If the futex value does not match
189.IR val ,
4710334a 190then the call fails immediately with the error
badbf70c 191.BR EAGAIN .
f065673c
MK
192.\" FIXME I added the following sentence. Please confirm that it is correct.
193The purpose of the test step is to detect races where
194another process changes the value of the futex between
195the time it was last checked and the time of the
196.BR FUTEX_WAIT
63d3f911 197operation.
1909e523 198
c13182ef 199If the
fea681da 200.I timeout
1c952cf5
MK
201argument is non-NULL, its contents specify a relative timeout for the wait
202.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
203measured according to the
204.BR CLOCK_MONOTONIC
205clock.
82a6092b
MK
206(This interval will be rounded up to the system clock granularity,
207and kernel scheduling delays mean that the
208blocking interval may overrun by a small amount.)
209If
210.I timeout
211is NULL, the call blocks indefinitely.
4798a7f3 212
c13182ef 213The arguments
fea681da
MK
214.I uaddr2
215and
216.I val3
217are ignored.
218
219For
a8bda636 220.BR futex (7),
fea681da
MK
221this call is executed if decrementing the count gave a negative value
222(indicating contention), and will sleep until another process releases
682edefb
MK
223the futex and executes the
224.B FUTEX_WAKE
225operation.
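The test-then-sleep behavior and the relative timeout can be observed with a
short sketch such as the following. The helper name futex_wait_for() and its
millisecond parameter are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: FUTEX_WAIT with a relative timeout in milliseconds.
   Returns 0 on wake-up, otherwise the errno value of the failed call. */
static int futex_wait_for(uint32_t *uaddr, uint32_t expected, long timeout_ms)
{
    struct timespec rel = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    /* uaddr2 and val3 are ignored for FUTEX_WAIT. */
    long rc = syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &rel, NULL, 0);
    return rc == 0 ? 0 : errno;
}
```

When the futex word does not hold the expected value, the call fails at once
with EAGAIN; when it does and no wake-up arrives, the call fails with
ETIMEDOUT after the interval has elapsed.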
fea681da 226.TP
81c9d87e
MK
227.BR FUTEX_WAKE " (since Linux 2.6.0)"
228.\" Strictly speaking, since Linux 2.5.x
f065673c
MK
229This operation wakes at most
230.I val
231processes waiting (i.e., inside
232.BR FUTEX_WAIT )
233on the futex at the address
234.IR uaddr .
235Most commonly,
236.I val
237is specified as either 1 (wake up a single waiter) or
238.BR INT_MAX
239(wake up all waiters).
730bfbda
MK
240.\" FIXME Please confirm that the following is correct:
241No guarantee is provided about which waiters are awoken
242(e.g., a waiter with a higher scheduling priority is not guaranteed
243to be awoken in preference to a waiter with a lower priority).
4798a7f3 244
fea681da
MK
245The arguments
246.IR timeout ,
c8b921bd 247.IR uaddr2 ,
fea681da
MK
248and
249.I val3
250are ignored.
251
252For
a8bda636 253.BR futex (7),
fea681da
MK
254this is executed if incrementing
255the count showed that there were waiters, once the futex value has been set
256to 1 (indicating that it is available).
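A wait and wake pair across two processes might look like the sketch below.
It assumes a shared anonymous mapping for the futex word and the GCC/Clang
__atomic builtins for the user-space accesses; the function name
wake_handoff_demo() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the child saw the hand-off and exited. */
static int wake_handoff_demo(void)
{
    /* Shared mapping, so parent and child operate on the same futex. */
    uint32_t *word = mmap(NULL, sizeof(*word), PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (word == MAP_FAILED)
        return -1;
    *word = 0;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Sleep only while the word is still 0; re-check after each return,
           since FUTEX_WAIT may fail with EAGAIN or be interrupted. */
        while (__atomic_load_n(word, __ATOMIC_ACQUIRE) == 0)
            syscall(SYS_futex, word, FUTEX_WAIT, 0, NULL, NULL, 0);
        _exit(0);
    }

    /* Publish the new value first, then wake at most one waiter. */
    __atomic_store_n(word, 1, __ATOMIC_RELEASE);
    syscall(SYS_futex, word, FUTEX_WAKE, 1, NULL, NULL, 0);

    int status = 0;
    waitpid(child, &status, 0);
    munmap(word, sizeof(*word));
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Changing the value before calling FUTEX_WAKE matters: a waiter that has not
yet gone to sleep then fails the value test in FUTEX_WAIT instead of missing
the wake-up.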
a7c2bf45
MK
257.TP
258.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
259.\" Strictly speaking, from Linux 2.5.x to 2.6.25
260This operation creates a file descriptor that is associated with the futex at
261.IR uaddr .
262.\" , suitable for .BR poll (2).
263The calling process must close the returned file descriptor after use.
264When another process performs a
265.BR FUTEX_WAKE
266on the futex, the file descriptor is indicated as being readable with
267.BR select (2),
268.BR poll (2),
269and
270.BR epoll (7).
271
272The file descriptor can be used to obtain asynchronous notifications:
273if
274.I val
275is nonzero, then when another process executes a
276.BR FUTEX_WAKE ,
277the caller will receive the signal number that was passed in
278.IR val .
279
280The arguments
281.IR timeout ,
282.IR uaddr2 ,
283and
284.I val3
285are ignored.
286
287To prevent race conditions, the caller should test if the futex has
288been upped after
289.B FUTEX_FD
290returns.
291
292Because it was inherently racy,
293.B FUTEX_FD
294has been removed
295.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
296from Linux 2.6.26 onward.
297.TP
298.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
299.\" Strictly speaking: from Linux 2.5.70
300.\"
301.\" FIXME I added this warning. Okay?
302.IR "Avoid using this operation" .
303It is broken (unavoidably racy) for its intended purpose.
304Use
305.BR FUTEX_CMP_REQUEUE
306instead.
307
308This operation performs the same task as
309.BR FUTEX_CMP_REQUEUE ,
310except that no check is made using the value in
311.IR val3 .
312(The argument
313.I val3
314is ignored.)
315.TP
316.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
317This operation was added as a replacement for the earlier
318.BR FUTEX_REQUEUE ,
319because that operation was racy for its intended use.
320
321As with
322.BR FUTEX_REQUEUE ,
323the
324.BR FUTEX_CMP_REQUEUE
325operation is used to avoid a "thundering herd" effect when
326.B FUTEX_WAKE
327is used and all processes woken up need to acquire another futex.
328It differs from
329.BR FUTEX_REQUEUE
330in that it first checks whether the location
331.I uaddr
332still contains the value
333.IR val3 .
334If not, the operation fails with the error
335.BR EAGAIN .
336.\" FIXME I added the following sentence on rational for FUTEX_CMP_REQUEUE.
337.\" Is it correct? SHould it be expanded?
338This additional feature of
339.BR FUTEX_CMP_REQUEUE
340can be used by the caller to (atomically) detect changes
341in the value of the source futex at
342.IR uaddr .
343
344The operation wakes up a maximum of
345.I val
346waiters that are waiting on the futex at
347.IR uaddr .
348If there are more than
349.I val
350waiters, then the remaining waiters are removed
351from the wait queue of the source futex at
352.I uaddr
353and added to the wait queue of the target futex at
354.IR uaddr2 .
936876a9 355
a7c2bf45 356The
768d3c23 357.I val2
936876a9 358argument specifies an upper limit on the number of waiters
a7c2bf45 359that are requeued to the futex at
768d3c23 360.IR uaddr2 .
a7c2bf45
MK
361
362.\" FIXME Please review the following new paragraph to see if it is
363.\" accurate.
364Typical values to specify for
365.I val
366are 0 or 1.
367(Specifying
368.BR INT_MAX
369is not useful, because it would make the
370.BR FUTEX_CMP_REQUEUE
371operation equivalent to
372.BR FUTEX_WAKE .)
936876a9 373The limit value specified via
768d3c23
MK
374.I val2
375is typically either 1 or
a7c2bf45
MK
376.BR INT_MAX .
377(Specifying the argument as 0 is not useful, because it would make the
378.BR FUTEX_CMP_REQUEUE
379operation equivalent to
380.BR FUTEX_WAKE .)
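The argument layout for this operation can be sketched as follows. The
helper name futex_cmp_requeue() is hypothetical; note how the requeue limit
val2 travels in the slot of the timeout argument.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake up to nr_wake waiters on uaddr and requeue up to
   nr_requeue of the remaining waiters onto uaddr2, provided that *uaddr
   still contains `expected` (val3). */
static long futex_cmp_requeue(uint32_t *uaddr, int nr_wake,
                              uint32_t nr_requeue, uint32_t *uaddr2,
                              uint32_t expected)
{
    /* val2 is passed where the prototype shows the timeout pointer. */
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                   (void *) (uintptr_t) nr_requeue, uaddr2, expected);
}
```

A caller would typically reread the futex word and retry when the call fails
with EAGAIN.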
6bac3b85
MK
381.\"
382.\" FIXME I added some FUTEX_WAKE_OP text, and I'd be happy if someone
383.\" checked it.
fea681da 384.TP
d67e21f5
MK
385.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
386.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
6bac3b85
MK
387.\" Author: Jakub Jelinek <jakub@redhat.com>
388.\" Date: Tue Sep 6 15:16:25 2005 -0700
389This operation was added to support some user-space use cases
390where more than one futex must be handled at the same time.
391The most notable example is the implementation of
392.BR pthread_cond_signal (3),
393which requires operations on two futexes,
394the one used to implement the mutex and the one used in the implementation
395of the wait queue associated with the condition variable.
396.BR FUTEX_WAKE_OP
397allows such cases to be implemented without leading to
398high rates of contention and context switching.
399
400The
401.BR FUTEX_WAKE_OP
402operation is equivalent to atomically executing the following code:
403
404.in +4n
405.nf
406int oldval = *(int *) uaddr2;
407*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
408futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
409if (oldval \fIcmp\fP \fIcmparg\fP)
768d3c23 410 futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
6bac3b85
MK
411.fi
412.in
413
414In other words,
415.BR FUTEX_WAKE_OP
416does the following:
417.RS
418.IP * 3
419saves the original value of the futex at
420.IR uaddr2 ;
421.IP *
422performs an operation to modify the value of the futex at
423.IR uaddr2 ;
424.IP *
425wakes up a maximum of
426.I val
427waiters on the futex
428.IR uaddr ;
429and
430.IP *
431dependent on the results of a test of the original value of the futex at
432.IR uaddr2 ,
433wakes up a maximum of
768d3c23 434.I val2
6bac3b85
MK
435waiters on the futex
436.IR uaddr2 .
437.RE
438.IP
6bac3b85
MK
439The operation and comparison that are to be performed are encoded
440in the bits of the argument
441.IR val3 .
442Pictorially, the encoding is:
443
f6af90e7 444.in +8n
6bac3b85 445.nf
f6af90e7
MK
446+---+---+-----------+-----------+
447|op |cmp| oparg | cmparg |
448+---+---+-----------+-----------+
449 4 4 12 12 <== # of bits
6bac3b85
MK
450.fi
451.in
452
453Expressed in code, the encoding is:
454
455.in +4n
456.nf
457#define FUTEX_OP(op, oparg, cmp, cmparg) \\
458 (((op & 0xf) << 28) | \\
459 ((cmp & 0xf) << 24) | \\
460 ((oparg & 0xfff) << 12) | \\
461 (cmparg & 0xfff))
462.fi
463.in
464
465In the above,
466.I op
467and
468.I cmp
469are each one of the codes listed below.
470The
471.I oparg
472and
473.I cmparg
474components are literal numeric values, except as noted below.
475
476The
477.I op
478component has one of the following values:
479
480.in +4n
481.nf
482FUTEX_OP_SET 0 /* uaddr2 = oparg; */
483FUTEX_OP_ADD 1 /* uaddr2 += oparg; */
484FUTEX_OP_OR 2 /* uaddr2 |= oparg; */
485FUTEX_OP_ANDN 3 /* uaddr2 &= ~oparg; */
486FUTEX_OP_XOR 4 /* uaddr2 ^= oparg; */
487.fi
488.in
489
490In addition, bit-wise ORing the following value into
491.I op
492causes
493.IR "(1\ <<\ oparg)"
494to be used as the operand:
495
496.in +4n
497.nf
498FUTEX_OP_ARG_SHIFT 8 /* Use (1 << oparg) as operand */
499.fi
500.in
501
502The
503.I cmp
504field is one of the following:
505
506.in +4n
507.nf
508FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */
509FUTEX_OP_CMP_NE 1 /* if (oldval != cmparg) wake */
510FUTEX_OP_CMP_LT 2 /* if (oldval < cmparg) wake */
511FUTEX_OP_CMP_LE 3 /* if (oldval <= cmparg) wake */
512FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */
513FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */
514.fi
515.in
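The bit layout shown above can be checked with a small function form of the
encoding macro. The function names below are hypothetical, and unsigned
arithmetic is used here so that an op value with the shift flag (8) set does
not overflow a signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Function form of the FUTEX_OP() encoding shown above. */
static uint32_t futex_op_encode(uint32_t op, uint32_t oparg,
                                uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* Hypothetical decoders that return the raw bit fields. */
static uint32_t futex_op_get_op(uint32_t v)     { return (v >> 28) & 0xf; }
static uint32_t futex_op_get_cmp(uint32_t v)    { return (v >> 24) & 0xf; }
static uint32_t futex_op_get_oparg(uint32_t v)  { return (v >> 12) & 0xfff; }
static uint32_t futex_op_get_cmparg(uint32_t v) { return v & 0xfff; }
```

For instance, encoding op 1 (add) with oparg 1 and cmp 4 (greater than) with
cmparg 0 gives 0x14001000.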
516
517The return value of
518.BR FUTEX_WAKE_OP
519is the sum of the number of waiters woken on the futex
520.IR uaddr
521plus the number of waiters woken on the futex
522.IR uaddr2 .
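A call that adds a constant to the second futex and conditionally wakes its
waiters might be written as in the sketch below. The helper name
wake_op_add() is hypothetical; it relies on the FUTEX_OP() macro and the
operation codes from the header, and passes val2 in the timeout slot.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: atomically add oparg to *uaddr2, wake up to nr_wake
   waiters on uaddr and, if the old value of *uaddr2 was greater than
   cmparg, wake up to nr_wake2 waiters on uaddr2 as well. */
static long wake_op_add(uint32_t *uaddr, uint32_t *uaddr2,
                        int nr_wake, uint32_t nr_wake2,
                        uint32_t oparg, uint32_t cmparg)
{
    uint32_t val3 = FUTEX_OP(FUTEX_OP_ADD, oparg, FUTEX_OP_CMP_GT, cmparg);

    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP, nr_wake,
                   (void *) (uintptr_t) nr_wake2, uaddr2, val3);
}
```

The modification of the second futex takes place even when neither futex has
any waiters; in that case the call returns 0.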
d67e21f5 523.TP
79c9b436
TG
524.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
525.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
fd9e59d4 526This operation is like
79c9b436
TG
527.BR FUTEX_WAIT
528except that
529.I val3
530is used to provide a 32-bit bitset to the kernel.
531This bitset is stored in the kernel-internal state of the waiter.
532See the description of
533.BR FUTEX_WAKE_BITSET
534for further details.
535
fd9e59d4
MK
536The
537.BR FUTEX_WAIT_BITSET
538also interprets the
539.I timeout
540argument differently from
541.BR FUTEX_WAIT .
542See the discussion of
543.BR FUTEX_CLOCK_REALTIME ,
544above.
545
79c9b436
TG
546The
547.I uaddr2
548argument is ignored.
549.TP
d67e21f5
MK
550.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
551.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
55cc422d
TG
552This operation is the same as
553.BR FUTEX_WAKE
554except that the
555.I val3
556argument is used to provide a 32-bit bitset to the kernel.
98d769c0
MK
557This bitset is used to select which waiters should be woken up.
558The selection is done by a bit-wise AND of the "wake" bitset
559(i.e., the value in
560.IR val3 )
561and the bitset which is stored in the kernel-internal
09cb4ce7 562state of the waiter (the "wait" bitset that is set using
98d769c0
MK
563.BR FUTEX_WAIT_BITSET ).
564All of the waiters for which the result of the AND is nonzero are woken up;
565the remaining waiters are left sleeping.
566
e9d4496b
MK
567.\" FIXME please review this paragraph that I added
568The effect of
569.BR FUTEX_WAIT_BITSET
570and
571.BR FUTEX_WAKE_BITSET
572is to allow selective wake-ups among multiple waiters that are waiting
573on the same futex;
574since a futex has a size of 32 bits,
575these operations provide 32 wakeup "channels".
576(The
577.BR FUTEX_WAIT
578and
579.BR FUTEX_WAKE
580operations correspond to
581.BR FUTEX_WAIT_BITSET
582and
583.BR FUTEX_WAKE_BITSET
584operations where the bitsets are all ones.)
09cb4ce7 585Note, however, that using this bitset multiplexing feature on a
e9d4496b
MK
586futex is less efficient than simply using multiple futexes,
587because employing bitset multiplexing requires the kernel
588to check all waiters on a futex,
589including those that are not interested in being woken up
590(i.e., they do not have the relevant bit set in their "wait" bitset).
591.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
592.\"
593.\" "The original reason for the addition of these extensions
594.\" was to improve the performance of pthread read-write locks
595.\" in glibc. However, the pthreads library no longer uses the
596.\" same locking algorithm, and these extensions are not used
597.\" without the bitset parameter being all ones.
598.\"
599.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
600.\" is nevertheless used (with a bitset of all ones) in order to
601.\" obtain the absolute timeout functionality that is useful
602.\" for efficiently implementing Pthreads APIs (which use absolute
603.\" timeouts); FUTEX_WAIT provides only relative timeouts.
604
98d769c0
MK
605The
606.I uaddr2
607and
608.I timeout
609arguments are ignored.
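The selection rule can be modeled in a few lines of C. This is only a
user-space model of the bit-wise AND described above, not kernel code; the
names bitset_selects() and ALL_CHANNELS are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All 32 bits set, corresponding to plain FUTEX_WAIT and FUTEX_WAKE. */
#define ALL_CHANNELS 0xffffffffu

/* Model of the selection rule: a waiter is woken if its "wait" bitset and
   the "wake" bitset have at least one bit in common. */
static bool bitset_selects(uint32_t wait_bitset, uint32_t wake_bitset)
{
    return (wait_bitset & wake_bitset) != 0;
}
```

A waiter that registered bit 0 is selected by a wake bitset of 0x3, while a
waiter that registered only bit 2 is left sleeping.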
bd90a5f9
MK
610.\"
611.\"
612.SS Priority-inheritance futexes
b52e1cd4
MK
613Linux supports priority-inheritance (PI) futexes in order to handle
614priority-inversion problems that can be encountered with
615normal futex locks.
79d918c7
MK
616.\"
617.\" FIXME ===== Start of adapted Hart/Guniguntala text =====
618.\" The following text is drawn from the Hart/Guniguntala paper,
619.\" but I have reworded some pieces significantly. Please check it.
620.\"
621The PI futex operations described below differ from the other
622futex operations in that they impose policy on the use of the futex value:
623.IP * 3
7c16fbff 624If the lock is unowned, the futex value shall be 0.
79d918c7
MK
625.IP *
626If the lock is owned, the futex value shall be the thread ID (TID; see
627.BR gettid (2))
628of the owning thread.
629.IP *
630.\" FIXME In the following line, I added "the lock is owned and". Okay?
631If the lock is owned and there are threads contending for the lock,
632then the
633.B FUTEX_WAITERS
634bit shall be set in the futex value; in other words, the futex value is:
635
636 FUTEX_WAITERS | TID
637.PP
638With this policy in place,
639a user-space application can acquire an unowned
b52e1cd4 640lock or release an uncontended lock using atomic
79d918c7 641.\" FIXME In the following line, I added "user-space". Okay?
b52e1cd4
MK
642user-space instructions (e.g.,
643.I cmpxchg
644on the x86 architecture).
645Locking an unowned lock simply consists of setting
646the futex value to the caller's TID.
647Releasing an uncontended lock simply requires setting the futex value to 0.
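The user-space fast paths implied by this policy can be sketched with
compare-and-swap operations. The sketch assumes the GCC/Clang __atomic
builtins; the function names are hypothetical, and the contended cases,
which require a futex() call, are left out here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t current_tid(void)
{
    return (uint32_t) syscall(SYS_gettid);
}

/* Fast path: acquire an unowned lock by changing 0 to the caller's TID. */
static bool pi_try_lock_fast(uint32_t *futex_word)
{
    uint32_t expected = 0;

    return __atomic_compare_exchange_n(futex_word, &expected, current_tid(),
                                       false, __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED);
}

/* Fast path: release an uncontended lock by changing the TID back to 0.
   Fails if FUTEX_WAITERS is set, since the value is then not a bare TID. */
static bool pi_unlock_fast(uint32_t *futex_word)
{
    uint32_t expected = current_tid();

    return __atomic_compare_exchange_n(futex_word, &expected, 0,
                                       false, __ATOMIC_RELEASE,
                                       __ATOMIC_RELAXED);
}
```

When either function returns false, the caller has to fall back to the
kernel operations described below.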
648
649If a futex is currently owned (i.e., has a nonzero value),
650waiters must employ the
79d918c7
MK
651.B FUTEX_LOCK_PI
652operation to acquire the lock.
b52e1cd4 653If a lock is contended (i.e., the
79d918c7 654.B FUTEX_WAITERS
b52e1cd4 655bit is set in the futex value), the lock owner must employ the
79d918c7 656.B FUTEX_UNLOCK_PI
b52e1cd4
MK
657operation to release the lock.
658
79d918c7
MK
659In the cases where callers are forced into the kernel
660(i.e., required to perform a
661.BR futex ()
662operation),
663they then deal directly with a so-called RT-mutex,
664a kernel locking mechanism which implements the required
665priority-inheritance semantics.
666After the RT-mutex is acquired, the futex value is updated accordingly,
667before the calling thread returns to user space.
668.\" FIXME ===== End of adapted Hart/Guniguntala text =====
669
670It is important
671.\" FIXME We need some explanation here of why it is important to note this
672to note that the kernel will update the futex value prior
673to returning to user space.
674Unlike the other futex operations described above,
675the PI futex operations are designed
7c16fbff 676for the implementation of very specific IPC mechanisms.
fc57e6bb
MK
677.\"
678.\" FIXME We don't quite have a definition anywhere of what a PI futex
679.\" is (vs a non-PI futex). Below, we have the information of
680.\" FUTEX_CMP_REQUEUE_PI requeues from a non-PI futex to a
681.\" PI futex, but what determines whether the futex is of one
682.\" kind of the other? We should have such a definition somewhere
683.\" about here.
bd90a5f9
MK
684
685PI futexes are operated on by specifying one of the following values in
686.IR futex_op :
d67e21f5
MK
687.TP
688.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
689.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
67833bec
MK
690.\"
691.\" FIXME I did some significant rewording of tglx's text.
692.\" Please check, in case I injected errors.
693.\"
694This operation is used after an attempt to acquire
695the futex lock via an atomic user-space instruction failed
696because the futex has a nonzero value\(emspecifically,
697because it contained the namespace-specific TID of the lock owner.
67259526 698.\" FIXME In the preceding line, what does "namespace-specific" mean?
67833bec 699.\" (I kept those words from tglx.)
67259526 700.\" That is, what kind of namespace are we talking about?
67833bec
MK
701.\" (I suppose we are talking PID namespaces here, but I want to
702.\" be sure.)
703
704The operation checks the value of the futex at the address
705.IR uaddr .
706If the value is 0, then the kernel tries to atomically set the futex value to
707the caller's TID.
708If that fails,
709.\" FIXME What would be the cause of failure?
710or the futex value is nonzero,
711the kernel atomically sets the
e0547e70 712.B FUTEX_WAITERS
67833bec
MK
713bit, which signals the futex owner that it cannot unlock the futex in
714user space atomically by setting the futex value to 0.
715After that, the kernel tries to find the thread which is
716associated with the owner TID,
717.\" FIXME Could I get a bit more detail on the next two lines?
718.\" What is "creates or reuses kernel state" about?
719creates or reuses kernel state on behalf of the owner
720and attaches the waiter to it.
67259526
MK
721.\" FIXME In the next line, what type of "priority" are we talking about?
722.\" Realtime priorities for SCHED_FIFO and SCHED_RR?
723.\" Or something else?
e0547e70
TG
724The enqueueing of the waiter is in descending priority order if more
725than one waiter exists.
67259526 726.\" FIXME What does "bandwidth" refer to in the next line?
e0547e70 727The owner inherits either the priority or the bandwidth of the waiter.
67259526
MK
728.\" FIXME In the preceding line, what determines whether the
729.\" owner inherits the priority versus the bandwidth?
67833bec
MK
730.\"
731.\" FIXME Could I get some help translating the next sentence into
732.\" something that user-space developers (and I) can understand?
733.\" In particular, what are "nexted locks" in this context?
e0547e70
TG
734This inheritance follows the lock chain in the case of
735nested locking and performs deadlock detection.
736
9ce19cf1
MK
737.\" FIXME tglx says "The timeout argument is handled as described in
738.\" FUTEX_WAIT." However, it appears to me that this is not right.
739.\" Is the following formulation correct.
e0547e70
TG
740The
741.I timeout
9ce19cf1
MK
742argument provides a timeout for the lock attempt.
743It is interpreted as an absolute time, measured against the
744.BR CLOCK_REALTIME
745clock.
746If
747.I timeout
748is NULL, the operation will block indefinitely.
e0547e70 749
a449c634 750The
e0547e70
TG
751.IR uaddr2 ,
752.IR val ,
753and
754.IR val3
a449c634 755arguments are ignored.
fedaeaf3 756.\" FIXME
a9dcb4d1 757.\" tglx noted the following "ERROR" case for FUTEX_LOCK_PI and
670b34f8
MK
758.\" FUTEX_TRYLOCK_PI and FUTEX_WAIT_REQUEUE_PI:
759.\"
a9dcb4d1
MK
760.\" > [EOWNERDIED] The owner of the futex died and the kernel made the
761.\" > caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit
762.\" > in the futex userspace value. Caller is responsible for cleanup
fedaeaf3 763.\"
a9dcb4d1 764.\" However, there is no such thing as an EOWNERDIED error. I had a look
fedaeaf3
MK
765.\" through the kernel source for the FUTEX_OWNER_DIED cases and didn't
766.\" see an obvious error associated with them. Can you clarify? (I think
767.\" the point is that this condition, which is described in
768.\" Documentation/robust-futexes.txt, is not an error as such. However,
769.\" I'm not yet sure of how to describe it in the man page.)
670b34f8 770.\" Suggestions please!
67833bec 771.\"
d67e21f5 772.TP
12fdbe23 773.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
d67e21f5 774.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
12fdbe23
MK
775This operation tries to acquire the futex at
776.IR uaddr .
0b761826
MK
777.\" FIXME I think it would be helpful here to say a few more words about
778.\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI
fa0388c3 779It deals with the situation where the TID value at
12fdbe23
MK
780.I uaddr
781is 0, but the
b52e1cd4 782.B FUTEX_WAITERS
12fdbe23 783bit is set.
fa0388c3
MK
784.\" FIXME How does the situation in the previous sentence come about?
785.\" Probably it would be helpful to say something about that in
786.\" the man page.
badbf70c 787.\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?
12fdbe23 788User space cannot handle this condition in a race-free manner.
084744ef
MK
789
790The
791.IR uaddr2 ,
792.IR val ,
793.IR timeout ,
794and
795.IR val3
796arguments are ignored.
d67e21f5 797.TP
12fdbe23 798.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
d67e21f5 799.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
ecae2099
TG
800This operation wakes the top priority waiter which is waiting in
801.B FUTEX_LOCK_PI
802on the futex address provided by the
803.I uaddr
804argument.
805
806This is called when the user space value at
807.I uaddr
808cannot be changed atomically from a TID (of the owner) to 0.
809
810The
811.IR uaddr2 ,
812.IR val ,
813.IR timeout ,
814and
815.IR val3
11a194bf 816arguments are ignored.
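Putting the lock and unlock operations together, a PI lock might fall back to
the kernel as in the following sketch. It assumes the GCC/Clang __atomic
builtins, passes a NULL timeout (block indefinitely), and does not retry
after EINTR; the names pi_lock() and pi_unlock() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Returns 0 on success, otherwise the errno value of the failed call. */
static int pi_lock(uint32_t *futex_word)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Uncontended case: change 0 to our TID entirely in user space. */
    if (__atomic_compare_exchange_n(futex_word, &expected, tid, false,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;

    /* Lock is owned: let the kernel queue us and update the futex value. */
    if (syscall(SYS_futex, futex_word, FUTEX_LOCK_PI, 0, NULL, NULL, 0) == 0)
        return 0;
    return errno;
}

/* Returns 0 on success, otherwise the errno value of the failed call. */
static int pi_unlock(uint32_t *futex_word)
{
    uint32_t expected = (uint32_t) syscall(SYS_gettid);

    /* Uncontended case: change our TID back to 0 in user space. */
    if (__atomic_compare_exchange_n(futex_word, &expected, 0, false,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;

    /* FUTEX_WAITERS is set: the kernel must hand the lock to a waiter. */
    if (syscall(SYS_futex, futex_word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0) == 0)
        return 0;
    return errno;
}
```

In the uncontended case neither function enters the kernel, which is the
point of the value policy described earlier.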
d67e21f5 817.TP
d67e21f5
MK
818.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
819.\" commit 52400ba946759af28442dee6265c5c0180ac7122
820.\" FIXME to complete
f812a08b
DH
821This operation is a PI-aware variant of
822.BR FUTEX_CMP_REQUEUE .
823It requeues waiters that are blocked via
824.B FUTEX_WAIT_REQUEUE_PI
825on
826.I uaddr
827from a non-PI source futex
828.RI ( uaddr )
829to a PI target futex
830.RI ( uaddr2 ).
831
9e54d26d
MK
832As with
833.BR FUTEX_CMP_REQUEUE ,
834this operation wakes up a maximum of
835.I val
836waiters that are waiting on the futex at
837.IR uaddr .
838However, for
839.BR FUTEX_CMP_REQUEUE_PI ,
840.I val
841is required to be 1.
842The remaining waiters are removed from the wait queue of the source futex at
843.I uaddr
844and added to the wait queue of the target futex at
845.IR uaddr2 .
f812a08b 846
9e54d26d 847The
768d3c23 848.I val2
c6d8cf21
MK
849.\" val2 is the cap on the number of requeued waiters.
850.\" In the glibc pthread_cond_broadcast() implementation, this argument
851.\" is specified as INT_MAX, and for pthread_cond_signal() it is 0.
9e54d26d 852and
768d3c23 853.I val3
9e54d26d
MK
854arguments serve the same purposes as for
855.BR FUTEX_CMP_REQUEUE .
be376673
MK
856.\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/
857.\" notes that "priority-inheritance Futex to priority-inheritance
858.\" Futex requeues are currently unsupported". Do we need to say
859.\" something in the man page about that?
d67e21f5
MK
860.TP
861.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
862.\" commit 52400ba946759af28442dee6265c5c0180ac7122
6ff1b4c0
TG
863This operation waits on a non-PI futex at
864.IR uaddr ;
865the waiter may then be requeued onto a PI futex at
866.IR uaddr2 .
867The wait operation on
868.I uaddr
869is the same as
870.BR FUTEX_WAIT .
871The waiter can be removed from the wait on
872.I uaddr
873via
874.BR FUTEX_WAKE
875without requeueing on
876.IR uaddr2 .
a4e69912 877
63bea7dc
MK
878.\" FIXME Please check the following. tglx said "The timeout argument
879.\" is handled as described in FUTEX_WAIT.", but the truth is
880.\" as below, AFAICS
881If
882.I timeout
883is not NULL, it specifies a timeout for the wait operation;
884this timeout is interpreted as outlined above in the description of the
885.BR FUTEX_CLOCK_REALTIME
886option.
887If
888.I timeout
889is NULL, the operation can block indefinitely.
890
a4e69912
MK
891The
892.I val3
893argument is ignored.
894.\" FIXME Re the preceding sentence, actually 'val3' is internally set to
895.\" FUTEX_BITSET_MATCH_ANY before calling futex_wait_requeue_pi().
896.\" I'm not sure we need to say anything about this though.
897.\" Comments?
47297adb 898.SH RETURN VALUE
fea681da 899.PP
6f147f79 900In the event of an error, all operations return \-1 and set
e808bba0 901.I errno
6f147f79 902to indicate the cause of the error.
e808bba0
MK
903The return value on success depends on the operation,
904as described in the following list:
fea681da
MK
905.TP
906.B FUTEX_WAIT
682edefb
MK
907Returns 0 if the process was woken by a
908.B FUTEX_WAKE
7446a837
MK
909or
910.B FUTEX_WAKE_BITSET
682edefb 911call.
fea681da
MK
912.TP
913.B FUTEX_WAKE
914Returns the number of processes woken up.
915.TP
916.B FUTEX_FD
917Returns the new file descriptor associated with the futex.
918.TP
919.B FUTEX_REQUEUE
920Returns the number of processes woken up.
921.TP
922.B FUTEX_CMP_REQUEUE
3dfcc11d
MK
923Returns the total number of processes woken up or requeued to the futex at
924.IR uaddr2 .
925If this value is greater than
926.IR val ,
927then the difference is the number of waiters requeued to the futex at
928.IR uaddr2 .
519f2c3d
MK
929.\"
930.\" FIXME Add success returns for other operations
dcad19c0
MK
931.TP
932.B FUTEX_WAKE_OP
a8b5b324
MK
933.\" FIXME Is the following correct?
934Returns the total number of waiters that were woken up.
935This is the sum of the woken waiters on the two futexes at
936.I uaddr
937and
938.IR uaddr2 .
dcad19c0
MK
939.TP
940.B FUTEX_WAIT_BITSET
7bcc5351
MK
941.\" FIXME Is the following correct?
942Returns 0 if the process was woken by a
943.B FUTEX_WAKE
944or
945.B FUTEX_WAKE_BITSET
946call.
dcad19c0
MK
947.TP
948.B FUTEX_WAKE_BITSET
b884566b
MK
949.\" FIXME Is the following correct?
950Returns the number of processes woken up.
dcad19c0
MK
951.TP
952.B FUTEX_LOCK_PI
bf02a260
MK
953.\" FIXME Is the following correct?
954Returns 0 if the futex was successfully locked.
dcad19c0
MK
955.TP
956.B FUTEX_TRYLOCK_PI
5c716eef
MK
957.\" FIXME Is the following correct?
958Returns 0 if the futex was successfully locked.
dcad19c0
MK
959.TP
960.B FUTEX_UNLOCK_PI
52bb928f
MK
961.\" FIXME Is the following correct?
962Returns 0 if the futex was successfully unlocked.
dcad19c0
MK
963.TP
964.B FUTEX_CMP_REQUEUE_PI
dddd395a
MK
965.\" FIXME Is the following correct?
966Returns the total number of processes woken up or requeued to the futex at
967.IR uaddr2 .
968If this value is greater than
969.IR val ,
970then the difference is the number of waiters requeued to the futex at
971.IR uaddr2 .
dcad19c0
MK
972.TP
973.B FUTEX_WAIT_REQUEUE_PI
22c15de9
MK
974.\" FIXME Is the following correct?
975Returns 0 if the caller was successfully requeued to the futex at
976.IR uaddr2 .
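.PP
The return convention can be illustrated with a small sketch.
Because glibc provides no wrapper, the call goes through
.BR syscall (2).
The names
.I futex
and
.I futex_wake
below are hypothetical, and the use of
.I uint32_t
for the futex word is an assumption made for this example rather than
the prototype shown in the SYNOPSIS.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper: glibc provides none, so use syscall(2). */
static long
futex(uint32_t *uaddr, int futex_op, uint32_t val,
      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}

/* Wake at most 'count' waiters blocked on 'uaddr'.
   On success, the result is the number of waiters woken up;
   on error, it is -1 and errno is set. */
static long
futex_wake(uint32_t *uaddr, int count)
{
    return futex(uaddr, FUTEX_WAKE, count, NULL, NULL, 0);
}
```

Calling
.I futex_wake
on a futex word that has no waiters returns 0, since nothing was woken;
a return of \-1 always indicates an error.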
fea681da
MK
977.SH ERRORS
978.TP
979.B EACCES
980No read access to futex memory.
981.TP
982.B EAGAIN
f48516d1
MK
983.RB ( FUTEX_WAIT ,
984.BR FUTEX_WAIT_REQUEUE_PI )
badbf70c
MK
985The value pointed to by
986.I uaddr
987was not equal to the expected value
988.I val
989at the time of the call.
990.TP
991.B EAGAIN
8f2068bb
MK
992.RB ( FUTEX_CMP_REQUEUE ,
993.BR FUTEX_CMP_REQUEUE_PI )
ce5602fd 994The value pointed to by
9f6c40c0
MK
995.I uaddr
996is not equal to the expected value
997.IR val3 .
fd1dc4c2 998.\" FIXME: Is the following sentence correct?
fea681da 999(This probably indicates a race;
682edefb
MK
1000use the safe
1001.B FUTEX_WAKE
1002now.)
c0091dd3
MK
1003.\"
1004.\" FIXME Should there be an EAGAIN case for FUTEX_TRYLOCK_PI?
1005.\" It seems so, looking at the handling of the rt_mutex_trylock()
1006.\" call in futex_lock_pi()
1007.\"
fea681da 1008.TP
5662f56a
MK
1009.BR EAGAIN
1010.RB ( FUTEX_LOCK_PI ,
aaec9032
MK
1011.BR FUTEX_TRYLOCK_PI ,
1012.BR FUTEX_CMP_REQUEUE_PI )
1013The futex owner thread ID of
1014.I uaddr
1015(for
1016.BR FUTEX_CMP_REQUEUE_PI :
1017.IR uaddr2 )
1018is about to exit,
5662f56a
MK
1019but has not yet handled the internal state cleanup.
1020Try again.
61f8c1d1
MK
1021.\"
1022.\" FIXME Is there not also an EAGAIN error case on 'uaddr2' for
1023.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1024.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1025.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EAGAIN?
5662f56a 1026.TP
7a39e745
MK
1027.BR EDEADLK
1028.RB ( FUTEX_LOCK_PI ,
1029.BR FUTEX_TRYLOCK_PI )
1030The futex at
1031.I uaddr
1032is already locked by the caller.
d08ce5dd
MK
1033.\"
1034.\" FIXME Is there not also an EDEADLK error case on 'uaddr2' for
1035.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1036.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1037.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EDEADLK?
7a39e745 1038.TP
662c0da8
MK
1039.BR EDEADLK
1040.\" FIXME I reworded tglx's text somewhat; is the following okay?
1041.RB ( FUTEX_CMP_REQUEUE_PI )
1042While requeueing a waiter to the PI futex at
1043.IR uaddr2 ,
1044the kernel detected a deadlock.
1045.TP
fea681da 1046.B EFAULT
1ea901e8
MK
1047A required pointer argument (i.e.,
1048.IR uaddr ,
1049.IR uaddr2 ,
1050or
1051.IR timeout )
496df304 1052did not point to a valid user-space address.
fea681da 1053.TP
9f6c40c0 1054.B EINTR
e808bba0 1055A
9f6c40c0 1056.B FUTEX_WAIT
2674f781
MK
1057or
1058.B FUTEX_WAIT_BITSET
e808bba0
MK
1059operation was interrupted by a signal (see
1060.BR signal (7))
1061or a spurious wakeup.
5eeca856
MK
1062.\" FIXME
1063.\" Regarding the words "spurious wakeup" above, I received this
1064.\" bug report from Rich Felker:
1065.\"
1066.\" I see no code in the kernel whereby a "spurious wakeup", or anything
1067.\" other than interruption by a signal handler that's not SA_RESTART,
1068.\" can cause futex to fail with EINTR. In general, overloading of EINTR
1069.\" and/or spurious EINTRs from a syscall make it impossible to use that
1070.\" syscall for implementing any function where EINTR is a mandatory
1071.\" failure on interruption-by-signal, since there is no way for
1072.\" userspace to distinguish whether the EINTR occurred as a result of
1073.\" an interrupting signal or some other reason. The kernel folks have
1074.\" gone to great lengths to fix spurious EINTRs (see signal(7) for
1075.\" history), especially by non-interrupting signal handlers, including
1076.\" in futex, and allowing EINTR here would be contrary to that goal.
1077.\"
1078.\" It's my belief that the "or a spurious wakeup" text should simply be
1079.\" removed.
1080.\"
1081.\" The reason I'm raising this topic is its relevance to a thread on
1082.\" libc-alpha:
1083.\" [RFC] mutex destruction (#13690): problem description and workarounds
1084.\"
1085.\" The bug and mailing list discussions to which Rich refers are:
1086.\" https://sourceware.org/bugzilla/show_bug.cgi?id=13690
1087.\" https://sourceware.org/ml/libc-alpha/2014-12/threads.html#0001
1088.\"
1089.\" Can anyone comment on whether the words "spurious wakeup" are correct?
1090.\"
9f6c40c0 1091.TP
fea681da 1092.B EINVAL
180f97b7
MK
1093The operation in
1094.IR futex_op
1095is one of those that employs a timeout, but the supplied
fb2f4c27
MK
1096.I timeout
1097argument was invalid
1098.RI ( tv_sec
1099was less than zero, or
1100.IR tv_nsec
1101was not less than 1,000,000,000).
1102.TP
1103.B EINVAL
0c74df0b 1104The operation specified in
025e1374 1105.IR futex_op
0c74df0b 1106employs one or both of the pointers
51ee94be 1107.I uaddr
a1f47699 1108and
0c74df0b
MK
1109.IR uaddr2 ,
1110but one of these does not point to a valid object\(emthat is,
1111the address is not four-byte-aligned.
51ee94be
MK
1112.TP
1113.B EINVAL
55cc422d
TG
1114.RB ( FUTEX_WAIT_BITSET ,
1115.BR FUTEX_WAKE_BITSET )
79c9b436
TG
1116The bitset supplied in
1117.IR val3
1118is zero.
1119.TP
1120.B EINVAL
2043f2c1
MK
1121.RB ( FUTEX_REQUEUE ,
1122.\" FIXME tglx suggested adding this, but does this error really occur for
1123.\" FUTEX_REQUEUE? (The case where it occurs for FUTEX_CMP_REQUEUE_PI
1124.\" is obvious at the start of futex_requeue().)
1125.BR FUTEX_CMP_REQUEUE_PI )
add875c0
MK
1126.I uaddr
1127equals
1128.IR uaddr2
1129(i.e., an attempt was made to requeue to the same futex).
1130.TP
ff597681
MK
1131.BR EINVAL
1132.RB ( FUTEX_FD )
1133The signal number supplied in
1134.I val
1135is invalid.
1136.TP
6bac3b85 1137.B EINVAL
476debd7
MK
1138.RB ( FUTEX_WAKE ,
1139.BR FUTEX_WAKE_OP ,
1140.BR FUTEX_WAKE_BITSET ,
1141.BR FUTEX_REQUEUE ,
1142.BR FUTEX_CMP_REQUEUE )
1143The kernel detected an inconsistency between the user-space state at
1144.I uaddr
1145and the kernel state\(emthat is, it detected a waiter which waits in
1146.BR FUTEX_LOCK_PI
1147on
1148.IR uaddr .
1149.TP
1150.B EINVAL
a218ef20 1151.RB ( FUTEX_LOCK_PI ,
ce022f18
MK
1152.BR FUTEX_TRYLOCK_PI ,
1153.BR FUTEX_UNLOCK_PI )
a218ef20
MK
1154The kernel detected an inconsistency between the user-space state at
1155.I uaddr
1156and the kernel state.
ce022f18
MK
1157This indicates either state corruption
1158.\" FIXME tglx did not mention the "state corruption" for FUTEX_UNLOCK_PI.
1159.\" Does that case also apply for FUTEX_UNLOCK_PI?
1160or that the kernel found a waiter on
a218ef20
MK
1161.I uaddr
1162which is waiting via
1163.BR FUTEX_WAIT
1164or
1165.BR FUTEX_WAIT_BITSET .
1166.TP
1167.B EINVAL
f9250b1a
MK
1168.RB ( FUTEX_CMP_REQUEUE_PI )
1169The kernel detected an inconsistency between the user-space state at
99c0041d
MK
1170.I uaddr2
1171and the kernel state;
1172that is, the kernel detected a waiter which waits via
1173.BR FUTEX_WAIT
1174.\" FIXME tglx did not mention FUTEX_WAIT_BITSET here,
1175.\" but should that not also be included here?
1176on
1177.IR uaddr2 .
1178.TP
1179.B EINVAL
1180.RB ( FUTEX_CMP_REQUEUE_PI )
1181The kernel detected an inconsistency between the user-space state at
f9250b1a
MK
1182.I uaddr
1183and the kernel state;
1184that is, the kernel detected a waiter which waits via
99c0041d
MK
1185.BR FUTEX_LOCK_PI ,
1186.BR FUTEX_WAIT ,
1187or
1188.BR FUTEX_WAIT_BITSET ,
f9250b1a
MK
1189on
1190.IR uaddr .
1191.TP
1192.B EINVAL
99c0041d
MK
1193.RB ( FUTEX_CMP_REQUEUE_PI )
1194.TP
1195.B EINVAL
4832b48a 1196Invalid argument.
fea681da 1197.TP
a449c634
MK
1198.BR ENOMEM
1199.RB ( FUTEX_LOCK_PI ,
e34a8fb6
MK
1200.BR FUTEX_TRYLOCK_PI ,
1201.BR FUTEX_CMP_REQUEUE_PI )
a449c634
MK
1202The kernel could not allocate memory to hold state information.
1203.TP
fea681da 1204.B ENFILE
ff597681 1205.RB ( FUTEX_FD )
fea681da 1206The system limit on the total number of open files has been reached.
4701fc28
MK
1207.TP
1208.B ENOSYS
1209Invalid operation specified in
d33602c4 1210.IR futex_op .
9f6c40c0 1211.TP
4a7e5b05
MK
1212.B ENOSYS
1213The
1214.BR FUTEX_CLOCK_REALTIME
1215option was specified in
1afcee7c 1216.IR futex_op ,
4a7e5b05
MK
1217but the accompanying operation was neither
1218.BR FUTEX_WAIT_BITSET
1219nor
1220.BR FUTEX_WAIT_REQUEUE_PI .
1221.TP
a9dcb4d1
MK
1222.BR ENOSYS
1223.RB ( FUTEX_LOCK_PI ,
f2424fae 1224.BR FUTEX_TRYLOCK_PI ,
4945ff19 1225.BR FUTEX_UNLOCK_PI ,
794bb106
MK
1226.BR FUTEX_CMP_REQUEUE_PI ,
1227.BR FUTEX_WAIT_REQUEUE_PI )
a9dcb4d1 1228A run-time check determined that the operation is not available.
a2ebebcd
MK
1229The PI futex operations are not implemented on all architectures and
1230are not supported on some CPU variants.
a9dcb4d1 1231.TP
c7589177
MK
1232.BR EPERM
1233.RB ( FUTEX_LOCK_PI ,
dc2742a8
MK
1234.BR FUTEX_TRYLOCK_PI ,
1235.BR FUTEX_CMP_REQUEUE_PI )
04331c3f 1236The caller is not allowed to attach itself to the futex at
dc2742a8
MK
1237.I uaddr
1238(for
1239.BR FUTEX_CMP_REQUEUE_PI :
1240the futex at
1241.IR uaddr2 ).
c7589177 1242(This may be caused by a state corruption in user space.)
61f8c1d1
MK
1243.\"
1244.\" FIXME Is there not also an EPERM error case on 'uaddr2' for
1245.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1246.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1247.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EPERM?
c7589177 1248.TP
76f347ba 1249.BR EPERM
87276709 1250.RB ( FUTEX_UNLOCK_PI )
76f347ba
MK
1251The caller does not own the futex.
1252.TP
0b0e4934
MK
1253.BR ESRCH
1254.RB ( FUTEX_LOCK_PI ,
1255.BR FUTEX_TRYLOCK_PI )
1256.\" FIXME I reworded the following sentence a bit differently from
1257.\" tglx's formulation. Is it okay?
1258The thread ID in the futex at
1259.I uaddr
1260does not exist.
61f8c1d1
MK
1261.\"
1262.\" FIXME Is there not also an ESRCH error case on 'uaddr2' for
1263.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1264.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1265.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> ESRCH?
0b0e4934 1266.TP
360f773c
MK
1267.BR ESRCH
1268.RB ( FUTEX_CMP_REQUEUE_PI )
1269.\" FIXME I reworded the following sentence a bit differently from
1270.\" tglx's formulation. Is it okay?
1271The thread ID in the futex at
1272.I uaddr2
1273does not exist.
1274.TP
9f6c40c0 1275.B ETIMEDOUT
4d85047f
MK
1276The operation in
1277.IR futex_op
1278employed the timeout specified in
1279.IR timeout ,
1280and the timeout expired before the operation completed.
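.PP
Three of the errors listed above are easy to provoke from a single thread,
which the following sketch uses to show how a caller might tell them apart.
The helper name
.I futex_wait_errno
is hypothetical; it passes a relative timeout to
.B FUTEX_WAIT
and reports the failure as an
.I errno
value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block while *uaddr == expected, for at most
   'timeout' (a relative interval; NULL means no limit).
   Returns 0 if the caller was woken, otherwise the errno value
   describing why the wait failed. */
static int
futex_wait_errno(uint32_t *uaddr, uint32_t expected,
                 const struct timespec *timeout)
{
    if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                timeout, NULL, 0) == -1)
        return errno;   /* e.g., EAGAIN, EINTR, EINVAL, ETIMEDOUT */
    return 0;
}
```

With a futex word holding 1, waiting for the value 0 fails immediately with
.BR EAGAIN ;
waiting for the value 1 with a short timeout and no waker fails with
.BR ETIMEDOUT ;
and supplying a
.I tv_nsec
of 1,000,000,000 fails with
.BR EINVAL .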
47297adb 1281.SH VERSIONS
a1d5f77c 1282.PP
81c9d87e
MK
1283Futexes were first made available in a stable kernel release
1284with Linux 2.6.0.
1285
a1d5f77c
MK
1286Initial futex support was merged in Linux 2.5.7 but with different semantics
1287from what is described above.
52dee70e 1288A four-argument system call with the semantics
fd3fa7ef 1289described in this page was introduced in Linux 2.5.40.
11b520ed 1290In Linux 2.5.70, one argument
a1d5f77c 1291was added.
11b520ed 1292In Linux 2.6.7, a sixth argument was added\(emmessy, especially
a1d5f77c 1293on the s390 architecture.
47297adb 1294.SH CONFORMING TO
8382f16d 1295This system call is Linux-specific.
47297adb 1296.SH NOTES
fea681da 1297.PP
fcdad7d6 1298To reiterate, bare futexes are not intended as an easy-to-use abstraction
c13182ef 1299for end-users.
fcdad7d6 1300(There is no wrapper function for this system call in glibc.)
c13182ef 1301Implementors are expected to be assembly literate and to have
7fac88a9 1302read the sources of the futex user-space library referenced below.
d282bb24 1303.\" .SH AUTHORS
fea681da
MK
1304.\" .PP
1305.\" Futexes were designed and worked on by
1306.\" Hubertus Franke (IBM Thomas J. Watson Research Center),
1307.\" Matthew Kirkwood, Ingo Molnar (Red Hat)
1308.\" and Rusty Russell (IBM Linux Technology Center).
1309.\" This page written by bert hubert.
47297adb 1310.SH SEE ALSO
9913033c 1311.BR get_robust_list (2),
d806bc05 1312.BR restart_syscall (2),
14d8dd3b 1313.BR futex (7)
fea681da 1314.PP
f5ad572f
MK
1315The following kernel source files:
1316.IP * 2
1317.I Documentation/pi-futex.txt
1318.IP *
1319.I Documentation/futex-requeue-pi.txt
1320.IP *
1321.I Documentation/locking/rt-mutex.txt
1322.IP *
1323.I Documentation/locking/rt-mutex-design.txt
43b99089 1324.PP
52087dd3 1325\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
9b936e9e
MK
1326(proceedings of the Ottawa Linux Symposium 2002), online at
1327.br
608bf950
SK
1328.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
1329.UE
f42eb21b 1330
2ed26199
MK
1331\fIA futex overview and update\fP, 11 November 2009
1332.UR http://lwn.net/Articles/360699/
1333.UE
1334
0483b6cc
MK
1335\fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
1336(2009 Real-Time Linux Workshop)
1337.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
1338.UE
1339
f42eb21b
MK
1340\fIFutexes Are Tricky\fP (updated in 2011), Ulrich Drepper
1341.UR http://www.akkadia.org/drepper/futex.pdf
1342.UE
9b936e9e
MK
1343.PP
1344Futex example library, futex-*.tar.bz2 at
1345.br
a605264d 1346.UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
608bf950 1347.UE
34f14794
MK
1348.\"
1349.\" FIXME Are there any other resources that should be listed
1350.\" in the SEE ALSO section?