.\" Page by b.hubert
.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
.\" may be freely modified and distributed
.\" %%%LICENSE_END
.\"
.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
.\" added ERRORS section.
.\"
.\" Modified 2004-06-17 mtk
.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
.\"
.TH FUTEX 2 2014-05-21 "Linux" "Linux Programmer's Manual"
.SH NAME
futex \- fast user-space locking
.SH SYNOPSIS
.nf
.sp
.B "#include <linux/futex.h>"
.B "#include <sys/time.h>"
.sp
.BI "int futex(int *" uaddr ", int " futex_op ", int " val ,
.BI "          const struct timespec *" timeout , \
"   \fR  /* or: \fBu32 \fIval2\fP */
.BI "          int *" uaddr2 ", int " val3 );
.fi

.IR Note :
There is no glibc wrapper for this system call; see NOTES.
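.PP
Because there is no glibc wrapper, a caller typically reaches
.BR futex ()
through
.BR syscall (2).
The following is a minimal sketch, assuming a Linux system on which
.I <sys/syscall.h>
defines
.BR SYS_futex ;
the helper name
.I futex_call
is hypothetical and is not provided by any library.
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: forwards its arguments unchanged to the raw
   futex system call.  Returns -1 and sets errno on failure. */
static long
futex_call(int *uaddr, int futex_op, int val,
           const struct timespec *timeout, int *uaddr2, int val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}
```
.fi
.in
.PP
The argument order mirrors the prototype shown in the SYNOPSIS,
so each operation described below maps directly onto one call.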
.SH DESCRIPTION
.PP
The
.BR futex ()
system call provides a method for
a program to wait for a value at a given address to change, and a
method to wake up anyone waiting on a particular address (while the
addresses for the same memory in separate processes may not be
equal, the kernel maps them internally so the same memory mapped in
different locations will correspond for
.BR futex ()
calls).
This system call is typically used to
implement the contended case of a lock in shared memory, as
described in
.BR futex (7).
.PP
When a futex operation did not finish uncontended in user space, a
.BR futex ()
call needs to be made to the kernel to arbitrate.
Arbitration can either mean putting the calling
process to sleep or, conversely, waking a waiting process.
.PP
Callers of
.BR futex ()
are expected to adhere to the semantics described in
.BR futex (7).
As these
semantics involve writing nonportable assembly instructions, this in turn
probably means that most users will in fact be library authors and not
general application developers.
.PP
The
.I uaddr
argument points to an integer which stores the counter (futex).
On all platforms, futexes are four-byte integers that
must be aligned on a four-byte boundary.
The operation to perform on the futex is specified in the
.I futex_op
argument;
.IR val
is a value whose meaning and purpose depend on
.IR futex_op .

The remaining arguments
.RI ( timeout ,
.IR uaddr2 ,
and
.IR val3 )
are required only for certain of the futex operations described below.
Where one of these arguments is not required, it is ignored.

For several blocking operations, the
.I timeout
argument is a pointer to a
.IR timespec
structure that specifies a timeout for the operation.
However, notwithstanding the prototype shown above, for some operations,
this argument is instead a four-byte integer whose meaning
is determined by the operation.
For these operations, the kernel casts the
.I timeout
value to
.IR u32 ,
and in the remainder of this page, this argument is referred to as
.I val2
when interpreted in this fashion.

Where it is required, the
.IR uaddr2
argument is a pointer to a second futex that is employed by the operation.
The interpretation of the final integer argument,
.IR val3 ,
depends on the operation.
.PP
The
.I futex_op
argument consists of two parts:
a command that specifies the operation to be performed,
bit-wise ORed with zero or more options that
modify the behaviour of the operation.
The options that may be included in
.I futex_op
are as follows:
116.TP
117.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
118.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
119This option bit can be employed with all futex operations.
120It tells the kernel that the futex is process private and not shared
121with another process.
122This allows the kernel to choose the fast path for validating
123the user-space address and avoids expensive VMA lookups,
124taking reference counts on file backing store, and so on.
ae2c1774
MK
125
126As a convenience,
127.IR <linux/futex.h>
128defines a set of constants with the suffix
129.BR _PRIVATE
130that are equivalents of all of the operations listed below,
dcdfde26 131.\" except the obsolete FUTEX_FD, for which the "private" flag was
ae2c1774
MK
132.\" meaningless
133but with the
134.BR FUTEX_PRIVATE_FLAG
135ORed into the constant value.
136Thus, there are
137.BR FUTEX_WAIT_PRIVATE ,
138.BR FUTEX_WAKE_PRIVATE ,
139and so on.
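.IP
The relationship between a plain operation and its
.BR _PRIVATE
equivalent can be sketched as follows.
The helper names are hypothetical; the constants are assumed to come from
.IR <linux/futex.h> .
.in +4n
.nf
```c
#include <linux/futex.h>

/* Hypothetical helper: returns the process-private form of a futex
   operation by ORing in FUTEX_PRIVATE_FLAG. */
static int
futex_op_private(int futex_op)
{
    return futex_op | FUTEX_PRIVATE_FLAG;
}

/* Hypothetical helper: strips the option bits, leaving only the
   command part of futex_op. */
static int
futex_op_command(int futex_op)
{
    return futex_op & FUTEX_CMD_MASK;
}
```
.fi
.in
.IP
For example, applying the first helper to
.BR FUTEX_WAIT
yields the same value as the predefined constant
.BR FUTEX_WAIT_PRIVATE .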
2e98bbc2
TG
140.TP
141.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
142.\" commit 1acdac104668a0834cfa267de9946fac7764d486
4a7e5b05 143This option bit can be employed only with the
2e98bbc2
TG
144.BR FUTEX_WAIT_BITSET
145and
146.BR FUTEX_WAIT_REQUEUE_PI
c84cf68c 147operations.
2e98bbc2 148
f2103b26
MK
149If this option is set, the kernel treats
150.I timeout
151as an absolute time based on
2e98bbc2
TG
152.BR CLOCK_REALTIME .
153
f2103b26
MK
If this option is not set, the kernel measures the
.I timeout
against the
.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
.BR CLOCK_MONOTONIC
clock.
6be4bad7
MK
161.PP
162The operation specified in
d33602c4 163.I futex_op
6be4bad7 164is one of the following:
fea681da 165.TP
81c9d87e
MK
166.BR FUTEX_WAIT " (since Linux 2.6.0)"
167.\" Strictly speaking, since some time in 2.5.x
f065673c
MK
168This operation tests that the value at the
169location pointed to by the futex address
fea681da
MK
170.I uaddr
171still contains the value
172.IR val ,
f065673c 173and then sleeps awaiting
682edefb 174.B FUTEX_WAKE
f065673c
MK
175on the futex address.
176The test and sleep steps are performed atomically.
177If the futex value does not match
178.IR val ,
4710334a 179then the call fails immediately with the error
badbf70c 180.BR EAGAIN .
f065673c
MK
181.\" FIXME I added the following sentence. Please confirm that it is correct.
The purpose of the test step is to detect races where
another process changes the value of the futex between
the time it was last checked and the time of the
.BR FUTEX_WAIT
operation.
1909e523 187
c13182ef 188If the
fea681da 189.I timeout
1c952cf5
MK
190argument is non-NULL, its contents specify a relative timeout for the wait
191.\" FIXME I added CLOCK_MONOTONIC here. Is it correct?
192measured according to the
193.BR CLOCK_MONOTONIC
194clock.
82a6092b
MK
195(This interval will be rounded up to the system clock granularity,
196and kernel scheduling delays mean that the
197blocking interval may overrun by a small amount.)
198If
199.I timeout
200is NULL, the call blocks indefinitely.
4798a7f3 201
c13182ef 202The arguments
fea681da
MK
203.I uaddr2
204and
205.I val3
206are ignored.
207
208For
a8bda636 209.BR futex (7),
fea681da
MK
210this call is executed if decrementing the count gave a negative value
211(indicating contention), and will sleep until another process releases
682edefb
MK
212the futex and executes the
213.B FUTEX_WAKE
214operation.
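.IP
The test-and-sleep behavior can be observed with a small sketch such as
the following.
The helper names are hypothetical, and the sketch assumes that a wait
which times out fails with
.BR ETIMEDOUT .
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep on *uaddr while it still holds "expected".
   A NULL timeout blocks indefinitely; otherwise it is a relative time. */
static long
futex_wait(int *uaddr, int expected, const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, timeout, NULL, 0);
}

/* Returns 0 if woken, or the errno value describing why the wait ended:
   EAGAIN if *uaddr did not hold "expected", ETIMEDOUT on timeout. */
static int
wait_with_timeout_ms(int *uaddr, int expected, long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    if (futex_wait(uaddr, expected, &ts) == -1)
        return errno;
    return 0;
}
```
.fi
.in
.IP
A caller would normally reread the futex value and retry after
.BR EAGAIN ,
since that error only reports that the value changed before the sleep began.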
fea681da 215.TP
81c9d87e
MK
216.BR FUTEX_WAKE " (since Linux 2.6.0)"
217.\" Strictly speaking, since Linux 2.5.x
f065673c
MK
218This operation wakes at most
219.I val
220processes waiting (i.e., inside
221.BR FUTEX_WAIT )
222on the futex at the address
223.IR uaddr .
224Most commonly,
225.I val
226is specified as either 1 (wake up a single waiter) or
227.BR INT_MAX
228(wake up all waiters).
730bfbda
MK
229.\" FIXME Please confirm that the following is correct:
230No guarantee is provided about which waiters are awoken
231(e.g., a waiter with a higher scheduling priority is not guaranteed
232to be awoken in preference to a waiter with a lower priority).
4798a7f3 233
fea681da
MK
234The arguments
235.IR timeout ,
c8b921bd 236.IR uaddr2 ,
fea681da
MK
237and
238.I val3
239are ignored.
240
241For
a8bda636 242.BR futex (7),
fea681da
MK
243this is executed if incrementing
244the count showed that there were waiters, once the futex value has been set
245to 1 (indicating that it is available).
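.IP
The two common choices for
.I val
can be wrapped as in the following sketch; the helper names are
hypothetical.
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake at most "count" waiters blocked on uaddr.
   Returns the number of waiters woken, or -1 with errno set. */
static long
futex_wake(int *uaddr, int count)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, count, NULL, NULL, 0);
}

/* Wake up a single waiter. */
static long
futex_wake_one(int *uaddr)
{
    return futex_wake(uaddr, 1);
}

/* Wake up all waiters. */
static long
futex_wake_all(int *uaddr)
{
    return futex_wake(uaddr, INT_MAX);
}
```
.fi
.in
.IP
When nothing is waiting on the futex, both helpers return 0.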
a7c2bf45
MK
246.TP
247.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
248.\" Strictly speaking, from Linux 2.5.x to 2.6.25
249This operation creates a file descriptor that is associated with the futex at
250.IR uaddr .
251.\" , suitable for .BR poll (2).
252The calling process must close the returned file descriptor after use.
When another process performs a
.BR FUTEX_WAKE
on the futex, the file descriptor is reported as readable by
.BR select (2),
.BR poll (2),
and
.BR epoll (7).
260
261The file descriptor can be used to obtain asynchronous notifications:
262if
263.I val
264is nonzero, then when another process executes a
265.BR FUTEX_WAKE ,
266the caller will receive the signal number that was passed in
267.IR val .
268
The arguments
.IR timeout ,
.IR uaddr2 ,
and
.I val3
are ignored.
275
276To prevent race conditions, the caller should test if the futex has
277been upped after
278.B FUTEX_FD
279returns.
280
281Because it was inherently racy,
282.B FUTEX_FD
283has been removed
284.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
285from Linux 2.6.26 onward.
286.TP
287.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
288.\" Strictly speaking: from Linux 2.5.70
289.\"
290.\" FIXME I added this warning. Okay?
291.IR "Avoid using this operation" .
292It is broken (unavoidably racy) for its intended purpose.
293Use
294.BR FUTEX_CMP_REQUEUE
295instead.
296
297This operation performs the same task as
298.BR FUTEX_CMP_REQUEUE ,
299except that no check is made using the value in
300.IR val3 .
301(The argument
302.I val3
303is ignored.)
304.TP
305.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
306This operation was added as a replacement for the earlier
307.BR FUTEX_REQUEUE ,
308because that operation was racy for its intended use.
309
310As with
311.BR FUTEX_REQUEUE ,
312the
313.BR FUTEX_CMP_REQUEUE
314operation is used to avoid a "thundering herd" effect when
315.B FUTEX_WAKE
316is used and all processes woken up need to acquire another futex.
317It differs from
318.BR FUTEX_REQUEUE
319in that it first checks whether the location
320.I uaddr
321still contains the value
322.IR val3 .
323If not, the operation fails with the error
324.BR EAGAIN .
325.\" FIXME I added the following sentence on rational for FUTEX_CMP_REQUEUE.
326.\" Is it correct? SHould it be expanded?
327This additional feature of
328.BR FUTEX_CMP_REQUEUE
329can be used by the caller to (atomically) detect changes
330in the value of the target futex at
331.IR uaddr2 .
332
333The operation wakes up a maximum of
334.I val
335waiters that are waiting on the futex at
336.IR uaddr .
337If there are more than
338.I val
339waiters, then the remaining waiters are removed
340from the wait queue of the source futex at
341.I uaddr
342and added to the wait queue of the target futex at
343.IR uaddr2 .
936876a9 344
a7c2bf45 345The
768d3c23 346.I val2
936876a9 347argument specifies an upper limit on the number of waiters
a7c2bf45 348that are requeued to the futex at
768d3c23 349.IR uaddr2 .
a7c2bf45
MK
350
351.\" FIXME Please review the following new paragraph to see if it is
352.\" accurate.
Typical values to specify for
.I val
are 0 or 1.
(Specifying
.BR INT_MAX
is not useful, because it would make the
.BR FUTEX_CMP_REQUEUE
operation equivalent to
.BR FUTEX_WAKE .)
The limit value specified via
.I val2
is typically either 1 or
.BR INT_MAX .
(Specifying the argument as 0 is not useful, because it would make the
.BR FUTEX_CMP_REQUEUE
operation equivalent to
.BR FUTEX_WAKE .)
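.IP
The following sketch shows one way to pass the arguments, in particular
.IR val2 ,
which occupies the position of
.I timeout
in the prototype.
The helper names are hypothetical, and widening
.I val2
to
.I "unsigned long"
for the call is an assumption of this sketch.
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper.  "nr_requeue" is val2: it travels in the slot
   that the prototype labels "timeout". */
static long
futex_cmp_requeue(int *uaddr, int nr_wake, unsigned int nr_requeue,
                  int *uaddr2, int expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                   (unsigned long) nr_requeue, uaddr2, expected);
}

/* Wake one waiter on "from" and move all the others to "to", provided
   that *from still holds "expected" (otherwise fails with EAGAIN). */
static long
wake_one_requeue_rest(int *from, int *to, int expected)
{
    return futex_cmp_requeue(from, 1, INT_MAX, to, expected);
}
```
.fi
.in
.IP
Waking one waiter and requeueing the rest is the pattern that avoids the
thundering herd: only one thread competes for the target futex at a time.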
6bac3b85
MK
370.\"
371.\" FIXME I added some FUTEX_WAKE_OP text, and I'd be happy if someone
372.\" checked it.
fea681da 373.TP
d67e21f5
MK
374.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
375.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
6bac3b85
MK
376.\" Author: Jakub Jelinek <jakub@redhat.com>
377.\" Date: Tue Sep 6 15:16:25 2005 -0700
378This operation was added to support some user-space use cases
379where more than one futex must be handled at the same time.
380The most notable example is the implementation of
381.BR pthread_cond_signal (3),
382which requires operations on two futexes,
383the one used to implement the mutex and the one used in the implementation
384of the wait queue associated with the condition variable.
385.BR FUTEX_WAKE_OP
386allows such cases to be implemented without leading to
387high rates of contention and context switching.
388
The
.BR FUTEX_WAKE_OP
operation is equivalent to atomically executing the following code:

.in +4n
.nf
int oldval = *(int *) uaddr2;
*(int *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
if (oldval \fIcmp\fP \fIcmparg\fP)
    futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
.fi
.in

In other words,
.BR FUTEX_WAKE_OP
does the following:
406.RS
407.IP * 3
408saves the original value of the futex at
409.IR uaddr2 ;
410.IP *
411performs an operation to modify the value of the futex at
412.IR uaddr2 ;
413.IP *
414wakes up a maximum of
415.I val
416waiters on the futex
417.IR uaddr ;
418and
419.IP *
420dependent on the results of a test of the original value of the futex at
421.IR uaddr2 ,
422wakes up a maximum of
768d3c23 423.I val2
6bac3b85
MK
424waiters on the futex
425.IR uaddr2 .
426.RE
427.IP
6bac3b85
MK
428The operation and comparison that are to be performed are encoded
429in the bits of the argument
430.IR val3 .
431Pictorially, the encoding is:
432
.in +8n
.nf
+---+---+-----------+-----------+
|op |cmp|   oparg   |  cmparg   |
+---+---+-----------+-----------+
  4   4       12          12    <== # of bits
.fi
.in
441
442Expressed in code, the encoding is:
443
444.in +4n
445.nf
#define FUTEX_OP(op, oparg, cmp, cmparg) \\
                (((op & 0xf) << 28) | \\
                ((cmp & 0xf) << 24) | \\
                ((oparg & 0xfff) << 12) | \\
                (cmparg & 0xfff))
451.fi
452.in
453
454In the above,
455.I op
456and
457.I cmp
458are each one of the codes listed below.
459The
460.I oparg
461and
462.I cmparg
463components are literal numeric values, except as noted below.
464
465The
466.I op
467component has one of the following values:
468
469.in +4n
470.nf
FUTEX_OP_SET        0  /* uaddr2 = oparg; */
FUTEX_OP_ADD        1  /* uaddr2 += oparg; */
FUTEX_OP_OR         2  /* uaddr2 |= oparg; */
FUTEX_OP_ANDN       3  /* uaddr2 &= ~oparg; */
FUTEX_OP_XOR        4  /* uaddr2 ^= oparg; */
476.fi
477.in
478
479In addition, bit-wise ORing the following value into
480.I op
481causes
482.IR "(1\ <<\ oparg)"
483to be used as the operand:
484
485.in +4n
486.nf
FUTEX_OP_ARG_SHIFT  8  /* Use (1 << oparg) as operand */
488.fi
489.in
490
491The
492.I cmp
493field is one of the following:
494
495.in +4n
496.nf
FUTEX_OP_CMP_EQ     0  /* if (oldval == cmparg) wake */
FUTEX_OP_CMP_NE     1  /* if (oldval != cmparg) wake */
FUTEX_OP_CMP_LT     2  /* if (oldval < cmparg) wake */
FUTEX_OP_CMP_LE     3  /* if (oldval <= cmparg) wake */
FUTEX_OP_CMP_GT     4  /* if (oldval > cmparg) wake */
FUTEX_OP_CMP_GE     5  /* if (oldval >= cmparg) wake */
503.fi
504.in
505
506The return value of
507.BR FUTEX_WAKE_OP
508is the sum of the number of waiters woken on the futex
509.IR uaddr
510plus the number of waiters woken on the futex
511.IR uaddr2 .
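.IP
To make the encoding concrete, the following sketch packs the four fields
into a
.I val3
value and models, purely in user space, the modify-and-test steps listed above.
Both function names are hypothetical.
The model assumes small non-negative
.I oparg
and
.I cmparg
values and masks the shift count to the range 0 to 31;
it does not describe how the kernel treats other inputs.
.in +4n
.nf
```c
#include <linux/futex.h>
#include <stdint.h>

/* Hypothetical helper: packs the fields using the layout shown above
   (op: 4 bits, cmp: 4 bits, oparg: 12 bits, cmparg: 12 bits). */
static uint32_t
wake_op_encode(uint32_t op, uint32_t oparg, uint32_t cmp, uint32_t cmparg)
{
    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* User-space model only: applies the encoded operation to *word and
   returns nonzero if the comparison on the old value succeeds. */
static int
wake_op_model(int *word, uint32_t val3)
{
    uint32_t op = (val3 >> 28) & 0xf;
    uint32_t cmp = (val3 >> 24) & 0xf;
    uint32_t oparg = (val3 >> 12) & 0xfff;
    int cmparg = (int) (val3 & 0xfff);
    int oldval = *word;
    uint32_t old = (uint32_t) oldval;

    if (op & 8) {                   /* shift flag: use (1 << oparg) */
        oparg = UINT32_C(1) << (oparg & 31);
        op &= 7;
    }
    switch (op) {
    case FUTEX_OP_SET:  *word = (int) oparg;          break;
    case FUTEX_OP_ADD:  *word = (int) (old + oparg);  break;
    case FUTEX_OP_OR:   *word = (int) (old | oparg);  break;
    case FUTEX_OP_ANDN: *word = (int) (old & ~oparg); break;
    case FUTEX_OP_XOR:  *word = (int) (old ^ oparg);  break;
    default:            return 0;   /* unknown op: model does nothing */
    }
    switch (cmp) {
    case FUTEX_OP_CMP_EQ: return oldval == cmparg;
    case FUTEX_OP_CMP_NE: return oldval != cmparg;
    case FUTEX_OP_CMP_LT: return oldval < cmparg;
    case FUTEX_OP_CMP_LE: return oldval <= cmparg;
    case FUTEX_OP_CMP_GT: return oldval > cmparg;
    case FUTEX_OP_CMP_GE: return oldval >= cmparg;
    default:              return 0;
    }
}
```
.fi
.in
.IP
For instance, encoding "add 1, then wake if the old value was greater than 0"
gives a
.I val3
of 0x14001000.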
d67e21f5 512.TP
79c9b436
TG
513.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
514.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
fd9e59d4 515This operation is like
79c9b436
TG
516.BR FUTEX_WAIT
517except that
518.I val3
519is used to provide a 32-bit bitset to the kernel.
520This bitset is stored in the kernel-internal state of the waiter.
521See the description of
522.BR FUTEX_WAKE_BITSET
523for further details.
524
fd9e59d4
MK
525The
526.BR FUTEX_WAIT_BITSET
527also interprets the
528.I timeout
529argument differently from
530.BR FUTEX_WAIT .
531See the discussion of
532.BR FUTEX_CLOCK_REALTIME ,
533above.
534
79c9b436
TG
535The
536.I uaddr2
537argument is ignored.
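.IP
A wait with an absolute
.BR CLOCK_REALTIME
deadline might be sketched as follows.
The helper names are hypothetical, and the sketch assumes that an expired
deadline is reported as
.BR ETIMEDOUT .
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block on uaddr while it holds "expected", until
   woken through a matching bitset or until the absolute CLOCK_REALTIME
   deadline passes.  Returns 0 if woken, otherwise an errno value. */
static int
wait_bitset_until(int *uaddr, int expected, uint32_t bitset,
                  const struct timespec *deadline)
{
    long r = syscall(SYS_futex, uaddr,
                     FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
                     expected, deadline, NULL, bitset);

    return (r == -1) ? errno : 0;
}

/* Builds an absolute deadline "ms" milliseconds from now. */
static struct timespec
deadline_after_ms(long ms)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}
```
.fi
.in
.IP
Passing
.BR FUTEX_BITSET_MATCH_ANY
as the bitset makes the waiter eligible for any wake-up,
so the call then differs from
.BR FUTEX_WAIT
only in how the timeout is interpreted.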
538.TP
d67e21f5
MK
539.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
540.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
55cc422d
TG
541This operation is the same as
542.BR FUTEX_WAKE
543except that the
544.I val3
545argument is used to provide a 32-bit bitset to the kernel.
98d769c0
MK
546This bitset is used to select which waiters should be woken up.
547The selection is done by a bit-wise AND of the "wake" bitset
548(i.e., the value in
549.IR val3 )
550and the bitset which is stored in the kernel-internal
09cb4ce7 551state of the waiter (the "wait" bitset that is set using
98d769c0
MK
552.BR FUTEX_WAIT_BITSET ).
553All of the waiters for which the result of the AND is nonzero are woken up;
554the remaining waiters are left sleeping.
555
e9d4496b
MK
556.\" FIXME please review this paragraph that I added
557The effect of
558.BR FUTEX_WAIT_BITSET
559and
560.BR FUTEX_WAKE_BITSET
561is to allow selective wake-ups among multiple waiters that are waiting
562on the same futex;
563since a futex has a size of 32 bits,
564these operations provide 32 wakeup "channels".
565(The
566.BR FUTEX_WAIT
567and
568.BR FUTEX_WAKE
569operations correspond to
570.BR FUTEX_WAIT_BITSET
571and
572.BR FUTEX_WAKE_BITSET
573operations where the bitsets are all ones.)
09cb4ce7 574Note, however, that using this bitset multiplexing feature on a
e9d4496b
MK
575futex is less efficient than simply using multiple futexes,
576because employing bitset multiplexing requires the kernel
577to check all waiters on a futex,
578including those that are not interested in being woken up
579(i.e., they do not have the relevant bit set in their "wait" bitset).
580.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
581.\"
582.\" "The original reason for the addition of these extensions
583.\" was to improve the performance of pthread read-write locks
584.\" in glibc. However, the pthreads library no longer uses the
585.\" same locking algorithm, and these extensions are not used
586.\" without the bitset parameter being all ones.
587.\"
588.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
589.\" is nevertheless used (with a bitset of all ones) in order to
590.\" obtain the absolute timeout functionality that is useful
591.\" for efficiently implementing Pthreads APIs (which use absolute
592.\" timeouts); FUTEX_WAIT provides only relative timeouts.
593
98d769c0
MK
594The
595.I uaddr2
596and
597.I timeout
598arguments are ignored.
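.IP
The selection rule amounts to a single AND, as the following sketch shows.
The function names are hypothetical;
.I bitset_selects
is a user-space model of the rule, not a kernel interface.
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Model of the selection rule: a waiter is eligible if its "wait"
   bitset shares at least one bit with the "wake" bitset. */
static int
bitset_selects(uint32_t wait_bitset, uint32_t wake_bitset)
{
    return (wait_bitset & wake_bitset) != 0;
}

/* Hypothetical helper: wake at most "count" waiters on uaddr whose
   wait bitset intersects "wake_bitset". */
static long
futex_wake_bitset(int *uaddr, int count, uint32_t wake_bitset)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_BITSET, count,
                   NULL, NULL, wake_bitset);
}
```
.fi
.in
.IP
A waiter that registered bit 2 is therefore skipped by a wake-up that
specifies only bits 0 and 1, and remains asleep.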
bd90a5f9
MK
599.\"
600.\"
601.SS Priority-inheritance futexes
b52e1cd4
MK
602Linux supports priority-inheritance (PI) futexes in order to handle
603priority-inversion problems that can be encountered with
604normal futex locks.
79d918c7
MK
605.\"
606.\" FIXME ===== Start of adapted Hart/Guniguntala text =====
607.\" The following text is drawn from the Hart/Guniguntala paper,
608.\" but I have reworded some pieces significantly. Please check it.
609.\"
610The PI futex operations described below differ from the other
611futex operations in that they impose policy on the use of the futex value:
612.IP * 3
7c16fbff 613If the lock is unowned, the futex value shall be 0.
79d918c7
MK
614.IP *
615If the lock is owned, the futex value shall be the thread ID (TID; see
616.BR gettid (2))
617of the owning thread.
618.IP *
619.\" FIXME In the following line, I added "the lock is owned and". Okay?
620If the lock is owned and there are threads contending for the lock,
621then the
622.B FUTEX_WAITERS
623bit shall be set in the futex value; in other words, the futex value is:
624
625 FUTEX_WAITERS | TID
626.PP
627With this policy in place,
628a user-space application can acquire an unowned
21b060ba 629lock or release an uncontended lock using atomic
79d918c7 630.\" FIXME In the following line, I added "user-space". Okay?
21b060ba 631instructions executed in user-space (e.g.,
b52e1cd4
MK
632.I cmpxchg
633on the x86 architecture).
634Locking an unowned lock simply consists of setting
635the futex value to the caller's TID.
636Releasing an uncontended lock simply requires setting the futex value to 0.
637
638If a futex is currently owned (i.e., has a nonzero value),
639waiters must employ the
79d918c7
MK
640.B FUTEX_LOCK_PI
641operation to acquire the lock.
b52e1cd4 642If a lock is contended (i.e., the
79d918c7 643.B FUTEX_WAITERS
b52e1cd4 644bit is set in the futex value), the lock owner must employ the
79d918c7 645.B FUTEX_UNLOCK_PI
b52e1cd4
MK
646operation to release the lock.
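.PP
The division of labor between user space and the kernel might be sketched
as follows, using C11 atomics in place of hand-written assembly.
All function names are hypothetical, and the sketch assumes that an
.I "_Atomic uint32_t"
has the same size and alignment as a plain four-byte futex.
.in +4n
.nf
```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: the caller's thread ID. */
static uint32_t
self_tid(void)
{
    return (uint32_t) syscall(SYS_gettid);
}

/* Fast path: 0 -> TID.  Returns 1 if the lock was taken in user space. */
static int
pi_try_lock_fast(_Atomic uint32_t *word)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong(word, &expected, self_tid());
}

/* Fast path: TID -> 0.  Fails if FUTEX_WAITERS (or any other bit)
   is set in the futex value. */
static int
pi_try_unlock_fast(_Atomic uint32_t *word)
{
    uint32_t expected = self_tid();

    return atomic_compare_exchange_strong(word, &expected, 0);
}

/* Slow paths: let the kernel arbitrate.  Each returns 0 on success. */
static long
pi_lock(_Atomic uint32_t *word)
{
    if (pi_try_lock_fast(word))
        return 0;
    return syscall(SYS_futex, word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

static long
pi_unlock(_Atomic uint32_t *word)
{
    if (pi_try_unlock_fast(word))
        return 0;
    return syscall(SYS_futex, word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```
.fi
.in
.PP
In the uncontended case neither
.I pi_lock
nor
.I pi_unlock
enters the kernel; the system call is made only when the
compare-and-exchange fails.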
647
79d918c7
MK
648In the cases where callers are forced into the kernel
649(i.e., required to perform a
650.BR futex ()
651operation),
652they then deal directly with a so-called RT-mutex,
653a kernel locking mechanism which implements the required
654priority-inheritance semantics.
655After the RT-mutex is acquired, the futex value is updated accordingly,
656before the calling thread returns to user space.
657.\" FIXME ===== End of adapted Hart/Guniguntala text =====
658
a59fca75
MK
It is important to note
.\" FIXME We need some explanation here of *why* it is important to
.\"       note this
that the kernel will update the futex value prior
to returning to user space.
Unlike the other futex operations described above,
the PI futex operations are designed
for the implementation of very specific IPC mechanisms.
fc57e6bb
MK
667.\"
668.\" FIXME We don't quite have a definition anywhere of what a PI futex
669.\" is (vs a non-PI futex). Below, we have the information of
670.\" FUTEX_CMP_REQUEUE_PI requeues from a non-PI futex to a
671.\" PI futex, but what determines whether the futex is of one
672.\" kind of the other? We should have such a definition somewhere
673.\" about here.
99c0ac69
MK
674.\"
675.\" FIXME In discussing errors for FUTEX_CMP_REQUEUE_PI, Darren Hart
676.\" made the observation that "EINVAL is returned if the non-pi
677.\" to pi or op pairing semantics are violated."
678.\" Probably there needs to be a general statement about this
679.\" requirement, probably located at about this point in the page.
bd90a5f9
MK
680
681PI futexes are operated on by specifying one of the following values in
682.IR futex_op :
d67e21f5
MK
683.TP
684.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
685.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
67833bec
MK
686.\"
687.\" FIXME I did some significant rewording of tglx's text.
688.\" Please check, in case I injected errors.
689.\"
This operation is used after an attempt to acquire
the futex lock via an atomic user-space instruction failed
because the futex has a nonzero value, specifically
because it contained the namespace-specific TID of the lock owner.
67259526 694.\" FIXME In the preceding line, what does "namespace-specific" mean?
67833bec 695.\" (I kept those words from tglx.)
67259526 696.\" That is, what kind of namespace are we talking about?
67833bec
MK
697.\" (I suppose we are talking PID namespaces here, but I want to
698.\" be sure.)
699
700The operation checks the value of the futex at the address
701.IR uaddr .
702If the value is 0, then the kernel tries to atomically set the futex value to
703the caller's TID.
704If that fails,
705.\" FIXME What would be the cause of failure?
706or the futex value is nonzero,
707the kernel atomically sets the
e0547e70 708.B FUTEX_WAITERS
67833bec
MK
709bit, which signals the futex owner that it cannot unlock the futex in
710user space atomically by setting the futex value to 0.
711After that, the kernel tries to find the thread which is
712associated with the owner TID,
713.\" FIXME Could I get a bit more detail on the next two lines?
714.\" What is "creates or reuses kernel state" about?
715creates or reuses kernel state on behalf of the owner
716and attaches the waiter to it.
67259526
MK
717.\" FIXME In the next line, what type of "priority" are we talking about?
718.\" Realtime priorities for SCHED_FIFO and SCHED_RR?
719.\" Or something else?
1f043693 720The enqueueing of the waiter is in descending priority order if more
e0547e70 721than one waiter exists.
67259526 722.\" FIXME What does "bandwidth" refer to in the next line?
e0547e70 723The owner inherits either the priority or the bandwidth of the waiter.
67259526
MK
724.\" FIXME In the preceding line, what determines whether the
725.\" owner inherits the priority versus the bandwidth?
67833bec
MK
726.\"
727.\" FIXME Could I get some help translating the next sentence into
728.\" something that user-space developers (and I) can understand?
729.\" In particular, what are "nexted locks" in this context?
e0547e70
TG
730This inheritance follows the lock chain in the case of
731nested locking and performs deadlock detection.
732
9ce19cf1
MK
733.\" FIXME tglx says "The timeout argument is handled as described in
734.\" FUTEX_WAIT." However, it appears to me that this is not right.
735.\" Is the following formulation correct.
e0547e70
TG
736The
737.I timeout
9ce19cf1
MK
738argument provides a timeout for the lock attempt.
739It is interpreted as an absolute time, measured against the
740.BR CLOCK_REALTIME
741clock.
742If
743.I timeout
744is NULL, the operation will block indefinitely.
e0547e70 745
a449c634 746The
e0547e70
TG
747.IR uaddr2 ,
748.IR val ,
749and
750.IR val3
a449c634 751arguments are ignored.
fedaeaf3 752.\" FIXME
a9dcb4d1 753.\" tglx noted the following "ERROR" case for FUTEX_LOCK_PI and
670b34f8
MK
754.\" FUTEX_TRYLOCK_PI and FUTEX_WAIT_REQUEUE_PI:
755.\"
a9dcb4d1
MK
756.\" > [EOWNERDIED] The owner of the futex died and the kernel made the
757.\" > caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit
758.\" > in the futex userspace value. Caller is responsible for cleanup
fedaeaf3 759.\"
a9dcb4d1 760.\" However, there is no such thing as an EOWNERDIED error. I had a look
fedaeaf3
MK
761.\" through the kernel source for the FUTEX_OWNER_DIED cases and didn't
762.\" see an obvious error associated with them. Can you clarify? (I think
763.\" the point is that this condition, which is described in
764.\" Documentation/robust-futexes.txt, is not an error as such. However,
765.\" I'm not yet sure of how to describe it in the man page.)
670b34f8 766.\" Suggestions please!
67833bec 767.\"
d67e21f5 768.TP
12fdbe23 769.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
d67e21f5 770.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
12fdbe23
MK
771This operation tries to acquire the futex at
772.IR uaddr .
0b761826
MK
773.\" FIXME I think it would be helpful here to say a few more words about
774.\" the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI
fa0388c3 775It deals with the situation where the TID value at
12fdbe23
MK
776.I uaddr
777is 0, but the
b52e1cd4 778.B FUTEX_WAITERS
12fdbe23 779bit is set.
fa0388c3
MK
780.\" FIXME How does the situation in the previous sentence come about?
781.\" Probably it would be helpful to say something about that in
782.\" the man page.
badbf70c 783.\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?
User space cannot handle this condition in a race-free manner.
084744ef
MK
785
786The
787.IR uaddr2 ,
788.IR val ,
789.IR timeout ,
790and
791.IR val3
792arguments are ignored.
d67e21f5 793.TP
12fdbe23 794.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
d67e21f5 795.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
ecae2099
TG
796This operation wakes the top priority waiter which is waiting in
797.B FUTEX_LOCK_PI
798on the futex address provided by the
799.I uaddr
800argument.
801
This operation is used when the user-space value at
.I uaddr
cannot be changed atomically from a TID (of the owner) to 0.
805
806The
807.IR uaddr2 ,
808.IR val ,
809.IR timeout ,
810and
811.IR val3
11a194bf 812arguments are ignored.
d67e21f5 813.TP
d67e21f5
MK
814.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
815.\" commit 52400ba946759af28442dee6265c5c0180ac7122
816.\" FIXME to complete
f812a08b
DH
817This operation is a PI-aware variant of
818.BR FUTEX_CMP_REQUEUE .
819It requeues waiters that are blocked via
820.B FUTEX_WAIT_REQUEUE_PI
821on
822.I uaddr
823from a non-PI source futex
824.RI ( uaddr )
825to a PI target futex
826.RI ( uaddr2 ).
827
9e54d26d
MK
828As with
829.BR FUTEX_CMP_REQUEUE ,
830this operation wakes up a maximum of
831.I val
832waiters that are waiting on the futex at
833.IR uaddr .
834However, for
835.BR FUTEX_CMP_REQUEUE_PI ,
836.I val
6fbeb8f4
MK
is required to be 1
(since the main point is to avoid a thundering herd).
9e54d26d
MK
839The remaining waiters are removed from the wait queue of the source futex at
840.I uaddr
841and added to the wait queue of the target futex at
842.IR uaddr2 .
f812a08b 843
9e54d26d 844The
768d3c23 845.I val2
c6d8cf21
MK
846.\" val2 is the cap on the number of requeued waiters.
847.\" In the glibc pthread_cond_broadcast() implementation, this argument
848.\" is specified as INT_MAX, and for pthread_cond_signal() it is 0.
9e54d26d 849and
768d3c23 850.I val3
9e54d26d
MK
851arguments serve the same purposes as for
852.BR FUTEX_CMP_REQUEUE .
be376673
MK
853.\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/
854.\" notes that "priority-inheritance Futex to priority-inheritance
855.\" Futex requeues are currently unsupported". Do we need to say
856.\" something in the man page about that?
d67e21f5
MK
857.TP
858.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
859.\" commit 52400ba946759af28442dee6265c5c0180ac7122
6ff1b4c0
TG
Wait on a non-PI futex at
861.I uaddr
862and potentially be requeued onto a PI futex at
863.IR uaddr2 .
864The wait operation on
865.I uaddr
866is the same as
867.BR FUTEX_WAIT .
868The waiter can be removed from the wait on
869.I uaddr
870via
871.BR FUTEX_WAKE
872without requeueing on
873.IR uaddr2 .
a4e69912 874
5d67b190
MK
875.\" FIXME Somewhere around here, something needs to be said about
876.\" the pairing semantics of FUTEX_CMP_REQUEUE_PI and
877.\" FUTEX_WAIT_REQUEUE_PI. (The Hart/Guniguntala paer says
878.\" "FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI must be
879.\" paired only with each other." Could someone propose
880.\" a statement about this pairing requirement and why it
881.\" is needed?
882.\"
63bea7dc
MK
883.\" FIXME Please check the following. tglx said "The timeout argument
884.\" is handled as described in FUTEX_WAIT.", but the truth is
885.\" as below, AFAICS
886If
887.I timeout
888is not NULL, it specifies a timeout for the wait operation;
889this timeout is interpreted as outlined above in the description of the
890.BR FUTEX_CLOCK_REALTIME
891option.
892If
893.I timeout
894is NULL, the operation can block indefinitely.
895
a4e69912
MK
896The
897.I val3
898argument is ignored.
899.\" FIXME Re the preceding sentence, actually 'val3' is internally set to
900.\" FUTEX_BITSET_MATCH_ANY before calling futex_wait_requeue_pi().
901.\" I'm not sure we need to say anything about this though.
902.\" Comments?
47297adb 903.SH RETURN VALUE
fea681da 904.PP
6f147f79 905In the event of an error, all operations return \-1 and set
e808bba0 906.I errno
6f147f79 907to indicate the cause of the error.
e808bba0
MK
908The return value on success depends on the operation,
909as described in the following list:
fea681da
MK
910.TP
911.B FUTEX_WAIT
682edefb
MK
912Returns 0 if the process was woken by a
913.B FUTEX_WAKE
7446a837
MK
914or
915.B FUTEX_WAKE_BITSET
682edefb 916call.
fea681da
MK
917.TP
918.B FUTEX_WAKE
919Returns the number of processes woken up.
920.TP
921.B FUTEX_FD
922Returns the new file descriptor associated with the futex.
923.TP
924.B FUTEX_REQUEUE
925Returns the number of processes woken up.
926.TP
927.B FUTEX_CMP_REQUEUE
3dfcc11d
MK
928Returns the total number of processes woken up or requeued to the futex at
929.IR uaddr2 .
930If this value is greater than
931.IR val ,
932then the difference is the number of waiters requeued to the futex at
933.IR uaddr2 .
519f2c3d
MK
934.\"
935.\" FIXME Add success returns for other operations
dcad19c0
MK
936.TP
937.B FUTEX_WAKE_OP
a8b5b324
MK
938.\" FIXME Is the following correct?
939Returns the total number of waiters that were woken up.
940This is the sum of the woken waiters on the two futexes at
941.I uaddr
942and
943.IR uaddr2 .
dcad19c0
MK
944.TP
945.B FUTEX_WAIT_BITSET
7bcc5351
MK
946.\" FIXME Is the following correct?
947Returns 0 if the process was woken by a
948.B FUTEX_WAKE
949or
950.B FUTEX_WAKE_BITSET
951call.
dcad19c0
MK
952.TP
953.B FUTEX_WAKE_BITSET
b884566b
MK
954.\" FIXME Is the following correct?
955Returns the number of processes woken up.
dcad19c0
MK
956.TP
957.B FUTEX_LOCK_PI
bf02a260
MK
958.\" FIXME Is the following correct?
959Returns 0 if the futex was successfully locked.
dcad19c0
MK
960.TP
961.B FUTEX_TRYLOCK_PI
5c716eef
MK
962.\" FIXME Is the following correct?
963Returns 0 if the futex was successfully locked.
dcad19c0
MK
964.TP
965.B FUTEX_UNLOCK_PI
52bb928f
MK
966.\" FIXME Is the following correct?
967Returns 0 if the futex was successfully unlocked.
dcad19c0
MK
968.TP
969.B FUTEX_CMP_REQUEUE_PI
dddd395a
MK
970.\" FIXME Is the following correct?
971Returns the total number of processes woken up or requeued to the futex at
972.IR uaddr2 .
973If this value is greater than
974.IR val ,
975then the difference is the number of waiters requeued to the futex at
976.IR uaddr2 .
dcad19c0
MK
977.TP
978.B FUTEX_WAIT_REQUEUE_PI
22c15de9
MK
979.\" FIXME Is the following correct?
980Returns 0 if the caller was successfully requeued to the futex at
981.IR uaddr2 .
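The return conventions listed above might be consumed as in the following sketch.
The helper names wake_waiters() and wait_while_equal() are hypothetical, and the use of the GCC/Clang __atomic_load_n builtin is an assumption about the toolchain.
A return of 0 from FUTEX_WAIT says only that a wake-up happened, so the loop re-checks the futex word; it also retries on the EAGAIN and EINTR errors listed under ERRORS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake up to 'count' waiters on the futex at uaddr.
   Returns the number of waiters woken, or -1 with errno set. */
static long wake_waiters(uint32_t *uaddr, int count)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, count, NULL, NULL, 0);
}

/* Hypothetical helper: block until the futex word no longer holds
   'expected'.  Returns 0 on success, or -1 with errno set. */
static int wait_while_equal(uint32_t *uaddr, uint32_t expected)
{
    while (__atomic_load_n(uaddr, __ATOMIC_ACQUIRE) == expected) {
        long rc = syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                          NULL, NULL, 0);

        /* rc == 0: woken; loop to re-check the word.
           EAGAIN: the word changed before the kernel could block.
           EINTR: a signal interrupted the wait. */
        if (rc == -1 && errno != EAGAIN && errno != EINTR)
            return -1;
    }
    return 0;
}
```

With no waiters queued, wake_waiters() is expected to return 0, and wait_while_equal() returns immediately when the word already differs from the expected value.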
fea681da
MK
982.SH ERRORS
983.TP
984.B EACCES
985No read access to futex memory.
986.TP
987.B EAGAIN
f48516d1
MK
988.RB ( FUTEX_WAIT ,
989.BR FUTEX_WAIT_REQUEUE_PI )
badbf70c
MK
990The value pointed to by
991.I uaddr
992was not equal to the expected value
993.I val
994at the time of the call.
995.TP
996.B EAGAIN
8f2068bb
MK
997.RB ( FUTEX_CMP_REQUEUE ,
998.BR FUTEX_CMP_REQUEUE_PI )
ce5602fd 999The value pointed to by
9f6c40c0
MK
1000.I uaddr
1001is not equal to the expected value
1002.IR val3 .
fd1dc4c2 1003.\" FIXME: Is the following sentence correct?
fea681da 1004(This probably indicates a race;
682edefb
MK
1005use the safe
1006.B FUTEX_WAKE
1007now.)
c0091dd3
MK
1008.\"
1009.\" FIXME Should there be an EAGAIN case for FUTEX_TRYLOCK_PI?
1010.\" It seems so, looking at the handling of the rt_mutex_trylock()
1011.\" call in futex_lock_pi()
1012.\"
fea681da 1013.TP
5662f56a
MK
1014.BR EAGAIN
1015.RB ( FUTEX_LOCK_PI ,
aaec9032
MK
1016.BR FUTEX_TRYLOCK_PI ,
1017.BR FUTEX_CMP_REQUEUE_PI )
1018The futex owner thread ID of
1019.I uaddr
1020(for
1021.BR FUTEX_CMP_REQUEUE_PI :
1022.IR uaddr2 )
1023is about to exit,
5662f56a
MK
1024but has not yet handled the internal state cleanup.
1025Try again.
61f8c1d1
MK
1026.\"
1027.\" FIXME Is there not also an EAGAIN error case on 'uaddr2' for
1028.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1029.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1030.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EAGAIN?
5662f56a 1031.TP
7a39e745
MK
1032.BR EDEADLK
1033.RB ( FUTEX_LOCK_PI ,
1034.BR FUTEX_TRYLOCK_PI )
1035The futex at
1036.I uaddr
1037is already locked by the caller.
d08ce5dd
MK
1038.\"
1039.\" FIXME Is there not also an EDEADLK error case on 'uaddr2' for
1040.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1041.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1042.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EDEADLK?
7a39e745 1043.TP
662c0da8
MK
1044.BR EDEADLK
1045.\" FIXME I reworded tglx's text somewhat; is the following okay?
1046.RB ( FUTEX_CMP_REQUEUE_PI )
1047While requeueing a waiter to the PI futex at
1048.IR uaddr2 ,
1049the kernel detected a deadlock.
1050.TP
fea681da 1051.B EFAULT
1ea901e8
MK
1052A required pointer argument (i.e.,
1053.IR uaddr ,
1054.IR uaddr2 ,
1055or
1056.IR timeout )
496df304 1057did not point to a valid user-space address.
fea681da 1058.TP
9f6c40c0 1059.B EINTR
e808bba0 1060A
9f6c40c0 1061.B FUTEX_WAIT
2674f781
MK
1062or
1063.B FUTEX_WAIT_BITSET
e808bba0
MK
1064operation was interrupted by a signal (see
1065.BR signal (7))
1066or a spurious wakeup.
5eeca856
MK
1067.\" FIXME
1068.\" Regarding the words "spurious wakeup" above, I received this
1069.\" bug report from Rich Felker:
1070.\"
1071.\" I see no code in the kernel whereby a "spurious wakeup", or anything
1072.\" other than interruption by a signal handler that's not SA_RESTART,
1073.\" can cause futex to fail with EINTR. In general, overloading of EINTR
1074.\" and/or spurious EINTRs from a syscall make it impossible to use that
1075.\" syscall for implementing any function where EINTR is a mandatory
1076.\" failure on interruption-by-signal, since there is no way for
1077.\" userspace to distinguish whether the EINTR occurred as a result of
1078.\" an interrupting signal or some other reason. The kernel folks have
1079.\" gone to great lengths to fix spurious EINTRs (see signal(7) for
1080.\" history), especially by non-interrupting signal handlers, including
1081.\" in futex, and allowing EINTR here would be contrary to that goal.
1082.\"
1083.\" It's my belief that the "or a spurious wakeup" text should simply be
1084.\" removed.
1085.\"
1086.\" The reason I'm raising this topic is its relevance to a thread on
1087.\" libc-alpha:
1088.\" [RFC] mutex destruction (#13690): problem description and workarounds
1089.\"
1090.\" The bug and mailing list discussions to which Rich refers are:
1091.\" https://sourceware.org/bugzilla/show_bug.cgi?id=13690
1092.\" https://sourceware.org/ml/libc-alpha/2014-12/threads.html#0001
1093.\"
1094.\" Can anyone comment on whether the words "spurious wakeup" are correct?
1095.\"
9f6c40c0 1096.TP
fea681da 1097.B EINVAL
180f97b7
MK
1098The operation in
1099.IR futex_op
1100is one of those that employs a timeout, but the supplied
fb2f4c27
MK
1101.I timeout
1102argument was invalid
1103.RI ( tv_sec
1104was less than zero, or
1105.IR tv_nsec
1106was not less than 1,000,000,000).
1107.TP
1108.B EINVAL
0c74df0b 1109The operation specified in
025e1374 1110.IR futex_op
0c74df0b 1111employs one or both of the pointers
51ee94be 1112.I uaddr
a1f47699 1113and
0c74df0b
MK
1114.IR uaddr2 ,
1115but one of these does not point to a valid object\(emthat is,
1116the address is not four-byte-aligned.
51ee94be
MK
1117.TP
1118.B EINVAL
55cc422d
TG
1119.RB ( FUTEX_WAIT_BITSET ,
1120.BR FUTEX_WAKE_BITSET )
79c9b436
TG
1121The bitset supplied in
1122.IR val3
1123is zero.
1124.TP
1125.B EINVAL
2043f2c1
MK
1126.RB ( FUTEX_REQUEUE ,
1127.\" FIXME tglx suggested adding this, but does this error really occur for
1128.\" FUTEX_REQUEUE? (The case where it occurs for FUTEX_CMP_REQUEUE_PI
1129.\" is obvious at the start of futex_requeue().)
1130.BR FUTEX_CMP_REQUEUE_PI )
add875c0
MK
1131.I uaddr
1132equals
1133.IR uaddr2
1134(i.e., an attempt was made to requeue to the same futex).
1135.TP
ff597681
MK
1136.BR EINVAL
1137.RB ( FUTEX_FD )
1138The signal number supplied in
1139.I val
1140is invalid.
1141.TP
6bac3b85 1142.B EINVAL
476debd7
MK
1143.RB ( FUTEX_WAKE ,
1144.BR FUTEX_WAKE_OP ,
1145.BR FUTEX_WAKE_BITSET ,
1146.BR FUTEX_REQUEUE ,
1147.BR FUTEX_CMP_REQUEUE )
1148The kernel detected an inconsistency between the user-space state at
1149.I uaddr
1150and the kernel state\(emthat is, it detected a waiter which waits in
1151.BR FUTEX_LOCK_PI
1152on
1153.IR uaddr .
1154.TP
1155.B EINVAL
a218ef20 1156.RB ( FUTEX_LOCK_PI ,
ce022f18
MK
1157.BR FUTEX_TRYLOCK_PI ,
1158.BR FUTEX_UNLOCK_PI )
a218ef20
MK
1159The kernel detected an inconsistency between the user-space state at
1160.I uaddr
1161and the kernel state.
ce022f18
MK
1162This indicates either state corruption
1163.\" FIXME tglx did not mention the "state corruption" for FUTEX_UNLOCK_PI.
1164.\" Does that case also apply for FUTEX_UNLOCK_PI?
1165or that the kernel found a waiter on
a218ef20
MK
1166.I uaddr
1167which is waiting via
1168.BR FUTEX_WAIT
1169or
1170.BR FUTEX_WAIT_BITSET .
1171.TP
1172.B EINVAL
f9250b1a
MK
1173.RB ( FUTEX_CMP_REQUEUE_PI )
1174The kernel detected an inconsistency between the user-space state at
99c0041d
MK
1175.I uaddr2
1176and the kernel state;
1177that is, the kernel detected a waiter which waits via
1178.BR FUTEX_WAIT
1179.\" FIXME tglx did not mention FUTEX_WAIT_BITSET here,
1180.\" but should that not also be included here?
1181on
1182.IR uaddr2 .
1183.TP
1184.B EINVAL
1185.RB ( FUTEX_CMP_REQUEUE_PI )
1186The kernel detected an inconsistency between the user-space state at
f9250b1a
MK
1187.I uaddr
1188and the kernel state;
1189that is, the kernel detected a waiter which waits via
75299c8d 1190.BR FUTEX_WAIT
99c0041d 1191or
75299c8d 1192.BR FUTEX_WAIT_BITSET
f9250b1a
MK
1193on
1194.IR uaddr .
1195.TP
1196.B EINVAL
99c0041d 1197.RB ( FUTEX_CMP_REQUEUE_PI )
75299c8d
MK
1198The kernel detected an inconsistency between the user-space state at
1199.I uaddr
1200and the kernel state;
1201that is, the kernel detected a waiter which waits on
1202.I uaddr
1203via
1204.BR FUTEX_LOCK_PI
1205(instead of
1206.BR FUTEX_WAIT_REQUEUE_PI ).
99c0041d
MK
1207.TP
1208.B EINVAL
9786b3ca
MK
1209.RB ( FUTEX_CMP_REQUEUE_PI )
1210.\" FIXME This is a reworded version of Darren Hart's text.
1211.\" Please check that I did not introduce any errors.
1212An attempt was made to requeue a waiter to a futex other than that
1213specified by the matching
1214.B FUTEX_WAIT_REQUEUE_PI
1215call for that waiter.
1216.TP
1217.B EINVAL
f0c0d61c
MK
1218.RB ( FUTEX_CMP_REQUEUE_PI )
1219The
1220.I val
1221argument is not 1.
1222.TP
1223.B EINVAL
4832b48a 1224Invalid argument.
fea681da 1225.TP
a449c634
MK
1226.BR ENOMEM
1227.RB ( FUTEX_LOCK_PI ,
e34a8fb6
MK
1228.BR FUTEX_TRYLOCK_PI ,
1229.BR FUTEX_CMP_REQUEUE_PI )
a449c634
MK
1230The kernel could not allocate memory to hold state information.
1231.TP
fea681da 1232.B ENFILE
ff597681 1233.RB ( FUTEX_FD )
fea681da 1234The system limit on the total number of open files has been reached.
4701fc28
MK
1235.TP
1236.B ENOSYS
1237Invalid operation specified in
d33602c4 1238.IR futex_op .
9f6c40c0 1239.TP
4a7e5b05
MK
1240.B ENOSYS
1241The
1242.BR FUTEX_CLOCK_REALTIME
1243option was specified in
1afcee7c 1244.IR futex_op ,
4a7e5b05
MK
1245but the accompanying operation was neither
1246.BR FUTEX_WAIT_BITSET
1247nor
1248.BR FUTEX_WAIT_REQUEUE_PI .
1249.TP
a9dcb4d1
MK
1250.BR ENOSYS
1251.RB ( FUTEX_LOCK_PI ,
f2424fae 1252.BR FUTEX_TRYLOCK_PI ,
4945ff19 1253.BR FUTEX_UNLOCK_PI ,
794bb106
MK
1254.BR FUTEX_CMP_REQUEUE_PI ,
1255.BR FUTEX_WAIT_REQUEUE_PI )
a9dcb4d1 1256A run-time check determined that the operation is not available.
a2ebebcd
MK
1257The PI futex operations are not implemented on all architectures and
1258are not supported on some CPU variants.
a9dcb4d1 1259.TP
c7589177
MK
1260.BR EPERM
1261.RB ( FUTEX_LOCK_PI ,
dc2742a8
MK
1262.BR FUTEX_TRYLOCK_PI ,
1263.BR FUTEX_CMP_REQUEUE_PI )
04331c3f 1264The caller is not allowed to attach itself to the futex at
dc2742a8
MK
1265.I uaddr
1266(for
1267.BR FUTEX_CMP_REQUEUE_PI :
1268the futex at
1269.IR uaddr2 ).
c7589177 1270(This may be caused by a state corruption in user space.)
61f8c1d1
MK
1271.\"
1272.\" FIXME Is there not also an EPERM error case on 'uaddr2' for
1273.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1274.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1275.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> EPERM?
c7589177 1276.TP
76f347ba 1277.BR EPERM
87276709 1278.RB ( FUTEX_UNLOCK_PI )
76f347ba
MK
1279The caller does not own the futex.
1280.TP
0b0e4934
MK
1281.BR ESRCH
1282.RB ( FUTEX_LOCK_PI ,
1283.BR FUTEX_TRYLOCK_PI )
1284.\" FIXME I reworded the following sentence a bit differently from
1285.\" tglx's formulation. Is it okay?
1286The thread ID in the futex at
1287.I uaddr
1288does not exist.
61f8c1d1
MK
1289.\"
1290.\" FIXME Is there not also an ESRCH error case on 'uaddr2' for
1291.\" FUTEX_REQUEUE and FUTEX_CMP_REQUEUE via
1292.\" futex_requeue() ==> futex_proxy_trylock_atomic() ==>
1293.\" futex_lock_pi_atomic() ==> attach_to_pi_owner() ==> ESRCH?
0b0e4934 1294.TP
360f773c
MK
1295.BR ESRCH
1296.RB ( FUTEX_CMP_REQUEUE_PI )
1297.\" FIXME I reworded the following sentence a bit differently from
1298.\" tglx's formulation. Is it okay?
1299The thread ID in the futex at
1300.I uaddr2
1301does not exist.
1302.TP
9f6c40c0 1303.B ETIMEDOUT
4d85047f
MK
1304The operation in
1305.IR futex_op
1306employed the timeout specified in
1307.IR timeout ,
1308and the timeout expired before the operation completed.
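Several of the error cases above can be provoked from a single thread, which may help when checking how a particular kernel behaves.
The following is a minimal sketch; futex_errno() and the probe_*() functions are hypothetical helpers, and each one returns the errno value of a failed call, or 0 on success.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: issue the call and report errno on failure. */
static int futex_errno(void *uaddr, int futex_op, uint32_t val,
                       const struct timespec *timeout, uint32_t val3)
{
    long rc = syscall(SYS_futex, uaddr, futex_op, val, timeout, NULL, val3);
    return rc == -1 ? errno : 0;
}

/* EAGAIN: the futex word (1) differs from the expected value (0). */
static int probe_eagain(void)
{
    uint32_t word = 1;
    return futex_errno(&word, FUTEX_WAIT, 0, NULL, 0);
}

/* ETIMEDOUT: the word matches, nobody wakes us, 1 ms relative timeout. */
static int probe_etimedout(void)
{
    uint32_t word = 0;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };
    return futex_errno(&word, FUTEX_WAIT, 0, &ts, 0);
}

/* EINVAL: tv_nsec is not less than 1,000,000,000. */
static int probe_bad_timeout(void)
{
    uint32_t word = 0;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000000 };
    return futex_errno(&word, FUTEX_WAIT, 0, &ts, 0);
}

/* EINVAL: the address is not four-byte-aligned. */
static int probe_unaligned(void)
{
    static uint32_t words[2];
    return futex_errno((char *)words + 1, FUTEX_WAKE, 1, NULL, 0);
}

/* EINVAL: the bitset in val3 is zero. */
static int probe_zero_bitset(void)
{
    uint32_t word = 0;
    return futex_errno(&word, FUTEX_WAKE_BITSET, 1, NULL, 0);
}
```

Each probe uses only local or static storage and blocks for at most about a millisecond, so the functions can be called in any order.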
47297adb 1309.SH VERSIONS
a1d5f77c 1310.PP
81c9d87e
MK
1311Futexes were first made available in a stable kernel release
1312with Linux 2.6.0.
1313
a1d5f77c
MK
1314Initial futex support was merged in Linux 2.5.7 but with different semantics
1315from what was described above.
52dee70e 1316A four-argument system call with the semantics
fd3fa7ef 1317described in this page was introduced in Linux 2.5.40.
11b520ed 1318In Linux 2.5.70, one argument
a1d5f77c 1319was added.
11b520ed 1320In Linux 2.6.7, a sixth argument was added\(emmessy, especially
a1d5f77c 1321on the s390 architecture.
47297adb 1322.SH CONFORMING TO
8382f16d 1323This system call is Linux-specific.
47297adb 1324.SH NOTES
fea681da 1325.PP
fcdad7d6 1326To reiterate, bare futexes are not intended as an easy-to-use abstraction
c13182ef 1327for end-users.
fcdad7d6 1328(There is no wrapper function for this system call in glibc.)
c13182ef 1329Implementors are expected to be assembly literate and to have
7fac88a9 1330read the sources of the futex user-space library referenced below.
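Because glibc offers no wrapper, callers usually reach the system call through syscall(2).
A minimal sketch of such a wrapper follows; the name futex_call() is hypothetical, and the parameter types simply mirror the SYNOPSIS.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper; argument order follows the SYNOPSIS.
   Returns the operation's result, or -1 with errno set. */
static long futex_call(int *uaddr, int futex_op, int val,
                       const struct timespec *timeout,
                       int *uaddr2, int val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}
```

For example, futex_call(&word, FUTEX_WAKE, 1, NULL, NULL, 0) asks the kernel to wake at most one waiter on word and returns how many were woken.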
d282bb24 1331.\" .SH AUTHORS
fea681da
MK
1332.\" .PP
1333.\" Futexes were designed and worked on by
1334.\" Hubertus Franke (IBM Thomas J. Watson Research Center),
1335.\" Matthew Kirkwood, Ingo Molnar (Red Hat)
1336.\" and Rusty Russell (IBM Linux Technology Center).
1337.\" This page written by bert hubert.
47297adb 1338.SH SEE ALSO
9913033c 1339.BR get_robust_list (2),
d806bc05 1340.BR restart_syscall (2),
14d8dd3b 1341.BR futex (7)
fea681da 1342.PP
f5ad572f
MK
1343The following kernel source files:
1344.IP * 2
1345.I Documentation/pi-futex.txt
1346.IP *
1347.I Documentation/futex-requeue-pi.txt
1348.IP *
1349.I Documentation/locking/rt-mutex.txt
1350.IP *
1351.I Documentation/locking/rt-mutex-design.txt
43b99089 1352.PP
52087dd3 1353\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
9b936e9e
MK
1354(proceedings of the Ottawa Linux Symposium 2002), online at
1355.br
608bf950
SK
1356.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002-pages-479-495.pdf
1357.UE
f42eb21b 1358
2ed26199
MK
1359\fIA futex overview and update\fP, 11 November 2009
1360.UR http://lwn.net/Articles/360699/
1361.UE
1362
0483b6cc
MK
1363\fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
1364(2009 Real-Time Linux Workshop)
1365.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
1366.UE
1367
f42eb21b
MK
1368\fIFutexes Are Tricky\fP (updated in 2011), Ulrich Drepper
1369.UR http://www.akkadia.org/drepper/futex.pdf
1370.UE
9b936e9e
MK
1371.PP
1372Futex example library, futex-*.tar.bz2 at
1373.br
a605264d 1374.UR ftp://ftp.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
608bf950 1375.UE
34f14794
MK
1376.\"
1377.\" FIXME Are there any other resources that should be listed
1378.\" in the SEE ALSO section?