futex(2) -- Linux man page
NAME
futex - Fast Userspace Locking system call
DESCRIPTION
The sys_futex system call provides a method for a program to wait for a value at a given address to change, and a method to wake up anyone waiting on a particular address. The addresses for the same memory in separate processes may not be equal, but the kernel maps them internally, so the same memory mapped at different locations will correspond for sys_futex calls. It is typically used to implement the contended case of a lock in shared memory, as described in futex(4).
When a futex(4) operation cannot complete uncontended in userspace, a call must be made to the kernel to arbitrate. Arbitration means either putting the calling process to sleep or, conversely, waking a waiting process.
Callers of this function are expected to adhere to the semantics set out in futex(4). Because these semantics involve writing non-portable assembly instructions, most users will in fact be library authors rather than general application developers.
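As an illustration of the userspace fast path and the kernel slow path, here is a minimal sketch of a lock shared between two processes. It rests on several assumptions that are not part of this page: it uses the GCC/Clang `__atomic` builtins in place of hand-written assembly, it reaches the kernel through syscall(2) with SYS_futex, and it uses a three-state lock word (0, 1, 2) in the style of Ulrich Drepper's "Futexes Are Tricky" rather than the counter semantics of futex(4). All function and struct names are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: lock word plus the data it protects. */
struct shared { int lock; long counter; };

static void futex_wait(int *addr, int expected)
{
    /* Sleeps only if *addr still equals expected. */
    syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

static void futex_wake_one(int *addr)
{
    syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* Lock word: 0 = free, 1 = held, 2 = held and possibly contended. */
static void lock_acquire(int *m)
{
    int c = 0;
    /* Fast path: 0 -> 1 entirely in userspace, no system call. */
    if (__atomic_compare_exchange_n(m, &c, 1, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return;
    /* Slow path: mark the lock contended, then let the kernel arbitrate. */
    if (c != 2)
        c = __atomic_exchange_n(m, 2, __ATOMIC_ACQUIRE);
    while (c != 0) {
        futex_wait(m, 2);
        c = __atomic_exchange_n(m, 2, __ATOMIC_ACQUIRE);
    }
}

static void lock_release(int *m)
{
    /* Previous value 1 means nobody waited; otherwise wake one waiter. */
    if (__atomic_fetch_sub(m, 1, __ATOMIC_RELEASE) != 1) {
        __atomic_store_n(m, 0, __ATOMIC_RELEASE);
        futex_wake_one(m);
    }
}

/* Parent and child each add iters to a shared counter under the lock.
 * Returns the final count as seen by the parent, or -1 on setup failure. */
static long count_with_two_processes(int iters)
{
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    s->lock = 0;
    s->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, sizeof *s);
        return -1;
    }
    for (int i = 0; i < iters; i++) {   /* runs in both processes */
        lock_acquire(&s->lock);
        s->counter++;
        lock_release(&s->lock);
    }
    if (pid == 0)
        _exit(0);                       /* child is done */

    waitpid(pid, NULL, 0);
    long total = s->counter;
    munmap(s, sizeof *s);
    return total;
}
```

If the lock works as intended, count_with_two_processes(n) should return 2*n. The kernel is entered only when the compare-and-swap on the fast path fails.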
The futex argument needs to point to an aligned integer which stores the counter. The operation to execute is passed via the op parameter, along with a value val.
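The calling convention can be sketched as follows. This assumes a Linux system where the C library provides no futex() wrapper, so the call is made through syscall(2) with SYS_futex; the helper names futex_call and wake_with_no_waiters are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper mirroring the arguments described above:
 * addr points to the aligned integer, op selects the operation,
 * val is the operation's value, timeout may be NULL. */
static long futex_call(int *addr, int op, int val,
                       const struct timespec *timeout)
{
    /* The two trailing arguments are unused by FUTEX_WAIT and FUTEX_WAKE. */
    return syscall(SYS_futex, addr, op, val, timeout, NULL, 0);
}

/* Waking a word that nobody is waiting on reports zero processes woken. */
static long wake_with_no_waiters(void)
{
    int word = 0;   /* a naturally aligned int serves as the futex */
    return futex_call(&word, FUTEX_WAKE, 1, NULL);
}
```

A local variable is enough here because no other process needs to see the word; for cross-process use it would live in shared memory.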
Three operations are currently defined:
- FUTEX_WAIT: This operation atomically verifies that the futex address still contains the value val, and sleeps awaiting FUTEX_WAKE on this futex address. If the timeout argument is non-NULL, its contents describe the maximum duration of the wait, which is infinite otherwise. For futex(4), this call is executed if decrementing the count gave a negative value (indicating contention), and will sleep until another process releases the futex and executes the FUTEX_WAKE operation.
- FUTEX_WAKE: This operation wakes at most val processes waiting on this futex address (i.e., inside FUTEX_WAIT). For futex(4), this is executed if incrementing the count showed that there were waiters, once the futex value has been set to 1 (indicating that it is available).
- FUTEX_FD: To support asynchronous wakeups, this operation associates a file descriptor with a futex. If another process executes a FUTEX_WAKE, the process will receive the signal number that was passed in val. The calling process must close the returned file descriptor after use. To prevent race conditions, the caller should test whether the futex has been upped after FUTEX_FD returns.
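The wait and wake operations above can be sketched with two processes sharing one mapped word. This is an illustration under stated assumptions, not a reference implementation: it assumes syscall(2) with SYS_futex, an anonymous MAP_SHARED mapping, and the GCC/Clang `__atomic` builtins; futex_call and wait_for_child are hypothetical names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static long futex_call(int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* The parent sleeps until the child changes the shared word from 0 to 1.
 * Returns the value the parent finally observed, or -1 on setup failure. */
static int wait_for_child(void)
{
    int *word = mmap(NULL, sizeof *word, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (word == MAP_FAILED)
        return -1;
    *word = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(word, sizeof *word);
        return -1;
    }
    if (pid == 0) {
        /* Child: change the value first, then wake at most one waiter. */
        __atomic_store_n(word, 1, __ATOMIC_SEQ_CST);
        futex_call(word, FUTEX_WAKE, 1);
        _exit(0);
    }

    /* Parent: re-check the value around every wait. If the child already
     * stored 1, FUTEX_WAIT refuses to sleep because the word is not 0. */
    while (__atomic_load_n(word, __ATOMIC_SEQ_CST) == 0) {
        long rc = futex_call(word, FUTEX_WAIT, 0);
        if (rc == -1 && errno != EAGAIN && errno != EINTR)
            break;                      /* unexpected error: give up */
    }

    int seen = __atomic_load_n(word, __ATOMIC_SEQ_CST);
    waitpid(pid, NULL, 0);
    munmap(word, sizeof *word);
    return seen;
}
```

The loop matters because the atomic value check inside FUTEX_WAIT is what closes the window between testing the word and going to sleep; a wake that arrives early is not lost.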
RETURN VALUE
Depending on which operation was executed, the returned value can have differing meanings.
- FUTEX_WAIT: Returns 0 if the process was woken by a FUTEX_WAKE call. In case of timeout, ETIMEDOUT is returned. If the futex was not equal to the expected value, the operation returns EWOULDBLOCK. Signals (or other spurious wakeups) cause FUTEX_WAIT to return EINTR.
- FUTEX_WAKE: Returns the number of processes woken up.
- FUTEX_FD: Returns the new file descriptor associated with the futex.
ERRORS
- EFAULT: Error in getting timeout information from userspace.
- EINVAL: An operation was not defined or error in page alignment.
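A short sketch of how a caller might observe two of these results. It assumes the call is made through syscall(2), in which case a failure is reported as a return value of -1 with the code in errno; on Linux, EWOULDBLOCK and EAGAIN are the same value. The helper name wait_errno is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait on addr for at most timeout_ms milliseconds,
 * expecting the word to hold `expected`. Returns 0 if woken by
 * FUTEX_WAKE, otherwise the errno value reported for the call. */
static int wait_errno(int *addr, int expected, long timeout_ms)
{
    struct timespec ts = {
        .tv_sec  = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };
    long rc = syscall(SYS_futex, addr, FUTEX_WAIT, expected, &ts, NULL, 0);
    return rc == 0 ? 0 : errno;
}
```

With a word that holds 0 and that no one wakes, wait_errno(&word, 1, 10) should report EWOULDBLOCK immediately, since the value does not match, while wait_errno(&word, 0, 10) should sleep for about 10 ms and then report ETIMEDOUT.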
NOTES
To reiterate, bare futexes are not intended as an easy-to-use abstraction for end users. Implementors are expected to be assembly literate and to have read the sources of the futex userspace library referenced below.
AUTHORS
Futexes were designed and worked on by Hubertus Franke (IBM Thomas J. Watson Research Center), Matthew Kirkwood, Ingo Molnar (Red Hat) and Rusty Russell (IBM Linux Technology Center). This page was written by bert hubert.
SEE ALSO
futex(4), `Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux' (Proceedings of the Ottawa Linux Symposium 2002), futex example library, futex-*.tar.bz2 <URL:ftp://ftp.nl.kernel.org:/pub/linux/kernel/people/rusty/>.