From: Alejandro Colomar Date: Sat, 19 Jul 2025 22:32:04 +0000 (+0200) Subject: man/man2/fcntl{,_locking}.2: Split locking operations from fcntl(2) X-Git-Tag: man-pages-6.15~2^2~2 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=92582bafdeae6c965aaf53dc14edeac509566c66;p=thirdparty%2Fman-pages.git man/man2/fcntl{,_locking}.2: Split locking operations from fcntl(2) Signed-off-by: Alejandro Colomar --- diff --git a/man/man2/fcntl.2 b/man/man2/fcntl.2 index b81f92c0b..d3e9064c0 100644 --- a/man/man2/fcntl.2 +++ b/man/man2/fcntl.2 @@ -61,470 +61,19 @@ indicating that the kernel does not recognize this value. .TQ .BR F_SETFL (2const) .SS Advisory record locking -Linux implements traditional ("process-associated") UNIX record locks, -as standardized by POSIX. -For a Linux-specific alternative with better semantics, -see the discussion of open file description locks below. -.P -.BR F_SETLK , -.BR F_SETLKW , -and -.B F_GETLK -are used to acquire, release, and test for the existence of record -locks (also known as byte-range, file-segment, or file-region locks). -The third argument, -.IR lock , -is a pointer to a structure that has at least the following fields -(in unspecified order). -.P -.in +4n -.EX -struct flock { - ... - short l_type; /* Type of lock: F_RDLCK, - F_WRLCK, F_UNLCK */ - short l_whence; /* How to interpret l_start: - SEEK_SET, SEEK_CUR, SEEK_END */ - off_t l_start; /* Starting offset for lock */ - off_t l_len; /* Number of bytes to lock */ - pid_t l_pid; /* PID of process blocking our lock - (set by F_GETLK and F_OFD_GETLK) */ - ... -}; -.EE -.in -.P -The -.IR l_whence ", " l_start ", and " l_len -fields of this structure specify the range of bytes we wish to lock. -Bytes past the end of the file may be locked, -but not bytes before the start of the file. 
-.P -.I l_start -is the starting offset for the lock, and is interpreted -relative to either: -the start of the file (if -.I l_whence -is -.BR SEEK_SET ); -the current file offset (if -.I l_whence -is -.BR SEEK_CUR ); -or the end of the file (if -.I l_whence -is -.BR SEEK_END ). -In the final two cases, -.I l_start -can be a negative number provided the -offset does not lie before the start of the file. -.P -.I l_len -specifies the number of bytes to be locked. -If -.I l_len -is positive, then the range to be locked covers bytes -.I l_start -up to and including -.IR l_start + l_len \-1. -Specifying 0 for -.I l_len -has the special meaning: lock all bytes starting at the -location specified by -.IR l_whence " and " l_start -through to the end of file, no matter how large the file grows. -.P -POSIX.1-2001 allows (but does not require) -an implementation to support a negative -.I l_len -value; if -.I l_len -is negative, the interval described by -.I lock -covers bytes -.IR l_start + l_len -up to and including -.IR l_start \-1. -This is supported since Linux 2.4.21 and Linux 2.5.49. -.P -The -.I l_type -field can be used to place a read -.RB ( F_RDLCK ) -or a write -.RB ( F_WRLCK ) -lock on a file. -Any number of processes may hold a read lock (shared lock) -on a file region, but only one process may hold a write lock -(exclusive lock). -An exclusive lock excludes all other locks, -both shared and exclusive. -A single process can hold only one type of lock on a file region; -if a new lock is applied to an already-locked region, -then the existing lock is converted to the new lock type. -(Such conversions may involve splitting, shrinking, or coalescing with -an existing lock if the byte range specified by the new lock does not -precisely coincide with the range of the existing lock.) 
-.TP -.BR F_SETLK \~(\f[I]struct\~flock\~*\f[]) -Acquire a lock (when -.I l_type -is -.B F_RDLCK -or -.BR F_WRLCK ) -or release a lock (when -.I l_type -is -.BR F_UNLCK ) -on the bytes specified by the -.IR l_whence ", " l_start ", and " l_len -fields of -.IR lock . -If a conflicting lock is held by another process, -this call returns \-1 and sets -.I errno -to -.B EACCES -or -.BR EAGAIN . -(The error returned in this case differs across implementations, -so POSIX requires a portable application to check for both errors.) .TP -.BR F_SETLKW \~(\f[I]struct\~flock\~*\f[]) -As for -.BR F_SETLK , -but if a conflicting lock is held on the file, then wait for that -lock to be released. -If a signal is caught while waiting, then the call is interrupted -and (after the signal handler has returned) -returns immediately (with return value \-1 and -.I errno -set to -.BR EINTR ; -see -.BR signal (7)). -.TP -.BR F_GETLK \~(\f[I]struct\~flock\~*\f[]) -On input to this call, -.I lock -describes a lock we would like to place on the file. -If the lock could be placed, -.BR fcntl () -does not actually place it, but returns -.B F_UNLCK -in the -.I l_type -field of -.I lock -and leaves the other fields of the structure unchanged. -.IP -If one or more incompatible locks would prevent -this lock being placed, then -.BR fcntl () -returns details about one of those locks in the -.IR l_type ", " l_whence ", " l_start ", and " l_len -fields of -.IR lock . -If the conflicting lock is a traditional (process-associated) record lock, -then the -.I l_pid -field is set to the PID of the process holding that lock. -If the conflicting lock is an open file description lock, then -.I l_pid -is set to \-1. -Note that the returned information -may already be out of date by the time the caller inspects it. -.P -In order to place a read lock, -.I fd -must be open for reading. -In order to place a write lock, -.I fd -must be open for writing. -To place both types of lock, open a file read-write. 
-.P -When placing locks with -.BR F_SETLKW , -the kernel detects -.IR deadlocks , -whereby two or more processes have their -lock requests mutually blocked by locks held by the other processes. -For example, suppose process A holds a write lock on byte 100 of a file, -and process B holds a write lock on byte 200. -If each process then attempts to lock the byte already -locked by the other process using -.BR F_SETLKW , -then, without deadlock detection, -both processes would remain blocked indefinitely. -When the kernel detects such deadlocks, -it causes one of the blocking lock requests to immediately fail with the error -.BR EDEADLK ; -an application that encounters such an error should release -some of its locks to allow other applications to proceed before -attempting regain the locks that it requires. -Circular deadlocks involving more than two processes are also detected. -Note, however, that there are limitations to the kernel's -deadlock-detection algorithm; see BUGS. -.P -As well as being removed by an explicit -.BR F_UNLCK , -record locks are automatically released when the process terminates. -.P -Record locks are not inherited by a child created via -.BR fork (2), -but are preserved across an -.BR execve (2). -.P -Because of the buffering performed by the -.BR stdio (3) -library, the use of record locking with routines in that package -should be avoided; use -.BR read (2) -and -.BR write (2) -instead. -.P -The record locks described above are associated with the process -(unlike the open file description locks described below). -This has some unfortunate consequences: -.IP \[bu] 3 -If a process closes -.I any -file descriptor referring to a file, -then all of the process's locks on that file are released, -regardless of the file descriptor(s) on which the locks were obtained. -.\" (Additional file descriptors referring to the same file -.\" may have been obtained by calls to -.\" .BR open "(2), " dup "(2), " dup2 "(2), or " fcntl ().) 
-This is bad: it means that a process can lose its locks on -a file such as -.I /etc/passwd -or -.I /etc/mtab -when for some reason a library function decides to open, read, -and close the same file. -.IP \[bu] -The threads in a process share locks. -In other words, -a multithreaded program can't use record locking to ensure -that threads don't simultaneously access the same region of a file. -.P -Open file description locks solve both of these problems. +.BR F_SETLK (2const) +.TQ +.BR F_SETLKW (2const) +.TQ +.BR F_GETLK (2const) .SS Open file description locks (non-POSIX) -Open file description locks are advisory byte-range locks whose operation is -in most respects identical to the traditional record locks described above. -This lock type is Linux-specific, -and available since Linux 3.15. -(There is a proposal with the Austin Group -.\" FIXME . Review progress into POSIX -.\" http://austingroupbugs.net/view.php?id=768 -to include this lock type in the next revision of POSIX.1.) -For an explanation of open file descriptions, see -.BR open (2). -.P -The principal difference between the two lock types -is that whereas traditional record locks -are associated with a process, -open file description locks are associated with the -open file description on which they are acquired, -much like locks acquired with -.BR flock (2). -Consequently (and unlike traditional advisory record locks), -open file description locks are inherited across -.BR fork (2) -(and -.BR clone (2) -with -.BR CLONE_FILES ), -and are only automatically released on the last close -of the open file description, -instead of being released on any close of the file. -.P -Conflicting lock combinations -(i.e., a read lock and a write lock or two write locks) -where one lock is an open file description lock and the other -is a traditional record lock conflict -even when they are acquired by the same process on the same file descriptor. 
-.P -Open file description locks placed via the same open file description -(i.e., via the same file descriptor, -or via a duplicate of the file descriptor created by -.BR fork (2), -.BR dup (2), -.BR F_DUPFD (2const), -and so on) are always compatible: -if a new lock is placed on an already locked region, -then the existing lock is converted to the new lock type. -(Such conversions may result in splitting, shrinking, or coalescing with -an existing lock as discussed above.) -.P -On the other hand, open file description locks may conflict with -each other when they are acquired via different open file descriptions. -Thus, the threads in a multithreaded program can use -open file description locks to synchronize access to a file region -by having each thread perform its own -.BR open (2) -on the file and applying locks via the resulting file descriptor. -.P -As with traditional advisory locks, the third argument to -.BR fcntl (), -.IR lock , -is a pointer to an -.I flock -structure. -By contrast with traditional record locks, the -.I l_pid -field of that structure must be set to zero -when using the operations described below. -.P -The operations for working with open file description locks are analogous -to those used with traditional locks: -.TP -.BR F_OFD_SETLK \~(\f[I]struct\~flock\~*\f[]) -Acquire an open file description lock (when -.I l_type -is -.B F_RDLCK -or -.BR F_WRLCK ) -or release an open file description lock (when -.I l_type -is -.BR F_UNLCK ) -on the bytes specified by the -.IR l_whence ", " l_start ", and " l_len -fields of -.IR lock . -If a conflicting lock is held by another process, -this call returns \-1 and sets -.I errno -to -.BR EAGAIN . -.TP -.BR F_OFD_SETLKW \~(\f[I]struct\~flock\~*\f[]) -As for -.BR F_OFD_SETLK , -but if a conflicting lock is held on the file, then wait for that lock to be -released. 
-If a signal is caught while waiting, then the call is interrupted -and (after the signal handler has returned) returns immediately -(with return value \-1 and -.I errno -set to -.BR EINTR ; -see -.BR signal (7)). .TP -.BR F_OFD_GETLK \~(\f[I]struct\~flock\~*\f[]) -On input to this call, -.I lock -describes an open file description lock we would like to place on the file. -If the lock could be placed, -.BR fcntl () -does not actually place it, but returns -.B F_UNLCK -in the -.I l_type -field of -.I lock -and leaves the other fields of the structure unchanged. -If one or more incompatible locks would prevent this lock being placed, -then details about one of these locks are returned via -.IR lock , -as described above for -.BR F_GETLK . -.P -In the current implementation, -.\" commit 57b65325fe34ec4c917bc4e555144b4a94d9e1f7 -no deadlock detection is performed for open file description locks. -(This contrasts with process-associated record locks, -for which the kernel does perform deadlock detection.) -.\" -.SS Mandatory locking -.IR Warning : -the Linux implementation of mandatory locking is unreliable. -See BUGS below. -Because of these bugs, -and the fact that the feature is believed to be little used, -since Linux 4.5, mandatory locking has been made an optional feature, -governed by a configuration option -.RB ( CONFIG_MANDATORY_FILE_LOCKING ). -This feature is no longer supported at all in Linux 5.15 and above. -.P -By default, both traditional (process-associated) and open file description -record locks are advisory. -Advisory locks are not enforced and are useful only between -cooperating processes. -.P -Both lock types can also be mandatory. -Mandatory locks are enforced for all processes. -If a process tries to perform an incompatible access (e.g., -.BR read (2) -or -.BR write (2)) -on a file region that has an incompatible mandatory lock, -then the result depends upon whether the -.B O_NONBLOCK -flag is enabled for its open file description. 
-If the -.B O_NONBLOCK -flag is not enabled, then -the system call is blocked until the lock is removed -or converted to a mode that is compatible with the access. -If the -.B O_NONBLOCK -flag is enabled, then the system call fails with the error -.BR EAGAIN . -.P -To make use of mandatory locks, mandatory locking must be enabled -both on the filesystem that contains the file to be locked, -and on the file itself. -Mandatory locking is enabled on a filesystem -using the "\-o mand" option to -.BR mount (8), -or the -.B MS_MANDLOCK -flag for -.BR mount (2). -Mandatory locking is enabled on a file by disabling -group execute permission on the file and enabling the set-group-ID -permission bit (see -.BR chmod (1) -and -.BR chmod (2)). -.P -Mandatory locking is not specified by POSIX. -Some other systems also support mandatory locking, -although the details of how to enable it vary across systems. -.\" -.SS Lost locks -When an advisory lock is obtained on a networked filesystem such as -NFS it is possible that the lock might get lost. -This may happen due to administrative action on the server, or due to a -network partition (i.e., loss of network connectivity with the server) -which lasts long enough for the server to assume -that the client is no longer functioning. -.P -When the filesystem determines that a lock has been lost, future -.BR read (2) -or -.BR write (2) -requests may fail with the error -.BR EIO . -This error will persist until the lock is removed or the file -descriptor is closed. -Since Linux 3.12, -.\" commit ef1820f9be27b6ad158f433ab38002ab8131db4d -this happens at least for NFSv4 (including all minor versions). -.P -Some versions of UNIX send a signal -.RB ( SIGLOST ) -in this circumstance. -Linux does not define this signal, and does not provide any -asynchronous notification of lost locks. 
-.\" +.BR F_OFD_SETLK (2const) +.TQ +.BR F_OFD_SETLKW (2const) +.TQ +.BR F_OFD_GETLK (2const) .SS Managing signals .TP .BR F_GETOWN (2const) @@ -584,65 +133,10 @@ another process. .I fd is not an open file descriptor .TP -.B EBADF -.I op -is -.B F_SETLK -or -.B F_SETLKW -and the file descriptor open mode doesn't match with the -type of lock requested. -.TP -.B EDEADLK -It was detected that the specified -.B F_SETLKW -operation would cause a deadlock. -.TP -.B EFAULT -.I lock -is outside your accessible address space. -.TP -.B EINTR -.I op -is -.B F_SETLKW -or -.B F_OFD_SETLKW -and the operation was interrupted by a signal; see -.BR signal (7). -.TP -.B EINTR -.I op -is -.BR F_GETLK , -.BR F_SETLK , -.BR F_OFD_GETLK , -or -.BR F_OFD_SETLK , -and the operation was interrupted by a signal before the lock was checked or -acquired. -Most likely when locking a remote file (e.g., locking over -NFS), but can sometimes happen locally. -.TP .B EINVAL The value specified in .I op is not recognized by this kernel. -.TP -.B EINVAL -.I op -is -.BR F_OFD_SETLK , -.BR F_OFD_SETLKW , -or -.BR F_OFD_GETLK , -and -.I l_pid -was not specified as zero. -.TP -.B ENOLCK -Too many segment locks open, lock table is full, or a remote locking -protocol failed (e.g., locking over NFS). .SH VERSIONS POSIX.1-2024 specifies .B FD_CLOFORK @@ -653,173 +147,8 @@ but Linux doesn't support them. POSIX.1-2008. .\" .P .\" SVr4 documents additional EIO, ENOLINK and EOVERFLOW error conditions. -.P -.BR F_OFD_SETLK , -.BR F_OFD_SETLKW , -and -.B F_OFD_GETLK -are Linux-specific (and one must define -.B _GNU_SOURCE -to obtain their definitions), -but work is being done to have them included in the next version of POSIX.1. .SH HISTORY SVr4, 4.3BSD, POSIX.1-2001. -.P -Only the operations -.BR F_GETLK , -.BR F_SETLK , -and -.B F_SETLKW -are specified in POSIX.1-2001. 
-.SH NOTES -.SS File locking -The original Linux -.BR fcntl () -system call was not designed to handle large file offsets -(in the -.I flock -structure). -Consequently, an -.BR fcntl64 () -system call was added in Linux 2.4. -The newer system call employs a different structure for file locking, -.IR flock64 , -and corresponding operations, -.BR F_GETLK64 , -.BR F_SETLK64 , -and -.BR F_SETLKW64 . -However, these details can be ignored by applications using glibc, whose -.BR fcntl () -wrapper function transparently employs the more recent system call -where it is available. -.\" -.SS Record locks -Since Linux 2.0, there is no interaction between the types of lock -placed by -.BR flock (2) -and -.BR fcntl (). -.P -Several systems have more fields in -.I "struct flock" -such as, for example, -.I l_sysid -(to identify the machine where the lock is held). -.\" e.g., Solaris 8 documents this field in fcntl(2), and Irix 6.5 -.\" documents it in fcntl(5). mtk, May 2007 -.\" Also, FreeBSD documents it (Apr 2014). -Clearly, -.I l_pid -alone is not going to be very useful if the process holding the lock -may live on a different machine; -on Linux, while present on some architectures (such as MIPS32), -this field is not used. -.P -The original Linux -.BR fcntl () -system call was not designed to handle large file offsets -(in the -.I flock -structure). -Consequently, an -.BR fcntl64 () -system call was added in Linux 2.4. -The newer system call employs a different structure for file locking, -.IR flock64 , -and corresponding operations, -.BR F_GETLK64 , -.BR F_SETLK64 , -and -.BR F_SETLKW64 . -However, these details can be ignored by applications using glibc, whose -.BR fcntl () -wrapper function transparently employs the more recent system call -where it is available. 
-.SS Record locking and NFS -Before Linux 3.12, if an NFSv4 client -loses contact with the server for a period of time -(defined as more than 90 seconds with no communication), -.\" -.\" Neil Brown: With NFSv3 the failure mode is the reverse. If -.\" the server loses contact with a client then any lock stays in place -.\" indefinitely ("why can't I read my mail"... I remember it well). -.\" -it might lose and regain a lock without ever being aware of the fact. -(The period of time after which contact is assumed lost is known as -the NFSv4 leasetime. -On a Linux NFS server, this can be determined by looking at -.IR /proc/fs/nfsd/nfsv4leasetime , -which expresses the period in seconds. -The default value for this file is 90.) -.\" -.\" Jeff Layton: -.\" Note that this is not a firm timeout. The server runs a job -.\" periodically to clean out expired stateful objects, and it's likely -.\" that there is some time (maybe even up to another whole lease period) -.\" between when the timeout expires and the job actually runs. If the -.\" client gets a RENEW in there within that window, its lease will be -.\" renewed and its state preserved. -.\" -This scenario potentially risks data corruption, -since another process might acquire a lock in the intervening period -and perform file I/O. -.P -Since Linux 3.12, -.\" commit ef1820f9be27b6ad158f433ab38002ab8131db4d -if an NFSv4 client loses contact with the server, -any I/O to the file by a process which "thinks" it holds -a lock will fail until that process closes and reopens the file. -A kernel parameter, -.IR nfs.recover_lost_locks , -can be set to 1 to obtain the pre-3.12 behavior, -whereby the client will attempt to recover lost locks -when contact is reestablished with the server. -Because of the attendant risk of data corruption, -.\" commit f6de7a39c181dfb8a2c534661a53c73afb3081cd -this parameter defaults to 0 (disabled). 
-.SH BUGS -.SS Deadlock detection -The deadlock-detection algorithm employed by the kernel when dealing with -.B F_SETLKW -requests can yield both -false negatives (failures to detect deadlocks, -leaving a set of deadlocked processes blocked indefinitely) -and false positives -.RB ( EDEADLK -errors when there is no deadlock). -For example, -the kernel limits the lock depth of its dependency search to 10 steps, -meaning that circular deadlock chains that exceed -that size will not be detected. -In addition, the kernel may falsely indicate a deadlock -when two or more processes created using the -.BR clone (2) -.B CLONE_FILES -flag place locks that appear (to the kernel) to conflict. -.\" -.SS Mandatory locking -The Linux implementation of mandatory locking -is subject to race conditions which render it unreliable: -.\" http://marc.info/?l=linux-kernel&m=119013491707153&w=2 -.\" -.\" Reconfirmed by Jeff Layton -.\" From: Jeff Layton redhat.com> -.\" Subject: Re: Status of fcntl() mandatory locking -.\" Newsgroups: gmane.linux.file-systems -.\" Date: 2014-04-28 10:07:57 GMT -.\" http://thread.gmane.org/gmane.linux.file-systems/84481/focus=84518 -a -.BR write (2) -call that overlaps with a lock may modify data after the mandatory lock is -acquired; -a -.BR read (2) -call that overlaps with a lock may detect changes to data that were made -only after a write lock was acquired. -Similar races exist between mandatory locks and -.BR mmap (2). -It is therefore inadvisable to rely on mandatory locking. .SH SEE ALSO .BR dup2 (2), .BR flock (2), @@ -829,16 +158,3 @@ It is therefore inadvisable to rely on mandatory locking. 
.BR capabilities (7), .BR feature_test_macros (7), .BR lslocks (8) -.P -.IR locks.txt , -.IR mandatory\-locking.txt , -and -.I dnotify.txt -in the Linux kernel source directory -.I Documentation/filesystems/ -(on older kernels, these files are directly under the -.I Documentation/ -directory, and -.I mandatory\-locking.txt -is called -.IR mandatory.txt ) diff --git a/man/man2/fcntl_locking.2 b/man/man2/fcntl_locking.2 new file mode 100644 index 000000000..098f72e50 --- /dev/null +++ b/man/man2/fcntl_locking.2 @@ -0,0 +1,747 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH fcntl_locking 2 (date) "Linux man-pages (unreleased)" +.SH NAME +F_GETLK, +F_SETLK, +F_SETLKW, +F_OFD_GETLK, +F_OFD_SETLK, +F_OFD_SETLKW +\- +locking +.SH LIBRARY +Standard C library +.RI ( libc ,\~ \-lc ) +.SH SYNOPSIS +.nf +.B #include <fcntl.h> +.P +.BI "int fcntl(int " fd ", F_GETLK, struct flock *" lock ); +.BI "int fcntl(int " fd ", F_SETLK, const struct flock *" lock ); +.BI "int fcntl(int " fd ", F_SETLKW, const struct flock *" lock ); +.P +.BI "int fcntl(int " fd ", F_OFD_GETLK, struct flock *" lock ); +.BI "int fcntl(int " fd ", F_OFD_SETLK, const struct flock *" lock ); +.BI "int fcntl(int " fd ", F_OFD_SETLKW, const struct flock *" lock ); +.fi +.SH DESCRIPTION +.SS Advisory record locking +Linux implements traditional ("process-associated") UNIX record locks, +as standardized by POSIX. +For a Linux-specific alternative with better semantics, +see the discussion of open file description locks below. +.P +.BR F_SETLK , +.BR F_SETLKW , +and +.B F_GETLK +are used to acquire, release, and test for the existence of record +locks (also known as byte-range, file-segment, or file-region locks). +The third argument, +.IR lock , +is a pointer to a structure that has at least the following fields +(in unspecified order). +.P +.in +4n +.EX +struct flock { + ...
+ short l_type; /* Type of lock: F_RDLCK, + F_WRLCK, F_UNLCK */ + short l_whence; /* How to interpret l_start: + SEEK_SET, SEEK_CUR, SEEK_END */ + off_t l_start; /* Starting offset for lock */ + off_t l_len; /* Number of bytes to lock */ + pid_t l_pid; /* PID of process blocking our lock + (set by F_GETLK and F_OFD_GETLK) */ + ... +}; +.EE +.in +.P +The +.IR l_whence ", " l_start ", and " l_len +fields of this structure specify the range of bytes we wish to lock. +Bytes past the end of the file may be locked, +but not bytes before the start of the file. +.P +.I l_start +is the starting offset for the lock, and is interpreted +relative to either: +the start of the file (if +.I l_whence +is +.BR SEEK_SET ); +the current file offset (if +.I l_whence +is +.BR SEEK_CUR ); +or the end of the file (if +.I l_whence +is +.BR SEEK_END ). +In the final two cases, +.I l_start +can be a negative number provided the +offset does not lie before the start of the file. +.P +.I l_len +specifies the number of bytes to be locked. +If +.I l_len +is positive, then the range to be locked covers bytes +.I l_start +up to and including +.IR l_start + l_len \-1. +Specifying 0 for +.I l_len +has the special meaning: lock all bytes starting at the +location specified by +.IR l_whence " and " l_start +through to the end of file, no matter how large the file grows. +.P +POSIX.1-2001 allows (but does not require) +an implementation to support a negative +.I l_len +value; if +.I l_len +is negative, the interval described by +.I lock +covers bytes +.IR l_start + l_len +up to and including +.IR l_start \-1. +This is supported since Linux 2.4.21 and Linux 2.5.49. +.P +The +.I l_type +field can be used to place a read +.RB ( F_RDLCK ) +or a write +.RB ( F_WRLCK ) +lock on a file. +Any number of processes may hold a read lock (shared lock) +on a file region, but only one process may hold a write lock +(exclusive lock). +An exclusive lock excludes all other locks, +both shared and exclusive. 
+A single process can hold only one type of lock on a file region; +if a new lock is applied to an already-locked region, +then the existing lock is converted to the new lock type. +(Such conversions may involve splitting, shrinking, or coalescing with +an existing lock if the byte range specified by the new lock does not +precisely coincide with the range of the existing lock.) +.TP +.BR F_SETLK \~(\f[I]struct\~flock\~*\f[]) +Acquire a lock (when +.I l_type +is +.B F_RDLCK +or +.BR F_WRLCK ) +or release a lock (when +.I l_type +is +.BR F_UNLCK ) +on the bytes specified by the +.IR l_whence ", " l_start ", and " l_len +fields of +.IR lock . +If a conflicting lock is held by another process, +this call returns \-1 and sets +.I errno +to +.B EACCES +or +.BR EAGAIN . +(The error returned in this case differs across implementations, +so POSIX requires a portable application to check for both errors.) +.TP +.BR F_SETLKW \~(\f[I]struct\~flock\~*\f[]) +As for +.BR F_SETLK , +but if a conflicting lock is held on the file, then wait for that +lock to be released. +If a signal is caught while waiting, then the call is interrupted +and (after the signal handler has returned) +returns immediately (with return value \-1 and +.I errno +set to +.BR EINTR ; +see +.BR signal (7)). +.TP +.BR F_GETLK \~(\f[I]struct\~flock\~*\f[]) +On input to this call, +.I lock +describes a lock we would like to place on the file. +If the lock could be placed, +.BR fcntl () +does not actually place it, but returns +.B F_UNLCK +in the +.I l_type +field of +.I lock +and leaves the other fields of the structure unchanged. +.IP +If one or more incompatible locks would prevent +this lock being placed, then +.BR fcntl () +returns details about one of those locks in the +.IR l_type ", " l_whence ", " l_start ", and " l_len +fields of +.IR lock . +If the conflicting lock is a traditional (process-associated) record lock, +then the +.I l_pid +field is set to the PID of the process holding that lock. 
+If the conflicting lock is an open file description lock, then +.I l_pid +is set to \-1. +Note that the returned information +may already be out of date by the time the caller inspects it. +.P +In order to place a read lock, +.I fd +must be open for reading. +In order to place a write lock, +.I fd +must be open for writing. +To place both types of lock, open a file read-write. +.P +When placing locks with +.BR F_SETLKW , +the kernel detects +.IR deadlocks , +whereby two or more processes have their +lock requests mutually blocked by locks held by the other processes. +For example, suppose process A holds a write lock on byte 100 of a file, +and process B holds a write lock on byte 200. +If each process then attempts to lock the byte already +locked by the other process using +.BR F_SETLKW , +then, without deadlock detection, +both processes would remain blocked indefinitely. +When the kernel detects such deadlocks, +it causes one of the blocking lock requests to immediately fail with the error +.BR EDEADLK ; +an application that encounters such an error should release +some of its locks to allow other applications to proceed before +attempting to regain the locks that it requires. +Circular deadlocks involving more than two processes are also detected. +Note, however, that there are limitations to the kernel's +deadlock-detection algorithm; see BUGS. +.P +As well as being removed by an explicit +.BR F_UNLCK , +record locks are automatically released when the process terminates. +.P +Record locks are not inherited by a child created via +.BR fork (2), +but are preserved across an +.BR execve (2). +.P +Because of the buffering performed by the +.BR stdio (3) +library, the use of record locking with routines in that package +should be avoided; use +.BR read (2) +and +.BR write (2) +instead. +.P +The record locks described above are associated with the process +(unlike the open file description locks described below).
+This has some unfortunate consequences: +.IP \[bu] 3 +If a process closes +.I any +file descriptor referring to a file, +then all of the process's locks on that file are released, +regardless of the file descriptor(s) on which the locks were obtained. +.\" (Additional file descriptors referring to the same file +.\" may have been obtained by calls to +.\" .BR open "(2), " dup "(2), " dup2 "(2), or " fcntl ().) +This is bad: it means that a process can lose its locks on +a file such as +.I /etc/passwd +or +.I /etc/mtab +when for some reason a library function decides to open, read, +and close the same file. +.IP \[bu] +The threads in a process share locks. +In other words, +a multithreaded program can't use record locking to ensure +that threads don't simultaneously access the same region of a file. +.P +Open file description locks solve both of these problems. +.SS Open file description locks (non-POSIX) +Open file description locks are advisory byte-range locks whose operation is +in most respects identical to the traditional record locks described above. +This lock type is Linux-specific, +and available since Linux 3.15. +(There is a proposal with the Austin Group +.\" FIXME . Review progress into POSIX +.\" http://austingroupbugs.net/view.php?id=768 +to include this lock type in the next revision of POSIX.1.) +For an explanation of open file descriptions, see +.BR open (2). +.P +The principal difference between the two lock types +is that whereas traditional record locks +are associated with a process, +open file description locks are associated with the +open file description on which they are acquired, +much like locks acquired with +.BR flock (2). +Consequently (and unlike traditional advisory record locks), +open file description locks are inherited across +.BR fork (2) +(and +.BR clone (2) +with +.BR CLONE_FILES ), +and are only automatically released on the last close +of the open file description, +instead of being released on any close of the file. 
+.P +Conflicting lock combinations +(i.e., a read lock and a write lock or two write locks) +where one lock is an open file description lock and the other +is a traditional record lock conflict +even when they are acquired by the same process on the same file descriptor. +.P +Open file description locks placed via the same open file description +(i.e., via the same file descriptor, +or via a duplicate of the file descriptor created by +.BR fork (2), +.BR dup (2), +.BR F_DUPFD (2const), +and so on) are always compatible: +if a new lock is placed on an already locked region, +then the existing lock is converted to the new lock type. +(Such conversions may result in splitting, shrinking, or coalescing with +an existing lock as discussed above.) +.P +On the other hand, open file description locks may conflict with +each other when they are acquired via different open file descriptions. +Thus, the threads in a multithreaded program can use +open file description locks to synchronize access to a file region +by having each thread perform its own +.BR open (2) +on the file and applying locks via the resulting file descriptor. +.P +As with traditional advisory locks, the third argument to +.BR fcntl (), +.IR lock , +is a pointer to an +.I flock +structure. +By contrast with traditional record locks, the +.I l_pid +field of that structure must be set to zero +when using the operations described below. +.P +The operations for working with open file description locks are analogous +to those used with traditional locks: +.TP +.BR F_OFD_SETLK \~(\f[I]struct\~flock\~*\f[]) +Acquire an open file description lock (when +.I l_type +is +.B F_RDLCK +or +.BR F_WRLCK ) +or release an open file description lock (when +.I l_type +is +.BR F_UNLCK ) +on the bytes specified by the +.IR l_whence ", " l_start ", and " l_len +fields of +.IR lock . +If a conflicting lock is held by another process, +this call returns \-1 and sets +.I errno +to +.BR EAGAIN . 
+.TP +.BR F_OFD_SETLKW \~(\f[I]struct\~flock\~*\f[]) +As for +.BR F_OFD_SETLK , +but if a conflicting lock is held on the file, then wait for that lock to be +released. +If a signal is caught while waiting, then the call is interrupted +and (after the signal handler has returned) returns immediately +(with return value \-1 and +.I errno +set to +.BR EINTR ; +see +.BR signal (7)). +.TP +.BR F_OFD_GETLK \~(\f[I]struct\~flock\~*\f[]) +On input to this call, +.I lock +describes an open file description lock we would like to place on the file. +If the lock could be placed, +.BR fcntl () +does not actually place it, but returns +.B F_UNLCK +in the +.I l_type +field of +.I lock +and leaves the other fields of the structure unchanged. +If one or more incompatible locks would prevent this lock being placed, +then details about one of these locks are returned via +.IR lock , +as described above for +.BR F_GETLK . +.P +In the current implementation, +.\" commit 57b65325fe34ec4c917bc4e555144b4a94d9e1f7 +no deadlock detection is performed for open file description locks. +(This contrasts with process-associated record locks, +for which the kernel does perform deadlock detection.) +.\" +.SS Mandatory locking +.IR Warning : +the Linux implementation of mandatory locking is unreliable. +See BUGS below. +Because of these bugs, +and the fact that the feature is believed to be little used, +since Linux 4.5, mandatory locking has been made an optional feature, +governed by a configuration option +.RB ( CONFIG_MANDATORY_FILE_LOCKING ). +This feature is no longer supported at all in Linux 5.15 and above. +.P +By default, both traditional (process-associated) and open file description +record locks are advisory. +Advisory locks are not enforced and are useful only between +cooperating processes. +.P +Both lock types can also be mandatory. +Mandatory locks are enforced for all processes. 
+If a process tries to perform an incompatible access (e.g., +.BR read (2) +or +.BR write (2)) +on a file region that has an incompatible mandatory lock, +then the result depends upon whether the +.B O_NONBLOCK +flag is enabled for its open file description. +If the +.B O_NONBLOCK +flag is not enabled, then +the system call is blocked until the lock is removed +or converted to a mode that is compatible with the access. +If the +.B O_NONBLOCK +flag is enabled, then the system call fails with the error +.BR EAGAIN . +.P +To make use of mandatory locks, mandatory locking must be enabled +both on the filesystem that contains the file to be locked, +and on the file itself. +Mandatory locking is enabled on a filesystem +using the "\-o mand" option to +.BR mount (8), +or the +.B MS_MANDLOCK +flag for +.BR mount (2). +Mandatory locking is enabled on a file by disabling +group execute permission on the file and enabling the set-group-ID +permission bit (see +.BR chmod (1) +and +.BR chmod (2)). +.P +Mandatory locking is not specified by POSIX. +Some other systems also support mandatory locking, +although the details of how to enable it vary across systems. +.\" +.SS Lost locks +When an advisory lock is obtained on a networked filesystem such as +NFS it is possible that the lock might get lost. +This may happen due to administrative action on the server, or due to a +network partition (i.e., loss of network connectivity with the server) +which lasts long enough for the server to assume +that the client is no longer functioning. +.P +When the filesystem determines that a lock has been lost, future +.BR read (2) +or +.BR write (2) +requests may fail with the error +.BR EIO . +This error will persist until the lock is removed or the file +descriptor is closed. +Since Linux 3.12, +.\" commit ef1820f9be27b6ad158f433ab38002ab8131db4d +this happens at least for NFSv4 (including all minor versions). +.P +Some versions of UNIX send a signal +.RB ( SIGLOST ) +in this circumstance. 
+Linux does not define this signal, and does not provide any +asynchronous notification of lost locks. +.SH RETURN VALUE +Zero. +.P +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +See +.BR fcntl (2). +.TP +.B EBADF +.I op +is +.B F_SETLK +or +.B F_SETLKW +and the file descriptor open mode doesn't match with the +type of lock requested. +.TP +.B EDEADLK +It was detected that the specified +.B F_SETLKW +operation would cause a deadlock. +.TP +.B EFAULT +.I lock +is outside your accessible address space. +.TP +.B EINTR +.I op +is +.B F_SETLKW +or +.B F_OFD_SETLKW +and the operation was interrupted by a signal; see +.BR signal (7). +.TP +.B EINTR +.I op +is +.BR F_GETLK , +.BR F_SETLK , +.BR F_OFD_GETLK , +or +.BR F_OFD_SETLK , +and the operation was interrupted by a signal before the lock was checked or +acquired. +Most likely when locking a remote file (e.g., locking over +NFS), but can sometimes happen locally. +.TP +.B EINVAL +.I op +is +.BR F_OFD_SETLK , +.BR F_OFD_SETLKW , +or +.BR F_OFD_GETLK , +and +.I l_pid +was not specified as zero. +.TP +.B ENOLCK +Too many segment locks open, lock table is full, or a remote locking +protocol failed (e.g., locking over NFS). +.SH STANDARDS +POSIX.1-2008. +.\" .P +.\" SVr4 documents additional EIO, ENOLINK and EOVERFLOW error conditions. +.P +.BR F_OFD_SETLK , +.BR F_OFD_SETLKW , +and +.B F_OFD_GETLK +are Linux-specific (and one must define +.B _GNU_SOURCE +to obtain their definitions), +but work is being done to have them included in the next version of POSIX.1. +.SH HISTORY +SVr4, 4.3BSD, POSIX.1-2001. +.P +Only the operations +.BR F_GETLK , +.BR F_SETLK , +and +.B F_SETLKW +are specified in POSIX.1-2001. +.SH NOTES +.SS File locking +The original Linux +.BR fcntl () +system call was not designed to handle large file offsets +(in the +.I flock +structure). +Consequently, an +.BR fcntl64 () +system call was added in Linux 2.4. 
+The newer system call employs a different structure for file locking, +.IR flock64 , +and corresponding operations, +.BR F_GETLK64 , +.BR F_SETLK64 , +and +.BR F_SETLKW64 . +However, these details can be ignored by applications using glibc, whose +.BR fcntl () +wrapper function transparently employs the more recent system call +where it is available. +.\" +.SS Record locks +Since Linux 2.0, there is no interaction between the types of lock +placed by +.BR flock (2) +and +.BR fcntl (). +.P +Several systems have more fields in +.I "struct flock" +such as, for example, +.I l_sysid +(to identify the machine where the lock is held). +.\" e.g., Solaris 8 documents this field in fcntl(2), and Irix 6.5 +.\" documents it in fcntl(5). mtk, May 2007 +.\" Also, FreeBSD documents it (Apr 2014). +Clearly, +.I l_pid +alone is not going to be very useful if the process holding the lock +may live on a different machine; +on Linux, while present on some architectures (such as MIPS32), +this field is not used. +.SS Record locking and NFS +Before Linux 3.12, if an NFSv4 client +loses contact with the server for a period of time +(defined as more than 90 seconds with no communication), +.\" +.\" Neil Brown: With NFSv3 the failure mode is the reverse. If +.\" the server loses contact with a client then any lock stays in place +.\" indefinitely ("why can't I read my mail"... I remember it well).
+.\" +it might lose and regain a lock without ever being aware of the fact. +(The period of time after which contact is assumed lost is known as +the NFSv4 leasetime. +On a Linux NFS server, this can be determined by looking at +.IR /proc/fs/nfsd/nfsv4leasetime , +which expresses the period in seconds. +The default value for this file is 90.) +.\" +.\" Jeff Layton: +.\" Note that this is not a firm timeout. The server runs a job +.\" periodically to clean out expired stateful objects, and it's likely +.\" that there is some time (maybe even up to another whole lease period) +.\" between when the timeout expires and the job actually runs. If the +.\" client gets a RENEW in there within that window, its lease will be +.\" renewed and its state preserved. +.\" +This scenario potentially risks data corruption, +since another process might acquire a lock in the intervening period +and perform file I/O. +.P +Since Linux 3.12, +.\" commit ef1820f9be27b6ad158f433ab38002ab8131db4d +if an NFSv4 client loses contact with the server, +any I/O to the file by a process which "thinks" it holds +a lock will fail until that process closes and reopens the file. +A kernel parameter, +.IR nfs.recover_lost_locks , +can be set to 1 to obtain the pre-3.12 behavior, +whereby the client will attempt to recover lost locks +when contact is reestablished with the server. +Because of the attendant risk of data corruption, +.\" commit f6de7a39c181dfb8a2c534661a53c73afb3081cd +this parameter defaults to 0 (disabled). +.SH BUGS +.SS Deadlock detection +The deadlock-detection algorithm employed by the kernel when dealing with +.B F_SETLKW +requests can yield both +false negatives (failures to detect deadlocks, +leaving a set of deadlocked processes blocked indefinitely) +and false positives +.RB ( EDEADLK +errors when there is no deadlock). 
+For example, +the kernel limits the lock depth of its dependency search to 10 steps, +meaning that circular deadlock chains that exceed +that size will not be detected. +In addition, the kernel may falsely indicate a deadlock +when two or more processes created using the +.BR clone (2) +.B CLONE_FILES +flag place locks that appear (to the kernel) to conflict. +.\" +.SS Mandatory locking +The Linux implementation of mandatory locking +is subject to race conditions which render it unreliable: +.\" http://marc.info/?l=linux-kernel&m=119013491707153&w=2 +.\" +.\" Reconfirmed by Jeff Layton +.\" From: Jeff Layton redhat.com> +.\" Subject: Re: Status of fcntl() mandatory locking +.\" Newsgroups: gmane.linux.file-systems +.\" Date: 2014-04-28 10:07:57 GMT +.\" http://thread.gmane.org/gmane.linux.file-systems/84481/focus=84518 +a +.BR write (2) +call that overlaps with a lock may modify data after the mandatory lock is +acquired; +a +.BR read (2) +call that overlaps with a lock may detect changes to data that were made +only after a write lock was acquired. +Similar races exist between mandatory locks and +.BR mmap (2). +It is therefore inadvisable to rely on mandatory locking. +.SH SEE ALSO +.BR fcntl (2), +.BR flock (2), +.BR lockf (3), +.BR lslocks (8) +.P +.IR locks.txt , +.IR mandatory\-locking.txt , +and +.I dnotify.txt +in the Linux kernel source directory +.I Documentation/filesystems/ +(on older kernels, these files are directly under the +.I Documentation/ +directory, and +.I mandatory\-locking.txt +is called +.IR mandatory.txt )
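The open file description lock semantics documented in the page above can be illustrated with a minimal sketch in C. It assumes Linux 3.15 or later and a C library that exposes the F_OFD_* operations when _GNU_SOURCE is defined. The helper names ofd_lock() and ofd_conflict_demo(), and the scratch path under /tmp, are hypothetical and exist only for this illustration. The sketch opens the same file twice in one process and checks that a write lock placed via the first open file description makes a non-blocking request via the second fail with EAGAIN, as the page describes for locks acquired via different open file descriptions.

```c
#define _GNU_SOURCE             /* needed to obtain the F_OFD_* definitions */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: place (F_RDLCK, F_WRLCK) or release (F_UNLCK) an
 * open file description lock on len bytes starting at offset start,
 * without blocking.  Returns 0 on success, or -1 with errno set
 * (EAGAIN if a conflicting lock is held).
 */
int
ofd_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl)); /* l_pid must be zero for F_OFD_* operations */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    return fcntl(fd, F_OFD_SETLK, &fl);
}

/*
 * Hypothetical demonstration: two open() calls on one file yield two open
 * file descriptions, so their write locks on the same byte conflict even
 * within a single process.  Returns 0 if the observed behavior matches
 * that description, or -1 otherwise.
 */
int
ofd_conflict_demo(void)
{
    char path[] = "/tmp/ofd_demo_XXXXXX";   /* assumed scratch location */
    int fd1, fd2, ok;

    fd1 = mkstemp(path);        /* first open file description (read-write) */
    if (fd1 == -1)
        return -1;

    fd2 = open(path, O_RDWR);   /* second, independent open file description */
    if (fd2 == -1) {
        close(fd1);
        unlink(path);
        return -1;
    }

    ok = ofd_lock(fd1, F_WRLCK, 0, 1) == 0      /* fd1 locks byte 0 */
         && ofd_lock(fd2, F_WRLCK, 0, 1) == -1  /* fd2 is refused... */
         && errno == EAGAIN                     /* ...with EAGAIN */
         && ofd_lock(fd1, F_UNLCK, 0, 1) == 0   /* fd1 releases the lock */
         && ofd_lock(fd2, F_WRLCK, 0, 1) == 0;  /* now fd2 succeeds */

    close(fd1);
    close(fd2);
    unlink(path);

    return ok ? 0 : -1;
}
```

A caller would invoke ofd_conflict_demo() once and treat a return value of 0 as confirmation of the behavior. Replacing F_OFD_SETLK with F_SETLK in the helper would be expected to make the second request succeed, since traditional record locks are associated with the process rather than with the open file description.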