---
title: Locking Block Device Access
category: Interfaces
layout: default
---

# Locking Block Device Access

*TL;DR: Use BSD file locks
[(`flock(2)`)](http://man7.org/linux/man-pages/man2/flock.2.html) on block
device nodes to synchronize access for partitioning and file system formatting
tools.*

`systemd-udevd` probes all block devices showing up for file system superblock
and partition table information (utilizing `libblkid`). If another program
concurrently modifies a superblock or partition table, this probing might be
affected, which is bad in itself, but might also in turn result in undesired
effects in programs subscribing to `udev` events.

Applications manipulating a block device can temporarily stop `systemd-udevd`
from processing rules on it, and thus bar it from probing the device, by
taking a BSD file lock on the block device node. Specifically, whenever
`systemd-udevd` starts processing a block device it takes a `LOCK_SH|LOCK_NB`
lock using [`flock(2)`](http://man7.org/linux/man-pages/man2/flock.2.html) on
the main block device (i.e. never on any partition block device, but on the
device the partition belongs to). If this lock cannot be taken (i.e. `flock()`
fails with `EWOULDBLOCK`, which has the same value as `EAGAIN` on Linux), it
refrains from processing the device. If it manages to take the lock, it is kept
for the entire time the device is processed.
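
The probing side of this protocol can be sketched in a few lines of Python. This is only an illustration of the `LOCK_SH|LOCK_NB` attempt described above, not the code `systemd-udevd` itself runs; the helper name `try_shared_lock` is hypothetical.

```python
import errno
import fcntl
import os


def try_shared_lock(path):
    """Open `path` read-only and try to take LOCK_SH|LOCK_NB on it.

    Returns the open file descriptor on success, or None if a conflicting
    (exclusive) lock is currently held through another open of the file.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except OSError as e:
        os.close(fd)
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None  # busy: leave the device alone for now
        raise
    return fd
```

A caller would pass the whole-disk node rather than a partition node, skip the device when `None` is returned, and keep the descriptor open for as long as it reads from the device.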

Note that `systemd-udevd` also watches all block device nodes it manages for
`inotify` `IN_CLOSE_WRITE` events: whenever such an event is seen, it is used
as a trigger to re-run the rule set for the device.

These two concepts allow tools such as disk partitioners or file system
formatting tools to safely and easily take exclusive ownership of a block
device while operating: before starting work on the block device, they should
take a `LOCK_EX` lock on it. This has two effects. First, in case
`systemd-udevd` is still processing the device, the tool will wait for it to
finish. Second, after the lock is taken, the tool can be sure that
`systemd-udevd` will refrain from processing the block device, and thus all
other client applications subscribed to it won't get device notifications from
potentially half-written data either. After the operation is complete, the
partitioner/formatter can simply close the device node. This also has two
effects. First, it implicitly releases the lock, so that `systemd-udevd` can
process events on the device node again. Second, it results in an
`IN_CLOSE_WRITE` event (the tool had the node open for writing), which causes
`systemd-udevd` to immediately re-process the device, seeing all changes the
tool made, and notify subscribed clients about it.
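
For the tool side, the open/lock/modify/close sequence might look like the following Python sketch. The context manager name `exclusive_block_device` is hypothetical, and the code assumes it is given the whole-disk node, opened read-write.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_block_device(path):
    """Yield a read-write fd for `path` while holding LOCK_EX on it."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Blocks until every other holder (for example a LOCK_SH taken
        # while the device is being probed) has released its lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor releases the lock. Because the node was
        # open for writing, the close is also what a watcher of
        # close-after-write events would see.
        os.close(fd)
```

A partitioner would wrap all of its writes in `with exclusive_block_device(path) as fd:` and do nothing to the device outside that block, so that the lock is taken right after opening and dropped only by the final close.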

Besides synchronizing block device access between `systemd-udevd` and such
tools, this scheme may also be used to synchronize access between those tools
themselves. However, do note that `flock()` locks are advisory only. This means
that if one tool honours this scheme and another tool does not, they will of
course not be synchronized properly, and might interfere with each other's work.
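
To make "advisory" concrete, the following Python sketch holds a `LOCK_EX` lock through one descriptor and then writes through a second descriptor that never asks for a lock. The function names are hypothetical; it is meant to be run against a scratch file, not a real device.

```python
import fcntl
import os


def write_ignoring_lock(path, data):
    """Write `data` at offset 0 of `path` without taking any lock.

    This models a tool that does not follow the locking scheme.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.pwrite(fd, data, 0)
    finally:
        os.close(fd)


def demo(path):
    """Hold LOCK_EX on `path`, then let an unlocked writer modify it."""
    owner = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(owner, fcntl.LOCK_EX)
        written = write_ignoring_lock(path, b"clobbered")
        # Read back through the lock holder's descriptor.
        return written, os.pread(owner, written, 0)
    finally:
        os.close(owner)
```

The write goes through even though the lock is held, which is why every tool touching the device has to take the lock for the scheme to protect anything.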

Note that the file locks follow the usual access semantics of BSD locks: since
`systemd-udevd` never writes to such block devices, it only takes a `LOCK_SH`
*shared* lock. A program intending to make changes to the block device should
take a `LOCK_EX` *exclusive* lock instead. For further details, see the
`flock(2)` man page.

And please keep in mind: BSD file locks (`flock()`) and POSIX file locks
(`lockf()`, `F_SETLK`, …) are different concepts, and in their effect
orthogonal. The scheme discussed above uses the former and not the latter,
because BSD file locks more closely match the required semantics.
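
The orthogonality of the two lock types can be observed with a small Python sketch. It assumes Linux and a local file system, where `flock()` and POSIX locks do not interact; other systems and network file systems may behave differently. The function name is hypothetical.

```python
import fcntl
import os


def posix_lock_succeeds_despite_flock(path):
    """Return True if a POSIX lock can be taken while a BSD LOCK_EX is held."""
    bsd_fd = os.open(path, os.O_RDWR)
    posix_fd = os.open(path, os.O_RDWR)
    try:
        # BSD lock on the whole file, via flock(2).
        fcntl.flock(bsd_fd, fcntl.LOCK_EX)
        try:
            # POSIX lock via fcntl(2); Python's lockf() wraps F_SETLK.
            fcntl.lockf(posix_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        return True
    finally:
        os.close(posix_fd)
        os.close(bsd_fd)
```

Under the stated assumptions the function returns `True`, so a tool that takes only a POSIX lock on the device node would not be seen by anything that follows the `flock()` scheme described here.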

Summarizing: it is recommended to take `LOCK_EX` BSD file locks when
manipulating block devices in all tools that change file system block devices
(`mkfs`, `fsck`, …) or partition tables (`fdisk`, `parted`, …), right after
opening the node.