Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage.
This behavior obviously speeds up various workloads, but it means the
operating system needs to force data out to non-volatile storage when it
performs a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.

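The ordering guarantee described above can be illustrated with a small
userspace model. This is only a sketch and not kernel code: struct toy_disk,
toy_flush(), toy_submit(), toy_power_loss() and the TOY_PREFLUSH flag are
hypothetical stand-ins for a device with a volatile cache, bio submission
and REQ_PREFLUSH.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_BLOCKS	8
#define TOY_PREFLUSH	(1u << 0)	/* hypothetical stand-in for REQ_PREFLUSH */

/* Hypothetical model of a disk with a volatile write back cache. */
struct toy_disk {
	char media[NR_BLOCKS];	/* non-volatile storage */
	char cache[NR_BLOCKS];	/* volatile cache contents */
	bool dirty[NR_BLOCKS];	/* cached, but not yet on media */
};

/* Write every dirty cache block to the non-volatile media. */
static void toy_flush(struct toy_disk *d)
{
	for (int i = 0; i < NR_BLOCKS; i++) {
		if (d->dirty[i]) {
			d->media[i] = d->cache[i];
			d->dirty[i] = false;
		}
	}
}

/*
 * Submit a write.  data == NULL models an otherwise empty bio.  With
 * TOY_PREFLUSH the cache is flushed first, so all previously completed
 * writes are on media before the new data is accepted.  The new data
 * itself only lands in the volatile cache.
 */
static void toy_submit(struct toy_disk *d, unsigned int flags,
		       int blk, const char *data)
{
	if (flags & TOY_PREFLUSH)
		toy_flush(d);
	if (data) {
		assert(blk >= 0 && blk < NR_BLOCKS);
		d->cache[blk] = *data;
		d->dirty[blk] = true;
	}
}

/* Power loss: everything that is only in the cache is gone. */
static void toy_power_loss(struct toy_disk *d)
{
	memset(d->cache, 0, sizeof(d->cache));
	memset(d->dirty, 0, sizeof(d->dirty));
}
```

In this model a plain write of block 0 followed by a TOY_PREFLUSH write of
block 1 leaves block 0 on media and block 1 only in the cache. A later
submission with TOY_PREFLUSH and no data makes block 1 durable as well.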

Force Unit Access
-----------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.

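The difference between a FUA write and a cache flush can be sketched with a
toy userspace model. The names struct toy_disk, toy_write() and TOY_FUA are
hypothetical and only stand in for a device with a volatile cache and for
REQ_FUA. The point of the sketch is that the flag covers the flagged request
only: other dirty data stays in the volatile cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_BLOCKS	8
#define TOY_FUA		(1u << 1)	/* hypothetical stand-in for REQ_FUA */

/* Hypothetical model of a disk with a volatile write back cache. */
struct toy_disk {
	char media[NR_BLOCKS];	/* non-volatile storage */
	char cache[NR_BLOCKS];	/* volatile cache contents */
	bool dirty[NR_BLOCKS];	/* cached, but not yet on media */
};

/*
 * Write one block.  Without TOY_FUA the write completes as soon as the
 * data is in the volatile cache.  With TOY_FUA the data is written
 * through to media before completion; other dirty blocks stay cached.
 * Returns true if the data is on non-volatile storage at completion.
 */
static bool toy_write(struct toy_disk *d, unsigned int flags,
		      int blk, char val)
{
	assert(blk >= 0 && blk < NR_BLOCKS);
	d->cache[blk] = val;
	if (flags & TOY_FUA) {
		d->media[blk] = val;
		d->dirty[blk] = false;
		return true;
	}
	d->dirty[blk] = true;
	return false;
}
```

After a plain write of block 0 and a TOY_FUA write of block 1, only block 1
is on media in this model; block 0 is still dirty in the cache.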

Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing
or how Force Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.


Implementation details for make_request_fn based block drivers
--------------------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.

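What the rule for remapping drivers means in practice can be sketched with a
toy striping driver in userspace. Everything here is hypothetical (struct
toy_child, struct toy_stripe, toy_stripe_submit() and the TOY_* flags); it
is not how any real remapping driver is written. It only shows that a
preflush has to reach every underlying device, while the FUA bit travels
with the data to the one device that holds the target block.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CHILDREN	2
#define TOY_PREFLUSH	(1u << 0)	/* hypothetical stand-in for REQ_PREFLUSH */
#define TOY_FUA		(1u << 1)	/* hypothetical stand-in for REQ_FUA */

/* Hypothetical lower device that just records what it was asked to do. */
struct toy_child {
	unsigned int flushes;		/* cache flushes received */
	unsigned int writes;		/* data writes received */
	unsigned int fua_writes;	/* data writes carrying TOY_FUA */
};

/* Hypothetical striping driver: block N lives on child N % NR_CHILDREN. */
struct toy_stripe {
	struct toy_child child[NR_CHILDREN];
};

static void toy_stripe_init(struct toy_stripe *s)
{
	memset(s, 0, sizeof(*s));
}

static void toy_child_write(struct toy_child *c, unsigned int flags)
{
	c->writes++;
	if (flags & TOY_FUA)
		c->fua_writes++;
}

/*
 * Remap one request.  A preflush must cover every child, because the
 * writes completed earlier may sit in any child's cache.  The data and
 * its FUA bit only go to the child that holds the target block.
 */
static void toy_stripe_submit(struct toy_stripe *s, unsigned int flags,
			      unsigned long blk, bool has_data)
{
	if (flags & TOY_PREFLUSH) {
		for (int i = 0; i < NR_CHILDREN; i++)
			s->child[i].flushes++;
	}
	if (has_data)
		toy_child_write(&s->child[blk % NR_CHILDREN],
				flags & TOY_FUA);
}
```

Submitting a write to block 3 with both flags set flushes both children in
this model, but only child 1 sees a write, and that write carries TOY_FUA.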

Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches no driver support is
required; the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:

	blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:

	blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
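
The rules in this section can be summarized as a small decision function.
The following userspace sketch restates the text only and is not the block
layer's actual implementation: struct toy_queue, toy_flush_policy(), the
TOY_* flags and the STEP_* bits are hypothetical names. The two booleans in
struct toy_queue are assumed to correspond to the second and third argument
of blk_queue_write_cache().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PREFLUSH	(1u << 0)	/* hypothetical stand-in for REQ_PREFLUSH */
#define TOY_FUA		(1u << 1)	/* hypothetical stand-in for REQ_FUA */

/* Steps the driver may be asked to perform for one request. */
#define STEP_PREFLUSH	(1u << 0)	/* empty flush before the write */
#define STEP_DATA	(1u << 1)	/* the write itself */
#define STEP_DATA_FUA	(1u << 2)	/* write is passed down with FUA set */
#define STEP_POSTFLUSH	(1u << 3)	/* empty flush after the write */

/* Assumed to mirror the two booleans of blk_queue_write_cache(). */
struct toy_queue {
	bool wc;	/* device has a volatile write cache */
	bool fua;	/* device supports FUA natively */
};

/* Decide which steps a request needs, following the rules in the text. */
static unsigned int toy_flush_policy(const struct toy_queue *q,
				     unsigned int flags, bool has_data)
{
	unsigned int steps = has_data ? STEP_DATA : 0;

	/* No volatile cache: both flags are stripped, nothing to flush. */
	if (!q->wc)
		return steps;

	/* Flush before the data, or a pure flush for an empty request. */
	if (flags & TOY_PREFLUSH)
		steps |= STEP_PREFLUSH;

	/* FUA is passed through if supported, else emulated by a flush. */
	if (has_data && (flags & TOY_FUA)) {
		if (q->fua)
			steps |= STEP_DATA_FUA;
		else
			steps |= STEP_POSTFLUSH;
	}
	return steps;
}
```

For a write with both flags set, a queue with a write cache but no native
FUA yields flush, write, flush in this model, while a queue that also
supports FUA yields a flush followed by a single FUA write. A return value
of 0 models an empty flush request that completes without entering the
driver.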