f2fs_fiemap() calls f2fs_map_blocks() to obtain the block mapping a
file, and then merges contiguous mappings into extents. If the mapping
is found in the read extent cache, node blocks do not need to be read.
However, in the following scenario, a contiguous extent can be split
into two extents:
$ dd if=/dev/zero of=data.128M bs=1M count=128
$ losetup -f data.128M
$ mkfs.f2fs /dev/loop0 -f
$ mount -o mode=lfs /dev/loop0 /mnt/f2fs/
$ cd /mnt/f2fs/
$ dd if=/dev/zero of=data.72M bs=1M count=72 && sync
$ dd if=/dev/zero of=data.4M bs=1M count=4 && sync
$ dd if=/dev/zero of=data.4M bs=1M count=2 seek=2 conv=notrunc && sync
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/zero of=data.4M bs=1M count=2 seek=0 conv=notrunc && sync
$ dd if=/dev/zero of=data.4M bs=1M count=2 seek=0 conv=notrunc && sync
$ f2fs_io fiemap 0 1024 data.4M
Fiemap: offset = 0 len = 1024
logical addr. physical addr. length flags
0
0000000000000000 0000000006400000 0000000000200000 00001000
1
0000000000200000 0000000006600000 0000000000200000 00001001
Although the physical addresses of the ranges 0~2MB and 2M~4MB are
contiguous, the mapping for the 2M~4MB range is not present in memory.
When the physical addresses for the 0~2MB range are updated, no merge
happens because the adjacent mapping is missing from the in-memory
cache. As a result, fiemap reports two separate extents instead of a
single contiguous one.
The root cause is that the read extent cache does not guarantee that all
blocks of an extent are present in memory. Therefore, when the extent
length returned by f2fs_map_blocks_cached() is smaller than maxblocks,
the remaining mappings are retrieved via f2fs_get_dnode_of_data() to
ensure correct fiemap extent boundary handling.
Cc: stable@kernel.org
Fixes: cd8fc5226bef ("f2fs: remove the create argument to f2fs_map_blocks")
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
lfs_dio_write = (flag == F2FS_GET_BLOCK_DIO && f2fs_lfs_mode(sbi) &&
map->m_may_create);
- if (!map->m_may_create && f2fs_map_blocks_cached(inode, map, flag))
- goto out;
+ if (!map->m_may_create && f2fs_map_blocks_cached(inode, map, flag)) {
+ struct extent_info ei;
+
+ /*
+ * 1. If map->m_multidev_dio is true, map->m_pblk cannot be
+ * waitted by f2fs_wait_on_block_writeback_range() and are not
+ * mergeable.
+ * 2. If pgofs hits the read extent cache, it means the mapping
+ * is already cached in the extent cache, but it is not
+ * mergeable, and there is no need to query the mapping again
+ * via f2fs_get_dnode_of_data().
+ */
+ pgofs = (pgoff_t)map->m_lblk + map->m_len;
+ if (map->m_len == maxblocks ||
+ map->m_multidev_dio ||
+ f2fs_lookup_read_extent_cache(inode, pgofs, &ei))
+ goto out;
+ ofs = map->m_len;
+ goto map_more;
+ }
map->m_bdev = inode->i_sb->s_bdev;
map->m_multidev_dio =
/* it only supports block size == page size */
pgofs = (pgoff_t)map->m_lblk;
- end = pgofs + maxblocks;
+map_more:
+ end = (pgoff_t)map->m_lblk + maxblocks;
if (flag == F2FS_GET_BLOCK_PRECACHE)
mode = LOOKUP_NODE_RA;