From: Qu Wenruo <wqu@suse.com>
Date: Sun, 26 Apr 2026 07:51:03 +0000 (+0930)
Subject: btrfs: enable cross-folio readahead for bs < ps and large folio cases
X-Git-Url: http://git.ipfire.org/gitweb/?a=commitdiff_plain;h=796ad9c3432ea5bf96811564680dae668d1a4880;p=thirdparty%2Flinux.git

btrfs: enable cross-folio readahead for bs < ps and large folio cases

[BACKGROUND]
When bs < ps support was first introduced, compressed data readahead
was disabled, because the target page size at that time was 64K.
A compressed data extent (at most 128K) can then span at most three 64K
pages, and only when its head and tail are not aligned to 64K, so the
benefit was pretty minimal.
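
The page count above can be checked with a small userspace sketch. It
is not kernel code: folios_spanned() is a hypothetical helper written
only for this illustration, and the 128K limit is taken from the
comment removed by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum compressed extent size (128K), as stated in the removed comment. */
#define MAX_COMPRESSED_EXTENT_SIZE (128U * 1024)

/*
 * Hypothetical helper, not a kernel function: number of folios touched
 * by the file range [start, start + len) for a fixed folio size.
 */
static uint64_t folios_spanned(uint64_t start, uint64_t len,
			       uint64_t folio_size)
{
	assert(folio_size > 0 && len > 0);

	uint64_t first = start / folio_size;
	uint64_t last = (start + len - 1) / folio_size;

	return last - first + 1;
}
```

With 64K pages, a 128K extent starting at file offset 4K touches three
pages, while the same extent starting at offset 0 touches only two, so
readahead could add at most two extra pages per extent.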

[UNEXPECTED WORKING SITUATION]
But with large folio support already merged, readahead is being enabled
through the subpage routines unintentionally, e.g.:

   0      4K      8K      12K      16K
   |   Folio 0    |    Folio 8K    |
   |<----- Compressed data ------->|

We have two 8K sized folios, both backed by a single compressed extent.

In that case add_ra_bio_pages() will continue to add folio 8K into the
read bio, as the condition to skip is only (bs < ps), which is false on
a 4K page sized system, and it does not take the newer large folio
support into consideration at all.

So folio 8K is added to the read bio, but without its subpage locked
bitmap populated.

Then at end_bbio_data_read(), folio 0 has its locked bitmap properly
set, but folio 8K does not.
This inconsistency is handled by the extra safety net in
btrfs_subpage_end_and_test_lock(), where a folio that has no @nr_locked
is just unlocked without touching the locked bitmap.
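
The following is a simplified userspace model of that safety net, not
the kernel implementation. The struct and both toy_*() helpers are
hypothetical names invented for this sketch, and locking, atomics and
the real bitmap layout are all left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-folio state; field names are illustrative only. */
struct toy_subpage {
	unsigned int locked_bitmap;	/* one bit per block in the folio */
	unsigned int nr_locked;		/* number of bits currently set */
};

/* Model of marking blocks as locked before the read is submitted. */
static void toy_set_lock(struct toy_subpage *sp, unsigned int first_bit,
			 unsigned int nbits)
{
	assert(first_bit + nbits <= 32);
	for (unsigned int i = first_bit; i < first_bit + nbits; i++) {
		if (!(sp->locked_bitmap & (1U << i))) {
			sp->locked_bitmap |= 1U << i;
			sp->nr_locked++;
		}
	}
}

/* Model of the endio side; returns true if the folio should be unlocked. */
static bool toy_end_and_test_lock(struct toy_subpage *sp,
				  unsigned int first_bit, unsigned int nbits)
{
	assert(first_bit + nbits <= 32);

	/* Safety net: nothing was ever counted, so unlock right away. */
	if (sp->nr_locked == 0)
		return true;

	for (unsigned int i = first_bit; i < first_bit + nbits; i++) {
		if (sp->locked_bitmap & (1U << i)) {
			sp->locked_bitmap &= ~(1U << i);
			sp->nr_locked--;
		}
	}
	/* Only the range that clears the last locked block unlocks the folio. */
	return sp->nr_locked == 0;
}
```

In this model, folio 0 takes the bitmap path and is unlocked only after
its last block finishes, while folio 8K, which never went through
toy_set_lock(), hits the early return on its first completion.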

[ENHANCEMENT]
Make add_ra_bio_pages() support bs < ps and large folio cases, by
removing the check and calling btrfs_folio_set_lock() unconditionally.

This won't make any difference on 4K page sized systems with large
folios, as the readahead is already working, although unexpectedly.

But this will properly enable compressed data readahead for bs < ps
cases.

Please note that such readahead will only work if the compressed extent
crosses folio boundaries, which is also the existing limitation.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index a02b62e0a8f33..ea3207273834e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -358,12 +358,9 @@ struct compressed_bio *btrfs_alloc_compressed_write(struct btrfs_inode *inode,
  * Add extra pages in the same compressed file extent so that we don't need to
  * re-read the same extent again and again.
  *
- * NOTE: this won't work well for subpage, as for subpage read, we lock the
- * full page then submit bio for each compressed/regular extents.
- *
- * This means, if we have several sectors in the same page points to the same
- * on-disk compressed data, we will re-read the same extent many times and
- * this function can only help for the next page.
+ * If in the same page, we have several non-contiguous blocks which are pointing
+ * to the same on-disk compressed data, we will re-read the same extent many
+ * times, as this function can only help cross page situations.
  */
 static noinline int add_ra_bio_pages(struct inode *inode,
 				     u64 compressed_end,
@@ -391,16 +388,6 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 	if (isize == 0)
 		return 0;
 
-	/*
-	 * For current subpage support, we only support 64K page size,
-	 * which means maximum compressed extent size (128K) is just 2x page
-	 * size.
-	 * This makes readahead less effective, so here disable readahead for
-	 * subpage for now, until full compressed write is supported.
-	 */
-	if (fs_info->sectorsize < PAGE_SIZE)
-		return 0;
-
 	/* For bs > ps cases, we don't support readahead for compressed folios for now. */
 	if (fs_info->block_min_order)
 		return 0;
@@ -438,8 +425,8 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 				break;
 
 			/*
-			 * Jump to next page start as we already have page for
-			 * current offset.
+			 * Jump to the next folio as we already have a folio for
+			 * the current offset.
 			 */
 			cur += (folio_sz - offset);
 			continue;
@@ -457,7 +444,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 			break;
 
 		if (filemap_add_folio(mapping, folio, pg_index, cache_gfp)) {
-			/* There is already a page, skip to page end */
+			/* There is already a folio, skip to folio end. */
 			cur += folio_size(folio);
 			folio_put(folio);
 			continue;
@@ -482,7 +469,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 		read_unlock(&em_tree->lock);
 
 		/*
-		 * At this point, we have a locked page in the page cache for
+		 * At this point, we have a locked folio in the page cache for
 		 * these bytes in the file.  But, we have to make sure they map
 		 * to this compressed extent on disk.
 		 */
@@ -516,13 +503,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 			folio_put(folio);
 			break;
 		}
-		/*
-		 * If it's subpage, we also need to increase its
-		 * subpage::readers number, as at endio we will decrease
-		 * subpage::readers and to unlock the page.
-		 */
-		if (fs_info->sectorsize < PAGE_SIZE)
-			btrfs_folio_set_lock(fs_info, folio, cur, add_size);
+		btrfs_folio_set_lock(fs_info, folio, cur, add_size);
 		folio_put(folio);
 		cur += add_size;
 	}