--- /dev/null
+From d5321a0fa8bc49f11bea0b470800962c17d92d8f Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Tue, 10 May 2022 15:10:18 +0800
+Subject: btrfs: add "0x" prefix for unsupported optional features
+
+From: Qu Wenruo <wqu@suse.com>
+
+commit d5321a0fa8bc49f11bea0b470800962c17d92d8f upstream.
+
+The following error message obviously lacks the "0x" prefix:
+
+ cannot mount because of unsupported optional features (4000)
+
+Add the prefix to make it less confusing. This can happen on older
+kernels that try to mount a filesystem with newer features, so it makes
+sense to backport the fix to older trees.
+
+CC: stable@vger.kernel.org # 4.14+
+Reviewed-by: Nikolay Borisov <nborisov@suse.com>
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Reviewed-by: David Sterba <dsterba@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/disk-io.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -3522,7 +3522,7 @@ int __cold open_ctree(struct super_block
+ ~BTRFS_FEATURE_INCOMPAT_SUPP;
+ if (features) {
+ btrfs_err(fs_info,
+- "cannot mount because of unsupported optional features (%llx)",
++ "cannot mount because of unsupported optional features (0x%llx)",
+ features);
+ err = -EINVAL;
+ goto fail_alloc;
+@@ -3560,7 +3560,7 @@ int __cold open_ctree(struct super_block
+ ~BTRFS_FEATURE_COMPAT_RO_SUPP;
+ if (!sb_rdonly(sb) && features) {
+ btrfs_err(fs_info,
+- "cannot mount read-write because of unsupported optional features (%llx)",
++ "cannot mount read-write because of unsupported optional features (0x%llx)",
+ features);
+ err = -EINVAL;
+ goto fail_alloc;
--- /dev/null
+From 10f7f6f879c28f8368d6516ab1ccf3517a1f5d3d Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Tue, 12 Apr 2022 20:30:14 +0800
+Subject: btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage()
+
+From: Qu Wenruo <wqu@suse.com>
+
+commit 10f7f6f879c28f8368d6516ab1ccf3517a1f5d3d upstream.
+
+[BUG]
+Test case generic/475 has a very high chance (almost 100%) of hitting a
+fs hang, where a data page will never be unlocked, hanging all later
+operations.
+
+[CAUSE]
+In btrfs_do_readpage(), if we hit an error from submit_extent_page() we
+try to clean up the current io range and exit.
+
+This works fine for PAGE_SIZE == sectorsize cases, but not for subpage.
+
+For subpage, btrfs_do_readpage() locks the full page first, which can
+contain several different sectors and extents:
+
+ btrfs_do_readpage()
+ |- begin_page_read()
+ | |- btrfs_subpage_start_reader();
+ | Now the page will have PAGE_SIZE / sectorsize readers pending,
+ | and the page is locked.
+ |
+ |- end_page_read() for different branches
+ | This function will reduce subpage readers, and when readers
+ | reach 0, it will unlock the page.
+
+But when submit_extent_page() fails, we only clean up the current
+io range, while the remaining io range is never cleaned up, and the
+page remains locked forever.
+
+[FIX]
+Update the error handling of submit_extent_page() to clean up the whole
+remaining subpage range before exiting the loop.
+
+Note that submit_extent_page() can currently only fail due to the
+sanity checks in alloc_new_bio(), so regular IO errors cannot trigger
+this error path.
+
+CC: stable@vger.kernel.org # 5.15+
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/extent_io.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -3721,8 +3721,12 @@ int btrfs_do_readpage(struct page *page,
+ this_bio_flag,
+ force_bio_submit);
+ if (ret) {
+- unlock_extent(tree, cur, cur + iosize - 1);
+- end_page_read(page, false, cur, iosize);
++ /*
++ * We have to unlock the remaining range, or the page
++ * will never be unlocked.
++ */
++ unlock_extent(tree, cur, end);
++ end_page_read(page, false, cur, end + 1 - cur);
+ goto out;
+ }
+ cur = cur + iosize;
--- /dev/null
+From d201238ccd2f30b9bfcfadaeae0972e3a486a176 Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Mon, 28 Feb 2022 15:05:53 +0800
+Subject: btrfs: repair super block num_devices automatically
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Qu Wenruo <wqu@suse.com>
+
+commit d201238ccd2f30b9bfcfadaeae0972e3a486a176 upstream.
+
+[BUG]
+There is a report of a btrfs filesystem with a bad super block
+num_devices value. This makes btrfs reject the fs completely:
+
+ BTRFS error (device sdd3): super_num_devices 3 mismatch with num_devices 2 found here
+ BTRFS error (device sdd3): failed to read chunk tree: -22
+ BTRFS error (device sdd3): open_ctree failed
+
+[CAUSE]
+During btrfs device removal, chunk tree and super block num devs are
+updated in two different transactions:
+
+ btrfs_rm_device()
+ |- btrfs_rm_dev_item(device)
+ | |- trans = btrfs_start_transaction()
+ | | Now we got transaction X
+ | |
+ | |- btrfs_del_item()
+ | | Now device item is removed from chunk tree
+ | |
+ | |- btrfs_commit_transaction()
+ | Transaction X got committed, super num devs untouched,
+ | but device item removed from chunk tree.
+ | (AKA, super num devs is already incorrect)
+ |
+ |- cur_devices->num_devices--;
+ |- cur_devices->total_devices--;
+ |- btrfs_set_super_num_devices()
+ All those operations are not in transaction X, thus it will
+ only be written back to disk in next transaction.
+
+So if a power loss happens after transaction X in btrfs_rm_dev_item()
+is committed, but before transaction X+1 (which can be minutes away),
+we get the super block num_devices mismatch.
+
+This has been fixed by commit bbac58698a55 ("btrfs: remove device item
+and update super block in the same transaction").
+
+[FIX]
+Make the super_num_devices check less strict, converting it from a hard
+error to a warning, and reset the value to a correct one for the current
+or next transaction commit.
+
+As the number of device items is the critical information, while the
+super block num_devices is only a cached value (also useful for cross
+checking), it's safe to automatically update it. Other device related
+problems, like a missing device, are handled after that and may require
+other means to resolve, like a degraded mount. With this fix,
+potentially affected filesystems won't fail to mount and won't require
+manual repair by btrfs check.
+
+Reported-by: Luca Béla Palkovics <luca.bela.palkovics@gmail.com>
+Link: https://lore.kernel.org/linux-btrfs/CA+8xDSpvdm_U0QLBAnrH=zqDq_cWCOH5TiV46CKmp3igr44okQ@mail.gmail.com/
+CC: stable@vger.kernel.org # 4.14+
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Reviewed-by: David Sterba <dsterba@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/volumes.c | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+--- a/fs/btrfs/volumes.c
++++ b/fs/btrfs/volumes.c
+@@ -7698,12 +7698,12 @@ int btrfs_read_chunk_tree(struct btrfs_f
+ * do another round of validation checks.
+ */
+ if (total_dev != fs_info->fs_devices->total_devices) {
+- btrfs_err(fs_info,
+- "super_num_devices %llu mismatch with num_devices %llu found here",
++ btrfs_warn(fs_info,
++"super block num_devices %llu mismatch with DEV_ITEM count %llu, will be repaired on next transaction commit",
+ btrfs_super_num_devices(fs_info->super_copy),
+ total_dev);
+- ret = -EINVAL;
+- goto error;
++ fs_info->fs_devices->total_devices = total_dev;
++ btrfs_set_super_num_devices(fs_info->super_copy, total_dev);
+ }
+ if (btrfs_super_total_bytes(fs_info->super_copy) <
+ fs_info->fs_devices->total_rw_bytes) {
--- /dev/null
+From 44e5801fada6925d2bba1987c7b59cbcc9d0d592 Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Tue, 12 Apr 2022 20:30:15 +0800
+Subject: btrfs: return correct error number for __extent_writepage_io()
+
+From: Qu Wenruo <wqu@suse.com>
+
+commit 44e5801fada6925d2bba1987c7b59cbcc9d0d592 upstream.
+
+[BUG]
+If we hit an error from submit_extent_page() inside
+__extent_writepage_io(), we could still return 0 to the caller, and
+even trigger the warning in btrfs_page_assert_not_dirty().
+
+[CAUSE]
+In __extent_writepage_io(), if we hit an error from
+submit_extent_page(), we will just clean up the range and continue.
+
+This is completely fine for the regular PAGE_SIZE == sectorsize case, as
+we can only hit one sector per page, thus after the error we're ensured
+to exit and @ret will be preserved.
+
+But for the subpage case, we may have other dirty subpage ranges in the
+page, and in the next iteration we may succeed in submitting the next
+range.
+
+In that case, @ret will be overwritten, and we return 0 to the caller,
+even though we have hit an error.
+
+[FIX]
+Introduce @has_error and @saved_ret to record the first error we hit, so
+we will never forget what error we hit.
+
+CC: stable@vger.kernel.org # 5.15+
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/extent_io.c | 13 ++++++++++++-
+ 1 file changed, 12 insertions(+), 1 deletion(-)
+
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -3898,10 +3898,12 @@ static noinline_for_stack int __extent_w
+ u64 extent_offset;
+ u64 block_start;
+ struct extent_map *em;
++ int saved_ret = 0;
+ int ret = 0;
+ int nr = 0;
+ u32 opf = REQ_OP_WRITE;
+ const unsigned int write_flags = wbc_to_write_flags(wbc);
++ bool has_error = false;
+ bool compressed;
+
+ ret = btrfs_writepage_cow_fixup(page);
+@@ -3951,6 +3953,9 @@ static noinline_for_stack int __extent_w
+ if (IS_ERR_OR_NULL(em)) {
+ btrfs_page_set_error(fs_info, page, cur, end - cur + 1);
+ ret = PTR_ERR_OR_ZERO(em);
++ has_error = true;
++ if (!saved_ret)
++ saved_ret = ret;
+ break;
+ }
+
+@@ -4014,6 +4019,10 @@ static noinline_for_stack int __extent_w
+ end_bio_extent_writepage,
+ 0, 0, false);
+ if (ret) {
++ has_error = true;
++ if (!saved_ret)
++ saved_ret = ret;
++
+ btrfs_page_set_error(fs_info, page, cur, iosize);
+ if (PageWriteback(page))
+ btrfs_page_clear_writeback(fs_info, page, cur,
+@@ -4027,8 +4036,10 @@ static noinline_for_stack int __extent_w
+ * If we finish without problem, we should not only clear page dirty,
+ * but also empty subpage dirty bits
+ */
+- if (!ret)
++ if (!has_error)
+ btrfs_page_assert_not_dirty(fs_info, page);
++ else
++ ret = saved_ret;
+ *nr_ret = nr;
+ return ret;
+ }
--- /dev/null
+From 8b8a53998caefebfe5c8da7a74c2b601caf5dd48 Mon Sep 17 00:00:00 2001
+From: Naohiro Aota <naohiro.aota@wdc.com>
+Date: Tue, 3 May 2022 17:48:52 -0700
+Subject: btrfs: zoned: finish block group when there are no more allocatable bytes left
+
+From: Naohiro Aota <naohiro.aota@wdc.com>
+
+commit 8b8a53998caefebfe5c8da7a74c2b601caf5dd48 upstream.
+
+Currently, btrfs_zone_finish_endio() finishes a block group only when the
+written region reaches the end of the block group. We can also finish the
+block group when no more allocation is possible.
+
+Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
+CC: stable@vger.kernel.org # 5.16+
+Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
+Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
+Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/zoned.c | 11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -1961,6 +1961,7 @@ void btrfs_zone_finish_endio(struct btrf
+ struct btrfs_block_group *block_group;
+ struct map_lookup *map;
+ struct btrfs_device *device;
++ u64 min_alloc_bytes;
+ u64 physical;
+
+ if (!btrfs_is_zoned(fs_info))
+@@ -1969,7 +1970,15 @@ void btrfs_zone_finish_endio(struct btrf
+ block_group = btrfs_lookup_block_group(fs_info, logical);
+ ASSERT(block_group);
+
+- if (logical + length < block_group->start + block_group->zone_capacity)
++ /* No MIXED_BG on zoned btrfs. */
++ if (block_group->flags & BTRFS_BLOCK_GROUP_DATA)
++ min_alloc_bytes = fs_info->sectorsize;
++ else
++ min_alloc_bytes = fs_info->nodesize;
++
++ /* Bail out if we can allocate more data from this block group. */
++ if (logical + length + min_alloc_bytes <=
++ block_group->start + block_group->zone_capacity)
+ goto out;
+
+ spin_lock(&block_group->lock);
--- /dev/null
+From aa9ffadfcae33e611d8c2d476bcc2aa0d273b587 Mon Sep 17 00:00:00 2001
+From: Naohiro Aota <naohiro.aota@wdc.com>
+Date: Wed, 4 May 2022 16:12:48 -0700
+Subject: btrfs: zoned: fix comparison of alloc_offset vs meta_write_pointer
+
+From: Naohiro Aota <naohiro.aota@wdc.com>
+
+commit aa9ffadfcae33e611d8c2d476bcc2aa0d273b587 upstream.
+
+The block_group->alloc_offset is an offset from the start of the block
+group. OTOH, the ->meta_write_pointer is an address in the logical
+space. So, we should compare the alloc_offset shifted by
+block_group->start.
+
+Fixes: afba2bc036b0 ("btrfs: zoned: implement active zone tracking")
+CC: stable@vger.kernel.org # 5.16+
+Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/zoned.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -1863,7 +1863,7 @@ int btrfs_zone_finish(struct btrfs_block
+ /* Check if we have unwritten allocated space */
+ if ((block_group->flags &
+ (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)) &&
+- block_group->alloc_offset > block_group->meta_write_pointer) {
++ block_group->start + block_group->alloc_offset > block_group->meta_write_pointer) {
+ spin_unlock(&block_group->lock);
+ return -EAGAIN;
+ }
--- /dev/null
+From 56fbb0a4e8b3e929e41cc846e6ef89eb01152201 Mon Sep 17 00:00:00 2001
+From: Naohiro Aota <naohiro.aota@wdc.com>
+Date: Tue, 3 May 2022 17:48:53 -0700
+Subject: btrfs: zoned: properly finish block group on metadata write
+
+From: Naohiro Aota <naohiro.aota@wdc.com>
+
+commit 56fbb0a4e8b3e929e41cc846e6ef89eb01152201 upstream.
+
+Commit be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
+introduced zone finishing code for both the data and metadata end_io
+paths. However, the metadata side is not working as it should. First, it
+compares a logical address (eb->start + eb->len) with an offset within a
+block group (cache->zone_capacity) in submit_eb_page(). That essentially
+disabled zone finishing on the metadata end_io path.
+
+Furthermore, fixing the issue above revealed that we cannot call
+btrfs_zone_finish_endio() in end_extent_buffer_writeback(): we cannot
+call btrfs_lookup_block_group(), which requires taking a spin lock,
+from inside end_io context.
+
+Introduce btrfs_schedule_zone_finish_bg() to wait for the extent buffer
+writeback and do the zone finish IO in a workqueue.
+
+Also, drop EXTENT_BUFFER_ZONE_FINISH as it is no longer used.
+
+Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
+CC: stable@vger.kernel.org # 5.16+
+Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
+Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/block-group.h | 2 ++
+ fs/btrfs/extent_io.c | 6 +-----
+ fs/btrfs/extent_io.h | 1 -
+ fs/btrfs/zoned.c | 31 +++++++++++++++++++++++++++++++
+ fs/btrfs/zoned.h | 5 +++++
+ 5 files changed, 39 insertions(+), 6 deletions(-)
+
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -211,6 +211,8 @@ struct btrfs_block_group {
+ u64 meta_write_pointer;
+ struct map_lookup *physical_map;
+ struct list_head active_bg_list;
++ struct work_struct zone_finish_work;
++ struct extent_buffer *last_eb;
+ };
+
+ static inline u64 btrfs_block_group_end(struct btrfs_block_group *block_group)
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -4173,9 +4173,6 @@ void wait_on_extent_buffer_writeback(str
+
+ static void end_extent_buffer_writeback(struct extent_buffer *eb)
+ {
+- if (test_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags))
+- btrfs_zone_finish_endio(eb->fs_info, eb->start, eb->len);
+-
+ clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
+ smp_mb__after_atomic();
+ wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
+@@ -4795,8 +4792,7 @@ static int submit_eb_page(struct page *p
+ /*
+ * Implies write in zoned mode. Mark the last eb in a block group.
+ */
+- if (cache->seq_zone && eb->start + eb->len == cache->zone_capacity)
+- set_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags);
++ btrfs_schedule_zone_finish_bg(cache, eb);
+ btrfs_put_block_group(cache);
+ }
+ ret = write_one_eb(eb, wbc, epd);
+--- a/fs/btrfs/extent_io.h
++++ b/fs/btrfs/extent_io.h
+@@ -32,7 +32,6 @@ enum {
+ /* write IO error */
+ EXTENT_BUFFER_WRITE_ERR,
+ EXTENT_BUFFER_NO_CHECK,
+- EXTENT_BUFFER_ZONE_FINISH,
+ };
+
+ /* these are flags for __process_pages_contig */
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -2007,6 +2007,37 @@ out:
+ btrfs_put_block_group(block_group);
+ }
+
++static void btrfs_zone_finish_endio_workfn(struct work_struct *work)
++{
++ struct btrfs_block_group *bg =
++ container_of(work, struct btrfs_block_group, zone_finish_work);
++
++ wait_on_extent_buffer_writeback(bg->last_eb);
++ free_extent_buffer(bg->last_eb);
++ btrfs_zone_finish_endio(bg->fs_info, bg->start, bg->length);
++ btrfs_put_block_group(bg);
++}
++
++void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb)
++{
++ if (!bg->seq_zone || eb->start + eb->len * 2 <= bg->start + bg->zone_capacity)
++ return;
++
++ if (WARN_ON(bg->zone_finish_work.func == btrfs_zone_finish_endio_workfn)) {
++ btrfs_err(bg->fs_info, "double scheduling of bg %llu zone finishing",
++ bg->start);
++ return;
++ }
++
++ /* For the work */
++ btrfs_get_block_group(bg);
++ atomic_inc(&eb->refs);
++ bg->last_eb = eb;
++ INIT_WORK(&bg->zone_finish_work, btrfs_zone_finish_endio_workfn);
++ queue_work(system_unbound_wq, &bg->zone_finish_work);
++}
++
+ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg)
+ {
+ struct btrfs_fs_info *fs_info = bg->fs_info;
+--- a/fs/btrfs/zoned.h
++++ b/fs/btrfs/zoned.h
+@@ -76,6 +76,8 @@ int btrfs_zone_finish(struct btrfs_block
+ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags);
+ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
+ u64 length);
++void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb);
+ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
+ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
+ #else /* CONFIG_BLK_DEV_ZONED */
+@@ -233,6 +235,9 @@ static inline bool btrfs_can_activate_zo
+ static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
+ u64 logical, u64 length) { }
+
++static inline void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb) { }
++
+ static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
+
+ static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
--- /dev/null
+From 74e91b12b11560f01d120751d99d91d54b265d3d Mon Sep 17 00:00:00 2001
+From: Naohiro Aota <naohiro.aota@wdc.com>
+Date: Tue, 3 May 2022 17:48:54 -0700
+Subject: btrfs: zoned: zone finish unused block group
+
+From: Naohiro Aota <naohiro.aota@wdc.com>
+
+commit 74e91b12b11560f01d120751d99d91d54b265d3d upstream.
+
+While the active zones within an active block group are reset, and their
+active resource is released, the block group itself is kept in the active
+block group list and marked as active. As a result, the list will contain
+more than max_active_zones block groups. That itself is not fatal for the
+device as the zones are properly reset.
+
+However, that inflated list is, of course, strange. Also, an upcoming
+patch series, which deactivates an active block group on demand, gets
+confused by the wrong list.
+
+So, fix the issue by finishing the unused block group once it becomes
+read-only, so that we can release the active resource at an early stage.
+
+Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
+CC: stable@vger.kernel.org # 5.16+
+Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
+Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/block-group.c | 8 ++++++++
+ 1 file changed, 8 insertions(+)
+
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -1367,6 +1367,14 @@ void btrfs_delete_unused_bgs(struct btrf
+ goto next;
+ }
+
++ ret = btrfs_zone_finish(block_group);
++ if (ret < 0) {
++ btrfs_dec_block_group_ro(block_group);
++ if (ret == -EAGAIN)
++ ret = 0;
++ goto next;
++ }
++
+ /*
+ * Want to do this before we do anything else so we can recover
+ * properly if we fail to join the transaction.
ptrace-um-replace-pt_dtrace-with-tif_singlestep.patch
ptrace-xtensa-replace-pt_singlestep-with-tif_singlestep.patch
ptrace-reimplement-ptrace_kill-by-always-sending-sigkill.patch
+btrfs-add-0x-prefix-for-unsupported-optional-features.patch
+btrfs-return-correct-error-number-for-__extent_writepage_io.patch
+btrfs-repair-super-block-num_devices-automatically.patch
+btrfs-fix-the-error-handling-for-submit_extent_page-for-btrfs_do_readpage.patch
+btrfs-zoned-properly-finish-block-group-on-metadata-write.patch
+btrfs-zoned-zone-finish-unused-block-group.patch
+btrfs-zoned-finish-block-group-when-there-are-no-more-allocatable-bytes-left.patch
+btrfs-zoned-fix-comparison-of-alloc_offset-vs-meta_write_pointer.patch