From: Greg Kroah-Hartman Date: Wed, 9 Oct 2024 13:37:33 +0000 (+0200) Subject: drop btrfs patch from older kernels X-Git-Tag: v6.6.55~4 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=766f86e311a77572a7a6d3a3191ec4be9a3782ae;p=thirdparty%2Fkernel%2Fstable-queue.git drop btrfs patch from older kernels --- diff --git a/queue-5.10/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch b/queue-5.10/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch deleted file mode 100644 index 7d9d053c5fd..00000000000 --- a/queue-5.10/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch +++ /dev/null @@ -1,282 +0,0 @@ -From a0f0cf8341e34e5d2265bfd3a7ad68342da1e2aa Mon Sep 17 00:00:00 2001 -From: Filipe Manana -Date: Wed, 2 Feb 2022 15:26:09 +0000 -Subject: btrfs: get rid of warning on transaction commit when using flushoncommit - -From: Filipe Manana - -commit a0f0cf8341e34e5d2265bfd3a7ad68342da1e2aa upstream. - -When using the flushoncommit mount option, during almost every transaction -commit we trigger a warning from __writeback_inodes_sb_nr(): - - $ cat fs/fs-writeback.c: - (...) - static void __writeback_inodes_sb_nr(struct super_block *sb, ... - { - (...) - WARN_ON(!rwsem_is_locked(&sb->s_umount)); - (...) - } - (...) - -The trace produced in dmesg looks like the following: - - [947.473890] WARNING: CPU: 5 PID: 930 at fs/fs-writeback.c:2610 __writeback_inodes_sb_nr+0x7e/0xb3 - [947.481623] Modules linked in: nfsd nls_cp437 cifs asn1_decoder cifs_arc4 fscache cifs_md4 ipmi_ssif - [947.489571] CPU: 5 PID: 930 Comm: btrfs-transacti Not tainted 95.16.3-srb-asrock-00001-g36437ad63879 #186 - [947.497969] RIP: 0010:__writeback_inodes_sb_nr+0x7e/0xb3 - [947.502097] Code: 24 10 4c 89 44 24 18 c6 (...) - [947.519760] RSP: 0018:ffffc90000777e10 EFLAGS: 00010246 - [947.523818] RAX: 0000000000000000 RBX: 0000000000963300 RCX: 0000000000000000 - [947.529765] RDX: 0000000000000000 RSI: 000000000000fa51 RDI: ffffc90000777e50 - [947.535740] RBP: ffff888101628a90 R08: ffff888100955800 R09: ffff888100956000 - [947.541701] R10: 0000000000000002 R11: 0000000000000001 R12: ffff888100963488 - [947.547645] R13: ffff888100963000 R14: ffff888112fb7200 R15: ffff888100963460 - [947.553621] FS: 0000000000000000(0000) GS:ffff88841fd40000(0000) knlGS:0000000000000000 - [947.560537] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 - [947.565122] CR2: 0000000008be50c4 CR3: 000000000220c000 CR4: 00000000001006e0 - [947.571072] Call Trace: - [947.572354] - [947.573266] btrfs_commit_transaction+0x1f1/0x998 - [947.576785] ? start_transaction+0x3ab/0x44e - [947.579867] ? schedule_timeout+0x8a/0xdd - [947.582716] transaction_kthread+0xe9/0x156 - [947.585721] ? btrfs_cleanup_transaction.isra.0+0x407/0x407 - [947.590104] kthread+0x131/0x139 - [947.592168] ? set_kthread_struct+0x32/0x32 - [947.595174] ret_from_fork+0x22/0x30 - [947.597561] - [947.598553] ---[ end trace 644721052755541c ]--- - -This is because we started using writeback_inodes_sb() to flush delalloc -when committing a transaction (when using -o flushoncommit), in order to -avoid deadlocks with filesystem freeze operations. This change was made -by commit ce8ea7cc6eb313 ("btrfs: don't call btrfs_start_delalloc_roots -in flushoncommit"). After that change we started producing that warning, -and every now and then a user reports this since the warning happens too -often, it spams dmesg/syslog, and a user is unsure if this reflects any -problem that might compromise the filesystem's reliability. - -We can not just lock the sb->s_umount semaphore before calling -writeback_inodes_sb(), because that would at least deadlock with -filesystem freezing, since at fs/super.c:freeze_super() sync_filesystem() -is called while we are holding that semaphore in write mode, and that can -trigger a transaction commit, resulting in a deadlock. It would also -trigger the same type of deadlock in the unmount path. Possibly, it could -also introduce some other locking dependencies that lockdep would report. - -To fix this call try_to_writeback_inodes_sb() instead of -writeback_inodes_sb(), because that will try to read lock sb->s_umount -and then will only call writeback_inodes_sb() if it was able to lock it. -This is fine because the cases where it can't read lock sb->s_umount -are during a filesystem unmount or during a filesystem freeze - in those -cases sb->s_umount is write locked and sync_filesystem() is called, which -calls writeback_inodes_sb(). In other words, in all cases where we can't -take a read lock on sb->s_umount, writeback is already being triggered -elsewhere. - -An alternative would be to call btrfs_start_delalloc_roots() with a -number of pages different from LONG_MAX, for example matching the number -of delalloc bytes we currently have, in which case we would end up -starting all delalloc with filemap_fdatawrite_wbc() and not with an -async flush via filemap_flush() - that is only possible after the rather -recent commit e076ab2a2ca70a ("btrfs: shrink delalloc pages instead of -full inodes"). However that creates a whole new can of worms due to new -lock dependencies, which lockdep complains, like for example: - -[ 8948.247280] ====================================================== -[ 8948.247823] WARNING: possible circular locking dependency detected -[ 8948.248353] 5.17.0-rc1-btrfs-next-111 #1 Not tainted -[ 8948.248786] ------------------------------------------------------ -[ 8948.249320] kworker/u16:18/933570 is trying to acquire lock: -[ 8948.249812] ffff9b3de1591690 (sb_internal#2){.+.+}-{0:0}, at: find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.250638] - but task is already holding lock: -[ 8948.251140] ffff9b3e09c717d8 (&root->delalloc_mutex){+.+.}-{3:3}, at: start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.252018] - which lock already depends on the new lock. - -[ 8948.252710] - the existing dependency chain (in reverse order) is: -[ 8948.253343] - -> #2 (&root->delalloc_mutex){+.+.}-{3:3}: -[ 8948.253950] __mutex_lock+0x90/0x900 -[ 8948.254354] start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.254859] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.255408] btrfs_commit_transaction+0x32f/0xc00 [btrfs] -[ 8948.255942] btrfs_mksubvol+0x380/0x570 [btrfs] -[ 8948.256406] btrfs_mksnapshot+0x81/0xb0 [btrfs] -[ 8948.256870] __btrfs_ioctl_snap_create+0x17f/0x190 [btrfs] -[ 8948.257413] btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs] -[ 8948.257961] btrfs_ioctl+0x1196/0x3630 [btrfs] -[ 8948.258418] __x64_sys_ioctl+0x83/0xb0 -[ 8948.258793] do_syscall_64+0x3b/0xc0 -[ 8948.259146] entry_SYSCALL_64_after_hwframe+0x44/0xae -[ 8948.259709] - -> #1 (&fs_info->delalloc_root_mutex){+.+.}-{3:3}: -[ 8948.260330] __mutex_lock+0x90/0x900 -[ 8948.260692] btrfs_start_delalloc_roots+0x97/0x2a0 [btrfs] -[ 8948.261234] btrfs_commit_transaction+0x32f/0xc00 [btrfs] -[ 8948.261766] btrfs_set_free_space_cache_v1_active+0x38/0x60 [btrfs] -[ 8948.262379] btrfs_start_pre_rw_mount+0x119/0x180 [btrfs] -[ 8948.262909] open_ctree+0x1511/0x171e [btrfs] -[ 8948.263359] btrfs_mount_root.cold+0x12/0xde [btrfs] -[ 8948.263863] legacy_get_tree+0x30/0x50 -[ 8948.264242] vfs_get_tree+0x28/0xc0 -[ 8948.264594] vfs_kern_mount.part.0+0x71/0xb0 -[ 8948.265017] btrfs_mount+0x11d/0x3a0 [btrfs] -[ 8948.265462] legacy_get_tree+0x30/0x50 -[ 8948.265851] vfs_get_tree+0x28/0xc0 -[ 8948.266203] path_mount+0x2d4/0xbe0 -[ 8948.266554] __x64_sys_mount+0x103/0x140 -[ 8948.266940] do_syscall_64+0x3b/0xc0 -[ 8948.267300] entry_SYSCALL_64_after_hwframe+0x44/0xae -[ 8948.267790] - -> #0 (sb_internal#2){.+.+}-{0:0}: -[ 8948.268322] __lock_acquire+0x12e8/0x2260 -[ 8948.268733] lock_acquire+0xd7/0x310 -[ 8948.269092] start_transaction+0x44c/0x6e0 [btrfs] -[ 8948.269591] find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.270087] btrfs_reserve_extent+0x14b/0x280 [btrfs] -[ 8948.270588] cow_file_range+0x17e/0x490 [btrfs] -[ 8948.271051] btrfs_run_delalloc_range+0x345/0x7a0 [btrfs] -[ 8948.271586] writepage_delalloc+0xb5/0x170 [btrfs] -[ 8948.272071] __extent_writepage+0x156/0x3c0 [btrfs] -[ 8948.272579] extent_write_cache_pages+0x263/0x460 [btrfs] -[ 8948.273113] extent_writepages+0x76/0x130 [btrfs] -[ 8948.273573] do_writepages+0xd2/0x1c0 -[ 8948.273942] filemap_fdatawrite_wbc+0x68/0x90 -[ 8948.274371] start_delalloc_inodes+0x17f/0x400 [btrfs] -[ 8948.274876] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.275417] flush_space+0x1f2/0x630 [btrfs] -[ 8948.275863] btrfs_async_reclaim_data_space+0x108/0x1b0 [btrfs] -[ 8948.276438] process_one_work+0x252/0x5a0 -[ 8948.276829] worker_thread+0x55/0x3b0 -[ 8948.277189] kthread+0xf2/0x120 -[ 8948.277506] ret_from_fork+0x22/0x30 -[ 8948.277868] - other info that might help us debug this: - -[ 8948.278548] Chain exists of: - sb_internal#2 --> &fs_info->delalloc_root_mutex --> &root->delalloc_mutex - -[ 8948.279601] Possible unsafe locking scenario: - -[ 8948.280102] CPU0 CPU1 -[ 8948.280508] ---- ---- -[ 8948.280915] lock(&root->delalloc_mutex); -[ 8948.281271] lock(&fs_info->delalloc_root_mutex); -[ 8948.281915] lock(&root->delalloc_mutex); -[ 8948.282487] lock(sb_internal#2); -[ 8948.282800] - *** DEADLOCK *** - -[ 8948.283333] 4 locks held by kworker/u16:18/933570: -[ 8948.283750] #0: ffff9b3dc00a9d48 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1d2/0x5a0 -[ 8948.284609] #1: ffffa90349dafe70 ((work_completion)(&fs_info->async_data_reclaim_work)){+.+.}-{0:0}, at: process_one_work+0x1d2/0x5a0 -[ 8948.285637] #2: ffff9b3e14db5040 (&fs_info->delalloc_root_mutex){+.+.}-{3:3}, at: btrfs_start_delalloc_roots+0x97/0x2a0 [btrfs] -[ 8948.286674] #3: ffff9b3e09c717d8 (&root->delalloc_mutex){+.+.}-{3:3}, at: start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.287596] - stack backtrace: -[ 8948.287975] CPU: 3 PID: 933570 Comm: kworker/u16:18 Not tainted 5.17.0-rc1-btrfs-next-111 #1 -[ 8948.288677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 -[ 8948.289649] Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] -[ 8948.290298] Call Trace: -[ 8948.290517] -[ 8948.290700] dump_stack_lvl+0x59/0x73 -[ 8948.291026] check_noncircular+0xf3/0x110 -[ 8948.291375] ? start_transaction+0x228/0x6e0 [btrfs] -[ 8948.291826] __lock_acquire+0x12e8/0x2260 -[ 8948.292241] lock_acquire+0xd7/0x310 -[ 8948.292714] ? find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.293241] ? lock_is_held_type+0xea/0x140 -[ 8948.293601] start_transaction+0x44c/0x6e0 [btrfs] -[ 8948.294055] ? find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.294518] find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.294957] ? _raw_spin_unlock+0x29/0x40 -[ 8948.295312] ? btrfs_get_alloc_profile+0x124/0x290 [btrfs] -[ 8948.295813] btrfs_reserve_extent+0x14b/0x280 [btrfs] -[ 8948.296270] cow_file_range+0x17e/0x490 [btrfs] -[ 8948.296691] btrfs_run_delalloc_range+0x345/0x7a0 [btrfs] -[ 8948.297175] ? find_lock_delalloc_range+0x247/0x270 [btrfs] -[ 8948.297678] writepage_delalloc+0xb5/0x170 [btrfs] -[ 8948.298123] __extent_writepage+0x156/0x3c0 [btrfs] -[ 8948.298570] extent_write_cache_pages+0x263/0x460 [btrfs] -[ 8948.299061] extent_writepages+0x76/0x130 [btrfs] -[ 8948.299495] do_writepages+0xd2/0x1c0 -[ 8948.299817] ? sched_clock_cpu+0xd/0x110 -[ 8948.300160] ? lock_release+0x155/0x4a0 -[ 8948.300494] filemap_fdatawrite_wbc+0x68/0x90 -[ 8948.300874] ? do_raw_spin_unlock+0x4b/0xa0 -[ 8948.301243] start_delalloc_inodes+0x17f/0x400 [btrfs] -[ 8948.301706] ? lock_release+0x155/0x4a0 -[ 8948.302055] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.302564] flush_space+0x1f2/0x630 [btrfs] -[ 8948.302970] btrfs_async_reclaim_data_space+0x108/0x1b0 [btrfs] -[ 8948.303510] process_one_work+0x252/0x5a0 -[ 8948.303860] ? process_one_work+0x5a0/0x5a0 -[ 8948.304221] worker_thread+0x55/0x3b0 -[ 8948.304543] ? process_one_work+0x5a0/0x5a0 -[ 8948.304904] kthread+0xf2/0x120 -[ 8948.305184] ? kthread_complete_and_exit+0x20/0x20 -[ 8948.305598] ret_from_fork+0x22/0x30 -[ 8948.305921] - -It all comes from the fact that btrfs_start_delalloc_roots() takes the -delalloc_root_mutex, in the transaction commit path we are holding a -read lock on one of the superblock's freeze semaphores (via -sb_start_intwrite()), the async reclaim task can also do a call to -btrfs_start_delalloc_roots(), which ends up triggering writeback with -calls to filemap_fdatawrite_wbc(), resulting in extent allocation which -in turn can call btrfs_start_transaction(), which will result in taking -the freeze semaphore via sb_start_intwrite(), forming a nasty dependency -on all those locks which can be taken in different orders by different -code paths. - -So just adopt the simple approach of calling try_to_writeback_inodes_sb() -at btrfs_start_delalloc_flush(). - -Link: https://lore.kernel.org/linux-btrfs/20220130005258.GA7465@cuci.nl/ -Link: https://lore.kernel.org/linux-btrfs/43acc426-d683-d1b6-729d-c6bc4a2fff4d@gmail.com/ -Link: https://lore.kernel.org/linux-btrfs/6833930a-08d7-6fbc-0141-eb9cdfd6bb4d@gmail.com/ -Link: https://lore.kernel.org/linux-btrfs/20190322041731.GF16651@hungrycats.org/ -Reviewed-by: Omar Sandoval -Signed-off-by: Filipe Manana -[ add more link reports ] -Signed-off-by: David Sterba -Signed-off-by: Hugo SIMELIERE -Signed-off-by: Greg Kroah-Hartman ---- - fs/btrfs/transaction.c | 12 ++++++++++-- - 1 file changed, 10 insertions(+), 2 deletions(-) - ---- a/fs/btrfs/transaction.c -+++ b/fs/btrfs/transaction.c -@@ -2021,16 +2021,24 @@ static inline int btrfs_start_delalloc_f - struct btrfs_fs_info *fs_info = trans->fs_info; - - /* -- * We use writeback_inodes_sb here because if we used -+ * We use try_to_writeback_inodes_sb() here because if we used - * btrfs_start_delalloc_roots we would deadlock with fs freeze. - * Currently are holding the fs freeze lock, if we do an async flush - * we'll do btrfs_join_transaction() and deadlock because we need to - * wait for the fs freeze lock. Using the direct flushing we benefit - * from already being in a transaction and our join_transaction doesn't - * have to re-take the fs freeze lock. -+ * -+ * Note that try_to_writeback_inodes_sb() will only trigger writeback -+ * if it can read lock sb->s_umount. It will always be able to lock it, -+ * except when the filesystem is being unmounted or being frozen, but in -+ * those cases sync_filesystem() is called, which results in calling -+ * writeback_inodes_sb() while holding a write lock on sb->s_umount. -+ * Note that we don't call writeback_inodes_sb() directly, because it -+ * will emit a warning if sb->s_umount is not locked. - */ - if (btrfs_test_opt(fs_info, FLUSHONCOMMIT)) { -- writeback_inodes_sb(fs_info->sb, WB_REASON_SYNC); -+ try_to_writeback_inodes_sb(fs_info->sb, WB_REASON_SYNC); - } else { - struct btrfs_pending_snapshot *pending; - struct list_head *head = &trans->transaction->pending_snapshots; diff --git a/queue-5.10/series b/queue-5.10/series index a21c1e94001..9bfeb97e683 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -442,5 +442,4 @@ acpi-battery-simplify-battery-hook-locking.patch acpi-battery-fix-possible-crash-when-unregistering-a.patch ext4-fix-inode-tree-inconsistency-caused-by-enomem.patch vhost-scsi-null-ptr-dereference-in-vhost_scsi_get_req.patch -btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch clk-imx6ul-fix-failed-to-get-parent-error.patch diff --git a/queue-5.4/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch b/queue-5.4/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch deleted file mode 100644 index 549d1dc013d..00000000000 --- a/queue-5.4/btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch +++ /dev/null @@ -1,282 +0,0 @@ -From a0f0cf8341e34e5d2265bfd3a7ad68342da1e2aa Mon Sep 17 00:00:00 2001 -From: Filipe Manana -Date: Wed, 2 Feb 2022 15:26:09 +0000 -Subject: btrfs: get rid of warning on transaction commit when using flushoncommit - -From: Filipe Manana - -commit a0f0cf8341e34e5d2265bfd3a7ad68342da1e2aa upstream. - -When using the flushoncommit mount option, during almost every transaction -commit we trigger a warning from __writeback_inodes_sb_nr(): - - $ cat fs/fs-writeback.c: - (...) - static void __writeback_inodes_sb_nr(struct super_block *sb, ... - { - (...) - WARN_ON(!rwsem_is_locked(&sb->s_umount)); - (...) - } - (...) - -The trace produced in dmesg looks like the following: - - [947.473890] WARNING: CPU: 5 PID: 930 at fs/fs-writeback.c:2610 __writeback_inodes_sb_nr+0x7e/0xb3 - [947.481623] Modules linked in: nfsd nls_cp437 cifs asn1_decoder cifs_arc4 fscache cifs_md4 ipmi_ssif - [947.489571] CPU: 5 PID: 930 Comm: btrfs-transacti Not tainted 95.16.3-srb-asrock-00001-g36437ad63879 #186 - [947.497969] RIP: 0010:__writeback_inodes_sb_nr+0x7e/0xb3 - [947.502097] Code: 24 10 4c 89 44 24 18 c6 (...) - [947.519760] RSP: 0018:ffffc90000777e10 EFLAGS: 00010246 - [947.523818] RAX: 0000000000000000 RBX: 0000000000963300 RCX: 0000000000000000 - [947.529765] RDX: 0000000000000000 RSI: 000000000000fa51 RDI: ffffc90000777e50 - [947.535740] RBP: ffff888101628a90 R08: ffff888100955800 R09: ffff888100956000 - [947.541701] R10: 0000000000000002 R11: 0000000000000001 R12: ffff888100963488 - [947.547645] R13: ffff888100963000 R14: ffff888112fb7200 R15: ffff888100963460 - [947.553621] FS: 0000000000000000(0000) GS:ffff88841fd40000(0000) knlGS:0000000000000000 - [947.560537] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 - [947.565122] CR2: 0000000008be50c4 CR3: 000000000220c000 CR4: 00000000001006e0 - [947.571072] Call Trace: - [947.572354] - [947.573266] btrfs_commit_transaction+0x1f1/0x998 - [947.576785] ? start_transaction+0x3ab/0x44e - [947.579867] ? schedule_timeout+0x8a/0xdd - [947.582716] transaction_kthread+0xe9/0x156 - [947.585721] ? btrfs_cleanup_transaction.isra.0+0x407/0x407 - [947.590104] kthread+0x131/0x139 - [947.592168] ? set_kthread_struct+0x32/0x32 - [947.595174] ret_from_fork+0x22/0x30 - [947.597561] - [947.598553] ---[ end trace 644721052755541c ]--- - -This is because we started using writeback_inodes_sb() to flush delalloc -when committing a transaction (when using -o flushoncommit), in order to -avoid deadlocks with filesystem freeze operations. This change was made -by commit ce8ea7cc6eb313 ("btrfs: don't call btrfs_start_delalloc_roots -in flushoncommit"). After that change we started producing that warning, -and every now and then a user reports this since the warning happens too -often, it spams dmesg/syslog, and a user is unsure if this reflects any -problem that might compromise the filesystem's reliability. - -We can not just lock the sb->s_umount semaphore before calling -writeback_inodes_sb(), because that would at least deadlock with -filesystem freezing, since at fs/super.c:freeze_super() sync_filesystem() -is called while we are holding that semaphore in write mode, and that can -trigger a transaction commit, resulting in a deadlock. It would also -trigger the same type of deadlock in the unmount path. Possibly, it could -also introduce some other locking dependencies that lockdep would report. - -To fix this call try_to_writeback_inodes_sb() instead of -writeback_inodes_sb(), because that will try to read lock sb->s_umount -and then will only call writeback_inodes_sb() if it was able to lock it. -This is fine because the cases where it can't read lock sb->s_umount -are during a filesystem unmount or during a filesystem freeze - in those -cases sb->s_umount is write locked and sync_filesystem() is called, which -calls writeback_inodes_sb(). In other words, in all cases where we can't -take a read lock on sb->s_umount, writeback is already being triggered -elsewhere. - -An alternative would be to call btrfs_start_delalloc_roots() with a -number of pages different from LONG_MAX, for example matching the number -of delalloc bytes we currently have, in which case we would end up -starting all delalloc with filemap_fdatawrite_wbc() and not with an -async flush via filemap_flush() - that is only possible after the rather -recent commit e076ab2a2ca70a ("btrfs: shrink delalloc pages instead of -full inodes"). However that creates a whole new can of worms due to new -lock dependencies, which lockdep complains, like for example: - -[ 8948.247280] ====================================================== -[ 8948.247823] WARNING: possible circular locking dependency detected -[ 8948.248353] 5.17.0-rc1-btrfs-next-111 #1 Not tainted -[ 8948.248786] ------------------------------------------------------ -[ 8948.249320] kworker/u16:18/933570 is trying to acquire lock: -[ 8948.249812] ffff9b3de1591690 (sb_internal#2){.+.+}-{0:0}, at: find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.250638] - but task is already holding lock: -[ 8948.251140] ffff9b3e09c717d8 (&root->delalloc_mutex){+.+.}-{3:3}, at: start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.252018] - which lock already depends on the new lock. - -[ 8948.252710] - the existing dependency chain (in reverse order) is: -[ 8948.253343] - -> #2 (&root->delalloc_mutex){+.+.}-{3:3}: -[ 8948.253950] __mutex_lock+0x90/0x900 -[ 8948.254354] start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.254859] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.255408] btrfs_commit_transaction+0x32f/0xc00 [btrfs] -[ 8948.255942] btrfs_mksubvol+0x380/0x570 [btrfs] -[ 8948.256406] btrfs_mksnapshot+0x81/0xb0 [btrfs] -[ 8948.256870] __btrfs_ioctl_snap_create+0x17f/0x190 [btrfs] -[ 8948.257413] btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs] -[ 8948.257961] btrfs_ioctl+0x1196/0x3630 [btrfs] -[ 8948.258418] __x64_sys_ioctl+0x83/0xb0 -[ 8948.258793] do_syscall_64+0x3b/0xc0 -[ 8948.259146] entry_SYSCALL_64_after_hwframe+0x44/0xae -[ 8948.259709] - -> #1 (&fs_info->delalloc_root_mutex){+.+.}-{3:3}: -[ 8948.260330] __mutex_lock+0x90/0x900 -[ 8948.260692] btrfs_start_delalloc_roots+0x97/0x2a0 [btrfs] -[ 8948.261234] btrfs_commit_transaction+0x32f/0xc00 [btrfs] -[ 8948.261766] btrfs_set_free_space_cache_v1_active+0x38/0x60 [btrfs] -[ 8948.262379] btrfs_start_pre_rw_mount+0x119/0x180 [btrfs] -[ 8948.262909] open_ctree+0x1511/0x171e [btrfs] -[ 8948.263359] btrfs_mount_root.cold+0x12/0xde [btrfs] -[ 8948.263863] legacy_get_tree+0x30/0x50 -[ 8948.264242] vfs_get_tree+0x28/0xc0 -[ 8948.264594] vfs_kern_mount.part.0+0x71/0xb0 -[ 8948.265017] btrfs_mount+0x11d/0x3a0 [btrfs] -[ 8948.265462] legacy_get_tree+0x30/0x50 -[ 8948.265851] vfs_get_tree+0x28/0xc0 -[ 8948.266203] path_mount+0x2d4/0xbe0 -[ 8948.266554] __x64_sys_mount+0x103/0x140 -[ 8948.266940] do_syscall_64+0x3b/0xc0 -[ 8948.267300] entry_SYSCALL_64_after_hwframe+0x44/0xae -[ 8948.267790] - -> #0 (sb_internal#2){.+.+}-{0:0}: -[ 8948.268322] __lock_acquire+0x12e8/0x2260 -[ 8948.268733] lock_acquire+0xd7/0x310 -[ 8948.269092] start_transaction+0x44c/0x6e0 [btrfs] -[ 8948.269591] find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.270087] btrfs_reserve_extent+0x14b/0x280 [btrfs] -[ 8948.270588] cow_file_range+0x17e/0x490 [btrfs] -[ 8948.271051] btrfs_run_delalloc_range+0x345/0x7a0 [btrfs] -[ 8948.271586] writepage_delalloc+0xb5/0x170 [btrfs] -[ 8948.272071] __extent_writepage+0x156/0x3c0 [btrfs] -[ 8948.272579] extent_write_cache_pages+0x263/0x460 [btrfs] -[ 8948.273113] extent_writepages+0x76/0x130 [btrfs] -[ 8948.273573] do_writepages+0xd2/0x1c0 -[ 8948.273942] filemap_fdatawrite_wbc+0x68/0x90 -[ 8948.274371] start_delalloc_inodes+0x17f/0x400 [btrfs] -[ 8948.274876] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.275417] flush_space+0x1f2/0x630 [btrfs] -[ 8948.275863] btrfs_async_reclaim_data_space+0x108/0x1b0 [btrfs] -[ 8948.276438] process_one_work+0x252/0x5a0 -[ 8948.276829] worker_thread+0x55/0x3b0 -[ 8948.277189] kthread+0xf2/0x120 -[ 8948.277506] ret_from_fork+0x22/0x30 -[ 8948.277868] - other info that might help us debug this: - -[ 8948.278548] Chain exists of: - sb_internal#2 --> &fs_info->delalloc_root_mutex --> &root->delalloc_mutex - -[ 8948.279601] Possible unsafe locking scenario: - -[ 8948.280102] CPU0 CPU1 -[ 8948.280508] ---- ---- -[ 8948.280915] lock(&root->delalloc_mutex); -[ 8948.281271] lock(&fs_info->delalloc_root_mutex); -[ 8948.281915] lock(&root->delalloc_mutex); -[ 8948.282487] lock(sb_internal#2); -[ 8948.282800] - *** DEADLOCK *** - -[ 8948.283333] 4 locks held by kworker/u16:18/933570: -[ 8948.283750] #0: ffff9b3dc00a9d48 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1d2/0x5a0 -[ 8948.284609] #1: ffffa90349dafe70 ((work_completion)(&fs_info->async_data_reclaim_work)){+.+.}-{0:0}, at: process_one_work+0x1d2/0x5a0 -[ 8948.285637] #2: ffff9b3e14db5040 (&fs_info->delalloc_root_mutex){+.+.}-{3:3}, at: btrfs_start_delalloc_roots+0x97/0x2a0 [btrfs] -[ 8948.286674] #3: ffff9b3e09c717d8 (&root->delalloc_mutex){+.+.}-{3:3}, at: start_delalloc_inodes+0x78/0x400 [btrfs] -[ 8948.287596] - stack backtrace: -[ 8948.287975] CPU: 3 PID: 933570 Comm: kworker/u16:18 Not tainted 5.17.0-rc1-btrfs-next-111 #1 -[ 8948.288677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 -[ 8948.289649] Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] -[ 8948.290298] Call Trace: -[ 8948.290517] -[ 8948.290700] dump_stack_lvl+0x59/0x73 -[ 8948.291026] check_noncircular+0xf3/0x110 -[ 8948.291375] ? start_transaction+0x228/0x6e0 [btrfs] -[ 8948.291826] __lock_acquire+0x12e8/0x2260 -[ 8948.292241] lock_acquire+0xd7/0x310 -[ 8948.292714] ? find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.293241] ? lock_is_held_type+0xea/0x140 -[ 8948.293601] start_transaction+0x44c/0x6e0 [btrfs] -[ 8948.294055] ? find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.294518] find_free_extent+0x141e/0x1590 [btrfs] -[ 8948.294957] ? _raw_spin_unlock+0x29/0x40 -[ 8948.295312] ? btrfs_get_alloc_profile+0x124/0x290 [btrfs] -[ 8948.295813] btrfs_reserve_extent+0x14b/0x280 [btrfs] -[ 8948.296270] cow_file_range+0x17e/0x490 [btrfs] -[ 8948.296691] btrfs_run_delalloc_range+0x345/0x7a0 [btrfs] -[ 8948.297175] ? find_lock_delalloc_range+0x247/0x270 [btrfs] -[ 8948.297678] writepage_delalloc+0xb5/0x170 [btrfs] -[ 8948.298123] __extent_writepage+0x156/0x3c0 [btrfs] -[ 8948.298570] extent_write_cache_pages+0x263/0x460 [btrfs] -[ 8948.299061] extent_writepages+0x76/0x130 [btrfs] -[ 8948.299495] do_writepages+0xd2/0x1c0 -[ 8948.299817] ? sched_clock_cpu+0xd/0x110 -[ 8948.300160] ? lock_release+0x155/0x4a0 -[ 8948.300494] filemap_fdatawrite_wbc+0x68/0x90 -[ 8948.300874] ? do_raw_spin_unlock+0x4b/0xa0 -[ 8948.301243] start_delalloc_inodes+0x17f/0x400 [btrfs] -[ 8948.301706] ? lock_release+0x155/0x4a0 -[ 8948.302055] btrfs_start_delalloc_roots+0x194/0x2a0 [btrfs] -[ 8948.302564] flush_space+0x1f2/0x630 [btrfs] -[ 8948.302970] btrfs_async_reclaim_data_space+0x108/0x1b0 [btrfs] -[ 8948.303510] process_one_work+0x252/0x5a0 -[ 8948.303860] ? process_one_work+0x5a0/0x5a0 -[ 8948.304221] worker_thread+0x55/0x3b0 -[ 8948.304543] ? process_one_work+0x5a0/0x5a0 -[ 8948.304904] kthread+0xf2/0x120 -[ 8948.305184] ? kthread_complete_and_exit+0x20/0x20 -[ 8948.305598] ret_from_fork+0x22/0x30 -[ 8948.305921] - -It all comes from the fact that btrfs_start_delalloc_roots() takes the -delalloc_root_mutex, in the transaction commit path we are holding a -read lock on one of the superblock's freeze semaphores (via -sb_start_intwrite()), the async reclaim task can also do a call to -btrfs_start_delalloc_roots(), which ends up triggering writeback with -calls to filemap_fdatawrite_wbc(), resulting in extent allocation which -in turn can call btrfs_start_transaction(), which will result in taking -the freeze semaphore via sb_start_intwrite(), forming a nasty dependency -on all those locks which can be taken in different orders by different -code paths. - -So just adopt the simple approach of calling try_to_writeback_inodes_sb() -at btrfs_start_delalloc_flush(). - -Link: https://lore.kernel.org/linux-btrfs/20220130005258.GA7465@cuci.nl/ -Link: https://lore.kernel.org/linux-btrfs/43acc426-d683-d1b6-729d-c6bc4a2fff4d@gmail.com/ -Link: https://lore.kernel.org/linux-btrfs/6833930a-08d7-6fbc-0141-eb9cdfd6bb4d@gmail.com/ -Link: https://lore.kernel.org/linux-btrfs/20190322041731.GF16651@hungrycats.org/ -Reviewed-by: Omar Sandoval -Signed-off-by: Filipe Manana -[ add more link reports ] -Signed-off-by: David Sterba -Signed-off-by: Hugo SIMELIERE -Signed-off-by: Greg Kroah-Hartman ---- - fs/btrfs/transaction.c | 12 ++++++++++-- - 1 file changed, 10 insertions(+), 2 deletions(-) - ---- a/fs/btrfs/transaction.c -+++ b/fs/btrfs/transaction.c -@@ -1893,16 +1893,24 @@ static inline int btrfs_start_delalloc_f - struct btrfs_fs_info *fs_info = trans->fs_info; - - /* -- * We use writeback_inodes_sb here because if we used -+ * We use try_to_writeback_inodes_sb() here because if we used - * btrfs_start_delalloc_roots we would deadlock with fs freeze. - * Currently are holding the fs freeze lock, if we do an async flush - * we'll do btrfs_join_transaction() and deadlock because we need to - * wait for the fs freeze lock. Using the direct flushing we benefit - * from already being in a transaction and our join_transaction doesn't - * have to re-take the fs freeze lock. -+ * -+ * Note that try_to_writeback_inodes_sb() will only trigger writeback -+ * if it can read lock sb->s_umount. It will always be able to lock it, -+ * except when the filesystem is being unmounted or being frozen, but in -+ * those cases sync_filesystem() is called, which results in calling -+ * writeback_inodes_sb() while holding a write lock on sb->s_umount. -+ * Note that we don't call writeback_inodes_sb() directly, because it -+ * will emit a warning if sb->s_umount is not locked. - */ - if (btrfs_test_opt(fs_info, FLUSHONCOMMIT)) { -- writeback_inodes_sb(fs_info->sb, WB_REASON_SYNC); -+ try_to_writeback_inodes_sb(fs_info->sb, WB_REASON_SYNC); - } else { - struct btrfs_pending_snapshot *pending; - struct list_head *head = &trans->transaction->pending_snapshots; diff --git a/queue-5.4/series b/queue-5.4/series index 034a7edac93..d8d6ff010d5 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -303,5 +303,4 @@ r8169-add-tally-counter-fields-added-with-rtl8125.patch acpi-battery-simplify-battery-hook-locking.patch acpi-battery-fix-possible-crash-when-unregistering-a.patch ext4-fix-inode-tree-inconsistency-caused-by-enomem.patch -btrfs-get-rid-of-warning-on-transaction-commit-when-using-flushoncommit.patch clk-imx6ul-fix-failed-to-get-parent-error.patch