From: Linus Torvalds
Date: Sun, 14 Jun 2026 22:00:45 +0000 (+0530)
Subject: Merge tag 'vfs-7.2-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git...
X-Git-Url: http://git.ipfire.org/gitweb/?a=commitdiff_plain;h=c17fdf62aeecbbaf2c2fd5c494e2089c02b0e75b;p=thirdparty%2Fkernel%2Flinux.git

Merge tag 'vfs-7.2-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs writeback updates from Christian Brauner:

 - Fix a race between cgroup_writeback_umount() and inode_switch_wbs()

   When a container exits, a race between cgroup_writeback_umount() and
   inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy
   inodes after unmount" followed by a use-after-free on percpu
   counters.

   There is a window between inode_prepare_wbs_switch() returning true
   (having passed the SB_ACTIVE check and grabbed the inode) and the
   subsequent wb_queue_isw() call: if cgroup_writeback_umount() observes
   the global isw_nr_in_flight counter as non-zero but flush_workqueue()
   finds nothing queued yet, it returns early - leaving a held inode
   reference that blocks evict_inodes() and a later iput() that hits
   freed percpu counters.

   The race is closed by covering the window from
   inode_prepare_wbs_switch() through wb_queue_isw() with an RCU
   read-side critical section and synchronizing in the umount path.

   On top of that, the now-dead rcu_barrier() left over from the
   queue_rcu_work() era is removed, and the global
   synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb
   in-flight counter plus pin/unpin/drain helpers so umount no longer
   serializes against switch activity on unrelated superblocks.

   Under cgroup writeback churn on a 16 vCPU guest this takes umount
   latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative cost
   of cgroup_writeback_umount() from ~62ms to ~4us per call.

   The initial race fix is kept separate and minimal so it backports
   cleanly to stable trees that still queue switches via
   queue_rcu_work().

 - Improve write performance with RWF_DONTCACHE

   Dirty DONTCACHE pages are now tracked per bdi_writeback so that the
   writeback flusher can be kicked in a targeted fashion for
   IOCB_DONTCACHE writes instead of relying on global writeback, and the
   PG_dropbehind flag is preserved when a folio is split.

* tag 'vfs-7.2-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking
  mm: track DONTCACHE dirty pages per bdi_writeback
  mm: preserve PG_dropbehind flag during folio split
  writeback: use a per-sb counter to drain inode wb switches at umount
  writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()
  writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()
---

c17fdf62aeecbbaf2c2fd5c494e2089c02b0e75b
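
The per-sb pin/unpin/drain idea from the first item can be modeled in
user space roughly as follows. This is only a sketch under stated
assumptions, not the code from the series: struct demo_sb and the
demo_sb_*() helpers are hypothetical names, C11 atomics stand in for
whatever primitives the kernel uses, and the drain side busy-waits
where the real code would presumably sleep.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a superblock; not the kernel struct. */
struct demo_sb {
	atomic_bool active;        /* models the SB_ACTIVE check */
	atomic_int  isw_in_flight; /* switches pinned against this sb only */
};

static void demo_sb_init(struct demo_sb *sb)
{
	atomic_init(&sb->active, true);
	atomic_init(&sb->isw_in_flight, 0);
}

/*
 * Pin before touching the inode. The count is raised first and "active"
 * is checked second, so a drainer that has already cleared "active"
 * either sees our count or makes us back out. There is no window in
 * which a switch is in progress but unaccounted for.
 */
static bool demo_sb_pin(struct demo_sb *sb)
{
	atomic_fetch_add(&sb->isw_in_flight, 1);
	if (!atomic_load(&sb->active)) {
		atomic_fetch_sub(&sb->isw_in_flight, 1);
		return false;
	}
	return true;
}

/* Unpin once the switch has finished and dropped its inode reference. */
static void demo_sb_unpin(struct demo_sb *sb)
{
	int prev = atomic_fetch_sub(&sb->isw_in_flight, 1);

	assert(prev > 0); /* unbalanced unpin would be a bug */
	(void)prev;
}

/*
 * Drain at umount: refuse new pins, then wait for this sb's count to
 * reach zero. Pins held on other superblocks are never consulted.
 */
static void demo_sb_drain(struct demo_sb *sb)
{
	atomic_store(&sb->active, false);
	while (atomic_load(&sb->isw_in_flight) != 0)
		sched_yield();
}
```

In this model a switch path would call demo_sb_pin() where the text
above has inode_prepare_wbs_switch() succeed, and demo_sb_unpin() once
the switch work has completed, while the umount path calls
demo_sb_drain(). Because the counter lives in the sb, draining one
superblock does not wait on pins held against another.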