--- /dev/null
+From 9ee29d20aab228adfb02ca93f87fb53c56c2f3af Mon Sep 17 00:00:00 2001
+From: Theodore Ts'o <tytso@mit.edu>
+Date: Fri, 27 Mar 2026 02:13:15 -0400
+Subject: ext4: always drain queued discard work in ext4_mb_release()
+
+From: Theodore Ts'o <tytso@mit.edu>
+
+commit 9ee29d20aab228adfb02ca93f87fb53c56c2f3af upstream.
+
+While reviewing a recent ext4 patch[1], Sashiko raised the following
+concern[2]:
+
+> If the filesystem is initially mounted with the discard option,
+> deleting files will populate sbi->s_discard_list and queue
+> s_discard_work. If it is then remounted with nodiscard, the
+> EXT4_MOUNT_DISCARD flag is cleared, but the pending s_discard_work is
+> neither cancelled nor flushed.
+
+[1] https://lore.kernel.org/r/20260319094545.19291-1-qiang.zhang@linux.dev/
+[2] https://sashiko.dev/#/patchset/20260319094545.19291-1-qiang.zhang%40linux.dev
+
+The concern was valid, but it had nothing to do with the patch[1].
+One of the problems with Sashiko in its current (early) form is that
+it will detect pre-existing issues and report them as problems with
+the patch that it is reviewing.
+
+In practice, it would be hard to hit deliberately (unless you are a
+malicious syzkaller fuzzer), since it would involve mounting the file
+system with -o discard, and then deleting a large number of files,
+remounting the file system with -o nodiscard, and then immediately
+unmounting the file system before the queued discard work has a chance
+to drain on its own.
+
+Fix it because it's a real bug, and to keep Sashiko from raising this
+concern when analyzing future patches to mballoc.c.
+
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Fixes: 55cdd0af2bc5 ("ext4: get discard out of jbd2 commit kthread contex")
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/mballoc.c | 12 +++++-------
+ 1 file changed, 5 insertions(+), 7 deletions(-)
+
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -3793,13 +3793,11 @@ int ext4_mb_release(struct super_block *
+ struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
+ int count;
+
+- if (test_opt(sb, DISCARD)) {
+- /*
+- * wait the discard work to drain all of ext4_free_data
+- */
+- flush_work(&sbi->s_discard_work);
+- WARN_ON_ONCE(!list_empty(&sbi->s_discard_list));
+- }
++ /*
++ * wait the discard work to drain all of ext4_free_data
++ */
++ flush_work(&sbi->s_discard_work);
++ WARN_ON_ONCE(!list_empty(&sbi->s_discard_list));
+
+ group_info = rcu_access_pointer(sbi->s_group_info);
+ if (group_info) {
--- /dev/null
+From 46066e3a06647c5b186cc6334409722622d05c44 Mon Sep 17 00:00:00 2001
+From: Ye Bin <yebin10@huawei.com>
+Date: Mon, 2 Mar 2026 21:46:19 +0800
+Subject: ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
+
+From: Ye Bin <yebin10@huawei.com>
+
+commit 46066e3a06647c5b186cc6334409722622d05c44 upstream.
+
+There's an issue as follows:
+...
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
+EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
+
+EXT4-fs (mmcblk0p1): error count since last fsck: 1
+EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
+EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
+...
+
+According to the log analysis, blocks are always requested from the
+corrupted block group. This may happen as follows:
+ext4_mb_find_by_goal
+ ext4_mb_load_buddy
+ ext4_mb_load_buddy_gfp
+ ext4_mb_init_cache
+ ext4_read_block_bitmap_nowait
+ ext4_wait_block_bitmap
+ ext4_validate_block_bitmap
+ if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
+ return -EFSCORRUPTED; // There's no logs.
+ if (err)
+ return err; // Will return error
+ext4_lock_group(ac->ac_sb, group);
+ if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
+ goto out;
+
+After commit 9008a58e5dce ("ext4: make the bitmap read routines return
+real error codes") was merged, commit 163a203ddb36 ("ext4: mark block
+group as corrupt on block bitmap error") no longer really solves the
+problem of allocating blocks from corrupted block groups. This is
+because if 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
+'ext4_mb_load_buddy()' may return an error, which means that the block
+allocation will fail.
+Therefore, check whether the block group is corrupted when
+ext4_mb_load_buddy() returns an error.
+
+Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
+Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
+Signed-off-by: Ye Bin <yebin10@huawei.com>
+Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
+Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
+Reviewed-by: Andreas Dilger <adilger@dilger.ca>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Link: https://patch.msgid.link/20260302134619.3145520-1-yebin@huaweicloud.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/mballoc.c | 6 +++++-
+ 1 file changed, 5 insertions(+), 1 deletion(-)
+
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -2369,8 +2369,12 @@ int ext4_mb_find_by_goal(struct ext4_all
+ return 0;
+
+ err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
+- if (err)
++ if (err) {
++ if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
++ !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
++ return 0;
+ return err;
++ }
+
+ ext4_lock_group(ac->ac_sb, group);
+ if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
--- /dev/null
+From 5422fe71d26d42af6c454ca9527faaad4e677d6c Mon Sep 17 00:00:00 2001
+From: Edward Adam Davis <eadavis@qq.com>
+Date: Fri, 6 Mar 2026 09:31:58 +0800
+Subject: ext4: avoid infinite loops caused by residual data
+
+From: Edward Adam Davis <eadavis@qq.com>
+
+commit 5422fe71d26d42af6c454ca9527faaad4e677d6c upstream.
+
+On the mkdir/mknod path, when mapping logical blocks to physical blocks,
+if inserting a new extent into the extent tree fails (in this example,
+because the file system disabled the huge file feature when marking the
+inode as dirty), ext4_ext_map_blocks() only calls ext4_free_blocks() to
+reclaim the physical block without deleting the corresponding data in
+the extent tree. This causes subsequent mkdir operations to reference
+the previously reclaimed physical block number again, even though this
+physical block is already being used by the xattr block. Therefore, a
+situation arises where both the directory and xattr are using the same
+buffer head block in memory simultaneously.
+
+The above causes ext4_xattr_block_set() to loop forever around its
+"inserted" label without releasing the inode lock, ultimately leading to
+the 143s blocking problem mentioned in [1].
+
+If the metadata is corrupted, then trying to remove some extent space
+can do even more harm. Also, in case EXT4_GET_BLOCKS_DELALLOC_RESERVE
+was passed, removing space would wrongly update quota information.
+Jan Kara suggests distinguishing between two cases:
+
+1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
+consistent and we must maintain its consistency including all the
+accounting. However these errors can happen only early before we've
+inserted the extent into the extent tree. So current code works correctly
+for this case.
+
+2) Some other error - this means metadata is corrupted. We should strive to
+do as few modifications as possible to limit damage. So I'd just skip
+freeing of allocated blocks.
+
+[1]
+INFO: task syz.0.17:5995 blocked for more than 143 seconds.
+Call Trace:
+ inode_lock_nested include/linux/fs.h:1073 [inline]
+ __start_dirop fs/namei.c:2923 [inline]
+ start_dirop fs/namei.c:2934 [inline]
+
+Reported-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
+Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
+Reported-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
+Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
+Signed-off-by: Edward Adam Davis <eadavis@qq.com>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Tested-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
+Link: https://patch.msgid.link/tencent_43696283A68450B761D76866C6F360E36705@qq.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/extents.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/fs/ext4/extents.c
++++ b/fs/ext4/extents.c
+@@ -4421,9 +4421,13 @@ got_allocated_blocks:
+ path = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
+ if (IS_ERR(path)) {
+ err = PTR_ERR(path);
+- if (allocated_clusters) {
++ /*
++ * Gracefully handle out of space conditions. If the filesystem
++ * is inconsistent, we'll just leak allocated blocks to avoid
++ * causing even more damage.
++ */
++ if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC)) {
+ int fb_flags = 0;
+-
+ /*
+ * free data blocks we just allocated.
+ * not a good idea to call discard here directly,
--- /dev/null
+From ed9356a30e59c7cc3198e7fc46cfedf3767b9b17 Mon Sep 17 00:00:00 2001
+From: Deepanshu Kartikey <kartikey406@gmail.com>
+Date: Sat, 7 Feb 2026 10:06:07 +0530
+Subject: ext4: convert inline data to extents when truncate exceeds inline size
+
+From: Deepanshu Kartikey <kartikey406@gmail.com>
+
+commit ed9356a30e59c7cc3198e7fc46cfedf3767b9b17 upstream.
+
+Add a check in ext4_setattr() to convert files from inline data storage
+to extent-based storage when truncate() grows the file size beyond the
+inline capacity. This prevents the filesystem from entering an
+inconsistent state where the inline data flag is set but the file size
+exceeds what can be stored inline.
+
+Without this fix, the following sequence causes a kernel BUG_ON():
+
+1. Mount filesystem with inode that has inline flag set and small size
+2. truncate(file, 50MB) - grows size but inline flag remains set
+3. sendfile() attempts to write data
+4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)
+
+The crash occurs because ext4_write_inline_data() expects inline storage
+to accommodate the write, but the actual inline capacity (~60 bytes for
+i_block + ~96 bytes for xattrs) is far smaller than the file size and
+write request.
+
+The fix checks if the new size from setattr exceeds the inode's actual
+inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
+extent-based storage before proceeding with the size change.
+
+This addresses the root cause by ensuring the inline data flag and file
+size remain consistent during truncate operations.
+
+Reported-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=7de5fe447862fc37576f
+Tested-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
+Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
+Link: https://patch.msgid.link/20260207043607.1175976-1-kartikey406@gmail.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/inode.c | 12 ++++++++++++
+ 1 file changed, 12 insertions(+)
+
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -5530,6 +5530,18 @@ int ext4_setattr(struct mnt_idmap *idmap
+ if (attr->ia_size == inode->i_size)
+ inc_ivers = false;
+
++ /*
++ * If file has inline data but new size exceeds inline capacity,
++ * convert to extent-based storage first to prevent inconsistent
++ * state (inline flag set but size exceeds inline capacity).
++ */
++ if (ext4_has_inline_data(inode) &&
++ attr->ia_size > EXT4_I(inode)->i_inline_size) {
++ error = ext4_convert_inline_data(inode);
++ if (error)
++ goto err_out;
++ }
++
+ if (shrink) {
+ if (ext4_should_order_data(inode)) {
+ error = ext4_begin_ordered_truncate(inode,
--- /dev/null
+From 1308255bbf8452762f89f44f7447ce137ecdbcff Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Mon, 16 Feb 2026 17:48:44 +0100
+Subject: ext4: fix fsync(2) for nojournal mode
+
+From: Jan Kara <jack@suse.cz>
+
+commit 1308255bbf8452762f89f44f7447ce137ecdbcff upstream.
+
+When inode metadata is changed, we sometimes just call
+ext4_mark_inode_dirty() to track modified metadata. This copies inode
+metadata into block buffer which is enough when we are journalling
+metadata. However when we are running in nojournal mode we currently
+fail to write the dirtied inode buffer during fsync(2) because the inode
+is not marked as dirty. Use explicit ext4_write_inode() call to make
+sure the inode table buffer is written to the disk. This is a band-aid
+solution, but a proper solution requires a much larger rewrite including
+changes in the metadata bh tracking infrastructure.
+
+Reported-by: Free Ekanayaka <free.ekanayaka@gmail.com>
+Link: https://lore.kernel.org/all/87il8nhxdm.fsf@x1.mail-host-address-is-not-set/
+CC: stable@vger.kernel.org
+Signed-off-by: Jan Kara <jack@suse.cz>
+Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
+Link: https://patch.msgid.link/20260216164848.3074-4-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/fsync.c | 16 ++++++++++++++--
+ 1 file changed, 14 insertions(+), 2 deletions(-)
+
+--- a/fs/ext4/fsync.c
++++ b/fs/ext4/fsync.c
+@@ -83,11 +83,23 @@ static int ext4_fsync_nojournal(struct f
+ int datasync, bool *needs_barrier)
+ {
+ struct inode *inode = file->f_inode;
++ struct writeback_control wbc = {
++ .sync_mode = WB_SYNC_ALL,
++ .nr_to_write = 0,
++ };
+ int ret;
+
+ ret = generic_buffers_fsync_noflush(file, start, end, datasync);
+- if (!ret)
+- ret = ext4_sync_parent(inode);
++ if (ret)
++ return ret;
++
++ /* Force writeout of inode table buffer to disk */
++ ret = ext4_write_inode(inode, &wbc);
++ if (ret)
++ return ret;
++
++ ret = ext4_sync_parent(inode);
++
+ if (test_opt(inode->i_sb, BARRIER))
+ *needs_barrier = true;
+
--- /dev/null
+From ec0a7500d8eace5b4f305fa0c594dd148f0e8d29 Mon Sep 17 00:00:00 2001
+From: Baokun Li <libaokun@linux.alibaba.com>
+Date: Mon, 23 Mar 2026 14:08:36 +0800
+Subject: ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths
+
+From: Baokun Li <libaokun@linux.alibaba.com>
+
+commit ec0a7500d8eace5b4f305fa0c594dd148f0e8d29 upstream.
+
+During code review, Joseph found that ext4_fc_replay_inode() calls
+ext4_get_fc_inode_loc() to get the inode location, which holds a
+reference to iloc.bh that must be released via brelse().
+
+However, several error paths jump to the 'out' label without
+releasing iloc.bh:
+
+ - ext4_handle_dirty_metadata() failure
+ - sync_dirty_buffer() failure
+ - ext4_mark_inode_used() failure
+ - ext4_iget() failure
+
+Fix this by introducing an 'out_brelse' label placed just before
+the existing 'out' label to ensure iloc.bh is always released.
+
+Additionally, make ext4_fc_replay_inode() propagate errors
+properly instead of always returning 0.
+
+Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
+Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
+Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
+Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Link: https://patch.msgid.link/20260323060836.3452660-1-libaokun@linux.alibaba.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/fast_commit.c | 13 ++++++++-----
+ 1 file changed, 8 insertions(+), 5 deletions(-)
+
+--- a/fs/ext4/fast_commit.c
++++ b/fs/ext4/fast_commit.c
+@@ -1601,19 +1601,21 @@ static int ext4_fc_replay_inode(struct s
+ /* Immediately update the inode on disk. */
+ ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh);
+ if (ret)
+- goto out;
++ goto out_brelse;
+ ret = sync_dirty_buffer(iloc.bh);
+ if (ret)
+- goto out;
++ goto out_brelse;
+ ret = ext4_mark_inode_used(sb, ino);
+ if (ret)
+- goto out;
++ goto out_brelse;
+
+ /* Given that we just wrote the inode on disk, this SHOULD succeed. */
+ inode = ext4_iget(sb, ino, EXT4_IGET_NORMAL);
+ if (IS_ERR(inode)) {
+ ext4_debug("Inode not found.");
+- return -EFSCORRUPTED;
++ inode = NULL;
++ ret = -EFSCORRUPTED;
++ goto out_brelse;
+ }
+
+ /*
+@@ -1630,13 +1632,14 @@ static int ext4_fc_replay_inode(struct s
+ ext4_inode_csum_set(inode, ext4_raw_inode(&iloc), EXT4_I(inode));
+ ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh);
+ sync_dirty_buffer(iloc.bh);
++out_brelse:
+ brelse(iloc.bh);
+ out:
+ iput(inode);
+ if (!ret)
+ blkdev_issue_flush(sb->s_bdev);
+
+- return 0;
++ return ret;
+ }
+
+ /*
--- /dev/null
+From b1d682f1990c19fb1d5b97d13266210457092bcd Mon Sep 17 00:00:00 2001
+From: Simon Weber <simon.weber.39@gmail.com>
+Date: Sat, 7 Feb 2026 10:53:03 +0100
+Subject: ext4: fix journal credit check when setting fscrypt context
+
+From: Simon Weber <simon.weber.39@gmail.com>
+
+commit b1d682f1990c19fb1d5b97d13266210457092bcd upstream.
+
+Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt
+are activated simultaneously, leading to ENOSPC when creating an encrypted
+file.
+
+Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle
+is specified, i.e., when the function is called in the control flow of
+creating a new inode. This aligns the number of jbd2 credits set_handle
+checks for with the number allocated for creating a new inode.
+
+ext4_set_context must not be called with a non-null handle (fs_data)
+unless the fscrypt context xattr is guaranteed not to exist yet. The only
+other usage of this function currently is when handling the ioctl
+FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL.
+
+Fixes: c1a5d5f6ab21eb7e ("ext4: improve journal credit handling in set xattr paths")
+
+Co-developed-by: Anthony Durrer <anthonydev@fastmail.com>
+Signed-off-by: Anthony Durrer <anthonydev@fastmail.com>
+Signed-off-by: Simon Weber <simon.weber.39@gmail.com>
+Reviewed-by: Eric Biggers <ebiggers@kernel.org>
+Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/crypto.c | 9 ++++++++-
+ 1 file changed, 8 insertions(+), 1 deletion(-)
+
+--- a/fs/ext4/crypto.c
++++ b/fs/ext4/crypto.c
+@@ -169,10 +169,17 @@ static int ext4_set_context(struct inode
+ */
+
+ if (handle) {
++ /*
++ * Since the inode is new it is ok to pass the
++ * XATTR_CREATE flag. This is necessary to match the
++ * remaining journal credits check in the set_handle
++ * function with the credits allocated for the new
++ * inode.
++ */
+ res = ext4_xattr_set_handle(handle, inode,
+ EXT4_XATTR_INDEX_ENCRYPTION,
+ EXT4_XATTR_NAME_ENCRYPTION_CONTEXT,
+- ctx, len, 0);
++ ctx, len, XATTR_CREATE);
+ if (!res) {
+ ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
+ ext4_clear_inode_state(inode,
--- /dev/null
+From f4a2b42e78914ff15630e71289adc589c3a8eb45 Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Thu, 5 Feb 2026 10:22:24 +0100
+Subject: ext4: fix stale xarray tags after writeback
+
+From: Jan Kara <jack@suse.cz>
+
+commit f4a2b42e78914ff15630e71289adc589c3a8eb45 upstream.
+
+There are cases where ext4_bio_write_page() gets called for a page which
+has no buffers to submit. This happens e.g. when the part of the file is
+actually a hole, when we cannot allocate blocks due to being called from
+jbd2, or in data=journal mode when checkpointing writes the buffers
+earlier. In these cases we just return from ext4_bio_write_page().
+However, if the page didn't need redirtying, we will leave stale DIRTY
+and/or TOWRITE tags in xarray because those get cleared only in
+__folio_start_writeback(). As a result we can leave these tags set in
+mappings even after a final sync on filesystem that's getting remounted
+read-only or that's being frozen. Various assertions can then get upset
+when writeback is started on such filesystems (Gerald reported assertion
+in ext4_journal_check_start() firing).
+
+Fix the problem by cycling the page through writeback state even if we
+decide nothing needs to be written for it so that xarray tags get
+properly updated. This is slightly silly (we could update the xarray
+tags directly) but I don't think a special helper messing with xarray
+tags is really worth it in this relatively rare corner case.
+
+Reported-by: Gerald Yang <gerald.yang@canonical.com>
+Link: https://lore.kernel.org/all/20260128074515.2028982-1-gerald.yang@canonical.com
+Fixes: dff4ac75eeee ("ext4: move keep_towrite handling to ext4_bio_write_page()")
+Signed-off-by: Jan Kara <jack@suse.cz>
+Link: https://patch.msgid.link/20260205092223.21287-2-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/page-io.c | 10 ++++++++--
+ 1 file changed, 8 insertions(+), 2 deletions(-)
+
+--- a/fs/ext4/page-io.c
++++ b/fs/ext4/page-io.c
+@@ -506,9 +506,15 @@ int ext4_bio_write_folio(struct ext4_io_
+ nr_to_submit++;
+ } while ((bh = bh->b_this_page) != head);
+
+- /* Nothing to submit? Just unlock the folio... */
+- if (!nr_to_submit)
++ if (!nr_to_submit) {
++ /*
++ * We have nothing to submit. Just cycle the folio through
++ * writeback state to properly update xarray tags.
++ */
++ __folio_start_writeback(folio, keep_towrite);
++ folio_end_writeback(folio);
+ return 0;
++ }
+
+ bh = head = folio_buffers(folio);
+
--- /dev/null
+From 496bb99b7e66f48b178126626f47e9ba79e2d0fa Mon Sep 17 00:00:00 2001
+From: Zqiang <qiang.zhang@linux.dev>
+Date: Thu, 19 Mar 2026 17:45:45 +0800
+Subject: ext4: fix the might_sleep() warnings in kvfree()
+
+From: Zqiang <qiang.zhang@linux.dev>
+
+commit 496bb99b7e66f48b178126626f47e9ba79e2d0fa upstream.
+
+Using kvfree() in an RCU read-side critical section can trigger
+the following warnings:
+
+EXT4-fs (vdb): unmounting filesystem cd983e5b-3c83-4f5a-a136-17b00eb9d018.
+
+WARNING: suspicious RCU usage
+
+./include/linux/rcupdate.h:409 Illegal context switch in RCU read-side critical section!
+
+other info that might help us debug this:
+
+rcu_scheduler_active = 2, debug_locks = 1
+
+Call Trace:
+ <TASK>
+ dump_stack_lvl+0xbb/0xd0
+ dump_stack+0x14/0x20
+ lockdep_rcu_suspicious+0x15a/0x1b0
+ __might_resched+0x375/0x4d0
+ ? put_object.part.0+0x2c/0x50
+ __might_sleep+0x108/0x160
+ vfree+0x58/0x910
+ ? ext4_group_desc_free+0x27/0x270
+ kvfree+0x23/0x40
+ ext4_group_desc_free+0x111/0x270
+ ext4_put_super+0x3c8/0xd40
+ generic_shutdown_super+0x14c/0x4a0
+ ? __pfx_shrinker_free+0x10/0x10
+ kill_block_super+0x40/0x90
+ ext4_kill_sb+0x6d/0xb0
+ deactivate_locked_super+0xb4/0x180
+ deactivate_super+0x7e/0xa0
+ cleanup_mnt+0x296/0x3e0
+ __cleanup_mnt+0x16/0x20
+ task_work_run+0x157/0x250
+ ? __pfx_task_work_run+0x10/0x10
+ ? exit_to_user_mode_loop+0x6a/0x550
+ exit_to_user_mode_loop+0x102/0x550
+ do_syscall_64+0x44a/0x500
+ entry_SYSCALL_64_after_hwframe+0x77/0x7f
+ </TASK>
+
+BUG: sleeping function called from invalid context at mm/vmalloc.c:3441
+in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556, name: umount
+preempt_count: 1, expected: 0
+CPU: 3 UID: 0 PID: 556 Comm: umount
+Call Trace:
+ <TASK>
+ dump_stack_lvl+0xbb/0xd0
+ dump_stack+0x14/0x20
+ __might_resched+0x275/0x4d0
+ ? put_object.part.0+0x2c/0x50
+ __might_sleep+0x108/0x160
+ vfree+0x58/0x910
+ ? ext4_group_desc_free+0x27/0x270
+ kvfree+0x23/0x40
+ ext4_group_desc_free+0x111/0x270
+ ext4_put_super+0x3c8/0xd40
+ generic_shutdown_super+0x14c/0x4a0
+ ? __pfx_shrinker_free+0x10/0x10
+ kill_block_super+0x40/0x90
+ ext4_kill_sb+0x6d/0xb0
+ deactivate_locked_super+0xb4/0x180
+ deactivate_super+0x7e/0xa0
+ cleanup_mnt+0x296/0x3e0
+ __cleanup_mnt+0x16/0x20
+ task_work_run+0x157/0x250
+ ? __pfx_task_work_run+0x10/0x10
+ ? exit_to_user_mode_loop+0x6a/0x550
+ exit_to_user_mode_loop+0x102/0x550
+ do_syscall_64+0x44a/0x500
+ entry_SYSCALL_64_after_hwframe+0x77/0x7f
+
+The above scenarios occur in initialization failure and teardown
+paths, where there are no parallel operations on the resources released
+by kvfree(). This commit therefore removes rcu_read_lock/unlock() and
+uses rcu_access_pointer() instead of rcu_dereference().
+
+Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access")
+Fixes: df3da4ea5a0f ("ext4: fix potential race between s_group_info online resizing and access")
+Signed-off-by: Zqiang <qiang.zhang@linux.dev>
+Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
+Link: https://patch.msgid.link/20260319094545.19291-1-qiang.zhang@linux.dev
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/mballoc.c | 10 +++-------
+ fs/ext4/super.c | 8 ++------
+ 2 files changed, 5 insertions(+), 13 deletions(-)
+
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -3502,9 +3502,7 @@ err_freebuddy:
+ rcu_read_unlock();
+ iput(sbi->s_buddy_cache);
+ err_freesgi:
+- rcu_read_lock();
+- kvfree(rcu_dereference(sbi->s_group_info));
+- rcu_read_unlock();
++ kvfree(rcu_access_pointer(sbi->s_group_info));
+ return -ENOMEM;
+ }
+
+@@ -3803,7 +3801,8 @@ int ext4_mb_release(struct super_block *
+ WARN_ON_ONCE(!list_empty(&sbi->s_discard_list));
+ }
+
+- if (sbi->s_group_info) {
++ group_info = rcu_access_pointer(sbi->s_group_info);
++ if (group_info) {
+ for (i = 0; i < ngroups; i++) {
+ cond_resched();
+ grinfo = ext4_get_group_info(sb, i);
+@@ -3821,12 +3820,9 @@ int ext4_mb_release(struct super_block *
+ num_meta_group_infos = (ngroups +
+ EXT4_DESC_PER_BLOCK(sb) - 1) >>
+ EXT4_DESC_PER_BLOCK_BITS(sb);
+- rcu_read_lock();
+- group_info = rcu_dereference(sbi->s_group_info);
+ for (i = 0; i < num_meta_group_infos; i++)
+ kfree(group_info[i]);
+ kvfree(group_info);
+- rcu_read_unlock();
+ }
+ kfree(sbi->s_mb_avg_fragment_size);
+ kfree(sbi->s_mb_avg_fragment_size_locks);
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -1266,12 +1266,10 @@ static void ext4_group_desc_free(struct
+ struct buffer_head **group_desc;
+ int i;
+
+- rcu_read_lock();
+- group_desc = rcu_dereference(sbi->s_group_desc);
++ group_desc = rcu_access_pointer(sbi->s_group_desc);
+ for (i = 0; i < sbi->s_gdb_count; i++)
+ brelse(group_desc[i]);
+ kvfree(group_desc);
+- rcu_read_unlock();
+ }
+
+ static void ext4_flex_groups_free(struct ext4_sb_info *sbi)
+@@ -1279,14 +1277,12 @@ static void ext4_flex_groups_free(struct
+ struct flex_groups **flex_groups;
+ int i;
+
+- rcu_read_lock();
+- flex_groups = rcu_dereference(sbi->s_flex_groups);
++ flex_groups = rcu_access_pointer(sbi->s_flex_groups);
+ if (flex_groups) {
+ for (i = 0; i < sbi->s_flex_groups_allocated; i++)
+ kvfree(flex_groups[i]);
+ kvfree(flex_groups);
+ }
+- rcu_read_unlock();
+ }
+
+ static void ext4_put_super(struct super_block *sb)
--- /dev/null
+From d15e4b0a418537aafa56b2cb80d44add83e83697 Mon Sep 17 00:00:00 2001
+From: Jiayuan Chen <jiayuan.chen@shopee.com>
+Date: Thu, 19 Mar 2026 20:03:35 +0800
+Subject: ext4: fix use-after-free in update_super_work when racing with umount
+
+From: Jiayuan Chen <jiayuan.chen@shopee.com>
+
+commit d15e4b0a418537aafa56b2cb80d44add83e83697 upstream.
+
+Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount
+filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work
+to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups
+reads during unmount. However, this introduced a use-after-free because
+update_super_work calls ext4_notify_error_sysfs() -> sysfs_notify() which
+accesses the kobject's kernfs_node after it has been freed by kobject_del()
+in ext4_unregister_sysfs():
+
+ update_super_work ext4_put_super
+ ----------------- --------------
+ ext4_unregister_sysfs(sb)
+ kobject_del(&sbi->s_kobj)
+ __kobject_del()
+ sysfs_remove_dir()
+ kobj->sd = NULL
+ sysfs_put(sd)
+ kernfs_put() // RCU free
+ ext4_notify_error_sysfs(sbi)
+ sysfs_notify(&sbi->s_kobj)
+ kn = kobj->sd // stale pointer
+ kernfs_get(kn) // UAF on freed kernfs_node
+ ext4_journal_destroy()
+ flush_work(&sbi->s_sb_upd_work)
+
+Instead of reordering the teardown sequence, fix this by making
+ext4_notify_error_sysfs() detect that sysfs has already been torn down
+by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call
+in that case. A dedicated mutex (s_error_notify_mutex) serializes
+ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs()
+to prevent TOCTOU races where the kobject could be deleted between the
+state_in_sysfs check and the sysfs_notify() call.
+
+Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
+Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
+Suggested-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
+Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/ext4.h | 1 +
+ fs/ext4/super.c | 1 +
+ fs/ext4/sysfs.c | 10 +++++++++-
+ 3 files changed, 11 insertions(+), 1 deletion(-)
+
+--- a/fs/ext4/ext4.h
++++ b/fs/ext4/ext4.h
+@@ -1535,6 +1535,7 @@ struct ext4_sb_info {
+ struct proc_dir_entry *s_proc;
+ struct kobject s_kobj;
+ struct completion s_kobj_unregister;
++ struct mutex s_error_notify_mutex; /* protects sysfs_notify vs kobject_del */
+ struct super_block *s_sb;
+ struct buffer_head *s_mmp_bh;
+
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -5344,6 +5344,7 @@ static int __ext4_fill_super(struct fs_c
+
+ timer_setup(&sbi->s_err_report, print_daily_error_info, 0);
+ spin_lock_init(&sbi->s_error_lock);
++ mutex_init(&sbi->s_error_notify_mutex);
+ INIT_WORK(&sbi->s_sb_upd_work, update_super_work);
+
+ err = ext4_group_desc_init(sb, es, logical_sb_block, &first_not_zeroed);
+--- a/fs/ext4/sysfs.c
++++ b/fs/ext4/sysfs.c
+@@ -529,7 +529,10 @@ static const struct kobj_type ext4_feat_
+
+ void ext4_notify_error_sysfs(struct ext4_sb_info *sbi)
+ {
+- sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
++ mutex_lock(&sbi->s_error_notify_mutex);
++ if (sbi->s_kobj.state_in_sysfs)
++ sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
++ mutex_unlock(&sbi->s_error_notify_mutex);
+ }
+
+ static struct kobject *ext4_root;
+@@ -542,8 +545,10 @@ int ext4_register_sysfs(struct super_blo
+ int err;
+
+ init_completion(&sbi->s_kobj_unregister);
++ mutex_lock(&sbi->s_error_notify_mutex);
+ err = kobject_init_and_add(&sbi->s_kobj, &ext4_sb_ktype, ext4_root,
+ "%s", sb->s_id);
++ mutex_unlock(&sbi->s_error_notify_mutex);
+ if (err) {
+ kobject_put(&sbi->s_kobj);
+ wait_for_completion(&sbi->s_kobj_unregister);
+@@ -576,7 +581,10 @@ void ext4_unregister_sysfs(struct super_
+
+ if (sbi->s_proc)
+ remove_proc_subtree(sb->s_id, ext4_proc_root);
++
++ mutex_lock(&sbi->s_error_notify_mutex);
+ kobject_del(&sbi->s_kobj);
++ mutex_unlock(&sbi->s_error_notify_mutex);
+ }
+
+ int __init ext4_init_sysfs(void)
--- /dev/null
+From bd060afa7cc3e0ad30afa9ecc544a78638498555 Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Mon, 16 Feb 2026 17:48:43 +0100
+Subject: ext4: make recently_deleted() properly work with lazy itable initialization
+
+From: Jan Kara <jack@suse.cz>
+
+commit bd060afa7cc3e0ad30afa9ecc544a78638498555 upstream.
+
+recently_deleted() checks whether an inode has been used in the recent
+past. However, this can give a false positive when the inode table is
+not initialized yet and we are in fact comparing against random garbage
+(or a stale itable block of a filesystem that existed before mkfs).
+Ultimately this results in uninitialized inodes being skipped during
+inode allocation; they may then never be initialized, and e2fsck
+complains. Verify that the inode has been initialized before checking
+its dtime.
+
+Signed-off-by: Jan Kara <jack@suse.cz>
+Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
+Link: https://patch.msgid.link/20260216164848.3074-3-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/ialloc.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/fs/ext4/ialloc.c
++++ b/fs/ext4/ialloc.c
+@@ -688,6 +688,12 @@ static int recently_deleted(struct super
+ if (unlikely(!gdp))
+ return 0;
+
++ /* Inode was never used in this filesystem? */
++ if (ext4_has_group_desc_csum(sb) &&
++ (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT) ||
++ ino >= EXT4_INODES_PER_GROUP(sb) - ext4_itable_unused_count(sb, gdp)))
++ return 0;
++
+ bh = sb_find_get_block(sb, ext4_inode_table(sb, gdp) +
+ (ino / inodes_per_block));
+ if (!bh || !buffer_uptodate(bh))
--- /dev/null
+From 3822743dc20386d9897e999dbb990befa3a5b3f8 Mon Sep 17 00:00:00 2001
+From: Helen Koike <koike@igalia.com>
+Date: Tue, 17 Mar 2026 11:23:10 -0300
+Subject: ext4: reject mount if bigalloc with s_first_data_block != 0
+
+From: Helen Koike <koike@igalia.com>
+
+commit 3822743dc20386d9897e999dbb990befa3a5b3f8 upstream.
+
+bigalloc with s_first_data_block != 0 is not supported, so reject
+mounting such a file system.
+
+Signed-off-by: Helen Koike <koike@igalia.com>
+Suggested-by: Theodore Ts'o <tytso@mit.edu>
+Reported-by: syzbot+b73703b873a33d8eb8f6@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=b73703b873a33d8eb8f6
+Link: https://patch.msgid.link/20260317142325.135074-1-koike@igalia.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/super.c | 7 +++++++
+ 1 file changed, 7 insertions(+)
+
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -3655,6 +3655,13 @@ int ext4_feature_set_ok(struct super_blo
+ "extents feature\n");
+ return 0;
+ }
++ if (ext4_has_feature_bigalloc(sb) &&
++ le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block)) {
++ ext4_msg(sb, KERN_WARNING,
++ "bad geometry: bigalloc file system with non-zero "
++ "first_data_block\n");
++ return 0;
++ }
+
+ #if !IS_ENABLED(CONFIG_QUOTA) || !IS_ENABLED(CONFIG_QFMT_V2)
+ if (!readonly && (ext4_has_feature_quota(sb) ||
--- /dev/null
+From 356227096eb66e41b23caf7045e6304877322edf Mon Sep 17 00:00:00 2001
+From: Yuto Ohnuki <ytohnuki@amazon.com>
+Date: Mon, 23 Feb 2026 12:33:46 +0000
+Subject: ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
+
+From: Yuto Ohnuki <ytohnuki@amazon.com>
+
+commit 356227096eb66e41b23caf7045e6304877322edf upstream.
+
+Replace BUG_ON() with proper error handling when the inline data size
+exceeds PAGE_SIZE. This prevents a kernel panic and allows the system to
+continue running while properly reporting the filesystem corruption.
+
+The error is logged via ext4_error_inode(), the buffer head is released
+to prevent a memory leak, and -EFSCORRUPTED is returned to indicate
+filesystem corruption.
+
+Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
+Link: https://patch.msgid.link/20260223123345.14838-2-ytohnuki@amazon.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ext4/inline.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/fs/ext4/inline.c
++++ b/fs/ext4/inline.c
+@@ -517,7 +517,15 @@ static int ext4_read_inline_folio(struct
+ goto out;
+
+ len = min_t(size_t, ext4_get_inline_size(inode), i_size_read(inode));
+- BUG_ON(len > PAGE_SIZE);
++
++ if (len > PAGE_SIZE) {
++ ext4_error_inode(inode, __func__, __LINE__, 0,
++ "inline size %zu exceeds PAGE_SIZE", len);
++ ret = -EFSCORRUPTED;
++ brelse(iloc.bh);
++ goto out;
++ }
++
+ kaddr = kmap_local_folio(folio, 0);
+ ret = ext4_read_inline_data(inode, kaddr, len, &iloc);
+ flush_dcache_folio(folio);
loongarch-workaround-ls2k-ls7a-gpu-dma-hang-bug.patch
xfs-stop-reclaim-before-pushing-ail-during-unmount.patch
xfs-fix-ri_total-validation-in-xlog_recover_attri_commit_pass2.patch
+ext4-fix-journal-credit-check-when-setting-fscrypt-context.patch
+ext4-convert-inline-data-to-extents-when-truncate-exceeds-inline-size.patch
+ext4-fix-stale-xarray-tags-after-writeback.patch
+ext4-fix-fsync-2-for-nojournal-mode.patch
+ext4-make-recently_deleted-properly-work-with-lazy-itable-initialization.patch
+ext4-replace-bug_on-with-proper-error-handling-in-ext4_read_inline_folio.patch
+ext4-avoid-infinite-loops-caused-by-residual-data.patch
+ext4-avoid-allocate-block-from-corrupted-group-in-ext4_mb_find_by_goal.patch
+ext4-reject-mount-if-bigalloc-with-s_first_data_block-0.patch
+ext4-fix-use-after-free-in-update_super_work-when-racing-with-umount.patch
+ext4-fix-the-might_sleep-warnings-in-kvfree.patch
+ext4-fix-iloc.bh-leak-in-ext4_fc_replay_inode-error-paths.patch
+ext4-always-drain-queued-discard-work-in-ext4_mb_release.patch