From: Teng Liu <27rabbitlt@gmail.com>
Date: Wed, 13 May 2026 11:35:44 +0000 (+0200)
Subject: btrfs: validate data reloc tree file extent item members
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=a6908f88c9da9778957a07ac568aa643124278a8;p=thirdparty%2Flinux.git

btrfs: validate data reloc tree file extent item members

get_new_location() uses BUG_ON() to crash the kernel if the file extent
item it looks up has any of offset, compression, encryption, or
other_encoding set non-zero.

The data reloc inode is only written by relocation's own paths and the
four fields are always 0 in what the kernel writes:

- insert_prealloc_file_extent() memsets the stack item to zero and only
  fills in type, disk_bytenr, disk_num_bytes and num_bytes, so
  offset/compression/encryption/other_encoding stay 0.

- insert_ordered_extent_file_extent() copies oe->compress_type into the
  file extent's compression field, but the data reloc inode is created
  with BTRFS_INODE_NOCOMPRESS so compress_type is always 0; encryption
  and other_encoding are reserved-and-zero in btrfs.

A non-zero value here means the leaf decoded from disk does not match
what the kernel wrote, i.e. on-disk corruption. A malformed image
reaches this code via balance and panics the kernel.

A previous attempt to enforce all four constraints in tree-checker's
check_extent_data_item() was merged as commit 7d0ee95979e9 ("btrfs:
validate data reloc tree file extent item members in tree-checker") and
then reverted by commit 1c034697fcaa after btrfs/061 produced false
positives on arm64 with 64K pages.

The reason: relocation writeback legitimately produces REG
file_extent_items with offset != 0 in the data reloc tree.
When an ordered extent covers only the back portion of an underlying
PREALLOC (num_bytes < ram_bytes on the input file_extent),
insert_ordered_extent_file_extent() inserts a REG with

    offset    = oe->offset
    num_bytes = oe->num_bytes
    ram_bytes preserved from the original PREALLOC,

and this item can reach disk if a transaction commit fires while it is
present in the leaf.

The four fields belong in different layers:

- compression, encryption and other_encoding are universal invariants
  for every item in the data reloc tree, regardless of cluster
  geometry. Enforce them in tree-checker's check_extent_data_item() so
  a corrupt leaf is rejected at read time.

- offset is only an invariant at the cluster-boundary keys that
  get_new_location() searches (the key is computed as
  src_disk_bytenr - reloc_block_group_start). Partial-PREALLOC
  writebacks legitimately place REG items at non-boundary keys with
  offset != 0; tree-checker cannot reject these. The cluster-boundary
  item is always written by either insert_prealloc_file_extent()
  (offset=0 by memset) or by the front portion of a partial writeback
  (offset=0 by construction), so a non-zero offset there is corruption.

Enforce the universal invariants in check_extent_data_item() with a
file_extent_err() rejection. Convert the BUG_ON() in get_new_location()
to a -EUCLEAN return paired with btrfs_print_leaf() and btrfs_err() so
the offending leaf is logged. The caller in replace_file_extents()
already handles non-zero returns from get_new_location() by breaking
out of the loop without aborting the transaction.
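
As a rough userspace illustration of the layer split described above
(not kernel code), the two checks can be modelled as separate
predicates. Everything named demo_* or DEMO_* below is a hypothetical
stand-in invented for this sketch; the struct is a simplified model,
not the on-disk btrfs_file_extent_item layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's EUCLEAN (117 on Linux). */
#define DEMO_EUCLEAN 117

/* Simplified model of the four fields discussed in this commit. */
struct demo_file_extent {
	uint64_t offset;
	uint8_t compression;
	uint8_t encryption;
	uint16_t other_encoding;
};

/*
 * Read-time check (tree-checker layer in the commit): applies to every
 * file extent item in the data reloc tree. Offset is deliberately not
 * inspected here.
 */
static int demo_check_encoding(const struct demo_file_extent *fi)
{
	assert(fi != NULL);
	if (fi->compression || fi->encryption || fi->other_encoding)
		return -DEMO_EUCLEAN;
	return 0;
}

/*
 * Lookup-time check (get_new_location() layer in the commit): applies
 * only to the item found at a cluster-boundary key.
 */
static int demo_check_boundary_offset(const struct demo_file_extent *fi)
{
	assert(fi != NULL);
	if (fi->offset != 0)
		return -DEMO_EUCLEAN;
	return 0;
}
```

In this model an item with offset = 4096 and zeroed encoding fields
passes demo_check_encoding(), mirroring a legitimate partial-PREALLOC
writeback, and is rejected only if it turns up where a cluster-boundary
item is expected.
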

Suggested-by: Qu Wenruo
Suggested-by: David Sterba
Reported-by: syzbot+3e20d8f3d41bac5dc9a2@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3e20d8f3d41bac5dc9a2
Signed-off-by: Teng Liu <27rabbitlt@gmail.com>
Signed-off-by: David Sterba
---

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 67af02e732d0c..955e338dcfd89 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -813,6 +813,7 @@ static int get_new_location(struct inode *reloc_inode, u64 *new_bytenr,
 			    u64 bytenr, u64 num_bytes)
 {
 	struct btrfs_root *root = BTRFS_I(reloc_inode)->root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
 	BTRFS_PATH_AUTO_FREE(path);
 	struct btrfs_file_extent_item *fi;
 	struct extent_buffer *leaf;
@@ -834,10 +835,23 @@ static int get_new_location(struct inode *reloc_inode, u64 *new_bytenr,
 	fi = btrfs_item_ptr(leaf, path->slots[0],
 			    struct btrfs_file_extent_item);
 
-	BUG_ON(btrfs_file_extent_offset(leaf, fi) ||
-	       btrfs_file_extent_compression(leaf, fi) ||
-	       btrfs_file_extent_encryption(leaf, fi) ||
-	       btrfs_file_extent_other_encoding(leaf, fi));
+	/*
+	 * The cluster-boundary key searched above is always written by
+	 * relocation with offset 0: either by insert_prealloc_file_extent()
+	 * (memsets the stack item to 0) or by the front portion of a partial
+	 * writeback (offset=0 by construction). A non-zero value here means
+	 * the on-disk leaf does not match what relocation wrote, i.e.
+	 * corruption. The other encoding fields are caught earlier by
+	 * tree-checker's check_extent_data_item().
+	 */
+	if (unlikely(btrfs_file_extent_offset(leaf, fi))) {
+		btrfs_print_leaf(leaf);
+		btrfs_err(fs_info,
+"unexpected non-zero offset in file extent item for data reloc inode %llu key offset %llu offset %llu",
+			  btrfs_ino(BTRFS_I(reloc_inode)), bytenr,
+			  btrfs_file_extent_offset(leaf, fi));
+		return -EUCLEAN;
+	}
 
 	if (num_bytes != btrfs_file_extent_disk_num_bytes(leaf, fi))
 		return -EINVAL;
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index b7ff3e3d3a631..cb3e676a81cc4 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -296,6 +296,33 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		return 0;
 	}
 
+	/*
+	 * For the data reloc tree, file extent items are written by
+	 * relocation's own paths. The data reloc inode is created with
+	 * BTRFS_INODE_NOCOMPRESS, so insert_ordered_extent_file_extent()
+	 * always leaves the compression field at 0. Encryption and
+	 * other_encoding are reserved-and-zero in btrfs. A non-zero value
+	 * for any of these means the leaf decoded from disk does not match
+	 * what the kernel wrote, i.e. on-disk corruption.
+	 *
+	 * The file_extent_item's offset field is NOT a universal invariant
+	 * here: partial-PREALLOC writebacks legitimately produce REG items
+	 * with non-zero offset at non-boundary keys. The offset check is
+	 * performed at the call site in get_new_location(), which only
+	 * inspects cluster-boundary keys where offset is always 0.
+	 */
+	if (unlikely(btrfs_header_owner(leaf) == BTRFS_DATA_RELOC_TREE_OBJECTID &&
+		     (btrfs_file_extent_compression(leaf, fi) ||
+		      btrfs_file_extent_encryption(leaf, fi) ||
+		      btrfs_file_extent_other_encoding(leaf, fi)))) {
+		file_extent_err(leaf, slot,
+"invalid encoding fields for data reloc tree, compression=%u encryption=%u other_encoding=%u",
+				btrfs_file_extent_compression(leaf, fi),
+				btrfs_file_extent_encryption(leaf, fi),
+				btrfs_file_extent_other_encoding(leaf, fi));
+		return -EUCLEAN;
+	}
+
 	/* Regular or preallocated extent has fixed item size */
 	if (unlikely(item_size != sizeof(*fi))) {
 		file_extent_err(leaf, slot,