[BUG]
Running btrfs balance can trigger a null-ptr-deref before relocating a
data chunk when metadata corruption leaves a chunk in the chunk tree
without a corresponding block group in the in-memory cache:
KASAN: null-ptr-deref in range [0x0000000000000088-0x000000000000008f]
RIP: 0010:btrfs_may_alloc_data_chunk+0x40/0x1c0 fs/btrfs/volumes.c:3601
Call Trace:
__btrfs_balance fs/btrfs/volumes.c:4217 [inline]
btrfs_balance+0x2516/0x42b0 fs/btrfs/volumes.c:4604
btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
...
[CAUSE]
__btrfs_balance() iterates the on-disk chunk tree and passes the chunk
logical bytenr to btrfs_may_alloc_data_chunk() before relocating a data
chunk. That helper then queries the in-memory block group cache:
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
chunk_type = cache->flags; /* cache may be NULL */
A corrupt image can contain a chunk item whose matching block group
item is missing, so no block group is ever inserted into the cache. In
that case btrfs_lookup_block_group() returns NULL.
The code only guards this with ASSERT(cache), which becomes a no-op when
CONFIG_BTRFS_ASSERT is disabled. The subsequent dereference of
cache->flags therefore crashes the kernel.
[FIX]
Add a NULL check after btrfs_lookup_block_group() in
btrfs_may_alloc_data_chunk() and print and error message for clarity.
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
u64 chunk_type;
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
- ASSERT(cache);
+ if (unlikely(!cache)) {
+ btrfs_err(fs_info, "balance: chunk at bytenr %llu has no corresponding block group",
+ chunk_offset);
+ return -EUCLEAN;
+ }
chunk_type = cache->flags;
btrfs_put_block_group(cache);