There is a uaf problem which is found by case 23rdev-lifetime:
Oops: general protection fault, probably for non-canonical address 0xdead000000000122
RIP: 0010:bdi_unregister+0x4b/0x170
Call Trace:
<TASK>
__del_gendisk+0x356/0x3e0
mddev_unlock+0x351/0x360
rdev_attr_store+0x217/0x280
kernfs_fop_write_iter+0x14a/0x210
vfs_write+0x29e/0x550
ksys_write+0x74/0xf0
do_syscall_64+0xbb/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff5250a177e
The sequence is:
1. rdev remove path gets reconfig_mutex
2. rdev remove path release reconfig_mutex in mddev_unlock
3. md stop calls do_md_stop and sets MD_DELETED
4. rdev remove path calls del_gendisk because MD_DELETED is set
5. md stop path release reconfig_mutex and calls del_gendisk again
So there is a race condition we should resolve. This patch adds a
flag MD_DO_DELETE to avoid the race condition.
Link: https://lore.kernel.org/linux-raid/20251029063419.21700-1-xni@redhat.com
Fixes: 9e59d609763f ("md: call del_gendisk in control path")
Signed-off-by: Xiao Ni <xni@redhat.com>
Suggested-by: Yu Kuai <yukuai@fnnas.com>
Reviewed-by: Li Nan <linan122@huawei.com>
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
* do_md_stop. dm raid only uses md_stop to stop. So dm raid
* doesn't need to check MD_DELETED when getting reconfig lock
*/
- if (test_bit(MD_DELETED, &mddev->flags)) {
+ if (test_bit(MD_DELETED, &mddev->flags) &&
+ !test_and_set_bit(MD_DO_DELETE, &mddev->flags)) {
kobject_del(&mddev->kobj);
del_gendisk(mddev->gendisk);
}
MD_HAS_MULTIPLE_PPLS,
MD_NOT_READY,
MD_BROKEN,
+ MD_DO_DELETE,
MD_DELETED,
};