blk-cgroup: defer blkcg css_put until blkg is unlinked from queue
author Zizhi Wo <wozizhi@huawei.com>
Tue, 16 Jun 2026 01:17:46 +0000 (09:17 +0800)
committer Jens Axboe <axboe@kernel.dk>
Mon, 22 Jun 2026 21:59:53 +0000 (15:59 -0600)
commit 3ed9b4779a4aa3f44cd9f78627498d7adac40daa
tree 36711428b19050ff6027a1ea18cefa9315f82582
parent 0ab5ee5a1badb58cbb2242617cb01a4972b1f2a2

[BUG]
Our fuzz testing triggered a blkcg use-after-free issue:

  BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x75/0xe0
  Call Trace:
  ...
  blkcg_deactivate_policy+0x244/0x4d0
  ioc_rqos_exit+0x44/0xe0
  rq_qos_exit+0xba/0x120
  __del_gendisk+0x50b/0x800
  del_gendisk+0xff/0x190
  ...

[CAUSE]
process1                                process2
cgroup_rmdir
...
  css_killed_work_fn
    offline_css
    ...
      blkcg_destroy_blkgs
      ...
        __blkg_release
          css_put(&blkg->blkcg->css)
          blkg_free
            INIT_WORK(xxx, blkg_free_workfn)
            schedule_work
    css_put
    ...
      blkcg_css_free
        kfree(blkcg)--------blkcg has been freed!!!
====================================schedule_work
              blkg_free_workfn
                                        __del_gendisk
                                          rq_qos_exit
                                            ioc_rqos_exit
                                              blkcg_deactivate_policy
                                                mutex_lock(&q->blkcg_mutex)
                                                spin_lock_irq(&q->queue_lock)
                                                list_for_each_entry(blkg, xxx)
                                                  blkcg = blkg->blkcg
                                                  spin_lock(&blkcg->lock)-------UAF!!!
                mutex_lock(&q->blkcg_mutex)
                spin_lock_irq(&q->queue_lock)
                /* Only then is the blkg removed from the list */
                list_del_init(&blkg->q_node)

As a result, a blkg can still be reachable through q->blkg_list while
its ->blkcg has already been freed.

[FIX]
Fix this by deferring the blkcg css_put() until after the blkg has been
unlinked from q->blkg_list in blkg_free_workfn(). This ensures that the
blkcg outlives every blkg still reachable through q->blkg_list, so any
iterator holding q->queue_lock is guaranteed to observe a valid
blkg->blkcg.
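The ordering argument can be illustrated with a small userspace model. This is
only a sketch, not the kernel code: every toy_* name is hypothetical, the
"freed" flag stands in for kfree(blkcg), and it assumes the blkg holds the
last reference on the blkcg, as in the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel objects; none of these are real kernel types. */
struct toy_blkcg {
	int refcnt;	/* models the css reference count */
	bool freed;	/* set instead of calling free(), so the bug is observable */
};

struct toy_blkg {
	struct toy_blkcg *blkcg;
	bool on_queue_list;	/* models membership in q->blkg_list */
};

/* Models css_put(): the last put "frees" the blkcg. */
static void toy_css_put(struct toy_blkcg *blkcg)
{
	assert(blkcg->refcnt > 0);
	if (--blkcg->refcnt == 0)
		blkcg->freed = true;
}

/* True if an iterator walking the list would dereference a freed blkcg. */
static bool toy_iterator_sees_freed(const struct toy_blkg *blkg)
{
	return blkg->on_queue_list && blkg->blkcg->freed;
}

/*
 * Tear down one blkg and report whether a list iterator could have
 * observed a freed blkcg at any point in between.
 * defer_put == false: put at release time, unlink later (old ordering).
 * defer_put == true:  unlink first, put afterwards (ordering of the fix).
 */
static bool toy_uaf_window(bool defer_put)
{
	struct toy_blkcg blkcg = { .refcnt = 1, .freed = false };
	struct toy_blkg blkg = { .blkcg = &blkcg, .on_queue_list = true };
	bool window = false;

	if (!defer_put)
		toy_css_put(&blkcg);		/* like __blkg_release() before the fix */
	window |= toy_iterator_sees_freed(&blkg);	/* iterator runs before the work item */

	blkg.on_queue_list = false;		/* like list_del_init(&blkg->q_node) */
	if (defer_put)
		toy_css_put(&blkcg);		/* put only after the unlink */
	window |= toy_iterator_sees_freed(&blkg);	/* iterator runs after the work item */

	return window;
}
```

In this model, toy_uaf_window(false) reports a window and toy_uaf_window(true)
does not, because the blkcg is only "freed" once the blkg is no longer
reachable from the list.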

While at it, move css_tryget_online() from blkg_create() into blkg_alloc()
so that the css reference is owned by the alloc/free pair rather than
straddling layers:
blkg_alloc()  <-> blkg_free()
blkg_create() <-> blkg_destroy()
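
The alloc/free pairing can be sketched the same way in userspace. Again this
is an assumption-laden illustration rather than the patch itself: the toy_*
types and helpers are hypothetical, and the real blkg_alloc() and blkg_free()
do considerably more work than shown here.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a css: a refcount plus an online flag. */
struct toy_css {
	int refcnt;
	bool online;
};

struct toy_blkg {
	struct toy_css *css;	/* reference owned from alloc until free */
};

/* Models css_tryget_online(): only succeeds while the css is online. */
static bool toy_css_tryget_online(struct toy_css *css)
{
	if (!css->online)
		return false;
	css->refcnt++;
	return true;
}

static void toy_css_put(struct toy_css *css)
{
	css->refcnt--;
}

/* The reference is taken here ... */
static struct toy_blkg *toy_blkg_alloc(struct toy_css *css)
{
	struct toy_blkg *blkg;

	if (!toy_css_tryget_online(css))
		return NULL;

	blkg = calloc(1, sizeof(*blkg));
	if (!blkg) {
		toy_css_put(css);	/* undo the tryget on allocation failure */
		return NULL;
	}
	blkg->css = css;
	return blkg;
}

/* ... and dropped here, so no other layer has to balance it. */
static void toy_blkg_free(struct toy_blkg *blkg)
{
	struct toy_css *css;

	if (!blkg)
		return;
	css = blkg->css;
	free(blkg);
	toy_css_put(css);
}

/* Returns the css refcount after one alloc/free round trip (starts at 1). */
static int toy_roundtrip_refcnt(bool online)
{
	struct toy_css css = { .refcnt = 1, .online = online };
	struct toy_blkg *blkg = toy_blkg_alloc(&css);

	toy_blkg_free(blkg);	/* NULL-safe, covers the offline case */
	return css.refcnt;
}
```

With the reference taken and released inside the same pair, the count is
balanced whether or not the allocation succeeds, which is the property the
pairing above is meant to provide.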

Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Suggested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai@fygo.io>
Reviewed-by: Tang Yizhou <yizhou.tang@shopee.com>
Link: https://patch.msgid.link/20260616011746.2451461-1-wozizhi@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
block/blk-cgroup.c