]> git.ipfire.org Git - thirdparty/kernel/linux.git/commit
drm/msm: Fix shrinker deadlock
authorDaniel J Blueman <daniel@quora.org>
Fri, 8 May 2026 06:57:21 +0000 (14:57 +0800)
committerRob Clark <robin.clark@oss.qualcomm.com>
Fri, 15 May 2026 16:42:19 +0000 (09:42 -0700)
commit3392291fc509d8ad6e4ad90f15b0a193f721cbc9
tree81a3daa904d2bf6ad1d7d9db3e107370ce96f389
parent3f9ed5f5aa9ecffd284218fffd6abc29617f51d5
drm/msm: Fix shrinker deadlock

With PROVE_LOCKING on an Snapdragon X1 and VM reclaim pressure, we see:

   ======================================================
   WARNING: possible circular locking dependency detected
   7.0.0-debug+ #43 Tainted: G        W
   ------------------------------------------------------
   kswapd0/82 is trying to acquire lock:
   ffff800080ec3870 (reservation_ww_class_acquire){+.+.}-{0:0}, at: msm_gem_shrinker_scan+0x17c/0x400 [msm]

   but task is already holding lock:
   ffffc31709b263b8 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x88/0x988

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #2 (fs_reclaim){+.+.}-{0:0}:
          __lock_acquire+0x4d0/0xad0
          lock_acquire.part.0+0xc4/0x248
          lock_acquire+0x8c/0x248
          fs_reclaim_acquire+0xd0/0xf0
          dma_resv_lockdep+0x224/0x348
          do_one_initcall+0x84/0x5d0
          do_initcalls+0x194/0x1d8
          kernel_init_freeable+0x128/0x180
          kernel_init+0x2c/0x160
          ret_from_fork+0x10/0x20

   -> #1 (reservation_ww_class_mutex){+.+.}-{4:4}:
          __lock_acquire+0x4d0/0xad0
          lock_acquire.part.0+0xc4/0x248
          lock_acquire+0x8c/0x248
          dma_resv_lockdep+0x1a8/0x348
          do_one_initcall+0x84/0x5d0
          do_initcalls+0x194/0x1d8
          kernel_init_freeable+0x128/0x180
          kernel_init+0x2c/0x160
          ret_from_fork+0x10/0x20

   -> #0 (reservation_ww_class_acquire){+.+.}-{0:0}:
          check_prev_add+0x114/0x790
          validate_chain+0x594/0x6f0
          __lock_acquire+0x4d0/0xad0
          lock_acquire.part.0+0xc4/0x248
          lock_acquire+0x8c/0x248
          drm_gem_lru_scan+0x1ac/0x440
          msm_gem_shrinker_scan+0x17c/0x400 [msm]
          do_shrink_slab+0x150/0x4a0
          shrink_slab+0x144/0x460
          shrink_one+0x9c/0x1b0
          shrink_many+0x27c/0x5c0
          shrink_node+0x344/0x550
          balance_pgdat+0x2c0/0x988
          kswapd+0x11c/0x318
          kthread+0x10c/0x128
          ret_from_fork+0x10/0x20

   other info that might help us debug this:
   Chain exists of:
     reservation_ww_class_acquire --> reservation_ww_class_mutex --> fs_reclaim
    Possible unsafe locking scenario:
          CPU0                    CPU1
          ----                    ----
     lock(fs_reclaim);
                                  lock(reservation_ww_class_mutex);
                                  lock(fs_reclaim);
     lock(reservation_ww_class_acquire);

    *** DEADLOCK ***
   1 lock held by kswapd0/82:
    #0: ffffc31709b263b8 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x88/0x988

   stack backtrace:
   CPU: 4 UID: 0 PID: 82 Comm: kswapd0 Tainted: G        W           7.0.0-debug+ #43 PREEMPT(full)
   Tainted: [W]=WARN
   Hardware name: LENOVO 21BX0016US/21BX0016US, BIOS N3HET94W (1.66 ) 09/15/2025
   Call trace:
    show_stack+0x20/0x40 (C)
    dump_stack_lvl+0x9c/0xd0
    dump_stack+0x18/0x30
    print_circular_bug+0x114/0x120
    check_noncircular+0x178/0x198
    check_prev_add+0x114/0x790
    validate_chain+0x594/0x6f0
    __lock_acquire+0x4d0/0xad0
    lock_acquire.part.0+0xc4/0x248
    lock_acquire+0x8c/0x248
    drm_gem_lru_scan+0x1ac/0x440
    msm_gem_shrinker_scan+0x17c/0x400 [msm]
    do_shrink_slab+0x150/0x4a0
    shrink_slab+0x144/0x460
    shrink_one+0x9c/0x1b0
    shrink_many+0x27c/0x5c0
    shrink_node+0x344/0x550
    balance_pgdat+0x2c0/0x988
    kswapd+0x11c/0x318
    kthread+0x10c/0x128
    ret_from_fork+0x10/0x20

kswapd0 holding fs_reclaim calls the MSM shrinker, which calls
dma_resv_lock. This in turn acquires fs_reclaim.

Fix this deadlock by using dma_resv_trylock() instead, dropping the
subsequently unused passed wait-wound lock 'ticket'.

Cc: stable@vger.kernel.org
Signed-off-by: Daniel J Blueman <daniel@quora.org>
Fixes: fe4952b5f27c ("drm/msm: Convert vm locking")
Patchwork: https://patchwork.freedesktop.org/patch/723564/
Message-ID: <20260508065722.18785-1-daniel@quora.org>
[rob: fixup compile errors, replace lockdep splat with something legible]
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
drivers/gpu/drm/msm/msm_gem_shrinker.c