From: Sean Christopherson
Date: Fri, 16 May 2025 21:35:37 +0000 (-0700)
Subject: KVM: Conditionally reschedule when resetting the dirty ring
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=1333c35c4eea6533874564899371501ee80b9583;p=thirdparty%2Fkernel%2Flinux.git

KVM: Conditionally reschedule when resetting the dirty ring

When resetting a dirty ring, conditionally reschedule on each iteration
after the first. The recently introduced hard limit mitigates the issue
of an endless reset, but isn't sufficient to completely prevent RCU
stalls, soft lockups, etc., nor is the hard limit intended to guard
against such badness.

Note! Take care to check for reschedule even in the "continue" paths,
as a pathological scenario (or malicious userspace) could dirty the same
gfn over and over, i.e. always hit the continue path.

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 4-....: (5249 ticks this GP) idle=51e4/1/0x4000000000000000 softirq=309/309 fqs=2563
  rcu: (t=5250 jiffies g=-319 q=608 ncpus=24)
  CPU: 4 UID: 1000 PID: 1067 Comm: dirty_log_test Tainted: G L 6.13.0-rc3-17fa7a24ea1e-HEAD-vm #814
  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x26/0x200 [kvm]
  Call Trace:
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x17/0x200 [kvm]
  Call Trace:
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Reviewed-by: James Houghton
Reviewed-by: Yan Zhao
Reviewed-by: Peter Xu
Link: https://lore.kernel.org/r/20250516213540.2546077-4-seanjc@google.com
Signed-off-by: Sean Christopherson
---

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index e844e869e8c7f..97cca0c02fd17 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -134,6 +134,16 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 
 		ring->reset_index++;
 		(*nr_entries_reset)++;
+
+		/*
+		 * While the size of each ring is fixed, it's possible for the
+		 * ring to be constantly re-dirtied/harvested while the reset
+		 * is in-progress (the hard limit exists only to guard against
+		 * wrapping the count into negative space).
+		 */
+		if (!first_round)
+			cond_resched();
+
 		/*
 		 * Try to coalesce the reset operations when the guest is
 		 * scanning pages in the same slot.