The TPR_THRESHOLD field in the VMCS is used by VMX to induce VM exits
when the guest's virtual TPR falls under the specified threshold,
allowing KVM to inject previously masked interrupts.
KVM handles these VM exits in handle_tpr_below_threshold().
Commit
eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits")
optimized this function by calling apic_update_ppr() instead of raising
KVM_REQ_EVENT. apic_update_ppr() then raises KVM_REQ_EVENT if there is
a pending, deliverable interrupt.
However, if there are no new interrupts pending, apic_update_ppr() does
not issue the request. Thus, kvm_lapic_update_cr8_intercept() and
vmx_update_cr8_intercept() are not called before VM entry, which results
in a high, stale TPR_THRESHOLD. This is problematic due to the following
sentence in 28.2.1.1 "VM-Execution Control Fields" in the SDM:
The following check is performed if the “use TPR shadow” VM-execution
control is 1 and the “virtualize APIC accesses” and “virtual-interrupt
delivery” VM-execution controls are both 0: the value of bits 3:0 of
the TPR threshold VM-execution control field should not be greater
than the value of bits 7:4 of VTPR.
This error condition is typically not observed when KVM runs on a bare
metal system because modern processors support APICv, which enables
virtual-interrupt delivery, and which KVM uses when possible. This
causes the processor to no longer generate TPR-below-threshold exits
and to no longer check TPR_THRESHOLD on entry. However, when running
on older platforms, or under nested virtualization on a hypervisor that
does not support virtual-interrupt delivery and enforces this check
(like Hyper-V) this can cause a VM entry failure with hardware error
0x7, as seen in [1].
Call kvm_lapic_update_cr8_intercept() if apic_update_ppr() does not
find a deliverable interrupt (and thus does not raise KVM_REQ_EVENT).
Remove calls to kvm_lapic_update_cr8_intercept() on paths that end up in
apic_update_ppr(), as they now become redundant. This ensures that any
path that updates the guest's PPR also figures out if KVM needs to wait
for a TPR change (using TPR_THRESHOLD on VMX or CR8 intercepts on SVM).
Link: https://github.com/coconut-svsm/svsm/issues/1081
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Cc: stable@vger.kernel.org
Fixes: eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits")
Signed-off-by: Carlos López <clopez@suse.de>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <
20260618174347.
1981064-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
if (__apic_update_ppr(apic, &ppr) &&
apic_has_interrupt_for_ppr(apic, ppr) != -1)
kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
+ else
+ kvm_lapic_update_cr8_intercept(apic->vcpu);
}
void kvm_apic_update_ppr(struct kvm_vcpu *vcpu)
r = kvm_apic_set_state(vcpu, s);
if (r)
return r;
- kvm_lapic_update_cr8_intercept(vcpu);
return 0;
}
kvm_register_mark_dirty(vcpu, VCPU_REG_CR3);
kvm_x86_call(post_set_cr3)(vcpu, sregs->cr3);
- kvm_set_cr8(vcpu, sregs->cr8);
-
*mmu_reset_needed |= vcpu->arch.efer != sregs->efer;
kvm_x86_call(set_efer)(vcpu, sregs->efer);
kvm_set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
kvm_set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
- kvm_lapic_update_cr8_intercept(vcpu);
+ kvm_set_cr8(vcpu, sregs->cr8);
/* Older userspace won't unhalt the vcpu on reset. */
if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 &&