--- /dev/null
+From dc91f2eb1a4021eb6705c15e474942f84ab9b211 Mon Sep 17 00:00:00 2001
+From: Haozhong Zhang <haozhong.zhang@intel.com>
+Date: Mon, 18 Sep 2017 09:56:49 +0800
+Subject: KVM: VMX: do not change SN bit in vmx_update_pi_irte()
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Haozhong Zhang <haozhong.zhang@intel.com>
+
+commit dc91f2eb1a4021eb6705c15e474942f84ab9b211 upstream.
+
+In kvm_vcpu_trigger_posted_interrupt() and pi_pre_block(), KVM
+assumes that PI notification events should not be suppressed when the
+target vCPU is not blocked.
+
+vmx_update_pi_irte() sets the SN field before changing an interrupt
+from posting to remapping, but it does not check the vCPU mode.
+Therefore, changing the SN field may break the above assumption.
+Besides, I see no reason to suppress notification events here, so
+remove the SN field changes to avoid the race condition.
+
+Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
+Reported-by: "Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com>
+Reported-by: Dan Williams <dan.j.williams@intel.com>
+Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
+Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted")
+Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kvm/vmx.c | 6 +-----
+ 1 file changed, 1 insertion(+), 5 deletions(-)
+
+--- a/arch/x86/kvm/vmx.c
++++ b/arch/x86/kvm/vmx.c
+@@ -11604,12 +11604,8 @@ static int vmx_update_pi_irte(struct kvm
+
+ if (set)
+ ret = irq_set_vcpu_affinity(host_irq, &vcpu_info);
+- else {
+- /* suppress notification event before unposting */
+- pi_set_sn(vcpu_to_pi_desc(vcpu));
++ else
+ ret = irq_set_vcpu_affinity(host_irq, NULL);
+- pi_clear_sn(vcpu_to_pi_desc(vcpu));
+- }
+
+ if (ret < 0) {
+ printk(KERN_INFO "%s: failed to update PI IRTE\n",
--- /dev/null
+From 5753743fa5108b8f98bd61e40dc63f641b26c768 Mon Sep 17 00:00:00 2001
+From: Haozhong Zhang <haozhong.zhang@intel.com>
+Date: Mon, 18 Sep 2017 09:56:50 +0800
+Subject: KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Haozhong Zhang <haozhong.zhang@intel.com>
+
+commit 5753743fa5108b8f98bd61e40dc63f641b26c768 upstream.
+
+WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc)) in kvm_vcpu_trigger_posted_interrupt()
+is intended to detect violations of the invariant that a VT-d PI
+notification event is not suppressed while the vcpu is in guest mode.
+Because the checks of the target vcpu mode and of the target suppress
+field cannot be performed atomically, the target vcpu mode may change
+in between. If that happens, WARN_ON_ONCE() may raise false alarms.
+
+As the previous patch fixed the real invariant breaker, remove this
+WARN_ON_ONCE() to avoid false alarms, and document the allowed cases
+instead.
+
+Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
+Reported-by: "Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com>
+Reported-by: Dan Williams <dan.j.williams@intel.com>
+Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
+Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted")
+Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kvm/vmx.c | 33 +++++++++++++++++++++------------
+ 1 file changed, 21 insertions(+), 12 deletions(-)
+
+--- a/arch/x86/kvm/vmx.c
++++ b/arch/x86/kvm/vmx.c
+@@ -5046,21 +5046,30 @@ static inline bool kvm_vcpu_trigger_post
+ int pi_vec = nested ? POSTED_INTR_NESTED_VECTOR : POSTED_INTR_VECTOR;
+
+ if (vcpu->mode == IN_GUEST_MODE) {
+- struct vcpu_vmx *vmx = to_vmx(vcpu);
+-
+ /*
+- * Currently, we don't support urgent interrupt,
+- * all interrupts are recognized as non-urgent
+- * interrupt, so we cannot post interrupts when
+- * 'SN' is set.
++ * The vector of the interrupt to be delivered to vcpu has
++ * been set in PIR before this function is called.
++ *
++ * The following cases can reach this block, and we
++ * always send a notification event in all of them, as
++ * explained below.
++ *
++ * Case 1: vcpu stays in non-root mode. Sending a
++ * notification event posts the interrupt to vcpu.
++ *
++ * Case 2: vcpu exits to root mode and is still
++ * runnable. PIR will be synced to vIRR before the
++ * next vcpu entry. Sending a notification event in
++ * this case has no effect, as vcpu is no longer in
++ * non-root mode.
+ *
+- * If the vcpu is in guest mode, it means it is
+- * running instead of being scheduled out and
+- * waiting in the run queue, and that's the only
+- * case when 'SN' is set currently, warning if
+- * 'SN' is set.
++ * Case 3: vcpu exits to root mode and is blocked.
++ * vcpu_block() has already synced PIR to vIRR and
++ * never blocks vcpu if vIRR is not empty. Therefore,
++ * a blocked vcpu here does not wait for any requested
++ * interrupts in PIR, and sending a notification event
++ * which has no effect is safe here.
+ */
+- WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
+
+ apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
+ return true;