--- /dev/null
+From da9de5f8527f4b9efc82f967d29a583318c034c7 Mon Sep 17 00:00:00 2001
+From: Mike Marciniszyn <mike.marciniszyn@intel.com>
+Date: Fri, 7 Jun 2019 08:25:31 -0400
+Subject: IB/hfi1: Close PSM sdma_progress sleep window
+
+From: Mike Marciniszyn <mike.marciniszyn@intel.com>
+
+commit da9de5f8527f4b9efc82f967d29a583318c034c7 upstream.
+
+sdma_progress() is called outside the wait lock.
+
+In this case, there is a race condition where sdma_progress() can return
+false and the sdma_engine can idle. If that happens, there will be no
+more sdma interrupts to cause the wakeup and the user_sdma xmit will hang.
+
+Fix by moving the lock to enclose the sdma_progress() call.
+
+Also, delete busycount. The need for this was removed by:
+commit bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first")
+
+Cc: <stable@vger.kernel.org>
+Fixes: 7724105686e7 ("IB/hfi1: add driver files")
+Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com>
+Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
+Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
+Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+
+---
+ drivers/infiniband/hw/hfi1/user_sdma.c | 13 ++++---------
+ 1 file changed, 4 insertions(+), 9 deletions(-)
+
+--- a/drivers/infiniband/hw/hfi1/user_sdma.c
++++ b/drivers/infiniband/hw/hfi1/user_sdma.c
+@@ -260,7 +260,6 @@ struct user_sdma_txreq {
+ struct list_head list;
+ struct user_sdma_request *req;
+ u16 flags;
+- unsigned busycount;
+ u64 seqnum;
+ };
+
+@@ -323,25 +322,22 @@ static int defer_packet_queue(
+ struct hfi1_user_sdma_pkt_q *pq =
+ container_of(wait, struct hfi1_user_sdma_pkt_q, busy);
+ struct hfi1_ibdev *dev = &pq->dd->verbs_dev;
+- struct user_sdma_txreq *tx =
+- container_of(txreq, struct user_sdma_txreq, txreq);
+
+- if (sdma_progress(sde, seq, txreq)) {
+- if (tx->busycount++ < MAX_DEFER_RETRY_COUNT)
+- goto eagain;
+- }
++ write_seqlock(&dev->iowait_lock);
++ if (sdma_progress(sde, seq, txreq))
++ goto eagain;
+ /*
+ * We are assuming that if the list is enqueued somewhere, it
+ * is to the dmawait list since that is the only place where
+ * it is supposed to be enqueued.
+ */
+ xchg(&pq->state, SDMA_PKT_Q_DEFERRED);
+- write_seqlock(&dev->iowait_lock);
+ if (list_empty(&pq->busy.list))
+ list_add_tail(&pq->busy.list, &sde->dmawait);
+ write_sequnlock(&dev->iowait_lock);
+ return -EBUSY;
+ eagain:
++ write_sequnlock(&dev->iowait_lock);
+ return -EAGAIN;
+ }
+
+@@ -925,7 +921,6 @@ static int user_sdma_send_pkts(struct us
+
+ tx->flags = 0;
+ tx->req = req;
+- tx->busycount = 0;
+ INIT_LIST_HEAD(&tx->list);
+
+ if (req->seqnum == req->info.npkts - 1)
--- /dev/null
+From bb34e690e9340bc155ebed5a3d75fc63ff69e082 Mon Sep 17 00:00:00 2001
+From: Wanpeng Li <wanpengli@tencent.com>
+Date: Tue, 2 Jul 2019 17:25:02 +0800
+Subject: KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Wanpeng Li <wanpengli@tencent.com>
+
+commit bb34e690e9340bc155ebed5a3d75fc63ff69e082 upstream.
+
+Thomas reported that:
+
+ | Background:
+ |
+ | In preparation of supporting IPI shorthands I changed the CPU offline
+ | code to software disable the local APIC instead of just masking it.
+ | That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
+ | register.
+ |
+ | Failure:
+ |
+ | When the CPU comes back online the startup code triggers occasionally
+ | the warning in apic_pending_intr_clear(). That complains that the IRRs
+ | are not empty.
+ |
+ | The offending vector is the local APIC timer vector whose IRR bit is set
+ | and stays set.
+ |
+ | It took me quite some time to reproduce the issue locally, but now I can
+ | see what happens.
+ |
+ | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
+ | (and hardware support) it behaves correctly.
+ |
+ | Here is the series of events:
+ |
+ | Guest CPU
+ |
+ | goes down
+ |
+ | native_cpu_disable()
+ |
+ | apic_soft_disable();
+ |
+ | play_dead()
+ |
+ | ....
+ |
+ | startup()
+ |
+ | if (apic_enabled())
+ | apic_pending_intr_clear() <- Not taken
+ |
+ | enable APIC
+ |
+ | apic_pending_intr_clear() <- Triggers warning because IRR is stale
+ |
+ | When this happens then the deadline timer or the regular APIC timer -
+ | happens with both, has fired shortly before the APIC is disabled, but the
+ | interrupt was not serviced because the guest CPU was in an interrupt
+ | disabled region at that point.
+ |
+ | The state of the timer vector ISR/IRR bits:
+ |
+ | ISR IRR
+ | before apic_soft_disable() 0 1
+ | after apic_soft_disable() 0 1
+ |
+ | On startup 0 1
+ |
+ | Now one would assume that the IRR is cleared after the INIT reset, but this
+ | happens only on CPU0.
+ |
+ | Why?
+ |
+ | Because our CPU0 hotplug is just for testing to make sure nothing breaks
+ | and goes through an NMI wakeup vehicle because INIT would send it through
+ | the bootstrap code which is not really working if that CPU was not
+ | physically unplugged.
+ |
+ | Now looking at a real world APIC the situation in that case is:
+ |
+ | ISR IRR
+ | before apic_soft_disable() 0 1
+ | after apic_soft_disable() 0 1
+ |
+ | On startup 0 0
+ |
+ | Why?
+ |
+ | Once the dying CPU reenables interrupts the pending interrupt gets
+ | delivered as a spurious interrupt and then the state is clear.
+ |
+ | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
+ | emulation is still wrong. Even if the play_dead() code would not enable
+ | interrupts then the pending IRR bit would turn into an ISR .. interrupt
+ | when the APIC is reenabled on startup.
+
+From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
+* Pending interrupts in the IRR and ISR registers are held and require
+ masking or handling by the CPU.
+
+In Thomas's testing, a hardware CPU does not honor a software-disabled
+LAPIC when IRR has already been set or an APICv posted interrupt is in
+flight. So we can skip the software-disable check when clearing IRR and
+setting ISR, and continue to respect it when attempting to set IRR.
+
+Reported-by: Rong Chen <rong.a.chen@intel.com>
+Reported-by: Feng Tang <feng.tang@intel.com>
+Reported-by: Thomas Gleixner <tglx@linutronix.de>
+Tested-by: Thomas Gleixner <tglx@linutronix.de>
+Cc: Paolo Bonzini <pbonzini@redhat.com>
+Cc: Radim Krčmář <rkrcmar@redhat.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Rong Chen <rong.a.chen@intel.com>
+Cc: Feng Tang <feng.tang@intel.com>
+Cc: stable@vger.kernel.org
+Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kvm/lapic.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/x86/kvm/lapic.c
++++ b/arch/x86/kvm/lapic.c
+@@ -1992,7 +1992,7 @@ int kvm_apic_has_interrupt(struct kvm_vc
+ struct kvm_lapic *apic = vcpu->arch.apic;
+ int highest_irr;
+
+- if (!apic_enabled(apic))
++ if (!kvm_apic_hw_enabled(apic))
+ return -1;
+
+ apic_update_ppr(apic);