--- /dev/null
+From bb34e690e9340bc155ebed5a3d75fc63ff69e082 Mon Sep 17 00:00:00 2001
+From: Wanpeng Li <wanpengli@tencent.com>
+Date: Tue, 2 Jul 2019 17:25:02 +0800
+Subject: KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Wanpeng Li <wanpengli@tencent.com>
+
+commit bb34e690e9340bc155ebed5a3d75fc63ff69e082 upstream.
+
+Thomas reported that:
+
+ | Background:
+ |
+ | In preparation of supporting IPI shorthands I changed the CPU offline
+ | code to software disable the local APIC instead of just masking it.
+ | That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
+ | register.
+ |
+ | Failure:
+ |
+ | When the CPU comes back online the startup code triggers occasionally
+ | the warning in apic_pending_intr_clear(). That complains that the IRRs
+ | are not empty.
+ |
+ | The offending vector is the local APIC timer vector whose IRR bit is set
+ | and stays set.
+ |
+ | It took me quite some time to reproduce the issue locally, but now I can
+ | see what happens.
+ |
+ | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
+ | (and hardware support) it behaves correctly.
+ |
+ | Here is the series of events:
+ |
+ |   Guest CPU
+ |
+ |   goes down
+ |
+ |     native_cpu_disable()
+ |
+ |       apic_soft_disable();
+ |
+ |   play_dead()
+ |
+ |   ....
+ |
+ |   startup()
+ |
+ |     if (apic_enabled())
+ |       apic_pending_intr_clear()   <- Not taken
+ |
+ |     enable APIC
+ |
+ |       apic_pending_intr_clear()   <- Triggers warning because IRR is stale
+ |
+ | When this happens then the deadline timer or the regular APIC timer -
+ | happens with both, has fired shortly before the APIC is disabled, but the
+ | interrupt was not serviced because the guest CPU was in an interrupt
+ | disabled region at that point.
+ |
+ | The state of the timer vector ISR/IRR bits:
+ |
+ |                                 ISR     IRR
+ |   before apic_soft_disable()    0       1
+ |   after apic_soft_disable()     0       1
+ |
+ |   On startup                    0       1
+ |
+ | Now one would assume that the IRR is cleared after the INIT reset, but this
+ | happens only on CPU0.
+ |
+ | Why?
+ |
+ | Because our CPU0 hotplug is just for testing to make sure nothing breaks
+ | and goes through an NMI wakeup vehicle because INIT would send it through
+ | the bootstrap code which is not really working if that CPU was not
+ | physically unplugged.
+ |
+ | Now looking at a real world APIC the situation in that case is:
+ |
+ |                                 ISR     IRR
+ |   before apic_soft_disable()    0       1
+ |   after apic_soft_disable()     0       1
+ |
+ |   On startup                    0       0
+ |
+ | Why?
+ |
+ | Once the dying CPU reenables interrupts the pending interrupt gets
+ | delivered as a spurious interrupt and then the state is clear.
+ |
+ | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
+ | emulation is still wrong. Even if the play_dead() code would not enable
+ | interrupts then the pending IRR bit would turn into an ISR .. interrupt
+ | when the APIC is reenabled on startup.
+
+From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
+* Pending interrupts in the IRR and ISR registers are held and require
+ masking or handling by the CPU.
+
+In Thomas's testing, a hardware CPU does not respect a software-disabled
+LAPIC when an IRR bit has already been set or an APICv posted interrupt is
+in flight. So skip the software-disable check when clearing IRR and setting
+ISR, and continue to respect the software-disabled APIC when attempting to
+set IRR.
+
+Reported-by: Rong Chen <rong.a.chen@intel.com>
+Reported-by: Feng Tang <feng.tang@intel.com>
+Reported-by: Thomas Gleixner <tglx@linutronix.de>
+Tested-by: Thomas Gleixner <tglx@linutronix.de>
+Cc: Paolo Bonzini <pbonzini@redhat.com>
+Cc: Radim Krčmář <rkrcmar@redhat.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Rong Chen <rong.a.chen@intel.com>
+Cc: Feng Tang <feng.tang@intel.com>
+Cc: stable@vger.kernel.org
+Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kvm/lapic.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/x86/kvm/lapic.c
++++ b/arch/x86/kvm/lapic.c
+@@ -2161,7 +2161,7 @@ int kvm_apic_has_interrupt(struct kvm_vc
+ struct kvm_lapic *apic = vcpu->arch.apic;
+ u32 ppr;
+
+- if (!apic_enabled(apic))
++ if (!kvm_apic_hw_enabled(apic))
+ return -1;
+
+ __apic_update_ppr(apic, &ppr);
--- /dev/null
+From 1e091c3bbf51d34d5d96337a59ce5ab2ac3ba2cc Mon Sep 17 00:00:00 2001
+From: Chuck Lever <chuck.lever@oracle.com>
+Date: Tue, 11 Jun 2019 11:01:16 -0400
+Subject: svcrdma: Ignore source port when computing DRC hash
+
+From: Chuck Lever <chuck.lever@oracle.com>
+
+commit 1e091c3bbf51d34d5d96337a59ce5ab2ac3ba2cc upstream.
+
+The DRC appears to be effectively empty after an RPC/RDMA transport
+reconnect. The problem is that each connection uses a different
+source port, which defeats the DRC hash.
+
+Clients always have to disconnect before they send retransmissions
+to reset the connection's credit accounting, thus every retransmit
+on NFS/RDMA will miss the DRC.
+
+An NFS/RDMA client's IP source port is meaningless for RDMA
+transports. The transport layer typically sets the source port value
+on the connection to a random ephemeral port. The server already
+ignores it for the "secure port" check. See commit 16e4d93f6de7
+("NFSD: Ignore client's source port on RDMA transports").
+
+The Linux NFS server's DRC resolves XID collisions from the same
+source IP address by using the checksum of the first 200 bytes of
+the RPC call header.
+
+Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
+Cc: stable@vger.kernel.org # v4.14+
+Signed-off-by: J. Bruce Fields <bfields@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/sunrpc/xprtrdma/svc_rdma_transport.c | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
++++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
+@@ -524,9 +524,14 @@ static void handle_connect_req(struct rd
+ /* Save client advertised inbound read limit for use later in accept. */
+ newxprt->sc_ord = param->initiator_depth;
+
+- /* Set the local and remote addresses in the transport */
+ sa = (struct sockaddr *)&newxprt->sc_cm_id->route.addr.dst_addr;
+ svc_xprt_set_remote(&newxprt->sc_xprt, sa, svc_addr_len(sa));
++ /* The remote port is arbitrary and not under the control of the
++ * client ULP. Set it to a fixed value so that the DRC continues
++ * to be effective after a reconnect.
++ */
++ rpc_set_port((struct sockaddr *)&newxprt->sc_xprt.xpt_remote, 0);
++
+ sa = (struct sockaddr *)&newxprt->sc_cm_id->route.addr.src_addr;
+ svc_xprt_set_local(&newxprt->sc_xprt, sa, svc_addr_len(sa));
+