git.ipfire.org Git - thirdparty/kernel/stable-queue.git/commitdiff
4.14-stable patches
author Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 6 Jul 2019 05:12:00 +0000 (07:12 +0200)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 6 Jul 2019 05:12:00 +0000 (07:12 +0200)
added patches:
kvm-lapic-fix-pending-interrupt-in-irr-blocked-by-software-disable-lapic.patch
svcrdma-ignore-source-port-when-computing-drc-hash.patch

queue-4.14/kvm-lapic-fix-pending-interrupt-in-irr-blocked-by-software-disable-lapic.patch [new file with mode: 0644]
queue-4.14/series
queue-4.14/svcrdma-ignore-source-port-when-computing-drc-hash.patch [new file with mode: 0644]

diff --git a/queue-4.14/kvm-lapic-fix-pending-interrupt-in-irr-blocked-by-software-disable-lapic.patch b/queue-4.14/kvm-lapic-fix-pending-interrupt-in-irr-blocked-by-software-disable-lapic.patch
new file mode 100644
index 0000000..21f463e
--- /dev/null
@@ -0,0 +1,138 @@
+From bb34e690e9340bc155ebed5a3d75fc63ff69e082 Mon Sep 17 00:00:00 2001
+From: Wanpeng Li <wanpengli@tencent.com>
+Date: Tue, 2 Jul 2019 17:25:02 +0800
+Subject: KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Wanpeng Li <wanpengli@tencent.com>
+
+commit bb34e690e9340bc155ebed5a3d75fc63ff69e082 upstream.
+
+Thomas reported that:
+
+ | Background:
+ |
+ |    In preparation of supporting IPI shorthands I changed the CPU offline
+ |    code to software disable the local APIC instead of just masking it.
+ |    That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
+ |    register.
+ |
+ | Failure:
+ |
+ |    When the CPU comes back online the startup code triggers occasionally
+ |    the warning in apic_pending_intr_clear(). That complains that the IRRs
+ |    are not empty.
+ |
+ |    The offending vector is the local APIC timer vector whose IRR bit is set
+ |    and stays set.
+ |
+ | It took me quite some time to reproduce the issue locally, but now I can
+ | see what happens.
+ |
+ | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
+ | (and hardware support) it behaves correctly.
+ |
+ | Here is the series of events:
+ |
+ |     Guest CPU
+ |
+ |     goes down
+ |
+ |       native_cpu_disable()
+ |
+ |                     apic_soft_disable();
+ |
+ |     play_dead()
+ |
+ |     ....
+ |
+ |     startup()
+ |
+ |       if (apic_enabled())
+ |         apic_pending_intr_clear()   <- Not taken
+ |
+ |      enable APIC
+ |
+ |         apic_pending_intr_clear()   <- Triggers warning because IRR is stale
+ |
+ | When this happens then the deadline timer or the regular APIC timer -
+ | happens with both, has fired shortly before the APIC is disabled, but the
+ | interrupt was not serviced because the guest CPU was in an interrupt
+ | disabled region at that point.
+ |
+ | The state of the timer vector ISR/IRR bits:
+ |
+ |                               ISR     IRR
+ | before apic_soft_disable()    0       1
+ | after apic_soft_disable()     0       1
+ |
+ | On startup                    0       1
+ |
+ | Now one would assume that the IRR is cleared after the INIT reset, but this
+ | happens only on CPU0.
+ |
+ | Why?
+ |
+ | Because our CPU0 hotplug is just for testing to make sure nothing breaks
+ | and goes through an NMI wakeup vehicle because INIT would send it through
+ | the bootstrap code which is not really working if that CPU was not
+ | physically unplugged.
+ |
+ | Now looking at a real world APIC the situation in that case is:
+ |
+ |                               ISR     IRR
+ | before apic_soft_disable()    0       1
+ | after apic_soft_disable()     0       1
+ |
+ | On startup                    0       0
+ |
+ | Why?
+ |
+ | Once the dying CPU reenables interrupts the pending interrupt gets
+ | delivered as a spurious interrupt and then the state is clear.
+ |
+ | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
+ | emulation is still wrong. Even if the play_dead() code would not enable
+ | interrupts then the pending IRR bit would turn into an ISR .. interrupt
+ | when the APIC is reenabled on startup.
+
+From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
+* Pending interrupts in the IRR and ISR registers are held and require
+  masking or handling by the CPU.
+
+In Thomas's testing, a hardware CPU does not honor a software-disabled LAPIC
+when an IRR bit is already set or an APICv posted interrupt is in flight.
+So skip the software-disable check when clearing IRR and setting ISR, and
+continue to honor it when attempting to set IRR.
+
+Reported-by: Rong Chen <rong.a.chen@intel.com>
+Reported-by: Feng Tang <feng.tang@intel.com>
+Reported-by: Thomas Gleixner <tglx@linutronix.de>
+Tested-by: Thomas Gleixner <tglx@linutronix.de>
+Cc: Paolo Bonzini <pbonzini@redhat.com>
+Cc: Radim Krčmář <rkrcmar@redhat.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Rong Chen <rong.a.chen@intel.com>
+Cc: Feng Tang <feng.tang@intel.com>
+Cc: stable@vger.kernel.org
+Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kvm/lapic.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/x86/kvm/lapic.c
++++ b/arch/x86/kvm/lapic.c
+@@ -2161,7 +2161,7 @@ int kvm_apic_has_interrupt(struct kvm_vc
+       struct kvm_lapic *apic = vcpu->arch.apic;
+       u32 ppr;
+-      if (!apic_enabled(apic))
++      if (!kvm_apic_hw_enabled(apic))
+               return -1;
+       __apic_update_ppr(apic, &ppr);
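The effect of the one-line change above can be sketched with a toy model. This is an illustration only, not KVM code: struct toy_lapic, its fields, and both functions are hypothetical names, and priority (PPR) handling is left out. It assumes that the old check, apic_enabled(), requires both the hardware enable and the software enable, while the new check requires only the hardware enable.

```c
#include <stdbool.h>

/* Toy model of the state that matters here (hypothetical, not struct kvm_lapic). */
struct toy_lapic {
	bool hw_enabled;     /* stands in for the global enable in the APIC base MSR */
	bool sw_enabled;     /* stands in for APIC_SPIV_APIC_ENABLED */
	int  pending_vector; /* highest vector set in IRR, or -1 if none */
};

/* Old behaviour: a software-disabled APIC hides a vector already latched in IRR. */
static int toy_has_interrupt_old(const struct toy_lapic *apic)
{
	if (!(apic->hw_enabled && apic->sw_enabled))
		return -1;
	return apic->pending_vector;
}

/* New behaviour: only the hardware enable gates delivery of a latched vector. */
static int toy_has_interrupt_new(const struct toy_lapic *apic)
{
	if (!apic->hw_enabled)
		return -1;
	return apic->pending_vector;
}
```

With hw_enabled set, sw_enabled clear, and a timer vector pending, the old function reports no interrupt, so the IRR bit stays set across the offline/online cycle. The new function reports the vector, so it can be delivered once the guest re-enables interrupts.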
diff --git a/queue-4.14/series b/queue-4.14/series
index 08a5d245d1e9d5e691da459ad406c20c49cf8e09..60c3428bfc1da5eecefec875cecfb0bfff66edc0 100644
@@ -48,3 +48,5 @@ vhost-scsi-add-weight-support.patch
 tty-rocket-fix-incorrect-forward-declaration-of-rp_i.patch
 arc-handle-gcc-generated-__builtin_trap-for-older-compiler.patch
 kvm-x86-degrade-warn-to-pr_warn_ratelimited.patch
+kvm-lapic-fix-pending-interrupt-in-irr-blocked-by-software-disable-lapic.patch
+svcrdma-ignore-source-port-when-computing-drc-hash.patch
diff --git a/queue-4.14/svcrdma-ignore-source-port-when-computing-drc-hash.patch b/queue-4.14/svcrdma-ignore-source-port-when-computing-drc-hash.patch
new file mode 100644
index 0000000..2c70ac7
--- /dev/null
@@ -0,0 +1,54 @@
+From 1e091c3bbf51d34d5d96337a59ce5ab2ac3ba2cc Mon Sep 17 00:00:00 2001
+From: Chuck Lever <chuck.lever@oracle.com>
+Date: Tue, 11 Jun 2019 11:01:16 -0400
+Subject: svcrdma: Ignore source port when computing DRC hash
+
+From: Chuck Lever <chuck.lever@oracle.com>
+
+commit 1e091c3bbf51d34d5d96337a59ce5ab2ac3ba2cc upstream.
+
+The DRC appears to be effectively empty after an RPC/RDMA transport
+reconnect. The problem is that each connection uses a different
+source port, which defeats the DRC hash.
+
+Clients always have to disconnect before they send retransmissions
+to reset the connection's credit accounting, thus every retransmit
+on NFS/RDMA will miss the DRC.
+
+An NFS/RDMA client's IP source port is meaningless for RDMA
+transports. The transport layer typically sets the source port value
+on the connection to a random ephemeral port. The server already
+ignores it for the "secure port" check. See commit 16e4d93f6de7
+("NFSD: Ignore client's source port on RDMA transports").
+
+The Linux NFS server's DRC resolves XID collisions from the same
+source IP address by using the checksum of the first 200 bytes of
+the RPC call header.
+
+Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
+Cc: stable@vger.kernel.org # v4.14+
+Signed-off-by: J. Bruce Fields <bfields@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/sunrpc/xprtrdma/svc_rdma_transport.c |    7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
++++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
+@@ -524,9 +524,14 @@ static void handle_connect_req(struct rd
+       /* Save client advertised inbound read limit for use later in accept. */
+       newxprt->sc_ord = param->initiator_depth;
+-      /* Set the local and remote addresses in the transport */
+       sa = (struct sockaddr *)&newxprt->sc_cm_id->route.addr.dst_addr;
+       svc_xprt_set_remote(&newxprt->sc_xprt, sa, svc_addr_len(sa));
++      /* The remote port is arbitrary and not under the control of the
++       * client ULP. Set it to a fixed value so that the DRC continues
++       * to be effective after a reconnect.
++       */
++      rpc_set_port((struct sockaddr *)&newxprt->sc_xprt.xpt_remote, 0);
++
+       sa = (struct sockaddr *)&newxprt->sc_cm_id->route.addr.src_addr;
+       svc_xprt_set_local(&newxprt->sc_xprt, sa, svc_addr_len(sa));
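Why zeroing the remote port keeps the DRC effective can be sketched with a toy lookup key. This is an illustration only, not the nfsd implementation: struct toy_drc_key and both helpers are hypothetical. It assumes that a cache entry matches only if address, port, and XID are all equal, and it ignores the checksum and other fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy duplicate-reply-cache key (hypothetical; the real cache compares more fields). */
struct toy_drc_key {
	uint32_t addr; /* client IPv4 address */
	uint16_t port; /* client source port as recorded on the transport */
	uint32_t xid;  /* RPC transaction ID */
};

/* Build the key; for RDMA transports the port is forced to a fixed value (0). */
static struct toy_drc_key toy_make_key(uint32_t addr, uint16_t port,
				       uint32_t xid, bool is_rdma)
{
	struct toy_drc_key key = { addr, is_rdma ? 0 : port, xid };
	return key;
}

/* A retransmission hits the cache only if every field matches. */
static bool toy_drc_key_equal(const struct toy_drc_key *a,
			      const struct toy_drc_key *b)
{
	return a->addr == b->addr && a->port == b->port && a->xid == b->xid;
}
```

In this model, a call sent from port 40001 and its retransmission from port 40002 after a reconnect produce different keys when the port is kept, and identical keys when it is forced to 0.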