5.10-stable patches
author Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sun, 24 Oct 2021 12:00:58 +0000 (14:00 +0200)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sun, 24 Oct 2021 12:00:58 +0000 (14:00 +0200)
added patches:
kvm-nvmx-promptly-process-interrupts-delivered-while-in-guest-mode.patch
kvm-ppc-book3s-hv-fix-stack-handling-in-idle_kvm_start_guest.patch
kvm-ppc-book3s-hv-make-idle_kvm_start_guest-return-0-if-it-went-to-guest.patch
mm-slub-fix-incorrect-memcg-slab-count-for-bulk-free.patch
mm-slub-fix-mismatch-between-reconstructed-freelist-depth-and-cnt.patch
mm-slub-fix-potential-memoryleak-in-kmem_cache_open.patch
powerpc-idle-don-t-corrupt-back-chain-when-going-idle.patch
powerpc64-idle-fix-sp-offsets-when-saving-gprs.patch

queue-5.10/kvm-nvmx-promptly-process-interrupts-delivered-while-in-guest-mode.patch [new file with mode: 0644]
queue-5.10/kvm-ppc-book3s-hv-fix-stack-handling-in-idle_kvm_start_guest.patch [new file with mode: 0644]
queue-5.10/kvm-ppc-book3s-hv-make-idle_kvm_start_guest-return-0-if-it-went-to-guest.patch [new file with mode: 0644]
queue-5.10/mm-slub-fix-incorrect-memcg-slab-count-for-bulk-free.patch [new file with mode: 0644]
queue-5.10/mm-slub-fix-mismatch-between-reconstructed-freelist-depth-and-cnt.patch [new file with mode: 0644]
queue-5.10/mm-slub-fix-potential-memoryleak-in-kmem_cache_open.patch [new file with mode: 0644]
queue-5.10/powerpc-idle-don-t-corrupt-back-chain-when-going-idle.patch [new file with mode: 0644]
queue-5.10/powerpc64-idle-fix-sp-offsets-when-saving-gprs.patch [new file with mode: 0644]
queue-5.10/series

diff --git a/queue-5.10/kvm-nvmx-promptly-process-interrupts-delivered-while-in-guest-mode.patch b/queue-5.10/kvm-nvmx-promptly-process-interrupts-delivered-while-in-guest-mode.patch
new file mode 100644 (file)
index 0000000..b784e16
--- /dev/null
@@ -0,0 +1,51 @@
+From 3a25dfa67fe40f3a2690af2c562e0947a78bd6a0 Mon Sep 17 00:00:00 2001
+From: Paolo Bonzini <pbonzini@redhat.com>
+Date: Wed, 20 Oct 2021 06:22:59 -0400
+Subject: KVM: nVMX: promptly process interrupts delivered while in guest mode
+
+From: Paolo Bonzini <pbonzini@redhat.com>
+
+commit 3a25dfa67fe40f3a2690af2c562e0947a78bd6a0 upstream.
+
+Since commit c300ab9f08df ("KVM: x86: Replace late check_nested_events() hack with
+more precise fix") there is no longer the certainty that check_nested_events()
+tries to inject an external interrupt vmexit to L1 on every call to vcpu_enter_guest.
+Therefore, even in that case we need to set KVM_REQ_EVENT.  This ensures
+that inject_pending_event() is called, and from there kvm_check_nested_events().
+
+Fixes: c300ab9f08df ("KVM: x86: Replace late check_nested_events() hack with more precise fix")
+Cc: stable@vger.kernel.org
+Reviewed-by: Sean Christopherson <seanjc@google.com>
+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kvm/vmx/vmx.c |   17 ++++++-----------
+ 1 file changed, 6 insertions(+), 11 deletions(-)
+
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -6316,18 +6316,13 @@ static int vmx_sync_pir_to_irr(struct kv
+               /*
+                * If we are running L2 and L1 has a new pending interrupt
+-               * which can be injected, we should re-evaluate
+-               * what should be done with this new L1 interrupt.
+-               * If L1 intercepts external-interrupts, we should
+-               * exit from L2 to L1. Otherwise, interrupt should be
+-               * delivered directly to L2.
++               * which can be injected, this may cause a vmexit or it may
++               * be injected into L2.  Either way, this interrupt will be
++               * processed via KVM_REQ_EVENT, not RVI, because we do not use
++               * virtual interrupt delivery to inject L1 interrupts into L2.
+                */
+-              if (is_guest_mode(vcpu) && max_irr_updated) {
+-                      if (nested_exit_on_intr(vcpu))
+-                              kvm_vcpu_exiting_guest_mode(vcpu);
+-                      else
+-                              kvm_make_request(KVM_REQ_EVENT, vcpu);
+-              }
++              if (is_guest_mode(vcpu) && max_irr_updated)
++                      kvm_make_request(KVM_REQ_EVENT, vcpu);
+       } else {
+               max_irr = kvm_lapic_find_highest_irr(vcpu);
+       }
diff --git a/queue-5.10/kvm-ppc-book3s-hv-fix-stack-handling-in-idle_kvm_start_guest.patch b/queue-5.10/kvm-ppc-book3s-hv-fix-stack-handling-in-idle_kvm_start_guest.patch
new file mode 100644 (file)
index 0000000..2634549
--- /dev/null
@@ -0,0 +1,105 @@
+From 9b4416c5095c20e110c82ae602c254099b83b72f Mon Sep 17 00:00:00 2001
+From: Michael Ellerman <mpe@ellerman.id.au>
+Date: Fri, 15 Oct 2021 23:01:48 +1100
+Subject: KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest()
+
+From: Michael Ellerman <mpe@ellerman.id.au>
+
+commit 9b4416c5095c20e110c82ae602c254099b83b72f upstream.
+
+In commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in
+C") kvm_start_guest() became idle_kvm_start_guest(). The old code
+allocated a stack frame on the emergency stack, but didn't use the
+frame to store anything, and also didn't store anything in its caller's
+frame.
+
+idle_kvm_start_guest(), on the other hand, is written more like a normal C
+function: it creates a frame on entry, and also stores CR/LR into its
+caller's frame (per the ABI). The problem is that there is no caller
+frame on the emergency stack.
+
+The emergency stack for a given CPU is allocated with:
+
+  paca_ptrs[i]->emergency_sp = alloc_stack(limit, i) + THREAD_SIZE;
+
+So emergency_sp actually points to the first address above the emergency
+stack allocation for a given CPU; we must not store above it without
+first decrementing it to create a frame. This is different to the
+regular kernel stack, paca->kstack, which is initialised to point at an
+initial frame that is ready to use.
+
+idle_kvm_start_guest() stores the backchain, CR and LR, all of which
+write outside the allocation for the emergency stack. It then creates a
+stack frame and saves the non-volatile registers. Unfortunately the
+frame it creates is not large enough to fit the non-volatiles, and so
+the saving of the non-volatile registers also writes outside the
+emergency stack allocation.
+
+The end result is that we corrupt whatever is at 0-24 bytes, and 112-248
+bytes above the emergency stack allocation.
+
+In practice this has gone unnoticed because the memory immediately above
+the emergency stack happens to be used for other stack allocations,
+either another CPU's mc_emergency_sp or an IRQ stack. See the order of
+calls to irqstack_early_init() and emergency_stack_init().
+
+The low addresses of another stack are the top of that stack, and so are
+only used if that stack is under extreme pressure, which essentially
+never happens in practice - and if it did there's a high likelihood we'd
+crash due to that stack overflowing.
+
+Still, we shouldn't be corrupting someone else's stack, and it is purely
+luck that we aren't corrupting something else.
+
+To fix it, we save CR/LR into the caller's frame using the existing r1 on
+entry; we then create a SWITCH_FRAME_SIZE frame (which has space for
+pt_regs) on the emergency stack with the backchain pointing to the
+existing stack, and then finally we switch to the new frame on the
+emergency stack.
+
+Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
+Cc: stable@vger.kernel.org # v5.2+
+Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
+Link: https://lore.kernel.org/r/20211015133929.832061-1-mpe@ellerman.id.au
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/powerpc/kvm/book3s_hv_rmhandlers.S |   19 ++++++++++---------
+ 1 file changed, 10 insertions(+), 9 deletions(-)
+
+--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
++++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+@@ -292,13 +292,15 @@ kvm_novcpu_exit:
+  * r3 contains the SRR1 wakeup value, SRR1 is trashed.
+  */
+ _GLOBAL(idle_kvm_start_guest)
+-      ld      r4,PACAEMERGSP(r13)
+       mfcr    r5
+       mflr    r0
+-      std     r1,0(r4)
+-      std     r5,8(r4)
+-      std     r0,16(r4)
+-      subi    r1,r4,STACK_FRAME_OVERHEAD
++      std     r5, 8(r1)       // Save CR in caller's frame
++      std     r0, 16(r1)      // Save LR in caller's frame
++      // Create frame on emergency stack
++      ld      r4, PACAEMERGSP(r13)
++      stdu    r1, -SWITCH_FRAME_SIZE(r4)
++      // Switch to new frame on emergency stack
++      mr      r1, r4
+       SAVE_NVGPRS(r1)
+       /*
+@@ -444,10 +446,9 @@ kvm_no_guest:
+       /* set up r3 for return */
+       mfspr   r3,SPRN_SRR1
+       REST_NVGPRS(r1)
+-      addi    r1, r1, STACK_FRAME_OVERHEAD
+-      ld      r0, 16(r1)
+-      ld      r5, 8(r1)
+-      ld      r1, 0(r1)
++      ld      r1, 0(r1)       // Switch back to caller stack
++      ld      r0, 16(r1)      // Reload LR
++      ld      r5, 8(r1)       // Reload CR
+       mtlr    r0
+       mtcr    r5
+       blr
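
The stack-handling fix above hinges on emergency_sp being a one-past-the-end
pointer rather than a ready-to-use frame. As an illustration only (not kernel
code; THREAD_SIZE and SWITCH_FRAME_SIZE below are placeholder values), here is
a small stand-alone C sketch of why a frame must be created by moving the
pointer down before anything is stored:

#include <stdio.h>
#include <stdlib.h>

/* Placeholder sizes, chosen purely for illustration. */
#define THREAD_SIZE        (16 * 1024)
#define SWITCH_FRAME_SIZE  368

int main(void)
{
    /* Mirrors: paca_ptrs[i]->emergency_sp = alloc_stack(limit, i) + THREAD_SIZE; */
    unsigned char *alloc = malloc(THREAD_SIZE);
    if (!alloc)
        return 1;
    unsigned char *emergency_sp = alloc + THREAD_SIZE;

    /* emergency_sp is one past the end of the allocation, so any store at or
     * above it lands outside the stack - the corruption described above. */
    printf("stack allocation: [%p, %p)\n", (void *)alloc, (void *)emergency_sp);

    /* The fix: first create a frame by moving down SWITCH_FRAME_SIZE bytes
     * (the stdu in the patch), then store only within [frame, emergency_sp). */
    unsigned char *frame = emergency_sp - SWITCH_FRAME_SIZE;
    printf("new frame:        [%p, %p)\n", (void *)frame, (void *)emergency_sp);

    free(alloc);
    return 0;
}
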
diff --git a/queue-5.10/kvm-ppc-book3s-hv-make-idle_kvm_start_guest-return-0-if-it-went-to-guest.patch b/queue-5.10/kvm-ppc-book3s-hv-make-idle_kvm_start_guest-return-0-if-it-went-to-guest.patch
new file mode 100644 (file)
index 0000000..c23921e
--- /dev/null
@@ -0,0 +1,71 @@
+From cdeb5d7d890e14f3b70e8087e745c4a6a7d9f337 Mon Sep 17 00:00:00 2001
+From: Michael Ellerman <mpe@ellerman.id.au>
+Date: Fri, 15 Oct 2021 23:02:08 +1100
+Subject: KVM: PPC: Book3S HV: Make idle_kvm_start_guest() return 0 if it went to guest
+
+From: Michael Ellerman <mpe@ellerman.id.au>
+
+commit cdeb5d7d890e14f3b70e8087e745c4a6a7d9f337 upstream.
+
+We call idle_kvm_start_guest() from power7_offline() if the thread has
+been requested to enter KVM. We pass it the SRR1 value that was returned
+from power7_idle_insn() which tells us what sort of wakeup we're
+processing.
+
+Depending on the SRR1 value we pass in, the KVM code might enter the
+guest, or it might return to us to do some host action if the wakeup
+requires it.
+
+If idle_kvm_start_guest() is able to handle the wakeup and enter the
+guest, it is supposed to indicate that by returning a zero SRR1 value to
+us.
+
+That was the behaviour prior to commit 10d91611f426 ("powerpc/64s:
+Reimplement book3s idle code in C"); however, in that commit the
+handling of SRR1 was reworked, and the zeroing behaviour was lost.
+
+Returning from idle_kvm_start_guest() without zeroing the SRR1 value can
+confuse the host offline code, causing the guest to crash and other
+weirdness.
+
+Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
+Cc: stable@vger.kernel.org # v5.2+
+Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
+Link: https://lore.kernel.org/r/20211015133929.832061-2-mpe@ellerman.id.au
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/powerpc/kvm/book3s_hv_rmhandlers.S |    9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
++++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+@@ -301,6 +301,7 @@ _GLOBAL(idle_kvm_start_guest)
+       stdu    r1, -SWITCH_FRAME_SIZE(r4)
+       // Switch to new frame on emergency stack
+       mr      r1, r4
++      std     r3, 32(r1)      // Save SRR1 wakeup value
+       SAVE_NVGPRS(r1)
+       /*
+@@ -352,6 +353,10 @@ kvm_unsplit_wakeup:
+ kvm_secondary_got_guest:
++      // About to go to guest, clear saved SRR1
++      li      r0, 0
++      std     r0, 32(r1)
++
+       /* Set HSTATE_DSCR(r13) to something sensible */
+       ld      r6, PACA_DSCR_DEFAULT(r13)
+       std     r6, HSTATE_DSCR(r13)
+@@ -443,8 +448,8 @@ kvm_no_guest:
+       mfspr   r4, SPRN_LPCR
+       rlwimi  r4, r3, 0, LPCR_PECE0 | LPCR_PECE1
+       mtspr   SPRN_LPCR, r4
+-      /* set up r3 for return */
+-      mfspr   r3,SPRN_SRR1
++      // Return SRR1 wakeup value, or 0 if we went into the guest
++      ld      r3, 32(r1)
+       REST_NVGPRS(r1)
+       ld      r1, 0(r1)       // Switch back to caller stack
+       ld      r0, 16(r1)      // Reload LR
diff --git a/queue-5.10/mm-slub-fix-incorrect-memcg-slab-count-for-bulk-free.patch b/queue-5.10/mm-slub-fix-incorrect-memcg-slab-count-for-bulk-free.patch
new file mode 100644 (file)
index 0000000..e26e21e
--- /dev/null
@@ -0,0 +1,49 @@
+From 3ddd60268c24bcac9d744404cc277e9dc52fe6b6 Mon Sep 17 00:00:00 2001
+From: Miaohe Lin <linmiaohe@huawei.com>
+Date: Mon, 18 Oct 2021 15:16:06 -0700
+Subject: mm, slub: fix incorrect memcg slab count for bulk free
+
+From: Miaohe Lin <linmiaohe@huawei.com>
+
+commit 3ddd60268c24bcac9d744404cc277e9dc52fe6b6 upstream.
+
+kmem_cache_free_bulk() will call memcg_slab_free_hook() for all objects
+when doing bulk free.  So we shouldn't call memcg_slab_free_hook() again
+for bulk free, to avoid an incorrect memcg slab count.
+
+Link: https://lkml.kernel.org/r/20210916123920.48704-6-linmiaohe@huawei.com
+Fixes: d1b2cf6cb84a ("mm: memcg/slab: uncharge during kmem_cache_free_bulk()")
+Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
+Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
+Cc: Andrey Konovalov <andreyknvl@gmail.com>
+Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+Cc: Bharata B Rao <bharata@linux.ibm.com>
+Cc: Christoph Lameter <cl@linux.com>
+Cc: David Rientjes <rientjes@google.com>
+Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
+Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Cc: Kees Cook <keescook@chromium.org>
+Cc: Pekka Enberg <penberg@kernel.org>
+Cc: Roman Gushchin <guro@fb.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/slub.c |    4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -3100,7 +3100,9 @@ static __always_inline void do_slab_free
+       struct kmem_cache_cpu *c;
+       unsigned long tid;
+-      memcg_slab_free_hook(s, &head, 1);
++      /* memcg_slab_free_hook() is already called for bulk free. */
++      if (!tail)
++              memcg_slab_free_hook(s, &head, 1);
+ redo:
+       /*
+        * Determine the currently cpus per cpu slab.
diff --git a/queue-5.10/mm-slub-fix-mismatch-between-reconstructed-freelist-depth-and-cnt.patch b/queue-5.10/mm-slub-fix-mismatch-between-reconstructed-freelist-depth-and-cnt.patch
new file mode 100644 (file)
index 0000000..7de87ae
--- /dev/null
@@ -0,0 +1,72 @@
+From 899447f669da76cc3605665e1a95ee877bc464cc Mon Sep 17 00:00:00 2001
+From: Miaohe Lin <linmiaohe@huawei.com>
+Date: Mon, 18 Oct 2021 15:15:55 -0700
+Subject: mm, slub: fix mismatch between reconstructed freelist depth and cnt
+
+From: Miaohe Lin <linmiaohe@huawei.com>
+
+commit 899447f669da76cc3605665e1a95ee877bc464cc upstream.
+
+If an object's reuse is delayed, it will be excluded from the reconstructed
+freelist.  But we forgot to adjust the cnt accordingly.  So there will
+be a mismatch between reconstructed freelist depth and cnt.  This will
+lead to free_debug_processing() complaining about freelist count or an
+incorrect slub inuse count.
+
+Link: https://lkml.kernel.org/r/20210916123920.48704-3-linmiaohe@huawei.com
+Fixes: c3895391df38 ("kasan, slub: fix handling of kasan_slab_free hook")
+Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
+Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
+Cc: Andrey Konovalov <andreyknvl@gmail.com>
+Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+Cc: Bharata B Rao <bharata@linux.ibm.com>
+Cc: Christoph Lameter <cl@linux.com>
+Cc: David Rientjes <rientjes@google.com>
+Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
+Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Cc: Kees Cook <keescook@chromium.org>
+Cc: Pekka Enberg <penberg@kernel.org>
+Cc: Roman Gushchin <guro@fb.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/slub.c |   11 +++++++++--
+ 1 file changed, 9 insertions(+), 2 deletions(-)
+
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -1543,7 +1543,8 @@ static __always_inline bool slab_free_ho
+ }
+ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
+-                                         void **head, void **tail)
++                                         void **head, void **tail,
++                                         int *cnt)
+ {
+       void *object;
+@@ -1578,6 +1579,12 @@ static inline bool slab_free_freelist_ho
+                       *head = object;
+                       if (!*tail)
+                               *tail = object;
++              } else {
++                      /*
++                       * Adjust the reconstructed freelist depth
++                       * accordingly if object's reuse is delayed.
++                       */
++                      --(*cnt);
+               }
+       } while (object != old_tail);
+@@ -3137,7 +3144,7 @@ static __always_inline void slab_free(st
+        * With KASAN enabled slab_free_freelist_hook modifies the freelist
+        * to remove objects, whose reuse must be delayed.
+        */
+-      if (slab_free_freelist_hook(s, &head, &tail))
++      if (slab_free_freelist_hook(s, &head, &tail, &cnt))
+               do_slab_free(s, page, head, tail, cnt, addr);
+ }
diff --git a/queue-5.10/mm-slub-fix-potential-memoryleak-in-kmem_cache_open.patch b/queue-5.10/mm-slub-fix-potential-memoryleak-in-kmem_cache_open.patch
new file mode 100644 (file)
index 0000000..0e58d14
--- /dev/null
@@ -0,0 +1,47 @@
+From 9037c57681d25e4dcc442d940d6dbe24dd31f461 Mon Sep 17 00:00:00 2001
+From: Miaohe Lin <linmiaohe@huawei.com>
+Date: Mon, 18 Oct 2021 15:15:59 -0700
+Subject: mm, slub: fix potential memoryleak in kmem_cache_open()
+
+From: Miaohe Lin <linmiaohe@huawei.com>
+
+commit 9037c57681d25e4dcc442d940d6dbe24dd31f461 upstream.
+
+In the error path, the random_seq of the slub cache might be leaked.  Fix this
+by using __kmem_cache_release() to release all the relevant resources.
+
+Link: https://lkml.kernel.org/r/20210916123920.48704-4-linmiaohe@huawei.com
+Fixes: 210e7a43fa90 ("mm: SLUB freelist randomization")
+Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
+Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
+Cc: Andrey Konovalov <andreyknvl@gmail.com>
+Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+Cc: Bharata B Rao <bharata@linux.ibm.com>
+Cc: Christoph Lameter <cl@linux.com>
+Cc: David Rientjes <rientjes@google.com>
+Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
+Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Cc: Kees Cook <keescook@chromium.org>
+Cc: Pekka Enberg <penberg@kernel.org>
+Cc: Roman Gushchin <guro@fb.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/slub.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -3832,8 +3832,8 @@ static int kmem_cache_open(struct kmem_c
+       if (alloc_kmem_cache_cpus(s))
+               return 0;
+-      free_kmem_cache_nodes(s);
+ error:
++      __kmem_cache_release(s);
+       return -EINVAL;
+ }
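
The one-line change above switches the error path to the full teardown helper,
so the already-allocated random_seq is released too. As a rough, stand-alone C
sketch of that pattern (the struct and function names here are invented for
illustration, not the kernel's):

#include <stdlib.h>

struct cache {
    void *random_seq;  /* analogue of the freelist randomization sequence */
    void *nodes;       /* analogue of the per-node structures */
};

/* Single teardown helper: safe to call with any subset allocated. */
static void cache_release(struct cache *c)
{
    free(c->random_seq);
    free(c->nodes);
    c->random_seq = NULL;
    c->nodes = NULL;
}

static int cache_open(struct cache *c)
{
    c->random_seq = malloc(64);
    if (!c->random_seq)
        goto error;
    c->nodes = malloc(64);
    if (!c->nodes)
        goto error;
    return 0;
error:
    cache_release(c);  /* releases everything allocated so far, so nothing leaks */
    return -1;
}

int main(void)
{
    struct cache c = { 0 };
    if (cache_open(&c))
        return 1;
    cache_release(&c);
    return 0;
}
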
diff --git a/queue-5.10/powerpc-idle-don-t-corrupt-back-chain-when-going-idle.patch b/queue-5.10/powerpc-idle-don-t-corrupt-back-chain-when-going-idle.patch
new file mode 100644 (file)
index 0000000..78f0073
--- /dev/null
@@ -0,0 +1,67 @@
+From 496c5fe25c377ddb7815c4ce8ecfb676f051e9b6 Mon Sep 17 00:00:00 2001
+From: Michael Ellerman <mpe@ellerman.id.au>
+Date: Wed, 20 Oct 2021 20:48:26 +1100
+Subject: powerpc/idle: Don't corrupt back chain when going idle
+
+From: Michael Ellerman <mpe@ellerman.id.au>
+
+commit 496c5fe25c377ddb7815c4ce8ecfb676f051e9b6 upstream.
+
+In isa206_idle_insn_mayloss() we store various registers into the stack
+red zone, which is allowed.
+
+However inside the IDLE_STATE_ENTER_SEQ_NORET macro we save r2 again,
+to 0(r1), which corrupts the stack back chain.
+
+We used to do the same in isa206_idle_insn_mayloss() itself, but we
+fixed that in 73287caa9210 ("powerpc64/idle: Fix SP offsets when saving
+GPRs"), however we missed that the macro also corrupts the back chain.
+
+Corrupting the back chain is bad for debuggability but doesn't
+necessarily cause a bug.
+
+However we recently changed the stack handling in some KVM code, and it
+now relies on the stack back chain being valid when it returns. The
+corruption causes that code to return with r1 pointing somewhere in
+kernel data; at some point LR is restored from the stack and we branch
+to NULL or somewhere else invalid.
+
+This only affects Power8 hosts running KVM guests, with dynamic_mt_modes
+enabled (which it is by default).
+
+The fixes tag below points to the commit that changed the KVM stack
+handling, exposing this bug. The actual corruption of the back chain has
+always existed since 948cf67c4726 ("powerpc: Add NAP mode support on
+Power7 in HV mode").
+
+Fixes: 9b4416c5095c ("KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest()")
+Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
+Link: https://lore.kernel.org/r/20211020094826.3222052-1-mpe@ellerman.id.au
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/powerpc/kernel/idle_book3s.S |   10 ++++++----
+ 1 file changed, 6 insertions(+), 4 deletions(-)
+
+--- a/arch/powerpc/kernel/idle_book3s.S
++++ b/arch/powerpc/kernel/idle_book3s.S
+@@ -126,14 +126,16 @@ _GLOBAL(idle_return_gpr_loss)
+ /*
+  * This is the sequence required to execute idle instructions, as
+  * specified in ISA v2.07 (and earlier). MSR[IR] and MSR[DR] must be 0.
+- *
+- * The 0(r1) slot is used to save r2 in isa206, so use that here.
++ * We have to store a GPR somewhere, ptesync, then reload it, and create
++ * a false dependency on the result of the load. It doesn't matter which
++ * GPR we store, or where we store it. We have already stored r2 to the
++ * stack at -8(r1) in isa206_idle_insn_mayloss, so use that.
+  */
+ #define IDLE_STATE_ENTER_SEQ_NORET(IDLE_INST)                 \
+       /* Magic NAP/SLEEP/WINKLE mode enter sequence */        \
+-      std     r2,0(r1);                                       \
++      std     r2,-8(r1);                                      \
+       ptesync;                                                \
+-      ld      r2,0(r1);                                       \
++      ld      r2,-8(r1);                                      \
+ 236:  cmpd    cr0,r2,r2;                                      \
+       bne     236b;                                           \
+       IDLE_INST;                                              \
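
Both this patch and the idle_kvm_start_guest() fix above lean on the ppc64
stack-frame convention visible in the diffs: 0(r1) holds the caller's back
chain, CR and LR are saved at 8(r1) and 16(r1), and the red zone lies below
r1. A tiny illustrative C sketch of that header layout (offsets only, not
kernel code):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Frame header as used by the diffs above: storing r2 at 0(r1) clobbers the
 * back chain, while -8(r1) falls in the red zone below r1, which the commit
 * message above notes is allowed. */
struct ppc64_frame_header {
    uint64_t back_chain;  /* 0(r1): pointer to the previous frame */
    uint64_t cr_save;     /* 8(r1) */
    uint64_t lr_save;     /* 16(r1) */
};

int main(void)
{
    printf("back chain offset: %zu\n", offsetof(struct ppc64_frame_header, back_chain));
    printf("CR save offset:    %zu\n", offsetof(struct ppc64_frame_header, cr_save));
    printf("LR save offset:    %zu\n", offsetof(struct ppc64_frame_header, lr_save));
    return 0;
}
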
diff --git a/queue-5.10/powerpc64-idle-fix-sp-offsets-when-saving-gprs.patch b/queue-5.10/powerpc64-idle-fix-sp-offsets-when-saving-gprs.patch
new file mode 100644 (file)
index 0000000..4715db8
--- /dev/null
@@ -0,0 +1,194 @@
+From 73287caa9210ded6066833195f4335f7f688a46b Mon Sep 17 00:00:00 2001
+From: "Christopher M. Riedl" <cmr@codefail.de>
+Date: Sat, 6 Feb 2021 01:23:42 -0600
+Subject: powerpc64/idle: Fix SP offsets when saving GPRs
+
+From: Christopher M. Riedl <cmr@codefail.de>
+
+commit 73287caa9210ded6066833195f4335f7f688a46b upstream.
+
+The idle entry/exit code saves/restores GPRs in the stack "red zone"
+(Protected Zone according to PowerPC64 ELF ABI v2). However, the offset
+used for the first GPR is incorrect and overwrites the back chain - the
+Protected Zone actually starts below the current SP. In practice this is
+probably not an issue, but it's still incorrect so fix it.
+
+Also expand the comments to explain why using the stack "red zone"
+instead of creating a new stackframe is appropriate here.
+
+Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
+Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
+Link: https://lore.kernel.org/r/20210206072342.5067-1-cmr@codefail.de
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/powerpc/kernel/idle_book3s.S |  138 ++++++++++++++++++++------------------
+ 1 file changed, 73 insertions(+), 65 deletions(-)
+
+--- a/arch/powerpc/kernel/idle_book3s.S
++++ b/arch/powerpc/kernel/idle_book3s.S
+@@ -52,28 +52,32 @@ _GLOBAL(isa300_idle_stop_mayloss)
+       std     r1,PACAR1(r13)
+       mflr    r4
+       mfcr    r5
+-      /* use stack red zone rather than a new frame for saving regs */
+-      std     r2,-8*0(r1)
+-      std     r14,-8*1(r1)
+-      std     r15,-8*2(r1)
+-      std     r16,-8*3(r1)
+-      std     r17,-8*4(r1)
+-      std     r18,-8*5(r1)
+-      std     r19,-8*6(r1)
+-      std     r20,-8*7(r1)
+-      std     r21,-8*8(r1)
+-      std     r22,-8*9(r1)
+-      std     r23,-8*10(r1)
+-      std     r24,-8*11(r1)
+-      std     r25,-8*12(r1)
+-      std     r26,-8*13(r1)
+-      std     r27,-8*14(r1)
+-      std     r28,-8*15(r1)
+-      std     r29,-8*16(r1)
+-      std     r30,-8*17(r1)
+-      std     r31,-8*18(r1)
+-      std     r4,-8*19(r1)
+-      std     r5,-8*20(r1)
++      /*
++       * Use the stack red zone rather than a new frame for saving regs since
++       * in the case of no GPR loss the wakeup code branches directly back to
++       * the caller without deallocating the stack frame first.
++       */
++      std     r2,-8*1(r1)
++      std     r14,-8*2(r1)
++      std     r15,-8*3(r1)
++      std     r16,-8*4(r1)
++      std     r17,-8*5(r1)
++      std     r18,-8*6(r1)
++      std     r19,-8*7(r1)
++      std     r20,-8*8(r1)
++      std     r21,-8*9(r1)
++      std     r22,-8*10(r1)
++      std     r23,-8*11(r1)
++      std     r24,-8*12(r1)
++      std     r25,-8*13(r1)
++      std     r26,-8*14(r1)
++      std     r27,-8*15(r1)
++      std     r28,-8*16(r1)
++      std     r29,-8*17(r1)
++      std     r30,-8*18(r1)
++      std     r31,-8*19(r1)
++      std     r4,-8*20(r1)
++      std     r5,-8*21(r1)
+       /* 168 bytes */
+       PPC_STOP
+       b       .       /* catch bugs */
+@@ -89,8 +93,8 @@ _GLOBAL(isa300_idle_stop_mayloss)
+  */
+ _GLOBAL(idle_return_gpr_loss)
+       ld      r1,PACAR1(r13)
+-      ld      r4,-8*19(r1)
+-      ld      r5,-8*20(r1)
++      ld      r4,-8*20(r1)
++      ld      r5,-8*21(r1)
+       mtlr    r4
+       mtcr    r5
+       /*
+@@ -98,25 +102,25 @@ _GLOBAL(idle_return_gpr_loss)
+        * from PACATOC. This could be avoided for that less common case
+        * if KVM saved its r2.
+        */
+-      ld      r2,-8*0(r1)
+-      ld      r14,-8*1(r1)
+-      ld      r15,-8*2(r1)
+-      ld      r16,-8*3(r1)
+-      ld      r17,-8*4(r1)
+-      ld      r18,-8*5(r1)
+-      ld      r19,-8*6(r1)
+-      ld      r20,-8*7(r1)
+-      ld      r21,-8*8(r1)
+-      ld      r22,-8*9(r1)
+-      ld      r23,-8*10(r1)
+-      ld      r24,-8*11(r1)
+-      ld      r25,-8*12(r1)
+-      ld      r26,-8*13(r1)
+-      ld      r27,-8*14(r1)
+-      ld      r28,-8*15(r1)
+-      ld      r29,-8*16(r1)
+-      ld      r30,-8*17(r1)
+-      ld      r31,-8*18(r1)
++      ld      r2,-8*1(r1)
++      ld      r14,-8*2(r1)
++      ld      r15,-8*3(r1)
++      ld      r16,-8*4(r1)
++      ld      r17,-8*5(r1)
++      ld      r18,-8*6(r1)
++      ld      r19,-8*7(r1)
++      ld      r20,-8*8(r1)
++      ld      r21,-8*9(r1)
++      ld      r22,-8*10(r1)
++      ld      r23,-8*11(r1)
++      ld      r24,-8*12(r1)
++      ld      r25,-8*13(r1)
++      ld      r26,-8*14(r1)
++      ld      r27,-8*15(r1)
++      ld      r28,-8*16(r1)
++      ld      r29,-8*17(r1)
++      ld      r30,-8*18(r1)
++      ld      r31,-8*19(r1)
+       blr
+ /*
+@@ -154,28 +158,32 @@ _GLOBAL(isa206_idle_insn_mayloss)
+       std     r1,PACAR1(r13)
+       mflr    r4
+       mfcr    r5
+-      /* use stack red zone rather than a new frame for saving regs */
+-      std     r2,-8*0(r1)
+-      std     r14,-8*1(r1)
+-      std     r15,-8*2(r1)
+-      std     r16,-8*3(r1)
+-      std     r17,-8*4(r1)
+-      std     r18,-8*5(r1)
+-      std     r19,-8*6(r1)
+-      std     r20,-8*7(r1)
+-      std     r21,-8*8(r1)
+-      std     r22,-8*9(r1)
+-      std     r23,-8*10(r1)
+-      std     r24,-8*11(r1)
+-      std     r25,-8*12(r1)
+-      std     r26,-8*13(r1)
+-      std     r27,-8*14(r1)
+-      std     r28,-8*15(r1)
+-      std     r29,-8*16(r1)
+-      std     r30,-8*17(r1)
+-      std     r31,-8*18(r1)
+-      std     r4,-8*19(r1)
+-      std     r5,-8*20(r1)
++      /*
++       * Use the stack red zone rather than a new frame for saving regs since
++       * in the case of no GPR loss the wakeup code branches directly back to
++       * the caller without deallocating the stack frame first.
++       */
++      std     r2,-8*1(r1)
++      std     r14,-8*2(r1)
++      std     r15,-8*3(r1)
++      std     r16,-8*4(r1)
++      std     r17,-8*5(r1)
++      std     r18,-8*6(r1)
++      std     r19,-8*7(r1)
++      std     r20,-8*8(r1)
++      std     r21,-8*9(r1)
++      std     r22,-8*10(r1)
++      std     r23,-8*11(r1)
++      std     r24,-8*12(r1)
++      std     r25,-8*13(r1)
++      std     r26,-8*14(r1)
++      std     r27,-8*15(r1)
++      std     r28,-8*16(r1)
++      std     r29,-8*17(r1)
++      std     r30,-8*18(r1)
++      std     r31,-8*19(r1)
++      std     r4,-8*20(r1)
++      std     r5,-8*21(r1)
+       cmpwi   r3,PNV_THREAD_NAP
+       bne     1f
+       IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
diff --git a/queue-5.10/series b/queue-5.10/series
index bd9ce6b31bc6f355761447778fd1c94c4ec94274..8db481299d5776ebda4b29a5b61651b548ef1460 100644 (file)
@@ -52,3 +52,11 @@ alsa-hda-realtek-add-quirk-for-clevo-pc50hs.patch
 asoc-dapm-fix-missing-kctl-change-notifications.patch
 audit-fix-possible-null-pointer-dereference-in-audit_filter_rules.patch
 net-dsa-mt7530-correct-ds-num_ports.patch
+powerpc64-idle-fix-sp-offsets-when-saving-gprs.patch
+kvm-ppc-book3s-hv-fix-stack-handling-in-idle_kvm_start_guest.patch
+kvm-ppc-book3s-hv-make-idle_kvm_start_guest-return-0-if-it-went-to-guest.patch
+powerpc-idle-don-t-corrupt-back-chain-when-going-idle.patch
+mm-slub-fix-mismatch-between-reconstructed-freelist-depth-and-cnt.patch
+mm-slub-fix-potential-memoryleak-in-kmem_cache_open.patch
+mm-slub-fix-incorrect-memcg-slab-count-for-bulk-free.patch
+kvm-nvmx-promptly-process-interrupts-delivered-while-in-guest-mode.patch