From bf580969dcf664da11d3d275cf51563372a0e36f Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Thu, 20 Mar 2014 11:38:29 -0700
Subject: [PATCH] 3.10-stable patches

added patches:
	ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch
	kvm-svm-fix-cr8-intercept-window.patch
	pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch
	vmxnet3-fix-building-without-config_pci_msi.patch
	vmxnet3-fix-netpoll-race-condition.patch
---
 ...gs-in-msgrcv-msg_copy-implementation.patch | 120 ++++++++++++++++++
 .../kvm-svm-fix-cr8-intercept-window.patch    |  52 ++++++++
 ...vice-only-when-msi-msi-x-not-enabled.patch |  47 +++++++
 queue-3.10/series                             |   5 +
 ...-fix-building-without-config_pci_msi.patch |  51 ++++++++
 .../vmxnet3-fix-netpoll-race-condition.patch  |  78 ++++++++++++
 6 files changed, 353 insertions(+)
 create mode 100644 queue-3.10/ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch
 create mode 100644 queue-3.10/kvm-svm-fix-cr8-intercept-window.patch
 create mode 100644 queue-3.10/pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch
 create mode 100644 queue-3.10/vmxnet3-fix-building-without-config_pci_msi.patch
 create mode 100644 queue-3.10/vmxnet3-fix-netpoll-race-condition.patch

diff --git a/queue-3.10/ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch b/queue-3.10/ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch
new file mode 100644
index 00000000000..162f43b462e
--- /dev/null
+++ b/queue-3.10/ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch
@@ -0,0 +1,120 @@
+From 4f87dac386cc43d5525da7a939d4b4e7edbea22c Mon Sep 17 00:00:00 2001
+From: Michael Kerrisk
+Date: Mon, 10 Mar 2014 14:46:07 +0100
+Subject: ipc: Fix 2 bugs in msgrcv() MSG_COPY implementation
+
+From: Michael Kerrisk
+
+commit 4f87dac386cc43d5525da7a939d4b4e7edbea22c upstream.
+
+While testing and documenting the msgrcv() MSG_COPY flag that Stanislav
+Kinsbursky added in commit 4a674f34ba04 ("ipc: introduce message queue
+copy feature" => kernel 3.8), I discovered a couple of bugs in the
+implementation. The two bugs concern MSG_COPY interactions with other
+msgrcv() flags, namely:
+
+    (A) MSG_COPY + MSG_EXCEPT
+    (B) MSG_COPY + !IPC_NOWAIT
+
+The bugs are distinct (and the fix for the first one is obvious),
+however my fix for both is a single-line patch, which is why I'm
+combining them in a single mail, rather than writing two mails+patches.
+
+    ===== (A) MSG_COPY + MSG_EXCEPT =====
+
+With the addition of the MSG_COPY flag, there are now two msgrcv()
+flags--MSG_COPY and MSG_EXCEPT--that modify the meaning of the 'msgtyp'
+argument in unrelated ways. Specifying both in the same call is a
+logical error that is currently permitted, with the effect that MSG_COPY
+has priority and MSG_EXCEPT is ignored. The call should give an error
+if both flags are specified. The patch below implements that behavior.
+
+    ===== (B) MSG_COPY + !IPC_NOWAIT =====
+
+The test code that was submitted in commit 3a665531a3b7 ("selftests: IPC
+message queue copy feature test") shows MSG_COPY being used in
+conjunction with IPC_NOWAIT. In other words, if there is no message at
+the position 'msgtyp', return immediately with the error ENOMSG.
+
+What was not (fully) tested is the behavior if MSG_COPY is specified
+*without* IPC_NOWAIT, and there is an odd behavior. If the queue
+contains fewer than 'msgtyp' messages, then the call blocks until the
+next message is written to the queue. At that point, the msgrcv() call
+returns a copy of the newly added message, regardless of whether that
+message is at the ordinal position 'msgtyp'. This is clearly bogus, and
+problematic for applications that might want to make use of the MSG_COPY
+flag.
+
+I considered the following possible solutions to this problem:
+
+ (1) Force the call to block until a message *does* appear at the
+     position 'msgtyp'.
+
+ (2) If the MSG_COPY flag is specified, the kernel should implicitly add
+     IPC_NOWAIT, so that the call fails with ENOMSG for this case.
+
+ (3) If the MSG_COPY flag is specified, but IPC_NOWAIT is not, generate
+     an error (probably, EINVAL is the right one).
+
+I do not know if any application would really want to have the
+functionality of solution (1), especially since an application can
+determine in advance the number of messages in the queue using msgctl()
+IPC_STAT. Obviously, this solution would be the most work to implement.
+
+Solution (2) would have the effect of silently fixing any applications
+that tried to employ broken behavior. However, it would mean that if we
+later decided to implement solution (1), then user-space could not
+easily detect what the kernel supports (but, since I'm somewhat doubtful
+that solution (1) is needed, I'm not sure that this is much of a
+problem).
+
+Solution (3) would have the effect of informing broken applications that
+they are doing something broken. The downside is that this would cause
+an ABI breakage for any applications that are currently employing the
+broken behavior. However:
+
+a) Those applications are almost certainly not getting the results they
+   expect.
+b) Possibly, those applications don't even exist, because MSG_COPY is
+   currently hidden behind CONFIG_CHECKPOINT_RESTORE.
+
+The upside of solution (3) is that if we later decided to implement
+solution (1), user-space could determine what the kernel supports, via
+the error return.
+
+In my view, solution (3) is mildly preferable to solution (2), and
+solution (1) could still be done later if anyone really cares. The
+patch below implements solution (3).
+
+PS. For anyone out there still listening, it's the usual story:
+documenting an API (and the thinking about, and the testing of the API,
+that documentation entails) is one of the single best ways of
+finding bugs in the API, as I've learned from a lot of experience. Best
+to do that documentation before releasing the API.
+
+Signed-off-by: Michael Kerrisk
+Acked-by: Stanislav Kinsbursky
+Cc: Stanislav Kinsbursky
+Cc: Serge Hallyn
+Cc: "Eric W. Biederman"
+Cc: Pavel Emelyanov
+Cc: Al Viro
+Cc: KOSAKI Motohiro
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ ipc/msg.c |    2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/ipc/msg.c
++++ b/ipc/msg.c
+@@ -885,6 +885,8 @@ long do_msgrcv(int msqid, void __user *b
+ 		return -EINVAL;
+ 
+ 	if (msgflg & MSG_COPY) {
++		if ((msgflg & MSG_EXCEPT) || !(msgflg & IPC_NOWAIT))
++			return -EINVAL;
+ 		copy = prepare_copy(buf, min_t(size_t, bufsz, ns->msg_ctlmax));
+ 		if (IS_ERR(copy))
+ 			return PTR_ERR(copy);
diff --git a/queue-3.10/kvm-svm-fix-cr8-intercept-window.patch b/queue-3.10/kvm-svm-fix-cr8-intercept-window.patch
new file mode 100644
index 00000000000..a4a0cfccbbd
--- /dev/null
+++ b/queue-3.10/kvm-svm-fix-cr8-intercept-window.patch
@@ -0,0 +1,52 @@
+From 596f3142d2b7be307a1652d59e7b93adab918437 Mon Sep 17 00:00:00 2001
+From: Radim Krčmář
+Date: Tue, 11 Mar 2014 19:11:18 +0100
+Subject: KVM: SVM: fix cr8 intercept window
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+From: Radim Krčmář
+
+commit 596f3142d2b7be307a1652d59e7b93adab918437 upstream.
+
+We always disable cr8 intercept in its handler, but only re-enable it
+if handling KVM_REQ_EVENT, so there can be a window where we do not
+intercept cr8 writes, which allows an interrupt to disrupt a higher
+priority task.
+
+Fix this by disabling intercepts in the same function that re-enables
+them when needed. This fixes BSOD in Windows 2008.
+
+Signed-off-by: Radim Krčmář
+Reviewed-by: Marcelo Tosatti
+Signed-off-by: Paolo Bonzini
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/x86/kvm/svm.c |    6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+--- a/arch/x86/kvm/svm.c
++++ b/arch/x86/kvm/svm.c
+@@ -2985,10 +2985,8 @@ static int cr8_write_interception(struct
+ 	u8 cr8_prev = kvm_get_cr8(&svm->vcpu);
+ 	/* instruction emulation calls kvm_set_cr8() */
+ 	r = cr_interception(svm);
+-	if (irqchip_in_kernel(svm->vcpu.kvm)) {
+-		clr_cr_intercept(svm, INTERCEPT_CR8_WRITE);
++	if (irqchip_in_kernel(svm->vcpu.kvm))
+ 		return r;
+-	}
+ 	if (cr8_prev <= kvm_get_cr8(&svm->vcpu))
+ 		return r;
+ 	kvm_run->exit_reason = KVM_EXIT_SET_TPR;
+@@ -3550,6 +3548,8 @@ static void update_cr8_intercept(struct
+ 	if (is_guest_mode(vcpu) && (vcpu->arch.hflags & HF_VINTR_MASK))
+ 		return;
+ 
++	clr_cr_intercept(svm, INTERCEPT_CR8_WRITE);
++
+ 	if (irr == -1)
+ 		return;
+ 
diff --git a/queue-3.10/pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch b/queue-3.10/pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch
new file mode 100644
index 00000000000..9890821d612
--- /dev/null
+++ b/queue-3.10/pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch
@@ -0,0 +1,47 @@
+From 3cdeb713dc66057b50682048c151eae07b186c42 Mon Sep 17 00:00:00 2001
+From: Bjorn Helgaas
+Date: Tue, 11 Mar 2014 14:22:19 -0600
+Subject: PCI: Enable INTx in pci_reenable_device() only when MSI/MSI-X not enabled
+
+From: Bjorn Helgaas
+
+commit 3cdeb713dc66057b50682048c151eae07b186c42 upstream.
+
+Andreas reported that after 1f42db786b14 ("PCI: Enable INTx if BIOS left
+them disabled"), pciehp surprise removal stopped working.
+
+This happens because pci_reenable_device() on the hotplug bridge (used in
+the pciehp_configure_device() path) clears the Interrupt Disable bit, which
+apparently breaks the bridge's MSI hotplug event reporting.
+
+Previously we cleared the Interrupt Disable bit in do_pci_enable_device(),
+which is used by both pci_enable_device() and pci_reenable_device(). But
+we use pci_reenable_device() after the driver may have enabled MSI or
+MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X.
+
+This patch clears Interrupt Disable only when MSI/MSI-X has not been
+enabled.
+
+Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled
+Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691
+Reported-and-tested-by: Andreas Noever
+Signed-off-by: Bjorn Helgaas
+CC: Sarah Sharp
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ drivers/pci/pci.c |    3 +++
+ 1 file changed, 3 insertions(+)
+
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -1130,6 +1130,9 @@ static int do_pci_enable_device(struct p
+ 		return err;
+ 	pci_fixup_device(pci_fixup_enable, dev);
+ 
++	if (dev->msi_enabled || dev->msix_enabled)
++		return 0;
++
+ 	pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
+ 	if (pin) {
+ 		pci_read_config_word(dev, PCI_COMMAND, &cmd);
diff --git a/queue-3.10/series b/queue-3.10/series
index b75f257e4bd..ba524ae3ac6 100644
--- a/queue-3.10/series
+++ b/queue-3.10/series
@@ -54,3 +54,8 @@ iscsi-target-fix-iscsit_get_tpg_from_np-tpg_state-bug.patch
 fs-proc-base.c-fix-gpf-in-proc-pid-map_files.patch
 drm-radeon-atom-select-the-proper-number-of-lanes-in.patch
 asoc-pcm-free-path-list-before-exiting-from-error-conditions.patch
+ipc-fix-2-bugs-in-msgrcv-msg_copy-implementation.patch
+kvm-svm-fix-cr8-intercept-window.patch
+pci-enable-intx-in-pci_reenable_device-only-when-msi-msi-x-not-enabled.patch
+vmxnet3-fix-netpoll-race-condition.patch
+vmxnet3-fix-building-without-config_pci_msi.patch
diff --git a/queue-3.10/vmxnet3-fix-building-without-config_pci_msi.patch b/queue-3.10/vmxnet3-fix-building-without-config_pci_msi.patch
new file mode 100644
index 00000000000..4f8107ae249
--- /dev/null
+++ b/queue-3.10/vmxnet3-fix-building-without-config_pci_msi.patch
@@ -0,0 +1,51 @@
+From 0a8d8c446b5429d15ff2d48f46e00d8a08552303 Mon Sep 17 00:00:00 2001
+From: Arnd Bergmann
+Date: Thu, 13 Mar 2014 10:44:34 +0100
+Subject: vmxnet3: fix building without CONFIG_PCI_MSI
+
+From: Arnd Bergmann
+
+commit 0a8d8c446b5429d15ff2d48f46e00d8a08552303 upstream.
+
+Since commit d25f06ea466e "vmxnet3: fix netpoll race condition",
+the vmxnet3 driver fails to build when CONFIG_PCI_MSI is disabled,
+because it unconditionally references the vmxnet3_msix_rx()
+function.
+
+To fix this, use the same #ifdef in the caller that exists around
+the function definition.
+
+Signed-off-by: Arnd Bergmann
+Cc: Neil Horman
+Cc: Shreyas Bhatewara
+Cc: "VMware, Inc."
+Cc: "David S. Miller"
+Acked-by: Neil Horman
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ drivers/net/vmxnet3/vmxnet3_drv.c |    7 +++++--
+ 1 file changed, 5 insertions(+), 2 deletions(-)
+
+--- a/drivers/net/vmxnet3/vmxnet3_drv.c
++++ b/drivers/net/vmxnet3/vmxnet3_drv.c
+@@ -1740,13 +1740,16 @@ static void
+ vmxnet3_netpoll(struct net_device *netdev)
+ {
+ 	struct vmxnet3_adapter *adapter = netdev_priv(netdev);
+-	int i;
+ 
+ 	switch (adapter->intr.type) {
+-	case VMXNET3_IT_MSIX:
++#ifdef CONFIG_PCI_MSI
++	case VMXNET3_IT_MSIX: {
++		int i;
+ 		for (i = 0; i < adapter->num_rx_queues; i++)
+ 			vmxnet3_msix_rx(0, &adapter->rx_queue[i]);
+ 		break;
++	}
++#endif
+ 	case VMXNET3_IT_MSI:
+ 	default:
+ 		vmxnet3_intr(0, adapter->netdev);
diff --git a/queue-3.10/vmxnet3-fix-netpoll-race-condition.patch b/queue-3.10/vmxnet3-fix-netpoll-race-condition.patch
new file mode 100644
index 00000000000..c0d95d7c4e3
--- /dev/null
+++ b/queue-3.10/vmxnet3-fix-netpoll-race-condition.patch
@@ -0,0 +1,78 @@
+From d25f06ea466ea521b563b76661180b4e44714ae6 Mon Sep 17 00:00:00 2001
+From: Neil Horman
+Date: Mon, 10 Mar 2014 06:55:55 -0400
+Subject: vmxnet3: fix netpoll race condition
+
+From: Neil Horman
+
+commit d25f06ea466ea521b563b76661180b4e44714ae6 upstream.
+
+vmxnet3's netpoll driver is incorrectly coded. It directly calls
+vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll
+controller method doesn't block real napi polls in any way, there is a potential
+for race conditions in which the netpoll controller method and the napi poll
+method run concurrently. The result is data corruption causing panics such as this
+one recently observed:
+PID: 1371   TASK: ffff88023762caa0  CPU: 1   COMMAND: "rs:main Q:Reg"
+ #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
+ #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
+ #2 [ffff88023abd58b0] oops_end at ffffffff8152b570
+ #3 [ffff88023abd58e0] die at ffffffff81010e0b
+ #4 [ffff88023abd5910] do_trap at ffffffff8152add4
+ #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
+ #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
+    [exception RIP: vmxnet3_rq_rx_complete+1968]
+    RIP: ffffffffa00f1e80  RSP: ffff88023abd5ac8  RFLAGS: 00010086
+    RAX: 0000000000000000  RBX: ffff88023b5dcee0  RCX: 00000000000000c0
+    RDX: 0000000000000000  RSI: 00000000000005f2  RDI: ffff88023b5dcee0
+    RBP: ffff88023abd5b48   R8: 0000000000000000   R9: ffff88023a3b6048
+    R10: 0000000000000000  R11: 0000000000000002  R12: ffff8802398d4cd8
+    R13: ffff88023af35140  R14: ffff88023b60c890  R15: 0000000000000000
+    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
+ #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
+ #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
+ #9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7
+
+The fix is to do as other drivers do, and have the poll controller call the top
+half interrupt handler, which schedules a napi poll properly to receive frames.
+
+Tested by myself, successfully.
+
+Signed-off-by: Neil Horman
+CC: Shreyas Bhatewara
+CC: "VMware, Inc."
+CC: "David S. Miller"
+Reviewed-by: Shreyas N Bhatewara
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ drivers/net/vmxnet3/vmxnet3_drv.c |   16 +++++++++++-----
+ 1 file changed, 11 insertions(+), 5 deletions(-)
+
+--- a/drivers/net/vmxnet3/vmxnet3_drv.c
++++ b/drivers/net/vmxnet3/vmxnet3_drv.c
+@@ -1740,12 +1740,18 @@ static void
+ vmxnet3_netpoll(struct net_device *netdev)
+ {
+ 	struct vmxnet3_adapter *adapter = netdev_priv(netdev);
++	int i;
+ 
+-	if (adapter->intr.mask_mode == VMXNET3_IMM_ACTIVE)
+-		vmxnet3_disable_all_intrs(adapter);
+-
+-	vmxnet3_do_poll(adapter, adapter->rx_queue[0].rx_ring[0].size);
+-	vmxnet3_enable_all_intrs(adapter);
++	switch (adapter->intr.type) {
++	case VMXNET3_IT_MSIX:
++		for (i = 0; i < adapter->num_rx_queues; i++)
++			vmxnet3_msix_rx(0, &adapter->rx_queue[i]);
++		break;
++	case VMXNET3_IT_MSI:
++	default:
++		vmxnet3_intr(0, adapter->netdev);
++		break;
++	}
+ 
+ }
+ #endif /* CONFIG_NET_POLL_CONTROLLER */
-- 
2.47.3