From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Mon, 13 May 2013 21:58:03 +0000 (-0400)
Subject: 3.9-stable patches
X-Git-Tag: v3.0.79~18
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=52cd3882d9def9c662db0ba5375658b8f9fbb787;p=thirdparty%2Fkernel%2Fstable-queue.git

3.9-stable patches

added patches:
	audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
	audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
	scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
	xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
---

diff --git a/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch b/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
new file mode 100644
index 00000000000..edec5ad41ab
--- /dev/null
+++ b/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
@@ -0,0 +1,48 @@
+From cdee3904b4ce7c03d1013ed6dd704b43ae7fc2e9 Mon Sep 17 00:00:00 2001
+From: Anton Blanchard <anton@samba.org>
+Date: Wed, 9 Jan 2013 10:46:17 +1100
+Subject: audit: Syscall rules are not applied to existing processes on non-x86
+
+From: Anton Blanchard <anton@samba.org>
+
+commit cdee3904b4ce7c03d1013ed6dd704b43ae7fc2e9 upstream.
+
+Commit b05d8447e782 (audit: inline audit_syscall_entry to reduce
+burden on archs) changed audit_syscall_entry to check for a dummy
+context before calling __audit_syscall_entry. Unfortunately the dummy
+context state is maintained in __audit_syscall_entry so once set it
+never gets cleared, even if the audit rules change.
+
+As a result, if there are no auditing rules when a process starts
+then it will never be subject to any rules added later. x86 doesn't
+see this because it has an assembly fast path that calls directly into
+__audit_syscall_entry.
+
+I noticed this issue when working on audit performance optimisations.
+I wrote a set of simple test cases available at:
+
+http://ozlabs.org/~anton/junkcode/audit_tests.tar.gz
+
+02_new_rule.py fails without the patch and passes with it. The
+test case clears all rules, starts a process, adds a rule then
+verifies the process produces a syscall audit record.
+
+Signed-off-by: Anton Blanchard <anton@samba.org>
+Signed-off-by: Eric Paris <eparis@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ include/linux/audit.h |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/include/linux/audit.h
++++ b/include/linux/audit.h
+@@ -120,7 +120,7 @@ static inline void audit_syscall_entry(i
+ 				       unsigned long a1, unsigned long a2,
+ 				       unsigned long a3)
+ {
+-	if (unlikely(!audit_dummy_context()))
++	if (unlikely(current->audit_context))
+ 		__audit_syscall_entry(arch, major, a0, a1, a2, a3);
+ }
+ static inline void audit_syscall_exit(void *pt_regs)
diff --git a/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch b/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
new file mode 100644
index 00000000000..d91191125d3
--- /dev/null
+++ b/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
@@ -0,0 +1,56 @@
+From 33e2208acfc15ce00d3dd13e839bf6434faa2b04 Mon Sep 17 00:00:00 2001
+From: Jeff Layton <jlayton@redhat.com>
+Date: Fri, 12 Apr 2013 15:16:32 -0400
+Subject: audit: vfs: fix audit_inode call in O_CREAT case of do_last
+
+From: Jeff Layton <jlayton@redhat.com>
+
+commit 33e2208acfc15ce00d3dd13e839bf6434faa2b04 upstream.
+
+Jiri reported a regression in auditing of open(..., O_CREAT) syscalls.
+In older kernels, creating a file with open(..., O_CREAT) created
+audit_name records that looked like this:
+
+type=PATH msg=audit(1360255720.628:64): item=1 name="/abc/foo" inode=138810 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255720.628:64): item=0 name="/abc/" inode=138635 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+
+...in recent kernels though, they look like this:
+
+type=PATH msg=audit(1360255402.886:12574): item=2 name=(null) inode=264599 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255402.886:12574): item=1 name=(null) inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255402.886:12574): item=0 name="/abc/foo" inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+
+Richard bisected to determine that the problems started with commit
+bfcec708, but the log messages have changed with some later
+audit-related patches.
+
+The problem is that this audit_inode call is passing in the parent of
+the dentry being opened, but audit_inode is being called with the parent
+flag false. This causes later audit_inode and audit_inode_child calls to
+match the wrong entry in the audit_names list.
+
+This patch simply sets the flag to properly indicate that this inode
+represents the parent. With this, the audit_names entries are back to
+looking like they did before.
+
+Reported-by: Jiri Jaburek <jjaburek@redhat.com>
+Signed-off-by: Jeff Layton <jlayton@redhat.com>
+Tested-by: Richard Guy Briggs <rbriggs@redhat.com>
+Signed-off-by: Eric Paris <eparis@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/namei.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -2740,7 +2740,7 @@ static int do_last(struct nameidata *nd,
+ 		if (error)
+ 			return error;
+ 
+-		audit_inode(name, dir, 0);
++		audit_inode(name, dir, LOOKUP_PARENT);
+ 		error = -EISDIR;
+ 		/* trailing slashes? */
+ 		if (nd->last.name[nd->last.len])
diff --git a/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch b/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
new file mode 100644
index 00000000000..8a4d08645bc
--- /dev/null
+++ b/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
@@ -0,0 +1,102 @@
+From 39c60a0948cc06139e2fbfe084f83cb7e7deae3b Mon Sep 17 00:00:00 2001
+From: James Bottomley <JBottomley@Parallels.com>
+Date: Wed, 24 Apr 2013 14:02:53 -0700
+Subject: SCSI: sd: fix array cache flushing bug causing performance problems
+
+From: James Bottomley <JBottomley@Parallels.com>
+
+commit 39c60a0948cc06139e2fbfe084f83cb7e7deae3b upstream.
+
+Some arrays synchronize their full non-volatile cache when the sd driver sends
+a SYNCHRONIZE CACHE command.  Unfortunately, they can have terabytes of this
+and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
+writeback cache.  This leads to massive slowdowns on journalled filesystems.
+
+The fix is to allow userspace to turn off the writeback cache setting as a
+temporary measure (i.e. without doing the MODE SELECT to write it back to the
+device), so even though the device reported it has a writeback cache, the
+user, knowing that the cache is non-volatile and all they care about is
+filesystem correctness, can turn that bit off in the kernel and avoid the
+performance-ruinous (and safety-irrelevant) SYNCHRONIZE CACHE commands.
+
+The way to do this is to add a 'temporary' prefix when performing the usual
+cache setting operations, for example:
+
+echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type
+
+Reported-by: Ric Wheeler <rwheeler@redhat.com>
+Signed-off-by: James Bottomley <JBottomley@Parallels.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/scsi/sd.c |   20 ++++++++++++++++++++
+ drivers/scsi/sd.h |    1 +
+ 2 files changed, 21 insertions(+)
+
+--- a/drivers/scsi/sd.c
++++ b/drivers/scsi/sd.c
+@@ -142,6 +142,7 @@ sd_store_cache_type(struct device *dev,
+ 	char *buffer_data;
+ 	struct scsi_mode_data data;
+ 	struct scsi_sense_hdr sshdr;
+	static const char temp[] = "temporary ";
+ 	int len;
+ 
+ 	if (sdp->type != TYPE_DISK)
+@@ -150,6 +151,13 @@ sd_store_cache_type(struct device *dev,
+ 		 * it's not worth the risk */
+ 		return -EINVAL;
+ 
++	if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
++		buf += sizeof(temp) - 1;
++		sdkp->cache_override = 1;
++	} else {
++		sdkp->cache_override = 0;
++	}
++
+ 	for (i = 0; i < ARRAY_SIZE(sd_cache_types); i++) {
+ 		len = strlen(sd_cache_types[i]);
+ 		if (strncmp(sd_cache_types[i], buf, len) == 0 &&
+@@ -162,6 +170,13 @@ sd_store_cache_type(struct device *dev,
+ 		return -EINVAL;
+ 	rcd = ct & 0x01 ? 1 : 0;
+ 	wce = ct & 0x02 ? 1 : 0;
++
++	if (sdkp->cache_override) {
++		sdkp->WCE = wce;
++		sdkp->RCD = rcd;
++		return count;
++	}
++
+ 	if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT,
+ 			    SD_MAX_RETRIES, &data, NULL))
+ 		return -EINVAL;
+@@ -2319,6 +2334,10 @@ sd_read_cache_type(struct scsi_disk *sdk
+ 	int old_rcd = sdkp->RCD;
+ 	int old_dpofua = sdkp->DPOFUA;
+ 
++
++	if (sdkp->cache_override)
++		return;
++
+ 	first_len = 4;
+ 	if (sdp->skip_ms_page_8) {
+ 		if (sdp->type == TYPE_RBC)
+@@ -2812,6 +2831,7 @@ static void sd_probe_async(void *data, a
+ 	sdkp->capacity = 0;
+ 	sdkp->media_present = 1;
+ 	sdkp->write_prot = 0;
++	sdkp->cache_override = 0;
+ 	sdkp->WCE = 0;
+ 	sdkp->RCD = 0;
+ 	sdkp->ATO = 0;
+--- a/drivers/scsi/sd.h
++++ b/drivers/scsi/sd.h
+@@ -73,6 +73,7 @@ struct scsi_disk {
+ 	u8		protection_type;/* Data Integrity Field */
+ 	u8		provisioning_mode;
+ 	unsigned	ATO : 1;	/* state of disk ATO bit */
++	unsigned	cache_override : 1; /* temp override of WCE,RCD */
+ 	unsigned	WCE : 1;	/* state of disk WCE bit */
+ 	unsigned	RCD : 1;	/* state of disk RCD bit, unused */
+ 	unsigned	DPOFUA : 1;	/* state of disk DPOFUA bit */
diff --git a/queue-3.9/series b/queue-3.9/series
index 55b1a9b7a48..8f66bd46a97 100644
--- a/queue-3.9/series
+++ b/queue-3.9/series
@@ -26,3 +26,7 @@ nfsd-fix-oops-when-legacy_recdir_name_error-is-passed-a.patch
 hp_accel-ignore-the-error-from-lis3lv02d_poweron-at-resume.patch
 x86-vm86-fix-vm86-syscalls-use-syscall_definex.patch
 shm-fix-null-pointer-deref-when-userspace-specifies-invalid-hugepage-size.patch
+xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
+scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
+audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
+audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
diff --git a/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch b/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
new file mode 100644
index 00000000000..4f5d310e647
--- /dev/null
+++ b/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
@@ -0,0 +1,108 @@
+From 7f1fc268c47491fd5e63548f6415fc8604e13003 Mon Sep 17 00:00:00 2001
+From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
+Date: Sun, 5 May 2013 09:30:09 -0400
+Subject: xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
+
+From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
+
+commit 7f1fc268c47491fd5e63548f6415fc8604e13003 upstream.
+
+If a user did:
+
+	echo 0 > /sys/devices/system/cpu/cpu1/online
+	echo 1 > /sys/devices/system/cpu/cpu1/online
+
+we would (this is a build with DEBUG enabled) get to:
+smpboot: ++++++++++++++++++++=_---CPU UP  1
+.. snip..
+smpboot: Stack at about ffff880074c0ff44
+smpboot: CPU1: has booted.
+
+and hang. The RCU mechanism would kick in and try to IPI CPU1,
+but the IPIs (and all other interrupts) would never arrive at
+CPU1. At first glance at least. A bit of digging in the hypervisor
+trace (using xenanalyze) shows that:
+
+[vla] d4v1 vec 243 injecting
+   0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
+]  0.043163639 --|x d4v1 vmentry cycles 1468
+]  0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
+   0.043164913 --|x d4v1 inj_virq vec 243  real
+  [vla] d4v1 vec 243 injecting
+   0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
+]  0.043165526 --|x d4v1 vmentry cycles 1472
+]  0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
+   0.043166800 --|x d4v1 inj_virq vec 243  real
+  [vla] d4v1 vec 243 injecting
+
+there is a pending event (subsequent debugging shows it is the IPI
+from the VCPU0 when smpboot.c on VCPU1 has done
+"set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
+interrupted with the callback IPI (0xf3 aka 243) which ends up calling
+__xen_evtchn_do_upcall.
+
+The __xen_evtchn_do_upcall seems to do *something* but not acknowledge
+the pending events. And the moment the guest does a 'cli' (that is the
+ffffffff81673254 in the log above) the hypervisor is invoked again to
+inject the IPI (0xf3) to tell the guest it has pending interrupts.
+This repeats itself forever.
+
+The culprit was the per_cpu(xen_vcpu, cpu) pointer. At bootup
+we set each per_cpu(xen_vcpu, cpu) to point to
+shared_info->vcpu_info[vcpu] but later on use VCPUOP_register_vcpu_info
+to register per-CPU structures (xen_vcpu_setup).
+This is used to allow events for more than 32 VCPUs and for performance
+optimization reasons.
+
+When the user performs the VCPU hotplug we end up calling
+xen_vcpu_setup once more. We make the hypercall, which returns
+-EINVAL as it does not allow multiple registration calls (and
+has already re-assigned where the events are being set). We pick
+the fallback case and set per_cpu(xen_vcpu, cpu) to point to
+shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
+However, the hypervisor is still setting events in the registered
+per-CPU structure (per_cpu(xen_vcpu_info, cpu)).
+
+As such, when events are set by the hypervisor (such as the timer one)
+and we iterate in __xen_evtchn_do_upcall, we end up reading stale
+events from shared_info->vcpu_info[vcpu] instead of the
+per_cpu(xen_vcpu_info, cpu) structure. Hence we never acknowledge the
+events that the hypervisor has set, and the hypervisor keeps on reminding
+us to ack events that we never ack.
+
+The fix is simple: when xen_vcpu_setup is called a second time, don't
+overwrite per_cpu(xen_vcpu, cpu) if it already points to
+per_cpu(xen_vcpu_info, cpu).
+
+Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
+Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/xen/enlighten.c |   15 +++++++++++++++
+ 1 file changed, 15 insertions(+)
+
+--- a/arch/x86/xen/enlighten.c
++++ b/arch/x86/xen/enlighten.c
+@@ -156,6 +156,21 @@ static void xen_vcpu_setup(int cpu)
+ 
+ 	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
+ 
++	/*
++	 * This path is called twice on PVHVM - first during bootup via
++	 * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
++	 * hotplugged: cpu_up -> xen_hvm_cpu_notify.
+	 * As we can only do the VCPUOP_register_vcpu_info once, let's
+	 * not overwrite its result.
++	 *
++	 * For PV it is called during restore (xen_vcpu_restore) and bootup
++	 * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
++	 * use this function.
++	 */
++	if (xen_hvm_domain()) {
++		if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
++			return;
++	}
+ 	if (cpu < MAX_VIRT_CPUS)
+ 		per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+