From: Greg Kroah-Hartman
Date: Mon, 13 May 2013 21:58:03 +0000 (-0400)
Subject: 3.9-stable patches
X-Git-Tag: v3.0.79~18
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=52cd3882d9def9c662db0ba5375658b8f9fbb787;p=thirdparty%2Fkernel%2Fstable-queue.git

3.9-stable patches

added patches:
	audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
	audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
	scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
	xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
---

diff --git a/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch b/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
new file mode 100644
index 00000000000..edec5ad41ab
--- /dev/null
+++ b/queue-3.9/audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
@@ -0,0 +1,48 @@
+From cdee3904b4ce7c03d1013ed6dd704b43ae7fc2e9 Mon Sep 17 00:00:00 2001
+From: Anton Blanchard
+Date: Wed, 9 Jan 2013 10:46:17 +1100
+Subject: audit: Syscall rules are not applied to existing processes on non-x86
+
+From: Anton Blanchard
+
+commit cdee3904b4ce7c03d1013ed6dd704b43ae7fc2e9 upstream.
+
+Commit b05d8447e782 (audit: inline audit_syscall_entry to reduce
+burden on archs) changed audit_syscall_entry to check for a dummy
+context before calling __audit_syscall_entry. Unfortunately the dummy
+context state is maintained in __audit_syscall_entry so once set it
+never gets cleared, even if the audit rules change.
+
+As a result, if there are no auditing rules when a process starts
+then it will never be subject to any rules added later. x86 doesn't
+see this because it has an assembly fast path that calls directly into
+__audit_syscall_entry.
+
+I noticed this issue when working on audit performance optimisations.
+I wrote a set of simple test cases available at:
+
+http://ozlabs.org/~anton/junkcode/audit_tests.tar.gz
+
+02_new_rule.py fails without the patch and passes with it. The
+test case clears all rules, starts a process, adds a rule then
+verifies the process produces a syscall audit record.
+
+Signed-off-by: Anton Blanchard
+Signed-off-by: Eric Paris
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ include/linux/audit.h |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/include/linux/audit.h
++++ b/include/linux/audit.h
+@@ -120,7 +120,7 @@ static inline void audit_syscall_entry(i
+ 				       unsigned long a1, unsigned long a2,
+ 				       unsigned long a3)
+ {
+-	if (unlikely(!audit_dummy_context()))
++	if (unlikely(current->audit_context))
+ 		__audit_syscall_entry(arch, major, a0, a1, a2, a3);
+ }
+ static inline void audit_syscall_exit(void *pt_regs)
diff --git a/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch b/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
new file mode 100644
index 00000000000..d91191125d3
--- /dev/null
+++ b/queue-3.9/audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
@@ -0,0 +1,56 @@
+From 33e2208acfc15ce00d3dd13e839bf6434faa2b04 Mon Sep 17 00:00:00 2001
+From: Jeff Layton
+Date: Fri, 12 Apr 2013 15:16:32 -0400
+Subject: audit: vfs: fix audit_inode call in O_CREAT case of do_last
+
+From: Jeff Layton
+
+commit 33e2208acfc15ce00d3dd13e839bf6434faa2b04 upstream.
+
+Jiri reported a regression in auditing of open(..., O_CREAT) syscalls.
+In older kernels, creating a file with open(..., O_CREAT) created
+audit_name records that looked like this:
+
+type=PATH msg=audit(1360255720.628:64): item=1 name="/abc/foo" inode=138810 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255720.628:64): item=0 name="/abc/" inode=138635 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+
+...in recent kernels though, they look like this:
+
+type=PATH msg=audit(1360255402.886:12574): item=2 name=(null) inode=264599 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255402.886:12574): item=1 name=(null) inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+type=PATH msg=audit(1360255402.886:12574): item=0 name="/abc/foo" inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
+
+Richard bisected to determine that the problems started with commit
+bfcec708, but the log messages have changed with some later
+audit-related patches.
+
+The problem is that this audit_inode call is passing in the parent of
+the dentry being opened, but audit_inode is being called with the parent
+flag false. This causes later audit_inode and audit_inode_child calls to
+match the wrong entry in the audit_names list.
+
+This patch simply sets the flag to properly indicate that this inode
+represents the parent. With this, the audit_names entries are back to
+looking like they did before.
+
+Reported-by: Jiri Jaburek
+Signed-off-by: Jeff Layton
+Test By: Richard Guy Briggs
+Signed-off-by: Eric Paris
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/namei.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -2740,7 +2740,7 @@ static int do_last(struct nameidata *nd,
+ 	if (error)
+ 		return error;
+ 
+-	audit_inode(name, dir, 0);
++	audit_inode(name, dir, LOOKUP_PARENT);
+ 	error = -EISDIR;
+ 	/* trailing slashes? */
+ 	if (nd->last.name[nd->last.len])
diff --git a/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch b/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
new file mode 100644
index 00000000000..8a4d08645bc
--- /dev/null
+++ b/queue-3.9/scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
@@ -0,0 +1,102 @@
+From 39c60a0948cc06139e2fbfe084f83cb7e7deae3b Mon Sep 17 00:00:00 2001
+From: James Bottomley
+Date: Wed, 24 Apr 2013 14:02:53 -0700
+Subject: SCSI: sd: fix array cache flushing bug causing performance problems
+
+From: James Bottomley
+
+commit 39c60a0948cc06139e2fbfe084f83cb7e7deae3b upstream.
+
+Some arrays synchronize their full non volatile cache when the sd driver sends
+a SYNCHRONIZE CACHE command. Unfortunately, they can have Terabytes of this
+and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
+writeback cache. This leads to massive slowdowns on journalled filesystems.
+
+The fix is to allow userspace to turn off the writeback cache setting as a
+temporary measure (i.e. without doing the MODE SELECT to write it back to the
+device), so even though the device reported it has a writeback cache, the
+user, knowing that the cache is non volatile and all they care about is
+filesystem correctness, can turn that bit off in the kernel and avoid the
+performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands.
+
+The way you do this is to add a 'temporary' prefix when performing the usual
+cache setting operations, so
+
+echo temporary write through > /sys/class/scsi_disk//cache_type
+
+Reported-by: Ric Wheeler
+Signed-off-by: James Bottomley
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ drivers/scsi/sd.c |   20 ++++++++++++++++++++
+ drivers/scsi/sd.h |    1 +
+ 2 files changed, 21 insertions(+)
+
+--- a/drivers/scsi/sd.c
++++ b/drivers/scsi/sd.c
+@@ -142,6 +142,7 @@ sd_store_cache_type(struct device *dev,
+ 	char *buffer_data;
+ 	struct scsi_mode_data data;
+ 	struct scsi_sense_hdr sshdr;
++	static const char temp[] = "temporary ";
+ 	int len;
+ 
+ 	if (sdp->type != TYPE_DISK)
+@@ -150,6 +151,13 @@ sd_store_cache_type(struct device *dev,
+ 		 * it's not worth the risk */
+ 		return -EINVAL;
+ 
++	if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
++		buf += sizeof(temp) - 1;
++		sdkp->cache_override = 1;
++	} else {
++		sdkp->cache_override = 0;
++	}
++
+ 	for (i = 0; i < ARRAY_SIZE(sd_cache_types); i++) {
+ 		len = strlen(sd_cache_types[i]);
+ 		if (strncmp(sd_cache_types[i], buf, len) == 0 &&
+@@ -162,6 +170,13 @@ sd_store_cache_type(struct device *dev,
+ 		return -EINVAL;
+ 	rcd = ct & 0x01 ? 1 : 0;
+ 	wce = ct & 0x02 ? 1 : 0;
++
++	if (sdkp->cache_override) {
++		sdkp->WCE = wce;
++		sdkp->RCD = rcd;
++		return count;
++	}
++
+ 	if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT,
+ 			    SD_MAX_RETRIES, &data, NULL))
+ 		return -EINVAL;
+@@ -2319,6 +2334,10 @@ sd_read_cache_type(struct scsi_disk *sdk
+ 	int old_rcd = sdkp->RCD;
+ 	int old_dpofua = sdkp->DPOFUA;
+ 
++
++	if (sdkp->cache_override)
++		return;
++
+ 	first_len = 4;
+ 	if (sdp->skip_ms_page_8) {
+ 		if (sdp->type == TYPE_RBC)
+@@ -2812,6 +2831,7 @@ static void sd_probe_async(void *data, a
+ 	sdkp->capacity = 0;
+ 	sdkp->media_present = 1;
+ 	sdkp->write_prot = 0;
++	sdkp->cache_override = 0;
+ 	sdkp->WCE = 0;
+ 	sdkp->RCD = 0;
+ 	sdkp->ATO = 0;
+--- a/drivers/scsi/sd.h
++++ b/drivers/scsi/sd.h
+@@ -73,6 +73,7 @@ struct scsi_disk {
+ 	u8		protection_type;/* Data Integrity Field */
+ 	u8		provisioning_mode;
+ 	unsigned	ATO : 1;	/* state of disk ATO bit */
++	unsigned	cache_override : 1; /* temp override of WCE,RCD */
+ 	unsigned	WCE : 1;	/* state of disk WCE bit */
+ 	unsigned	RCD : 1;	/* state of disk RCD bit, unused */
+ 	unsigned	DPOFUA : 1;	/* state of disk DPOFUA bit */
diff --git a/queue-3.9/series b/queue-3.9/series
index 55b1a9b7a48..8f66bd46a97 100644
--- a/queue-3.9/series
+++ b/queue-3.9/series
@@ -26,3 +26,7 @@ nfsd-fix-oops-when-legacy_recdir_name_error-is-passed-a.patch
 hp_accel-ignore-the-error-from-lis3lv02d_poweron-at-resume.patch
 x86-vm86-fix-vm86-syscalls-use-syscall_definex.patch
 shm-fix-null-pointer-deref-when-userspace-specifies-invalid-hugepage-size.patch
+xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
+scsi-sd-fix-array-cache-flushing-bug-causing-performance-problems.patch
+audit-syscall-rules-are-not-applied-to-existing-processes-on-non-x86.patch
+audit-vfs-fix-audit_inode-call-in-o_creat-case-of-do_last.patch
diff --git a/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch b/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
new file mode 100644
index 00000000000..4f5d310e647
--- /dev/null
+++ b/queue-3.9/xen-vcpu-pvhvm-fix-vcpu-hotplugging-hanging.patch
@@ -0,0 +1,108 @@
+From 7f1fc268c47491fd5e63548f6415fc8604e13003 Mon Sep 17 00:00:00 2001
+From: Konrad Rzeszutek Wilk
+Date: Sun, 5 May 2013 09:30:09 -0400
+Subject: xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
+
+From: Konrad Rzeszutek Wilk
+
+commit 7f1fc268c47491fd5e63548f6415fc8604e13003 upstream.
+
+If a user did:
+
+	echo 0 > /sys/devices/system/cpu/cpu1/online
+	echo 1 > /sys/devices/system/cpu/cpu1/online
+
+we would (this is a build with DEBUG enabled) get to:
+smpboot: ++++++++++++++++++++=_---CPU UP 1
+.. snip..
+smpboot: Stack at about ffff880074c0ff44
+smpboot: CPU1: has booted.
+
+and hang. The RCU mechanism would kick in and try to IPI the CPU1
+but the IPIs (and all other interrupts) would never arrive at the
+CPU1. At first glance at least. A bit of digging in the hypervisor
+trace shows that (using xenanalyze):
+
+[vla] d4v1 vec 243 injecting
+   0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
+]  0.043163639 --|x d4v1 vmentry cycles 1468
+]  0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
+   0.043164913 --|x d4v1 inj_virq vec 243 real
+  [vla] d4v1 vec 243 injecting
+   0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
+]  0.043165526 --|x d4v1 vmentry cycles 1472
+]  0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
+   0.043166800 --|x d4v1 inj_virq vec 243 real
+  [vla] d4v1 vec 243 injecting
+
+there is a pending event (subsequent debugging shows it is the IPI
+from the VCPU0 when smpboot.c on VCPU1 has done
+"set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
+interrupted with the callback IPI (0xf3 aka 243) which ends up calling
+__xen_evtchn_do_upcall.
+
+The __xen_evtchn_do_upcall seems to do *something* but not acknowledge
+the pending events. And the moment the guest does a 'cli' (that is the
+ffffffff81673254 in the log above) the hypervisor is invoked again to
+inject the IPI (0xf3) to tell the guest it has pending interrupts.
+This repeats itself forever.
+
+The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup
+we set each per_cpu(xen_vcpu, cpu) to point to the
+shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info
+to register per-CPU structures (xen_vcpu_setup).
+This is used to allow events for more than 32 VCPUs and for performance
+optimization reasons.
+
+When the user performs the VCPU hotplug we end up calling the
+xen_vcpu_setup once more. We make the hypercall which returns
+-EINVAL as it does not allow multiple registration calls (and
+already has re-assigned where the events are being set). We pick
+the fallback case and set per_cpu(xen_vcpu, cpu) to point to the
+shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
+However the hypervisor is still setting events in the registered
+per-cpu structure (per_cpu(xen_vcpu_info, cpu)).
+
+As such when the events are set by the hypervisor (such as timer one),
+and when we iterate in __xen_evtchn_do_upcall we end up reading stale
+events from the shared_info->vcpu_info[vcpu] instead of the
+per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the
+events that the hypervisor has set and the hypervisor keeps on reminding
+us to ack the events which we never do.
+
+The fix is simple. Don't on the second time when xen_vcpu_setup is
+called over-write the per_cpu(xen_vcpu, cpu) if it points to
+per_cpu(xen_vcpu_info).
+
+Acked-by: Stefano Stabellini
+Signed-off-by: Konrad Rzeszutek Wilk
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/x86/xen/enlighten.c |   15 +++++++++++++++
+ 1 file changed, 15 insertions(+)
+
+--- a/arch/x86/xen/enlighten.c
++++ b/arch/x86/xen/enlighten.c
+@@ -156,6 +156,21 @@ static void xen_vcpu_setup(int cpu)
+ 
+ 	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
+ 
++	/*
++	 * This path is called twice on PVHVM - first during bootup via
++	 * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
++	 * hotplugged: cpu_up -> xen_hvm_cpu_notify.
++	 * As we can only do the VCPUOP_register_vcpu_info once let's
++	 * not over-write its result.
++	 *
++	 * For PV it is called during restore (xen_vcpu_restore) and bootup
++	 * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
++	 * use this function.
++	 */
++	if (xen_hvm_domain()) {
++		if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
++			return;
++	}
+ 	if (cpu < MAX_VIRT_CPUS)
+ 		per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+ 
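
Note on the 'temporary ' prefix handling in the sd patch: the parsing step can be illustrated with a minimal standalone sketch in userspace C. This is not kernel code. struct cache_state and strip_temporary() are hypothetical names introduced only for this example, standing in for the relevant part of sd_store_cache_type(). The sketch assumes the prefix is held in a char array, so that sizeof() yields the string length plus the terminating NUL; applied to a const char pointer, sizeof() would yield the pointer size instead.

```c
#include <string.h>

/* Hypothetical stand-in for the override bit the patch adds to struct scsi_disk. */
struct cache_state {
	unsigned cache_override : 1;
};

/*
 * If buf starts with "temporary ", record the override and return a pointer
 * just past the prefix; otherwise clear the override and return buf unchanged.
 */
static const char *strip_temporary(const char *buf, struct cache_state *st)
{
	/* An array, not a pointer: sizeof(temp) is 11 (10 characters + NUL). */
	static const char temp[] = "temporary ";

	if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
		st->cache_override = 1;
		return buf + (sizeof(temp) - 1);
	}
	st->cache_override = 0;
	return buf;
}
```

With this helper, the input "temporary write through" yields "write through" with the override set, while "write through" alone is returned unchanged with the override cleared.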