--- /dev/null
+From a1cbcaa9ea87b87a96b9fc465951dcf36e459ca2 Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Sat, 6 Apr 2013 10:10:27 +0200
+Subject: sched_clock: Prevent 64bit inatomicity on 32bit systems
+
+From: Thomas Gleixner <tglx@linutronix.de>
+
+commit a1cbcaa9ea87b87a96b9fc465951dcf36e459ca2 upstream.
+
+The sched_clock_remote() implementation has the following inatomicity
+problem on 32bit systems when accessing the remote scd->clock, which
+is a 64bit value.
+
+CPU0                            CPU1
+
+sched_clock_local()             sched_clock_remote(CPU0)
+...
+                                remote_clock = scd[CPU0]->clock
+                                read_low32bit(scd[CPU0]->clock)
+cmpxchg64(scd->clock,...)
+                                read_high32bit(scd[CPU0]->clock)
+
+While the update of scd->clock is using an atomic64 mechanism, the
+readout on the remote cpu is not, which can cause completely bogus
+readouts.
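+
+For illustration only (not part of the patch): on 32bit a plain
+64bit load compiles to two separate 32bit loads, so a concurrent
+update can be observed half done. A minimal user-space sketch of
+the torn read, with hypothetical names, built with gcc -m32 -pthread:
+
+    #include <pthread.h>
+    #include <stdint.h>
+    #include <stdio.h>
+
+    /* stand-in for scd->clock */
+    static volatile uint64_t clock_val = 0xffffffffULL;
+
+    static void *updater(void *arg)
+    {
+            for (;;) {
+                    /* flip the value across the 32bit boundary */
+                    clock_val = 0xffffffffULL;   /* high = 0, low = ~0 */
+                    clock_val = 0x100000000ULL;  /* high = 1, low =  0 */
+            }
+            return NULL;
+    }
+
+    int main(void)
+    {
+            pthread_t t;
+
+            pthread_create(&t, NULL, updater, NULL);
+            for (;;) {
+                    /* plain read: two separate 32bit loads on 32bit */
+                    uint64_t v = clock_val;
+
+                    if (v != 0xffffffffULL && v != 0x100000000ULL) {
+                            /* e.g. 0x1ffffffff: new high word, old low word */
+                            printf("torn read: %#llx\n", (unsigned long long)v);
+                            return 1;
+                    }
+            }
+    }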
+
+It is quite a rare problem because the update must hit the narrow
+race window between the low/high readout and it must also go across
+the 32bit boundary.
+
+The resulting misbehaviour is that CPU1 will see the sched_clock of
+CPU0 ~4 seconds ahead of its own and update CPU1's sched_clock value
+to this bogus timestamp. Due to the clamping implementation it stays
+that way for about 4 seconds, until the synchronization with
+CLOCK_MONOTONIC undoes the problem.
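+
+(The magnitude is simply the weight of bit 32 in a nanosecond based
+clock: a high word read one update too new inflates the composite
+value by 2^32 ns = 4294967296 ns ~= 4.29 s, hence the ~4 seconds.)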
+
+The issue is hard to observe because it might only result in a less
+accurate SCHED_OTHER timeslicing behaviour. To create observable
+damage on realtime scheduling classes, it is necessary that the bogus
+update of CPU1's sched_clock happens in the context of a realtime
+thread, which then gets charged 4 seconds of RT runtime. That causes
+the RT throttler mechanism to trigger and prevents scheduling of RT
+tasks for a little less than 4 seconds. So this is quite unlikely as
+well.
+
+The issue was quite hard to decode, as the reproduction time is
+between 2 days and 3 weeks and intrusive tracing makes reproduction
+even less likely. The following trace, recorded with
+trace_clock=global, which uses sched_clock_local(), gave the final
+hint:
+
+ <idle>-0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
+ <idle>-0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ...
+irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0
+ <idle>-0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80
+
+What happens is that CPU0 goes idle and invokes
+sched_clock_idle_sleep_event(), which invokes sched_clock_local(),
+while CPU1 runs a remote wakeup for CPU0 at the same time, which
+invokes sched_clock_remote(). The time jump gets propagated to CPU0
+via sched_clock_remote() and stays stale on both cores for ~4 seconds.
+
+There are only two other possibilities which could cause a stale
+sched_clock:
+
+1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
+ wrong value.
+
+2) sched_clock() which reads the TSC returns a sporadic wrong value.
+
+#1 can be excluded because sched_clock would continue to increase for
+ one jiffy and then go stale.
+
+#2 can be excluded because it would not make the clock jump
+ forward. It would just result in a stale sched_clock for one jiffy.
+
+After quite some brain twisting and finding the same pattern on other
+traces, sched_clock_remote() remained the only place which could
+cause such a problem, and as explained above it is indeed racy on
+32bit systems.
+
+So while on 64bit systems the readout is atomic, we need to enforce
+an atomic readout of the remote clock on 32bit machines. We need to
+protect the local->clock readout in sched_clock_remote() on 32bit as
+well, because an NMI could hit between the low and the high readout,
+invoke sched_clock_local() and modify local->clock.
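+
+As an aside, the readout trick used in the fix below can be sketched
+as a hypothetical helper: cmpxchg64() with old == new == 0 never
+modifies a nonzero value, but the compare, and therefore the read, is
+performed atomically:
+
+    /*
+     * Atomic 64bit read on 32bit via cmpxchg64():
+     * - if *p != 0, the compare fails, nothing is written and the
+     *   current value is returned
+     * - if *p == 0, 0 is written over 0, which is a no-op
+     * Either way we get an untorn snapshot of *p.
+     */
+    static u64 clock_read64(u64 *p)
+    {
+            return cmpxchg64(p, 0, 0);
+    }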
+
+Thanks to Siegfried Wulsch for bearing with my debug requests and
+going through the tedious task of running a bunch of reproducer
+systems to generate the debug information which let me decode the
+issue.
+
+Reported-by: Siegfried Wulsch <Siegfried.Wulsch@rovema.de>
+Acked-by: Peter Zijlstra <peterz@infradead.org>
+Cc: Steven Rostedt <rostedt@goodmis.org>
+Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ kernel/sched/clock.c | 26 ++++++++++++++++++++++++++
+ 1 file changed, 26 insertions(+)
+
+--- a/kernel/sched/clock.c
++++ b/kernel/sched/clock.c
+@@ -176,10 +176,36 @@ static u64 sched_clock_remote(struct sch
+ u64 this_clock, remote_clock;
+ u64 *ptr, old_val, val;
+
++#if BITS_PER_LONG != 64
++again:
++ /*
++ * Careful here: The local and the remote clock values need to
++	 * be read out atomically as we need to compare the values and
++ * then update either the local or the remote side. So the
++ * cmpxchg64 below only protects one readout.
++ *
++ * We must reread via sched_clock_local() in the retry case on
++ * 32bit as an NMI could use sched_clock_local() via the
++	 * tracer and hit between the readout of the low 32bit and
++	 * the high 32bit portion.
++ */
++ this_clock = sched_clock_local(my_scd);
++ /*
++ * We must enforce atomic readout on 32bit, otherwise the
++	 * update on the remote cpu can hit in between the readout of
++	 * the low 32bit and the high 32bit portion.
++ */
++ remote_clock = cmpxchg64(&scd->clock, 0, 0);
++#else
++ /*
++ * On 64bit the read of [my]scd->clock is atomic versus the
++ * update, so we can avoid the above 32bit dance.
++ */
+ sched_clock_local(my_scd);
+ again:
+ this_clock = my_scd->clock;
+ remote_clock = scd->clock;
++#endif
+
+ /*
+ * Use the opportunity that we have both locks