From: Greg Kroah-Hartman
Date: Sat, 11 Aug 2018 17:12:42 +0000 (+0200)
Subject: 4.17-stable patches
X-Git-Tag: v4.18.1~48
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=7ab0486e3f070856fc35b3d9a346728ce9ce376b;p=thirdparty%2Fkernel%2Fstable-queue.git

4.17-stable patches

added patches:
mark-hi-and-tasklet-softirq-synchronous.patch
sched-deadline-update-rq_clock-of-later_rq-when-pushing-a-task.patch
stop_machine-disable-preemption-after-queueing-stopper-threads.patch
---

diff --git a/queue-4.17/mark-hi-and-tasklet-softirq-synchronous.patch b/queue-4.17/mark-hi-and-tasklet-softirq-synchronous.patch
new file mode 100644
index 00000000000..fc09f5cdf89
--- /dev/null
+++ b/queue-4.17/mark-hi-and-tasklet-softirq-synchronous.patch
@@ -0,0 +1,103 @@
+From 3c53776e29f81719efcf8f7a6e30cdf753bee94d Mon Sep 17 00:00:00 2001
+From: Linus Torvalds
+Date: Mon, 8 Jan 2018 11:51:04 -0800
+Subject: Mark HI and TASKLET softirq synchronous
+
+From: Linus Torvalds
+
+commit 3c53776e29f81719efcf8f7a6e30cdf753bee94d upstream.
+
+Way back in 4.9, we committed 4cd13c21b207 ("softirq: Let ksoftirqd do
+its job"), and ever since we've had small nagging issues with it. For
+example, we've had:
+
+ 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
+ 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
+ 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")
+
+all of which worked around some of the effects of that commit.
+
+The DVB people have also complained that the commit causes excessive USB
+URB latencies, which seems to be due to the USB code using tasklets to
+schedule USB traffic. This seems to be an issue mainly when already
+living on the edge, but waiting for ksoftirqd to handle it really does
+seem to cause excessive latencies.
+
+Now Hanna Hawa reports that this issue isn't just limited to USB URB and
+DVB, but also causes timeout problems for the Marvell SoC team:
+
+ "I'm facing kernel panic issue while running raid 5 on sata disks
+ connected to Macchiatobin (Marvell community board with Armada-8040
+ SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
+ and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
+ uses a tasklet to clean the done descriptors from the queue"
+
+The latency problem causes a panic:
+
+ mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
+ Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction
+
+We've discussed simply just reverting the original commit entirely, and
+also much more involved solutions (with per-softirq threads etc). This
+patch is intentionally stupid and fairly limited, because the issue
+still remains, and the other solutions either got sidetracked or had
+other issues.
+
+We should probably also consider the timer softirqs to be synchronous
+and not be delayed to ksoftirqd (since they were the issue with the
+earlier watchdog problems), but that should be done as a separate patch.
+This does only the tasklet cases.
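
For context, this is roughly how a driver defers work from its hard
interrupt handler into TASKLET_SOFTIRQ context, the mechanism whose
latency is at issue above. A minimal sketch against the 4.17-era
tasklet API; the example_* names are invented, not taken from any of
the drivers mentioned:

  #include <linux/interrupt.h>

  /* Bottom half: runs later, in TASKLET_SOFTIRQ context. */
  static void example_done_work(unsigned long data)
  {
  	/* e.g. reap completed DMA descriptors, complete URBs, ... */
  }
  static DECLARE_TASKLET(example_tasklet, example_done_work, 0);

  /* Hard interrupt handler: do the minimum, defer the rest. */
  static irqreturn_t example_irq(int irq, void *dev_id)
  {
  	tasklet_schedule(&example_tasklet);
  	return IRQ_HANDLED;
  }

With 4cd13c21b207 in place, once ksoftirqd had been woken, work
deferred this way could sit pending until ksoftirqd got CPU time; the
SOFTIRQ_NOW_MASK change below lets pending HI and TASKLET softirqs be
processed synchronously again.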
+
+Reported-and-tested-by: Hanna Hawa
+Reported-and-tested-by: Josef Griebichler
+Reported-by: Mauro Carvalho Chehab
+Cc: Alan Stern
+Cc: Greg Kroah-Hartman
+Cc: Eric Dumazet
+Cc: Ingo Molnar
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/softirq.c | 12 ++++++++----
+ 1 file changed, 8 insertions(+), 4 deletions(-)
+
+--- a/kernel/softirq.c
++++ b/kernel/softirq.c
+@@ -79,12 +79,16 @@ static void wakeup_softirqd(void)
+ 
+ /*
+  * If ksoftirqd is scheduled, we do not want to process pending softirqs
+- * right now. Let ksoftirqd handle this at its own rate, to get fairness.
++ * right now. Let ksoftirqd handle this at its own rate, to get fairness,
++ * unless we're doing some of the synchronous softirqs.
+  */
+-static bool ksoftirqd_running(void)
++#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
++static bool ksoftirqd_running(unsigned long pending)
+ {
+ 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+ 
++	if (pending & SOFTIRQ_NOW_MASK)
++		return false;
+ 	return tsk && (tsk->state == TASK_RUNNING);
+ }
+ 
+@@ -329,7 +333,7 @@ asmlinkage __visible void do_softirq(voi
+ 
+ 	pending = local_softirq_pending();
+ 
+-	if (pending && !ksoftirqd_running())
++	if (pending && !ksoftirqd_running(pending))
+ 		do_softirq_own_stack();
+ 
+ 	local_irq_restore(flags);
+@@ -356,7 +360,7 @@ void irq_enter(void)
+ 
+ static inline void invoke_softirq(void)
+ {
+-	if (ksoftirqd_running())
++	if (ksoftirqd_running(local_softirq_pending()))
+ 		return;
+ 
+ 	if (!force_irqthreads) {
diff --git a/queue-4.17/sched-deadline-update-rq_clock-of-later_rq-when-pushing-a-task.patch b/queue-4.17/sched-deadline-update-rq_clock-of-later_rq-when-pushing-a-task.patch
new file mode 100644
index 00000000000..4cecdbfd619
--- /dev/null
+++ b/queue-4.17/sched-deadline-update-rq_clock-of-later_rq-when-pushing-a-task.patch
@@ -0,0 +1,95 @@
+From 840d719604b0925ca23dde95f1767e4528668369 Mon Sep 17 00:00:00 2001
+From: Daniel Bristot de Oliveira
+Date: Fri, 20 Jul 2018 11:16:30 +0200
+Subject: sched/deadline: Update rq_clock of later_rq when pushing a task
+
+From: Daniel Bristot de Oliveira
+
+commit 840d719604b0925ca23dde95f1767e4528668369 upstream.
+
+Daniel Casini got this warn while running a DL task here at RetisLab:
+
+ [ 461.137582] ------------[ cut here ]------------
+ [ 461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
+ [ 461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
+ [a ton of modules]
+ [ 461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
+ [ 461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
+ [ 461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
+ [ 461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
+ [ 461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
+ [ 461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
+ [ 461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
+ [ 461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
+ [ 461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
+ [ 461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
+ [ 461.137678] FS: 00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
+ [ 461.137679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+ [ 461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
+ [ 461.137680] Call Trace:
+ [ 461.137684] push_dl_task.part.46+0x3bc/0x460
+ [ 461.137686] task_woken_dl+0x60/0x80
+ [ 461.137689] ttwu_do_wakeup+0x4f/0x150
+ [ 461.137690] ttwu_do_activate+0x77/0x80
+ [ 461.137692] try_to_wake_up+0x1d6/0x4c0
+ [ 461.137693] wake_up_q+0x32/0x70
+ [ 461.137696] do_futex+0x7e7/0xb50
+ [ 461.137698] __x64_sys_futex+0x8b/0x180
+ [ 461.137701] do_syscall_64+0x5a/0x110
+ [ 461.137703] entry_SYSCALL_64_after_hwframe+0x44/0xa9
+ [ 461.137705] RIP: 0033:0x7efe4918ca26
+ [ 461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
+ [ 461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
+ [ 461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
+ [ 461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
+ [ 461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
+ [ 461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
+ [ 461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
+ [ 461.137743] ---[ end trace 187df4cad2bf7649 ]---
+
+This warning happened in the push_dl_task(), because
+__add_running_bw()->cpufreq_update_util() is getting the rq_clock of
+the later_rq before its update, which takes place at activate_task().
+The fix then is to update the rq_clock before calling add_running_bw().
+
+To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
+activate_task().
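
In miniature, the invariant at stake is: refresh the runqueue clock
before anything reads it, and tell the enqueue path when that has
already been done so it is not refreshed twice. The toy user-space
model below illustrates the pattern; it is not kernel code, the names
merely mirror the scheduler's, and the flag value is arbitrary:

  #include <stdio.h>
  #include <time.h>

  struct rq { long long clock; };

  #define ENQUEUE_NOCLOCK 0x08	/* caller already updated the clock */

  static void update_rq_clock(struct rq *rq)
  {
  	struct timespec ts;

  	clock_gettime(CLOCK_MONOTONIC, &ts);
  	rq->clock = ts.tv_sec * 1000000000LL + ts.tv_nsec;
  }

  static void add_running_bw(struct rq *rq)
  {
  	/* Reads rq->clock, so the clock must be current by now. */
  	printf("accounting bandwidth at clock=%lld\n", rq->clock);
  }

  static void activate_task(struct rq *rq, int flags)
  {
  	if (!(flags & ENQUEUE_NOCLOCK))
  		update_rq_clock(rq);	/* the double update the fix avoids */
  	/* ... enqueue the task using rq->clock ... */
  }

  int main(void)
  {
  	struct rq later_rq = { 0 };

  	update_rq_clock(&later_rq);	/* the fix: refresh first */
  	add_running_bw(&later_rq);	/* now sees a fresh clock */
  	activate_task(&later_rq, ENQUEUE_NOCLOCK);	/* no second update */
  	return 0;
  }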
+
+Reported-by: Daniel Casini
+Signed-off-by: Daniel Bristot de Oliveira
+Signed-off-by: Peter Zijlstra (Intel)
+Acked-by: Juri Lelli
+Cc: Clark Williams
+Cc: Linus Torvalds
+Cc: Luca Abeni
+Cc: Peter Zijlstra
+Cc: Steven Rostedt
+Cc: Thomas Gleixner
+Cc: Tommaso Cucinotta
+Fixes: e0367b12674b ("sched/deadline: Move CPU frequency selection triggering points")
+Link: http://lkml.kernel.org/r/ca31d073a4788acf0684a8b255f14fea775ccf20.1532077269.git.bristot@redhat.com
+Signed-off-by: Ingo Molnar
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/sched/deadline.c | 8 +++++++-
+ 1 file changed, 7 insertions(+), 1 deletion(-)
+
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -2090,8 +2090,14 @@ retry:
+ 	sub_rq_bw(&next_task->dl, &rq->dl);
+ 	set_task_cpu(next_task, later_rq->cpu);
+ 	add_rq_bw(&next_task->dl, &later_rq->dl);
++
++	/*
++	 * Update the later_rq clock here, because the clock is used
++	 * by the cpufreq_update_util() inside __add_running_bw().
++	 */
++	update_rq_clock(later_rq);
+ 	add_running_bw(&next_task->dl, &later_rq->dl);
+-	activate_task(later_rq, next_task, 0);
++	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
+ 	ret = 1;
+ 
+ 	resched_curr(later_rq);
diff --git a/queue-4.17/series b/queue-4.17/series
index 2ff105343f4..a840d3a49d2 100644
--- a/queue-4.17/series
+++ b/queue-4.17/series
@@ -1,2 +1,5 @@
 parisc-enable-config_mlongcalls-by-default.patch
 parisc-define-mb-and-add-memory-barriers-to-assembler-unlock-sequences.patch
+mark-hi-and-tasklet-softirq-synchronous.patch
+stop_machine-disable-preemption-after-queueing-stopper-threads.patch
+sched-deadline-update-rq_clock-of-later_rq-when-pushing-a-task.patch
diff --git a/queue-4.17/stop_machine-disable-preemption-after-queueing-stopper-threads.patch b/queue-4.17/stop_machine-disable-preemption-after-queueing-stopper-threads.patch
new file mode 100644
index 00000000000..a8c40a3cc76
--- /dev/null
+++ b/queue-4.17/stop_machine-disable-preemption-after-queueing-stopper-threads.patch
@@ -0,0 +1,91 @@
+From 2610e88946632afb78aa58e61f11368ac4c0af7b Mon Sep 17 00:00:00 2001
+From: "Isaac J. Manjarres"
+Date: Tue, 17 Jul 2018 12:35:29 -0700
+Subject: stop_machine: Disable preemption after queueing stopper threads
+
+From: Isaac J. Manjarres
+
+commit 2610e88946632afb78aa58e61f11368ac4c0af7b upstream.
+
+This commit:
+
+ 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
+
+does not fully address the race condition that can occur
+as follows:
+
+On one CPU, call it CPU 3, thread 1 invokes
+cpu_stop_queue_two_works(2, 3,...), and the execution is such
+that thread 1 queues the works for migration/2 and migration/3,
+and is preempted after releasing the locks for migration/2 and
+migration/3, but before waking the threads.
+
+Then, on CPU 2, a kworker, call it thread 2, is running,
+and it invokes cpu_stop_queue_two_works(1, 2,...), such that
+thread 2 queues the works for migration/1 and migration/2.
+Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
+migration/2 and migration/3. This means that when CPU 2
+releases the locks for migration/1 and migration/2, but before
+it wakes those threads, it can be preempted by migration/2.
+
+If thread 2 is preempted by migration/2, then migration/2 will
+execute the first work item successfully, since migration/3
+was woken up by CPU 3, but when it goes to execute the second
+work item, it disables preemption, calls multi_cpu_stop(),
+and thus, CPU 2 will wait forever for migration/1, which should
+have been woken up by thread 2. However migration/1 cannot be
+woken up by thread 2, since it is a kworker, so it is affine to
+CPU 2, but CPU 2 is running migration/2 with preemption
+disabled, so thread 2 will never run.
+
+Disable preemption after queueing works for stopper threads
+to ensure that the operation of queueing the works and waking
+the stopper threads is atomic.
+
+Co-Developed-by: Prasad Sodagudi
+Co-Developed-by: Pavankumar Kondeti
+Signed-off-by: Isaac J. Manjarres
+Signed-off-by: Prasad Sodagudi
+Signed-off-by: Pavankumar Kondeti
+Signed-off-by: Peter Zijlstra (Intel)
+Cc: Linus Torvalds
+Cc: Peter Zijlstra
+Cc: Thomas Gleixner
+Cc: bigeasy@linutronix.de
+Cc: gregkh@linuxfoundation.org
+Cc: matt@codeblueprint.co.uk
+Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
+Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isaacm@codeaurora.org
+Signed-off-by: Ingo Molnar
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/stop_machine.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/kernel/stop_machine.c
++++ b/kernel/stop_machine.c
+@@ -260,6 +260,15 @@ retry:
+ 	err = 0;
+ 	__cpu_stop_queue_work(stopper1, work1, &wakeq);
+ 	__cpu_stop_queue_work(stopper2, work2, &wakeq);
++	/*
++	 * The waking up of stopper threads has to happen
++	 * in the same scheduling context as the queueing.
++	 * Otherwise, there is a possibility of one of the
++	 * above stoppers being woken up by another CPU,
++	 * and preempting us. This will cause us to not
++	 * wake up the other stopper forever.
++	 */
++	preempt_disable();
+ unlock:
+ 	raw_spin_unlock(&stopper2->lock);
+ 	raw_spin_unlock_irq(&stopper1->lock);
+@@ -271,7 +280,6 @@ unlock:
+ 
+ 	if (!err) {
+-		preempt_disable();
+ 		wake_up_q(&wakeq);
+ 		preempt_enable();
+ 	}
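
Pieced together, the queueing path of cpu_stop_queue_two_works() after
this patch has roughly the following shape. This is a simplified
sketch of the post-patch control flow with the retry and error paths
elided; the hunks above are the authoritative change:

  raw_spin_lock_irq(&stopper1->lock);
  raw_spin_lock(&stopper2->lock);
  ...
  err = 0;
  __cpu_stop_queue_work(stopper1, work1, &wakeq);
  __cpu_stop_queue_work(stopper2, work2, &wakeq);
  preempt_disable();	/* taken while both works are queued */
  unlock:
  	raw_spin_unlock(&stopper2->lock);
  	raw_spin_unlock_irq(&stopper1->lock);

  	if (!err) {
  		wake_up_q(&wakeq);	/* a stopper woken by another CPU
  					 * cannot preempt us here... */
  		preempt_enable();	/* ...until both wakeups are issued */
  	}

Because preemption is now disabled before the locks are dropped, a
stopper thread that becomes runnable on this CPU in the window after
the unlocks cannot run until wake_up_q() has woken both stoppers,
which closes the race described in the changelog.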