]> git.ipfire.org Git - thirdparty/kernel/stable.git/commit
workqueue: cond_resched() after processing each work item
authorTejun Heo <tj@kernel.org>
Wed, 28 Aug 2013 21:33:37 +0000 (17:33 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Mon, 14 Apr 2014 13:44:16 +0000 (06:44 -0700)
commit00cef7a5e0766f0f4bedc9da1c80fbe992cf68ef
treecf6eee93fbfe7eb5bfbc934f26391c69aa454f5c
parentaa34e62c2f0d4a105606971a1eb666f22338993b
workqueue: cond_resched() after processing each work item

commit b22ce2785d97423846206cceec4efee0c4afd980 upstream.

If !PREEMPT, a kworker running work items back to back can hog CPU.
This becomes dangerous when a self-requeueing work item which is
waiting for something to happen races against stop_machine.  Such
self-requeueing work item would requeue itself indefinitely hogging
the kworker and CPU it's running on while stop_machine would wait for
that CPU to enter stop_machine while preventing anything else from
happening on all other CPUs.  The two would deadlock.

Jamie Liu reports that this deadlock scenario exists around
scsi_requeue_run_queue() and libata port multiplier support, where one
port may exclude command processing from other ports.  With the right
timing, scsi_requeue_run_queue() can end up requeueing itself trying
to execute an IO which is asked to be retried while another device has
an exclusive access, which in turn can't make forward progress due to
stop_machine.

Fix it by invoking cond_resched() after executing each work item.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jamie Liu <jamieliu@google.com>
References: http://thread.gmane.org/gmane.linux.kernel/1552567
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Qiang Huang <h.huangqiang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernel/workqueue.c