There's a really rare and obscure bug in CFQ that causes a crash in
cfq_dispatch_insert() due to rq == NULL. One example of that is seen
here:
http://lkml.org/lkml/2007/4/15/41
Neil correctly diagnosed how this can happen; read that analysis
here:
http://lkml.org/lkml/2007/4/25/57
This looks like it requires md to trigger, even though it should
potentially be possible to do with O_DIRECT (at least if you edit the
kernel and doctor some of the unplug calls).
The fix is to move the ->next_rq update to when we add a request to the
rbtree. That removes the possibility of a request existing in the
rbtree without ->next_rq being correctly updated.
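
As a rough illustration of the invariant the fix establishes, here is a
minimal userspace sketch. It is not kernel code: toy_rq, toy_queue,
toy_choose() and toy_add_rq() are hypothetical stand-ins, a sorted
linked list stands in for the rbtree, and the chooser simply prefers
the lower sector, which is much simpler than what cfq_choose_req()
does. The only point is that insertion and the next_rq update happen in
the same step, so a queued request can never coexist with a NULL
next_rq.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct request / struct cfq_queue. */
struct toy_rq {
	long sector;
	struct toy_rq *next;
};

struct toy_queue {
	struct toy_rq *sorted;   /* sorted list, standing in for the rbtree */
	struct toy_rq *next_rq;  /* cached next-serve candidate */
};

/* Hypothetical chooser: prefer the lower sector, tolerate NULL inputs. */
static struct toy_rq *toy_choose(struct toy_rq *a, struct toy_rq *b)
{
	if (!a)
		return b;
	if (!b)
		return a;
	return (b->sector < a->sector) ? b : a;
}

/* Insert rq in sorted order and refresh next_rq in the same step. */
static void toy_add_rq(struct toy_queue *q, struct toy_rq *rq)
{
	struct toy_rq **pp = &q->sorted;

	while (*pp && (*pp)->sector <= rq->sector)
		pp = &(*pp)->next;
	rq->next = *pp;
	*pp = rq;

	/* Same place the request becomes visible, so next_rq cannot lag. */
	q->next_rq = toy_choose(q->next_rq, rq);
	assert(q->next_rq != NULL);
}
```

With this layout, any path that queues a request goes through
toy_add_rq() and therefore also refreshes next_rq, rather than relying
on a separate enqueue hook that some callers might bypass.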
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
if (!cfq_cfqq_on_rr(cfqq))
cfq_add_cfqq_rr(cfqd, cfqq);
+
+ /*
+ * check if this request is a better next-serve candidate
+ */
+ cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq);
+ BUG_ON(!cfqq->next_rq);
}
static inline void
if (rq_is_meta(rq))
cfqq->meta_pending++;
- /*
- * check if this request is a better next-serve candidate)) {
- */
- cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq);
- BUG_ON(!cfqq->next_rq);
-
/*
* we never wait for an async request and we don't allow preemption
* of an async request. so just return early