--- /dev/null
+From d59b23795933678c9638fd20c942d2b4f3cd6185 Mon Sep 17 00:00:00 2001
+From: Coly Li <colyli@suse.de>
+Date: Mon, 30 Oct 2017 14:46:31 -0700
+Subject: bcache: only permit to recovery read error when cache device is clean
+
+From: Coly Li <colyli@suse.de>
+
+commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.
+
+When bcache does read I/Os, for example in writeback or writethrough mode,
+if a read request on the cache device fails, bcache will try to recover
+the request by reading from the cached device. If the data on the cached
+device is not synced with the cache device, the requester will get stale
+data.
+
+For a critical storage system such as a database, providing stale data
+from recovery may result in application-level data corruption, which is
+unacceptable.
+
+With this patch, for a failed read request in writeback or writethrough
+mode, a recoverable read request is only recovered when the cache device
+is clean. That is to say, all data on the cached device is up to date.
+
+For other cache modes in bcache, read requests will never hit
+cached_dev_read_error(), so they don't need this patch.
+
+Please note that because the cache mode can be switched arbitrarily at run
+time, a writethrough mode might have been switched from a writeback mode.
+Therefore checking dc->has_dirty in writethrough mode still makes sense.
+
+Changelog:
+v4: fix parens error pointed out by Michael Lyle.
+v3: per feedback from Kent Overstreet, recovering stale data is a bug to
+ fix, and an option to permit it is unnecessary. So in this version
+ the sysfs file is removed.
+v2: rename sysfs entry from allow_stale_data_on_failure to
+ allow_stale_data_on_failure, and fix the confusing commit log.
+v1: initial patch posted.
+
+[small change to patch comment spelling by mlyle]
+
+Signed-off-by: Coly Li <colyli@suse.de>
+Signed-off-by: Michael Lyle <mlyle@lyle.org>
+Reported-by: Arne Wolf <awolf@lenovo.com>
+Reviewed-by: Michael Lyle <mlyle@lyle.org>
+Cc: Kent Overstreet <kent.overstreet@gmail.com>
+Cc: Nix <nix@esperi.org.uk>
+Cc: Kai Krakow <hurikhan77@gmail.com>
+Cc: Eric Wheeler <bcache@lists.ewheeler.net>
+Cc: Junhui Tang <tang.junhui@zte.com.cn>
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/md/bcache/request.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/drivers/md/bcache/request.c
++++ b/drivers/md/bcache/request.c
+@@ -705,8 +705,16 @@ static void cached_dev_read_error(struct
+ {
+ struct search *s = container_of(cl, struct search, cl);
+ struct bio *bio = &s->bio.bio;
++ struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
+
+- if (s->recoverable) {
++ /*
++ * If cache device is dirty (dc->has_dirty is non-zero), then
++ * recovery a failed read request from cached device may get a
++ * stale data back. So read failure recovery is only permitted
++ * when cache device is clean.
++ */
++ if (s->recoverable &&
++ (dc && !atomic_read(&dc->has_dirty))) {
+ /* Retry from the backing device: */
+ trace_bcache_read_retry(s->orig_bio);
+
--- /dev/null
+From e393aa2446150536929140739f09c6ecbcbea7f0 Mon Sep 17 00:00:00 2001
+From: Rui Hua <huarui.dev@gmail.com>
+Date: Fri, 24 Nov 2017 15:14:26 -0800
+Subject: bcache: recover data from backing when data is clean
+
+From: Rui Hua <huarui.dev@gmail.com>
+
+commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.
+
+When we send a read request and hit clean data in the cache device, there
+is a situation called cache read race in bcache (see the comment at the
+tail of cache_look_up(); the following explanation is copied from there):
+The bucket we're reading from might be reused while our bio is in flight,
+and we could then end up reading the wrong data. We guard against this
+by checking (in bch_cache_read_endio()) if the pointer is stale again;
+if so, we treat it as an error (s->iop.error = -EINTR) and reread from
+the backing device (but we don't pass that error up anywhere)
+
+It should be noted that a cache read race happens under normal
+circumstances, not when the SSD has failed; it is counted
+and shown in /sys/fs/bcache/XXX/internal/cache_read_races.
+
+Without this patch, when we use writeback mode, we will never reread from
+the backing device when a cache read race happens until the whole cache
+device is clean, because the condition
+(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
+cached_dev_read_error(). In this situation, s->iop.error (= -EINTR)
+will be passed up, and in the end the user will receive -EINTR when its
+bio ends. This is not appropriate, and looks weird to the upper-layer
+application.
+
+In this patch, we use s->read_dirty_data to judge whether the read
+request hit dirty data in the cache device. It is safe to reread data from
+the backing device when the read request hit clean data. This not only
+handles the cache read race, but also recovers data when a read
+request from the cache device fails.
+
+[edited by mlyle to fix up whitespace, commit log title, comment
+spelling]
+
+Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
+Signed-off-by: Hua Rui <huarui.dev@gmail.com>
+Reviewed-by: Michael Lyle <mlyle@lyle.org>
+Reviewed-by: Coly Li <colyli@suse.de>
+Signed-off-by: Michael Lyle <mlyle@lyle.org>
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/md/bcache/request.c | 13 ++++++-------
+ 1 file changed, 6 insertions(+), 7 deletions(-)
+
+--- a/drivers/md/bcache/request.c
++++ b/drivers/md/bcache/request.c
+@@ -705,16 +705,15 @@ static void cached_dev_read_error(struct
+ {
+ struct search *s = container_of(cl, struct search, cl);
+ struct bio *bio = &s->bio.bio;
+- struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
+
+ /*
+- * If cache device is dirty (dc->has_dirty is non-zero), then
+- * recovery a failed read request from cached device may get a
+- * stale data back. So read failure recovery is only permitted
+- * when cache device is clean.
++ * If read request hit dirty data (s->read_dirty_data is true),
++ * then recovery a failed read request from cached device may
++ * get a stale data back. So read failure recovery is only
++ * permitted when read request hit clean data in cache device,
++ * or when cache read race happened.
+ */
+- if (s->recoverable &&
+- (dc && !atomic_read(&dc->has_dirty))) {
++ if (s->recoverable && !s->read_dirty_data) {
+ /* Retry from the backing device: */
+ trace_bcache_read_retry(s->orig_bio);
+