From: Greg Kroah-Hartman Date: Fri, 31 Aug 2018 20:17:16 +0000 (-0700) Subject: 4.18-stable patches X-Git-Tag: v3.18.121~29 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=6c2e1ef67f4659fb229802f5297e9b53baec2067;p=thirdparty%2Fkernel%2Fstable-queue.git 4.18-stable patches added patches: nvme-pci-add-a-memory-barrier-to-nvme_dbbuf_update_and_check_event.patch --- diff --git a/queue-4.18/nvme-pci-add-a-memory-barrier-to-nvme_dbbuf_update_and_check_event.patch b/queue-4.18/nvme-pci-add-a-memory-barrier-to-nvme_dbbuf_update_and_check_event.patch new file mode 100644 index 00000000000..86c91d865b3 --- /dev/null +++ b/queue-4.18/nvme-pci-add-a-memory-barrier-to-nvme_dbbuf_update_and_check_event.patch @@ -0,0 +1,70 @@ +From f1ed3df20d2d223e0852cc4ac1f19bba869a7e3c Mon Sep 17 00:00:00 2001 +From: Michal Wnukowski +Date: Wed, 15 Aug 2018 15:51:57 -0700 +Subject: nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event + +From: Michal Wnukowski + +commit f1ed3df20d2d223e0852cc4ac1f19bba869a7e3c upstream. + +In many architectures loads may be reordered with older stores to +different locations. In the nvme driver the following two operations +could be reordered: + + - Write shadow doorbell (dbbuf_db) into memory. + - Read EventIdx (dbbuf_ei) from memory. + +This can result in a potential race condition between driver and VM host +processing requests (if given virtual NVMe controller has a support for +shadow doorbell). If that occurs, then the NVMe controller may decide to +wait for MMIO doorbell from guest operating system, and guest driver may +decide not to issue MMIO doorbell on any of subsequent commands. + +This issue is purely timing-dependent one, so there is no easy way to +reproduce it. Currently the easiest known approach is to run "Oracle IO +Numbers" (orion) that is shipped with Oracle DB: + +orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \ + concat -write 40 -duration 120 -matrix row -testname nvme_test + +Where nvme_test is a .lun file that contains a list of NVMe block +devices to run test against. Limiting number of vCPUs assigned to given +VM instance seems to increase chances for this bug to occur. On test +environment with VM that got 4 NVMe drives and 1 vCPU assigned the +virtual NVMe controller hang could be observed within 10-20 minutes. +That correspond to about 400-500k IO operations processed (or about +100GB of IO read/writes). + +Orion tool was used as a validation and set to run in a loop for 36 +hours (equivalent of pushing 550M IO operations). No issues were +observed. That suggest that the patch fixes the issue. + +Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices") +Signed-off-by: Michal Wnukowski +Reviewed-by: Keith Busch +Reviewed-by: Sagi Grimberg +[hch: updated changelog and comment a bit] +Signed-off-by: Christoph Hellwig +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/nvme/host/pci.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +--- a/drivers/nvme/host/pci.c ++++ b/drivers/nvme/host/pci.c +@@ -316,6 +316,14 @@ static bool nvme_dbbuf_update_and_check_ + old_value = *dbbuf_db; + *dbbuf_db = value; + ++ /* ++ * Ensure that the doorbell is updated before reading the event ++ * index from memory. The controller needs to provide similar ++ * ordering to ensure the envent index is updated before reading ++ * the doorbell. ++ */ ++ mb(); ++ + if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value)) + return false; + } diff --git a/queue-4.18/series b/queue-4.18/series index 9e21877f7ad..b9952b9c05f 100644 --- a/queue-4.18/series +++ b/queue-4.18/series @@ -43,3 +43,4 @@ ext4-use-ext4_warning-for-sb_getblk-failure.patch ext4-sysfs-print-ext4_super_block-fields-as-little-endian.patch ext4-reset-error-code-in-ext4_find_entry-in-fallback.patch ext4-fix-race-when-setting-the-bitmap-corrupted-flag.patch +nvme-pci-add-a-memory-barrier-to-nvme_dbbuf_update_and_check_event.patch