--- /dev/null
+From 27f344eb15dd0da80ebec80c7245e8c85043f841 Mon Sep 17 00:00:00 2001
+From: James Smart <james.smart@emulex.com>
+Date: Wed, 7 May 2014 17:16:46 -0400
+Subject: lpfc: Add iotag memory barrier
+
+From: James Smart <james.smart@emulex.com>
+
+commit 27f344eb15dd0da80ebec80c7245e8c85043f841 upstream.
+
+Add a memory barrier to ensure the valid bit is read before
+any of the cqe payload is read. This fixes an issue seen
+on Power where the cqe payload was getting loaded before
+the valid bit. When this occurred, we saw an iotag out of
+range error when a command completed, but since the iotag
+looked invalid the command didn't get completed to scsi core.
+Later we hit the command timeout, attempted to abort the command,
+then waited for the aborted command to get returned. Since the
+adapter already returned the command, we timeout waiting,
+and end up escalating EEH all the way to host reset. This
+patch fixes this issue.
+
+Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
+Signed-off-by: James Smart <james.smart@emulex.com>
+Signed-off-by: Christoph Hellwig <hch@lst.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/scsi/lpfc/lpfc_sli.c | 21 +++++++++++++++++++++
+ 1 file changed, 21 insertions(+)
+
+--- a/drivers/scsi/lpfc/lpfc_sli.c
++++ b/drivers/scsi/lpfc/lpfc_sli.c
+@@ -263,6 +263,16 @@ lpfc_sli4_eq_get(struct lpfc_queue *q)
+ return NULL;
+
+ q->hba_index = idx;
++
++ /*
++ * insert barrier for instruction interlock : data from the hardware
++ * must have the valid bit checked before it can be copied and acted
++ * upon. Given what was seen in lpfc_sli4_cq_get() of speculative
++ * instructions allowing action on content before valid bit checked,
++ * add barrier here as well. May not be needed as "content" is a
++ * single 32-bit entity here (vs multi word structure for cq's).
++ */
++ mb();
+ return eqe;
+ }
+
+@@ -368,6 +378,17 @@ lpfc_sli4_cq_get(struct lpfc_queue *q)
+
+ cqe = q->qe[q->hba_index].cqe;
+ q->hba_index = idx;
++
++ /*
++ * insert barrier for instruction interlock : data from the hardware
++ * must have the valid bit checked before it can be copied and acted
++ * upon. Speculative instructions were allowing a bcopy at the start
++ * of lpfc_sli4_fp_handle_wcqe(), which is called immediately
++ * after our return, to copy data before the valid bit check above
++ * was done. As such, some of the copied data was stale. The barrier
++ * ensures the check is before any data is copied.
++ */
++ mb();
+ return cqe;
+ }
+