*
* C11 has standardized a good model for expressing these orderings when doing
* atomics. It defines three *tiers* of ordering:
+ *
* 1. Sequential Consistency (every processor sees the same total order of
* events)
*
* In other words, this tier is close in behavior to Sequential Consistency
* in much the same way a General-Relativity universe is close to a
* Newtonian universe.
+ *
* 3. Relaxed (i.e unordered/unfenced)
*
* In C11 standard's terminology for atomic memory ordering,
* - https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
* - http://preshing.com/20120913/acquire-and-release-semantics/
*
+ * The x86/amd64 architectures have a Total Store Order memory model, which
+ * differs from a sequentially consistent ordering by allowing stores to
+ * be re-ordered after loads.
+ *
* In this file:
* 1. all RMW (Read/Modify/Write) operations are sequentially consistent.
* This includes operations like Atomic_IncN, Atomic_ReadIfEqualWriteN,
* Atomic_ReadWriteN, etc.
- * 2. all R and W operations are relaxed. This includes operations like
- * Atomic_WriteN, Atomic_ReadN, Atomic_TestBitN, etc.
+ * 2. all R and W operations have TSO consistency. This includes operations
+ * like Atomic_WriteN, Atomic_ReadN, Atomic_TestBitN, etc.
*
* The below routines of course ensure both the CPU and compiler honor the
* ordering constraint.
*
* Notes:
- * 1. Since R-only and W-only operations do not provide ordering, callers
- * using them for synchronizing operations like double-checked
- * initialization or releasing spinlocks must provide extra barriers.
- * 2. This implementation of Atomic operations is suboptimal. On x86,simple
+ * 1. R-only and W-only operations provide TSO consistency between atomic
+ * operations because most of our code was written with that assumption
+ * in mind (PR 2060781).
+ * 2. This implementation of Atomic operations is suboptimal. On x86, simple
* reads and writes have acquire/release semantics at the hardware level.
- * On arm64, we have separate instructions for sequentially consistent
- * reads and writes (the same instructions are used for acquire/release).
- * Neither of these are exposed for R-only or W-only callers.
+ * On arm64, we have separate functions for sequentially consistent
+ * reads and writes, see Atomic_Read*Acquire and Atomic_Write*Release.
+ * 3. For R-only and W-only "relaxed" ordering, see Atomic_ReadNRelaxed and
+ * Atomic_WriteNRelaxed.
*
* For further details on x86 and ARM memory ordering see
* https://wiki.eng.vmware.com/ARM/MemoryOrdering.
: "cc"
);
#else
- out = (var->value & (CONST64U(1) << bit)) != 0;
+ out = (Atomic_Read64(var) & (CONST64U(1) << bit)) != 0;
#endif
return out;
}
\
_val; \
})
-#define _VMATOM_SNIPPET_R _VMATOM_SNIPPET_R_NF
+
+/* Read total store order (returned). */
+#define _VMATOM_SNIPPET_R(bs, is, rp, es, atm) ({ \
+ uint##bs _val; \
+ \
+ /* ldr is atomic if and only if 'atm' is aligned on #bs bits. */ \
+ __asm__ __volatile__( \
+ "dmb ishld \n\t"\
+ "dmb ishst \n\t"\
+ "ldr"#is" %"#rp"0, %1 \n\t"\
+ "dmb ishld \n\t"\
+ : "=r" (_val) \
+ : "m" (*atm) \
+ : "memory" \
+ ); \
+ \
+ _val; \
+})
/* Read acquire/seq_cst (returned). */
#define _VMATOM_SNIPPET_R_SC(bs, is, rp, es, atm) ({ \
"ldar"#is" %"#rp"0, %1 \n\t"\
: "=r" (_val) \
: "Q" (*atm) \
+ : "memory" \
); \
\
_val; \
: "r" (val) \
); \
})
-#define _VMATOM_SNIPPET_W _VMATOM_SNIPPET_W_NF
+
+/* Write total store order. */
+#define _VMATOM_SNIPPET_W(bs, is, rp, es, atm, val) ({ \
+ __asm__ __volatile__( \
+ "dmb ishld \n\t"\
+ "dmb ishst \n\t"\
+ "str"#is" %"#rp"1, %0 \n\t"\
+ "dmb ishst \n\t"\
+ : "=m" (*atm) \
+ : "r" (val) \
+ : "memory" \
+ ); \
+})
/* Write release/seq_cst. */
#define _VMATOM_SNIPPET_W_SC(bs, is, rp, es, atm, val) ({ \
"stlr"#is" %"#rp"1, %0 \n\t"\
: "=Q" (*atm) \
: "r" (val) \
+ : "memory" \
); \
})