ptep_try_set() installs a kernel PTE with try_cmpxchg() but, unlike
__set_pte(), skips the barriers that arm64 requires after writing a valid
kernel PTE. Without them, a subsequent access can fault instead of seeing
the new mapping.

Issue them with emit_pte_barriers() rather than __set_pte_complete().
ptep_try_set() must finish the store before it returns, but
__set_pte_complete() defers the barriers when the calling context is in
lazy MMU mode.

v2: Emit the barriers directly instead of calling __set_pte_complete().
    (Catalin)

Fixes: 258df8fce42f ("mm: Add ptep_try_set() for lockless empty-slot installs")
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/aiRFcz78QTZdIHHB@arm.com/
Link: https://lore.kernel.org/bpf/7f5f7c94601312c1a401fb18998291cc@kernel.org
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
{
 	pteval_t old = 0;
-	return try_cmpxchg(&pte_val(*ptep), &old, pte_val(new_pte));
+	if (!try_cmpxchg(&pte_val(*ptep), &old, pte_val(new_pte)))
+		return false;
+
+	/*
+	 * The store must be complete by the time this returns, but the caller
+	 * may be in lazy MMU mode, where __set_pte_complete() would defer the
+	 * barriers. Issue them directly.
+	 */
+	emit_pte_barriers();
+	return true;
}
#define ptep_try_set ptep_try_set
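
For readers outside the arm64 tree, the shape of the fixed function can be
sketched in userspace with C11 atomics. This is an analogy under stated
assumptions, not the kernel code: slot_try_set() and publish_barrier() are
hypothetical names, a plain 64-bit slot stands in for the PTE, and a
sequentially consistent fence stands in for emit_pte_barriers(), whose real
arm64 barriers have no portable equivalent.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for emit_pte_barriers(). A C11 fence only
 * illustrates where the barrier goes; it does not reproduce the arm64
 * barriers the kernel issues after writing a valid kernel PTE.
 */
static inline void publish_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Install new_val only if the slot is still empty (zero), mirroring the
 * compare-and-exchange against old = 0 in ptep_try_set(). The barrier is
 * issued unconditionally on success, so nothing is left pending when the
 * function returns.
 */
static bool slot_try_set(_Atomic uint64_t *slot, uint64_t new_val)
{
	uint64_t old = 0;

	if (!atomic_compare_exchange_strong(slot, &old, new_val))
		return false;	/* lost the race: slot already populated */

	publish_barrier();
	return true;
}
```

A second slot_try_set() on an already populated slot returns false and
leaves the first value in place, which is the lockless empty-slot install
behaviour named in the Fixes: subject.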