From: Palmer Dabbelt Date: Fri, 12 Jul 2024 10:23:22 +0000 (-0700) Subject: Merge patch series "riscv: Apply Zawrs when available" X-Git-Tag: v6.11-rc1~94^2~2 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=5ee121a39330e437cae0d64feeb459c7ec9e9500;p=thirdparty%2Fkernel%2Flinux.git Merge patch series "riscv: Apply Zawrs when available" Andrew Jones says: Zawrs provides two instructions (wrs.nto and wrs.sto), where both are meant to allow the hart to enter a low-power state while waiting on a store to a memory location. The instructions also both wait an implementation-defined "short" duration (unless the implementation terminates the stall for another reason). The difference is that while wrs.sto will terminate when the duration elapses, wrs.nto, depending on configuration, will either just keep waiting or an ILL exception will be raised. Linux will use wrs.nto, so if platforms have an implementation which falls in the "just keep waiting" category (which is not expected), then it should _not_ advertise Zawrs in the hardware description. Like wfi (and with the same {m,h}status bits to configure it), when wrs.nto is configured to raise exceptions it's expected that the higher privilege level will see the instruction was a wait instruction, do something, and then resume execution following the instruction. For example, KVM does configure exceptions for wfi (hstatus.VTW=1) and therefore also for wrs.nto. KVM does this for wfi since it's better to allow other tasks to be scheduled while a VCPU waits for an interrupt. For waits such as those where wrs.nto/sto would be used, which are typically locks, it is also a good idea for KVM to be involved, as it can attempt to schedule the lock holding VCPU. This series starts with Christoph's addition of the riscv smp_cond_load_relaxed function which applies wrs.sto when available. That patch has been reworked to use wrs.nto and to use the same approach as Arm for the wait loop, since we can't have arbitrary C code between the load-reserved and the wrs. Then, hwprobe support is added (since the instructions are also usable from usermode), and finally KVM is taught about wrs.nto, allowing guests to see and use the Zawrs extension. We still don't have test results from hardware, and it's not possible to prove that using Zawrs is a win when testing on QEMU, not even when oversubscribing VCPUs to guests. However, it is possible to use KVM selftests to force a scenario where we can prove Zawrs does its job and does it well. [4] is a test which does this and, on my machine, without Zawrs it takes 16 seconds to complete and with Zawrs it takes 0.25 seconds. This series is also available here [1]. In order to use QEMU for testing a build with [2] is needed. In order to enable guests to use Zawrs with KVM using kvmtool, the branch at [3] may be used. [1] https://github.com/jones-drew/linux/commits/riscv/zawrs-v3/ [2] https://lore.kernel.org/all/20240312152901.512001-2-ajones@ventanamicro.com/ [3] https://github.com/jones-drew/kvmtool/commits/riscv/zawrs/ [4] https://github.com/jones-drew/linux/commit/cb2beccebcece10881db842ed69bdd5715cfab5d Link: https://lore.kernel.org/r/20240426100820.14762-8-ajones@ventanamicro.com * b4-shazam-merge: KVM: riscv: selftests: Add Zawrs extension to get-reg-list test KVM: riscv: Support guest wrs.nto riscv: hwprobe: export Zawrs ISA extension riscv: Add Zawrs support for spinlocks dt-bindings: riscv: Add Zawrs ISA extension description riscv: Provide a definition for 'pause' Signed-off-by: Palmer Dabbelt --- 5ee121a39330e437cae0d64feeb459c7ec9e9500 diff --cc Documentation/arch/riscv/hwprobe.rst index 652e6ef05baa0,e072ce8285d88..02eb4d98b7dea --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@@ -188,53 -188,10 +188,57 @@@ The following keys are defined manual starting from commit 95cf1f9 ("Add changes requested by Ved during signoff") + * :c:macro:`RISCV_HWPROBE_EXT_ZIHINTPAUSE`: The Zihintpause extension is + supported as defined in the RISC-V ISA manual starting from commit + d8ab5c78c207 ("Zihintpause is ratified"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZVE32X`: The Vector sub-extension Zve32x is + supported, as defined by version 1.0 of the RISC-V Vector extension manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZVE32F`: The Vector sub-extension Zve32f is + supported, as defined by version 1.0 of the RISC-V Vector extension manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZVE64X`: The Vector sub-extension Zve64x is + supported, as defined by version 1.0 of the RISC-V Vector extension manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZVE64F`: The Vector sub-extension Zve64f is + supported, as defined by version 1.0 of the RISC-V Vector extension manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZVE64D`: The Vector sub-extension Zve64d is + supported, as defined by version 1.0 of the RISC-V Vector extension manual. + + * :c:macro:`RISCV_HWPROBE_EXT_ZIMOP`: The Zimop May-Be-Operations extension is + supported as defined in the RISC-V ISA manual starting from commit + 58220614a5f ("Zimop is ratified/1.0"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZCA`: The Zca extension part of Zc* standard + extensions for code size reduction, as ratified in commit 8be3419c1c0 + ("Zcf doesn't exist on RV64 as it contains no instructions") of + riscv-code-size-reduction. + + * :c:macro:`RISCV_HWPROBE_EXT_ZCB`: The Zcb extension part of Zc* standard + extensions for code size reduction, as ratified in commit 8be3419c1c0 + ("Zcf doesn't exist on RV64 as it contains no instructions") of + riscv-code-size-reduction. + + * :c:macro:`RISCV_HWPROBE_EXT_ZCD`: The Zcd extension part of Zc* standard + extensions for code size reduction, as ratified in commit 8be3419c1c0 + ("Zcf doesn't exist on RV64 as it contains no instructions") of + riscv-code-size-reduction. + + * :c:macro:`RISCV_HWPROBE_EXT_ZCF`: The Zcf extension part of Zc* standard + extensions for code size reduction, as ratified in commit 8be3419c1c0 + ("Zcf doesn't exist on RV64 as it contains no instructions") of + riscv-code-size-reduction. + + * :c:macro:`RISCV_HWPROBE_EXT_ZCMOP`: The Zcmop May-Be-Operations extension is + supported as defined in the RISC-V ISA manual starting from commit + c732a4f39a4 ("Zcmop is ratified/1.0"). + + * :c:macro:`RISCV_HWPROBE_EXT_ZAWRS`: The Zawrs extension is supported as + ratified in commit 98918c844281 ("Merge pull request #1217 from + riscv/zawrs") of riscv-isa-manual. + * :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance information about the selected set of processors. diff --cc arch/riscv/include/asm/cmpxchg.h index ddb002ed89dea,725276dcb9960..637d7acab1f83 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@@ -8,63 -8,30 +8,66 @@@ #include + #include #include + #include + #include -#define __xchg_relaxed(ptr, new, size) \ +#define __arch_xchg_masked(prepend, append, r, p, n) \ +({ \ + u32 *__ptr32b = (u32 *)((ulong)(p) & ~0x3); \ + ulong __s = ((ulong)(p) & (0x4 - sizeof(*p))) * BITS_PER_BYTE; \ + ulong __mask = GENMASK(((sizeof(*p)) * BITS_PER_BYTE) - 1, 0) \ + << __s; \ + ulong __newx = (ulong)(n) << __s; \ + ulong __retx; \ + ulong __rc; \ + \ + __asm__ __volatile__ ( \ + prepend \ + "0: lr.w %0, %2\n" \ + " and %1, %0, %z4\n" \ + " or %1, %1, %z3\n" \ + " sc.w %1, %1, %2\n" \ + " bnez %1, 0b\n" \ + append \ + : "=&r" (__retx), "=&r" (__rc), "+A" (*(__ptr32b)) \ + : "rJ" (__newx), "rJ" (~__mask) \ + : "memory"); \ + \ + r = (__typeof__(*(p)))((__retx & __mask) >> __s); \ +}) + +#define __arch_xchg(sfx, prepend, append, r, p, n) \ +({ \ + __asm__ __volatile__ ( \ + prepend \ + " amoswap" sfx " %0, %2, %1\n" \ + append \ + : "=r" (r), "+A" (*(p)) \ + : "r" (n) \ + : "memory"); \ +}) + +#define _arch_xchg(ptr, new, sfx, prepend, append) \ ({ \ __typeof__(ptr) __ptr = (ptr); \ - __typeof__(new) __new = (new); \ - __typeof__(*(ptr)) __ret; \ - switch (size) { \ + __typeof__(*(__ptr)) __new = (new); \ + __typeof__(*(__ptr)) __ret; \ + \ + switch (sizeof(*__ptr)) { \ + case 1: \ + case 2: \ + __arch_xchg_masked(prepend, append, \ + __ret, __ptr, __new); \ + break; \ case 4: \ - __asm__ __volatile__ ( \ - " amoswap.w %0, %2, %1\n" \ - : "=r" (__ret), "+A" (*__ptr) \ - : "r" (__new) \ - : "memory"); \ + __arch_xchg(".w" sfx, prepend, append, \ + __ret, __ptr, __new); \ break; \ case 8: \ - __asm__ __volatile__ ( \ - " amoswap.d %0, %2, %1\n" \ - : "=r" (__ret), "+A" (*__ptr) \ - : "r" (__new) \ - : "memory"); \ + __arch_xchg(".d" sfx, prepend, append, \ + __ret, __ptr, __new); \ break; \ default: \ BUILD_BUG(); \ @@@ -203,22 -362,59 +206,77 @@@ arch_cmpxchg_relaxed((ptr), (o), (n)); \ }) +#define arch_cmpxchg64_relaxed(ptr, o, n) \ +({ \ + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ + arch_cmpxchg_relaxed((ptr), (o), (n)); \ +}) + +#define arch_cmpxchg64_acquire(ptr, o, n) \ +({ \ + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ + arch_cmpxchg_acquire((ptr), (o), (n)); \ +}) + +#define arch_cmpxchg64_release(ptr, o, n) \ +({ \ + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ + arch_cmpxchg_release((ptr), (o), (n)); \ +}) + + #ifdef CONFIG_RISCV_ISA_ZAWRS + /* + * Despite wrs.nto being "WRS-with-no-timeout", in the absence of changes to + * @val we expect it to still terminate within a "reasonable" amount of time + * for an implementation-specific other reason, a pending, locally-enabled + * interrupt, or because it has been configured to raise an illegal + * instruction exception. + */ + static __always_inline void __cmpwait(volatile void *ptr, + unsigned long val, + int size) + { + unsigned long tmp; + + asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop", + 0, RISCV_ISA_EXT_ZAWRS, 1) + : : : : no_zawrs); + + switch (size) { + case 4: + asm volatile( + " lr.w %0, %1\n" + " xor %0, %0, %2\n" + " bnez %0, 1f\n" + ZAWRS_WRS_NTO "\n" + "1:" + : "=&r" (tmp), "+A" (*(u32 *)ptr) + : "r" (val)); + break; + #if __riscv_xlen == 64 + case 8: + asm volatile( + " lr.d %0, %1\n" + " xor %0, %0, %2\n" + " bnez %0, 1f\n" + ZAWRS_WRS_NTO "\n" + "1:" + : "=&r" (tmp), "+A" (*(u64 *)ptr) + : "r" (val)); + break; + #endif + default: + BUILD_BUG(); + } + + return; + + no_zawrs: + asm volatile(RISCV_PAUSE : : : "memory"); + } + + #define __cmpwait_relaxed(ptr, val) \ + __cmpwait((ptr), (unsigned long)(val), sizeof(*(ptr))) + #endif + #endif /* _ASM_RISCV_CMPXCHG_H */ diff --cc arch/riscv/include/asm/hwcap.h index 4880324a1b298,5b358c3cf212f..b18b202ca141a --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@@ -81,17 -81,7 +81,18 @@@ #define RISCV_ISA_EXT_ZTSO 72 #define RISCV_ISA_EXT_ZACAS 73 #define RISCV_ISA_EXT_XANDESPMU 74 -#define RISCV_ISA_EXT_ZAWRS 75 +#define RISCV_ISA_EXT_ZVE32X 75 +#define RISCV_ISA_EXT_ZVE32F 76 +#define RISCV_ISA_EXT_ZVE64X 77 +#define RISCV_ISA_EXT_ZVE64F 78 +#define RISCV_ISA_EXT_ZVE64D 79 +#define RISCV_ISA_EXT_ZIMOP 80 +#define RISCV_ISA_EXT_ZCA 81 +#define RISCV_ISA_EXT_ZCB 82 +#define RISCV_ISA_EXT_ZCD 83 +#define RISCV_ISA_EXT_ZCF 84 +#define RISCV_ISA_EXT_ZCMOP 85 ++#define RISCV_ISA_EXT_ZAWRS 86 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --cc arch/riscv/include/uapi/asm/hwprobe.h index 9e48320cb970f,3b5e0adb1d72d..8b8f6ac0eae28 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@@ -59,18 -59,7 +59,19 @@@ struct riscv_hwprobe #define RISCV_HWPROBE_EXT_ZTSO (1ULL << 33) #define RISCV_HWPROBE_EXT_ZACAS (1ULL << 34) #define RISCV_HWPROBE_EXT_ZICOND (1ULL << 35) +#define RISCV_HWPROBE_EXT_ZIHINTPAUSE (1ULL << 36) +#define RISCV_HWPROBE_EXT_ZVE32X (1ULL << 37) +#define RISCV_HWPROBE_EXT_ZVE32F (1ULL << 38) +#define RISCV_HWPROBE_EXT_ZVE64X (1ULL << 39) +#define RISCV_HWPROBE_EXT_ZVE64F (1ULL << 40) +#define RISCV_HWPROBE_EXT_ZVE64D (1ULL << 41) +#define RISCV_HWPROBE_EXT_ZIMOP (1ULL << 42) +#define RISCV_HWPROBE_EXT_ZCA (1ULL << 43) +#define RISCV_HWPROBE_EXT_ZCB (1ULL << 44) +#define RISCV_HWPROBE_EXT_ZCD (1ULL << 45) +#define RISCV_HWPROBE_EXT_ZCF (1ULL << 46) +#define RISCV_HWPROBE_EXT_ZCMOP (1ULL << 47) + #define RISCV_HWPROBE_EXT_ZAWRS (1ULL << 48) #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) diff --cc arch/riscv/include/uapi/asm/kvm.h index a6215634df7c4,89ea06bd07c2b..e97db3296456e --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@@ -167,13 -167,7 +167,14 @@@ enum KVM_RISCV_ISA_EXT_ID KVM_RISCV_ISA_EXT_ZFA, KVM_RISCV_ISA_EXT_ZTSO, KVM_RISCV_ISA_EXT_ZACAS, + KVM_RISCV_ISA_EXT_SSCOFPMF, + KVM_RISCV_ISA_EXT_ZIMOP, + KVM_RISCV_ISA_EXT_ZCA, + KVM_RISCV_ISA_EXT_ZCB, + KVM_RISCV_ISA_EXT_ZCD, + KVM_RISCV_ISA_EXT_ZCF, + KVM_RISCV_ISA_EXT_ZCMOP, + KVM_RISCV_ISA_EXT_ZAWRS, KVM_RISCV_ISA_EXT_MAX, }; diff --cc arch/riscv/kernel/cpufeature.c index ec4bff7a827cb,02de9eaa3f421..0366dc3baf338 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@@ -345,8 -256,8 +345,9 @@@ const struct riscv_isa_ext_data riscv_i __RISCV_ISA_EXT_DATA(zihintntl, RISCV_ISA_EXT_ZIHINTNTL), __RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), __RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM), + __RISCV_ISA_EXT_DATA(zimop, RISCV_ISA_EXT_ZIMOP), __RISCV_ISA_EXT_DATA(zacas, RISCV_ISA_EXT_ZACAS), + __RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS), __RISCV_ISA_EXT_DATA(zfa, RISCV_ISA_EXT_ZFA), __RISCV_ISA_EXT_DATA(zfh, RISCV_ISA_EXT_ZFH), __RISCV_ISA_EXT_DATA(zfhmin, RISCV_ISA_EXT_ZFHMIN), diff --cc arch/riscv/kernel/sys_hwprobe.c index c6b77fee6e1aa,b86e3531a45ac..685594769535c --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@@ -112,22 -111,9 +112,23 @@@ static void hwprobe_isa_ext0(struct ris EXT_KEY(ZTSO); EXT_KEY(ZACAS); EXT_KEY(ZICOND); + EXT_KEY(ZIHINTPAUSE); + EXT_KEY(ZIMOP); + EXT_KEY(ZCA); + EXT_KEY(ZCB); + EXT_KEY(ZCMOP); + EXT_KEY(ZAWRS); + /* + * All the following extensions must depend on the kernel + * support of V. + */ if (has_vector()) { + EXT_KEY(ZVE32X); + EXT_KEY(ZVE32F); + EXT_KEY(ZVE64X); + EXT_KEY(ZVE64F); + EXT_KEY(ZVE64D); EXT_KEY(ZVBB); EXT_KEY(ZVBC); EXT_KEY(ZVKB);