From: Pierrick Bouvier
Date: Thu, 28 Nov 2024 21:38:43 +0000 (-0800)
Subject: plugins: optimize cpu_index code generation
X-Git-Tag: v10.0.0-rc0~107^2~70
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=dbf408b6678a6076bd2412159d0ce665dce6acd0;p=thirdparty%2Fqemu.git

plugins: optimize cpu_index code generation

When running with a single vcpu, we can return a constant instead of a
load when accessing cpu_index.
A side effect is that all tcg operations using it are optimized, most
notably scoreboard access.
When running a simple loop in user-mode, the speedup is around 20%.

Signed-off-by: Pierrick Bouvier
Reviewed-by: Richard Henderson
Signed-off-by: Richard Henderson
Message-ID: <20241128213843.1023080-1-pierrick.bouvier@linaro.org>
---

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 1ef075552ca..7e5f040bf73 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -102,6 +102,15 @@ static void gen_disable_mem_helper(void)
 
 static TCGv_i32 gen_cpu_index(void)
 {
+    /*
+     * Optimize when we run with a single vcpu. All values using cpu_index,
+     * including scoreboard index, will be optimized out.
+     * User-mode calls tb_flush when setting this flag. In system-mode, all
+     * vcpus are created before generating code.
+     */
+    if (!tcg_cflags_has(current_cpu, CF_PARALLEL)) {
+        return tcg_constant_i32(current_cpu->cpu_index);
+    }
     TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
     tcg_gen_ld_i32(cpu_index, tcg_env,
                    -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
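
Illustration (not part of the patch): the toy model below sketches why returning a constant for cpu_index lets dependent computations, such as a scoreboard offset, be folded at translation time. Every name in it (Val, ToyCpu, toy_gen_cpu_index, toy_fold_scoreboard_offset) is hypothetical and none of it is QEMU's API. It also assumes that a scoreboard slot offset is the index multiplied by the entry size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IR value: either a known constant or a runtime load. */
typedef enum { VAL_CONST, VAL_LOAD } ValKind;

typedef struct {
    ValKind kind;
    int32_t constant;   /* meaningful only when kind == VAL_CONST */
} Val;

/* Hypothetical stand-in for the vcpu state consulted by the generator. */
typedef struct {
    int32_t cpu_index;
    bool parallel;      /* plays the role of the CF_PARALLEL flag */
} ToyCpu;

/* Same shape as the patched function: a constant when not parallel. */
static Val toy_gen_cpu_index(const ToyCpu *cpu)
{
    if (!cpu->parallel) {
        return (Val){ .kind = VAL_CONST, .constant = cpu->cpu_index };
    }
    /* Otherwise the index must be loaded when the generated code runs. */
    return (Val){ .kind = VAL_LOAD, .constant = 0 };
}

/*
 * Assumed scoreboard addressing: offset = index * entry_size.
 * Returns true and stores the folded offset when the index is constant;
 * returns false when the multiply would have to be emitted as code.
 */
static bool toy_fold_scoreboard_offset(Val index, size_t entry_size,
                                       size_t *offset)
{
    if (index.kind != VAL_CONST) {
        return false;
    }
    *offset = (size_t)index.constant * entry_size;
    return true;
}
```

In this model, a non-parallel ToyCpu yields a VAL_CONST index and the offset is computed once by the generator, while a parallel one yields VAL_LOAD and nothing can be folded. This is the difference the commit message describes for scoreboard access.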