From e18acf9979948d4d552074f5c0b38837006d1141 Mon Sep 17 00:00:00 2001
From: Roger Sayle
Date: Sat, 20 Jun 2026 10:02:34 +0100
Subject: [PATCH] i386: Avoid calling gen_lowpart for V2DImode on DDmode register.

Hongtao did warn me there might be corner cases in STV when converting
DImode SUBREGs to use V2DImode vector registers.
https://gcc.gnu.org/pipermail/gcc-patches/2026-May/718615.html
is exactly such a case, where with -m32 -O2 -march=cascadelake we attempt
to convert a comparison containing (subreg:DI (reg:DD) 0), and the
resulting call to gen_lowpart (V2DImode, ...) triggers an ICE.  The
problem is a lack of transitivity: DDmode may be SUBREGed to DImode, and
DImode may be SUBREGed to V2DImode, but DDmode can't (directly) be
SUBREGed to V2DImode.

Alas my attempts to reduce a testcase (simpler than the one already in
the testsuite) haven't been successful...  The example below contains
the problematic SUBREG in a comparison, but in this case, STV doesn't
consider converting the chain profitable.

typedef float __decfloat64 __attribute__((mode(DD)));
__decfloat64 ext();
__decfloat64 x,y,z;

void foo()
{
  __decfloat64 t = ext();
  if (__builtin_memcmp((void*)&t,(void*)&x,sizeof(__decfloat64)) != 0
      && __builtin_memcmp((void*)&t,(void*)&y,sizeof(__decfloat64)) != 0
      && __builtin_memcmp((void*)&t,(void*)&z,sizeof(__decfloat64)) != 0)
    {
      x = t;
      y = t;
      z = t;
    }
}

Fortunately, the proposed fix is relatively straightforward, using
modes_tieable_p and validate_subreg to check that the source pseudo is
representable in an SSE register.  When originally written, this code
was expecting V1TImode, V2DImode or V4SImode.  If this isn't a simple
SUBREG, STV can emit a move instruction (to preserve it), then operate
on the resulting pseudo (as usual).

2026-06-20  Roger Sayle
	    Hongtao Liu

gcc/ChangeLog
	* config/i386/i386-features.cc (scalar_chain::convert_op): Check
	if the (DImode) SUBREG being converted by STV has an original
	mode that's tieable to the vector mode (V2DImode), if not (e.g.
	DDmode) emit this "conversion" as a separate move for reload to
	handle.
---
 gcc/config/i386/i386-features.cc | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index c0282f68986..23f8038bae8 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -1169,7 +1169,19 @@ scalar_chain::convert_op (rtx *op, rtx_insn *insn)
     {
       gcc_assert (SUBREG_P (*op));
       if (GET_MODE (*op) != vmode)
-	*op = gen_lowpart (vmode, *op);
+	{
+	  rtx inner = SUBREG_REG (*op);
+	  poly_uint64 byte = SUBREG_BYTE (*op);
+	  if (targetm.modes_tieable_p (vmode, GET_MODE (inner))
+	      && validate_subreg (vmode, GET_MODE (inner), inner, byte))
+	    *op = gen_lowpart (vmode, *op);
+	  else
+	    {
+	      tmp = gen_reg_rtx (GET_MODE (*op));
+	      emit_insn_before (gen_rtx_SET (tmp, *op), insn);
+	      *op = gen_rtx_SUBREG (vmode, tmp, 0);
+	    }
+	}
     }
 }
 
-- 
2.47.3