Alex Coplan [Mon, 13 Oct 2025 13:41:09 +0000 (13:41 +0000)]
aarch64, testsuite: Add -fchecking to test options [PR121772]
I noticed while testing a backport of the PR121772 fix to GCC 13 that
the test wasn't triggering the ICE as expected with the unpatched
compiler.
This turned out to be because the ICE is a checking ICE, and we
configure by default with --enable-checking=release on the branches.
Additionally, I hadn't noticed when doing the backports to 15 and 14
since there we still ICE later on in emit_move_insn even if we don't
catch the invalid gimple with checking.
I'm not too sure why the 13 branch doesn't see the emit_move_insn ICE,
but it's somewhat irrelevant - the important thing is that adding
-fchecking to the options makes the test fail as expected with an
unpatched compiler (i.e. with a gimple checking failure), even on
release branches.
I considered applying this patch to just the release branches, but
figured that trunk will at some point itself become a release branch, so
it seems to make most sense just to apply it everywhere.
I've checked that the test still passes with this patch, and still fails
if I revert the PR121772 fix.
gcc/testsuite/ChangeLog:
PR tree-optimization/121772
* gcc.target/aarch64/torture/pr121772.c: Add -fchecking to
dg-options.
Jason Merrill [Wed, 20 Aug 2025 03:15:20 +0000 (23:15 -0400)]
c++: pointer to auto member function [PR120757]
Here r13-1210 correctly changed &A<int>::foo to not be considered
type-dependent, but tsubst_expr of the OFFSET_REF got confused trying to
tsubst a type that involved auto. Fixed by getting the type from the
member rather than tsubst.
PR c++/120757
gcc/cp/ChangeLog:
* pt.cc (tsubst_expr) [OFFSET_REF]: Don't tsubst the type.
Jakub Jelinek [Thu, 9 Oct 2025 16:06:39 +0000 (18:06 +0200)]
gimplify: Fix up side-effect handling in 2nd __builtin_c[lt]zg argument [PR122188]
The patch from yesterday made me think about side-effects in the second
argument of __builtin_c[lt]zg. When we change
__builtin_c[lt]zg (x, y)
when y is not INTEGER_CST into
x ? __builtin_c[lt]zg (x) : y
with evaluating x only once, we omit the side-effects in y unless x is not
0. That looks undesirable, we should evaluate side-effects in y
unconditionally.
2025-10-09 Jakub Jelinek <jakub@redhat.com>
PR c/122188
* c-gimplify.cc (c_gimplify_expr): Also gimplify the second operand
before the COND_EXPR and use in COND_EXPR result of gimplification.
Jakub Jelinek [Wed, 8 Oct 2025 07:58:41 +0000 (09:58 +0200)]
gimplify: Fix up __builtin_c[lt]zg gimplification [PR122188]
The following testcase ICEs during gimplification.
The problem is that save_expr sometimes doesn't create a SAVE_EXPR but
returns the original complex tree (COND_EXPR) and the code then uses that
tree in 2 different spots without unsharing. As this is done during
gimplification it wasn't unshared when whole body is unshared and because
gimplification is destructive, the first time we gimplify it we destruct it
and second time we try to gimplify it we ICE on it.
Now, we could replace one a use with unshare_expr (a), but because this
is a gimplification hook, I think easier than trying to create a save_expr
is just gimplify the argument, then we know it is is_gimple_val and so
something without side-effects and can safely use it twice. That argument
would be the first thing to gimplify after return GS_OK anyway, so it
doesn't change argument sequencing etc.
2025-10-08 Jakub Jelinek <jakub@redhat.com>
PR c/122188
* c-gimplify.cc (c_gimplify_expr): Gimplify CALL_EXPR_ARG (*expr_p, 0)
instead of calling save_expr on it.
Jakub Jelinek [Mon, 6 Oct 2025 07:46:48 +0000 (09:46 +0200)]
stmt: Handle %cc[name] in resolve_asm_operand_names [PR122133]
Last year I've extended the asm template syntax in inline asm to support
%cc0 etc., apparently the first 2 letter generic operand modifier.
As the following testcase shows, I forgot to tweak the [foo] handling
for it though. As final.cc will error on any % ISALPHA not followed by
digit (with the exception of % c c digit), I think we can safely handle
this for any 2 letters in between % and [, instead of hardcoding it for
now only for %cc[ and changing it again next time we add something
two-letter.
2025-10-06 Jakub Jelinek <jakub@redhat.com>
PR middle-end/122133
* stmt.cc (resolve_asm_operand_names): Handle % and 2 letters followed
by open square.
Jakub Jelinek [Sat, 4 Oct 2025 15:06:16 +0000 (17:06 +0200)]
widening_mul: Reset flow sensitive info in maybe_optimize_guarding_check [PR122104]
In PR95852 I've added an optimization where next to just pattern
recognizing r = x * y; r / x != y or r = x * y; r / x == y
as .MUL_OVERFLOW or negation thereof it also recognizes
r = x * y; x && (r / x != y) or r = x * y; !x || (r / x == y)
by optimizing the guarding condition to always true/false.
The problem with that is that some value ranges recorded for
the SSA_NAMEs in the formerly conditional, now unconditional
basic block can be invalid.
This patch fixes it by calling reset_flow_sensitive_info_in_bb
if we optimize the guarding condition.
2025-10-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/122104
* tree-ssa-math-opts.cc (maybe_optimize_guarding_check): Call
reset_flow_sensitive_info_in_bb on bb when optimizing out the
guarding condition.
Avinash Jayakar [Mon, 13 Oct 2025 09:47:45 +0000 (15:17 +0530)]
match.pd: Do not canonicalize division by power 2 for {ROUND, CEIL}_DIV
Canonicalization of unsigned division by power of 2 only applies to
{TRUNC,FLOOR,EXACT}_DIV, therefore remove the same pattern for {CEIL,ROUND}_DIV,
which was added in a previous commit.
Robin Dapp [Tue, 7 Oct 2025 13:18:27 +0000 (07:18 -0600)]
[PATCH] RISC-V: Detect wrap in shuffle_series_pattern [PR121845].
Hi,
In shuffle_series_pattern we use series_p to determine if the permute
mask is a simple series. This didn't take into account that series_p
also returns true for e.g. {0, 3, 2, 1} where the step is 3 and the
indices form a series modulo 4.
We emit
vid + vmul
in order to synthesize a series. In order to be always correct we would
need a vrem afterwards still which does not seem worth it.
This patch adds the modulo for VLA permutes and punts if we wrap around
for VLS permutes. I'm not really certain whether we'll really see a wrapping
VLA series (certainly we haven't so far in the test suite) but as we observed
a VLS one here now it appears conservatively correct to module the indices.
Regtested on rv64gcv_zvl512b.
Regards
Robin
PR target/121845
gcc/ChangeLog:
* config/riscv/riscv-v.cc (shuffle_series_patterns):
Modulo indices for VLA and punt when wrapping for VLS.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121845.c: New test.
Aurelien Jarno [Thu, 2 Oct 2025 15:05:34 +0000 (09:05 -0600)]
[PATCH v2] RISC-V: fix __builtin_round NaN handling [PR target/121652]
__builtin_round() fails to correctly generate invalid exceptions for NaN
inputs when -ftrapping-math is used (which is the default). According to
the specification, an invalid exception should be raised for sNaN, but
not for qNaN.
Commit f12a27216952 ("RISC-V: fix __builtin_round clobbering FP...")
attempted to avoid raising an invalid exception for qNaN by saving and
restoring the FP exception flags. However this inadvertently suppressed
the invalid exception for sNaN as well.
Instead of saving/restoring fflags, this patch uses the same approach
than the well tested GLIBC round implementation. When flag_trapping_math
is enabled, it first checks whether the input is a NaN using feq.s/d. In
that case it adds the input value with itself to possibly convert sNaN
into qNaN. With this change, the glibc testsuite passes again.
The generated code with -ftrapping-math now looks like:
AVR: target/122187 - Don't clobber recog_data.operand[] in insn out.
avr.cc::avr_out_extr() and avr.cc::avr_out_extr_not()
changed xop for output, which spoiled the operand for
the next invokation, running into an assertion.
This patch makes a local copy of the operands.
PR target/122187
gcc/
* config/avr/avr.cc (avr_out_extr, avr_out_extr_not):
Make a local copy of the passed rtx[] operands.
gcc/testsuite/
* gcc.target/avr/torture/pr122187.c: New test.
Robin Dapp [Thu, 4 Sep 2025 08:16:21 +0000 (10:16 +0200)]
RISC-V: Use correct target in expand_vec_perm [PR121780].
This fixes a glaring mistake in yesterday's change to the expansion of
vec_perm. We should of course move tmp_target into the real target
and not the other way around. I wonder why my testing hasn't
caught this...
Robin Dapp [Mon, 1 Sep 2025 09:41:34 +0000 (11:41 +0200)]
RISC-V: Handle overlap in expand_vec_perm PR121742.
In a two-source gather we unconditionally overwrite target with the
first gather's result already. If op1 == target this clobbers the
source operand for the second gather. This patch uses a temporary in
that case.
PR target/121742
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_vec_perm): Use temporary if
op1 and target overlap.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121742.c: New test.
Kito Cheng [Thu, 10 Jul 2025 07:28:30 +0000 (15:28 +0800)]
RISC-V: Always register vector built-in functions during LTO [PR110812]
Previously, vector built-in functions were not properly registered during
the LTO pipeline, causing link failures when vector intrinsics were used
in LTO builds with mixed architecture options. This patch ensures all
vector built-in functions are always registered during LTO compilation.
The key changes include:
- Moving pragma intrinsic flag manipulation from riscv-c.cc to
riscv-vector-builtins.cc for better encapsulation
- Registering all vector built-in functions regardless of current ISA
extensions, deferring the actual extension checking to expansion time
- Adding proper support for built-in type registration during LTO
This approach is safe because we already perform extension requirement
checking at expansion time. The trade-off is a slight increase in
bootstrap time for LTO builds due to registering more built-in functions.
PR target/110812
gcc/ChangeLog:
* config/riscv/riscv-c.cc (pragma_intrinsic_flags): Remove struct.
(riscv_pragma_intrinsic_flags_pollute): Remove function.
(riscv_pragma_intrinsic_flags_restore): Remove function.
(riscv_pragma_intrinsic): Simplify to only call handle_pragma_vector.
* config/riscv/riscv-vector-builtins.cc (pragma_intrinsic_flags):
Move struct definition here from riscv-c.cc.
(riscv_pragma_intrinsic_flags_pollute): Move and adapt from
riscv-c.cc, add zvfbfmin, zvfhmin and vector_elen_bf_16 support.
(riscv_pragma_intrinsic_flags_restore): Move from riscv-c.cc.
(rvv_switcher::rvv_switcher): Add pollute_flags parameter to
control flag manipulation.
(rvv_switcher::~rvv_switcher): Restore flags conditionally.
(register_builtin_types): Use rvv_switcher without polluting flags.
(get_required_extensions): Remove function.
(check_required_extensions): Simplify to only check type validity.
(function_instance::function_returns_void_p): Move implementation
from header.
(function_builder::add_function): Register placeholder for LTO.
(init_builtins): Simplify and handle LTO case.
(reinit_builtins): Remove function.
(handle_pragma_vector): Remove extension checking.
* config/riscv/riscv-vector-builtins.h
(function_instance::function_returns_void_p): Add declaration.
(function_call_info::function_returns_void_p): Remove inline
implementation.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/lto/pr110812_0.c: New test.
* gcc.target/riscv/lto/pr110812_1.c: New test.
* gcc.target/riscv/lto/riscv-lto.exp: New test driver.
* gcc.target/riscv/lto/riscv_vector.h: New header wrapper.
Eric Botcazou [Tue, 30 Sep 2025 09:55:18 +0000 (11:55 +0200)]
Ada: Fix internal error on ill-formed Reduce attribute in Ada 2022
This is an internal error on the new Reduce attribute of Ada 2022 when the
programmer swaps its arguments(!) The change makes it so that the compiler
gives an error message instead.
gcc/ada/
PR ada/117517
* sem_attr.adb (Resolve_Attribute) <Attribute_Reduce>: Try to
resolve the reducer first. Fix casing of error message.
Harald Anlauf [Sat, 20 Sep 2025 20:20:25 +0000 (22:20 +0200)]
Fortran: fix issues with rank-2 deferred-length character arrays [PR108581]
PR fortran/108581
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_conv_expr_descriptor): Take the dynamic
string length into account when deriving the dataptr offset for
a deferred-length character array.
gcc/testsuite/ChangeLog:
* gfortran.dg/deferred_character_39.f90: New test.
Because LoongArch does not implement TARGET_CAN_INLINE_P,
functions with the target attribute set and those without
it cannot be inlined. At the same time, setting the
always_inline attribute will cause compilation failure.
To solve this problem, I implemented this hook. During the
implementation process, it checks the status of the target
special options of the caller and callee, such as the ISA
extension.
PR target/121875
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_can_inline_p): New function.
(TARGET_CAN_INLINE_P): Define.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/can_inline_1.c: New test.
* gcc.target/loongarch/can_inline_2.c: New test.
* gcc.target/loongarch/can_inline_3.c: New test.
* gcc.target/loongarch/can_inline_4.c: New test.
* gcc.target/loongarch/can_inline_5.c: New test.
* gcc.target/loongarch/can_inline_6.c: New test.
* gcc.target/loongarch/pr121875.c: New test.
Alex Coplan [Tue, 9 Sep 2025 11:57:14 +0000 (12:57 +0100)]
match.pd: Add missing type check to reduc(ctor) pattern [PR121772]
In this PR we have a reduction of a vector constructor, where the
type of the constructor is int16x8_t and the elements are int16x4_t;
i.e. it is representing a concatenation of two vectors.
This triggers a match.pd pattern which looks like it was written to
handle reductions of vector constructors where the elements of the ctor
are scalars, not vectors. There is no type check to enforce this
property, which leads to the pattern replacing a reduction to scalar
with an int16x4_t vector in this case, which of course is a type error,
leading to an invalid GIMPLE ICE.
This patch adds a type check to the pattern, only going ahead with the
transformation if the element type of the ctor matches that of the
reduction.
gcc/ChangeLog:
PR tree-optimization/121772
* match.pd: Add type check to reduc(ctor) pattern.
gcc/testsuite/ChangeLog:
PR tree-optimization/121772
* gcc.target/aarch64/torture/pr121772.c: New test.
This assertion, despite what I said in r16-4070, is not valid: we can
reach here when deduping a VAR_DECL that didn't get a LANG_SPECIFIC in
the current TU. It's still correct to always use lang_cplusplus however
as for anything else the decl would have been created with an
appropriate LANG_SPECIFIC to start with.
OpenMP: Unshare expr in context-selector condition [PR121922]
As the testcase shows, a missing unshare_expr caused that the condition
was only evaluated once instead of every time when a 'declare variant'
was resolved.
PR middle-end/121922
gcc/ChangeLog:
* omp-general.cc (omp_dynamic_cond): Use 'unshare_expr' for
the user condition.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/declare-variant-1.c: New test.
Patrick Palka [Wed, 24 Sep 2025 02:41:26 +0000 (22:41 -0400)]
libstdc++/testsuite: Unpoison 'u' on s390x in names.cc test
This is the s390 counterpart to r11-7364-gd0453cf5c68b6a, and fixes the
following names.cc failure caused by a use of a poisoned identifier.
If we look at the corresponding upstream header[1] it's clear that the
problematic identifier is 'u'.
In file included from /usr/include/linux/types.h:5,
from /usr/include/linux/sched/types.h:5,
from /usr/include/bits/sched.h:61,
from /usr/include/sched.h:43,
from /usr/include/pthread.h:22,
from /usr/include/c++/14/s390x-redhat-linux/bits/gthr-default.h:35,
from /usr/include/c++/14/s390x-redhat-linux/bits/gthr.h:157,
from /usr/include/c++/14/ext/atomicity.h:35,
from /usr/include/c++/14/bits/ios_base.h:39,
from /usr/include/c++/14/streambuf:43,
from /usr/include/c++/14/bits/streambuf_iterator.h:35,
from /usr/include/c++/14/iterator:66,
from /usr/include/c++/14/s390x-redhat-linux/bits/stdc++.h:54,
from /root/rpmbuild/BUILD/gcc-14.3.1-20250617/libstdc++-v3/testsuite/17_intro/names.cc:384:
/usr/include/asm/types.h:24: error: expected unqualified-id before '[' token
/usr/include/asm/types.h:24: error: expected ')' before '[' token
/root/rpmbuild/BUILD/gcc-14.3.1-20250617/libstdc++-v3/testsuite/17_intro/names.cc:101: note: to match this '('
compiler exited with status 1
FAIL: 17_intro/names.cc -std=gnu++98 (test for excess errors)
Excess errors:
/usr/include/asm/types.h:24: error: expected unqualified-id before '[' token
/usr/include/asm/types.h:24: error: expected ')' before '[' token
Patrick Palka [Sat, 20 Sep 2025 14:45:22 +0000 (10:45 -0400)]
c++: find_template_parameters and NTTPs [PR121981]
Here the normal form of the two immediately-declared D<<placeholder>, V>
constraints is the same, so we rightfully share the normal form between
them. We first compute the normal form from the context of auto deduction
for W in which case the placeholder has level 2 where the set of
in-scope template parameters has depth 2 (a dummy level is added from
normalize_placeholder_type_constraints).
Naturally the atomic constraint only depends on the template parameter
V of depth 1 index 0. The depth 2 of current_template_parms however
means that find_template_parameters when it sees V within the atomic
constraint will recurse into its TREE_TYPE, an auto of level 2, and mark
the atomic constraint as also depending on the template parameter of
depth 2 index 0, which is clearly wrong. Later during constraint
checking for B we ICE within the satisfaction cache since we lack two
levels of template arguments supposedly needed by the cached atomic
constraint.
I think when find_template_parameters sees an NTTP, it doesn't need to
walk its TREE_TYPE because NTTP substitution is done obliviously with
respect to its type -- only the corresponding NTTP argument matters,
not other template arguments possibly used within its type. This is
most clearly true for (unconstrained) auto NTTPs as in the testcase, but
also true for other NTTPs. Doing so fixes the testcase because we no
longer record any depth 2 when walking V within the atomic constraint.
PR c++/121981
gcc/cp/ChangeLog:
* pt.cc (any_template_parm_r) <case TEMPLATE_TYPE_PARM>:
Don't walk TREE_TYPE.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-placeholder15.C: New test.
Jakub Jelinek [Thu, 18 Sep 2025 14:41:32 +0000 (16:41 +0200)]
openmp: Fix up ICE in lower_omp_regimplify_operands_p [PR121977]
The following testcase ICEs in functions called from
lower_omp_regimplify_operands_p, because maybe_lookup_decl returns
NULL for this (on the outer taskloop context) when regimplifying the
taskloop pre body. If it isn't found in current context, we should
look in outer ones.
2025-09-18 Jakub Jelinek <jakub@redhat.com>
PR c++/121977
* omp-low.cc (lower_omp_regimplify_operands_p): If maybe_lookup_decl
returns NULL, use maybe_lookup_decl_in_outer_ctx as fallback.
Jakub Jelinek [Tue, 16 Sep 2025 17:25:58 +0000 (19:25 +0200)]
docs: Adjust -Wimplicit-fallthrough= documentation for C23
I've noticed in -Wimplicit-fallthrough= documentation we talk about
[[fallthrough]]; for C++17 but don't mention that it is also standard
way to suppress the warning for C23.
2025-09-16 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (Wimplicit-fallthrough=): Document that also C23
provides a standard way to suppress the warning with [[fallthrough]];.
Jakub Jelinek [Wed, 10 Sep 2025 10:39:11 +0000 (12:39 +0200)]
testsuite: Only scan for known file extensions in lto.exp
This is something that has bothered me for a few years but I've only found
time for it now.
The glob used for finding *_1.* etc. counterparts to the *_0.* tests is too
broad, so if one has say next to *_1.c file also *_1.c~ or *_1.c.~1~
or *_1.c.orig or *_1.c.bak etc. files, lto.exp will report a warning and the
test will fail.
So, e.g. in rpm build if some backported commit in patch form adds some
gcc/testsuite/*.dg/lto/ test and one uses -b option to patch, if one doesn't
remove the backup files, the test will fail.
Looking through all the *.dg/lto/ directories, I only see c, C, ii, f, f90
and d extensions used right now for the *_1.* files (and higher), while for
the *_0.* files also m, mm and f03 extensions are used.
So, the following patch only searches for those (plus for Fortran uses the
extensions searched by the gfortran.dg/lto/ driver, i.e. \[fF\]{,90,95,03,08}
, not just f, f90 and f03).
Tested on x86_64-linux and verified I got exactly the same number of
grep '^Executing.on.host.*_[1-9]\.' testsuite/*/*.log
before/after this patch when doing make check RUNTESTFLAGS=lto.exp
except for 2 new ones which were previously failed because I had backup
files for 2 tests.
2025-09-10 Jakub Jelinek <jakub@redhat.com>
* lib/lto.exp (lto-execute-1): Search for _1.* etc. files
only with a list of known extensions.
Jakub Jelinek [Wed, 10 Sep 2025 10:34:50 +0000 (12:34 +0200)]
bitint: Fix up lowering optimization of .*_OVERFLOW ifns [PR121828]
THe lowering of .{ADD,SUB,MUL}_OVERFLOW ifns is optimized, so that we don't
in the common cases uselessly don't create a large _Complex _BitInt
temporary with the first (real) part being the result and second (imag) part
just being a huge 0 or 1, although we still do that if it can't be done.
The optimizable_arith_overflow function checks when that is possible, like
whether the ifn result is used at most twice, once in REALPART_EXPR and once
in IMAGPART_EXPR in the same bb, etc. For IMAGPART_EXPR it then checks
if it has a single use which is a cast to some integral non-bitint type
(usually bool or int etc.). The final check is whether that cast stmt
appears after the REALPART_EXPR (the usual case), in that case it is
optimizable, otherwise it is not (because the lowering for optimizable
ifns of this kind is done at the location of the REALPART_EXPR and it
tweaks the IMAGPART_EXPR cast location at that point, so otherwise it
would be set after use.
Now, we also have an optimization for the REALPART_EXPR lhs being used
in a single stmt - store in the same bb, in that case we don't have to
store the real part result in a temporary but it can go directly into
memory.
Except that nothing checks for the IMAGPART_EXPR cast being before or after
the store in this case, so the following testcase ICEs because we have
a use before a def stmt.
In bar (the function handled right already before this patch) we have
_6 = .SUB_OVERFLOW (y_4(D), x_5(D));
_1 = REALPART_EXPR <_6>;
_2 = IMAGPART_EXPR <_6>;
a = _1;
_3 = (int) _2;
baz (_3);
before the lowering, so we can just store the limbs of the .SUB_OVERFLOW
into the limbs of a variable and while doing that compute the value we
eventually store into _3 instead of the former a = _1; stmt.
In foo we have
_5 = .SUB_OVERFLOW (y_3(D), x_4(D));
_1 = REALPART_EXPR <_5>;
_2 = IMAGPART_EXPR <_5>;
t_6 = (int) _2;
baz (t_6);
a = _1;
and we can't do that because the lowering would be at the a = _1; stmt
and would try to set t_6 to the overflow flag at that point. We don't
need to punt completely and mark _5 as _Complex _BitInt VAR_DECL though
in this case, all we need is not merge the a = _1; store with the
.SUB_OVERFLOW and REALPART_EXPR/IMAGPART_EXPR lowering. So, add _1
to m_names and lower the first 3 stmts at the _1 = REALPART_EXPR <_5>;
location, optimizable_arith_overflow returned non-zero and so the
cast after IMAGPART_EXPR was after it and then a = _1; will copy from
the temporary VAR_DECL to memory.
2025-09-10 Jakub Jelinek <jakub@redhat.com>
PR middle-end/121828
* gimple-lower-bitint.cc (gimple_lower_bitint): For REALPART_EXPR
consumed by store in the same bb and with REALPART_EXPR from
optimizable_arith_overflow, don't add REALPART_EXPR lhs to
the m_names bitmap only if the cast from IMAGPART_EXPR doesn't
appear in between the REALPART_EXPR and the store.
Jakub Jelinek [Wed, 10 Sep 2025 10:33:14 +0000 (12:33 +0200)]
expr: Handle RAW_DATA_CST in store_constructor [PR121831]
I thought this wouldn't be necessary because RAW_DATA_CST can only appear
inside of (array) CONSTRUCTORs within DECL_INITIAL of TREE_STATIC vars,
so there shouldn't be a need to expand it. Except that we have an
optimization when reading ARRAY_REF from such a CONSTRUCTOR which will
try to expand the constructor if it either can be stored by pieces
(I think that is just fine) or if it is mostly zeros (which is at least
75% of the initializer zeros). Now the second case is I think in some
cases desirable (say 256MB initializer and just 20 elements out of that
non-zero, so clear everything and store 20 elements must be fastest and
short), but could be really bad as well (say 40GB initializer with
10GB non-zero in it, especially if it doesn't result in the original
variable being optimized away). Maybe it would help if expand_constructor
and store_constructor* etc. had some optional argument with addresses into
the original VAR_DECL so that it could be copying larger amounts of data
like larger RAW_DATA_CSTs from there instead of pushing those into new
.rodata again. And another problem is that we apparently expand the
initializes twice, expand_constructor in store_constructor can expand
the stores one and if expand_constructor returns non-NULL, we then
expand_expr the CONSTRUCTOR again. to the same location.
This patch doesn't address either of those issues, just adds RAW_DATA_CST
support to store_constructor for now. For the can_store_by_pieces
cases it stores those by pieces using a new callback very similar to
string_cst_read_str, for the rest (unfortunately) forces it into a memory
and copies from there.
Jakub Jelinek [Mon, 8 Sep 2025 09:49:58 +0000 (11:49 +0200)]
libstdc++: Fix up <ext/pointer.h> [PR121827]
During the tests mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692482.html
(but dunno why I haven't noticed it back in August but only when testing
https://gcc.gnu.org/pipermail/gcc-patches/2025-September/694527.html )
I've noticed two ext header problems.
One is that #include <ext/pointer.h> got broken with the r13-3037-g18f176d0b25591e28 change and since then is no longer
self-contained, as it includes iosfwd only if _GLIBCXX_HOSTED is defined
but doesn't actually include bits/c++config.h to make sure it is defined,
then includes a bunch of headers which do include bits/c++config.h and
finally uses in #if _GLIBCXX_HOSTED guarded code what is declared in iosfwd.
The other problem is that ext/cast.h is also not a self-contained header,
but that one has
/** @file ext/cast.h
* This is an internal header file, included by other library headers.
* Do not attempt to use it directly. @headername{ext/pointer.h}
*/
comment, so I think we just shouldn't include it in extc++.h and let
ext/pointer.h include it.
2025-09-08 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/121827
* include/precompiled/extc++.h: Don't include ext/cast.h which is an
internal header.
* include/ext/pointer.h: Include bits/c++config.h before
#if _GLIBCXX_HOSTED.
Jakub Jelinek [Fri, 5 Sep 2025 08:59:42 +0000 (10:59 +0200)]
testsuite, powerpc, v2: Fix vsx-vectorize-* after alignment peeling [PR118567]
On Tue, Jul 01, 2025 at 02:50:40PM -0500, Segher Boessenkool wrote:
> No tests become good tests without effort. And tests that are not good
> tests require constant maintenance!
Here are two patches, either just the first one or both can be used
and both were tested on powerpc64le-linux.
The second one adds further 8 tests, which are dg-do run which #include
the former tests, don't do any dump tests and just define the checking/main
for those.
2025-09-05 Jakub Jelinek <jakub@redhat.com>
PR testsuite/118567
* gcc.target/powerpc/vsx-vectorize-9.c: New test.
* gcc.target/powerpc/vsx-vectorize-10.c: New test.
* gcc.target/powerpc/vsx-vectorize-11.c: New test.
* gcc.target/powerpc/vsx-vectorize-12.c: New test.
* gcc.target/powerpc/vsx-vectorize-13.c: New test.
* gcc.target/powerpc/vsx-vectorize-14.c: New test.
* gcc.target/powerpc/vsx-vectorize-15.c: New test.
* gcc.target/powerpc/vsx-vectorize-16.c: New test.
Jakub Jelinek [Fri, 5 Sep 2025 08:54:53 +0000 (10:54 +0200)]
testsuite, powerpc, v2: Fix vsx-vectorize-* after alignment peeling [PR118567]
On Tue, Jul 01, 2025 at 02:50:40PM -0500, Segher Boessenkool wrote:
> No tests become good tests without effort. And tests that are not good
> tests require constant maintenance!
Here are two patches, either just the first one or both can be used
and both were tested on powerpc64le-linux.
The first one removes all the checking etc. stuff from the testcases,
as they are just dg-do compile, for the vectorize dump checks all we
care about are the vectorized loops they want to test.
2025-09-05 Jakub Jelinek <jakub@redhat.com>
PR testsuite/118567
* gcc.target/powerpc/vsx-vectorize-1.c: Remove includes, checking
part of main1 and main.
* gcc.target/powerpc/vsx-vectorize-2.c: Remove includes, replace
bar definition with declaration, remove main.
* gcc.target/powerpc/vsx-vectorize-3.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-4.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-5.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-6.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-7.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-8.c: Likewise.
Jakub Jelinek [Mon, 25 Aug 2025 22:28:10 +0000 (00:28 +0200)]
omp-expand: Initialize fd->loop.n2 if needed for the zero iter case [PR121453]
When expand_omp_for_init_counts is called from expand_omp_for_generic,
zero_iter1_bb is NULL and the code always creates a new bb in which it
clears fd->loop.n2 var (if it is a var), because it can dominate code
with lastprivate guards that use the var.
When called from other places, zero_iter1_bb is non-NULL and so we don't
insert the clearing (and can't, because the same bb is used also for the
non-zero iterations exit and in that case we need to preserve the iteration
count). Clearing is also not necessary when e.g. outermost collapsed
loop has constant non-zero number of iterations, in that case we initialize the
var to something already earlier. The following patch makes sure to clear
it if it hasn't been initialized yet before the first check for zero iterations.
2025-08-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/121453
* omp-expand.cc (expand_omp_for_init_counts): Clear fd->loop.n2
before first zero count check if zero_iter1_bb is non-NULL upon
entry and fd->loop.n2 has not been written yet.
Jakub Jelinek [Thu, 14 Aug 2025 20:30:45 +0000 (22:30 +0200)]
c++: Fix up build_cplus_array_type [PR121524]
The following testcase is miscompiled since my r15-3046 change
to properly apply std attributes after closing ] for arrays to the
array type.
Array type is not a class type, so when cplus_decl_attribute is
called on the ARRAY_TYPE, it doesn't do ATTR_FLAG_TYPE_IN_PLACE.
Though, for alignas/gnu::aligned/deprecated/gnu::unavailable/gnu::unused
attributes the handlers of those attributes for non-ATTR_FLAG_TYPE_IN_PLACE
on types call build_variant_type_copy and modify some flags on the new
variant type. They also usually don't clear *no_add_attrs, so the caller
then checks if the attributes are present on the new type and if not, calls
build_type_attribute_variant.
On the following testcase, it results in the B::foo type to be properly
32 byte aligned.
The problem happens later when we build_cplus_array_type for C::a.
elt_type is T (typedef, or using works likewise), we get as m
main variant type with unsigned int element type but because elt_type
is different, build_cplus_array_type searches the TYPE_NEXT_VARIANT chain
to find if there isn't already a useful ARRAY_TYPE to reuse.
It checks for NULL TYPE_NAME, NULL TYPE_ATTRIBUTES and the right TREE_TYPE.
Unfortunately this is not good enough, build_variant_type_copy above created
a variant type on which it modified TYPE_USER_ALIGN and TYPE_ALIGN, but
TYPE_ATTRIBUTES is still NULL, only the build_type_attribute_variant call
later adds attributes.
The problem is that the intermediate type is found in the TYPE_NEXT_VARIANT
chain and reused.
The following patch adds conditions to prevent problems with the affected
attributes (except gnu::unused, I think whether TREE_USED is set or not
shouldn't prevent sharing). In particular, if TYPE_USER_ALIGN is not
set on the variant, it wasn't user realigned, if it is set, it verifies
it has it set because the elt_type has been user aligned and TYPE_ALIGN
is the expected one. For deprecated it punts on the flag being set and
for gnu::unavailable as well.
2025-08-14 Jakub Jelinek <jakub@redhat.com>
PR c++/121524
* tree.cc (build_cplus_array_type): Don't reuse variant type
if it has TREE_DEPRECATED or TREE_UNAVAILABLE flags set or,
unless elt_type has TYPE_USER_ALIGN set and TYPE_ALIGN is
TYPE_ALIGN of elt_type, TYPE_USER_ALIGN is not set.
Richard Biener [Mon, 22 Sep 2025 08:14:31 +0000 (10:14 +0200)]
tree-optimization/122016 - PRE insertion breaks abnormal coalescing
When PRE asks VN to simplify a NARY but not insert, that bypasses
the abnormal guard in maybe_push_res_to_seq and we blindly accept
new uses of abnormals. The following fixes this.
PR tree-optimization/122016
* tree-ssa-sccvn.cc (vn_nary_simplify): Do not use the
simplified expression when it references abnormals.
Richard Biener [Mon, 8 Sep 2025 12:32:38 +0000 (14:32 +0200)]
tree-optimization/121844 - IVOPTs and asm goto in latch
When there's an asm goto in the latch of a loop we may not use
IP_END IVs since instantiating those would (need to) split the
latch edge which in turn invalidates IP_NORMAL position handling.
This is a revision of the PR107997 fix.
PR tree-optimization/107997
PR tree-optimization/121844
* tree-ssa-loop-ivopts.cc (allow_ip_end_pos_p): Do not allow
IP_END for latches ending with a control stmt.
(create_new_iv): Do not split the latch edge, instead assert
that's not necessary.
Richard Biener [Tue, 26 Aug 2025 08:34:01 +0000 (10:34 +0200)]
tree-optimization/121659 - bogus swap of reduction operands
The following addresses a bogus swapping of SLP operands of a
reduction operation which gets STMT_VINFO_REDUC_IDX out of sync
with the SLP operand order. In fact the most obvious mistake is
that we simply swap operands even on the first stmt even when
there's no difference in the comparison operators (for == and !=
at least). But there are more latent issues that I noticed and
fixed up in the process.
PR tree-optimization/121659
* tree-vect-slp.cc (vect_build_slp_tree_1): Do not allow
matching up comparison operators by swapping if that would
disturb STMT_VINFO_REDUC_IDX. Make sure to only
actually mark operands for swapping when there was a
mismatch and we're not processing the first stmt.
Richard Biener [Mon, 18 Aug 2025 11:38:37 +0000 (13:38 +0200)]
tree-optimization/121527 - wrong SRA with aggregate copy
SRA handles outermost VIEW_CONVERT_EXPRs but it wrongly ignores
those when building an access which leads to the wrong size
used when the VIEW_CONVERT_EXPR does not have the same size as
its operand which is valid GENERIC and is used by Ada upcasting.
PR tree-optimization/121527
* tree-sra.cc (build_access_from_expr_1): Do not strip an
outer VIEW_CONVERT_EXPR as it's relevant for the size of
the access.
(get_access_for_expr): Likewise.
Richard Biener [Tue, 5 Aug 2025 06:59:18 +0000 (08:59 +0200)]
tree-optimization/121370 - avoid UB in building a CHREC
When there is obvious UB involved in the process of re-associating
a series of IV increments to build up a CHREC, fail. This catches
a few degenerate cases where SCEV introduces UB with its inherent
re-associating of IV increments.
c++/modules: Fix language linkage handling [PR122019]
The ICE in the linked PR is caused because when current_lang_name is
lang_name_c, set_decl_linkage calls decl_linkage on the retrofitted
declaration. This is problematic because at this point we haven't
finished streaming the declaration, and so we crash when attempting to
access these missing bits (such as the type).
The only declarations we can reach here will be things like types that
don't get a language linkage anyway, so it seems reasonable to just
hardcode a C++ language linkage here to work around the issue.
An alternative fix would be to override current_lang_name in
lazy_load_binding instead, but this is potentially confusing if we ever
deliberately implement `extern "C" import "header_unit.hpp";` to
override the language linkage of imported declarations, so I went with
this approach instead. (Though it seems we never will do this.)
While testing this I found that we don't currently complain about
mismatching language linkages for variables, and module_may_redeclare
doesn't cope well with implementation units, so this patch also fixes
those issues.
PR c++/122019
gcc/cp/ChangeLog:
* module.cc (trees_in::install_entity): Don't be affected by
global language linkage state.
(trees_in::is_matching_decl): Check mismatching language linkage
for variables too.
(module_may_redeclare): Report the correct module name for
partitions and implementation units.
gcc/testsuite/ChangeLog:
* g++.dg/modules/lang-4_a.C: New test.
* g++.dg/modules/lang-4_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit fe2f86a960435a7c8c4e5134a0dfc64dd6062157)
c++: Fix canonical type for lambda pack captures [PR122015]
comp_template_parms_position uses whether a TEMPLATE_TYPE_PARM is a pack
to determine equivalency. This in turn affects whether
canonical_type_parameter finds a pre-existing auto type as equivalent.
When generating the 'auto...' type for a lambda pack capture, we only
mark it as a pack after generating the node (and calculating its
canonical); this means that later when comparing a version streamed in
from a module we think that two equivalent types have different
TYPE_CANONICAL, because the latter already had
TEMPLATE_PARM_PARAMETER_PACK set before calculating its canonical.
This patch fixes this by using a new 'make_auto_pack' function to ensure
that packness is set before the canonical is looked up.
PR c++/122015
gcc/cp/ChangeLog:
* cp-tree.h (make_auto_pack): Declare.
* lambda.cc (lambda_capture_field_type): Use make_auto_pack to
ensure TYPE_CANONICAL is set correctly.
* pt.cc (make_auto_pack): New function.
gcc/testsuite/ChangeLog:
* g++.dg/modules/lambda-11.h: New test.
* g++.dg/modules/lambda-11_a.H: New test.
* g++.dg/modules/lambda-11_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Patrick Palka <ppalka@redhat.com>
(cherry picked from commit cc79849cc883146964f0001f33c8b7eb576825c4)
Eric Botcazou [Mon, 22 Sep 2025 09:08:34 +0000 (11:08 +0200)]
Ada: Fix internal error on use clause present in generic formal part
This is a regression present on the mainline and 15 branch: the compiler
aborts on a use clause present in the formal part of a generic unit because
of an oversight in the new inference code for generic actual parameters.
The fix also adds a missing test to Analyze_Dimension_Array_Aggregate.
gcc/ada/
PR ada/121968
* sem_ch12.adb (Associations.Find_Assoc): Add guard for clauses.
* sem_dim.adb (Analyze_Dimension_Array_Aggregate): Add test for
N_Iterated_Component_Association nodes.
Atomic support enhanced to fix existing atomic_compare_and_swapsi pattern
to handle side effects; new patterns atomic_fetch_op and atomic_test_and_set
added. As MicroBlaze has no QImode test/set instruction, use shift magic
to implement atomic_test_and_set.
Patrick Palka [Wed, 17 Sep 2025 00:59:10 +0000 (20:59 -0400)]
libstdc++: Explicitly pass -Wsystem-headers in tests that need it
When running libstdc++ tests using an installed gcc (as opposed to an
in-tree gcc), we naturally use system stdlib headers instead of the
in-tree headers. But warnings from within system headers are suppressed
by default, so tests that check for such warnings spuriously fail in such
a setup. This patch makes us compile such tests with -Wsystem-headers so
that they consistently pass.
When none of mprefer-vector-width, avx256_optimal/avx128_optimal,
avx256_store_by_pieces/avx512_store_by_pieces is specified, GCC will
set ix86_{move_max,store_max} as max available vector length except
for AVX part.
So for -mavx2, vectorizer will choose 256-bit for vectorization, but
128-bit is used for struct copy, there could be a potential STLF issue
due to this "misalign".
Jeff Law [Fri, 12 Sep 2025 22:08:38 +0000 (16:08 -0600)]
Fix latent LRA bug
Shreya's work to add the addptr pattern on the RISC-V port exposed a latent bug
in LRA.
We lazily allocate/reallocate the ira_reg_equiv structure and when we do
(re)allocation we'll over-allocate and zero-fill so that we don't have to
actually allocate and relocate the data so often.
In the case exposed by Shreya's work we had N requested entries at the last
rellocation step. We actually allocate N+M entries. During LRA we allocate
enough new pseudos and thus have N+M+1 pseudos.
In get_equiv we read ira_reg_equiv[regno] without bounds checking so we read
past the allocated part of the array and get back junk which we use and
depending on the precise contents we fault in various fun and interesting ways.
We could either arrange to re-allocate ira_reg_equiv again on some path through
LRA (possibly in get_equiv itself). We could also just insert the bounds check
in get_equiv like is done elsewhere in LRA. Vlad indicated no strong
preference in an email last week.
So this just adds the bounds check in a manner similar to what's done elsewhere
in LRA. Bootstrapped and regression tested on x86_64 as well as RISC-V with
Shreya's work enabled and regtested across the various embedded targets.
gcc/
* lra-constraints.cc (get_equiv): Bounds check before accessing
data in ira_reg_equiv.
Jennifer Schmitz [Thu, 28 Aug 2025 10:10:27 +0000 (03:10 -0700)]
aarch64: Force vector in SVE gimple_folder::fold_active_lanes_to.
An ICE was reported in the following test case:
svint8_t foo(svbool_t pg, int8_t op2) {
return svmul_n_s8_z(pg, svdup_s8(1), op2);
}
with a type mismatch in ‘vec_cond_expr’:
_4 = VEC_COND_EXPR <v16_2(D), v32_3(D), { 0, ... }>;
The reason is that svmul_impl::fold folds calls where one of the operands
is all ones to the other operand using
gimple_folder::fold_active_lanes_to. However, we implicitly assumed
that the argument that is passed to fold_active_lanes_to is a vector
type. In the given test case op2 is a scalar type, resulting in the type
mismatch in the vec_cond_expr.
This patch fixes the ICE by forcing a vector type of the argument
in fold_active_lanes_to before the statement with the vec_cond_expr.
In the initial version of this patch, the force_vector statement was placed in
svmul_impl::fold, but it was moved to fold_active_lanes_to to align it with
fold_const_binary which takes care of the fixup from scalar to vector
type using vector_const_binop.
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for trunk?
OK to backport to GCC 15?
Eric Botcazou [Fri, 22 Aug 2025 12:51:58 +0000 (14:51 +0200)]
ada: Fix internal error on aspect in complex object declaration
The sufficient conditions are that the aspect be deferred and the object be
rewritten as a renaming because of the complex initialization expression.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (gnat_to_gnu)
<N_Object_Renaming_Declaration>: Deal with objects whose elaboration
is deferred.
(process_freeze_entity): Deal with renamed objects whose elaboration
is deferred.
Eric Botcazou [Wed, 3 Sep 2025 07:17:39 +0000 (09:17 +0200)]
ada: Fix wrong finalization of aliased array of bounded vector
The problem is that Apply_Discriminant_Check introduces an unnecessary
temporary for an assignment where both sides have the same constrained
subtype but the left-hand side is an aliased component.
This comes from an approximation in the implementation introduced long
time ago to deal with aliased unconstrained objects in Ada 95, more
specifically to still generate a check when both sides have the same
unconstrained subtype in this case; it is replaced by an explicit test
that the common subtype is constrained.
gcc/ada/ChangeLog:
* checks.adb (Apply_Discriminant_Check): Remove undocumented test
on Is_Aliased_View applied to the left-hand side to skip the check
in the case where the subtypes are the same, and replace it with a
test that the subtypes are constrained.
Eric Botcazou [Fri, 29 Aug 2025 07:24:33 +0000 (09:24 +0200)]
ada: Fix crash on iterator of type with Constant_Indexing aspect
This happens when the type returned by the indexing function is a private
type whose completion is derived from another private type, because the
Finalize_Address routine cannot correctly fetch the actual root type.
gcc/ada/ChangeLog:
* exp_util.adb (Finalize_Address): In an untagged derivation, call
Root_Type on the full view of the base type if the partial view is
itself not a derived type.
(Is_Untagged_Derivation): Minor formatting tweak.
aarch64: PR target/121749: Use correct predicate for narrowing shift amounts
With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narrowing shift instructions
are now represented with standard RTL and more merging optimisations occur.
This exposed a wrong predicate for the shift amount operand.
The shift amount is the number of bits of the narrow destination, not the input
sources.
Correct this by using the vn_mode attribute when specifying the predicate, which
exists for this purpose.
I've spotted a few more narrowing shift patterns that need the restriction, so
they are updated as well.
Bootstrapped and tested on aarch64-none-linux-gnu.
which obviously lost everything in r87, instead of retaining its lower
bits as we expect. It's because the splitter didn't anticipate the
output register may be one of the input registers.
PR target/121906
gcc/
* config/loongarch/loongarch.md (*bstrins_<mode>_for_ior_mask):
Always create a new pseudo for the input register of the bstrins
instruction.