Fortran: Add code gen for do,concurrent's LOCAL/LOCAL_INIT [PR101602]
Implement LOCAL and LOCAL_INIT; we locally replace the tree declaration by
a local declaration of the outer variable. The 'local_init' then assigns
the value at the beginning of each loop iteration from the outer
declaration.
Note that the current implementation does not handle LOCAL with types that
have a default initializer and LOCAL/LOCAL_INIT for assumed-shape arrays;
this is diagnosed with a sorry error.
PR fortran/101602
gcc/fortran/ChangeLog:
* resolve.cc (resolve_locality_spec): Remove 'sorry, unimplemented'.
* trans-stmt.cc (struct symbol_and_tree_t): New.
(gfc_trans_concurrent_locality_spec): New.
(gfc_trans_forall_1): Call it; update to handle local and local_init.
* trans-decl.cc (gfc_start_saved_local_decls,
gfc_stop_saved_local_decls): New; moved code from ...
(gfc_process_block_locals): ... here. Call it.
* trans.h (gfc_start_saved_local_decls,
gfc_stop_saved_local_decls): Declare.
gcc/testsuite/ChangeLog:
* gfortran.dg/do_concurrent_8_f2023.f90: Update for removed 'sorry,
unimplemented'.
* gfortran.dg/do_concurrent_9.f90: Likewise.
* gfortran.dg/do_concurrent_all_clauses.f90: Likewise.
* gfortran.dg/do_concurrent_local_init.f90: Likewise.
* gfortran.dg/do_concurrent_locality_specs.f90: Likewise.
* gfortran.dg/do_concurrent_11.f90: New test.
* gfortran.dg/do_concurrent_12.f90: New test.
* gfortran.dg/do_concurrent_13.f90: New test.
* gfortran.dg/do_concurrent_14.f90: New test.
* gfortran.dg/do_concurrent_15.f90: New test.
Jason Merrill [Tue, 8 Apr 2025 19:53:34 +0000 (15:53 -0400)]
c++: lambda in concept [PR118698]
When normalizing is_foo for <T>, we get to normalizing
callable<decltype(...),T> for <T,foo>, which means substituting <T,foo> into
<decltype(...),T>.
Since r14-9938, because in_template_context is false we return the lambda
unchanged, just with LAMBDA_EXPR_EXTRA_ARGS set, so the closure type still
refers to the is_specialization_of tparms in its CLASSTYPE_TEMPLATE_INFO.
So then in normalize_atom caching find_template_parameters walks over the
parameter mapping; any_template_parm_r walks into the TREE_TYPE of a
LAMBDA_EXPR without considering EXTRA_ARGS and finds a template parm from
the wrong parameter list.
But since r15-3530 we expect to set tf_partial when substituting with
dependent arguments, so we should set that when normalizing. And then
tf_partial causes TREE_STATIC to be set on the EXTRA_ARGS, meaning that
those args will replace all the template parms in the rest of the lambda, so
we can walk just the EXTRA_ARGS and ignore the rest.
In previous years, I've tried to update the guality tests
so that they give clean results on aarch64-linux-gnu with
a recent version of GDB. This patch does the same thing for
GCC 15. The version of GDB I used was 16.2.
As before, there are no PRs for the XFAILs. The idea is that
anyone who is interested in working in this area can see the
current XFAILs by grepping the tests.
The issue is specifically about a missing word, but I spotted other
copy-editing issues like misplaced hyphens in nearby text. I also
thought that the -Wimplicit example was anachronistic because it's a
hard error in modern C dialects rather than a warning, and replaced it
with something users are more likely to run into.
gcc/ChangeLog
PR c++/90468
* doc/invoke.texi (Warning Options): Clean up text describing
-Wno-xxx.
Jakub Jelinek [Tue, 8 Apr 2025 13:57:45 +0000 (15:57 +0200)]
cobol: Further fixes for cobol cross-compilation from 32-bit arches [PR119364]
On top of
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680256.html
patch this brings make check-cobol when using the cross compiler from
32-bit host to x86_64-linux to the following:
Running /home/jakub/src/gcc/gcc/testsuite/cobol.dg/dg.exp ...
FAIL: cobol.dg/group1/declarative_1.cob -O0 execution test
FAIL: cobol.dg/group1/declarative_1.cob -O1 execution test
FAIL: cobol.dg/group1/declarative_1.cob -O2 execution test
FAIL: cobol.dg/group1/declarative_1.cob -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: cobol.dg/group1/declarative_1.cob -O3 -g execution test
FAIL: cobol.dg/group1/declarative_1.cob -Os execution test
=== cobol Summary ===
# of expected passes 3123
# of unexpected failures 6
# of expected failures 6
(which has some analysis but not a fix yet).
This patch fixes various cases where host size of various types
(void *, int, size_t, unsigned char) is used in place where
size of those types in bytes on the target should be used instead.
At least the size of void * and size_t actually differns between
ilp32 hosts and lp64 targets, int could be different in theory as well
but we actually don't support 16-bit ints on the host side and only support
lp64 targets right now for cobol, and finally sizeof(unsigned char) is
always 1, so there is no point to multiply by that and it is still
wrong to use host sizeof for the target decisions.
2025-04-08 Jakub Jelinek <jakub@redhat.com>
PR cobol/119364
* genapi.cc (function_handle_from_name): Use sizeof_pointer.
(parser_file_add): Use int_size_in_bytes(VOID_P) and
int_size_in_bytes(int).
(inspect_tally): Use int_size_in_bytes(VOID_P).
(inspect_replacing): Likewise.
(gg_array_of_field_pointers): Likewise.
(gg_array_of_file_pointers): Likewise.
(parser_set_pointers): Use sizeof_pointer.
* cobol1.cc (create_our_type_nodes_init): Use
int_size_in_bytes(SIZE_T) and int_size_in_bytes(VOID_P).
* gengen.cc (gg_array_of_size_t): Use int_size_in_bytes(SIZE_T).
(gg_array_of_bytes): Just use N, don't multiply it by
sizeof(unsigned char).
* parse.y: Include tree.h. Use int_size_in_bytes(ptr_type_node).
Jakub Jelinek [Tue, 8 Apr 2025 13:14:58 +0000 (15:14 +0200)]
simplify-rtx: Fix up POPCOUNT optimization [PR119672]
The gcc.dg/vect/pr113281-1.c test and many others ICE on riscv since
presumably the r15-9238 change which allowed more cases of vector modes
in simplify_const_relational_operation.
In the testcase it is EQ of
(popcount:SI (unspec:RVVMF32BI [
(and:RVVMF32BI (const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(reg:RVVMF32BI 147 [ mask__6.8_35 ]))
(reg:SI 143 [ _41 ])
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE))
and
(const_int 0 [0])
which it tries to fold as EQ comparison of
(unspec:RVVMF32BI [
(and:RVVMF32BI (const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(reg:RVVMF32BI 147 [ mask__6.8_35 ]))
(reg:SI 143 [ _41 ])
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
with
(const_int 0 [0])
which ICEs because const0_rtx isn't a vector.
Fixed by using CONST0_RTX, so that we pass
(const_vector:RVVMF32BI repeat [
(const_int 0 [0])
])
instead.
2025-04-08 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119672
* simplify-rtx.cc (simplify_context::simplify_relational_operation_1):
For POPCOUNT == 0 or != 0 optimizations use
CONST0_RTX (GET_MODE (XEXP (op0, 0))) rather than const0_rtx.
Martin Uecker [Fri, 4 Apr 2025 19:01:48 +0000 (21:01 +0200)]
c: fix checking for a tag for variably modified tagged types [PR119612]
The checking assertion added for PR118765 did not take into account
that add_decl_expr can change TYPE_NAME to a TYPE_DECL with no name
for certain cases of variably modified types. This also implies that we
might sometimes not reliably detect the absence of a tag when only
considering TYPE_NAME. This patch introduces a new helper function
c_type_tag to reliable compute the tag for a tagged types and uses it
for code where the switch to C23 may cause regressions.
PR c/119612
gcc/c/ChangeLog:
* c-tree.h (c_type_tag): Add prototype.
* c-typeck.cc (c_type_tag): New function.
(tagged_types_tu_compatible_p, composite_type_internal): Use
c_type_tag.
* c-decl.cc (c_struct_hasher::hash, previous_tag): Use c_type_tag.
gcc/testsuite/ChangeLog:
* gcc.dg/gnu23-tag-6.c: New test.
* gcc.dg/pr119612.c: New test.
OpenMP: Fix append_args handling in modify_call_for_omp_dispatch
At tree level, the addr ref is also required for array dummy arguments,
contrary to C; the GOMP_interop calls in modify_call_for_omp_dispatch
were updated accordingly (using build_fold_addr_expr).
As the GOMP_interop calls had no location data associated with them,
the init call happened as soon as executing the previous line of code,
which was confusing; solution: use the location data of the function
call itself.
PR middle-end/119662
gcc/ChangeLog:
* gimplify.cc (modify_call_for_omp_dispatch): Fix GOMP_interop
arg passing; add location info to function calls.
libgomp/ChangeLog:
* testsuite/libgomp.c/append-args-fr-1.c: New test.
* testsuite/libgomp.c/append-args-fr.h: New test.
Jason Merrill [Mon, 7 Apr 2025 18:35:14 +0000 (14:35 -0400)]
c++: self-dependent alias template [PR117530]
Here, instantiating B<short> means instantiating A<short>, which means
instantiating B<short>. And then when we go to register the initial
instantiation, it conflicts with the inner one. Fixed by checking after
tsubst whether there's already something in the hash table. We already did
something much like this in tsubst_decl, but that doesn't handle this case.
While I was here, I noticed that we had a pop_deferring_access_checks on one
early exit but not another, and since I wanted to add yet another I switched
to using deferring_access_check_sentinel.
PR c++/117530
gcc/cp/ChangeLog:
* pt.cc (instantiate_template): Check retrieve_specialization after
tsubst.
Jakub Jelinek [Tue, 8 Apr 2025 10:39:16 +0000 (12:39 +0200)]
riscv: Fix a typo in config/riscv/freebsd.h [PR119678]
The r15-1124 commit had a typo in one of the FBSD_LINK_PG_NOTE
macro uses.
Fixed thusly, tested with
../configure --target riscv64-unknown-freebsd14 --disable-bootstrap --enable-languages=c --disable-libsanitizer --disable-libgomp
make -j32
Before it failed while compiling gcc.cc:
In file included from ./tm.h:44,
from ../../gcc/gcc.cc:35:
../../gcc/config/riscv/freebsd.h:45:5: error: expected ‘,’ or ‘;’ before ‘FBSD_LINK_PG_NOTES’
45 | " FBSD_LINK_PG_NOTES " \
| ^~~~~~~~~~~~~~~~~~
../../gcc/gcc.cc:1211:32: note: in expansion of macro ‘LINK_SPEC’
Now it fails later on during libgcc configury because I don't have
corresponding binutils.
2025-04-08 Jakub Jelinek <jakub@redhat.com>
PR target/119678
* config/riscv/freebsd.h (LINK_SPEC): Use FBSD_LINK_PG_NOTE rather
than non-existing FBSD_LINK_PG_NOTES.
ld: error: undefined symbol: _Unwind_RaiseException
>>> referenced by eh_throw.cc:93 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:93)
>>> eh_throw.o:(__cxa_throw) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and/or:
ld: error: undefined symbol: _Unwind_Resume_or_Rethrow
>>> referenced by eh_throw.cc:129 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:129)
>>> eh_throw.o:(__cxa_rethrow) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_RaiseException
collect2: error: ld returned 1 exit status
..., or:
unresolved symbol _Unwind_Resume_or_Rethrow
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~25 'check-gcc-c++',
and ~10 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or some 'FAIL: [...] execution test' where these test cases now FAIL when
attempting to use these interfaces, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
Thomas Schwinge [Tue, 18 Mar 2025 09:10:30 +0000 (10:10 +0100)]
GCN, nvptx: Define '_Unwind_DeleteException'
This resolves GCN:
ld: error: undefined symbol: _Unwind_DeleteException
>>> referenced by eh_catch.cc:109 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_catch.cc:109)
>>> eh_catch.o:(__cxa_end_catch) in archive [...]/build-gcc/amdgcn-amdhsa/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~100 'check-gcc-c++',
and ~500 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases now
FAIL for unrelated reasons, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
Thomas Schwinge [Mon, 7 Apr 2025 10:39:33 +0000 (12:39 +0200)]
nvptx: In offloading compilation, special-case certain host-setup symbol aliases: avoid unused label 'emit_ptx_alias' diagnostic
Minor fix-up for commit 65b31b3fff2fced015ded1026733605f34053796
"nvptx: In offloading compilation, special-case certain host-setup symbol aliases [PR101544]",
as of which we see for non-offloading configurations:
+[...]/source-gcc/gcc/config/nvptx/nvptx.cc: In function 'void nvptx_asm_output_def_from_decls(FILE*, tree, tree)':
+[...]/source-gcc/gcc/config/nvptx/nvptx.cc:7769:2: warning: label 'emit_ptx_alias' defined but not used [-Wunused-label]
+ 7769 | emit_ptx_alias:
+ | ^~~~~~~~~~~~~~
libgomp: Add -Wno-c-binding-type for omp_lib.f90 compilation
Silence the overeager "Warning: Variable 'depobj_list' at (1) is a dummy
argument of the BIND(C) procedure ... but may not be C interoperable
[-Wc-binding-type]" when compiling omp_lib.f90(.in).
The argument is of integer kind 'omp_depend_kind = @OMP_DEPEND_KIND@'.
Jakub Jelinek [Tue, 8 Apr 2025 09:55:13 +0000 (11:55 +0200)]
cse: Fix up delete_trivially_dead_insns [PR119594]
The following testcase is miscompiled by delete_trivially_dead_insns,
latently since r0-6313, actually since r15-1575.
The problem is in that r0-6313 change, which made count_reg_usage not
count uses of the pseudo which the containing SET sets. That is needed
so we can delete those instructions as trivially dead if they are really
dead, but has the following problem. After fwprop proper we have:
(insn 7 2 8 2 (set (reg/v:DI 101 [ g ])
(const_int -1 [0xffffffffffffffff])) "pr119594.c":8:10 95 {*movdi_internal}
(nil))
...
(insn 26 24 27 7 (set (reg:DI 104 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8 175 {*zero_extendsidi2}
(expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_DEAD (reg/v:DI 101 [ g ])
(nil))))
(insn 27 26 28 7 (set (reg/v:DI 101 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8 175 {*zero_extendsidi2}
(expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
and nothing else uses or sets the 101 and 104 pseudos. The subpass doesn't
look at REG_UNUSED or REG_DEAD notes (correctly, as they aren't guaranteed
to be accurate). The last change in the IL was forward propagation of
(reg:DI 104 [ g ]) value into the following insn.
Now, count_reg_usage doesn't count anything on insn 7, the SET_DEST is a
reg, so we don't count that and SET_SRC doesn't contain any regs.
On insn 26 it counts one usage of pseudo 101 (so counts[101] = 1) and
on insn 27 since r0-6313 doesn't count anything as that insn sets
pseudo 101 to something that uses it, it isn't a side-effect instruction
and can't throw.
Now, after counting reg usages the subpass walks the IL from end to start,
sees insn 27, counts[101] is non-zero, so insn_live_p is true, nothing is
deleted. Then sees insn 26, counts[104] is zero, insn_live_p is false,
we delete the insn and decrease associated counts, in this case counts[101]
becomes zero. And finally later we process insn 7, counts[101] is now zero,
insn_live_p is false, we delete the insn (and decrease associated counts,
which aren't any).
Except that this resulted in insn 27 staying in the IL but using a REG
which is no longer set (and worse, having a REG_EQUAL note of something we
need later in the same bb, so we then assume pseudo 101 contains 0xffffffff,
which it no longer does.
Now, if insn 26 was after insn 27, this would work just fine, we'd first
delete that and then insn 27 and then insn 7, which is why most of the time
it happens to work fine.
The following patch fixes it by detecting the cases where there are
self-references after a pseudo has been used at least once outside of the
self-references or just as REG_P SET_DEST and in that case only increases
the count for the pseudo, making it not trivially deletable.
2025-04-08 Eric Botcazou <botcazou@adacore.com>
Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119594
* cse.cc (count_reg_usage): Count even x == dest regs if they have
non-zero counts already and incr is positive.
gccrs: Change optional to expected for parse_loop_label
gcc/rust/ChangeLog:
* parse/rust-parse-impl.h (Parser::parse_loop_label): Change function
return type to expected.
(Parser::parse_labelled_loop_expr): Adapt call location to new return
type.
* parse/rust-parse.h (enum class): Update function prototype.
gccrs: Resolve labels within break or continue expressions
gcc/rust/ChangeLog:
* resolve/rust-late-name-resolver-2.0.cc (Late::visit): Add call
to label resolution if there is one label.
(Late::resolve_label): Look for labels and emit an error message on
failure.
* resolve/rust-late-name-resolver-2.0.h: Add function prototypes.
A loop label error state was in use to represent missing loop label but
this may be easily forgotten and the optional nature of the label was
misrepresented.
gcc/rust/ChangeLog:
* ast/rust-ast-builder.cc (Builder::block): Call with a nullopt instead
of an error loop label.
(WhileLetLoopExpr::as_string): Use getter function and adapt to
newtype.
* ast/rust-ast.cc (WhileLoopExpr::as_string): Likewise.
(LoopExpr::as_string): Likewise.
(BreakExpr::as_string): Likewise.
(ForLoopExpr::as_string): Likewise.
* ast/rust-expr.h (class BlockExpr): Make loop label optional.
(class BreakExpr): Likewise.
* expand/rust-derive-clone.cc (DeriveClone::clone_fn): Use nullopt.
* expand/rust-derive-debug.cc (DeriveDebug::stub_debug_fn): Likewise.
* expand/rust-derive-default.cc (DeriveDefault::default_fn): Likewise.
* expand/rust-derive-eq.cc: Likewise.
* parse/rust-parse-impl.h (Parser::parse_block_expr): Use optional
for arguments.
(Parser::parse_loop_expr): Likewise.
(Parser::parse_while_loop_expr): Likewise.
(Parser::parse_while_let_loop_expr): Likewise.
(Parser::parse_for_loop_expr): Likewise.
(Parser::parse_labelled_loop_expr): Likewise.
(Parser::parse_loop_label): Return an optional.
* parse/rust-parse.h: Update function prototype and use nullopt for
default values.
* resolve/rust-ast-resolve-base.h (redefined_error): created a function for
rust_error_at for redefined at multiple times.
* resolve/rust-ast-resolve-implitem.h: changed rust_error_at to redefined_error.
* resolve/rust-ast-resolve-stmt.cc (ResolveStmt::visit): changed rust_error_at to
redefined_error.
* resolve/rust-ast-resolve-stmt.h: changed rust_error_at to redefined_error.
* resolve/rust-ast-resolve-toplevel.h: changed rust_error_at to redefined_error.
Signed-off-by: Sri Ganesh Thota <sriganeshthota12345@gmail.com>
Owen Avery [Sun, 30 Mar 2025 23:08:45 +0000 (19:08 -0400)]
gccrs: nr2.0: Improve test script
gcc/testsuite/ChangeLog:
* rust/compile/nr2/compile.exp: Avoid absolute paths in output,
adjust phrasing of output, and avoid false XPASS output when
tests are run in parallel.
* ast/rust-ast-visitor.cc
(DefaultASTVisitor::visit): Remove explicit visitation of a
function's self parameter, as if it exists it'll be visited as
one of the function parameters.
* resolve/rust-ast-resolve-type.cc (ResolveRelativeTypePath::go): fix error msg
* typecheck/rust-substitution-mapper.cc (SubstMapper::Resolve): add validation
(SubstMapper::valid_type): new check
(SubstMapper::visit): check if can resolve
* typecheck/rust-substitution-mapper.h: new prototype
gcc/testsuite/ChangeLog:
* rust/compile/nr2/exclude: nr2 is missing type path error
* rust/compile/issue-3643.rs: New test.
* rust/compile/issue-3646.rs: New test.
* rust/compile/issue-3654.rs: New test.
* rust/compile/issue-3663.rs: New test.
* rust/compile/issue-3671.rs: New test.
Signed-off-by: Philip Herron <herron.philip@googlemail.com>
Philip Herron [Wed, 2 Apr 2025 17:21:46 +0000 (18:21 +0100)]
gccrs: Fix recusive type query and nullptr on type path
This was a small fix to sort out the segfault to check for nullptr on the
TypePath cases for query type. But when this happened opened up a few bugs
that were hidden under the carpet namely: compile/issue-2905-{1,2}.rs which
has a recursive type query which needs to ne handled but now and error
message is being output for the type path. This happens because we start
resolving a generic struct:
struct Wierd<T>(A<(T,)>);
So the child field A is also generic and the generic argument of the tuple
of T needs to be applied to this generic field. This causes a chunk of
code to do bounds checking to ensure the bounds are ok, this is also
something that probably might change as generic types will have the bounds
secified anyway but thats besides the case right now. So once this bounds
checking occurs we endup looking at the impl block for Wierd<i32> which is
also grand but we still havent finished resolving the parent type of Wierd
which is recusive. But the query type system needs to check for that.
The other issue was: compile/issue-3022.rs which is a resolution issue:
impl<T: Foo<U>, U> Foo<U> for Bar<T, U>
The bound of Foo<T> is added to T before U is resolved but this was hidden
before the new error message was added. So now we have a generic
arguements handler being used correctly all over the code base apart from
1 last case for Traits but we will deal with that later. This handles the
case by setting up the type parameters upfront then sorting out their
bounds.
Fixes Rust-GCC#3625
gcc/rust/ChangeLog:
* typecheck/rust-hir-trait-resolve.cc (TraitResolver::resolve_trait): new argument
* typecheck/rust-hir-type-check-base.cc (TypeCheckBase::TypeCheckBase): new helper
* typecheck/rust-hir-type-check-base.h: new helper prototype
* typecheck/rust-hir-type-check-implitem.cc (TypeCheckTopLevelExternItem::visit):
remove comment out code
* typecheck/rust-hir-type-check-path.cc (TypeCheckExpr::resolve_root_path): check for null
* typecheck/rust-hir-type-check-type.cc (TypeCheckType::resolve_root_path): likewise
(TypeResolveGenericParam::Resolve): new args
(TypeResolveGenericParam::ApplyAnyTraitBounds): new helper
(TypeResolveGenericParam::apply_trait_bounds): new field
(TypeResolveGenericParam::visit): update
* typecheck/rust-hir-type-check-type.h: new args
* typecheck/rust-hir-type-check.cc (TraitItemReference::get_type_from_fn): reuse helper
* typecheck/rust-type-util.cc (query_type): check for recursive query
* typecheck/rust-tyty-subst.cc (SubstitutionParamMapping::SubstitutionParamMapping):
remove const
(SubstitutionParamMapping::get_generic_param): likewise
* typecheck/rust-tyty-subst.h: likewise
* typecheck/rust-tyty-variance-analysis.cc (GenericTyVisitorCtx::process_type): likewise
gcc/testsuite/ChangeLog:
* rust/compile/issue-3625.rs: New test.
Signed-off-by: Philip Herron <herron.philip@googlemail.com>
Philip Herron [Wed, 2 Apr 2025 15:16:47 +0000 (16:16 +0100)]
gccrs: Fix ICE when there are 2 functions named main
We need to setup the main_identifier_node for MAIN_DECL_P checks in the
middle-end. But it is valid to have a main function/method on impl blocks.
So we need to flag if this is a "root" item or not, which is one that is
jsut an HIR::Function on part of the Crate::items as oppposed to a
HIR::Function which is part of an HIR::ImplBlock as part of the HIR::Crate.
Some small cleanups have been added here too.
Fixes Rust-GCC#3648
gcc/rust/ChangeLog:
* backend/rust-compile-base.cc: new flag is_root_item
* backend/rust-compile-base.h: update prototype
* backend/rust-compile-implitem.cc (CompileTraitItem::visit): update call
* backend/rust-compile-implitem.h: remove old debug internal error
* backend/rust-compile-item.cc (CompileItem::visit): update call
* backend/rust-compile-item.h: remove old debug
* backend/rust-compile-resolve-path.cc (HIRCompileBase::query_compile): update calls
* backend/rust-compile.cc: likewise
* typecheck/rust-hir-trait-resolve.cc (TraitResolver::resolve_path_to_trait):
remove assertion and error
gcc/testsuite/ChangeLog:
* rust/compile/issue-3648.rs: New test.
Signed-off-by: Philip Herron <herron.philip@googlemail.com>
Philip Herron [Mon, 31 Mar 2025 16:58:24 +0000 (17:58 +0100)]
gccrs: Fix ICE when doing shift checks on const decl
Const decls are just delcarations wrapping the value into the DECL_INITIAL
and the shift checks we have assume no decls are involved and its just flat
values. This patch simply unwraps the constant values if they exist.
Jonathan Wakely [Mon, 7 Apr 2025 18:52:55 +0000 (19:52 +0100)]
libstdc++: Fix use-after-free in std::format [PR119671]
When formatting floating-point values to wide strings there's a case
where we invalidate a std::wstring buffer while a std::wstring_view is
still referring to it.
libstdc++-v3/ChangeLog:
PR libstdc++/119671
* include/std/format (__formatter_fp::format): Do not invalidate
__wstr unless _M_localized returns a valid string.
* testsuite/std/format/functions/format.cc: Check wide string
formatting of floating-point types with classic locale.
Reviewed-by: Tomasz Kaminski <tkaminsk@redhat.com>
Tejas Belagod [Wed, 27 Mar 2024 10:02:51 +0000 (15:32 +0530)]
AArch64: Diagnose OpenMP offloading when SVE types involved.
The target clause in OpenMP is used to offload loop kernels to accelarator
peripeherals. target's 'map' clause is used to move data from and to the
accelarator. When the data is SVE type, it may not be suitable because of
various reasons i.e. the two SVE targets may not agree on vector size or
some targets don't support variable vector size. This makes SVE unsuitable
for use in OMP's 'map' clause. This patch diagnoses all such cases and issues
an error where SVE types are not suitable.
Co-authored-by: Andrea Corallo <andrea.corallo@arm.com>
gcc/ChangeLog:
* target.h (type_context_kind): Add new context kinds for target clauses.
(omp_type_context): Query if the context is of OMP kind.
* config/aarch64/aarch64-sve-builtins.cc (verify_type_context): Diagnose
SVE types for a given OpenMP context.
(omp_type_context): New.
* gimplify.cc (omp_notice_variable): Diagnose implicitly-mapped SVE
objects in OpenMP regions.
(gimplify_scan_omp_clauses): Diagnose SVE types for various target
clauses.
Various parts of the omp code checked whether the size of a decl
was an INTEGER_CST in order to determine whether the decl was
variable-sized or not. If it was variable-sized, it was expected
to have a DECL_VALUE_EXPR replacement, as for VLAs.
This patch uses poly_int_tree_p instead, so that variable-length
SVE vectors are treated like constant-length vectors. This means
that some structures become poly_int-sized, with some fields at
poly_int offsets, but we already have code to handle that.
An alternative would have been to handle the data via indirection
instead. However, that's likely to be more complicated, and it
would contradict is_variable_sized, which already uses a check
for TREE_CONSTANT rather than INTEGER_CST.
gimple_add_tmp_var should probably not add a safelen of 1
for SVE vectors, but that's really a separate thing and might
be hard to test.
Co-authored-by: Tejas Belagod <tejas.belagod@arm.com>
gcc/
PR middle-end/101018
* poly-int.h (can_and_p): New function.
* fold-const.cc (poly_int_binop): Use it to optimize BIT_AND_EXPRs
involving POLY_INT_CSTs.
* gimplify.cc (omp_notice_variable): Use poly_int_tree_p instead
of INTEGER_CST when checking for constant-sized omp data.
(gimplify_adjust_omp_clauses_1): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
Haochen Jiang [Fri, 28 Mar 2025 08:16:27 +0000 (16:16 +0800)]
i386: Add PTA_AVX10_1_256 to PTA_DIAMONDRAPIDS
For -march= handling, PTA_AVX10_1 will not imply PTA_AVX10_1_256,
resulting in TARGET_AVX10_1 becoming true while TARGET_AVX10_1_256
false. Since we will check TARGET_AVX10_1_256 in GCC 15 for AVX512
feature enabling for AVX10, -march=diamondrapids will not enable
512 bit register and x/ymm16+.
Since AVX10 will get a further clean up in GCC 16 and will help
PTA_DIAMONDRAPIDS reusing PTA_GRANITERAPIDS_D, the imply would become
obvious again, I plan not to add the testcase but just to fix the issue
in GCC 15.
Jin Ma [Mon, 7 Apr 2025 06:21:50 +0000 (14:21 +0800)]
RISC-V: Disable unsupported vsext/vzext patterns for XTheadVector.
XThreadVector does not support the vsext/vzext instructions; however,
due to the reuse of RVV optimizations, it may generate these instructions
in certain cases. To prevent the error "Unknown opcode 'th.vsext.vf2',"
we should disable these patterns.
V2:
Change the value of dg-do in the test case from assemble to compile, and
remove the -save-temps option.
gcc/ChangeLog:
* config/riscv/vector.md: Disable vsext/vzext for XTheadVector.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/vsext.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vzext.c: New test.
Jason Merrill [Mon, 7 Apr 2025 15:49:19 +0000 (11:49 -0400)]
c++: constinit and value-initialization [PR119652]
Value-initialization built an AGGR_INIT_EXPR to set AGGR_INIT_ZERO_FIRST on.
Passing that AGGR_INIT_EXPR to maybe_constant_value returned a TARGET_EXPR,
which potential_constant_expression_1 mistook for a temporary.
We shouldn't add a TARGET_EXPR to the AGGR_INIT_EXPR in this case, just like
we already avoid adding it to CONSTRUCTOR or CALL_EXPR.
PR c++/119652
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_outermost_constant_expr): Also don't add a
TARGET_EXPR around AGGR_INIT_EXPR.
Iain Sandoe [Sun, 15 Oct 2023 09:19:22 +0000 (10:19 +0100)]
aarch64, Darwin: Initial implementation of Apple cores [PR113257].
After discussion with the open source support team at Apple, we have
established that the cores conform to the 8.5 and 8.6 requirements.
One of the mandatory features (FEAT_SPECRES) is not exposed (or
available) in user-space code but is supported for privileged code.
The values for chip IDs and the LITTLE.big variants have been taken
from lists in the XNU and LLVM sources.
PR target/113257
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Apple-a12,
Apple-M1, Apple-M2, Apple-M3 with expanded names to allow for the
LITTLE.big versions.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Add apple-m1,2 and 3 cores to the ones listed
for arch and tune selections.
Thomas Schwinge [Sat, 5 Apr 2025 21:11:23 +0000 (23:11 +0200)]
GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]
For both GCN, nvptx, this gets rid of 'configure'-time:
configure: WARNING: No native atomic operations are provided for this platform.
configure: WARNING: They will be faked using a mutex.
configure: WARNING: Performance of certain classes will degrade as a result.
..., and changes:
-checking for lock policy for shared_ptr reference counts... mutex
+checking for lock policy for shared_ptr reference counts... atomic
That means, '[...]/[target]/libstdc++-v3/', 'Makefile's change:
..., and '[...]/[target]/libstdc++-v3/config.h' changes:
/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef HAVE_ATOMIC_LOCK_POLICY */
+#define HAVE_ATOMIC_LOCK_POLICY 1
/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1
..., and '[...]/[target]/libstdc++-v3/include/[target]/bits/c++config.h'
changes:
/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY */
+#define _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY 1
/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1
This means that '[...]/[target]/libstdc++-v3/libsupc++/atomicity.cc',
'[...]/[target]/libstdc++-v3/libsupc++/atomicity.o' then uses atomic
instructions for synchronization instead of C++ static local variables, which
in turn for their guard variables, via 'libstdc++-v3/libsupc++/guard.cc', used
'libgcc/gthr.h' recursive mutexes, which currently are unsupported for GCN.
For GCN, this turns ~500 libstdc++ execution test FAILs into PASSes, and also
progresses:
PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 execution test
PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 execution test
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C -std=gnu++98: exception handling not supported
(For nvptx, there is no effective change, due to other misconfiguration.)
Thomas Schwinge [Sun, 6 Apr 2025 15:44:18 +0000 (17:44 +0200)]
nvptx: Support '-mfake-ptx-alloca': defer failure to run-time 'alloca' usage
Follow-up to commit 1146410c0feb0e82c689b1333fdf530a2b34dc2b
"nvptx: Support '-mfake-ptx-alloca'". '-mfake-ptx-alloca' is applicable only
for configurations where PTX 'alloca' is not supported, where target libraries
are built with it enabled (that is, libstdc++, libgfortran).
This change progresses:
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C -std=gnu++98: exception handling not supported
..., and "enables" a few test cases:
FAIL: g++.old-deja/g++.other/sibcall1.C -std=gnu++17 (test for excess errors)
[Etc.]
FAIL: g++.old-deja/g++.other/unchanging1.C -std=gnu++17 (test for excess errors)
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAIL due to:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
Most importantly, it progresses ~830 libstdc++ test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases also
FAIL in configurations where PTX 'alloca' is supported, or ~120 instances of
'FAIL: [...] execution test' due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
| With '-mfake-ptx-alloca', libgfortran again succeeds to build, and compared
| to before, we've got only a small number of regressions due to nvptx 'ld'
| complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':
|
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
..., and further progresses:
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAILs due to:
error : Prototype doesn't match for '_gfortran_caf_transfer_between_remotes' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
error : Prototype doesn't match for '_gfortran_caf_stop_numeric' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
Jakub Jelinek [Mon, 7 Apr 2025 12:25:49 +0000 (14:25 +0200)]
cobol: sed portability fix
Apparently Darwin sed doesn't like 's/\(foo\|bar\|baz\)/qux/' syntax,
simplified by using a pattern which matches all libgcobol header names
except possible config.h.
2025-04-07 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in (cobol/charmaps.cc, cobol/valconv.cc): Use a BRE
only sed regex.
Jakub Jelinek [Mon, 7 Apr 2025 11:53:20 +0000 (13:53 +0200)]
cobol: Fix up update_web_docs_git for COBOL [PR119227]
As mentioned in the PR, the COBOL documentation is currently not present
in onlinedocs at all.
While the script generates gcobol{,-io}.{pdf,html}, it generates them in
the gcc/gcc/cobol/ subdirectory of the update_web_docs_git temporary
directory and nothing find it there afterwards, all the processing is on
for file in */*.html *.ps *.pdf *.tar; do
So, this patch puts gcobol{,-io}.html into gcobol/ subdirectory and
gcobol{,-io}.pdf into the current directory, so that it is picked up.
With this it makes into onlinedocs:
find . -name \*cobol\*
./onlinedocs/gcobol.pdf.gz
./onlinedocs/gcobol.pdf
./onlinedocs/gcobol_io.pdf.gz
./onlinedocs/gcobol_io.pdf
./onlinedocs/gcobol
./onlinedocs/gcobol/gcobol_io.html.gz
./onlinedocs/gcobol/gcobol_io.html
./onlinedocs/gcobol/gcobol.html.gz
./onlinedocs/gcobol/gcobol.html
./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html
2025-04-07 Jakub Jelinek <jakub@redhat.com>
PR web/119227
* update_web_docs_git: Rename mdoc2pdf_html to cobol_mdoc2pdf_html,
perform mkdir -p $DOCSDIR/gcobol gcobol, remove $d/ from pdf and in
html replace it with gcobol/; update uses of the renamed function.
Jakub Jelinek [Mon, 7 Apr 2025 11:52:28 +0000 (13:52 +0200)]
cobol: Fix up make html for COBOL [PR119227]
What make html does for COBOL is quite inconsistent with all
other FEs. Normally make html creates HTML/gcc-15.0.1/
subdirectory and puts there subdirectories like gcc, cpp, gccint, gfortran
etc. and only those contain *.html files. COBOL puts gcobol.html and
gcobol-io.html into the current directory instead.
The following patch puts them into $(build_htmldir)/gcobol/ directory.
2025-04-07 Jakub Jelinek <jakub@redhat.com>
PR web/119227
* Make-lang.in (GCOBOL_HTML_FILES): New variable.
(cobol.install-html, cobol.html, cobol.srchtml): Use
$(GCOBOL_HTML_FILES) instead of gcobol.html gcobol-io.html.
(gcobol.html): Rename goal to ...
($(build_htmldir)/gcobol/gcobol.html): ... this. Run mkinstalldirs.
(gcobol-io.html): Rename goal to ...
($(build_htmldir)/gcobol/gcobol-io.html): ... this. Run mkinstalldirs.
Martin Jambor [Mon, 7 Apr 2025 11:32:10 +0000 (13:32 +0200)]
sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)
During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) which did not exist in the program before which can have
unintended effects on subsequent AA queries. Although not doing that
does not mean that SRA cannot create such situations (see the bug for
more info), it has been agreed that not doing this is generally better.
This patch therfore makes SRA fall back on creating simple MEM_REFs when
accessing components of an aggregate corresponding to what a SRA
variable now represents.
gcc/ChangeLog:
2025-03-26 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.
Martin Jambor [Mon, 7 Apr 2025 11:32:09 +0000 (13:32 +0200)]
sra: Avoid creating TBAA hazards (PR118924)
The testcase in PR 118924, when compiled on Aarch64, contains an
gimple aggregate assignment statement in between different types which
are types_compatible_p but behave differently for the purposes of
alias analysis.
SRA replaces the statement with a series of scalar assignments which
however have LHSs access chains modeled on the RHS type and so do not
alias with a subsequent reads and so are DSEd.
SRA clearly gets its "same_access_path" logic subtly wrong. One issue
is that the same_access_path_p function probably should be implemented
more along the lines of (parts of ao_compare::compare_ao_refs) instead
of internally relying on operand_equal_p. That is however not the
problem in the PR and so I will deal with it only later.
The issue here is that even when the access path is the same, it must
not be bolted on an aggregate type that does not match. This patch
does that, taking just one simple function from the
ao_compare::compare_ao_refs machinery and using it to detect the
situation. The rest is just merging the information in between
accesses of the same access group.
I looked at how many times we come across such assignment during
"make stage2-bubble" of GCC (configured with only c and C++ and
without multilib and libsanitizers) and on an x86_64 there were 87924
such assignments (though now I realize not all of them had to be
aggregate), so they do happen. The patch leads to about 5% increase
of cases where we don't use an "access path" but resort to a
MEM_REF (from 90209 to 95204). On an Aarch64, there were 92268 such
assignments and the increase of falling back to MEM_REFs was by
4% (but from a bigger base 132983 to 107991).
gcc/ChangeLog:
2025-04-04 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialzie grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.
gcc/testsuite/ChangeLog:
2025-03-25 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.
Richard Biener [Mon, 7 Apr 2025 09:27:19 +0000 (11:27 +0200)]
tree-optimization/119640 - ICE with vectorized shift placement
When the whole shift is invariant but the shift amount needs
to be converted and a vector shift used we can mess up placement
of vector stmts because we do not make SLP scheduling aware of
the need to insert code for it. The following mitigates this
by more conservative placement of such code in vectorizable_shift.
PR tree-optimization/119640
* tree-vect-stmts.cc (vectorizable_shift): Always insert code
for one of our SLP operands before the code for the vector
shift itself.
When MUL is not available, then the __umulhisi3 and __mulhisi3
functions can use __mulhisi3_helper. This improves code size,
stack footprint and runtime on AVRrc.
libgcc/
* config/avr/lib1funcs.S (__mulhisi3, __umulhisi3): Use
__mulhisi3_helper for better performance on AVRrc.
testsuite: arm: Tighten compile options for short-vfp-1.c [PR119556]
The previous version of this test required arch v6+ (for sxth), and
the number of vmov depended on the float-point ABI (where softfp
needed more of them to transfer floating-point values to and from
general registers).
With this patch we require arch v7-a, vfp FPU and -mfloat-abi=hard, we
also use -O2 to clean the generated code and convert
scan-assembler-times directives into check-function-bodies.
Tested on arm-none-linux-gnueabihf and several flavours of
arm-none-eabi.
Jakub Jelinek [Mon, 7 Apr 2025 09:57:36 +0000 (11:57 +0200)]
tailc: Extend the IPA-VRP workaround [PR119614]
The IPA-VRP workaround in the tailc/musttail passes was just comparing
the singleton constant from a tail call candidate return with the ret_val.
This unfortunately doesn't work in the following testcase, where we have
<bb 5> [local count: 152205050]:
baz (); [must tail call]
goto <bb 4>; [100.00%]
<bb 7> [local count: 1073741824]:
# _3 = PHI <0B(4), _8(6)>
return _3;
and in the unreduced testcase even more PHIs before we reach the return
stmt.
Normally when the call has lhs, whenever we follow a (non-EH) successor
edge, it calls propagate_through_phis and that walks the PHIs in the
destination bb of the edge and when it sees a PHI whose argument matches
that of the currently tracked value (ass_var), it updates ass_var to
PHI result of that PHI. I think it is theoretically dangerous that it
picks the first one, perhaps there could be multiple PHIs, so perhaps safer
would be walk backwards from the return value up to the call.
Anyway, this PR is about the IPA-VRP workaround, there ass_var is NULL
because the potential tail call has no lhs, but ret_var is not TREE_CONSTANT
but SSA_NAME with PHI as SSA_NAME_DEF_STMT. The following patch handles
it by pushing the edges we've walked through when ass_var is NULL into a
vector and if ret_var is SSA_NAME set to PHI result, it attempts to walk
back from the ret_var through arguments of PHIs corresponding to the
edges we've walked back until we reach a constant and compare that constant
against the singleton value as well.
2025-04-07 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119614
* tree-tailcall.cc (find_tail_calls): Remember edges which have been
walked through if !ass_var. Perform IPA-VRP workaround even when
ret_var is not TREE_CONSTANT, in that case check in a loop if it is
a PHI result and in that case look at the PHI argument from
corresponding edge in the edge vector.
* g++.dg/opt/pr119613.C: Change { c || c++11 } in obviously C++ only
test to just c++11.
* g++.dg/opt/pr119614.C: New test.
LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].
In GCC14, LoongArch added __float128 as an alias for _Float128.
In commit r15-8962, support for q/Q suffixes for 128-bit floating point
numbers. This will cause the compiler to automatically link libquadmath
when compiling Fortran programs. But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes link failure.
PR target/119408
libgfortran/ChangeLog:
* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch. If so, return false.
* configure: Regenerate.
libquadmath/ChangeLog:
* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch. If so, return false.
* configure: Regenerate.
Sigend-off-by: Xi Ruoyao <xry111@xry111.site> Sigend-off-by: Jakub Jelinek <jakub@redhat.com>
Eric Botcazou [Mon, 7 Apr 2025 08:33:52 +0000 (10:33 +0200)]
Ada: Fix wrong 'Access to aliased constrained array of controlled type
For technical reasons, the recently reimplemented finalization machinery
for controlled types requires arrays of controlled types to be allocated
with their bounds, including in the case where their nominal subtype is
constrained. However, in this case, the type of 'Access for the arrays
is pointer-to-constrained-array and, therefore, its value must designate
the array itself and not the bounds.
gcc/ada/
* gcc-interface/utils.cc (convert) <POINTER_TYPE>: Use fold_convert
to convert between thin pointers. If the source is a thin pointer
with zero offset from the base and the target is a pointer to its
array, displace the pointer after converting it.
* gcc-interface/utils2.cc (build_unary_op) <ATTR_ADDR_EXPR>: Use
fold_convert to convert the address before displacing it.
combine: Limit insn searchs for 2->2 combinations [PR116398]
As noted in the previous patch, combine still takes >30% of
compile time in the original testcase for PR101523. The problem
is that try_combine uses linear insn searches for some dataflow
queries, so in the worst case, an unlimited number of 2->2
combinations for the same i2 can lead to quadratic behaviour.
This patch limits distribute_links to a certain number
of instructions when i2 is unchanged. As Segher said in the PR trail,
it would make more conceptual sense to apply the limit unconditionally,
but I thought it would be better to change as little as possible at
this development stage. Logically, in stage 1, the --param should
be applied directly by distribute_links with no input from callers.
I think it's safe to drop log links even if a use exists. All
processing of log links seems to handle the absence of a link
for a particular register in a conservative way.
The initial set-up errs on the side of dropping links, since for example
create_log_links has:
/* flow.c claimed:
We don't build a LOG_LINK for hard registers contained
in ASM_OPERANDs. If these registers get replaced,
we might wind up changing the semantics of the insn,
even if reload can make what appear to be valid
assignments later. */
if (regno < FIRST_PSEUDO_REGISTER
&& asm_noperands (PATTERN (use_insn)) >= 0)
continue;
which excludes combinations by dropping log links, rather than during
try_combine. And:
/* If this register is being initialized using itself, and the
register is uninitialized in this basic block, and there are
no LOG_LINKS which set the register, then part of the
register is uninitialized. In that case we can't assume
anything about the number of nonzero bits.
??? We could do better if we checked this in
reg_{nonzero_bits,num_sign_bit_copies}_for_combine. Then we
could avoid making assumptions about the insn which initially
sets the register, while still using the information in other
insns. We would have to be careful to check every insn
involved in the combination. */
if (insn
&& reg_referenced_p (x, PATTERN (insn))
&& !REGNO_REG_SET_P (DF_LR_IN (BLOCK_FOR_INSN (insn)),
REGNO (x)))
{
struct insn_link *link;
Also, to answer Jakub's question in that comment, I tried bisecting:
int limit = atoi (getenv ("BISECT"));
(so applying the limit for all calls from try_combine) with an
abort in distribute_links if the limit caused a link to be skipped.
The minimum BISECT value that allowed an aarch64-linux-gnu bootstrap
to succeed with --enable-languages=all --enable-checking=yes,rtl,extra
was 142, so much lower than the parameter value. I realised too late
that --enable-checking=release would probably have been a more
interesting test.
The previous patch meant that distribute_links itself is now linear
for a given i2 definition, since each search starts at the previous
last use, rather than at i2 itself. This means that the limit has
to be applied cumulatively across all searches for the same link.
The patch does that by storing a counter in the insn_link structure.
There was a 32-bit hole there on LP64 hosts.
gcc/
PR testsuite/116398
* params.opt (-param=max-combine-search-insns=): New param.
* doc/invoke.texi: Document it.
* combine.cc (insn_link::insn_count): New field.
(alloc_insn_link): Initialize it.
(distribute_links): Add a limit parameter.
(try_combine): Use the new param to limit distribute_links
when only i3 has changed.
Another problem in PR101523 was that, after each successful 2->2
combination attempt, distribute_links would search further and further
for the next combinable use of the i2 destination. Each search would
start at i2 itself, making the search quadratic in the worst case.
In a 2->2 combination, if i2 is unchanged, the search can start at i3
instead of i2. The same thing applies to i2 when distributing i2's
links, since the only changes to earlier instructions are the deletion
of i0 and i1.
This change, combined with the previous split_i2i3 patch, gives a
34.6% speedup in combine for the testcase in PR101523. Combine
goes from being 41% to 34% of compile time.
gcc/
PR testsuite/116398
* combine.cc (distribute_links): Take an optional start point.
(try_combine): If only i3 has changed, only distribute i3's links,
not i2's. Start the search for the new use from i3 rather than
from the definition instruction. Likewise start the search for
the new use from i2 when distributing i2's links.
combine: Avoid split_i2i3 search if i2 is unchanged [PR116398]
When combining a single-set i2 into a multi-set i3, combine
first tries to match the new multi-set in-place. If that fails,
combine considers splitting the multi-set so that one set goes in
i2 and the other set stays in i3. That moves a destination from i3
to i2 and so combine needs to update any associated log link for that
destination to point to i2 rather than i3.
However, that kind of split can also occur for 2->2 combinations.
For a 2-instruction combination in which i2 doesn't die in i3, combine
tries a 2->1 combination by turning i3 into a parallel of the original
i2 and the combined i3. If that fails, combine will split the parallel
as above, so that the first set goes in i2 and the second set goes in i3.
But that can often leave i2 unchanged, meaning that no destinations have
moved and so no search is necessary.
gcc/
PR testsuite/116398
* combine.cc (try_combine): Shortcut the split_i2i3 handling if
i2 is unchanged.
combine: Allow 2->2 combinations, but with a tweak [PR116398]
One of the problems in PR101523 was that, after each successful
2->2 combination attempt, try_combine would restart combination
attempts at i2 even if i2 hadn't changed. This led to quadratic
behaviour as the same failed combinations between i2 and i3 were
tried repeatedly.
The original patch for the PR dealt with that by disallowing 2->2
combinations. However, that led to various optimisation regressions,
so there was interest in allowing the combinations again, at least
until an alternative way of getting the same results is in place.
to be folded to a vector of -1s. One unusual thing about the fold
is that the <1> in the second insn is uninitialised; it looks like
it should be replaced by <0>, or that there should be an insn 4 that
copies <0> to <1>.
As it stands, the test relies on init-regs to insert a zero
initialisation of <1>. This happens after all the cse/pre/fwprop
stuff, with only dce passes between init-regs and combine.
Combine therefore sees:
It looks like the test should then pass through a 3, 9 -> 5 combination,
so that we get an (eq ...) between two zeros and fold it to a vector
of -1s. But although the combination is attempted, the fold doesn't
happen. Instead, combine is left to match the unsimplified (eq ...)
between two zeros, which rightly fails. The test only passes because
late_combine2 happens to try simplifying an (eq ...) between reg X and
reg X, which does fold to a vector of -1s.
The different handling of registers and constants is due to this
code in simplify_const_relational_operation:
if (INTEGRAL_MODE_P (mode) && trueop1 != const0_rtx
&& (code == EQ || code == NE)
&& ! ((REG_P (op0) || CONST_INT_P (trueop0))
&& (REG_P (op1) || CONST_INT_P (trueop1)))
&& (tem = simplify_binary_operation (MINUS, mode, op0, op1)) != 0
/* We cannot do this if tem is a nonzero address. */
&& ! nonzero_address_p (tem))
return simplify_const_relational_operation (signed_condition (code),
mode, tem, const0_rtx);
INTEGRAL_MODE_P matches vector integer modes, but everything else
about the condition is written for scalar integers only. Thus if
trueop0 and trueop1 are equal vector constants, we'll bypass all
the exclusions and try simplifying a subtraction. This will succeed,
giving a vector of zeros. The recursive call will then try to simplify
a comparison between the vector of zeros and const0_rtx, which isn't
well-formed. Luckily or unluckily, the ill-formedness doesn't trigger
an ICE, but it does prevent any simplification from happening.
The least-effort fix would be to replace INTEGRAL_MODE_P with
SCALAR_INT_MODE_P. But the fold does make conceptual sense for
vectors too, so it seemed better to keep the INTEGRAL_MODE_P and
generalise the rest of the condition to match.
gcc/
* simplify-rtx.cc (simplify_const_relational_operation): Generalize
the constant checks in the fold-via-minus path to match the
INTEGRAL_MODE_P condition.
Iain Sandoe [Mon, 31 Mar 2025 12:54:37 +0000 (13:54 +0100)]
testsuite, cobol: Avoid adding duplicate libs.
The discovered paths already include the multilib and so there is
no need to add an extra library to COBOL_UNDER_TEST. Doing so makes
a duplicate, which causes test fails on Darwin, where the linker warns
when duplicate libraries are provided on the link line.
gcc/testsuite/ChangeLog:
* lib/cobol.exp: Simplify the setting of COBOL_UNDER_TEST.
As discussed in the PR, the options had been added during development
to handle specific cases, they are no longer needed (and if they should
become necessary, we will need to guard them such that individual
platforms get the correct handling).
PR cobol/119414
gcc/cobol/ChangeLog:
* gcobolspec.cc (append_rdynamic,
append_allow_multiple_definition, append_fpic): Remove.
(lang_specific_driver): Remove platform-specific command
line option handling.
Doc: Fix description of -Wno-aggressive-loop-optimizations [PR78874]
gcc/ChangeLog
PR middle-end/78874
* doc/invoke.texi (Warning Options): Fix description of
-Wno-aggressive-loop-optimizations to reflect that this turns
off the warning, and the default is for it to be enabled.
Patrick Palka [Sun, 6 Apr 2025 02:39:15 +0000 (22:39 -0400)]
c++: maybe_dependent_member_ref and typenames [PR118626]
Here during maybe_dependent_member_ref for accepted_type<_Up>, we
correctly don't strip the typedef because it's a complex one (its
defaulted template parameter isn't used in its definition) and so
we recurse to consider its corresponding TYPE_DECL.
We then incorrectly decide to not rewrite this use because of the
TYPENAME_TYPE shortcut. But I don't think this shortcut should apply to
a typedef TYPE_DECL.
PR c++/118626
gcc/cp/ChangeLog:
* pt.cc (maybe_dependent_member_ref): Restrict TYPENAME_TYPE
shortcut to non-typedef TYPE_DECL.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias25a.C: New test.
Patrick Palka [Sun, 6 Apr 2025 02:39:12 +0000 (22:39 -0400)]
c++: maybe_dependent_member_ref and stripped alias [PR118626]
Here during maybe_dependent_member_ref (as part of CTAD rewriting
for the variant constructor) for __accepted_type<_Up> we strip this
alias all the way to type _Nth_type<__accepted_index<_Up>>, for which
we return NULL since _Nth_type is at namespace scope and so no
longer needs rewriting.
Note that however the template argument __accepted_index<_Up> of this
stripped type _does_ need rewriting (since it specializes a variable
template from the current instantiation). We end up not rewriting this
variable template reference at any point however because upon returning
NULL, the caller (tsubst) proceeds to substitute the original form of
the type __accepted_type<_Up>, which doesn't directly refer to
__accepted_index. This later leads to an ICE during subsequent alias
CTAD rewriting of this guide that contains a non-rewritten reference
to __accepted_index.
So when maybe_dependent_member_ref decides to not rewrite a class-scope
alias that's been stripped, the caller needs to commit to substituting
the stripped type rather than the original type. This patch essentially
implements that by making maybe_dependent_member_ref call tsubst itself
in that case.
PR c++/118626
gcc/cp/ChangeLog:
* pt.cc (maybe_dependent_member_ref): Substitute and return the
stripped type if we decided to not rewrite it directly.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias25.C: New test.
Per the issue, there were a couple places in the manual where
-Wno-psabi was mentioned, but the option itself was not documented.
gcc/c-family/ChangeLog
PR c/81831
* c.opt (Wpsabi): Remove "Undocumented" modifier and add a
documentation string.
gcc/ChangeLog
PR c/81831
* doc/invoke.texi (Option Summary): Add -Wno-psabi.
(Warning Options): Document -Wpsabi separately from -Wabi.
Note it's enabled by default, not just implied by -Wabi.
Replace the detailed example for a GCC 4.4 change for x86
(which is unlikely to be very interesting nowadays) with
just a list of all targets that presently diagnose these
warnings.
(RS/6000 and PowerPC Options): Add cross-references for
-Wno-psabi.