This change makes the check_dynamic_spec precondition checks slightly
faster to compile, and avoids those checks entirely for the common cases
of calling check_dynamic_spec_integral or check_dynamic_spec_string.
Instead of checking for unique types by keeping counts in an array and
looping over that array, we can just keep a sum of how many valid types
are present, and check that it equals the total number of types in the
pack.
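The counting idea above can be sketched roughly as follows. This is only an illustration, not the libstdc++ code: the names once_v and valid_types_v and the four-type allowed set are assumptions made for the example.

```cpp
#include <cstddef>
#include <type_traits>

// True if T occurs exactly once in the pack Ts.
template<typename T, typename... Ts>
constexpr bool once_v = (0 + ... + int(std::is_same_v<T, Ts>)) == 1;

// Hypothetical allowed set for this sketch: bool, char, int, double.
// Each allowed type adds 1 to the sum only if it occurs exactly once,
// so a duplicate or a disallowed type leaves the sum short of
// sizeof...(Ts) and the check fails.
template<typename... Ts>
constexpr bool valid_types_v =
  std::size_t(once_v<bool, Ts...> + once_v<char, Ts...>
              + once_v<int, Ts...> + once_v<double, Ts...>)
  == sizeof...(Ts);
```

A single comparison covers both failure modes, which is why only one diagnostic string remains. Note that this sketch accepts an empty pack (0 == 0), so that case needs a separate check.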
The diagnostic is slightly worse now, because there's only a single
"invalid template argument types" string that appears in the output,
where previously we had either "non-unique template argument type" or
"disallowed template argument type", depending on the failure mode.
Given that most users will never use this function directly, and
probably won't use invalid types anyway, the inferior diagnostic seems
acceptable.
libstdc++-v3/ChangeLog:
* include/std/format (basic_format_parse_context::__once): New
variable template.
(basic_format_parse_context::__valid_types_for_check_dynamic_spec):
New function template for checking argument types.
(basic_format_parse_context::__check_dynamic_spec): New function
template to implement the common check_dynamic_spec logic.
(basic_format_parse_context::check_dynamic_spec_integral): Call
__check_dynamic_spec instead of check_dynamic_spec.
(basic_format_parse_context::check_dynamic_spec_string):
Likewise. Use _CharT instead of char_type consistently.
(basic_format_parse_context::check_dynamic_spec): Use
__valid_types_for_check_dynamic_spec for precondition checks and
call __check_dynamic_spec for checking the arg id.
* testsuite/std/format/parse_ctx_neg.cc: Adjust expected errors.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Mon, 10 Mar 2025 14:29:36 +0000 (14:29 +0000)]
libstdc++: Add static_assert to std::packaged_task::packaged_task(F&&)
LWG 4154 (approved in Wrocław, November 2024) fixed the Mandates:
precondition for std::packaged_task::packaged_task(F&&) to match what
the implementation actually requires. We already gave a diagnostic in
the right cases as required by the issue resolution, so strictly
speaking we don't need to do anything. But the current diagnostic comes
from inside the implementation of std::__invoke_r and could be more
user-friendly.
For C++17 (when std::is_invocable_r_v is available) add a static_assert
to the constructor, so the error is clear:
Jonathan Wakely [Fri, 12 Jan 2024 16:57:41 +0000 (16:57 +0000)]
libstdc++: Update tzdata to 2025a
Import the new 2025a tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).
libstdc++-v3/ChangeLog:
* include/std/chrono (__detail::__get_leap_second_info): Update
expiry date for leap seconds list.
* src/c++20/tzdata.zi: Import new file from 2025a release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds):
Update expiry date for leap seconds list.
Jason Merrill [Tue, 11 Mar 2025 21:43:35 +0000 (17:43 -0400)]
contrib: relpath.sh /lib /include [PR119081]
Previously, if the common ancestor of the two paths is / we would print the
absolute second argument, but this PR asks for a relative path in that case
as well, which makes sense for the libstdc++.modules.json use case.
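The expected result can be illustrated with std::filesystem. This is not how relpath.sh is implemented (it is a shell script); the helper name relative_to is an assumption for the example, and the computation is purely lexical.

```cpp
#include <filesystem>
#include <string>

// Returns the path of `to` relative to the directory `from`, computed
// lexically (no filesystem access) and printed with '/' separators.
inline std::string relative_to(const std::filesystem::path& from,
                               const std::filesystem::path& to)
{
  return to.lexically_relative(from).generic_string();
}
```

On a POSIX system, relative_to("/lib", "/include") yields "../include" even though the only common ancestor is the root directory, which is the behaviour the PR asks for.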
Alex Coplan [Mon, 10 Mar 2025 16:44:15 +0000 (16:44 +0000)]
df: Treat partial defs as uses in df_simulate_defs [PR116564]
The PR shows us spinning in dce.cc:fast_dce at the start of combine.
This spinning appears to be because of a disagreement between the fast_dce code
and the code in df-problems.cc:df_lr_bb_local_compute. Specifically, they
disagree on the treatment of partial defs. For the testcase in the PR, we have
the following insn in bb 3:
i.e. it models partial defs as a RMW operation; thus for the def arising
from i10 above, it records a use of r104; hence it ends up in the
live-in set for bb 3.
However, as it stands, the code in dce.cc:fast_dce (and its callee
dce_process_block) has no such provision for DF_REF_PARTIAL defs. It
does not treat these as a RMW and does not compute r104 above as being
live-in to bb 3. At the end of dce_process_block we compute the
following "did something happen" condition used to decide termination of
the analysis:
because of the disagreement between df_lr_local_compute and the local
analysis done by fast_dce, we invariably have r104 in DF_LR_IN, but not
in local_live. Hence we always return true here, call
df_analyze_problem (which re-computes DF_LR_IN according to
df_lr_bb_local_compute, re-adding r104), and so the analysis never
terminates.
This patch therefore adjusts df_simulate_defs (called from
dce_process_block) to match the behaviour of df_lr_bb_local_compute in
this respect, namely we make it model partial defs as RMW operations by
setting the relevant register live. This fixes the spinning in fast_dce
for this testcase.
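As a rough model of the adjusted behaviour, the backward simulation step for one def might look like the sketch below. The Def struct, the 64-bit mask standing in for the live bitmap, and the function name are assumptions for illustration, not GCC's data structures.

```cpp
#include <cstdint>

// Hypothetical description of a def: which register it writes and
// whether it writes only part of it. Register numbers must be < 64 here.
struct Def { unsigned reg; bool partial; };

// One backward step of liveness simulation over a def.
// A complete def kills the register. A partial def is modelled as a
// read-modify-write, so the register is marked live instead.
constexpr std::uint64_t simulate_def(std::uint64_t live, Def d)
{
  const std::uint64_t bit = std::uint64_t{1} << d.reg;
  return d.partial ? (live | bit) : (live & ~bit);
}
```

With both analyses treating partial defs this way, the local live set and DF_LR_IN agree on registers such as r104, so the "did something happen" condition can become false.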
gcc/ChangeLog:
PR rtl-optimization/116564
* df-problems.cc (df_simulate_defs): For partial defs, mark the
register live (treat it as a RMW operation).
gcc/testsuite/ChangeLog:
PR rtl-optimization/116564
* gcc.target/aarch64/torture/pr116564.c: New test.
- Import phobos v2.111.0-beta.1.
- Added `bitCast' function to `std.conv'.
- Added `readfln' and `File.readfln' functions to `std.stdio'.
- New procedural API for `std.sumtype'.
Richard Earnshaw [Mon, 10 Mar 2025 14:12:38 +0000 (14:12 +0000)]
arm: allow type-punning subregs in vpr_register_operand [PR115439]
Subregs that only change the mode of an operand (i.e. don't change the
size) should be safe for the VPR register. If we don't permit them
we may end up with some redundant copy instructions.
Fortran: Add F2018 TEAM_NUMBER to coindexed expressions [PR98903]
Add missing parsing and code generation for a[..., TEAM_NUMBER=...] as
defined from F2015 onwards. Because F2015 is not used as a dedicated
standard in GFortran, add it to the F2018 standard feature set.
PR fortran/98903
gcc/fortran/ChangeLog:
* array.cc (gfc_copy_array_ref): Copy team, team_type and stat.
(match_team_or_stat): Match a single team(_number)= or stat=.
(gfc_match_array_ref): Add switching to image_selector_parsing
and error handling when indices come after named arguments.
* coarray.cc (move_coarray_ref): Move also team_type.
* expr.cc (gfc_free_ref_list): Free team and stat expression.
(gfc_find_team_co): Find team or team_number in array-ref.
* gfortran.h (enum gfc_array_ref_team_type): New enum to
distinguish unset, team or team_number expression.
(gfc_find_team_co): Default searching to team= expressions.
* resolve.cc (resolve_array_ref): Check for type correctness of
team(_number) and stats in coindices.
* trans-array.cc (gfc_conv_array_ref): Ensure stat is cleared
when fcoarray=single is used.
* trans-intrinsic.cc (conv_stat_and_team): Include team_number
in conversion.
(gfc_conv_intrinsic_caf_get): Propagate team_number to ABI
routine.
(conv_caf_send_to_remote): Same.
(conv_caf_sendget): Same.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/coindexed_2.f90: New test.
* gfortran.dg/coarray/coindexed_3.f08: New test.
* gfortran.dg/coarray/coindexed_4.f08: New test.
Tomasz Kamiński [Tue, 11 Mar 2025 10:59:36 +0000 (11:59 +0100)]
libstdc++: Correct preprocessing checks for floatX_t and bfloat_16 formatting
Floating-point types _Float16, _Float32, _Float64, and bfloat16
can be formatted only if std::to_chars overloads for those types
are provided. Currently this is only the case for architectures
where float and double are 32-bit and 64-bit IEEE floating-point types.
This patch updates the preprocessing checks for the formatters
of the above types to check _GLIBCXX_FLOAT_IS_IEEE_BINARY32
and _GLIBCXX_DOUBLE_IS_IEEE_BINARY64, making them non-formattable
on non-IEEE architectures.
This also removes potential UB, where we could produce a basic_format_arg
with _M_type set to _Arg_fp32 or _Arg_fp64 that was later not
handled by `_M_visit`.
libstdc++-v3/ChangeLog:
* include/std/format (formatter<_Float16, _CharT>): Define only if
_GLIBCXX_FLOAT_IS_IEEE_BINARY32 macro is defined.
(formatter<_Float32, _CharT>): As above.
(formatter<__gnu_cxx::__bfloat16_t, _CharT>): As above.
(formatter<_Float64, _CharT>): Define only if
_GLIBCXX_DOUBLE_IS_IEEE_BINARY64 is defined.
(basic_format_arg::_S_to_arg_type): Normalize _Float32 and _Float64
only to float and double respectively.
(basic_format_arg::_S_to_enum): Remove handling of _Float32 and _Float64.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Mark Wielaard [Wed, 12 Mar 2025 11:29:24 +0000 (12:29 +0100)]
Regenerate cobol/lang.opt.urls
With the COBOL: Frontend (commit 3c5ed996a) came a lang.opt.urls,
which is different from what regenerate-opt-urls.py generates. Make
the CI bot happy by regenerating it.
Longer term, the COBOL docs need to be sorted out (see e.g. PR119227)
and then perhaps regenerate-opt-urls.py adjusted so that it can deal
with the COBOL docs.
Simon Martin [Wed, 12 Mar 2025 08:09:35 +0000 (09:09 +0100)]
cobol: Remove unnecessary CPPFLAGS update and restore MacOS build
The build currently fails on MacOS even when the Cobol front-end and
libgcobol builds are disabled.
The problem is that gcc/cobol/Make-lang.in adds -Iinclude to CPPFLAGS,
which somehow makes clang unhappy about the include order:
error: <cstddef> tried including <stddef.h> but didn't find libc++'s
<stddef.h> header. This usually means that your header search paths
are not configured properly.
It turns out that this addition is unnecessary: simply removing it fixes
the build on MacOS, without impacting the build on x86_64-pc-linux-gnu when
configured with --enable-languages=default,cobol.
It feels like there might be more cleanup opportunities there, but they
can be taken care of later.
Jonathan Wakely [Tue, 11 Mar 2025 17:29:01 +0000 (17:29 +0000)]
libstdc++: Prevent dangling references in std::unique_ptr::operator*
LWG 4148 (approved in Wrocław, November 2024) makes it ill-formed to
dereference a std::unique_ptr if that would return a dangling reference.
That can happen with a custom pointer type and a const-qualified
element_type, such that std::add_lvalue_reference_t<element_type> is a
reference-to-const that could bind to a short-lived temporary.
In C++26 the compiler diagnoses this as an error anyway:
bits/unique_ptr.h:457:16: error: returning reference to temporary [-Wreturn-local-addr]
But that can be disabled with -Wno-return-local-addr so the
static_assert ensures it is enforced consistently.
libstdc++-v3/ChangeLog:
* include/bits/unique_ptr.h (unique_ptr::operator*): Add
static_assert to check for dangling reference, as per LWG 4148.
* testsuite/20_util/unique_ptr/lwg4148.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 11 Mar 2025 15:47:21 +0000 (15:47 +0000)]
libstdc++: Make range adaptor __has_arrow helper use a const type
LWG 4112 (approved in Wrocław, November 2024) changes the has-arrow
helper to require operator-> to be valid on a const-qualified lvalue.
This affects the constraints for filter_view::_Iterator::operator-> and
join_view::_Iterator::operator-> so that they can only be used if the
underlying iterator supports operator-> on const.
The change also adds semantic (i.e. not checkable and not enforced)
requirements that operator-> must have the same semantics whether called
on a const or non-const value, and on an lvalue or rvalue (due to the
implicit expression variation rules in [concepts.equality]).
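The effect of requiring operator-> on a const-qualified lvalue can be sketched with a C++17 detection idiom. This is not the library's __has_arrow concept: it omits the input_iterator and pointer cases, and all names below are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Detects whether operator->() can be called on a const lvalue of type T.
template<typename T, typename = void>
struct has_const_arrow : std::false_type {};

template<typename T>
struct has_const_arrow<T,
    std::void_t<decltype(std::declval<const T&>().operator->())>>
  : std::true_type {};

struct Value { int n; };

// Hypothetical iterator-like types, for illustration only.
struct ConstArrow    { Value* p; Value* operator->() const { return p; } };
struct NonConstArrow { Value* p; Value* operator->()       { return p; } };
```

Here has_const_arrow<ConstArrow>::value is true while has_const_arrow<NonConstArrow>::value is false, mirroring how an iterator with only a non-const operator-> no longer satisfies the helper.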
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (ranges::_detail::__has_arrow):
Require operator->() to be valid on const-qualified type, as per
LWG 4112.
* testsuite/std/ranges/adaptors/lwg4112.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
LWG 4142 (approved in Wrocław, November 2024) made it ill-formed to call
basic_format_parse_context::check_dynamic_spec with an empty template
argument list.
This adds a static_assert to enforce that, and adjusts the tests.
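A check of this kind could look roughly like the sketch below. The function name and the message text are assumptions for illustration and are not taken from the libstdc++ sources.

```cpp
#include <cstddef>

// Rejects an empty template argument list at compile time and otherwise
// returns the number of types in the pack.
template<typename... Ts>
constexpr std::size_t checked_pack_size()
{
  static_assert(sizeof...(Ts) >= 1,
                "at least one template argument is required");
  return sizeof...(Ts);
}
```

Calling checked_pack_size<>() fails to compile with the message above, while any non-empty argument list is accepted.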
libstdc++-v3/ChangeLog:
* include/std/format
(basic_format_parse_context::check_dynamic_spec): Require a
non-empty parameter pack, as per LWG 4142.
* testsuite/std/format/parse_ctx.cc: Remove call of
check_dynamic_spec with empty template argument list.
* testsuite/std/format/parse_ctx_neg.cc: Add dg-error to call of
check_dynamic_spec with empty template argument list.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
aarch64: Make latency account for synthetic VEC_PERM_EXPRs [PR116901]
Another problem in pr110625_[24].c was that the latency calculations
were ignoring VEC_PERM_EXPRs that had no associated stmt_vec_info.
Such VEC_PERM_EXPRs are common and expected for SLP these days.
After this change, the number of general ops in the testcases seems
to be accurate apart from one remaining detail: we assume that the
extension in a permuted extending load is free, even though the
extension happens after the permutation. Fixing that would require
more information from the vectoriser and so isn't GCC 15 material.
It also should cease to be a problem if we do end up moving the
permutation to its own node, rather than keeping it as part of
the load.
gcc/
PR target/116901
* config/aarch64/aarch64.cc (aarch64_vector_costs::count_ops): Allow
stmt_info to be null.
(aarch64_vector_costs::add_stmt_cost): Call count_ops even if
stmt_info is null.
vect: Fix ncopies when costing SLP reductions [PR116901]
pr110625_[24].c started failing after r15-1329-gd66b820f392aa9a7,
which switched to single def-use cycles for single-lane SLP.
The problem is that we only costed one vector accumulator
operation for an N-vector cycle.
The problem seems to have been latent, and meant that we also
only costed one FADDA for reduc_strict_4.c and reduc_strict_5.c,
even though they need 4 and 6 FADDAs respectively.
I'm not sure why:
if ((double_reduc || reduction_type != TREE_CODE_REDUCTION)
&& ncopies > 1)
was previously only necessary for non-SLP, but the patch preserves
that for safety.
gcc/
PR tree-optimization/116901
* tree-vect-loop.cc (vectorizable_reduction): Set ncopies to
SLP_TREE_NUMBER_OF_VEC_STMTS for SLP.
Before r14-2877-gbf67bf4880ce5be0, the aarch64 code assumed that
every multi-vector reduction would use single def-use cycles.
The patch fixed it to test what the vectoriser actually planned
to do, using newly provided information.
At the time, we didn't try to use single def-use cycles for any costed
variant in the associated testcase (gcc.target/aarch64/pr110625_1.c),
so it was enough to check that the single-def-use latency was never
printed to the dump file. However, we do now consider using single
def-use cycles for the single-lane SLP fallback.
This patch therefore switches to a positive test of the
non-single-def-use latency. I checked that the test still failed
in this form before r14-2877-gbf67bf4880ce5be0.
gcc/testsuite/
* gcc.target/aarch64/pr110625_1.c: Turn into a positive test for
a vector latency of 2, rather than a negative test for a vector
latency of 8.
Richard Biener [Tue, 11 Mar 2025 08:39:06 +0000 (09:39 +0100)]
Simple cobol.dg testsuite
The following adds a simple cobol.dg test harness, based on gfortran.dg.
It's invoked by make check-cobol, has three tests, two execution tests and
one test exercising dg-error. The existing FAIL is due to an assembling
error, tracked by PR119214.
Jakub Jelinek [Wed, 12 Mar 2025 07:27:17 +0000 (08:27 +0100)]
builtins: Fix up strspn/strcspn folding [PR119219]
The PR119204 r15-7955 fix caused some regressions.
The problem is that the fold_builtin* APIs document that expr is
either a CALL_EXPR of the call or NULL, so using TREE_TYPE (expr)
can crash e.g. during constexpr evaluation etc.
As can be seen in the surrounding patch, for the neighbouring builtins
(both modf and strpbrk) fold_builtin_2 passes down type, which is the
result type, TREE_TYPE (TREE_TYPE (fndecl)) and those builtins use it
to build the return value, while strspn was always building size_type_node
and strcspn had this change from that to TREE_TYPE (expr).
The patch passes type to these two and uses it there as well.
The patch keeps passing expr because it is used in the
check_nul_terminated_array calls done for both strspn and strcspn,
those calls clearly can deal with NULL expr but prefer if it is non-NULL
for some warning.
2025-03-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/119204
PR middle-end/119219
* builtins.cc (fold_builtin_2): Pass type as another argument
to fold_builtin_strspn and fold_builtin_strcspn.
(fold_builtin_strspn): Add type argument, use it instead of
size_type_node.
(fold_builtin_strcspn): Add type argument, use it instead of
TREE_TYPE (expr).
Jakub Jelinek [Wed, 12 Mar 2025 07:01:09 +0000 (08:01 +0100)]
c++: Handle RAW_DATA_CST in modules.cc [PR119076]
The following testcases (one with #embed, one with large initializer
turned into RAW_DATA_CST) show that I forgot to handle RAW_DATA_CST in
module streaming.
Similar to the PCH case we need to stream out RAW_DATA_CST with NULL
RAW_DATA_OWNER (i.e. a tree which has data owned by libcpp buffer) so
that it will be streamed back in as STRING_CST which owns the data,
but because the data can be really large (hopefully not so much for
header modules though), without actually trying to build a STRING_CST
on the module writing side because that would mean another large
allocation and copying of the large data.
RAW_DATA_CST with RAW_DATA_OWNER then needs to be streamed out and in
by streaming the owner and offset from owner's data and length.
* g++.dg/modules/pr119076-1_a.H: New test.
* g++.dg/modules/pr119076-1_b.C: New test.
* g++.dg/modules/pr119076-2_a.H: New test.
* g++.dg/modules/pr119076-2_b.C: New test.
Jakub Jelinek [Wed, 12 Mar 2025 06:46:25 +0000 (07:46 +0100)]
preprocessor: Fix up diagnostic typo in convert_oct [PR119202]
In r15-4286 I've introduced a typo, part of the change was
- cpp_error (pfile, CPP_DL_ERROR, "'\\o' not followed by '{'");
+ cpp_error (pfile, CPP_DL_ERROR, "%<\\o%> not followed by %<}%>");
which turned { into }. This patch fixes it back.
2025-03-12 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/119202
* charset.cc (convert_oct): Fix up typo in diagnostics about \o
not followed by {.
My/Kees' earlier patches adjusted -Wunterminated-string-initialization
warning so that it doesn't warn about initializers of nonstring decls
and that nonstring attribute is allowed on multi-dimensional arrays.
Unfortunately as this testcase shows, we still warn about initializers
of multi-dimensional array nonstring decls.
The problem is that in that case field passed to output_init_element
is actually INTEGER_CST, index into the array.
For RECORD_OR_UNION_TYPE_P (constructor_type) field is a FIELD_DECL
which we want to use, but otherwise (in arrays) IMHO we want to use
constructor_fields (which is the innermost FIELD_DECL whose part
is being initialized), or - if that is NULL - constructor_decl, the
whole decl being initialized with multi-dimensional array type.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR c/117178
* c-typeck.cc (output_init_element): Pass field to digest_init
only for record/union types, otherwise pass constructor_fields
if non-NULL and constructor_decl if constructor_fields is NULL.
* gcc.dg/Wunterminated-string-initialization-2.c: New test.
Andrew Pinski [Tue, 11 Mar 2025 06:10:01 +0000 (23:10 -0700)]
aarch64: Fix DFP constants [PR119131]
After r15-6660-g45d306a835cb3f865, in some cases
DFP constants would cause an ICE. This is due to
a mismatch of a few things. The predicate of the move
uses aarch64_valid_fp_move to say if the constant is valid or not.
But after reload/LRA, when can_create_pseudo_p returns false, aarch64_valid_fp_move
would return false for constants that were valid for the constraints
of the instruction. A predicate that is stricter than the constraint is wrong.
In this case `Uvi` is the constraint, and while aarch64_valid_fp_move allows it
via aarch64_can_const_movi_rtx_p for !DECIMAL_FLOAT_MODE_P, there is no such check
for DECIMAL_FLOAT_MODE_P.
The fix is to remove the !DECIMAL_FLOAT_MODE_P check in aarch64_valid_fp_move
and in the define_expand, so that the predicate now allows a superset of what is
allowed by the constraints.
aarch64_float_const_representable_p should be rejecting DFP modes as they can't be used
with instructions like `mov s0, 1.0`.
Changes since v1:
* v2: Add check to aarch64_float_const_representable_p for DFP.
Built and tested on aarch64-linux-gnu with no regressions.
Jason Merrill [Mon, 10 Mar 2025 18:10:52 +0000 (14:10 -0400)]
c++: constexpr caching deleted pointer [PR119162]
In this testcase, we pass the checks for mismatched new/delete because the
pointer is deleted before it is returned. And then a subsequent evaluation
uses the cached value, but the deleted heap var isn't in
ctx->global->heap_vars anymore, so cxx_eval_outermost_constant_expr doesn't
run find_heap_var_refs, and ends up with garbage.
Fixed by not caching a reference to deleted.
I considered rejecting such a reference immediately as non-constant, but I
don't think that's valid; an invalid pointer value isn't UB until we try to
do something with it or it winds up in the final result of constant
evaluation.
I also considered not caching other heap references (i.e. using
find_heap_var_refs instead of adding find_deleted_heap_var), which would
include heap pointers passed in from the caller, but those don't have the
same heap_vars problem. We might want cxx_eval_outermost_constant_expr to
prune constexpr_call entries that refer to objects created during the
evaluation, but that applies to local variables and temporaries just as much
as heap "variables".
PR c++/119162
gcc/cp/ChangeLog:
* constexpr.cc (find_deleted_heap_var): New.
(cxx_eval_call_expression): Don't cache a
reference to heap_deleted.
Sandra Loosemore [Tue, 11 Mar 2025 16:36:22 +0000 (16:36 +0000)]
OpenMP/C: Store location in cp_parser_omp_var_list for kind=0 [PR118579]
This patch is the C equivalent of commit r15-6512-gcf94ba812ca496 for C++,
to improve the location information for individual items in an OpenMP
variable list.
gcc/c/ChangeLog
PR c/118579
* c-parser.cc (c_parser_omp_variable_list): Capture location
information when KIND is OMP_CLAUSE_ERROR.
(c_parser_oacc_data_clause_deviceptr): Use the improved location
for diagnostics, and remove the FIXME.
(c_finish_omp_declare_variant): Likewise.
(c_parser_omp_threadprivate): Likewise.
gcc/testsuite/ChangeLog
PR c/118579
* c-c++-common/gomp/pr118579.c: New testcase.
Jonathan Wakely [Thu, 6 Mar 2025 20:28:07 +0000 (20:28 +0000)]
contrib: Clean up outdated parts of gcc-git-customization.sh
It's very unlikely that anybody is still using the old remotes/$user Git
repo setup and still needs this script to be able to migrate it to the
remotes/users/$user structure. Simplify the script by removing those
parts.
This fixes an error that gets displayed in some circumstances:
fatal: no such section: remote.me
contrib/ChangeLog:
* gcc-git-customization.sh: Delete outdated commands for
migrating from very old git setups.
Iain Buclaw [Tue, 11 Mar 2025 16:56:18 +0000 (17:56 +0100)]
d: Fix regression returning from function with invariants [PR119139]
An optimization was added in GDC-12 which sets the TREE_READONLY flag on
all local variables with the storage class `const' assigned. For some
reason, const is also being added by the front-end to `__result'
variables in non-virtual functions, which ends up getting wrong code by
the gimplify pass promoting the local to static storage.
A bug has been raised upstream, as this looks like an error in the AST.
For now, turn off setting TREE_READONLY on all result variables.
PR d/119139
gcc/d/ChangeLog:
* decl.cc (get_symbol_decl): Don't set TREE_READONLY for __result
declarations.
Harald Anlauf [Mon, 10 Mar 2025 21:24:27 +0000 (22:24 +0100)]
Fortran: reject SAVE of a COMMON in a BLOCK construct [PR119199]
PR fortran/119199
gcc/fortran/ChangeLog:
* decl.cc (gfc_match_save): Reject SAVE statement of a COMMON block
when in a BLOCK construct.
* trans-common.cc (translate_common): Avoid NULL pointer dereference.
gcc/testsuite/ChangeLog:
* gfortran.dg/common_30.f90: New test.
* gfortran.dg/common_31.f90: New test.
gcc.target/aarch64/sve/pred-not-gen-[14].c started failing after
r15-268-g9dbff9c05520a74e, but we didn't look at it in time for
GCC 15. This patch marks the failures as expected for now.
We should revisit for GCC 16.
See the PR for some discussion about what a GCC 16 fix might
look like.
Thomas Koenig [Tue, 11 Mar 2025 16:40:57 +0000 (17:40 +0100)]
Abstract interfaces and dummy arguments are not global.
The attached patch makes sure that procedures from abstract
interfaces and dummy arguments are not put into the global
symbol table, and are not checked against global symbols.
gcc/fortran/ChangeLog:
PR fortran/119078
* frontend-passes.cc (check_against_globals): Do not check
for abstract interfaces or dummy arguments.
* resolve.cc (gfc_verify_binding_labels): Adjust comment.
Do not put abstract interfaces or dummy argument into global
namespace.
gcc/testsuite/ChangeLog:
PR fortran/119078
* gfortran.dg/interface_58.f90: New test.
For many functions in tbz_2.c, it doesn't matter whether the code
tests a 32-bit or a 64-bit register. g6-g8 have started testing
32-bit registers, but the others could in future too.
gcc/testsuite/
* gcc.target/aarch64/tbz_2.c: Accept both 32-bit and 64-bit registers.
Juergen Christ [Mon, 10 Mar 2025 09:03:36 +0000 (10:03 +0100)]
s390: fix delegitimization of addresses
In legitimize_pic_address we create a
(const (unspec ... UNSPEC_GOTENT))
if the GOT offset might be >= 4k. However, the
s390_delegitimize_address does not contain a case for this scenario.
Jakub Jelinek [Tue, 11 Mar 2025 13:34:01 +0000 (14:34 +0100)]
cobol: Fix up libgcobol configure [PR119216]
Sorry, seems I've screwed up the earlier libgcobol/configure.tgt change.
Looking in more detail, the way e.g. libsanitizer/configure.tgt works is
that it is sourced twice, once at toplevel and there it just sets
UNSUPPORTED=1 for fully unsupported triplets, and then inside of
libsanitizer/configure where it decides to include or not include the
various sublibraries depending on the *_SUPPORTED flags.
So, the following patch attempts to do the same for libgcobol as well.
The BUILD_LIBGCOBOL automake conditional was unused, this patch guards it
on LIBGCOBOL_SUPPORTED as well and guards with it
toolexeclib_LTLIBRARIES = libgcobol.la
Also, AM_CFLAGS has been changed to AM_CXXFLAGS as there are just C++
sources in the library.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR cobol/119216
* configure.ac: Check for UNSUPPORTED set by libgcobol/configure.tgt
rather than LIBGCOBOL_SUPPORTED.
* configure: Regenerate.
libgcobol/
* configure.tgt: On fully unsupported targets set UNSUPPORTED=1.
* configure.ac: Add AC_CHECK_SIZEOF([void *]), source in
configure.tgt and set BUILD_LIBGCOBOL also based on
LIBGCOBOL_SUPPORTED.
* Makefile.am (toolexeclib_LTLIBRARIES): Conditionalize on
BUILD_LIBGCOBOL.
(AM_CFLAGS): Rename to ...
(AM_CXXFLAGS): ... this.
(%.lo: %.cc): Use $(AM_CXXFLAGS) rather than $(AM_CFLAGS).
* configure: Regenerate.
* Makefile.in: Regenerate.
Jakub Jelinek [Tue, 11 Mar 2025 13:25:19 +0000 (14:25 +0100)]
cobol: libgcobol/Makefile.am cleanups
Looking at libgcobol.la, I see a lot of cruft, stuff that just shouldn't
be there because automake generates it correctly on its own, but also stuff
using undefined variables etc.
libgcobol.{a,so*} seems to build and install the same as before.
Note, I still see DT_RUNPATH in the installed libgcobol.so.1 before/after
this patch and I'd prefer not to see it, not seeing it in other libraries
like libstdc++.so.6 etc. Dunno if that is because of the dependency on
libstdc++ (but e.g. libstdc++ has dependency on libgcc_s and doesn't do
that).
H.J. Lu [Sun, 9 Mar 2025 14:00:23 +0000 (07:00 -0700)]
i386: Verify that argument registers are spilled properly
While working on a local x86 patch, which passed the GCC testsuite, I got
a compiler error:
In function ‘paravirt_read_msr’,
inlined from ‘perf_ibs_handle_irq’ at arch/x86/events/amd/ibs.c:1055:2:
./arch/x86/include/asm/paravirt_types.h:397:17: error: ‘asm’ operand has impossible constraints or there are not enough registers
397 | asm volatile(ALTERNATIVE(PARAVIRT_CALL, ALT_CALL_INSTR, \
| ^~~
when building x86-64 Linux kernel. RDI, RSI, RDX and RCX registers are
used to pass arguments in 64-bit mode. EAX, EDX and ECX registers are
used to pass arguments in 32-bit mode. But there is no coverage in the
GCC testsuite. Add tests to verify that argument registers are spilled
properly.
PR target/119171
* gcc.target/i386/pr119171-1.c: New test.
* gcc.target/i386/pr119171-2.c: Likewise.
Richard Earnshaw [Tue, 11 Mar 2025 10:48:54 +0000 (10:48 +0000)]
arm: testsuite: fix arm_neon_h checks with conflicting cpu/arch
GCC will complain if the -mcpu flag specifies a different architecture
to that specified in -march, but if the floating-point ABI is "soft",
then differences in the floating-point architecture features are
ignored.
However, the arm_libc_fp_abi checks whether we change the FP ABI by
adding -mfloat-abi=hard/softfp to override the defaults. If that
fails it won't add anything.
Unfortunately arm_neon_h_ok wasn't correctly checking whether the libc
check had worked and just assumed that it would always add something
to enable FP. That's insufficient and we need to consider this failure.
We simply mark tests as unsupported in this case.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_arm_neon_h_ok_nocache): Return zero if
check_effective_target_arm_libc_fp_abi_ok reports failure.
Iain Sandoe [Tue, 11 Mar 2025 09:56:18 +0000 (09:56 +0000)]
configure, Darwin: Require explicit selection of COBOL.
By default, Darwin does not have sufficient tools to build COBOL
so we do not want to include it in --enable-languages=all since
this will break regular testing of all supported languages.
However, we do want to be able to build it on demand (where the
build system has sufficiently new tools) and so do not want to
disable it permanently.
ChangeLog:
* configure: Regenerate.
* configure.ac: Do not build COBOL on Darwin by default,
even for --enable-languages=all.
Jakub Jelinek [Tue, 11 Mar 2025 10:08:27 +0000 (11:08 +0100)]
cobol: Fix --enable-link-serialization build
--enable-link-serialization relies on each FE participating properly,
setting <lang>.serial, depending on $(<lang>.prev) and printing progress.
The configure option is mainly for LTO bootstraps when we don't want to link
all the FEs at once because that can consume too much memory.
The comment changes are unrelated, just something I've spotted while
working on this. .exe is a Windows suffix, so either we shouldn't
talk about suffixes in the comments or use there $(exeext) as well
to make it clear that it is dependent on the host/build.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in: Remove .exe extension from comments.
(cobol.serial): Set to cobol1$(exeext).
(cobol1$(exeext)): Depend on $(cobol.prev). Add
LINK_PROGRESS calls before/after the link command.
Jakub Jelinek [Tue, 11 Mar 2025 10:07:15 +0000 (11:07 +0100)]
cobol: Use *.cc suffix for bison/flex generated C++ files
In GCC 12 we've switched to using *.cc suffixes for C++ sources in GCC
sources, including generated files, instead of using *.c suffixes and
compiling them as C++ anyway (that was the case since we've switched
GCC to C++ in GCC 4.8).
I've noticed gcc/cobol has 3 generated files still with c extension
despite clearly having C++ code in it and being compiled as C++.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in (cobol/parse.c, cobol/cdf.c, cobol/scan.c): Remove.
(cobol/parse.cc, cobol/cdf.cc, cobol/scan.cc): New goals.
(cobol/cdf.o): Depend on cobol/cdf.cc rather than cobol/cdf.c.
(cobol/parse.o): Depend on cobol/parse.cc rather than cobol/parse.c.
(cobol/scan.o): Depend on cobol/scan.cc rather than cobol/scan.c,
on cobol/cdf.cc rather than cobol/cdf.c and on cobol/parse.cc rather
than cobol/parse.c.
(cobol.srcextra): Depend on cobol/parse.cc cobol/cdf.cc cobol/scan.cc
rather than cobol/parse.c cobol/cdf.c cobol/scan.c.
Jakub Jelinek [Tue, 11 Mar 2025 10:05:13 +0000 (11:05 +0100)]
Make libgcobol/configure.tgt more similar to other libraries
When we know libgcobol is unsupported on 32-bit arches, we should just say
so in configure.tgt, the same way as on other targets.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* configure.tgt: Only set LIBGCOBOL_SUPPORTED for lp64
multilibs of powerpc64le-*-linux* and x86_64-*-linux*. Handle
i?86-*-linux* the same as x86_64-*-linux*.
Jakub Jelinek [Tue, 11 Mar 2025 10:01:55 +0000 (11:01 +0100)]
tree: Improve skip_simple_arithmetic [PR119183]
The following testcase takes a very long time to compile, because
skip_simple_arithmetic decides to first call tree_invariant_p on
the second argument (and indirectly recurse there). I think before
canonicalization of operands for commutative binary expressions
(and for non-commutative ones always) it is pretty common that the
first operand is a constant, something which tree_invariant_p handles
immediately, so the following patch special cases that; I've added
there a tree_invariant_p call too after the checks, while it is not
really needed currently, tree_invariant_p has the same checks, I wanted
to be prepared in case tree_invariant_p changes. But if you think
I should avoid it, I can drop it too.
This is just a partial fix, I think one can certainly construct a testcase
which will still have horrible compile time complexity (but I've tried and
haven't managed to do so), so perhaps we should just limit the recursion
depth through skip_simple_arithmetic/tree_invariant_p with some defaulted
argument.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR c/119183
* tree.cc (skip_simple_arithmetic): If first operand of binary
expr is TREE_CONSTANT or TREE_READONLY with no side-effects, call
tree_invariant_p on that operand first instead of on the second.
Jakub Jelinek [Tue, 11 Mar 2025 09:57:30 +0000 (10:57 +0100)]
complex: Don't DCE unused COMPLEX_EXPRs for -O0 [PR119190]
The PR116463 r15-3128 change regressed the following testcase at -O0.
While for -O1+ we can do -fvar-tracking-assignments, for -O0 we don't
(partly because it is compile time expensive and partly because at -O0
most of the vars live most of their lifetime in memory slots), so if we
DCE some statements, it can mean that DW_AT_location for some vars won't
be available or even it won't be possible to put a breakpoint at some
particular line in the source.
We normally perform dce just in the subpasses of
pass_local_optimization_passes or pass_all_optimizations or
pass_all_optimizations_g, so don't do that at all for -O0. So the complex
change is an exception. And it was described as a way to help forwprop and
reassoc, neither applies to -O0.
This regresses PR119120 again though, I'll post a patch for that momentarily.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR debug/119190
* tree-complex.cc (update_complex_assignment, tree_lower_complex):
Perform simple dce on dce_worklist only if optimize.
Deprecate support for the ESA/390 architecture which will be eventually
removed, and encourage the usage of the z/Architecture instead.
Furthermore, default for -m31 to -mzarch whereas previously we defaulted
to -mesa.
gcc/ChangeLog:
* config.gcc: Fail in case of option --with-mode=esa.
* config/s390/s390.cc (s390_option_override_internal): Default
to z/Architecture mode.
* config/s390/s390.h (DRIVER_SELF_SPECS): Ditto.
* config/s390/s390.opt: Emit a warning for option -mesa.
* doc/invoke.texi: Document the change.
Currently insn_cost() only considers the source part of a SET.
Implement TARGET_INSN_COST in order to also take the destination into
account. This may make a difference in case of a MEM where the address
is a SYMBOL_REF.
James K. Lowden [Thu, 6 Mar 2025 21:25:09 +0000 (16:25 -0500)]
COBOL: Frontend
gcc/cobol/
* LICENSE: New file.
* Make-lang.in: New file.
* config-lang.in: New file.
* lang.opt: New file.
* lang.opt.urls: New file.
* cbldiag.h: New file.
* cdfval.h: New file.
* cobol-system.h: New file.
* copybook.h: New file.
* dts.h: New file.
* exceptg.h: New file.
* gengen.h: New file.
* genmath.h: New file.
* genutil.h: New file.
* inspect.h: New file.
* lang-specs.h: New file.
* lexio.h: New file.
* parse_ante.h: New file.
* parse_util.h: New file.
* scan_ante.h: New file.
* scan_post.h: New file.
* show_parse.h: New file.
* structs.h: New file.
* symbols.h: New file.
* token_names.h: New file.
* util.h: New file.
* cdf-copy.cc: New file.
* lexio.cc: New file.
* scan.l: New file.
* parse.y: New file.
* genapi.cc: New file.
* genapi.h: New file.
* gengen.cc: New file.
* genmath.cc: New file.
* genutil.cc: New file.
* cdf.y: New file.
* cobol1.cc: New file.
* convert.cc: New file.
* except.cc: New file.
* gcobolspec.cc: New file.
* structs.cc: New file.
* symbols.cc: New file.
* symfind.cc: New file.
* util.cc: New file.
* gcobc: New file.
* gcobol.1: New file.
* gcobol.3: New file.
* help.gen: New file.
* udf/stored-char-length.cbl: New file.
James K. Lowden [Mon, 10 Mar 2025 15:08:42 +0000 (16:08 +0100)]
COBOL: libgcobol
libgcobol/
* Makefile.am: New file.
* Makefile.in: Autogenerate.
* acinclude.m4: Likewise.
* aclocal.m4: Likewise.
* configure.ac: New file.
* configure: Autogenerate.
* configure.tgt: New file.
* README: New file.
* charmaps.cc: New file.
* config.h.in: New file.
* constants.cc: New file.
* gfileio.cc: New file.
* gmath.cc: New file.
* io.cc: New file.
* valconv.cc: New file.
* charmaps.h: New file.
* common-defs.h: New file.
* ec.h: New file.
* exceptl.h: New file.
* gcobolio.h: New file.
* gfileio.h: New file.
* gmath.h: New file.
* io.h: New file.
* libgcobol.h: New file.
* valconv.h: New file.
* libgcobol.cc: New file.
* intrinsic.cc: New file.
aarch64: Avoid unnecessary use of 2-input TBLs [PR115258]
When using TBL for (say) a V4SI permutation, the aarch64 port first
asks target-independent code to lower to a V16QI permutation.
Then, during code generation, an input like:
But subregs (unlike regs) are not shared, so the op0 == op1 check
always failed for this case. We'd then force each subreg into a
fresh register, meaning that during the later:
there is no way for aarch64_expand_vec_perm_1 to realise that
d->op0 and d->op1 are the same value. It would therefore generate
a two-input TBL in the testcase, even though a single-input TBL
is enough.
I'm not sure forcing subregs to a fresh register is a good idea --
it caused problems for copysign & co. -- but that's not something
to fiddle with during stage 4. Using op0 == op1 for rtx equality
is independently wrong, so we might as well just fix that for now.
The patch gets rid of extra MOVs that are a regression from GCC 14.
The testcase is based on one from Kugan, itself based on TSVC.
gcc/
PR target/115258
* config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Use
d.one_vector_p to decide whether op1 should be a copy of op0.
gcc/testsuite/
PR target/115258
* gcc.target/aarch64/pr115258_2.c: New test.
In PR test case IRA preferred to allocate hard reg to a pseudo instead
of its equivalence. This resulted in allocating caller-saved hard reg
and generating save/restore insns in the function prologue/epilogue.
The equivalence is an invariant (stack pointer plus offset) and the
pseudo is used mostly as memory address. This happened as there was
no simplification of insn after the invariant substitution. The patch
adds the necessary code.
gcc/ChangeLog:
PR target/114991
* ira-costs.cc (equiv_can_be_consumed_p): Add new argument invariant_p.
Add code for dealing with the invariant.
(calculate_equiv_gains): Don't consider init insns. Pass the new
argument to equiv_can_be_consumed_p. Don't treat invariant as
memory.
gcc/testsuite/ChangeLog:
PR target/114991
* gcc.target/aarch64/pr114991.c: New test.
Nathaniel Shead [Fri, 31 Jan 2025 12:53:35 +0000 (23:53 +1100)]
c++/modules: Handle exposures of TU-local types in uninstantiated member templates
Previously, 'is_tu_local_entity' wouldn't detect the exposure of the (in
practice) TU-local lambda in the following example, unless instantiated:
struct S {
template <typename>
static inline decltype([]{}) x = {};
};
This is for two reasons. Firstly, when traversing the TYPE_FIELDS of S
we only see the TEMPLATE_DECL, and never end up building a dependency on
its DECL_TEMPLATE_RESULT (due to not being instantiated). This patch
fixes this by stripping any templates before checking for unnamed types.
The second reason is that we currently assume all class-scope entities
are not TU-local. Despite this being unambiguous in the standard, this
is not actually true in our implementation just yet, due to issues with
mangling lambdas in some circumstances. Allowing these lambdas to be
exported can cause issues in importers with apparently conflicting
declarations, so this patch treats them as TU-local as well.
After these changes, we now get double diagnostics from the two ways
that we can see the above lambda being exposed, via 'S' (through
TYPE_FIELDS) or via 'S::x'. To work around this we hide diagnostics from
the first case, so we only get errors from 'S::x' which will be closer
to the point the offending lambda is declared.
gcc/cp/ChangeLog:
* module.cc (trees_out::has_tu_local_dep): Also look at the
TI_TEMPLATE if we don't find a dep for a decl.
(depset::hash::is_tu_local_entity): Handle unnamed template
types, treat lambdas specially.
(is_exposure_of_member_type): New function.
(depset::hash::add_dependency): Use it.
(depset::hash::finalize_dependencies): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-10.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Christophe Lyon [Wed, 15 Jan 2025 17:11:33 +0000 (17:11 +0000)]
arm: [MVE] Fix predicates for vec_cmp, vec_vcmpu and vcond_mask (PR 115439)
When compiling c-c++-common/vector-compare-3.c with
-march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto
(which enables MVE), we fail to match vcond_mask because operand 3 has
s_register_operand as predicate for a MVE_VPRED mode, but we try to
match:
(insn 26 25 27 2 (set (reg:V4SI 137)
(unspec:V4SI [
(reg:V4SI 144)
(reg:V4SI 145)
(subreg:V4BI (reg:HI 143) 0)
] VPSELQ_S)) "/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1
(nil))
The fix is to use the right predicate: vpr_register_operand.
The patch also fixes vec_cmp and vec_cmpu in the same way.
When testing with
-mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto
it fixes the ICEs in c-c++-common/vector-compare-3.c,
g++.dg/opt/pr79734.C, g++.dg/tree-ssa/pr111150.C and
gcc.dg/tree-ssa/pr111150.c
gcc/ChangeLog
PR target/115439
* config/arm/mve.md (vec_vcmp, vec_vcmpu, vcond_mask): Use
vpr_register_operand predicate for MVE_VPRED operands.
Fortran: Fix gimplification error for pointer remapping in forall [PR107143]
Enhance dependency checking for data pointers to check for same derived
type and not only for a type being a derived type. This prevents
generation of a descriptor for a function call, that is unsuitable in
forall's pointer assignment.
PR fortran/107143
gcc/fortran/ChangeLog:
* dependency.cc (check_data_pointer_types): Do not just compare
for derived type, but for same derived type.
Jakub Jelinek [Mon, 10 Mar 2025 09:34:00 +0000 (10:34 +0100)]
libgcc: Fix up unwind-dw2-btree.h [PR119151]
The following testcase shows a bug in unwind-dw2-btree.h.
In short, the header provides lock-free btree data structure (so no parent
link on nodes, both insertion and deletion are done in top-down walks
with some locking of just a few nodes at a time so that lookups can notice
concurrent modifications and retry, non-leaf (inner) nodes contain keys
which are initially the base address of the left-most leaf entry of the
following child (or all ones if there is none) minus one, insertion ensures
balancing of the tree to ensure [d/2, d] entries filled through aggressive
splitting if it sees a full tree while walking, deletion performs various
operations like merging neighbour trees, merging into parent or moving some
nodes from neighbour to the current one).
What differs from the textbook implementations is mostly that the leaf nodes
don't include just address as a key, but address range, address + size
(where we don't insert any ranges with zero size) and the lookups can be
performed for any address in the [address, address + size) range. The keys
on inner nodes are still just address-1, so the child covers all nodes
where addr <= key unless it is covered already in children to the left.
The user (static executables or JIT) should always ensure there is no
overlap in between any of the ranges.
In the testcase a bunch of insertions are done, always followed by one
removal, followed by one insertion of a range slightly different from the
removed one. E.g. in the first case [&code[0x50], &code[0x59]] range
is removed and then we insert [&code[0x4c], &code[0x53]] range instead.
This is valid, it doesn't overlap anything. But the problem is that some
non-leaf (inner) node used the &code[0x4f] key (after the 11 insertions
completely correctly). On removal, nothing adjusts the keys on the parent
nodes (it really can't in the top-down only walk, the keys could be many nodes
above it and unlike insertion, removal only knows the start address, doesn't
know the removed size and so will discover it only when reaching the leaf
node which contains it; plus even if it knew the address and size, it still
doesn't know what the second left-most leaf node will be (i.e. the one after
removal)). And on insertion, if nodes aren't split at a level, nothing
adjusts the inner keys either. If a range is inserted and is either fully
below key (keys are - 1, so having address + size - 1 being equal to key is
fine) or fully after key (i.e. address > key), it works just fine, but if
the key is in the middle of the range like in this case, &code[0x4f] is in the
middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine
(we only use size on the leaf nodes), and lookups of the addresses below
the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed).
The problem is with lookups after the key (i.e. [&code[0x50], &code[0x53]]),
the lookup looks for them in different children of the btree and doesn't
find an entry and returns NULL.
As users need to ensure non-overlapping entries at any time, the following
patch fixes it by adjusting keys during insertion where we know not just
the address but also size; if we find during the top-down walk a key
which is in the middle of the range being inserted, we simply increase the
key to be equal to address + size - 1 of the range being inserted.
There can't be any existing leaf nodes overlapping the range in correct
programs and the btree rebalancing done on deletion ensures we don't have
any empty nodes which would also cause problems.
The patch adjusts the keys in two spots, once for the current node being
walked (the last hunk in the header, with large comment trying to explain
it) and once during inner node splitting in a parent node if we'd otherwise
try to add that key in the middle of the range being inserted into the
parent node (in that case it would be missed in the last hunk).
The testcase covers both of those spots, so succeeds with GCC 12 (which
didn't have btrees) and fails with vanilla GCC trunk and also fails if
either the
if (fence < base + size - 1)
fence = iter->content.children[slot].separator = base + size - 1;
or
if (left_fence >= target && left_fence < target + size - 1)
left_fence = target + size - 1;
hunk is removed (of course, only with the current node sizes, i.e. up to
15 children of inner nodes and up to 10 entries in leaf nodes).
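The two adjustments quoted above can be pictured in isolation with a minimal sketch. The helper names adjust_fence and adjust_left_fence are hypothetical and are not the functions in unwind-dw2-btree.h; the sketch assumes plain integer addresses and an inclusive last address of base + size - 1, as in the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the first quoted hunk: while walking down
// during insertion of [base, base + size), raise a separator that would
// otherwise fall short of the last address of the inserted range.
inline std::uintptr_t adjust_fence(std::uintptr_t fence,
                                   std::uintptr_t base,
                                   std::uintptr_t size) {
    assert(size != 0);  // zero-sized ranges are never inserted
    const std::uintptr_t last = base + size - 1;
    if (fence < last)
        fence = last;
    return fence;
}

// Hypothetical helper mirroring the second quoted hunk: when an inner node
// is split, the new left fence must not land in the middle of the range
// that is being inserted.
inline std::uintptr_t adjust_left_fence(std::uintptr_t left_fence,
                                        std::uintptr_t target,
                                        std::uintptr_t size) {
    assert(size != 0);
    if (left_fence >= target && left_fence < target + size - 1)
        left_fence = target + size - 1;
    return left_fence;
}
```

With offsets standing in for the addresses from the description, a key at 0x4f and an inserted range starting at 0x4c with 8 entries gives a new key of 0x53, so lookups of 0x50 to 0x53 are routed to the same child as 0x4c to 0x4f.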
2025-03-10 Jakub Jelinek <jakub@redhat.com>
Michael Leuchtenburg <michael@slashhome.org>
PR libgcc/119151
* unwind-dw2-btree.h (btree_split_inner): Add size argument. If
left_fence is in the middle of [target,target + size - 1] range,
increase it to target + size - 1.
(btree_insert): Adjust btree_split_inner caller. If fence is smaller
than base + size - 1, increase it and separator of the slot to
base + size - 1.
Xi Ruoyao [Fri, 7 Mar 2025 04:49:54 +0000 (12:49 +0800)]
LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]
When we call loongarch_reassoc_shift_bitwise for
<optab>_alsl_reversesi_extend, the mask is in DImode but we are trying
to operate it in SImode, causing an ICE.
To fix the issue, sign-extend the mask into the mode we want. Also
specially handle the case where the mask is extended into -1 to avoid a
miss-optimization.
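As a rough illustration of the arithmetic only (not the code in loongarch.cc), the sketch below sign-extends a 64-bit mask from a narrower width and detects the all-ones case. The names sign_extend and and_mask_is_noop are hypothetical, and reading the -1 case as "ANDing with the mask changes nothing" is an assumption about why it is singled out.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low `bits` bits of `value` as a two's complement number
// and return it widened to 64 bits.  Written without relying on
// implementation-defined unsigned-to-signed conversions.
inline std::int64_t sign_extend(std::uint64_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 64);
    const std::uint64_t low_mask =
        bits == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << bits) - 1);
    const std::uint64_t sign_bit = std::uint64_t{1} << (bits - 1);
    value &= low_mask;
    if (value & sign_bit)
        return -static_cast<std::int64_t>(~value & low_mask) - 1;
    return static_cast<std::int64_t>(value);
}

// A mask that sign-extends to -1 has every bit of the narrower mode set,
// so a bitwise AND with it is the identity in that mode.
inline bool and_mask_is_noop(std::uint64_t mask, unsigned bits) {
    return sign_extend(mask, bits) == -1;
}
```

For example, 0xFFFFFFFF is an ordinary positive value when viewed as a 64-bit quantity, but it is -1 when viewed as a 32-bit one.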
gcc/ChangeLog:
PR target/119127
* config/loongarch/loongarch.cc
(loongarch_reassoc_shift_bitwise): Sign extend mask to mode,
specially handle the case it's extended to -1.
* config/loongarch/loongarch.md
(loongarch_reassoc_shift_bitwise): Update the comment for the
special case.
Jakub Jelinek [Mon, 10 Mar 2025 08:33:55 +0000 (09:33 +0100)]
libgcc: Formatting fixes for unwind-dw2-btree.h
Studying unwind-dw2-btree.h was really hard for me because
the formatting is wrong or weird in many ways all around the code
and that kept distracting my attention.
That includes all kinds of things, including wrong indentation, using
{} around single statement substatements, excessive use of ()s around
some parts of expressions which don't increase code clarity, no space
after dot in comments, some comments not starting with capital letters,
some not ending with dot, adding {} around some parts of code without
any obvious reason (and when it isn't done in a similar neighboring
function) or ( at the end of line without any reason.
The following patch fixes the formatting issues I found, no functional
changes.
Jakub Jelinek [Mon, 10 Mar 2025 08:31:41 +0000 (09:31 +0100)]
gimple-ssa-warn-access: Adjust maybe_warn_nonstring_arg for nonstring multidimensional arrays [PR117178]
The following patch fixes 4 xfails in attr-nonstring-11.c (and results in 2
false positive warnings in attr-nonstring-12.c not being produced either).
The thing is that maybe_warn_nonstring_arg simply assumed that nonstring
arrays must be single-dimensional, so when it sees a nonstring decl with
ARRAY_TYPE, it just used its dimension. With multi-dimensional arrays
that is not the right dimension to use though; it can be the size of
some outer dimension, e.g. if we have
char a[5][6][7] __attribute__((nonstring)) if decl is
a[5] it would assume maximum non-NUL terminated string length of 5 rather than
7, if a[5][6] it would assume 6 and only for a[5][6][0] it would assume the
correct 7. So, the following patch looks through all the outer dimensions
to reach the innermost one (which for attribute nonstring is guaranteed to
have char/unsigned char/signed char element type).
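The difference between the outermost and the innermost dimension can be shown at the language level with standard type traits. This is only an analogy for the array type walk described above, not the GCC-internal code; innermost_extent and outermost_extent are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Size of the innermost dimension of a (possibly multi-dimensional) array
// type, i.e. the maximum length of one non-NUL-terminated string stored in it.
template <typename T>
constexpr std::size_t innermost_extent() {
    static_assert(std::is_array<T>::value, "T must be an array type");
    return std::extent<T, std::rank<T>::value - 1>::value;
}

// Size of the outermost dimension, which is the value that gets picked if
// only the top-level array type is inspected.
template <typename T>
constexpr std::size_t outermost_extent() {
    static_assert(std::is_array<T>::value, "T must be an array type");
    return std::extent<T, 0>::value;
}
```

For char[5][6][7] the outermost extent is 5 while the innermost extent, the one that bounds the string length, is 7.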
2025-03-10 Jakub Jelinek <jakub@redhat.com>
PR c/117178
* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Look through
multi-dimensional array types, stop at the innermost ARRAY_TYPE.
* c-c++-common/attr-nonstring-11.c: Remove xfails.
* c-c++-common/attr-nonstring-12.c (warn_strcmp_cst_1,
warn_strcmp_cst_2): Don't expect any warnings here.
(warn_strcmp_cst_3, warn_strcmp_cst_4): New functions with expected
warnings.
The issue is the same as 12383255fe4e82c31f5e42c72a8fbcb1b5dea35d.
Neither is .REDUC_PLUS set for V2SImode on LoongArch, so add it
to the list of targets not expecting BB vectorization.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-77.c: Add loongarch*-*-* to the list
of expected failing targets.
Jeff Law [Sun, 9 Mar 2025 20:25:37 +0000 (14:25 -0600)]
[rtl-optimization/117467] Mark FP destinations as dead
The next step in improving ext-dce is to clean up a minor wart in the
set/clobber handling code.
In that code the safe thing to do is to not process a destination at all. That
will leave bits set in the live bitmaps for objects that may no longer be live.
Of course with extraneous bits set we use more memory and do more work managing
the bitmaps, but it's safe from a code correctness standpoint.
One case that is slipping through that we need to fix is scalar fp
destinations. Essentially the code never tried to handle those and as a result
would leave those entities live and bubble them up through the CFG.
In the testcase at hand this takes us from ~10k live objects at entry to ~4k
live objects at entry. Time spent in ext-dce goes from 2.14s to .64s.
Jeff Law [Sun, 9 Mar 2025 19:28:10 +0000 (13:28 -0600)]
[rtl-optimization/117467] Avoid unnecessarily marking things live in ext-dce
This is the first of what I expect to be a few patches to improve memory
consumption and performance of ext-dce.
While I haven't been able to reproduce the insane memory usage that Richi saw,
I can certainly see how we might get there. I instrumented ext-dce to dump the
size of liveness sets, removed the memory allocation limiter, then compiled the
appropriate file from specfp on rv64.
In my test I saw the liveness sets growing to absurd sizes as we worked from
the last block back to the first. Think 125k entries by the time we got back
to the entry block which would mean ~30k live registers. Simply no way that's
correct.
The use handling is the primary source of problems and the code that I most
want to rewrite for gcc-16. It's just a fugly mess. I'm not terribly inclined
to do that rewrite for gcc-15 though. So these will be spot adjustments.
The most important thing to know about use processing is it sets up an iterator
and walks that. When a SET is encountered we actually manually
dive into the SRC/DEST and ideally terminate the iterator.
If during that SET processing we encounter something unexpected we let the
iterator continue normally, which causes iteration down into the SET_DEST
object. That's safe behavior, though it can lead to too many objects being
marked live.
We can refine that behavior by trivially realizing that we need not process the
SET_DEST if it is a naked REG (and probably for other cases too, but they're
not expected to be terribly important). So once we see the SET with a simple
REG destination, we can bump the iterator to avoid having it dive into the
SET_DEST if something unexpected is seen on the SET_SRC side.
Fixing this alone takes us from 125k live objects to 10k live objects at the
entry block. Time in ext-dce for rv64 on the testcase goes from 10.81s to
2.14s.
Given this reduces the things considered live, this could easily result in
finding more cases for ext-dce to improve. In fact a missed optimization issue
for rv64 I've been poking at needs this patch as a prerequisite.
Bootstrapped and regression tested on x86_64.
Pushing to the trunk.
PR rtl-optimization/117467
gcc
* ext-dce.cc (ext_dce_process_uses): When trivially possible advance
the iterator over the destination of a SET.
Andrew Pinski [Sun, 9 Mar 2025 06:43:54 +0000 (22:43 -0800)]
phiopt: Fix value_replacement for middle bb having phi nodes [PR118922]
After r12-5300-gf98f373dd822b3, value_replacement would be able to look at the
following cfg structure:
```
<bb 5> [local count: 1014686024]:
if (h_6 != 0)
goto <bb 7>; [94.50%]
else
goto <bb 6>; [5.50%]
```
value_replacement would incorrectly think the middle bb (6) was empty and so it decides
to remove the condition in bb5 and replace it with 0, as the function thought it was `h_6 ? 0 : h_6`.
But since there is an incoming phi node to bb6 defining h_6, that is incorrect.
The fix is to check if there are phi nodes in the middle bb and set empty_or_with_defined_p to false.
This was not needed before r12-5300-gf98f373dd822b3 because the phi would have been dead otherwise due to
other checks.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/118922
gcc/ChangeLog:
* tree-ssa-phiopt.cc (value_replacement): Set empty_or_with_defined_p
to false when there are phi nodes for the middle bb.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr118922-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
OpenMP: Integrate dynamic selectors with dispatch argument handling [PR118457]
Support for dynamic selectors in "declare variant" was developed in
parallel with support for the adjust_args/append_args clauses and the
dispatch construct; they collided in a bad way. This patch fixes the
"sorry" for calls that need both by removing the adjust_args/append_args
code from gimplify_call_expr and invoking it from the new variant
substitution code instead. It's handled as a tree -> tree transformation
rather than tree -> gimple because eventually this code may end up being
invoked from the front ends instead of the gimplifier (see PR115076).
gcc/ChangeLog
PR middle-end/118457
* gimplify.cc (modify_call_for_omp_dispatch): New, containing
code split from gimplify_call_expr and modified to emit tree
instead of gimple. Remove the error for falling through to a call
to the base function.
(expand_variant_call_expr): New, split from gimplify_variant_call_expr.
Call modify_call_for_omp_dispatch on calls to
variants in a dispatch construct context.
(gimplify_variant_call_expr): Make it call expand_variant_call_expr
to do the actual work.
(gimplify_call_expr): Remove sorry for calls involving both
dynamic/late selectors and adjust_args/append_args, and adjust
for new interface. Move adjust_args/append_args code to
modify_call_for_omp_dispatch.
(gimplify_omp_dispatch): Add some comments.
Thomas Koenig [Sat, 8 Mar 2025 15:13:41 +0000 (16:13 +0100)]
Fix regression with -Wexternal-argument-mismatch.
The attached patch fixes an ICE regression where undo state was not
handled properly when generating formal from actual arguments, which
occurred under certain conditions with the newly introduced
-Wexternal-argument-mismatch option.
The fix is simple: When we are generating these symbols, we no
longer need to undo anything, so we can just remove them.
I had considered adding an extra optional argument, but decided
against it on code clarity grounds.
While looking at the code, I also saw that a member of gfc_symbol
introduced with my patch should be a bitfield of width 1.
gcc/fortran/ChangeLog:
PR fortran/119157
* gfortran.h (gfc_symbol): Make ext_dummy_arglist_mismatch a
one-bit bitfield.
(gfc_pop_undo_symbol): Declare prototype.
* symbol.cc (gfc_pop_undo_symbol): New function.
* interface.cc (gfc_get_formal_from_actual_arglist): Call it
for artificially introduced formal variables.
gcc/testsuite/ChangeLog:
PR fortran/119157
* gfortran.dg/interface_57.f90: New test.
This commit implements the proposed resolution to LWG4169, which is
to constrain std::atomic<T>'s default constructor based on whether
T itself is default constructible.
At the moment, std::atomic<T>'s primary template in libstdc++ has a
defaulted default constructor. Value-initialization of the T member
(since C++20 / P0883R2) is done via a NSDMI (= T()).
GCC already considers the defaulted constructor constrained/deleted,
however this behavior is non-standard (see the discussion in PR116769):
the presence of a NSDMI should not make the constructor unavailable to
overload resolution/deleted ([class.default.ctor]/2.5 does not apply).
When using libstdc++ on Clang, this causes build issues as the
constructor is *not* deleted there -- the interpretation of
[class.default.ctor]/4 seems to match Clang's behavior.
Therefore, although there would be "nothing to do" with GCC+libstdc++,
this commit changes the code so as to stop relying on the GCC language
extension. In C++ >= 20 modes, std::atomic's defaulted default
constructor is changed to be a non-defaulted one, with a constraint
added as per LWG4169; value-initialization of the data member is moved
from the NSDMI to the member init list. The new signature matches the
one in the Standard as per [atomics.types.operations]/1.
In pre-C++20 modes, the constructor is left defaulted. This ensures
compatibility with C++11/14/17 behavior. In other words: we are not
backporting P0883R2 to earlier language modes here.
Amend an existing test to check that a std::atomic wrapping a
non-default constructible type is always non-default constructible:
from C++20, because of the constraint; before C++20, because we
are removing the NSDMI, and therefore [class.default.ctor]/2.5
applies.
Add another test that checks that std::atomic is trivially default
constructible in pre-C++20 modes, and it isn't afterwards.
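The pre-C++20 half of this can be illustrated with a small stand-in class. Box and NoDefault are hypothetical names and Box is not std::atomic; the sketch only shows that a defaulted default constructor without an NSDMI is deleted when the member cannot be default-initialized, and stays trivial when it can.

```cpp
#include <cassert>
#include <type_traits>

// A type with no default constructor.
struct NoDefault {
    explicit NoDefault(int) {}
};

// Hypothetical wrapper with a defaulted default constructor and no default
// member initializer.  For a non-default-constructible T the defaulted
// constructor is defined as deleted, so Box<T> is not default constructible.
template <typename T>
struct Box {
    Box() = default;
    explicit Box(T v) : value(v) {}
    T value;
};
```

Box<int> remains trivially default constructible here because nothing value-initializes the member. Moving the initialization into a user-provided constructor, as the C++20 path does, gives up that triviality.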
libstdc++-v3/ChangeLog:
* include/bits/version.def (atomic_value_initialization):
Guard the FTM with the language concepts FTM.
* include/bits/version.h: Regenerate.
* include/std/atomic (atomic): When atomic value init is
defined, change the defaulted default constructor to
a non-defaulted one, constraining it as per LWG4169.
Otherwise, keep the existing constructor.
Remove the NSDMI for the _M_i member.
(_GLIBCXX20_INIT): Drop the macro, as it is not needed any more.
* testsuite/29_atomics/atomic/69301.cc: Test that
an atomic wrapping a non-default-constructible type is
always itself non-default-constructible (in all language
modes).
* testsuite/29_atomics/atomic/cons/trivial.cc: New test.
inline-asm: Improve documentation of "asm constexpr".
While working on an adjacent documentation fix, I noticed that the
documentation for the gnu++11 "asm constexpr" feature was very
confusing, in some cases being attached to parts of the asm syntax
that are not otherwise required to be string literals, and missing from
other parts of the syntax that are. I've checked what the C++ parser
actually does and fixed the documentation to match, also improving it
to use correct markup and to be more explicit and less implementor-speaky.
gcc/cp/ChangeLog
* parser.cc (cp_parser_asm_definition): Make comment more explicit.
(cp_parser_asm_operand_list): Likewise. Also correct the comment
block at the top of the function to reflect reality.
gcc/ChangeLog
* doc/extend.texi (Basic Asm): Document that AssemblerInstructions
can be an asm constexpr.
(Extended Asm): Move the notes about asm constexprs for
AssemblerTemplate and Clobbers to the corresponding subsections.
Remove the notes for OutputOperands and InputOperands and reword
misleading descriptions of the list item syntax. Note that
constraint strings can be asm constexprs.
(Asm constexprs): Use "title case" for subsection name. Be
explicit about what parts of the asm syntax this applies to and
that the parentheses are required. Correct markup and terminology.
Jason Merrill [Thu, 6 Mar 2025 17:39:36 +0000 (12:39 -0500)]
c++/modules: purview of explicit instantiations [PR114630]
When calling instantiate_pending_templates at end of parsing, any new
functions that are instantiated from this point have their module
purview set based on the current value of module_kind.
This is not ideal, however, as the modules code will then treat these
instantiations as reachable and cause large swathes of the GMF to be
emitted into the module CMI, despite no code in the actual module
purview referencing it.
This patch fixes this by setting DECL_MODULE_PURVIEW_P as appropriate when
we see an explicit instantiation, and adjusting module_kind accordingly
during deferred instantiation, meaning that GMF entities won't be counted
as reachable unless referenced by an actually reachable entity.
Note that purviewness and attachment etc. is generally only determined
by the base template: this is purely for determining whether an
explicit instantiation is in the module purview and hence whether it
should be streamed out. See the comment on 'set_instantiating_module'.
Incidentally, since the "xtreme" testcases are deliberately large (and this
commit adds another one), let's make sure we only run them once.
PR c++/114630
PR c++/114795
gcc/cp/ChangeLog:
* pt.cc (reopen_tinst_level): Set or clear MK_PURVIEW.
(mark_decl_instantiated): Call set_instantiating_module.
(instantiate_pending_templates): Save and restore module_kind so
it isn't affected by reopen_tinst_level.
gcc/testsuite/ChangeLog:
* g++.dg/modules/modules.exp: Run xtreme tests once.
* g++.dg/modules/gmf-3.C: New test.
* g++.dg/modules/gmf-4.C: New test.
* g++.dg/modules/gmf-xtreme.C: New test.
My P3349R1 paper clarifies that we should be able to lower contiguous
iterators to pointers, without worrying about side effects of individual
increment or dereference operations.
We do need to advance the iterators, and we need to use std::to_address
on the result of advancing them. This ensures that iterators with error
detection get a chance to diagnose bugs. If we don't use std::to_address
on the advanced iterator, it would be possible for a memcpy on the
pointers to overflow a buffer. By performing the += or -= operations and
also using std::to_address, we give the iterator a chance to abort,
throw, or call a violation handler before the buffer overflow happens.
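The ordering described above can be sketched with a toy checked iterator. CheckedPtr, to_addr and copy_n_lowered are hypothetical names, to_addr merely stands in for std::to_address, and none of this is the stl_algobase.h implementation; the point is only that the iterators are advanced, and may throw, before any bytes are copied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical bounds-checked contiguous "iterator" over int.
struct CheckedPtr {
    int* cur;
    int* end;
    CheckedPtr& operator+=(std::ptrdiff_t n) {
        if (n < 0 || n > end - cur)
            throw std::out_of_range("advance past end of range");
        cur += n;
        return *this;
    }
};

// Stand-in for std::to_address on the checked iterator.
inline int* to_addr(const CheckedPtr& p) { return p.cur; }

// Lower a copy of n elements to memcpy, but only after both iterators
// have been advanced, so their checks run before memory is touched.
inline CheckedPtr copy_n_lowered(CheckedPtr first, std::ptrdiff_t n,
                                 CheckedPtr out) {
    int* src = to_addr(first);
    int* dst = to_addr(out);
    first += n;  // may throw before anything is copied
    out += n;    // may throw before anything is copied
    int* src_end = to_addr(first);
    assert(to_addr(out) - dst == src_end - src);
    if (src_end != src)
        std::memcpy(dst, src,
                    static_cast<std::size_t>(src_end - src) * sizeof(int));
    return out;
}
```

Copying four elements into a two-element destination throws from the += on the output iterator and leaves the destination untouched, instead of overflowing it in memcpy.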
The new tests only check the std::copy* algorithms, because std::move
and std::move_backward use the same implementation details.
libstdc++-v3/ChangeLog:
* include/bits/stl_algobase.h (__nothrow_contiguous_iterator):
Remove.
(__memcpyable_iterators): Simplify.
(__copy_move_a2, __copy_n_a, __copy_move_backward_a2): Call
std::to_address on the iterators after advancing them.
* testsuite/25_algorithms/copy/contiguous.cc: New test.
* testsuite/25_algorithms/copy_backward/contiguous.cc: New test.
* testsuite/25_algorithms/copy_n/contiguous.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>