Matthias Klose [Thu, 13 Mar 2025 06:22:02 +0000 (07:22 +0100)]
Allow to build libgccjit with a soname bound to the GCC major version
When configuring GCC with --program-suffix=-$(BASE_VERSION) to allow
installation of multiple GCC versions in parallel, the executable of the
driver (gcc-$(BASE_VERSION)) gets recorded in the libgccjit.so.0
library. Assuming that you only install the libgccjit.so.0 library
from the newest GCC, you have a libgccjit installed which always calls
back to the newest installed version of GCC. I'm not saying that the
ABI is changing, but I'd like to see the libgccjit calling out to the
corresponding compiler, and therefore installing a libgccjit with a
soname that matches the GCC major version.
The downside is having to rebuild packages built against libgccjit with
each major GCC version, but looking at the reverse dependencies, at
least for package builds, only emacs is using libgccjit.
My plan to use this feature is to build a libgccjit0 using the default
GCC (e.g. gcc-14), and a libgccjit15 when building a newer GCC. When
the GCC default changes to 15, I would build a libgccjit0 from gcc-15 and
a libgccjit14 from gcc-14.
When configuring without --enable-versioned-jit, the behavior is unchanged.
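As a rough illustration of the naming scheme described above, the following C++ sketch computes a soname and a package name for a given GCC major version. Both helper names are hypothetical, and the exact libgccjit.so.N spelling for the versioned case is an assumption; the real value is set by configure.ac and Makefile.in.

```cpp
#include <string>

// Hypothetical helper: the soname libgccjit would get. Without
// --enable-versioned-jit the version number stays 0 (libgccjit.so.0);
// with it, this sketch assumes the GCC major version is used instead.
std::string jit_soname(bool versioned_jit, int gcc_major)
{
  return "libgccjit.so." + std::to_string(versioned_jit ? gcc_major : 0);
}

// Hypothetical helper: package name following the plan in the message
// (libgccjit0 from the default GCC, libgccjit<N> from any other GCC).
std::string jit_package(bool is_default_gcc, int gcc_major)
{
  return "libgccjit" + std::to_string(is_default_gcc ? 0 : gcc_major);
}
```

With gcc-14 as the default, jit_package(true, 14) and jit_package(false, 15) give the libgccjit0 and libgccjit15 names mentioned in the plan.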
2025-03-13 Matthias Klose <doko@ubuntu.com>
gcc/
* configure.ac: Add option --enable-versioned-jit.
* configure: Regenerate.
* Makefile.in: Move from jit/Make-lang.in, setting value from
configure.ac.
* doc/install.texi: Document option --enable-versioned-jit.
gcc/jit/
* Make-lang.in (LIBGCCJIT_VERSION_NUM): Move to ../Makefile.in.
David Malcolm [Thu, 13 Mar 2025 00:51:06 +0000 (20:51 -0400)]
analyzer: support RAW_DATA_CST [PR117262]
gcc/analyzer/ChangeLog:
PR analyzer/117262
* region-model-manager.cc
(region_model_manager::get_or_create_constant_svalue): Use
NULL_TREE for the types of constant_svalue for RAW_DATA_CST.
(region_model_manager::maybe_fold_sub_svalue): Generalize
STRING_CST logic to also handle RAW_DATA_CST.
(region_model_manager::maybe_get_char_from_cst): New.
(region_model_manager::maybe_get_char_from_raw_data_cst): New.
* region-model-manager.h
(region_model_manager::maybe_get_char_from_cst): New decl.
(region_model_manager::maybe_get_char_from_raw_data_cst): New decl.
* region-model.cc (region_model::get_rvalue_1): Handle
RAW_DATA_CST.
* store.cc (get_subregion_within_ctor_for_ctor_pair): New.
(binding_map::apply_ctor_pair_to_child_region): Call
get_subregion_within_ctor_for_ctor_pair so that we handle
RAW_DATA_CST.
gcc/testsuite/ChangeLog:
PR analyzer/117262
* c-c++-common/analyzer/raw-data-cst-pr117262-1.c: New test.
* c-c++-common/analyzer/raw-data-cst-pr117262-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jakub Jelinek [Wed, 12 Mar 2025 23:42:54 +0000 (00:42 +0100)]
c++: Evaluate immediate invocation call arguments with mce_true [PR119150]
Since Marek's r14-4140 which moved immediate invocation evaluation
from build_over_call to cp_fold_r, the following testcase is miscompiled.
The a = foo (bar ()); case is actually handled right, that is handled
in cp_fold_r and the whole CALL_EXPR is at that point evaluated by
cp_fold_immediate_r with cxx_constant_value (stmt, tf_none);
and that uses mce_true for evaluation of the argument as well as the actual
call.
But in the bool b = foo (bar ()); case we actually try to evaluate this
as non-manifestly constant-evaluated. And while
/* Make sure we fold std::is_constant_evaluated to true in an
immediate function. */
if (DECL_IMMEDIATE_FUNCTION_P (fun))
call_ctx.manifestly_const_eval = mce_true;
ensures that if consteval and __builtin_is_constant_evaluated () is true
inside of that call, this happens after arguments to the function
have been already constant evaluated in cxx_bind_parameters_in_call.
The call_ctx in that case also includes new call_ctx.call, something that
shouldn't be used for the arguments, so the following patch just arranges
to call cxx_bind_parameters_in_call with manifestly_constant_evaluated =
mce_true.
2025-03-13 Jakub Jelinek <jakub@redhat.com>
PR c++/119150
* constexpr.cc (cxx_eval_call_expression): For
DECL_IMMEDIATE_FUNCTION_P (fun) set manifestly_const_eval in new_ctx
and new_call to mce_true and set ctx to &new_ctx.
Nathaniel Shead [Sat, 8 Feb 2025 13:37:48 +0000 (00:37 +1100)]
c++/modules: Better handle no-linkage decls in unnamed namespaces [PR118799]
There are two issues with no-linkage decls (e.g. explicit type aliases)
in unnamed namespaces that this patch fixes.
Firstly, we don't currently handle exporting no-linkage decls in unnamed
namespaces. This should be ill-formed in [module.export], since having
an exported declaration within a namespace-definition makes the
namespace definition exported (p2), but an unnamed namespace has
internal linkage thus violating p3.
Secondly, by the standard it appears to be possible to emit unnamed
namespaces from named modules in certain scenarios, for instance when
they contain a type alias (which is not itself an entity). This patch
makes the adjustments needed to ensure we don't error in this scenario.
PR c++/118799
gcc/cp/ChangeLog:
* module.cc (depset::hash::is_tu_local_entity): Only types,
functions, variables, and template (specialisations) can be
TU-local. Explicit type aliases are TU-local iff the type they
refer to is.
(module_state::write_namespaces): Allow unnamed namespaces in
named modules.
(check_module_decl_linkage): Error for all exported declarations
in an unnamed namespace.
gcc/testsuite/ChangeLog:
* g++.dg/modules/export-6.C: Adjust error message, add check for
no-linkage decls in namespace.
* g++.dg/modules/internal-4_b.C: Allow exposing a namespace with
internal linkage. Type aliases are not entities and so never
exposures.
* g++.dg/modules/using-30_a.C: New test.
* g++.dg/modules/using-30_b.C: New test.
* g++.dg/modules/using-30_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Currently, note_vague_linkage_fn is called on all definitions imported
from modules. This is not correct, however; among other things, a
gnu_inline function does not have vague linkage, and causes an ICE if
it is treated as such.
There are other things that we seem to potentially miss (e.g. dllexport
handling), so it seems sensible to stop trying to manage linkage in such
an ad-hoc manner but to use the normal interfaces more often. While
looking at this I also found that we seem to miss marking vague linkage
variables as COMDAT, so this patch fixes that as well.
The change to use expand_or_defer_fn exposes a checking-only ICE in
trees_in::assert_definition, where we forget that we already installed a
definition for a function. We work around this by setting
DECL_SAVED_TREE to a dummy value in expand_or_defer_fn_1 instead of
clearing it entirely. This way we can also avoid the check for
!TREE_ASM_WRITTEN beforehand.
PR c++/119154
gcc/cp/ChangeLog:
* decl2.cc (vague_linkage_p): Don't treat gnu_inline functions
as having vague linkage.
* module.cc (trees_out::core_bools): Clear DECL_INTERFACE_KNOWN
for vague-linkage entities.
(read_var_def): Maybe set comdat linkage for imported var
definitions.
(module_state::read_cluster): Use expand_or_defer_fn instead of
ad-hoc linkage management.
(post_load_processing): Likewise.
* semantics.cc (expand_or_defer_fn_1): Don't forget that we had
a definition at all.
gcc/testsuite/ChangeLog:
* g++.dg/modules/linkage-3_a.C: New test.
* g++.dg/modules/linkage-3_b.C: New test.
* g++.dg/modules/pr119154_a.C: New test.
* g++.dg/modules/pr119154_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
d = MEM <unsigned char[4]> [(struct A *)&TARGET_EXPR <D.2894, foo()>]
= MEM <unsigned char[4]> [(struct A *)(const struct A &) &e],
TARGET_EXPR <D.2894, foo()>
that is, a COMPOUND_EXPR where a TARGET_EXPR is used twice, and its
address is taken in the left-hand side operand, so it can't be elided.
But set_target_expr_eliding simply recurses on the second operand of
a COMPOUND_EXPR and marks the TARGET_EXPR as eliding. This then causes
a crash.
cp_build_indirect_ref_1 should not be changing the value category.
While *&TARGET_EXPR is an lvalue, folding it into TARGET_EXPR would
render it a prvalue of class type.
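The value-category point can be checked at the source level with decltype. This is only a sketch: the names A, e and make are invented for illustration. User code cannot take the address of a temporary directly the way the front end does internally with TARGET_EXPR, so the sketch shows the two categories side by side rather than the fold itself.

```cpp
#include <type_traits>

struct A { int i; };
extern A e;   // declared only; used in unevaluated operands
A make();     // declared only; used in unevaluated operands

// For a non-id expression, decltype yields T& for an lvalue and plain T
// for a prvalue.
constexpr bool deref_of_address_is_lvalue =
    std::is_lvalue_reference<decltype(*&e)>::value;
constexpr bool call_is_prvalue =
    !std::is_reference<decltype(make())>::value;

static_assert(deref_of_address_is_lvalue, "*&e is an lvalue, like e itself");
static_assert(call_is_prvalue, "make() is a prvalue of class type");
```

Folding *&e to e is harmless when e is itself an lvalue, which is the only case the patch keeps.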
PR c++/117512
gcc/cp/ChangeLog:
* typeck.cc (cp_build_indirect_ref_1): Only do the *&e -> e
folding if the result would be an lvalue.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alignas23.C: New test.
* g++.dg/ext/align3.C: New test.
* g++.dg/ext/align4.C: New test.
* g++.dg/ext/align5.C: New test.
Simon Martin [Wed, 12 Mar 2025 19:15:39 +0000 (20:15 +0100)]
c++: Look through capture proxy from outer lambda instead of erroring out [PR110584]
We've been rejecting this valid code since r8-4571:
=== cut here ===
void foo (float);
int main () {
constexpr float x = 0;
(void) [&] () {
foo (x);
(void) [] () {
foo (x);
};
};
}
=== cut here ===
The problem is that when processing X in the inner lambda,
process_outer_var_ref errors out even though it does find the constant
capture from the enclosing lambda.
This patch makes sure that process_outer_var_ref properly looks through
normal capture proxies, if any.
PR c++/110584
gcc/cp/ChangeLog:
* cp-tree.h (strip_normal_capture_proxy): Declare.
* lambda.cc (strip_normal_capture_proxy): New function to look
through normal capture proxies.
(build_capture_proxy): Use it.
* semantics.cc (process_outer_var_ref): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-nested10.C: New test.
This test has been failing since gcc-6. The test was always very
fragile anyway since it relied on an auto-inc being created and then
split by the subreg2 (later the subreg3) pass. But the code to get
precisely these conditions was very long-winded and unlikely to be
immune to other changes in the compiler (as proved to be the case).
There's no obvious way to recreate the exact conditions we were
testing for, so just remove the test.
Marek Polacek [Fri, 7 Mar 2025 16:26:46 +0000 (11:26 -0500)]
c++: ICE with lambda in fold expression in requires [PR119134]
The r12-8258 fix assumes that DECL_CONTEXT of 'pack' in
check_for_bare_parameter_packs is going to be an operator()
but as this test shows, it can be empty.
This change makes the check_dynamic_spec precondition checks slightly
faster to compile, and avoids those checks entirely for the common cases
of calling check_dynamic_spec_integral or check_dynamic_spec_string.
Instead of checking for unique types by keeping counts in an array and
looping over that array, we can just keep a sum of how many valid types
are present, and check that it equals the total number of types in the
pack.
The diagnostic is slightly worse now, because there's only a single
"invalid template argument types" string that appears in the output,
where previously we had either "non-unique template argument type" or
"disallowed template argument type", depending on the failure mode.
Given that most users will never use this function directly, and
probably won't use invalid types anyway, the inferior diagnostic seems
acceptable.
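The counting idea can be sketched in a few lines of C++17. This is not the libstdc++ implementation: appears_once and valid_types are hypothetical names, and char stands in for the context's char_type. Each allowed type contributes 1 only if it occurs exactly once in the pack, so a duplicated or disallowed type makes the sum fall short of the pack size.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>

// True if T occurs exactly once in Ts... (hypothetical helper).
template <typename T, typename... Ts>
constexpr bool appears_once =
    ((std::is_same_v<T, Ts> ? 1 : 0) + ... + 0) == 1;

// Sum the allowed types that occur exactly once and compare with the
// size of the pack. Any duplicate or disallowed type lowers the sum.
template <typename... Ts>
constexpr bool valid_types()
{
  constexpr std::size_t valid =
      appears_once<bool, Ts...> + appears_once<char, Ts...>
      + appears_once<int, Ts...> + appears_once<unsigned, Ts...>
      + appears_once<long long, Ts...>
      + appears_once<unsigned long long, Ts...>
      + appears_once<float, Ts...> + appears_once<double, Ts...>
      + appears_once<long double, Ts...>
      + appears_once<const char*, Ts...>
      + appears_once<std::string_view, Ts...>
      + appears_once<const void*, Ts...>;
  return valid == sizeof...(Ts);
}
```

A single boolean result is also why only one diagnostic string remains: the sum cannot tell a duplicate from a disallowed type.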
libstdc++-v3/ChangeLog:
* include/std/format (basic_format_parse_context::__once): New
variable template.
(basic_format_parse_context::__valid_types_for_check_dynamic_spec):
New function template for checking argument types.
(basic_format_parse_context::__check_dynamic_spec): New function
template to implement the common check_dynamic_spec logic.
(basic_format_parse_context::check_dynamic_spec_integral): Call
__check_dynamic_spec instead of check_dynamic_spec.
(basic_format_parse_context::check_dynamic_spec_string):
Likewise. Use _CharT instead of char_type consistently.
(basic_format_parse_context::check_dynamic_spec): Use
__valid_types_for_check_dynamic_spec for precondition checks and
call __check_dynamic_spec for checking the arg id.
* testsuite/std/format/parse_ctx_neg.cc: Adjust expected errors.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Mon, 10 Mar 2025 14:29:36 +0000 (14:29 +0000)]
libstdc++: Add static_assert to std::packaged_task::packaged_task(F&&)
LWG 4154 (approved in Wrocław, November 2024) fixed the Mandates:
precondition for std::packaged_task::packaged_task(F&&) to match what
the implementation actually requires. We already gave a diagnostic in
the right cases as required by the issue resolution, so strictly
speaking we don't need to do anything. But the current diagnostic comes
from inside the implementation of std::__invoke_r and could be more
user-friendly.
For C++17 (when std::is_invocable_r_v is available) add a static_assert
to the constructor, so the error is clear.
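A minimal sketch of this kind of constructor check, using a hypothetical checked_task wrapper rather than std::packaged_task itself. The condition follows the LWG 4154 Mandates wording as understood here (the decayed callable, as an lvalue, must be invocable with the argument types and yield something convertible to the result type); the message text is invented.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

template <typename Signature> class checked_task;  // hypothetical wrapper

template <typename R, typename... Args>
class checked_task<R(Args...)>
{
public:
  template <typename F>
  explicit checked_task(F&& f) : fn_(std::forward<F>(f))
  {
    // Fail early with a readable message instead of an error from deep
    // inside the invocation machinery.
    static_assert(std::is_invocable_r_v<R, std::decay_t<F>&, Args...>,
                  "callable must be invocable with Args and return R");
  }

  R operator()(Args... args) { return fn_(std::forward<Args>(args)...); }

private:
  std::function<R(Args...)> fn_;
};
```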
Jonathan Wakely [Fri, 12 Jan 2024 16:57:41 +0000 (16:57 +0000)]
libstdc++: Update tzdata to 2025a
Import the new 2025a tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).
libstdc++-v3/ChangeLog:
* include/std/chrono (__detail::__get_leap_second_info): Update
expiry date for leap seconds list.
* src/c++20/tzdata.zi: Import new file from 2025a release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds):
Update expiry date for leap seconds list.
Jason Merrill [Tue, 11 Mar 2025 21:43:35 +0000 (17:43 -0400)]
contrib: relpath.sh /lib /include [PR119081]
Previously, if the common ancestor of the two paths is / we would print the
absolute second argument, but this PR asks for a relative path in that case
as well, which makes sense for the libstdc++.modules.json use case.
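The expected result can be illustrated with std::filesystem. This is a C++17 analogue of the behaviour, not the shell script's implementation, and the relpath function here is a hypothetical stand-in.

```cpp
#include <filesystem>
#include <string>

// Relative path from directory `from` to `to`, computed purely
// lexically (no filesystem access).
std::string relpath(const std::string& from, const std::string& to)
{
  namespace fs = std::filesystem;
  return fs::path(to).lexically_relative(fs::path(from)).generic_string();
}
```

For the case in the subject line, relpath("/lib", "/include") yields "../include" even though the only common ancestor is the root directory.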
Alex Coplan [Mon, 10 Mar 2025 16:44:15 +0000 (16:44 +0000)]
df: Treat partial defs as uses in df_simulate_defs [PR116564]
The PR shows us spinning in dce.cc:fast_dce at the start of combine.
This spinning appears to be because of a disagreement between the fast_dce code
and the code in df-problems.cc:df_lr_bb_local_compute. Specifically, they
disagree on the treatment of partial defs. For the testcase in the PR, we have
the following insn in bb 3:
i.e. it models partial defs as a RMW operation; thus for the def arising
from i10 above, it records a use of r104; hence it ends up in the
live-in set for bb 3.
However, as it stands, the code in dce.cc:fast_dce (and its callee
dce_process_block) has no such provision for DF_REF_PARTIAL defs. It
does not treat these as a RMW and does not compute r104 above as being
live-in to bb 3. At the end of dce_process_block we compute the
following "did something happen" condition used to decide termination of
the analysis:
because of the disagreement between df_lr_local_compute and the local
analysis done by fast_dce, we invariably have r104 in DF_LR_IN, but not
in local_live. Hence we always return true here, call
df_analyze_problem (which re-computes DF_LR_IN according to
df_lr_bb_local_compute, re-adding r104), and so the analysis never
terminates.
This patch therefore adjusts df_simulate_defs (called from
dce_process_block) to match the behaviour of df_lr_bb_local_compute in
this respect, namely we make it model partial defs as RMW operations by
setting the relevant register live. This fixes the spinning in fast_dce
for this testcase.
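The difference between a full def and a partial def in a backward liveness step can be modelled with a toy example. The Def struct and this simulate_defs are simplified stand-ins, not the df-problems.cc code: a full def kills the register, while a partial def is a read-modify-write and so leaves the register live before the insn.

```cpp
#include <set>
#include <vector>

struct Def { int reg; bool partial; };  // hypothetical model of a def

// Backward step over one insn's defs: `live` holds the registers live
// after the insn on entry and the registers live before it on exit.
void simulate_defs(const std::vector<Def>& defs, std::set<int>& live)
{
  for (const Def& d : defs) {
    if (d.partial)
      live.insert(d.reg);  // RMW: the old value is still needed
    else
      live.erase(d.reg);   // full overwrite: the old value is dead
  }
}
```

With a partial def of r104, the register stays in the set, so the local result agrees with a live-in set that also models the def as a use.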
gcc/ChangeLog:
PR rtl-optimization/116564
* df-problems.cc (df_simulate_defs): For partial defs, mark the
register live (treat it as a RMW operation).
gcc/testsuite/ChangeLog:
PR rtl-optimization/116564
* gcc.target/aarch64/torture/pr116564.c: New test.
- Import phobos v2.111.0-beta.1.
- Added `bitCast' function to `std.conv'.
- Added `readfln' and `File.readfln' functions to `std.stdio'.
- New procedural API for `std.sumtype'.
Richard Earnshaw [Mon, 10 Mar 2025 14:12:38 +0000 (14:12 +0000)]
arm: allow type-punning subregs in vpr_register_operand [PR115439]
Subregs that only change the mode of an operand (i.e. don't change the
size) should be safe for the VPR register. If we don't permit them
we may end up with some redundant copy instructions.
Fortran: Add F2018 TEAM_NUMBER to coindexed expressions [PR98903]
Add missing parsing and code generation for a[..., TEAM_NUMBER=...] as
defined from F2015 onwards. Because F2015 is not used as a dedicated
standard in GFortran, add it to the F2018 standard feature set.
PR fortran/98903
gcc/fortran/ChangeLog:
* array.cc (gfc_copy_array_ref): Copy team, team_type and stat.
(match_team_or_stat): Match a single team(_number)= or stat=.
(gfc_match_array_ref): Add switching to image_selector_parsing
and error handling when indices come after named arguments.
* coarray.cc (move_coarray_ref): Move also team_type.
* expr.cc (gfc_free_ref_list): Free team and stat expression.
(gfc_find_team_co): Find team or team_number in array-ref.
* gfortran.h (enum gfc_array_ref_team_type): New enum to
distinguish unset, team or team_number expression.
(gfc_find_team_co): Default searching to team= expressions.
* resolve.cc (resolve_array_ref): Check for type correctness of
team(_number) and stats in coindices.
* trans-array.cc (gfc_conv_array_ref): Ensure stat is cleared
when -fcoarray=single is used.
* trans-intrinsic.cc (conv_stat_and_team): Include team_number
in conversion.
(gfc_conv_intrinsic_caf_get): Propagate team_number to ABI
routine.
(conv_caf_send_to_remote): Same.
(conv_caf_sendget): Same.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/coindexed_2.f90: New test.
* gfortran.dg/coarray/coindexed_3.f08: New test.
* gfortran.dg/coarray/coindexed_4.f08: New test.
Tomasz Kamiński [Tue, 11 Mar 2025 10:59:36 +0000 (11:59 +0100)]
libstdc++: Correct preprocessing checks for floatX_t and bfloat_16 formatting
Floating-point types _Float16, _Float32, _Float64, and bfloat16
can be formatted only if std::to_chars overloads for such types
are provided. Currently this is only the case for architectures
where float and double are 32-bit and 64-bit IEEE floating-point types.
This patch updates the preprocessing checks for the formatters
of the above types to check _GLIBCXX_FLOAT_IS_IEEE_BINARY32
and _GLIBCXX_DOUBLE_IS_IEEE_BINARY64, making them non-formattable
on non-IEEE architectures.
This also removes potential UB, where we could produce a basic_format_arg
with _M_type set to _Arg_fp32 or _Arg_fp64 that was later not
handled by `_M_visit`.
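The same kind of condition can be expressed with std::numeric_limits. This is only a sketch: is_ieee_binary is a hypothetical helper, and the way libstdc++ actually defines the two _GLIBCXX_*_IS_IEEE_* macros is not shown in this message.

```cpp
#include <climits>
#include <limits>

// Hypothetical stand-in for the macro checks: T is an IEEE binary
// format with the given total width and significand precision.
template <typename T, int Bits, int MantissaDigits>
constexpr bool is_ieee_binary()
{
  using L = std::numeric_limits<T>;
  return L::is_iec559 && L::radix == 2
         && static_cast<int>(sizeof(T) * CHAR_BIT) == Bits
         && L::digits == MantissaDigits;
}

constexpr bool float_is_binary32  = is_ieee_binary<float, 32, 24>();
constexpr bool double_is_binary64 = is_ieee_binary<double, 64, 53>();
```

A formatter specialization guarded by such a condition simply does not exist on targets where it is false, which is what makes the type non-formattable there.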
libstdc++-v3/ChangeLog:
* include/std/format (formatter<_Float16, _CharT>): Define only if
_GLIBCXX_FLOAT_IS_IEEE_BINARY32 macro is defined.
(formatter<_Float32, _CharT>): As above.
(formatter<__gnu_cxx::__bfloat16_t, _CharT>): As above.
(formatter<_Float64, _CharT>): Define only if
_GLIBCXX_DOUBLE_IS_IEEE_BINARY64 is defined.
(basic_format_arg::_S_to_arg_type): Normalize _Float32 and _Float64
only to float and double respectively.
(basic_format_arg::_S_to_enum): Remove handling of _Float32 and _Float64.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Mark Wielaard [Wed, 12 Mar 2025 11:29:24 +0000 (12:29 +0100)]
Regenerate cobol/lang.opt.urls
With the COBOL: Frontend (commit 3c5ed996a) came a lang.opt.urls,
which is different from what regenerate-opt-urls.py generates. Make
the CI bot happy by regenerating it.
Longer term, the COBOL docs need to be sorted out (see e.g. PR119227)
and then perhaps regenerate-opt-urls.py adjusted so that it can deal
with the COBOL docs.
Simon Martin [Wed, 12 Mar 2025 08:09:35 +0000 (09:09 +0100)]
cobol: Remove unnecessary CPPFLAGS update and restore MacOS build
The build currently fails on MacOS even when the Cobol front-end and
libgcobol builds are disabled.
The problem is that gcc/cobol/Make-lang.in adds -Iinclude to CPPFLAGS,
which somehow makes clang unhappy about the include order:
error: <cstddef> tried including <stddef.h> but didn't find libc++'s
<stddef.h> header. This usually means that your header search paths
are not configured properly.
It turns out that this addition is unnecessary: simply removing it fixes
the build on MacOS, without impacting the build on x86_64-pc-linux-gnu when
configured with --enable-languages=default,cobol.
It feels like there might be more cleanup opportunities there, but they
can be taken care of later.
Jonathan Wakely [Tue, 11 Mar 2025 17:29:01 +0000 (17:29 +0000)]
libstdc++: Prevent dangling references in std::unique_ptr::operator*
LWG 4148 (approved in Wrocław, November 2024) makes it ill-formed to
dereference a std::unique_ptr if that would return a dangling reference.
That can happen with a custom pointer type and a const-qualified
element_type, such that std::add_lvalue_reference_t<element_type> is a
reference-to-const that could bind to a short-lived temporary.
In C++26 the compiler diagnoses this as an error anyway:
bits/unique_ptr.h:457:16: error: returning reference to temporary [-Wreturn-local-addr]
But that can be disabled with -Wno-return-local-addr so the
static_assert ensures it is enforced consistently.
libstdc++-v3/ChangeLog:
* include/bits/unique_ptr.h (unique_ptr::operator*): Add
static_assert to check for dangling reference, as per LWG 4148.
* testsuite/20_util/unique_ptr/lwg4148.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 11 Mar 2025 15:47:21 +0000 (15:47 +0000)]
libstdc++: Make range adaptor __has_arrow helper use a const type
LWG 4112 (approved in Wrocław, November 2024) changes the has-arrow
helper to require operator-> to be valid on a const-qualified lvalue.
This affects the constraints for filter_view::_Iterator::operator-> and
join_view::_Iterator::operator-> so that they can only be used if the
underlying iterator supports operator-> on const.
The change also adds semantic (i.e. not checkable and not enforced)
requirements that operator-> must have the same semantics whether called
on a const or non-const value, and on an lvalue or rvalue (due to the
implicit expression variation rules in [concepts.equality]).
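The effect of requiring operator-> on a const-qualified lvalue can be sketched with the C++17 detection idiom. The trait and the two example types are hypothetical, and the sketch omits the input_iterator requirement that the real helper also imposes.

```cpp
#include <type_traits>
#include <utility>

// Primary template: no member operator-> usable on a const lvalue, so
// only raw pointers qualify.
template <typename It, typename = void>
struct has_arrow_on_const : std::is_pointer<It> {};

// Chosen when operator-> can be called on a const-qualified lvalue.
template <typename It>
struct has_arrow_on_const<
    It, std::void_t<decltype(std::declval<const It&>().operator->())>>
  : std::true_type {};

struct ConstArrow   { int* operator->() const; };  // accepted
struct MutableArrow { int* operator->(); };        // rejected
```

An iterator like MutableArrow satisfied the old check on a non-const object but fails the const-qualified one.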
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (ranges::_detail::__has_arrow):
Require operator->() to be valid on const-qualified type, as per
LWG 4112.
* testsuite/std/ranges/adaptors/lwg4112.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
LWG 4142 (approved in Wrocław, November 2024) made it ill-formed to call
basic_format_parse_context::check_dynamic_spec with an empty template
argument list.
This adds a static_assert to enforce that, and adjusts the tests.
libstdc++-v3/ChangeLog:
* include/std/format
(basic_format_parse_context::check_dynamic_spec): Require a
non-empty parameter pack, as per LWG 4142.
* testsuite/std/format/parse_ctx.cc: Remove call of
check_dynamic_spec with empty template argument list.
* testsuite/std/format/parse_ctx_neg.cc: Add dg-error to call of
check_dynamic_spec with empty template argument list.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
aarch64: Make latency account for synthetic VEC_PERM_EXPRs [PR116901]
Another problem in pr110625_[24].c was that the latency calculations
were ignoring VEC_PERM_EXPRs that had no associated stmt_vec_info.
Such VEC_PERM_EXPRs are common and expected for SLP these days.
After this change, the number of general ops in the testcases seems
to be accurate apart from one remaining detail: we assume that the
extension in a permuted extending load is free, even though the
extension happens after the permutation. Fixing that would require
more information from the vectoriser and so isn't GCC 15 material.
It also should cease to be a problem if we do end up moving the
permutation to its own node, rather than keeping it as part of
the load.
gcc/
PR target/116901
* config/aarch64/aarch64.cc (aarch64_vector_costs::count_ops): Allow
stmt_info to be null.
(aarch64_vector_costs::add_stmt_cost): Call count_ops even if
stmt_info is null.
vect: Fix ncopies when costing SLP reductions [PR116901]
pr110625_[24].c started failing after r15-1329-gd66b820f392aa9a7,
which switched to single def-use cycles for single-lane SLP.
The problem is that we only costed one vector accumulator
operation for an N-vector cycle.
The problem seems to have been latent, and meant that we also
only costed one FADDA for reduc_strict_4.c and reduc_strict_5.c,
even though they need 4 and 6 FADDAs respectively.
I'm not sure why:
if ((double_reduc || reduction_type != TREE_CODE_REDUCTION)
&& ncopies > 1)
was previously only necessary for non-SLP, but the patch preserves
that for safety.
gcc/
PR tree-optimization/116901
* tree-vect-loop.cc (vectorizable_reduction): Set ncopies to
SLP_TREE_NUMBER_OF_VEC_STMTS for SLP.
Before r14-2877-gbf67bf4880ce5be0, the aarch64 code assumed that
every multi-vector reduction would use single def-use cycles.
The patch fixed it to test what the vectoriser actually planned
to do, using newly provided information.
At the time, we didn't try to use single def-use cycles for any costed
variant in the associated testcase (gcc.target/aarch64/pr110625_1.c),
so it was enough to check that the single-def-use latency was never
printed to the dump file. However, we do now consider using single
def-use cycles for the single-lane SLP fallback.
This patch therefore switches to a positive test of the
non-single-def-use latency. I checked that the test still failed
in this form before r14-2877-gbf67bf4880ce5be0.
gcc/testsuite/
* gcc.target/aarch64/pr110625_1.c: Turn into a positive test for
a vector latency of 2, rather than a negative test for a vector
latency of 8.
Richard Biener [Tue, 11 Mar 2025 08:39:06 +0000 (09:39 +0100)]
Simple cobol.dg testsuite
The following adds a simple cobol.dg test harness, based on gfortran.dg.
It's invoked by make check-cobol, has three tests, two execution tests and
one test exercising dg-error. The existing FAIL is due to an assembling
error, tracked by PR119214.
Jakub Jelinek [Wed, 12 Mar 2025 07:27:17 +0000 (08:27 +0100)]
builtins: Fix up strspn/strcspn folding [PR119219]
The PR119204 r15-7955 fix caused some regressions.
The problem is that the fold_builtin* APIs document that expr is
either a CALL_EXPR of the call or NULL, so using TREE_TYPE (expr)
can crash e.g. during constexpr evaluation etc.
As can be seen in the surrounding patch, for the neighbouring builtins
(both modf and strpbrk) fold_builtin_2 passes down type, which is the
result type, TREE_TYPE (TREE_TYPE (fndecl)) and those builtins use it
to build the return value, while strspn was always building size_type_node
and strcspn had this change from that to TREE_TYPE (expr).
The patch passes type to these two and uses it there as well.
The patch keeps passing expr because it is used in the
check_nul_terminated_array calls done for both strspn and strcspn,
those calls clearly can deal with NULL expr but prefer if it is non-NULL
for some warning.
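For reference, both library functions return size_t, which is the result type the folded value has to carry. The sketch below shows the library semantics only; the wrapper names are hypothetical, and the empty-string cases are assumed here as typical folding candidates rather than taken from this patch.

```cpp
#include <cstddef>
#include <cstring>

// With an empty "accept" set no character can match, so the span is 0.
std::size_t spn_empty_accept(const char* s)
{
  return std::strspn(s, "");
}

// With an empty "reject" set nothing stops the scan, so the result is
// the length of s.
std::size_t cspn_empty_reject(const char* s)
{
  return std::strcspn(s, "");
}
```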
2025-03-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/119204
PR middle-end/119219
* builtins.cc (fold_builtin_2): Pass type as another argument
to fold_builtin_strspn and fold_builtin_strcspn.
(fold_builtin_strspn): Add type argument, use it instead of
size_type_node.
(fold_builtin_strcspn): Add type argument, use it instead of
TREE_TYPE (expr).
Jakub Jelinek [Wed, 12 Mar 2025 07:01:09 +0000 (08:01 +0100)]
c++: Handle RAW_DATA_CST in modules.cc [PR119076]
The following testcases (one with #embed, one with large initializer
turned into RAW_DATA_CST) show that I forgot to handle RAW_DATA_CST in
module streaming.
Similar to the PCH case we need to stream out RAW_DATA_CST with NULL
RAW_DATA_OWNER (i.e. a tree which has data owned by libcpp buffer) so
that it will be streamed back in as STRING_CST which owns the data,
but because the data can be really large (hopefully not so much for
header modules though), without actually trying to build a STRING_CST
on the module writing side because that would mean another large
allocation and copying of the large data.
RAW_DATA_CST with RAW_DATA_OWNER then needs to be streamed out and in
by streaming the owner and offset from owner's data and length.
* g++.dg/modules/pr119076-1_a.H: New test.
* g++.dg/modules/pr119076-1_b.C: New test.
* g++.dg/modules/pr119076-2_a.H: New test.
* g++.dg/modules/pr119076-2_b.C: New test.
Jakub Jelinek [Wed, 12 Mar 2025 06:46:25 +0000 (07:46 +0100)]
preprocessor: Fix up diagnostic typo in convert_oct [PR119202]
In r15-4286 I've introduced a typo, part of the change was
- cpp_error (pfile, CPP_DL_ERROR, "'\\o' not followed by '{'");
+ cpp_error (pfile, CPP_DL_ERROR, "%<\\o%> not followed by %<}%>");
which turned { into }. This patch fixes it back.
2025-03-12 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/119202
* charset.cc (convert_oct): Fix up typo in diagnostics about \o
not followed by {.
My/Kees' earlier patches adjusted -Wunterminated-string-initialization
warning so that it doesn't warn about initializers of nonstring decls
and that nonstring attribute is allowed on multi-dimensional arrays.
Unfortunately as this testcase shows, we still warn about initializers
of multi-dimensional array nonstring decls.
The problem is that in that case field passed to output_init_element
is actually INTEGER_CST, index into the array.
For RECORD_OR_UNION_TYPE_P (constructor_type) field is a FIELD_DECL
which we want to use, but otherwise (in arrays) IMHO we want to use
constructor_fields (which is the innermost FIELD_DECL whose part
is being initialized), or - if that is NULL - constructor_decl, the
whole decl being initialized with multi-dimensional array type.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR c/117178
* c-typeck.cc (output_init_element): Pass field to digest_init
only for record/union types, otherwise pass constructor_fields
if non-NULL and constructor_decl if constructor_fields is NULL.
* gcc.dg/Wunterminated-string-initialization-2.c: New test.
Andrew Pinski [Tue, 11 Mar 2025 06:10:01 +0000 (23:10 -0700)]
aarch64: Fix DFP constants [PR119131]
After r15-6660-g45d306a835cb3f865, in some cases
DFP constants would cause an ICE. This is due to
a mismatch of a few things. The predicate of the move
uses aarch64_valid_fp_move to say whether the constant is valid or not.
But after reload/LRA, when can_create_pseudo_p returns false, aarch64_valid_fp_move
would return false for constants that were valid for the constraints
of the instruction. A predicate that is stricter than the constraint is wrong.
In this case `Uvi` is the constraint, while aarch64_valid_fp_move allows it
via aarch64_can_const_movi_rtx_p for !DECIMAL_FLOAT_MODE_P; there is no such check
for DECIMAL_FLOAT_MODE_P.
The fix is to remove the !DECIMAL_FLOAT_MODE_P check in aarch64_valid_fp_move
and in the define_expand, so that the predicate now allows a superset of what is
allowed by the constraints.
aarch64_float_const_representable_p should be rejecting DFP modes, as they can't be used
with instructions like `mov s0, 1.0`.
Changes since v1:
* v2: Add check to aarch64_float_const_representable_p for DFP.
Built and tested on aarch64-linux-gnu with no regressions.
Jason Merrill [Mon, 10 Mar 2025 18:10:52 +0000 (14:10 -0400)]
c++: constexpr caching deleted pointer [PR119162]
In this testcase, we pass the checks for mismatched new/delete because the
pointer is deleted before it is returned. And then a subsequent evaluation
uses the cached value, but the deleted heap var isn't in
ctx->global->heap_vars anymore, so cxx_eval_outermost_constant_expr doesn't
run find_heap_var_refs, and ends up with garbage.
Fixed by not caching a reference to deleted.
I considered rejecting such a reference immediately as non-constant, but I
don't think that's valid; an invalid pointer value isn't UB until we try to
do something with it or it winds up in the final result of constant
evaluation.
I also considered not caching other heap references (i.e. using
find_heap_var_refs instead of adding find_deleted_heap_var), which would
include heap pointers passed in from the caller, but those don't have the
same heap_vars problem. We might want cxx_eval_outermost_constant_expr to
prune constexpr_call entries that refer to objects created during the
evaluation, but that applies to local variables and temporaries just as much
as heap "variables".
PR c++/119162
gcc/cp/ChangeLog:
* constexpr.cc (find_deleted_heap_var): New.
(cxx_eval_call_expression): Don't cache a
reference to heap_deleted.
Sandra Loosemore [Tue, 11 Mar 2025 16:36:22 +0000 (16:36 +0000)]
OpenMP/C: Store location in cp_parser_omp_var_list for kind=0 [PR118579]
This patch is the C equivalent of commit r15-6512-gcf94ba812ca496 for C++,
to improve the location information for individual items in an OpenMP
variable list.
gcc/c/ChangeLog
PR c/118579
* c-parser.cc (c_parser_omp_variable_list): Capture location
information when KIND is OMP_CLAUSE_ERROR.
(c_parser_oacc_data_clause_deviceptr): Use the improved location
for diagnostics, and remove the FIXME.
(c_finish_omp_declare_variant): Likewise.
(c_parser_omp_threadprivate): Likewise.
gcc/testsuite/ChangeLog
PR c/118579
* c-c++-common/gomp/pr118579.c: New testcase.
Jonathan Wakely [Thu, 6 Mar 2025 20:28:07 +0000 (20:28 +0000)]
contrib: Clean up outdated parts of gcc-git-customization.sh
It's very unlikely that anybody is still using the old remotes/$user Git
repo setup and still needs this script to be able to migrate it to the
remotes/users/$user structure. Simplify the script by removing those
parts.
This fixes an error that gets displayed in some circumstances:
fatal: no such section: remote.me
contrib/ChangeLog:
* gcc-git-customization.sh: Delete outdated commands for
migrating from very old git setups.
Iain Buclaw [Tue, 11 Mar 2025 16:56:18 +0000 (17:56 +0100)]
d: Fix regression returning from function with invariants [PR119139]
An optimization was added in GDC-12 which sets the TREE_READONLY flag on
all local variables with the storage class `const' assigned. For some
reason, const is also being added by the front-end to `__result'
variables in non-virtual functions, which ends up getting wrong code by
the gimplify pass promoting the local to static storage.
A bug has been raised upstream, as this looks like an error in the AST.
For now, turn off setting TREE_READONLY on all result variables.
PR d/119139
gcc/d/ChangeLog:
* decl.cc (get_symbol_decl): Don't set TREE_READONLY for __result
declarations.
Harald Anlauf [Mon, 10 Mar 2025 21:24:27 +0000 (22:24 +0100)]
Fortran: reject SAVE of a COMMON in a BLOCK construct [PR119199]
PR fortran/119199
gcc/fortran/ChangeLog:
* decl.cc (gfc_match_save): Reject SAVE statement of a COMMON block
when in a BLOCK construct.
* trans-common.cc (translate_common): Avoid NULL pointer dereference.
gcc/testsuite/ChangeLog:
* gfortran.dg/common_30.f90: New test.
* gfortran.dg/common_31.f90: New test.
gcc.target/aarch64/sve/pred-not-gen-[14].c started failing after r15-268-g9dbff9c05520a74e, but we didn't look at it in time for
GCC 15. This patch marks the failures as expected for now.
We should revisit for GCC 16.
See the PR for some discussion about what a GCC 16 fix might
look like.
Thomas Koenig [Tue, 11 Mar 2025 16:40:57 +0000 (17:40 +0100)]
Abstract interfaces and dummy arguments are not global.
The attached patch makes sure that procedures from abstract
interfaces and dummy arguments are not put into the global
symbol table, and are not checked against global symbols.
gcc/fortran/ChangeLog:
PR fortran/119078
* frontend-passes.cc (check_against_globals): Do not check
for abstract interfaces or dummy arguments.
* resolve.cc (gfc_verify_binding_labels): Adjust comment.
Do not put abstract interfaces or dummy argument into global
namespace.
gcc/testsuite/ChangeLog:
PR fortran/119078
* gfortran.dg/interface_58.f90: New test.
For many functions in tbz_2.c, it doesn't matter whether the code
tests a 32-bit or a 64-bit register. g6-g8 have started testing
32-bit registers, but the others could in future too.
gcc/testsuite/
* gcc.target/aarch64/tbz_2.c: Accept both 32-bit and 64-bit registers.
Juergen Christ [Mon, 10 Mar 2025 09:03:36 +0000 (10:03 +0100)]
s390: fix delegitimization of addresses
In legitimize_pic_address we create a
(const (unspec ... UNSPEC_GOTENT))
in case the GOT offset might be >= 4k. However, the
s390_delegitimize_address does not contain a case for this scenario.
Jakub Jelinek [Tue, 11 Mar 2025 13:34:01 +0000 (14:34 +0100)]
cobol: Fix up libgcobol configure [PR119216]
Sorry, seems I've screwed up the earlier libgcobol/configure.tgt change.
Looking in more detail, the way e.g. libsanitizer/configure.tgt works is
that it is sourced twice, once at toplevel and there it just sets
UNSUPPORTED=1 for fully unsupported triplets, and then inside of
libsanitizer/configure where it decides to include or not include the
various sublibraries depending on the *_SUPPORTED flags.
So, the following patch attempts to do the same for libgcobol as well.
The BUILD_LIBGCOBOL automake conditional was unused, this patch guards it
on LIBGCOBOL_SUPPORTED as well and guards with it
toolexeclib_LTLIBRARIES = libgcobol.la
Also, AM_CFLAGS has been changed to AM_CXXFLAGS as there are just C++
sources in the library.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR cobol/119216
* configure.ac: Check for UNSUPPORTED set by libgcobol/configure.tgt
rather than LIBGCOBOL_SUPPORTED.
* configure: Regenerate.
libgcobol/
* configure.tgt: On fully unsupported targets set UNSUPPORTED=1.
* configure.ac: Add AC_CHECK_SIZEOF([void *]), source in
configure.tgt and set BUILD_LIBGCOBOL also based on
LIBGCOBOL_SUPPORTED.
* Makefile.am (toolexeclib_LTLIBRARIES): Conditionalize on
BUILD_LIBGCOBOL.
(AM_CFLAGS): Rename to ...
(AM_CXXFLAGS): ... this.
(%.lo: %.cc): Use $(AM_CXXFLAGS) rather than $(AM_CFLAGS).
* configure: Regenerate.
* Makefile.in: Regenerate.
Jakub Jelinek [Tue, 11 Mar 2025 13:25:19 +0000 (14:25 +0100)]
cobol: libgcobol/Makefile.am cleanups
Looking at libgcobol.la, I see a lot of cruft, stuff that just shouldn't
be there because automake generates it correctly anyway, but also stuff
using undefined variables etc.
libgcobol.{a,so*} seems to build and install the same as before.
Note, I still see DT_RUNPATH in the installed libgcobol.so.1 before/after
this patch and I'd prefer not to see it, not seeing it in other libraries
like libstdc++.so.6 etc. Dunno if that is because of the dependency on
libstdc++ (but e.g. libstdc++ has dependency on libgcc_s and doesn't do
that).
H.J. Lu [Sun, 9 Mar 2025 14:00:23 +0000 (07:00 -0700)]
i386: Verify that argument registers are spilled properly
While working on a local x86 patch, which passed the GCC testsuite, I got
a compiler error:
In function ‘paravirt_read_msr’,
inlined from ‘perf_ibs_handle_irq’ at arch/x86/events/amd/ibs.c:1055:2:
./arch/x86/include/asm/paravirt_types.h:397:17: error: ‘asm’ operand has impossible constraints or there are not enough registers
397 | asm volatile(ALTERNATIVE(PARAVIRT_CALL, ALT_CALL_INSTR, \
| ^~~
when building x86-64 Linux kernel. RDI, RSI, RDX and RCX registers are
used to pass arguments in 64-bit mode. EAX, EDX and ECX registers are
used to pass arguments in 32-bit mode. But there is no coverage in the
GCC testsuite. Add tests to verify that argument registers are spilled
properly.
PR target/119171
* gcc.target/i386/pr119171-1.c: New test.
* gcc.target/i386/pr119171-2.c: Likewise.
Richard Earnshaw [Tue, 11 Mar 2025 10:48:54 +0000 (10:48 +0000)]
arm: testsuite: fix arm_neon_h checks with conflicting cpu/arch
GCC will complain if the -mcpu flag specifies a different architecture
to that specified in -march, but if the floating-point ABI is "soft",
then differences in the floating-point architecture features are
ignored.
However, the arm_libc_fp_abi checks whether we change the FP ABI by
adding -mfloat-abi=hard/softfp to override the defaults. If that
fails it won't add anything.
Unfortunately arm_neon_h_ok wasn't correctly checking whether the libc
check had worked and just assumed that it would always add something
to enable FP. That's insufficient and we need to consider this failure.
We simply mark tests as unsupported in this case.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_arm_neon_h_ok_nocache): Return zero if
check_effective_target_arm_libc_fp_abi_ok reports failure.
Iain Sandoe [Tue, 11 Mar 2025 09:56:18 +0000 (09:56 +0000)]
configure, Darwin: Require explicit selection of COBOL.
By default, Darwin does not have sufficient tools to build COBOL
so we do not want to include it in --enable-languages=all since
this will break regular testing of all supported languages.
However, we do want to be able to build it on demand (where the
build system has sufficiently new tools) and so do not want to
disable it permanently.
ChangeLog:
* configure: Regenerate.
* configure.ac: Do not build COBOL on Darwin by default,
even for --enable-languages=all.
Jakub Jelinek [Tue, 11 Mar 2025 10:08:27 +0000 (11:08 +0100)]
cobol: Fix --enable-link-serialization build
--enable-link-serialization relies on each FE participating properly,
setting <lang>.serial, depending on $(<lang>.prev) and printing progress.
The configure option is mainly for LTO bootstraps when we don't want to link
all the FEs at once because that can consume too much memory.
The comment changes are unrelated, just something I've spotted while
working on this. .exe is a Windows suffix, so either we shouldn't
talk about suffixes in the comments or use there $(exeext) as well
to make it clear that it is dependent on the host/build.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in: Remove .exe extension from comments.
(cobol.serial): Set to cobol1$(exeext).
(cobol1$(exeext)): Depend on $(cobol.prev). Add
LINK_PROGRESS calls before/after the link command.
Jakub Jelinek [Tue, 11 Mar 2025 10:07:15 +0000 (11:07 +0100)]
cobol: Use *.cc suffix for bison/flex generated C++ files
In GCC 12 we've switched to using *.cc suffixes for C++ sources in GCC
sources, including generated files, instead of using *.c suffixes and
compiling them as C++ anyway (that was the case since we've switched
GCC to C++ in GCC 4.8).
I've noticed gcc/cobol has 3 generated files still with c extension
despite clearly having C++ code in it and being compiled as C++.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in (cobol/parse.c, cobol/cdf.c, cobol/scan.c): Remove.
(cobol/parse.cc, cobol/cdf.cc, cobol/scan.cc): New goals.
(cobol/cdf.o): Depend on cobol/cdf.cc rather than cobol/cdf.c.
(cobol/parse.o): Depend on cobol/parse.cc rather than cobol/parse.c.
(cobol/scan.o): Depend on cobol/scan.cc rather than cobol/scan.c,
on cobol/cdf.cc rather than cobol/cdf.c and on cobol/parse.cc rather
than cobol/parse.c.
(cobol.srcextra): Depend on cobol/parse.cc cobol/cdf.cc cobol/scan.cc
rather than cobol/parse.c cobol/cdf.c cobol/scan.c.
Jakub Jelinek [Tue, 11 Mar 2025 10:05:13 +0000 (11:05 +0100)]
Make libgcobol/configure.tgt more similar to other libraries
When we know libgcobol is unsupported on 32-bit arches, we should just say
so in configure.tgt, the same way as on other targets.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
* configure.tgt: Only set LIBGCOBOL_SUPPORTED for lp64
multilibs of powerpc64le-*-linux* and x86_64-*-linux*. Handle
i?86-*-linux* the same as x86_64-*-linux*.
Jakub Jelinek [Tue, 11 Mar 2025 10:01:55 +0000 (11:01 +0100)]
tree: Improve skip_simple_arithmetic [PR119183]
The following testcase takes a very long time to compile, because
skip_simple_arithmetic decides to first call tree_invariant_p on
the second argument (and indirectly recurse there). I think before
canonicalization of operands for commutative binary expressions
(and for non-commutative ones always) it is pretty common that the
first operand is a constant, something which tree_invariant_p handles
immediately, so the following patch special cases that; I've added
there a tree_invariant_p call too after the checks, while it is not
really needed currently, tree_invariant_p has the same checks, I wanted
to be prepared in case tree_invariant_p changes. But if you think
I should avoid it, I can drop it too.
This is just a partial fix, I think one can certainly construct a testcase
which will still have horrible compile time complexity (but I've tried and
haven't managed to do so), so perhaps we should just limit the recursion
depth through skip_simple_arithmetic/tree_invariant_p with some defaulted
argument.
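The operand-ordering idea can be illustrated with a toy model. This is only a sketch under stated assumptions: Node, invariant_p and skip_simple are hypothetical stand-ins, not GCC's tree, tree_invariant_p or skip_simple_arithmetic, and the cheap "is it a constant" test stands in for the TREE_CONSTANT / TREE_READONLY checks named in the ChangeLog entry.

```cpp
#include <cassert>

// Toy expression node; NOT GCC's tree representation.
struct Node {
  char kind;        // 'c' = constant, 'v' = variable, '+' = binary operation
  const Node *op0;  // first operand (binary nodes only)
  const Node *op1;  // second operand (binary nodes only)
};

// Counts how often the potentially expensive recursive check runs.
static int invariant_calls = 0;

// Stand-in for an invariance check that may recurse into both operands.
static bool invariant_p(const Node *n) {
  assert(n != nullptr);
  ++invariant_calls;
  if (n->kind == 'c') return true;
  if (n->kind == 'v') return false;
  return invariant_p(n->op0) && invariant_p(n->op1);
}

// Strip binary operations that have one invariant operand and continue
// with the other operand.
static const Node *skip_simple(const Node *n) {
  assert(n != nullptr);
  while (n->kind == '+') {
    if (n->op0->kind == 'c')
      n = n->op1;  // cheap test on the first operand, no recursion needed
    else if (invariant_p(n->op1))
      n = n->op0;  // otherwise fall back to checking the second operand
    else if (invariant_p(n->op0))
      n = n->op1;
    else
      break;
  }
  return n;
}
```

For a chain such as c + (c + (c + v)), this sketch reaches v without ever calling invariant_p, whereas checking the second operand first would recurse into the nested subexpression at every level.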
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR c/119183
* tree.cc (skip_simple_arithmetic): If first operand of binary
expr is TREE_CONSTANT or TREE_READONLY with no side-effects, call
tree_invariant_p on that operand first instead of on the second.
Jakub Jelinek [Tue, 11 Mar 2025 09:57:30 +0000 (10:57 +0100)]
complex: Don't DCE unused COMPLEX_EXPRs for -O0 [PR119190]
The PR116463 r15-3128 change regressed the following testcase at -O0.
While for -O1+ we can do -fvar-tracking-assignments, for -O0 we don't
(partly because it is compile time expensive and partly because at -O0
most of the vars live most of their lifetime in memory slots), so if we
DCE some statements, it can mean that DW_AT_location for some vars won't
be available or even it won't be possible to put a breakpoint at some
particular line in the source.
We normally perform dce just in the subpasses of
pass_local_optimization_passes or pass_all_optimizations or
pass_all_optimizations_g, so don't do that at all for -O0. So the complex
change is an exception. And it was described as a way to help forwprop and
reassoc, neither applies to -O0.
This regresses PR119120 again though, I'll post a patch for that momentarily.
2025-03-11 Jakub Jelinek <jakub@redhat.com>
PR debug/119190
* tree-complex.cc (update_complex_assignment, tree_lower_complex):
Perform simple dce on dce_worklist only if optimize.
Deprecate support for the ESA/390 architecture which will be eventually
removed, and encourage the usage of the z/Architecture instead.
Furthermore, default for -m31 to -mzarch whereas previously we defaulted
to -mesa.
gcc/ChangeLog:
* config.gcc: Fail in case of option --with-mode=esa.
* config/s390/s390.cc (s390_option_override_internal): Default
to z/Architecture mode.
* config/s390/s390.h (DRIVER_SELF_SPECS): Ditto.
* config/s390/s390.opt: Emit a warning for option -mesa.
* doc/invoke.texi: Document the change.
Currently insn_cost() only considers the source part of a SET.
Implement TARGET_INSN_COST in order to also take the destination into
account. This may make a difference in case of a MEM where the address
is a SYMBOL_REF.
James K. Lowden [Thu, 6 Mar 2025 21:25:09 +0000 (16:25 -0500)]
COBOL: Frontend
gcc/cobol/
* LICENSE: New file.
* Make-lang.in: New file.
* config-lang.in: New file.
* lang.opt: New file.
* lang.opt.urls: New file.
* cbldiag.h: New file.
* cdfval.h: New file.
* cobol-system.h: New file.
* copybook.h: New file.
* dts.h: New file.
* exceptg.h: New file.
* gengen.h: New file.
* genmath.h: New file.
* genutil.h: New file.
* inspect.h: New file.
* lang-specs.h: New file.
* lexio.h: New file.
* parse_ante.h: New file.
* parse_util.h: New file.
* scan_ante.h: New file.
* scan_post.h: New file.
* show_parse.h: New file.
* structs.h: New file.
* symbols.h: New file.
* token_names.h: New file.
* util.h: New file.
* cdf-copy.cc: New file.
* lexio.cc: New file.
* scan.l: New file.
* parse.y: New file.
* genapi.cc: New file.
* genapi.h: New file.
* gengen.cc: New file.
* genmath.cc: New file.
* genutil.cc: New file.
* cdf.y: New file.
* cobol1.cc: New file.
* convert.cc: New file.
* except.cc: New file.
* gcobolspec.cc: New file.
* structs.cc: New file.
* symbols.cc: New file.
* symfind.cc: New file.
* util.cc: New file.
* gcobc: New file.
* gcobol.1: New file.
* gcobol.3: New file.
* help.gen: New file.
* udf/stored-char-length.cbl: New file.
James K. Lowden [Mon, 10 Mar 2025 15:08:42 +0000 (16:08 +0100)]
COBOL: libgcobol
libgcobol/
* Makefile.am: New file.
* Makefile.in: Autogenerate.
* acinclude.m4: Likewise.
* aclocal.m4: Likewise.
* configure.ac: New file.
* configure: Autogenerate.
* configure.tgt: New file.
* README: New file.
* charmaps.cc: New file.
* config.h.in: New file.
* constants.cc: New file.
* gfileio.cc: New file.
* gmath.cc: New file.
* io.cc: New file.
* valconv.cc: New file.
* charmaps.h: New file.
* common-defs.h: New file.
* ec.h: New file.
* exceptl.h: New file.
* gcobolio.h: New file.
* gfileio.h: New file.
* gmath.h: New file.
* io.h: New file.
* libgcobol.h: New file.
* valconv.h: New file.
* libgcobol.cc: New file.
* intrinsic.cc: New file.
aarch64: Avoid unnecessary use of 2-input TBLs [PR115258]
When using TBL for (say) a V4SI permutation, the aarch64 port first
asks target-independent code to lower to a V16QI permutation.
Then, during code generation, an input like:
But subregs (unlike regs) are not shared, so the op0 == op1 check
always failed for this case. We'd then force each subreg into a
fresh register, meaning that during the later:
there is no way for aarch64_expand_vec_perm_1 to realise that
d->op0 and d->op1 are the same value. It would therefore generate
a two-input TBL in the testcase, even though a single-input TBL
is enough.
I'm not sure forcing subregs to a fresh register is a good idea --
it caused problems for copysign & co. -- but that's not something
to fiddle with during stage 4. Using op0 == op1 for rtx equality
is independently wrong, so we might as well just fix that for now.
The patch gets rid of extra MOVs that are a regression from GCC 14.
The testcase is based on one from Kugan, itself based on TSVC.
gcc/
PR target/115258
* config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Use
d.one_vector_p to decide whether op1 should be a copy of op0.
gcc/testsuite/
PR target/115258
* gcc.target/aarch64/pr115258_2.c: New test.
In PR test case IRA preferred to allocate hard reg to a pseudo instead
of its equivalence. This resulted in allocating caller-saved hard reg
and generating save/restore insns in the function prologue/epilogue.
The equivalence is an invariant (stack pointer plus offset) and the
pseudo is used mostly as memory address. This happened as there was
no simplification of insn after the invariant substitution. The patch
adds the necessary code.
gcc/ChangeLog:
PR target/114991
* ira-costs.cc (equiv_can_be_consumed_p): Add new argument invariant_p.
Add code for dealing with the invariant.
(calculate_equiv_gains): Don't consider init insns. Pass the new
argument to equiv_can_be_consumed_p. Don't treat invariant as
memory.
gcc/testsuite/ChangeLog:
PR target/114991
* gcc.target/aarch64/pr114991.c: New test.
Nathaniel Shead [Fri, 31 Jan 2025 12:53:35 +0000 (23:53 +1100)]
c++/modules: Handle exposures of TU-local types in uninstantiated member templates
Previously, 'is_tu_local_entity' wouldn't detect the exposure of the (in
practice) TU-local lambda in the following example, unless instantiated:
struct S {
template <typename>
static inline decltype([]{}) x = {};
};
This is for two reasons. Firstly, when traversing the TYPE_FIELDS of S
we only see the TEMPLATE_DECL, and never end up building a dependency on
its DECL_TEMPLATE_RESULT (due to not being instantiated). This patch
fixes this by stripping any templates before checking for unnamed types.
The second reason is that we currently assume all class-scope entities
are not TU-local. Despite this being unambiguous in the standard, this
is not actually true in our implementation just yet, due to issues with
mangling lambdas in some circumstances. Allowing these lambdas to be
exported can cause issues in importers with apparently conflicting
declarations, so this patch treats them as TU-local as well.
After these changes, we now get double diagnostics from the two ways
that we can see the above lambda being exposed, via 'S' (through
TYPE_FIELDS) or via 'S::x'. To work around this we hide diagnostics from
the first case, so we only get errors from 'S::x' which will be closer
to the point the offending lambda is declared.
gcc/cp/ChangeLog:
* module.cc (trees_out::has_tu_local_dep): Also look at the
TI_TEMPLATE if we don't find a dep for a decl.
(depset::hash::is_tu_local_entity): Handle unnamed template
types, treat lambdas specially.
(is_exposure_of_member_type): New function.
(depset::hash::add_dependency): Use it.
(depset::hash::finalize_dependencies): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-10.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Christophe Lyon [Wed, 15 Jan 2025 17:11:33 +0000 (17:11 +0000)]
arm: [MVE] Fix predicates for vec_cmp, vec_vcmpu and vcond_mask (PR 115439)
When compiling c-c++-common/vector-compare-3.c with
-march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto
(which enables MVE), we fail to match vcond_mask because operand 3 has
s_register_operand as predicate for a MVE_VPRED mode, but we try to
match:
(insn 26 25 27 2 (set (reg:V4SI 137)
(unspec:V4SI [
(reg:V4SI 144)
(reg:V4SI 145)
(subreg:V4BI (reg:HI 143) 0)
] VPSELQ_S)) "/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1
(nil))
The fix is to use the right predicate: vpr_register_operand.
The patch also fixes vec_cmp and vec_cmpu in the same way.
When testing with
-mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto
it fixes the ICEs in c-c++-common/vector-compare-3.c,
g++.dg/opt/pr79734.C, g++.dg/tree-ssa/pr111150.C and
gcc.dg/tree-ssa/pr111150.c
gcc/ChangeLog
PR target/115439
* config/arm/mve.md (vec_vcmp, vec_vcmpu, vcond_mask): Use
vpr_register_operand predicate for MVE_VPRED operands.
Fortran: Fix gimplification error for pointer remapping in forall [PR107143]
Enhance dependency checking for data pointers to check for same derived
type and not only for a type being a derived type. This prevents
generation of a descriptor for a function call, that is unsuitable in
forall's pointer assignment.
PR fortran/107143
gcc/fortran/ChangeLog:
* dependency.cc (check_data_pointer_types): Do not just compare
for derived type, but for same derived type.
Jakub Jelinek [Mon, 10 Mar 2025 09:34:00 +0000 (10:34 +0100)]
libgcc: Fix up unwind-dw2-btree.h [PR119151]
The following testcase shows a bug in unwind-dw2-btree.h.
In short, the header provides lock-free btree data structure (so no parent
link on nodes, both insertion and deletion are done in top-down walks
with some locking of just a few nodes at a time so that lookups can notice
concurrent modifications and retry, non-leaf (inner) nodes contain keys
which are initially the base address of the left-most leaf entry of the
following child (or all ones if there is none) minus one, insertion ensures
balancing of the tree to ensure [d/2, d] entries filled through aggressive
splitting if it sees a full tree while walking, deletion performs various
operations like merging neighbour trees, merging into parent or moving some
nodes from neighbour to the current one).
What differs from the textbook implementations is mostly that the leaf nodes
don't include just address as a key, but address range, address + size
(where we don't insert any ranges with zero size) and the lookups can be
performed for any address in the [address, address + size) range. The keys
on inner nodes are still just address-1, so the child covers all nodes
where addr <= key unless it is covered already in children to the left.
The user (static executables or JIT) should always ensure there is no
overlap in between any of the ranges.
In the testcase a bunch of insertions are done, always followed by one
removal, followed by one insertion of a range slightly different from the
removed one. E.g. in the first case [&code[0x50], &code[0x59]] range
is removed and then we insert [&code[0x4c], &code[0x53]] range instead.
This is valid, it doesn't overlap anything. But the problem is that some
non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions
completely correctly). On removal, nothing adjusts the keys on the parent
nodes (it really can't in the top-down only walk, the keys could be many nodes
above it and unlike insertion, removal only knows the start address, doesn't
know the removed size and so will discover it only when reaching the leaf
node which contains it; plus even if it knew the address and size, it still
doesn't know what the second left-most leaf node will be (i.e. the one after
removal)). And on insertion, if nodes aren't split at a level, nothing
adjusts the inner keys either. If a range is inserted and is either fully
below key (keys are - 1, so having address + size - 1 being equal to key is
fine) or fully after key (i.e. address > key), it works just fine, but if
the key is in the middle of the range like in this case, &code[0x4f] is in the
middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine
(we only use size on the leaf nodes), and lookup of the addresses below
the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed).
The problem is with lookups after the key (i.e. [&code[0x50], &code[0x53]]),
the lookup looks for them in different children of the btree and doesn't
find an entry and returns NULL.
As users need to ensure non-overlapping entries at any time, the following
patch fixes it by adjusting keys during insertion where we know not just
the address but also size; if we find during the top-down walk a key
which is in the middle of the range being inserted, we simply increase the
key to be equal to address + size - 1 of the range being inserted.
There can't be any existing leaf nodes overlapping the range in correct
programs and the btree rebalancing done on deletion ensures we don't have
any empty nodes which would also cause problems.
The patch adjusts the keys in two spots, once for the current node being
walked (the last hunk in the header, with large comment trying to explain
it) and once during inner node splitting in a parent node if we'd otherwise
try to add that key in the middle of the range being inserted into the
parent node (in that case it would be missed in the last hunk).
The testcase covers both of those spots, so succeeds with GCC 12 (which
didn't have btrees) and fails with vanilla GCC trunk and also fails if
either the
if (fence < base + size - 1)
fence = iter->content.children[slot].separator = base + size - 1;
or
if (left_fence >= target && left_fence < target + size - 1)
left_fence = target + size - 1;
hunk is removed (of course, only with the current node sizes, i.e. up to
15 children of inner nodes and up to 10 entries in leaf nodes).
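The effect of the two quoted hunks can be checked in isolation on the numbers from the description. The following is a minimal sketch, not the code in unwind-dw2-btree.h: raise_fence, raise_left_fence and descends_into are hypothetical helper names, plain offsets stand in for the &code[...] addresses, and child selection is reduced to the "addr <= key" rule stated above.

```cpp
#include <cstdint>

using addr_t = std::uintptr_t;

// Mirrors the first hunk: if the fence of the slot being walked lies
// below the last address of the inserted range, raise it to that address.
constexpr addr_t raise_fence(addr_t fence, addr_t base, addr_t size) {
  if (fence < base + size - 1)
    fence = base + size - 1;
  return fence;
}

// Mirrors the second hunk: during an inner-node split, only raise the
// left fence when it falls inside the range being inserted.
constexpr addr_t raise_left_fence(addr_t left_fence, addr_t target,
                                  addr_t size) {
  if (left_fence >= target && left_fence < target + size - 1)
    left_fence = target + size - 1;
  return left_fence;
}

// Simplified lookup rule: a child covers every address up to its key.
constexpr bool descends_into(addr_t addr, addr_t fence) {
  return addr <= fence;
}

// Range [0x4c, 0x53] (size 8) inserted under a stale key of 0x4f.
static_assert(!descends_into(0x50, 0x4f), "lookup misses with stale key");
static_assert(raise_fence(0x4f, 0x4c, 8) == 0x53, "key raised to range end");
static_assert(descends_into(0x50, raise_fence(0x4f, 0x4c, 8)),
              "lookup reaches the right child after the adjustment");
```

With the stale key 0x4f, a lookup of offset 0x50 would be routed to a different child; after raising the key to 0x53 the whole inserted range is reachable through the same child.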
2025-03-10 Jakub Jelinek <jakub@redhat.com>
Michael Leuchtenburg <michael@slashhome.org>
PR libgcc/119151
* unwind-dw2-btree.h (btree_split_inner): Add size argument. If
left_fence is in the middle of [target,target + size - 1] range,
increase it to target + size - 1.
(btree_insert): Adjust btree_split_inner caller. If fence is smaller
than base + size - 1, increase it and separator of the slot to
base + size - 1.
Xi Ruoyao [Fri, 7 Mar 2025 04:49:54 +0000 (12:49 +0800)]
LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]
When we call loongarch_reassoc_shift_bitwise for
<optab>_alsl_reversesi_extend, the mask is in DImode but we are trying
to operate it in SImode, causing an ICE.
To fix the issue, sign-extend the mask into the mode we want. And also
specially handle the case the mask is extended into -1 to avoid a
mis-optimization.
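The arithmetic behind the sign extension can be sketched on plain integers. This is an illustration only: sign_extend_to_si and extends_to_minus_one are hypothetical helpers, the real code operates on RTL constants in loongarch_reassoc_shift_bitwise, and the conversion to a signed 32-bit value relies on the modular behaviour that GCC documents (and that C++20 guarantees).

```cpp
#include <cstdint>

// Keep the low 32 bits of a 64-bit mask and sign-extend them back to
// 64 bits, i.e. view the mask as a 32-bit (SImode-like) signed value.
constexpr std::int64_t sign_extend_to_si(std::uint64_t mask) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(mask));
}

// A mask whose low 32 bits are all ones becomes -1 after the extension.
constexpr bool extends_to_minus_one(std::uint64_t mask) {
  return sign_extend_to_si(mask) == -1;
}

static_assert(sign_extend_to_si(0x00000000ffffffffULL) == -1,
              "0xffffffff is -1 when viewed as a 32-bit value");
static_assert(sign_extend_to_si(0x000000007fffffffULL) == 0x7fffffff,
              "positive 32-bit values are unchanged");
static_assert(extends_to_minus_one(0xffffffffULL),
              "the all-ones case is detectable after extension");
```

A mask of 0xffffffff is an ordinary positive number in the wider mode but becomes -1 in the narrower one, which is presumably why that case needs separate handling.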
gcc/ChangeLog:
PR target/119127
* config/loongarch/loongarch.cc
(loongarch_reassoc_shift_bitwise): Sign extend mask to mode,
specially handle the case it's extended to -1.
* config/loongarch/loongarch.md
(loongarch_reassoc_shift_bitwise): Update the comment for the
special case.
Jakub Jelinek [Mon, 10 Mar 2025 08:33:55 +0000 (09:33 +0100)]
libgcc: Formatting fixes for unwind-dw2-btree.h
Studying unwind-dw2-btree.h was really hard for me because
the formatting is wrong or weird in many ways all around the code
and that kept distracting my attention.
That includes all kinds of things, including wrong indentation, using
{} around single statement substatements, excessive use of ()s around
some parts of expressions which don't increase code clarity, no space
after dot in comments, some comments not starting with capital letters,
some not ending with dot, adding {} around some parts of code without
any obvious reason (and when it isn't done in a similar neighboring
function) or ( at the end of line without any reason.
The following patch fixes the formatting issues I found, no functional
changes.
Jakub Jelinek [Mon, 10 Mar 2025 08:31:41 +0000 (09:31 +0100)]
gimple-ssa-warn-access: Adjust maybe_warn_nonstring_arg for nonstring multidimensional arrays [PR117178]
The following patch fixes 4 xfails in attr-nonstring-11.c (and results in 2
false positive warnings in attr-nonstring-12.c not being produced either).
The thing is that maybe_warn_nonstring_arg simply assumed that nonstring
arrays must be single-dimensional, so when it sees a nonstring decl with
ARRAY_TYPE, it just used its dimension. With multi-dimensional arrays
that is not the right dimension to use though, it can be dimension of
some outer dimension, e.g. if we have
char a[5][6][7] __attribute__((nonstring)) if decl is
a[5] it would assume maximum non-NUL terminated string length of 5 rather than
7, if a[5][6] it would assume 6 and only for a[5][6][0] it would assume the
correct 7. So, the following patch looks through all the outer dimensions
to reach the innermost one (which for attribute nonstring is guaranteed to
have char/unsigned char/signed char element type).
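The difference between the outermost and the innermost dimension can be shown with standard type traits. This is only an analogy at the C++ type level, not the tree walk in maybe_warn_nonstring_arg; innermost_extent is a hypothetical helper written for this illustration and needs C++17.

```cpp
#include <cstddef>
#include <type_traits>

// Peel off outer array dimensions until a one-dimensional array (or a
// non-array type) remains, then report its extent.
template <typename T>
constexpr std::size_t innermost_extent() {
  if constexpr (std::rank_v<T> <= 1)
    return std::extent_v<T>;
  else
    return innermost_extent<std::remove_extent_t<T>>();
}

using A = char[5][6][7];

// Looking only at the type's own (outermost) dimension gives 5 ...
static_assert(std::extent_v<A> == 5, "outermost dimension");
// ... while the longest possible non-NUL-terminated string is 7 bytes.
static_assert(innermost_extent<A>() == 7, "innermost dimension");
```

For the three-dimensional array from the description, the innermost extent is 7 regardless of how many outer dimensions are still present.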
2025-03-10 Jakub Jelinek <jakub@redhat.com>
PR c/117178
* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Look through
multi-dimensional array types, stop at the innermost ARRAY_TYPE.
* c-c++-common/attr-nonstring-11.c: Remove xfails.
* c-c++-common/attr-nonstring-12.c (warn_strcmp_cst_1,
warn_strcmp_cst_2): Don't expect any warnings here.
(warn_strcmp_cst_3, warn_strcmp_cst_4): New functions with expected
warnings.
The issue is the same as 12383255fe4e82c31f5e42c72a8fbcb1b5dea35d.
Neither is .REDUC_PLUS set for V2SImode on LoongArch, so add it
to the list of targets not expecting BB vectorization.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-77.c: Add loongarch*-*-* to the list
of expected failing targets.