Jonathan Wakely [Tue, 17 Dec 2024 21:32:19 +0000 (21:32 +0000)]
libstdc++: Fix std::future::wait_until for subsecond negative times [PR118093]
The current check for negative times (i.e. before the epoch) only checks
for a negative number of seconds. For a time 1ms before the epoch the
seconds part will be zero, but the futex syscall will still fail with an
EINVAL error. Extend the check to handle this case.
This change adds a redundant check in the headers too, so that we avoid
even calling into the library for negative times. Both checks can be
marked [[unlikely]]. The check in the headers avoids the cost of
splitting the time into seconds and nanoseconds and then making a PLT
call. The check inside the library matches where we were checking
already, and fixes existing binaries that were compiled against older
headers but use a newer libstdc++.so.6 at runtime.
libstdc++-v3/ChangeLog:
PR libstdc++/118093
* include/bits/atomic_futex.h (_M_load_and_test_until_impl):
Return false for times before the epoch.
* src/c++11/futex.cc (_M_futex_wait_until): Extend check for
negative times to check for subsecond times. Add unlikely
attribute.
(_M_futex_wait_until_steady): Likewise.
* testsuite/30_threads/future/members/118093.cc: New test.
We have several overloads of std::deque::_M_insert_aux, one of which is
variadic and called by std::deque::emplace. With a suitable set of
arguments to emplace, it's possible for one of the non-variadic
_M_insert_aux overloads to be selected by overload resolution, making
emplace ill-formed.
Rename the variadic _M_insert_aux to _M_emplace_aux so that calls to
emplace never select an _M_insert_aux overload. Also add an inline
_M_insert_aux for the const lvalue overload that is called from
insert(const_iterator, const value_type&).
Richard Biener [Wed, 8 Jan 2025 08:25:52 +0000 (09:25 +0100)]
tree-optimization/117979 - failed irreducible loop update from DCE
When CD-DCE creates forwarders to reduce false control dependences
it fails to update the irreducible state of edge and the forwarder
block in case the fowarder groups both normal (entry) and edges
from an irreducible region (necessarily backedges). This is because
when we split the first edge, if that's a normal edge, the forwarder
and its edge to the original block will not be marked as part
of the irreducible region but when we then redirect an edge from
within the region it becomes so.
The following fixes this up.
Note I think creating a forwarder that includes backedges is
likely not going to help, but at this stage I don't want to change
the CFG going into DCE. For regular loops we'll have a single
entry and a single backedge by means of loop init and will never
create a forwarder - so this is solely happening for irreducible
regions where it's harder to prove that such forwarder doesn't help.
PR tree-optimization/117979
* tree-ssa-dce.cc (make_forwarders_with_degenerate_phis):
Properly update the irreducible region state.
DWARF has voted in recently https://dwarfstd.org/issues/241209.1.html ,
which is basically just a guarantee that the DWARF 6 draft
DW_AT_language_{name,version} attribute codes and content of
https://dwarfstd.org/languages-v6.html can be used as an extension
in DWARF 5 and won't be changed.
So, this patch is an alternative to the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
patch, which had the major problem that it required changing all the
DWARF consumers to be able to debug C17 or later or C++17 or later
sources.
This patch uses still DWARF 5 DW_LANG_C11 or DW_LANG_C_plus_plus_14,
the latest code in DWARF 5 proper, so all DWARF 5 capable consumers
should be able to deal with that, but additionally emits the
DWARF 6 attributes so that newer DWARF consumers can see it isn't
just C++14 but say C++23 or C11 but C23. Consumers which don't know
those DWARF 6 attributes would just ignore them. This is like any other
-gno-strict-dwarf extension, except that normally we emit say DWARF 5
codes where possible only after DWARF 5 is released, while in this case
there is a guarantee it can be used before DWARF 6 is released.
2025-01-08 Jakub Jelinek <jakub@redhat.com>
include/
* dwarf2.h (enum dwarf_source_language): Fix comment pasto.
(enum dwarf_source_language_name): New type.
* dwarf2.def (DW_AT_language_name, DW_AT_language_version): New
DWARF 6 codes.
gcc/
* dwarf2out.cc (break_out_comdat_types): Copy over
DW_AT_language_{name,version} if present.
(output_skeleton_debug_sections): Remove also
DW_AT_language_{name,version}.
(gen_compile_unit_die): For C17, C23, C2Y, C++17, C++20, C++23
and C++26 emit for -gdwarf-5 -gno-strict-dwarf also
DW_AT_language_{name,version} attributes.
gcc/testsuite/
* g++.dg/debug/dwarf2/lang-cpp17.C: Add -gno-strict-dwarf to
dg-options. Check also for DW_AT_language_{name,version} values.
* g++.dg/debug/dwarf2/lang-cpp20.C: Likewise.
* g++.dg/debug/dwarf2/lang-cpp23.C: New test.
Richard Biener [Tue, 7 Jan 2025 10:15:43 +0000 (11:15 +0100)]
tree-optimization/118269 - SLP reduction chain and early breaks
When we create the SLP reduction chain epilogue for the PHIs for
the early exit we fail to properly classify the reduction as SLP
reduction chain. The following fixes the corresponding checks.
PR tree-optimization/118269
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Use the correct stmt for the REDUC_GROUP_FIRST_ELEMENT lookup.
* gcc.dg/vect/vect-early-break_131-pr118269.c: New testcase.
Jeevitha [Wed, 8 Jan 2025 07:03:12 +0000 (01:03 -0600)]
testsuite: Simplify target test and dg-options for AMO tests
Removed powerpc*-*-* from the target test as it is always true. Simplified
options by removing -mpower9-misc and -mvsx, which are enabled by default with
-mdejagnu-cpu=power9. The has_arch_pwr9 check is also true with
-mdejagnu-cpu=power9, so it has been removed.
* gcc.target/powerpc/amo1.c: Removed powerpc*-*-* from the target and
simplified dg-options.
* gcc.target/powerpc/amo2.c: Simplified dg-options and added powerpc_vsx
target check.
Hongyu Wang [Thu, 2 Jan 2025 02:29:27 +0000 (10:29 +0800)]
i386: Add br_mispredict_scale in cost table.
For later processors, the pipeline went deeper so the penalty for
untaken branch can be larger than before. Add a new parameter
br_mispredict_scale to describe the penalty, and adopt to
noce_max_ifcvt_seq_cost hook to allow longer sequence to be
converted with cmove.
This improves cpu2017 544 with -Ofast -march=native for 14% on P-core
SPR, and 8% on E-core SRF. No other regression observed.
gcc/ChangeLog:
* config/i386/i386.cc (ix86_noce_max_ifcvt_seq_cost): Adjust
cost with ix86_tune_cost->br_mispredict_scale.
* config/i386/i386.h (processor_costs): Add br_mispredict_scale.
* config/i386/x86-tune-costs.h: Add new br_mispredict_scale to
all processor_costs, in which icelake_cost/alderlake_cost
with value COSTS_N_INSNS (2) + 3 and other processor with value
COSTS_N_INSNS (2).
Pan Li [Thu, 12 Dec 2024 02:48:08 +0000 (10:48 +0800)]
Match: Refactor the signed SAT_* match for saturated value [NFC]
This patch would like to refactor the all signed SAT_* patterns for
the saturated value. Aka, overflow to INT_MAX when > 0 and downflow
to INT_MIN when < 0. Thus, we can remove sorts of duplicated expression
in different patterns.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* match.pd: Extract saturated value match for signed SAT_*.
Pan Li [Wed, 11 Dec 2024 11:37:06 +0000 (19:37 +0800)]
Match: Refactor the signed SAT_TRUNC match patterns [NFC]
This patch would like to refactor the all signed SAT_TRUNC patterns,
aka:
* Extract type check outside.
* Re-arrange the related match pattern forms together.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* match.pd: Refactor sorts of signed SAT_TRUNC match patterns
Pan Li [Wed, 11 Dec 2024 11:09:08 +0000 (19:09 +0800)]
Match: Refactor the signed SAT_SUB match patterns [NFC]
This patch would like to refactor the all signed SAT_ADD patterns,
aka:
* Extract type check outside.
* Re-arrange the related match pattern forms together.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* match.pd: Refactor sorts of signed SAT_SUB match patterns.
This improves codegen for x264 sum of absolute difference routines.
The insn count is same, but we avoid double widening ops and ensuing
whole register moves.
Also for more general applicability, we chose to implement abs diff
vs. the sum of abs diff variant.
Suggested-by: Robin Dapp <rdapp@ventanamicro.com> Co-authored-by: Pan Li <pan2.li@intel.com> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
PR target/117722
Keith Packard [Tue, 7 Jan 2025 21:54:11 +0000 (14:54 -0700)]
[PATCH] libgcc/m68k: More fixes for soft float
Fix __extenddfxf2:
* Remove bogus denorm handling block which would never execute --
the converted exp value is always positive as EXCESSX > EXCESSD.
* Compute the whole significand in dl instead of doing part of it in
ldl.
* Mask off exponent from dl.l.upper so the denorm shift test
works.
* Insert the hidden one bit into dl.l.upper as needed.
Fix __truncxfdf2 denorm handling. All that is required is to shift the
significand right by the correct amount; it already has all of the
necessary bits set including the explicit one. Compute the shift
amount, then perform the wide shift across both elements of the
significand.
Fix __fixxfsi:
* The value was off by a factor of two as the significand contains
32 bits, not 31 so we need to shift by one more than the equivalent
code in __fixdfsi.
* Simplify the code having realized that the lower 32 bits of the
significand can never appear in the results.
Return positive qNaN instead of negative. For floats, qNaN is 0x7fff_ffff. For
doubles, qNaN is 0x7fff_ffff_ffff_ffff.
Return correctly signed zero on float and double divide underflow. This means
that Ld$underflow now expects d7 to contain the sign bit, just like the other
return paths.
libgcc/
* config/m68k/fpgnulib.c (extenddfxf2): Simplify code by removing code
that should never execute. Fix denorm shift test and insert hidden bit
as needed.
(__truncxfdf2): Properly compue and shift the significant right.
* config/m68k/lb1sf68.S (__fixxfsi): Correct shift counts and simplify.
(QUIET_NAN): Make it a positive quiet NaN and fix return values to inject
sign properly.
Jeff Law [Tue, 7 Jan 2025 21:27:28 +0000 (14:27 -0700)]
Fix testsuite expectations for RVV after recent change
Tamar's recent improvement to improve affine unsigned folding for exchange2
twiddle code generation for a couple tests in the RVV testsuite just enough to
cause testsuite failures.
I've looked at both tests before/after Tamar's change and the code is clearly
better -- essentially tighter vector loops due to improvements in address
arithmetic. Additionally we have fewer vsetvls after Tamar's patch.
Given that I'm just making the obvious adjustments to the expected assembly and
pushing to the trunk.
Jeff Law [Tue, 7 Jan 2025 19:20:15 +0000 (12:20 -0700)]
Fix regression in ft32 port after recent switch table adjustments
This is a trivial bug that showed up after Mark W's recent patch to not apply
the size limit on jump tables.
The ft32 port has limited immediate ranges on comparisons and the casesi
expander didn't honor those. It'd blindly pass along an out of range constant.
This patch adds the trivial adjustment to force an out of range constant into a
register. It fixes these regressions:
> Tests that now fail, but worked before (3 tests):
>
> ft32-sim: gcc: gcc.c-torture/compile/pr34093.c -O1 (test for excess errors)
> ft32-sim: gcc: gcc.dg/torture/pr106809.c -O1 (test for excess errors)
> ft32-sim: gcc: gcc.dg/torture/pr106809.c -O1 (test for excess errors)
Tested in my tester. No other tests were fixed.
gcc/
* config/ft32/ft32.md (casesi expander): Force operands[2] into
a register if it's not a suitable rimm operand.
Dimitar Dimitrov [Sun, 24 Nov 2024 10:22:13 +0000 (12:22 +0200)]
testsuite: RISC-V: Skip tests providing -march for ILP32E/ILP64E ABIs
Many test cases explicitly set -march with extensions which are not
compatible with the E ABI variants. This leads to spurious errors
when toolchain has been configured for RV32E base ISA and ILP32E ABI:
spawn ... -march=rv32gc_zbb ...
cc1: error: ILP32E ABI does not support the 'D' extension
Fix by skipping those tests if toolchain's default ABI is E.
testsuite: RISC-V: Skip tests using -mcpu= for ILP32E/ILP64E ABIs
The tests are specifying -mcpu with D extension, which is not compatible
with the ILP32E and ILP64E ABIs. Fix by skipping the tests if toolchain's
default ABI is an E variant.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr109508.c: Skip for E ABI.
* gcc.target/riscv/pr114139.c: Ditto.
Dimitar Dimitrov [Mon, 25 Nov 2024 18:48:00 +0000 (20:48 +0200)]
testsuite: RISC-V: Skip V and Zvbb tests for ILP32E/ILP64E ABIs
Some tests add options for V and Zvbb extensions, but those extensions
are not compatible with the E ABI variants. This leads to spurious test
failures when toolchain's default ABI is ILP32E or ILP64E:
spawn ... -march=rv32ecv_zvbb ...
cc1: error: ILP32E ABI does not support the 'D' extension
cc1: sorry, unimplemented: Currently the 'V' implementation requires the 'M' extension
Fix by skipping the tests when toolchain's default ABI is E variant.
Dimitar Dimitrov [Thu, 12 Dec 2024 18:22:59 +0000 (20:22 +0200)]
testsuite: RISC-V: Add effective target for E ABI variant
Add new effective target check for either ILP32E or ILP64E ABI variants.
Initial implementation only checks for RV32E or RV64E ISA, which in turn
implies that ILP32E/ILP64E ABI is used. The RV32I+ILP32E and
RV64I+ILP64E combinations are not yet caught by the check, but they
do not seem to be widely used currently.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_riscv_abi_e):
New procedure.
Thomas Koenig [Tue, 7 Jan 2025 14:23:29 +0000 (15:23 +0100)]
Document unsigned constants in intrinsic modules.
gcc/fortran/ChangeLog:
* intrinsic.texi (ISO_FORTRAN_ENV): Also mention INT8 in the
text. Document UINT8, UINT16, UINT32 and UINT64.
(ISO_C_BINDING): New table for unsigned KIND numbers.
Wilco Dijkstra [Fri, 1 Nov 2024 14:40:26 +0000 (14:40 +0000)]
AArch64: Switch off early scheduling
The early scheduler takes up ~33% of the total build time, however it doesn't
provide a meaningful performance gain. This is partly because modern OoO cores
need far less scheduling, partly because the scheduler tends to create many
unnecessary spills by increasing register pressure. Building applications
56% faster is far more useful than ~0.1% improvement on SPEC, so switch off
early scheduling on AArch64. Codesize reduces by ~0.2%.
Fix various tests that depend on scheduling by explicitly adding -fschedule-insns.
gcc:
* common/config/aarch64/aarch64-common.cc: Switch off fschedule_insns.
Wilco Dijkstra [Fri, 1 Nov 2024 14:44:56 +0000 (14:44 +0000)]
AArch64: Block combine_and_move from creating FP literal loads
The IRA combine_and_move pass runs if the scheduler is disabled and aggressively
combines moves. The movsf/df patterns allow all FP immediates since they rely
on a split pattern. However splits do not happen during IRA, so the result is
extra literal loads. To avoid this, split early during expand and block
creation of FP immediates that need this split. Mark a few testcases that
rely on late splitting as xfail.
Tobias Burnus [Tue, 7 Jan 2025 15:43:30 +0000 (16:43 +0100)]
libgomp.texi: Minor update to omp_get_num_devices/omp_get_initial_device
libgomp/ChangeLog:
* libgomp.texi (OpenMP 6.0): Fix typo.
(omp_get_default_device): Update the wording as the value
returned by omp_get_initial_device is now ambiguous.
(omp_get_num_devices): Minor wording tweak.
(omp_get_initial_device): Note that the function may also
return omp_initial_device since OpenMP 6.
Tamar Christina [Mon, 6 Jan 2025 17:52:14 +0000 (17:52 +0000)]
perform affine fold to unsigned on non address expressions. [PR114932]
When the patch for PR114074 was applied we saw a good boost in exchange2.
This boost was partially caused by a simplification of the addressing modes.
With the patch applied IV opts saw the following form for the base addressing;
This is because the patch promoted multiplies where one operand is a constant
from a signed multiply to an unsigned one, to attempt to fold away the constant.
This patch attempts the same but due to the various problems with SCEV and
niters not being able to analyze the resulting forms (i.e. PR114322) we can't
do it during SCEV or in the general form like in fold-const like extract_muldiv
attempts.
Instead this applies the simplification during IVopts initialization when we
create the IV. This allows IV opts to see the simplified form without
influencing the rest of the compiler.
as mentioned in PR114074 it would be good to fix the missed optimization in the
other passes so we can perform this in general.
The reason this has a big impact on Fortran code is that Fortran doesn't seem to
have unsigned integer types. As such all it's addressing are created with
signed types and folding does not happen on them due to the possible overflow.
concretely on AArch64 this changes the results from generation:
The two patches together results in a 10% performance increase in exchange2 in
SPECCPU 2017 and a 4% reduction in binary size and a 5% improvement in compile
time. There's also a 5% performance improvement in fotonik3d and similar
reduction in binary size.
The patch folds every IV to unsigned to canonicalize them. At the end of the
pass we match.pd will then remove unneeded conversions.
Note that we cannot force everything to unsigned, IVops requires that array
address expressions remain as such. Folding them results in them becoming
pointer expressions for which some optimizations in IVopts do not run.
PR tree-optimization/114932
* gcc.dg/tree-ssa/pr64705.c: Update dump file scan.
* gcc.target/i386/pr115462.c: The testcase shares 3 IVs which calculates
the same thing but with a slightly different increment offset. The test
checks for 3 complex addressing loads, one for each IV. But with this
change they now all share one IV. That is the loop now only has one
complex addressing. This is ultimately driven by the backend costing
and the current costing says this is preferred so updating the testcase.
* gfortran.dg/addressing-modes_1.f90: New test.
Andrew Pinski [Sat, 16 Nov 2024 04:22:04 +0000 (20:22 -0800)]
cfgexpand: Handle integral vector types and constructors for scope conflicts [PR105769]
This is an expansion of the last patch to also track pointers via vector types and the
constructor that are used with vector types.
In this case we had:
```
_15 = (long unsigned int) &bias;
_10 = (long unsigned int) &cov_jn;
_12 = {_10, _15};
...
...
MEM <vector(2) long unsigned int> [(void *)&D.6172 + 32B] = _12;
MEM[(struct function *)&D.6157] ={v} {CLOBBER(bob)};
```
Anyways tracking the pointers via vector types to say they are alive
at the point where the store of the vector happens fixes the bug by saying
it is alive at the same time as another variable is alive.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/105769
gcc/ChangeLog:
* cfgexpand.cc (vars_ssa_cache::operator()): For constructors
walk over the elements.
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr105769-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Sat, 16 Nov 2024 04:22:03 +0000 (20:22 -0800)]
cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]
After fixing loop-im to do the correct overflow rewriting
for pointer types too. We end up with code like:
```
_9 = (unsigned long) &g;
_84 = _9 + 18446744073709551615;
_11 = _42 + _84;
_44 = (signed char *) _11;
...
*_44 = 10;
g ={v} {CLOBBER(eos)};
...
n[0] = &f;
*_44 = 8;
g ={v} {CLOBBER(eos)};
```
Which was not being recongized by the scope conflicts code.
This was because it only handled one level walk backs rather than multiple ones.
This fixes the issue by having a cache which records all references to addresses
of stack variables.
Unlike the previous patch, this only records and looks at addresses of stack variables.
The cache uses a bitmap and uses the index as the bit to look at.
* cfgexpand.cc (struct vars_ssa_cache): New class.
(vars_ssa_cache::vars_ssa_cache): New constructor.
(vars_ssa_cache::~vars_ssa_cache): New deconstructor.
(vars_ssa_cache::create): New method.
(vars_ssa_cache::exists): New method.
(vars_ssa_cache::add_one): New method.
(vars_ssa_cache::update): New method.
(vars_ssa_cache::dump): New method.
(add_scope_conflicts_2): Factor mostly out to
vars_ssa_cache::operator(). New cache argument.
Walk the bitmap cache for the stack variables addresses.
(vars_ssa_cache::operator()): New method factored out from
add_scope_conflicts_2. Rewrite to be a full walk of all operands
and use a worklist.
(add_scope_conflicts_1): Add cache new argument for the addr cache.
Just call add_scope_conflicts_2 for the phi result instead of calling
for the uses and don't call walk_stmt_load_store_addr_ops for phis.
Update call to add_scope_conflicts_2 to add cache argument.
(add_scope_conflicts): Add cache argument and update calls to
add_scope_conflicts_1.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117426-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Sat, 16 Nov 2024 04:22:02 +0000 (20:22 -0800)]
cfgexpand: Factor out getting the stack decl index
This is the first patch in improving this code.
Since there are a few places which get the index and they
check the same thing, let's factor that out into one function.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* cfgexpand.cc (INVALID_STACK_INDEX): New defined.
(decl_stack_index): New function.
(visit_op): Use decl_stack_index.
(visit_conflict): Likewise.
(add_scope_conflicts_1): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Richard Biener [Tue, 7 Jan 2025 12:18:27 +0000 (13:18 +0100)]
rtl-optimization/118298 - constant iteration loops and #pragma unroll
When the RTL unroller handles constant iteration loops it bails out
prematurely when heuristics wouldn't apply any unrolling before
checking #pragma unroll.
PR rtl-optimization/118298
* loop-unroll.cc (decide_unroll_constant_iterations): Honor
loop->unroll even if the loop is too big for heuristics.
Richard Biener [Tue, 7 Jan 2025 14:07:12 +0000 (15:07 +0100)]
Fixup convert-dfp*.c
The testcases use -save-temps which doesn't play nice with -flto
and multilib testing resulting in spurious UNRESOLVED like
/usr/lib64/gcc/x86_64-suse-linux/14/../../../../x86_64-suse-linux/bin/ld: i386:x86-64 architecture of input file `./convert-dfp-2.ltrans0.ltrans.o' is incompatible with i386 output
The following skips the testcases when using -flto.
* gcc.dg/torture/convert-dfp-2.c: Skip with -flto.
* gcc.dg/torture/convert-dfp.c: Likewise.
Alexandre Oliva [Wed, 11 Dec 2024 13:16:58 +0000 (10:16 -0300)]
ada: Drop g-cpp* units not needed by the compiler
Having moved __gnat_convert_caught_object to g-cstyin.o, we can drop
other g-cpp* units that are now needed by programs that actually use
their APIs to get more information about C++ exceptions and type_info
objects.
gcc/ada/ChangeLog:
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS, GNATBIND_OBJS):
Drop g-cpp, g-cppexc and g-cppstd.
Eric Botcazou [Tue, 10 Dec 2024 09:24:47 +0000 (10:24 +0100)]
ada: Do not create temporaries for initialization statements
Assignment statements marked with the No_Ctrl_Actions or No_Finalize_Actions
flag are initialization statements and, therefore, no temporaries are needed
to hold the value of the right-hand side for them.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (Call_to_gnu): Always use the return slot
optimization if the parent node is an initialization statement.
(gnat_to_gnu) <N_Assignment_Statement>: Build an INIT_EXPR instead
of a MODIFY_EXPR if this is an initialization statement.
Eric Botcazou [Fri, 20 Dec 2024 15:49:50 +0000 (16:49 +0100)]
ada: Do not raise exceptions from Exp_Aggr.Packed_Array_Aggregate_Handled
An exception is now raised during bootstrap and this causes compatibility
issues with older compilers.
gcc/ada/ChangeLog:
* exp_aggr.adb (Packed_Array_Aggregate_Handled): Remove declaration
and handler for Not_Handled local exception. Check the return value
of Get_Component_Val instead.
(Get_Component_Val): Return No_Uint instead of raising Not_Handled.
Javier Miranda [Thu, 19 Dec 2024 10:41:59 +0000 (10:41 +0000)]
ada: Cleanup preanalysis of static expressions (part 2)
According to RM 13.14(8/4), a static expression in an aspect specification
does not cause freezing; however, the frontend performs many calls to
Preanalyze_Spec_Expression made during the analysis of aspects. This
patch, suggested by Eric Botcazou, takes care of this additional code
cleanup which requires also replacing many occurrences of the global
variable In_Spec_Expression by calls to Preanalysis_Active.
gcc/ada/ChangeLog:
* exp_util.adb (Insert_Actions): Document behavior under strict
preanalysis.
* sem.ads (In_Strict_Preanalysis): New subprogram.
(Preanalysis_Active): Replace 'and' operator by 'and then'.
* sem.adb (In_Strict_Preanalysis): Ditto.
* sem_attr.adb (Check_Dereference): Replace In_Spec_Expression
occurrence by call to Preanalysis_Active, and document it.
(Resolve_Attribute [Atribute_Access]): Ditto.
(Eval_Attribute): No evaluation under strict preanalysis.
(Validate_Static_Object_Name): No action under strict preanalysis.
* sem_ch13.adb (Check_Aspect_At_End_Of_Declarations): Replace
calls to Preanalyze_Spec_Expression by calls to Preanalyze_And_Resolve.
(Check_Aspect_At_Freeze_Point): Ditto.
(Resolve_Aspect_Expressions [Dynamic/Static/Predicate aspects]): Code
cleanup adjusting the code to emulate Preanalyze_And_Resolve, instead
of Preanalyze_Spec_Expression.
(Resolve_Aspect_Expressions [CPU/Interrupt_Priority/Priority/
Storage_Size aspects]): Replace calls to Preanalyze_Spec_Expression
by call to Preanalyze_And _Resolve.
* sem_ch3.adb (Analyze_Object_Declaration): Replace In_Spec_Expression
occurrence by call to Preanalysis_Active.
(Find_Type_Of_Object): Add documentation.
* sem_ch4.adb (Analyze_Case_Expression): Replace In_Spec_Expression
occurrence by call to Preanalysis_Active.
* sem_ch6.adb (Analyze_Expression_Function): Minor code reorganization
moving the code preanalyzing the expression after the new body has
been inserted in the tree to ensure that its Parent attribute is
available for preanalysis.
* sem_cat.adb (Validate_Static_Object_Name): No action under strict
preanalysis.
* sem_elab.adb (Check_For_Eliminated_Subprogram): Replace In_Spec_Expression
occurrence by call to Preanalysis_Active.
* sem_eval.adb (Eval_Intrinsic_Call [Name_Enclosing_Entity]): Ditto.
* sem_elim.adb (Check_For_Eliminated_Subprogram): Ditto.
* sem_res.adb (Resolve_Entity_Name): Ditto.
Piotr Trojanek [Thu, 19 Dec 2024 23:09:15 +0000 (00:09 +0100)]
ada: Improve protection against wrong use from GDB
A code cleanup in routine intended to be used from DGB, suggested by running
GNATcheck rule Boolean_Negations. However, this code can be tuned to protect
against more illegal uses.
gcc/ada/ChangeLog:
* exp_disp.adb (Write_DT): Add guards that prevent crashes on illegal
node numbers.
Piotr Trojanek [Thu, 19 Dec 2024 14:32:56 +0000 (15:32 +0100)]
ada: Remove dead code in detection of null record definitions
Code cleanup; behavior is unaffected.
gcc/ada/ChangeLog:
* sem_util.adb (Is_Null_Record_Definition): Remove check for
Component_List being present after using it; replace check for
component item being a component declaration with an assertion;
fix style in comment.
This patch fixes two problems with how abort was deferred in finally
parts. First, calls to runtime subprograms are now omitted when
aborting is disallowed by active restrictions. Second, Abort_Undefer is
now correctly called when the finally part propagates an exception.
Steve Baird [Tue, 17 Dec 2024 21:27:04 +0000 (13:27 -0800)]
ada: Improved checking of uses of package renamings
In some cases, the RM 8.5.1(3.1) legality rule about uses of renamings of
limited views of packages was implemented incorrectly, resulting in rejecting
legal uses.
gcc/ada/ChangeLog:
* gen_il-fields.ads: add new Renames_Limited_View field.
* gen_il-gen-gen_entities.adb: add Renames_Limited_View flag for
packages.
* einfo.ads: add comment documenting Renames_Limited_View flag.
* sem_ch8.adb (Analyze_Package_Renaming): Set new Renames_Limited_View
flag. Test new Renames_Limited_View flag instead of calling
Has_Limited_With. If Has_Limited_With is True, that just means
that somebody, sometime during this compilation needed to
reference the limited view of the package; so that function
returns True too often to be used here.
(Find_Expanded_Name): Test new Renames_Limited_View flag instead of
calling Has_Limited_With.
Piotr Trojanek [Tue, 30 Jan 2024 00:10:17 +0000 (01:10 +0100)]
ada: Remove flag Is_Inherited_Pragma which is only set and never used
Code cleanup; behavior is unaffected. Flag Is_Inherited_Pragma is only set in
GNAT, but is not actually used, neither by the compiler nor by any backend.
gcc/ada/ChangeLog:
* contracts.adb (Inherit_Pragma): Don't set flag Is_Inherited_Pragma.
* gen_il-fields.ads (Opt_Field_Enum): Remove field identifier.
* gen_il-gen-gen_nodes.adb (N_Pragma): Remove field from node.
* sinfo.ads (Is_Inherited_Pragma): Remove field description.
(N_Pragma): Remove field reference.
Piotr Trojanek [Tue, 26 Mar 2024 15:23:41 +0000 (16:23 +0100)]
ada: Handle attributes related to Ada 2012 iterators as internal
Use existing machinery for internal attributes to handle attributes
related to Ada 2012 iterators. All these attributes exist exclusively
as a mean to delay processing.
Code cleanup. The only change in behavior is the wording of error
emitted when one of the internal attributes appears in source code:
from "illegal attribute" (which used to be emitted in the analysis)
to "unrecognized attribute (which is emitted by the parser).
gcc/ada/ChangeLog:
* exp_attr.adb (Expand_N_Attribute_Reference): Remove explicit
handling of attributes related to Ada 2012 iterators.
* sem_attr.adb (Analyze_Attribute, Eval_Attribute): Likewise;
move attribute Reduce according to alphabetic order.
* snames.adb-tmpl (Get_Attribute_Id): Add support for new internal
attributes.
* snames.ads-tmpl: Recognize names of new internal attributes.
(Attribute_Id): Recognize new internal attributes.
(Internal_Attribute_Id): Likewise.
(Is_Internal_Attribute_Name): Avoid duplication in comment.
Piotr Trojanek [Thu, 2 Mar 2023 21:43:12 +0000 (22:43 +0100)]
ada: Remove unnecessary qualifiers for First/Next list operations
Code cleanup related to work on expression functions for GNATprove
(which require accessibility checks even when they are not expanded
and thus have no explicit return statements).
gcc/ada/ChangeLog:
* accessibility.adb (First_Selector): Remove redundant and locally
inconsistent parenthesis.
(Check_Return_Construct_Accessibility): Remove qualifier from list
operation.
* sem_util.adb (Is_Prim_Of_Abst_Type_With_Nonstatic_CW_Pre_Post):
Likewise.
Eric Botcazou [Wed, 18 Dec 2024 09:16:15 +0000 (10:16 +0100)]
ada: Fix internal error on container aggregate for bounded vectors
The problem is that we analyze references to an object before the actual
subtype of the object is established, thus creating a type mismatch that
is flagged by the code generator.
gcc/ada/ChangeLog:
* exp_ch7.ads (Store_After_Actions_In_Scope_Without_Analysis): New
procedure declaration.
* exp_ch7.adb (Store_New_Actions_In_Scope): New procedure.
(Store_Actions_In_Scope): Call Store_New_Actions_In_Scope when the
target list is empty.
(Store_After_Actions_In_Scope_Without_Analysis): New procedure body.
* exp_aggr.adb (Expand_Container_Aggregate): For a declaration that
is wrapped in a transient scope, also defer the analysis of the new
code until after the declaration is analyzed.
Eric Botcazou [Tue, 17 Dec 2024 19:00:38 +0000 (20:00 +0100)]
ada: Add guard to System.Val_Real.Large_Powfive against pathological input
There is no need to keep multiplying the result once it saturates to +Inf.
gcc/ada/ChangeLog:
* libgnat/s-powflt.ads (Maxpow_Exact): Minor comment fix.
* libgnat/s-powlfl.ads (Maxpow_Exact): Likewise.
* libgnat/s-powllf.ads (Maxpow_Exact): Likewise.
* libgnat/s-valrea.adb (Large_Powfive) [1 parameter]: Exit the loop
as soon as the result saturates to +Inf.
(Large_Powfive) [2 parameters]: Likewise.
Piotr Trojanek [Mon, 16 Dec 2024 13:15:57 +0000 (14:15 +0100)]
ada: Move checks for consequences of Exceptional_Cases to GNAT
Previously checks for consequence expressions of Exceptional_Cases aspects were
done in GNATprove backend. However, we can do them in the frontend, where they
will apply to all subprograms, regardless of the SPARK_Mode aspect.
gcc/ada/ChangeLog:
* sem_prag.adb (Analyze_Exceptional_Cases_In_Decl_Part): Move check
from GNATprove backend to GNAT frontend.
Piotr Trojanek [Mon, 16 Dec 2024 12:52:43 +0000 (13:52 +0100)]
ada: Fix comments about Subprogram_Variant and Exceptional_Cases
The comment about Subprogram_Variant was outdated after more types have been
allowed by the corresponding SPARK RM rule; the comment about Exceptional_Cases
was incorrect, after being copy-pasted.
Steve Baird [Fri, 13 Dec 2024 01:06:00 +0000 (17:06 -0800)]
ada: Put_Image spec incorrectly ignored for Fixed_Point_Type'Base'Image call.
If a Put_Image aspect specification (introduced in Ada 2022) is given for a
fixed point type Fx, then in some cases a call to Fx'Base'Image would
incorrectly ignore the aspect specification and would instead return the
pre-Ada2022 version of the image. However, a call to Fx'Image would do the
right thing.
gcc/ada/ChangeLog:
* exp_put_image.adb (Image_Should_Call_Put_Image): Cope with the case
where the attribute prefix for an Image attribute reference
denotes an Itype constructed for a fixed point type. Calling
Has_Aspect with such an Itype misses applicable aspect
specifications; we need to look on the right list. This comes up
if the prefix of the attribute reference is
Some_Fixed_Point_Type'Base.
Gary Dismukes [Fri, 13 Dec 2024 23:36:05 +0000 (23:36 +0000)]
ada: Error on instantiation with defaulted formal type referencing other formal type
The compiler wasn't accounting for default subtypes on generic formal types
that reference other formal types of the same generic, leading to errors
about invalid subtypes. Several other problems that could lead to blowups
or incorrect errors were noticed through testing related cases and fixed
along the way.
gcc/ada/ChangeLog:
* sem_ch12.adb (Analyze_One_Association): In the case of a formal type
that has a Default_Subtype_Mark that does not have its Entity field set,
this means the default refers to another formal type of the same generic
formal part, so locate the matching subtype in the Result_Renamings and
set Match's Entity to that subtype prior to the call to Instantiate_Type.
(Validate_Formal_TypeDefault.Reference_Formal): Add test of Entity being
Present, to prevent blowups on End_Label ids (which don't have Entity set).
(Validate_Formal_Type_Default.Validate_Derived_Type_Default): Apply
Base_Type to Formal.
(Validate_Formal_Type_Default): Guard interface-related semantic checks
with a test of Is_Tagged_Type.
Eric Botcazou [Mon, 16 Dec 2024 07:59:26 +0000 (08:59 +0100)]
ada: Restrict previous change made to expansion of allocators
There is no need to build a cleanup if exceptions cannot be propagated.
gcc/ada/ChangeLog:
* exp_ch4.adb (Expand_Allocator_Expression): Do not build a cleanup
if restriction No_Exception_Propagation is active.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Allocator): Likewise.
Deng Jianbo [Tue, 31 Dec 2024 11:33:23 +0000 (19:33 +0800)]
LoongArch: Optimize initializing fp resgister to zero
In LoongArch, currently uses instruction movgr2fr.{d|w} to move zero
from fixed-point register to floating-pointer regsiter for initializing
fp register to zero. When LSX or LASX is enabled, we can use instruction
vxor.v which has lower latency than instruction movgr2fr.{d|w} to set fp
register to zero directly.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_output_move):
Optimize instructions for initializing fp regsiter to zero.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/mov-zero-1.c: New test.
* gcc.target/loongarch/mov-zero-2.c: New test.
Gaius Mulley [Tue, 7 Jan 2025 11:20:45 +0000 (11:20 +0000)]
[PR modula2/118010, modula2/118183] Unable to rebuild the bootstrap tools and Wtypemismatch
This patch combines fixes for both PR-118010 (Wtypemismatch) and PR-118183
(unable to rebuild the bootstrap tools). PR-118010 required a new data
type (COFF_T) to be exported from SYSTEM and used in all return
types for libc.lseek. The patch also includes COFF_T implemented in mc
and this data type has been propagated though the translated versions
of pge and mc. Finally the patch adjusts the modula-2 declaration of
location_t to reflect the new gcc 64 bit type.
A new command line option -fm2-file-offset-bits= has been implemented to
override the default 64 bit declaration of COFF_T.
gcc/ChangeLog:
PR modula2/118010
* doc/gm2.texi (Compiler options): New option
-fm2-file-offset-bits=.
Fortran: Extend cylic type detection for deallocate [PR116669]
Using cycles in derived/class types lead to the compiler doing a endless
recursion in several locations, when the cycle was not immediate.
An immediate cyclic dependency is present in, for example T T::comp.
Cylcic dependencies of the form T T2::comp; T2 T::comp2; are now
detected and the recursive bit in the derived type's attr is set.
gcc/fortran/ChangeLog:
PR fortran/116669
* class.cc (gfc_find_derived_vtab): Use attr to determine cyclic
type dependendies.
* expr.cc (gfc_has_default_initializer): Prevent endless
recursion by storing already visited derived types.
* resolve.cc (resolve_cyclic_derived_type): Determine if a type
is used in its hierarchy in a cyclic way.
(resolve_fl_derived0): Call resolve_cyclic_derived_type.
(resolve_fl_derived): Ensure vtab is generated when cyclic
derived types have allocatable components.
* trans-array.cc (structure_alloc_comps): Prevent endless loop
for derived type cycles.
* trans-expr.cc (gfc_get_ultimate_alloc_ptr_comps_caf_token):
Off topic, just prevent memory leaks.
gcc/testsuite/ChangeLog:
* gfortran.dg/class_array_15.f03: Freeing more memory.
* gfortran.dg/recursive_alloc_comp_6.f90: New test.
This patch removes the AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS tunable and
use_new_vector_costs entry in aarch64-tuning-flags.def and makes the
AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS paths in the backend the
default. To that end, the function aarch64_use_new_vector_costs_p and its uses
were removed. To prevent costing vec_to_scalar operations with 0, as
described in
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665481.html,
we adjusted vectorizable_store such that the variable n_adjacent_stores
also covers vec_to_scalar operations. This way vec_to_scalar operations
are not costed individually, but as a group.
As suggested by Richard Sandiford, the "known_ne" in the multilane-check
was replaced by "maybe_ne" in order to treat nunits==1+1X as a vector
rather than a scalar.
Two tests were adjusted due to changes in codegen. In both cases, the
old code performed loop unrolling once, but the new code does not:
Example from gcc.target/aarch64/sve/strided_load_2.c (compiled with
-O2 -ftree-vectorize -march=armv8.2-a+sve -mtune=generic -moverride=tune=none):
f_int64_t_32:
cbz w3, .L92
mov x4, 0
uxtw x3, w3
+ cntd x5
+ whilelo p7.d, xzr, x3
+ mov z29.s, w5
mov z31.s, w2
- whilelo p6.d, xzr, x3
- mov x2, x3
- index z30.s, #0, #1
- uqdecd x2
- ptrue p5.b, all
- whilelo p7.d, xzr, x2
+ index z30.d, #0, #1
+ ptrue p6.b, all
.p2align 3,,7
.L94:
- ld1d z27.d, p7/z, [x0, #1, mul vl]
- ld1d z28.d, p6/z, [x0]
- movprfx z29, z31
- mul z29.s, p5/m, z29.s, z30.s
- incw x4
- uunpklo z0.d, z29.s
- uunpkhi z29.d, z29.s
- ld1d z25.d, p6/z, [x1, z0.d, lsl 3]
- ld1d z26.d, p7/z, [x1, z29.d, lsl 3]
- add z25.d, z28.d, z25.d
+ ld1d z27.d, p7/z, [x0, x4, lsl 3]
+ movprfx z28, z31
+ mul z28.s, p6/m, z28.s, z30.s
+ ld1d z26.d, p7/z, [x1, z28.d, uxtw 3]
add z26.d, z27.d, z26.d
- st1d z26.d, p7, [x0, #1, mul vl]
- whilelo p7.d, x4, x2
- st1d z25.d, p6, [x0]
- incw z30.s
- incb x0, all, mul #2
- whilelo p6.d, x4, x3
+ st1d z26.d, p7, [x0, x4, lsl 3]
+ add z30.s, z30.s, z29.s
+ incd x4
+ whilelo p7.d, x4, x3
b.any .L94
.L92:
ret
Alexandre Oliva [Fri, 20 Dec 2024 21:02:08 +0000 (18:02 -0300)]
expand: drop stack adjustments after barrier [PR118006]
A gimple block with __builtin_unreachable () can't have code after it,
and gimple optimizers ensure there isn't any, even without
optimization. But if the block requires stack adjustments,
e.g. because of a call that passes arguments on the stack, expand will
emit that after the barrier, and then rtl checkers rightfully
complain. Arrange to discard adjustments after a barrier.
Strub expanders seem to be necessary to bring about the exact
conditions that require stack adjustments after the block that ends
with a __builtin_unreachable call.
for gcc/ChangeLog
PR middle-end/118006
* cfgexpand.cc (expand_gimple_basic_block): Do not emit
pending stack adjustments after a barrier.
Akram Ahmad [Mon, 6 Jan 2025 20:09:30 +0000 (20:09 +0000)]
aarch64: remove extra XTN in vector concatenation
GIMPLE code which performs a narrowing truncation on the result of a
vector concatenation currently results in an unnecessary XTN being
emitted following a UZP1 to concate the operands. In cases such as this,
UZP1 should instead use a smaller arrangement specifier to replace the
XTN instruction. This is seen in cases such as in this GIMPLE example:
int32x2_t foo (svint64_t a, svint64_t b)
{
vector(2) int vect__2.8;
long int _1;
long int _3;
vector(2) long int _12;
bar:
ptrue p3.b, all
uaddv d0, p3, z0.d
uaddv d1, p3, z1.d
uzp1 v0.2d, v0.2d, v1.2d
xtn v0.2s, v0.2d
ret
This patch therefore defines the *aarch64_trunc_concat<mode> insn which
truncates the concatenation result, rather than concatenating the
truncated operands (such as in *aarch64_narrow_trunc<mode>), resulting
in the following optimised assembly being emitted:
bar:
ptrue p3.b, all
uaddv d0, p3, z0.d
uaddv d1, p3, z1.d
uzp1 v0.2s, v0.2s, v1.2s
ret
This patch passes all regression tests on aarch64 with no new failures.
A supporting test for this optimisation is also written and passes.
OK for master? I do not have commit rights so I cannot push the patch
myself.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md: (*aarch64_trunc_concat)
new insn definition.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/truncated_concatenation_1.c: new test
for the above example and other modes covered by insn
definitions.
```
In file included from /gcc/src/libsanitizer/interception/interception.h:18,
from /gcc/src/libsanitizer/interception/interception_type_test.cpp:14:
/gcc/src/libsanitizer/interception/interception_type_test.cpp:30:61: error: static assertion failed
30 | COMPILER_CHECK((__sanitizer::is_same<::SSIZE_T, ::ssize_t>::value));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/gcc/src/libsanitizer/sanitizer_common/sanitizer_internal_defs.h:363:44: note: in definition of macro 'COMPILER_CHECK'
363 | #define COMPILER_CHECK(pred) static_assert(pred, "")
| ^~~~
make[8]: *** [Makefile:469: interception_type_test.lo] Error 1
```
The culprit seems to be that we don't check for equality of type sizes
anymore but rather whether the types are indeed the same. On s390 -m31
we have that `sizeof(int)==sizeof(long)` holds which is why previously
the checks succeeded. They fail now because
```
size_t => unsigned long
ssize_t => long
ptrdiff_t => int
::SSIZE_T => __sanitizer::sptr => int
::PTRDIFF_T => __sanitizer::sptr => int
```
This is fixed by mapping `SSIZE_T` to `long` in the end.
For some targets uptr is mapped to unsigned int and size_t to unsigned
long and sizeof(int)==sizeof(long) holds. Still, these are distinct
types and type checking may fail. Therefore, replace uptr by
usize/SIZE_T wherever a size_t is expected.
Stafford Horne [Mon, 6 Jan 2025 12:12:40 +0000 (12:12 +0000)]
or1k: add .note.GNU-stack section on linux
In the OpenRISC build we get the following warning:
ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
Fix this by adding a .note.GNU-stack to indicate the stack does not need to be
executable for the lib1funcs.
Note, this is also needed for the upcoming glibc 2.41.
libgcc/
* config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.
SVE intrinsics: Fold svmul by -1 to svneg for unsigned types
As follow-up to
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665472.html,
this patch implements folding of svmul by -1 to svneg for
unsigned SVE vector types. The key idea is to reuse the existing code that
does this fold for signed types and feed it as callback to a helper function
that adds the necessary type conversions.
For example, for the test case
svuint64_t foo (svuint64_t x, svbool_t pg)
{
return svmul_n_u64_x (pg, x, -1);
}
the following gimple sequence is emitted (-O2 -mcpu=grace):
svuint64_t foo (svuint64_t x, svbool_t pg)
{
svint64_t D.12921;
svint64_t D.12920;
svuint64_t D.12919;
In general, the new helper gimple_folder::convert_and_fold
- takes a target type and a function pointer,
- converts the lhs and all non-boolean vector types to the target type,
- passes the converted lhs and arguments to the callback,
- receives the new gimple statement from the callback function,
- adds the necessary view converts to the gimple sequence,
- and returns the new call.
Because all arguments are converted to the same target types, the helper
function is only suitable for folding calls whose arguments are all of
the same type. If necessary, this could be extended to convert the
arguments to different types differentially.
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins-base.cc
(svmul_impl::fold): Wrap code for folding to svneg in lambda
function and pass to gimple_folder::convert_and_fold to enable
the transform for unsigned types.
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::convert_and_fold): New function that converts
operands to target type before calling callback function, adding the
necessary conversion statements.
(gimple_folder::redirect_call): Set fntype of redirected call.
(get_vector_type): Move from here to aarch64-sve-builtins.h.
* config/aarch64/aarch64-sve-builtins.h
(gimple_folder::convert_and_fold): Declare function.
(get_vector_type): Move here as inline function.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/acle/asm/mul_u8.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u64.c: New test and adjust
expected outcome.
Martin Jambor [Mon, 6 Jan 2025 10:58:29 +0000 (11:58 +0100)]
ipa-cp: Make dumping of bit masks representing -1 nicer
Dumps of the lattices representing bit-values and of propagation
results of bit-values can print a really long hexadecimal value when
the bit-value represents -1 (all bits set). This patch simply detect
that situation and prints the string "-1" in that case, making the
dumps somewhat nicer.
gcc/ChangeLog:
2025-01-03 Martin Jambor <mjambor@suse.cz>
* ipa-cp.cc (ipcp_print_widest_int): New function.
(ipcp_store_vr_results): Use it.
(ipcp_bits_lattice::print): Likewise. Fix formatting.
Mark Wielaard [Sun, 5 Jan 2025 17:00:36 +0000 (18:00 +0100)]
tree-switch-conversion: don't apply switch size limit on jump tables
commit 56946c801a7c ("gimple: Add limit after which slower switchlower
algs are used [PR117091] [PR117352]") introduced a limit on the number
of cases of a switch. It also bails out on finding jump tables if the
switch is too large. This introduces a compile time regression during
bootstrap. A riscv bootstrap takes hours longer. Particularly
insn-attrtab.cc will take hours instead of minutes. Fix this by not
applying the switch size limit on jump tables.
An alternative would be to implement greedy switch clustering for jump
tables as is done for switch bitmap clustering.
Tamar Christina [Mon, 6 Jan 2025 09:24:36 +0000 (09:24 +0000)]
AArch64: Implement four and eight chunk VLA concats [PR118272]
The following testcase
#pragma GCC target ("+sve")
extern char __attribute__ ((simd, const)) fn3 (int, short);
void test_fn3 (float *a, float *b, double *c, int n)
{
for (int i = 0; i < n; ++i)
a[i] = fn3 (b[i], c[i]);
}
at -Ofast ICEs because my previous patch only added support for combining 2
partial SVE vectors into a bigger vector. However There can also 4 and 8
piece subvectors.
This patch fixes this by implementing the missing expansions.
gcc/ChangeLog:
PR target/96342
PR target/118272
* config/aarch64/aarch64-sve.md (vec_init<mode><Vquad>,
vec_initvnx16qivnx2qi): New.
* config/aarch64/aarch64.cc (aarch64_sve_expand_vector_init_subvector):
Rewrite to support any arbitrary combinations.
* config/aarch64/iterators.md (SVE_NO2E): Update to use SVE_NO4E
(SVE_NO2E, Vquad): New.
gcc/testsuite/ChangeLog:
PR target/96342
PR target/118272
* gcc.target/aarch64/vect-simd-clone-3.c: New test.
Eric Botcazou [Fri, 13 Dec 2024 18:17:00 +0000 (19:17 +0100)]
ada: Streamline runtime support of finalization collections
Finalization collections are declared as (limited) controlled types so that
they can be naturally attached to a finalization master, but the same result
can be achieved by means of (limited) finalizable types, which need not be
tagged and thus avoid dragging the runtime support of tagged types.
gcc/ada/ChangeLog:
* libgnat/s-finpri.ads: Remove clause for Ada.Finalization.
(Finalization_Collection): Change to limited private type with the
Finalizable aspect.
(Initialize): Remove "overriding" keyword.
(Finalize): Likewise.
* libgnat/s-finpri.adb (Initialize): Likewise.
(Finalize): Likewise.
Eric Botcazou [Thu, 12 Dec 2024 22:08:30 +0000 (23:08 +0100)]
ada: Fix predicate involving array indexing rejected in generic package
The indexing is rejected with the message:
error: reference to current instance of type does not denote a type
when it is applied to a prefix which is the current instance of the type
to which the predicate is applied.
There is already a specific handling of component selection for this case
present in Find_Selected_Component, so this adds an equivalent specific
handling of indexing for this case to Analyze_Indexed_Component_Form.
gcc/ada/ChangeLog:
PR ada/117569
* sem_ch4.adb (Analyze_Indexed_Component_Form): Do not rewrite the
node as a type conversion if it is the current instance of a type
in a generic unit.
* sem_ch8.adb (Find_Selected_Component): Restrict the special case
of the current instance of a type to a generic unit.
Alexandre Oliva [Tue, 10 Dec 2024 12:06:57 +0000 (09:06 -0300)]
ada: Reduce footprint of C++ exception interoperation support
The initial C++ base-type exception interoperation support change
brought all of GNAT.CPP* along with raise-gcc, because of
[__gnat_]Convert_Caught_Object. Move that private but pragma-exported
function to GNAT.CPP.Std.Type_Info, so that it can rely on the C++
virtual/dispatch calls that justified the introduction of the Ada
wrapper type, to avoid emulating virtual calls in C or bringing in a
dependency on the C++ compiler and runtime.
Drop the CharPtr package instantiation, that brought a huge amount of
unnecessary code, and use string and storage primitives instead, using
the strcmp builtin directly for the C string compares.
Move the conversion to Ada String in Name to the wrapper interface in
GNAT.CPP.Std, adjusting the private internal type to shave off a few
more bytes from the only unit that raise-gcc will still need.
Finally, disable heap finalization for Type_Info_Ptr, to avoid
dragging in all of the finalization code. Thank to Eric Botcazou for
the suggestion.
gcc/ada/ChangeLog:
* libgnat/g-cppexc.adb (Convert_Caught_Object): Move...
* libgnat/g-cstyin.adb (Convert_Caught_Object): ... here.
Use object call notation.
(strcmp): New.
(Char_Arr, CharPtr, Char_Pointer, To_chars_ptr): Drop. Do not
import Interfaces.C.Pointers.
(To_Pointer): Convert from System.Address.
(Name_Starts_With_Asterisk): Rename local variable.
(Name_Past_Asterisk): Rewrite with System.Address and strcmp.
Import System.Storage_Elements.
(Equals): Use strcmp.
(Before): Fix logic error. Use strcmp.
(Name): Move conversion to String...
* libgnat/g-cppstd.adb (Name): ... here. Import
Interfaces.C.Strings.
* libgnat/g-cppstd.ads (Type_Info_Ptr): Disable heap
finalization.
* libgnat/g-cstyin.ads (Name): Change return type.
Claire Dross [Mon, 2 Dec 2024 16:14:47 +0000 (17:14 +0100)]
ada: Support new SPARK aspect Exit_Cases
The aspect Exit_Cases allows annotating a subprogram with a list of
cases specifying, for all input which satisfy a guard, how the
subprogram is allowed to terminate. For now, it can only be either
returning normally or propagating an exception. This contract is not
checked at runtime, it is only meant for static verification in SPARK.
gcc/ada/ChangeLog:
* aspects.ads: Add aspect Aspect_Exit_Cases.
* contracts.adb (Analyze_Entry_Or_Subprogram_Contract): Handle Exit_Cases.
(Expand_Subprogram_Contract): Idem.
* einfo-utils.adb (Get_Pragma): Allow Pragma_Exit_Cases.
* einfo-utils.ads (Get_Pragma): Idem.
* exp_prag.adb (Expand_Pragma_Exit_Cases): Ignore the pragma, currently we don't expand it.
* exp_prag.ads (Expand_Pragma_Exit_Cases): Idem.
* inline.adb (Remove_Aspects_And_Pragmas): Add Exit_Cases to the list.
(Remove_Items): Idem.
* par-prag.adb (Last_Arg_Is_Reason): Idem.
* sem_ch12.adb: Idem.
* sem_ch13.adb: Idem.
* sem_util.adb: Idem.
* sem_util.ads: Idem.
* sinfo.ads: Idem.
* snames.ads-tmpl: Add names Name_Exit_Cases, Name_Exception_Raised, and Name_Normal_Return
as well as pragma Pragma_Exit_Cases.
* sem_prag.adb (Analyze_Exit_Cases_In_Decl_Part): Make sure that a
pragma or aspect Exit_Cases is well formed.
(Analyze_Pragma): Make sure that a pragma or aspect Exit_Cases is at the right place.
* sem_prag.ads (Analyze_Exit_Cases_In_Decl_Part): Declaration.
* doc/gnat_rm/implementation_defined_pragmas.rst: Document the Exit_Cases pragma.
* doc/gnat_rm/implementation_defined_aspects.rst: Document the Exit_Cases aspect.
* gnat_rm.texi: Regenerate.