liuhongt [Tue, 26 Mar 2024 04:28:14 +0000 (21:28 -0700)]
Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_costing): Enable
vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
cost model.
(vect_analyze_loop): Disable epilogue vectorization in very
cheap cost model.
* doc/invoke.texi: Adjust documents for very-cheap cost model.
Jovan Vukic [Wed, 9 Oct 2024 22:53:38 +0000 (16:53 -0600)]
RISC-V: Optimize branches with shifted immediate operands
After the valuable feedback I received, it’s clear to me that the
oversight was in the tests showing the benefits of the patch. In the
test file, I added functions f5 and f6, which now generate more
efficient code with fewer instructions.
Before the patch:
f5:
li a4,2097152
addi a4,a4,-2048
li a5,1167360
and a0,a0,a4
addi a5,a5,-2048
beq a0,a5,.L4
f6:
li a5,3407872
addi a5,a5,-2048
and a0,a0,a5
li a5,1114112
beq a0,a5,.L7
After the patch:
f5:
srli a5,a0,11
andi a5,a5,1023
li a4,569
beq a5,a4,.L5
f6:
srli a5,a0,11
andi a5,a5,1663
li a4,544
beq a5,a4,.L9
PR target/115921
gcc/ChangeLog:
* config/riscv/iterators.md (any_eq): New code iterator.
* config/riscv/riscv.h (COMMON_TRAILING_ZEROS): New macro.
(SMALL_AFTER_COMMON_TRAILING_SHIFT): Ditto.
* config/riscv/riscv.md (*branch<ANYI:mode>_shiftedarith_<optab>_shifted):
New pattern.
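The f5 change in the RISC-V commit above works because the mask and the compared value share 11 trailing zero bits, so both can be shifted right before the comparison. The following is a minimal C sketch of that equivalence, using only the constants visible in the assembly. The function names f5_before, f5_after and check_f5 are illustrative assumptions; the source of the real test functions is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Constants recovered from the "before" assembly of f5:
   li 2097152; addi -2048  ->  0x1FF800 (mask)
   li 1167360; addi -2048  ->  0x11C800 (value compared against).  */
#define F5_MASK  UINT64_C (0x1FF800)
#define F5_VALUE UINT64_C (0x11C800)

/* Shape of the comparison before the patch: AND with a wide mask.  */
int f5_before (uint64_t x) { return (x & F5_MASK) == F5_VALUE; }

/* Shape after the patch: shift out the 11 common trailing zeros first,
   which turns the constants into 1023 and 569.  */
int f5_after (uint64_t x) { return ((x >> 11) & 1023) == 569; }

/* Illustrative self-check over a range that covers every masked bit.  */
void check_f5 (void)
{
  for (uint64_t x = 0; x < (UINT64_C (1) << 23); x++)
    assert (f5_before (x) == f5_after (x));
}
```

The same reasoning applies to f6, where 0x33F800 and 0x110000 become 1663 and 544 after the shift by 11.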
Eric Botcazou [Wed, 9 Oct 2024 19:31:13 +0000 (21:31 +0200)]
Fix LTO bootstrap failure with -Werror=lto-type-mismatch
In GNAT's implementation model, using convention C (or C_Pass_By_Copy) has
no effect on the internal representation of types since the representation
is identical to that of C by default. It's even counter-productive given
the implementation advice listed in B.3(63-71) so the interface between the
front-end and gigi does not use it and instead uses structurally identical
types on both sides.
gcc/ada
PR ada/117038
* fe.h (struct c_array): Add 'const' to declaration of pointer.
(C_Source_Buffer): Use consistent formatting.
* par-ch3.adb (P_Component_Items): Properly set Aliased_Present on
access definition.
* sinput.ads: Remove clause for Interfaces.C.
(C_Array): Change type of Length to Integer and make both components
aliased. Remove Convention aspect.
(C_Source_Buffer): Remove all aspects.
* sinput.adb (C_Source_Buffer): Adjust to above change.
Jason Merrill [Wed, 9 Oct 2024 16:28:46 +0000 (12:28 -0400)]
c++: more modules and -M
In r15-4119-gc877a27f04f648 I told preprocess_file to use the
directives-only scan with modules, but it seems that I also need to set the
cpp_option so that communication between _cpp_handle_directive and
scan_translation_unit_directives_only works properly in
c-c++-common/cpp/embed-6.c.
gcc/c-family/ChangeLog:
* c-ppoutput.cc (preprocess_file): Set directives_only flag.
Jonathan Wakely [Fri, 4 Oct 2024 11:40:47 +0000 (12:40 +0100)]
libstdc++: Test 17_intro/names.cc with -D_FORTIFY_SOURCE=2 [PR116210]
Add a new testcase that repeats 17_intro/names.cc but with
_FORTIFY_SOURCE defined, to find problems in Glibc fortify wrappers like
https://sourceware.org/bugzilla/show_bug.cgi?id=32052 (which is fixed
now).
libstdc++-v3/ChangeLog:
PR libstdc++/116210
* testsuite/17_intro/names.cc (sz): Undef for versions of Glibc
that use it in the fortify wrappers.
* testsuite/17_intro/names_fortify.cc: New test.
Jonathan Wakely [Fri, 4 Oct 2024 11:08:12 +0000 (12:08 +0100)]
libstdc++: Drop format attribute from snprintf wrapper [PR116969]
When __LONG_DOUBLE_IEEE128__ is defined we need to declare a wrapper for
Glibc's 'snprintf' symbol, so we can call the original definition that
works with the IBM128 format of long double. Because we were declaring
the wrapper using __typeof__(__builtin_snprintf) it inherited the
__attribute__((format(printf, 3, 4))) decoration, and then we got a
warning for calling that wrapper with an __ibm128 argument for a %Lf
conversion specifier. The warning is bogus, because the function we're
calling really does want __ibm128 for %Lf, but there's no "printf but
with a different long double format" archetype for the attribute.
In r15-4039-g28911f626864e7 I added a diagnostic pragma to suppress the
warning, but it would be better to just declare the wrapper without the
attribute, and not have to suppress a warning for code that we know is
actually correct.
libstdc++-v3/ChangeLog:
PR libstdc++/116969
* include/bits/locale_facets_nonio.tcc (money_put::__do_put):
Remove diagnostic pragmas.
(__glibcxx_snprintfibm128): Declare type manually, instead of
using __typeof__(__builtin_snprintf).
aarch64: Fix SVE ACLE gimple folds for C++ LTO [PR116629]
The SVE ACLE code has two ways of handling overloaded functions.
One, used by C, is to define a single dummy function for each unique
overloaded name, with resolve_overloaded_builtin then resolving calls
to real non-overloaded functions. The other, used by C++, is to
define a separate function for each individual overload.
The builtins harness assigns integer function codes programmatically.
However, LTO requires it to use the same assignment for every
translation unit, regardless of language. This means that C++ TUs
need to create (unused) slots for the C overloads and that C TUs
need to create (unused) slots for the C++ overloads.
In many ways, it doesn't matter whether the LTO frontend itself
uses the C approach or the C++ approach to defining overloaded
functions, since the LTO frontend never has to resolve source-level
overloading. However, the C++ approach of defining a separate
function for each overload means that C++ calls never need to
be redirected to a different function. Calls to an overload
can appear in the LTO dump and survive until expand. In contrast,
calls to C's dummy overload functions are resolved by the front
end and never survive to LTO (or expand).
Some optimisations work by moving between sibling functions, such as _m
to _x. If the source function is an overload, the expected destination
function is too. The LTO frontend needs to define C++ overloads if it
wants to do this optimisation properly for C++.
The PR is about a tree checking failure caused by trying to use a
stubbed-out C++ overload in LTO. Dealing with that by detecting the
stub (rather than changing which overloads are defined) would have
turned this from an ice-on-valid to a missed optimisation.
In future, it would probably make sense to redirect overloads to
non-overloaded functions during gimple folding, in case that exposes
more CSE opportunities. But it'd probably be of limited benefit, since
it should be rare for code to mix overloaded and non-overloaded uses of
the same operation. It also wouldn't be suitable for backports.
gcc/
PR target/116629
* config/aarch64/aarch64-sve-builtins.cc
(function_builder::function_builder): Use direct overloads for LTO.
gcc/testsuite/
PR target/116629
* gcc.target/aarch64/sve/acle/general/pr106326_2.c: New test.
testsuite: Make check-function-bodies work with LTO
This patch tries to make check-function-bodies automatically
choose between reading the regular assembly file and reading the
LTO assembly file. There should only ever be one right answer,
since check-function-bodies doesn't make sense on slim LTO output.
Maybe this will turn out to be impossible to get right, but I'd like
to try at least.
gcc/testsuite/
* lib/scanasm.exp (check-function-bodies): Look in ltrans0.ltrans.s
if the test appears to be using LTO.
Jonathan Wakely [Mon, 7 Oct 2024 09:22:24 +0000 (10:22 +0100)]
libstdc++: Ignore _GLIBCXX_USE_POSIX_SEMAPHORE if not supported [PR116992]
If _GLIBCXX_HAVE_POSIX_SEMAPHORE is undefined then users get an error
when defining _GLIBCXX_USE_POSIX_SEMAPHORE. We can just ignore it
instead (and warn them it's being ignored).
This fixes a testsuite failure on hppa64-hp-hpux11.11 (and probably some
other targets):
FAIL: 30_threads/semaphore/platform_try_acquire_for.cc -std=gnu++20 (test for excess errors)
Excess errors:
semaphore:49: error: '__semaphore_impl' has not been declared
libstdc++-v3/ChangeLog:
PR libstdc++/116992
* include/bits/semaphore_base.h (_GLIBCXX_USE_POSIX_SEMAPHORE):
Undefine and issue a warning if POSIX sem_t is not supported.
* testsuite/30_threads/semaphore/platform_try_acquire_for.cc:
Prune new warning.
Jonathan Wakely [Mon, 7 Oct 2024 09:19:29 +0000 (10:19 +0100)]
libstdc++: Fix -Wnarrowing in <complex> [PR116991]
When _GLIBCXX_USE_C99_COMPLEX_ARC is undefined we use the generic
__complex_acos function template for _Float32 etc. and that gives a
-Wnarrowing warning:
complex:2043: warning: ISO C++ does not allow converting to '_Float32' from 'long double' with greater conversion rank [-Wnarrowing]
Use a cast to do the conversion so that it doesn't warn.
libstdc++-v3/ChangeLog:
PR libstdc++/116991
* include/std/complex (__complex_acos): Cast literal to
destination type.
Jonathan Wakely [Thu, 26 Sep 2024 15:55:07 +0000 (16:55 +0100)]
libstdc++: Enable _GLIBCXX_ASSERTIONS by default for -O0 [PR112808]
Too many users don't know about -D_GLIBCXX_ASSERTIONS and so are missing
valuable checks for C++ standard library preconditions. This change
enables libstdc++ assertions by default when compiling with -O0 so that
we diagnose more bugs by default.
When users enable optimization we don't add the assertions by default
(because they have non-zero overhead) so they still need to enable them
manually.
For users who really don't want the assertions even in unoptimized
builds, defining _GLIBCXX_NO_ASSERTIONS will prevent them from being
enabled automatically.
Jonathan Wakely [Thu, 26 Sep 2024 15:42:27 +0000 (16:42 +0100)]
libstdc++: Simplify std::aligned_storage and fix for versioned namespace [PR61458]
This simplifies the implementation of std::aligned_storage. For the
unstable ABI it also fixes the bug where its size is too large when the
default alignment is used. We can't fix that for the stable ABI though,
so just add a comment about the bug.
libstdc++-v3/ChangeLog:
PR libstdc++/61458
* doc/doxygen/user.cfg.in (GENERATE_BUGLIST): Set to NO.
* include/std/type_traits (__aligned_storage_msa): Remove.
(__aligned_storage_max_align_t): New struct.
(__aligned_storage_default_alignment): New function.
(aligned_storage): Use __aligned_storage_default_alignment for
default alignment. Replace union with a struct containing an
aligned buffer. Improve Doxygen comment.
(aligned_storage_t): Use __aligned_storage_default_alignment for
default alignment.
Jonathan Wakely [Thu, 11 Jul 2024 19:38:05 +0000 (20:38 +0100)]
libstdc++: Do not cast away const-ness in std::construct_at (LWG 3870)
This change also requires implementing the proposed resolution of LWG
3216 so that std::make_shared and std::allocate_shared still work, and
the proposed resolution of LWG 3891 so that std::expected still works.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h: Remove cv-qualifiers from
type managed by _Sp_counted_ptr_inplace, as per LWG 3210.
* include/bits/stl_construct.h: Do not cast away cv-qualifiers
when passing pointer to placement new.
* include/std/expected: Use remove_cv_t for union member, as per
LWG 3891.
* testsuite/20_util/allocator/void.cc: Do not test construction
via const pointer.
Jonathan Wakely [Mon, 18 Mar 2024 16:59:50 +0000 (16:59 +0000)]
libstdc++: Make std::construct_at support arrays (LWG 3436)
The issue was approved at the recent St. Louis meeting, requiring
support for bounded arrays, but only without arguments to initialize the
array elements.
libstdc++-v3/ChangeLog:
* include/bits/stl_construct.h (construct_at): Support array
types (LWG 3436).
* testsuite/20_util/specialized_algorithms/construct_at/array.cc:
New test.
* testsuite/20_util/specialized_algorithms/construct_at/array_neg.cc:
New test.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-opt1.C: Adjust for different diagnostics
from std::construct_at by adding -fconcepts-diagnostics-depth=2.
Jonathan Wakely [Fri, 27 Sep 2024 15:54:31 +0000 (16:54 +0100)]
libstdc++: Tweak %c formatting for chrono types
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add
[[unlikely]] attribute to condition for missing %c format in
locale. Use %T instead of %H:%M:%S in fallback.
Jonathan Wakely [Wed, 18 Sep 2024 16:20:29 +0000 (17:20 +0100)]
libstdc++: Fix formatting of chrono::duration with character rep [PR116755]
Implement Peter Dimov's suggestion for resolving LWG 4118, which is to
use +d.count() so that character types are promoted to an integer type
before formatting them. This didn't have unanimous consensus in the
committee as Howard Hinnant proposed that we should format the rep
consistently with std::format("{}", d.count()) instead. That ends up
being more complicated, because it makes std::formattable a precondition
of operator<< which was not previously the case, and it means that
ios_base::fmtflags from the stream would be ignored because std::format
doesn't use them.
libstdc++-v3/ChangeLog:
PR libstdc++/116755
* include/bits/chrono_io.h (operator<<): Use +d.count() for
duration inserter.
(__formatter_chrono::_M_format): Likewise for %Q format.
* testsuite/20_util/duration/io.cc: Test durations with
character types as reps.
Richard Biener [Wed, 9 Oct 2024 09:47:08 +0000 (11:47 +0200)]
Clear DR_GROUP_NEXT_ELEMENT upon group dissolving
I've tried to sanitize DR_GROUP_NEXT_ELEMENT accesses but there are too
many so the following instead makes sure DR_GROUP_NEXT_ELEMENT is never
non-NULL for !STMT_VINFO_GROUPED_ACCESS.
* tree-vect-data-refs.cc (vect_analyze_data_ref_access): When
cancelling a DR group also clear DR_GROUP_NEXT_ELEMENT.
Richard Biener [Wed, 9 Oct 2024 09:42:59 +0000 (11:42 +0200)]
tree-optimization/117041 - fix load classification of former grouped load
When we first detect a grouped load but later dis-associate it we
only set DR_GROUP_FIRST_ELEMENT to NULL, indicating it is not a
STMT_VINFO_GROUPED_ACCESS but leave DR_GROUP_NEXT_ELEMENT set. This
causes a stray DR_GROUP_NEXT_ELEMENT access in get_group_load_store_type
to go wrong, indicating a load isn't single_element_p when it actually
is, leading to wrong classification and an ICE.
PR tree-optimization/117041
* tree-vect-stmts.cc (get_group_load_store_type): Only
check DR_GROUP_NEXT_ELEMENT for STMT_VINFO_GROUPED_ACCESS.
Richard Biener [Wed, 13 Mar 2024 13:59:27 +0000 (14:59 +0100)]
tree-optimization/116974 - Handle single-lane SLP for OMP scan store
The following massages the GIMPLE matching way of handling scan
stores to work with single-lane SLP. I do not fully understand all
the cases that can happen and the stmt matching at vectorizable_store
time is less than ideal - but the following gets me all the testcases
to pass with and without forced SLP.
Long term we want to perform the matching at SLP discovery time,
properly chaining the various SLP instances the current state ends
up with.
PR tree-optimization/116974
* tree-vect-stmts.cc (check_scan_store): Pass in the SLP node
instead of just a flag. Allow single-lane scan stores.
(vectorizable_store): Adjust.
* tree-vect-loop.cc (vect_analyze_loop_2): Empty scan_map
before re-trying.
Richard Biener [Tue, 8 Oct 2024 12:28:16 +0000 (14:28 +0200)]
tree-optimization/116575 - handle SLP of permuted masked loads
The following handles SLP discovery of permuted masked loads which
was prohibited (because wrongly handled) for PR114375. In particular
with single-lane SLP at the moment all masked group loads appear
permuted and we fail to use masked load lanes as well. The following
addresses parts of the issues, starting with doing correct basic
discovery - namely discover an unpermuted mask load followed by
a permute node. In particular groups with gaps do not support masking
yet (and didn't before w/o SLP IIRC). There's still issues with
how we represent masked load/store-lanes I think, but I first have to
get my hands on a good testcase.
PR tree-optimization/116575
PR tree-optimization/114375
* tree-vect-slp.cc (vect_build_slp_tree_2): Do not reject
permuted mask loads without gaps but instead discover a
node for the full unpermuted load and permute that with
a VEC_PERM node.
* gcc.dg/vect/vect-pr114375.c: Expect vectorization now with avx2.
Richard Biener [Tue, 8 Oct 2024 07:01:01 +0000 (09:01 +0200)]
tree-optimization/117000 - elide .REDUC_IOR with compare against zero
The following adds a pattern to elide a .REDUC_IOR operation when
the result is compared against zero with a cbranch. I've resorted
to using can_compare_p since that's what RTL expansion eventually
checks. While GIMPLE has long allowed whole vector equality compares,
I'll note that vector lowering won't lower unsupported ones and RTL
expansion doesn't seem to try using [u]cmp<vector-mode> optabs
(and neither x86 nor aarch64 implements those). There's cstore
but no target implements that for vector modes either.
PR tree-optimization/117000
* match.pd (.REDUC_IOR !=/== 0): New pattern.
* gimple-match-head.cc: Include memmodel.h and optabs.h.
* generic-match-head.cc: Likewise.
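The idea behind the .REDUC_IOR commit above can be modelled in scalar C: OR-reducing all lanes and testing the result against zero gives the same answer as comparing the whole vector against an all-zero vector. This is only a sketch of the equivalence; the lane count, the function names and the use of memcmp as a stand-in for a whole-vector compare are assumptions, and the actual match.pd pattern is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4   /* assumed lane count, for illustration only */

/* Models ".REDUC_IOR (v) != 0": OR all lanes, then test the scalar.  */
int reduc_ior_nonzero (const uint32_t v[LANES])
{
  uint32_t acc = 0;
  for (int i = 0; i < LANES; i++)
    acc |= v[i];
  return acc != 0;
}

/* Models "v != { 0, ... }": one whole-vector compare, no reduction.  */
int whole_vector_nonzero (const uint32_t v[LANES])
{
  static const uint32_t zero[LANES] = { 0 };
  return memcmp (v, zero, sizeof zero) != 0;
}

/* Illustrative self-check: a single set bit in any lane is detected
   identically by both forms.  */
void check_reduc_ior (void)
{
  uint32_t v[LANES] = { 0 };
  assert (reduc_ior_nonzero (v) == whole_vector_nonzero (v));
  for (int lane = 0; lane < LANES; lane++)
    for (int bit = 0; bit < 32; bit++)
      {
        memset (v, 0, sizeof v);
        v[lane] = UINT32_C (1) << bit;
        assert (reduc_ior_nonzero (v) == 1);
        assert (whole_vector_nonzero (v) == 1);
      }
}
```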
Ken Matsui [Sat, 2 Mar 2024 06:10:55 +0000 (22:10 -0800)]
gcc, libcpp: Add warning switch for "#pragma once in main file" [PR89808]
This patch adds a warning switch for "#pragma once in main file". The
warning option name is Wpragma-once-outside-header, which is the same
as Clang provides.
PR preprocessor/89808
gcc/c-family/ChangeLog:
* c.opt (Wpragma_once_outside_header): Define new option.
* c.opt.urls: Regenerate.
Artemiy Volkov [Wed, 9 Oct 2024 00:06:23 +0000 (18:06 -0600)]
tree-optimization/116024 - simplify some cases of X +- C1 cmp C2
Whenever C1 and C2 are integer constants, X is of a wrapping type, and
cmp is a relational operator, the expression X +- C1 cmp C2 can be
simplified in the following cases:
(a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
comparison in the following way:
X +- C1 <= C2
-INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
-INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
-INF -+ C1 <= X <= +INF (due to (1))
-INF -+ C1 <= X (eliminate the right hand side since it holds for any X)
(b) By analogy, if cmp is >= and C2 -+ C1 == -INF(1), use the following
sequence of transformations:
X +- C1 >= C2
+INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
+INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
+INF -+ C1 >= X >= -INF (due to (1))
+INF -+ C1 >= X (eliminate the right hand side since it holds for any X)
(c) The > and < cases are negations of (a) and (b), respectively.
This transformation occasionally saves add / sub instructions,
for instance the expression
3 + (uint32_t)f() < 2
compiles to
cmn w0, #4
cset w0, ls
instead of
add w0, w0, 3
cmp w0, 2
cset w0, ls
on aarch64.
Testcases that go together with this patch have been split into two
separate files, one containing testcases for unsigned variables and the
other for wrapping signed ones (and thus compiled with -fwrapv).
Additionally, one aarch64 test has been adjusted since the patch has
caused the generated code to change from
cmn w0, #2
csinc w0, w1, wzr, cc (x < -2)
to
cmn w0, #3
csinc w0, w1, wzr, cs (x <= -3)
This patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.
gcc/ChangeLog:
PR tree-optimization/116024
* match.pd: New transformation around integer comparison.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr116024-2.c: New test.
* gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
* gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
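Cases (a) and (b) of the X +- C1 cmp C2 commit above can be checked exhaustively on a small wrapping type. The sketch below uses uint8_t, where -INF is 0 and +INF is 255, and picks constants that satisfy the side conditions. The constants and function names are illustrative assumptions, not taken from the patch or its tests.

```c
#include <assert.h>
#include <stdint.h>

/* Case (a): X + C1 <= C2 with C1 = 3, C2 = 2, so C2 - C1 wraps to 255
   (+INF).  The result is -INF - C1 <= X, i.e. X >= 253.  */
int pm_le_before (uint8_t x) { return (uint8_t) (x + 3) <= 2; }
int pm_le_after (uint8_t x) { return x >= (uint8_t) (0 - 3); }

/* Case (b): X - C1 >= C2 with C1 = 3, C2 = 253, so C2 + C1 wraps to 0
   (-INF).  The result is +INF + C1 >= X, i.e. X <= 2.  */
int pm_ge_before (uint8_t x) { return (uint8_t) (x - 3) >= 253; }
int pm_ge_after (uint8_t x) { return x <= (uint8_t) (255 + 3); }

/* Illustrative self-check over every value of the 8-bit type.  */
void check_plus_minus_c1 (void)
{
  for (int i = 0; i <= UINT8_MAX; i++)
    {
      assert (pm_le_before ((uint8_t) i) == pm_le_after ((uint8_t) i));
      assert (pm_ge_before ((uint8_t) i) == pm_ge_after ((uint8_t) i));
    }
}
```

In both cases the addition or subtraction disappears and only a comparison against a constant remains.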
Artemiy Volkov [Wed, 9 Oct 2024 00:04:13 +0000 (18:04 -0600)]
tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types
Implement a match.pd transformation inverting the sign of X in
C1 - X cmp C2, where C1 and C2 are integer constants and X is
of a wrapping signed type, by observing that:
(a) If cmp is == or !=, simply move X and C2 to opposite sides of
the comparison to arrive at X cmp C1 - C2.
(b) If cmp is <:
- C1 - X < C2 means that C1 - X spans the values of -INF,
-INF + 1, ..., C2 - 1;
- Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
C1 - C2 + 1;
- Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
- (-INF) - 2, ..., -C2;
- Using the fact that - (-INF) - 1 is +INF, derive that
X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
- Thus, the original expression can be simplified to
X - (C1 + 1) > -C2 - 1.
(c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.
(d) The >= and > cases are negations of (b) and (c), respectively.
(e) In all cases, the expression -C2 - 1 can be shortened to
bit_not (C2).
This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:
10 - (int)f() >= 20;
now compiles to
addi a0,a0,-11
slti a0,a0,-20
instead of
li a5,10
sub a0,a5,a0
slti t0,a0,20
xori a0,t0,1
on 32-bit RISC-V when compiled with -fwrapv.
Additional examples can be found in the newly added test file. This
patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
and additionally regtested on riscv32.
gcc/ChangeLog:
PR tree-optimization/116024
* match.pd: New transformation around integer comparison.
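The derivation in cases (b), (c) and (e) of the wrapping signed commit above can be verified exhaustively on an 8-bit model. The sketch below emulates -fwrapv with a helper that reduces values to the int8_t range. The helper wrap8 and all function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce an int to the 8-bit two's-complement range, modelling -fwrapv.  */
int wrap8 (int v)
{
  v &= 0xFF;
  return v >= 128 ? v - 256 : v;
}

/* (b): C1 - X < C2   becomes   X - (C1 + 1) > ~C2.  */
int ws_lt_before (int c1, int x, int c2) { return wrap8 (c1 - x) < c2; }
int ws_lt_after (int c1, int x, int c2) { return wrap8 (x - (c1 + 1)) > ~c2; }

/* (c): C1 - X <= C2  becomes   X - (C1 + 1) >= ~C2.  */
int ws_le_before (int c1, int x, int c2) { return wrap8 (c1 - x) <= c2; }
int ws_le_after (int c1, int x, int c2) { return wrap8 (x - (c1 + 1)) >= ~c2; }

/* Illustrative self-check over all 8-bit values of C1, C2 and X.  */
void check_signed_wrapping (void)
{
  for (int c1 = INT8_MIN; c1 <= INT8_MAX; c1++)
    for (int c2 = INT8_MIN; c2 <= INT8_MAX; c2++)
      for (int x = INT8_MIN; x <= INT8_MAX; x++)
        {
          assert (ws_lt_before (c1, x, c2) == ws_lt_after (c1, x, c2));
          assert (ws_le_before (c1, x, c2) == ws_le_after (c1, x, c2));
        }
}
```

With C1 = 10 and C2 = 20, negating case (b) turns 10 - x >= 20 into x - 11 < -20, which matches the addi / slti pair in the commit message.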
Artemiy Volkov [Tue, 8 Oct 2024 23:54:55 +0000 (17:54 -0600)]
tree-optimization/116024 - simplify C1-X cmp C2 for unsigned types
Implement a match.pd transformation inverting the sign of X in
C1 - X cmp C2, where C1 and C2 are integer constants and X is
of an unsigned type, by observing that:
(a) If cmp is == or !=, simply move X and C2 to opposite sides of the
comparison to arrive at X cmp C1 - C2.
(b) If cmp is <:
- C1 - X < C2 means that C1 - X spans the range of 0, 1, ..., C2 - 1;
- This means that X spans the range of C1 - (C2 - 1),
C1 - (C2 - 2), ..., C1;
- Subtracting C1 - (C2 - 1), X - (C1 - (C2 - 1)) is one of 0, 1,
..., C1 - (C1 - (C2 - 1));
- Simplifying the above, X - (C1 - C2 + 1) is one of 0, 1, ...,
C2 - 1;
- Summarizing, the expression C1 - X < C2 can be transformed
into X - (C1 - C2 + 1) < C2.
(c) Similarly, if cmp is <=:
- C1 - X <= C2 means that C1 - X is one of 0, 1, ..., C2;
- It follows that X is one of C1 - C2, C1 - (C2 - 1), ..., C1;
- Subtracting C1 - C2, X - (C1 - C2) has range 0, 1, ..., C2;
- Thus, the expression C1 - X <= C2 can be transformed into
X - (C1 - C2) <= C2.
(d) The >= and > cases are negations of (b) and (c), respectively.
This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:
300 - (unsigned int)f() < 100;
now compiles to
addi a0,a0,-201
sltiu a0,a0,100
instead of
li a5,300
sub a0,a5,a0
sltiu a0,a0,100
on 32-bit RISC-V.
Additional examples can be found in the newly added test file. This
patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
and additionally regtested on riscv32.
gcc/ChangeLog:
PR tree-optimization/116024
* match.pd: New transformation around integer comparison.
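Cases (b) and (c) of the unsigned commit above can likewise be checked exhaustively on uint8_t. This is a sketch under the assumption that an 8-bit type behaves like any other unsigned type for this purpose; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* (b): C1 - X < C2   becomes   X - (C1 - C2 + 1) < C2.  */
int uw_lt_before (uint8_t c1, uint8_t x, uint8_t c2)
{
  return (uint8_t) (c1 - x) < c2;
}
int uw_lt_after (uint8_t c1, uint8_t x, uint8_t c2)
{
  return (uint8_t) (x - (c1 - c2 + 1)) < c2;
}

/* (c): C1 - X <= C2  becomes   X - (C1 - C2) <= C2.  */
int uw_le_before (uint8_t c1, uint8_t x, uint8_t c2)
{
  return (uint8_t) (c1 - x) <= c2;
}
int uw_le_after (uint8_t c1, uint8_t x, uint8_t c2)
{
  return (uint8_t) (x - (c1 - c2)) <= c2;
}

/* Illustrative self-check over all 8-bit values of C1, C2 and X.  */
void check_unsigned_wrapping (void)
{
  for (int c1 = 0; c1 <= UINT8_MAX; c1++)
    for (int c2 = 0; c2 <= UINT8_MAX; c2++)
      for (int x = 0; x <= UINT8_MAX; x++)
        {
          uint8_t a = (uint8_t) c1, b = (uint8_t) c2, v = (uint8_t) x;
          assert (uw_lt_before (a, v, b) == uw_lt_after (a, v, b));
          assert (uw_le_before (a, v, b) == uw_le_after (a, v, b));
        }
}
```

Plugging C1 = 300 and C2 = 100 into case (b) for a 32-bit type gives x - 201 < 100, which matches the addi / sltiu pair in the commit message.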
Artemiy Volkov [Tue, 8 Oct 2024 23:51:08 +0000 (17:51 -0600)]
tree-optimization/116024 - simplify C1-X cmp C2 for UB-on-overflow types
Implement a match.pd pattern for C1 - X cmp C2, where C1 and C2 are
integer constants and X is of a UB-on-overflow type. The pattern is
simplified to X rcmp C1 - C2 by moving X and C2 to the other side of the
comparison (with opposite signs). If C1 - C2 happens to overflow,
replace the whole expression with either a constant 0 or a constant 1
node, depending on the comparison operator and the sign of the overflow.
This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:
10 - (int) x <= 9;
now compiles to
sgt a0,a0,zero
instead of
li a5,10
sub a0,a5,a0
slti a0,a0,10
on 32-bit RISC-V.
Additional examples can be found in the newly added test file. This
patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32. Existing tests were
adjusted where necessary.
gcc/ChangeLog:
PR tree-optimization/116024
* match.pd: New transformation around integer comparison.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr116024.c: New test.
* gcc.dg/pr67089-6.c: Adjust.
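For the UB-on-overflow commit above, the rewrite can be illustrated with plain int as long as the inputs stay in the range where C1 - X is defined. The sketch below covers the example from the commit message and one case where C1 - C2 overflows. The second case and all function names are assumptions chosen for illustration; they reflect a reading of the description, not the pattern's actual code.

```c
#include <assert.h>
#include <limits.h>

/* Original form with C1 = 10, C2 = 9; meaningful only while 10 - x does
   not overflow, which the compiler may assume for plain int.  */
int ub_le_before (int x) { return 10 - x <= 9; }

/* Rewritten form: X rcmp C1 - C2, i.e. x >= 1 (the same as x > 0).  */
int ub_le_after (int x) { return x >= 1; }

/* With C1 = INT_MIN and C2 = 1, C1 - C2 overflows.  For every x where
   INT_MIN - x is defined (x <= 0) the comparison is true, so the whole
   expression can fold to the constant 1.  */
int ub_lt_overflow_case (int x) { return INT_MIN - x < 1; }

/* Illustrative self-check restricted to inputs without overflow.  */
void check_ub_on_overflow (void)
{
  for (int x = -100000; x <= 100000; x++)
    assert (ub_le_before (x) == ub_le_after (x));
  assert (ub_le_before (INT_MAX) == ub_le_after (INT_MAX));
  for (int x = -100000; x <= 0; x++)
    assert (ub_lt_overflow_case (x) == 1);
  assert (ub_lt_overflow_case (INT_MIN) == 1);
}
```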
Yangyu Chen [Tue, 8 Oct 2024 17:08:44 +0000 (11:08 -0600)]
RISC-V: Implement TARGET_CAN_INLINE_P
Currently, we lack support for TARGET_CAN_INLINE_P on the RISC-V
ISA. As a result, certain functions cannot be inlined when specific
options, such as __attribute__((target("arch=+v"))), are used.
This can lead to potential performance issues when building
retargetable binaries for RISC-V.
To address this, I have implemented the riscv_can_inline_p function.
This addition enables inlining when the callee either has no special
options or when some options match, while also ensuring that the
callee's ISA is a subset of the caller's. I also check some other
options when always_inline is not set.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (cl_opt_var_ref_t): Add
cl_opt_var_ref_t pointer to member of cl_target_option.
(struct riscv_ext_flag_table_t): Add new cl_opt_var_ref_t field.
(RISCV_EXT_FLAG_ENTRY): New macro to simplify the definition of
riscv_ext_flag_table.
(riscv_ext_is_subset): New function to check if the callee's ISA
is a subset of the caller's.
(riscv_x_target_flags_isa_mask): New function to get the mask of
ISA extension in x_target_flags of gcc_options.
* config/riscv/riscv-subset.h (riscv_ext_is_subset): Declare
riscv_ext_is_subset function.
(riscv_x_target_flags_isa_mask): Declare
riscv_x_target_flags_isa_mask function.
* config/riscv/riscv.cc (riscv_can_inline_p): New function.
(TARGET_CAN_INLINE_P): Implement TARGET_CAN_INLINE_P.
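The subset condition described in the TARGET_CAN_INLINE_P commit above can be sketched with a bitmask: the callee may be inlined only if it needs no ISA extension that the caller lacks. Everything in this block is a hypothetical model. The extension bits, the function name isa_is_subset_model and its signature are assumptions; the real riscv_ext_is_subset works on GCC's option structures, which are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension bits, for illustration only.  */
enum
{
  MODEL_EXT_M = 1u << 0,
  MODEL_EXT_A = 1u << 1,
  MODEL_EXT_C = 1u << 2,
  MODEL_EXT_V = 1u << 3
};

/* True when every extension used by the callee is also enabled in the
   caller, i.e. the callee's ISA is a subset of the caller's.  */
bool isa_is_subset_model (uint32_t caller_isa, uint32_t callee_isa)
{
  return (callee_isa & ~caller_isa) == 0;
}

/* Illustrative self-check: a vector-enabled caller can absorb a baseline
   callee, but not the other way round.  */
void check_isa_subset (void)
{
  uint32_t base = MODEL_EXT_M | MODEL_EXT_A | MODEL_EXT_C;
  uint32_t with_v = base | MODEL_EXT_V;
  assert (isa_is_subset_model (with_v, base));
  assert (!isa_is_subset_model (base, with_v));
  assert (isa_is_subset_model (base, base));
}
```

This ordering matters for the arch=+v example: code compiled for the baseline must not absorb a function that relies on vector instructions.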
Pan Li [Tue, 8 Oct 2024 03:28:44 +0000 (11:28 +0800)]
RISC-V: Add testcases for form 1 of scalar signed SAT_TRUNC
Form 1:
#define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x) \
{ \
NT trunc = (NT)x; \
return (WT)NT_MIN <= x && x <= (WT)NT_MAX \
? trunc \
: x < 0 ? NT_MIN : NT_MAX; \
}
The tests below pass with this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_arith_data.h: Add test data for SAT_TRUNC.
* gcc.target/riscv/sat_s_trunc-1-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-1-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-1-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-1-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-1-i64-to-i8.c: New test.
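As an illustration of what the form 1 macro above computes, the sketch below instantiates it for int64_t to int32_t and checks a few boundary values. The macro body is copied from the commit message; the check function and the chosen inputs are assumptions and are not the contents of the new test files.

```c
#include <assert.h>
#include <stdint.h>

#define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \
NT __attribute__((noinline))                          \
sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x)             \
{                                                     \
  NT trunc = (NT)x;                                   \
  return (WT)NT_MIN <= x && x <= (WT)NT_MAX           \
    ? trunc                                           \
    : x < 0 ? NT_MIN : NT_MAX;                        \
}

/* Defines sat_s_trunc_int64_t_to_int32_t_fmt_1.  */
DEF_SAT_S_TRUNC_FMT_1 (int64_t, int32_t, INT32_MIN, INT32_MAX)

/* Illustrative self-check: in-range values pass through unchanged and
   out-of-range values clamp to the nearest int32_t limit.  */
void check_sat_s_trunc (void)
{
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 (0) == 0);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 (-5) == -5);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 (INT32_MAX) == INT32_MAX);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 ((int64_t) INT32_MAX + 1)
          == INT32_MAX);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 ((int64_t) INT32_MIN - 1)
          == INT32_MIN);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 (INT64_MAX) == INT32_MAX);
  assert (sat_s_trunc_int64_t_to_int32_t_fmt_1 (INT64_MIN) == INT32_MIN);
}
```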
Before this patch:
sat_s_trunc_int64_t_to_int32_t_fmt_1:
li a5,1
slli a5,a5,31
li a4,-1
add a5,a0,a5
srli a4,a4,32
bgtu a5,a4,.L2
sext.w a0,a0
ret
.L2:
srai a5,a0,63
li a0,-2147483648
xor a0,a0,a5
not a0,a0
ret
After this patch:
sat_s_trunc_int64_t_to_int32_t_fmt_1:
li a5,-2147483648
xori a3,a5,-1
slt a4,a0,a3
slt a5,a5,a0
and a5,a4,a5
srai a4,a0,63
xor a4,a4,a3
addi a3,a5,-1
neg a5,a5
and a4,a4,a3
and a0,a0,a5
or a0,a0,a4
sext.w a0,a0
ret
The test suites below pass with this patch.
* The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_expand_sstrunc): Add new
func decl to expand SAT_TRUNC.
* config/riscv/riscv.cc (riscv_expand_sstrunc): Add new func
impl to expand SAT_TRUNC.
* config/riscv/riscv.md (sstrunc<mode><anyi_double_truncated>2):
Add new pattern for double truncation.
(sstrunc<mode><anyi_quad_truncated>2): Ditto but for quad.
(sstrunc<mode><anyi_oct_truncated>2): Ditto but for oct.
Pan Li [Tue, 8 Oct 2024 03:06:23 +0000 (11:06 +0800)]
Widening-Mul: Fix one bug of consume after phi node released
When trying to match saturation-related patterns on PHI nodes, we may
have to try each pattern for every PHI node of a bb, i.e.:
for each PHI node in bb:
gphi *phi = xxx;
try_match_sat_add (, phi);
try_match_sat_sub (, phi);
try_match_sat_trunc (, phi);
The PHI node will be removed if one of the above 3 sat patterns is
matched. This causes a problem: for example, if sat_add is matched and
the phi is removed (freed), the following sat_sub and sat_trunc
attempts will still use the removed (freed) phi node.
This patch fixes this use-after-release issue by ensuring that at most
one of the above patterns is matched for each phi node.
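The control flow of the fix can be modelled in a few lines of C: each matcher reports whether it matched, and the caller stops at the first success so that a released node is never handed to the next matcher. This is a hypothetical model. The struct, the enum values and both function names are assumptions; only the idea of returning bool and matching at most one pattern comes from the ChangeLog.

```c
#include <assert.h>
#include <stdbool.h>

enum { MODEL_ADD, MODEL_SUB, MODEL_TRUNC, MODEL_OTHER };

/* Hypothetical stand-in for a PHI node; 'released' models its removal.  */
struct phi_model
{
  int kind;
  bool released;
  int uses_after_release;
};

/* Models one match_saturation_* attempt: on success the node is released
   and true is returned.  Touching a released node is counted as a bug.  */
static bool try_match_model (struct phi_model *phi, int kind)
{
  if (phi->released)
    phi->uses_after_release++;
  if (phi->released || phi->kind != kind)
    return false;
  phi->released = true;
  return true;
}

/* Short-circuit evaluation guarantees that at most one pattern matches
   and that no later matcher sees a released node.  */
bool process_phi_model (struct phi_model *phi)
{
  return try_match_model (phi, MODEL_ADD)
         || try_match_model (phi, MODEL_SUB)
         || try_match_model (phi, MODEL_TRUNC);
}
```

Calling all three matchers unconditionally instead would increment uses_after_release whenever an earlier one succeeds, which mirrors the bug described above.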
The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
gcc/ChangeLog:
* tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Rename
to...
(build_saturation_binary_arith_call_and_replace): ...this.
(build_saturation_binary_arith_call_and_insert): ...this.
(match_unsigned_saturation_add): Leverage renamed func.
(match_unsigned_saturation_sub): Ditto.
(match_saturation_add): Return bool on matched and leverage
renamed func.
(match_saturation_sub): Ditto.
(match_saturation_trunc): Ditto.
(math_opts_dom_walker::after_dom_children): Ensure at most one
pattern will be matched for each phi node.
The test suites below pass with this patch.
* The rv64gcv full regression test, with pr116861-1.c failing.
* The x86 bootstrap test.
* The x86 full regression test.
The pr116861-1.c ICE will be fixed in the underlying patch, as it
just triggers an existing bug.
gcc/ChangeLog:
* match.pd: Add case 1 matching pattern for signed SAT_TRUNC.
* tree-ssa-math-opts.cc (gimple_signed_integer_sat_trunc): Add
new decl for signed SAT_TRUNC.
(match_saturation_trunc): Add new func impl to try SAT_TRUNC
pattern on phi node.
(math_opts_dom_walker::after_dom_children): Add
match_saturation_trunc for phi node iteration.
Jan Beulich [Tue, 8 Oct 2024 14:05:33 +0000 (16:05 +0200)]
x86/{,V}AES: adjust when to force EVEX encoding
Commit a79d13a01f8c ("i386: Fix aes/vaes patterns [PR114576]") correctly
said "..., but we need to emit {evex} prefix in the assembly if AES ISA
is not enabled". Yet it did so only for the TARGET_AES insns. Going from
the alternative chosen in the TARGET_VAES insns isn't quite right: If
AES is (also) enabled, EVEX encoding would needlessly be forced.
Palmer Dabbelt [Tue, 8 Oct 2024 13:28:32 +0000 (07:28 -0600)]
[RISC-V][PR target/116615] RISC-V: Use default LOGICAL_OP_NON_SHORT_CIRCUIT
> We have cheap logical ops, so let's just move this back to the default
> to take advantage of the standard branch/op heuristics.
>
> gcc/ChangeLog:
>
> PR target/116615
> * config/riscv/riscv.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Remove.
> ---
> There's a bunch more discussion in the bug, but it's starting to smell
> like this was just a holdover from MIPS (where maybe it also shouldn't
> be set). I haven't tested this, but I figured I'd send the patch to get
> a little more visibility.
>
> I guess we should also kick off something like a SPEC run to make sure
> there's no regressions?
So as I noted earlier, this appears to be a nice win on the BPI. Testsuite
fallout is minimal -- just the one SFB related test tripping at -Os that was
also hit by Andrew P's work.
After looking at it more closely, the SFB codegen and the codegen after
Andrew's work should be equivalent assuming two independent ops can dispatch
together.
The test actually generates sensible code at -Os. It's the -Os in combination
with the -fno-ssa-phiopt that causes problems. I think the best thing to do
here is just skip at -Os. That still keeps a degree of testing the SFB path.
Tested successfully in my tester. But will wait for the pre-commit tester to
render a verdict before moving forward.
Fix parsing of substring refs in coarrays. [PR51815]
The parser was greedily taking the substring ref as an array ref because
an array_spec was present. Fix this by only parsing the coarray (pseudo)
ref when no regular array is present.
gcc/fortran/ChangeLog:
PR fortran/51815
* array.cc (gfc_match_array_ref): Only parse coarray part of
ref.
* match.h (gfc_match_array_ref): Add flag.
* primary.cc (gfc_match_varspec): Request only coarray ref
parsing when no regular array is present. Report error on
unexpected additional ref.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr102532.f90: Fix dg-errors: Add new error.
* gfortran.dg/coarray/substring_1.f90: New test.
Pan Li [Thu, 3 Oct 2024 08:47:52 +0000 (16:47 +0800)]
RISC-V: Add testcases for form 4 of scalar signed SAT_SUB
Form 4:
#define DEF_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_sub_##T##_fmt_4 (T x, T y) \
{ \
T minus; \
bool overflow = __builtin_sub_overflow (x, y, &minus); \
return !overflow ? minus : x < 0 ? MIN : MAX; \
}
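A minimal sketch of how the form 4 macro might be instantiated and what it returns at the boundaries. The choice of int8_t with INT8_MIN/INT8_MAX is an assumption made for illustration; the actual tests use the helper macros in sat_arith.h. It relies on the GCC/Clang __builtin_sub_overflow built-in.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEF_SAT_S_SUB_FMT_4(T, UT, MIN, MAX)                  \
T __attribute__((noinline))                                   \
sat_s_sub_##T##_fmt_4 (T x, T y)                              \
{                                                             \
  T minus;                                                    \
  bool overflow = __builtin_sub_overflow (x, y, &minus);      \
  return !overflow ? minus : x < 0 ? MIN : MAX;               \
}

/* Illustrative instantiation (assumed, not copied from the testsuite):
   defines int8_t sat_s_sub_int8_t_fmt_4 (int8_t x, int8_t y).  */
DEF_SAT_S_SUB_FMT_4 (int8_t, uint8_t, INT8_MIN, INT8_MAX)
```

With this instantiation, -128 - 1 saturates to INT8_MIN, 127 - (-1) saturates to INT8_MAX, and in-range differences such as 5 - 3 are returned unchanged.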
The tests below passed for this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_sub-4-i16.c: New test.
* gcc.target/riscv/sat_s_sub-4-i32.c: New test.
* gcc.target/riscv/sat_s_sub-4-i64.c: New test.
* gcc.target/riscv/sat_s_sub-4-i8.c: New test.
* gcc.target/riscv/sat_s_sub-run-4-i16.c: New test.
* gcc.target/riscv/sat_s_sub-run-4-i32.c: New test.
* gcc.target/riscv/sat_s_sub-run-4-i64.c: New test.
* gcc.target/riscv/sat_s_sub-run-4-i8.c: New test.
Pan Li [Thu, 3 Oct 2024 08:15:56 +0000 (16:15 +0800)]
RISC-V: Add testcases for form 3 of scalar signed SAT_SUB
Form 3:
#define DEF_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_sub_##T##_fmt_3 (T x, T y) \
{ \
T minus; \
bool overflow = __builtin_sub_overflow (x, y, &minus); \
return overflow ? x < 0 ? MIN : MAX : minus; \
}
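One way to convince oneself that form 3 really is a saturating subtraction is to compare it exhaustively against a widened-arithmetic reference for an 8-bit type. The sketch below does that; ref_sat_sub_i8 and count_mismatches are hypothetical helpers written for this example, and the int8_t instantiation is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEF_SAT_S_SUB_FMT_3(T, UT, MIN, MAX)                  \
T __attribute__((noinline))                                   \
sat_s_sub_##T##_fmt_3 (T x, T y)                              \
{                                                             \
  T minus;                                                    \
  bool overflow = __builtin_sub_overflow (x, y, &minus);      \
  return overflow ? x < 0 ? MIN : MAX : minus;                \
}

DEF_SAT_S_SUB_FMT_3 (int8_t, uint8_t, INT8_MIN, INT8_MAX)

/* Reference: subtract in int, which cannot overflow for 8-bit inputs,
   then clamp to the int8_t range.  */
int8_t ref_sat_sub_i8 (int8_t x, int8_t y)
{
  int wide = (int) x - (int) y;
  if (wide < INT8_MIN)
    return INT8_MIN;
  if (wide > INT8_MAX)
    return INT8_MAX;
  return (int8_t) wide;
}

/* Count the input pairs on which form 3 and the reference disagree.  */
int count_mismatches (void)
{
  int bad = 0;
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (sat_s_sub_int8_t_fmt_3 ((int8_t) x, (int8_t) y)
          != ref_sat_sub_i8 ((int8_t) x, (int8_t) y))
        bad++;
  return bad;
}
```

The same comparison applies to form 4, since the two forms differ only in how the conditional expression is written.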
The tests below passed for this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_sub-3-i16.c: New test.
* gcc.target/riscv/sat_s_sub-3-i32.c: New test.
* gcc.target/riscv/sat_s_sub-3-i64.c: New test.
* gcc.target/riscv/sat_s_sub-3-i8.c: New test.
* gcc.target/riscv/sat_s_sub-run-3-i16.c: New test.
* gcc.target/riscv/sat_s_sub-run-3-i32.c: New test.
* gcc.target/riscv/sat_s_sub-run-3-i64.c: New test.
* gcc.target/riscv/sat_s_sub-run-3-i8.c: New test.
Jakub Jelinek [Tue, 8 Oct 2024 08:40:29 +0000 (10:40 +0200)]
ssa-math-opts, i386: Handle most unordered values rather than just 2 [PR116896]
On Mon, Oct 07, 2024 at 10:32:57AM +0200, Richard Biener wrote:
> > They are implementation defined, -1, 0, 1, 2 is defined by libstdc++:
> > using type = signed char;
> > enum class _Ord : type { equivalent = 0, less = -1, greater = 1 };
> > enum class _Ncmp : type { _Unordered = 2 };
> > https://eel.is/c++draft/cmp#categories.pre-1 documents them as
> > enum class ord { equal = 0, equivalent = equal, less = -1, greater = 1 }; // exposition only
> > enum class ncmp { unordered = -127 }; // exposition only
> > and now looking at it, LLVM's libc++ takes that literally and uses
> > -1, 0, 1, -127. One can't use <=> operator without including <compare>
> > which provides the enums, so I think if all we care about is libstdc++,
> > then just hardcoding -1, 0, 1, 2 is fine, if we want to also optimize
> > libc++ when used with gcc, we could support -1, 0, 1, -127 as another
> > option.
> > Supporting arbitrary 4 values doesn't make sense, at least on x86 the
> > only reason to do the conversion to int in an optab is a good sequence
> > to turn the flag comparisons to -1, 0, 1. So, either we do nothing
> > more than the patch, or add handle both 2 and -127 for unordered,
> > or add support for arbitrary value for the unordered case except
> > -1, 0, 1 (then -1 could mean signed int, 1 unsigned int, 0 do the jumps
> > and any other value what should be returned for unordered.
Here is an incremental patch which adds support for (almost) arbitrary
unordered constant value. It changes the .SPACESHIP and spaceship<mode>4
optab conventions, so 0 means use branches, floating point, -1, 0, 1, 2
results consumed by tree-ssa-math-opts.cc emitted comparisons, -1
means signed int comparisons, -1, 0, 1 results, 1 means unsigned int
comparisons, -1, 0, 1 results, and for constant other than -1, 0, 1
which fit into [-128, 127] converted to the PHI type are otherwise
specified as the last argument (then it is -1, 0, 1, C results).
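The convention for the last argument can be sketched in plain C as follows. This is only a model of the numeric encoding described above, not the code in tree-ssa-math-opts.cc or the i386 expander; the enum and function names are hypothetical.

```c
#include <math.h>

/* Hypothetical names; only the numeric convention comes from the text.  */
enum spaceship_kind
{
  SPACESHIP_FP_BRANCHES,    /* last argument 0: floating point, unordered is 2 */
  SPACESHIP_SIGNED_INT,     /* last argument -1: signed int, results -1, 0, 1 */
  SPACESHIP_UNSIGNED_INT,   /* last argument 1: unsigned int, results -1, 0, 1 */
  SPACESHIP_FP_UNORDERED_C  /* any other constant C in [-128, 127] */
};

enum spaceship_kind classify_last_arg (int last_arg)
{
  switch (last_arg)
    {
    case 0:
      return SPACESHIP_FP_BRANCHES;
    case -1:
      return SPACESHIP_SIGNED_INT;
    case 1:
      return SPACESHIP_UNSIGNED_INT;
    default:
      return SPACESHIP_FP_UNORDERED_C;
    }
}

/* Model of the floating-point result.  Callers are expected to pass either
   0 (unordered yields 2) or a constant outside [-1, 1] (unordered yields
   that constant); -1 and 1 select the integer variants instead.  */
int fp_spaceship (double a, double b, int last_arg)
{
  int unordered = last_arg == 0 ? 2 : last_arg;
  if (isunordered (a, b))
    return unordered;
  return (a > b) - (a < b);
}
```

For example, passing -127 as the last argument models the libc++ encoding of unordered mentioned in the quoted discussion, while 0 keeps the libstdc++ value 2.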
2024-10-08 Jakub Jelinek <jakub@redhat.com>
PR middle-end/116896
* tree-ssa-math-opts.cc (optimize_spaceship): Handle unordered values
other than 2, but they still need to be signed char range possibly
converted to the PHI result and can't be in [-1, 1] range. Use
last .SPACESHIP argument of 1 for unsigned int comparisons, -1 for
signed int, 0 for floating point branches and any other for floating
point with that value as unordered.
* config/i386/i386-expand.cc (ix86_expand_fp_spaceship): Use op2 rather
const2_rtx if op2 is not const0_rtx for unordered result.
(ix86_expand_int_spaceship): Change INTVAL (op2) == 1 tests to
INTVAL (op2) != -1.
* doc/md.texi (spaceship@var{m}4): Document the above changes.
Eric Botcazou [Wed, 11 Sep 2024 18:15:32 +0000 (20:15 +0200)]
ada: Fix infinite loop on MSP430 with -mlarge flag
This removes the loop trying to find a pointer mode among the integer modes,
which is obsolete and does not work on platforms where pointers have unusual
size like MSP430 or special semantics like Morello.
gcc/ada/ChangeLog:
PR ada/116498
* gcc-interface/decl.cc (validate_size): Use the size of the default
pointer mode as the minimum size for access types and fat pointers.
Eric Botcazou [Tue, 10 Sep 2024 13:02:19 +0000 (15:02 +0200)]
ada: Remove -gnateE information message for noncontiguous enumeration type
It is very confusing for the user because it does not make any reference
to the source code but only to details of the underlying implementation.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (Raise_Error_to_gnu) <CE_Invalid_Data>:
Do not generate the range information if the value is a call to a
Rep_To_Pos function.
The initial signal handling code introduced for aarch64-android
overlooked details of the tasking runtime, not in the initial testing
perimeter.
Specifically, a reference to __gnat_sigtramp from __gnat_error_handler,
initially introduced for the arm port, was prevented if !arm on the
grounds that other ports would rely on kernel CFI. aarch64-android
does provide kernel CFI and __gnat_sigtramp was not provided for this
configuration.
But there is a similar reference from s-intman__android, which kicks in
as soon as the tasking runtime gets activated, triggering link failures.
Testing for more precise target specific parameters from Ada
code is inconvenient and replicating the logic is not attractive in
any case, so this change addresses the problem in the following
fashion:
- Always provide a __gnat_sigtramp entry point, common to the
tasking and non-tasking signal handling code for all the Android
configurations,
- There (C code), from target definition macros, select a path
that either routes directly to the actual signal handler or goes
through the intermediate layer providing hand crafted CFI
information which allows unwinding up to the interrupted code.
- Similarly to what was done for VxWorks, move the arm specific
definitions to a separate header file to make the general structure
of the common C code easier to grasp,
- Adjust the comments in the common sigtramp.h header to
account for such an organisation possibility.
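The routing described in the list above can be pictured with a small C sketch. Everything here is hypothetical: the demo_ names, the DEMO_NEEDS_CFI_TRAMPOLINE macro and the handler signature stand in for whatever sigtramp-android.c and the target definition macros actually use.

```c
/* Hypothetical handler type: the real one is declared in sigtramp.h.  */
typedef void demo_handler_t (int signo, void *siginfo, void *sigcontext);

/* Records the last signal number seen, so the relay can be observed.  */
int demo_last_signo;

void demo_record_handler (int signo, void *siginfo, void *sigcontext)
{
  (void) siginfo;
  (void) sigcontext;
  demo_last_signo = signo;
}

#ifdef DEMO_NEEDS_CFI_TRAMPOLINE
/* On targets without usable kernel CFI this would be an assembly routine
   carrying hand-crafted CFI; it is only declared here.  */
void demo_sigtramp_with_cfi (int signo, void *siginfo, void *sigcontext,
                             demo_handler_t *handler);
#endif

/* Single entry point offered to both tasking and non-tasking code.  */
void demo_sigtramp (int signo, void *siginfo, void *sigcontext,
                    demo_handler_t *handler)
{
#ifdef DEMO_NEEDS_CFI_TRAMPOLINE
  demo_sigtramp_with_cfi (signo, siginfo, sigcontext, handler);
#else
  /* Neutral relay: forward straight to the underlying handler.  */
  handler (signo, siginfo, sigcontext);
#endif
}
```

The point of the design is that callers always reference the same symbol, so the Ada side does not need to know which targets require the intermediate CFI layer.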
gcc/ada/ChangeLog:
* sigtramp-armdroid.c: Refactor into ...
* sigtramp-android.c, sigtramp-android-asm.h: New files.
* Makefile.rtl (arm/aarch64-android section): Add
sigtramp-android.o to EXTRA_LIBGNAT_OBJS unconditionally. Add
sigtramp.h and sigtramp-android-asm.h to EXTRA_LIBGNAT_SRCS.
* init.c (android section, __gnat_error_handler): Defer to
__gnat_sigtramp unconditionally again.
* sigtramp.h: Adjust comments to allow neutral signal handling
relays, merely forwarding to the underlying handler without any
intermediate CFI magic.
Eric Botcazou [Thu, 12 Sep 2024 10:45:27 +0000 (12:45 +0200)]
ada: Fix bogus Constraint_Error for 'Wide_Wide_Value on wide enumeration literal
The problem is that 'Wide_Wide_Value is piggybacked on 'Value and the latter
invokes System.Val_Util.Normalize_String, which incorrectly normalizes the
input string in the presence of enumeration literals with wide characters.
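To make the role of the new To_Upper_Case formal concrete, here is a C model of a normalization step that trims blanks and optionally upper-cases the remaining text. It is an assumption about the general shape of Normalize_String, not a translation of s-valuti.adb; demo_normalize_string and its parameters are invented for this sketch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Find the first and last non-blank characters of S and, when requested,
   upper-case that range in place.  A leading apostrophe marks a character
   literal, which is left untouched.  Returns false if S is all blanks.  */
bool demo_normalize_string (char *s, size_t *first, size_t *last,
                            bool to_upper_case)
{
  size_t len = strlen (s);
  size_t f = 0;

  while (f < len && s[f] == ' ')
    f++;
  if (f == len)
    return false;

  size_t l = len - 1;
  while (s[l] == ' ')
    l--;

  if (to_upper_case && s[f] != '\'')
    for (size_t i = f; i <= l; i++)
      s[i] = (char) toupper ((unsigned char) s[i]);

  *first = f;
  *last = l;
  return true;
}
```

Passing false for the flag leaves the bytes of the literal alone, which mirrors what the ChangeLog below describes for the wide and wide-wide cases.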
gcc/ada/ChangeLog:
PR ada/115507
* exp_imgv.adb (Expand_Valid_Value_Attribute): Add actual parameter
for Is_Wide formal in the call to Valid_Value_Enumeration_NN.
(Expand_Value_Attribute): Likewise.
* libgnat/s-vaen16.ads (Value_Enumeration_16): Add Is_Wide formal.
(Valid_Value_Enumeration_16): Likewise.
* libgnat/s-vaen32.ads (Value_Enumeration_32): Likewise.
(Valid_Value_Enumeration_32): Likewise.
* libgnat/s-vaenu8.ads (Value_Enumeration_8): Likewise.
(Valid_Value_Enumeration_8): Likewise.
* libgnat/s-valboo.adb (Value_Boolean): Pass True for To_Upper_Case
formal parameter in call to Normalize_String.
* libgnat/s-valcha.adb (Value_Character): Likewise.
* libgnat/s-valuen.ads (Value_Enumeration): Add Is_Wide formal.
(Valid_Value_Enumeration): Likewise.
* libgnat/s-valuen.adb (Value_Enumeration_Pos): Likewise and pass
its negation for To_Upper_Case formal in call to Normalize_String.
(Valid_Value_Enumeration): Add Is_Wide formal and forward it in
call to Value_Enumeration_Pos.
(Value_Enumeration): Likewise.
* libgnat/s-valuti.ads (Normalize_String): Add To_Upper_Case formal
parameter and adjust post-condition accordingly.
* libgnat/s-valuti.adb (Normalize_String): Add To_Upper_Case formal
parameter and adjust implementation accordingly.
* libgnat/s-valwch.adb (Value_Wide_Wide_Character): Pass False for
To_Upper_Case formal parameter in call to Normalize_String.
Eric Botcazou [Wed, 11 Sep 2024 17:42:03 +0000 (19:42 +0200)]
ada: Fix bogus error in instantiation with formal package
The compiler reports that an actual does not match the formal when there
is a defaulted formal discrete type because Check_Formal_Package_Instance
fails to skip the implicit base type generated by the compiler.
gcc/ada/ChangeLog:
PR ada/114636
* sem_ch12.adb (Check_Formal_Package_Instance): For a defaulted
formal discrete type, skip the generated implicit base type.
Eric Botcazou [Wed, 11 Sep 2024 17:37:08 +0000 (19:37 +0200)]
ada: Fix negative value returned by 'Image for array with nonnegative component
The problem is that Exp_Put_Image.Build_Elementary_Put_Image_Call uses the
signedness of the base type but the size of the first subtype, hence the
discrepancy between them.
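The effect of pairing the wrong signedness with a narrow size can be reproduced in C. In this sketch an 8-bit component holding 200 is widened once as a signed 8-bit value and once as an unsigned one; the function names are illustrative, and the signed conversion relies on GCC's documented modulo behaviour for out-of-range conversions.

```c
#include <stdint.h>

/* Signedness taken from a signed base type, size taken from an 8-bit
   subtype: the bit pattern is reinterpreted as a signed 8-bit value.  */
long long widen_as_signed8 (uint8_t raw)
{
  return (int8_t) raw;
}

/* Size and signedness taken from the same (unsigned 8-bit) type.  */
long long widen_as_unsigned8 (uint8_t raw)
{
  return raw;
}
```

Here widen_as_signed8 (200) yields -56 under GCC while widen_as_unsigned8 (200) yields 200, which is the kind of negative image the fix avoids.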
gcc/ada/ChangeLog:
PR ada/115535
* exp_put_image.adb (Build_Elementary_Put_Image_Call): Use the size
of the underlying type to find the support type.
Eric Botcazou [Wed, 11 Sep 2024 17:26:18 +0000 (19:26 +0200)]
ada: Fix internal error on elsif part of if-statement containing if-expression
The problem occurs when the compiler is trying to find a context to which
it can hoist finalization actions coming from the if-expression, because
Find_Hook_Context incorrectly returns the N_Elsif_Part node.
gcc/ada/ChangeLog:
PR ada/114640
* exp_util.adb (Find_Hook_Context): For a node present within a
conditional expression, do not return an N_Elsif_Part node.
A container aggregate can either be empty, contain only
positional elements or named element associations. Reject the
scenario where the latter two are both used.
gcc/ada/ChangeLog:
* diagnostics-constructors.adb
(Make_Mixed_Container_Aggregate_Error): New function for the error
message
(Record_Mixed_Container_Aggregate_Error): New function for the
error message.
* diagnostics-constructors.ads: Likewise.
* diagnostics-repository.ads: register new diagnostics id
* diagnostics.ads: add new diagnostics id
* errout.adb (First_And_Last_Node): Detect the span for component
associations.
* sem_aggr.adb (Resolve_Container_Aggregate): reject container
aggregates that have both named and positional elements.
ada: Add mechanism to test internal error machinery
This patch adds a pragma that triggers an internal compiler error when
analyzed. It is not externally documented and makes it possible to test
the code that runs when the compiler encounters an internal error.
gcc/ada/ChangeLog:
* snames.ads-tmpl: Add new pragma definition.
* par-prag.adb (Prag): Handle new pragma.
* sem_prag.adb (Analyze_Pragma): Implement new pragma.
This patch puts a comment explaining the absence of Storage_Size in an
alphabetically sorted list at the spot where Storage_Size would be in
that list.
gcc/ada/ChangeLog:
* snames.ads-tmpl: Tweak position of comment.
gcc/ada/ChangeLog:
* doc/gnat_rm/gnat_language_extensions.rst: replace
references to RFC's with appropriate text from the rfc
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
ada: Add dependency lines for External_Initialization
When a file included through External_Initialization has been modified,
the unit including it must be recompiled. This patch adds the
generation of dependency lines to the handling of the
External_Initialization aspect, to signal that fact to gnatmake and
other tools that invoke GNAT.
ada: Use correct capacity with two dimensional arrays
Previously when a bounded list was initialized with an array aggregate
then we used the correct size only if the array was one dimensional.
This patch adds support for deriving the size for multidimensional array
types as well.
gcc/ada/ChangeLog:
* exp_aggr.adb (Build_Siz_Exp): Support deriving the size of the
container aggregate with multi-dimensional arrays. Make the
function return a node of an expression instead of an integer.
Additionally calculate the size expression for
Component_Associations.
(To_Int): Make this method available to more functions.
(Aggregate_Size): Relocate the calculation of
Component_Associations to Build_Siz_Exp.
Eric Botcazou [Tue, 10 Sep 2024 12:58:21 +0000 (14:58 +0200)]
ada: Add Is_Rep_To_Pos predicate and export it for use in gigi
This is modeled on the existing Is_Init_Proc predicate.
gcc/ada/ChangeLog:
* exp_tss.ads (Is_Rep_To_Pos): New function declaration.
* exp_tss.adb (Is_Rep_To_Pos): New function body.
* fe.h (Is_Rep_To_Pos): New macro and extern declaration.
Eric Botcazou [Tue, 10 Sep 2024 10:09:48 +0000 (12:09 +0200)]
ada: Avoid dependency on Long_Long_Long_Integer and System.Img_LLLI for 'Image
When the Image attribute is applied directly to another attribute returning
Universal_Integer, for example Enum_Rep, it is converted to the equivalent
of Universal_Integer'Image, which is implemented by Long_Long_Long_Integer
and thus triggers a dependency on System.Img_LLLI, both being unnecessary
in most practical cases.
gcc/ada/ChangeLog:
* exp_imgv.adb (Rewrite_Object_Image): When the prefix is a type
conversion to Universal_Integer, use its expression directly. When
the prefix is an integer literal with Universal_Integer type, try
to compute a narrower type.
Raphaël AMIARD [Thu, 29 Aug 2024 10:43:54 +0000 (12:43 +0200)]
ada: Use semantics from the RFC for declarative items mixed with statements
We want to allow statements lists with declarations *and* an exception
handler. What follows from this is that declarations declared in the
statement list are *not* visible from the exception handler, and that
the following code:
declare
A : Integer := 12;
begin
A : Integer := 15;
<stmts>
exception
when others => ...
Roughly expands to:
declare
A : Integer := 12;
begin
declare
A : Integer := 15;
begin
<stmts>
exception
when others => ...
As such, in the code above, there is no more error triggered for
conflicting declarations of `A`.
Move "Local declarations without block" into curated extensions
Restrict legal local decls in statement lists
Only accept object declarations & renamings, as well as use clauses for
gcc/ada/ChangeLog:
* par-ch11.adb (P_Sequence_Of_Statements): Remove Handled
parameter. Always wrap the statements in a block when there are
declarations in it.
* par-ch5.adb: Adapt call to P_Sequence_Of_Statements. Update
outdated comment, remove useless `Style_Checks` pragma.
(P_Sequence_Of_Statements): Don't emit an error in core extensions
mode. Emit an error when a non valid declaration is parsed in
sequence of statements.
* par.adb: Adapt P_Sequence_Of_Statements' signature
* doc/gnat_rm/gnat_language_extensions.rst: Adapt documentation
now.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Steve Baird [Thu, 5 Sep 2024 20:42:20 +0000 (13:42 -0700)]
ada: Improved support for incomplete parameter types
Fix two bugs uncovered by a recent ACATS test C3A1005: a freezing problem
and a case where a user-defined equality function for an incomplete type
was incorrectly hidden from use-clause visibility by the "corresponding"
predefined op (which doesn't actually exist).
gcc/ada/ChangeLog:
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Don't freeze here
if Has_Delayed_Freeze returns True.
* sem_type.adb (Valid_Equality_Arg): Treat an incomplete type like
a limited type because neither has an implicitly-defined equality
primitive.
(Covers): If either argument is an incomplete type
whose full view is available, then look through to the full view.
* sem_res.adb (Resolve_Actuals): If the actual parameter type is
complete and the formal parameter type is not, then update the
formal parameter type to use the complete view.
squirek [Tue, 13 Aug 2024 11:42:41 +0000 (11:42 +0000)]
ada: Early freezing of types with 'Size'Class
This patch fixes an issue in the compiler whereby declarations of derived types
whose parent is a mutably tagged type cause early freezing of the parent type -
leading to spurious compile-time errors.
gcc/ada/ChangeLog:
* sem_ch3.adb (Derived_Type_Declaration): Modify generation of
compile time check.
Eric Botcazou [Fri, 6 Sep 2024 08:05:58 +0000 (10:05 +0200)]
ada: Print the load address in symbolic backtraces
The load address of PIE executables is printed in non-symbolic backtraces
(-E binder switch) but it makes sense to print it in symbolic backtraces
(-Es binder switch) too, because symbolic backtraces may degenerate into
non-symbolic ones when the executable is stripped for example.
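The reason the load address matters for a stripped PIE executable is that raw backtrace addresses must be rebased before an offline tool can map them to symbols. A minimal sketch of that arithmetic, with a hypothetical helper name and made-up addresses:

```c
#include <stdint.h>

/* Convert a runtime program counter from a backtrace into the address
   relative to the start of the image, given the load address printed at
   the top of the trace.  */
uintptr_t demo_static_address (uintptr_t runtime_pc, uintptr_t load_address)
{
  return runtime_pc - load_address;
}
```

For instance, a frame at 0x40001139 in an image loaded at 0x40000000 corresponds to offset 0x1139, which is the value a symbolizer working on the unstripped binary would need.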
gcc/ada/ChangeLog:
* libgnat/s-trasym__dwarf.adb (LDAD_Header): New String constant.
(Symbolic_Traceback): Print the load address of the executable at
the beginning if it is not null.
If a limited private partial view of a type has an access discriminant with
a default expression, and if the type (perhaps tagged, perhaps not) is
completed by deriving from an immutably limited type, then the default
discriminant expression should not be rejected.
gcc/ada/ChangeLog:
* sem_ch6.adb (Check_Discriminant_Conformance): In testing whether
a default expression is permitted for an access discriminant, we
need to know whether the discriminated type is immutably limited.
Handle another part of this test that cannot easily be handled in
Sem_Aux.Is_Immutably_Limited. This involves declaring a new local
function, Is_Derived_From_Immutably_Limited_Type.
Steve Baird [Thu, 29 Aug 2024 22:17:54 +0000 (15:17 -0700)]
ada: Missing constraint check for 'Length attribute reference
In some cases involving a universal-integer-valued attribute reference
(typically a 'Length attribute reference) occurring as an actual parameter
in a call, the runtime check that the constraints of the formal parameter
are satisfied is incorrectly not performed.
gcc/ada/ChangeLog:
* sem_attr.adb (Resolve_Attribute): When setting the Etype of a
universal-integer-valued attribute reference to the subtype
determined by its context, use the basetype of that subtype
instead of the subtype itself if there is a possibility that the
attribute value will not satisfy the constraints of that subtype.
Otherwise the compiler is, in effect, assuming something that
might not be true. Except use the subtype in the case of a
not-from-source 'Pos attribute reference in order to avoid
breaking things.
This patch adds a way to have the adareducer tool run on an appropriate
set of files when GNAT crashes. This feature is behind the -gnatd_m
debugging switch.
gcc/ada/ChangeLog:
* comperr.adb (Compiler_Abort): Add call to
Generate_Minimal_Reproducer and replace call to Namet.Unlock with
call to Unlock_If_Locked.
* debug.adb: Document new purpose of -gnatd_m and -gnatd_M.
* fname-uf.adb (Instantiate_SFN_Pattern): New procedure.
(Get_Default_File_Name): New function.
(Get_File_Name): Replace inline code with call to
Instantiate_SFN_Pattern.
* fname-uf.ads (Get_Default_File_Name): New function.
* generate_minimal_reproducer.adb (Generate_Minimal_Reproducer):
New procedure.
* namet.adb (Unlock_If_Locked): New function.
* namet.ads (Unlock_If_Locked): Likewise.
* par-prag.adb (Prag): Add special behavior with -gnatd_M.
* set_targ.adb: Minor fixes to comments.
* gcc-interface/Make-lang.in: Update list of object files.
Eric Botcazou [Wed, 4 Sep 2024 22:19:25 +0000 (00:19 +0200)]
ada: Fix wrong finalization of anonymous array aggregate
The issue arises when the aggregate consists only of iterated associations
because, in this case, its expansion uses a 2-pass mechanism which creates
a temporary that needs a fully-fledged initialization, thus running afoul
of the optimization that avoids building the initialization procedure in
the anonymous array case.
gcc/ada/ChangeLog:
* exp_aggr.ads (Is_Two_Pass_Aggregate): New function declaration.
* exp_aggr.adb (Is_Two_Pass_Aggregate): New function body.
(Expand_Array_Aggregate): Call Is_Two_Pass_Aggregate to detect the
aggregates that need the 2-pass expansion.
* exp_ch3.adb (Expand_Freeze_Array_Type): In the anonymous array
case, build the initialization procedure if the initial value in
the object declaration is a 2-pass aggregate.
This patch introduces a GNAT extension that adds a new aspect,
External_Initialization. A section is added to the reference
manual with a description of what the aspect does.
The implementation reuses existing mechanisms, in particular
Sinput.L.Load_Source_File and Sem_Res.Set_String_Literal_Subtype.
A new node kind is added, and nodes of that type are present in what
is passed to the back ends. That makes it necessary to update the back
ends to handle the new node type. The C interface is extended to make
that possible.
gcc/ada/ChangeLog:
* aspects.ads: Add entities for External_Initialization.
* checks.adb (Selected_Length_Checks): Add support for
N_External_Initializer nodes.
* doc/gnat_rm/gnat_language_extensions.rst: Add section for the added
extension.
* exp_util.adb (Insert_Actions): Add support for N_External_Initializer
nodes.
* fe.h (C_Source_Buffer): New function.
* gen_il-fields.ads: Add new field.
* gen_il-gen-gen_nodes.adb: Add N_External_Initializer node kind.
* gen_il-gen.adb: Add new field type.
* gen_il-types.ads: Add new node kind and new field type.
* pprint.adb (Expr_Name): Handle new node kind.
* sem.adb (Analyze): Add support for N_External_Initializer nodes.
* sem_ch13.adb (Analyze_Aspect_Specifications, Check_Aspect_At_Freeze_Point):
Add support for External_Initialization aspect.
* sem_ch3.adb (Apply_External_Initialization): New subprogram.
(Analyze_Object_Declaration): Add support for External_Initialization aspect.
* sem_res.adb (Resolve_External_Initializer): New procedure.
(Resolve): Add support for N_External_Initializer nodes.
(Set_String_Literal_Subtype): Extend to handle N_External_Initializer nodes.
* sinfo-utils.adb (Is_In_Union_Id): Adapt to new field addition.
* sinfo.ads: Add documentation for new node kind and new field.
* sinput.adb, sinput.ads (C_Source_Buffer): Add new C interface function.
* snames.ads-tmpl: Add new aspect identifier.
* sprint.adb (Sprint_Node_Actual): Add nop handling of N_External_Initializer
nodes.
* types.ads: Modify type to allow for new C interface.
* gcc-interface/trans.cc (gnat_to_gnu): Handle new GNAT node type.
* gcc-interface/Make-lang.in: Update list of stage1 run-time library units.
* gnat-style.texi: Regenerate.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Olivier Hainque [Fri, 16 Aug 2024 17:04:37 +0000 (19:04 +0200)]
ada: Use a-nallfl__wraplf.ads for Android
This is the most common definition. Otherwise, from the default:
a-nallfl.ads:51:13: ... intrinsic binding type mismatch on result
a-nallfl.ads:51:13: ... intrinsic binding type mismatch on parameter 1
a-nallfl.ads:51:13: ... profile of "Sin" doesn't match the builtin it binds
gcc/ada/ChangeLog:
* Makefile.rtl (arm/aarch64-android): Associate a-nallfl.ads with
libgnat/a-nallfl__wraplf.ads.
union {
sighandler_t sa_handler;
void (*sa_sigaction)(int, struct siginfo*, void*);
};
sigset_t sa_mask;
int sa_flags;
void (*sa_restorer)(void);
gcc/ada/ChangeLog:
* libgnarl/s-linux__android-arm.ads: New file, renaming of ...
* libgnarl/s-linux__android.ads: ... this file.
* libgnarl/s-linux__android-aarch64.ads: New file. Based on the
-arm variant, with sa_ field positions adjusted.
* Makefile.rtl (arm/aarch64-android pairs): Adjust accordingly.
* libgnarl/s-osinte__android.ads: Rather than making assumptions
on the actual type of the C sigset_t, use
Os_Constants.SIZEOF_sigset_t to define an Ada sigset_t type of the
proper size. Use C.int instead of unsigned_long for sa_flags.
Olivier Hainque [Fri, 16 Aug 2024 15:10:59 +0000 (17:10 +0200)]
ada: Account for aarch64 in init.c section for Android
Unlike the ARM port already there, aarch64 is dwarf CFI based
for unwinding and Android-Linux exposes kernel CFI for signal
handlers.
gcc/ada/ChangeLog:
* init.c (__gnat_error_handler): Map signals straight to Ada
exceptions, without a local CFI trampoline.
(__gnat_adjust_context_for_raise): Guard arm specific code on __arm__
compilation. Do nothing otherwise, relying on libgcc's signal
frame recognition for PC/RA adjustments.
Olivier Hainque [Fri, 16 Aug 2024 15:12:13 +0000 (17:12 +0200)]
ada: Extend arm-android section of Makefile.rtl to aarch64
gcc/ada/ChangeLog:
* Makefile.rtl: Extend arm-android section to aarch64, in a similar
fashion as other arm/aarch64 configurations. Introduce pair
selection guards to prevent match of aarch64-linux-android on the
regular aarch64-linux% cross as well.
ada: sem_prag.adb: ignore compile_time_{warning,error} in CodePeer mode
GNAT sometimes needs help from the GCC back-end in order to check
whether Compile_Time_{Warning,Error} are true. As CodePeer does not have
access to a GCC back-end, it is unable to perform these checks. Thus we
need to remove said pragmas from the tree.
gcc/ada/ChangeLog:
* sem_prag.adb (Process_Compile_Time_Warning_Or_Error): Turn
Compile_Time pragmas into null nodes.
Recompute TYPE_MODE and DECL_MODE for vector_type for accelerator.
gcc/ChangeLog:
PR ipa/96265
* lto-streamer-in.cc (lto_read_tree_1): Set TYPE_MODE and DECL_MODE
for vector_type if offloading is enabled.
(lto_input_mode_table): Remove handling of vector modes.
* tree-streamer-out.cc (pack_ts_decl_common_value_fields): Stream out
VOIDmode for vector_type if offloading is enabled.
(pack_ts_decl_common_value_fields): Likewise.
Xi Ruoyao [Thu, 11 Jul 2024 11:43:48 +0000 (19:43 +0800)]
LoongArch: Add support to annotate tablejump
This is per the request from the kernel developers. For generating the
ORC unwind info, the objtool program needs to analyze the control flow
of a .o file. If a jump table is used, objtool has to correlate the
jump instruction with the table.
On x86 (where objtool was initially developed) it's simple: a relocation
entry naturally correlates them because one single instruction is used
for table-based jump. But on a RISC machine objtool would have to
reconstruct the data flow if it must find out the correlation on its
own.
So, emit an additional section to store the correlation info as pairs of
addresses, each pair contains the address of a jump instruction (jr) and
the address of the jump table. This is very trivial to implement in
GCC.
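From a consumer's point of view, the extra section is just an array of address pairs. The sketch below shows how a tool might look up the jump table for a given jump instruction; the struct layout, field names and lookup function are assumptions for illustration and are not taken from the patch or from objtool.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the hypothetical annotation section: the address of the
   jump instruction (jr) and the address of the table it indexes.  */
struct demo_tablejump_note
{
  uintptr_t jump_insn;
  uintptr_t jump_table;
};

/* Return the entry describing the jump at INSN_ADDR, or NULL if that
   instruction is not an annotated table jump.  */
const struct demo_tablejump_note *
demo_find_note (const struct demo_tablejump_note *notes, size_t count,
                uintptr_t insn_addr)
{
  for (size_t i = 0; i < count; i++)
    if (notes[i].jump_insn == insn_addr)
      return &notes[i];
  return NULL;
}
```

With such a table available, the tool no longer has to reconstruct the data flow that feeds the indirect jump.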
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in
(mannotate-tablejump): New option.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.md (tablejump<mode>): Emit
additional correlation info between the jump instruction and the
jump table, if -mannotate-tablejump.
* doc/invoke.texi: Document -mannotate-tablejump.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/jump-table-annotate.c: New test.
Suggested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
There is a description in <https://github.com/riscv/riscv-isa-manual/blob/main/src/zawrs.adoc>:
"The instructions in the Zawrs extension are only useful in conjunction
with the LR instruction, which is provided by the Zalrsc component
of the A extension."
Tobias Burnus [Mon, 7 Oct 2024 21:57:42 +0000 (23:57 +0200)]
Move gfortran.dg/gomp/allocate-static.f90 to libgomp.fortran/
The testcase was turned into a 'dg-do run' check to check for the alignment,
but this only works in testsuite/gfortran.dg, causing link errors for
out-of-tree testing. The test was added in r15-4104-ga8caeaacf499d5.
gcc/testsuite/:
* gfortran.dg/gomp/allocate-static.f90: Move to libgomp/testsuite/.
libgomp/:
* testsuite/libgomp.fortran/allocate-static.f90: Moved from
gcc/testsuite/ as it is a dg-do run test; use real omp_lib_kinds
instead of local definition