Evgeny Karpov [Sat, 8 Jun 2024 13:49:17 +0000 (13:49 +0000)]
aarch64: Add DLL import/export to AArch64 target
This patch reuses the MinGW implementation to enable DLL import/export
functionality for the aarch64-w64-mingw32 target. It also modifies
environment configurations for MinGW.
Evgeny Karpov [Mon, 24 Jun 2024 12:44:58 +0000 (12:44 +0000)]
aarch64: Add selectany attribute handling
This patch extends the aarch64 attributes list with the selectany
attribute for the aarch64-w64-mingw32 target and reuses the mingw
implementation to handle it.
Evgeny Karpov [Mon, 24 Jun 2024 12:43:05 +0000 (12:43 +0000)]
Rename functions for reuse in AArch64
This patch renames functions related to dllimport/dllexport and
selectany functionality. These functions will be reused in the
aarch64-w64-mingw32 target.
Evgeny Karpov [Mon, 24 Jun 2024 12:38:40 +0000 (12:38 +0000)]
Extract ix86 dllimport implementation to mingw
This patch extracts the ix86 implementation for expanding a SYMBOL
into its corresponding dllimport, far-address, or refptr symbol. It
will be reused in the aarch64-w64-mingw32 target. The implementation
is copied as is from i386/i386.cc with minor changes to follow to the
code style.
Also this patch replaces the original DLL import/export implementation
in ix86 with mingw.
Evgeny Karpov [Sat, 8 Jun 2024 13:18:02 +0000 (13:18 +0000)]
Move mingw_* declarations to the mingw folder
This patch refactors recent changes to move mingw-related
functionality to the mingw folder. More renamings to the mingw_ prefix
will be done in follow-up commits.
This is the first commit in the second patch series to add DLL
import/export implementation to AArch64.
Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com> and
Ron Riddle <ron.riddle@microsoft.com>
Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>
Jakub Jelinek [Tue, 25 Jun 2024 06:35:56 +0000 (08:35 +0200)]
c: Fix ICE related to incomplete structures in C23 [PR114930]
Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL. That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical should
have been already called on the other type. c, the returned type, is usually x
and in that case it should have TYPE_CANONICAL set to itself, or worst
for whatever reason x is not the right canonical type (say it has attributes
or whatever disqualifies it from check_qualified_type). In that case
either it finds some pre-existing type from the variant chain of t which
is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
/* Put the found variant at the head of the variant list so
frequently searched variants get found faster. The C++ FE
benefits greatly from this. */
tree t = *tp;
*tp = TYPE_NEXT_VARIANT (t);
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
TYPE_NEXT_VARIANT (mv) = t;
return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
/* Add the new type to the chain of variants of TYPE. */
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
TYPE_NEXT_VARIANT (m) = t;
TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e. TYPE_CANONICAL
(c) = c;), but also need to process pointers to it and only then return back
to processing x. Processing the whole chain from c again could be costly,
we could have hundreds of types in the chain already processed, and while
the loop would just quickly skip them
for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
{
if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
...
else if (x != t)
continue;
it feels costly. So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.
We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest. There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.
2024-06-25 Jakub Jelinek <jakub@redhat.com>
Martin Uecker <uecker@tugraz.at>
PR c/114930
PR c/115502
gcc/c/
* c-decl.cc (c_update_type_canonical): Assert t is main variant
with 0 TYPE_QUALS. Simplify and don't use check_qualified_type.
Deal with the case where build_qualified_type returns
TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
* gcc.dg/pr114574-1.c: Require lto effective target.
* gcc.dg/pr114574-2.c: Likewise.
* gcc.dg/pr114930.c: New test.
* gcc.dg/pr115502.c: New test.
Jeff Law [Tue, 25 Jun 2024 05:22:21 +0000 (23:22 -0600)]
[committed][RISC-V] Fix some of the testsuite fallout from late-combine patch
This fixes most, but not all of the testsuite fallout from the late-combine
patch. Specifically in the vector space we're often able to eliminate a
broadcast of an scalar element across a vector. That eliminates the vsetvl
related to the broadcast, but more importantly from the testsuite standpoint it
turns .vv forms into .vf or .vx forms.
There were two paths we could have taken here. One to accept .v*, ignoring the
actual register operands. Or to create new matches for the .vx and .vf
variants. I selected the latter as I'd like us to know if the code to avoid
the broadcast regresses.
I'm pushing this through now so that we've got cleaner results and to prevent
duplicate work. I've got patch for the rest of the testsuite fallout, but I
want to think about them a bit.
Kewen Lin [Tue, 25 Jun 2024 05:04:53 +0000 (00:04 -0500)]
Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type
Currently how we determine which mode will be used for a
floating point type is that for a given type precision
(size) call mode_for_size to get the first mode which has
this size in the specified class. On Powerpc, we have
three modes (TF/KF/IF) having the same mode precision 128
(see[1]), so the processing forces us to have to place TF
at the first place, it would require us to make more
adjustment in some generic code to avoid some unexpected
mode conversions and it would be even worse if we get rid
of TF eventually one day. And as Joseph pointed out in [2],
"floating types should have their mode, not a poorly
defined precision value", as Joseph and Richi suggested,
this patch is to introduce one hook mode_for_floating_type
which returns the corresponding mode for type float, double
or long double. The default implementation returns SFmode
for float and DFmode for double or long double. For ports
which need special treatment, there are some other patches
for their own port specific implementation (referring to
how {,LONG_}DOUBLE_TYPE_SIZE get used there). For all
generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending
on the context, some of them are replaced with TYPE_PRECISION
of the according type node, some other are replaced with
GET_MODE_PRECISION on the mode from mode_for_floating_type.
This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port
specific are removed, but there are still some which are
good to be kept for readability then they get renamed with
port specific prefix.
Kewen Lin [Tue, 25 Jun 2024 05:04:51 +0000 (00:04 -0500)]
vms: Replace use of LONG_DOUBLE_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of LONG_DOUBLE_TYPE_SIZE in vms port
with TYPE_PRECISION of long_double_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:49 +0000 (00:04 -0500)]
rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:47 +0000 (00:04 -0500)]
go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in go with TYPE_PRECISION of {float,{,long_}double}_type_node.
* go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
(Gcc_backend::complex_type): Likewise.
Sergei Lewis [Mon, 24 Jun 2024 20:20:14 +0000 (14:20 -0600)]
[PATCH v2 2/3] RISC-V: setmem for RISCV with V extension
This is primarily Sergei's work, my contributions were limited to
merging his expander with the one that's on the trunk, allowing
non-constant value and trivial testsuite adjustments due to option renaming.
I'm doing setmem first because it's the easiest. The others will follow
soon enough.
I've tested this in my system, waiting on pre-commit CI to render its
verdict before moving forward.
gcc/ChangeLog
* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.
* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.
* config/riscv/riscv.md (setmem<mode>): Try riscv_vector::expand_vec_setmem
for constant lengths. Do not require operand 2 to be a constant.
gcc/testsuite/ChangeLog
* gcc.target/riscv/rvv/base/setmem-1.c: New tests
* gcc.target/riscv/rvv/base/setmem-2.c: New tests
* gcc.target/riscv/rvv/base/setmem-3.c: New tests
Patrick O'Neill [Mon, 24 Jun 2024 19:06:15 +0000 (12:06 -0700)]
RISC-V: Add dg-remove-option for z* extensions
This introduces testsuite support infra for removing extensions.
Since z* extensions don't have ordering requirements the logic for
adding/removing those extensions has also been consolidated.
This fixes RVWMO compile testcases failing on Ztso targets by removing
the extension from the -march string.
gcc/ChangeLog:
* doc/sourcebuild.texi (dg-remove-option): Add documentation.
(dg-add-option): Add documentation for riscv_{a,zaamo,zalrsc,ztso}
Harald Anlauf [Sun, 23 Jun 2024 20:36:43 +0000 (22:36 +0200)]
Fortran: fix passing of optional dummy as actual to optional argument [PR55978]
gcc/fortran/ChangeLog:
PR fortran/55978
* trans-array.cc (gfc_conv_array_parameter): Do not dereference
data component of a missing allocatable dummy array argument for
passing as actual to optional dummy. Harden logic of presence
check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
of TRUTH_AND_EXPR.
gcc/testsuite/ChangeLog:
PR fortran/55978
* gfortran.dg/optional_absent_12.f90: New test.
Roger Sayle [Mon, 24 Jun 2024 14:34:03 +0000 (15:34 +0100)]
PR tree-optimization/113673: Avoid load merging when potentially trapping.
This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified. When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.
This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.
2024-06-24 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.
gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.
Richard Biener [Mon, 24 Jun 2024 07:52:39 +0000 (09:52 +0200)]
tree-optimization/115602 - SLP CSE results in cycles
The following prevents SLP CSE to create new cycles which happened
because of a 1:1 permute node being present where its child was then
CSEd to the permute node. Fixed by making a node only available to
CSE to after recursing.
PR tree-optimization/115602
* tree-vect-slp.cc (vect_cse_slp_nodes): Delay populating the
bst-map to avoid cycles.
Richard Biener [Fri, 21 Jun 2024 11:19:26 +0000 (13:19 +0200)]
tree-optimization/115528 - fix vect alignment analysis for outer loop vect
For outer loop vectorization of a data reference in the inner loop
we have to look at both steps to see if they preserve alignment.
What is special for this testcase is that the outer loop step is
one element but the inner loop step four and that we now use SLP
and the vectorization factor is one.
PR tree-optimization/115528
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Make sure to look at both the inner and outer loop step
behavior.
This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.
The pass currently has a single objective: remove definitions by
substituting into all uses. The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.
The patch fixes PR106594. It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.
This is just a first step. I'm hoping that the pass could be
used for other combine-related optimisations in future. In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure. If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.
On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.
Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation. This trips things like:
(define_insn_and_split "..."
[...pattern...]
"...cond..."
"#"
"&& 1"
[...pattern...]
{
...unconditional use of gen_reg_rtx ()...;
}
because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed. rs6000 has several instances of this.
xtensa has a variation in which the split condition is:
"&& can_create_pseudo_p ()"
The failure then is that, if we match after RA, we'll never be
able to split the instruction.
The patch therefore disables the pass by default on i386, rs6000
and xtensa. Hopefully we can fix those ports later (if their
maintainers want). It seems better to add the pass first, though,
to make it easier to test any such fixes.
gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output. That might be
worth doing, but it seems too complex to do as part of this patch.
I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite. This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark. All targets seemed to improve on average:
rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set. These searches are currently
used for two main things:
- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber
The exclusion set was originally a callback function that returned
true for insns that should be ignored. However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.
This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions. Currently the
only two member functions are:
- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality
but more could be added later.
Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious). The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.
Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names. The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".
gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing. Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes. Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes. Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.
The compare_repeat_factors comparator fails qsort checking eventually
because it uses rf2->rank - rf1->rank to compare unsigned numbers
which causes issues for ranks that interpret negative as signed.
Fixed by re-writing the obvious way. I've also fixed the count
comparison which suffers from truncation as count is 64bit signed
while the comparator result is 32bit int (that's a lot less likely
to hit in practice though).
The testcase from the PR is too large to include.
PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
Haochen Gui [Mon, 24 Jun 2024 05:12:51 +0000 (13:12 +0800)]
fwprop: invoke change_is_worthwhile to judge if a replacement is worthwhile
gcc/
* fwprop.cc (try_fwprop_subst_pattern): Invoke change_is_worthwhile
to judge if a replacement is worthwhile. Remove single_set check
and add is_debug_insn check.
* recog.cc (swap_change): Invalidate recog_data when the cached INSN
is swapped out.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Check if the
insn cost of new rtl is unknown and fail the replacement.
Mark Harmstone [Mon, 24 Jun 2024 00:17:39 +0000 (18:17 -0600)]
[PATCH 02/11] Handle CodeView base types
Adds a get_type_num function to translate type DIEs into CodeView
numbers, along with a hash table for this. For now we just deal with
the base types (integers, Unicode chars, floats, and bools).
Mark Harmstone [Sun, 23 Jun 2024 23:48:10 +0000 (17:48 -0600)]
[PATCH 01/11] Output CodeView data about variables
Parse the DW_TAG_variable DIEs, and outputs S_GDATA32 (for global variables)
and S_LDATA32 (static global variables) symbols into the .debug$S section.
gcc/
* dwarf2codeview.cc (S_LDATA32, S_GDATA32): Define.
(struct codeview_symbol): New structure.
(sym, last_sym): New variables.
(write_data_symbol): New function.
(write_codeview_symbols): Call write_data_symbol.
(add_variable, codeview_debug_early_finish): New functions.
* dwarf2codeview.h (codeview_debug_early_finish): Prototype.
* dwarf2out.cc
(dwarf2out_early_finish): Call codeview_debug_early_finish.
Artemiy Volkov [Sun, 23 Jun 2024 20:54:00 +0000 (14:54 -0600)]
[PATCH] RISC-V: Fix unrecognizable pattern in riscv_expand_conditional_move()
Presently, the code fragment:
int x[5];
void
d(int a, int b, int c) {
for (int i = 0; i < 5; i++)
x[i] = (a != b) ? c : a;
}
causes an ICE when compiled with -O2 -march=rv32i_zicond:
test.c: In function 'd':
test.c: error: unrecognizable insn:
11 | }
| ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
(if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
(reg/v:SI 137 [ b ]))
(reg/v:SI 136 [ a ])
(reg/v:SI 138 [ c ]))) -1
(nil))
during RTL pass: vregs
This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand. Fix this by adding
a extra check before performing this optimization.
The code snippet mentioned above is also included in this patch as a new Zicond
testcase.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
CONST0_RTX check.
Jeff Law [Sun, 23 Jun 2024 14:26:25 +0000 (08:26 -0600)]
[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL
Run-of-the-mill checking issue. We had something like (plus (reg) (reg)) and
tried to extract INTVAL (XEXP (x, 1)) which of course blows up with checking
on.
Fixed thusly. Tested on riscv32-elf in my tester. riscv64-elf is in flight,
but won't finish for a while due to other tasks in flight.
PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.
Richard Biener [Sun, 23 Jun 2024 09:26:39 +0000 (11:26 +0200)]
tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.
PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
Richard Biener [Sat, 22 Jun 2024 12:59:09 +0000 (14:59 +0200)]
tree-optimization/115579 - fix wrong code with store-motion
The recent change to relax store motion for variables that cannot have
store data races broke the optimization to share flag vars for stores
that all happen in the same single BB. The following fixes this.
PR tree-optimization/115579
* tree-ssa-loop-im.cc (execute_sm): Return the auxiliary data
created.
(hoist_memory_references): Record the flag var that's eventually
created and re-use it when all stores are in the same BB.
Jeff Law [Sat, 22 Jun 2024 16:39:51 +0000 (10:39 -0600)]
[committed] [RISC-V] Skip zbs-ext-2.c for -Oz as well
> the test should probably also be skipped on -Oz:
>
> === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andi\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andn\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times li\t 1
Yea. Just re-ran thing and sure enough we need to skip -Oz as well. So
committing the obvious change....
gcc/testsuite/
* gcc.target/riscv/zbs-ext-2.c: Also skip for -Oz.
David Malcolm [Fri, 21 Jun 2024 22:20:38 +0000 (18:20 -0400)]
diagnostics: remove duplicate copies of diagnostic_kind_text
No functional change intended.
gcc/ChangeLog:
* diagnostic-format-json.cc
(json_output_format::on_end_diagnostic): Use
get_diagnostic_kind_text rather than embedding a duplicate copy of
the table.
* diagnostic-format-sarif.cc
(make_rule_id_for_diagnostic_kind): Likewise.
* diagnostic.cc (get_diagnostic_kind_text): New.
* diagnostic.h (get_diagnostic_kind_text): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jeff Law [Fri, 21 Jun 2024 21:58:12 +0000 (15:58 -0600)]
[committed] Fix testsuite fallout on stormy16 after IOR->PLUS change
More minor fallout from the IOR->PLUS change a little while ago. This time on
xstormy16.
The pattern to swap nibbles actually tries to handle all the cases of IOR, XOR
and PLUS. But when we generate PLUS earlier in the pipeline, the
simplifications/canonicalizations are slightly different resulting in the
pattern not matching.
This patch adds an alternate pattern which matches what we get now. Basically
it looks like QImode rotate by 4, zero extended to HI.
Run in my tester to verify the regression was fixed. Pushing to the trunk.
gcc/
* config/stormy16/stormy16.md (swpn_zext): New pattern.
Jonathan Wakely [Wed, 19 Jun 2024 16:26:37 +0000 (17:26 +0100)]
libstdc++: Remove std::__is_pointer and std::__is_scalar [PR115497]
This removes the std::__is_pointer and std::__is_scalar traits, as they
conflicts with a Clang built-in.
Although Clang has a hack to make the class templates work despite using
reserved names, removing these class templates will allow that hack to
be dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_pointer, __is_scalar):
Remove.
(__is_arithmetic): Do not use __is_pointer in the primary
template. Add partial specialization for pointers.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Remove std::__is_void class template [PR115497]
This removes the std::__is_void trait, as it conflicts with a Clang
built-in. There is only one use of the trait, which can easily be
replaced by simpler code.
Although Clang has a hack to make the class template work despite using
a reserved name, removing std::__is_void will allow that hack to be
dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_void): Remove.
* include/debug/helper_functions.h (_Distance_traits):
Adjust partial specialization to match void directly, instead of
using __is_void<T>::__type and matching __true_type.
Jonathan Wakely [Wed, 19 Jun 2024 16:21:16 +0000 (17:21 +0100)]
libstdc++: Stop using std::__is_pointer in <deque> and <algorithm> [PR115497]
This replaces all uses of the std::__is_pointer type trait with uses of
the new __is_pointer built-in. Since the class template was only used to
enable some performance optimizations for algorithms, we can use the
built-in when __has_builtin(__is_pointer) is true (which is the case for
GCC trunk and for current versions of Clang) and just forego the
optimization otherwise.
Removing the uses of std::__is_pointer means it can be removed from
<bits/cpp_type_traits.h>, which is another step towards fixing PR
115497.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/deque.tcc (__lex_cmp_dit): Replace __is_pointer
class template with __is_pointer(T) built-in.
(__lexicographical_compare_aux1): Likewise.
* include/bits/stl_algobase.h (__equal_aux1): Likewise.
(__lexicographical_compare_aux1): Likewise.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Don't use std::__is_scalar in std::valarray initialization [PR115497]
This removes the use of the std::__is_scalar trait from <valarray>,
where it can be replaced by __is_trivial. It's used to decide whether we
can use memset to value-initialize valarray elements, but memset is
suitable for any trivial types, because value-initializing them is
equivalent to filling them with zeros.
This is another step towards removing the class templates in
<bits/cpp_type_traits.h> that conflict with Clang built-in names.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/valarray_array.h (__valarray_default_construct):
Use __is_trivial(_Tp). instead of __is_scalar<_Tp>.
Jonathan Wakely [Wed, 19 Jun 2024 15:14:56 +0000 (16:14 +0100)]
libstdc++: Fix std::fill and std::fill_n optimizations [PR109150]
As noted in the PR, the optimization used for scalar types in std::fill
and std::fill_n is non-conforming, because it doesn't consider that
assigning a scalar type might have non-trivial side effects which are
affected by the optimization.
By changing the condition under which the optimization is done we ensure
it's only performed when safe to do so, and we also enable it for
additional types, which was the original subject of the PR.
Instead of two overloads using __enable_if<__is_scalar<T>::__value, R>
we can combine them into one and create a local variable which is either
a local copy of __value or another reference to it, depending on whether
the optimization is allowed.
This removes a use of std::__is_scalar, which is a step towards fixing
PR 115497 by removing std::__is_pointer from <bits/cpp_type_traits.h>
libstdc++-v3/ChangeLog:
PR libstdc++/109150
* include/bits/stl_algobase.h (__fill_a1): Combine the
!__is_scalar and __is_scalar overloads into one and rewrite the
condition used to decide whether to perform the load outside the
loop.
* testsuite/25_algorithms/fill/109150.cc: New test.
* testsuite/25_algorithms/fill_n/109150.cc: New test.
Matthias Kretz [Fri, 21 Jun 2024 14:22:22 +0000 (16:22 +0200)]
libstdc++: Fix test on x86_64 and non-simd targets
* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.
* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).
All uses of xs_hi_nonmemory_operand allow constraint "i",
which means that they allow consts, symbol_refs and label_refs.
The definition of xs_hi_nonmemory_operand accounted for consts,
but not for symbol_refs and label_refs.
gcc/
* config/stormy16/predicates.md (xs_hi_nonmemory_operand): Handle
symbol_ref and label_ref.
power_of_2_operand allows any 32-bit power of 2, whereas "I" only
accepts 16-bit signed constants. This meant that any power of 2
greater than 32768 would cause an "insn does not satisfy its
constraints" ICE.
Also, the %p operand modifier barfed on 1<<31, which is sign-
rather than zero-extended to 64 bits. The code is inherently
limited to 32-bit operands -- power_of_2_operand contains a test
involving "unsigned" -- so this patch just ands with 0xffffffff.
gcc/
* config/iq2000/iq2000.cc (iq2000_print_operand): Make %p handle 1<<31.
* config/iq2000/iq2000.md: Remove "I" constraints on
power_of_2_operands.
No-op moves are given the code NOOP_MOVE_INSN_CODE if we plan
to delete them later. Such insns shouldn't be costed, partly
because they're going to disappear, and partly because targets
won't recognise the insn code.
Andrew MacLeod [Mon, 17 Jun 2024 20:07:16 +0000 (16:07 -0400)]
Print "Global Exported" to dump_file from set_range_info.
* gimple-range.cc (gimple_ranger::register_inferred_ranges): Do not
dump global range info after set_range_info.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
(dom_ranger::range_of_stmt): Likewise.
* tree-ssanames.cc (set_range_info): If global range info
changes, maybe print new range to dump_file.
* tree-vrp.cc (remove_unreachable::handle_early): Do not
dump global range info after set_range_info.
(remove_unreachable::remove): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
(pass_assumptions::execute): Likewise.
Andrew MacLeod [Mon, 17 Jun 2024 15:32:51 +0000 (11:32 -0400)]
Change fast VRP algorithm
Change the fast VRP algorithm to track contextual ranges active within
each basic block.
* gimple-range.cc (dom_ranger::dom_ranger): Create a block
vector.
(dom_ranger::~dom_ranger): Dispose of the block vector.
(dom_ranger::edge_range): Delete.
(dom_ranger::range_on_edge): Combine range in src BB with any
range gori_nme_on_edge returns.
(dom_ranger::range_in_bb): Combine global range with any active
contextual range for an ssa-name.
(dom_ranger::range_of_stmt): Fix non-ssa LHS case, use
fur_depend for folding so relations can be registered.
(dom_ranger::maybe_push_edge): Delete.
(dom_ranger::pre_bb): Create incoming contextual range vector.
(dom_ranger::post_bb): Free contextual range vector.
* gimple-range.h (dom_ranger::edge_range): Delete.
(dom_ranger::m_e0): Delete.
(dom_ranger::m_e1): Delete.
(dom_ranger::m_bb): New.
(dom_ranger::m_pop_list): Delete.
* tree-vrp.cc (execute_fast_vrp): Enable relation oracle.
Andrew MacLeod [Mon, 17 Jun 2024 15:23:12 +0000 (11:23 -0400)]
Add builtin_unreachable processing for fast_vrp.
Add a remove_unreachable object to fast vrp, and honor the final_p flag.
* tree-vrp.cc (remove_unreachable::remove): Export global range
if builtin_unreachable dominates all uses.
(remove_unreachable::remove_and_update_globals): Do not reset SCEV.
(execute_ranger_vrp): Reset SCEV here instead.
(fvrp_folder::fvrp_folder): Take final pass flag
and create a remove_unreachable object when specified.
(fvrp_folder::pre_fold_stmt): Register GIMPLE_CONDs with
the remove_unreachcable object.
(fvrp_folder::m_unreachable): New.
(execute_fast_vrp): Process remove_unreachable object.
(pass_vrp::execute): Add final_p flag to execute_fast_vrp.
David Malcolm [Fri, 21 Jun 2024 12:46:14 +0000 (08:46 -0400)]
testsuite: check that generated .sarif files validate against the SARIF schema [PR109360]
This patch extends the dg directive verify-sarif-file so that if
the "jsonschema" tool is available, it will be used to validate the
generated .sarif file.
gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/sarif-schema-2.1.0.json: New file, downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json
Licensing information can be seen at
https://github.com/oasis-tcs/sarif-spec/issues/583
which states "They are free to incorporate it into their
implementation. No need for special permission or paperwork from
OASIS."
* lib/scansarif.exp (verify-sarif-file): If "jsonschema" is
available, use it to verify that the .sarif file complies with the
SARIF schema.
* lib/target-supports.exp (check_effective_target_jsonschema):
New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 21 Jun 2024 12:46:13 +0000 (08:46 -0400)]
diagnostics: fixes to SARIF output [PR109360]
When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.
Specifically, in
c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.
Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet). We were
handling this column override for plain text output, but not for .sarif
output.
Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.
We also use column == 0 internally to mean "the line as a whole",
whereas SARIF required column numbers to be positive.
This patch fixes these various issues.
gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column. Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.
libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan Wakely [Tue, 18 Jun 2024 15:59:52 +0000 (16:59 +0100)]
libstdc++: Make std::any_cast<void> ill-formed (LWG 3305)
LWG 3305 was approved earlier this year in Tokyo. We need to give an
error if using std::any_cast<void>, but std::any_cast<void()> is valid
(but always returns null).
libstdc++-v3/ChangeLog:
* include/std/any (any_cast(any*), any_cast(const any*)): Add
static assertion to reject void types, as per LWG 3305.
* testsuite/20_util/any/misc/lwg3305.cc: New test.
This member function was previously deprecated, but that was reverted by
P2875R4, approved earlier this year in Tokyo. Since it's not going to be
deprecated in C++26, and so presumably not removed, there is no point in
giving deprecated warnings for C++23 mode.
Jonathan Wakely [Sun, 7 Apr 2024 13:12:25 +0000 (14:12 +0100)]
libstdc++: Add deprecation warnings to <strstream> types
libstdc++-v3/ChangeLog:
* include/backward/backward_warning.h: Adjust comments to
suggest <spanstream> as another alternative to <strstream>.
* include/backward/strstream (strstreambuf, istrstream)
(ostrstream, strstream): Add deprecated attribute.
Jonathan Wakely [Thu, 20 Jun 2024 12:28:08 +0000 (13:28 +0100)]
libstdc++: Fix __cpp_lib_chrono for old std::string ABI
The <chrono> header is incomplete for the old std::string ABI, because
std::chrono::tzdb is only defined for the new ABI. The feature test
macro advertising full C++20 support should not be defined for the old
ABI.
libstdc++-v3/ChangeLog:
* include/bits/version.def (chrono): Add cxx11abi = yes.
* include/bits/version.h: Regenerate.
* testsuite/std/time/syn_c++20.cc: Adjust expected value for
the feature test macro.
Jonathan Wakely [Tue, 18 Jun 2024 12:27:02 +0000 (13:27 +0100)]
libstdc++: Fix std::to_array for trivial-ish types [PR115522]
Due to PR c++/85723 the std::is_trivial trait is true for types with a
deleted default constructor, so the use of std::is_trivial in
std::to_array is not sufficient to ensure the type can be trivially
default constructed then filled using memcpy.
I also forgot that a type with a deleted assignment operator can still
be trivial, so we also need to check that it's assignable because the
is_constant_evaluated() path can't use memcpy.
Replace the uses of std::is_trivial with std::is_trivially_copyable
(needed for memcpy), std::is_trivially_default_constructible (needed so
that the default construction is valid and does no work) and
std::is_copy_assignable (needed for the constant evaluation case).
libstdc++-v3/ChangeLog:
PR libstdc++/115522
* include/std/array (to_array): Workaround the fact that
std::is_trivial is not sufficient to check that a type is
trivially default constructible and assignable.
* testsuite/23_containers/array/creation/115522.cc: New test.
Jonathan Wakely [Thu, 20 Jun 2024 15:13:10 +0000 (16:13 +0100)]
libstdc++: Initialize base in test allocator's constructor
This fixes a warning from one of the test allocators:
warning: base class 'class std::allocator<__gnu_test::copy_tracker>' should be explicitly initialized in the copy constructor [-Wextra]
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_allocator.h (tracker_allocator):
Initialize base class in copy constructor.
*minus_plus_one had no constraints, which meant that it could be
matched after RA with operands 0, 1 and 2 all being different.
The associated split instead requires operand 0 to be tied to
operand 1.
Eric Botcazou [Mon, 27 May 2024 14:46:03 +0000 (16:46 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed array copy
The Address Sanitizer considers that the padding at the end of a justified
modular type may be accessed through the object, but it is never accessed
and therefore can always be reused.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) <discrete_type>: Set
the TYPE_JUSTIFIED_MODULAR_P flag earlier.
* gcc-interface/misc.cc (gnat_unit_size_without_reusable_padding):
New function.
(LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING): Redefine to above
function.
Eric Botcazou [Mon, 27 May 2024 14:31:20 +0000 (16:31 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed record equality
We set DECL_BIT_FIELD optimistically during the translation of record types
and clear it afterward if needed, but fail to clear other attributes in the
latter case, which fools the logic of the Address Sanitizer.
gcc/ada/
* gcc-interface/utils.cc (clear_decl_bit_field): New function.
(finish_record_type): Call clear_decl_bit_field instead of clearing
DECL_BIT_FIELD manually.
Eric Botcazou [Sat, 20 Apr 2024 10:26:52 +0000 (12:26 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This adds the missing guard to prevent the reduction from being used when
the target does not provide or cannot synthesize a high-part multiply.
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: Fix formatting.
* gcc-interface/utils2.cc: Include optabs-query.h.
(fast_modulo_reduction): Call can_mult_highpart_p on the TYPE_MODE
before generating a high-part multiply. Fix formatting.
Eric Botcazou [Fri, 5 Apr 2024 18:47:34 +0000 (20:47 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This implements modulo reduction for nonbinary modular multiplication with
small moduli by means of the standard division-free algorithm also used in
the optimizer, but with fewer constraints and therefore better results.
For the sake of consistency, it is also used for the 'Mod attribute of the
same modular types and, more generally, for the Mod (and Rem) operators of
unsigned types if the second operand is static and not a power of two.
gcc/ada/
* gcc-interface/gigi.h (fast_modulo_reduction): Declare.
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: In the unsigned
case, call fast_modulo_reduction for {FLOOR,TRUNC}_MOD_EXPR if the
RHS is a constant and not a power of two, and the precision is not
larger than the word size.
* gcc-interface/utils2.cc: Include expmed.h.
(fast_modulo_reduction): New function.
(nonbinary_modular_operation): Call fast_modulo_reduction for the
multiplication if the precision is not larger than the word size.
Javier Miranda [Thu, 6 Jun 2024 12:06:53 +0000 (12:06 +0000)]
ada: Reject ambiguous function calls in interpolated string expressions
When the interpolated expression is a call to an ambiguous call
the frontend does not reject it; erroneously accepts the call
and generates code that calls to one of them.
gcc/ada/
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Reject
ambiguous function calls.
Javier Miranda [Thu, 6 Jun 2024 11:20:14 +0000 (11:20 +0000)]
ada: Crash when using user defined string literals
When a non-overridable aspect is explicitly specified for a
non-tagged derived type, the compiler blows up processing an
object declaration of an object of such type.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Fix code locating the entity
of the parent type.
Eric Botcazou [Tue, 4 Jun 2024 19:33:28 +0000 (21:33 +0200)]
ada: Small cleanup in processing of primitive operations
The processing of primitive operations is now always uniform for tagged and
untagged types, but the code contains left-overs from the time where it was
specific to tagged types, in particular for the handling of subtypes.
gcc/ada/
* einfo.ads (Direct_Primitive_Operations): Mention concurrent types
as well as GNAT extensions instead of implementation details.
(Primitive_Operations): Document that Direct_Primitive_Operations is
also used for concurrent types as a fallback.
* einfo-utils.adb (Primitive_Operations): Tweak formatting.
* exp_util.ads (Find_Prim_Op): Adjust description.
* exp_util.adb (Make_Subtype_From_Expr): In the private case with
unknown discriminants, always copy Direct_Primitive_Operations and
do not overwrite the Class_Wide_Type of the expression's base type.
* sem_ch3.adb (Analyze_Incomplete_Type_Decl): Tweak comment.
(Analyze_Subtype_Declaration): Remove older and now dead calls to
Set_Direct_Primitive_Operations. Tweak comment.
(Build_Derived_Private_Type): Likewise.
(Build_Derived_Record_Type): Likewise.
(Build_Discriminated_Subtype): Set Direct_Primitive_Operations in
all cases instead of just for tagged types.
(Complete_Private_Subtype): Likewise.
(Derived_Type_Declaration): Tweak comment.
* sem_ch4.ads (Try_Object_Operation): Adjust description.
Doug Rupp [Tue, 4 Jun 2024 17:17:57 +0000 (10:17 -0700)]
ada: Revert conditional installation of signal handlers on VxWorks
The conditional installation resulted in a semantic change, and
although it is likely what is ultimately wanted (since HW interrupts
are being reworked on VxWorks). However it must be done in concert
with other modifications for the new formulation of HW interrupts and
not in isolation.
gcc/ada/
* init.c [vxworks] (__gnat_install_handler): Revert to
installing signal handlers without regard to interrupt_state.
Javier Miranda [Thu, 30 May 2024 11:24:54 +0000 (11:24 +0000)]
ada: Cannot override inherited function with controlling result
When a package has the declaration of a derived tagged
type T with private null extension that inherits a public
function F with controlling result, and a derivation of T
is declared in the public part of another package, overriding
function F may be rejected by the compiler.
gcc/ada/
* sem_disp.adb (Find_Hidden_Overridden_Primitive): Check
public dispatching primitives of ancestors; previously,
only immediately-visible primitives were checked.
Eric Botcazou [Thu, 30 May 2024 10:46:57 +0000 (12:46 +0200)]
ada: Fix missing index check with declare expression
The Do_Range_Check flag is properly set on the Expression of the EWA node
built for the declare expression, so this instructs Generate_Index_Checks
to look into this Expression.
gcc/ada/
* checks.adb (Generate_Index_Checks): Add specific treatment for
index expressions that are N_Expression_With_Actions nodes.
Eric Botcazou [Tue, 28 May 2024 21:08:32 +0000 (23:08 +0200)]
ada: Fix internal error on case expression used as index of array component
This occurs when the bounds of the array component depend on a discriminant
and the component reference is not nested, that is to say the component is
not (referenced as) a subcomponent of a larger record.
In this case, Analyze_Selected_Component does not build the actual subtype
for the component, but it turns out to be required for constructs generated
during the analysis of the case expression.
The change causes this actual subtype to be built, and also renames a local
variable used to hold the prefix of the selected component.
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): Rename Name into Pref
and use Sel local variable consistently.
(Is_Simple_Indexed_Component): New predicate.
Call Is_Simple_Indexed_Component to determine whether to build an
actual subtype for the component.
Eric Botcazou [Thu, 30 May 2024 22:13:44 +0000 (00:13 +0200)]
ada: Fix incorrect handling of packed array with aliased composite components
The problem is that the handling of the interaction between packing and
aliased/atomic/independent components of an array type is tied to that of
the interaction between a component clause and aliased/atomic/independent
components, although the semantics are different: packing is a best effort
thing, whereas a component clause must be honored or else an error be given.
This decouples the two handlings, but retrofits the separate processing of
independent components done in both cases into the common code and changes
the error message from "minimum allowed is" to "minimum allowed value is"
for the sake of consistency with the aliased/atomic processing.
gcc/ada/
* freeze.adb (Freeze_Array_Type): Decouple the handling of the
interaction between packing and aliased/atomic components from
that of the interaction between a component clause and aliased/
atomic components, and retrofit the processing of the interaction
between the two characteristics and independent components into
the common processing.
The only substantive change is to remove Activation_Chain_Entity
from N_Generic_Package_Declaration. The comment in sinfo.ads suggesting
this change was written in 1993!
Various pieces of missing documentation are added to Sinfo and Einfo.
Steve Baird [Fri, 24 May 2024 00:11:42 +0000 (17:11 -0700)]
ada: Predefined arithmetic operators incorrectly treated as directly visible
In some cases, a predefined operator (e.g., the "+" operator for an
integer type) is incorrectly treated as being directly visible when
it is not. This can lead to both accepting operator uses that should
be rejected and also to incorrectly rejecting legal constructs as ambiguous
(for example, an expression "Foo + 1" where Foo is an overloaded function and
the "+" operator is directly visible for the result type of only one of
the possible callees).
gcc/ada/
* sem_ch4.adb (Is_Effectively_Visible_Operator): A new function.
(Check_Arithmetic_Pair): In paths where Add_One_Interp was
previously called unconditionally, instead call only if
Is_Effectively_Visible_Operator returns True.
(Check_Boolean_Pair): Likewise.
(Find_Unary_Types): Likewise.
Piotr Trojanek [Thu, 16 May 2024 08:59:31 +0000 (10:59 +0200)]
ada: Fix for Default_Component_Value with declare expressions
When the expression of aspect Default_Component_Value includes a declare
expression with current type instance, we attempted to recursively froze
that type, which itself caused an infinite recursion, because we didn't
properly manage the scope of declare expression.
This patch fixes both the detection of the current type instance and
analysis of the expression that caused recursive freezing.
gcc/ada/
* sem_attr.adb (In_Aspect_Specification): Use the standard
condition that works correctly with declare expressions.
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Replace
ordinary analysis with preanalysis of spec expressions.
Justin Squirek [Thu, 9 May 2024 20:16:24 +0000 (20:16 +0000)]
ada: Spurious style error with mutiple square brackets
This patch fixes a spurious error in the compiler when checking for style for
token separation where two square brackets are next to each other.
gcc/ada/
* csets.ads (Identifier_Char): New function - replacing table.
* csets.adb (Identifier_Char): Rename and move table for static values.
(Initialize): Remove dynamic calculations.
(Identifier_Char): New function to calculate dynamic values.
* opt.adb (Set_Config_Switches): Remove setting of Identifier_Char.
Andrew Pinski [Thu, 20 Jun 2024 22:52:05 +0000 (15:52 -0700)]
complex-lowering: Better handling of PAREN_EXPR [PR68855]
When PAREN_EXPR tree code was added in r0-85884-gdedd42d511b6e4,
a simplified handling was added to complex lowering. Which means
we would get:
```
_9 = COMPLEX_EXPR <_15, _14>;
_11 = ((_9));
_19 = REALPART_EXPR <_11>;
_20 = IMAGPART_EXPR <_11>;
```
In many cases instead of just simply:
```
_19 = ((_15));
_20 = ((_14));
```
So this adds full support for PAREN_EXPR to complex lowering.
It is handled very similar as NEGATE_EXPR; except creating PAREN_EXPR
instead of NEGATE_EXPR for the real/imag parts. This allows for
more optimizations including vectorization, especially with
-ffast-math.
gfortran.dg/vect/pr68855.f90 is an example where this could show up.
It also shows up in SPEC CPU 2006's 465.tonto; though I have not done
any benchmarking there.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/68855
* tree-complex.cc (init_dont_simulate_again): Handle PAREN_EXPR
like NEGATE_EXPR.
(complex_propagate::visit_stmt): Likewise.
(expand_complex_move): Don't handle PAREN_EXPR.
(expand_complex_paren): New function.
(expand_complex_operations_1): Handle PAREN_EXPR like
NEGATE_EXPR. And call expand_complex_paren for PAREN_EXPR.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr68855.c: New test.
* gfortran.dg/vect/pr68855.f90: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Richard Biener [Fri, 21 Jun 2024 06:05:22 +0000 (08:05 +0200)]
Remove outdated info from passes.texi
This applies some maintainance to passes.texi by removing references
to no longer existing passes. It also fixes a few minor things but
doesn't fill the gaps that meanwhile exist.
* doc/passes.texi: Remove references to no longer existing
passes.
YunQiang Su [Tue, 18 Jun 2024 09:03:51 +0000 (17:03 +0800)]
MIPS: Set condmove cost to SET(REG, REG)
On most uarch, the cost condmove is same as other noraml integer,
and it should be COSTS_N_INSNS(1).
In GCC12 or previous, the condmove is always enabled, and from
GCC13, we start to compare the cost.
The generic rtx_cost give the result of COSTS_N_INSN(2).
Let's define it to COSTS_N_INSN(1) in mips_rtx_costs.
gcc
* config/mips/mips.cc(mips_rtx_costs): Set condmove cost.
* config/mips/mips.md(mov<GPR:mode>_on_<MOVECC:mode>,
mov<GPR:mode>_on_<MOVECC:mode>_mips16e2,
mov<GPR:mode>_on_<GPR2:mode>_ne
mov<GPR:mode>_on_<GPR2:mode>_ne_mips16e2): Define name by
remove starting *, so that we can use CODE_FOR_.
Kewen Lin [Fri, 21 Jun 2024 01:23:56 +0000 (20:23 -0500)]
rs6000: Fix wrong RTL patterns for vector merge high/low word on LE
Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low word, which are altivec_vmrg[hl]w,
vsx_xxmrg[hl]w_<VSX_W:mode>. These defines are mainly for
built-in function vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs. These functions should consider endianness, taking
vec_mergeh as example, as PVIPR defines, vec_mergeh "Merges
the first halves (in element order) of two vectors", it does
note it's in element order. So it's mapped into vmrghw on
BE while vmrglw on LE respectively. Although the mapped
insns are different, as the discussion in PR106069, the RTL
pattern should be still the same, it is conformed before
commit r12-4496, define_expand altivec_vmrghw got expanded
into:
on LE, although the mapped insn are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn. If optimization
passes leave this pattern alone, even if its pattern doesn't
represent its mapped insn, it's still fine, that's why simple
testing on bif doesn't expose this issue. But once some
optimization pass such as combine does some changes basing
on this wrong pattern, because the pattern doesn't match the
semantics that the expanded insn is intended to represent,
it would cause the unexpected result.
So this patch is to fix the wrong RTL pattern, ensure the
associated RTL patterns become the same as before which can
have the same semantic as their mapped insns. With the
proposed patch, the expanders like altivec_vmrghw expands
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness, "direct" can easily show which
insn would be generated, _be and _le are mainly for the
different RTL patterns as endianness.
Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355
gcc/ChangeLog:
* config/rs6000/altivec.md (altivec_vmrghw_direct_<VSX_W:mode>): Rename
to ...
(altivec_vmrghw_direct_<VSX_W:mode>_be): ... this. Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrglw_direct_<VSX_W:mode>): Rename to ...
(altivec_vmrglw_direct_<VSX_W:mode>_be): ... this. Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE. And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrghw_direct_v4si_be for BE and
gen_altivec_vmrglw_direct_v4si_le for LE.
(vsx_xxmrglw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrglw_direct_v4si_be for BE and
gen_altivec_vmrghw_direct_v4si_le for LE.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr106069.C: New test.
* gcc.target/powerpc/pr115355.c: New test.