Kewen Lin [Tue, 25 Jun 2024 05:04:53 +0000 (00:04 -0500)]
Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type
Currently, the mode used for a floating point type is
determined by calling mode_for_size with the type precision
(size), which returns the first mode of that size in the
specified class. On PowerPC, three modes (TF/KF/IF) share
the same mode precision of 128 (see [1]), so this scheme
forces us to place TF first. It would also require further
adjustments in generic code to avoid unexpected mode
conversions, and it would get even worse if we eventually
got rid of TF. As Joseph pointed out in [2], "floating
types should have their mode, not a poorly defined
precision value". As Joseph and Richi suggested, this patch
introduces a hook, mode_for_floating_type, which returns
the corresponding mode for type float, double or long
double. The default implementation returns SFmode for float
and DFmode for double or long double. Ports that need
special treatment get their own port-specific
implementations in separate patches (based on how
{,LONG_}DOUBLE_TYPE_SIZE is used there). Generic uses of
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE are replaced, depending on
the context, either with TYPE_PRECISION of the
corresponding type node or with GET_MODE_PRECISION on the
mode returned by mode_for_floating_type. This patch also
poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, so most
port-specific definitions of these macros are removed;
the few that are worth keeping for readability are renamed
with a port-specific prefix.
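As a rough illustration of the default behaviour described above, the following standalone sketch models the mapping outside of GCC. The enum names and both functions are stand-ins invented for this example, not the real GCC declarations, and the precisions assume IEEE single and double formats.

```cpp
// Standalone model of the default described above; none of these
// declarations are the real GCC ones.
enum class float_type_kind { FLOAT, DOUBLE, LONG_DOUBLE };  // hypothetical
enum class fp_mode { SF, DF };  // stand-ins for SFmode and DFmode

// Default policy from the commit message: SFmode for float,
// DFmode for double and long double.
constexpr fp_mode
default_mode_for_floating_type_sketch (float_type_kind kind)
{
  return kind == float_type_kind::FLOAT ? fp_mode::SF : fp_mode::DF;
}

// Precision in bits of the two modes in this model (assumed values).
constexpr unsigned
mode_precision_sketch (fp_mode mode)
{
  return mode == fp_mode::SF ? 32 : 64;
}
```

A port with several 128-bit modes would, in this model, return a specific mode for LONG_DOUBLE instead of relying on a size lookup.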
Kewen Lin [Tue, 25 Jun 2024 05:04:51 +0000 (00:04 -0500)]
vms: Replace use of LONG_DOUBLE_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To prepare for that, this
patch replaces the use of LONG_DOUBLE_TYPE_SIZE in the vms
port with TYPE_PRECISION of long_double_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:49 +0000 (00:04 -0500)]
rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To prepare for that, this
patch replaces uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:47 +0000 (00:04 -0500)]
go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To prepare for that, this
patch replaces uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in go with TYPE_PRECISION of {float,{,long_}double}_type_node.
* go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
(Gcc_backend::complex_type): Likewise.
Sergei Lewis [Mon, 24 Jun 2024 20:20:14 +0000 (14:20 -0600)]
[PATCH v2 2/3] RISC-V: setmem for RISCV with V extension
This is primarily Sergei's work, my contributions were limited to
merging his expander with the one that's on the trunk, allowing
non-constant value and trivial testsuite adjustments due to option renaming.
I'm doing setmem first because it's the easiest. The others will follow
soon enough.
I've tested this on my system and am waiting on pre-commit CI to render
its verdict before moving forward.
gcc/ChangeLog
* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.
* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.
* config/riscv/riscv.md (setmem<mode>): Try riscv_vector::expand_vec_setmem
for constant lengths. Do not require operand 2 to be a constant.
gcc/testsuite/ChangeLog
* gcc.target/riscv/rvv/base/setmem-1.c: New test.
* gcc.target/riscv/rvv/base/setmem-2.c: New test.
* gcc.target/riscv/rvv/base/setmem-3.c: New test.
Patrick O'Neill [Mon, 24 Jun 2024 19:06:15 +0000 (12:06 -0700)]
RISC-V: Add dg-remove-option for z* extensions
This introduces testsuite support infra for removing extensions.
Since z* extensions don't have ordering requirements, the logic for
adding/removing those extensions has also been consolidated.
This fixes RVWMO compile testcases failing on Ztso targets by removing
the extension from the -march string.
gcc/ChangeLog:
* doc/sourcebuild.texi (dg-remove-option): Add documentation.
(dg-add-option): Add documentation for riscv_{a,zaamo,zalrsc,ztso}.
Harald Anlauf [Sun, 23 Jun 2024 20:36:43 +0000 (22:36 +0200)]
Fortran: fix passing of optional dummy as actual to optional argument [PR55978]
gcc/fortran/ChangeLog:
PR fortran/55978
* trans-array.cc (gfc_conv_array_parameter): Do not dereference
data component of a missing allocatable dummy array argument for
passing as actual to optional dummy. Harden logic of presence
check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
of TRUTH_AND_EXPR.
gcc/testsuite/ChangeLog:
PR fortran/55978
* gfortran.dg/optional_absent_12.f90: New test.
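The difference between the two tree codes named in the ChangeLog can be pictured in plain C++: TRUTH_ANDIF_EXPR short-circuits like `&&`, so the second operand is not evaluated when the first is false. The sketch below is only an analogy with invented names; it is not the code that gfortran generates.

```cpp
// Counts how often the second operand of the check is evaluated.
int deref_count = 0;

bool
points_to_nonzero (const int *p)
{
  ++deref_count;
  return *p != 0;  // only valid when p is non-null
}

// Short-circuit form, analogous to TRUTH_ANDIF_EXPR: the dereference
// is skipped entirely when the pointer is absent.
bool
present_and_nonzero (const int *p)
{
  return p != nullptr && points_to_nonzero (p);
}
```

With a form that evaluates both operands, the second operand would run even for an absent argument, which is the hazard the hardened presence check avoids.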
Roger Sayle [Mon, 24 Jun 2024 14:34:03 +0000 (15:34 +0100)]
PR tree-optimization/113673: Avoid load merging when potentially trapping.
This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified. When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.
This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.
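For reference, the two spellings of the idiom discussed above look as follows in source form. This is a minimal sketch with invented function names; it shows only the shape of the code, not the PR's testcase.

```cpp
#include <cstdint>

// Bitwise OR cannot overflow, so merging the two byte loads into one
// wider load is safe for this form.
std::uint16_t
load16_or (const unsigned char *ptr)
{
  return static_cast<std::uint16_t> ((ptr[0] << 8) | ptr[1]);
}

// Here the bytes are promoted to int and added.  With -ftrapv the
// compiler treats this signed addition as potentially trapping, which
// is the case the patch stops merging.
int
load16_add (const unsigned char *ptr)
{
  return (ptr[0] << 8) + ptr[1];
}
```

Both functions compute the same value for any pair of bytes; they differ only in whether the combining statement may trap.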
2024-06-24 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.
gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.
Richard Biener [Mon, 24 Jun 2024 07:52:39 +0000 (09:52 +0200)]
tree-optimization/115602 - SLP CSE results in cycles
The following prevents SLP CSE from creating new cycles, which happened
because of a 1:1 permute node being present where its child was then
CSEd to the permute node. Fixed by making a node available to
CSE only after recursing.
PR tree-optimization/115602
* tree-vect-slp.cc (vect_cse_slp_nodes): Delay populating the
bst-map to avoid cycles.
Richard Biener [Fri, 21 Jun 2024 11:19:26 +0000 (13:19 +0200)]
tree-optimization/115528 - fix vect alignment analysis for outer loop vect
For outer loop vectorization of a data reference in the inner loop
we have to look at both steps to see if they preserve alignment.
What is special for this testcase is that the outer loop step is
one element but the inner loop step is four, and that we now use SLP
with a vectorization factor of one.
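The idea of checking both steps can be sketched with simple arithmetic. The helpers below are hypothetical and are not the code in vect_compute_data_ref_alignment; they assume that a step preserves alignment exactly when it is a multiple of the alignment in bytes.

```cpp
// True when advancing an aligned address by step_bytes keeps it
// aligned to align_bytes (assumed criterion for this sketch).
constexpr bool
step_preserves_alignment (long step_bytes, long align_bytes)
{
  return step_bytes % align_bytes == 0;
}

// For a reference in the inner loop of an outer-loop vectorization,
// both the inner and the outer step have to pass the check.
constexpr bool
steps_preserve_alignment (long inner_step_bytes, long outer_step_bytes,
			  long align_bytes)
{
  return step_preserves_alignment (inner_step_bytes, align_bytes)
	 && step_preserves_alignment (outer_step_bytes, align_bytes);
}
```

Assuming 4-byte elements and a 16-byte vector alignment (values chosen for illustration, not taken from the PR), an inner step of four elements passes the check while an outer step of one element does not, so looking at the inner step alone would give the wrong answer.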
PR tree-optimization/115528
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Make sure to look at both the inner and outer loop step
behavior.
This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.
The pass currently has a single objective: remove definitions by
substituting into all uses. The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.
The patch fixes PR106594. It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.
This is just a first step. I'm hoping that the pass could be
used for other combine-related optimisations in future. In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure. If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.
On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.
Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation. This trips things like:
  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }
because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed. rs6000 has several instances of this.
xtensa has a variation in which the split condition is:
"&& can_create_pseudo_p ()"
The failure then is that, if we match after RA, we'll never be
able to split the instruction.
The patch therefore disables the pass by default on i386, rs6000
and xtensa. Hopefully we can fix those ports later (if their
maintainers want). It seems better to add the pass first, though,
to make it easier to test any such fixes.
gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output. That might be
worth doing, but it seems too complex to do as part of this patch.
I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite. This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark. All targets seemed to improve on average:
rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set. These searches are currently
used for two main things:
- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber
The exclusion set was originally a callback function that returned
true for insns that should be ignored. However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.
This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions. Currently the
only two member functions are:
- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality
but more could be added later.
Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious). The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.
Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names. The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".
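The shape of such an exclusion object can be sketched as follows. The record types and the scan function are simplified stand-ins invented for this example; only the member function names should_ignore_insn and should_ignore_def come from the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for rtl-ssa's instruction and definition records.
struct insn_sketch { int uid; };
struct def_sketch { int regno; };

// An exclusion object that ignores nothing.
struct ignore_nothing_sketch
{
  bool should_ignore_insn (const insn_sketch &) const { return false; }
  bool should_ignore_def (const def_sketch &) const { return false; }
};

// An exclusion object that ignores one particular instruction.
struct ignore_one_insn_sketch
{
  int ignored_uid;
  bool should_ignore_insn (const insn_sketch &insn) const
  { return insn.uid == ignored_uid; }
  bool should_ignore_def (const def_sketch &) const { return false; }
};

// Generic forward scan: index of the first instruction that the
// exclusion object does not ignore, or -1 if there is none.
template<typename IgnorePredicates>
int
first_heeded_insn (const std::vector<insn_sketch> &insns,
		   IgnorePredicates ignore)
{
  for (std::size_t i = 0; i < insns.size (); ++i)
    if (!ignore.should_ignore_insn (insns[i]))
      return static_cast<int> (i);
  return -1;
}
```

Because the scan is a template over the exclusion type, new predicates can be added to the interface later without changing how callers pass the object.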
gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing. Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes. Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes. Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.
The compare_repeat_factors comparator eventually fails qsort checking
because it uses rf2->rank - rf1->rank to compare unsigned numbers,
which gives the wrong sign when the difference does not fit in a
signed int. Fixed by rewriting it the obvious way, with explicit
comparisons. I've also fixed the count comparison, which suffers from
truncation because count is 64-bit signed while the comparator result
is a 32-bit int (that's a lot less likely to hit in practice though).
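A comparator written with explicit comparisons might look like the sketch below. The struct and the sort order (count ascending, then rank descending) are assumptions made for this example rather than a copy of the code in tree-ssa-reassoc.cc.

```cpp
#include <cstdint>

// Hypothetical record with the two fields mentioned above.
struct repeat_factor_sketch
{
  unsigned rank;
  std::int64_t count;
};

// Three-way comparison without subtraction.  The result is always
// -1, 0 or 1, so nothing can wrap around or be truncated.
int
compare_repeat_factors_sketch (const repeat_factor_sketch &a,
			       const repeat_factor_sketch &b)
{
  if (a.count != b.count)
    return a.count < b.count ? -1 : 1;
  if (a.rank != b.rank)
    return a.rank > b.rank ? -1 : 1;
  return 0;
}
```

With a subtraction-based comparator, counts that differ by exactly 2^32 would compare as equal after truncation to int, and ranks that differ by more than INT_MAX would compare with the wrong sign.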
The testcase from the PR is too large to include.
PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
Haochen Gui [Mon, 24 Jun 2024 05:12:51 +0000 (13:12 +0800)]
fwprop: invoke change_is_worthwhile to judge if a replacement is worthwhile
gcc/
* fwprop.cc (try_fwprop_subst_pattern): Invoke change_is_worthwhile
to judge if a replacement is worthwhile. Remove single_set check
and add is_debug_insn check.
* recog.cc (swap_change): Invalidate recog_data when the cached INSN
is swapped out.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Check if the
insn cost of new rtl is unknown and fail the replacement.
Mark Harmstone [Mon, 24 Jun 2024 00:17:39 +0000 (18:17 -0600)]
[PATCH 02/11] Handle CodeView base types
Adds a get_type_num function to translate type DIEs into CodeView
numbers, along with a hash table for this. For now we just deal with
the base types (integers, Unicode chars, floats, and bools).
Mark Harmstone [Sun, 23 Jun 2024 23:48:10 +0000 (17:48 -0600)]
[PATCH 01/11] Output CodeView data about variables
Parse the DW_TAG_variable DIEs, and output S_GDATA32 (for global variables)
and S_LDATA32 (for static global variables) symbols into the .debug$S section.
gcc/
* dwarf2codeview.cc (S_LDATA32, S_GDATA32): Define.
(struct codeview_symbol): New structure.
(sym, last_sym): New variables.
(write_data_symbol): New function.
(write_codeview_symbols): Call write_data_symbol.
(add_variable, codeview_debug_early_finish): New functions.
* dwarf2codeview.h (codeview_debug_early_finish): Prototype.
* dwarf2out.cc
(dwarf2out_early_finish): Call codeview_debug_early_finish.
Artemiy Volkov [Sun, 23 Jun 2024 20:54:00 +0000 (14:54 -0600)]
[PATCH] RISC-V: Fix unrecognizable pattern in riscv_expand_conditional_move()
Presently, the code fragment:
  int x[5];

  void
  d(int a, int b, int c) {
    for (int i = 0; i < 5; i++)
      x[i] = (a != b) ? c : a;
  }
causes an ICE when compiled with -O2 -march=rv32i_zicond:
  test.c: In function 'd':
  test.c: error: unrecognizable insn:
     11 | }
        | ^
  (insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
          (if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
                  (reg/v:SI 137 [ b ]))
              (reg/v:SI 136 [ a ])
              (reg/v:SI 138 [ c ]))) -1
       (nil))
  during RTL pass: vregs
This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand. Fix this by adding
an extra check before performing this optimization.
The code snippet mentioned above is also included in this patch as a new Zicond
testcase.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
CONST0_RTX check.
Jeff Law [Sun, 23 Jun 2024 14:26:25 +0000 (08:26 -0600)]
[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL
Run-of-the-mill checking issue. We had something like (plus (reg) (reg)) and
tried to extract INTVAL (XEXP (x, 1)) which of course blows up with checking
on.
Fixed thusly. Tested on riscv32-elf in my tester. riscv64-elf is in flight,
but won't finish for a while due to other tasks in flight.
PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.
Richard Biener [Sun, 23 Jun 2024 09:26:39 +0000 (11:26 +0200)]
tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.
PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
Richard Biener [Sat, 22 Jun 2024 12:59:09 +0000 (14:59 +0200)]
tree-optimization/115579 - fix wrong code with store-motion
The recent change to relax store motion for variables that cannot have
store data races broke the optimization to share flag vars for stores
that all happen in the same single BB. The following fixes this.
PR tree-optimization/115579
* tree-ssa-loop-im.cc (execute_sm): Return the auxiliary data
created.
(hoist_memory_references): Record the flag var that's eventually
created and re-use it when all stores are in the same BB.
Jeff Law [Sat, 22 Jun 2024 16:39:51 +0000 (10:39 -0600)]
[committed] [RISC-V] Skip zbs-ext-2.c for -Oz as well
> the test should probably also be skipped on -Oz:
>
> === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andi\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andn\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times li\t 1
Yea. Just re-ran things and sure enough we need to skip -Oz as well. So
committing the obvious change....
gcc/testsuite/
* gcc.target/riscv/zbs-ext-2.c: Also skip for -Oz.
David Malcolm [Fri, 21 Jun 2024 22:20:38 +0000 (18:20 -0400)]
diagnostics: remove duplicate copies of diagnostic_kind_text
No functional change intended.
gcc/ChangeLog:
* diagnostic-format-json.cc
(json_output_format::on_end_diagnostic): Use
get_diagnostic_kind_text rather than embedding a duplicate copy of
the table.
* diagnostic-format-sarif.cc
(make_rule_id_for_diagnostic_kind): Likewise.
* diagnostic.cc (get_diagnostic_kind_text): New.
* diagnostic.h (get_diagnostic_kind_text): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jeff Law [Fri, 21 Jun 2024 21:58:12 +0000 (15:58 -0600)]
[committed] Fix testsuite fallout on stormy16 after IOR->PLUS change
More minor fallout from the IOR->PLUS change a little while ago. This time on
xstormy16.
The pattern to swap nibbles actually tries to handle all the cases of IOR, XOR
and PLUS. But when we generate PLUS earlier in the pipeline, the
simplifications/canonicalizations are slightly different resulting in the
pattern not matching.
This patch adds an alternate pattern which matches what we get now. Basically
it looks like QImode rotate by 4, zero extended to HI.
Run in my tester to verify the regression was fixed. Pushing to the trunk.
gcc/
* config/stormy16/stormy16.md (swpn_zext): New pattern.
Jonathan Wakely [Wed, 19 Jun 2024 16:26:37 +0000 (17:26 +0100)]
libstdc++: Remove std::__is_pointer and std::__is_scalar [PR115497]
This removes the std::__is_pointer and std::__is_scalar traits, as they
conflict with Clang built-ins.
Although Clang has a hack to make the class templates work despite using
reserved names, removing these class templates will allow that hack to
be dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_pointer, __is_scalar):
Remove.
(__is_arithmetic): Do not use __is_pointer in the primary
template. Add partial specialization for pointers.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Remove std::__is_void class template [PR115497]
This removes the std::__is_void trait, as it conflicts with a Clang
built-in. There is only one use of the trait, which can easily be
replaced by simpler code.
Although Clang has a hack to make the class template work despite using
a reserved name, removing std::__is_void will allow that hack to be
dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_void): Remove.
* include/debug/helper_functions.h (_Distance_traits):
Adjust partial specialization to match void directly, instead of
using __is_void<T>::__type and matching __true_type.
Jonathan Wakely [Wed, 19 Jun 2024 16:21:16 +0000 (17:21 +0100)]
libstdc++: Stop using std::__is_pointer in <deque> and <algorithm> [PR115497]
This replaces all uses of the std::__is_pointer type trait with uses of
the new __is_pointer built-in. Since the class template was only used to
enable some performance optimizations for algorithms, we can use the
built-in when __has_builtin(__is_pointer) is true (which is the case for
GCC trunk and for current versions of Clang) and just forego the
optimization otherwise.
Removing the uses of std::__is_pointer means it can be removed from
<bits/cpp_type_traits.h>, which is another step towards fixing PR
115497.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/deque.tcc (__lex_cmp_dit): Replace __is_pointer
class template with __is_pointer(T) built-in.
(__lexicographical_compare_aux1): Likewise.
* include/bits/stl_algobase.h (__equal_aux1): Likewise.
(__lexicographical_compare_aux1): Likewise.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Don't use std::__is_scalar in std::valarray initialization [PR115497]
This removes the use of the std::__is_scalar trait from <valarray>,
where it can be replaced by __is_trivial. It's used to decide whether we
can use memset to value-initialize valarray elements, and memset is
suitable for any trivial type, because value-initializing such a type is
equivalent to filling it with zeros.
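The dispatch described above can be sketched as follows. The helper is hypothetical and much simpler than __valarray_default_construct; it uses the standard std::is_trivial trait in place of the __is_trivial built-in and requires C++17.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Value-initialize n objects of type T in the storage at first:
// zero fill for trivial types, placement new otherwise.
template<typename T>
void
default_construct_sketch (T *first, std::size_t n)
{
  if constexpr (std::is_trivial<T>::value)
    std::memset (first, 0, n * sizeof (T));
  else
    for (std::size_t i = 0; i < n; ++i)
      ::new (static_cast<void *> (first + i)) T ();
}
```

For arithmetic types such as int or double, the zero-filled storage holds the same values that value-initialization would produce.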
This is another step towards removing the class templates in
<bits/cpp_type_traits.h> that conflict with Clang built-in names.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/valarray_array.h (__valarray_default_construct):
Use __is_trivial(_Tp) instead of __is_scalar<_Tp>.
Jonathan Wakely [Wed, 19 Jun 2024 15:14:56 +0000 (16:14 +0100)]
libstdc++: Fix std::fill and std::fill_n optimizations [PR109150]
As noted in the PR, the optimization used for scalar types in std::fill
and std::fill_n is non-conforming, because it doesn't consider that
assigning a scalar type might have non-trivial side effects which are
affected by the optimization.
By changing the condition under which the optimization is done we ensure
it's only performed when safe to do so, and we also enable it for
additional types, which was the original subject of the PR.
Instead of two overloads using __enable_if<__is_scalar<T>::__value, R>
we can combine them into one and create a local variable which is either
a local copy of __value or another reference to it, depending on whether
the optimization is allowed.
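The single-overload approach can be sketched like this. The function name and the condition are placeholders chosen for the example (arithmetic types only); the condition actually used in __fill_a1 is different and is not reproduced here.

```cpp
#include <type_traits>

template<typename ForwardIt, typename T>
void
fill_sketch (ForwardIt first, ForwardIt last, const T &value)
{
  // Placeholder condition, assumed for illustration only.
  constexpr bool load_outside_loop = std::is_arithmetic<T>::value;

  // Either a local copy of value (type T) or another reference to it
  // (type const T&), selected at compile time.
  typename std::conditional<load_outside_loop, T, const T &>::type
    tmp = value;

  for (; first != last; ++first)
    *first = tmp;
}
```

When the copy is chosen, the loop no longer has to reload the value through the reference on every iteration; when the reference is chosen, assignments still observe the original object, which is what keeps the non-optimized case conforming.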
This removes a use of std::__is_scalar, which is a step towards fixing
PR 115497 by removing std::__is_pointer from <bits/cpp_type_traits.h>.
libstdc++-v3/ChangeLog:
PR libstdc++/109150
* include/bits/stl_algobase.h (__fill_a1): Combine the
!__is_scalar and __is_scalar overloads into one and rewrite the
condition used to decide whether to perform the load outside the
loop.
* testsuite/25_algorithms/fill/109150.cc: New test.
* testsuite/25_algorithms/fill_n/109150.cc: New test.
Matthias Kretz [Fri, 21 Jun 2024 14:22:22 +0000 (16:22 +0200)]
libstdc++: Fix test on x86_64 and non-simd targets
* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.
* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).
All uses of xs_hi_nonmemory_operand allow constraint "i",
which means that they allow consts, symbol_refs and label_refs.
The definition of xs_hi_nonmemory_operand accounted for consts,
but not for symbol_refs and label_refs.
gcc/
* config/stormy16/predicates.md (xs_hi_nonmemory_operand): Handle
symbol_ref and label_ref.
power_of_2_operand allows any 32-bit power of 2, whereas "I" only
accepts 16-bit signed constants. This meant that any power of 2
greater than 32768 would cause an "insn does not satisfy its
constraints" ICE.
Also, the %p operand modifier barfed on 1<<31, which is sign-
rather than zero-extended to 64 bits. The code is inherently
limited to 32-bit operands -- power_of_2_operand contains a test
involving "unsigned" -- so this patch just ands with 0xffffffff.
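The effect of the masking can be seen in a small standalone function. This is a sketch with an invented name, not the code in iq2000_print_operand.

```cpp
#include <cstdint>

// Returns log2 of a 32-bit power of two that may have been
// sign-extended to 64 bits, or -1 if the masked value is not a
// power of two.
int
log2_of_32bit_power_of_2 (std::int64_t value)
{
  // Drop the sign-extension bits so that 1<<31 is seen as 0x80000000.
  std::uint64_t v = static_cast<std::uint64_t> (value) & 0xffffffffu;
  if (v == 0 || (v & (v - 1)) != 0)
    return -1;
  int log = 0;
  while ((v >>= 1) != 0)
    ++log;
  return log;
}
```

Without the mask, the sign-extended form of 1<<31 has 33 bits set and fails the power-of-two test.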
gcc/
* config/iq2000/iq2000.cc (iq2000_print_operand): Make %p handle 1<<31.
* config/iq2000/iq2000.md: Remove "I" constraints on
power_of_2_operands.
No-op moves are given the code NOOP_MOVE_INSN_CODE if we plan
to delete them later. Such insns shouldn't be costed, partly
because they're going to disappear, and partly because targets
won't recognise the insn code.
Andrew MacLeod [Mon, 17 Jun 2024 20:07:16 +0000 (16:07 -0400)]
Print "Global Exported" to dump_file from set_range_info.
* gimple-range.cc (gimple_ranger::register_inferred_ranges): Do not
dump global range info after set_range_info.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
(dom_ranger::range_of_stmt): Likewise.
* tree-ssanames.cc (set_range_info): If global range info
changes, maybe print new range to dump_file.
* tree-vrp.cc (remove_unreachable::handle_early): Do not
dump global range info after set_range_info.
(remove_unreachable::remove): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
(pass_assumptions::execute): Likewise.
Andrew MacLeod [Mon, 17 Jun 2024 15:32:51 +0000 (11:32 -0400)]
Change fast VRP algorithm
Change the fast VRP algorithm to track contextual ranges active within
each basic block.
* gimple-range.cc (dom_ranger::dom_ranger): Create a block
vector.
(dom_ranger::~dom_ranger): Dispose of the block vector.
(dom_ranger::edge_range): Delete.
(dom_ranger::range_on_edge): Combine range in src BB with any
range gori_name_on_edge returns.
(dom_ranger::range_in_bb): Combine global range with any active
contextual range for an ssa-name.
(dom_ranger::range_of_stmt): Fix non-ssa LHS case, use
fur_depend for folding so relations can be registered.
(dom_ranger::maybe_push_edge): Delete.
(dom_ranger::pre_bb): Create incoming contextual range vector.
(dom_ranger::post_bb): Free contextual range vector.
* gimple-range.h (dom_ranger::edge_range): Delete.
(dom_ranger::m_e0): Delete.
(dom_ranger::m_e1): Delete.
(dom_ranger::m_bb): New.
(dom_ranger::m_pop_list): Delete.
* tree-vrp.cc (execute_fast_vrp): Enable relation oracle.
Andrew MacLeod [Mon, 17 Jun 2024 15:23:12 +0000 (11:23 -0400)]
Add builtin_unreachable processing for fast_vrp.
Add a remove_unreachable object to fast vrp, and honor the final_p flag.
* tree-vrp.cc (remove_unreachable::remove): Export global range
if builtin_unreachable dominates all uses.
(remove_unreachable::remove_and_update_globals): Do not reset SCEV.
(execute_ranger_vrp): Reset SCEV here instead.
(fvrp_folder::fvrp_folder): Take final pass flag
and create a remove_unreachable object when specified.
(fvrp_folder::pre_fold_stmt): Register GIMPLE_CONDs with
the remove_unreachable object.
(fvrp_folder::m_unreachable): New.
(execute_fast_vrp): Process remove_unreachable object.
(pass_vrp::execute): Add final_p flag to execute_fast_vrp.
David Malcolm [Fri, 21 Jun 2024 12:46:14 +0000 (08:46 -0400)]
testsuite: check that generated .sarif files validate against the SARIF schema [PR109360]
This patch extends the dg directive verify-sarif-file so that if
the "jsonschema" tool is available, it will be used to validate the
generated .sarif file.
gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/sarif-schema-2.1.0.json: New file, downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json
Licensing information can be seen at
https://github.com/oasis-tcs/sarif-spec/issues/583
which states "They are free to incorporate it into their
implementation. No need for special permission or paperwork from
OASIS."
* lib/scansarif.exp (verify-sarif-file): If "jsonschema" is
available, use it to verify that the .sarif file complies with the
SARIF schema.
* lib/target-supports.exp (check_effective_target_jsonschema):
New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 21 Jun 2024 12:46:13 +0000 (08:46 -0400)]
diagnostics: fixes to SARIF output [PR109360]
When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.
Specifically, in
c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.
Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet). We were
handling this column override for plain text output, but not for .sarif
output.
Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.
We also use column == 0 internally to mean "the line as a whole",
whereas SARIF requires column numbers to be positive.
This patch fixes these various issues.
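The positivity rule described above can be sketched roughly as follows. This is a standalone illustration, not the code in diagnostic-format-sarif.cc: the function make_region_properties and its map-based return type are hypothetical, and applying the override to both the start and end column is an assumption. Only the SARIF property names and the idea of a column override come from the description above.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a SARIF "region" object: property name -> value.
using region_props = std::map<std::string, int>;

// Build the properties of a region, adding a line/column property only
// when its value is positive.  A column of 0 ("the line as a whole") is
// replaced by COLUMN_OVERRIDE when an override is present (nonzero).
// Assumption: the override is applied to both start and end columns.
region_props
make_region_properties (int start_line, int start_column,
                        int end_line, int end_column,
                        int column_override)
{
  if (start_column == 0 && column_override > 0)
    start_column = column_override;
  if (end_column == 0 && column_override > 0)
    end_column = column_override;

  region_props props;
  if (start_line > 0)
    props["startLine"] = start_line;
  if (start_column > 0)
    props["startColumn"] = start_column;
  if (end_line > 0)
    props["endLine"] = end_line;
  if (end_column > 0)
    props["endColumn"] = end_column;
  return props;
}
```

With this shape, a diagnostic on "line 0" with no column yields an empty region instead of one that violates the schema.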
gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column. Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.
libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan Wakely [Tue, 18 Jun 2024 15:59:52 +0000 (16:59 +0100)]
libstdc++: Make std::any_cast<void> ill-formed (LWG 3305)
LWG 3305 was approved earlier this year in Tokyo. We need to give an
error if using std::any_cast<void>, but std::any_cast<void()> is valid
(but always returns null).
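A minimal sketch of the distinction, assuming a C++17 standard library; the helper name function_cast_is_null is made up for illustration:

```cpp
#include <any>

// Returns true when the pointer form of any_cast to a function type
// yields a null pointer.
bool
function_cast_is_null ()
{
  std::any a = 1;
  // void() is a function type.  A std::any can never hold a function,
  // so there is nothing for the returned pointer to point at.
  void (*p) () = std::any_cast<void ()> (&a);
  // By contrast, std::any_cast<void> (&a) is ill-formed under LWG 3305
  // and is rejected at compile time once the static assertion is in place.
  return p == nullptr;
}
```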
libstdc++-v3/ChangeLog:
* include/std/any (any_cast(any*), any_cast(const any*)): Add
static assertion to reject void types, as per LWG 3305.
* testsuite/20_util/any/misc/lwg3305.cc: New test.
This member function was previously deprecated, but that was reverted by
P2875R4, approved earlier this year in Tokyo. Since it's not going to be
deprecated in C++26, and so presumably not removed, there is no point in
giving deprecated warnings for C++23 mode.
Jonathan Wakely [Sun, 7 Apr 2024 13:12:25 +0000 (14:12 +0100)]
libstdc++: Add deprecation warnings to <strstream> types
libstdc++-v3/ChangeLog:
* include/backward/backward_warning.h: Adjust comments to
suggest <spanstream> as another alternative to <strstream>.
* include/backward/strstream (strstreambuf, istrstream)
(ostrstream, strstream): Add deprecated attribute.
Jonathan Wakely [Thu, 20 Jun 2024 12:28:08 +0000 (13:28 +0100)]
libstdc++: Fix __cpp_lib_chrono for old std::string ABI
The <chrono> header is incomplete for the old std::string ABI, because
std::chrono::tzdb is only defined for the new ABI. The feature test
macro advertising full C++20 support should not be defined for the old
ABI.
libstdc++-v3/ChangeLog:
* include/bits/version.def (chrono): Add cxx11abi = yes.
* include/bits/version.h: Regenerate.
* testsuite/std/time/syn_c++20.cc: Adjust expected value for
the feature test macro.
Jonathan Wakely [Tue, 18 Jun 2024 12:27:02 +0000 (13:27 +0100)]
libstdc++: Fix std::to_array for trivial-ish types [PR115522]
Due to PR c++/85723 the std::is_trivial trait is true for types with a
deleted default constructor, so the use of std::is_trivial in
std::to_array is not sufficient to ensure the type can be trivially
default constructed then filled using memcpy.
I also forgot that a type with a deleted assignment operator can still
be trivial, so we also need to check that it's assignable because the
is_constant_evaluated() path can't use memcpy.
Replace the uses of std::is_trivial with std::is_trivially_copyable
(needed for memcpy), std::is_trivially_default_constructible (needed so
that the default construction is valid and does no work) and
std::is_copy_assignable (needed for the constant evaluation case).
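The combined condition can be illustrated with a small trait. This is a sketch only: the variable template can_default_init_then_memcpy and the two test types are hypothetical names, not what include/std/array actually uses.

```cpp
#include <type_traits>

// True when a T can be default constructed for free, filled with memcpy,
// and also assigned element by element during constant evaluation.
template<typename T>
  inline constexpr bool can_default_init_then_memcpy
    = std::is_trivially_copyable_v<T>
      && std::is_trivially_default_constructible_v<T>
      && std::is_copy_assignable_v<T>;

// Trivially copyable, but cannot be default constructed.
struct no_default
{
  no_default () = delete;
  int i;
};

// Trivially copyable, but cannot be assigned.
struct no_assign
{
  no_assign& operator= (const no_assign&) = delete;
  int i;
};

static_assert (can_default_init_then_memcpy<int>);
static_assert (!can_default_init_then_memcpy<no_default>);
static_assert (!can_default_init_then_memcpy<no_assign>);
```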
libstdc++-v3/ChangeLog:
PR libstdc++/115522
* include/std/array (to_array): Workaround the fact that
std::is_trivial is not sufficient to check that a type is
trivially default constructible and assignable.
* testsuite/23_containers/array/creation/115522.cc: New test.
Jonathan Wakely [Thu, 20 Jun 2024 15:13:10 +0000 (16:13 +0100)]
libstdc++: Initialize base in test allocator's constructor
This fixes a warning from one of the test allocators:
warning: base class 'class std::allocator<__gnu_test::copy_tracker>' should be explicitly initialized in the copy constructor [-Wextra]
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_allocator.h (tracker_allocator):
Initialize base class in copy constructor.
*minus_plus_one had no constraints, which meant that it could be
matched after RA with operands 0, 1 and 2 all being different.
The associated split instead requires operand 0 to be tied to
operand 1.
Eric Botcazou [Mon, 27 May 2024 14:46:03 +0000 (16:46 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed array copy
The Address Sanitizer considers that the padding at the end of a justified
modular type may be accessed through the object, but it is never accessed
and therefore can always be reused.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) <discrete_type>: Set
the TYPE_JUSTIFIED_MODULAR_P flag earlier.
* gcc-interface/misc.cc (gnat_unit_size_without_reusable_padding):
New function.
(LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING): Redefine to above
function.
Eric Botcazou [Mon, 27 May 2024 14:31:20 +0000 (16:31 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed record equality
We set DECL_BIT_FIELD optimistically during the translation of record types
and clear it afterward if needed, but fail to clear other attributes in the
latter case, which fools the logic of the Address Sanitizer.
gcc/ada/
* gcc-interface/utils.cc (clear_decl_bit_field): New function.
(finish_record_type): Call clear_decl_bit_field instead of clearing
DECL_BIT_FIELD manually.
Eric Botcazou [Sat, 20 Apr 2024 10:26:52 +0000 (12:26 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This adds the missing guard to prevent the reduction from being used when
the target does not provide or cannot synthesize a high-part multiply.
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: Fix formatting.
* gcc-interface/utils2.cc: Include optabs-query.h.
(fast_modulo_reduction): Call can_mult_highpart_p on the TYPE_MODE
before generating a high-part multiply. Fix formatting.
Eric Botcazou [Fri, 5 Apr 2024 18:47:34 +0000 (20:47 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This implements modulo reduction for nonbinary modular multiplication with
small moduli by means of the standard division-free algorithm also used in
the optimizer, but with fewer constraints and therefore better results.
For the sake of consistency, it is also used for the 'Mod attribute of the
same modular types and, more generally, for the Mod (and Rem) operators of
unsigned types if the second operand is static and not a power of two.
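For readers unfamiliar with division-free reduction, the general idea can be sketched in C++ as below. This is not the code added to gcc-interface/utils2.cc: the type fast_mod16 is hypothetical, and the 16-bit operand width is chosen only so that the "high part" of the product fits in plain 64-bit arithmetic.

```cpp
#include <cstdint>

// Remainder of X by a fixed divisor D (D != 0) using one multiply, one
// shift and one multiply-subtract, with no division per operation.
// For 16-bit X and D, the multiplier M = floor (2^32 / D) + 1 makes the
// high part of X * M (the bits above bit 31) equal to floor (X / D).
struct fast_mod16
{
  std::uint32_t d;  // the divisor
  std::uint64_t m;  // floor (2^32 / d) + 1, computed once per divisor

  explicit fast_mod16 (std::uint16_t divisor)
    : d (divisor), m ((std::uint64_t {1} << 32) / divisor + 1)
  {}

  std::uint16_t operator() (std::uint16_t x) const
  {
    // High part of the product: the quotient x / d.
    std::uint64_t q = (std::uint64_t {x} * m) >> 32;
    // Remainder = x - q * d, always in [0, d).
    return static_cast<std::uint16_t> (x - q * d);
  }
};
```

In a compiler the multiplier is a compile-time constant because the divisor is static, so only the multiply, shift and subtract remain at run time.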
gcc/ada/
* gcc-interface/gigi.h (fast_modulo_reduction): Declare.
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: In the unsigned
case, call fast_modulo_reduction for {FLOOR,TRUNC}_MOD_EXPR if the
RHS is a constant and not a power of two, and the precision is not
larger than the word size.
* gcc-interface/utils2.cc: Include expmed.h.
(fast_modulo_reduction): New function.
(nonbinary_modular_operation): Call fast_modulo_reduction for the
multiplication if the precision is not larger than the word size.
Javier Miranda [Thu, 6 Jun 2024 12:06:53 +0000 (12:06 +0000)]
ada: Reject ambiguous function calls in interpolated string expressions
When the interpolated expression is an ambiguous function call,
the frontend does not reject it; it erroneously accepts the call
and generates code that calls one of the candidates.
gcc/ada/
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Reject
ambiguous function calls.
Javier Miranda [Thu, 6 Jun 2024 11:20:14 +0000 (11:20 +0000)]
ada: Crash when using user defined string literals
When a non-overridable aspect is explicitly specified for a
non-tagged derived type, the compiler blows up processing an
object declaration of an object of such type.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Fix code locating the entity
of the parent type.
Eric Botcazou [Tue, 4 Jun 2024 19:33:28 +0000 (21:33 +0200)]
ada: Small cleanup in processing of primitive operations
The processing of primitive operations is now always uniform for tagged and
untagged types, but the code contains left-overs from the time where it was
specific to tagged types, in particular for the handling of subtypes.
gcc/ada/
* einfo.ads (Direct_Primitive_Operations): Mention concurrent types
as well as GNAT extensions instead of implementation details.
(Primitive_Operations): Document that Direct_Primitive_Operations is
also used for concurrent types as a fallback.
* einfo-utils.adb (Primitive_Operations): Tweak formatting.
* exp_util.ads (Find_Prim_Op): Adjust description.
* exp_util.adb (Make_Subtype_From_Expr): In the private case with
unknown discriminants, always copy Direct_Primitive_Operations and
do not overwrite the Class_Wide_Type of the expression's base type.
* sem_ch3.adb (Analyze_Incomplete_Type_Decl): Tweak comment.
(Analyze_Subtype_Declaration): Remove older and now dead calls to
Set_Direct_Primitive_Operations. Tweak comment.
(Build_Derived_Private_Type): Likewise.
(Build_Derived_Record_Type): Likewise.
(Build_Discriminated_Subtype): Set Direct_Primitive_Operations in
all cases instead of just for tagged types.
(Complete_Private_Subtype): Likewise.
(Derived_Type_Declaration): Tweak comment.
* sem_ch4.ads (Try_Object_Operation): Adjust description.
Doug Rupp [Tue, 4 Jun 2024 17:17:57 +0000 (10:17 -0700)]
ada: Revert conditional installation of signal handlers on VxWorks
The conditional installation resulted in a semantic change. Although
it is likely what is ultimately wanted (since HW interrupts are being
reworked on VxWorks), it must be done in concert with other
modifications for the new formulation of HW interrupts and not in
isolation.
gcc/ada/
* init.c [vxworks] (__gnat_install_handler): Revert to
installing signal handlers without regard to interrupt_state.
Javier Miranda [Thu, 30 May 2024 11:24:54 +0000 (11:24 +0000)]
ada: Cannot override inherited function with controlling result
When a package has the declaration of a derived tagged
type T with private null extension that inherits a public
function F with controlling result, and a derivation of T
is declared in the public part of another package, overriding
function F may be rejected by the compiler.
gcc/ada/
* sem_disp.adb (Find_Hidden_Overridden_Primitive): Check
public dispatching primitives of ancestors; previously,
only immediately-visible primitives were checked.
Eric Botcazou [Thu, 30 May 2024 10:46:57 +0000 (12:46 +0200)]
ada: Fix missing index check with declare expression
The Do_Range_Check flag is properly set on the Expression of the EWA node
built for the declare expression, so this instructs Generate_Index_Checks
to look into this Expression.
gcc/ada/
* checks.adb (Generate_Index_Checks): Add specific treatment for
index expressions that are N_Expression_With_Actions nodes.
Eric Botcazou [Tue, 28 May 2024 21:08:32 +0000 (23:08 +0200)]
ada: Fix internal error on case expression used as index of array component
This occurs when the bounds of the array component depend on a discriminant
and the component reference is not nested, that is to say the component is
not (referenced as) a subcomponent of a larger record.
In this case, Analyze_Selected_Component does not build the actual subtype
for the component, but it turns out to be required for constructs generated
during the analysis of the case expression.
The change causes this actual subtype to be built, and also renames a local
variable used to hold the prefix of the selected component.
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): Rename Name into Pref
and use Sel local variable consistently.
(Is_Simple_Indexed_Component): New predicate.
Call Is_Simple_Indexed_Component to determine whether to build an
actual subtype for the component.
Eric Botcazou [Thu, 30 May 2024 22:13:44 +0000 (00:13 +0200)]
ada: Fix incorrect handling of packed array with aliased composite components
The problem is that the handling of the interaction between packing and
aliased/atomic/independent components of an array type is tied to that of
the interaction between a component clause and aliased/atomic/independent
components, although the semantics are different: packing is a best effort
thing, whereas a component clause must be honored or else an error be given.
This decouples the two handlings, but retrofits the separate processing of
independent components done in both cases into the common code and changes
the error message from "minimum allowed is" to "minimum allowed value is"
for the sake of consistency with the aliased/atomic processing.
gcc/ada/
* freeze.adb (Freeze_Array_Type): Decouple the handling of the
interaction between packing and aliased/atomic components from
that of the interaction between a component clause and aliased/
atomic components, and retrofit the processing of the interaction
between the two characteristics and independent components into
the common processing.
The only substantive change is to remove Activation_Chain_Entity
from N_Generic_Package_Declaration. The comment in sinfo.ads suggesting
this change was written in 1993!
Various pieces of missing documentation are added to Sinfo and Einfo.
Steve Baird [Fri, 24 May 2024 00:11:42 +0000 (17:11 -0700)]
ada: Predefined arithmetic operators incorrectly treated as directly visible
In some cases, a predefined operator (e.g., the "+" operator for an
integer type) is incorrectly treated as being directly visible when
it is not. This can lead to both accepting operator uses that should
be rejected and also to incorrectly rejecting legal constructs as ambiguous
(for example, an expression "Foo + 1" where Foo is an overloaded function and
the "+" operator is directly visible for the result type of only one of
the possible callees).
gcc/ada/
* sem_ch4.adb (Is_Effectively_Visible_Operator): A new function.
(Check_Arithmetic_Pair): In paths where Add_One_Interp was
previously called unconditionally, instead call only if
Is_Effectively_Visible_Operator returns True.
(Check_Boolean_Pair): Likewise.
(Find_Unary_Types): Likewise.
Piotr Trojanek [Thu, 16 May 2024 08:59:31 +0000 (10:59 +0200)]
ada: Fix for Default_Component_Value with declare expressions
When the expression of aspect Default_Component_Value includes a declare
expression with current type instance, we attempted to recursively freeze
that type, which itself caused an infinite recursion, because we didn't
properly manage the scope of declare expression.
This patch fixes both the detection of the current type instance and
analysis of the expression that caused recursive freezing.
gcc/ada/
* sem_attr.adb (In_Aspect_Specification): Use the standard
condition that works correctly with declare expressions.
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Replace
ordinary analysis with preanalysis of spec expressions.
Justin Squirek [Thu, 9 May 2024 20:16:24 +0000 (20:16 +0000)]
ada: Spurious style error with multiple square brackets
This patch fixes a spurious error in the compiler when checking for style for
token separation where two square brackets are next to each other.
gcc/ada/
* csets.ads (Identifier_Char): New function - replacing table.
* csets.adb (Identifier_Char): Rename and move table for static values.
(Initialize): Remove dynamic calculations.
(Identifier_Char): New function to calculate dynamic values.
* opt.adb (Set_Config_Switches): Remove setting of Identifier_Char.
Andrew Pinski [Thu, 20 Jun 2024 22:52:05 +0000 (15:52 -0700)]
complex-lowering: Better handling of PAREN_EXPR [PR68855]
When PAREN_EXPR tree code was added in r0-85884-gdedd42d511b6e4,
a simplified handling was added to complex lowering, which means
we would get:
```
_9 = COMPLEX_EXPR <_15, _14>;
_11 = ((_9));
_19 = REALPART_EXPR <_11>;
_20 = IMAGPART_EXPR <_11>;
```
In many cases instead of just simply:
```
_19 = ((_15));
_20 = ((_14));
```
So this adds full support for PAREN_EXPR to complex lowering.
It is handled very similarly to NEGATE_EXPR, except that it creates PAREN_EXPR
instead of NEGATE_EXPR for the real/imag parts. This allows for
more optimizations including vectorization, especially with
-ffast-math.
gfortran.dg/vect/pr68855.f90 is an example where this could show up.
It also shows up in SPEC CPU 2006's 465.tonto; though I have not done
any benchmarking there.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/68855
* tree-complex.cc (init_dont_simulate_again): Handle PAREN_EXPR
like NEGATE_EXPR.
(complex_propagate::visit_stmt): Likewise.
(expand_complex_move): Don't handle PAREN_EXPR.
(expand_complex_paren): New function.
(expand_complex_operations_1): Handle PAREN_EXPR like
NEGATE_EXPR. And call expand_complex_paren for PAREN_EXPR.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr68855.c: New test.
* gfortran.dg/vect/pr68855.f90: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Richard Biener [Fri, 21 Jun 2024 06:05:22 +0000 (08:05 +0200)]
Remove outdated info from passes.texi
This applies some maintenance to passes.texi by removing references
to no longer existing passes. It also fixes a few minor things but
doesn't fill the gaps that meanwhile exist.
* doc/passes.texi: Remove references to no longer existing
passes.
YunQiang Su [Tue, 18 Jun 2024 09:03:51 +0000 (17:03 +0800)]
MIPS: Set condmove cost to SET(REG, REG)
On most uarchs, the cost of condmove is the same as that of other normal
integer instructions, and it should be COSTS_N_INSNS(1).
In GCC 12 and earlier, condmove is always enabled; from
GCC 13 on, we start to compare the cost.
The generic rtx_cost gives the result of COSTS_N_INSNS(2).
Let's define it to COSTS_N_INSNS(1) in mips_rtx_costs.
gcc/
* config/mips/mips.cc (mips_rtx_costs): Set condmove cost.
* config/mips/mips.md (mov<GPR:mode>_on_<MOVECC:mode>,
mov<GPR:mode>_on_<MOVECC:mode>_mips16e2,
mov<GPR:mode>_on_<GPR2:mode>_ne,
mov<GPR:mode>_on_<GPR2:mode>_ne_mips16e2): Define name by
removing the leading *, so that we can use CODE_FOR_.
Kewen Lin [Fri, 21 Jun 2024 01:23:56 +0000 (20:23 -0500)]
rs6000: Fix wrong RTL patterns for vector merge high/low word on LE
Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low word, which are altivec_vmrg[hl]w,
vsx_xxmrg[hl]w_<VSX_W:mode>. These defines are mainly for
built-in function vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs. These functions should consider endianness. Taking
vec_mergeh as an example, PVIPR defines that vec_mergeh "Merges
the first halves (in element order) of two vectors"; note that
this is in element order. So it's mapped to vmrghw on
BE and to vmrglw on LE. Although the mapped
insns are different, as discussed in PR106069, the RTL
pattern should still be the same. This held before
commit r12-4496, where define_expand altivec_vmrghw got expanded
into:
on LE, although the mapped insn are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn. If optimization
passes leave this pattern alone, even if its pattern doesn't
represent its mapped insn, it's still fine, that's why simple
testing on bif doesn't expose this issue. But once some
optimization pass such as combine does some changes basing
on this wrong pattern, because the pattern doesn't match the
semantics that the expanded insn is intended to represent,
it would cause the unexpected result.
So this patch is to fix the wrong RTL pattern, ensure the
associated RTL patterns become the same as before which can
have the same semantic as their mapped insns. With the
proposed patch, the expanders like altivec_vmrghw expand
into altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness; "direct" easily shows which
insn would be generated, while _be and _le distinguish the
different RTL patterns for each endianness.
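The element-order semantics that the RTL pattern has to express, independent of which insn is finally emitted, can be modelled in plain C++. This is only an illustration of the PVIPR wording quoted above; the names v4si, merge_high and merge_low are hypothetical and std::array stands in for a four-word vector.

```cpp
#include <array>
#include <cstdint>

// Stand-in for a vector of four 32-bit words, indexed in element order.
using v4si = std::array<std::int32_t, 4>;

// Merge high: interleave the first halves (elements 0 and 1) of A and B.
v4si
merge_high (const v4si &a, const v4si &b)
{
  return v4si{{a[0], b[0], a[1], b[1]}};
}

// Merge low: interleave the second halves (elements 2 and 3) of A and B.
v4si
merge_low (const v4si &a, const v4si &b)
{
  return v4si{{a[2], b[2], a[3], b[3]}};
}
```

The result is the same on BE and LE when described in element order, which is why the pattern should not change with endianness even though the mapped insn does.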
Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355
gcc/ChangeLog:
* config/rs6000/altivec.md (altivec_vmrghw_direct_<VSX_W:mode>): Rename
to ...
(altivec_vmrghw_direct_<VSX_W:mode>_be): ... this. Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrglw_direct_<VSX_W:mode>): Rename to ...
(altivec_vmrglw_direct_<VSX_W:mode>_be): ... this. Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE. And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrghw_direct_v4si_be for BE and
gen_altivec_vmrglw_direct_v4si_le for LE.
(vsx_xxmrglw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrglw_direct_v4si_be for BE and
gen_altivec_vmrghw_direct_v4si_le for LE.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr106069.C: New test.
* gcc.target/powerpc/pr115355.c: New test.
Roger Sayle [Thu, 20 Jun 2024 15:30:15 +0000 (16:30 +0100)]
i386: Allow all register_operand SUBREGs in ix86_ternlog_idx.
This patch tweaks ix86_ternlog_idx to allow any SUBREG that matches
the register_operand predicate, and is split out as an independent
piece of a patch that I have to clean-up redundant ternlog patterns
in sse.md. It turns out that some of these patterns aren't (yet)
sufficiently redundant to be obsolete. The problem is that the
"new" ternlog pattern has the restriction that it allows SUBREGs,
but only those where the inner and outer modes are the same size,
whereas regular patterns use "register_operand", which allows arbitrary
SUBREGs, including paradoxical ones.
A motivating example is f2 in gcc.target/i386/avx512dq-abs-copysign-1.c
void f2 (float x, float y)
{
register float a __asm ("xmm16"), b __asm ("xmm17");
a = x;
b = y;
asm volatile ("" : "+v" (a), "+v" (b));
a = __builtin_copysignf (a, b);
asm volatile ("" : "+v" (a));
}
where the SUBREG is paradoxical, with inner mode SF and outer mode V4SF.
This patch allows the recently added ternlog_operand to accept this case.
2024-06-20 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_ternlog_idx): Allow any SUBREG
that matches register_operand. Use rtx_equal_p to compare REG
or SUBREG "leaf" operands.
Jeff Law [Thu, 20 Jun 2024 14:43:37 +0000 (08:43 -0600)]
[RISC-V] Minor cleanup/improvement to bset/binv patterns
Changes since V1:
Whitespace fixes noted by the linter
Missed using the iterator for the output template in
<bit_optab><mode>_mask pattern!
--
This patch introduces a bit_optab iterator that maps IOR/XOR to bset and
binv (and one day bclr if we need it). That allows us to combine some
patterns that only differed in the RTL opcode (IOR vs XOR) and in the
name/assembly (bset vs binv).
Additionally this also allows us to use the iterator in the
bset<mode>mask and bsetidisi patterns thus potentially fixing a missed
optimization.
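At the source level, the two operations that the iterator unifies differ only in the logical operator. The sketch below shows the corresponding idioms; the function names are hypothetical, and whether a given compiler emits bset or binv for them depends on the target options in use.

```cpp
#include <cstdint>

// IOR with a single-bit mask: sets bit N of X (the bset form).
std::uint64_t
set_bit (std::uint64_t x, unsigned n)
{
  return x | (std::uint64_t {1} << (n & 63));
}

// XOR with a single-bit mask: inverts bit N of X (the binv form).
std::uint64_t
invert_bit (std::uint64_t x, unsigned n)
{
  return x ^ (std::uint64_t {1} << (n & 63));
}
```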
This has gone through my tester. I'll wait for a verdict from
pre-commit CI before moving forward.
gcc/
* config/riscv/bitmanip.md (<bit_optab><mode>): New unified
pattern for bset/binv using a code iterator.
(<bit_optab>i<mode>): Likewise.
(<bit_optab><mode>_mask): Likewise. Support XOR via any_or.
(<bit_optab>isidi): Likewise.
* config/riscv/iterators.md (bit_optab): New iterator.
Matthias Kretz [Fri, 14 Jun 2024 13:11:25 +0000 (15:11 +0200)]
libstdc++: Fix find_last_set(simd_mask) to ignore padding bits
With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.
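The effect of clearing the padding bits can be shown with a scalar model. This is a sketch, not the simd_x86.h implementation: the mask is modelled as a 64-bit integer, the function names are hypothetical, and returning -1 for an empty mask is an assumption made for the example.

```cpp
#include <cstdint>

// Number of bits needed to represent X (0 for X == 0); a portable
// equivalent of C++20 std::bit_width.
int
bit_width_u64 (std::uint64_t x)
{
  int width = 0;
  while (x != 0)
    {
      ++width;
      x >>= 1;
    }
  return width;
}

// Index of the highest set lane among the low N_LANES bits of MASK_BITS.
// Bits at or above N_LANES are padding and must not affect the result,
// so they are cleared before computing the bit width.
int
find_last_set (std::uint64_t mask_bits, int n_lanes)
{
  std::uint64_t valid = n_lanes >= 64
    ? ~std::uint64_t {0}
    : (std::uint64_t {1} << n_lanes) - 1;
  return bit_width_u64 (mask_bits & valid) - 1;
}
```

For a 3-lane mask stored as 0x85, the set padding bit 7 is ignored and the result is lane 2.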
PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.
Steve Baird [Thu, 16 May 2024 21:49:17 +0000 (14:49 -0700)]
ada: Reference to nonexistent operator in reduction expression accepted
In some cases, a reduction expression that references the (nonexistent)
"+" operator of a generic formal private type is incorrectly accepted.
gcc/ada/
* sem_attr.adb (Resolve_Attribute.Proper_Op): When resolving the
name of the reducer subprogram in a reduction expression,
Proper_Op treats references to operators defined in Standard
specially. Disable this special treatment if the type of the
reduction expression is not the right class of type for the
operator, or if a new Boolean parameter (named "Strict") is True.
(Resolve_Attribute): In the overloaded case, iterate over the
reducer subprogram candidates twice. First with Strict => True and
then, if no good interpretation is found, with Strict => False.
Yannick Moy [Mon, 27 May 2024 10:06:47 +0000 (12:06 +0200)]
ada: Fix checking of SPARK RM on ghost with concurrent part
SPARK RM 6.9(21) forbids a ghost type to have concurrent parts.
This was not enforced, instead only the type itself was checked to
be concurrent. Now fixed.
Bob Duff [Tue, 28 May 2024 16:19:51 +0000 (12:19 -0400)]
ada: Rewrite generic formal/actual matching
...in preparation for implementing type inference for generic
parameters.
The main change is to do the "matching" computation early, and produce a
*constant* data structure (Gen_Assocs_Rec) to represent the matching
between each triple of unanalyzed formal, analyzed formal, and
corresponding actual. This will allow us to look at that data structure
more than once, which will be necessary for type inference.
Matching_Actual is removed; Match_Assocs is added.
Other changes include removal of global variables, splitting out
processing into subprograms, adding assertions, comment corrections,
and other general cleanups.
gcc/ada/
* expander.ads: Minor comment fixes.
* nlists.ads: Misc comment improvements.
* sem_aux.ads (First_Discriminant): Improve comment.
* sem_ch12.adb: Misc cleanups.
(Associations): New package containing type Gen_Assocs_Rec
to represent matchings, and function Match_Assocs to create the
Gen_Assocs_Rec constant.
(Analyze_Associations): Call Match_Assocs, and other major
changes related to that.
* sem_ch12.ads: Minor comment fixes.
* sem_ch3.adb: Minor comment fixes.
Steve Baird [Fri, 24 May 2024 21:14:03 +0000 (14:14 -0700)]
ada: Replace "All" argument to Extensions_Allowed pragma with "All_Extensions"
The argument to pragma Extensions_Allowed to enable all extensions is
no longer "All", but instead "All_Extensions".
gcc/ada/
* doc/gnat_rm/gnat_language_extensions.rst: Update documentation.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation.
* errout.adb
(Error_Msg_GNAT_Extension): Update error message text.
* par-prag.adb: Update pragma parsing. This includes changing the
name of the Check_Arg_Is_On_Or_Off formal parameter All_OK_Too
to All_Extensions_OK_Too.
* sem_prag.adb (Analyze_Pragma): In analyzing an
Extensions_Allowed pragma, replace uses of Name_All with
Name_All_Extensions; update a comment to reflect this.
* snames.ads-tmpl: Add Name_All_Extensions declaration.
* gnat_rm.texi: Regenerate.
Gary Dismukes [Thu, 23 May 2024 22:06:21 +0000 (22:06 +0000)]
ada: Crash on selected component of formal derived type in generic instance
The compiler crashes on an instantiation of a generic child unit G1.GC
that has a formal private extension P_Ext of a private type P declared
in the parent G1 whose full type has a component C, when analyzing a
selected component ACC.C whose prefix is of an access type coming from
an instantiation of another generic G2 where the designated type is
the formal type P_Ext (coming in from a formal type of G2).
gcc/ada/
* sem_ch4.adb (Try_Selected_Component_In_Instance): Reverse if_statement
clauses so that the testing for the special case of extensions of private
types in instance bodies is done first, followed by the testing for the case
of a parent type that's a generic actual type. In the extension case, apply
Base_Type to the type actual in the test of Used_As_Generic_Actual, and add
a test of Present (Parent_Subtype (Typ)).