ira: Scale save/restore costs of callee save registers with block frequency
In assign_hard_reg(), when computing the costs of the hard registers, the
cost of saving/restoring a callee-save hard register in prolog/epilog is
taken into consideration. However, this cost is not scaled with the entry
block frequency. Without scaling, the cost of saving/restoring is quite
small and this can result in a callee-save register being chosen by
assign_hard_reg() even though there are free caller-save registers
available. Assigning a callee save register to a pseudo that is live
in the entire function and across a call will cause shrink wrap to fail.
Gaius Mulley [Tue, 25 Jun 2024 17:35:22 +0000 (18:35 +0100)]
PR modula2/115536 Expression is evaluated incorrectly when encountering relops and indirection
This fix ensures that we only call BuildRelOpFromBoolean if we are
inside a constant expression (where no indirection can be used).
The fix creates a temporary variable when a boolean is created from
a relop in other cases.
The previous pattern implementation would not work if the operands required
dereferencing during non const expressions. Comparison of relop results
in a constant expression are resolved by constant propagation, basic
block analysis and dead code removal. After the quadruples have been
optimized only one assignment to the boolean variable will remain for
const expressions. All quadruple pattern checking for boolean
expressions is removed by the patch. Thus the implementation becomes
more generic.
gcc/m2/ChangeLog:
PR modula2/115536
* gm2-compiler/M2BasicBlock.def (GetBasicBlockScope): New procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): New procedure function.
* gm2-compiler/M2BasicBlock.mod (ConvertQuads2BasicBlock): Allow
conditional boolean quads to be removed.
(GetBasicBlockScope): Implement new procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): Implement new procedure function.
* gm2-compiler/M2GCCDeclare.def (FoldConstants): New parameter
declaration.
* gm2-compiler/M2GCCDeclare.mod (FoldConstants): New parameter
declaration.
(DeclareTypesConstantsProceduresInRange): Recreate basic blocks
after resolving constant expressions.
(CodeBecomes): Guard IsVariableSSA with IsVar.
* gm2-compiler/M2GenGCC.def (ResolveConstantExpressions): New
parameter declaration.
* gm2-compiler/M2GenGCC.mod (FoldIfLess): Remove relop pattern
detection.
(FoldIfGre): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
(FoldIfEqu): Ditto.
(FoldIfNotEqu): Ditto.
(FoldBecomes): Add BasicBlock parameter and allow conditional
boolean becomes to be folded in the first basic block.
(ResolveConstantExpressions): Reimplement.
* gm2-compiler/M2Quads.def (IsConstQuad): New procedure function.
(IsConditionalBooleanQuad): Ditto.
* gm2-compiler/M2Quads.mod (IsConstQuad): Implement new procedure function.
(IsConditionalBooleanQuad): Ditto.
(MoveWithMode): Use GenQuadOTypetok.
(IsInitialisingConst): Rewrite using OpUsesOp1.
(OpUsesOp1): New procedure function.
(doBuildAssignment): Mark des as a VarConditional.
(ConvertBooleanToVariable): Call PutVarConditional.
(DumpQuadSummary): New procedure.
(BuildRelOpFromBoolean): Updated debugging and improved comments.
(BuildRelOp): Only call BuildRelOpFromBoolean if we are in a const
expression and both operands are boolean relops.
(GenQuadOTypeUniquetok): New procedure.
(BackPatch): Correct comment.
* gm2-compiler/SymbolTable.def (PutVarConditional): New procedure.
(IsVarConditional): New procedure function.
* gm2-compiler/SymbolTable.mod (PutVarConditional): Implement new
procedure.
(IsVarConditional): Implement new procedure function.
(SymConstVar): New field IsConditional.
(SymVar): New field IsConditional.
(MakeVar): Initialize IsConditional field.
(MakeConstVar): Initialize IsConditional field.
* gm2-compiler/M2Swig.mod (DoBasicBlock): Change parameters to
use BasicBlock.
* gm2-compiler/M2Code.mod (SecondDeclareAndOptimize): Use iterator
to FoldConstants over basic block list.
* gm2-compiler/M2SymInit.mod (AppendEntry): Replace parameters
with BasicBlock.
* gm2-compiler/P3Build.bnf (Relation): Call RecordOp for #, <> and =.
gcc/testsuite/ChangeLog:
PR modula2/115536
* gm2/iso/const/pass/constbool4.mod: New test.
* gm2/iso/const/pass/constbool5.mod: New test.
* gm2/iso/run/pass/condtest2.mod: New test.
* gm2/iso/run/pass/condtest3.mod: New test.
* gm2/iso/run/pass/condtest4.mod: New test.
* gm2/iso/run/pass/condtest5.mod: New test.
* gm2/iso/run/pass/constbool4.mod: New test.
Jeff Law [Tue, 25 Jun 2024 17:22:01 +0000 (11:22 -0600)]
[committed] Fix fr30-elf newlib build failure with late-combine
So the late combine work has exposed a latent bug in the fr30 port.
The fr30 "call" instruction is pc-relative with a *very* limited range, 12 bits
to be precise.
With such a limited range its hard to see how we could ever consistently use it
in the compiler, with the possible exception of self-recursion. Even for a
call to a locally binding function -ffunction-sections and linker placement of
functions may separate the caller/callee. Code generation seemed to be using
indirect forms pretty consistently, though the RTL would allow direct calls.
With late-combine some of those indirects would be optimized into direct calls.
This naturally led to out of range scenarios.
With the fr30 port slated for removal unless it gets updated to use LRA and the
fundamental problems using direct calls, I took the shortest path to keep
things working -- namely forcing all calls to be indirect.
Tested in my tester with no regressions (and fixes the newlib build failure
with late-combine enabled). Pushed to the trunk.
gcc/
* config/fr30/constraints.md (Q): Remove unused constraint.
* config/fr30/predicates.md (call_operand): Remove unused predicate.
* config/fr30/fr30.md (call, vall_value): Turn into expanders and
force the call address into a register.
(*call, *call_value): Adjust to only allow indirect calls. Adjust
output template accordingly.
Patrick Palka [Tue, 25 Jun 2024 16:59:24 +0000 (12:59 -0400)]
c++: alias CTAD and copy deduction guide [PR115198]
Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C<B, T> but it should be __dguide_A with TREE_TYPE A<T>
(i.e. C<false, T>). This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.
This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.
PR c++/115198
gcc/cp/ChangeLog:
* pt.cc (alias_ctad_tweaks): Update DECL_NAME of the transformed
guides.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias22.C: New test.
Patrick Palka [Tue, 25 Jun 2024 14:42:21 +0000 (10:42 -0400)]
c++: using non-dep array var of unknown bound [PR115358]
For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.
PR c++/115358
gcc/cp/ChangeLog:
* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
variable with unknown bound.
* semantics.cc (finish_decltype_type): Remove now redundant
handling of array variables with unknown bound.
* typeck.cc (cxx_sizeof_expr): Likewise.
Sandra Loosemore [Tue, 25 Jun 2024 13:54:43 +0000 (13:54 +0000)]
Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest
This function had a reference to an uninitialized variable on the
error path. The problem was diagnosed by clang but not gcc. It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.
The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.
gcc/c/ChangeLog:
PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.
gcc/cp/ChangeLog:
PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.
libatomic: Add rcpc3 128-bit atomic operations for AArch64
The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.
These operations are single-copy atomic on cores which also implement
LSE2 and, as such, support for these operations is added to Libatomic
and employed accordingly when the LSE2 and RCPC3 features are detected
in a given core at runtime.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (libat_load_16): Add LRCPC3
variant.
(libat_store_16): Likewise.
* config/linux/aarch64/host-config.h (HWCAP2_LRCPC3): New.
(LSE2_LRCPC3_ATOP): Previously LSE2_ATOP. New ifuncs guarded
under it.
(has_rcpc3): New.
for details. This change was different from the others in that the
original call was to simplify_subreg rather than simplify_lowpart_subreg.
The old code would therefore go on to do the force_reg for more cases
than the new code would.
gcc/
* expmed.cc (store_bit_field_using_insv): Revert earlier change
to use force_subreg instead of simplify_gen_subreg.
YunQiang Su [Wed, 19 Jun 2024 17:20:36 +0000 (01:20 +0800)]
MIPS: Implement vcond_mask optabs for MSA
Currently, we have `mips_expand_vec_cond_expr`, which calculate
cmp_res first. We can just add a new extra argument to ask it
to use operands[3] as cmp_res instead of calculating from operands[4]
and operands[5].
gcc
* config/mips/mips.cc(mips_expand_vec_cond_expr): Add extra
argument to info that opernads[3] is cmp_res already.
* config/mips/mips-protos.h(mips_expand_vec_cond_expr): Ditto.
* config/mips/mips-msa.md(vcond_mask): Define new expand.
(vcondu): Use mips_expand_vec_cond_expr with 4th argument.
(vcond): Ditto.
Evgeny Karpov [Sat, 8 Jun 2024 13:49:17 +0000 (13:49 +0000)]
aarch64: Add DLL import/export to AArch64 target
This patch reuses the MinGW implementation to enable DLL import/export
functionality for the aarch64-w64-mingw32 target. It also modifies
environment configurations for MinGW.
Evgeny Karpov [Mon, 24 Jun 2024 12:44:58 +0000 (12:44 +0000)]
aarch64: Add selectany attribute handling
This patch extends the aarch64 attributes list with the selectany
attribute for the aarch64-w64-mingw32 target and reuses the mingw
implementation to handle it.
Evgeny Karpov [Mon, 24 Jun 2024 12:43:05 +0000 (12:43 +0000)]
Rename functions for reuse in AArch64
This patch renames functions related to dllimport/dllexport and
selectany functionality. These functions will be reused in the
aarch64-w64-mingw32 target.
Evgeny Karpov [Mon, 24 Jun 2024 12:38:40 +0000 (12:38 +0000)]
Extract ix86 dllimport implementation to mingw
This patch extracts the ix86 implementation for expanding a SYMBOL
into its corresponding dllimport, far-address, or refptr symbol. It
will be reused in the aarch64-w64-mingw32 target. The implementation
is copied as is from i386/i386.cc with minor changes to follow to the
code style.
Also this patch replaces the original DLL import/export implementation
in ix86 with mingw.
Evgeny Karpov [Sat, 8 Jun 2024 13:18:02 +0000 (13:18 +0000)]
Move mingw_* declarations to the mingw folder
This patch refactors recent changes to move mingw-related
functionality to the mingw folder. More renamings to the mingw_ prefix
will be done in follow-up commits.
This is the first commit in the second patch series to add DLL
import/export implementation to AArch64.
Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com> and
Ron Riddle <ron.riddle@microsoft.com>
Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>
Jakub Jelinek [Tue, 25 Jun 2024 06:35:56 +0000 (08:35 +0200)]
c: Fix ICE related to incomplete structures in C23 [PR114930]
Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL. That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical should
have been already called on the other type. c, the returned type, is usually x
and in that case it should have TYPE_CANONICAL set to itself, or worst
for whatever reason x is not the right canonical type (say it has attributes
or whatever disqualifies it from check_qualified_type). In that case
either it finds some pre-existing type from the variant chain of t which
is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
/* Put the found variant at the head of the variant list so
frequently searched variants get found faster. The C++ FE
benefits greatly from this. */
tree t = *tp;
*tp = TYPE_NEXT_VARIANT (t);
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
TYPE_NEXT_VARIANT (mv) = t;
return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
/* Add the new type to the chain of variants of TYPE. */
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
TYPE_NEXT_VARIANT (m) = t;
TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e. TYPE_CANONICAL
(c) = c;), but also need to process pointers to it and only then return back
to processing x. Processing the whole chain from c again could be costly,
we could have hundreds of types in the chain already processed, and while
the loop would just quickly skip them
for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
{
if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
...
else if (x != t)
continue;
it feels costly. So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.
We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest. There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.
2024-06-25 Jakub Jelinek <jakub@redhat.com>
Martin Uecker <uecker@tugraz.at>
PR c/114930
PR c/115502
gcc/c/
* c-decl.cc (c_update_type_canonical): Assert t is main variant
with 0 TYPE_QUALS. Simplify and don't use check_qualified_type.
Deal with the case where build_qualified_type returns
TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
* gcc.dg/pr114574-1.c: Require lto effective target.
* gcc.dg/pr114574-2.c: Likewise.
* gcc.dg/pr114930.c: New test.
* gcc.dg/pr115502.c: New test.
Jeff Law [Tue, 25 Jun 2024 05:22:21 +0000 (23:22 -0600)]
[committed][RISC-V] Fix some of the testsuite fallout from late-combine patch
This fixes most, but not all of the testsuite fallout from the late-combine
patch. Specifically in the vector space we're often able to eliminate a
broadcast of an scalar element across a vector. That eliminates the vsetvl
related to the broadcast, but more importantly from the testsuite standpoint it
turns .vv forms into .vf or .vx forms.
There were two paths we could have taken here. One to accept .v*, ignoring the
actual register operands. Or to create new matches for the .vx and .vf
variants. I selected the latter as I'd like us to know if the code to avoid
the broadcast regresses.
I'm pushing this through now so that we've got cleaner results and to prevent
duplicate work. I've got patch for the rest of the testsuite fallout, but I
want to think about them a bit.
Kewen Lin [Tue, 25 Jun 2024 05:04:53 +0000 (00:04 -0500)]
Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type
Currently how we determine which mode will be used for a
floating point type is that for a given type precision
(size) call mode_for_size to get the first mode which has
this size in the specified class. On Powerpc, we have
three modes (TF/KF/IF) having the same mode precision 128
(see[1]), so the processing forces us to have to place TF
at the first place, it would require us to make more
adjustment in some generic code to avoid some unexpected
mode conversions and it would be even worse if we get rid
of TF eventually one day. And as Joseph pointed out in [2],
"floating types should have their mode, not a poorly
defined precision value", as Joseph and Richi suggested,
this patch is to introduce one hook mode_for_floating_type
which returns the corresponding mode for type float, double
or long double. The default implementation returns SFmode
for float and DFmode for double or long double. For ports
which need special treatment, there are some other patches
for their own port specific implementation (referring to
how {,LONG_}DOUBLE_TYPE_SIZE get used there). For all
generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending
on the context, some of them are replaced with TYPE_PRECISION
of the according type node, some other are replaced with
GET_MODE_PRECISION on the mode from mode_for_floating_type.
This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port
specific are removed, but there are still some which are
good to be kept for readability then they get renamed with
port specific prefix.
Kewen Lin [Tue, 25 Jun 2024 05:04:51 +0000 (00:04 -0500)]
vms: Replace use of LONG_DOUBLE_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of LONG_DOUBLE_TYPE_SIZE in vms port
with TYPE_PRECISION of long_double_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:49 +0000 (00:04 -0500)]
rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.
Kewen Lin [Tue, 25 Jun 2024 05:04:47 +0000 (00:04 -0500)]
go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in go with TYPE_PRECISION of {float,{,long_}double}_type_node.
* go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
(Gcc_backend::complex_type): Likewise.
Sergei Lewis [Mon, 24 Jun 2024 20:20:14 +0000 (14:20 -0600)]
[PATCH v2 2/3] RISC-V: setmem for RISCV with V extension
This is primarily Sergei's work, my contributions were limited to
merging his expander with the one that's on the trunk, allowing
non-constant value and trivial testsuite adjustments due to option renaming.
I'm doing setmem first because it's the easiest. The others will follow
soon enough.
I've tested this in my system, waiting on pre-commit CI to render its
verdict before moving forward.
gcc/ChangeLog
* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.
* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.
* config/riscv/riscv.md (setmem<mode>): Try riscv_vector::expand_vec_setmem
for constant lengths. Do not require operand 2 to be a constant.
gcc/testsuite/ChangeLog
* gcc.target/riscv/rvv/base/setmem-1.c: New tests
* gcc.target/riscv/rvv/base/setmem-2.c: New tests
* gcc.target/riscv/rvv/base/setmem-3.c: New tests
Patrick O'Neill [Mon, 24 Jun 2024 19:06:15 +0000 (12:06 -0700)]
RISC-V: Add dg-remove-option for z* extensions
This introduces testsuite support infra for removing extensions.
Since z* extensions don't have ordering requirements the logic for
adding/removing those extensions has also been consolidated.
This fixes RVWMO compile testcases failing on Ztso targets by removing
the extension from the -march string.
gcc/ChangeLog:
* doc/sourcebuild.texi (dg-remove-option): Add documentation.
(dg-add-option): Add documentation for riscv_{a,zaamo,zalrsc,ztso}
Harald Anlauf [Sun, 23 Jun 2024 20:36:43 +0000 (22:36 +0200)]
Fortran: fix passing of optional dummy as actual to optional argument [PR55978]
gcc/fortran/ChangeLog:
PR fortran/55978
* trans-array.cc (gfc_conv_array_parameter): Do not dereference
data component of a missing allocatable dummy array argument for
passing as actual to optional dummy. Harden logic of presence
check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
of TRUTH_AND_EXPR.
gcc/testsuite/ChangeLog:
PR fortran/55978
* gfortran.dg/optional_absent_12.f90: New test.
Roger Sayle [Mon, 24 Jun 2024 14:34:03 +0000 (15:34 +0100)]
PR tree-optimization/113673: Avoid load merging when potentially trapping.
This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified. When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.
This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.
2024-06-24 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.
gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.
Richard Biener [Mon, 24 Jun 2024 07:52:39 +0000 (09:52 +0200)]
tree-optimization/115602 - SLP CSE results in cycles
The following prevents SLP CSE to create new cycles which happened
because of a 1:1 permute node being present where its child was then
CSEd to the permute node. Fixed by making a node only available to
CSE to after recursing.
PR tree-optimization/115602
* tree-vect-slp.cc (vect_cse_slp_nodes): Delay populating the
bst-map to avoid cycles.
Richard Biener [Fri, 21 Jun 2024 11:19:26 +0000 (13:19 +0200)]
tree-optimization/115528 - fix vect alignment analysis for outer loop vect
For outer loop vectorization of a data reference in the inner loop
we have to look at both steps to see if they preserve alignment.
What is special for this testcase is that the outer loop step is
one element but the inner loop step four and that we now use SLP
and the vectorization factor is one.
PR tree-optimization/115528
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Make sure to look at both the inner and outer loop step
behavior.
This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.
The pass currently has a single objective: remove definitions by
substituting into all uses. The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.
The patch fixes PR106594. It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.
This is just a first step. I'm hoping that the pass could be
used for other combine-related optimisations in future. In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure. If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.
On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.
Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation. This trips things like:
(define_insn_and_split "..."
[...pattern...]
"...cond..."
"#"
"&& 1"
[...pattern...]
{
...unconditional use of gen_reg_rtx ()...;
}
because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed. rs6000 has several instances of this.
xtensa has a variation in which the split condition is:
"&& can_create_pseudo_p ()"
The failure then is that, if we match after RA, we'll never be
able to split the instruction.
The patch therefore disables the pass by default on i386, rs6000
and xtensa. Hopefully we can fix those ports later (if their
maintainers want). It seems better to add the pass first, though,
to make it easier to test any such fixes.
gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output. That might be
worth doing, but it seems too complex to do as part of this patch.
I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite. This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark. All targets seemed to improve on average:
rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set. These searches are currently
used for two main things:
- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber
The exclusion set was originally a callback function that returned
true for insns that should be ignored. However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.
This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions. Currently the
only two member functions are:
- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality
but more could be added later.
Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious). The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.
Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names. The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".
gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing. Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes. Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes. Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.
The compare_repeat_factors comparator fails qsort checking eventually
because it uses rf2->rank - rf1->rank to compare unsigned numbers
which causes issues for ranks that interpret negative as signed.
Fixed by re-writing the obvious way. I've also fixed the count
comparison which suffers from truncation as count is 64bit signed
while the comparator result is 32bit int (that's a lot less likely
to hit in practice though).
The testcase from the PR is too large to include.
PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
Haochen Gui [Mon, 24 Jun 2024 05:12:51 +0000 (13:12 +0800)]
fwprop: invoke change_is_worthwhile to judge if a replacement is worthwhile
gcc/
* fwprop.cc (try_fwprop_subst_pattern): Invoke change_is_worthwhile
to judge if a replacement is worthwhile. Remove single_set check
and add is_debug_insn check.
* recog.cc (swap_change): Invalidate recog_data when the cached INSN
is swapped out.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Check if the
insn cost of new rtl is unknown and fail the replacement.
Mark Harmstone [Mon, 24 Jun 2024 00:17:39 +0000 (18:17 -0600)]
[PATCH 02/11] Handle CodeView base types
Adds a get_type_num function to translate type DIEs into CodeView
numbers, along with a hash table for this. For now we just deal with
the base types (integers, Unicode chars, floats, and bools).
Mark Harmstone [Sun, 23 Jun 2024 23:48:10 +0000 (17:48 -0600)]
[PATCH 01/11] Output CodeView data about variables
Parse the DW_TAG_variable DIEs, and outputs S_GDATA32 (for global variables)
and S_LDATA32 (static global variables) symbols into the .debug$S section.
gcc/
* dwarf2codeview.cc (S_LDATA32, S_GDATA32): Define.
(struct codeview_symbol): New structure.
(sym, last_sym): New variables.
(write_data_symbol): New function.
(write_codeview_symbols): Call write_data_symbol.
(add_variable, codeview_debug_early_finish): New functions.
* dwarf2codeview.h (codeview_debug_early_finish): Prototype.
* dwarf2out.cc
(dwarf2out_early_finish): Call codeview_debug_early_finish.
Artemiy Volkov [Sun, 23 Jun 2024 20:54:00 +0000 (14:54 -0600)]
[PATCH] RISC-V: Fix unrecognizable pattern in riscv_expand_conditional_move()
Presently, the code fragment:
int x[5];
void
d(int a, int b, int c) {
for (int i = 0; i < 5; i++)
x[i] = (a != b) ? c : a;
}
causes an ICE when compiled with -O2 -march=rv32i_zicond:
test.c: In function 'd':
test.c: error: unrecognizable insn:
11 | }
| ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
(if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
(reg/v:SI 137 [ b ]))
(reg/v:SI 136 [ a ])
(reg/v:SI 138 [ c ]))) -1
(nil))
during RTL pass: vregs
This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand. Fix this by adding
a extra check before performing this optimization.
The code snippet mentioned above is also included in this patch as a new Zicond
testcase.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
CONST0_RTX check.
Jeff Law [Sun, 23 Jun 2024 14:26:25 +0000 (08:26 -0600)]
[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL
Run-of-the-mill checking issue. We had something like (plus (reg) (reg)) and
tried to extract INTVAL (XEXP (x, 1)) which of course blows up with checking
on.
Fixed thusly. Tested on riscv32-elf in my tester. riscv64-elf is in flight,
but won't finish for a while due to other tasks in flight.
PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.
Richard Biener [Sun, 23 Jun 2024 09:26:39 +0000 (11:26 +0200)]
tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.
PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
Richard Biener [Sat, 22 Jun 2024 12:59:09 +0000 (14:59 +0200)]
tree-optimization/115579 - fix wrong code with store-motion
The recent change to relax store motion for variables that cannot have
store data races broke the optimization to share flag vars for stores
that all happen in the same single BB. The following fixes this.
PR tree-optimization/115579
* tree-ssa-loop-im.cc (execute_sm): Return the auxiliary data
created.
(hoist_memory_references): Record the flag var that's eventually
created and re-use it when all stores are in the same BB.
Jeff Law [Sat, 22 Jun 2024 16:39:51 +0000 (10:39 -0600)]
[committed] [RISC-V] Skip zbs-ext-2.c for -Oz as well
> the test should probably also be skipped on -Oz:
>
> === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andi\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andn\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times li\t 1
Yea. Just re-ran thing and sure enough we need to skip -Oz as well. So
committing the obvious change....
gcc/testsuite/
* gcc.target/riscv/zbs-ext-2.c: Also skip for -Oz.
David Malcolm [Fri, 21 Jun 2024 22:20:38 +0000 (18:20 -0400)]
diagnostics: remove duplicate copies of diagnostic_kind_text
No functional change intended.
gcc/ChangeLog:
* diagnostic-format-json.cc
(json_output_format::on_end_diagnostic): Use
get_diagnostic_kind_text rather than embedding a duplicate copy of
the table.
* diagnostic-format-sarif.cc
(make_rule_id_for_diagnostic_kind): Likewise.
* diagnostic.cc (get_diagnostic_kind_text): New.
* diagnostic.h (get_diagnostic_kind_text): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jeff Law [Fri, 21 Jun 2024 21:58:12 +0000 (15:58 -0600)]
[committed] Fix testsuite fallout on stormy16 after IOR->PLUS change
More minor fallout from the IOR->PLUS change a little while ago. This time on
xstormy16.
The pattern to swap nibbles actually tries to handle all the cases of IOR, XOR
and PLUS. But when we generate PLUS earlier in the pipeline, the
simplifications/canonicalizations are slightly different resulting in the
pattern not matching.
This patch adds an alternate pattern which matches what we get now. Basically
it looks like QImode rotate by 4, zero extended to HI.
Run in my tester to verify the regression was fixed. Pushing to the trunk.
gcc/
* config/stormy16/stormy16.md (swpn_zext): New pattern.
Jonathan Wakely [Wed, 19 Jun 2024 16:26:37 +0000 (17:26 +0100)]
libstdc++: Remove std::__is_pointer and std::__is_scalar [PR115497]
This removes the std::__is_pointer and std::__is_scalar traits, as they
conflicts with a Clang built-in.
Although Clang has a hack to make the class templates work despite using
reserved names, removing these class templates will allow that hack to
be dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_pointer, __is_scalar):
Remove.
(__is_arithmetic): Do not use __is_pointer in the primary
template. Add partial specialization for pointers.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Remove std::__is_void class template [PR115497]
This removes the std::__is_void trait, as it conflicts with a Clang
built-in. There is only one use of the trait, which can easily be
replaced by simpler code.
Although Clang has a hack to make the class template work despite using
a reserved name, removing std::__is_void will allow that hack to be
dropped at some future date.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_void): Remove.
* include/debug/helper_functions.h (_Distance_traits):
Adjust partial specialization to match void directly, instead of
using __is_void<T>::__type and matching __true_type.
Jonathan Wakely [Wed, 19 Jun 2024 16:21:16 +0000 (17:21 +0100)]
libstdc++: Stop using std::__is_pointer in <deque> and <algorithm> [PR115497]
This replaces all uses of the std::__is_pointer type trait with uses of
the new __is_pointer built-in. Since the class template was only used to
enable some performance optimizations for algorithms, we can use the
built-in when __has_builtin(__is_pointer) is true (which is the case for
GCC trunk and for current versions of Clang) and just forego the
optimization otherwise.
Removing the uses of std::__is_pointer means it can be removed from
<bits/cpp_type_traits.h>, which is another step towards fixing PR
115497.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/deque.tcc (__lex_cmp_dit): Replace __is_pointer
class template with __is_pointer(T) built-in.
(__lexicographical_compare_aux1): Likewise.
* include/bits/stl_algobase.h (__equal_aux1): Likewise.
(__lexicographical_compare_aux1): Likewise.
Jonathan Wakely [Wed, 19 Jun 2024 10:19:58 +0000 (11:19 +0100)]
libstdc++: Don't use std::__is_scalar in std::valarray initialization [PR115497]
This removes the use of the std::__is_scalar trait from <valarray>,
where it can be replaced by __is_trivial. It's used to decide whether we
can use memset to value-initialize valarray elements, but memset is
suitable for any trivial types, because value-initializing them is
equivalent to filling them with zeros.
This is another step towards removing the class templates in
<bits/cpp_type_traits.h> that conflict with Clang built-in names.
libstdc++-v3/ChangeLog:
PR libstdc++/115497
* include/bits/valarray_array.h (__valarray_default_construct):
Use __is_trivial(_Tp). instead of __is_scalar<_Tp>.
Jonathan Wakely [Wed, 19 Jun 2024 15:14:56 +0000 (16:14 +0100)]
libstdc++: Fix std::fill and std::fill_n optimizations [PR109150]
As noted in the PR, the optimization used for scalar types in std::fill
and std::fill_n is non-conforming, because it doesn't consider that
assigning a scalar type might have non-trivial side effects which are
affected by the optimization.
By changing the condition under which the optimization is done we ensure
it's only performed when safe to do so, and we also enable it for
additional types, which was the original subject of the PR.
Instead of two overloads using __enable_if<__is_scalar<T>::__value, R>
we can combine them into one and create a local variable which is either
a local copy of __value or another reference to it, depending on whether
the optimization is allowed.
This removes a use of std::__is_scalar, which is a step towards fixing
PR 115497 by removing std::__is_pointer from <bits/cpp_type_traits.h>
libstdc++-v3/ChangeLog:
PR libstdc++/109150
* include/bits/stl_algobase.h (__fill_a1): Combine the
!__is_scalar and __is_scalar overloads into one and rewrite the
condition used to decide whether to perform the load outside the
loop.
* testsuite/25_algorithms/fill/109150.cc: New test.
* testsuite/25_algorithms/fill_n/109150.cc: New test.
Matthias Kretz [Fri, 21 Jun 2024 14:22:22 +0000 (16:22 +0200)]
libstdc++: Fix test on x86_64 and non-simd targets
* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.
* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).
All uses of xs_hi_nonmemory_operand allow constraint "i",
which means that they allow consts, symbol_refs and label_refs.
The definition of xs_hi_nonmemory_operand accounted for consts,
but not for symbol_refs and label_refs.
gcc/
* config/stormy16/predicates.md (xs_hi_nonmemory_operand): Handle
symbol_ref and label_ref.
power_of_2_operand allows any 32-bit power of 2, whereas "I" only
accepts 16-bit signed constants. This meant that any power of 2
greater than 32768 would cause an "insn does not satisfy its
constraints" ICE.
Also, the %p operand modifier barfed on 1<<31, which is sign-
rather than zero-extended to 64 bits. The code is inherently
limited to 32-bit operands -- power_of_2_operand contains a test
involving "unsigned" -- so this patch just ands with 0xffffffff.
gcc/
* config/iq2000/iq2000.cc (iq2000_print_operand): Make %p handle 1<<31.
* config/iq2000/iq2000.md: Remove "I" constraints on
power_of_2_operands.
No-op moves are given the code NOOP_MOVE_INSN_CODE if we plan
to delete them later. Such insns shouldn't be costed, partly
because they're going to disappear, and partly because targets
won't recognise the insn code.
Andrew MacLeod [Mon, 17 Jun 2024 20:07:16 +0000 (16:07 -0400)]
Print "Global Exported" to dump_file from set_range_info.
* gimple-range.cc (gimple_ranger::register_inferred_ranges): Do not
dump global range info after set_range_info.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
(dom_ranger::range_of_stmt): Likewise.
* tree-ssanames.cc (set_range_info): If global range info
changes, maybe print new range to dump_file.
* tree-vrp.cc (remove_unreachable::handle_early): Do not
dump global range info after set_range_info.
(remove_unreachable::remove): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
(pass_assumptions::execute): Likewise.
Andrew MacLeod [Mon, 17 Jun 2024 15:32:51 +0000 (11:32 -0400)]
Change fast VRP algorithm
Change the fast VRP algorithm to track contextual ranges active within
each basic block.
* gimple-range.cc (dom_ranger::dom_ranger): Create a block
vector.
(dom_ranger::~dom_ranger): Dispose of the block vector.
(dom_ranger::edge_range): Delete.
(dom_ranger::range_on_edge): Combine range in src BB with any
range gori_nme_on_edge returns.
(dom_ranger::range_in_bb): Combine global range with any active
contextual range for an ssa-name.
(dom_ranger::range_of_stmt): Fix non-ssa LHS case, use
fur_depend for folding so relations can be registered.
(dom_ranger::maybe_push_edge): Delete.
(dom_ranger::pre_bb): Create incoming contextual range vector.
(dom_ranger::post_bb): Free contextual range vector.
* gimple-range.h (dom_ranger::edge_range): Delete.
(dom_ranger::m_e0): Delete.
(dom_ranger::m_e1): Delete.
(dom_ranger::m_bb): New.
(dom_ranger::m_pop_list): Delete.
* tree-vrp.cc (execute_fast_vrp): Enable relation oracle.
Andrew MacLeod [Mon, 17 Jun 2024 15:23:12 +0000 (11:23 -0400)]
Add builtin_unreachable processing for fast_vrp.
Add a remove_unreachable object to fast vrp, and honor the final_p flag.
* tree-vrp.cc (remove_unreachable::remove): Export global range
if builtin_unreachable dominates all uses.
(remove_unreachable::remove_and_update_globals): Do not reset SCEV.
(execute_ranger_vrp): Reset SCEV here instead.
(fvrp_folder::fvrp_folder): Take final pass flag
and create a remove_unreachable object when specified.
(fvrp_folder::pre_fold_stmt): Register GIMPLE_CONDs with
the remove_unreachcable object.
(fvrp_folder::m_unreachable): New.
(execute_fast_vrp): Process remove_unreachable object.
(pass_vrp::execute): Add final_p flag to execute_fast_vrp.
David Malcolm [Fri, 21 Jun 2024 12:46:14 +0000 (08:46 -0400)]
testsuite: check that generated .sarif files validate against the SARIF schema [PR109360]
This patch extends the dg directive verify-sarif-file so that if
the "jsonschema" tool is available, it will be used to validate the
generated .sarif file.
gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/sarif-schema-2.1.0.json: New file, downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json
Licensing information can be seen at
https://github.com/oasis-tcs/sarif-spec/issues/583
which states "They are free to incorporate it into their
implementation. No need for special permission or paperwork from
OASIS."
* lib/scansarif.exp (verify-sarif-file): If "jsonschema" is
available, use it to verify that the .sarif file complies with the
SARIF schema.
* lib/target-supports.exp (check_effective_target_jsonschema):
New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 21 Jun 2024 12:46:13 +0000 (08:46 -0400)]
diagnostics: fixes to SARIF output [PR109360]
When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.
Specifically, in
c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.
Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet). We were
handling this column override for plain text output, but not for .sarif
output.
Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.
We also use column == 0 internally to mean "the line as a whole",
whereas SARIF required column numbers to be positive.
This patch fixes these various issues.
gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column. Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.
libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan Wakely [Tue, 18 Jun 2024 15:59:52 +0000 (16:59 +0100)]
libstdc++: Make std::any_cast<void> ill-formed (LWG 3305)
LWG 3305 was approved earlier this year in Tokyo. We need to give an
error if using std::any_cast<void>, but std::any_cast<void()> is valid
(but always returns null).
libstdc++-v3/ChangeLog:
* include/std/any (any_cast(any*), any_cast(const any*)): Add
static assertion to reject void types, as per LWG 3305.
* testsuite/20_util/any/misc/lwg3305.cc: New test.
This member function was previously deprecated, but that was reverted by
P2875R4, approved earlier this year in Tokyo. Since it's not going to be
deprecated in C++26, and so presumably not removed, there is no point in
giving deprecated warnings for C++23 mode.
Jonathan Wakely [Sun, 7 Apr 2024 13:12:25 +0000 (14:12 +0100)]
libstdc++: Add deprecation warnings to <strstream> types
libstdc++-v3/ChangeLog:
* include/backward/backward_warning.h: Adjust comments to
suggest <spanstream> as another alternative to <strstream>.
* include/backward/strstream (strstreambuf, istrstream)
(ostrstream, strstream): Add deprecated attribute.
Jonathan Wakely [Thu, 20 Jun 2024 12:28:08 +0000 (13:28 +0100)]
libstdc++: Fix __cpp_lib_chrono for old std::string ABI
The <chrono> header is incomplete for the old std::string ABI, because
std::chrono::tzdb is only defined for the new ABI. The feature test
macro advertising full C++20 support should not be defined for the old
ABI.
libstdc++-v3/ChangeLog:
* include/bits/version.def (chrono): Add cxx11abi = yes.
* include/bits/version.h: Regenerate.
* testsuite/std/time/syn_c++20.cc: Adjust expected value for
the feature test macro.
Jonathan Wakely [Tue, 18 Jun 2024 12:27:02 +0000 (13:27 +0100)]
libstdc++: Fix std::to_array for trivial-ish types [PR115522]
Due to PR c++/85723 the std::is_trivial trait is true for types with a
deleted default constructor, so the use of std::is_trivial in
std::to_array is not sufficient to ensure the type can be trivially
default constructed then filled using memcpy.
I also forgot that a type with a deleted assignment operator can still
be trivial, so we also need to check that it's assignable because the
is_constant_evaluated() path can't use memcpy.
Replace the uses of std::is_trivial with std::is_trivially_copyable
(needed for memcpy), std::is_trivially_default_constructible (needed so
that the default construction is valid and does no work) and
std::is_copy_assignable (needed for the constant evaluation case).
libstdc++-v3/ChangeLog:
PR libstdc++/115522
* include/std/array (to_array): Workaround the fact that
std::is_trivial is not sufficient to check that a type is
trivially default constructible and assignable.
* testsuite/23_containers/array/creation/115522.cc: New test.
Jonathan Wakely [Thu, 20 Jun 2024 15:13:10 +0000 (16:13 +0100)]
libstdc++: Initialize base in test allocator's constructor
This fixes a warning from one of the test allocators:
warning: base class 'class std::allocator<__gnu_test::copy_tracker>' should be explicitly initialized in the copy constructor [-Wextra]
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_allocator.h (tracker_allocator):
Initialize base class in copy constructor.
*minus_plus_one had no constraints, which meant that it could be
matched after RA with operands 0, 1 and 2 all being different.
The associated split instead requires operand 0 to be tied to
operand 1.
Eric Botcazou [Mon, 27 May 2024 14:46:03 +0000 (16:46 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed array copy
The Address Sanitizer considers that the padding at the end of a justified
modular type may be accessed through the object, but it is never accessed
and therefore can always be reused.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) <discrete_type>: Set
the TYPE_JUSTIFIED_MODULAR_P flag earlier.
* gcc-interface/misc.cc (gnat_unit_size_without_reusable_padding):
New function.
(LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING): Redefine to above
function.
Eric Botcazou [Mon, 27 May 2024 14:31:20 +0000 (16:31 +0200)]
ada: Fix bogus Address Sanitizer stack-buffer-overflow on packed record equality
We set DECL_BIT_FIELD optimistically during the translation of record types
and clear it afterward if needed, but fail to clear other attributes in the
latter case, which fools the logic of the Address Sanitizer.
gcc/ada/
* gcc-interface/utils.cc (clear_decl_bit_field): New function.
(finish_record_type): Call clear_decl_bit_field instead of clearing
DECL_BIT_FIELD manually.
Eric Botcazou [Sat, 20 Apr 2024 10:26:52 +0000 (12:26 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This adds the missing guard to prevent the reduction from being used when
the target does not provide or cannot synthesize a high-part multiply.
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: Fix formatting.
* gcc-interface/utils2.cc: Include optabs-query.h.
(fast_modulo_reduction): Call can_mult_highpart_p on the TYPE_MODE
before generating a high-part multiply. Fix formatting.
Eric Botcazou [Fri, 5 Apr 2024 18:47:34 +0000 (20:47 +0200)]
ada: Implement fast modulo reduction for nonbinary modular multiplication
This implements modulo reduction for nonbinary modular multiplication with
small moduli by means of the standard division-free algorithm also used in
the optimizer, but with fewer constraints and therefore better results.
For the sake of consistency, it is also used for the 'Mod attribute of the
same modular types and, more generally, for the Mod (and Rem) operators of
unsigned types if the second operand is static and not a power of two.
gcc/ada/
* gcc-interface/gigi.h (fast_modulo_reduction): Declare.
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: In the unsigned
case, call fast_modulo_reduction for {FLOOR,TRUNC}_MOD_EXPR if the
RHS is a constant and not a power of two, and the precision is not
larger than the word size.
* gcc-interface/utils2.cc: Include expmed.h.
(fast_modulo_reduction): New function.
(nonbinary_modular_operation): Call fast_modulo_reduction for the
multiplication if the precision is not larger than the word size.
Javier Miranda [Thu, 6 Jun 2024 12:06:53 +0000 (12:06 +0000)]
ada: Reject ambiguous function calls in interpolated string expressions
When the interpolated expression is a call to an ambiguous call
the frontend does not reject it; erroneously accepts the call
and generates code that calls to one of them.
gcc/ada/
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Reject
ambiguous function calls.
Javier Miranda [Thu, 6 Jun 2024 11:20:14 +0000 (11:20 +0000)]
ada: Crash when using user defined string literals
When a non-overridable aspect is explicitly specified for a
non-tagged derived type, the compiler blows up processing an
object declaration of an object of such type.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Fix code locating the entity
of the parent type.
Eric Botcazou [Tue, 4 Jun 2024 19:33:28 +0000 (21:33 +0200)]
ada: Small cleanup in processing of primitive operations
The processing of primitive operations is now always uniform for tagged and
untagged types, but the code contains left-overs from the time where it was
specific to tagged types, in particular for the handling of subtypes.
gcc/ada/
* einfo.ads (Direct_Primitive_Operations): Mention concurrent types
as well as GNAT extensions instead of implementation details.
(Primitive_Operations): Document that Direct_Primitive_Operations is
also used for concurrent types as a fallback.
* einfo-utils.adb (Primitive_Operations): Tweak formatting.
* exp_util.ads (Find_Prim_Op): Adjust description.
* exp_util.adb (Make_Subtype_From_Expr): In the private case with
unknown discriminants, always copy Direct_Primitive_Operations and
do not overwrite the Class_Wide_Type of the expression's base type.
* sem_ch3.adb (Analyze_Incomplete_Type_Decl): Tweak comment.
(Analyze_Subtype_Declaration): Remove older and now dead calls to
Set_Direct_Primitive_Operations. Tweak comment.
(Build_Derived_Private_Type): Likewise.
(Build_Derived_Record_Type): Likewise.
(Build_Discriminated_Subtype): Set Direct_Primitive_Operations in
all cases instead of just for tagged types.
(Complete_Private_Subtype): Likewise.
(Derived_Type_Declaration): Tweak comment.
* sem_ch4.ads (Try_Object_Operation): Adjust description.