aarch64: Suggest an -mcpu option when user passes CPU name to -march
This small patch helps users who confuse -march and -mcpu on AArch64.
Sometimes users pass -march with a CPU name, where they most likely wanted to
use -mcpu, which would select the right architecture features *and* tune for
their desired CPU. Currently we'll just error out with an unknown architecture
message and list the valid architecture options.
With this patch we check if their string matches a known CPU and suggest they
use an -mcpu option instead.
So compiling with -march=neoverse-n1 will now give the error:
cc1: error: unknown value 'neoverse-n1' for '-march'
cc1: note: valid arguments are: armv8-a armv8.1-a armv8.2-a armv8.3-a armv8.4-a armv8.5-a armv8.6-a armv8.7-a armv8.8-a armv8-r armv9-a
cc1: note: did you mean '-mcpu=neoverse-n1'?
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_validate_march): Check if invalid arch
string is a valid -mcpu string and emit hint.
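
The check described above can be sketched in a few lines of C. This is only an illustration of the idea, not the code in aarch64_validate_march: the tables are abbreviated, and the names known_archs, known_cpus, in_table and validate_march are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, abbreviated tables for illustration only.  */
static const char *const known_archs[]
  = { "armv8-a", "armv8.1-a", "armv9-a", NULL };
static const char *const known_cpus[]
  = { "cortex-a72", "neoverse-n1", "neoverse-v1", NULL };

/* Return 1 if NAME appears in the NULL-terminated TABLE, else 0.  */
static int
in_table (const char *const *table, const char *name)
{
  for (; *table != NULL; table++)
    if (strcmp (*table, name) == 0)
      return 1;
  return 0;
}

/* Return 1 if NAME is a known architecture.  Otherwise return 0 and,
   when NAME matches a known CPU, write a "did you mean" hint to HINT.
   HINT is left empty when there is nothing to suggest.  */
int
validate_march (const char *name, char *hint, size_t hint_size)
{
  if (hint_size > 0)
    hint[0] = '\0';
  if (in_table (known_archs, name))
    return 1;
  if (in_table (known_cpus, name))
    snprintf (hint, hint_size, "did you mean '-mcpu=%s'?", name);
  return 0;
}
```

Calling validate_march ("neoverse-n1", hint, sizeof hint) returns 0 and fills hint with the suggestion, which a caller would print as a note after the usual error.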
Richard Biener [Mon, 5 Sep 2022 12:22:51 +0000 (14:22 +0200)]
Remove MAX_SWITCH_CASES limit
The following removes the MAX_SWITCH_CASES limit, which was there to fight
quadratic behaviour when looking up case labels from edges. Instead use the
{start,end}_recording_case_labels facility for that. For it to be
usable I've exported get_cases_for_edge from tree-cfg.cc.
* tree-cfg.h (get_cases_for_edge): Declare.
* tree-cfg.cc (get_cases_for_edge): Export.
* tree-ssa-uninit.cc (execute_late_warn_uninitialized):
Start and end recording case labels.
* gimple-predicate-analysis.cc (MAX_SWITCH_CASES): Remove.
(predicate::init_from_control_deps): Use get_cases_for_edge.
Richard Biener [Mon, 5 Sep 2022 12:21:01 +0000 (14:21 +0200)]
Unify MAX_POSTDOM_CHECK and --param uninit-control-dep-attempts
The following unifies both limits, in particular the MAX_POSTDOM_CHECK
tends to be too low and is not user-controllable.
* gimple-predicate-analysis.cc (MAX_POSTDOM_CHECK): Remove.
(compute_control_dep_chain): Move uninit-control-dep-attempts
checking where it also counts the post-dominator check
invocations.
At one time the aarch64 port registered the Advanced SIMD builtins
lazily, when we first encountered a set of target flags that includes
+simd. These days we always initialise them at start-up, temporarily
forcing a conducive set of flags if necessary.
This patch removes some vestiges of the old way of doing things.
Xi Ruoyao [Thu, 1 Sep 2022 10:38:14 +0000 (18:38 +0800)]
LoongArch: add -mdirect-extern-access option
As a new target, LoongArch does not use copy relocation as it's
problematic in some circumstances. One bad consequence is we are
emitting GOT for all accesses to all extern objects with default
visibility. The use of GOT is not needed in statically linked
executables, OS kernels etc. The GOT entry just wastes space, and the
GOT access just slows down execution in those environments.
Before -mexplicit-relocs, we used "-Wa,-mla-global-with-pcrel" to tell
the assembler not to use GOT for extern access. But with
-mexplicit-relocs, we have to implement that logic in GCC itself.
The name "-mdirect-extern-access" is borrowed from the x86 port.
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in: Add
-mdirect-extern-access option.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc
(loongarch_symbol_binds_local_p): Return true if
TARGET_DIRECT_EXTERN_ACCESS.
(loongarch_option_override_internal): Complain if
-mdirect-extern-access is used with -fPIC or -fpic.
* doc/invoke.texi: Document -mdirect-extern-access for
LoongArch.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/direct-extern-1.c: New test.
* gcc.target/loongarch/direct-extern-2.c: New test.
Piotr Trojanek [Tue, 2 Aug 2022 17:46:36 +0000 (19:46 +0200)]
[Ada] Move check for null array aggregates to expansion
Despite recent changes to runtime checks for null array aggregates,
GNATprove still struggles with N_Raise_Constraint_Error nodes inserted
into AST by aggregate resolution. The ultimate fix is to move these
checks to expansion (which is disabled in GNATprove mode) and explicitly
emit a proof check in the GNATprove backend.
gcc/ada/
* exp_aggr.adb (Check_Bounds): Move code and comment related to
check for null array aggregate from Resolve_Null_Array_Aggregate.
* sem_aggr.ads (Is_Null_Aggregate): Move spec from unit body.
* sem_aggr.adb (Resolve_Null_Array_Aggregate): Move check to
expansion.
Piotr Trojanek [Sun, 31 Jul 2022 20:11:30 +0000 (22:11 +0200)]
[Ada] Fix inconsistent building of itypes for null array aggregates
To analyze Ada 2022 null array aggregates we introduced a dedicated
routine and bypassed the code for ordinary array aggregates. However,
the types for the array indexes created by this dedicated routine
differed from the types created for ordinary array aggregates, i.e.
itypes for null array aggregates were associated with the array subtype
declaration, while itypes for ordinary array aggregates were associated
with the aggregate itself. These differences cause trouble for various
routines in GNATprove.
This patch reduces the special handling of null array aggregates and
reuses the building of itypes for ordinary array aggregates.
gcc/ada/
* sem_aggr.adb
(Array_Aggr_Subtype): Bypass call to Collect_Aggr_Bound with
dedicated code for null array aggregates.
(Resolve_Array_Aggregate): Remove special handling of null array
aggregates.
(Resolve_Array_Aggregate): Create bounds, but let
Array_Aggr_Subtype create itype entities.
Piotr Trojanek [Sun, 31 Jul 2022 20:27:13 +0000 (22:27 +0200)]
[Ada] Fix insertion of a runtime check for null array aggregate
A null array aggregate of Ada 2022 requires a conditional runtime check
that was inserted as an if-statement. While gigi can handle statements
inserted into a list of declarations, in GNATprove such a statement will
cause a crash. It is better to insert a conditional raise node, which is
properly handled by both gigi and GNATprove.
gcc/ada/
* sem_aggr.adb (Resolve_Null_Array_Aggregate): Insert check as a
Raise_Constraint_Error node and not an If_Statement.
Ada.Containers.Vectors has two Append procedures that take an
Element value; one takes a Count parameter and one does not
(the count is implicitly one for the latter). For the former version,
there was code that took a faster path if certain conditions were met
and otherwise took a slower path; one of the prerequisite conditions
for this was Count = 1. For the latter version, no such special-case
detection was performed; the more general code was always executed.
Move the special-case detection/handling code from the former version into
the latter and change the former version to simply call the latter version
if Count = 1. Also apply same change to Ada.Containers.Indefinite_Vectors.
gcc/ada/
* libgnat/a-coinve.adb, libgnat/a-convec.adb
(Append): If the Append that takes an Element and a Count is
called with Count = 1, then call the Append that does not take a
Count parameter; otherwise call the code that handles the general
case. Move the special case detection/handling code that was
formerly in that version of Append into the version that does not
take a Count parameter, so that now both versions get the
performance benefit.
Piotr Trojanek [Wed, 27 Jul 2022 07:37:05 +0000 (09:37 +0200)]
[Ada] Create internal type for null array aggregate as an itype
Internal type created for the null array aggregate of Ada 2022 was
created as a temporary entity and then flagged as internal, but it is
better to create this type directly as an itype.
In particular, when the null array aggregate appears in a spec
expression, its type declaration will not be attached to the AST.
An itype will have Associated_Node_For_Itype, so that the context of
the type can be recovered, which is what GNATprove does.
gcc/ada/
* sem_aggr.adb (Resolve_Null_Array_Aggregate): Create internal
type for the aggregate as an itype.
Piotr Trojanek [Tue, 26 Jul 2022 20:47:58 +0000 (22:47 +0200)]
[Ada] Remove no longer referenced GNATprove utility routine for itypes
Code cleanup related to itypes for Ada 2022 null array aggregates.
Remove routine that was added in 2011 but is not referenced by
GNATprove since 2015.
Steve Baird [Tue, 26 Jul 2022 00:19:29 +0000 (17:19 -0700)]
[Ada] Bad Default_Initial_Condition check for a not-default-initialized object
No Default_Initial_Condition check should be generated for an object
declaration that has an explicit initial value. Previously this was
implemented by testing the Has_Init_Expression flag, but this only works
if the object declaration was created by the parser (since only the
parser sets that attribute, at least currently).
gcc/ada/
* exp_ch3.adb
(Expand_N_Object_Declaration): In deciding whether to emit a DIC
check, we were previously testing the Has_Init_Expression flag.
Continue to test that flag as before, but add a test for the
syntactic presence of an initial value in the object declaration.
This new test would not supersede the old test in the case where
an explicit initial value has been eliminated as part of some tree
transformation.
Eric Botcazou [Tue, 26 Jul 2022 15:33:06 +0000 (17:33 +0200)]
[Ada] Small cleanup in body of System.Value_R
This is mostly stylistic but also adds a couple of missing comments.
gcc/ada/
* libgnat/s-valuer.adb (Scan_Decimal_Digits): Consistently avoid
initializing local variables.
(Scan_Integral_Digits): Likewise.
(Scan_Raw_Real): Likewise and add a couple of comments.
Eric Botcazou [Fri, 22 Jul 2022 14:10:25 +0000 (16:10 +0200)]
[Ada] Fix bogus discriminant check failure for type with predicate
This reorders the processing in Freeze_Entity_Checks so that building the
predicate functions, which first requires building discriminated checking
functions for record types with a variant part, is done after processing
and checking this variant part.
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Build predicate functions
only after checking the variant part of a record type, if any.
Piotr Trojanek [Thu, 21 Jul 2022 19:42:17 +0000 (21:42 +0200)]
[Ada] Detect expansion of iterated component associations into loops
Iterated component associations are expanded into loops, which GNAT
should detect as violating restriction No_Implicit_Loops; same for
iterated element associations and delta array aggregates.
Part of cleanups for correct handling of iterated component associations
in SPARK.
gcc/ada/
* exp_aggr.adb
(Two_Pass_Aggregate_Expansion): Expand into implicit rather than
ordinary loops, to detect violations of restriction
No_Implicit_Loops.
(Generate_Loop): Likewise for delta array aggregates.
Piotr Trojanek [Thu, 21 Jul 2022 15:47:11 +0000 (17:47 +0200)]
[Ada] Fix double identifiers in iterated component association
The iterated_component_association grammar construct appears in Ada RM
in two syntactic forms: with iterator_specification and with
defining_identifier. This is now properly reflected in the GNAT AST,
while previously we had two defining_identifiers regardless of the
syntactic form.
Cleanup related to handling of iterated_component_association in SPARK.
Behavior of the compiler itself should not be affected.
gcc/ada/
* exp_aggr.adb (Two_Pass_Aggregate_Expansion): Expand iterated
component association with an unanalyzed copy of iterated
expression. The previous code worked only because the expanded
loop used both an analyzed copy of the iterator_specification and
an analyzed copy of the iterated expression. Now the iterated
expression is reanalyzed in the context of the expanded loop.
* par-ch4.adb (Build_Iterated_Component_Association): Don't set
defining identifier when iterator specification is present.
* sem_aggr.adb (Resolve_Iterated_Association): Pick index name
from the iterator specification.
* sem_elab.adb (Traverse_Potential_Scenario): Handle iterated
element association just like iterated component association. Not
strictly part of this fix, but worthwhile for completeness.
* sem_res.adb (Resolve): Pick index name from the iterator
specification, when present.
* sem_util.adb (Traverse_More): For completeness, just like the
change in Traverse_Potential_Scenario.
* sinfo.ads
(ITERATED_COMPONENT_ASSOCIATION): Fix and complete description.
(ITERATED_ELEMENT_ASSOCIATION): Likewise.
Bob Duff [Wed, 20 Jul 2022 21:37:51 +0000 (17:37 -0400)]
[Ada] Suppress warnings in trivial subprograms with finalization
There are heuristics for suppressing warnings about unused objects in
trivial cases. In particular, we try to suppress warnings here:
   function F (A : Integer) return Some_Type is
      X : Some_Type;
   begin
      raise Not_Yet_Implemented;
      return X;
   end F;
But it doesn't work if Some_Type is controlled. This patch fixes that
bug.
gcc/ada/
* sem_ch6.adb
(Analyze_Subprogram_Body_Helper): Use First_Real_Statement to deal
with this case. Note that First_Real_Statement is likely to be
removed as part of this ticket, so this is a temporary fix.
Piotr Trojanek [Tue, 19 Jul 2022 11:57:05 +0000 (13:57 +0200)]
[Ada] Fix resolution of iterated component association
For iterator specification appearing inside an iterated component
association, we just did ad-hoc, incomplete checks and delayed a proper
analysis until the iterated component association is expanded into loop (and
then reanalyzed).
However, when the iterated component association is not expanded, e.g.
because we are in semantic checking mode, GNATprove mode or inside a
generic, then the AST lacked any processing or error reporting.
This is fixed by reusing the existing analysis of iterator specifications,
as they also appear in other constructs, e.g. in quantified expressions.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): Split
processing of cases with and without iterator specification; reuse
analysis of iterator specification; improve diagnostics for
premature usage of iterator index in discrete choices.
Piotr Trojanek [Mon, 18 Jul 2022 16:56:52 +0000 (18:56 +0200)]
[Ada] Cleanup resolution of iterated component association
Tune names of local entities.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): Change
generic name Ent to a more intuitive Scop; rename Remove_Ref to
Remove_Reference, so it can be instantiated as a traversal routine
with plural name.
Piotr Trojanek [Mon, 18 Jul 2022 14:14:55 +0000 (16:14 +0200)]
[Ada] Cleanup analysis of quantified expressions with empty ranges
Cleanup handling of quantified expressions before using it as an inspiration
for fixing the handling of iterated component associations. Behavior is
unaffected.
gcc/ada/
* sem_ch4.adb
(Is_Empty_Range): Move error reporting to the caller.
(Analyze_Qualified_Expression): Move error reporting from Is_Empty_Range;
add matching call to End_Scope before rewriting and returning.
Eric Botcazou [Sun, 17 Jul 2022 10:38:15 +0000 (12:38 +0200)]
[Ada] Fix crash for Default_Initial_Condition on derived enumeration type
This fixes a crash on the declaration of a private derived enumeration type
with the Default_Initial_Condition aspect and in the process makes a couple
of related adjustments: 1) removes the early freezing of implicit character
and numeric base types and 2) fixes an oversight in the implementation of
delayed representation aspects.
gcc/ada/
* aspects.ads (Delaying Evaluation of Aspect): Fix typos.
* exp_ch3.adb (Freeze_Type): Do not generate Invariant and DIC
procedures for internal types.
* exp_util.adb (Build_DIC_Procedure_Body): Adjust comment.
* freeze.adb (Freeze_Entity): Call Inherit_Delayed_Rep_Aspects for
subtypes and derived types only after the base or parent type has
been frozen. Remove useless freezing for first subtype.
(Freeze_Fixed_Point_Type): Call Inherit_Delayed_Rep_Aspects too.
* layout.adb (Set_Elem_Alignment): Deal with private types.
* sem_ch3.adb (Build_Derived_Enumeration_Type): Build the implicit
base as an itype and do not insert its declaration in the tree.
(Build_Derived_Numeric_Type): Do not freeze the implicit base.
(Derived_Standard_Character): Likewise.
(Constrain_Enumeration): Inherit the chain of representation items
instead of replacing it.
* sem_ch13.ads (Inherit_Aspects_At_Freeze_Point): Add ??? comment.
(Inherit_Delayed_Rep_Aspects): Declare.
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Do not invoke
Inherit_Delayed_Rep_Aspects.
(Inherit_Aspects_At_Freeze_Point): Deal with private types.
(Inherit_Delayed_Rep_Aspects): Move to library level.
Piotr Trojanek [Thu, 25 Feb 2021 20:52:22 +0000 (21:52 +0100)]
[Ada] Cleanup expansion of attribute Priority
Semantically neutral cleanup after the main fix for expansion of
attribute Priority.
gcc/ada/
* einfo-utils.adb (Number_Entries): Refine type of a local variable.
* exp_attr.adb (Expand_N_Attribute_Reference): Rename Conctyp to
Prottyp; refactor repeated calls to New_Occurrence_Of; replace
Number_Entries with Has_Entries.
* exp_ch5.adb (Expand_N_Assignment_Statement): Likewise; remove Subprg
variable (apparently copy-pasted from expansion of the attribute).
Piotr Trojanek [Fri, 30 Oct 2020 23:01:43 +0000 (00:01 +0100)]
[Ada] Fix expansion of attribute Priority
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference): Fix detection of the
enclosing protected type and of the enclosing protected subprogram.
* exp_ch5.adb (Expand_N_Assignment_Statement): Likewise.
Piotr Trojanek [Tue, 12 Jul 2022 11:18:19 +0000 (13:18 +0200)]
[Ada] Improve pretty-printing of iterated component associations
Pretty-printing used mostly in the debugger now handles more Ada 2022
syntax features. In particular, now it correctly handles expressions like
"[for E of A when E /= X => E]".
gcc/ada/
* sprint.adb (Sprint_Node_Actual): Handle iterator_specification within
iterated_component_association and iterator_filter within
iterator_specification.
Be even more conservative in intersection of NANs.
Intersecting two ranges where one is a NAN is keeping the sign bit of
the NAN range. This is not correct as the sign bits may not match.
I think the only time we're absolutely sure about the intersection of
a NAN and something else, is when both are a NAN with exactly the same
properties (sign bit). If we're intersecting two NANs of differing
sign, we can decide later whether that's undefined or just a NAN with
no known sign. For now I've done the latter.
I'm still mentally working on intersections involving NANs, especially
if we want to keep track of signbits. For now, let's be extra careful
and only do things we're absolutely sure about.
Later we may want to fold the intersect of [NAN,NAN] and say [3,5]
with the possibility of NAN, to a NAN, but I'm not 100% sure. As I've
said before, setting varying is always a safe choice, because it means
we know nothing and ranger won't attempt to optimize anything.
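
The rule described above can be sketched as a toy C model. This is not the frange code: the types toy_range, toy_kind and toy_sign and the function toy_intersect are hypothetical, and every case other than NAN with NAN is deliberately collapsed to "varying", which the message above calls the always-safe choice.

```c
enum toy_kind { TOY_VARYING, TOY_NAN, TOY_OTHER };
enum toy_sign { TOY_SIGN_UNKNOWN, TOY_SIGN_POSITIVE, TOY_SIGN_NEGATIVE };

struct toy_range
{
  enum toy_kind kind;
  enum toy_sign nan_sign;  /* Only meaningful when kind == TOY_NAN.  */
};

/* Conservative intersection: the result is a NAN only when both
   operands are NANs, and it keeps a sign only when both signs agree.
   Anything else yields TOY_VARYING, i.e. "we know nothing".  */
struct toy_range
toy_intersect (struct toy_range a, struct toy_range b)
{
  struct toy_range result = { TOY_VARYING, TOY_SIGN_UNKNOWN };

  if (a.kind == TOY_NAN && b.kind == TOY_NAN)
    {
      result.kind = TOY_NAN;
      if (a.nan_sign == b.nan_sign)
        result.nan_sign = a.nan_sign;
    }
  return result;
}
```

Two NANs of differing sign therefore intersect to a NAN with no known sign, matching the choice made for now in the message above.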
[libsanitizer, Darwin] Fix bootstrap after recent merge.
The latest merge to libsanitizer includes changes to handle macOS 13+.
However, these changes are incompatible with GCC and so we need to find
an alternate solution. To restore bootstrap, back this change out until
an alternative can be found.
There are 6 idioms of the same check and I'd like to add more.
It seems there are macros as well as functions for things like
REAL_VALUE_ISINF and REAL_VALUE_NEGATIVE. I don't know if there was
historical need for this duplication, but I think it's cleaner if we
start gravitating towards inline functions only.
[PR/middle-end 106819] NANs can never be a singleton
Possible NANs can never be a singleton, so they will never be
propagated. This was the intent, and then the signed zero code crept
in, and was mistakenly checked before the NAN.
PR/middle-end 106819
gcc/ChangeLog:
* value-range.cc (frange::singleton_p): Move NAN check to the top.
nvptx: Silence unused variable warning in output_constant_pool_contents()
Similar to the rs6000 code, nvptx defines ASM_OUTPUT_DEF_FROM_DECLS as well as
ASM_OUTPUT_DEF. Make sure that the define's parameters are used by referencing
them as (void) to silence a warning in output_constant_pool_contents().
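
The idiom can be illustrated with a small stand-alone C sketch. The macro TOY_OUTPUT_DEF_FROM_DECLS, the counter defs_emitted and the function toy_output_alias are hypothetical stand-ins, not the nvptx definitions; the point is only that casting each macro parameter to void counts as a use, so a variable passed solely as a macro argument no longer triggers an unused-variable warning.

```c
/* Counts how many times the toy macro has been expanded and run.  */
static int defs_emitted;

/* Hypothetical macro that needs none of its arguments.  Referencing
   each one as (void) marks it as used after preprocessing.  */
#define TOY_OUTPUT_DEF_FROM_DECLS(stream, decl, target) \
  do                                                    \
    {                                                   \
      (void) (stream);                                  \
      (void) (decl);                                    \
      (void) (target);                                  \
      defs_emitted++;                                   \
    }                                                   \
  while (0)

/* STREAM_ID is only ever used as a macro argument.  Without the
   (void) casts above it would be reported as unused.  */
int
toy_output_alias (int decl_id, int target_id)
{
  int stream_id = 1;
  TOY_OUTPUT_DEF_FROM_DECLS (stream_id, decl_id, target_id);
  return defs_emitted;
}
```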
Jakub Jelinek [Sat, 3 Sep 2022 07:41:54 +0000 (09:41 +0200)]
openmp: Partial OpenMP 5.2 doacross and omp_cur_iteration support
The following patch implements part of the OpenMP 5.2 changes related
to ordered loops, assuming the expected resolution of the issues raised in
https://github.com/OpenMP/spec/issues/3302.
The changes are:
1) the depend clause on stand-alone ordered constructs has been renamed
to doacross (because depend clause has different syntax on other
constructs) with some syntax changes below, depend clause is deprecated
(we'll deprecate stuff on the GCC side only when we have everything else
from 5.2 implemented)
depend(source) -> doacross(source:) or doacross(source:omp_cur_iteration)
depend(sink:vec) -> doacross(sink:vec) (where vec has the same syntax
as before)
2) in 5.1 and before it has been significant whether ordered clause has or
doesn't have an argument, if it didn't, only block-associated ordered
could appear in the body, if it did, only stand-alone ordered could appear
in the body, all loops had to be perfectly nested, no associated
range-based for loops, no linear clause on work-sharing loop and ordered
clause with an argument wasn't allowed on composite for simd.
In 5.2, whether ordered clause has or doesn't have an argument is
insignificant (except for bugs in the standard, #3302 mentions those),
if the argument is missing, it is simply treated as equal to collapse
argument (if any, otherwise 1). The implementation had better be able
to differentiate between ordered and doacross loops at compile time
which previously was through the absence or presence of the argument,
now it is done through looking at the body of the construct lexically
and looking for stand-alone ordered constructs. If there are any,
it is to be handled as doacross loop, otherwise it is ordered loop
(but in that case ordered argument if present must be equal to collapse
argument - 5.2 says instead it must be one, but that is clearly wrong
and mentioned in #3302) - stand-alone ordered constructs must appear
lexically in the body (and had to before as well). For the restrictions
mentioned above, the for simd restriction is gone (stand-alone ordered
can't appear in simd construct, so that is enough), and the other rules
are expected to be changed into something related to presence of
stand-alone ordered constructs in the body
3) 5.2 allows a new syntax, doacross(sink:omp_cur_iteration-1), which
means wait for previous iteration in the iteration space of all the
associated loops
The following patch implements that, except that for now we emit a sorry
on the doacross(sink:omp_cur_iteration-1) syntax during omp expansion
because the library side isn't done yet for it. It isn't implemented for
the Fortran FE either.
Incrementally, I'd like to change the way we differentiate between
stand-alone and block-associated ordered constructs, because the current
way of looking for presence of doacross clause doesn't work well if those
clauses are removed because they had been invalid (wrong syntax or
unknown variables in it etc.) and of course implement
doacross(sink:omp_cur_iteration-1).
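
For readers unfamiliar with the renamed clause, here is a minimal sketch of a doacross loop written with the 5.2 spelling from item 1) above. The function name prefix_sum is made up for the example. It assumes a compiler that accepts the doacross clause when built with -fopenmp; without -fopenmp the pragmas are ignored and the loop runs serially with the same result.

```c
/* In-place prefix sum.  Iteration i reads a[i - 1], which iteration
   i - 1 writes, so each iteration waits on its predecessor.  */
void
prefix_sum (int *a, int n)
{
  #pragma omp parallel for ordered(1)
  for (int i = 1; i < n; i++)
    {
      /* Formerly spelled depend(sink: i - 1).  */
      #pragma omp ordered doacross(sink: i - 1)
      a[i] += a[i - 1];
      /* Formerly spelled depend(source).  */
      #pragma omp ordered doacross(source:)
    }
}
```

Because the body contains stand-alone ordered constructs, the loop is treated as a doacross loop under the scheme described in item 2).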
2022-09-03 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DOACROSS.
(enum omp_clause_depend_kind): Remove OMP_CLAUSE_DEPEND_SOURCE
and OMP_CLAUSE_DEPEND_SINK, add OMP_CLAUSE_DEPEND_INVALID.
(enum omp_clause_doacross_kind): New type.
(struct tree_omp_clause): Add subcode.doacross_kind member.
* tree.h (OMP_CLAUSE_DEPEND_SINK_NEGATIVE): Remove.
(OMP_CLAUSE_DOACROSS_KIND): Define.
(OMP_CLAUSE_DOACROSS_SINK_NEGATIVE): Define.
(OMP_CLAUSE_DOACROSS_DEPEND): Define.
(OMP_CLAUSE_ORDERED_DOACROSS): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE_DOACROSS entries.
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
* tree-pretty-print.cc (dump_omp_clause): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. Handle
OMP_CLAUSE_DOACROSS.
* gimplify.cc (gimplify_omp_depend): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
(gimplify_scan_omp_clauses): Likewise. Handle OMP_CLAUSE_DOACROSS.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(find_standalone_omp_ordered): New function.
(gimplify_omp_for): When OMP_CLAUSE_ORDERED is present, search
body for OMP_ORDERED with OMP_CLAUSE_DOACROSS and if found,
set OMP_CLAUSE_ORDERED_DOACROSS.
(gimplify_omp_ordered): Don't handle OMP_CLAUSE_DEPEND_SINK or
OMP_CLAUSE_DEPEND_SOURCE, instead check OMP_CLAUSE_DOACROSS, adjust
diagnostics that presence or absence of ordered clause parameter
is irrelevant. Handle doacross(sink:omp_cur_iteration-1). Use
actual user name of the clause - doacross or depend - in diagnostics.
* omp-general.cc (omp_extract_for_data): Don't set fd->ordered
if !OMP_CLAUSE_ORDERED_DOACROSS (t). If
OMP_CLAUSE_ORDERED_DOACROSS (t) but !OMP_CLAUSE_ORDERED_EXPR (t),
set fd->ordered to -1 and set it after the loop in that case to
fd->collapse.
* omp-low.cc (check_omp_nesting_restrictions): Don't handle
OMP_CLAUSE_DEPEND_SOURCE nor OMP_CLAUSE_DEPEND_SINK, instead check
OMP_CLAUSE_DOACROSS. Use actual user name of the clause - doacross
or depend - in diagnostics. Diagnose mixing of stand-alone and
block associated ordered constructs binding to the same loop.
(lower_omp_ordered_clauses): Don't handle OMP_CLAUSE_DEPEND_SINK,
instead handle OMP_CLAUSE_DOACROSS.
(lower_omp_ordered): Look for OMP_CLAUSE_DOACROSS instead of
OMP_CLAUSE_DEPEND.
(lower_depend_clauses): Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK.
* omp-expand.cc (expand_omp_ordered_sink): Emit a sorry for
doacross(sink:omp_cur_iteration-1).
(expand_omp_ordered_source_sink): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE. Use actual user name of the clause
- doacross or depend - in diagnostics.
(expand_omp): Look for OMP_CLAUSE_DOACROSS clause instead of
OMP_CLAUSE_DEPEND.
(build_omp_regions_1): Likewise.
(omp_make_gimple_edges): Likewise.
* lto-streamer-out.cc (hash_tree): Handle OMP_CLAUSE_DOACROSS.
* tree-streamer-in.cc (unpack_ts_omp_clause_value_fields): Likewise.
* tree-streamer-out.cc (pack_ts_omp_clause_value_fields): Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DOACROSS.
* c-omp.cc (c_finish_omp_depobj): Check also for OMP_CLAUSE_DOACROSS
clause and diagnose it. Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK. Assert kind is not OMP_CLAUSE_DEPEND_INVALID.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Handle doacross.
(c_parser_omp_clause_depend_sink): Renamed to ...
(c_parser_omp_clause_doacross_sink): ... this. Add depend_p argument.
Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(c_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(c_parser_omp_clause_doacross): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(c_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(c_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Handle doacross.
(cp_parser_omp_clause_depend_sink): Renamed to ...
(cp_parser_omp_clause_doacross_sink): ... this. Add depend_p
argument. Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(cp_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(cp_parser_omp_clause_doacross): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(cp_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(cp_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* pt.cc (tsubst_omp_clause_decl): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE.
(tsubst_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(tsubst_expr): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
* semantics.cc (cp_finish_omp_clause_depend_sink): Rename to ...
(cp_finish_omp_clause_doacross_sink): ... this.
(finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS. Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS
clause instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND
on it.
gcc/testsuite/
* c-c++-common/gomp/doacross-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/doacross-5.c: New test.
* c-c++-common/gomp/doacross-6.c: New test.
* c-c++-common/gomp/nesting-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/ordered-3.c: Likewise.
* c-c++-common/gomp/sink-3.c: Likewise.
* gfortran.dg/gomp/nesting-2.f90: Likewise.
David Malcolm [Fri, 2 Sep 2022 22:29:33 +0000 (18:29 -0400)]
c/c++: new warning: -Wxor-used-as-pow [PR90885]
PR c/90885 notes various places in real-world code where people have
written C/C++ code that uses ^ (exclusive or) where presumably they
meant exponentiation.
For example
https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search
currently finds 11 places using "2^32", and all of them appear to be
places where the user means 2 to the power of 32, rather than 2
exclusive-ORed with 32 (which is 34).
This patch adds a new -Wxor-used-as-pow warning to the C and C++
frontends to complain about ^ when the left-hand side is the decimal
constant 2 or the decimal constant 10.
This is the same name as the corresponding clang warning:
https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow
As per the clang warning, the warning suggests converting the left-hand
side to a hexadecimal constant if you really mean xor, which suppresses
the warning (though this patch implements a fix-it hint for that, whereas
the clang implementation only has a fix-it hint for the initial
suggestion of exponentiation).
I initially tried implementing this without checking for decimals, but
this version had lots of false positives. Checking for decimals
requires extending the lexer to capture whether or not a CPP_NUMBER
token was decimal. I added a new DECIMAL_INT flag to cpplib.h for this.
Unfortunately, c_token and cp_tokens both have only an unsigned char for
their flags (as captured by c_lex_with_flags), whereas this would add
the 12th flag to cpp_tokens. Of the first 8 flags, all but BOL are used
in the C or C++ frontends, but BOL is not, so I moved that to a higher
position, using its old value for the new DECIMAL_INT flag, so that it
is representable within an unsigned char.
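
The width problem can be shown with a short C sketch. The bit positions here are assumptions chosen for illustration (bit 11 for a would-be 12th flag, bit 6 for a reused low slot) and the TOY_ names are hypothetical; they are not the cpplib.h definitions.

```c
/* Hypothetical flag values, for illustration only.  */
#define TOY_FLAG_AS_12TH   (1u << 11)  /* 12th flag: beyond 8 bits.  */
#define TOY_FLAG_LOW_SLOT  (1u << 6)   /* A slot within the low 8 bits.  */

/* Model a front end that stores token flags in an unsigned char:
   every bit above bit 7 is silently dropped.  */
unsigned char
toy_narrow_flags (unsigned int flags)
{
  return (unsigned char) flags;
}
```

Narrowing TOY_FLAG_AS_12TH yields 0, so the information is lost, while TOY_FLAG_LOW_SLOT survives as 64. That is why the new flag takes over an existing low position.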
Example output:
demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow]
    5 | int t2_8 = 2^8;
      |             ^
      |            --
      |            1<<
demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2)
    5 | int t2_8 = 2^8;
      |            ^
      |            0x2
demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow]
   21 | int t10_6 = 10^6;
      |               ^
      |             ---
      |             1e
demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10)
   21 | int t10_6 = 10^6;
      |             ^~
      |             0xa
gcc/c-family/ChangeLog:
PR c/90885
* c-common.h (check_for_xor_used_as_pow): New decl.
* c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate.
* c-warn.cc (check_for_xor_used_as_pow): New.
* c.opt (Wxor-used-as-pow): New.
gcc/c/ChangeLog:
PR c/90885
* c-parser.cc (c_parser_string_literal): Clear ret.m_decimal.
(c_parser_expr_no_commas): Likewise.
(c_parser_conditional_expression): Likewise.
(c_parser_binary_expression): Clear m_decimal when popping the
stack.
(c_parser_unary_expression): Clear ret.m_decimal.
(c_parser_has_attribute_expression): Likewise for result.
(c_parser_predefined_identifier): Likewise for expr.
(c_parser_postfix_expression): Likewise for expr.
Set expr.m_decimal when handling a CPP_NUMBER that was a decimal
token.
* c-tree.h (c_expr::m_decimal): New bitfield.
* c-typeck.cc (parser_build_binary_op): Clear result.m_decimal.
(parser_build_binary_op): Call check_for_xor_used_as_pow.
gcc/cp/ChangeLog:
PR c/90885
* cp-tree.h (class cp_expr): Add bitfield m_decimal. Clear it in
existing ctors. Add ctor that allows specifying its value.
(cp_expr::decimal_p): New accessor.
* parser.cc (cp_parser_expression_stack_entry::flags): New field.
(cp_parser_primary_expression): Set m_decimal of cp_expr when
handling numbers.
(cp_parser_binary_expression): Extract flags from token when
populating stack. Call check_for_xor_used_as_pow.
gcc/testsuite/ChangeLog:
PR c/90885
* c-c++-common/Wxor-used-as-pow-1.c: New test.
* c-c++-common/Wxor-used-as-pow-fixits.c: New test.
* g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* g++.dg/warn/Wparentheses-10.C: Likewise.
* g++.dg/warn/Wparentheses-18.C: Likewise.
* g++.dg/warn/Wparentheses-19.C: Likewise.
* g++.dg/warn/Wparentheses-9.C: Likewise.
* g++.dg/warn/Wxor-used-as-pow-named-op.C: New test.
* gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* gcc.dg/Wparentheses-7.c: Likewise.
* gcc.dg/precedence-1.c: Likewise.
libcpp/ChangeLog:
PR c/90885
* include/cpplib.h (BOL): Move macro to 1 << 12 since it is
not used by C/C++'s unsigned char token flags.
(DECIMAL_INT): New, using 1 << 6, so that it is visible as
part of C/C++'s 8 bits of token flags.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Iain Buclaw [Tue, 16 Aug 2022 14:18:02 +0000 (16:18 +0200)]
d: Fix #error You must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported
This moves all D front-end specific target definitions out of the main
target headers, and into its own header that is included by tm_d.h
instead of pulling in the same headers as tm_p.h.
This fixes the build on target configurations that pull in the default D
language target hooks, and subsequently trigger an error because the
definition of PREFERRED_DEBUGGING_TYPE is behind tm.h, the one header
that is avoided from being included in default-d.cc.
PR d/105659
gcc/ChangeLog:
* config.gcc: Set tm_d_file to ${cpu_type}/${cpu_type}-d.h.
* config/aarch64/aarch64-d.cc: Include tm_d.h.
* config/aarch64/aarch64-protos.h (aarch64_d_target_versions): Move to
config/aarch64/aarch64-d.h.
(aarch64_d_register_target_info): Likewise.
* config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/arm/arm-d.cc: Include tm_d.h and arm-protos.h instead of
tm_p.h.
* config/arm/arm-protos.h (arm_d_target_versions): Move to
config/arm/arm-d.h.
(arm_d_register_target_info): Likewise.
* config/arm/arm.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/default-d.cc: Remove memmodel.h include.
* config/freebsd-d.cc: Include tm_d.h instead of tm_p.h.
* config/glibc-d.cc: Likewise.
* config/i386/i386-d.cc: Include tm_d.h.
* config/i386/i386-protos.h (ix86_d_target_versions): Move to
config/i386/i386-d.h.
(ix86_d_register_target_info): Likewise.
(ix86_d_has_stdcall_convention): Likewise.
* config/i386/i386.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
(TARGET_D_HAS_STDCALL_CONVENTION): Likewise.
* config/i386/winnt-d.cc: Include tm_d.h instead of tm_p.h.
* config/mips/mips-d.cc: Include tm_d.h.
* config/mips/mips-protos.h (mips_d_target_versions): Move to
config/mips/mips-d.h.
(mips_d_register_target_info): Likewise.
* config/mips/mips.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/netbsd-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
* config/openbsd-d.cc: Likewise.
* config/pa/pa-d.cc: Include tm_d.h.
* config/pa/pa-protos.h (pa_d_target_versions): Move to
config/pa/pa-d.h.
(pa_d_register_target_info): Likewise.
* config/pa/pa.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/riscv/riscv-d.cc: Include tm_d.h.
* config/riscv/riscv-protos.h (riscv_d_target_versions): Move to
config/riscv/riscv-d.h.
(riscv_d_register_target_info): Likewise.
* config/riscv/riscv.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/rs6000/rs6000-d.cc: Include tm_d.h.
* config/rs6000/rs6000-protos.h (rs6000_d_target_versions): Move to
config/rs6000/rs6000-d.h.
(rs6000_d_register_target_info): Likewise.
* config/rs6000/rs6000.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/s390/s390-d.cc: Include tm_d.h.
* config/s390/s390-protos.h (s390_d_target_versions): Move to
config/s390/s390-d.h.
(s390_d_register_target_info): Likewise.
* config/s390/s390.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/sol2-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
* config/sparc/sparc-d.cc: Include tm_d.h.
* config/sparc/sparc-protos.h (sparc_d_target_versions): Move to
config/sparc/sparc-d.h.
(sparc_d_register_target_info): Likewise.
* config/sparc/sparc.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* configure: Regenerate.
* configure.ac (tm_d_file): Remove defaults.h.
(tm_d_include_list): Remove options.h and insn-constants.h.
* config/aarch64/aarch64-d.h: New file.
* config/arm/arm-d.h: New file.
* config/i386/i386-d.h: New file.
* config/mips/mips-d.h: New file.
* config/pa/pa-d.h: New file.
* config/riscv/riscv-d.h: New file.
* config/rs6000/rs6000-d.h: New file.
* config/s390/s390-d.h: New file.
* config/sparc/sparc-d.h: New file.
Patrick Palka [Fri, 2 Sep 2022 19:16:37 +0000 (15:16 -0400)]
libstdc++: Consistently use ::type when deriving from __and/or/not_
Now that these internal type traits are (again) class templates, it's
better to derive from the trait's ::type instead of from the trait
itself, for the sake of a shallower inheritance chain.
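A minimal sketch of the difference, using the public std::conjunction as a stand-in for the internal __and_ (an assumption made so the example does not depend on libstdc++ internals). The trait names `is_small_int_deep` and `is_small_int_flat` are hypothetical.

```cpp
#include <type_traits>

// Deriving from the trait itself: the base is the conjunction specialization,
// which in turn derives from one of its operands, and so on down to
// std::integral_constant.
template<class T>
struct is_small_int_deep
  : std::conjunction<std::is_integral<T>, std::bool_constant<(sizeof(T) <= 4)>> {};

// Deriving from ::type: the base is std::true_type or std::false_type directly.
template<class T>
struct is_small_int_flat
  : std::conjunction<std::is_integral<T>, std::bool_constant<(sizeof(T) <= 4)>>::type {};

static_assert(is_small_int_deep<int>::value == is_small_int_flat<int>::value,
              "both spellings compute the same result");
static_assert(std::is_base_of<std::false_type, is_small_int_flat<double>>::value,
              "the flat form derives from false_type for a non-integral type");
```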
libstdc++-v3/ChangeLog:
* include/std/tuple (tuple::_UseOtherCtor): Use ::type when
deriving from __and_, __or_ or __not_.
* include/std/type_traits (negation): Likewise.
(is_unsigned): Likewise.
(__is_implicitly_default_constructible): Likewise.
(is_trivially_destructible): Likewise.
(__is_nt_invocable_impl): Likewise.
This defines the is_xxx_constructible_v and is_xxx_assignable_v variable
templates by using the built-ins directly. The actual logic for each one
is the same as the corresponding class template, but this way using the
variable template doesn't need to instantiate the class template.
This means that the variable templates won't use the static assertions
checking for complete types, cv void or unbounded arrays, but that's OK
because the built-ins check those anyway. We could probably remove the
static assertions from the class templates, and maybe from all type
traits that use a built-in.
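As a rough sketch of the approach, the following defines a class template and a variable template over the same compiler built-ins. The `sketch_` names are hypothetical; `__is_constructible` and `__is_assignable` are the built-ins named in the ChangeLog below, and the snippet assumes a compiler that provides them (recent GCC and Clang do).

```cpp
#include <type_traits>

// Class-template form: every use instantiates a specialization of this class.
template<class T, class... Args>
struct sketch_is_constructible
  : std::bool_constant<__is_constructible(T, Args...)> {};

// Variable-template form: evaluates the built-in directly, so no class
// template specialization needs to be instantiated.
template<class T, class... Args>
inline constexpr bool sketch_is_constructible_v = __is_constructible(T, Args...);

template<class T, class U>
inline constexpr bool sketch_is_assignable_v = __is_assignable(T, U);

static_assert(sketch_is_constructible_v<int, double>
                == sketch_is_constructible<int, double>::value,
              "both forms agree");
```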
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_constructible_v)
(is_default_constructible_v, is_copy_constructible_v)
(is_move_constructible_v): Define using __is_constructible.
(is_assignable_v, is_copy_assignable_v, is_move_assignable_v):
Define using __is_assignable.
(is_trivially_constructible_v)
(is_trivially_default_constructible_v)
(is_trivially_copy_constructible_v)
(is_trivially_move_constructible_v): Define using
__is_trivially_constructible.
(is_trivially_assignable_v, is_trivially_copy_assignable_v)
(is_trivially_move_assignable_v): Define using
__is_trivially_assignable.
(is_nothrow_constructible_v)
(is_nothrow_default_constructible_v)
(is_nothrow_copy_constructible_v)
(is_nothrow_move_constructible_v): Define using
__is_nothrow_constructible.
(is_nothrow_assignable_v, is_nothrow_copy_assignable_v)
(is_nothrow_move_assignable_v): Define using
__is_nothrow_assignable.
Patrick Palka [Fri, 2 Sep 2022 15:19:51 +0000 (11:19 -0400)]
libstdc++: Fix laziness of __and/or/not_
r13-2230-g390f94eee1ae69 redefined the internal logical operator traits
__and_, __or_ and __not_ as alias templates that directly resolve to
true_type or false_type. But it turns out using an alias template here
causes the traits to be less lazy than before because we now compute the
logical result immediately upon _specialization_ of the trait, and not
later upon _completion_ of the specialization.
So for example, in
using type = __and_<A, __not_<B>>;
we now compute the conjunction and thus instantiate A even though we're
in a context that doesn't require completion of the __and_. What's
worse is that we also compute the inner negation and thus instantiate B
(for the same reason), independent of the __and_ and the value of A!
Thus the traits are now less lazy and composable than before.
Fortunately, the fix is cheap and straightforward: redefine these traits
as class templates instead of as alias templates so that computation of
the logical result is triggered by completion, not by specialization.
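The specialization-versus-completion distinction can be illustrated with the public std::conjunction and std::negation, used here as stand-ins for the internal traits (an assumption; the internal __and_ need not behave identically in every respect). The `explode` trait is hypothetical and exists only to make any instantiation a hard error.

```cpp
#include <type_traits>

// Hypothetical trait that must never be instantiated: doing so is a hard error.
template<class T>
struct explode {
  static_assert(sizeof(T) == 0, "explode<T> was instantiated");
  static constexpr bool value = true;
};

// Class-template style: merely naming the specialization does not complete it,
// so neither explode<int> nor explode<long> is instantiated here.
using named_only = std::conjunction<explode<int>, std::negation<explode<long>>>;
static_assert(sizeof(named_only*) > 0, "a pointer to an incomplete type is fine");

// Completing a conjunction stops at the first false operand, so explode<int>
// is still never instantiated.
using short_circuit = std::conjunction<std::false_type, explode<int>>;
static_assert(!short_circuit::value, "result is false without touching explode<int>");

// Alias-template style, left commented out because it would fail to compile:
// the result is computed as soon as the alias is named.
//   template<class... B> using eager_and = std::bool_constant<(B::value && ...)>;
//   using eager = eager_and<std::false_type, explode<int>>;  // instantiates explode<int>
```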
libstdc++-v3/ChangeLog:
* include/std/type_traits (__or_, __and_, __not_): Redefine as a
class template instead of as an alias template.
* testsuite/20_util/logical_traits/requirements/short_circuit.cc:
Add more tests for conjunction and disjunction. Add corresponding
tests for __and_ and __or_.
vect_optimize_slp_pass always treats the starting layout as valid,
to avoid having to "optimise" when every possible choice is invalid.
But it gives the starting layout a high cost if it seems like the
target might reject it, in the hope that this will encourage other
(valid) layouts.
The testcase for PR106787 showed that this was flawed, since it was
triggering even in cases where the number of input lanes is different
from the number of output lanes. Picking such a high cost could also
make costs for loop-invariant nodes overwhelm the costs for inner-loop
nodes.
This patch makes the costing less aggressive by (a) restricting
it to N-to-N permutations and (b) assigning the maximum cost of
a permute.
gcc/
* tree-vect-slp.cc (vect_optimize_slp_pass::internal_node_cost):
Reduce the fallback cost to 1. Only use it if the number of
input lanes is equal to the number of output lanes.
gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-20.c: New test.
vect: Ensure SLP nodes don't end up in multiple BB partitions [PR106787]
In the PR we have two REDUC_PLUS SLP instances that share a common
load of stride 4. Each instance also has a unique contiguous load.
Initially all three loads are out of order, so have a nontrivial
load permutation. The layout pass puts them in order instead.
For the two contiguous loads it is possible to do this by adjusting the
SLP_LOAD_PERMUTATION to be { 0, 1, 2, 3 }. But a SLP_LOAD_PERMUTATION
of { 0, 4, 8, 12 } is rejected as unsupported, so the pass creates a
separate VEC_PERM_EXPR instead.
Later the 4-stride load's initial SLP_LOAD_PERMUTATION is rejected too,
so that the load gets replaced by an external node built from scalars.
We then have an external node feeding a VEC_PERM_EXPR.
VEC_PERM_EXPRs created in this way do not have any associated
SLP_TREE_SCALAR_STMTS. This means that they do not affect the
decision about which nodes should be in which subgraph for costing
purposes. If the VEC_PERM_EXPR is fed by a vect_external_def,
then the VEC_PERM_EXPR's input doesn't affect that decision either.
The net effect is that a shared VEC_PERM_EXPR fed by an external def
can appear in more than one subgraph. This triggered an ICE in
vect_schedule_node, which (rightly) expects to be called no more
than once for the same internal def.
There seemed to be many possible fixes, including:
(1) Replace unsupported loads with external defs *before* doing
the layout optimisation. This would avoid the need for the
VEC_PERM_EXPR altogether.
(2) If the target doesn't support a load in its original layout,
stop the layout optimisation from checking whether the target
supports loads in any new candidate layout. In other words,
treat all layouts as if they were supported whenever the
original layout is not in fact supported.
I'd rather not do this. In principle, the layout optimisation
could convert an unsupported layout to a supported one.
Selectively ignoring target support would work against that.
We could try to look specifically for loads that will need
to be decomposed, but that just seems like admitting that
things are happening in the wrong order.
(3) Add SLP_TREE_SCALAR_STMTS to VEC_PERM_EXPRs.
That would be OK for this case, but wouldn't be possible
for external defs that represent existing vectors.
(4) Make vect_schedule_slp share SCC info between subgraphs.
It feels like that's working around the partitioning problem
rather than a real fix though.
(5) Directly ensure that internal def nodes belong to a single
subgraph.
(1) is probably the best long-term fix, but (5) is much simpler.
The subgraph partitioning code already has a hash set to record
which nodes have been visited; we just need to convert that to a
map from nodes to instances instead.
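The idea behind (5) can be sketched on a toy graph. This is not the code in tree-vect-slp.cc: the types, the names and the policy of merging two instances when they reach the same node are all assumptions made for illustration.

```cpp
#include <cstddef>

constexpr int kNone = -1;
constexpr std::size_t kNodes = 5;      // hypothetical graph size
constexpr std::size_t kInstances = 2;  // hypothetical number of instances

struct Graph {
  int children[kNodes][2];  // up to two children per node, kNone = no child
};

struct Partitioning {
  int node_to_instance[kNodes];  // replaces a plain "visited" set
  int leader[kInstances];        // instance -> representative of its partition
};

constexpr int find_leader(const Partitioning& p, int inst) {
  while (p.leader[inst] != inst) inst = p.leader[inst];
  return inst;
}

constexpr void visit(const Graph& g, int node, int inst, Partitioning& p) {
  if (node == kNone) return;
  int& owner = p.node_to_instance[node];
  if (owner != kNone) {
    // Already reached from some root: merge the two instances instead of
    // letting the node appear in two partitions.
    const int a = find_leader(p, owner);
    const int b = find_leader(p, inst);
    if (a != b) p.leader[b] = a;
    return;
  }
  owner = inst;
  for (int child : g.children[node]) visit(g, child, inst, p);
}

constexpr Partitioning partition_graph(const Graph& g, const int (&roots)[kInstances]) {
  Partitioning p{};
  for (int& o : p.node_to_instance) o = kNone;
  for (std::size_t i = 0; i < kInstances; ++i) p.leader[i] = static_cast<int>(i);
  for (std::size_t i = 0; i < kInstances; ++i) visit(g, roots[i], static_cast<int>(i), p);
  return p;
}

constexpr int partition_of(const Partitioning& p, int node) {
  return find_leader(p, p.node_to_instance[node]);
}

// Roots 0 and 1 each have a private child (2 and 3) and share node 4.
constexpr Graph kExample = {{{2, 4}, {3, 4}, {kNone, kNone}, {kNone, kNone}, {kNone, kNone}}};
constexpr int kRoots[kInstances] = {0, 1};
constexpr Partitioning kResult = partition_graph(kExample, kRoots);

static_assert(partition_of(kResult, 2) == partition_of(kResult, 3),
              "the shared node pulls both instances into one partition");
```

With only a visited set, the second traversal would stop at node 4 without learning which instance had claimed it; the map is what makes the conflict detectable.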
gcc/
PR tree-optimization/106787
* tree-vect-slp.cc (vect_map_to_instance): New function, split out
from...
(vect_bb_partition_graph_r): ...here. Replace the visited set
with a map from nodes to instances. Ensure that a node only
appears in one partition.
(vect_bb_partition_graph): Update accordingly.
gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-19.c: New test.
Richard Biener [Fri, 2 Sep 2022 11:36:13 +0000 (13:36 +0200)]
tree-optimization/106809 - compile time hog in VN
The dominated_by_p_w_unex function is prone to high compile time.
With GCC 12 we introduced a VN run for uninit diagnostics which now
runs into a degenerate case with bison generated code. Fortunately
this case is easy to fix with a simple extra check - a more
general fix needs more work.
PR tree-optimization/106809
* tree-ssa-sccvn.cc (dominated_by_p_w_unex): Check we have
more than one successor before doing extra work.
Kito Cheng [Fri, 20 Nov 2020 07:55:58 +0000 (15:55 +0800)]
RISC-V: Implement TARGET_COMPUTE_MULTILIB
Use TARGET_COMPUTE_MULTILIB to search for a reusable multi-lib for riscv*-*-elf*,
according to the following rules:
1. Check that the ABI is the same.
2. Check that both have the atomic extension or both lack it.
- Mixing soft and hard atomic operations doesn't make sense and
won't work as expected.
3. Check that the current arch is a superset of the target multi-lib arch.
- It might result in slower performance or larger code size, but it
is safe to run.
4. Pick the best-matching multi-lib set if more than one multi-lib passes
the above checks.
Example of how a multi-lib is selected:
We build code with -march=rv32imaf and -mabi=ilp32, and we have the
following 5 multi-lib sets:
The first and second multi-libs are safe to link. The 3rd multi-lib can't
be re-used because it lacks the atomic extension, which is a mismatch under
rule 2. The 4th multi-lib can't be re-used either, due to the ABI mismatch.
The last multi-lib can't be used since the current arch is not a superset of
the arch of the multi-lib.
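The four rules can be sketched as a small selection function. This is not the hook's implementation: the bitmask encoding, the names, the scoring used for rule 4 (number of extensions in the candidate) and the five candidate multi-libs are all assumptions, with the candidates chosen only so that they behave as the paragraph above describes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical encoding: one bit per extension letter.
constexpr std::uint32_t kI = 1u << 0, kM = 1u << 1, kA = 1u << 2,
                        kF = 1u << 3, kD = 1u << 4;
enum class Abi { ilp32, ilp32f };

struct Multilib {
  std::uint32_t exts;
  Abi abi;
};

constexpr bool has_atomic(std::uint32_t e) { return (e & kA) != 0; }
constexpr bool is_subset(std::uint32_t sub, std::uint32_t super) { return (sub & ~super) == 0; }

// Rules 1 to 3: same ABI, same atomic-ness, candidate arch is a subset of ours.
constexpr bool can_reuse(const Multilib& cur, const Multilib& lib) {
  return cur.abi == lib.abi
      && has_atomic(cur.exts) == has_atomic(lib.exts)
      && is_subset(lib.exts, cur.exts);
}

constexpr int count_bits(std::uint32_t v) {
  int n = 0;
  for (; v != 0; v >>= 1) n += static_cast<int>(v & 1u);
  return n;
}

// Rule 4 (assumed scoring): prefer the reusable candidate with the most extensions.
// Returns the index of the chosen multi-lib, or -1 if none is suitable.
template<std::size_t N>
constexpr int pick_best(const Multilib& cur, const Multilib (&libs)[N]) {
  int best = -1, best_score = -1;
  for (std::size_t i = 0; i < N; ++i) {
    if (!can_reuse(cur, libs[i])) continue;
    const int score = count_bits(libs[i].exts);
    if (score > best_score) { best = static_cast<int>(i); best_score = score; }
  }
  return best;
}

// Illustrative candidates only.
constexpr Multilib kLibs[] = {
  {kI | kA,                Abi::ilp32},   // rv32ia/ilp32: reusable
  {kI | kM | kA,           Abi::ilp32},   // rv32ima/ilp32: reusable, closer match
  {kI | kM | kF,           Abi::ilp32},   // rv32imf/ilp32: no atomics (rule 2)
  {kI | kM | kA | kF,      Abi::ilp32f},  // rv32imaf/ilp32f: ABI differs (rule 1)
  {kI | kM | kA | kF | kD, Abi::ilp32},   // rv32imafd/ilp32: not a subset (rule 3)
};
constexpr Multilib kCurrent = {kI | kM | kA | kF, Abi::ilp32};  // rv32imaf/ilp32

static_assert(pick_best(kCurrent, kLibs) == 1, "the second candidate is chosen");
```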
An error is emitted if no suitable multi-lib set is found; the error message
is only emitted when linking with the standard libraries.
// No actual linking, so no error emitted.
$ riscv64-unknown-elf-gcc -print-multi-directory -march=rv32ia -mabi=ilp32
.
// Links against the default libc and libgcc, so the multi-lib is checked
// and an error is emitted because no suitable multilib is found.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c
riscv64-unknown-elf-gcc: fatal error: can't found suitable multilib set for '-march=rv32ia'/'-mabi=ilp32'
compilation terminated.
// No error emitted, because we do not link against the standard library.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c -nostdlib
// No error emitted, because we only compile.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c -c
Kito Cheng [Fri, 20 Nov 2020 07:52:53 +0000 (15:52 +0800)]
Add TARGET_COMPUTE_MULTILIB hook to override multi-lib result.
Create a new hook to let a target override the multi-lib result.
The motivation is that RISC-V might have very complicated multi-lib re-use
rules*, which are hard to maintain and use with the current multi-lib scripts;
we even hit the "argument list too long" error when we tried to add more
multi-lib reuse rules.
So I think it would be great to have a target-specific way to determine
the multi-lib re-use rules; then we could write those rules in C instead
of expanding every possible case in MULTILIB_REUSE.
Gary Dismukes [Wed, 13 Jul 2022 22:06:47 +0000 (18:06 -0400)]
[Ada] Error on return of object whose full view has undefaulted discriminants
The compiler wrongly reports an error about the expected type not
matching the same-named found type in a return statement for a function
whose result type has unknown discriminants when the full type is tagged
and has an undefaulted discriminant, and the return expression is an object
initialized by a function call. The processing for return statements that
creates an actual subtype based on the return expression type's underlying
type when that type has discriminants, and converts the expression to
the actual subtype, should only be done when the underlying discriminated
type is mutable (i.e., has defaulted discriminants). Otherwise the
unchecked conversion to the actual subtype (of the underlying full type)
can lead to a resolution problem later within Expand_Simple_Function_Return
in the expansion of tag assignments (because the target type of the
conversion is a full view and does not match the partial view of
the function's result type).
gcc/ada/
* exp_ch6.adb (Expand_Simple_Function_Return): Bypass creation of an actual
subtype and unchecked conversion to that subtype when the underlying type
of the expression has discriminants without defaults.
Eric Botcazou [Thu, 7 Jul 2022 22:01:15 +0000 (00:01 +0200)]
[Ada] Fix crash on declaration of overaligned array with constraints
The semantic analyzer was setting the Is_Constr_Subt_For_UN_Aliased flag on
the actual subtype of the object, which is incorrect because the nominal
subtype is constrained. This also adjusts a recent related change.
gcc/ada/
* exp_util.adb (Expand_Subtype_From_Expr): Check for the presence
of the Is_Constr_Subt_For_U_Nominal flag instead of the absence
of the Is_Constr_Subt_For_UN_Aliased flag on the subtype of the
expression of an object declaration before reusing this subtype.
* sem_ch3.adb (Analyze_Object_Declaration): Do not incorrectly
set the Is_Constr_Subt_For_UN_Aliased flag on the actual subtype
of an array with definite nominal subtype. Remove useless test.
[Ada] Recover proof of Scaled_Divide in System.Arith_64
Proof of Scaled_Divide was impacted by changes in provers and Why3.
Recover it partially, leaving some unproved basic inferences to be
further investigated.
[Ada] Fix proof of runtime unit System.Value* and System.Image*
Refactor specification of the Value* and Image* units and fix proofs.
gcc/ada/
* libgnat/a-nbnbig.ads: Add Always_Return annotation.
* libgnat/s-vaispe.ads: New ghost unit for the specification of
System.Value_I. Restore proofs.
* libgnat/s-vauspe.ads: New ghost unit for the specification of
System.Value_U. Restore proofs.
* libgnat/s-valuei.adb: The specification only subprograms are
moved to System.Value_I_Spec. Restore proofs.
* libgnat/s-valueu.adb: The specification only subprograms are
moved to System.Value_U_Spec. Restore proofs.
* libgnat/s-valuti.ads
(Uns_Params): Generic unit used to bundle together the
specification functions of System.Value_U_Spec.
(Int_Params): Generic unit used to bundle together the
specification functions of System.Value_I_Spec.
* libgnat/s-imagef.adb: It is now possible to instantiate the
appropriate specification units instead of creating imported ghost
subprograms.
* libgnat/s-imagei.adb: Update to refactoring of specifications
and fix proofs.
* libgnat/s-imageu.adb: Likewise.
* libgnat/s-imgint.ads: Ghost parameters are grouped together in a
package now.
* libgnat/s-imglli.ads: Likewise.
* libgnat/s-imgllu.ads: Likewise.
* libgnat/s-imgllli.ads: Likewise.
* libgnat/s-imglllu.ads: Likewise.
* libgnat/s-imguns.ads: Likewise.
* libgnat/s-vallli.ads: Likewise.
* libgnat/s-valllli.ads: Likewise.
* libgnat/s-imagei.ads: Likewise.
* libgnat/s-imageu.ads: Likewise.
* libgnat/s-vaispe.adb: Likewise.
* libgnat/s-valint.ads: Likewise.
* libgnat/s-valuei.ads: Likewise.
* libgnat/s-valueu.ads: Likewise.
* libgnat/s-vauspe.adb: Likewise.
Simon Rainer [Wed, 31 Aug 2022 21:00:08 +0000 (23:00 +0200)]
ipa: Fix throw in multi-versioned functions [PR106627]
Any multi-versioned function was implicitly declared as noexcept, which
leads to an abort if an exception is thrown inside the function.
The reason for this is that the function declaration is replaced by a
newly created dispatcher declaration, which has TREE_NOTHROW always set
to 1. Instead we need to set TREE_NOTHROW to the value of the original
declaration.
PR ipa/106627
gcc/ChangeLog:
* config/i386/i386-features.cc (ix86_get_function_versions_dispatcher):
Set TREE_NOTHROW correctly for dispatcher declaration.
* config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher):
Likewise.
Jonathan Wakely [Thu, 1 Sep 2022 14:58:34 +0000 (15:58 +0100)]
libstdc++: Remove __is_referenceable helper
We only use the __is_referenceable helper in three places now:
add_pointer, add_lvalue_reference, and add_rvalue_reference. But lots of
other traits depend on add_[lr]value_reference, and decay depends on
add_pointer, so removing the instantiation of __is_referenceable helps
compile all those other traits slightly faster.
We can just use void_t<T&> to check for a referenceable type in the
add_[lr]value_reference traits.
Then we can specialize add_pointer for reference types, so that we don't
need to use remove_reference, and then use void_t<T*> for all
non-reference types to detect when we can form a pointer to the type.
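A minimal sketch of the technique with hypothetical `sketch_` names, using std::void_t where the library uses its internal __void_t. It is meant to show the shape of the detection, not the exact library definitions.

```cpp
#include <type_traits>

// Primary template: T is not referenceable, so leave it unchanged.
template<class T, class = void>
struct sketch_add_lvalue_reference { using type = T; };

// Chosen whenever T& is a valid type.
template<class T>
struct sketch_add_lvalue_reference<T, std::void_t<T&>> { using type = T&; };

// Primary template: no pointer to T can be formed, so leave it unchanged.
template<class T, class = void>
struct sketch_add_pointer { using type = T; };

// Chosen for non-reference types whenever T* is a valid type.
template<class T>
struct sketch_add_pointer<T, std::void_t<T*>> { using type = T*; };

// Reference types are handled by partial specialization, so remove_reference
// is never instantiated.
template<class T>
struct sketch_add_pointer<T&> { using type = T*; };
template<class T>
struct sketch_add_pointer<T&&> { using type = T*; };

static_assert(std::is_same<sketch_add_lvalue_reference<int>::type, int&>::value,
              "int is referenceable");
static_assert(std::is_same<sketch_add_lvalue_reference<void>::type, void>::value,
              "void is not referenceable");
```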
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_referenceable): Remove.
(__add_lvalue_reference_helper, __add_rvalue_reference_helper):
Use __void_t instead of __is_referenceable.
(__add_pointer_helper): Likewise.
(add_pointer): Add partial specializations for reference types.
Jonathan Wakely [Thu, 1 Sep 2022 14:19:28 +0000 (15:19 +0100)]
libstdc++: Optimize is_constructible traits
We can replace some class template helpers with alias templates, which
are cheaper to instantiate.
For example, replace the __is_copy_constructible_impl class template
with an alias template that just evaluates the __is_constructible
built-in, using add_lvalue_reference<const T> to get the argument type
in a way that works for non-referenceable types. For a given
specialization of is_copy_constructible this results in the same number
of class templates being instantiated (for the common case of non-void,
non-function types), but the add_lvalue_reference instantiations are not
specific to the is_copy_constructible specialization and so can be
reused by other traits. Previously __is_copy_constructible_impl was a
distinct class template and its specializations were never used for
anything except is_copy_constructible.
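In outline, the replacement might look like the following sketch. The `sketch_` names are hypothetical, and the snippet assumes a compiler providing the `__is_constructible` built-in.

```cpp
#include <memory>
#include <type_traits>

// Alias template: names std::bool_constant directly, so there is no dedicated
// helper class template to instantiate.
template<class T, class... Args>
using sketch_is_constructible_impl = std::bool_constant<__is_constructible(T, Args...)>;

// add_lvalue_reference_t<const T> yields const T& for referenceable types and
// leaves types such as void unchanged, so this is well-formed for every T.
template<class T>
struct sketch_is_copy_constructible
  : sketch_is_constructible_impl<T, std::add_lvalue_reference_t<const T>> {};

static_assert(sketch_is_copy_constructible<int>::value, "int can be copied");
static_assert(!sketch_is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr is move-only");
static_assert(!sketch_is_copy_constructible<void>::value, "void is never constructible");
```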
With the new definitions of these traits that don't depend on helper
classes, it becomes more practical to optimize the
is_xxx_constructible_v variable templates to avoid instantiations.
Previously doing so would have meant two entirely separate
implementation strategies for these traits.
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_constructible_impl): Replace
class template with alias template.
(is_default_constructible, is_nothrow_constructible)
(is_nothrow_constructible): Simplify base-specifier.
(__is_copy_constructible_impl, __is_move_constructible_impl)
(__is_nothrow_copy_constructible_impl)
(__is_nothrow_move_constructible_impl): Remove class templates.
(is_copy_constructible, is_move_constructible)
(is_nothrow_constructible, is_nothrow_default_constructible)
(is_nothrow_copy_constructible, is_nothrow_move_constructible):
Adjust base-specifiers to use __is_constructible_impl.
(__is_copy_assignable_impl, __is_move_assignable_impl)
(__is_nt_copy_assignable_impl, __is_nt_move_assignable_impl):
Remove class templates.
(__is_assignable_impl): New alias template.
(is_assignable, is_copy_assignable, is_move_assignable):
Adjust base-specifiers to use new alias template.
(is_nothrow_copy_assignable, is_nothrow_move_assignable):
Adjust base-specifiers to use existing alias template.
(__is_trivially_constructible_impl): New alias template.
(is_trivially_constructible, is_trivially_default_constructible)
(is_trivially_copy_constructible)
(is_trivially_move_constructible): Adjust base-specifiers to use
new alias template.
(__is_trivially_assignable_impl): New alias template.
(is_trivially_assignable, is_trivially_copy_assignable)
(is_trivially_move_assignable): Adjust base-specifier to use
new alias template.
(__add_lval_ref_t, __add_rval_ref_t): New alias templates.
(add_lvalue_reference, add_rvalue_reference): Use new alias
templates.
Jonathan Wakely [Thu, 1 Sep 2022 12:06:13 +0000 (13:06 +0100)]
libstdc++: Optimize std::decay
Define partial specializations of std::decay and its __decay_selector
helper so that remove_reference, is_array and is_function are not
instantiated for every type, and remove_extent is not instantiated for
arrays.
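The shape of the optimization can be sketched with hypothetical `sketch_` names. The exact library definitions may differ; in particular, using std::is_function and std::add_pointer_t in the non-array case is an assumption made for readability.

```cpp
#include <cstddef>
#include <type_traits>

// Non-array case: functions decay to pointers, everything else loses cv-qualifiers.
template<class T>
struct sketch_decay_selector
  : std::conditional<std::is_function<T>::value,
                     std::add_pointer_t<T>, std::remove_cv_t<T>> {};

// Arrays are matched by partial specialization, so neither is_array nor
// remove_extent needs to be instantiated.
template<class T, std::size_t N>
struct sketch_decay_selector<T[N]> { using type = T*; };
template<class T>
struct sketch_decay_selector<T[]> { using type = T*; };

template<class T>
struct sketch_decay { using type = typename sketch_decay_selector<T>::type; };

// References are stripped by partial specialization instead of remove_reference.
template<class T>
struct sketch_decay<T&> { using type = typename sketch_decay_selector<T>::type; };
template<class T>
struct sketch_decay<T&&> { using type = typename sketch_decay_selector<T>::type; };

static_assert(std::is_same<sketch_decay<const int&>::type, int>::value,
              "reference and cv-qualifier removed");
static_assert(std::is_same<sketch_decay<int[3]>::type, int*>::value,
              "array decays to pointer");
```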
libstdc++-v3/ChangeLog:
* include/std/type_traits (__decay_selector): Add partial
specializations for array types. Only check for function types
when not dealing with an array.
(decay): Add partial specializations for reference types.
Jonathan Wakely [Wed, 31 Aug 2022 14:00:24 +0000 (15:00 +0100)]
libstdc++: Use built-ins for some variable templates
This avoids having to instantiate a class template that just uses the
same built-in anyway.
None of the corresponding class templates have any type-completeness
static assertions, so we're not losing any diagnostics by using the
built-ins directly.
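A brief sketch for three of the traits, with hypothetical `sketch_` names and test types. It assumes a compiler that provides the `__is_enum`, `__is_class` and `__is_base_of` built-ins.

```cpp
// Each variable template evaluates a compiler built-in directly, without
// instantiating the corresponding class template.
template<class T>
inline constexpr bool sketch_is_enum_v = __is_enum(T);

template<class T>
inline constexpr bool sketch_is_class_v = __is_class(T);

template<class Base, class Derived>
inline constexpr bool sketch_is_base_of_v = __is_base_of(Base, Derived);

// Hypothetical types used only to exercise the traits.
enum class Colour { red, green };
struct Shape {};
struct Circle : Shape {};

static_assert(sketch_is_enum_v<Colour> && !sketch_is_enum_v<Shape>, "enum detection");
static_assert(sketch_is_base_of_v<Shape, Circle>, "Shape is a base of Circle");
```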
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_enum_v, is_class_v, is_union_v)
(is_empty_v, is_polymorphic_v, is_abstract_v, is_final_v)
(is_base_of_v, is_aggregate_v): Use built-in directly instead of
instantiating class template.
Joseph Myers [Thu, 1 Sep 2022 19:10:59 +0000 (19:10 +0000)]
c: C2x removal of unprototyped functions
C2x has completely removed unprototyped functions, so that () now
means the same as (void) in both function declarations and
definitions, where previously that change had been made for
definitions only. Implement this accordingly.
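C++ has long applied the rule that C2x adopts here, so the new meaning can be illustrated in C++. This is only an analogy for the C2x semantics and not a test of the C front end; `no_params` is a hypothetical declaration.

```cpp
#include <type_traits>

// With () meaning (void), this declares a function taking no arguments.
int no_params();

static_assert(std::is_same<decltype(no_params), int(void)>::value,
              "() and (void) name the same function type");

// A call such as no_params(1) is therefore rejected at compile time, where
// pre-C2x C would have accepted it for an unprototyped declaration.
```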
This is a change where GNU/Linux distribution builders might wish to
try builds with a -std=gnu2x default to start early on getting old
code fixed that still has () declarations for functions taking
arguments, in advance of GCC moving to -std=gnu2x as default maybe in
GCC 14 or 15; I don't know how much such code is likely to be in
current use.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c/
* c-decl.cc (grokparms): Handle () in a function declaration the
same as (void) for C2X.
gcc/testsuite/
* gcc.dg/c11-unproto-3.c, gcc.dg/c2x-unproto-3.c,
gcc.dg/c2x-unproto-4.c: New tests.
* gcc.dg/c2x-old-style-definition-6.c, gcc.dg/c2x-unproto-1.c,
gcc.dg/c2x-unproto-2.c: Update for removal of unprototyped
functions.
vect: Try to remove single-vector permutes from SLP graph
This patch extends the SLP layout optimisation pass so that it
tries to remove layout changes that are brought about by permutes
of existing vectors. This fixes the bb-slp-pr54400.c regression on
x86_64 and also means that we can remove the permutes in cases like:
The new test is a simple adaption of bb-slp-pr54400.c, with the
same style of markup.
gcc/
* tree-vect-slp.cc (vect_build_slp_tree_2): When building a
VEC_PERM_EXPR of an existing vector, set the SLP_TREE_LANES
to the number of vector elements, if that's a known constant.
(vect_optimize_slp_pass::is_compatible_layout): Remove associated
comment about zero SLP_TREE_LANES.
(vect_optimize_slp_pass::start_choosing_layouts): Iterate over
all partition members when looking for potential layouts.
Handle existing permutes of fixed-length vectors.
gcc/testsuite/
* gcc.dg/vect/bb-slp-pr54400.c: Extend to aarch64.
* gcc.dg/vect/bb-slp-layout-18.c: New test.
We're starting to abuse the infinity endpoints in the frange code and
the associated range operators. Building infinities is rather cheap,
and we could even inline them, but I think it's best to just not
recalculate them all the time.
I see about 20 uses of real_inf in the source code, not including the
backends. And I'm about to add more :).
gcc/ChangeLog:
* emit-rtl.cc (init_emit_once): Initialize dconstinf and
dconstninf.
* real.h: Add dconstinf and dconstninf.
Jason Merrill [Wed, 24 Aug 2022 20:31:11 +0000 (16:31 -0400)]
c++: set TYPE_STRING_FLAG for char8_t
While looking at the DWARF handling of char8_t I wondered why we weren't
setting TYPE_STRING_FLAG on it. I hoped that setting that flag would be an
easy fix for PR102958, but it doesn't seem to be sufficient. But it still
seems correct.
I also tried setting the flag on char16_t and char32_t, but that broke
because braced_list_to_string assumes char-sized elements. Since we don't
set the flag on wchar_t, I abandoned that idea.
gcc/c-family/ChangeLog:
* c-common.cc (c_common_nodes_and_builtins): Set TYPE_STRING_FLAG on
char8_t.
(braced_list_to_string): Check for char-sized elements.