Richard Biener [Thu, 24 Oct 2024 15:06:29 +0000 (17:06 +0200)]
Restrict :c to commutative ops as intended
genmatch was supposed to restrict :c to verifiable commutative
operations while leaving :C to the "I know what I'm doing" case.
The following enforces this, cleaning up parsing and amending
the commutative_op helper. There's one pattern that needs adjustment,
the pattern optimizing fmax (x, NaN) or fmax (NaN, x) to x since
fmax isn't commutative.
* genmatch.cc (commutative_op): Add parameter to indicate whether
all compares should be considered commutative. Handle
hypot, add_overflow and mul_overflow.
(parser::parse_expr): Simplify 'c' handling by using
commutative_op and error out when the operation is not.
* match.pd ((minmax:c @0 NaN@1) -> @0): Use :C, we know
what we are doing.
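As an illustration of the :c versus :C distinction described above, here is a minimal sketch of a matcher that tries both operand orders. It only accepts the checked form for operations it can verify as commutative, while the forced form is always accepted. The names Op, Commute, is_commutative, flag_allowed and match_binary are hypothetical and are not genmatch's actual code.

```cpp
#include <cassert>
#include <string>

// Hypothetical operation codes; not genmatch's real representation.
enum class Op { Plus, Mult, Minus, Fmax };

// How a pattern asks for operand swapping.
enum class Commute { None, Checked /* like :c */, Forced /* like :C */ };

// Operations treated as verifiably commutative in this sketch.
bool is_commutative(Op op) {
    return op == Op::Plus || op == Op::Mult;
}

// A Checked request is only accepted for commutative operations;
// Forced is always accepted, the pattern author takes responsibility.
bool flag_allowed(Op op, Commute flag) {
    return flag != Commute::Checked || is_commutative(op);
}

struct Expr {
    Op op;
    std::string lhs, rhs;
};

// Match e against (op a b), also trying (op b a) when swapping is requested.
bool match_binary(const Expr& e, Op op, const std::string& a,
                  const std::string& b, Commute flag) {
    if (e.op != op || !flag_allowed(op, flag))
        return false;
    if (e.lhs == a && e.rhs == b)
        return true;
    return flag != Commute::None && e.lhs == b && e.rhs == a;
}
```

In this sketch a Checked request on Fmax is simply refused, which loosely mirrors the commit's "error out when the operation is not" behaviour.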
Richard Biener [Thu, 24 Oct 2024 14:15:43 +0000 (16:15 +0200)]
tree-optimization/117277 - remove CLOBBERs before SLP code generation
We have to remove CLOBBERs before SLP is code generated since for
store-lanes we are inserting our own CLOBBERs that we want to survive.
So the following refactors vect_transform_loop to remove unwanted
stmts first.
This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL.
PR tree-optimization/117277
* tree-vect-loop.cc (vect_transform_loop): Remove CLOBBERs
and prefetches before doing any code generation.
The following implements masked load-lane discovery for SLP. The
challenge here is that a masked load has a full-width mask with
group-size number of elements; when this becomes a masked load-lanes
instruction, one mask element gates all group members. We already
have some discovery hints in place, namely STMT_VINFO_SLP_VECT_ONLY
to guard non-uniform masks, but we need to choose a way for SLP
discovery to handle possible masked load-lanes SLP trees.
I have this time chosen to handle load-lanes discovery where we
have performed permute optimization already and conveniently got
the graph with predecessor edges built. This is because, unlike
non-masked loads, masked loads with a load_permutation are never
produced by SLP discovery (because load permutation handling doesn't
handle un-permuting the mask) and thus the load-permutation lowering
which handles non-masked load-lanes discovery doesn't trigger.
With this, SLP discovery for a possible masked load-lanes, that is,
a masked load with uniform mask, produces a splat of a single-lane
sub-graph as the mask SLP operand. This is a representation that
shouldn't pessimize the mask load case and allows the masked load-lanes
transform to simply elide this splat.
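The uniform-mask condition can be sketched as follows: a full-width mask with group-size elements per group qualifies when all elements within each group agree, and it can then be collapsed to one element per group. This is only an illustration on plain vectors; collapse_uniform_mask is a hypothetical helper, not a function in tree-vect-slp.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collapse a full-width mask (group_size elements per group) into one
// element per group.  Returns false, leaving per_group unspecified, when
// some group is non-uniform or the mask length is not a multiple of
// group_size.
bool collapse_uniform_mask(const std::vector<bool>& mask,
                           std::size_t group_size,
                           std::vector<bool>& per_group) {
    per_group.clear();
    if (group_size == 0 || mask.size() % group_size != 0)
        return false;
    for (std::size_t g = 0; g < mask.size(); g += group_size) {
        // Every lane of the group must agree with the first lane.
        for (std::size_t i = 1; i < group_size; ++i)
            if (mask[g + i] != mask[g])
                return false;
        per_group.push_back(mask[g]);
    }
    return true;
}
```

Going the other way, repeating each per-group element group_size times gives back the full-width mask, which corresponds to the splat the commit message mentions.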
This fixes the aarch64-sve.exp mask_struct_load*.c testcases with
--param vect-force-slp=1
PR tree-optimization/116575
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Handle
gaps, aka NULL scalar stmt.
(vect_build_slp_tree_2): Allow gaps in the middle of a
grouped mask load. When the mask of a grouped mask load
is uniform do single-lane discovery for the mask and
insert a splat VEC_PERM_EXPR node.
(vect_optimize_slp_pass::decide_masked_load_lanes): New
function.
(vect_optimize_slp_pass::run): Call it.
Richard Biener [Wed, 23 Oct 2024 09:55:31 +0000 (11:55 +0200)]
Relax vect_check_scalar_mask check
When the mask is not a constant or external def there's no need to
check the scalar type, in particular with SLP and the mask being
a VEC_PERM_EXPR there isn't a scalar operand ready to check
(not one vect_is_simple_use will get you). We later check the
vector type and reject non-mask types there.
* tree-vect-stmts.cc (vect_check_scalar_mask): Only check
the scalar type for constant or extern defs.
Tom Tromey [Wed, 31 Jul 2024 15:01:45 +0000 (09:01 -0600)]
ada: Change scope of XUB type
An earlier patch in the "nameless" series caused a regression with
-fgnat-encodings=all. Previously, all artificial types were emitted
in the CU scope in the DWARF, but with the patch, an "XUB" type is
emitted in the function scope. This causes gdb lookups to erroneously
find the XUB type rather than the type that gdb expects to find.
Note that I don't know why the earlier code worked, because decl.cc
clearly sets the XUB type's context to be the current function.
This patch changes the type's context so that it is nested in a type
that is conveniently available.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Use gnu_fat_type as the type
context for a XUB type.
Tom Tromey [Tue, 9 Jul 2024 15:36:26 +0000 (09:36 -0600)]
ada: Set DECL_NAMELESS in create_type_decl
When using minimal encodings, most artificial types do not need to
have their names emitted in the DWARF. This patch changes the
compiler to generally omit these names.
However, a subset of names are needed: when the compiler creates an
artificial type for certain kinds of arrays, the name is needed by
gdb. So, a new parameter is added to create_type_decl to allow this
omission to be disabled.
Note that simply passing 'false' as the artificial_p argument to
create_type_decl doesn't work properly -- other parts of the compiler
seem to rely on this flag being set, and so making this change causes
ICEs.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Update some calls to
create_type_decl.
* gcc-interface/gigi.h (create_type_decl): Add can_be_nameless parameter.
* gcc-interface/utils.cc (create_type_decl): Add can_be_nameless
parameter. Set DECL_NAMELESS on type decl.
Tom Tromey [Wed, 10 Jul 2024 17:54:25 +0000 (11:54 -0600)]
ada: Mark some type decls as nameless
The types created by record_builtin_type and create_type_stub_decl can
be marked as nameless when using minimal encodings. In this
situation, gdb does not need these type names.
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (record_builtin_type, create_type_stub_decl):
Set DECL_NAMELESS on type decls.
Tom Tromey [Wed, 10 Jul 2024 17:52:17 +0000 (11:52 -0600)]
ada: Mark XUA types as artificial
gdb does not need the name of XUA types. This patch changes the
compiler to unconditionally mark these as artificial; a subsequent
patch will arrange for the name to be omitted.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Pass 'true' to
create_type_decl when creating XUA type.
Tom Tromey [Wed, 10 Jul 2024 17:46:57 +0000 (11:46 -0600)]
ada: Add 'artificial_p' parameter to build_unc_object_type
This adds an 'artificial_p' parameter to build_unc_object_type, so
that the artificiality of the type can be propagated to
create_type_decl. This will affect the namelessness of the type in a
subsequent patch.
Tom Tromey [Wed, 10 Jul 2024 17:29:11 +0000 (11:29 -0600)]
ada: Standard types are not artificial
This changes gigi so that standard types are no longer marked
artificial. This change is needed to prevent subsequent patches from
causing standard types to have their names elided. Also, although
DWARF says that DW_AT_artificial is used for "the declaration of an
object or type artificially generated by a compiler and not explicitly
declared by the source program", it seems to me that types provided by
the language should not be marked as such; and this is what the C and
C++ compilers do.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (is_artificial): New function.
(gnat_to_gnu_entity): Use it.
Eric Botcazou [Wed, 18 Sep 2024 06:24:32 +0000 (08:24 +0200)]
ada: Fix fallout of change in parameter passing out of aliasing considerations
If an actual parameter that is a type conversion is passed by reference but
not addressable, the temporary that is created and whose address is passed
instead may need to be in the target type of the conversion to fulfill the
requirements of strict aliasing.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (Call_to_gnu): If the formal is passed by
reference and the actual is a type conversion but not addressable,
create the temporary in the target type of the conversion if this
is needed to enforce strict aliasing.
Eric Botcazou [Wed, 11 Sep 2024 17:53:12 +0000 (19:53 +0200)]
ada: Fix internal error on bit-packed array type with Volatile_Full_Access
The problem occurs when the component type is a record type with default
values for the initialization procedure of the (base) array type, because
the compiler is trying to generate a full access for a parameter of the
base array type, which does not make sense.
gcc/ada/ChangeLog:
PR ada/116551
* gcc-interface/trans.cc (node_is_atomic) <N_Identifier>: Return
false if the type of the entity is an unconstrained array type.
(node_is_volatile_full_access) <N_Identifier>: Likewise.
Bob Duff [Tue, 1 Oct 2024 15:29:34 +0000 (11:29 -0400)]
ada: Disable self-referential with_clauses
Self-referential with_clauses (as in package body X says "with X;")
cause trouble, such as duplicate nested instantiations when using
container packages. This patch disables most of the processing by
setting the Is_Implicit_With flag. It's not really implicit, but the
subsequent processing behaves as if it is, and coming up with a more
accurate (and much longer) name for the flag doesn't seem beneficial for
such an obscure case. Note that the spec of X will be processed later,
rather than upon seeing "with X;".
The patch also includes other cleanups, such as renaming Implicit_With
to Is_Implicit_With.
gcc/ada/ChangeLog:
* sem_ch10.adb (Analyze_With_Clause): Check for self-referential
with clause. Give a warning, and set Is_Implicit_With, which we
are reusing in this obscure case even though it's not really
implicit.
(Analyze_Context): Remove check for self-referential with clause.
It wasn't correct -- it only triggered for Acts_As_Spec
subprograms. Corrected check is now in Analyze_With_Clause.
(Implicit_With): Rename to be Is_Implicit_With. Misc cleanup,
comment fixes.
(Process_Spec_Clauses): Remove default for Exit_On_Self parameter.
Use "exit when" instead of if statement.
* sinfo.ads (Implicit_With): Rename to be Is_Implicit_With.
Document new use for self-referential withs.
* ali.adb (Scan_ALI): Use an aggregate to initialize Withs entry.
* exp_put_image.adb (Preload_Root_Buffer_Type): Make this a
once-only procedure.
* sem_util.ads (Is_Ancestor_Package): Fix comment -- a library unit
is an ancestor of itself, but this doesn't return True in that
case.
* sem_util.adb (Is_Ancestor_Package): Better to initialize things
on their declaration.
* lib-load.adb: Minor comment fix.
* sem_prag.adb: Implicit_With --> Is_Implicit_With. Minor comment
fix.
* gen_il-fields.ads: Implicit_With --> Is_Implicit_With.
* gen_il-gen-gen_nodes.adb: Likewise.
* lib.adb: Likewise.
* lib-writ.adb: Likewise.
* rtsfind.adb: Likewise.
* sem_cat.adb: Likewise.
* sem_ch12.adb: Likewise.
* sem_ch8.adb: Likewise.
* sem_elab.adb: Likewise.
* sem_warn.adb: Likewise.
* gcc-interface/trans.cc (Implicit_With): Rename to be
Is_Implicit_With.
Eric Botcazou [Tue, 1 Oct 2024 07:19:36 +0000 (09:19 +0200)]
ada: Small adjustments to commentary after latest change
This removes the enumeration of the various cases in the comment associated
with the declaration of In_Expanded_Body to prevent synchronization issues.
Javier Miranda [Mon, 30 Sep 2024 09:08:04 +0000 (09:08 +0000)]
ada: Pragma Pre_Class and Post_Class have no effect at runtime
The pragmas Pre_Class and Post_Class are accepted by the compiler
but have no effect at runtime.
gcc/ada/ChangeLog:
* freeze.adb (Freeze_Entity): If the entity is an access-to-subprogram
type declaration that has pre/postcondition contracts, build the
wrapper (if not previously done as part of processing aspects).
* sem_ch3.adb (Build_Access_Subprogram_Wrapper): Add missing support
for building the wrapper when the access type has pragmas
Pre_Class/Post_Class.
(Build_Access_Subprogram_Wrapper_Declaration): New subprogram.
* sem_ch3.ads (Build_Access_Subprogram_Wrapper): Spec moved to the
public part of the package.
* sem_prag.adb (Analyze_Pre_Post_Condition): Store in the tree copy of
class-wide pre/postcondition expression; required to merge it with
inherited conditions.
(Is_Valid_Assertion_Kind): Added Pre_Class and Post_Class.
Eric Botcazou [Fri, 27 Sep 2024 07:36:17 +0000 (09:36 +0200)]
ada: Fix ATC with timed delay from Ada.Real_Time
An Asynchronous Transfer of Control blocks with a timed delay that is
computed by means of the Ada.Real_Time unit (instead of the default
Ada.Calendar unit) because of a missing abort deferral in the unit.
gcc/ada/ChangeLog:
PR ada/43485
* libgnarl/a-retide.adb: Add with clause for System.Soft_Links.
(Delay_Until): Defer and undefer abort around the call to the
Timed_Delay routine of System.Task_Primitives.Operations.
ada: Adjust documentation of External_Initialization
The parameters Maximum_Size and If_Empty were mentioned during the
request for comments phase but are not implemented, at least for now.
This patch changes the GNAT reference manual accordingly. It also makes
a minor punctuation change.
Eric Botcazou [Fri, 13 Sep 2024 09:53:00 +0000 (11:53 +0200)]
ada: Add Type_Size_For function to Uintp package
It computes the size of an integer type that can accommodate the input.
gcc/ada/ChangeLog:
* uintp.ads (Type_Size_For): New function declaration.
* uintp.adb (Type_Size_For): New function body.
* exp_imgv.adb (Rewrite_Object_Image): Call Type_Size_For to get
the size of a narrower integer type.
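As a rough illustration of the idea, the sketch below returns the smallest of 8, 16, 32 or 64 bits whose signed range contains a given value. The commit does not state the actual rounding or signedness rules of Type_Size_For, so the candidate sizes and the signed interpretation here are assumptions, and type_size_for is a hypothetical C++ stand-in rather than the Ada function.

```cpp
#include <cassert>
#include <cstdint>

// Smallest of 8, 16, 32 or 64 bits whose signed two's-complement range
// contains value.  Assumption: only these four sizes are candidates.
int type_size_for(std::int64_t value) {
    for (int bits = 8; bits < 64; bits *= 2) {
        const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
        const std::int64_t lo = -hi - 1;
        if (value >= lo && value <= hi)
            return bits;
    }
    return 64;
}
```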
Javier Miranda [Tue, 17 Sep 2024 11:53:06 +0000 (11:53 +0000)]
ada: Constraint error not raised in ACATS test c413007
The Constraint_Error exception is not raised when a subprogram
is called using prefix notation, and the prefix of the call is
an access-to-subprogram type with a null value. This new check
is enabled by switch -gnatd_P.
gcc/ada/ChangeLog:
* gen_il-fields.ads: New node field (Is_Expanded_Prefixed_Call).
* gen_il-gen-gen_nodes.adb: New semantic field for N_Function_Call
and N_Procedure_Call_Statement nodes.
* sem_ch4.adb (Complete_Object_Operation): Mark the rewritten node
with the Is_Expanded_Prefixed_Call flag.
* sem_res.adb (Check_Prefixed_Call): Code cleanup and addition of
documentation.
(Resolve_Actuals): Add a null-exclusion check on the
prefix of the call when it is an access-type.
* sinfo.ads: Adding new semantic flag (Is_Expanded_Prefixed_Call)
to N_Function_Call and N_Procedure_Call_Statement nodes.
* debug.adb: Adding documentation for switch d_P.
* sem_ch13.adb (Analyze_One_Aspect): Change the call to
`Error_Msg_GNAT_Extension` to allow this aspect in core
extensions. Put the code path in core extensions.
* exp_util.adb (Name_Of_Controlled_Prim_Op): Put the code path
in core extensions.
Eric Botcazou [Mon, 16 Sep 2024 06:31:57 +0000 (08:31 +0200)]
ada: Fix internal error on ambiguous operands of comparison operator
This is a regression introduced when the diagnosis of ambiguous operands
for comparison and equality operators was moved from the analysis to the
resolution phase in order to avoid spurious ambiguities in specific cases.
When an ambiguity is detected for the operands of predefined comparison
and equality operators during analysis, it needs to be recorded so that
later calls to the disambiguation routine know about this ambiguity for
the case where the context has been resolved to boolean.
gcc/ada/ChangeLog:
* sem_type.ads (Interp): Add Opnd_Typ component and remove default
value for Abstract_Op component.
(Add_One_Interp): Rename Opnd_Type parameter to Opnd_Typ.
* sem_type.adb (Add_One_Interp): Likewise.
(Add_One_Interp.Add_Entry): Record the operand type as well.
(Collect_Interp): Record Empty for the operand type.
(Disambiguate.Is_Ambiguous_Boolean_Operator): New predicate.
(Disambiguate): Use it to detect recorded ambiguity cases.
* sem_ch4.adb (Find_Comparison_Equality_Types): Add commentary.
Eric Botcazou [Fri, 13 Sep 2024 19:09:31 +0000 (21:09 +0200)]
ada: Fix fallout of change to 'Wide_Wide_Value for enumeration types
The literals of enumeration types are always normalized, even though they
contain wide characters (but the normalization leaves these unchanged),
so a normalization routine that is aware of wide characters must be run
on the input string for 'Wide_Wide_Value.
gcc/ada/ChangeLog:
PR ada/115507
* rtsfind.ads (RE_Id): Add RE_Enum_[Wide_]Wide_String_To_String.
(RE_Unit_Table): Add entries for the new values.
* exp_attr.adb (Is_User_Defined_Enumeration_Type): New predicate.
(Expand_N_Attribute_Reference) <Attribute_Wide_Value>: Build a call
to RE_Enum_Wide_String_To_String for user-defined enumeration types.
<Attribute_Wide_Wide_Value>: Likewise with
RE_Enum_Wide_Wide_String_To_String.
* exp_imgv.adb (Expand_Value_Attribute): Adjust to above.
* libgnat/s-wchwts.ads (Enum_Wide_String_To_String): New function.
(Enum_Wide_Wide_String_To_String): Likewise.
* libgnat/s-wchwts.adb: Add clauses for System.Case_Util.
(Normalize_String): New local procedure.
(Enum_Wide_String_To_String): New function body.
(Enum_Wide_Wide_String_To_String): Likewise.
Javier Miranda [Fri, 13 Sep 2024 07:02:02 +0000 (07:02 +0000)]
ada: Untagged incomplete view not detected in ACATS test b3a1a060
Adding checks for RM 3.10.1(10): An actual parameter cannot be
of an untagged incomplete view; the result object of a function
call cannot be of an incomplete view.
gcc/ada/ChangeLog:
* sem_res.adb (Resolve_Actuals): Add checks for incomplete
type actuals.
Eric Botcazou [Thu, 12 Sep 2024 14:11:47 +0000 (16:11 +0200)]
ada: Pass parameters of full access unconstrained array types by copy in calls
When a full access array type is declared, either Volatile_Full_Access in
Ada 2012 or Atomic in Ada 2022, an implicit base array type is built by the
compiler with the Full_Access flag set, although full accesses cannot be
generated for objects of this type because the size is not known statically.
If the component type is a record with default values, an initialization
procedure taking a parameter of the base array type is built. Given that
full accesses cannot be generated for the parameter inside the procedure,
we need to pass the actual parameter by copy to the procedure in order to
implement the full access semantics.
gcc/ada/ChangeLog:
* exp_ch6.adb (Expand_Actuals.Is_Legal_Copy): Return True for an
initialization procedure with a full access formal parameter.
(Expand_Actuals.Requires_Atomic_Or_Volatile_Copy): Return True if
the formal parameter is of a full access unconstrained array type.
Jakub Jelinek [Fri, 25 Oct 2024 07:44:10 +0000 (09:44 +0200)]
non-gcc: Remove trailing whitespace
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or belong to the Go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/;
this part).
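For readers who want to see what the sed substitution in the message does to a file's contents, here is a minimal C++ sketch that strips trailing blanks from every line of a string. Treating both spaces and tabs as blanks is an assumption, and strip_trailing_blanks is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <string>

// Remove trailing spaces and tabs from every line of text, keeping the
// line terminators ('\n') in place.
std::string strip_trailing_blanks(const std::string& text) {
    std::string out;
    std::string::size_type pos = 0;
    while (pos < text.size()) {
        const std::string::size_type nl = text.find('\n', pos);
        const std::string::size_type end =
            (nl == std::string::npos) ? text.size() : nl;
        // Back up over blanks at the end of this line.
        std::string::size_type last = end;
        while (last > pos && (text[last - 1] == ' ' || text[last - 1] == '\t'))
            --last;
        out.append(text, pos, last - pos);
        if (nl == std::string::npos)
            break;
        out.push_back('\n');
        pos = nl + 1;
    }
    return out;
}
```

Leading whitespace is left untouched, matching the message's remark that only the trailing case is handled here.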
Jakub Jelinek [Fri, 25 Oct 2024 07:41:46 +0000 (09:41 +0200)]
gcc: Remove trailing whitespace
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or belong to the Go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part (this patch)
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/).
Jennifer Schmitz [Thu, 24 Oct 2024 12:11:31 +0000 (05:11 -0700)]
SVE intrinsics: Fold svaba with op1 all zeros to svabd.
Similar to
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665780.html,
this patch implements folding of svaba to svabd if op1 is all zeros,
resulting in the use of UABD/SABD instructions instead of UABA/SABA.
Tests were added to check the produced assembly for use of UABD/SABD,
also for the _n case.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svaba_impl::fold): Fold svaba to svabd if op1 is all zeros.
Nathaniel Shead [Tue, 20 Aug 2024 15:08:36 +0000 (01:08 +1000)]
c++/modules: Support decloned cdtors
When compiling with '-fdeclone-ctor-dtor' (enabled by default with -Os),
we run into issues where we don't correctly emit the underlying
functions. We also need to ensure that COMDAT constructors are marked
as such before 'maybe_clone_body' attempts to propagate COMDAT groups to
the new thunks.
gcc/cp/ChangeLog:
* module.cc (post_load_processing): Mark COMDAT as needed, emit
declarations if maybe_clone_body fails.
gcc/testsuite/ChangeLog:
* g++.dg/modules/clone-2_a.C: New test.
* g++.dg/modules/clone-2_b.C: New test.
* g++.dg/modules/clone-3_a.C: New test.
* g++.dg/modules/clone-3_b.C: New test.
Nathaniel Shead [Tue, 20 Aug 2024 14:50:53 +0000 (00:50 +1000)]
c++/modules: Prevent maybe_clone_decl being called multiple times [PR115007]
The ICE in the linked PR is caused because maybe_clone_decl is not
prepared to be called on a declaration that has already had clones
created; what happens otherwise is that start_preparsed_function early
exits and never sets up cfun, causing a segfault later on.
To fix this we ensure that post_load_processing only calls
maybe_clone_decl if TREE_ASM_WRITTEN has not been marked on the
declaration yet, and (if maybe_clone_decl succeeds) marks this flag on
the decl so that it doesn't get called again later when finalising
deferred vague linkage declarations in c_parse_final_cleanups.
As a bonus this now allows us to only keep the DECL_SAVED_TREE around in
expand_or_defer_fn_1 for modules which have CMIs, which will have
benefits for LTO performance in non-interface TUs.
For clarity we also update the streaming code to do post_load_decls for
maybe in-charge cdtors rather than any DECL_ABSTRACT_P declaration, as
this is more accurate to the decls affected by maybe_clone_body.
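The guard described in the second paragraph amounts to a check-then-mark pattern. The sketch below models it with a plain struct; Decl, clone_body and post_load are hypothetical stand-ins, with the written flag playing the role the message gives to TREE_ASM_WRITTEN.

```cpp
#include <cassert>

// Hypothetical stand-in for a function declaration.
struct Decl {
    bool written = false;  // plays the role of TREE_ASM_WRITTEN
    int clone_calls = 0;   // how often the cloning step ran
};

// Stand-in for the cloning step; returns true when clones were made.
bool clone_body(Decl& d) {
    ++d.clone_calls;
    return true;
}

// Run the cloning step at most once per declaration: skip it when the
// flag is already set, and set the flag after a successful run.
void post_load(Decl& d) {
    if (d.written)
        return;
    if (clone_body(d))
        d.written = true;
}
```

A second call to post_load on the same declaration is then a no-op, which is the property the fix relies on.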
PR c++/115007
gcc/cp/ChangeLog:
* module.cc (module_state::read_cluster): Replace
DECL_ABSTRACT_P with DECL_MAYBE_IN_CHARGE_CDTOR_P.
(post_load_processing): Check and mark TREE_ASM_WRITTEN.
* semantics.cc (expand_or_defer_fn_1): Use the more specific
module_maybe_has_cmi_p instead of modules_p.
gcc/testsuite/ChangeLog:
* g++.dg/modules/virt-6_a.C: New test.
* g++.dg/modules/virt-6_b.C: New test.
Nathaniel Shead [Tue, 20 Aug 2024 14:42:42 +0000 (00:42 +1000)]
c++: Handle ABI for non-polymorphic dynamic classes
The Itanium ABI has specific rules for when virtual tables for dynamic
classes should be emitted. However we didn't consider structures with
virtual inheritance but no virtual members as dynamic classes for ABI
purposes; this patch fixes this.
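A small example of the kind of class the message refers to: it uses virtual inheritance but declares no virtual functions, so it is not polymorphic in the language sense even though it still needs hidden bookkeeping for the virtual base. The type names are hypothetical, and the size comparison is an observation that holds on typical Itanium-ABI targets rather than a guarantee of the C++ standard.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
    int value;
};

// Virtual inheritance, but no virtual member functions.
struct Derived : virtual Base {
};

// Language-level view: no virtual functions, so not polymorphic.
constexpr bool derived_is_polymorphic = std::is_polymorphic<Derived>::value;

// Layout view: the object is larger than its only data member because of
// the hidden pointer used to locate the virtual base.
constexpr bool derived_has_hidden_pointer = sizeof(Derived) > sizeof(Base);
```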
gcc/cp/ChangeLog:
* decl2.cc (import_export_class): Use TYPE_CONTAINS_VPTR_P
instead of TYPE_POLYMORPHIC_P.
(import_export_decl): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/virt-5_a.C: New test.
* g++.dg/modules/virt-5_b.C: New test.
Georg-Johann Lay [Tue, 22 Oct 2024 09:51:44 +0000 (11:51 +0200)]
AVR: target/116953 - Restore recog_data after calling jump_over_one_insn_p.
The previous fix for PR116953 is incomplete because references to
recog_data are escaping avr_out_sbxx_branch() in the form of %-operands
in the returned asm code template. This patch reverts the previous fix,
and re-extracts the operands by means of extract_constrain_insn_cached()
after the call of jump_over_one_insn_p().
PR target/116953
gcc/
* config/avr/avr.cc (avr_out_sbxx_branch): Revert previous fix
for PR116953 (r15-4078). Run extract_constrain_insn_cached
on the current insn after calling jump_over_one_insn_p.
David Malcolm [Thu, 24 Oct 2024 19:52:29 +0000 (15:52 -0400)]
analyzer: avoid implicit use of global_dc's pretty_printer [PR116613]
Previously, various places in the analyzer generated message strings
by cloning the diagnostic_context's pretty_printer, printing to that
pretty_printer's buffer, and then returning a copy of the buffer
contents.
This implicit use of a particular pretty printer doesn't work well for
the "multiple diagnostic output formats" case (PR other/116613), such as
differences in colorization, or in how phase 3 of formatting works.
Hence as enabling work towards that, the following patch reworks the
various functions returning a label_text string in favor of functions
that print to a specific pretty_printer, such as diagnostic_event's
"get_desc" vfunc, which becomes "print_desc". This makes the particular
pretty_printer in use explicit in each case.
Previously, the various pending_diagnostic::describe_* vfuncs returned a
label_text, with the return of an empty string signifying that no
description could be generated. With this patch, these vfuncs gain a
"pretty_printer &" param and a bool return value and now either print to
the pretty_printer and return true, or return false to signify the
"no description available" case.
No functional change intended.
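The shape of the new interface can be sketched as below: instead of returning a string, the virtual function prints into a printer supplied by the caller and reports through its bool result whether a description was available. Printer, Diagnostic, UseAfterFree and describe_with are hypothetical, heavily simplified stand-ins; GCC's real pretty_printer and pending_diagnostic classes are much richer.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal printer; not GCC's pretty_printer.
struct Printer {
    std::string buffer;
    void print(const std::string& s) { buffer += s; }
};

// New-style interface: print to the caller's printer and return true,
// or return false when no description is available (the base default).
struct Diagnostic {
    virtual ~Diagnostic() = default;
    virtual bool describe_final_event(Printer&) const { return false; }
};

struct UseAfterFree : Diagnostic {
    bool describe_final_event(Printer& pp) const override {
        pp.print("use after free here");
        return true;
    }
};

// The caller chooses the printer, so each output format can pass its own.
std::string describe_with(const Diagnostic& d, Printer& pp) {
    if (!d.describe_final_event(pp))
        pp.print("(no description)");
    return pp.buffer;
}
```

Because the printer is an explicit argument, two output formats can describe the same event with two differently configured printers.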
gcc/analyzer/ChangeLog:
PR other/116613
* bounds-checking.cc
(concrete_buffer_overflow::describe_final_event): Convert return
type from label_text to bool. Add "pp" param and either print to
it and return true, or return false.
(concrete_buffer_overflow::describe_final_event_as_bytes): Convert
to print to a pp rather than returning a label_text.
(concrete_buffer_overflow::describe_final_event_as_bits):
Likewise.
(class concrete_buffer_over_read): Analogous changes to above.
(class concrete_buffer_underwrite): Likewise.
(class concrete_buffer_under_read): Likewise.
(class symbolic_buffer_overflow): Likewise.
(class symbolic_buffer_over_read): Likewise.
* call-details.cc (class overlapping_buffers): Likewise.
* call-info.cc (call_info::print): Reimplement.
(class call_info::add_events_to_path::call_event): Convert
"get_desc" vfunc to "print_desc", dropping return type, adding
"pp" param, and printing to it.
(class succeed_or_fail_call_info): Likewise.
* call-info.h (class call_info): Likewise.
(class succeed_or_fail_call_info): Likewise.
* checker-event.cc (checker_event::dump): Reimplement.
(checker_event::prepare_for_emission): Update for change from
get_desc to print_desc.
(debug_event::get_desc): Convert to...
(debug_event::print_desc): ...this.
(precanned_custom_event::get_desc): Convert to...
(precanned_custom_event::print_desc): ...this.
(statement_event::get_desc): Convert to...
(statement_event::print_desc): ...this.
(region_creation_event_memory_space::get_desc): Convert to...
(region_creation_event_memory_space::print_desc): ...this.
(region_creation_event_capacity::get_desc): Convert to...
(region_creation_event_capacity::print_desc): ...this.
(region_creation_event_allocation_size::get_desc): Convert to...
(region_creation_event_allocation_size::print_desc): ...this.
(region_creation_event_debug::get_desc): Convert to...
(region_creation_event_debug::print_desc): ...this.
(function_entry_event::get_desc): Convert to...
(function_entry_event::print_desc): ...this.
(state_change_event::get_desc): Convert to...
(state_change_event::print_desc): ...this.
(state_change_event::get_meaning): Update for change to
pending_diagnostic::get_meaning_for_state_change.
(superedge_event::should_filter_p): Convert from usage of get_desc
to print_desc.
(start_cfg_edge_event::get_desc): Convert to...
(start_cfg_edge_event::print_desc): ...this.
(call_event::get_desc): Convert to...
(call_event::print_desc): ...this.
(return_event::get_desc): Convert to...
(return_event::print_desc): ...this.
(start_consolidated_cfg_edges_event::get_desc): Convert to...
(start_consolidated_cfg_edges_event::print_desc): ...this.
(inlined_call_event::get_desc): Convert to...
(inlined_call_event::print_desc): ...this.
(setjmp_event::get_desc): Convert to...
(setjmp_event::print_desc): ...this.
(rewind_from_longjmp_event::get_desc): Convert to...
(rewind_from_longjmp_event::print_desc): ...this.
(rewind_to_setjmp_event::get_desc): Convert to...
(rewind_to_setjmp_event::print_desc): ...this.
(warning_event::get_desc): Convert to...
(warning_event::print_desc): ...this.
* checker-event.h: Convert the various "get_desc" vfunc decls to
"print_desc".
* checker-path.cc (checker_path::dump): Convert to usage of
checker_event::print_desc.
(checker_path::debug): Convert to debug form of
checker_event::get_desc.
* diagnostic-manager.cc
(diagnostic_manager::prune_interproc_events): Likewise.
(diagnostic_manager::prune_system_headers): Likewise.
* engine.cc (call_summary_edge_info::get_desc): Convert to...
(call_summary_edge_info::print_desc): ...this.
(stale_jmp_buf::describe_final_event): Update for change to
this vfunc.
(tainted_args_function_custom_event::get_desc): Convert to...
(tainted_args_function_custom_event::print_desc): ...this.
(tainted_args_field_custom_event::get_desc): Convert to...
(tainted_args_field_custom_event::print_desc): ...this.
(tainted_args_callback_custom_event::get_desc): Convert to...
(tainted_args_callback_custom_event::print_desc): ...this.
(jump_through_null::describe_final_event): Update for change to
this vfunc.
* infinite-loop.cc (perpetual_start_cfg_edge_event::get_desc):
Convert to...
(perpetual_start_cfg_edge_event::print_desc): ...this.
(looping_back_event::get_desc): Convert to...
(looping_back_event::print_desc): ...this.
(looping_back_event::describe_final_event): Update for change to
this vfunc.
* infinite-recursion.cc (class infinite_recursion_diagnostic):
Update for changes to pending_diagnostic.
* kf.cc (class putenv_of_auto_var): Likewise.
(kf_realloc::impl_call_post): Update for changes to call_info.
(kf_strchr::impl_call_post): Likewise.
(kf_strncpy::impl_call_post): Likewise.
(kf_strstr::impl_call_post): Likewise.
(class kf_strtok::undefined_behavior): Update for changes to
pending_diagnostic.
(class strtok_call_info): Update for changes to call_info.
* pending-diagnostic.cc (evdesc::event_desc::formatted_print):
Delete.
* pending-diagnostic.h (struct event_desc): Delete.
(struct state_change): Drop event_desc base class.
(struct call_with_state): Likewise.
(struct return_of_state): Likewise.
(struct final_event): Likewise.
(pending_event::describe_state_change): Convert return
type from label_text to bool. Add "pp" param and either print to
it and return true, or return false. Do the latter for the base
class implementation.
(pending_event::describe_call_with_state): Likewise.
(pending_event::describe_return_of_state): Likewise.
(pending_event::describe_final_event): Likewise.
* region-model.cc
(poisoned_value_diagnostic::describe_final_event): Update for
change to this vfunc.
(shift_count_negative_diagnostic::describe_final_event): Likewise.
(shift_count_overflow_diagnostic::describe_final_event): Likewise.
(ptrdiff_region_creation_event::get_desc): Convert to...
(ptrdiff_region_creation_event::print_desc): ...this.
(undefined_ptrdiff_diagnostic::describe_final_event): Update for
change to this vfunc.
(write_to_const_diagnostic::describe_final_event): Likewise.
(write_to_string_literal_diagnostic::describe_final_event):
Likewise.
(dubious_allocation_size::describe_final_event): Likewise.
(null_terminator_check_event::get_desc): Convert to...
(null_terminator_check_event::print_desc): ...this.
(float_as_size_arg::describe_final_event): Update for change to
this vfunc.
(exposure_through_uninit_copy::describe_final_event): Likewise.
* sm-fd.cc: Include "diagnostic-core.h". Update throughout for
changes to pending_diagnostic vfuncs.
* sm-file.cc: Likewise.
* sm-malloc.cc: Likewise.
* sm-sensitive.cc: Likewise.
* sm-signal.cc: Likewise.
* sm-taint.cc: Likewise.
* varargs.cc: Likewise.
gcc/ChangeLog:
PR other/116613
* diagnostic-format-json.cc (make_json_for_path): Add "ref_pp"
param and use when obtaining event descriptions.
(json_output_format::on_report_diagnostic): Pass this format's
printer as the above.
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Clone this format's printer
and use it to obtain the text of the message.
* diagnostic-path.cc: Include "pretty-print-markup.h".
(diagnostic_event::get_desc): New.
(path_label::get_text): Update for changes to diagnostic_event.
(event_range::print): Likewise.
(class element_event_desc): New.
(diagnostic_text_output_format::print_path): Update for changes to
diagnostic_event.
* diagnostic-path.h (diagnostic_event::get_desc): Replace with...
(diagnostic_event::print_desc): ...this.
(diagnostic_event::get_desc): Add this back for debugging, without
the bool param.
* pretty-print.cc (pp_printf_n): New.
* pretty-print.h (pp_printf_n): New decl.
* selftest-diagnostic-path.h (test_diagnostic_event::get_desc):
Convert to...
(test_diagnostic_event::print_desc): ...this.
* simple-diagnostic-path.cc (simple_diagnostic_event::print_desc):
New.
(selftest::test_intraprocedural_path): Use debug form of get_desc.
* simple-diagnostic-path.h (simple_diagnostic_event::get_desc):
Convert to...
(simple_diagnostic_event::print_desc): ...this, moving
implementation to test_diagnostic_event.
gcc/testsuite/ChangeLog:
PR other/116613
* gcc.dg/plugin/analyzer_cpython_plugin.c: Convert call outcomes
from "get_desc" to "print_desc".
* gcc.dg/plugin/analyzer_gil_plugin.c: Update for changes to
pending_diagnostic vfuncs.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
but it turns out the tail calls in question are not the ones that test
is actually checking for. Rather, when () is interpreted as (void) in
C23 mode, ICF notices that certain functions are identical and so
turns test_indirect_2 into a tail call to test_indirect_1 and
test_indirect_casted_2 into a tail call to test_indirect_casted_1
(which it didn't do previously when one function used () and one used
(void)).
To avoid these spurious failures, make the test use -fno-ipa-icf
rather than relying on () and (void) giving different function types
to avoid ICF.
[...]
In file included from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:0:
../../source-gcc/gcc/diagnostic.h:29:3: error: #error "You must define INCLUDE_MEMORY before including system.h to use diagnostic.h"
# error "You must define INCLUDE_MEMORY before including system.h to use diagnostic.h"
^
In file included from ../../source-gcc/gcc/diagnostic.h:34:0,
from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:
../../source-gcc/gcc/pretty-print.h:29:3: error: #error "You must define INCLUDE_MEMORY before including system.h to use pretty-print.h"
# error "You must define INCLUDE_MEMORY before including system.h to use pretty-print.h"
^
In file included from ../../source-gcc/gcc/diagnostic.h:34:0,
from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:
../../source-gcc/gcc/pretty-print.h:280:16: error: 'unique_ptr' in namespace 'std' does not name a template type
virtual std::unique_ptr<pretty_printer> clone () const;
^
In file included from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:0:
../../source-gcc/gcc/diagnostic.h:585:32: error: 'std::unique_ptr' has not been declared
void set_output_format (std::unique_ptr<diagnostic_output_format> output_format);
^
[...]
Dimitar Dimitrov [Thu, 24 Oct 2024 16:59:42 +0000 (19:59 +0300)]
testsuite: Require effective target pie for pr113197
The test for PR113197 explicitly enables PIE. But targets without PIE
emit warnings when -fpie is passed (e.g. pru and avr), which causes the
test to fail.
Fix by adding an effective target requirement for PIE.
With this patch, the test now is marked as unsupported for
pru-unknown-elf. Testing for x86_64-pc-linux-gnu passes with current
mainline, and fails if the fix from r15-4018-g02f4efe3c12cf7 is
reverted.
This removes the overload of __throw_bad_variant_access that must be
called with a string literal. This avoids a potential source of
undefined behaviour if that function got misused. The other overload
that takes a bool parameter can be adjusted to take an integer index
selecting one of the four possible string literals to use, ensuring
that the std::bad_variant_access constructor is only called with those
literals.
Passing an index outside the range [0,3] is bogus, but will still select
a valid string literal and avoid undefined behaviour.
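The index-based selection described above can be sketched roughly as follows. This is only an illustration, not the libstdc++ code: the name select_message, the placeholder strings and the masking with "& 3" are all assumptions made for the example.

```cpp
// Hypothetical stand-in for the helper described above; the real
// libstdc++ function and its message texts may differ.
constexpr const char *select_message (unsigned index)
{
  // Placeholder literals, not the strings used by libstdc++.
  constexpr const char *messages[4] = {
    "reason 0", "reason 1", "reason 2", "reason 3"
  };
  // Masking keeps any index inside [0,3], so an out-of-range argument
  // still yields one of the four literals (assumed technique).
  return messages[index & 3u];
}
```

Because the function can only ever return one of the literals in the table, a caller cannot hand an arbitrary pointer to the exception constructor.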
libstdc++-v3/ChangeLog:
* include/std/variant (__throw_bad_variant_access(unsigned)):
Define new function as inline friend, with namespace-scope
declaration using noreturn attribute.
(__throw_bad_variant_access(const char*)): Remove.
(__throw_bad_variant_access(bool)): Remove.
(visit, visit<R>): Adjust calls to __throw_bad_variant_access.
David Malcolm [Thu, 24 Oct 2024 15:48:01 +0000 (11:48 -0400)]
Use unique_ptr in more places in pretty_printer/diagnostics [PR116613]
My forthcoming patches for PR other/116613 make much more use of
cloning of pretty_printers than before, so it makes sense as a
preliminary patch for the result of pretty_printer::clone to be a
std::unique_ptr, rather than add more manual uses of "delete".
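As a rough illustration of the ownership change, the sketch below uses a hypothetical sketch_printer class (not GCC's pretty_printer) whose clone returns a std::unique_ptr. It uses std::make_unique from C++14 for brevity, which is an assumption of the example rather than what the patch itself calls.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a printer class; not GCC's pretty_printer.
class sketch_printer
{
public:
  virtual ~sketch_printer () = default;

  // Returning std::unique_ptr makes ownership of the copy explicit,
  // so callers never need a manual "delete".
  virtual std::unique_ptr<sketch_printer> clone () const
  {
    return std::make_unique<sketch_printer> (*this);
  }

  std::string text;
};

// Clone a printer, write to the copy, and return what the copy holds.
inline std::string format_with_clone (const sketch_printer &ref)
{
  std::unique_ptr<sketch_printer> pp = ref.clone ();
  pp->text += "event";
  return pp->text;   // The clone is destroyed automatically on return.
}
```

The reference printer is left untouched; only the short-lived clone is modified.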
On doing so, I noticed various other places where naked new/delete is
used for run-time configuration of diagnostics:
* the output format (text vs SARIF)
* client data hooks
* the option manager
* the URLifier
Hence this patch also makes use of std::unique_ptr and ::make_unique for
managing such client policy classes, and also for diagnostic_buffer's
per-format implementations.
Unfortunately we can't directly include <memory> in our internal headers
but instead any of our TUs that make use of std::unique_ptr must #define
INCLUDE_MEMORY before including system.h.
Hence the bulk of this patch is taken up with adding a define of
INCLUDE_MEMORY to hundreds of source files: everything that includes
diagnostic.h or pretty-print.h (and thus anything transitively such as
includers of lto-wrapper.h, c-tree.h, cp-tree.h and rtl-ssa.h).
Thanks to Gaius Mulley for the parts of the patch that regenerated the
m2 files.
Co-authored-by: Gaius Mulley <gaiusmod2@gmail.com>
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Ricardo Jesus [Mon, 14 Oct 2024 13:28:02 +0000 (14:28 +0100)]
aarch64: libstdc++: Use shufflevector instead of shuffle in opt_random.h
This patch modifies the implementation of the vectorized mersenne
twister random number generator to use __builtin_shufflevector instead
of __builtin_shuffle. This makes it (almost) compatible with Clang.
To make the implementation fully compatible with Clang, Clang will need
to support internal Neon types like __Uint8x16_t and __Uint32x4_t, which
currently it does not. This looks like an oversight in Clang and so will
be addressed separately.
I see no codegen change with this patch.
Bootstrapped and tested on aarch64-none-linux-gnu.
libstdc++-v3/ChangeLog:
* config/cpu/aarch64/opt/ext/opt_random.h (__VEXT): Replace uses
of __builtin_shuffle with __builtin_shufflevector.
(__aarch64_lsl_128): Move shift amount to a template parameter.
(__aarch64_lsr_128): Move shift amount to a template parameter.
(__aarch64_recursion): Update call sites of __aarch64_lsl_128
and __aarch64_lsr_128.
Record nonzero bits in the irange_bitmask of POLY_INT_CSTs
At the moment, ranger punts entirely on POLY_INT_CSTs. Numerical
ranges are a bit difficult, unless we do start modelling bounds on
the indeterminates. But we can at least track the nonzero bits.
gcc/
* value-query.cc (range_query::get_tree_range): Use get_nonzero_bits
to populate the irange_bitmask of a POLY_INT_CST.
gcc/testsuite/
* gcc.target/aarch64/sve/cnt_fold_6.c: New test.
This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2
when the low C1 bits of X are known to be zero. As with the earlier
X >> C1 << (C2 + C1) patch, any single conversion is allowed between
the shift and the multiplication.
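The identity behind this rule can be checked with a small sketch on 32-bit unsigned values. The helper names are hypothetical and the code only models the arithmetic, not the match.pd pattern itself.

```cpp
#include <cstdint>

// Left-hand side of the rule: (X >> C1) * (C2 << C1).
constexpr std::uint32_t mul_before (std::uint32_t x, unsigned c1,
                                    std::uint32_t c2)
{
  return (x >> c1) * (c2 << c1);
}

// Right-hand side of the rule: X * C2.
constexpr std::uint32_t mul_after (std::uint32_t x, std::uint32_t c2)
{
  return x * c2;
}

// True when the low C1 bits of X are zero, the precondition of the rule.
constexpr bool low_bits_zero (std::uint32_t x, unsigned c1)
{
  return (x & ((std::uint32_t{1} << c1) - 1)) == 0;
}
```

With X = 0x50 and C1 = 4 the precondition holds and both sides agree; with X = 0x51 the shift discards a set bit and the two sides differ.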
gcc/
* match.pd: Simplify (X >> C1) * (C2 << C1) -> X * C2 if the
low C1 bits of X are zero.
This patch extends get_nonzero_bits to handle POLY_INT_CSTs.
The easiest (but also most useful) case is that the number
of trailing zeros in the runtime value is at least the number
of trailing zeros in each individual component.
In principle, we could do this for coeffs 1 and above only,
and then OR in coeff 0. This would give ~0x11 for [14, 32], say.
But that's future work.
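The trailing-zeros bound can be illustrated with a two-coefficient model, where the runtime value is c0 + c1 * x for some unknown x. The helper names and the restriction to two coefficients are assumptions of this sketch, not the GCC implementation.

```cpp
#include <cstdint>

// Count trailing zero bits; a zero value is treated as having 64 of them.
constexpr unsigned trailing_zeros (std::uint64_t v)
{
  if (v == 0)
    return 64;
  unsigned n = 0;
  while ((v & 1) == 0)
    {
      v >>= 1;
      ++n;
    }
  return n;
}

// Trailing zeros guaranteed for c0 + c1 * x, whatever x turns out to be:
// the minimum over the individual coefficients.
constexpr unsigned poly_known_trailing_zeros (std::uint64_t c0,
                                              std::uint64_t c1)
{
  return trailing_zeros (c0) < trailing_zeros (c1)
         ? trailing_zeros (c0) : trailing_zeros (c1);
}

// Check the bound for one concrete value of the indeterminate x.
constexpr bool bound_holds (std::uint64_t c0, std::uint64_t c1,
                            std::uint64_t x)
{
  return trailing_zeros (c0 + c1 * x) >= poly_known_trailing_zeros (c0, c1);
}
```

For [14, 32] this model guarantees only one trailing zero bit, which is the conservative result the paragraph above describes.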
This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2
when the low C1 bits of X are known to be zero.
Any single conversion can take place between the shifts. E.g. for
a truncating conversion, any extra bits of X that are preserved by
truncating after the shift are immediately lost by the shift left.
And the sign bits used for an extending conversion are the same as
the sign bits used for the rshift. (A double conversion of say
int->unsigned->uint64_t would be wrong though.)
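A small sketch of the identity, including the single truncating conversion case, is given below. It models the arithmetic on fixed-width unsigned integers with hypothetical helper names and is not the match.pd pattern.

```cpp
#include <cstdint>

// Left-hand side of the rule: (X >> C1) << (C1 + C2).
constexpr std::uint32_t shl_before (std::uint32_t x, unsigned c1, unsigned c2)
{
  return (x >> c1) << (c1 + c2);
}

// Right-hand side of the rule: X << C2.
constexpr std::uint32_t shl_after (std::uint32_t x, unsigned c2)
{
  return x << c2;
}

// Same rule with one truncating conversion between the two shifts.
constexpr std::uint32_t shl_before_trunc (std::uint64_t x, unsigned c1,
                                          unsigned c2)
{
  return static_cast<std::uint32_t> (x >> c1) << (c1 + c2);
}

// Simplified form: truncate first, then shift left by C2 only.
constexpr std::uint32_t shl_after_trunc (std::uint64_t x, unsigned c2)
{
  return static_cast<std::uint32_t> (x) << c2;
}
```

In the truncating case with X = 0xF00000050, the right shift moves four high bits into the 32-bit result, but the following left shift by C1 + C2 pushes them out again, so both forms produce 0x140.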
gcc/
* match.pd: Simplify (X >> C1) << (C1 + C2) -> X << C2 if the
low C1 bits of X are zero.
gcc/testsuite/
* gcc.dg/tree-ssa/shifts-1.c: New test.
* gcc.dg/tree-ssa/shifts-2.c: Likewise.
Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B
when A and B are INTEGER_CSTs. This patch extends it to handle the
case where the outer multiplication is by a factor of A, not just
A itself. It also handles addition and multiplication of poly_ints.
(Exact division by a poly_int seems unlikely.)
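The generalised identity can be checked numerically with a short sketch. The helper names are hypothetical; a negative C2 stands in for the "-" variant of the rule, and the code assumes that C1 divides X exactly, as /[ex] requires.

```cpp
#include <cstdint>

// Left-hand side: ((X /[ex] C1) + C2) * (C1 * C3).
constexpr std::int64_t exdiv_before (std::int64_t x, std::int64_t c1,
                                     std::int64_t c2, std::int64_t c3)
{
  return (x / c1 + c2) * (c1 * c3);
}

// Right-hand side: (X * C3) + (C1 * C2 * C3).
constexpr std::int64_t exdiv_after (std::int64_t x, std::int64_t c1,
                                    std::int64_t c2, std::int64_t c3)
{
  return x * c3 + c1 * c2 * c3;
}
```

For X = 42, C1 = 6, C2 = 5 and C3 = 4 both sides evaluate to 288. If the division is not exact (for example X = 43) the two sides differ, which is why the rule is limited to exact division.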
gcc/
* match.pd: Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
to ((X /[ex] C1) +- C2) * (C1 * C3) -> (X * C3) +- (C1 * C2 * C3).
I tried to look for places where we were handling TRUNC_DIV_EXPR
more favourably than EXACT_DIV_EXPR.
Most of the places that I looked at but didn't change were handling
div/mod pairs. But there's bound to be others I missed...
gcc/
* match.pd: Extend some rules to handle exact_div like trunc_div.
* tree.h (trunc_or_exact_div_p): New function.
* tree-ssa-loop-niter.cc (is_rshift_by_1): Use it.
* tree-ssa-loop-ivopts.cc (force_expr_to_var_cost): Handle
EXACT_DIV_EXPR.
Andrew MacLeod [Wed, 23 Oct 2024 14:59:13 +0000 (10:59 -0400)]
Implement pointer_or_operator.
The class pointer_or is no longer used, and can be removed. Its
functionality was never moved to the new dispatch system.
This implements operator_bitwise_or::fold_range() for prange operands.
Andrew MacLeod [Mon, 21 Oct 2024 22:20:10 +0000 (18:20 -0400)]
Remove pointer_and_operator.
This operator class predates the dispatch system, and is no longer used.
The functionality of wi_fold has been replaced by
operator_bitwise_and::fold_range with prange operands.
Andrew MacLeod [Mon, 21 Oct 2024 22:11:43 +0000 (18:11 -0400)]
Remove pointer_min_max_operator.
The pointer_min_max_operator class was used before the current dispatch
system was created. These operations have been transferred to
operator_min::fold_range () and operator_max::fold_range () with prange
operands.
This class is no longer used for anything, delete it.
Jakub Jelinek [Thu, 24 Oct 2024 10:56:19 +0000 (12:56 +0200)]
c++: Further fix for get_member_function_from_ptrfunc [PR117259]
The following testcase shows that the previous get_member_function_from_ptrfunc
changes weren't sufficient and we still have cases where
-fsanitize=undefined with pointers to member functions can cause wrong code
being generated and related false positive warnings.
The problem is that save_expr doesn't always create SAVE_EXPR, it can skip
some invariant arithmetics and in the end it could be really large
expressions which would be evaluated several times (and what is worse, with
-fsanitize=undefined those expressions then can have SAVE_EXPRs added to
their subparts for -fsanitize=bounds or -fsanitize=null or
-fsanitize=alignment instrumentation). Tried to just build1 a SAVE_EXPR
+ add TREE_SIDE_EFFECTS instead of save_expr, but that doesn't work either,
because cp_fold happily optimizes those SAVE_EXPRs away when it sees
SAVE_EXPR operand is tree_invariant_p.
So, the following patch instead of using save_expr or building SAVE_EXPR
manually builds a TARGET_EXPR. Both types are pointers, so it doesn't need
to be destroyed in any way, but TARGET_EXPR is what doesn't get optimized
away immediately.
2024-10-24 Jakub Jelinek <jakub@redhat.com>
PR c++/117259
* typeck.cc (get_member_function_from_ptrfunc): Use force_target_expr
rather than save_expr for instance_ptr and function. Don't call it
for TREE_CONSTANT.
Jakub Jelinek [Thu, 24 Oct 2024 10:45:34 +0000 (12:45 +0200)]
asan: Fix up build_check_stmt gsi handling [PR117209]
gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator
in case it splits objects, but unfortunately build_check_stmt was in
some places (but not others) using a copy of the iterator rather than
the iterator passed from callers and so didn't propagate that to callers.
I guess it didn't matter much before when it was just using
gsi_insert_before as that really didn't change the iterator.
The !before_p case is apparently dead code, nothing is calling it with
before_p=false since around 4.9.
2024-10-24 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/117209
* asan.cc (maybe_cast_to_ptrmode): Formatting fix.
(build_check_stmt): Don't copy *iter into gsi, perform all
the updates on iter directly.
Jennifer Schmitz [Thu, 17 Oct 2024 09:31:47 +0000 (02:31 -0700)]
SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.
A common idiom in intrinsics loops is to have accumulator intrinsics
in an unrolled loop with an accumulator initialized to zero at the beginning.
Propagating the initial zero accumulator into the first iteration
of the loop and simplifying the first accumulate instruction is a
desirable transformation that we should teach GCC.
Therefore, this patch folds svsra to svlsr/svasr if op1 is all zeros,
producing the lower latency instructions LSR/ASR instead of USRA/SSRA.
We implemented this optimization in svsra_impl::fold.
Tests were added to check the produced assembly for use of LSR/ASR.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svsra_impl::fold): Fold svsra to svlsr/svasr if op1 is all zeros.
Soumya AR [Thu, 17 Oct 2024 04:00:35 +0000 (09:30 +0530)]
SVE intrinsics: Fold constant operands for svlsl.
This patch implements constant folding for svlsl. Test cases have been added to
check for the following cases:
Zero, merge, and don't care predication.
Shift by 0.
Shift by register width.
Overflow shift on signed and unsigned integers.
Shift on a negative integer.
Maximum possible shift, e.g. shift by 7 on an 8-bit integer.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold):
Try constant folding.
* config/aarch64/aarch64-sve-builtins.cc (aarch64_const_binop):
Return 0 if shift is out of range.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/const_fold_lsl_1.c: New test.
SVE intrinsics: Fold division and multiplication by -1 to neg
Because a neg instruction has lower latency and higher throughput than
sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv,
this is already implemented on the RTL level; for svmul, the
optimization was still missing.
This patch implements folding to svneg for both operations using the
gimple_folder. For svdiv, the transform is applied if the divisor is -1.
Svmul is folded if either of the operands is -1. A case distinction of
the predication is made to account for the fact that svneg_m has 3 arguments
(argument 0 holds the values for the inactive lanes), while svneg_x and
svneg_z have only 2 arguments.
Tests were added or adjusted to check the produced assembly and runtime
tests were added to check correctness.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
Fold division by -1 to svneg.
(svmul_impl::fold): Fold multiplication by -1 to svneg.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
* gcc.target/aarch64/sve/div_const_run.c: New test.
* gcc.target/aarch64/sve/mul_const_run.c: Likewise.
Jennifer Schmitz [Tue, 15 Oct 2024 14:58:14 +0000 (07:58 -0700)]
SVE intrinsics: Add constant folding for svindex.
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
index z0.d, #10, #3
mul z0.d, z0.d, #5
ret
Now, it is compiled to
f2:
mov x0, 50
index z0.d, x0, #15
ret
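The arithmetic behind the f2 result can be modelled on scalars: lane i of a series is base + i * step, so multiplying every lane by 5 gives another series with base 50 and step 15. The sketch below uses hypothetical helper names and does not use the SVE intrinsics themselves.

```cpp
#include <cstdint>

// Lane i of a linear series with the given base and step.
constexpr std::uint64_t series_lane (std::uint64_t base, std::uint64_t step,
                                     unsigned i)
{
  return base + step * i;
}

// Check that (10 + 3 * i) * 5 equals 50 + 15 * i for the first LANES lanes.
constexpr bool fold_matches (unsigned lanes)
{
  for (unsigned i = 0; i < lanes; ++i)
    if (series_lane (10, 3, i) * 5 != series_lane (50, 15, i))
      return false;
  return true;
}
```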
We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Wang Pengcheng [Thu, 24 Oct 2024 05:11:53 +0000 (23:11 -0600)]
[PATCH] RISC-V: override alignment of function/jump/loop
Just like what AArch64 has done.
Signed-off-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
gcc/ChangeLog:
* config/riscv/riscv.cc (struct riscv_tune_param): Add new
tune options.
(riscv_override_options_internal): Override the default alignment
when not optimizing for size.
Jakub Jelinek [Thu, 24 Oct 2024 03:21:13 +0000 (21:21 -0600)]
testsuite: Fix up pr116488.c and pr117226.c tests [PR116488]
Hi!
On Mon, Oct 21, 2024 at 01:39:52PM -0600, Jeff Law wrote:
> * gcc.dg/torture/pr116488.c: New test.
> * gcc.dg/torture/pr117226.c: New test.
These two tests FAIL on powerpc64le-linux (and I assume on all other
-funsigned-char defaulting targets).
The following patch fixes that, tested on powerpc64le-linux and
x86_64-linux (-m32/-m64); on x86_64 also tested before/after with
-funsigned-char.
Ok for trunk?
2024-10-22 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/116488
PR rtl-optimization/117226
* gcc.dg/torture/pr116488.c (c, e): Change type from char to
signed char.
* gcc.dg/torture/pr117226.c (main): Change f type from char to
signed char.
Pan Li [Mon, 23 Sep 2024 05:43:50 +0000 (13:43 +0800)]
RISC-V: Add testcases for form 4 of signed vector SAT_ADD
Form 4:
#define DEF_VEC_SAT_S_ADD_FMT_4(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
{ \
T x = op_1[i]; \
T y = op_2[i]; \
T sum; \
bool overflow = __builtin_add_overflow (x, y, &sum); \
out[i] = !overflow ? sum : x < 0 ? MIN : MAX; \
} \
}
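For reference, the loop body of form 4 can be written as a scalar helper. The sketch below fixes T to int8_t and uses a hypothetical function name; it relies on the GCC/Clang builtin __builtin_add_overflow, as the macro does.

```cpp
#include <cstdint>

// Scalar version of the form 4 loop body, fixed to int8_t for illustration.
inline std::int8_t sat_s_add_i8 (std::int8_t x, std::int8_t y)
{
  std::int8_t sum;
  // The builtin reports whether x + y fits in the type of sum.
  bool overflow = __builtin_add_overflow (x, y, &sum);
  // On overflow both operands have the same sign, so the sign of x
  // decides whether to saturate to the minimum or the maximum.
  return !overflow ? sum : x < 0 ? INT8_MIN : INT8_MAX;
}
```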
The tests below passed for this patch.
* The rv64gcv full regression test.
It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-16.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Andrew Pinski [Wed, 23 Oct 2024 23:39:21 +0000 (16:39 -0700)]
aarch64: Fix warning in aarch64_ptrue_reg
After r15-4579-g9ffcf1f193b477, we get the following warning/error while bootstrapping on aarch64:
```
../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* aarch64_ptrue_reg(machine_mode, unsigned int)’:
../../gcc/gcc/config/aarch64/aarch64.cc:3643:21: error: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Werror=sign-compare]
3643 | for (int i = 0; i < vl; i++)
| ~~^~~~
```
This changes the type of i to unsigned to match the type of vl.
Pushed as obvious after a bootstrap/test on aarch64-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_ptrue_reg): Fix type
of induction variable i.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
When internal functions support was added to match (r6-4979-gc9e926ce2bdc8b),
the check for ECF_CONST was only on the builtin function side. Though before r15-4503-g8d6d6d537fdc,
there was no use of maybe_push_res_to_seq with non-const internal functions so the check
would not make a difference.
This adds the check for internal functions just as there is a check for builtins.
Note I didn't add a testcase because there was no non-const internal function
which could be used on x86_64 in a decent manner.
Jonathan Wakely [Tue, 22 Oct 2024 15:26:27 +0000 (16:26 +0100)]
ginclude: stdalign.h should define __xxx_is_defined macros for C++
The __alignas_is_defined macro has been required by C++ since C++11, and
C++ Library DR 4036 clarified that __alignof_is_defined should be
defined too. The whole <stdalign.h> header was deprecated for C++23 (see
LWG 3827) and is likely to be removed for C++26 (see P3348), but we can
deal with that later.
The macros alignas and alignof should not be defined, as they're
keywords in C++.
gcc/ChangeLog:
* ginclude/stdalign.h (__alignas_is_defined): Define for C++.
(__alignof_is_defined): Likewise.
libstdc++-v3/ChangeLog:
* testsuite/18_support/headers/cstdalign/macros.cc: New test.
David Malcolm [Wed, 23 Oct 2024 18:26:38 +0000 (14:26 -0400)]
jit: reset state in varasm.cc [PR117275]
PR jit/117275 reports various jit test failures seen on
powerpc64le-unknown-linux-gnu due to hitting this assertion
in varasm.cc on the 2nd compilation in a process:
#2 0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8)
at ../../src/gcc/varasm.cc:2650
2650 gcc_assert (!pending_assemble_externals_processed);
(gdb) p pending_assemble_externals_processed
$1 = true
We're not properly resetting state in varasm.cc after a compile
for libgccjit.
Pengxuan Zheng [Mon, 14 Oct 2024 12:37:49 +0000 (05:37 -0700)]
aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]
This is similar to the recent improvements to the Advanced SIMD popcount
expansion by using SVE. We can utilize SVE to generate more efficient code for
scalar mode popcount too.
Changes since v1:
* v2: Add a new VNx1BI mode and a new test case for V1DI.
* v3: Abandon VNx1BI changes and add a new variant of aarch64_ptrue_reg.
PR target/113860
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (aarch64_ptrue_reg): New function.
* config/aarch64/aarch64-simd.md (popcount<mode>2): Update pattern to
also support V1DI mode.
* config/aarch64/aarch64.cc (aarch64_ptrue_reg): New function.
* config/aarch64/aarch64.md (popcount<mode>2): Add TARGET_SVE support.
* config/aarch64/iterators.md (VDQHSD_V1DI): New mode iterator.
(SVE_VDQ_I): Add V1DI.
(bitsize): Likewise.
(VPRED): Likewise.
(VEC_POP_MODE): New mode attribute.
(vec_pop_mode): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt-sve.c: Update test.
* gcc.target/aarch64/popcnt11.c: New test.
* gcc.target/aarch64/popcnt12.c: New test.
David Malcolm [Wed, 23 Oct 2024 14:54:42 +0000 (10:54 -0400)]
diagnostics: implement buffering for non-textual formats [PR105916]
PR fortran/105916 reports stray diagnostics appearing in JSON and SARIF
output from gfortran.
In order to handle various awkward parsing issues, the Fortran frontend
implements buffering of diagnostics, so that diagnostics reported to
global_dc can be either:
(a) immediately issued, or
(b) speculatively reported to global_dc, and stored in a buffer, to
either be issued later or discarded.
This buffering code in gcc/fortran/error.cc directly manipulates
implementation details of the diagnostic_context such as the
pretty_printer's buffer, and the counts of how many diagnostics have
been issued. The issue is that this manipulation of pretty_printer's
buffer doesn't work for formats such as JSON and SARIF where diagnostics
are handled in a different way (such as by accumulating json::object
instances in an array).
This patch moves responsibility for such buffering of diagnostics from
fortran's error.cc to the diagnostic subsystem. It introduces a new
class diagnostic_buffer representing a particular buffer of diagnostics
that have been reported but not yet issued. Each diagnostic output
format implements buffering in a different way, and so there is a
new class hierarchy, diagnostic_per_format_buffer, representing the
various format-specific ways that buffering is to be implemented. This
is hidden as an implementation detail of diagnostic_buffer.
The patch also updates how diagnostics of each kind (e.g. warnings vs
errors) are counted, so that if buffering is enabled, the count is
incremented within the buffer, and the counts in the diagnostic_context
are only updated if and when the buffer is flushed; checking for
max_errors is similarly updated to support both buffered and unbuffered
cases.
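The counting scheme described above can be sketched with two small classes. Everything here is hypothetical (sketch_context, sketch_buffer and their members are not GCC's diagnostic_context or diagnostic_buffer); the sketch only shows counts accumulating in the buffer and moving to the context on flush.

```cpp
#include <string>
#include <utility>
#include <vector>

// Per-kind counts, reduced to errors only for the sketch.
struct sketch_counters
{
  int errors = 0;
};

// Diagnostics that have been reported but not yet issued.
struct sketch_buffer
{
  std::vector<std::string> pending;
  sketch_counters counts;

  bool empty_p () const { return pending.empty (); }
};

class sketch_context
{
public:
  // Pass nullptr to disable buffering.
  void set_buffer (sketch_buffer *buffer) { m_buffer = buffer; }

  void report_error (std::string msg)
  {
    if (m_buffer)
      {
        // Buffered: count within the buffer, not within the context.
        m_buffer->pending.push_back (std::move (msg));
        ++m_buffer->counts.errors;
      }
    else
      {
        issued.push_back (std::move (msg));
        ++counts.errors;
      }
  }

  // Issue everything in BUFFER and move its counts into the context.
  void flush (sketch_buffer &buffer)
  {
    for (std::string &msg : buffer.pending)
      issued.push_back (std::move (msg));
    counts.errors += buffer.counts.errors;
    buffer.pending.clear ();
    buffer.counts = {};
  }

  std::vector<std::string> issued;
  sketch_counters counts;

private:
  sketch_buffer *m_buffer = nullptr;
};
```

A speculative error leaves the context's count at zero until the buffer is flushed; discarding the buffer instead would leave no trace in the context.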
For ease of debugging, the patch extends the "dump" functions within the
diagnostics subsystem, so that e.g. global_dc->dump () now prints the
buffering status, e.g.:
which shows that no diagnostics have been issued yet, but the active
diagnostic_buffer has a single error buffered within it, in SARIF form.
Similarly, it's possible to use "dump" on a diagnostic_buffer to directly
query its contents; here's the same example, this time with the text
output format:
showing that we have an error in error_buffer, with colorized text.
gcc/ChangeLog:
PR fortran/105916
* diagnostic-buffer.h: New file.
* diagnostic-format-json.cc: Define INCLUDE_VECTOR. Include
"diagnostic-buffer.h".
(class diagnostic_json_format_buffer): New subclass.
(class json_output_format): Add friend class
diagnostic_json_format_buffer.
(json_output_format::make_per_format_buffer): New vfunc
implementation.
(json_output_format::set_buffer): New vfunc implementation.
(json_output_format::json_output_format): Initialize m_buffer.
(json_output_format::m_buffer): New field.
(diagnostic_json_format_buffer::dump): New.
(diagnostic_json_format_buffer::empty_p): New.
(diagnostic_json_format_buffer::move_to): New.
(diagnostic_json_format_buffer::clear): New.
(diagnostic_json_format_buffer::flush): New.
(json_output_format::on_report_diagnostic): Implement optional
buffering.
* diagnostic-format-sarif.cc: Include "diagnostic-buffer.h".
(class diagnostic_sarif_format_buffer): New subclass.
(class sarif_builder): Add friend
class diagnostic_sarif_format_buffer.
(sarif_builder::num_results): New accessor.
(sarif_builder::get_result): New accessor.
(sarif_builder::on_report_diagnostic): Add param "buffer"; use it
to implement optional buffering.
(diagnostic_sarif_format_buffer::dump): New.
(diagnostic_sarif_format_buffer::empty_p): New.
(diagnostic_sarif_format_buffer::move_to): New.
(diagnostic_sarif_format_buffer::clear): New.
(diagnostic_sarif_format_buffer::flush): New.
(sarif_output_format::make_per_format_buffer): New vfunc
implementation.
(sarif_output_format::set_buffer): New vfunc implementation.
(sarif_output_format::on_report_diagnostic): Pass m_buffer to
sarif_builder::on_report_diagnostic.
(sarif_output_format::num_results): New accessor.
(sarif_output_format::get_result): New accessor.
(diagnostic_output_format::diagnostic_output_format): Initialize
m_buffer.
(diagnostic_output_format::m_buffer): New field.
(diagnostic_output_format::num_results): Get accessor.
(diagnostic_output_format::get_result): Get accessor.
(selftest::get_message_from_result): New.
(selftest::test_buffering): New.
(selftest::diagnostic_format_sarif_cc_tests): Call it.
* diagnostic-format-text.cc: Include
"diagnostic-client-data-hooks.h".
(class diagnostic_text_format_buffer): New subclass.
(diagnostic_text_format_buffer::diagnostic_text_format_buffer):
New.
(diagnostic_text_format_buffer::dump): New.
(diagnostic_text_format_buffer::empty_p): New.
(diagnostic_text_format_buffer::move_to): New.
(diagnostic_text_format_buffer::clear): New.
(diagnostic_text_format_buffer::flush): New.
(diagnostic_text_output_format::dump): Dump m_saved_output_buffer.
(diagnostic_text_output_format::set_buffer): New.
(diagnostic_text_output_format::make_per_format_buffer): New.
* diagnostic-format-text.h
(diagnostic_text_output_format::diagnostic_text_output_format):
Initialize m_saved_output_buffer.
(diagnostic_text_output_format::set_buffer): New decl.
(diagnostic_text_output_format::make_per_format_buffer): New decl.
(diagnostic_text_output_format::m_saved_output_buffer): New field.
* diagnostic-format.h (class diagnostic_per_format_buffer): New
forward decl.
(diagnostic_output_format::make_per_format_buffer): New vfunc.
(diagnostic_output_format::set_buffer): New vfunc.
* diagnostic.cc: Include "diagnostic-buffer.h".
(diagnostic_context::initialize): Replace memset with call to
"clear" on m_diagnostic_counters. Initialize
m_diagnostic_buffer.
(diagnostic_context::finish): Call set_diagnostic_buffer with
nullptr.
(diagnostic_context::dump): Update for encapsulation of counts
into m_diagnostic_counters. Dump m_diagnostic_buffer.
(diagnostic_context::execution_failed_p): Update for encapsulation of
counts into m_diagnostic_counters.
(diagnostic_context::check_max_errors): Likewise.
(diagnostic_context::report_diagnostic): Likewise. Eliminate
diagnostic_check_max_errors in favor of check_max_errors.
Update increment of counter to support buffering. Eliminate
diagnostic_action_after_output in favor of action_after_output.
Only add fixits to m_edit_context_ptr if buffering is disabled.
Only call diagnostic_output_format::after_diagnostic if buffering
is disabled.
(diagnostic_context::error_recursion): Eliminate
diagnostic_action_after_output in favor of action_after_output.
(diagnostic_context::set_diagnostic_buffer): New.
(diagnostic_context::clear_diagnostic_buffer): New.
(diagnostic_context::flush_diagnostic_buffer): New.
(diagnostic_counters::diagnostic_counters): New.
(diagnostic_counters::dump): New.
(diagnostic_counters::move_to): New.
(diagnostic_counters::clear): New.
(diagnostic_buffer::diagnostic_buffer): New.
(diagnostic_buffer::~diagnostic_buffer): New.
(diagnostic_buffer::dump): New.
(diagnostic_buffer::empty_p): New.
(diagnostic_buffer::move_to): New.
(diagnostic_buffer::ensure_per_format_buffer): New.
(c_diagnostic_cc_tests): Remove stray newline.
* diagnostic.h (class diagnostic_buffer): New forward decl.
(struct diagnostic_counters): New.
(diagnostic_context::check_max_errors): Make private.
(diagnostic_context::action_after_output): Make private.
(diagnostic_context::get_output_format): Make non-const.
(diagnostic_context::diagnostic_count): Update for change
to m_diagnostic_counters.
(diagnostic_context::set_diagnostic_buffer): New decl.
(diagnostic_context::get_diagnostic_buffer): New decl.
(diagnostic_context::clear_diagnostic_buffer): New decl.
(diagnostic_context::flush_diagnostic_buffer): New decl.
(diagnostic_context::m_diagnostic_count): Replace array with...
(diagnostic_context::m_diagnostic_counters): ...this.
(diagnostic_context::m_diagnostic_buffer): New field.
(diagnostic_action_after_output): Delete.
(diagnostic_check_max_errors): Delete.
gcc/fortran/ChangeLog:
PR fortran/105916
* error.cc (pp_error_buffer, pp_warning_buffer): Convert from
output_buffer * to diagnostic_buffer *.
(warningcount_buffered, werrorcount_buffered): Eliminate.
(gfc_error_buffer::gfc_error_buffer): Move constructor definition
here, and initialize "buffer" using *global_dc.
(gfc_output_buffer_empty_p): Delete in favor of
diagnostic_buffer::empty_p.
(gfc_clear_pp_buffer): Replace with...
(gfc_clear_diagnostic_buffer): ...this, moving implementation
details to diagnostic_context::clear_diagnostic_buffer.
(gfc_warning): Replace buffering implementation with calls
to global_dc->get_diagnostic_buffer and
global_dc->set_diagnostic_buffer.
(gfc_clear_warning): Update for renaming of gfc_clear_pp_buffer
and elimination of warningcount_buffered and werrorcount_buffered.
(gfc_warning_check): Replace buffering implementation with calls
to pp_warning_buffer->empty_p and
global_dc->flush_diagnostic_buffer.
(gfc_error_opt): Replace buffering implementation with calls to
global_dc->get_diagnostic_buffer and set_diagnostic_buffer.
(gfc_clear_error): Update for renaming of gfc_clear_pp_buffer.
(gfc_error_flag_test): Replace call to gfc_output_buffer_empty_p
with call to diagnostic_buffer::empty_p.
(gfc_error_check): Replace buffering implementation with calls
to pp_error_buffer->empty_p and global_dc->flush_diagnostic_buffer.
(gfc_move_error_buffer_from_to): Replace buffering implementation
with usage of diagnostic_buffer.
(gfc_free_error): Update for renaming of gfc_clear_pp_buffer.
(gfc_diagnostics_init): Use "new" directly when creating
pp_warning_buffer. Remove setting of m_flush_p on the two
buffers, as this is handled by diagnostic_buffer and by
diagnostic_text_format_buffer's constructor.
* gfortran.h: Replace #include "pretty-print.h" for output_buffer
with #include "diagnostic-buffer.h" for diagnostic_buffer.
(struct gfc_error_buffer): Change type of field "buffer" from
output_buffer to diagnostic_buffer. Move definition of constructor
into error.cc so that it can use global_dc.
gcc/testsuite/ChangeLog:
PR fortran/105916
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Include
"diagnostic-buffer.h".
(class diagnostic_xhtml_format_buffer): New subclass.
(class xhtml_builder): Add friend
class diagnostic_xhtml_format_buffer.
(diagnostic_xhtml_format_buffer::dump): New.
(diagnostic_xhtml_format_buffer::empty_p): New.
(diagnostic_xhtml_format_buffer::move_to): New.
(diagnostic_xhtml_format_buffer::clear): New.
(diagnostic_xhtml_format_buffer::flush): New.
(xhtml_builder::on_report_diagnostic): Add "buffer" param, and use
it.
(xhtml_output_format::dump): Fix typo.
(xhtml_output_format::make_per_format_buffer): New.
(xhtml_output_format::set_buffer): New.
(xhtml_output_format::on_report_diagnostic): Fix whitespace. Pass
m_buffer to xhtml_builder::on_report_diagnostic.
(xhtml_output_format::xhtml_output_format): Initialize m_buffer.
(xhtml_output_format::m_buffer): New field.
* gfortran.dg/diagnostic-format-json-pr105916.F90: New test.
* gfortran.dg/diagnostic-format-sarif-1.F90: New test.
* gfortran.dg/diagnostic-format-sarif-1.py: New support script.
* gfortran.dg/diagnostic-format-sarif-pr105916.f90: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
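The buffering entries above can be illustrated with a small toy model. This is only a sketch, not GCC's implementation: the types toy_counters, toy_buffer and toy_context are hypothetical stand-ins, and the only behaviour assumed from the ChangeLog is that counts travel with buffered diagnostics and reach the context only when the buffer is flushed.

```cpp
// Toy model only: these types are hypothetical and are NOT GCC's classes.
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct toy_counters {
  int errors = 0;
  int warnings = 0;

  void clear() { errors = 0; warnings = 0; }

  // Add our counts to DEST, then reset our own.
  void move_to(toy_counters &dest) {
    dest.errors += errors;
    dest.warnings += warnings;
    clear();
  }
};

struct toy_buffer {
  std::vector<std::string> messages;
  toy_counters counters;

  bool empty_p() const { return messages.empty(); }
};

class toy_context {
public:
  // Pass nullptr to disable buffering.
  void set_buffer(toy_buffer *buf) { m_buffer = buf; }
  toy_buffer *get_buffer() const { return m_buffer; }

  // While a buffer is set, both the message and its count go to the buffer.
  void report(bool is_error, std::string msg) {
    toy_counters &c = m_buffer ? m_buffer->counters : m_counters;
    if (is_error)
      ++c.errors;
    else
      ++c.warnings;
    if (m_buffer)
      m_buffer->messages.push_back(std::move(msg));
    else
      m_emitted.push_back(std::move(msg));
  }

  // Emit everything held in BUF and fold its counts into the context.
  void flush(toy_buffer &buf) {
    for (std::string &m : buf.messages)
      m_emitted.push_back(std::move(m));
    buf.messages.clear();
    buf.counters.move_to(m_counters);
  }

  // Discard everything held in BUF without emitting or counting it.
  void clear(toy_buffer &buf) {
    buf.messages.clear();
    buf.counters.clear();
  }

  int error_count() const { return m_counters.errors; }
  int warning_count() const { return m_counters.warnings; }
  std::size_t emitted_count() const { return m_emitted.size(); }

private:
  toy_buffer *m_buffer = nullptr;
  toy_counters m_counters;
  std::vector<std::string> m_emitted;
};
```

In this model a caller sets a buffer, reports diagnostics, and then either flushes the buffer (messages are emitted and counted) or clears it (messages vanish and the context's counts are unaffected).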
Jonathan Wakely [Tue, 22 Oct 2024 20:23:06 +0000 (21:23 +0100)]
libstdc++: Replace std::__to_address in C++20 branch in <string>
As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the
usage of std::__to_address to std::to_address in the C++20-specific
branch that works on types satisfying std::contiguous_iterator.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (assign(Iter, Iter)): Call
std::to_address instead of __to_address.
Paul Thomas [Wed, 23 Oct 2024 13:34:20 +0000 (14:34 +0100)]
Fortran: Generic processing of assumed rank objects (f202y) [PR116733]
2024-10-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/116733
* array.cc : White space corrections.
* expr.cc (gfc_check_pointer_assign): Permit assumed rank
target with -std=f202y. Add constraints that the data pointer
object must have rank remapping specified and that the data
target be contiguous.
* gfortran.h : Add a gfc_array_ref field 'ar' to the structure
'gfc_association_list'.
* interface.cc (gfc_compare_actual_formal): If -Wsurprising
is set, emit a warning if an assumed size array is passed to an
assumed rank dummy.
* intrinsic.cc (do_ts29113_check): Permit an assumed rank argument
for reshape if -std=f202y and the argument is contiguous.
* invoke.texi : Introduce -std=f202y. Whitespace errors.
* lang.opt : Accept -std=f202y.
* libgfortran.h : Define GFC_STD_F202Y.
* match.cc (gfc_match_associate): If -std=f202y an assumed rank
selector is allowed if it is contiguous and the associate name
has rank remapping specified.
* options.cc (gfc_init_options): -std=f202y is equivalent to
-std=f2023 with experimental f202y features. White space issues.
* parse.cc (parse_associate): If the selector is assumed rank,
use the 'ar' field of the association list to build an array
specification.
* primary.cc (gfc_match_varspec): Do not resolve the assumed
rank selector of a class associate name at this stage to avoid
the rank change.
* resolve.cc (find_array_spec): If an array_ref dimension is -1
reset it with the rank in the object's array_spec.
(gfc_expression_rank): Do not check dimen types for an assumed
rank variable expression.
(resolve_variable): Do not emit the assumed rank context error
if the context is pointer assignment and the variable is a
target.
(resolve_assoc_var): Resolve the bounds and check for missing
bounds in the rank remap of an associate name with an assumed
rank selector. Do not correct the rank of an associate name
with an assumed rank selector.
(resolve_symbol): Allow the reference to an assumed rank object
if -std=f202y is enabled and the current operation is
EXEC_BLOCK.
* st.cc (gfc_free_association_list): Free bounds expressions
of the 'ar' field, if present.
* trans-array.cc (gfc_conv_ss_startstride): If -std=f202y and
bounds checking activated, do not apply the assertion.
* trans-expr.cc (gfc_trans_pointer_assignment): An assumed rank
target has its offset set to zero.
* trans-stmt.cc (trans_associate_var): If the selector is
assumed rank, call gfc_trans_pointer_assignment using the 'ar'
field in the association list as the array reference for expr1.
The data target, expr2, is a copy of the selector expression.
gcc/testsuite/
PR fortran/116733
* gfortran.dg/associate_3.f03: Change error message.
* gfortran.dg/f202y/f202y.exp: Enable tests of f202y features.
* gfortran.dg/f202y/generic_assumed_rank_1.f90: New test.
* gfortran.dg/f202y/generic_assumed_rank_2.f90: New test.
* gfortran.dg/f202y/generic_assumed_rank_3.f90: New test.
Wilco Dijkstra [Thu, 17 Oct 2024 14:33:44 +0000 (14:33 +0000)]
AArch64: Remove redundant check in aarch64_simd_mov
The split condition in aarch64_simd_mov uses aarch64_simd_special_constant_p.
While doing the split, it checks the mode before calling
aarch64_maybe_generate_simd_constant. This is risky since it may result in
unexpectedly calling aarch64_split_simd_move instead of
aarch64_maybe_generate_simd_constant. Since the mode is already checked,
remove the spurious explicit mode check.
Wilco Dijkstra [Tue, 15 Oct 2024 16:22:23 +0000 (16:22 +0000)]
AArch64: Fix copysign patterns
The current copysign pattern has a mismatch in the predicates and constraints -
operand[2] is a register_operand but also has an alternative X which allows any
operand. Since it is a floating point operation, having an integer alternative
makes no sense. Change the expander to always use vector immediates which
results in better code and sharing of immediates between copysign and xorsign.
gcc/ChangeLog:
* config/aarch64/aarch64.md (copysign<GPF:mode>3): Widen immediate to
vector.
(copysign<GPF:mode>3_insn): Use VQ_INT_EQUIV in operand 3.
* config/aarch64/iterators.md (VQ_INT_EQUIV): New iterator.
(vq_int_equiv): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/copysign_3.c: New test.
* gcc.target/aarch64/copysign_4.c: New test.
* gcc.target/aarch64/fneg-abs_2.c: Fixup test.
* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.
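As a rough illustration of why copysign and xorsign can share an immediate: both reduce to bitwise operations on the same sign-bit mask. The sketch below shows this on scalar doubles in portable C++. It is not the AArch64 expander; the names in namespace sketch are hypothetical, and it assumes a 64-bit IEEE 754 double.

```cpp
// Illustration only: scalar bit manipulation, not GCC's AArch64 patterns.
#include <cstdint>
#include <cstring>

namespace sketch {

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// The single constant both operations need: only the sign bit set.
constexpr std::uint64_t sign_mask = 0x8000000000000000ULL;

inline std::uint64_t bits_of(double x) {
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

inline double double_of(std::uint64_t b) {
  double x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// Magnitude of x combined with the sign of y.
inline double copysign_bits(double x, double y) {
  return double_of((bits_of(x) & ~sign_mask) | (bits_of(y) & sign_mask));
}

// x with its sign flipped when y is negative, i.e. x * copysign(1.0, y).
inline double xorsign_bits(double x, double y) {
  return double_of(bits_of(x) ^ (bits_of(y) & sign_mask));
}

} // namespace sketch
```

For example, sketch::copysign_bits(3.0, -1.0) yields -3.0 and sketch::xorsign_bits(-3.0, -2.0) yields 3.0, and both read the same sign_mask constant.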
Wilco Dijkstra [Tue, 8 Oct 2024 15:55:25 +0000 (15:55 +0000)]
AArch64: Improve SIMD immediate generation (2/3)
Allow use of SVE immediates when generating AdvSIMD code and SVE is available.
First check for a valid AdvSIMD immediate, and if SVE is available, try using
an SVE move or bitmask immediate.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (ior<mode>3<vczle><vczbe>):
Use aarch64_reg_or_orr_imm predicate. Combine SVE/AdvSIMD immediates
and use aarch64_output_simd_orr_imm.
* config/aarch64/aarch64.cc (struct simd_immediate_info): Add SVE_MOV.
(aarch64_sve_valid_immediate): Use SVE_MOV for SVE move immediates.
(aarch64_simd_valid_imm): Enable SVE SIMD immediates when possible.
(aarch64_output_simd_imm): Support emitting SVE SIMD immediates.
* config/aarch64/predicates.md (aarch64_orr_imm_sve_advsimd): Remove.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/acle/asm/insr_s64.c: Allow SVE MOV imm.
* gcc.target/aarch64/sve/acle/asm/insr_u64.c: Likewise.
* gcc.target/aarch64/sve/fneg-abs_1.c: Update to check for ORRI.
* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.
* gcc.target/aarch64/sve/simd_imm_mov.c: New test.
Wilco Dijkstra [Tue, 8 Oct 2024 13:32:09 +0000 (13:32 +0000)]
AArch64: Improve SIMD immediate generation (1/3)
Clean up the various interfaces related to SIMD immediate generation. Introduce
new functions that make it clear which operation (AND, OR, MOV) we are testing
for rather than guessing the final instruction. Reduce the use of overly long
names, unused and default parameters for clarity. No changes to internals or
generated code.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (enum simd_immediate_check): Move to aarch64.cc.
(aarch64_output_simd_mov_immediate): Remove.
(aarch64_output_simd_mov_imm): New prototype.
(aarch64_output_simd_orr_imm): Likewise.
(aarch64_output_simd_and_imm): Likewise.
(aarch64_simd_valid_immediate): Remove.
(aarch64_simd_valid_and_imm): New prototype.
(aarch64_simd_valid_mov_imm): Likewise.
(aarch64_simd_valid_orr_imm): Likewise.
* config/aarch64/aarch64-simd.md: Use aarch64_output_simd_mov_imm.
* config/aarch64/aarch64.cc (enum simd_immediate_check): Moved from aarch64-protos.h.
Use AARCH64_CHECK_AND rather than AARCH64_CHECK_BIC.
(aarch64_expand_sve_const_vector): Use aarch64_simd_valid_mov_imm.
(aarch64_expand_mov_immediate): Likewise.
(aarch64_can_const_movi_rtx_p): Likewise.
(aarch64_secondary_reload): Likewise.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_advsimd_valid_immediate): Simplify checks on 'which' param.
(aarch64_sve_valid_immediate): Add extra param for move vs logical.
(aarch64_simd_valid_immediate): Rename to aarch64_simd_valid_imm.
(aarch64_simd_valid_mov_imm): New function.
(aarch64_simd_valid_orr_imm): Likewise.
(aarch64_simd_valid_and_imm): Likewise.
(aarch64_mov_operand_p): Use aarch64_simd_valid_mov_imm.
(aarch64_simd_scalar_immediate_valid_for_move): Likewise.
(aarch64_simd_make_constant): Likewise.
(aarch64_expand_vector_init_fallback): Likewise.
(aarch64_output_simd_mov_immediate): Rename to aarch64_output_simd_imm.
(aarch64_output_simd_orr_imm): New function.
(aarch64_output_simd_and_imm): Likewise.
(aarch64_output_simd_mov_imm): Likewise.
(aarch64_output_scalar_simd_mov_immediate): Use aarch64_output_simd_mov_imm.
(aarch64_output_sve_mov_immediate): Use aarch64_simd_valid_imm.
(aarch64_output_sve_ptrues): Likewise.
* config/aarch64/constraints.md (Do): Use aarch64_simd_valid_orr_imm.
(Db): Use aarch64_simd_valid_and_imm.
* config/aarch64/predicates.md (aarch64_reg_or_bic_imm): Use aarch64_simd_valid_orr_imm.
(aarch64_reg_or_and_imm): Use aarch64_simd_valid_and_imm.