Andrew Pinski [Thu, 22 Aug 2024 21:34:03 +0000 (14:34 -0700)]
toplevel: Error out if using --disable-libstdcxx with bootstrap [PR105474]
Bootstrapping and using --disable-libstdcxx will cause a build failure deep in compiling
stage2 so instead error out early in the toplevel configure so it is more user friendly.
Bootstrapped and tested on x86_64-linux-gnu.
Also made sure --disable-libstdcxx without --disable-bootstrap failed.
PR bootstrap/105474
ChangeLog:
* configure: Regenerate.
* configure.ac: Error out if libstdc++ is not enabled
with bootstrapping.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Sat, 4 May 2024 09:03:16 +0000 (02:03 -0700)]
aarch64: Support multiple variants including up to 3
On some of Qualcomm's SoCs that include the oryon-1 core, the variant
differs between the cores due to the big.little config. However,
the difference between big and little is not significant enough
to have separate cost/scheduling models for them and the feature set
is the same across all variants.
Also on some SoCs, there are 3 variants of the core, big.middle.little
so this increases the support there for up to 3 cores and 3 variants
in the original parsing loop but it does not change the support for max
of 2 different cores.
After this patch and the patch that adds oryon-1, -mcpu=native works
on the SoCs I am working with.
Bootstrapped and tested on aarch64-linux-gnu with no regressions.
gcc/ChangeLog:
* config/aarch64/driver-aarch64.cc (host_detect_local_cpu): Support
3 cores and 3 variants. If there is one core but multiple variants,
then treat the variant as being all.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/info_25: New file.
* gcc.target/aarch64/cpunative/info_26: New file.
* gcc.target/aarch64/cpunative/native_cpu_25.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_26.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Wilco Dijkstra [Fri, 25 Oct 2024 14:53:58 +0000 (14:53 +0000)]
AArch64: Add more accurate constraint [PR117292]
As shown in the PR, reload may only check the constraint in some cases
and not check that the predicate is still valid for the resulting instruction.
To fix the issue, add a new constraint which matches the predicate exactly.
gcc/ChangeLog:
PR target/117292
* config/aarch64/aarch64-simd.md (xor<mode>3<vczle><vczbe>): Use
'De' constraint.
* config/aarch64/constraints.md (De): Add new constraint.
Paul Thomas [Fri, 25 Oct 2024 16:59:03 +0000 (17:59 +0100)]
Fortran: Fix ICE with structure constructor in data statement [PR79685]
2024-10-25 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/79685
* decl.cc (match_data_constant): Find the symtree instead of
the symbol so that use-renamed symbols are found. Pass this and
the derived type to gfc_match_structure_constructor.
* match.h: Update prototype of gfc_match_structure_constructor.
* primary.cc (gfc_match_structure_constructor): Remove call to
gfc_get_ha_sym_tree and use caller supplied symtree instead.
gcc/testsuite/
PR fortran/79685
* gfortran.dg/use_rename_13.f90: New test.
Jennifer Schmitz [Thu, 17 Oct 2024 15:40:34 +0000 (08:40 -0700)]
match.pd: Add std::pow folding optimizations.
This patch adds the following two simplifications in match.pd for
POW_ALL and POWI:
- pow (1.0/x, y) to pow (x, -y), avoiding the division
- pow (0.0, x) to 0.0, avoiding the call to pow.
The patterns are guarded by flag_unsafe_math_optimizations,
!flag_trapping_math, and !HONOR_INFINITIES.
The POW_ALL patterns are also gated under !flag_errno_math.
The second pattern is also guarded by !HONOR_NANS and
!HONOR_SIGNED_ZEROS.
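The first identity can be illustrated with a minimal sketch. This is not the
match.pd pattern itself: powi_ref is a hypothetical stand-in for an
integer-exponent power routine, and the two wrapper names are likewise made up
for this example.

```cpp
// Hypothetical reference implementation of x**n for integer n, used
// only for illustration; it is not GCC's __builtin_powi.
double powi_ref(double x, int n)
{
  long long m = n;
  if (m < 0)
    m = -m;
  double result = 1.0;
  // Square-and-multiply over the bits of |n|.
  for (double base = x; m != 0; m >>= 1, base *= base)
    if (m & 1)
      result *= base;
  return n < 0 ? 1.0 / result : result;
}

// Shape before the fold: an explicit 1.0/x feeds the power call.
double recip_pow_before(double x, int n)
{
  return powi_ref(1.0 / x, n);
}

// Shape after the fold: the explicit division at the call site is
// gone and the exponent is negated instead.
double recip_pow_after(double x, int n)
{
  return powi_ref(x, -n);
}
```

For inputs such as x = 2.0, n = 3 both shapes give the same value. For
arbitrary inputs the two can differ in rounding, which is presumably one
reason the patterns sit behind flag_unsafe_math_optimizations.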
Tests were added to confirm the application of the transform for
builtins pow, powf, powl, powi, powif, powil, and powf16.
The patch was bootstrapped and regtested on aarch64-linux-gnu and
x86_64-linux-gnu, no regression.
OK for mainline?
Pan Li [Thu, 24 Oct 2024 13:57:04 +0000 (21:57 +0800)]
Match: Simplify branch form 3 of unsigned SAT_ADD into branchless
There are several forms of the unsigned SAT_ADD. Some of them are
complicated while others are cheap. This patch simplifies
the complicated forms into the cheap ones. For example:
From form 3 (branch):
SAT_U_ADD = (X + Y) >= X ? (X + Y) : -1.
The simplification doesn't need to check whether the target supports SAT_ADD,
since it is an optimization at the GIMPLE level.
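As a rough illustration, the branch form and one possible branchless form can
be written as below. The function names are hypothetical, and the exact
branchless shape that match.pd produces is not stated in this message, so the
second function is an assumption about what "branchless" looks like here.

```cpp
#include <cstdint>

// Form 3 from the message, written with an explicit branch.
std::uint32_t sat_u_add_branch(std::uint32_t x, std::uint32_t y)
{
  std::uint32_t sum = x + y;             // wraps modulo 2^32 on overflow
  return sum >= x ? sum : UINT32_MAX;    // UINT32_MAX is -1 as unsigned
}

// A common branchless form: OR the sum with an all-ones mask that is
// only set when the addition wrapped around.
std::uint32_t sat_u_add_branchless(std::uint32_t x, std::uint32_t y)
{
  std::uint32_t sum = x + y;
  return sum | -static_cast<std::uint32_t>(sum < x);
}
```

Both functions agree for all inputs: when the sum wraps, sum < x holds, the
mask becomes all ones and the result saturates.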
The following test suites pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
gcc/ChangeLog:
* match.pd: Remove unsigned branch form 3 for SAT_ADD, and
add simplify to branchless instead.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u16.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u32.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u64.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u8.c: New test.
Jakub Jelinek [Fri, 25 Oct 2024 12:09:42 +0000 (14:09 +0200)]
Assorted --disable-checking fixes [PR117249]
We have currently 3 different definitions of gcc_assert macro, one used most
of the time (unless --disable-checking) which evaluates the condition at
runtime and also checks it at runtime, then one for --disable-checking GCC 4.5+
which looks like
((void)(UNLIKELY (!(EXPR)) ? __builtin_unreachable (), 0 : 0))
and a fallback one
((void)(0 && (EXPR)))
Now, the last one actually doesn't evaluate any of the side-effects in the
argument, just quiets up unused var/parameter warnings.
I've tried to replace the middle definition with
({ [[assume (EXPR)]]; (void) 0; })
for compilers which support assume attribute and statement expressions
(surprisingly quite a few spots use gcc_assert inside of comma expressions),
but ran into PR117287, so for now such a change isn't being proposed.
The following patch attempts to move important side-effects from gcc_assert
arguments.
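A minimal sketch of why this matters, using a stand-in macro with the same
shape as the fallback definition quoted above. FALLBACK_ASSERT and the two
function names are hypothetical; the real macro is gcc_assert.

```cpp
// Stand-in with the same shape as the fallback definition: the
// expression is never evaluated because of the leading "0 &&".
#define FALLBACK_ASSERT(EXPR) ((void)(0 && (EXPR)))

// Problematic shape: the increment lives inside the assertion, so it
// is silently dropped under the fallback definition.
int bump_inside_assert(int counter)
{
  FALLBACK_ASSERT(++counter > 0);
  return counter;
}

// Shape after moving the side effect out: the increment always runs
// and only the check itself is left inside the assertion.
int bump_outside_assert(int counter)
{
  ++counter;
  FALLBACK_ASSERT(counter > 0);
  return counter;
}
```

With the stand-in macro, bump_inside_assert (0) returns 0 while
bump_outside_assert (0) returns 1.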
Bootstrapped/regtested on x86_64-linux and i686-linux with normal
--enable-checking=yes,rtl,extra, plus additionally I've attempted to do
x86_64-linux bootstrap with --disable-checking and gcc_assert changed to the
((void)(0 && (EXPR)))
version when --disable-checking. That version ran into spurious middle-end
warnings
../../gcc/../include/libiberty.h:733:36: error: argument to ‘alloca’ is too large [-Werror=alloca-larger-than=]
../../gcc/tree-ssa-reassoc.cc:5659:20: note: in expansion of macro ‘XALLOCAVEC’
int op_num = ops.length ();
int op_normal_num = op_num;
gcc_assert (op_num > 0);
int stmt_num = op_num - 1;
gimple **stmts = XALLOCAVEC (gimple *, stmt_num);
where we have gcc_assert exactly to work-around middle-end warnings.
Guess I'd need to also disable -Werror for this experiment, which actually
isn't a problem with unmodified system.h, because even for
--disable-checking we use the __builtin_unreachable at least in
stage2/stage3 and so the warnings aren't emitted, and even if it used
[[assume ()]]; it would work too because in stage2/stage3 we could again
rely on assume and statement expression support.
2024-10-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117249
* tree-ssa-structalias.cc (insert_vi_for_tree): Move put calls out of
gcc_assert.
* lto-cgraph.cc (lto_symtab_encoder_delete_node): Likewise.
* gimple-ssa-strength-reduction.cc (get_alternative_base,
add_cand_for_stmt): Likewise.
* tree-eh.cc (add_stmt_to_eh_lp_fn): Likewise.
* except.cc (duplicate_eh_regions_1): Likewise.
* tree-ssa-reassoc.cc (insert_operand_rank): Likewise.
* config/nvptx/nvptx.cc (nvptx_expand_call): Use == rather than = in
gcc_assert.
* opts-common.cc (jobserver_info::disconnect): Call close outside of
gcc_assert and only check result in it.
(jobserver_info::return_token): Call write outside of gcc_assert and
only check result in it.
* genautomata.cc (output_default_latencies): Move j++ side-effect
outside of gcc_assert.
* tree-ssa-loop-ivopts.cc (get_alias_ptr_type_for_ptr_address): Use
== rather than = in gcc_assert.
* cgraph.cc (symbol_table::create_edge): Move ++edges_max_uid
side-effect outside of gcc_assert.
Jakub Jelinek [Fri, 25 Oct 2024 12:05:37 +0000 (14:05 +0200)]
lto: Handle RAW_DATA_CST in compare_tree_sccs_1 [PR117201]
I missed that I need to add RAW_DATA_CST support in compare_tree_sccs_1,
because without that it considers all RAW_DATA_CSTs to be equivalent,
regardless of their length or content.
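The kind of comparison that was missing can be sketched as follows. The
raw_data struct and raw_data_equal are hypothetical simplifications for
illustration, not the actual tree node or the code in compare_tree_sccs_1.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical simplified view of a raw-data constant: a length and a
// pointer to the bytes. The real tree node carries more than this.
struct raw_data
{
  std::size_t length;
  const unsigned char *bytes;
};

// Two constants are only equivalent if both the length and the
// content match; comparing neither treats every constant as equal.
bool raw_data_equal(const raw_data &a, const raw_data &b)
{
  return a.length == b.length
         && std::memcmp(a.bytes, b.bytes, a.length) == 0;
}
```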
Richard Biener [Fri, 25 Oct 2024 10:38:24 +0000 (12:38 +0200)]
Default expand_vec_cond_expr_p code to ERROR_MARK
As we want to transition to only vcond_mask expanders, the following
makes it easier to distinguish queries that rely on
vcond queries for expand_vec_cond_expr_p from those of vcond_mask
by defaulting the comparison code to ERROR_MARK for the latter.
My recent gcc.dg/tree-ssa/shifts-3.c test failed on arm-linux-gnu
because it used widen_mult_expr to do a multiplication on chars.
This patch generalises the regexp in the same way as for f3.
gcc/testsuite/
* gcc.dg/tree-ssa/shifts-3.c: Accept widen_mult for f2 too.
libstdc++: implement concatenation of strings and string_views
This adds support for P2591R5, merged for C++26.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h: Implement the four operator+
overloads between basic_string and (types convertible to)
basic_string_view.
* include/bits/version.def: Bump the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/21_strings/basic_string/operators/char/op_plus_fspath_neg.cc: New test.
* testsuite/21_strings/basic_string/operators/char/op_plus_string_view.cc: New test.
* testsuite/21_strings/basic_string/operators/char/op_plus_string_view_compat.cc:
New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
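For readers without a C++26 library at hand, the intended result of such a
concatenation can be sketched with C++17 facilities. The concat helper below
is hypothetical and is not part of the patch; it only shows what combining a
basic_string with a basic_string_view is expected to produce.

```cpp
#include <string>
#include <string_view>

// Portable equivalent of concatenating a string and a string_view,
// written with append so that it builds without P2591 support.
std::string concat(const std::string &s, std::string_view sv)
{
  std::string result;
  result.reserve(s.size() + sv.size());
  result += s;
  result.append(sv);   // C++17: append accepts string_view-like types
  return result;
}
```

With the new overloads the same effect should be available by applying
operator+ directly to the two operands.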
Richard Biener [Thu, 24 Oct 2024 15:06:29 +0000 (17:06 +0200)]
Restrict :c to commutative ops as intended
genmatch was supposed to restrict :c to verifiable commutative
operations while leaving :C to the "I know what I'm doing" case.
The following enforces this, cleaning up parsing and amending
the commutative_op helper. There's one pattern that needs adjustment,
the pattern optimizing fmax (x, NaN) or fmax (NaN, x) to x since
fmax isn't commutative.
* genmatch.cc (commutative_op): Add parameter to indicate whether
all compares should be considered commutative. Handle
hypot, add_overflow and mul_overflow.
(parser::parse_expr): Simplify 'c' handling by using
commutative_op and error out when the operation is not.
* match.pd ((minmax:c @0 NaN@1) -> @0): Use :C, we know
what we are doing.
Richard Biener [Thu, 24 Oct 2024 14:15:43 +0000 (16:15 +0200)]
tree-optimization/117277 - remove CLOBBERs before SLP code generation
We have to remove CLOBBERs before SLP is code generated since for
store-lanes we are inserting our own CLOBBERs that we want to survive.
So the following refactors vect_transform_loop to remove unwanted
stmts first.
This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL.
PR tree-optimization/117277
* tree-vect-loop.cc (vect_transform_loop): Remove CLOBBERs
and prefetches before doing any code generation.
The following implements masked load-lane discovery for SLP. The
challenge here is that a masked load has a full-width mask with
group-size number of elements; when this becomes a masked load-lanes
instruction, one mask element gates all group members. We already
have some discovery hints in place, namely STMT_VINFO_SLP_VECT_ONLY
to guard non-uniform masks, but we need to choose a way for SLP
discovery to handle possible masked load-lanes SLP trees.
I have this time chosen to handle load-lanes discovery where we
have performed permute optimization already and conveniently got
the graph with predecessor edges built. This is because unlike
non-masked loads masked loads with a load_permutation are never
produced by SLP discovery (because load permutation handling doesn't
handle un-permuting the mask) and thus the load-permutation lowering
which handles non-masked load-lanes discovery doesn't trigger.
With this SLP discovery for a possible masked load-lanes, thus
a masked load with uniform mask, produces a splat of a single-lane
sub-graph as the mask SLP operand. This is a representation that
shouldn't pessimize the mask load case and allows the masked load-lanes
transform to simply elide this splat.
This fixes the aarch64-sve.exp mask_struct_load*.c testcases with
--param vect-force-slp=1
PR tree-optimization/116575
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Handle
gaps, aka NULL scalar stmt.
(vect_build_slp_tree_2): Allow gaps in the middle of a
grouped mask load. When the mask of a grouped mask load
is uniform do single-lane discovery for the mask and
insert a splat VEC_PERM_EXPR node.
(vect_optimize_slp_pass::decide_masked_load_lanes): New
function.
(vect_optimize_slp_pass::run): Call it.
Richard Biener [Wed, 23 Oct 2024 09:55:31 +0000 (11:55 +0200)]
Relax vect_check_scalar_mask check
When the mask is not a constant or external def there's no need to
check the scalar type, in particular with SLP and the mask being
a VEC_PERM_EXPR there isn't a scalar operand ready to check
(not one vect_is_simple_use will get you). We later check the
vector type and reject non-mask types there.
* tree-vect-stmts.cc (vect_check_scalar_mask): Only check
the scalar type for constant or extern defs.
Tom Tromey [Wed, 31 Jul 2024 15:01:45 +0000 (09:01 -0600)]
ada: Change scope of XUB type
An earlier patch in the "nameless" series caused a regression with
-fgnat-encodings=all. Previously, all artificial types were emitted
in the CU scope in the DWARF, but with the patch, an "XUB" type is
emitted in the function scope. This causes gdb lookups to erroneously
find the XUB type rather than the type that gdb expects to find.
Note that I don't know why the earlier code worked, because decl.cc
clearly sets the XUB type's context to be the current function.
This patch changes the type's context so that it is nested in a type
that is conveniently available.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Use gnu_fat_type as the type
context for a XUB type.
Tom Tromey [Tue, 9 Jul 2024 15:36:26 +0000 (09:36 -0600)]
ada: Set DECL_NAMELESS in create_type_decl
When using minimal encodings, most artificial types do not need to
have their names emitted in the DWARF. This patch changes the
compiler to generally omit these names.
However, a subset of names are needed: when the compiler creates an
artificial type for certain kinds of arrays, the name is needed by
gdb. So, a new parameter is added to create_type_decl to allow this
omission to be disabled.
Note that simply passing 'false' as the artificial_p argument to
create_type_decl doesn't work properly -- other parts of the compiler
seem to rely on this flag being set, and so making this change causes
ICEs.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Update some calls to
create_type_decl.
* gcc-interface/gigi.h (create_type_decl): Add can_be_nameless parameter.
* gcc-interface/utils.cc (create_type_decl): Add can_be_nameless
parameter. Set DECL_NAMELESS on type decl.
Tom Tromey [Wed, 10 Jul 2024 17:54:25 +0000 (11:54 -0600)]
ada: Mark some type decls as nameless
The types created by record_builtin_type and create_type_stub_decl can
be marked as nameless when using minimal encodings. In this
situation, gdb does not need these type names.
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (record_builtin_type, create_type_stub_decl):
Set DECL_NAMELESS on type decls.
Tom Tromey [Wed, 10 Jul 2024 17:52:17 +0000 (11:52 -0600)]
ada: Mark XUA types as artificial
gdb does not need the name of XUA types. This patch changes the
compiler to unconditionally mark these as artificial; a subsequent
patch will arrange for the name to be omitted.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Pass 'true' to
create_type_decl when creating XUA type.
Tom Tromey [Wed, 10 Jul 2024 17:46:57 +0000 (11:46 -0600)]
ada: Add 'artificial_p' parameter to build_unc_object_type
This adds an 'artificial_p' parameter to build_unc_object_type, so
that the artificiality of the type can be propagated to
create_type_decl. This will affect the namelessness of the type in a
subsequent patch.
Tom Tromey [Wed, 10 Jul 2024 17:29:11 +0000 (11:29 -0600)]
ada: Standard types are not artificial
This changes gigi so that standard types are no longer marked
artificial. This change is needed to prevent subsequent patches from
causing standard types to have their names elided. Also, although
DWARF says that DW_AT_artificial is used for "the declaration of an
object or type artificially generated by a compiler and not explicitly
declared by the source program", it seems to me that types provided by
the language should not be marked as such; and this is what the C and
C++ compilers do.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (is_artificial): New function.
(gnat_to_gnu_entity): Use it.
Eric Botcazou [Wed, 18 Sep 2024 06:24:32 +0000 (08:24 +0200)]
ada: Fix fallout of change in parameter passing out of aliasing considerations
If an actual parameter that is a type conversion is passed by reference but
not addressable, the temporary that is created and whose address is passed
instead may need to be in the target type of the conversion to fulfill the
requirements of strict aliasing.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (Call_to_gnu): If the formal is passed by
reference and the actual is a type conversion but not addressable,
create the temporary in the target type of the conversion if this
is needed to enforce strict aliasing.
Eric Botcazou [Wed, 11 Sep 2024 17:53:12 +0000 (19:53 +0200)]
ada: Fix internal error on bit-packed array type with Volatile_Full_Access
The problem occurs when the component type is a record type with default
values for the initialization procedure of the (base) array type, because
the compiler is trying to generate a full access for a parameter of the
base array type, which does not make sense.
gcc/ada/ChangeLog:
PR ada/116551
* gcc-interface/trans.cc (node_is_atomic) <N_Identifier>: Return
false if the type of the entity is an unconstrained array type.
(node_is_volatile_full_access) <N_Identifier>: Likewise.
Bob Duff [Tue, 1 Oct 2024 15:29:34 +0000 (11:29 -0400)]
ada: Disable self-referential with_clauses
Self-referential with_clauses (as in package body X says "with X;")
cause trouble, such as duplicate nested instantiations when using
container packages. This patch disables most of the processing by
setting the Is_Implicit_With flag. It's not really implicit, but the
subsequent processing behaves as if it is, and coming up with a more
accurate (and much longer) name for the flag doesn't seem beneficial for
such an obscure case. Note that the spec of X will be processed later,
rather than upon seeing "with X;".
Other cleanups, such as renaming Implicit_With to be Is_Implicit_With.
gcc/ada/ChangeLog:
* sem_ch10.adb: (Analyze_With_Clause): Check for self-referential
with clause. Give a warning, and set Is_Implicit_With, which we
are reusing in this obscure case even though it's not really
implicit.
(Analyze_Context): Remove check for self-referential with clause.
It wasn't correct -- it only triggered for Acts_As_Spec
subprograms. Corrected check is now in Analyze_With_Clause.
(Implicit_With): Rename to be Is_Implicit_With. Misc cleanup,
comment fixes.
(Process_Spec_Clauses): Remove default for Exit_On_Self parameter.
Use "exit when" instead of if statement.
* sinfo.ads (Implicit_With): Rename to be Is_Implicit_With.
Document new use for self-referential withs.
* ali.adb (Scan_ALI): Use an aggregate to initialize Withs entry.
* exp_put_image.adb (Preload_Root_Buffer_Type): Make this a
once-only procedure.
* sem_util.ads (Is_Ancestor_Package): Fix comment -- a library unit
is an ancestor of itself, but this doesn't return True in that
case.
* sem_util.adb (Is_Ancestor_Package): Better to initialize things
on their declaration.
* lib-load.adb: Minor comment fix.
* sem_prag.adb: Implicit_With --> Is_Implicit_With. Minor comment
fix.
* gen_il-fields.ads: Implicit_With --> Is_Implicit_With.
* gen_il-gen-gen_nodes.adb: Likewise
* lib.adb: Likewise
* lib-writ.adb: Likewise
* rtsfind.adb: Likewise
* sem_cat.adb: Likewise
* sem_ch12.adb: Likewise
* sem_ch8.adb: Likewise
* sem_elab.adb: Likewise
* sem_warn.adb: Likewise
* gcc-interface/trans.cc: (Implicit_With): Rename to be
Is_Implicit_With.
Eric Botcazou [Tue, 1 Oct 2024 07:19:36 +0000 (09:19 +0200)]
ada: Small adjustments to commentary after latest change
This removes the enumeration of the various cases in the comment associated
with the declaration of In_Expanded_Body to prevent synchronization issues.
Javier Miranda [Mon, 30 Sep 2024 09:08:04 +0000 (09:08 +0000)]
ada: Pragma Pre_Class and Post_Class have no effect at runtime
The pragmas Pre_Class and Post_Class are accepted by the compiler
but have no effect at runtime.
gcc/ada/ChangeLog:
* freeze.adb (Freeze_Entity): If the entity is an access-to-subprogram
type declaration that has pre/postcondition contracts, build the
wrapper (if not previously done as part of processing aspects).
* sem_ch3.adb (Build_Access_Subprogram_Wrapper): Add missing support
for building the wrapper when the access type has pragmas
Pre_Class/Post_Class.
(Build_Access_Subprogram_Wrapper_Declaration): New subprogram.
* sem_ch3.ads (Build_Access_Subprogram_Wrapper): Spec moved to the
public part of the package.
* sem_prag.adb (Analyze_Pre_Post_Condition): Store in the tree copy of
class-wide pre/postcondition expression; required to merge it with
inherited conditions.
(Is_Valid_Assertion_Kind): Added Pre_Class and Post_Class.
Eric Botcazou [Fri, 27 Sep 2024 07:36:17 +0000 (09:36 +0200)]
ada: Fix ATC with timed delay from Ada.Real_Time
An Asynchronous Transfer of Control blocks with a timed delay that is
computed by means of the Ada.Real_Time unit (instead of the default
Ada.Calendar unit) because of a missing abort deferral in the unit.
gcc/ada/ChangeLog:
PR ada/43485
* libgnarl/a-retide.adb: Add with clause for System.Soft_Links.
(Delay_Until): Defer and undefer abort around the call to the
Timed_Delay routine of System.Task_Primitives.Operations.
ada: Adjust documentation of External_Initialization
The parameters Maximum_Size and If_Empty were mentioned during the
request for comments phase but are not implemented, at least for now.
This patch changes the GNAT reference manual accordingly. It also makes
a minor punctuation change.
Eric Botcazou [Fri, 13 Sep 2024 09:53:00 +0000 (11:53 +0200)]
ada: Add Type_Size_For function to Uintp package
It computes the size of an integer type that can accommodate the input.
gcc/ada/ChangeLog:
* uintp.ads (Type_Size_For): New function declaration.
* uintp.adb (Type_Size_For): New function body.
* exp_imgv.adb (Rewrite_Object_Image): Call Type_Size_For to get
the size of a narrower integer type.
Javier Miranda [Tue, 17 Sep 2024 11:53:06 +0000 (11:53 +0000)]
ada: Constraint error not raised in ACATS test c413007
The Constraint_Error exception is not raised when a subprogram
is called using prefix notation, and the prefix of the call is
an access-to-subprogram type with a null value. This new check
is enabled by switch -gnatd_P
gcc/ada/ChangeLog:
* gen_il-fields.ads: New node field (Is_Expanded_Prefixed_Call).
* gen_il-gen-gen_nodes.adb: New semantic field for N_Function_Call
and N_Procedure_Call_Statement nodes.
* sem_ch4.adb (Complete_Object_Operation): Mark the rewritten node
with the Is_Expanded_Prefixed_Call flag.
* sem_res.adb (Check_Prefixed_Call): Code cleanup and addition of
documentation.
(Resolve_Actuals): Add a null-exclusion check on the
prefix of the call when it is an access-type.
* sinfo.ads: Adding new semantic flag (Is_Expanded_Prefixed_Call)
to N_Function_Call and N_Procedure_Call_Statement nodes.
* debug.adb: Adding documentation for switch d_P.
* sem_ch13.adb (Analyze_One_Aspect): change the call to
`Error_Msg_GNAT_Extension` to allow this aspect in core
extensions. Put the code path in core extensions.
* exp_util.adb (Name_Of_Controlled_Prim_Op): Put the code path
in core extensions
Eric Botcazou [Mon, 16 Sep 2024 06:31:57 +0000 (08:31 +0200)]
ada: Fix internal error on ambiguous operands of comparison operator
This is a regression introduced when the diagnosis of ambiguous operands
for comparison and equality operators was moved from the analysis to the
resolution phase in order to avoid spurious ambiguities in specific cases.
When an ambiguity is detected for the operands of predefined comparison
and equality operators during analysis, it needs to be recorded so that
later calls to the disambiguation routine know about this ambiguity for
the case where the context has been resolved to boolean.
gcc/ada/ChangeLog:
* sem_type.ads (Interp): Add Opnd_Typ component and remove default
value for Abstract_Op component.
(Add_One_Interp): Rename Opnd_Type parameter to Opnd_Typ.
* sem_type.adb (Add_One_Interp): Likewise.
(Add_One_Interp.Add_Entry): Record the operand type as well.
(Collect_Interp): Record Empty for the operand type.
(Disambiguate.Is_Ambiguous_Boolean_Operator): New predicate.
(Disambiguate): Use it to detect recorded ambiguity cases.
* sem_ch4.adb (Find_Comparison_Equality_Types): Add commentary.
Eric Botcazou [Fri, 13 Sep 2024 19:09:31 +0000 (21:09 +0200)]
ada: Fix fallout of change to 'Wide_Wide_Value for enumeration types
The literals of enumeration types are always normalized, even though they
contain wide characters (but the normalization leaves these unchanged),
so a normalization routine that is aware of wide characters must be run
on the input string for 'Wide_Wide_Value.
gcc/ada/ChangeLog:
PR ada/115507
* rtsfind.ads (RE_Id): Add RE_Enum_[Wide_]Wide_String_To_String.
(RE_Unit_Table): Add entries for the new values.
* exp_attr.adb (Is_User_Defined_Enumeration_Type): New predicate.
(Expand_N_Attribute_Reference) <Attribute_Wide_Value>: Build a call
to RE_Enum_Wide_String_To_String for user-defined enumeration types.
<Attribute_Wide_Wide_Value>: Likewise with
RE_Enum_Wide_Wide_String_To_String.
* exp_imgv.adb (Expand_Value_Attribute): Adjust to above.
* libgnat/s-wchwts.ads (Enum_Wide_String_To_String): New function.
(Enum_Wide_Wide_String_To_String): Likewise.
* libgnat/s-wchwts.adb: Add clauses for System.Case_Util.
(Normalize_String): New local procedure.
(Enum_Wide_String_To_String): New function body.
(Enum_Wide_Wide_String_To_String): Likewise.
Javier Miranda [Fri, 13 Sep 2024 07:02:02 +0000 (07:02 +0000)]
ada: Untagged incomplete view not detected in ACATS test b3a1a060
Adding checks for RM 3.10.1(10): An actual parameter cannot be
of an untagged incomplete view; the result object of a function
call cannot be of an incomplete view.
gcc/ada/ChangeLog:
* sem_res.adb (Resolve_Actuals): Add checks for incomplete
type actuals.
Eric Botcazou [Thu, 12 Sep 2024 14:11:47 +0000 (16:11 +0200)]
ada: Pass parameters of full access unconstrained array types by copy in calls
When a full access array type is declared, either Volatile_Full_Access in
Ada 2012 or Atomic in Ada 2022, an implicit base array type is built by the
compiler with the Full_Access flag set, although full accesses cannot be
generated for objects of this type because the size is not known statically.
If the component type is a record with default values, an initialization
procedure taking a parameter of the base array type is built. Given that
full accesses cannot be generated for the parameter inside the procedure,
we need to pass the actual parameter by copy to the procedure in order to
implement the full access semantics.
gcc/ada/ChangeLog:
* exp_ch6.adb (Expand_Actuals.Is_Legal_Copy): Return True for an
initialization procedure with a full access formal parameter.
(Expand_Actuals.Requires_Atomic_Or_Volatile_Copy): Return True if
the formal parameter is of a full access unconstrained array type.
Jakub Jelinek [Fri, 25 Oct 2024 07:44:10 +0000 (09:44 +0200)]
non-gcc: Remove trailing whitespace
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/;
this part).
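The per-line effect of the sed substitution above can be sketched as follows.
The helper name is hypothetical, and treating both spaces and tabs as blanks
is an assumption about what the bracket expression in the command contains.

```cpp
#include <string>

// Remove trailing blanks from a single line (without its newline).
// Assumes "blank" means space or tab.
std::string strip_trailing_blanks(std::string line)
{
  std::string::size_type last = line.find_last_not_of(" \t");
  if (last == std::string::npos)
    line.clear();            // empty line or blanks only
  else
    line.erase(last + 1);    // keep everything up to the last non-blank
  return line;
}
```

Leading whitespace is left untouched, matching the scope of this patch.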
Jakub Jelinek [Fri, 25 Oct 2024 07:41:46 +0000 (09:41 +0200)]
gcc: Remove trailing whitespace
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part (this patch)
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/).
Jennifer Schmitz [Thu, 24 Oct 2024 12:11:31 +0000 (05:11 -0700)]
SVE intrinsics: Fold svaba with op1 all zeros to svabd.
Similar to
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665780.html,
this patch implements folding of svaba to svabd if op1 is all zeros,
resulting in the use of UABD/SABD instructions instead of UABA/SABA.
Tests were added to check the produced assembly for use of UABD/SABD,
also for the _n case.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svaba_impl::fold): Fold svaba to svabd if op1 is all zeros.
Nathaniel Shead [Tue, 20 Aug 2024 15:08:36 +0000 (01:08 +1000)]
c++/modules: Support decloned cdtors
When compiling with '-fdeclone-ctor-dtor' (enabled by default with -Os),
we run into issues where we don't correctly emit the underlying
functions. We also need to ensure that COMDAT constructors are marked
as such before 'maybe_clone_body' attempts to propagate COMDAT groups to
the new thunks.
gcc/cp/ChangeLog:
* module.cc (post_load_processing): Mark COMDAT as needed, emit
declarations if maybe_clone_body fails.
gcc/testsuite/ChangeLog:
* g++.dg/modules/clone-2_a.C: New test.
* g++.dg/modules/clone-2_b.C: New test.
* g++.dg/modules/clone-3_a.C: New test.
* g++.dg/modules/clone-3_b.C: New test.
Nathaniel Shead [Tue, 20 Aug 2024 14:50:53 +0000 (00:50 +1000)]
c++/modules: Prevent maybe_clone_decl being called multiple times [PR115007]
The ICE in the linked PR is caused because maybe_clone_decl is not
prepared to be called on a declaration that has already had clones
created; what happens otherwise is that start_preparsed_function early
exits and never sets up cfun, causing a segfault later on.
To fix this we ensure that post_load_processing only calls
maybe_clone_decl if TREE_ASM_WRITTEN has not been marked on the
declaration yet, and (if maybe_clone_decl succeeds) marks this flag on
the decl so that it doesn't get called again later when finalising
deferred vague linkage declarations in c_parse_final_cleanups.
As a bonus this now allows us to only keep the DECL_SAVED_TREE around in
expand_or_defer_fn_1 for modules which have CMIs, which will have
benefits for LTO performance in non-interface TUs.
For clarity we also update the streaming code to do post_load_decls for
maybe in-charge cdtors rather than any DECL_ABSTRACT_P declaration, as
this is more accurate to the decls affected by maybe_clone_body.
PR c++/115007
gcc/cp/ChangeLog:
* module.cc (module_state::read_cluster): Replace
DECL_ABSTRACT_P with DECL_MAYBE_IN_CHARGE_CDTOR_P.
(post_load_processing): Check and mark TREE_ASM_WRITTEN.
* semantics.cc (expand_or_defer_fn_1): Use the more specific
module_maybe_has_cmi_p instead of modules_p.
gcc/testsuite/ChangeLog:
* g++.dg/modules/virt-6_a.C: New test.
* g++.dg/modules/virt-6_b.C: New test.
Nathaniel Shead [Tue, 20 Aug 2024 14:42:42 +0000 (00:42 +1000)]
c++: Handle ABI for non-polymorphic dynamic classes
The Itanium ABI has specific rules for when virtual tables for dynamic
classes should be emitted. However we didn't consider structures with
virtual inheritance but no virtual members as dynamic classes for ABI
purposes; this patch fixes this.
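A minimal illustration of the kind of class in question (the type names are
made up for this sketch): it has a virtual base but no virtual member
functions, so it is not polymorphic, yet it still carries a hidden pointer.

```cpp
#include <cassert>
#include <type_traits>

struct base { int x; };

// Virtual inheritance, but no virtual member functions.
struct derived : virtual base { };

// Not polymorphic in the std::is_polymorphic sense ...
static_assert (!std::is_polymorphic<derived>::value, "no virtual functions");

// ... yet the object needs hidden data to locate its virtual base (a vptr
// under the Itanium ABI), so it is larger than its only data member.
inline bool has_hidden_pointer ()
{
  return sizeof (derived) > sizeof (base);
}
```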
gcc/cp/ChangeLog:
* decl2.cc (import_export_class): Use TYPE_CONTAINS_VPTR_P
instead of TYPE_POLYMORPHIC_P.
(import_export_decl): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/virt-5_a.C: New test.
* g++.dg/modules/virt-5_b.C: New test.
Georg-Johann Lay [Tue, 22 Oct 2024 09:51:44 +0000 (11:51 +0200)]
AVR: target/116953 - Restore recog_data after calling jump_over_one_insn_p.
The previous fix for PR116953 is incomplete because references to
recog_data are escaping avr_out_sbxx_branch() in the form of %-operands
in the returned asm code template. This patch reverts the previous fix,
and re-extracts the operands by means of extract_constrain_insn_cached()
after the call of jump_over_one_insn_p().
PR target/116953
gcc/
* config/avr/avr.cc (avr_out_sbxx_branch): Revert previous fix
for PR116953 (r15-4078). Run extract_constrain_insn_cached
on the current insn after calling jump_over_one_insn_p.
David Malcolm [Thu, 24 Oct 2024 19:52:29 +0000 (15:52 -0400)]
analyzer: avoid implicit use of global_dc's pretty_printer [PR116613]
Previously, various places in the analyzer generated message strings
by cloning the diagnostic_context's pretty_printer, printing to that
pretty_printer's buffer, and then returning a copy of the buffer
contents.
This implicit use of a particular pretty printer doesn't work well for
the "multiple diagnostic output formats" case (PR other/116613), such as
differences in colorization, or in how phase 3 of formatting works.
Hence as enabling work towards that, the following patch reworks the
various functions returning a label_text string in favor of functions
that print to a specific pretty_printer, such as diagnostic_event's
"get_desc" vfunc, which becomes "print_desc". This makes the particular
pretty_printer in use explicit in each case.
Previously, the various pending_diagnostic::describe_* vfuncs returned a
label_text, with the return of an empty string signifying that no
description could be generated. With this patch, these vfuncs gain a
"pretty_printer &" param and a bool return value and now either print to
the pretty_printer and return true, or return false to signify the
"no description available" case.
No functional change intended.
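The shape of the interface change can be sketched as follows. This is an
illustration only: toy_printer, describe_old and describe_new are hypothetical
stand-ins, not the analyzer's real classes or vfuncs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a pretty_printer: it just accumulates text.
struct toy_printer
{
  std::string buf;
  void print (const std::string &s) { buf += s; }
};

// Old style: an empty string means "no description available".
inline std::string describe_old (bool known)
{
  return known ? std::string ("value is freed here") : std::string ();
}

// New style: print to the printer supplied by the caller and report
// whether anything was printed.
inline bool describe_new (toy_printer &pp, bool known)
{
  if (!known)
    return false;
  pp.print ("value is freed here");
  return true;
}
```

The caller now chooses which printer receives the text, which is the point of
making the pretty_printer explicit.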
gcc/analyzer/ChangeLog:
PR other/116613
* bounds-checking.cc
(concrete_buffer_overflow::describe_final_event): Convert return
type from label_text to bool. Add "pp" param and either print to
it and return true, or return false.
(concrete_buffer_overflow::describe_final_event_as_bytes): Convert
to print to a pp rather than returning a label_text.
(concrete_buffer_overflow::describe_final_event_as_bits):
Likewise.
(class concrete_buffer_over_read): Analogous changes to above.
(class concrete_buffer_underwrite): Likewise.
(class concrete_buffer_under_read): Likewise.
(class symbolic_buffer_overflow): Likewise.
(class symbolic_buffer_over_read): Likewise.
* call-details.cc (class overlapping_buffers): Likewise.
* call-info.cc (call_info::print): Reimplement.
(class call_info::add_events_to_path::call_event): Convert
"get_desc" vfunc to "print_desc", dropping return type, adding
"pp" param, and printing to it.
(class succeed_or_fail_call_info): Likewise.
* call-info.h (class call_info): Likewise.
(class succeed_or_fail_call_info): Likewise.
* checker-event.cc (checker_event::dump): Reimplement.
(checker_event::prepare_for_emission): Update for change from
get_desc to print_desc.
(debug_event::get_desc): Convert to...
(debug_event::print_desc): ...this.
(precanned_custom_event::get_desc): Convert to...
(precanned_custom_event::print_desc): ...this.
(statement_event::get_desc): Convert to...
(statement_event::print_desc): ...this.
(region_creation_event_memory_space::get_desc): Convert to...
(region_creation_event_memory_space::print_desc): ...this.
(region_creation_event_capacity::get_desc): Convert to...
(region_creation_event_capacity::print_desc): ...this.
(region_creation_event_allocation_size::get_desc): Convert to...
(region_creation_event_allocation_size::print_desc): ...this.
(region_creation_event_debug::get_desc): Convert to...
(region_creation_event_debug::print_desc): ...this.
(function_entry_event::get_desc): Convert to...
(function_entry_event::print_desc): ...this.
(state_change_event::get_desc): Convert to...
(state_change_event::print_desc): ...this.
(state_change_event::get_meaning): Update for change to
pending_diagnostic::get_meaning_for_state_change.
(superedge_event::should_filter_p): Convert from usage of get_desc
to print_desc.
(start_cfg_edge_event::get_desc): Convert to...
(start_cfg_edge_event::print_desc): ...this.
(call_event::get_desc): Convert to...
(call_event::print_desc): ...this.
(return_event::get_desc): Convert to...
(return_event::print_desc): ...this.
(start_consolidated_cfg_edges_event::get_desc): Convert to...
(start_consolidated_cfg_edges_event::print_desc): ...this.
(inlined_call_event::get_desc): Convert to...
(inlined_call_event::print_desc): ...this.
(setjmp_event::get_desc): Convert to...
(setjmp_event::print_desc): ...this.
(rewind_from_longjmp_event::get_desc): Convert to...
(rewind_from_longjmp_event::print_desc): ...this.
(rewind_to_setjmp_event::get_desc): Convert to...
(rewind_to_setjmp_event::print_desc): ...this.
(warning_event::get_desc): Convert to...
(warning_event::print_desc): ...this.
* checker-event.h: Convert the various "get_desc" vfunc decls to
"print_desc".
* checker-path.cc (checker_path::dump): Convert to usage of
checker_event::print_desc.
(checker_path::debug): Convert to debug form of
checker_event::get_desc.
* diagnostic-manager.cc
(diagnostic_manager::prune_interproc_events): Likewise.
(diagnostic_manager::prune_system_headers): Likewise.
* engine.cc (call_summary_edge_info::get_desc): Convert to...
(call_summary_edge_info::print_desc): ...this.
(stale_jmp_buf::describe_final_event): Update for change to
this vfunc.
(tainted_args_function_custom_event::get_desc): Convert to...
(tainted_args_function_custom_event::print_desc): ...this.
(tainted_args_field_custom_event::get_desc): Convert to...
(tainted_args_field_custom_event::print_desc): ...this.
(tainted_args_callback_custom_event::get_desc): Convert to...
(tainted_args_callback_custom_event::print_desc): ...this.
(jump_through_null::describe_final_event): Update for change to
this vfunc.
* infinite-loop.cc (perpetual_start_cfg_edge_event::get_desc):
Convert to...
(perpetual_start_cfg_edge_event::print_desc): ...this.
(looping_back_event::get_desc): Convert to...
(looping_back_event::print_desc): ...this.
(looping_back_event::describe_final_event): Update for change to
this vfunc.
* infinite-recursion.cc (class infinite_recursion_diagnostic):
Update for changes to pending_diagnostic.
* kf.cc (class putenv_of_auto_var): Likewise.
(kf_realloc::impl_call_post): Update for changes to call_info.
(kf_strchr::impl_call_post): Likewise.
(kf_strncpy::impl_call_post): Likewise.
(kf_strstr::impl_call_post): Likewise.
(class kf_strtok::undefined_behavior): Update for changes to
pending_diagnostic.
(class strtok_call_info): Update for changes to call_info.
* pending-diagnostic.cc (evdesc::event_desc::formatted_print):
Delete.
* pending-diagnostic.h (struct event_desc): Delete.
(struct state_change): Drop event_desc base class.
(struct call_with_state): Likewise.
(struct return_of_state): Likewise.
(struct final_event): Likewise.
(pending_event::describe_state_change): Convert return
type from label_text to bool. Add "pp" param and either print to
it and return true, or return false. Do the latter for the base
class implementation.
(pending_event::describe_call_with_state): Likewise.
(pending_event::describe_return_of_state): Likewise.
(pending_event::describe_final_event): Likewise.
* region-model.cc
(poisoned_value_diagnostic::describe_final_event): Update for
change to this vfunc.
(shift_count_negative_diagnostic::describe_final_event): Likewise.
(shift_count_overflow_diagnostic::describe_final_event): Likewise.
(ptrdiff_region_creation_event::get_desc): Convert to...
(ptrdiff_region_creation_event::print_desc): ...this.
(undefined_ptrdiff_diagnostic::describe_final_event): Update for
change to this vfunc.
(write_to_const_diagnostic::describe_final_event): Likewise.
(write_to_string_literal_diagnostic::describe_final_event):
Likewise.
(dubious_allocation_size::describe_final_event): Likewise.
(null_terminator_check_event::get_desc): Convert to...
(null_terminator_check_event::print_desc): ...this.
(float_as_size_arg::describe_final_event): Update for change to
this vfunc.
(exposure_through_uninit_copy::describe_final_event): Likewise.
* sm-fd.cc: Include "diagnostic-core.h". Update throughout for
changes to pending_diagnostic vfuncs.
* sm-file.cc: Likewise.
* sm-malloc.cc: Likewise.
* sm-sensitive.cc: Likewise.
* sm-signal.cc: Likewise.
* sm-taint.cc: Likewise.
* varargs.cc: Likewise.
gcc/ChangeLog:
PR other/116613
* diagnostic-format-json.cc (make_json_for_path): Add "ref_pp"
param and use when obtaining event descriptions.
(json_output_format::on_report_diagnostic): Pass this format's
printer as the above.
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Clone this format's printer
and use it to obtain the text of the message.
* diagnostic-path.cc: Include "pretty-print-markup.h".
(diagnostic_event::get_desc): New.
(path_label::get_text): Update for changes to diagnostic_event.
(event_range::print): Likewise.
(class element_event_desc): New.
(diagnostic_text_output_format::print_path): Update for changes to
diagnostic_event.
* diagnostic-path.h (diagnostic_event::get_desc): Replace with...
(diagnostic_event::print_desc): ...this.
(diagnostic_event::get_desc): Add this back for debugging, without
the bool param.
* pretty-print.cc (pp_printf_n): New.
* pretty-print.h (pp_printf_n): New decl.
* selftest-diagnostic-path.h (test_diagnostic_event::get_desc):
Convert to...
(test_diagnostic_event::print_desc): ...this.
* simple-diagnostic-path.cc (simple_diagnostic_event::print_desc):
New.
(selftest::test_intraprocedural_path): Use debug form of get_desc.
* simple-diagnostic-path.h (simple_diagnostic_event::get_desc):
Convert to...
(simple_diagnostic_event::print_desc): ...this, moving
implementation to test_diagnostic_event.
gcc/testsuite/ChangeLog:
PR other/116613
* gcc.dg/plugin/analyzer_cpython_plugin.c: Convert call outcomes
from "get_desc" to "print_desc".
* gcc.dg/plugin/analyzer_gil_plugin.c: Update for changes to
pending_diagnostic vfuncs.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
but it turns out the tail calls in question are not the ones that test
is actually checking for. Rather, when () is interpreted as (void) in
C23 mode, ICF notices that certain functions are identical and so
turns test_indirect_2 into a tail call to test_indirect_1 and
test_indirect_casted_2 into a tail call to test_indirect_casted_1
(which it didn't do previously when one function used () and one used
(void)).
To avoid these spurious failures, make the test use -fno-ipa-icf
rather than relying on () and (void) giving different function types
to avoid ICF.
[...]
In file included from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:0:
../../source-gcc/gcc/diagnostic.h:29:3: error: #error "You must define INCLUDE_MEMORY before including system.h to use diagnostic.h"
# error "You must define INCLUDE_MEMORY before including system.h to use diagnostic.h"
^
In file included from ../../source-gcc/gcc/diagnostic.h:34:0,
from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:
../../source-gcc/gcc/pretty-print.h:29:3: error: #error "You must define INCLUDE_MEMORY before including system.h to use pretty-print.h"
# error "You must define INCLUDE_MEMORY before including system.h to use pretty-print.h"
^
In file included from ../../source-gcc/gcc/diagnostic.h:34:0,
from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:
../../source-gcc/gcc/pretty-print.h:280:16: error: 'unique_ptr' in namespace 'std' does not name a template type
virtual std::unique_ptr<pretty_printer> clone () const;
^
In file included from ../../source-gcc/gcc/config/gcn/mkoffload.cc:31:0:
../../source-gcc/gcc/diagnostic.h:585:32: error: 'std::unique_ptr' has not been declared
void set_output_format (std::unique_ptr<diagnostic_output_format> output_format);
^
[...]
Dimitar Dimitrov [Thu, 24 Oct 2024 16:59:42 +0000 (19:59 +0300)]
testsuite: Require effective target pie for pr113197
The test for PR113197 explicitly enables PIE. But targets without PIE
emit warnings when -fpie is passed (e.g. pru and avr), which causes the
test to fail.
Fix by adding an effective target requirement for PIE.
With this patch, the test now is marked as unsupported for
pru-unknown-elf. Testing for x86_64-pc-linux-gnu passes with current
mainline, and fails if the fix from r15-4018-g02f4efe3c12cf7 is
reverted.
This removes the overload of __throw_bad_variant_access that must be
called with a string literal. This avoids a potential source of
undefined behaviour if that function got misused. The other overload
that takes a bool parameter can be adjusted to take an integer index
selecting one of the four possible string literals to use, ensuring
that the std::bad_variant_access constructor is only called with those
literals.
Passing an index outside the range [0,3] is bogus, but will still select
a valid string literal and avoid undefined behaviour.
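One way to get that property is to reduce the index modulo the table size.
The sketch below assumes this approach; the helper name, the placeholder
strings and the "% 4" reduction are illustrative and not copied from
<variant>.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: maps any index onto one of four string literals.
inline const char *select_reason (unsigned n)
{
  static const char *const reasons[4] = {
    "reason 0", "reason 1", "reason 2", "reason 3"
  };
  // Even a bogus index outside [0,3] selects a valid literal.
  return reasons[n % 4u];
}
```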
libstdc++-v3/ChangeLog:
* include/std/variant (__throw_bad_variant_access(unsigned)):
Define new function as inline friend, with namespace-scope
declaration using noreturn attribute.
(__throw_bad_variant_access(const char*)): Remove.
(__throw_bad_variant_access(bool)): Remove.
(visit, visit<R>): Adjust calls to __throw_bad_variant_access.
David Malcolm [Thu, 24 Oct 2024 15:48:01 +0000 (11:48 -0400)]
Use unique_ptr in more places in pretty_printer/diagnostics [PR116613]
My forthcoming patches for PR other/116613 make much more use of
cloning of pretty_printers than before, so it makes sense as a
preliminary patch for the result of pretty_printer::clone to be a
std::unique_ptr, rather than add more manual uses of "delete".
On doing so, I noticed various other places where naked new/delete is
used for run-time configuration of diagnostics:
* the output format (text vs SARIF)
* client data hooks
* the option manager
* the URLifier
Hence this patch also makes use of std::unique_ptr and ::make_unique for
managing such client policy classes, and also for diagnostic_buffer's
per-format implementations.
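The ownership pattern looks roughly like this. The class below is a
hypothetical toy, not the real pretty_printer; only the unique_ptr-returning
clone is the point.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical printer class; only the ownership pattern matters here.
struct toy_printer
{
  std::string buf;
  virtual ~toy_printer () = default;

  // Returning std::unique_ptr makes ownership of the clone explicit,
  // so callers never need a manual "delete".
  virtual std::unique_ptr<toy_printer> clone () const
  {
    return std::unique_ptr<toy_printer> (new toy_printer (*this));
  }
};

// Format a message on a private clone, leaving the reference printer alone.
inline std::string format_with_clone (const toy_printer &ref,
                                      const std::string &msg)
{
  std::unique_ptr<toy_printer> pp = ref.clone ();
  pp->buf += msg;
  return pp->buf;   // The clone is destroyed automatically on return.
}
```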
Unfortunately we can't directly include <memory> in our internal headers
but instead any of our TUs that make use of std::unique_ptr must #define
INCLUDE_MEMORY before including system.h.
Hence the bulk of this patch is taken up with adding a define of
INCLUDE_MEMORY to hundreds of source files: everything that includes
diagnostic.h or pretty-print.h (and thus anything transitively such as
includers of lto-wrapper.h, c-tree.h, cp-tree.h and rtl-ssa.h).
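A single-file emulation of the opt-in scheme, for illustration. The two
marked sections are simplified, hypothetical stand-ins for system.h and for a
header that needs std::unique_ptr; the real headers do much more.

```cpp
#include <cassert>

// A translation unit opts in before including system.h:
#define INCLUDE_MEMORY

// --- stand-in for system.h (hypothetical, simplified) ---
#ifdef INCLUDE_MEMORY
# include <memory>
#endif

// --- stand-in for a header that uses std::unique_ptr ---
#ifndef INCLUDE_MEMORY
# error "You must define INCLUDE_MEMORY before including system.h to use this header"
#endif

struct toy_format { int id; };

inline std::unique_ptr<toy_format> make_format (int id)
{
  return std::unique_ptr<toy_format> (new toy_format {id});
}
```

Removing the #define at the top turns the missing <memory> include into an
immediate #error rather than a later "does not name a template type" failure.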
Thanks to Gaius Mulley for the parts of the patch that regenerated the
m2 files.
Co-authored-by: Gaius Mulley <gaiusmod2@gmail.com>
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Ricardo Jesus [Mon, 14 Oct 2024 13:28:02 +0000 (14:28 +0100)]
aarch64: libstdc++: Use shufflevector instead of shuffle in opt_random.h
This patch modifies the implementation of the vectorized mersenne
twister random number generator to use __builtin_shufflevector instead
of __builtin_shuffle. This makes it (almost) compatible with Clang.
To make the implementation fully compatible with Clang, Clang will need
to support internal Neon types like __Uint8x16_t and __Uint32x4_t, which
currently it does not. This looks like an oversight in Clang and so will
be addressed separately.
I see no codegen change with this patch.
Bootstrapped and tested on aarch64-none-linux-gnu.
libstdc++-v3/ChangeLog:
* config/cpu/aarch64/opt/ext/opt_random.h (__VEXT): Replace uses
of __builtin_shuffle with __builtin_shufflevector.
(__aarch64_lsl_128): Move shift amount to a template parameter.
(__aarch64_lsr_128): Move shift amount to a template parameter.
(__aarch64_recursion): Update call sites of __aarch64_lsl_128
and __aarch64_lsr_128.
Record nonzero bits in the irange_bitmask of POLY_INT_CSTs
At the moment, ranger punts entirely on POLY_INT_CSTs. Numerical
ranges are a bit difficult, unless we do start modelling bounds on
the indeterminates. But we can at least track the nonzero bits.
gcc/
* value-query.cc (range_query::get_tree_range): Use get_nonzero_bits
to populate the irange_bitmask of a POLY_INT_CST.
gcc/testsuite/
* gcc.target/aarch64/sve/cnt_fold_6.c: New test.
This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2
when the low C1 bits of X are known to be zero. As with the earlier
X >> C1 << (C2 + C1) patch, any single conversion is allowed between
the shift and the multiplication.
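The identity can be checked on plain unsigned integers. This is a sketch with
hypothetical helper names, using 32-bit wrapping arithmetic and assuming
c1 < 32.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the rule, evaluated literally.
inline uint32_t shift_then_mul (uint32_t x, unsigned c1, uint32_t c2)
{
  return (x >> c1) * (c2 << c1);
}

// Right-hand side of the rule.
inline uint32_t plain_mul (uint32_t x, uint32_t c2)
{
  return x * c2;
}

// The rule's precondition: the low c1 bits of x are zero.
inline bool low_bits_zero (uint32_t x, unsigned c1)
{
  return (x & ((UINT32_C (1) << c1) - 1)) == 0;
}
```

When the precondition fails, the right shift discards set bits and the two
sides differ, e.g. for x = 0x1235, C1 = 4, C2 = 7.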
gcc/
* match.pd: Simplify (X >> C1) * (C2 << C1) -> X * C2 if the
low C1 bits of X are zero.
This patch extends get_nonzero_bits to handle POLY_INT_CSTs.
The easiest (but also most useful) case is that the number
of trailing zeros in the runtime value is at least the number
of trailing zeros in each individual component.
In principle, we could do this for coeffs 1 and above only,
and then OR in coeff 0. This would give ~0x11 for [14, 32], say.
But that's future work.
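A sketch of the trailing-zeros argument, using a hypothetical two-coefficient
model in which the runtime value is c0 + c1 * x for some runtime x. This is
not the poly_int API; the helper names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zero bits; returns 64 for zero.
inline unsigned trailing_zeros (uint64_t v)
{
  if (v == 0)
    return 64;
  unsigned n = 0;
  while ((v & 1) == 0)
    {
      v >>= 1;
      ++n;
    }
  return n;
}

// Hypothetical model of a two-coefficient polynomial value.
inline uint64_t poly_value (uint64_t c0, uint64_t c1, uint64_t x)
{
  return c0 + c1 * x;
}

// Lower bound on the trailing zeros of every possible runtime value:
// the minimum over the individual coefficients.
inline unsigned guaranteed_trailing_zeros (uint64_t c0, uint64_t c1)
{
  unsigned a = trailing_zeros (c0);
  unsigned b = trailing_zeros (c1);
  return a < b ? a : b;
}
```

For [16, 16] every runtime value is a multiple of 16, so at least four low
bits are known to be zero.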
This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2
when the low C1 bits of X are known to be zero.
Any single conversion can take place between the shifts. E.g. for
a truncating conversion, any extra bits of X that are preserved by
truncating after the shift are immediately lost by the shift left.
And the sign bits used for an extending conversion are the same as
the sign bits used for the rshift. (A double conversion of say
int->unsigned->uint64_t would be wrong though.)
gcc/
* match.pd: Simplify (X >> C1) << (C1 + C2) -> X << C2 if the
low C1 bits of X are zero.
gcc/testsuite/
* gcc.dg/tree-ssa/shifts-1.c: New test.
* gcc.dg/tree-ssa/shifts-2.c: Likewise.
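The shift-pair rule can be checked the same way on unsigned integers. This is
a sketch with hypothetical helper names, assuming 32-bit operands and
c1 + c2 < 32; it does not model the conversion cases.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: (X >> C1) << (C1 + C2).
inline uint32_t shift_pair (uint32_t x, unsigned c1, unsigned c2)
{
  return (x >> c1) << (c1 + c2);
}

// Right-hand side: X << C2.
inline uint32_t single_shift (uint32_t x, unsigned c2)
{
  return x << c2;
}
```

The two agree for x = 0xabc0 with C1 = 4, where the low four bits are zero,
and disagree for x = 0xabc5, where the right shift loses set bits.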
Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B
when A and B are INTEGER_CSTs. This patch extends it to handle the
case where the outer multiplication is by a factor of A, not just
A itself. It also handles addition and multiplication of poly_ints.
(Exact division by a poly_int seems unlikely.)
gcc/
* match.pd: Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
to ((X /[ex] C1) +- C2) * (C1 * C3) -> (X * C3) +- (C1 * C2 * C3).
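A numeric check of the generalised identity, as a sketch: the helper names
are hypothetical, X must be an exact multiple of C1, and the values are kept
small enough to avoid overflow. A negative C2 covers the "-" form.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: ((X /[ex] C1) + C2) * (C1 * C3).
inline int64_t lhs (int64_t x, int64_t c1, int64_t c2, int64_t c3)
{
  return ((x / c1) + c2) * (c1 * c3);
}

// Right-hand side: (X * C3) + (C1 * C2 * C3).
inline int64_t rhs (int64_t x, int64_t c1, int64_t c2, int64_t c3)
{
  return (x * c3) + (c1 * c2 * c3);
}
```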
I tried to look for places where we were handling TRUNC_DIV_EXPR
more favourably than EXACT_DIV_EXPR.
Most of the places that I looked at but didn't change were handling
div/mod pairs. But there's bound to be others I missed...
gcc/
* match.pd: Extend some rules to handle exact_div like trunc_div.
* tree.h (trunc_or_exact_div_p): New function.
* tree-ssa-loop-niter.cc (is_rshift_by_1): Use it.
* tree-ssa-loop-ivopts.cc (force_expr_to_var_cost): Handle
EXACT_DIV_EXPR.
Andrew MacLeod [Wed, 23 Oct 2024 14:59:13 +0000 (10:59 -0400)]
Implement pointer_or_operator.
The class pointer_or is no longer used, and can be removed. Its
functionality was never moved to the new dispatch system.
This implements operator_bitwise_or::fold_range() for prange operands.
Andrew MacLeod [Mon, 21 Oct 2024 22:20:10 +0000 (18:20 -0400)]
Remove pointer_and_operator.
This operator class predates the dispatch system, and is no longer used.
The functionality of wi_fold has been replaced by
operator_bitwise_and::fold_range with prange operands.
Andrew MacLeod [Mon, 21 Oct 2024 22:11:43 +0000 (18:11 -0400)]
Remove pointer_min_max_operator.
The pointer_min_max_operator class was used before the current dispatch
system was created. These operations have been transferred to
operator_min::fold_range () and operator_max::fold_range () with prange
operands.
This class is no longer used for anything, delete it.
Jakub Jelinek [Thu, 24 Oct 2024 10:56:19 +0000 (12:56 +0200)]
c++: Further fix for get_member_function_from_ptrfunc [PR117259]
The following testcase shows that the previous get_member_function_from_ptrfunc
changes weren't sufficient and we still have cases where
-fsanitize=undefined with pointers to member functions can cause wrong code
being generated and related false positive warnings.
The problem is that save_expr doesn't always create SAVE_EXPR, it can skip
some invariant arithmetics and in the end it could be really large
expressions which would be evaluated several times (and what is worse, with
-fsanitize=undefined those expressions then can have SAVE_EXPRs added to
their subparts for -fsanitize=bounds or -fsanitize=null or
-fsanitize=alignment instrumentation). Tried to just build1 a SAVE_EXPR
+ add TREE_SIDE_EFFECTS instead of save_expr, but that doesn't work either,
because cp_fold happily optimizes those SAVE_EXPRs away when it sees
SAVE_EXPR operand is tree_invariant_p.
So, the following patch instead of using save_expr or building SAVE_EXPR
manually builds a TARGET_EXPR. Both types are pointers, so it doesn't need
to be destroyed in any way, but TARGET_EXPR is what doesn't get optimized
away immediately.
2024-10-24 Jakub Jelinek <jakub@redhat.com>
PR c++/117259
* typeck.cc (get_member_function_from_ptrfunc): Use force_target_expr
rather than save_expr for instance_ptr and function. Don't call it
for TREE_CONSTANT.
Jakub Jelinek [Thu, 24 Oct 2024 10:45:34 +0000 (12:45 +0200)]
asan: Fix up build_check_stmt gsi handling [PR117209]
gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator
in case it splits objects, but unfortunately build_check_stmt was in
some places (but not others) using a copy of the iterator rather than
the iterator passed from callers and so didn't propagate that to callers.
I guess it didn't matter much before when it was just using
gsi_insert_before as that really didn't change the iterator.
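The bug pattern, reduced to a toy: toy_iter and the helpers below are
hypothetical and stand in for gimple_stmt_iterator and the insertion routine.

```cpp
#include <cassert>

// Hypothetical iterator: just remembers which block it points into.
struct toy_iter { int block; };

// Hypothetical insertion helper that may move the iterator to a new block.
inline void insert_and_maybe_split (toy_iter *it)
{
  it->block += 1;
}

// Buggy shape: the update lands on a local copy and is lost.
inline void check_on_copy (toy_iter *iter)
{
  toy_iter gsi = *iter;
  insert_and_maybe_split (&gsi);
}

// Fixed shape: operate on the caller's iterator directly.
inline void check_in_place (toy_iter *iter)
{
  insert_and_maybe_split (iter);
}
```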
The !before_p case is apparently dead code, nothing is calling it with
before_p=false since around 4.9.
2024-10-24 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/117209
* asan.cc (maybe_cast_to_ptrmode): Formatting fix.
(build_check_stmt): Don't copy *iter into gsi, perform all
the updates on iter directly.
Jennifer Schmitz [Thu, 17 Oct 2024 09:31:47 +0000 (02:31 -0700)]
SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.
A common idiom in intrinsics loops is to have accumulator intrinsics
in an unrolled loop with an accumulator initialized to zero at the beginning.
Propagating the initial zero accumulator into the first iteration
of the loop and simplifying the first accumulate instruction is a
desirable transformation that we should teach GCC.
Therefore, this patch folds svsra to svlsr/svasr if op1 is all zeros,
producing the lower latency instructions LSR/ASR instead of USRA/SSRA.
We implemented this optimization in svsra_impl::fold.
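A scalar sketch of the idiom (a per-lane model with hypothetical names, not
the intrinsics; shift is assumed to be below 32): once the zero accumulator is
propagated into the first iteration, that step is a plain shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-lane models of LSR and USRA.
inline uint32_t lane_lsr (uint32_t x, unsigned shift)
{
  return x >> shift;
}

inline uint32_t lane_usra (uint32_t acc, uint32_t x, unsigned shift)
{
  return acc + (x >> shift);
}

// Accumulate x[i] >> shift, starting from a zero accumulator.
inline uint32_t sum_shifted (const uint32_t *x, unsigned n, unsigned shift)
{
  uint32_t acc = 0;
  for (unsigned i = 0; i < n; ++i)
    acc = lane_usra (acc, x[i], shift);
  return acc;
}

// Same computation with the first iteration peeled and simplified.
inline uint32_t sum_shifted_peeled (const uint32_t *x, unsigned n,
                                    unsigned shift)
{
  if (n == 0)
    return 0;
  uint32_t acc = lane_lsr (x[0], shift);
  for (unsigned i = 1; i < n; ++i)
    acc = lane_usra (acc, x[i], shift);
  return acc;
}
```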
Tests were added to check the produced assembly for use of LSR/ASR.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svsra_impl::fold): Fold svsra to svlsr/svasr if op1 is all zeros.
Soumya AR [Thu, 17 Oct 2024 04:00:35 +0000 (09:30 +0530)]
SVE intrinsics: Fold constant operands for svlsl.
This patch implements constant folding for svlsl. Test cases have been added to
check for the following cases:
Zero, merge, and don't care predication.
Shift by 0.
Shift by register width.
Overflow shift on signed and unsigned integers.
Shift on a negative integer.
Maximum possible shift, eg. shift by 7 on an 8-bit integer.
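A scalar model of the folding on one 8-bit unsigned lane, as a sketch. The
function name is hypothetical; the zero result for out-of-range shifts follows
the ChangeLog entry for aarch64_const_binop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of constant-folding LSL on an 8-bit unsigned lane.
inline uint8_t fold_lsl_u8 (uint8_t x, unsigned shift)
{
  // Shifting by the lane width or more is undefined in C++, so the
  // out-of-range case is handled explicitly and yields 0.
  if (shift >= 8)
    return 0;
  return static_cast<uint8_t> (x << shift);
}
```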
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold):
Try constant folding.
* config/aarch64/aarch64-sve-builtins.cc (aarch64_const_binop):
Return 0 if shift is out of range.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/const_fold_lsl_1.c: New test.
SVE intrinsics: Fold division and multiplication by -1 to neg
Because a neg instruction has lower latency and higher throughput than
sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv,
this is already implemented on the RTL level; for svmul, the
optimization was still missing.
This patch implements folding to svneg for both operations using the
gimple_folder. For svdiv, the transform is applied if the divisor is -1.
Svmul is folded if either of the operands is -1. A case distinction of
the predication is made to account for the fact that svneg_m has 3 arguments
(argument 0 holds the values for the inactive lanes), while svneg_x and
svneg_z have only 2 arguments.
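The merging case can be modelled lane by lane. This is a sketch with
hypothetical names and four fixed lanes; it assumes that inactive lanes of a
merging multiply keep the first operand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::array<int32_t, 4> lanes;
typedef std::array<bool, 4> pred;

// Wrapping negation, computed in unsigned arithmetic.
inline int32_t wrap_neg (int32_t v)
{
  return static_cast<int32_t> (0u - static_cast<uint32_t> (v));
}

// Model of a merging multiply by -1: inactive lanes keep the first operand.
inline lanes mul_m_by_minus_one (const pred &pg, const lanes &a)
{
  lanes r;
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = pg[i]
           ? static_cast<int32_t> (static_cast<uint32_t> (a[i]) * 0xffffffffu)
           : a[i];
  return r;
}

// Model of a merging negate: the extra first argument supplies the values
// of the inactive lanes.
inline lanes neg_m (const lanes &inactive, const pred &pg, const lanes &a)
{
  lanes r;
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = pg[i] ? wrap_neg (a[i]) : inactive[i];
  return r;
}
```

In this model the multiply equals neg_m (a, pg, a): the operand is passed
twice, once as the inactive-lane values and once as the value to negate.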
Tests were added or adjusted to check the produced assembly and runtime
tests were added to check correctness.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
Fold division by -1 to svneg.
(svmul_impl::fold): Fold multiplication by -1 to svneg.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Adjust expected outcome.
* gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
* gcc.target/aarch64/sve/div_const_run.c: New test.
* gcc.target/aarch64/sve/mul_const_run.c: Likewise.
Jennifer Schmitz [Tue, 15 Oct 2024 14:58:14 +0000 (07:58 -0700)]
SVE intrinsics: Add constant folding for svindex.
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
was previously compiled to
f2:
index z0.d, #10, #3
mul z0.d, z0.d, #5
ret
Now, it is compiled to
f2:
mov x0, 50
index z0.d, x0, #15
ret
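The arithmetic behind the new output, as a scalar sketch (index_lane is a
hypothetical per-lane model): multiplying the series {10, 13, 16, ...} by 5
gives another series with base 50 and step 15.

```cpp
#include <cassert>
#include <cstdint>

// Lane i of a hypothetical INDEX (base, step) series: base + i * step.
inline uint64_t index_lane (uint64_t base, uint64_t step, unsigned i)
{
  return base + step * i;
}
```

For each lane i, index_lane (10, 3, i) * 5 equals index_lane (50, 15, i).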
We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Wang Pengcheng [Thu, 24 Oct 2024 05:11:53 +0000 (23:11 -0600)]
[PATCH] RISC-V: override alignment of function/jump/loop
Just like what AArch64 has done.
Signed-off-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
gcc/ChangeLog:
* config/riscv/riscv.cc (struct riscv_tune_param): Add new
tune options.
(riscv_override_options_internal): Override the default alignment
when not optimizing for size.
Jakub Jelinek [Thu, 24 Oct 2024 03:21:13 +0000 (21:21 -0600)]
testsuite: Fix up pr116488.c and pr117226.c tests [PR116488]
Hi!
On Mon, Oct 21, 2024 at 01:39:52PM -0600, Jeff Law wrote:
> * gcc.dg/torture/pr116488.c: New test.
> * gcc.dg/torture/pr117226.c: New test.
These two tests FAIL on powerpc64le-linux (and I assume on all other
-funsigned-char defaulting targets).
The following patch fixes that, tested on powerpc64le-linux and
x86_64-linux (-m32/-m64); on x86_64 also tested before/after with
-funsigned-char.
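The underlying portability issue, in a small sketch (function names are made
up): the signedness of plain char depends on the target, while signed char
does not.

```cpp
#include <cassert>
#include <type_traits>

// Plain char may be signed or unsigned depending on the target (and on
// -fsigned-char / -funsigned-char), so this result varies.
inline bool plain_char_is_signed ()
{
  return std::is_signed<char>::value;
}

// signed char is signed everywhere, so -1 stays negative on every target.
inline bool minus_one_is_negative ()
{
  signed char c = -1;
  return c < 0;
}
```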
Ok for trunk?
2024-10-22 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/116488
PR rtl-optimization/117226
* gcc.dg/torture/pr116488.c (c, e): Change type from char to
signed char.
* gcc.dg/torture/pr117226.c (main): Change f type from char to
signed char.
Pan Li [Mon, 23 Sep 2024 05:43:50 +0000 (13:43 +0800)]
RISC-V: Add testcases for form 4 of signed vector SAT_ADD
Form 4:
#define DEF_VEC_SAT_S_ADD_FMT_4(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
{ \
T x = op_1[i]; \
T y = op_2[i]; \
T sum; \
bool overflow = __builtin_add_overflow (x, y, &sum); \
out[i] = !overflow ? sum : x < 0 ? MIN : MAX; \
} \
}
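For one element, form 4 behaves as in the scalar sketch below. The function
name is hypothetical and the type is fixed to int8_t; like the macro, it
relies on the GCC/Clang builtin __builtin_add_overflow.

```cpp
#include <cassert>
#include <cstdint>

// Scalar version of form 4 for int8_t.
inline int8_t sat_s_add_i8 (int8_t x, int8_t y)
{
  int8_t sum;
  bool overflow = __builtin_add_overflow (x, y, &sum);
  // On overflow both operands have the same sign, so the sign of x picks
  // the saturation bound.
  return !overflow ? sum : x < 0 ? INT8_MIN : INT8_MAX;
}
```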
The tests below pass with this patch.
* The rv64gcv full regression test.
This is a test-only patch and fairly obvious, so I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-16.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Andrew Pinski [Wed, 23 Oct 2024 23:39:21 +0000 (16:39 -0700)]
aarch64: Fix warning in aarch64_ptrue_reg
After r15-4579-g9ffcf1f193b477, we get the following warning/error while bootstrapping on aarch64:
```
../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* aarch64_ptrue_reg(machine_mode, unsigned int)’:
../../gcc/gcc/config/aarch64/aarch64.cc:3643:21: error: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Werror=sign-compare]
3643 | for (int i = 0; i < vl; i++)
| ~~^~~~
```
This changes the type of i to unsigned to match the type of vl.
Pushed as obvious after a bootstrap/test on aarch64-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_ptrue_reg): Fix type
of induction variable i.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
When internal functions support was added to match (r6-4979-gc9e926ce2bdc8b),
the check for ECF_CONST was only on the builtin function side. Before r15-4503-g8d6d6d537fdc, though,
there was no use of maybe_push_res_to_seq with non-const internal functions so the check
would not make a difference.
This adds the check for internal functions just as there is a check for builtins.
Note I didn't add a testcase because there was no non-const internal function
which could be used on x86_64 in a decent manner.