git.ipfire.org Git - thirdparty/gcc.git/log

Marc Poulhiès [Mon, 6 Mar 2023 11:15:13 +0000 (12:15 +0100)]

ada: Fix (again) incorrect handling of Aggregate aspect

The previous fix stopped the processing of the Aggregate aspect early,
skipping the call to Record_Rep_Item and making the later call to
Resolve_Container_Aggregate fail.

Also, the previous fix did not correctly handle the case where the
type is private and the check for a non-array type can only be done at the
freeze point with the full type.

Adapt the resolving of the aspect when the input is not correct and the
parameters can't be resolved.

gcc/ada/

* sem_ch13.adb (Analyze_One_Aspect): Call Record_Rep_Item.
(Check_Aspect_At_Freeze_Point): Check the aspect is specified on
non-array type only...
(Analyze_One_Aspect): ... instead of doing it too early here.
* sem_aggr.adb (Resolve_Container_Aggregate): Do nothing in case
the parameters failed to resolve.

Piotr Trojanek [Sat, 4 Mar 2023 17:07:55 +0000 (18:07 +0100)]

ada: Prevent search of calls in preconditions from going too far

When determining whether a call to a protected function appears within
a pragma expression, we can safely stop at the subprogram body.

Cleanup related to recently added support for new SPARK aspects,
whose implementation was based on Contract_Cases.

gcc/ada/

* sem_util.adb (Check_Internal_Protected_Use): Add standard protection
against search going too far.

Piotr Trojanek [Sat, 4 Mar 2023 17:07:33 +0000 (18:07 +0100)]

ada: Fix comments for recently added SPARK aspects

Implementation of contracts Subprogram_Variant and Exceptional_Cases was
based on the existing code for Contract_Cases, i.e. on the existing
occurrences of Aspect_Contract_Cases, Name_Contract_Cases and
Pragma_Contract_Cases. However, occurrences of "Contract_Cases" itself
in the comments were not updated.

gcc/ada/

* contracts.adb
(Add_Pre_Post_Condition): Mention new aspects in the comment.
* contracts.ads
(Add_Contract_Item): Likewise.
(Analyze_Subprogram_Body_Stub_Contract): Likewise.
* sem_prag.adb
(Contract_Freeze_Error): Likewise.
(Ensure_Aggregate_Form): Likewise.
* sem_prag.ads
(Find_Related_Declaration_Or_Body): Likewise.
* sinfo.ads
(Is_Generic_Contract_Pragma): Likewise.

Piotr Trojanek [Fri, 3 Mar 2023 16:45:20 +0000 (17:45 +0100)]

ada: Add missing supportive code for recently added SPARK aspects

Fix minor inconsistencies with the recently added SPARK aspects
Exceptional_Cases and Subprogram_Variant, whose implementation is based
on Contract_Cases.

gcc/ada/

* aspects.ads
(Implementation_Defined_Aspect): Recently added aspects are
implementation-defined, just like Contract_Cases.
* sem_prag.ads
(Aspect_Specifying_Pragma): Recently added aspects have corresponding
pragmas, just like Contract_Cases.
(Pragma_Significant_To_Subprograms): Recently added aspects are
significant to subprograms, just like Contract_Cases.

Piotr Trojanek [Mon, 6 Mar 2023 11:50:04 +0000 (12:50 +0100)]

ada: Tune handling of attributes Old in contract Exceptional_Cases

Contract Exceptional_Cases allows formal parameters to appear *in*
prefixes of attributes Old, but the code only allowed them to appear
*as* prefixes of those attributes.

For example, we now accept expressions like "X.all'Old" that were
previously rejected.

gcc/ada/

* sem_res.adb (Resolve_Entity_Name): Tune handling of formal parameters
in contract Exceptional_Cases.

Piotr Trojanek [Fri, 3 Mar 2023 16:27:40 +0000 (17:27 +0100)]

ada: Remove redundant guards from calls to Move_Aspects

Routine Move_Aspects does nothing if its From parameter has no aspects.
There is no need to check this at the call sites.

Code cleanup related to changes in handling of expression functions in
GNATprove; semantics are unaffected.

gcc/ada/

* par-ch7.adb (P_Package): Remove redundant guard from call to
Move_Aspects.
* par-ch9.adb (P_Task): Likewise.
* sem_ch6.adb (Analyze_Expression_Function, Is_Inline_Pragma): Likewise.

Eric Botcazou [Sat, 4 Mar 2023 14:02:32 +0000 (15:02 +0100)]

ada: Small tweak to implementation of by-copy semantics for storage models

Get_Actual_Subtype can be used to access the Actual_Designated_Subtype of
explicit dereferences with a storage model. As a side effect, this also
handles the case where the prefix of the dereference is a formal parameter.

gcc/ada/

* exp_ch6.adb (Add_Simple_Call_By_Copy_Code): Use Get_Actual_Subtype
to retrieve the actual subtype for all actuals and do it in only one
place for all unconstrained composite formal types.

Piotr Trojanek [Fri, 3 Mar 2023 17:23:58 +0000 (18:23 +0100)]

ada: Fix copy-paste mistake in analysis of Exceptional_Cases

Trivial mistakes in copied code.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Fix references to Exceptional_Cases in
code copied from handling of Subprogram_Variant.

Ronan Desplanques [Fri, 3 Mar 2023 16:48:47 +0000 (17:48 +0100)]

ada: Enrich documentation of subprogram

This patch adds documentation to the subprogram Replace_Type in
Sem_Ch3. In particular, references to relevant parts of the Ada
reference manual are added.

gcc/ada/

* sem_ch3.adb (Replace_Type): Add more documentation.

Ronan Desplanques [Fri, 3 Mar 2023 11:33:21 +0000 (12:33 +0100)]

ada: Maximize use of existing constant

This patch does not change the behavior of the compiler and is
intended as a readability improvement.

gcc/ada/

* sem_ch3.adb (Replace_Type): Use existing constant wherever
possible.

Ronan Desplanques [Fri, 3 Mar 2023 11:33:21 +0000 (12:33 +0100)]

ada: Reduce span of variable

This patch does not change the behavior of the compiler, but is
intended to improve readability. It seizes an opportunity to move
a variable declaration to a smaller scope, so that it's clearer
that the variable is not used outside of that scope.

gcc/ada/

* sem_ch3.adb (Replace_Type): Reduce span of variable.

Bob Duff [Fri, 3 Mar 2023 14:46:34 +0000 (09:46 -0500)]

ada: Set Is_Not_Self_Hidden flag in more cases

More work-in-progress for changing E_Void checks to the flag.

gcc/ada/

* sem_ch9.adb (Analyze_Protected_Type_Declaration): Set the flag
for protected types.
(Analyze_Single_Protected_Declaration): Likewise, for singleton
protected objects.
(Analyze_Task_Type_Declaration): Set the flag for task types.
(Analyze_Single_Task_Declaration): Likewise, for singleton task
objects.
* sem_ch10.adb (Decorate_Type): Set the flag for types treated as
incomplete.
(Build_Shadow_Entity): Set the flag for shadow entities.
(Decorate_State): Set the flag for an abstract state.
(Build_Limited_Views): Set the flag for limited view of package.
* sem_attr.adb (Check_Not_Incomplete_Type): Disable the check when
this is a current instance.

Ronan Desplanques [Fri, 3 Mar 2023 14:21:16 +0000 (15:21 +0100)]

ada: Handle controlling access parameters in DTWs

This patch improves the way controlling access parameters are
handled in dispatch table wrappers. The constructions of both the
specifications and the bodies of wrappers are modified.

gcc/ada/

* freeze.adb (Build_DTW_Body): Add appropriate type conversions for
controlling access parameters.
* sem_util.adb (Build_Overriding_Spec): Fix designated types in
controlling access parameters.

Bob Duff [Thu, 2 Mar 2023 15:12:29 +0000 (10:12 -0500)]

ada: Add Entry_Cancel_Parameter to E_Label

...and other (minor) changes.

gcc/ada/

* gen_il-gen-gen_entities.adb (E_Label): Add
Entry_Cancel_Parameter. This is necessary because
Analyze_Implicit_Label_Declaration set the Ekind to E_Label.
Without this change, this field would fail the vanishing-fields
check in Atree (which is currently commented out).
* einfo.ads (Entry_Cancel_Parameter): Document for E_Label.
* sem_eval.adb (Why_Not_Static): Protect against previous errors
(no need to explain why something is not static if it's already
illegal for other reasons).
* sem_util.ads (Enter_Name): Fix misleading comment.

Eric Botcazou [Thu, 2 Mar 2023 10:51:22 +0000 (11:51 +0100)]

ada: Minor fixes in description of scope depth

In particular, the scope depth of library units is 1 instead of 0.

gcc/ada/

* einfo.ads (Scope_Depth): Fix circular definition.
(Scope_Depth_Value): Fix value for library units.

Piotr Trojanek [Thu, 2 Mar 2023 14:11:40 +0000 (15:11 +0100)]

ada: Tune warning about assignment just before a raise statement

Tune warning about a possibly ineffective assignment to a formal
parameter that happens just before a raise statement.

The warning is now emitted for parameters of all by-copy types and not
just of scalar types (this gives more warnings), but is suppressed for
aliased parameters (this removes some spurious warnings).

gcc/ada/

* sem_ch11.adb (Analyze_Raise_Expression): Tune warning condition.
* libgnat/g-dirope.ads (Open): Remove a potentially inaccurate comment.
* libgnat/g-dirope.adb (Open): Remove a potentially useless assignment;
the Dir output parameter should be assigned a null value anyway by the
preceding call to Free.

Piotr Trojanek [Thu, 2 Mar 2023 14:09:24 +0000 (15:09 +0100)]

ada: Accept aliased parameters in Exceptional_Cases

Aliased parameters, just like parameters of by-reference types, can safely
appear in consequences of the Exceptional_Cases aspect.

gcc/ada/

* sem_res.adb (Resolve_Entity_Name): Allow aliased parameters; tune
error message.

Marc Poulhiès [Tue, 28 Feb 2023 16:10:29 +0000 (17:10 +0100)]

ada: Fix incorrect handling of Aggregate aspect

This change fixes two cases of incorrect handling of the aspect.
The arguments are now correctly resolved and the aspect is rejected on
non-array types.

gcc/ada/

* sem_ch13.adb (Analyze_One_Aspect): Mark Aggregate aspect as
needing delayed resolution and reject the aspect on non-array
type.

Bob Duff [Thu, 2 Mar 2023 14:44:03 +0000 (09:44 -0500)]

ada: Fix obsolete comment in Sinfo.Utils

...caused by moving code here from Atree.

gcc/ada/

* sinfo-utils.adb: Update comment to refer to
New_Node_Debugging_Output.

Marc Poulhiès [Tue, 28 Feb 2023 10:01:47 +0000 (11:01 +0100)]

ada: Fix SPARK context not restored when Load_Unit is failing

When Load_Unit fails to find the unit or encounters an error, the
Load_Fail procedure is called and an exception is raised, skipping the
restoration of the SPARK/Ghost context stored on procedure entry.

gcc/ada/

* rtsfind.adb (Load_RTU.Restore_SPARK_Context): New.
(Load_RTU): Use Restore_SPARK_Context on all exit paths.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Initialize local
variable to Empty.

Piotr Trojanek [Tue, 7 Feb 2023 23:54:06 +0000 (00:54 +0100)]

ada: Restrict use of formal parameters within exceptional cases

Restrict references to formal parameters within the new SPARK aspect
Exceptional_Cases and allow occurrences of 'Old in this aspect.

gcc/ada/

* sem_attr.adb
(Analyze_Attribute_Old_Result): Allow uses of 'Old and 'Result within
the new aspect.
* sem_res.adb
(Within_Exceptional_Cases_Consequence): New utility routine.
(Resolve_Entity_Name): Restrict use of formal parameters within the
new aspect.

Juzhe-Zhong [Thu, 25 May 2023 06:19:29 +0000 (14:19 +0800)]

RISC-V: Remove FRM_REGNUM dependency for rtx conversions

According to RVV ISA:
The conversions use the dynamic rounding mode in frm, except for the rtz
variants, which round towards zero.

So rtz conversion patterns should not have FRM dependency.

We can't support mode switching for FRM yet since the RVV intrinsic doc is
not updated, but I think this patch is correct.

gcc/ChangeLog:

* config/riscv/vector.md: Remove FRM_REGNUM dependency in rtz
instructions.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Christophe Lyon [Tue, 23 May 2023 09:20:05 +0000 (09:20 +0000)]

testsuite, analyzer: Fix testcases with fclose

The gcc.dg/analyzer/data-model-4.c and
gcc.dg/analyzer/torture/conftest-1.c fail with recent glibc headers
and succeed with older headers.

The new error message is:
warning: use of possibly-NULL 'f' where non-null expected [CWE-690] [-Wanalyzer-possible-null-argument]

Like similar previous fixes in this area, this patch updates the
testcase so that this warning isn't reported.

2023-05-23 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.dg/analyzer/data-model-4.c: Exit if fopen returns NULL.
* gcc.dg/analyzer/torture/conftest-1.c: Likewise.
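
The pattern behind this fix can be sketched as follows. This is only an illustration, not the actual testcase: the helper `count_bytes` is hypothetical. The point is that the stream is checked right after fopen, so a possibly-NULL pointer never reaches functions that expect a non-null argument.

```cpp
#include <cstdio>

// Hypothetical helper (not taken from the GCC testsuite): counts the bytes
// in a file and returns -1 if the file cannot be opened.
long count_bytes(const char *path)
{
  std::FILE *f = std::fopen(path, "rb");
  if (f == nullptr)
    return -1;  // leave early, so a NULL stream is never used below

  long n = 0;
  while (std::fgetc(f) != EOF)
    ++n;

  std::fclose(f);  // f is known to be non-null on this path
  return n;
}
```

With the early return in place, every later use of `f` is only reachable when fopen succeeded.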

Aldy Hernandez [Wed, 24 May 2023 17:57:00 +0000 (19:57 +0200)]

Stream out NANs correctly.

NANs don't have bounds, so there's no need to stream them out.

gcc/ChangeLog:

* data-streamer-in.cc (streamer_read_value_range): Handle NANs.
* data-streamer-out.cc (streamer_write_vrange): Same.
* value-range.h (class vrange): Make streamer_write_vrange a friend.

Aldy Hernandez [Wed, 24 May 2023 17:55:09 +0000 (19:55 +0200)]

Disallow setting of NANs in frange setter unless setting trees.

frange::set() is confusing in that we can set a NAN by specifying a
bound of +-NAN, even though we technically disallow NANs in the setter
because the kind can never be VR_NAN. This is a wart for
get_tree_range(), which builds a range out of a tree from the source,
to work correctly. It's ugly, and it showed its limitation while
implementing LTO streaming of ranges.

This patch disallows passing NAN bounds in frange::set() and fixes
get_tree_range.

gcc/ChangeLog:

* value-query.cc (range_query::get_tree_range): Set NAN directly
if necessary.
* value-range.cc (frange::set): Assert that bounds are not NAN.

Aldy Hernandez [Wed, 24 May 2023 17:53:53 +0000 (19:53 +0200)]

Hash known NANs correctly for franges.

We're ICEing when trying to hash a known NAN. This is unnoticeable
because the only user would be IPA, and even so, it currently doesn't
handle floats. However, handling floats is a flip of a switch, so
it's best to handle them already.

gcc/ChangeLog:

* value-range.cc (add_vrange): Handle known NANs.

Aldy Hernandez [Wed, 24 May 2023 17:47:02 +0000 (19:47 +0200)]

Add an frange::set_nan() variant that takes a nan_state.

Generalize frange::set_nan() to take a nan_state and make current
set_nan() methods syntactic sugar.

This is in preparation for better streaming of NANs for LTO/IPA.

gcc/ChangeLog:

* value-range.h (frange::set_nan): New.

Alexandre Oliva [Wed, 24 May 2023 06:07:56 +0000 (03:07 -0300)]

[PR100106] Reject unaligned subregs when strict alignment is required

The testcase for pr100106, compiled with optimization for 32-bit
powerpc -mcpu=604 with -mstrict-align expands the initialization of a
union from a float _Complex value into a load from an SCmode
constant pool entry, aligned to 4 bytes, into a DImode pseudo,
requiring 8-byte alignment.

The patch that introduced the testcase modified simplify_subreg to
avoid changing the MEM to outermode, but simplify_gen_subreg still
creates a SUBREG or a MEM that would require stricter alignment than
MEM's, and lra_constraints appears to get confused by that, repeatedly
creating unsatisfiable reloads for the SUBREG until it exceeds the
insn count.

Avoiding the unaligned SUBREG, expand splits the DImode dest into
SUBREGs and loads each SImode word of the constant pool with the
proper alignment.

for gcc/ChangeLog

PR target/100106
* emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
requires stricter alignment than MEM's.

for gcc/testsuite/ChangeLog

PR target/100106
* gcc.target/powerpc/pr100106-sa.c: New.

Alexandre Oliva [Wed, 24 May 2023 06:07:52 +0000 (03:07 -0300)]

[testsuite] require profiling for -pg

Fix two tests that use -pg but don't declare their requirement for
profiling support.

for gcc/testsuite/ChangeLog

* gcc.target/i386/mcount_pic.c: Add dg-require-profiling.
* gcc.target/i386/pr104447.c: Likewise.

Alexandre Oliva [Wed, 24 May 2023 06:07:49 +0000 (03:07 -0300)]

[testsuite] require pthread for openmp

Fix test that uses -fopenmp without declaring requirement for pthread
support.

for gcc/testsuite/ChangeLog

* g++.dg/pr80481.C: Add explicit pthread requirement.

Alexandre Oliva [Wed, 24 May 2023 06:07:47 +0000 (03:07 -0300)]

[testsuite] require pic for pr103074.c

Fix test that uses -fPIC without stating the requirement for PIC
support.

for gcc/testsuite/ChangeLog

* gcc.target/i386/pr103074.c: Require fpic support.

Alexandre Oliva [Wed, 24 May 2023 06:07:46 +0000 (03:07 -0300)]

[testsuite] tsvc: skip include malloc.h when unavailable

tsvc tests all fail on systems that don't offer a malloc.h, other than
those that explicitly rule that out.  Use the preprocessor to test for
malloc.h's availability.

tsvc.h also expects a definition for struct timeval, but it doesn't
include sys/time.h.  Add a conditional include thereof.

for gcc/testsuite/ChangeLog

* gcc.dg/vect/tsvc/tsvc.h: Test for and conditionally include
malloc.h and sys/time.h.
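
A preprocessor probe of this kind might look like the sketch below. It is not the contents of tsvc.h: the `SKETCH_HAVE_*` macros are hypothetical names introduced for illustration, and the sketch assumes a compiler that provides `__has_include` (standard since C++17, and available in GCC as an extension in other modes).

```cpp
// Probe for optional headers instead of assuming they exist.
#if defined(__has_include)
#  if __has_include(<malloc.h>)
#    include <malloc.h>
#    define SKETCH_HAVE_MALLOC_H 1
#  endif
#  if __has_include(<sys/time.h>)
#    include <sys/time.h>
#    define SKETCH_HAVE_SYS_TIME_H 1
#  endif
#endif

// Fall back to 0 when a header was not found (or could not be probed).
#ifndef SKETCH_HAVE_MALLOC_H
#  define SKETCH_HAVE_MALLOC_H 0
#endif
#ifndef SKETCH_HAVE_SYS_TIME_H
#  define SKETCH_HAVE_SYS_TIME_H 0
#endif

#if SKETCH_HAVE_SYS_TIME_H
// struct timeval is only referenced when its header was actually included.
inline long sketch_seconds(const struct timeval &tv)
{
  return static_cast<long>(tv.tv_sec);
}
#endif
```

Nesting the `__has_include` test inside the `defined` check keeps the file parseable on preprocessors that lack the operator.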

Alexandre Oliva [Wed, 24 May 2023 06:08:12 +0000 (03:08 -0300)]

[libstdc++] [testsuite] xfail to_chars/long_double on x86-vxworks

Just as on aarch64, x86's wider long double experiences loss of
precision with from_chars implemented in terms of double. Expect the
execution to fail.

for libstdc++-v3/ChangeLog

* testsuite/20_util/to_chars/long_double.cc: Expect execution
fail on x86-vxworks.

Alexandre Oliva [Wed, 24 May 2023 06:08:10 +0000 (03:08 -0300)]

[testsuite] [x86] cope with --enable-frame-pointer

Various x86 tests fail if the toolchain is configured with
--enable-frame-pointer, because the unexpected extra insns mess with
the expected asm counts. Add -fomit-frame-pointer so that they can
still pass.

for gcc/testsuite/ChangeLog

* gcc.target/i386/pieces-memcpy-7.c: Add -fomit-frame-pointer.
* gcc.target/i386/pieces-memcpy-8.c: Likewise.
* gcc.target/i386/pieces-memcpy-9.c: Likewise.
* gcc.target/i386/pieces-memset-1.c: Likewise.
* gcc.target/i386/pieces-memset-36.c: Likewise.
* gcc.target/i386/pieces-memset-4.c: Likewise.
* gcc.target/i386/pieces-memset-40.c: Likewise.
* gcc.target/i386/pieces-memset-41.c: Likewise.
* gcc.target/i386/pieces-memset-7.c: Likewise.
* gcc.target/i386/pieces-memset-8.c: Likewise.
* gcc.target/i386/pieces-memset-9.c: Likewise.
* gcc.target/i386/pr102230.c: Likewise.
* gcc.target/i386/pr78103-2.c: Likewise.

GCC Administrator [Thu, 25 May 2023 00:16:49 +0000 (00:16 +0000)]

Daily bump.

Andrew MacLeod [Wed, 24 May 2023 13:52:26 +0000 (09:52 -0400)]

Gimple range PHI analyzer and testcases

Provide a PHI analyzer framework to provide better initial values for
PHI nodes which form groups with initial values and single statements
which modify the PHI values in some predictable way.

PR tree-optimization/107822
PR tree-optimization/107986
gcc/
* Makefile.in (OBJS): Add gimple-range-phi.o.
* gimple-range-cache.h (ranger_cache::m_estimate): New
phi_analyzer pointer member.
* gimple-range-fold.cc (fold_using_range::range_of_phi): Use
phi_analyzer if no loop info is available.
* gimple-range-phi.cc: New file.
* gimple-range-phi.h: New file.
* tree-vrp.cc (execute_ranger_vrp): Utilize a phi_analyzer.

gcc/testsuite/
* gcc.dg/pr107822.c: New.
* gcc.dg/pr107986-1.c: New.

Andrew MacLeod [Wed, 24 May 2023 13:17:32 +0000 (09:17 -0400)]

Provide relation queries for a stmt.

Allow fur_list and fold_stmt to be provided a range_query rather than
always defaulting to NULL (which becomes a global query).
Also provide a fold_relations () routine which can provide a range_trio
for an arbitrary statement using any range_query.

* gimple-range-fold.cc (fur_list::fur_list): Add range_query param
to constructors.
(fold_range): Add range_query parameter.
(fur_relation::fur_relation): New.
(fur_relation::trio): New.
(fur_relation::register_relation): New.
(fold_relations): New.
* gimple-range-fold.h (fold_range): Adjust prototypes.
(fold_relations): New.

Andrew MacLeod [Wed, 24 May 2023 13:06:26 +0000 (09:06 -0400)]

Make ssa_cache a range_query.

By providing range_of_expr as a range_query, we can fold and do other
interesting things using values from the global table. Make ranger's
known globals available via const_query.

* gimple-range-cache.cc (ssa_cache::range_of_expr): New.
* gimple-range-cache.h (class ssa_cache): Inherit from range_query.
(ranger_cache::const_query): New.
* gimple-range.cc (gimple_ranger::const_query): New.
* gimple-range.h (gimple_ranger::const_query): New prototype.

Andrew MacLeod [Wed, 24 May 2023 12:49:30 +0000 (08:49 -0400)]

Make ssa_cache and ssa_lazy_cache virtual.

Making them virtual allows us to use the caches interchangeably.

* gimple-range-cache.cc (ssa_cache::dump): Use get_range.
(ssa_cache::dump_range_query): Delete.
(ssa_lazy_cache::dump_range_query): Delete.
(ssa_lazy_cache::get_range): Move from header file.
(ssa_lazy_cache::clear_range): Ditto.
(ssa_lazy_cache::clear): Ditto.
* gimple-range-cache.h (class ssa_cache): Virtualize.
(class ssa_lazy_cache): Inherit and virtualize.

Harald Anlauf [Wed, 24 May 2023 19:04:43 +0000 (21:04 +0200)]

Fortran: reject bad DIM argument of SIZE intrinsic in simplification [PR104350]

gcc/fortran/ChangeLog:

PR fortran/104350
* simplify.cc (simplify_size): Reject DIM argument of intrinsic SIZE
with error when out of valid range.

gcc/testsuite/ChangeLog:

PR fortran/104350
* gfortran.dg/size_dim_2.f90: New test.

Harald Anlauf [Sun, 21 May 2023 20:25:29 +0000 (22:25 +0200)]

Fortran: checking and simplification of RESHAPE intrinsic [PR103794]

gcc/fortran/ChangeLog:

PR fortran/103794
* check.cc (gfc_check_reshape): Expand constant arguments SHAPE and
ORDER before checking.
* gfortran.h (gfc_is_constant_array_expr): Add prototype.
* iresolve.cc (gfc_resolve_reshape): Expand constant argument SHAPE.
* simplify.cc (is_constant_array_expr): If array is determined to be
constant, expand small array constructors if needed.
(gfc_is_constant_array_expr): Wrapper for is_constant_array_expr.
(gfc_simplify_reshape): Fix check for insufficient elements in SOURCE
when no padding specified.

gcc/testsuite/ChangeLog:

PR fortran/103794
* gfortran.dg/reshape_10.f90: New test.
* gfortran.dg/reshape_11.f90: New test.

Matthias Kretz [Wed, 24 May 2023 14:43:07 +0000 (16:43 +0200)]

libstdc++: Fix type of first argument to vec_cntm call

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.

Aldy Hernandez [Wed, 24 May 2023 17:59:20 +0000 (19:59 +0200)]

Remove deprecated vrange::kind().

gcc/ChangeLog:

* value-range.h (vrange::kind): Remove.

Roger Sayle [Wed, 24 May 2023 16:32:20 +0000 (17:32 +0100)]

PR middle-end/109840: Preserve popcount/parity type in match.pd.

PR middle-end/109840 is a regression introduced by my recent patch to
fold popcount(bswap(x)) as popcount(x).  When the bswap and the popcount
have the same precision, everything works fine, but this optimization also
allowed a zero-extension between the two.  The oversight is that we need
to be strict with type conversions, both to avoid accidentally changing
the argument type to popcount, and also to reflect the effects of
argument/return-value promotion in the call to bswap, so this zero extension
needs to be preserved/explicit in the optimized form.

Interestingly, match.pd should (in theory) be able to narrow calls to
popcount and parity, removing a zero-extension from its argument, but
that is an independent optimization, that needs to check IFN_ support.
Many thanks to Andrew Pinski for his help/fixes with these transformations.

2023-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR middle-end/109840
* match.pd <popcount optimizations>: Preserve zero-extension when
optimizing popcount((T)bswap(x)) and popcount((T)rotate(x,y)) as
popcount((T)x), so the popcount's argument keeps the same type.
<parity optimizations>:  Likewise preserve extensions when
simplifying parity((T)bswap(x)) and parity((T)rotate(x,y)) as
parity((T)x), so that the parity's argument type is the same.

gcc/testsuite/ChangeLog
PR middle-end/109840
* gcc.dg/fold-parity-8.c: New test.
* gcc.dg/fold-popcount-11.c: Likewise.
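
The identity being folded can be illustrated with a small sketch. This is not the match.pd pattern or either of the new tests: the function names are made up for illustration, and the code assumes the GCC/Clang builtins `__builtin_bswap32` and `__builtin_popcountll`. It shows the source form with a zero-extension between the bswap and the popcount, and the folded form in which the bswap is dropped but the zero-extension stays explicit so that the popcount argument keeps its 64-bit type.

```cpp
#include <cstdint>

// Source form: popcount((T)bswap(x)), where the cast to the wider type T
// is a zero-extension of the 32-bit bswap result.
int popcount_of_swapped(std::uint32_t x)
{
  return __builtin_popcountll(static_cast<std::uint64_t>(__builtin_bswap32(x)));
}

// Folded form: popcount((T)x). Swapping bytes does not change the number of
// set bits, so the bswap can go, but the cast to T is kept.
int popcount_folded(std::uint32_t x)
{
  return __builtin_popcountll(static_cast<std::uint64_t>(x));
}
```

Both functions call the same 64-bit popcount builtin; only the byte swap differs, which is why the two always agree.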

Aldy Hernandez [Wed, 17 May 2023 09:29:32 +0000 (11:29 +0200)]

Provide an API for ipa_vr.

This patch encapsulates the ipa_vr internals into an API. It also
makes it type agnostic, in preparation for upcoming changes to IPA.

Interestingly, there's a 0.44% improvement to IPA-cp, which I'm sure
we'll soak up with future changes in this area :).

gcc/ChangeLog:

* ipa-cp.cc (ipa_value_range_from_jfunc): Use new ipa_vr API.
(ipcp_store_vr_results): Same.
* ipa-prop.cc (ipa_vr::ipa_vr): New.
(ipa_vr::get_vrange): New.
(ipa_vr::set_unknown): New.
(ipa_vr::streamer_read): New.
(ipa_vr::streamer_write): New.
(write_ipcp_transformation_info): Use new ipa_vr API.
(read_ipcp_transformation_info): Same.
(ipa_vr::nonzero_p): Delete.
(ipcp_update_vr): Use new ipa_vr API.
* ipa-prop.h (class ipa_vr): Provide an API and hide internals.
* ipa-sra.cc (zap_useless_ipcp_results): Use new ipa_vr API.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/pr78121.c: Adjust for vrange::dump use.
* gcc.dg/ipa/vrp1.c: Same.
* gcc.dg/ipa/vrp2.c: Same.
* gcc.dg/ipa/vrp3.c: Same.
* gcc.dg/ipa/vrp4.c: Same.
* gcc.dg/ipa/vrp5.c: Same.
* gcc.dg/ipa/vrp6.c: Same.
* gcc.dg/ipa/vrp7.c: Same.
* gcc.dg/ipa/vrp8.c: Same.

Jan-Benedict Glaw [Wed, 24 May 2023 14:35:22 +0000 (16:35 +0200)]

Fix sprintf length warning

One of the supplied argument strings is unnecessarily long (c-sky, using
basically the same code, fixed it to a shorter length). Shortening it fixes
overflow warnings, as GCC fails to deduce that the full 256 bytes for
load_op[] are not used at all.

gcc/ChangeLog:

* config/mcore/mcore.cc (output_inline_const): Make buffer smaller to
silence overflow warnings later on.
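
Why the size of an intermediate buffer matters can be sketched as follows. This is not the mcore.cc code: `emit_const`, the "movi" template and the buffer sizes are hypothetical. When a char array is later formatted with `%s`, the compiler may assume the string can be as long as the array allows, so a small intermediate buffer bounds the worst case it has to consider for the final output.

```cpp
#include <cstdio>
#include <cstddef>

// Hypothetical sketch: build a short operation string first, then embed it
// in the final output line.
int emit_const(char *out, std::size_t out_size, int value)
{
  // Sized for the longest template actually produced, not a generous 256.
  char load_op[32];
  std::snprintf(load_op, sizeof load_op, "movi\t%%0,%d", value);

  // The length contributed by %s is bounded by sizeof load_op - 1.
  return std::snprintf(out, out_size, "%s\t// %d", load_op, value);
}
```

The sketch uses snprintf so that it stays safe regardless of the sizes chosen; the commit itself only shrinks a buffer.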

Uros Bizjak [Wed, 24 May 2023 14:17:55 +0000 (16:17 +0200)]

i386: Add v<any_shift:insn>v4qi3 expander

Also, move v<any_shift:insn>v8qi3 expander to a better place and enable
it with TARGET_MMX_WITH_SSE. Remove handling of V8QImode from
ix86_expand_vecop_qihi2 since all partial QI->HI vector modes expand
via ix86_expand_vecop_qihi_partial.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2):
Remove handling of V8QImode.
* config/i386/mmx.md (v<insn>v8qi3): Move from sse.md.
Call ix86_expand_vecop_qihi_partial. Enable for TARGET_MMX_WITH_SSE.
(v<insn>v4qi3): Ditto.
* config/i386/sse.md (v<insn>v8qi3): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/i386/vect-shiftv4qi.c (dg-options):
Remove -ftree-vectorize.
* gcc.target/i386/vect-shiftv8qi.c (dg-options): Ditto.
* gcc.target/i386/vect-vshiftv4qi.c: New test.
* gcc.target/i386/vect-vshiftv8qi.c: New test.

Kyrylo Tkachov [Wed, 24 May 2023 13:52:34 +0000 (14:52 +0100)]

aarch64: PR target/99195 Annotate vector shift patterns for vec-concat-zero

Continuing the series of straightforward annotations, this one handles the normal (not widening or narrowing) vector shifts.
Tests included.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_simd_lshr<mode>): Rename to...
(aarch64_simd_lshr<mode><vczle><vczbe>): ... This.
(aarch64_simd_ashr<mode>): Rename to...
(aarch64_simd_ashr<mode><vczle><vczbe>): ... This.
(aarch64_simd_imm_shl<mode>): Rename to...
(aarch64_simd_imm_shl<mode><vczle><vczbe>): ... This.
(aarch64_simd_reg_sshl<mode>): Rename to...
(aarch64_simd_reg_sshl<mode><vczle><vczbe>): ... This.
(aarch64_simd_reg_shl<mode>_unsigned): Rename to...
(aarch64_simd_reg_shl<mode>_unsigned<vczle><vczbe>): ... This.
(aarch64_simd_reg_shl<mode>_signed): Rename to...
(aarch64_simd_reg_shl<mode>_signed<vczle><vczbe>): ... This.
(vec_shr_<mode>): Rename to...
(vec_shr_<mode><vczle><vczbe>): ... This.
(aarch64_<sur>shl<mode>): Rename to...
(aarch64_<sur>shl<mode><vczle><vczbe>): ... This.
(aarch64_<sur>q<r>shl<mode>): Rename to...
(aarch64_<sur>q<r>shl<mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_1.c: Add testing for shifts.
* gcc.target/aarch64/simd/pr99195_6.c: Likewise.
* gcc.target/aarch64/simd/pr99195_8.c: New test.

Richard Biener [Wed, 24 May 2023 08:07:36 +0000 (10:07 +0200)]

target/109944 - avoid STLF fail for V16QImode CTOR expansion

The following dispatches to V2DImode CTOR expansion instead of
using sets of (subreg:DI (reg:V16QI 146) [08]) which causes
LRA to spill DImode and reload V16QImode. The same applies for
V8QImode or V4HImode construction from SImode parts which happens
during 32bit libgcc build.

PR target/109944
* config/i386/i386-expand.cc (ix86_expand_vector_init_general):
Perform final vector composition using
ix86_expand_vector_init_general instead of setting
the highpart and lowpart which causes spilling.

* gcc.target/i386/pr109944-1.c: New testcase.
* gcc.target/i386/pr109944-2.c: Likewise.

Andrew MacLeod [Tue, 23 May 2023 19:41:03 +0000 (15:41 -0400)]

Only update global value if it changes.

Do not update and propagate a global value if it hasn't changed.

PR tree-optimization/109695
* gimple-range-cache.cc (ranger_cache::get_global_range): Add
changed param.
* gimple-range-cache.h (ranger_cache::get_global_range): Ditto.
* gimple-range.cc (gimple_ranger::range_of_stmt): Pass changed
flag to set_global_range.
(gimple_ranger::prefill_stmt_dependencies): Ditto.

Andrew MacLeod [Tue, 23 May 2023 19:20:56 +0000 (15:20 -0400)]

Use negative values to reflect always_current in the temporal cache.

Instead of using 0, use negative timestamps to reflect always_current state.
If the value doesn't change, keep the timestamp rather than creating a new
one and invalidating any dependencies.

PR tree-optimization/109695
* gimple-range-cache.cc (temporal_cache::temporal_value): Return
a positive int.
(temporal_cache::current_p): Check always_current method.
(temporal_cache::set_always_current): Add param and set value
appropriately.
(temporal_cache::always_current_p): New.
(ranger_cache::get_global_range): Adjust.
(ranger_cache::set_global_range): Set always current first.
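
The idea of encoding the always_current state in the sign of the timestamp can be sketched with a toy model. This is not GCC's temporal_cache: the class `toy_temporal_cache`, its index-based interface and `set_timestamp` are assumptions made for illustration, and only the method names mentioned in the ChangeLog are borrowed.

```cpp
#include <cstdlib>
#include <cstddef>
#include <vector>

// Toy model: a negative timestamp marks an entry as "always current",
// while its magnitude still records when the value was last set.
class toy_temporal_cache
{
public:
  explicit toy_temporal_cache(std::size_t n) : m_stamp(n, 0), m_clock(0) {}

  // Always a non-negative value, whatever the always-current state is.
  int temporal_value(std::size_t i) const { return std::abs(m_stamp[i]); }

  bool always_current_p(std::size_t i) const { return m_stamp[i] < 0; }

  // Flip only the sign, so the existing timestamp is kept.
  void set_always_current(std::size_t i, bool value)
  {
    int ts = temporal_value(i);
    if (ts == 0)
      ts = ++m_clock;  // no timestamp yet: create one so the sign can carry the flag
    m_stamp[i] = value ? -ts : ts;
  }

  // The value of entry i changed: give it a fresh timestamp.
  void set_timestamp(std::size_t i) { m_stamp[i] = ++m_clock; }

  // Entry i is current if flagged, or if neither dependency is newer.
  bool current_p(std::size_t i, std::size_t dep1, std::size_t dep2) const
  {
    if (always_current_p(i))
      return true;
    int ts = temporal_value(i);
    return temporal_value(dep1) <= ts && temporal_value(dep2) <= ts;
  }

private:
  std::vector<int> m_stamp;
  int m_clock;
};
```

Because clearing the flag restores the original positive timestamp, an entry whose value did not change does not look newer to anything that depends on it.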

Andrew MacLeod [Tue, 23 May 2023 19:11:44 +0000 (15:11 -0400)]

Choose better initial values for ranger.

Instead of defaulting to VARYING, fold the stmt using just global ranges.

PR tree-optimization/109695
* gimple-range-cache.cc (ranger_cache::get_global_range): Call
fold_range with global query to choose an initial value.

Juzhe-Zhong [Wed, 24 May 2023 11:40:37 +0000 (19:40 +0800)]

RISC-V: Add FRM_ prefix to dynamic rounding mode enum

An obvious fix to make all enum naming consistent.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum frm_field_enum): Add FRM_
prefix.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Richard Biener [Wed, 24 May 2023 10:36:28 +0000 (12:36 +0200)]

tree-optimization/109849 - fix fallout of PRE hoisting change

The PR109849 fix made us no longer hoist some memory loads because
of the expression set intersection. We can still avoid computing
the union by simply taking the first set's expressions and leaving
the pruning of expressions with values not suitable for hoisting
to sorted_array_from_bitmap_set.

PR tree-optimization/109849
* tree-ssa-pre.cc (do_hoist_insertion): Do not intersect
expressions but take the first sets.

* gcc.dg/tree-ssa/ssa-hoist-9.c: New testcase.

Matthias Kretz [Wed, 24 May 2023 10:50:46 +0000 (12:50 +0200)]

libstdc++: Fix SFINAE for __is_intrinsic_type on ARM

On ARM, NEON doesn't support double, so __is_intrinsic_type_v<double,
whatever> should say false (instead of being ill-formed).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.

Matthias Kretz [Tue, 23 May 2023 21:48:49 +0000 (23:48 +0200)]

libstdc++: Add missing constexpr to simd_neon

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.

Gaius Mulley [Wed, 24 May 2023 10:14:07 +0000 (11:14 +0100)]

PR modula2/109952 Inconsistent HIGH values with 'ARRAY OF CHAR'

This patch fixes the case when a single character constant literal is
passed as a string actual parameter to an ARRAY OF CHAR formal parameter.
To be consistent a single character is promoted to a string and nul
terminated (and its high value is 1). Previously a single character
string would not be nul terminated and the high value was 0.
The documentation now includes a section describing the expected behavior
and included in this patch is some regression test code matching the
table inside the documentation.

gcc/ChangeLog:

PR modula2/109952
* doc/gm2.texi (High procedure function): New node.
(Using): New menu entry for High procedure function.

gcc/m2/ChangeLog:

PR modula2/109952
* Make-maintainer.in: Change header to include emacs file mode.
* gm2-compiler/M2GenGCC.mod (BuildHighFromChar): Check whether
operand is a constant string and is nul terminated then return one.
* gm2-compiler/PCSymBuild.mod (WalkFunction): Add default return
TRUE. Static analysis missing return path fix.
* gm2-libs/IO.mod (Init): Rewrite to help static analysis.
* target-independent/m2/gm2-libs.texi: Rebuild.

gcc/testsuite/ChangeLog:

PR modula2/109952
* gm2/pim/run/pass/hightests.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
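
As a rough C analogue of the HIGH arithmetic described above (an assumption-laden sketch, not gm2 code): HIGH is the last valid index of the array, so a single character stored with a nul terminator occupies two elements and has a high value of 1, while the old one-element form had a high value of 0. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Last valid index of an array with N_ELEMENTS elements, which is what
   HIGH reports for an open array.  */
size_t
high_of (size_t n_elements)
{
  assert (n_elements > 0);
  return n_elements - 1;
}

/* Element count of a single character promoted to a nul-terminated
   string: the character plus the terminator.  */
size_t
promoted_char_elements (void)
{
  return sizeof "a";	/* 'a' and '\0'.  */
}
```

Here high_of (promoted_char_elements ()) gives the new value and high_of (1) gives the previous one.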

Richard Sandiford [Wed, 24 May 2023 08:53:12 +0000 (09:53 +0100)]

early-remat: Resync with new DF postorders [PR109940]

When I wrote early-remat, the DF_FORWARD block order was a postorder
of a reverse/backward walk (i.e. of the inverted cfg), rather than a
reverse postorder of a forward walk.  A postorder of a backward walk
lacked the important property that dominators come before the blocks
they dominate; instead it ensures that postdominators come after
the blocks that they postdominate.

The DF_BACKWARD block order was similarly a postorder of a forward
walk.  Since early-remat wanted a standard postorder and reverse
postorder with normal dominator properties, it used the DF_BACKWARD
order instead of the DF_FORWARD order.

g:53dddbfeb213ac4ec39f fixed the DF orders so that DF_FORWARD was
an RPO of a forward walk and so that DF_BACKWARD was an RPO of a
backward walk.  This meant that iterating backwards over the
DF_BACKWARD order had the exact problem that the original DF_FORWARD
order had, triggering a flurry of ICEs for SVE.

This fixes the build with SVE enabled.  It also fixes an ICE
in g++.target/aarch64/sve/pr99766.C with normal builds.  I've
included the test from the PR as well, for extra coverage.

gcc/
PR rtl-optimization/109940
* early-remat.cc (postorder_index): Rename to...
(rpo_index): ...this.
(compare_candidates): Sort by decreasing rpo_index rather than
increasing postorder_index.
(early_remat::sort_candidates): Calculate the forward RPO from
DF_FORWARD.
(early_remat::local_phase): Follow forward RPO using DF_FORWARD,
rather than DF_BACKWARD in reverse.

gcc/testsuite/
* gcc.dg/torture/pr109940.c: New test.
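
The dominator property mentioned above can be checked on a toy graph. The following C sketch computes a reverse postorder of a forward walk over a small diamond-shaped CFG. The struct, the array bounds and every function name here are hypothetical and unrelated to GCC's DF code; the rpo_index helper merely borrows the name used in the ChangeLog.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_BLOCKS = 16, MAX_SUCCS = 2 };

/* Hypothetical CFG: succ[b][i] is the i-th successor of block b, or -1.  */
struct cfg
{
  int n_blocks;
  int succ[MAX_BLOCKS][MAX_SUCCS];
};

/* Diamond: 0 -> {1, 2}, 1 -> 3, 2 -> 3.  Block 0 dominates every block.  */
const struct cfg diamond_cfg = {
  4, { { 1, 2 }, { 3, -1 }, { 3, -1 }, { -1, -1 } }
};

/* Depth-first walk that appends each block after all of its successors.  */
static void
walk (const struct cfg *g, int b, bool *seen, int *post, int *n_post)
{
  seen[b] = true;
  for (int i = 0; i < MAX_SUCCS; i++)
    {
      int s = g->succ[b][i];
      if (s >= 0 && !seen[s])
	walk (g, s, seen, post, n_post);
    }
  post[(*n_post)++] = b;
}

/* Store the reverse postorder of a forward walk from ENTRY in RPO and
   return the number of reachable blocks.  */
int
forward_rpo (const struct cfg *g, int entry, int *rpo)
{
  bool seen[MAX_BLOCKS] = { false };
  int post[MAX_BLOCKS];
  int n = 0;

  assert (g->n_blocks <= MAX_BLOCKS && entry >= 0 && entry < g->n_blocks);
  walk (g, entry, seen, post, &n);
  for (int i = 0; i < n; i++)
    rpo[i] = post[n - 1 - i];
  return n;
}

/* Position of BLOCK in that order, or -1 if it is unreachable.  */
int
rpo_index (const struct cfg *g, int entry, int block)
{
  int rpo[MAX_BLOCKS];
  int n = forward_rpo (g, entry, rpo);
  for (int i = 0; i < n; i++)
    if (rpo[i] == block)
      return i;
  return -1;
}
```

In the diamond, the entry block gets index 0 and the join block gets the largest index, so every dominator is visited before the blocks it dominates.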

Kyrylo Tkachov [Wed, 24 May 2023 08:33:04 +0000 (09:33 +0100)]

arm: PR target/109939 Correct signedness of return type of __ssat intrinsics

As the PR says we shouldn't be using qualifier_unsigned for the return type of the __ssat intrinsics.
UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS already exists for that.
This was just a thinko.
This patch fixes this and the warning with -Wconversion goes away.

Bootstrapped and tested on arm-none-linux-gnueabihf.

gcc/ChangeLog:

PR target/109939
* config/arm/arm-builtins.cc (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS): Use
qualifier_none for the return operand.

gcc/testsuite/ChangeLog:

PR target/109939
* gcc.target/arm/pr109939.c: New test.
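
To see why the return type has to be signed, here is a portable C emulation of signed saturation. It is a sketch only: ssat_emul is a hypothetical helper, not the __ssat intrinsic, and it assumes a saturation width between 1 and 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate X to a signed BITS-bit range.  Hypothetical emulation for
   illustration; the result can be negative, so it must be signed.  */
int32_t
ssat_emul (int32_t x, unsigned bits)
{
  assert (bits >= 1 && bits <= 32);
  int64_t max = ((int64_t) 1 << (bits - 1)) - 1;
  int64_t min = -max - 1;

  if (x > max)
    return (int32_t) max;
  if (x < min)
    return (int32_t) min;
  return x;
}
```

Saturating -200 to 8 bits yields -128. Storing that in an unsigned return value would change it, which is consistent with the -Wconversion warning mentioned above.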

Juzhe-Zhong [Wed, 24 May 2023 07:31:46 +0000 (15:31 +0800)]

RISC-V: Add RVV mask logic auto-vectorization

This patch adds mask logic auto-vectorization.  The patterns are
defined as "define_insn_and_split" so that the combine pass can easily
combine a series of instructions.

For example:
combine vmxor.mm + vmnot.m into vmxnor.mm

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:

* config/riscv/autovec.md (<optab><mode>3): New pattern.
(one_cmpl<mode>2): Ditto.
(*<optab>not<mode>): Ditto.
(*n<optab><mode>): Ditto.
* config/riscv/riscv-v.cc (expand_vec_cmp_float): Change to
one_cmpl.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cmp/vcond-4.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-4.c: New test.
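
The combination in the example relies on a simple bitwise identity, sketched here in C on 64-bit integers standing in for mask registers. The function names are hypothetical, and the code models only the logic, not the RVV instructions or the patterns in autovec.md.

```c
#include <assert.h>
#include <stdint.h>

/* Each bit of the integer stands in for one mask element.  */
uint64_t
mask_xor (uint64_t a, uint64_t b)
{
  return a ^ b;
}

uint64_t
mask_not (uint64_t a)
{
  return ~a;
}

/* The fused operation: one step instead of xor followed by not.  */
uint64_t
mask_xnor (uint64_t a, uint64_t b)
{
  return ~(a ^ b);
}

/* Check that the two-instruction sequence and the fused form agree.  */
void
check_xnor_fusion (uint64_t a, uint64_t b)
{
  assert (mask_not (mask_xor (a, b)) == mask_xnor (a, b));
}
```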

Alexandre Oliva [Wed, 24 May 2023 06:07:58 +0000 (03:07 -0300)]

[testsuite] [ppc] xfail uninit-pred-9_b bogus warn on ppc32 too

The bogus warning is present on 32-bit ppc-vx7r2 too, so drop the 64
from the powerpc xfail triplet.

for gcc/testsuite/ChangeLog

* gcc.dg/uninit-pred-9_b.c: Xfail bogus warning on 32-bit ppc
as well.

Alexandre Oliva [Wed, 24 May 2023 06:07:44 +0000 (03:07 -0300)]

[testsuite] [i386] enable sse2 for signbit-2.c

The expected results for signbit-2 only arise on x86 with avx512f
disabled and sse2 enabled. The patch already disables avx512f
explicitly, but it fails to enable sse2.

for gcc/testsuite/ChangeLog

* gcc.dg/signbit-2.c: Add -msse2 on x86.

Alexandre Oliva [Wed, 24 May 2023 06:07:41 +0000 (03:07 -0300)]

Check for sysconf decl on vxworks

The sysconf function is only available in rtp mode on vxworks. In
kernel mode, it is not even declared, but the feature test macro in
the testsuite doesn't notice its absence because it's a link test, and
vxworks kernel mode uses partial linking.

This patch introduces an alternate test on vxworks targets to check
for a declaration and for an often-used sysconf parameter.

for gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_effective_target_sysconf):
Check for declaration and _SC_PAGESIZE on vxworks.
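
A compile-time probe of the kind described above might look like the following C sketch. It is an assumption about the shape of such a test, not the code added to target-supports.exp: the point is that it fails to compile when <unistd.h> does not declare sysconf or define _SC_PAGESIZE, rather than relying on the link step.

```c
#include <assert.h>
#include <unistd.h>

/* Compiles only if sysconf is declared and _SC_PAGESIZE is defined.
   The function name is hypothetical.  */
long
page_size (void)
{
  long size = sysconf (_SC_PAGESIZE);
  assert (size > 0);	/* sysconf returns -1 on failure.  */
  return size;
}
```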

Kewen Lin [Wed, 24 May 2023 05:05:01 +0000 (00:05 -0500)]

vect: Enhance cost evaluation in vect_transform_slp_perm_load_1

Following Richi's suggestion in [1], I'm working on deferring cost
evaluation so that it sits next to the transformation.  This patch
enhances vect_transform_slp_perm_load_1, which could under-cost
vector permutations: the costing doesn't consider nvectors_per_build,
so it is inconsistent with the transformation part.

Basically it changes the code below

  if (index == count)
    {
       if (!noop_p)
         {
           // A ...
           // ++*n_perms;

           if (!analyze_only)
             {
                // B1 ...
                // B2 ...
                for ...
                   // B3 building VEC_PERM_EXPR
             }
         }
       else if (!analyze_only)
         {
            // no B2 since there are no further uses here.
            for ...
              // B4 building nothing
         }
        // B5 ...
    }

to:

  if (index == count)
    {
       if (!noop_p)
         {
           // A ...

           if (!analyze_only)
             // B1 ...

           // B2 ... (trivial computations during analyze_only or not)

           for ...
             {
                // now n_perms is consistent with building VEC_PERM_EXPR
                // ++*n_perms;
                if (analyze_only)
                   continue;
                // B3 building VEC_PERM_EXPR
             }
         }
       else if (!analyze_only)
         {
            // no B2 since there are no further uses here.
            for ...
              // B4 building nothing
         }
        // B5 ...
    }

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html

gcc/ChangeLog:

* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Adjust the
calculation on n_perms by considering nvectors_per_build.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c: New test.

Juzhe-Zhong [Wed, 24 May 2023 03:37:01 +0000 (11:37 +0800)]

RISC-V: Add RVV comparison autovectorization

This patch enables RVV auto-vectorization of comparisons, including
floating-point unordered and ordered comparisons.

The testcases are borrowed from Richard, so he is included as co-author.

This patch is also a prerequisite for my current middle-end work.
Without it, I can't support the len_mask_xxx middle-end patterns,
since the mask is generated by a comparison.

For example:
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i];

We need len_mask_load/len_mask_store for such code, and I am going to
support them in the middle-end after this patch is merged.

Both integer and floating-point (ordered and unordered) comparisons
are tested.  Build and regression tests passed.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
gcc/ChangeLog:

* config/riscv/autovec.md (@vcond_mask_<mode><vm>): New pattern.
(vec_cmp<mode><vm>): New pattern.
(vec_cmpu<mode><vm>): New pattern.
(vcond<V:mode><VI:mode>): New pattern.
(vcondu<V:mode><VI:mode>): New pattern.
* config/riscv/riscv-protos.h (enum insn_type): Add new enum.
(emit_vlmax_merge_insn): New function.
(emit_vlmax_cmp_insn): Ditto.
(emit_vlmax_cmp_mu_insn): Ditto.
(expand_vec_cmp): Ditto.
(expand_vec_cmp_float): Ditto.
(expand_vcond): Ditto.
* config/riscv/riscv-v.cc (emit_vlmax_merge_insn): Ditto.
(emit_vlmax_cmp_insn): Ditto.
(emit_vlmax_cmp_mu_insn): Ditto.
(get_cmp_insn_code): Ditto.
(expand_vec_cmp): Ditto.
(expand_vec_cmp_float): Ditto.
(expand_vcond): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp:
* gcc.target/riscv/rvv/autovec/cmp/vcond-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond-3.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/vcond_run-3.c: New test.

Pan Li [Thu, 18 May 2023 06:21:30 +0000 (14:21 +0800)]

RISC-V: Support RVV VREINTERPRET from vbool*_t to vuint*m1_t

This patch supports the RVV VREINTERPRET from the vbool*_t to the
vuint*m1_t.  That is:

vuint*m1_t __riscv_vreinterpret_x_x(vbool*_t);

These APIs help users convert the vbool*_t vector to the LMUL=1
unsigned integer vuint*m1_t.  According to the RVV intrinsic spec linked
below, the reinterpret intrinsics only change the types of the underlying contents.

https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1

For example, given the code below:
vuint8m1_t test_vreinterpret_v_b1_vuint8m1 (vbool1_t src) {
  return __riscv_vreinterpret_v_b1_u8m1 (src);
}

it will generate assembly code similar to the following:
vsetvli a5,zero,e8,m8,ta,ma
vlm.v   v1,0(a1)
vs1r.v  v1,0(a0)
ret

Please note that the test files don't cover all the possible combinations
of the intrinsic APIs introduced by this patch because there are too many.
This is the last patch for the reinterpret between the signed/unsigned
and the bool vector types.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/genrvv-type-indexer.cc (main): Add
unsigned_eew*_lmul1_interpret for indexer.
* config/riscv/riscv-vector-builtins-functions.def (vreinterpret):
Register vuint*m1_t interpret function.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_UNSIGNED_EEW8_LMUL1_INTERPRET_OPS):
New macro for vuint8m1_t.
(DEF_RVV_UNSIGNED_EEW16_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_UNSIGNED_EEW32_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_UNSIGNED_EEW64_LMUL1_INTERPRET_OPS): Likewise.
(vbool1_t): Add to unsigned_eew*_interpret_ops.
(vbool2_t): Likewise.
(vbool4_t): Likewise.
(vbool8_t): Likewise.
(vbool16_t): Likewise.
(vbool32_t): Likewise.
(vbool64_t): Likewise.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_UNSIGNED_EEW8_LMUL1_INTERPRET_OPS):
New macro for vuint*m1_t.
(DEF_RVV_UNSIGNED_EEW16_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_UNSIGNED_EEW32_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_UNSIGNED_EEW64_LMUL1_INTERPRET_OPS): Likewise.
(required_extensions_p): Add vuint*m1_t interpret case.
* config/riscv/riscv-vector-builtins.def (unsigned_eew8_lmul1_interpret):
Add vuint*m1_t interpret to base type.
(unsigned_eew16_lmul1_interpret): Likewise.
(unsigned_eew32_lmul1_interpret): Likewise.
(unsigned_eew64_lmul1_interpret): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c:
Enrich test cases.

Pan Li [Thu, 18 May 2023 02:46:38 +0000 (10:46 +0800)]

RISC-V: Support RVV VREINTERPRET from vbool*_t to vint*m1_t

This patch supports the RVV VREINTERPRET from the vbool*_t to the
vint*m1_t.  That is:

vint*m1_t __riscv_vreinterpret_x_x(vbool*_t);

These APIs help users convert the vbool*_t vector to the LMUL=1
signed integer vint*m1_t.  According to the RVV intrinsic spec linked
below, the reinterpret intrinsics only change the types of the underlying contents.

https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1

For example, given the code below:
vint8m1_t test_vreinterpret_v_b1_vint8m1 (vbool1_t src) {
  return __riscv_vreinterpret_v_b1_i8m1 (src);
}

it will generate assembly code similar to the following:
vsetvli a5,zero,e8,m8,ta,ma
vlm.v   v1,0(a1)
vs1r.v  v1,0(a0)
ret

Please note that the test files don't cover all the possible combinations
of the intrinsic APIs introduced by this patch because there are too many.
The reinterpret from vbool*_t to vuint*m1_t with LMUL=1 will be covered
in another patch.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/genrvv-type-indexer.cc (EEW_SIZE_LIST): New macro
for the eew size list.
(LMUL1_LOG2): New macro for the log2 value of lmul=1.
(main): Add signed_eew*_lmul1_interpret for indexer.
* config/riscv/riscv-vector-builtins-functions.def (vreinterpret):
Register vint*m1_t interpret function.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_SIGNED_EEW8_LMUL1_INTERPRET_OPS):
New macro for vint8m1_t.
(DEF_RVV_SIGNED_EEW16_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_SIGNED_EEW32_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_SIGNED_EEW64_LMUL1_INTERPRET_OPS): Likewise.
(vbool1_t): Add to signed_eew*_interpret_ops.
(vbool2_t): Likewise.
(vbool4_t): Likewise.
(vbool8_t): Likewise.
(vbool16_t): Likewise.
(vbool32_t): Likewise.
(vbool64_t): Likewise.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_SIGNED_EEW8_LMUL1_INTERPRET_OPS):
New macro for vint*m1_t.
(DEF_RVV_SIGNED_EEW16_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_SIGNED_EEW32_LMUL1_INTERPRET_OPS): Likewise.
(DEF_RVV_SIGNED_EEW64_LMUL1_INTERPRET_OPS): Likewise.
(required_extensions_p): Add vint8m1_t interpret case.
* config/riscv/riscv-vector-builtins.def (signed_eew8_lmul1_interpret):
Add vint*m1_t interpret to base type.
(signed_eew16_lmul1_interpret): Likewise.
(signed_eew32_lmul1_interpret): Likewise.
(signed_eew64_lmul1_interpret): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c:
Enrich the test cases.

Juzhe-Zhong [Wed, 24 May 2023 02:59:02 +0000 (10:59 +0800)]

RISC-V: Fix incorrect code of reaching inaccessible memory address

To fix this issue, we separate the VL operand from the normal operands.

gcc/ChangeLog:

* config/riscv/autovec.md: Adjust for new interface.
* config/riscv/riscv-protos.h (emit_vlmax_insn): Add VL operand.
(emit_nonvlmax_insn): Add AVL operand.
* config/riscv/riscv-v.cc (emit_vlmax_insn): Add VL operand.
(emit_nonvlmax_insn): Add AVL operand.
(sew64_scalar_helper): Adjust for new interface.
(expand_tuple_move): Ditto.
* config/riscv/vector.md: Ditto.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Juzhe-Zhong [Wed, 24 May 2023 01:49:11 +0000 (09:49 +0800)]

RISC-V: Fix magic number of RVV auto-vectorization expander

This simple patch removes the magic numbers, which makes the code
more reasonable.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vec_series): Remove magic number.
(expand_const_vector): Ditto.
(legitimize_move): Ditto.
(sew64_scalar_helper): Ditto.
(expand_tuple_move): Ditto.
(expand_vector_init_insert_elems): Ditto.
* config/riscv/riscv.cc (vector_zero_call_used_regs): Ditto.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

liuhongt [Fri, 19 May 2023 05:55:50 +0000 (13:55 +0800)]

Fold _mm{,256,512}_abs_{epi8,epi16,epi32,epi64} into gimple ABS_EXPR.

Also for 64-bit vector abs intrinsics _mm_abs_{pi8,pi16,pi32}.

gcc/ChangeLog:

PR target/109900
* config/i386/i386.cc (ix86_gimple_fold_builtin): Fold
_mm{,256,512}_abs_{epi8,epi16,epi32,epi64} and
_mm_abs_{pi8,pi16,pi32} into gimple ABS_EXPR.
(ix86_masked_all_ones): Handle 64-bit mask.
* config/i386/i386-builtin.def: Replace icode of related
non-mask simd abs builtins with CODE_FOR_nothing.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr109900.c: New test.

GCC Administrator [Wed, 24 May 2023 00:17:47 +0000 (00:17 +0000)]

Daily bump.

Martin Uecker [Sun, 21 May 2023 17:32:01 +0000 (19:32 +0200)]

Fix ICEs related to VM types in C 2/2 [PR109450]

Size expressions were sometimes lost and not gimplified correctly,
leading to ICEs and incorrect evaluation order. Fix this by 1) not
recursing into pointers when gimplifying parameters, which was incorrect
because it might access variables declared later for incomplete
structs, and 2) adding a decl expr for variably-modified arrays
that are pointed to by parameters declared as arrays.

PR c/109450

gcc/
* function.cc (gimplify_parm_type): Remove function.
(gimplify_parameters): Call gimplify_type_sizes.

gcc/c/
* c-decl.cc (add_decl_expr): New function.
(grokdeclarator): Add decl expr for size expression in
types pointed to by parameters declared as arrays.

gcc/testsuite/
* gcc.dg/pr109450-1.c: New test.
* gcc.dg/pr109450-2.c: New test.
* gcc.dg/vla-26.c: New test.

Martin Uecker [Thu, 13 Apr 2023 17:37:12 +0000 (19:37 +0200)]

Fix ICEs related to VM types in C 1/2 [PR70418, PR107557, PR108423]

Size expressions were sometimes lost and not gimplified correctly, leading to
ICEs and incorrect evaluation order. Fix this by 1) not recursing into
pointers when gimplifying parameters in the middle-end (the code is merged with
gimplify_type_sizes), which is incorrect because it might access variables
declared later for incomplete structs, 2) tracking size expressions for
struct/union members correctly, and 3) emitting code to evaluate size expressions
for missing cases (nested functions, empty declarations, and structs/unions).

PR c/70418
PR c/106465
PR c/107557
PR c/108423

gcc/c/
* c-decl.cc (start_decl): Make sure size expressions are
evaluated only in correct context.
(grokdeclarator): Size expression in fields may need a bind
expression, make sure DECL_EXPR is always created.
(grokfield, declspecs_add_type): Pass along size expressions.
(finish_struct): Remove unneeded DECL_EXPR.
(start_function): Evaluate size expressions for nested functions.
* c-parser.cc (c_parser_struct_declarations,
c_parser_struct_or_union_specifier): Pass along size expressions.
(c_parser_declaration_or_fndef): Evaluate size expression.
(c_parser_objc_at_property_declaration,
c_parser_objc_class_instance_variables): Adapt.
* c-tree.h (grokfield): Adapt declaration.

gcc/testsuite/
* gcc.dg/nested-vla-1.c: New test.
* gcc.dg/nested-vla-2.c: New test.
* gcc.dg/nested-vla-3.c: New test.
* gcc.dg/pr70418.c: New test.
* gcc.dg/pr106465.c: New test.
* gcc.dg/pr107557-1.c: New test.
* gcc.dg/pr107557-2.c: New test.
* gcc.dg/pr108423-1.c: New test.
* gcc.dg/pr108423-2.c: New test.
* gcc.dg/pr108423-3.c: New test.
* gcc.dg/pr108423-4.c: New test.
* gcc.dg/pr108423-5.c: New test.
* gcc.dg/pr108423-6.c: New test.
* gcc.dg/typename-vla-2.c: New test.
* gcc.dg/typename-vla-3.c: New test.
* gcc.dg/typename-vla-4.c: New test.
* gcc.misc-tests/gcov-pr85350.c: Adapt.

Takayuki 'January June' Suwa [Mon, 22 May 2023 07:04:37 +0000 (16:04 +0900)]

xtensa: Merge '*addx' and '*subx' insn patterns into one

By making use of the 'addsub_operator' added in the last patch.

gcc/ChangeLog:

* config/xtensa/xtensa.md (*addsubx): Rename from '*addx',
and change to also accept '*subx' pattern.
(*subx): Remove.

Takayuki 'January June' Suwa [Tue, 23 May 2023 05:48:09 +0000 (14:48 +0900)]

xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

This patch removes one machine instruction from the "single bit
extraction with shifting" operation and, with the help of the ifcvt
optimization, tries to eliminate the conditional branch when CST2_POW2
doesn't fit into signed 12 bits.

    /* example #1 */
    int test0(int x) {
      return (x & 1048576) != 0 ? 1024 : 0;
    }
    extern int foo(void);
    int test1(void) {
      return (foo() & 1048576) != 0 ? 16777216 : 0;
    }

    ;; before
    test0:
            movi a9, 0x400
            srai a2, a2, 10
            and a2, a2, a9
            ret.n
    test1:
            addi sp, sp, -16
            s32i.n a0, sp, 12
            call0 foo
            extui a2, a2, 20, 1
            slli a2, a2, 20
            beqz.n a2, .L2
            movi.n a2, 1
            slli a2, a2, 24
    .L2:
            l32i.n a0, sp, 12
            addi sp, sp, 16
            ret.n

    ;; after
    test0:
            extui a2, a2, 20, 1
            slli a2, a2, 10
            ret.n
    test1:
            addi sp, sp, -16
            s32i.n a0, sp, 12
            call0 foo
            l32i.n a0, sp, 12
            extui a2, a2, 20, 1
            slli a2, a2, 24
            addi sp, sp, 16
            ret.n

In addition, if the left shift amount ('exact_log2(CST2_POW2)') is
between 1 and 3 and either an addition or a subtraction with another
register follows, emit an ADDX[248] or SUBX[248] machine instruction
instead of separate left shift and add/subtract ones.

    /* example #2 */
    int test2(int x, int y) {
      return ((x & 1048576) != 0 ? 4 : 0) + y;
    }
    int test3(int x, int y) {
      return ((x & 2) != 0 ? 8 : 0) - y;
    }

    ;; before
    test2:
            movi.n a9, 4
            srai a2, a2, 18
            and a2, a2, a9
            add.n a2, a2, a3
            ret.n
    test3:
            movi.n a9, 8
            slli a2, a2, 2
            and a2, a2, a9
            sub a2, a2, a3
            ret.n

    ;; after
    test2:
            extui a2, a2, 20, 1
            addx4 a2, a2, a3
            ret.n
    test3:
            extui a2, a2, 1, 1
            subx8 a2, a2, a3
            ret.n

gcc/ChangeLog:

* config/xtensa/predicates.md (addsub_operator): New.
* config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3,
*extzvsi-1bit_addsubx): New insn_and_split patterns.
* config/xtensa/xtensa.cc (xtensa_rtx_costs):
Add a special case about ifcvt 'noce_try_cmove()' to handle
constant loads that do not fit into signed 12 bits in the
patterns added above.
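
The transformation in example #1 rests on an equivalence that can be checked in plain C. This is a sketch rather than anything from the patch: the helper names are hypothetical, and log2_pow2 merely stands in for exact_log2. The branchy select and the "extract one bit, then shift left" form give the same result whenever both constants are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* The form from the commit title; CST1 and CST2 are powers of two.  */
uint32_t
select_pow2 (uint32_t x, uint32_t cst1, uint32_t cst2)
{
  return (x & cst1) != 0 ? cst2 : 0;
}

/* Log2 of a power of two (hypothetical stand-in for exact_log2).  */
unsigned
log2_pow2 (uint32_t v)
{
  assert (v != 0 && (v & (v - 1)) == 0);
  unsigned n = 0;
  while ((v >>= 1) != 0)
    n++;
  return n;
}

/* Branch-free form: extract the tested bit (as EXTUI does), then shift
   it into place (as SLLI does).  */
uint32_t
extract_and_shift (uint32_t x, uint32_t cst1, uint32_t cst2)
{
  uint32_t bit = (x >> log2_pow2 (cst1)) & 1u;
  return bit << log2_pow2 (cst2);
}
```

With cst1 = 1048576 and cst2 = 1024 this corresponds to the two-instruction "after" sequence of test0: extract bit 20, then shift left by 10.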

Richard Biener [Tue, 23 May 2023 13:03:00 +0000 (15:03 +0200)]

tree-optimization/109747 - SLP cost of CTORs

The x86 backend looks at the SLP node passed to the add_stmt_cost
hook when costing vec_construct, looking for elements that require
a move from a GPR to a vector register and cost that.  But since
vect_prologue_cost_for_slp decomposes the cost for an external
SLP node into individual pieces this cost gets applied N times
without a chance for the backend to know it's just dealing with
a part of the SLP node.  Just looking at a part is also not perfect
since the GPR to XMM move cost applies only once per distinct
element, so handling the whole SLP node once more correctly reflects
the cost (albeit without considering other external SLP nodes).

The following addresses the issue by passing down the SLP node
only for one piece and nullptr for the rest.  The x86 backend
is currently the only one looking at it.

In the future the cost of external elements is something to deal
with globally but that would require the full SLP tree be available
to costing.

It's difficult to write a testcase; at the tipping point not
vectorizing is better, so I'll follow up with x86-specific adjustments
and will look at adding a testcase later.

PR tree-optimization/109747
* tree-vect-slp.cc (vect_prologue_cost_for_slp): Pass down
the SLP node only once to the cost hook.

Georg-Johann Lay [Tue, 23 May 2023 16:49:19 +0000 (18:49 +0200)]

Improve cost computation for single-bit bit insertions.

Some miscomputation of rtx_costs led to sub-optimal code for
single-bit bit insertions. This patch implements TARGET_INSN_COST,
which has a chance to see the whole insn during insn combination;
in particular the SET_DEST of (set (zero_extract (...) ...)).

gcc/
* config/avr/avr.cc (avr_insn_cost): New static function.
(TARGET_INSN_COST): Define to that function.

Richard Biener [Tue, 23 May 2023 13:12:33 +0000 (15:12 +0200)]

Account for vector splat GPR->XMM move cost

The following also accounts for a GPR->XMM move cost for splat
operations and properly guards eliding the cost when moving from
memory only for SSE4.1 or HImode or larger operands. This
doesn't fix the PR fully yet.

PR target/109944
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
For vector construction or splats apply GPR->XMM move
costing. QImode memory can be handled directly only
with SSE4.1 pinsrb.

Richard Biener [Tue, 23 May 2023 13:58:52 +0000 (15:58 +0200)]

Generic vector op costing adjustment

This is a small adjustment to the work done for PR108752 and
better reflects the cost of the generated sequence.

PR tree-optimization/108752
* tree-vect-stmts.cc (vectorizable_operation): For bit
operations with generic word_mode vectors do not cost
an extra stmt. For plus, minus and negate also cost the
constant materialization.

Uros Bizjak [Tue, 23 May 2023 15:54:39 +0000 (17:54 +0200)]

i386: Add V8QI and V4QImode partial vector shift operations

Add V8QImode and V4QImode vector shift patterns that call into
ix86_expand_vecop_qihi_partial. Generate special sequences
for constant count operands.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial):
Call ix86_expand_vec_shift_qihi_constant for shifts
with constant count operand.
* config/i386/i386.cc (ix86_shift_rotate_cost):
Handle V4QImode and V8QImode.
* config/i386/mmx.md (<insn>v8qi3): New insn pattern.
(<insn>v4qi3): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/vect-shiftv4qi.c: New test.
* gcc.target/i386/vect-shiftv8qi.c: New test.

Juzhe-Zhong [Tue, 23 May 2023 14:24:22 +0000 (22:24 +0800)]

RISC-V: Fix warning of vxrm pattern

I just noticed the warning:
../../../riscv-gcc/gcc/config/riscv/vector.md:618:1: warning: source
missing a mode?

gcc/ChangeLog:

* config/riscv/vector.md: Add mode.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

Aldy Hernandez [Tue, 23 May 2023 10:34:45 +0000 (12:34 +0200)]

Remove buggy special case in irange::invert [PR109934].

This patch removes a buggy special case in irange::invert which seems
to have been broken for a while, and probably never triggered because
the legacy code was handled elsewhere, and the non-legacy code was
using an int_range_max of int_range<255> which made it extremely
unlikely for num_ranges == 255. However, with auto-resizing ranges,
int_range_max will start off at 3 and can hit this bogus code in the
unswitching code.

PR tree-optimization/109934

gcc/ChangeLog:

* value-range.cc (irange::invert): Remove buggy special case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr109934.c: New test.

Richard Biener [Tue, 23 May 2023 09:55:45 +0000 (11:55 +0200)]

Dump ANTIC_OUT before pruning it

This dumps ANTIC_OUT before pruning clobbered mems from it as part
of the ANTIC_IN compute.

* tree-ssa-pre.cc (compute_antic_aux): Dump the correct
ANTIC_OUT.

Richard Sandiford [Tue, 23 May 2023 10:34:42 +0000 (11:34 +0100)]

aarch64: Provide FPR alternatives for some bit insertions [PR109632]

At -O2, and so with SLP vectorisation enabled:

    struct complx_t { float re, im; };
    complx_t add(complx_t a, complx_t b) {
      return {a.re + b.re, a.im + b.im};
    }

generates:

        fmov    w3, s1
        fmov    x0, d0
        fmov    x1, d2
        fmov    w2, s3
        bfi     x0, x3, 32, 32
        fmov    d31, x0
        bfi     x1, x2, 32, 32
        fmov    d30, x1
        fadd    v31.2s, v31.2s, v30.2s
        fmov    x1, d31
        lsr     x0, x1, 32
        fmov    s1, w0
        lsr     w0, w1, 0
        fmov    s0, w0
        ret

This is because complx_t is passed and returned in FPRs, but GCC gives
it DImode.  We therefore “need” to assemble a DImode pseudo from the
two individual floats, bitcast it to a vector, do the arithmetic,
bitcast it back to a DImode pseudo, then extract the individual floats.

There are many problems here.  The most basic is that we shouldn't
use SLP for such a trivial example.  But SLP should in principle be
beneficial for more complicated examples, so preventing SLP for the
example above just changes the reproducer needed.  A more fundamental
problem is that it doesn't make sense to use single DImode pseudos in a
testcase like this.  I have a WIP patch to allow re and im to be stored
in individual SFmode pseudos instead, but it's quite an invasive change
and might end up going nowhere.

A simpler problem to tackle is that we allow DImode pseudos to be stored
in FPRs, but we don't provide any patterns for inserting values into
them, even though INS makes that easy for element-like insertions.
This patch adds some patterns for that.

Doing that showed that aarch64_modes_tieable_p was too strict:
it didn't allow SFmode and DImode values to be tied, even though
both of them occupy a single GPR and FPR, and even though we allow
both classes to change between the modes.

The *aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS> pattern is
especially ugly, but it's not clear what target-independent
code ought to simplify it to, if it was going to simplify it.

We should probably do the same thing for extractions, but that's left
as future work.

After the patch we generate:

        ins     v0.s[1], v1.s[0]
        ins     v2.s[1], v3.s[0]
        fadd    v0.2s, v0.2s, v2.2s
        fmov    x0, d0
        ushr    d1, d0, 32
        lsr     w0, w0, 0
        fmov    s0, w0
        ret

which seems like a step in the right direction.

All in all, there's nothing elegant about this patch.  It just
seems like the least worst option.

gcc/
PR target/109632
* config/aarch64/aarch64.cc (aarch64_modes_tieable_p): Allow
subregs between any scalars that are 64 bits or smaller.
* config/aarch64/iterators.md (SUBDI_BITS): New int iterator.
(bits_etype): New int attribute.
* config/aarch64/aarch64.md (*insv_reg<mode>_<SUBDI_BITS>)
(*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>): New patterns.
(*aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS>): Likewise.

gcc/testsuite/
* gcc.target/aarch64/ins_bitfield_1.c: New test.
* gcc.target/aarch64/ins_bitfield_2.c: Likewise.
* gcc.target/aarch64/ins_bitfield_3.c: Likewise.
* gcc.target/aarch64/ins_bitfield_4.c: Likewise.
* gcc.target/aarch64/ins_bitfield_5.c: Likewise.
* gcc.target/aarch64/ins_bitfield_6.c: Likewise.

Richard Sandiford [Tue, 23 May 2023 10:34:41 +0000 (11:34 +0100)]

md: Allow <FOO> to refer to the value of int iterator FOO

In a follow-up patch, I wanted to use an int iterator to iterate
over various possible values of a const_int.  But one problem
with int iterators was that there was no way of referring to the
current value of the iterator.  This is unlike modes and codes,
which provide automatic "mode", "MODE", "code" and "CODE"
attributes.  These automatic definitions are the equivalent
of an explicit:

  (define_mode_attr mode [(QI "qi") (HI "hi") ...])

We obviously can't do that for every possible value of an int.

One option would have been to go for some kind of lazily-populated
attribute.  But that sounds quite complicated.  This patch instead
goes for the simpler approach of allowing <FOO> to refer to the
current value of FOO.
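
The effect can be pictured as plain textual substitution.  The following
is a toy model only, not the read-rtl.cc implementation: the function
name expand_self_attr is hypothetical, and the value 16 used in the usage
note is purely illustrative.

```cpp
#include <string>

// Toy model: replace every "<NAME>" in TEXT with the decimal VALUE.
// NAME and VALUE stand in for an int iterator and its current value.
// Other angle-bracket attributes (e.g. "<mode>") are left alone.
static std::string expand_self_attr (std::string text,
                                     const std::string &name, int value)
{
  const std::string token = "<" + name + ">";
  const std::string repl = std::to_string (value);
  for (std::size_t pos = text.find (token); pos != std::string::npos;
       pos = text.find (token, pos + repl.size ()))
    text.replace (pos, token.size (), repl);
  return text;
}
```

For instance, expanding "*insv_reg<mode>_<SUBDI_BITS>" with name
"SUBDI_BITS" and value 16 leaves "<mode>" for the mode iterator and
produces "*insv_reg<mode>_16".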

In principle it would be possible to allow the same thing
for mode and code iterators.  But for modes there are at least
4 realistic possibilities:

  - the E_* enumeration value (which is what this patch would give)
  - the user-facing C token, like DImode, SFmode, etc.
  - the equivalent of <MODE>
  - the equivalent of <mode>

Because of this ambiguity, it seemed better to stick to the
current approach for modes.  For codes it's less clear-cut,
but <CODE> and <code> are both realistic possibilities, so again
it seemed better to be explicit.

The patch also removes “Each @var{int} must have the same rtx format.
@xref{RTL Classes}.”, which was erroneously copied from the code
iterator section.

gcc/
* doc/md.texi: Document that <FOO> can be used to refer to the
numerical value of an int iterator FOO.  Tweak other parts of
the int iterator documentation.
* read-rtl.cc (iterator_group::has_self_attr): New field.
(map_attr_string): When has_self_attr is true, make <FOO>
expand to the current value of iterator FOO.
(initialize_iterators): Set has_self_attr for int iterators.

Juzhe-Zhong [Tue, 23 May 2023 10:22:54 +0000 (18:22 +0800)]

RISC-V: Refactor the framework of RVV auto-vectorization

This patch refactors the framework of RVV auto-vectorization.
We found that we kept adding helpers and wrappers while implementing
auto-vectorization, which would make the RVV auto-vectorization code
very messy.

After double-checking my downstream RVV GCC and collecting all the
auto-vectorization patterns we are going to have, I refactored the RVV
framework based on that information to make it easier and more flexible
for future use.

For example, we will definitely implement len_mask_load/len_mask_store
patterns, which have both length and mask operands and use an undefined
merge operand.

len_cond_div or cond_div will have a length or mask operand and use a real
merge operand instead of an undefined merge operand.

Also, some patterns will use tail undisturbed and mask any.

And so on: we will definitely have various features.

Based on these circumstances, we add the following private members:

  int m_op_num;
  /* It's true when the pattern has a dest operand.  Most of the patterns
     have a dest operand, whereas some patterns like STOREs do not have a
     dest operand.  */
  bool m_has_dest_p;
  bool m_fully_unmasked_p;
  bool m_use_real_merge_p;
  bool m_has_avl_p;
  bool m_vlmax_p;
  bool m_has_tail_policy_p;
  bool m_has_mask_policy_p;
  enum tail_policy m_tail_policy;
  enum mask_policy m_mask_policy;
  machine_mode m_dest_mode;
  machine_mode m_mask_mode;

I believe these variables can cover all potential situations.

The instruction generator wrapper is "emit_insn", which adds operands
and emits the instruction according to the variables mentioned above.
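
To make the idea concrete, here is a self-contained toy model of a
flag-driven operand builder.  It is only a sketch: the field names follow
the member list above, but the struct expander_config, the function
build_operands, the symbolic operand strings and, in particular, the
order in which operands are appended are all assumptions made for
illustration, not the code in riscv-v.cc.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the expander's configuration.
struct expander_config
{
  int op_num;              // number of caller-supplied operands
  bool has_dest_p;         // first caller operand is the destination
  bool fully_unmasked_p;   // add an all-ones mask operand
  bool use_real_merge_p;   // false: add an undefined merge operand
  bool has_avl_p;          // add a vector-length operand
  bool has_tail_policy_p;  // add a tail policy operand
  bool has_mask_policy_p;  // add a mask policy operand
};

// Assemble a symbolic operand list from the caller operands OPS.
// OPS must hold at least cfg.op_num entries.  The ordering used here
// is an assumption.
static std::vector<std::string>
build_operands (const expander_config &cfg,
                const std::vector<std::string> &ops)
{
  std::vector<std::string> out;
  int opno = 0;
  if (cfg.has_dest_p)
    out.push_back (ops[opno++]);
  if (cfg.fully_unmasked_p)
    out.push_back ("all-ones mask");
  if (!cfg.use_real_merge_p)
    out.push_back ("undef merge");
  for (; opno < cfg.op_num; opno++)
    out.push_back (ops[opno]);
  if (cfg.has_avl_p)
    out.push_back ("avl");
  if (cfg.has_tail_policy_p)
    out.push_back ("tail policy");
  if (cfg.has_mask_policy_p)
    out.push_back ("mask policy");
  return out;
}
```

With this shape, a new helper only has to pick a combination of flags;
the operand-assembly logic itself stays in one place.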

Once this is done, we can easily add helpers without changing the base
class "insn_expander".

Currently, we have "emit_vlmax_tany_many" and "emit_nonvlmax_tany_many".

For example, when we want to emit a binary operation:
We have

Then just use emit_vlmax_tany_many (...RVV_BINOP_NUM...)

So, if we support ternary operations in the future, it's quite simple:
emit_vlmax_tany_many (...RVV_BINOP_NUM...)

"*_tany_many" means we are using tail any and mask any.

We will definitely need tail undisturbed or mask undisturbed when we
support these patterns in the middle-end.  It's very simple to extend
such a helper based on the current framework; we can do that in the
future like this:

void
emit_nonvlmax_tu_mu (unsigned icode, int op_num, rtx *ops)
{
  machine_mode data_mode = GET_MODE (ops[0]);
  machine_mode mask_mode = get_mask_mode (data_mode).require ();
  /* The number = 11 is because we have maximum 11 operands for
     RVV instruction patterns according to vector.md.  */
  insn_expander<11> e (/*OP_NUM*/ op_num,
       /*HAS_DEST_P*/ true,
       /*USE_ALL_TRUES_MASK_P*/ true,
       /*USE_UNDEF_MERGE_P*/ true,
       /*HAS_AVL_P*/ true,
       /*VLMAX_P*/ false,
       /*HAS_TAIL_POLICY_P*/ true,
       /*HAS_MASK_POLICY_P*/ true,
       /*TAIL_POLICY*/ TAIL_UNDISTURBED,
       /*MASK_POLICY*/ MASK_UNDISTURBED,
       /*DEST_MODE*/ data_mode,
       /*MASK_MODE*/ mask_mode);
  e.emit_insn ((enum insn_code) icode, ops);
}

That's enough (I have tested it fully in my downstream RVV GCC).
I didn't add it in this patch.

Thanks.

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:

* config/riscv/autovec.md: Refactor the framework of RVV auto-vectorization.
* config/riscv/riscv-protos.h (RVV_MISC_OP_NUM): Ditto.
(RVV_UNOP_NUM): New macro.
(RVV_BINOP_NUM): Ditto.
(legitimize_move): Refactor the framework of RVV auto-vectorization.
(emit_vlmax_op): Ditto.
(emit_vlmax_reg_op): Ditto.
(emit_len_op): Ditto.
(emit_len_binop): Ditto.
(emit_vlmax_tany_many): Ditto.
(emit_nonvlmax_tany_many): Ditto.
(sew64_scalar_helper): Ditto.
(expand_tuple_move): Ditto.
* config/riscv/riscv-v.cc (emit_pred_op): Ditto.
(emit_pred_binop): Ditto.
(emit_vlmax_op): Ditto.
(emit_vlmax_tany_many): New function.
(emit_len_op): Remove.
(emit_nonvlmax_tany_many): New function.
(emit_vlmax_reg_op): Remove.
(emit_len_binop): Ditto.
(emit_index_op): Ditto.
(expand_vec_series): Refactor the framework of RVV auto-vectorization.
(expand_const_vector): Ditto.
(legitimize_move): Ditto.
(sew64_scalar_helper): Ditto.
(expand_tuple_move): Ditto.
(expand_vector_init_insert_elems): Ditto.
* config/riscv/riscv.cc (vector_zero_call_used_regs): Ditto.
* config/riscv/vector.md: Ditto.

Kyrylo Tkachov [Tue, 23 May 2023 10:09:08 +0000 (11:09 +0100)]

aarch64: PR target/109855 Add predicate and constraints to define_subst in aarch64-simd.md

In this PR we ICE because the substituted pattern for mla "lost" its predicate and constraint for operand 0
because the define_subst template:
  [(set (match_operand:<VDBL> 0)
        (vec_concat:<VDBL>
         (match_dup 1)
         (match_operand:VDZ 2 "aarch64_simd_or_scalar_imm_zero")))])

Uses match_operand instead of match_dup for operand 0. We can't use match_dup 0 for it because we need to specify the widened mode.
The problem is fixed by adding a "register_operand" predicate and "=w" constraint to the match_operand.
This makes sense conceptually too as the transformation we're targeting only applies to instructions that write a "w" register.
With this change the mddump pattern that ICEs goes from:
(define_insn ("aarch64_mlav4hi_vec_concatz_le")
     [
        (set (match_operand:V8HI 0 ("") ("")) <<------ Missing constraint!
            (vec_concat:V8HI (plus:V4HI (mult:V4HI (match_operand:V4HI 2 ("register_operand") ("w"))
                        (match_operand:V4HI 3 ("register_operand") ("w")))
                    (match_operand:V4HI 1 ("register_operand") ("0")))
                (match_operand:V4HI 4 ("aarch64_simd_or_scalar_imm_zero") (""))))
    ] ("(!BYTES_BIG_ENDIAN) && (TARGET_SIMD)") ("mla\t%0.4h, %2.4h, %3.4h")

to the proper:
(define_insn ("aarch64_mlav4hi_vec_concatz_le")
     [
        (set (match_operand:V8HI 0 ("register_operand") ("=w")) <<-------- Constraint in the right place
            (vec_concat:V8HI (plus:V4HI (mult:V4HI (match_operand:V4HI 2 ("register_operand") ("w"))
                        (match_operand:V4HI 3 ("register_operand") ("w")))
                    (match_operand:V4HI 1 ("register_operand") ("0")))
                (match_operand:V4HI 4 ("aarch64_simd_or_scalar_imm_zero") (""))))
    ] ("(!BYTES_BIG_ENDIAN) && (TARGET_SIMD)") ("mla\t%0.4h, %2.4h, %3.4h")

This seems to do the right thing for multi-alternative patterns as well.  The annotated pattern for aarch64_cmltv8qi is:
(define_insn ("aarch64_cmltv8qi")
     [
        (set (match_operand:V8QI 0 ("register_operand") ("=w,w"))
            (neg:V8QI (lt:V8QI (match_operand:V8QI 1 ("register_operand") ("w,w"))
                    (match_operand:V8QI 2 ("aarch64_simd_reg_or_zero") ("w,ZDz")))))
    ]

whereas the substituted version now looks like:
(define_insn ("aarch64_cmltv8qi_vec_concatz_le")
     [
        (set (match_operand:V16QI 0 ("register_operand") ("=w,w"))
            (vec_concat:V16QI (neg:V8QI (lt:V8QI (match_operand:V8QI 1 ("register_operand") ("w,w"))
                        (match_operand:V8QI 2 ("aarch64_simd_reg_or_zero") ("w,ZDz"))))
                (match_operand:V8QI 3 ("aarch64_simd_or_scalar_imm_zero") (""))))
    ]

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR target/109855
* config/aarch64/aarch64-simd.md (add_vec_concat_subst_le): Add predicate
and constraint for operand 0.
(add_vec_concat_subst_be): Likewise.

gcc/testsuite/ChangeLog:

PR target/109855
* gcc.target/aarch64/pr109855.c: New test.

Richard Biener [Thu, 18 May 2023 11:52:29 +0000 (13:52 +0200)]

tree-optimization/109849 - missed code hoisting

The following fixes code hoisting to properly consider ANTIC_OUT instead
of ANTIC_IN. That's a bit expensive to re-compute but since we no
longer iterate we're doing this only once per BB which should be
acceptable. This avoids missing hoistings to the end of blocks where
something in the block clobbers the hoisted value.
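
A minimal illustration of the kind of code this affects, written as a
hypothetical example rather than the committed testcase: the names mem,
clobber and bar are made up, and in a real reproducer clobber would be an
external function the optimizer cannot see into.  The load of mem is
needed on both paths after the call, so it is anticipated at the end of
the first block, but not at its start, because the call may change mem.

```cpp
// Global that the call below may modify.
int mem = 1;

// Stand-in for an opaque external call; defined here only so that the
// example is self-contained.
void clobber ()
{
  mem += 1;
}

int bar (int flag)
{
  int res;
  clobber ();
  // Both arms load mem; a single load could be placed right after the
  // call, at the end of the block that ends with the branch on FLAG.
  if (flag)
    res = mem;
  else
    {
      res = mem;
      mem = 2;
    }
  return res;
}
```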

PR tree-optimization/109849
* tree-ssa-pre.cc (do_hoist_insertion): Compute ANTIC_OUT
and use that to determine what to hoist.

* gcc.dg/tree-ssa/ssa-hoist-8.c: New testcase.

Eric Botcazou [Tue, 23 May 2023 08:17:02 +0000 (10:17 +0200)]

Minor tweak

Eric Botcazou [Tue, 23 May 2023 08:15:35 +0000 (10:15 +0200)]

Fix handling of non-integral bit-fields in native_encode_initializer

The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD)
have integral types, but that's not the case in Ada where they may have
pretty much any type, resulting in a wrong encoding for them.

gcc/
* fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the
specific treatment for bit-fields only if they have an integral type
and filter out non-integral bit-fields that do not start and end on
a byte boundary.

gcc/testsuite/
* gnat.dg/opt101.adb: New test.
* gnat.dg/opt101_pkg.ads: New helper.

Piotr Trojanek [Fri, 3 Feb 2023 21:15:44 +0000 (22:15 +0100)]

ada: Accept and analyze new aspect Exceptional_Cases

Add new aspect Exceptional_Cases, which is intended for SPARK and
describes in which cases an exception will be raised, and optionally
supply a postcondition that shall be verified in this case.

The implementation is heavily modeled after Subprogram_Variant, which in
turn was heavily modeled after Contract_Cases. Currently the aspect is
only analysed; the code infrastructure required to expand it is prepared
but empty. This is enough for the aspect to be verified by GNATprove.

gcc/ada/

* aspects.ads
(Aspect_Id): Add aspect identifier.
(Aspect_Argument): New aspect accepts an expression.
(Is_Representation_Aspect): New aspect is not a representation
aspect.
(Aspect_Names): Associate name with the new aspect identifier.
(Aspect_Delay): New aspect is never delayed.
* contracts.adb
(Add_Contract_Item): Store new aspect among contract items.
(Analyze_Entry_Or_Subprogram_Contract): Likewise.
(Analyze_Subprogram_Body_Stub_Contract): Likewise.
(Process_Contract_Cases): Expand new aspect, if present.
* contracts.ads
(Analyze_Entry_Or_Subprogram_Body_Contract): Mention new aspect in
spec.
(Analyze_Entry_Or_Subprogram_Contract): Likewise.
* einfo-utils.adb
(Get_Pragma): Allow new aspect to be picked by the backend.
* einfo-utils.ads
(Get_Pragma): Mention new aspect in spec.
* exp_prag.adb
(Expand_Pragma_Exceptional_Cases): Dummy expansion routine.
* exp_prag.ads
(Expand_Pragma_Exceptional_Cases): Add spec for expansion routine.
* inline.adb
(Remove_Aspects_And_Pragmas): Remove aspect from bodies to inline.
* par-prag.adb
(Par.Prag): Accept pragma in the parser, so it will be checked
later.
* sem_ch12.adb
(Implementation of Generic Contracts): Mention new aspect in
comment.
* sem_ch13.adb
(Analyze_Aspect_Specifications): Transform new aspect into a
corresponding pragma.
* sem_prag.adb
(Analyze_Exceptional_Cases_In_Decl_Part): Analyze aspect
expression; heavily inspired by the existing code for analysis of
Subprogram_Variant and exception handlers.
(Analyze_Pragma): Analyze pragma corresponding to the new aspect.
(Is_Non_Significant_Pragma_Reference): Add new pragma to the
table.
* sem_prag.ads
(Assertion_Expression_Pragma): New pragma acts as an assertion
expression, even though it is not currently expanded.
(Analyze_Exceptional_Cases_In_Decl_Part): Add spec.
* sem_util.adb
(Is_Subprogram_Contract_Annotation): Mark new annotation as a
subprogram contract, so the subprogram with it won't be inlined.
* sem_util.ads
(Is_Subprogram_Contract_Annotation): Mention new aspect in
comment.
* sinfo.ads
(Contract_Test_Cases): Mention new aspect in comment.
* snames.ads-tmpl: Add entries for the new name and pragma.

Eric Botcazou [Wed, 1 Mar 2023 21:28:51 +0000 (22:28 +0100)]

ada: Rework fix for internal error on quantified expression with predicated type

It turns out that skipping compiler-generated block scopes is problematic
when computing the public status of a subprogram, because this subprogram
may end up being nested in the elaboration procedure of a package spec or
body, in which case it may not be public.

This replaces the original fix with a pair of Push_Scope/Pop_Scope in the
Build_Predicate_Functions procedure, as done elsewhere in similar cases.

gcc/ada/

* sem_ch13.adb (Build_Predicate_Functions): If the current scope
is not that of the type, push this scope and pop it at the end.
* sem_util.ads (Current_Scope_No_Loops_No_Blocks): Delete.
* sem_util.adb (Current_Scope_No_Loops_No_Blocks): Likewise.
(Set_Public_Status): Call again Current_Scope.

Gary Dismukes [Fri, 17 Feb 2023 23:16:55 +0000 (18:16 -0500)]

ada: ICE on BIP call in class-wide function return within instance

The compiler blows up (such as with a Storage_Error or Assert_Failure)
on a call to a limited build-in-place function occurring in the return
for a function with a limited class-wide result. Such a function
should include extra formals for a task master and activation chain
(because it's possible for a limited class-wide type to have values
with task parts), but when the enclosing function occurs within an
instantiation and the result subtype comes from a formal type, the
extra formals were missing for the enclosing function. As a result,
the attempt to retrieve the task master formal for passing along to
a BIP call in the return failed when calling Build_In_Place_Formal to
loop through the formals. When determining the need for the formals in
Create_Extra_Formals, Needs_BIP_Actual_Task_Actuals was returning False,
because Might_Have_Tasks incorrectly returned False due to the test
of Is_Limited_Record flag on the class-wide generic actual subtype's
Etype being False. Is_Limited_Record was not being properly inherited
by the class-wide type in the case of private extensions, because
Make_Class_Wide_Type was called in Analyze_Private_Extension_Declaration
before certain flags (such as Is_Limited_Record and Is_Controlled_Active)
are inherited later in Build_Derived_Record_Type (which will also call
Make_Class_Wide_Type). This is corrected by removing the early call
to Make_Class_Wide_Type.

gcc/ada/

* exp_ch6.adb (Might_Have_Tasks): Remove unneeded Etype call from
call to Is_Limited_Record, since that flag is now properly
inherited by class-wide types.
* sem_ch3.adb (Analyze_Private_Extension_Declaration): Remove call
to Make_Class_Wide_Type, which is done too early, and will later
be done in Build_Derived_Record_Type after flags such as
Is_Limited_Record and Is_Controlled_Active have been set on the
derived type.

Patrick Bernardi [Tue, 28 Feb 2023 22:30:46 +0000 (17:30 -0500)]

ada: Remove redundant parentheses from System.Stack_Checking.Operations

gcc/ada/

* libgnat/s-stchop.adb (Stack_Check): Remove redundant parentheses.

Piotr Trojanek [Thu, 23 Feb 2023 23:47:47 +0000 (00:47 +0100)]

ada: Add tags to warnings controlled by Warn_On_Redundant_Constructs

Some of the calls to Error_Msg_N controlled by the flag
Warn_On_Redundant_Constructs missed the "?r?" tag in their message
string. This caused a misleading "[enabled by default]" label to appear
next to the error message.

Spotted while adding a warning about duplicated choices in exception
handlers.

gcc/ada/

* freeze.adb (Freeze_Record_Type): Add tag for redundant pragma Pack.
* sem_aggr.adb (Resolve_Record_Aggregate): Add tag for redundant OTHERS
choice.
* sem_ch8.adb (Use_One_Type): Add tag for redundant USE clauses.

Piotr Trojanek [Tue, 28 Feb 2023 09:36:54 +0000 (10:36 +0100)]

ada: Cleanup inconsistent iteration over exception handlers

When detecting duplicate choices in exception handlers we had
inconsistent pairs of First/Next_Non_Pragma and First_Non_Pragma/Next.
This was harmless, because exception choices don't allow pragmas at all,
e.g.:

   when Program_Error | Constraint_Error | ...; --  pragma not allowed

and exception handlers only allow pragmas to appear as the first item
on the list, e.g.:

   exception
      pragma Inspection_Point;   --  first item on the list of handlers
      when Program_Error =>
         <statements>
      pragma Inspection_Point;   --  last item on the list of statements
      when Constraint_Error =>
         ...

However, it still seems cleaner to have consistent pairs of First/Next
and First_Non_Pragma/Next_Non_Pragma.
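
The point about pairing can be shown with a small stand-alone model (a
sketch in C++, not the GNAT sources; the types and function names below
are hypothetical analogues of the Ada list routines).  A loop that starts
with the pragma-skipping "first" should also advance with the
pragma-skipping "next", so that the same kind of node is visited at every
step.

```cpp
#include <cstddef>
#include <vector>

struct node
{
  bool is_pragma;
  int id;
};

using node_list = std::vector<node>;

// Sentinel index meaning "no more nodes".
const std::size_t no_node = static_cast<std::size_t> (-1);

static std::size_t first_node (const node_list &l)
{
  return l.empty () ? no_node : 0;
}

static std::size_t next_node (const node_list &l, std::size_t i)
{
  return i + 1 < l.size () ? i + 1 : no_node;
}

// Advance from I until a non-pragma node (or the end) is reached.
static std::size_t skip_pragmas (const node_list &l, std::size_t i)
{
  while (i != no_node && l[i].is_pragma)
    i = next_node (l, i);
  return i;
}

static std::size_t first_non_pragma (const node_list &l)
{
  return skip_pragmas (l, first_node (l));
}

static std::size_t next_non_pragma (const node_list &l, std::size_t i)
{
  return skip_pragmas (l, next_node (l, i));
}

// Visit every non-pragma node using the consistent pair.
static std::vector<int> visit_non_pragmas (const node_list &l)
{
  std::vector<int> ids;
  for (std::size_t i = first_non_pragma (l); i != no_node;
       i = next_non_pragma (l, i))
    ids.push_back (l[i].id);
  return ids;
}
```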

gcc/ada/

* sem_ch11.adb
(Check_Duplication): Fix inconsistent iteration.
(Others_Present): Iterate over handlers using First_Non_Pragma and
Next_Non_Pragma just like in Check_Duplication.

Eric Botcazou [Sat, 11 Feb 2023 12:12:53 +0000 (13:12 +0100)]

ada: Fix latent issue in support for protected entries

The problem is that, unlike for protected subprograms, the expansion of
cleanups for protected entries is not delayed when they contain package
instances with a body, so the cleanups are generated twice and this may
yield two finalizers if the secondary stack is used in the entry body.

This restores the delaying, which uncovers the missing propagation of the
Uses_Sec_Stack flag as is done for protected subprograms, which in turn
requires using a Corresponding_Spec field as for protected subprograms.

This also gets rid of the Delay_Subprogram_Descriptors flag on entities,
whose only remaining use in Expand_Cleanup_Actions was unreachable.

The last change is to unconditionally reset the scopes in the case of
protected subprograms when they are expanded, as is done in the case of
protected entries. This makes it possible to remove the code adjusting
the scope on the fly in Cleanup_Scopes but requires a few adjustments.

gcc/ada/

* einfo.ads (Delay_Subprogram_Descriptors): Delete.
* gen_il-fields.ads (Opt_Field_Enum): Remove
Delay_Subprogram_Descriptors.
* gen_il-gen-gen_entities.adb (Gen_Entities): Likewise.
* gen_il-gen-gen_nodes.adb (N_Entry_Body): Add Corresponding_Spec.
* sinfo.ads (Corresponding_Spec): Document new use.
(N_Entry_Body): Likewise.
* exp_ch6.adb (Expand_Protected_Object_Reference): Be prepared for
protected subprograms that have been expanded.
* exp_ch7.adb (Expand_Cleanup_Actions): Remove unreachable code.
* exp_ch9.adb (Build_Protected_Entry): Add a local variable for the
new block and propagate Uses_Sec_Stack from the corresponding spec.
(Expand_N_Protected_Body) <N_Subprogram_Body>: Unconditionally reset
the scopes of top-level entities in the new body.
* inline.adb (Cleanup_Scopes): Do not adjust the scope on the fly.
* sem_ch9.adb (Analyze_Entry_Body): Set Corresponding_Spec.
* sem_ch12.adb (Analyze_Package_Instantiation): Remove obsolete code
setting Delay_Subprogram_Descriptors and tidy up.
* sem_util.adb (Scope_Within): Deal with protected subprograms that
have been expanded.
(Scope_Within_Or_Same): Likewise.

Eric Botcazou [Fri, 24 Feb 2023 16:08:01 +0000 (17:08 +0100)]

ada: Fix address manipulation issue in the tasking runtime

The implementation of task attributes in the runtime defines an atomic clone
of System.Address, which is awkward for targets where addresses and pointers
have a specific representation, so this change replaces that with a pragma
Atomic_Components on the Attribute_Array type.

gcc/ada/

* libgnarl/s-taskin.ads (Atomic_Address): Delete.
(Attribute_Array): Add pragma Atomic_Components.
(Ada_Task_Control_Block): Adjust default value of Attributes.
* libgnarl/s-tasini.adb (Finalize_Attributes): Adjust type of local
variable.
* libgnarl/s-tataat.ads (Deallocator): Adjust type of parameter.
(To_Attribute): Adjust source type.
* libgnarl/a-tasatt.adb: Add clauses for System.Storage_Elements.
(New_Attribute): Adjust return type.
(Deallocate): Adjust type of parameter.
(To_Real_Attribute): Adjust source type.
(To_Address): Add target type.
(To_Attribute): Adjust source type.
(Fast_Path): Adjust tested type.
(Finalize): Compare with Null_Address.
(Reference): Likewise.
(Reinitialize): Likewise.
(Set_Value): Likewise. Add conversion to Integer_Address.
(Value): Likewise.

Raphael Amiard [Wed, 15 Feb 2023 11:06:30 +0000 (12:06 +0100)]

ada: Make string interpolation part of the core extensions

gcc/ada/

* scng.adb (Scan): Replace occurrences of All_Extensions_Allowed
by Core_Extensions_Allowed.

Claire Dross [Mon, 27 Feb 2023 10:51:45 +0000 (10:51 +0000)]

ada: Update ghost code for proof of integer input functions

Introduce new ghost helper functions to facilitate proof.

gcc/ada/

* libgnat/s-valueu.adb (Scan_Raw_Unsigned): Use new helpers.
* libgnat/s-vauspe.ads (Raw_Unsigned_Starts_As_Based_Ghost,
Raw_Unsigned_Is_Based_Ghost): New ghost helper functions.
(Is_Raw_Unsigned_Format_Ghost, Scan_Split_No_Overflow_Ghost,
Scan_Split_Value_Ghost, Raw_Unsigned_Last_Ghost): Use new
helpers.

Mirror of https://gcc.gnu.org/git/gcc.git