Thomas Schwinge [Mon, 5 Jun 2023 09:26:37 +0000 (11:26 +0200)]
driver: Forward '-lgfortran', '-lm' to offloading compilation
..., so that users don't manually need to specify
'-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to
'-lgfortran', '-lm' (specified manually, or implicitly by the driver).
..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90',
which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target'
usage.
These use alternatives like, for example, "AB|CDE|FG", but what really must've
been meant is "A(B|C)D(E|F)G". The former variant also does "work": it matches
any of "AB", or "CDE", or "FG", which are components of the latter variant.
(That means, the former variant matches too loosely.)
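The difference between the two pattern shapes can be checked with a small sketch. It uses Python's 're' module rather than the regexp engine of the testsuite, and the letters are the placeholders from the text above, not real test patterns; the helper name 'matches' is made up for illustration.

```python
import re

# Placeholder patterns from the commit message, not real testsuite patterns.
LOOSE = re.compile(r"AB|CDE|FG")        # '|' binds loosest: three alternatives
STRICT = re.compile(r"A(B|C)D(E|F)G")   # what was presumably intended


def matches(pattern, text):
    """Return True if the pattern matches anywhere in text (search semantics)."""
    return pattern.search(text) is not None


# The four strings the strict pattern accepts.
intended = ["ABDEG", "ABDFG", "ACDEG", "ACDFG"]

# Strings that only contain a fragment of an intended string.
fragments = ["AB", "CDE", "FG", "xxFGxx"]

for text in intended + fragments:
    print(text, matches(LOOSE, text), matches(STRICT, text))
```

Both patterns accept every intended string, which is why the loose form appeared to "work"; only the loose form also accepts the bare fragments.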
..., which, presumably, was added by mistake in
commit dce6c58db87ebf7f4477bd3126228e73e4eeee97
"Add support for detecting mismatched allocation/deallocation calls".
Thomas Schwinge [Wed, 14 Jun 2023 07:25:15 +0000 (09:25 +0200)]
Fix typo in 'libgomp.c/target-51.c'
..., and therefore, given 'target offload_device':
PASS: libgomp.c/target-51.c (test for excess errors)
PASS: libgomp.c/target-51.c execution test
[-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test
liuhongt [Tue, 13 Jun 2023 06:20:59 +0000 (14:20 +0800)]
Use x instead of v for alternative 2 (v, BH) in mov<mode>_internal.
Since there's no EVEX version of vpcmpeqd ymm, ymm, ymm.
gcc/ChangeLog:
PR target/110227
* config/i386/sse.md (mov<mode>_internal): Use x instead of v
for alternative 2 since there's no evex version for vpcmpeqd
ymm, ymm, ymm.
Tobias Burnus [Wed, 14 Jun 2023 05:53:02 +0000 (07:53 +0200)]
OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory
OMP_TARGET_OFFLOAD=mandatory handling was previously inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.
Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)
To achieve this, default-device-var is now initialized to INT_MIN. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.
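The order of steps described above can be modelled with a short sketch. It is a simplified Python model, not the libgomp code: the function name, its parameters, and the numeric value used for omp_invalid_device are assumptions made for illustration.

```python
INT_MIN = -2**31           # sentinel meaning "not decided yet"
OMP_INVALID_DEVICE = -4    # placeholder value, assumed for this sketch


def initial_default_device(mandatory, num_devices, omp_default_device=None):
    """Model how default-device-var gets its initial value.

    mandatory: True if OMP_TARGET_OFFLOAD=mandatory is set.
    num_devices: number of non-host devices found during target init.
    omp_default_device: value of OMP_DEFAULT_DEVICE, or None if unset.
    """
    default_device_var = INT_MIN               # initial ICV value
    if omp_default_device is not None:
        default_device_var = omp_default_device  # env var overrides the init
    elif not mandatory:
        default_device_var = 0                 # set right after env var parsing
    # Deferred step, modelled on what the text attributes to gomp_target_init:
    # only runs if the sentinel is still in place.
    if default_device_var == INT_MIN:
        default_device_var = 0 if num_devices > 0 else OMP_INVALID_DEVICE
    return default_device_var


print(initial_default_device(mandatory=True, num_devices=0))
print(initial_default_device(mandatory=True, num_devices=2))
print(initial_default_device(mandatory=False, num_devices=0))
```

In this model the sentinel never escapes the function, which mirrors the requirement that INT_MIN is never seen by the user.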
libgomp/ChangeLog:
* env.c (gomp_default_icv_values): Init default_device_var to
a nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message for device-num < 0
with 'mandatory' and no non-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.
Gaius Mulley [Tue, 13 Jun 2023 22:21:42 +0000 (23:21 +0100)]
modula2 Fixes to the error format specifications
This patch contains a python3 script to check the meta format error
specifications. It also includes about 20 fixes to M2Quads.mod format
specifications.
gcc/m2/ChangeLog:
* Make-lang.in (check-format-error): New rule.
* gm2-compiler/M2MetaError.mod (op): Add calls to InternalError if
digits are detected.
* gm2-compiler/M2Quads.mod (BuildForToByDo): Bugfix to format
specifier.
(BuildLengthFunction): Bugfix to format specifiers.
(BuildOddFunction): Bugfix to format specifiers.
(BuildAbsFunction): Bugfix to format specifiers.
(BuildCapFunction): Bugfix to format specifiers.
(BuildChrFunction): Bugfix to format specifiers.
(BuildOrdFunction): Bugfix to format specifiers.
(BuildMakeAdrFunction): Bugfix to format specifiers.
(BuildSizeFunction): Bugfix to format specifiers.
(BuildBitSizeFunction): Bugfix to format specifiers.
* tools-src/checkmeta.py: New file.
David Malcolm [Tue, 13 Jun 2023 21:42:47 +0000 (17:42 -0400)]
c/c++: use positive tone in missing header notes [PR84890]
Quoting "How a computer should talk to people" (as quoted
in "Concepts Error Messages for Humans"):
"Various negative tones or actions are unfriendly: being manipulative,
not giving a second chance, talking down, using fashionable slang,
blaming. We must not seem to blame the person. We should avoid suggesting
that the person is inadequate. Phrases like "you forgot" may seem
harmless, but what if a computer said this to you four or five times in
two minutes? Anyway, the person may disagree, so why risk offense?"
gcc/c-family/ChangeLog:
PR c/84890
* known-headers.cc
(suggest_missing_header::~suggest_missing_header): Reword note to
avoid negative tone of "forgetting".
gcc/cp/ChangeLog:
PR c/84890
* name-lookup.cc (missing_std_header::~missing_std_header): Reword
note to avoid negative tone of "forgetting".
Nathan Sidwell [Mon, 12 Jun 2023 23:37:04 +0000 (19:37 -0400)]
c++: Fix templated conversion operator demangling
Instantiations of templated conversion operators failed to demangle
for cases such as 'operator X<int>', but worked for 'operator X<int>
&', due to thinking the template instantiation of X was the
instantiation of the conversion operator itself.
Harald Anlauf [Mon, 12 Jun 2023 21:08:48 +0000 (23:08 +0200)]
Fortran: fix passing of zero-sized array arguments to procedures [PR86277]
gcc/fortran/ChangeLog:
PR fortran/86277
* trans-array.cc (gfc_trans_allocate_array_storage): When passing a
zero-sized array with fixed (= non-dynamic) size, allocate temporary
by the caller, not by the callee.
gcc/testsuite/ChangeLog:
PR fortran/86277
* gfortran.dg/zero_sized_14.f90: New test.
* gfortran.dg/zero_sized_15.f90: New test.
Jeff Law [Tue, 13 Jun 2023 17:46:32 +0000 (11:46 -0600)]
Remove a couple mudflap remnants
I happened to be digging into the specs to understand a build
failure and spotted mflib and mfwrap. Those were used by the
mudflap system which we ripped out years ago and we just missed
these.
I verified x86 still bootstraps after removing these bits.
Pushed to the trunk as obvious.
gcc/
* gcc.cc (LINK_COMMAND_SPEC): Remove mudflap spec handling.
Jeff Law [Tue, 13 Jun 2023 17:10:21 +0000 (11:10 -0600)]
Remove sh5media divtab code
Spurred by Akari Takahashi's patch to config/sh/divtab.cc, this removes
divtab.cc completely.
divtab.cc was used to calculate a division table for the sh5 media
processor. GCC dropped support for that (unmanufactured) chip back
in 2016 and this file simply got missed AFAICT.
Jakub Jelinek [Tue, 13 Jun 2023 16:39:45 +0000 (18:39 +0200)]
i386: Fix up whitespace in assembly
I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around tab, that isn't something which is done for
any other instruction and looks wrong.
2023-06-13 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386.cc (standard_sse_constant_opcode): Remove
superfluous spaces around \t for vpcmpeqd.
where we create a dead instruction that initializes the vector to zero,
immediately followed by a set of the entire vector. This patch skips
this zeroing instruction when the vector has only a single element.
It also updates the code to indicate when we've cleared the vector,
so that we don't need to initialize zero elements.
2023-06-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* expr.cc (store_constructor) <case VECTOR_TYPE>: Don't bother
clearing vectors with only a single element. Set CLEARED if the
vector was initialized to zero.
Juzhe-Zhong [Tue, 13 Jun 2023 11:38:38 +0000 (19:38 +0800)]
RISC-V: Add more SLP tests
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/slp-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-15.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-15.c: New test.
Jason Merrill [Tue, 13 Jun 2023 11:29:34 +0000 (07:29 -0400)]
c++: mutable temps in rodata
If the type of a temporary has mutable members, we can't set TREE_READONLY
on the VAR_DECL; this is parallel to the check in
cp_apply_type_quals_to_decl.
Yanzhang Wang [Tue, 13 Jun 2023 02:46:40 +0000 (10:46 +0800)]
RISC-V: Add vector psabi checking.
This patch adds a check for whether a function's argument or return value has
a vector type and emits a warning if so.
There are two exceptions:
- The vector_size attribute.
- The intrinsic functions.
Some test cases need -Wno-psabi added to suppress the warning.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
warning flag if func is not builtin
* config/riscv/riscv.cc
(riscv_scalable_vector_type_p): Determine whether the type is scalable vector.
(riscv_arg_has_vector): Determine whether the arg is vector type.
(riscv_pass_in_vector_p): Check the vector type param is passed by value.
(riscv_init_cumulative_args): The same as header.
(riscv_get_arg_info): Add the checking.
(riscv_function_value): Check the func return and set warning flag
* config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
determine whether warning psabi or not.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr109244.C: Add the -Wno-psabi.
* g++.target/riscv/rvv/base/pr109535.C: Same
* gcc.target/riscv/rvv/base/binop_vx_constraint-120.c: Same
* gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Same
* gcc.target/riscv/rvv/base/pr110109-2.c: Same
* gcc.target/riscv/rvv/base/scalar_move-9.c: Same
* gcc.target/riscv/rvv/base/spill-10.c: Same
* gcc.target/riscv/rvv/base/spill-11.c: Same
* gcc.target/riscv/rvv/base/spill-9.c: Same
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Same
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Same
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Same
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Same
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Same
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Same
* gcc.target/riscv/vector-abi-1.c: New test.
* gcc.target/riscv/vector-abi-2.c: New test.
* gcc.target/riscv/vector-abi-3.c: New test.
* gcc.target/riscv/vector-abi-4.c: New test.
* gcc.target/riscv/vector-abi-5.c: New test.
* gcc.target/riscv/vector-abi-6.c: New test.
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
Add a testcase for 'omp requires unified_address' that is currently supported
by all devices but was not tested for.
libgomp/
PR libgomp/109837
* testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test.
* testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.
Kyrylo Tkachov [Tue, 13 Jun 2023 09:17:24 +0000 (10:17 +0100)]
arm: Extend -mtp= arguments
After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
There are actually 3 system registers that can be accessed for the thread pointer
in aarch32: tpidrurw, tpidruro, tpidrprw. They are all read through the CP15 co-processor
mechanism. The current -mtp=cp15 option reads the tpidruro register.
This patch extends -mtp to allow for the above three explicit tpidr names and
keeps -mtp=cp15 as an alias of -mtp=tpidruro for backwards compatibility.
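The aliasing described above amounts to a small lookup. The following Python sketch only covers the names mentioned in this message; the function 'resolve_mtp' is hypothetical and is not how the option is implemented in the compiler.

```python
# Register names taken from the commit message.
TPIDR_NAMES = ("tpidrurw", "tpidruro", "tpidrprw")

# Old spelling kept as an alias for backwards compatibility.
ALIASES = {"cp15": "tpidruro"}


def resolve_mtp(value):
    """Map an -mtp= argument to the thread pointer register it selects."""
    name = ALIASES.get(value, value)
    if name not in TPIDR_NAMES:
        # Other -mtp values are outside the scope of this sketch.
        raise ValueError(f"unhandled -mtp value in this sketch: {value!r}")
    return name


for arg in ("cp15", "tpidrurw", "tpidruro", "tpidrprw"):
    print(f"-mtp={arg} selects {resolve_mtp(arg)}")
```

Under this model '-mtp=cp15' and '-mtp=tpidruro' are indistinguishable, which is the backwards-compatible behaviour the patch aims for.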
Bootstrapped and tested on arm-none-linux-gnueabihf.
gcc/ChangeLog:
* config/arm/arm-opts.h (enum arm_tp_type): Remove TP_CP15.
Add TP_TPIDRURW, TP_TPIDRURO, TP_TPIDRPRW values.
* config/arm/arm-protos.h (arm_output_load_tpidr): Declare prototype.
* config/arm/arm.cc (arm_option_reconfigure_globals): Replace TP_CP15
with TP_TPIDRURO.
(arm_output_load_tpidr): Define.
* config/arm/arm.h (TARGET_HARD_TP): Define in terms of TARGET_SOFT_TP.
* config/arm/arm.md (load_tp_hard): Call arm_output_load_tpidr to output
assembly.
(reload_tp_hard): Likewise.
* config/arm/arm.opt (tpidrurw, tpidruro, tpidrprw): New values for
arm_tp_type.
* doc/invoke.texi (Arm Options, mtp): Document new values.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mtp.c: New test.
* gcc.target/arm/mtp_1.c: New test.
* gcc.target/arm/mtp_2.c: New test.
* gcc.target/arm/mtp_3.c: New test.
* gcc.target/arm/mtp_4.c: New test.
Kyrylo Tkachov [Tue, 13 Jun 2023 09:13:55 +0000 (10:13 +0100)]
aarch64: Extend -mtp= arguments
After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
First of all, there is another TPIDR register that can be used to read the thread pointer:
TPIDRRO_EL0 (which can also be accessed by AArch32 under another name) so it makes sense
to add -mtp=tpidrro_el0. This makes the existing arguments el0, el1, el2, el3 somewhat
inconsistent in their naming so this patch introduces the more "full" names
tpidr_el0, tpidr_el1, tpidr_el2, tpidr_el3 and makes the above short names aliases of these new ones.
Long story short, we preserve backwards compatibility and add a new TPIDR register to access through
-mtp that wasn't available previously.
There is more relevant discussion of the options at https://reviews.llvm.org/D152433 if you're interested.
Bootstrapped and tested on aarch64-none-linux-gnu.
PR target/108779
* gcc.target/aarch64/mtp_5.c: New test.
* gcc.target/aarch64/mtp_6.c: New test.
* gcc.target/aarch64/mtp_7.c: New test.
* gcc.target/aarch64/mtp_8.c: New test.
* gcc.target/aarch64/mtp_9.c: New test.
Alexandre Oliva [Tue, 13 Jun 2023 08:52:22 +0000 (05:52 -0300)]
fix frange_nextafter odr violation
C++ requires inline functions to be declared inline and defined in
every translation unit that uses them. frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc. Drop the extraneous inline specifier.
Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.
for gcc/ChangeLog
* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.
Richard Biener [Tue, 13 Jun 2023 07:19:34 +0000 (09:19 +0200)]
middle-end/110232 - fix native interpret of vector <signed-boolean:1>
The following fixes native interpretation of a buffer as boolean
vector with bit-precision elements such as AVX512 vectors. The
check whether the buffer covers the whole vector was broken for
bit-precision elements and the following instead implements it
based on the vector type size.
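To see why a per-element size computation goes wrong for bit-precision elements, consider the following Python sketch. The message does not spell out the old check, so computing the requirement as element count times a whole-byte element size is an assumption here, as are all function names; padding of real vector types is ignored.

```python
def bytes_from_element_size(nelts, elt_bytes):
    """Element count times a whole-byte element size (assumed old check)."""
    return nelts * elt_bytes


def bytes_from_type_size(nelts, elt_bits):
    """Total vector bits rounded up to whole bytes (stand-in for the
    size of the vector type)."""
    return (nelts * elt_bits + 7) // 8


def buffer_covers(buffer_len, needed):
    """True if a buffer of buffer_len bytes holds the whole vector."""
    return buffer_len >= needed


# A mask of 16 one-bit booleans occupies 2 bytes, but a 1-bit element
# cannot be described by a whole-byte element size smaller than 1.
nelts, elt_bits, elt_bytes = 16, 1, 1
buffer_len = 2

print(buffer_covers(buffer_len, bytes_from_element_size(nelts, elt_bytes)))
print(buffer_covers(buffer_len, bytes_from_type_size(nelts, elt_bits)))
```

With these numbers the per-element computation demands 16 bytes and rejects a 2-byte buffer that does in fact hold the whole mask, while the computation based on the total size accepts it.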
PR middle-end/110232
* fold-const.cc (native_interpret_vector): Use TYPE_SIZE_UNIT
to check whether the buffer covers the whole vector.
Richard Biener [Tue, 13 Jun 2023 06:52:23 +0000 (08:52 +0200)]
Fix disambiguation against .MASK_LOAD
Alias analysis was treating .MASK_LOAD as storing a full vector
which means we disambiguate against decls smaller than the vector size.
This complements the previous patch handling .MASK_STORE and fixes
runtime execution FAILs of gfortran.dg/matmul_3.f90 and
gfortran.dg/inline_sum_2.f90 when using AVX512 with full masked loop
vectorization on Zen4.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): For
.MASK_LOAD and friends set the size of the access to unknown.
Kewen Lin [Tue, 13 Jun 2023 08:04:54 +0000 (03:04 -0500)]
testsuite: Check int128 effective target for pr109932-{1,2}.c [PR110230]
This patch is to make newly added test cases pr109932-{1,2}.c
check int128 effective target to avoid unsupported type error
on 32-bit. I did hit this failure during testing and fixed
it, but made a stupid mistake not updating the local formatted
patch which was actually out of date.
PR testsuite/110230
PR target/109932
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109932-1.c: Adjust with int128 effective target.
* gcc.target/powerpc/pr109932-2.c: Ditto.
Piotr Trojanek [Wed, 3 May 2023 13:26:23 +0000 (15:26 +0200)]
ada: Fix decoration of iterated component association for GNATprove
This patch is an alternative solution for a recent fix in analysis of
iterated component association.
To recap, if the iterated expression is an aggregate, we want to
propagate the component type downward with a call to Resolve_Aggr_Expr;
otherwise we want this expression to be only preanalysed (since the
association might need to be repeatedly evaluated), but also we need to
apply predicate and range checks to the expression itself (these are
required for GNATprove).
It turns out that Resolve_Aggr_Expr already knows how to deal with a
nested aggregate and also works for GNATprove, where it both preanalyzes
the expression and applies necessary checks.
In other words, expression of the iterated component association is now
resolved just like expression of an ordinary array aggregate.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): Simply resolve
the expression.
Bob Duff [Wed, 3 May 2023 12:42:24 +0000 (08:42 -0400)]
ada: Add missing ss_mark/ss_release in quantified expressions
If a quantified expression says "for all ... of F(...)"
where F(...) is a function call that returns on the secondary
stack, we need to clean up the secondary stack. This patch
adds the required ss_mark/ss_release in that case.
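The mark/release discipline can be illustrated with a toy model. This Python sketch is conceptual only: the class and function names are invented, and it does not reflect how the GNAT run-time or the expander implement the secondary stack.

```python
class SecondaryStack:
    """Toy mark/release allocator; a conceptual model, not the GNAT runtime."""

    def __init__(self):
        self._items = []

    def mark(self):
        # Remember the current top so later allocations can be dropped.
        return len(self._items)

    def allocate(self, value):
        self._items.append(value)
        return value

    def release(self, mark):
        # Drop everything allocated since the matching mark.
        del self._items[mark:]

    def depth(self):
        return len(self._items)


def make_sequence(stack, n):
    """Stand-in for F(...): returns its result on the secondary stack."""
    return stack.allocate(list(range(n)))


def all_non_negative(stack, n):
    """Model 'for all X of F(...) => X >= 0' wrapped in mark/release."""
    mark = stack.mark()
    try:
        return all(x >= 0 for x in make_sequence(stack, n))
    finally:
        stack.release(mark)


stack = SecondaryStack()
print(all_non_negative(stack, 5), stack.depth())
```

Without the release step, each evaluation of the quantified expression would leave the result of the function call on the stack.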
gcc/ada/
* exp_ch4.adb
(Expand_N_Quantified_Expression): Detect the secondary-stack
case, and find the innermost scope where we should mark/release,
and Set_Uses_Sec_Stack on that. Skip intermediate blocks and loops
that are part of expansion.
Piotr Trojanek [Wed, 3 May 2023 11:52:44 +0000 (13:52 +0200)]
ada: Recognize iterated_component_association as repeatedly evaluated
As iterated_component_association is an array_component_association
(because of a grammar rule Ada 2022 RM 4.3.3(5/5)), its expression is
repeatedly evaluated (because of Ada 2022 RM 6.1.1(22.14/5)).
With this patch we will now get errors for both conjuncts in this code,
which have semantically equivalent array aggregates that use an ordinary
component association and iterated component association.
procedure Iter (S : String)
with Post => String'(for J in 1 .. 3 => S (S'First)'Old) =
String'( 1 .. 3 => S (S'First)'Old);
gcc/ada/
* sem_util.adb (Is_Repeatedly_Evaluated): Recognize iterated component
association as repeatedly evaluated.
Piotr Trojanek [Wed, 3 May 2023 09:39:52 +0000 (11:39 +0200)]
ada: Recognize iterated_component_association as potentially unevaluated
Routine Is_Potentially_Unevaluated was written for Ada 2012, but now we
use it for Ada 2022 as well, so it must recognize iterated component
associations (which were added by Ada 2022) as an array component
association.
gcc/ada/
* sem_util.adb (Is_Potentially_Unevaluated): Recognize iterated
component association as potentially unevaluated.
Piotr Trojanek [Wed, 3 May 2023 07:23:29 +0000 (09:23 +0200)]
ada: Disable inlining in potentially unevaluated contexts
Instead of explicitly disabling inlining in quantified expressions,
(which happen to be only preanalysed) and then disabling inlining in
potentially unevaluated contexts that are fully analysed (which happen
to include quantified expressions), we now simply disable inlining in
all potentially unevaluated contexts, regardless of the full analysis
mode.
This also disables inlining in iterated component associations, which
can be both preanalysed or fully analysed depending on their expression,
but nevertheless are potentially unevaluated.
gcc/ada/
* sem_res.adb (Resolve_Call): Replace early call to
In_Quantified_Expression with a call to Is_Potentially_Unevaluated that
was only done when Full_Analysis is true.
Piotr Trojanek [Tue, 2 May 2023 14:03:18 +0000 (16:03 +0200)]
ada: Implement new aspect Always_Terminates for SPARK
This patch allows subprograms to be annotated with aspect
Always_Terminates that requires a boolean expression. When this
expression evaluates to True, the subprogram is required to terminate or
raise an exception, but not loop infinitely.
This aspect is only meant to be used by GNATprove and it has no
meaningful run-time semantics: either the annotated subprogram
terminates and then the aspect expression doesn't matter, or the
subprogram loops infinitely and there is nothing we can do. (We could
also evaluate the aspect expression just to detect run-time errors in
the expression itself, but this can be implemented later, after a
backend support for the aspect is added to GNATprove.)
Implementation of this aspect is heavily based on the implementation of
Subprogram_Variant, which in turn is heavily based on the implementation
of Contract_Cases. Since the new aspect is not yet expanded, there is no
corresponding assertion kind that would control the expansion.
gcc/ada/
* aspects.ads (Aspect_Id): Add new aspect.
(Implementation_Defined_Aspect): New aspect is
implementation-defined.
(Aspect_Argument): New aspect has an expression argument.
(Is_Representation_Aspect): New aspect is not a representation
aspect.
(Aspect_Names): Link new aspect identifier with a name.
(Aspect_Delay): New aspect is never delayed.
* contracts.adb (Expand_Subprogram_Contract): Mention new aspect
in comment.
(Add_Contract_Item): Attach pragma corresponding to the new aspect
to contract items.
(Analyze_Entry_Or_Subprogram_Contract): Analyze pragma
corresponding to the new aspect that appears with subprogram spec.
(Analyze_Subprogram_Body_Stub_Contract): Expand pragma
corresponding to the new aspect.
* contracts.ads
(Add_Contract_Item, Analyze_Entry_Or_Subprogram_Contract)
(Analyze_Entry_Or_Subprogram_Body_Contract)
(Analyze_Subprogram_Body_Stub_Contract): Mention new aspect in
comment.
* einfo-utils.adb (Get_Pragma): Return pragma attached to
contract.
* einfo-utils.ads (Get_Pragma): Mention new contract in comment.
* exp_prag.adb (Expand_Pragma_Always_Terminates): Placeholder for
possibly expanding new aspect.
* exp_prag.ads (Expand_Pragma_Always_Terminates): Dedicated
routine for expansion of the new aspect.
* inline.adb (Remove_Aspects_And_Pragmas): Remove aspect from
inlined bodies.
* par-prag.adb (Prag): Postpone checking of the pragma until
analysis.
* sem_ch12.adb: Mention new aspect in explanation of handling
contracts on generic units.
* sem_ch13.adb (Analyze_Aspect_Specifications): Convert new aspect
into a corresponding pragma.
(Check_Aspect_At_Freeze_Point): Don't expect new aspect.
* sem_prag.adb (Analyze_Always_Terminates_In_Decl_Part): Analyze
pragma corresponding to the new aspect.
(Analyze_Pragma): Handle pragma corresponding to the new aspect.
(Is_Non_Significant_Pragma_Reference): Handle references appearing
within new aspect.
* sem_prag.ads (Aspect_Specifying_Pragma): New aspect can be
emulated with a pragma.
(Assertion_Expression_Pragma): New aspect has an assertion
expression.
(Pragma_Significant_To_Subprograms): New aspect is significant to
subprograms.
(Analyze_Always_Terminates_In_Decl_Part): Add spec for routine
that analyses new aspect.
(Find_Related_Declaration_Or_Body): Mention new aspect in comment.
* sem_util.adb (Is_Subprogram_Contract_Annotation): New aspect is
a subprogram contract annotation.
* sem_util.ads (Is_Subprogram_Contract_Annotation): Mention new
aspect in comment.
* sinfo.ads (Is_Generic_Contract_Pragma): New pragma is a generic
contract.
(Contract): Explain attaching new pragma to subprogram contract.
* snames.ads-tmpl (Name_Always_Terminates): New name for the new
contract.
(Pragma_Always_Terminates): New pragma identifier.
Piotr Trojanek [Tue, 2 May 2023 10:49:43 +0000 (12:49 +0200)]
ada: Skip elaboration checks for abstract subprograms on derived types
Elaboration checks skip abstract subprogram declarations, which have no
body that could be examined. Now these checks also skip abstract
subprograms of a derived type, which have no body either.
gcc/ada/
* sem_elab.adb (Check_Overriding_Primitive): Prevent Corresponding_Body
from being called with entity of an abstract subprogram.
Eric Botcazou [Fri, 28 Apr 2023 13:55:38 +0000 (15:55 +0200)]
ada: Fix another case of missing Has_Private_View flag
It occurs for the case of a function call first parsed as an identifier.
gcc/ada/
* sem_ch12.adb (Save_References_In_Identifier): In the case where
the identifier has been turned into a function call by analysis,
call Set_Global_Type on the entity if it is global.
Eric Botcazou [Mon, 24 Apr 2023 09:07:38 +0000 (11:07 +0200)]
ada: Fix exception raised on invalid contract in generic package
This lets the compiler give a proper error message instead.
gcc/ada/
* contracts.adb (Contract_Error): New exception.
(Add_Contract_Item): Raise Contract_Error instead of Program_Error.
(Add_Generic_Contract_Pragma): Deal with Contract_Error.
Eric Botcazou [Tue, 25 Apr 2023 21:20:08 +0000 (23:20 +0200)]
ada: Fix spurious error on call to function returning private in generic
The spurious error is given on a call to a parameterless function returning
a private type, present in the body of a generic construct both declared and
instantiated in the presence of the full view of the type, because this full
view is not properly restored for the instantiation.
This is supposed to be handled by the Has_Private_View mechanism, but it is
bypassed here because the call to the parameterless function is first parsed
as a simple identifier before being later analyzed as a function call.
Fixing this first issue uncovered another one, whereby the Has_Private_View
flag was not properly set on an operator returning a private type that ends
up being later resolved as a function call.
Finally a small loophole in Eval_Attribute exposed by the change also needs
to be plugged.
gcc/ada/
* sem_attr.adb (Eval_Attribute): Add more exceptions to the early
return for a prefix which is a nonfrozen generic actual type.
* sem_ch12.adb (Copy_Generic_Node): Also check private views in the
case of an entity name or operator analyzed as a function call.
(Set_Global_Type): Make it a child of Save_Global_References.
(Save_References_In_Operator): In the case where the operator has
been turned into a function call, call Set_Global_Type on the entity
if it is global.
Eric Botcazou [Mon, 24 Apr 2023 15:11:01 +0000 (17:11 +0200)]
ada: Fix internal error on imported function with post-condition
The problem, which is also present for an expression function, is that the
function is invoked in the initializing expression of a variable declared
in the same declarative part as the function, which causes the freezing of
its artificial body before the post-condition is analyzed on its spec.
gcc/ada/
* contracts.adb (Analyze_Entry_Or_Subprogram_Body_Contract): For a
subprogram body that has no contracts and does not come from source,
make sure that contracts on its corresponding spec are analyzed, if
any, before expanding them.
Eric Botcazou [Mon, 24 Apr 2023 18:50:39 +0000 (20:50 +0200)]
ada: Streamline expansion of controlled actions for aggregates
This changes the strategy used to expand controlled actions for array and
record aggregates so as to make it simpler and more robust.
The current strategy is to set the No_Ctrl_Actions flag on the assignments
generated during the expansion of aggregate, as done during the expansion
of initialization procedures, and to generate the adjustments of the LHS
manually in the same list of actions, before sending the entire list for
analysis and expansion. The problem is that, when the RHS also requires
controlled actions, the No_Ctrl_Actions flag prevents transient scopes
from being created around the assignments, with the end result that the
actions are "naturally" generated between the assignments and adjustments
of the LHS, causing premature finalization of the RHS. In order to counter
that, the controlled actions of the RHS must also be generated manually
during the expansion of the aggregates, after blocking normal processing
e.g. by means of the No_Side_Effect_Removal flag. This means that, for
a more complex RHS, this strategy generates a wrong order of controlled
actions by default, until specifically adjusted.
The new strategy is to reuse the standard machinery as much as possible,
disabling only the part that is not needed for the assignments generated
during the expansion of aggregates, namely the finalization of the LHS;
in other words, the adjustment of the LHS is left entirely to the standard
machinery and the creation of transient scopes is no longer blocked, which
gives a correct order of controlled actions by default. It is implemented
by means of a No_Finalize_Actions flag present on the assignments generated
during the expansion.
It is mostly straightforward, modulo the following hitch: the assignments
are now analyzed and expanded by the common expander, which in the case of
controlled assignments analyzes the final rewriting with all checks off,
which in particular disables elaboration checks for the calls to the Adjust
primitives; now these checks are necessary in the case where an aggregate
is the initialization expression of an object declared before the body of
the Adjust primitive is seen. Hence the use of an existing trick, namely
Suppress/Unsuppress blocks, around the assignments.
gcc/ada/
* gen_il-fields.ads (Opt_Field_Enum): Add No_Finalize_Actions and
remove No_Side_Effect_Removal.
* gen_il-gen-gen_nodes.adb (N_Function_Call): Remove semantic flag
No_Side_Effect_Removal.
(N_Assignment_Statement): Add semantic flag No_Finalize_Actions.
* sinfo.ads (No_Ctrl_Actions): Adjust comment.
(No_Finalize_Actions): New flag on assignment statements.
(No_Side_Effect_Removal): Delete.
* exp_aggr.adb (Build_Record_Aggr_Code): Remove obsolete comment and
Ancestor_Is_Expression variable. In the case of an extension, do
not generate a call to Adjust manually, call Set_No_Finalize_Actions
instead. Do not set the tags, replace call to Make_Unsuppress_Block
by Make_Suppress_Block and remove useless assertions.
In the general case, call Initialize_Component.
(Initialize_Controlled_Component): Delete.
(Initialize_Simple_Component): Delete.
(Initialize_Component): Do the low-level processing, but do not
generate a call to Adjust manually, call Set_No_Finalize_Actions.
(Process_Transient_Component): Delete.
(Process_Transient_Component_Completion): Likewise.
* exp_ch5.adb (Expand_Assign_Array): Deal with No_Finalize_Actions.
(Expand_Assign_Array_Loop): Likewise.
(Expand_N_Assignment_Statement): Likewise.
(Make_Tag_Ctrl_Assignment): Likewise.
* exp_util.adb (Remove_Side_Effects): Do not test the
No_Side_Effect_Removal flag.
* sem_prag.adb (Process_Suppress_Unsuppress): Give the warning in
SPARK mode only for pragma Suppress.
* tbuild.ads (Make_Suppress_Block): New declaration.
(Make_Unsuppress_Block): Adjust comment.
* tbuild.adb (Make_Suppress_Block): New procedure.
(Make_Unsuppress_Block): Unsuppress instead of suppressing.
Eric Botcazou [Thu, 20 Apr 2023 15:20:46 +0000 (17:20 +0200)]
ada: Remove obsolete code in Analyze_Assignment
This code was dealing with build-in-place calls for nonlimited types, but
they no longer exist since Is_Build_In_Place_Result_Type => Is_Limited_View.
gcc/ada/
* sem_ch5.adb (Analyze_Assignment): Turn Rhs into a constant and
remove calls to the following subprograms.
(Transform_BIP_Assignment): Delete.
(Should_Transform_BIP_Assignment): Likewise.
Eric Botcazou [Fri, 21 Apr 2023 16:37:12 +0000 (18:37 +0200)]
ada: Small housekeeping work in expansion of extension aggregates
This avoids repeatedly calling Unqualify on the same node, removes a dead
call to Generate_Finalization_Actions, a redundant setting of Assignment_OK
and reuses a local variable more consistently. No functional changes.
gcc/ada/
* exp_aggr.adb (Build_Record_Aggr_Code): Add new variable Ancestor_Q
to store the result of Unqualify on Ancestor. Remove the dead call
to Generate_Finalization_Actions in the case of another aggregate as
ancestor part. Remove the redundant setting of Assignment_OK. Use
Init_Typ in lieu of Etype (Ancestor) more consistently.
Eric Botcazou [Fri, 21 Apr 2023 16:30:48 +0000 (18:30 +0200)]
ada: Fix wrong expansion of limited extension aggregate
This happens when the ancestor part is itself an aggregate: in this case,
the tag of the extension aggregate is wrongly set to that of the ancestor.
gcc/ada/
* exp_aggr.adb (Build_Record_Aggr_Code): In the case of an extension
aggregate of a limited type whose ancestor part is an aggregate, do
not skip the final code assigning the tag of the extension.
In preparation for attribute Initialized to become ghost, use aspect
Ghost_Predicate instead of Predicate in unit Ada.Strings.Superbounded
of the standard library.
gcc/ada/
* libgnat/a-strsup.ads: Change predicate aspect.
* sem_ch13.adb (Add_Predicate): Fix for first predicate.
Eric Botcazou [Wed, 19 Apr 2023 07:56:42 +0000 (09:56 +0200)]
ada: Fix expansion of aggregates with controlled components
The expansion is incorrect in the case where the initialization expression
of a component is a conditional expression that has a function call as one
of its dependent expressions, leading to a wrong order of initialization,
adjustment and finalization.
gcc/ada/
* exp_aggr.adb (Initialize_Component): Perform immediate expansion
of the initialization expression if it is a conditional expression
and the component type is controlled.
Eric Botcazou [Tue, 18 Apr 2023 10:44:55 +0000 (12:44 +0200)]
ada: Factor common processing in expansion of aggregates
The final processing at the component level of array aggregates and record
aggregates is very similar, so this factors out the common processing into
three new library-level subprograms.
There should be no functional changes, but the expanded code may be changed
in the case of controlled components of array aggregates not covered by a
multiple choice: the previous expansion used to place new declarations prior
to the aggregate in this case and that is no longer the case, i.e. they are
always placed right before the initialization of the component (as was done
for all controlled components of record aggregates and controlled components
of array aggregates covered by a multiple choice).
gcc/ada/
* exp_aggr.adb (Initialize_Component): New procedure factored out
from the processing of array and record aggregates.
(Initialize_Controlled_Component): Likewise.
(Initialize_Simple_Component): Likewise.
(Build_Array_Aggr_Code.Gen_Assign): Remove In_Loop parameter.
Call Initialize_Component to initialize the component.
(Initialize_Array_Component): Delete.
(Initialize_Ctrl_Array_Component): Likewise.
(Build_Array_Aggr_Code): Adjust calls to Gen_Assign.
(Build_Record_Aggr_Code): Call Initialize_Simple_Component or
Initialize_Component to initialize the component.
(Initialize_Ctrl_Record_Component): Delete.
(Initialize_Record_Component): Likewise.
Piotr Trojanek [Wed, 19 Apr 2023 09:02:34 +0000 (11:02 +0200)]
ada: Cleanup finding of locally handled exception handlers
Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.
gcc/ada/
* exp_ch11.adb (Find_Local_Handler): Replace guard against other
constructs appearing in the list of exception handlers with iteration
using First_Non_Pragma/Next_Non_Pragma.
Piotr Trojanek [Tue, 18 Apr 2023 17:13:38 +0000 (19:13 +0200)]
ada: Cleanup expansion of locally handled exception handlers
Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.
gcc/ada/
* exp_ch11.ads (Find_Local_Handler): Fix typo in comment.
* exp_ch11.adb (Find_Local_Handler): Remove redundant check for the
Exception_Handler list being present; use membership test to eliminate
local object LCN; fold nested IF statements. Remove useless ELSIF
condition.
ada: Support new GNAT-specific aspect Ghost_Predicate
New aspect Ghost_Predicate allows the use of ghost entities in the
predicate expression, even if the type is not ghost itself. As a result,
subtypes with a ghost predicate cannot be used in membership tests.
Subtypes with ghost predicates are subject to the same additional
restrictions as subtypes with aspect Dynamic_Predicate.
They are governed for compilation by assertion policy Ghost.
Checking of the predicate itself is governed by the usual assertion
policy (Static_Predicate/Dynamic_Predicate/Predicate) independently
of the ghost predicate.
gcc/ada/
* doc/gnat_rm/implementation_defined_aspects.rst: Document new
aspect.
* doc/gnat_rm/implementation_defined_pragmas.rst: Whitespace.
* aspects.adb (Init_Canonical_Aspect): Set it to Predicate.
* aspects.ads: Set global constants for new aspect.
* einfo.ads: Describe new flag related to new aspect.
* exp_ch6.adb (Can_Fold_Predicate_Call): Do not fold new aspect.
* exp_util.adb (Make_Predicate_Check): Add comment.
* gen_il-fields.ads: Add new flag.
* gen_il-gen-gen_entities.adb: Add new flag.
* ghost.adb (Is_OK_Ghost_Context): Ghost predicate is an OK
ghost context.
(Mark_Ghost_Pragma): Add overloading with ghost mode parameter.
* ghost.ads (Mark_Ghost_Pragma): Add overloading with ghost mode
parameter.
(Name_To_Ghost_Mode): Make function public.
* sem_aggr.adb: Issue error for violation of valid use.
* sem_case.adb: Issue error for violation of valid use.
* sem_ch13.adb: Adapt for new aspect.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Remove dead code
which was trying to propagate Has_Predicates flag in the wrong
direction (from derived to parent type).
(Analyze_Number_Declaration): Issue error for violation of valid
use.
(Build_Derived_Type): Cleanup inheritance of predicate flags from
parent to derived type.
(Build_Predicate_Function): Only add a predicate check when it
is not ignored as Ghost code.
* sem_ch4.adb (Analyze_Membership_Op): Issue an error for use of
a subtype with a ghost predicate as name in a membership test.
* sem_ch5.adb (Check_Predicate_Use): Issue error for violation of
valid use.
* sem_eval.adb: Adapt code for Dynamic_Predicate to account for
Ghost_Predicate too.
* sem_prag.adb (Analyze_Pragma): Make pragma ghost or not.
* sem_util.adb (Bad_Predicated_Subtype_Use): Adapt to new aspect.
(Inherit_Predicate_Flags): Add inheritance of flag. Add parameter
to apply to derived types.
* sem_util.ads (Inherit_Predicate_Flags): Change signature.
* snames.ads-tmpl: Add new aspect name.
* gnat_rm.texi: Regenerate.
Piotr Trojanek [Mon, 17 Apr 2023 10:14:28 +0000 (12:14 +0200)]
ada: Remove explicit decoration of wrapper created in freezing
We create wrapper functions associated with inherited functions with
controlling results which are not overridden during freezing. We partly
decorated them explicitly, even though they would be fully decorated
later anyway.
This early decoration didn't work as expected, because flag
In_Private_Part that is read by Override_Dispatching_Operation is not
set reliably while freezing (as explained in the comment of
Is_Private_Declaration). In effect, we were getting a circularity
between Alias and Overridden_Operation, which was causing GNATprove to
loop infinitely.
Apparently the cleanest fix is to not decorate the wrapper with an early
call to Override_Dispatching_Operation.
gcc/ada/
* exp_ch3.adb (Make_Controlling_Function_Wrappers): Remove early
decoration.
Tejas Belagod [Tue, 11 May 2021 10:09:03 +0000 (11:09 +0100)]
AArch64: [PR96339] Optimise svlast[ab]
This PR optimizes an SVE intrinsics sequence where
svlasta (svptrue_pat_b8 (SV_VL1), x)
a scalar is selected based on a constant predicate and a variable vector.
This sequence is optimized to return the corresponding element of a NEON
vector. For example,
svlasta (svptrue_pat_b8 (SV_VL1), x)
returns
umov w0, v0.b[1]
Likewise,
svlastb (svptrue_pat_b8 (SV_VL1), x)
returns
umov w0, v0.b[0]
This optimization only works provided the constant predicate maps to a range
that is within the bounds of a 128-bit NEON register.
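To illustrate why the two calls above select lanes 1 and 0, here is a minimal
scalar sketch of the LASTA/LASTB selection rule, assuming a fixed vector of 16
byte lanes (one 128-bit NEON register). The names LANES, last_active,
model_lasta and model_lastb are hypothetical; this is plain C, not the ACLE
intrinsics, and not the code of the fold itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 16 byte lanes, i.e. one 128-bit NEON register.  */
#define LANES 16

/* Index of the last active lane in PRED, or -1 if no lane is active.  */
static int
last_active (const bool pred[LANES])
{
  int last = -1;
  for (int i = 0; i < LANES; i++)
    if (pred[i])
      last = i;
  return last;
}

/* Model of svlastb: the last active element, or the final element if no
   lane is active.  */
uint8_t
model_lastb (const bool pred[LANES], const uint8_t x[LANES])
{
  int last = last_active (pred);
  return x[last < 0 ? LANES - 1 : last];
}

/* Model of svlasta: the element after the last active one, wrapping to
   element 0 (which is also the result when no lane is active).  */
uint8_t
model_lasta (const bool pred[LANES], const uint8_t x[LANES])
{
  int last = last_active (pred);
  return x[(last + 1) % LANES];
}
```

With a predicate that has only lane 0 active (the SV_VL1 pattern), model_lastb
picks lane 0 and model_lasta picks lane 1, matching the two umov forms above.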
gcc/ChangeLog:
PR target/96339
* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): Fold sve
calls that have a constant input predicate vector.
(svlast_impl::is_lasta): Query to check if intrinsic is svlasta.
(svlast_impl::is_lastb): Query to check if intrinsic is svlastb.
(svlast_impl::vect_all_same): Check if all vector elements are equal.
Andi Kleen [Tue, 30 May 2023 11:05:39 +0000 (04:05 -0700)]
Update perf auto profile script
- Fix gen_autofdo_event: The download URL for the Intel Perfmon Event
list has changed, as well as the JSON format.
Also it now uses pattern matching to match CPUs. Update the script to support all of this.
- Regenerate gcc-auto-profile with the latest published Intel model
numbers, so it works with recent systems.
- So far it's still broken on hybrid systems
contrib/ChangeLog:
* gen_autofdo_event.py: Update for download server changes
This patch fixes the requirement of V_WHOLE and V_FRACT.
E.g. VNx8QI in V_WHOLE has no requirement which is incorrect.
Actually, VNx8QI should be whole(full) mode when TARGET_MIN_VLEN < 128
since when TARGET_MIN_VLEN == 128, VNx8QI is e8mf2 which is fractional
vector.
Co-Authored by: Robin Dapp <rdapp@ventanamicro.com>
* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): New function.
(shuffle_decompress_patterns): New function.
(expand_vec_perm_const_1): Add decompress optimization.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/slp-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-9.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-9.c: New test.
So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
incorrect code when optimizing code with clobbers. Specifically in the
case where we try to optimize a sequence of 4 operations down to 3
operations we can reset INSN to the next instruction and continue the loop.
That skips the code to invalidate objects based on things like REG_INC
nodes, stack pushes and most importantly clobbers attached to the current
insn.
This patch factors all of the invalidation code used by reload_cse_move2add
into a new function and calls it at the appropriate time.
Georg-Johann has confirmed this patch fixes his avr bug and I've had it in
my tester over the weekend. It's bootstrapped and regression tested on
aarch64, m68k, sh4, alpha and hppa. It's also regression tested successfully
on a wide variety of other targets.
[aarch64] Improve code-gen for vector initialization with single constant element.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Tweak condition
if (n_var == n_elts && n_elts <= 16) to allow a single constant,
and if maxv == 1, use constant element for duplicating into register.
Tobias Burnus [Mon, 12 Jun 2023 16:15:28 +0000 (18:15 +0200)]
OpenMP: Cleanups related to the 'present' modifier
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.
Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.
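The difference between the two families of map kinds described above can be
sketched as follows. This is only a model under stated assumptions: the enum
values, model_map and the result codes are hypothetical stand-ins, not the
values in gomp-constants.h, and the model returns an error code where the
run-time library would abort.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two families of map kinds.  */
enum model_map_kind
{
  MODEL_FORCE_PRESENT,     /* abort if absent, otherwise do nothing  */
  MODEL_ALWAYS_PRESENT_TO  /* abort if absent, otherwise copy host->device  */
};

enum model_result { MODEL_OK, MODEL_ABORT };

/* DEV is the device copy of the data, or NULL when it is not present.  */
enum model_result
model_map (enum model_map_kind kind, void *dev, const void *host, size_t n)
{
  if (dev == NULL)
    return MODEL_ABORT;          /* both kinds require presence  */
  if (kind == MODEL_ALWAYS_PRESENT_TO)
    memcpy (dev, host, n);       /* only the 'always' kind copies  */
  return MODEL_OK;
}
```

Since the plain present kinds all behave like the first case, they can share
one value at the libgomp boundary.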
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.
gcc/ChangeLog:
* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.
libgomp/ChangeLog:
* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ALLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.
Andrew MacLeod [Sat, 10 Jun 2023 21:06:36 +0000 (17:06 -0400)]
Provide interface for non-standard operators.
This removes the hack introduced for WIDEN_MULT which exported a pointer
to the operator and the gimple-range-op.cc set the operator to this
pointer when it was appropriate.
Instead, we simply change the range-op table to be unsigned indexed,
and add new opcodes to the end of the table, allowing them to be indexed
directly via range_op_handler::range_op.
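A minimal sketch of the indexing scheme, assuming a miniature table with two
standard codes. All names prefixed with M_ or model_ are hypothetical and only
mirror the idea of appending extra opcodes after the last standard code; they
are not the declarations in range-op.h.

```c
#include <stddef.h>

/* Hypothetical miniature set of "standard" codes.  */
enum model_tree_code { M_PLUS_EXPR, M_MULT_EXPR, M_MAX_TREE_CODES };

/* Extra opcodes are appended after the last standard code, so the table
   is indexed by a plain unsigned value rather than by the enum.  */
#define M_OP_WIDEN_MULT_SIGNED   ((unsigned) M_MAX_TREE_CODES)
#define M_OP_WIDEN_MULT_UNSIGNED ((unsigned) M_MAX_TREE_CODES + 1)
#define M_RANGE_OP_TABLE_SIZE    ((unsigned) M_MAX_TREE_CODES + 2)

struct model_operator { const char *name; };

static const struct model_operator model_table[M_RANGE_OP_TABLE_SIZE] = {
  [M_PLUS_EXPR] = { "plus" },
  [M_MULT_EXPR] = { "mult" },
  [M_OP_WIDEN_MULT_SIGNED] = { "widen_mult_signed" },
  [M_OP_WIDEN_MULT_UNSIGNED] = { "widen_mult_unsigned" },
};

/* Look up an operator by unsigned opcode; NULL if out of range.  */
const struct model_operator *
model_lookup (unsigned code)
{
  return code < M_RANGE_OP_TABLE_SIZE ? &model_table[code] : NULL;
}
```

Standard and non-standard operators then go through the same lookup, with no
exported pointers needed.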
* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
Use range_op_handler directly.
* range-op.cc (range_op_handler::range_op_handler): Unsigned
param instead of tree-code.
(ptr_op_widen_plus_signed): Delete.
(ptr_op_widen_plus_unsigned): Delete.
(ptr_op_widen_mult_signed): Delete.
(ptr_op_widen_mult_unsigned): Delete.
(range_op_table::initialize_integral_ops): Add new opcodes.
* range-op.h (range_op_handler): Use unsigned.
(OP_WIDEN_MULT_SIGNED): New.
(OP_WIDEN_MULT_UNSIGNED): New.
(OP_WIDEN_PLUS_SIGNED): New.
(OP_WIDEN_PLUS_UNSIGNED): New.
(RANGE_OP_TABLE_SIZE): New.
(range_op_table::operator []): Use unsigned.
(range_op_table::set): Use unsigned.
(m_range_tree): Make unsigned.
(ptr_op_widen_mult_signed): Remove.
(ptr_op_widen_mult_unsigned): Remove.
(ptr_op_widen_plus_signed): Remove.
(ptr_op_widen_plus_unsigned): Remove.
Andrew MacLeod [Sat, 10 Jun 2023 20:59:38 +0000 (16:59 -0400)]
Provide a default range_operator via range_op_handler.
range_op_handler now provides a default range_operator for any opcode,
so there is no longer a need to check for a valid operator.
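The idea can be sketched as a null-object pattern: every lookup returns a
usable operator, and validity is decided by comparing against the default one.
This is an illustrative model only; model_op, model_handler and
model_handler_valid are hypothetical names, not the range_op_handler API.

```c
#include <stdbool.h>

struct model_op
{
  /* Returns true and sets *OUT if the operation could be folded.  */
  bool (*fold) (int lhs, int rhs, int *out);
};

/* Default operator: valid to call, but never folds anything.  */
static bool
default_fold (int lhs, int rhs, int *out)
{
  (void) lhs; (void) rhs; (void) out;
  return false;
}

static bool
plus_fold (int lhs, int rhs, int *out)
{
  *out = lhs + rhs;
  return true;
}

static const struct model_op model_default = { default_fold };
static const struct model_op model_plus = { plus_fold };

/* Never returns NULL: unknown opcodes map to the default operator.
   Opcode 0 stands for "plus" in this sketch.  */
const struct model_op *
model_handler (unsigned code)
{
  return code == 0 ? &model_plus : &model_default;
}

/* Counterpart of the bool conversion: compare against the default.  */
bool
model_handler_valid (const struct model_op *op)
{
  return op != &model_default;
}
```

Callers can invoke fold unconditionally and only consult the validity query
when they need to know whether a real operator exists.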
* gimple-range-op.cc (gimple_range_op_handler): Set m_operator
manually as there is no access to the default operator.
(cfn_copysign::fold_range): Don't check for validity.
(cfn_ubsan::fold_range): Ditto.
(gimple_range_op_handler::maybe_builtin_call): Don't set to NULL.
* range-op.cc (default_operator): New.
(range_op_handler::range_op_handler): Use default_operator
instead of NULL.
(range_op_handler::operator bool): Move from header, compare
against default operator.
(range_op_handler::range_op): New.
* range-op.h (range_op_handler::operator bool): Move.
Andrew MacLeod [Sat, 10 Jun 2023 20:35:18 +0000 (16:35 -0400)]
Add a hybrid MAX_EXPR operator for integer and pointer.
This adds an operator to the unified table for MAX_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
This also removes the pointer table which is no longer needed.
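A minimal sketch of such a hybrid operator, assuming a toy range type. The
integer version computes bounds of max(x, y); the pointer version is an assumed
placeholder that only tracks whether the result is known to be non-null. All
names here are hypothetical and do not come from range-op.cc.

```c
#include <limits.h>

enum model_type { MODEL_INTEGER, MODEL_POINTER };

struct model_range { enum model_type type; long lo, hi; };

static long
max_long (long a, long b)
{
  return a > b ? a : b;
}

/* Integer version: bounds of max (x, y) for x in A and y in B.  */
static struct model_range
integer_max (struct model_range a, struct model_range b)
{
  struct model_range r
    = { MODEL_INTEGER, max_long (a.lo, b.lo), max_long (a.hi, b.hi) };
  return r;
}

/* Pointer version (assumed placeholder): only track "known non-null".  */
static struct model_range
pointer_max (struct model_range a, struct model_range b)
{
  long lo = (a.lo > 0 && b.lo > 0) ? 1 : 0;
  struct model_range r = { MODEL_POINTER, lo, LONG_MAX };
  return r;
}

/* Hybrid operator: one table entry, dispatching on the operand type.  */
struct model_range
hybrid_max (struct model_range a, struct model_range b)
{
  return a.type == MODEL_POINTER ? pointer_max (a, b) : integer_max (a, b);
}
```

With a single entry dispatching on the type, a separate per-type table is not
needed for this opcode.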
Andrew MacLeod [Sat, 10 Jun 2023 20:34:26 +0000 (16:34 -0400)]
Add a hybrid MIN_EXPR operator for integer and pointer.
This adds an operator to the unified table for MIN_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
Andrew MacLeod [Sat, 10 Jun 2023 20:33:17 +0000 (16:33 -0400)]
Add a hybrid BIT_IOR_EXPR operator for integer and pointer.
This adds an operator to the unified table for BIT_IOR_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
Andrew MacLeod [Sat, 10 Jun 2023 20:28:40 +0000 (16:28 -0400)]
Add a hybrid BIT_AND_EXPR operator for integer and pointer.
This adds an operator to the unified table for BIT_AND_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
Andrew MacLeod [Sat, 10 Jun 2023 20:02:09 +0000 (16:02 -0400)]
Move operator_bitwise_and to the unified range-op table.
At this point, the remaining 4 integral operations have different
implementations than pointers, so we now check for a pointer table
entry first, then if there is nothing, look at the unified table.
* range-op-mixed.h (class operator_bitwise_and): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_AND_EXPR.
(get_op_handler): Check for a pointer table entry first.
(class operator_bitwise_and): Move from here.
(integral_table::integral_table): Remove BIT_AND_EXPR.
Pan Li [Mon, 12 Jun 2023 12:07:24 +0000 (20:07 +0800)]
RISC-V: Fix one potential test failure for RVV vsetvl
The test fails when the testsuite is run in parallel with the command below.
The failure comes from a missing "Oz" option in the vsetvl check.
make -j $(nproc) report RUNTESTFLAGS="rvv.exp riscv.exp"
For some reason, this failure cannot be reproduced with RUNTESTFLAGS="rvv.exp"
or with make without the -j option. We would like to fix it now and root-cause
the reason later.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust test checking.
Pan Li [Mon, 12 Jun 2023 07:16:21 +0000 (15:16 +0800)]
RISC-V: Support RVV FP16 MISC vget/vset intrinsic API
This patch supports the intrinsic API of FP16 ZVFHMIN vget/vset. From
the user's perspective, it is reasonable to do some get/set operations
for the vfloat16*_t types when only ZVFHMIN is enabled.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-types.def
(vfloat16m1_t): Add type to lmul1 ops.
(vfloat16m2_t): Likewise.
(vfloat16m4_t): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add new test cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Likewise.
Richard Biener [Mon, 12 Jun 2023 12:09:45 +0000 (14:09 +0200)]
Fix disambiguation against .MASK_STORE
Alias analysis was treating .MASK_STORE as storing a full vector
which means we disambiguate against decls of smaller than vector size.
That's of course wrong and a similar issue was fixed for DSE already.
The following makes sure we set the size of the access to unknown
and only constrain max_size.
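A rough model of why the access size matters, assuming a simplified access
descriptor with only a size and a max_size in bytes. The names model_ref,
model_may_alias_decl, model_full_store and model_mask_store are hypothetical;
the real logic in tree-ssa-alias.cc is considerably more involved.

```c
#include <stdbool.h>

#define MODEL_SIZE_UNKNOWN (-1L)

/* Simplified access descriptor, sizes in bytes.  */
struct model_ref { long size; long max_size; };

/* Can the access REF touch a declaration of DECL_SIZE bytes?  An access
   known to write exactly SIZE bytes cannot target a smaller object.  */
bool
model_may_alias_decl (struct model_ref ref, long decl_size)
{
  if (ref.size != MODEL_SIZE_UNKNOWN && ref.size > decl_size)
    return false;
  return true;
}

/* Unmasked vector store: exactly VEC_BYTES are written.  */
struct model_ref
model_full_store (long vec_bytes)
{
  return (struct model_ref) { vec_bytes, vec_bytes };
}

/* Masked store: at most VEC_BYTES are written, the exact amount is not
   known, so only max_size is constrained.  */
struct model_ref
model_mask_store (long vec_bytes)
{
  return (struct model_ref) { MODEL_SIZE_UNKNOWN, vec_bytes };
}
```

In this model a 64-byte full store is disambiguated against a 16-byte
declaration, while a 64-byte masked store is conservatively assumed to alias
it.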
This fixes runtime execution FAILs of gfortran.dg/matmul_2.f90,
gfortran.dg/matmul_6.f90 and gfortran.dg/pr91577.f90 when using
AVX512 with full masked loop vectorization on Zen4.
* tree-ssa-alias.cc (call_may_clobber_ref_p_1): For
.MASK_STORE and friends set the size of the access to
unknown.
Juzhe-Zhong [Mon, 12 Jun 2023 02:41:02 +0000 (10:41 +0800)]
RISC-V: Add RVV narrow shift right lowering auto-vectorization
Optimize the following auto-vectorization codes:
void foo (int16_t * __restrict a, int32_t * __restrict b, int32_t c, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i] >> c;
}
Before this patch:
foo:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,ta,ma
vle32.v v1,0(a1)
vsetvli a4,zero,e32,m1,ta,ma
vsra.vx v1,v1,a2
vsetvli zero,zero,e16,mf2,ta,ma
slli a7,a5,2
vncvt.x.x.w v1,v1
slli a6,a5,1
vsetvli zero,a5,e16,mf2,ta,ma
sub a3,a3,a5
vse16.v v1,0(a0)
add a1,a1,a7
add a0,a0,a6
bne a3,zero,.L3
.L5:
ret
After this patch:
foo:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,ta,ma
vle32.v v1,0(a1)
vsetvli a7,zero,e16,mf2,ta,ma
slli a6,a5,2
vnsra.wx v1,v1,a2
slli a4,a5,1
vsetvli zero,a5,e16,mf2,ta,ma
sub a3,a3,a5
vse16.v v1,0(a0)
add a1,a1,a6
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
gcc/ChangeLog:
* config/riscv/autovec-opt.md
(*v<any_shiftrt:optab><any_extend:optab>trunc<mode>): New pattern.
(*<any_shiftrt:optab>trunc<mode>): Ditto.
* config/riscv/autovec.md (<optab><mode>3): Change to
define_insn_and_split.
(v<optab><mode>3): Ditto.
(trunc<mode><v_double_trunc>2): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-3.c: New test.
Kyrylo Tkachov [Mon, 12 Jun 2023 10:42:29 +0000 (11:42 +0100)]
simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE
This patch implements RTL constant-folding for the SS_TRUNCATE and US_TRUNCATE codes.
The semantics are a clamping operation on the argument with the min and max of the narrow mode,
followed by a truncation. The signedness of the clamp and the min/max extrema is derived from
the signedness of the saturating operation.
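The clamp-then-truncate semantics can be sketched in scalar C, assuming a
32-bit to 16-bit narrowing. The function names model_ss_truncate and
model_us_truncate are hypothetical; this only illustrates the arithmetic, not
the simplify-rtx implementation.

```c
#include <stdint.h>

/* SS_TRUNCATE-like: clamp to the signed range of the narrow mode, then
   truncate.  */
int16_t
model_ss_truncate (int32_t x)
{
  if (x > INT16_MAX)
    x = INT16_MAX;
  if (x < INT16_MIN)
    x = INT16_MIN;
  return (int16_t) x;
}

/* US_TRUNCATE-like: clamp to the unsigned range of the narrow mode, then
   truncate.  */
uint16_t
model_us_truncate (uint32_t x)
{
  if (x > UINT16_MAX)
    x = UINT16_MAX;
  return (uint16_t) x;
}
```

Values already representable in the narrow mode pass through unchanged;
everything else saturates at the nearest extremum.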
We have a number of instructions in aarch64 that use SS_TRUNCATE and US_TRUNCATE to represent
their operations and we have pretty thorough runtime tests in gcc.target/aarch64/advsimd-intrinsics/vqmovn*.c.
With this patch the instructions are folded away at optimisation levels and the correctness checks still
pass.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Andre Vieira [Mon, 12 Jun 2023 09:30:39 +0000 (10:30 +0100)]
vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]
This patch fixes an issue introduced by
g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was being passed
to vect_widened_op_tree, when no subtype was to be used. This led to an
erroneous use of IFN_VEC_WIDEN_MINUS.
gcc/ChangeLog:
PR middle-end/110142
* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Don't pass
subtype to vect_widened_op_tree and remove subtype parameter, also
remove superfluous overloaded function definition.
(vect_recog_widen_plus_pattern): Remove subtype parameter and don't pass
to call to vect_recog_widen_op_pattern.
(vect_recog_widen_minus_pattern): Likewise.