git.ipfire.org Git - thirdparty/gcc.git/log

driver: Forward '-lgfortran', '-lm' to offloading compilation

..., so that users don't manually need to specify
'-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to
'-lgfortran', '-lm' (specified manually, or implicitly by the driver).

gcc/
* gcc.cc (driver_handle_option): Forward host '-lgfortran', '-lm'
to offloading compilation.
* config/gcn/mkoffload.cc (main): Adjust.
* config/nvptx/mkoffload.cc (main): Likewise.
* doc/invoke.texi (foffload-options): Update example.
libgomp/
* testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Don't
set.
* testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags):
Likewise.
* testsuite/libgomp.c/simd-math-1.c: Remove
'-foffload-options=-lm'.
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.

Add 'libgomp.{,oacc-}fortran/fortran-torture_execute_math.f90'

..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90',
which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target'
usage.

gcc/testsuite/
* gfortran.fortran-torture/execute/math.f90: Enhance for optional
OpenACC 'serial', OpenMP 'target' usage.
libgomp/
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90: New.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.

Tighten 'dg-warning' alternatives in 'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'

..., added in commit fe7f75cf16783589eedbab597e6d0b8d35d7e470
"Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)".

These use alternatives like, for example, "AB|CDE|FG", but what really must've
been meant is "A(B|C)D(E|F)G". The former variant also does "work": it matches
any of "AB", or "CDE", or "FG", which are components of the latter variant.
(That means, the former variant matches too loosely.)
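
A minimal sketch of the difference between the two pattern shapes follows. It uses POSIX extended regular expressions from <regex.h> as a stand-in (DejaGnu itself uses Tcl regular expressions, but alternation binds the same way), and the helper name 'matches' is hypothetical:

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX extended regular expression PATTERN matches
   anywhere in TEXT.  Hypothetical helper, for illustration only.  */
static bool
matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* The patterns used here are known to be valid.  */
  (void) rc;         /* Avoid an unused-variable warning under NDEBUG.  */
  bool found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

With this helper, matches ("AB|CDE|FG", "AB") is true, because the fragment "AB" alone satisfies the first alternative, whereas matches ("A(B|C)D(E|F)G", "AB") is false: the grouped form only accepts complete strings such as "ABDEG" or "ACDFG".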

gcc/testsuite/
* c-c++-common/Wfree-nonheap-object-2.c: Tighten 'dg-warning'
alternatives.
* c-c++-common/Wfree-nonheap-object-3.c: Likewise.
* c-c++-common/Wfree-nonheap-object.c: Likewise.

Remove 'gcc/testsuite/g++.dg/warn/Wfree-nonheap-object.s'

..., which, presumably, was added by mistake in
commit dce6c58db87ebf7f4477bd3126228e73e4eeee97
"Add support for detecting mismatched allocation/deallocation calls".

gcc/testsuite/
* g++.dg/warn/Wfree-nonheap-object.s: Remove.

Fix typo in 'libgomp.c/target-51.c'

..., and therefore, given 'target offload_device':

    PASS: libgomp.c/target-51.c (test for excess errors)
    PASS: libgomp.c/target-51.c execution test
    [-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test

Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".

libgomp/
* testsuite/libgomp.c/target-51.c: Fix typo.

Use x instead of v for alternative 2 (v, BH) in mov<mode>_internal.

Since there's no evex version for vpcmpeqd ymm, ymm, ymm.

gcc/ChangeLog:

PR target/110227
* config/i386/sse.md (mov<mode>_internal): Use x instead of v
for alternative 2 since there's no evex version for vpcmpeqd
ymm, ymm, ymm.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110227.c: New test.

OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory

OMP_TARGET_OFFLOAD=mandatory handling was previously inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for them was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.

Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)

To achieve this, default-device-var is now initialized to INT_MIN. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.
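
The initialization order described above can be sketched as follows. This is not libgomp's code: all 'sketch_' names are hypothetical, the number of non-host devices is passed in as a parameter instead of being discovered, and the value used for omp_invalid_device is only a placeholder assumption.

```c
#include <limits.h>
#include <stdbool.h>

/* Placeholder standing in for omp_invalid_device; the concrete value is an
   assumption made for this sketch.  */
enum { SKETCH_INVALID_DEVICE = -4 };

/* Hypothetical stand-in for the ICV state.  INT_MIN in default_device means
   "not decided yet" and must never become visible to the user.  */
struct sketch_icv
{
  int default_device;
};

/* Env-var parsing: OMP_DEFAULT_DEVICE (if set) wins; otherwise the value is
   fixed to 0 right away unless OMP_TARGET_OFFLOAD=mandatory was given.  */
static void
sketch_init_env (struct sketch_icv *icv, bool mandatory,
		 bool have_env_default, int env_default)
{
  icv->default_device = INT_MIN;
  if (have_env_default)
    icv->default_device = env_default;
  else if (!mandatory)
    icv->default_device = 0;
}

/* Device initialization: resolve a still-undecided value to device 0, or to
   the invalid device if no non-host devices are available.  */
static void
sketch_target_init (struct sketch_icv *icv, int num_devices)
{
  if (icv->default_device == INT_MIN)
    icv->default_device = num_devices > 0 ? 0 : SKETCH_INVALID_DEVICE;
}

/* Query: force device initialization first if the value is still undecided,
   so that INT_MIN is never returned.  */
static int
sketch_get_default_device (struct sketch_icv *icv, int num_devices)
{
  if (icv->default_device == INT_MIN)
    sketch_target_init (icv, num_devices);
  return icv->default_device;
}
```

In this model, 'mandatory' with zero non-host devices yields the invalid-device placeholder on the first query, 'mandatory' with at least one device yields 0, and an explicit OMP_DEFAULT_DEVICE value is returned unchanged.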

libgomp/ChangeLog:

* env.c (gomp_default_icv_values): Init default_device_var to
a nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message for device-num < 0
with 'mandatory' and no non-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.

Daily bump.

modula2 Fixes to the error format specifications

This patch contains a python3 script to check the meta format error
specifications. It also includes about 20 fixes to M2Quads.mod format
specifications.

gcc/m2/ChangeLog:

* Make-lang.in (check-format-error): New rule.
* gm2-compiler/M2MetaError.mod (op): Add calls to InternalError if
digits are detected.
* gm2-compiler/M2Quads.mod (BuildForToByDo): Bugfix to format
specifier.
(BuildLengthFunction): Bugfix to format specifiers.
(BuildOddFunction): Bugfix to format specifiers.
(BuildAbsFunction): Bugfix to format specifiers.
(BuildCapFunction): Bugfix to format specifiers.
(BuildChrFunction): Bugfix to format specifiers.
(BuildOrdFunction): Bugfix to format specifiers.
(BuildMakeAdrFunction): Bugfix to format specifiers.
(BuildSizeFunction): Bugfix to format specifiers.
(BuildBitSizeFunction): Bugfix to format specifiers.
* tools-src/checkmeta.py: New file.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c/c++: use positive tone in missing header notes [PR84890]

Quoting "How a computer should talk to people" (as quoted
in "Concepts Error Messages for Humans"):

"Various negative tones or actions are unfriendly: being manipulative,
not giving a second chance, talking down, using fashionable slang,
blaming. We must not seem to blame the person. We should avoid suggesting
that the person is inadequate. Phrases like "you forgot" may seem
harmless, but what if a computer said this to you four or five times in
two minutes? Anyway, the person may disagree, so why risk offense?"

gcc/c-family/ChangeLog:
PR c/84890
* known-headers.cc
(suggest_missing_header::~suggest_missing_header): Reword note to
avoid negative tone of "forgetting".

gcc/cp/ChangeLog:
PR c/84890
* name-lookup.cc (missing_std_header::~missing_std_header): Reword
note to avoid negative tone of "forgetting".

gcc/testsuite/ChangeLog:
PR c/84890
* g++.dg/cpp2a/srcloc3.C: Update expected message.
* g++.dg/lookup/missing-std-include-2.C: Likewise.
* g++.dg/lookup/missing-std-include-3.C: Likewise.
* g++.dg/lookup/missing-std-include-6.C: Likewise.
* g++.dg/lookup/missing-std-include.C: Likewise.
* g++.dg/spellcheck-inttypes.C: Likewise.
* g++.dg/spellcheck-stdint.C: Likewise.
* g++.dg/spellcheck-stdlib.C: Likewise.
* gcc.dg/spellcheck-inttypes.c: Likewise.
* gcc.dg/spellcheck-stdbool.c: Likewise.
* gcc.dg/spellcheck-stdint.c: Likewise.
* gcc.dg/spellcheck-stdlib.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: Fix templated conversion operator demangling

Instantiations of templated conversion operators failed to demangle
for cases such as 'operator X<int>', but worked for 'operator X<int>
&', due to thinking the template instantiation of X was the
instantiation of the conversion operator itself.

libiberty/
* cp-demangle.c (d_print_conversion): Remove incorrect
template instantiation handling.
* testsuite/demangle-expected: Add testcases.

Fortran: add DATA statement testcase

gcc/testsuite/
* gfortran.dg/data_array_7.f90: New test.

Fortran: fix passing of zero-sized array arguments to procedures [PR86277]

gcc/fortran/ChangeLog:

PR fortran/86277
* trans-array.cc (gfc_trans_allocate_array_storage): When passing a
zero-sized array with fixed (= non-dynamic) size, allocate temporary
by the caller, not by the callee.

gcc/testsuite/ChangeLog:

PR fortran/86277
* gfortran.dg/zero_sized_14.f90: New test.
* gfortran.dg/zero_sized_15.f90: New test.

Co-authored-by: Mikael Morin <mikael@gcc.gnu.org>

Remove a couple mudflap remnants

I happened to be digging into the specs to understand a build
failure and spotted mflib and mfwrap. Those were used by the
mudflap system which we ripped out years ago and we just missed
these.

I verified x86 still bootstraps after removing these bits.

Pushed to the trunk as obvious.
gcc/
* gcc.cc (LINK_COMMAND_SPEC): Remove mudflap spec handling.

Remove sh5media divtab code

Spurred by Akari Takahashi's patch to config/sh/divtab.cc, this removes
divtab.cc completely.

divtab.cc was used to calculate a division table for the sh5 media
processor. GCC dropped support for that (unmanufactured) chip back
in 2016 and this file simply got missed AFAICT.

gcc/
* config/sh/divtab.cc: Remove.

i386: Fix up whitespace in assembly

I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around the tab; that isn't done for any other instruction
and looks wrong.

2023-06-13 Jakub Jelinek <jakub@redhat.com>

* config/i386/i386.cc (standard_sse_constant_opcode): Remove
superfluous spaces around \t for vpcmpeqd.

Avoid duplicate vector initializations during RTL expansion.

This middle-end patch avoids some redundant RTL for vector initialization
during RTL expansion.  For the simple test case:

typedef __int128 v1ti __attribute__ ((__vector_size__ (16)));
__int128 key;

v1ti foo() {
    return (v1ti){key};
}

the middle-end currently expands:

(set (reg:V1TI 85) (const_vector:V1TI [ (const_int 0) ]))

(set (reg:V1TI 85) (mem/c:V1TI (symbol_ref:DI ("key"))))

where we create a dead instruction that initializes the vector to zero,
immediately followed by a set of the entire vector.  This patch skips
this zeroing instruction when the vector has only a single element.
It also updates the code to indicate when we've cleared the vector,
so that we don't need to initialize zero elements.

2023-06-13  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* expr.cc (store_constructor) <case VECTOR_TYPE>: Don't bother
clearing vectors with only a single element.  Set CLEARED if the
vector was initialized to zero.

RISC-V: Remove duplicate `#include "riscv-vector-switch.def"`

Hi,

This patch removes the duplicate `#include "riscv-vector-switch.def"` statement
and adds #undef for the ENTRY and TUPLE_ENTRY macros later.

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv-v.cc (struct mode_vtype_group): Remove duplicate
#include.
(ENTRY): Undef.
(TUPLE_ENTRY): Undef.

RISC-V: Add comments of some functions

gcc/ChangeLog:

* config/riscv/riscv-v.cc (rvv_builder::single_step_npatterns_p): Add comment.
(shuffle_generic_patterns): Ditto.
(expand_vec_perm_const_1): Ditto.

RISC-V: Add more SLP tests

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-15.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-15.c: New test.

RISC-V: Fix bug of VLA SLP auto-vectorization

Sorry for producing bugs in the previous VLA SLP patch.

Consider the following permutation:
_85 = VEC_PERM_EXPR <{ 99, 17, ... }, { 11, 80, ... }, { 0, POLY_INT_CST [4, 4], 1, POLY_INT_CST [5, 4], 2, POLY_INT_CST [6, 4], ... }>;

The correct result should be:
_85 = { 99, 11, 17, 80, ... }

However, the previous patch got this wrong.

Code sequence before this patch:

set mask = { 0, 1, 0, 1, ... }
set v0 = { 99, 17, 99, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 99, 80 }
The result is incorrect.

After this patch:

set mask = { 0, 1, 0, 1, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set v0 = vrgather ({ 99, 17, 99, 17, ... }, index) = { 99, 99, 17, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 17, 80 }
The result is what we expected.
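
The two sequences can be checked with a small scalar model. The sketch below simulates the first four elements only; the 'sim_' helpers are hypothetical stand-ins that model the viota, vrgather and masked (mask-undisturbed) vrgather operations as used in the sequences above, not the real instructions or GCC's expanders.

```c
#include <stddef.h>

enum { SIM_N = 4 };  /* Number of leading elements simulated.  */

/* viota model: element i receives the count of set mask bits before i.  */
static void
sim_viota (const int mask[SIM_N], int index[SIM_N])
{
  int count = 0;
  for (size_t i = 0; i < SIM_N; i++)
    {
      index[i] = count;
      count += mask[i] != 0;
    }
}

/* Unmasked gather model: dest[i] = src[index[i]].  */
static void
sim_vrgather (const int src[SIM_N], const int index[SIM_N], int dest[SIM_N])
{
  for (size_t i = 0; i < SIM_N; i++)
    dest[i] = src[index[i]];
}

/* Masked gather model: active elements take src[index[i]], inactive
   elements keep merge[i].  */
static void
sim_vrgather_mu (const int merge[SIM_N], const int src[SIM_N],
		 const int index[SIM_N], const int mask[SIM_N],
		 int dest[SIM_N])
{
  for (size_t i = 0; i < SIM_N; i++)
    dest[i] = mask[i] ? src[index[i]] : merge[i];
}

/* Sequence before the patch: v0 is used as the merge operand unpermuted.  */
static void
sim_decompress_before (int result[SIM_N])
{
  const int mask[SIM_N] = { 0, 1, 0, 1 };
  const int v0[SIM_N] = { 99, 17, 99, 17 };
  const int v1[SIM_N] = { 11, 80, 11, 80 };
  int index[SIM_N];
  sim_viota (mask, index);
  sim_vrgather_mu (v0, v1, index, mask, result);
}

/* Sequence after the patch: v0 is first gathered with the same index.  */
static void
sim_decompress_after (int result[SIM_N])
{
  const int mask[SIM_N] = { 0, 1, 0, 1 };
  const int op0[SIM_N] = { 99, 17, 99, 17 };
  const int v1[SIM_N] = { 11, 80, 11, 80 };
  int index[SIM_N], v0[SIM_N];
  sim_viota (mask, index);
  sim_vrgather (op0, index, v0);
  sim_vrgather_mu (v0, v1, index, mask, result);
}
```

In this model, sim_decompress_before produces { 99, 11, 99, 80 } and sim_decompress_after produces { 99, 11, 17, 80 }, matching the two listings.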

This issue was discovered by the test added in this patch when run with --param=riscv-autovec-lmul=2.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): Fix bug.
(shuffle_decompress_patterns): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-12.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-12.c: New test.

Fix memory leak in loop header copying

* tree-ssa-loop-ch.cc (ch_base::copy_headers): Free loop BBs.

c++: mutable temps in rodata

If the type of a temporary has mutable members, we can't set TREE_READONLY
on the VAR_DECL; this is parallel to the check in
cp_apply_type_quals_to_decl.

gcc/cp/ChangeLog:

* tree.cc (build_target_expr): Check TYPE_HAS_MUTABLE_P.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/initlist-opt6.C: New test.

RISC-V: Add vector psabi checking.

This patch adds a check for whether a function's arguments or return value
have a vector type, and emits a warning if so.

There are two exceptions:
- The vector_size attribute.
- The intrinsic functions.

Some test cases need -Wno-psabi added to ignore the warning.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
warning flag if func is not builtin
* config/riscv/riscv.cc
(riscv_scalable_vector_type_p): Determine whether the type is scalable vector.
(riscv_arg_has_vector): Determine whether the arg is vector type.
(riscv_pass_in_vector_p): Check the vector type param is passed by value.
(riscv_init_cumulative_args): The same as header.
(riscv_get_arg_info): Add the checking.
(riscv_function_value): Check the func return and set warning flag
* config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
determine whether warning psabi or not.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr109244.C: Add the -Wno-psabi.
* g++.target/riscv/rvv/base/pr109535.C: Same
* gcc.target/riscv/rvv/base/binop_vx_constraint-120.c: Same
* gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Same
* gcc.target/riscv/rvv/base/pr110109-2.c: Same
* gcc.target/riscv/rvv/base/scalar_move-9.c: Same
* gcc.target/riscv/rvv/base/spill-10.c: Same
* gcc.target/riscv/rvv/base/spill-11.c: Same
* gcc.target/riscv/rvv/base/spill-9.c: Same
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Same
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Same
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Same
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Same
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Same
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Same
* gcc.target/riscv/vector-abi-1.c: New test.
* gcc.target/riscv/vector-abi-2.c: New test.
* gcc.target/riscv/vector-abi-3.c: New test.
* gcc.target/riscv/vector-abi-4.c: New test.
* gcc.target/riscv/vector-abi-5.c: New test.
* gcc.target/riscv/vector-abi-6.c: New test.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>

libgomp/testsuite: Add requires-unified-addr-1.{c,f90} [PR109837]

Add a testcase for 'omp requires unified_address' that is currently supported
by all devices but was not tested for.

libgomp/

PR libgomp/109837
* testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test.
* testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.

arm: Extend -mtp= arguments

After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
There are actually 3 system registers that can be accessed for the thread pointer
in aarch32: tpidrurw, tpidruro, tpidrprw. They are all read through the CP15 co-processor
mechanism. The current -mtp=cp15 option reads the tpidruro register.
This patch extends -mtp to allow for the above three explicit tpidr names and
keeps -mtp=cp15 as an alias of -mtp=tpidruro for backwards compatibility.

Bootstrapped and tested on arm-none-linux-gnueabihf.

gcc/ChangeLog:

* config/arm/arm-opts.h (enum arm_tp_type): Remove TP_CP15.
Add TP_TPIDRURW, TP_TPIDRURO, TP_TPIDRPRW values.
* config/arm/arm-protos.h (arm_output_load_tpidr): Declare prototype.
* config/arm/arm.cc (arm_option_reconfigure_globals): Replace TP_CP15
with TP_TPIDRURO.
(arm_output_load_tpidr): Define.
* config/arm/arm.h (TARGET_HARD_TP): Define in terms of TARGET_SOFT_TP.
* config/arm/arm.md (load_tp_hard): Call arm_output_load_tpidr to output
assembly.
(reload_tp_hard): Likewise.
* config/arm/arm.opt (tpidrurw, tpidruro, tpidrprw): New values for
arm_tp_type.
* doc/invoke.texi (Arm Options, mtp): Document new values.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mtp.c: New test.
* gcc.target/arm/mtp_1.c: New test.
* gcc.target/arm/mtp_2.c: New test.
* gcc.target/arm/mtp_3.c: New test.
* gcc.target/arm/mtp_4.c: New test.

aarch64: Extend -mtp= arguments

After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
First of all, there is another TPIDR register that can be used to read the thread pointer:
TPIDRRO_EL0 (which can also be accessed by AArch32 under another name) so it makes sense
to add -mtp=tpidrro_el0. This makes the existing arguments el0, el1, el2, el3 somewhat
inconsistent in their naming so this patch introduces the more "full" names
tpidr_el0, tpidr_el1, tpidr_el2, tpidr_el3 and makes the above short names aliases of these new ones.
Long story short, we preserve backwards compatibility and add a new TPIDR register to access through
-mtp that wasn't available previously.
There is more relevant discussion of the options at https://reviews.llvm.org/D152433 if you're interested.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR target/108779
* config/aarch64/aarch64-opts.h (enum aarch64_tp_reg): Add
AARCH64_TPIDRRO_EL0 value.
* config/aarch64/aarch64.cc (aarch64_output_load_tp): Define.
* config/aarch64/aarch64.opt (tpidr_el0, tpidr_el1, tpidr_el2,
tpidr_el3, tpidrro_el0): New accepted values to -mtp=.
* doc/invoke.texi (AArch64 Options): Document new -mtp= options.

gcc/testsuite/ChangeLog:

PR target/108779
* gcc.target/aarch64/mtp_5.c: New test.
* gcc.target/aarch64/mtp_6.c: New test.
* gcc.target/aarch64/mtp_7.c: New test.
* gcc.target/aarch64/mtp_8.c: New test.
* gcc.target/aarch64/mtp_9.c: New test.

fix frange_nextafter odr violation

C++ requires inline functions to be declared inline and defined in
every translation unit that uses them.  frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc.  Drop the extraneous inline specifier.

Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.

for  gcc/ChangeLog

* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.

middle-end/110232 - fix native interpret of vector <signed-boolean:1>

The following fixes native interpretation of a buffer as boolean
vector with bit-precision elements such as AVX512 vectors. The
check whether the buffer covers the whole vector was broken for
bit-precision elements and the following instead implements it
based on the vector type size.
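
The arithmetic behind this can be sketched as follows. This models only the size computation under the assumption that the broken check multiplied a per-element byte size by the element count; the function names are hypothetical and nothing here reproduces native_interpret_vector itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-element check: every element is assumed to occupy at least one whole
   byte, which overestimates vectors of bit-precision elements.  */
static bool
covers_by_element_size (size_t elem_bits, size_t count, size_t buf_len)
{
  size_t elem_bytes = (elem_bits + 7) / 8;
  return elem_bytes * count <= buf_len;
}

/* Whole-vector check: compare the buffer length against the size of the
   vector as a whole, rounded up to full bytes.  */
static bool
covers_by_vector_size (size_t elem_bits, size_t count, size_t buf_len)
{
  size_t vector_bytes = (elem_bits * count + 7) / 8;
  return vector_bytes <= buf_len;
}
```

For a vector of eight 1-bit booleans and a 1-byte buffer, the per-element check wrongly reports that the buffer is too small (it expects 8 bytes), while the whole-vector check accepts it. For byte-sized or larger elements both checks agree.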

PR middle-end/110232
* fold-const.cc (native_interpret_vector): Use TYPE_SIZE_UNIT
to check whether the buffer covers the whole vector.

* gcc.target/i386/pr110232.c: New testcase.

Fix disambiguation against .MASK_LOAD

Alias analysis was treating .MASK_LOAD as storing a full vector
which means we disambiguate against decls smaller than the vector size.
This complements the previous patch handling .MASK_STORE and fixes
runtime execution FAILs of gfortran.dg/matmul_3.f90 and
gfortran.dg/inline_sum_2.f90 when using AVX512 with full masked loop
vectorization on Zen4.

* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): For
.MASK_LOAD and friends set the size of the access to unknown.

testsuite: Update powerpc test fold-vec-extract-int.p8.c

Update the powerpc test for the extra zero_extend removal done by the default ree pass.

2023-06-13 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>

gcc/testsuite/ChangeLog:

PR testsuite/109880
* gcc.target/powerpc/fold-vec-extract-int.p8.c: Update test.

testsuite: Check int128 effective target for pr109932-{1,2}.c [PR110230]

This patch is to make newly added test cases pr109932-{1,2}.c
check int128 effective target to avoid unsupported type error
on 32-bit. I did hit this failure during testing and fixed
it, but made a stupid mistake not updating the local formatted
patch which was actually out of date.

PR testsuite/110230
PR target/109932

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr109932-1.c: Adjust with int128 effective target.
* gcc.target/powerpc/pr109932-2.c: Ditto.

ada: Fix decoration of iterated component association for GNATprove

This patch is an alternative solution for a recent fix in analysis of
iterated component association.

To recap, if the iterated expression is an aggregate, we want to
propagate the component type downward with a call to Resolve_Aggr_Expr;
otherwise we want this expression to be only preanalysed (since the
association might need to be repeatedly evaluated), but also we need to
apply predicate and range checks to the expression itself (these are
required for GNATprove).

It turns out that Resolve_Aggr_Expr already knows how to deal with a
nested aggregate and also works for GNATprove, where it both preanalyzes
the expression and applies necessary checks.

In other words, expression of the iterated component association is now
resolved just like expression of an ordinary array aggregate.

gcc/ada/

* sem_aggr.adb (Resolve_Iterated_Component_Association): Simply resolve
the expression.

ada: Add missing ss_mark/ss_release in quantified expressions

If a quantified expression says "for all ... of F(...)"
where F(...) is a function call that returns on the secondary
stack, we need to clean up the secondary stack. This patch
adds the required ss_mark/ss_release in that case.

gcc/ada/

* exp_ch4.adb
(Expand_N_Quantified_Expression): Detect the secondary-stack
case, and find the innermost scope where we should mark/release,
and Set_Uses_Sec_Stack on that. Skip intermediate blocks and loops
that are part of expansion.

ada: Recognize iterated_component_association as repeatedly evaluated

As iterated_component_association is an array_component_association
(because of a grammar rule Ada 2022 RM 4.3.3(5/5)), its expression is
repeatedly evaluated (because of Ada 2022 RM 6.1.1(22.14/5)).

With this patch we will now get errors for both conjuncts in this code,
which have semantically equivalent array aggregates that use an ordinary
component association and iterated component association.

  procedure Iter (S : String)
    with Post => String'(for J in 1 .. 3 => S (S'First)'Old) =
                 String'(         1 .. 3 => S (S'First)'Old);

gcc/ada/

* sem_util.adb (Is_Repeatedly_Evaluated): Recognize iterated component
association as repeatedly evaluated.

ada: Recognize iterated_component_association as potentially unevaluated

Routine Is_Potentially_Unevaluated was written for Ada 2012, but now we
use it for Ada 2022 as well, so it must recognize iterated component
associations (which were added by Ada 2022) as an array component
association.

gcc/ada/

* sem_util.adb (Is_Potentially_Unevaluated): Recognize iterated
component association as potentially unevaluated.

ada: Disable inlining in potentially unevaluated contexts

Instead of explicitly disabling inlining in quantified expressions
(which happen to be only preanalysed) and then disabling inlining in
potentially unevaluated contexts that are fully analysed (which happen
to include quantified expressions), we now simply disable inlining in
all potentially unevaluated contexts, regardless of the full analysis
mode.

This also disables inlining in iterated component associations, which
can be either preanalysed or fully analysed depending on their expression,
but nevertheless are potentially unevaluated.

gcc/ada/

* sem_res.adb (Resolve_Call): Replace early call to
In_Quantified_Expression with a call to Is_Potentially_Unevaluated that
was only done when Full_Analysis is true.

ada: Implement new aspect Always_Terminates for SPARK

This patch allows subprograms to be annotated with aspect
Always_Terminates that requires a boolean expression. When this
expression evaluates to True, the subprogram is required to terminate or
raise an exception, but not loop infinitely.

This aspect is only meant to be used by GNATprove and it has no
meaningful run-time semantics: either the annotated subprogram
terminates and then the aspect expression doesn't matter, or the
subprogram loops infinitely and there is nothing we can do. (We could
also evaluate the aspect expression just to detect run-time errors in
the expression itself, but this can be implemented later, after a
backend support for the aspect is added to GNATprove.)

Implementation of this aspect is heavily based on the implementation of
Subprogram_Variant, which in turn is heavily based on the implementation
of Contract_Cases. Since the new aspect is not yet expanded, there is no
corresponding assertion kind that would control the expansion.

gcc/ada/

* aspects.ads (Aspect_Id): Add new aspect.
(Implementation_Defined_Aspect): New aspect is
implementation-defined.
(Aspect_Argument): New aspect has an expression argument.
(Is_Representation_Aspect): New aspect is not a representation
aspect.
(Aspect_Names): Link new aspect identifier with a name.
(Aspect_Delay): New aspect is never delayed.
* contracts.adb (Expand_Subprogram_Contract): Mention new aspect
in comment.
(Add_Contract_Item): Attach pragma corresponding to the new aspect
to contract items.
(Analyze_Entry_Or_Subprogram_Contract): Analyze pragma
corresponding to the new aspect that appears with subprogram spec.
(Analyze_Subprogram_Body_Stub_Contract): Expand pragma
corresponding to the new aspect.
* contracts.ads
(Add_Contract_Item, Analyze_Entry_Or_Subprogram_Contract)
(Analyze_Entry_Or_Subprogram_Body_Contract)
(Analyze_Subprogram_Body_Stub_Contract): Mention new aspect in
comment.
* einfo-utils.adb (Get_Pragma): Return pragma attached to
contract.
* einfo-utils.ads (Get_Pragma): Mention new contract in comment.
* exp_prag.adb (Expand_Pragma_Always_Terminates): Placeholder for
possibly expanding new aspect.
* exp_prag.ads (Expand_Pragma_Always_Terminates): Dedicated
routine for expansion of the new aspect.
* inline.adb (Remove_Aspects_And_Pragmas): Remove aspect from
inlined bodies.
* par-prag.adb (Prag): Postpone checking of the pragma until
analysis.
* sem_ch12.adb: Mention new aspect in explanation of handling
contracts on generic units.
* sem_ch13.adb (Analyze_Aspect_Specifications): Convert new aspect
into a corresponding pragma.
(Check_Aspect_At_Freeze_Point): Don't expect new aspect.
* sem_prag.adb (Analyze_Always_Terminates_In_Decl_Part): Analyze
pragma corresponding to the new aspect.
(Analyze_Pragma): Handle pragma corresponding to the new aspect.
(Is_Non_Significant_Pragma_Reference): Handle references appearing
within new aspect.
* sem_prag.ads (Aspect_Specifying_Pragma): New aspect can be
emulated with a pragma.
(Assertion_Expression_Pragma): New aspect has an assertion
expression.
(Pragma_Significant_To_Subprograms): New aspect is significant to
subprograms.
(Analyze_Always_Terminates_In_Decl_Part): Add spec for routine
that analyses new aspect.
(Find_Related_Declaration_Or_Body): Mention new aspect in comment.
* sem_util.adb (Is_Subprogram_Contract_Annotation): New aspect is
a subprogram contract annotation.
* sem_util.ads (Is_Subprogram_Contract_Annotation): Mention new
aspect in comment.
* sinfo.ads (Is_Generic_Contract_Pragma): New pragma is a generic
contract.
(Contract): Explain attaching new pragma to subprogram contract.
* snames.ads-tmpl (Name_Always_Terminates): New name for the new
contract.
(Pragma_Always_Terminates): New pragma identifier.

ada: Skip elaboration checks for abstract subprograms on derived types

Elaboration checks skip abstract subprogram declarations, which have no
body that could be examined. Now these checks also skip abstract
subprograms of a derived type, which have no body either.

gcc/ada/

* sem_elab.adb (Check_Overriding_Primitive): Prevent Corresponding_Body
from being called with the entity of an abstract subprogram.

ada: Fix another case of missing Has_Private_View flag

It occurs for the case of a function call first parsed as an identifier.

gcc/ada/

* sem_ch12.adb (Save_References_In_Identifier): In the case where
the identifier has been turned into a function call by analysis,
call Set_Global_Type on the entity if it is global.

ada: Fix iterated component initialization

The call to Resolve_Aggr_Expr may leave references to temporary entities
used to check for the construct legality and meant to be removed.

Using Preanalyze_And_Resolve correctly guarantees that there is no
visible occurrence of such entities.

gcc/ada/

* sem_aggr.adb (Resolve_Iterated_Component_Association): Call
Preanalyze_And_Resolve instead of Resolve_Aggr_Expr except for
aggregate.

Co-authored-by: Ed Schonberg <schonberg@adacore.com>

ada: Fix exception raised on invalid contract in generic package

This lets the compiler give a proper error message instead.

gcc/ada/

* contracts.adb (Contract_Error): New exception.
(Add_Contract_Item): Raise Contract_Error instead of Program_Error.
(Add_Generic_Contract_Pragma): Deal with Contract_Error.

ada: Fix spurious error on call to function returning private in generic

The spurious error is given on a call to a parameterless function returning
a private type, present in the body of a generic construct both declared and
instantiated in the presence of the full view of the type, because this full
view is not properly restored for the instantiation.

This is supposed to be handled by the Has_Private_View mechanism, but it is
bypassed here because the call to the parameterless function is first parsed
as a simple identifier before being later analyzed as a function call.

Fixing this first issue uncovered another one, whereby the Has_Private_View
flag was not properly set on an operator returning a private type that ends
up being later resolved as a function call.

Finally a small loophole in Eval_Attribute exposed by the change also needs
to be plugged.

gcc/ada/

* sem_attr.adb (Eval_Attribute): Add more exceptions to the early
return for a prefix which is a nonfrozen generic actual type.
* sem_ch12.adb (Copy_Generic_Node): Also check private views in the
case of an entity name or operator analyzed as a function call.
(Set_Global_Type): Make it a child of Save_Global_References.
(Save_References_In_Operator): In the case where the operator has
been turned into a function call, call Set_Global_Type on the entity
if it is global.

ada: Fix internal error on imported function with post-condition

The problem, which is also present for an expression function, is that the
function is invoked in the initializing expression of a variable declared
in the same declarative part as the function, which causes the freezing of
its artificial body before the post-condition is analyzed on its spec.

gcc/ada/

* contracts.adb (Analyze_Entry_Or_Subprogram_Body_Contract): For a
subprogram body that has no contracts and does not come from source,
make sure that contracts on its corresponding spec are analyzed, if
any, before expanding them.

ada: Streamline expansion of controlled actions for aggregates

This changes the strategy used to expand controlled actions for array and
record aggregates so as to make it simpler and more robust.

The current strategy is to set the No_Ctrl_Actions flag on the assignments
generated during the expansion of aggregate, as done during the expansion
of initialization procedures, and to generate the adjustments of the LHS
manually in the same list of actions, before sending the entire list for
analysis and expansion.  The problem is that, when the RHS also requires
controlled actions, the No_Ctrl_Actions flag prevents transient scopes
from being created around the assignments, with the end result that the
actions are "naturally" generated between the assignments and adjustments
of the LHS, causing premature finalization of the RHS.  In order to counter
that, the controlled actions of the RHS must also be generated manually
during the expansion of the aggregates, after blocking normal processing
e.g. by means of the No_Side_Effect_Removal flag.  This means that, for
a more complex RHS, this strategy generates a wrong order of controlled
actions by default, until specifically adjusted.

The new strategy is to reuse the standard machinery as much as possible,
disabling only the part that is not needed for the assignments generated
during the expansion of aggregates, namely the finalization of the LHS;
in other words, the adjustment of the LHS is left entirely to the standard
machinery and the creation of transient scopes is no longer blocked, which
gives a correct order of controlled actions by default.  It is implemented
by means of a No_Finalize_Actions flag present on the assignments generated
during the expansion.

It is mostly straightforward, modulo the following hitch: the assignments
are now analyzed and expanded by the common expander, which in the case of
controlled assignments analyzes the final rewriting with all checks off,
which in particular disables elaboration checks for the calls to the Adjust
primitives; now these checks are necessary in the case where an aggregate
is the initialization expression of an object declared before the body of
the Adjust primitive is seen.  Hence the use of an existing trick, namely
Suppress/Unsuppress blocks, around the assignments.

gcc/ada/

* gen_il-fields.ads (Opt_Field_Enum): Add No_Finalize_Actions and
remove No_Side_Effect_Removal.
* gen_il-gen-gen_nodes.adb (N_Function_Call): Remove semantic flag
No_Side_Effect_Removal.
(N_Assignment_Statement): Add semantic flag No_Finalize_Actions.
* sinfo.ads (No_Ctrl_Actions): Adjust comment.
(No_Finalize_Actions): New flag on assignment statements.
(No_Side_Effect_Removal): Delete.
* exp_aggr.adb (Build_Record_Aggr_Code): Remove obsolete comment and
Ancestor_Is_Expression variable.  In the case of an extension, do
not generate a call to Adjust manually, call Set_No_Finalize_Actions
instead.  Do not set the tags, replace call to Make_Unsuppress_Block
by Make_Suppress_Block and remove useless assertions.
In the general case, call Initialize_Component.
(Initialize_Controlled_Component): Delete.
(Initialize_Simple_Component): Delete.
(Initialize_Component): Do the low-level processing, but do not
generate a call to Adjust manually, call Set_No_Finalize_Actions.
(Process_Transient_Component): Delete.
(Process_Transient_Component_Completion): Likewise.
* exp_ch5.adb (Expand_Assign_Array): Deal with No_Finalize_Actions.
(Expand_Assign_Array_Loop): Likewise.
(Expand_N_Assignment_Statement): Likewise.
(Make_Tag_Ctrl_Assignment): Likewise.
* exp_util.adb (Remove_Side_Effects): Do not test the
No_Side_Effect_Removal flag.
* sem_prag.adb (Process_Suppress_Unsuppress): Give the warning in
SPARK mode only for pragma Suppress.
* tbuild.ads (Make_Suppress_Block): New declaration.
(Make_Unsuppress_Block): Adjust comment.
* tbuild.adb (Make_Suppress_Block): New procedure.
(Make_Unsuppress_Block): Unsuppress instead of suppressing.

ada: Remove obsolete code in Analyze_Assignment

This code was dealing with build-in-place calls for nonlimited types, but
they no longer exist since Is_Build_In_Place_Result_Type => Is_Limited_View.

gcc/ada/

* sem_ch5.adb (Analyze_Assignment): Turn Rhs into a constant and
remove calls to the following subprograms.
(Transform_BIP_Assignment): Delete.
(Should_Transform_BIP_Assignment): Likewise.

ada: Remove unreferenced routine Is_Inherited_Operation_For_Type

Remove a routine that is no longer referenced after the deconstruction of
restriction SPARK_05.

gcc/ada/

* sem_util.ads (Is_Inherited_Operation_For_Type): Remove spec.
* sem_util.adb (Is_Inherited_Operation_For_Type): Remove body.

ada: Small housekeeping work in expansion of extension aggregates

This avoids repeatedly calling Unqualify on the same node, removes a dead
call to Generate_Finalization_Actions, a redundant setting of Assignment_OK
and reuses a local variable more consistently.  No functional changes.

gcc/ada/

* exp_aggr.adb (Build_Record_Aggr_Code): Add new variable Ancestor_Q
to store the result of Unqualify on Ancestor.  Remove the dead call
to Generate_Finalization_Actions in the case of another aggregate as
ancestor part.  Remove the redundant setting of Assignment_OK.  Use
Init_Typ in lieu of Etype (Ancestor) more consistently.

ada: Fix wrong expansion of limited extension aggregate

This happens when the ancestor part is itself an aggregate: in this case,
the tag of the extension aggregate is wrongly set to that of the ancestor.

gcc/ada/

* exp_aggr.adb (Build_Record_Aggr_Code): In the case of an extension
aggregate of a limited type whose ancestor part is an aggregate, do
not skip the final code assigning the tag of the extension.

ada: Mark attribute Initialized as ghost code

Implement the SPARK RM change that defines attribute Initialized
as being ghost, i.e. only allowed where a ghost entity would be allowed.

gcc/ada/

* ghost.adb (Check_Ghost_Context): Allow absence of Ghost_Id
for attribute. Update error message to mention Ghost_Predicate.
(Is_Ghost_Attribute_Reference): New query.
* ghost.ads (Is_Ghost_Attribute_Reference): New query.
* sem_attr.adb (Resolve_Attribute): Check ghost context for ghost
attributes.

ada: Add No_Elaboration_Code_All pragma to System.Storage_Elements

Allows System.Storage_Elements to be used in units that
have the No_Elaboration_Code_All restriction.

gcc/ada/

* libgnat/s-stoele.ads: Add No_Elaboration_Code_All pragma.

ada: Factor out tag assignments from type in expander

They are performed in a few different places during expansion.

gcc/ada/

* exp_util.ads (Make_Tag_Assignment_From_Type): Declare.
* exp_util.adb (Make_Tag_Assignment_From_Type): New function.
* exp_aggr.adb (Build_Record_Aggr_Code): Call the above function.
(Initialize_Simple_Component): Likewise.
* exp_ch3.adb (Build_Record_Init_Proc.Build_Assignment): Likewise.
(Build_Record_Init_Proc.Build_Init_Procedure): Likewise.
(Make_Tag_Assignment): Likewise. Rename local variable and call
Unqualify to go through qualified expressions.
* exp_ch4.adb (Expand_Allocator_Expression): Likewise.

ada: Use ghost predicate in standard library

In preparation for attribute Initialized to become ghost, use aspect
Ghost_Predicate instead of Predicate in unit Ada.Strings.Superbounded
of the standard library.

gcc/ada/

* libgnat/a-strsup.ads: Change predicate aspect.
* sem_ch13.adb (Add_Predicate): Fix for first predicate.

ada: Fix expansion of aggregates with controlled components

The expansion is incorrect in the case where the initialization expression
of a component is a conditional expression that has a function call as one
of its dependent expressions, leading to a wrong order of initialization,
adjustment and finalization.

gcc/ada/

* exp_aggr.adb (Initialize_Component): Perform immediate expansion
of the initialization expression if it is a conditional expression
and the component type is controlled.

ada: Factor common processing in expansion of aggregates

The final processing at the component level of array aggregates and record
aggregates is very similar, so this factors out the common processing into
three new library-level subprograms.

There should be no functional changes, but the expanded code may be changed
in the case of controlled components of array aggregates not covered by a
multiple choice: the previous expansion used to place new declarations prior
to the aggregate in this case and that is no longer the case, i.e. they are
always placed right before the initialization of the component (as was done
for all controlled components of record aggregates and controlled components
of array aggregates covered by a multiple choice).

gcc/ada/

* exp_aggr.adb (Initialize_Component): New procedure factored out
from the processing of array and record aggregates.
(Initialize_Controlled_Component): Likewise.
(Initialize_Simple_Component): Likewise.
(Build_Array_Aggr_Code.Gen_Assign): Remove In_Loop parameter.
Call Initialize_Component to initialize the component.
(Initialize_Array_Component): Delete.
(Initialize_Ctrl_Array_Component): Likewise.
(Build_Array_Aggr_Code): Adjust calls to Gen_Assign.
(Build_Record_Aggr_Code): Call Initialize_Simple_Component or
Initialize_Component to initialize the component.
(Initialize_Ctrl_Record_Component): Delete.
(Initialize_Record_Component): Likewise.

ada: Remove wrong comment about expansion of exceptions for GNATprove

Code cleanup related to handling exceptions in GNATprove.

gcc/ada/

* exp_ch11.adb (Expand_N_Raise_Statement): Expansion of raise statements
never happens in GNATprove mode.

ada: Cleanup finding of locally handled exception handlers

Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.

gcc/ada/

* exp_ch11.adb (Find_Local_Handler): Replace guard against other
constructs appearing in the list of exception handlers with iteration
using First_Non_Pragma/Next_Non_Pragma.

ada: Cleanup expansion of locally handled exception handlers

Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.

gcc/ada/

* exp_ch11.ads (Find_Local_Handler): Fix typo in comment.
* exp_ch11.adb (Find_Local_Handler): Remove redundant check for the
Exception_Handler list being present; use membership test to eliminate
local object LCN; fold nested IF statements. Remove useless ELSIF
condition.

ada: Tune style in detection of writable function actuals

Cleanup; semantics is unaffected.

gcc/ada/

* sem_util.adb (Check_Function_Writable_Actuals): Tune style; use
subtype name to detect membership test nodes.

ada: Simplify appending to a newly created list

Code cleanup; semantics is unaffected.

gcc/ada/

* exp_disp.adb (Make_Disp_Asynchronous_Select_Spec): Use a single call
to New_List.

ada: Support new GNAT-specific aspect Ghost_Predicate

New aspect Ghost_Predicate allows the use of ghost entities in the
predicate expression, even if the type is not ghost itself. As a result,
subtypes with a ghost predicate cannot be used in membership tests.

Subtypes with ghost predicates are subject to the same additional
restrictions as subtypes with aspect Dynamic_Predicate.
They are governed for compilation by assertion policy Ghost.
Checking of the predicate itself is governed by the usual assertion
policy (Static_Predicate/Dynamic_Predicate/Predicate) independently
of the ghost predicate.

gcc/ada/

* doc/gnat_rm/implementation_defined_aspects.rst: Document new
aspect.
* doc/gnat_rm/implementation_defined_pragmas.rst: Whitespace.
* aspects.adb (Init_Canonical_Aspect): Set it to Predicate.
* aspects.ads: Set global constants for new aspect.
* einfo.ads: Describe new flag related to new aspect.
* exp_ch6.adb (Can_Fold_Predicate_Call): Do not fold new aspect.
* exp_util.adb (Make_Predicate_Check): Add comment.
* gen_il-fields.ads: Add new flag.
* gen_il-gen-gen_entities.adb: Add new flag.
* ghost.adb (Is_OK_Ghost_Context): Ghost predicate is an OK
ghost context.
(Mark_Ghost_Pragma): Add overloading with ghost mode parameter.
* ghost.ads (Mark_Ghost_Pragma): Add overloading with ghost mode
parameter.
(Name_To_Ghost_Mode): Make function public.
* sem_aggr.adb: Issue error for violation of valid use.
* sem_case.adb: Issue error for violation of valid use.
* sem_ch13.adb: Adapt for new aspect.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Remove dead code
which was trying to propagate Has_Predicates flag in the wrong
direction (from derived to parent type).
(Analyze_Number_Declaration): Issue error for violation of valid
use.
(Build_Derived_Type): Cleanup inheritance of predicate flags from
parent to derived type.
(Build_Predicate_Function): Only add a predicate check when it
is not ignored as Ghost code.
* sem_ch4.adb (Analyze_Membership_Op): Issue an error for use of
a subtype with a ghost predicate as name in a membership test.
* sem_ch5.adb (Check_Predicate_Use): Issue error for violation of
valid use.
* sem_eval.adb: Adapt code for Dynamic_Predicate to account for
Ghost_Predicate too.
* sem_prag.adb (Analyze_Pragma): Make pragma ghost or not.
* sem_util.adb (Bad_Predicated_Subtype_Use): Adapt to new aspect.
(Inherit_Predicate_Flags): Add inheritance of flag. Add parameter
to apply to derived types.
* sem_util.ads (Inherit_Predicate_Flags): Change signature.
* snames.ads-tmpl: Add new aspect name.
* gnat_rm.texi: Regenerate.

ada: Remove explicit decoration of wrapper created in freezing

We create wrapper functions associated with inherited functions with
controlling results which are not overridden during freezing. We partly
decorated them explicitly, even though they would be fully decorated
later anyway.

This early decoration didn't work as expected, because flag
In_Private_Part that is read by Override_Dispatching_Operation is not
set reliably while freezing (as explained in the comment of
Is_Private_Declaration). In effect, we were getting a circularity
between Alias and Overridden_Operation, which was causing GNATprove to
loop infinitely.

Apparently the cleanest fix is to not decorate the wrapper with an early
call to Override_Dispatching_Operation.

gcc/ada/

* exp_ch3.adb (Make_Controlling_Function_Wrappers): Remove early
decoration.

RISC-V: Fix one typo in full-vec-move1 test

This patch fixes one typo in the assembly check of the
full-vec-move1 test.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c:
Adjust dg-do to compile for asm checking.

AArch64: [PR96339] Optimise svlast[ab]

  This PR optimizes an SVE intrinsics sequence such as
    svlasta (svptrue_pat_b8 (SV_VL1), x)
  in which a scalar is selected based on a constant predicate and a variable
  vector.  The sequence is optimized to return the corresponding element of a
  NEON vector.  For example,
    svlasta (svptrue_pat_b8 (SV_VL1), x)
  returns
    umov    w0, v0.b[1]
  Likewise,
    svlastb (svptrue_pat_b8 (SV_VL1), x)
  returns
     umov    w0, v0.b[0]
  This optimization only works provided the constant predicate maps to a range
  that is within the bounds of a 128-bit NEON register.
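
The element selection described above can be sketched with a scalar model in
plain C.  This is only an illustration: the names model_lasta and model_lastb
are hypothetical, the predicate is modelled as one byte per lane, and the
behaviour for an all-false predicate follows the usual LASTA/LASTB description
rather than anything stated in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the last active lane, or -1 if the predicate is all-false.  */
static int
last_active (const uint8_t *pred, size_t n)
{
  int last = -1;
  for (size_t i = 0; i < n; i++)
    if (pred[i])
      last = (int) i;
  return last;
}

/* Hypothetical scalar model of svlasta: the element after the last active
   lane, wrapping around; element 0 if no lane is active.  */
uint8_t
model_lasta (const uint8_t *pred, const uint8_t *vec, size_t n)
{
  assert (n > 0);
  int last = last_active (pred, n);
  return vec[(size_t) (last + 1) % n];
}

/* Hypothetical scalar model of svlastb: the last active element; the final
   element if no lane is active.  */
uint8_t
model_lastb (const uint8_t *pred, const uint8_t *vec, size_t n)
{
  assert (n > 0);
  int last = last_active (pred, n);
  return vec[last < 0 ? n - 1 : (size_t) last];
}
```

With only lane 0 active, as for svptrue_pat_b8 (SV_VL1), the model picks
element 1 for lasta and element 0 for lastb, which matches the v0.b[1] and
v0.b[0] accesses shown above.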

gcc/ChangeLog:

PR target/96339
* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): Fold sve
calls that have a constant input predicate vector.
(svlast_impl::is_lasta): Query to check if intrinsic is svlasta.
(svlast_impl::is_lastb): Query to check if intrinsic is svlastb.
(svlast_impl::vect_all_same): Check if all vector elements are equal.

gcc/testsuite/ChangeLog:

PR target/96339
* gcc.target/aarch64/sve/acle/general-c/svlast.c: New.
* gcc.target/aarch64/sve/acle/general-c/svlast128_run.c: New.
* gcc.target/aarch64/sve/acle/general-c/svlast256_run.c: New.
* gcc.target/aarch64/sve/pcs/return_4.c (caller_bf16): Fix asm
to expect optimized code for function body.
* gcc.target/aarch64/sve/pcs/return_4_128.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_256.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_512.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_1024.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_2048.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_128.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_256.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_512.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_1024.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_2048.c (caller_bf16): Likewise.

Update perf auto profile script

- Fix gen_autofdo_event: The download URL for the Intel Perfmon Event
  list has changed, as well as the JSON format.
  Also it now uses pattern matching to match CPUs. Update the script to support all of this.
- Regenerate gcc-auto-profile with the latest published Intel model
  numbers, so it works with recent systems.
- So far it's still broken on hybrid systems

contrib/ChangeLog:

* gen_autofdo_event.py: Update for download server changes

gcc/ChangeLog

* config/i386/gcc-auto-profile: Regenerate.

RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

This patch fixes the requirement of V_WHOLE and V_FRACT.
For example, VNx8QI in V_WHOLE has no requirement, which is incorrect:
VNx8QI should be a whole (full) mode only when TARGET_MIN_VLEN < 128,
since when TARGET_MIN_VLEN == 128, VNx8QI is e8mf2, which is a fractional
vector.

Co-Authored by: Robin Dapp <rdapp@ventanamicro.com>

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Fix requirement.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c: New test.

RISC-V: Enhance RVV VLA SLP auto-vectorization with decompress operation

According to RVV ISA:
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc

We can enhance VLA SLP auto-vectorization with (16.5.1. Synthesizing vdecompress)
Decompress operation.

Case 1 (nunits = POLY_INT_CST [16, 16]):
_48 = VEC_PERM_EXPR <_37, _35, { 0, POLY_INT_CST [16, 16], 1, POLY_INT_CST [17, 16], 2, POLY_INT_CST [18, 16], ... }>;
We can optimize such VLA SLP permutation pattern into:
_48 = vdecompress (_37, _35, mask = { 0, 1, 0, 1, ... });

Case 2 (nunits = POLY_INT_CST [16, 16]):
_23 = VEC_PERM_EXPR <_46, _44, { POLY_INT_CST [1, 1], POLY_INT_CST [3, 3], POLY_INT_CST [2, 1], POLY_INT_CST [4, 3], POLY_INT_CST [3, 1], POLY_INT_CST [5, 3], ... }>;
We can optimize such VLA SLP permutation pattern into:
_48 = vdecompress (slidedown(_46, 1/2 nunits), slidedown(_44, 1/2 nunits), mask = { 0, 1, 0, 1, ... });
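
A scalar model of the Case 1 selection, as a sketch only.  The names
model_viota and model_decompress are hypothetical, and the model assumes that
the result takes op1 on lanes where the mask is set and op0 elsewhere, both
indexed by a viota-style prefix count of the mask.  That assumption reproduces
the { 0, N, 1, N + 1, ... } permutation, but it is not taken from the patch
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of viota.m: sel[i] is the number of set mask bits in the
   lanes below i.  */
static void
model_viota (const uint8_t *mask, uint32_t *sel, size_t n)
{
  uint32_t count = 0;
  for (size_t i = 0; i < n; i++)
    {
      sel[i] = count;
      count += mask[i] != 0;
    }
}

/* Hypothetical model of the decompress step:
   dst[i] = mask[i] ? op1[sel[i]] : op0[sel[i]].  */
void
model_decompress (uint64_t *dst, const uint64_t *op0, const uint64_t *op1,
                  const uint8_t *mask, size_t n)
{
  uint32_t sel[64];
  assert (n <= 64);
  model_viota (mask, sel, n);
  for (size_t i = 0; i < n; i++)
    dst[i] = mask[i] ? op1[sel[i]] : op0[sel[i]];
}
```

For mask = { 0, 1, 0, 1 } the prefix counts are { 0, 0, 1, 1 }, so the result
is { op0[0], op1[0], op0[1], op1[1] }, i.e. the two inputs interleaved.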

For example:
void __attribute__ ((noinline, noclone))
vec_slp (uint64_t *restrict a, uint64_t b, uint64_t c, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * 2] += b;
      a[i * 2 + 1] += c;
    }
}

ASM:
...
        vid.v   v0
        vand.vi v0,v0,1
        vmseq.vi        v0,v0,1  ===> mask = { 0, 1, 0, 1, ... }
vdecompress:
        viota.m v3,v0
        vrgather.vv     v2,v1,v3,v0.t
Loop:
        vsetvli zero,a5,e64,m1,ta,ma
        vle64.v v1,0(a0)
        vsetvli a6,zero,e64,m1,ta,ma
        vadd.vv v1,v2,v1
        vsetvli zero,a5,e64,m1,ta,ma
        mv      a5,a3
        vse64.v v1,0(a0)
        add     a3,a3,a1
        add     a0,a0,a2
        bgtu    a5,a4,.L4

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): New function.
(shuffle_decompress_patterns): New function.
(expand_vec_perm_const_1): Add decompress optimization.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-9.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-9.c: New test.

Daily bump.

PR modula2/110189 Using an unknown TYPE as argument to VAL gives ICE

This patch tidies P3Build.bnf and fixes error format specs in
M2Quads.mod when encountering unknown symbols.

gcc/m2/ChangeLog:

PR modula2/110189
* gm2-compiler/M2Quads.mod (BuildAbsFunction): Replace abort
format specifier.
(BuildValFunction): Replace abort format specifier.
(BuildCastFunction): Replace abort format specifier.
(BuildMinFunction): Replace abort format specifier.
(BuildMaxFunction): Replace abort format specifier.
(BuildTruncFunction): Replace abort format specifier.
* gm2-compiler/P3Build.bnf (Pass1): Remove.
(Pass2): Remove.
(Pass3): Remove.
(Expect): Add Pass1.
(AsmStatement): Remove Pass3.
(AsmOperands): Remove Pass3.
(AsmOperandSpec): Remove Pass3.
(AsmInputElement): Remove Pass3.
(AsmOutputElement): Remove Pass3.
(AsmTrashList): Remove Pass3.

gcc/testsuite/ChangeLog:

PR modula2/110189
* gm2/pim/fail/foovaltype.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[committed] [PR rtl-optimization/101188] Fix reload_cse_move2add ignoring clobbers

So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
incorrect code when optimizing code with clobbers.  Specifically in the
case where we try to optimize a sequence of 4 operations down to 3
operations we can reset INSN to the next instruction and continue the loop.

That skips the code to invalidate objects based on things like REG_INC
nodes, stack pushes and most importantly clobbers attached to the current
insn.

This patch factors all of the invalidation code used by reload_cse_move2add
into a new function and calls it at the appropriate time.
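
The shape of the bug and of the fix can be sketched with a toy pass in C.
Everything here is hypothetical (struct toy_insn, toy_invalidate, toy_pass) and
much simpler than postreload.cc; it only shows why an early 'continue' must not
skip the invalidation step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 8

/* Hypothetical toy instruction, not the GCC RTL representation.  */
struct toy_insn
{
  int set_reg;      /* Register whose value becomes known, or -1.  */
  int clobber_reg;  /* Register clobbered by this insn, or -1.  */
  bool combined;    /* True if the pass rewrote the insn and moves on early.  */
};

/* All invalidation lives in one helper so that every path can call it.  */
static void
toy_invalidate (bool *known, const struct toy_insn *insn)
{
  if (insn->clobber_reg >= 0)
    known[insn->clobber_reg] = false;
}

void
toy_pass (bool *known, const struct toy_insn *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_insn *insn = &insns[i];
      assert (insn->set_reg < NUM_REGS && insn->clobber_reg < NUM_REGS);

      if (insn->set_reg >= 0)
        known[insn->set_reg] = true;

      if (insn->combined)
        {
          /* Early exit from this iteration: invalidate first.  Leaving
             this call out is the kind of omission described above.  */
          toy_invalidate (known, insn);
          continue;
        }

      /* ... other per-insn processing would go here ...  */
      toy_invalidate (known, insn);
    }
}
```

In this model, an insn with combined set and clobber_reg = 2 leaves known[2]
false after the pass, whereas skipping the helper on the early path would leave
a stale "known" value behind.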

Georg-Johann has confirmed this patch fixes his avr bug and I've had it in
my tester over the weekend.  It's bootstrapped and regression tested on
aarch64, m68k, sh4, alpha and hppa.  It's also regression tested successfully
on a wide variety of other targets.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add_invalidate): New function,
extracted from...
(reload_cse_move2add): Call reload_cse_move2add_invalidate.

gcc/testsuite
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test

[aarch64] Improve code-gen for vector initialization with single constant element.

gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Tweak condition
if (n_var == n_elts && n_elts <= 16) to allow a single constant,
and if maxv == 1, use constant element for duplicating into register.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vec-init-single-const.c: New test.
* gcc.target/aarch64/vec-init-single-const-be.c: Likewise.
* gcc.target/aarch64/vec-init-single-const-2.c: Likewise.
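
A sketch of the kind of initializer this change is about: all elements variable
except a single constant.  It uses the GCC generic vector extension instead of
arm_neon.h so that it builds on any target; the function names are hypothetical,
and whether this exact function reaches the modified path in
aarch64_expand_vector_init is an assumption.

```c
#include <assert.h>

/* Four 32-bit lanes, assuming a 4-byte int.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Three variable elements and one constant element.  */
v4si
init_three_vars_one_const (int x, int y, int z)
{
  return (v4si) { x, y, z, 1 };
}

/* Helper to read one lane of the vector.  */
int
lane (v4si v, int i)
{
  assert (i >= 0 && i < 4);
  return v[i];
}
```

Compiling such a function for aarch64 with optimization enabled and inspecting
the assembly is one way to see how the constant element is materialized.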

OpenMP: Cleanups related to the 'present' modifier

Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.
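
The distinction between the two groups of map kinds can be sketched as a small
C model.  The enum and function names below are hypothetical stand-ins, not the
definitions from gomp-constants.h, and the enumerator values carry no meaning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the map kinds named above.  */
enum toy_map_kind
{
  TOY_MAP_PRESENT_TO,
  TOY_MAP_PRESENT_TOFROM,
  TOY_MAP_PRESENT_FROM,
  TOY_MAP_PRESENT_ALLOC,
  TOY_MAP_ALWAYS_PRESENT_TO,
  TOY_MAP_ALWAYS_PRESENT_TOFROM,
  TOY_MAP_ALWAYS_PRESENT_FROM,
  TOY_MAP_FORCE_PRESENT
};

/* Map the compiler-only 'present' kinds to the single kind that is passed
   to the runtime; the 'always present' kinds are kept as they are.  */
enum toy_map_kind
toy_runtime_kind (enum toy_map_kind kind)
{
  switch (kind)
    {
    case TOY_MAP_PRESENT_TO:
    case TOY_MAP_PRESENT_TOFROM:
    case TOY_MAP_PRESENT_FROM:
    case TOY_MAP_PRESENT_ALLOC:
      return TOY_MAP_FORCE_PRESENT;
    default:
      return kind;
    }
}

/* Both groups abort when the data is not present; only the 'always present'
   kinds copy data when it is.  */
bool
toy_copies_when_present (enum toy_map_kind kind)
{
  switch (toy_runtime_kind (kind))
    {
    case TOY_MAP_ALWAYS_PRESENT_TO:
    case TOY_MAP_ALWAYS_PRESENT_TOFROM:
    case TOY_MAP_ALWAYS_PRESENT_FROM:
      return true;
    default:
      return false;
    }
}
```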

Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.

gcc/ChangeLog:

* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.

include/ChangeLog:

* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.

libgomp/ChangeLog:

* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ALLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.

Add some overrides.

PR tree-optimization/110205
* range-op-float.cc (range_operator::fold_range): Add default FII
fold routine.
* range-op-mixed.h (class operator_gt): Add missing final overrides.
* range-op.cc (range_op_handler::fold_range): Add RO_FII case.
(operator_lshift::update_bitmask): Add final override.
(operator_rshift::update_bitmask): Add final override.
* range-op.h (range_operator::fold_range): Add FII prototype.

Provide interface for non-standard operators.

This removes the hack introduced for WIDEN_MULT, which exported a pointer
to the operator and had gimple-range-op.cc set the operator to this
pointer when it was appropriate.

Instead, we simply change the range-op table to be unsigned indexed,
and add new opcodes to the end of the table, allowing them to be indexed
directly via range_op_handler::range_op.
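
A minimal sketch of the idea in C, with hypothetical names throughout: a table
indexed by an unsigned code, where extra opcodes are numbered after the last
regular code and are looked up exactly like the regular ones.  The real
range_op_table is a C++ class, and the actual opcode values are not shown in
this log.

```c
#include <assert.h>

/* Hypothetical stand-ins for the regular codes.  */
enum toy_tree_code { TOY_PLUS_EXPR, TOY_MULT_EXPR, TOY_MAX_TREE_CODES };

/* Extra opcodes live past the end of the regular codes.  */
#define TOY_OP_WIDEN_MULT_SIGNED   ((unsigned) TOY_MAX_TREE_CODES)
#define TOY_OP_WIDEN_MULT_UNSIGNED ((unsigned) TOY_MAX_TREE_CODES + 1)
#define TOY_RANGE_OP_TABLE_SIZE    ((unsigned) TOY_MAX_TREE_CODES + 2)

typedef long long (*toy_fold_fn) (int, int);

static long long fold_plus (int a, int b) { return (long long) a + b; }
static long long fold_mult (int a, int b) { return (long long) a * b; }
static long long fold_widen_mult_signed (int a, int b)
{
  return (long long) a * b;
}
static long long fold_widen_mult_unsigned (int a, int b)
{
  return (long long) ((unsigned long long) (unsigned) a * (unsigned) b);
}

/* One table for both regular and non-standard opcodes.  */
static const toy_fold_fn toy_table[TOY_RANGE_OP_TABLE_SIZE] = {
  [TOY_PLUS_EXPR] = fold_plus,
  [TOY_MULT_EXPR] = fold_mult,
  [TOY_OP_WIDEN_MULT_SIGNED] = fold_widen_mult_signed,
  [TOY_OP_WIDEN_MULT_UNSIGNED] = fold_widen_mult_unsigned,
};

/* Direct lookup by unsigned index; no exported operator pointers needed.  */
long long
toy_fold (unsigned code, int a, int b)
{
  assert (code < TOY_RANGE_OP_TABLE_SIZE && toy_table[code] != 0);
  return toy_table[code] (a, b);
}
```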

* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
Use range_op_handler directly.
* range-op.cc (range_op_handler::range_op_handler): Unsigned
param instead of tree-code.
(ptr_op_widen_plus_signed): Delete.
(ptr_op_widen_plus_unsigned): Delete.
(ptr_op_widen_mult_signed): Delete.
(ptr_op_widen_mult_unsigned): Delete.
(range_op_table::initialize_integral_ops): Add new opcodes.
* range-op.h (range_op_handler): Use unsigned.
(OP_WIDEN_MULT_SIGNED): New.
(OP_WIDEN_MULT_UNSIGNED): New.
(OP_WIDEN_PLUS_SIGNED): New.
(OP_WIDEN_PLUS_UNSIGNED): New.
(RANGE_OP_TABLE_SIZE): New.
(range_op_table::operator []): Use unsigned.
(range_op_table::set): Use unsigned.
(m_range_tree): Make unsigned.
(ptr_op_widen_mult_signed): Remove.
(ptr_op_widen_mult_unsigned): Remove.
(ptr_op_widen_plus_signed): Remove.
(ptr_op_widen_plus_unsigned): Remove.

Provide a default range_operator via range_op_handler.

range_op_handler now provides a default range_operator for any opcode,
so there is no longer a need to check for a valid operator.
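
The pattern can be sketched in C as follows.  All names are hypothetical and
the real code is C++; the point is only that a lookup never yields NULL, and
that validity is decided by comparing against the default operator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operator: fold returns false when it cannot compute.  */
struct toy_operator
{
  bool (*fold) (int lhs, int rhs, int *result);
};

static bool
default_fold (int lhs, int rhs, int *result)
{
  (void) lhs;
  (void) rhs;
  (void) result;
  return false;  /* The default operator never produces a result.  */
}

static bool
plus_fold (int lhs, int rhs, int *result)
{
  *result = lhs + rhs;
  return true;
}

static const struct toy_operator toy_default_operator = { default_fold };
static const struct toy_operator toy_plus_operator = { plus_fold };

enum { TOY_OP_PLUS, TOY_OP_UNKNOWN, TOY_NUM_OPS };

/* Always returns a usable operator, falling back to the default one.  */
const struct toy_operator *
toy_handler (unsigned code)
{
  static const struct toy_operator *const table[TOY_NUM_OPS] = {
    [TOY_OP_PLUS] = &toy_plus_operator,
  };
  if (code < (unsigned) TOY_NUM_OPS && table[code] != NULL)
    return table[code];
  return &toy_default_operator;
}

/* Validity is a comparison against the default operator.  */
bool
toy_handler_valid (const struct toy_operator *op)
{
  return op != &toy_default_operator;
}
```

Callers can invoke fold unconditionally: for an unknown opcode the default
operator simply reports that nothing was computed.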

* gimple-range-op.cc (gimple_range_op_handler): Set m_operator
manually as there is no access to the default operator.
(cfn_copysign::fold_range): Don't check for validity.
(cfn_ubsan::fold_range): Ditto.
(gimple_range_op_handler::maybe_builtin_call): Don't set to NULL.
* range-op.cc (default_operator): New.
(range_op_handler::range_op_handler): Use default_operator
instead of NULL.
(range_op_handler::operator bool): Move from header, compare
against default operator.
(range_op_handler::range_op): New.
* range-op.h (range_op_handler::operator bool): Move.

Switch from unified table to range_op_table. There can be only one.

Now that there is only a single range_op_table, make the base table the
only table.

* range-op.cc (unified_table): Delete.
(range_op_table operator_table): Instantiate.
(range_op_table::range_op_table): Rename from unified_table.
(range_op_handler::range_op_handler): Use range_op_table.
* range-op.h (range_op_table::operator []): Inline.
(range_op_table::set): Inline.

Remove type from range_op_handler table selection

With the unified table complete, we no longer need to specify a type
to choose a table when setting a range_op_handler.

* gimple-range-gori.cc (gori_compute::condexpr_adjust): Do not
pass type.
* gimple-range-op.cc (get_code): Rename from get_code_and_type
and simplify.
(gimple_range_op_handler::supported_p): No need for type.
(gimple_range_op_handler::gimple_range_op_handler): Ditto.
(cfn_copysign::fold_range): Ditto.
(cfn_ubsan::fold_range): Ditto.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Ditto.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Ditto.
* range-op-float.cc (operator_plus::op1_range): Ditto.
(operator_mult::op1_range): Ditto.
(range_op_float_tests): Ditto.
* range-op.cc (get_op_handler): Remove.
(range_op_handler::set_op_handler): Remove.
(operator_plus::op1_range): No need for type.
(operator_minus::op1_range): Ditto.
(operator_mult::op1_range): Ditto.
(operator_exact_divide::op1_range): Ditto.
(operator_cast::op1_range): Ditto.
(operator_bitwise_not::fold_range): Ditto.
(operator_negate::fold_range): Ditto.
* range-op.h (range_op_handler::range_op_handler): Remove type param.
(range_cast): No need for type.
(range_op_table::operator[]): Check for enum_code >= 0.
* tree-data-ref.cc (compute_distributive_range): No need for type.
* tree-ssa-loop-unswitch.cc (unswitch_predicate): Ditto.
* value-query.cc (range_query::get_tree_range): Ditto.
* value-relation.cc (relation_oracle::validate_relation): Ditto.
* vr-values.cc (range_of_var_in_loop): Ditto.
(simplify_using_ranges::fold_cond_with_ops): Ditto.

Add a hybrid MAX_EXPR operator for integer and pointer.

This adds an operator to the unified table for MAX_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
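
A sketch of the hybrid dispatch in C, with hypothetical names.  The integer
version here takes the maximum of each bound; the pointer version is assumed to
track only whether the result is null, non-null or unknown, which is a
simplification chosen for the example rather than a description of the patch.

```c
#include <assert.h>
#include <limits.h>

enum toy_type_kind { TOY_INTEGRAL, TOY_POINTER };

/* Inclusive range [lo, hi].  For pointers: [0, 0] is null,
   [1, LONG_MAX] is non-null, [0, LONG_MAX] is unknown.  */
struct toy_range { long lo; long hi; };

static long
max_long (long a, long b)
{
  return a > b ? a : b;
}

static struct toy_range
integer_max (struct toy_range a, struct toy_range b)
{
  struct toy_range r = { max_long (a.lo, b.lo), max_long (a.hi, b.hi) };
  return r;
}

static struct toy_range
pointer_max (struct toy_range a, struct toy_range b)
{
  struct toy_range r = { 0, LONG_MAX };  /* Unknown by default.  */
  if (a.lo > 0 && b.lo > 0)
    r.lo = 1;                            /* Both non-null.  */
  else if (a.hi == 0 && b.hi == 0)
    r.hi = 0;                            /* Both null.  */
  return r;
}

/* One table entry for MAX_EXPR: pick the version from the type kind.  */
struct toy_range
hybrid_max (enum toy_type_kind kind, struct toy_range a, struct toy_range b)
{
  return kind == TOY_POINTER ? pointer_max (a, b) : integer_max (a, b);
}
```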

This also removes the pointer table which is no longer needed.

* range-op-mixed.h (operator_max): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove MAX_EXPR.
(pointer_table::pointer_table): Remove.
(class hybrid_max_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_max_operator.
* range-op.cc (pointer_tree_table): Remove.
(unified_table::unified_table): Comment out MAX_EXPR.
(get_op_handler): Remove check of pointer table.
* range-op.h (class pointer_table): Remove.

Add a hybrid MIN_EXPR operator for integer and pointer.

This adds an operator to the unified table for MIN_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_min): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove MIN_EXPR.
(class hybrid_min_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_min_operator.
* range-op.cc (unified_table::unified_table): Comment out MIN_EXPR.

Add a hybrid BIT_IOR_EXPR operator for integer and pointer.

This adds an operator to the unified table for BIT_IOR_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_bitwise_or): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove BIT_IOR_EXPR.
(class hybrid_or_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_or_operator.
* range-op.cc (unified_table::unified_table): Comment out BIT_IOR_EXPR.

Add a hybrid BIT_AND_EXPR operator for integer and pointer.

This adds an operator to the unified table for BIT_AND_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_bitwise_and): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove BIT_AND_EXPR.
(class hybrid_and_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_and_operator.
* range-op.cc (unified_table::unified_table): Comment out BIT_AND_EXPR.

Split pointer based range operators to range-op-ptr.cc

Move the pointer table and all pointer-specific operators into a
new file for pointers.

* Makefile.in (OBJS): Add range-op-ptr.o.
* range-op-mixed.h (update_known_bitmask): Move prototype here.
(minus_op1_op2_relation_effect): Move prototype here.
(wi_includes_zero_p): Move function to here.
(wi_zero_p): Ditto.
* range-op.cc (update_known_bitmask): Remove static.
(wi_includes_zero_p): Move to header.
(wi_zero_p): Move to header.
(minus_op1_op2_relation_effect): Remove static.
(operator_pointer_diff): Move class and routines to range-op-ptr.cc.
(pointer_plus_operator): Ditto.
(pointer_min_max_operator): Ditto.
(pointer_and_operator): Ditto.
(pointer_or_operator): Ditto.
(pointer_table): Ditto.
(range_op_table::initialize_pointer_ops): Ditto.
* range-op-ptr.cc: New.

Move operator_max to the unified range-op table.

Also remove the integral table.

* range-op-mixed.h (class operator_max): Move from...
* range-op.cc (unified_table::unified_table): Add MAX_EXPR.
(get_op_handler): Remove the integral table.
(class operator_max): Move from here.
(integral_table::integral_table): Delete.
* range-op.h (class integral_table): Delete.

Move operator_min to the unified range-op table.

* range-op-mixed.h (class operator_min): Move from...
* range-op.cc (unified_table::unified_table): Add MIN_EXPR.
(class operator_min): Move from here.
(integral_table::integral_table): Remove MIN_EXPR.

Move operator_bitwise_or to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_or): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_IOR_EXPR.
(class operator_bitwise_or): Move from here.
(integral_table::integral_table): Remove BIT_IOR_EXPR.

Move operator_bitwise_and to the unified range-op table.

At this point, the remaining 4 integral operations have different
implementations than pointers, so we now check for a pointer table
entry first and, if there is none, look at the unified table.

* range-op-mixed.h (class operator_bitwise_and): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_AND_EXPR.
(get_op_handler): Check for a pointer table entry first.
(class operator_bitwise_and): Move from here.
(integral_table::integral_table): Remove BIT_AND_EXPR.
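
The lookup order described above (pointer table first, unified table as
the fallback) might be sketched as follows. The enum values, table
contents and get_handler are hypothetical and only illustrate the
fallback. Consulting the pointer table only for pointer types is an
assumption of this sketch.

```c
/* Hypothetical operation codes and handler identifiers.  */
enum op_code { OP_BIT_AND, OP_BIT_XOR, OP_PLUS, OP_COUNT };

enum handler_id
{
  HANDLER_NONE = 0,     /* Empty table slot.  */
  HANDLER_POINTER_AND,
  HANDLER_INTEGER_AND,
  HANDLER_SHARED_XOR,
  HANDLER_SHARED_PLUS
};

/* Only operations whose pointer behaviour differs have an entry here.  */
static const enum handler_id pointer_table[OP_COUNT] = {
  [OP_BIT_AND] = HANDLER_POINTER_AND
};

/* Every operation has an entry here.  */
static const enum handler_id unified_table[OP_COUNT] = {
  [OP_BIT_AND] = HANDLER_INTEGER_AND,
  [OP_BIT_XOR] = HANDLER_SHARED_XOR,
  [OP_PLUS] = HANDLER_SHARED_PLUS
};

/* Prefer a pointer-specific handler when one exists, otherwise fall
   back to the unified table.  */
enum handler_id
get_handler (enum op_code code, int is_pointer)
{
  if (is_pointer && pointer_table[code] != HANDLER_NONE)
    return pointer_table[code];
  return unified_table[code];
}
```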

Move operator_bitwise_xor to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_xor): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_XOR_EXPR.
(class operator_bitwise_xor): Move from here.
(integral_table::integral_table): Remove BIT_XOR_EXPR.
(pointer_table::pointer_table): Remove BIT_XOR_EXPR.

Move operator_bitwise_not to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_not): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_NOT_EXPR.
(class operator_bitwise_not): Move from here.
(integral_table::integral_table): Remove BIT_NOT_EXPR.
(pointer_table::pointer_table): Remove BIT_NOT_EXPR.

Move operator_addr_expr to the unified range-op table.

* range-op-mixed.h (class operator_addr_expr): Move from...
* range-op.cc (unified_table::unified_table): Add ADDR_EXPR.
(class operator_addr_expr): Move from here.
(integral_table::integral_table): Remove ADDR_EXPR.
(pointer_table::pointer_table): Remove ADDR_EXPR.

PR modula2/110126 variables are reported as unused when referenced by ASM fix

This patch fixes the trash list of the asm statement. It introduces a
separate build procedure for trashed elements.

gcc/m2/ChangeLog:

PR modula2/110126
* gm2-compiler/M2Quads.def (BuildAsmElement): Remove
trash parameter.
(BuildAsmTrash): New procedure.
* gm2-compiler/M2Quads.mod (BuildAsmTrash): New procedure.
(BuildAsmElement): Remove trash parameter.
* gm2-compiler/P3Build.bnf (AsmTrashList): Rewrite.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

RISC-V: Fix one potential test failure for RVV vsetvl

The test fails when run in parallel with the command below. The failure
comes from one missed "Oz" option when checking vsetvl.

make -j $(nproc) report RUNTESTFLAGS="rvv.exp riscv.exp"

For some reason, this failure cannot be reproduced with
RUNTESTFLAGS="rvv.exp" or with make without the -j option. We would like
to fix the test now and root-cause the reason later.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust test checking.

RISC-V: Support RVV FP16 MISC vget/vset intrinsic API

This patch supports the intrinsic API of FP16 ZVFHMIN vget/vset. From
the user's perspective, it is reasonable to do some get/set operations
for the vfloat16*_t types when only ZVFHMIN is enabled.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-types.def
(vfloat16m1_t): Add type to lmul1 ops.
(vfloat16m2_t): Likewise.
(vfloat16m4_t): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add new test cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Likewise.

Fix disambiguation against .MASK_STORE

Alias analysis was treating .MASK_STORE as storing a full vector,
which means we disambiguate against decls smaller than the vector size.
That's of course wrong and a similar issue was fixed for DSE already.
The following makes sure we set the size of the access to unknown
and only constrain max_size.
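
As a rough illustration of why the exact size matters, consider the
simplified model below. It is not the tree-ssa-alias.cc logic: the
mem_access struct, SIZE_UNKNOWN and all function names are hypothetical,
and the single size comparison stands in for a much richer set of
checks.

```c
#include <stdbool.h>

#define SIZE_UNKNOWN (-1L)

/* Hypothetical description of a store.  */
struct mem_access
{
  long size;      /* Exact number of bytes written, or SIZE_UNKNOWN.  */
  long max_size;  /* Upper bound on the number of bytes written.  */
};

/* A store known to write more bytes than the object holds cannot be a
   store to that object.  With an unknown exact size we must stay
   conservative and assume it may be.  */
bool
may_store_to_object (struct mem_access access, long object_size)
{
  if (access.size != SIZE_UNKNOWN && access.size > object_size)
    return false;
  return true;
}

/* An unmasked vector store writes every byte of the vector.  */
struct mem_access
full_vector_store (long vector_bytes)
{
  return (struct mem_access) { vector_bytes, vector_bytes };
}

/* A masked store may write fewer bytes, so only max_size is known.  */
struct mem_access
masked_vector_store (long vector_bytes)
{
  return (struct mem_access) { SIZE_UNKNOWN, vector_bytes };
}
```

In this model a 64-byte unmasked store is disambiguated against a
16-byte object, while a masked store of the same vector width is not,
because the active lanes might cover only those 16 bytes.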

This fixes runtime execution FAILs of gfortran.dg/matmul_2.f90,
gfortran.dg/matmul_6.f90 and gfortran.dg/pr91577.f90 when using
AVX512 with full masked loop vectorization on Zen4.

* tree-ssa-alias.cc (call_may_clobber_ref_p_1): For
.MASK_STORE and friends, set the size of the access to
unknown.

Remove DEFAULT_MATCHPD_PARTITIONS macro

As Jakub pointed out, DEFAULT_MATCHPD_PARTITIONS
is now unused and can be removed.

gcc/ChangeLog:

* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Remove DEFAULT_MATCHPD_PARTITIONS.

RISC-V: Add RVV narrow shift right lowering auto-vectorization

Optimize auto-vectorization of the following code:
void foo (int16_t * __restrict a, int32_t * __restrict b, int32_t c, int n)
{
    for (int i = 0; i < n; i++)
      a[i] = b[i] >> c;
}

Before this patch:
foo:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vsetvli a4,zero,e32,m1,ta,ma
        vsra.vx v1,v1,a2
        vsetvli zero,zero,e16,mf2,ta,ma
        slli    a7,a5,2
        vncvt.x.x.w     v1,v1
        slli    a6,a5,1
        vsetvli zero,a5,e16,mf2,ta,ma
        sub     a3,a3,a5
        vse16.v v1,0(a0)
        add     a1,a1,a7
        add     a0,a0,a6
        bne     a3,zero,.L3
.L5:
        ret

After this patch:
foo:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vsetvli a7,zero,e16,mf2,ta,ma
        slli    a6,a5,2
        vnsra.wx        v1,v1,a2
        slli    a4,a5,1
        vsetvli zero,a5,e16,mf2,ta,ma
        sub     a3,a3,a5
        vse16.v v1,0(a0)
        add     a1,a1,a6
        add     a0,a0,a4
        bne     a3,zero,.L3
.L5:
        ret
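
For a single element, the two instruction sequences can be modelled in
scalar C roughly as below. The function names are hypothetical. The
sketch assumes an arithmetic right shift for negative values and
modular narrowing conversions, which is what GCC provides but which the
C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Two-step form (vsra.vx followed by vncvt.x.x.w): a 32-bit arithmetic
   shift, then a truncation to 16 bits.  Only the low 5 bits of the
   shift amount are used for 32-bit elements.  */
int16_t
shift_then_truncate (int32_t x, unsigned shamt)
{
  int32_t wide = x >> (shamt & 31);
  return (int16_t) wide;
}

/* Fused form (vnsra.wx): shift the 32-bit source element and keep the
   low 16 bits of the result in one operation.  */
int16_t
narrowing_shift (int32_t x, unsigned shamt)
{
  uint32_t shifted = (uint32_t) (x >> (shamt & 31));
  return (int16_t) (uint16_t) (shifted & 0xFFFFu);
}
```

Both functions return the same value for every input under the stated
assumptions, which is why one narrowing instruction can replace the
shift and convert pair.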

gcc/ChangeLog:

* config/riscv/autovec-opt.md
(*v<any_shiftrt:optab><any_extend:optab>trunc<mode>): New pattern.
(*<any_shiftrt:optab>trunc<mode>): Ditto.
* config/riscv/autovec.md (<optab><mode>3): Change to
define_insn_and_split.
(v<optab><mode>3): Ditto.
(trunc<mode><v_double_trunc>2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-3.c: New test.

simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

This patch implements RTL constant-folding for the SS_TRUNCATE and US_TRUNCATE codes.
The semantics are a clamping operation on the argument with the min and max of the narrow mode,
followed by a truncation. The signedness of the clamp and the min/max extrema is derived from
the signedness of the saturating operation.
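
The clamp-then-truncate semantics can be illustrated for a 32-bit to
16-bit narrowing. This is a scalar sketch with hypothetical function
names, not the code added to simplify-rtx.cc, which works on arbitrary
modes.

```c
#include <stdint.h>

/* SS_TRUNCATE, 32 -> 16 bits: clamp with signed comparisons to the
   signed range of the narrow mode, then truncate.  */
int16_t
ss_truncate_32_to_16 (int32_t x)
{
  if (x < INT16_MIN)
    x = INT16_MIN;
  if (x > INT16_MAX)
    x = INT16_MAX;
  return (int16_t) x;
}

/* US_TRUNCATE, 32 -> 16 bits: clamp with unsigned comparisons to the
   unsigned range of the narrow mode, then truncate.  The lower bound
   is 0, so only the upper bound needs a check.  */
uint16_t
us_truncate_32_to_16 (uint32_t x)
{
  if (x > UINT16_MAX)
    x = UINT16_MAX;
  return (uint16_t) x;
}
```

For example, ss_truncate_32_to_16 (100000) yields 32767 and
us_truncate_32_to_16 (70000) yields 65535, while values already in
range pass through unchanged.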

We have a number of instructions in aarch64 that use SS_TRUNCATE and US_TRUNCATE to represent
their operations and we have pretty thorough runtime tests in gcc.target/aarch64/advsimd-intrinsics/vqmovn*.c.
With this patch the instructions are folded away at optimisation levels and the correctness checks still
pass.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

* simplify-rtx.cc (simplify_const_unary_operation):
Handle US_TRUNCATE, SS_TRUNCATE.

RISC-V: Add ZVFHMIN block autovec testcase

To be safe, add a ZVFHMIN autovec block testcase to make sure
we won't enable autovec in ZVFHMIN by mistake.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/zvfhmin-1.c: New test.

Fix oversight in latest change

gcc/
PR modula2/109952
* doc/gm2.texi (Standard procedures): Fix Next link.

Regenerate config.in

Looks like I forgot to regenerate config.in which
causes updates when you enable maintainer mode.

gcc/ChangeLog:

* config.in: Regenerate.

vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]

This patch fixes an issue introduced by
g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was being passed
to vect_widened_op_tree, when no subtype was to be used. This led to an
erroneous use of IFN_VEC_WIDEN_MINUS.

gcc/ChangeLog:

PR middle-end/110142
* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Don't pass
subtype to vect_widened_op_tree and remove subtype parameter, also
remove superfluous overloaded function definition.
(vect_recog_widen_plus_pattern): Remove subtype parameter and don't pass
it to the call to vect_recog_widen_op_pattern.
(vect_recog_widen_minus_pattern): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr110142.c: New test.