git.ipfire.org Git - thirdparty/gcc.git/log

ada: Fix expansion of aggregates with controlled components

The expansion is incorrect in the case where the initialization expression
of a component is a conditional expression that has a function call as one
of its dependent expressions, leading to a wrong order of initialization,
adjustment and finalization.

gcc/ada/

* exp_aggr.adb (Initialize_Component): Perform immediate expansion
of the initialization expression if it is a conditional expression
and the component type is controlled.

ada: Factor common processing in expansion of aggregates

The final processing at the component level of array aggregates and record
aggregates is very similar, so this factors out the common processing into
three new library-level subprograms.

There should be no functional changes, but the expanded code may be changed
in the case of controlled components of array aggregates not covered by a
multiple choice: the previous expansion used to place new declarations prior
to the aggregate in this case and that is no longer the case, i.e. they are
always placed right before the initialization of the component (as was done
for all controlled components of record aggregates and controlled components
of array aggregates covered by a multiple choice).

gcc/ada/

* exp_aggr.adb (Initialize_Component): New procedure factored out
from the processing of array and record aggregates.
(Initialize_Controlled_Component): Likewise.
(Initialize_Simple_Component): Likewise.
(Build_Array_Aggr_Code.Gen_Assign): Remove In_Loop parameter.
Call Initialize_Component to initialize the component.
(Initialize_Array_Component): Delete.
(Initialize_Ctrl_Array_Component): Likewise.
(Build_Array_Aggr_Code): Adjust calls to Gen_Assign.
(Build_Record_Aggr_Code): Call Initialize_Simple_Component or
Initialize_Component to initialize the component.
(Initialize_Ctrl_Record_Component): Delete.
(Initialize_Record_Component): Likewise.

ada: Remove wrong comment about expansion of exceptions for GNATprove

Code cleanup related to handling exceptions in GNATprove.

gcc/ada/

* exp_ch11.adb (Expand_N_Raise_Statement): Expansion of raise statements
never happens in GNATprove mode.

ada: Cleanup finding of locally handled exception handlers

Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.

gcc/ada/

* exp_ch11.adb (Find_Local_Handler): Replace guard against other
constructs appearing in the list of exception handlers with iteration
using First_Non_Pragma/Next_Non_Pragma.

ada: Cleanup expansion of locally handled exception handlers

Code cleanup related to handling exceptions in GNATprove; semantics is
unaffected.

gcc/ada/

* exp_ch11.ads (Find_Local_Handler): Fix typo in comment.
* exp_ch11.adb (Find_Local_Handler): Remove redundant check for the
Exception_Handler list being present; use membership test to eliminate
local object LCN; fold nested IF statements. Remove useless ELSIF
condition.

ada: Tune style in detection of writable function actuals

Cleanup; semantics is unaffected.

gcc/ada/

* sem_util.adb (Check_Function_Writable_Actuals): Tune style; use
subtype name to detect membership test nodes.

ada: Simplify appending to a newly created list

Code cleanup; semantics is unaffected.

gcc/ada/

* exp_disp.adb (Make_Disp_Asynchronous_Select_Spec): Use a single call
to New_List.

ada: Support new GNAT-specific aspect Ghost_Predicate

New aspect Ghost_Predicate allows the use of ghost entities in the
predicate expression, even if the type is not ghost itself. As a result,
subtypes with a ghost predicate cannot be used in membership tests.

Subtypes with ghost predicates are subject to the same additional
restrictions as subtypes with aspect Dynamic_Predicate.
They are governed for compilation by assertion policy Ghost.
Checking of the predicate itself is governed by the usual assertion
policy (Static_Predicate/Dynamic_Predicate/Predicate) independently
of the ghost predicate.

gcc/ada/

* doc/gnat_rm/implementation_defined_aspects.rst: Document new
aspect.
* doc/gnat_rm/implementation_defined_pragmas.rst: Whitespace.
* aspects.adb (Init_Canonical_Aspect): Set it to Predicate.
* aspects.ads: Set global constants for new aspect.
* einfo.ads: Describe new flag related to new aspect.
* exp_ch6.adb (Can_Fold_Predicate_Call): Do not fold new aspect.
* exp_util.adb (Make_Predicate_Check): Add comment.
* gen_il-fields.ads: Add new flag.
* gen_il-gen-gen_entities.adb: Add new flag.
* ghost.adb (Is_OK_Ghost_Context): Ghost predicate is an OK
ghost context.
(Mark_Ghost_Pragma): Add overloading with ghost mode parameter.
* ghost.ads (Mark_Ghost_Pragma): Add overloading with ghost mode
parameter.
(Name_To_Ghost_Mode): Make function public.
* sem_aggr.adb: Issue error for violation of valid use.
* sem_case.adb: Issue error for violation of valid use.
* sem_ch13.adb: Adapt for new aspect.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Remove dead code
which was trying to propagate Has_Predicates flag in the wrong
direction (from derived to parent type).
(Analyze_Number_Declaration): Issue error for violation of valid
use.
(Build_Derived_Type): Cleanup inheritance of predicate flags from
parent to derived type.
(Build_Predicate_Function): Only add a predicate check when it
is not ignored as Ghost code.
* sem_ch4.adb (Analyze_Membership_Op): Issue an error for use of
a subtype with a ghost predicate as name in a membership test.
* sem_ch5.adb (Check_Predicate_Use): Issue error for violation of
valid use.
* sem_eval.adb: Adapt code for Dynamic_Predicate to account for
Ghost_Predicate too.
* sem_prag.adb (Analyze_Pragma): Make pragma ghost or not.
* sem_util.adb (Bad_Predicated_Subtype_Use): Adapt to new aspect.
(Inherit_Predicate_Flags): Add inheritance of flag. Add parameter
to apply to derived types.
* sem_util.ads (Inherit_Predicate_Flags): Change signature.
* snames.ads-tmpl: Add new aspect name.
* gnat_rm.texi: Regenerate.

ada: Remove explicit decoration of wrapper created in freezing

We create wrapper functions associated with inherited functions with
controlling results which are not overridden during freezing. We partly
decorated them explicitly, even though they would be fully decorated
later anyway.

This early decoration didn't work as expected, because flag
In_Private_Part that is read by Override_Dispatching_Operation is not
set reliably while freezing (as explained in the comment of
Is_Private_Declaration). In effect, we were getting a circularity
between Alias and Overridden_Operation, which was causing GNATprove to
loop infinitely.

Apparently the cleanest fix is to not decorate the wrapper with an early
call to Override_Dispatching_Operation.

gcc/ada/

* exp_ch3.adb (Make_Controlling_Function_Wrappers): Remove early
decoration.

RISC-V: Fix one typo in full-vec-move1 test

This patch fixes one typo in the assembly check of
full-vec-move1.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c:
Adjust dg-do to compile for asm checking.

AArch64: [PR96339] Optimise svlast[ab]

  This PR optimizes an SVE intrinsics sequence where
    svlasta (svptrue_pat_b8 (SV_VL1), x)
  a scalar is selected based on a constant predicate and a variable vector.
  This sequence is optimized to return the corresponding element of a NEON
  vector. For example,
    svlasta (svptrue_pat_b8 (SV_VL1), x)
  returns
    umov    w0, v0.b[1]
  Likewise,
    svlastb (svptrue_pat_b8 (SV_VL1), x)
  returns
     umov    w0, v0.b[0]
  This optimization only works provided the constant predicate maps to a range
  that is within the bounds of a 128-bit NEON register.
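A scalar C model of the two intrinsics may make the folding easier to follow. This is only a sketch under stated assumptions: the names model_lasta and model_lastb and the fixed 16-lane byte vector are hypothetical, and the code is not taken from svlast_impl::fold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 16  /* assumed: one 128-bit vector of byte elements */

/* Index of the last active predicate lane, or -1 if no lane is active.  */
static int
last_active (const bool pred[LANES])
{
  int last = -1;
  for (int i = 0; i < LANES; i++)
    if (pred[i])
      last = i;
  return last;
}

/* Model of svlastb: the last active element, or the final element
   when no lane is active.  */
uint8_t
model_lastb (const bool pred[LANES], const uint8_t x[LANES])
{
  assert (pred && x);
  int last = last_active (pred);
  return x[last < 0 ? LANES - 1 : last];
}

/* Model of svlasta: the element after the last active one, wrapping
   around to element 0.  */
uint8_t
model_lasta (const bool pred[LANES], const uint8_t x[LANES])
{
  assert (pred && x);
  int last = last_active (pred);
  return x[(last + 1) % LANES];
}
```

With only lane 0 active, as for svptrue_pat_b8 (SV_VL1), model_lastb selects element 0 and model_lasta selects element 1, which matches the umov lanes shown above.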

gcc/ChangeLog:

PR target/96339
* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): Fold sve
calls that have a constant input predicate vector.
(svlast_impl::is_lasta): Query to check if intrinsic is svlasta.
(svlast_impl::is_lastb): Query to check if intrinsic is svlastb.
(svlast_impl::vect_all_same): Check if all vector elements are equal.

gcc/testsuite/ChangeLog:

PR target/96339
* gcc.target/aarch64/sve/acle/general-c/svlast.c: New.
* gcc.target/aarch64/sve/acle/general-c/svlast128_run.c: New.
* gcc.target/aarch64/sve/acle/general-c/svlast256_run.c: New.
* gcc.target/aarch64/sve/pcs/return_4.c (caller_bf16): Fix asm
to expect optimized code for function body.
* gcc.target/aarch64/sve/pcs/return_4_128.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_256.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_512.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_1024.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_4_2048.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_128.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_256.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_512.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_1024.c (caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_5_2048.c (caller_bf16): Likewise.

Update perf auto profile script

- Fix gen_autofdo_event: The download URL for the Intel Perfmon Event
  list has changed, as well as the JSON format.
  Also it now uses pattern matching to match CPUs. Update the script to support all of this.
- Regenerate gcc-auto-profile with the latest published Intel model
  numbers, so it works with recent systems.
- So far it's still broken on hybrid systems

contrib/ChangeLog:

* gen_autofdo_event.py: Update for download server changes

gcc/ChangeLog

* config/i386/gcc-auto-profile: Regenerate.

RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

This patch fixes the requirement of V_WHOLE and V_FRACT.
E.g. VNx8QI in V_WHOLE has no requirement, which is incorrect.
     VNx8QI should be a whole (full) mode only when TARGET_MIN_VLEN < 128,
     since when TARGET_MIN_VLEN == 128, VNx8QI is e8mf2, which is a fractional
     vector.

Co-Authored by: Robin Dapp <rdapp@ventanamicro.com>

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Fix requirement.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c: New test.

RISC-V: Enhance RVV VLA SLP auto-vectorization with decompress operation

According to RVV ISA:
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc

We can enhance VLA SLP auto-vectorization with (16.5.1. Synthesizing vdecompress)
Decompress operation.

Case 1 (nunits = POLY_INT_CST [16, 16]):
_48 = VEC_PERM_EXPR <_37, _35, { 0, POLY_INT_CST [16, 16], 1, POLY_INT_CST [17, 16], 2, POLY_INT_CST [18, 16], ... }>;
We can optimize such a VLA SLP permutation pattern into:
_48 = vdecompress (_37, _35, mask = { 0, 1, 0, 1, ... });

Case 2 (nunits = POLY_INT_CST [16, 16]):
_23 = VEC_PERM_EXPR <_46, _44, { POLY_INT_CST [1, 1], POLY_INT_CST [3, 3], POLY_INT_CST [2, 1], POLY_INT_CST [4, 3], POLY_INT_CST [3, 1], POLY_INT_CST [5, 3], ... }>;
We can optimize such a VLA SLP permutation pattern into:
_48 = vdecompress (slidedown(_46, 1/2 nunits), slidedown(_44, 1/2 nunits), mask = { 0, 1, 0, 1, ... });
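The Case 1 permutation can be modelled in scalar C. This is a sketch, not the code in riscv-v.cc: model_viota and model_decompress are hypothetical names, and the two-source merge (unmasked lanes gather from the first operand, masked lanes from the second, both indexed by the viota.m result) is an assumption about how vdecompress (op0, op1, mask) is meant to behave here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of viota.m: sel[i] is the number of set mask bits at indices
   strictly below i.  */
static void
model_viota (const uint8_t *mask, size_t n, size_t *sel)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    {
      sel[i] = count;
      count += (mask[i] != 0);
    }
}

/* Assumed decompress-style merge: a lane gathers from op1 when its mask
   bit is set and from op0 otherwise, both indexed by the viota result.  */
void
model_decompress (const uint64_t *op0, const uint64_t *op1,
                  const uint8_t *mask, size_t n, uint64_t *dest)
{
  size_t sel[64];
  assert (n <= 64);  /* arbitrary limit of this sketch */
  model_viota (mask, n, sel);
  for (size_t i = 0; i < n; i++)
    dest[i] = mask[i] ? op1[sel[i]] : op0[sel[i]];
}
```

With mask = { 0, 1, 0, 1, ... } the result interleaves the leading elements of the two operands, which is the selection { 0, nunits, 1, nunits + 1, ... } of Case 1.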

For example:
void __attribute__ ((noinline, noclone))
vec_slp (uint64_t *restrict a, uint64_t b, uint64_t c, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * 2] += b;
      a[i * 2 + 1] += c;
    }
}

ASM:
...
        vid.v   v0
        vand.vi v0,v0,1
        vmseq.vi        v0,v0,1  ===> mask = { 0, 1, 0, 1, ... }
vdecompress:
        viota.m v3,v0
        vrgather.vv     v2,v1,v3,v0.t
Loop:
        vsetvli zero,a5,e64,m1,ta,ma
        vle64.v v1,0(a0)
        vsetvli a6,zero,e64,m1,ta,ma
        vadd.vv v1,v2,v1
        vsetvli zero,a5,e64,m1,ta,ma
        mv      a5,a3
        vse64.v v1,0(a0)
        add     a3,a3,a1
        add     a0,a0,a2
        bgtu    a5,a4,.L4

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): New function.
(shuffle_decompress_patterns): New function.
(expand_vec_perm_const_1): Add decompress optimization.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-9.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-9.c: New test.

Daily bump.

PR modula2/110189 Using an unknown TYPE as argument to VAL gives ICE

This patch tidies P3Build.bnf and fixes error format specs in
M2Quads.mod when encountering unknown symbols.

gcc/m2/ChangeLog:

PR modula2/110189
* gm2-compiler/M2Quads.mod (BuildAbsFunction): Replace abort
format specifier.
(BuildValFunction): Replace abort format specifier.
(BuildCastFunction): Replace abort format specifier.
(BuildMinFunction): Replace abort format specifier.
(BuildMaxFunction): Replace abort format specifier.
(BuildTruncFunction): Replace abort format specifier.
* gm2-compiler/P3Build.bnf (Pass1): Remove.
(Pass2): Remove.
(Pass3): Remove.
(Expect): Add Pass1.
(AsmStatement): Remove Pass3.
(AsmOperands): Remove Pass3.
(AsmOperandSpec): Remove Pass3.
(AsmInputElement): Remove Pass3.
(AsmOutputElement): Remove Pass3.
(AsmTrashList): Remove Pass3.

gcc/testsuite/ChangeLog:

PR modula2/110189
* gm2/pim/fail/foovaltype.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[committed] [PR rtl-optimization/101188] Fix reload_cse_move2add ignoring clobbers

So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
incorrect code when optimizing code with clobbers.  Specifically in the
case where we try to optimize a sequence of 4 operations down to 3
operations we can reset INSN to the next instruction and continue the loop.

That skips the code to invalidate objects based on things like REG_INC
nodes, stack pushes and most importantly clobbers attached to the current
insn.

This patch factors all of the invalidation code used by reload_cse_move2add
into a new function and calls it at the appropriate time.

Georg-Johann has confirmed this patch fixes his avr bug and I've had it in
my tester over the weekend.  It's bootstrapped and regression tested on
aarch64, m68k, sh4, alpha and hppa.  It's also regression tested successfully
on a wide variety of other targets.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add_invalidate): New function,
extracted from...
(reload_cse_move2add): Call reload_cse_move2add_invalidate.

gcc/testsuite
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test

[aarch64] Improve code-gen for vector initialization with single constant element.

gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Tweak condition
if (n_var == n_elts && n_elts <= 16) to allow a single constant,
and if maxv == 1, use constant element for duplicating into register.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vec-init-single-const.c: New test.
* gcc.target/aarch64/vec-init-single-const-be.c: Likewise.
* gcc.target/aarch64/vec-init-single-const-2.c: Likewise.

OpenMP: Cleanups related to the 'present' modifier

Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.

Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.

gcc/ChangeLog:

* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.

include/ChangeLog:

* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.

libgomp/ChangeLog:

* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ALLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.

Add some overrides.

PR tree-optimization/110205
* range-op-float.cc (range_operator::fold_range): Add default FII
fold routine.
* range-op-mixed.h (class operator_gt): Add missing final overrides.
* range-op.cc (range_op_handler::fold_range): Add RO_FII case.
(operator_lshift ::update_bitmask): Add final override.
(operator_rshift ::update_bitmask): Add final override.
* range-op.h (range_operator::fold_range): Add FII prototype.

Provide interface for non-standard operators.

This removes the hack introduced for WIDEN_MULT, which exported a pointer
to the operator so that gimple-range-op.cc could set the operator to this
pointer when it was appropriate.

Instead, we simply change the range-op table to be unsigned indexed,
and add new opcodes to the end of the table, allowing them to be indexed
directly via range_op_handler::range_op.
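The indexing scheme can be sketched in C as follows. All demo_ and DEMO_ names are hypothetical and the table is reduced to four entries; the sketch only shows extra opcodes numbered after the last tree code and looked up through one unsigned index.

```c
#include <assert.h>

/* Hypothetical stand-in for the tree codes; the real enumeration is
   far larger.  */
enum demo_tree_code { DEMO_PLUS_EXPR, DEMO_MULT_EXPR, DEMO_MAX_TREE_CODES };

/* Non-standard opcodes continue the numbering after the last tree code.  */
enum demo_extra_code
{
  DEMO_OP_WIDEN_MULT_SIGNED = DEMO_MAX_TREE_CODES,
  DEMO_OP_WIDEN_MULT_UNSIGNED,
  DEMO_TABLE_SIZE
};

struct demo_operator { const char *name; };

static const struct demo_operator demo_plus = { "plus" };
static const struct demo_operator demo_mult = { "mult" };
static const struct demo_operator demo_widen_mult_signed
  = { "widen_mult_signed" };
static const struct demo_operator demo_widen_mult_unsigned
  = { "widen_mult_unsigned" };

/* One table serves both the standard and the non-standard opcodes.  */
static const struct demo_operator *const demo_table[DEMO_TABLE_SIZE] = {
  [DEMO_PLUS_EXPR] = &demo_plus,
  [DEMO_MULT_EXPR] = &demo_mult,
  [DEMO_OP_WIDEN_MULT_SIGNED] = &demo_widen_mult_signed,
  [DEMO_OP_WIDEN_MULT_UNSIGNED] = &demo_widen_mult_unsigned,
};

/* Look up an operator by its unsigned opcode.  */
const struct demo_operator *
demo_lookup (unsigned code)
{
  assert (code < (unsigned) DEMO_TABLE_SIZE);
  return demo_table[code];
}
```

Because the extra opcodes are ordinary indices, no exported operator pointers are needed to reach them.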

* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
Use range_op_handler directly.
* range-op.cc (range_op_handler::range_op_handler): Unsigned
param instead of tree-code.
(ptr_op_widen_plus_signed): Delete.
(ptr_op_widen_plus_unsigned): Delete.
(ptr_op_widen_mult_signed): Delete.
(ptr_op_widen_mult_unsigned): Delete.
(range_op_table::initialize_integral_ops): Add new opcodes.
* range-op.h (range_op_handler): Use unsigned.
(OP_WIDEN_MULT_SIGNED): New.
(OP_WIDEN_MULT_UNSIGNED): New.
(OP_WIDEN_PLUS_SIGNED): New.
(OP_WIDEN_PLUS_UNSIGNED): New.
(RANGE_OP_TABLE_SIZE): New.
(range_op_table::operator []): Use unsigned.
(range_op_table::set): Use unsigned.
(m_range_tree): Make unsigned.
(ptr_op_widen_mult_signed): Remove.
(ptr_op_widen_mult_unsigned): Remove.
(ptr_op_widen_plus_signed): Remove.
(ptr_op_widen_plus_unsigned): Remove.

Provide a default range_operator via range_op_handler.

range_op_handler now provides a default range_operator for any opcode,
so there is no longer a need to check for a valid operator.
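A minimal C sketch of the idea, with hypothetical demo_ names and a fold signature invented for illustration: the lookup never returns NULL, and validity is decided by comparing against the default operator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_operator
{
  /* Returns true and sets *result if the operation could be folded.  */
  bool (*fold) (int lhs, int rhs, int *result);
};

/* The default operator folds nothing, so callers can invoke it safely.  */
static bool
demo_default_fold (int lhs, int rhs, int *result)
{
  (void) lhs;
  (void) rhs;
  (void) result;
  return false;
}

static bool
demo_plus_fold (int lhs, int rhs, int *result)
{
  *result = lhs + rhs;
  return true;
}

static const struct demo_operator demo_default_operator = { demo_default_fold };
static const struct demo_operator demo_plus_operator = { demo_plus_fold };

enum { DEMO_PLUS, DEMO_UNKNOWN };

/* Every opcode maps to some operator; unknown ones get the default.  */
const struct demo_operator *
demo_handler (unsigned code)
{
  return code == DEMO_PLUS ? &demo_plus_operator : &demo_default_operator;
}

/* An operator is "valid" when it is not the default one.  */
bool
demo_handler_valid (const struct demo_operator *op)
{
  assert (op != NULL);
  return op != &demo_default_operator;
}
```

Calling fold through the handler for an unknown opcode simply reports failure, so an explicit validity check before the call becomes optional.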

* gimple-range-op.cc (gimple_range_op_handler): Set m_operator
manually as there is no access to the default operator.
(cfn_copysign::fold_range): Don't check for validity.
(cfn_ubsan::fold_range): Ditto.
(gimple_range_op_handler::maybe_builtin_call): Don't set to NULL.
* range-op.cc (default_operator): New.
(range_op_handler::range_op_handler): Use default_operator
instead of NULL.
(range_op_handler::operator bool): Move from header, compare
against default operator.
(range_op_handler::range_op): New.
* range-op.h (range_op_handler::operator bool): Move.

Switch from unified table to range_op_table. There can be only one.

Now that there is only a single range_op_table, make the base table the
only table.

* range-op.cc (unified_table): Delete.
(range_op_table operator_table): Instantiate.
(range_op_table::range_op_table): Rename from unified_table.
(range_op_handler::range_op_handler): Use range_op_table.
* range-op.h (range_op_table::operator []): Inline.
(range_op_table::set): Inline.

Remove type from range_op_handler table selection

With the unified table complete, we no longer need to specify a type
to choose a table when setting a range_op_handler.

* gimple-range-gori.cc (gori_compute::condexpr_adjust): Do not
pass type.
* gimple-range-op.cc (get_code): Rename from get_code_and_type
and simplify.
(gimple_range_op_handler::supported_p): No need for type.
(gimple_range_op_handler::gimple_range_op_handler): Ditto.
(cfn_copysign::fold_range): Ditto.
(cfn_ubsan::fold_range): Ditto.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Ditto.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Ditto.
* range-op-float.cc (operator_plus::op1_range): Ditto.
(operator_mult::op1_range): Ditto.
(range_op_float_tests): Ditto.
* range-op.cc (get_op_handler): Remove.
(range_op_handler::set_op_handler): Remove.
(operator_plus::op1_range): No need for type.
(operator_minus::op1_range): Ditto.
(operator_mult::op1_range): Ditto.
(operator_exact_divide::op1_range): Ditto.
(operator_cast::op1_range): Ditto.
(operator_bitwise_not::fold_range): Ditto.
(operator_negate::fold_range): Ditto.
* range-op.h (range_op_handler::range_op_handler): Remove type param.
(range_cast): No need for type.
(range_op_table::operator[]): Check for enum_code >= 0.
* tree-data-ref.cc (compute_distributive_range): No need for type.
* tree-ssa-loop-unswitch.cc (unswitch_predicate): Ditto.
* value-query.cc (range_query::get_tree_range): Ditto.
* value-relation.cc (relation_oracle::validate_relation): Ditto.
* vr-values.cc (range_of_var_in_loop): Ditto.
(simplify_using_ranges::fold_cond_with_ops): Ditto.

Add a hybrid MAX_EXPR operator for integer and pointer.

This adds an operator to the unified table for MAX_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.
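The dispatch can be sketched in C. Everything here is hypothetical: the demo_ names, the [lo, hi] range representation, and in particular the assumption that the pointer variant tracks only nullness; the sketch shows the type-based selection, not the actual hybrid_max_operator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum demo_kind { DEMO_INTEGER, DEMO_POINTER };

/* A closed range [lo, hi] of unsigned values.  */
struct demo_range { uint64_t lo, hi; };

/* Integer MAX: the result lies between the larger of the lower bounds
   and the larger of the upper bounds.  */
static struct demo_range
demo_int_max (struct demo_range a, struct demo_range b)
{
  struct demo_range r = { a.lo > b.lo ? a.lo : b.lo,
                          a.hi > b.hi ? a.hi : b.hi };
  return r;
}

static bool demo_is_null (struct demo_range r) { return r.hi == 0; }
static bool demo_is_nonnull (struct demo_range r) { return r.lo > 0; }

/* Pointer MAX (assumed to track nullness only): non-null if both are
   non-null, null if both are null, otherwise anything.  */
static struct demo_range
demo_ptr_max (struct demo_range a, struct demo_range b)
{
  const struct demo_range null_r = { 0, 0 };
  const struct demo_range nonnull_r = { 1, UINT64_MAX };
  const struct demo_range varying_r = { 0, UINT64_MAX };
  if (demo_is_nonnull (a) && demo_is_nonnull (b))
    return nonnull_r;
  if (demo_is_null (a) && demo_is_null (b))
    return null_r;
  return varying_r;
}

/* The hybrid entry point selects the variant from the type kind.  */
struct demo_range
demo_hybrid_max (enum demo_kind kind, struct demo_range a, struct demo_range b)
{
  assert (a.lo <= a.hi && b.lo <= b.hi);
  return kind == DEMO_POINTER ? demo_ptr_max (a, b) : demo_int_max (a, b);
}
```

A single table entry for MAX_EXPR can then serve both kinds of operand, which is what allows the separate pointer table to go away.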

This also removes the pointer table which is no longer needed.

* range-op-mixed.h (operator_max): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove MAX_EXPR.
(pointer_table::pointer_table): Remove.
(class hybrid_max_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_max_operator.
* range-op.cc (pointer_tree_table): Remove.
(unified_table::unified_table): Comment out MAX_EXPR.
(get_op_handler): Remove check of pointer table.
* range-op.h (class pointer_table): Remove.

Add a hybrid MIN_EXPR operator for integer and pointer.

This adds an operator to the unified table for MIN_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_min): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove MIN_EXPR.
(class hybrid_min_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_min_operator.
* range-op.cc (unified_table::unified_table): Comment out MIN_EXPR.

Add a hybrid BIT_IOR_EXPR operator for integer and pointer.

This adds an operator to the unified table for BIT_IOR_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_bitwise_or): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove BIT_IOR_EXPR.
(class hybrid_or_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_or_operator.
* range-op.cc (unified_table::unified_table): Comment out BIT_IOR_EXPR.

Add a hybrid BIT_AND_EXPR operator for integer and pointer.

This adds an operator to the unified table for BIT_AND_EXPR which will
select either the pointer or integer version based on the type passed
to the method. This is for use until we have a separate PRANGE class.

* range-op-mixed.h (operator_bitwise_and): Remove final.
* range-op-ptr.cc (pointer_table::pointer_table): Remove BIT_AND_EXPR.
(class hybrid_and_operator): New.
(range_op_table::initialize_pointer_ops): Add hybrid_and_operator.
* range-op.cc (unified_table::unified_table): Comment out BIT_AND_EXPR.

Split pointer based range operators to range-op-ptr.cc

Move the pointer table and all pointer specific operators into a
new file for pointers.

* Makefile.in (OBJS): Add range-op-ptr.o.
* range-op-mixed.h (update_known_bitmask): Move prototype here.
(minus_op1_op2_relation_effect): Move prototype here.
(wi_includes_zero_p): Move function to here.
(wi_zero_p): Ditto.
* range-op.cc (update_known_bitmask): Remove static.
(wi_includes_zero_p): Move to header.
(wi_zero_p): Move to header.
(minus_op1_op2_relation_effect): Remove static.
(operator_pointer_diff): Move class and routines to range-op-ptr.cc.
(pointer_plus_operator): Ditto.
(pointer_min_max_operator): Ditto.
(pointer_and_operator): Ditto.
(pointer_or_operator): Ditto.
(pointer_table): Ditto.
(range_op_table::initialize_pointer_ops): Ditto.
* range-op-ptr.cc: New.

Move operator_max to the unified range-op table.

Also remove the integral table.

* range-op-mixed.h (class operator_max): Move from...
* range-op.cc (unified_table::unified_table): Add MAX_EXPR.
(get_op_handler): Remove the integral table.
(class operator_max): Move from here.
(integral_table::integral_table): Delete.
* range-op.h (class integral_table): Delete.

Move operator_min to the unified range-op table.

* range-op-mixed.h (class operator_min): Move from...
* range-op.cc (unified_table::unified_table): Add MIN_EXPR.
(class operator_min): Move from here.
(integral_table::integral_table): Remove MIN_EXPR.

Move operator_bitwise_or to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_or): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_IOR_EXPR.
(class operator_bitwise_or): Move from here.
(integral_table::integral_table): Remove BIT_IOR_EXPR.

Move operator_bitwise_and to the unified range-op table.

At this point, the remaining 4 integral operations have different
implementations than pointers, so we now check for a pointer table
entry first, then if there is nothing, look at the unified table.

* range-op-mixed.h (class operator_bitwise_and): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_AND_EXPR.
(get_op_handler): Check for a pointer table entry first.
(class operator_bitwise_and): Move from here.
(integral_table::integral_table): Remove BIT_AND_EXPR.

Move operator_bitwise_xor to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_xor): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_XOR_EXPR.
(class operator_bitwise_xor): Move from here.
(integral_table::integral_table): Remove BIT_XOR_EXPR.
(pointer_table::pointer_table): Remove BIT_XOR_EXPR.

Move operator_bitwise_not to the unified range-op table.

* range-op-mixed.h (class operator_bitwise_not): Move from...
* range-op.cc (unified_table::unified_table): Add BIT_NOT_EXPR.
(class operator_bitwise_not): Move from here.
(integral_table::integral_table): Remove BIT_NOT_EXPR.
(pointer_table::pointer_table): Remove BIT_NOT_EXPR.

Move operator_addr_expr to the unified range-op table.

* range-op-mixed.h (class operator_addr_expr): Move from...
* range-op.cc (unified_table::unified_table): Add ADDR_EXPR.
(class operator_addr_expr): Move from here.
(integral_table::integral_table): Remove ADDR_EXPR.
(pointer_table::pointer_table): Remove ADDR_EXPR.

PR modula2/110126 variables are reported as unused when referenced by ASM fix

This patch fixes the trash list of the asm statement. It introduces a
separate build procedure for trashed elements.

gcc/m2/ChangeLog:

PR modula2/110126
* gm2-compiler/M2Quads.def (BuildAsmElement): Remove
trash parameter.
(BuildAsmTrash): New procedure.
* gm2-compiler/M2Quads.mod (BuildAsmTrash): New procedure.
(BuildAsmElement): Remove trash parameter.
* gm2-compiler/P3Build.bnf (AsmTrashList): Rewrite.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

RISC-V: Fix one potential test failure for RVV vsetvl

The test fails when the testsuite is run in parallel with the command below.
The failure comes from one missed "Oz" option when checking vsetvl.

make -j $(nproc) report RUNTESTFLAGS="rvv.exp riscv.exp"

For some reason, this failure cannot be reproduced with RUNTESTFLAGS="rvv.exp"
alone or with make without the -j option. We would like to fix it now and find
the root cause later.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust test checking.

RISC-V: Support RVV FP16 MISC vget/vset intrinsic API

This patch supports the intrinsic API of FP16 ZVFHMIN vget/vset. From
the user's perspective, it is reasonable to do some get/set operations
for the vfloat16*_t types when only ZVFHMIN is enabled.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-types.def
(vfloat16m1_t): Add type to lmul1 ops.
(vfloat16m2_t): Likewise.
(vfloat16m4_t): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add new test cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Likewise.

Fix disambiguation against .MASK_STORE

Alias analysis was treating .MASK_STORE as storing a full vector
which means we disambiguate against decls smaller than the vector size.
That's of course wrong and a similar issue was fixed for DSE already.
The following makes sure we set the size of the access to unknown
and only constrain max_size.

This fixes runtime execution FAILs of gfortran.dg/matmul_2.f90,
gfortran.dg/matmul_6.f90 and gfortran.dg/pr91577.f90 when using
AVX512 with full masked loop vectorization on Zen4.

* tree-ssa-alias.cc (call_may_clobber_ref_p_1): For
.MASK_STORE and friends set the size of the access to
unknown.

Remove DEFAULT_MATCHPD_PARTITIONS macro

As Jakub pointed out, DEFAULT_MATCHPD_PARTITIONS
is now unused and can be removed.

gcc/ChangeLog:

* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Remove DEFAULT_MATCHPD_PARTITIONS.

RISC-V: Add RVV narrow shift right lowering auto-vectorization

Optimize the following auto-vectorization codes:
void foo (int16_t * __restrict a, int32_t * __restrict b, int32_t c, int n)
{
    for (int i = 0; i < n; i++)
      a[i] = b[i] >> c;
}

Before this patch:
foo:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vsetvli a4,zero,e32,m1,ta,ma
        vsra.vx v1,v1,a2
        vsetvli zero,zero,e16,mf2,ta,ma
        slli    a7,a5,2
        vncvt.x.x.w     v1,v1
        slli    a6,a5,1
        vsetvli zero,a5,e16,mf2,ta,ma
        sub     a3,a3,a5
        vse16.v v1,0(a0)
        add     a1,a1,a7
        add     a0,a0,a6
        bne     a3,zero,.L3
.L5:
        ret

After this patch:
foo:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vsetvli a7,zero,e16,mf2,ta,ma
        slli    a6,a5,2
        vnsra.wx        v1,v1,a2
        slli    a4,a5,1
        vsetvli zero,a5,e16,mf2,ta,ma
        sub     a3,a3,a5
        vse16.v v1,0(a0)
        add     a1,a1,a6
        add     a0,a0,a4
        bne     a3,zero,.L3
.L5:
        ret

gcc/ChangeLog:

* config/riscv/autovec-opt.md
(*v<any_shiftrt:optab><any_extend:optab>trunc<mode>): New pattern.
(*<any_shiftrt:optab>trunc<mode>): Ditto.
* config/riscv/autovec.md (<optab><mode>3): Change to
define_insn_and_split.
(v<optab><mode>3): Ditto.
(trunc<mode><v_double_trunc>2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/narrow_run-3.c: New test.

simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

This patch implements RTL constant-folding for the SS_TRUNCATE and US_TRUNCATE codes.
The semantics are a clamping operation on the argument with the min and max of the narrow mode,
followed by a truncation. The signedness of the clamp and the min/max extrema is derived from
the signedness of the saturating operation.
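
As an illustration of the semantics described above, here is a minimal scalar
sketch in C. It is not the simplify-rtx.cc implementation; the function names
and the "bits" parameter (the precision of the narrow mode) are assumptions
made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating truncation: clamp x to the signed range of a
   narrow mode with `bits` bits, after which truncation is lossless.  */
int64_t ss_truncate(int64_t x, unsigned bits)
{
    assert(bits >= 1 && bits < 64);
    int64_t max = (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
    int64_t min = -max - 1;
    if (x > max)
        return max;
    if (x < min)
        return min;
    return x;
}

/* Unsigned saturating truncation: clamp x to the unsigned range of a
   narrow mode with `bits` bits.  */
uint64_t us_truncate(uint64_t x, unsigned bits)
{
    assert(bits >= 1 && bits < 64);
    uint64_t max = (UINT64_C(1) << bits) - 1;
    return x > max ? max : x;
}
```

With a 16-bit narrow mode, ss_truncate (70000, 16) saturates to 32767 and
us_truncate (70000, 16) saturates to 65535, while in-range values pass
through unchanged.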

We have a number of instructions in aarch64 that use SS_TRUNCATE and US_TRUNCATE to represent
their operations and we have pretty thorough runtime tests in gcc.target/aarch64/advsimd-intrinsics/vqmovn*.c.
With this patch the instructions are folded away at optimisation levels and the correctness checks still
pass.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

* simplify-rtx.cc (simplify_const_unary_operation):
Handle US_TRUNCATE, SS_TRUNCATE.

RISC-V: Add ZVFHMIN block autovec testcase

To be safe, add a ZVFHMIN autovec block testcase to make sure
we don't enable autovec for ZVFHMIN by mistake.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/zvfhmin-1.c: New test.

Fix oversight in latest change

gcc/
PR modula2/109952
* doc/gm2.texi (Standard procedures): Fix Next link.

Regenerate config.in

Looks like I forgot to regenerate config.in which
causes updates when you enable maintainer mode.

gcc/ChangeLog:

* config.in: Regenerate.

vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]

This patch fixes an issue introduced by
g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was being passed
to vect_widened_op_tree when no subtype was to be used. This led to an
erroneous use of IFN_VEC_WIDEN_MINUS.

gcc/ChangeLog:

PR middle-end/110142
* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Don't pass
subtype to vect_widened_op_tree and remove subtype parameter, also
remove superfluous overloaded function definition.
(vect_recog_widen_plus_pattern): Remove subtype parameter and don't pass
it to the call to vect_recog_widen_op_pattern.
(vect_recog_widen_minus_pattern): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr110142.c: New test.

Add missing vec_pack/unpacks patterns for _Float16 <-> int/float conversion.

This patch only supports optabs for vector modes whose length is >= 128 bits.
For 32/64-bit vectors, they are mostly handled by the BB vectorizer with
truncmn2/extendmn2/fix{,uns}_truncmn2.

gcc/ChangeLog:

* config/i386/sse.md (vec_pack<floatprefix>_float_<mode>): New expander.
(vec_unpack_<fixprefix>fix_trunc_lo_<mode>): Ditto.
(vec_unpack_<fixprefix>fix_trunc_hi_<mode>): Ditto.
(vec_unpacks_lo_<mode>): Ditto.
(vec_unpacks_hi_<mode>): Ditto.
(sse_movlhps_<mode>): New define_insn.
(ssse3_palignr<mode>_perm): Extend to V_128H.
(V_128H): New mode iterator.
(ssepackPHmode): New mode attribute.
(vunpck_extract_mode): Ditto.
(vpckfloat_concat_mode): Extend to VxSI/VxSF for _Float16.
(vpckfloat_temp_mode): Ditto.
(vpckfloat_op_mode): Ditto.
(vunpckfixt_mode): Extend to VxHF.
(vunpckfixt_model): Ditto.
(vunpckfixt_extract_mode): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/vec_pack_fp16-1.c: New test.
* gcc.target/i386/vec_pack_fp16-2.c: New test.
* gcc.target/i386/vec_pack_fp16-3.c: New test.

middle-end/110200 - genmatch force-leaf and convert interaction

The following fixes GENERIC code generation for (convert! ...)
which currently generates

  if (TREE_TYPE (_o1[0]) != type)
    _r1 = fold_build1_loc (loc, NOP_EXPR, type, _o1[0]);
    if (EXPR_P (_r1))
      goto next_after_fail867;
  else
    _r1 = _o1[0];

where obviously braces are missing.
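
To make the effect concrete, here is a small C model of how the unbraced
code is actually parsed versus the intended braced form. The function names,
the flags and the enum values are hypothetical stand-ins for the generated
code, chosen only for this sketch.

```c
#include <assert.h>

enum { KEPT_ORIGINAL = 0, CONVERTED = 1, FAILED = -1 };

/* The unbraced code as the compiler parses it: the `else` pairs with
   the nearest `if`, i.e. the EXPR_P-style check, not the type check.  */
int unbraced(int type_differs, int converted_is_expr)
{
    /* A conversion result can only be an expression if one was built.  */
    assert(type_differs || !converted_is_expr);

    int r = KEPT_ORIGINAL;
    if (type_differs)
        r = CONVERTED;
    if (converted_is_expr)
        return FAILED;
    else
        r = KEPT_ORIGINAL;      /* silently discards the conversion */
    return r;
}

/* The intended logic, with braces around the if arm.  */
int braced(int type_differs, int converted_is_expr)
{
    assert(type_differs || !converted_is_expr);

    int r;
    if (type_differs) {
        r = CONVERTED;
        if (converted_is_expr)
            return FAILED;
    } else {
        r = KEPT_ORIGINAL;
    }
    return r;
}
```

In this model the two variants differ exactly when the types differ and the
converted value is not an expression: the unbraced form drops the conversion.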

PR middle-end/110200
* genmatch.cc (expr::gen_transform): Put braces around
the if arm for the (convert ...) short-cut.

c++: build initializer_list<string> in a loop [PR105838]

I previously applied this change in r13-4565 but reverted it due to
PR108071. That PR was then fixed by r13-4712, but I didn't re-apply this
change then because we weren't making the array static; since r14-1500 for
PR110070 we now make the initializer array static, so let's bring this back.

In situations where the maybe_init_list_as_range optimization isn't viable,
we can build an initializer_list<string> with a loop over a constant array
of string literals.

This is represented using a VEC_INIT_EXPR, which required adjusting a couple
of places that expected the initializer array to have the same type as the
target array and fixing build_vec_init not to undo our efforts.

PR c++/105838

gcc/cp/ChangeLog:

* call.cc (convert_like_internal) [ck_list]: Use
maybe_init_list_as_array.
* constexpr.cc (cxx_eval_vec_init_1): Init might have
a different type.
* tree.cc (build_vec_init_elt): Likewise.
* init.cc (build_vec_init): Handle from_array from a
TARGET_EXPR. Retain TARGET_EXPR of a different type.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/initlist-opt5.C: New test.

rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]

As PR109932 shows, builtins __builtin_{un,}pack_vector_int128
should be guarded under vsx rather than power7, as their
corresponding bif patterns have the conditions TARGET_VSX
and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode). This patch moves
__builtin_{un,}pack_vector_int128 to stanza vsx to ensure
they are supported.

PR target/109932

gcc/ChangeLog:

* config/rs6000/rs6000-builtins.def (__builtin_pack_vector_int128,
__builtin_unpack_vector_int128): Move from stanza power7 to vsx.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr109932-1.c: New test.
* gcc.target/powerpc/pr109932-2.c: New test.

rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]

As PR110011 shows, when encoding a 128-bit fp constant into
the toc, we use REAL_VALUE_TO_TARGET_LONG_DOUBLE, which
finds the first float mode with LONG_DOUBLE_TYPE_SIZE
bits of precision; that would be TFmode here. But the
128-bit fp constant can have mode IFmode or KFmode, which
doesn't necessarily have the same underlying float format
as TFmode. As this PR exposes, with option
-mabi=ibmlongdouble TFmode has ibm_extended_format while
KFmode has ieee_quad_format, and mixing up the formats (the
encoding/decoding ways) causes unexpected results.

This patch makes it use the constant's own mode instead
of TFmode for the real_to_target call.

PR target/110011

gcc/ChangeLog:

* config/rs6000/rs6000.cc (output_toc): Use the mode of the 128-bit
floating constant itself for real_to_target call.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr110011.c: New test.

RISC-V: Add test cases for RVV FP16 undefined and vlmul trunc

This patch would like to add more tests for RVV FP16 undef and vlmul
trunc, aka

__riscv_vundefined_f16*();
__riscv_vlmul_trunc_v_f16*_f16*();

From the user's perspective, it is reasonable to do the above operations
when only ZVFHMIN is enabled. This patch would like to add new test
cases to make sure the RVV FP16 vreinterpret works well as expected.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add test cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Ditto.

RISC-V: Support RVV FP16 MISC vlmul ext intrinsic API

This patch supports the intrinsic API of FP16 ZVFHMIN vlmul ext, aka:

vfloat16*_t <==> vfloat16*_t.

From the user's perspective, it is reasonable to do some type conversion
between vfloat16*_t and vfloat16*_t when only ZVFHMIN is enabled.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-types.def
(vfloat16mf4_t): Add type to X2/X4/X8/X16/X32 vlmul ext ops.
(vfloat16mf2_t): Ditto.
(vfloat16m1_t): Ditto.
(vfloat16m2_t): Ditto.
(vfloat16m4_t): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add new test cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add new test cases.

aix: Debugging does not require a stack frame.

The rs6000 port has allocated a stack frame when debugging is enabled
on AIX since the earliest versions of the port. Apparently the
earliest versions of the debuggers for AIX had difficulty with stackless
frames.

Both AIX DBX and GDB support stackless frames on AIX, and IBM XLC,
OpenXL and LLVM for AIX do not generate an extraneous stack frame when
debugging is enabled. This patch updates the rs6000 stack info function
to not set the stack frame flag when debugging is enabled for AIX.

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_stack_info):
Do not require a stack frame when debugging is enabled for AIX.

Signed-off-by: David Edelsohn <dje.gcc@gmail.com>

Daily bump.

c++: unsynthesized defaulted constexpr fn [PR110122]

In this other testcase from PR110122, during regeneration of the generic
lambda with V=Bar{}, substitution followed by coerce_template_parms for
A<V>'s template argument naturally yields a copy of V in terms of Bar's
(implicitly) defaulted copy constructor.

This however happens inside a template context so although we introduced
a use of the copy constructor, mark_used didn't actually synthesize it,
which causes subsequent constant evaluation of the template argument to
fail with:

  nontype-class59.C: In instantiation of ‘void f() [with Bar V = Bar{Foo()}]’:
  nontype-class59.C:22:11:   required from here
  nontype-class59.C:18:18: error: ‘constexpr Bar::Bar(const Bar&)’ used before its definition

We already make sure to instantiate templated constexpr functions needed
for constant evaluation (as per P0859R0).  So this patch fixes this by
making us synthesize defaulted constexpr functions needed for constant
evaluation as well.

PR c++/110122

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Synthesize defaulted
functions needed for constant evaluation.
(instantiate_cx_fn_r): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class59.C: New test.

c++: extend lookup_template_class shortcut [PR110122]

Here when substituting the injected class name A during regeneration of
the lambda, we find ourselves in lookup_template_class for A<V> with
V=_ZTAXtl3BarEE (i.e. the template parameter object for Foo{}).  The call
to coerce_template_parms within then undesirably tries to make a copy of
this class NTTP argument, which fails because Foo is not copyable.  But it
seems clear that this testcase shouldn't require copyability of Foo.

lookup_template_class has a shortcut for looking up the current class
scope, which would avoid the problematic coerce_template_parms call, but
the shortcut doesn't trigger because it only considers the innermost
class scope which in this case is the lambda type.  So this patch fixes
this by extending the lookup_template_class shortcut to consider outer
class scopes too (and skipping over lambda types since they are never
specialized from lookup_template_class).  We also need to avoid calling
coerce_template_parms when specializing a templated non-template nested
class for the first time (such as A::B in the testcase).  Coercion should
be unnecessary there because the innermost arguments belong to the context
and so should have already been coerced.

PR c++/110122

gcc/cp/ChangeLog:

* pt.cc (lookup_template_class): Extend shortcut for looking up the
current class scope to consider outer class scopes too, and use
current_nonlambda_class_type instead of current_class_type.  Only
call coerce_template_parms when specializing a primary template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class57.C: New test.
* g++.dg/cpp2a/nontype-class58.C: New test.

libgfortran: remove support for --enable-intermodule

libgfortran/

PR libfortran/109373
* configure.ac: Remove support for --enable-intermodule.
* Makefile.am: Remove onestep path.
* configure: Regenerate.
* Makefile.in: Regenerate.

Use canonical form for reversed single-bit insertions after reload.

We now split almost all insns after reload in order to add clobber of REG_CC.
If insns are coming from insn combiner and there is no canonical form for
the respective arithmetic (like for reversed bit insertions), there is
no need to keep all these different representations after reload:
Instead of splitting such patterns to their clobber-REG_CC-analogon, we can
split to a canonical representation, which is insv_notbit for the present case.
This is a no-op change.
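
As a rough illustration of why several representations exist for the same
operation, the following C sketch shows a "reversed" (inverted) single-bit
insertion together with three equivalent ways to compute the inverted bit.
All names are hypothetical; this models the arithmetic only and is not the
AVR backend code or its RTL.

```c
#include <assert.h>
#include <stdint.h>

/* Three equivalent ways to compute the inverted bit `bit` of src.  */
unsigned notbit_via_not_shift(uint8_t src, unsigned bit)
{
    assert(bit < 8);
    return ~((unsigned)src >> bit) & 1u;
}

unsigned notbit_via_xor(uint8_t src, unsigned bit)
{
    assert(bit < 8);
    return (((unsigned)src >> bit) ^ 1u) & 1u;
}

unsigned notbit_via_compare(uint8_t src, unsigned bit)
{
    assert(bit < 8);
    return ((unsigned)src & (1u << bit)) == 0;
}

/* One canonical form: copy the inverted bit `src_bit` of src into
   bit `dst_bit` of dst, leaving all other bits of dst unchanged.  */
uint8_t insert_notbit(uint8_t dst, unsigned dst_bit, uint8_t src, unsigned src_bit)
{
    assert(dst_bit < 8 && src_bit < 8);
    unsigned inverted = notbit_via_xor(src, src_bit);
    return (uint8_t)(((unsigned)dst & ~(1u << dst_bit)) | (inverted << dst_bit));
}
```

Since all three helpers agree for every input, nothing is lost by keeping
just one representation once the combiner has done its work.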

gcc/
* config/avr/avr.md (adjust_len) [insv_notbit_0, insv_notbit_7]:
Remove attribute values.
(insv_notbit): New post-reload insn.
(*insv.not-shiftrt_split, *insv.xor1-bit.0_split)
(*insv.not-bit.0_split, *insv.not-bit.7_split)
(*insv.xor-extract_split): Split to insv_notbit.
(*insv.not-shiftrt, *insv.xor1-bit.0, *insv.not-bit.0, *insv.not-bit.7)
(*insv.xor-extract): Remove post-reload insns.
* config/avr/avr.cc (avr_out_insert_notbit) [bitno]: Remove parameter.
(avr_adjust_insn_length): Adjust call of avr_out_insert_notbit.
[ADJUST_LEN_INSV_NOTBIT_0, ADJUST_LEN_INSV_NOTBIT_7]: Remove cases.
* config/avr/avr-protos.h (avr_out_insert_notbit): Adjust prototype.

target/109907: Overhaul bit extractions.

o Logical right shift that shifts the MSB to position 0 can be performed in
  such a way that the input operand constraint can be relaxed from "0" to "r".
  This results in less register pressure.  Moreover, no scratch register is
  required in that case.

o The deprecated "extzv" pattern is replaced by "extzv<mode>" that allows
  inputs of scalar integer modes of different sizes (1 up to 4 bytes).

o Existing patterns are adjusted to the more generic "extzv<mode>" pattern.
  Some patterns are added as the middle-end has been reworked to spot
  more bit-extraction opportunities.

o A C function is used to print the asm for bit extractions, which is more
  convenient for complex output logic.
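
To illustrate the first item: a logical right shift that moves the MSB to
position 0 depends on a single input bit only, so it is equivalent to a bit
test. The C sketch below shows that equivalence for operands of 1 to 4 bytes;
the function names and the "bytes" parameter are assumptions made for this
example and do not correspond to backend code.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift by the MSB position of a `bytes`-byte operand.  */
uint32_t extract_msb(uint32_t x, unsigned bytes)
{
    assert(bytes >= 1 && bytes <= 4);
    unsigned msb = 8u * bytes - 1u;
    return (x >> msb) & 1u;
}

/* The same result computed as a test of one bit.  */
uint32_t test_msb(uint32_t x, unsigned bytes)
{
    assert(bytes >= 1 && bytes <= 4);
    uint32_t mask = UINT32_C(1) << (8u * bytes - 1u);
    return (x & mask) != 0;
}
```

Both functions return 1 for 0x80 with bytes = 1 and for 0x800000 with
bytes = 3, and 0 when the MSB of the operand is clear.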

The generated code is still not optimal because RTL optimizers might still
prefer arithmetic like shift over bit-extractions.  For test cases see
also PR36884 and PR55181.

gcc/
PR target/109907
* config/avr/avr.md (adjust_len) [extr, extr_not]: New elements.
(MSB, SIZE): New mode attributes.
(any_shift): New code iterator.
(*lshr<mode>3_split, *lshr<mode>3, lshr<mode>3)
(*lshr<mode>3_const_split): Add constraint alternative for
the case of shift-offset = MSB.  Ditch "length" attribute.
(extzv<mode>): New.  Replaces extzv.  Adjust following patterns.
Use avr_out_extr, avr_out_extr_not to print asm.
(*extzv.subreg.<mode>, *extzv.<mode>.subreg, *extzv.xor)
(*extzv<mode>.ge, *neg.ashiftrt<mode>.msb, *extzv.io.lsr7): New.
* config/avr/constraints.md (C15, C23, C31, Yil): New.
* config/avr/predicates.md (reg_or_low_io_operand)
(const7_operand, reg_or_low_io_operand)
(const15_operand, const_0_to_15_operand)
(const23_operand, const_0_to_23_operand)
(const31_operand, const_0_to_31_operand): New.
* config/avr/avr-protos.h (avr_out_extr, avr_out_extr_not): New.
* config/avr/avr.cc (avr_out_extr, avr_out_extr_not): New funcs.
(lshrqi3_out, lshrhi3_out, lshrpsi3_out, lshrsi3_out): Adjust
MSB case to new insn constraint "r" for operands[1].
(avr_adjust_insn_length) [ADJUST_LEN_EXTR_NOT, ADJUST_LEN_EXTR]:
Handle these cases.
(avr_rtx_costs_1): Adjust cost for a new pattern.
gcc/testsuite/
PR target/109907
* gcc.target/avr/pr109907.c: New test.
* gcc.target/avr/torture/pr109907-1.c: New test.
* gcc.target/avr/torture/pr109907-2.c: New test.

RISC-V: Rework Phase 5 && Phase 6 of VSETVL PASS

Address comments from Jeff.

This patch reworks Phase 5 && Phase 6 of the VSETVL PASS, since Phase 5 && Phase 6
are quite messy and cause some bugs discovered by my downstream auto-vectorization
test-generator.

Before this patch:

Phase 5 is cleanup_insns, the function that removes the AVL operand dependency from each RVV instruction.
E.g. vadd.vv (use a5), after Phase 5, ====> vadd.vv (use const_int 0). Since "a5" is used in "vsetvl" instructions,
once the correct "vsetvl" instructions are inserted, each RVV instruction doesn't need the AVL operand "a5" anymore.
Removing this operand dependency helps the following scheduling PASS.

Phase 6 is propagate_avl, which does the following 2 things:
1. Local && Global user vsetvl instruction optimization.
   E.g.
      vsetvli a2, a2, e8, mf8   ======> Change it into vsetvli a2, a2, e32, mf2
      vsetvli zero,a2, e32, mf2  ======> eliminate
2. Optimize user vsetvl from "vsetvl a2,a2" into "vsetvl zero,a2" if "a2" is not used by any instructions.
Since Phase 1 ~ Phase 4 insert "vsetvli" instructions based on LCM, which changes the CFG, Phase 6 re-creates the
RTL_SSA framework (which is more expensive than just using DF) and optimizes user vsetvli based on the new RTL_SSA.

There are 2 issues in Phase 5 && Phase 6:
1. local_eliminate_vsetvl_insn, introduced by @kito, does local user vsetvl optimization better than
   Phase 6 does, and it doesn't need to re-create the RTL_SSA framework. So the local user vsetvli instruction optimization
   in Phase 6 is redundant and should be removed.
2. A bug discovered by my downstream auto-vectorization test-generator (I can't put the test in this patch since we are missing the autovec
   patterns for it, so upstream GCC can't directly reproduce the issue, but I will remember to add it once I support the
   necessary autovec patterns). The bug is caused by re-creating the RTL_SSA framework. The issue is as follows:

Before Phase 6:
   ...
   insn1: vsetvli a3, 17 <========== generated by SELECT_VL auto-vec pattern.
   slli a4,a3,3
   ...
   insn2: vsetvli zero, a3, ...
   load (use const_int 0, before Phase 5, it's using a3, but the use of "a3" is removed in Phase 5)
   ...

In Phase 6, we iterate to insn2, then get the def of "a3", which is insn1.
insn2 is the vsetvli instruction inserted in Phase 4, which is not included in the RTL_SSA framework
even though we re-create it (I didn't look into why and I don't think we need to now).
In this situation, the def_info of insn2 has the information "set->single_nondebug_insn_use ()"
which returns true. Obviously, this information is not correct, since insn1 has at least 2 uses:
1). slli a4,a3,3 2). insn2: vsetvli zero, a3, ... As a result, the test generated by my downstream test-generator
failed its execution test.

Conclusion on the RTL_SSA framework:
Before this patch, we initialize RTL_SSA 2 times. The first is at the beginning of the VSETVL PASS, which is absolutely correct; the other
is the re-creation after Phase 4 (LCM), which has incorrect information that causes bugs.

Besides, initializing RTL_SSA a second time seems wasteful, since we just need to do a little optimization.

Based on all the circumstances described above, I rework and reorganize Phase 5 && Phase 6 as follows:
1. Phase 5 is called ssa_post_optimization, which does the optimization based on the RTL_SSA information (the RTL_SSA is initialized
   at the beginning of the VSETVL PASS, so there is no need to re-create it). This phase includes 3 optimizations:
   1). local_eliminate_vsetvl_insn, which we already have (no change).
   2). global_eliminate_vsetvl_insn ---> new optimization split from the original Phase 6 but with a more powerful and reliable implementation.
      E.g.
      void f(int8_t *base, int8_t *out, size_t vl, size_t m, size_t k) {
        size_t avl;
        if (m > 100)
          avl = __riscv_vsetvl_e16mf4(vl << 4);
        else
          avl = __riscv_vsetvl_e32mf2(vl >> 8);
        for (size_t i = 0; i < m; i++) {
          vint8mf8_t v0 = __riscv_vle8_v_i8mf8(base + i, avl);
          v0 = __riscv_vadd_vv_i8mf8 (v0, v0, avl);
          __riscv_vse8_v_i8mf8(out + i, v0, avl);
        }
      }

      This example failed to get the global user vsetvl optimization before this patch:
      f:
              li      a5,100
              bleu    a3,a5,.L2
              slli    a2,a2,4
              vsetvli a4,a2,e16,mf4,ta,mu
      .L3:
              li      a5,0
              vsetvli zero,a4,e8,mf8,ta,ma
      .L5:
              add     a6,a0,a5
              add     a2,a1,a5
              vle8.v  v1,0(a6)
              addi    a5,a5,1
              vadd.vv v1,v1,v1
              vse8.v  v1,0(a2)
              bgtu    a3,a5,.L5
      .L10:
              ret
      .L2:
              beq     a3,zero,.L10
              srli    a2,a2,8
              vsetvli a4,a2,e32,mf2,ta,mu
              j       .L3
      With this patch:
      f:
              li      a5,100
              bleu    a3,a5,.L2
              slli    a2,a2,4
              vsetvli zero,a2,e8,mf8,ta,ma
      .L3:
              li      a5,0
      .L5:
              add     a6,a0,a5
              add     a2,a1,a5
              vle8.v  v1,0(a6)
              addi    a5,a5,1
              vadd.vv v1,v1,v1
              vse8.v  v1,0(a2)
              bgtu    a3,a5,.L5
      .L10:
              ret
      .L2:
              beq     a3,zero,.L10
              srli    a2,a2,8
              vsetvli zero,a2,e8,mf8,ta,ma
              j       .L3

   3). Remove the AVL operand dependency of each RVV instruction.

2. Phase 6 is called df_post_optimization: optimize "vsetvl a3,a2...." into "vsetvl zero,a2...." based on
   dataflow analysis of the new CFG (the new CFG is created by LCM). The reason we need to use the new CFG and run after Phase 5:
   ...
   vsetvl a3, a2...
   vadd.vv (use a3)
   If we didn't have Phase 5, which removes the "a3" use in vadd.vv, we would fail to optimize vsetvl a3,a2 into vsetvl zero,a2.

   This patch passed all tests in rvv.exp with ONLY performance && codegen improvements (no performance decline and no bugs, including in my
   downstream tests).

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (available_occurrence_p): Enhance user vsetvl optimization.
(vector_insn_info::parse_insn): Add rtx_insn parse.
(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance user vsetvl optimization.
(get_first_vsetvl): New function.
(pass_vsetvl::global_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::cleanup_insns): Remove it.
(pass_vsetvl::ssa_post_optimization): New function.
(has_no_uses): Ditto.
(pass_vsetvl::propagate_avl): Remove it.
(pass_vsetvl::df_post_optimization): New function.
(pass_vsetvl::lazy_vsetvl): Rework Phase 5 && Phase 6.
* config/riscv/riscv-vsetvl.h: Adapt declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vsetvl-16.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/vsetvl-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/vsetvl-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: New test.

Daily bump.

Convert ipcp_vr_lattice to type agnostic framework.

This converts the lattice to store ranges in Value_Range instead of
value_range (*) to make it type agnostic, and adjust all users
accordingly.

I've been careful to make sure Value_Range never ends up on GC, since
it contains an int_range_max and can expand on-demand onto the heap.
Longer term storage for ranges should be done with vrange_storage, as
per the previous patch ("Provide an API for ipa_vr").

gcc/ChangeLog:

* ipa-cp.cc (ipcp_vr_lattice::init): Take type argument.
(ipcp_vr_lattice::print): Call dump method.
(ipcp_vr_lattice::meet_with): Adjust for m_vr being a
Value_Range.
(ipcp_vr_lattice::meet_with_1): Make argument a reference.
(ipcp_vr_lattice::set_to_bottom): Set varying for an unsupported
range.
(initialize_node_lattices): Pass type when appropriate.
(ipa_vr_operation_and_type_effects): Make type agnostic.
(ipa_value_range_from_jfunc): Same.
(propagate_vr_across_jump_function): Same.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same.
(evaluate_properties_for_edge): Same.
* ipa-prop.cc (ipa_vr::get_vrange): Same.
(ipcp_update_vr): Same.
* ipa-prop.h (ipa_value_range_from_jfunc): Same.
(ipa_range_set_and_normalize): Same.

testsuite: Cut down 27_io/basic_istream/.../94749.cc for simulators

The test wchar_t/94749.cc can take about 10 minutes on some
simulator/host combinations with char/94749.cc at a third of
that time. The cause is test05 which is quite heavy and
includes wrapping a 32-bit counter. Run it only for native
setups.

* testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc (main)
[! SIMULATOR_TEST]: Also exclude running test05.
* testsuite/27_io/basic_istream/ignore/char/94749.cc: Ditto.

c++: Adjust conversion deduction [PR61663][DR976]

Drop the return type's reference before doing cvqual and related decays.

gcc/cp/
PR c++/61663
* pt.cc (maybe_adjust_types_for_deduction): Implement DR976.
gcc/testsuite/
* g++.dg/template/pr61663.C: New.

target/109650: Fix wrong code after cc0 -> CCmode transition.

This patch fixes a wrong-code bug in the wake of PR92729, the transition that
turned the AVR backend from cc0 to CCmode.  In cc0, the insn that uses cc0 like
a conditional branch always follows the cc0 setter, which is no more the case
with CCmode where set and use of REG_CC might be in different basic blocks.

This patch removes the machine-dependent reorg pass in avr_reorg entirely.

It is replaced by a new, AVR specific mini-pass that runs prior to split2.
Canonicalization of comparisons away from the "difficult" codes GT[U] and LE[U]
is now mostly performed by implementing TARGET_CANONICALIZE_COMPARISON.

Moreover:

* Text peephole conditions get "dead_or_set_regno_p (*, REG_CC)" as needed.

* RTL peephole conditions get "peep2_regno_dead_p (*, REG_CC)" as needed.

* Conditional branches no more clobber REG_CC.

* insn output for compares looks ahead to determine the branch mode in use.
  This needs also "dead_or_set_regno_p (*, REG_CC)".

* Add RTL peepholes for decrement-and-branch detection.

* Some of the patterns like "*cmphi.zero-extend.0" lost their
  combine-ational part with PR92729.  Restore them.

Finally, it fixes some of the many indentation glitches left over from PR92729.

gcc/
PR target/109650
PR target/92729
* config/avr/avr-passes.def (avr_pass_ifelse): Insert new pass.
* config/avr/avr.cc (avr_pass_ifelse): New RTL pass.
(avr_pass_data_ifelse): New pass_data for it.
(make_avr_pass_ifelse, avr_redundant_compare, avr_cbranch_cost)
(avr_canonicalize_comparison, avr_out_plus_set_ZN)
(avr_out_cmp_ext): New functions.
(compare_condition): Make sure REG_CC dies in the branch insn.
(avr_rtx_costs_1): Add computation of cbranch costs.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_ZN, ADJUST_LEN_CMP_ZEXT]
[ADJUST_LEN_CMP_SEXT]: Handle them.
(TARGET_CANONICALIZE_COMPARISON): New define.
(avr_simplify_comparison_p, compare_diff_p, avr_compare_pattern)
(avr_reorg_remove_redundant_compare, avr_reorg): Remove functions.
(TARGET_MACHINE_DEPENDENT_REORG): Remove define.
* config/avr/avr-protos.h (avr_simplify_comparison_p): Remove proto.
(make_avr_pass_ifelse, avr_out_plus_set_ZN, cc_reg_rtx)
(avr_out_cmp_zext): New protos.
* config/avr/avr.md (branch, difficult_branch): Don't split insns.
(*cbranchhi.zero-extend.0, *cbranchhi.zero-extend.1)
(*swapped_tst<mode>, *add.for.eqne.<mode>): New insns.
(*cbranch<mode>4): Rename to cbranch<mode>4_insn.
(define_peephole): Add dead_or_set_regno_p(insn,REG_CC) as needed.
(define_peephole2): Add peep2_regno_dead_p(*,REG_CC) as needed.
Add new RTL peepholes for decrement-and-branch and *swapped_tst<mode>.
Rework signtest-and-branch peepholes for *sbrx_branch<mode>.
(adjust_len) [add_set_ZN, cmp_zext]: New.
(QIPSI): New mode iterator.
(ALLs1, ALLs2, ALLs4, ALLs234): New mode iterators.
(gelt): New code iterator.
(gelt_eqne): New code attribute.
(rvbranch, *rvbranch, difficult_rvbranch, *difficult_rvbranch)
(branch_unspec, *negated_tst<mode>, *reversed_tst<mode>)
(*cmpqi_sign_extend): Remove insns.
(define_c_enum "unspec") [UNSPEC_IDENTITY]: Remove.
* config/avr/avr-dimode.md (cbranch<mode>4): Canonicalize comparisons.
* config/avr/predicates.md (scratch_or_d_register_operand): New.
* config/avr/constraints.md (Yxx): New constraint.

gcc/testsuite/
PR target/109650
* gcc.target/avr/torture/pr109650-1.c: New test.
* gcc.target/avr/torture/pr109650-2.c: New test.

Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

libgfortran/

* ieee/ieee_arithmetic.F90: Add IEEE_MIN_NUM, IEEE_MAX_NUM,
IEEE_MIN_NUM_MAG, and IEEE_MAX_NUM_MAG functions.

gcc/fortran/

* f95-lang.cc (gfc_init_builtin_functions): Add fmax() and
fmin() built-ins, and their variants.
* mathbuiltins.def: Add FMAX and FMIN built-ins.
* trans-intrinsic.cc (conv_intrinsic_ieee_minmax): New function.
(gfc_conv_ieee_arithmetic_function): Handle IEEE_MIN_NUM and
IEEE_MAX_NUM functions.

gcc/testsuite/
* gfortran.dg/ieee/minmax_1.f90: New test.

libatomic: x86_64: Always try ifunc

We used to skip ifunc check when CX16 is available. But now we use
CX16+AVX+Intel/AMD for the "perfect" 16b load implementation, so CX16
alone is not a sufficient reason not to use ifunc (see PR104688).

This causes a subtle and annoying issue: when GCC is built with a
higher -march= setting in CFLAGS_FOR_TARGET, ifunc is disabled and
the worst (locked) implementation of __atomic_load_16 is always used.

There seems to be no good way to check if the CPU is Intel or AMD from
the built-in macros (maybe we can check every known model like __skylake,
__bdver2, ..., but it will be very error-prone and require an update
whenever we add the support for a new x86 model). The best thing we can
do seems to be "always try ifunc" here.

libatomic/ChangeLog:

* configure.tgt: For x86_64, always set try_ifunc=yes.

testsuite: Add more allocation size tests for conjured svalues [PR110014]

This patch adds the reproducers reported in PR 110014 as test cases. The
false positives in those cases are already fixed with PR 109577.

2023-06-09 Tim Lange <mail@tim-lange.me>

PR analyzer/110014

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/realloc-pr110014.c: New tests.

analyzer: Fix allocation size false positive on conjured svalue [PR109577]

Currently, the analyzer tries to prove that the allocation size is a
multiple of the pointee's type size.  This patch reverses the behavior
to try to prove that the expression is not a multiple of the pointee's
type size.  With this change, each unhandled case should be gracefully
considered as correct.  This fixes the bug reported in PR 109577 by
Paul Eggert.
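
The reversal can be pictured with a deliberately simplified C model. It is
not the analyzer's size_visitor; the enum, the function names and the notion
of a size being either known or unknown are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Whether the model knows the concrete allocation size.  */
enum size_fact { SIZE_UNKNOWN, SIZE_KNOWN };

/* Old direction: warn unless the size is proven to be a multiple of
   the pointee's type size.  Unhandled (unknown) cases warn.  */
int warn_old(enum size_fact fact, size_t alloc_size, size_t pointee_size)
{
    assert(pointee_size > 0);
    int proven_multiple =
        fact == SIZE_KNOWN && alloc_size % pointee_size == 0;
    return !proven_multiple;
}

/* New direction: warn only if the size is proven NOT to be a multiple
   of the pointee's type size.  Unhandled cases are treated as correct.  */
int warn_new(enum size_fact fact, size_t alloc_size, size_t pointee_size)
{
    assert(pointee_size > 0);
    int proven_not_multiple =
        fact == SIZE_KNOWN && alloc_size % pointee_size != 0;
    return proven_not_multiple;
}
```

In this model both directions agree on known sizes (42 bytes for a 4-byte
pointee warns, 40 bytes does not), and they differ only for unknown sizes,
where the old direction produces the false positive.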

Regression-tested on Linux x86-64 with -m32 and -m64.

2023-06-09  Tim Lange  <mail@tim-lange.me>

PR analyzer/109577

gcc/analyzer/ChangeLog:

* constraint-manager.cc (class sval_finder): Visitor to find
children in svalue trees.
(constraint_manager::sval_constrained_p): Add new function to
check whether a sval might be part of a constraint.
* constraint-manager.h: Add sval_constrained_p function.
* region-model.cc (class size_visitor): Reverse behavior to not
emit a warning on not explicitly considered cases.
(region_model::check_region_size):
Adapt to size_visitor changes.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/allocation-size-2.c: Change expected output
and add new test case.
* gcc.dg/analyzer/pr109577.c: New test.

RISC-V: Add test cases for RVV FP16 vreinterpret

This patch would like to add more tests for RVV FP16 vreinterpret, aka

vfloat16*_t <==> v{u}int16*_t.

We allow FP16 vreinterpret in ZVFHMIN, considering we already have vle FP16.
It doesn't break anything in SPEC as there is no such vreinterpret insn.
From the user's perspective, it is reasonable to do some type conversion
between vfloat16 and v{u}int16 when only ZVFHMIN is enabled.

This patch would like to add new test cases to make sure the RVV FP16
vreinterpret works well as expected.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Add new cases.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Ditto.

RISC-V: Enable select_vl for RVV auto-vectorization

Consider the following example:
void vec_add(int32_t *restrict c, int32_t *restrict a, int32_t *restrict b,
             int N) {
  for (long i = 0; i < N; i++) {
    c[i] = a[i] + b[i];
  }
}

After this patch:
vec_add:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v2,0(a1)
        vle32.v v1,0(a2)
        vsetvli a6,zero,e32,m1,ta,ma ===> redundant vsetvl.
        slli    a4,a5,2
        vadd.vv v1,v1,v2
        sub     a3,a3,a5
        vsetvli zero,a5,e32,m1,ta,ma ===> redundant vsetvl.
        vse32.v v1,0(a0)
        add     a1,a1,a4
        add     a2,a2,a4
        add     a0,a0,a4
        bne     a3,zero,.L3
.L5:
        ret

We get close-to-optimal codegen, but with some redundant vsetvls.
This is not a big issue and will be easily addressed in the RISC-V backend.

I am going to add a standalone pass, "AVL propagation" (avlprop), to address
this issue.

gcc/ChangeLog:

* config/riscv/autovec.md (select_vl<mode>): New pattern.
* config/riscv/riscv-protos.h (expand_select_vl): New function.
* config/riscv/riscv-v.cc (expand_select_vl): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Adapt test.
* gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/select_vl-1.c: New test.

Unify MULT_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_mult_div_base): Delete.
(foperator_mult_div_base::find_range): Make static local function.
(foperator_mult): Remove. Move prototypes to range-op-mixed.h
(operator_mult::op1_range): Rename from foperator_mult.
(operator_mult::op2_range): Ditto.
(operator_mult::rv_fold): Ditto.
(float_table::float_table): Remove MULT_EXPR.
(class foperator_div): Inherit from range_operator.
(float_table::float_table): Delete.
* range-op-mixed.h (class operator_mult): Combined from integer
and float files.
* range-op.cc (float_tree_table): Delete.
(op_mult): New object.
(unified_table::unified_table): Add MULT_EXPR.
(get_op_handler): Do not check float table any longer.
(class cross_product_operator): Move to range-op-mixed.h.
(class operator_mult): Move to range-op-mixed.h.
(integral_table::integral_table): Remove MULT_EXPR.
(pointer_table::pointer_table): Remove MULT_EXPR.
* range-op.h (float_table): Remove.

Unify NEGATE_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_negate): Remove. Move prototypes
to range-op-mixed.h
(operator_negate::fold_range): Rename from foperator_negate.
(operator_negate::op1_range): Ditto.
(float_table::float_table): Remove NEGATE_EXPR.
* range-op-mixed.h (class operator_negate): Combined from integer
and float files.
* range-op.cc (op_negate): New object.
(unified_table::unified_table): Add NEGATE_EXPR.
(class operator_negate): Move to range-op-mixed.h.
(integral_table::integral_table): Remove NEGATE_EXPR.
(pointer_table::pointer_table): Remove NEGATE_EXPR.

Unify MINUS_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_minus): Remove. Move prototypes
to range-op-mixed.h
(operator_minus::fold_range): Rename from foperator_minus.
(operator_minus::op1_range): Ditto.
(operator_minus::op2_range): Ditto.
(operator_minus::rv_fold): Ditto.
(float_table::float_table): Remove MINUS_EXPR.
* range-op-mixed.h (class operator_minus): Combined from integer
and float files.
* range-op.cc (op_minus): New object.
(unified_table::unified_table): Add MINUS_EXPR.
(class operator_minus): Move to range-op-mixed.h.
(integral_table::integral_table): Remove MINUS_EXPR.
(pointer_table::pointer_table): Remove MINUS_EXPR.

Unify ABS_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_abs): Remove. Move prototypes
to range-op-mixed.h
(operator_abs::fold_range): Rename from foperator_abs.
(operator_abs::op1_range): Ditto.
(float_table::float_table): Remove ABS_EXPR.
* range-op-mixed.h (class operator_abs): Combined from integer
and float files.
* range-op.cc (op_abs): New object.
(unified_table::unified_table): Add ABS_EXPR.
(class operator_abs): Move to range-op-mixed.h.
(integral_table::integral_table): Remove ABS_EXPR.
(pointer_table::pointer_table): Remove ABS_EXPR.

Unify PLUS_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_plus): Remove. Move prototypes
to range-op-mixed.h
(operator_plus::fold_range): Rename from foperator_plus.
(operator_plus::op1_range): Ditto.
(operator_plus::op2_range): Ditto.
(operator_plus::rv_fold): Ditto.
(float_table::float_table): Remove PLUS_EXPR.
* range-op-mixed.h (class operator_plus): Combined from integer
and float files.
* range-op.cc (op_plus): New object.
(unified_table::unified_table): Add PLUS_EXPR.
(class operator_plus): Move to range-op-mixed.h.
(integral_table::integral_table): Remove PLUS_EXPR.
(pointer_table::pointer_table): Remove PLUS_EXPR.

Unify operator_cast range operator

Move the declaration of the class to the range-op-mixed header, and use it
in the new unified table.

* range-op-mixed.h (class operator_cast): Combined from integer
and float files.
* range-op.cc (op_cast): New object.
(unified_table::unified_table): Add op_cast.
(class operator_cast): Move to range-op-mixed.h.
(integral_table::integral_table): Remove op_cast.
(pointer_table::pointer_table): Remove op_cast.

Unify operator_cst range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (operator_cst::fold_range): New.
* range-op-mixed.h (class operator_cst): Move from integer file.
* range-op.cc (op_cst): New object.
(unified_table::unified_table): Add op_cst. Also use for REAL_CST.
(class operator_cst): Move to range-op-mixed.h.
(integral_table::integral_table): Remove op_cst.
(pointer_table::pointer_table): Remove op_cst.

Unify Identity range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_identity): Remove. Move prototypes
to range-op-mixed.h
(operator_identity::fold_range): Rename from foperator_identity.
(operator_identity::op1_range): Ditto.
(float_table::float_table): Remove fop_identity.
* range-op-mixed.h (class operator_identity): Combined from integer
and float files.
* range-op.cc (op_identity): New object.
(unified_table::unified_table): Add op_identity.
(class operator_identity): Move to range-op-mixed.h.
(integral_table::integral_table): Remove identity.
(pointer_table::pointer_table): Remove identity.

Unify GE_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_ge): Remove. Move prototypes
to range-op-mixed.h
(operator_ge::fold_range): Rename from foperator_ge.
(operator_ge::op1_range): Ditto.
(float_table::float_table): Remove GE_EXPR.
* range-op-mixed.h (class operator_ge): Combined from integer
and float files.
* range-op.cc (op_ge): New object.
(unified_table::unified_table): Add GE_EXPR.
(class operator_ge): Move to range-op-mixed.h.
(ge_op1_op2_relation): Fold into
operator_ge::op1_op2_relation.
(integral_table::integral_table): Remove GE_EXPR.
(pointer_table::pointer_table): Remove GE_EXPR.
* range-op.h (ge_op1_op2_relation): Delete.

Unify GT_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_gt): Remove. Move prototypes
to range-op-mixed.h
(operator_gt::fold_range): Rename from foperator_gt.
(operator_gt::op1_range): Ditto.
(float_table::float_table): Remove GT_EXPR.
* range-op-mixed.h (class operator_gt): Combined from integer
and float files.
* range-op.cc (op_gt): New object.
(unified_table::unified_table): Add GT_EXPR.
(class operator_gt): Move to range-op-mixed.h.
(gt_op1_op2_relation): Fold into
operator_gt::op1_op2_relation.
(integral_table::integral_table): Remove GT_EXPR.
(pointer_table::pointer_table): Remove GT_EXPR.
* range-op.h (gt_op1_op2_relation): Delete.

Unify LE_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_le): Remove. Move prototypes
to range-op-mixed.h
(operator_le::fold_range): Rename from foperator_le.
(operator_le::op1_range): Ditto.
(float_table::float_table): Remove LE_EXPR.
* range-op-mixed.h (class operator_le): Combined from integer
and float files.
* range-op.cc (op_le): New object.
(unified_table::unified_table): Add LE_EXPR.
(class operator_le): Move to range-op-mixed.h.
(le_op1_op2_relation): Fold into
operator_le::op1_op2_relation.
(integral_table::integral_table): Remove LE_EXPR.
(pointer_table::pointer_table): Remove LE_EXPR.
* range-op.h (le_op1_op2_relation): Delete.

Unify LT_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_lt): Remove. Move prototypes
to range-op-mixed.h
(operator_lt::fold_range): Rename from foperator_lt.
(operator_lt::op1_range): Ditto.
(float_table::float_table): Remove LT_EXPR.
* range-op-mixed.h (class operator_lt): Combined from integer
and float files.
* range-op.cc (op_lt): New object.
(unified_table::unified_table): Add LT_EXPR.
(class operator_lt): Move to range-op-mixed.h.
(lt_op1_op2_relation): Fold into
operator_lt::op1_op2_relation.
(integral_table::integral_table): Remove LT_EXPR.
(pointer_table::pointer_table): Remove LT_EXPR.
* range-op.h (lt_op1_op2_relation): Delete.

Unify NE_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_not_equal): Remove. Move prototypes
to range-op-mixed.h
(operator_not_equal::fold_range): Rename from foperator_not_equal.
(operator_not_equal::op1_range): Ditto.
(float_table::float_table): Remove NE_EXPR.
* range-op-mixed.h (class operator_not_equal): Combined from integer
and float files.
* range-op.cc (op_not_equal): New object.
(unified_table::unified_table): Add NE_EXPR.
(class operator_not_equal): Move to range-op-mixed.h.
(not_equal_op1_op2_relation): Fold into
operator_not_equal::op1_op2_relation.
(integral_table::integral_table): Remove NE_EXPR.
(pointer_table::pointer_table): Remove NE_EXPR.
* range-op.h (not_equal_op1_op2_relation): Delete.

Unify EQ_EXPR range operator

Move the declaration of the class to the range-op-mixed header, add the
floating point prototypes as well, and use it in the new unified table.

* range-op-float.cc (foperator_equal): Remove. Move prototypes
to range-op-mixed.h
(operator_equal::fold_range): Rename from foperator_equal.
(operator_equal::op1_range): Ditto.
(float_table::float_table): Remove EQ_EXPR.
* range-op-mixed.h (class operator_equal): Combined from integer
and float files.
* range-op.cc (op_equal): New object.
(unified_table::unified_table): Add EQ_EXPR.
(class operator_equal): Move to range-op-mixed.h.
(equal_op1_op2_relation): Fold into
operator_equal::op1_op2_relation.
(integral_table::integral_table): Remove EQ_EXPR.
(pointer_table::pointer_table): Remove EQ_EXPR.
* range-op.h (equal_op1_op2_relation): Delete.

Provide a unified range-op table.

Create a table to prepare for unifying all operations into a single table.
Move any operators which only occur in one table to the appropriate
initialization routine.
Provide a mixed header file for range-ops with multiple categories.
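
As a rough illustration of the lookup order described above (check the
unified table first, then fall back to the type-specific tables), here is a
minimal C++ sketch.  All names in it (toy_table, toy_operator, get_handler,
the TOY_* codes) are hypothetical and are not the range-op implementation.

```cpp
#include <array>

// Hypothetical operation codes and type classes for this sketch only.
enum toy_code { TOY_PLUS, TOY_MULT, TOY_BIT_AND, TOY_LAST };
enum toy_type_class { TOY_INTEGRAL, TOY_POINTER, TOY_FLOAT };

struct toy_operator { const char *name; };

// A table maps an operation code to a handler, or to nullptr if absent.
class toy_table {
public:
  void set(toy_code c, const toy_operator &op) { m_ops[c] = &op; }
  const toy_operator *operator[](toy_code c) const { return m_ops[c]; }
private:
  std::array<const toy_operator *, TOY_LAST> m_ops{};
};

static const toy_operator op_plus{"plus (unified)"};
static const toy_operator op_int_and{"bit_and (integral only)"};
static const toy_operator op_float_mult{"mult (float, not yet unified)"};

struct toy_tables {
  toy_table unified, integral, pointer, floating;
  toy_tables() {
    unified.set(TOY_PLUS, op_plus);          // shared by all type classes
    integral.set(TOY_BIT_AND, op_int_and);   // occurs in one table only
    floating.set(TOY_MULT, op_float_mult);   // still in its legacy table
  }
};

// Check the unified table first; fall back to the per-type table.
const toy_operator *get_handler(toy_code code, toy_type_class tc) {
  static const toy_tables tables;
  if (const toy_operator *op = tables.unified[code])
    return op;
  switch (tc) {
    case TOY_INTEGRAL: return tables.integral[code];
    case TOY_POINTER:  return tables.pointer[code];
    case TOY_FLOAT:    return tables.floating[code];
  }
  return nullptr;
}
```

In this sketch, moving an operator into the unified table needs no change
to callers, which is the property the staged "Unify ..." commits rely on.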

* range-op-float.cc (class float_table): Move to header.
(float_table::float_table): Move float only operators to...
(range_op_table::initialize_float_ops): Here.
* range-op-mixed.h: New.
* range-op.cc (integral_tree_table, pointer_tree_table): Moved
to top of file.
(float_tree_table): Moved from range-op-float.cc.
(unified_tree_table): New.
(unified_table::unified_table): New. Call initialize routines.
(get_op_handler): Check unified table first.
(range_op_handler::range_op_handler): Handle no type constructor.
(integral_table::integral_table): Move integral only operators to...
(range_op_table::initialize_integral_ops): Here.
(pointer_table::pointer_table): Move pointer only operators to...
(range_op_table::initialize_pointer_ops): Here.
* range-op.h (enum bool_range_state): Move to range-op-mixed.h.
(get_bool_state): Ditto.
(empty_range_varying): Ditto.
(relop_early_resolve): Ditto.
(class range_op_table): Add new init methods for range types.
(class integral_table): Move declaration to here.
(class pointer_table): Move declaration to here.
(class float_table): Move declaration to here.

Daily bump.

VECT: Add SELECT_VL support

This patch addresses comments from Richard and Richi and rebases to trunk.

This patch adds SELECT_VL middle-end support, which allows a target to
apply target-dependent optimization to the length calculation.

This patch is inspired by RVV ISA and LLVM:
https://reviews.llvm.org/D99750

SELECT_VL has the same behavior as LLVM "get_vector_length", with
the following properties:

1. Only applies to a single rgroup.
2. Non-SLP only.
3. Adjusts the loop control IV.
4. Adjusts the data reference IV.
5. Allows a non-final iteration to process a number of elements other than VF.
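
The properties above can be pictured with a scalar C++ model of the loop.
This is only a sketch: select_vl here is a hypothetical stand-in for the
target's length selection, and its tail-balancing rule is one choice assumed
for illustration (it is meant to show property 5, not to describe what any
particular target returns).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of a target length selection.  When the remaining
// count lies strictly between vf and 2*vf, split it roughly in half so that
// a non-final iteration processes fewer than vf elements.
static std::size_t select_vl(std::size_t remaining, std::size_t vf) {
  if (remaining >= 2 * vf || remaining <= vf)
    return std::min(remaining, vf);
  return (remaining + 1) / 2;
}

// z[i] = x[i] + y[i] for i in [0, n).  Returns the length chosen in each
// iteration so the effect of select_vl is visible to the caller.
std::vector<std::size_t> vvadd_select_vl(std::size_t n, const int *x,
                                         const int *y, int *z,
                                         std::size_t vf) {
  std::vector<std::size_t> lens;
  while (n > 0) {
    std::size_t len = select_vl(n, vf);
    for (std::size_t i = 0; i < len; ++i)  // models one vector operation
      z[i] = x[i] + y[i];
    x += len;  // data reference IVs advance by len, not by vf
    y += len;
    z += len;
    n -= len;  // loop control IV decreases by len
    lens.push_back(len);
  }
  return lens;
}
```

With n = 10 and vf = 4, this model processes 4, 3 and 3 elements, so the
second (non-final) iteration handles fewer than vf elements.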

Code
   # void vvaddint32(size_t n, const int*x, const int*y, int*z)
   # { for (size_t i=0; i<n; i++) { z[i]=x[i]+y[i]; } }

Take RVV codegen for example:

Before this patch:
vvaddint32:
        ble     a0,zero,.L6
        csrr    a4,vlenb
        srli    a6,a4,2
.L4:
        mv      a5,a0
        bleu    a0,a6,.L3
        mv      a5,a6
.L3:
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v2,0(a1)
        vle32.v v1,0(a2)
        vsetvli a7,zero,e32,m1,ta,ma
        sub     a0,a0,a5
        vadd.vv v1,v1,v2
        vsetvli zero,a5,e32,m1,ta,ma
        vse32.v v1,0(a3)
        add     a2,a2,a4
        add     a3,a3,a4
        add     a1,a1,a4
        bne     a0,zero,.L4
.L6:
        ret

After this patch:

vvaddint32:
    vsetvli t0, a0, e32, ta, ma  # Set vector length based on 32-bit vectors
    vle32.v v0, (a1)         # Get first vector
      sub a0, a0, t0         # Decrement number done
      slli t0, t0, 2         # Multiply number done by 4 bytes
      add a1, a1, t0         # Bump pointer
    vle32.v v1, (a2)         # Get second vector
      add a2, a2, t0         # Bump pointer
    vadd.vv v2, v0, v1       # Sum vectors
    vse32.v v2, (a3)         # Store result
      add a3, a3, t0         # Bump pointer
      bnez a0, vvaddint32    # Loop back
      ret                    # Finished

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Co-authored-by: Richard Biener <rguenther@suse.de>
gcc/ChangeLog:

* doc/md.texi: Add SELECT_VL support.
* internal-fn.def (SELECT_VL): Ditto.
* optabs.def (OPTAB_D): Ditto.
* tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Ditto.
* tree-vect-stmts.cc (get_select_vl_data_ref_ptr): Ditto.
(vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_SELECT_VL_P): Ditto.

analyzer: add caching to globals with initializers [PR110112]

PR analyzer/110112 notes that -fanalyzer is extremely slow on a source
file with large read-only static arrays, repeatedly building the
same compound_svalue representing the full initializer, and repeatedly
building svalues representing parts of the full initializer.

This patch adds caches for both of these; together they reduce the time
taken by -fanalyzer -O2 on the testcase in the bug for an optimized
build:
  91.2s : no caches (status quo)
  32.4s : cache in decl_region::get_svalue_for_constructor
   3.7s : cache in region::get_initial_value_at_main
   3.1s : both caches (this patch)
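
The get_*/calc_* split used for both caches follows a common memoization
shape.  A minimal sketch, with hypothetical toy types rather than the
analyzer's region and svalue classes:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for an expensive-to-build value; not the analyzer's svalue.
struct toy_svalue { long sum; };

class toy_region {
public:
  explicit toy_region(std::vector<int> init) : m_init(std::move(init)) {}

  // Cached accessor: computes the value on first use, then reuses it.
  const toy_svalue *get_initial_value() {
    if (!m_cached)
      m_cached = calc_initial_value();
    return m_cached.get();
  }

  // Number of times the expensive computation actually ran.
  int calc_count() const { return m_calc_count; }

private:
  // The uncached computation, split out so the accessor stays trivial.
  std::unique_ptr<toy_svalue> calc_initial_value() {
    ++m_calc_count;
    long s = 0;
    for (int v : m_init)
      s += v;
    return std::make_unique<toy_svalue>(toy_svalue{s});
  }

  std::vector<int> m_init;
  std::unique_ptr<toy_svalue> m_cached;  // null until first request
  int m_calc_count = 0;
};
```

Repeated calls to get_initial_value return the same object and run the
computation once, which is the effect the timings above measure on a much
larger scale.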

gcc/analyzer/ChangeLog:
PR analyzer/110112
* region-model.cc (region_model::get_initial_value_for_global):
Move code to region::calc_initial_value_at_main.
* region.cc (region::get_initial_value_at_main): New function.
(region::calc_initial_value_at_main): New function, based on code
in region_model::get_initial_value_for_global.
(region::region): Initialize m_cached_init_sval_at_main.
(decl_region::get_svalue_for_constructor): Add a cache, splitting
out body to...
(decl_region::calc_svalue_for_constructor): ...this new function.
* region.h (region::get_initial_value_at_main): New decl.
(region::calc_initial_value_at_main): New decl.
(region::m_cached_init_sval_at_main): New field.
(decl_region::decl_region): Initialize m_ctor_svalue.
(decl_region::calc_svalue_for_constructor): New decl.
(decl_region::m_ctor_svalue): New field.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

libstdc++: use using instead of typedef for type_traits

Since type_traits is a C++11 header, alias-declarations (using) can be used
instead of typedef. This improves readability, especially for long type
names.
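
For illustration, a small sketch of the two spellings (the names here are
made up for the example and are not taken from the header):

```cpp
#include <type_traits>

// typedef: the new name is buried in the middle of the declarator.
typedef void (*handler_typedef)(int, const char *);

// using: the new name comes first, the type follows the '='.
using handler_alias = void (*)(int, const char *);

// Both declare the same type.
static_assert(std::is_same<handler_typedef, handler_alias>::value,
              "typedef and using name the same type");

// An alias template has no typedef equivalent.
template <typename T>
using strip_cvref_sketch =
    typename std::remove_cv<typename std::remove_reference<T>::type>::type;

static_assert(std::is_same<strip_cvref_sketch<const int &>, int>::value,
              "alias template strips the reference and cv-qualifiers");
```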

libstdc++-v3/ChangeLog:

* include/std/type_traits: Use using instead of typedef.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

Also check type being cast to

Before casting into an irange, make sure the type being cast into
is also supported.

PR ipa/109886
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Check param
type as well.

Relocate range_cast to header, and add a generic version.

Make range_cast inlinable by moving it to the header file.
Also trap if the destination is not capable of representing the cast type.
Add a generic version which can change range classes, i.e. float to int.

* range-op.cc (range_cast): Move to...
* range-op.h (range_cast): Here and add a generic version.

c++: fix 32-bit spaceship failures [PR110185]

Various spaceship tests failed after r14-1624. This turned out to be
because the comparison category classes return in memory on 32-bit targets,
and the synthesized operator<=> looks something like

if (auto v = a.x <=> b.x, v == 0); else return v;
if (auto v = a.y <=> b.y, v == 0); else return v;
etc.

so check_return_expr was trying to do NRVO for all the 'v' variables, and
now on subsequent returns we check to see if the previous NRV is still in
scope. But the NRVs didn't have names, so looking up name bindings crashed.
Fixed both by giving 'v' a name so we can NRVO the first one, and fixing the
test to give up if the old NRV has no name.
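
To make the shape of the synthesized function concrete, here is a C++17
sketch of the same control flow.  It is an analogy only: it uses int as the
comparison result and a hypothetical cmp3 helper instead of <=> and the
comparison category classes, so it shows the repeated "declare v, return v"
pattern but not the return-in-memory behavior that triggered the failures.

```cpp
struct point {
  int x;
  int y;
};

// Hypothetical three-way helper: negative, zero or positive.
static int cmp3(int a, int b) { return (a > b) - (a < b); }

// Each 'if' declares its own 'v' and returns it from the else branch, so
// every member comparison contributes one candidate return variable.
int compare_points(const point &a, const point &b) {
  if (auto v = cmp3(a.x, b.x); v == 0); else return v;
  if (auto v = cmp3(a.y, b.y); v == 0); else return v;
  return 0;
}
```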

PR c++/110185
PR c++/58487

gcc/cp/ChangeLog:

* method.cc (build_comparison_op): Give retval a name.
* typeck.cc (check_return_expr): Fix for nameless variables.

c++: diagnose auto in template arg

We were failing to diagnose this Concepts TS feature that didn't make it
into C++20 because the 'auto' was getting converted to a template parameter
before we checked for it. So also check in cp_parser_simple_type_specifier.

The code in cp_parser_template_type_arg that I initially expected to
diagnose this seems unreachable because cp_parser_type_id_1 already checks
auto.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_simple_type_specifier): Check for auto
in template argument.
(cp_parser_template_type_arg): Remove auto checking.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/auto7.C: New test.
* g++.dg/concepts/auto7a.C: New test.

c++: init-list of uncopyable type [PR110102]

The maybe_init_list_as_range optimization is a form of copy elision, but we
can only elide well-formed copies.

PR c++/110102

gcc/cp/ChangeLog:

* call.cc (maybe_init_list_as_array): Check that the element type is
copyable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-opt1.C: New test.

doc: Clarification for -Wmissing-field-initializers

The manual is incorrect in saying that the option does not warn
about designated initializers, which it does in C++. Whether the
divergence in behavior is desirable is another thing, but let's
at least make the manual match the reality.

PR c/39589
PR c++/96868

gcc/ChangeLog:

* doc/invoke.texi: Clarify that -Wmissing-field-initializers doesn't
warn about designated initializers in C only.

Add Plus to the op list of `(zero_one == 0) ? y : z <op> y` pattern

This adds plus to the op list of `(zero_one == 0) ? y : z <op> y` patterns
which currently has bit_ior and bit_xor.
This shows up now in GCC after the boolization work that Uroš has been doing.
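
The reason plus fits alongside bit_ior and bit_xor is that 0 is an identity
for all three operations.  A small sketch, assuming b is known to be 0 or 1
and using unsigned arithmetic to sidestep overflow (the function names are
made up for the example):

```cpp
#include <cstdint>

// Source form: (b == 0) ? y : z + y, with b known to be 0 or 1.
std::uint32_t cond_add_branchy(std::uint32_t b, std::uint32_t y,
                               std::uint32_t z) {
  return (b == 0) ? y : z + y;
}

// Branchless form: b * z is 0 when b == 0 and z when b == 1, and adding 0
// leaves y unchanged.
std::uint32_t cond_add_branchless(std::uint32_t b, std::uint32_t y,
                                  std::uint32_t z) {
  return b * z + y;
}

// The same reasoning for bit_ior and bit_xor, shown for comparison.
std::uint32_t cond_ior_branchless(std::uint32_t b, std::uint32_t y,
                                  std::uint32_t z) {
  return (b * z) | y;
}

std::uint32_t cond_xor_branchless(std::uint32_t b, std::uint32_t y,
                                  std::uint32_t z) {
  return (b * z) ^ y;
}
```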

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/97711
PR tree-optimization/110155

gcc/ChangeLog:

* match.pd ((zero_one == 0) ? y : z <op> y): Add plus to the op.
((zero_one != 0) ? z <op> y : y): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/branchless-cond-add-2.c: New test.
* gcc.dg/tree-ssa/branchless-cond-add.c: New test.

Change the `(zero_one ==/!= 0) ? y : z <op> y` patterns to use multiply rather than `(-zero_one) & z`

Since there is a pattern to convert `(-zero_one) & z` into `zero_one * z` already,
it is better if we don't do a secondary transformation. This reduces the extra
statements produced by match-and-simplify on the gimple level too.
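
The equivalence between the two forms can be checked directly for a value
known to be 0 or 1.  A sketch with made-up function names, using unsigned
arithmetic so that negation is well defined:

```cpp
#include <cstdint>

// Mask form: 0 - b is all-zeros for b == 0 and all-ones for b == 1.
std::uint32_t mask_form(std::uint32_t b, std::uint32_t z) {
  return (std::uint32_t(0) - b) & z;
}

// Multiply form: the result the patterns now produce directly.
std::uint32_t mult_form(std::uint32_t b, std::uint32_t z) {
  return b * z;
}
```

For b in {0, 1} both functions return 0 or z, so emitting the multiply form
up front avoids a second rewrite of the mask form.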

gcc/ChangeLog:

* match.pd ((zero_one ==/!= 0) ? y : z <op> y): Use
multiply rather than negation/bit_and.