git.ipfire.org Git - thirdparty/gcc.git/log

MATCH: remove negate for 1bit types

For 1bit types, negate is either undefined or doesn't change the value.
In either case we want to remove it.
This patch adds a match pattern to do that.
Also, when converting to a 1bit type we can remove the negate, just as we
already do for `&1`, so this patch adds that too.
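
As a rough illustration of why the negate is redundant (a sketch in plain C,
not the match.pd pattern itself; the names `flag`, `negate_1bit` and
`negate_and_1` are made up for this example): storing a negated value back
into a 1bit unsigned field leaves the bit unchanged, and so does masking a
negated value with `&1`.

```c
/* Hypothetical example type: a single 1-bit unsigned field.  */
struct flag { unsigned int bit : 1; };

/* Negate a 1-bit value and store it back into the 1-bit field.
   The field is promoted to int, negated (giving 0 or -1), then
   reduced modulo 2 on the store, so the result equals the input bit.  */
unsigned int negate_1bit (unsigned int v)
{
  struct flag f;
  f.bit = v & 1u;
  f.bit = -f.bit;
  return f.bit;
}

/* The `&1` case: the low bit of -v always equals the low bit of v.  */
unsigned int negate_and_1 (unsigned int v)
{
  return (-v) & 1u;
}
```

Only the unsigned case is shown; for a signed 1bit type the negate of -1 is
not representable, which is the "undefined" case mentioned above.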

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Notes on the testcases:
This patch is the last part to fix PR 95929; cond-bool-2.c testcase.
bit1neg-1.c is a 1bit-field testcase where we could remove the assignment
all the way in one case (which happened on the RTL level for some targets but not all).
cond-bool-2.c is the reduced testcase of PR 95929.

PR tree-optimization/95929

gcc/ChangeLog:

* match.pd (convert?(-a)): New pattern
for 1bit integer types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bit1neg-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-2.c: New test.

Revert "Initial support for AVX10.1"

This reverts commit 11ad44da01dd1c91c96e45802fd8b1c50e88703f.

Revert "Emit a warning when disabling AVX512 with AVX10 enabled or disabling AVX10 with AVX512 enabled"

This reverts commit 0288ab14732a16b3787546cdd159941eb7306cf3.

Revert "Emit a warning when AVX10 options conflict in vector width"

This reverts commit 26a820dc136b00b4dc37609429576b6a914cb572.

Revert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 2485dd9b4e219307f00d683077bbaf5a2add6604.

Revert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 1c3c405ecf23aeb3a2976350887bf2238719c71f.

Revert "[Patch 3/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit d14ab07ee91de0ebf80b73a22c4a23ecf2a2572e.

Revert "[Patch 4/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit aba10895052fcb2ab3c6d53ad98c855509877555.

Revert "[Patch 5/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 0b20e0f17b47a86cddba68a2e016be0132ae9b0a.

Revert "[Patch 6/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 5ccdfd0870be168031f8902e1039e77be93b131a.

Revert "i386: Add AVX2 pragma wrapper for AVX512DQVL intrins"

This reverts commit 68f7cb6cf9e8b9f2254855507f3b479552adda5f.

debug/111080 - avoid outputting debug info for unused restrict qualified type

The following applies some maintenance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs. I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.

PR debug/111080
* dwarf2out.cc (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.

* gcc.dg/debug/dwarf2/pr111080.c: New testcase.

Adjust GCC V13 to GCC 13.1 in diagnostic.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_invalid_conversion): Adjust GCC
V13 to GCC 13.1.

Fix target_clone ("arch=graniterapids-d") and target_clone ("arch=arrowlake-s")

Both "graniterapids-d" and "graniterapids" are attached with
PROCESSOR_GRANITERAPIDS in processor_alias_table but mapped to
different __cpu_subtype in get_intel_cpu.

And get_builtin_code_for_version will try to match the first
PROCESSOR_GRANITERAPIDS in processor_alias_table which maps to
"graniterapids" here.

      else if (new_target->arch_specified && new_target->arch > 0)
        for (i = 0; i < pta_size; i++)
          if (processor_alias_table[i].processor == new_target->arch)
            {
              const pta *arch_info = &processor_alias_table[i];
              switch (arch_info->priority)
                {
                default:
                  arg_str = arch_info->name;

This mismatch makes dispatch_function_versions check the predicate
of __builtin_cpu_is ("graniterapids") for "graniterapids-d" and causes
the issue.
The patch explicitly adds PROCESSOR_ARROWLAKE_S and
PROCESSOR_GRANITERAPIDS_D to make a distinction.
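
The first-match behaviour can be sketched with a toy table. This is a
heavily simplified model, not GCC's code: `enum proc`, `struct alias`,
`shared_table`, `distinct_table` and `first_name_for` are hypothetical names
that only mimic the shape of the lookup quoted above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's processor enum and
   processor_alias_table; only the lookup shape is modelled.  */
enum proc { PROC_GRANITERAPIDS, PROC_GRANITERAPIDS_D };

struct alias { const char *name; enum proc processor; };

/* Before the fix: both names share one processor value.  */
static const struct alias shared_table[] = {
  { "graniterapids",   PROC_GRANITERAPIDS },
  { "graniterapids-d", PROC_GRANITERAPIDS },
};

/* After the fix: each name has its own processor value.  */
static const struct alias distinct_table[] = {
  { "graniterapids",   PROC_GRANITERAPIDS },
  { "graniterapids-d", PROC_GRANITERAPIDS_D },
};

/* Return the name of the first entry whose processor matches,
   or NULL if there is none.  */
const char *first_name_for (const struct alias *tab, size_t n, enum proc p)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].processor == p)
      return tab[i].name;
  return NULL;
}
```

Looking up the processor value of the "graniterapids-d" entry yields
"graniterapids" with `shared_table`, but "graniterapids-d" with
`distinct_table`.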

"alderlake", "raptorlake" and "meteorlake" share the same ISA, cost and
tuning, and are mapped to the same __cpu_type/__cpu_subtype in
get_intel_cpu, so there is no need to add PROCESSOR_RAPTORLAKE and others.

gcc/ChangeLog:

* common/config/i386/i386-common.cc (processor_names): Add new
members graniterapids-d and arrowlake-s.
* config/i386/i386-options.cc (processor_alias_table): Update
table with PROCESSOR_ARROWLAKE_S and
PROCESSOR_GRANITERAPIDS_D.
(m_GRANITERAPIDS_D): New macro.
(m_ARROWLAKE_S): Ditto.
(m_CORE_AVX512): Add m_GRANITERAPIDS_D.
(processor_cost_table): Add icelake_cost for
PROCESSOR_GRANITERAPIDS_D and alderlake_cost for
PROCESSOR_ARROWLAKE_S.
* config/i386/x86-tune.def: Handle m_ARROWLAKE_S same as
m_ARROWLAKE.
* config/i386/i386.h (enum processor_type): Add new member
PROCESSOR_GRANITERAPIDS_D and PROCESSOR_ARROWLAKE_S.
* config/i386/i386-c.cc (ix86_target_macros_internal): Handle
PROCESSOR_GRANITERAPIDS_D and PROCESSOR_ARROWLAKE_S

testsuite: Xfail gcc.dg/tree-ssa/update-threading.c for CRIS, PR110628

* gcc.dg/tree-ssa/update-threading.c: Xfail for cris-*-*.

Daily bump.

Improve quality of code from LRA register elimination

This is primarily Jivan's work, I'm mostly responsible for the write-up and
coordinating with Vlad on a few questions.

On targets with limitations on immediates usable in arithmetic instructions,
LRA's register elimination phase can construct fairly poor code.

This example (from the GCC testsuite) illustrates the problem well.

int  consume (void *);
int foo (void) {
  int x[1000000];
  return consume (x + 1000);
}

If you compile on riscv64-linux-gnu with "-O2 -march=rv64gc -mabi=lp64d", then
you'll get this code (up to the call to consume()).

        .cfi_startproc
        li      t0,-4001792
        li      a0,-3997696
        li      a5,4001792
        addi    sp,sp,-16
        .cfi_def_cfa_offset 16
        addi    t0,t0,1792
        addi    a0,a0,1696
        addi    a5,a5,-1792
        sd      ra,8(sp)
        add     a5,a5,a0
        add     sp,sp,t0
        .cfi_def_cfa_offset 4000016
        .cfi_offset 1, -8
        add     a0,a5,sp
        call    consume

Of particular interest is the value in a0 when we call consume. We compute that
horribly inefficiently.   If we back-substitute from the final assignment to a0
we get...

a0 = a5 + sp
a0 = a5 + (sp + t0)
a0 = (a5 + a0) + (sp + t0)
a0 = ((a5 - 1792) + a0) + (sp + t0)
a0 = ((a5 - 1792) + (a0 + 1696)) + (sp + t0)
a0 = ((a5 - 1792) + (a0 + 1696)) + (sp + (t0 + 1792))
a0 = (a5 + (a0 + 1696)) + (sp + t0)  // removed offsetting terms
a0 = (a5 + (a0 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (a0 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (-3997696 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (-3997696 + 1696)) + ((sp - 16) + -4001792)
a0 = (-3997696 + 1696) + (sp -16) // removed offsetting terms
a0 = sp - 3996016

That's a pretty convoluted way to compute sp - 3996016.

Something like this would be notably better (not great, but we need both the
stack adjustment and the address of the object to pass to consume):

   addi sp,sp,-16
   sd ra,8(sp)
   li t0,-4001792
   addi t0,t0,1792
   add sp,sp,t0
   li a0,4096
   addi a0,a0,-96
   add a0,sp,a0
   call consume
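
The two instruction sequences can be cross-checked by replaying them in plain
C. This is only a sketch: `a0_before` and `a0_after` are names invented here,
each register is modelled as a `long`, and the result is reported relative to
the incoming stack pointer.

```c
/* Each function returns the value of a0 at the call to consume,
   expressed relative to the incoming stack pointer sp_in.  */

/* Replays the code currently generated.  */
long a0_before (long sp_in)
{
  long sp = sp_in;
  long t0 = -4001792;   /* li   t0,-4001792 */
  long a0 = -3997696;   /* li   a0,-3997696 */
  long a5 = 4001792;    /* li   a5,4001792  */
  sp += -16;            /* addi sp,sp,-16   */
  t0 += 1792;           /* addi t0,t0,1792  */
  a0 += 1696;           /* addi a0,a0,1696  */
  a5 += -1792;          /* addi a5,a5,-1792 */
  a5 += a0;             /* add  a5,a5,a0    */
  sp += t0;             /* add  sp,sp,t0    */
  a0 = a5 + sp;         /* add  a0,a5,sp    */
  return a0 - sp_in;
}

/* Replays the shorter sequence.  */
long a0_after (long sp_in)
{
  long sp = sp_in;
  sp += -16;            /* addi sp,sp,-16   */
  long t0 = -4001792;   /* li   t0,-4001792 */
  t0 += 1792;           /* addi t0,t0,1792  */
  sp += t0;             /* add  sp,sp,t0    */
  long a0 = 4096;       /* li   a0,4096     */
  a0 += -96;            /* addi a0,a0,-96   */
  a0 = sp + a0;         /* add  a0,sp,a0    */
  return a0 - sp_in;
}
```

Both functions return -3996016 for any starting value, i.e. the address of
x + 1000 is the incoming sp minus 3996016 in either sequence.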

The problem is LRA's elimination code is not handling the case where we have
(plus (reg1) (reg2)) where reg1 is an eliminable register and reg2 has a known
equivalency, particularly a constant.

If we can determine that reg2 is equivalent to a constant and treat (plus
(reg1) (reg2)) in the same way we'd treat (plus (reg1) (const_int)) then we can
get the desired code.

This eliminates about 19b instructions, or roughly 1% for deepsjeng on rv64.
There are improvements elsewhere, but they're relatively small.  This may
ultimately lessen the value of Manolis's fold-mem-offsets patch.  So we'll have
to evaluate that again once he posts a new version.

Bootstrapped and regression tested on x86_64 as well as bootstrapped on rv64.
Earlier versions have been tested against spec2017.  Pre-approved by Vlad in a
private email conversation (thanks Vlad!).

Committed to the trunk.

gcc/
* lra-eliminations.cc (eliminate_regs_in_insn): Use equivalences to
help simplify code further.

Fortran: improve diagnostic message for COMMON with automatic object [PR32986]

gcc/fortran/ChangeLog:

PR fortran/32986
* resolve.cc (is_non_constant_shape_array): Add forward declaration.
(resolve_common_vars): Diagnose automatic array object in COMMON.
(resolve_symbol): Prevent confusing follow-on error.

gcc/testsuite/ChangeLog:

PR fortran/32986
* gfortran.dg/common_28.f90: New test.

Phi analyzer - Initialize with range instead of a tree.

Ranger's PHI analyzer currently only allows a single initializer to a group.
This patch changes that to use an initialization range, which is
cumulative of all integer constants, plus a single symbolic value.
There is no other change to group functionality.

This patch also changes the way PHI groups are printed so they show up in the
listing as they are encountered, rather than as a list at the end. It
was more difficult to see what was going on previously.

PR tree-optimization/110918 - Initialize with range instead of a tree.
gcc/
* gimple-range-fold.cc (fold_using_range::range_of_phi): Tweak output.
* gimple-range-phi.cc (phi_group::phi_group): Remove unused members.
Initialize using a range instead of value and edge.
(phi_group::calculate_using_modifier): Use initializer value and
process for relations after trying for iteration convergence.
(phi_group::refine_using_relation): Use initializer range.
(phi_group::dump): Rework the dump output.
(phi_analyzer::process_phi): Allow multiple constant initializers.
Dump groups immediately as created.
(phi_analyzer::dump): Tweak output.
* gimple-range-phi.h (phi_group::phi_group): Adjust prototype.
(phi_group::initial_value): Delete.
(phi_group::refine_using_relation): Adjust prototype.
(phi_group::m_initial_value): Delete.
(phi_group::m_initial_edge): Delete.
(phi_group::m_vr): Use int_range_max.
* tree-vrp.cc (execute_ranger_vrp): Don't dump phi groups.

gcc/testsuite/
* gcc.dg/pr102983.c: Adjust output expectations.
* gcc.dg/pr110918.c: New.

Don't process phi groups with one phi.

The phi analyzer should not create a phi group containing a single phi.

* gimple-range-phi.cc (phi_analyzer::operator[]): Return NULL if
no group was created.
(phi_analyzer::process_phi): Do not create groups of one phi node.

rtl: use rtx_code for gen_ccmp_first and gen_ccmp_next

Now that we have a forward declaration of rtx_code in coretypes.h, we
can adjust these hooks to take rtx_code arguments rather than an int.

gcc/ChangeLog:

* target.def (gen_ccmp_first, gen_ccmp_next): Use rtx_code for
CODE, CMP_CODE and BIT_CODE arguments.
* config/aarch64/aarch64.cc (aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next): Likewise.
* doc/tm.texi: Regenerated.

rtl: Forward declare rtx_code

Now that we require C++ 11, we can safely forward declare rtx_code
so that we can use it in target hooks.

gcc/ChangeLog
* coretypes.h (rtx_code): Add forward declaration.
* rtl.h (rtx_code): Make compatible with forward declaration.

i386: Fix register spill failure with concat RTX [PR111010]

Disable (=&r,m,m) alternative for 32-bit targets. The combination of two
memory operands (possibly with complex addressing mode), early clobbered
output, frame pointer and PIC registers uses too many registers on
a register-constrained 32-bit target.

Also merge two similar patterns using DWIH mode iterator.

PR target/111010

gcc/ChangeLog:

* config/i386/i386.md (*concat<any_or_plus:mode><dwi>3_3):
Merge pattern from *concatditi3_3 and *concatsidi3_3 using
DWIH mode iterator. Disable (=&r,m,m) alternative for
32-bit targets.
(*concat<any_or_plus:mode><dwi>3_3): Disable (=&r,m,m)
alternative for 32-bit targets.

[PATCH] RISC-V:add a more appropriate type attribute

Due to the more accurate type attribute added to the clz, ctz, and pcnt
operations in https://github.com/gcc-mirror/gcc/commit/07e2576d6f3 the
same type attribute should be used here.

gcc/ChangeLog:

* config/riscv/bitmanip.md (*<bitmanip_optab>disi2_sext): Add a more
appropriate type attribute.

RISC-V: Add conditional unary neg/abs/not autovec patterns

Hi,

This patch adds conditional unary neg/abs/not autovec patterns to the RISC-V backend.
For this C code:

void
test_3 (float *__restrict a, float *__restrict b, int *__restrict pred, int n)
{
  for (int i = 0; i < n; i += 1)
    {
      a[i] = pred[i] ? __builtin_fabsf (b[i]) : a[i];
    }
}

Before this patch:
        ...
        vsetvli a7,zero,e32,m1,ta,ma
        vfabs.v v2,v2
        vmerge.vvm      v1,v1,v2,v0
        ...

After this patch:
        ...
        vsetvli a7,zero,e32,m1,ta,mu
        vfabs.v v1,v2,v0.t
        ...
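
As a scalar sketch of what the single masked `vfabs.v v1,v2,v0.t` computes
under the mask-undisturbed (`mu`) policy: lanes whose mask bit is set receive
the absolute value of the source, and the remaining lanes keep the
destination's previous contents. `masked_fabs` and its parameter names are
invented here for illustration and do not appear in the patch.

```c
/* Scalar model of a masked, mask-undisturbed vector fabs.
   dst plays the role of v1, src of v2 and mask of v0.  */
void masked_fabs (float *dst, const float *src, const int *mask, int n)
{
  for (int i = 0; i < n; i++)
    if (mask[i])
      dst[i] = __builtin_fabsf (src[i]);
  /* Lanes with mask[i] == 0 are left untouched.  */
}
```

This is the same computation as test_3, which is why the separate vmerge.vvm
is no longer needed once the abs itself is predicated.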

For int neg/not and FP neg patterns, defining the corresponding cond_xxx patterns
is enough.
For the FP abs pattern, we need to change the definition of the `abs<mode>2` and
`@vcond_mask_<mode><vm>` patterns from define_expand to define_insn_and_split
in order to fuse them into a new pattern `*cond_abs<mode>` at the combine pass.
The fusion process is similar to the one below:

(insn 30 29 31 4 (set (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))) "float.c":15:56 discrim 1 12799 {absrvvm1sf2}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (nil)))

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (reg:RVVM1SF 152 [ vect_iftmp.15 ])
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 12707 {vcond_mask_rvvm1sfrvvmf32bi}
     (expr_list:REG_DEAD (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
            (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
                (nil)))))
==>

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 13444 {*cond_absrvvm1sf}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
            (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
                (nil)))))

Best,
Lehua

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_abs<mode>): New combine pattern.
(*copysign<mode>_neg): Ditto.
* config/riscv/autovec.md (@vcond_mask_<mode><vm>): Adjust.
(<optab><mode>2): Ditto.
(cond_<optab><mode>): New.
(cond_len_<optab><mode>): Ditto.
* config/riscv/riscv-protos.h (enum insn_type): New.
(expand_cond_len_unop): New helper func.
* config/riscv/riscv-v.cc (shuffle_merge_patterns): Adjust.
(expand_cond_len_unop): New helper func.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-8.c: New test.

Fix handling of static exits in loop_ch

This patch fixes a wrong return value in should_duplicate_loop_header_p.
Doing so uncovered suboptimal decisions on some jump threading testcases
where we chose to stop duplicating just before a basic block that has zero
cost, even though duplicating it would always be a win.

This is because the heuristics that try to choose the right point to duplicate
all winning blocks and to turn the loop into a do_while did not account for
zero_cost blocks in all cases. The patch simplifies the logic by
simply remembering zero cost blocks and handling them last, after
the right stopping point is chosen.

gcc/ChangeLog:

* tree-ssa-loop-ch.cc (enum ch_decision): Fix comment.
(should_duplicate_loop_header_p): Fix return value for static exits.
(ch_base::copy_headers): Improve handling of ch_possible_zero_cost.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/copy-headers-9.c: Update template.

Add testcase for PR110940

gcc/testsuite/ChangeLog:
PR middle-end/110940
* gcc.c-torture/compile/pr110940.c: New test.

libffi: Backport of LoongArch support for libffi.

This is a backport of <https://github.com/libffi/libffi/commit/f259a6f6de>,
and contains modifications to commit 5a4774cd4d, as well as the LoongArch
schema portion of commit ee22ecbd11. This is needed for libgo.

libffi/ChangeLog:

PR libffi/108682
* configure.host: Add LoongArch support.
* Makefile.am: Likewise.
* Makefile.in: Regenerate.
* src/loongarch64/ffi.c: New file.
* src/loongarch64/ffitarget.h: New file.
* src/loongarch64/sysv.S: New file.

vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

Like r14-3317, which moves the handling of memory access
type VMAT_GATHER_SCATTER out of the final loop nest in
vectorizable_load, this one deals with the vectorizable_store side.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
VMAT_GATHER_SCATTER in the final loop nest to its own loop,
and update the final nest accordingly.

vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

Like commit r14-3214, which moves the handling of memory
access type VMAT_LOAD_STORE_LANES out of the final loop
nest in vectorizable_load, this one deals with the function
vectorizable_store.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
VMAT_LOAD_STORE_LANES in the final loop nest to its own loop,
and update the final nest accordingly.

vect: Remove some manual release in vectorizable_store

To avoid some duplication in follow-up patches to
function vectorizable_store, this patch replaces some
existing vecs with auto_vec and removes some manual release
calls. It also refactors a bit and removes some useless
code.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Remove vec oprnds,
adjust vec result_chain, vec_oprnd with auto_vec, and adjust
gvec_oprnds with auto_delete_vec.

libstdc++: Fix tests relying on operator new/delete overload

Fix tests that are checking for an expected allocation plan. They are failing if
an allocation is taking place outside the test main.

libstdc++-v3/ChangeLog

* testsuite/util/replacement_memory_operators.h
(counter::scope): New, capture and reset counter count at construction and
restore it at destruction.
(counter::check_new): Add scope instantiation.
* testsuite/23_containers/unordered_map/96088.cc (main):
Add counter::scope instantiation.
* testsuite/23_containers/unordered_multimap/96088.cc (main): Likewise.
* testsuite/23_containers/unordered_multiset/96088.cc (main): Likewise.
* testsuite/23_containers/unordered_set/96088.cc (main): Likewise.
* testsuite/ext/malloc_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/new_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/throw_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/pool_allocator/allocate_chunk.cc (started): New global.
(operator new(size_t)): Check started.
(main): Set/Unset started.
* testsuite/17_intro/no_library_allocation.cc: New test case.

RISC-V: Fix potential ICE of global vsetvl elimination

Committed ahead of the following VSETVL refactoring patch to make the V2 patch easier to review.
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc
(pass_vsetvl::global_eliminate_vsetvl_insn): Fix potential ICE.

RISC-V: Fix VTYPE fuse rule bug

This bug was exposed by the refactoring patch.
It has been split out and committed separately.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (ge_sew_ratio_unavailable_p):
Fix fuse rule bug.
* config/riscv/riscv-vsetvl.def (DEF_SEW_LMUL_FUSE_RULE): Ditto.

RISC-V: Fix gather_load_run-12.c test

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c:
Add vsetvli asm.

RISC-V: Add attribute to vtype change only vsetvl

This is a preparatory patch for the VSETVL pass.

Committed.

gcc/ChangeLog:

* config/riscv/vector.md: Add attribute.

RISC-V: Adapt live-1.c testcase

Committed.

Fix failures:

FAIL: gcc.target/riscv/rvv/autovec/partial/live-1.c scan-tree-dump-times optimized ".VEC_EXTRACT" 10

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/live-1.c: Adapt test.

Daily bump.

RISC-V: Clang format riscv-vsetvl.cc[NFC]

Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (change_insn): Clang format.
(vector_infos_manager::all_same_ratio_p): Ditto.
(vector_infos_manager::all_same_avl_p): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::local_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::global_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::compute_probabilities): Ditto.

RISC-V: Add riscv-vsetvl.def to t-riscv

This patch will be backported to GCC 13 and committed to trunk.
gcc/ChangeLog:

* config/riscv/t-riscv: Add riscv-vsetvl.def

libgomp, testsuite: Do not call nonstandard functions

The following functions are not standard, and not always available
(e.g., on darwin). They should not be called unless available: gamma,
gammaf, scalb, scalbf, significand, and significandf.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp: Add effective target.
* testsuite/libgomp.c/simd-math-1.c: Avoid calling nonstandard
functions.

analyzer: reimplement kf_strlen [PR105899]

Reimplement kf_strlen in terms of the new string scanning
implementation, sharing strlen's implementation with
__analyzer_get_strlen.

gcc/analyzer/ChangeLog:
PR analyzer/105899
* kf-analyzer.cc (class kf_analyzer_get_strlen): Move to kf.cc.
(register_known_analyzer_functions): Use make_kf_strlen.
* kf.cc (class kf_strlen::impl_call_pre): Replace with
implementation of kf_analyzer_get_strlen from kf-analyzer.cc.
Handle "UNKNOWN" return from check_for_null_terminated_string_arg
by falling back to a conjured svalue.
(make_kf_strlen): New.
(register_known_functions): Use make_kf_strlen.
* known-function-manager.h (make_kf_strlen): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/null-terminated-strings-1.c: Update expected
results on symbolic values.
* gcc.dg/analyzer/strlen-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: maybe_substitute_reqs_for fix

While working on PR109751 I found that maybe_substitute_reqs_for was doing
the wrong thing for a non-template friend, substituting in the template args
of the scope's original template rather than those of the instantiation.
This didn't end up being necessary to fix the PR, but it's still an
improvement.

gcc/cp/ChangeLog:

* pt.cc (outer_template_args): Handle non-template argument.
* constraint.cc (maybe_substitute_reqs_for): Pass decl to it.
* cp-tree.h (outer_template_args): Adjust.

c++: constrained hidden friends [PR109751]

r13-4035 avoided a problem with overloading of constrained hidden friends by
checking satisfaction, but checking satisfaction early is inconsistent with
the usual late checking and can lead to hard errors, so let's not do that
after all.

We were wrongly treating the different instantiations of the same friend
template as the same function because maybe_substitute_reqs_for was failing
to actually substitute in the case of a non-template friend.  But we don't
actually need to do the substitution anyway, because [temp.friend] says that
such a friend can't be the same as any other declaration.

After fixing that, instead of a redefinition error we got an ambiguous
overload error, fixed by allowing constrained hidden friends to coexist
until overload resolution, at which point they probably won't be in the same
ADL overload set anyway.

And we avoid mangling collisions by following the proposed mangling for
these friends as a member function with an extra 'F' before the name.  I
demangle this by just adding [friend] to the name of the function because
it's not feasible to reconstruct the actual scope of the function since the
mangling ABI doesn't distinguish between class and namespace scopes.

PR c++/109751

gcc/cp/ChangeLog:

* cp-tree.h (member_like_constrained_friend_p): Declare.
* decl.cc (member_like_constrained_friend_p): New.
(function_requirements_equivalent_p): Check it.
(duplicate_decls): Check it.
(grokfndecl): Check friend template constraints.
* mangle.cc (decl_mangling_context): Check it.
(write_unqualified_name): Check it.
* pt.cc (uses_outer_template_parms_in_constraints): Fix for friends.
(tsubst_friend_function): Don't check satisfaction.

include/ChangeLog:

* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_FRIEND.

libiberty/ChangeLog:

* cp-demangle.c (d_make_comp): Handle DEMANGLE_COMPONENT_FRIEND.
(d_count_templates_scopes): Likewise.
(d_print_comp_inner): Likewise.
(d_unqualified_name): Handle member-like friend mangling.
* testsuite/demangle-expected: Add test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend11.C: Now works.  Add template.
* g++.dg/cpp2a/concepts-friend15.C: New test.

RISC-V: output Autovec params explicitly in --help ...

... otherwise user has no clue what -param to actually change

gcc/ChangeLog:
* config/riscv/riscv.opt: Add --param names
riscv-autovec-preference and riscv-autovec-lmul

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>

RISC-V: Add multiarch support on riscv-linux-gnu

This adds multiarch support to the RISC-V port so that bootstraps work with
Debian out-of-the-box. Without this patch the stage1 compiler is unable to
find headers/libraries when building the stage1 runtime.

This is functionally (and possibly textually) equivalent to Debian's fix for
the same problem.

gcc/
* config/riscv/t-linux: Add MULTIARCH_DIRNAME.

OpenMP: Handle 'all' as category in defaultmap

Both specifying no category and specifying 'all' imply
that the implicit-behavior applies to all categories.
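
A small usage sketch of the two equivalent spellings follows. The function
names are invented for this example, and it assumes a compiler that accepts
the OpenMP 5.2 'all' category (which is what this patch adds); when compiled
without OpenMP enabled, the pragmas are ignored and the functions still
return the same values.

```c
/* With defaultmap(tofrom: all), every variable category, including the
   scalar x, is implicitly mapped tofrom, so the update made inside the
   target region is visible afterwards.  */
int bump_with_all (int x)
{
  #pragma omp target defaultmap(tofrom: all)
  {
    x += 1;
  }
  return x;
}

/* Omitting the category is meant to behave the same way.  */
int bump_without_category (int x)
{
  #pragma omp target defaultmap(tofrom)
  {
    x += 1;
  }
  return x;
}
```

Without any defaultmap clause a scalar such as x would be implicitly
firstprivate on the target construct, and the increment would not be
observable after the region.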

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_defaultmap): Parse
'all' as category.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_defaultmap): Parse
'all' as category.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_omp_defaultmap_category):
Add OMP_DEFAULTMAP_CAT_ALL.
* openmp.cc (gfc_match_omp_clauses): Parse
'all' as category.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle it.

gcc/ChangeLog:

* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_CATEGORY_ALL.
* gimplify.cc (gimplify_scan_omp_clauses): Handle it.
* tree-pretty-print.cc (dump_omp_clause): Likewise.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.2 status): Add depobj with
destroy-var argument as 'N'. Mark defaultmap with
'all' category as 'Y'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* c-c++-common/gomp/defaultmap-5.c: New test.
* c-c++-common/gomp/defaultmap-6.c: New test.
* gfortran.dg/gomp/defaultmap-10.f90: New test.
* gfortran.dg/gomp/defaultmap-9.f90: New test.

doc: Remove obsolete sentence about _Float* not being supported in C++ [PR106652]

As mentioned in the PR, these types are supported in C++ since GCC 13,
so we shouldn't confuse users.

2023-08-22 Jakub Jelinek <jakub@redhat.com>

PR c++/106652
* doc/extend.texi (_Float<n>): Drop obsolete sentence that the
types aren't supported in C++.

VECT: Add LEN_FOLD_EXTRACT_LAST pattern

Hi, Richard and Richi.

This is the last autovec pattern I want to add for RVV (length loop control).

This patch is intended to handle the following case:

int __attribute__ ((noinline, noclone))
condition_reduction (int *a, int min_v, int n)
{
  int last = 66; /* High start value.  */

  for (int i = 0; i < n; i++)
    if (a[i] < min_v)
      last = i;

  return last;
}

ARM SVE IR:

  ...
  mask__7.11_39 = vect__4.10_37 < vect_cst__38;
  _40 = loop_mask_36 & mask__7.11_39;
  last_5 = .FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32);
  ...

RVV IR, we want to see:
...
loop_len = SELECT_VL
mask__7.11_39 = vect__4.10_37 < vect_cst__38;
last_5 = .LEN_FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32, loop_len, bias);
...
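
A scalar reference model of the intended semantics, as a sketch only: it
assumes that the elements considered are those with index i < len + bias, and
that the result is the last such element whose mask bit is set, falling back
to the first operand when none is. The function name and parameter names are
invented here and are not GCC code.

```c
#include <stddef.h>

/* Scalar model of .LEN_FOLD_EXTRACT_LAST (fallback, mask, vec, len, bias):
   return the last vec[i] with mask[i] set among the first len + bias
   elements, or fallback if no such element exists.  */
int len_fold_extract_last (int fallback, const unsigned char *mask,
                           const int *vec, size_t len, ptrdiff_t bias)
{
  int result = fallback;
  ptrdiff_t active = (ptrdiff_t) len + bias;
  for (ptrdiff_t i = 0; i < active; i++)
    if (mask[i])
      result = vec[i];
  return result;
}
```

In the condition_reduction loop above, vec would hold the vector of
induction variable values and mask the result of a[i] < min_v, with the
length taking over the role that loop_mask plays in the SVE version.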

gcc/ChangeLog:

* doc/md.texi: Add LEN_FOLD_EXTRACT_LAST pattern.
* internal-fn.cc (fold_len_extract_direct): Ditto.
(expand_fold_len_extract_optab_fn): Ditto.
(direct_fold_len_extract_optab_supported_p): Ditto.
* internal-fn.def (LEN_FOLD_EXTRACT_LAST): Ditto.
* optabs.def (OPTAB_D): Ditto.

Simplify interleaved store vectorization processing

When doing interleaving we perform code generation when visiting the
last store of a chain. We keep track of this via DR_GROUP_STORE_COUNT;
the following localizes this to the caller of vectorizable_store,
also avoiding redundant non-processing of the other stores.

* tree-vect-stmts.cc (vectorizable_store): Do not bump
DR_GROUP_STORE_COUNT here. Remove early out.
(vect_transform_stmt): Only call vectorizable_store on
the last element of an interleaving chain.

MAINTAINERS: Update my email address

Signed-off-by: Filip Kastl <fkastl@suse.cz>
ChangeLog:

* MAINTAINERS: Update my email address.

tree-optimization/94864 - vector insert of vector extract simplification

The PRs ask for optimizing of

  _1 = BIT_FIELD_REF <b_3(D), 64, 64>;
  result_4 = BIT_INSERT_EXPR <a_2(D), _1, 64>;

to a vector permutation.  The following implements this as
match.pd pattern, improving code generation on x86_64.
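
For illustration, C source of roughly the following shape is the kind of
thing that gives rise to such an extract plus insert, together with the
equivalent permutation written by hand. This is a sketch using GCC's vector
extensions; the function names are invented here and are not taken from the
new testcases.

```c
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Copy the upper 64-bit element of b into the upper element of a:
   an extract of bits [64, 128) of b followed by an insert at bit 64.  */
v2df insert_high (v2df a, v2df b)
{
  a[1] = b[1];
  return a;
}

/* The same operation as a two-input permutation: selector 0 picks
   a[0], selector 3 picks b[1].  */
v2df permute_high (v2df a, v2df b)
{
  const v2di sel = { 0, 3 };
  return __builtin_shuffle (a, b, sel);
}
```

Both functions return { a[0], b[1] }; the new pattern lets the first form be
represented as the second.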

On the RTL level we face the issue that backend patterns inconsistently
use vec_merge and vec_select of vec_concat to represent permutes.

I think using a (supported) permute is almost always better
than an extract plus insert, maybe excluding the case we extract
element zero and that's aliased to a register that can be used
directly for insertion (not sure how to query that).

The patch FAILs one case in gcc.target/i386/avx512fp16-vmovsh-1a.c
where we now expand from

__A_28 = VEC_PERM_EXPR <x2.8_9, x1.9_10, { 0, 9, 10, 11, 12, 13, 14, 15 }>;

instead of

_28 = BIT_FIELD_REF <x2.8_9, 16, 0>;
__A_29 = BIT_INSERT_EXPR <x1.9_10, _28, 0>;

producing a vpblendw instruction instead of the expected vmovsh.  That's
either a missed vec_perm_const expansion optimization or even better,
an improvement - Zen4 for example has 4 ports to execute vpblendw
but only 3 for executing vmovsh and both instructions have the same size.

The patch XFAILs the sub-testcase.

PR tree-optimization/94864
PR tree-optimization/94865
PR tree-optimization/93080
* match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
for vector insertion from vector extraction.

* gcc.target/i386/pr94864.c: New testcase.
* gcc.target/i386/pr94865.c: Likewise.
* gcc.target/i386/avx512fp16-vmovsh-1a.c: XFAIL.
* gcc.dg/tree-ssa/forwprop-40.c: Likewise.
* gcc.dg/tree-ssa/forwprop-41.c: Likewise.

Fortran: implement vector sections in DATA statements [PR49588]

gcc/fortran/ChangeLog:

PR fortran/49588
* data.cc (gfc_advance_section): Derive next index set and next offset
into DATA variable also for array references using vector sections.
Use auxiliary array to keep track of offsets into indexing vectors.
(gfc_get_section_index): Set up initial indices also for DATA variables
with array references using vector sections.
* data.h (gfc_get_section_index): Adjust prototype.
(gfc_advance_section): Likewise.
* resolve.cc (check_data_variable): Pass vector offsets.

gcc/testsuite/ChangeLog:

PR fortran/49588
* gfortran.dg/data_vector_section.f90: New test.

VECT: Support loop len control on EXTRACT_LAST vectorization

Hi, @Richi and @Richard, based on the previous discussion, I simply fixed the issues for
powerpc and s390 with your suggestions:

-  machine_mode len_load_mode = get_len_load_store_mode
-    (loop_vinfo->vector_mode, true).require ();
-  machine_mode len_store_mode = get_len_load_store_mode
-    (loop_vinfo->vector_mode, false).require ();
+  machine_mode len_load_mode, len_store_mode;
+  if (!get_len_load_store_mode (loop_vinfo->vector_mode, true)
+        .exists (&len_load_mode))
+    return false;
+  if (!get_len_load_store_mode (loop_vinfo->vector_mode, false)
+        .exists (&len_store_mode))
+    return false;

Co-Authored-By: Kewen.Lin <linkw@linux.ibm.com>
gcc/ChangeLog:

* tree-vect-loop.cc (vect_verify_loop_lens): Add exists check.
(vectorizable_live_operation): Add live vectorization for length loop
control.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/live-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/live_run-1.c: New test.

Testcase fix.

gcc/testsuite/ChangeLog:

* gcc.target/i386/invariant-ternlog-1.c: Only scan %rdx under
TARGET_64BIT.

RISC-V: Change fnms testcases assertion to xfail

Hi,

This patch fixes inappropriate assertions in fnms testcases since
we want to generate .COND_FNMS but actually generate .FNMS + .VCOND_MASK.
A patch to do this optimization will follow.

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Adjust.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.

analyzer: check format strings for null termination [PR105899]

This patch extends -fanalyzer to check the format strings of calls
to functions marked with '__attribute__ ((format...))'.

The only checking done in this patch is to check that the format string
is a valid null-terminated string; this patch doesn't attempt to check
the content of the format string.

gcc/analyzer/ChangeLog:
PR analyzer/105899
* call-details.cc (call_details::call_details): New ctor.
* call-details.h (call_details::call_details): New ctor decl.
(struct call_arg_details): Move here from region-model.cc.
* region-model.cc (region_model::check_call_format_attr): New.
(region_model::check_call_args): Call it.
(struct call_arg_details): Move it to call-details.h.
* region-model.h (region_model::check_call_format_attr): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/attr-format-1.c: New test.
* gcc.dg/analyzer/sprintf-1.c: Update expected results for
now-passing tests.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: add kf_fopen

Add checking to -fanalyzer that both params of calls to "fopen" are
valid null-terminated strings.

gcc/analyzer/ChangeLog:
* kf.cc (class kf_fopen): New.
(register_known_functions): Register it.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fopen-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899]

In r14-3169-g325f9e88802daa I added check_for_null_terminated_string_arg
to -fanalyzer, calling it in various places, with a sole check for
unterminated string constants, adding -Wanalyzer-unterminated-string for
this case.

This patch adds region_model::scan_for_null_terminator, which simulates
scanning memory for a zero byte, complaining about uninitialized bytes
and out-of-range accesses seen before any zero byte is seen.

This more flexible approach catches the issues we saw before with
-Wanalyzer-unterminated-string, and also catches uninitialized runs
of bytes, and I believe will be a better way to build checking of C
string operations in the analyzer.
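
The idea of the scan can be sketched in plain C.  This is only a toy model
under stated assumptions, not the analyzer's implementation: the enum
scan_result, the function scan_for_terminator and the per-byte "init" array
are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum scan_result { SCAN_TERMINATED, SCAN_UNINITIALIZED, SCAN_OUT_OF_RANGE };

/* Walk SIZE bytes of a region.  INIT[i] says whether BYTES[i] has been
   written.  On return *POS is the index at which the scan stopped.  */
enum scan_result
scan_for_terminator (const char *bytes, const bool *init, size_t size,
                     size_t *pos)
{
  assert (bytes && init && pos);
  for (size_t i = 0; i < size; i++)
    {
      *pos = i;
      if (!init[i])
        return SCAN_UNINITIALIZED;  /* unwritten byte seen before any zero */
      if (bytes[i] == '\0')
        return SCAN_TERMINATED;     /* found the zero byte */
    }
  *pos = size;
  return SCAN_OUT_OF_RANGE;         /* ran past the end of the region */
}
```

For a 16-byte buffer in which only the first byte has been written with a
nonzero value, this model stops at index 1 with SCAN_UNINITIALIZED.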

Given that the patch makes -Wanalyzer-unterminated-string redundant
and that this option was only in trunk for 10 days and has no known
users, the patch simply removes the option without a compatibility
fallback.

The patch uses custom events and notes to provide context on where
the issues are coming from.  For example, given:

null-terminated-strings-1.c: In function ‘test_partially_initialized’:
null-terminated-strings-1.c:71:3: warning: use of uninitialized value ‘buf[1]’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
   71 |   __analyzer_get_strlen (buf);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  ‘test_partially_initialized’: events 1-3
    |
    |   69 |   char buf[16];
    |      |        ^~~
    |      |        |
    |      |        (1) region created on stack here
    |   70 |   buf[0] = 'a';
    |   71 |   __analyzer_get_strlen (buf);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (2) while looking for null terminator for argument 1 (‘&buf’) of ‘__analyzer_get_strlen’...
    |      |   (3) use of uninitialized value ‘buf[1]’ here
    |
analyzer-decls.h:59:22: note: argument 1 of ‘__analyzer_get_strlen’ must be a pointer to a null-terminated string
   59 | extern __SIZE_TYPE__ __analyzer_get_strlen (const char *ptr);
      |                      ^~~~~~~~~~~~~~~~~~~~~

gcc/analyzer/ChangeLog:
PR analyzer/105899
* analyzer.opt (Wanalyzer-unterminated-string): Delete.
* call-details.cc
(call_details::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
* call-details.h
(call_details::check_for_null_terminated_string_arg): Likewise.
* kf-analyzer.cc (kf_analyzer_get_strlen::impl_call_pre): Wire up
to result of check_for_null_terminated_string_arg.
* region-model.cc (get_strlen): Delete.
(class unterminated_string_arg): Delete.
(struct fragment): New.
(class iterable_cluster): New.
(region_model::get_store_bytes): New.
(get_tree_for_byte_offset): New.
(region_model::scan_for_null_terminator): New.
(region_model::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
Reimplement in terms of scan_for_null_terminator, dropping the
special-case for -Wanalyzer-unterminated-string.
* region-model.h (region_model::get_store_bytes): New decl.
(region_model::scan_for_null_terminator): New decl.
(region_model::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
* store.cc (concrete_binding::get_byte_range): New.
* store.h (concrete_binding::get_byte_range): New decl.
(store_manager::get_concrete_binding): New overload.

gcc/ChangeLog:
PR analyzer/105899
* doc/invoke.texi: Remove -Wanalyzer-unterminated-string.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/error-1.c: Update expected results to reflect
reimplementation of unterminated string detection.  Add test
coverage for uninitialized buffers.
* gcc.dg/analyzer/null-terminated-strings-1.c: Likewise.
* gcc.dg/analyzer/putenv-1.c: Likewise.
* gcc.dg/analyzer/strchr-1.c: Likewise.
* gcc.dg/analyzer/strcpy-1.c: Likewise.
* gcc.dg/analyzer/strdup-1.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: handle NULL inner context in region_model_context_decorator

gcc/analyzer/ChangeLog:
* region-model.cc (region_model_context_decorator::add_event):
Handle m_inner being NULL.
* region-model.h (class region_model_context_decorator): Likewise.
(annotating_context::warn): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: add ability for context to add events to a saved_diagnostic

gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (saved_diagnostic::add_event): New.
(saved_diagnostic::add_any_saved_events): New.
(diagnostic_manager::add_event): New.
(dedupe_winners::emit_best): New.
(diagnostic_manager::emit_saved_diagnostic): Make "sd" param
non-const. Call saved_diagnostic::add_any_saved_events.
* diagnostic-manager.h (saved_diagnostic::add_event): New decl.
(saved_diagnostic::add_any_saved_events): New decl.
(saved_diagnostic::m_saved_events): New field.
(diagnostic_manager::add_event): New decl.
(diagnostic_manager::emit_saved_diagnostic): Make "sd" param
non-const.
* engine.cc (impl_region_model_context::add_event): New.
* exploded-graph.h (impl_region_model_context::add_event): New decl.
* region-model.cc
(noop_region_model_context::add_event): New.
(region_model_context_decorator::add_event): New.
* region-model.h (region_model_context::add_event): New vfunc.
(noop_region_model_context::add_event): New decl.
(region_model_context_decorator::add_event): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: convert note_adding_context to annotating_context

This is enabling work towards the context being able to inject
events into diagnostic paths, rather than just notes after the
warning.

gcc/analyzer/ChangeLog:
* region-model.cc
(class check_external_function_for_access_attr::annotating_ctxt):
Convert to an annotating_context.
* region-model.h (class note_adding_context): Rename to...
(class annotating_context): ...this, updating the "warn" method.
(note_adding_context::make_note): Replace with...
(annotating_context::add_annotations): ...this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Daily bump.

RISC-V: Support RVV VFWREDUSUM.VS rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFWREDUSUM.VS as the below samples

* __riscv_vfwredusum_vs_f32m1_f64m1_rm
* __riscv_vfwredusum_vs_f32m1_f64m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(vfwredusum_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwredusum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wredusum.c: New test.

bpf: neg instruction does not accept an immediate

The BPF virtual machine does not support neg nor neg32 instructions with
an immediate.

The erroneous instructions were removed from binutils:
https://sourceware.org/pipermail/binutils/2023-August/129135.html

Change the define_insn so that an immediate cannot be accepted.

From testing, a neg-immediate was probably never chosen over a
mov-immediate anyway.

gcc/

* config/bpf/bpf.md (neg): Second operand must be a register.

[PATCH] RISC-V: Add Types to Missing Bitmanip Instructions

This patch updates the bitmanip instructions to ensure that no insn is left
without a type attribute. Updates a total of 8 insns to have type "bitmanip"

Tested for regressions using rv32/64 multilib with newlib/linux.

gcc/ChangeLog:

* config/riscv/bitmanip.md: Added bitmanip type to insns
that are missing types.

Remove XFAIL from gcc/testsuite/gcc.dg/unroll-7.c

This test passes since commit e41103081bfa "Fix undefined behaviour in
profile_count::differs_from_p", so remove the xfail annotation.

Tested on aarch64-linux-gnu, armv8l-linux-gnueabihf and x86_64-linux-gnu.

gcc/testsuite/ChangeLog:
* gcc.dg/unroll-7.c: Remove xfail.

[RISCV][committed] Remove spurious newline in ztso sequence

amo-table-ztso-load-3 fails on the coordination branch after merging up the Ztso changes
due to a spurious newline in the output causing scan-function-body to fail.
There's probably an over-zealous .* or similar regexp in the framework. I
didn't see it in a quick scan, but could have easily missed it.

Regardless, fixing the extraneous newline is easy :-)

gcc/
* config/riscv/sync-ztso.md (atomic_load_ztso<mode>): Avoid extraneous
newline.

aarch64: fix format specifier

gcc/ChangeLog:

* config/aarch64/falkor-tag-collision-avoidance.cc (dump_insn_list):
Fix format specifier.

[frange] Return false if nothing changed in union_nans().

When one operand is a known NAN, we always return TRUE from
union_nans(), even if no change occurred. This patch fixes the
oversight.
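
A minimal sketch of the intended contract, assuming a heavily simplified
stand-in for the NAN part of an frange: struct nan_bits and union_nan_bits
are hypothetical names, and the real frange::union_nans tracks more state.
The point is only that the return value reports whether the destination
actually changed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the NAN part of a float range.  */
struct nan_bits { bool pos_nan; bool neg_nan; };

/* Merge SRC into DST.  Return true only if DST actually changed.  */
bool union_nan_bits (struct nan_bits *dst, const struct nan_bits *src)
{
  assert (dst && src);
  bool changed = false;
  if (src->pos_nan && !dst->pos_nan)
    {
      dst->pos_nan = true;
      changed = true;
    }
  if (src->neg_nan && !dst->neg_nan)
    {
      dst->neg_nan = true;
      changed = true;
    }
  return changed;
}
```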

gcc/ChangeLog:

* value-range.cc (frange::union_nans): Return false if nothing
changed.
(range_tests_floats): New test.

[PATCH 2/2] RISC-V: Add quotes to #error messages (all)

From: Tsukasa OI <research_trasio@irq.a4lg.com>

In commit 1aaf3a64e92a ("[PATCH] RISC-V: Deduplicate #error messages in
testsuite"), the author forgot to run the tests after adding
quotes around extension names. To avoid future errors and for consistency
with other #error uses in the RISC-V testsuite, this commit quotes all
unquoted #error messages.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadba.c: Quote unquoted #error message.
* gcc.target/riscv/xtheadbb.c: Ditto.
* gcc.target/riscv/xtheadbs.c: Ditto.
* gcc.target/riscv/xtheadcmo.c: Ditto.
* gcc.target/riscv/xtheadcondmov.c: Ditto.
* gcc.target/riscv/xtheadfmemidx.c: Ditto.
* gcc.target/riscv/xtheadfmv.c: Ditto.
* gcc.target/riscv/xtheadint.c: Ditto.
* gcc.target/riscv/xtheadmac.c: Ditto.
* gcc.target/riscv/xtheadmemidx.c: Ditto.
* gcc.target/riscv/xtheadmempair.c: Ditto.
* gcc.target/riscv/xtheadsync.c: Ditto.
* gcc.target/riscv/zawrs.c: Ditto.
* gcc.target/riscv/zvbb.c: Ditto.
* gcc.target/riscv/zvbc.c: Ditto.
* gcc.target/riscv/zvkg.c: Ditto.
* gcc.target/riscv/zvkned.c: Ditto.
* gcc.target/riscv/zvknha.c: Ditto.
* gcc.target/riscv/zvknhb.c: Ditto.
* gcc.target/riscv/zvksed.c: Ditto.
* gcc.target/riscv/zvksh.c: Ditto.
* gcc.target/riscv/zvkt.c: Ditto.

[PATCH 1/2] RISC-V: Add quotes to #error messages

In commit 1aaf3a64e92a ("[PATCH] RISC-V: Deduplicate #error messages in
testsuite"), the author forgot to run the tests after adding
quotes around extension names. To avoid future errors and for consistency
with other #error uses in the RISC-V testsuite, this commit quotes #error
messages where necessary to avoid current test case failures.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zvkn.c: Quote #error messages.
* gcc.target/riscv/zvkn-1.c: Ditto.
* gcc.target/riscv/zvknc.c: Ditto.
* gcc.target/riscv/zvknc-1.c: Ditto.
* gcc.target/riscv/zvknc-2.c: Ditto.
* gcc.target/riscv/zvkng.c: Ditto.
* gcc.target/riscv/zvkng-1.c: Ditto.
* gcc.target/riscv/zvkng-2.c: Ditto.
* gcc.target/riscv/zvks.c: Ditto.
* gcc.target/riscv/zvks-1.c: Ditto.
* gcc.target/riscv/zvksc.c: Ditto.
* gcc.target/riscv/zvksc-1.c: Ditto.
* gcc.target/riscv/zvksc-2.c: Ditto.
* gcc.target/riscv/zvksg.c: Ditto.
* gcc.target/riscv/zvksg-1.c: Ditto.
* gcc.target/riscv/zvksg-2.c: Ditto.

Fix FAIL: gcc.target/i386/pr87007-5.c

The following fixes the gcc.target/i386/pr87007-5.c testcase which
changed code generation again after the recent sinking improvements.
We now have

        vxorps  %xmm0, %xmm0, %xmm0
        vsqrtsd d2(%rip), %xmm0, %xmm0

and a necessary xor again in one case, the other vsqrtsd has
a register source and a properly zeroing load:

        vmovsd  d3(%rip), %xmm0
        testl   %esi, %esi
        jg      .L11
.L3:
        vsqrtsd %xmm0, %xmm0, %xmm0

the following patch adjusts the scan.

* gcc.target/i386/pr87007-5.c: Update comment, adjust subtest.

Fix gcc.dg/vect/bb-slp-subgroups-2.c with 256bit vectors

The following adds vect128, vect256 and vect512 effective targets
and adjusts gcc.dg/vect/bb-slp-subgroups-2.c accordingly.

gcc/testsuite/
* lib/target-supports.exp: Add vect128, vect256 and vect512
effective targets.
* gcc.dg/vect/bb-slp-subgroups-2.c: Properly handle the
vect256 case.

Fix gcc.dg/vect/pr65947-7.c failures on aarch64.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr65947-7.c: Add target check aarch64*-*-* and scan vect
dump for pattern "optimizing condition reduction with FOLD_EXTRACT_LAST"
for targets that support vect_fold_extract_last.

Fix gcc.dg/vect/bb-slp-46.c FAIL

When relaxing vectorization of possibly overflowing reductions I
failed to update a testcase that will now vectorize and no longer
test for what it was written for. The following replaces the
vectorizable add with a division.

* gcc.dg/vect/bb-slp-46.c: Use division instead of addition
to avoid reduction vectorization.

Adjust testcase for Intel GDS.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512f-pr88464-2.c: Add -mgather to
options.
* gcc.target/i386/avx512f-pr88464-3.c: Ditto.
* gcc.target/i386/avx512f-pr88464-4.c: Ditto.
* gcc.target/i386/avx512f-pr88464-6.c: Ditto.
* gcc.target/i386/avx512f-pr88464-7.c: Ditto.
* gcc.target/i386/avx512f-pr88464-8.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-10.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-12.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-13.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-14.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-15.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-16.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-2.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-4.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-5.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-6.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-7.c: Ditto.
* gcc.target/i386/avx512vl-pr88464-8.c: Ditto.

PR111048: Set arg_npatterns correctly.

In valid_mask_for_fold_vec_perm_cst_p we always set arg_npatterns
to VECTOR_CST_NPATTERNS (arg0) because (q1 & 0) == 0 is always true:

     /* Ensure that the stepped sequence always selects from the same
         input pattern.  */
      unsigned arg_npatterns
        = ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0)
                          : VECTOR_CST_NPATTERNS (arg1);

resulting in wrong code generation.
The patch fixes this by changing the condition to (q1 & 1) == 0.
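
The difference between the two conditions can be checked in isolation.  The
functions below are hypothetical helpers written for this illustration; only
the two conditions themselves are taken from the message above.

```c
#include <assert.h>

/* Condition as written before the fix: q1 & 0 is always 0, so the first
   value is returned for every q1.  */
unsigned pick_buggy (unsigned q1, unsigned npatterns0, unsigned npatterns1)
{
  return ((q1 & 0) == 0) ? npatterns0 : npatterns1;
}

/* Condition after the fix: even q1 picks the first value, odd q1 the
   second.  */
unsigned pick_fixed (unsigned q1, unsigned npatterns0, unsigned npatterns1)
{
  return ((q1 & 1) == 0) ? npatterns0 : npatterns1;
}

/* Compare both selectors over the first few values of q1.  */
void check_selectors (void)
{
  for (unsigned q1 = 0; q1 < 8; q1++)
    {
      assert (pick_buggy (q1, 2, 4) == 2);
      assert (pick_fixed (q1, 2, 4) == (q1 % 2 == 0 ? 2u : 4u));
    }
}
```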

gcc/ChangeLog:
PR tree-optimization/111048
* fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): Set arg_npatterns
correctly.
(fold_vec_perm_cst): Remove workaround and again call
valid_mask_for_fold_vec_perm_cst_p for both VLS and VLA vectors.
(test_fold_vec_perm_cst::test_nunits_min_4): Add test-case.

tree-optimization/111082 - bogus promoted min

vectorize_slp_instance_root_stmt promotes operations with undefined
overflow to unsigned arithmetic but fails to consider operations
that do not overflow like MIN which it turned into MIN with wrong
signedness and in the case of the PR an unsupported operation.
The following rectifies this.
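
Why the punning is fine for an addition but not for MIN can be seen on
scalars.  This is a sketch with hypothetical helper names; it assumes the
GCC behaviour of converting an out-of-range unsigned value to int32_t
modulo 2^32 (implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Addition carried out in unsigned arithmetic and converted back: same
   result as the signed addition whenever that one does not overflow.  */
int32_t add_via_unsigned (int32_t a, int32_t b)
{
  return (int32_t) ((uint32_t) a + (uint32_t) b);
}

/* MIN in the original signed type.  */
int32_t min_signed (int32_t a, int32_t b)
{
  return a < b ? a : b;
}

/* MIN after punning both operands to unsigned: the comparison now uses the
   wrong signedness, so mixed-sign inputs give the wrong answer.  */
int32_t min_via_unsigned (int32_t a, int32_t b)
{
  uint32_t ua = (uint32_t) a;
  uint32_t ub = (uint32_t) b;
  return (int32_t) (ua < ub ? ua : ub);
}

/* Self-check of the claims in the comments above.  */
void check_punning (void)
{
  assert (add_via_unsigned (-1, 1) == 0);
  assert (min_signed (-1, 1) == -1);
  assert (min_via_unsigned (-1, 1) == 1);
}
```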

PR tree-optimization/111082
* tree-vect-slp.cc (vectorize_slp_instance_root_stmt): Only
pun operations that can overflow.

* gcc.dg/pr111082.c: New testcase.

libstdc++: Remove reliance on unspecified behaviour in std::rethrow_if_nested test

This test case calls std::set_terminate while there is an active
exception. Since LWG 2111 it is unspecified which terminate handler is
used when std::nested_exception::rethrow_nested() calls std::terminate.
With libsupc++ the global handler changed by std::set_terminate is used,
but libc++abi uses the active exception's handler (the one that was
current when the exception was first thrown).

Adjust the test case so that it works with either implementation choice.
So that the process doesn't exit cleanly if std::terminate happens
sooner than expected, use a global variable to control when the "clean
terminate" behaviour happens.

libstdc++-v3/ChangeLog:

* testsuite/18_support/nested_exception/rethrow_if_nested-term.cc:
Call std::set_terminate before throwing the nested exception.

LCM: Export 2 helpful functions as global for VSETVL PASS use in RISC-V backend

This patch exports 'compute_antinout_edge' and 'compute_earliest' at global scope
so that they can be used by the VSETVL pass of the RISC-V backend.

Demand fusion merges VSETVL information so that a single emitted VSETVL dominates, and
pre-configures the state for, most of the RVV instructions, which lets redundant VSETVLs be elided.

For example:

for
  for
    for
      if (cond)
        VSETVL demand 1: SEW/LMUL = 16 and TU policy
      else
        VSETVL demand 2: SEW = 32

The VSETVL pass should be able to fuse demand 1 and demand 2 into a new demand: SEW = 32, LMUL = M2, TU policy.
It can then emit that VSETVL outside the outermost for loop to get the best codegen and run-time execution.
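
The arithmetic behind the fused demand can be sketched as follows.  This is
only an illustration of the numbers in the example, not the pass's algorithm:
struct demand, struct fused and fuse are hypothetical, a zero field is taken
to mean "no demand", and only integer LMUL values are handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified demand record; 0 means "no demand" for a field.  */
struct demand
{
  unsigned sew;           /* element width in bits */
  unsigned ratio;         /* SEW / LMUL */
  bool tail_undisturbed;  /* TU policy requested */
};

struct fused
{
  unsigned sew;
  unsigned lmul;
  bool tail_undisturbed;
};

/* Fuse a demand that fixes the SEW/LMUL ratio with one that fixes SEW.
   LMUL then follows as SEW / ratio.  */
struct fused fuse (struct demand a, struct demand b)
{
  unsigned sew = a.sew ? a.sew : b.sew;
  unsigned ratio = a.ratio ? a.ratio : b.ratio;
  assert (sew != 0 && ratio != 0 && sew % ratio == 0);
  struct fused r = { sew, sew / ratio,
                     a.tail_undisturbed || b.tail_undisturbed };
  return r;
}
```

With demand 1 (ratio 16, TU) and demand 2 (SEW 32) this yields SEW = 32,
LMUL = 2 and TU, matching the fused demand described above.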

Currently, Phase 3 (demand fusion) of the VSETVL pass is really messy, unreliable and unmaintainable.
I recently reread the Dragon Book and Morgan's book and found that "earliest" allows us to do the
demand fusion in a very reliable and optimal way.

So, this patch exports these 2 functions, which are very helpful for the VSETVL pass.

gcc/ChangeLog:

* lcm.cc (compute_antinout_edge): Export as global use.
(compute_earliest): Ditto.
(compute_rev_insert_delete): Ditto.
* lcm.h (compute_antinout_edge): Ditto.
(compute_earliest): Ditto.

tree-optimization/111070 - fix ICE with recent ifcombine fix

We now have test coverage for non-SSA name bits so the following amends
the SSA_NAME_OCCURS_IN_ABNORMAL_PHI checks.

PR tree-optimization/111070
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check we have
an SSA name before checking SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

* gcc.dg/pr111070.c: New testcase.

MATCH: [PR111002] Sink view_convert for vec_cond

Like convert, we can sink view_convert into vec_cond, but
we can only do it if the conversion between the element types is a nop_conversion.
This restricts the transform to conversions between signed and unsigned types
and excludes conversions between integer and float types, which mess up the vec_cond
so that isel no longer understands that `a?-1:0` is still that.
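
The distinction can be seen on a single lane.  The sketch below uses
hypothetical helper names and assumes a 32-bit IEEE float: reinterpreting
the lane of `a?-1:0` as unsigned keeps an all-ones or all-zeros mask, while
reinterpreting it as float turns the all-ones pattern into a NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One lane of `a ? -1 : 0`.  */
int32_t mask_lane (int cond)
{
  return cond ? -1 : 0;
}

/* Reinterpret the bits of a signed lane as unsigned: the mask survives.  */
uint32_t view_as_unsigned (int32_t x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Reinterpret the bits of a signed lane as float (assumes a 32-bit IEEE
   float): the all-ones pattern is a NaN, no longer a recognizable mask.  */
float view_as_float (int32_t x)
{
  float f;
  assert (sizeof f == sizeof x);
  memcpy (&f, &x, sizeof f);
  return f;
}
```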

OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

PR tree-optimization/111002

gcc/ChangeLog:

* match.pd (view_convert(vec_cond(a,b,c))): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/cond_convert_8.c: New test.

Testsuite, LTO: silence warning to make test pass on Darwin

gcc/testsuite/ChangeLog:

* gcc.dg/lto/20091013-1_2.c: Add -Wno-stringop-overread.

Support -march=gracemont

Alderlake-N is E-core only, add it as an alias of Alderlake.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_intel_cpu): Detect
Alderlake-N.
* common/config/i386/i386-common.cc (alias_table): Support
-march=gracemont as an alias of -march=alderlake.

Daily bump.

PR modula2/111085 nexttoward and nexttowardf contain incorrect definitions

The definitions of the procedures nexttoward and nexttowardf contain
incorrect second parameter and return types. This bug was
discovered when attempting to resolve PR 108143 and the fix is applied
separately and prior to PR 108143.

gcc/m2/ChangeLog:

PR modula2/111085
* gm2-libs/Builtins.def (nexttoward): Alter the second
parameter to LONGREAL.
(nexttowardf): Alter the second parameter to LONGREAL.
* gm2-libs/Builtins.mod (nexttoward): Alter the second
parameter to LONGREAL.
(nexttowardf): Alter the second parameter to LONGREAL.
* gm2-libs/cbuiltin.def (nexttoward): Alter the second
parameter to LONGREAL.
(nexttowardf): Alter the second parameter to LONGREAL.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Testsuite, darwin: account for macOS 13 and 14

gcc/testsuite/ChangeLog:

* gcc.dg/darwin-minversion-link.c: Account for macOS 13 and 14.

testsuite: Adjust g++.dg/gomp/pr58567.C to new compiler message

Commit 92d1425ca780 "c++: redundant targ coercion for var/alias tmpls"
changed the compiler error message in this testcase from

<source>: In instantiation of 'void foo() [with T = int]':
<source>:14:11:   required from here
<source>:8:22: error: 'int' is not a class, struct, or union type
<source>:8:22: error: 'int' is not a class, struct, or union type
<source>:8:22: error: 'int' is not a class, struct, or union type
<source>:8:3: error: expected iteration declaration or initialization
compiler exited with status 1

to:

<source>: In instantiation of 'void foo() [with T = int]':
<source>:14:11:   required from here
<source>:8:22: error: 'int' is not a class, struct, or union type
<source>:8:3: error: invalid type for iteration variable 'i'
compiler exited with status 1
Excess errors:
<source>:8:3: error: invalid type for iteration variable 'i'

Andrew Pinski analysed the issue in PR 110756 and considered that it was a
testsuite issue in that the error message changed slightly.  Also, it's a
better error message.

Therefore, we only need to adjust the testcase to expect the new message.

gcc/testsuite/ChangeLog:
PR testsuite/110756
* g++.dg/gomp/pr58567.C: Adjust to new compiler error message.

Testsuite, darwin: Fix analyzer testcases

On darwin, system headers are fortified by default and that defeats the
analyzer's warnings on memcpy() calls. Turn this off for testing.

gcc/testsuite/ChangeLog:

* gcc.dg/plugin/taint-CVE-2011-0521-5-fixed.c: Use
_FORTIFY_SOURCE=0 on darwin.
* gcc.dg/plugin/taint-CVE-2011-0521-5.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-6.c: Likewise.

Testsuite: mark IPA test as requiring alias support

This was indicated in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85656
but never committed. Without it, the test fails on darwin.

gcc/testsuite/ChangeLog:
* gcc.dg/ipa/ipa-icf-38.c: Require alias support.

Testsuite, plugin: make testcase pattern more flexible

On Darwin, the message recorded in the sarif file contains:
"message": {"text": "Segmentation fault: 11"}
which is different from, e.g., linux:
"message": {"text": "Segmentation fault"}

Adjusting the testcase pattern to be a little more flexible.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/crash-test-write-though-null-sarif.c: Update
expected pattern.

i386: Micro-optimize ix86_expand_sse_extend

Partial vector src is forced to a register as ops[1], we can use it
instead of SRC in the call to ix86_expand_sse_cmp. This change avoids
forcing operand[1] to a register in sign/zero-extend expanders.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_sse_extend): Use ops[1]
instead of src in the call to ix86_expand_sse_cmp.
* config/i386/sse.md (<any_extend:insn>v8qiv8hi2): Do not
force operands[1] to a register.
(<any_extend:insn>v4hiv4si2): Ditto.
(<any_extend:insn>v2siv2di2): Ditto.

d: Merge upstream dmd, druntime 26f049fb26, phobos 330d6a4fd.

D front-end changes:

- Import dmd v2.105.0-beta.1.
- Added predefined version identifier VisionOS (ignored by GDC).
- Functions can no longer have `enum` storage class.
- The deprecation of the `body` keyword has been reverted, it is
  now an obsolete feature.
- The error for `scope class` has been reverted, it is now an
  obsolete feature.

D runtime changes:

- Import druntime v2.105.0-beta.1.

Phobos changes:

- Import phobos v2.105.0-beta.1.
- AliasSeq has been removed from std.math.
- extern(C) getdelim and getline have been removed from
  std.stdio.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 26f049fb26.
* dmd/VERSION: Bump version to v2.105.0-beta.1.
* d-codegen.cc (get_frameinfo): Check useGC in condition.
* d-lang.cc (d_handle_option): Set obsolete parameter when compiling
with -Wall.
(d_post_options): Set useGC to false when compiling with
-fno-druntime.  Propagate obsolete flag to compileEnv.
* expr.cc (ExprVisitor::visit (CatExp *)): Check useGC in condition.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 26f049fb26.
* src/MERGE: Merge upstream phobos 330d6a4fd.

Testsuite: fix analyzer tests on Darwin

On macOS, system headers redefine by default some macros (memcpy,
memmove, etc) to checked versions, which defeats the analyzer. We
want to turn this off.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104042

gcc/testsuite/ChangeLog:

PR analyzer/104042
* gcc.dg/analyzer/analyzer.exp: Pass -D_FORTIFY_SOURCE=0 on Darwin.
* gcc.dg/analyzer/fd-bind.c: Add missing <string.h> header.
* gcc.dg/analyzer/fd-datagram-socket.c: Likewise.
* gcc.dg/analyzer/fd-listen.c: Likewise.
* gcc.dg/analyzer/fd-socket-misuse.c: Likewise.
* gcc.dg/analyzer/fd-stream-socket-active-open.c: Likewise.
* gcc.dg/analyzer/fd-stream-socket-passive-open.c: Likewise.
* gcc.dg/analyzer/fd-stream-socket.c: Likewise.
* gcc.dg/analyzer/fd-symbolic-socket.c: Likewise.

MATCH: Sink convert for vec_cond

Convert can be sunk into a vec_cond if both sides
fold. Unlike for other unary operations, we need to check that we can still handle
the resulting vec_cond, i.e. that the type of its first operand is the same as the new truth type.

I tried a few different versions of this patch:
- view_convert to the new truth_type, but that does not work as we always support all vec_cond
  afterwards.
- using expand_vec_cond_expr_p, but that would allow too much.

I also tried to see if view_convert can be handled here but we end up with:
_3 = VEC_COND_EXPR <_2, { Nan(-1), Nan(-1), Nan(-1), Nan(-1) }, { 0.0, 0.0, 0.0, 0.0 }>;
Which isel does not know how to handle as just being a view_convert from `vector(4) <signed-boolean:32>`
to `vector(4) float` and causes a regression with `g++.target/i386/pr88152.C`

Note, in the case of the SVE testcase, we will sink negate after the convert and be able
to remove a few extra instructions in the end.
Also with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass.

Committed as approved after bootstrapping and testing on x86_64-linux-gnu and aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/111006
PR tree-optimization/110986
* match.pd: (op(vec_cond(a,b,c))): Handle convert for op.

gcc/testsuite/ChangeLog:

PR tree-optimization/111006
* gcc.target/aarch64/sve/cond_convert_7.c: New test.

fix misleading indentation breaking bootstrap

Fix indentation issue introduced by 966f3c13
"Fix format attribute for printf".

gcc/c-family/ChangeLog:

* c-format.cc: Fix indentation.

improve error when /usr/include isn't found [PR90835]

This is a pretty simple patch that ought to help Darwin users understand
better why their build is failing when they forget to pass the
--with-sysroot= flag to configure.

gcc/ChangeLog:

PR target/90835
* Makefile.in: improve error message when /usr/include is
missing

Fix format attribute for printf

For a long time (since GCC 4.4?), GCC has supported annotating functions
with either the format attribute "gnu_printf" or "ms_printf" to
distinguish between different format string interpretations.

However, it seems like the attribute is ignored for the "printf"
symbol; regardless of what the function declaration says, GCC treats
it as "ms_printf". This has become an issue now that mingw-w64
supports using the UCRT instead of msvcrt.dll, and in this case
the stdio functions are declared with the gnu_printf attribute,
and inttypes.h uses the same format specifiers as in GNU mode.

A reproducible example of the problem:

$ cat format.c
__attribute__((__format__ (gnu_printf, 1, 2))) int printf (const char *__format, ...);
__attribute__((__format__ (gnu_printf, 1, 2))) int othername (const char *__format, ...);

void function(void) {
    long long unsigned x = 42;
    othername("%llu\n", x);
    printf("%llu\n", x);
}
$ x86_64-w64-mingw32-gcc -c -Wformat format.c
format.c: In function 'function':
format.c:7:15: warning: unknown conversion type character 'l' in format [-Wformat=]
    7 |     printf("%llu\n", x);
      |               ^
format.c:7:12: warning: too many arguments for format [-Wformat-extra-args]
    7 |     printf("%llu\n", x);
      |            ^~~~~~~~

Note how both functions, printf and othername, are declared with
identical gnu_printf format attributes - GCC does take this into
account for "othername" and doesn't produce a warning, but GCC
seems to disregard the attribute in the printf declaration and
behave as if it was declared as ms_printf.

If the printf function declaration is changed into a static inline
function, the actual attribute used is honored though.

gcc/c-family/ChangeLog:

PR c/95130
* c-format.cc: skip default format for printf symbol if
explicitly declared by prototype.

Signed-off-by: Tomas Kalibera <tomas.kalibera@gmail.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>

Daily bump.