git.ipfire.org Git - thirdparty/gcc.git/log
11 months agolower SLP load permutation to interleaving
Richard Biener [Mon, 13 May 2024 12:57:01 +0000 (14:57 +0200)] 
lower SLP load permutation to interleaving

The following emulates classical interleaving for SLP load permutes
that we are unlikely to handle natively.  This is to handle cases
where interleaving (or load/store-lanes) is the optimal choice for
vectorizing even when we are doing that within SLP.  An example
would be

void foo (int * __restrict a, int * b)
{
  for (int i = 0; i < 16; ++i)
    {
      a[4*i + 0] = b[4*i + 0] * 3;
      a[4*i + 1] = b[4*i + 1] + 3;
      a[4*i + 2] = (b[4*i + 2] * 3 + 3);
      a[4*i + 3] = b[4*i + 3] * 3;
    }
}

where currently the SLP store is merging four single-lane SLP
sub-graphs but none of the loads in it can be code-generated
with V4SImode vectors and a VF of four as the permutes would need
three vectors.

The patch introduces a lowering phase after SLP discovery but
before SLP pattern recognition or permute optimization that
analyzes all loads from the same dataref group and creates an
interleaving scheme starting from an unpermuted load.

What can be handled is a power-of-two group size and a group size of
three.  Doing the interleaving with a load-lanes-like instruction is
left as a follow-up.
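As a rough illustration of the power-of-two case, the sketch below emulates even/odd extraction on plain arrays for a group size of four at VF == 4.  The helper names and the scalar emulation are assumptions made for this example; they are not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Emulate a two-input "extract even" (PARITY 0) or "extract odd"
   (PARITY 1) permute: view A and B as one concatenated vector of
   2*N lanes and keep every second lane starting at PARITY.  */
static void
extract_lanes (const int *a, const int *b, size_t n, int parity, int *out)
{
  assert (n % 2 == 0 && (parity == 0 || parity == 1));
  for (size_t i = 0; i < n; ++i)
    {
      size_t src = 2 * i + (size_t) parity;   /* lane in concat (a, b) */
      out[i] = src < n ? a[src] : b[src - n];
    }
}

/* Gather lane LANE (0..3) of a group of four from 16 contiguous
   elements, using two levels of even/odd extraction on 4-lane vectors.  */
void
deinterleave4 (const int *mem, int lane, int *out)
{
  int lo[4], hi[4];
  assert (lane >= 0 && lane < 4);
  /* Level 1: the low bit of LANE picks even or odd, which halves the
     effective group size from four to two.  */
  extract_lanes (mem + 0, mem + 4, 4, lane & 1, lo);
  extract_lanes (mem + 8, mem + 12, 4, lane & 1, hi);
  /* Level 2: the high bit of LANE picks within the remaining pair.  */
  extract_lanes (lo, hi, 4, lane >> 1, out);
}
```

Every step here combines at most two input vectors, which is the property that makes the scheme code-generatable where a direct permute would need more.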

For a group-size of three this is done by using
the non-interleaving fallback code which then creates at VF == 4 from
{ { a0, b0, c0 }, { a1, b1, c1 }, { a2, b2, c2 }, { a3, b3, c3 } }
the intermediate vectors { c0, c0, c1, c1 } and { c2, c2, c3, c3 }
to produce { c0, c1, c2, c3 }.  This turns out to be more effective
than the scheme implemented for non-SLP for SSE and only slightly
worse for AVX512 and somewhat worse for AVX2.  It seems to me that
this would extend to other non-power-of-two group-sizes though (but
the patch does not).  Optimal schemes are likely difficult to lay out
in VF agnostic form.
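The group-size-three sequence described above can be emulated on plain arrays as follows.  This is only a sketch: the selector values are derived here from the lane positions in the example and are an assumption, as are the helper names.

```c
#include <assert.h>

enum { VF = 4 };

/* Emulate a two-input permute: lane I of OUT is lane SEL[I] of the
   concatenation of A and B (selector values 0 .. 2*VF-1).  */
static void
perm2 (const int *a, const int *b, const int *sel, int *out)
{
  for (int i = 0; i < VF; ++i)
    {
      assert (sel[i] >= 0 && sel[i] < 2 * VF);
      out[i] = sel[i] < VF ? a[sel[i]] : b[sel[i] - VF];
    }
}

/* Extract the third lane ("c") of a group of three from 12 contiguous
   elements { a0, b0, c0, a1, b1, c1, a2, b2, c2, a3, b3, c3 }.  */
void
extract_c_of_3 (const int *mem, int *out)
{
  const int *v0 = mem, *v1 = mem + VF, *v2 = mem + 2 * VF;
  static const int sel_lo[VF] = { 2, 2, 5, 5 };   /* c0 c0 c1 c1 */
  static const int sel_hi[VF] = { 0, 0, 3, 3 };   /* c2 c2 c3 c3 */
  static const int even[VF] = { 0, 2, 4, 6 };     /* even extract */
  int t0[VF], t1[VF];

  perm2 (v0, v1, sel_lo, t0);   /* two inputs: first and second vector */
  perm2 (v2, v2, sel_hi, t1);   /* single input: third vector */
  perm2 (t0, t1, even, out);    /* { c0, c1, c2, c3 } */
}
```

No single step needs three input vectors, at the cost of two intermediate permutes per extracted lane.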

I'll note that while the lowering assumes even/odd extract is
generally available for all vector element sizes (which is probably
a good assumption), it doesn't in any way constrain the other
permutes it generates based on target availability.  Again difficult
to do in a VF agnostic way (but at least currently the vector type
is fixed).

I'll also note that the SLP store side merges lanes in a way
producing three-vector permutes for store group-size of three, so
the testcase uses a store group-size of four.

The patch has a fallback for when there are multi-lane groups
and the resulting permutes do not fit interleaving.  Code
generation is not optimal when this triggers and might be
worse than doing single-lane group interleaving.

The patch handles gaps by representing them with NULL
entries in SLP_TREE_SCALAR_STMTS for the unpermuted load node.
The SLP discovery changes could be elided if we manually build the
load node instead.

SLP load nodes covering enough lanes to not need intermediate
permutes are retained as having a load-permutation and do not
use the single SLP load node for each dataref group.  That's
something we might want to change, making load-permutation
something purely local to SLP discovery (but then SLP discovery
could do part of the lowering).

The patch misses CSEing intermediate generated permutes and
registering them with the bst_map which is possibly required
for SLP pattern detection in some cases - this re-spin of the
patch moves the lowering after SLP pattern detection.

* tree-vect-slp.cc (vect_build_slp_tree_1): Handle NULL stmt.
(vect_build_slp_tree_2): Likewise.  Release load permutation
when there's a NULL in SLP_TREE_SCALAR_STMTS and assert there's
no actual permutation in that case.
(vllp_cmp): New function.
(vect_lower_load_permutations): Likewise.
(vect_analyze_slp): Call it.

* gcc.dg/vect/slp-11a.c: Expect SLP.
* gcc.dg/vect/slp-12a.c: Likewise.
* gcc.dg/vect/slp-51.c: New testcase.
* gcc.dg/vect/slp-52.c: New testcase.

11 months ago[PATCH] RISC-V: Optimize the cost of the DFmode register move for RV32.
Xianmiao Qu [Mon, 2 Sep 2024 04:28:13 +0000 (22:28 -0600)] 
[PATCH] RISC-V: Optimize the cost of the DFmode register move for RV32.

Currently, in RV32, even with the D extension enabled, the cost of DFmode
register moves is still set to 'COSTS_N_INSNS (2)'. This results in the
'lower-subreg' pass splitting DFmode register moves into two SImode SUBREG
register moves, leading to the generation of many redundant instructions.

As an example, consider the following test case:
  double foo (int t, double a, double b)
  {
    if (t > 0)
      return a;
    else
      return b;
  }

When compiling with -march=rv32imafdc -mabi=ilp32d, the following code is generated:
          .cfi_startproc
          addi    sp,sp,-32
          .cfi_def_cfa_offset 32
          fsd     fa0,8(sp)
          fsd     fa1,16(sp)
          lw      a4,8(sp)
          lw      a5,12(sp)
          lw      a2,16(sp)
          lw      a3,20(sp)
          bgt     a0,zero,.L1
          mv      a4,a2
          mv      a5,a3
  .L1:
          sw      a4,24(sp)
          sw      a5,28(sp)
          fld     fa0,24(sp)
          addi    sp,sp,32
          .cfi_def_cfa_offset 0
          jr      ra
          .cfi_endproc

After adjusting the DFmode register move's cost to 'COSTS_N_INSNS (1)', the
generated code is as follows, with a significant reduction in the number
of instructions.
          .cfi_startproc
          ble     a0,zero,.L5
          ret
  .L5:
          fmv.d   fa0,fa1
          ret
          .cfi_endproc

gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Optimize the cost of the
DFmode register move for RV32.

gcc/testsuite/
* gcc.target/riscv/rv32-movdf-cost.c: New test.

11 months ago[committed][PR rtl-optimization/116544] Fix test for promoted subregs
Jeff Law [Mon, 2 Sep 2024 04:16:04 +0000 (22:16 -0600)] 
[committed][PR rtl-optimization/116544] Fix test for promoted subregs

This is a small bug in the ext-dce code's handling of promoted subregs.

Essentially when we see a promoted subreg we need to make additional bit groups
live as various parts of the RTL path know that an extension of a suitably
promoted subreg can be trivially eliminated.

When I added support for dealing with this quirk I failed to account for the
larger modes properly and it ignored the case when the size of the inner object
was > 32 bits.  Oops.

This does _not_ fix the outstanding x86 issue.  That's caused by something
completely different and more concerning ;(

Bootstrapped and regression tested on x86.  Obviously fixes the testcase on
riscv as well.

Pushing to the trunk.

PR rtl-optimization/116544
gcc/
* ext-dce.cc (ext_dce_process_uses): Fix thinko in promoted subreg
handling.

gcc/testsuite/
* gcc.dg/torture/pr116544.c: New test.

11 months agoi386: Support vec_cmp for V8BF/V16BF/V32BF in AVX10.2
Levy Hsu [Mon, 2 Sep 2024 02:24:49 +0000 (10:24 +0800)] 
i386: Support vec_cmp for V8BF/V16BF/V32BF in AVX10.2

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_use_mask_cmp_p): Add BFmode
for int mask cmp.
* config/i386/sse.md (vec_cmp<mode><avx512fmaskmodelower>): New
vec_cmp expand for VBF modes.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-bf-vector-cmpp-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-cmpp-1.c: Ditto.

11 months agoi386: Support vectorized BF16 sqrt with AVX10.2 instruction
Levy Hsu [Mon, 2 Sep 2024 02:24:48 +0000 (10:24 +0800)] 
i386: Support vectorized BF16 sqrt with AVX10.2 instruction

gcc/ChangeLog:

* config/i386/sse.md: Expand VF2H to VF2HB with VBF modes.

11 months agoi386: Support vectorized BF16 smaxmin with AVX10.2 instructions
Levy Hsu [Mon, 2 Sep 2024 02:24:47 +0000 (10:24 +0800)] 
i386: Support vectorized BF16 smaxmin with AVX10.2 instructions

gcc/ChangeLog:

* config/i386/sse.md
(<code><mode>3): New define expand pattern for BF smaxmin.

gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-bf-vector-smaxmin-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-smaxmin-1.c: New test.

11 months agoi386: Support vectorized BF16 FMA with AVX10.2 instructions
Levy Hsu [Mon, 2 Sep 2024 02:24:46 +0000 (10:24 +0800)] 
i386: Support vectorized BF16 FMA with AVX10.2 instructions

gcc/ChangeLog:

* config/i386/sse.md: Add V8BF/V16BF/V32BF to mode iterator FMAMODEM.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-bf-vector-fma-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-fma-1.c: New test.

11 months agoi386: Support vectorized BF16 add/sub/mul/div with AVX10.2 instructions
Levy Hsu [Mon, 2 Sep 2024 02:24:45 +0000 (10:24 +0800)] 
i386: Support vectorized BF16 add/sub/mul/div with AVX10.2 instructions

AVX10.2 introduces several non-exception instructions for BF16 vector.
Enable vectorized BF add/sub/mul/div operation by supporting standard
optab for them.

gcc/ChangeLog:

* config/i386/sse.md (div<mode>3): New expander for BFmode div.
(VF_BHSD): New mode iterator with vector BFmodes.
(<insn><mode>3<mask_name><round_name>): Change mode to VF_BHSD.
(mul<mode>3<mask_name><round_name>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-bf-vector-operations-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-operations-1.c: Ditto.

11 months agoi386: Optimize generate insn for AVX10.2 compare
Hu, Lin1 [Mon, 2 Sep 2024 02:24:36 +0000 (10:24 +0800)] 
i386: Optimize generate insn for AVX10.2 compare

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_fp_compare): Add UNSPEC to
support the optimization.
* config/i386/i386.cc (ix86_fp_compare_code_to_integer): Add NE/EQ.
* config/i386/i386.md (*cmpx<unord><MODEF:mode>): New define_insn.
(*cmpx<unord>hf): Ditto.
* config/i386/predicates.md (ix86_trivial_fp_comparison_operator):
Add ne/eq.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-compare-1b.c: New test.

11 months agoi386: Optimize ordered and nonequal
Hu, Lin1 [Mon, 2 Sep 2024 02:24:31 +0000 (10:24 +0800)] 
i386: Optimize ordered and nonequal

Currently, when we input !__builtin_isunordered (a, b) && (a != b), gcc
will emit
  ucomiss %xmm1, %xmm0
  movl $1, %ecx
  setp %dl
  setnp %al
  cmovne %ecx, %edx
  andl %edx, %eax
  movzbl %al, %eax

In fact,
  xorl %eax, %eax
  ucomiss %xmm1, %xmm0
  setne %al
is better.
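The rewrite rests on a De Morgan identity that also holds when NaNs are involved.  A minimal sketch of the two source-level forms, using the standard isunordered macro from <math.h> in place of the builtin (the function names are made up for this example):

```c
#include <math.h>

/* The form as written by the user: ordered AND not equal.  */
int
ordered_ne (float a, float b)
{
  return !isunordered (a, b) && (a != b);
}

/* The equivalent form: NOT (unordered OR equal).  */
int
not_unordered_or_eq (float a, float b)
{
  return !(isunordered (a, b) || a == b);
}
```

Both functions return 0 whenever either operand is a NaN and otherwise agree with a plain inequality test.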

gcc/ChangeLog:

* match.pd: Optimize (and ordered non-equal) to
(not (or unordered equal))

gcc/testsuite/ChangeLog:

* gcc.target/i386/optimize_one.c: New test.

11 months agoi386: Auto vectorize sdot_prod, usdot_prod, udot_prod with AVX10.2 instructions
Haochen Jiang [Mon, 2 Sep 2024 02:24:29 +0000 (10:24 +0800)] 
i386: Auto vectorize sdot_prod, usdot_prod, udot_prod with AVX10.2 instructions

gcc/ChangeLog:

* config/i386/sse.md (VI1_AVX512VNNIBW): New.
(VI2_AVX10_2): Ditto.
(sdot_prod<mode>): Add AVX10.2
to auto vectorize and combine 512 bit part.
(udot_prod<mode>): Ditto.
(sdot_prodv64qi): Removed.
(udot_prodv64qi): Ditto.
(usdot_prod<mode>): Add AVX10.2 to auto vectorize.
(udot_prod<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/vnniint16-auto-vectorize-2.c: Only define
TEST when not defined.
* gcc.target/i386/vnniint8-auto-vectorize-2.c: Ditto.
* gcc.target/i386/vnniint16-auto-vectorize-3.c: New test.
* gcc.target/i386/vnniint16-auto-vectorize-4.c: Ditto.
* gcc.target/i386/vnniint8-auto-vectorize-3.c: Ditto.
* gcc.target/i386/vnniint8-auto-vectorize-4.c: Ditto.

11 months agoRISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 3
Pan Li [Sun, 18 Aug 2024 06:08:21 +0000 (14:08 +0800)] 
RISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 3

This patch adds test cases for the unsigned scalar quad and
oct .SAT_TRUNC form 3, namely:

Form 3:
  #define DEF_SAT_U_TRUC_FMT_3(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_3 (WT x) \
  {                                        \
    WT max = (WT)(NT)-1;                   \
    return x <= max ? (NT)x : (NT) max;    \
  }

QUAD:
DEF_SAT_U_TRUC_FMT_3 (uint16_t, uint64_t)
DEF_SAT_U_TRUC_FMT_3 (uint8_t, uint32_t)

OCT:
DEF_SAT_U_TRUC_FMT_3 (uint8_t, uint64_t)

The test below passes with this patch.
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_trunc-16.c: New test.
* gcc.target/riscv/sat_u_trunc-17.c: New test.
* gcc.target/riscv/sat_u_trunc-18.c: New test.
* gcc.target/riscv/sat_u_trunc-run-16.c: New test.
* gcc.target/riscv/sat_u_trunc-run-17.c: New test.
* gcc.target/riscv/sat_u_trunc-run-18.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoRISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 2
Pan Li [Sun, 18 Aug 2024 04:49:47 +0000 (12:49 +0800)] 
RISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 2

This patch adds test cases for the unsigned scalar quad and
oct .SAT_TRUNC form 2, namely:

Form 2:
  #define DEF_SAT_U_TRUC_FMT_2(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \
  {                                        \
    WT max = (WT)(NT)-1;                   \
    return x > max ? (NT) max : (NT)x;     \
  }

QUAD:
DEF_SAT_U_TRUC_FMT_2 (uint16_t, uint64_t)
DEF_SAT_U_TRUC_FMT_2 (uint8_t, uint32_t)

OCT:
DEF_SAT_U_TRUC_FMT_2 (uint8_t, uint64_t)

The test below passes with this patch.
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_trunc-10.c: New test.
* gcc.target/riscv/sat_u_trunc-11.c: New test.
* gcc.target/riscv/sat_u_trunc-12.c: New test.
* gcc.target/riscv/sat_u_trunc-run-10.c: New test.
* gcc.target/riscv/sat_u_trunc-run-11.c: New test.
* gcc.target/riscv/sat_u_trunc-run-12.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoRISC-V: Add testcases for form 4 of unsigned vector .SAT_ADD IMM
Pan Li [Fri, 30 Aug 2024 03:01:37 +0000 (11:01 +0800)] 
RISC-V: Add testcases for form 4 of unsigned vector .SAT_ADD IMM

This patch adds test cases for the unsigned vector .SAT_ADD
when one of the operands is IMM.

Form 4:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_4(T, IMM)                               \
  T __attribute__((noinline))                                               \
  vec_sat_u_add_imm##IMM##_##T##_fmt_4 (T *out, T *in, unsigned limit)      \
  {                                                                         \
    unsigned i;                                                             \
    T ret;                                                                  \
    for (i = 0; i < limit; i++)                                             \
      {                                                                     \
        out[i] = __builtin_add_overflow (in[i], IMM, &ret) == 0 ? ret : -1; \
      }                                                                     \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_4(uint64_t, 123)
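Written out by hand, the loop for this instantiation looks roughly like the sketch below.  The function name, the void return type and the const qualifier are choices made for this example, not part of the test header; __builtin_add_overflow is the GCC/Clang builtin the macro itself relies on.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating OUT[i] = IN[i] + 123 on uint64_t elements.  */
void
sat_u_add_imm123_u64 (uint64_t *out, const uint64_t *in, size_t limit)
{
  for (size_t i = 0; i < limit; i++)
    {
      uint64_t ret;
      /* The builtin returns nonzero when the sum does not fit in RET;
         in that case clamp to the all-ones maximum.  */
      out[i] = __builtin_add_overflow (in[i], 123, &ret) == 0
               ? ret : (uint64_t) -1;
    }
}
```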

The test below passes with this patch.
* The full rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-16.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoRISC-V: Add testcases for form 3 of unsigned vector .SAT_ADD IMM
Pan Li [Fri, 30 Aug 2024 00:36:45 +0000 (08:36 +0800)] 
RISC-V: Add testcases for form 3 of unsigned vector .SAT_ADD IMM

This patch adds test cases for the unsigned vector .SAT_ADD
when one of the operands is IMM.

Form 3:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM)                          \
  T __attribute__((noinline))                                          \
  vec_sat_u_add_imm##IMM##_##T##_fmt_3 (T *out, T *in, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    T ret;                                                             \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        out[i] = __builtin_add_overflow (in[i], IMM, &ret) ? -1 : ret; \
      }                                                                \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_3(uint64_t, 123)

The test below passes with this patch.
* The full rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-10.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-11.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-12.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-9.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-10.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-11.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-12.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoRISC-V: Refactor gen zero_extend rtx for SAT_* when expand SImode in RV64
Pan Li [Fri, 30 Aug 2024 06:07:12 +0000 (14:07 +0800)] 
RISC-V: Refactor gen zero_extend rtx for SAT_* when expand SImode in RV64

Previously, we had special handling for both .SAT_ADD and .SAT_SUB
on unsigned int.  Both need to take care of zero-extending SImode
in RV64.  Thus merge the two helper functions into one to avoid
code duplication.

The test suite below passes with this patch.
* The full rv64gcv regression test.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Merge
the zero_extend handling from func riscv_gen_unsigned_xmode_reg.
(riscv_gen_unsigned_xmode_reg): Remove.
(riscv_expand_ussub): Leverage riscv_gen_zero_extend_rtx
instead of riscv_gen_unsigned_xmode_reg.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_sub-11.c: Adjust asm check.
* gcc.target/riscv/sat_u_sub-15.c: Ditto.
* gcc.target/riscv/sat_u_sub-19.c: Ditto.
* gcc.target/riscv/sat_u_sub-23.c: Ditto.
* gcc.target/riscv/sat_u_sub-27.c: Ditto.
* gcc.target/riscv/sat_u_sub-3.c: Ditto.
* gcc.target/riscv/sat_u_sub-31.c: Ditto.
* gcc.target/riscv/sat_u_sub-35.c: Ditto.
* gcc.target/riscv/sat_u_sub-39.c: Ditto.
* gcc.target/riscv/sat_u_sub-43.c: Ditto.
* gcc.target/riscv/sat_u_sub-47.c: Ditto.
* gcc.target/riscv/sat_u_sub-7.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-11_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-15_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-3_2.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7_1.c: Ditto.
* gcc.target/riscv/sat_u_sub_imm-7_2.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoDaily bump.
GCC Administrator [Mon, 2 Sep 2024 00:16:51 +0000 (00:16 +0000)] 
Daily bump.

11 months agoslsr: Use simple_dce_from_worklist in SLSR [PR116554]
Andrew Pinski [Sun, 1 Sep 2024 00:23:19 +0000 (17:23 -0700)] 
slsr: Use simple_dce_from_worklist in SLSR [PR116554]

While working on a phiopt patch, it was noticed that
SLSR would leave around some unused ssa names. Let's
add simple_dce_from_worklist usage to SLSR to remove
the dead statements. This should give a small improvement
for passes afterwards.

Bootstrapped and tested on x86_64.

gcc/ChangeLog:

PR tree-optimization/116554
* gimple-ssa-strength-reduction.cc: Include tree-ssa-dce.h.
(replace_mult_candidate): Add sdce_worklist argument, mark
the rhs1/rhs2 for maybe dceing.
(replace_unconditional_candidate): Add sdce_worklist argument,
Update call to replace_mult_candidate.
(replace_conditional_candidate): Add sdce_worklist argument,
update call to replace_mult_candidate.
(replace_uncond_cands_and_profitable_phis): Add sdce_worklist argument,
update call to replace_conditional_candidate,
replace_unconditional_candidate, and replace_uncond_cands_and_profitable_phis.
(replace_one_candidate): Add sdce_worklist argument, mark
the orig_rhs1/orig_rhs2 for maybe dceing.
(replace_profitable_candidates): Add sdce_worklist argument,
update call to replace_one_candidate and replace_profitable_candidates.
(analyze_candidates_and_replace): Call simple_dce_from_worklist and
update calls to replace_profitable_candidates, and
replace_uncond_cands_and_profitable_phis.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agotestsuite: Prune compilation messages for modules tests
Hans-Peter Nilsson [Sun, 18 Aug 2024 05:01:06 +0000 (07:01 +0200)] 
testsuite: Prune compilation messages for modules tests

All testsuite compiler calls pass through default_target_compile in the
dejagnu installation (typically /usr/share/dejagnu/target.exp) which
also calls the dejagnu-installed prune_warnings.

Normally, tests using the dg framework (most or all tests these days)
compile and link by calling various wrappers that end up calling
dg-test in the dejagnu installation, typically installed as
/usr/share/dejagnu/dg.exp.  That, besides the compiler call, also
calls ${tool}-dg-prune (g++-dg-prune) on the messages, which in turn
ends up calling prune_gcc_output in gcc/testsuite/lib/prune.exp.  That
gcc-specific "pruning" function handles more cases than the dejagnu
prune_warnings, and also has updated patterns.

But, module_do_it in modules.exp calls the lower-level
${tool}_target_compile "directly", i.e. g++_target_compile defined in
gcc/testsuite/lib/g++.exp.  That does not call ${tool}-dg-prune,
meaning those test-cases miss the gcc-specific pruning.

Noticed while testing a dejagnu update that handled the minuscule "in"
in the warning (line-breaks added below besides the original one after
"(void*)':")

"/path/to/cris-elf/bin/ld:
/gccobj/cris-elf/./libstdc++-v3/src/.libs/libstdc++.a(random.o): in
function `std::(anonymous namespace)::__libc_getentropy(void*)':
/gccsrc/libstdc++-v3/src/c++11/random.cc:183: warning: _getentropy is
not implemented and will always fail"

The line saying "in function" rather than "In function" (from the
binutils linker since 2018) is pruned by prune_gcc_output. The
prune_warnings in dejagnu-1.6.3 and earlier handles the second line
separately.  It's an unfortunate wart that neither consumes the
delimiting line-break, leaving it to the callers to prune residual empty
lines.  See prune_warnings in dejagnu (default_target_compile and
dg-test) for those other line-break fixups, as alluded to in the comment.

* g++.dg/modules/modules.exp (module_do_it): Prune compilation
messages.

11 months agoDaily bump.
GCC Administrator [Sun, 1 Sep 2024 00:25:25 +0000 (00:25 +0000)] 
Daily bump.

11 months agoi386: Support read-modify-write memory operands in STV.
Roger Sayle [Sat, 31 Aug 2024 20:17:18 +0000 (14:17 -0600)] 
i386: Support read-modify-write memory operands in STV.

This patch enables STV when the first operand of a TImode binary
logic operand (AND, IOR or XOR) is a memory operand, which is commonly
the case with read-modify-write instructions.

A different motivating example from the one given previously is:

__int128 m, p, q;
void foo() {
    m ^= (p & q);
}

Currently with -O2 -mavx the RMW instructions are rejected by STV,
resulting in scalar code:

foo: movq    p(%rip), %rax
        movq    p+8(%rip), %rdx
        andq    q(%rip), %rax
        andq    q+8(%rip), %rdx
        xorq    %rax, m(%rip)
        xorq    %rdx, m+8(%rip)
        ret

With this patch they become scalar-to-vector candidates:

foo: vmovdqa p(%rip), %xmm0
        vpand   q(%rip), %xmm0, %xmm0
        vpxor   m(%rip), %xmm0, %xmm0
        vmovdqa %xmm0, m(%rip)
        ret

2024-08-31  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (timode_scalar_to_vector_candidate_p):
Support the first operand of AND, IOR and XOR being MEM_P, i.e. a
read-modify-write insn.

gcc/testsuite/ChangeLog
* gcc.target/i386/movti-2.c: Change dg-options to -Os.
* gcc.target/i386/movti-4.c: Expected output of original movti-2.c.

11 months agolibobjc: Add cast to void* to disable warning for casting between incompatible functi...
Andrew Pinski [Sat, 31 Aug 2024 18:57:32 +0000 (11:57 -0700)] 
libobjc: Add cast to void* to disable warning for casting between incompatible function types [PR89586]

Even though __objc_get_forward_imp returns an IMP type, it will be cast to a compatible function
type before calling it. So adding a cast to `void*` will disable the warning about the incompatible type.
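The general technique can be sketched in plain C as follows.  All names here are hypothetical and unrelated to the libobjc sources, and converting between function pointers and `void*` is a widely supported extension rather than something ISO C guarantees.

```c
/* A function pointer type used for storage, deliberately incompatible
   with the real signature of the function stored in it.  */
typedef long (*stored_fn) (long);
typedef int (*binop_fn) (int, int);

int
add_ints (int a, int b)
{
  return a + b;
}

/* A direct (stored_fn) f cast is the kind of conversion that
   -Wcast-function-type diagnoses; going through void * avoids it.  */
stored_fn
erase_type (binop_fn f)
{
  return (stored_fn) (void *) f;
}

/* Cast back to the real type before calling, so the call itself is
   made through a compatible function type.  */
int
call_as_binop (stored_fn g, int a, int b)
{
  binop_fn f = (binop_fn) (void *) g;
  return f (a, b);
}
```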

Pushed after bootstrap/test on x86_64.

libobjc/ChangeLog:

PR libobjc/89586
* sendmsg.c (__objc_get_forward_imp): Add cast to `void*` before casting to IMP.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agoAVR: Run pass avr-fuse-add a second time after pass_cprop_hardreg.
Georg-Johann Lay [Fri, 30 Aug 2024 17:38:30 +0000 (19:38 +0200)] 
AVR: Run pass avr-fuse-add a second time after pass_cprop_hardreg.

gcc/
* config/avr/avr-passes.cc (avr_pass_fuse_add) <clone>: Override.
* config/avr/avr-passes.def (avr_pass_fuse_add): Run again
after pass_cprop_hardreg.

11 months agoAVR: Tidy pass avr-fuse-add.
Georg-Johann Lay [Fri, 30 Aug 2024 17:38:30 +0000 (19:38 +0200)] 
AVR: Tidy pass avr-fuse-add.

gcc/
* config/avr/avr-protos.h (avr_split_tiny_move): Rename to
avr_split_fake_addressing_move.
* config/avr/avr-passes.def: Same.
* config/avr/avr-passes.cc: Same.
(avr_pass_data_fuse_add) <tv_id>: Set to TV_MACH_DEP.
* config/avr/avr.md (split-lpmx): Remove a define_split.  Such
splits are performed by avr_split_fake_addressing_move.

11 months agotestsuite, c++, coroutines: Avoid 'unused' warnings [NFC].
Iain Sandoe [Sat, 31 Aug 2024 11:53:40 +0000 (12:53 +0100)] 
testsuite, c++, coroutines: Avoid 'unused' warnings [NFC].

The 'torture' section of the coroutine tests is primarily about checking
correct operation of the generated code.  It should, ideally, be possible
to run this part of the testsuite with '-Wall' and expect no fails.  In
the case that we wish to test for a specific diagnostic (and that it does
not appear over a range of optimisation/debug conditions) then we should
make that explicit (as done, for example, in pr109867.C).

The tests amended here have warnings because of unused entities; in many
cases those are relevant to the test, and so we just mark them with
__attribute__((__unused__)).

We amend the debug output in coro.h to avoid similar warnings when print
output is disabled (the default).
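One common way to get this effect is sketched below in C.  It is an assumption about the general technique, not the actual contents of coro.h; the OUTPUT macro name and the traced_add function are invented for the example.

```c
#include <stdio.h>

#ifdef OUTPUT
#  define PRINTF(...) printf (__VA_ARGS__)
#else
/* The arguments still appear in a call that is never executed, so the
   compiler treats them as used, but nothing is printed.  */
#  define PRINTF(...) do { if (0) printf (__VA_ARGS__); } while (0)
#endif

int
traced_add (int a, int b)
{
  int sum = a + b;
  const char *label = "sum";   /* only referenced through PRINTF */
  PRINTF ("%s = %d\n", label, sum);
  return sum;
}
```

With a macro that expands to nothing, `label` would be reported as unused whenever output is disabled; keeping the arguments inside dead code avoids that without changing behaviour.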

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro.h: Use a variadic macro for PRINTF to
avoid unused warnings when output is disabled.
* g++.dg/coroutines/torture/co-await-04-control-flow.C: Avoid
unused warnings.
* g++.dg/coroutines/torture/co-ret-13-template-2.C: Likewise.
* g++.dg/coroutines/torture/exceptions-test-01-n4849-a.C: Likewise.
* g++.dg/coroutines/torture/local-var-04-hiding-nested-scopes.C:
Likewise.
* g++.dg/coroutines/torture/pr109867.C: Likewise.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
11 months agotestsuite, c++, coroutines: Correct a test intent.
Iain Sandoe [Sat, 31 Aug 2024 11:42:36 +0000 (12:42 +0100)] 
testsuite, c++, coroutines: Correct a test intent.

The intention of the series of tests numbered pr95615-* is to
verify that entities created by the ramp and potentially needing
destruction are correctly handled when exceptions are thrown.
Because of a typo, one case was not being checked correctly (the
return object).  This patch amends the check to test that the
returned object is properly deleted.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr95615.inc: Check that the
task object produced by get_return_object is correctly
deleted on exception.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
11 months agoc++, coroutines: Make and use a frame access helper.
Iain Sandoe [Tue, 27 Aug 2024 13:52:26 +0000 (14:52 +0100)] 
c++, coroutines: Make and use a frame access helper.

In the review of earlier patches it was suggested that we might make
use of finish_class_access_expr instead of doing a lookup for the
member and then a build_class_access_expr call.

finish_class_access_expr does a lot more work than we need and ends
up calling build_class_access_expr anyway.  So, instead, this patch
makes a new helper to do the lookup and build and uses that helper
everywhere except instances in the ramp function that we are going
to handle separately.

gcc/cp/ChangeLog:

* coroutines.cc (coro_build_frame_access_expr): New.
(transform_await_expr): Use coro_build_frame_access_expr.
(transform_local_var_uses): Likewise.
(build_actor_fn): Likewise.
(build_destroy_fn): Likewise.
(cp_coroutine_transform::build_ramp_function): Likewise.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
11 months agohppa: Enable PA 2.0 symbolic operands on ELF32 targets
John David Anglin [Sat, 31 Aug 2024 16:20:14 +0000 (12:20 -0400)] 
hppa: Enable PA 2.0 symbolic operands on ELF32 targets

The GNU ELF32 linker has been fixed and it can now handle PA 2.0
symbolic relocations.

This only affects non-pic code generation.

2024-08-31  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (pa_emit_move_sequence): Remove symbolic
memory work arounds for TARGET_ELF32.
(pa_legitimate_address_p): Likewise.  Allow symbolic
operands.  Adjust comment.
* config/pa/pa.md: Replace reg_or_0_or_nonsymb_mem_operand
with reg_or_0_or_mem_operand predicate in various unnamed
move insns.
* config/pa/predicates.md (floating_point_store_memory_operand):
Update comment.  Remove symbolic memory work arounds for
TARGET_ELF32.
(nonsymb_mem_operand): Rename to mem_operand.  Allow
symbolic memory operands.
(reg_or_0_or_nonsymb_mem_operand): Rename to
reg_or_0_or_mem_operand.  Allow symbolic memory operands.

11 months agophiopt: Ignore some nop statements in heursics [PR116098]
Andrew Pinski [Fri, 30 Aug 2024 17:36:24 +0000 (10:36 -0700)] 
phiopt: Ignore some nop statements in heursics [PR116098]

The heuristic that was added for PR71016 tries to search to see
if the conversion was being moved away from its definition. The problem
is the heuristic would stop if there was a non-GIMPLE_ASSIGN (it already ignores
debug statements) and in this case we would have a GIMPLE_LABEL that was not
being ignored. So we need to ignore GIMPLE_NOP, GIMPLE_LABEL and GIMPLE_PREDICT.
Note this is now similar to how gimple_empty_block_p behaves.

Note this fixes the wrong code that was reported by moving the VCE (conversion) out before
the phiopt/match could convert it into a bit_ior and move the VCE out with the VCE being
conditionally valid.

Bootstrapped and tested on x86_64-linux-gnu.
Also built and tested for aarch64-linux-gnu.

PR tree-optimization/116098

gcc/ChangeLog:

* tree-ssa-phiopt.cc (factor_out_conditional_operation): Ignore
nops, labels and predicts for heuristic for conversion with a constant.

gcc/testsuite/ChangeLog:

* c-c++-common/torture/pr116098-1.c: New test.
* gcc.target/aarch64/csel-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agotestsuite: Change what is being tested for pr66726-2.c
Andrew Pinski [Fri, 30 Aug 2024 16:53:01 +0000 (09:53 -0700)] 
testsuite: Change what is being tested for pr66726-2.c

r14-575-g6d6c17e45f62cf changed the debug dump message but the testcase
pr66726-2.c was not updated for the change. The testcase was searching to
make sure we didn't factor out a conversion but the testcase was no longer
testing that so we needed to update what was being searched for.

Tested on x86_64-linux.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr66726-2.c: Update scan dump message.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agoFortran: downgrade use associated namelist group name to legacy extension
Harald Anlauf [Fri, 30 Aug 2024 19:15:43 +0000 (21:15 +0200)] 
Fortran: downgrade use associated namelist group name to legacy extension

The Fortran standard disallows use associated names as namelist group name
(e.g. F2003:C581, but also later standards).  This feature is a gfortran
legacy extension, and we should give a warning even for -std=gnu.

gcc/fortran/ChangeLog:

* match.cc (gfc_match_namelist): Downgrade feature from GNU to
legacy extension.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr88169_3.f90: Adjust pattern.

11 months agoc++: Add unsequenced C++ testcase
Jakub Jelinek [Sat, 31 Aug 2024 14:03:20 +0000 (16:03 +0200)] 
c++: Add unsequenced C++ testcase

This is the testcase I wrote originally and which on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659154.html
patch didn't behave the way I wanted (no warning and no optimizations of
[[unsequenced]] function templates which don't have pointer/reference
arguments).
Posting this separately, because it depends on the above mentioned
patch as well as the PR116175
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659157.html
patch.

2024-08-31  Jakub Jelinek  <jakub@redhat.com>

* g++.dg/ext/attr-unsequenced-1.C: New test.

11 months agoc: Add support for unsequenced and reproducible attributes
Jakub Jelinek [Sat, 31 Aug 2024 13:58:23 +0000 (15:58 +0200)] 
c: Add support for unsequenced and reproducible attributes

C23 added in N2956 ( https://open-std.org/JTC1/SC22/WG14/www/docs/n2956.htm )
two new attributes, which are described as similar to GCC const and pure
attributes, but they aren't really the same and it seems that even the paper
is missing some of the differences.
The paper says unsequenced is the same as const on functions without pointer
arguments and reproducible is the same as pure on such functions (except
that they are function type attributes rather than function
declaration ones), but it seems the paper doesn't consider the finiteness GCC
relies on (aka non-DECL_LOOPING_CONST_OR_PURE_P) - the paper only talks
about using the attributes for CSE etc., not for DCE.

The following patch introduces (for now limited) support for those
attributes, both as standard C23 attributes and as GNU extensions (the
difference is that the patch is then less strict on where it allows them,
like other function type attributes they can be specified on function
declarations as well and apply to the type, while C23 standard ones must
go on the function declarators (i.e. after closing paren after function
parameters) or in type specifiers of function type).

If a function doesn't have any pointer/reference arguments, the patch
adds an additional internal attribute with " noptr" suffix which then is
used by flags_from_decl_or_type to handle those easy cases as
ECF_CONST|ECF_LOOPING_CONST_OR_PURE or
ECF_PURE|ECF_LOOPING_CONST_OR_PURE.
The harder cases aren't handled right now, I'd hope they can be handled
incrementally.
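
As a minimal sketch of one of the easy cases (no pointer parameters),
assuming a compiler that may or may not know the attribute: the macro
ATTR_UNSEQUENCED and the functions square and twice_square are
hypothetical names used only for illustration, and the macro expands to
nothing when __has_c_attribute does not report support.

```c
/* Hypothetical helper macro: expands to [[unsequenced]] only when the
   compiler reports support for the attribute, otherwise to nothing.  */
#if defined(__has_c_attribute)
# if __has_c_attribute(unsequenced)
#  define ATTR_UNSEQUENCED [[unsequenced]]
# endif
#endif
#ifndef ATTR_UNSEQUENCED
# define ATTR_UNSEQUENCED
#endif

/* Standard placement: on the function declarator, after the closing
   paren of the parameter list.  There are no pointer parameters, so
   this is one of the easy cases.  */
int
square (int x) ATTR_UNSEQUENCED
{
  return x * x;
}

/* Two calls with the same argument; a compiler honouring the attribute
   is allowed to evaluate square (x) only once (CSE).  */
int
twice_square (int x)
{
  return square (x) + square (x);
}
```

The result is the same whether or not the calls are merged, which is
the point of the attribute: twice_square (3) yields 18 either way.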

I wonder whether we shouldn't emit a warning for the
gcc.dg/c23-attr-{reproducible,unsequenced}-5.c cases.  While the standard
clearly specifies that composite types should union the attributes and
that is what GCC has implemented for decades, for ?: that feels dangerous
for the new attributes; it would be much better to be conservative on say
(cond ? unsequenced_function : normal_function) (args)

There are no diagnostics on incorrect [[unsequenced]] or [[reproducible]]
function definitions; while I think diagnosing non-const static/TLS
declarations in the former could be easy, the rest feels hard.  E.g. the
const/pure discovery can just punt on everything it doesn't understand,
but complete diagnostics would need to understand it.

2024-08-31  Jakub Jelinek  <jakub@redhat.com>

PR c/116130
gcc/
* doc/extend.texi (unsequenced, reproducible): Document new function
type attributes.
* calls.cc (flags_from_decl_or_type): Handle "unsequenced noptr" and
"reproducible noptr" attributes.
gcc/c-family/
* c-attribs.cc (c_common_gnu_attributes): Add entries for
"unsequenced", "reproducible", "unsequenced noptr" and
"reproducible noptr" attributes.
(handle_unsequenced_attribute): New function.
(handle_reproducible_attribute): Likewise.
* c-common.h (handle_unsequenced_attribute): Declare.
(handle_reproducible_attribute): Likewise.
* c-lex.cc (c_common_has_attribute): Return 202311 for standard
unsequenced and reproducible attributes.
gcc/c/
* c-decl.cc (handle_std_unsequenced_attribute): New function.
(handle_std_reproducible_attribute): Likewise.
(std_attributes): Add entries for "unsequenced" and "reproducible"
attributes.
(c_warn_type_attributes): Add TYPE argument.  Allow unsequenced
or reproducible attributes if it is FUNCTION_TYPE.
(groktypename): Adjust c_warn_type_attributes caller.
(grokdeclarator): Likewise.
(finish_declspecs): Likewise.
* c-parser.cc (c_parser_declaration_or_fndef): Likewise.
* c-tree.h (c_warn_type_attributes): Add TYPE argument.
gcc/testsuite/
* c-c++-common/attr-reproducible-1.c: New test.
* c-c++-common/attr-reproducible-2.c: New test.
* c-c++-common/attr-unsequenced-1.c: New test.
* c-c++-common/attr-unsequenced-2.c: New test.
* gcc.dg/c23-attr-reproducible-1.c: New test.
* gcc.dg/c23-attr-reproducible-2.c: New test.
* gcc.dg/c23-attr-reproducible-3.c: New test.
* gcc.dg/c23-attr-reproducible-4.c: New test.
* gcc.dg/c23-attr-reproducible-5.c: New test.
* gcc.dg/c23-attr-reproducible-5-aux.c: New file.
* gcc.dg/c23-attr-unsequenced-1.c: New test.
* gcc.dg/c23-attr-unsequenced-2.c: New test.
* gcc.dg/c23-attr-unsequenced-3.c: New test.
* gcc.dg/c23-attr-unsequenced-4.c: New test.
* gcc.dg/c23-attr-unsequenced-5.c: New test.
* gcc.dg/c23-attr-unsequenced-5-aux.c: New file.
* gcc.dg/c23-has-c-attribute-2.c: Add tests for unsequenced
and reproducible attributes.

11 months agoAVR: Don't print a space after , when printing instructions.
Georg-Johann Lay [Sat, 31 Aug 2024 08:58:12 +0000 (10:58 +0200)] 
AVR: Don't print a space after , when printing instructions.

gcc/
* config/avr/avr.cc: Follow the convention to not add a space
after comma when printing instructions.

11 months agoOptimize initialization of small padded objects
Alexandre Oliva [Sat, 31 Aug 2024 09:03:12 +0000 (06:03 -0300)] 
Optimize initialization of small padded objects

When small objects containing padding bits (or bytes) are fully
initialized, we will often store them in registers, and setting
bitfields and other small fields will attempt to preserve the
uninitialized padding bits, which tends to be expensive.
Zero-initializing registers, OTOH, tends to be cheap.

So, if we're optimizing, zero-initialize such small padded objects
even if that's not needed for correctness.  We can't zero-initialize
all such padding objects, though: if there's no padding whatsoever,
and all fields are initialized with nonzero, the zero initialization
would be flagged as dead.  That's why we introduce machinery to detect
whether objects have padding bits.  I considered distinguishing
between bitfields, units and larger padding elements, but I didn't
pursue that distinction.
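
For illustration only, this is the kind of object the paragraphs above
are about.  The struct small and the function make_small are
hypothetical and not taken from the patch or its testcase; whether and
where padding appears depends on the ABI.

```c
/* Hypothetical small aggregate.  On typical ABIs there are unused bits
   after the bit-fields and padding bytes before 'value'.  */
struct small
{
  unsigned char tag;
  unsigned int flag : 1;
  unsigned int mode : 3;
  int value;
};

/* Every named field is initialized, but the padding is not.  Without
   the optimization, each field store has to preserve the undefined
   padding bits of the register holding the object.  */
struct small
make_small (int value)
{
  struct small s = { .tag = 1, .flag = 1, .mode = 5, .value = value };
  return s;
}
```

The observable field values are the same with or without the extra
zero-initialization; only the generated code is expected to differ.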

Since the object's zero-initialization subsumes fields'
zero-initialization, the empty string test in builtin-snprintf-6.c's
test_assign_aggregate would regress without the addition of
native_encode_constructor.

for  gcc/ChangeLog

* expr.cc (categorize_ctor_elements_1): Change p_complete to
int, to distinguish complete initialization in presence or
absence of uninitialized padding bits.
(categorize_ctor_elements): Likewise.  Adjust all callers...
* expr.h (categorize_ctor_elements): ... and declaration.
(type_has_padding_at_level_p): New.
* gimple-fold.cc (type_has_padding_at_level_p): New.
* fold-const.cc (native_encode_constructor): New.
(native_encode_expr): Call it.
* gimplify.cc (gimplify_init_constructor): Clear small
non-addressable non-volatile objects with padding or
other uninitialized fields as an optimization.

for  gcc/testsuite/ChangeLog

* gcc.dg/init-pad-1.c: New.

11 months agoDaily bump.
GCC Administrator [Sat, 31 Aug 2024 00:18:22 +0000 (00:18 +0000)] 
Daily bump.

11 months agoc++: add fixed test [PR101099]
Marek Polacek [Fri, 30 Aug 2024 21:09:19 +0000 (17:09 -0400)] 
c++: add fixed test [PR101099]

-fconcepts-ts is no longer supported so this can't be made to ICE
anymore.

PR c++/101099

gcc/testsuite/ChangeLog:

* g++.dg/concepts/pr101099.C: New test.

11 months agoc++: add fixed test [PR115616]
Marek Polacek [Fri, 30 Aug 2024 20:34:11 +0000 (16:34 -0400)] 
c++: add fixed test [PR115616]

This got fixed by r15-2120.

PR c++/115616

gcc/testsuite/ChangeLog:

* g++.dg/template/friend83.C: New test.

11 months agoc++: fix used but not defined warning for friend
Jason Merrill [Thu, 29 Aug 2024 17:27:13 +0000 (13:27 -0400)] 
c++: fix used but not defined warning for friend

Here limit_bad_template_recursion avoids instantiating foo, and then we
wrongly warn that it isn't defined, because as a non-template (but
templated) friend DECL_TEMPLATE_INSTANTIATION is false.

gcc/cp/ChangeLog:

* decl2.cc (c_parse_final_cleanups): Also check
DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/used-inline1.C: New test.

11 months agoFortran: default-initialization of derived-type function results [PR98454]
Harald Anlauf [Thu, 29 Aug 2024 20:17:07 +0000 (22:17 +0200)] 
Fortran: default-initialization of derived-type function results [PR98454]

gcc/fortran/ChangeLog:

PR fortran/98454
* resolve.cc (resolve_symbol): Add default-initialization of
non-allocatable, non-pointer derived-type function results.

gcc/testsuite/ChangeLog:

PR fortran/98454
* gfortran.dg/alloc_comp_class_4.f03: Remove bogus pattern.
* gfortran.dg/pdt_26.f03: Adjust expected count.
* gfortran.dg/derived_result_3.f90: New test.

11 months agogdbhooks: Fix printing of vec with vl_ptr layout
Alex Coplan [Fri, 30 Aug 2024 14:29:34 +0000 (15:29 +0100)] 
gdbhooks: Fix printing of vec with vl_ptr layout

As it stands, the pretty printing of GCC's vecs by gdbhooks.py only
handles vectors with vl_embed layout.  As such, when encountering a vec
with vl_ptr layout, GDB would print a diagnostic like:

  gdb.error: There is no member or method named m_vecpfx.

when (e.g.) any such vec occurred in a backtrace.  This patch extends
VecPrinter.children to also handle vl_ptr vectors.

gcc/ChangeLog:

* gdbhooks.py (VEC_KIND_EMBED): New.
(VEC_KIND_PTR): New.
(get_vec_kind): New.
(VecPrinter.children): Also handle vectors with vl_ptr layout.

11 months agoDon't remove /usr/lib and /lib from when passing to the linker [PR97304/104707]
Andrew Pinski [Tue, 16 Apr 2024 19:06:51 +0000 (12:06 -0700)] 
Don't remove /usr/lib and /lib from when passing to the linker [PR97304/104707]

With newer ld, the default library search path does not include /usr/lib or /lib,
but the driver decides not to pass -L down to the linker for these, and then in
some/most cases libc is not found.
This code dates from at least 1992 and it is done in a way which is not safe and
does not make sense. So let's remove it.

Bootstrapped and tested on x86_64-linux-gnu (which defaults to being a multilib).

gcc/ChangeLog:

PR driver/104707
PR driver/97304

* gcc.cc (is_directory): Don't exclude /usr/lib and /lib
from library directory paths. Remove library argument.
(add_to_obstack): Update call to is_directory.
(driver_handle_option): Likewise.
(spec_path): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agomiddle-end: Remove integer_three_node [PR116537]
Andrew Pinski [Thu, 29 Aug 2024 18:01:56 +0000 (11:01 -0700)] 
middle-end: Remove integer_three_node [PR116537]

After the small expansion patch for __builtin_prefetch, the
only use of integer_three_node is inside tree-ssa-loop-prefetch.cc so let's
remove it as the loop prefetch pass is not enabled these days by default and
having a tree node around just for that pass is a little wasteful. Integer
constants are also shared these days so calling build_int_cst will use the cached
node anyways.

Bootstrapped and tested on x86_64-linux.

PR middle-end/116537

gcc/ChangeLog:

* tree-core.h (enum tree_index): Remove TI_INTEGER_THREE.
* tree-ssa-loop-prefetch.cc (issue_prefetch_ref): Call build_int_cst
instead of using integer_three_node.
* tree.cc (build_common_tree_nodes): Remove initialization
of integer_three_node.
* tree.h (integer_three_node): Delete.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agoexpand: Small speed up expansion of __builtin_prefetch
Andrew Pinski [Thu, 29 Aug 2024 17:58:41 +0000 (10:58 -0700)] 
expand: Small speed up expansion of __builtin_prefetch

This is a small speed up of the expansion of __builtin_prefetch.
Basically, for the optional arguments there is no reason to call
expand_normal on a constant integer whose value we already know; just
use GEN_INT/const0_rtx instead.
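
For reference, the optional arguments in question are the second (rw)
and third (locality) arguments of __builtin_prefetch, which default to
0 and 3.  A small usage sketch follows; the function names and the
prefetch distance PREFETCH_AHEAD are hypothetical.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; real code would tune it.  */
#define PREFETCH_AHEAD 16

/* All three arguments given explicitly: address, rw = 0 (read),
   locality = 3 (high temporal locality).  */
long
sum_explicit (const int *data, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (i + PREFETCH_AHEAD < n)
        __builtin_prefetch (&data[i + PREFETCH_AHEAD], 0, 3);
      sum += data[i];
    }
  return sum;
}

/* Same loop relying on the defaults for the optional arguments.  */
long
sum_defaults (const int *data, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; ++i)
    {
      if (i + PREFETCH_AHEAD < n)
        __builtin_prefetch (&data[i + PREFETCH_AHEAD]);
      sum += data[i];
    }
  return sum;
}

/* Sums 1..100 with both variants and adds the two results.  */
long
demo_sum (void)
{
  int data[100];
  for (int i = 0; i < 100; ++i)
    data[i] = i + 1;
  return sum_explicit (data, 100) + sum_defaults (data, 100);
}
```

Prefetching is only a hint, so both variants compute the same sum.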

Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

* builtins.cc (expand_builtin_prefetch): Rewrite expansion of the optional
arguments to not expand known constants.

11 months agoPR modula2/116181: m2rts fix -Wodr warning
Gaius Mulley [Fri, 30 Aug 2024 13:22:01 +0000 (14:22 +0100)] 
PR modula2/116181: m2rts fix -Wodr warning

This patch fixes the -Wodr warning seen in pge-boot/m2rts.h
when building pge.

gcc/m2/ChangeLog:

PR modula2/116181
* pge-boot/GM2RTS.h: Regenerate.
* pge-boot/m2rts.h: Ditto.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
11 months agoAvoid division by zero via constant_multiple_p
Richard Biener [Fri, 30 Aug 2024 07:50:32 +0000 (09:50 +0200)] 
Avoid division by zero via constant_multiple_p

With recent SLP vectorization patches I see RISC-V division by zero
for gfortran.dg/matmul_10.f90 and others in get_group_load_store_type
which does

              && can_div_trunc_p (group_size
                                  * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
                                  nunits, &tem, &remain)
              && (known_eq (remain, 0u)
                  || (constant_multiple_p (nunits, remain, &num)
                      && (vector_vector_composition_type (vectype, num,
                                                          &half_vtype)
                          != NULL_TREE))))
            overrun_p = false;

where for [2, 2] / [0, 2] the condition doesn't reflect what we
are trying to test - that, when remain is zero or, when non-zero,
nunits is a multiple of remain, we can avoid touching a gap via
loading smaller pieces and vector composition.

It isn't safe to change the known_eq to maybe_eq so instead
require known_ne (remain, 0u) before doing constant_multiple_p.
There's the corresponding code in vectorizable_load that's known
to have a latent similar issue, so sync that up as well.
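
A simplified model of the guarded check, to show why known_eq alone is
not enough.  This is not GCC's poly_int implementation: struct poly2
and the poly_* helpers are hypothetical stand-ins for a value of the
form c0 + c1 * x with unknown x >= 0, and the vector composition test
is left out.

```c
#include <stdbool.h>

/* Simplified stand-in for a two-coefficient poly_int: the runtime value
   is c0 + c1 * x for some unknown x >= 0.  */
struct poly2 { unsigned c0, c1; };

/* Zero for every x only when both coefficients are zero.  */
static bool
poly_known_eq_zero (struct poly2 p)
{
  return p.c0 == 0 && p.c1 == 0;
}

/* With non-negative coefficients the value is non-zero for every x
   exactly when the constant term is non-zero.  */
static bool
poly_known_ne_zero (struct poly2 p)
{
  return p.c0 != 0;
}

/* True when a == k * b for a constant k.  Divides by b.c0, so the
   caller must ensure b.c0 != 0.  */
static bool
poly_constant_multiple_p (struct poly2 a, struct poly2 b, unsigned *k)
{
  if (a.c0 % b.c0 != 0)
    return false;
  unsigned r = a.c0 / b.c0;
  if (a.c1 != r * b.c1)
    return false;
  *k = r;
  return true;
}

/* Models the fixed condition: either remain is known to be zero, or it
   is known to be non-zero and nunits is a constant multiple of it.  */
bool
gap_avoidable (struct poly2 nunits, struct poly2 remain)
{
  unsigned num;
  if (poly_known_eq_zero (remain))
    return true;
  return poly_known_ne_zero (remain)
         && poly_constant_multiple_p (nunits, remain, &num);
}
```

For nunits [2, 2] and remain [0, 2], remain is neither known zero nor
known non-zero, so the model returns false without ever dividing by
the zero constant term.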

* tree-vect-stmts.cc (get_group_load_store_type): Check
known_ne (remain, 0u) before doing constant_multiple_p.
(vectorizable_load): Likewise.

11 months agoDo not bother with reassociation in SLP discovery for single-lane
Richard Biener [Fri, 30 Aug 2024 09:39:53 +0000 (11:39 +0200)] 
Do not bother with reassociation in SLP discovery for single-lane

It just clutters the dump files and takes up compile-time.

* tree-vect-slp.cc (vect_build_slp_tree_2): Disable SLP
reassociation for single-lane.

11 months agoc++: Allow standard attributes after closing square bracket in new-type-id [PR110345]
Jakub Jelinek [Fri, 30 Aug 2024 07:40:34 +0000 (09:40 +0200)] 
c++: Allow standard attributes after closing square bracket in new-type-id [PR110345]

For C++ 26 P2552R3 I went through all the spots (except modules) where
attribute-specifier-seq appears in the grammar and tried to construct
a testcase in all those spots, for now for [[deprecated]] attribute.

The first thing I found is that we aren't parsing standard attributes in
noptr-new-declarator - https://eel.is/c++draft/expr.new#1

The following patch parses it there, for the non-outermost arrays
applies normally the attributes to the array type, for the outermost
where we just set *nelts and don't really build an array type just
warns that we ignore those attributes (or, do you think we should
just build an array type in that case and just take its element type?).

2024-08-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/110345
* parser.cc (make_array_declarator): Add STD_ATTRS argument, set
declarator->std_attributes to it.
(cp_parser_new_type_id): Warn on non-ignored std_attributes on the
array declarator which is being omitted.
(cp_parser_direct_new_declarator): Parse standard attributes after
closing square bracket, pass it to make_array_declarator.
(cp_parser_direct_declarator): Pass std_attrs to make_array_declarator
instead of setting declarator->std_attributes manually.

* g++.dg/cpp0x/gen-attrs-80.C: New test.
* g++.dg/cpp0x/gen-attrs-81.C: New test.

11 months agoCheck avx upper register for parallel.
liuhongt [Thu, 29 Aug 2024 03:39:20 +0000 (11:39 +0800)] 
Check avx upper register for parallel.

For function arguments/return, when it's BLK mode, it's put in a
parallel with an expr_list, and the expr_list contains the real mode
and registers.
The current ix86_check_avx_upper_register only checks for SSE_REG_P and
fails to handle that.  The patch extends the handling to each subrtx.

gcc/ChangeLog:

PR target/116512
* config/i386/i386.cc (ix86_check_avx_upper_register): Iterate
subrtx to scan for avx upper register.
(ix86_check_avx_upper_stores): Inline old
ix86_check_avx_upper_register.
(ix86_avx_u128_mode_needed): Ditto, and replace
FOR_EACH_SUBRTX with call to new
ix86_check_avx_upper_register.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116512.c: New test.

11 months agoDaily bump.
GCC Administrator [Fri, 30 Aug 2024 00:19:54 +0000 (00:19 +0000)] 
Daily bump.

11 months agoSARIF output: implement embedded URLs in messages (§3.11.6; PR other/116419)
David Malcolm [Thu, 29 Aug 2024 22:48:32 +0000 (18:48 -0400)] 
SARIF output: implement embedded URLs in messages (§3.11.6; PR other/116419)

GCC diagnostic messages can contain URLs, such as to our documentation
when we suggest an option name to correct a misspelling.

SARIF message strings can contain embedded URLs in the plain text
messages (see SARIF v2.1.0 §3.11.6), but previously we were
simply dropping any URLs from the diagnostic messages.

This patch adds support for encoding URLs into messages in our SARIF
output, using the pp_token machinery added in the previous patch.

As well as supporting URLs, the patch also adjusts how we report
event IDs in SARIF messages, so that rather than e.g.
  "text": "second 'free' here; first 'free' was at (1)"
we now report:
  "text": "second 'free' here; first 'free' was at [(1)](sarif:/runs/0/results/0/codeFlows/0/threadFlows/0/locations/0)"

i.e. the text "(1)" now has an embedded link referring within the sarif
log to the threadFlowLocation object for the other event, via JSON
pointer (see §3.10.3 "URIs that use the sarif scheme").  Doing so
requires the various objects to know their index within their containing
array, requiring some reworking of how they are constructed.
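
To make the shape of these links concrete, here is a rough sketch of
how the embedded link text from the example above could be assembled
from the indices.  It is not the code in diagnostic-format-sarif.cc;
format_event_link and is_sarif_uri are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Writes "[(N)](sarif:/runs/.../locations/...)" into BUF.  The event
   number is what the user sees; the other arguments are the zero-based
   indices of each object within its containing array.  Returns false
   if the text did not fit.  */
bool
format_event_link (char *buf, size_t size, int event_number,
                   int run_idx, int result_idx, int code_flow_idx,
                   int thread_flow_idx, int location_idx)
{
  int n = snprintf (buf, size,
                    "[(%d)](sarif:/runs/%d/results/%d/codeFlows/%d"
                    "/threadFlows/%d/locations/%d)",
                    event_number, run_idx, result_idx, code_flow_idx,
                    thread_flow_idx, location_idx);
  return n >= 0 && (size_t) n < size;
}

/* True when URI uses the "sarif" scheme.  */
bool
is_sarif_uri (const char *uri)
{
  return strncmp (uri, "sarif:", 6) == 0;
}
```

Calling format_event_link with event number 1 and all indices 0
reproduces the link shown in the example message.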

gcc/ChangeLog:
PR other/116419
* diagnostic-event-id.h (diagnostic_event_id_t::zero_based): New.
* diagnostic-format-sarif.cc: Include "pretty-print-format-impl.h"
and "pretty-print-urlifier.h".
(sarif_result::sarif_result): Add param "idx_within_parent".
(sarif_result::get_index_within_parent): New accessor.
(sarif_result::m_idx_within_parent): New field.
(sarif_code_flow::sarif_code_flow): New ctor.
(sarif_code_flow::get_parent): New accessor.
(sarif_code_flow::get_index_within_parent): New accessor.
(sarif_code_flow::m_parent): New field.
(sarif_code_flow::m_thread_id_map): New field.
(sarif_code_flow::m_thread_flows_arr): New field.
(sarif_code_flow::m_all_tfl_objs): New field.
(sarif_thread_flow::sarif_thread_flow): Add "parent" and
"idx_within_parent" params.
(sarif_thread_flow::get_parent): New accessor.
(sarif_thread_flow::get_index_within_parent): New accessor.
(sarif_thread_flow::m_parent): New field.
(sarif_thread_flow::m_idx_within_parent): New field.
(sarif_thread_flow_location::sarif_thread_flow_location): New
ctor.
(sarif_thread_flow_location::get_parent): New accessor.
(sarif_thread_flow_location::get_index_within_parent): New
accessor.
(sarif_thread_flow_location::m_parent): New field.
(sarif_thread_flow_location::m_idx_within_parent): New field.
(sarif_builder::get_code_flow_for_event_ids): New accessor.
(class sarif_builder::sarif_token_printer): New.
(sarif_builder::m_token_printer): New member.
(sarif_builder::m_next_result_idx): New field.
(sarif_builder::m_current_code_flow): New field.
(sarif_code_flow::get_or_append_thread_flow): New.
(sarif_code_flow::get_thread_flow): New.
(sarif_code_flow::add_location): New.
(sarif_code_flow::get_thread_flow_loc_obj): New.
(sarif_thread_flow::add_location): Create the new
sarif_thread_flow_location internally, rather than passing
it in as a parm so that we can keep track of its index in
the array.  Return a reference to it.
(sarif_builder::sarif_builder): Initialize m_token_printer,
m_next_result_idx, and m_current_code_flow.
(sarif_builder::on_report_diagnostic): Pass index to
make_result_object.
(sarif_builder::make_result_object): Add "idx_within_parent" param
and pass to sarif_result ctor.  Pass code flow index to call to
make_code_flow_object.
(make_sarif_url_for_event): New.
(sarif_builder::make_code_flow_object): Add "idx_within_parent"
param and pass it to sarif_code_flow ctor.  Reimplement walking
of events so that we first create threadFlow objects for each
thread, then populate them with threadFlowLocation objects, so
that the IDs work.  Set m_current_code_flow whilst creating the
latter, so that we can create correct URIs for "%@".
(sarif_builder::make_thread_flow_location_object): Replace with...
(sarif_builder::populate_thread_flow_location_object): ...this.
(sarif_output_format::get_builder): New accessor.
(sarif_begin_embedded_link): New.
(sarif_end_embedded_link): New.
(sarif_builder::sarif_token_printer::print_tokens): New.
(diagnostic_output_format_init_sarif): Add "fmt" param; use it to
set the token printer and output format for the context.
(diagnostic_output_format_init_sarif_stderr): Move responsibility
for setting the context's output format to within
diagnostic_output_format_init_sarif.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
(test_sarif_diagnostic_context::test_sarif_diagnostic_context):
Likewise.
(selftest::test_make_location_object): Provide an idx for the
result.
(selftest::get_result_from_log): New.
(selftest::get_message_from_log): New.
(selftest::test_message_with_embedded_link): New test.
(selftest::diagnostic_format_sarif_cc_tests): Call it.
* pretty-print-format-impl.h: Include "diagnostic-event-id.h".
(pp_token::kind): Add "event_id".
(struct pp_token_event_id): New.
(is_a_helper <pp_token_event_id *>::test): New.
(is_a_helper <const pp_token_event_id *>::test): New.
* pretty-print.cc (pp_token::dump): Handle kind::event_id.
(pretty_printer::format): Update handling of "%@" in phase 2
so that we add a pp_token_event_id, rather than the text "(N)".
(default_token_printer): Handle pp_token::kind::event_id by
printing the text "(N)".

gcc/testsuite/ChangeLog:
PR other/116419
* gcc.dg/sarif-output/bad-pragma.c: New test.
* gcc.dg/sarif-output/test-bad-pragma.py: New test.
* gcc.dg/sarif-output/test-include-chain-2.py
(test_location_relationships): Update expected text of event to
include an intra-sarif URI to the other event.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agopretty-print: reimplement pp_format with a new struct pp_token
David Malcolm [Thu, 29 Aug 2024 22:48:27 +0000 (18:48 -0400)] 
pretty-print: reimplement pp_format with a new struct pp_token

The following patch rewrites the internals of pp_format.

A pretty_printer's output_buffer maintains a stack of chunk_info
instances, each one responsible for handling a call to pp_format, where
having a stack allows us to support re-entrant calls to pp_format on the
same pretty_printer.

Previously a chunk_info merely stored buffers of accumulated text
per unformatted run and per formatted argument.

This led to various special-casing for handling:

- urlifiers, needing class quoting_info to handle awkward cases where
  the run of quoted text could be split between stages 1 and 2
  of formatting

- dumpfiles, where the optinfo machinery could lead to objects being
  stashed during formatting for later replay to JSON optimization
  records

- in the C++ frontend, the format codes %H and %I can't be processed
  until we've seen both, leading to awkward code to manipulate the
  text buffers

Further, supporting URLs in messages in SARIF output (PR other/116419)
would add additional manipulations of text buffers, since our internal
pp_begin_url API gives the URL at the beginning of the wrapped text,
whereas SARIF's format for embedded URLs has the URL *after* the wrapped
text.  Also when handling "%@" we wouldn't necessarily know the URL of
an event ID until later, requiring further nasty special-case
manipulation of text buffers.

This patch rewrites pretty-print formatting by introducing a new
intermediate representation during formatting: pp_token and
pp_token_list.  Rather than simply accumulating a buffer of "char" in
the chunk_obstack during formatting, we now also accumulate a
pp_token_list, a doubly-linked list of pp_token, which can be:
- text buffers
- begin/end colorization
- begin/end quote
- begin/end URL
- "custom data" tokens

Working at the level of tokens rather than just text buffers allows the
various awkward special cases above to be replaced with uniform logic.
For example, all "urlification" is now done in phase 3 of formatting,
in one place, by looking for [..., BEGIN_QUOTE, TEXT, END_QUOTE, ...]
and injecting BEGIN_URL and END_URL wrapper tokens when the urlifier
has a URL for TEXT.  Doing so greatly simplifies the urlifier code,
allowing the removal of class quoting_info.
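
The token-level urlification can be sketched as below.  This is a
simplified model in plain C using arrays rather than the doubly-linked
pp_token_list; struct tok, lookup_url and this apply_urlifier are
hypothetical, the URL is a placeholder, and placing the URL tokens
inside the quotes is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

enum tok_kind
{
  TOK_TEXT, TOK_BEGIN_QUOTE, TOK_END_QUOTE, TOK_BEGIN_URL, TOK_END_URL
};

/* STR is the text for TOK_TEXT, the URL for TOK_BEGIN_URL, else NULL.  */
struct tok { enum tok_kind kind; const char *str; };

/* Hypothetical urlifier: knows a URL for one option name only.  */
static const char *
lookup_url (const char *text)
{
  if (strcmp (text, "-Wall") == 0)
    return "https://example.invalid/Wall";
  return NULL;
}

/* Copies IN to OUT, wrapping the TEXT of each
   [BEGIN_QUOTE, TEXT, END_QUOTE] run in BEGIN_URL/END_URL when
   lookup_url has a URL for it.  Returns the number of tokens written,
   or 0 if OUT is too small.  */
size_t
apply_urlifier (const struct tok *in, size_t n_in,
                struct tok *out, size_t out_cap)
{
  size_t n_out = 0;
  for (size_t i = 0; i < n_in; ++i)
    {
      const char *url = NULL;
      if (i + 2 < n_in
          && in[i].kind == TOK_BEGIN_QUOTE
          && in[i + 1].kind == TOK_TEXT
          && in[i + 2].kind == TOK_END_QUOTE)
        url = lookup_url (in[i + 1].str);

      if (url)
        {
          if (n_out + 5 > out_cap)
            return 0;
          out[n_out++] = in[i];                          /* BEGIN_QUOTE */
          out[n_out++] = (struct tok) { TOK_BEGIN_URL, url };
          out[n_out++] = in[i + 1];                      /* TEXT */
          out[n_out++] = (struct tok) { TOK_END_URL, NULL };
          out[n_out++] = in[i + 2];                      /* END_QUOTE */
          i += 2;                                        /* skip the run */
        }
      else
        {
          if (n_out + 1 > out_cap)
            return 0;
          out[n_out++] = in[i];
        }
    }
  return n_out;
}
```

Because the pass only pattern-matches on token kinds, it needs no
knowledge of which formatting phase produced the quoted text.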

The tokens and token lists are allocated on the chunk_obstack, and so
there's no additional heap activity required, with the memory reclaimed
when the chunk_obstack is freed after phase 3 of formatting.

New kinds of pp_token can be added as needed to support output formats.
For example, the followup patch adds a token for "%@" for event IDs, to
better support SARIF output.

No functional change intended.

gcc/c/ChangeLog:
* c-objc-common.cc (c_tree_printer): Convert final param from
const char ** to pp_token_list &.

gcc/cp/ChangeLog:
* error.cc: Include "make-unique.h".
(deferred_printed_type::m_buffer_ptr): Replace with...
(deferred_printed_type::m_printed_text): ...this and...
(deferred_printed_type::m_token_list): ...this.
(deferred_printed_type::deferred_printed_type): Update ctors for
above changes.
(deferred_printed_type::set_text_for_token_list): New.
(append_formatted_chunk): Pass chunk_obstack to
append_formatted_chunk.
(add_quotes): Delete.
(cxx_format_postprocessor::handle): Reimplement to call
deferred_printed_type::set_text_for_token_list, rather than store
buffer pointers.
(defer_phase_2_of_type_diff): Replace param "buffer_ptr"
with "formatted_token_list".  Reimplement by storing
a pointer to formatted_token_list so that the postprocessor can
put its text there.
(cp_printer): Convert param "buffer_ptr" to
"formatted_token_list".  Update calls to
defer_phase_2_of_type_diff accordingly.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::report_diagnostic): Don't
pass m_urlifier to pp_format, as urlification now happens in
phase 3.
* dump-context.h (class dump_pretty_printer): Update leading
comment.
(dump_pretty_printer::emit_items): Drop decl.
(dump_pretty_printer::set_optinfo): New.
(class dump_pretty_printer::stashed_item): Delete class.
(class dump_pretty_printer::custom_token_printer): New class.
(dump_pretty_printer::format_decoder_cb): Convert param from
const char ** to pp_token_list &.
(dump_pretty_printer::decode_format): Likewise.
(dump_pretty_printer::stash_item): Likewise.
(dump_pretty_printer::emit_any_pending_textual_chunks): Drop decl.
(dump_pretty_printer::m_stashed_items): Delete field.
(dump_pretty_printer::m_token_printer): New member data.
* dumpfile.cc (struct wrapped_optinfo_item): New.
(dump_pretty_printer::dump_pretty_printer): Update for dropping
of field m_stashed_items and new field m_token_printer.
(dump_pretty_printer::emit_items): Delete; we now use
pp_output_formatted_text.
(dump_pretty_printer::emit_any_pending_textual_chunks): Delete.
(dump_pretty_printer::stash_item): Convert param from
const char ** to pp_token_list &.
(dump_pretty_printer::format_decoder_cb): Likewise.
(dump_pretty_printer::decode_format): Likewise.
(dump_pretty_printer::custom_token_printer::print_tokens): New.
(dump_pretty_printer::custom_token_printer::emit_any_pending_textual_chunks):
New.
(dump_context::dump_printf_va): Call set_optinfo on the
dump_pretty_printer.  Replace call to emit_items with a call to
pp_output_formatted_text.
* opt-problem.cc (opt_problem::opt_problem): Replace call to
emit_items with call to set_optinfo and call to
pp_output_formatted_text.
* pretty-print-format-impl.h (struct pp_token): New.
(struct pp_token_text): New.
(is_a_helper <pp_token_text *>::test): New.
(is_a_helper <const pp_token_text *>::test): New.
(struct pp_token_begin_color): New.
(is_a_helper <pp_token_begin_color *>::test): New.
(is_a_helper <const pp_token_begin_color *>::test): New.
(struct pp_token_end_color): New.
(struct pp_token_begin_quote): New.
(struct pp_token_end_quote): New.
(struct pp_token_begin_url): New.
(is_a_helper <pp_token_begin_url*>::test): New.
(is_a_helper <const pp_token_begin_url*>::test): New.
(struct pp_token_end_url): New.
(struct pp_token_custom_data): New.
(is_a_helper <pp_token_custom_data *>::test): New.
(is_a_helper <const pp_token_custom_data *>::test): New.
(class pp_token_list): New.
(chunk_info::get_args): Drop.
(chunk_info::get_quoting_info): Drop.
(chunk_info::get_token_lists): New accessor.
(chunk_info::append_formatted_chunk): Add obstack & param.
(chunk_info::dump): New decls.
(chunk_info::m_args): Convert element type from const char * to
pp_token_list *.  Rewrite/update comment.
(chunk_info::m_quotes): Drop field.
* pretty-print-markup.h (class pp_token_list): New forward decl.
(pp_markup::context::context): Drop urlifier param; add
formatted_token_list param.
(pp_markup::context::push_back_any_text): New decl.
(pp_markup::context::m_urlifier): Drop field.
(pp_markup::context::m_formatted_token_list): New field.
* pretty-print-urlifier.h: Update comment.
* pretty-print.cc: Define INCLUDE_MEMORY.  Include
"make-unique.h".
(default_token_printer): New forward decl.
(obstack_append_string): Delete.
(urlify_quoted_string): Delete.
(pp_token::pp_token): New.
(pp_token::dump): New.
(allocate_object): New.
(class quoting_info): Delete.
(pp_token::operator new): New.
(pp_token::operator delete): New.
(pp_token_list::operator new): New.
(pp_token_list::operator delete): New.
(pp_token_list::pp_token_list): New.
(pp_token_list::~pp_token_list): New.
(pp_token_list::push_back_text): New.
(pp_token_list::push_back): New.
(pp_token_list::push_back_list): New.
(pp_token_list::pop_front): New.
(pp_token_list::remove_token): New.
(pp_token_list::insert_after): New.
(pp_token_list::replace_custom_tokens): New.
(pp_token_list::merge_consecutive_text_tokens): New.
(pp_token_list::apply_urlifier): New.
(pp_token_list::dump): New.
(chunk_info::append_formatted_chunk): Add obstack & param and use
it to reimplement in terms of token lists.
(chunk_info::pop_from_output_buffer): Drop m_quotes.
(chunk_info::on_begin_quote): Delete.
(chunk_info::dump): New.
(chunk_info::on_end_quote): Delete.
(push_back_any_text): New.
(pretty_printer::format): Drop "urlifier" param and quoting_info
logic.  Convert "formatters" and "args" from const ** to
pp_token_list **.  Reimplement so that rather than just
accumulating a text buffer in the chunk_obstack for each arg,
instead also accumulate a pp_token_list and pp_tokens for each
arg.
(auto_obstack::operator obstack &): New.
(quoting_info::handle_phase_3): Delete.
(pp_output_formatted_text): Reimplement in terms of manipulations
of pp_token_lists, rather than char buffers.  Call
default_token_printer, or m_token_printer's print_tokens vfunc.
(default_token_printer): New.
(pretty_printer::pretty_printer): Initialize m_token_printer in
both ctors.
(pp_markup::context::begin_quote): Reimplement to use token list.
(pp_markup::context::end_quote): Likewise.
(pp_markup::context::begin_highlight_color): Likewise.
(pp_markup::context::end_highlight_color): Likewise.
(pp_markup::context::push_back_any_text): New.
(selftest::test_merge_consecutive_text_tokens): New.
(selftest::test_custom_tokens_1): New.
(selftest::test_custom_tokens_2): New.
(selftest::pp_printf_with_urlifier): Drop "urlifier" param from
call to pp_format.
(selftest::test_urlification): Add test of the example from
pretty-print-format-impl.h.
(selftest::pretty_print_cc_tests): Call the new selftest
functions.
* pretty-print.h (class quoting_info): Drop forward decl.
(class pp_token_list): New forward decl.
(printer_fn): Convert final param from const char ** to
pp_token_list &.
(class token_printer): New.
(class pretty_printer): Add pp_output_formatted_text as friend.
(pretty_printer::set_token_printer): New.
(pretty_printer::format): Drop urlifier param as this now happens
in phase 3.
(pretty_printer::m_format_decoder): Update comment.
(pretty_printer::m_token_printer): New field.
(pp_format): Drop urlifier param.
* tree-diagnostic.cc (default_tree_printer): Convert final param
from const char ** to pp_token_list &.
* tree-diagnostic.h: Likewise for decl.

gcc/fortran/ChangeLog:
* error.cc (gfc_format_decoder): Convert final param from
const char **buffer_ptr to pp_token_list &formatted_token_list,
and update call to default_tree_printer accordingly.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agopretty-print: move class chunk_info into its own header
David Malcolm [Thu, 29 Aug 2024 22:48:20 +0000 (18:48 -0400)] 
pretty-print: move class chunk_info into its own header

No functional change intended.

gcc/cp/ChangeLog:
* error.cc: Include "pretty-print-format-impl.h".

gcc/ChangeLog:
* dumpfile.cc: Include "pretty-print-format-impl.h".
* pretty-print-format-impl.h: New file, based on material from
pretty-print.h.
* pretty-print.cc: Include "pretty-print-format-impl.h".
* pretty-print.h (chunk_info): Replace full declaration with
a forward decl, moving full decl to pretty-print-format-impl.h.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agoUse std::unique_ptr for optinfo_item
David Malcolm [Thu, 29 Aug 2024 22:48:16 +0000 (18:48 -0400)] 
Use std::unique_ptr for optinfo_item

As preliminary work towards an overhaul of how optinfo_items
interact with dump_pretty_printer, replace uses of optinfo_item * with
std::unique_ptr<optinfo_item> to make ownership clearer.

No functional change intended.
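
To illustrate the ownership point, here is a minimal sketch of the pattern.
The names item, item_list and make_text_item are hypothetical stand-ins and
are not the real optinfo classes; only the use of std::unique_ptr is taken
from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for optinfo_item; not the real GCC class.
struct item
{
  std::string text;
};

// Hypothetical stand-in for optinfo: it owns the items added to it.
class item_list
{
public:
  // Taking std::unique_ptr by value documents that ownership moves in.
  void add_item (std::unique_ptr<item> it)
  {
    m_items.push_back (std::move (it));
  }

  std::size_t size () const { return m_items.size (); }
  const item &at (std::size_t i) const { return *m_items[i]; }

private:
  std::vector<std::unique_ptr<item>> m_items;
};

// Factory returning an owning pointer, in the spirit of the
// make_item_for_dump_* helpers named in the ChangeLog below.
std::unique_ptr<item> make_text_item (const std::string &text)
{
  return std::unique_ptr<item> (new item {text});
}

// After the move the caller's pointer is empty and the list owns the item.
void check_transfer ()
{
  item_list list;
  std::unique_ptr<item> p = make_text_item ("loop vectorized");
  list.add_item (std::move (p));
  assert (p == nullptr);
  assert (list.size () == 1);
}
```

With a raw pointer parameter the reader has to check the callee to learn who
frees the item; with the by-value std::unique_ptr the signature says so.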

gcc/ChangeLog:
* config/aarch64/aarch64.cc: Define INCLUDE_MEMORY.
* config/arm/arm.cc: Likewise.
* config/i386/i386.cc: Likewise.
* config/loongarch/loongarch.cc: Likewise.
* config/riscv/riscv-vector-costs.cc: Likewise.
* config/riscv/riscv.cc: Likewise.
* config/rs6000/rs6000.cc: Likewise.
* dump-context.h (dump_context::emit_item): Convert "item" param
from * to const &.
(dump_pretty_printer::stash_item): Convert "item" param from
optinfo_item * to std::unique_ptr<optinfo_item>.
(dump_pretty_printer::emit_item): Likewise.
* dumpfile.cc: Include "make-unique.h".
(make_item_for_dump_gimple_stmt): Replace uses of optinfo_item *
with std::unique_ptr<optinfo_item>.
(dump_context::dump_gimple_stmt): Likewise.
(make_item_for_dump_gimple_expr): Likewise.
(dump_context::dump_gimple_expr): Likewise.
(make_item_for_dump_generic_expr): Likewise.
(dump_context::dump_generic_expr): Likewise.
(make_item_for_dump_symtab_node): Likewise.
(dump_pretty_printer::emit_items): Likewise.
(dump_pretty_printer::emit_any_pending_textual_chunks): Likewise.
(dump_pretty_printer::emit_item): Likewise.
(dump_pretty_printer::stash_item): Likewise.
(dump_pretty_printer::decode_format): Likewise.
(dump_context::dump_printf_va): Fix overlong line.
(make_item_for_dump_dec): Replace uses of optinfo_item * with
std::unique_ptr<optinfo_item>.
(dump_context::dump_dec): Likewise.
(dump_context::dump_symtab_node): Likewise.
(dump_context::begin_scope): Likewise.
(dump_context::emit_item): Likewise.
* gimple-loop-interchange.cc: Define INCLUDE_MEMORY.
* gimple-loop-jam.cc: Likewise.
* gimple-loop-versioning.cc: Likewise.
* graphite-dependences.cc: Likewise.
* graphite-isl-ast-to-gimple.cc: Likewise.
* graphite-optimize-isl.cc: Likewise.
* graphite-poly.cc: Likewise.
* graphite-scop-detection.cc: Likewise.
* graphite-sese-to-poly.cc: Likewise.
* graphite.cc: Likewise.
* opt-problem.cc: Likewise.
* optinfo.cc (optinfo::add_item): Convert "item" param from
optinfo_item * to std::unique_ptr<optinfo_item>.
(optinfo::emit_for_opt_problem): Update for change to
dump_context::emit_item.
* optinfo.h: Add #error to fail immediately if INCLUDE_MEMORY
wasn't defined, rather than fail to find std::unique_ptr.
(optinfo::add_item): Convert "item" param from optinfo_item * to
std::unique_ptr<optinfo_item>.
* sese.cc: Define INCLUDE_MEMORY.
* targhooks.cc: Likewise.
* tree-data-ref.cc: Likewise.
* tree-if-conv.cc: Likewise.
* tree-loop-distribution.cc: Likewise.
* tree-parloops.cc: Likewise.
* tree-predcom.cc: Likewise.
* tree-ssa-live.cc: Likewise.
* tree-ssa-loop-ivcanon.cc: Likewise.
* tree-ssa-loop-ivopts.cc: Likewise.
* tree-ssa-loop-prefetch.cc: Likewise.
* tree-ssa-loop-unswitch.cc: Likewise.
* tree-ssa-phiopt.cc: Likewise.
* tree-ssa-threadbackward.cc: Likewise.
* tree-ssa-threadupdate.cc: Likewise.
* tree-vect-data-refs.cc: Likewise.
* tree-vect-generic.cc: Likewise.
* tree-vect-loop-manip.cc: Likewise.
* tree-vect-loop.cc: Likewise.
* tree-vect-patterns.cc: Likewise.
* tree-vect-slp-patterns.cc: Likewise.
* tree-vect-slp.cc: Likewise.
* tree-vect-stmts.cc: Likewise.
* tree-vectorizer.cc: Likewise.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/dump_plugin.c: Define INCLUDE_MEMORY.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agoFortran: fix ICE with use with rename of namelist member [PR116530]
Harald Anlauf [Thu, 29 Aug 2024 19:21:39 +0000 (21:21 +0200)] 
Fortran: fix ICE with use with rename of namelist member [PR116530]

gcc/fortran/ChangeLog:

PR fortran/116530
* trans-io.cc (transfer_namelist_element): Prevent NULL pointer
dereference.

gcc/testsuite/ChangeLog:

PR fortran/116530
* gfortran.dg/use_rename_12.f90: New test.

11 months agohppa: Fix handling of unscaled index addresses on HP-UX
John David Anglin [Thu, 29 Aug 2024 15:53:45 +0000 (11:53 -0400)] 
hppa: Fix handling of unscaled index addresses on HP-UX

The PA-RISC architecture uses the top two bits of memory pointers
to select space registers.  The space register ID is ored with the
pointer offset to compute the global virtual address for an access.

The new late combine passes broke gcc on HP-UX.  One of these passes
runs after reload.  The existing code assumed no unscaled index
instructions would be created after reload as the REG_POINTER flag
is not reliable after reload.  The new pass sometimes interchanged
the base and index registers, causing these instructions to fault
when the wrong space register was selected.
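
The failure mode can be sketched with a simplified model.  It assumes 32-bit
pointers and that the selector comes from the base register of the access;
the function names are hypothetical and this is not a description of the
actual hardware or of pa.cc.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model (an assumption): the top two bits of the 32-bit base
// register pick one of four space registers.
constexpr unsigned space_selector (std::uint32_t base)
{
  return base >> 30;
}

// Offset of an unscaled index access: base + index.
constexpr std::uint32_t effective_offset (std::uint32_t base,
                                          std::uint32_t index)
{
  return base + index;
}

void check_swap_changes_space ()
{
  const std::uint32_t pointer = 0xC0001000u; // top bits 11, selector 3
  const std::uint32_t offset = 0x00000010u;  // small integer, selector 0

  // The sum is the same whichever operand is treated as the base...
  assert (effective_offset (pointer, offset)
          == effective_offset (offset, pointer));
  // ...but the selector depends on which operand is the base.
  assert (space_selector (pointer) == 3);
  assert (space_selector (offset) == 0);
}
```

In this model, swapping base and index leaves the arithmetic unchanged but
selects a different space, which matches the faults described above.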

I investigated various alternatives to try to retain generation
of unscaled index instructions on HP-UX.  It's not possible to
simply treat unscaled index addresses as not legitimate after
reload as sometimes instructions need to be rerecognized after
reload.  So, we needed to allow unscaled index addresses after
reload and to disable the late combine passes.

I had noticed that reversing the current order of base and index
register canonicalization resulted in more accesses using unscaled
index addresses.  However, this exposed issues with the REG_POINTER
flag.

The flag is not propagated when a constant is added to a pointer.
Tree optimization sometimes adds two pointers.  I found that I had
to treat the result as a pointer but the addition generally corrupts
the space register bits.  These get fixed when a negative pointer
is added.  Finally, the REG_POINTER flag isn't set when a pointer
is passed in a function call.  I couldn't get this approach to work.

Thus, I came to the conclusion that the best approach was to
disable use of unscaled index addresses on HP-UX.  I don't think
this impacts performance significantly.  Code size might get
slightly larger but we get some or more back from having the late
combine passes.

2024-08-29  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (load_reg): Don't generate load with
unscaled index address when !TARGET_NO_SPACE_REGS.
(pa_legitimate_address_p): Only allow unscaled index
addresses when TARGET_NO_SPACE_REGS.

11 months agoexpand: Allow widening optab when expanding popcount==1 [PR116508]
Andrew Pinski [Wed, 28 Aug 2024 22:03:53 +0000 (15:03 -0700)] 
expand: Allow widening optab when expanding popcount==1 [PR116508]

After adding popcount{qi,hi}2 to the aarch64 backend, I noticed that
the expansion for popcount==1 was no longer trying to do the trick
of handling popcount==1 as `(arg ^ (arg - 1)) > arg - 1`. The problem
is the expansion was using OPTAB_DIRECT, when using OPTAB_WIDEN
will allow modes which are smaller than SImode (in the aarch64 case).
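
As a quick check of the identity quoted above, the following sketch compares
the trick with a bit-by-bit population count.  It works on unsigned 32-bit
values; the function names are hypothetical and it does not model the optab
machinery.

```cpp
#include <cassert>
#include <cstdint>

// Reference: count set bits by clearing the lowest one each iteration.
constexpr unsigned popcount_ref (std::uint32_t x)
{
  unsigned n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// The trick from the commit message: (arg ^ (arg - 1)) > arg - 1.
constexpr bool popcount_is_one (std::uint32_t x)
{
  return (x ^ (x - 1)) > x - 1;
}

// Exhaustively compare both forms over all 16-bit inputs.
void check_trick ()
{
  for (std::uint32_t x = 0; x <= 0xFFFFu; ++x)
    assert (popcount_is_one (x) == (popcount_ref (x) == 1));
}
```

For zero, arg - 1 wraps to all ones on both sides of the comparison, so the
result is false, as it should be.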

Note QImode's cost still needs some improvements so part of popcnt-eq-1.c
is xfailed. Though there is a check to make sure the costs are compared now.

Built and tested on aarch64-linux-gnu.

PR middle-end/116508

gcc/ChangeLog:

* internal-fn.cc (expand_POPCOUNT): Use OPTAB_WIDEN for PLUS and
XOR/AND expansion.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/popcnt-eq-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
11 months agoada: Fix assertion failure on private limited with clause
Eric Botcazou [Fri, 16 Aug 2024 23:18:43 +0000 (01:18 +0200)] 
ada: Fix assertion failure on private limited with clause

This checks that the name is of an entity before accessing its Entity field.

gcc/ada/

* sem_ch8.adb (Has_Private_With): Add test on Is_Entity_Name.

11 months agoada: Fix internal error on concatenation of discriminant-dependent component
Eric Botcazou [Fri, 16 Aug 2024 14:03:30 +0000 (16:03 +0200)] 
ada: Fix internal error on concatenation of discriminant-dependent component

This only occurs with optimization enabled, but the expanded code is always
wrong because it reuses the formal parameter of an initialization procedure
associated with a discriminant (a discriminal in GNAT parlance) outside of
the initialization procedure.

gcc/ada/

* checks.adb (Selected_Length_Checks.Get_E_Length): For a
component of a record with discriminants and if the expression is
a selected component, try to build an actual subtype from its
prefix instead of from the discriminal.

11 months agoada: Missing legality check when type completed
Steve Baird [Mon, 5 Aug 2024 22:53:12 +0000 (15:53 -0700)] 
ada: Missing legality check when type completed

Refine previous fix to better handle tagged cases.

gcc/ada/

* sem_ch6.adb (Check_Discriminant_Conformance): Immediately after
calling Is_Immutably_Limited_Type, perform an additional test that
one might reasonably imagine would instead have been part of
Is_Immutably_Limited_Type. The new test is a call to a new
function Has_Tagged_Limited_Partial_View whose implementation
includes a call to Incomplete_Or_Partial_View, which cannot
easily be called from Is_Immutably_Limited_Type (because sem_aux,
which is in the closure of the binder, cannot easily "with"
sem_util).
* sem_aux.adb (Is_Immutably_Limited): Include
N_Derived_Type_Definition case when testing Limited_Present flag.

11 months agoada: Fix missing finalization for call to function returning limited view
Eric Botcazou [Fri, 16 Aug 2024 09:28:37 +0000 (11:28 +0200)] 
ada: Fix missing finalization for call to function returning limited view

The call is legal because it is made from the body, which has visibility on
the nonlimited view, so this changes the code in Expand_Call_Helper to look
at the Etype of the call node instead of the Etype of the function.

gcc/ada/

* exp_ch6.adb (Expand_Call_Helper): In the case of a function
call, look at the Etype of the call node to determine whether
finalization actions need to be performed.

11 months agoada: Print Insertion_Sloc in dmsg
Viljar Indus [Mon, 22 Jul 2024 06:45:03 +0000 (09:45 +0300)] 
ada: Print Insertion_Sloc in dmsg

gcc/ada/

* erroutc.adb (dmsg): Print Insertion_Sloc.

11 months agoada: Use the same warning character in continuation messages
Viljar Indus [Fri, 19 Jul 2024 07:34:03 +0000 (10:34 +0300)] 
ada: Use the same warning character in continuation messages

For consistency's sake, the main and continuation messages should
use the same warning characters.

gcc/ada/

* exp_aggr.adb (Expand_Range_Component): Remove extra warning
character. Use same conditional warning char.
* freeze.adb (Warn_Overlay): Use named warning character.
* restrict.adb (Id_Case): Use named warning character.
* sem_prag.adb (Rewrite_Assertion_Kind): Use default warning
character.

11 months agoada: Restructure continuation message for pretty printing
Viljar Indus [Wed, 17 Jul 2024 10:08:23 +0000 (13:08 +0300)] 
ada: Restructure continuation message for pretty printing

Continuation messages should have the same location
as the main message. If the goal is to point to a different
location then Error_Msg_Sloc should be used to change
the location of the continuation message.

gcc/ada/

* par-ch4.adb (P_Name): Use Error_Msg_Sloc for the location of the
continuation message.

11 months agoada: Improve Inspection_Point warning
Viljar Indus [Tue, 16 Jul 2024 11:17:41 +0000 (14:17 +0300)] 
ada: Improve Inspection_Point warning

Ensure that the primary and sub message point
to the same location so that the
submessages get pretty printed in the correct order.

gcc/ada/

* exp_prag.adb (Expand_Pragma_Inspection_Point): Improve sub
diagnostic generation.

11 months agoada: Avoid creating continuation messages without an intended parent
Viljar Indus [Wed, 14 Aug 2024 12:24:10 +0000 (15:24 +0300)] 
ada: Avoid creating continuation messages without an intended parent

The messages modified in this patch do not have a clear intended
parent. This causes a lot of issues when grouping continuation
messages together with their parent. This can be confusing as it
is not obvious which parent message caused the
problem or, in the worst case, the message is not printed
at all.

These modified messages do not seem to be related to any concrete
error message and thus should be treated as independent messages.

gcc/ada/

* sem_ch12.adb (Abandon_Instantiation): Remove continuation
characters from the error message.
* sem_ch13.adb (Check_False_Aspect_For_Derived_Type): Remove
continuation characters from the error message.
* sem_ch6.adb (Assert_False): Avoid creating a continuation
message without a parent. If no primary message is created then
the message is considered as primary.

gcc/testsuite/ChangeLog:

* gnat.dg/interface6.adb: Adjust test.

11 months agoada: Parse the attributes of continuation messages correctly
Viljar Indus [Fri, 21 Jun 2024 10:28:40 +0000 (13:28 +0300)] 
ada: Parse the attributes of continuation messages correctly

Currently unless pretty printing is enabled we avoid parsing
the message strings for continuation messages. This leads
to an inconsistent state for the Error_Msg_Object-s that are
being created.

gcc/ada/

* erroutc.adb (Prescan_Message): Always parse all of the
message attributes.
* erroutc.ads: Update the documentation.

11 months agoada: Use consistent type continuations messages
Viljar Indus [Fri, 21 Jun 2024 10:19:10 +0000 (13:19 +0300)] 
ada: Use consistent type continuations messages

Avoid cases where the main message is an error and the
continuation is a warning.

gcc/ada/

* freeze.adb: Remove warning insertion characters from a
continuation message.
* sem_util.adb: Remove warning insertion characters from a
continuation message.
* sem_warn.adb: Use same warning character as the main message.

11 months agoada: Extract line fitting algorithm
Viljar Indus [Mon, 1 Apr 2024 08:50:10 +0000 (11:50 +0300)] 
ada: Extract line fitting algorithm

Separate the line fitting algorithm from the general line
printing algorithm.

gcc/ada/

* erroutc.ads: Add new method Output_Text_Within.
* erroutc.adb: Move the line fitting code to a new method called
Output_Text_Within.

11 months agoada: Ensure validity checks for private scalar types
Piotr Trojanek [Fri, 9 Aug 2024 15:52:51 +0000 (17:52 +0200)] 
ada: Ensure validity checks for private scalar types

To check validity of data values, we must strip privacy from their
types.

gcc/ada/

* checks.adb (Expr_Known_Valid): Use Validated_View, which strips
type derivation and privacy.
* exp_ch3.adb (Simple_Init_Private_Type): Kill checks inside
unchecked conversions, just like in Simple_Init_Scalar_Type.

11 months agoada: Display actual line length in line length check
Viljar Indus [Tue, 13 Aug 2024 08:37:31 +0000 (11:37 +0300)] 
ada: Display actual line length in line length check

gcc/ada/

* styleg.adb (Check_Line_Max_Length): Add the actual line length
to the diagnostic message.

11 months agoada: Proper handling for iterator associations in array aggregates
Gary Dismukes [Mon, 12 Aug 2024 22:50:57 +0000 (22:50 +0000)] 
ada: Proper handling for iterator associations in array aggregates

The compiler was flagging type-mismatch errors on iterated component
associations in array aggregates of form "for C in <iterator_name>",
improperly requiring the type of the iterator to be the array index
type. The parser can't distinguish whether the association is one
involving an actual discrete choice vs. an iterator specification,
and creates an N_Iterated_Component_Association with a Defining_Identifier
and Discrete_Choices, and the analysis phase has to disambiguate this,
determining whether to create an N_Iterator_Specification node for
the association. A related change is to revise the similar code for
iterated associations of container aggregates, to allow forms of
iterator objects other than just function calls.

gcc/ada/

* sem_aggr.adb (Resolve_Array_Aggregate): Add loop over associations to locate
N_Iterated_Component_Associations that do not have an Iterator_Specification,
and if their Discrete_Choices list consists of a single choice, analyze it and
if it's the name of an iterator object, then create an Iterator_Specification
and associate it with the iterated component association.
(Resolve_Iterated_Association): Replace test for function call with test of
Is_Object_Reference, to handle other forms of iterator objects in container
aggregates.

11 months agoada: First controlling parameter aspect
Javier Miranda [Mon, 12 Aug 2024 18:50:09 +0000 (18:50 +0000)] 
ada: First controlling parameter aspect

gcc/ada/

* usage.adb (Usage): Document switch -gnatw_j.
* doc/gnat_rm/gnat_language_extensions.rst: Add documentation.
* gnat_rm.texi: Regenerate.

11 months agoada: Update documentation for conditional when constructs
Justin Squirek [Mon, 12 Aug 2024 18:31:39 +0000 (18:31 +0000)] 
ada: Update documentation for conditional when constructs

This patch moves the documentation for conditional when constructs out of the
curated set (e.g. into -gnatX0).

gcc/ada/

* doc/gnat_rm/gnat_language_extensions.rst: Move conditional when
constructs out of the curated set.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

11 months agoAllow subregs around constant displacements [PR116516]
Richard Sandiford [Thu, 29 Aug 2024 13:00:23 +0000 (14:00 +0100)] 
Allow subregs around constant displacements [PR116516]

This patch fixes a regression introduced by g:708ee71808ea61758e73.
x86_64 allows addresses of the form:

  (zero_extend:DI (subreg:SI (symbol_ref:DI "foo") 0))

Before the previous patch, a lax SUBREG check meant that we would
treat the subreg as a base and reload it into a base register.
But that wasn't what the target was expecting.  Instead we should
treat "foo" as a constant displacement, to match:

leal foo, <dest>

After the patch, we recognised that "foo" isn't a base register,
but ICEd on it rather than handling it as a displacement.

With or without the recent patches, if the address had instead been:

  (zero_extend:DI
    (subreg:SI (plus:DI (reg:DI R) (symbol_ref:DI "foo")) 0))

then we would have treated "foo" as the displacement and R as the base
or index, as expected.  The problem was that the code that does this was
rejecting all subregs of objects, rather than just subregs of variable
objects.

gcc/
PR middle-end/116516
* rtlanal.cc (strip_address_mutations): Allow subregs around
constant displacements.

gcc/testsuite/
PR middle-end/116516
* gcc.c-torture/compile/pr116516.c: New test.

11 months agoMake some smallest_int_mode_for_size calls cope with failure
Richard Sandiford [Thu, 29 Aug 2024 13:00:23 +0000 (14:00 +0100)] 
Make some smallest_int_mode_for_size calls cope with failure

smallest_int_mode_for_size now returns an optional mode rather
than aborting on failure.  This patch adjusts a couple of callers
so that they fail gracefully when no mode exists.

There should be no behavioural change, since anything that triggers
the new return paths would previously have aborted.  I just think
this is how the code would have been written if the option had been
available earlier.
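
The caller-side pattern can be sketched as follows.  This is only an
illustration: std::optional stands in for whatever optional-mode type GCC
uses, and mode_bits, smallest_width_for_size and try_transform are
hypothetical names.

```cpp
#include <cassert>
#include <optional>

// Hypothetical table of integer mode widths in bits; real targets differ.
constexpr unsigned mode_bits[] = { 8, 16, 32, 64 };

// Stand-in for smallest_int_mode_for_size: returns the narrowest width
// that can hold SIZE bits, or std::nullopt when none is wide enough.
std::optional<unsigned> smallest_width_for_size (unsigned size)
{
  for (unsigned bits : mode_bits)
    if (bits >= size)
      return bits;
  return std::nullopt;
}

// A caller that gives up gracefully instead of aborting.
bool try_transform (unsigned size)
{
  std::optional<unsigned> width = smallest_width_for_size (size);
  if (!width)
    return false; // no suitable mode: skip the optimization
  // The real transformation would use *width here.
  return true;
}

void check_callers ()
{
  assert (try_transform (64));
  assert (!try_transform (65));
}
```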

gcc/
* dse.cc (find_shift_sequence): Allow smallest_int_mode_for_size
to fail.
* optabs.cc (expand_twoval_binop_libfunc): Likewise.

11 months agoAVR: target/115830 - Make better use of SREG.N and SREG.Z.
Georg-Johann Lay [Sun, 4 Aug 2024 17:46:43 +0000 (19:46 +0200)] 
AVR: target/115830 - Make better use of SREG.N and SREG.Z.

This patch adds new CC modes CCN and CCZN for operations that
set SREG.N, resp. SREG.Z and SREG.N.  Add peephole2 patterns
to generate new compute + branch insns that make use
of the Z and N flags.  Most of these patterns need their own
asm output routines that don't do all the micro-optimizations
that the ordinary outputs may perform, as the latter have no
requirement to set CC in a usable way.
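
For readers unfamiliar with the flags, here is a small model of what N and Z
record after an 8-bit operation.  It is a sketch with hypothetical names
(flags, flags_of, sub8), covers only these two flags, and is not taken from
avr.md.

```cpp
#include <cassert>
#include <cstdint>

// Only the two status flags discussed above; the real SREG has more.
struct flags
{
  bool n; // set when bit 7 of the result is set
  bool z; // set when the result is zero
};

constexpr flags flags_of (std::uint8_t result)
{
  return flags { (result & 0x80) != 0, result == 0 };
}

// 8-bit subtraction with wrap-around, returning the flags it would leave.
constexpr flags sub8 (std::uint8_t a, std::uint8_t b)
{
  return flags_of (static_cast<std::uint8_t> (a - b));
}

void check_flags ()
{
  assert (sub8 (5, 5).z && !sub8 (5, 5).n);  // result 0x00
  assert (sub8 (5, 6).n && !sub8 (5, 6).z);  // result 0xFF
  assert (!sub8 (6, 5).n && !sub8 (6, 5).z); // result 0x01
}
```

A branch on "result == 0" or "result < 0" can therefore follow the
arithmetic insn directly, without a separate compare.  Note that N reflects
the sign of the wrapped result only, so it says nothing about overflow.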

We don't use cmpelim because it cannot provide scratch regs
(which peephole2 can), and some of the patterns require a
scratch reg, whereas the same operations that don't set REG_CC
don't require a scratch.  See the comments in avr.md for details.

The existing add.for.cc* patterns are simplified as they no
longer cover QImode, which is handled in a separate QImode case.
Apart from that, it adds 3 patterns for subtractions and one
pattern for shift left, all for multi-byte cases (HI, PSI, SI).

The add.for.cc* patterns now use CC[Z]Nmode, instead of the
former abuse of CCmode.

PR target/115830
gcc/
* config/avr/avr-modes.def (CCN, CCZN): New CC_MODEs.
* config/avr/avr-protos.h (avr_cond_branch): New from
ret_cond_branch.
(avr_out_plus_set_N, avr_op8_ZN_operator, avr_cmp0_code)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New protos.
(ccn_reg_rtx, cczn_reg_rtx): New declarations.
* config/avr/avr.cc (avr_cond_branch): New from ret_cond_branch.
(avr_cond_string): Add bool cc_overflow_unusable argument.
(avr_print_operand) ['L']: Like 'j' but overflow unusable.
['K']: Like 'k' but overflow unusable.
(avr_out_plus_set_ZN): Remove handling of QImode.
(avr_out_plus_set_N, avr_op8_ZN_operator, avr_cmp0_code)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New functions.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_N]: Handle case.
(avr_class_max_nregs): All MODE_CCs occupy one hard reg.
(avr_hard_regno_nregs): Same.
(avr_hard_regno_mode_ok) [REG_CC]: Allow all MODE_CC.
(pass_manager.h, context.h, tree-pass.h): Include them.
(ccn_reg_rtx, cczn_reg_rtx): New GTY variables.
(avr_init_expanders): Initialize them.
(avr_option_override): Run peephole2 a second time.
* config/avr/avr.md (adjust_len) [add_set_N]: New attr value.
(ALLCC, HI_SI): New mode iterators.
(CCname): New mode attribute.
(eqnegtle, cmp_signed, op8_ZN): New code iterators.
(swap, SWAP): New code attributes.
(branch): Handle CCNmode and CCZNmode.  Assimilate...
(difficult_branch): ...this insn.
(p1m1): Remove.
(gen_add_for_<code>_<mode>): Adjust to CCNmode and CCZNmode. Use
HISI as mode iterator.  Extend peephole2s that produce them.
(*add.for.eqne.<mode>): Extend to *add.for.cc[z]n.<mode>.
(*ashift.for.ccn.<mode>): New insn and peephole2 to make them.
(*sub.for.cczn.<mode>, *sub-extend<mode>.for.cczn.<mode>):
New insns and peephole2s to make them.
(*op8.for.cczn.<code>): New insn and peephole2 to make them.
* config/avr/predicates.md (const_1_to_3_operand)
(abs1_abs2_operand, signed_comparison_operator)
(op8_ZN_operator): New predicates.
gcc/testsuite/
* gcc.target/avr/pr115830-add.c: New test.
* gcc.target/avr/pr115830-add-c.c: New test.
* gcc.target/avr/pr115830-add-i.c: New test.
* gcc.target/avr/pr115830-and.c: New test.
* gcc.target/avr/pr115830-asl.c: New test.
* gcc.target/avr/pr115830-asr.c: New test.
* gcc.target/avr/pr115830-ior.c: New test.
* gcc.target/avr/pr115830-lsr.c: New test.
* gcc.target/avr/pr115830-asl32.c: New test.
* gcc.target/avr/pr115830-sub.c: New test.
* gcc.target/avr/pr115830-sub-ext.c: New test.

11 months agoc++: don't remove labels during coro-early-expand-ifns [PR105104]
Arsen Arsenović [Fri, 16 Aug 2024 17:07:01 +0000 (19:07 +0200)] 
c++: don't remove labels during coro-early-expand-ifns [PR105104]

In some scenarios, it is possible for the CFG cleanup to cause one of
the labels mentioned in CO_YIELD, which coro-early-expand-ifns intends
to remove, to become part of some statement.  As a result, when that
label is removed, the statement it became part of becomes invalid,
crashing the compiler.

There doesn't appear to be a reason to remove the labels (anymore, at
least), so let's not do that.

PR c++/105104

gcc/ChangeLog:

* coroutine-passes.cc (execute_early_expand_coro_ifns): Don't
remove any labels.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr105104.C: New test.

11 months agoAVR: Outsource code for avr-specific passes to new avr-passes.cc.
Georg-Johann Lay [Wed, 28 Aug 2024 09:18:45 +0000 (11:18 +0200)] 
AVR: Outsource code for avr-specific passes to new avr-passes.cc.

gcc/
* config.gcc (extra_objs) [target=avr]: Add avr-passes.o.
* config/avr/t-avr (avr-passes.o): New rule to make it.
* config/avr/avr.cc (#define INCLUDE_VECTOR): Remove.
(cfganal.h, cfgrtl.h, context.h, tree-pass.h, print-rtl.h): Don't
include them.
(avr_strict_signed_p, avr_strict_unsigned_p, avr_2comparisons_rhs)
(make_avr_pass_recompute_notes, make_avr_pass_casesi)
(make_avr_pass_ifelse, make_avr_pass_pre_proep, avr_split_tiny_move)
(emit_move_ccc, emit_move_ccc_after, reg_seen_between_p)
(avr_maybe_adjust_cfa, avr_redundant_compare_regs)
(avr_parallel_insn_from_insns, avr_is_casesi_sequence)
(avr_optimize_casesi, avr_redundant_compare, make_avr_pass_fuse_add)
(avr_optimize_2ifelse, avr_rest_of_handle_ifelse)
(avr_casei_sequence_check_operands)
Move functions...
(avr_pass_data_fuse_add, avr_pass_data_ifelse)
(avr_pass_data_casesi, avr_pass_data_recompute_notes)
(avr_pass_data_pre_proep): Move objects...
(avr_pass_fuse_add, avr_pass_pre_proep, avr_pass_recompute_notes)
(avr_pass_ifelse, avr_pass_casesi, AVR_LdSt_Props): Move classes...
* config/avr/avr-passes.cc: ... to this new C++ module.
(struct Ranges): Move to...
* config/avr/ranges.h: ...this new file.
* config/avr/avr-protos.h: Adjust comments.

11 months agotestsuite: Fix up refactored scanltranstree.exp functions [PR116522]
Alex Coplan [Thu, 29 Aug 2024 10:31:40 +0000 (11:31 +0100)] 
testsuite: Fix up refactored scanltranstree.exp functions [PR116522]

When adding RTL variants of the scan-ltrans-tree* functions in:
r15-3254-g3f51f0dc88ec21c1ec79df694200f10ef85915f4
I messed up the name of the underlying scan function to invoke.  The
code currently attempts to invoke functions named
scan{,-not,-dem,-dem-not} but should instead be invoking
scan-dump{,-not,-dem,-dem-not}.  This patch fixes that.

gcc/testsuite/ChangeLog:

PR testsuite/116522
* lib/scanltranstree.exp: Fix name of underlying scan function
used for scan-ltrans-{tree,rtl}-dump{,-not,-dem,-dem-not}.

11 months agoRISC-V: Fix subreg of VLS modes larger than a vector [PR116086].
Robin Dapp [Tue, 27 Aug 2024 08:25:34 +0000 (10:25 +0200)] 
RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].

When the source mode is potentially larger than one vector (e.g. an
LMUL2 mode for VLEN=128) we don't know which vector the subreg actually
refers to.  For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI))
could actually be a full (high) vector register of a two-register
group (at VLEN=128) or the higher part of a single register (at VLEN>128).

As the subreg is statically ambiguous we prevent such situations in
can_change_mode_class.
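
The ambiguity can be made concrete with a little arithmetic.  The sketch
below uses hypothetical names (location, locate) and simply maps a bit
position inside a register group to a register index and an offset for a
given VLEN; it is not code from riscv.cc.

```cpp
#include <cassert>

struct location
{
  unsigned reg;        // register index within the group
  unsigned bit_offset; // offset inside that register
};

// Where does bit BIT_POS of a register group land for a given VLEN?
constexpr location locate (unsigned bit_pos, unsigned vlen)
{
  return location { bit_pos / vlen, bit_pos % vlen };
}

void check_high_half ()
{
  // The high V2DI half of a V4DI value starts at bit 128.
  // VLEN=128: it is the whole second register of the group.
  assert (locate (128, 128).reg == 1 && locate (128, 128).bit_offset == 0);
  // VLEN=256: it is the upper half of the first (only) register.
  assert (locate (128, 256).reg == 0 && locate (128, 256).bit_offset == 128);
}
```

Since VLEN is only known at run time, the same subreg would have to name two
different places, which is why it cannot be resolved statically.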

The culprit in PR116086 is

 _12 = BIT_FIELD_REF <vect_cst__42, 128, 128>;

which can be expanded with a vector-vector extract (from V4DI to V2DI).
This patch adds a VLS-mode vector-vector extract that handles "halving"
cases like this one by sliding down the source vector, thus making sure
the correct part is used.

PR target/116086

gcc/ChangeLog:

* config/riscv/autovec.md (vec_extract<mode><v_half>): Add
vector-vector extract for VLS modes.
* config/riscv/riscv.cc (riscv_can_change_mode_class): Forbid
VLS modes larger than one vector.
* config/riscv/vector-iterators.md: Add vector-vector extract
iterators.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add effective target checks for
zvl256b and zvl512b.
* gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086.c: New test.

11 months agoi386: Support wide immediate constants in STV.
Roger Sayle [Thu, 29 Aug 2024 03:19:28 +0000 (21:19 -0600)] 
i386: Support wide immediate constants in STV.

This patch provides more accurate costs/gains for (wide) immediate
constants in STV, suitably adjusting the costs/gains when the highpart
and lowpart words are the same.

2024-08-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (timode_immed_const_gain): New
function to determine the gain/cost on a CONST_WIDE_INT.
(timode_scalar_chain::compute_convert_gain): Fix whitespace.
<case CONST_WIDE_INT>: Provide more accurate estimates using
timode_immed_const_gain.
<case AND>: Handle CONSTANT_SCALAR_INT_P (src).

11 months agoWrite LF_MFUNC_ID types for CodeView struct member functions
Mark Harmstone [Mon, 26 Aug 2024 21:16:11 +0000 (22:16 +0100)] 
Write LF_MFUNC_ID types for CodeView struct member functions

If recording the definition of a struct member function, write an
LF_MFUNC_ID type rather than an LF_FUNC_ID. This links directly to the
struct type, rather than to an LF_STRING_ID with its name.

gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_MFUNC_ID.
(write_lf_mfunc_id): New function.
(add_lf_func_id): New function.
(add_lf_mfunc_id): New function.
(add_function): Call add_lf_func_id or add_lf_mfunc_id.

11 months agoRecord member functions in CodeView struct definitions
Mark Harmstone [Mon, 26 Aug 2024 21:40:56 +0000 (22:40 +0100)] 
Record member functions in CodeView struct definitions

CodeView has two ways of recording struct member functions.
Non-overloaded functions have an LF_ONEMETHOD sub-type in the field
list, which records the name and the function type (LF_MFUNCTION).
Overloaded functions have an LF_METHOD instead, which points to an
LF_METHODLIST, which is an array of links to various LF_MFUNCTION types.
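
The choice between the two forms can be sketched as a grouping by name.
The names field_kind and classify_methods are hypothetical and the sketch
does not emit any CodeView records; it only shows the decision described
above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class field_kind { onemethod, method };

// Decide, per method name, which field-list entry the text above calls
// for: LF_ONEMETHOD for a single function, LF_METHOD (pointing to an
// LF_METHODLIST) for an overload set.
std::map<std::string, field_kind>
classify_methods (const std::vector<std::string> &names)
{
  std::map<std::string, unsigned> counts;
  for (const std::string &name : names)
    ++counts[name];

  std::map<std::string, field_kind> result;
  for (const auto &entry : counts)
    result[entry.first] = entry.second == 1 ? field_kind::onemethod
                                            : field_kind::method;
  return result;
}

void check_classification ()
{
  const auto kinds = classify_methods ({ "get", "set", "set" });
  assert (kinds.at ("get") == field_kind::onemethod);
  assert (kinds.at ("set") == field_kind::method);
}
```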

gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_MFUNCTION,
LF_METHODLIST, LF_METHOD, and LF_ONEMETHOD.
(struct codeview_subtype): Add lf_onemethod and lf_method to union.
(struct lf_methodlist_entry): New type.
(struct codeview_custom_type): Add lf_mfunc_id, lf_mfunction, and
lf_methodlist to union.
(struct codeview_method): New type.
(struct method_hasher): New type.
(get_type_num_subroutine_type): Add forward declaration.
(write_lf_fieldlist): Handle LF_ONEMETHOD and LF_METHOD.
(write_lf_mfunction): New function.
(write_lf_methodlist): New function.
(write_custom_types): Handle LF_MFUNCTION and LF_METHODLIST.
(add_struct_function): New function.
(get_mfunction_type): New function.
(is_templated_func): New function.
(get_type_num_struct): Handle DW_TAG_subprogram child DIEs.
(get_type_num_subroutine_type): Add containing_class_type, this_type,
and this_adjustment params, and handle creating LF_MFUNCTION types as
well as LF_PROCEDURE.
(get_type_num): New params for get_type_num_subroutine_type.
(add_function): New params for get_type_num_subroutine_type.
* dwarf2codeview.h (CV_METHOD_VANILLA, CV_METHOD_VIRTUAL): Define.
(CV_METHOD_STATIC, CV_METHOD_FRIEND, CV_METHOD_INTRO): Likewise.
(CV_METHOD_PUREVIRT, CV_METHOD_PUREINTRO): Likewise.

11 months ago Record static data members in CodeView structs
Mark Harmstone [Mon, 26 Aug 2024 20:34:46 +0000 (21:34 +0100)] 
Record static data members in CodeView structs

Record LF_STMEMBER field list subtypes to represent static data members
in structs.

gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_STMEMBER.
(struct codeview_subtype): Add lf_static_member to union.
(write_lf_fieldlist): Handle LF_STMEMBER.
(add_struct_member): New function.
(add_struct_static_member): New function.
(get_accessibility): New function.
(get_type_num_struct): Split out into add_struct_member and
get_accessibility, and handle static members.

11 months ago Handle scoping in CodeView LF_FUNC_ID types
Mark Harmstone [Mon, 26 Aug 2024 20:19:51 +0000 (21:19 +0100)] 
Handle scoping in CodeView LF_FUNC_ID types

If a function is in a namespace, create an LF_STRING_ID type for the
name of its parent, and record this in the LF_FUNC_ID type we create
for the function.

gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_STRING_ID.
(struct codeview_custom_type): Add lf_string_id to union.
(struct string_id_hasher): New type.
(string_id_htab): New global variable.
(write_lf_string_id): New function.
(write_custom_types): Call write_lf_string_id.
(codeview_debug_finish): Free string_id_htab.
(add_string_id): New function.
(get_scope_string_id): New function.
(add_function): Call get_scope_string_id and set scope.

11 months ago Handle namespaced names for CodeView
Mark Harmstone [Mon, 26 Aug 2024 20:03:58 +0000 (21:03 +0100)] 
Handle namespaced names for CodeView

Run all CodeView names through a new function get_name, which chains
together a DIE's DW_AT_name with that of its parent to create a
C++-style name.
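
The name chaining can be pictured with the following minimal sketch.  The Die
struct and qualified_name function are hypothetical stand-ins for
illustration only; the real get_name works on GCC's DWARF DIEs and may treat
parent kinds differently.

```cpp
#include <string>

// Hypothetical stand-in for a debug-info entry: a name plus a link to the
// enclosing namespace or struct (nullptr at the outermost scope).
struct Die
{
  std::string name;
  const Die *parent;
};

// Sketch only: prefix each enclosing scope's name, separated by "::".
std::string
qualified_name (const Die *die)
{
  std::string result = die->name;
  for (const Die *p = die->parent; p != nullptr; p = p->parent)
    result = p->name + "::" + result;
  return result;
}
```

With a chain ns -> S -> f, qualified_name on the innermost entry yields
"ns::S::f".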

gcc/
* dwarf2codeview.cc (get_name): New function.
(add_enum_forward_def): Call get_name.
(get_type_num_enumeration_type): Call get_name.
(add_struct_forward_def): Call get_name.
(get_type_num_struct): Call get_name.
(add_variable): Call get_name.
(add_function): Call get_name.
* dwarf2out.cc (get_die_parent): Rename to dw_get_die_parent and make
non-static.
(generate_type_signature): Handle renamed get_die_parent.
* dwarf2out.h (dw_get_die_parent): Add declaration.

11 months ago Daily bump.
GCC Administrator [Thu, 29 Aug 2024 00:19:25 +0000 (00:19 +0000)] 
Daily bump.

11 months ago c++: wrong error due to std::initializer_list opt [PR116476]
Marek Polacek [Wed, 28 Aug 2024 19:45:49 +0000 (15:45 -0400)] 
c++: wrong error due to std::initializer_list opt [PR116476]

Here maybe_init_list_as_array gets elttype=field, init={NON_LVALUE_EXPR <2>}
and it tries to convert the init's element type (int) to field
using implicit_conversion, which works, so overall maybe_init_list_as_array
is successful.

But it constifies init_elttype so we end up with "const int".  Later,
when we actually perform the conversion and invoke field::field(T&&),
we end up with this error:

  error: binding reference of type 'int&&' to 'const int' discards qualifiers

So I think maybe_init_list_as_array should try to perform the conversion,
like it does below with fc.
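
A minimal sketch of the kind of code involved, assuming a field class with a
forwarding-reference constructor.  This is a guess at the shape of the
reproducer, not the contents of initlist-opt2.C, and make_fields is a
hypothetical name.  The code is valid C++ and should compile cleanly on a
conforming compiler.

```cpp
#include <cstddef>
#include <vector>

struct field
{
  // Accepts any argument.  For the prvalue 2, T is deduced as int and the
  // parameter type is int&&.
  template <typename T>
  field (T &&) {}
};

// Builds a std::vector<field> from an int initializer, the situation in
// which an initializer_list-to-array optimization may be considered.
std::size_t
make_fields ()
{
  std::vector<field> v{2};
  return v.size ();
}
```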

PR c++/116476

gcc/cp/ChangeLog:

* call.cc (maybe_init_list_as_array): Try convert_like and see if it
worked.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-opt2.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

11 months ago PR modula2/116181 remove ODR warnings from library interface files
Gaius Mulley [Wed, 28 Aug 2024 21:51:11 +0000 (22:51 +0100)] 
PR modula2/116181 remove ODR warnings from library interface files

This patch removes the warnings generated by -Wodr from the library
interface between modula-2 and C.

gcc/m2/ChangeLog:

PR modula2/116181
* Make-lang.in (MC_SRC_FLAGS): New macro.
(m2/mc-boot/$(SRC_PREFIX)%.o): Use MC_SRC_FLAGS.
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Ditto.
(m2/gm2-libs-boot/M2RTS.o): Ditto.
(m2/gm2-libs-boot/%.o): Ditto.
(GM2-LIBS-BOOT-H): New macro.
(m2/gm2-libs-boot/RTcodummy.o): Use MC_SRC_FLAGS.
Remove gm2-libs-host.h from the dependency.
(m2/gm2-libs-boot/wrapc.o): Use MC_SRC_FLAGS.
Add dependency GM2-LIBS-BOOT-H.
(m2/gm2-libs-boot/UnixArgs.o): Ditto.
(m2/gm2-libs-boot/choosetemp.o): Ditto.
(m2/gm2-libs-boot/errno.o): Ditto.
(m2/gm2-libs-boot/dtoa.o): Ditto.
(m2/gm2-libs-boot/ldtoa.o): Ditto.
(m2/gm2-libs-boot/termios.o): Ditto.
(m2/gm2-libs-boot/SysExceptions.o): Ditto.
(m2/gm2-compiler-boot/M2GCCDeclare.o): Add gm2-libs-ch to the
search path.
(m2/gm2-compiler-boot/M2Error.o): Ditto.
(m2/gm2-compiler-boot/%.o): Ditto.
(m2/pge-boot/%.o): Ditto.
* gm2-gcc/m2color.cc (m2color_colorize_start): Replace parameter
type char to void and recast to char * when calling colorize_start.
* gm2-gcc/m2color.h (m2color_colorize_start): Replace parameter
type char to void.
* gm2-gcc/m2type.h: Remove #if 0 block.
* gm2-libs-ch/SysExceptions.c (DECL_PROC_T): Provide alternative
defines for MC and gm2.
(PROC_FUNC): Ditto.
(EXTERN): Force undefine and redefine.
(SysExceptions_InitExceptionHandlers): Rewrite function
declaration using defined macros.
(_M2_SysExceptions_init): Use EXTERN.
(_M2_SysExceptions_finish): Replace with ...
(_M2_SysExceptions_fini): ... this and add parameters.
* gm2-libs-ch/UnixArgs.cc (gm2-libs-host.h): Include.
(GUnixArgs.h): Include.
(GM2RTS.h): Include.
(UnixArgs_GetArgV): Change return type to void *.
(UnixArgs_GetEnvV): Ditto.
* gm2-libs-ch/m2rts.h (M2RTS_RegisterModule_Cstr): Add new
conditional macro.
(M2RTS_RequestDependant): Remove.
(M2RTS_RegisterModule): Ditto.
(M2RTS_Terminate): Ditto.
(M2RTS_DeconstructModules): Ditto.
(M2RTS_Halt): Ditto.
(_M2_M2RTS_init): Ditto.
(M2RTS_ConstructModules): Ditto.
* gm2-libs-ch/termios.c (_termios_C): Define.
(EXTERN): Add conditional definition.
(doSetUnset): New function.
(_M2_termios_init): Add correct parameters.
(_M2_termios_finish): Ditto.
(_M2_termios_fini): Ditto.
* mc-boot-ch/GSysExceptions.c (DECL_PROC_T): New define.
(PROC_FUNC): Ditto.
(EXTERN): Force undef.
(SysExceptions_InitExceptionHandlers): Rewrite.
* mc-boot-ch/Glibc.c (libc_open): Rename parameter
oflag to flags.
* mc-boot-ch/Gtermios.cc (_termios_C): New define.
(KillTermios): Change parameter type from
struct termios * to termios_TERMIOS.
(tcsnow): Rewrite.
(tcsdrain): Rewrite.
(tcsflush): Rewrite.
(cfgetospeed): Rewrite.
(cfgetispeed): Rewrite.
(cfsetospeed): Rewrite.
(cfsetispeed): Rewrite.
(cfsetspeed): Rewrite.
(tcgetattr): Rewrite.
(tcsetattr): Rewrite.
(cfmakeraw): Rewrite.
(tcsendbreak): Rewrite.
(tcdrain): Rewrite.
(tcflushi): Rewrite.
(tcflusho): Rewrite.
(tcflushio): Rewrite.
(tcflowoni): Rewrite.
(tcflowoffi): Rewrite.
(tcflowono): Rewrite.
(tcflowoffo): Rewrite.
(GetFlag): Rewrite.
(SetFlag): Rewrite.
(GetChar): Rewrite.
(SetChar): Rewrite.
(InitTermios): Rewrite.
* pge-boot/GM2RTS.cc: Regenerate.
* pge-boot/GSysExceptions.cc: Ditto.
* pge-boot/Gtermios.cc: Ditto.
* pge-boot/m2rts.h: Rewrite.
* mc-boot-ch/GSYSTEM.h: New file.
* mc-boot-ch/GSysExceptions.h: New file.
* mc-boot-ch/Gtermios.h: New file.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

11 months ago expand: Add debug dump on the cost for `popcount==1` expand
Andrew Pinski [Mon, 26 Aug 2024 22:14:24 +0000 (15:14 -0700)] 
expand: Add debug dump on the cost for `popcount==1` expand

While working on PR 114224, I found it would be useful to dump the
different costs of the expansion to make it easier to understand why one
was chosen over the other.

Changes since v1:
* v2: make the dump a single line

Bootstrapped and tested on x86_64-linux-gnu.
Built and tested for aarch64-linux-gnu.

gcc/ChangeLog:

* internal-fn.cc (expand_POPCOUNT): Dump the costs for
the two choices.
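
For context, a `popcount (x) == 1` test can be written either with a real
population count or with a well-known bit trick.  The sketch below shows both
forms; that these correspond exactly to the two alternatives whose costs
expand_POPCOUNT compares is an assumption, and the function names are
hypothetical.

```cpp
// Straightforward form: count the set bits, then compare against 1.
bool
single_bit_via_popcount (unsigned x)
{
  unsigned count = 0;
  for (; x != 0; x >>= 1)
    count += x & 1u;
  return count == 1;
}

// Bit-trick form: x ^ (x - 1) is a mask covering the lowest set bit and all
// bits below it.  It is greater than x - 1 exactly when x has no other set
// bit.  For x == 0 both sides are all-ones, so the result is false.
bool
single_bit_via_bittrick (unsigned x)
{
  return (x ^ (x - 1u)) > (x - 1u);
}
```

Which form is cheaper depends on whether the target has a fast popcount
instruction, which is the kind of decision a cost dump helps to explain.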

11 months ago libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h>
Jonathan Wakely [Wed, 28 Aug 2024 11:38:18 +0000 (12:38 +0100)] 
libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h>

I misused the AC_CHECK_DECL macro, assuming that it behaved like
AC_CHECK_DECLS and always defined a HAVE_xxx macro if the decl was
found. Instead, the [action-if-found] shell commands are needed to
define HAVE_O_NONBLOCK explicitly.

libstdc++-v3/ChangeLog:

* configure.ac: Fix check for O_NONBLOCK.
* config.h.in: Regenerate.
* configure: Regenerate.

11 months ago libstdc++: Fix -Wunused-parameter warnings in Networking TS headers
Jonathan Wakely [Wed, 28 Aug 2024 11:21:56 +0000 (12:21 +0100)] 
libstdc++: Fix -Wunused-parameter warnings in Networking TS headers

libstdc++-v3/ChangeLog:

* include/experimental/io_context: Remove name of unused
parameter.
* include/experimental/socket: Add [[maybe_unused]] attribute.
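
The two techniques named in the entries above look roughly like this in
ordinary code.  This is a generic sketch with hypothetical function names,
not the code from the Networking TS headers.

```cpp
#include <cassert>

// Option 1: leave the unused parameter unnamed (a comment can keep the name).
int
first_only (int value, int /* unused_flags */)
{
  return value;
}

// Option 2: keep the name but mark it.  This suits parameters that are only
// used in some configurations, e.g. only inside assert, which expands to
// nothing when NDEBUG is defined.
int
checked_value (int value, [[maybe_unused]] int limit)
{
  assert (value <= limit);
  return value;
}
```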

11 months ago libstdc++: Fix -Wunused-variable warning in <format>
Jonathan Wakely [Wed, 28 Aug 2024 11:09:58 +0000 (12:09 +0100)] 
libstdc++: Fix -Wunused-variable warning in <format>

libstdc++-v3/ChangeLog:

* include/std/format (format_parse_context::check_dynamic_spec):
Add [[maybe_unused]] attribute and comment.

11 months ago libstdc++: Remove unused typedef in <ranges>
Jonathan Wakely [Wed, 28 Aug 2024 10:49:08 +0000 (11:49 +0100)] 
libstdc++: Remove unused typedef in <ranges>

This local typedef should have been removed in r14-6199-g45630fbcf7875b.

libstdc++-v3/ChangeLog:

* include/std/ranges (to): Remove unused typedef.

11 months ago doc: Add Dhruv Matani to Contributors
Jonathan Wakely [Wed, 28 Aug 2024 10:49:46 +0000 (11:49 +0100)] 
doc: Add Dhruv Matani to Contributors

gcc/ChangeLog:

* doc/contrib.texi (Contributors): Add Dhruv Matani.

11 months ago libstdc++: Fix @file for target-specific opt_random.h
Kim Gräsman [Tue, 27 Aug 2024 16:11:29 +0000 (17:11 +0100)] 
libstdc++: Fix @file for target-specific opt_random.h

A few of these files self-identified as ext/random.tcc; update them to
use the actual basename.

libstdc++-v3/ChangeLog:

* config/cpu/aarch64/opt/ext/opt_random.h: Improve doxygen file
docs.
* config/cpu/i486/opt/ext/opt_random.h: Likewise.

11 months ago libstdc++: Fix @headername for bits/cpp_type_traits.h
Kim Gräsman [Tue, 27 Aug 2024 16:08:47 +0000 (17:08 +0100)] 
libstdc++: Fix @headername for bits/cpp_type_traits.h

There is no file ext/type_traits; point it to ext/type_traits.h instead.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h: Improve doxygen file docs.

11 months ago AVR: Overhaul the avr-ifelse RTL optimization pass.
Georg-Johann Lay [Fri, 23 Aug 2024 09:34:43 +0000 (11:34 +0200)] 
AVR: Overhaul the avr-ifelse RTL optimization pass.

Mini-pass avr-ifelse realizes optimizations that replace two cbranch
insns with one comparison and two branches.  This patch adds the
following improvements:

- The right operand of the comparisons may also be REGs.
  Formerly only CONST_INT was handled.

- The RTX code of the first comparison is no longer restricted
  to (effectively) EQ.

- When the second cbranch is located in the fallthrough path
  of the first cbranch, then difficult (expensive) comparisons
  can always be avoided.  This may require swapping the branch
  targets.  (When the second cbranch is located after the target
  label of the first one, then getting rid of difficult branches
  would require reordering blocks.)

- The code has been cleaned up:  avr_rest_of_handle_ifelse() now
  just scans the insn stream for optimization candidates.  The code
  that actually performs the transformation has been outsourced to
  the new function avr_optimize_2ifelse().

- The code to find a better representation for reg-const_int comparisons
  has been split into two parts:  First try to find codes such that the
  right-hand sides of the comparisons are the same (avr_2comparisons_rhs).
  When this succeeds then one comparison can serve two branches, and
  that function tries to get rid of difficult branches.  This is always
  possible when the second cbranch is located in the fallthrough path
  of the first one.
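
At the source level, the situation the pass targets can arise from code
shaped like the sketch below: two comparisons of the same operands, with the
second one on the fallthrough path of the first.  This is a hypothetical
example input, not taken from the testsuite, and whether a given compiler
version produces two cbranch insns for it is an assumption.

```cpp
#include <cstdint>

// Three-way classification written as two comparisons of the same operands.
int
classify (std::int8_t x, std::int8_t c)
{
  if (x == c)     // first comparison + branch
    return 0;
  if (x < c)      // second comparison + branch; same operands as the first
    return -1;
  return 1;
}
```

Since both tests compare x with c, a single comparison can in principle feed
both branches, which is the transformation described above.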

Some final notes on why we don't use compare-elim:  1) The two cbranch
insns may come with different scratch operands depending on the chosen
constraint alternatives.  There are cases where the outgoing comparison
requires a scratch but only one incoming cbranch has one.  2) Avoiding
difficult branches can be achieved by rewiring basic blocks.
compare-elim doesn't do that; it doesn't even know the costs of the
branch codes.  3)  avr_2comparisons_rhs() may de-canonicalize a
comparison to achieve its goal.  compare-elim doesn't know how to do
that.  4) There are more reasons, see for example the commit message
and discussion for PR115830.

avr_2comparisons_rhs tries to decompose the interval as given by some
[u]intN_t into three intervals using the new Ranges struct that
implements set operations on finite unions of intervals.
Sadly, value-range.h is not well suited for that, and writing a
wrapper around it that avoids all corner case ICEs would be more
laborious than struct Ranges.
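
To make the idea of set operations on finite unions of intervals concrete,
here is a small sketch.  It is not the Ranges struct from avr.cc: the names
Interval, Ranges, single, intersect, contains and split3 are all hypothetical,
and the three-way split around a single constant is only the simplest case
one might assume.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical closed interval [lo, hi]; 64-bit bounds avoid overflow when
// computing c - 1 and c + 1 for any [u]intN_t with N <= 32.
struct Interval
{
  std::int64_t lo, hi;
};

// Sketch of a finite union of disjoint intervals kept in ascending order.
struct Ranges
{
  std::vector<Interval> parts;

  static Ranges
  single (std::int64_t lo, std::int64_t hi)
  {
    Ranges r;
    if (lo <= hi)           // An inverted pair denotes the empty set.
      r.parts.push_back ({lo, hi});
    return r;
  }

  // Set intersection: pairwise overlap of the parts of both operands.
  Ranges
  intersect (const Ranges &other) const
  {
    Ranges out;
    for (const Interval &a : parts)
      for (const Interval &b : other.parts)
        {
          std::int64_t lo = std::max (a.lo, b.lo);
          std::int64_t hi = std::min (a.hi, b.hi);
          if (lo <= hi)
            out.parts.push_back ({lo, hi});
        }
    return out;
  }

  bool
  contains (std::int64_t v) const
  {
    for (const Interval &p : parts)
      if (p.lo <= v && v <= p.hi)
        return true;
    return false;
  }
};

// Split the value range [tmin, tmax] of a type into the parts below, equal
// to, and above a constant c.
std::vector<Ranges>
split3 (std::int64_t tmin, std::int64_t tmax, std::int64_t c)
{
  Ranges all = Ranges::single (tmin, tmax);
  return { all.intersect (Ranges::single (tmin, c - 1)),
           all.intersect (Ranges::single (c, c)),
           all.intersect (Ranges::single (c + 1, tmax)) };
}
```

For int8_t and c == 0, split3 (-128, 127, 0) gives the three sets
[-128, -1], [0, 0] and [1, 127]; for uint8_t and c == 0 the first set is
empty.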

gcc/
* config/avr/avr.cc (INCLUDE_VECTOR): Define it.
(cfganal.h): Include it.
(Ranges): New struct.
(avr_2comparisons_rhs, avr_redundant_compare_regs)
(avr_strict_signed_p, avr_strict_unsigned_p): New static functions.
(avr_redundant_compare): Overhaul: Allow more cases.
(avr_optimize_2ifelse): New static function, outsourced from...
(avr_rest_of_handle_ifelse): ...this method.
gcc/testsuite/
* gcc.target/avr/torture/ifelse-c.h: New file.
* gcc.target/avr/torture/ifelse-d.h: New file.
* gcc.target/avr/torture/ifelse-q.h: New file.
* gcc.target/avr/torture/ifelse-r.h: New file.
* gcc.target/avr/torture/ifelse-c-i8.c: New test.
* gcc.target/avr/torture/ifelse-d-i8.c: New test.
* gcc.target/avr/torture/ifelse-q-i8.c: New test.
* gcc.target/avr/torture/ifelse-r-i8.c: New test.
* gcc.target/avr/torture/ifelse-c-i16.c: New test.
* gcc.target/avr/torture/ifelse-d-i16.c: New test.
* gcc.target/avr/torture/ifelse-q-i16.c: New test.
* gcc.target/avr/torture/ifelse-r-i16.c: New test.
* gcc.target/avr/torture/ifelse-c-u16.c: New test.
* gcc.target/avr/torture/ifelse-d-u16.c: New test.
* gcc.target/avr/torture/ifelse-q-u16.c: New test.
* gcc.target/avr/torture/ifelse-r-u16.c: New test.

11 months ago Add gcc ka.po
Joseph Myers [Wed, 28 Aug 2024 16:31:13 +0000 (16:31 +0000)] 
Add gcc ka.po

* ka.po: New file.