git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

[PR rtl-optimization/115876] Avoid ubsan in ext-dce.cc

This fixes two general ubsan issues in ext-dce, both related to use-side
processsing of modes > DImode.

In ext_dce_process_uses we can be presented with something like this as a use
(subreg:SI (reg:TF) 12)

That will result in an out of range shift for a HOST_WIDE_INT object.  Where
this happens is safe to just break from the SET context and process the
subjects.  This will ultimately result in seeing (reg:TF) and we'll mark all
bit groups as live.

In carry_backpropagate we can be presented with a TImode shift (for example)
and the shift count can be > 63 for such a shift.  This naturally trips ubsan
as well as we're operating on 64 bit objects.

We can just return mmask in this case noting that every bit group is live.

The combination of these two fixes eliminates all the reported ubsan issues in
ext-dce seen in a bootstrap and regression test on x86.

While I was in there I went ahead and fixed the various hardcoded 63/64 values
to be HOST_BITS_PER_WIDE_INT based.

Bootstrapped and regression tested on x86 with no regressions.  Also built with
ubsan enabled and verified the build logs and testsuite logs don't call out any
issues in ext-dce anymore.

Pushing to the trunk.

PR rtl-optimization/115876
gcc
* ext-dce.cc (ext_dce_process_sets): Replace hardcoded 63/64 instances
with HOST_BITS_PER_WIDE_INT based values.
(carry_backpropagate): Handle modes with more bits than
HOST_BITS_PER_WIDE_INT gracefully, avoiding undefined behavior.
(ext_dce_process_uses): Handle subreg offsets which would result
in ubsan shifts gracefully, avoiding undefined behavior.

libstdc++: Remove note from the GCC 4.0.1 days

libstdc++-v3:
* doc/xml/manual/prerequisites.xml: Remove note from the
GCC 4.0.1 days.
* doc/html/manual/setup.html: Regenerate.

doc: Tweak gm2 mailing list address

gcc:
* doc/gm2.texi (Contributing): Tweak gm2 mailing list address.

PHIOPT: move factor_out_conditional_operation over to use gimple_match_op

To start working on more with expressions with more than one operand, converting
over to use gimple_match_op is needed.
The added side-effect here is factor_out_conditional_operation can now support
builtins/internal calls that has one operand without any extra code added.

Note on the changed testcases:
* pr87007-5.c: the test was testing testing for avoiding partial register stalls
for the sqrt and making sure there is only one zero of the register before the
branch, the phiopt would now merge the sqrt's so disable phiopt.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* gimple-match-exports.cc (gimple_match_op::operands_occurs_in_abnormal_phi):
New function.
* gimple-match.h (gimple_match_op): Add operands_occurs_in_abnormal_phi.
* tree-ssa-phiopt.cc (factor_out_conditional_operation): Use gimple_match_op
instead of manually extracting from/creating the gimple.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr87007-5.c: Disable phi-opt.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libgfortran: implement fpu-macppc for Darwin, support IEEE arithmetic

This allows to build and use IEEE modules on Darwin PowerPC.

libgfortran/ChangeLog:

* config/fpu-macppc.h (new file): initial support for powerpc-darwin.
* configure.host: enable ieee_support for powerpc-darwin case,
set fpu_host='fpu-macppc'.

Signed-off-by: Sergey Fedorov <vital.had@gmail.com>

AVR: Tweak 16-bit addition with const that didn't get a LD_REGS register.

The 16-bit additions like addhi3 have two forms: One with a scratch:QI
and one without, where the latter is required because reload cannot
deal with a scratch when spill code pops a 16-bit addition.

Passes like combine and fwprop1 may come up with the non-scratch version,
which is sub-optimal in the case when the addition is performed in a
NO_LD_REGS register because the operands will be spilled to LD_REGS.
Having a scratch:QI at disposal can lead to better code with less spills.

gcc/
* config/avr/avr.md (*add<mode>3_split) [!reload_completed]:
Add a scratch:QI to 16-bit additions with constant.

AVR: ad target/116407 - Fix linker error "relocation truncated to fit".

PR target/116407
gcc/
* config/avr/avr.md (*dec-and-branchhi!=-1.l.clobber):
Increase the additional jump offset to 2 words.

AVR: target/116407 - Fix linker error "relocation truncated to fit".

Some text peepholes output extra instructions prior to a branch
instruction and that increase the jump offset of backward branches.

PR target/116407
gcc/
* config/avr/avr-protos.h (avr_jump_mode): Add an int argument.
* config/avr/avr.cc (avr_jump_mode): Add an int argument to increase
the computed jump offset of backwards branches.
* config/avr/avr.md (*dec-and-branchhi!=-1, *dec-and-branchsi!=-1):
Increase the jump offset used by avr_jump_mode() as needed.
gcc/testsuite/
* gcc.target/avr/torture/pr116407-2.c: New test.
* gcc.target/avr/torture/pr116407-4.c: New test.

forwprop: Also dce from added statements from gimple_simplify

This extends r14-3982-g9ea74d235c7e78 to also include the newly added statements
since some of them might be dead too (due to the way match and simplify works).
This was noticed while working on adding a new match and simplify pattern where a
new statement that got added was not being used.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* gimple-fold.cc (mark_lhs_in_seq_for_dce): New function.
(replace_stmt_with_simplification): Call mark_lhs_in_seq_for_dce
right before inserting the sequence.
(fold_stmt_1): Add dce_worklist argument, update call to
replace_stmt_with_simplification.
(fold_stmt): Add dce_worklist argument, update call to fold_stmt_1.
(fold_stmt_inplace): Update call to fold_stmt_1.
* gimple-fold.h (fold_stmt): Add bitmap argument.
* tree-ssa-forwprop.cc (pass_forwprop::execute): Update call to fold_stmt.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

RISC-V: Implement the quad and oct .SAT_TRUNC for scalar

This patch would like to implement the quad and oct .SAT_TRUNC pattern
in the riscv backend. Aka:

Form 1:
  #define DEF_SAT_U_TRUC_FMT_1(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_1 (WT x) \
  {                                        \
    bool overflow = x > (WT)(NT)(-1);      \
    return ((NT)x) | (NT)-overflow;        \
  }

DEF_SAT_U_TRUC_FMT_1(uint16_t, uint64_t)

Before this patch:
   4   │ __attribute__((noinline))
   5   │ uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x)
   6   │ {
   7   │   _Bool overflow;
   8   │   short unsigned int _1;
   9   │   short unsigned int _2;
  10   │   short unsigned int _3;
  11   │   uint16_t _6;
  12   │
  13   │ ;;   basic block 2, loop depth 0
  14   │ ;;    pred:       ENTRY
  15   │   overflow_5 = x_4(D) > 65535;
  16   │   _1 = (short unsigned int) x_4(D);
  17   │   _2 = (short unsigned int) overflow_5;
  18   │   _3 = -_2;
  19   │   _6 = _1 | _3;
  20   │   return _6;
  21   │ ;;    succ:       EXIT
  22   │
  23   │ }

After this patch:
   3   │
   4   │ __attribute__((noinline))
   5   │ uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x)
   6   │ {
   7   │   uint16_t _6;
   8   │
   9   │ ;;   basic block 2, loop depth 0
  10   │ ;;    pred:       ENTRY
  11   │   _6 = .SAT_TRUNC (x_4(D)); [tail call]
  12   │   return _6;
  13   │ ;;    succ:       EXIT
  14   │
  15   │ }

The below tests suites are passed for this patch
1. The rv64gcv fully regression test.
2. The rv64gcv build with glibc

gcc/ChangeLog:

* config/riscv/iterators.md (ANYI_QUAD_TRUNC): New iterator for
quad truncation.
(ANYI_OCT_TRUNC): New iterator for oct truncation.
(ANYI_QUAD_TRUNCATED): New attr for truncated quad modes.
(ANYI_OCT_TRUNCATED): New attr for truncated oct modes.
(anyi_quad_truncated): Ditto but for lower case.
(anyi_oct_truncated): Ditto but for lower case.
* config/riscv/riscv.md (ustrunc<mode><anyi_quad_truncated>2):
Add new pattern for quad truncation.
(ustrunc<mode><anyi_oct_truncated>2): Ditto but for oct.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: Adjust
the expand dump check times.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: Ditto.
* gcc.target/riscv/sat_arith_data.h: Add test helper macros.
* gcc.target/riscv/sat_u_trunc-4.c: New test.
* gcc.target/riscv/sat_u_trunc-5.c: New test.
* gcc.target/riscv/sat_u_trunc-6.c: New test.
* gcc.target/riscv/sat_u_trunc-run-4.c: New test.
* gcc.target/riscv/sat_u_trunc-run-5.c: New test.
* gcc.target/riscv/sat_u_trunc-run-6.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Make sure high bits of usadd operands is clean for non-Xmode [PR116278]

For QI/HImode of .SAT_ADD,  the operands may be sign-extended and the
high bits of Xmode may be all 1 which is not expected.  For example as
below code.

signed char b[1];
unsigned short c;
signed char *d = b;
int main() {
  b[0] = -40;
  c = ({ (unsigned short)d[0] < 0xFFF6 ? (unsigned short)d[0] : 0xFFF6; }) + 9;
  __builtin_printf("%d\n", c);
}

After expanding we have:

;; _6 = .SAT_ADD (_3, 9);
(insn 8 7 9 (set (reg:DI 143)
        (high:DI (symbol_ref:DI ("d") [flags 0x86]  <var_decl d>)))
     (nil))
(insn 9 8 10 (set (reg/f:DI 142)
        (mem/f/c:DI (lo_sum:DI (reg:DI 143)
                (symbol_ref:DI ("d") [flags 0x86]  <var_decl d>)) [1 d+0 S8 A64]))
     (nil))
(insn 10 9 11 (set (reg:HI 144 [ _3 ])
        (sign_extend:HI (mem:QI (reg/f:DI 142) [0 *d.0_1+0 S1 A8]))) "test.c":7:10 -1
     (nil))

The convert from signed char to unsigned short will have sign_extend rtl
as above.  And finally become the lb insn as below:

lb      a1,0(a5)   // a1 is -40, aka 0xffffffffffffffd8
lui     a0,0x1a
addi    a5,a1,9
slli    a5,a5,0x30
srli    a5,a5,0x30 // a5 is 65505
sltu    a1,a5,a1   // compare 65505 and 0xffffffffffffffd8 => TRUE

The sltu try to compare 65505 and 0xffffffffffffffd8 here,  but we
actually want to compare 65505 and 65496 (0xffd8).  Thus we need to
clean up the high bits to ensure this.

The below test suites are passed for this patch:
* The rv64gcv fully regression test.

PR target/116278

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Add new
func impl to zero extend rtx.
(riscv_expand_usadd): Leverage above func to cleanup operands 0
and remove the special handing for SImode in RV64.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_u_add-11.c: Adjust asm check body.
* gcc.target/riscv/sat_u_add-15.c: Ditto.
* gcc.target/riscv/sat_u_add-19.c: Ditto.
* gcc.target/riscv/sat_u_add-23.c: Ditto.
* gcc.target/riscv/sat_u_add-3.c: Ditto.
* gcc.target/riscv/sat_u_add-7.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-11.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-15.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-3.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-7.c: Ditto.
* gcc.target/riscv/pr116278-run-1.c: New test.
* gcc.target/riscv/pr116278-run-2.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add testcases for unsigned scalar .SAT_TRUNC form 3

This patch would like to add test cases for the unsigned scalar
.SAT_TRUNC form 3.  Aka:

Form 3:
  #define DEF_SAT_U_TRUC_FMT_3(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_3 (WT x) \
  {                                        \
    WT max = (WT)(NT)-1;                   \
    return x <= max ? (NT)x : (NT) max;    \
  }

DEF_SAT_U_TRUC_FMT_3 (uint32_t, uint64_t)

The below test is passed for this patch.
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_trunc-13.c: New test.
* gcc.target/riscv/sat_u_trunc-14.c: New test.
* gcc.target/riscv/sat_u_trunc-15.c: New test.
* gcc.target/riscv/sat_u_trunc-run-13.c: New test.
* gcc.target/riscv/sat_u_trunc-run-14.c: New test.
* gcc.target/riscv/sat_u_trunc-run-15.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add testcases for unsigned scalar .SAT_TRUNC form 2

This patch would like to add test cases for the unsigned scalar
.SAT_TRUNC form 2.  Aka:

Form 2:
  #define DEF_SAT_U_TRUC_FMT_2(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \
  {                                        \
    WT max = (WT)(NT)-1;                   \
    return x > max ? (NT) max : (NT)x;     \
  }

DEF_SAT_U_TRUC_FMT_2 (uint32_t, uint64_t)

The below test is passed for this patch.
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_trunc-7.c: New test.
* gcc.target/riscv/sat_u_trunc-8.c: New test.
* gcc.target/riscv/sat_u_trunc-9.c: New test.
* gcc.target/riscv/sat_u_trunc-run-7.c: New test.
* gcc.target/riscv/sat_u_trunc-run-8.c: New test.
* gcc.target/riscv/sat_u_trunc-run-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

[committed] Avoid right shifting signed value on ext-dce.cc

This is analogous to a prior patch to ext-dce which fixes propagation of sign
bits, but this time for the saturating variants.  I'd held off fixing those
because I wanted the time to look at that code (since we don't have a testcase
for it as far as I know).

Not surprisingly, putting an abort on that path and running an x86 bootstrap
and testsuite run, it never triggers.  Of course not a lot of code tries to do
saturating shifts.

Anyway, bootstrapped and regression tested on x86_64.  Pushing to the trunk.

Thanks for everyone's patience.

gcc/
* ext-dce.cc (carry_backpropagate): Cast mask to HOST_WIDE_INT before
shifting.

t-rtems: add rv32imf architecture to the RTEMS multilib for RISC-V

The attach patch is specific to the RTEMS RISC-V architecture multilib which is
controlled by the t-rtems file in the gcc/config/riscv/ directory. The patch
file was created from the gcc-13.3.0 branch. It was successfully tested within
RTEMS Source Builder.

gcc/
* config/riscv/t-rtems: Add ilp32f multilib.

Adjust v850 rotate expander to allow more cases for V850E3V5

The recent if-conversion changes tripped a failure on the v850 port.

The core underlying issue is that while the if-conversion code tries to do the
right thing with noce_can_force_operand to determine if it can force an
arbitrary operand into a register, it's not really a sufficient check.

Essentially for arithmetic codes, it checks the operands.  If the operands are
force-able and there's a code_to_optab mapping, then it returns true.

code_to_optab doesn't actually check anything other than the existence of  a
mapping in the target.  If the target pattern has restrictions enforced by the
condition or it's an expander that is allowed to FAIL, then
noce_can_force_operand to be true, even though we may not be able to directly
force the operand into a register.

This came up on the v850 when we had an operand that was a rotate by a constant
number of bits (I don't remember the count, all that's important about it was
the count was not 8 or 16).

The v850 port has this define_expand:

> (define_expand "rotlsi3"
>   [(parallel [(set (match_operand:SI 0 "register_operand" "")
>                    (rotate:SI (match_operand:SI 1 "register_operand" "")
>                               (match_operand:SI 2 "const_int_operand" "")))
>               (clobber (reg:CC CC_REGNUM))])]
>   "(TARGET_V850E_UP)"
>   {
>     if (INTVAL (operands[2]) != 16)
>       FAIL;
>   })
So the only rotate count allowed is 16 (there's a similar HI rotate with a count of 8).  AFAICT the rotate patterns are allowed to FAIL.  So naturally the expander fails and we get a testsuite regression:

> Tests that now fail, but worked before (4 tests):
>
> v850-sim/-mgcc-abi/-msoft-float/-mv850e3v5: gcc: gcc.c-torture/execute/20100805-1.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> v850-sim/-mgcc-abi/-msoft-float/-mv850e3v5: gcc: gcc.c-torture/execute/20100805-1.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> v850-sim/-msoft-float/-mv850e3v5: gcc: gcc.c-torture/execute/20100805-1.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> v850-sim/-mv850e3v5: gcc: gcc.c-torture/execute/20100805-1.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)

This patch works around the problem by allowing the rotates in additional
cases, particularly for the V850E3V5+ variants which have a general rotate
capability.  But let's be clear, this is just a workaround and I expect we're
going to have to revisit the code to test if an operand can be forced into a
register.

gcc/
* config/v850/v850.md (rotlsi3): Allow more cases for V850E3V5+.

RISC-V: Fix ICE for vector single-width integer multiply-add intrinsics

When rs1 is the immediate 0, the following ICE occurs:

error: unrecognizable insn:
(insn 8 5 12 2 (set (reg:RVVM1DI 134 [ <retval> ])
        (if_then_else:RVVM1DI (unspec:RVVMF64BI [
                    (const_vector:RVVMF64BI repeat [
                            (const_int 1 [0x1])
                       ])
                    (reg/v:DI 137 [ vl ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (plus:RVVM1DI (mult:RVVM1DI (vec_duplicate:RVVM1DI (const_int 0 [0]))
                    (reg/v:RVVM1DI 136 [ vs2 ]))
                (reg/v:RVVM1DI 135 [ vd ]))
            (reg/v:RVVM1DI 135 [ vd ])))

gcc/ChangeLog:

* config/riscv/vector.md: Allow scalar operand to be 0.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/bug-7.c: New test.
* gcc.target/riscv/rvv/base/bug-8.c: New test.

[RISC-V][PR target/116282] Stabilize pattern conditions

So as expected the core problem with target/116282 is that the cost of certain
constant synthesis cases varied depending on whether or not we're allowed to
generate new pseudos or not.

That in turn meant that in obscure cases an insn might change from recognizable
to unrecognizable and triggers the observed failure.

So we need to keep the cost stable, at least when called from a pattern's
condition.  So we pass another boolean down when necessary. I've tried to keep
API fallout minimized.

Built and tested  on rv32 in my tester.  Let's see what pre-commit testing has
to say though 🙂

Note this will also require a minor change to the in-flight constant synthesis
work.

PR target/116282
gcc/
* config/riscv/riscv-protos.h (riscv_const_insns): Add new argument.
* config/riscv/riscv.cc (riscv_build_integer): Add new argument
ALLOW_NEW_PSEUDOS.  Pass it down to recursive calls and check it
before using synthesis which allows new registers to be created.
(riscv_split_integer_cost): Pass new argument to riscv_build_integer.
(riscv_integer_cost): Add ALLOW_NEW_PSEUDOS argument, pass it down to
riscv_build_integer.
(riscv_legitimate_constant_p): Pass new argument to riscv_const_insns.
(riscv_const_insns): New argment ALLOW_NEW_PSEUDOS.  Pass it down to
riscv_integer_cost and riscv_const_insns.
(riscv_split_const_insns): Pass new argument to riscv_const_insns.
(riscv_move_integer, riscv_rtx_costs): Similarly.
* config/riscv/riscv.md (shadd with costly constant): Pass new argument
to riscv_const_insns.
* config/riscv/bitmanip.md (and with costly constant): Pass new argument
to riscv_const_insns.

gcc/testsuite/
* gcc.target/riscv/pr116282.c: New test.

RISC-V: Bugfix for RVV rounding intrinsic ICE in function checker

When compiling an interface for rounding of type 'vfloat16*' without using zvfh
or zvfhmin, it is not enough to use FLOAT_MODE_P because the type does not
support it. Although the subsequent riscv_validate_vector_type checks will
still fail and throw exceptions, I don't think we should have ICE here.

internal compiler error: in check, at config/riscv/riscv-vector-builtins-shapes.cc:444
   10 |   return __riscv_vfadd_vv_f16m1_rm (vs2, vs1, 0, vl);
      |   ^~~~~~
0x4191794 internal_error(char const*, ...)
        /iothome/jin.ma/code/master/gcc/gcc/diagnostic-global-context.cc:491
0x416ebf5 fancy_abort(char const*, int, char const*)
        /iothome/jin.ma/code/master/gcc/gcc/diagnostic.cc:1772
0x220aae6 riscv_vector::build_frm_base::check(riscv_vector::function_checker&) const
        /iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins-shapes.cc:444
0x2205323 riscv_vector::function_checker::check()
        /iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins.cc:4414

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_vector_float_type_p): New.
* config/riscv/riscv-vector-builtins.cc (function_instance::any_type_float_p):
Use riscv_vector_float_type_p instead of FLOAT_MODE_P for judgment.
* config/riscv/riscv.cc (riscv_vector_int_type_p): Change static to extern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/bug-9.c: New test.

RISC-V: Bugfix incorrect operand for vwsll auto-vect

This patch would like to fix one ICE when rv64gcv_zvbb for vwsll.
Consider below example.

void vwsll_vv_test (short *restrict dst, char *restrict a,
                    int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = a[i] << b[i];
}

It will hit the vwsll pattern with following operands.
operand 0 -> (reg:RVVMF2HI 146 [ vect__7.13 ])
operand 1 -> (reg:RVVMF4QI 165 [ vect_cst__33 ])
operand 2 -> (reg:RVVM1SI 171 [ vect_cst__36 ])

According to the ISA, operand 2 should be the same as operand 1.
Aka operand 2 should have RVVMF4QI mode as above.  Thus,  add
quad truncation for operand 2 before emit vwsll.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

PR target/116280

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Add quad truncation to
align the mode requirement for vwsll.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116280-1.c: New test.
* gcc.target/riscv/rvv/base/pr116280-2.c: New test.

RISC-V: Add auto-vect pattern for vector rotate shift

This patch add the vector rotate shift pattern for auto-vect.
With this patch, the scalar rotate shift can be automatically
vectorized into vector rotate shift.

gcc/ChangeLog:

* config/riscv/autovec.md (v<bitmanip_optab><mode>3):
Add new define_expand pattern for vector rotate shift.
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vrolr-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vrolr-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vrolr-template.h: New test.

libstdc++: Update references to gcc.gnu.org/onlinedocs

libstdc++-v3:
* doc/xml/manual/abi.xml: Update reference to
gcc.gnu.org/onlinedocs.
* doc/xml/manual/concurrency_extensions.xml (interface): Ditto.
* doc/xml/manual/extensions.xml: Ditto.
* doc/xml/manual/parallel_mode.xml: Ditto.
* doc/xml/manual/shared_ptr.xml: Ditto.
* doc/xml/manual/using_exceptions.xml: Ditto. And change GNU GCC
to GCC.
* doc/html/manual/abi.html: Regenerate.
* doc/html/manual/ext_concurrency_impl.html: Ditto.
* doc/html/manual/ext_demangling.html: Ditto.
* doc/html/manual/memory.html: Ditto.
* doc/html/manual/parallel_mode_design.html: Ditto.
* doc/html/manual/parallel_mode_using.html: Ditto.
* doc/html/manual/using_exceptions.html: Ditto.

doc: Tweak PIM4 link

gcc:
* doc/gm2.texi (What is GNU Modula-2): Tweak PIM4 link.

libstdc++: Tweak links to installation docs

libstdc++v-3:
* doc/xml/manual/prerequisites.xml: Tweak two links to
installation docs. Fix grammar.
* doc/html/manual/setup.html: Regenerate.

doc: Tweak link to gm2 list archive

Without the trailing slash we incur a "301 Moved Permanently".

gcc:
* doc/gm2.texi (Community): Tweak link to gm2 list archive.

AVR: target/116390 - Fix an avrtiny asm out template.

PR target/116390
gcc/
* config/avr/avr.cc (avr_out_movsi_mr_r_reg_disp_tiny): Fix
output templates for the reg_base == reg_src and
reg_src == reg_base - 2 cases.
gcc/testsuite/
* gcc.target/avr/torture/pr116390.c: New test.

RISC-V: Fix factor in dwarf_poly_indeterminate_value [PR116305]

This patch is to fix the bug (BugId:116305) introduced by the commit
bd93ef for risc-v target.

The commit bd93ef changes the chunk_num from 1 to TARGET_MIN_VLEN/128
if TARGET_MIN_VLEN is larger than 128 in riscv_convert_vector_bits. So
it changes the value of BYTES_PER_RISCV_VECTOR. For example, before
merging the commit bd93ef and if TARGET_MIN_VLEN is 256, the value
of BYTES_PER_RISCV_VECTOR should be [8, 8], but now [16, 16]. The value
of riscv_bytes_per_vector_chunk and BYTES_PER_RISCV_VECTOR are no longer
equal.

Prologue will use BYTES_PER_RISCV_VECTOR.coeffs[1] to estimate the vlenb
register value in riscv_legitimize_poly_move, and dwarf2cfi will also
get the estimated vlenb register value in riscv_dwarf_poly_indeterminate_value
to calculate the number of times to multiply the vlenb register value.

So need to change the factor from riscv_bytes_per_vector_chunk to
BYTES_PER_RISCV_VECTOR, otherwise we will get the incorrect dwarf
information. The incorrect example as follow:

```
csrr    t0,vlenb
slli    t1,t0,1
sub     sp,sp,t1

.cfi_escape 0xf,0xb,0x72,0,0x92,0xa2,0x38,0,0x34,0x1e,0x23,0x50,0x22
```

The sequence '0x92,0xa2,0x38,0' means the vlenb register, '0x34' means
the literal 4, '0x1e' means the multiply operation. But in fact, the
vlenb register value just need to multiply the literal 2.

PR target/116305

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_dwarf_poly_indeterminate_value): Take
BYTES_PER_RISCV_VECTOR for *factor instead of riscv_bytes_per_vector_chunk.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalable_vector_cfi.c: New test.

Signed-off-by: Zhijin Zeng <zhijin.zeng@spacemit.com>

Daily bump.

Write CodeView information about stack variables

Outputs CodeView S_REGREL32 symbols for unoptimized local variables that
are stored on the stack. This includes a change to dwarf2out.cc to make
it easier to extract the function frame base without having to worry
about the function prologue or epilogue.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_REGREL32.
(write_fbreg_variable): New function.
(write_unoptimized_local_variable): Add fblock parameter, and handle
DW_OP_fbreg locations.
(write_unoptimized_function_vars): Add fbloc parameter.
(write_function): Extract frame base from DWARF.
* dwarf2out.cc (convert_cfa_to_fb_loc_list): Output simplified frame
base information for CodeView.

Write CodeView information about enregistered variables

Outputs CodeView S_REGISTER symbols, representing local variables or
parameters that are held in a register.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_REGISTER.
(enum cv_x86_register): New type.
(enum cv_amd64_register): New type.
(dwarf_reg_to_cv): New function.
(write_s_register): New function.
(write_unoptimized_local_variable): Handle parameters and DW_OP_reg*
location types.

Write CodeView information about local static variables

Outputs CodeView S_LDATA32 symbols, for static variables within
functions, along with S_BLOCK32 and S_END for the beginning and end of
lexical blocks.

gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add S_END and S_BLOCK32.
(write_local_s_ldata32): New function.
(write_unoptimized_local_variable): New function.
(write_s_block32): New function.
(write_s_end): New function.
(write_unoptimized_function_vars): New function.
(write_function): Call write_unoptimized_function_vars.

Fix maybe-uninitialized CodeView LF_INDEX warning

Initialize last_type to 0 to silence two spurious maybe-uninitialized warnings.
We issue an LF_INDEX continuation subtype for any LF_FIELDLISTs that
overflow, so LF_INDEXes will always have a subtype preceding them (and
thus last_type will always be set).

gcc/
* dwarf2codeview.cc (get_type_num_enumeration_type): Initialize last_type
to 0.
(get_type_num_struct): Likewise.

AVR: target/85624 - Use HImode for clrmemqi alignment.

gcc/
PR target/85624
* config/avr/avr.md (*clrmemqi*): Use HImode for alignment operand.

(cherry picked from commit 507b4e147588c0fafe952b7226dd764ebeebb103)

Fortran: fix documentation of intrinsic RANDOM_INIT [PR114146]

gcc/fortran/ChangeLog:

PR fortran/114146
* intrinsic.texi: Fix documentation of arguments of RANDOM_INIT,
which is conforming to the F2018 standard.

modula2: change identifier names to avoid build warnings

This fix avoids the following warnings: In implementation module
‘StdChans’: either the identifier has the same name as a keyword or
alternatively a keyword has the wrong case (‘IN’ and ‘in’)
54 | stdnull: ChanId ;

the symbol name ‘in’ is legal as an identifier, however as such it
might cause confusion and is considered bad programming practice.

gcc/m2/ChangeLog:

* gm2-libs-iso/StdChans.mod (in): Rename to ...
(inch): ... this.
(out): Rename to ...
(outch): ... this.
(err): Rename to ...
(errch): ... this.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>

Fix using keywords as identifiers to prevent warnings during build

m2pim/DynamicStrings.mod:1358:27: note: In procedure ‘Slice’: the symbol
name ‘end’ is legal as an identifier, however as such it might cause
confusion and is considered bad programming practice
1358 | start, end, o: INTEGER ;

m2pim/DynamicStrings.mod:1358:27: note: either the identifier has the
same name as a keyword or alternatively a keyword has the wrong case
(‘END’ and ‘end’).

gcc/m2/ChangeLog:

* gm2-libs/DynamicStrings.mod (Slice): Rename end to stop.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>

testsuite: Verify -fshort-enums and -fno-short-enums in pr33738.C

For some targets, like Cortex-M on arm-none-eabi, the -fshort-enums is
enabled by default. For these targets, the test case fails as
sizeof(Alpha) < sizeof(int).
To make the test case behave identical for targets that does enable
-fshort-enums and those that does not, add -fno-short-enums in the test
case and verify that the warning is not emitted. Then also create a copy
and run the test with -fshort-enums and verify that the warning is
emitted.

Regtested on x86_64-pc-linux-gnu and arm-none-eabi.

gcc/testsuite/ChangeLog:

* g++.dg/warn/pr33738.C: Added -fno-short-enums.
* g++.dg/warn/pr33738-2.C: Duplicate g++.dg/warn/pr33738.C with
-fshort-enums and removed xfail.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Add -fno-short-enums to pr97315-1.C

The test case assumes that sizeof(tree_code) >= 2. On some targets, like
Cortex-M on arm-none-eabi, -fshort-enums is enabled by default and in
that case, sizeof(tree_code) will be 1 and the following warning is
emitted:

.../pr97315-1.C:8:13: warning: width of 'tree_base::code' exceeds its type

Avoid the warning by forcing -fno-short-enums.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr97315-1.C: Add -fno-short-enums.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Add -fwrapv to signbit-5.c

On Cortex-M55 with MVE, the test case fails due to -INT_MAX being
undefined. Adding -fwrapv solves the issues.

Regtested on x86_64-pc-linux-gnu and arm-none-eabi for
Cortex-M0/M3/M4/M7/M33/M55/M85/A7.

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-5.c: Add -fwrapv and remove x86 exception.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>

PR modula2/116378 m2 bootstrap fails on x86_64-darwin

This patch fixes m2 bootstrap failure on x86_64-darwin.  libc_open
is defined with three parameters the last of which is an int for
portability (rather than a vararg).  This avoids portability
problems by promoting mode_t to an int.  In the future it could
be tidied up by using the m2 optarg extension.

gcc/m2/ChangeLog:

PR modula2/116378
* gm2-libs-iso/TermFile.mod (termOpen): Add third argument
for open.
* gm2-libs/libc.def (open): Remove vararg and use INTEGER for
mode parameter three.
* mc-boot-ch/Glibc.c (tracedb_open): Replace mode_t with int.
(libc_open): Rewrite without varargs.
* mc-boot/Glibc.h (libc_open): Replace varargs with int mode.
* pge-boot/Glibc.cc (libc_open): Rewrite.
* pge-boot/Glibc.h (libc_open): Replace varargs with int mode.

gcc/testsuite/ChangeLog:

PR modula2/116378
* gm2/extensions/run/pass/testopen.mod: Add third argument
for open.
* gm2/isolib/run/pass/openlibc.mod: Ditto.
* gm2/pim/run/pass/testaddr3.mod: Ditto.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c++: Pedwarn on [[]]; at class scope [PR110345]

For C++ 26 P2552R3 I went through all the spots (except modules) where
attribute-specifier-seq appears in the grammar and tried to construct
a testcase in all those spots, for now for [[deprecated]] attribute.

The fourth issue is that we just emit (when enabled) -Wextra-semi warning
not just for lone semicolon at class scope (correct), but also for
[[]]; or [[whatever]]; there too.
While just semicolon is valid in C++11 and newer,
https://eel.is/c++draft/class.mem#nt:member-declaration
allows empty-declaration, unlike namespace scope or block scope
something like attribute-declaration or empty statement with attributes
applied for it aren't supported.
While syntactically it matches
attribute-specifier-seq [opt] decl-specifier-seq [opt] member-declarator-list [opt] ;
with the latter two omitted, there is
https://eel.is/c++draft/class.mem#general-3
which says that is not valid.

So, the following patch emits a pedwarn in that case.

2024-08-16 Jakub Jelinek <jakub@redhat.com>

PR c++/110345
* parser.cc (cp_parser_member_declaration): Call maybe_warn_extra_semi
only if it is empty-declaration, if there are some tokens like
attribute, pedwarn that the declaration doesn't declare anything.

* g++.dg/cpp0x/gen-attrs-84.C: New test.

i386: Fix some vex insns that prohibit egpr

Although these vex insn have evex counterpart, but when it
uses the displayed vex prefix should not support APX EGPR.
Like TARGET_AVXVNNI, TARGET_IFMA and TARGET_AVXNECONVERT.
TARGET_AVXVNNIINT8 and TARGET_AVXVNNITINT16 also are vex
insn should not support egpr.

gcc/ChangeLog:

* config/i386/sse.md (vpmadd52<vpmadd52type><mode>):
Prohibit egpr for vex version.
(vpdpbusd_<mode>): Ditto.
(vpdpbusds_<mode>): Ditto.
(vpdpwssd_<mode>): Ditto.
(vpdpwssds_<mode>): Ditto.
(*vcvtneps2bf16_v4sf): Ditto.
(*vcvtneps2bf16_v8sf): Ditto.
(vpdp<vpdotprodtype>_<mode>): Ditto.
(vbcstnebf162ps_<mode>): Ditto.
(vbcstnesh2ps_<mode>): Ditto.
(vcvtnee<bf16_ph>2ps_<mode>): Ditto.
(vcvtneo<bf16_ph>2ps_<mode>): Ditto.
(vpdp<vpdpwprodtype>_<mode>): Ditto.

aarch64: Improve popcount for bytes [PR113042]

For popcount for bytes, we don't need the reduction addition
after the vector cnt instruction as we are only counting one
byte's popcount.
This changes the popcount extend to cover all ALLI rather than GPI.

Changes since v1:
* v2 - Use ALLI iterator and combine all into one pattern.
       Add new testcases popcnt[6-8].c.
* v3 - Simplify TARGET_CSSC path.
       Use convert_to_mode instead of gen_zero_extend* directly.
       Some other small cleanups.

Bootstrapped and tested on aarch64-linux-gnu with no regressions.

PR target/113042

gcc/ChangeLog:

* config/aarch64/aarch64.md (popcount<mode>2): Update pattern
to support ALLI modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/popcnt5.c: New test.
* gcc.target/aarch64/popcnt6.c: New test.
* gcc.target/aarch64/popcnt7.c: New test.
* gcc.target/aarch64/popcnt8.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++-v3: Handle iconv as optional for newlib builds [PR116362]

Support for iconv in newlib seems to have been always
assumed present by libstdc++-v3, but is default off.

Though, it hasn't been used before recent libstdc++ changes
that actually call iconv functions. This now leads to
failures exposed by running the test-suite, unless the
newlib being used has been explicitly configured with
--enable-newlib-iconv. When failing, there are undefined
references to iconv, iconv_open or iconv_close for multiple
tests.

Thankfully there's a macro in newlib.h that we can check to
detect presence of iconv support for the newlib build that's
used.

libstdc++-v3:
PR libstdc++/116362
* configure.ac: Check newlib configuration whether iconv is enabled.
* configure: Regenerate.

libstdc++-v3: testsuite: Prune uncapitalized "in function" linker warning

Newer newlib trigger warnings about certain functions not implemented
(_getentropy) when testing libstdc++-v3.

Since 2018 (circa binutils-2.31) the "in function" prefix isn't
capitalized for those "not implemented" warnings when generated from
the linker (a GNU ld feature used by newlib). Dejagnu up to and
including at least dejagnu-1.6.3 (and git @ 42979bd3b9) assumes a
capital "In function", leaving that part unpruned, and boom we have
thousands of "excess errors" from the libstdc++-v3 testsuite.

While gcc/testsuite/lib/prune.exp:prune_gcc_output already deals with
this quirk with a vastly more generic pattern, I choose this simpler
tweak.

libstdc++-v3:
* testsuite/lib/prune.exp (libstdc++-dg-prune): Prune
uncapitalized "in function" warning from linker.

Daily bump.

PHIOPT: Fix comment before factor_out_conditional_operation

I didn't update the comment before factor_out_conditional_operation
correctly. this updates it to be correct and mentions unary operations
rather than just conversions.

Pushed as obvious.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (factor_out_conditional_operation): Update
comment.

RISC-V: use fclass insns to implement isfinite,isnormal and isinf builtins

Currently these builtins use float compare instructions which require
FP flags to be saved/restored which could be costly in uarch.
RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP
values w/o disturbing FP exception flags.

Now that upstream supports the corresponding optabs, wire them up in the
backend.

gcc/ChangeLog:
* config/riscv/riscv.md: define_insn for fclass insn.
define_expand for isfinite, isnormal, isinf.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/fclass.c: New tests.

Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #2060
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>

i386: Improve split of *extendv2di2_highpart_stv_noavx512vl.

This patch follows up on the previous patch to fix PR target/116275 by
improving the code STV (ultimately) generates for highpart sign extensions
like (x<<8)>>8.  The arithmetic right shift is able to take advantage of
the available common subexpressions from the preceding left shift.

Hence previously with -O2 -m32 -mavx -mno-avx512vl we'd generate:

        vpsllq  $8, %xmm0, %xmm0
        vpsrad  $8, %xmm0, %xmm1
        vpsrlq  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm0, %xmm1, %xmm0

But with improved splitting, we now generate three instructions:

        vpslld  $8, %xmm1, %xmm0
        vpsrad  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm1, %xmm0, %xmm0

This patch also implements Uros' suggestion that the pre-reload
splitter could introduced a new pseudo to hold the intermediate
to potentially help reload with register allocation, which applies
when not performing the above optimization, i.e. on TARGET_XOP.

2024-08-15  Roger Sayle  <roger@nextmovesoftware.com>
    Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): Split
to an improved implementation on !TARGET_XOP.  On TARGET_XOP, use
a new pseudo for the intermediate to simplify register allocation.

gcc/testsuite/ChangeLog
* g++.target/i386/pr116275-2.C: New test case.

fortran: Fix bootstrap in resolve.cc [PR116387]

The r15-2934 change broke bootstrap:
../../gcc/fortran/resolve.cc: In function ‘bool resolve_operator(gfc_expr*)’:
../../gcc/fortran/resolve.cc:4649:22: error: too many arguments for format [-Werror=format-extra-args]
4649 |           gfc_error ("Inconsistent coranks for operator at %%L and %%L",
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following patch fixes that by using %L rather than %%L, the call has 2
location arguments.

2024-08-15  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/116387
* resolve.cc (resolve_operator): Use %L rather than %%L in format
string.

c++: fix up cpp23/consteval-if3.C test [PR115583]

Compiling with optimizations is needed to trigger the bug fixed
by r15-2369.

PR c++/115583

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/consteval-if13.C: Compile with -O.

Tweak base/index disambiguation in decompose_normal_address [PR116236]

The PR points out that, for an address like:

(plus (zero_extend X) Y)

decompose_normal_address doesn't establish a strong preference
between treating X as the base or Y as the base. As the comment
in the patch says, zero_extend isn't enough on its own to assume
an index, at least not on POINTERS_EXTEND_UNSIGNED targets.
But in a construct like the one above, X and Y have different modes,
and it seems reasonable to assume that the one with the expected
address mode is the base.

This matters on targets like m68k that support index extension
and that require different classes for bases and indices.

gcc/
PR middle-end/116236
* rtlanal.cc (decompose_normal_address): Try to distinguish
bases and indices based on mode, before resorting to "baseness".

late-combine: Preserve INSN_CODE when modifying notes [PR116343]

When it removes a definition, late-combine tries to update all
uses in notes.  It does this using the same insn_propagation class
that it uses for patterns.

However, insn_propagation uses validate_change, which in turn
resets the INSN_CODE.  This is inefficient in the best case,
since it forces the pattern to be rerecognised even though
changing a note can't affect the INSN_CODE.  But in the PR
it's a correctness problem: resetting INSN_CODE means we lose
the NOOP_INSN_MOVE_CODE, which in turn means that rtl-ssa doesn't
queue it for deletion.

This patch adds a routine specifically for propagating into notes.
A belt-and-braces fix would be to rerecognise noop moves in
function_info::change_insns, but I can't think of a good reason
why that would be necessary, and it could paper over latent bugs.

gcc/
PR testsuite/116343
* recog.h (insn_propagation::apply_to_note): Declare.
* recog.cc (insn_propagation::apply_to_note): New function.
* late-combine.cc (insn_combination::substitute_note): Use
apply_to_note instead of apply_to_rvalue.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Improve
dumping of costs for noop moves.

gcc/testsuite/
PR testsuite/116343
* gcc.dg/torture/pr116343.c: New test.

Fix Coarray in associate not a coarray. [PR110033]

A coarray used in an associate did not become a coarray in the block of
the associate. This patch fixes that and the same also in select type
statements.

PR fortran/110033

gcc/fortran/ChangeLog:

* class.cc (gfc_is_class_scalar_expr): Coarray refs that ref
only self, aka this image, are regarded as scalar, too.
* resolve.cc (resolve_assoc_var): Ignore this image coarray refs
and do not build a new class type.
* trans-expr.cc (gfc_get_caf_token_offset): Get the caf token
from the descriptor for associated variables.
(gfc_conv_variable): Same.
(gfc_trans_pointer_assignment): Assign token to temporary
associate variable, too.
(gfc_trans_scalar_assign): Add flag that assign is for associate
and use it to assign the token.
(is_assoc_assign): Detect that expressions are for associate
assign.
(gfc_trans_assignment_1): Treat associate assigns like pointer
assignments where possible.
* trans-stmt.cc (trans_associate_var): Set same_class only for
class-targets.
* trans.h (gfc_trans_scalar_assign): Add flag to
trans_scalar_assign for marking associate assignments.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/associate_1.f90: New test.

Add corank to gfc_expr.

Compute the corank of an expression along side to the regular rank.
This safe costly calls to gfc_get_corank (), which consecutively has
been removed. In some locations the code needed some adaption to model
the difference between expr.corank and gfc_get_corank correctly. The
latter always returned the codimension of the expression and not its
current corank, i.e. the resolution of all indezes.

This commit is preparatory to fixing PR fortran/110033 and may contain
parts of that fix already.

gcc/fortran/ChangeLog:

* arith.cc (reduce_unary): Use expr.corank.
(reduce_binary_ac): Same.
(reduce_binary_ca): Same.
(reduce_binary_aa): Same.
* array.cc (gfc_match_array_ref): Same.
* check.cc (dim_corank_check): Same.
(gfc_check_move_alloc): Same.
(gfc_check_image_index): Same.
* class.cc (gfc_add_class_array_ref): Same.
(finalize_component): Same.
* data.cc (gfc_assign_data_value): Same.
* decl.cc (match_clist_expr): Same.
(add_init_expr_to_sym): Same.
* expr.cc (simplify_intrinsic_op): Same.
(simplify_parameter_variable): Same.
(gfc_check_assign_symbol): Same.
(gfc_get_variable_expr): Same.
(gfc_add_full_array_ref): Same.
(gfc_lval_expr_from_sym): Same.
(gfc_get_corank): Removed.
* frontend-passes.cc (callback_reduction): Use expr.corank.
(create_var): Same.
(combine_array_constructor): Same.
(optimize_minmaxloc): Same.
* gfortran.h (gfc_get_corank): Add corank to gfc_expr.
* intrinsic.cc (gfc_get_intrinsic_function_symbol): Use
expr.corank.
(gfc_convert_type_warn): Same.
(gfc_convert_chartype): Same.
* iresolve.cc (resolve_bound): Same.
(gfc_resolve_cshift): Same.
(gfc_resolve_eoshift): Same.
(gfc_resolve_logical): Same.
(gfc_resolve_matmul): Same.
* match.cc (copy_ts_from_selector_to_associate): Same.
* matchexp.cc (gfc_get_parentheses): Same.
* parse.cc (parse_associate): Same.
* primary.cc (gfc_match_rvalue): Same.
* resolve.cc (resolve_structure_cons): Same.
(resolve_actual_arglist): Same.
(resolve_elemental_actual): Same.
(resolve_generic_f0): Same.
(resolve_unknown_f): Same.
(resolve_operator): Same.
(gfc_expression_rank): Same and set dimen_type for coarray to
default.
(gfc_op_rank_conformable): Use expr.corank.
(add_caf_get_intrinsic): Same.
(resolve_variable): Same.
(gfc_fixup_inferred_type_refs): Same.
(check_host_association): Same.
(resolve_compcall): Same.
(resolve_expr_ppc): Same.
(resolve_assoc_var): Same.
(fixup_array_ref): Same.
(resolve_select_type): Same.
(add_comp_ref): Same.
(get_temp_from_expr): Same.
(resolve_fl_var_and_proc): Same.
(resolve_symbol): Same.
* symbol.cc (gfc_is_associate_pointer): Same.
* trans-array.cc (walk_coarray): Same.
(gfc_conv_expr_descriptor): Same.
(gfc_walk_array_ref): Same.
* trans-array.h (gfc_walk_array_ref): Same.
* trans-expr.cc (gfc_get_ultimate_alloc_ptr_comps_caf_token):
Same.
* trans-intrinsic.cc (trans_this_image): Same.
(trans_image_index): Same.
(conv_intrinsic_cobound): Same.
(gfc_walk_intrinsic_function): Same.
(conv_intrinsic_move_alloc): Same.
* trans-stmt.cc (gfc_trans_lock_unlock): Same.
(trans_associate_var): Same and adapt to slightly different
behaviour of expr.corank and gfc_get_corank.
(gfc_trans_allocate): Same.
* trans.cc (gfc_add_finalizer_call): Same.

c++: c->B::m access resolved through current inst [PR116320]

Here when checking the access of (the injected-class-name) B in c->B::m
at parse time, we notice its context B (now the type) is a base of the
object type C<T>, so we proceed to use C<T> as the effective qualifying
type. But this C<T> is the dependent specialization not the primary
template type, so it has empty TYPE_BINFO, which leads to a segfault later
from perform_or_defer_access_check.

The reason the DERIVED_FROM_P (B, C<T>) test guarding this code path works
despite C<T> having empty TYPE_BINFO is because of its currently_open_class
logic (added in r9-713-gd9338471b91bbe) which replaces a dependent
specialization with the primary template type if we're inside it. So the
safest fix seems to be to call currently_open_class in the caller as well.

PR c++/116320

gcc/cp/ChangeLog:

* semantics.cc (check_accessibility_of_qualified_id): Try
currently_open_class when using the object type as the
effective qualifying type.

gcc/testsuite/ChangeLog:

* g++.dg/template/access42.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++/coroutines: fix passing *this to promise type, again [PR116327]

In r15-2210 we got rid of the unnecessary cast to lvalue reference when
passing *this to the promise type ctor, and as a drive-by change we also
simplified the code to use cp_build_fold_indirect_ref.

But it turns out cp_build_fold_indirect_ref does too much here, namely
it has a shortcut for returning current_class_ref if the operand is
current_class_ptr. The problem with that shortcut is current_class_ref
might have gotten clobbered earlier if it appeared in the function body,
since rewrite_param_uses walks and rewrites in-place all local variable
uses to their corresponding frame copy.

So later cp_build_fold_indirect_ref for *this will instead return the
clobbered current_class_ref i.e. *frame_ptr->this, which doesn't make
sense here since we're in the ramp function and not the actor function
where frame_ptr is in scope.

This patch fixes this by using the build_fold_indirect_ref instead of
cp_build_fold_indirect_ref.

PR c++/116327
PR c++/104981
PR c++/115550

gcc/cp/ChangeLog:

* coroutines.cc (morph_fn_to_coro): Use build_fold_indirect_ref
instead of cp_build_fold_indirect_ref.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr104981-preview-this.C: Improve coverage by
adding a non-static data member use within the coroutine member
function.
* g++.dg/coroutines/pr116327-preview-this.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

LoongArch: Implement scalar isinf, isnormal, and isfinite via fclass

Doing so can avoid loading FP constants from the memory.  It also
partially fixes PR 66262 as fclass does not signal on sNaN.

gcc/ChangeLog:

* config/loongarch/loongarch.md (extendsidi2): Add ("=r", "f")
alternative and use movfr2gr.s for it.  The spec clearly states
movfr2gr.s sign extends the value to GRLEN.
(fclass_<fmt>): Make the result SImode instead of a floating
mode.  The fclass results are really not FP values.
(FCLASS_MASK): New define_int_iterator.
(fclass_optab): New define_int_attr.
(<FCLASS_MASK:fclass_optab><ANYF:mode>): New define_expand
template.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/fclass-compile.c: New test.
* gcc.target/loongarch/fclass-run.c: New test.

Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

It results in 2 failures for x86_64-pc-linux-gnu{\
-march=cascadelake};

gcc: gcc.target/i386/extendditi3-1.c scan-assembler cqt?o
gcc: gcc.target/i386/pr113560.c scan-assembler-times \tmulq 1

For pr113560.c, now GCC generates mulx instead of mulq with
-march=cascadelake, which should be optimal, so adjust testcase for
that.
For gcc.target/i386/extendditi2-1.c, RA happens to choose another
register instead of rax and result in

movq %rdi, %rbp
movq %rdi, %rax
sarq $63, %rbp
movq %rbp, %rdx

The patch adds a new define_peephole2 for that.

gcc/ChangeLog:

PR target/116274
* config/i386/i386-expand.cc (ix86_expand_vector_move):
Restrict special case TImode to 128-bit vector conversions via
V2DI under ix86_pre_reload_split ().
* config/i386/i386.cc (inline_secondary_memory_needed):
Movement between GENERAL_REGS and SSE_REGS for TImode doesn't
need secondary reload.
* config/i386/i386.md (*extendsidi2_rex64): Add a
define_peephole2 after it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116274.c: New test.
* gcc.target/i386/pr113560.c: Scan either mulq or mulx.

aarch64: Rename svpext to svpext_lane [PR116371]

When implementing the SME2 ACLE, I somehow missed off the _lane
suffix on svpext.

gcc/
PR target/116371
* config/aarch64/aarch64-sve-builtins-sve2.h (svpext): Rename to...
(svpext_lane): ...this.
* config/aarch64/aarch64-sve-builtins-sve2.cc (svpext_impl): Rename
to...
(svpext_lane_impl): ...this and update instantiation accordingly.
* config/aarch64/aarch64-sve-builtins-sve2.def (svpext): Rename to...
(svpext_lane): ...this.

gcc/testsuite/
PR target/116371
* gcc.target/aarch64/sme2/acle-asm/pext_c16.c,
gcc.target/aarch64/sme2/acle-asm/pext_c16_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c32.c,
gcc.target/aarch64/sme2/acle-asm/pext_c32_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c64.c,
gcc.target/aarch64/sme2/acle-asm/pext_c64_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_c8.c,
gcc.target/aarch64/sme2/acle-asm/pext_c8_x2.c: Replace with...
* gcc.target/aarch64/sme2/acle-asm/pext_lane_c16.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c16_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c32.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c32_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c64.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c64_x2.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c8.c,
gcc.target/aarch64/sme2/acle-asm/pext_lane_c8_x2.c: ...these new tests,
testing for svpext_lane instead of svpext.

rs6000: Add TARGET_FLOAT128_HW guard for quad-precision insns

gcc/
* config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2): Add guard TARGET_FLOAT128_HW.
* config/rs6000/vsx.md (xsxexpqp_<IEEE128:mode>_<V2DI_DI:mode>,
xsxsigqp_<IEEE128:mode>_<VEC_TI:mode>, xsiexpqpf_<mode>,
xsiexpqp_<IEEE128:mode>_<V2DI_DI:mode>, xscmpexpqp_<code>_<mode>,
*xscmpexpqp, xststdcnegqp_<mode>): Replace guard TARGET_P9_VECTOR
with TARGET_FLOAT128_HW.
(xststdc_<mode>, *xststdc_<mode>, isinf<mode>2): Add guard
TARGET_FLOAT128_HW for the IEEE128 modes.

gcc/testsuite/
* gcc.target/powerpc/float128-cmp2-runnable.c: Replace
ppc_float128_sw with ppc_float128_hw and remove p9vector_hw.

rs6000: Implement optab_isnormal for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/vsx.md (isnormal<mode>2): New expand.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-7.c: New test.
* gcc.target/powerpc/pr97786-8.c: New test.

rs6000: Implement optab_isfinite for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/vsx.md (isfinite<mode>2): New expand.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-4.c: New test.
* gcc.target/powerpc/pr97786-5.c: New test.

rs6000: Implement optab_isinf for SFDF and IEEE128

gcc/
PR target/97786
* config/rs6000/rs6000.md (constant VSX_TEST_DATA_CLASS_NAN,
VSX_TEST_DATA_CLASS_POS_INF, VSX_TEST_DATA_CLASS_NEG_INF,
VSX_TEST_DATA_CLASS_POS_ZERO, VSX_TEST_DATA_CLASS_NEG_ZERO,
VSX_TEST_DATA_CLASS_POS_DENORMAL, VSX_TEST_DATA_CLASS_NEG_DENORMAL):
Define.
(mode_attr sdq, vsx_altivec, wa_v, x): Define.
(mode_iterator IEEE_FP): Define.
* config/rs6000/vsx.md (isinf<mode>2): New expand.
(expand xststdcqp_<mode>, xststdc<sd>p): Combine into...
(expand xststdc_<mode>): ...this.
(insn *xststdcqp_<mode>, *xststdc<sd>p): Combine into...
(insn *xststdc_<mode>): ...this.
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Rename
CODE_FOR_xststdcqp_kf as CODE_FOR_xststdc_kf,
CODE_FOR_xststdcqp_tf as CODE_FOR_xststdc_tf.
* config/rs6000/rs6000-builtins.def: Rename xststdcdp as xststdc_df,
xststdcsp as xststdc_sf, xststdcqp_kf as xststdc_kf.

gcc/testsuite/
PR target/97786
* gcc.target/powerpc/pr97786-1.c: New test.
* gcc.target/powerpc/pr97786-2.c: New test.

Value Range: Add range op for builtin isnormal

The former patch adds optab for builtin isnormal. Thus builtin isnormal
might not be folded at front end. So the range op for isnormal is needed
for value range analysis. This patch adds range op for builtin isnormal.

gcc/
* gimple-range-op.cc (class cfn_isfinite): New.
(op_cfn_finite): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISFINITE.
* value-range.h (class frange): Declear known_isnormal and
known_isdenormal_or_zero.
(frange::known_isnormal): Define.
(frange::known_isdenormal_or_zero): Define.

gcc/testsuite/
* gcc.dg/tree-ssa/range-isnormal.c: New test.

Value Range: Add range op for builtin isfinite

The former patch adds optab for builtin isfinite. Thus builtin isfinite
might not be folded at front end. So the range op for isfinite is needed
for value range analysis. This patch adds range op for builtin isfinite.

gcc/
* gimple-range-op.cc (class cfn_isfinite): New.
(op_cfn_finite): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISFINITE.

gcc/testsuite/
* gcc.dg/tree-ssa/range-isfinite.c: New test.

Value Range: Add range op for builtin isinf

The builtin isinf is not folded at front end if the corresponding optab
exists. So the range op for isinf is needed for value range analysis.
This patch adds range op for builtin isinf.

gcc/
PR target/114678
* gimple-range-op.cc (class cfn_isinf): New.
(op_cfn_isinf): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CASE_FLT_FN (BUILT_IN_ISINF).

gcc/testsuite/
PR target/114678
* gcc.dg/tree-ssa/range-isinf.c: New test.
* gcc.dg/tree-ssa/range-sincos.c: Remove xfail for s390.
* gcc.dg/tree-ssa/vrp-float-abs-1.c: Likewise.

Daily bump.

c++: ICE with NSDMIs and fn arguments [PR116015]

The problem in this PR is that we ended up with

  {.rows=(&<PLACEHOLDER_EXPR struct Widget>)->n,
   .outer_stride=(&<PLACEHOLDER_EXPR struct MatrixLayout>)->rows}

that is, two PLACEHOLDER_EXPRs for different types on the same level
in one { }.  That should not happen; we may, for instance, neglect to
replace a PLACEHOLDER_EXPR due to CONSTRUCTOR_PLACEHOLDER_BOUNDARY on
the constructor.

The same problem happened in PR100252, which I fixed by introducing
replace_placeholders_for_class_temp_r.  That didn't work here, though,
because r_p_for_c_t_r only works for non-eliding TARGET_EXPRs: replacing
a PLACEHOLDER_EXPR with a temporary that is going to be elided will
result in a crash in gimplify_var_or_parm_decl when it encounters such
a loose decl.

But leaving the PLACEHOLDER_EXPRs in is also bad because then we end
up with this PR.

TARGET_EXPRs for function arguments are elided in gimplify_arg.  The
argument will get a real temporary only in get_formal_tmp_var.  One
idea was to use the temporary that is going to be elided anyway, and
then replace_decl it with the real object once we get it.  But that
didn't work out: one problem is that we elide the TARGET_EXPR for an
argument before we create the real temporary for the argument, and
when we get it, the context that this was a TARGET_EXPR for an argument
has been lost.  We're also in the middle end territory now, even though
this is a C++-specific problem.

A solution is to simply stop eliding TARGET_EXPRs whose initializer is
a CONSTRUCTOR.  Such copies can't be (at the moment) elided anyway.  But
not eliding all TARGET_EXPRs would be a pessimization.

PR c++/116015

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing): Don't set_target_expr_eliding
when the TARGET_EXPR initializer is a CONSTRUCTOR.

gcc/ChangeLog:

* gimplify.cc (gimplify_arg): Do not strip a TARGET_EXPR whose
initializer is a CONSTRUCTOR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr23.C: New test.

s390: Remove vector intrinsics

The following intrinsics are not implemented. Thus, remove them.

gcc/ChangeLog:

* config/s390/vecintrin.h (vec_vstbrh): Remove.
(vec_vstbrf): Remove.
(vec_vstbrg): Remove.
(vec_vstbrq): Remove.
(vec_vstbrf_flt): Remove.
(vec_vstbrg_dbl): Remove.
(vec_vsterb): Remove.
(vec_vsterh): Remove.
(vec_vsterf): Remove.
(vec_vsterg): Remove.
(vec_vsterf_flt): Remove.
(vec_vsterg_dbl): Remove.

s390: Fix high-level builtins vec_gfmsum{,_accum}_128

Starting with r14-9449-g9f2b16ce1efef0 builtins were streamlined with
those in LLVM.  In particular s390_vgfm{,a}g have been changed from
UV16QI to UINT128 in order to match those in LLVM.  However, these
low-level builtins are directly used by the high-level builtins
vec_gfmsum{,_accum}_128 which expect UV16QI instead.  Therefore,
introduce new low-level builtins s390_vgfm{,a}g_128 and make use of
them, respectively.

gcc/ChangeLog:

* config/s390/s390-builtin-types.def (BT_FN_UV16QI_UV2DI_UV2DI):
New.
(BT_FN_UV16QI_UV2DI_UV2DI_UV16QI): New.
* config/s390/s390-builtins.def (s390_vgfmg_128): New.
(s390_vgfmag_128): New.
* config/s390/vecintrin.h (vec_gfmsum_128): Use s390_vgfmg_128.
(vec_gfmsum_accum_128): Use s390_vgfmag_128.

Fortran: fix minor frontend GMP leaks

gcc/fortran/ChangeLog:

* simplify.cc (gfc_simplify_sizeof): Clear used gmp variable.
* target-memory.cc (gfc_target_expr_size): Likewise.

i386: Optimization for APX NDD is always zero-uppered for shift

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*ashlqi3_1_zext<mode><nf_name>):
New define_insn.
(*ashlhi3_1_zext<mode><nf_name>): Ditto.
(*<insn>qi3_1_zext<mode><nf_name>): Ditto.
(*<insn>hi3_1_zext<mode><nf_name>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add testcase for shift and rotate.

i386: Optimization for APX NDD is always zero-uppered for logic

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*andqi_1_zext<mode><nf_name>): New
define_insn.
(*andhi_1_zext<mode><nf_name>): Ditto.
(*<code>qi_1_zext<mode><nf_name>): Ditto.
(*<code>hi_1_zext<mode><nf_name>): Ditto.
(*negqi_1_zext<mode><nf_name>): Ditto.
(*neghi_1_zext<mode><nf_name>): Ditto.
(*one_cmplqi2_1_zext<mode>): Ditto.
(*one_cmplhi2_1_zext<mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add more tests.

i386: Optimization for APX NDD is always zero-uppered for sub/adc/sbb

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*subqi_1_zext<mode><nf_name>): New
define_insn.
(*subhi_1_zext<mode><nf_name>): Ditto.
(*addqi3_carry_zext<mode>): Ditto.
(*addhi3_carry_zext<mode>): Ditto.
(*addqi3_carry_zext<mode>_0): Ditto.
(*addhi3_carry_zext<mode>_0): Ditto.
(*addqi3_carry_zext<mode>_0r): Ditto.
(*addhi3_carry_zext<mode>_0r): Ditto.
(*subqi3_carry_zext<mode>): Ditto.
(*subhi3_carry_zext<mode>): Ditto.
(*subqi3_carry_zext<mode>_0): Ditto.
(*subhi3_carry_zext<mode>_0): Ditto.
(*subqi3_carry_zext<mode>_0r): Ditto.
(*subhi3_carry_zext<mode>_0r): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: Add more test.
* gcc.target/i386/pr113729-adc-sbb.c: New test.

i386: Optimization for APX NDD is always zero-uppered for ADD

gcc/ChangeLog:

PR target/113729
* config/i386/i386.md (*addqi_1_zext<mode><nf_name>): New
define.
(*addhi_1_zext<mode><nf_name>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113729.c: New test.

Restrict pr116202-run-1.c test to riscv_v target

The testcase uses -march=rv64gcv and dg-do run, so should be
restricted to a riscv_v target.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116202-run-1.c (dg-do run):
Add target riscv_v.

Prevent future proc_ptr parsing issues in associate [PR102973]

A global variable is set when proc_ptr parsing in an associate is
expected. In the case of an error, that flag was not reset, which is
fixed now.

gcc/fortran/ChangeLog:

PR fortran/102973

* match.cc (gfc_match_associate): Reset proc_ptr parsing flag on
error.

Fix ICE in build_function_decl [PR116292]

Fix ICE by getting the vtype only when a derived or class type is
prevent. Also take care about the _len component for unlimited
polymorphics.

gcc/fortran/ChangeLog:

PR fortran/116292

* trans-intrinsic.cc (conv_intrinsic_move_alloc): Get the vtab
only for derived types and classes and adjust _len for class
types.

gcc/testsuite/ChangeLog:

* gfortran.dg/move_alloc_19.f90: New test.

genoutput: Accelerate the place_operands function.

With the increase in the number of modes and patterns for some
backend architectures, the place_operands function becomes a
bottleneck int the speed of genoutput, and may even become a
bottleneck int the overall speed of building the GCC project.
This patch aims to accelerate the place_operands function,
the optimizations it includes are:
1. Use a hash table to store operand information,
improving the lookup time for the first operand.
2. Move mode comparison to the beginning to avoid the scenarios of most strcmp.

I tested the speed improvements for the following backends,
Improvement Ratio
x86_64 197.9%
aarch64 954.5%
riscv 2578.6%
If the build machine is slow, then this improvement can save a lot of time.

I tested the genoutput output for x86_64/aarch64/riscv backends,
and there was no difference compared to before the optimization,
so this shouldn't introduce any functional issues.

gcc/
* genoutput.cc (struct operand_data): Add member 'eq_next' to
point to the next member with the same hash value in the
hash table.
(compare_operands): Move the comparison of the mode to the very
beginning to accelerate the comparison of the two operands.
(struct operand_data_hasher): New, a class that takes into account
the necessary elements for comparing the equality of two operands
in its hash value.
(operand_data_hasher::hash): New.
(operand_data_hasher::equal): New.
(operand_datas): New, hash table of konwn pattern operands.
(place_operands): Use a hash table instead of traversing the array
to find the same operand.
(main): Add initialization of the hash table 'operand_datas'.

Revert "[rtl-optimization/116244] Don't create bogus regs in alter_subreg"

This reverts commit e9738e77674e23f600315ca1efed7d1c7944d0cc.

testsuite: Fix fam-in-union-alone-in-struct-2.c with unsigned char [PR116148]

As PR116148#c7 shows, fam-in-union-alone-in-struct-2.c still
fails on hppa which is a BE environment, but by checking more
(also confirmed by John in PR116148#c12), it's due to that
signedness of plain char on hppa is signed therefore the value
of with_fam_3_v.a[7] "8f" get sign extended as "ffffff8f" then
the verification will fail. This patch is to change plain char
with unsigned char to avoid that.

PR testsuite/116148

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-2.c: Change the type of
member a[] of union with_fam_3 with unsigned char.

Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

gcc/ChangeLog:

PR target/116174
* config/i386/i386.cc (ix86_align_loops): Move this to ..
* config/i386/i386-features.cc (ix86_align_loops): .. here.
(class pass_align_tight_loops): New class.
(make_pass_align_tight_loops): New function.
* config/i386/i386-passes.def: Insert pass_align_tight_loops
after pass_insert_endbr_and_patchable_area.
* config/i386/i386-protos.h (make_pass_align_tight_loops): New
declare.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116174.c: New test.

Daily bump.

testsuite: Fix struct size check [PR116155]

The size of "struct only_fam_2" is dependent on the alignment of the
flexible array member "b", and not on the type of the preceding
bit-fields. For most targets the two are equal. But on default_packed
targets like pru-unknown-elf, the alignment of int is not equal to the
size of int, so the test failed.

Patch was suggested by Qing Zhao. Tested on pru-unknown-elf and
x86_64-pc-linux-gnu.

PR testsuite/116155

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-1.c: Adjust
check to account for default_packed targets.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

ifcvt: Fix force_operand ICE in noce_convert_multiple_sets [PR116353]

Now that more operations are allowed for noce_convert_multiple_sets,
we need to check noce_can_force_operand on the sequence before calling
try_emit_cmove_seq. Otherwise an inappropriate argument may be given
to copy_to_mode_reg and result in an ICE.

PR tree-optimization/116353

gcc/ChangeLog:

* ifcvt.cc (bb_ok_for_noce_convert_multiple_sets): Check
noce_can_force_operand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116353.c: New test.

Fortran: reject array constructor value of abstract type [PR114308]

gcc/fortran/ChangeLog:

PR fortran/114308
* array.cc (resolve_array_list): Reject array constructor value if
its declared type is abstract (F2018:C7114).

gcc/testsuite/ChangeLog:

PR fortran/114308
* gfortran.dg/abstract_type_10.f90: New test.

Co-Authored-By: Steven G. Kargl <kargl@gcc.gnu.org>

RISC-V: Fix non-obvious comment typos

This fixes the remainder of the typos I found when reading various parts of the
RISC-V backend.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (legitimize_move): extrac -> extract.
(expand_vec_cmp_float): Remove duplicate vmnor.mm.
* config/riscv/riscv-vector-builtins.cc: ins -> insns.
* config/riscv/riscv.cc (riscv_init_machine_status): mwrvv -> mrvv.
* config/riscv/vector-iterators.md: RVVM8QImde -> RVVM8QImode
* config/riscv/vector.md: Replaced non-existant vsetivl with vsetivli.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

Internal-fn: Handle vector bool type for type strict match mode [PR116103]

For some target like target=amdgcn-amdhsa, we need to take care of
vector bool types prior to general vector mode types. Or we may have
the asm check failure as below.

gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts "zero if "

The below test suites are passed for this patch.
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
4. The amdgcn test case as above.

PR target/116103

gcc/ChangeLog:

* internal-fn.cc (type_strictly_matches_mode_p): Add handling
for vector bool type.

Signed-off-by: Pan Li <pan2.li@intel.com>

LRA: Don't emit move for substituted CONSTATNT_P operand [PR116170]

Commit r15-2084 exposes one ICE in LRA.  Firstly, before
r15-2084 KFmode has 126 bit precision while V1TImode has 128
bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
paradoxical_subreg_p, which stops some passes from doing
some optimization.  After r15-2084, KFmode has the same mode
precision as V1TImode, passes are able to optimize more, but
it causes this ICE in LRA as described below:

For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
which matches pattern

(define_insn "*vsx_le_perm_store_<mode>"
  [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
        (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  "@
   #
   #"
  [(set_attr "type" "vecstore,store")
   (set_attr "length" "12,8")
   (set_attr "isa" "<VSisa>,*")])

LRA makes equivalence substitution on r133 with const double
(const_double:KF 0.0), selects alternative 0 and fixes up
operand 1 for constraint "wa", because operand 1 is OP_INOUT,
so it considers assigning back to it as well, that is:

  lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);

But because old has been changed to const_double in equivalence
substitution, the move is actually assigning to const_double,
which is invalid and cause ICE.

Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
even though this operand is OP_INOUT, IMHO there should not be
any following uses of reg:KF 133, otherwise it doesn't have the
chance to be equivalent to (const_double:KF 0.0).  So this patch
is to guard the lra_emit_move with !CONSTANT_P to exclude such
case.

PR rtl-optimization/116170

gcc/ChangeLog:

* lra-constraints.cc (curr_insn_transform): Don't emit move back to
old operand if it's CONSTANT_P.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr116170.c: New test.

Regenerate avr.opt.urls

avr added an -mlra option, but the avr.opt.url file wasn't
regenerated.

Note that commit 149a23ee2568 ("AVR: -mlra is not documeted in TEXI.")
did add the Undocumented flag, but that still needs the avr.op.urls
file to be updated.

Fixes: 09a87ea666b2 ("AVR: ad target/113934 - Add option -mlra to enable LRA.")
gcc/ChangeLog:

* config/avr/avr.opt.urls: Regenerate.

Daily bump.

rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

Only disable shrink-wrapping when using -mrop-protect when we know we
will be emitting the ROP-protect hash instructions (ie, non-leaf functions).

2024-06-17 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
the disabling of shrink-wrapping from here....
* config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-1.c: New test.

RISC-V: Fix missing abi arg in test

The following test was failing when building on 32 bit targets
due to not overwriting the mabi arg. This resulted in dejagnu
attempting to run the test with -mabi=ilp32d -march=rv64gcv_zvl256b

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116202-run-1.c: Add mabi arg

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

[rtl-optimization/116244] Don't create bogus regs in alter_subreg

So this is another nasty latent bug exposed by ext-dce.

Similar to the prior m68k failure it's another problem with how we handle
paradoxical subregs on big endian targets.

In this instance when we remove the hard subregs we take something like:

(subreg:DI (reg:SI 0) 0)

And turn it into

(reg:SI -1)

Which is clearly wrong.  (reg:SI 0) is correct.

The transformation happens in alter_subreg, but I really wanted to fix this in
subreg_regno since we could have similar problems in some of the other callers
of subreg_regno.

Unfortunately reload depends on the current behavior of subreg_regno; in the
cases where the return value is an invalid register, the wrong half of a
register pair, etc the resulting bogus value is detected by reload and triggers
reloading of the inner object.  So that's the new comment in subreg_regno.

The second best place to fix is alter_subreg which is what this patch does.  If
presented with a paradoxical subreg, then the base register number should
always be REGNO (SUBREG_REG (object)).  It's just how paradoxicals are designed
to work.

I haven't tried to fix the other places that call subreg_regno.  After being
burned by reload, I'm more than a bit worried about unintended fallout.

I must admit I'm surprised we haven't stumbled over this before and that it
didn't fix any failures on the big endian embedded targets.

Boostrapped & regression tested on x86_64, also went through all the embedded
targets in my tester and bootstrapped on m68k & s390x to get some additional
big endian testing.

Pushing to the trunk.

rtl-optimization/116244
gcc/
* rtlanal.cc (subreg_regno): Update comment.
* final.cc (alter_subrg): Always use REGNO (SUBREG_REG ()) to get
the base regsiter for paradoxical subregs.

gcc/testsuite/
* g++.target/m68k/m68k.exp: New test driver.
* g++.target/m68k/pr116244.C: New test.

borrowck: Fix debug prints on 32-bits architectures

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-builder.h: Cast size_t values to unsigned
long before printing.
* checks/errors/borrowck/rust-bir-fact-collector.h: Likewise.

borrowck: Avoid overloading issues on 32bit architectures

On architectures where `size_t` is `unsigned int`, such as 32bit x86,
we encounter an issue with `PlaceId` and `FreeRegion` being aliases to
the same types. This poses an issue for overloading functions for these
two types, such as `push_subset` in that case. This commit renames one
of these `push_subset` functions to avoid the issue, but this should be
fixed with a newtype pattern for these two types.

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-fact-collector.h (points): Rename
`push_subset(PlaceId, PlaceId)` to `push_subset_place(PlaceId, PlaceId)`

ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

The existing implementation of need_cmov_or_rewire and
noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG.
This commit enchances them so they can handle/rewire arbitrary set statements.

To do that a new helper struct noce_multiple_sets_info is introduced which is
used by noce_convert_multiple_sets and its helper functions. This results in
cleaner function signatures, improved efficientcy (a number of vecs and hash
set/map are replaced with a single vec of struct) and simplicity.

gcc/ChangeLog:

* ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets_info.
(init_noce_multiple_sets_info): Initialize noce_multiple_sets_info.
(noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle
rewiring of multiple registers.
(noce_convert_multiple_sets): Updated to use noce_multiple_sets_info.
* ifcvt.h (struct noce_multiple_sets_info): Introduce new struct
noce_multiple_sets_info to store info for noce_convert_multiple_sets.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.