git.ipfire.org Git - thirdparty/gcc.git/log

vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

gcc/ChangeLog:

PR target/107432
* tree-vect-generic.cc
(expand_vector_conversion): Support convert for int -> int,
float -> float and int <-> float.
* tree-vect-stmts.cc (vectorizable_conversion): Wrap the
indirect convert part.
(supportable_indirect_convert_operation): New function.
* tree-vectorizer.h (supportable_indirect_convert_operation):
Define the new function.

gcc/testsuite/ChangeLog:

PR target/107432
* gcc.target/i386/pr107432-1.c: New test.
* gcc.target/i386/pr107432-2.c: Ditto.
* gcc.target/i386/pr107432-3.c: Ditto.
* gcc.target/i386/pr107432-4.c: Ditto.
* gcc.target/i386/pr107432-5.c: Ditto.
* gcc.target/i386/pr107432-6.c: Ditto.
* gcc.target/i386/pr107432-7.c: Ditto.

RISC-V: Add testcases for vector truncate after .SAT_SUB

This patch would like to add the test cases of the vector truncate after
.SAT_SUB.  Aka:

  #define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)                   \
  void __attribute__((noinline))                                       \
  vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
        unsigned limit)                 \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        IN_T x = op_1[i];                                              \
        out[i] = (OUT_T)(x >= y ? x - y : 0);                          \
      }                                                                \
  }

The below 3 cases are included.

DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint16_t, uint32_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint32_t, uint64_t)

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_print_operand_reloc):
Dedup and sort the comment describing modifiers.

LoongArch: Tweak IOR rtx_cost for bstrins

Consider

    c &= 0xfff;
    a &= ~0xfff;
    b &= ~0xfff;
    a |= c;
    b |= c;

This can be done with 2 bstrins instructions.  But we need to recognize
it in loongarch_rtx_costs or the compiler will not propagate "c & 0xfff"
forward.

gcc/ChangeLog:

* config/loongarch/loongarch.cc:
(loongarch_use_bstrins_for_ior_with_mask): Split the main logic
into ...
(loongarch_use_bstrins_for_ior_with_mask_1): ... here.
(loongarch_rtx_costs): Special case for IOR those can be
implemented with bstrins.

gcc/testsuite/ChangeLog;

* gcc.target/loongarch/bstrins-3.c: New test.

Fix wrong cost of MEM when addr is a lea.

416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0.
The commit adjust rtx_cost of mem to reduce cost of (add op0 disp).
But Cost of ADDR could be cheaper than XEXP (addr, 0) when it's a lea.
It is the case in the PR, the patch adjust rtx_cost to only handle reg
+ disp, for other forms, they're basically all LEA which doesn't have
additional cost of ADD.

gcc/ChangeLog:

PR target/115462
* config/i386/i386.cc (ix86_rtx_costs): Make cost of MEM (reg +
disp) just a little bit more than MEM (reg).

gcc/testsuite/ChangeLog:
* gcc.target/i386/pr115462.c: New test.

Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int

This patch would like to add the middle-end presentation for the
saturation truncation.  Aka set the result of truncated value to
the max value when overflow.  It will take the pattern similar
as below.

Form 1:
  #define DEF_SAT_U_TRUC_FMT_1(WT, NT) \
  NT __attribute__((noinline))         \
  sat_u_truc_##T##_fmt_1 (WT x)        \
  {                                    \
    bool overflow = x > (WT)(NT)(-1);  \
    return ((NT)x) | (NT)-overflow;    \
  }

For example, truncated uint16_t to uint8_t, we have

* SAT_TRUNC (254)   => 254
* SAT_TRUNC (255)   => 255
* SAT_TRUNC (256)   => 255
* SAT_TRUNC (65536) => 255

Given below SAT_TRUNC from uint64_t to uint32_t.

DEF_SAT_U_TRUC_FMT_1 (uint64_t, uint32_t)

Before this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  _Bool overflow;
  unsigned int _1;
  unsigned int _2;
  unsigned int _3;
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  overflow_5 = x_4(D) > 4294967295;
  _1 = (unsigned int) x_4(D);
  _2 = (unsigned int) overflow_5;
  _3 = -_2;
  _6 = _1 | _3;
  return _6;
;;    succ:       EXIT

}

After this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = .SAT_TRUNC (x_4(D)); [tail call]
  return _6;
;;    succ:       EXIT

}

The below tests are passed for this patch:
*. The rv64gcv fully regression tests.
*. The rv64gcv build with glibc.
*. The x86 bootstrap tests.
*. The x86 fully regression tests.

gcc/ChangeLog:

* internal-fn.def (SAT_TRUNC): Add new signed IFN sat_trunc as
unary_convert.
* match.pd: Add new matching pattern for unsigned int sat_trunc.
* optabs.def (OPTAB_CL): Add unsigned and signed optab.
* tree-ssa-math-opts.cc (gimple_unsigend_integer_sat_trunc): Add
new decl for the matching pattern generated func.
(match_unsigned_saturation_trunc): Add new func impl to match
the .SAT_TRUNC.
(math_opts_dom_walker::after_dom_children): Add .SAT_TRUNC match
function under BIT_IOR_EXPR case.

Signed-off-by: Pan Li <pan2.li@intel.com>

Vect: Support truncate after .SAT_SUB pattern in zip

The zip benchmark of coremark-pro have one SAT_SUB like pattern but
truncated as below:

void test (uint16_t *x, unsigned b, unsigned n)
{
  unsigned a = 0;
  register uint16_t *p = x;

  do {
    a = *--p;
    *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
  } while (--n);
}

It will have gimple before vect pass,  it cannot hit any pattern of
SAT_SUB and then cannot vectorize to SAT_SUB.

_2 = a_11 - b_12(D);
iftmp.0_13 = (short unsigned int) _2;
_18 = a_11 >= b_12(D);
iftmp.0_5 = _18 ? iftmp.0_13 : 0;

This patch would like to improve the pattern match to recog above
as truncate after .SAT_SUB pattern.  Then we will have the pattern
similar to below,  as well as eliminate the first 3 dead stmt.

_2 = a_11 - b_12(D);
iftmp.0_13 = (short unsigned int) _2;
_18 = a_11 >= b_12(D);
iftmp.0_5 = (short unsigned int).SAT_SUB (a_11, b_12(D));

The below tests are passed for this patch.
1. The rv64gcv fully regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 fully regression tests.

gcc/ChangeLog:

* match.pd: Add convert description for minus and capture.
* tree-vect-patterns.cc (vect_recog_build_binary_gimple_call): Add
new logic to handle in_type is incompatibile with out_type,  as
well as rename from.
(vect_recog_build_binary_gimple_stmt): Rename to.
(vect_recog_sat_add_pattern): Leverage above renamed func.
(vect_recog_sat_sub_pattern): Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

tree-optimization/115652 - amend last fix

The previous fix breaks in the degenerate case when the discovered
last_stmt is equal to the first stmt in the block since then we
undo a required stmt advancement.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Only insert
at the start of the block if that strictly dominates
the discovered dependent stmt.

tree-optimization/115493 - complete previous fix

The following fixes the 2nd occurance of new_temp missed with the
previous fix.

PR tree-optimization/115493
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Use
first scalar result.

Daily bump.

libstdc++: Add script to update docs for a new release branch

This should be run on a release branch after branching from trunk.
Various links and references to trunk in the docs will be updated to
refer to the new release branch.

libstdc++-v3/ChangeLog:

* scripts/update_release_branch.sh: New file.

libstdc++: Remove duplicate test

We currently have 808590.cc which only runs for C++98 mode, and
808590-cxx11.cc which only runs for C++11 and later, but have almost
identical content (except for a defaulted special member in the C++11
one, to suppress a -Wdeprecated-copy warning).

This was done originally to ensure that the test ran for both C++98 mode
and C++11 mode, because the logic being tested was different enough to
need both to be tested. But it's trivial to run all tests in multiple
-std modes now, using GLIBCXX_TESTSUITE_STDS, so we don't need two
separate tests. We can remove one of the tests and allow the other one
to run in any -std mode.

libstdc++-v3/ChangeLog:

* testsuite/20_util/specialized_algorithms/uninitialized_copy/808590.cc:
Copy defaulted assignment operator from 808590-cxx11.cc to
suppress a warning.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/808590-cxx11.cc:
Removed.

libstdc++: Increase timeouts for PSTL tests in debug mode [PR90276]

These tests compile very slowly in debug mode.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
* testsuite/25_algorithms/pstl/alg_modifying_operations/rotate_copy.cc:
Increase timeout for debug mode.
* testsuite/25_algorithms/pstl/alg_modifying_operations/transform_binary.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_nonmodifying/mismatch.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/lexicographical_compare.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/minmax_element.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc:
Likewise.

libstdc++: Work around some PSTL test failures for debug mode [PR90276]

This addresses one known failure due to a bug in the upstream tests, and
a number of timeouts due to the algorithms running much more slowly with
debug mode checks enabled.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
* testsuite/25_algorithms/pstl/alg_sorting/partial_sort.cc
[_GLIBCXX_DEBUG]: Add xfail-run-if for debug mode.
* testsuite/25_algorithms/pstl/alg_nonmodifying/nth_element.cc
[_GLIBCXX_DEBUG]: Reduce size of test data.
* testsuite/25_algorithms/pstl/alg_sorting/includes.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_util.h:
Likewise.

libstdc++: Fix std::chrono::tzdb to work with vanguard format

I found some issues in the std::chrono::tzdb parser by testing the
tzdata "vanguard" format, which uses new features that aren't enabled in
the "main" and "rearguard" data formats.

Since 2024a the keyword "minimum" is no longer valid for the FROM and TO
fields in a Rule line, which means that "m" is now a valid abbreviation
for "maximum". Previously we expected either "mi" or "ma". For backwards
compatibility, a FROM field beginning with "mi" is still supported and
is treated as 1900. The "maximum" keyword is only allowed in TO now,
because it makes no sense in FROM. To support these changes the
minmax_year and minmax_year2 classes for parsing FROM and TO are
replaced with a single years_from_to class that reads both fields.

The vanguard format makes use of %z in Zone FORMAT fields, which caused
an exception to be thrown from ZoneInfo::set_abbrev because no % or /
characters were expected when a Zone doesn't use a named Rule. The
ZoneInfo::to(sys_info&) function now uses format_abbrev_str to replace
any %z with the current offset. Although format_abbrev_str also checks
for %s and STD/DST formats, those only make sense when a named Rule is
in effect, so won't occur when ZoneInfo::to(sys_info&) is used.

This change also implements a feature that has always been missing from
time_zone::_M_get_sys_info: finding the Rule that is active before the
specified time point, so that we can correctly handle %s in the FORMAT
for the first new sys_info that gets created. This requires implementing
a poorly documented feature of zic, to get the LETTERS field from a
later transition, as described at
https://mm.icann.org/pipermail/tz/2024-April/058891.html
In order for this to work we need to be able to distinguish an empty
letters field (as used by CE%sT where the variable part is either empty
or "S") from "the letters field is not known for this transition". The
tzdata file uses "-" for an empty letters field, which libstdc++ was
previously replacing with "" when the Rule was parsed. Instead, we now
preserve the "-" in the Rule object, so that "" can be used for the case
where we don't know the letters (and so need to decide it).

libstdc++-v3/ChangeLog:

* src/c++20/tzdb.cc (minmax_year, minmax_year2): Remove.
(years_from_to): New class replacing minmax_year and
minmax_year2.
(format_abbrev_str, select_std_or_dst_abbrev): Move earlier in
the file. Handle "-" for letters.
(ZoneInfo::to): Use format_abbrev_str to expand %z.
(ZoneInfo::set_abbrev): Remove exception. Change parameter from
reference to value.
(operator>>(istream&, Rule&)): Do not clear letters when it
contains "-".
(time_zone::_M_get_sys_info): Add missing logic to find the Rule
in effect before the time point.
* testsuite/std/time/tzdb/1.cc: Adjust for vanguard format using
"GMT" as the Zone name, not as a Link to "Etc/GMT".
* testsuite/std/time/time_zone/sys_info_abbrev.cc: New test.

tree-optimization/115629 - missed tail merging

The following fixes a missed tail-merging observed for the testcase
in PR115629.  The issue is that when deps_ok_for_redirect doesn't
compute both would be valid prevailing blocks it rejects the merge.
The following instead makes sure to record the working block as
prevailing.  Also stmt comparison fails for indirect references
and is not handling memory references thoroughly, failing to unify
array indices and pointers indirected.  The following attempts to
fix this.

PR tree-optimization/115629
* tree-ssa-tail-merge.cc (gimple_equal_p): Handle
memory references better.
(deps_ok_for_redirect): Handle the case not both blocks
are considered a valid prevailing block.

* gcc.dg/tree-ssa/tail-merge-1.c: New testcase.

RISC-V: Update testcase comments to point to PSABI rather than Table A.6

Table A.6 was originally the source of truth for the recommended mappings.
Point to the PSABI doc since the memory model mappings have been moved there.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/a-rvwmo-fence.c: Replace A.6 reference with PSABI.
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Ditto.
* gcc.target/riscv/amo/a-ztso-fence.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-release.c: Ditto.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: Ditto.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

RISC-V: Consolidate amo testcase variants

Many riscv/amo/ testcases use check-function-bodies. These testcases can be
consolidated with related testcases (memory ordering variants) without affecting
the assertions.

Give functions descriptive names so testsuite failures are obvious from the
'FAIL:' line.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Removed.
* gcc.target/riscv/amo/a-rvwmo-fence.c: New test.
* gcc.target/riscv/amo/a-ztso-fence.c: New test.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

RISC-V: Rename amo testcases

Rename riscv/amo/ testcases to follow a '{ext}-{model}-{name}-{memory order}.c'
naming convention.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-release.c: ...here.
* gcc.target/riscv/amo/amo-zaamo-preferred-over-zalrsc.c: Move to...
* gcc.target/riscv/amo/zaamo-preferred-over-zalrsc.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-7.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: ...here.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

rs6000, change altivec*-runnable.c test file names

Changed the names of the test files.

gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-1-runnable.c: Change the name to
altivec-38.c.
* gcc.target/powerpc/altivec-2-runnable.c: Change the name to
p8vector-builtin-9.c.

rs6000, altivec-2-runnable.c update the require-effective-target

The test requires a minimum of Power8 vector HW and a compile level
of -O2. Update the dg test directives.

gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-2-runnable.c: Change the
require-effective-target for the test.

rs6000, altivec-1-runnable.c update the require-effective-target

Update the dg test directives.

gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-1-runnable.c: Change the
require-effective-target for the test.

[committed] Remove compromised sh test

Surya's recent patch to IRA improves the code for sh/pr54602-1.c slightly.
Specifically it's able to eliminate a save/restore in the prologue/epilogue and
a bit of register shuffling.

As a result there literally aren't any insns that can be used to fill the delay
slot of the return, so a nop gets emitted and the test fails.

Given there literally aren't any insns to move into the delay slot, the best
course of action is to just drop the test.

gcc/testsuite
* gcc.target/sh/pr54602-1.c: Delete test.

[committed][RISC-V] Fix expected output for thead store pair test

Surya's patch to IRA has improved the code we generate for one of the thead
store pair tests for both rv32 and rv64. This patch adjusts the expectations
of that test.

I've verified that the test now passes on rv32 and rv64 in my tester. Pushing
to the trunk.

gcc/testsuite
* gcc.target/riscv/xtheadmempair-3.c: Update expected output.

tree-optimization/115652 - adjust insertion gsi for SLP

The following adjusts how SLP computes the insertion location.  In
particular it advanced the insert iterator of the found last_stmt.
The vectorizer will later insert stmts _before_ it.  But we also
have the constraint that possibly masked ops may not be scheduled
outside of the loop and as we do not model the loop mask in the
SLP graph we have to adjust for that.  The following moves this
to after the advance since it isn't compatible with that as the
current GIMPLE_COND exception shows.  The PR is about in-order
reduction vectorization which also isn't happy when that's the
very first stmt.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Advance the
iterator based on last_stmt only for vector defs.

Record edge true/false value for gcov

Make gcov aware which edges are the true/false to more accurately
reconstruct the CFG. There are plenty of bits left in arc_info and it
opens up for richer reporting.

gcc/ChangeLog:

* gcov-io.h (GCOV_ARC_TRUE): New.
(GCOV_ARC_FALSE): New.
* gcov.cc (struct arc_info): Add true_value, false_value.
(read_graph_file): Read true_value, false_value.
* profile.cc (branch_prob): Write GCOV_ARC_TRUE, GCOV_ARC_FALSE.

Use the term MC/DC in help for gcov --conditions

Without key terms like "masking" and "MC/DC" it is not at all obvious
what --conditions actually reports on, and there is no easy path for the
user to figure out. By at least including the two key terms MC/DC and
masking users have something to search for.

gcc/ChangeLog:

* gcov.cc (print_usage): Reference masking MC/DC.

Add section on MC/DC in gcov manual

gcc/ChangeLog:

* doc/gcov.texi: Add MC/DC section.

Use auto_vec for memory release on return

Using auto_vec ensure this memory is cleaned up on function exit.

gcc/ChangeLog:

* tree-profile.cc (find_conditions): Use auto_vec.

arm: make arm_predict_doloop_p reject loops with calls

With the introduction of low overhead loops we defined arm_predict_doloop_p,
this is meant to be a low-weight check to rule out loops we are not considering
for doloop optimization and it is used by other passes to prevent optimizations
that may hurt the doloop optimization later on. The reason these are meant to be
lightweight is because it's used by pre-RTL optimizations, meaning we can't do
the same checks that doloop does.

After the definition of arm_predict_doloop_p, when testing for armv8.1-m.main,
tree-ssa/ivopts-3.c failed the scan-dump check as the dump now matched an extra
'!= 0' introduced by:
Doloop cmp iv use: if (ivtmp_1 != 0)
Predict loop 1 can perform doloop optimization later.

where previously we had:
Predict doloop failure due to target specific checks.

and after this patch:
Predict doloop failure due to call in loop.
Predict doloop failure due to target specific checks.

Added a copy of the original tree-ssa/ivopts-3.c as a target specifc test to
check for the new dump message.

gcc/ChangeLog:

* config/arm/arm.cc (arm_predict_doloop_p): Reject loops with function
calls that are not builtins.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/ivopts-3.c: New test.

[aarch64] Add support for -mcpu=grace

This adds support for the NVIDIA Grace CPU to aarch64.
We reuse the tuning decisions for the Neoverse V2 core, but include a
number of architecture features that are not enabled by default in
-mcpu=neoverse-v2.

This allows Grace users to more simply target the CPU with -mcpu=grace
rather than remembering what extensions to tag on top of
-mcpu=neoverse-v2.

Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/

* config/aarch64/aarch64-cores.def (grace): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

i386: Remove declaration of unused functions

The patch fixes the issue introduced in
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63512c72df09b43d56ac7680cdfd57a66d40c636
and reported at
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655599.html .

Regards,
Evgeny

The patch fixes the issue with compilation on x86_64-gnu-linux
when warnings for unused functions are treated as errors.

gcc/ChangeLog:

* config/i386/i386.cc (legitimize_dllimport_symbol): Remove unused
functions.
(legitimize_pe_coff_extern_decl): Likewise.

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low short, which are altivec_vmrg[hl]h.
These defines are mainly for built-in function vec_merge{h,l}
and some internal gen function needs.  These functions should
consider endianness, taking vec_mergeh as example, as PVIPR
defines, vec_mergeh "Merges the first halves (in element order)
of two vectors", it does note it's in element order.  So it's
mapped into vmrghh on BE while vmrglh on LE respectively.
Although the mapped insns are different, as the discussion in
PR106069, the RTL pattern should be still the same, it is
conformed before commit r12-4496, but gets changed into
different patterns on BE and LE starting from commit r12-4496.
Similar to 32-bit element case in commit log of r15-1504, this
16-bit element pattern on LE doesn't actually match what the
underlying insn is intended to represent, once some optimization
like combine does some changes basing on it, it would cause
the unexpected consequence.  The newly constructed test case
pr106069-2.c is a typical example for this issue on element type
short.

So this patch is to fix the wrong RTL pattern, ensure the
associated RTL patterns become the same as before which can
have the same semantic as their mapped insns.  With the
proposed patch, the expanders like altivec_vmrghh expands
into altivec_vmrghh_direct_be or altivec_vmrglh_direct_le
depending on endianness, "direct" can easily show which
insn would be generated, _be and _le are mainly for the
different RTL patterns as endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low char, which are altivec_vmrg[hl]b.
These defines are mainly for built-in function vec_merge{h,l}
and some internal gen function needs.  These functions should
consider endianness, taking vec_mergeh as example, as PVIPR
defines, vec_mergeh "Merges the first halves (in element order)
of two vectors", it does note it's in element order.  So it's
mapped into vmrghb on BE while vmrglb on LE respectively.
Although the mapped insns are different, as the discussion in
PR106069, the RTL pattern should be still the same, it is
conformed before commit r12-4496, but gets changed into
different patterns on BE and LE starting from commit r12-4496.
Similar to 32-bit element case in commit log of r15-1504, this
8-bit element pattern on LE doesn't actually match what the
underlying insn is intended to represent, once some optimization
like combine does some changes basing on it, it would cause
the unexpected consequence.  The newly constructed test case
pr106069-1.c is a typical example for this issue.

So this patch is to fix the wrong RTL pattern, ensure the
associated RTL patterns become the same as before which can
have the same semantic as their mapped insns.  With the
proposed patch, the expanders like altivec_vmrghb expands
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness, "direct" can easily show which
insn would be generated, _be and _le are mainly for the
different RTL patterns as endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield

The following makes analysis and transform agree on constraints.

PR tree-optimization/115646
* tree-call-cdce.cc (check_pow): Check for bit_sz values
as allowed by transform.

* gcc.dg/pr115646.c: New testcase.

optab: Add isnormal_optab for isnormal builtin

gcc/
* builtins.cc (interclass_mathfn_icode): Set optab to isnormal_optab
for isnormal builtin.
* optabs.def (isnormal_optab): New.
* doc/md.texi (isnormal): Document.

optab: Add isfinite_optab for isfinite builtin

gcc/
* builtins.cc (interclass_mathfn_icode): Set optab to isfinite_optab
for isfinite builtin.
* optabs.def (isfinite_optab): New.
* doc/md.texi (isfinite): Document.

[libstdc++] [testsuite] no libatomic for vxworks

libatomic hasn't been ported to vxworks. Most of the stdatomic.h and
<atomic> underlying requirements are provided by builtins and libgcc,
and the vxworks libc already provides remaining __atomic symbols, so
porting libatomic doesn't seem to make sense.

However, some of the target arch-only tests in
add_options_for_libatomic cover vxworks targets, so we end up
attempting to link libatomic in, even though it's not there.
Preempt those too-broad tests.

Co-Authored-By: Marc Poulhiès <poulhies@adacore.com>
for libstdc++-v3/ChangeLog

* testsuite/lib/dg-options.exp (add_options_for_libatomic):
None for *-*-vxworks*.

[testsuite] [arm] [vect] adjust mve-vshr test [PR113281]

The test was too optimistic, alas.  We used to vectorize shifts by
clamping the shift counts below the bit width of the types (e.g. at 15
for 16-bit vector elements), but (uint16_t)32768 >> (uint16_t)16 is
well defined (because of promotion to 32-bit int) and must yield 0,
not 1 (as before the fix).

Unfortunately, in the gimple model of vector units, such large shift
counts wouldn't be well-defined, so we won't vectorize such shifts any
more, unless we can tell they're in range or undefined.

So the test that expected the vectorization we no longer performed
needs to be adjusted.  Instead of nobbling the test, Richard Earnshaw
suggested annotating the test with the expected ranges so as to enable
the optimization, and Christophe Lyon suggested a further
simplification.

Co-Authored-By: Richard Earnshaw <Richard.Earnshaw@arm.com>
for  gcc/testsuite/ChangeLog

PR tree-optimization/113281
* gcc.target/arm/simd/mve-vshr.c: Add expected ranges.

Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
and x < 0 ? 1 : 0 into (unsigned) x >> 31.

Move the optimization did in ix86_expand_int_vcond to match.pd

gcc/ChangeLog:

PR target/114189
* match.pd: Simplify a < 0 ? -1 : 0 to (signed) >> 31 and a <
0 ? 1 : 0 to (unsigned) a >> 31 for vector integer type.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx2-pr115517.c: New test.
* gcc.target/i386/avx512-pr115517.c: New test.
* g++.target/i386/avx2-pr115517.C: New test.
* g++.target/i386/avx512-pr115517.C: New test.
* g++.dg/tree-ssa/pr88152-1.C: Adjust testcase.

[PATCH 11/11] Handle subroutine types in CodeView

gcc/
* dwarf2codeview.cc (struct codeview_custom_type): Add lf_procedure
and lf_arglist to union.
(write_lf_procedure, write_lf_arglist): New functions.
(write_custom_types): Call write_lf_procedure and write_lf_arglist.
(get_type_num_subroutine_type): New function.
(get_type_num): Handle DW_TAG_subroutine_type DIEs.
* dwarf2codeview.h (LF_PROCEDURE, LF_ARGLIST): Define.

[PATCH 10/11] Handle bitfields for CodeView

Translates structure members with DW_AT_data_bit_offset set in DWARF
into LF_BITFIELD symbols.

gcc/
* dwarf2codeview.cc (struct codeview_custom_type): Add lf_bitfield to
union.
(write_lf_bitfield): New function.
(write_custom_types): Call write_lf_bitfield.
(create_bitfield): New function.
(get_type_num_struct): Handle bitfields.
* dwarf2codeview.h (LF_BITFIELD): Define.

diagnostics: introduce diagnostic-global-context.cc

This moves all of the uses of global_dc within diagnostic.cc (including
the definition) to a new diagnostic-global-context.cc. My intent is to
make clearer those parts of our internal API that implicitly use
global_dc, and to perhaps avoid linking global_dc into a future
libdiagnostics.so.

No functional change intended.

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add diagnostic-global-context.o.
* diagnostic-global-context.cc: New file, taken from material in
diagnostic.cc.
* diagnostic.cc (global_diagnostic_context): Move to
diagnostic-global-context.cc.
(global_dc): Likewise.
(verbatim): Likewise.
(emit_diagnostic): Likewise.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
(inform): Likewise.
(inform_n): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror): Likewise.
(permerror_opt): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(error_meta): Likewise.
(sorry): Likewise.
(sorry_at): Likewise.
(seen_error): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
(fnotice): Likewise.
(auto_diagnostic_group::auto_diagnostic_group): Likewise.
(auto_diagnostic_group::~auto_diagnostic_group): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: eliminate various implicit uses of global_dc

This patch eliminates all implicit uses of "global_dc" from
the path-printing logic and from
gcc_rich_location::add_location_if_nearby.

No functional change intended.

gcc/c/ChangeLog:
* c-parser.cc (c_parser_require): Pass *global_dc to
gcc_rich_location::add_location_if_nearby.

gcc/cp/ChangeLog:
* parser.cc (cp_parser_error_1): Pass *global_dc to
gcc_rich_location::add_location_if_nearby.
(cp_parser_decl_specifier_seq): Likewise.
(cp_parser_set_storage_class): Likewise.
(cp_parser_set_storage_class): Likewise.

gcc/ChangeLog:
* diagnostic-path.cc (class path_label): Add m_path field,
and use it to replace all uses of global_dc.
(event_range::event_range): Add "ctxt" param and use it to
construct m_path_label.
(event_range::maybe_add_event): Add "ctxt" param and pass it to
gcc_rich_location::add_location_if_nearby.
(path_summary::path_summary): Add "ctxt" param and pass it to
event_range::maybe_add_event.
(diagnostic_context::print_path): Pass *this to path_summary ctor.
(selftest::test_empty_path): Use "dc" when constructing
path_summary rather than implicitly using global_dc.
(selftest::test_intraprocedural_path): Likewise.
(selftest::test_interprocedural_path_1): Likewise.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.
(selftest::test_control_flow_1): Likewise.
(selftest::test_control_flow_2): Likewise.
(selftest::test_control_flow_3): Likewise.
(selftest::assert_cfg_edge_path_streq): Likewise.
(selftest::test_control_flow_5): Likewise.
(selftest::test_control_flow_6): Likewise.
(selftest::diagnostic_path_cc_tests): Eliminate use of global_dc.
* diagnostic-show-locus.cc
(gcc_rich_location::add_location_if_nearby): Add "ctxt" param and
use it instead of implicitly using global_dc.
(selftest::test_add_location_if_nearby): Use
test_diagnostic_context rather than implicitly using global_dc.
* diagnostic.cc (pedantic_warning_kind): Delete macro.
(permissive_error_kind): Delete macro.
(permissive_error_option): Delete macro.
(diagnostic_context::diagnostic_enabled): Remove use of
permissive_error_option.
(diagnostic_context::report_diagnostic): Remove use of
pedantic_warning_kind.
(diagnostic_impl): Convert to...
(diagnostic_context::diagnostic_impl): ...this.
(diagnostic_n_impl): Convert to...
(diagnostic_context::diagnostic_n_impl): ...this.
(emit_diagnostic): Explicitly use global_dc for method call.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
(inform): Likewise.
(inform_n): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror): Likewise.
(permerror_opt): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(error_meta): Likewise.
(sorry): Likewise.
(sorry_at): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
* diagnostic.h (diagnostic_context::diagnostic_impl): New decl.
(diagnostic_context::diagnostic_n_impl): New decl.
* gcc-rich-location.h (gcc_rich_location::add_location_if_nearby):
Add "ctxt" param.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

testsuite: use check-jsonschema for validating .sarif files [PR109360]

As reported here:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655434.html
the schema validation I added for generated .sarif files in
r15-1541-ga84fe222029ff2 used the "jsonschema" command line tool, which
has been deprecated by more recent versions of the Python 3 "jsonschema"
module.

This patch updates the validation to use the more recent
"check-jsonschema" command line tool, from the Python 3 "check-jsonschema"
module, fixing the testsuite FAILs due to the deprecation message.

As an added bonus, the output on validation failures is *much* nicer, e.g.
if I undo r15-1540-g9f4fdc3acebcf6, the error messages begin like this:
verify-sarif-file: res: Schema validation errors were encountered.
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].locations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[1].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[2].physicalLocation.region.startColumn: 0 is less than the minimum of 1
child process exited abnormally
FAIL: c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c  -Wc++-compat   (test .sarif output against SARIF schema)

Tested with Python 3.8 with check_jsonschema 0.28.6

gcc/ChangeLog:
PR testsuite/109360
* doc/install.texi (Python3 modules): Update SARIF validation
requirement to use check-jsonschema rather than jsonschema.

gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/scansarif.exp (verify-sarif-file): Use check-jsonschema
rather than jsonschema, updating the invocation accordingly.
* lib/target-supports.exp (check_effective_target_jsonschema): Convert
to...
(check_effective_target_check_jsonschema): ...this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Daily bump.

c++: decltype of capture proxy of ref [PR115504]

The finish_decltype_type capture proxy handling added in r14-5330 was
incorrectly stripping references in the type of the captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

[PATCH 09/11] Handle arrays for CodeView

Translates DW_TAG_array_type DIEs into LF_ARRAY symbols.

gcc/
* dwarf2codeview.cc (struct codeview_custom_type): Add lf_array to
union.
(write_lf_array): New function.
(write_custom_types): Call write_lf_array.
(get_type_num_array_type): New function.
(get_type_num): Handle DW_TAG_array_type DIEs.
* dwarf2codeview.h (LF_ARRAY): Define.

[PATCH 08/11] Handle unions for CodeView.

Translates DW_TAG_union_type DIEs into LF_UNION symbols.

gcc/
* dwarf2codeview.cc (write_lf_union): New function.
(write_custom_types): Call write_lf_union.
(add_struct_forward_def): Handle DW_TAG_union_type DIEs.
(get_type_num_struct): Handle unions.
(get_type_num): Handle DW_TAG_union_type DIEs.
* dwarf2codeview.h (LF_UNION): Define.

libstdc++: Simplify std::valarray initialization helpers

Dispatching to partial specializations doesn't really seem to offer much
benefit here. The __is_trivial(T) condition is a compile-time constant
so the untaken branches are dead code and don't cost us anything.

libstdc++-v3/ChangeLog:

* include/bits/valarray_array.h (_Array_default_ctor): Remove.
(__valarray_default_construct): Inline it into here.
(_Array_init_ctor): Remove.
(__valarray_fill_construct): Inline it into here.
(_Array_copy_ctor): Remove.
(__valarray_copy_construct(const T*, const T*, T*)): Inline it
into here.
(__valarray_copy_construct(const T*, size_t, size_t, T*)):
Use _GLIBCXX17_CONSTEXPR for constant condition.

modula2: tidyup remove unused procedures and unused parameters

This patch removes M2GenGCC.mod:QuadCondition and
M2Quads.mod:GenQuadOTypeUniquetok. It also removes unused parameter
WalkAction for all FoldIf procedures.

gcc/m2/ChangeLog:

* gm2-compiler/M2GenGCC.mod (QuadCondition): Remove.
(FoldIfEqu): Remove WalkAction parameter.
(FoldIfNotEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGre): Ditto.
(FoldIfLess): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
* gm2-compiler/M2Quads.mod (GenQuadOTypeUniquetok): Remove.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

libstdc++: Replace viewcvs links in docs with cgit links

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Replace viewcvs links with cgit links.
* doc/xml/manual/allocator.xml: Likewise.
* doc/xml/manual/mt_allocator.xml: Likewise.
* doc/html/*: Regenerate.

c++: ICE with __has_unique_object_representations [PR115476]

Here we started to ICE with r13-25: in check_trait_type, for "X[]" we
return true here:

  if (kind == 1 && TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
    return true; // Array of unknown bound. Don't care about completeness.

and then end up crashing in record_has_unique_obj_representations:

4836   if (cur != wi::to_offset (sz))

because sz is null.

https://eel.is/c++draft/type.traits#tab:meta.unary.prop-row-47-column-3-sentence-1
says that the preconditions for __has_unique_object_representations are:
"T shall be a complete type, cv void, or an array of unknown bound" and
that "For an array type T, the same result as
has_unique_object_representations_v<remove_all_extents_t<T>>" so T[]
should be treated as T.  So we should use kind==2 for the trait.

PR c++/115476

gcc/cp/ChangeLog:

* semantics.cc (finish_trait_expr)
<case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS>: Move below to call
check_trait_type with kind==2.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/has-unique-obj-representations4.C: New test.

[PATCH v2 3/3] RISC-V: cmpmem for RISCV with V extension

So this is the cmpmem patch from Sergei, updated for the trunk.

Updates included adjusting the existing cmpmemsi expander to
conditionally try expansion via vector. And a minor testsuite
adjustment to turn off vector expansion in one test that is primarily
focused on vset optimization and ensuring we don't have extras.

I've spun this in my tester successfully and just want to see a clean
run through precommit CI before moving forward.

Jeff
gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_vector::expand_vec_cmpmem): New
function declaration.
* config/riscv/riscv-string.cc (riscv_vector::expand_vec_cmpmem): New
function.
* config/riscv/riscv.md (cmpmemsi): Try riscv_vector::expand_vec_cmpmem
for constant lengths.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cmpmem-1.c: New codegen tests
* gcc.target/riscv/rvv/base/cmpmem-2.c: New execution tests
* gcc.target/riscv/rvv/base/cmpmem-3.c: New codegen tests
* gcc.target/riscv/rvv/base/cmpmem-4.c: New codegen tests
* gcc.target/riscv/rvv/autovec/vls/misalign-1.c: Turn off vector mem* and
str* handling.

PR modula2/115540 gcc/m2/mc-boot-ch/Gtermios.cc error return-statement with a value

This patch fixes three occurrences of cfmakeraw use in the hand built
m2 support libraries which incorrectly attempt to return a void
result.

gcc/m2/ChangeLog:

PR modula2/115540
* gm2-libs-ch/termios.c (cfmakeraw): Remove return.
* mc-boot-ch/Gtermios.cc (cfmakeraw): Remove return.
* pge-boot/Gtermios.cc (cfmakeraw): Remove return.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Add param for bb limit to invoke fast_vrp.

If the basic block count is too high, simply use fast_vrp for all
VRP passes.

* doc/invoke.texi (vrp-block-limit): Document.
* params.opt (param=vrp-block-limit): New.
* tree-vrp.cc (fvrp_folder::execute): Invoke fast_vrp if block
count exceeds limit.

c++: ICE with generic lambda and pack expansion [PR115425]

In r13-272 we hardened the *_PACK_EXPANSION and *_ARGUMENT_PACK macros.
That trips up here because make_pack_expansion returns error_mark_node
and we access that with PACK_EXPANSION_LOCAL_P.

PR c++/115425

gcc/cp/ChangeLog:

* pt.cc (tsubst_pack_expansion): Return error_mark_node if
make_pack_expansion doesn't work out.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic12.C: New test.

c++: ICE with __dynamic_cast redecl [PR115501]

Since r13-3299, build_dynamic_cast_1 calls pushdecl which calls
duplicate_decls and that in this testcase emits the "conflicting
declaration" error and returns error_mark_node, so the subsequent
build_cxx_call crashes on the error_mark_node.

PR c++/115501

gcc/cp/ChangeLog:

* rtti.cc (build_dynamic_cast_1): Return if dcast_fn is erroneous.

gcc/testsuite/ChangeLog:

* g++.dg/rtti/dyncast8.C: New test.

ira: Scale save/restore costs of callee save registers with block frequency

In assign_hard_reg(), when computing the costs of the hard registers, the
cost of saving/restoring a callee-save hard register in prolog/epilog is
taken into consideration. However, this cost is not scaled with the entry
block frequency. Without scaling, the cost of saving/restoring is quite
small and this can result in a callee-save register being chosen by
assign_hard_reg() even though there are free caller-save registers
available. Assigning a callee save register to a pseudo that is live
in the entire function and across a call will cause shrink wrap to fail.

2024-06-25 Surya Kumari Jangala <jskumari@linux.ibm.com>

gcc/
PR rtl-optimization/111673
* ira-color.cc (assign_hard_reg): Scale save/restore costs of
callee save registers with block frequency.

gcc/testsuite/
PR rtl-optimization/111673
* gcc.target/powerpc/pr111673.c: New test.

PR modula2/115536 Expression is evaluated incorrectly when encountering relops and indirection

This fix ensures that we only call BuildRelOpFromBoolean if we are
inside a constant expression (where no indirection can be used).
The fix creates a temporary variable when a boolean is created from
a relop in other cases.
The previous pattern implementation would not work if the operands required
dereferencing during non const expressions.  Comparison of relop results
in a constant expression are resolved by constant propagation, basic
block analysis and dead code removal.  After the quadruples have been
optimized only one assignment to the boolean variable will remain for
const expressions.  All quadruple pattern checking for boolean
expressions is removed by the patch.  Thus the implementation becomes
more generic.

gcc/m2/ChangeLog:

PR modula2/115536
* gm2-compiler/M2BasicBlock.def (GetBasicBlockScope): New procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): New procedure function.
* gm2-compiler/M2BasicBlock.mod (ConvertQuads2BasicBlock): Allow
conditional boolean quads to be removed.
(GetBasicBlockScope): Implement new procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): Implement new procedure function.
* gm2-compiler/M2GCCDeclare.def (FoldConstants): New parameter
declaration.
* gm2-compiler/M2GCCDeclare.mod (FoldConstants): New parameter
declaration.
(DeclareTypesConstantsProceduresInRange): Recreate basic blocks
after resolving constant expressions.
(CodeBecomes): Guard IsVariableSSA with IsVar.
* gm2-compiler/M2GenGCC.def (ResolveConstantExpressions): New
parameter declaration.
* gm2-compiler/M2GenGCC.mod (FoldIfLess): Remove relop pattern
detection.
(FoldIfGre): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
(FoldIfEqu): Ditto.
(FoldIfNotEqu): Ditto.
(FoldBecomes): Add BasicBlock parameter and allow conditional
boolean becomes to be folded in the first basic block.
(ResolveConstantExpressions): Reimplement.
* gm2-compiler/M2Quads.def (IsConstQuad): New procedure function.
(IsConditionalBooleanQuad): Ditto.
* gm2-compiler/M2Quads.mod (IsConstQuad): Implement new procedure function.
(IsConditionalBooleanQuad): Ditto.
(MoveWithMode): Use GenQuadOTypetok.
(IsInitialisingConst): Rewrite using OpUsesOp1.
(OpUsesOp1): New procedure function.
(doBuildAssignment): Mark des as a VarConditional.
(ConvertBooleanToVariable): Call PutVarConditional.
(DumpQuadSummary): New procedure.
(BuildRelOpFromBoolean): Updated debugging and improved comments.
(BuildRelOp): Only call BuildRelOpFromBoolean if we are in a const
expression and both operands are boolean relops.
(GenQuadOTypeUniquetok): New procedure.
(BackPatch): Correct comment.
* gm2-compiler/SymbolTable.def (PutVarConditional): New procedure.
(IsVarConditional): New procedure function.
* gm2-compiler/SymbolTable.mod (PutVarConditional): Implement new
procedure.
(IsVarConditional): Implement new procedure function.
(SymConstVar): New field IsConditional.
(SymVar): New field IsConditional.
(MakeVar): Initialize IsConditional field.
(MakeConstVar): Initialize IsConditional field.
* gm2-compiler/M2Swig.mod (DoBasicBlock): Change parameters to
use BasicBlock.
* gm2-compiler/M2Code.mod (SecondDeclareAndOptimize): Use iterator
to FoldConstants over basic block list.
* gm2-compiler/M2SymInit.mod (AppendEntry): Replace parameters
with BasicBlock.
* gm2-compiler/P3Build.bnf (Relation): Call RecordOp for #, <> and =.

gcc/testsuite/ChangeLog:

PR modula2/115536
* gm2/iso/const/pass/constbool4.mod: New test.
* gm2/iso/const/pass/constbool5.mod: New test.
* gm2/iso/run/pass/condtest2.mod: New test.
* gm2/iso/run/pass/condtest3.mod: New test.
* gm2/iso/run/pass/condtest4.mod: New test.
* gm2/iso/run/pass/condtest5.mod: New test.
* gm2/iso/run/pass/constbool4.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[committed] Fix fr30-elf newlib build failure with late-combine

So the late combine work has exposed a latent bug in the fr30 port.

The fr30 "call" instruction is pc-relative with a *very* limited range, 12 bits
to be precise.

With such a limited range its hard to see how we could ever consistently use it
in the compiler, with the possible exception of self-recursion.  Even for a
call to a locally binding function -ffunction-sections and linker placement of
functions may separate the caller/callee.  Code generation seemed to be using
indirect forms pretty consistently, though the RTL would allow direct calls.

With late-combine some of those indirects would be optimized into direct calls.
This naturally led to out of range scenarios.

With the fr30 port slated for removal unless it gets updated to use LRA and the
fundamental problems using direct calls, I took the shortest path to keep
things working -- namely forcing all calls to be indirect.

Tested in my tester with no regressions (and fixes the newlib build failure
with late-combine enabled).    Pushed to the trunk.

gcc/
* config/fr30/constraints.md (Q): Remove unused constraint.
* config/fr30/predicates.md (call_operand): Remove unused predicate.
* config/fr30/fr30.md (call, vall_value): Turn into expanders and
force the call address into a register.
(*call, *call_value): Adjust to only allow indirect calls.  Adjust
output template accordingly.

late-combine: Honor targetm.cannot_copy_insn_p

late-combine was failing to take targetm.cannot_copy_insn_p into
account, which led to multiple definitions of PIC symbols on
arm*-*-* targets.

gcc/
* late-combine.cc (insn_combination::substitute_nondebug_use):
Reject second and subsequent uses if targetm.cannot_copy_insn_p
disallows copying.

c++: alias CTAD and copy deduction guide [PR115198]

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C<B, T> but it should be __dguide_A with TREE_TYPE A<T>
(i.e. C<false, T>). This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.

PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of the transformed
guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: using non-dep array var of unknown bound [PR115358]

For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.

PR c++/115358

gcc/cp/ChangeLog:

* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
variable with unknown bound.
* semantics.cc (finish_decltype_type): Remove now redundant
handling of array variables with unknown bound.
* typeck.cc (cxx_sizeof_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/array37.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest

This function had a reference to an uninitialized variable on the
error path. The problem was diagnosed by clang but not gcc. It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.

gcc/c/ChangeLog:

PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.

gcc/cp/ChangeLog:

PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.

GORI cleanups

The following replaces conditional is_export_p calls as is_export_p
handles a NULL bb itself.

* gimple-range-gori.cc (gori_compute::may_recompute_p):
Call is_export_p with NULL bb.

doc: gccint: Fix typos in jump_table_data description

gcc/ChangeLog:

* doc/rtl.texi (jump_table_data): Fix typos.

Add a debug counter for late-combine

This should help to diagnose problems like PR115631.

gcc/
* dbgcnt.def (late_combine): New debug counter.
* late-combine.cc (insn_combination::run): Use it.

libatomic: Add rcpc3 128-bit atomic operations for AArch64

The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.

These operations are single-copy atomic on cores which also implement
LSE2 and, as such, support for these operations is added to Libatomic
and employed accordingly when the LSE2 and RCPC3 features are detected
in a given core at runtime.

libatomic/ChangeLog:

* config/linux/aarch64/atomic_16.S (libat_load_16): Add LRCPC3
variant.
(libat_store_16): Likewise.
* config/linux/aarch64/host-config.h (HWCAP2_LRCPC3): New.
(LSE2_LRCPC3_ATOP): Previously LSE2_ATOP. New ifuncs guarded
under it.
(has_rcpc3): New.

SPARC: fix internal error with -mv8plus on 64-bit Linux

This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

gcc/
PR target/115608
* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.

rs6000: Properly default-disable late-combine passes [PR106594, PR115622, PR115633]

..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.

PR target/106594
PR target/115622
PR target/115633
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
default-disable of late-combine passes from here...
(rs6000_override_options_after_change): ... to here.

Revert one of the force_subreg changes

One of the changes in g:d4047da6a070175aae7121c739d1cad6b08ff4b2
caused a regression in ft32-elf; see:

https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655418.html

for details. This change was different from the others in that the
original call was to simplify_subreg rather than simplify_lowpart_subreg.
The old code would therefore go on to do the force_reg for more cases
than the new code would.

gcc/
* expmed.cc (store_bit_field_using_insv): Revert earlier change
to use force_subreg instead of simplify_gen_subreg.

MIPS: Implement vcond_mask optabs for MSA

Currently, we have `mips_expand_vec_cond_expr`, which calculate
cmp_res first. We can just add a new extra argument to ask it
to use operands[3] as cmp_res instead of calculating from operands[4]
and operands[5].

gcc
* config/mips/mips.cc(mips_expand_vec_cond_expr): Add extra
argument to info that opernads[3] is cmp_res already.
* config/mips/mips-protos.h(mips_expand_vec_cond_expr): Ditto.
* config/mips/mips-msa.md(vcond_mask): Define new expand.
(vcondu): Use mips_expand_vec_cond_expr with 4th argument.
(vcond): Ditto.

MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

MIPSr6 removes condition trap instructions with imm, so the instruction
like `teq $2,imm` will be converted to
  li $at, imm
  teq $2, $at

The current version of Gas cannot detect if imm is zero, and output
  teq $2, $0
Let's do it in GCC.

gcc
* config/mips/mips.md(conditional_trap_reg): Output $0 instead
of 0 if !ISA_HAS_COND_TRAPI.

aarch64: Add DLL import/export to AArch64 target

This patch reuses the MinGW implementation to enable DLL import/export
functionality for the aarch64-w64-mingw32 target. It also modifies
environment configurations for MinGW.

2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
* config.gcc: Add winnt-dll.o, which contains the DLL
import/export implementation.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately):
Add dllimport implementation.
(aarch64_expand_call): Likewise.
(aarch64_legitimize_address): Likewise.
* config/aarch64/cygming.h (SYMBOL_FLAG_DLLIMPORT): Modify MinGW
environment to support DLL import/export.
(SYMBOL_FLAG_DLLEXPORT): Likewise.
(SYMBOL_REF_DLLIMPORT_P): Likewise.
(SYMBOL_FLAG_STUBVAR): Likewise.
(SYMBOL_REF_STUBVAR_P): Likewise.
(TARGET_VALID_DLLIMPORT_ATTRIBUTE_P): Likewise.
(TARGET_ASM_FILE_END): Likewise.
(SUB_TARGET_RECORD_STUB): Likewise.
(GOT_ALIAS_SET): Likewise.
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Likewise.
(HAVE_64BIT_POINTERS): Likewise.

Adjust DLL import/export implementation for AArch64

The DLL import/export mingw implementation, originally from ix86,
requires minor adjustments to be compatible with AArch64.

2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
* config/i386/cygming.h
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Declare whether an
external declaration should be legitimized.
(HAVE_64BIT_POINTERS): Define whether the target supports 64-bit
pointers.
* config/mingw/mingw32.h (defined): Use the correct
DllMainCRTStartup entry function.
* config/mingw/winnt-dll.cc (defined): Exclude ix86-related code.

aarch64: Add selectany attribute handling

This patch extends the aarch64 attributes list with the selectany
attribute for the aarch64-w64-mingw32 target and reuses the mingw
implementation to handle it.

2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
* config/aarch64/aarch64.cc: Extend the aarch64 attributes list.
* config/aarch64/cygming.h (SUBTARGET_ATTRIBUTE_TABLE): Define the
selectany attribute.

Rename functions for reuse in AArch64

This patch renames functions related to dllimport/dllexport and
selectany functionality. These functions will be reused in the
aarch64-w64-mingw32 target.

2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
* config/i386/cygming.h (mingw_pe_record_stub): Rename functions
in mingw folder which will be reused for aarch64.
(TARGET_ASM_FILE_END): Update to new target-independent name.
(SUBTARGET_ATTRIBUTE_TABLE): Likewise.
(TARGET_VALID_DLLIMPORT_ATTRIBUTE_P): Likewise.
(SUB_TARGET_RECORD_STUB): Likewise.
* config/i386/i386-protos.h (ix86_handle_selectany_attribute):
Likewise.
(mingw_handle_selectany_attribute): Likewise.
(i386_pe_valid_dllimport_attribute_p): Likewise.
(mingw_pe_valid_dllimport_attribute_p): Likewise.
(i386_pe_file_end): Likewise.
(mingw_pe_file_end): Likewise.
(i386_pe_record_stub): Likewise.
(mingw_pe_record_stub): Likewise.
* config/mingw/winnt.cc (ix86_handle_selectany_attribute):
Likewise.
(mingw_handle_selectany_attribute): Likewise.
(i386_pe_valid_dllimport_attribute_p): Likewise.
(mingw_pe_valid_dllimport_attribute_p): Likewise.
(i386_pe_record_stub): Likewise.
(mingw_pe_record_stub): Likewise.
(i386_pe_file_end): Likewise.
(mingw_pe_file_end): Likewise.
* config/mingw/winnt.h (mingw_handle_selectany_attribute): Declate
functionality that will be reused by multiple targets.
(mingw_pe_file_end): Likewise.
(mingw_pe_record_stub): Likewise.
(mingw_pe_valid_dllimport_attribute_p): Likewise.

Extract ix86 dllimport implementation to mingw

This patch extracts the ix86 implementation for expanding a SYMBOL
into its corresponding dllimport, far-address, or refptr symbol.  It
will be reused in the aarch64-w64-mingw32 target.  The implementation
is copied as is from i386/i386.cc with minor changes to follow to the
code style.

Also this patch replaces the original DLL import/export implementation
in ix86 with mingw.

2024-06-08  Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
* config.gcc: Add winnt-dll.o, which contains the DLL
import/export implementation.
* config/i386/cygming.h (SUB_TARGET_RECORD_STUB): Remove the
old implementation. Rename the required function to MinGW.
Use MinGW implementation for COFF and nothing otherwise.
(GOT_ALIAS_SET): Likewise.
* config/i386/i386-expand.cc (ix86_expand_move): Likewise.
* config/i386/i386-expand.h (ix86_GOT_alias_set): Likewise.
(legitimize_pe_coff_symbol): Likewise.
* config/i386/i386-protos.h (i386_pe_record_stub): Likewise.
* config/i386/i386.cc (is_imported_p): Likewise.
(legitimate_pic_address_disp_p): Likewise.
(ix86_GOT_alias_set): Likewise.
(legitimize_pic_address): Likewise.
(legitimize_tls_address): Likewise.
(struct dllimport_hasher): Likewise.
(GTY): Likewise.
(get_dllimport_decl): Likewise.
(legitimize_pe_coff_extern_decl): Likewise.
(legitimize_dllimport_symbol): Likewise.
(legitimize_pe_coff_symbol): Likewise.
(ix86_legitimize_address): Likewise.
* config/i386/i386.h (GOT_ALIAS_SET): Likewise.
* config/mingw/winnt.cc (i386_pe_record_stub): Likewise.
(mingw_pe_record_stub): Likewise.
* config/mingw/winnt.h (mingw_pe_record_stub): Likewise.
* config/mingw/t-cygming: Add the winnt-dll.o compilation.
* config/mingw/winnt-dll.cc: New file.
* config/mingw/winnt-dll.h: New file.

Move mingw_* declarations to the mingw folder

This patch refactors recent changes to move mingw-related
functionality to the mingw folder. More renamings to the mingw_ prefix
will be done in follow-up commits.

This is the first commit in the second patch series to add DLL
import/export implementation to AArch64.

Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com> and
Ron Riddle <ron.riddle@microsoft.com>

Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>

2024-06-08 Evgeny Karpov <evgeny.karpov@microsoft.com>

gcc/ChangeLog:
* config.gcc: Move mingw_* declations to mingw.
* config/aarch64/aarch64-protos.h
(mingw_pe_maybe_record_exported_symbol): Likewise.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_unique_section): Likewise.
(mingw_pe_encode_section_info): Likewise.
* config/aarch64/cygming.h
(mingw_pe_asm_named_section): Likewise.
(mingw_pe_declare_function_type): Likewise.
* config/i386/i386-protos.h
(mingw_pe_unique_section): Likewise.
(mingw_pe_declare_function_type): Likewise.
(mingw_pe_maybe_record_exported_symbol): Likewise.
(mingw_pe_encode_section_info): Likewise.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_asm_named_section): Likewise.
* config/mingw/winnt.h: New file.

c: Fix ICE related to incomplete structures in C23 [PR114930]

Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL.  That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical should
have been already called on the other type.  c, the returned type, is usually x
and in that case it should have TYPE_CANONICAL set to itself, or worst
for whatever reason x is not the right canonical type (say it has attributes
or whatever disqualifies it from check_qualified_type).  In that case
either it finds some pre-existing type from the variant chain of t which
is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
        /* Put the found variant at the head of the variant list so
           frequently searched variants get found faster.  The C++ FE
           benefits greatly from this.  */
        tree t = *tp;
        *tp = TYPE_NEXT_VARIANT (t);
        TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
        TYPE_NEXT_VARIANT (mv) = t;
        return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
  /* Add the new type to the chain of variants of TYPE.  */
  TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
  TYPE_NEXT_VARIANT (m) = t;
  TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e. TYPE_CANONICAL
(c) = c;), but also need to process pointers to it and only then return back
to processing x.  Processing the whole chain from c again could be costly,
we could have hundreds of types in the chain already processed, and while
the loop would just quickly skip them
  for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
    {
      if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
...
      else if (x != t)
        continue;
it feels costly.  So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.

We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest.  There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.

2024-06-25  Jakub Jelinek  <jakub@redhat.com>
    Martin Uecker  <uecker@tugraz.at>

PR c/114930
PR c/115502
gcc/c/
* c-decl.cc (c_update_type_canonical): Assert t is main variant
with 0 TYPE_QUALS.  Simplify and don't use check_qualified_type.
Deal with the case where build_qualified_type returns
TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
* gcc.dg/pr114574-1.c: Require lto effective target.
* gcc.dg/pr114574-2.c: Likewise.
* gcc.dg/pr114930.c: New test.
* gcc.dg/pr115502.c: New test.

[PATCH 07/11] Handle structs and classes for CodeView

Translates DW_TAG_structure_type DIEs into LF_STRUCTURE symbols, and
DW_TAG_class_type DIEs into LF_CLASS symbols.

gcc/
* dwarf2codeview.cc (struct codeview_type): Add is_fwd_ref member.
(struct codeview_subtype): Add lf_member to union.
(struct codeview_custom_type): Add lf_structure to union.
(struct codeview_deferred_type): New structure.
(deferred_types, last_deferred_type): New variables.
(get_type_num): Add new args to prototype.
(write_lf_fieldlist): Handle LF_MEMBER subtypes.
(write_lf_structure): New function.
(write_custom_types): Call write_lf_structure.
(get_type_num_pointer_type): Add in_struct argument.
(get_type_num_const_type): Likewise.
(get_type_num_volatile_type): Likewise.
(add_enum_forward_def): Fix get_type_num call.
(get_type_num_enumeration_type): Add in-struct argument.
(add_deferred_type, flush_deferred_types): New functions.
(add_struct_forward_def, get_type_num_struct): Likewise.
(get_type_num): Handle self-referential structs.
(add_variable): Fix get_type_num call.
(codeview_debug_early_finish): Call flush_deferred_types.
* dwarf2codeview.h (LF_CLASS, LF_STRUCTURE, LF_MEMBER): Define.

[committed][RISC-V] Fix some of the testsuite fallout from late-combine patch

This fixes most, but not all of the testsuite fallout from the late-combine
patch.  Specifically in the vector space we're often able to eliminate a
broadcast of an scalar element across a vector.  That eliminates the vsetvl
related to the broadcast, but more importantly from the testsuite standpoint it
turns .vv forms into .vf or .vx forms.

There were two paths we could have taken here.  One to accept .v*, ignoring the
actual register operands.  Or to create new matches for the .vx and .vf
variants.  I selected the latter as I'd like us to know if the code to avoid
the broadcast regresses.

I'm pushing this through now so that we've got cleaner results and to prevent
duplicate work.  I've got patch for the rest of the testsuite fallout, but I
want to think about them a bit.

gcc/testsuite
* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Adjust
expected test output after late-combine changes.
* gcc.target/riscv/rvv/autovec/binop/vadd-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vmul-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vsub-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Likewise.

Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type

Currently how we determine which mode will be used for a
floating point type is that for a given type precision
(size) call mode_for_size to get the first mode which has
this size in the specified class.  On Powerpc, we have
three modes (TF/KF/IF) having the same mode precision 128
(see[1]), so the processing forces us to have to place TF
at the first place, it would require us to make more
adjustment in some generic code to avoid some unexpected
mode conversions and it would be even worse if we get rid
of TF eventually one day.  And as Joseph pointed out in [2],
"floating  types should have their mode, not a poorly
defined precision value", as Joseph and Richi suggested,
this patch is to introduce one hook mode_for_floating_type
which returns the corresponding mode for type float, double
or long double.  The default implementation returns SFmode
for float and DFmode for double or long double.  For ports
which need special treatment, there are some other patches
for their own port specific implementation (referring to
how {,LONG_}DOUBLE_TYPE_SIZE get used there).  For all
generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending
on the context, some of them are replaced with TYPE_PRECISION
of the according type node, some other are replaced with
GET_MODE_PRECISION on the mode from mode_for_floating_type.
This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port
specific are removed, but there are still some which are
good to be kept for readability then they get renamed with
port specific prefix.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651017.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/jit/ChangeLog:

* jit-recording.cc (recording::memento_of_get_type::get_size): Update
macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.

gcc/ChangeLog:

* coretypes.h (enum tree_index): Forward declaration.
* defaults.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* doc/rtl.texi: Update document by replacing {FLOAT,DOUBLE}_TYPE_SIZE
with C type {float,double}.
* doc/tm.texi.in: Document new hook mode_for_floating_type, remove
document entries for {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE and
update document for WIDEST_HARDWARE_FP_SIZE.
* doc/tm.texi: Regenerate.
* emit-rtl.cc (init_emit_once): Replace DOUBLE_TYPE_SIZE by
calling targetm.c.mode_for_floating_type with TI_DOUBLE_TYPE.
* real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.
* system.h (FLOAT_TYPE_SIZE): Poison.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* target.def (mode_for_floating_type): New hook.
* targhooks.cc (default_mode_for_floating_type): New function.
(default_scalar_mode_supported_p): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.
* targhooks.h (default_mode_for_floating_type): New declaration.
* tree-core.h (enum tree_index): Specify underlying type unsigned
to sync with forward declaration in coretypes.h.
(NUM_FLOATN_TYPES): Explicitly convert to int.
(NUM_FLOATNX_TYPES): Likewise.
(NUM_FLOATN_NX_TYPES): Likewise.
* tree.cc (build_common_tree_nodes): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE and set type mode accordingly.
* config/arc/arc.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/bpf/bpf.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/epiphany/epiphany.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/fr30/fr30.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/frv/frv.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/ft32/ft32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/gcn/gcn.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/iq2000/iq2000.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/lm32/lm32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/m32c/m32c.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/m32r/m32r.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/microblaze/microblaze.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/mmix/mmix.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/moxie/moxie.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/msp430/msp430.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nds32/nds32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nios2/nios2.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nvptx/nvptx.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/or1k/or1k.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pdp11/pdp11.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pru/pru.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/stormy16/stormy16.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/visium/visium.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/xtensa/xtensa.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rs6000/rs6000.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(rs6000_c_mode_for_floating_type): New function.
* config/rs6000/rs6000.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/aarch64/aarch64.cc (aarch64_c_mode_for_floating_type):
New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/aarch64/aarch64.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/alpha/alpha.cc (alpha_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/alpha/alpha.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/avr/avr.cc (avr_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/avr/avr.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/i386/i386.cc (ix86_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/i386/i386.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/ia64/ia64.cc (ia64_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/ia64/ia64.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/riscv/riscv.cc (riscv_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/riscv/riscv.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rl78/rl78.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(rl78_c_mode_for_floating_type): New function.
* config/rl78/rl78.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rx/rx.cc (rx_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/rx/rx.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/s390/s390.cc (s390_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/s390/s390.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/sh/sh.cc (sh_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/sh/sh.h (LONG_DOUBLE_TYPE_SIZE): Remove.
* config/h8300/h8300.cc (h8300_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/h8300/h8300.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Remove.
(DOUBLE_TYPE_MODE): New macro.
* config/h8300/linux.h (DOUBLE_TYPE_SIZE): Remove.
(DOUBLE_TYPE_MODE): New macro.
* config/loongarch/loongarch.cc (loongarch_c_mode_for_floating_type):
New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/loongarch/loongarch.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(LA_LONG_DOUBLE_TYPE_SIZE): ... this.
(UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE with
LA_LONG_DOUBLE_TYPE_SIZE.
(MAX_FIXED_MODE_SIZE): Likewise.
(STRUCTURE_SIZE_BOUNDARY): Likewise.
(BIGGEST_ALIGNMENT): Likewise.
* config/m68k/m68k.cc (m68k_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/m68k/m68k.h (LONG_DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_MODE): New macro.
* config/m68k/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_MODE): New macro.
* config/mips/mips.cc (mips_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/mips/mips.h (UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE
with MIPS_LONG_DOUBLE_TYPE_SIZE.
(MAX_FIXED_MODE_SIZE): Likewise.
(STRUCTURE_SIZE_BOUNDARY): Likewise.
(BIGGEST_ALIGNMENT): Likewise.
(FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/mips/n32-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/pa/pa.cc (pa_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(pa_scalar_mode_supported_p): Rename FLOAT_TYPE_SIZE to
PA_FLOAT_TYPE_SIZE, rename DOUBLE_TYPE_SIZE to PA_DOUBLE_TYPE_SIZE
and rename LONG_DOUBLE_TYPE_SIZE to PA_LONG_DOUBLE_TYPE_SIZE.
* config/pa/pa.h (PA_FLOAT_TYPE_SIZE): New macro.
(PA_DOUBLE_TYPE_SIZE): Likewise.
(PA_LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pa/pa-64.h (FLOAT_TYPE_SIZE): Rename to ...
(PA_FLOAT_TYPE_SIZE): ... this.
(DOUBLE_TYPE_SIZE): Rename to ...
(PA_DOUBLE_TYPE_SIZE): ... this.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(PA_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/pa/pa-hpux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(PA_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sparc.cc (sparc_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
(sparc_type_code): Replace FLOAT_TYPE_SIZE with TYPE_PRECISION of
float_type_node.
* config/sparc/sparc.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
* config/sparc/freebsd.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/linux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/linux64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/openbsd64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sol2.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sp-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sp64-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/bfin/bfin.h (FLOAT_TYPE_SIZE): Rename to ...
(BFIN_FLOAT_TYPE_SIZE): ... this.
(DOUBLE_TYPE_SIZE): Rename to ...
(BFIN_DOUBLE_TYPE_SIZE): ... this.
(LONG_DOUBLE_TYPE_SIZE): Remove.
(UNITS_PER_FLOAT): Replace FLOAT_TYPE_SIZE with BFIN_FLOAT_TYPE_SIZE.
(UNITS_PER_DOUBLE): Replace DOUBLE_TYPE_SIZE with
BFIN_DOUBLE_TYPE_SIZE.

vms: Replace use of LONG_DOUBLE_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of LONG_DOUBLE_TYPE_SIZE in vms port
with TYPE_PRECISION of long_double_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/ChangeLog:

* config/vms/vms.cc (vms_patch_builtins): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.

rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/rust/ChangeLog:

* rust-gcc.cc (float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.

go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type. To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in go with TYPE_PRECISION of {float,{,long_}double}_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/go/ChangeLog:

* go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
(Gcc_backend::complex_type): Likewise.

c-family: Add Warning property to Wnrvo option [PR115624]

This was missing when Wnrvo was added in
r14-1594-g2ae5384d457b9c67586de012816dfc71a6943164 .

Pushed after a bootstrap/test on x86_64-linux-gnu.

gcc/c-family/ChangeLog:

PR c++/115624
* c.opt (Wnrvo): Add Warning property.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Make transitive relations an oracle option

This patch makes processing of transitive relations configurable at
dom_oracle creation.

* tree-vrp.cc (execute_fast_vrp): Do not use transitive relations.
* value-query.cc (range_query::create_relation_oracle): Add
parameter to enable transitive relations.
* value-query.h (range_query::create_relation_oracle): Likewise.
* value-relation.h (dom_oracle::dom_oracle): Likewise.
* value-relation.cc (dom_oracle::dom_oracle): Likewise.
(dom_oracle::register_transitives): Check transitive flag.

Daily bump.

[PATCH v2 2/3] RISC-V: setmem for RISCV with V extension

This is primarily Sergei's work, my contributions were limited to
merging his expander with the one that's on the trunk, allowing
non-constant value and trivial testsuite adjustments due to option renaming.

I'm doing setmem first because it's the easiest. The others will follow
soon enough.

I've tested this in my system, waiting on pre-commit CI to render its
verdict before moving forward.

gcc/ChangeLog

* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.

* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.

* config/riscv/riscv.md (setmem<mode>): Try riscv_vector::expand_vec_setmem
for constant lengths. Do not require operand 2 to be a constant.

gcc/testsuite/ChangeLog

* gcc.target/riscv/rvv/base/setmem-1.c: New tests
* gcc.target/riscv/rvv/base/setmem-2.c: New tests
* gcc.target/riscv/rvv/base/setmem-3.c: New tests

RISC-V: Add dg-remove-option for z* extensions

This introduces testsuite support infra for removing extensions.
Since z* extensions don't have ordering requirements the logic for
adding/removing those extensions has also been consolidated.

This fixes RVWMO compile testcases failing on Ztso targets by removing
the extension from the -march string.

gcc/ChangeLog:

* doc/sourcebuild.texi (dg-remove-option): Add documentation.
(dg-add-option): Add documentation for riscv_{a,zaamo,zalrsc,ztso}

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Add dg-remove-options
for ztso.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Replace manually
specified -march string with dg-add/remove-options directives.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Ditto.
* lib/target-supports-dg.exp: Add dg-remove-options.
* lib/target-supports.exp: Add dg-remove-options and consolidate z*
extension add/remove-option code.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

Fortran: fix passing of optional dummy as actual to optional argument [PR55978]

gcc/fortran/ChangeLog:

PR fortran/55978
* trans-array.cc (gfc_conv_array_parameter): Do not dereference
data component of a missing allocatable dummy array argument for
passing as actual to optional dummy. Harden logic of presence
check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
of TRUTH_AND_EXPR.

gcc/testsuite/ChangeLog:

PR fortran/55978
* gfortran.dg/optional_absent_12.f90: New test.

PR tree-optimization/113673: Avoid load merging when potentially trapping.

This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified.  When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.

This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.

2024-06-24  Roger Sayle  <roger@nextmovesoftware.com>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.

gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.

tree-optimization/115602 - SLP CSE results in cycles

The following prevents SLP CSE to create new cycles which happened
because of a 1:1 permute node being present where its child was then
CSEd to the permute node. Fixed by making a node only available to
CSE to after recursing.

PR tree-optimization/115602
* tree-vect-slp.cc (vect_cse_slp_nodes): Delay populating the
bst-map to avoid cycles.

* gcc.dg/vect/pr115602.c: New testcase.

tree-optimization/115528 - fix vect alignment analysis for outer loop vect

For outer loop vectorization of a data reference in the inner loop
we have to look at both steps to see if they preserve alignment.

What is special for this testcase is that the outer loop step is
one element but the inner loop step four and that we now use SLP
and the vectorization factor is one.

PR tree-optimization/115528
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Make sure to look at both the inner and outer loop step
behavior.

* gfortran.dg/vect/pr115528.f: New testcase.

Fix MinGW option -mcrtdll=

Add missing msvcr40* and msvcrtd* cases to CPP_SPEC and
document missing _UCRT macro and msvcr71* case.

Fixes commit 453cb585f0f8673a5d69d1b420ffd4b3f53aca00.

gcc/
* config/i386/mingw-w64.h (CPP_SPEC): Add missing -mcrtdll=
cases: msvcr40*, msvcrtd*.
* config/mingw/mingw32.h (CPP_SPEC): Add missing -mcrtdll=
cases: msvcr40*, msvcrtd*.
* doc/invoke.texi: Add missing -mcrtdll= cases: msvcr40*,
msvcrtd*, msvcr71*. Express wildcards with *. Document _UCRT.

Regenerate common.opt.urls

gcc/
* common.opt.urls: Regenerate.

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

    "&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                 Tests   Good    Bad   %Good   Delta  Median
======                 =====   ====    ===   =====   =====  ======
aarch64-linux-gnu       2215   1975    240  89.16%   -4159      -1
aarch64_be-linux-gnu    1569   1483     86  94.52%  -10117      -1
alpha-linux-gnu         1454   1370     84  94.22%   -9502      -1
amdgcn-amdhsa           5122   4671    451  91.19%  -35737      -1
arc-elf                 2166   1932    234  89.20%  -37742      -1
arm-linux-gnueabi       1953   1661    292  85.05%  -12415      -1
arm-linux-gnueabihf     1834   1549    285  84.46%  -11137      -1
avr-elf                 4789   4330    459  90.42% -441276      -4
bfin-elf                2795   2394    401  85.65%  -19252      -1
bpf-elf                 3122   2928    194  93.79%   -8785      -1
c6x-elf                 2227   1929    298  86.62%  -17339      -1
cris-elf                3464   3270    194  94.40%  -23263      -2
csky-elf                2915   2591    324  88.89%  -22146      -1
epiphany-elf            2399   2304     95  96.04%  -28698      -2
fr30-elf                7712   7299    413  94.64%  -99830      -2
frv-linux-gnu           3332   2877    455  86.34%  -25108      -1
ft32-elf                2775   2667    108  96.11%  -25029      -1
h8300-elf               3176   2862    314  90.11%  -29305      -2
hppa64-hp-hpux11.23     4287   4247     40  99.07%  -45963      -2
ia64-linux-gnu          2343   1946    397  83.06%   -9907      -2
iq2000-elf              9684   9637     47  99.51% -126557      -2
lm32-elf                2681   2608     73  97.28%  -59884      -3
loongarch64-linux-gnu   1303   1218     85  93.48%  -13375      -2
m32r-elf                1626   1517    109  93.30%   -9323      -2
m68k-linux-gnu          3022   2620    402  86.70%  -21531      -1
mcore-elf               2315   2085    230  90.06%  -24160      -1
microblaze-elf          2782   2585    197  92.92%  -16530      -1
mipsel-linux-gnu        1958   1827    131  93.31%  -15462      -1
mipsisa64-linux-gnu     1655   1488    167  89.91%  -16592      -2
mmix                    4914   4814    100  97.96%  -63021      -1
mn10300-elf             3639   3320    319  91.23%  -34752      -2
moxie-rtems             3497   3252    245  92.99%  -87305      -3
msp430-elf              4353   3876    477  89.04%  -23780      -1
nds32le-elf             3042   2780    262  91.39%  -27320      -1
nios2-linux-gnu         1683   1355    328  80.51%   -8065      -1
nvptx-none              2114   1781    333  84.25%  -12589      -2
or1k-elf                3045   2699    346  88.64%  -14328      -2
pdp11                   4515   4146    369  91.83%  -26047      -2
pru-elf                 1585   1245    340  78.55%   -5225      -1
riscv32-elf             2122   2000    122  94.25% -101162      -2
riscv64-elf             1841   1726    115  93.75%  -49997      -2
rl78-elf                2823   2530    293  89.62%  -40742      -4
rx-elf                  2614   2480    134  94.87%  -18863      -1
s390-linux-gnu          1591   1393    198  87.55%  -16696      -1
s390x-linux-gnu         2015   1879    136  93.25%  -21134      -1
sh-linux-gnu            1870   1507    363  80.59%   -9491      -1
sparc-linux-gnu         1123   1075     48  95.73%  -14503      -1
sparc-wrs-vxworks       1121   1073     48  95.72%  -14578      -1
sparc64-linux-gnu       1096   1021     75  93.16%  -15003      -1
v850-elf                1897   1728    169  91.09%  -11078      -1
vax-netbsdelf           3035   2995     40  98.68%  -27642      -1
visium-elf              1392   1106    286  79.45%   -7984      -2
xstormy16-elf           2577   2071    506  80.36%  -13061      -1

gcc/
PR rtl-optimization/106594
PR rtl-optimization/114515
PR rtl-optimization/114575
PR rtl-optimization/114996
PR rtl-optimization/115104
* Makefile.in (OBJS): Add late-combine.o.
* common.opt (flate-combine-instructions): New option.
* doc/invoke.texi: Document it.
* opts.cc (default_options_table): Enable it by default at -O2
and above.
* tree-pass.h (make_pass_late_combine): Declare.
* late-combine.cc: New file.
* passes.def: Add two instances of late_combine.
* doc/passes.texi: Document the new passes.
* config/i386/i386-options.cc (ix86_override_options_after_change):
Disable late-combine by default.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise.
* config/xtensa/xtensa.cc (xtensa_option_override): Likewise.

gcc/testsuite/
PR rtl-optimization/106594
* gcc.dg/ira-shrinkwrap-prep-1.c: Restrict XFAIL to non-aarch64
targets.
* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
* gcc.dg/stack-check-4.c: Add -fno-shrink-wrap.
* gcc.target/aarch64/bitfield-bitint-abi-align16.c: Add
-fno-late-combine-instructions.
* gcc.target/aarch64/bitfield-bitint-abi-align8.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_3.c: Remove XFAILs.
* gcc.target/aarch64/sve/cond_convert_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fabd_5.c: Likewise.
* gcc.target/aarch64/sve/cond_convert_6.c: Expect the MOVPRFX /Zs
described in the comment.
* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
* gcc.target/aarch64/pr106594_1.c: New test.

rtl-ssa: Rework _ignoring interfaces

rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set.  These searches are currently
used for two main things:

- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber

The exclusion set was originally a callback function that returned
true for insns that should be ignored.  However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.

This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions.  Currently the
only two member functions are:

- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality

but more could be added later.

Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious).  The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.

Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names.  The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".

gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.  Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.  Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.  Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes.  Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes.  Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.