git.ipfire.org Git - thirdparty/gcc.git/log

c++: -Wtemplate-body and tentative parsing [PR120575]

Here we were asserting non-zero errorcount, which is not the case if the
parse error was reduced to a warning (or silenced) in a template body. So
check seen_error instead.

PR c++/120575
PR c++/116064

gcc/cp/ChangeLog:

* parser.cc (cp_parser_abort_tentative_parse): Check seen_error
instead of errorcount.

gcc/testsuite/ChangeLog:

* g++.dg/template/permissive-error3.C: New test.

RISC-V: Add test for vec_duplicate + vsadd.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check test for vec_duplicate + vsadd.vv combine to
vsadd.vx, with the GR2VR cost is 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsadd.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check and run test for vec_duplicate + vsadd.vv
combine to vsadd.vx, with the GR2VR cost is 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vsadd.vv to vsadd.vx on GR2VR cost

This patch would like to combine the vec_duplicate + vsadd.vv to the
vsadd.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_SAT_S_ADD(T, UT, MIN, MAX) \
  T                                      \
  test_##T##_sat_add (T x, T y)          \
  {                                      \
    T sum = (UT)x + (UT)y;               \
    return (x ^ y) < 0                   \
      ? sum                              \
      : (sum ^ x) >= 0                   \
        ? sum                            \
        : x < 0 ? MIN : MAX;             \
  }

  DEF_SAT_S_ADD(int32_t, uint32_t, INT32_MIN, INT32_MAX)
  DEF_VX_BINARY_CASE_2_WRAP(T, SAT_S_ADD_FUNC(T), sat_add)

Before this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma
  13   │     vmv.v.x v2,a2
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vsadd.vv v1,v1,v2
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vsadd.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add
new case SS_PLUS.
(expand_vx_binary_vec_vec_dup): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op ss_plus.

Signed-off-by: Pan Li <pan2.li@intel.com>

MAINTAINERS: Add myself as an aarch64 port reviewer

Following on from the announcement here:
https://gcc.gnu.org/pipermail/gcc/2025-July/246267.html
adding myself as an aarch64 port reviewer.

ChangeLog:

* MAINTAINERS (Reviewers): Add myself for the aarch64 port.

tree-optimization/120944 - bogus VN with volatile copies

The following avoids translating expressions through volatile
copies.

PR tree-optimization/120944
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Gate optimizations
invalid when volatile is involved.

* gcc.dg/torture/pr120944.c: New testcase.

Ada: Switch from ACATS 2.6 to ACATS 4.2 testsuite

This effectively adds 250 new tests, i.e. around 10% more tests.

gcc/ada/
* gcc-interface/Make-lang.in (ACATSDIR): Change to acats-4.

ada: Fix alignment violation for chain of aligned and misaligned composite types

This happens when aggressive optimizations are enabled (i.e. -O2 and above)
because the ivopts pass fails to properly mark the new memory accesses it is
creating as misaligned by means of the build_aligned_type function.

gcc/ada/ChangeLog:

* gcc-interface/utils.cc (make_packable_type): Clear the TYPE_PACKED
flag in the case where the alignment is bumped.

ada: Remove strange elaboration code generated for Cluster type in System.Pack_NN

Initialization procedures are turned into functions under the hood and, even
when they are null (empty), the compiler may generate a convoluted sequence
of instructions that return uninitialized data and, therefore, is useless.

gcc/ada/ChangeLog:

* gcc-interface/trans.cc (Subprogram_Body_to_gnu): Do not generate
a block-copy out for a null initialization procedure when the _Init
parameter is not passed in.

ada: Disable previous change for enumeration types

The debugger cannot correctly interpret the return value in this case.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Only apply the
transformation to integer types.

ada: Add missing guards to previous change

We need to make sure that an integer type exists for the given size.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Add guards.

ada: Improve code generated for return of Out parameter with access type

The second problem occurs on 64-bit platforms where there is a second Out
parameter that is smaller than the access parameter, creating a hole in the
return structure.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): In the case of a
subprogram using the Copy-In/Copy-Out mechanism, deal specially with
the case of 2 parameters of differing sizes.
* gcc-interface/trans.cc (Subprogram_Body_to_gnu): In the case of a
subprogram using the Copy-In/Copy-Out mechanism, make sure the types
are consistent on the two sides for all the parameters.

ada: Do not generate incorrect warning about redundant type conversion

If -gnatwr is enabled, then in some cases a type conversion between two
different Boolean types incorrectly results in a warning that the conversion
is redundant.

gcc/ada/ChangeLog:

* sem_res.adb (Resolve_Type_Conversion): Replace code for
detecting a similar case with a more comprehensive test.

ada: Pragma Short_Circuit_And_Or

Improve documentation of pragma Short_Circuit_And_Or.
Also disallow renamings, because the semantics as currently
implemented is confusing.

gcc/ada/ChangeLog:

* doc/gnat_rm/implementation_defined_pragmas.rst
(Short_Circuit_And_Or): Add more documentation.
* sem_ch8.adb (Analyze_Subprogram_Renaming):
Disallow renamings.
* gnat_rm.texi: Regenerate.

ada: Fix selection of Finalize subprogram in untagged case

The newly introduced Finalizable aspect makes it possible to derive from
a type that is not tagged but has a Finalize primitive. This patch fixes
problems where overridings of the Finalize primitive were ignored.

gcc/ada/ChangeLog:

* exp_ch7.adb (Make_Final_Call): Tweak search of Finalize primitive.
* exp_util.adb (Finalize_Address): Likewise.

ada: Fix inefficient Unchecked_Conversion to large array type

We fail to use the implementation permission given by RM 13.9(12) because
the array type does not have the Size_Known_At_Compile_Time flag set.

gcc/ada/ChangeLog:

* freeze.adb (Check_Compile_Time_Size): Try harder to see whether
the bounds of array types are known at compile time.

ada: Fix style in comment

Cleanup; technical commit meant to trigger a GNAT continuous builder.

gcc/ada/ChangeLog:

* sem_aux.ads (First_Discriminant): Remove space before period.

ada: Missing component clause warning for discriminant of Unchecked_Union type

Even when -gnatw.c is enabled, no warning about a missing component clause
should be generated if the placement of a discriminant of an Unchecked_Union
type is left unspecified in a record representation clause (such a discriminant
occupies no storage). In determining whether to generate such a warning, in
some cases the compiler would incorrectly ignore an Unchecked_Union pragma
occurring after the record representation clause. This could result in a
spurious warning.

gcc/ada/ChangeLog:

* sem_ch13.adb (Analyze_Record_Representation_Clause): In deciding
whether to generate a warning about a missing component clause, in
addition to calling Is_Unchecked_Union also call a new local
function, Unchecked_Union_Pragma_Pending, which checks for the
case of a not-yet-analyzed Unchecked_Union pragma occurring later
in the declaration list.

ada: Improved error message when size of descendant type exceeds Size'Class limit

Improve the error message that is generated when the size of tagged type
exceeds a Size'Class limit specified for an ancestor type.

gcc/ada/ChangeLog:

* mutably_tagged.adb (Make_CW_Size_Compile_Check): Include the
value of the Size'Class limit in the message generated via a
Compile_Time_Error pragma.

ada: Remove leftover from rework of aspect representation

This patch removes some comments and object definitions that referred to
a hacky use of the Entity field that had been removed by the latest
rework of the internal representation of aspects.

gcc/ada/ChangeLog:

* sem_ch13.adb (Check_Aspect_At_Freeze_Point): Remove obsolete bits.

ada: Fix error on Designated_Storage_Model with extensions disabled

The format string used for the error in that case requires setting the
Error_Msg_Name_1 global variable. This was not done so this patch adds
the missing assignment.

gcc/ada/ChangeLog:

* sem_ch13.adb (Analyze_Aspect_Specifications): Fix error emission.

Stop doing GCC 12 snapshots

In preparation for the final release from the GCC 12 branch stop
doing snapshots from it.

maintainer-scripts/
* crontab: Stop doing GCC 12 snapshots.

Regenerate common.opt.urls and add period into common.opt

gcc/ChangeLog:

* common.opt: Add period.
* common.opt.urls: Regenerate.

tree-optimization/120927 - 510.parest_r segfault with masked epilog

The following fixes bad alignment computaton for epilog vectorization
when as in this case for 510.parest_r and masked epilog vectorization
with AVX512 we end up choosing AVX to vectorize the main loop and
masked AVX512 (sic!) to vectorize the epilog.  In that case alignment
analysis for the epilog tries to force alignment of the base to 64,
but that cannot possibly help the epilog when the main loop had used
a vector mode with smaller alignment requirement.

There's another issue, that the check whether the step preserves
alignment needs to consider possibly previously involved VFs
(here, the main loops smaller VF) as well.

These might not be the only case with problems for such a mode mix
but at least there it seems wise to never use DR alignment forcing
when analyzing an epilog.

We get to chose this mode setup because the iteration over epilog
modes doesn't prevent this, the maybe_ge (cached_vf_per_mode[0],
first_vinfo_vf) skip is conditional on !supports_partial_vectors
and it is also conditional on having a cached VF.  Further nothing
in vect_analyze_loop_1 rejects this setup - it might be conceivable
that a target can do masking only for larger modes.  There is a
second reason we end up with this mode setup, which is that
vect_need_peeling_or_partial_vectors_p says we do not need
peeling or partial vectors when analyzing the main loop with
AVX512 (if it would say so we'd have chosen a masked AVX512
epilog-only vectorization).  It does that because it looks at
LOOP_VINFO_COST_MODEL_THRESHOLD (which is not yet computed, so
always zero at this point), and compares max_niter (5) against
the VF (8), but not with equality as the comment says but with
greater.  This also needs looking at, PR120939.

PR tree-optimization/120927
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Do not force a DRs base alignment when analyzing an
epilog loop.  Check whether the step preserves alignment
for all VFs possibly involved sofar.

* gcc.dg/vect/vect-pr120927.c: New testcase.
* gcc.dg/vect/vect-pr120927-2.c: Likewise.

c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]

The following testcase is miscompiled with -fsanitize=undefined but we
introduce UB into the IL even without that flag.

The optimization ptr +- (expr +- cst) when expr/cst have undefined
overflow into (ptr +- cst) +- expr is sometimes simply not valid,
without careful analysis on what ptr points to we don't know if it
is valid to do (ptr +- cst) pointer arithmetics.
E.g. on the testcase, ptr points to start of an array (actually
conditionally one or another) and cst is -1, so ptr - 1 is invalid
pointer arithmetics, while ptr + (expr - 1) can be valid if expr
is at runtime always > 1 and smaller than size of the array ptr points
to + 1.

Unfortunately, removing this 1992-ish optimization altogether causes
FAIL: c-c++-common/restrict-2.c  -Wc++-compat   scan-tree-dump-times lim2 "Moving statement" 11
FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump ch2 "is now do-while loop"
FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump-times ch2 "  if " 3
FAIL: gcc.dg/vect/pr57558-2.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/pr57558-2.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
regressions (restrict-2.c also for C++ in all std modes).  I've been thinking
about some match.pd optimization for signed integer addition/subtraction of
constant followed by widening integral conversion followed by multiplication
or left shift, but that wouldn't help 32-bit arches.

So, instead at least for now, the following patch keeps doing the
optimization, just doesn't perform it in pointer arithmetics.
pointer_int_sum itself actually adds the multiplication by size_exp,
so ptr + expr is turned into ptr p+ expr * size_exp,
so this patch will try to optimize
ptr + (expr +- cst)
into
ptr p+ ((sizetype)expr * size_exp +- (sizetype)cst * size_exp)
and
ptr - (expr +- cst)
into
ptr p+ -((sizetype)expr * size_exp +- (sizetype)cst * size_exp)

2025-07-04  Jakub Jelinek  <jakub@redhat.com>

PR c/120837
* c-common.cc (pointer_int_sum): Rewrite the intop PLUS_EXPR or
MINUS_EXPR optimization into extension of both intop operands,
their separate multiplication and then addition/subtraction followed
by rest of pointer_int_sum handling after the multiplication.

* gcc.dg/ubsan/pr120837.c: New test.

testsuite: Rename a test

I mistyped the file name :(.

gcc/testsuite/ChangeLog:

PR target/120807
* gcc.c-torture/compile/pr120708.c: Rename to ...
* gcc.c-torture/compile/pr120807.c: ... here.

LoongArch: Prevent subreg of subreg in CRC

The register_operand predicate can match subreg, then we'd have a subreg
of subreg and it's invalid. Use lowpart_subreg to avoid the nested
subreg.

gcc/ChangeLog:

* config/loongarch/loongarch.md (crc_combine): Avoid nested
subreg.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr120708.c: New test.

[RISC-V] Add basic instrumentation to fusion detection

We were looking to evaluate some changes from Artemiy that improve GCC's
ability to discover fusible instruction pairs.  There was no good way to get
any static data out of the compiler about what kinds of fusions were happening.
Yea, you could grub around the .sched dumps looking for the magic '+'
annotation, then look around at the slim RTL representation and make an
educated guess about what fused.  But boy that was inconvenient.

All we really needed was a quick note in the dump file that the target hook
found a fusion pair and what kind was discovered.  That made it easy to spot
invalid fusions, evaluate the effectiveness of Artemiy's work, write/discover
testcases for existing fusions and implement new fusions.

So from a codegen standpoint this is NFC, it only affects dump file output.

It's gone through the usual testing and I'll wait for pre-commit CI to churn
through it before moving forward.

gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Add basic
instrumentation to all cases where fusion is detected.  Fix
minor formatting goofs found in the process.

RISC-V: Add testcases for signed scalar SAT_ADD IMM form 2

This patch adds testcase for form2, as shown below:

T __attribute__((noinline))                                  \
sat_s_add_imm_##T##_fmt_2##_##INDEX (T x)                    \
{                                                            \
  T sum = (T)((UT)x + (UT)IMM);                                   \
  return ((x ^ sum) < 0 && (x ^ IMM) >= 0) ?                 \
    (-(T)(x < 0) ^ MAX) : sum;                         \
}

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_arith.h: Add signed scalar SAT_ADD IMM form2.
* gcc.target/riscv/sat/sat_s_add_imm-2-i16.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-2-i32.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-2-i64.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-2-i8.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-run-2-i16.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-run-2-i32.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-run-2-i64.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm-run-2-i8.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i16.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i32.c: New test.
* gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i8.c: New test.

Match: Support for signed scalar SAT_ADD IMM form 2

This patch would like to support signed scalar SAT_ADD IMM form 2

Form2:
T __attribute__((noinline))                                  \
sat_s_add_imm_##T##_fmt_2##_##INDEX (T x)                    \
{                                                            \
  T sum = (T)((UT)x + (UT)IMM);                                   \
  return ((x ^ sum) < 0 && (x ^ IMM) >= 0) ?                 \
    (-(T)(x < 0) ^ MAX) : sum;                         \
}

Take below form1 as example:
DEF_SAT_S_ADD_IMM_FMT_2(0, int8_t, uint8_t, 9, INT8_MIN, INT8_MAX)

Before this patch:
__attribute__((noinline))
int8_t sat_s_add_imm_int8_t_fmt_2_0 (int8_t x)
{
  int8_t sum;
  unsigned char x.0_1;
  unsigned char _2;
  signed char _3;
  signed char _4;
  _Bool _5;
  signed char _6;
  int8_t _7;
  int8_t _10;
  signed char _11;
  signed char _13;
  signed char _14;

  <bb 2> [local count: 1073741822]:
  x.0_1 = (unsigned char) x_8(D);
  _2 = x.0_1 + 9;
  sum_9 = (int8_t) _2;
  _3 = x_8(D) ^ sum_9;
  _4 = x_8(D) ^ 9;
  _13 = ~_3;
  _14 = _4 | _13;
  if (_14 >= 0)
    goto <bb 3>; [59.00%]
  else
    goto <bb 4>; [41.00%]

  <bb 3> [local count: 259738146]:
  _5 = x_8(D) < 0;
  _11 = (signed char) _5;
  _6 = -_11;
  _10 = _6 ^ 127;

  <bb 4> [local count: 1073741824]:
  # _7 = PHI <sum_9(2), _10(3)>
  return _7;

}

After this patch:
__attribute__((noinline))
int8_t sat_s_add_imm_int8_t_fmt_2_0 (int8_t x)
{
  int8_t _7;

  <bb 2> [local count: 1073741824]:
  _7 = .SAT_ADD (x_8(D), 9); [tail call]
  return _7;

}

The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>
gcc/ChangeLog:

* match.pd: Add signed scalar SAT_ADD IMM form2 matching.

Daily bump.

libstdc++: fix bits/version.def typo

bits/version.def was missing a ';'.

libstdc++-v3/Changelog:
* include/bits/version.def: Fix typo.
* include/bits/version.h: Rebuilt.

c++: trivial lambda pruning [PR120716]

In this testcase there is nothing in the lambda except a static_assert which
mentions a variable from the enclosing scope but does not odr-use it, so we
want prune_lambda_captures to remove its capture. Since the lambda is so
empty, there's nothing in the body except the DECL_EXPR of the capture
proxy, so pop_stmt_list moves that into the enclosing STATEMENT_LIST and
passes the 'body' STATEMENT_LIST to free_stmt_list. As a result, passing
'body' to prune_lambda_captures is wrong; we should instead pass the
enclosing scope, i.e. cur_stmt_list.

PR c++/120716

gcc/cp/ChangeLog:

* lambda.cc (finish_lambda_function): Pass cur_stmt_list to
prune_lambda_captures.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-constexpr3.C: New test.
* g++.dg/cpp0x/lambda/lambda-constexpr3a.C: New test.

c++: ICE with 'this' in lambda signature [PR120748]

This testcase was crashing from infinite recursion in the diagnostic
machinery, trying to print the lambda signature, which referred to the
__this capture field in the lambda, which wanted to print the lambda again.

But we don't want the signature to refer to the capture field; 'this' in an
unevaluated context refers to the 'this' from the enclosing function, not
the capture.

After fixing that, we still wrongly rejected the B case because
THIS_FORBIDDEN is set in a default (template) argument. Since we don't
distinguish between THIS_FORBIDDEN being set for a default argument and it
being set for a static member function, let's just ignore it if
cp_unevaluated_operand; we'll give a better diagnostic for the static memfn
case in finish_this_expr.

PR c++/120748

gcc/cp/ChangeLog:

* lambda.cc (lambda_expr_this_capture): Don't return a FIELD_DECL.
* parser.cc (cp_parser_primary_expression): Ignore THIS_FORBIDDEN
if cp_unevaluated_operand.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ16.C: New test.
* g++.dg/cpp0x/this1.C: Adjust diagnostics.

c++: Fix a pasto in the PR120471 fix [PR120940]

No idea how this slipped in, I'm terribly sorry.
Strangely nothing in the testsuite has caught this, so I've added
a new test for that.

2025-07-03 Jakub Jelinek <jakub@redhat.com>

PR c++/120940
* typeck.cc (cp_build_array_ref): Fix a pasto.

* g++.dg/parse/pr120940.C: New test.
* g++.dg/warn/Wduplicated-branches9.C: New test.

Ada: Remove left-overs of front-end exception mechanism

It was removed from the compiler a few releases ago.

gcc/ada/
* gcc-interface/Makefile.in (gnatlib-sjlj): Delete.
(gnatlib-zcx): Do not modify Frontend_Exceptions constant.
* libgnat/system-linux-loongarch.ads (Frontend_Exceptions): Delete.

s390: More vec-perm-const cases.

s390 missed constant vector permutation cases based on the vector pack
instruction or changing the size of the vector elements during vector
merge. This enables some more patterns that do not need to load a
constant vector for permutation.

gcc/ChangeLog:

* config/s390/s390.cc (expand_perm_with_merge): Add size change cases.
(expand_perm_with_pack): New function.
(vectorize_vec_perm_const_1): Wire up new function.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-perm-merge-1.c: New test.
* gcc.target/s390/vector/vec-perm-pack-1.c: New test.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

Add myself as an aarch64 port reviewer

As mentioned in https://inbox.sourceware.org/gcc/EA828262-8F8F-4362-9CA8-312F7C20E2F9@nvidia.com/T/#m6e7e8e11656189598c759157d5d49cbd0ac9ba7c.
Adding myself as an aarch64 port reviewer.

ChangeLog:

* MAINTAINERS: Add myself as an aarch64 port reviewer.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: Update LWG 4166 changes to concat_view::end() [PR120934]

In r15-4555-gf191c830154565 we proactively implemented the initial
proposed resolution for LWG 4166 which later turned out to be
insufficient, since we must also require equality_comparable of the
underlying iterators before concat_view could be a common range.

This patch implements the updated P/R, requiring all underlying
iterators to be forward (which implies equality_comparable) before
making concat_view common, which fixes the testcase from this PR.

PR libstdc++/120934

libstdc++-v3/ChangeLog:

* include/std/ranges (concat_view::end): Refine condition
for returning an iterator instead of default_sentinel as
per the updated P/R for LWG 4166.
* testsuite/std/ranges/concat/1.cc (test05): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

OpenMP: Add omp_get_initial_device/omp_get_num_devices builtins: Fix test cases

With this fix-up for commit 387209938d2c476a67966c6ddbdbf817626f24a2
"OpenMP: Add omp_get_initial_device/omp_get_num_devices builtins", we progress:

     PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c (test for excess errors)
     PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-not optimized "abort"
    -FAIL: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-times optimized "omp_get_num_devices;" 1
    +PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-times optimized "omp_get_num_devices" 1
     PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump optimized "_1 = __builtin_omp_get_num_devices \\(\\);[\\r\\n]+[ ]+return _1;"

... etc. for offloading configurations.

gcc/testsuite/
* c-c++-common/gomp/omp_get_num_devices_initial_device.c: Fix.
* gfortran.dg/gomp/omp_get_num_devices_initial_device.f90: Likewise.

[RISC-V][PR target/118886] Refine when two insns are signaled as fusion candidates

A number of folks have had their fingers in this code and it's going to take a
few submissions to do everything we want to do.

This patch is primarily concerned with avoiding signaling that fusion can occur
in cases where it obviously should not be signaling fusion.

Every DEC based fusion I'm aware of requires the first instruction to set a
destination register that is both used and set again by the second instruction.
If the two instructions set different registers, then the destination of the
first instruction was not dead and would need to have a result produced.

This is complicated by the fact that we have pseudo registers prior to reload.
So the approach we take is to signal fusion prior to reload even if the
destination registers don't match.  Post reload we require them to match.

That allows us to clean up the code ever-so-slightly.

Second, we sometimes signaled fusion into loads that weren't scalar integer
loads.  I'm not aware of a design that's fusing into FP loads or vector loads.
So those get rejected explicitly.

Third, the store pair "fusion" code is cleaned up a little.  We use fusion to
model store pair commits since the basic properties for detection are the same.
The point where they "fuse" is different.  Also this code liked to "return
false" at each step along the way if fusion wasn't possible.  Future work for
additional fusion cases makes that behavior undesirable.  So the logic gets
reworked a little bit to be more friendly to future work.

Fourth, if we already fused the previous instruction, then we can't fuse it
again.  Signaling fusion in that case is, umm, bad as it creates an atomic blob
of code from a scheduling standpoint.

Hopefully I got everything correct with extracting this work out of a larger
set of changes 🙂  We will contribute some instrumentation & testing code so if
I botched things in a major way we'll soon have a way to test that and I'll be
on the hook to fix any goof's.

From a correctness standpoint this should be a big fat nop.  We've seen this
make measurable differences in pico benchmarks, but obviously as you scale up
to bigger stuff the gains largely disappear into the noise.

This has been through Ventana's internal CI and my tester.  I'll obviously wait
for a verdict from the pre-commit tester.

PR target/118886
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Check
for fusion being disabled earlier.  If PREV is already fused,
then it can't be fused again.  Be more selective about fusing
when the destination registers do not match.  Don't fuse into
loads that aren't scalar integer modes.  Revamp store pair
commit support.

Co-authored-by: Daniel Barboza <dbarboza@ventanamicro.com>
Co-authored-by: Shreya Munnangi <smunnangi1@ventanamicro.com>

testsuite: Fix gcc.dg/ipa/pr120295.c on Solaris

gcc.dg/ipa/pr120295.c FAILs on Solaris:

FAIL: gcc.dg/ipa/pr120295.c (test for excess errors)

Excess errors:
ld: warning: symbol 'glob' has differing types:
(file /var/tmp//ccsDR59c.o type=OBJT; file /lib/libc.so type=FUNC);
/var/tmp//ccsDR59c.o definition taken

Fixed by renaming the glob variable to glob_ to avoid the conflict.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

gcc/testsuite:
* gcc.dg/ipa/pr120295.c (glob): Rename to glob_.

AArch64: make rules for CBZ/TBZ higher priority

Move the rules for CBZ/TBZ to be above the rules for
CBB<cond>/CBH<cond>/CB<cond>. We want them to have higher priority
because they can express larger displacements.

gcc/ChangeLog:

* config/aarch64/aarch64.md (aarch64_cbz<optab><mode>1): Move
above rules for CBB<cond>/CBH<cond>/CB<cond>.
(*aarch64_tbz<optab><mode>1): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cmpbr.c: Update tests.

AArch64: rules for CMPBR instructions

Add rules for lowering `cbranch<mode>4` to CBB<cond>/CBH<cond>/CB<cond> when
CMPBR extension is enabled.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function.
* config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise.
* config/aarch64/aarch64.md (cbranch<mode>4): Rename to ...
(cbranch<GPI:mode>4): ...here, and emit CMPBR if possible.
(cbranch<SHORT:mode>4): New expand rule.
(aarch64_cb<INT_CMP:code><GPI:mode>): New insn rule.
(aarch64_cb<INT_CMP:code><SHORT:mode>): Likewise.
* config/aarch64/constraints.md (Uc0): New constraint.
(Uc1): Likewise.
(Uc2): Likewise.
* config/aarch64/iterators.md (cmpbr_suffix): New mode attr.
(INT_CMP): New code iterator.
(cmpbr_imm_constraint): New code attr.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cmpbr.c:

AArch64: precommit test for CMPBR instructions

Commit the test file `cmpbr.c` before rules for generating the new
instructions are added, so that the changes in codegen are more obvious
in the next commit.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add `cmpbr` to the list of extensions.
* gcc.target/aarch64/cmpbr.c: New test.

AArch64: recognize `+cmpbr` option

Add the `+cmpbr` option to enable the FEAT_CMPBR architectural
extension.

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def (cmpbr): New
option.
* config/aarch64/aarch64.h (TARGET_CMPBR): New macro.
* doc/invoke.texi (cmpbr): New option.

AArch64: make `far_branch` attribute a boolean

The `far_branch` attribute only ever takes the values 0 or 1, so make it
a `no/yes` valued string attribute instead.

gcc/ChangeLog:

* config/aarch64/aarch64.md (far_branch): Replace 0/1 with
no/yes.
(aarch64_bcond): Handle rename.
(aarch64_cbz<optab><mode>1): Likewise.
(*aarch64_tbz<optab><mode>1): Likewise.
(@aarch64_tbz<optab><ALLI:mode><GPI:mode>): Likewise.

AArch64: add constants for branch displacements

Extract the hardcoded values for the minimum PC-relative displacements
into named constants and document them.

gcc/ChangeLog:

* config/aarch64/aarch64.md (BRANCH_LEN_P_1MiB): New constant.
(BRANCH_LEN_N_1MiB): Likewise.
(BRANCH_LEN_P_32KiB): Likewise.
(BRANCH_LEN_N_32KiB): Likewise.

AArch64: rename branch instruction rules

Give the `define_insn` rules used in lowering `cbranch<mode>4` to RTL
more descriptive and consistent names: from now on, each rule is named
after the AArch64 instruction that it generates. Also add comments to
document each rule.

gcc/ChangeLog:

* config/aarch64/aarch64.md (condjump): Rename to ...
(aarch64_bcond): ...here.
(*compare_condjump<GPI:mode>): Rename to ...
(*aarch64_bcond_wide_imm<GPI:mode>): ...here.
(aarch64_cb<optab><mode>): Rename to ...
(aarch64_cbz<optab><mode>1): ...here.
(*cb<optab><mode>1): Rename to ...
(*aarch64_tbz<optab><mode>1): ...here.
(@aarch64_tb<optab><ALLI:mode><GPI:mode>): Rename to ...
(@aarch64_tbz<optab><ALLI:mode><GPI:mode>): ...here.
(restore_stack_nonlocal): Handle rename.
(stack_protect_combined_test): Likewise.
* config/aarch64/aarch64-simd.md (cbranch<mode>4): Likewise.
* config/aarch64/aarch64-sme.md (aarch64_restore_za): Likewise.
* config/aarch64/aarch64.cc (aarch64_gen_test_and_branch): Likewise.

AArch64: reformat branch instruction rules

Make the formatting of the RTL templates in the rules for branch
instructions more consistent with each other.

gcc/ChangeLog:

* config/aarch64/aarch64.md (cbranch<mode>4): Reformat.
(cbranchcc4): Likewise.
(condjump): Likewise.
(*compare_condjump<GPI:mode>): Likewise.
(aarch64_cb<optab><mode>1): Likewise.
(*cb<optab><mode>1): Likewise.
(tbranch_<code><mode>3): Likewise.
(@aarch64_tb<optab><ALLI:mode><GPI:mode>): Likewise.

AArch64: place branch instruction rules together

The rules for conditional branches were spread throughout `aarch64.md`.
Group them together so it is easier to understand how `cbranch<mode>4`
is lowered to RTL.

gcc/ChangeLog:

* config/aarch64/aarch64.md (condjump): Move.
(*compare_condjump<GPI:mode>): Likewise.
(aarch64_cb<optab><mode>1): Likewise.
(*cb<optab><mode>1): Likewise.
(tbranch_<code><mode>3): Likewise.
(@aarch64_tb<optab><ALLI:mode><GPI:mode>): Likewise.

libstdc++: construct bitset from string_view (P2697) [PR119742]

Add a bitset constructor from string_view, per P2697. Fix existing
tests that would fail to detect incorrect exception behavior.

Argument checks that result in exceptions guarded by "#if HOSTED"
are made unguarded because the functions called to throw just call
terminate() in free-standing builds. Improve readability in Doxygen
comments. Generalize a private member argument-checking function
to work with string and string_view without mentioning either,
obviating need for guards.

The version.h symbol is not "hosted" because string_view, though
not specified to be available in free-standing builds, is defined
there and the feature is useful there.

libstdc++-v3/ChangeLog:
PR libstdc++/119742
* include/bits/version.def: Add preprocessor symbol.
* include/bits/version.h: Add preprocessor symbol.
* include/std/bitset: Add constructor.
* testsuite/20_util/bitset/cons/1.cc: Fix.
* testsuite/20_util/bitset/cons/6282.cc: Fix.
* testsuite/20_util/bitset/cons/string_view.cc: Test new ctor.
* testsuite/20_util/bitset/cons/string_view_wide.cc: Test new ctor.

tree-optimization/120780: Support object size for containing objects

MEM_REF cast of a subobject to its containing object has negative
offsets, which objsz sees as an invalid access. Support this use case
by peeking into the structure to validate that the containing object
indeed contains a type of the subobject at that offset and if present,
adjust the wholesize for the object to allow the negative offset.

gcc/ChangeLog:

PR tree-optimization/120780
* tree-object-size.cc (inner_at_offset,
get_wholesize_for_memref): New functions.
(addr_object_size): Call get_wholesize_for_memref.

gcc/testsuite/ChangeLog:

PR tree-optimization/120780
* gcc.dg/builtin-dynamic-object-size-pr120780.c: New test case.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

x86: Emit label only for __mcount_loc section

commit ecc81e33123d7ac9c11742161e128858d844b99d
Author: Andi Kleen <ak@linux.intel.com>
Date:   Fri Sep 26 04:06:40 2014 +0000

    Add direct support for Linux kernel __fentry__ patching

emitted a label, 1, for __mcount_loc section:

1: call mcount
.section __mcount_loc, "a",@progbits
.quad 1b
.previous

If __mcount_loc wasn't used, we got an unused label.  Update
x86_function_profiler to emit label only when __mcount_loc section
is used.

gcc/

PR target/120936
* config/i386/i386.cc (x86_print_call_or_nop): Add a label
argument and use it to print label.
(x86_function_profiler): Emit label only when __mcount_loc
section is used.

gcc/testsuite/

PR target/120936
* gcc.target/i386/pr120936-1.c: New test
* gcc.target/i386/pr120936-2.c: Likewise.
* gcc.target/i386/pr120936-3.c: Likewise.
* gcc.target/i386/pr120936-4.c: Likewise.
* gcc.target/i386/pr120936-5.c: Likewise.
* gcc.target/i386/pr120936-6.c: Likewise.
* gcc.target/i386/pr120936-7.c: Likewise.
* gcc.target/i386/pr120936-8.c: Likewise.
* gcc.target/i386/pr120936-9.c: Likewise.
* gcc.target/i386/pr120936-10.c: Likewise.
* gcc.target/i386/pr120936-11.c: Likewise.
* gcc.target/i386/pr120936-12.c: Likewise.
* gcc.target/i386/pr93492-3.c: Updated.
* gcc.target/i386/pr93492-5.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Add -Wauto-profile warning

this patch adds new warning -Wauto-profile which warns about mismatches between
profile data and function bodies. This is implemented during the offline pass
where every function instance is compared with actual gimple body (if
available) and we verify that the statement locations in the profile data can
be matched with statements in the function.

Currently it is mostly useful to find bugs, but eventually I hope it will be
useful for users to verify that auto-profile works as expected or to evaulate
how much of an old auto-profile data can still be applied to current sources.
There will probably be always some side cases we can not handle with
auto-profile format (such as function with bodies in mutlple files) that can be
patched in compiled program.

I also added logic to fix up missing discriminators in the function callsites.
I am not sure how those happens (but seem to go away with -fno-crossjumping)
and will dig into it.

Ohter problem is that without -flto at the train run inlined functions have
dwarf names rather than symbol names. LLVM solves this by
-gdebug-for-autoprofile flag that we could also have. With this flag we could
output assembler names as well as multiplicities of statemnets.

Building SPECint there are approx 7k profile mismatches.

Bootstrapped/regtested x86_64-linux. Plan to commit it after some extra testing.

gcc/ChangeLog:

* auto-profile.cc (get_combined_location): Handle negative
offsets; output better diagnostics.
(get_relative_location_for_locus): Reutrn -1 for unknown location.
(function_instance::get_cgraph_node): New member function.
(match_with_target): New function.
(dump_stmt): New function.
(function_instance::lookup_count): New function.
(mark_expr_locations): New function.
(function_instance::match): New function.
(autofdo_source_profile::offline_external_functions): Do
not repeat renaming; manage two worklists and do matching.
(autofdo_source_profile::offline_unrealized_inlines): Simplify.
(afdo_set_bb_count): do not look for lost discriminators.
(auto_profile): Do not ICE when profile reading failed.
* common.opt (Wauto-profile): New warning flag
* doc/invoke.texi (-Wauto-profile): Document.

Make inliner loop hints more agressive

This patch makes loop inline hints more agressive. If we know iteration
count or stride, we currently assume improvement in time relative to
preheader count. I changed it to header count, since this knowledge
is supposed to likely help unrolling and vectorizing which brings
benefits relative to that.

* ipa-fnsummary.cc (analyze_function_body): For loop
heuristics use header count instead of preheader count.

Fix division by zero in ipa-cp.cc:update_profiling_info

This ICE has triggered for me during autoprofiledbootstrap. The
code already takes into care possible range, so I think in this case
we can just push to one side of it.

Bootstrapped/regtesed x86_64-linux, OK?

gcc/ChangeLog:

* ipa-cp.cc (update_profiling_info): Watch for division by zero.

Fortran: Remove corank conformability checks [PR120843]

Remove the checks on coranks conformability in expressions,
because there is nothing in the standard about it. When a coarray
has no coindexes it it treated like a non-coarray, when it has
a full-corank coindex its result is a regular array. So nothing
to check for corank conformability.

PR fortran/120843

gcc/fortran/ChangeLog:

* resolve.cc (resolve_operator): Remove conformability check,
because it is not in the standard.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/coindexed_6.f90: Enhance test to have
coarray components covered.

libstdc++: Fix regression in std::uninitialized_fill for C++98 [PR120931]

A typo in r15-4473-g3abe751ea86e34 made it ill-formed to use
std::uninitialized_fill with iterators that aren't pointers (or pointers
wrapped in our __normal_iterator) if the value type is a narrow
character type.

libstdc++-v3/ChangeLog:

PR libstdc++/120931
* include/bits/stl_uninitialized.h (__uninitialized_fill<true>):
Fix typo resulting in call to __do_uninit_copy instead of
__do_uninit_fill.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/120931.cc:
New test.

aarch64: Drop const_int from aarch64_maskload_else_operand

The "else operand" to maskload should always be a const_vector, never a
const_int.

This was just an issue I noticed while looking through the code, I don't
have a testcase which shows a concrete problem due to this.

Testing of that change alone showed ICEs with load lanes vectorization
and SVE. That turned out to be because the backend pattern was missing
a mode for the else operand (causing the middle-end to choose a
const_int during expansion), fixed thusly. That in turn exposed an
issue with the unpredicated load lanes expander which was using the
wrong mode for the else operand, so fixed that too.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(vec_load_lanes<mode><vsingle>): Expand else operand in
subvector mode, as per optab documentation.
(vec_mask_load_lanes<mode><vsingle>): Add missing mode for
operand 3.
* config/aarch64/predicates.md (aarch64_maskload_else_operand):
Remove const_int.

doc: Clarify mode of else operand for vec_mask_load_lanesmn

This extends the documentation of the vec_mask_load_lanes<m><n> optab to
explicitly state that the mode of the else operand is n, i.e. the mode
of a single subvector.

gcc/ChangeLog:

* doc/md.texi (Standard Names): Clarify mode of else operand for
vec_mask_load_lanesmn optab.

Enable ipa-cp cloning for cold wrappers of hot functions

ipa-cp cloning disables itself for all functions not passing opt_for_fn
(node->decl, optimize_size) which disables it for cold wrappers of hot
functions where we want to propagate. Since we later want to time saved
to be considered hot, we do not need to make this early test.

The patch also fixes few other places where AFDO 0 disables ipa-cp.

gcc/ChangeLog:

* ipa-cp.cc (cs_interesting_for_ipcp_p): Handle
correctly GLOBAL0 afdo counts.
(ipcp_cloning_candidate_p): Do not rule out nodes
!node->optimize_for_size_p ().
(good_cloning_opportunity_p): Handle afdo counts
as non-zero.

Fix overlfow in ipa-cp heuristics

ipa-cp converts sreal times to int, while point of sreal is to accomodate very
large values that can happen for loops with large number of iteraitons and also
when profile is inconsistent. This happens with afdo in testsuite where loop
preheader is estimated to have 0 excutions while loop body has large number of
executions.

Bootstrapped/regtesed x86_64-linux, comitted.

gcc/ChangeLog:

* ipa-cp.cc (hint_time_bonus): Return sreal and avoid
conversions to integer.
(good_cloning_opportunity_p): Avoid sreal to integer
conversions
(perform_estimation_of_a_value): Update.

Auto-FDO/FDO profile comparator

the patch I sent from airport only worked if you produced the gcda files with
unpatched compiler.  For some reason auto-profile reading is interwinded into
gcov reading which is not necessary.  Here is cleaner version which also
makes the format bit more convenient.  One can now grep as:

grep "bb.*fdo.*very hot.*cold" *.profile | sort -n -k 5 -r | less

digits_2/30 bb 307 fdo 10273284651 (very hot) afdo 0 (auto FDO) (cold)  scaled 0 diff -10273284651, -100.00%
digits_2/30 bb 201 fdo 2295561442 (very hot) afdo 19074 (auto FDO) (cold)  scaled 1341585 diff -2294219857, -99.94%
digits_2/30 bb 203 fdo 1236123372 (very hot) afdo 9537 (auto FDO) (cold)  scaled 670792 diff -1235452580, -99.95%
digits_2/30 bb 200 fdo 1236123372 (very hot) afdo 9537 (auto FDO) (cold)  scaled 670792 diff -1235452580, -99.95%
digits_2/30 bb 202 fdo 1059438070 (very hot) afdo 9537 (auto FDO) (cold)  scaled 670792 diff -1058767278, -99.94%
new_solver/9 bb 246 fdo 413879041 (very hot) afdo 76594 (guessed) (cold)  scaled 5387299 diff -408491742, -98.70%
new_solver/9 bb 167 fdo 413792205 (very hot) afdo 76594 (guessed) (cold)  scaled 5387299 diff -408404906, -98.70%
new_solver/9 bb 159 fdo 387809230 (very hot) afdo 57182 (guessed) (cold)  scaled 4021940 diff -383787290, -98.96%
new_solver/9 bb 158 fdo 387809230 (very hot) afdo 60510 (guessed) (cold)  scaled 4256018 diff -383553212, -98.90%
new_solver/9 bb 138 fdo 387809230 (very hot) afdo 40917 (guessed) (cold)  scaled 2877929 diff -384931301, -99.26%
new_solver/9 bb 137 fdo 387809230 (very hot) afdo 43298 (guessed) (cold)  scaled 3045398 diff -384763832, -99.21%

This dumps basic blocks that do have large counts by normal profile feedback
but autofdo gives them small count (so they get cold).  These seems to be
indeed mostly basic blocks controlling loops.

gcc/ChangeLog:

* auto-profile.cc (afdo_hot_bb_threshod): New global
variable.
(maybe_hot_afdo_count_p): New function.
(autofdo_source_profile::read): Do not set up dump file;
set afdo_hot_bb_threshod.
(afdo_annotate_cfg): Handle partial training.
(afdo_callsite_hot_enough_for_early_inline):
Use maybe_hot_afdo_count_p.
(auto_profile_offline::execute): Read autofdo file.
* auto-profile.h (maybe_hot_afdo_count_p): Declare.
(afdo_hot_bb_threshold): Declare.
* coverage.cc (read_counts_file): Also set gcov_profile_info.
(coverage_init): Do not read autofdo file.
* opts.cc (enable_fdo_optimizations): Add autofdo parameter;
do not set flag_branch_probabilities and flag_profile_values
with it.
(common_handle_option): Update.
* passes.cc (finish_optimization_passes): Do not end branch
prob here.
(pass_manager::dump_profile_report): Also mark change after
autofdo pass.
* profile.cc: Include auto-profile.h
(gcov_profile_info): New global variable.
(struct afdo_fdo_record): New struture.
(compute_branch_probabilities): Record afdo profile.
(end_branch_prob): Dump afdo/fdo profile comparsion.
* profile.h (gcov_profile_info): Declarre.
* tree-profile.cc (tree_profiling): Call end_branch_prob
(pass_ipa_tree_profile::gate): Also enable with autoFDO

ada: Fix poor code generated for return of Out parameter with access type

The record type of the return object is unnecessarily given BLKmode.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (type_contains_only_integral_data): Do not
return false only because the type contains pointer data.

ada: Enforce alignment constraint for large Object_Size clauses

The constraint is that the Object_Size must be a multiple of the alignment
in bits. But it's enforced only when the value of the clause is lower than
the Value_Size rounded up to the alignment in bits, not for larger values.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_entity): Use default messages
for errors reported for Object_Size clauses.
(validate_size): Give an error for stand-alone objects of composite
types if the specified size is not a multiple of the alignment.

ada: Fix alignment violation for mix of aligned and misaligned composite types

This happens when the chain of initialization procedures is called on the
subcomponents and causes the creation of temporaries along the way out of
alignment considerations. Now these temporaries are not necessary in the
context and were not created until recently, so this gets rid of them.

gcc/ada/ChangeLog:

* gcc-interface/trans.cc (addressable_p): Add COMPG third parameter.
<COMPONENT_REF>: Do not return true out of alignment considerations
for non-strict-alignment targets if COMPG is set.
(Call_to_gnu): Pass true as COMPG in the call to the addressable_p
predicate if the called subprogram is an initialization procedure.

ada: Fix wrong finalization of constrained subtype of unconstrained array type

This implements the Is_Constr_Array_Subt_With_Bounds flag for allocators.

gcc/ada/ChangeLog:

* gcc-interface/trans.cc (gnat_to_gnu) <N_Allocator>: Allocate the
bounds alongside the data if the Is_Constr_Array_Subt_With_Bounds
flag is set on the designated type.
<N_Free_Statement>: Take into account the allocated bounds if the
Is_Constr_Array_Subt_With_Bounds flag is set on the designated type.

ada: Fix missing error on too large Component_Size not multiple of storage unit

This is a small regression introduced a few years ago.

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_component_type): Validate the
Component_Size like the size of a type only if the component type
is actually packed.

ada: Fix check for elaboration order on subprogram body stubs

Fix an assertion failure occurring when elaboration checks were applied to
subprogram with a separate body.

gcc/ada/ChangeLog:

* sem_elab.adb (Check_Overriding_Primitive): Find early call region
of the subprogram body declaration, not of the subprogram body stub.

ada: More Tbuild cleanup

Remove "Nmake_Assert => ..." on N_Unchecked_Type_Conversion at
gen_il-gen-gen_nodes.adb:473 (was disabled).

This was left over from commit 82a794419a00ea98b68d69b64363ae6746710de9
"Tbuild cleanup".

In addition, the checks for "Is_Composite_Type" in
Tbuild.Unchecked_Convert_To are narrowed to "not Is_Scalar_Type";
that way, useless duplicate unchecked conversions of access types will
be removed as for composite types.

gcc/ada/ChangeLog:

* gen_il-gen-gen_nodes.adb (N_Unchecked_Type_Conversion):
Remove useless Nmake_Assert.
* tbuild.adb (Unchecked_Convert_To):
Narrow the bitfield-related conditions.

ada: Refine sanity check in Insert_Actions

Insert_Actions performs a sanity check when it goes through an
expression with actions while going up the three. That check was not
perfectly right before this patch and spuriously failed when inserting
range checks in some situation. This patch makes the check more robust.

gcc/ada/ChangeLog:

* exp_util.adb (Insert_Actions): Fix check.

ada: Make comment more precise

gcc/ada/ChangeLog:

* exp_ch6.adb (Expand_Ctrl_Function_Call): Precisify comment.

ada: Fix missing finalization with conditional expression in extended return

Declarations of return objects are not (yet) distributed into the dependent
expressions of conditional expressions.

gcc/ada/ChangeLog:

* exp_ch6.adb (Expand_Ctrl_Function_Call): Do not bail out for the
declarations of return objects.

ada: Port System.Stack_Usage to CHERI

This unit performed integer to address conversions to calculate stack addresses
which, on a CHERI target, result in an invalid capability that triggers a
capability tag fault when dereferenced during stack filling. This patch updates
the unit to preserve addresses (capabilities) during the calculations.

The method used to determine the stack base address is also updated to CHERI.
The current method tries to get the stack base from the compiler info for the
current task. If no info is found, then as a fallback it estimates the base by
taking the address of a variable on the stack. This address is then derived to
calculate the range of addresses to fill the stack.

This fallback does not work on CHERI since taking the 'Address of a stack variable
will result in a capability with bounds restricted to that object and attempting to
write outside those bounds triggers a capability bounds fault. Instead, we add a
new function Get_Stack_Base which, on CHERI, gets the exact stack base from the
upper bound of the capability stack pointer (CSP) register. On non-CHERI platforms,
Get_Stack_Base returns the stack base from the compiler info, resulting in the same
behaviour as before on those platforms.

gcc/ada/ChangeLog:

* Makefile.rtl (LIBGNAT_TARGET_PAIRS): New unit s-tsgsba__cheri.adb for morello-freebsd.
* libgnarl/s-tassta.adb (Get_Stack_Base): New function.
* libgnarl/s-tsgsba__cheri.adb: New file for CHERI targets.
* libgnarl/s-tsgsba.adb: New default file for non-CHERI targets.
* libgnat/s-stausa.adb (Fill_Stack, Compute_Result): Port to CHERI.
* libgnat/s-stausa.ads (Initialize_Analyzer, Stack_Analyzer): Port to CHERI.

ada: Improve retrieval of nominal unconstrained type in extended return

To reliably retrieve the nominal unconstrained type of object declared in
extended return statement we need to rely on the Original_Node.

gcc/ada/ChangeLog:

* sem_ch3.adb (Check_Return_Subtype_Indication): Use Original_Node.

ada: Improve retrieval of nominal unconstrained type in extended return

When extended return statement declares object using an explicit subtype
indication, then it is better to recover the original unconstrained type using
the explicit subtype indication. This appears to be necessary for subtypes with
predicates.

gcc/ada/ChangeLog:

* sem_ch3.adb (Check_Return_Subtype_Indication): Use type from
explicit subtype indication, when possible.

ada: Adjust message about statically compatible result subtype

Ada RM 6.5(5.3/5) is about "result SUBTYPE of the function", while the error
message was saying "result TYPE of the function". Now use the exact RM wording
in the error message for this rule.

gcc/ada/ChangeLog:

* sem_ch3.adb (Check_Return_Subtype_Indication): Adjust error message
to match the RM wording.

ada: Fix constraint-related legality checks in extended return statements

Legality checks in extended return statements were (almost) literally
implementing the RM rules, but the when analyzing the return object declaration
we replace the nominal subtype of that object with its constrained subtype.
(It is a bit odd to have such an expansion activity in analysis, but we already
rely on this particular expansion in quite a few places).

gcc/ada/ChangeLog:

* sem_ch3.adb (Check_Return_Subtype_Indication): Use the nominal
subtype of a return object; literally implement the RM rule about
elementary types; check for static subtype compatibility both when
the subtype is given as a subtype mark and a subtype indication.

ada: Fix strange holes for type with variant part reported by -gnatRh

The problem is that the sorting algorithm mixes components of variants.

gcc/ada/ChangeLog:

* repinfo.adb (First_Comp_Or_Discr.Is_Placed_Before): Return True
only if the components are in the same component list.

ada: Fix node copy with functions as actual parameters in dispatching DIC

When dispatching in a Default_Initial_Condition, copying the condition
node crashes if there is a, possibly nested, parameterless function as
actual parameter; there were two issues:
1. Subp_Entity in Check_Dispatching_call was uninitialized, a GNAT SAS
   finding.
2. The controlling argument update logic only tried to propagate the
   update by traversing the actual parameters, leading to a crash in
   case of parameterless functions.
This patch initializes Subp_Entity and allows the update of controlling
argument to succeed even when no traversal happened.

gcc/ada/ChangeLog:

* sem_disp.adb (Check_Dispatching_call): Fix uninitialized Subp_Entity.
* sem_util.adb (Update_Controlling_Argument): No need to replace controlling argument
in case of functions.

ada: Fix minor fallout of latest change

This adjusts the header of the renamed files and adds missing blank lines.

gcc/ada/ChangeLog:

* errid.ads: Adjust header to renaming and fix copyright line.
* errid.adb: Adjust header to renaming and add blank line.
* erroutc-pretty_emitter.ads: Adjust header to renaming.
* erroutc-pretty_emitter.adb: Likewise.
* erroutc-sarif_emitter.ads: Likewise.
* erroutc-sarif_emitter.adb: Likewise.
* errsw.ads: Adjust header to renaming and add blank line.
* errsw.adb: Likewise.
* json_utils.ads: Likewise.
* json_utils.adb: Adjust header to renaming.

ada: Turn diagnostic object from variable to constant

Diagnostic entries are not supposed to be modified while compiling the code.
Code cleanup; behavior is unaffected.

gcc/ada/ChangeLog:

* errid.ads (Diagnostic_Entries): Now a constant.

ada: Remove redundant nested aggregates from diagnostics code

A nested aggregate with a single "others => <>" clause is equivalent to a box
itself. Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* errid.ads (Diagnostic_Entries): Remove nested aggregate.
* errsw.adb (Switches): Likewise.

ada: Fix crash with Finalizable in corner case

The Finalizable aspect introduced controlled types for which not all the
finalization primitives exist. This patch makes Make_Deep_Record_Body
handle this case correctly.

gcc/ada/ChangeLog:

* exp_ch7.adb (Make_Deep_Record_Body): Fix case of absent Initialize
primitive.

ada: Refine subtypes in task-counting code

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* exp_ch3.adb (Count_Default_Sized_Task_Stacks): Refine subtypes of
parameters; same for callsites.

ada: Remove a couple of redundant calls to Set_Etype

The OK_Convert_To function already sets the Etype of its result.

gcc/ada/ChangeLog:

* exp_imgv.adb (Expand_Value_Attribute): Do not call Set_Etype on N
after rewriting it by means of OK_Convert_To.

ada: Fix crash with Finalizable in corner case

Since the introduction of the Finalizable aspect, there can be types
for which Is_Controlled returns True but that don't have all three
finalization primitives. The Generate_Finalization_Actions raised an
exception in that case before this patch, which fixes the problem.

gcc/ada/ChangeLog:

* exp_aggr.adb (Generate_Finalization_Actions): Stop assuming that
initialize primitive exists.

ada: Fix typo in comment

gcc/ada/ChangeLog:

* exp_ch7.adb (Build_Record_Deep_Procs): Fix typo in comment.

ada: Enforce visibility of unit used as a parent instance of a child instance

In cases involving instantiation of a generic child unit, the visibility
of the parent unit was mishandled, allowing the parent to be referenced
in another compilation unit that has visibility of the child instance
but no with_clause for the parent of the instance.

gcc/ada/ChangeLog:

* sem_ch12.adb (Install_Spec): Remove "not Is_Generic_Instance (Par)"
in test for setting Instance_Parent_Unit. Revise comment to no longer
say "noninstance", plus remove "???".
(Remove_Parent): Restructure if_statement to allow for both "elsif"
parts to be executed (by changing them to be separate if_statements
within an "else" part).

ada: Fix comment

This patch fixes a misnaming of Make_Predefined_Primitive_Specs in a
comment.

gcc/ada/ChangeLog:

* exp_ch3.adb (Predefined_Primitive_Bodies): Fix comment.

ada: Cleanup in type support subprograms code

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* exp_tss.adb (TSS): Refactor IF condition to make code smaller.
* lib.adb (Increment_Serial_Number, Synchronize_Serial_Number):
Use type of renamed object when creating renaming.
* lib.ads (Unit_Record): Refine subtype of dependency number.

ada: Fix spurious Constraint_Error raised by 'Value of fixed-point types

This happens for very large Smalls with regard to the size of the mantissa,
because the prerequisites of the implementation used in this case are not
met, although they are documented in the head comment of Integer_To_Fixed.

This change documents them at the beginning of the body of System.Value_F
and adjusts the compiler interface accordingly.

gcc/ada/ChangeLog:

* libgnat/s-valuef.adb: Document the prerequisites more precisely.
* libgnat/a-tifiio.adb (OK_Get_32): Adjust to the prerequisites.
(OK_Get_64): Likewise.
* libgnat/a-tifiio__128.adb (OK_Get_32): Likewise.
(OK_Get_64): Likewise.
(OK_Get_128): Likewise.
* libgnat/a-wtfiio.adb (OK_Get_32): Likewise.
(OK_Get_64): Likewise.
* libgnat/a-wtfiio__128.adb (OK_Get_32): Likewise.
(OK_Get_64): Likewise.
(OK_Get_128): Likewise.
* libgnat/a-ztfiio.adb (OK_Get_32): Likewise.
(OK_Get_64): Likewise.
* libgnat/a-ztfiio__128.adb (OK_Get_32): Likewise.
(OK_Get_64): Likewise.
(OK_Get_128): Likewise.
* exp_imgv.adb (Expand_Value_Attribute): Adjust the conditions under
which the RE_Value_Fixed{32,64,128} routines are called for ordinary
fixed-point types.

ada: Fix comment

This patch fixes a comment that wrongly stated that no dispatch entry
for deep finalize was created for limited tagged types.

gcc/ada/ChangeLog:

* exp_ch3.adb (Make_Predefined_Primitive_Specs): Fix comment.

ada: Fix assertion failure on finalizable aggregate

The Finalizable aspect makes it possible that
Insert_Actions_In_Scope_Around is entered with an empty list of after
actions. This patch fixes a condition that was not quite right in this
case.

gcc/ada/ChangeLog:

* exp_ch7.adb (Insert_Actions_In_Scope_Around): Fix condition.

ada: Remove unnecessary "return;" statements

A "return;" at the end of a procedure is unnecessary and
misleading. This patch removes them.

gcc/ada/ChangeLog:

* checks.adb: Remove unnecessary "return;" statements.
* eval_fat.adb: Likewise.
* exp_aggr.adb: Likewise.
* exp_attr.adb: Likewise.
* exp_ch3.adb: Likewise.
* exp_ch4.adb: Likewise.
* exp_ch5.adb: Likewise.
* exp_ch6.adb: Likewise.
* exp_unst.adb: Likewise.
* krunch.adb: Likewise.
* layout.adb: Likewise.
* libgnat/s-excdeb.adb: Likewise.
* libgnat/s-trasym__dwarf.adb: Likewise.
* par-endh.adb: Likewise.
* par-tchk.adb: Likewise.
* sem.adb: Likewise.
* sem_attr.adb: Likewise.
* sem_ch6.adb: Likewise.
* sem_elim.adb: Likewise.
* sem_eval.adb: Likewise.
* sfn_scan.adb: Likewise.

ada: Correct documentation of policy_identifiers for Assertion_Policy

Follow-on to gnat-945.

Change Ignore to Disable; Ignore is defined by the language,
Disable is the implementation-defined one.

Also minor code cleanup.

gcc/ada/ChangeLog:

* doc/gnat_rm/implementation_defined_characteristics.rst:
Change Ignore to Disable.
* sem_ch13.ads (Analyze_Aspect_Specifications):
Minor: Remove incorrect comment; there is no need to check
Has_Aspects (N) at the call site.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

ada: Remove Empty_Or_Error

Minor stylistic improvement: Remove Empty_Or_Error, and replace
comparisons with Empty_Or_Error with "[not] in Empty | Error".
(Found while working on VAST.)

gcc/ada/ChangeLog:

* types.ads (Empty_Or_Error): Remove.
* atree.adb: Remove reference to Empty_Or_Error.
* par-endh.adb: Likewise.
* sem_ch12.adb: Likewise.
* sem_ch3.adb: Likewise.
* sem_util.adb: Likewise.
* treepr.adb: Likewise.

ada: Call Semantics when analyzing a renamed package

Calling Semantics here will additionally update the reference to
Current_Sem_Unit the renamed unit so that we will not receive
bogus visibility errors when checking for self-referential with-s.

gcc/ada/ChangeLog:

* sem_ch10.adb(Analyze_With_Clause): Call Semantics instead
of Analyze to bring Current_Sem_Unit up to date.

ada: Fix SPARK context discovery from within subunits

When navigating the AST to find the enclosing subprogram we must traverse
from subunits to the corresponding stub.

gcc/ada/ChangeLog:

* lib-xref-spark_specific.adb
(Enclosing_Subprogram_Or_Library_Package): Traverse subunits and body
stubs.