]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
10 months agolibstdc++: Restore unrolling in std::find using pragma [PR116140]
Alex Coplan [Fri, 2 Aug 2024 08:56:07 +0000 (09:56 +0100)] 
libstdc++: Restore unrolling in std::find using pragma [PR116140]

Together with the preparatory compiler patches, this patch restores
unrolling in std::__find_if, but this time relying on the compiler to do
it by using:

  #pragma GCC unroll 4

which should restore the majority of the regression relative to the
hand-unrolled version while still being vectorizable with WIP alignment
peeling enhancements.

On Neoverse V1 with LTO, this reduces the regression in xalancbmk (from
SPEC CPU 2017) from 5.8% to 1.7% (restoring ~71% of the lost
performance).

libstdc++-v3/ChangeLog:

PR libstdc++/116140
* include/bits/stl_algobase.h (std::__find_if): Add #pragma to
request GCC to unroll the loop.

10 months agolto: Stream has_unroll flag during LTO [PR116140]
Alex Coplan [Sat, 3 Aug 2024 17:02:36 +0000 (17:02 +0000)] 
lto: Stream has_unroll flag during LTO [PR116140]

When #pragma GCC unroll is processed in
tree-cfg.cc:replace_loop_annotate_in_block, we set both the loop->unroll
field (which is currently streamed out and back in during LTO) but also
the cfun->has_unroll flag.

cfun->has_unroll, however, is not currently streamed during LTO.  This
patch fixes that.

Prior to this patch, loops marked with #pragma GCC unroll that would be
unrolled by RTL loop2_unroll in a non-LTO compilation didn't get
unrolled under LTO.

gcc/ChangeLog:

PR libstdc++/116140
* lto-streamer-in.cc (input_struct_function_base): Stream in
fn->has_unroll.
* lto-streamer-out.cc (output_struct_function_base): Stream out
fn->has_unroll.

gcc/testsuite/ChangeLog:

PR libstdc++/116140
* g++.dg/ext/pragma-unroll-lambda-lto.C: New test.

10 months agotestsuite: Ensure ltrans dump files get cleaned up properly [PR116140]
Alex Coplan [Thu, 8 Aug 2024 13:15:39 +0000 (13:15 +0000)] 
testsuite: Ensure ltrans dump files get cleaned up properly [PR116140]

I noticed while working on a test that uses LTO and requests a dump
file, that we are failing to cleanup ltrans dump files in the testsuite.

E.g. the test I was working on compiles with -flto
-fdump-rtl-loop2_unroll, and we end up with the following file:

./gcc/testsuite/g++/pr116140.ltrans0.ltrans.287r.loop2_unroll

being left behind by the testsuite.  This is problematic not just from a
"missing cleanup" POV, but also because it can cause the test to pass
spuriously when the test is re-run wtih an unpatched compiler (without
the bug fix).  In the broken case, loop2_unroll isn't run at all, so we
end up scanning the old dumpfile (from the previous test run) and making
the dumpfile scan pass.

Running with `-v -v` in RUNTESTFLAGS we can see the following cleanup
attempt is made:

remove-build-file `pr116140.{C,exe}.{ltrans[0-9]*.,}[0-9][0-9][0-9]{l,i,r,t}.*'

looking again at the ltrans dump file above we can see this will fail for two
reasons:

 - The actual dump file has no {C,exe} extension between the basename and
   ltrans0.
 - The actual dump file has an additional `.ltrans` component after `.ltrans0`.

This patch therefore relaxes the pattern constructed for cleaning up such
dumpfiles to also match dumpfiles with the above form.

Running the testsuite before/after this patch shows the number of files in
gcc/testsuite (in the build dir) with "ltrans" in the name goes from 1416 to 62
on aarch64.

gcc/testsuite/ChangeLog:

PR libstdc++/116140
* lib/gcc-dg.exp (schedule-cleanups): Relax ltrans dumpfile
cleanup pattern to handle missing cases.

10 months agoc++: Ensure ANNOTATE_EXPRs remain outermost expressions in conditions [PR116140]
Alex Coplan [Fri, 2 Aug 2024 08:52:50 +0000 (09:52 +0100)] 
c++: Ensure ANNOTATE_EXPRs remain outermost expressions in conditions [PR116140]

For the testcase added with this patch, we would end up losing the:

  #pragma GCC unroll 4

and emitting "warning: ignoring loop annotation".  That warning comes
from tree-cfg.cc:replace_loop_annotate, and means that we failed to
process the ANNOTATE_EXPR in tree-cfg.cc:replace_loop_annotate_in_block.
That function walks backwards over the GIMPLE in an exiting BB for a
loop, skipping over the final gcond, and looks for any ANNOTATE_EXPRS
immediately preceding the gcond.

The function documents the following pre-condition:

   /* [...] We assume that the annotations come immediately before the
      condition in BB, if any.  */

now looking at the exiting BB of the loop, we have:

  <bb 8> :
  D.4524 = .ANNOTATE (iftmp.1, 1, 4);
  retval.0 = D.4524;
  if (retval.0 != 0)
    goto <bb 3>; [INV]
  else
    goto <bb 9>; [INV]

and crucially there is an intervening assignment between the gcond and
the preceding .ANNOTATE ifn call.  To see where this comes from, we can
look to the IR given by -fdump-tree-original:

  if (<<cleanup_point ANNOTATE_EXPR <first != last && !use_find(short
    int*)::<lambda(short int)>::operator() (&pred, *first), unroll 4>>>)
    goto <D.4518>;
  else
    goto <D.4516>;

here the problem is that we've wrapped a CLEANUP_POINT_EXPR around the
ANNOTATE_EXPR, meaning the ANNOTATE_EXPR is no longer the outermost
expression in the condition.

The CLEANUP_POINT_EXPR gets added by the following call chain:

finish_while_stmt_cond
 -> maybe_convert_cond
 -> condition_conversion
 -> fold_build_cleanup_point_expr

this patch chooses to fix the issue by first introducing a new helper
class (annotate_saver) to save and restore outer chains of
ANNOTATE_EXPRs and then using it in maybe_convert_cond.

With this patch, we don't get any such warning and the loop gets unrolled as
expected at -O2.

gcc/cp/ChangeLog:

PR libstdc++/116140
* semantics.cc (anotate_saver): New. Use it ...
(maybe_convert_cond): ... here, to ensure any ANNOTATE_EXPRs
remain the outermost expression(s) of the condition.

gcc/testsuite/ChangeLog:

PR libstdc++/116140
* g++.dg/ext/pragma-unroll-lambda.C: New test.

10 months agoOpenMP: Add interop routines to omp_runtime_api_procname
Tobias Burnus [Wed, 11 Sep 2024 10:02:24 +0000 (12:02 +0200)] 
OpenMP: Add interop routines to omp_runtime_api_procname

gcc/
* omp-general.cc (omp_runtime_api_procname): Add
omp_get_interop_{int,name,ptr,rc_desc,str,type_desc}
and omp_get_num_interop_properties.

10 months agofortran/openmp.cc: Fix var init and locus use to avoid uninit values [PR fortran...
Tobias Burnus [Wed, 11 Sep 2024 07:25:47 +0000 (09:25 +0200)] 
fortran/openmp.cc: Fix var init and locus use to avoid uninit values [PR fortran/116661]

gcc/fortran/ChangeLog:

PR fortran/116661
* openmp.cc (gfc_match_omp_prefer_type): NULL init a gfc_expr
variable and use right locus in gfc_error.

10 months agoVect: Support form 1 of vector signed integer .SAT_ADD
Pan Li [Wed, 11 Sep 2024 01:54:38 +0000 (09:54 +0800)] 
Vect: Support form 1 of vector signed integer .SAT_ADD

This patch would like to support the vector signed ssadd pattern
for the RISC-V backend.  Aka

Form 1:
  #define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX)           \
  void __attribute__((noinline))                             \
  vec_sat_s_add_##T##_fmt_1 (T *out, T *x, T *y, unsigned n) \
  {                                                          \
    for (unsigned i = 0; i < n; i++)                         \
      {                                                      \
        T sum = (UT)x[i] + (UT)y[i];                         \
        out[i] = (x[i] ^ y[i]) < 0                           \
          ? sum                                              \
          : (sum ^ x[i]) >= 0                                \
            ? sum                                            \
            : x[i] < 0 ? MIN : MAX;                          \
      }                                                      \
  }

DEF_VEC_SAT_S_ADD_FMT_1(int64_t, uint64_t, INT64_MIN, INT64_MAX)

If the backend implemented the vector mode of ssadd, we will see IR diff
similar as below:

Before this patch:
 108   │   _114 = .SELECT_VL (ivtmp_112, POLY_INT_CST [2, 2]);
 109   │   ivtmp_77 = _114 * 8;
 110   │   vect__4.9_80 = .MASK_LEN_LOAD (vectp_x.7_78, 64B, { -1, ...  }, _114, 0);
 111   │   vect__5.10_81 = VIEW_CONVERT_EXPR<vector([2,2]) long unsigned int>(vect__4.9_80);
 112   │   vect__7.13_85 = .MASK_LEN_LOAD (vectp_y.11_83, 64B, { -1, ...  }, _114, 0);
 113   │   vect__8.14_86 = VIEW_CONVERT_EXPR<vector([2,2]) long unsigned int>(vect__7.13_85);
 114   │   vect__9.15_87 = vect__5.10_81 + vect__8.14_86;
 115   │   vect_sum_20.16_88 = VIEW_CONVERT_EXPR<vector([2,2]) long int>(vect__9.15_87);
 116   │   vect__10.17_89 = vect__4.9_80 ^ vect__7.13_85;
 117   │   vect__11.18_90 = vect__4.9_80 ^ vect_sum_20.16_88;
 118   │   mask__46.19_92 = vect__10.17_89 >= { 0, ... };
 119   │   _36 = vect__4.9_80 >> 63;
 120   │   mask__44.26_104 = vect__11.18_90 < { 0, ... };
 121   │   mask__43.27_105 = mask__46.19_92 & mask__44.26_104;
 122   │   _115 = .COND_XOR (mask__43.27_105, _36, { 9223372036854775807, ... }, vect_sum_20.16_88);
 123   │   .MASK_LEN_STORE (vectp_out.29_108, 64B, { -1, ... }, _114, 0, _115);
 124   │   vectp_x.7_79 = vectp_x.7_78 + ivtmp_77;
 125   │   vectp_y.11_84 = vectp_y.11_83 + ivtmp_77;
 126   │   vectp_out.29_109 = vectp_out.29_108 + ivtmp_77;
 127   │   ivtmp_113 = ivtmp_112 - _114;

After this patch:
  94   │   # vectp_x.7_82 = PHI <vectp_x.7_83(6), x_18(D)(5)>
  95   │   # vectp_y.10_86 = PHI <vectp_y.10_87(6), y_19(D)(5)>
  96   │   # vectp_out.14_91 = PHI <vectp_out.14_92(6), out_21(D)(5)>
  97   │   # ivtmp_95 = PHI <ivtmp_96(6), _94(5)>
  98   │   _97 = .SELECT_VL (ivtmp_95, POLY_INT_CST [2, 2]);
  99   │   ivtmp_81 = _97 * 8;
 100   │   vect__4.9_84 = .MASK_LEN_LOAD (vectp_x.7_82, 64B, { -1, ...  }, _97, 0);
 101   │   vect__7.12_88 = .MASK_LEN_LOAD (vectp_y.10_86, 64B, { -1, ...  }, _97, 0);
 102   │   vect_patt_40.13_89 = .SAT_ADD (vect__4.9_84, vect__7.12_88);
 103   │   .MASK_LEN_STORE (vectp_out.14_91, 64B, { -1, ... }, _97, 0, vect_patt_40.13_89);
 104   │   vectp_x.7_83 = vectp_x.7_82 + ivtmp_81;
 105   │   vectp_y.10_87 = vectp_y.10_86 + ivtmp_81;
 106   │   vectp_out.14_92 = vectp_out.14_91 + ivtmp_81;
 107   │   ivtmp_96 = ivtmp_95 - _97;

The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.

gcc/ChangeLog:

* match.pd: Add case 2 for the signed .SAT_ADD consumed by
vect pattern.
* tree-vect-patterns.cc (gimple_signed_integer_sat_add): Add new
matching func decl for signed .SAT_ADD.
(vect_recog_sat_add_pattern): Add signed .SAT_ADD pattern match.

Signed-off-by: Pan Li <pan2.li@intel.com>
10 months agoEnable tune fuse_move_and_alu for GNR.
liuhongt [Tue, 10 Sep 2024 07:04:58 +0000 (15:04 +0800)] 
Enable tune fuse_move_and_alu for GNR.

According to Intel Software Optimization Manual[1], the Redwood cove
microarchitecture supports LD+OP and MOV+OP macro fusions.

The patch enables MOV+OP tune for GNR.

[1] https://www.intel.com/content/www/us/en/content-details/814198/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html

gcc/ChangeLog:

* config/i386/x86-tune.def (X86_TUNE_FUSE_MOV_AND_ALU): Enable
for GNR and GNR-D.

10 months agoRISC-V: Fix asm check for Vector SAT_* due to middle-end change
Pan Li [Tue, 10 Sep 2024 23:00:13 +0000 (07:00 +0800)] 
RISC-V: Fix asm check for Vector SAT_* due to middle-end change

The middle-end change makes the effect on the layout of the assembly
for vector SAT_*.  This patch would like to fix it and make it robust.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: Adjust
asm check and make it robust.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-28.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-30.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-31.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-32.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-28.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-30.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-31.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-32.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-33.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-34.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-35.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-36.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-37.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-38.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-39.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-40.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-9.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
10 months agoDaily bump.
GCC Administrator [Wed, 11 Sep 2024 00:17:46 +0000 (00:17 +0000)] 
Daily bump.

10 months agolibstdc++: Only use std::ios_base_library_init() for ELF [PR116159]
Jonathan Wakely [Tue, 10 Sep 2024 13:36:26 +0000 (14:36 +0100)] 
libstdc++: Only use std::ios_base_library_init() for ELF [PR116159]

The undefined std::ios_base_library_init() symbol that is referenced by
<iostream> is only supposed to be used for targets where symbol
versioning is supported.

The mingw-w64 target defaults to --enable-symvers=gnu due to using GNU
ld but doesn't actually support symbol versioning. This means it tries
to emit references to the std::ios_base_library_init() symbol, which
isn't really defined in the library. This causes problems when using lld
to link user binaries.

Disable the undefined symbol reference for non-ELF targets.

libstdc++-v3/ChangeLog:

PR libstdc++/116159
* include/std/iostream (ios_base_library_init): Only define for
ELF targets.
* src/c++98/ios_init.cc (ios_base_library_init): Likewise.

10 months agolibstdc++: std::string move assignment should not use POCCA trait [PR116641]
Jonathan Wakely [Tue, 10 Sep 2024 13:25:41 +0000 (14:25 +0100)] 
libstdc++: std::string move assignment should not use POCCA trait [PR116641]

The changes to implement LWG 2579 (r10-327-gdb33efde17932f) made
std::string::assign use the propagate_on_container_copy_assignment
(POCCA) trait, for consistency with operator=(const basic_string&).
However, this also unintentionally affected operator=(basic_string&&)
which calls assign(str) to make a deep copy when performing a move is
not possible. The fix is for the move assignment operator to call
_M_assign(str) instead of assign(str), as this just does the deep copy
and doesn't check the POCCA trait first.

The bug only affects the unlikely/useless combination of POCCA==true and
POCMA==false, but we should fix it for correctness anyway. it should
also make move assignment slightly cheaper to compile and execute,
because we skip the extra code in assign(const basic_string&).

libstdc++-v3/ChangeLog:

PR libstdc++/116641
* include/bits/basic_string.h (operator=(basic_string&&)): Call
_M_assign instead of assign.
* testsuite/21_strings/basic_string/allocator/116641.cc: New
test.

10 months agoc++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]
Jakub Jelinek [Tue, 10 Sep 2024 16:32:58 +0000 (18:32 +0200)] 
c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

The following testcase is miscompiled, because
get_member_function_from_ptrfunc
emits something like
(((FUNCTION.__pfn & 1) != 0)
 ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
 : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
or so, so FUNCTION tree is used there 5 times.  There is
if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
but in this case function doesn't have side-effects, just nested ARRAY_REFs.
Now, if all the FUNCTION trees would be shared, it would work fine,
FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
that isn't the case, both the BIT_AND_EXPR shortening and conversion to
bool done for build_conditional_expr actually unshare_expr that first
expression, but none of the other 4 are unshared.  With -fsanitize=bounds,
.UBSAN_BOUNDS calls are added to the ARRAY_REFs and use save_expr to avoid
evaluating the argument multiple times, but because that FUNCTION tree is
first used in the second argument of COND_EXPR (i.e. conditionally), the
SAVE_EXPR initialization is done just there and then the third argument
of COND_EXPR just uses the uninitialized temporary and so does the first
argument computation as well.

The following patch fixes that by doing save_expr even if !TREE_SIDE_EFFECTS,
but to avoid doing that too often only if !nonvirtual and if the expression
isn't a simple decl.

2024-09-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/116449
* typeck.cc (get_member_function_from_ptrfunc): Use save_expr
on instance_ptr and function even if it doesn't have side-effects,
as long as it isn't a decl.

* g++.dg/ubsan/pr116449.C: New test.

10 months agolibstdc++: Add missing exception specifications in tests
Jonathan Wakely [Tue, 10 Sep 2024 15:59:29 +0000 (16:59 +0100)] 
libstdc++: Add missing exception specifications in tests

Since r15-3532-g7cebc6384a0ad6 18_support/new_nothrow.cc fails in C++98 mode because G++
diagnoses missing exception specifications for the user-defined
(de)allocation functions. Add throw(std::bad_alloc) and throw() for
C++98 mode.

Similarly, 26_numerics/headers/numeric/synopsis.cc fails in C++20 mode
because the declarations of gcd and lcm are not noexcept.

libstdc++-v3/ChangeLog:

* testsuite/18_support/new_nothrow.cc (THROW_BAD_ALLOC): Define
macro to add exception specifications for C++98 mode.
(NOEXCEPT): Expand to throw() for C++98 mode.
* testsuite/26_numerics/headers/numeric/synopsis.cc (gcd, lcm):
Add noexcept.

10 months agoc++: mutable temps in rodata [PR116369]
Marek Polacek [Thu, 29 Aug 2024 19:13:03 +0000 (15:13 -0400)] 
c++: mutable temps in rodata [PR116369]

Here we wrongly mark the reference temporary for g TREE_READONLY,
so it's put in .rodata and so we can't modify its subobject even
when the subobject is marked mutable.  This is so since r9-869.
r14-1785 fixed a similar problem, but not in set_up_extended_ref_temp.

PR c++/116369

gcc/cp/ChangeLog:

* call.cc (set_up_extended_ref_temp): Don't mark a temporary
TREE_READONLY if its type is TYPE_HAS_MUTABLE_P.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/initlist-opt7.C: New test.

10 months agoPass host specific ABI opts from mkoffload.
Prathamesh Kulkarni [Tue, 10 Sep 2024 15:31:58 +0000 (21:01 +0530)] 
Pass host specific ABI opts from mkoffload.

The patch adds an option -foffload-abi-host-opts, which
is set by host in TARGET_OFFLOAD_OPTIONS, and mkoffload then passes its value
to host_compiler.

gcc/ChangeLog:
PR target/96265
* common.opt (foffload-abi-host-opts): New option.
* config/aarch64/aarch64.cc (aarch64_offload_options): Pass
-foffload-abi-host-opts.
* config/i386/i386-options.cc (ix86_offload_options): Likewise.
* config/rs6000/rs6000.cc (rs6000_offload_options): Likewise.
* config/nvptx/mkoffload.cc (offload_abi_host_opts): Define.
(compile_native): Append offload_abi_host_opts to argv_obstack.
(main): Handle option -foffload-abi-host-opts.
* config/gcn/mkoffload.cc (offload_abi_host_opts): Define.
(compile_native): Append offload_abi_host_opts to argv_obstack.
(main): Handle option -foffload-abi-host-opts.
* lto-wrapper.cc (merge_and_complain): Handle
-foffload-abi-host-opts.
(append_compiler_options): Likewise.
* opts.cc (common_handle_option): Likewise.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
10 months agotree-optimization/116658 - latent issue in vect_is_slp_load_node
Richard Biener [Tue, 10 Sep 2024 07:39:16 +0000 (09:39 +0200)] 
tree-optimization/116658 - latent issue in vect_is_slp_load_node

Permute nodes do not have a representative so we have to guard
vect_is_slp_load_node against those.

PR tree-optimization/116658
* tree-vect-slp.cc (vect_is_slp_load_node): Make sure
node isn't a permute.

* g++.dg/vect/pr116658.cc: New testcase.

10 months agoMatch: Support form 2 for scalar signed integer .SAT_ADD
Pan Li [Tue, 3 Sep 2024 07:39:16 +0000 (15:39 +0800)] 
Match: Support form 2 for scalar signed integer .SAT_ADD

This patch would like to support the form 2 of the scalar signed
integer .SAT_ADD.  Aka below example:

Form 2:
  #define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \
  T __attribute__((noinline))              \
  sat_s_add_##T##_fmt_2 (T x, T y)         \
  {                                        \
    T sum = (UT)x + (UT)y;                 \
                                           \
    if ((x ^ y) < 0 || (sum ^ x) >= 0)     \
      return sum;                          \
                                           \
    return x < 0 ? MIN : MAX;              \
  }

DEF_SAT_S_ADD_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)

We can tell the difference before and after this patch if backend
implemented the ssadd<m>3 pattern similar as below.

Before this patch:
   4   │ __attribute__((noinline))
   5   │ int8_t sat_s_add_int8_t_fmt_2 (int8_t x, int8_t y)
   6   │ {
   7   │   int8_t sum;
   8   │   unsigned char x.0_1;
   9   │   unsigned char y.1_2;
  10   │   unsigned char _3;
  11   │   signed char _4;
  12   │   signed char _5;
  13   │   int8_t _6;
  14   │   _Bool _11;
  15   │   signed char _12;
  16   │   signed char _13;
  17   │   signed char _14;
  18   │   signed char _22;
  19   │   signed char _23;
  20   │
  21   │ ;;   basic block 2, loop depth 0
  22   │ ;;    pred:       ENTRY
  23   │   x.0_1 = (unsigned char) x_7(D);
  24   │   y.1_2 = (unsigned char) y_8(D);
  25   │   _3 = x.0_1 + y.1_2;
  26   │   sum_9 = (int8_t) _3;
  27   │   _4 = x_7(D) ^ y_8(D);
  28   │   _5 = x_7(D) ^ sum_9;
  29   │   _23 = ~_4;
  30   │   _22 = _5 & _23;
  31   │   if (_22 >= 0)
  32   │     goto <bb 4>; [42.57%]
  33   │   else
  34   │     goto <bb 3>; [57.43%]
  35   │ ;;    succ:       4
  36   │ ;;                3
  37   │
  38   │ ;;   basic block 3, loop depth 0
  39   │ ;;    pred:       2
  40   │   _11 = x_7(D) < 0;
  41   │   _12 = (signed char) _11;
  42   │   _13 = -_12;
  43   │   _14 = _13 ^ 127;
  44   │ ;;    succ:       4
  45   │
  46   │ ;;   basic block 4, loop depth 0
  47   │ ;;    pred:       2
  48   │ ;;                3
  49   │   # _6 = PHI <sum_9(2), _14(3)>
  50   │   return _6;
  51   │ ;;    succ:       EXIT
  52   │
  53   │ }

After this patch:
   4   │ __attribute__((noinline))
   5   │ int8_t sat_s_add_int8_t_fmt_2 (int8_t x, int8_t y)
   6   │ {
   7   │   int8_t _6;
   8   │
   9   │ ;;   basic block 2, loop depth 0
  10   │ ;;    pred:       ENTRY
  11   │   _6 = .SAT_ADD (x_7(D), y_8(D)); [tail call]
  12   │   return _6;
  13   │ ;;    succ:       EXIT
  14   │
  15   │ }

The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.

gcc/ChangeLog:

* match.pd: Add the form 2 of signed .SAT_ADD matching.

Signed-off-by: Pan Li <pan2.li@intel.com>
10 months agoada: Include missing associated header file
Eric Botcazou [Tue, 27 Aug 2024 14:13:08 +0000 (16:13 +0200)] 
ada: Include missing associated header file

memmodel.h must be included alongside tm_p.h for the sake of the SPARC port.

gcc/ada/

* gcc-interface/misc.cc: Include memmodel.h before tm_p.h.

10 months agoada: Use the same warning character in continuations
Viljar Indus [Thu, 18 Jul 2024 07:52:03 +0000 (10:52 +0300)] 
ada: Use the same warning character in continuations

gcc/ada/

* gcc-interface/decl.cc: Use same warning characters in
continuation messages.
* gcc-interface/trans.cc: Likewise.

10 months agoada: First controlling parameter: report error without Extensions allowed
Javier Miranda [Mon, 26 Aug 2024 18:56:37 +0000 (18:56 +0000)] 
ada: First controlling parameter: report error without Extensions allowed

Enable reporting an error when this new aspect/pragma is set to
True, and the sources are compiled without language extensions
allowed.

gcc/ada/

* sem_ch13.adb (Analyze_One_Aspect): Call
Error_Msg_GNAT_Extension() to report an error when the aspect
First_Controlling_Parameter is set to True and the sources are
compiled without Core_Extensions_ Allowed.
* sem_prag.adb (Pragma_First_Controlling_Parameter): Call
subprogram Error_Msg_GNAT_Extension() to report an error when the
aspect First_Controlling_Parameter is set to True and the sources
are compiled without Core_Extensions_Allowed. Report an error when
the aspect pragma does not confirm an inherited True value.

10 months agoada: Normalize span generation on different platforms
Viljar Indus [Fri, 30 Aug 2024 11:22:16 +0000 (14:22 +0300)] 
ada: Normalize span generation on different platforms

The total number of characters on a source code line
is different on Windows and Linux based systems
(CRLF vs LF endings). Use the last non line change
character to adjust printing the spans that go over
the end of line.

gcc/ada/

* diagnostics-pretty_emitter.adb (Get_Last_Line_Char): New. Get
the last non line change character. Write_Span_Labels use the
adjusted line end pointer to calculate the length of the span.

10 months agoada: Evaluate calls to GNAT.Source_Info routines in semantic checking
Piotr Trojanek [Wed, 28 Aug 2024 15:56:06 +0000 (17:56 +0200)] 
ada: Evaluate calls to GNAT.Source_Info routines in semantic checking

When semantic checking mode is active, i.e. when switch -gnatc is
present or when the frontend is operating in the GNATprove mode,
we now rewrite calls to GNAT.Source_Info routines in evaluation
and not expansion (which is disabled in these modes).

This is needed to recognize constants initialized with calls to
GNAT.Source_Info as static constants, regardless of expansion being
enabled.

gcc/ada/

* exp_intr.ads, exp_intr.adb (Expand_Source_Info): Move
declaration to package spec.
* sem_eval.adb (Eval_Intrinsic_Call): Evaluate calls to
GNAT.Source_Info where possible.

10 months agoada: Simplify code for inserting checks into expressions
Piotr Trojanek [Mon, 26 Aug 2024 14:16:19 +0000 (16:16 +0200)] 
ada: Simplify code for inserting checks into expressions

Code cleanup; semantics is unaffected.

gcc/ada/

* checks.adb (Remove_Checks): Combine CASE alternatives.

10 months agoada: Whitespace cleanup in declaration of calendar-related routines
Piotr Trojanek [Tue, 27 Aug 2024 10:45:07 +0000 (12:45 +0200)] 
ada: Whitespace cleanup in declaration of calendar-related routines

Code cleanup.

gcc/ada/

* libgnat/s-os_lib.ads: Remove extra whitespace.

10 months agox86: Refine V4BF/V2BF FMA Testcase
Levy Hsu [Tue, 10 Sep 2024 05:42:09 +0000 (15:12 +0930)] 
x86: Refine V4BF/V2BF FMA Testcase

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-partial-bf-vector-fma-1.c: Separated 32-bit scan
and removed register checks in spill situations.

10 months agophiopt: Move the common code between pass_phiopt and pass_cselim into a seperate...
Andrew Pinski [Mon, 9 Sep 2024 22:34:11 +0000 (15:34 -0700)] 
phiopt: Move the common code between pass_phiopt and pass_cselim into a seperate function

When r14-303-gb9fedabe381cce was done, it was missed that some of the common parts could
be done in a template and a lambda could be used. This patch implements that. This new
function can be used later on to implement a simple ifcvt pass.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (execute_over_cond_phis): New template function,
moved the common parts from pass_phiopt::execute/pass_cselim::execute.
(pass_phiopt::execute): Move the functon specific parts of the loop
into an lamdba.
(pass_cselim::execute): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agophiopt: Use gimple_phi_result rather than PHI_RESULT [PR116643]
Andrew Pinski [Mon, 9 Sep 2024 15:08:37 +0000 (08:08 -0700)] 
phiopt: Use gimple_phi_result rather than PHI_RESULT [PR116643]

This converts the uses of PHI_RESULT in phiopt to be gimple_phi_result
instead. Since there was already a mismatch of uses here, it
would be good to use prefered one (gimple_phi_result) instead.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/116643
gcc/ChangeLog:

* tree-ssa-phiopt.cc (replace_phi_edge_with_variable): s/PHI_RESULT/gimple_phi_result/.
(factor_out_conditional_operation): Likewise.
(minmax_replacement): Likewise.
(spaceship_replacement): Likewise.
(cond_store_replacement): Likewise.
(cond_if_else_store_replacement_1): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agoDon't force_reg operands[3] when it's not const0_rtx.
liuhongt [Fri, 6 Sep 2024 07:03:16 +0000 (15:03 +0800)] 
Don't force_reg operands[3] when it's not const0_rtx.

It fix the regression by

a51f2fc0d80869ab079a93cc3858f24a1fd28237 is the first bad commit
commit a51f2fc0d80869ab079a93cc3858f24a1fd28237
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Sep 4 15:39:17 2024 +0800

    Handle const0_operand for *avx2_pcmp<mode>3_1.

caused

FAIL: gcc.target/i386/pr59539-1.c scan-assembler-times vmovdqu|vmovups 1

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr59539-1.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr59539-1.c --target_board='unix{-m64\ -march=cascadelake}'"

gcc/ChangeLog:

* config/i386/sse.md (*avx2_pcmp<mode>3_1): Don't force_reg
operands[3] when it's not const0_rtx.

10 months agoDaily bump.
GCC Administrator [Tue, 10 Sep 2024 00:22:41 +0000 (00:22 +0000)] 
Daily bump.

10 months agodiagnostics: introduce struct diagnostic_option_id
David Malcolm [Mon, 9 Sep 2024 23:38:13 +0000 (19:38 -0400)] 
diagnostics: introduce struct diagnostic_option_id

Use a new struct diagnostic_option_id rather than just "int" when
referring to command-line options controlling warnings in the
diagnostic subsystem.

No functional change intended, but better documents the meaning of
the code.

gcc/c-family/ChangeLog:
* c-common.cc (c_option_controlling_cpp_diagnostic): Return
diagnostic_option_id rather than int.
(c_cpp_diagnostic): Update for renaming of
diagnostic_override_option_index to diagnostic_set_option_id.

gcc/c/ChangeLog:
* c-errors.cc (pedwarn_c23): Use "diagnostic_option_id option_id"
rather than "int opt".  Update for renaming of diagnostic_info
field.
(pedwarn_c11): Likewise.
(pedwarn_c99): Likewise.
(pedwarn_c90): Likewise.
* c-tree.h (pedwarn_c90): Likewise for decl.
(pedwarn_c99): Likewise.
(pedwarn_c11): Likewise.
(pedwarn_c23): Likewise.

gcc/cp/ChangeLog:
* constexpr.cc (constexpr_error): Update for renaming of
diagnostic_info field.
* cp-tree.h (pedwarn_cxx98): Use "diagnostic_option_id" rather
than "int".
* error.cc (cp_adjust_diagnostic_info): Update for renaming of
diagnostic_info field.
(pedwarn_cxx98): Use "diagnostic_option_id option_id" rather than
"int opt".  Update for renaming of diagnostic_info field.
(diagnostic_set_info): Likewise.

gcc/d/ChangeLog:
* d-diagnostic.cc (d_diagnostic_report_diagnostic): Update for
renaming of diagnostic_info field.

gcc/ChangeLog:
* diagnostic-core.h (struct diagnostic_option_id): New.
(warning): Use it rather than "int" for param.
(warning_n): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(pedwarn): Likewise.
(permerror_opt): Likewise.
(emit_diagnostic): Likewise.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
* diagnostic-format-json.cc
(json_output_format::on_report_diagnostic): Update for renaming of
diagnostic_info field.
* diagnostic-format-sarif.cc (sarif_builder::make_result_object):
Likewise.
(make_reporting_descriptor_object_for_warning): Likewise.
* diagnostic-format-text.cc (print_option_information): Likewise.
* diagnostic-global-context.cc (emit_diagnostic): Use
"diagnostic_option_id option_id" rather than "int opt".
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror_opt): Likewise.
* diagnostic.cc (diagnostic_set_info_translated): Update for
renaming of diagnostic_info field.
(diagnostic_option_classifier::classify_diagnostic): Use
"diagnostic_option_id option_id" rather than "int opt".
(update_effective_level_from_pragmas): Update for renaming of
diagnostic_info field.
(diagnostic_context::diagnostic_enabled): Likewise.
(diagnostic_context::warning_enabled_at): Use
"diagnostic_option_id option_id" rather than "int opt".
(diagnostic_context::diagnostic_impl): Likewise.
(diagnostic_context::diagnostic_n_impl): Likewise.
* diagnostic.h (diagnostic_info::diagnostic_info): Update for...
(diagnostic_info::option_index): Rename...
(diagnostic_info::option_id): ...to this.
(class diagnostic_option_manager): Use
"diagnostic_option_id option_id" rather than "int opt" for vfuncs.
(diagnostic_option_classifier): Likewise for member funcs.
(diagnostic_classification_change_t::option): Add comment.
(diagnostic_context::warning_enabled_at): Use
"diagnostic_option_id option_id" rather than "int option_index".
(diagnostic_context::option_unspecified_p): Likewise.
(diagnostic_context::classify_diagnostic): Likewise.
(diagnostic_context::option_enabled_p): Likewise.
(diagnostic_context::make_option_name): Likewise.
(diagnostic_context::make_option_url): Likewise.
(diagnostic_context::diagnostic_impl): Likewise.
(diagnostic_context::diagnostic_n_impl): Likewise.
(diagnostic_override_option_index): Rename...
(diagnostic_set_option_id): ...to this, and update for
diagnostic_info field renaming.
(diagnostic_classify_diagnostic): Use "diagnostic_option_id"
rather than "int".
(warning_enabled_at): Likewise.
(option_unspecified_p): Likewise.

gcc/fortran/ChangeLog:
* cpp.cc (cb_cpp_diagnostic_cpp_option): Convert return type from
"int" to "diagnostic_option_id".
(cb_cpp_diagnostic): Update for renaming of
diagnostic_override_option_index to diagnostic_set_option_id.
* error.cc (gfc_warning): Update for renaming of diagnostic_info
field.
(gfc_warning_now_at): Likewise.
(gfc_warning_now): Likewise.
(gfc_warning_internal): Likewise.

gcc/ChangeLog:
* ipa-pure-const.cc: Replace include of "opts.h" with
"opts-diagnostic.h".
(suggest_attribute): Convert param from int to
diagnostic_option_id.
* lto-wrapper.cc (class lto_diagnostic_option_manager): Use
diagnostic_option_id rather than "int".
* opts-common.cc
(compiler_diagnostic_option_manager::option_enabled_p): Likewise.
* opts-diagnostic.h (class gcc_diagnostic_option_manager):
Likewise.
(class compiler_diagnostic_option_manager): Likewise.
* opts.cc (compiler_diagnostic_option_manager::make_option_name):
Likewise.
(gcc_diagnostic_option_manager::make_option_url): Likewise.
* substring-locations.cc
(format_string_diagnostic_t::emit_warning_n_va): Likewise.
(format_string_diagnostic_t::emit_warning_va): Likewise.
(format_string_diagnostic_t::emit_warning): Likewise.
(format_string_diagnostic_t::emit_warning_n): Likewise.
* substring-locations.h
(format_string_diagnostic_t::emit_warning_va): Likewise.
(format_string_diagnostic_t::emit_warning_n_va): Likewise.
(format_string_diagnostic_t::emit_warning): Likewise.
(format_string_diagnostic_t::emit_warning_n): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
10 months agodiagnostics: replace option_hooks with a diagnostic_option_manager class
David Malcolm [Mon, 9 Sep 2024 23:38:12 +0000 (19:38 -0400)] 
diagnostics: replace option_hooks with a diagnostic_option_manager class

Introduce a diagnostic_option_manager class to help isolate the
diagnostics subsystem from GCC's option handling.

No functional change intended.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::initialize): Replace
m_options_callbacks with m_option_mgr.
(diagnostic_context::set_option_hooks): Replace with...
(diagnostic_context::set_option_manager): ...this.
* diagnostic.h (diagnostic_option_enabled_cb): Delete.
(diagnostic_make_option_name_cb): Delete.
(diagnostic_make_option_url_cb): Delete.
(class diagnostic_option_manager): New.
(diagnostic_manager::option_enabled_p): Convert from using
m_option_callbacks to m_option_mgr.
(diagnostic_manager::make_option_name): Likewise.
(diagnostic_manager::make_option_url): Likewise.
(diagnostic_manager::set_option_hooks): Replace with...
(diagnostic_manager::set_option_manager): ...this.
(diagnostic_manager::get_lang_mask): Update for field changes.
(diagnostic_manager::m_option_callbacks): Replace with...
(diagnostic_manager::m_option_mgr): ...this and...
(diagnostic_manager::m_lang_mask): ...this.
* lto-wrapper.cc (class lto_diagnostic_option_manager): New.
(main): Port from option hooks to diagnostic_option_manager.
* opts-common.cc: Include "opts-diagnostic.h".
(compiler_diagnostic_option_manager::option_enabled_p): New.
* opts-diagnostic.h (option_name): Drop decl.
(get_option_url): Drop decl.
(class gcc_diagnostic_option_manager): New.
(class compiler_diagnostic_option_manager): New.
* opts.cc (option_name): Convert to...
(compiler_diagnostic_option_manager::make_option_name): ...this.
(get_option_url): Convert to...
(gcc_diagnostic_option_manager::make_option_url): ...this.
* toplev.cc (general_init): Port from option hooks to
diagnostic_option_manager.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
10 months agodiagnostics: rename dc.printer to m_printer [PR116613]
David Malcolm [Mon, 9 Sep 2024 23:38:12 +0000 (19:38 -0400)] 
diagnostics: rename dc.printer to m_printer [PR116613]

Rename diagnostic_context's "printer" field to "m_printer",
for consistency with other fields, and to highlight places
where we currently use this, to help assess feasibility
of supporting multiple output sinks (PR other/116613).

No functional change intended.

gcc/ChangeLog:
PR other/116613
* attribs.cc (decls_mismatched_attributes): Rename
diagnostic_context's "printer" field to "m_printer".
(attr_access::array_as_string): Likewise.
* diagnostic-format-json.cc
(json_output_format::on_report_diagnostic): Likewise.
(diagnostic_output_format_init_json): Likewise.
* diagnostic-format-sarif.cc
(sarif_result::on_nested_diagnostic): Likewise.
(sarif_ice_notification): Likewise.
(sarif_builder::on_report_diagnostic): Likewise.
(sarif_builder::make_result_object): Likewise.
(sarif_builder::make_location_object): Likewise.
(sarif_builder::make_message_object_for_diagram): Likewise.
(diagnostic_output_format_init_sarif): Likewise.
* diagnostic-format-text.cc
(diagnostic_text_output_format::~diagnostic_text_output_format):
Likewise.
(diagnostic_text_output_format::on_report_diagnostic): Likewise.
(diagnostic_text_output_format::on_diagram): Likewise.
(diagnostic_text_output_format::print_any_cwe): Likewise.
(diagnostic_text_output_format::print_any_rules): Likewise.
(diagnostic_text_output_format::print_option_information):
Likewise.
* diagnostic-format.h (diagnostic_output_format::get_printer):
New.
* diagnostic-global-context.cc (verbatim): Rename
diagnostic_context's "printer" field to "m_printer".
* diagnostic-path.cc (path_label::get_text): Likewise.
(print_path_summary_as_text): Likewise.
(diagnostic_context::print_path): Likewise.
(selftest::test_empty_path): Likewise.
(selftest::test_intraprocedural_path): Likewise.
(selftest::test_interprocedural_path_1): Likewise.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.
(selftest::test_control_flow_1): Likewise.
(selftest::test_control_flow_2): Likewise.
(selftest::test_control_flow_3): Likewise.
(assert_cfg_edge_path_streq): Likewise.
(selftest::test_control_flow_5): Likewise.
(selftest::test_control_flow_6): Likewise.
* diagnostic-show-locus.cc (layout::layout): Likewise.
(selftest::test_layout_x_offset_display_utf8): Likewise.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_diagnostic_show_locus_unknown_location): Likewise.
(selftest::test_one_liner_simple_caret): Likewise.
(selftest::test_one_liner_no_column): Likewise.
(selftest::test_one_liner_caret_and_range): Likewise.
(selftest::test_one_liner_multiple_carets_and_ranges): Likewise.
(selftest::test_one_liner_fixit_insert_before): Likewise.
(selftest::test_one_liner_fixit_insert_after): Likewise.
(selftest::test_one_liner_fixit_remove): Likewise.
(selftest::test_one_liner_fixit_replace): Likewise.
(selftest::test_one_liner_fixit_replace_non_equal_range):
Likewise.
(selftest::test_one_liner_fixit_replace_equal_secondary_range):
Likewise.
(selftest::test_one_liner_fixit_validation_adhoc_locations):
Likewise.
(selftest::test_one_liner_many_fixits_1): Likewise.
(selftest::test_one_liner_many_fixits_2): Likewise.
(selftest::test_one_liner_labels): Likewise.
(selftest::test_one_liner_simple_caret_utf8): Likewise.
(selftest::test_one_liner_caret_and_range_utf8): Likewise.
(selftest::test_one_liner_multiple_carets_and_ranges_utf8):
Likewise.
(selftest::test_one_liner_fixit_insert_before_utf8): Likewise.
(selftest::test_one_liner_fixit_insert_after_utf8): Likewise.
(selftest::test_one_liner_fixit_remove_utf8): Likewise.
(selftest::test_one_liner_fixit_replace_utf8): Likewise.
(selftest::test_one_liner_fixit_replace_non_equal_range_utf8):
Likewise.
(selftest::test_one_liner_fixit_replace_equal_secondary_range_utf8):
Likewise.
(selftest::test_one_liner_fixit_validation_adhoc_locations_utf8):
Likewise.
(selftest::test_one_liner_many_fixits_1_utf8): Likewise.
(selftest::test_one_liner_many_fixits_2_utf8): Likewise.
(selftest::test_one_liner_labels_utf8): Likewise.
(selftest::test_one_liner_colorized_utf8): Likewise.
(selftest::test_add_location_if_nearby): Likewise.
(selftest::test_diagnostic_show_locus_fixit_lines): Likewise.
(selftest::test_overlapped_fixit_printing): Likewise.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Likewise.
(selftest::test_fixit_insert_containing_newline): Likewise.
(selftest::test_fixit_insert_containing_newline_2): Likewise.
(selftest::test_fixit_replace_containing_newline): Likewise.
(selftest::test_fixit_deletion_affecting_newline): Likewise.
(selftest::test_tab_expansion): Likewise.
(selftest::test_escaping_bytes_1): Likewise.
(selftest::test_escaping_bytes_2): Likewise.
(selftest::test_line_numbers_multiline_range): Likewise.
* diagnostic.cc (file_name_as_prefix): Likewise.
(diagnostic_set_caret_max_width): Likewise.
(diagnostic_context::initialize): Likewise.
(diagnostic_context::color_init): Likewise.
(diagnostic_context::urls_init): Likewise.
(diagnostic_context::finish): Likewise.
(diagnostic_context::get_location_text): Likewise.
(diagnostic_build_prefix): Likewise.
(diagnostic_context::report_current_module): Likewise.
(default_diagnostic_starter): Likewise.
(default_diagnostic_start_span_fn): Likewise.
(default_diagnostic_finalizer): Likewise.
(diagnostic_context::report_diagnostic): Likewise.
(diagnostic_append_note): Likewise.
(diagnostic_context::error_recursion): Likewise.
(fancy_abort): Likewise.
* diagnostic.h (diagnostic_context::set_show_highlight_colors):
Likewise.
(diagnostic_context::printer): Rename to...
(diagnostic_context::m_printer): ...this.
(diagnostic_format_decoder): Rename diagnostic_context's "printer"
field to "m_printer".
(diagnostic_prefixing_rule): Likewise.
(diagnostic_ready_p): Likewise.
* gimple-ssa-warn-access.cc (pass_waccess::maybe_warn_memmodel):
Likewise.
* langhooks.cc (lhd_print_error_function): Likewise.
* lto-wrapper.cc (print_lto_docs_link): Likewise.
* opts-global.cc (init_options_once): Likewise.
* opts.cc (common_handle_option): Likewise.
* simple-diagnostic-path.cc (simple_diagnostic_path_cc_tests):
Likewise.
* text-art/dump.h (dump_to_file<T>): Likewise.
* toplev.cc (announce_function): Likewise.
(toplev::main): Likewise.
* tree-diagnostic.cc (default_tree_diagnostic_starter): Likewise.
* tree.cc (escaped_string::escape): Likewise.
(selftest::test_escaped_strings): Likewise.

gcc/ada/ChangeLog:
PR other/116613
* gcc-interface/misc.cc (internal_error_function): Rename
diagnostic_context's "printer" field to "m_printer".

gcc/analyzer/ChangeLog:
PR other/116613
* access-diagram.cc (access_range::dump): Rename
diagnostic_context's "printer" field to "m_printer".
* analyzer-language.cc (on_finish_translation_unit): Likewise.
* analyzer.cc (make_label_text): Likewise.
(make_label_text_n): Likewise.
* call-details.cc (call_details::dump): Likewise.
* call-summary.cc (call_summary::dump): Likewise.
(call_summary_replay::dump): Likewise.
* checker-event.cc (checker_event::debug): Likewise.
* constraint-manager.cc (range::dump): Likewise.
(bounded_range::dump): Likewise.
(bounded_ranges::dump): Likewise.
(constraint_manager::dump): Likewise.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Likewise.
* engine.cc (exploded_node::dump): Likewise.
(exploded_path::dump): Likewise.
(run_checkers): Likewise.
* kf-analyzer.cc (kf_analyzer_dump_escaped::impl_call_pre):
Likewise.
* pending-diagnostic.cc (evdesc::event_desc::formatted_print):
Likewise.
* program-point.cc (function_point::print_source_line): Likewise.
(program_point::dump): Likewise.
* program-state.cc (extrinsic_state::dump_to_file): Likewise.
(sm_state_map::dump): Likewise.
(program_state::dump_to_file): Likewise.
* ranges.cc (symbolic_byte_offset::dump): Likewise.
(symbolic_byte_range::dump): Likewise.
* region-model-reachability.cc (reachable_regions::dump): Likewise.
* region-model.cc (region_to_value_map::dump): Likewise.
(region_model::dump): Likewise.
(model_merger::dump): Likewise.
* region.cc (region_offset::dump): Likewise.
(region::dump): Likewise.
* sm-malloc.cc (deallocator_set::dump): Likewise.
(sufficiently_similar_p): Likewise.
* store.cc (uncertainty_t::dump): Likewise.
(binding_key::dump): Likewise.
(binding_map::dump): Likewise.
(binding_cluster::dump): Likewise.
(store::dump): Likewise.
* supergraph.cc (supergraph::dump_dot_to_file): Likewise.
(superedge::dump): Likewise.
* svalue.cc (svalue::dump): Likewise.

gcc/c-family/ChangeLog:
PR other/116613
* c-format.cc (selftest::test_type_mismatch_range_labels): Rename
diagnostic_context's "printer" field to "m_printer".
(selftest::test_type_mismatch_range_labels): Likewise.
* c-opts.cc (c_diagnostic_finalizer): Likewise.

gcc/c/ChangeLog:
PR other/116613
* c-objc-common.cc (c_initialize_diagnostics): Rename
diagnostic_context's "printer" field to "m_printer".

gcc/cp/ChangeLog:
PR other/116613
* error.cc (cxx_initialize_diagnostics): Rename
diagnostic_context's "printer" field to "m_printer".
(cxx_print_error_function): Likewise.
(cp_diagnostic_starter): Likewise.
(cp_print_error_function): Likewise.
(print_instantiation_full_context): Likewise.
(print_instantiation_partial_context_line): Likewise.
(maybe_print_constexpr_context): Likewise.
(print_location): Likewise.
(print_constrained_decl_info): Likewise.
(print_concept_check_info): Likewise.
(print_constraint_context_head): Likewise.
(print_requires_expression_info): Likewise.
* module.cc (noisy_p): Likewise.

gcc/d/ChangeLog:
PR other/116613
* d-diagnostic.cc (d_diagnostic_report_diagnostic): Rename
diagnostic_context's "printer" field to "m_printer".

gcc/fortran/ChangeLog:
PR other/116613
* error.cc (gfc_clear_pp_buffer): Rename diagnostic_context's
"printer" field to "m_printer".
(gfc_warning): Likewise.
(gfc_diagnostic_build_kind_prefix): Likewise.
(gfc_diagnostic_build_locus_prefix): Likewise.
(gfc_diagnostic_starter): Likewise.
(gfc_diagnostic_starter): Likewise.
(gfc_diagnostic_start_span): Likewise.
(gfc_diagnostic_finalizer): Likewise.
(gfc_warning_check): Likewise.
(gfc_error_opt): Likewise.
(gfc_error_check): Likewise.

gcc/jit/ChangeLog:
PR other/116613
* jit-playback.cc (add_diagnostic): Rename diagnostic_context's
"printer" field to "m_printer".

gcc/testsuite/ChangeLog:
PR other/116613
* gcc.dg/plugin/analyzer_cpython_plugin.c (dump_refcnt_info):
Update for renaming of field "printer" to "m_printer".
* gcc.dg/plugin/diagnostic_group_plugin.c
(test_diagnostic_starter): Likewise.
(test_diagnostic_start_span_fn): Likewise.
(test_output_format::on_begin_group): Likewise.
(test_output_format::on_end_group): Likewise.
* gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(custom_diagnostic_finalizer): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
10 months agoSARIF output: fix schema URL [§3.13.3, PR116603]
David Malcolm [Mon, 9 Sep 2024 23:38:11 +0000 (19:38 -0400)] 
SARIF output: fix schema URL [§3.13.3, PR116603]

We were using
  https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json
as the URL for the SARIF 2.1 schema, but this is now a 404.

Update it to the URL listed in the spec (§3.13.3 "$schema property"),
which is:
  https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json
and update the copy in
  gcc/testsuite/lib/sarif-schema-2.1.0.json
used by the "verify-sarif-file" DejaGnu directive to the version found at
that latter URL; the sha256 sum changes
from: 2b19d2358baef0251d7d24e208d05ffabf1b2a3ab5e1b3a816066fc57fd4a7e8
  to: c3b4bb2d6093897483348925aaa73af03b3e3f4bd4ca38cef26dcb4212a2682e

Doing so added a validation error on
  c-c++-common/diagnostic-format-sarif-file-pr111700.c
for which we emit this textual output:
  this-file-does-not-exist.c: warning: #warning message [-Wcpp]
with no line number, and these invalid SARIF regions within the
physical location of the warning:
  "region": {"startColumn": 2,
             "endColumn": 9},
  "contextRegion": {}

This is due to this directive:
  # 0 "this-file-does-not-exist.c"
with line number 0.

The patch fixes this by not creating regions that have startLine <= 0.

gcc/ChangeLog:
PR other/116603
* diagnostic-format-sarif.cc (SARIF_SCHEMA): Update URL.
(sarif_builder::maybe_make_region_object): Don't create regions
with startLine <= 0.
(sarif_builder::maybe_make_region_object_for_context): Likewise.

gcc/testsuite/ChangeLog:
PR other/116603
* gcc.dg/plugin/diagnostic-test-metadata-sarif.py (test_basics):
Update expected schema URL.
* gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py:
Likewise.
* gcc.dg/sarif-output/test-include-chain-1.py: Likewise.
* gcc.dg/sarif-output/test-include-chain-2.py: Likewise.
* gcc.dg/sarif-output/test-missing-semicolon.py: Likewise.
* gcc.dg/sarif-output/test-no-diagnostics.py: Likewise.
* gcc.dg/sarif-output/test-werror.py: Likewise.
* lib/sarif-schema-2.1.0.json: Update with copy downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
10 months agoi386: Use offsetable address constraint for double-word memory operands
Uros Bizjak [Mon, 9 Sep 2024 20:33:52 +0000 (22:33 +0200)] 
i386: Use offsetable address constraint for double-word memory operands

Double-word memory operands are accessed as their high and low part, so the
memory location has to be offsettable.  Use "o" constraint instead of "m"
for double-word memory operands.

gcc/ChangeLog:

* config/i386/i386.md (*insvdi_lowpart_1): Use "o" constraint
instead of "m" for double-word mode memory operands.
(*add<dwi>3_doubleword_zext): Ditto.
(*addv<dwi>4_doubleword_1): Use "jO" constraint instead of "jM"
for double-word mode memory operands.

10 months agoanalyzer: fix "unused variable 'summary_cast_reg'" warning
David Malcolm [Mon, 9 Sep 2024 19:30:42 +0000 (15:30 -0400)] 
analyzer: fix "unused variable 'summary_cast_reg'" warning

I missed this in r15-1108-g70f26314b62e2d.

gcc/analyzer/ChangeLog:
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Drop unused
local "summary_cast_reg"

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
10 months agomiddle-end: also optimized `popcount(a) <= 1` [PR90693]
Andrew Pinski [Thu, 29 Aug 2024 19:10:44 +0000 (12:10 -0700)] 
middle-end: also optimized `popcount(a) <= 1` [PR90693]

This expands on optimizing `popcount(a) == 1` to also handle
`popcount(a) <= 1`. `<= 1` can be expanded as `(a & -a) == 0`
like what is done for `== 1` if we know that a was nonzero.
We have to do the optimization in 2 places due to if we have
an optab entry for popcount or not.

Built and tested for aarch64-linux-gnu.

PR middle-end/90693

gcc/ChangeLog:

* internal-fn.cc (expand_POPCOUNT): Handle the second argument
being `-1` for `<= 1`.
* tree-ssa-math-opts.cc (match_single_bit_test): Handle LE/GT
cases.
(math_opts_dom_walker::after_dom_children): Call match_single_bit_test
for LE_EXPR/GT_EXPR also.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/popcnt-le-1.c: New test.
* gcc.target/aarch64/popcnt-le-2.c: New test.
* gcc.target/aarch64/popcnt-le-3.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agohppa: Don't canonicalize operand order of scaled index addresses
John David Anglin [Mon, 9 Sep 2024 14:23:00 +0000 (10:23 -0400)] 
hppa: Don't canonicalize operand order of scaled index addresses

pa_print_operand handles both operand orders for scaled index
addresses, so it isn't necessary to canonicalize the order of
operands.

2024-09-09  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (pa_legitimate_address_p): Don't
canonicalize operand order of scaled index addresses.

10 months agotree-optimization/116514 - handle pointer difference in bit-CCP
Richard Biener [Wed, 28 Aug 2024 12:06:48 +0000 (14:06 +0200)] 
tree-optimization/116514 - handle pointer difference in bit-CCP

When evaluating the difference of two aligned pointers in CCP we
fail to handle the EXACT_DIV_EXPR by the element size that occurs.
The testcase then also exercises modulo to test alignment but
modulo by a power-of-two isn't handled either.

PR tree-optimization/116514
* tree-ssa-ccp.cc (bit_value_binop): Handle EXACT_DIV_EXPR
like TRUNC_DIV_EXPR.  Handle exact division of a signed value
by a power-of-two like a shift.  Handle unsigned division by
a power-of-two like a shift.
Handle unsigned TRUNC_MOD_EXPR by power-of-two, handle signed
TRUNC_MOD_EXPR by power-of-two if the result is zero.

* gcc.dg/tree-ssa/ssa-ccp-44.c: New testcase.

10 months agotree-optimization/116647 - wrong classified double reduction
Richard Biener [Mon, 9 Sep 2024 09:51:24 +0000 (11:51 +0200)] 
tree-optimization/116647 - wrong classified double reduction

The following avoids classifying a double reduction that's not
actually a reduction in the outer loop (because its value isn't
used outside of the outer loop).  This avoids us ICEing on the
unexpected stmt/SLP node arrangement.

PR tree-optimization/116647
* tree-vect-loop.cc (vect_is_simple_reduction): Add missing
check to double reduction detection.

* gcc.dg/torture/pr116647.c: New testcase.
* gcc.dg/vect/no-scevccp-pr86725-2.c: Adjust expected pattern.
* gcc.dg/vect/no-scevccp-pr86725-4.c: Likewise.

10 months agoSilence warning for 32-bit targets
Eric Botcazou [Mon, 9 Sep 2024 10:27:25 +0000 (12:27 +0200)] 
Silence warning for 32-bit targets

gcc/testsuite
PR ada/115250
* gnat.dg/opt58_pkg.ads: Convert to Unix line ending.
* gnat.dg/opt58.adb: Likewise and pass -gnatws to the compiler.

10 months agoRemove problematic declaration for 32-bit targets
Eric Botcazou [Mon, 9 Sep 2024 10:23:42 +0000 (12:23 +0200)] 
Remove problematic declaration for 32-bit targets

gcc/testsuite
PR ada/115246
* gnat.dg/alignment14.adb (My_Int2): Delete.
(Arr2): Likewise.

10 months agogimple-fold: Move optimizing memcpy to memset to fold_stmt from fab
Andrew Pinski [Fri, 6 Sep 2024 19:29:26 +0000 (12:29 -0700)] 
gimple-fold: Move optimizing memcpy to memset to fold_stmt from fab

I noticed this folding inside fab could be done else where and could
even improve inlining decisions and a few other things so let's
move it to fold_stmt.
It also fixes PR 116601 because places which call fold_stmt already
have to deal with the stmt becoming a non-throw statement.

For the fix for PR 116601 on the branches should be the original patch
rather than a backport of this one.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/116601

gcc/ChangeLog:

* gimple-fold.cc (optimize_memcpy_to_memset): Move
from tree-ssa-ccp.cc and rename. Also return true
if the optimization happened.
(gimple_fold_builtin_memory_op): Call
optimize_memcpy_to_memset.
(fold_stmt_1): Call optimize_memcpy_to_memset for
load/store copies.
* tree-ssa-ccp.cc (optimize_memcpy): Delete.
(pass_fold_builtins::execute): Remove code that
calls optimize_memcpy.

gcc/testsuite/ChangeLog:

* gcc.dg/pr78408-1.c: Adjust dump scan to match where
the optimization now happens.
* g++.dg/torture/except-2.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agoAmend gcc.dg/vect/fast-math-vect-call-2.c
Richard Biener [Mon, 9 Sep 2024 07:41:36 +0000 (09:41 +0200)] 
Amend gcc.dg/vect/fast-math-vect-call-2.c

There was a reported regression on x86-64 with -march=cascadelake
and -m32 where epilogue vectorization causes a different number of
SLPed loops.  Fixed by disabling epilogue vectorization for the
testcase.

* gcc.dg/vect/fast-math-vect-call-2.c: Disable epilogue
vectorization.

10 months agotestsuite: Fix up pr116588.c test [PR116588]
Jakub Jelinek [Mon, 9 Sep 2024 07:37:26 +0000 (09:37 +0200)] 
testsuite: Fix up pr116588.c test [PR116588]

The test as committed without the tree-vrp.cc change only FAILs with
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
The DEBUG code in there was just to make it easier to debug, but doesn't
actually fail when the test is miscompiled.
We don't need such debugging code in simple tests like that, but it is
useful if they abort when miscompiled.

With this patch without the tree-vrp.cc change I see
FAIL: gcc.dg/pr116588.c execution test
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
and with it it passes.

2024-09-09  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/116588
* gcc.dg/pr116588.c: Remove -DDEBUG from dg-options.
(main): Remove debugging code and simplify.

10 months agoMatch: Fix ordered and nonequal: Fix 'gcc.dg/opt-ordered-and-nonequal-1.c' re 'LOGICA...
Thomas Schwinge [Mon, 9 Sep 2024 06:39:10 +0000 (08:39 +0200)] 
Match: Fix ordered and nonequal: Fix 'gcc.dg/opt-ordered-and-nonequal-1.c' re 'LOGICAL_OP_NON_SHORT_CIRCUIT' [PR116635]

Fix up to make 'gcc.dg/opt-ordered-and-nonequal-1.c' of
commit 91421e21e8f0f05f440174b8de7a43a311700e08
"Match: Fix ordered and nonequal" work for default
'LOGICAL_OP_NON_SHORT_CIRCUIT == false' configurations.

PR testsuite/116635
gcc/testsuite/
* gcc.dg/opt-ordered-and-nonequal-1.c: Fix re
'LOGICAL_OP_NON_SHORT_CIRCUIT'.

10 months agophiopt: Small refactoring/cleanup of non-ssa name case of factor_out_conditional_oper...
Andrew Pinski [Sat, 31 Aug 2024 20:54:21 +0000 (13:54 -0700)] 
phiopt: Small refactoring/cleanup of non-ssa name case of factor_out_conditional_operation

This small cleanup removes a redundant check for gimple_assign_cast_p and reformats
based on that. Also changes the if statement that checks if the integral type and the
check to see if the constant fits into the new type such that it returns null
and reformats based on that.

Also moves the check for has_single_use earlier so it is less complex still a cheaper
check than some of the others (like the check on the integer side).

This was noticed when adding a few new things to factor_out_conditional_operation
but those are not ready to submit yet.

Note there are no functional difference with this change.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (factor_out_conditional_operation): Move the has_single_use
checks much earlier. Remove redundant check for gimple_assign_cast_p.
Change around the check if the integral consts fits into the new type.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agodoc: Enhance Intel CPU documentation
Haochen Jiang [Fri, 6 Sep 2024 03:19:26 +0000 (11:19 +0800)] 
doc: Enhance Intel CPU documentation

This patch will add those recent aliased CPU names into documentation
for clearness.

gcc/ChangeLog:

PR target/116617
* doc/invoke.texi: Add meteorlake, raptorlake and lunarlake.

10 months agoDaily bump.
GCC Administrator [Mon, 9 Sep 2024 00:17:47 +0000 (00:17 +0000)] 
Daily bump.

10 months agox86-64: Don't use temp for argument in a TImode register
H.J. Lu [Fri, 6 Sep 2024 12:24:07 +0000 (05:24 -0700)] 
x86-64: Don't use temp for argument in a TImode register

Don't use temp for a PARALLEL BLKmode argument of an EXPR_LIST expression
in a TImode register.  Otherwise, the TImode variable will be put in
the GPR save area which guarantees only 8-byte alignment.

gcc/

PR target/116621
* config/i386/i386.cc (ix86_gimplify_va_arg): Don't use temp for
a PARALLEL BLKmode container of an EXPR_LIST expression in a
TImode register.

gcc/testsuite/

PR target/116621
* gcc.target/i386/pr116621.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
10 months agogcov: Cache source files
Jørgen Kvalsvik [Thu, 4 Jul 2024 07:42:24 +0000 (09:42 +0200)] 
gcov: Cache source files

Cache the source files as they are read, rather than discarding them at
the end of output_lines (), and move the reading of the source file to
the new function slurp.

This patch does not really change anything other than moving the file
reading out of output_file, but set gcov up for more interaction with
the source file. The motvating example is reporting coverage on
functions from different source files, notably C++ headers and
((always_inline)).

Here is an example of what gcov does today:

hello.h:
inline __attribute__((always_inline))
int hello (const char *s)
{
  if (s)
    printf ("hello, %s!\n", s);
  else
    printf ("hello, world!\n");
  return 0;
}

hello.c:
int notmain(const char *entity)
{
  return hello (entity);
}

int main()
{
  const char *empty = 0;
  if (!empty)
    hello (empty);
  else
    puts ("Goodbye!");
}

$ gcov -abc hello
function notmain called 0 returned 0% blocks executed 0%
    #####:    4:int notmain(const char *entity)
    %%%%%:    4-block 2
branch  0 never executed (fallthrough)
branch  1 never executed
        -:    5:{
    #####:    6:  return hello (entity);
    %%%%%:    6-block 7
        -:    7:}

Clearly there is a branch in notmain, but the branch comes from the
inlining of hello. This is not very obvious from looking at the output.
Here is hello.h.gcov:

        -:    3:inline __attribute__((always_inline))
        -:    4:int hello (const char *s)
        -:    5:{
    #####:    6:  if (s)
    %%%%%:    6-block 3
branch  0 never executed (fallthrough)
branch  1 never executed
    %%%%%:    6-block 2
branch  2 never executed (fallthrough)
branch  3 never executed
    #####:    7:    printf ("hello, %s!\n", s);
    %%%%%:    7-block 4
call    0 never executed
    %%%%%:    7-block 3
call    1 never executed
        -:    8:  else
    #####:    9:    printf ("hello, world!\n");
    %%%%%:    9-block 5
call    0 never executed
    %%%%%:    9-block 4
call    1 never executed
    #####:   10:  return 0;
    %%%%%:   10-block 6
    %%%%%:   10-block 5
        -:   11:}

The blocks from the different call sites have all been interleaved.

The reporting could tuned be to list the inlined function, too, like
this:

        1:    4:int notmain(const char *entity)
        -: == inlined from hello.h ==
        1:    6:  if (s)
branch  0 taken 0 (fallthrough)
branch  1 taken 1
    #####:    7:    printf ("hello, %s!\n", s);
    %%%%%:    7-block 3
call    0 never executed
        -:    8:  else
        1:    9:    printf ("hello, world!\n");
        1:    9-block 4
call    0 returned 1
        1:   10:  return 0;
        1:   10-block 5
        -: == inlined from hello.h (end) ==
        -:    5:{
        1:    6:  return hello (entity);
        1:    6-block 7
        -:    7:}

Implementing something to this effect relies on having the sources for
both files (hello.c, hello.h) available, which is what this patch sets
up.

Note that the previous reading code would leak the source file content,
and explicitly storing them is not a huge departure nor performance
implication. I verified this with valgrind:

With slurp:

$ valgrind gcov ./hello
== == Memcheck, a memory error detector
== == Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
== == Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
== == Command: ./gcc/gcov demo
== ==
File 'hello.c'
Lines executed:100.00% of 4
Creating 'hello.c.gcov'

File 'hello.h'
Lines executed:75.00% of 4
Creating 'hello.h.gcov'
== ==
== == HEAP SUMMARY:
== ==     in use at exit: 84,907 bytes in 54 blocks
== ==   total heap usage: 254 allocs, 200 frees, 137,156 bytes allocated
== ==
== == LEAK SUMMARY:
== ==    definitely lost: 1,237 bytes in 22 blocks
== ==    indirectly lost: 562 bytes in 18 blocks
== ==      possibly lost: 0 bytes in 0 blocks
== ==    still reachable: 83,108 bytes in 14 blocks
== ==                       of which reachable via heuristic:
== ==                         newarray           : 1,544 bytes in 1 blocks
== ==         suppressed: 0 bytes in 0 blocks
== == Rerun with --leak-check=full to see details of leaked memory
== ==
== == For lists of detected and suppressed errors, rerun with: -s
== == ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Without slurp:

$ valgrind gcov ./demo
== == Memcheck, a memory error detector
== == Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
== == Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
== == Command: ./gcc/gcov demo
== ==
File 'hello.c'
Lines executed:100.00% of 4
Creating 'hello.c.gcov'

File 'hello.h'
Lines executed:75.00% of 4
Creating 'hello.h.gcov'

Lines executed:87.50% of 8
== ==
== == HEAP SUMMARY:
== ==     in use at exit: 85,316 bytes in 82 blocks
== ==   total heap usage: 250 allocs, 168 frees, 137,084 bytes allocated
== ==
== == LEAK SUMMARY:
== ==    definitely lost: 1,646 bytes in 50 blocks
== ==    indirectly lost: 562 bytes in 18 blocks
== ==      possibly lost: 0 bytes in 0 blocks
== ==    still reachable: 83,108 bytes in 14 blocks
== ==                       of which reachable via heuristic:
== ==                         newarray           : 1,544 bytes in 1 blocks
== ==         suppressed: 0 bytes in 0 blocks
== == Rerun with --leak-check=full to see details of leaked memory
== ==
== == For lists of detected and suppressed errors, rerun with: -s
== == ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

gcc/ChangeLog:

* gcov.cc (release_structures): Release source_lines.
(slurp): New function.
(output_lines): Read sources with slurp.

10 months agotestsuite: Use dg-compile, not gcc -c
Jørgen Kvalsvik [Fri, 9 Aug 2024 06:28:35 +0000 (08:28 +0200)] 
testsuite: Use dg-compile, not gcc -c

Since this is a pure compile test it makes sense to inform dejagnu of
it.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-23.c: Use dg-compile, not gcc -c

10 months agoDaily bump.
GCC Administrator [Sun, 8 Sep 2024 00:17:46 +0000 (00:17 +0000)] 
Daily bump.

10 months agoFix pr116588.c for -m32
Andrew Pinski [Sat, 7 Sep 2024 23:20:03 +0000 (16:20 -0700)] 
Fix pr116588.c for -m32

This is a simple fix which adds the target supports requirement of int128
to the testcase too.

Pushed as obvious after testing to make sure the testcase is UNSUPPORTED now
with -m32 but working with -m64 on x86_64-linux-gnu.

gcc/testsuite/ChangeLog:

* gcc.dg/pr116588.c: Require int128.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agoc++: exception spec and stdlib specialization
Jason Merrill [Fri, 6 Sep 2024 19:28:53 +0000 (15:28 -0400)] 
c++: exception spec and stdlib specialization

We were silently accepting the pr65923.C specialization of std::swap with
the wrong exception specification; it should be declared noexcept.  Let's
limit ignoring mismatch with system headers to extern "C" functions so we
get a diagnostic for the C++ library.

In the case of an omitted exception-specification, let's also lower the
error to a pedwarn, and copy the missing spec over, to avoid a hard break
for code that accidentally relied on the old behavior.

...except extern "C" functions keep the new spec, to avoid breaking dubious
code like noexcept-type19.C.

gcc/cp/ChangeLog:

* decl.cc (check_redeclaration_exception_specification): Remove
OPT_Wsystem_headers from pedwarn when the old declaration is
in a system header.  Also check std namespace.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/pr65923.C: Add noexcept.
* g++.dg/cpp1z/aligned-new3.C: Expect pedwarn.
* g++.dg/cpp1z/noexcept-type19.C: Add comment.

10 months agosplit-path: Fix dump wording about duplicating too many statements
Andrew Pinski [Sat, 7 Sep 2024 18:43:03 +0000 (11:43 -0700)] 
split-path: Fix dump wording about duplicating too many statements

It was pointed out in https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662183.html,
that the wording with this print has too many words.
Fixed thusly.

Pushed as obvious after a build and test for x86_64-linux-gnu.

gcc/ChangeLog:

* gimple-ssa-split-paths.cc (is_feasible_trace): Fix wording
on the print.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agoc++: deferring partial substitution into lambda [PR116567]
Patrick Palka [Sat, 7 Sep 2024 18:06:37 +0000 (14:06 -0400)] 
c++: deferring partial substitution into lambda [PR116567]

Here we correctly defer partial substitution into the lambda used as
a default template argument, but then incorrectly perform the full
substitution, because add_extra_args adds outer template arguments from
the full substitution that are not related to the original template
context of the lambda.  For example, the template depth of the first
lambda is 1 but add_extra_args return a set of args with 3 levels, with
the inner level corresponding to the parameters of v1 (good) and the
outer levels corresponding to those of A and B (bad).

For the cases that we're interested in, add_extra_args can assume that
the deferred args are a full set of template arguments, and so it
suffices to just substitute into the deferred args and not do any
additional merging.

This patch refines add_extra_args accordingly, and additionally
makes it look for the tf_partial flag instead of for dependent args to
decide if the deferred substitution is a partial one.  This reveals we
were neglecting to set tf_partial when substituting into a default
template argument in a template context.

PR c++/116567

gcc/cp/ChangeLog:

* pt.cc (coerce_template_parms): Set tf_partial when substituting
into a default template argument in a template context.
(build_extra_args): Set TREE_STATIC on the deferred args if this
is a partial substitution.
(add_extra_args): Check TREE_STATIC instead of dependence of args.
Adjust merging behavior in that case.
(tsubst_lammda_expr): Check for tf_partial instead of dependence
of args when determining whether to defer substitution.
(tsubst_expr) <case LAMBDA_EXPR>: Remove tf_partial early exit.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
10 months agoBefore running fast VRP, make sure all edges have EXECUTABLE set.
Andrew MacLeod [Fri, 6 Sep 2024 15:42:14 +0000 (11:42 -0400)] 
Before running fast VRP, make sure all edges have EXECUTABLE set.

PR tree-optimization/116588
gcc/
* tree-vrp.cc (execute_fast_vrp): Start with all edges executable.
gcc/testsuite/
* gcc.dg/pr116588.c: New.

10 months ago[PATCH] RISC-V: Add missing insn types for XiangShan Nanhu scheduler model
Zhao Dingyi [Sat, 7 Sep 2024 16:48:46 +0000 (10:48 -0600)] 
[PATCH] RISC-V: Add missing insn types for XiangShan Nanhu scheduler model

This patch aims to add the missing instruction types to the XiangShan-Nanhu scheduler model.

The current XiangShan -Nanhu model lacks the trap, atomic trap, fcvt_i2f, and fcvt_f2i instructions.

The trap, atomic, and i2f instructions belong to xs_jmp_rs. [1]

The f2i instruction belongs to xs_fmisc_rs.[2]

[1]
https://github.com/OpenXiangShan/XiangShan/blob/v2.0/src/main/scala/xiangshan/package.scala#L780

[2]
https://github.com/OpenXiangShan/XiangShan/blob/v2.0/src/main/scala/xiangshan/backend/decode/DecodeUnit.scala#L290

gcc/ChangeLog:

* config/riscv/xiangshan.md: Add atomic, trap, fcvt_i2f, fcvt_f2i.

10 months ago[PATCH v4] [target/116592] RISC-V: Fix illegal operands "th.vsetvli zero,0,e32,m8...
Jin Ma [Sat, 7 Sep 2024 16:29:02 +0000 (10:29 -0600)] 
[PATCH v4] [target/116592] RISC-V: Fix illegal operands "th.vsetvli zero,0,e32,m8" for XTheadVector

Since the THeadVector vsetvli does not support vl as an immediate, we
need to convert 0 to zero when outputting asm.

PR target/116592

gcc/ChangeLog:

* config/riscv/thead.cc (th_asm_output_opcode): Change '0' to
"zero"

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/pr116592.c: New test.

10 months agoImplement first part of unsigned integers for Fortran.
Thomas Koenig [Sat, 7 Sep 2024 14:59:46 +0000 (16:59 +0200)] 
Implement first part of unsigned integers for Fortran.

gcc/fortran/ChangeLog:

* arith.cc (gfc_reduce_unsigned): New function.
(gfc_arith_error): Add ARITH_UNSIGNED_TRUNCATED and
ARITH_UNSIGNED_NEGATIVE.
(gfc_arith_init_1): Initialize unsigned types.
(gfc_check_unsigned_range): New function.
(gfc_range_check): Handle unsigned types.
(gfc_arith_uminus): Likewise.
(gfc_arith_plus): Likewise.
(gfc_arith_minus): Likewise.
(gfc_arith_times): Likewise.
(gfc_arith_divide): Likewise.
(gfc_compare_expr): Likewise.
(eval_intrinsic): Likewise.
(gfc_int2int): Also convert unsigned.
(gfc_uint2uint): New function.
(gfc_int2uint): New function.
(gfc_uint2int): New function.
(gfc_uint2real): New function.
(gfc_uint2complex): New function.
(gfc_real2uint): New function.
(gfc_complex2uint): New function.
(gfc_log2uint): New function.
(gfc_uint2log): New function.
* arith.h (gfc_int2uint, gfc_uint2uint, gfc_uint2int, gfc_uint2real):
Add prototypes.
(gfc_uint2complex, gfc_real2uint, gfc_complex2uint, gfc_log2uint):
Likewise.
(gfc_uint2log): Likewise.
* check.cc (gfc_boz2uint): New function
(type_check2): New function.
(int_or_real_or_unsigned_check): New function.
(less_than_bitsizekind): Adjust for unsingeds.
(less_than_bitsize2): Likewise.
(gfc_check_allocated): Likewise.
(gfc_check_mod): Likewise.
(gfc_check_bge_bgt_ble_blt): Likewise.
(gfc_check_bitfcn): Likewise.
(gfc_check_digits): Likewise.
(gfc_check_dshift): Likewise.
(gfc_check_huge): Likewise.
(gfc_check_iu): New function.
(gfc_check_iand_ieor_ior): Adjust for unsigneds.
(gfc_check_ibits): Likewise.
(gfc_check_uint): New function.
(gfc_check_ishft): Adjust for unsigneds.
(gfc_check_ishftc): Likewise.
(gfc_check_min_max): Likewise.
(gfc_check_merge_bits): Likewise.
(gfc_check_selected_int_kind): Likewise.
(gfc_check_shift): Likewise.
(gfc_check_mvbits): Likewise.
(gfc_invalid_unsigned_ops): Likewise.
* decl.cc (gfc_match_decl_type_spec): Likewise.
* dump-parse-tree.cc (show_expr): Likewise.
* expr.cc (gfc_get_constant_expr): Likewise.
(gfc_copy_expr): Likewise.
(gfc_extract_int): Likewise.
(numeric_type): Likewise.
* gfortran.h (enum arith): Extend with ARITH_UNSIGNED_TRUNCATED
and ARITH_UNSIGNED_NEGATIVE.
(enum gfc_isym_id): Extend with GFC_ISYM_SU_KIND and GFC_ISYM_UINT.
(gfc_check_unsigned_range): New prototype-
(gfc_arith_error): Likewise.
(gfc_reduce_unsigned): Likewise.
(gfc_boz2uint): Likewise.
(gfc_invalid_unsigned_ops): Likewise.
(gfc_convert_mpz_to_unsigned): Likewise.
* gfortran.texi: Add some rudimentary documentation.
* intrinsic.cc (gfc_type_letter): Adjust for unsigneds.
(add_functions): Add uint and adjust functions to be called.
(add_conversions): Add unsigned conversions.
(gfc_convert_type_warn): Adjust for unsigned.
* intrinsic.h (gfc_check_iu, gfc_check_uint, gfc_check_mod, gfc_simplify_uint,
gfc_simplify_selected_unsigned_kind, gfc_resolve_uint): New prototypes.
* invoke.texi: Add -funsigned.
* iresolve.cc (gfc_resolve_dshift): Handle unsigneds.
(gfc_resolve_iand): Handle unsigneds.
(gfc_resolve_ibclr): Handle unsigneds.
(gfc_resolve_ibits): Handle unsigneds.
(gfc_resolve_ibset): Handle unsigneds.
(gfc_resolve_ieor): Handle unsigneds.
(gfc_resolve_ior): Handle unsigneds.
(gfc_resolve_uint): Handle unsigneds.
(gfc_resolve_merge_bits): Handle unsigneds.
(gfc_resolve_not): Handle unsigneds.
* lang.opt: Add -funsigned.
* libgfortran.h: Add BT_UNSIGNED.
* match.cc (gfc_match_type_spec): Match UNSIGNED.
* misc.cc (gfc_basic_typename): Add UNSIGNED.
(gfc_typename): Likewise.
* primary.cc (convert_unsigned): New function.
(match_unsigned_constant): New function.
(gfc_match_literal_constant): Handle unsigned.
* resolve.cc (resolve_operator): Handle unsigned.
(resolve_ordinary_assign): Likewise.
* simplify.cc (convert_mpz_to_unsigned): Renamed to...
(gfc_convert_mpz_to_unsigned): and adjusted.
(gfc_simplify_bit_size): Adjusted for unsigned.
(compare_bitwise): Likewise.
(gfc_simplify_bge): Likewise.
(gfc_simplify_bgt): Likewise.
(gfc_simplify_ble): Likewise.
(gfc_simplify_blt): Likewise.
(simplify_cmplx): Likewise.
(gfc_simplify_digits): Likewise.
(simplify_dshift): Likewise.
(gfc_simplify_huge): Likewise.
(gfc_simplify_iand): Likewise.
(gfc_simplify_ibclr): Likewise.
(gfc_simplify_ibits): Likewise.
(gfc_simplify_ibset): Likewise.
(gfc_simplify_ieor): Likewise.
(gfc_simplify_uint): Likewise.
(gfc_simplify_ior): Likewise.
(simplify_shift): Likewise.
(gfc_simplify_ishftc): Likewise.
(gfc_simplify_merge_bits): Likewise.
(min_max_choose): Likewise.
(gfc_simplify_mod): Likewise.
(gfc_simplify_modulo): Likewise.
(gfc_simplify_popcnt): Likewise.
(gfc_simplify_range): Likewise.
(gfc_simplify_selected_unsigned_kind): Likewise.
(gfc_convert_constant): Likewise.
* target-memory.cc (size_unsigned): New function.
(gfc_element_size): Adjust for unsigned.
* trans-const.h (gfc_conv_mpz_unsigned_to_tree): Add prototype.
* trans-const.cc (gfc_conv_mpz_unsigned_to_tree): Handle unsigneds.
(gfc_conv_constant_to_tree): Likewise.
* trans-decl.cc (gfc_conv_cfi_to_gfc): Put in "not yet implemented".
* trans-expr.cc (gfc_conv_gfc_desc_to_cfi_desc): Likewise.
* trans-stmt.cc (gfc_trans_integer_select): Handle unsigned.
(gfc_trans_select): Likewise.
* trans-intrinsic.cc (gfc_conv_intrinsic_mod): Handle unsigned.
(gfc_conv_intrinsic_shift): Likewise.
(gfc_conv_intrinsic_function): Add GFC_ISYM_UINT.
* trans-io.cc (enum iocall): Add IOCALL_X_UNSIGNED and IOCALL_X_UNSIGNED_WRITE.
(gfc_build_io_library_fndecls): Add transfer_unsigned and transfer_unsigned_write.
(transfer_expr): Handle unsigneds.
* trans-types.cc (gfc_unsinged_kinds): New array.
(gfc_unsigned_types): Likewise.
(gfc_init_kinds): Handle them.
(validate_unsigned): New function.
(gfc_validate_kind): Use it.
(gfc_build_unsigned_type): New function.
(gfc_init_types): Use it.
(gfc_get_unsigned_type): New function.
(gfc_typenode_for_spec): Handle unsigned.
* trans-types.h (gfc_get_unsigned_type): New prototype.

libgfortran/ChangeLog:

* gfortran.map: Add _gfortran_transfer_unsgned and
_gfortran_transfer-signed.
* io/io.h (set_unsigned): New prototype.
(us_max): New prototype.
(read_decimal_unsigned): New prototype.
(write_iu): New prototype.
* io/list_read.c (convert_unsigned): New function.
(read_integer): Also handle unsigneds.
(list_formatted_read_scalar): Handle unsigneds.
(nml_read_obj): Likewise.
* io/read.c (set_unsigned): New function.
(us_max): New function.
(read_utf8): Whitespace fixes.
(read_default_char1): Whitespace fixes.
(read_a_char4): Whitespace fixes.
(next_char): Whiltespace fixes.
(read_decimal_unsigned): New function.
(read_f): Whitespace fixes.
(read_x): Whitespace fixes.
* io/transfer.c (transfer_unsigned): New function.
(transfer_unsigned_write): New function.
(require_one_of_two_types): New function.
(formatted_transfer_scalar_read): Use it.
(formatted_transfer_scalar_write): Also use it.
* io/write.c (write_decimal_unsigned): New function.
(write_iu): New function.
(write_unsigned): New function.
(list_formatted_write_scalar): Adjust for unsigneds.
* libgfortran.h (GFC_UINTEGER_1_HUGE): Define.
(GFC_UINTEGER_2_HUGE): Define.
(GFC_UINTEGER_4_HUGE): Define.
(GFC_UINTEGER_8_HUGE): Define.
(GFC_UINTEGER_16_HUGE): Define.
(HAVE_GFC_UINTEGER_1): Undefine (done by mk-kind-h.sh)
(HAVE_GFC_UINTEGER_4): Likewise.
* mk-kinds-h.sh: Add GFC_UINTEGER_*_HUGE.

gcc/testsuite/ChangeLog:

* gfortran.dg/unsigned_1.f90: New test.
* gfortran.dg/unsigned_10.f90: New test.
* gfortran.dg/unsigned_11.f90: New test.
* gfortran.dg/unsigned_12.f90: New test.
* gfortran.dg/unsigned_13.f90: New test.
* gfortran.dg/unsigned_14.f90: New test.
* gfortran.dg/unsigned_15.f90: New test.
* gfortran.dg/unsigned_16.f90: New test.
* gfortran.dg/unsigned_17.f90: New test.
* gfortran.dg/unsigned_18.f90: New test.
* gfortran.dg/unsigned_19.f90: New test.
* gfortran.dg/unsigned_2.f90: New test.
* gfortran.dg/unsigned_20.f90: New test.
* gfortran.dg/unsigned_21.f90: New test.
* gfortran.dg/unsigned_22.f90: New test.
* gfortran.dg/unsigned_23.f90: New test.
* gfortran.dg/unsigned_24.f: New test.
* gfortran.dg/unsigned_3.f90: New test.
* gfortran.dg/unsigned_4.f90: New test.
* gfortran.dg/unsigned_5.f90: New test.
* gfortran.dg/unsigned_6.f90: New test.
* gfortran.dg/unsigned_7.f90: New test.
* gfortran.dg/unsigned_8.f90: New test.
* gfortran.dg/unsigned_9.f90: New test.

10 months agolibiberty: Fix up > 64K section handling in simple_object_elf_copy_lto_debug_section...
Jakub Jelinek [Sat, 7 Sep 2024 07:36:53 +0000 (09:36 +0200)] 
libiberty: Fix up > 64K section handling in simple_object_elf_copy_lto_debug_section [PR116614]

cat abc.C
  #define A(n) struct T##n {} t##n;
  #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
  #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
  #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
  #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
  E(1) E(2) E(3)
  int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (and especially reading used
quite often already) and a further bug fixed in it in the PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
    {
      *err = ENOTSUP;
      return "Too many copied sections";
    }
5 lines, the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has less than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section, if one
doesn't need to use SHN_XINDEX in there, the section should (or should be
updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would
be overwise stored but couldn't fit.  But repeating due to that all the
symtab decisions what to discard and how to rewrite it would be ugly.

So, the patch instead emits the .symtab_shndx section (or sections) last
and prepares the content during the .symtab processing and in a second
pass when going just through .symtab_shndx sections just uses the saved
content.

2024-09-07  Jakub Jelinek  <jakub@redhat.com>

PR lto/116614
* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
comments.
(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
consistency.
(simple_object_elf_find_sections): Formatting fixes.
(simple_object_elf_fetch_attributes): Likewise.
(simple_object_elf_attributes_merge): Likewise.
(simple_object_elf_start_write): Likewise.
(simple_object_elf_write_ehdr): Likewise.
(simple_object_elf_write_shdr): Likewise.
(simple_object_elf_write_to_file): Likewise.
(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
over .symtab_shndx sections, though emit those last and compute their
section content when processing associated .symtab sections.  Handle
simple_object_internal_read failure even in the .symtab_shndx reading
case.

10 months agoDaily bump.
GCC Administrator [Sat, 7 Sep 2024 00:17:53 +0000 (00:17 +0000)] 
Daily bump.

10 months agolibstdc++: Fix std::chrono::parse for TAI and GPS clocks
Jonathan Wakely [Wed, 4 Sep 2024 20:23:20 +0000 (21:23 +0100)] 
libstdc++: Fix std::chrono::parse for TAI and GPS clocks

Howard Hinnant brought to my attention that chrono::parse was giving
incorrect values for chrono::gps_clock, because it was applying the
offset between the GPS clock and UTC. That's incorrect, because when we
parse HH::MM::SS as a GPS time, the result should be that time, not
HH:MM:SS+offset.

The problem was that I was using clock_cast to convert from sys_time to
utc_time and then using clock_time again to convert to gps_time. The
solution is to convert the parsed time into an duration representing the
time since the GPS clock's epoch, then construct a gps_time directly
from that duration.

As well as adding tests for correct round tripping of times for all
clocks, this also adds some more tests for correct results with
std::format.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (from_stream): Fix conversions in
overloads for gps_time and tai_time.
* testsuite/std/time/clock/file/io.cc: Test round tripping using
chrono::parse. Add additional std::format tests.
* testsuite/std/time/clock/gps/io.cc: Likewise.
* testsuite/std/time/clock/local/io.cc: Likewise.
* testsuite/std/time/clock/tai/io.cc: Likewise.
* testsuite/std/time/clock/utc/io.cc: Likewise.

10 months agoc++: adjust testcase to reveal failure [PR107919]
Jason Merrill [Fri, 6 Sep 2024 19:14:33 +0000 (15:14 -0400)] 
c++: adjust testcase to reveal failure [PR107919]

This test appeared to be passing, but only because the warning was
suppressed by #pragma system_header.

PR tree-optimization/107919

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuninitialized-pr107919-1.C: Add -Wsystem-headers and
xfail.

10 months agolibstdc++: add missing __
Jason Merrill [Fri, 6 Sep 2024 16:12:24 +0000 (12:12 -0400)] 
libstdc++: add missing __

I forgot the __ in my recent r15-3500-g1914ca8791ce4e.

libstdc++-v3/ChangeLog:

* include/bits/regex_constants.h: Add __ to attribute.

10 months agoUpdate gcc uk.po
Joseph Myers [Fri, 6 Sep 2024 16:13:40 +0000 (16:13 +0000)] 
Update gcc uk.po

* uk.po: Update.

10 months agors6000,extend and document built-ins vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros
Carl Love [Fri, 6 Sep 2024 16:06:34 +0000 (12:06 -0400)] 
rs6000,extend and document built-ins vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros

The built-ins currently support vector unsigned char arguments.  Extend the
built-ins to also support vector signed char and vector bool char
arguments.

Add documentation for the Power 10 built-ins vec_test_lsbb_all_ones
and vec_test_lsbb_all_zeros.  The vec_test_lsbb_all_ones built-in
returns 1 if the least significant bit in each byte is a 1, returns
0 otherwise.  Similarly, vec_test_lsbb_all_zeros returns a 1 if
the least significant bit in each byte is a zero and 0 otherwise.

Add addtional test cases for the built-ins in files:
  gcc/testsuite/gcc.target/powerpc/lsbb.c
  gcc/testsuite/gcc.target/powerpc/lsbb-runnable.c

gcc/ChangeLog:
* config/rs6000/rs6000-overload.def (vec_test_lsbb_all_ones,
vec_test_lsbb_all_zeros): Add built-in instances for vector signed
char and vector bool char.
* doc/extend.texi (vec_test_lsbb_all_ones,
vec_test_lsbb_all_zeros): Add documentation for the
existing built-ins.

gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/lsbb-runnable.c: Add test cases for the vector
signed char and vector bool char instances of
vec_test_lsbb_all_zeros and vec_test_lsbb_all_ones built-ins.
* gcc.target/powerpc/lsbb.c: Add compile test cases for the vector
signed char and vector bool char instances of
vec_test_lsbb_all_zeros and vec_test_lsbb_all_ones built-ins.

10 months agomiddle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognitio...
Tamar Christina [Fri, 6 Sep 2024 13:05:43 +0000 (14:05 +0100)] 
middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

Because the vect_recog_bool_pattern can at the moment still transition
out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can
end up using an expression as a mask rather than an SSA_NAME.

This adds an explicit check that we have a mask and not an expression.

gcc/ChangeLog:

PR tree-optimization/116628
* tree-vect-patterns.cc (vect_recog_cond_store_pattern): Add SSA_NAME
check on expression.

gcc/testsuite/ChangeLog:

PR tree-optimization/116628
* gcc.dg/vect/pr116628.c: New test.

10 months agoaarch64: Use is_attribute_namespace_p and get_attribute_name inside aarch64_lookup_sh...
Andrew Pinski [Wed, 4 Sep 2024 16:06:53 +0000 (09:06 -0700)] 
aarch64: Use is_attribute_namespace_p and get_attribute_name inside aarch64_lookup_shared_state_flags [PR116598]

The code in aarch64_lookup_shared_state_flags all C++11 attributes on the function type
had a namespace associated with them. But with the addition of reproducible/unsequenced,
this is not true.

This fixes the issue by using is_attribute_namespace_p instead of manually figuring out
the namespace is named "arm" and uses get_attribute_name instead of manually grabbing
the attribute name.

Built and tested for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/116598
* config/aarch64/aarch64.cc (aarch64_lookup_shared_state_flags): Use
is_attribute_namespace_p and get_attribute_name instead of manually grabbing
the namespace and name of the attribute.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
10 months agoipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra
Martin Jambor [Fri, 6 Sep 2024 12:12:54 +0000 (14:12 +0200)] 
ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

When looking at PR 115815 we realized that it would make sense to make
calls to functions originally declared static constructors and
destructors created by pass_ipa_cdtor_merge visible to IPA-SRA.  This
patch does that.

gcc/ChangeLog:

2024-07-25  Martin Jambor  <mjambor@suse.cz>

* passes.def: Move pass_ipa_cdtor_merge before pass_ipa_cp and
pass_ipa_sra.

10 months agoipa: Treat static constructors and destructors as non-local (PR 115815)
Martin Jambor [Fri, 6 Sep 2024 12:12:53 +0000 (14:12 +0200)] 
ipa: Treat static constructors and destructors as non-local (PR 115815)

In PR 115815, IPA-SRA thought it had control over all invocations of a
(recursive) static destructor but it did not see the implied
invocation which led to the original being left behind and the
clean-up code encountering uses of SSAs that definitely should have
been dead.

Fixed by teaching cgraph_node::can_be_local_p about static
constructors and destructors.  Similar test is missing in
cgraph_node::local_p so I added the check there as well.

gcc/ChangeLog:

2024-07-25  Martin Jambor  <mjambor@suse.cz>

PR ipa/115815
* cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check
DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR.
* ipa-visibility.cc (non_local_p): Likewise.
(cgraph_node::local_p): Delete extraneous line of tabs.

gcc/testsuite/ChangeLog:

2024-07-25  Martin Jambor  <mjambor@suse.cz>

PR ipa/115815
* gcc.dg/lto/pr115815_0.c: New test.

10 months agoFix SLP double-reduction support
Richard Biener [Fri, 6 Sep 2024 11:24:38 +0000 (13:24 +0200)] 
Fix SLP double-reduction support

When doing SLP discovery I forgot to handle double reductions even
though they are already queued in LOOP_VINFO_REDUCTIONS.

* tree-vect-slp.cc (vect_analyze_slp): Also handle discovery
for double reductions.

10 months agoc++: Partially implement CWG 2867 - Order of initialization for structured bindings...
Jakub Jelinek [Fri, 6 Sep 2024 11:50:47 +0000 (13:50 +0200)] 
c++: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

The following patch partially implements CWG 2867
- Order of initialization for structured bindings.
The DR requires that initialization of e is sequenced before r_i and
that r_i initialization is sequenced before r_j for j > i, we already do it
that way, the former ordering is a necessity so that the get calls are
actually emitted on already initialized variable, the rest just because
we implemented it that way, by going through the structured binding
vars in ascending order and doing their initialization.

The hard part not implemented yet is the lifetime extension of the
temporaries from the e initialization to after the get calls (if any).
Unlike the range-for lifetime extension patch which I've posted recently
where IMO we can just ignore lifetime extension of reference bound
temporaries because all the temporaries are extended to the same spot,
here lifetime extension of reference bound temporaries should last until
the end of lifetime of e, while other temporaries only after all the get
calls.

The patch just attempts to deal with automatic structured bindings for now,
I'll post a patch for static locals incrementally and I don't have a patch
for namespace scope structured bindings yet, this patch should just keep
existing behavior for both static locals and namespace scope structured
bindings.

What GCC currently emits is a CLEANUP_POINT_EXPR around the e
initialization, followed optionally by nested CLEANUP_STMTs for cleanups
like the e dtor if any and dtors of lifetime extended temporaries from
reference binding; inside of the CLEANUP_STMT CLEANUP_BODY then the
initialization of the individual variables for the tuple case, again with
optional CLEANUP_STMT if e.g. lifetime extended temporaries from reference
binding are needed in those.

The following patch drops that first CLEANUP_POINT_EXPR and instead
wraps the whole sequence of the e initialization and the individual variable
initialization with get calls after it into a single CLEANUP_POINT_EXPR.
If there are any CLEANUP_STMTs needed, they are all emitted first, with
the CLEANUP_POINT_EXPR for e initialization and the individual variable
initialization inside of those, and a guard variable set after different
phases in those expressions guarding the corresponding cleanups, so that
they aren't invoked until the respective variables are constructed.
This is implemented by cp_finish_decl doing cp_finish_decomp on its own
when !processing_template_decl (otherwise we often don't cp_finish_decl
or process it at a different time from when we want to call
cp_finish_decomp) or unless the decl is erroneous (cp_finish_decl has
too many early returns for erroneous cases, and for those we can actually
call it even multiple times, for the non-erroneous cases
non-processing_template_decl cases we need to call it just once).

The two testcases try to construct various temporaries and variables and
verify the order in which the temporaries and variables are constructed and
destructed.

2024-09-06  Jakub Jelinek  <jakub@redhat.com>

PR c++/115769
* cp-tree.h: Partially implement CWG 2867 - Order of initialization
for structured bindings.
(cp_finish_decomp): Add TEST_P argument defaulted to false.
* decl.cc (initialize_local_var): Add DECOMP argument, if true,
don't build cleanup and temporarily override stmts_are_full_exprs_p
to 0 rather than 1.  Formatting fix.
(cp_finish_decl): Invoke cp_finish_decomp for structured bindings
here, first with test_p.  For automatic structured binding bases
if the test cp_finish_decomp returned true wrap the initialization
together with what non-test cp_finish_decomp emits with a
CLEANUP_POINT_EXPR, and if there are any CLEANUP_STMTs needed, emit
them around the whole CLEANUP_POINT_EXPR with guard variables for the
cleanups.  Call cp_finish_decomp using RAII if not called with
decomp != NULL otherwise.
(cp_finish_decomp): Add TEST_P argument, change return type from
void to bool, if TEST_P is true, return true instead of emitting
actual code for the tuple case, otherwise return false.
* parser.cc (cp_convert_range_for): Don't call cp_finish_decomp
after cp_finish_decl.
(cp_parser_decomposition_declaration): Set DECL_DECOMP_BASE
before cp_finish_decl call.  Don't call cp_finish_decomp after
cp_finish_decl.
(cp_finish_omp_range_for): Don't call cp_finish_decomp after
cp_finish_decl.
* pt.cc (tsubst_stmt): Likewise.

* g++.dg/DRs/dr2867-1.C: New test.
* g++.dg/DRs/dr2867-2.C: New test.

10 months agoAVR: lra/116321 - Add test case.
Georg-Johann Lay [Fri, 6 Sep 2024 11:47:12 +0000 (13:47 +0200)] 
AVR: lra/116321 - Add test case.

PR rtl-optimization/116321
gcc/testsuite/
* gcc.target/avr/torture/lra-pr116321.c: New test.

10 months agolibstdc++: avoid __GLIBCXX__ redefinition
Jason Merrill [Tue, 27 Aug 2024 17:15:38 +0000 (13:15 -0400)] 
libstdc++: avoid __GLIBCXX__ redefinition

testsuite/lib/dg-options.exp defines __GLIBCXX__ to 9999999; avoid a macro
redefinition warning in that case.

libstdc++-v3/ChangeLog:

* include/bits/c++config: Avoid redefining __GLIBCXX__.

10 months agoFortran: Add OpenMP 'interop' directive parsing support
Tobias Burnus [Fri, 6 Sep 2024 09:45:46 +0000 (11:45 +0200)] 
Fortran: Add OpenMP 'interop' directive parsing support

Parse OpenMP's 'interop' directive but stop with a 'sorry, unimplemented'
after resolving.

Additionally, it moves some clause dumping away from the end directive as
that lead to 'nowait' not being printed when it should as some cases were
missed.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (show_omp_namelist): Handle OMP_LIST_INIT.
(show_omp_clauses): Handle OMP_LIST_{INIT,USE,DESTORY}; move 'nowait'
from end-directive to the directive dump.
(show_omp_node, show_code_node): Handle EXEC_OMP_INTEROP.
* gfortran.h (enum gfc_statement): Add ST_OMP_INTEROP.
(OMP_LIST_INIT, OMP_LIST_USE, OMP_LIST_DESTROY): Add.
(enum gfc_exec_op): Add EXEC_OMP_INTEROP.
(struct gfc_omp_namelist): Add interop items to union.
(gfc_free_omp_namelist): Add boolean arg.
* match.cc (gfc_free_omp_namelist): Update to free
interop union members.
* match.h (gfc_match_omp_interop): New.
* openmp.cc (gfc_omp_directives): Uncomment 'interop' entry.
(gfc_free_omp_clauses, gfc_match_omp_allocate,
gfc_match_omp_flush, gfc_match_omp_clause_reduction): Update
call.
(enum omp_mask2): Add OMP_CLAUSE_{INIT,USE,DESTROY}.
(OMP_INTEROP_CLAUSES): Use it.
(gfc_match_omp_clauses): Match those clauses.
(gfc_match_omp_prefer_type, gfc_match_omp_init,
gfc_match_omp_interop): New.
(resolve_omp_clauses): Handle interop clauses.
(omp_code_to_statement): Add ST_OMP_INTEROP.
(gfc_resolve_omp_directive): Add EXEC_OMP_INTEROP.
* parse.cc (decode_omp_directive): Parse 'interop' directive.
(next_statement, gfc_ascii_statement): Handle ST_OMP_INTEROP.
* st.cc (gfc_free_statement): Likewise
* resolve.cc (gfc_resolve_code): Handle EXEC_OMP_INTEROP.
* trans.cc (trans_code): Likewise.
* trans-openmp.cc (gfc_trans_omp_directive): Print 'sorry'
for EXEC_OMP_INTEROP.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/interop-1.f90: New test.
* gfortran.dg/gomp/interop-2.f90: New test.
* gfortran.dg/gomp/interop-3.f90: New test.

10 months agoHandle non-grouped stores as single-lane SLP
Richard Biener [Fri, 29 Sep 2023 10:54:17 +0000 (12:54 +0200)] 
Handle non-grouped stores as single-lane SLP

The following enables single-lane loop SLP discovery for non-grouped stores
and adjusts vectorizable_store to properly handle those.

For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop,
not running into the "not falling back to strided accesses" bail-out.
I have not investigated in detail.

There is a set of i386 target assembler test FAILs,
gcc.target/i386/pr88531-2[bc].c in particular fail because the
target cannot identify SLP emulated gathers, see another mail from me.
Others need adjustment, I've adjusted one with this patch only.
In particular there are gcc.target/i386/cond_op_fma_*-1.c FAILs
that are because we no longer fold a VEC_COND_EXPR during the
region value-numbering we do after vectorization since we
code-generate a { 0.0, ... } constant in the VEC_COND_EXPR now
instead of having a separate statement which gets forwarded
and then triggers folding.  This leads to sligtly different
code generation.  The solution is probably to use gimple_build
when building stmts or, in this case, directly emit .COND_FMA
instead of .FMA and a VEC_COND_EXPR.

gcc.dg/vect/slp-19a.c mixes contiguous 8-lane SLP with a single
lane contiguous store from one lane of the 8-lane load and we
expect to use load-lanes for this reason but the heuristic for
forcing single-lane rediscovery as implemented doesn't trigger
here as it treats both SLP instances separately.  FAILs on RISC-V

gcc.dg/vect/slp-19c.c shows we fail to implement an interleaving
scheme for group_size 12 (by extension using the group_size 3
scheme to reduce to 4 lanes and then continue with a pow2 scheme
would work);  we are also not considering load-lanes because of
the above reason, but aarch64 cannot do ld12.  FAILs on AARCH64
(load requires three vectors) and x86_64.

gcc.dg/vect/slp-19c.c FAILs with variable-length vectors because
of "SLP induction not supported for variable-length vectors".

gcc.target/aarch64/pr110449.c will FAIL because the (contested)
optimization in r14-2367-g224fd59b2dc8a5 was only applied to
loop-vect but not SLP vect.  I'll leave it to target maintainers
to either XFAIL (the optimization is bad) or remove the test.

* tree-vect-slp.cc (vect_analyze_slp): Perform single-lane
loop SLP discovery for non-grouped stores.  Move check on the root
for re-doing SLP analysis with a single lane for load/store-lanes
earlier and make sure we are dealing with a grouped access.
* tree-vect-stmts.cc (vectorizable_store): Always set
vec_num for SLP.

* gcc.dg/vect/O3-pr39675-2.c: Adjust expected number of SLP.
* gcc.dg/vect/fast-math-vect-call-1.c: Likewise.
* gcc.dg/vect/no-scevccp-slp-31.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-12c.c: Likewise.
* gcc.dg/vect/slp-19a.c: Likewise.
* gcc.dg/vect/slp-19b.c: Likewise.
* gcc.dg/vect/slp-4-big-array.c: Likewise.
* gcc.dg/vect/slp-4.c: Likewise.
* gcc.dg/vect/slp-5.c: Likewise.
* gcc.dg/vect/slp-7.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-37.c: Likewise.
* gcc.dg/vect/fast-math-vect-call-2.c: Likewise.
* gcc.dg/vect/slp-26.c: RISC-V can now SLP two instances.
* gcc.dg/vect/vect-outer-slp-3.c: Disable vectorization of
initialization loop.
* gcc.dg/vect/slp-reduc-5.c: Likewise.
* gcc.dg/vect/no-scevccp-outer-12.c: Un-XFAIL.  SLP can handle
inner loop inductions with multiple vector stmt copies.
* gfortran.dg/vect/vect-8.f90: Adjust expected number of
vectorized loops.
* gcc.target/i386/vectorize1.c: Adjust what we scan for.

11 months agoAVR: Remove "Atmel" from header comment.
Georg-Johann Lay [Sun, 1 Sep 2024 15:19:38 +0000 (17:19 +0200)] 
AVR: Remove "Atmel" from header comment.

gcc/
* config/avr/avr.h: Remove "Atmel" from header comment.
* config/avr/avr.cc: Same.
* config/avr/avr.md: Same.
* config/avr/avr.opt: Same.
* config/avr/avr-dimode.md: Same.
* config/avr/avr-fixed.md: Same.
* config/avr/constraints.md: Same.
* config/avr/predicates.md: Same.
* config/avr/avr-log.cc: Same.
* config/avr/avrlibc.h: Same.
* config/avr/specs.h: Same.
* common/config/avr/avr-common.cc: Same.
* doc/install.texi: Same.
* config/avr/avr-arch.h: Adjust header comment.
* config/avr/avr-c.cc: Same.
* config/avr/avr-mcus.def: Same.
* config/avr/avr-modes.def: Same.
* config/avr/avr-passes.cc: Same.
* config/avr/avr-passes.def: Same.
* config/avr/avr-protos.h: Same.
* config/avr/driver-avr.cc: Same.
* config/avr/elf.h: Same.
* config/avr/gen-avr-mmcu-specs.cc: Same.
* config/avr/gen-avr-mmcu-texi.cc: Same.

11 months agotree-optimization/116610 - wrong SLP induction bias for mask peeling
Richard Biener [Thu, 5 Sep 2024 09:18:57 +0000 (11:18 +0200)] 
tree-optimization/116610 - wrong SLP induction bias for mask peeling

The following fixes a mistake when applying the bias for peeling via
masking to the inital value of SLP inductions.

This resolves gcc.target/aarch64/sve/peel_ind_1.c (a scan-assembler
only unfortunately) when forcing single-lane SLP for it.

PR tree-optimization/116610
* tree-vect-loop.cc (vectorizable_induction): Use MINUS_EXPR
to apply a mask peeling adjustment.

11 months agotree-optimization/116609 - SLP live lane vectorization with partial vectors
Richard Biener [Thu, 5 Sep 2024 08:46:58 +0000 (10:46 +0200)] 
tree-optimization/116609 - SLP live lane vectorization with partial vectors

The following implements the simple case of single-lane SLP when
using partial vectors which can use the VEC_EXTRACT_LAST code
generation without changes.  I'll keep the PR open for further
enhancements.

This avoids FAILs of gcc.target/aarch64/sve/live_1.c when using
single-lane SLP for non-grouped stores.

PR tree-optimization/116609
* tree-vect-loop.cc (vectorizable_live_operation_1): Support
partial vectors for single-lane SLP.

11 months ago[PATCH 2/2 v2] RISC-V: Constant synthesis of inverted halves
Raphael Moreira Zinsly [Fri, 6 Sep 2024 04:14:32 +0000 (22:14 -0600)] 
[PATCH 2/2 v2] RISC-V: Constant synthesis of inverted halves

Changes since v1:
- Fix synthesis-15.c.

-- >8 --

Improve handling of constants where the high half can be constructed by
inverting the lower half.

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_build_integer): Detect constants
were the higher half is the lower half inverted.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/synthesis-15.c: New test.

11 months ago[PATCH 1/2 v2] RISC-V: Additional large constant synthesis improvements
Raphael Moreira Zinsly [Fri, 6 Sep 2024 03:50:54 +0000 (21:50 -0600)] 
[PATCH 1/2 v2] RISC-V: Additional large constant synthesis improvements

Changes since v1:
- Fix bit31.
- Remove negative shift checks.
- Fix synthesis-7.c expected output.

-- >8 --

Improve handling of large constants in riscv_build_integer, generate
better code for constants where the high half can be constructed
by shifting/shiftNadding the low half or if the halves differ by less
than 2k.

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_build_integer): Detect new case
of constants that can be improved.
(riscv_move_integer): Add synthesys for concatening constants
without Zbkb.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/synthesis-7.c: Adjust expected output.
* gcc.target/riscv/synthesis-12.c: New test.
* gcc.target/riscv/synthesis-13.c: New test.
* gcc.target/riscv/synthesis-14.c: New test.

11 months agoMatch: Add int type fits check for form 2 of .SAT_SUB imm operand
Pan Li [Mon, 2 Sep 2024 03:33:08 +0000 (11:33 +0800)] 
Match: Add int type fits check for form 2 of .SAT_SUB imm operand

This patch would like to add strict check for imm operand of .SAT_SUB
matching.  We have no type checking for imm operand in previous, which
may result in unexpected IL to be catched by .SAT_SUB pattern.

We leverage the int_fits_type_p here to make sure the imm operand is
a int type fits the result type of the .SAT_SUB.  For example:

Fits uint8_t:
uint8_t a;
uint8_t sum = .SAT_SUB (a, 12);
uint8_t sum = .SAT_SUB (a, 12u);
uint8_t sum = .SAT_SUB (a, 126u);
uint8_t sum = .SAT_SUB (a, 128u);
uint8_t sum = .SAT_SUB (a, 228);
uint8_t sum = .SAT_SUB (a, 223u);

Not fits uint8_t:
uint8_t a;
uint8_t sum = .SAT_SUB (a, -1);
uint8_t sum = .SAT_SUB (a, 256u);
uint8_t sum = .SAT_SUB (a, 257);

The below test suite are passed for this patch:
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.

gcc/ChangeLog:

* match.pd: Add int_fits_type_p check for .SAT_SUB imm operand.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_add_imm_type_check-57.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-58.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-59.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-60.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoMatch: Add int type fits check for form 1 of .SAT_SUB imm operand
Pan Li [Mon, 2 Sep 2024 01:48:46 +0000 (09:48 +0800)] 
Match: Add int type fits check for form 1 of .SAT_SUB imm operand

This patch would like to add strict check for imm operand of .SAT_SUB
matching.  We have no type checking for imm operand in previous, which
may result in unexpected IL to be catched by .SAT_SUB pattern.

We leverage the int_fits_type_p here to make sure the imm operand is
a int type fits the result type of the .SAT_SUB.  For example:

Fits uint8_t:
uint8_t a;
uint8_t sum = .SAT_SUB (12, a);
uint8_t sum = .SAT_SUB (12u, a);
uint8_t sum = .SAT_SUB (126u, a);
uint8_t sum = .SAT_SUB (128u, a);
uint8_t sum = .SAT_SUB (228, a);
uint8_t sum = .SAT_SUB (223u, a);

Not fits uint8_t:
uint8_t a;
uint8_t sum = .SAT_SUB (-1, a);
uint8_t sum = .SAT_SUB (256u, a);
uint8_t sum = .SAT_SUB (257, a);

The below test suite are passed for this patch:
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.

gcc/ChangeLog:

* match.pd: Add int_fits_type_p check for .SAT_SUB imm operand.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_add_imm_type_check-53.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-54.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-55.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-56.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
11 months agoRISC-V: Fix out of index in riscv_select_multilib_by_abi
YunQiang Su [Thu, 5 Sep 2024 11:55:20 +0000 (19:55 +0800)] 
RISC-V: Fix out of index in riscv_select_multilib_by_abi

commit b5c2aae48723c9098a8a3dab1409b30fd87bbf56
Author: YunQiang Su <yunqiang@isrc.iscas.ac.cn>
Date:   Thu Sep 5 15:14:43 2024 +0800

    RISC-V: Lookup reversely in riscv_select_multilib_by_abi

The last element should use index
   multilib_infos.size () - 1

gcc
* common/config/riscv/riscv-common.cc(riscv_select_multilib_by_abi):
Fix out of index problem.

11 months agoc-family: add attribute flag_enum [PR81665]
Jason Merrill [Thu, 29 Aug 2024 15:09:21 +0000 (11:09 -0400)] 
c-family: add attribute flag_enum [PR81665]

Several PRs complain about -Wswitch warning about a case for a bitwise
combination of enumerators.  Clang has an attribute flag_enum to prevent
this; let's adopt that approach as well.

This also recognizes the attribute as [[clang::flag_enum]], introducing
handling of the clang attribute namespace.

PR c++/46457
PR c++/81665

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_flag_enum_attribute): New.
(c_common_gnu_attributes): Add it.
(c_common_clang_attributes, c_common_clang_attribute_table): New.
* c-common.h: Declare c_common_clang_attribute_table.
* c-warn.cc (c_do_switch_warnings): Handle flag_enum.

gcc/c/ChangeLog:

* c-objc-common.h (c_objc_attribute_table): Add
c_common_clang_attribute_table.

gcc/cp/ChangeLog:

* cp-objcp-common.h (cp_objcp_attribute_table): Add
c_common_clang_attribute_table.

gcc/testsuite/ChangeLog:

* c-c++-common/attr-flag-enum-1.c: New test.

gcc/ChangeLog:

* doc/extend.texi: Document flag_enum attribute.
* doc/invoke.texi: Mention flag_enum in -Wswitch.

libstdc++-v3/ChangeLog:

* include/bits/regex_constants.h: Use flag_enum.

11 months agolibstdc++: -Wswitch and ios::openmode
Jason Merrill [Tue, 27 Aug 2024 17:16:27 +0000 (13:16 -0400)] 
libstdc++: -Wswitch and ios::openmode

In addition to marking it as flag_enum, we want to avoid warnings about
not having a case for the implementation detail enumerators
_S_ios_openmode_*.  And also for _S_noreplace in standard modes before it
was added.

libstdc++-v3/ChangeLog:

* include/bits/ios_base.h (_GLIBCXX_NOREPLACE_UNUSED): New.
(_Ios_Openmode): Add unused attributes.
* testsuite/27_io/ios_base/types/openmode/case_label.cc: Handle
noreplace.

11 months agoHandle const0_operand for *avx2_pcmp<mode>3_1.
liuhongt [Wed, 4 Sep 2024 07:39:17 +0000 (15:39 +0800)] 
Handle const0_operand for *avx2_pcmp<mode>3_1.

*<avx512>_eq<mode>3<mask_scalar_merge_name>_1 supports
nonimm_or_0_operand for op1 and op2, pass_combine would fail to lower
avx512 comparision back to avx2 one when op1/op2 is const0_rtx. It's
because the splitter only support nonimmediate_operand.

Failed to match this instruction:
(set (reg/i:V16QI 20 xmm0)
    (vec_merge:V16QI (const_vector:V16QI [
                (const_int -1 [0xffffffffffffffff]) repeated x16
            ])
        (const_vector:V16QI [
                (const_int 0 [0]) repeated x16
            ])
        (unspec:HI [
                (reg:V16QI 105 [ a ])
                (const_vector:V16QI [
                        (const_int 0 [0]) repeated x16
                    ])
                (const_int 0 [0])
            ] UNSPEC_PCMP)))

The patch extend predicates of the splitter to handles that.

gcc/ChangeLog:

PR target/115517
* config/i386/sse.md (*avx2_pcmp<mode>3_1): Change predicate
of operands[1] and operands[2] from nonimmdiate_operand to
nonimm_or_0_operand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr115517.c: New test.

11 months agoDaily bump.
GCC Administrator [Fri, 6 Sep 2024 00:19:10 +0000 (00:19 +0000)] 
Daily bump.

11 months ago[V2][RISC-V] Avoid unnecessary extensions after sCC insns
Jeff Law [Thu, 5 Sep 2024 21:45:25 +0000 (15:45 -0600)] 
[V2][RISC-V] Avoid unnecessary extensions after sCC insns

So the first patch failed the pre-commit CI; it didn't fail in my testing
because I'm using --with-arch to set a default configuration that includes
things like zicond to ensure that's always tested.  And the failing test is
skipped when zicond is enabled by default.

The failing test is designed to ensure that we don't miss an if-conversion due
to costing issues around the extension that was typically done in an sCC
sequence (which is why it's only run when zicond is off).

The test failed because we have a little routine that is highly dependent on
the code generated by the sCC expander and will adjust the costing to account
for expansion quirks that usually go away in register allocation.

That code needs to be enhanced to work after the sCC expansion change.
Essentially it needs to account for the subreg extraction that shows up in the
sequence as well as being a bit looser on mode checking.

I kept the code working for the old sequences -- in theory a user could conjure
up the old sequence so handling them seems useful.

This also drops the testsuite changes.  Palmer's change makes them unnecessary.

---

So I was looking at a performance regression in spec with Ventana's internal
tree.  Ultimately the problem was a bad interaction with an internal patch
(REP_MODE_EXTENDED), fwprop and ext-dce.  The details of that problem aren't
particularly important.

Removal of the local patch went reasonably well.  But I did see some secondary
cases where we had redundant sign extensions.  The most notable cases come from
the integer sCC insns.

Expansion of those cases for rv64 can be improved using Jivan's trick. ie, if
the target is not DImode, then create a DImode temporary for the result and
copy the low bits out with a promoted subreg to the real target.

With the change in expansion the final code we generate is slightly different
for a few tests at -O1/-Og, but should perform the same.  The key for the
affected tests is we're not seeing the introduction of unnecessary extensions.
Rather than adjust the regexps to handle the -O1/-Og output, skipping for those
seemed OK to me.  I didn't extract a testcase.  I'm a bit fried from digging
through LTO'd code right now.

gcc/
* config/riscv/riscv.cc (riscv_expand_int_scc): For rv64, use a DI
temporary for the output and a promoted subreg to extract it into SI
arget.
(riscv_noce_conversion_profitable_p): Recognize new output from
sCC expansion too.

11 months agoc++: tweak redeclaration-6.C
Jason Merrill [Thu, 5 Sep 2024 20:39:55 +0000 (16:39 -0400)] 
c++: tweak redeclaration-6.C

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/redeclaration-6.C: Add -fno-implicit-constexpr.

11 months agoFortran: fix ICE in gfc_create_module_variable [PR100273]
Harald Anlauf [Thu, 5 Sep 2024 19:30:25 +0000 (21:30 +0200)] 
Fortran: fix ICE in gfc_create_module_variable [PR100273]

gcc/fortran/ChangeLog:

PR fortran/100273
* trans-decl.cc (gfc_create_module_variable): Handle module
variable also when it is needed for the result specification
of a contained function.

gcc/testsuite/ChangeLog:

PR fortran/100273
* gfortran.dg/pr100273.f90: New test.

11 months agoc++: vtable referring to "unavailable" virtual fn [PR116606]
Marek Polacek [Thu, 5 Sep 2024 17:01:59 +0000 (13:01 -0400)] 
c++: vtable referring to "unavailable" virtual fn [PR116606]

mark_vtable_entries already has

   /* It's OK for the vtable to refer to deprecated virtual functions.  */
   warning_sentinel w(warn_deprecated_decl);

but that doesn't cover __attribute__((unavailable)).  We can use the
following override to cover both.

PR c++/116606

gcc/cp/ChangeLog:

* decl2.cc (mark_vtable_entries): Temporarily override deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.  Remove a warning_sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-13.C: New test.

11 months agoc++, coroutines: Revise promise construction/destruction.
Iain Sandoe [Sat, 31 Aug 2024 12:08:42 +0000 (13:08 +0100)] 
c++, coroutines: Revise promise construction/destruction.

In examining the coroutine testcases for unexpected diagnostic output
for 'Wall', I found a 'statement has no effect' warning for the promise
construction in one case.  In particular, the case is where the users
promise type has an implicit CTOR but a user-provided DTOR. Further, the
type does not actually need constructing.

In very early versions of the coroutines code we used to check
TYPE_NEEDS_CONSTRUCTING() to determine whether to attempt to build
a constructor call for the promise.  During review, it was suggested
to use type_build_ctor_call () instead.

This latter call checks the constructors in the type (both user-defined
and implicit) and returns true, amongst other cases if any of the found
CTORs are marked as deprecated.

In a number of places (for example [class.copy.ctor] / 6) the standard
says that some version of an implicit CTOR is deprecated when the user
provides a DTOR.

Thus, for this specific arrangement of promise type, type_build_ctor_call
returns true, because of (for example) a deprecated implicit copy CTOR.

We are not going to use any of the deprecated CTORs and thus will not
see warnings from this - however, since the call returned true, we have
now determined that we should attempt to build a constructor call.

Note as above, the type does not actually require construction and thus
one might expect either a NULL_TREE or error_mark_node in response to
the build_special_member_call ().  However, in practice the function
returns the original instance object instead of a call or some error.

When we add that as a statement it triggers the 'statement has no effect'
warning.

The patch here rearranges the promise construction/destruction code to
allow for the case that a DTOR is required independently of a CTOR. In
addition, we check that the return from build_special_member_call ()
has side effects before we add it as a statement.

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Separate the
build of promise constructor and destructor.  When evaluating
the constructor, check that build_special_member_call returns
an expression with side effects before adding it.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
11 months agoc++: local class memfn synth from noexcept context [PR113063]
Patrick Palka [Thu, 5 Sep 2024 18:31:00 +0000 (14:31 -0400)] 
c++: local class memfn synth from noexcept context [PR113063]

Extending the PR113063 testcase to additionally constant evaluate the <=>
expression causes us to trip over the assert in cxx_eval_call_expression

  /* We used to shortcut trivial constructor/op= here, but nowadays
     we can only get a trivial function here with -fno-elide-constructors.  */
  gcc_checking_assert (!trivial_fn_p (fun)
                       || !flag_elide_constructors
                       /* We don't elide constructors when processing
                          a noexcept-expression.  */
                       || cp_noexcept_operand);

since the local class's <=> was first used and therefore synthesized in
a noexcept context and so its definition contains unelided trivial
constructors.

This patch fixes this by clearing cp_noexcept_operand alongside
cp_unevaluated_context in the function-local case of
maybe_push_to_top_level.

PR c++/113063

gcc/cp/ChangeLog:

* name-lookup.cc (local_state_t): Clear and restore
cp_noexcept_operand as well.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-synth16.C: Also constant evaluate
the <=> expression.
* g++.dg/cpp2a/spaceship-synth16a.C: Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>
11 months agodoc: remove stray character
Marek Polacek [Thu, 5 Sep 2024 17:17:06 +0000 (13:17 -0400)] 
doc: remove stray character

There's an extra '+'.

gcc/ChangeLog:

* doc/invoke.texi: Remove an extra char in @item sme2.

11 months agoc++: fn redecl in fn scope wrongly accepted [PR116239]
Marek Polacek [Fri, 30 Aug 2024 18:12:22 +0000 (14:12 -0400)] 
c++: fn redecl in fn scope wrongly accepted [PR116239]

Redeclaration such as

  void f(void);
  consteval void f(void);

is invalid.  In a namespace scope, we detect the collision in
validate_constexpr_redeclaration, but not when one declaration is
at block scope.

When we have

  void f(void);
  void g() { consteval void f(void); }

we call pushdecl on the second f and call push_local_extern_decl_alias.
It finds the namespace-scope f:

        for (ovl_iterator iter (binding); iter; ++iter)
          if (decls_match (decl, *iter, /*record_versions*/false))
            {
              alias = *iter;
              break;
            }

but decls_match says they match so we just set DECL_LOCAL_DECL_ALIAS
(and do not call another pushdecl leading to duplicate_decls which
would detect mismatching return types, for example).  I don't think
we want to change decls_match, so a simple fix is to detect the
problem in push_local_extern_decl_alias.

PR c++/116239

gcc/cp/ChangeLog:

* cp-tree.h (validate_constexpr_redeclaration): Declare.
* decl.cc (validate_constexpr_redeclaration): No longer static.
* name-lookup.cc (push_local_extern_decl_alias): Call
validate_constexpr_redeclaration.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/redeclaration-6.C: New test.

11 months agoAvoid ICE when passing VLA vector to accelerator.
Prathamesh Kulkarni [Thu, 5 Sep 2024 13:22:53 +0000 (18:52 +0530)] 
Avoid ICE when passing VLA vector to accelerator.

gcc/ChangeLog:
* gimplify.cc (omp_add_variable): Check if decl size is not poly_int_tree_p.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
(lower_omp_target): Likewise.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
11 months agonvptx: Emit DECL and DEF linker markers for aliases [PR104957]
Thomas Schwinge [Wed, 17 Jul 2024 21:56:25 +0000 (23:56 +0200)] 
nvptx: Emit DECL and DEF linker markers for aliases [PR104957]

With nvptx '-malias' enabled (as implemented in
commit f8b15e177155960017ac0c5daef8780d1127f91c
"[nvptx] Use .alias directive for mptx >= 6.3"), the C++ front end in certain
cases does 'write_fn_proto' before an eventual 'alias' attribute has been
added.  In that case, we do emit (via 'write_fn_marker') a DECL linker marker,
but then never emit a corresponding DEF linker marker for the alias.  This
causes hundreds of instances of link-time 'unresolved symbol [alias]' across
the C++ test suite, which are regressions compared to a test run with (default)
'-mno-alias' (in which case the respective functions get duplicated).

PR target/104957
gcc/
* config/nvptx/nvptx.cc (write_fn_proto_1): Revert 2022-03-22
change; 'write_fn_marker' also for alias DECL.
(nvptx_asm_output_def_from_decls): 'write_fn_marker' for alias
DEF.
gcc/testsuite/
* g++.target/nvptx/alias-g++.dg_init_dtor2-1.C: Un-XFAIL.
* gcc.target/nvptx/alias-1.c: Likewise.
* gcc.target/nvptx/alias-3.c: Likewise.
* gcc.target/nvptx/alias-to-alias-1.c: Likewise.