git.ipfire.org Git - thirdparty/gcc.git/log

OpenACC 2.7: default clause support for data constructs

This patch implements the OpenACC 2.7 addition of default(none|present) support
for data constructs.

Now, specifying "default(none|present)" on a data construct turns on the same
default clause behavior for all lexically enclosed compute constructs (which
don't already themselves have a default clause).
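
As a rough illustration of the behavior described above (not taken from the
patch or its testcases), the sketch below puts default(present) on a data
construct and leaves the enclosed compute construct without a default clause
of its own. The function name acc_sum is hypothetical. Built without
-fopenacc, the pragmas are ignored and the loop runs on the host.

```c
#include <stddef.h>

/* Hypothetical helper: sum the first n elements of a.  */
float acc_sum(const float *a, size_t n)
{
  float sum = 0.0f;

  /* The default clause sits on the data construct...  */
#pragma acc data copyin(a[0:n]) default(present)
  {
    /* ...and this compute construct has no default clause of its own,
       so with this patch it is expected to pick up default(present)
       from the lexically enclosing data construct.  */
#pragma acc parallel loop reduction(+:sum)
    for (size_t i = 0; i < n; i++)
      sum += a[i];
  }

  return sum;
}
```

A compute construct that carries its own default clause keeps it and is not
affected by the clause on the enclosing data construct.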

gcc/c/ChangeLog:
* c-parser.cc (OACC_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DEFAULT.

gcc/cp/ChangeLog:
* parser.cc (OACC_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DEFAULT.

gcc/fortran/ChangeLog:
* openmp.cc (OACC_DATA_CLAUSES): Add OMP_CLAUSE_DEFAULT.

gcc/ChangeLog:
* gimplify.cc (oacc_region_type_name): New function.
(oacc_default_clause): If no 'default' clause appears on this
compute construct, see if one appears on a lexically containing
'data' construct.
(gimplify_scan_omp_clauses): Upon OMP_CLAUSE_DEFAULT case, set
ctx->oacc_default_clause_ctx to current context.

gcc/testsuite/ChangeLog:
* c-c++-common/goacc/default-3.c: Adjust testcase.
* c-c++-common/goacc/default-4.c: Adjust testcase.
* c-c++-common/goacc/default-5.c: Adjust testcase.
* gfortran.dg/goacc/default-3.f95: Adjust testcase.
* gfortran.dg/goacc/default-4.f: Adjust testcase.
* gfortran.dg/goacc/default-5.f: Adjust testcase.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>

RISC-V: Fix autovec_length_operand predicate[PR110989]

PR110989 exposed an incorrect definition of the autovec_length_operand
predicate in the following situation:

vect__6.24_107 = .MASK_LEN_LOAD (vectp.22_105, 32B, mask__49.21_99, POLY_INT_CST [2, 2], 0); ---> dummy length = VF.

The current autovec_length_operand predicate fails to recognize the VF dummy length.

-march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=scalable -Ofast -fno-schedule-insns -fno-schedule-insns2:

Before this patch:

srli a4,s0,2
addi a4,a4,-3
srli s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v v1
vmul.vx v1,v1,a4
addi a4,s0,-2
vadd.vx v1,v1,a4
addi a4,s0,-1
vslide1up.vx v2,v1,a4
vmv.v.x v1,a4
vand.vv v1,v2,v1
vl1re64.v v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv v0,v2,v1
vsetvli zero,s0,e32,mf2,ta,ma ---> s0 = POLY (2,2)
vle32.v v3,0(t3),v0.t
vsetvli a5,zero,e64,m1,ta,ma
vmfne.vv v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm v1,v1,v2,v0
vslidedown.vx v1,v1,a4
vfmv.f.s fa5,v1
j .L6

After this patch:

srli a4,s0,2
addi a4,a4,-3
srli s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v v1
vmul.vx v1,v1,a4
addi a4,s0,-2
vadd.vx v1,v1,a4
addi s0,s0,-1
vslide1up.vx v2,v1,s0
vmv.v.x v1,s0
vand.vv v1,v2,v1
vl1re64.v v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv v0,v2,v1
vle32.v v3,0(t3),v0.t
vmfne.vv v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm v1,v1,v2,v0
vslidedown.vx v1,v1,s0
vfmv.f.s fa5,v1
j .L6

Two vsetvli insns are eliminated.

gcc/ChangeLog:

PR target/110989
* config/riscv/predicates.md: Fix predicate.

gcc/testsuite/ChangeLog:

PR target/110989
* gcc.target/riscv/rvv/autovec/pr110989.c: Add vsetvli assembly check.

Cleanup BB vectorization roots handling

The following moves CONSTRUCTOR handling into the generic BB
vectorization roots handling, removing a special case and finally
renaming the function now consisting of more than just constructor
detection.

* tree-vect-slp.cc (vect_analyze_slp_instance): Remove
slp_inst_kind_ctor handling.
(vect_analyze_slp): Simplify.
(vect_build_slp_instance): Dump when we analyze a CTOR.
(vect_slp_check_for_constructors): Rename to ...
(vect_slp_check_for_roots): ... this. Register a
slp_root for CONSTRUCTORs instead of shoving them to
the set of grouped stores.
(vect_slp_analyze_bb_1): Adjust.

Support constants and externals in BB reduction vectorization

The following supports vectorizing BB reductions involving a
constant or an invariant.
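
For orientation, a straight-line reduction of the general shape this refers
to might look like the sketch below. This is an assumption for illustration,
not the contents of bb-slp-75.c; the function name is hypothetical, and
whether GCC actually vectorizes it depends on the target and options.

```c
/* Sum four adjacent loads together with a constant (42) and a value
   that is invariant within the basic block (k).  The four loads are the
   candidates for a vector reduction; the constant and the invariant
   are what remains to be added to the reduced result.  */
int sum4_plus(const int *a, int k)
{
  return a[0] + a[1] + a[2] + a[3] + 42 + k;
}
```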

* tree-vectorizer.h (_slp_instance::remain_stmts): Change
to ...
(_slp_instance::remain_defs): ... this.
(SLP_INSTANCE_REMAIN_STMTS): Rename to ...
(SLP_INSTANCE_REMAIN_DEFS): ... this.
(slp_root::remain): New.
(slp_root::slp_root): Adjust.
* tree-vect-slp.cc (vect_free_slp_instance): Adjust.
(vect_build_slp_instance): Get extra remain parameter,
adjust former handling of a cut off stmt.
(vect_analyze_slp_instance): Adjust.
(vect_analyze_slp): Likewise.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vectorizable_bb_reduc_epilogue): Dump something if we fail.
(vect_slp_check_for_constructors): Handle non-internal
defs as remain defs of a reduction.
(vectorize_slp_instance_root_stmt): Adjust.

* gcc.dg/vect/bb-slp-75.c: New testcase.

Use find_loop_location from unrolling

The following uses the common find_loop_location as implemented
by the vectorizer to query a loop location also for unrolling.
That results in a more consistent reporting of locations.

* tree-ssa-loop-ivcanon.cc: Include tree-vectorizer.h
(canonicalize_loop_induction_variables): Use find_loop_location.

CRIS: Don't include tree.h in cris-protos.h, PR bootstrap/111021

While there's another patch that fixes the immediate error
in the PR by other means, the include of tree.h here is
something I prefer to avoid.

PR bootstrap/111021
* config/cris/cris-protos.h: Revert recent change.
* config/cris/cris.cc (cris_legitimate_address_p): Remove
code_helper unused parameter.
(cris_legitimate_address_p_hook): New wrapper function.
(TARGET_LEGITIMATE_ADDRESS_P): Change to
cris_legitimate_address_p_hook.

tree-optimization/110963 - more PRE when optimizing for size

The following adjusts the heuristic when we perform PHI insertion
during GIMPLE PRE from requiring at least one edge that is supposed
to be optimized for speed to also doing insertion when the expression
is available on all edges (but possibly with different value) and
we'd at most have one copy from a constant.  The first ensures
we optimize two computations on all paths to one plus a possible
copy due to the PHI, the second makes sure we do not need to insert
many possibly large copies from constants, disregarding the
cumulative size cost of the register copies when they are not
coalesced.

The case in the testcase is

  <bb 5>
  _14 = h;
  if (_14 == 0B)
    goto <bb 7>;
  else
    goto <bb 6>;

  <bb 6>
  h = 0B;

  <bb 7>
  h.6_12 = h;

and we want to optimize that to

  <bb 7>
  # h.6_12 = PHI <_14(5), 0B(6)>

If we want to consider the cost of the register copies I think the
only simplistic enough way would be to restrict the special-case to
two incoming edges - we'd assume one register copy is coalesced
leaving one copy from a register or from a constant.

As with every optimization the downstream effects are probably
bigger than what we can locally estimate.
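
At the source level, the GIMPLE above roughly corresponds to the sketch
below. The function name is hypothetical and this is not the text of
ssa-pre-34.c.

```c
#include <stddef.h>

int *h;

/* The load of h at the return is available on both incoming paths:
   as the value already loaded for the comparison when h was NULL, and
   as the constant NULL when the store was executed.  PRE can therefore
   replace the second load with a PHI of those two values.  */
int *reload_h(void)
{
  if (h != NULL)   /* <bb 5>: _14 = h; compare against 0B */
    h = NULL;      /* <bb 6>: h = 0B */
  return h;        /* <bb 7>: h.6_12 = h */
}
```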

PR tree-optimization/110963
* tree-ssa-pre.cc (do_pre_regular_insertion): Also insert
a PHI node when the expression is available on all edges
and we insert at most one copy from a constant.

* gcc.dg/tree-ssa/ssa-pre-34.c: New testcase.

tree-optimization/110991 - unroll size estimate after vectorization

The following testcase shows that we are bad at identifying inductions
that will be optimized away after vectorizing them because SCEV doesn't
handle vectorized defs. The following rolls a simpler identification
of SSA cycles covering a PHI and an assignment with a binary operator
with a constant second operand.

PR tree-optimization/110991
* tree-ssa-loop-ivcanon.cc (constant_after_peeling): Handle
VIEW_CONVERT_EXPR <op>, handle more simple IV-like SSA cycles
that will end up constant.

* gcc.dg/tree-ssa/cunroll-16.c: New testcase.

Makefile.in: Make recog.h depend on $(TREE_H) [PR111021]

Commit r14-3093 introduced a random build failure when building
build/gencondmd.cc. Since r14-3093, recog.h includes tree.h,
which in turn includes (depends on) some files that are
generated during the build, such as all-tree.def and
tree-check.h. When building build/gencondmd.cc, the build
can fail if these dependencies are not ready yet. This patch
teaches the Makefile about this dependency.

Thanks to Jan-Benedict Glaw for testing this!

PR bootstrap/111021

gcc/ChangeLog:

* Makefile.in (RECOG_H): Add $(TREE_H) as dependency.

vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

Following Richi's suggestion [1], this patch moves the
handling of VMAT_LOAD_STORE_LANES out of the final loop nest
of function vectorizable_load into its own loop. Basically
it duplicates the final loop nest, cleans up some useless
setup code for the VMAT_LOAD_STORE_LANES case, and removes
some unreachable code. It also removes the corresponding
handling from the final loop nest.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623329.html

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_load): Move the handlings on
VMAT_LOAD_STORE_LANES in the final loop nest to its own loop,
and update the final nest accordingly.

vect: Remove several useless VMAT_INVARIANT checks

In function vectorizable_load, there is one hunk dedicated
to handling VMAT_INVARIANT that returns early, which means
we shouldn't encounter memory_access_type VMAT_INVARIANT
in the code after it. This patch cleans up several useless
checks on VMAT_INVARIANT. There should be no functional
changes.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_load): Remove some useless checks
on VMAT_INVARIANT.

Mode-Switching: Fix SET_SRC ICE for create_pre_exit

In some cases, like gcc/testsuite/gcc.dg/pr78148.c on RISC-V, the insn has
only 1 operand when SET_SRC is applied in create_pre_exit. For example:

(insn 13 9 14 2 (clobber (reg/i:TI 10 a0)) "gcc/testsuite/gcc.dg/pr78148.c":24:1 -1
(expr_list:REG_UNUSED (reg/i:TI 10 a0)
(nil)))

Unfortunately, SET_SRC requires at least 2 operands, so this results in a
segmentation fault. The SH4 part that triggers the segmentation fault looks
valid only when the return_copy_pat is a load or something like that. Thus,
this patch tries to fix it by restricting SET_SRC to SET insns.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* mode-switching.cc (create_pre_exit): Add SET insn check.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/mode-switch-ice-1.c: New test.

RISC-V: Support RVV VFREC7 rounding mode intrinsic API

Update in v2:

1. Remove the template of vfrec7 frm class.
2. Update the vfrec7_frm_obj declaration.

Original logs:

This patch adds support for the rounding mode API for
VFREC7, as in the samples below.

* __riscv_vfrec7_v_f32m1_rm
* __riscv_vfrec7_v_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfrec7_frm): New class for frm.
(vfrec7_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfrec7_frm): New intrinsic function definition.
* config/riscv/vector-iterators.md
(VFMISC): Remove VFREC7.
(misc_op): Ditto.
(float_insn_type): Ditto.
(VFMISC_FRM): New int iterator.
(misc_frm_op): New op for frm.
(float_frm_insn_type): New type for frm.
* config/riscv/vector.md (@pred_<misc_frm_op><mode>):
New pattern for misc frm.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-rec7.c: New test.

Daily bump.

[LRA]: Process output stack pointer reloads before emitting reload insns

The previous patch setting up asserts for processing stack pointer reloads
caught an error in the code moving the sp offset. This resulted in a failure
to build the aarch64 port. The code wrongly processed insns beyond the
output reloads of the current insn. This patch fixes it.

gcc/ChangeLog:

* lra-constraints.cc (curr_insn_transform): Process output stack
pointer reloads before emitting reload insns.

testsuite: Use distinct explicit error codes in value_9.f90

Use distinct error codes, so that we can spot directly from the
testsuite log which case is failing.

gcc/testsuite/ChangeLog:

* gfortran.dg/value_9.f90 (val, val4, sub, sub4): Take the error
codes from the arguments.
(p): Update calls: pass explicit distinct error codes.

fortran: Fix length one character dummy arg type [PR110419]

Revision r14-2171-g8736d6b14a4dfdfb58c80ccd398981b0fb5d00aa
changed the argument passing convention for length 1 value dummy
arguments to pass just the single character by value.  However, the
procedure declarations weren't updated to reflect the change in the
argument types.
This change does the missing argument type update.

The change of argument types generated an internal error in
gfc_conv_string_parameter with value_9.f90.  Indeed, that function is
not prepared for bare character type, so it is updated as well.

The condition guarding the single character argument passing code
is loosened to not exclude non-interoperable kind (this fixes
a regression with c_char_tests_2.f03).

Finally, the constant string argument passing code is updated as well
to extract the single char and pass it instead of passing it as
a length one string.  As the code taking care of non-constant arguments
was already doing this, the condition guarding it is just removed.

With these changes, value_9.f90 passes on 32 bits big-endian powerpc.

PR fortran/110360
PR fortran/110419

gcc/fortran/ChangeLog:

* trans-types.cc (gfc_sym_type): Use a bare character type for length
one value character dummy arguments.
* trans-expr.cc (gfc_conv_string_parameter): Handle single character
case.
(gfc_conv_procedure_call): Don't exclude interoperable kinds
from single character handling.  For single character dummy arguments,
extend the existing handling of non-constant expressions to constant
expressions.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_usage_13.f03: Update tree dump patterns.

fortran: New predicate gfc_length_one_character_type_p

Introduce a new predicate to simplify conditionals checking for
a character type whose length is the constant one.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_length_one_character_type_p): New inline
function.
* check.cc (is_c_interoperable): Use
gfc_length_one_character_type_p.
* decl.cc (verify_bind_c_sym): Same.
* trans-expr.cc (gfc_conv_procedure_call): Same.

analyzer: New option fanalyzer-show-events-in-system-headers [PR110543]

This patch introduces -fanalyzer-show-events-in-system-headers,
disabled by default.

This option reduces the noise of the analyzer emitted diagnostics
when dealing with system headers.
The new option only affects the display of the diagnostics,
but doesn't hinder the actual analysis.

Given a diagnostics path diving into a system header in the form
[
  prefix events...,
  system header call,
    system header entry,
    events within system headers...,
  system header return,
  suffix events...
]
then disabling the option (either by default or explicitly)
will shorten the path into:
[
  prefix events...,
  system header call,
  system header return,
  suffix events...
]
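
The effect on a path can be pictured with the simplified sketch below. It is
an assumption for illustration only: the struct and function names are
hypothetical, and the analyzer's real implementation
(diagnostic_manager::prune_system_headers) works on its own event and frame
representation rather than on a flat array like this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of one event in a diagnostic path.  */
struct path_event
{
  const char *desc;
  bool in_system_header;  /* true if the event's frame is in a system header */
};

/* Compact the array in place, dropping every event whose frame lies in a
   system header.  The call and return events belong to the caller's frame
   and therefore survive.  Returns the new number of events.  */
size_t prune_system_header_events(struct path_event *events, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (!events[i].in_system_header)
      events[kept++] = events[i];
  return kept;
}
```

Only the displayed path is shortened this way; as stated above, the analysis
itself is unaffected.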

Signed-off-by: benjamin priour <priour.be@gmail.com>
gcc/analyzer/ChangeLog:

PR analyzer/110543
* analyzer.opt: Add new option.
* diagnostic-manager.cc
(diagnostic_manager::prune_path): Call prune_system_headers.
(prune_frame): New function that deletes all events in a frame.
(diagnostic_manager::prune_system_headers): New function.
* diagnostic-manager.h: Add prune_system_headers declaration.

gcc/ChangeLog:

PR analyzer/110543
* doc/invoke.texi: Add documentation of
fanalyzer-show-events-in-system-headers

gcc/testsuite/ChangeLog:

PR analyzer/110543
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers-default.C:
New test.
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers-no.C:
New test.
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C:
New test.

c++: follow DR 2386 and update implementation of get_tuple_size [PR110216]

DR 2386 updated the tuple_size requirements for structured binding and
it now requires tuple_size to be considered only if
std::tuple_size<TYPE> names a complete class type with member value. GCC
before this patch does not follow the updated requirements, and this
patch is intended to implement them.

(jason) Accepting pseudonym sign-off because a change this small is not
legally significant for copyright.

DR 2386
PR c++/110216

gcc/cp/ChangeLog:

* decl.cc (get_tuple_size): Update implementation for DR 2386.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/decomp10.C: Update expected error for DR 2386.
* g++.dg/cpp1z/pr110216.C: New test.

Signed-off-by: gnaggnoyil <gnaggnoyil@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++: -fconcepts and __cpp_concepts

Since -fconcepts no longer implies -fconcepts-ts, we shouldn't advertise TS
support with __cpp_concepts=201507L. Also fix one case where -std=c++14
-fconcepts wasn't working (as found by range-v3 calendar). Fixing other
cases is not a priority, probably better to reject that flag combination if
there are further issues.

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Adjust __cpp_concepts.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_simple_type_specifier): Handle -std=c++14
-fconcepts.

libstdc++: Avoid problematic use of log10 in std::format [PR110860]

If abs(__v) is smaller than one, the result will be of the
form 0.xxxxx. It is only if the magnitude is large that more digits
are needed before the decimal dot.

This uses frexp instead of log10 which should be less expensive
and have sufficient precision for the desired purpose.

It removes the problematic cases where log10 will be negative or not
fit in an int.
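
A minimal sketch of the idea, assuming a finite input: frexp yields a binary
exponent e with |v| < 2^e, and e * log10(2) bounds the number of decimal
digits before the decimal point. The function name and the rational
approximation 4004/13301 for log10(2) are choices made for this
illustration, not a claim about the exact code in include/std/format.

```c
#include <math.h>

/* Upper bound on the number of decimal digits before the decimal point
   of a finite double v.  Values with magnitude below one need a single
   digit ("0.xxxxx"), so non-positive exponents map to 1.  */
int max_integral_digits(double v)
{
  int e;
  frexp(v, &e);                /* |v| = m * 2^e with 0.5 <= m < 1 */
  if (e <= 0)
    return 1;
  return 1 + e * 4004 / 13301; /* 4004/13301 slightly exceeds log10(2) */
}
```

Unlike log10, the exponent from frexp is an int in a small range for every
finite input, so there is no negative or out-of-range result to deal with.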

Signed-off-by: Paul Dreik <gccpatches@pauldreik.se>
libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Use frexp instead
of log10.

Avoid division by zero in fold_loop_internal_call

My patch to fix the profile after folding an internal call is missing a check
for the case where the profile was already zero before if-conversion.

gcc/ChangeLog:

PR gcov-profile/110988
* tree-cfg.cc (fold_loop_internal_call): Avoid division by zero.

RISC-V: Add ZC* test for failed march args being passed.

Add ZC* extensions march args tests for error input cases.

Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Jiawei <jiawei@iscas.ac.cn>
Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-24.c: New test.
* gcc.target/riscv/arch-25.c: New test.

RISC-V: Enable compressible features when use ZC* extensions.

This patch enables the compressible features with ZC* extensions.

Since all ZC* extensions depend on the Zca extension, it's sufficient to only
add the target Zca to extend the target RVC.

Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Enable compressed builtins when ZC* extensions enabled.
* config/riscv/riscv-shorten-memrefs.cc:
Enable shorten_memrefs pass when ZC* extensions enabled.
* config/riscv/riscv.cc (riscv_compressed_reg_p):
Enable compressible registers when ZC* extensions enabled.
(riscv_rtx_costs): Allow adjusting rtx costs when ZC* extensions enabled.
(riscv_address_cost): Allow adjusting address cost when ZC* extensions enabled.
(riscv_first_stack_step): Allow compression of the register saves
without adding extra instructions.
* config/riscv/riscv.h (FUNCTION_BOUNDARY): Adjusts function boundary
to 16 bits when ZC* extensions enabled.

RISC-V: Minimal support for ZC* extensions.

This patch is the minimal support for ZC* extensions, including the extension
name, mask and target definition. It also defines the dependencies with the Zca
and Zce extensions. Note that all ZC* extensions depend on the Zca extension.
Zce includes all relevant ZC* extensions for use in microcontrollers. Zce
will imply zcf when the 'f' extension is enabled in rv32.

Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com>
Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>
Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com>
Co-Authored by: Shihua Liao <shihua@iscas.ac.cn>
Co-Authored by: Yulong Shi <yulong@iscas.ac.cn>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::parse): New extensions.
* config/riscv/riscv-opts.h (MASK_ZCA): New mask.
(MASK_ZCB): Ditto.
(MASK_ZCE): Ditto.
(MASK_ZCF): Ditto.
(MASK_ZCD): Ditto.
(MASK_ZCMP): Ditto.
(MASK_ZCMT): Ditto.
(TARGET_ZCA): New target.
(TARGET_ZCB): Ditto.
(TARGET_ZCE): Ditto.
(TARGET_ZCF): Ditto.
(TARGET_ZCD): Ditto.
(TARGET_ZCMP): Ditto.
(TARGET_ZCMT): Ditto.
* config/riscv/riscv.opt: New target variable.

Revert "Fix type error of 'switch (SUBREG_BYTE (op)).'"

This reverts commit 6c6f96040a13e3403a418803cd9f539701c4c00e.

Fix print_loop_info ICE

It ICEs when invoked via debug_loops while dump_file is clear.

* tree-cfg.cc (print_loop_info): Dump to 'file', not 'dump_file'.

RISC-V: Support RVV VFSQRT rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFSQRT, as in the samples below.

* __riscv_vfsqrt_v_f32m1_rm
* __riscv_vfsqrt_v_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class unop_frm): New class for frm.
(vfsqrt_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfsqrt_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-sqrt.c: New test.

RISC-V: Support RVV VFWNMSAC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFWNMSAC, as in the samples below.

* __riscv_vfwnmsac_vv_f64m2_rm
* __riscv_vfwnmsac_vv_f64m2_rm_m
* __riscv_vfwnmsac_vf_f64m2_rm
* __riscv_vfwnmsac_vf_f64m2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfwnmsac_frm): New class for frm.
(vfwnmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwnmsac_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wnmsac.c: New test.

RISC-V: Support RVV VFWMSAC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFWMSAC, as in the samples below.

* __riscv_vfwmsac_vv_f64m2_rm
* __riscv_vfwmsac_vv_f64m2_rm_m
* __riscv_vfwmsac_vf_f64m2_rm
* __riscv_vfwmsac_vf_f64m2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfwmsac_frm): New class for frm.
(vfwmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwmsac_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wmsac.c: New test.

RISC-V: Support RVV VFWNMACC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFWNMACC, as in the samples below.

* __riscv_vfwnmacc_vv_f64m2_rm
* __riscv_vfwnmacc_vv_f64m2_rm_m
* __riscv_vfwnmacc_vf_f64m2_rm
* __riscv_vfwnmacc_vf_f64m2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfwnmacc_frm): New class for frm.
(vfwnmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwnmacc_frm): New intrinsic function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-wnmacc.c: New test.

x86: Update model values for Raptorlake.

Update model values for Raptorlake according to SDM.

gcc/ChangeLog

* common/config/i386/cpuinfo.h (get_intel_cpu): Add model value 0xba
to Raptorlake.

MMIX: Switch to lra_in_progress

This is just a mechanical update.
It fixes no observed problems for LRA.

* config/mmix/predicates.md (mmix_address_operand): Use
lra_in_progress, not reload_in_progress.

MMIX: Re-enable LRA

* config/mmix/mmix.cc: Re-enable LRA.

MMIX: Handle LRA FP-to-SP-elimination oddity

When LRA is in progress, it can try and validate insns
half-way through frame-pointer (FP) to stack-pointer (SP)
elimination.  Operands have then been substituted where the
offset is from the SP elimination but the register is the
(hard) frame-pointer:

lra-eliminations.cc:lra_eliminate_regs_1:370:
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;

In this regard reload played nicely.  Unfortunately, the
frame_pointer_operand predicate in mmix/predicates.md barfs
on such an address.  This broke the use of the MMIX
frame_pointer_operand predicate (and the Yf constraint),
used only in the nonlocal_goto_receiver expansion (which is
used in e.g. code generated for C++ "catch").

Force MMIX frame_pointer_operand to accept an FP+offset for
the duration of lra_in_progress.

* config/mmix/predicates.md (frame_pointer_operand): Handle FP+offset
when lra_in_progress.

Disable LRA for MMIX.

Since the change r14-383-gfaf8bea79b6256 "Enable LRA on
several ports", mmix has been broken building libstdc++-v3:

libtool: compile: /obj/./gcc/xgcc -shared-libgcc -B/obj/./gcc
-nostdinc++ -L/obj/mmix/libstdc++-v3/src
-L/obj/mmix/libstdc++-v3/src/.libs
-L/obj/mmix/libstdc++-v3/libsupc++/.libs -nostdinc -B/obj/mmix/newlib/
-isystem /obj/mmix/newlib/targ-include -isystem
/gcctop/newlib/libc/include -B/obj/mmix/libgloss/mmix
-L/obj/mmix/libgloss/libnosys -L/gcctop/libgloss/mmix
-B/home/hp/tmp/mmix230811-00/pre/mmix/bin/
-B/home/hp/tmp/mmix230811-00/pre/mmix/lib/ -isystem
/home/hp/tmp/mmix230811-00/pre/mmix/include -isystem
/home/hp/tmp/mmix230811-00/pre/mmix/sys-include
-I/gcctop/libstdc++-v3/../libgcc -I/obj/mmix/libstdc++-v3/include/mmix
-I/obj/mmix/libstdc++-v3/include -I/gcctop/libstdc++-v3/libsupc++
-fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual
-Wabi=2 -fdiagnostics-show-location=once -ffunction-sections
-fdata-sections -frandom-seed=eh_type.lo -g -O2 -c
/gcctop/libstdc++-v3/libsupc++/eh_type.cc -o eh_type.o
/gcctop/libstdc++-v3/libsupc++/eh_terminate.cc: In function 'void
__cxxabiv1::__terminate(std::terminate_handler)':
/gcctop/libstdc++-v3/libsupc++/eh_terminate.cc:53:1: error: unable to
generate reloads for:

   53 | }
      | ^
(insn 31 36 44 4 (parallel [
            (unspec_volatile [
                    (plus:DI (reg/f:DI 253 $253)
                        (const_int 24 [0x18]))
                ] 1)
            (clobber (reg:DI 275))
            (clobber (reg:DI 259 rJ))
        ]) "/gcctop/libstdc++-v3/libsupc++/eh_terminate.cc":51:3
discrim 1 63 {*nonlocal_goto_receiver_expanded}
     (expr_list:REG_UNUSED (reg:DI 275)
        (expr_list:REG_UNUSED (reg:DI 259 rJ)
            (nil))))
during RTL pass: reload
/gcctop/libstdc++-v3/libsupc++/eh_terminate.cc:53:1:
internal compiler error: in curr_insn_transform, at lra-constraints.cc:4281

This commit temporarily reverts the MMIX part of
r14-383-gfaf8bea79b6256 back to reload.

* config/mmix/mmix.cc: Disable LRA for MMIX.

RISC-V: Support RVV VFWMACC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFWMACC, as in the samples below.

* __riscv_vfwmacc_vv_f64m2_rm
* __riscv_vfwmacc_vv_f64m2_rm_m
* __riscv_vfwmacc_vf_f64m2_rm
* __riscv_vfwmacc_vf_f64m2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfwmacc_frm): New class for vfwmacc frm.
(vfwmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfwmacc_frm): Function definition for vfwmacc.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_widen_ternop_insn): Add frm support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-fwmacc.c: New test.

RISC-V: Support RVV VFNMSUB rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFNMSUB, as in the samples below.

* __riscv_vfnmsub_vv_f32m1_rm
* __riscv_vfnmsub_vv_f32m1_rm_m
* __riscv_vfnmsub_vf_f32m1_rm
* __riscv_vfnmsub_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmsub_frm): New class for vfnmsub frm.
(vfnmsub_frm): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmsub_frm): New function declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmsub.c: New test.

[LRA]: Fix asserts for output stack pointer reloads

The patch implementing output stack pointer reloads contained superfluous
asserts. The patch makes them useful.

gcc/ChangeLog:

* lra-constraints.cc (curr_insn_transform): Set done_p up and
check it on true after processing output stack pointer reload.

Daily bump.

modula-2, plugin: Fix Darwin bootstrap issues.

This corrects some typos in the suffix of the m2rte plugin that
led to a bootstrap failure on Darwin, where the suffix is not '.so'.

On some versions of Darwin, the linker complains if libSystem is not
linked, so we disable all the default libs, but add libc back.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/m2/ChangeLog:

* Make-lang.in: Update suffix spellings to use 'soext'.
Add libc to the plugin link.

Daily bump.

PR modula2/110779 SysClock can not read the clock (Darwin portability fixes)

This patch adds corrections to defensively check for glibc functions and
structures, and adds fallbacks. These fixes were required under Darwin.

gcc/m2/ChangeLog:

PR modula2/110779
* gm2-libs-iso/SysClock.mod (EpochTime): New procedure.
(GetClock): Call EpochTime if the C time functions are
unavailable.
* gm2-libs-iso/wrapclock.def (istimezone): New function
definition.

libgm2/ChangeLog:

PR modula2/110779
* configure: Regenerate.
* configure.ac: Provide special case test for Darwin cross
configuration.
(GLIBCXX_CONFIGURE): New statement.
(GLIBCXX_CHECK_GETTIMEOFDAY): New statement.
(GLIBCXX_ENABLE_LIBSTDCXX_TIME): New statement.
* libm2iso/wrapclock.cc: New sys/time.h conditional include.
(sys/syscall.h): Conditional include.
(unistd.h): Conditional include.
(GetTimeRealtime): Re-implement.
(SetTimeRealtime): Re-implement.
(timezone): Re-implement.
(istimezone): New function.
(daylight): Re-implement.
(isdst): Re-implement.
(tzname): Re-implement.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

PR modula2/108119 disable m2rte plugin by default

This patch disables the m2rte plugin by default.  The driver
will only append the -fplugin=m2rte command line option for cc1gm2
if -fm2-plugin is present.  It is only enabled provided ENABLE_PLUGIN
is defined.  gcc/m2/Make-file.in will only build and install m2rte
if enable_plugin is yes.

gcc/m2/ChangeLog:

PR modula2/108119
* Make-lang.in (M2RTE_PLUGIN_SO): Assigned to
plugin/m2rte$(exeext).so if enable_plugin is yes.
(m2.all.cross): Replace plugin/m2rte$(soext) with
$(M2RTE_PLUGIN_SO).
(m2.all.encap): Replace plugin/m2rte$(soext) with
$(M2RTE_PLUGIN_SO).
(m2.install-plugin): Add dummy rule when enable_plugin
is not yes.
(plugin/m2rte$(exeext).so): Add dummy rule when enable_plugin
is not yes.
(m2/stage2/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
$(M2RTE_PLUGIN_SO).
(m2/stage1/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
$(M2RTE_PLUGIN_SO).
* gm2spec.cc (lang_specific_driver): Set need_plugin to false
by default.

gcc/testsuite/ChangeLog:

PR modula2/108119
* gm2/iso/check/fail/iso-check-fail.exp (gm2_init_iso): Add -fm2-plugin.
* gm2/switches/auto-init/fail/switches-auto-init-fail.exp
(gm2_init_iso): Add -fm2-plugin.
* gm2/switches/check-all/pim2/fail/switches-check-all-pim2-fail.exp
(gm2_init_pim2): Add -fm2-plugin.
* gm2/switches/check-all/plugin/iso/fail/switches-check-all-plugin-iso-fail.exp
(gm2_init_iso): Add -fm2-plugin.
* gm2/switches/check-all/plugin/pim2/fail/switches-check-all-plugin-pim2-fail.exp
(gm2_init_pim2): Add -fm2-plugin.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Add stdckdint.h header for C23

This patch adds <stdckdint.h> header, which defines ckd_{add,sub,mul}
using __builtin_{add,sub,mul}_overflow. As requested, it doesn't
pedantically diagnose things which work just fine, e.g. inputs with
plain char, bool, bit-precise integer or enumerated types and
result pointer to plain char or bit-precise integer.
The header will #include_next <stdckdint.h> so that C library can supply
its part if the header implementation in the future needs to be split
between parts under the control of the compiler and parts under the
control of C library.
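
As a rough illustration of the mapping described above, the sketch below
wraps the GCC built-ins in macros with hypothetical names (my_ckd_add and
friends are assumptions made for this example, not the contents of the new
header).  Like the C23 macros, each one evaluates to true when the exact
result does not fit in *r.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ckd_add/ckd_sub/ckd_mul; the real header
   may differ in detail.  Each yields true if the result overflowed.  */
#define my_ckd_add(r, a, b) ((bool) __builtin_add_overflow ((a), (b), (r)))
#define my_ckd_sub(r, a, b) ((bool) __builtin_sub_overflow ((a), (b), (r)))
#define my_ckd_mul(r, a, b) ((bool) __builtin_mul_overflow ((a), (b), (r)))

/* Add two ints, saturating at INT_MAX/INT_MIN instead of wrapping.  */
static int
saturating_add (int a, int b)
{
  int sum;
  if (my_ckd_add (&sum, a, b))
    /* Overflow needs operands of the same sign, so A's sign picks the bound.  */
    return a > 0 ? INT_MAX : INT_MIN;
  assert ((long long) a + b == sum);  /* no overflow: result is exact */
  return sum;
}
```

A caller tests the return value before trusting the stored result, which
is the usage pattern the checked-arithmetic macros are designed around.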

2023-08-12 Jakub Jelinek <jakub@redhat.com>

* Makefile.in (USER_H): Add stdckdint.h.
* ginclude/stdckdint.h: New file.

* gcc.dg/stdckdint-1.c: New test.
* gcc.dg/stdckdint-2.c: New test.

RISC-V: Add TARGET_VECTOR check into VLS modes

This patch fixes bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110994

This is caused by VLS modes generating incorrect code in register allocation.

The original case that triggers the ICE is Fortran code, but I can reproduce
it with C code.

gcc/ChangeLog:

PR target/110994
* config/riscv/riscv-opts.h (TARGET_VECTOR_VLS): Add TARGET_VECTOR.

gcc/testsuite/ChangeLog:

PR target/110994
* gcc.target/riscv/rvv/autovec/vls/pr110994.c: New test.

tree-pretty-print: delimit TREE_VEC with braces

This makes the generic pretty printer print braces around a TREE_VEC,
like we do for CONSTRUCTOR. This should improve readability of nested
TREE_VECs in particular.

gcc/ChangeLog:

* tree-pretty-print.cc (dump_generic_node) <case TREE_VEC>:
Delimit output with braces.

c++: bogus warning w/ deduction guide in anon ns [PR106604]

Here we're unintentionally issuing a "declared static but never defined"
warning from wrapup_namespace_globals for a deduction guide declared in
an anonymous namespace. This patch fixes this by giving deduction guides
a dummy DECL_INITIAL, which suppresses the warning and also allows us to
simplify redeclaration checking for them.

Co-authored-by: Jason Merrill <jason@redhat.com>
PR c++/106604

gcc/cp/ChangeLog:

* decl.cc (redeclaration_error_message): Remove special handling
for deduction guides.
(grokfndecl): Give deduction guides a dummy DECL_INITIAL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction74.C: Expect "defined" instead
of "declared" in the repeated deduction guide diagnostics.
* g++.dg/cpp1z/class-deduction116.C: New test.

libstdc++: Use __bool_constant entirely

This patch uses __bool_constant entirely instead of integral_constant<bool>
in the type_traits header, specifically for true_type, false_type,
and bool_constant.

libstdc++-v3/ChangeLog:

* include/std/type_traits (true_type): Use __bool_constant
instead.
(false_type): Likewise.
(bool_constant): Likewise.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

RISC-V: Fix vec_series expander[PR110985]

This patch fixes bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110985

gcc/ChangeLog:
PR target/110985
* config/riscv/riscv-v.cc (expand_vec_series): Refactor the expander.

gcc/testsuite/ChangeLog:
PR target/110985
* gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c: New test.

RISC-V: Allow CONST_VECTOR for VLS modes

This patch enables CONST_VECTOR for VLS modes.

void foo1 (int * __restrict a)
{
    for (int i = 0; i < 16; i++)
      a[i] = 8;
}

void foo2 (int * __restrict a)
{
    for (int i = 0; i < 16; i++)
      a[i] = i;
}

Compile option: -O3 --param=riscv-autovec-preference=scalable

Before this patch:

foo1:
        lui     a5,%hi(.LC0)
        addi    a5,a5,%lo(.LC0)
        vsetivli        zero,4,e32,m1,ta,ma
        addi    a4,a0,16
        vle32.v v1,0(a5)
        vse32.v v1,0(a0)
        vse32.v v1,0(a4)
        addi    a4,a0,32
        vse32.v v1,0(a4)
        addi    a0,a0,48
        vse32.v v1,0(a0)
        ret
foo2:
        lui     a5,%hi(.LC1)
        addi    a5,a5,%lo(.LC1)
        vsetivli        zero,4,e32,m1,ta,ma
        vle32.v v1,0(a5)
        lui     a5,%hi(.LC2)
        addi    a5,a5,%lo(.LC2)
        vse32.v v1,0(a0)
        vle32.v v1,0(a5)
        lui     a5,%hi(.LC3)
        addi    a4,a0,16
        addi    a5,a5,%lo(.LC3)
        vse32.v v1,0(a4)
        vle32.v v1,0(a5)
        addi    a4,a0,32
        lui     a5,%hi(.LC4)
        vse32.v v1,0(a4)
        addi    a0,a0,48
        addi    a5,a5,%lo(.LC4)
        vle32.v v1,0(a5)
        vse32.v v1,0(a0)
        ret

After this patch:

foo1:
        vsetivli        zero,16,e32,mf2,ta,ma
        vmv.v.i v1,8
        vse32.v v1,0(a0)
        ret
        .size   foo1, .-foo1
        .align  1
        .globl  foo2
        .type   foo2, @function
foo2:
        vsetivli        zero,16,e32,mf2,ta,ma
        vid.v   v1
        vse32.v v1,0(a0)
        ret

gcc/ChangeLog:

* config/riscv/autovec.md: Add VLS CONST_VECTOR.
* config/riscv/riscv.cc (riscv_const_insns): Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS CONST_VECTOR tests.
* gcc.target/riscv/rvv/autovec/vls/const-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/const-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/const-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/const-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/const-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/series-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/series-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/series-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/series-4.c: New test.

Daily bump.

libstdc++: Fix std::format_to_n return value [PR110990]

When writing to a contiguous iterator, std::format_to_n(out, n, ...)
always returns out + n, even if it wrote fewer than n characters to the
iterator.

The problem is in the _M_finish() member function of the _Iter_sink
specialization for contiguous iterators. _M_finish() calls _M_overflow()
to update its count of characters written, so it can return the count of
characters that would be written if there was room. But _M_overflow()
assumes it's only called when the buffer is full, and so switches to the
internal buffer. _M_finish() then thinks that if the internal buffer is
in use, we already wrote at least n characters and so returns out+n as
the output position.

We can fix the problem by adding a check in _M_overflow() so that we
don't update the count and switch to the internal buffer unless we've
run out of room, i.e. _M_unused().size() is zero. The caller then needs
to be prepared for _M_count not being the final total, and so add
_M_used().size() to it.

However, there's not actually any need for _M_finish() to call
_M_overflow() to get the count. We now need to use _M_count and
_M_used().size() to get the total anyway so _M_overflow() doesn't help
with that. And we don't need to use _M_overflow() to flush unwritten
characters to the output, because the specialization for contiguous
iterators always writes directly to the output without buffering (except
when we've exceeded the maximum number of characters, in which case we
want to discard the buffered characters anyway). So _M_finish() can be
simplified and can avoid calling _M_overflow().

This change also fixes some member functions of other sink classes to
only call _M_overflow() when there are characters in the buffer, which
is needed to meet _M_overflow's precondition that _M_used().size()!=0.

libstdc++-v3/ChangeLog:

PR libstdc++/110990
* include/std/format (_Seq_sink::get): Only call _M_overflow if
its precondition is met.
(_Iter_sink::_M_finish): Likewise.
(_Iter_sink<C, ContigIter>::_M_overflow): Only switch to the
internal buffer after running out of space.
(_Iter_sink<C, ContigIter>::_M_finish): Do not use _M_overflow.
(_Counting_sink::count): Likewise.
* testsuite/std/format/functions/format_to_n.cc: Check cases
where the output fits into the buffer.

analyzer: new warning: -Wanalyzer-unterminated-string [PR105899]

This patch adds new functions to the analyzer for checking that
an argument at a callsite is a pointer to a valid null-terminated
string, and uses this for the following known functions:

- error (param 3, the format string)
- error_at_line (param 5, the format string)
- putenv
- strchr (1st param)
- strcpy (2nd param)
- strdup

Currently the check merely detects pointers to unterminated string
constants, and adds a new -Wanalyzer-unterminated-string to complain
about that. I'm experimenting with detecting other ways in which
a buffer can fail to be null-terminated, and for other problems with
such buffers, but this patch at least adds the framework for wiring
up the check to specific parameters of known_functions.
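
A minimal sketch of the simplest case named above, a pointer to an
unterminated string constant.  The identifiers are hypothetical and the
exact diagnostics are not reproduced here; the helper only makes the
missing terminator observable at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Exactly three chars: C (unlike C++) permits this initializer and
   drops the trailing NUL, so the array is not a valid string.  */
static const char unterminated[3] = "abc";
static const char terminated[4] = "abc";

/* Return true if BUF holds a NUL within its first N bytes.  */
static bool
has_terminator (const char *buf, size_t n)
{
  assert (buf != NULL);
  return memchr (buf, '\0', n) != NULL;
}

/* A call such as strdup (unterminated) would read past the end of the
   array.  It is left out of the sketch because it is undefined
   behavior; it is the kind of argument the new check is aimed at.  */
```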

gcc/analyzer/ChangeLog:
PR analyzer/105899
* analyzer.opt (Wanalyzer-unterminated-string): New.
* call-details.cc
(call_details::check_for_null_terminated_string_arg): New.
* call-details.h
(call_details::check_for_null_terminated_string_arg): New decl.
* kf-analyzer.cc (class kf_analyzer_get_strlen): New.
(register_known_analyzer_functions): Register it.
* kf.cc (kf_error::impl_call_pre): Check that format arg is a
valid null-terminated string.
(kf_putenv::impl_call_pre): Likewise for the sole param.
(kf_strchr::impl_call_pre): Likewise for the first param.
(kf_strcpy::impl_call_pre): Likewise for the second param.
(kf_strdup::impl_call_pre): Likewise for the sole param.
* region-model.cc (get_strlen): New.
(struct call_arg_details): New.
(inform_about_expected_null_terminated_string_arg): New.
(class unterminated_string_arg): New.
(region_model::check_for_null_terminated_string_arg): New.
* region-model.h
(region_model::check_for_null_terminated_string_arg): New decl.

gcc/ChangeLog:
PR analyzer/105899
* doc/analyzer.texi (__analyzer_get_strlen): New.
* doc/invoke.texi: Add -Wanalyzer-unterminated-string.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/analyzer-decls.h (__analyzer_get_strlen): New.
* gcc.dg/analyzer/error-1.c (test_error_unterminated): New.
(test_error_at_line_unterminated): New.
* gcc.dg/analyzer/null-terminated-strings-1.c: New test.
* gcc.dg/analyzer/putenv-1.c (test_unterminated): New.
* gcc.dg/analyzer/strchr-1.c (test_unterminated): New.
* gcc.dg/analyzer/strcpy-1.c (test_unterminated): New.
* gcc.dg/analyzer/strdup-1.c (test_unterminated): New.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

[committed] Fix subdi3 synthesis on rx port

Some of Andrew's recent match.pd changes triggered a regression in my tester
for the rx processor for c-torture/execute/pr66940.c which would be exposed
only during an LTO compilation.

Specifically the subdi3_internal pattern had the wrong idiom to detect a carry
from the high word into the low word.  It had the wrong opcode and the operands
were reversed.
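
For reference, a double-word subtract built from word-sized pieces is
commonly modeled as below: the low words are subtracted first, and a
borrow is taken from the high word exactly when the low minuend is
(unsigned) smaller than the low subtrahend.  This is a generic C sketch
with illustrative names, not the rx.md pattern; swapping the operands of
the comparison reproduces the class of bug described above.

```c
#include <assert.h>
#include <stdint.h>

/* Subtract two 64-bit values held as (hi, lo) pairs of 32-bit words.  */
static void
sub64_by_words (uint32_t a_hi, uint32_t a_lo, uint32_t b_hi, uint32_t b_lo,
                uint32_t *r_hi, uint32_t *r_lo)
{
  assert (r_hi != 0 && r_lo != 0);
  uint32_t borrow = a_lo < b_lo;   /* 1 iff the low subtract wraps */
  *r_lo = a_lo - b_lo;             /* wraps modulo 2^32 */
  *r_hi = a_hi - b_hi - borrow;
}

/* Convenience wrapper: split, subtract word by word, and rejoin.  */
static uint64_t
sub64 (uint64_t a, uint64_t b)
{
  uint32_t hi, lo;
  sub64_by_words ((uint32_t) (a >> 32), (uint32_t) a,
                  (uint32_t) (b >> 32), (uint32_t) b, &hi, &lo);
  return ((uint64_t) hi << 32) | lo;
}
```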

This resulted in combine doing a simplification that was valid according to the
presented RTL, but which ultimately got the wrong result.

I would often say this was a latent bug.  But the testsuite shows
builtin-arith-overflow-14 and builtin-arith-overflow-p18 failures are fixed as
well.  So it's been visible indefinitely, but nobody's ever looked into those
failures.

Committed to the trunk.

gcc/
* config/rx/rx.md (subdi3): Fix test for borrow.

VECT: Fix ICE on MASK_LEN_{LOAD, STORE} when no LEN recorded[PR110989]

This ICE is caused because of this situation:

mask__49.21_99 = vect__17.19_96 == { 0.0, ... };
...
vect__6.24_107 = .MASK_LEN_LOAD (vectp.22_105, 32B, mask__49.21_99, POLY_INT_CST [2, 2], 0);

The MASK_LEN_LOAD uses a real MASK, which is produced by the EQ comparison, whereas the LEN
is a dummy LEN equal to the vectorization factor.

In this situation, we never enter 'vect_record_loop_len' since there is no LEN loop control.
'LOOP_VINFO_RGROUP_IV_TYPE' is then not a suitable type for the 'build_int_cst' call that produces
the LEN argument of 'MASK_LEN_LOAD', so use sizetype instead, which matches the
RVV length requirement.

gcc/ChangeLog:
PR middle-end/110989
* tree-vect-stmts.cc (vectorizable_store): Replace iv_type with sizetype.
(vectorizable_load): Ditto.

gcc/testsuite/ChangeLog:
PR middle-end/110989
* gcc.target/riscv/rvv/autovec/pr110989.c: New test.

RISC-V: Specify -mabi for ztso testcases

On rv32 targets, this patch fixes ztso testcase errors like this:
cc1: error: ABI requires '-march=rv32'

2023-08-11 Patrick O'Neill <patrick@rivosinc.com>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo-table-ztso-amo-add-1.c: Add -mabi=lp64d
to dg-options.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: Ditto.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

libstdc++: Implement C++20 std::chrono::parse [PR104167]

This adds the missing C++20 features to <chrono>.

I've implemented my proposed resolutions to LWG issues 3960, 3961, and
3962. There are some unimplemented flags such as %OI which I think are
not implementable in general. It might be possible to use nl_langinfo
with ALT_DIGITS, but that isn't available on all targets. I intend to
file another LWG issue about that.

libstdc++-v3/ChangeLog:

PR libstdc++/104167
* include/bits/chrono_io.h (operator|=, operator|): Add noexcept
to _ChronoParts operators.
(from_stream, parse): Define new functions.
(__detail::_Parse, __detail::_Parser): New class templates.
* include/std/chrono (__cpp_lib_chrono): Define to 201907L for
C++20.
* include/std/version (__cpp_lib_chrono): Likewise.
* testsuite/20_util/duration/arithmetic/constexpr_c++17.cc:
Adjust expected value of feature test macro.
* testsuite/20_util/duration/io.cc: Test parsing.
* testsuite/std/time/clock/file/io.cc: Likewise.
* testsuite/std/time/clock/gps/io.cc: Likewise.
* testsuite/std/time/clock/system/io.cc: Likewise.
* testsuite/std/time/clock/tai/io.cc: Likewise.
* testsuite/std/time/clock/utc/io.cc: Likewise.
* testsuite/std/time/day/io.cc: Likewise.
* testsuite/std/time/month/io.cc: Likewise.
* testsuite/std/time/month_day/io.cc: Likewise.
* testsuite/std/time/weekday/io.cc: Likewise.
* testsuite/std/time/year/io.cc: Likewise.
* testsuite/std/time/year_month/io.cc: Likewise.
* testsuite/std/time/year_month_day/io.cc: Likewise.
* testsuite/std/time/syn_c++20.cc: Check value of macro and for
the existence of parse and from_stream in namespace chrono.
* testsuite/std/time/clock/local/io.cc: New test.
* testsuite/std/time/parse.cc: New test.

bpf: liberate R9 for general register allocation

We were reserving one of the hard registers in BPF in order to
implement dynamic stack allocation: alloca and VLAs. However, there is
kernel code that has inline assembly that requires all the non-fixed
registers to be available for register allocation.

This patch:

1. Liberates r9 that is now available for register allocation.

2. Adds a check to GCC so it errors out if the user tries to do
   dynamic stack allocation.  A couple of tests are added for this.

3. Changes xbpf so it no longer saves and restores callee-saved
   registers.  A couple of tests for this have been removed.

4. Adds bpf-*-* to the list of targets that do not support alloca in
   target-support.exp.

Tested in host x86_64-linux-gnu and target bpf-unknown-none.

gcc/ChangeLog

* config/bpf/bpf.md (allocate_stack): Define.
* config/bpf/bpf.h (FIRST_PSEUDO_REGISTER): Make room for fake
stack pointer register.
(FIXED_REGISTERS): Adjust accordingly.
(CALL_USED_REGISTERS): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(REGISTER_NAMES): Likewise.
* config/bpf/bpf.cc (bpf_compute_frame_layout): Do not reserve
space for callee-saved registers.
(bpf_expand_prologue): Do not save callee-saved registers in xbpf.
(bpf_expand_epilogue): Do not restore callee-saved registers in
xbpf.

gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_effective_target_alloca): BPF
target does not support alloca.
* gcc.target/bpf/diag-alloca-1.c: New test.
* gcc.target/bpf/diag-alloca-2.c: Likewise.
* gcc.target/bpf/xbpf-callee-saved-regs-1.c: Remove test.
* gcc.target/bpf/xbpf-callee-saved-regs-2.c: Likewise.
* gcc.target/bpf/regs-availability-1.c: Likewise.

bpf: allow exceeding max num of args in BPF when always_inline

BPF currently limits the number of registers used to pass arguments to
functions to five registers. There is a check for this at function
expansion time. However, if a function is guaranteed to be always
inlined (and its body never generated) by virtue of the always_inline
attribute, it can "receive" any number of arguments.
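
The shape of code this is meant to accept can be sketched as follows.
The function names are hypothetical, and the snippet compiles on any
target; only when targeting BPF does the argument-count check matter.

```c
#include <assert.h>

/* Six parameters, one more than the five argument registers mentioned
   above.  Because the function is always inlined, no out-of-line call
   that passes six arguments has to be generated.  */
static inline __attribute__ ((always_inline)) int
sum6 (int a, int b, int c, int d, int e, int f)
{
  return a + b + c + d + e + f;
}

/* The call is expanded in place, so only CALLER gets a body.  */
static int
caller (void)
{
  int total = sum6 (1, 2, 3, 4, 5, 6);
  assert (total == 21);
  return total;
}
```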

Tested in host x86_64-linux-gnu and target bpf-unknown-none.

gcc/ChangeLog

* config/bpf/bpf.cc (bpf_function_arg_advance): Do not complain
about too many arguments if function is always inlined.

gcc/testsuite/ChangeLog

* gcc.target/bpf/diag-funargs-inline-1.c: New test.
* gcc.target/bpf/diag-funargs.c: Adapt test.

analyzer: More features for CPython analyzer plugin [PR107646]

This patch adds known function subclasses for Python/C API functions
PyList_New, PyLong_FromLong, and PyList_Append. It also adds new
optional parameters for
region_model::get_or_create_region_for_heap_alloc, allowing for the
newly allocated region to immediately transition from the start state to
the assumed non-null state in the malloc state machine if desired.
Finally, it adds a new procedure, dg-require-python-h, intended as a
directive in Python-related analyzer tests, to append necessary Python
flags during the tests' build process.

The main warnings we gain in this patch with respect to the known function
subclasses mentioned are leak related. For example:

rc3.c: In function ‘create_py_object’:
rc3.c:21:10: warning: leak of ‘item’ [CWE-401] [-Wanalyzer-malloc-leak]
   21 |   return list;
      |          ^~~~
  ‘create_py_object’: events 1-4
    |
    |    4 |   PyObject* item = PyLong_FromLong(10);
    |      |                    ^~~~~~~~~~~~~~~~~~~
    |      |                    |
    |      |                    (1) allocated here
    |      |                    (2) when ‘PyLong_FromLong’ succeeds
    |    5 |   PyObject* list = PyList_New(2);
    |      |                    ~~~~~~~~~~~~~
    |      |                    |
    |      |                    (3) when ‘PyList_New’ fails
    |......
    |   21 |   return list;
    |      |          ~~~~
    |      |          |
    |      |          (4) ‘item’ leaks here; was allocated at (1)

Some concessions were made to
simplify the analysis process when comparing kf_PyList_Append with the
real implementation. In particular, PyList_Append performs some
optimization internally to try and avoid calls to realloc if
possible. For simplicity, we assume that realloc is called every time.
Also, we grow the size by just 1 (to ensure enough space for adding a
new element) rather than abide by the heuristics that the actual implementation
follows.

gcc/analyzer/ChangeLog:
PR analyzer/107646
* call-details.h: New function.
* region-model.cc (region_model::get_or_create_region_for_heap_alloc):
New optional parameters.
* region-model.h (class region_model): New optional parameters.
* sm-malloc.cc (on_realloc_with_move): New function.
(region_model::transition_ptr_sval_non_null): New function.

gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Analyzer support for
PyList_New, PyList_Append, PyLong_FromLong
* gcc.dg/plugin/plugin.exp: New test.
* lib/target-supports.exp: New procedure.
* gcc.dg/plugin/cpython-plugin-test-2.c: New test.

Signed-off-by: Eric Feng <ef2648@columbia.edu>

c++: dependently scoped template-id in type-req [PR110927]

Here we're incorrectly rejecting the first type-requirement at parse
time with

  concepts-requires35.C:14:56: error: ‘typename A<T>::B’ is not a template [-fpermissive]

We also incorrectly reject the second type-requirement at satisfaction time
with

  concepts-requires35.C:17:34: error: ‘typename A<int>::B’ names ‘template<class U> struct A<int>::B’, which is not a type

and similarly for the third type-requirement.  This seems to happen only
within a type-requirement; if we instead use e.g. an alias template then
it works as expected.

The difference ultimately seems to be that during parsing of a using-decl,
we pass check_dependency_p=true to cp_parser_nested_name_specifier_opt
whereas for a type-requirement we pass check_dependency_p=false.
Passing =false causes cp_parser_template_id for the dependently-scoped
template-id B<bool> to create a TYPE_DECL of TYPENAME_TYPE (with
TYPENAME_IS_CLASS_P unexpectedly set in the last two cases) whereas
passing =true causes it to return a TEMPLATE_ID_EXPR.  We then call
make_typename_type on this TYPE_DECL which does the wrong thing.

Since there seems to be no justification for using check_dependency_p=false
here, the simplest fix seems to be to pass check_dependency_p=true instead,
matching the behavior of cp_parser_elaborated_type_specifier.

PR c++/110927

gcc/cp/ChangeLog:

* parser.cc (cp_parser_type_requirement): Pass
check_dependency_p=true instead of =false.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires35.C: New test.

c++: recognize in-class var tmpl partial spec [PR71954]

This makes us recognize member variable template partial specializations
defined directly inside the class body. It seems we mainly just need to
call check_explicit_specialization when we see a static TEMPLATE_ID_EXPR
data member, which sets SET_DECL_TEMPLATE_SPECIALIZATION for us and which
we otherwise don't call (for the out-of-class case we call it from
grokvardecl).

We also need to make finish_member_template_decl return NULL_TREE for
such partial specializations, matching its behavior for class template
partial specializations, so that later we don't try to register it as a
separate member declaration.

PR c++/71954

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Pass 'dname' instead of
'unqualified_id' as the name when building the VAR_DECL for a
static data member. Call check_explicit_specialization for a
TEMPLATE_ID_EXPR such member.
* pt.cc (finish_member_template_decl): Return NULL_TREE
instead of 'decl' when DECL_TEMPLATE_SPECIALIZATION is not
set.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ84.C: New test.
* g++.dg/cpp1y/var-templ84a.C: New test.

libstdc++: Do not call log10(0.0) in std::format [PR110860]

Calling log10(0.0) returns -inf which has undefined behaviour when
converted to an integer. We only need to use log10 for large values
anyway. If the value is zero then the larger buffer is only needed due
to a large precision, so we don't need to use log10 to estimate the
number of digits for the significand.

libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Do not call log10
with zero values.

MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself.

Signed-off-by: Eric Feng <ef2648@columbia.edu>

c++: improve debug_tree for templated types/decls

gcc/cp/ChangeLog:

* ptree.cc (cxx_print_decl): Check for DECL_LANG_SPECIFIC and
TS_DECL_COMMON only when necessary. Print DECL_TEMPLATE_INFO
for all decls that have it, not just VAR_DECL or FUNCTION_DECL.
Also print DECL_USE_TEMPLATE.
(cxx_print_type): Print TYPE_TEMPLATE_INFO.
<case BOUND_TEMPLATE_TEMPLATE_PARM>: Don't print TYPE_TI_ARGS
anymore.
<case TEMPLATE_TYPE/TEMPLATE_PARM>: Print TEMPLATE_TYPE_PARM_INDEX
instead of printing the index, level and original level
individually.

tree-pretty-print: handle COMPONENT_REF with non-decl RHS

In the C++ front end, a COMPONENT_REF's second operand isn't always a
decl (at least at template parse time). This patch makes the generic
pretty printer not ICE when printing such a COMPONENT_REF.

gcc/ChangeLog:

* tree-pretty-print.cc (dump_generic_node) <case COMPONENT_REF>:
Don't call component_ref_field_offset if the RHS isn't a decl.

Use strtol instead of std::stoi [PR110646]

Implementation of std::stoi was overlooked on hppa-hpux, so use
strtol instead.
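
Unlike std::stoi, which reports failure by throwing, strtol reports it
through the end pointer and errno, so the caller has to check both.  A
minimal sketch of such a wrapper follows; parse_int is a hypothetical
helper written for this example and is not the code added to
gensupport.cc.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a base-10 int from TEXT.  Return true and store the value in
   *OUT on success; return false on empty input, trailing junk, or a
   value that does not fit in an int.  */
static bool
parse_int (const char *text, int *out)
{
  assert (text != NULL && out != NULL);
  char *end;
  errno = 0;
  long value = strtol (text, &end, 10);
  if (end == text || *end != '\0')
    return false;                     /* no digits, or junk after them */
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return false;                     /* out of range for long or int */
  *out = (int) value;
  return true;
}
```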

2023-08-11 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR bootstrap/110646
* gensupport.cc (class conlist): Use strtol instead of std::stoi.

preserve base pointer for __deregister_frame [PR110956]

Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956
Rainer Orth successfully tested the patch on Solaris with a full bootstrap.

Some uncommon unwinding table encodings need to access the base pointer
for address computations. We do not have that information in calls to
__deregister_frame_info_bases, and previously simply used nullptr as
base pointer. That is usually fine, but for some Solaris i386 shared
libraries that results in wrong address computations.

To fix this problem we now associate the unwinding object with
the table pointer itself, which is always known, in addition to
the PC range. When deregistering a frame, we first locate the object
using the table pointer, and then use the base pointer stored within
the object to compute the PC range.

libgcc/ChangeLog:
PR libgcc/110956
* unwind-dw2-fde.c: Associate object with address of unwinding
table.

[LRA]: Implement output stack pointer reloads

LRA prohibited output stack pointer reloads, but this resulted in an LRA
failure for the AVR target, which has no arithmetic insns working with the
stack pointer register. This patch implements output stack pointer
reloads.

gcc/ChangeLog:

* lra-constraints.cc (goal_alt_out_sp_reload_p): New flag.
(process_alt_operands): Set the flag.
(curr_insn_transform): Modify stack pointer offsets if output
stack pointer reload is generated.

libstdc++: Handle invalid values in std::chrono pretty printers

This avoids an IndexError exception when printing invalid chrono::month
or chrono::weekday values.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdChronoCalendarPrinter):
Check for out-of-range month and weekday indices.
* testsuite/libstdc++-prettyprinters/chrono.cc: Check invalid
month and weekday values.

libstdc++: Revert accidentally committed change to bits/stl_iterator.h

In commit r14-3134-g9cb2a7c8d54b1f I only meant to change some uses of
__clamp_iter_cat to use __iter_category_t, I didn't mean to commit the
additional change introducing __clamped_iter_cat_t. This reverts that
part.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (__clamped_iter_cat_t): Remove.

config: Fix host -rdynamic detection for build != host != target

The GCC_ENABLE_PLUGINS configure logic for detecting whether -rdynamic
is necessary and supported uses an appropriate objdump for $host
binaries (running on $build) in cases where $host is $build or
$target.

However, it is missing such logic in the case where $host is neither
$build nor $target, resulting in the compilers not being linked with
-rdynamic and plugins not being usable with such a compiler. In fact
$ac_cv_prog_OBJDUMP, as used when $build = $host, is always an objdump
for $host binaries that runs on $build; that is, it's appropriate to
use in this case as well.

Tested in such a configuration that it does result in cc1 being linked
with -rdynamic as expected. Also bootstrapped with no regressions for
x86_64-pc-linux-gnu.

config/
* gcc-plugin.m4 (GCC_ENABLE_PLUGINS): Use
export_sym_check="$ac_cv_prog_OBJDUMP -T" also when host is not
build or target.

gcc/
* configure: Regenerate.

libcc1/
* configure: Regenerate.

tree-optimization/110979 - fold-left reduction and partial vectors

When we vectorize fold-left reductions with partial vectors but
no target operation available we use a vector conditional to force
excess elements to zero.  But that doesn't correctly preserve
the sign of zero.  The following patch disables partial vector
support when we have to do that and also need to honor rounding
modes other than round-to-nearest.  When round-to-nearest is in
effect and we have to preserve the sign of zero instead use
negative zero for the excess elements.
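
The effect can be reproduced in scalar code.  The sketch below assumes
the default round-to-nearest mode and a compiler that honors signed
zeros (no -ffast-math); the names are illustrative.  Adding +0.0 to a
running sum of -0.0 yields +0.0, whereas adding -0.0 leaves any value
unchanged.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* One real element followed by one padding element, modeling an
   excess lane forced to +0.0 or to -0.0 respectively.  */
static const double pad_plus[2]  = { -0.0, +0.0 };
static const double pad_minus[2] = { -0.0, -0.0 };

/* In-order (fold-left) sum of the first N elements of V, starting
   from INIT.  */
static double
fold_left_sum (double init, const double *v, size_t n)
{
  assert (v != NULL);
  double acc = init;
  for (size_t i = 0; i < n; i++)
    acc += v[i];
  return acc;
}
```

With an initial value of -0.0, summing only the real element keeps the
sign bit set; including the +0.0 padding clears it, and including the
-0.0 padding preserves it.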

PR tree-optimization/110979
* tree-vect-loop.cc (vectorizable_reduction): For
FOLD_LEFT_REDUCTION without target support make sure
we don't need to honor signed zeros and sign dependent rounding.

* gcc.dg/torture/pr110979.c: New testcase.

Improve BB vectorization opt-info

The following makes us more correctly print the used vector size
when doing BB vectorization and also print all involved SLP graph
roots, not just the random one we ended up picking as leader.
In particular the last bit improves diffing opt-info between
different GCC revs but it also requires some testsuite adjustments.

* tree-vect-slp.cc (vect_slp_region): Provide opt-info for all SLP
subgraph entries. Dump the used vector size based on the
SLP subgraph entry root vector type.

* g++.dg/vect/slp-pr87105.cc: Adjust.
* gcc.dg/vect/bb-slp-17.c: Likewise.
* gcc.dg/vect/bb-slp-20.c: Likewise.
* gcc.dg/vect/bb-slp-21.c: Likewise.
* gcc.dg/vect/bb-slp-22.c: Likewise.
* gcc.dg/vect/bb-slp-subgroups-2.c: Likewise.

RISC-V: Support RVV VFMSUB rounding mode intrinsic API

This patch adds support for the rounding mode API of
VFMSUB, as in the samples below.

* __riscv_vfmsub_vv_f32m1_rm
* __riscv_vfmsub_vv_f32m1_rm_m
* __riscv_vfmsub_vf_f32m1_rm
* __riscv_vfmsub_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmsub_frm): New class for vfmsub frm.
(vfmsub_frm): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmsub_frm): New function declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-msub.c: New test.

VECT: Add vec_mask_len_{load_lanes,store_lanes} patterns

This patch adds vec_mask_len_{load_lanes,store_lanes} autovectorization patterns.

Here we want to support the following autovectorization:

void
foo (int8_t *__restrict a,
int8_t *__restrict b,
int8_t *__restrict cond,
int n)
{
  for (intptr_t i = 0; i < n; ++i)
    {
      if (cond[i])
        a[i] = b[i * 2] + b[i * 2 + 1];
    }
}

ARM SVE IR:

https://godbolt.org/z/cro1Eqc6a

  # loop_mask_60 = PHI <next_mask_82(4), max_mask_81(3)>
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  vec_mask_and_66 = loop_mask_60 & mask__39.12_63;
  ...
  vect_array.15 = .MASK_LOAD_LANES (_57, 8B, vec_mask_and_66);
  ...

For RVV, we would like to see IR:

  loop_len = SELECT_VL;
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  ...
  vect_array.15 = .MASK_LEN_LOAD_LANES (_57, 8B, mask__39.12_63, loop_len, bias);
  ...
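
A scalar reading of the two-lane case, as a sketch only.  It assumes
that indices below loop_len + bias whose mask element is set are loaded
and de-interleaved, and it leaves all other elements untouched (in the
real operation their contents may be unspecified).  The function and
parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a 2-lane masked, length-limited lane load: for each
   active index i, even[i] = src[2*i] and odd[i] = src[2*i + 1].  */
static void
mask_len_load_lanes2 (int8_t *even, int8_t *odd, const int8_t *src,
                      const uint8_t *mask, size_t len, ptrdiff_t bias)
{
  assert (even && odd && src && mask);
  assert ((ptrdiff_t) len + bias >= 0);
  size_t active = (size_t) ((ptrdiff_t) len + bias);
  for (size_t i = 0; i < active; i++)
    if (mask[i])
      {
        even[i] = src[2 * i];
        odd[i] = src[2 * i + 1];
      }
}
```

In the loop from the example, a[i] would then be computed as
even[i] + odd[i] for the same set of active indices.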

Bootstrap and Regression on X86 passed.

Ok for trunk ?

gcc/ChangeLog:

* doc/md.texi: Add vec_mask_len_{load_lanes,store_lanes} patterns.
* internal-fn.cc (expand_partial_load_optab_fn): Ditto.
(expand_partial_store_optab_fn): Ditto.
* internal-fn.def (MASK_LEN_LOAD_LANES): Ditto.
(MASK_LEN_STORE_LANES): Ditto.
* optabs.def (OPTAB_CD): Ditto.

RISC-V: Support RVV VFNMADD rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFNMADD, as in the samples below.

* __riscv_vfnmadd_vv_f32m1_rm
* __riscv_vfnmadd_vv_f32m1_rm_m
* __riscv_vfnmadd_vf_f32m1_rm
* __riscv_vfnmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmadd_frm): New class for vfnmadd frm.
(vfnmadd_frm): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmadd_frm): New function declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmadd.c: New test.

match.pd: Implement missed optimization ((x ^ y) & z) | x -> (z & y) | x [PR109938]

Adds a simplification for ((x ^ y) & z) | x to be folded into
(z & y) | x. Merges this simplification with ((x | y) & z) | x -> (z & y) | x
to avoid duplicating the pattern.
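
The two identities can be checked exhaustively on 8-bit values, which is enough because both sides are purely bitwise. The sketch below is only an illustration of that check; the function name is hypothetical and it is not one of the tests added by the patch.

```c
#include <stdint.h>

/* Returns 1 if, for all 8-bit x, y and z, both
     ((x ^ y) & z) | x == (z & y) | x   and
     ((x | y) & z) | x == (z & y) | x
   hold; returns 0 on the first counterexample.  */
static int
check_pr109938_identities (void)
{
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      for (unsigned z = 0; z < 256; ++z)
        {
          uint8_t folded = (uint8_t) ((z & y) | x);
          uint8_t with_xor = (uint8_t) (((x ^ y) & z) | x);
          uint8_t with_ior = (uint8_t) (((x | y) & z) | x);
          if (with_xor != folded || with_ior != folded)
            return 0;
        }
  return 1;
}
```

The reasoning per bit: where x is 1 both sides are 1, and where x is 0 both x ^ y and x | y reduce to y, leaving y & z.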

2023-08-11 Drew Ross <drross@redhat.com>
Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109938
* match.pd (((x ^ y) & z) | x -> (z & y) | x): New simplification.

* gcc.c-torture/execute/pr109938.c: New test.
* gcc.dg/tree-ssa/pr109938.c: New test.

RISC-V: Support RVV VFMADD rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFMADD, as in the samples below.

* __riscv_vfmadd_vv_f32m1_rm
* __riscv_vfmadd_vv_f32m1_rm_m
* __riscv_vfmadd_vf_f32m1_rm
* __riscv_vfmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmadd_frm): New class for vfmadd frm.
(vfmadd_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmadd_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-madd.c: New test.

RISC-V: Support RVV VFNMSAC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFNMSAC, as in the samples below.

* __riscv_vfnmsac_vv_f32m1_rm
* __riscv_vfnmsac_vv_f32m1_rm_m
* __riscv_vfnmsac_vf_f32m1_rm
* __riscv_vfnmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmsac_frm): New class for vfnmsac frm.
(vfnmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmsac_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmsac.c: New test.

c: Add __typeof_unqual__ and __typeof_unqual support

As I mentioned in my stdckdint.h mail, I think __-prefixed
spellings of the typeof_unqual keyword that can be used in earlier
language modes would be useful, since not all code can be switched to C23
right away.

The following patch implements that. For _Noreturn functions it keeps
the non-C23 behavior, to stay compatible with how
__typeof__ behaves.

I think we don't need it for C++, in C++ we have standard
traits to remove qualifiers etc.
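
A minimal sketch of how the new spelling might be used from pre-C23 code follows. It assumes __typeof_unqual__ is available in GCC 14 and later; the UNQUAL_TYPE_OF macro, the fallback branch and the function name are hypothetical and exist only so the example builds with other compilers.

```c
/* Assumption: __typeof_unqual__ is available in GCC 14 and later; other
   compilers fall back to spelling the unqualified type by hand.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 14
# define UNQUAL_TYPE_OF(expr) __typeof_unqual__ (expr)
#else
# define UNQUAL_TYPE_OF(expr) int
#endif

static int
copy_and_modify (void)
{
  const int limit = 4;
  /* With __typeof_unqual__ the copy has type 'int', not 'const int',
     so it can be modified.  */
  UNQUAL_TYPE_OF (limit) copy = limit;
  copy += 1;
  return copy;
}
```

With plain __typeof__ (limit) the copy would be const-qualified and the increment would be rejected.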

2023-08-11 Jakub Jelinek <jakub@redhat.com>

gcc/
* doc/extend.texi (Typeof): Document typeof_unqual
and __typeof_unqual__.
gcc/c-family/
* c-common.cc (c_common_reswords): Add __typeof_unqual
and __typeof_unqual__ spellings of typeof_unqual.
gcc/c/
* c-parser.cc (c_parser_typeof_specifier): Handle
__typeof_unqual and __typeof_unqual__ as !is_std.
gcc/testsuite/
* gcc.dg/c11-typeof-2.c: New test.
* gcc.dg/c11-typeof-3.c: New test.
* gcc.dg/gnu11-typeof-3.c: New test.
* gcc.dg/gnu11-typeof-4.c: New test.

Fix PR 110954: wrong code with cmp | !cmp

This was an oversight on my part: I forgot that
cmp might have a true value other than all ones;
in most cases its true value is 1.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.
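
For reference, the expected source-level behavior can be written down directly in C, where a comparison yields 0 or 1. This is only an illustration with a hypothetical function name, not the testcase added by the patch.

```c
/* A comparison in C evaluates to 0 or 1, so OR-ing it with its logical
   negation must give 1, never -1 (all ones).  */
static int
cmp_or_not_cmp (int f)
{
  return (f < 0) | !(f < 0);
}
```

Folding this expression to all ones is only valid when the type has a precision of 1, where the all-ones value and the true value coincide.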

This is version 2 of the patch.
Decided to go down a different route than just checking if
the precision was 1 inside bitwise_inverted_equal_p.
So instead bitwise_inverted_equal_p gets passed an argument
that will be set if there was a comparison that was being compared
and the user of bitwise_inverted_equal_p decides what needs to be done.
In most uses of bitwise_inverted_equal_p, the check will be
`!wascmp || element_precision (type) == 1` .
But in the case of `a & ~a` and `a ^| ~a` we can handle the case
of wascmp by using constant_boolean_node instead.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110954

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* gimple-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument to the macro.
(gimple_bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* match.pd (`a & ~a`, `a ^| ~a`): Update call
to bitwise_inverted_equal_p and handle wascmp case.
(`(~x | y) & x`, `(~x | y) & x`, `a?~t:t`): Update
call to bitwise_inverted_equal_p and check to see
if was !wascmp or if precision was 1.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110954-1.c: New test.

c: Support for -Wuseless-cast [PR84510]

Add support for -Wuseless-cast to C (and ObjC).
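
As an illustration of the kind of code the option targets, the sketch below casts an expression to the type it already has. The function name is hypothetical and the exact diagnostic text is not given in this commit message, so none is shown.

```c
/* 'value' already has type 'int', so the cast below changes nothing;
   this is the kind of cast -Wuseless-cast is meant to flag.  */
static int
useless_cast_example (int value)
{
  return (int) value;
}
```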

PR c/84510

gcc/c/:
* c-typeck.cc (build_c_cast): Add warning.

gcc/c-family/:
* c.opt: Enable warning for C and ObjC.

gcc/:
* doc/invoke.texi: Update.

gcc/testsuite/:
* gcc.dg/Wuseless-cast.c: New test.

RISC-V: Support RVV VFMSAC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFMSAC, as in the samples below.

* __riscv_vfmsac_vv_f32m1_rm
* __riscv_vfmsac_vv_f32m1_rm_m
* __riscv_vfmsac_vf_f32m1_rm
* __riscv_vfmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmsac_frm): New class for vfmsac frm.
(vfmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmsac_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-msac.c: New test.

Daily bump.

libstdc++: Fix out-of-bounds read in format string "{:{}." [PR110974]

libstdc++-v3/ChangeLog:

PR libstdc++/110974
* include/std/format (_Spec::_S_parse_width_or_precision): Check
for empty range before dereferencing iterator.
* testsuite/std/format/string.cc: Check for expected exception.
Fix expected exception message in test_pr110862() and actually
call it.

libstdc++: Fix std::format for localized floats [PR110968]

The __formatter_fp::_M_localize function just returns an empty string if
the formatting locale is the C locale, as there is nothing to do. But
the caller was assuming that the returned string contains the localized
string. The caller should use the original string if _M_localize returns
an empty string.

libstdc++-v3/ChangeLog:

PR libstdc++/110968
* include/std/format (__formatter_fp::format): Check return
value of _M_localize.
* testsuite/std/format/functions/format.cc: Check classic
locale.

libstdc++: Use alias template for iterator_category [PR110970]

This renames __iterator_category_t to __iter_category_t, for consistency
with std::iter_value_t, std::iter_difference_t and std::iter_reference_t
in C++20. Then use __iter_category_t in <bits/stl_iterator.h>, which
fixes the problem of the missing 'typename' that Clang 15 incorrectly
still requires.

libstdc++-v3/ChangeLog:

PR libstdc++/110970
* include/bits/stl_iterator.h (__detail::__move_iter_cat): Use
__iter_category_t.
(iterator_traits<common_iterator<I, S>>::_S_iter_cat): Likewise.
(__detail::__basic_const_iterator_iter_cat): Likewise.
* include/bits/stl_iterator_base_types.h (__iterator_category_t):
Rename to __iter_category_t.

Fix division by zero in loop splitting

The profile update I added to tree-ssa-loop-split can divide by zero when
the conditional is predicted with 0 probability, which is triggered by
the jump threading update in the testcase.

gcc/ChangeLog:

PR middle-end/110923
* tree-ssa-loop-split.cc (split_loop): Watch for division by zero.

gcc/testsuite/ChangeLog:

PR middle-end/110923
* gcc.dg/tree-ssa/pr110923.c: New test.

RISC-V: Add Ztso atomic mappings

The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guaranteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.

This PR implements the Ztso psABI mappings[1].

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/391

2023-08-08 Patrick O'Neill <patrick@rivosinc.com>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Ztso and mark Ztso as
dependent on 'a' extension.
* config/riscv/riscv-opts.h (MASK_ZTSO): New mask.
(TARGET_ZTSO): New target.
* config/riscv/riscv.cc (riscv_memmodel_needs_amo_acquire): Add
Ztso case.
(riscv_memmodel_needs_amo_release): Add Ztso case.
(riscv_print_operand): Add Ztso case for LR/SC annotations.
* config/riscv/riscv.md: Import sync-rvwmo.md and sync-ztso.md.
* config/riscv/riscv.opt: Add Ztso target variable.
* config/riscv/sync.md (mem_thread_fence_1): Expand to RVWMO or
Ztso specific insn.
(atomic_load<mode>): Expand to RVWMO or Ztso specific insn.
(atomic_store<mode>): Expand to RVWMO or Ztso specific insn.
* config/riscv/sync-rvwmo.md: New file. Separate out RVWMO
specific load/store/fence mappings.
* config/riscv/sync-ztso.md: New file. Separate out Ztso
specific load/store/fence mappings.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo-table-ztso-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-1.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-2.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-3.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-4.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-5.c: New test.
* gcc.target/riscv/amo-table-ztso-load-1.c: New test.
* gcc.target/riscv/amo-table-ztso-load-2.c: New test.
* gcc.target/riscv/amo-table-ztso-load-3.c: New test.
* gcc.target/riscv/amo-table-ztso-store-1.c: New test.
* gcc.target/riscv/amo-table-ztso-store-2.c: New test.
* gcc.target/riscv/amo-table-ztso-store-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

Fix profile update in duplicate_loop_body_to_header_edge for loops with 0 count_in

This patch makes duplicate_loop_body_to_header_edge not drop profile counts to
uninitialized when count_in is 0. This happens because a profile_probability computed
from a 0 count is undefined.

gcc/ChangeLog:

* cfgloopmanip.cc (duplicate_loop_body_to_header_edge): Special case loops with
0 iteration count.

Fix profile updating bug in tree-ssa-threadupdate

ssa_fix_duplicate_block_edges later calls update_profile to correct the profile after threading.
In the testcase this does not work since we lose track of the duplicated edge. This
happens because redirect_edge_and_branch returns NULL if the edge already has the correct
destination, which is the case here.

gcc/ChangeLog:

* tree-ssa-threadupdate.cc (ssa_fix_duplicate_block_edges): Fix profile update.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi_on_compare-1.c: Check profile consistency.

Fix undefined behaviour in profile_count::differs_from_p

This patch avoids overflow in profile_count::differs_from_p and also makes it
return false if one of the values is undefined while the other is defined.

gcc/ChangeLog:

* profile-count.cc (profile_count::differs_from_p): Fix overflow and
handling of undefined values.

phiopt: Fix phiopt ICE on vops [PR102989]

I ran into an ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
when enabling expensive tests, and unfortunately I can't reproduce without
_BitInt.  The IL before phiopt3 has:
  <bb 87> [local count: 203190070]:
  # .MEM_428 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC3);
  goto <bb 89>; [100.00%]

  <bb 88> [local count: 203190070]:
  # .MEM_427 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC4);

  <bb 89> [local count: 406380139]:
  # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
  # VUSE <.MEM_368>
  _123 = VIEW_CONVERT_EXPR<unsigned long[8]>(r495[i_107].D.2780)[0];
and factor_out_conditional_operation is called on the vop PHI, it
sees it has exactly two operands and defining statements of both
PHI arguments are converts (VCEs in this case), so it thinks it is
a good idea to try to optimize that and while doing that it constructs
void type SSA_NAMEs and the like.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

PR c/102989
* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
return virtual phis and return NULL if there is a virtual phi
where the arguments from E0 and E1 edges aren't equal.

Make ISEL used internal functions const/nothrow where appropriate

The .VEC_SET and .VEC_EXTRACT internal functions and the various .VCOND
internal functions operate on registers only and are not supposed to raise
any exceptions. The following makes them const/nothrow. I've
verified this avoids useless SSA updates in ISEL.

* internal-fn.def (VCOND, VCONDU, VCONDEQ, VCOND_MASK,
VEC_SET, VEC_EXTRACT): Make ECF_CONST | ECF_NOTHROW.

RISC-V: Add MASK vec_duplicate pattern[PR110962]

This patch fixes the following bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

SUBROUTINE a(b,c,d)
  LOGICAL,DIMENSION(INOUT)  :: b
  LOGICAL e
  REAL, DIMENSION(IN)     ::  c
  REAL, DIMENSION(INOUT)  ::  d
  REAL, DIMENSION(SIZE(c))   :: f
  WHERE (b.AND.e)
     WHERE (f>=0.)
        d = g
     ENDWHERE
  ENDWHERE
END SUBROUTINE a

   PR target/110962

gcc/ChangeLog:
PR target/110962
* config/riscv/autovec.md (vec_duplicate<mode>): New pattern.

RISC-V: Support RVV VFNMACC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFNMACC, as in the samples below.

* __riscv_vfnmacc_vv_f32m1_rm
* __riscv_vfnmacc_vv_f32m1_rm_m
* __riscv_vfnmacc_vf_f32m1_rm
* __riscv_vfnmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmacc_frm): New class for vfnmacc.
(vfnmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmacc.c: New test.

RISC-V: Support RVV VFMACC rounding mode intrinsic API

This patch adds support for the rounding mode API for
VFMACC, as in the samples below.

* __riscv_vfmacc_vv_f32m1_rm
* __riscv_vfmacc_vv_f32m1_rm_m
* __riscv_vfmacc_vf_f32m1_rm
* __riscv_vfmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmacc_frm): New class for vfmacc frm.
(vfmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-macc.c: New test.