git.ipfire.org Git - thirdparty/gcc.git/log

]> git.ipfire.org Git - thirdparty/gcc.git/log

Jonathan Wakely [Fri, 11 Aug 2023 17:10:29 +0000 (18:10 +0100)]

libstdc++: Do not call log10(0.0) in std::format [PR110860]

Calling log10(0.0) returns -inf which has undefined behaviour when
converted to an integer. We only need to use log10 for large values
anyway. If the value is zero then the larger buffer is only needed due
to a large precision, so we don't need to use log10 to estimate the
number of digits for the significand.

libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Do not call log10
with zero values.

commit | commitdiff | tree

Eric Feng [Fri, 11 Aug 2023 17:04:59 +0000 (13:04 -0400)]

MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself.

Signed-off-by: Eric Feng <ef2648@columbia.edu>

commit | commitdiff | tree

Patrick Palka [Fri, 11 Aug 2023 16:50:52 +0000 (12:50 -0400)]

c++: improve debug_tree for templated types/decls

gcc/cp/ChangeLog:

* ptree.cc (cxx_print_decl): Check for DECL_LANG_SPECIFIC and
TS_DECL_COMMON only when necessary. Print DECL_TEMPLATE_INFO
for all decls that have it, not just VAR_DECL or FUNCTION_DECL.
Also print DECL_USE_TEMPLATE.
(cxx_print_type): Print TYPE_TEMPLATE_INFO.
<case BOUND_TEMPLATE_TEMPLATE_PARM>: Don't print TYPE_TI_ARGS
anymore.
<case TEMPLATE_TYPE/TEMPLATE_PARM>: Print TEMPLATE_TYPE_PARM_INDEX
instead of printing the index, level and original level
individually.

commit | commitdiff | tree

Patrick Palka [Fri, 11 Aug 2023 16:17:24 +0000 (12:17 -0400)]

tree-pretty-print: handle COMPONENT_REF with non-decl RHS

In the C++ front end, a COMPONENT_REF's second operand isn't always a
decl (at least at template parse time). This patch makes the generic
pretty printer not ICE when printing such a COMPONENT_REF.

gcc/ChangeLog:

* tree-pretty-print.cc (dump_generic_node) <case COMPONENT_REF>:
Don't call component_ref_field_offset if the RHS isn't a decl.

commit | commitdiff | tree

John David Anglin [Fri, 11 Aug 2023 15:44:37 +0000 (15:44 +0000)]

Use strtol instead of std::stoi [PR110646]

Implementation of std::stoi was overlooked on hppa-hpux, so use
strtol instead.

2023-08-11 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR bootstrap/110646
* gensupport.cc(class conlist): Use strtol instead of std::stoi.

commit | commitdiff | tree

Thomas Neumann [Fri, 11 Aug 2023 15:20:27 +0000 (09:20 -0600)]

preserve base pointer for __deregister_frame [PR110956]

Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956
Rainer Orth successfully tested the patch on Solaris with a full bootstrap.

Some uncommon unwinding table encodings need to access the base pointer
for address computations. We do not have that information in calls to
__deregister_frame_info_bases, and previously simply used nullptr as
base pointer. That is usually fine, but for some Solaris i386 shared
libraries that results in wrong address computations.

To fix this problem we now associate the unwinding object with
the table pointer itself, which is always known, in addition to
the PC range. When deregistering a frame, we first locate the object
using the table pointer, and then use the base pointer stored within
the object to compute the PC range.

libgcc/ChangeLog:
PR libgcc/110956
* unwind-dw2-fde.c: Associate object with address of unwinding
table.

commit | commitdiff | tree

Vladimir N. Makarov [Fri, 11 Aug 2023 11:57:37 +0000 (07:57 -0400)]

[LRA]: Implement output stack pointer reloads

LRA prohibited output stack pointer reloads but it resulted in LRA
failure for AVR target which has no arithmetic insns working with the
stack pointer register. Given patch implements the output stack
pointer reloads.

gcc/ChangeLog:

* lra-constraints.cc (goal_alt_out_sp_reload_p): New flag.
(process_alt_operands): Set the flag.
(curr_insn_transform): Modify stack pointer offsets if output
stack pointer reload is generated.

commit | commitdiff | tree

Jonathan Wakely [Fri, 11 Aug 2023 11:27:58 +0000 (12:27 +0100)]

libstdc++: Handle invalid values in std::chrono pretty printers

This avoids an IndexError exception when printing invalid chrono::month
or chrono::weekday values.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdChronoCalendarPrinter):
Check for out-of-range month an weekday indices.
* testsuite/libstdc++-prettyprinters/chrono.cc: Check invalid
month and weekday values.

commit | commitdiff | tree

Jonathan Wakely [Fri, 11 Aug 2023 13:27:30 +0000 (14:27 +0100)]

libstdc++: Revert accidentally committed change to bits/stl_iterator.h

In commit r14-3134-g9cb2a7c8d54b1f I only meant to change some uses of
__clamp_iter_cat to use __iter_category_t, I didn't mean to commit the
additional change introducing __clamped_iter_cat_t. This reverts that
part.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (__clamped_iter_cat_t): Remove.

commit | commitdiff | tree

Joseph Myers [Fri, 11 Aug 2023 13:20:07 +0000 (13:20 +0000)]

config: Fix host -rdynamic detection for build != host != target

The GCC_ENABLE_PLUGINS configure logic for detecting whether -rdynamic
is necessary and supported uses an appropriate objdump for $host
binaries (running on $build) in cases where $host is $build or
$target.

However, it is missing such logic in the case where $host is neither
$build nor $target, resulting in the compilers not being linked with
-rdynamic and plugins not being usable with such a compiler. In fact
$ac_cv_prog_OBJDUMP, as used when $build = $host, is always an objdump
for $host binaries that runs on $build; that is, it's appropriate to
use in this case as well.

Tested in such a configuration that it does result in cc1 being linked
with -rdynamic as expected. Also bootstrapped with no regressions for
x86_64-pc-linux-gnu.

config/
* gcc-plugin.m4 (GCC_ENABLE_PLUGINS): Use
export_sym_check="$ac_cv_prog_OBJDUMP -T" also when host is not
build or target.

gcc/
* configure: Regenerate.

libcc1/
* configure: Regenerate.

commit | commitdiff | tree

Richard Biener [Fri, 11 Aug 2023 11:00:17 +0000 (13:00 +0200)]

tree-optimization/110979 - fold-left reduction and partial vectors

When we vectorize fold-left reductions with partial vectors but
no target operation available we use a vector conditional to force
excess elements to zero.  But that doesn't correctly preserve
the sign of zero.  The following patch disables partial vector
support when we have to do that and also need to honor rounding
modes other than round-to-nearest.  When round-to-nearest is in
effect and we have to preserve the sign of zero instead use
negative zero for the excess elements.

PR tree-optimization/110979
* tree-vect-loop.cc (vectorizable_reduction): For
FOLD_LEFT_REDUCTION without target support make sure
we don't need to honor signed zeros and sign dependent rounding.

* gcc.dg/torture/pr110979.c: New testcase.

commit | commitdiff | tree

Richard Biener [Fri, 11 Aug 2023 10:08:10 +0000 (12:08 +0200)]

Improve BB vectorization opt-info

The following makes us more correctly print the used vector size
when doing BB vectorization and also print all involved SLP graph
roots, not just the random one we ended up picking as leader.
In particular the last bit improves diffing opt-info between
different GCC revs but it also requires some testsuite adjustments.

* tree-vect-slp.cc (vect_slp_region): Provide opt-info for all SLP
subgraph entries. Dump the used vector size based on the
SLP subgraph entry root vector type.

* g++.dg/vect/slp-pr87105.cc: Adjust.
* gcc.dg/vect/bb-slp-17.c: Likewise.
* gcc.dg/vect/bb-slp-20.c: Likewise.
* gcc.dg/vect/bb-slp-21.c: Likewise.
* gcc.dg/vect/bb-slp-22.c: Likewise.
* gcc.dg/vect/bb-slp-subgroups-2.c: Likewise.

commit | commitdiff | tree

Pan Li [Fri, 11 Aug 2023 10:08:14 +0000 (18:08 +0800)]

RISC-V: Support RVV VFMSUB rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFMSUB as the below samples.

* __riscv_vfmsub_vv_f32m1_rm
* __riscv_vfmsub_vv_f32m1_rm_m
* __riscv_vfmsub_vf_f32m1_rm
* __riscv_vfmsub_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmsub_frm): New class for vfmsub frm.
(vfmsub_frm): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmsub_frm): New function declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-msub.c: New test.

commit | commitdiff | tree

Juzhe-Zhong [Fri, 11 Aug 2023 06:49:10 +0000 (14:49 +0800)]

VECT: Add vec_mask_len_{load_lanes,store_lanes} patterns

This patch is add vec_mask_len_{load_lanes,store_stores} autovectorization patterns.

Here we want to support this following autovectorization:

void
foo (int8_t *__restrict a,
int8_t *__restrict b,
int8_t *__restrict cond,
int n)
{
  for (intptr_t i = 0; i < n; ++i)
    {
      if (cond[i])
        a[i] = b[i * 2] + b[i * 2 + 1];
    }
}

ARM SVE IR:

https://godbolt.org/z/cro1Eqc6a

  # loop_mask_60 = PHI <next_mask_82(4), max_mask_81(3)>
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  vec_mask_and_66 = loop_mask_60 & mask__39.12_63;
  ...
  vect_array.15 = .MASK_LOAD_LANES (_57, 8B, vec_mask_and_66);
  ...

For RVV, we would like to see IR:

  loop_len = SELECT_VL;
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  ...
  vect_array.15 = .MASK_LEN_LOAD_LANES (_57, 8B, mask__39.12_63, loop_len, bias);
  ...

Bootstrap and Regression on X86 passed.

Ok for trunk ?

gcc/ChangeLog:

* doc/md.texi: Add vec_mask_len_{load_lanes,store_lanes} patterns.
* internal-fn.cc (expand_partial_load_optab_fn): Ditto.
(expand_partial_store_optab_fn): Ditto.
* internal-fn.def (MASK_LEN_LOAD_LANES): Ditto.
(MASK_LEN_STORE_LANES): Ditto.
* optabs.def (OPTAB_CD): Ditto.

commit | commitdiff | tree

Pan Li [Fri, 11 Aug 2023 08:07:45 +0000 (16:07 +0800)]

RISC-V: Support RVV VFNMADD rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFNMADD as the below samples.

* __riscv_vfnmadd_vv_f32m1_rm
* __riscv_vfnmadd_vv_f32m1_rm_m
* __riscv_vfnmadd_vf_f32m1_rm
* __riscv_vfnmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmadd_frm): New class for vfnmadd frm.
(vfnmadd_frm): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmadd_frm): New function declaration.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmadd.c: New test.

commit | commitdiff | tree

Drew Ross [Fri, 11 Aug 2023 08:17:56 +0000 (10:17 +0200)]

match.pd: Implement missed optimization ((x ^ y) & z) | x -> (z & y) | x [PR109938]

Adds a simplification for ((x ^ y) & z) | x to be folded into
(z & y) | x. Merges this simplification with ((x | y) & z) | x -> (z & y) | x
to prevent duplicate pattern.

2023-08-11 Drew Ross <drross@redhat.com>
Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109938
* match.pd (((x ^ y) & z) | x -> (z & y) | x): New simplification.

* gcc.c-torture/execute/pr109938.c: New test.
* gcc.dg/tree-ssa/pr109938.c: New test.

commit | commitdiff | tree

Pan Li [Fri, 11 Aug 2023 07:15:47 +0000 (15:15 +0800)]

RISC-V: Support RVV VFMADD rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFMADD as the below samples.

* __riscv_vfmadd_vv_f32m1_rm
* __riscv_vfmadd_vv_f32m1_rm_m
* __riscv_vfmadd_vf_f32m1_rm
* __riscv_vfmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmadd_frm): New class for vfmadd frm.
(vfmadd_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmadd_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-madd.c: New test.

commit | commitdiff | tree

Pan Li [Fri, 11 Aug 2023 05:50:55 +0000 (13:50 +0800)]

RISC-V: Support RVV VFNMSAC rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFNMSAC for the below samples.

* __riscv_vfnmsac_vv_f32m1_rm
* __riscv_vfnmsac_vv_f32m1_rm_m
* __riscv_vfnmsac_vf_f32m1_rm
* __riscv_vfnmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmsac_frm): New class for vfnmsac frm.
(vfnmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmsac_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmsac.c: New test.

commit | commitdiff | tree

Jakub Jelinek [Fri, 11 Aug 2023 07:34:15 +0000 (09:34 +0200)]

c: Add __typeof_unqual__ and __typeof_unqual support

As I mentioned in my stdckdint.h mail, I think having __ prefixed
keywords for the typeof_unqual keyword which can be used in earlier
language modes can be useful, not all code can be switched to C23
right away.

The following patch implements that. It keeps the non-C23 behavior
for it for the _Noreturn functions to stay compatible with how
__typeof__ behaves.

I think we don't need it for C++, in C++ we have standard
traits to remove qualifiers etc.

2023-08-11 Jakub Jelinek <jakub@redhat.com>

gcc/
* doc/extend.texi (Typeof): Document typeof_unqual
and __typeof_unqual__.
gcc/c-family/
* c-common.cc (c_common_reswords): Add __typeof_unqual
and __typeof_unqual__ spellings of typeof_unqual.
gcc/c/
* c-parser.cc (c_parser_typeof_specifier): Handle
__typeof_unqual and __typeof_unqual__ as !is_std.
gcc/testsuite/
* gcc.dg/c11-typeof-2.c: New test.
* gcc.dg/c11-typeof-3.c: New test.
* gcc.dg/gnu11-typeof-3.c: New test.
* gcc.dg/gnu11-typeof-4.c: New test.

commit | commitdiff | tree

Andrew Pinski [Wed, 9 Aug 2023 20:49:24 +0000 (13:49 -0700)]

Fix PR 110954: wrong code with cmp | !cmp

This was an oversight on my part forgetting that
cmp will might have a different true value than all ones
but will have a value of 1 in most cases.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.

This is version 2 of the patch.
Decided to go down a different route than just checking if
the precission was 1 inside bitwise_inverted_equal_p.
So instead bitwise_inverted_equal_p gets passed an argument
that will be set if there was a comparison that was being compared
and the user of bitwise_inverted_equal_p decides what needs to be done.
In most uses of bitwise_inverted_equal_p, the check will be
`!wascmp || element_precision (type) == 1` .
But in the case of `a & ~a` and `a ^| ~a` we can handle the case
of wascmp by using constant_boolean_node isntead.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110954

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* gimple-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument to the macro.
(gimple_bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* match.pd (`a & ~a`, `a ^| ~a`): Update call
to bitwise_inverted_equal_p and handle wascmp case.
(`(~x | y) & x`, `(~x | y) & x`, `a?~t:t`): Update
call to bitwise_inverted_equal_p and check to see
if was !wascmp or if precision was 1.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110954-1.c: New test.

commit | commitdiff | tree

Martin Uecker [Thu, 10 Aug 2023 08:39:41 +0000 (10:39 +0200)]

c: Support for -Wuseless-cast [PR84510]

Add support for Wuseless-cast C (and ObjC).

PR c/84510

gcc/c/:
* c-typeck.cc (build_c_cast): Add warning.

gcc/c-family/:
* c.opt: Enable warning for C and ObjC.

gcc/:
* doc/invoke.texi: Update.

gcc/testsuite/:
* gcc.dg/Wuseless-cast.c: New test.

commit | commitdiff | tree

Pan Li [Fri, 11 Aug 2023 02:06:38 +0000 (10:06 +0800)]

RISC-V: Support RVV VFMSAC rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFMSAC for the below samples.

* __riscv_vfmsac_vv_f32m1_rm
* __riscv_vfmsac_vv_f32m1_rm_m
* __riscv_vfmsac_vf_f32m1_rm
* __riscv_vfmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmsac_frm): New class for vfmsac frm.
(vfmsac_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmsac_frm): New function definition

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-msac.c: New test.

commit | commitdiff | tree

GCC Administrator [Fri, 11 Aug 2023 00:16:45 +0000 (00:16 +0000)]

Daily bump.

commit | commitdiff | tree

Jonathan Wakely [Thu, 10 Aug 2023 22:15:29 +0000 (23:15 +0100)]

libstdc++: Fix out-of-bounds read in format string "{:{}." [PR110974]

libstdc++-v3/ChangeLog:

PR libstdc++/110974
* include/std/format (_Spec::_S_parse_width_or_precision): Check
for empty range before dereferencing iterator.
* testsuite/std/format/string.cc: Check for expected exception.
Fix expected exception message in test_pr110862() and actually
call it.

commit | commitdiff | tree

Jonathan Wakely [Thu, 10 Aug 2023 13:33:44 +0000 (14:33 +0100)]

libstdc++: Fix std::format for localized floats [PR110968]

The __formatter_fp::_M_localize function just returns an empty string if
the formatting locale is the C locale, as there is nothing to do. But
the caller was assuming that the returned string contains the localized
string. The caller should use the original string if _M_localize returns
an empty string.

libstdc++-v3/ChangeLog:

PR libstdc++/110968
* include/std/format (__formatter_fp::format): Check return
value of _M_localize.
* testsuite/std/format/functions/format.cc: Check classic
locale.

commit | commitdiff | tree

Jonathan Wakely [Thu, 10 Aug 2023 12:48:48 +0000 (13:48 +0100)]

libstdc++: Use alias template for iterator_category [PR110970]

This renames __iterator_category_t to __iter_category_t, for consistency
with std::iter_value_t, std::iter_difference_t and std::iter_reference_t
in C++20. Then use __iter_category_t in <bits/stl_iterator.h>, which
fixes the problem of the missing 'typename' that Clang 15 incorrectly
still requires.

libstdc++-v3/ChangeLog:

PR libstdc++/110970
* include/bits/stl_iterator.h (__detail::__move_iter_cat): Use
__iter_category_t.
(iterator_traits<common_iterator<I, S>>::_S_iter_cat): Likewise.
(__detail::__basic_const_iterator_iter_cat): Likewise.
* include/bits/stl_iterator_base_types.h (__iterator_category_t):
Rename to __iter_category_t.

commit | commitdiff | tree

Jan Hubicka [Thu, 10 Aug 2023 22:23:14 +0000 (00:23 +0200)]

Fix division by zero in loop splitting

Profile update I added to tree-ssa-loop-split can divide by zero in
situation that the conditional is predicted with 0 probability which
is triggered by jump threading update in the testcase.

gcc/ChangeLog:

PR middle-end/110923
* tree-ssa-loop-split.cc (split_loop): Watch for division by zero.

gcc/testsuite/ChangeLog:

PR middle-end/110923
* gcc.dg/tree-ssa/pr110923.c: New test.

commit | commitdiff | tree

Patrick O'Neill [Thu, 10 Aug 2023 21:05:50 +0000 (14:05 -0700)]

RISC-V: Add Ztso atomic mappings

The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guarenteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.

This PR implements the Ztso psABI mappings[1].

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/391

2023-08-08 Patrick O'Neill <patrick@rivosinc.com>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Ztso and mark Ztso as
dependent on 'a' extension.
* config/riscv/riscv-opts.h (MASK_ZTSO): New mask.
(TARGET_ZTSO): New target.
* config/riscv/riscv.cc (riscv_memmodel_needs_amo_acquire): Add
Ztso case.
(riscv_memmodel_needs_amo_release): Add Ztso case.
(riscv_print_operand): Add Ztso case for LR/SC annotations.
* config/riscv/riscv.md: Import sync-rvwmo.md and sync-ztso.md.
* config/riscv/riscv.opt: Add Ztso target variable.
* config/riscv/sync.md (mem_thread_fence_1): Expand to RVWMO or
Ztso specific insn.
(atomic_load<mode>): Expand to RVWMO or Ztso specific insn.
(atomic_store<mode>): Expand to RVWMO or Ztso specific insn.
* config/riscv/sync-rvwmo.md: New file. Seperate out RVWMO
specific load/store/fence mappings.
* config/riscv/sync-ztso.md: New file. Seperate out Ztso
specific load/store/fence mappings.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo-table-ztso-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-1.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-2.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-3.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-4.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-5.c: New test.
* gcc.target/riscv/amo-table-ztso-load-1.c: New test.
* gcc.target/riscv/amo-table-ztso-load-2.c: New test.
* gcc.target/riscv/amo-table-ztso-load-3.c: New test.
* gcc.target/riscv/amo-table-ztso-store-1.c: New test.
* gcc.target/riscv/amo-table-ztso-store-2.c: New test.
* gcc.target/riscv/amo-table-ztso-store-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

commit | commitdiff | tree

Jan Hubicka [Thu, 10 Aug 2023 17:01:43 +0000 (19:01 +0200)]

Fix profile update in duplicat_loop_body_to_header_edge for loops with 0 count_in

this patch makes duplicate_loop_body_to_header_edge to not drop profile counts to
uninitialized when count_in is 0. This happens because profile_probability in 0 count
is undefined.

gcc/ChangeLog:

* cfgloopmanip.cc (duplicate_loop_body_to_header_edge): Special case loops with
0 iteration count.

commit | commitdiff | tree

Jan Hubicka [Thu, 10 Aug 2023 16:39:33 +0000 (18:39 +0200)]

Fix profile updating bug in tree-ssa-threadupdate

ssa_fix_duplicate_block_edges later calls update_profile to correct profile after threading.
In the testcase this does not work since we lose track of the duplicated edge. This
happens because redirect_edge_and_branch returns NULL if the edge already has correct
destination which is the case.

gcc/ChangeLog:

* tree-ssa-threadupdate.cc (ssa_fix_duplicate_block_edges): Fix profile update.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi_on_compare-1.c: Check profile consistency.

commit | commitdiff | tree

Jan Hubicka [Thu, 10 Aug 2023 16:35:13 +0000 (18:35 +0200)]

Fix undefined behaviour in profile_count::differs_from_p

This patch avoid overflow in profile_count::differs_from_p and also makes it to
return false from one of the values is undefined while other is defined.

gcc/ChangeLog:

* profile-count.cc (profile_count::differs_from_p): Fix overflow and
handling of undefined values.

commit | commitdiff | tree

Jakub Jelinek [Thu, 10 Aug 2023 15:29:23 +0000 (17:29 +0200)]

phiopt: Fix phiopt ICE on vops [PR102989]

I've ran into ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
when enabling expensive tests, and unfortunately I can't reproduce without
_BitInt.  The IL before phiopt3 has:
  <bb 87> [local count: 203190070]:
  # .MEM_428 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC3);
  goto <bb 89>; [100.00%]

  <bb 88> [local count: 203190070]:
  # .MEM_427 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC4);

  <bb 89> [local count: 406380139]:
  # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
  # VUSE <.MEM_368>
  _123 = VIEW_CONVERT_EXPR<unsigned long[8]>(r495[i_107].D.2780)[0];
and factor_out_conditional_operation is called on the vop PHI, it
sees it has exactly two operands and defining statements of both
PHI arguments are converts (VCEs in this case), so it thinks it is
a good idea to try to optimize that and while doing that it constructs
void type SSA_NAMEs and the like.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

PR c/102989
* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
return virtual phis and return NULL if there is a virtual phi
where the arguments from E0 and E1 edges aren't equal.

commit | commitdiff | tree

Richard Biener [Thu, 10 Aug 2023 12:19:38 +0000 (14:19 +0200)]

Make ISEL used internal functions const/nothrow where appropriate

Both .VEC_SET and .VEC_EXTACT and the various .VCOND internal functions
are operating on registers only and they are not supposed to raise
any exceptions. The following makes them const/nothrow. I've
verified this avoids useless SSA updates in ISEL.

* internal-fn.def (VCOND, VCONDU, VCONDEQ, VCOND_MASK,
VEC_SET, VEC_EXTRACT): Make ECF_CONST | ECF_NOTHROW.

commit | commitdiff | tree

Juzhe-Zhong [Thu, 10 Aug 2023 10:37:05 +0000 (18:37 +0800)]

RISC-V: Add MASK vec_duplicate pattern[PR110962]

This patch fix bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

SUBROUTINE a(b,c,d)
  LOGICAL,DIMENSION(INOUT)  :: b
  LOGICAL e
  REAL, DIMENSION(IN)     ::  c
  REAL, DIMENSION(INOUT)  ::  d
  REAL, DIMENSION(SIZE(c))   :: f
  WHERE (b.AND.e)
     WHERE (f>=0.)
        d = g
     ENDWHERE
  ENDWHERE
END SUBROUTINE a

   PR target/110962

gcc/ChangeLog:
PR target/110962
* config/riscv/autovec.md (vec_duplicate<mode>): New pattern.

commit | commitdiff | tree

Pan Li [Thu, 10 Aug 2023 08:00:17 +0000 (16:00 +0800)]

RISC-V: Support RVV VFNMACC rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFNMACC for the below samples.

* __riscv_vfnmacc_vv_f32m1_rm
* __riscv_vfnmacc_vv_f32m1_rm_m
* __riscv_vfnmacc_vf_f32m1_rm
* __riscv_vfnmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfnmacc_frm): New class for vfnmacc.
(vfnmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfnmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-nmacc.c: New test.

commit | commitdiff | tree

Pan Li [Thu, 3 Aug 2023 14:32:58 +0000 (22:32 +0800)]

RISC-V: Support RVV VFMACC rounding mode intrinsic API

This patch would like to support the rounding mode API for the
VFMACC for the below samples.

* __riscv_vfmacc_vv_f32m1_rm
* __riscv_vfmacc_vv_f32m1_rm_m
* __riscv_vfmacc_vf_f32m1_rm
* __riscv_vfmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class vfmacc_frm): New class for vfmacc frm.
(vfmacc_frm_obj): New declaration.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def
(vfmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-macc.c: New test.

commit | commitdiff | tree

Juzhe-Zhong [Thu, 10 Aug 2023 09:21:46 +0000 (17:21 +0800)]

RISC-V: Support TU for integer ternary OP[PR110964]

PR target/110964

gcc/ChangeLog:
PR target/110964
* config/riscv/riscv-v.cc (expand_cond_len_ternop): Add integer ternary.

gcc/testsuite/ChangeLog:
PR target/110964
* gcc.target/riscv/rvv/autovec/pr110964.c: New test.

commit | commitdiff | tree

Richard Biener [Wed, 9 Aug 2023 09:48:28 +0000 (11:48 +0200)]

Remove insert location argument from vectorizable_live_operation

The insert location argument isn't actually used but we compute
that ourselves. There's a single spot, namely when asking
for the loop mask via vect_get_loop_mask that the passed argument
is used but that looks like an oversight. The following fixes that
and adjusts vectorizable_live_operation and can_vectorize_live_stmts
to no longer take a stmt iterator argument.

* tree-vectorizer.h (vectorizable_live_operation): Remove
gimple_stmt_iterator * argument.
* tree-vect-loop.cc (vectorizable_live_operation): Likewise.
Adjust plumbing around vect_get_loop_mask.
(vect_analyze_loop_operations): Adjust.
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Likewise.
(vect_bb_slp_mark_live_stmts): Likewise.
(vect_schedule_slp_node): Likewise.
* tree-vect-stmts.cc (can_vectorize_live_stmts): Likewise.
Remove gimple_stmt_iterator * argument.
(vect_transform_stmt): Adjust.

commit | commitdiff | tree

Juzhe-Zhong [Thu, 10 Aug 2023 08:55:18 +0000 (16:55 +0800)]

RISC-V: Add missing modes to the iterators

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Add missing modes.

commit | commitdiff | tree

Jakub Jelinek [Thu, 10 Aug 2023 07:23:08 +0000 (09:23 +0200)]

lto-streamer-in: Adjust assert [PR102989]

With _BitInt(575) or any other _BitInt(513) or larger constants we can
run into this assertion. MAX_BITSIZE_MODE_ANY_INT is just a value from
which WIDE_INT_MAX_PRECISION is derived.

2023-08-10 Jakub Jelinek <jakub@redhat.com>

PR c/102989
* lto-streamer-in.cc (lto_input_tree_1): Assert TYPE_PRECISION
is up to WIDE_INT_MAX_PRECISION rather than MAX_BITSIZE_MODE_ANY_INT.

commit | commitdiff | tree

Jakub Jelinek [Thu, 10 Aug 2023 07:22:03 +0000 (09:22 +0200)]

expr: Small optimization [PR102989]

Small optimization to avoid testing modifier multiple times.

2023-08-10 Jakub Jelinek <jakub@redhat.com>

PR c/102989
* expr.cc (expand_expr_real_1) <case MEM_REF>: Add an early return for
EXPAND_WRITE or EXPAND_MEMORY modifiers to avoid testing it multiple
times.

commit | commitdiff | tree

liuhongt [Wed, 9 Aug 2023 06:25:53 +0000 (14:25 +0800)]

i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]

Also add ix86_partial_vec_fp_math to to condition of V2HF/V4HF named
patterns in order to avoid generation of partial vector V8HFmode
trapping instructions.

gcc/ChangeLog:

PR target/110832
* config/i386/mmx.md: (movq_<mode>_to_sse): Also do not
sanitize upper part of V4HFmode register with
-fno-trapping-math.
(<insn>v4hf3): Enable for ix86_partial_vec_fp_math.
(<divv4hf3): Ditto.
(<insn>v2hf3): Ditto.
(divv2hf3): Ditto.
(movd_v2hf_to_sse): Do not sanitize upper part of V2HFmode
register with -fno-trapping-math.

commit | commitdiff | tree

Pan Li [Sun, 6 Aug 2023 03:10:23 +0000 (11:10 +0800)]

RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic

The frm_mode attr has some assumptions for each define insn as below.

1. The define insn has at least 9 operands.
2. The operands[9] must be frm reg.
3. The operands[9] must be const int.

Actually, the frm operand can be operands[8], operands[9] or
operands[10], and not all the define insn has frm operands.

This patch would like to refactor frm and eliminate the above
assumptions, as well as unblock the underlying rounding mode intrinsic
API support.

After refactor, the default frm will be none, and the selected insn type
will be dyn. For the floating point which honors the frm, we will
set the frm_mode attr explicitly in define_insn.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
gcc/ChangeLog:

* config/riscv/riscv-protos.h
(enum floating_point_rounding_mode): Add NONE, DYN_EXIT and DYN_CALL.
(get_frm_mode): New declaration.
* config/riscv/riscv-v.cc (get_frm_mode): New function to get frm mode.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn): Take care of frm reg.
* config/riscv/riscv.cc (riscv_static_frm_mode_p): Migrate to FRM_XXX.
(riscv_emit_frm_mode_set): Ditto.
(riscv_emit_mode_set): Ditto.
(riscv_frm_adjust_mode_after_call): Ditto.
(riscv_frm_mode_needed): Ditto.
(riscv_frm_mode_after): Ditto.
(riscv_mode_entry): Ditto.
(riscv_mode_exit): Ditto.
* config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Ditto.
* config/riscv/vector.md
(rne,rtz,rdn,rup,rmm,dyn,dyn_exit,dyn_call,none): Removed
(symbol_ref): * config/riscv/vector.md: Set frm_mode attr explicitly.

commit | commitdiff | tree

GCC Administrator [Thu, 10 Aug 2023 00:17:26 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Juzhe-Zhong [Wed, 9 Aug 2023 10:51:42 +0000 (18:51 +0800)]

RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL PASS]

Realize we have a bug in VSETVL PASS which is triggered by strided_load_run-1.c in RV32 system.

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test

This is because VSETVL PASS incorrect hoist vsetvl instruction:

...
   10156: 0d9075d7           vsetvli a1,zero,e64,m2,ta,ma ---> pollute 'a1' register which will be used by following insns.
   1015a: 01d586b3           add a3,a1,t4  --------> use 'a1'
   1015e: 5e070257           vmv.v.v v4,v14
   10162: b7032257           vmacc.vv v4,v6,v16
   10166: 26440257           vand.vv v4,v4,v8
   1016a: 22880227           vs2r.v v4,(a6)
   1016e: 00b6b7b3           sltu a5,a3,a1
   10172: 22888227           vs2r.v v4,(a7)
   10176: 9e60b157           vmv2r.v v2,v6
   1017a: 97ba                 add a5,a5,a4
   1017c: a6a62157           vmadd.vv v2,v12,v10
   10180: 26240157           vand.vv v2,v2,v8
   10184: 22830127           vs2r.v v2,(t1)
   10188: 873e                 mv a4,a5
   1018a: 982a                 add a6,a6,a0
   1018c: 98aa                 add a7,a7,a0
   1018e: 932a                 add t1,t1,a0
   10190: 85b6                 mv a1,a3       -----> set 'a1'
...

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix
incorrect anticipate info.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
Adapt test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.

commit | commitdiff | tree

David Malcolm [Wed, 9 Aug 2023 20:17:04 +0000 (16:17 -0400)]

analyzer: remove default return value from region_model::on_call_pre

Previously, the code for simulating calls to external functions in
region_model::on_call_pre wrote a default svalue to the LHS of the
call statement, which could be further overwritten by known_function
subclasses.

Unfortunately, this led to messy hacks, such as when the default svalue
was an allocation: the LHS would be written to with two different
heap-allocated regions, requiring special-case cleanups to avoid the
stray state from the first heap allocation leading to state explosions;
see r14-3001-g021077b94741c9.

The following patch eliminates this write of a default svalue to the LHS
of callsite.  Instead, all known_function implementations that have a
return value are now responsible for set the LHS themselves.  A new
call_details::set_any_lhs_with_defaults function is provided to make it
easy to get the old behavior.

On working through the various known_function subclasses, I noticed that
memset was using the default behavior.  That patch updates this so that
it's now known to return its first parameter.

Cleaning this up eliminates various doubling of saved_diagnostics (e.g.
for dubious_allocation_size) where it was generating a diagnostic for
both writes to the LHS, deduplicating them to the first diagnostic (with
the default LHS), and then failing to create a region_creation_event
when emitting the diagnostic, leading to the fallback wording in
dubious_allocation_size::describe_final_event, such as:

  (1) allocated 42 bytes and assigned to ‘int32_t *’ {aka ‘int *’} here; ‘sizeof (int32_t {aka int})’ is ‘4’

Without the double write to the LHS, it creates a region_creation_event,
so we get the allocation and the assignment as two separate events in
the diagnostic path, e.g.:

  (1) allocated 42 bytes here
  (2) assigned to ‘int32_t *’ {aka ‘int *’} here; ‘sizeof (int32_t {aka int})’ is ‘4’

gcc/analyzer/ChangeLog:
* analyzer.h (class pure_known_function_with_default_return): New
subclass.
* call-details.cc (const_fn_p): Move here from region-model.cc.
(maybe_get_const_fn_result): Likewise.
(get_result_size_in_bytes): Likewise.
(call_details::set_any_lhs_with_defaults): New function, based on
code in region_model::on_call_pre.
* call-details.h (call_details::set_any_lhs_with_defaults): New
decl.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Log the index of the
saved_diagnostic.
* kf.cc (pure_known_function_with_default_return::impl_call_pre):
New.
(kf_memset::impl_call_pre): Set the LHS to the first param.
(kf_putenv::impl_call_pre): Call cd.set_any_lhs_with_defaults.
(kf_sprintf::impl_call_pre): Call cd.set_any_lhs_with_defaults.
(class kf_stack_restore): Derive from
pure_known_function_with_default_return.
(class kf_stack_save): Likewise.
(kf_strlen::impl_call_pre): Call cd.set_any_lhs_with_defaults.
* region-model-reachability.cc (reachable_regions::handle_sval):
Remove logic for symbolic regions for pointers.
* region-model.cc (region_model::canonicalize): Remove purging of
dynamic extents workaround for surplus values from
region_model::on_call_pre's default LHS code.
(const_fn_p): Move to call-details.cc.
(maybe_get_const_fn_result): Likewise.
(get_result_size_in_bytes): Likewise.
(region_model::update_for_nonzero_return): Call
cd.set_any_lhs_with_defaults.
(region_model::on_call_pre): Remove the assignment to the LHS of a
default return value, instead requiring all known_function
implementations to write to any LHS of the call.  Use
cd.set_any_lhs_with_defaults on the non-kf paths.
* sm-fd.cc (kf_socket::outcome_of_socket::update_model): Use
cd.set_any_lhs_with_defaults when failing to get at fd state.
(kf_bind::outcome_of_bind::update_model): Likewise.
(kf_listen::outcome_of_listen::update_model): Likewise.
(kf_accept::outcome_of_accept::update_model): Likewise.
(kf_connect::outcome_of_connect::update_model): Likewise.
(kf_read::impl_call_pre): Use cd.set_any_lhs_with_defaults.
* sm-file.cc (class kf_stdio_output_fn): Derive from
pure_known_function_with_default_return.
(class kf_ferror): Likewise.
(class kf_fileno): Likewise.
(kf_fgets::impl_call_pre): Use cd.set_any_lhs_with_defaults.
(kf_read::impl_call_pre): Likewise.
(class kf_getc): Derive from
pure_known_function_with_default_return.
(class kf_getchar): Likewise.
* varargs.cc (kf_va_arg::impl_call_pre): Use
cd.set_any_lhs_with_defaults.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/allocation-size-1.c: Update expected results
to reflect splitting of allocation size and assignment messages
from a single event into pairs of events
* gcc.dg/analyzer/allocation-size-2.c: Likewise.
* gcc.dg/analyzer/allocation-size-3.c: Likewise.
* gcc.dg/analyzer/allocation-size-4.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-1.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
* gcc.dg/analyzer/memset-1.c (test_1): Verify that the return
value is the initial argument.
* gcc.dg/plugin/analyzer_kernel_plugin.c
(copy_across_boundary_fn::impl_call_pre): Ensure the LHS is set on
the "known zero size" case.
* gcc.dg/plugin/analyzer_known_fns_plugin.c
(known_function_attempt_to_copy::impl_call_pre): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

commit | commitdiff | tree

Tsukasa OI [Wed, 9 Aug 2023 19:59:43 +0000 (13:59 -0600)]

RISC-V: Remove non-existing 'Zve32d' extension

Since this extension does not exist, this commit prunes this from
the defined extension version table.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
Remove 'Zve32d' from the version list.

commit | commitdiff | tree

Jin Ma [Wed, 9 Aug 2023 19:52:06 +0000 (13:52 -0600)]

RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.

Reference: https://github.com/gcc-mirror/gcc/commit/d0bc0cb66bcb0e6a5a5a31a9e900e8ccc98e34e5

RISC-V should also be implemented to handle no_insn patterns for pipelining.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_sched_variable_issue): New function.
(TARGET_SCHED_VARIABLE_ISSUE): New macro.

Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-authored-by: Jeff Law <jlaw@ventanamicro.com>

commit | commitdiff | tree

Jivan Hakobyan [Wed, 9 Aug 2023 19:26:58 +0000 (13:26 -0600)]

RISC-V: Folding memory for FP + constant case

Accessing local arrays element turned into load form (fp + (index << C1)) +
C2 address.

In the case when access is in the loop we got loop invariant computation.  For
some reason, moving out that part cannot be done in loop-invariant passes.  But
we can handle that in target-specific hook (legitimize_address).  That provides
an opportunity to rewrite memory access more suitable for the target
architecture.

This patch solves the mentioned case by rewriting mentioned case to ((fp +
C2) + (index << C1))

I have evaluated it on SPEC2017 and got an improvement on leela (over 7b
instructions, .39% of the dynamic count) and dwarfs the regression for gcc (14m
instructions, .0012% of the dynamic count).

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
(mem_shadd_or_shadd_rtx_p): New function.

commit | commitdiff | tree

Andrew Pinski [Mon, 7 Aug 2023 17:47:09 +0000 (10:47 -0700)]

MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

This adds a simple match pattern for this case.
I noticed it a couple of different places.
One while I was looking at code generation of a parser and
also while I was looking at locations where bitwise_inverted_equal_p
should be used more.

Committed as approved after bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.

commit | commitdiff | tree

Uros Bizjak [Wed, 9 Aug 2023 17:41:11 +0000 (19:41 +0200)]

i386: Add missing dot to -mpartial-vector-fp-math description

gcc/ChangeLog:

* config/i386/i386.opt (mpartial-vector-fp-math): Add dot.

commit | commitdiff | tree

Richard Ball [Wed, 9 Aug 2023 15:28:58 +0000 (16:28 +0100)]

aarch64: Add support for Cortex-A520 CPU

This patch adds support for the Cortex-A520 CPU to GCC.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex-A520 CPU.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document Cortex-A520 CPU.

commit | commitdiff | tree

Carl Love [Wed, 9 Aug 2023 15:30:48 +0000 (11:30 -0400)]

rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

The current built-in definitions for vcmpneb, vcmpneh, vcmpnew are defined
under the Power 9 section of r66000-builtins.  This implies they are only
supported on Power 9 and above when in fact they are defined and work with
Altivec as well with the appropriate Altivec instruction generation.

The vec_cmpne builtin should generate the vcmpequ{b,h,w} instruction with
Altivec enabled and generate the vcmpne{b,h,w} on Power 9 and newer
processors.

This patch moves the definitions to the Altivec stanza to make it clear
the built-ins are supported for all Altivec processors.  The patch
removes the confusion as to which processors support the vcmpequ{b,h,w}
instructions.

There is existing test coverage for the vec_cmpne built-in for
vector bool char, vector bool short, vector bool int,
vector bool long long in builtins-3-p9.c and p8vector-builtin-2.c.
Coverage for vector signed int, vector unsigned int is in
p8vector-builtin-2.c.

Test vec-cmpne.c is updated to check the generation of the vcmpequ{b,h,w}
instructions for Altivec.  A new test vec-cmpne-runnable.c is added to
verify the built-ins work as expected.

Patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE
with no regressions.

gcc/ChangeLog:

* config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew):
Move definitions to Altivec stanza.
* config/rs6000/altivec.md (vcmpneb, vcmpneh, vcmpnew): New
define_expand.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/vec-cmpne-runnable.c: New execution test.
* gcc.target/powerpc/vec-cmpne.c (define_test_functions,
execute_test_functions): Move to vec-cmpne.h.  Add
scan-assembler-times for vcmpequb, vcmpequh, vcmpequw.
* gcc.target/powerpc/vec-cmpne.h: New include file for vec-cmpne.c
and vec-cmpne-runnable.c. Split define_test_functions definition
into define_test_functions and define_init_verify_functions.

commit | commitdiff | tree

Jonathan Wakely [Wed, 9 Aug 2023 10:11:31 +0000 (11:11 +0100)]

libstdc++: Fix constexpr functions to conform to older standards

Some constexpr functions were inadvertently relying on relaxed constexpr
rules from later standards.

libstdc++-v3/ChangeLog:

* include/bits/chrono.h (duration_cast): Do not use braces
around statements for C++11 constexpr rules.
* include/bits/stl_algobase.h (__lg): Rewrite as a single
statement for C++11 constexpr rules.
* include/experimental/bits/fs_path.h (path::string): Use
_GLIBCXX17_CONSTEXPR not _GLIBCXX_CONSTEXPR for 'if constexpr'.
* include/std/charconv (__to_chars_8): Initialize variable for
C++17 constexpr rules.

commit | commitdiff | tree

Jonathan Wakely [Wed, 9 Aug 2023 10:28:56 +0000 (11:28 +0100)]

libstdc++: Fix a -Wsign-compare warning in std::list

libstdc++-v3/ChangeLog:

* include/bits/list.tcc (list::sort(Cmp)): Fix -Wsign-compare
warning for loop condition.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Aug 2023 21:19:49 +0000 (22:19 +0100)]

libstdc++: Suppress clang -Wc99-extensions warnings in <complex>

This prevents Clang from warning about the use of the non-standard
__complex__ keyword.

libstdc++-v3/ChangeLog:

* include/std/complex: Add diagnostic pragma for clang.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Aug 2023 21:07:29 +0000 (22:07 +0100)]

libstdc++: Fix some -Wmismatched-tags warnings

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr_atomic.h (atomic): Change class-head
to struct.
* include/bits/stl_tree.h (_Rb_tree_merge_helper): Change
class-head to struct in friend declaration.
* include/std/chrono (tzdb_list::_Node): Likewise.
* include/std/future (_Task_state_base, _Task_state): Likewise.
* include/std/scoped_allocator (__inner_type_impl): Likewise.
* include/std/valarray (_BinClos, _SClos, _GClos, _IClos)
(_ValFunClos, _RefFunClos): Change class-head to struct.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Aug 2023 21:01:36 +0000 (22:01 +0100)]

libstdc++: Fix some -Wunused-parameter warnings

libstdc++-v3/ChangeLog:

* include/bits/alloc_traits.h (allocate): Add [[maybe_unused]]
attribute.
* include/bits/regex_executor.tcc: Remove name of unused
parameter.
* include/bits/shared_ptr_atomic.h (atomic_is_lock_free):
Likewise.
* include/bits/stl_uninitialized.h: Likewise.
* include/bits/streambuf_iterator.h (operator==): Likewise.
* include/bits/uses_allocator.h: Likewise.
* include/c_global/cmath (isfinite, isinf, isnan): Likewise.
* include/std/chrono (zoned_time): Likewise.
* include/std/future (__future_base::_S_allocate_result):
Likewise.
(packaged_task): Likewise.
* include/std/optional (_Optional_payload_base): Likewise.
* include/std/scoped_allocator (__inner_type_impl): Likewise.
* include/std/tuple (_Tuple_impl): Likewise.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Aug 2023 15:24:31 +0000 (16:24 +0100)]

libstdc++: Explicitly default some copy ctors and assignments

The standard says that the implicit copy assignment operator is
deprecated for classes that have a user-provided copy constructor, and
vice versa.

libstdc++-v3/ChangeLog:

* include/bits/new_allocator.h (__new_allocator): Define copy
assignment operator as defaulted.
* include/std/complex (complex<float>, complex<double>)
(complex<long double>): Define copy constructor as defaulted.

commit | commitdiff | tree

Jonathan Wakely [Tue, 8 Aug 2023 15:29:17 +0000 (16:29 +0100)]

libstdc++: Minor fixes for some warnings in <format>

libstdc++-v3/ChangeLog:

* include/std/format: Fix some warnings.
(__format::__write(Ctx&, basic_string_view<CharT>)): Remove
unused function template.

commit | commitdiff | tree

Juzhe-Zhong [Wed, 9 Aug 2023 12:18:40 +0000 (20:18 +0800)]

RISC-V: Support NPATTERNS = 1 stepped vector[PR110950]

This patch fix ICE: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110950

0x1cf8939 expand_const_vector
../../../riscv-gcc/gcc/config/riscv/riscv-v.cc:1587

PR target/110950

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Add NPATTERNS = 1
stepped vector support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr110950.c: New test.

commit | commitdiff | tree

Paul Thomas [Wed, 9 Aug 2023 11:04:09 +0000 (12:04 +0100)]

Fortran: Allow pure final procs contained in pure proc. [PR109684]

2023-08-09 Steve Kargl <sgk@troutmask.apl.washington.edu>

gcc/fortran
PR fortran/109684
* resolve.cc (resolve_types): Exclude contained procedures with
the artificial attribute from test for pureness.

commit | commitdiff | tree

Gaius Mulley [Wed, 9 Aug 2023 08:35:13 +0000 (09:35 +0100)]

PR modula2/110779: libgm2 fix solaris bootstrap check for tm_gmtoff

This patch defensively checks for every C function and every struct
used in wrapclock.cc. It adds return values to GetTimespec and
SetTimespec to allow the module to return a code representing
unavailable.

gcc/m2/ChangeLog:

PR modula2/110779
* gm2-libs-iso/SysClock.mod (GetClock): Test GetTimespec
return value.
(SetClock): Test SetTimespec return value.
* gm2-libs-iso/wrapclock.def (GetTimespec): Add integer
return type.
(SetTimespec): Add integer return type.

libgm2/ChangeLog:

PR modula2/110779
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac (AC_CACHE_CHECK): Check for tm_gmtoff field in
struct tm.
(GM2_CHECK_LIB): Check for daylight, timezone and tzname.
* libm2iso/wrapclock.cc (timezone): Guard against absence of
struct tm and tm_gmtoff.
(daylight): Check for daylight.
(timezone): Check for timezone.
(isdst): Check for isdst.
(tzname): Check for tzname.
(GetTimeRealtime): Check for struct timespec.
(SetTimeRealtime): Check for struct timespec.
(InitTimespec): Check for struct timespec.
(KillTimespec): Check for struct timespec.
(SetTimespec): Check for struct timespec.
(GetTimespec): Check for struct timespec.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

commit | commitdiff | tree

liuhongt [Wed, 9 Aug 2023 06:41:46 +0000 (14:41 +0800)]

Rename local variable subleaf_level to max_subleaf_level.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Rename local variable subleaf_level to max_subleaf_level.

commit | commitdiff | tree

Richard Biener [Tue, 25 Jul 2023 13:36:30 +0000 (15:36 +0200)]

rtl-optimization/110587 - speedup find_hard_regno_for_1

The following applies a micro-optimization to find_hard_regno_for_1,
re-ordering the check so we can easily jump-thread by using an else.
This reduces the time spent in this function by 15% for the testcase
in the PR.

PR rtl-optimization/110587
* lra-assigns.cc (find_hard_regno_for_1): Re-order checks.

commit | commitdiff | tree

Kewen Lin [Wed, 9 Aug 2023 06:15:46 +0000 (01:15 -0500)]

rs6000: Teach legitimate_address_p about LEN_{LOAD,STORE} [PR110248]

This patch is to teach rs6000_legitimate_address_p to
handle the queried rtx constructed for LEN_{LOAD,STORE},
since lxvl and stxvl doesn't support x-form or ds-form,
so consider it as not legitimate when outer code is PLUS.

PR tree-optimization/110248

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_legitimate_address_p): Check if
the given code is for ifn LEN_{LOAD,STORE}, if yes then make it not
legitimate when outer code is PLUS.

commit | commitdiff | tree

Kewen Lin [Wed, 9 Aug 2023 05:41:52 +0000 (00:41 -0500)]

ivopts: Call valid_mem_ref_p with ifn [PR110248]

As PR110248 shows, to get the expected query results for
that internal functions LEN_{LOAD,STORE} is able to adopt
some addressing modes, we need to pass down the related
IFN code as well. This patch is to make IVOPTs pass down
ifn code for USE_PTR_ADDRESS type uses, it adjusts the
related functions {strict_,}memory_address_addr_space_p,
and valid_mem_ref_p as well.

PR tree-optimization/110248

gcc/ChangeLog:

* recog.cc (memory_address_addr_space_p): Add one more argument ch of
type code_helper and pass it to targetm.addr_space.legitimate_address_p
instead of ERROR_MARK.
(offsettable_address_addr_space_p): Update one function pointer with
one more argument of type code_helper as its assignees
memory_address_addr_space_p and strict_memory_address_addr_space_p
have been adjusted, and adjust some call sites with ERROR_MARK.
* recog.h (tree.h): New include header file for tree_code ERROR_MARK.
(memory_address_addr_space_p): Adjust with one more unnamed argument
of type code_helper with default ERROR_MARK.
(strict_memory_address_addr_space_p): Likewise.
* reload.cc (strict_memory_address_addr_space_p): Add one unnamed
argument of type code_helper.
* tree-ssa-address.cc (valid_mem_ref_p): Add one more argument ch of
type code_helper and pass it to memory_address_addr_space_p.
* tree-ssa-address.h (valid_mem_ref_p): Adjust the declaration with
one more unnamed argument of type code_helper with default value
ERROR_MARK.
* tree-ssa-loop-ivopts.cc (get_address_cost): Use ERROR_MARK as code
by default, change it with ifn code for USE_PTR_ADDRESS type use, and
pass it to all valid_mem_ref_p calls.

commit | commitdiff | tree

Kewen Lin [Wed, 9 Aug 2023 05:02:26 +0000 (00:02 -0500)]

targhooks: Extend legitimate_address_p with code_helper [PR110248]

As PR110248 shows, some middle-end passes like IVOPTs can
query the target hook legitimate_address_p with some
artificially constructed rtx to determine whether some
addressing modes are supported by target for some gimple
statement. But for now the existing legitimate_address_p
only checks the given mode, it's unable to distinguish
some special cases unfortunately, for example, for LEN_LOAD
ifn on Power port, we would expand it with lxvl hardware
insn, which only supports one register to hold the address
(the other register is holding the length), that is we
don't support base (reg) + index (reg) addressing mode for
sure. But hook legitimate_address_p only considers the
given mode which would be some vector mode for LEN_LOAD
ifn, and we do support base + index addressing mode for
normal vector load and store insns, so the hook will return
true for the query unexpectedly.

This patch is to introduce one extra argument of type
code_helper for hook legitimate_address_p, it makes targets
able to handle some special case like what's described
above.

PR tree-optimization/110248

gcc/ChangeLog:

* coretypes.h (class code_helper): Add forward declaration.
* doc/tm.texi: Regenerate.
* lra-constraints.cc (valid_address_p): Call target hook
targetm.addr_space.legitimate_address_p with an extra parameter
ERROR_MARK as its prototype changes.
* recog.cc (memory_address_addr_space_p): Likewise.
* reload.cc (strict_memory_address_addr_space_p): Likewise.
* target.def (legitimate_address_p, addr_space.legitimate_address_p):
Extend with one more argument of type code_helper, update the
documentation accordingly.
* targhooks.cc (default_legitimate_address_p): Adjust for the
new code_helper argument.
(default_addr_space_legitimate_address_p): Likewise.
* targhooks.h (default_legitimate_address_p): Likewise.
(default_addr_space_legitimate_address_p): Likewise.
* config/aarch64/aarch64.cc (aarch64_legitimate_address_hook_p): Adjust
with extra unnamed code_helper argument with default ERROR_MARK.
* config/alpha/alpha.cc (alpha_legitimate_address_p): Likewise.
* config/arc/arc.cc (arc_legitimate_address_p): Likewise.
* config/arm/arm-protos.h (arm_legitimate_address_p): Likewise.
(tree.h): New include for tree_code ERROR_MARK.
* config/arm/arm.cc (arm_legitimate_address_p): Adjust with extra
unnamed code_helper argument with default ERROR_MARK.
* config/avr/avr.cc (avr_addr_space_legitimate_address_p): Likewise.
* config/bfin/bfin.cc (bfin_legitimate_address_p): Likewise.
* config/bpf/bpf.cc (bpf_legitimate_address_p): Likewise.
* config/c6x/c6x.cc (c6x_legitimate_address_p): Likewise.
* config/cris/cris-protos.h (cris_legitimate_address_p): Likewise.
(tree.h): New include for tree_code ERROR_MARK.
* config/cris/cris.cc (cris_legitimate_address_p): Adjust with extra
unnamed code_helper argument with default ERROR_MARK.
* config/csky/csky.cc (csky_legitimate_address_p): Likewise.
* config/epiphany/epiphany.cc (epiphany_legitimate_address_p):
Likewise.
* config/frv/frv.cc (frv_legitimate_address_p): Likewise.
* config/ft32/ft32.cc (ft32_addr_space_legitimate_address_p): Likewise.
* config/gcn/gcn.cc (gcn_addr_space_legitimate_address_p): Likewise.
* config/h8300/h8300.cc (h8300_legitimate_address_p): Likewise.
* config/i386/i386.cc (ix86_legitimate_address_p): Likewise.
* config/ia64/ia64.cc (ia64_legitimate_address_p): Likewise.
* config/iq2000/iq2000.cc (iq2000_legitimate_address_p): Likewise.
* config/lm32/lm32.cc (lm32_legitimate_address_p): Likewise.
* config/loongarch/loongarch.cc (loongarch_legitimate_address_p):
Likewise.
* config/m32c/m32c.cc (m32c_legitimate_address_p): Likewise.
(m32c_addr_space_legitimate_address_p): Likewise.
* config/m32r/m32r.cc (m32r_legitimate_address_p): Likewise.
* config/m68k/m68k.cc (m68k_legitimate_address_p): Likewise.
* config/mcore/mcore.cc (mcore_legitimate_address_p): Likewise.
* config/microblaze/microblaze-protos.h (tree.h): New include for
tree_code ERROR_MARK.
(microblaze_legitimate_address_p): Adjust with extra unnamed
code_helper argument with default ERROR_MARK.
* config/microblaze/microblaze.cc (microblaze_legitimate_address_p):
Likewise.
* config/mips/mips.cc (mips_legitimate_address_p): Likewise.
* config/mmix/mmix.cc (mmix_legitimate_address_p): Likewise.
* config/mn10300/mn10300.cc (mn10300_legitimate_address_p): Likewise.
* config/moxie/moxie.cc (moxie_legitimate_address_p): Likewise.
* config/msp430/msp430.cc (msp430_legitimate_address_p): Likewise.
(msp430_addr_space_legitimate_address_p): Adjust with extra code_helper
argument with default ERROR_MARK and adjust the call to function
msp430_legitimate_address_p.
* config/nds32/nds32.cc (nds32_legitimate_address_p): Adjust with extra
unnamed code_helper argument with default ERROR_MARK.
* config/nios2/nios2.cc (nios2_legitimate_address_p): Likewise.
* config/nvptx/nvptx.cc (nvptx_legitimate_address_p): Likewise.
* config/or1k/or1k.cc (or1k_legitimate_address_p): Likewise.
* config/pa/pa.cc (pa_legitimate_address_p): Likewise.
* config/pdp11/pdp11.cc (pdp11_legitimate_address_p): Likewise.
* config/pru/pru.cc (pru_addr_space_legitimate_address_p): Likewise.
* config/riscv/riscv.cc (riscv_legitimate_address_p): Likewise.
* config/rl78/rl78-protos.h (rl78_as_legitimate_address): Likewise.
(tree.h): New include for tree_code ERROR_MARK.
* config/rl78/rl78.cc (rl78_as_legitimate_address): Adjust with
extra unnamed code_helper argument with default ERROR_MARK.
* config/rs6000/rs6000.cc (rs6000_legitimate_address_p): Likewise.
(rs6000_debug_legitimate_address_p): Adjust with extra code_helper
argument and adjust the call to function rs6000_legitimate_address_p.
* config/rx/rx.cc (rx_is_legitimate_address): Adjust with extra
unnamed code_helper argument with default ERROR_MARK.
* config/s390/s390.cc (s390_legitimate_address_p): Likewise.
* config/sh/sh.cc (sh_legitimate_address_p): Likewise.
* config/sparc/sparc.cc (sparc_legitimate_address_p): Likewise.
* config/v850/v850.cc (v850_legitimate_address_p): Likewise.
* config/vax/vax.cc (vax_legitimate_address_p): Likewise.
* config/visium/visium.cc (visium_legitimate_address_p): Likewise.
* config/xtensa/xtensa.cc (xtensa_legitimate_address_p): Likewise.
* config/stormy16/stormy16-protos.h (xstormy16_legitimate_address_p):
Likewise.
(tree.h): New include for tree_code ERROR_MARK.
* config/stormy16/stormy16.cc (xstormy16_legitimate_address_p):
Adjust with extra unnamed code_helper argument with default
ERROR_MARK.

commit | commitdiff | tree

liuhongt [Fri, 4 Aug 2023 01:27:39 +0000 (09:27 +0800)]

Workaround possible CPUID bug in Sandy Bridge.

Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
supported via EAX.

Intel documentation says invalid subleaves return 0. We had been
relying on that behavior instead of checking the max sublef number.

It appears that some Sandy Bridge CPUs return at least the subleaf 0
EDX value for subleaf 1. Best guess is that this is a bug in a
microcode patch since all of the bits we're seeing set in EDX were
introduced after Sandy Bridge was originally released.

This is causing avxvnniint16 to be incorrectly enabled with
-march=native on these CPUs.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features): Check
EAX for valid subleaf before use CPUID.

commit | commitdiff | tree

GCC Administrator [Wed, 9 Aug 2023 00:16:57 +0000 (00:16 +0000)]

Daily bump.

commit | commitdiff | tree

Jeff Law [Tue, 8 Aug 2023 21:32:38 +0000 (15:32 -0600)]

[committed] [RISC-V] Fix bug in condition canonicalization for zicond

Vineet's glibc build triggered an ICE building glibc with the latest zicond
bits. It's a minor issue in the canonicalization of the condition.

When we need to canonicalize the condition we use an SCC insn to handle the
primary comparison with the output going into a temporary with the final value
of 0/1 which we can then use in a zicond instruction.

The mode of the newly generated temporary was taken from mode of the final
destination. That's simply wrong. The mode of the condition needs to be
word_mode.

This patch fixes that minor problem and adds a suitable testcase.

gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Use word_mode
for the temporary when canonicalizing the condition.

gcc/testsuite
* gcc.target/riscv/zicond-ice-1.c: New test.

commit | commitdiff | tree

Marek Polacek [Mon, 31 Jul 2023 19:50:48 +0000 (15:50 -0400)]

c++: parser cleanup, remove dummy arguments

Now that cp_parser_constant_expression accepts a null non_constant_p,
we can transitively remove dummy arguments in the call chain.

Running dg.exp and counting the # of is_rvalue_constant_expression calls
from cp_parser_constant_expression:
pre-r14-2800: 2,459,145
this patch : 1,719,454

gcc/cp/ChangeLog:

* parser.cc (cp_parser_postfix_expression): Adjust the call to
cp_parser_braced_list.
(cp_parser_postfix_open_square_expression): Likewise.
(cp_parser_new_initializer): Likewise.
(cp_parser_assignment_expression): Adjust the call to
cp_parser_initializer_clause.
(cp_parser_lambda_introducer): Adjust the call to cp_parser_initializer.
(cp_parser_range_for): Adjust the call to cp_parser_braced_list.
(cp_parser_jump_statement): Likewise.
(cp_parser_mem_initializer): Likewise.
(cp_parser_template_argument): Likewise.
(cp_parser_default_argument): Adjust the call to cp_parser_initializer.
(cp_parser_initializer): Handle null is_direct_init and non_constant_p
arguments.
(cp_parser_initializer_clause): Handle null non_constant_p argument.
(cp_parser_braced_list): Likewise.
(cp_parser_initializer_list): Likewise.
(cp_parser_member_declaration): Adjust the call to
cp_parser_initializer_clause and cp_parser_initializer.
(cp_parser_yield_expression): Adjust the call to cp_parser_braced_list.
(cp_parser_functional_cast): Likewise.
(cp_parser_late_parse_one_default_arg): Adjust the call to
cp_parser_initializer.
(cp_parser_omp_for_loop_init): Likewise.
(cp_parser_omp_declare_reduction_exprs): Likewise.

commit | commitdiff | tree

Nathaniel Shead [Tue, 8 Aug 2023 02:48:43 +0000 (12:48 +1000)]

c++: Report invalid id-expression in decltype [PR100482]

This patch ensures that any errors raised by finish_id_expression when
parsing a decltype expression are properly reported, rather than
potentially going ignored and causing invalid code to be accepted.

We can also now remove the separate check for templates without args as
this is also checked for in finish_id_expression.

PR c++/100482

gcc/cp/ChangeLog:

* parser.cc (cp_parser_decltype_expr): Report errors raised by
finish_id_expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype-100482.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

commit | commitdiff | tree

Cupertino Miranda [Tue, 8 Aug 2023 10:12:00 +0000 (11:12 +0100)]

bpf: Fixed GC mistakes in BPF builtins code.

This patches fixes problems with GC within the CO-RE builtins
implementation.
List of included headers was also revised.

gcc/ChangeLog:

* config/bpf/core-builtins.cc: Cleaned include headers.
(struct cr_builtins): Added GTY.
(cr_builtins_ref): Created.
(builtins_data) Changed to GC root.
(allocate_builtin_data): Changed.
Included gt-core-builtins.h.
* config/bpf/coreout.cc: (bpf_core_extra) Added GTY.
(bpf_core_extra_ref): Created.
(bpf_comment_info): Changed to GC root.
(bpf_core_reloc_add, output_btfext_header, btf_ext_init): Changed.

commit | commitdiff | tree

Uros Bizjak [Tue, 8 Aug 2023 16:53:51 +0000 (18:53 +0200)]

i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

Also introduce -m[no-]partial-vector-fp-math option to disable trapping
V2SF named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.

The new option is enabled by default, because even with sanitization,
a small but consistent speed up of 2 to 3% with Polyhedron capacita
benchmark can be achieved vs. scalar code.

Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
vs. scalar code. This is what clang does by default, as it defaults
to -fno-trapping-math.

PR target/110832

gcc/ChangeLog:

* config/i386/i386.opt (mpartial-vector-fp-math): New option.
* config/i386/mmx.md (movq_<mode>_to_sse): Do not sanitize
upper part of V2SFmode register with -fno-trapping-math.
(<plusminusmult:insn>v2sf3): Enable for ix86_partial_vec_fp_math.
(divv2sf3): Ditto.
(<smaxmin:code>v2sf3): Ditto.
(sqrtv2sf2): Ditto.
(*mmx_haddv2sf3_low): Ditto.
(*mmx_hsubv2sf3_low): Ditto.
(vec_addsubv2sf3): Ditto.
(vec_cmpv2sfv2si): Ditto.
(vcond<V2FI:mode>v2sf): Ditto.
(fmav2sf4): Ditto.
(fmsv2sf4): Ditto.
(fnmav2sf4): Ditto.
(fnmsv2sf4): Ditto.
(fix_truncv2sfv2si2): Ditto.
(fixuns_truncv2sfv2si2): Ditto.
(floatv2siv2sf2): Ditto.
(floatunsv2siv2sf2): Ditto.
(nearbyintv2sf2): Ditto.
(rintv2sf2): Ditto.
(lrintv2sfv2si2): Ditto.
(ceilv2sf2): Ditto.
(lceilv2sfv2si2): Ditto.
(floorv2sf2): Ditto.
(lfloorv2sfv2si2): Ditto.
(btruncv2sf2): Ditto.
(roundv2sf2): Ditto.
(lroundv2sfv2si2): Ditto.
* doc/invoke.texi (x86 Options): Document
-mpartial-vector-fp-math option.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110832-1.c: New test.
* gcc.target/i386/pr110832-2.c: New test.
* gcc.target/i386/pr110832-3.c: New test.

commit | commitdiff | tree

Andrew Pinski [Mon, 7 Aug 2023 07:05:21 +0000 (00:05 -0700)]

VR-VALUES [PR28794]: optimize compare assignments also

This patch fixes the oldish (2006) bug where VRP was not
optimizing the comparison for assignments while handling
them for GIMPLE_COND only.
It just happens to also solves PR 103281 due to allowing
to optimize `c < 1` to `c == 0` and then we get
`(c == 0) == c` (which was handled by r14-2501-g285c9d04).

OK? Bootstrapped and tested on x86_64-linux-gnu with no
regressions.

PR tree-optimization/103281
PR tree-optimization/28794

gcc/ChangeLog:

* vr-values.cc (simplify_using_ranges::simplify_cond_using_ranges_1): Split out
majority to ...
(simplify_using_ranges::simplify_compare_using_ranges_1): Here.
(simplify_using_ranges::simplify_casted_cond): Rename to ...
(simplify_using_ranges::simplify_casted_compare): This
and change arguments to take op0 and op1.
(simplify_using_ranges::simplify_compare_assign_using_ranges_1): New method.
(simplify_using_ranges::simplify): For tcc_comparison assignments call
simplify_compare_assign_using_ranges_1.
* vr-values.h (simplify_using_ranges): Add
new methods, simplify_compare_using_ranges_1 and simplify_compare_assign_using_ranges_1.
Rename simplify_casted_cond and simplify_casted_compare and
update argument types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103281-1.c: New test.
* gcc.dg/tree-ssa/vrp-compare-1.c: New test.

commit | commitdiff | tree

Pan Li [Wed, 2 Aug 2023 07:59:24 +0000 (15:59 +0800)]

RISC-V: Enhance the test case for RVV vfsub/vfrsub rounding

This patch would like to enhance the vfsub/vfrsub rounding API test for
below 2 purposes.

* The non-rm API has no frm related insn generated.
* The rm API has the frm backup/restore/set insn generated.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-single-rsub.c: Enhance
cases.
* gcc.target/riscv/rvv/base/float-point-single-sub.c: Ditto.
Signed-off-by: Pan Li <pan2.li@intel.com>

commit | commitdiff | tree

Andrzej Turko [Mon, 7 Aug 2023 09:59:01 +0000 (11:59 +0200)]

genmatch: Log line numbers indirectly

Currently fprintf calls logging to a dump file take line numbers
in the match.pd file directly as arguments.
When match.pd is edited, referenced code changes line numbers,
which causes changes to many fprintf calls and, thus, to many
(usually all) .cc files generated by genmatch. This forces make
to (unnecessarily) rebuild many .o files.

This change replaces those logging fprintf calls with calls to
a dedicated logging function. Because it reads the line numbers
from the lookup table, it is enough to pass a corresponding index.
Thanks to this, when match.pd changes, it is enough to rebuild
the file containing the lookup table and, of course, those
actually affected by the change.

Signed-off-by: Andrzej Turko <andrzej.turko@gmail.com>
gcc/ChangeLog:

* genmatch.cc: Log line numbers indirectly.

commit | commitdiff | tree

Andrzej Turko [Mon, 7 Aug 2023 09:59:00 +0000 (11:59 +0200)]

genmatch: Reduce variability of generated code

So far genmatch has been using an unordered map to store information about
functions to be generated. Since corresponding locations from match.pd were
used as keys in the map, even small changes to match.pd which caused
line number changes would change the order in which the functions are
generated. This would reshuffle the functions between the generated .cc files.
This way even a minimal modification to match.pd forces recompilation of all
object files originating from match.pd on rebuild.

This commit makes sure that functions are generated in the order of their
processing (in contrast to the random order based on hashes of their
locations in match.pd). This is done by replacing the unordered map with an
ordered one. This way small changes to match.pd does not cause function
renaming and reshuffling among generated source files.
Together with the subsequent change to logging fprintf calls, this
removes unnecessary changes to the files generated by genmatch allowing
for reuse of already built object files during rebuild. The aim is to
make editing of match.pd and subsequent testing easier.

Signed-off-by: Andrzej Turko <andrzej.turko@gmail.com>
gcc/ChangeLog:

* genmatch.cc: Make sinfo map ordered.
* Makefile.in: Require the ordered map header for genmatch.o.

commit | commitdiff | tree

Andrzej Turko [Mon, 7 Aug 2023 09:58:59 +0000 (11:58 +0200)]

Support get_or_insert in ordered_hash_map

Get_or_insert method is already supported by the unordered hash map.
Adding it to the ordered map enables us to replace the unordered map
with the ordered one in cases where ordering may be useful.

Signed-off-by: Andrzej Turko <andrzej.turko@gmail.com>
gcc/ChangeLog:

* ordered-hash-map.h: Add get_or_insert.
* ordered-hash-map-tests.cc: Use get_or_insert in tests.

commit | commitdiff | tree

Juzhe-Zhong [Thu, 3 Aug 2023 01:58:35 +0000 (09:58 +0800)]

RISC-V: Support CALL conditional autovec patterns

This patch is depending on middle-end patch on vectorizable_call.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Before this patch (**NO** -ffast-math):
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

After this patch:
foo:
ble a3,zero,.L5
mv a6,a0
.L3:
vsetvli a5,a3,e8,mf4,ta,ma
vle32.v v0,0(a2)
vsetvli a7,zero,e32,m1,ta,ma
slli a4,a5,2
vmsne.vi v0,v0,0
sub a3,a3,a5
vsetvli zero,a5,e32,m1,tu,mu    ------> must be TUMU
vle32.v v2,0(a0),v0.t
vle32.v v1,0(a1),v0.t
vfadd.vv v1,v1,v2,v0.t   ------> generated by COND_LEN_ADD with real mask and len.
vse32.v v1,0(a6),v0.t
add a2,a2,a4
add a1,a1,a4
add a0,a0,a4
add a6,a6,a4
bne a3,zero,.L3
.L5:
ret

gcc/ChangeLog:

* config/riscv/autovec.md (cond_<optab><mode>): New pattern.
(cond_len_<optab><mode>): Ditto.
(cond_fma<mode>): Ditto.
(cond_len_fma<mode>): Ditto.
(cond_fnma<mode>): Ditto.
(cond_len_fnma<mode>): Ditto.
(cond_fms<mode>): Ditto.
(cond_len_fms<mode>): Ditto.
(cond_fnms<mode>): Ditto.
(cond_len_fnms<mode>): Ditto.
* config/riscv/riscv-protos.h (riscv_get_v_regno_alignment): Export
global.
(enum insn_type): Add new enum type.
(prepare_ternary_operands): New function.
* config/riscv/riscv-v.cc (emit_vlmax_masked_fp_mu_insn): Ditto.
(emit_nonvlmax_tumu_insn): Ditto.
(emit_nonvlmax_fp_tumu_insn): Ditto.
(expand_cond_len_binop): Add condtional operations.
(expand_cond_len_ternop): Ditto.
(prepare_ternary_operands): New function.
* config/riscv/riscv.cc (riscv_memmodel_needs_amo_release): Export
riscv_get_v_regno_alignment as global scope.
* config/riscv/vector.md: Fix ternary bugs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Add condition tests.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-9.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-9.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_shift_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-1.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-3.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-4.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-5.c: New test.

commit | commitdiff | tree

Richard Biener [Mon, 7 Aug 2023 12:44:20 +0000 (14:44 +0200)]

tree-optimization/49955 - BB reduction with odd number of lanes

The following enhances BB reduction vectorization to support
vectorizing only a subset of the lanes, keeping the rest as
scalar ops. For now we try to make the number of lanes even
by leaving alone the "last" lane. That's because SLP discovery
with all lanes will fail too soon to get us any hint on which
lane to strip and likewise we don't know what vector modes the
target supports so restricting ourselves to power-of-two or
other cases isn't easy.

This is enough to get at the vectorization opportunity for the
testcase in the PR - albeit with the chosen lanes not optimal
but at least vectorizable.

PR tree-optimization/49955
* tree-vectorizer.h (_slp_instance::remain_stmts): New.
(SLP_INSTANCE_REMAIN_STMTS): Likewise.
* tree-vect-slp.cc (vect_free_slp_instance): Release
SLP_INSTANCE_REMAIN_STMTS.
(vect_build_slp_instance): Make the number of lanes of
a BB reduction even.
(vectorize_slp_instance_root_stmt): Handle unvectorized
defs of a BB reduction.

* gfortran.dg/vect/pr49955.f: New testcase.

commit | commitdiff | tree

Ju-Zhe Zhong [Mon, 7 Aug 2023 09:38:12 +0000 (17:38 +0800)]

VECT: Support CALL vectorization for COND_LEN_*

Hi, Richard and Richi.

Base on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html

This patch choose (1) approach that Richard provided, meaning:

RVV implements cond_* optabs as expanders.  RVV therefore supports
both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
are needed at the gimple level.

Such approach can make codes much cleaner and reasonable.

Consider this following case:
void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i] + a[i];
}

Output of RISC-V (32-bits) gcc (trunk) (Compiler #3)
<source>:5:21: missed: couldn't vectorize loop
<source>:5:21: missed: not vectorized: control flow in loop.

ARM SVE:

...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vec_mask_and_55 = loop_mask_49 & mask__27.10_51;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);

For RVV, we want IR as follows:

...
_68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]);
...
mask__27.10_51 = vect__4.9_49 != { 0, ... };
...
vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0);
...

Both len and mask of COND_LEN_ADD are real not dummy.

This patch has been fully tested in RISC-V port with supporting both COND_* and COND_LEN_*.

And also, Bootstrap and Regression on X86 passed.

OK for trunk?

gcc/ChangeLog:

* internal-fn.cc (get_len_internal_fn): New function.
(DEF_INTERNAL_COND_FN): Ditto.
(DEF_INTERNAL_SIGNED_COND_FN): Ditto.
* internal-fn.h (get_len_internal_fn): Ditto.
* tree-vect-stmts.cc (vectorizable_call): Add CALL auto-vectorization.

commit | commitdiff | tree

Richard Biener [Tue, 8 Aug 2023 10:46:42 +0000 (12:46 +0200)]

tree-optimization/110924 - fix vop liveness for noreturn const CFG parts

The virtual operand live problem used by sinking assumes we have
virtual uses at each end point of the CFG but as shown in the PR
this isn't true for parts for example ending in __builtin_unreachable.
The following removes the optimization made possible by this and
now requires marking backedges.

PR tree-optimization/110924
* tree-ssa-live.h (virtual_operand_live): Update comment.
* tree-ssa-live.cc (virtual_operand_live::get_live_in): Remove
optimization, look at each predecessor.
* tree-ssa-sink.cc (pass_sink_code::execute): Mark backedges.

* gcc.dg/torture/pr110924.c: New testcase.

commit | commitdiff | tree

yulong [Tue, 8 Aug 2023 04:12:32 +0000 (12:12 +0800)]

RISC-V: Fix a bug that causes an error insn.

I test the following rvv intrinsics.
vint64m1_t test_vslide1up_vx_i64m1_m(vbool64_t mask, vint64m1_t src, int64_t value, size_t vl) {
  return __riscv_vslide1up_vx_i64m1_m(mask, src, value, vl);
}
And I got an error info,t hat is error:
  unrecognizable insn:(insn 17 16 18 2
    (set (reg:RVVMIDI 134 [ _1 ])(if_then_else:RVVMIDI
      (unspec:RVVMF64BI [(reg/v:SI 142 [ vl ])(const_int 2 [x2])(const_int 日 [o])(reg:SI 66 vl)(reg:SI 67 vtype)] UNSPEC_VPREDICATE
   (vec_merge:RVVMIDI (reg:RVVMIDI 134 [ _1 ])(unspec:RVVMIDI [(reg:sI 日 zero)] UNSPEC_VUNDEF)
   (reg/v:RVVMF64BI 137 [ mask ]))
   (unspec:RVVM1DI[(reg:sI 日 zero)] UNSPEC_VUNDEF)))

This patch fix it.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (slide1_sew64_helper): Modify.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vslide1down-1.c: New test.
* gcc.target/riscv/rvv/base/vslide1down-2.c: New test.
* gcc.target/riscv/rvv/base/vslide1down-3.c: New test.
* gcc.target/riscv/rvv/base/vslide1up-1.c: New test.
* gcc.target/riscv/rvv/base/vslide1up-2.c: New test.
* gcc.target/riscv/rvv/base/vslide1up-3.c: New test.

commit | commitdiff | tree

Stefan Schulze Frielinghaus [Tue, 8 Aug 2023 06:53:12 +0000 (08:53 +0200)]

rtl-optimization/110869 Fix tests cmp-mem-const-*.c for sparc

This fixes the rather new tests cmp-mem-const-{1,2,3,4,5,6}.c for sparc.
For -1 and -2 we need at least optimization level 2 on sparc.  For the
sake of homogeneity, change all test cases to -O2.  For -3 and -4 we do
not end up with a comparison of memory and a constant, and finally for
-5 and -6 the constants are reduced by a prior optimization which means
there is nothing left to do.  Thus excluding sparc from those tests.

gcc/testsuite/ChangeLog:

PR rtl-optimization/110869
* gcc.dg/cmp-mem-const-1.c: Use optimization level 2.
* gcc.dg/cmp-mem-const-2.c: Dito.
* gcc.dg/cmp-mem-const-3.c: Exclude sparc from this test.
* gcc.dg/cmp-mem-const-4.c: Dito.
* gcc.dg/cmp-mem-const-5.c: Dito.
* gcc.dg/cmp-mem-const-6.c: Dito.

commit | commitdiff | tree

Juzhe-Zhong [Tue, 8 Aug 2023 03:06:24 +0000 (11:06 +0800)]

RISC-V: Support neg VLS auto-vectorization

#include "riscv_vector.h"

#define DEF_OP_V(PREFIX, NUM, TYPE, OP)                                        \
  void __attribute__ ((noinline, noclone))                                     \
  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b)                    \
  {                                                                            \
    for (int i = 0; i < NUM; ++i)                                              \
      a[i] = OP b[i];                                                          \
  }

DEF_OP_V (neg, 16, int32_t, -)

After this patch:

neg_int32_t16:
vsetivli zero,16,e32,mf2,ta,ma
vle32.v v1,0(a1)
vneg.v v1,v1
vse32.v v1,0(a0)
ret

gcc/ChangeLog:

* config/riscv/autovec-vls.md (<optab><mode>2): Add VLS neg.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
* gcc.target/riscv/rvv/autovec/vls/neg-1.c: New test.

commit | commitdiff | tree

Juzhe-Zhong [Tue, 8 Aug 2023 01:33:05 +0000 (09:33 +0800)]

RISC-V: Support VLS shift vectorization

After this patch, this following case will be well optimized:

  void __attribute__ ((noinline, noclone))                                     \
  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c)  \
  {                                                                            \
    for (int i = 0; i < NUM; ++i)                                              \
      a[i] = b[i] OP c[i];                                                     \
  }

DEF_OP_VV (shift, 16, int32_t, >>)

ASM:
shift_int32_t16:
vsetivli zero,16,e32,mf2,ta,ma
vle32.v v1,0(a1)
vle32.v v2,0(a2)
vsra.vv v1,v1,v2
vse32.v v1,0(a0)
ret

gcc/ChangeLog:

* config/riscv/autovec.md: Add VLS shift.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS shift.
* gcc.target/riscv/rvv/autovec/vls/shift-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/shift-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/shift-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/shift-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/shift-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/shift-6.c: New test.

commit | commitdiff | tree

GCC Administrator [Tue, 8 Aug 2023 00:17:37 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Juzhe-Zhong [Mon, 7 Aug 2023 09:27:15 +0000 (17:27 +0800)]

RISC-V: Support VLS basic operation auto-vectorization

This patch support VLS modes auto-vectorization to enhance VLA auto-vectorization
when niters is known.

Consider this following case:

  void __attribute__ ((noinline, noclone))                                     \
  PREFIX##_##TYPE##NUM (TYPE *__restrict a, TYPE *__restrict b, TYPE *__restrict c)  \
  {                                                                            \
    for (int i = 0; i < NUM; ++i)                                              \
      a[i] = b[i] OP c[i];                                                     \
  }

DEF_OP_VV (plus, 16, int8_t, +)

Before this patch:

plus_int8_t16(signed char*, signed char*, signed char*):
        li      a5,16
        csrr    a4,vlenb
        bleu    a5,a4,.L2
        mv      a5,a4
.L2:
        vsetvli zero,a5,e8,m1,ta,ma
        vle8.v  v2,0(a1)
        vle8.v  v1,0(a2)
        vsetvli a4,zero,e8,m1,ta,ma
        vadd.vv v1,v1,v2
        vsetvli zero,a5,e8,m1,ta,ma
        vse8.v  v1,0(a0)
        ret

After this patch:

plus_int8_t16:
vsetivli zero,16,e8,m1,ta,ma
vle8.v v1,0(a2)
vle8.v v2,0(a1)
vadd.vv v1,v1,v2
vse8.v v1,0(a0)
ret

gcc/ChangeLog:

* config/riscv/autovec-vls.md (<optab><mode>3): Add VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add basic operations.
* gcc.target/riscv/rvv/autovec/vls/and-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/and-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/and-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/div-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ior-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ior-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ior-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/max-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/min-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/minus-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/minus-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/minus-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/mod-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/mult-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/plus-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/plus-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/plus-3.c: New test.

commit | commitdiff | tree

Jonathan Wakely [Mon, 7 Aug 2023 14:30:03 +0000 (15:30 +0100)]

libstdc++: Fix incorrect use of abs and log10 in std::format [PR110860]

The std::formatter implementation for floating-point types uses
__builtin_abs and __builtin_log10 to avoid including all of <cmath>, but
those functions are not generic. The result of abs(2e304) is -INT_MIN
which is undefined, and then log10(INT_MIN) is NaN. As well as being
undefined, we fail to grow the buffer correctly, and then loop more
times than needed to allocate a buffer and try formatting the value into
it again.

We can use if-constexpr to choose the correct form of log10 to use for
the type, and avoid using abs entirely. This avoids the undefined
behaviour and should mean we only reallocate and retry std::to_chars
once.

libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Do not use
__builtin_abs and __builtin_log10 with arbitrary floating-point
types.

commit | commitdiff | tree

Jonathan Wakely [Mon, 7 Aug 2023 13:37:25 +0000 (14:37 +0100)]

libstdc++: Constrain __format::_Iter_sink for contiguous iterators [PR110917]

We can't write to a span<_CharT> if the contiguous iterator has a value
type that isn't _CharT.

libstdc++-v3/ChangeLog:

PR libstdc++/110917
* include/std/format (__format::_Iter_sink<CharT, OutIter>):
Constrain partial specialization for contiguous iterators to
require the value type to be CharT.
* testsuite/std/format/functions/format_to.cc: New test.

commit | commitdiff | tree

Jonathan Wakely [Mon, 7 Aug 2023 10:19:23 +0000 (11:19 +0100)]

i386: Fix grammar typo in diagnostic

gcc/ChangeLog:

* config/i386/i386.cc (ix86_invalid_conversion): Fix grammar.

commit | commitdiff | tree

Jonathan Wakely [Thu, 3 Aug 2023 07:45:43 +0000 (08:45 +0100)]

libstdc++: Fix past-the-end increment in std::format [PR110862]

At the end of a replacement field we should check that the closing brace
is actually present before incrementing past it.

libstdc++-v3/ChangeLog:

PR libstdc++/110862
* include/std/format (_Scanner::_M_on_replacement_field):
Check for expected '}' before incrementing iterator.
* testsuite/std/format/string.cc: Check "{0:{0}" format string.

commit | commitdiff | tree

Indu Bhagat [Thu, 19 Jan 2023 07:17:49 +0000 (23:17 -0800)]

toplevel: Makefile.def: add install-strip dependency on libsframe

As noted in PR libsframe/30014 - FTBFS: install-strip fails because
bfdlib relinks and fails to find libsframe, the install time
dependencies of libbfd need to be updated.

ChangeLog:

* Makefile.def: Reflect that libsframe needs to installed before
libbfd. Reorder a bit to better track libsframe dependencies.
* Makefile.in: Regenerate.

commit | commitdiff | tree

Indu Bhagat [Tue, 15 Nov 2022 23:07:04 +0000 (15:07 -0800)]

bfd: linker: merge .sframe sections

The linker merges all the input .sframe sections.  When merging, the
linker verifies that all the input .sframe sections have the same
abi/arch.

The linker uses libsframe library to perform key actions on the
.sframe sections - decode, read, and create output data.  This
implies buildsystem changes to make and install libsframe before
libbfd.

The linker places the output .sframe section in a new segment of its
own: PT_GNU_SFRAME.  A new segment is not added, however, if the
generated .sframe section is empty.

When a section is discarded from the final link, the corresponding
entries in the .sframe section for those functions are also deleted.

The linker sorts the SFrame FDEs on start address by default and sets
the SFRAME_F_FDE_SORTED flag in the .sframe section.

This patch also adds support for generation of SFrame unwind
information for the .plt* sections on x86_64.  SFrame unwind info is
generated for IBT enabled PLT, lazy/non-lazy PLT.

The existing linker option --no-ld-generated-unwind-info has been
adapted to include the control of whether .sframe unwind information
will be generated for the linker generated sections like PLT.

Changes to the linker script have been made as necessary.

ChangeLog:

* Makefile.def: Add install dependency on libsframe for libbfd.
* Makefile.in: Regenerated.

commit | commitdiff | tree

Nick Alcock [Mon, 27 Sep 2021 19:31:21 +0000 (20:31 +0100)]

libtool.m4: augment symcode for Solaris 11

This reports common symbols like GNU nm, via a type code of 'C'.

ChangeLog:

* libtool.m4 (lt_cv_sys_global_symbol_pipe): Augment symcode for
Solaris 11.

gcc/ChangeLog:

* configure: Regenerate.

libatomic/ChangeLog:

* configure: Regenerate.

libbacktrace/ChangeLog:

* configure: Regenerate.

libcc1/ChangeLog:

* configure: Regenerate.

libffi/ChangeLog:

* configure: Regenerate.

libgfortran/ChangeLog:

* configure: Regenerate.

libgm2/ChangeLog:

* configure: Regenerate.

libgomp/ChangeLog:

* configure: Regenerate.

libitm/ChangeLog:

* configure: Regenerate.

libobjc/ChangeLog:

* configure: Regenerate.

libphobos/ChangeLog:

* configure: Regenerate.

libquadmath/ChangeLog:

* configure: Regenerate.

libsanitizer/ChangeLog:

* configure: Regenerate.

libssp/ChangeLog:

* configure: Regenerate.

libstdc++-v3/ChangeLog:

* configure: Regenerate.

libvtv/ChangeLog:

* configure: Regenerate.

lto-plugin/ChangeLog:

* configure: Regenerate.

zlib/ChangeLog:

* configure: Regenerate.

commit | commitdiff | tree

H.J. Lu [Tue, 28 Jul 2020 13:59:20 +0000 (06:59 -0700)]

PKG_CHECK_MODULES: Properly check if $pkg_cv_[]$1[]_LIBS works

There is no need to check $pkg_cv_[]$1[]_LIBS works if package check
failed.

config/ChangeLog:

* pkg.m4 (PKG_CHECK_MODULES): Use AC_TRY_LINK only if
$pkg_failed = no.

commit | commitdiff | tree

H.J. Lu [Tue, 28 Jul 2020 10:50:10 +0000 (03:50 -0700)]

PKG_CHECK_MODULES: Check if $pkg_cv_[]$1[]_LIBS works

It is quite normal to have headers without library on multilib OSes.
Add AC_TRY_LINK to PKG_CHECK_MODULES to check if $pkg_cv_[]$1[]_LIBS
works.

config/ChangeLog:

* pkg.m4 (PKG_CHECK_MODULES): Add AC_TRY_LINK to check if
$pkg_cv_[]$1[]_LIBS works.

commit | commitdiff | tree

John Ericson [Wed, 11 Aug 2021 12:17:54 +0000 (13:17 +0100)]

Deprecate a.out support for NetBSD targets.

As discussed previously, a.out support is now quite deprecated, and in
some cases removed, in both Binutils itself and NetBSD, so this legacy
default makes little sense. `netbsdelf*` and `netbsdaout*` still work
allowing the user to be explicit about there choice. Additionally, the
configure script warns about the change as Nick Clifton requested.

One possible concern was the status of NetBSD on NS32K, where only a.out
was supported. But per [1] NetBSD has removed support, and if it were to
come back, it would be with ELF. The binutils implementation is
therefore marked obsolete, per the instructions in the last message.

With that patch and this one applied, I have confirmed the following:

--target=i686-unknown-netbsd
--target=i686-unknown-netbsdelf
builds completely

--target=i686-unknown-netbsdaout
properly fails because target is deprecated.

--target=vax-unknown-netbsdaout builds completely except for gas, where
the target is deprecated.

[1]: https://mail-index.netbsd.org/tech-toolchain/2021/07/19/msg004025.html

config/ChangeLog:

* picflag.m4: Simplify SHmedia NetBSD match by presuming ELF.

gcc/ChangeLog:

* configure: Regenerate.

libada/ChangeLog:

* configure: Regenerate.

libgcc/ChangeLog:

* configure: Regenerate.

libiberty/ChangeLog:

* configure: Regenerate.

Mirror of https://gcc.gnu.org/git/gcc.git

RSS Atom