git.ipfire.org Git - thirdparty/gcc.git/log

Tobias Burnus [Sat, 14 Oct 2023 09:09:50 +0000 (11:09 +0200)]

libgomp.texi: Note to 'Memory allocation' sect and missing mem-memory routines

This commit completes the documentation of the OpenMP memory-management
routines, except for the unimplemented TR11 additions. It also makes clear
in the 'Memory allocation' section of the 'OpenMP-Implementation Specifics'
chapter under which condition OpenMP managed memory/allocators are used.

libgomp/ChangeLog:

* libgomp.texi: Fix some typos.
(Memory Management Routines): Document remaining 5.x routines.
(Memory allocation): Make clear when the section applies.

Tobias Burnus [Sat, 14 Oct 2023 09:07:47 +0000 (11:07 +0200)]

Fortran: Support OpenMP's 'allocate' directive for stack vars

gcc/fortran/ChangeLog:

* gfortran.h (ext_attr_t): Add omp_allocate flag.
* match.cc (gfc_free_omp_namelist): Avoid deleting same
u2.allocator multiple times now that a sequence can use
the same one.
* openmp.cc (gfc_match_omp_clauses, gfc_match_omp_allocate): Use
same allocator expr multiple times.
(is_predefined_allocator): Make static.
(gfc_resolve_omp_allocate): Update/extend restriction checks;
remove sorry message.
(resolve_omp_clauses): Reject coarrays in allocate/allocators
directive.
* parse.cc (check_omp_allocate_stmt): Permit procedure pointers
here (rejected later) for less misleading diagnostic.
* trans-array.cc (gfc_trans_auto_array_allocation): Propagate
size for GOMP_alloc and location to which it should be added to.
* trans-decl.cc (gfc_trans_deferred_vars): Handle 'omp allocate'
for stack variables; sorry for static variables/common blocks.
* trans-openmp.cc (gfc_trans_omp_clauses): Evaluate 'allocate'
clause's allocator only once; fix adding expressions to the
block.
(gfc_trans_omp_single): Pass a block to gfc_trans_omp_clauses.

gcc/ChangeLog:

* gimplify.cc (gimplify_bind_expr): Handle Fortran's
'omp allocate' for stack variables.

libgomp/ChangeLog:

* libgomp.texi (OpenMP Impl. Status): Mention that Fortran now
supports the allocate directive for stack variables.
* testsuite/libgomp.fortran/allocate-5.f90: New test.
* testsuite/libgomp.fortran/allocate-6.f90: New test.
* testsuite/libgomp.fortran/allocate-7.f90: New test.
* testsuite/libgomp.fortran/allocate-8.f90: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/allocate-14.c: Fix directive name.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Fix comment typo.
* gfortran.dg/gomp/allocate-4.f90: Remove sorry dg-error.
* gfortran.dg/gomp/allocate-7.f90: Likewise.
* gfortran.dg/gomp/allocate-10.f90: New test.
* gfortran.dg/gomp/allocate-11.f90: New test.
* gfortran.dg/gomp/allocate-12.f90: New test.
* gfortran.dg/gomp/allocate-13.f90: New test.
* gfortran.dg/gomp/allocate-14.f90: New test.
* gfortran.dg/gomp/allocate-15.f90: New test.
* gfortran.dg/gomp/allocate-8.f90: New test.
* gfortran.dg/gomp/allocate-9.f90: New test.

Jakub Jelinek [Sat, 14 Oct 2023 07:35:44 +0000 (09:35 +0200)]

middle-end: Allow _BitInt(65535) [PR102989]

The following patch lifts further restrictions which limited _BitInt to at
most 16319 bits up to 65535.
The problem was mainly in INTEGER_CST representation, which had 3
unsigned char members to describe lengths in number of 64-bit limbs, which
it wanted to fit into 32 bits.  This patch removes the third one which was
just a cache to save a few compile time cycles for wi::to_offset and
enlarges the other two members to unsigned short.
Furthermore, the same problem has been in some uses of trailing_wide_int*
(in value-range-storage*) and value-range-storage* itself, while other
uses of trailing_wide_int* have been fine (e.g. CONST_POLY_INT, where no
constants will be larger than 3/5/9/11 limbs depending on target, so 255
limit is plenty).  The patch turns all those length representations to be
unsigned short for consistency, so value-range-storage* can handle even
16320-65535 bits BITINT_TYPE ranges.  The cc1plus growth is about 16K,
so not really significant for 38M .text section.

Note, the reason for the new limit is the
  unsigned int precision : 16;
TYPE_PRECISION limit.  If we wanted to overcome that, TYPE_PRECISION would
need to use a different member for BITINT_TYPE than for all the other types,
and that way we could reach a limit of 4194239 bits (65535 * 64 - 1, again
implied by INTEGER_CST and value-range-storage*).  Dunno if that is
worth it or if it is something we want to do for GCC 14 though.
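
The limb arithmetic behind these limits can be sketched as follows. This is
only an illustration: the helper names are hypothetical and not part of GCC,
and the "- 1" is copied from the "65535 * 64 - 1" figure above (the old
16319-bit limit is consistent with the same formula applied to a maximum
length of 255).

```c
#include <assert.h>
#include <limits.h>

/* Bits per limb, as stated in the commit message (64-bit limbs).  */
#define LIMB_BITS 64u

/* Hypothetical helper: number of 64-bit limbs needed to hold PRECISION
   bits, rounded up.  */
unsigned
limbs_for_precision (unsigned precision)
{
  assert (precision > 0);
  return (precision + LIMB_BITS - 1) / LIMB_BITS;
}

/* Hypothetical helper: largest precision reachable when the limb count
   has to fit in a length field whose maximum value is MAX_LEN.  The
   "- 1" mirrors the "65535 * 64 - 1" figure in the commit message.  */
unsigned long
max_precision_for_len_field (unsigned long max_len)
{
  return max_len * LIMB_BITS - 1;
}
```

With an unsigned char length (maximum 255) the second helper yields 16319,
and a _BitInt(65535) needs 1024 limbs, which no longer fits in an
unsigned char but does fit in an unsigned short.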

2023-10-14  Jakub Jelinek  <jakub@redhat.com>

PR c/102989
gcc/
* tree-core.h (struct tree_base): Remove int_length.offset
member, change type of int_length.unextended and int_length.extended
from unsigned char to unsigned short.
* tree.h (TREE_INT_CST_OFFSET_NUNITS): Remove.
(wi::extended_tree <N>::get_len): Don't use TREE_INT_CST_OFFSET_NUNITS,
instead compute it at runtime from TREE_INT_CST_EXT_NUNITS and
TREE_INT_CST_NUNITS.
* tree.cc (wide_int_to_tree_1): Don't assert
TREE_INT_CST_OFFSET_NUNITS value.
(make_int_cst): Don't initialize TREE_INT_CST_OFFSET_NUNITS.
* wide-int.h (WIDE_INT_MAX_ELTS): Change from 255 to 1024.
(WIDEST_INT_MAX_ELTS): Change from 510 to 2048, adjust comment.
(trailing_wide_int_storage): Change m_len type from unsigned char *
to unsigned short *.
(trailing_wide_int_storage::trailing_wide_int_storage): Change second
argument from unsigned char * to unsigned short *.
(trailing_wide_ints): Change m_max_len type from unsigned char to
unsigned short.  Change m_len element type from
struct{unsigned char len;} to unsigned short.
(trailing_wide_ints <N>::operator []): Remove .len from m_len
accesses.
* value-range-storage.h (irange_storage::lengths_address): Change
return type from const unsigned char * to const unsigned short *.
(irange_storage::write_lengths_address): Change return type from
unsigned char * to unsigned short *.
* value-range-storage.cc (irange_storage::write_lengths_address):
Likewise.
(irange_storage::lengths_address): Change return type from
const unsigned char * to const unsigned short *.
(write_wide_int): Change len argument type from unsigned char *&
to unsigned short *&.
(irange_storage::set_irange): Change len variable type from
unsigned char * to unsigned short *.
(read_wide_int): Change len argument type from unsigned char to
unsigned short.  Use trailing_wide_int_storage <unsigned short>
instead of trailing_wide_int_storage and
trailing_wide_int <unsigned short> instead of trailing_wide_int.
(irange_storage::get_irange): Change len variable type from
unsigned char * to unsigned short *.
(irange_storage::size): Multiply n by sizeof (unsigned short)
in len_size variable initialization.
(irange_storage::dump): Change len variable type from
unsigned char * to unsigned short *.
gcc/cp/
* module.cc (trees_out::start, trees_in::start): Remove
TREE_INT_CST_OFFSET_NUNITS handling.
gcc/testsuite/
* gcc.dg/bitint-38.c: Change into dg-do run test, in addition
to checking the addition, division and right shift results at compile
time check it also at runtime.
* gcc.dg/bitint-39.c: New test.

Juzhe-Zhong [Sat, 14 Oct 2023 03:06:02 +0000 (11:06 +0800)]

RISC-V: Remove redundant iterators.

These iterators are redundant; removed and committed.
gcc/ChangeLog:

* config/riscv/vector-iterators.md: Remove redundant iterators.

GCC Administrator [Sat, 14 Oct 2023 00:16:40 +0000 (00:16 +0000)]

Daily bump.

Harald Anlauf [Wed, 11 Oct 2023 19:29:35 +0000 (21:29 +0200)]

Fortran: name conflict between internal procedure and derived type [PR104351]

gcc/fortran/ChangeLog:

PR fortran/104351
* decl.cc (get_proc_name): Extend name conflict detection between
internal procedure and previous declaration also to derived type.

gcc/testsuite/ChangeLog:

PR fortran/104351
* gfortran.dg/derived_function_interface_1.f90: Adjust pattern.
* gfortran.dg/pr104351.f90: New test.

Harald Anlauf [Fri, 6 Oct 2023 20:21:56 +0000 (22:21 +0200)]

fortran: fix handling of options -ffpe-trap and -ffpe-summary [PR110957]

gcc/fortran/ChangeLog:

PR fortran/110957
* invoke.texi: Update documentation to reflect '-ffpe-trap=none'.
* options.cc (gfc_handle_fpe_option): Fix mix-up of error messages
for options -ffpe-trap and -ffpe-summary. Accept '-ffpe-trap=none'
to clear FPU traps previously set on command line.

Andrew MacLeod [Thu, 12 Oct 2023 21:06:36 +0000 (17:06 -0400)]

Do not add partial equivalences with no uses.

PR tree-optimization/111622
* value-relation.cc (equiv_oracle::add_partial_equiv): Do not
register a partial equivalence if an operand has no uses.

Richard Biener [Fri, 13 Oct 2023 10:32:51 +0000 (12:32 +0200)]

OMP SIMD inbranch call vectorization for AVX512 style masks

The following teaches vectorizable_simd_clone_call to handle
integer mode masks. The tricky bit is to second-guess the
number of lanes represented by a single mask argument - the following
uses simdlen and the number of mask arguments to calculate that,
assuming ABIs have them uniform.

Similar to the VOIDmode handling there's a restriction on not
supporting splitting/merging of incoming vector masks to
more/less SIMD call arguments.
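
A minimal sketch of the kind of loop this concerns, not the committed
testcase: the function names are made up for illustration. The last helper
only restates the lane calculation described above (simdlen divided by the
number of mask arguments, assuming the ABI keeps them uniform).

```c
#include <assert.h>

/* With OpenMP SIMD enabled, the compiler may emit masked ("inbranch")
   vector clones of this function.  Without it, the pragma is ignored
   and this stays plain scalar C.  */
#pragma omp declare simd inbranch
int
clamp_add (int x, int limit)
{
  return x + 1 > limit ? limit : x + 1;
}

/* The call is conditional, so a vectorized loop has to pass a mask
   telling the clone which lanes are active.  */
void
bump_positive (int *a, int n, int limit)
{
  for (int i = 0; i < n; i++)
    if (a[i] > 0)
      a[i] = clamp_add (a[i], limit);
}

/* Hypothetical helper: lanes represented by one mask argument when the
   lanes are split uniformly over N_MASK_ARGS arguments.  */
unsigned
lanes_per_mask_arg (unsigned simdlen, unsigned n_mask_args)
{
  assert (n_mask_args > 0);
  return simdlen / n_mask_args;
}
```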

PR tree-optimization/111795
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle
integer mode mask arguments.

* gcc.target/i386/vect-simd-clone-avx512-1.c: New testcase.
* gcc.target/i386/vect-simd-clone-avx512-2.c: Likewise.
* gcc.target/i386/vect-simd-clone-avx512-3.c: Likewise.

Richard Biener [Thu, 12 Oct 2023 12:25:07 +0000 (14:25 +0200)]

Add support for SLP vectorization of OpenMP SIMD clone calls

This adds support for SLP vectorization of OpenMP SIMD clone calls.
There's a complication when vectorizing calls involving virtual
operands, since for the first time these are no longer only leaves
(loads or stores).  With SLP this runs into the issue that placement of
the vectorized stmts is not necessarily at one of the original
scalar stmts which leads to the magic updating virtual operands
in vect_finish_stmt_generation not working.  So we run into the
assert that updating virtual operands isn't necessary.  I've
papered over this similar to how we do for mismatched const/pure
attribution by setting vinfo->any_known_not_updated_vssa.

I've added two basic testcases with multi-lane SLP and verified
that with single-lane SLP enabled the rest of the existing testcases
pass.
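
As a rough illustration of what multi-lane SLP means here (an assumed shape,
not one of the committed testcases, with hypothetical names): two adjacent
lanes per iteration each call the same SIMD-clone candidate, so the two calls
can form one SLP group.

```c
#include <assert.h>

/* Candidate for SIMD clones when OpenMP SIMD is enabled; otherwise the
   pragma is ignored.  */
#pragma omp declare simd notinbranch
int
scale (int x)
{
  return 3 * x;
}

/* Each iteration handles two adjacent elements (two lanes) with the
   same call, which is the grouping SLP looks for.  */
void
scale_pairs (int *restrict out, const int *restrict in, int n_pairs)
{
  assert (n_pairs >= 0);
  for (int i = 0; i < n_pairs; i++)
    {
      out[2 * i] = scale (in[2 * i]);
      out[2 * i + 1] = scale (in[2 * i + 1]);
    }
}
```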

* tree-vect-slp.cc (mask_call_maps): New.
(vect_get_operand_map): Handle IFN_MASK_CALL.
(vect_build_slp_tree_1): Likewise.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle
SLP.

* gcc.dg/vect/slp-simd-clone-1.c: New testcase.
* gcc.dg/vect/slp-simd-clone-2.c: Likewise.

Juzhe-Zhong [Fri, 13 Oct 2023 06:01:26 +0000 (14:01 +0800)]

RISC-V Regression: Fix FAIL of bb-slp-68.c for RVV

As the comment says, this test fails with 64-byte vectors.
Both RVV and GCN have 64-byte vectors.

So it's more reasonable to use vect512.
gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-68.c: Use vect512.

Pan Li [Wed, 11 Oct 2023 12:08:52 +0000 (20:08 +0800)]

RISC-V: Refine run test cases of math autovec

For the run test cases of math autovec, we need a reference value to
check whether the return value is as expected.

The previous patch hard-coded the reference value, but we can use the
scalar math function instead. For example, for ceil after autovec, the
hard-coded form is:

ASSERT (CEIL (Vector {1.2,...}) == Vector {2.0, ...});

Using the scalar math function as the reference avoids potential mistakes:

ASSERT (CEIL (Vector {1.2,...}) == Vector {ceil (1.2), ...});

This patch also removes some fflags checks, as they are already covered by check-body.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-1.c:
Use scalar func as reference instead of hardcode.
* gcc.target/riscv/rvv/autovec/unop/math-ceil-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-2.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 13 Oct 2023 09:48:25 +0000 (17:48 +0800)]

RISC-V: Add test for FP llfloor auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

long long llfloor (double);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 13 Oct 2023 08:20:23 +0000 (16:20 +0800)]

RISC-V: Add test for FP ifloor auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

int ifloor (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 13 Oct 2023 07:38:09 +0000 (15:38 +0800)]

RISC-V: Add test for FP iceil auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

int iceil (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 13 Oct 2023 07:16:27 +0000 (15:16 +0800)]

RISC-V: Add test for FP llceil auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

long long llceil (double);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Florian Weimer [Fri, 13 Oct 2023 07:34:55 +0000 (09:34 +0200)]

C99 testsuite readiness: Some verified test case adjustments

The updated test cases still reproduce the bugs with old compilers.

gcc/testsuite/

* gcc.c-torture/compile/pr44485.c (func_21): Add missing cast.
* gcc.c-torture/compile/pr106101.c: Use builtins to avoid
calls to undeclared functions. Change type of yyvsp to
char ** and introduce yyvsp1 to avoid type errors.
* gcc.c-torture/execute/pr111331-1.c: Add missing int.
* gcc.dg/pr100512.c: Unreduce test case and suppress only
-Wpointer-to-int-cast.
* gcc.dg/pr103003.c: Likewise.
* gcc.dg/pr103451.c: Add cast to long and suppress
-Wdiv-by-zero only.
* gcc.dg/pr68435.c: Avoid implicit int and missing
static function implementation warning.

Florian Weimer [Fri, 13 Oct 2023 07:34:37 +0000 (09:34 +0200)]

C99 test suite readiness: Some unverified test case adjustments

These changes are assumed not to interfere with the test objective,
but it was not possible to reproduce the historic test case failures
(with or without the modification here).

gcc/testsuite/

* gcc.c-torture/compile/20000105-1.c: Add missing int return type.
Call __builtin_exit instead of exit.
* gcc.c-torture/compile/20000105-2.c: Add missing void types.
* gcc.c-torture/compile/20000211-1.c (Lstream_fputc, Lstream_write)
(Lstream_flush_out, parse_doprnt_spec): Add missing function
declaration.
* gcc.c-torture/compile/20000224-1.c (call_critical_lisp_code):
Declare.
* gcc.c-torture/compile/20000314-2.c: Add missing void types.
* gcc.c-torture/compile/980816-1.c (XtVaCreateManagedWidget)
(XtAddCallback): Likewise.
* gcc.c-torture/compile/pr49474.c: Use struct
gfc_formal_arglist * instead of (implied) int type.
* gcc.c-torture/execute/20001111-1.c (foo): Add cast to
char *.
(main): Call __builtin_abort and __builtin_exit.

Florian Weimer [Fri, 13 Oct 2023 07:34:37 +0000 (09:34 +0200)]

C99 test suite readiness: Mark some C89 tests

Add -std=gnu89 to some tests which evidently target C89-only language
features.

gcc/testsuite/

* gcc.c-torture/compile/920501-11.c: Compile with -std=gnu89.
* gcc.c-torture/compile/920501-23.c: Likewise.
* gcc.c-torture/compile/920501-8.c: Likewise.
* gcc.c-torture/compile/920701-1.c: Likewise.
* gcc.c-torture/compile/930529-1.c: Likewise.

Florian Weimer [Fri, 13 Oct 2023 07:34:37 +0000 (09:34 +0200)]

or1k: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/or1k/linux-unwind.h (or1k_fallback_frame_state): Add
missing cast.

Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)]

arc: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/arc/linux-unwind.h (arc_fallback_frame_state): Add
missing cast.

Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)]

riscv: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/riscv/linux-unwind.h (riscv_fallback_frame_state): Add
missing cast.

Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)]

csky: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/csky/linux-unwind.h (csky_fallback_frame_state): Add
missing cast.

Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)]

m68k: Avoid implicit function declaration in libgcc

libgcc/

* config/m68k/fpgnulib.c (__cmpdf2): Declare.

Jakub Jelinek [Fri, 13 Oct 2023 07:09:32 +0000 (09:09 +0200)]

libstdc++: Fix tr1/8_c_compatibility/cstdio/functions.cc regression with recent glibc

The following testcase started FAILing recently after the
https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554
glibc change which marked vfscanf with nonnull (1) attribute.
While vfwscanf hasn't been marked similarly (strangely), the patch changes
that too. By using va_arg one hides the value of it from the compiler
(volatile keyword would do too, or making the FILE* stream a function
argument, but then it might need to be guarded by #if or something).
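
The va_arg idea can be sketched in plain C as below. This is an assumed
illustration with a hypothetical function name, not the libstdc++ test
itself: because the stream is read from the variadic arguments, the function
body contains no literal null pointer that could be flagged when the value is
later passed to a function marked nonnull.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: return the FILE * passed as the first variadic
   argument.  Inside this function the value is not a visible constant,
   unlike a stream initialized to 0.  */
FILE *
stream_from_varargs (int count, ...)
{
  assert (count >= 1);	/* At least one variadic argument is expected.  */
  va_list ap;
  va_start (ap, count);
  FILE *stream = va_arg (ap, FILE *);
  va_end (ap);
  return stream;
}
```

A null pointer passed this way has to be written as (FILE *) 0 so that it has
the type va_arg expects.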

2023-10-13 Jakub Jelinek <jakub@redhat.com>

* testsuite/tr1/8_c_compatibility/cstdio/functions.cc (test01):
Initialize stream to va_arg(ap, FILE*) rather than 0.
* testsuite/tr1/8_c_compatibility/cwchar/functions.cc (test01):
Likewise.

Richard Biener [Thu, 12 Oct 2023 09:34:57 +0000 (11:34 +0200)]

tree-optimization/111779 - Handle some BIT_FIELD_REFs in SRA

The following handles byte-aligned, power-of-two and byte-multiple
sized BIT_FIELD_REF reads in SRA. In particular this should cover
BIT_FIELD_REFs created by optimize_bit_field_compare.

For gcc.dg/tree-ssa/ssa-dse-26.c we now SRA the BIT_FIELD_REF
appearing there leading to more DSE, fully eliding the aggregates.

This results in the same false positive -Wuninitialized as the
older attempt to remove the folding from optimize_bit_field_compare,
fixed by initializing part of the aggregate unconditionally.

PR tree-optimization/111779
gcc/
* tree-sra.cc (sra_handled_bf_read_p): New function.
(build_access_from_expr_1): Handle some BIT_FIELD_REFs.
(sra_modify_expr): Likewise.
(make_fancy_name_1): Skip over BIT_FIELD_REF.

gcc/fortran/
* trans-expr.cc (gfc_trans_assignment_1): Initialize
lhs_caf_attr and rhs_caf_attr codimension flag to avoid
false positive -Wuninitialized.

gcc/testsuite/
* gcc.dg/tree-ssa/ssa-dse-26.c: Adjust for more DSE.
* gcc.dg/vect/vect-pr111779.c: New testcase.

Richard Biener [Thu, 12 Oct 2023 08:13:58 +0000 (10:13 +0200)]

tree-optimization/111773 - avoid CD-DCE of noreturn special calls

The support to elide calls to allocation functions in DCE runs into
the issue that when implementations are discovered noreturn we end
up DCEing the calls anyway, leaving blocks without termination and
without outgoing edges which is both invalid IL and wrong-code when
as in the example the noreturn call would throw. The following
avoids taking advantage of both noreturn and the ability to elide
allocation at the same time.

For the testcase it's valid to throw or return 10 by eliding the
allocation. But we have to do either where currently we'd run
off the function.

PR tree-optimization/111773
* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Do
not elide noreturn calls that are reflected to the IL.

* g++.dg/torture/pr111773.C: New testcase.

Pan Li [Fri, 13 Oct 2023 06:13:26 +0000 (14:13 +0800)]

RISC-V: Add test for FP llround auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

long long llround (double);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llround-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Juzhe-Zhong [Fri, 13 Oct 2023 05:45:19 +0000 (13:45 +0800)]

RISC-V Regression: Fix FAIL of bb-slp-pr69907.c for RVV

Like ARM SVE and GCN, add RVV.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-pr69907.c: Add RVV.

Pan Li [Fri, 13 Oct 2023 04:11:56 +0000 (12:11 +0800)]

RISC-V: Add test for FP iroundf auto vectorization

The FP API below is already supported because it shares the same
standard name and machine mode.

int iroundf (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iround-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Kito Cheng [Tue, 3 Oct 2023 02:27:24 +0000 (10:27 +0800)]

RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

riscv_legitimize_poly_move was expected to ensure the poly value is at most 32
times smaller than the minimal VLEN (32 being derived from '4096 / 128').
This assumption held when our mode modeling was not so precisely defined.
However, now that we have modeled the mode size according to the correct minimal
VLEN info, the size difference between different RVV modes can be up to 64
times. For instance, comparing RVVMF64BI and RVVMF1BI, the sizes are [1, 1]
versus [64, 64] respectively.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_poly_move): Bump
max_power to 64.
* config/riscv/riscv.h (MAX_POLY_VARIANT): New.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/autovec/bug-01.C: New.
* g++.target/riscv/rvv/rvv.exp: Add autovec folder.

Pan Li [Fri, 13 Oct 2023 02:17:36 +0000 (10:17 +0800)]

RISC-V: Leverage stdint-gcc.h for RVV test cases

Use stdint-gcc.h for the int64_t type instead of a local typedef;
otherwise we may conflict with stdint-gcc.h elsewhere.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include
stdint-gcc.h for int types.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/test-math.h: Remove int64_t
typedef.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 13 Oct 2023 01:30:55 +0000 (09:30 +0800)]

RISC-V: Support FP lfloor/lfloorf auto vectorization

This patch adds support for FP lfloor/lfloorf auto vectorization.

* long lfloor (double) for rv64
* long lfloorf (float) for rv32

Due to the limitation that only data types of the same size are allowed
in the vectorizer, the standard name lfloormn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lfloor (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lfloor (in[i]);
}

Before this patch:
.L3:
  ...
  fld         fa5,0(a1)
  fcvt.l.d    a5,fa5,rdn
  sd          a5,-8(a0)
  ...
  bne         a1,a4,.L3

After this patch:
  frrm        a6
  ...
  fsrmi       2 // RDN
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm        a6

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI,
will be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lfloor<mode><v_i_l_ll_convert>2): New
pattern for lfloor/lfloorf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lfloor): New func decl for expanding lfloor.
* config/riscv/riscv-v.cc (expand_vec_lfloor): New func impl
for expanding lfloor.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Hans-Peter Nilsson [Wed, 4 Oct 2023 02:16:18 +0000 (04:16 +0200)]

testsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word

These tests actually use a form of atomic compare and exchange
operation, not just atomic loading and storing. Some targets (not
supported by e.g. libatomic) have atomic loading and storing, but not
compare and exchange, yielding linker errors for missing library
functions.
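
The distinction can be illustrated with C11 atomics (a sketch with
hypothetical function names; the affected tests are C++ and use std::atomic).
The first function needs only atomic load and store, while the second needs a
compare-and-exchange, which on some targets becomes a call to a library
function that may not be available.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Uses only atomic load and store.  */
int
load_then_store (atomic_int *obj, int new_value)
{
  assert (obj);
  int old = atomic_load (obj);
  atomic_store (obj, new_value);
  return old;
}

/* Uses compare-and-exchange: store DESIRED only if *OBJ currently
   equals EXPECTED, and report whether the exchange happened.  */
bool
replace_if_equal (atomic_int *obj, int expected, int desired)
{
  assert (obj);
  return atomic_compare_exchange_strong (obj, &expected, desired);
}
```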

This change is just for existing uses of
dg-require-thread-fence. It does not fix any other tests
that should also be gated on dg-require-atomic-cmpxchg-word.

* testsuite/29_atomics/atomic/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_flag/clear/1.cc,
testsuite/29_atomics/atomic_flag/cons/value_init.cc,
testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc,
testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc,
testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_ref/generic.cc,
testsuite/29_atomics/atomic_ref/integral.cc,
testsuite/29_atomics/atomic_ref/pointer.cc: Replace
dg-require-thread-fence with dg-require-atomic-cmpxchg-word.

Hans-Peter Nilsson [Wed, 4 Oct 2023 01:40:25 +0000 (03:40 +0200)]

testsuite: Add dg-require-atomic-cmpxchg-word

Some targets (armv6-m) support inline atomic load and store,
i.e. dg-require-thread-fence matches, but not atomic operations like
compare and exchange.

This directive can be used to replace uses of dg-require-thread-fence
where an atomic operation is actually used.

* testsuite/lib/dg-options.exp (dg-require-atomic-cmpxchg-word):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_atomic_cmpxchg_word):
Ditto.

GCC Administrator [Fri, 13 Oct 2023 00:18:18 +0000 (00:18 +0000)]

Daily bump.

Pan Li [Thu, 12 Oct 2023 14:07:56 +0000 (22:07 +0800)]

RISC-V: Support FP lceil/lceilf auto vectorization

This patch adds support for FP lceil/lceilf auto vectorization.

* long lceil (double) for rv64
* long lceilf (float) for rv32

Due to the limitation that only data types of the same size are allowed
in the vectorizer, the standard name lceilmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lceil (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lceil (in[i]);
}

Before this patch:
.L3:
  ...
  fld         fa5,0(a1)
  fcvt.l.d    a5,fa5,rup
  sd          a5,-8(a0)
  ...
  bne         a1,a4,.L3

After this patch:
  frrm        a6
  ...
  fsrmi       3 // RUP
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm        a6

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI,
will be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lceil<mode><v_i_l_ll_convert>2): New
pattern for lceil/lceilf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lceil): New func decl for expanding lceil.
* config/riscv/riscv-v.cc (expand_vec_lceil): New func impl
for expanding lceil.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Michael Meissner [Thu, 12 Oct 2023 20:17:59 +0000 (16:17 -0400)]

PR111778, PowerPC: Do not depend on an undefined shift

I was building a cross compiler to PowerPC on my x86_64 workstation with the
latest version of GCC on October 11th.  I could not build the compiler on the
x86_64 system as it died in building libgcc.  I looked into it, and I
discovered the compiler was recursing until it ran out of stack space.  If I
build a native compiler with the same sources on a PowerPC system, it builds
fine.

I traced this down to a change made around October 10th:

| commit 8f1a70a4fbcc6441c70da60d4ef6db1e5635e18a (HEAD)
| Author: Jiufu Guo <guojiufu@linux.ibm.com>
| Date:   Tue Jan 10 20:52:33 2023 +0800
|
|   rs6000: build constant via li/lis;rldicl/rldicr
|
|   If a constant is possible left/right cleaned on a rotated value from
|   a negative value of "li/lis".  Then, using "li/lis ; rldicl/rldicr"
|   to build the constant.

The code was doing a -1 << 64, which is undefined behavior, so different
machines produce different results.  On the x86_64 system, (-1 << 64) produces
-1 while on a PowerPC 64-bit system, (-1 << 64) produces 0.  The x86_64 then
recurses until the stack runs out of space.

If I apply this patch, the compiler builds fine both on x86_64 as a PowerPC
cross compiler and on a native PowerPC system.

2023-10-12  Michael Meissner  <meissner@linux.ibm.com>

gcc/

PR target/111778
* config/rs6000/rs6000.cc (can_be_built_by_li_lis_and_rldicl): Protect
code from shifts that are undefined.
(can_be_built_by_li_lis_and_rldicr): Likewise.
(can_be_built_by_li_and_rldic): Protect code from shifts that are
undefined.  Also replace uses of 1ULL with HOST_WIDE_INT_1U.

Tobias Burnus [Thu, 12 Oct 2023 19:00:58 +0000 (21:00 +0200)]

libgomp.texi: Clarify OMP_TARGET_OFFLOAD=mandatory

In OpenMP 5.0/5.1, the semantic of OMP_TARGET_OFFLOAD=mandatory was
insufficiently specified; 5.2 clarified this with extensions/clarifications
(omp_initial_device, omp_invalid_device, "conforming device number").
GCC's implementation matches OpenMP 5.2.

libgomp/ChangeLog:

* libgomp.texi (OMP_DEFAULT_DEVICE): Update spec ref; add @ref to
OMP_TARGET_OFFLOAD.
(OMP_TARGET_OFFLOAD): Update spec ref; add @ref to OMP_DEFAULT_DEVICE;
clarify MANDATORY behavior.

Alex Coplan [Thu, 12 Oct 2023 16:49:20 +0000 (17:49 +0100)]

reg-notes.def: Fix up description of REG_NOALIAS

The description of the REG_NOALIAS note in reg-notes.def isn't quite
right. It describes it as being attached to call insns, but it is
instead attached to a move insn receiving the return value from a call.

This can be seen by looking at the code in calls.cc:expand_call which
attaches the note:

  emit_move_insn (temp, valreg);

  /* The return value from a malloc-like function cannot alias
     anything else.  */
  last = get_last_insn ();
  add_reg_note (last, REG_NOALIAS, temp);

gcc/ChangeLog:

* reg-notes.def (NOALIAS): Correct comment.

Christoph Müllner [Mon, 9 Oct 2023 22:40:35 +0000 (00:40 +0200)]

RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering

Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA")
A recent change broke the xtheadcondmov-indirect tests, because the order of
emitted instructions changed. Since the tests are too strict when testing for
a fixed instruction order, let's change them to simply count instructions,
as is done for similar tests.

Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against
instruction reordering.

Jakub Jelinek [Thu, 12 Oct 2023 15:20:36 +0000 (17:20 +0200)]

wide-int: Fix build with gcc < 12 or clang++ [PR111787]

While my wide_int patch bootstrapped/regtested fine when I used GCC 12
as system gcc, apparently it doesn't with GCC 11 and older or clang++.
For GCC before PR96555 C++ DR1315 implementation the compiler complains
about template argument involving template parameters, for clang++ the
same + complains about missing needs_write_val_arg static data member
in some wi::int_traits specializations.

2023-10-12 Jakub Jelinek <jakub@redhat.com>

PR bootstrap/111787
* tree.h (wi::int_traits <unextended_tree>::needs_write_val_arg): New
static data member.
(int_traits <extended_tree <N>>::needs_write_val_arg): Likewise.
(wi::ints_for): Provide separate partial specializations for
generic_wide_int <extended_tree <N>> and INL_CONST_PRECISION or that
and CONST_PRECISION, rather than using
int_traits <extended_tree <N> >::precision_type as the second template
argument.
* rtl.h (wi::int_traits <rtx_mode_t>::needs_write_val_arg): New
static data member.
* double-int.h (wi::int_traits <double_int>::needs_write_val_arg):
Likewise.

Mary Bennett [Thu, 12 Oct 2023 15:17:24 +0000 (09:17 -0600)]

RISCV: Bugfix for incorrect documentation heading nesting

PR middle-end/111777

gcc/ChangeLog:
* doc/extend.texi: Change subsubsection to subsection for
CORE-V built-ins.

Tamar Christina [Thu, 12 Oct 2023 14:55:58 +0000 (15:55 +0100)]

AArch64: Fix Armv9-a warnings that get emitted whenever an ACLE header is used.

At the moment, trying to use -march=armv9-a with any ACLE header such as
arm_neon.h results in rows and rows of warnings saying:

<built-in>: warning: "__ARM_ARCH" redefined
<built-in>: note: this is the location of the previous definition

This is obviously not useful and happens because the header was defined at
__ARM_ARCH == 8 and the command line changes it.

The Arm port solves this by undefining the macro during argument processing and
we do the same on AArch64 for the majority of macros. However, we define this
macro using a different helper which requires the manual undef.

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Add undef.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/armv9_warning.c: New test.

Jakub Jelinek [Thu, 12 Oct 2023 14:07:25 +0000 (16:07 +0200)]

wide-int: Add simple CHECKING_P stack-protector canary like checking

This patch adds hopefully not so expensive --enable-checking=yes
verification that the widest_int upper length bound estimates are really
upper bounds and nothing attempts to write more elements.
It is done only if the estimated upper length bound is smaller than
WIDE_INT_MAX_INL_ELTS, but that should be the most common case unless
large _BitInt is involved.
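
The canary idea can be sketched as follows. This is an illustration under stated assumptions, not the widest_int_storage code: the struct name, the canary value and the inline capacity of 8 limbs are all made up for the example.

```cpp
#include <cstdint>

// Illustrative sketch only; none of these names or values come from GCC.
constexpr unsigned MAX_INLINE_LIMBS = 8;
constexpr std::int64_t CANARY = 0x5a5a5a5a5a5a5a5aLL;

struct checked_limbs
{
  std::int64_t val[MAX_INLINE_LIMBS] = {};
  // Upper bound passed to the last write_val call; the initial value means
  // "no canary planted yet".
  unsigned estimate = MAX_INLINE_LIMBS;

  // Hand out the buffer for a result expected to need at most UPPER_BOUND
  // limbs.  If there is room, plant a canary right after the estimated end.
  std::int64_t *write_val (unsigned upper_bound)
  {
    estimate = upper_bound;
    if (upper_bound < MAX_INLINE_LIMBS)
      val[upper_bound] = CANARY;
    return val;
  }

  // True if no canary was planted or the planted canary is still intact,
  // i.e. nothing wrote past the estimated upper bound.
  bool canary_intact () const
  {
    return estimate >= MAX_INLINE_LIMBS || val[estimate] == CANARY;
  }
};
```

In this sketch a caller would check canary_intact () at the point where the final length is set, which mirrors the set_len check named in the ChangeLog entry.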

2023-10-12 Jakub Jelinek <jakub@redhat.com>

* wide-int.h (widest_int_storage <N>::write_val): If l is small
and there is space in u.val array, store a canary value at the
end when checking.
(widest_int_storage <N>::set_len): Check the canary hasn't been
overwritten.

Jakub Jelinek [Thu, 12 Oct 2023 14:01:12 +0000 (16:01 +0200)]

wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
by the wide_int/widest_int maximum precision limitation, which is depending
on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
That is a fairly low limit for _BitInt, especially on the targets with the
191 bit limitation.

The following patch bumps that limit to 16319 bits on all arches (which support
_BitInt at all), which is the limit imposed by INTEGER_CST representation
(unsigned char members holding number of HOST_WIDE_INT limbs).
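
The numbers above can be checked with a few lines of arithmetic. The sketch below assumes 64-bit HOST_WIDE_INT limbs and 8-bit chars; the constant and function names are invented for the illustration.

```cpp
#include <climits>

// Assumptions for this sketch: 8-bit chars and 64-bit HOST_WIDE_INT limbs.
static_assert (CHAR_BIT == 8, "sketch assumes 8-bit chars");
constexpr unsigned limb_bits = 64;

// An unsigned char limb count can hold at most UCHAR_MAX (255) limbs.
constexpr unsigned max_limbs = UCHAR_MAX;
constexpr unsigned max_wide_int_precision = max_limbs * limb_bits;

static_assert (max_wide_int_precision == 16320, "255 limbs of 64 bits");
static_assert (max_wide_int_precision - 1 == 16319, "largest _BitInt width");
static_assert (2 * max_wide_int_precision == 32640, "widest_int precision");

// Each old limit is one bit less than a whole number of 64-bit limbs.
constexpr unsigned limbs_for_old_limit (unsigned limit_bits)
{
  return (limit_bits + 1) / limb_bits;
}

static_assert (limbs_for_old_limit (191) == 3, "191-bit targets");
static_assert (limbs_for_old_limit (319) == 5, "319-bit targets");
static_assert (limbs_for_old_limit (575) == 9, "575-bit targets");
static_assert (limbs_for_old_limit (703) == 11, "703-bit targets");
```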

In order to achieve that, wide_int is changed from a trivially copyable type
which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or
11 limbs depending on target) limbs into a non-trivially copy constructible,
copy assignable and destructible type which for the usual small cases (up
to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses
an inline array of limbs, but for larger precisions uses heap allocated
limb array.  This makes wide_int unusable in GC structures, so for dwarf2out
which was the only place which needed it there is a new rwide_int type
(restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs
inline and is trivially copyable (dwarf2out should never deal with large
_BitInt constants, those should have been lowered earlier).
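
A rough sketch of the storage scheme described above: small precisions use an inline limb array, larger ones a heap allocated array, and the type therefore needs its own copy constructor and destructor. The class name, the inline capacity of 3 limbs and the omitted copy assignment are assumptions for the illustration, not the wide_int_storage implementation.

```cpp
#include <cstdint>
#include <type_traits>

// Illustrative sketch only; names and sizes are assumptions, not GCC's.
class limb_storage
{
  static constexpr unsigned INLINE_LIMBS = 3;
  static constexpr unsigned LIMB_BITS = 64;

  union
  {
    std::int64_t val[INLINE_LIMBS];  // used for small precisions
    std::int64_t *valp;              // used for large precisions
  } u;
  unsigned precision;

  static unsigned limbs_for (unsigned prec)
  {
    return (prec + LIMB_BITS - 1) / LIMB_BITS;
  }

public:
  explicit limb_storage (unsigned prec) : precision (prec)
  {
    if (on_heap ())
      u.valp = new std::int64_t[limbs_for (prec)] ();  // zero-initialized
    else
      for (unsigned i = 0; i < INLINE_LIMBS; i++)
        u.val[i] = 0;
  }

  limb_storage (const limb_storage &other) : precision (other.precision)
  {
    if (on_heap ())
      {
        // Deep copy, so the two objects never share a heap array.
        unsigned n = limbs_for (precision);
        u.valp = new std::int64_t[n];
        for (unsigned i = 0; i < n; i++)
          u.valp[i] = other.u.valp[i];
      }
    else
      for (unsigned i = 0; i < INLINE_LIMBS; i++)
        u.val[i] = other.u.val[i];
  }

  // Copy assignment is left out to keep the sketch short.
  limb_storage &operator= (const limb_storage &) = delete;

  ~limb_storage ()
  {
    if (on_heap ())
      delete[] u.valp;
  }

  bool on_heap () const { return precision > INLINE_LIMBS * LIMB_BITS; }
  std::int64_t *limbs () { return on_heap () ? u.valp : u.val; }
};

// The user-provided copy constructor and destructor make the type
// non-trivially copyable, which is the property that rules it out of
// GC structures in the description above.
static_assert (!std::is_trivially_copyable<limb_storage>::value,
               "owns heap memory for large precisions");
```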

Similarly, widest_int has been changed from a trivially copyable type which
contained also an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
wide_int didn't contain precision and assumed that to be
WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
assignable and destructible type which has always WIDEST_INT_MAX_PRECISION
precision (32640 bits currently, twice as much as INTEGER_CST limitation
allows) and unlike wide_int decides depending on get_len () value whether
it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or heap
allocated one.  In wide-int.h this means we need to estimate an upper
bound on how many limbs wide-int.cc (usually, sometimes wide-int.h) will
need to write, and heap allocate if needed based on that estimate.  If we
guessed over WIDE_INT_MAX_INL_ELTS and allocated dynamically but actually
need less than that, set_len, which is done at the end, copies and
deallocates.  The inexact guesses are needed because the exact
computation of the length in wide-int.cc is sometimes quite complex and
especially canonicalize at the end can decrease it.  Because of this,
widest_int is again not usable in GC structures, so cfgloop.h has been
changed to use fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt
if we'd have larger _BitInt based iterators.  Programs having more than
128-bit iterators will hopefully be rare, and I think it is fine to treat
loops with more than 2^127 iterations as effectively possibly infinite.
omp-general.cc is changed to use fixed_wide_int_storage <1024>, as it had
better support scores with the same precision on all arches.

Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
larger lengths.

On x86_64, the patch in --enable-checking=yes,rtl,extra configured
bootstrapped cc1plus enlarges the .text section by a factor of about 1.01,
from 0x25725a5 to 0x25e5555, and similarly, at least when compiling
insn-recog.cc with the usual bootstrap options, slows compilation down by a
factor of about 1.01: user 4m22.046s and 4m22.384s on vanilla trunk vs.
4m25.947s and 4m25.581s on patched trunk.  I'm afraid some code size growth
and compile time slowdown is unavoidable in this case: we use wide_int and
widest_int everywhere, and while the rare cases are marked with UNLIKELY
macros, it still means extra checks for them.

The patch also regresses
+FAIL: gm2/pim/fail/largeconst.mod,  -O
+FAIL: gm2/pim/fail/largeconst.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst.mod,  -Os
+FAIL: gm2/pim/fail/largeconst.mod,  -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O
+FAIL: gm2/pim/fail/largeconst2.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst2.mod,  -Os
+FAIL: gm2/pim/fail/largeconst2.mod,  -g
tests, which previously were rejected with
error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range
kind of errors, but now are accepted.  It seems the FE tries to parse constants
into widest_int in that case and only diagnoses if widest_int overflows.
That seems wrong: it should at least punt if stuff doesn't fit into
WIDE_INT_MAX_PRECISION, but perhaps at far less than that.  If it wants
middle-end support for precisions above 128-bit, it had better be using
BITINT_TYPE.  Will file a PR and defer to the Modula2 maintainer.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

PR c/102989
* wide-int.h: Adjust file comment.
(WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS.
(WIDE_INT_MAX_INL_PRECISION): Define.
(WIDE_INT_MAX_ELTS): Change to 255.  Assert that WIDE_INT_MAX_INL_ELTS
is smaller than WIDE_INT_MAX_ELTS.
(RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS,
WIDEST_INT_MAX_PRECISION): Define.
(WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers
to pass 0 as a new argument.
(class widest_int_storage): Likewise.
(widest_int, widest2_int): Change typedefs to use widest_int_storage
rather than fixed_wide_int_storage.
(enum wi::precision_type): Add INL_CONST_PRECISION enumerator.
(struct binary_traits): Add partial specializations for
INL_CONST_PRECISION.
(generic_wide_int): Add needs_write_val_arg static data member.
(int_traits): Likewise.
(wide_int_storage): Replace val non-static data member with a union
u of it and HOST_WIDE_INT *valp.  Declare copy constructor, copy
assignment operator and destructor.  Add unsigned int argument to
write_val.
(wide_int_storage::wide_int_storage): Initialize precision to 0
in the default ctor.  Remove unnecessary {}s around STATIC_ASSERTs.
Assert in non-default ctor T's precision_type is not
INL_CONST_PRECISION and allocate u.valp for large precision.  Add
copy constructor.
(wide_int_storage::~wide_int_storage): New.
(wide_int_storage::operator=): Add copy assignment operator.  In
assignment operator remove unnecessary {}s around STATIC_ASSERTs,
assert ctor T's precision_type is not INL_CONST_PRECISION and
if precision changes, deallocate and/or allocate u.valp.
(wide_int_storage::get_val): Return u.valp rather than u.val for
large precision.
(wide_int_storage::write_val): Likewise.  Add an unused unsigned int
argument.
(wide_int_storage::set_len): Use write_val instead of writing val
directly.
(wide_int_storage::from, wide_int_storage::from_array): Adjust
write_val callers.
(wide_int_storage::create): Allocate u.valp for large precisions.
(wi::int_traits <wide_int_storage>::get_binary_precision): New.
(fixed_wide_int_storage::fixed_wide_int_storage): Make default
ctor defaulted.
(fixed_wide_int_storage::write_val): Add unused unsigned int argument.
(fixed_wide_int_storage::from, fixed_wide_int_storage::from_array):
Adjust write_val callers.
(wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New.
(WIDEST_INT): Define.
(widest_int_storage): New template class.
(wi::int_traits <widest_int_storage>): New.
(trailing_wide_int_storage::write_val): Add unused unsigned int
argument.
(wi::get_binary_precision): Use
wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision
rather than get_precision on get_binary_result.
(wi::copy): Adjust write_val callers.  Don't call set_len if
needs_write_val_arg.
(wi::bit_not): If result.needs_write_val_arg, call write_val
again with upper bound estimate of len.
(wi::sext, wi::zext, wi::set_bit): Likewise.
(wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not,
wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc,
wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc,
wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round,
wi::lshift, wi::lrshift, wi::arshift): Likewise.
(wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg
is false.
(gt_ggc_mx, gt_pch_nx): Remove generic template for all
generic_wide_int, instead add functions and templates for each
storage of generic_wide_int.  Make functions for
generic_wide_int <wide_int_storage> and templates for
generic_wide_int <widest_int_storage <N>> deleted.
(wi::mask, wi::shifted_mask): Adjust write_val calls.
* wide-int.cc (zeros): Decrease array size to 1.
(BLOCKS_NEEDED): Use CEIL.
(canonize): Use HOST_WIDE_INT_M1.
(wi::from_buffer): Pass 0 to write_val.
(wi::to_mpz): Use CEIL.
(wi::from_mpz): Likewise.  Pass 0 to write_val.  Use
WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS.
(wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of
MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec
above WIDE_INT_MAX_INL_PRECISION estimate precision from
lengths of operands.  Use XALLOCAVEC allocated buffers for
prec above WIDE_INT_MAX_INL_PRECISION.
(wi::divmod_internal): Likewise.
(wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate
it from xlen and skip.
(rshift_large_common): Remove xprecision argument, add len
argument with len computed in caller.  Don't return anything.
(wi::lrshift_large, wi::arshift_large): Compute len here
and pass it to rshift_large_common, for lengths above
WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible.
(assert_deceq, assert_hexeq): For lengths above
WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
(test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of
WIDE_INT_MAX_PRECISION.
* wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION.
* wide-int-print.cc (print_decs, print_decu, print_hex): For
lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
* tree.h (wi::int_traits<extended_tree <N>>): Change precision_type
to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION.
(widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::ints_for): Use int_traits <extended_tree <N> >::precision_type
instead of hard coded CONST_PRECISION.
(widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather
than WIDE_INT_MAX_PRECISION.
(wi::ints_for::zero): Use
wi::int_traits <wi::extended_tree <N> >::precision_type instead of
wi::CONST_PRECISION.
* tree.cc (build_replicated_int_cst): Formatting fix.  Use
WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS.
* print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on
INTEGER_CSTs, TREE_VECs or SSA_NAMEs.
* double-int.h (wi::int_traits <double_int>::precision_type): Change
to INL_CONST_PRECISION from CONST_PRECISION.
* poly-int.h (struct poly_coeff_traits): Add partial specialization
for wi::INL_CONST_PRECISION.
* cfgloop.h (bound_wide_int): New typedef.
(struct nb_iter_bound): Change bound type from widest_int to
bound_wide_int.
(struct loop): Change nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate type from
widest_int to bound_wide_int.
* cfgloop.cc (record_niter_bound): Return early if wi::min_precision
of i_bound is too large for bound_wide_int.  Adjustments for the
widest_int to bound_wide_int type change in non-static data members.
(get_estimated_loop_iterations, get_max_loop_iterations,
get_likely_max_loop_iterations): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* tree-vect-loop.cc (vect_transform_loop): Likewise.
* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
XALLOCAVEC allocated buffer for i_bound len above
WIDE_INT_MAX_INL_ELTS.
(record_estimate): Return early if wi::min_precision of i_bound is too
large for bound_wide_int.  Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
(wide_int_cmp): Use bound_wide_int instead of widest_int.
(bound_index): Use bound_wide_int instead of widest_int.
(discover_iteration_bound_by_body_walk): Likewise.  Use
widest_int::from to convert it to widest_int when passed to
record_niter_bound.
(maybe_lower_iteration_bound): Use widest_int::from to convert it to
widest_int when passed to record_niter_bound.
(estimate_numbers_of_iteration): Don't record upper bound if
loop->nb_iterations has too large precision for bound_wide_int.
(n_of_executions_at_most): Use widest_int::from.
* tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for
the widest_int to bound_wide_int changes.
* match.pd (fold_sign_changed_comparison simplification): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* value-range.h (irange::maybe_resize): Avoid using memcpy on
non-trivially copyable elements.
* value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated
buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE.
* fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc):
Use wide_int::from on wi::to_wide instead of wi::to_widest.
* tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width
before calling wi::udiv_trunc.
* lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* lto-streamer-in.cc (input_cfg): Likewise.
(lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than
WIDE_INT_MAX_ELTS.  For length above WIDE_INT_MAX_INL_ELTS use
XALLOCAVEC allocated buffer.  Formatting fix.
* data-streamer-in.cc (streamer_read_wide_int,
streamer_read_widest_int): Likewise.
* tree-affine.cc (aff_combination_expand): Use placement new to
construct name_expansion.
(free_name_expansion): Destruct name_expansion.
* gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change
index type from widest_int to offset_int.
(class incr_info_d): Change incr type from widest_int to offset_int.
(alloc_cand_and_find_basis, backtrace_base_for_ref,
restructure_reference, slsr_process_ref, create_mul_ssa_cand,
create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand,
slsr_process_add, cand_abs_increment, replace_mult_candidate,
replace_unconditional_candidate, incr_vec_index,
create_add_on_incoming_edge, create_phi_basis_1,
replace_conditional_candidate, record_increment,
record_phi_increments_1, phi_incr_cost_1, phi_incr_cost,
lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis,
nearest_common_dominator_for_cands, insert_initializers,
all_phi_incrs_profitable_1, replace_one_candidate,
replace_profitable_candidates): Use offset_int rather than widest_int
and wi::to_offset rather than wi::to_widest.
* real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than
2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC
allocated buffer.
* tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new
to construct tree_niter_desc and destruct it on failure.
(free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL.
* gengtype.cc (main): Remove widest_int handling.
* graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use
WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS.
* gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and
assert get_len () fits into it.
* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC
allocated buffer.
* gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* omp-general.cc (score_wide_int): New typedef.
(omp_context_compute_score): Use score_wide_int instead of widest_int
and adjust for those changes.
(struct omp_declare_variant_entry): Change score and
score_in_declare_simd_clone non-static data member type from widest_int
to score_wide_int.
(omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use
score_wide_int instead of widest_int and adjust for those changes.
(omp_lto_output_declare_variant_alt): Likewise.
(omp_lto_input_declare_variant_alt): Likewise.
* godump.cc (go_output_typedef): Assert get_len () is smaller than
WIDE_INT_MAX_INL_ELTS.
gcc/c-family/
* c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead
of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS.
gcc/testsuite/
* gcc.dg/bitint-38.c: New test.

Georg-Johann Lay [Thu, 12 Oct 2023 13:32:41 +0000 (15:32 +0200)]

LibF7: Implement atan2.

libgcc/config/avr/libf7/
* libf7.c (F7MOD_atan2_, f7_atan2): New module and function.
* libf7.h: Adjust comments.
* libf7-common.mk (CALL_PROLOGUES): Add atan2.

Pan Li [Thu, 12 Oct 2023 08:54:36 +0000 (16:54 +0800)]

RISC-V: Support FP lround/lroundf auto vectorization

This patch would like to support the FP lround/lroundf auto vectorization.

* long lround (double) for rv64
* long lroundf (float) for rv32

Due to the limitation that only data types of the same size are allowed
in the vectorizer, the standard name lroundmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lround (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lround (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,rmm
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
  frrm     a6
  ...
  fsrmi    4 // RMM
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm     a6

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI, will
be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
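
For reference, the scalar semantics the vectorized loop has to preserve can be sketched in portable code. lround rounds to nearest with halfway cases away from zero, which is why the vector sequence above switches the rounding mode to RMM (round to nearest, ties to max magnitude) around the conversion. The function name round_all is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Scalar reference for the loop shown above; round_all is a made-up name.
// std::lround rounds to the nearest integer and resolves halfway cases
// away from zero, regardless of the current floating-point rounding mode.
void
round_all (long *out, const double *in, std::size_t count)
{
  for (std::size_t i = 0; i < count; i++)
    out[i] = std::lround (in[i]);
}
```

Inputs such as 2.5 and -2.5 distinguish this from the default round-to-nearest-even mode, under which both would round towards 2 in magnitude.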

gcc/ChangeLog:

* config/riscv/autovec.md (lround<mode><v_i_l_ll_convert>2): New
pattern for lround/lroundf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lround): New func decl for expanding lround.
* config/riscv/riscv-v.cc (expand_vec_lround): New func impl
for expanding lround.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Jakub Jelinek [Thu, 12 Oct 2023 08:45:27 +0000 (10:45 +0200)]

dwarf2out: Stop using wide_int in GC structures

The planned wide_int/widest_int changes to support larger precisions
make wide_int and widest_int unusable in GC structures, because they
have non-trivial destructors (and may point to heap allocated memory).
dwarf2out.{h,cc} is the only user of wide_int in GC structures, for val_wide,
but it doesn't really need much: all those values are at one point created
from const wide_int_ref & and never changed afterwards, with just a couple
of methods used on them.

So, this patch replaces use of wide_int there with a new class, dw_wide_int,
which contains just precision, len field and the limbs in trailing array.
Most needed methods are implemented directly, just for the most complicated
cases it temporarily constructs a wide_int_ref from it and calls its methods.
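
A header followed by a trailing limb array in a single allocation can be sketched as below. This is not the dw_wide_int definition: the struct, the malloc based allocation and the helper names are assumptions for the illustration, and memcpy is used for limb access so that the sketch stays within portable C++.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Illustrative sketch only; names are hypothetical, not from dwarf2out.h.
// The limbs are stored immediately after this header in the same allocation.
struct limb_header
{
  unsigned precision;
  unsigned len;
};

// Allocate header plus LEN limbs in one block and copy the limbs in.
// Returns nullptr if the allocation fails; release the block with std::free.
limb_header *
alloc_limbs (unsigned precision, const std::int64_t *src, unsigned len)
{
  void *mem = std::malloc (sizeof (limb_header) + len * sizeof (std::int64_t));
  if (!mem)
    return nullptr;
  limb_header *h = new (mem) limb_header{precision, len};
  std::memcpy (static_cast<unsigned char *> (mem) + sizeof (limb_header),
               src, len * sizeof (std::int64_t));
  return h;
}

// Read limb I from the trailing storage.
std::int64_t
elt (const limb_header *h, unsigned i)
{
  std::int64_t v;
  std::memcpy (&v,
               reinterpret_cast<const unsigned char *> (h)
               + sizeof (limb_header) + i * sizeof v,
               sizeof v);
  return v;
}
```

Because the header is trivially copyable and owns no separate heap pointer, a layout like this avoids the destructor problem that motivates the change.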

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

* dwarf2out.h (wide_int_ptr): Remove.
(dw_wide_int_ptr): New typedef.
(struct dw_val_node): Change type of val_wide from wide_int_ptr
to dw_wide_int_ptr.
(struct dw_wide_int): New type.
(dw_wide_int::elt): New method.
(dw_wide_int::operator ==): Likewise.
* dwarf2out.cc (get_full_len): Change argument type to
const dw_wide_int & from const wide_int &.  Use CEIL.  Call
get_precision method instead of calling wi::get_precision.
(alloc_dw_wide_int): New function.
(add_AT_wide): Change w argument type to const wide_int_ref &
from const wide_int &.  Use alloc_dw_wide_int.
(mem_loc_descriptor, loc_descriptor): Use alloc_dw_wide_int.
(insert_wide_int): Change val argument type to const wide_int_ref &
from const wide_int &.
(add_const_value_attribute): Pass rtx_mode_t temporary directly to
add_AT_wide instead of using a temporary variable.

Richard Biener [Thu, 12 Oct 2023 07:09:46 +0000 (09:09 +0200)]

tree-optimization/111764 - wrong reduction vectorization

The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.

PR tree-optimization/111764
* tree-vect-loop.cc (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.

* gcc.dg/vect/pr111764.c: New testcase.

Hu, Lin1 [Thu, 22 Dec 2022 02:26:47 +0000 (10:26 +0800)]

Support Intel USER_MSR

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect USER_MSR.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_USER_MSR_SET): New.
(OPTION_MASK_ISA2_USER_MSR_UNSET): Ditto.
(ix86_handle_option): Handle -musermsr.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_USER_MSR.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for usermsr.
* config.gcc: Add usermsrintrin.h
* config/i386/cpuid.h (bit_USER_MSR): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (VOID, UINT64, UINT64).
* config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins):
Add __builtin_urdmsr and __builtin_uwrmsr.
* config/i386/i386-builtins.h (ix86_builtins):
Add IX86_BUILTIN_URDMSR and IX86_BUILTIN_UWRMSR.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Define __USER_MSR__.
* config/i386/i386-expand.cc (ix86_expand_builtin):
Handle new builtins.
* config/i386/i386-isa.def (USER_MSR): Add DEF_PTA(USER_MSR).
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Handle usermsr.
* config/i386/i386.md (urdmsr): New define_insn.
(uwrmsr): Ditto.
* config/i386/i386.opt: Add option -musermsr.
* config/i386/x86gprintrin.h: Include usermsrintrin.h
* doc/extend.texi: Document usermsr.
* doc/invoke.texi: Document -musermsr.
* doc/sourcebuild.texi: Document target usermsr.
* config/i386/usermsrintrin.h: New file.

gcc/testsuite/ChangeLog:

* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/x86gprintrin-1.c: Add -musermsr for 64bit target.
* gcc.target/i386/x86gprintrin-2.c: Ditto.
* gcc.target/i386/x86gprintrin-3.c: Ditto.
* gcc.target/i386/x86gprintrin-4.c: Add -musermsr for 64bit target.
* gcc.target/i386/x86gprintrin-5.c: Ditto.
* gcc.target/i386/user_msr-1.c: New test.
* gcc.target/i386/user_msr-2.c: Ditto.

Chenghui Pan [Tue, 26 Sep 2023 06:42:57 +0000 (14:42 +0800)]

LoongArch: Modify check_effective_target_vect_int_mod according to SX/ASX capabilities.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add LoongArch in
check_effective_target_vect_int_mod according to SX/ASX capabilities.

Chenghui Pan [Tue, 26 Sep 2023 06:39:18 +0000 (14:39 +0800)]

LoongArch: Enable vect.exp for LoongArch. [PR111424]

gcc/testsuite/ChangeLog:

PR target/111424
* lib/target-supports.exp: Enable vect.exp for LoongArch.

Yang Yujie [Wed, 11 Oct 2023 09:59:53 +0000 (17:59 +0800)]

LoongArch: Adjust makefile dependency for loongarch headers.

gcc/ChangeLog:

* config.gcc: Add loongarch-driver.h to tm_files.
* config/loongarch/loongarch.h: Do not include loongarch-driver.h.
* config/loongarch/t-loongarch: Append loongarch-multilib.h to $(GTM_H)
instead of $(TM_H) for building generator programs.

Paul Thomas [Thu, 12 Oct 2023 06:26:59 +0000 (07:26 +0100)]

Fortran: Set hidden string length for pointer components [PR67740].

2023-10-11 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/67740
* trans-expr.cc (gfc_trans_pointer_assignment): Set the hidden
string length component for pointer assignment to character
pointer components.

gcc/testsuite/
PR fortran/67740
* gfortran.dg/pr67740.f90: New test.

Kewen Lin [Thu, 12 Oct 2023 05:05:03 +0000 (00:05 -0500)]

rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]

As PR111367 shows, with prefixed insns supported, some of the
checks consider it possible to use a prefixed insn for stack
protect related loads/stores, but since we don't actually
change the emitted assembly for 32 bit, this can cause the
assembler error exposed in the PR.

Mike's commit r10-4547-gce6a6c007e5a98 has already handled
the 64 bit case (DImode), this patch is to treat the 32
bit case (SImode) by making use of mode iterator P and
ptrload attribute iterator, also fixes the constraints
to match the emitted operand formats.

PR target/111367

gcc/ChangeLog:

* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr111367.C: New test.

Kewen Lin [Thu, 12 Oct 2023 05:04:58 +0000 (00:04 -0500)]

testsuite: Avoid uninit var in pr60510.f [PR111427]

The uninitialized variable a in pr60510.f can cause
some random failures as exposed in PR111427. This
patch is to make it initialized accordingly.

PR testsuite/111427

gcc/testsuite/ChangeLog:

* gfortran.dg/vect/pr60510.f (test): Init variable a.
