]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
21 months agoor1k: Fix -Wincompatible-pointer-types warning during libgcc build
Florian Weimer [Fri, 13 Oct 2023 07:34:37 +0000 (09:34 +0200)] 
or1k: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/or1k/linux-unwind.h (or1k_fallback_frame_state): Add
missing cast.

21 months agoarc: Fix -Wincompatible-pointer-types warning during libgcc build
Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)] 
arc: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/arc/linux-unwind.h (arc_fallback_frame_state): Add
missing cast.

21 months agoriscv: Fix -Wincompatible-pointer-types warning during libgcc build
Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)] 
riscv: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/riscv/linux-unwind.h (riscv_fallback_frame_state): Add
missing cast.

21 months agocsky: Fix -Wincompatible-pointer-types warning during libgcc build
Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)] 
csky: Fix -Wincompatible-pointer-types warning during libgcc build

libgcc/

* config/csky/linux-unwind.h (csky_fallback_frame_state): Add
missing cast.

21 months agom68k: Avoid implicit function declaration in libgcc
Florian Weimer [Fri, 13 Oct 2023 07:34:36 +0000 (09:34 +0200)] 
m68k: Avoid implicit function declaration in libgcc

libgcc/

* config/m68k/fpgnulib.c (__cmpdf2): Declare.

21 months agolibstdc++: Fix tr1/8_c_compatibility/cstdio/functions.cc regression with recent glibc
Jakub Jelinek [Fri, 13 Oct 2023 07:09:32 +0000 (09:09 +0200)] 
libstdc++: Fix tr1/8_c_compatibility/cstdio/functions.cc regression with recent glibc

The following testcase started FAILing recently after the
https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554
glibc change which marked vfscanf with nonnull (1) attribute.
While vfwscanf hasn't been marked similarly (strangely), the patch changes
that too.  By using va_arg one hides the value of it from the compiler
(volatile keyword would do too, or making the FILE* stream a function
argument, but then it might need to be guarded by #if or something).

2023-10-13  Jakub Jelinek  <jakub@redhat.com>

* testsuite/tr1/8_c_compatibility/cstdio/functions.cc (test01):
Initialize stream to va_arg(ap, FILE*) rather than 0.
* testsuite/tr1/8_c_compatibility/cwchar/functions.cc (test01):
Likewise.

21 months agotree-optimization/111779 - Handle some BIT_FIELD_REFs in SRA
Richard Biener [Thu, 12 Oct 2023 09:34:57 +0000 (11:34 +0200)] 
tree-optimization/111779 - Handle some BIT_FIELD_REFs in SRA

The following handles byte-aligned, power-of-two and byte-multiple
sized BIT_FIELD_REF reads in SRA.  In particular this should cover
BIT_FIELD_REFs created by optimize_bit_field_compare.

For gcc.dg/tree-ssa/ssa-dse-26.c we now SRA the BIT_FIELD_REF
appearing there leading to more DSE, fully eliding the aggregates.

This results in the same false positive -Wuninitialized as the
older attempt to remove the folding from optimize_bit_field_compare,
fixed by initializing part of the aggregate unconditionally.

PR tree-optimization/111779
gcc/
* tree-sra.cc (sra_handled_bf_read_p): New function.
(build_access_from_expr_1): Handle some BIT_FIELD_REFs.
(sra_modify_expr): Likewise.
(make_fancy_name_1): Skip over BIT_FIELD_REF.

gcc/fortran/
* trans-expr.cc (gfc_trans_assignment_1): Initialize
lhs_caf_attr and rhs_caf_attr codimension flag to avoid
false positive -Wuninitialized.

gcc/testsuite/
* gcc.dg/tree-ssa/ssa-dse-26.c: Adjust for more DSE.
* gcc.dg/vect/vect-pr111779.c: New testcase.

21 months agotree-optimization/111773 - avoid CD-DCE of noreturn special calls
Richard Biener [Thu, 12 Oct 2023 08:13:58 +0000 (10:13 +0200)] 
tree-optimization/111773 - avoid CD-DCE of noreturn special calls

The support to elide calls to allocation functions in DCE runs into
the issue that when implementations are discovered noreturn we end
up DCEing the calls anyway, leaving blocks without termination and
without outgoing edges which is both invalid IL and wrong-code when
as in the example the noreturn call would throw.  The following
avoids taking advantage of both noreturn and the ability to elide
allocation at the same time.

For the testcase it's valid to throw or return 10 by eliding the
allocation.  But we have to do either where currently we'd run
off the function.

PR tree-optimization/111773
* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Do
not elide noreturn calls that are reflected to the IL.

* g++.dg/torture/pr111773.C: New testcase.

21 months agoRISC-V: Add test for FP llround auto vectorization
Pan Li [Fri, 13 Oct 2023 06:13:26 +0000 (14:13 +0800)] 
RISC-V: Add test for FP llround auto vectorization

The below FP API are supported already by sharing the same standard
name, as well as the machine mode.

long long llround (double);

This patch would like to add the test cases for ensuring the correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llround-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoRISC-V Regression: Fix FAIL of bb-slp-pr69907.c for RVV
Juzhe-Zhong [Fri, 13 Oct 2023 05:45:19 +0000 (13:45 +0800)] 
RISC-V Regression: Fix FAIL of bb-slp-pr69907.c for RVV

Like ARM SVE and GCN, add RVV.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-pr69907.c: Add RVV.

21 months agoRISC-V: Add test for FP iroundf auto vectorization
Pan Li [Fri, 13 Oct 2023 04:11:56 +0000 (12:11 +0800)] 
RISC-V: Add test for FP iroundf auto vectorization

The below FP API are supported already by sharing the same standard
name, as well as the machine mode.

int iroundf (float);

This patch would like to add the test cases for ensuring the
correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iround-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoRISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN...
Kito Cheng [Tue, 3 Oct 2023 02:27:24 +0000 (10:27 +0800)] 
RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

riscv_legitimize_poly_move was expected to ensure the poly value is at most 32
times smaller than the minimal VLEN (32 being derived from '4096 / 128').
This assumption held when our mode modeling was not so precisely defined.
However, now that we have modeled the mode size according to the correct minimal
VLEN info, the size difference between different RVV modes can be up to 64
times. For instance, comparing RVVMF64BI and RVVMF1BI, the sizes are [1, 1]
versus [64, 64] respectively.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_poly_move): Bump
max_power to 64.
* config/riscv/riscv.h (MAX_POLY_VARIANT): New.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/autovec/bug-01.C: New.
* g++.target/riscv/rvv/rvv.exp: Add autovec folder.

21 months agoRISC-V: Leverage stdint-gcc.h for RVV test cases
Pan Li [Fri, 13 Oct 2023 02:17:36 +0000 (10:17 +0800)] 
RISC-V: Leverage stdint-gcc.h for RVV test cases

Leverage stdint-gcc.h for the int64_t types instead of typedef.
Or we may have conflict with stdint-gcc.h in somewhere else.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include
stdint-gcc.h for int types.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/test-math.h: Remove int64_t
typedef.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoRISC-V: Support FP lfloor/lfloorf auto vectorization
Pan Li [Fri, 13 Oct 2023 01:30:55 +0000 (09:30 +0800)] 
RISC-V: Support FP lfloor/lfloorf auto vectorization

This patch would like to support the FP lfloor/lfloorf auto vectorization.

* long lfloor (double) for rv64
* long lfloorf (float) for rv32

Due to the limitation that only the same size of data type are allowed
in the vectorier, the standard name lfloormn2 only act on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lfloor (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lfloor (in[i]);
}

Before this patch:
.L3:
  ...
  fld         fa5,0(a1)
  fcvt.l.d    a5,fa5,rdn
  sd          a5,-8(a0)
  ...
  bne         a1,a4,.L3

After this patch:
  frrm        a6
  ...
  fsrmi       2 // RDN
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm        a6

The rest part like SF => DI/HF => DI/DF => SI/HF => SI will be covered
by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lfloor<mode><v_i_l_ll_convert>2): New
pattern for lfloor/lfloorf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lfloor): New func decl for expanding lfloor.
* config/riscv/riscv-v.cc (expand_vec_lfloor): New func impl
for expanding lfloor.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lfloor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lfloor-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agotestsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word
Hans-Peter Nilsson [Wed, 4 Oct 2023 02:16:18 +0000 (04:16 +0200)] 
testsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word

These tests actually use a form of atomic compare and exchange
operation, not just atomic loading and storing.  Some targets (not
supported by e.g. libatomic) have atomic loading and storing, but not
compare and exchange, yielding linker errors for missing library
functions.

This change is just for existing uses of
dg-require-thread-fence.  It does not fix any other tests
that should also be gated on dg-require-atomic-cmpxchg-word.

* testsuite/29_atomics/atomic/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_flag/clear/1.cc,
testsuite/29_atomics/atomic_flag/cons/value_init.cc,
testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc,
testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc,
testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_ref/generic.cc,
testsuite/29_atomics/atomic_ref/integral.cc,
testsuite/29_atomics/atomic_ref/pointer.cc: Replace
dg-require-thread-fence with dg-require-atomic-cmpxchg-word.

21 months agotestsuite: Add dg-require-atomic-cmpxchg-word
Hans-Peter Nilsson [Wed, 4 Oct 2023 01:40:25 +0000 (03:40 +0200)] 
testsuite: Add dg-require-atomic-cmpxchg-word

Some targets (armv6-m) support inline atomic load and store,
i.e. dg-require-thread-fence matches, but not atomic operations like
compare and exchange.

This directive can be used to replace uses of dg-require-thread-fence
where an atomic operation is actually used.

* testsuite/lib/dg-options.exp (dg-require-atomic-cmpxchg-word):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_atomic_cmpxchg_word):
Ditto.

21 months agoDaily bump.
GCC Administrator [Fri, 13 Oct 2023 00:18:18 +0000 (00:18 +0000)] 
Daily bump.

21 months agoRISC-V: Support FP lceil/lceilf auto vectorization
Pan Li [Thu, 12 Oct 2023 14:07:56 +0000 (22:07 +0800)] 
RISC-V: Support FP lceil/lceilf auto vectorization

This patch would like to support the FP lceil/lceilf auto vectorization.

* long lceil (double) for rv64
* long lceilf (float) for rv32

Due to the limitation that only the same size of data type are allowed
in the vectorier, the standard name lceilmn2 only act on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lceil (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lceil (in[i]);
}

Before this patch:
.L3:
  ...
  fld         fa5,0(a1)
  fcvt.l.d    a5,fa5,rup
  sd          a5,-8(a0)
  ...
  bne         a1,a4,.L3

After this patch:
  frrm        a6
  ...
  fsrmi       3 // RUP
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm        a6

The rest part like SF => DI/HF => DI/DF => SI/HF => SI will be covered
by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lceil<mode><v_i_l_ll_convert>2): New
pattern] for lceil/lceilf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lceil): New func decl for expanding lceil.
* config/riscv/riscv-v.cc (expand_vec_lceil): New func impl
for expanding lceil.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoPR111778, PowerPC: Do not depend on an undefined shift
Michael Meissner [Thu, 12 Oct 2023 20:17:59 +0000 (16:17 -0400)] 
PR111778, PowerPC: Do not depend on an undefined shift

I was building a cross compiler to PowerPC on my x86_86 workstation with the
latest version of GCC on October 11th.  I could not build the compiler on the
x86_64 system as it died in building libgcc.  I looked into it, and I
discovered the compiler was recursing until it ran out of stack space.  If I
build a native compiler with the same sources on a PowerPC system, it builds
fine.

I traced this down to a change made around October 10th:

| commit 8f1a70a4fbcc6441c70da60d4ef6db1e5635e18a (HEAD)
| Author: Jiufu Guo <guojiufu@linux.ibm.com>
| Date:   Tue Jan 10 20:52:33 2023 +0800
|
|   rs6000: build constant via li/lis;rldicl/rldicr
|
|   If a constant is possible left/right cleaned on a rotated value from
|   a negative value of "li/lis".  Then, using "li/lis ; rldicl/rldicr"
|   to build the constant.

The code was doing a -1 << 64 which is undefined behavior because different
machines produce different results.  On the x86_64 system, (-1 << 64) produces
-1 while on a PowerPC 64-bit system, (-1 << 64) produces 0.  The x86_64 then
recurses until the stack runs out of space.

If I apply this patch, the compiler builds fine on both x86_64 as a PowerPC
crosss compiler and on a native PowerPC system.

2023-10-12  Michael Meissner  <meissner@linux.ibm.com>

gcc/

PR target/111778
* config/rs6000/rs6000.cc (can_be_built_by_li_lis_and_rldicl): Protect
code from shifts that are undefined.
(can_be_built_by_li_lis_and_rldicr): Likewise.
(can_be_built_by_li_and_rldic): Protect code from shifts that
undefined.  Also replace uses of 1ULL with HOST_WIDE_INT_1U.

21 months agolibgomp.texi: Clarify OMP_TARGET_OFFLOAD=mandatory
Tobias Burnus [Thu, 12 Oct 2023 19:00:58 +0000 (21:00 +0200)] 
libgomp.texi: Clarify OMP_TARGET_OFFLOAD=mandatory

In OpenMP 5.0/5.1, the semantic of OMP_TARGET_OFFLOAD=mandatory was
insufficiently specified; 5.2 clarified this with extensions/clarifications
(omp_initial_device, omp_invalid_device, "conforming device number").
GCC's implementation matches OpenMP 5.2.

libgomp/ChangeLog:

* libgomp.texi (OMP_DEFAULT_DEVICE): Update spec ref; add @ref to
OMP_TARGET_OFFLOAD.
(OMP_TARGET_OFFLOAD): Update spec ref; add @ref to OMP_DEFAULT_DEVICE;
clarify MANDATORY behavior.

21 months agoreg-notes.def: Fix up description of REG_NOALIAS
Alex Coplan [Thu, 12 Oct 2023 16:49:20 +0000 (17:49 +0100)] 
reg-notes.def: Fix up description of REG_NOALIAS

The description of the REG_NOALIAS note in reg-notes.def isn't quite
right. It describes it as being attached to call insns, but it is
instead attached to a move insn receiving the return value from a call.

This can be seen by looking at the code in calls.cc:expand_call which
attaches the note:

  emit_move_insn (temp, valreg);

  /* The return value from a malloc-like function cannot alias
     anything else.  */
  last = get_last_insn ();
  add_reg_note (last, REG_NOALIAS, temp);

gcc/ChangeLog:

* reg-notes.def (NOALIAS): Correct comment.

21 months agoRISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering
Christoph Müllner [Mon, 9 Oct 2023 22:40:35 +0000 (00:40 +0200)] 
RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering

Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA")
A recent change broke the xtheadcondmov-indirect tests, because the order of
emitted instructions changed. Since the test is too strict when testing for
a fixed instruction order, let's change the tests to simply count instruction,
like it is done for similar tests.

Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against
instruction reordering.

21 months agowide-int: Fix build with gcc < 12 or clang++ [PR111787]
Jakub Jelinek [Thu, 12 Oct 2023 15:20:36 +0000 (17:20 +0200)] 
wide-int: Fix build with gcc < 12 or clang++ [PR111787]

While my wide_int patch bootstrapped/regtested fine when I used GCC 12
as system gcc, apparently it doesn't with GCC 11 and older or clang++.
For GCC before PR96555 C++ DR1315 implementation the compiler complains
about template argument involving template parameters, for clang++ the
same + complains about missing needs_write_val_arg static data member
in some wi::int_traits specializations.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/111787
* tree.h (wi::int_traits <unextended_tree>::needs_write_val_arg): New
static data member.
(int_traits <extended_tree <N>>::needs_write_val_arg): Likewise.
(wi::ints_for): Provide separate partial specializations for
generic_wide_int <extended_tree <N>> and INL_CONST_PRECISION or that
and CONST_PRECISION, rather than using
int_traits <extended_tree <N> >::precision_type as the second template
argument.
* rtl.h (wi::int_traits <rtx_mode_t>::needs_write_val_arg): New
static data member.
* double-int.h (wi::int_traits <double_int>::needs_write_val_arg):
Likewise.

21 months agoRISCV: Bugfix for incorrect documentation heading nesting
Mary Bennett [Thu, 12 Oct 2023 15:17:24 +0000 (09:17 -0600)] 
RISCV: Bugfix for incorrect documentation heading nesting

PR middle-end/111777

gcc/ChangeLog:
* doc/extend.texi: Change subsubsection to subsection for
CORE-V built-ins.

21 months agoAArch64: Fix Armv9-a warnings that get emitted whenever a ACLE header is used.
Tamar Christina [Thu, 12 Oct 2023 14:55:58 +0000 (15:55 +0100)] 
AArch64: Fix Armv9-a warnings that get emitted whenever a ACLE header is used.

At the moment, trying to use -march=armv9-a with any ACLE header such as
arm_neon.h results in rows and rows of warnings saying:

<built-in>: warning: "__ARM_ARCH" redefined
<built-in>: note: this is the location of the previous definition

This is obviously not useful and happens because the header was defined at
__ARM_ARCH == 8 and the commandline changes it.

The Arm port solves this by undef the macro during argument processing and we do
the same on AArch64 for the majority of macros.  However we define this macro
using a different helper which requires the manual undef.

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Add undef.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/armv9_warning.c: New test.

21 months agowide-int: Add simple CHECKING_P stack-protector canary like checking
Jakub Jelinek [Thu, 12 Oct 2023 14:07:25 +0000 (16:07 +0200)] 
wide-int: Add simple CHECKING_P stack-protector canary like checking

This patch adds hopefully not so expensive --enable-checking=yes
verification that the widest_int upper length bound estimates are really
upper bounds and nothing attempts to write more elements.
It is done only if the estimated upper length bound is smaller than
WIDE_INT_MAX_INL_ELTS, but that should be the most common case unless
large _BitInt is involved.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

* wide-int.h (widest_int_storage <N>::write_val): If l is small
and there is space in u.val array, store a canary value at the
end when checking.
(widest_int_storage <N>::set_len): Check the canary hasn't been
overwritten.

21 months agowide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640...
Jakub Jelinek [Thu, 12 Oct 2023 14:01:12 +0000 (16:01 +0200)] 
wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
by the wide_int/widest_int maximum precision limitation, which is depending
on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
That is fairly low limit for _BitInt, especially on the targets with the 191
bit limitation.

The following patch bumps that limit to 16319 bits on all arches (which support
_BitInt at all), which is the limit imposed by INTEGER_CST representation
(unsigned char members holding number of HOST_WIDE_INT limbs).

In order to achieve that, wide_int is changed from a trivially copyable type
which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or
11 limbs depending on target) limbs into a non-trivially copy constructible,
copy assignable and destructible type which for the usual small cases (up
to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses
an inline array of limbs, but for larger precisions uses heap allocated
limb array.  This makes wide_int unusable in GC structures, so for dwarf2out
which was the only place which needed it there is a new rwide_int type
(restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs
inline and is trivially copyable (dwarf2out should never deal with large
_BitInt constants, those should have been lowered earlier).

Similarly, widest_int has been changed from a trivially copyable type which
contained also an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
wide_int didn't contain precision and assumed that to be
WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
assignable and destructible type which has always WIDEST_INT_MAX_PRECISION
precision (32640 bits currently, twice as much as INTEGER_CST limitation
allows) and unlike wide_int decides depending on get_len () value whether
it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or heap
allocated one.  In wide-int.h this means we need to estimate an upper
bound on how many limbs will wide-int.cc (usually, sometimes wide-int.h)
need to write, heap allocate if needed based on that estimation and upon
set_len which is done at the end if we guessed over WIDE_INT_MAX_INL_ELTS
and allocated dynamically, while we actually need less than that
copy/deallocate.  The unexact guesses are needed because the exact
computation of the length in wide-int.cc is sometimes quite complex and
especially canonicalize at the end can decrease it.  widest_int is again
because of this not usable in GC structures, so cfgloop.h has been changed
to use fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if
we'd have larger _BitInt based iterators, programs having more than 128-bit
iterators will be hopefully rare and I think it is fine to treat loops with
more than 2^127 iterations as effectively possibly infinite, omp-general.cc
is changed to use fixed_wide_int_storage <1024>, as it better should support
scores with the same precision on all arches.

Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
larger lengths.

On x86_64, the patch in --enable-checking=yes,rtl,extra configured
bootstrapped cc1plus enlarges the .text section by 1.01% - from
0x25725a5 to 0x25e5555 and similarly at least when compiling insn-recog.cc
with the usual bootstrap option slows compilation down by 1.01%,
user 4m22.046s and 4m22.384s on vanilla trunk vs.
4m25.947s and 4m25.581s on patched trunk.  I'm afraid some code size growth
and compile time slowdown is unavoidable in this case, we use wide_int and
widest_int everywhere, and while the rare cases are marked with UNLIKELY
macros, it still means extra checks for it.

The patch also regresses
+FAIL: gm2/pim/fail/largeconst.mod,  -O
+FAIL: gm2/pim/fail/largeconst.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst.mod,  -Os
+FAIL: gm2/pim/fail/largeconst.mod,  -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O
+FAIL: gm2/pim/fail/largeconst2.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst2.mod,  -Os
+FAIL: gm2/pim/fail/largeconst2.mod,  -g
tests, which previously were rejected with
error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range
kind of errors, but now are accepted.  Seems the FE tries to parse constants
into widest_int in that case and only diagnoses if widest_int overflows,
that seems wrong, it should at least punt if stuff doesn't fit into
WIDE_INT_MAX_PRECISION, but perhaps far less than that, if it wants support
for middle-end for precisions above 128-bit, it better should be using
BITINT_TYPE.  Will file a PR and defer to Modula2 maintainer.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

PR c/102989
* wide-int.h: Adjust file comment.
(WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS.
(WIDE_INT_MAX_INL_PRECISION): Define.
(WIDE_INT_MAX_ELTS): Change to 255.  Assert that WIDE_INT_MAX_INL_ELTS
is smaller than WIDE_INT_MAX_ELTS.
(RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS,
WIDEST_INT_MAX_PRECISION): Define.
(WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers
to pass 0 as a new argument.
(class widest_int_storage): Likewise.
(widest_int, widest2_int): Change typedefs to use widest_int_storage
rather than fixed_wide_int_storage.
(enum wi::precision_type): Add INL_CONST_PRECISION enumerator.
(struct binary_traits): Add partial specializations for
INL_CONST_PRECISION.
(generic_wide_int): Add needs_write_val_arg static data member.
(int_traits): Likewise.
(wide_int_storage): Replace val non-static data member with a union
u of it and HOST_WIDE_INT *valp.  Declare copy constructor, copy
assignment operator and destructor.  Add unsigned int argument to
write_val.
(wide_int_storage::wide_int_storage): Initialize precision to 0
in the default ctor.  Remove unnecessary {}s around STATIC_ASSERTs.
Assert in non-default ctor T's precision_type is not
INL_CONST_PRECISION and allocate u.valp for large precision.  Add
copy constructor.
(wide_int_storage::~wide_int_storage): New.
(wide_int_storage::operator=): Add copy assignment operator.  In
assignment operator remove unnecessary {}s around STATIC_ASSERTs,
assert ctor T's precision_type is not INL_CONST_PRECISION and
if precision changes, deallocate and/or allocate u.valp.
(wide_int_storage::get_val): Return u.valp rather than u.val for
large precision.
(wide_int_storage::write_val): Likewise.  Add an unused unsigned int
argument.
(wide_int_storage::set_len): Use write_val instead of writing val
directly.
(wide_int_storage::from, wide_int_storage::from_array): Adjust
write_val callers.
(wide_int_storage::create): Allocate u.valp for large precisions.
(wi::int_traits <wide_int_storage>::get_binary_precision): New.
(fixed_wide_int_storage::fixed_wide_int_storage): Make default
ctor defaulted.
(fixed_wide_int_storage::write_val): Add unused unsigned int argument.
(fixed_wide_int_storage::from, fixed_wide_int_storage::from_array):
Adjust write_val callers.
(wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New.
(WIDEST_INT): Define.
(widest_int_storage): New template class.
(wi::int_traits <widest_int_storage>): New.
(trailing_wide_int_storage::write_val): Add unused unsigned int
argument.
(wi::get_binary_precision): Use
wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision
rather than get_precision on get_binary_result.
(wi::copy): Adjust write_val callers.  Don't call set_len if
needs_write_val_arg.
(wi::bit_not): If result.needs_write_val_arg, call write_val
again with upper bound estimate of len.
(wi::sext, wi::zext, wi::set_bit): Likewise.
(wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not,
wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc,
wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc,
wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round,
wi::lshift, wi::lrshift, wi::arshift): Likewise.
(wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg
is false.
(gt_ggc_mx, gt_pch_nx): Remove generic template for all
generic_wide_int, instead add functions and templates for each
storage of generic_wide_int.  Make functions for
generic_wide_int <wide_int_storage> and templates for
generic_wide_int <widest_int_storage <N>> deleted.
(wi::mask, wi::shifted_mask): Adjust write_val calls.
* wide-int.cc (zeros): Decrease array size to 1.
(BLOCKS_NEEDED): Use CEIL.
(canonize): Use HOST_WIDE_INT_M1.
(wi::from_buffer): Pass 0 to write_val.
(wi::to_mpz): Use CEIL.
(wi::from_mpz): Likewise.  Pass 0 to write_val.  Use
WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS.
(wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of
MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec
above WIDE_INT_MAX_INL_PRECISION estimate precision from
lengths of operands.  Use XALLOCAVEC allocated buffers for
prec above WIDE_INT_MAX_INL_PRECISION.
(wi::divmod_internal): Likewise.
(wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate
it from xlen and skip.
(rshift_large_common): Remove xprecision argument, add len
argument with len computed in caller.  Don't return anything.
(wi::lrshift_large, wi::arshift_large): Compute len here
and pass it to rshift_large_common, for lengths above
WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible.
(assert_deceq, assert_hexeq): For lengths above
WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
(test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of
WIDE_INT_MAX_PRECISION.
* wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION.
* wide-int-print.cc (print_decs, print_decu, print_hex): For
lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
* tree.h (wi::int_traits<extended_tree <N>>): Change precision_type
to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION.
(widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::ints_for): Use int_traits <extended_tree <N> >::precision_type
instead of hard coded CONST_PRECISION.
(widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather
than WIDE_INT_MAX_PRECISION.
(wi::ints_for::zero): Use
wi::int_traits <wi::extended_tree <N> >::precision_type instead of
wi::CONST_PRECISION.
* tree.cc (build_replicated_int_cst): Formatting fix.  Use
WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS.
* print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on
INTEGER_CSTs, TREE_VECs or SSA_NAMEs.
* double-int.h (wi::int_traits <double_int>::precision_type): Change
to INL_CONST_PRECISION from CONST_PRECISION.
* poly-int.h (struct poly_coeff_traits): Add partial specialization
for wi::INL_CONST_PRECISION.
* cfgloop.h (bound_wide_int): New typedef.
(struct nb_iter_bound): Change bound type from widest_int to
bound_wide_int.
(struct loop): Change nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate type from
widest_int to bound_wide_int.
* cfgloop.cc (record_niter_bound): Return early if wi::min_precision
of i_bound is too large for bound_wide_int.  Adjustments for the
widest_int to bound_wide_int type change in non-static data members.
(get_estimated_loop_iterations, get_max_loop_iterations,
get_likely_max_loop_iterations): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* tree-vect-loop.cc (vect_transform_loop): Likewise.
* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
XALLOCAVEC allocated buffer for i_bound len above
WIDE_INT_MAX_INL_ELTS.
(record_estimate): Return early if wi::min_precision of i_bound is too
large for bound_wide_int.  Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
(wide_int_cmp): Use bound_wide_int instead of widest_int.
(bound_index): Use bound_wide_int instead of widest_int.
(discover_iteration_bound_by_body_walk): Likewise.  Use
widest_int::from to convert it to widest_int when passed to
record_niter_bound.
(maybe_lower_iteration_bound): Use widest_int::from to convert it to
widest_int when passed to record_niter_bound.
(estimate_numbers_of_iteration): Don't record upper bound if
loop->nb_iterations has too large precision for bound_wide_int.
(n_of_executions_at_most): Use widest_int::from.
* tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for
the widest_int to bound_wide_int changes.
* match.pd (fold_sign_changed_comparison simplification): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* value-range.h (irange::maybe_resize): Avoid using memcpy on
non-trivially copyable elements.
* value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated
buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE.
* fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc):
Use wide_int::from on wi::to_wide instead of wi::to_widest.
* tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width
before calling wi::udiv_trunc.
* lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* lto-streamer-in.cc (input_cfg): Likewise.
(lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than
WIDE_INT_MAX_ELTS.  For length above WIDE_INT_MAX_INL_ELTS use
XALLOCAVEC allocated buffer.  Formatting fix.
* data-streamer-in.cc (streamer_read_wide_int,
streamer_read_widest_int): Likewise.
* tree-affine.cc (aff_combination_expand): Use placement new to
construct name_expansion.
(free_name_expansion): Destruct name_expansion.
* gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change
index type from widest_int to offset_int.
(class incr_info_d): Change incr type from widest_int to offset_int.
(alloc_cand_and_find_basis, backtrace_base_for_ref,
restructure_reference, slsr_process_ref, create_mul_ssa_cand,
create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand,
slsr_process_add, cand_abs_increment, replace_mult_candidate,
replace_unconditional_candidate, incr_vec_index,
create_add_on_incoming_edge, create_phi_basis_1,
replace_conditional_candidate, record_increment,
record_phi_increments_1, phi_incr_cost_1, phi_incr_cost,
lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis,
nearest_common_dominator_for_cands, insert_initializers,
all_phi_incrs_profitable_1, replace_one_candidate,
replace_profitable_candidates): Use offset_int rather than widest_int
and wi::to_offset rather than wi::to_widest.
* real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than
2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC
allocated buffer.
* tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new
to construct tree_niter_desc and destruct it on failure.
(free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL.
* gengtype.cc (main): Remove widest_int handling.
* graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use
WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS.
* gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and
assert get_len () fits into it.
* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC
allocated buffer.
* gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* omp-general.cc (score_wide_int): New typedef.
(omp_context_compute_score): Use score_wide_int instead of widest_int
and adjust for those changes.
(struct omp_declare_variant_entry): Change score and
score_in_declare_simd_clone non-static data member type from widest_int
to score_wide_int.
(omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use
score_wide_int instead of widest_int and adjust for those changes.
(omp_lto_output_declare_variant_alt): Likewise.
(omp_lto_input_declare_variant_alt): Likewise.
* godump.cc (go_output_typedef): Assert get_len () is smaller than
WIDE_INT_MAX_INL_ELTS.
gcc/c-family/
* c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead
of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS.
gcc/testsuite/
* gcc.dg/bitint-38.c: New test.

21 months agoLibF7: Implement atan2.
Georg-Johann Lay [Thu, 12 Oct 2023 13:32:41 +0000 (15:32 +0200)] 
LibF7: Implement atan2.

libgcc/config/avr/libf7/
* libf7.c (F7MOD_atan2_, f7_atan2): New module and function.
* libf7.h: Adjust comments.
* libf7-common.mk (CALL_PROLOGUES): Add atan2.

21 months agoRISC-V: Support FP lround/lroundf auto vectorization
Pan Li [Thu, 12 Oct 2023 08:54:36 +0000 (16:54 +0800)] 
RISC-V: Support FP lround/lroundf auto vectorization

This patch would like to support the FP lround/lroundf auto vectorization.

* long lround (double) for rv64
* long lroundf (float) for rv32

Due to the limitation that only the same size of data type are allowed
in the vectorier, the standard name lroundmn2 only act on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lround (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lround (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,rmm
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
  frrm     a6
  ...
  fsrmi    4 // RMM
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
  ...
  fsrm     a6

The rest part like SF => DI/HF => DI/DF => SI/HF => SI will be covered
by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lround<mode><v_i_l_ll_convert>2): New
pattern for lround/lroundf.
* config/riscv/riscv-protos.h (enum insn_type): New enum value.
(expand_vec_lround): New func decl for expanding lround.
* config/riscv/riscv-v.cc (expand_vec_lround): New func impl
for expanding lround.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agodwarf2out: Stop using wide_int in GC structures
Jakub Jelinek [Thu, 12 Oct 2023 08:45:27 +0000 (10:45 +0200)] 
dwarf2out: Stop using wide_int in GC structures

The planned wide_int/widest_int changes to support larger precisions
make wide_int and widest_int unusable in GC structures, because it
has non-trivial destructors (and may point to heap allocated memory).
dwarf2out.{h,cc} is the only user of wide_int in GC structures for val_wide,
but actually doesn't really need much, all those are at one point created
from const wide_int_ref & and never changed afterwards, with just a couple
of methods used on it.

So, this patch replaces use of wide_int there with a new class, dw_wide_int,
which contains just precision, len field and the limbs in trailing array.
Most needed methods are implemented directly, just for the most complicated
cases it temporarily constructs a wide_int_ref from it and calls its methods.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

* dwarf2out.h (wide_int_ptr): Remove.
(dw_wide_int_ptr): New typedef.
(struct dw_val_node): Change type of val_wide from wide_int_ptr
to dw_wide_int_ptr.
(struct dw_wide_int): New type.
(dw_wide_int::elt): New method.
(dw_wide_int::operator ==): Likewise.
* dwarf2out.cc (get_full_len): Change argument type to
const dw_wide_int & from const wide_int &.  Use CEIL.  Call
get_precision method instead of calling wi::get_precision.
(alloc_dw_wide_int): New function.
(add_AT_wide): Change w argument type to const wide_int_ref &
from const wide_int &.  Use alloc_dw_wide_int.
(mem_loc_descriptor, loc_descriptor): Use alloc_dw_wide_int.
(insert_wide_int): Change val argument type to const wide_int_ref &
from const wide_int &.
(add_const_value_attribute): Pass rtx_mode_t temporary directly to
add_AT_wide instead of using a temporary variable.

21 months agotree-optimization/111764 - wrong reduction vectorization
Richard Biener [Thu, 12 Oct 2023 07:09:46 +0000 (09:09 +0200)] 
tree-optimization/111764 - wrong reduction vectorization

The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid.  x + x actually never
arrives this way but instead is canonicalized to 2 * x.  This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.

PR tree-optimization/111764
* tree-vect-loop.cc (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.

* gcc.dg/vect/pr111764.c: New testcase.

21 months agoSupport Intel USER_MSR
Hu, Lin1 [Thu, 22 Dec 2022 02:26:47 +0000 (10:26 +0800)] 
Support Intel USER_MSR

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect USER_MSR.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_USER_MSR_SET): New.
(OPTION_MASK_ISA2_USER_MSR_UNSET): Ditto.
(ix86_handle_option): Handle -musermsr.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_USER_MSR.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for usermsr.
* config.gcc: Add usermsrintrin.h
* config/i386/cpuid.h (bit_USER_MSR): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (VOID, UINT64, UINT64).
* config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins):
Add __builtin_urdmsr and __builtin_uwrmsr.
* config/i386/i386-builtins.h (ix86_builtins):
Add IX86_BUILTIN_URDMSR and IX86_BUILTIN_UWRMSR.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Define __USER_MSR__.
* config/i386/i386-expand.cc (ix86_expand_builtin):
Handle new builtins.
* config/i386/i386-isa.def (USER_MSR): Add DEF_PTA(USER_MSR).
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Handle usermsr.
* config/i386/i386.md (urdmsr): New define_insn.
(uwrmsr): Ditto.
* config/i386/i386.opt: Add option -musermsr.
* config/i386/x86gprintrin.h: Include usermsrintrin.h
* doc/extend.texi: Document usermsr.
* doc/invoke.texi: Document -musermsr.
* doc/sourcebuild.texi: Document target usermsr.
* config/i386/usermsrintrin.h: New file.

gcc/testsuite/ChangeLog:

* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/x86gprintrin-1.c: Add -musermsr for 64bit target.
* gcc.target/i386/x86gprintrin-2.c: Ditto.
* gcc.target/i386/x86gprintrin-3.c: Ditto.
* gcc.target/i386/x86gprintrin-4.c: Add musermsr for 64bit target.
* gcc.target/i386/x86gprintrin-5.c: Ditto
* gcc.target/i386/user_msr-1.c: New test.
* gcc.target/i386/user_msr-2.c: Ditto.

21 months agoLoongArch: Modify check_effective_target_vect_int_mod according to SX/ASX capabilities.
Chenghui Pan [Tue, 26 Sep 2023 06:42:57 +0000 (14:42 +0800)] 
LoongArch: Modify check_effective_target_vect_int_mod according to SX/ASX capabilities.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add LoongArch in
check_effective_target_vect_int_mod according to SX/ASX capabilities.

21 months agoLoongArch: Enable vect.exp for LoongArch. [PR111424]
Chenghui Pan [Tue, 26 Sep 2023 06:39:18 +0000 (14:39 +0800)] 
LoongArch: Enable vect.exp for LoongArch. [PR111424]

gcc/testsuite/ChangeLog:

PR target/111424
* lib/target-supports.exp: Enable vect.exp for LoongArch.

21 months agoLoongArch: Adjust makefile dependency for loongarch headers.
Yang Yujie [Wed, 11 Oct 2023 09:59:53 +0000 (17:59 +0800)] 
LoongArch: Adjust makefile dependency for loongarch headers.

gcc/ChangeLog:

* config.gcc: Add loongarch-driver.h to tm_files.
* config/loongarch/loongarch.h: Do not include loongarch-driver.h.
* config/loongarch/t-loongarch: Append loongarch-multilib.h to $(GTM_H)
instead of $(TM_H) for building generator programs.

21 months agoFortran: Set hidden string length for pointer components [PR67740].
Paul Thomas [Thu, 12 Oct 2023 06:26:59 +0000 (07:26 +0100)] 
Fortran: Set hidden string length for pointer components [PR67740].

2023-10-11  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/67740
* trans-expr.cc (gfc_trans_pointer_assignment): Set the hidden
string length component for pointer assignment to character
pointer components.

gcc/testsuite/
PR fortran/67740
* gfortran.dg/pr67740.f90: New test

21 months agors6000: Make 32 bit stack_protect support prefixed insn [PR111367]
Kewen Lin [Thu, 12 Oct 2023 05:05:03 +0000 (00:05 -0500)] 
rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]

As PR111367 shows, with prefixed insn supported, some of
checkings consider it's able to leverage prefixed insn
for stack protect related load/store, but since we don't
actually change the emitted assembly for 32 bit, it can
cause the assembler error as exposed.

Mike's commit r10-4547-gce6a6c007e5a98 has already handled
the 64 bit case (DImode), this patch is to treat the 32
bit case (SImode) by making use of mode iterator P and
ptrload attribute iterator, also fixes the constraints
to match the emitted operand formats.

PR target/111367

gcc/ChangeLog:

* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr111367.C: New test.

21 months agotestsuite: Avoid uninit var in pr60510.f [PR111427]
Kewen Lin [Thu, 12 Oct 2023 05:04:58 +0000 (00:04 -0500)] 
testsuite: Avoid uninit var in pr60510.f [PR111427]

The uninitialized variable a in pr60510.f can cause
some random failures as exposed in PR111427.  This
patch is to make it initialized accordingly.

PR testsuite/111427

gcc/testsuite/ChangeLog:

* gfortran.dg/vect/pr60510.f (test): Init variable a.

21 months agovect: Consider vec_perm costing for VMAT_CONTIGUOUS_REVERSE
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Consider vec_perm costing for VMAT_CONTIGUOUS_REVERSE

For VMAT_CONTIGUOUS_REVERSE, the transform code in function
vectorizable_store generates a VEC_PERM_EXPR stmt before
storing, but it's never considered in costing.

This patch is to make it consider vec_perm in costing, it
adjusts the order of transform code a bit to make it easy
to early return for costing_p.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Consider generated
VEC_PERM_EXPR stmt for VMAT_CONTIGUOUS_REVERSE in costing as
vec_perm.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c: New test.

21 months agovect: Get rid of vect_model_store_cost
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Get rid of vect_model_store_cost

This patch is to eventually get rid of vect_model_store_cost,
it adjusts the costing for the remaining memory access types
VMAT_CONTIGUOUS{, _DOWN, _REVERSE} by moving costing close
to the transform code.  Note that in vect_model_store_cost,
there is one special handling for vectorizing a store into
the function result, since it's extra penalty and the
transform part doesn't have it, this patch keep it alone.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_model_store_cost): Remove.
(vectorizable_store): Adjust the costing for the remaining memory
access types VMAT_CONTIGUOUS{, _DOWN, _REVERSE}.

21 months agovect: Adjust vectorizable_store costing on VMAT_CONTIGUOUS_PERMUTE
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Adjust vectorizable_store costing on VMAT_CONTIGUOUS_PERMUTE

This patch adjusts the cost handling on VMAT_CONTIGUOUS_PERMUTE
in function vectorizable_store.  We don't call function
vect_model_store_cost for it any more.  It's the case of
interleaving stores, so it skips all stmts excepting for
first_stmt_info, consider the whole group when costing
first_stmt_info.  This patch shouldn't have any functional
changes.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
get VMAT_CONTIGUOUS_PERMUTE and remove VMAT_CONTIGUOUS_PERMUTE related
handlings.
(vectorizable_store): Adjust the cost handling on
VMAT_CONTIGUOUS_PERMUTE without calling vect_model_store_cost.

21 months agovect: Adjust vectorizable_store costing on VMAT_LOAD_STORE_LANES
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Adjust vectorizable_store costing on VMAT_LOAD_STORE_LANES

This patch adjusts the cost handling on VMAT_LOAD_STORE_LANES
in function vectorizable_store.  We don't call function
vect_model_store_cost for it any more.  It's the case of
interleaving stores, so it skips all stmts excepting for
first_stmt_info, consider the whole group when costing
first_stmt_info.  This patch shouldn't have any functional
changes.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
get VMAT_LOAD_STORE_LANES.
(vectorizable_store): Adjust the cost handling on VMAT_LOAD_STORE_LANES
without calling vect_model_store_cost.  Factor out new lambda function
update_prologue_cost.

21 months agovect: Adjust vectorizable_store costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Adjust vectorizable_store costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

This patch adjusts the cost handling on VMAT_ELEMENTWISE
and VMAT_STRIDED_SLP in function vectorizable_store.  We
don't call function vect_model_store_cost for them any more.

Like what we improved for PR82255 on load side, this change
helps us to get rid of unnecessary vec_to_scalar costing
for some case with VMAT_STRIDED_SLP.  One typical test case
gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c has been
associated.  And it helps some cases with some inconsistent
costing too.

Besides, this also special-cases the interleaving stores
for these two affected memory access types, since for the
interleaving stores the whole chain is vectorized when the
last store in the chain is reached, the other stores in the
group would be skipped.  To keep consistent with this and
follows the transforming handlings like iterating the whole
group, it only costs for the first store in the group.
Ideally we can only cost for the last one but it's not
trivial and using the first one is actually equivalent.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get
VMAT_ELEMENTWISE and VMAT_STRIDED_SLP any more, and remove their
related handlings.
(vectorizable_store): Adjust the cost handling on VMAT_ELEMENTWISE
and VMAT_STRIDED_SLP without calling vect_model_store_cost.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c: New test.

21 months agovect: Simplify costing on vectorizable_scan_store
Kewen Lin [Thu, 12 Oct 2023 05:04:57 +0000 (00:04 -0500)] 
vect: Simplify costing on vectorizable_scan_store

This patch is to simplify the costing on the case
vectorizable_scan_store without calling function
vect_model_store_cost any more.

I considered if moving the costing into function
vectorizable_scan_store is a good idea, for doing
that, we have to pass several variables down which
are only used for costing, and for now we just
want to keep the costing as the previous, haven't
tried to make this costing consistent with what the
transforming does, so I think we can leave it for now.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Adjust costing on
vectorizable_scan_store without calling vect_model_store_cost
any more.

21 months agovect: Adjust vectorizable_store costing on VMAT_GATHER_SCATTER
Kewen Lin [Thu, 12 Oct 2023 05:04:56 +0000 (00:04 -0500)] 
vect: Adjust vectorizable_store costing on VMAT_GATHER_SCATTER

This patch adjusts the cost handling on VMAT_GATHER_SCATTER
in function vectorizable_store (all three cases), then we
won't depend on vect_model_load_store for its costing any
more.  This patch shouldn't have any functional changes.

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get
VMAT_GATHER_SCATTER any more, remove VMAT_GATHER_SCATTER related
handlings and the related parameter gs_info.
(vect_build_scatter_store_calls): Add the handlings on costing with
one more argument cost_vec.
(vectorizable_store): Adjust the cost handling on VMAT_GATHER_SCATTER
without calling vect_model_store_cost any more.

21 months agovect: Move vect_model_store_cost next to the transform in vectorizable_store
Kewen Lin [Thu, 12 Oct 2023 05:04:56 +0000 (00:04 -0500)] 
vect: Move vect_model_store_cost next to the transform in vectorizable_store

This patch is an initial patch to move costing next to the
transform, it still adopts vect_model_store_cost for costing
but moves and duplicates it down according to the handlings
of different vect_memory_access_types or some special
handling need, hope it can make the subsequent patches easy
to review.  This patch should not have any functional
changes.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Move and duplicate the call
to vect_model_store_cost down to some different transform paths
according to the handlings of different vect_memory_access_types
or some special handling need.

21 months agovect: Ensure vect store is supported for some VMAT_ELEMENTWISE case
Kewen Lin [Thu, 12 Oct 2023 05:04:56 +0000 (00:04 -0500)] 
vect: Ensure vect store is supported for some VMAT_ELEMENTWISE case

When making/testing patches to move costing next to the
transform code for vectorizable_store, some ICEs got
exposed when I further refined the costing handlings on
VMAT_ELEMENTWISE.  The apparent cause is triggering the
assertion in rs6000 specific function for costing
rs6000_builtin_vectorization_cost:

  if (TARGET_ALTIVEC)
     /* Misaligned stores are not supported.  */
     gcc_unreachable ();

I used vect_get_store_cost instead of the original way by
record_stmt_cost with scalar_store for costing, that is to
use one unaligned_store instead, it matches what we use in
transforming, it's a vector store as below:

  else if (group_size >= const_nunits
           && group_size % const_nunits == 0)
    {
       nstores = 1;
       lnel = const_nunits;
       ltype = vectype;
       lvectype = vectype;
    }

So IMHO it's more consistent with vector store instead of
scalar store, with the given compilation option
-mno-allow-movmisalign, the misaligned vector store is
unexpected to be used in vectorizer, but why it's still
adopted?  In the current implementation of function
get_group_load_store_type, we always set alignment support
scheme as dr_unaligned_supported for VMAT_ELEMENTWISE, it
is true if we always adopt scalar stores, but as the above
code shows, we could use vector stores for some cases, so
we should use the correct alignment support scheme for it.

This patch is to ensure the vector store is supported by
further checking with vect_supportable_dr_alignment.  The
ICEs got exposed with patches moving costing next to the
transform but they haven't been landed, the test coverage
would be there once they get landed.  The affected test
cases are:
  - gcc.dg/vect/slp-45.c
  - gcc.dg/vect/vect-alias-check-{10,11,12}.c

btw, I tried to make some correctness test case, but I
realized that -mno-allow-movmisalign is mainly for noting
movmisalign optab and it doesn't guard for the actual hw
vector memory access insns, so I failed to make it unless
I also altered some conditions for them as it.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Ensure the generated
vector store for some case of VMAT_ELEMENTWISE is supported.

21 months agox86: set spincount 1 for x86 hybrid platform
Zhang, Jun [Fri, 22 Sep 2023 15:56:37 +0000 (23:56 +0800)] 
x86: set spincount 1 for x86 hybrid platform

By test, we find in hybrid platform spincount 1 is better.

Use '-march=native -Ofast -funroll-loops -flto',
results as follows:

spec2017 speed   RPL     ADL
657.xz_s         0.00%   0.50%
603.bwaves_s     10.90%  26.20%
607.cactuBSSN_s  5.50%   72.50%
619.lbm_s        2.40%   2.50%
621.wrf_s        -7.70%  2.40%
627.cam4_s       0.50%   0.70%
628.pop2_s       48.20%  153.00%
638.imagick_s    -0.10%  0.20%
644.nab_s        2.30%   1.40%
649.fotonik3d_s  8.00%   13.80%
654.roms_s       1.20%   1.10%
Geomean-int      0.00%   0.50%
Geomean-fp       6.30%   21.10%
Geomean-all      5.70%   19.10%

omp2012          RPL     ADL
350.md           -1.81%  -1.75%
351.bwaves       7.72%   12.50%
352.nab          14.63%  19.71%
357.bt331        -0.20%  1.77%
358.botsalgn     0.00%   0.00%
359.botsspar     0.00%   0.65%
360.ilbdc        0.00%   0.25%
362.fma3d        2.66%   -0.51%
363.swim         10.44%  0.00%
367.imagick      0.00%   0.12%
370.mgrid331     2.49%   25.56%
371.applu331     1.06%   4.22%
372.smithwa      0.74%   3.34%
376.kdtree       10.67%  16.03%
GEOMEAN          3.34%   5.53%

include/ChangeLog:

PR target/109812
* spincount.h: New file.

libgomp/ChangeLog:

* env.c (initialize_env): Use do_adjust_default_spincount.
* config/linux/x86/spincount.h: New file.

21 months agoRISC-V: Support FP llrint auto vectorization
Pan Li [Thu, 12 Oct 2023 03:20:36 +0000 (11:20 +0800)] 
RISC-V: Support FP llrint auto vectorization

This patch would like to support the FP llrint auto vectorization.

* long long llrint (double)

This will be the CVT from DF => DI from the standard name's perpsective,
which has been covered in previous PATCH(es). Thus, this patch only add
some test cases.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months ago[APX] Support Intel APX PUSH2POP2
Mo, Zewei [Mon, 6 Mar 2023 02:42:32 +0000 (10:42 +0800)] 
[APX] Support Intel APX PUSH2POP2

This feature requires stack to be aligned at 16byte, therefore in
prologue/epilogue, a standalone push/pop will be emitted before any
push2/pop2 if the stack was not aligned to 16byte.
Also for current implementation we only support push2/pop2 usage in
function prologue/epilogue for those callee-saved registers.

gcc/ChangeLog:

* config/i386/i386.cc (gen_push2): New function to emit push2
and adjust cfa offset.
(ix86_pro_and_epilogue_can_use_push2_pop2): New function to
determine whether push2/pop2 can be used.
(ix86_compute_frame_layout): Adjust preferred stack boundary
and stack alignment needed for push2/pop2.
(ix86_emit_save_regs): Emit push2 when available.
(ix86_emit_restore_reg_using_pop2): New function to emit pop2
and adjust cfa info.
(ix86_emit_restore_regs_using_pop2): New function to loop
through the saved regs and call above.
(ix86_expand_epilogue): Call ix86_emit_restore_regs_using_pop2
when push2pop2 available.
* config/i386/i386.md (push2_di): New pattern for push2.
(pop2_di): Likewise for pop2.

gcc/testsuite/ChangeLog:

* gcc.target/i386/apx-push2pop2-1.c: New test.
* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
* gcc.target/i386/apx-push2pop2_interrupt-1.c: Likewise.

Co-authored-by: Hu Lin1 <lin1.hu@intel.com>
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
21 months agoRISC-V: Support FP irintf auto vectorization
Pan Li [Thu, 12 Oct 2023 01:43:02 +0000 (09:43 +0800)] 
RISC-V: Support FP irintf auto vectorization

This patch would like to support the FP irintf auto vectorization.

* int irintf (float)

Due to the limitation that only the same size of data type are allowed
in the vectorier, the standard name lrintmn2 only act on SF => SI.

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw      fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw       a5,-4(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v     v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The rest part like DF => SI/HF => SI will be covered by the hook
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lrint<mode><vlconvert>2): Rename from.
(lrint<mode><v_i_l_ll_convert>2): Rename to.
* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoDaily bump.
GCC Administrator [Thu, 12 Oct 2023 00:17:24 +0000 (00:17 +0000)] 
Daily bump.

21 months agoRISC-V: Add TARGET_MIN_VLEN_OPTS to fix the build
Kito Cheng [Wed, 11 Oct 2023 23:18:00 +0000 (16:18 -0700)] 
RISC-V: Add TARGET_MIN_VLEN_OPTS to fix the build

gcc/ChangeLog:

* config/riscv/riscv-opts.h (TARGET_MIN_VLEN_OPTS): New.

21 months agoRISC-V Adjust long unconditional branch sequence
Jeff Law [Wed, 11 Oct 2023 22:18:22 +0000 (16:18 -0600)] 
RISC-V Adjust long unconditional branch sequence

Andrew and I independently noted the long unconditional branch sequence was
using the "call" pseudo op.  Technically it works, but it's a bit odd.  This
patch flips it to use the "jump" pseudo-op.

This was tested with a hacked-up local compiler which forced all branches/jumps
to be long jumps.  Naturally it triggered some failures for scan-asm tests but
no execution regressions (which is mostly what I was testing for).

I've updated the long branch support item in the RISE wiki to indicate that we
eventually want a register scavenging approach with a fallback to $ra in the
future so that we don't muck up the return address predictors.  It's not
super-high priority and shouldn't be terrible to implement given we've got the
$ra fallback when a suitable register can not be found.

gcc/
* config/riscv/riscv.md (jump): Adjust sequence to use a "jump"
pseudo op instead of a "call" pseudo op.

21 months agoRISC-V: Extend riscv_subset_list, preparatory for target attribute support
Kito Cheng [Mon, 2 Oct 2023 14:37:50 +0000 (22:37 +0800)] 
RISC-V: Extend riscv_subset_list, preparatory for target attribute support

riscv_subset_list only accept a full arch string before, but we need to
parse single extension when supporting target attribute, also we may set
a riscv_subset_list directly rather than re-parsing the ISA string
again.

gcc/ChangeLog:

* config/riscv/riscv-subset.h (riscv_subset_list::parse_single_std_ext):
New.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::clone): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::set_loc): Ditto.
(riscv_set_arch_by_subset_list): Ditto.
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_single_std_ext): New.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::clone): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::set_loc): Ditto.
(riscv_set_arch_by_subset_list): Ditto.

21 months agoRISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC]
Kito Cheng [Sun, 1 Oct 2023 10:14:44 +0000 (18:14 +0800)] 
RISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC]

Allow those funciton apply from a local gcc_options rather than the
global options.

Preparatory for target attribute, sperate this change for eaiser reivew
since it's a NFC.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_convert_vector_bits): Get setting
from argument rather than get setting from global setting.
(riscv_override_options_internal): New, splited from
riscv_override_options, also take a gcc_options argument.
(riscv_option_override): Splited most part to
riscv_override_options_internal.

21 months agooptions: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask
Kito Cheng [Sun, 1 Oct 2023 09:03:28 +0000 (17:03 +0800)] 
options: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask

We TARGET_<NAME>_P marcro to test a Mask and InverseMask with user
specified target_variable, however we may want to test with specific
gcc_options variable rather than target_variable.

Like RISC-V has defined lots of Mask with TargetVariable, which is not
easy to use, because that means we need to known which Mask are associate with
which TargetVariable, so take a gcc_options variable is a better interface
for such use case.

gcc/ChangeLog:

* doc/options.texi (Mask): Document TARGET_<NAME>_P and
TARGET_<NAME>_OPTS_P.
(InverseMask): Ditto.
* opth-gen.awk (Mask): Generate TARGET_<NAME>_P and
TARGET_<NAME>_OPTS_P macro.
(InverseMask): Ditto.

21 months agoMATCH: [PR111282] Simplify `a & (b ^ ~a)` to `a & b`
Andrew Pinski [Tue, 10 Oct 2023 19:45:56 +0000 (12:45 -0700)] 
MATCH: [PR111282] Simplify `a & (b ^ ~a)` to `a & b`

While `a & (b ^ ~a)` is optimized to `a & b` on the rtl level,
it is always good to optimize this at the gimple level and allows
us to match a few extra things including where a is a comparison.

Note I had to update/change the testcase and-1.c to avoid matching
this case as we can match -2 and 1 as bitwise inversions.

PR tree-optimization/111282

gcc/ChangeLog:

* match.pd (`a & ~(a ^ b)`, `a & (a == b)`,
`a & ((~a) ^ b)`): New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/and-1.c: Update testcase to avoid
matching `~1 & (a ^ 1)` simplification.
* gcc.dg/tree-ssa/bitops-6.c: New test.

21 months agomodula2: Narrow subranges to int or unsigned int if ZTYPE is the base type.
Gaius Mulley [Wed, 11 Oct 2023 16:44:35 +0000 (17:44 +0100)] 
modula2: Narrow subranges to int or unsigned int if ZTYPE is the base type.

This patch narrows the subrange base type to INTEGER or CARDINAL
providing the range is satisfied.  It only does this when the subrange
base type is the ZTYPE.

gcc/m2/ChangeLog:

* gm2-compiler/M2GCCDeclare.mod (DeclareSubrange): Check
the base type of the subrange against the ZTYPE and call
DeclareSubrangeNarrow if necessary.
(DeclareSubrangeNarrow): New procedure function.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
21 months ago[PATCH v4 2/2] RISC-V: Add support for XCValu extension in CV32E40P
Mary Bennett [Wed, 11 Oct 2023 13:41:38 +0000 (07:41 -0600)] 
[PATCH v4 2/2] RISC-V: Add support for XCValu extension in CV32E40P

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett <mary.bennett@embecosm.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add the XCValu
extension.
* config/riscv/constraints.md: Add builtins for the XCValu
extension.
* config/riscv/predicates.md (immediate_register_operand):
Likewise.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
(RISCV_ATYPE_UHI): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* config/riscv/riscv.cc (riscv_print_operand): Likewise.
* doc/extend.texi: Add XCValu documentation.
* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add proc for the XCValu extension.
* gcc.target/riscv/cv-alu-compile.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addurn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clip.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clipu.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-suburn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile.c: New test.

21 months ago[PATCH v4 1/2] RISC-V: Add support for XCVmac extension in CV32E40P
Mary Bennett [Wed, 11 Oct 2023 13:39:41 +0000 (07:39 -0600)] 
[PATCH v4 1/2] RISC-V: Add support for XCVmac extension in CV32E40P

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett <mary.bennett@embecosm.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add XCVmac.
* config/riscv/riscv-ftypes.def: Add XCVmac builtins.
* config/riscv/riscv-builtins.cc: Likewise.
* config/riscv/riscv.md: Likewise.
* config/riscv/riscv.opt: Likewise.
* doc/extend.texi: Add XCVmac builtin documentation.
* doc/sourcebuild.texi: Likewise.
* config/riscv/corev.def: New file.
* config/riscv/corev.md: New file.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add new effective target check.
* gcc.target/riscv/cv-mac-compile.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mac.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-msu.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulurn.c: New test.
* gcc.target/riscv/cv-mac-test-autogeneration.c: New test.

21 months agoMAINTAINERS: Fix write after approval name order
Filip Kastl [Wed, 11 Oct 2023 12:53:44 +0000 (14:53 +0200)] 
MAINTAINERS: Fix write after approval name order

ChangeLog:

* MAINTAINERS: Fix name order.

Signed-off-by: Filip Kastl <fkastl@suse.cz>
21 months agoPR modula2/111675 Incorrect packed record field value passed to a procedure
Gaius Mulley [Wed, 11 Oct 2023 12:26:47 +0000 (13:26 +0100)] 
PR modula2/111675 Incorrect packed record field value passed to a procedure

This patch allows a packed field to be extracted and passed to a
procedure.  It ensures that the subrange type is the same for both the
procedure and record field.  It also extends the <* bytealignment (0) *>
to cover packed subrange types.

gcc/m2/ChangeLog:

PR modula2/111675
* gm2-compiler/M2CaseList.mod (appendTree): Replace
InitStringCharStar with InitString.
* gm2-compiler/M2GCCDeclare.mod: Import AreConstantsEqual.
(DeclareSubrange): Add zero alignment test and call
BuildSmallestTypeRange if necessary.
(WalkSubrangeDependants): Walk the align expression.
(IsSubrangeDependants): Test the align expression.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): Correct end name.
* gm2-compiler/P2SymBuild.mod (BuildTypeAlignment): Allow subranges
to be zero aligned (packed).
* gm2-compiler/SymbolTable.mod (Subrange): Add Align field.
(MakeSubrange): Set Align to NulSym.
(PutAlignment): Assign Subrange.Align to align.
(GetAlignment): Return Subrange.Align.
* gm2-gcc/m2expr.cc (noBitsRequired): Rewrite.
(calcNbits): Rename ...
(m2expr_calcNbits): ... to this and test for negative values.
(m2expr_BuildTBitSize): Replace calcNBits with m2expr_calcNbits.
* gm2-gcc/m2expr.def (calcNbits): Export.
* gm2-gcc/m2expr.h (m2expr_calcNbits): New prototype.
* gm2-gcc/m2type.cc (noBitsRequired): Remove.
(m2type_BuildSmallestTypeRange): Call m2expr_calcNbits.
(m2type_BuildSubrangeType): Create range_type from
build_range_type (type, lowval, highval).

gcc/testsuite/ChangeLog:

PR modula2/111675
* gm2/extensions/run/pass/packedrecord3.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
21 months agoRISC-V: Fix incorrect index(offset) of gather/scatter
Juzhe-Zhong [Wed, 11 Oct 2023 09:54:44 +0000 (17:54 +0800)] 
RISC-V: Fix incorrect index(offset) of gather/scatter

I suddenly discovered I made a mistake that was lucky un-exposed.

https://godbolt.org/z/c3jzrh7or

GCC is using 32 bit index offset:

        vsll.vi v1,v1,2
        vsetvli zero,a5,e32,m1,ta,ma
        vluxei32.v      v1,(a1),v1

This is wrong since v1 may overflow 32bit after vsll.vi.

After this patch:

vsext.vf2 v8,v4
vsll.vi v8,v8,2
vluxei64.v v8,(a1),v8

Same as Clang.

Regression passed. Ok for trunk ?

gcc/ChangeLog:

* config/riscv/autovec.md: Fix index bug.
* config/riscv/riscv-protos.h (gather_scatter_valid_offset_mode_p): New function.
* config/riscv/riscv-v.cc (expand_gather_scatter): Fix index bug.
(gather_scatter_valid_offset_mode_p): New function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/offset_extend-1.c: New test.

21 months agoRISC-V: Support FP lrint/lrintf auto vectorization
Pan Li [Wed, 11 Oct 2023 07:51:33 +0000 (15:51 +0800)] 
RISC-V: Support FP lrint/lrintf auto vectorization

This patch would like to support the FP lrint/lrintf auto vectorization.

* long lrint (double) for rv64
* long lrintf (float) for rv32

Due to the limitation that only the same size of data type are allowed
in the vectorier, the standard name lrintmn2 only act on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lrint (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrint (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,dyn
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The rest part like SF => DI/HF => DI/DF => SI/HF => SI will be covered
by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

* config/riscv/autovec.md (lrint<mode><vlconvert>2): New pattern
for lrint/lintf.
* config/riscv/riscv-protos.h (expand_vec_lrint): New func decl
for expanding lint.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): New helper func impl
for vfcvt.x.f.v.
(expand_vec_lrint): New function impl for expanding lint.
* config/riscv/vector-iterators.md: New mode attr and iterator.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/test-math.h: New define for
CVT like test case.
* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
21 months agoRISC-V: Remove XFAIL of ssa-dom-cse-2.c
Juzhe-Zhong [Wed, 11 Oct 2023 03:25:07 +0000 (11:25 +0800)] 
RISC-V: Remove XFAIL of ssa-dom-cse-2.c

Confirm RISC-V is able to CSE this case no matter whether we enable RVV or not.

Remove XFAIL,  to fix:
XPASS: gcc.dg/tree-ssa/ssa-dom-cse-2.c scan-tree-dump optimized "return 28;"

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Remove riscv.

21 months agotree-ssa-strlen: optimization skips clobbering store [PR111519]
Jakub Jelinek [Wed, 11 Oct 2023 06:58:29 +0000 (08:58 +0200)] 
tree-ssa-strlen: optimization skips clobbering store [PR111519]

The following testcase is miscompiled, because count_nonzero_bytes incorrectly
uses get_strinfo information on a pointer from which an earlier instruction
loads SSA_NAME stored at the current instruction.  get_strinfo shows a state
right before the current store though, so if there are some stores in between
the current store and the load, the string length information might have
changed.

The patch passes around gimple_vuse from the store and punts instead of using
strinfo on loads from MEM_REF which have different gimple_vuse from that.

2023-10-11  Richard Biener  <rguenther@suse.de>
    Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/111519
* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse
argument and pass it through to recursive calls and
count_nonzero_bytes_addr calls.  Don't shadow the stmt argument, but
change stmt for gimple_assign_single_p statements for which we don't
immediately punt.
(strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass
it through to recursive calls and count_nonzero_bytes calls.  Don't
use get_strinfo if gimple_vuse (stmt) is different from vuse.  Don't
shadow the stmt argument.

* gcc.dg/torture/pr111519.c: New testcase.

21 months agoOptimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).
Roger Sayle [Wed, 11 Oct 2023 07:08:04 +0000 (08:08 +0100)] 
Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).

This patch is the middle-end piece of an improvement to PRs 101955 and
106245, that adds a missing simplification to the RTL optimizers.
This transformation is to simplify (char)(x << 7) != 0 as x & 1.
Technically, the cast can be any truncation, where shift is by one
less than the narrower type's precision, setting the most significant
(only) bit from the least significant bit.

This transformation applies to any target, but it's easy to see
(and add a new test case) on x86, where the following function:

int f(int a) { return (a << 31) >> 31; }

currently gets compiled with -O2 to:

foo:    movl    %edi, %eax
        sall    $7, %eax
        sarb    $7, %al
        movsbl  %al, %eax
        ret

but with this patch, we now generate the slightly simpler.

foo:    movl    %edi, %eax
        sall    $31, %eax
        sarl    $31, %eax
        ret

2023-10-11  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR middle-end/101955
PR tree-optimization/106245
* simplify-rtx.cc (simplify_relational_operation_1): Simplify
the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).

gcc/testsuite/ChangeLog
* gcc.target/i386/pr106245-1.c: New test case.

21 months agoRISC-V: Enable full coverage vect tests
Juzhe-Zhong [Wed, 11 Oct 2023 05:15:02 +0000 (13:15 +0800)] 
RISC-V: Enable full coverage vect tests

I have analyzed all existing FAILs.

Except these following FAILs need to be addressed:
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/slp-reduc-7.c execution test
FAIL: gcc.dg/vect/vect-cond-arith-2.c -flto -ffat-lto-objects  scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"
FAIL: gcc.dg/vect/vect-cond-arith-2.c scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"

All other FAILs are dumple fail can be ignored (Confirm ARM SVE also has such FAILs and didn't fix them on either tests or implementation).

Now, It's time to enable full coverage vect tests including vec_unpack, vec_pack, vec_interleave, ... etc.

To see what we are still missing:

Before this patch:

                === gcc Summary ===

# of expected passes            182839
# of unexpected failures        79
# of unexpected successes       11
# of expected failures          1275
# of unresolved testcases       4
# of unsupported tests          4223

After this patch:

                === gcc Summary ===

# of expected passes            183411
# of unexpected failures        93
# of unexpected successes       7
# of expected failures          1285
# of unresolved testcases       4
# of unsupported tests          4157

There is an important issue increased that I have noticed after this patch:

FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts"

It has a related PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111721

I am gonna fix this first in the middle-end after commit this patch.

Ok for trunk ?

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add RVV.

21 months agoRefine predicate of operands[2] in divv4hf3 with register_operand.
liuhongt [Tue, 10 Oct 2023 03:32:09 +0000 (11:32 +0800)] 
Refine predicate of operands[2] in divv4hf3 with register_operand.

In the expander, it will emit below insn.

rtx tmp = gen_rtx_VEC_CONCAT (V4SFmode, operands[2],
force_reg (V2SFmode, CONST1_RTX (V2SFmode)));

but *vec_concat<mode> only allow register_operand.

gcc/ChangeLog:

PR target/111745
* config/i386/mmx.md (divv4hf3): Refine predicate of
operands[2] with register_operand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr111745.c: New test.

21 months agoRISC-V Regression: Make pattern match more accurate of vect-live-2.c
Juzhe-Zhong [Tue, 10 Oct 2023 14:57:46 +0000 (22:57 +0800)] 
RISC-V Regression: Make pattern match more accurate of vect-live-2.c

Like previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632400.html
https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-live-2.c: Make pattern match more accurate.

21 months agoRISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV
Juzhe-Zhong [Tue, 10 Oct 2023 14:49:13 +0000 (22:49 +0800)] 
RISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV

As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632288.html

Add vect_ext_char_longlong to fix FAIL for RVV.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-multitypes-16.c: Adapt check for RVV.
* lib/target-supports.exp: Add vect_ext_char_longlong property.

21 months agoDaily bump.
GCC Administrator [Wed, 11 Oct 2023 00:17:56 +0000 (00:17 +0000)] 
Daily bump.

21 months agoRISC-V: far-branch: Handle far jumps and branches for functions larger than 1MB
Andrew Waterman [Tue, 10 Oct 2023 18:34:04 +0000 (12:34 -0600)] 
RISC-V: far-branch: Handle far jumps and branches for functions larger than 1MB

On RISC-V, branches further than +/-1MB require a longer instruction
sequence (3 instructions): we can reuse the jump-construction in the
assmbler (which clobbers $ra) and a temporary to set up the jump
destination.

gcc/ChangeLog:

* config/riscv/riscv.cc (struct machine_function): Track if a
far-branch/jump is used within a function (and $ra needs to be
saved).
(riscv_print_operand): Implement 'N' (inverse integer branch).
(riscv_far_jump_used_p): Implement.
(riscv_save_return_addr_reg_p): New function.
(riscv_save_reg_p): Use riscv_save_return_addr_reg_p.
* config/riscv/riscv.h (FIXED_REGISTERS): Update $ra.
(CALL_USED_REGISTERS): Update $ra.
* config/riscv/riscv.md: Add new types "ret" and "jalr".
(length attribute): Handle long conditional and unconditional
branches.
(conditional branch pattern): Handle case where jump can not
reach the intended target.
(indirect_jump, tablejump): Use new "jalr" type.
(simple_return): Use new "ret" type.
(simple_return_internal, eh_return_internal): Likewise.
(gpr_restore_return, riscv_mret): Likewise.
(riscv_uret, riscv_sret): Likewise.
* config/riscv/generic.md (generic_branch): Also recognize jalr & ret
types.
* config/riscv/sifive-7.md (sifive_7_jump): Likewise.

Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
21 months agoc++: mangle multiple levels of template parms [PR109422]
Jason Merrill [Fri, 6 Oct 2023 15:41:20 +0000 (11:41 -0400)] 
c++: mangle multiple levels of template parms [PR109422]

This becomes be more important with concepts, but can also be seen with
generic lambdas.

PR c++/109422

gcc/cp/ChangeLog:

* mangle.cc (write_template_param): Also mangle level.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic-mangle1.C: New test.
* g++.dg/cpp2a/lambda-generic-mangle1a.C: New test.

21 months agoMATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)`
Andrew Pinski [Mon, 9 Oct 2023 18:07:08 +0000 (11:07 -0700)] 
MATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)`

So currently we have a simplification for `a | ~(a ^ b)` but
that does not match the case where we had originally `(~a) | (a ^ b)`
so we need to add a new pattern that matches that and uses bitwise_inverted_equal_p
that also catches comparisons too.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111679

gcc/ChangeLog:

* match.pd (`a | ((~a) ^ b)`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-5.c: New test.

21 months agoRISC-V Regression: Make match patterns more accurate
Juzhe-Zhong [Tue, 10 Oct 2023 02:47:42 +0000 (10:47 +0800)] 
RISC-V Regression: Make match patterns more accurate

This patch fixes following 2 FAILs in RVV regression since the check is not accurate.

It's inspired by Robin's previous patch:
https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/

gcc/testsuite/ChangeLog:

* gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern.
* gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.

21 months agoRISC-V Regression: Fix FAIL of predcom-2.c
Juzhe-Zhong [Tue, 10 Oct 2023 02:58:35 +0000 (10:58 +0800)] 
RISC-V Regression: Fix FAIL of predcom-2.c

Like GCN, add -fno-tree-vectorize.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/predcom-2.c: Add riscv.

21 months agoRISC-V Regression: Fix FAIL of pr65947-8.c for RVV
Juzhe-Zhong [Tue, 10 Oct 2023 12:55:47 +0000 (20:55 +0800)] 
RISC-V Regression: Fix FAIL of pr65947-8.c for RVV

This test is testing fold_extract_last pattern so it's more reasonable use
vect_fold_extract_last instead of specifying targets.

This is the vect_fold_extract_last property:
proc check_effective_target_vect_fold_extract_last { } {
    return [expr { [check_effective_target_aarch64_sve]
   || [istarget amdgcn*-*-*]
   || [check_effective_target_riscv_v] }]
}

include ARM SVE/GCN/RVV.

It perfectly matches what we want and more reasonable, better maintainment.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr65947-8.c: Use vect_fold_extract_last.

21 months agoMAINTAINERS: Add myself to write after approval
Christoph Müllner [Fri, 6 Oct 2023 08:34:23 +0000 (10:34 +0200)] 
MAINTAINERS: Add myself to write after approval

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
ChangeLog:

* MAINTAINERS: Add myself.

21 months agoRISC-V: Add VLS BOOL mode vcond_mask[PR111751]
Juzhe-Zhong [Tue, 10 Oct 2023 12:15:35 +0000 (20:15 +0800)] 
RISC-V: Add VLS BOOL mode vcond_mask[PR111751]

Richard patch resolve PR111751: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=7c76c876e917a1f20a788f602cc78fff7d0a2a65

which cause ICE in RISC-V regression:

FAIL: gcc.dg/torture/pr53144.c   -O2  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O3 -g  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O3 -g  (test for excess errors)

VLS BOOL modes vcond_mask is needed to fix this regression ICE.

More details: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

Tested and Committed.

PR target/111751

gcc/ChangeLog:

* config/riscv/autovec.md: Add VLS BOOL modes.

21 months agotree-optimization/111751 - support 1024 bit vector constant reinterpretation
Richard Biener [Tue, 10 Oct 2023 11:33:34 +0000 (13:33 +0200)] 
tree-optimization/111751 - support 1024 bit vector constant reinterpretation

The following ups the limit in fold_view_convert_expr to handle
1024bit vectors as used by GCN and RVV.  It also robustifies
the handling in visit_reference_op_load to properly give up when
constants cannot be re-interpreted.

PR tree-optimization/111751
* fold-const.cc (fold_view_convert_expr): Up the buffer size
to 128 bytes.
* tree-ssa-sccvn.cc (visit_reference_op_load): Special case
constants, giving up when re-interpretation to the target type
fails.

21 months agoada: Fix internal error on too large representation clause for small component
Eric Botcazou [Thu, 28 Sep 2023 14:25:31 +0000 (16:25 +0200)] 
ada: Fix internal error on too large representation clause for small component

This is a small bug present on strict-alignment platforms for questionable
representation clauses.

gcc/ada/

* gcc-interface/decl.cc (inline_status_for_subprog): Minor tweak.
(gnat_to_gnu_field): Try harder to get a packable form of the type
for a bitfield.

21 months agoada: Tweak internal subprogram in Ada.Directories
Ronan Desplanques [Fri, 29 Sep 2023 10:35:05 +0000 (12:35 +0200)] 
ada: Tweak internal subprogram in Ada.Directories

The purpose of this patch is to work around false-positive warnings
emitted by GNAT SAS (also known as CodePeer). It does not change
the behavior of the modified subprogram.

gcc/ada/

* libgnat/a-direct.adb (Start_Search_Internal): Tweak subprogram
body.

21 months agoada: Remove superfluous setter procedure
Eric Botcazou [Fri, 29 Sep 2023 08:43:15 +0000 (10:43 +0200)] 
ada: Remove superfluous setter procedure

It is only called once.

gcc/ada/

* sem_util.ads (Set_Scope_Is_Transient): Delete.
* sem_util.adb (Set_Scope_Is_Transient): Likewise.
* exp_ch7.adb (Create_Transient_Scope): Set Is_Transient directly.

21 months agoada: Fix bad finalization of limited aggregate in conditional expression
Eric Botcazou [Wed, 27 Sep 2023 18:42:41 +0000 (20:42 +0200)] 
ada: Fix bad finalization of limited aggregate in conditional expression

This happens when the conditional expression is immediately returned, for
example in an expression function.

gcc/ada/

* exp_aggr.adb (Is_Build_In_Place_Aggregate_Return): Return true
if the aggregate is a dependent expression of a conditional
expression being returned from a build-in-place function.

21 months agoada: Fix infinite loop with multiple limited with clauses
Eric Botcazou [Tue, 26 Sep 2023 20:54:12 +0000 (22:54 +0200)] 
ada: Fix infinite loop with multiple limited with clauses

This occurs when one of the types has an incomplete declaration in addition
to its full declaration in its package. In this case AI05-129 says that the
incomplete type is not part of the limited view of the package, i.e. only
the full view is. Now, in the GNAT implementation, it's the opposite in the
regular view of the package, i.e. the incomplete type is the visible one.

That's why the implementation needs to also swap the types on the visibility
chain while it is swapping the views when the clauses are either installed
or removed. This works correctly for the installation, but does not for the
removal, so this change rewrites the code doing the latter.

gcc/ada/
PR ada/111434
* sem_ch10.adb (Replace): New procedure to replace an entity with
another on the homonym chain.
(Install_Limited_With_Clause): Rename Non_Lim_View to Typ for the
sake of consistency.  Call Replace to do the replacements and split
the code into the regular and the special cases.  Add debuggging
output controlled by -gnatdi.
(Install_With_Clause): Print the Parent_With and Implicit_With flags
in the debugging output controlled by -gnatdi.
(Remove_Limited_With_Unit.Restore_Chain_For_Shadow (Shadow)): Rewrite
using a direct replacement of E4 by E2.   Call Replace to do the
replacements.  Add debuggging output controlled by -gnatdi.

21 months agoada: Fix filesystem entry filtering
Ronan Desplanques [Tue, 19 Sep 2023 07:45:16 +0000 (09:45 +0200)] 
ada: Fix filesystem entry filtering

This patch fixes the behavior of Ada.Directories.Search when being
requested to filter out regular files or directories. One of the
configurations in which that behavior was incorrect was that when the
caller requested only the regular and special files but not the
directories, the directories would still be returned.

gcc/ada/

* libgnat/a-direct.adb: Fix filesystem entry filtering.

21 months agoada: Tweak documentation comments
Ronan Desplanques [Mon, 25 Sep 2023 09:14:58 +0000 (11:14 +0200)] 
ada: Tweak documentation comments

The concept of extended nodes was retired at the same time Gen_IL
was introduced, but there was a reference to that concept left over
in a comment. This patch removes that reference.

Also, the description of the field Comes_From_Check_Or_Contract was
incorrectly placed in a section for fields present in all nodes in
sinfo.ads. This patch fixes this.

gcc/ada/

* atree.ads, nlists.ads, types.ads: Remove references to extended
nodes. Fix typo.
* sinfo.ads: Likewise and fix position of
Comes_From_Check_Or_Contract description.

21 months agoada: Crash processing pragmas Compile_Time_Error and Compile_Time_Warning
Javier Miranda [Tue, 19 Sep 2023 13:54:28 +0000 (13:54 +0000)] 
ada: Crash processing pragmas Compile_Time_Error and Compile_Time_Warning

gcc/ada/

* sem_attr.adb (Analyze_Attribute): Protect the frontend against
replacing 'Size by its static value if 'Size is not known at
compile time and we are processing pragmas Compile_Time_Warning or
Compile_Time_Errors.

21 months agoRISC-V: Add testcase for SCCVN optimization[PR111751]
Juzhe-Zhong [Tue, 10 Oct 2023 11:47:06 +0000 (19:47 +0800)] 
RISC-V: Add testcase for SCCVN optimization[PR111751]

Add testcase for PR111751 which has been fixed:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632474.html

PR target/111751

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr111751.c: New test.

21 months agoFix missed CSE with a BLKmode entity
Richard Biener [Tue, 10 Oct 2023 09:09:16 +0000 (11:09 +0200)] 
Fix missed CSE with a BLKmode entity

The following fixes fallout of r10-7145-g1dc00a8ec9aeba which made
us cautionous about CSEing a load to an object that has padding bits.
The added check also triggers for BLKmode entities like STRING_CSTs
but by definition a BLKmode entity does not have padding bits.

PR tree-optimization/111751
* tree-ssa-sccvn.cc (visit_reference_op_load): Exempt
BLKmode result from the padding bits check.

21 months agoRISC-V Regression: Fix FAIL of bb-slp-pr65935.c for RVV
Juzhe-Zhong [Tue, 10 Oct 2023 01:39:04 +0000 (09:39 +0800)] 
RISC-V Regression: Fix FAIL of bb-slp-pr65935.c for RVV

Here is the reference comparing dump IR between ARM SVE and RVV.

https://godbolt.org/z/zqess8Gss

We can see RVV has one more dump IR:
optimized: basic block part vectorized using 128 byte vectors
since RVV has 1024 bit vectors.

The codegen is reasonable good.

However, I saw GCN also has 1024 bit vector.
This patch may cause this case FAIL in GCN port ?

Hi, GCN folk, could you check this patch in GCN port for me ?

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-pr65935.c: Add vect1024 variant.
* lib/target-supports.exp: Ditto.

21 months agoarc: Refurbish add.f combiner patterns
Claudiu Zissulescu [Tue, 10 Oct 2023 07:11:39 +0000 (10:11 +0300)] 
arc: Refurbish add.f combiner patterns

Refurbish add compare patterns: use 'r' constraint, fix identation,
and fix pattern to match 'if (a+b) { ... }' constructions.

gcc/

* config/arc/arc.cc (arc_select_cc_mode): Match NEG code with
the first operand.
* config/arc/arc.md (addsi_compare): Make pattern canonical.
(addsi_compare_2): Fix identation, constraint letters.
(addsi_compare_3): Likewise.

gcc/testsuite/

* gcc.target/arc/add_f-combine.c: New test.

Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
21 months agoRISC-V: Add available vector size for RVV
Juzhe-Zhong [Mon, 9 Oct 2023 23:23:26 +0000 (07:23 +0800)] 
RISC-V: Add available vector size for RVV

For RVV, we have VLS modes enable according to TARGET_MIN_VLEN
from M1 to M8.

For example, when TARGET_MIN_VLEN = 128 bits, we enable
128/256/512/1024 bits VLS modes.

This patch fixes following FAIL:
FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 2
FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: basic block" 2

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add 256/512/1024

21 months agoDaily bump.
GCC Administrator [Tue, 10 Oct 2023 00:19:25 +0000 (00:19 +0000)] 
Daily bump.

21 months agoFixes for profile count/probability maintenance
Eugene Rozenfeld [Sat, 16 Sep 2023 01:12:47 +0000 (18:12 -0700)] 
Fixes for profile count/probability maintenance

Verifier checks have recently been strengthened to check that
all counts and probabilities are initialized. The checks fired
during autoprofiledbootstrap build and this patch fixes it.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* auto-profile.cc (afdo_calculate_branch_prob): Fix count comparisons
* tree-vect-loop-manip.cc (vect_do_peeling): Guard against zero count
when scaling loop profile

21 months agoanalyzer: fix build with gcc < 6
David Malcolm [Mon, 9 Oct 2023 19:11:52 +0000 (15:11 -0400)] 
analyzer: fix build with gcc < 6

gcc/analyzer/ChangeLog:
* access-diagram.cc (boundaries::add): Explicitly state
"boundaries::" scope for "kind" enum.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
21 months agoEnsure float equivalences include + and - zero.
Andrew MacLeod [Mon, 9 Oct 2023 14:15:07 +0000 (10:15 -0400)] 
Ensure float equivalences include + and - zero.

A floating point equivalence may not properly reflect both signs of
zero, so be pessimsitic and ensure both signs are included.

PR tree-optimization/111694
gcc/
* gimple-range-cache.cc (ranger_cache::fill_block_cache): Adjust
equivalence range.
* value-relation.cc (adjust_equivalence_range): New.
* value-relation.h (adjust_equivalence_range): New prototype.

gcc/testsuite/
* gcc.dg/pr111694.c: New.

21 months agoRemove unused get_identity_relation.
Andrew MacLeod [Mon, 9 Oct 2023 14:01:11 +0000 (10:01 -0400)] 
Remove unused get_identity_relation.

Turns out we didnt need this as there is no unordered relations
managed by the oracle.

* gimple-range-gori.cc (gori_compute::compute_operand1_range): Do
not call get_identity_relation.
(gori_compute::compute_operand2_range): Ditto.
* value-relation.cc (get_identity_relation): Remove.
* value-relation.h (get_identity_relation): Remove protyotype.