git.ipfire.org Git - thirdparty/gcc.git/log

]> git.ipfire.org Git - thirdparty/gcc.git/log

projects / thirdparty / gcc.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Joseph Myers [Thu, 11 May 2023 08:07:00 +0000 (10:07 +0200)]

Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]

This is one part of the fix for PR109128, along with a corresponding
binutils's linker change. Without this patch, what happens in the
linker, when an unused object in a .a file has offload data, is that
elf_link_is_defined_archive_symbol calls bfd_link_plugin_object_p,
which ends up calling the plugin's claim_file_handler, which then
records the object as one with offload data. That is, the linker never
decides to use the object in the first place, but use of this _p
interface (called as part of trying to decide whether to use the
object) results in the plugin deciding to use its offload data (and a
consequent mismatch in the offload data present at runtime).

The new hook allows the linker plugin to distinguish calls to
claim_file_handler that know the object is being used by the linker
(from ldmain.c:add_archive_element), from calls that don't know it's
being used by the linker (from elf_link_is_defined_archive_symbol); in
the latter case, the plugin should avoid recording the object as one
with offload data.

PR middle-end/109128

include/
* plugin-api.h (ld_plugin_claim_file_handler_v2)
(ld_plugin_register_claim_file_v2)
(LDPT_REGISTER_CLAIM_FILE_HOOK_V2): New.
(struct ld_plugin_tv): Add tv_register_claim_file_v2.

lto-plugin/
* lto-plugin.c (register_claim_file_v2): New.
(claim_file_handler_v2): New.
(claim_file_handler): Wrap claim_file_handler_v2.
(onload): Handle LDPT_REGISTER_CLAIM_FILE_HOOK_V2.

commit | commitdiff | tree

Thomas Schwinge [Wed, 10 May 2023 07:17:47 +0000 (09:17 +0200)]

Testsuite: Add 'torture-init-done', and use it to conditionalize implicit 'torture-init'

Recent commit d6654a4be3ba44c0d57be7c8a51d76d9721345e1
"Let each 'lto_init' determine the default 'LTO_OPTIONS', and 'torture-init' the 'LTO_TORTURE_OPTIONS'"
made 'torture-init' non-idempotent re 'LTO_TORTURE_OPTIONS', in order to catch
certain classes of errors. Now, most of all '*.exp' files have 'torture-init'
followed by 'set-torture-options' before 'gcc-dg-runtest' etc., and therefore
don't run into the latter's
"Some callers set torture options themselves; don't override those." code.
Some '*.exp' files however do 'torture-init' but not 'set-torture-options', and
therefore we can't any longer conditionalize the implicit 'torture-init' by
'![torture-options-exist]'.

gcc/testsuite/
* lib/torture-options.exp (torture-init-done): Add.
* lib/gcc-dg.exp (gcc-dg-runtest): Use it to conditionalize
implicit 'torture-init'.
* lib/gfortran-dg.exp (gfortran-dg-runtest): Likewise.
* lib/obj-c++-dg.exp (obj-c++-dg-runtest): Likewise.
* lib/objc-dg.exp (objc-dg-runtest): Likewise.

commit | commitdiff | tree

Thomas Schwinge [Tue, 9 May 2023 08:35:27 +0000 (10:35 +0200)]

Testsuite: Add missing 'torture-init'/'torture-finish' around 'LTO_TORTURE_OPTIONS' usage

Recent commit d6654a4be3ba44c0d57be7c8a51d76d9721345e1
"Let each 'lto_init' determine the default 'LTO_OPTIONS', and 'torture-init' the 'LTO_TORTURE_OPTIONS'"
made it a requirement that 'LTO_TORTURE_OPTIONS' usage be within
'torture-init'/'torture-finish', and missed a few cases that didn't have that.

gcc/testsuite/
* gcc.target/arm/acle/acle.exp: Add missing
'torture-init'/'torture-finish' around 'LTO_TORTURE_OPTIONS'
usage.
* gcc.target/arm/cmse/cmse.exp: Likewise.
* gcc.target/arm/pure-code/pure-code.exp: Likewise.

commit | commitdiff | tree

Roger Sayle [Thu, 11 May 2023 07:15:21 +0000 (08:15 +0100)]

match.pd: Simplify popcount(X&Y)+popcount(X|Y) as popcount(X)+popcount(Y)

This patch teaches match.pd to simplify popcount(X&Y)+popcount(X|Y) as
popcount(X)+popcount(Y), and the related simplifications that
popcount(X)+popcount(Y)-popcount(X&Y) is popcount(X|Y).  As surprising
as it might seem, this idiom is common in cheminformatics codes
(for Tanimoto coefficient calculations).

2023-05-11  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* match.pd <popcount optimizations>: Simplify popcount(X|Y) +
popcount(X&Y) as popcount(X)+popcount(Y).  Likewise, simplify
popcount(X)+popcount(Y)-popcount(X&Y) as popcount(X|Y), and
vice versa.

gcc/testsuite/ChangeLog
* gcc.dg/fold-popcount-8.c: New test case.
* gcc.dg/fold-popcount-9.c: Likewise.
* gcc.dg/fold-popcount-10.c: Likewise.

commit | commitdiff | tree

Roger Sayle [Thu, 11 May 2023 07:10:04 +0000 (08:10 +0100)]

match.pd: Simplify popcount/parity of bswap/rotate.

This is the latest iteration of my patch from August 2020
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552391.html
incorperating feedback and suggestions from reviewers.

This patch to match.pd optimizes away bit permutation operations,
specifically bswap and rotate, in calls to popcount and parity.

2023-05-11  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* match.pd <popcount optimizations>: Simplify popcount(bswap(x))
as popcount(x).  Simplify popcount(rotate(x,y)) as popcount(x).
<parity optimizations>:  Simplify parity(bswap(x)) as parity(x).
Simplify parity(rotate(x,y)) as parity(x).

gcc/testsuite/ChangeLog
* gcc.dg/fold-parity-6.c: New test.
* gcc.dg/fold-parity-7.c: Likewise.
* gcc.dg/fold-popcount-6.c: Likewise.
* gcc.dg/fold-popcount-7.c: Likewise.

commit | commitdiff | tree

Juzhe-Zhong [Wed, 10 May 2023 04:00:35 +0000 (12:00 +0800)]

RISC-V: Support const series vector for RVV auto-vectorization

This patch is the prerequiste patch for more RVV auto-vectorization
support.

Since when we enable a very simple binary operations, we will end
up with such following ICE:

during RTL pass: expand
add_run-1.c: In function 'main':
add_run-1.c:28:1: internal compiler error: Segmentation fault
0x1618ea3 crash_signal
../../../riscv-gcc/gcc/toplev.cc:314
0xe76cd9 single_set(rtx_insn const*)
../../../riscv-gcc/gcc/rtl.h:3602
0x1080f8a emit_move_insn(rtx_def*, rtx_def*)
../../../riscv-gcc/gcc/expr.cc:4342
0x170c458 insert_value_copy_on_edge
../../../riscv-gcc/gcc/tree-outof-ssa.cc:352
0x170d58e eliminate_phi
../../../riscv-gcc/gcc/tree-outof-ssa.cc:785
0x170df17 expand_phi_nodes(ssaexpand*)
../../../riscv-gcc/gcc/tree-outof-ssa.cc:1024
0xef27e2 execute
../../../riscv-gcc/gcc/cfgexpand.cc:6818

This is because LoopVectorizer assume target is able to handle
series const vector when we enable binary operations.
Then it will be easily causing ICE like that.

gcc/ChangeLog:

* config/riscv/autovec.md (@vec_series<mode>): New pattern
* config/riscv/riscv-protos.h (expand_vec_series): New function.
* config/riscv/riscv-v.cc (emit_binop): Ditto.
(emit_index_op): Ditto.
(expand_vec_series): Ditto.
(expand_const_vector): Add series vector handling.
* config/riscv/riscv.cc (riscv_const_insns): Enable series vector for testing.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/series-1.c: New test.
* gcc.target/riscv/rvv/autovec/series_run-1.c: New test.

commit | commitdiff | tree

Ju-Zhe Zhong [Thu, 11 May 2023 00:58:03 +0000 (08:58 +0800)]

MAINTAINERS: Add myself to write after approval

Signed-off-by: Juzhe Zhong <juzhe.zhong@rivai.ai>
ChangeLog:

* MAINTAINERS: Add myself.

commit | commitdiff | tree

GCC Administrator [Thu, 11 May 2023 00:16:33 +0000 (00:16 +0000)]

Daily bump.

commit | commitdiff | tree

Marek Polacek [Tue, 2 May 2023 21:36:00 +0000 (17:36 -0400)]

c++: wrong std::is_convertible with cv-qual fn [PR109680]

This PR points out that std::is_convertible has given the wrong answer
in

  static_assert (!std::is_convertible_v <int () const, int (*) ()>, "");

since r13-2822 implemented __is_{,nothrow_}convertible.

std::is_convertible uses the imaginary

  To test() { return std::declval<From>(); }

to do its job.  Here, From is 'int () const'.  std::declval is defined as:

  template<class T>
  typename std::add_rvalue_reference<T>::type declval() noexcept;

std::add_rvalue_reference is defined as "If T is a function type that
has no cv- or ref- qualifier or an object type, provides a member typedef
type which is T&&, otherwise type is T."

In our case, T is cv-qualified, so the result is T, so we end up with

  int () const declval() noexcept;

which is invalid.  In other words, this is pretty much like:

  using T = int () const;
  T fn1(); // bad, fn returning a fn
  T& fn2(); // bad, cannot declare reference to qualified function type
  T* fn3(); // bad, cannot declare pointer to qualified function type

  using U = int ();
  U fn4(); // bad, fn returning a fn
  U& fn5(); // OK
  U* fn6(); // OK

I think is_convertible_helper needs to simulate std::declval better.
To that end, I'm introducing build_trait_object, to be used where
a declval is needed.

PR c++/109680

gcc/cp/ChangeLog:

* method.cc (build_trait_object): New.
(assignable_expr): Use it.
(ref_xes_from_temporary): Likewise.
(is_convertible_helper): Likewise.  Check FUNC_OR_METHOD_TYPE_P.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible6.C: New test.

commit | commitdiff | tree

Roger Sayle [Wed, 10 May 2023 21:10:49 +0000 (22:10 +0100)]

Use [(const_int 0)] idiom consistently in i386.md

This cleans up the use of [(clobber (const_int 0))] in the i386 backend.

2023-05-10 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386.md (*concat<mode><dwi>3_1): Use preferred
[(const_int 0)] idiom, instead of [(clobber (const_int 0))].
(*concat<mode><dwi>3_2): Likewise.
(*concat<mode><dwi>3_3): Likewise.
(*concat<mode><dwi>3_4): Likewise.
(*concat<mode><dwi>3_5): Likewise.
(*concat<mode><dwi>3_6): Likewise.
(*concat<mode><dwi>3_7): Likewise.

commit | commitdiff | tree

Jason Merrill [Thu, 23 Mar 2023 14:41:29 +0000 (10:41 -0400)]

c++: adjust conversion diagnostics

While looking at PR109247 I made this change to improve diagnostics. I
don't think I'm going ahead with that patch, but this still seems like a
worthy cleanup.

gcc/cp/ChangeLog:

* call.cc (convert_like_internal): Share ck_ref_bind handling
between all bad conversions.

commit | commitdiff | tree

Uros Bizjak [Wed, 10 May 2023 20:40:53 +0000 (22:40 +0200)]

i386: Add missing vector extend patterns [PR92658]

Add missing insn pattern for v2qi -> v2si vector extend and named
expanders to activate generation of vector extends to 8-byte and 4-byte
vectors.

gcc/ChangeLog:

PR target/92658
* config/i386/mmx.md (sse4_1_<code>v2qiv2si2): New insn pattern.
(<insn>v4qiv4hi2): New expander.
(<insn>v2hiv2si2): Ditto.
(<insn>v2qiv2si2): Ditto.
(<insn>v2qiv2hi2): Ditto.

gcc/testsuite/ChangeLog:

PR target/92658
* gcc.target/i386/pr92658-sse4-4b.c: New test.
* gcc.target/i386/pr92658-sse4-8b.c: New test.

commit | commitdiff | tree

Bernhard Reutner-Fischer [Tue, 9 May 2023 15:35:20 +0000 (17:35 +0200)]

Fortran: dump-parse-tree: Mark debug functions with DEBUG_FUNCTION

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (gfc_debug_expr): Remove forward declaration.
(debug): Add DEBUG_FUNCTION.
(show_code_node): Remove erroneous whitespace.

commit | commitdiff | tree

Bernhard Reutner-Fischer [Tue, 9 May 2023 15:21:16 +0000 (17:21 +0200)]

Fortran: dump-parse-tree attribs: fix unbalanced braces [PR109624]

gcc/fortran/ChangeLog:

PR fortran/109624
* dump-parse-tree.cc (debug): New function for gfc_namespace.
(gfc_debug_code): Delete forward declaration.
(show_attr): Make sure to print balanced braces.

commit | commitdiff | tree

Andrew Pinski [Wed, 10 May 2023 17:56:15 +0000 (17:56 +0000)]

Add another new testcase

While working on improving min/max detection, this
code (which is reduced from worse_state in ipa-pure-const.cc)
was being miscompiled. Since there was no testcase in the
testsuite yet for this, this patch adds one.

Committed as obvious after testing the testcase via:
make check-gcc RUNTESTFLAGS="execute.exp=20230510-1.c"

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/20230510-1.c: New test.

commit | commitdiff | tree

François Dumont [Wed, 3 May 2023 04:45:47 +0000 (06:45 +0200)]

libstdc++: [_Hashtable] Implement several small methods implicitly inline

Make implementation of 3 simple _Hashtable methods implicitly inline.

Avoid usage of const_iterator abstraction within _Hashtable implementation.

Replace several usages of __node_type* with expected __node_ptr.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h
(_NodeBuilder<>::_S_build): Use __node_ptr.
(_ReuseOrAllocNode<>): Use __node_ptr in place of __node_type*.
(_AllocNode<>): Likewise.
(_Equality<>::_M_equal): Remove const_iterator usages. Only preserved
to call std::is_permutation in the non-unique key implementation.
* include/bits/hashtable.h (_Hashtable<>::_M_update_begin()): Capture
_M_begin() once.
(_Hashtable<>::_M_bucket_begin(size_type)): Implement implicitly inline.
(_Hashtable<>::_M_insert_bucket_begin): Likewise.
(_Hashtable<>::_M_remove_bucket_begin): Likewise.
(_Hashtable<>::_M_compute_hash_code): Use __node_ptr rather than
const_iterator.
(_Hashtable<>::find): Likewise.
(_Hashtable<>::_M_emplace): Likewise.
(_Hashtable<>::_M_insert_unique): Likewise.

commit | commitdiff | tree

Pan Li [Wed, 10 May 2023 15:41:08 +0000 (23:41 +0800)]

MAINTAINERS: Add myself to write after approval

Signed-off-by: Pan Li <pan2.li@intel.com>
ChangeLog:

* MAINTAINERS: Add myself.

commit | commitdiff | tree

Jason Merrill [Mon, 6 Feb 2023 23:08:17 +0000 (15:08 -0800)]

c++: be stricter about constinit [CWG2543]

DR 2543 clarifies that constinit variables should follow the language, and
diagnose non-constant initializers (according to [expr.const]) even if they
can actually initialize the variables statically.

DR 2543

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_outermost_constant_expr): Preserve
TARGET_EXPR flags.
(potential_constant_expression_1): Check TARGET_EXPR_ELIDING_P.
* typeck2.cc (store_init_value): Diagnose constinit sooner.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2543.C: New test.

commit | commitdiff | tree

Jason Merrill [Tue, 9 May 2023 22:48:11 +0000 (18:48 -0400)]

c++: always check consteval address

The restriction on the "permitted result of a constant expression" to not
refer to an immediate function applies regardless of context. The previous
code tried to only check in cases where we wouldn't get the check in
cp_fold_r, but with the next patch I would need to add another case and it
shouldn't be a problem to always check.

We also shouldn't talk about immediate evaluation when we aren't dealing
with one.

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_outermost_constant_expr): Always check
for address of immediate fn.
(maybe_constant_init_1): Evaluate PTRMEM_CST.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2478.C: Handle -fimplicit-constexpr.
* g++.dg/cpp23/consteval-if12.C: Adjust diagnostics.
* g++.dg/cpp2a/consteval20.C: Likewise.
* g++.dg/cpp2a/consteval24.C: Likewise.
* g++.dg/cpp2a/srcloc20.C: Likewise.

commit | commitdiff | tree

Richard Biener [Wed, 10 May 2023 13:36:12 +0000 (15:36 +0200)]

Avoid g++.dg/torture/pr106922.C FAIL with the pre-C++11 ABI

The following forces the g++.dg/torture/pr106922.C testcase to use
the C++11 libstdc++ ABI and checks whether that worked.

gcc/testsuite/
* g++.dg/torture/pr106922.C: Force _GLIBCXX_USE_CXX11_ABI to 1.

commit | commitdiff | tree

Jeff Law [Wed, 10 May 2023 11:25:12 +0000 (05:25 -0600)]

Fix a couple constraints on the H8 in preparation for LRA conversion

So this is the 2nd patch on the way to LRA for the H8.

LRA is more sensitive to getting define_constraint vs define_memory_constraint
vs define_special_memory_constraint correct.  than reload.

The H8 port has the "Q" constraint, which is used to indicate memory addresses
that can be used under certain circumstances in various ALU operations.  So it
should be a memory constraint.  Ideally it'd would be a simple memory
constraint, but it's used in contexts where MEMs are valid only for certain
parts in the H8 family.  So it really needs to be a special_memory_constraint.

The "Zz" constraint accepts memory, but the forms are limited and can not be
reloaded into a register.   It seems to be working, but I wouldn't be totally
surprised if this got stressed in the right way if it broke.

Anyway, this patch fixes "Q" and "Zz" to be special memory constraints.

Regression tested with gdbsim and pushed to the trunk.

gcc
* config/h8300/constraints.md (Q): Make this a special memory
constraint.
(Zz): Similarly.

commit | commitdiff | tree

Jakub Jelinek [Wed, 10 May 2023 11:20:39 +0000 (13:20 +0200)]

ipa-prop: Fix ipa_get_callee_param_type for calls with argument type mismatches

The PR contains a testcase where the Fortran FE creates FUNCTION_TYPE
which doesn't really match the passed in arguments (FUNCTION_TYPE has
5 arguments, call has 6).  Now, I think that is a Fortran FE bug that
should be fixed there, but I think with function pointers one can
create something similar (of course invalid) in C/C++ too,so IMHO IPA
should be also more careful.
The ipa_get_callee_param_type function can return NULL if something goes
wrong and it does e.g. if asked for 7th argument type on a function
with just 5 arguments and similar.  But, if a function isn't varargs,
when asked for 6th argument type on a function with just 5 arguments
it actually returns void_type_node because the argument list is in that
case terminated with void_list_node.

The following patch makes sure we don't treat void_list_node as something
holding another argument.

2023-05-10  Jakub Jelinek  <jakub@redhat.com>

PR fortran/109788
* ipa-prop.cc (ipa_get_callee_param_type): Don't return TREE_VALUE (t)
if t is void_list_node.

commit | commitdiff | tree

Kyrylo Tkachov [Wed, 10 May 2023 11:04:47 +0000 (12:04 +0100)]

aarch64: Simplify sqmovun expander

This patch is a no-op as it removes the explicit vec-concat-zero patterns in favour of vczle/vczbe.
This allows us to delete the explicit expander too. Tests are added to ensure the optimisation required
still triggers.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (aarch64_sqmovun<mode>_insn_le): Delete.
(aarch64_sqmovun<mode>_insn_be): Delete.
(aarch64_sqmovun<mode><vczle><vczbe>): New define_insn.
(aarch64_sqmovun<mode>): Delete expander.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/simd/pr99195_4.c: Add tests for sqmovun.

commit | commitdiff | tree

Kyrylo Tkachov [Wed, 10 May 2023 11:00:17 +0000 (12:00 +0100)]

[PATCH] aarch64: PR target/99195 annotate simple permutation patterns for vec-concat-zero

Another straightforward patch annotating patterns for the zip1, zip2, uzp1, uzp2, rev* instructions, plus tests.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_<PERMUTE:perm_insn><mode>):
Rename to...
(aarch64_<PERMUTE:perm_insn><mode><vczle><vczbe>): ... This.
(aarch64_rev<REVERSE:rev_op><mode>): Rename to...
(aarch64_rev<REVERSE:rev_op><mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_1.c: Add tests for zip and rev
intrinsics.

commit | commitdiff | tree

Kyrylo Tkachov [Wed, 10 May 2023 10:50:01 +0000 (11:50 +0100)]

aarch64: PR target/99195 annotate simple saturating add/sub patterns for vec-concat-zero

Moving onto the saturating instructions, this one goes through the simple add/sub ones.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_<su_optab>q<addsub><mode>):
Rename to...
(aarch64_<su_optab>q<addsub><mode><vczle><vczbe>): ... This.
(aarch64_<sur>qadd<mode>): Rename to...
(aarch64_<sur>qadd<mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_1.c: Add testing for qadd, qsub.
* gcc.target/aarch64/simd/pr99195_6.c: New test.

commit | commitdiff | tree

Kyrylo Tkachov [Wed, 10 May 2023 09:44:30 +0000 (10:44 +0100)]

aarch64: Simplify QSHRN expanders and patterns

This patch deletes the explicit BYTES_BIG_ENDIAN and !BYTES_BIG_ENDIAN patterns for the QSHRN instructions in favour
of annotating a single one with <vczle><vczbe>. This allows simplification of the expander too.
Tests are added to ensure that we still optimise away the concat-with-zero use case.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn_le): Delete.
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn_be): Delete.
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn<vczle><vczbe>): New define_insn.
(aarch64_<sur>q<r>shr<u>n_n<mode>): Simplify expander.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/simd/pr99195_5.c: New test.

commit | commitdiff | tree

Kyrylo Tkachov [Wed, 10 May 2023 09:40:06 +0000 (10:40 +0100)]

aarch64: PR target/99195 annotate simple narrowing patterns for vec-concat-zero

This patch cleans up some almost-duplicate patterns for the XTN, SQXTN, UQXTN instructions.
Using the <vczle><vczbe> attributes we can remove the BYTES_BIG_ENDIAN and !BYTES_BIG_ENDIAN cases,
as well as the intrinsic expanders that select between the two.
Tests are also added. Thankfully the diffstat comes out negative \O/.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le): Delete.
(aarch64_xtn<mode>_insn_be): Likewise.
(trunc<mode><Vnarrowq>2): Rename to...
(trunc<mode><Vnarrowq>2<vczle><vczbe>): ... This.
(aarch64_xtn<mode>): Move under the above. Just emit the truncate RTL.
(aarch64_<su>qmovn<mode>): Likewise.
(aarch64_<su>qmovn<mode><vczle><vczbe>): New define_insn.
(aarch64_<su>qmovn<mode>_insn_le): Delete.
(aarch64_<su>qmovn<mode>_insn_be): Likewise.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_4.c: Add tests for vmovn, vqmovn.

commit | commitdiff | tree

Jakub Jelinek [Wed, 10 May 2023 09:37:04 +0000 (11:37 +0200)]

c++: Reject attributes without arguments used as pack expansion [PR109756]

The following testcase shows we silently accept (and ignore) attributes without
arguments used as pack expansions.  This is because we call
make_pack_expansion and that starts with
  if (!arg || arg == error_mark_node)
    return arg;
Now, an attribute without arguments like [[noreturn...]] is IMHO always
invalid, in this case for 2 reasons; one is that as it has no arguments,
no pack can be present and second is that the standard says that
attributes need to specially permit uses of parameter pack and doesn't
explicitly permit it for any of the standard attributes (except for alignas?
which has different syntax).
If an attribute has some arguments but doesn't contain packs in those
arguments, make_pack_expansion will already diagnose it.

The patch also changes cp_parser_std_attribute, such that for attributes unknown
to the compiler (or perhaps registered just for -Wno-attributes=) we differentiate
between the attribute having no arguments (in that case we want to diagnose them
when followed by ellipsis even if they are unknown, as they can't contain a pack
in that case) and the case where they do have arguments but we've just skipped over
those arguments because we don't know how to parse them (except that they are
a balanced token sequence) - in that case we really don't know if they contain
packs or not.

2023-05-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/109756
* parser.cc (cp_parser_std_attribute): For unknown attributes with
arguments set TREE_VALUE (attribute) to error_mark_node after skipping
the balanced tokens.
(cp_parser_std_attribute_list): If ... is used after attribute without
arguments, diagnose it and return error_mark_node.  If
TREE_VALUE (attribute) is error_mark_node, don't call
make_pack_expansion nor return early error_mark_node.

* g++.dg/cpp0x/gen-attrs-78.C: New test.

commit | commitdiff | tree

Li Xu [Wed, 10 May 2023 04:02:13 +0000 (04:02 +0000)]

RISC-V: Insert vsetivli zero, 0 for vmv.x.s/vfmv.f.s instructions satisfying REG_P(operand[1]) in -O0.

This issue happens is because the operand1 of scalar move can be
REG_P (operand[1]) in the O0 case, which causes the VSETVL PASS to
not insert the vsetvl instruction correctly, and the compiler crashes.

Consider this following case:
int16_t foo1 (void *base, size_t vl)
{
    int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, vl));
    return maxVal;
}

Before this patch:
bug.c:15:1: internal compiler error: Segmentation fault
   15 | }
      | ^
0x145d723 crash_signal
../.././riscv-gcc/gcc/toplev.cc:314
0x22929dd const_csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:44
0x2292a21 csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:46
0x23dfbb0 recog_356
../.././riscv-gcc/gcc/config/riscv/iterators.md:72
0x23efecd recog(rtx_def*, rtx_insn*, int*)
../.././riscv-gcc/gcc/config/riscv/iterators.md:89
0xdddc15 recog_memoized(rtx_insn*)
../.././riscv-gcc/gcc/recog.h:273

After this patch:
vsetivli zero,0,e16,m1,ta,ma
vmv.x.s a5,v1

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): For vfmv.f.s/vmv.x.s
intruction replace null avl with (const_int 0).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-10.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-11.c: New test.

commit | commitdiff | tree

Juzhe-Zhong [Tue, 9 May 2023 12:05:50 +0000 (20:05 +0800)]

RISC-V: Fix incorrect implementation of TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT

This incorrect codes blocks the scalable RVV auto-vectorization.
Take a look at this target hook implementation of aarch64.
They only have the similiar handling on TARGET_SIMD.

They let movmisalign<mode> to handle scalable vector of SVE.
For RVV, we should follow the same implementation of ARM SVE.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_support_vector_misalignment): Fix
incorrect codes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/v-2.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/zve32f-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32f-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32x-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32x-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c: Ditto.

commit | commitdiff | tree

Juzhe-Zhong [Tue, 9 May 2023 02:19:34 +0000 (10:19 +0800)]

RISC-V: Fix dead loop for user vsetvli intrinsic avl checking [PR109773]

This patch is fix dead loop in vsetvl intrinsic avl checking.

vsetvli->get_def () has vsetvli->get_def () has vsetvli.....
Then it will keep looping in the vsetvli avl checking which is a dead loop.

PR target/109773

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (avl_source_has_vsetvl_p): New function.
(source_equal_p): Fix dead loop in vsetvl avl checking.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109773-2.c: New test.

commit | commitdiff | tree

Andrew Pinski [Wed, 10 May 2023 05:02:48 +0000 (05:02 +0000)]

New testcase

While I was writting a match.pd patch, I can across GCC was being miscompiled
but no testcase was failing. So this adds that testcase.

Committed after testing on x86_64 with
make check-gcc RUNTESTFLAGS="execute.exp=20230509-1.c"

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/20230509-1.c: New test.

commit | commitdiff | tree

Hans-Peter Nilsson [Fri, 28 Apr 2023 22:12:39 +0000 (00:12 +0200)]

CRIS: Fix ccmode typo in cris_postdbr_cmpelim

Typo spotted while doing CCmode improvements, as a missed
optimization. It's almost visible from the patch context;
there's not much done in terms of "mode-adjustment" when
replacing (reg:CC CRIS_CC0_REGNUM) with a copy!
This bug affects functions in the newlib printf-formatting
functions (nothing else in libgcc or newlib libc), with the
performance impact on coremark scores being less than 1e-6
(3/5078992 cycles, 6/48543 bytes).

* config/cris/cris.cc (cris_postdbr_cmpelim): Correct mode
of modeadjusted_dccr.

commit | commitdiff | tree

GCC Administrator [Wed, 10 May 2023 00:17:49 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Joseph Myers [Tue, 9 May 2023 20:47:59 +0000 (20:47 +0000)]

Update cpplib ru.po

* ru.po: Update.

commit | commitdiff | tree

Joseph Myers [Tue, 9 May 2023 20:23:31 +0000 (20:23 +0000)]

Update gcc hr.po

* hr.po: Update.

commit | commitdiff | tree

Jonathan Wakely [Tue, 9 May 2023 17:18:01 +0000 (18:18 +0100)]

libstdc++: Fix <chrono> pretty printers and add tests

This fixes a couple of errors in the printers for chrono types, and adds
tests to ensure they keep working.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdChronoDurationPrinter):
Print floating-point durations correctly.
(StdChronoTimePointPrinter): Support printing only the value,
not the type name. Uncomment handling for known clocks.
(StdChronoZonedTimePrinter): Remove type names from output.
(StdChronoCalendarPrinter): Fix hh_mm_ss member access.
(StdChronoTimeZonePrinter): Add equals sign to output.
* testsuite/libstdc++-prettyprinters/chrono.cc: New test.

commit | commitdiff | tree

Patrick Palka [Tue, 9 May 2023 19:07:00 +0000 (15:07 -0400)]

c++: error-recovery ICE with unstable satisfaction [PR109752]

After diagnosing and recovering from unstable satisfaction, it's
possible to evaluate an atom for the first time noisily rather than
quietly. The satisfaction cache tries to handle this situation
gracefully, but apparently not gracefully enough: we inserted an empty
slot into the cache, and left it empty, which later makes
hash_table::check_complete_insertion unhappy. This patch fixes this by
removing the empty slot in this case.

PR c++/109752

gcc/cp/ChangeLog:

* constraint.cc (satisfaction_cache::satisfaction_cache): In the
unexpected case of evaluating an atom for the first time noisily,
remove the cache slot that we inserted.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr109752.C: New test.

commit | commitdiff | tree

Patrick Palka [Tue, 9 May 2023 19:06:34 +0000 (15:06 -0400)]

c++: noexcept-spec from nested class confusion [PR109761]

When late processing a noexcept-spec from a nested class after completion
of the outer class (since it's a complete-class context), we pass the wrong
class context to noexcept_override_late_checks -- the outer class type
instead of the nested class type -- which leads to bogus errors in the
below test.

This patch fixes this by making noexcept_override_late_checks obtain the
class context directly via DECL_CONTEXT instead of via an additional
parameter.

PR c++/109761

gcc/cp/ChangeLog:

* parser.cc (cp_parser_class_specifier): Don't pass a class
context to noexcept_override_late_checks.
(noexcept_override_late_checks): Remove 'type' parameter
and use DECL_CONTEXT of 'fndecl' instead.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept78.C: New test.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:49:21 +0000 (21:49 +0000)]

arm: [MVE intrinsics] rework vmaxaq vminaq

Implement vmaxaq and vminaq using the new MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins-base.def (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins-base.h (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmaxaq and
vminaq.
* config/arm/arm_mve.h (vminaq): Remove.
(vmaxaq): Remove.
(vminaq_m): Remove.
(vmaxaq_m): Remove.
(vminaq_s8): Remove.
(vmaxaq_s8): Remove.
(vminaq_s16): Remove.
(vmaxaq_s16): Remove.
(vminaq_s32): Remove.
(vmaxaq_s32): Remove.
(vminaq_m_s8): Remove.
(vmaxaq_m_s8): Remove.
(vminaq_m_s16): Remove.
(vmaxaq_m_s16): Remove.
(vminaq_m_s32): Remove.
(vmaxaq_m_s32): Remove.
(__arm_vminaq_s8): Remove.
(__arm_vmaxaq_s8): Remove.
(__arm_vminaq_s16): Remove.
(__arm_vmaxaq_s16): Remove.
(__arm_vminaq_s32): Remove.
(__arm_vmaxaq_s32): Remove.
(__arm_vminaq_m_s8): Remove.
(__arm_vmaxaq_m_s8): Remove.
(__arm_vminaq_m_s16): Remove.
(__arm_vmaxaq_m_s16): Remove.
(__arm_vminaq_m_s32): Remove.
(__arm_vmaxaq_m_s32): Remove.
(__arm_vminaq): Remove.
(__arm_vmaxaq): Remove.
(__arm_vminaq_m): Remove.
(__arm_vmaxaq_m): Remove.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:49:02 +0000 (21:49 +0000)]

arm: [MVE intrinsics] factorize vmaxaq vminaq

Factorize vmaxaq vminaq so that they use the same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_VMAXAVMINAQ, MVE_VMAXAVMINAQ_M):
New.
(mve_insn): Add vmaxa, vmina.
(supf): Add VMAXAQ_S, VMAXAQ_M_S, VMINAQ_S, VMINAQ_M_S.
* config/arm/mve.md (mve_vmaxaq_s<mode>, mve_vminaq_s<mode>):
Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmaxaq_m_s<mode>, mve_vminaq_m_s<mode>): Merge into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:46:30 +0000 (21:46 +0000)]

arm: [MVE intrinsics] add binary_maxamina shape

This patch adds the binary_maxamina shape description.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_maxamina): New.
* config/arm/arm-mve-builtins-shapes.h (binary_maxamina): New.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:32:11 +0000 (21:32 +0000)]

arm: [MVE intrinsics] rework vmaxnmaq vminnmaq

Implement vmaxnmaq and vminnmaq using the new MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins-base.def (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins-base.h (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmaxnmaq and
vminnmaq.
* config/arm/arm_mve.h (vminnmaq): Remove.
(vmaxnmaq): Remove.
(vmaxnmaq_m): Remove.
(vminnmaq_m): Remove.
(vminnmaq_f16): Remove.
(vmaxnmaq_f16): Remove.
(vminnmaq_f32): Remove.
(vmaxnmaq_f32): Remove.
(vmaxnmaq_m_f16): Remove.
(vminnmaq_m_f16): Remove.
(vmaxnmaq_m_f32): Remove.
(vminnmaq_m_f32): Remove.
(__arm_vminnmaq_f16): Remove.
(__arm_vmaxnmaq_f16): Remove.
(__arm_vminnmaq_f32): Remove.
(__arm_vmaxnmaq_f32): Remove.
(__arm_vmaxnmaq_m_f16): Remove.
(__arm_vminnmaq_m_f16): Remove.
(__arm_vmaxnmaq_m_f32): Remove.
(__arm_vminnmaq_m_f32): Remove.
(__arm_vminnmaq): Remove.
(__arm_vmaxnmaq): Remove.
(__arm_vmaxnmaq_m): Remove.
(__arm_vminnmaq_m): Remove.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:12:25 +0000 (21:12 +0000)]

arm: [MVE intrinsics] factorize vmaxnmaq vminnmaq

Factorize vmaxnmaq and vminnmaq so that they use the same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_VMAXNMA_VMINNMAQ)
(MVE_VMAXNMA_VMINNMAQ_M): New.
(mve_insn): Add vmaxnma, vminnma.
* config/arm/mve.md (mve_vmaxnmaq_f<mode>, mve_vminnmaq_f<mode>):
Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vmaxnmaq_m_f<mode>, mve_vminnmaq_m_f<mode>): Merge into ...
(@mve_<mve_insn>q_m_f<mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:05:56 +0000 (21:05 +0000)]

arm: [MVE intrinsics] rework vmaxnmavq vmaxnmvq vminnmavq vminnmvq

Implement vmaxnmavq vmaxnmvq vminnmavq vminnmvq using the new MVE
builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_PRED_P_F): New.
(vmaxnmavq, vmaxnmvq, vminnmavq, vminnmvq): New.
* config/arm/arm-mve-builtins-base.def (vmaxnmavq, vmaxnmvq)
(vminnmavq, vminnmvq): New.
* config/arm/arm-mve-builtins-base.h (vmaxnmavq, vmaxnmvq)
(vminnmavq, vminnmvq): New.
* config/arm/arm_mve.h (vminnmvq): Remove.
(vminnmavq): Remove.
(vmaxnmvq): Remove.
(vmaxnmavq): Remove.
(vmaxnmavq_p): Remove.
(vmaxnmvq_p): Remove.
(vminnmavq_p): Remove.
(vminnmvq_p): Remove.
(vminnmvq_f16): Remove.
(vminnmavq_f16): Remove.
(vmaxnmvq_f16): Remove.
(vmaxnmavq_f16): Remove.
(vminnmvq_f32): Remove.
(vminnmavq_f32): Remove.
(vmaxnmvq_f32): Remove.
(vmaxnmavq_f32): Remove.
(vmaxnmavq_p_f16): Remove.
(vmaxnmvq_p_f16): Remove.
(vminnmavq_p_f16): Remove.
(vminnmvq_p_f16): Remove.
(vmaxnmavq_p_f32): Remove.
(vmaxnmvq_p_f32): Remove.
(vminnmavq_p_f32): Remove.
(vminnmvq_p_f32): Remove.
(__arm_vminnmvq_f16): Remove.
(__arm_vminnmavq_f16): Remove.
(__arm_vmaxnmvq_f16): Remove.
(__arm_vmaxnmavq_f16): Remove.
(__arm_vminnmvq_f32): Remove.
(__arm_vminnmavq_f32): Remove.
(__arm_vmaxnmvq_f32): Remove.
(__arm_vmaxnmavq_f32): Remove.
(__arm_vmaxnmavq_p_f16): Remove.
(__arm_vmaxnmvq_p_f16): Remove.
(__arm_vminnmavq_p_f16): Remove.
(__arm_vminnmvq_p_f16): Remove.
(__arm_vmaxnmavq_p_f32): Remove.
(__arm_vmaxnmvq_p_f32): Remove.
(__arm_vminnmavq_p_f32): Remove.
(__arm_vminnmvq_p_f32): Remove.
(__arm_vminnmvq): Remove.
(__arm_vminnmavq): Remove.
(__arm_vmaxnmvq): Remove.
(__arm_vmaxnmavq): Remove.
(__arm_vmaxnmavq_p): Remove.
(__arm_vmaxnmvq_p): Remove.
(__arm_vminnmavq_p): Remove.
(__arm_vminnmvq_p): Remove.
(__arm_vmaxnmavq_m): Remove.
(__arm_vmaxnmvq_m): Remove.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:05:37 +0000 (21:05 +0000)]

arm: [MVE intrinsics] add support for mve_q_p_f

We can call code_for_mve_q_p_f only once this function exists, which
is the case after we factorized vmaxnmavq, vmaxnmvq, vminnmavq and
vminnmvq in a previous patch.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-functions.h
(unspec_mve_function_exact_insn_pred_p): Use code_for_mve_q_p_f.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 21:06:19 +0000 (21:06 +0000)]

arm: [MVE intrinsics] factorize vmaxnmavq vmaxnmvq vminnmavq vminnmvq

Factorize vmaxnmavq vmaxnmvq vminnmavq vminnmvq so that they use the
same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_VMAXNMxV_MINNMxVQ)
(MVE_VMAXNMxV_MINNMxVQ_P): New.
(mve_insn): Add vmaxnmav, vmaxnmv, vminnmav, vminnmv.
* config/arm/mve.md (mve_vmaxnmavq_f<mode>, mve_vmaxnmvq_f<mode>)
(mve_vminnmavq_f<mode>, mve_vminnmvq_f<mode>): Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vmaxnmavq_p_f<mode>, mve_vmaxnmvq_p_f<mode>)
(mve_vminnmavq_p_f<mode>, mve_vminnmvq_p_f<mode>): Merge into ...
(@mve_<mve_insn>q_p_f<mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 18:48:54 +0000 (18:48 +0000)]

arm: [MVE intrinsics] rework vmaxnmq vminnmq

Implement vmaxnmq and vminnmq using the new MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (vmaxnmq, vminnmq): New.
* config/arm/arm-mve-builtins-base.def (vmaxnmq, vminnmq): New.
* config/arm/arm-mve-builtins-base.h (vmaxnmq, vminnmq): New.
* config/arm/arm_mve.h (vminnmq): Remove.
(vmaxnmq): Remove.
(vmaxnmq_m): Remove.
(vminnmq_m): Remove.
(vminnmq_x): Remove.
(vmaxnmq_x): Remove.
(vminnmq_f16): Remove.
(vmaxnmq_f16): Remove.
(vminnmq_f32): Remove.
(vmaxnmq_f32): Remove.
(vmaxnmq_m_f32): Remove.
(vmaxnmq_m_f16): Remove.
(vminnmq_m_f32): Remove.
(vminnmq_m_f16): Remove.
(vminnmq_x_f16): Remove.
(vminnmq_x_f32): Remove.
(vmaxnmq_x_f16): Remove.
(vmaxnmq_x_f32): Remove.
(__arm_vminnmq_f16): Remove.
(__arm_vmaxnmq_f16): Remove.
(__arm_vminnmq_f32): Remove.
(__arm_vmaxnmq_f32): Remove.
(__arm_vmaxnmq_m_f32): Remove.
(__arm_vmaxnmq_m_f16): Remove.
(__arm_vminnmq_m_f32): Remove.
(__arm_vminnmq_m_f16): Remove.
(__arm_vminnmq_x_f16): Remove.
(__arm_vminnmq_x_f32): Remove.
(__arm_vmaxnmq_x_f16): Remove.
(__arm_vmaxnmq_x_f32): Remove.
(__arm_vminnmq): Remove.
(__arm_vmaxnmq): Remove.
(__arm_vmaxnmq_m): Remove.
(__arm_vminnmq_m): Remove.
(__arm_vminnmq_x): Remove.
(__arm_vmaxnmq_x): Remove.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 18:48:35 +0000 (18:48 +0000)]

arm: [MVE intrinsics] factorize vmaxnmq vminnmq

Factorize vmaxnmq and vminnmq so that they use the same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MAX_MIN_F): New.
(MVE_FP_M_BINARY): Add VMAXNMQ_M_F, VMINNMQ_M_F.
(mve_insn): Add vmaxnm, vminnm.
(max_min_f_str): New.
* config/arm/mve.md (mve_vmaxnmq_f<mode>, mve_vminnmq_f<mode>):
Merge into ...
(@mve_<max_min_f_str>q_f<mode>): ... this.
(mve_vmaxnmq_m_f<mode>, mve_vminnmq_m_f<mode>): Merge into ...
(@mve_<mve_insn>q_m_f<mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Fri, 13 Jan 2023 12:39:49 +0000 (12:39 +0000)]

arm: add smax/smin expanders for v*hf

This patch adds the missing expanders for smax/smin for v*hf modes,
by using the VDQWH iterator instead of VALLW.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/vec-common.md (smin<mode>3): Use VDQWH iterator.
(smax<mode>3): Likewise.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 18:08:39 +0000 (18:08 +0000)]

arm: [MVE intrinsics] rework vmaxvq vminvq vmaxavq vminavq

Implement vmaxvq, vminvq, vmaxavq, vminavq using the new MVE builtins
framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_PRED_P_S_U)
(FUNCTION_PRED_P_S): New.
(vmaxavq, vminavq, vmaxvq, vminvq): New.
* config/arm/arm-mve-builtins-base.def (vmaxavq, vminavq, vmaxvq)
(vminvq): New.
* config/arm/arm-mve-builtins-base.h (vmaxavq, vminavq, vmaxvq)
(vminvq): New.
* config/arm/arm_mve.h (vminvq): Remove.
(vmaxvq): Remove.
(vminvq_p): Remove.
(vmaxvq_p): Remove.
(vminvq_u8): Remove.
(vmaxvq_u8): Remove.
(vminvq_s8): Remove.
(vmaxvq_s8): Remove.
(vminvq_u16): Remove.
(vmaxvq_u16): Remove.
(vminvq_s16): Remove.
(vmaxvq_s16): Remove.
(vminvq_u32): Remove.
(vmaxvq_u32): Remove.
(vminvq_s32): Remove.
(vmaxvq_s32): Remove.
(vminvq_p_u8): Remove.
(vmaxvq_p_u8): Remove.
(vminvq_p_s8): Remove.
(vmaxvq_p_s8): Remove.
(vminvq_p_u16): Remove.
(vmaxvq_p_u16): Remove.
(vminvq_p_s16): Remove.
(vmaxvq_p_s16): Remove.
(vminvq_p_u32): Remove.
(vmaxvq_p_u32): Remove.
(vminvq_p_s32): Remove.
(vmaxvq_p_s32): Remove.
(__arm_vminvq_u8): Remove.
(__arm_vmaxvq_u8): Remove.
(__arm_vminvq_s8): Remove.
(__arm_vmaxvq_s8): Remove.
(__arm_vminvq_u16): Remove.
(__arm_vmaxvq_u16): Remove.
(__arm_vminvq_s16): Remove.
(__arm_vmaxvq_s16): Remove.
(__arm_vminvq_u32): Remove.
(__arm_vmaxvq_u32): Remove.
(__arm_vminvq_s32): Remove.
(__arm_vmaxvq_s32): Remove.
(__arm_vminvq_p_u8): Remove.
(__arm_vmaxvq_p_u8): Remove.
(__arm_vminvq_p_s8): Remove.
(__arm_vmaxvq_p_s8): Remove.
(__arm_vminvq_p_u16): Remove.
(__arm_vmaxvq_p_u16): Remove.
(__arm_vminvq_p_s16): Remove.
(__arm_vmaxvq_p_s16): Remove.
(__arm_vminvq_p_u32): Remove.
(__arm_vmaxvq_p_u32): Remove.
(__arm_vminvq_p_s32): Remove.
(__arm_vmaxvq_p_s32): Remove.
(__arm_vminvq): Remove.
(__arm_vmaxvq): Remove.
(__arm_vminvq_p): Remove.
(__arm_vmaxvq_p): Remove.
(vminavq): Remove.
(vmaxavq): Remove.
(vminavq_p): Remove.
(vmaxavq_p): Remove.
(vminavq_s8): Remove.
(vmaxavq_s8): Remove.
(vminavq_s16): Remove.
(vmaxavq_s16): Remove.
(vminavq_s32): Remove.
(vmaxavq_s32): Remove.
(vminavq_p_s8): Remove.
(vmaxavq_p_s8): Remove.
(vminavq_p_s16): Remove.
(vmaxavq_p_s16): Remove.
(vminavq_p_s32): Remove.
(vmaxavq_p_s32): Remove.
(__arm_vminavq_s8): Remove.
(__arm_vmaxavq_s8): Remove.
(__arm_vminavq_s16): Remove.
(__arm_vmaxavq_s16): Remove.
(__arm_vminavq_s32): Remove.
(__arm_vmaxavq_s32): Remove.
(__arm_vminavq_p_s8): Remove.
(__arm_vmaxavq_p_s8): Remove.
(__arm_vminavq_p_s16): Remove.
(__arm_vmaxavq_p_s16): Remove.
(__arm_vminavq_p_s32): Remove.
(__arm_vmaxavq_p_s32): Remove.
(__arm_vminavq): Remove.
(__arm_vmaxavq): Remove.
(__arm_vminavq_p): Remove.
(__arm_vmaxavq_p): Remove.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 18:09:08 +0000 (18:09 +0000)]

arm: [MVE intrinsics] factorize vmaxvq vminvq vmaxavq vminavq

Factorize vmaxvq vminvq vmaxavq vminavq so that they use the same
pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_VMAXVQ_VMINVQ, MVE_VMAXVQ_VMINVQ_P): New.
(mve_insn): Add vmaxav, vmaxv, vminav, vminv.
(supf): Add VMAXAVQ_S, VMAXAVQ_P_S, VMINAVQ_S, VMINAVQ_P_S.
* config/arm/mve.md (mve_vmaxavq_s<mode>, mve_vmaxvq_<supf><mode>)
(mve_vminavq_s<mode>, mve_vminvq_<supf><mode>): Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmaxavq_p_s<mode>, mve_vmaxvq_p_<supf><mode>)
(mve_vminavq_p_s<mode>, mve_vminvq_p_<supf><mode>): Merge into ...
(@mve_<mve_insn>q_p_<supf><mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 17:02:36 +0000 (17:02 +0000)]

arm: [MVE intrinsics add unspec_mve_function_exact_insn_pred_p

Introduce a function that will be used to build intrinsics that use p
predication.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-functions.h (class
unspec_mve_function_exact_insn_pred_p): New.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 22:06:46 +0000 (22:06 +0000)]

arm: [MVE intrinsics] add binary_maxavminav shape

This patch adds the binary_maxavminav shape description.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_maxavminav): New.
* config/arm/arm-mve-builtins-shapes.h (binary_maxavminav): New.

commit | commitdiff | tree

Christophe Lyon [Mon, 13 Feb 2023 17:00:38 +0000 (17:00 +0000)]

arm: [MVE intrinsics] add binary_maxvminv shape

This patch adds the binary_maxvminv shape description.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_maxvminv): New.
* config/arm/arm-mve-builtins-shapes.h (binary_maxvminv): New.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 17:57:23 +0000 (18:57 +0100)]

aarch64: Improve register allocation for lane instructions

REG_ALLOC_ORDER is much less important than it used to be, but it
is still used as a tie-breaker when multiple registers in a class
are equally good.

Previously aarch64 used the default approach of allocating in order
of increasing register number. But as the comment in the patch says,
it's better to allocate FP and predicate registers in the opposite
order, so that we don't eat into smaller register classes unnecessarily.

This fixes some existing FIXMEs and improves the register allocation
for some Arm ACLE code.

Doing this also showed that *vcond_mask_<mode><vpred> (predicated MOV/SEL)
unnecessarily required p0-p7 rather than p0-p15 for the unpredicated
movprfx alternatives. Only the predicated movprfx alternative requires
p0-p7 (due to the movprfx itself, rather than due to the main instruction).

gcc/
* config/aarch64/aarch64-protos.h (aarch64_adjust_reg_alloc_order):
Declare.
* config/aarch64/aarch64.h (REG_ALLOC_ORDER): Define.
(ADJUST_REG_ALLOC_ORDER): Likewise.
* config/aarch64/aarch64.cc (aarch64_adjust_reg_alloc_order): New
function.
* config/aarch64/aarch64-sve.md (*vcond_mask_<mode><vpred>): Use
Upa rather than Upl for unpredicated movprfx alternatives.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/abd_f16.c: Remove XFAILs.
* gcc.target/aarch64/sve/acle/asm/abd_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/abd_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/add_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/and_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/asr_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/asr_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/divr_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dot_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dot_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dot_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dot_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/eor_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mad_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/max_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/min_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mla_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mls_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/msb_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f16_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f32_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_f64_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulh_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulx_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulx_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mulx_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmad_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmad_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmad_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmla_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmla_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmla_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmls_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmls_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmls_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmsb_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmsb_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/nmsb_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/orr_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sub_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f16_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f32_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_f64_notrap.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/subr_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/bcax_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qadd_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalb_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalb_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalb_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalbt_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalbt_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qdmlalbt_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsub_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/qsubr_u8.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 17:57:22 +0000 (18:57 +0100)]

aarch64: Fix cut-&-pasto in aarch64-sve2-acle-asm.exp

aarch64-sve2-acle-asm.exp tried to prevent --with-cpu/tune
from affecting the results, but it used sve_flags rather than
sve2_flags. This was a silent failure when running the full
testsuite, but was a fatal error when running the harness
individually.

gcc/testsuite/
* gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Use
sve2_flags instead of sve_flags.

commit | commitdiff | tree

Gaius Mulley [Tue, 9 May 2023 17:17:13 +0000 (18:17 +0100)]

PR modula2/109779 isolib SkipLine skips the first character of the successive line

This is a patch for the m2iso library to prevent SkipLine from consuming
the next character on the next line.

gcc/m2/ChangeLog:

PR modula2/109779
* gm2-libs-iso/RTgen.mod (doLook): Remove old.
Remove re-assignment of result.
* gm2-libs-iso/TextIO.mod (CanRead): Rename into ...
(CharAvailable): ... this.
(DumpState): New procedure.
(SetResult): Rename as SetNul.
(WasGoodChar): Rename into ...
(EofOrEoln): ... this.
(SkipLine): Skip over the newline.
(ReadString): Flip THEN ELSE statements after testing for
EofOrEoln.
(ReadRestLine): Flip THEN ELSE statements after testing for
EofOrEoln.

gcc/testsuite/ChangeLog:

PR modula2/109779
* gm2/isolib/run/pass/skiplinetest.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

commit | commitdiff | tree

Jakub Jelinek [Tue, 9 May 2023 14:05:22 +0000 (16:05 +0200)]

c++: Reject pack expansion of assume attribute [PR109756]

http://eel.is/c++draft/dcl.attr#grammar-4 says
"In an attribute-list, an ellipsis may appear only if that attribute's
specification permits it."
and doesn't explicitly permit it on any standard attribute.
The https://wg21.link/p1774r8 paper which introduced assume attribute says
"We could therefore hypothetically permit the assume attribute to directly
support pack expansion:
template <int... args>
void f() {
[[assume(args >= 0)...]];
}
However, we do not propose this. It would require substantial additional work
for a very rare use case. Note that this can instead be expressed with a fold
expression, which is equivalent to the above and works out of the box without
any extra effort:
template <int... args>
void f() {
[[assume(((args >= 0) && ...))]];
}
", but as the testcase shows, GCC 13+ ICEs on assume attribute followed by
... if it contains packs.
The following patch rejects those instead of ICE and for C++17 or later
suggests using fold expressions instead (it doesn't make sense to suggest
it for C++14 and earlier when we'd error on the fold expressions).

2023-05-09 Jakub Jelinek <jakub@redhat.com>

PR c++/109756
* cp-gimplify.cc (process_stmt_assume_attribute): Diagnose pack
expansion of assume attribute.

* g++.dg/cpp23/attr-assume11.C: New test.

commit | commitdiff | tree

Jeff Law [Tue, 9 May 2023 13:18:45 +0000 (07:18 -0600)]

Eliminate more comparisons on the H8 port

This patch fixes a minor code quality issue I found while testing LRA on the
H8. Specifically we have a peephole which converts a comparison of a memory
location against zero into a load + comparison which is actually more
efficient. This triggers when there are registers available at the right
point during peephole2.

If the load is not a mode dependent address we can actually do better by
realizing the load itself sets the proper flags and eliminate the comparison.
I may have expected this to happen when I wrote the original peephole2,
but cmpelim runs before peephole2, so clearly if we want to eliminate the
comparison we have to do it manually.

gcc/
* config/h8300/testcompare.md: Add peephole2 which uses a memory
load to set flags, thus eliminating a compare against zero.

commit | commitdiff | tree

Thomas Schwinge [Fri, 31 Oct 2014 16:38:03 +0000 (17:38 +0100)]

libgomp testsuite: Use 'lang_test_file_found' instead of 'lang_test_file'

The value of 'lang_test_file' isn't actually used anywhere.

libgomp/
* testsuite/libgomp.c++/c++.exp: Don't set 'lang_test_file'.
* testsuite/libgomp.fortran/fortran.exp: Likewise.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
* testsuite/libgomp.c/c.exp: Unset 'lang_test_file_found' instead of
'lang_test_file'.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.graphite/graphite.exp: Likewise.
* testsuite/lib/libgomp.exp (libgomp_target_compile): Look for
'lang_test_file_found' instead of 'lang_test_file'.

commit | commitdiff | tree

Thomas Schwinge [Mon, 3 Nov 2014 08:58:38 +0000 (09:58 +0100)]

libgomp testsuite: Only use 'blddir' if set

(It is unclear to me why the current working directory needs to be in
'LD_LIBRARY_PATH'; leaving that alone for now.)

libgomp/
* testsuite/lib/libgomp.exp (libgomp_init): Only use 'blddir' if
set.
* testsuite/libgomp.c++/c++.exp: Likewise.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.

commit | commitdiff | tree

Thomas Schwinge [Fri, 31 Oct 2014 15:49:14 +0000 (16:49 +0100)]

libgomp C++ testsuite: Don't compute 'blddir' twice

It has already been set in 'libgomp/testsuite/lib/libgomp.exp:libgomp_init'.

libgomp/
* testsuite/libgomp.c++/c++.exp (blddir): Don't set.
* testsuite/libgomp.oacc-c++/c++.exp (blddir): Likewise.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 09:34:59 +0000 (09:34 +0000)]

arm: [MVE intrinsics] rework vshllbq vshlltq

Implement vshllbq and vshlltq using the new MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (vshllbq, vshlltq): New.
* config/arm/arm-mve-builtins-base.def (vshllbq, vshlltq): New.
* config/arm/arm-mve-builtins-base.h (vshllbq, vshlltq): New.
* config/arm/arm_mve.h (vshlltq): Remove.
(vshllbq): Remove.
(vshllbq_m): Remove.
(vshlltq_m): Remove.
(vshllbq_x): Remove.
(vshlltq_x): Remove.
(vshlltq_n_u8): Remove.
(vshllbq_n_u8): Remove.
(vshlltq_n_s8): Remove.
(vshllbq_n_s8): Remove.
(vshlltq_n_u16): Remove.
(vshllbq_n_u16): Remove.
(vshlltq_n_s16): Remove.
(vshllbq_n_s16): Remove.
(vshllbq_m_n_s8): Remove.
(vshllbq_m_n_s16): Remove.
(vshllbq_m_n_u8): Remove.
(vshllbq_m_n_u16): Remove.
(vshlltq_m_n_s8): Remove.
(vshlltq_m_n_s16): Remove.
(vshlltq_m_n_u8): Remove.
(vshlltq_m_n_u16): Remove.
(vshllbq_x_n_s8): Remove.
(vshllbq_x_n_s16): Remove.
(vshllbq_x_n_u8): Remove.
(vshllbq_x_n_u16): Remove.
(vshlltq_x_n_s8): Remove.
(vshlltq_x_n_s16): Remove.
(vshlltq_x_n_u8): Remove.
(vshlltq_x_n_u16): Remove.
(__arm_vshlltq_n_u8): Remove.
(__arm_vshllbq_n_u8): Remove.
(__arm_vshlltq_n_s8): Remove.
(__arm_vshllbq_n_s8): Remove.
(__arm_vshlltq_n_u16): Remove.
(__arm_vshllbq_n_u16): Remove.
(__arm_vshlltq_n_s16): Remove.
(__arm_vshllbq_n_s16): Remove.
(__arm_vshllbq_m_n_s8): Remove.
(__arm_vshllbq_m_n_s16): Remove.
(__arm_vshllbq_m_n_u8): Remove.
(__arm_vshllbq_m_n_u16): Remove.
(__arm_vshlltq_m_n_s8): Remove.
(__arm_vshlltq_m_n_s16): Remove.
(__arm_vshlltq_m_n_u8): Remove.
(__arm_vshlltq_m_n_u16): Remove.
(__arm_vshllbq_x_n_s8): Remove.
(__arm_vshllbq_x_n_s16): Remove.
(__arm_vshllbq_x_n_u8): Remove.
(__arm_vshllbq_x_n_u16): Remove.
(__arm_vshlltq_x_n_s8): Remove.
(__arm_vshlltq_x_n_s16): Remove.
(__arm_vshlltq_x_n_u8): Remove.
(__arm_vshlltq_x_n_u16): Remove.
(__arm_vshlltq): Remove.
(__arm_vshllbq): Remove.
(__arm_vshllbq_m): Remove.
(__arm_vshlltq_m): Remove.
(__arm_vshllbq_x): Remove.
(__arm_vshlltq_x): Remove.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 09:30:04 +0000 (09:30 +0000)]

arm: [MVE intrinsics] factorize vshllbq vshlltq

Factorize vshllbq vshlltq so that they use the same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (mve_insn): Add vshllb, vshllt.
(VSHLLBQ_N, VSHLLTQ_N): Remove.
(VSHLLxQ_N): New.
(VSHLLBQ_M_N, VSHLLTQ_M_N): Remove.
(VSHLLxQ_M_N): New.
* config/arm/mve.md (mve_vshllbq_n_<supf><mode>)
(mve_vshlltq_n_<supf><mode>): Merge into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vshllbq_m_n_<supf><mode>, mve_vshlltq_m_n_<supf><mode>):
Merge into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 09:24:23 +0000 (09:24 +0000)]

arm: [MVE intrinsics] add binary_widen_n shape

This patch adds the binary_widen_n shape description.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_widen_n): New.
* config/arm/arm-mve-builtins-shapes.h (binary_widen_n): New.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 08:38:39 +0000 (08:38 +0000)]

arm: [MVE intrinsics] rework vmovnbq vmovntq vqmovnbq vqmovntq vqmovunbq vqmovuntq

Implement vmovnbq, vmovntq, vqmovnbq, vqmovntq, vqmovunbq, vqmovuntq
using the new MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (vmovnbq, vmovntq, vqmovnbq)
(vqmovntq, vqmovunbq, vqmovuntq): New.
* config/arm/arm-mve-builtins-base.def (vmovnbq, vmovntq)
(vqmovnbq, vqmovntq, vqmovunbq, vqmovuntq): New.
* config/arm/arm-mve-builtins-base.h (vmovnbq, vmovntq, vqmovnbq)
(vqmovntq, vqmovunbq, vqmovuntq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmovnbq,
vmovntq, vqmovnbq, vqmovntq, vqmovunbq, vqmovuntq.
* config/arm/arm_mve.h (vqmovntq): Remove.
(vqmovnbq): Remove.
(vqmovnbq_m): Remove.
(vqmovntq_m): Remove.
(vqmovntq_u16): Remove.
(vqmovnbq_u16): Remove.
(vqmovntq_s16): Remove.
(vqmovnbq_s16): Remove.
(vqmovntq_u32): Remove.
(vqmovnbq_u32): Remove.
(vqmovntq_s32): Remove.
(vqmovnbq_s32): Remove.
(vqmovnbq_m_s16): Remove.
(vqmovntq_m_s16): Remove.
(vqmovnbq_m_u16): Remove.
(vqmovntq_m_u16): Remove.
(vqmovnbq_m_s32): Remove.
(vqmovntq_m_s32): Remove.
(vqmovnbq_m_u32): Remove.
(vqmovntq_m_u32): Remove.
(__arm_vqmovntq_u16): Remove.
(__arm_vqmovnbq_u16): Remove.
(__arm_vqmovntq_s16): Remove.
(__arm_vqmovnbq_s16): Remove.
(__arm_vqmovntq_u32): Remove.
(__arm_vqmovnbq_u32): Remove.
(__arm_vqmovntq_s32): Remove.
(__arm_vqmovnbq_s32): Remove.
(__arm_vqmovnbq_m_s16): Remove.
(__arm_vqmovntq_m_s16): Remove.
(__arm_vqmovnbq_m_u16): Remove.
(__arm_vqmovntq_m_u16): Remove.
(__arm_vqmovnbq_m_s32): Remove.
(__arm_vqmovntq_m_s32): Remove.
(__arm_vqmovnbq_m_u32): Remove.
(__arm_vqmovntq_m_u32): Remove.
(__arm_vqmovntq): Remove.
(__arm_vqmovnbq): Remove.
(__arm_vqmovnbq_m): Remove.
(__arm_vqmovntq_m): Remove.
(vmovntq): Remove.
(vmovnbq): Remove.
(vmovnbq_m): Remove.
(vmovntq_m): Remove.
(vmovntq_u16): Remove.
(vmovnbq_u16): Remove.
(vmovntq_s16): Remove.
(vmovnbq_s16): Remove.
(vmovntq_u32): Remove.
(vmovnbq_u32): Remove.
(vmovntq_s32): Remove.
(vmovnbq_s32): Remove.
(vmovnbq_m_s16): Remove.
(vmovntq_m_s16): Remove.
(vmovnbq_m_u16): Remove.
(vmovntq_m_u16): Remove.
(vmovnbq_m_s32): Remove.
(vmovntq_m_s32): Remove.
(vmovnbq_m_u32): Remove.
(vmovntq_m_u32): Remove.
(__arm_vmovntq_u16): Remove.
(__arm_vmovnbq_u16): Remove.
(__arm_vmovntq_s16): Remove.
(__arm_vmovnbq_s16): Remove.
(__arm_vmovntq_u32): Remove.
(__arm_vmovnbq_u32): Remove.
(__arm_vmovntq_s32): Remove.
(__arm_vmovnbq_s32): Remove.
(__arm_vmovnbq_m_s16): Remove.
(__arm_vmovntq_m_s16): Remove.
(__arm_vmovnbq_m_u16): Remove.
(__arm_vmovntq_m_u16): Remove.
(__arm_vmovnbq_m_s32): Remove.
(__arm_vmovntq_m_s32): Remove.
(__arm_vmovnbq_m_u32): Remove.
(__arm_vmovntq_m_u32): Remove.
(__arm_vmovntq): Remove.
(__arm_vmovnbq): Remove.
(__arm_vmovnbq_m): Remove.
(__arm_vmovntq_m): Remove.
(vqmovuntq): Remove.
(vqmovunbq): Remove.
(vqmovunbq_m): Remove.
(vqmovuntq_m): Remove.
(vqmovuntq_s16): Remove.
(vqmovunbq_s16): Remove.
(vqmovuntq_s32): Remove.
(vqmovunbq_s32): Remove.
(vqmovunbq_m_s16): Remove.
(vqmovuntq_m_s16): Remove.
(vqmovunbq_m_s32): Remove.
(vqmovuntq_m_s32): Remove.
(__arm_vqmovuntq_s16): Remove.
(__arm_vqmovunbq_s16): Remove.
(__arm_vqmovuntq_s32): Remove.
(__arm_vqmovunbq_s32): Remove.
(__arm_vqmovunbq_m_s16): Remove.
(__arm_vqmovuntq_m_s16): Remove.
(__arm_vqmovunbq_m_s32): Remove.
(__arm_vqmovuntq_m_s32): Remove.
(__arm_vqmovuntq): Remove.
(__arm_vqmovunbq): Remove.
(__arm_vqmovunbq_m): Remove.
(__arm_vqmovuntq_m): Remove.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 08:32:51 +0000 (08:32 +0000)]

arm: [MVE intrinsics] factorize vmovnbq vmovntq vqmovnbq vqmovntq vqmovunbq vqmovuntq

Factorize vmovnbq vmovntq vqmovnbq vqmovntq vqmovunbq vqmovuntq so
that they use the same pattern.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_MOVN, MVE_MOVN_M): New.
(mve_insn): Add vmovnb, vmovnt, vqmovnb, vqmovnt, vqmovunb,
vqmovunt.
(isu): Likewise.
(supf): Add VQMOVUNBQ_M_S, VQMOVUNBQ_S, VQMOVUNTQ_M_S,
VQMOVUNTQ_S.
* config/arm/mve.md (mve_vmovnbq_<supf><mode>)
(mve_vmovntq_<supf><mode>, mve_vqmovnbq_<supf><mode>)
(mve_vqmovntq_<supf><mode>, mve_vqmovunbq_s<mode>)
(mve_vqmovuntq_s<mode>): Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmovnbq_m_<supf><mode>, mve_vmovntq_m_<supf><mode>)
(mve_vqmovnbq_m_<supf><mode>, mve_vqmovntq_m_<supf><mode>)
(mve_vqmovunbq_m_s<mode>, mve_vqmovuntq_m_s<mode>): Merge into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Fri, 10 Feb 2023 08:22:06 +0000 (08:22 +0000)]

arm: [MVE intrinsics] add binary_move_narrow and binary_move_narrow_unsigned shapes

This patch adds the binary_move_narrow and binary_move_narrow_unsigned
shapes descriptions.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_move_narrow): New.
(binary_move_narrow_unsigned): New.
* config/arm/arm-mve-builtins-shapes.h (binary_move_narrow): New.
(binary_move_narrow_unsigned): New.

commit | commitdiff | tree

Christophe Lyon [Thu, 9 Feb 2023 20:30:18 +0000 (20:30 +0000)]

arm: [MVE intrinsics] rework vrndq vrndaq vrndmq vrndnq vrndpq vrndxq

Implement vrndq, vrndaq, vrndmq, vrndnq, vrndpq, vrndxq using the new
MVE builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_ONLY_F): New.
(vrndaq, vrndmq, vrndnq, vrndpq, vrndq, vrndxq): New.
* config/arm/arm-mve-builtins-base.def (vrndaq, vrndmq, vrndnq)
(vrndpq, vrndq, vrndxq): New.
* config/arm/arm-mve-builtins-base.h (vrndaq, vrndmq, vrndnq)
(vrndpq, vrndq, vrndxq): New.
* config/arm/arm_mve.h (vrndxq): Remove.
(vrndq): Remove.
(vrndpq): Remove.
(vrndnq): Remove.
(vrndmq): Remove.
(vrndaq): Remove.
(vrndaq_m): Remove.
(vrndmq_m): Remove.
(vrndnq_m): Remove.
(vrndpq_m): Remove.
(vrndq_m): Remove.
(vrndxq_m): Remove.
(vrndq_x): Remove.
(vrndnq_x): Remove.
(vrndmq_x): Remove.
(vrndpq_x): Remove.
(vrndaq_x): Remove.
(vrndxq_x): Remove.
(vrndxq_f16): Remove.
(vrndxq_f32): Remove.
(vrndq_f16): Remove.
(vrndq_f32): Remove.
(vrndpq_f16): Remove.
(vrndpq_f32): Remove.
(vrndnq_f16): Remove.
(vrndnq_f32): Remove.
(vrndmq_f16): Remove.
(vrndmq_f32): Remove.
(vrndaq_f16): Remove.
(vrndaq_f32): Remove.
(vrndaq_m_f16): Remove.
(vrndmq_m_f16): Remove.
(vrndnq_m_f16): Remove.
(vrndpq_m_f16): Remove.
(vrndq_m_f16): Remove.
(vrndxq_m_f16): Remove.
(vrndaq_m_f32): Remove.
(vrndmq_m_f32): Remove.
(vrndnq_m_f32): Remove.
(vrndpq_m_f32): Remove.
(vrndq_m_f32): Remove.
(vrndxq_m_f32): Remove.
(vrndq_x_f16): Remove.
(vrndq_x_f32): Remove.
(vrndnq_x_f16): Remove.
(vrndnq_x_f32): Remove.
(vrndmq_x_f16): Remove.
(vrndmq_x_f32): Remove.
(vrndpq_x_f16): Remove.
(vrndpq_x_f32): Remove.
(vrndaq_x_f16): Remove.
(vrndaq_x_f32): Remove.
(vrndxq_x_f16): Remove.
(vrndxq_x_f32): Remove.
(__arm_vrndxq_f16): Remove.
(__arm_vrndxq_f32): Remove.
(__arm_vrndq_f16): Remove.
(__arm_vrndq_f32): Remove.
(__arm_vrndpq_f16): Remove.
(__arm_vrndpq_f32): Remove.
(__arm_vrndnq_f16): Remove.
(__arm_vrndnq_f32): Remove.
(__arm_vrndmq_f16): Remove.
(__arm_vrndmq_f32): Remove.
(__arm_vrndaq_f16): Remove.
(__arm_vrndaq_f32): Remove.
(__arm_vrndaq_m_f16): Remove.
(__arm_vrndmq_m_f16): Remove.
(__arm_vrndnq_m_f16): Remove.
(__arm_vrndpq_m_f16): Remove.
(__arm_vrndq_m_f16): Remove.
(__arm_vrndxq_m_f16): Remove.
(__arm_vrndaq_m_f32): Remove.
(__arm_vrndmq_m_f32): Remove.
(__arm_vrndnq_m_f32): Remove.
(__arm_vrndpq_m_f32): Remove.
(__arm_vrndq_m_f32): Remove.
(__arm_vrndxq_m_f32): Remove.
(__arm_vrndq_x_f16): Remove.
(__arm_vrndq_x_f32): Remove.
(__arm_vrndnq_x_f16): Remove.
(__arm_vrndnq_x_f32): Remove.
(__arm_vrndmq_x_f16): Remove.
(__arm_vrndmq_x_f32): Remove.
(__arm_vrndpq_x_f16): Remove.
(__arm_vrndpq_x_f32): Remove.
(__arm_vrndaq_x_f16): Remove.
(__arm_vrndaq_x_f32): Remove.
(__arm_vrndxq_x_f16): Remove.
(__arm_vrndxq_x_f32): Remove.
(__arm_vrndxq): Remove.
(__arm_vrndq): Remove.
(__arm_vrndpq): Remove.
(__arm_vrndnq): Remove.
(__arm_vrndmq): Remove.
(__arm_vrndaq): Remove.
(__arm_vrndaq_m): Remove.
(__arm_vrndmq_m): Remove.
(__arm_vrndnq_m): Remove.
(__arm_vrndpq_m): Remove.
(__arm_vrndq_m): Remove.
(__arm_vrndxq_m): Remove.
(__arm_vrndq_x): Remove.
(__arm_vrndnq_x): Remove.
(__arm_vrndmq_x): Remove.
(__arm_vrndpq_x): Remove.
(__arm_vrndaq_x): Remove.
(__arm_vrndxq_x): Remove.

commit | commitdiff | tree

Christophe Lyon [Thu, 9 Feb 2023 19:11:46 +0000 (19:11 +0000)]

arm: [MVE intrinsics] rework vabsq vnegq vclsq vclzq, vqabsq, vqnegq

Implement vabsq, vnegq, vclsq, vclzq, vqabsq, vqnegq using the new MVE
builtins framework.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_WITHOUT_N_NO_U_F): New.
(vabsq, vnegq, vclsq, vclzq, vqabsq, vqnegq): New.
* config/arm/arm-mve-builtins-base.def (vabsq, vnegq, vclsq)
(vclzq, vqabsq, vqnegq): New.
* config/arm/arm-mve-builtins-base.h (vabsq, vnegq, vclsq, vclzq)
(vqabsq, vqnegq): New.
* config/arm/arm_mve.h (vabsq): Remove.
(vabsq_m): Remove.
(vabsq_x): Remove.
(vabsq_f16): Remove.
(vabsq_f32): Remove.
(vabsq_s8): Remove.
(vabsq_s16): Remove.
(vabsq_s32): Remove.
(vabsq_m_s8): Remove.
(vabsq_m_s16): Remove.
(vabsq_m_s32): Remove.
(vabsq_m_f16): Remove.
(vabsq_m_f32): Remove.
(vabsq_x_s8): Remove.
(vabsq_x_s16): Remove.
(vabsq_x_s32): Remove.
(vabsq_x_f16): Remove.
(vabsq_x_f32): Remove.
(__arm_vabsq_s8): Remove.
(__arm_vabsq_s16): Remove.
(__arm_vabsq_s32): Remove.
(__arm_vabsq_m_s8): Remove.
(__arm_vabsq_m_s16): Remove.
(__arm_vabsq_m_s32): Remove.
(__arm_vabsq_x_s8): Remove.
(__arm_vabsq_x_s16): Remove.
(__arm_vabsq_x_s32): Remove.
(__arm_vabsq_f16): Remove.
(__arm_vabsq_f32): Remove.
(__arm_vabsq_m_f16): Remove.
(__arm_vabsq_m_f32): Remove.
(__arm_vabsq_x_f16): Remove.
(__arm_vabsq_x_f32): Remove.
(__arm_vabsq): Remove.
(__arm_vabsq_m): Remove.
(__arm_vabsq_x): Remove.
(vnegq): Remove.
(vnegq_m): Remove.
(vnegq_x): Remove.
(vnegq_f16): Remove.
(vnegq_f32): Remove.
(vnegq_s8): Remove.
(vnegq_s16): Remove.
(vnegq_s32): Remove.
(vnegq_m_s8): Remove.
(vnegq_m_s16): Remove.
(vnegq_m_s32): Remove.
(vnegq_m_f16): Remove.
(vnegq_m_f32): Remove.
(vnegq_x_s8): Remove.
(vnegq_x_s16): Remove.
(vnegq_x_s32): Remove.
(vnegq_x_f16): Remove.
(vnegq_x_f32): Remove.
(__arm_vnegq_s8): Remove.
(__arm_vnegq_s16): Remove.
(__arm_vnegq_s32): Remove.
(__arm_vnegq_m_s8): Remove.
(__arm_vnegq_m_s16): Remove.
(__arm_vnegq_m_s32): Remove.
(__arm_vnegq_x_s8): Remove.
(__arm_vnegq_x_s16): Remove.
(__arm_vnegq_x_s32): Remove.
(__arm_vnegq_f16): Remove.
(__arm_vnegq_f32): Remove.
(__arm_vnegq_m_f16): Remove.
(__arm_vnegq_m_f32): Remove.
(__arm_vnegq_x_f16): Remove.
(__arm_vnegq_x_f32): Remove.
(__arm_vnegq): Remove.
(__arm_vnegq_m): Remove.
(__arm_vnegq_x): Remove.
(vclsq): Remove.
(vclsq_m): Remove.
(vclsq_x): Remove.
(vclsq_s8): Remove.
(vclsq_s16): Remove.
(vclsq_s32): Remove.
(vclsq_m_s8): Remove.
(vclsq_m_s16): Remove.
(vclsq_m_s32): Remove.
(vclsq_x_s8): Remove.
(vclsq_x_s16): Remove.
(vclsq_x_s32): Remove.
(__arm_vclsq_s8): Remove.
(__arm_vclsq_s16): Remove.
(__arm_vclsq_s32): Remove.
(__arm_vclsq_m_s8): Remove.
(__arm_vclsq_m_s16): Remove.
(__arm_vclsq_m_s32): Remove.
(__arm_vclsq_x_s8): Remove.
(__arm_vclsq_x_s16): Remove.
(__arm_vclsq_x_s32): Remove.
(__arm_vclsq): Remove.
(__arm_vclsq_m): Remove.
(__arm_vclsq_x): Remove.
(vclzq): Remove.
(vclzq_m): Remove.
(vclzq_x): Remove.
(vclzq_s8): Remove.
(vclzq_s16): Remove.
(vclzq_s32): Remove.
(vclzq_u8): Remove.
(vclzq_u16): Remove.
(vclzq_u32): Remove.
(vclzq_m_u8): Remove.
(vclzq_m_s8): Remove.
(vclzq_m_u16): Remove.
(vclzq_m_s16): Remove.
(vclzq_m_u32): Remove.
(vclzq_m_s32): Remove.
(vclzq_x_s8): Remove.
(vclzq_x_s16): Remove.
(vclzq_x_s32): Remove.
(vclzq_x_u8): Remove.
(vclzq_x_u16): Remove.
(vclzq_x_u32): Remove.
(__arm_vclzq_s8): Remove.
(__arm_vclzq_s16): Remove.
(__arm_vclzq_s32): Remove.
(__arm_vclzq_u8): Remove.
(__arm_vclzq_u16): Remove.
(__arm_vclzq_u32): Remove.
(__arm_vclzq_m_u8): Remove.
(__arm_vclzq_m_s8): Remove.
(__arm_vclzq_m_u16): Remove.
(__arm_vclzq_m_s16): Remove.
(__arm_vclzq_m_u32): Remove.
(__arm_vclzq_m_s32): Remove.
(__arm_vclzq_x_s8): Remove.
(__arm_vclzq_x_s16): Remove.
(__arm_vclzq_x_s32): Remove.
(__arm_vclzq_x_u8): Remove.
(__arm_vclzq_x_u16): Remove.
(__arm_vclzq_x_u32): Remove.
(__arm_vclzq): Remove.
(__arm_vclzq_m): Remove.
(__arm_vclzq_x): Remove.
(vqabsq): Remove.
(vqnegq): Remove.
(vqnegq_m): Remove.
(vqabsq_m): Remove.
(vqabsq_s8): Remove.
(vqabsq_s16): Remove.
(vqabsq_s32): Remove.
(vqnegq_s8): Remove.
(vqnegq_s16): Remove.
(vqnegq_s32): Remove.
(vqnegq_m_s8): Remove.
(vqabsq_m_s8): Remove.
(vqnegq_m_s16): Remove.
(vqabsq_m_s16): Remove.
(vqnegq_m_s32): Remove.
(vqabsq_m_s32): Remove.
(__arm_vqabsq_s8): Remove.
(__arm_vqabsq_s16): Remove.
(__arm_vqabsq_s32): Remove.
(__arm_vqnegq_s8): Remove.
(__arm_vqnegq_s16): Remove.
(__arm_vqnegq_s32): Remove.
(__arm_vqnegq_m_s8): Remove.
(__arm_vqabsq_m_s8): Remove.
(__arm_vqnegq_m_s16): Remove.
(__arm_vqabsq_m_s16): Remove.
(__arm_vqnegq_m_s32): Remove.
(__arm_vqabsq_m_s32): Remove.
(__arm_vqabsq): Remove.
(__arm_vqnegq): Remove.
(__arm_vqnegq_m): Remove.
(__arm_vqabsq_m): Remove.

commit | commitdiff | tree

Christophe Lyon [Thu, 9 Feb 2023 19:11:31 +0000 (19:11 +0000)]

arm: [MVE intrinsics] factorize several unary operations

Factorize vabs vcls vclz vneg vqabs vqneg vrnda vrndm vrndn vrndp vrnd
vrndx so that they use the same pattern.

This patch introduces the mve_mnemo iterator because some of the
involved intrinsics have a different name from their mnenonic: for
instance vrndq vs vrintz.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_INT_M_UNARY, MVE_INT_UNARY)
(MVE_FP_UNARY, MVE_FP_M_UNARY): New.
(mve_insn): Add vabs, vcls, vclz, vneg, vqabs, vqneg, vrnda,
vrndm, vrndn, vrndp, vrnd, vrndx.
(isu): Add VABSQ_M_S, VCLSQ_M_S, VCLZQ_M_S, VCLZQ_M_U, VNEGQ_M_S,
VQABSQ_M_S, VQNEGQ_M_S.
(mve_mnemo): New.
* config/arm/mve.md (mve_vrndq_m_f<mode>, mve_vrndxq_f<mode>)
(mve_vrndq_f<mode>, mve_vrndpq_f<mode>, mve_vrndnq_f<mode>)
(mve_vrndmq_f<mode>, mve_vrndaq_f<mode>): Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vnegq_f<mode>, mve_vabsq_f<mode>): Merge into ...
(mve_v<absneg_str>q_f<mode>): ... this.
(mve_vnegq_s<mode>, mve_vabsq_s<mode>): Merge into ...
(mve_v<absneg_str>q_s<mode>): ... this.
(mve_vclsq_s<mode>, mve_vqnegq_s<mode>, mve_vqabsq_s<mode>): Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vabsq_m_s<mode>, mve_vclsq_m_s<mode>)
(mve_vclzq_m_<supf><mode>, mve_vnegq_m_s<mode>)
(mve_vqabsq_m_s<mode>, mve_vqnegq_m_s<mode>): Merge into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.
(mve_vabsq_m_f<mode>, mve_vnegq_m_f<mode>, mve_vrndaq_m_f<mode>)
(mve_vrndmq_m_f<mode>, mve_vrndnq_m_f<mode>, mve_vrndpq_m_f<mode>)
(mve_vrndxq_m_f<mode>): Merge into ...
(@mve_<mve_insn>q_m_f<mode>): ... this.

commit | commitdiff | tree

Christophe Lyon [Thu, 9 Feb 2023 18:38:22 +0000 (18:38 +0000)]

arm: [MVE intrinsics] add unary shape

This patch adds the unary shape description.

2022-09-08 Christophe Lyon <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary): New.
* config/arm/arm-mve-builtins-shapes.h (unary): New.

commit | commitdiff | tree

Jakub Jelinek [Tue, 9 May 2023 10:46:21 +0000 (12:46 +0200)]

mux-utils.h: Fix a comment typo

Trivial comment typo...

2023-05-09 Jakub Jelinek <jakub@redhat.com>

* mux-utils.h: Fix comment typo, avoides -> avoids.

commit | commitdiff | tree

Jakub Jelinek [Tue, 9 May 2023 10:14:18 +0000 (12:14 +0200)]

testsuite: Add further testcase for already fixed PR [PR109778]

I came up with a testcase which reproduces all the way to r10-7469.
LTO to avoid early inlining it, so that ccp handles rotates and not
shifts before they are turned into rotates.

2023-05-09 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109778
* gcc.dg/lto/pr109778_0.c: New test.
* gcc.dg/lto/pr109778_1.c: New file.

commit | commitdiff | tree

Jakub Jelinek [Tue, 9 May 2023 10:10:07 +0000 (12:10 +0200)]

tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp [PR109778]

The following testcase is miscompiled, because bitwise ccp2 handles
a rotate with a signed type incorrectly.
Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3
arguments, all other callers just rotate in the right precision and
I think work correctly.  ccp works with widest_ints and so rotations
by the excessive precision certainly don't match what it wants
when it sees a rotate in some specific bitsize.  Still, if it is
unsigned rotate and the widest_int is zero extended from width,
the functions perform left shift and logical right shift on the value
and then at the end zero extend the result of left shift and uselessly
also the result of logical right shift and return | of that.
On the testcase we the signed char rrotate by 4 argument is
CONSTANT -75 i.e. 0xffffffff....fffffb5 with mask 2.
The mask is correctly rotated to 0x20, but because the 8-bit constant
is sign extended to 192-bit one, the logical right shift by 4 doesn't
yield expected 0xb, but gives 0xfffffffffff....ffffb, and then
return wi::zext (left, width) | wi::zext (right, width); where left is
0xfffffff....fb50, so we return 0xfb instead of the expected
0x5b.

The following patch fixes that by doing the zero extension in case of
the right variable before doing wi::lrshift rather than after it.

Also, wi::[lr]rotate widht width < precision always zero extends
the result.  I'm afraid it can't do better because it doesn't know
if it is done for an unsigned or signed type, but the caller in this
case knows that very well, so I've done the extension based on sgn
in the caller.  E.g. 0x5b rotated right (or left) by 4 with width 8
previously gave 0xb5, but sgn == SIGNED in widest_int it should be
0xffffffff....fffb5 instead.

2023-05-09  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109778
* wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on
wi::zext (x, width) rather than x if width != precision, rather
than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.cc (bit_value_binop): Call wi::ext on the results
of wi::lrotate or wi::rrotate.

* gcc.c-torture/execute/pr109778.c: New test.

commit | commitdiff | tree

Alexander Monakov [Mon, 8 May 2023 17:16:01 +0000 (20:16 +0300)]

genmatch: fixup get_out_file

get_out_file did not follow the coding conventions (mixing three-space
and two-space indentation, missing linebreak before function name).

Take that as an excuse to reimplement it in a more terse manner and
rename as 'choose_output', which is hopefully more descriptive.

gcc/ChangeLog:

* genmatch.cc (get_out_file): Make static and rename to ...
(choose_output): ... this. Reimplement. Update all uses ...
(decision_tree::gen): ... here and ...
(main): ... here.

commit | commitdiff | tree

Alexander Monakov [Fri, 5 May 2023 22:25:26 +0000 (01:25 +0300)]

genmatch: clean up showUsage

Display usage more consistently and get rid of camelCase.

gcc/ChangeLog:

* genmatch.cc (showUsage): Reimplement as ...
(usage): ...this. Adjust all uses.
(main): Print usage when no arguments. Add missing 'return 1'.

commit | commitdiff | tree

Alexander Monakov [Fri, 5 May 2023 21:55:57 +0000 (00:55 +0300)]

genmatch: clean up emit_func

Eliminate boolean parameters of emit_func. The first ('open') just
prints 'extern' to generated header, which is unnecessary. Introduce a
separate function to use when finishing a declaration in place of the
second ('close').

Rename emit_func to 'fp_decl' (matching 'fprintf' in length) to unbreak
indentation in several places.

Reshuffle emitted line breaks in a few places to make generated
declarations less ugly.

gcc/ChangeLog:

* genmatch.cc (header_file): Make static.
(emit_func): Rename to...
(fp_decl): ... this. Adjust all uses.
(fp_decl_done): New function. Use it...
(decision_tree::gen): ... here and...
(write_predicate): ... here.
(main): Adjust.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:35 +0000 (07:43 +0100)]

aarch64: Avoid hard-coding specific register allocations

Some tests hard-coded specific allocations for temporary registers,
whereas the RA should be free to pick anything that doesn't force
unnecessary moves or spills.

gcc/testsuite/
* gcc.target/aarch64/asimd-mul-to-shl-sub.c: Allow any register
allocation for temporary results, rather than requiring specific
registers.
* gcc.target/aarch64/auto-init-padding-1.c: Likewise.
* gcc.target/aarch64/auto-init-padding-2.c: Likewise.
* gcc.target/aarch64/auto-init-padding-3.c: Likewise.
* gcc.target/aarch64/auto-init-padding-4.c: Likewise.
* gcc.target/aarch64/auto-init-padding-9.c: Likewise.
* gcc.target/aarch64/memset-corner-cases.c: Likewise.
* gcc.target/aarch64/memset-q-reg.c: Likewise.
* gcc.target/aarch64/simd/vaddlv_1.c: Likewise.
* gcc.target/aarch64/sve-neon-modes_1.c: Likewise.
* gcc.target/aarch64/sve-neon-modes_3.c: Likewise.
* gcc.target/aarch64/sve/load_scalar_offset_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
* gcc.target/aarch64/sve/pr89007-1.c: Likewise.
* gcc.target/aarch64/sve/pr89007-2.c: Likewise.
* gcc.target/aarch64/sve/store_scalar_offset_1.c: Likewise.
* gcc.target/aarch64/vadd_reduc-1.c: Likewise.
* gcc.target/aarch64/vadd_reduc-2.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_bf16.c: Allow the temporary
predicate register to be any of p4-p7, rather than requiring p4
specifically.
* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:34 +0000 (07:43 +0100)]

aarch64: Relax FP/vector register matches

There were many tests that used [0-9] to match an FP or vector register,
but that should allow any of 0-31 instead.

asm-x-constraint-1.c required s0-s7, but that's the range for "y"
rather than "x". "x" allows s0-s15.

sve/pcs/return_9.c required z2-z7 (the initial set of available
call-clobbered registers), but z24-z31 are OK too.

gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/vshl-opt-6.c: Allow any
FP/vector register, not just register 0-9.
* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
* gcc.target/aarch64/ldp_stp_8.c: Likewise.
* gcc.target/aarch64/ldp_stp_17.c: Likewise.
* gcc.target/aarch64/ldp_stp_21.c: Likewise.
* gcc.target/aarch64/simd/vpaddd_f64.c: Likewise.
* gcc.target/aarch64/simd/vpaddd_s64.c: Likewise.
* gcc.target/aarch64/simd/vpaddd_u64.c: Likewise.
* gcc.target/aarch64/sve/adr_1.c: Likewise.
* gcc.target/aarch64/sve/adr_2.c: Likewise.
* gcc.target/aarch64/sve/adr_3.c: Likewise.
* gcc.target/aarch64/sve/adr_4.c: Likewise.
* gcc.target/aarch64/sve/adr_5.c: Likewise.
* gcc.target/aarch64/sve/extract_1.c: Likewise.
* gcc.target/aarch64/sve/extract_2.c: Likewise.
* gcc.target/aarch64/sve/extract_3.c: Likewise.
* gcc.target/aarch64/sve/extract_4.c: Likewise.
* gcc.target/aarch64/sve/slp_4.c: Likewise.
* gcc.target/aarch64/sve/spill_3.c: Likewise.
* gcc.target/aarch64/vfp-1.c: Likewise.
* gcc.target/aarch64/asm-x-constraint-1.c: Allow s0-s15, not just
s0-s7.
* gcc.target/aarch64/sve/pcs/return_9.c: Allow z24-z31 as well as
z2-z7.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:34 +0000 (07:43 +0100)]

aarch64: Relax predicate register matches

Most governing predicate operands require p0-p7, but some
instructions also allow p8-p15. Non-gp uses of predicates
often also allow all of p0-p15.

This patch fixes up cases where we required p0-p7 unnecessarily.
In some cases we match the definition (typically a comparison,
PFALSE or PTRUE), sometimes we match the use (like a logic
instruction, MOV or SEL), and sometimes we match both.

gcc/testsuite/
* g++.target/aarch64/sve/vcond_1.C: Allow any predicate
register for the temporary results, not just p0-p7.
* gcc.target/aarch64/sve/acle/asm/dupq_b8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dupq_b16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dupq_b32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dupq_b64.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilele_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilele_6.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilele_7.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilele_9.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilele_10.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilelt_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilelt_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/whilelt_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/vcond_2.c: Likewise.
* gcc.target/aarch64/sve/vcond_3.c: Likewise.
* gcc.target/aarch64/sve/vcond_7.c: Likewise.
* gcc.target/aarch64/sve/vcond_18.c: Likewise.
* gcc.target/aarch64/sve/vcond_19.c: Likewise.
* gcc.target/aarch64/sve/vcond_20.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:33 +0000 (07:43 +0100)]

aarch64: Relax ordering requirements in SVE dup tests

Some of the svdup tests expand to a SEL between two constant vectors.
This patch allows the constants to be formed in either order.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/dup_s16.c: When using SEL to select
between two constant vectors, allow the constant moves to appear in
either order.
* gcc.target/aarch64/sve/acle/asm/dup_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_u64.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:33 +0000 (07:43 +0100)]

aarch64: Allow moves after tied-register intrinsics

Some ACLE intrinsics map to instructions that tie the output
operand to an input operand. If all the operands are allocated
to different registers, and if MOVPRFX can't be used, we will need
a move either before the instruction or after it. Many tests only
matched the "before" case; this patch makes them accept the "after"
case too.

gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/bfcvtnq2-untied.c: Allow
moves to occur after the intrinsic instruction, rather than requiring
them to happen before.
* gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/adda_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/adda_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/adda_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/brka_b.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/brkb_b.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/brkn_b.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clasta_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clasta_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clasta_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clasta_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clastb_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clastb_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clastb_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/clastb_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/pfirst_b.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/pnext_b16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/pnext_b32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/pnext_b64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/pnext_b8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sli_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_s16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sri_u8.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:43:32 +0000 (07:43 +0100)]

aarch64: Fix move-after-intrinsic function-body tests

Some of the SVE ACLE asm tests tried to be agnostic about the
instruction order, but only one of the alternatives was exercised
in practice. This patch fixes latent typos in the other versions.

gcc/testsuite/
* gcc.target/aarch64/sve2/acle/asm/aesd_u8.c: Fix expected register
allocation in the case where a move occurs after the intrinsic
instruction.
* gcc.target/aarch64/sve2/acle/asm/aese_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c: Likewise.

commit | commitdiff | tree

Richard Sandiford [Tue, 9 May 2023 06:40:41 +0000 (07:40 +0100)]

ira: Don't create copies for earlyclobbered pairs

This patch follows on from g:9f635bd13fe9e85872e441b6f3618947f989909a
("the previous patch").  To start by quoting that:

If an insn requires two operands to be tied, and the input operand dies
in the insn, IRA acts as though there were a copy from the input to the
output with the same execution frequency as the insn.  Allocating the
same register to the input and the output then saves the cost of a move.

If there is no such tie, but an input operand nevertheless dies
in the insn, IRA creates a similar move, but with an eighth of the
frequency.  This helps to ensure that chains of instructions reuse
registers in a natural way, rather than using arbitrarily different
registers for no reason.

This heuristic seems to work well in the vast majority of cases.
However, the problem fixed in the previous patch was that we
could create a copy for an operand pair even if, for all relevant
alternatives, the output and input register classes did not have
any registers in common.  It is then impossible for the output
operand to reuse the dying input register.

This left unfixed a further case where copies don't make sense:
there is no point trying to reuse the dying input register if,
for all relevant alternatives, the output is earlyclobbered and
the input doesn't match the output.  (Matched earlyclobbers are fine.)

Handling that case fixes several existing XFAILs and helps with
a follow-on aarch64 patch.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  A SPEC2017 run
on aarch64 showed no differences outside the noise.  Also, I tried
compiling gcc.c-torture, gcc.dg, and g++.dg for at least one target
per cpu directory, using the options -Os -fno-schedule-insns{,2}.
The results below summarise the tests that showed a difference in LOC:

Target               Tests   Good    Bad   Delta    Best   Worst  Median
======               =====   ====    ===   =====    ====   =====  ======
amdgcn-amdhsa           14      7      7       3     -18      10      -1
arm-linux-gnueabihf     16     15      1     -22      -4       2      -1
csky-elf                 6      6      0     -21      -6      -2      -4
hppa64-hp-hpux11.23      5      5      0      -7      -2      -1      -1
ia64-linux-gnu          16     16      0     -70     -15      -1      -3
m32r-elf                53      1     52      64      -2       8       1
mcore-elf                2      2      0      -8      -6      -2      -6
microblaze-elf         285    283      2    -909     -68       4      -1
mmix                     7      7      0   -2101   -2091      -1      -1
msp430-elf               1      1      0      -4      -4      -4      -4
pru-elf                  8      6      2     -12      -6       2      -2
rx-elf                  22     18      4     -40      -5       6      -2
sparc-linux-gnu         15     14      1     -40      -8       1      -2
sparc-wrs-vxworks       15     14      1     -40      -8       1      -2
visium-elf               2      1      1       0      -2       2      -2
xstormy16-elf            1      1      0      -2      -2      -2      -2

with other targets showing no sensitivity to the patch.  The only
target that seems to be negatively affected is m32r-elf; otherwise
the patch seems like an extremely minor but still clear improvement.

gcc/
* ira-conflicts.cc (can_use_same_reg_p): Skip over non-matching
earlyclobbers.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c: Remove XFAILs.
* gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f64.c: Likewise.

commit | commitdiff | tree

Jason Merrill [Mon, 8 May 2023 22:22:30 +0000 (18:22 -0400)]

c++: non-template friend of template [PR106740]

This was fixed by r13-1018, but the testcase seems needed.

PR c++/106740

gcc/testsuite/ChangeLog:

* g++.dg/template/friend78.C: New test.

commit | commitdiff | tree

GCC Administrator [Tue, 9 May 2023 00:16:43 +0000 (00:16 +0000)]

Daily bump.

commit | commitdiff | tree

Roger Sayle [Mon, 8 May 2023 22:48:46 +0000 (23:48 +0100)]

[x86_64] Introduce insvti_highpart define_insn_and_split.

This is a repost/respin of a patch that was conditionally approved:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609470.html

This patch adds a convenient post-reload splitter for setting/updating
the highpart of a TImode variable, using i386's previously added
split_double_concat infrastructure.

For the new test case below:

__int128 foo(__int128 x, unsigned long long y)
{
  __int128 t = (__int128)y << 64;
  __int128 r = (x & ~0ull) | t;
  return r;
}

mainline GCC with -O2 currently generates:

foo:    movq    %rdi, %rcx
        xorl    %eax, %eax
        xorl    %edi, %edi
        orq     %rcx, %rax
        orq     %rdi, %rdx
        ret

with this patch, GCC instead now generates the much better:

foo: movq    %rdi, %rcx
        movq    %rcx, %rax
        ret

It turns out that the -m32 equivalent of this testcase, already
avoids using explict orl/xor instructions, as it gets optimized
(in combine) by a completely different path.  Given that this idiom
isn't seen in 32-bit code (so this pattern doesn't match with -m32),
and also that the shorter 32-bit AND bitmask is represented as a
CONST_INT rather than a CONST_WIDE_INT, this new define_insn_and_split
is implemented for just TARGET_64BIT rather than contort a "generic"
implementation using DWI mode iterators.

2023-05-08  Roger Sayle  <roger@nextmovesoftware.com>
    Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (any_or_plus): Move definition earlier.
(*insvti_highpart_1): New define_insn_and_split to overwrite
(insv) the highpart of a TImode register/memory.

gcc/testsuite/ChangeLog
* gcc.target/i386/insvti_highpart-1.c: New test case.

commit | commitdiff | tree

Eugene Rozenfeld [Tue, 28 Feb 2023 23:58:40 +0000 (15:58 -0800)]

Fix cfg maintenance after inlining in AutoFDO

Todo from early_inliner needs to be propagated so that
cleanup_tree_cfg () is called if necessary.

This bug was causing an assert in get_loop_body during
ipa-sra in autoprofiledbootstrap build since loops weren't
fixed up and one of the loops had num_nodes set to 0.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* auto-profile.cc (auto_profile): Check todo from early_inline
to see if cleanup_tree_vfg needs to be called.
(early_inline): Return todo from early_inliner.

commit | commitdiff | tree

Andrew Pinski [Mon, 8 May 2023 17:58:06 +0000 (10:58 -0700)]

Fix pr81192.c for int16 targets

I had missed when converting this
testcase to Gimple that there was a define
for int/unsigned type specifically to get
an INT32 type. This means when using a
literal integer constant you need to use the
`_Literal (type)` to form the types correctly on the
constants.

This fixes the issue and has been both tested on
xstormy16-elf and x86_64-linux-gnu.

Committed as obvious.

gcc/testsuite/ChangeLog:

PR testsuite/109776
* gcc.dg/pr81192.c: Fix integer constants for int16 targets.

commit | commitdiff | tree

Kito Cheng [Mon, 8 May 2023 09:54:52 +0000 (17:54 +0800)]

RISC-V: Factor out vector manager code in vsetvli insertion pass. [NFC]

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vector_info):
New.
(pass_vsetvl::get_block_info): New.
(pass_vsetvl::update_vector_info): New.
(pass_vsetvl::simple_vsetvl): Use get_vector_info.
(pass_vsetvl::compute_local_backward_infos): Ditto.
(pass_vsetvl::transfer_before): Ditto.
(pass_vsetvl::transfer_after): Ditto.
(pass_vsetvl::emit_local_forward_vsetvls): Ditto.
(pass_vsetvl::local_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::cleanup_insns): Ditto.
(pass_vsetvl::compute_local_backward_infos): Use
update_vector_info.

commit | commitdiff | tree

Kito Cheng [Mon, 8 May 2023 13:44:30 +0000 (21:44 +0800)]

RISC-V: Improve portability of testcases

stdint.h will require having corresponding multi-lib existing, so using
stdint-gcc.h instead, also added a riscv_vector.h wrapper to
gcc.target/riscv/rvv/autovec/.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h: Change
stdint.h to stdint-gcc.h.
* gcc.target/riscv/rvv/autovec/template-1.h: Ditto.
* gcc.target/riscv/rvv/autovec/riscv_vector.h: New.

commit | commitdiff | tree

Jeff Law [Mon, 8 May 2023 14:28:26 +0000 (08:28 -0600)]

Fix minor length computation on stormy16

Today's build of xstormy16-elf failed due to a branch to an out of range
target. Manual inspection of the assembly code for the affected function
(divdi3) showed that the zero-extension patterns were claiming a length
of 2, but clearly assembled into 4 bytes.

This patch adds an explicit length to the zero extension pattern and
appears to resolve the issue in my test builds.

gcc/

* config/stormy16/stormy16.md (zero_extendhisi2): Fix length.

commit | commitdiff | tree

Thomas Schwinge [Thu, 4 May 2023 07:07:35 +0000 (09:07 +0200)]

libgomp C++ testsuite: Use 'lang_include_flags' instead of 'libstdcxx_includes'

With nvptx offloading configured, and supported, and CUDA available:

    $ make check-target-libgomp RUNTESTFLAGS="--all c.exp=context-1.c c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c/c.exp ...
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    Running [...]/libgomp.oacc-c++/c++.exp ...
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

..., but for 'c++.exp=context-1.c' alone, we currently get all-UNSUPPORTED:

    $ make check-target-libgomp RUNTESTFLAGS_="--all c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c++/c++.exp ...
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

That is, if 'c.exp' executes first, it does successfully evaluate
'dg-require-effective-target openacc_cublas' -- and does cache this result (so
it isn't reevaluated for 'c++.exp').  However, for 'c++.exp' alone (that is,
without the 'c.exp' result cached), we run into:

    spawn -ignore SIGHUP [xgcc] [...] -x c++ openacc_cublas2311907.c [...]
    In file included from /usr/include/cuda_fp16.h:3673,
                     from /usr/include/cublas_api.h:75,
                     from /usr/include/cublas_v2.h:65,
                     from openacc_cublas2311907.c:3:
    /usr/include/cuda_fp16.hpp:67:10: fatal error: utility: No such file or directory

We're missing include paths to C++/libstdc++ build-tree headers.

Fix this by using the mechanism introduced for Fortran in
r212268 (commit f707da16f714f7fe5a42391748212c84dfec639b) re
"libgomp.fortran/fortran.exp - add -fintrinsic-modules-path ${blddir}".

libgomp/
* testsuite/libgomp.c++/c++.exp: Use 'lang_include_flags' instead
of 'libstdcxx_includes'.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.

commit | commitdiff | tree

Thomas Schwinge [Tue, 2 May 2023 17:57:47 +0000 (19:57 +0200)]

Let each 'lto_init' determine the default 'LTO_OPTIONS', and 'torture-init' the 'LTO_TORTURE_OPTIONS'

Otherwise, for example for 'RUNTESTFLAGS' of '--target_board=unix\{-m64,-m32\}'
vs. '--target_board=unix\{-m32,-m64\}', both variants exercise testing with
always the first flag variant's 'LTO_OPTIONS'/'LTO_TORTURE_OPTIONS', which
results in unequal test results between the two 'RUNTESTFLAGS' variants if one
of the flag variants has 'check_linker_plugin_available' but the other doesn't.

Fix-up for r180245 (commit c1a7cdbbcca90ad5260bfc543f8c10f3514e76c1)
"Update testsuite to run with slim LTO".

gcc/testsuite/
* g++.dg/guality/guality.exp: Move 'torture-init' earlier.
* gcc.dg/guality/guality.exp: Likewise.
* gfortran.dg/guality/guality.exp: Likewise.
* lib/c-torture.exp (LTO_TORTURE_OPTIONS): Don't set.
* lib/gcc-dg.exp (LTO_TORTURE_OPTIONS): Don't set.
* lib/lto.exp (lto_init, lto_finish): Let each 'lto_init'
determine the default 'LTO_OPTIONS'.
* lib/torture-options.exp (torture-init, torture-finish): Let each
'torture-init' determine the 'LTO_TORTURE_OPTIONS'.

commit | commitdiff | tree

Thomas Schwinge [Tue, 21 Mar 2023 15:14:16 +0000 (16:14 +0100)]

libgomp: Simplify OpenMP reverse offload host <-> device memory copy implementation

... by using the existing 'goacc_asyncqueue' instead of re-coding parts of it.

Follow-up to commit 131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
and commit ea4b23d9c82d9be3b982c3519fe5e8e9d833a6a8
"libgomp: Handle OpenMP's reverse offloads".

libgomp/
* target.c (gomp_target_rev): Instead of 'dev_to_host_cpy',
'host_to_dev_cpy', 'token', take a single 'goacc_asyncqueue'.
* libgomp.h (gomp_target_rev): Adjust.
* libgomp-plugin.c (GOMP_PLUGIN_target_rev): Adjust.
* libgomp-plugin.h (GOMP_PLUGIN_target_rev): Adjust.
* plugin/plugin-gcn.c (process_reverse_offload): Adjust.
* plugin/plugin-nvptx.c (rev_off_dev_to_host_cpy)
(rev_off_host_to_dev_cpy): Remove.
(GOMP_OFFLOAD_run): Adjust.

commit | commitdiff | tree

Thomas Schwinge [Mon, 8 May 2023 13:53:47 +0000 (15:53 +0200)]

libgm2: Remove 'autogen.sh'

... given that plain 'autoreconf' achieves the same.

libgm2/
* autogen.sh: Remove.

commit | commitdiff | tree

Thomas Schwinge [Tue, 11 Apr 2023 19:40:14 +0000 (21:40 +0200)]

libgm2: Adjust 'autogen.sh' to 'ACLOCAL_AMFLAGS', and simplify

Specifying explicit '-I ..' before '-I ../config' is what (most) other GCC
components do. Specifying '-I .' is not necessary.

With the order of '-I's aligned, 'autogen.sh' and plain 'autoreconf' then
produce identical results.

libgm2/
* autogen.sh: For 'aclocal', 'autoreconf', remove '-I .',
add '-I ..'.
* Makefile.am (ACLOCAL_AMFLAGS): Remove '-I .'.
* libm2cor/Makefile.am (ACLOCAL_AMFLAGS): Likewise.
* libm2iso/Makefile.am (ACLOCAL_AMFLAGS): Likewise.
* libm2log/Makefile.am (ACLOCAL_AMFLAGS): Likewise.
* libm2min/Makefile.am (ACLOCAL_AMFLAGS): Likewise.
* libm2pim/Makefile.am (ACLOCAL_AMFLAGS): Likewise.
* aclocal.m4: Regenerate.
* Makefile.in: Likewise.
* libm2cor/Makefile.in: Likewise.
* libm2iso/Makefile.in: Likewise.
* libm2log/Makefile.in: Likewise.
* libm2min/Makefile.in: Likewise.
* libm2pim/Makefile.in: Likewise.

commit | commitdiff | tree

Patrick Palka [Mon, 8 May 2023 13:03:35 +0000 (09:03 -0400)]

c++: list CTAD and resolve_nondeduced_context [PR106214]

This extends the PR93107 fix, which made us do resolve_nondeduced_context
on the elements of an initializer list during auto deduction, to happen for
CTAD as well.

PR c++/106214
PR c++/93107

gcc/cp/ChangeLog:

* pt.cc (do_auto_deduction): Move up resolve_nondeduced_context
calls to happen before do_class_deduction. Add some
error_mark_node tests.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction114.C: New test.

Mirror of https://gcc.gnu.org/git/gcc.git