Tom Tromey [Tue, 26 Sep 2023 19:38:42 +0000 (13:38 -0600)]
libstdc++: Use gdb.ValuePrinter base class
GDB 14 will add a new ValuePrinter tag class that will be used to
signal that pretty-printers will agree to the "extension protocol" --
essentially that they will follow some simple namespace rules, so that
GDB can add new methods over time.
A couple new methods have already been added to GDB, to support DAP.
While I haven't implemented these for any libstdc++ printers yet, this
patch makes the basic conversion: printers derive from
gdb.ValuePrinter if it is available, and all "non-standard" (that is,
not specified by GDB) members of the various value-printing classes
are renamed to have a leading underscore.
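A minimal sketch of the conversion described above. It assumes only that gdb.ValuePrinter exists in GDB 14 and later; the printer class, its member names and the fallback to object are hypothetical and are not taken from printers.py:

```python
try:
    import gdb  # only importable inside a GDB process
except ImportError:
    gdb = None  # lets the sketch run outside GDB

# Use gdb.ValuePrinter when this GDB provides it, else plain object.
printer_base = getattr(gdb, "ValuePrinter", object)


class ExamplePointerPrinter(printer_base):
    """Hypothetical printer; not one of the real libstdc++ printers."""

    def __init__(self, typename, val):
        # Non-standard members get a leading underscore so they cannot
        # clash with methods GDB may add to the protocol later.
        self._typename = typename
        self._val = val

    def to_string(self):
        # to_string is specified by GDB, so it keeps its public name.
        return "%s = %s" % (self._typename, self._val)
```

Outside GDB the base class silently becomes object, so the same file can still be loaded by an older GDB.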
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py: Use gdb.ValuePrinter
everywhere. Rename members to start with "_".
Tom Tromey [Wed, 27 Sep 2023 19:49:59 +0000 (13:49 -0600)]
libstdc++: Show full Python stack on error
This changes the libstdc++ test suite to arrange for gdb to show the
full Python stack if any sort of Python exception occurs. This makes
debugging the printers a little simpler.
libstdc++-v3/ChangeLog:
* testsuite/lib/gdb-test.exp (gdb-test): Enable Python
stack traces from gdb.
Jonathan Wakely [Thu, 28 Sep 2023 19:52:01 +0000 (20:52 +0100)]
libstdc++: Refactor Python Xmethods to use is_specialization_of
This copies the is_specialization_of function from printers.py (with
slight modification for versioned namespace handling) and reuses it in
xmethods.py to replace repetitive re.match calls in every class.
This fixes the problem that the regular expressions used \d without
escaping the backslash properly.
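A simplified sketch of what such a helper might look like. It is an illustration only: it takes the type name as a string (no gdb.Type handling), and the value of _versioned_namespace is an assumption here:

```python
import re

# Assumption: empty when the versioned namespace is not in use,
# otherwise a prefix such as '__8::'.
_versioned_namespace = '__8::'


def is_specialization_of(type_name, template_name):
    """Return True if type_name names a specialization of std::template_name.

    Simplified: type_name must be a string, template_name is given
    without any 'std' qualification.
    """
    if _versioned_namespace:
        # Accept the name with or without the versioned namespace.
        template_name = '(%s)?%s' % (re.escape(_versioned_namespace),
                                     template_name)
    return re.match(r'^std::%s<.*>$' % template_name, type_name) is not None
```

A matcher class can then call is_specialization_of(name, 'vector') instead of repeating its own re.match call.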
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/xmethods.py (is_specialization_of): Define
new function.
(ArrayMethodsMatcher, DequeMethodsMatcher)
(ForwardListMethodsMatcher, ListMethodsMatcher)
(VectorMethodsMatcher, AssociativeContainerMethodsMatcher)
(UniquePtrGetWorker, UniquePtrMethodsMatcher)
(SharedPtrSubscriptWorker, SharedPtrMethodsMatcher): Use
is_specialization_of instead of re.match.
Jonathan Wakely [Thu, 28 Sep 2023 13:54:59 +0000 (14:54 +0100)]
libstdc++: Reformat Python code
Some of these changes were suggested by autopep8's --aggressive
option, others are for readability.
Break long lines by splitting strings across multiple lines, or
introducing local variables to hold results.
Use raw strings for regular expressions, so that backslashes don't need
to be escaped.
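As a small illustration of the raw-string point (the pattern below is made up for the example and is not copied from printers.py), the two spellings denote the same regular expression:

```python
import re

# Ordinary string: every backslash has to be doubled.
escaped = '^std::(__\\d+::)?vector<.*>$'
# Raw string: the backslash is written once, as in the regex itself.
raw = r'^std::(__\d+::)?vector<.*>$'

same_pattern = (escaped == raw)
matches = re.match(raw, 'std::__8::vector<int>') is not None
```

Writing '\d' in an ordinary string without doubling the backslash relies on Python leaving the unknown escape alone, which is what the raw form avoids.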
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py: Break long lines. Use raw
strings for regular expressions. Add whitespace around
operators.
(is_member_of_namespace): Use isinstance to check type.
(is_specialization_of): Likewise. Adjust template_name
for versioned namespace instead of duplicating the re.match
call.
(StdExpAnyPrinter._string_types): New static method.
(StdExpAnyPrinter.to_string): Use _string_types.
Gaius Mulley [Thu, 28 Sep 2023 18:07:04 +0000 (19:07 +0100)]
modula2: Increase linking test timeouts for slower targets
This patch introduces missing timeout handling for
pimlib-base-run-pass.exp and increases the timeout value
for larger projects which link (necessary for slower targets).
gcc/testsuite/ChangeLog:
* gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp:
Add load_lib timeout-dg.exp and increase timeout to 60
seconds.
* gm2/pimlib/base/run/pass/pimlib-base-run-pass.exp: Add
load_lib timeout-dg.exp and increase timeout to 60 seconds.
* gm2/projects/iso/run/pass/halma/projects-iso-run-pass-halma.exp:
Increase timeout to 45 seconds.
* gm2/switches/whole-program/pass/run/switches-whole-program-pass-run.exp:
Add load_lib timeout-dg.exp and increase timeout to 120 seconds.
Remove unnecessary compile of mystrlib.mod.
* gm2/iso/run/pass/iso-run-pass.exp: Add load_lib
timeout-dg.exp and set timeout to 60 seconds.
Tim Song [Wed, 6 Sep 2023 17:31:55 +0000 (19:31 +0200)]
libstdc++: Force _Hash_node_value_base methods inline to fix abi (PR111050)
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1b6f0476837205932613ddb2b3429a55c26c409d
changed _Hash_node_value_base to no longer derive from _Hash_node_base, which means
that its member functions expect _M_storage to be at a different offset. So explosions
result if an out-of-line definition is emitted for any of the member functions (say,
in a non-optimized build) and the resulting object file is then linked with code built
using an older version of GCC/libstdc++.
Although the patch improves x86-64 specfp2007, it also results in
performance and code size regression on different targets and
new GCC testsuite failures on tests expecting a specific output.
A MOPS memmove may corrupt registers since there is no copy of the input
operands to temporary registers. Fix this by calling
aarch64_expand_cpymem_mops.
Reviewed-by: Richard Sandiford <richard.sandiford@arm.com>
gcc/ChangeLog:
PR target/111121
* config/aarch64/aarch64.md (aarch64_movmemdi): Add new expander.
(movmemdi): Call aarch64_expand_cpymem_mops for correct expansion.
* config/aarch64/aarch64.cc (aarch64_expand_cpymem_mops): Add support
for memmove.
* config/aarch64/aarch64-protos.h (aarch64_expand_cpymem_mops): Add new
function.
Pan Li [Thu, 28 Sep 2023 05:51:07 +0000 (13:51 +0800)]
RISC-V: Support {U}INT64 to FP16 auto-vectorization
Update in v2:
* Add math trap check.
* Adjust some test cases.
Original logs:
This patch would like to support the auto-vectorization from
the INT64 to FP16. We take below steps for the conversion.
* INT64 to FP32.
* FP32 to FP16.
Given sample code as below:
void
test_func (int64_t * __restrict a, _Float16 *b, unsigned n)
{
for (unsigned i = 0; i < n; i++)
b[i] = (_Float16) (a[i]);
}
After this patch:
vsetvli a5,a2,e8,mf8,ta,ma
vle64.v v1,0(a0)
vsetvli a4,zero,e32,mf2,ta,ma
vfncvt.f.x.w v1,v1
vsetvli zero,zero,e16,mf4,ta,ma
vfncvt.f.f.w v1,v1
vsetvli zero,a2,e16,mf4,ta,ma
vse16.v v1,0(a1)
Please note VLS mode is also involved in this patch and covered by the
test cases.
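The two-step narrowing can be modelled on scalars as below. This is only a sketch of the arithmetic, not of the vector code: it uses Python's struct module to round to FP32 and FP16, the helper names are hypothetical, and because Python first converts the integer to a double, inputs above 2^53 may round slightly differently from a direct INT64 to FP32 conversion:

```python
import math
import struct


def round_to_fp32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


def round_to_fp16(x):
    """Round to the nearest float16 value; overflow becomes infinity."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except (OverflowError, struct.error):
        # struct refuses values beyond the float16 range, whereas a
        # hardware conversion would produce infinity.
        return math.copysign(math.inf, x)


def int64_to_fp16(n):
    """Two-step narrowing: INT64 -> FP32 -> FP16."""
    return round_to_fp16(round_to_fp32(float(n)))
```

For example, int64_to_fp16(4097) gives 4096.0, because FP16 values near 4096 are spaced 4 apart.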
PR target/111506
gcc/ChangeLog:
* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2):
New pattern.
* config/riscv/vector-iterators.md: New iterator.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cvt-0.c: New test.
RISC-V target developers need a flag to prevent creating
insns in IRA which cannot be split after RA as they will need a
temporary reg. The patch introduces such a flag.
gcc/ChangeLog:
* rtl.h (lra_in_progress): Change type to bool.
(ira_in_progress): Add new extern.
* ira.cc (ira_in_progress): New global.
(pass_ira::execute): Set up ira_in_progress.
* lra.cc (lra_in_progress): Change type to bool and initialize.
(lra): Use bool values for lra_in_progress.
* lra-eliminations.cc (init_elim_table): Ditto.
Richard Biener [Thu, 28 Sep 2023 09:51:30 +0000 (11:51 +0200)]
target/111600 - avoid deep recursion in access diagnostics
pass_waccess::check_dangling_stores uses recursion to traverse the CFG.
The following changes this to use a heap allocated worklist to avoid
blowing the stack.
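The general technique can be sketched as follows. This is a generic graph walk, not the code of check_dangling_stores; the function names and the dictionary representation of the graph are assumptions made for the example:

```python
def visit_recursive(succs, node, seen, order):
    """Recursive pre-order walk; deep graphs can exhaust the stack."""
    if node in seen:
        return
    seen.add(node)
    order.append(node)
    for s in succs.get(node, ()):
        visit_recursive(succs, s, seen, order)


def visit_worklist(succs, entry):
    """Same visiting order, using an explicit heap-allocated worklist."""
    seen = set()
    order = []
    worklist = [entry]
    while worklist:
        node = worklist.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push successors in reverse so the first one is popped first,
        # matching the order of the recursive walk.
        worklist.extend(reversed(succs.get(node, ())))
    return order
```

Marking a node only when it is popped, and pushing successors in reverse, is what keeps the order identical to the recursive version.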
Instead of using a better iteration order it tries hard to preserve
the current iteration order to keep new false positives from popping up,
since the set of stores we keep track of isn't properly modeling flow,
so what is diagnosed and what not is quite random. We are also
lacking the ideal RPO compute on the inverted graph that would just
ignore reverse unreachable code (as the current iteration scheme does).
PR target/111600
* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
Use a heap allocated worklist for CFG traversal instead of
recursion.
libgfortran: Use __builtin_unreachable() not -Wno-stringop-overflow to silence warning
The only caller of write_z is formatted_transfer_scalar_write that passes
kind to 'len'; in turn, write_z is the only caller of xtoa_big, passing on
its 'len'. The kind is passed as is, except for GFC_REAL_17 for which
len = 16 is used.
libgfortran/
* io/write.c (xtoa_big): Change a 'GCC diagnostic ignored
"-Wstringop-overflow"' to an assumption (via __builtin_unreachable).
Jakub Jelinek [Thu, 28 Sep 2023 09:59:10 +0000 (11:59 +0200)]
vec.h: Make some ops work with non-trivially copy constructible and/or destructible types
We have some very limited support for non-POD types in vec.h
(in particular grow_cleared will invoke default ctors on the
cleared elements and vector copying invokes copy ctors).
My pending work on wide_int/widest_int which makes those two
non-trivially default constructible, copyable and destructible shows this
isn't enough though.
In particular the uses of it in irange show that quick_push
still uses just assignment operator rather than copy construction
and we never invoke destructors on anything.
The following patch does that for quick_push (copy construction
using placement new rather than assignment, for trivially copy
constructible types I think it should be the same) and invokes
destructors (only if non-trivially destructible) in pop, release
and truncate. Now as discussed last night on IRC, the pop case
is problematic, because our pop actually does two things,
it decreases length (so the previous last element should be destructed)
but also returns a reference to it. We have some 300+ uses of this
and the reference rather than returning it by value is useful at least
for the elements which are (larger) POD structures, so I'm not
prepared to change that. Though obviously for types with non-trivial
destructors returning a reference to a just-destructed element is not
a good idea. So, this patch for that case only makes pop return void
instead and any users wishing to get the last element need to use last ()
and pop () separately (currently there are none).
Note, a lot of vec.h operations are still not friendly to non-POD types,
and the patch tries to enforce that through static asserts. Some
operations are now only allowed on trivially copyable types, sorting
operations as an extension on trivially copyable types or std::pair
of 2 trivially copyable types, quick_grow/safe_grow (but not _cleared
variants) for now have a commented out assert on trivially default
constructible types - this needs some further work before the assert
can be enabled - and finally all va_gc/va_gc_atomic vectors require
trivially destructible types.
2023-09-28 Jakub Jelinek <jakub@redhat.com>
Jonathan Wakely <jwakely@redhat.com>
* vec.h: Mention in file comment limited support for non-POD types
in some operations.
(vec_destruct): New function template.
(release): Use it for non-trivially destructible T.
(truncate): Likewise.
(quick_push): Perform a placement new into slot
instead of assignment.
(pop): For non-trivially destructible T return void
rather than T & and destruct the popped element.
(quick_insert, ordered_remove): Note that they aren't suitable
for non-trivially copyable types. Add static_asserts for that.
(block_remove): Assert T is trivially copyable.
(vec_detail::is_trivially_copyable_or_pair): New trait.
(qsort, sort, stablesort): Assert T is trivially copyable or
std::pair with both trivially copyable types.
(quick_grow): Add assert T is trivially default constructible,
for now commented out.
(quick_grow_cleared): Don't call quick_grow, instead inline it
by hand except for the new static_assert.
(gt_ggc_mx): Assert T is trivially destructible.
(auto_vec::operator=): Formatting fixes.
(auto_vec::auto_vec): Likewise.
(vec_safe_grow_cleared): Don't call vec_safe_grow, instead inline
it manually and call quick_grow_cleared method rather than quick_grow.
(safe_grow_cleared): Likewise.
* edit-context.cc (class line_event): Move definition earlier.
* tree-ssa-loop-im.cc (seq_entry::seq_entry): Make default ctor
defaulted.
* ipa-fnsummary.cc (evaluate_properties_for_edge): Use
safe_grow_cleared instead of safe_grow followed by placement new
constructing the elements.
vmv.s.x has vl operand, the following code will get
avl (const_int) from RVV Insn 1.
rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
: dem.get_avl ();
If REGNO is used on a const_int, the compiler will crash:
during RTL pass: vsetvl
res_debug.c: In function '__dn_count_labels':
res_debug.c:1050:1: internal compiler error: RTL check: expected code 'reg',
have 'const_int' in rhs_regno, at rtl.h:1934
1050 | }
| ^
0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*)
../.././gcc/gcc/rtl.cc:770
0x1399818 rhs_regno(rtx_def const*)
../.././gcc/gcc/rtl.h:1934
0x1399818 anticipatable_occurrence_p
../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348
So in this case avl should be obtained from dem.
Another issue is caused by the following code:
HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
during RTL pass: expand
../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler error: RTL check:
expected code 'const_int', have 'const_poly_int' in expand_const_vector,
at config/riscv/riscv-v.cc:1149
The builder.elt (i) can be either const_int or const_poly_int.
This is a bug fix for commit a62c8324e7e31ae6614f549bdf9d8a653233f8fc,
which added GIMPLE_OMP_STRUCTURED_BLOCK. I found a big switch statement
over gimple codes that needs to know about this new node, but didn't.
gcc/ChangeLog
* gimple.cc (gimple_copy): Add case GIMPLE_OMP_STRUCTURED_BLOCK.
Darwin, configure: Allow for an unrecognisable dsymutil [PR111610].
We had a catch-all configuration case for missing or unrecognised dsymutil
but it was setting the dsymutil source to "UNKNOWN" which is not usable in
this context (since it clashes with an existing enum). We rename this to
DET_UNKNOWN (for Darwin External Toolchain).
PR target/111610
gcc/ChangeLog:
* configure: Regenerate.
* configure.ac: Rename the missing dsymutil case to "DET_UNKNOWN".
aarch64: Fine-grained policies to control ldp-stp formation
This patch implements the following TODO in gcc/config/aarch64/aarch64.cc
to provide the requested behaviour for handling ldp and stp:
/* Allow the tuning structure to disable LDP instruction formation
from combining instructions (e.g., in peephole2).
TODO: Implement fine-grained tuning control for LDP and STP:
1. control policies for load and store separately;
2. support the following policies:
- default (use what is in the tuning structure)
- always
- never
- aligned (only if the compiler can prove that the
load will be aligned to 2 * element_size) */
It provides two new and concrete target-specific command-line parameters
--param=aarch64-ldp-policy= and --param=aarch64-stp-policy=
to give the ability to control load and store policies separately as
stated in part 1 of the TODO.
The accepted values for both parameters are:
* default: Use the policy of the tuning structure (default).
* always: Emit ldp/stp regardless of alignment.
* never: Do not emit ldp/stp.
* aligned: In order to emit ldp/stp, first check if the load/store will
be aligned to 2 * element_size.
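The four values can be read as the following decision, sketched here as a plain function. The function and parameter names are hypothetical and the real check in the back end works on RTL memory operands, so this only illustrates the policy logic:

```python
def ldp_stp_allowed(policy, tuning_policy, alignment, element_size):
    """Decide whether a load/store pair may be formed (illustrative only).

    policy         -- 'default', 'always', 'never' or 'aligned'
    tuning_policy  -- policy from the tuning structure, used for 'default';
                      assumed here to be one of the other three values
    alignment      -- known alignment of the access in bytes
    element_size   -- size in bytes of one element of the pair
    """
    if policy == 'default':
        policy = tuning_policy
    if policy == 'always':
        return True
    if policy == 'never':
        return False
    if policy == 'aligned':
        # Only pair when the access is known to be aligned to the
        # size of the whole pair.
        return alignment >= 2 * element_size
    raise ValueError('unknown policy: %r' % (policy,))
```

With --param=aarch64-ldp-policy=aligned, for instance, two 8-byte loads would be paired only when the access is known to be 16-byte aligned.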
Bootstrapped and regtested aarch64-linux.
gcc/ChangeLog:
* config/aarch64/aarch64-opts.h (enum aarch64_ldp_policy): New
enum type.
(enum aarch64_stp_policy): New enum type.
* config/aarch64/aarch64-protos.h (struct tune_params): Add
appropriate enums for the policies.
(aarch64_mem_ok_with_ldpstp_policy_model): New declaration.
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNING_OPTION): Remove superseded tuning
options.
* config/aarch64/aarch64.cc (aarch64_parse_ldp_policy): New
function to parse ldp-policy parameter.
(aarch64_parse_stp_policy): New function to parse stp-policy parameter.
(aarch64_override_options_internal): Call parsing functions.
(aarch64_mem_ok_with_ldpstp_policy_model): New function.
(aarch64_operands_ok_for_ldpstp): Add call to
aarch64_mem_ok_with_ldpstp_policy_model for parameter-value
check and alignment check and remove superseded ones.
(aarch64_operands_adjust_ok_for_ldpstp): Add call to
aarch64_mem_ok_with_ldpstp_policy_model for parameter-value
check and alignment check and remove superseded ones.
* config/aarch64/aarch64.opt (aarch64-ldp-policy): New param.
(aarch64-stp-policy): New param.
* doc/invoke.texi: Document the parameters accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/ampere1-no_ldp_combine.c: Removed.
* gcc.target/aarch64/ldp_aligned.c: New test.
* gcc.target/aarch64/ldp_always.c: New test.
* gcc.target/aarch64/ldp_never.c: New test.
* gcc.target/aarch64/stp_aligned.c: New test.
* gcc.target/aarch64/stp_always.c: New test.
* gcc.target/aarch64/stp_never.c: New test.
Andre Vieira [Wed, 27 Sep 2023 10:05:40 +0000 (11:05 +0100)]
vect, omp: inbranch simdclone dropping const
The const attribute is ignored when simdclones are used inbranch. This is due
to the fact that when analyzing a MASK_CALL we were not looking at the targeted
function for flags, but instead only at the internal function call itself.
This patch adds code to make sure we look at the target function to check for
the const attribute and enables the autovectorization of inbranch const
simdclones without needing the loop to be adorned with the 'openmp simd' pragma.
Jakub Jelinek [Wed, 27 Sep 2023 08:38:54 +0000 (10:38 +0200)]
remove workaround for GCC 4.1-4.3 [PR105606]
While looking into vec.h, I've noticed we still have a workaround for
GCC 4.1-4.3 bugs.
As we now use C++11 and thus need to be built by GCC 4.8 or later,
I think this is now never used.
All single-precision floating-point values >= 8388608.0 have an all-zero fractional part.
We leverage vmflt and mask to filter them out in vector and only do the
cvt on mask.
After this patch:
...
fsrmi 0 // Rounding to nearest, ties to even
.L4:
vfabs.v v1,v2
vmflt.vf v0,v1,fa5
vfcvt.x.f.v v3,v2,v0.t
vfcvt.f.x.v v1,v3,v0.t
vfsgnj.vv v1,v1,v2
bne .L4
.L14:
fsrm a6
ret
Please note VLS mode is also involved in this patch and covered by the
test cases. We will add more run test with zfa support later.
gcc/ChangeLog:
* config/riscv/autovec.md (roundeven<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
(enum insn_type): Ditto.
(expand_vec_roundeven): New func decl.
* config/riscv/riscv-v.cc (expand_vec_roundeven): New func impl.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-roundeven-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-roundeven-1.c: New test.
Then it ICEs on: auto new_mode = smallest_int_mode_for_size (access_size * BITS_PER_UNIT);
The access_size may be 24 or 32. There are no integer modes of these sizes, so it ICEs.
TODO: A better way may be to make DSE use native_encode_rtx/native_decode_rtx,
but I don't know how to do that. So let's quickly fix this issue; we
can improve the fix later.
PR target/111590
gcc/ChangeLog:
* dse.cc (find_shift_sequence): Check that a mode with access_size exists on the target.
All single-precision floating-point values >= 8388608.0 have an all-zero fractional part.
We leverage vmflt and mask to filter them out in vector and only do
the cvt on mask.
After this patch:
vfabs.v v2,v1
vmflt.vf v0,v2,fa5
vfcvt.rtz.x.f.v v4,v1,v0.t
vfcvt.f.x.v v2,v4,v0.t
vfsgnj.vv v2,v2,v1
bne .L4
Please note VLS mode is also involved in this patch and covered by the
test cases.
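A scalar model of the masked sequence above, as a sketch only: the function name is hypothetical, each step stands in for one vector instruction, and Python floats are doubles rather than single-precision values:

```python
import math

LIMIT = 8388608.0  # 2**23


def trunc_masked(x):
    # Mask (vfabs + vmflt): only magnitudes below 2**23 can have a
    # fractional part. NaN compares false, so NaN and infinities pass
    # through untouched.
    if abs(x) < LIMIT:
        # Convert to integer rounding toward zero, convert back, then
        # copy the sign of the input so that e.g. -0.5 gives -0.0.
        return math.copysign(float(int(x)), x)
    return x
```

The final sign copy corresponds to vfsgnj.vv and is what preserves negative zero.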
gcc/ChangeLog:
* config/riscv/autovec.md (btrunc<mode>2): New pattern.
* config/riscv/riscv-protos.h (expand_vec_trunc): New func decl.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f_rtz): New func impl.
(expand_vec_trunc): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-trunc-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-trunc-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-trunc-1.c: New test.
__atomic_test_and_set: Fall back to library, not non-atomic code
Make __atomic_test_and_set consistent with other __atomic_ and __sync_
builtins: call a matching library function instead of emitting
non-atomic code when the target has no direct insn support.
There's special-case code handling targetm.atomic_test_and_set_trueval
!= 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously,
if that worked but its matching emit_store_flag_force returned NULL,
we'd segfault later on. Now that the caller handles NULL, gcc_assert
here instead.
While the referenced PRs are ARM-specific, the issue is general.
PR target/107567
PR target/109166
* builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
Handle failure from expand_builtin_atomic_test_and_set.
* optabs.cc (expand_atomic_test_and_set): When all attempts fail to
generate atomic code through target support, return NULL
instead of emitting non-atomic code. Also, for code handling
targetm.atomic_test_and_set_trueval != 1, gcc_assert result
from calling emit_store_flag_force instead of returning NULL.
testsuite: Require thread-fence for 29_atomics/atomic_flag/cons/value_init.cc
A recent patch made __atomic_test_and_set no longer fall
back to emitting non-atomic code, but instead will then emit
a call to __atomic_test_and_set, thereby exposing the need
to gate also this test on support for atomics, similar to r14-3980-g62b29347c38394.
These are tests from patch 3/5 of Ziao Zeng's zicond submission.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zicond-primitiveSemantics_return_0_imm.c: New test.
* gcc.target/riscv/zicond-primitiveSemantics_return_imm_imm.c: New test.
* gcc.target/riscv/zicond-primitiveSemantics_return_imm_reg.c: New test.
* gcc.target/riscv/zicond-primitiveSemantics_return_reg_reg.c: New test.
Gaius Mulley [Tue, 26 Sep 2023 17:08:37 +0000 (18:08 +0100)]
PR modula2/111510 runtime ICE findChildAndParent has caused internal runtime error
This patch fixes the runtime bug above. The full runtime message is:
findChildAndParent has caused internal runtime error, RTentity is either
corrupt or the module storage has not been initialized yet. The bug is
due to a non nul terminated string determining the module initialization order.
This results in modules being uninitialized and the above crash. The bug
manifests itself on 32 bit systems - but obviously is latent on all
targets and the fix should be applied to both gcc-14 and gcc-13.
gcc/m2/ChangeLog:
PR modula2/111510
* gm2-compiler/M2GenGCC.mod (IsExportedGcc): Minor spacing changes.
(BuildTrashTreeFromInterface): Minor spacing changes.
* gm2-compiler/M2Options.mod (GetRuntimeModuleOverride): Call
string to generate a nul terminated C style string.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): New procedure.
(BuildM2InitFunction): Replace inline parameter generation with
calls to BuildStringAdrParam.
The outline atomic functions have hidden visibility and can only be called
directly. Therefore we can remove the BTI at function entry. This improves
security by reducing the number of indirect entry points in a binary.
The BTI markings on the objects are still emitted.
Andrew Pinski [Wed, 20 Sep 2023 21:54:31 +0000 (14:54 -0700)]
MATCH: Simplify `(A ==/!= B) &/| (((cast)A) CMP C)`
This patch adds support to the pattern for `(A == B) &/| (A CMP C)`
where the second A could be cast to a different type.
Some were handled correctly if using separate `if` statements
but not if combined with BIT_AND/BIT_IOR.
In the case of pr111456-1.c, the testcase would pass if
`--param=logical-op-non-short-circuit=0` was used but now
can be optimized always.
Andrew Pinski [Thu, 21 Sep 2023 03:05:17 +0000 (03:05 +0000)]
PHIOPT: Fix minmax_replacement for three way
So when diamond bb support was added to minmax_replacement in r13-1950-g9bb19e143cfe,
the code was not expecting the alt_middle_bb not to exist if it was empty (for threeway_p).
So when factor_out_conditional_conversion was used to factor out conversions, the
assumption about alt_middle_bb turned out to be wrong: we ended up with threeway_p being
true but middle_bb being empty and alt_middle_bb not being empty, which causes wrong code in
many cases.
This patch fixes the issue by adding a test for the 2 cases where the threeway_p code
assumes that the other bb is empty.
Changes made:
v2: Fix test for `(a <= u) b = MAX(a, d) else b = u`.
Note my plan for GCC 15 is to remove minmax_replacement as match.pd will catch all cases
at that point.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/111469
gcc/ChangeLog:
* tree-ssa-phiopt.cc (minmax_replacement): Fix
the assumption for the `non-diamond` handling cases
of diamond code.
The current COND_ADD reduction pattern can't optimize floating-point vectors.
As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html
Allow COND_ADD reduction pattern to optimize floating-point vector.
Eric Botcazou [Mon, 18 Sep 2023 07:14:46 +0000 (09:14 +0200)]
ada: Fix missing call to Finalize_Protection for simple protected objects
There is a glitch in Exp_Ch7.Build_Finalizer causing the finalizer to do
nothing for simple protected objects.
The change also removes redundant calls to the Is_Simple_Protected_Type
predicate and fixes a minor inconsistency between Requires_Cleanup_Actions
and Build_Finalizer for this case.
gcc/ada/
* exp_ch7.adb (Build_Finalizer.Process_Declarations): Remove call
to Is_Simple_Protected_Type as redundant.
(Build_Finalizer.Process_Object_Declaration): Do not retrieve the
corresponding record type for simple protected objects. Make the
flow of control more explicit in their specific processing.
* exp_util.adb (Requires_Cleanup_Actions): Return false for simple
protected objects present in library-level package bodies for the
sake of consistency with Build_Finalizer and remove call to
Is_Simple_Protected_Type as redundant.
Marc Poulhiès [Thu, 14 Sep 2023 11:32:05 +0000 (13:32 +0200)]
ada: Fix unnesting generated loops with nested finalization procedure
The compiler can generate loops for creating array aggregates, for
example used during the initialization of a variable. If the component
type of the array element requires finalization, the compiler also
creates a block and a nested procedure that need to be correctly
unnested if unnesting is enabled. During the unnesting transformation,
the scopes for these inner blocks need to be fixed and set to the
enclosing loop entity.
gcc/ada/
* exp_ch7.adb (Contains_Subprogram): Recursively search for subp
in loop's statements.
(Unnest_Loop)<Fixup_Inner_Scopes>: New.
(Unnest_Loop): Rename local variable for more clarity.
* exp_unst.ads: Refresh comment.
Javier Miranda [Fri, 15 Sep 2023 13:08:25 +0000 (13:08 +0000)]
ada: Crash processing the accessibility level of an actual parameter
gcc/ada/
* exp_ch6.adb (Expand_Call_Helper): When computing the
accessibility level of an actual parameter based on the
expression of a constant declaration, add missing support for
deferred constants.
Daniel King [Wed, 23 Aug 2023 13:13:55 +0000 (14:13 +0100)]
ada: Update personality function for CHERI purecap
This makes two changes to the GNAT personality function to reflect
differences for pure capability CHERI/Morello. The first is to use
__builtin_code_address_from_pointer to drop the LSB from Morello
code pointers when searching through call-site tables (without this
we would never find the right landing pad when unwinding).
The second change is to reflect the change in the exception table
format for pure-capability Morello where the landing pad is a capability
indirected by an offset in the call-site table.
gcc/ada/
* raise-gcc.c (get_ip_from_context): Adapt for CHERI purecap.
(get_call_site_action_for): Adapt for CHERI purecap.
Daniel King [Wed, 23 Aug 2023 12:00:57 +0000 (13:00 +0100)]
ada: Fix conversions between addresses and integers
On CHERI targets the sizes of System.Address and Integer_Address
(or similar) are not the same. The operations in System.Storage_Elements
should be used to convert between integers and addresses.
gcc/ada/
* libgnat/a-tags.adb (To_Tag): Use System.Storage_Elements for
integer to address conversion.
* libgnat/s-putima.adb (Put_Image_Pointer): Likewise.
Daniel King [Wed, 23 Aug 2023 13:24:54 +0000 (14:24 +0100)]
ada: Add CHERI variant of System.Stream_Attributes
Reading and writing System.Address to a stream on CHERI targets does
not preserve the capability tag; it will always be invalid since
a valid capability cannot be created out of thin air. Reading an Address
from a stream would therefore never yield a capability that can be
dereferenced.
This patch introduces a CHERI variant of System.Stream_Attributes that
raises Program_Error when attempting to read a System.Address from a stream.
ada: Dimensional analysis when used with elementary functions
gcc/ada/
* doc/gnat_ugn/gnat_and_program_execution.rst: Add more details on
using Generic Elementary Functions with dimensional analysis.
* gnat_ugn.texi: Regenerate.
ada: Clarify RM references that justify a constraint check
gcc/ada/
* exp_ch5.adb (Expand_N_Case_Statement): Reference both sections
of the Ada RM that deal with case statements and case expressions
to justify the insertion of a runtime check.
All single-precision floating-point values >= 8388608.0 have an all-zero fractional part.
We leverage vmflt and mask to filter them out in vector and only do
the cvt on mask.
After this patch:
...
fsrmi 4 // RMM, rounding to nearest, ties to max magnitude
.L4:
vfabs.v v2,v1
vmflt.vf v0,v2,fa5
vfcvt.x.f.v v4,v1,v0.t
vfcvt.f.x.v v2,v4,v0.t
vfsgnj.vv v2,v2,v1
bne .L4
.L14:
fsrm a6
ret
Please note VLS mode is also involved in this patch and covered by the
test cases.
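A scalar model of the RMM variant, as a sketch only. The function name is hypothetical, and the code assumes the input is representable in single precision, so that adding 0.5 in double precision cannot push the sum across an integer boundary:

```python
import math

LIMIT = 8388608.0  # 2**23


def round_ties_away(x):
    # Same mask as in the vector code: values at or above 2**23 in
    # magnitude, NaN and infinities are returned unchanged.
    if abs(x) < LIMIT:
        # Round the magnitude to nearest with ties going up, which is
        # ties away from zero once the sign is copied back.
        return math.copysign(math.floor(abs(x) + 0.5), x)
    return x
```

In the generated code the tie-breaking comes from setting the dynamic rounding mode with fsrmi 4 rather than from an explicit addition.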
gcc/ChangeLog:
* config/riscv/autovec.md (round<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
(enum insn_type): Ditto.
(expand_vec_round): New function decl.
* config/riscv/riscv-v.cc (expand_vec_round): New function impl.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-round-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-round-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-round-1.c: New test.
RISC-V/testsuite: Fix ILP32 RVV failures from missing <gnu/stubs-ilp32d.h>
In non-multilib installations system headers may not be available for
compilation options using a non-default model, causing build errors such
as:
In file included from .../include/features.h:527,
from .../include/assert.h:35,
from .../gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-template.h:2,
from .../gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c:4:
.../include/gnu/stubs.h:11:11: fatal error: gnu/stubs-ilp32d.h: No such file or directory
Therefore we have to be very cautious when trying to use a non-default
model in the testsuite, preferably avoiding reliance on headers that have
not been supplied by GCC itself, or otherwise verifying in a preparatory
step whether the given model is buildable in a given test environment.
In this case however we can easily avoid the issue, because <assert.h>
facilities are not used at all by "vmv-imm-template.h", which includes
the header. Remove the inclusion then, turning these issues:
All single-precision floating-point values >= 8388608.0 have an all-zero fractional part.
We leverage vmflt and mask to filter them out in vector and only do
the cvt on mask.
After this patch:
vfabs.v v2,v1
vmflt.vf v0,v2,fa5
vfcvt.x.f.v v4,v1,v0.t
vfcvt.f.x.v v2,v4,v0.t
vfsgnj.vv v2,v2,v1
Please note VLS mode is also involved in this patch and covered by the
test cases.
gcc/ChangeLog:
* config/riscv/autovec.md (rint<mode>2): New pattern.
* config/riscv/riscv-protos.h (expand_vec_rint): New function decl.
* config/riscv/riscv-v.cc (expand_vec_rint): New function impl.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-rint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-rint-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-rint-1.c: New test.
All single-precision floating-point values >= 8388608.0 (2^23) have no
fractional bits in the mantissa. We use vmflt to build a mask that
filters them out of the vector and only do the conversion on the masked
elements.
After this patch:
vfabs.v v2,v1
vmflt.vf v0,v2,fa5
frflags a7
vfcvt.x.f.v v4,v1,v0.t
vfcvt.f.x.v v2,v4,v0.t
fsflags a7
vfsgnj.vv v2,v2,v1
Please note VLS mode is also involved in this patch and covered by the
test cases.
gcc/ChangeLog:
* config/riscv/autovec.md (nearbyint<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_vec_nearbyint): New function decl.
* config/riscv/riscv-v.cc (expand_vec_nearbyint): New func impl.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add helper function.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-nearbyint-1.c: New test.
When we substitute the equivalence and it becomes shared, we can fail
to correctly update reg info used by LRA. This can result in wrong
code generation, e.g. because of incorrect live analysis. It can also
result in compiler crash as the pseudo survives RA. This is what
exactly happened for the PR. This patch solves this problem by
unsharing substituted equivalences.
Patrick Palka [Mon, 25 Sep 2023 18:48:26 +0000 (14:48 -0400)]
libstdc++: Shorten integer std::to/from_chars symbol names
For std::to_chars:
The constrained alias __integer_to_chars_result_type seems unnecessary
ever since r10-3080-g28f0075742ed58 got rid of the only public overload
which used it. Now only non-public overloads are constrained by it
(through their return type) and these non-public overloads aren't used
in a SFINAE context, so the constraints have no observable effect. So
this patch gets rid of this alias, which greatly shortens the symbol names
of the affected functions (since the expanded alias is quite large).
For std::from_chars:
We can't get rid of the corresponding alias because it constrains the
public integer std::from_chars overload. But we can avoid having the
constraint bloat the mangled name by instead encoding it as a defaulted
template parameter. We use the non-type parameter form
enable_if_t<..., int> = 0
instead of the type parameter form
typename = enable_if_t<...>
because the type form can be bypassed by giving an explicit template
argument for the type parameter, e.g. 'std::from_chars<int, void>(...)',
so the non-type form seems like the more robust choice.
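The difference between the two forms can be sketched in a few lines of C++17. The function and trait names below (accepts_type_form, accepts_nontype_form, nontype_form_callable) are made up for illustration and are not the libstdc++ declarations.

```cpp
#include <type_traits>
#include <utility>

// Type-parameter form: the constraint sits in a defaulted *type* parameter.
template<typename T, typename = std::enable_if_t<std::is_integral_v<T>>>
constexpr bool accepts_type_form(T) { return true; }

// Non-type form: the constraint sits in the *type of* a defaulted non-type parameter.
template<typename T, std::enable_if_t<std::is_integral_v<T>, int> = 0>
constexpr bool accepts_nontype_form(T) { return true; }

// Detects whether accepts_nontype_form<T, 0>(T) is a valid call.
template<typename T, typename = void>
struct nontype_form_callable : std::false_type {};

template<typename T>
struct nontype_form_callable<
    T, std::void_t<decltype(accepts_nontype_form<T, 0>(std::declval<T>()))>>
    : std::true_type {};

// Supplying the second argument explicitly skips the default, and with it the constraint.
static_assert(accepts_type_form<double, void>(1.0), "type form is bypassed");

// With the non-type form, the parameter's own type fails to form for double,
// so an explicit argument cannot get around the constraint.
static_assert(!nontype_form_callable<double>::value, "non-type form still rejects double");
static_assert(nontype_form_callable<int>::value, "non-type form accepts int");
```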
In passing, use __is_standard_integer in the constraint.
libstdc++-v3/ChangeLog:
* include/std/charconv (__detail::__integer_to_chars_result_type):
Remove.
(__detail::__to_chars_16): Use to_chars_result as return type.
(__detail::__to_chars_10): Likewise.
(__detail::__to_chars_8): Likewise.
(__detail::__to_chars_2): Likewise.
(__detail::__to_chars_i): Likewise.
(__detail::__integer_from_chars_result_type): Inline the
constraint into ...
(from_chars): ... here. Use __is_standard_integer in the
constraint. Encode constraint as a defaulted non-type template
parameter instead of within the return type.
Jonathan Wakely [Thu, 21 Sep 2023 08:14:57 +0000 (09:14 +0100)]
libstdc++: Prevent unwanted ADL in std::to_array [PR111512]
As noted in PR c++/111512, GCC does ADL for __builtin_memcpy if it is
unqualified, which can cause errors for template argument types which
cannot be completed.
Casting the memcpy arguments to void* prevents ADL from considering the
problem type.
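A minimal sketch of the void* technique, using the ordinary memcpy from <string.h> rather than the builtin. Incomplete, Holder, HolderPtr and copy_pointers are hypothetical names and this is not the std::to_array implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>

struct Incomplete;                                // intentionally never defined

template<typename T>
struct Holder { T value; };                       // Holder<Incomplete> cannot be completed

using HolderPtr = Holder<Incomplete>*;            // the pointer type itself is fine

// Copy an array of such pointers. After the casts, the memcpy arguments
// are plain void pointers, so argument-dependent lookup has no associated
// class to inspect and never tries to complete Holder<Incomplete>.
inline void copy_pointers(HolderPtr* dst, const HolderPtr* src, std::size_t count) {
    memcpy(static_cast<void*>(dst), static_cast<const void*>(src),
           count * sizeof(HolderPtr));
}

inline bool copy_pointers_demo() {
    HolderPtr src[2] = {nullptr, nullptr};
    HolderPtr dst[2];
    // Parenthesizing the function name suppresses ADL for this call as well.
    (copy_pointers)(dst, src, 2);
    return dst[0] == nullptr && dst[1] == nullptr;
}
```

Note that the demo itself has to avoid an unqualified call with HolderPtr arguments, for the same reason the library code does.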
libstdc++-v3/ChangeLog:
PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Cast memcpy arguments to void*.
* testsuite/23_containers/array/creation/111512.cc: New test.
Andrew Pinski [Sun, 24 Sep 2023 04:53:09 +0000 (21:53 -0700)]
Fix PR 110386: backprop vs ABSU_EXPR
The issue here is that when backprop tries to go
and strip sign ops, it skips over ABSU_EXPR but
ABSU_EXPR not only does an ABS, it also changes the
type to unsigned.
Since strip_sign_op_1 is only supposed to strip off
sign changing operands and not ones that change types,
removing ABSU_EXPR here is correct. We don't handle
nop conversions so this does not cause any missed optimizations either.
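To see why the type change matters, here is a small C++ model of an absolute value that returns the unsigned type. The helper absu is hypothetical (it is not GCC's internal representation) and the sketch assumes two's complement int.

```cpp
#include <climits>

// Absolute value computed in the unsigned type, so it is defined for every int.
constexpr unsigned absu(int x) {
    return x < 0 ? 0u - static_cast<unsigned>(x) : static_cast<unsigned>(x);
}

static_assert(absu(-5) == 5u, "ordinary negative value");
static_assert(absu(5) == 5u, "positive value unchanged");
// The magnitude of INT_MIN only fits in the unsigned type; a signed abs
// has no representable result here.
static_assert(absu(INT_MIN) == static_cast<unsigned>(INT_MAX) + 1u,
              "INT_MIN maps to INT_MAX + 1");
```

Because the result lives in a different type, dropping the operation changes more than the sign of the value.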
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Kewen Lin [Mon, 25 Sep 2023 05:28:19 +0000 (00:28 -0500)]
rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366]
PR111366 shows that rs6000_update_ipa_fn_target_info can be
improved by skipping an empty inline asm string, since an
empty string cannot use any hardware feature (so far only
HTM).
Since this rs6000_update_ipa_fn_target_info related approach
exists in GCC12 and later, the affected project highway has
updated its target pragma with ",htm", see the link:
https://github.com/google/highway/commit/15e63d61eb535f478bc
I'd not bother to consider an inline asm parser for now but
will file a separate PR for further enhancement.
Kewen Lin [Mon, 25 Sep 2023 05:27:59 +0000 (00:27 -0500)]
rs6000: Use default target option node for callee by default [PR111380]
As PR111380 (and the discussion in related PRs) shows, the
way rs6000_can_inline_p currently treats a callee without
any target option node is wrong. It considers inlining such
a callee always safe, but the callee's target flags actually
come from the command line options
(target_option_default_node). The callee's flags may
therefore not satisfy the inlining condition, yet it is
still inlined, with unexpected consequences.
As the associated test case pr111380-1.c shows, the caller
main is attributed with power8, but the callee foo is
compiled with power9 from command line, it's unexpected to
make main inline foo since foo can contain something that
requires power9 capability. Without this patch, for lto
(with -flto) we can get error message (as it forces the
callee to have a target option node), but for non-lto, it's
inlined unexpectedly.
This patch makes the callee adopt target_option_default_node
when it has no target option node, which avoids the wrong
inlining decision and fixes the inconsistency between LTO and
non-LTO. It also aligns with what the other ports do.
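The decision can be modelled in a few lines of C++17. This is a deliberately simplified sketch, not the code in rs6000.cc: the flag constants, the can_inline helper and the "callee flags must be a subset of caller flags" condition are all assumptions made for illustration.

```cpp
#include <optional>

using IsaFlags = unsigned;
constexpr IsaFlags kPower8 = 1u << 0;  // hypothetical flag bits
constexpr IsaFlags kPower9 = 1u << 1;

// A function either carries its own target option node or has none.
// Assumed rule: inlining is allowed only if the callee needs nothing
// the caller lacks. A missing node falls back to the command-line default.
constexpr bool can_inline(std::optional<IsaFlags> caller,
                          std::optional<IsaFlags> callee,
                          IsaFlags default_flags) {
    const IsaFlags caller_flags = caller.value_or(default_flags);
    const IsaFlags callee_flags = callee.value_or(default_flags);
    return (callee_flags & ~caller_flags) == 0;
}

// Command line selects power9 (modelled as power8 + power9); main is
// attributed with power8, foo has no target option node.
constexpr IsaFlags kCommandLine = kPower8 | kPower9;
static_assert(!can_inline(kPower8, std::nullopt, kCommandLine),
              "foo may need power9, so main must not inline it");
static_assert(can_inline(std::nullopt, std::nullopt, kCommandLine),
              "two default functions remain inlinable");
```

Treating a missing node as "always safe" corresponds to returning true whenever callee is empty, which is the case the first assertion rules out.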
PR target/111380
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Adopt
target_option_default_node when the callee has no option
attributes, also simplify the existing code accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr111380-1.c: New test.
* gcc.target/powerpc/pr111380-2.c: New test.
Guo Jie [Thu, 21 Sep 2023 01:19:18 +0000 (09:19 +0800)]
LoongArch: Optimizations of vector construction.
gcc/ChangeLog:
* config/loongarch/lasx.md (lasx_vecinit_merge_<LASX:mode>): New
pattern for vector construction.
(vec_set<mode>_internal): Ditto.
(lasx_xvinsgr2vr_<mode256_i_half>_internal): Ditto.
(lasx_xvilvl_<lasxfmt_f>_internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl_<lsxfmt_f>_internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_<lsxfmt_f>): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
Pan Li [Sun, 24 Sep 2023 03:36:11 +0000 (11:36 +0800)]
RISC-V: Fix fortran ICE/PR111546 when RV32 vec_init
When broadcasting the repeated element, we took the mask_int_mode
by mistake. This patch fixes it by using the machine
mode of the element.
Paul Thomas [Sun, 24 Sep 2023 08:00:52 +0000 (09:00 +0100)]
Fortran: Pad mismatched charlens in component initializers [PR68155]
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.
Andrew Pinski [Sat, 23 Sep 2023 04:38:02 +0000 (04:38 +0000)]
MATCH: Add `(X & ~Y) & Y` and `(X | ~Y) | Y`
Even though this gets optimized by reassociation, catching it more often
will always be better.
Note the reason why I didn't add `(X ^ ~Y) ^ Y` is that it gets caught
by preferring `~(X ^ Y)` to `(X ^ ~Y)`, which is then caught by
the pattern for `(X ^ Y) ^ Y` already.
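For reference, the values these expressions fold to can be checked with a short C++ sketch. The helper names are hypothetical and the code only verifies the bitwise identities; it says nothing about how match.pd expresses them.

```cpp
// The three expressions from the text, on unsigned operands.
constexpr unsigned and_form(unsigned x, unsigned y) { return (x & ~y) & y; }
constexpr unsigned or_form(unsigned x, unsigned y)  { return (x | ~y) | y; }
constexpr unsigned xor_form(unsigned x, unsigned y) { return (x ^ ~y) ^ y; }

// Each bit position behaves independently, so a small exhaustive range
// is enough to exercise every combination of bit values.
constexpr bool identities_hold() {
    for (unsigned x = 0; x < 64; ++x) {
        for (unsigned y = 0; y < 64; ++y) {
            if (and_form(x, y) != 0u) return false;   // (X & ~Y) & Y  ==  0
            if (or_form(x, y) != ~0u) return false;   // (X | ~Y) | Y  ==  all ones
            if (xor_form(x, y) != ~x) return false;   // (X ^ ~Y) ^ Y  ==  ~X
        }
    }
    return true;
}

static_assert(identities_hold(), "bitwise identities");
```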
* gcc.target/riscv/rvv/autovec/vls/def.h:
* gcc.target/riscv/rvv/autovec/vls/cond_convert-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_convert-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_copysign-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ext-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mulh-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_trunc-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wadd-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wfnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wmul-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_wsub-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/narrow-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wred-3.c: New test.
Harald Anlauf [Fri, 22 Sep 2023 19:06:00 +0000 (21:06 +0200)]
fortran: error recovery on duplicate declaration of class variable [PR95710]
gcc/fortran/ChangeLog:
PR fortran/95710
* class.cc (gfc_build_class_symbol): Do not try to build class
container for invalid typespec.
* resolve.cc (resolve_fl_var_and_proc): Prevent NULL pointer
dereference.
(resolve_symbol): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/95710
* gfortran.dg/pr95710.f90: New test.
- Import dmd v2.105.0.
- Catch clause must take only `const' or mutable exceptions.
- Creating a `scope' class instance with a non-scope constructor
is now `@system' only with `-fpreview=dip1000'.
- Global `const' variables can no longer be initialized from a
non-shared static constructor
D runtime changes:
- Import druntime v2.105.0.
Phobos changes:
- Import phobos v2.105.0.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 4574d1728d.
* dmd/VERSION: Bump version to v2.105.0.
* d-diagnostic.cc (verror): Remove.
(verrorSupplemental): Remove.
(vwarning): Remove.
(vwarningSupplemental): Remove.
(vdeprecation): Remove.
(vdeprecationSupplemental): Remove.
(vmessage): Remove.
(vtip): Remove.
(verrorReport): New function.
(verrorReportSupplemental): New function.
* d-lang.cc (d_parse_file): Update for new front-end interface.
* decl.cc (d_mangle_decl): Update for new front-end interface.
* intrinsics.cc (maybe_set_intrinsic): Update for new front-end
interface.
All single-precision floating-point values >= 8388608.0 (2^23) have no
fractional bits in the mantissa. We use vmflt to build a mask that
filters them out of the vector and only do the conversion on the masked
elements.
After this patch:
...
fsrmi 2 // Rounding Down
.L4:
vfabs.v v1,v2
vmflt.vf v0,v1,fa5
vfcvt.x.f.v v3,v2,v0.t
vfcvt.f.x.v v1,v3,v0.t
vfsgnj.vv v1,v1,v2
bne .L4
.L14:
fsrm a6
ret
Please note VLS mode is also involved in this patch and covered by the
test cases.
gcc/ChangeLog:
* config/riscv/autovec.md (floor<mode>2): New pattern.
* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
(enum insn_type): Ditto.
(expand_vec_floor): New function decl.
* config/riscv/riscv-v.cc (gen_floor_const_fp): New function impl.
(expand_vec_floor): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-floor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-floor-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-floor-1.c: New test.
Jason Merrill [Thu, 21 Sep 2023 14:39:46 +0000 (15:39 +0100)]
c++ __integer_pack conversion again [PR111357]
As Jakub pointed out, the real problem here is that in a partial
substitution we're forgetting the conversion to the type of the non-type
template argument, because maybe_convert_nontype_argument doesn't do
anything with value-dependent arguments. I'm experimenting with changing
that, but in the meantime we can work around it here.
PR c++/111357
gcc/cp/ChangeLog:
* pt.cc (expand_integer_pack): Use IMPLICIT_CONV_EXPR.
Pan Li [Fri, 22 Sep 2023 11:08:52 +0000 (19:08 +0800)]
RISC-V: Refine the code gen for ceil auto vectorization.
We already vectorize the ceil code below.
void
test_ceil (float *out, float *in, int count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_ceilf (in[i]);
}
Before this patch:
vfmv.v.x v4,fa0 // can be removed
vfabs.v v0,v1
vmv1r.v v2,v1 // can be removed
vmflt.vv v0,v0,v4 // can be refined to vmflt.vf
vfcvt.x.f.v v3,v1,v0.t
vfcvt.f.x.v v2,v3,v0.t
vfsgnj.vv v2,v2,v1
After this patch:
vfabs.v v1,v2
vmflt.vf v0,v1,fa5
vfcvt.x.f.v v3,v2,v0.t
vfcvt.f.x.v v1,v3,v0.t
vfsgnj.vv v1,v1,v2
The generated code improves in the following ways:
* Remove vfmv.v.f.
* Take vmflt.vf instead of vmflt.vv.
* Remove vmv1r.v.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_vec_float_cmp_mask): Refactor.
(emit_vec_float_cmp_mask): Rename.
(expand_vec_copysign): Ditto.
(emit_vec_copysign): Ditto.
(emit_vec_abs): New function impl.
(emit_vec_cvt_x_f): Ditto.
(emit_vec_cvt_f_x): Ditto.
(expand_vec_ceil): Ditto.
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS modes.
* gcc.target/riscv/rvv/autovec/vls/wfma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wfnms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS modes cond tests.
* gcc.target/riscv/rvv/autovec/vls/wadd-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wadd-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wmul-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/wsub-4.c: New test.
Patrick Palka [Fri, 22 Sep 2023 10:27:48 +0000 (06:27 -0400)]
c++: missing SFINAE in grok_array_decl [PR111493]
We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly treat ill-formed C++23 multidimensional
subscript operator expressions as such.
PR c++/111493
gcc/cp/ChangeLog:
* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.
Patrick Palka [Fri, 22 Sep 2023 10:25:49 +0000 (06:25 -0400)]
c++: constraint rewriting during ttp coercion [PR111485]
In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.
The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.
PR c++/111485
gcc/cp/ChangeLog:
* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS conditional tests.
* gcc.target/riscv/rvv/autovec/vls/cond_add-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_add-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_and-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_div-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_div-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_fnms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_ior-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_max-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_max-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_min-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_min-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mod-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mul-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_mul-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_neg-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_neg-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_not-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_shift-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_shift-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_sub-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_sub-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cond_xor-1.c: New test.