[alpha] adjust MEM alignment for block move [PR115459]
Before issuing loads or stores for a block move, adjust the MEM
alignments if analysis of the addresses enabled the inference of
stricter alignment. This ensures that the MEMs are sufficiently
aligned for the corresponding insns, which avoids trouble in case of
e.g. substitutions into SUBREGs.
for gcc/ChangeLog
PR target/115459
* config/alpha/alpha.cc (alpha_expand_block_move): Adjust
MEMs to match inferred alignment.
YunQiang Su [Thu, 11 Jul 2024 12:43:54 +0000 (20:43 +0800)]
RISC-V: NO_WARNING preferred else value for RVV
PR target/115840.
In riscv_preferred_else_value, we create an uninitialized tmp var
for the else value, instead of 0 (as default_preferred_else_value does)
or the pre-existing VAR (as aarch64 does), so that we can use the
agnostic policy.
The problem is that `warn_uninit` will emit a warning:
'({anonymous})' may be used uninitialized
Let's mark this tmp var as NO_WARNING.
This problem was found when I tried to build glibc with the V extension.
gcc
PR target/115840
* config/riscv/riscv.cc (riscv_preferred_else_value): Mark
tmp_var as NO_WARNING.
gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.
Mikael Morin [Thu, 11 Jul 2024 19:55:58 +0000 (21:55 +0200)]
fortran: Factor the evaluation of MINLOC/MAXLOC's BACK argument
Move the evaluation of the BACK argument out of the loop in the inline code
generated for MINLOC or MAXLOC. For that, add a new (scalar) element
associated with BACK to the scalarization loop chain, evaluate the argument
with the context of that element, and let the scalarizer do its job.
The problem was not only a missed optimisation, but also a wrong code
one in the cases where the expression associated with BACK is not free of
side-effects, making multiple evaluations observable.
The new tests check the evaluation count of the BACK argument, and try to
cover all the variations (integral or floating-point type, constant or
unknown shape, absent or scalar or array MASK) supported by the inline
implementation of the functions. Care has been taken to not check the case
of a constant .FALSE. MASK, for which the evaluation of BACK can be elided.
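The difference between evaluating BACK inside the loop and evaluating it once beforehand can be sketched outside of Fortran. The following is only an illustration in C++, not the code gfortran generates; maxloc_reevaluating, maxloc_hoisted and the callable standing in for the BACK expression are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Old shape: the BACK expression is re-evaluated on every iteration, so any
// side effect it has becomes observable once per element.
template <typename BackFn>
std::size_t maxloc_reevaluating(const std::vector<int>& a, BackFn back) {
    std::size_t pos = 0;  // 1-based result, 0 for an empty array
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool b = back();
        if (pos == 0 || a[i] > a[pos - 1] || (b && a[i] == a[pos - 1]))
            pos = i + 1;
    }
    return pos;
}

// New shape: BACK is evaluated exactly once, before the loop.
template <typename BackFn>
std::size_t maxloc_hoisted(const std::vector<int>& a, BackFn back) {
    const bool b = back();
    std::size_t pos = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (pos == 0 || a[i] > a[pos - 1] || (b && a[i] == a[pos - 1]))
            pos = i + 1;
    }
    return pos;
}
```

Passing a callable that increments a counter makes the evaluation count visible, which is the same idea the new tests rely on.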
gcc/fortran/ChangeLog:
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Create a new
scalar scalarization chain element if BACK is present. Add it to
the loop. Set the scalarization chain before evaluating the
argument.
gcc/testsuite/ChangeLog:
* gfortran.dg/maxloc_5.f90: New test.
* gfortran.dg/minloc_5.f90: New test.
RISC-V: Disable misaligned vector access in hook riscv_slow_unaligned_access [PR115862]
The reason is that in the following code, icode = movmisalignv8si has
already been rejected by TARGET_VECTOR_MISALIGN_SUPPORTED, but it is
allowed by targetm.slow_unaligned_access, which is contradictory.
Kito Cheng [Tue, 9 Jul 2024 07:50:57 +0000 (15:50 +0800)]
RISC-V: Add SiFive extensions, xsfvcp and xsfcease
We have already upstreamed these extensions into binutils, and now we need GCC
to recognize these extensions and pass them to binutils as well. We also plan
to upstream intrinsics in the near future. :)
Kewen Lin [Fri, 12 Jul 2024 06:32:57 +0000 (01:32 -0500)]
rs6000: Remove vcond{,u} expanders
As PR114189 shows, the middle-end will soon obsolete the vcond,
vcondu and vcondeq optabs. This patch removes all
vcond{,u} expanders in the rs6000 port and makes the function
rs6000_emit_vector_cond_expr, which was called by those
expanders, static.
PR target/115659
gcc/ChangeLog:
* config/rs6000/rs6000-protos.h (rs6000_emit_vector_cond_expr): Remove.
* config/rs6000/rs6000.cc (rs6000_emit_vector_cond_expr): Add static
qualifier as it is only called by rs6000_emit_swsqrt now.
* config/rs6000/vector.md (vcond<VEC_F:mode><VEC_F:mode>): Remove.
(vcond<VEC_I:mode><VEC_I:mode>): Remove.
(vcondv4sfv4si): Likewise.
(vcondv4siv4sf): Likewise.
(vcondv2dfv2di): Likewise.
(vcondv2div2df): Likewise.
(vcondu<VEC_I:mode><VEC_I:mode>): Likewise.
(vconduv4sfv4si): Likewise.
(vconduv2dfv2di): Likewise.
Richard Biener [Thu, 11 Jul 2024 08:18:55 +0000 (10:18 +0200)]
tree-optimization/115867 - ICE with simdcall vectorization in masked loop
When only a loop mask is to be supplied for the inbranch arg to a
simd function we fail to handle integer mode masks correctly. We
need to guess the number of elements represented by it. This assumes
that excess arguments are all for masks; I wasn't able to create
a simdclone with more than one integer mode mask argument.
The gcc.dg/vect/vect-simd-clone-20.c test exercises this with -mavx512vl.
PR tree-optimization/115867
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
guess the number of mask elements for integer mode masks.
Jeff Law [Fri, 12 Jul 2024 03:37:34 +0000 (21:37 -0600)]
[committed] Fix m68k bootstrap segfault with late-combine
So the m68k port has failed to bootstrap since the introduction of
late-combine. My suspicion has been this is a backend problem. Sure enough
after bisecting things down (thank goodness for the debug counter!) I'm happy
to report m68k (after this patch) has moved into its stage3 build for the first
time in a month.
Basically late-combine propagated an address calculation to its use points,
generating this insn (dwarf2out.c, I forget what function):
> (define_insn "extendsidi2"
> [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,o,<")
> (sign_extend:DI
> (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm,r<Q>,rm")))
> (clobber (match_scratch:SI 2 "=X,&d,&d,&d"))]
> ""
> {
> if (which_alternative == 0)
> /* Handle alternative 0. */
> {
> if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%R0\;smi %0\;extb%.l %0";
> else
> return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
> }
>
> /* Handle alternatives 1, 2 and 3. We don't need to adjust address by 4
> in alternative 3 because autodecrement will do that for us. */
> operands[3] = adjust_address (operands[0], SImode,
> which_alternative == 3 ? 0 : 4);
> operands[0] = adjust_address (operands[0], SImode, 0);
>
> if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
> else
> return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
> }
> [(set_attr "ok_for_coldfire" "yes,no,yes,yes")])
Note the smi/ext instruction pair in the case for alternatives 1..3. Those
clobber the scratch register before we're done consuming inputs. The scratch
register really needs to be marked as an earlyclobber.
That fixes the bootstrap problem, but a cursory review of m68k.md is not
encouraging. I will not be surprised at all if there's more of this kind of
problem lurking.
But happy to at least have m68k bootstrapping again. It's failing the
comparison test, but definitely progress.
* config/m68k/m68k.md (extendsidi2): Add missing early clobbers.
Ian Lance Taylor [Fri, 12 Jul 2024 02:29:04 +0000 (19:29 -0700)]
libbacktrace: avoid infinite recursion
We could get an infinite recursion in an odd case in which a
.gnu_debugdata section was added to a debug file, and mini_debuginfo
was put into the debug file, and the debug file was put into a
/usr/lib/debug directory to be found by build ID. This combination
doesn't really make sense but we shouldn't get an infinite recursion.
* elf.c (elf_add): Don't use .gnu_debugdata if we are already
reading a debuginfo file.
* Makefile.am (m2test_*): New test targets.
(CHECK_PROGRAMS): Add m2test.
(MAKETESTS): Add m2test_minidebug2.
(%_minidebug2): New pattern.
(CLEANFILES): Remove minidebug2 files.
* Makefile.in: Regenerate.
Jonathan Wakely [Thu, 11 Jul 2024 20:23:15 +0000 (21:23 +0100)]
libstdc++: Test that std::atomic_ref<bool> uses the primary template
The previous commit changed atomic_ref<bool> to not use the integral
specialization. This adds a test to verify that change. We can't
directly test that the primary template is used, but we can check that
the member functions of the integral specializations are not present.
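The "check that a member function is not present" technique can be sketched with a small detection trait. This is a minimal illustration using hypothetical stand-in types (IntegralLikeRef, PrimaryLikeRef) rather than std::atomic_ref itself, and it is not the contents of the new bool.cc test.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins: one type offers the integral-only member,
// the other (like the primary template) does not.
struct IntegralLikeRef { int fetch_add(int) { return 0; } };
struct PrimaryLikeRef { };

// Primary case: assume the member call is not valid.
template <typename T, typename = void>
struct has_fetch_add : std::false_type {};

// Chosen only when `t.fetch_add(1)` is a well-formed expression.
template <typename T>
struct has_fetch_add<T, decltype(void(std::declval<T&>().fetch_add(1)))>
    : std::true_type {};
```

With C++20 the same check could be written as a requires-expression; the trait form is used here only so that the sketch also compiles as C++11.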
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/bool.cc: New test.
libstdc++: the specialization atomic_ref<bool> should use the primary template
Per [atomics.ref.int] `bool` is excluded from the list of integral types
for which there is a specialization of the `atomic_ref` class template
and [Note 1] clearly states that `atomic_ref<bool>` "uses the primary
template" instead.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_ref): Do not use integral
specialization for bool.
Jeff Law [Thu, 11 Jul 2024 18:05:56 +0000 (12:05 -0600)]
[to-be-committed,RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp
This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.
Conceptually this is pretty simple. Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.
FINAL_LABEL is the point after the calculation of the return value. So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.
Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result. After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.
So with that background.
We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte. Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.
There are 4 jumps to final_label in emit_strcmp_scalar.
The first is just returning zero and needs trivial simplification to not
force the result into SImode.
The second is after calling strcmp in the library. The ABI mandates
that value is sign extended, so there's nothing to do for that case.
The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul. If we dive into that
routine it needs simplification similar to what we did in
emit_strcmp_scalar_compare_byte.
The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.
Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.
The net of all that is we know every path has its result properly
extended to X mode. Standard redundant extension removal will take care
of the rest.
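The key observation, that subtracting two bytes directly in the wide mode already yields a properly sign-extended value, can be checked with a small model. This is only an arithmetic sketch in C++ with hypothetical function names; it assumes X mode is 64 bits (rv64) and that the bytes are compared as unsigned char.

```cpp
#include <cassert>
#include <cstdint>

// Byte difference computed directly in a 64-bit ("X mode") temporary.
std::int64_t diff_in_xmode(unsigned char a, unsigned char b) {
    return std::int64_t(a) - std::int64_t(b);
}

// Same difference computed in 32 bits and then explicitly sign-extended,
// which is the extra step the patch makes redundant.
std::int64_t diff_in_simode_then_extend(unsigned char a, unsigned char b) {
    const std::int32_t narrow = std::int32_t(a) - std::int32_t(b);
    return std::int64_t(narrow);
}
```

Both functions agree for every pair of bytes because the difference always lies in the range -255..255, which fits in either width.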
We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc. It's
also been through a build/test cycle in my tester. Waiting on results
from the pre-commit testing before moving forward.
gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines. If necessary
extract low part of the word to store in final result location.
Andrew Pinski [Sat, 22 Jun 2024 04:07:26 +0000 (21:07 -0700)]
Ranger: Mark a few classes as final
I noticed a warning from clang about int_range's dtor being marked
as final, saying the class cannot be inherited from.
So let's mark as final the few ranger classes that we know
will be final.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* value-range.h (class int_range): Mark as final.
(class prange): Likewise.
(class frange): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
recog: Avoid validate_change shortcut for groups [PR115782]
In this PR, due to the -f flags, we ended up with:
bb1: r10=r10
...
bb2: r10=r10
...
bb3: ...=r10
with bb1->bb2 and bb1->bb3.
late-combine successfully combined the bb1->bb2 def-use and set
the insn code to NOOP_MOVE_INSN_CODE. The bb1->bb3 combination
then failed for... reasons. At this point, everything should have
been rewound to its original state.
However, substituting r10=r10 into r10=r10 gives r10=r10, and
validate_change had an early-out for no-op rtl changes. This meant
that validate_change did not register a change for the bb2 insn and
so did not save its old insn code. The NOOP_MOVE_INSN_CODE therefore
persisted even after the attempt had been rewound.
IMO it'd be too cumbersome and error-prone to expect all users of
validate_change to be aware of this possibility. If code is using
validate_change with in_group=1, I think it has a reasonable expectation
that a change will be registered and that the insn code will be saved
(and restored on cancel). This patch therefore limits the shortcut
to the !in_group case.
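The failure mode can be reproduced in a toy model of a change group. The sketch below is not recog.cc; Insn, ChangeGroup, the shortcut_noops flag and the numeric value of NOOP_MOVE_CODE are all hypothetical and exist only to show why an unregistered no-op change cannot be rewound.

```cpp
#include <cassert>
#include <vector>

// Toy insn: a pattern plus a cached insn code.
struct Insn { int pattern; int code; };

struct Change { Insn* insn; int old_pattern; int old_code; };

struct ChangeGroup {
    std::vector<Change> changes;

    // Queue `insn->pattern = new_pattern`. With `shortcut_noops` set, a change
    // that leaves the pattern as it was is not recorded (the old behaviour).
    void validate_change(Insn* insn, int new_pattern, bool shortcut_noops) {
        if (shortcut_noops && insn->pattern == new_pattern)
            return;
        changes.push_back({insn, insn->pattern, insn->code});
        insn->pattern = new_pattern;
    }

    // Rewind every recorded change, restoring pattern and cached code.
    void cancel() {
        for (auto it = changes.rbegin(); it != changes.rend(); ++it) {
            it->insn->pattern = it->old_pattern;
            it->insn->code = it->old_code;
        }
        changes.clear();
    }
};

const int NOOP_MOVE_CODE = -2;  // hypothetical value, for illustration only

// Returns the insn code left behind after a cancelled group change.
int code_after_cancel(bool shortcut_noops) {
    Insn insn{10, 42};
    ChangeGroup group;
    group.validate_change(&insn, 10, shortcut_noops);  // r10=r10 into r10=r10
    insn.code = NOOP_MOVE_CODE;  // the caller re-recognizes the insn
    group.cancel();
    return insn.code;
}
```

With the shortcut enabled, the no-op code survives the cancel; without it, the original code 42 is restored.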
gcc/
PR rtl-optimization/115782
* recog.cc (validate_change_1): Suppress early exit for no-op
changes that are part of a group.
gcc/testsuite/
PR rtl-optimization/115782
* gcc.dg/pr115782.c: New test.
Eric Botcazou [Thu, 11 Jul 2024 08:49:13 +0000 (10:49 +0200)]
Fix gimplification of ordering comparisons of arrays of bytes
The Ada compiler now defers to the gimplifier for ordering comparisons of
arrays of bytes (Ada parlance for <, >, <= and >=) because the gimplifier
in turn defers to memcmp for them, which implements the required semantics.
However, the gimplifier has a special processing for aggregate types whose
mode is not BLKmode and this processing deviates from the memcmp semantics
when the target is little-endian.
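Why a scalar-mode comparison can disagree with memcmp on a little-endian target is easy to see with four bytes. The helpers below are a hypothetical C++ illustration, not gimplifier code: they build the integer that a word load would produce under each byte order and compare that with the memcmp ordering.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value of a 4-byte array when byte 0 is least significant, which is what a
// plain word load yields on a little-endian target.
std::uint32_t as_little_endian(const unsigned char b[4]) {
    return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
           (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// Value of the same array when byte 0 is most significant.
std::uint32_t as_big_endian(const unsigned char b[4]) {
    return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
           (std::uint32_t(b[2]) << 8) | std::uint32_t(b[3]);
}

// Reference semantics: lexicographic order of the bytes.
bool less_memcmp(const unsigned char a[4], const unsigned char b[4]) {
    return std::memcmp(a, b, 4) < 0;
}
```

For {1,0,0,2} and {2,0,0,1}, memcmp and the big-endian value agree that the first array is smaller, while the little-endian value orders them the other way round.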
gcc/
* gimplify.cc (gimplify_scalar_mode_aggregate_compare): Add support
for ordering comparisons.
(gimplify_expr) <default>: Call gimplify_scalar_mode_aggregate_compare
only for integral scalar modes.
gcc/testsuite/
* gnat.dg/array42.adb, gnat.dg/array42_pkg.ads: New test.
There are these insns that subtract and zero-extend where
the subtrahend is zero-extended to the mode of the minuend.
This patch uses one insn (and split) with mode iterators
instead of spelling out each variant individually.
This has the additional benefit that u32 - u24 is also supported,
which previously wasn't.
c++/modules: Keep entity mapping info across duplicate_decls [PR99241]
When duplicate_decls finds a match with an existing imported
declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it
with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing
an ICE if the same declaration is imported again later.
This fixes the issue by ensuring that the flag is transferred to newdecl
before clearing so that it ends up on olddecl again.
For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P,
though because we don't yet support textual redefinition merging we
can't yet test this works as intended. I don't expect it's possible for
a new declaration already to have extra keyed decls mismatching that of
the old declaration though, so I don't do anything with 'keyed_map' at
this time.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.
Fortran: Fix rejecting class arrays of different ranks as storage association argument and add un/pack_class. [PR96992]
Removing the assert in trans-expr led to initial strides not being
set, which is now fixed. When the array needs repacking, this is now
done for class arrays, too.
Packing class arrays was done using the regular internal pack
function in the past. But that does not use the vptr's copy
function and breaks OOP paradigms (e.g. deep copy). The new
un-/pack_class functions use the vptr's copy functionality to
implement OOP paradigms correctly.
PR fortran/96992
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_trans_array_bounds): Set a starting
stride, when descriptor expects a variable for the stride.
(gfc_trans_dummy_array_bias): Allow storage association for
dummy class arrays, when they are not elemental.
(gfc_conv_array_parameter): Add more general class support
and packing for classes, too.
* trans-array.h (gfc_conv_array_parameter): Add lbound shift
for class arrays.
* trans-decl.cc (gfc_build_builtin_function_decls): Add decls
for internal_un-/pack_class.
* trans-expr.cc (gfc_reset_vptr): Allow supplying a type-tree
to generate the vtab from.
(gfc_class_set_vptr): Allow supplying a class-tree to take the
vptr from.
(class_array_data_assign): Rename to gfc_class_array_data_assign
and make usable from other compile units.
(gfc_class_array_data_assign): Renamed from class_array_data_
assign.
(gfc_conv_derived_to_class): Remove assert to
allow converting derived to class type arrays with assumed
rank. Reduce code base and use gfc_conv_array_parameter also
for classes.
(gfc_conv_class_to_class): Use gfc_class_data_assign.
(gfc_conv_procedure_call): Adapt to new signature of
gfc_conv_derived_to_class.
* trans-io.cc (transfer_expr): Same.
* trans-stmt.cc (trans_associate_var): Same.
* trans.h (gfc_conv_derived_to_class): Signature changed.
(gfc_class_array_data_assign): Made public.
(gfor_fndecl_in_pack_class): Added declaration.
(gfor_fndecl_in_unpack_class): Same.
libgfortran/ChangeLog:
* Makefile.am: Add in_un-/pack_class.c to build.
* Makefile.in: Regenerated from Makefile.am.
* gfortran.map: Added new functions and bumped ABI.
* libgfortran.h (GFC_CLASS_T): Added for generating class
representation at runtime.
* runtime/in_pack_class.c: New file.
* runtime/in_unpack_class.c: New file.
Jørgen Kvalsvik [Fri, 29 Mar 2024 12:01:37 +0000 (13:01 +0100)]
Add function filtering to gcov
Add the --include and --exclude flags to gcov to control what functions
to report on. This is meant to make gcov more practical when
writing test suites or performing other coverage experiments, which
tend to focus on a few functions at a time. This really shines in
combination with the -t/--stdout flag. With support for more expansive
metrics in gcov like modified condition/decision coverage (MC/DC) and
path coverage, output quickly gets overwhelming without filtering.
The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins", that is, if a
function matches an --include and a subsequent --exclude, it should not
be included in the output. All of the output machinery works on the
function table, so optionally (not) adding functions to it makes even
the JSON output work as expected, and only minor changes are needed to
suppress the filtered-out functions.
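The "last filter wins" rule can be sketched in a few lines. This is a hypothetical model using std::regex with the egrep grammar, not the code in gcov.cc; the Filter struct and should_report are invented names, and the default used when no filter matches (report only if there are no filters or the first filter is an --exclude) is an assumption.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// One --include/--exclude option, in command-line order.
struct Filter {
    bool include;         // true for --include, false for --exclude
    std::string pattern;  // egrep-style regex
};

// Evaluate the filters left to right; the last one that matches decides.
bool should_report(const std::string& fn, const std::vector<Filter>& filters) {
    // Assumed default for functions that match no filter at all.
    bool report = filters.empty() || !filters.front().include;
    for (const Filter& f : filters) {
        const std::regex re(f.pattern, std::regex::egrep);
        if (std::regex_search(fn, re))
            report = f.include;
    }
    return report;
}
```

With the filters --include=su --exclude=sum, this model reports sub but not sum, matching the rule that a later --exclude overrides an earlier --include.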
Demo: math.c
int mul (int a, int b) {
return a * b;
}
int sub (int a, int b) {
return a - b;
}
int sum (int a, int b) {
return a + b;
}
Plain matches:
$ gcov -t math --include=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
$ gcov -t math --include=mul
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
Regex match:
$ gcov -t math --include=su
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
And similar for exclude:
$ gcov -t math --exclude=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
Matching generally works well for mangled names, as the mangled names
also contain the base symbol name. By default, functions are matched
by the mangled name, which means matching on base names always works as
expected. The -M flag makes the matching work on the demangled name
which is quite useful when you only want to report on specific
overloads and can use the full type names.
Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.
* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.
Ensure that function.end_line is included in the lines vector for the source
file, even if it is not explicitly touched by a basic block. This
ensures consistency with what you would expect. For example, this file
has sources[sum.cc].lines.size () == 23 and main.end_line == 2 without
adjusting sources.lines, which in this case is a no-op.
#####: 17:int main ()
-: 18:{
#####: 19: sum (1, 2);
#####: 20: sum (1.1, 2);
#####: 21: sum (2.2, 2.3);
#####: 22:}
This is a useful property when combined with selective reporting.
gcc/ChangeLog:
* gcov.cc (process_all_functions): Ensure fn.end_line is
included in source[fn].lines.
RISC-V: c implies zca, and conditionally zcf & zcd
According to Zc-1.0.4-3.pdf from
https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
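The three rules above can be written down as a small function. This is a hypothetical sketch, not the implied-extension table in riscv-common.cc; implied_by_c and its parameters are invented for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Extensions implied by C, given which of F and D are present and the
// register width (32 or 64).
std::set<std::string> implied_by_c(bool has_c, bool has_f, bool has_d,
                                   int xlen) {
    std::set<std::string> implied;
    if (!has_c)
        return implied;          // nothing is implied without C
    implied.insert("zca");       // C always implies Zca
    if (has_f && xlen == 32)
        implied.insert("zcf");   // C+F implies Zcf, RV32 only
    if (has_d)
        implied.insert("zcd");   // C+D implies Zcd
    return implied;
}
```

For example, rv64 with C, F and D yields Zca and Zcd but not Zcf, because Zcf is restricted to RV32.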
After this patch:
test:
...
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
...
The following test suites passed for this patch:
1. The rv64gcv full regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 full regression tests.
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_sub_pattern_transform):
Add new func impl to perform the truncation distribution.
(vect_recog_sat_sub_pattern): Perform above optimize before
generate .SAT_SUB call.
Jonathan Wakely [Wed, 10 Jul 2024 09:27:24 +0000 (10:27 +0100)]
libstdc++: Make std::basic_format_context non-copyable [PR114387]
Users are not supposed to create objects of this type, and there's no
reason it needs to be copyable. LWG 4061 makes it non-copyable and
non-default constructible.
libstdc++-v3/ChangeLog:
PR libstdc++/114387
* include/std/format (basic_format_context): Define copy
operations as deleted, as per LWG 4061.
* testsuite/std/format/context.cc: New test.
Jonathan Wakely [Wed, 10 Jul 2024 16:47:56 +0000 (17:47 +0100)]
libstdc++: Minor optimization for std::locale::encoding()
For the C locale we know the encoding is ASCII, so we can avoid using
newlocale and nl_langinfo_l. Similarly, for an unnamed locale, we aren't
going to get a useful answer, so we can just use a default-constructed
std::text_encoding representing an unknown encoding.
libstdc++-v3/ChangeLog:
* src/c++26/text_encoding.cc (__locale_encoding): Add to unnamed
namespace.
(std::locale::encoding): Optimize for "C" and "*" names.
Jonathan Wakely [Wed, 10 Jul 2024 09:29:52 +0000 (10:29 +0100)]
libstdc++: Use direct-initialization for std::vector<bool>'s allocator [PR115854]
The consensus in the standard committee is that this change shouldn't be
necessary, and the Allocator requirements should require conversions
between rebound allocators to be implicit. But we can make it work for
now anyway.
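The distinction that matters here is between copy-initialization, which needs an implicit conversion, and direct-initialization, which may use an explicit constructor. The sketch below uses a hypothetical allocator-like type, ExplicitAlloc, whose rebinding constructor is explicit; it is not the code in stl_bvector.h.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical allocator-like type: converting from another instantiation
// is possible, but only explicitly.
template <typename T>
struct ExplicitAlloc {
    ExplicitAlloc() = default;

    template <typename U>
    explicit ExplicitAlloc(const ExplicitAlloc<U>&) {}
};

using IntAlloc = ExplicitAlloc<int>;
using WordAlloc = ExplicitAlloc<unsigned long>;

// Direct-initialization is allowed to call the explicit constructor.
// Writing `WordAlloc w = a;` instead would not compile.
WordAlloc rebind_direct(const IntAlloc& a) {
    return WordAlloc(a);
}
```

std::is_constructible reports true for this pair while std::is_convertible reports false, which is exactly the gap that direct-initialization bridges.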
libstdc++-v3/ChangeLog:
PR libstdc++/115854
* include/bits/stl_bvector.h (_Bvector_base): Convert allocator
to rebound type explicitly.
* testsuite/23_containers/vector/allocator/115854.cc: New test.
* testsuite/23_containers/vector/bool/allocator/115854.cc: New test.
Jonathan Wakely [Mon, 8 Jul 2024 09:45:52 +0000 (10:45 +0100)]
libstdc++: ranges::find needs explicit conversion to size_t [PR115799]
For an integer-class type we need to use an explicit conversion to size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/115799
* include/bits/ranges_util.h (__find_fn): Make conversion
from difference type to size_t explicit.
* testsuite/25_algorithms/find/bytes.cc: Check ranges::find with
__gnu_test::test_contiguous_range.
* testsuite/std/ranges/range.cc: Adjust expected difference_type
for __gnu_test::test_contiguous_range.
* testsuite/util/testsuite_iterators.h (contiguous_iterator_wrapper):
Use __max_diff_type as difference type.
(test_range::sentinel, test_sized_range_sized_sent::sentinel):
Ensure that operator- returns difference_type.
Marek Polacek [Fri, 31 May 2024 12:54:00 +0000 (08:54 -0400)]
c++: remove Concepts TS code
In GCC 14 we deprecated Concepts TS and discussed removing the code
in GCC 15. This patch removes Concepts TS code from the front end,
including support for template-introductions, as in:
The biggest part of this patch is adjusting the testsuite. We don't
want to lose coverage so I've converted most of -fconcepts-ts tests
to C++20. That means they no longer have to be c++17_only. Mostly
this meant turning "concept bool" into "concept" and turning function
concepts into C++20 concepts. I've added missing "auto"s where
required, but "auto"s in template-argument-lists are not supported
anymore so I've removed some of the tests; some of them are still
present to verify we don't crash on such autos. I've also added ()
around "requires" expressions.
I plan to add a porting_to.html entry with a few hints.
I've rebased and tested the patch after the recent r15-1103.
Marek Polacek [Tue, 25 Jun 2024 18:55:08 +0000 (14:55 -0400)]
c: ICE with invalid sizeof [PR115642]
Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The
code checks for expr.value == error_mark_node but here the e_m_n is
wrapped in a C_MAYBE_CONST_EXPR. I don't think we should have created
such a tree, so let's return earlier in c_cast_expr.
PR c/115642
gcc/c/ChangeLog:
* c-typeck.cc (c_cast_expr): Return error_mark_node if build_c_cast
failed.
Marek Polacek [Thu, 27 Jun 2024 20:39:29 +0000 (16:39 -0400)]
c: ICE on invalid with attribute optimize [PR115549]
I had this PR in my open tabs so why not go ahead and fix it.
decl_attributes gets last_decl, the last already pushed declaration,
to be used in common_handle_aligned_attribute. In C++, we look up
the decl via find_last_decl, which returns NULL_TREE if it finds
a decl that had not been declared. In C, we look up the decl via
lookup_last_decl which returns error_mark_node rather than NULL_TREE
in that case.
The error_mark_node causes a crash in common_handle_aligned_attribute.
We can fix this on the C FE side like in the patch below.
PR c/115549
gcc/c/ChangeLog:
* c-decl.cc (c_decl_attributes): If lookup_last_decl returns
error_mark_node, use NULL_TREE as last_decl.
testsuite: Align testcase with implementation [PR105090]
Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c
no longer contains any lsr instruction, so drop the check as per
comment 9 in PR105090.
gcc/testsuite/ChangeLog:
PR target/105090
* gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr
Edwin Lu [Wed, 10 Jul 2024 16:44:48 +0000 (09:44 -0700)]
RISC-V: Add support for B standard extension
This patch adds support for recognizing the B standard extension as the
collection of the Zba, Zbb and Zbs extensions, for consistency and
conciseness across toolchains.
https://github.com/riscv/riscv-b/tags
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: Ditto
expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations. Specifically, for:
(subreg/v:M1 (reg:M2 R) ...)
it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.
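The final extension step can be modelled with plain integer arithmetic. The following is only a sketch in C++ that assumes M1 is a 32-bit mode and M2 a 64-bit mode; the function names are hypothetical and unrelated to the new helper routines.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low 32 bits of T to 64 bits.
std::uint64_t zero_extend_low32(std::uint64_t t) {
    return t & 0xFFFFFFFFull;
}

// Sign-extend the low 32 bits of T to 64 bits, using only unsigned
// arithmetic: flip the sign bit, then subtract it again.
std::uint64_t sign_extend_low32(std::uint64_t t) {
    const std::uint64_t low = t & 0xFFFFFFFFull;
    return (low ^ 0x80000000ull) - 0x80000000ull;
}
```

In both cases the upper half of T is ignored, which is why T may hold arbitrary bits there before the extension.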
This patch splits this handling out into helper routines and
uses them for other instances of:
if (!rtx_equal_p (target, ops[0].value))
emit_move_insn (target, ops[0].value);
It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors. But I think it could
be useful for the CRC work.
Marek Polacek [Tue, 2 Jul 2024 19:22:39 +0000 (15:22 -0400)]
c++: array new with value-initialization [PR115645]
This extends the r11-5179 fix which doesn't work with multidimensional
arrays. In particular,
struct S {
explicit S() { }
};
auto p = new S[1][1]();
should not say "converting to S from initializer list would use
explicit constructor" because there's no {}. However, since we
went into the block where we create a {}, we got confused. We
should not have gotten there but we did because array_p was true.
This patch refines the check once more.
PR c++/115645
gcc/cp/ChangeLog:
* init.cc (build_new): Don't do any deduction for arrays with
bounds if it's value-initialized.
recog: Handle some mode-changing hardreg propagations
insn_propagation would previously only replace (reg:M H) with X
for some hard register H if the uses of H were also in mode M.
This patch extends it to handle simple mode punning too.
The original motivation was to try to get rid of the execution
frequency test in aarch64_split_simd_shift_p, but doing that is
follow-up work.
I tried this on at least one target per CPU directory (as for
the late-combine patches) and it seems to be a small win for
all of them.
The patch includes a couple of updates to the ia32 results.
In pr105033.c, foo3 replaced:
In vect-bfloat16-2b.c, 5 of the vec_extract_v32bf_* routines
(specifically the ones with nonzero even indices) replaced
things like:
movl 28(%esp), %eax
vmovd %eax, %xmm0
with:
vpinsrw $0, 28(%esp), %xmm0, %xmm0
(These functions return a bf16, and so only the low 16 bits matter.)
gcc/
* recog.cc (insn_propagation::apply_to_rvalue_1): Handle simple
cases of hardreg propagation in which the register is set and
used in different modes.
gcc/testsuite/
* gcc.target/i386/pr105033.c: Expect vmovhps for the ia32 version
of foo.
* gcc.target/i386/vect-bfloat16-2b.c: Expect more vpinsrws.
change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent. These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.
change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction. This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction. But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step. That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.
This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.
gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.
c++, contracts: Fix ICE in create_tmp_var [PR113968]
During contract parsing, in grok_contract(), we proceed even if the
condition contains errors. This results in contracts with embedded errors
which eventually confuse gimplify. Checks for errors have been added in
grok_contract() to exit early if an error is encountered.
PR c++/113968
gcc/cp/ChangeLog:
* contracts.cc (grok_contract): Check for error_mark_node early
exit.
Gaius Mulley [Wed, 10 Jul 2024 14:52:37 +0000 (15:52 +0100)]
PR modula2/115823 Wrong expansion of isnormal optab
The bug fix changes gcc/m2/gm2-gcc/m2builtins.c:m2builtins_BuiltinExists
to recognise both __builtin_<functionname> and functionname as a builtin.
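The matching rule described above amounts to accepting a name with or without the __builtin_ prefix. A minimal sketch, with builtin_name_matches as a hypothetical helper rather than the builtin_function_match or builtin_macro_match added by the patch:

```cpp
#include <cassert>
#include <string>

// True if `query` names the builtin `base`, either as `base` itself or as
// `__builtin_` followed by `base`.
bool builtin_name_matches(const std::string& query, const std::string& base) {
    static const std::string prefix = "__builtin_";
    return query == base || query == prefix + base;
}
```

Under this rule both isnormal and __builtin_isnormal resolve to the same entry.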
gcc/m2/ChangeLog:
PR modula2/115823
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New
field builtinname.
(builtin_function_match): New function.
(builtin_macro_match): Ditto.
(m2builtins_BuiltinExists): Use builtin_function_match and
builtin_macro_match.
(lookup_builtin_macro): Use builtin_macro_match.
(lookup_builtin_function): Use builtin_function_match.
(define_builtin): Assign builtinname field.
gcc/testsuite/ChangeLog:
PR modula2/115823
* gm2/builtins/run/pass/testalloa.mod: New test.
middle-end: Fix stale swapped condition code value [PR115836]
emit_store_flag_1 calculates scode (the swapped condition code) at the
beginning of the function from the value of the code variable. However,
the code variable may change before the site where scode is used,
resulting in an invalid, stale scode value.
Move the calculation of scode to just before its only usage site to
avoid a stale scode value.
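The ordering problem can be sketched outside of GCC as follows. This is only an illustration: the Cmp enum, swap_condition, and the GE-to-GT rewrite are hypothetical stand-ins, not the actual logic of emit_store_flag_1.

```cpp
// Hypothetical comparison codes; the real code uses enum rtx_code.
enum Cmp { LT, GT, LE, GE };

// Condition that holds when the two operands are exchanged:
// a < b is the same as b > a, and so on.
Cmp swap_condition(Cmp c) {
  switch (c) {
  case LT: return GT;
  case GT: return LT;
  case LE: return GE;
  default: return LE;  // GE
  }
}

struct Codes {
  Cmp code;
  Cmp scode;
};

// Problematic ordering: scode is derived first, then code is rewritten
// (here by an assumed canonicalisation of GE into GT), so scode no
// longer corresponds to code.
Codes early_scode(Cmp code, bool rewrite) {
  Cmp scode = swap_condition(code);
  if (rewrite && code == GE)
    code = GT;
  return Codes{code, scode};
}

// Fixed ordering: derive scode only after code has its final value.
Codes late_scode(Cmp code, bool rewrite) {
  if (rewrite && code == GE)
    code = GT;
  Cmp scode = swap_condition(code);
  return Codes{code, scode};
}
```

With early_scode (GE, true) the result pairs code GT with scode LE, which is the kind of mismatch the commit describes; late_scode returns GT with LT.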
PR middle-end/115836
gcc/ChangeLog:
* expmed.cc (emit_store_flag_1): Move calculation of
scode just before its only usage site.
The arm 'pe' target was removed back in 2012 when the FPA support was
removed, but in a small number of places some conditional code was
accidentally left behind. It's no longer needed, so remove it.
The following test suites passed for this patch:
1. The x86 bootstrap test.
2. The x86 full regression test.
3. The rv64gcv full regression test.
gcc/ChangeLog:
* match.pd: Add form 2 for .SAT_TRUNC.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add new case NOP_EXPR, and try to match SAT_TRUNC.
Andrew Pinski [Wed, 10 Jul 2024 00:13:24 +0000 (17:13 -0700)]
testsuite: Allow matching `{_1, { 0,0,0,0 }}` for vect/slp-gap-1.c
While working on adding V4QI support to the aarch64 backend,
vect/slp-gap-1.c started to fail, but only because the regex
no longer matched. Before, the load used SI (int); afterwards,
it used V4QI. The generated code was the same and the
generated gimple was almost the same. The regex was searching
for the `zero-padding trick`, which was still applied, but instead
of a plain 0 the padding was a V4QI zero (that is, `{ 0, 0, 0, 0 }`).
This extends the regex to support both.
Tested on x86_64-linux-gnu and aarch64-linux-gnu (with the support added).
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-gap-1.c: Support matching `{_1, { 0, 0, 0, 0 }}`
in addition to `{_1, 0}`.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Tue, 9 Jul 2024 21:00:34 +0000 (14:00 -0700)]
Remove expanding complex EQ/NE inside a GIMPLE_RETURN [PR115721]
This code has been dead at least since the move over to tuples
in 0-88576-g726a989a8b74bf, after which a gimple return could only
contain a simple expression. So let's remove it.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/115721
* tree-complex.cc (expand_complex_comparison): Remove
support for GIMPLE_RETURN.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
No functional changes compared with V1, just a spaces-to-tabs conversion
in the testcases so that they pass check-function-bodies.
V1 passed regression testing locally but surprisingly failed in pre-commit CI.
After picking the patch from patchwork, I realized the tabs had been converted
to spaces before the patch was sent.
Root cause:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
The commit above tries, in targetm.gen_epilogue (), to detect whether
there is an li a0,0 insn at the end of the insn chain; if so, cm.popret
is replaced by cm.popretz and the li a0,0 insn is deleted.
The generated epilogue sequence is not inserted
into the insn chain at this point.
If shrink-wrap later decides NOT to insert the epilogue sequence at the end
of the insn chain, then the li a0,0 insn has already been mistakenly removed.
Fix this issue by removing the generation of cm.popretz in the epilogue,
leaving the assignment to a0 and the use insn with cm.popret.
That's likely going to result in some kind of code size regression,
but not a correctness regression.
Martin Uecker [Sun, 23 Jun 2024 07:10:20 +0000 (09:10 +0200)]
Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]
Some tests added to test the type of redeclarations of enumerators
in r15-1394 fail on architectures where sizeof(long) == sizeof(int).
Adapt tests to use long long and/or accept that long long is selected
as type for the enumerator.
Martin Uecker [Sat, 29 Jun 2024 13:53:43 +0000 (15:53 +0200)]
c: Fix ICE for redeclaration of structs with different alignment [PR114727]
For redeclarations of struct in C23, if one has an alignment attribute
that makes the alignment different, we later get an ICE in verify_types.
This patch disallows such redeclarations by declaring such types to
be different.
The built-in actually generates more instructions than the inline C code
with no optimization but is identical with -O3 optimizations.
All of the above built-ins that are removed do not have test cases and
are not documented.
Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.
The built-ins are removed as they don't provide any benefit over just
using C code.
The code to define the bif_init_bit, bif_is_init, as well as their uses
are removed. The function altivec_expand_vec_init_builtin is also removed.
gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (altivec_expand_vec_init_builtin):
Remove the function.
(rs6000_expand_builtin): Remove the if bif_is_init check to call
the altivec_expand_vec_init_builtin function.
* config/rs6000/rs6000-builtins.def: Remove the attribute string
comment for init.
(__builtin_vec_init_v16qi,
__builtin_vec_init_v4sf, __builtin_vec_init_v4si,
__builtin_vec_init_v8hi, __builtin_vec_init_v1ti,
__builtin_vec_init_v2df, __builtin_vec_init_v2di,
__builtin_vec_set_v16qi, __builtin_vec_set_v4sf,
__builtin_vec_set_v4si, __builtin_vec_set_v8hi): Remove
built-in definitions.
* config/rs6000/rs6000-gen-builtins.cc: Remove comment for init
attribute string.
(struct attrinfo): Remove isinit entry.
(parse_bif_attrs): Remove the if statement to check for attribute
init.
(ifdef DEBUG): Remove print for init attribute string.
(write_decls): Remove print for define bif_init_bit and
define for bif_is_init.
(write_bif_static_init): Remove if bifp->attrs.isinit statement.
Carl Love [Tue, 9 Jul 2024 17:32:19 +0000 (13:32 -0400)]
rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in
The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
__builtin_altivec_vcmpeqfp_p built-in. The built-in is undocumented and
there are no test cases for it. The patch removes built-in
__builtin_vsx_xvcmpeqsp_p.
Update the documentation to include a reference to the new vector built-in
instances of vec_xxpermdi.
Add test cases for the new overloaded instances.
gcc/ChangeLog:
* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
overloaded built-in instances of vector signed and unsigned
int128.
* doc/extend.texi: Add documentation for built-in instances of
vector signed and unsigned int128.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
Carl Love [Tue, 9 Jul 2024 17:32:02 +0000 (13:32 -0400)]
rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins
The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are
redundant. The overloaded vec_neg built-in provides the same
functionality. The two built-ins are not documented nor are there any
test cases for them.
Remove the definitions so users will use the overloaded vec_neg built-in
which is documented in the PVIPR.
Carl Love [Tue, 9 Jul 2024 17:31:34 +0000 (13:31 -0400)]
rs6000, remove the vec_xxsel built-ins, they are duplicates
The following undocumented built-ins are covered by the existing overloaded
vec_sel built-in definitions.
const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
same as vsc __builtin_vec_sel (vsc, vsc, vuc); (overloaded vec_sel)
const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
same as vuc __builtin_vec_sel (vuc, vuc, vuc); (overloaded vec_sel)
const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
same as vd __builtin_vec_sel (vd, vd, vull); (overloaded vec_sel)
const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
same as vsll __builtin_vec_sel (vsll, vsll, vsll); (overloaded vec_sel)
const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
same as vull __builtin_vec_sel (vull, vull, vsll); (overloaded vec_sel)
const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
same as vf __builtin_vec_sel (vf, vf, vsi) (overloaded vec_sel)
const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
same as vsi __builtin_vec_sel (vsi, vsi, vbi); (overloaded vec_sel)
const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
same as vui __builtin_vec_sel (vui, vui, vui); (overloaded vec_sel)
const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
same as vss __builtin_vec_sel (vss, vss, vbs); (overloaded vec_sel)
const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
same as vus __builtin_vec_sel (vus, vus, vus); (overloaded vec_sel)
This patch removes the duplicate built-in definitions so users will only
use the documented vec_sel built-in. The __builtin_vsx_xxsel_[4si, 8hi,
16qi, 4sf, 2df] tests are also removed.
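For readers unfamiliar with vec_sel, the following is a portable model of its bitwise behaviour: each result bit comes from the first operand where the mask bit is 0 and from the second where it is 1. The V4 alias and sel_model are hypothetical names for this sketch, which uses std::array instead of real PowerPC vector types.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical four-lane vector of 32-bit elements.
using V4 = std::array<std::uint32_t, 4>;

// Bitwise select: take bits of A where MASK is 0, bits of B where
// MASK is 1.  The element type does not matter for the selection,
// which is why a single overloaded built-in can cover all the
// removed type-specific variants.
V4 sel_model(const V4 &a, const V4 &b, const V4 &mask) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (a[i] & ~mask[i]) | (b[i] & mask[i]);
  return r;
}
```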
Carl Love [Tue, 9 Jul 2024 17:31:22 +0000 (13:31 -0400)]
rs6000, add overloaded vec_sel with int128 arguments
Extend the vec_sel built-in to take three signed/unsigned/bool int128
arguments and return a signed/unsigned/bool int128 result.
Extending the vec_sel built-in makes the existing built-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete. The
patch removes these built-ins.
The patch adds documentation and test cases for the new overloaded
vec_sel built-ins.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
* config/rs6000/rs6000-overload.def (vec_sel): Add new
overloaded vector signed, unsigned and bool 128-bit definitions.
* doc/extend.texi (vec_sel): Add documentation for new instances
with signed, unsigned and bool 128-bit arguments.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-10-runnable.c: New runnable test
file.
* gcc.target/powerpc/builtins-10.c: New compile only test file.
Carl Love [Tue, 9 Jul 2024 17:31:12 +0000 (13:31 -0400)]
rs6000, remove duplicated built-ins of vec_mergel and vec_mergeh
The following undocumented built-ins are the same as existing documented
overloaded builtins.
const vf __builtin_vsx_xxmrghw (vf, vf);
same as vf __builtin_vec_mergeh (vf, vf); (overloaded vec_mergeh)
const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
same as vsi __builtin_vec_mergeh (vsi, vsi); (overloaded vec_mergeh)
const vf __builtin_vsx_xxmrglw (vf, vf);
same as vf __builtin_vec_mergel (vf, vf); (overloaded vec_mergel)
const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
same as vsi __builtin_vec_mergel (vsi, vsi); (overloaded vec_mergel)
This patch removes the duplicate built-in definitions so only the
documented built-ins will be available for use. The case statements in
rs6000_gimple_fold_builtin are removed as they are no longer needed. The
patch removes the now unused define_expands for vsx_xxmrghw_<mode> and
vsx_xxmrglw_<mode>.
Carl Love [Tue, 9 Jul 2024 17:29:31 +0000 (13:29 -0400)]
rs6000, Remove redundant vector float/double type conversions
The following built-ins are redundant as they are covered by another
overloaded built-in.
__builtin_vsx_xvcvspdp covered by vec_double{e,o}
__builtin_vsx_xvcvdpsp covered by vec_float{e,o}
__builtin_vsx_xvcvsxwdp covered by vec_double{e,o}
__builtin_vsx_xvcvuxddp_uns covered by vec_double
Remove the redundant built-ins. They are not documented nor do they have
test cases.
Carl Love [Tue, 9 Jul 2024 17:17:44 +0000 (13:17 -0400)]
rs6000, extend the current vec_{un,}signed{e,o} built-ins
The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to a vector of signed/unsigned long long ints.
Extend the existing vec_{un,}signed{e,o} built-ins to handle the argument
vector of floats to return a vector of even/odd signed/unsigned integers.
The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
built-ins.
The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
now for internal use only. They are not documented and they do not
have test cases.
Add testcases and update documentation.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds,
__builtin_vsx_xvcvspuxds): Rename to __builtin_vsignede_v4sf,
__builtin_vunsignede_v4sf respectively.
(XVCVSPSXDS, XVCVSPUXDS): Rename to VEC_VSIGNEDE_V4SF,
VEC_VUNSIGNEDE_V4SF respectively.
(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
built-in definitions.
* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
vec_unsignede, vec_unsignedo): Add new overloaded specifications.
* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
* doc/extend.texi (vec_signedo, vec_signede, vec_unsignedo,
vec_unsignede): Add documentation for new overloaded built-ins to
convert vector float to vector {un,}signed long long.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c
(test_unsigned_int_result, test_ll_unsigned_int_result): Add
new argument.
(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
tests for the overloaded built-ins.
Carl Love [Tue, 9 Jul 2024 17:17:28 +0000 (13:17 -0400)]
rs6000, fix error in unsigned vector float to unsigned int built-in definitions
The built-in __builtin_vsx_vunsigned_v2df is supposed to take a vector of
doubles and return a vector of unsigned long long ints. Similarly
__builtin_vsx_vunsigned_v4sf takes a vector of floats and is supposed to
return a vector of unsigned ints. The definitions use the signed
version of the instructions, not the unsigned version.
The results should also be unsigned. The built-ins are used by the
overloaded vec_unsigned built-in which has an unsigned result.
Similarly the built-ins __builtin_vsx_vunsignede_v2df and
__builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
If the floating point argument is negative, the unsigned result is zero.
The built-ins are used in the overloaded built-in vec_unsignede and
vec_unsignedo respectively.
Add test cases with negative floating point arguments for each of the
above built-ins.
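The difference between the two conversions can be modelled in scalar code. This is a sketch under assumptions: the text only states that a negative argument gives an unsigned result of zero; the NaN handling, the clamping at the upper bound, and truncation toward zero are assumptions made so the model is total, and both function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Model of an unsigned conversion: negative inputs give 0.
// NaN -> 0 and clamping at the top are assumptions of this sketch.
std::uint32_t to_unsigned_saturating(float x) {
  if (std::isnan(x) || x <= 0.0f)
    return 0;
  if (x >= 4294967296.0f)
    return std::numeric_limits<std::uint32_t>::max();
  return static_cast<std::uint32_t>(x);  // in range, truncates toward zero
}

// Model of what happens when the signed conversion is used by mistake
// and the result is then viewed as unsigned.  Only valid for inputs
// that fit in a 32-bit signed integer.
std::uint32_t signed_then_reinterpret(float x) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(x));
}
```

For an input of -3.0f the first function returns 0, while the second returns 4294967293, which is why the new tests use negative arguments.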
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
vec_unsignede and vec_unsignedo with negative arguments.
The built-in __builtin_vsx_xvcvspsxws is covered by the vec_signed
built-in, which is documented in the PVIPR. The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.
The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
vec_unsigned, remove.
The __builtin_vsx_xvcvspuxws is redundant as it is covered by
vec_unsigned, remove.
The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.
The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove.
Carl Love [Tue, 9 Jul 2024 17:12:39 +0000 (13:12 -0400)]
rs6000, Remove __builtin_vsx_cmple* builtins
The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
unsigned arguments and return an unsigned result. The current definitions
take signed arguments and return signed results which is incorrect.
The signed and unsigned versions of __builtin_vsx_cmple* are not
documented in extend.texi. Also there are no test cases for the
built-ins.
Users can use the existing vec_cmple as PVIPR defines instead of
__builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi,
__builtin_vsx_cmple_16qi, __builtin_vsx_cmple_2di,
__builtin_vsx_cmple_4si and __builtin_vsx_cmple_8hi,
__builtin_altivec_cmple_1ti, __builtin_altivec_cmple_u1ti.
Hence these built-ins are redundant and are removed by this patch.
David Malcolm [Tue, 9 Jul 2024 15:22:32 +0000 (11:22 -0400)]
diagnostics: use refs rather than pointers for diagnostic_{path,context}
Use const & rather than const * in various places where it can't be null
and can't change.
No functional change intended.
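A small sketch of the pattern, with a hypothetical path struct standing in for diagnostic_path. The function names are illustrative only.

```cpp
#include <cstddef>

// Hypothetical stand-in for diagnostic_path.
struct path {
  std::size_t num_events;
};

// Pointer form: the signature admits null, so every callee has to
// decide what a null argument means.
std::size_t count_events_ptr(const path *p) {
  return p ? p->num_events : 0;
}

// Reference form: the signature itself documents that the argument
// is always present and is never reseated.
std::size_t count_events_ref(const path &p) {
  return p.num_events;
}
```

Callers that already hold a valid object change from passing `&obj` or a pointer to passing the object itself (or `*ptr`), which is the mechanical change this commit makes.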
gcc/ChangeLog:
* diagnostic-path.cc: Replace "const diagnostic_path *" with
"const diagnostic_path &" throughout, and "diagnostic_context *"
with "diagnostic_context &".
* diagnostic.cc (diagnostic_context::show_any_path): Pass
reference in call to print_path.
* diagnostic.h (diagnostic_context::print_path): Convert param
to a reference.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Support for the FPA on Arm was removed after gcc-4.7, but this little
bit of crufty code was left behind. In particular the code to support
the 'N' modifier in assembly code was left behind and this led to a
trail of other code that depended on it, even though most of the
constants that it supported had been removed in the original cleanup.
This patch removes most of the remaining cruft and simplifies the one
bit that remains: to determine whether an RTL construct contains 0.0 we
don't need to convert it to a real value, we can simply compare it to
CONST0_RTX of the appropriate mode.
gcc/
* config/arm/arm.cc (fp_consts_inited): Delete variable.
(value_fp0): Likewise.
(init_fp_table): Delete function.
(fp_const_from_val): Likewise.
(arm_const_double_rtx): Rework to avoid converting to REAL_VALUE_TYPE.
(arm_print_operand, case 'N'): Make use of this case an error.
We have code duplication in riscv_set_arch_by_subset_list() and
riscv_parse_arch_string(), where the latter function parses an ISA string
into a subset_list before doing the same as the former function.
riscv_parse_arch_string() is used to process command line options and
riscv_set_arch_by_subset_list() processes target attributes.
So, it is obvious that both functions should do the same.
Let's deduplicate the code to enforce this.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_set_arch_by_subset_list):
Fix overlong line.
(riscv_parse_arch_string): Replace duplicated code by a call to
riscv_set_arch_by_subset_list.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
There are two test cases with the following skip directive:
dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" }
This reads as: skip if both '-flto' and '-fno-fat-lto-objects'
are present. This is not the case if only '-flto' is present.
Since both tests depend on instruction sequences (one does
check-function-bodies the other tests for an assembler error
message), they won't work reliably with fat LTO objects.
Let's change the skip line to gate the test on '-flto'
to avoid failing tests like this:
FAIL: gcc.target/riscv/interrupt-misaligned.c -O2 -flto check-function-bodies interrupt
FAIL: gcc.target/riscv/interrupt-misaligned.c -O2 -flto -flto-partition=none check-function-bodies interrupt
FAIL: gcc.target/riscv/pr93202.c -O2 -flto (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c -O2 -flto (test for errors, line 9)
FAIL: gcc.target/riscv/pr93202.c -O2 -flto -flto-partition=none (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c -O2 -flto -flto-partition=none (test for errors, line 9)
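For illustration, the directive before and after the change would look roughly like this inside the test files. The "after" line is inferred from the description above and has not been checked against the committed tests.

```cpp
/* Before: the single string "-flto -fno-fat-lto-objects" only matches
   when both options appear together, so a plain -flto run is not
   skipped.  */
/* { dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" } } */

/* After (assumed form): skip every run whose options include -flto.  */
/* { dg-skip-if "" { *-*-* } { "-flto" } } */
```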
gcc/testsuite/ChangeLog:
* gcc.target/riscv/interrupt-misaligned.c: Remove
"-fno-fat-lto-objects" from skip condition.
* gcc.target/riscv/pr93202.c: Likewise.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
AVX10 Documentation has specified ecx value as 0 for AVX10 version and
vector size under 0x24 subleaf. Although for ecx=1, the bits are all
reserved for now, we still need to specify ecx as 0 to avoid dirty
value in ecx.
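A hedged sketch of querying that leaf with an explicit subleaf of 0, using __get_cpuid_count from GCC's <cpuid.h>. The Avx10Leaf struct and query_avx10_leaf are hypothetical names, and reading the version from the low byte of EBX is my understanding of the AVX10 documentation rather than something stated in this commit. Real code should also check the AVX10 feature bit before trusting the value.

```cpp
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

// Hypothetical result type for this sketch.
struct Avx10Leaf {
  bool leaf_available;  // CPUID leaf 0x24 is within the supported range
  unsigned version;     // low byte of EBX (assumed to be the AVX10 version)
};

Avx10Leaf query_avx10_leaf() {
  Avx10Leaf info = {false, 0};
#if defined(__x86_64__) || defined(__i386__)
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  // Pass the subleaf (ECX) explicitly as 0 instead of relying on
  // whatever value happens to be in the register.
  if (__get_cpuid_count(0x24, 0, &eax, &ebx, &ecx, &edx)) {
    info.leaf_available = true;
    info.version = ebx & 0xff;
  }
#endif
  return info;
}
```

On non-x86 hosts the function simply reports the leaf as unavailable.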
Jakub Jelinek [Tue, 9 Jul 2024 08:45:25 +0000 (10:45 +0200)]
c: Rewrite c_parser_omp_tile_sizes to use c_parser_expr_list
The following patch simplifies c_parser_omp_tile_sizes to use
c_parser_expr_list, so that it will get CPP_EMBED parsing naturally,
without having another spot that needs to be adjusted for it.
2024-07-09 Jakub Jelinek <jakub@redhat.com>
* c-parser.cc (c_parser_omp_tile_sizes): Use c_parser_expr_list.
* c-c++-common/gomp/tile-11.c: Adjust expected diagnostics for c.
* c-c++-common/gomp/tile-12.c: Likewise.
Rename __{float,double}_u to __x86_{float,double}_u to avoid polluting the namespace.
I have a build failure on NetBSD as the namespace pollution avoidance causes
a direct hit with the system /usr/include/math.h
=======================================================================
In file included from /usr/src/local/gcc/obj/gcc/include/emmintrin.h:31,
from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/ext/random:45,
from /usr/src/local/gcc/libstdc++-v3/include/precompiled/extc++.h:65:
/usr/src/local/gcc/obj/gcc/include/xmmintrin.h:75:15: error: conflicting declaration 'typedef float __float_u'
75 | typedef float __float_u __attribute__ ((__may_alias__, __aligned__ (1)));
| ^~~~~~~~~
In file included from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/cmath:47,
from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/x86_64-unknown-netbsd10.99/bits/stdc++.h:114,
from /usr/src/local/gcc/libstdc++-v3/include/precompiled/extc++.h:32:
/usr/src/local/gcc/obj/gcc/include-fixed/math.h:49:7: note: previous declaration as 'union __float_u'
49 | union __float_u {
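The clash and its resolution can be reproduced in miniature. In this sketch sys_float_u stands in for the system header's union and x86_sys_float_u for the renamed typedef; both names are hypothetical and deliberately avoid the reserved double-underscore names. The attributes are GCC/Clang extensions, copied from the typedef shown in the error above.

```cpp
#include <cstring>

// Stand-in for the union declared by the system <math.h>.
union sys_float_u {
  float f;
  unsigned char bytes[sizeof(float)];
};

// Declaring `typedef float sys_float_u ...;` here would fail with a
// "conflicting declaration" error, as in the log above.  A prefixed
// name coexists with the union without trouble.
typedef float x86_sys_float_u
    __attribute__((__may_alias__, __aligned__(1)));

// The typedef's purpose: read a float through a pointer that may be
// unaligned and may alias other types.
float load_unaligned(const void *p) {
  return *static_cast<const x86_sys_float_u *>(p);
}
```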
Pan Li [Mon, 8 Jul 2024 13:58:59 +0000 (21:58 +0800)]
RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 2
After the middle-end supported the vector mode of .SAT_ADD, add more
testcases to ensure the correctness of RISC-V backend for form 2. Aka:
Form 2:
#define DEF_VEC_SAT_U_ADD_IMM_FMT_2(T, IMM) \
T __attribute__((noinline)) \
vec_sat_u_add_imm##IMM##_##T##_fmt_2 (T *out, T *in, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
out[i] = (T)(in[i] + IMM) < in[i] ? -1 : (in[i] + IMM); \
}
DEF_VEC_SAT_U_ADD_IMM_FMT_2 (uint64_t, 9)
Passed the full rv64gcv regression tests.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add help
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-8.c: New test.
Pan Li [Mon, 8 Jul 2024 12:31:31 +0000 (20:31 +0800)]
RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 1
After the middle-end supported the vector mode of .SAT_ADD, add more
testcases to ensure the correctness of RISC-V backend for form 1. Aka:
Form 1:
#define DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM) \
T __attribute__((noinline)) \
vec_sat_u_add_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
out[i] = (T)(in[i] + IMM) >= in[i] ? (in[i] + IMM) : -1; \
}
DEF_VEC_SAT_U_ADD_IMM_FMT_1 (uint64_t, 9)
Passed the full rv64gcv regression tests.
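Form 1 and form 2 differ only in which side of the comparison is written first. A scalar sketch of both, as function templates with hypothetical names rather than the testsuite macros, makes the wraparound check easier to follow:

```cpp
#include <cstdint>

// Form 1: keep the sum if it did not wrap, otherwise saturate.
template <typename T>
T sat_u_add_imm_fmt_1(T x, T imm) {
  T sum = static_cast<T>(x + imm);  // unsigned wraparound on overflow
  return sum >= x ? sum : static_cast<T>(-1);
}

// Form 2: saturate if the sum wrapped, otherwise keep it.
template <typename T>
T sat_u_add_imm_fmt_2(T x, T imm) {
  T sum = static_cast<T>(x + imm);
  return sum < x ? static_cast<T>(-1) : sum;
}
```

For an 8-bit type, 250 + 9 wraps to 3, which is less than 250, so both forms return 255; 10 + 9 does not wrap and both return 19.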
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add help
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-4.c: New test.
Jeff Law [Mon, 8 Jul 2024 23:06:55 +0000 (17:06 -0600)]
[to-be-committed][RISC-V][V3] DCE analysis for extension elimination
The pre-commit testing showed that making ext-dce only active at -O2 and above
would require minor edits to the tests. In some cases we had specified -O1 in
the test or specified no optimization level at all. Those need to be bumped to
-O2. In one test we had one set of dg-options overriding another.
The other approach that could have been taken would be to drop the -On
argument, add an explicit -fext-dce and add dg-skip-if options. I originally
thought that was going to be the way to go, but the dg-skip-if aspect was going to
get ugly as things like interaction between unrolling, peeling and -ftracer
would have to be accounted for and would likely need semi-regular adjustment.
Changes since V2:
Testsuite changes to deal with pass only being enabled at -O2 or
higher.
--
Changes since V1:
Check flag_ext_dce before running the new pass. I'd forgotten that
I had removed that part of the gate to facilitate more testing.
Turn flag_ext_dce on at -O2 and above.
Adjust one of the riscv tests to explicitly avoid vectors
Adjust a few aarch64 tests
In tbz_2.c we remove an unnecessary extension which causes us to use
"x" registers instead of "w" registers.
In the pred_clobber tests we also remove an extension and that
ultimately causes a reg->reg copy to change locations.
--
This was actually ack'd late in the gcc-14 cycle, but I chose not to integrate
it given how late we were in the cycle.
The basic idea here is to track liveness of subobjects within a word and if we
find an extension where the bits set aren't actually used, then we convert the
extension into a subreg. The subreg typically simplifies away.
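A source-level analogy may help, with the caveat that the pass works on RTL and not on C++ source. In the first function below the sign extension only affects bits above bit 7, and those bits are discarded by the final narrowing, so the extension is dead in the sense described above. Both function names are hypothetical.

```cpp
#include <cstdint>

// The widening to 32 bits sets bits 8..31, but only bits 0..7 of the
// sum are ever observed.
std::uint8_t low_byte_with_extension(std::int8_t x) {
  std::int32_t widened = x;  // sign extension
  return static_cast<std::uint8_t>(widened + 1);
}

// Equivalent computation without the extension: operate directly on
// the low 8 bits.
std::uint8_t low_byte_without_extension(std::int8_t x) {
  return static_cast<std::uint8_t>(static_cast<std::uint8_t>(x) + 1u);
}
```

The two functions agree for every possible input, which is the condition under which dropping the extension is safe.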
I've seen this help a few routines in coremark, fix one bug in the testsuite
(pr111384) and fix a couple internally reported bugs in Ventana.
The original idea and code were from Joern; Jivan and I hacked it into usable
shape. I've had this in my tester for ~8 months, so it's been through more
build/test cycles than I care to contemplate and nearly every architecture we
support.
But just in case, I'm going to wait for it to spin through the pre-commit CI
tester. I'll find my old ChangeLog before committing.
gcc/
* Makefile.in (OBJS): Add ext-dce.o
* common.opt (ext-dce): Document new option.
* df-scan.cc (df_get_exit_block_use_set): Delete prototype and
make extern.
* df.h (df_get_exit_block_use_set): Prototype.
* ext-dce.cc: New file/pass.
* opts.cc (default_options_table): Handle ext-dce at -O2 or higher.
* passes.def: Add ext-dce before combine.
* tree-pass.h (make_pass_ext_dce): Prototype.
gcc/testsuite
* gcc.target/aarch64/sve/pred_clobber_1.c: Update expected output.
* gcc.target/aarch64/sve/pred_clobber_2.c: Likewise.
* gcc.target/aarch64/sve/pred_clobber_3.c: Likewise.
* gcc.target/aarch64/tbz_2.c: Likewise.
* gcc.target/riscv/core_bench_list.c: New test.
* gcc.target/riscv/core_init_matrix.c: New test.
* gcc.target/riscv/core_list_init.c: New test.
* gcc.target/riscv/matrix_add_const.c: New test.
* gcc.target/riscv/mem-extend.c: New test.
* gcc.target/riscv/pr111384.c: New test.
David Malcolm [Mon, 8 Jul 2024 22:55:28 +0000 (18:55 -0400)]
c-format.cc: add ctors to format_check_results and format_check_context
This is a minor cleanup I spotted whilst working on another patch.
No functional change intended.
gcc/c-family/ChangeLog:
* c-format.cc (format_check_results::format_check_results): New
ctor.
(struct format_check_context): Add ctor; add "m_" prefix to all
fields.
(check_format_info): Use above ctors.
(check_format_arg): Update for "m_" prefix to
format_check_context.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
i386: Promote {QI,HI}mode x86_mov<mode>cc_0_m1_neg to SImode
Promote HImode x86_mov<mode>cc_0_m1_neg insn to SImode to avoid
redundant prefixes. Also promote QImode insn when TARGET_PROMOTE_QImode
is set. This is similar to promotable_binary_operator splitter, where we
promote the result to SImode.
Also correct insn condition for splitters to SImode of NEG and NOT
instructions. The sizes of QImode and SImode instructions are always
the same, so there is no need for optimize_insn_for_size bypass.
gcc/ChangeLog:
* config/i386/i386.md (x86_mov<mode>cc_0_m1_neg splitter to SImode):
New splitter.
(NEG and NOT splitter to SImode): Remove optimize_insn_for_size_p
predicate from insn condition.
Jonathan Wakely [Sun, 7 Jul 2024 11:22:42 +0000 (12:22 +0100)]
libstdc++: Fix _Atomic(T) macro in <stdatomic.h> [PR115807]
The definition of the _Atomic(T) macro needs to refer to ::std::atomic,
not some other std::atomic relative to the current namespace.
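The lookup problem can be shown with two illustrative macros. DEMO_ATOMIC_UNQUALIFIED and DEMO_ATOMIC_QUALIFIED are hypothetical names for this sketch, not the library's actual macro definitions, and the demo namespace is contrived to trigger the issue.

```cpp
#include <atomic>
#include <type_traits>

#define DEMO_ATOMIC_UNQUALIFIED(T) std::atomic<T>
#define DEMO_ATOMIC_QUALIFIED(T) ::std::atomic<T>

namespace demo {
// A nested namespace that happens to be called std, with its own
// unrelated atomic template.
namespace std {
template <typename T> struct atomic {};
}

// Unqualified: name lookup finds demo::std first.
using unqualified_t = DEMO_ATOMIC_UNQUALIFIED(int);
// Qualified: always the standard library's std::atomic.
using qualified_t = DEMO_ATOMIC_QUALIFIED(int);
}  // namespace demo

static_assert(::std::is_same<demo::qualified_t, ::std::atomic<int>>::value,
              "qualified form names the real std::atomic");
static_assert(!::std::is_same<demo::unqualified_t, ::std::atomic<int>>::value,
              "unqualified form picked up demo::std::atomic");
```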
libstdc++-v3/ChangeLog:
PR libstdc++/115807
* include/c_compatibility/stdatomic.h (_Atomic): Ensure it
refers to std::atomic in the global namespace.
* testsuite/29_atomics/headers/stdatomic.h/115807.cc: New test.