Patrick Palka [Fri, 2 Sep 2022 19:16:37 +0000 (15:16 -0400)]
libstdc++: Consistently use ::type when deriving from __and/or/not_
Now that these internal type traits are (again) class templates, it's
better to derive from the trait's ::type instead of from the trait
itself, for sake of a shallower inheritance chain.
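A minimal sketch of the difference, using std::conjunction as a stand-in for the internal __and_. The names small_int_check, is_small_int_deep and is_small_int_shallow are hypothetical and do not appear in libstdc++:

```cpp
#include <type_traits>

// Stand-in for __and_<...>; the internal libstdc++ traits are not used here.
template<typename T>
using small_int_check
  = std::conjunction<std::is_integral<T>,
                     std::bool_constant<(sizeof(T) <= 2)>>;

// Derives from the trait itself: integral_constant sits at least two
// levels up the inheritance chain.
template<typename T>
struct is_small_int_deep : small_int_check<T> { };

// Derives from the trait's ::type: integral_constant is the direct base.
template<typename T>
struct is_small_int_shallow : small_int_check<T>::type { };
```

Both forms expose the same ::value; only the depth of the base-class chain differs.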
libstdc++-v3/ChangeLog:
* include/std/tuple (tuple::_UseOtherCtor): Use ::type when
deriving from __and_, __or_ or __not_.
* include/std/type_traits (negation): Likewise.
(is_unsigned): Likewise.
(__is_implicitly_default_constructible): Likewise.
(is_trivially_destructible): Likewise.
(__is_nt_invocable_impl): Likewise.
This defines the is_xxx_constructible_v and is_xxx_assignable_v variable
templates by using the built-ins directly. The actual logic for each one
is the same as the corresponding class template, but this way using the
variable template doesn't need to instantiate the class template.
This means that the variable templates won't use the static assertions
checking for complete types, cv void or unbounded arrays, but that's OK
because the built-ins check those anyway. We could probably remove the
static assertions from the class templates, and maybe from all type
traits that use a built-in.
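As a rough sketch of the approach, with hypothetical my_ names rather than the real libstdc++ definitions. __is_constructible and __is_assignable are compiler built-ins (available in GCC and Clang), so this is not portable standard C++:

```cpp
// Each variable template evaluates the built-in directly, so no class
// template such as is_constructible<T, Args...> has to be instantiated.
template<typename T, typename... Args>
inline constexpr bool my_is_constructible_v = __is_constructible(T, Args...);

template<typename T>
inline constexpr bool my_is_default_constructible_v = __is_constructible(T);

template<typename T, typename U>
inline constexpr bool my_is_assignable_v = __is_assignable(T, U);
```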
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_constructible_v)
(is_default_constructible_v, is_copy_constructible_v)
(is_move_constructible_v): Define using __is_constructible.
(is_assignable_v, is_copy_assignable_v, is_move_assignable_v):
Define using __is_assignable.
(is_trivially_constructible_v)
(is_trivially_default_constructible_v)
(is_trivially_copy_constructible_v)
(is_trivially_move_constructible_v): Define using
__is_trivially_constructible.
(is_trivially_assignable_v, is_trivially_copy_assignable_v)
(is_trivially_move_assignable_v): Define using
__is_trivially_assignable.
(is_nothrow_constructible_v)
(is_nothrow_default_constructible_v)
(is_nothrow_copy_constructible_v)
(is_nothrow_move_constructible_v): Define using
__is_nothrow_constructible.
(is_nothrow_assignable_v, is_nothrow_copy_assignable_v)
(is_nothrow_move_assignable_v): Define using
__is_nothrow_assignable.
Patrick Palka [Fri, 2 Sep 2022 15:19:51 +0000 (11:19 -0400)]
libstdc++: Fix laziness of __and/or/not_
r13-2230-g390f94eee1ae69 redefined the internal logical operator traits
__and_, __or_ and __not_ as alias templates that directly resolve to
true_type or false_type. But it turns out using an alias template here
causes the traits to be less lazy than before because we now compute the
logical result immediately upon _specialization_ of the trait, and not
later upon _completion_ of the specialization.
So for example, in
using type = __and_<A, __not_<B>>;
we now compute the conjunction and thus instantiate A even though we're
in a context that doesn't require completion of the __and_. What's
worse is that we also compute the inner negation and thus instantiate B
(for the same reason), independent of the __and_ and the value of A!
Thus the traits are now less lazy and composable than before.
Fortunately, the fix is cheap and straightforward: redefine these traits
as class templates instead of as alias templates so that computation of
the logical result is triggered by completion, not by specialization.
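The effect can be reproduced outside the library with a simplified two-argument conjunction. This is only a sketch: eager_and, lazy_and and explodes are hypothetical names, and the real __and_ is variadic and implemented differently.

```cpp
#include <type_traits>

// Hypothetical trait whose instantiation is a hard error for non-class T.
template<typename T>
struct explodes
{
  using boom = typename T::does_not_exist;
  static constexpr bool value = true;
};

// Alias-template form: A::value and B::value are evaluated as soon as
// eager_and<A, B> is named, which forces A and B to be instantiated.
template<typename A, typename B>
using eager_and = std::bool_constant<(A::value && B::value)>;

// Class-template form: nothing is evaluated until lazy_and<A, B> is
// completed, and then B is only instantiated if A::value is true.
template<typename A, typename B>
struct lazy_and : std::conditional_t<A::value, B, A> { };

// Naming the specialization does not instantiate explodes<int>.
using named_only = lazy_and<explodes<int>, std::true_type>;

// Completing it short-circuits past explodes<int> when A is false.
static_assert(!lazy_and<std::false_type, explodes<int>>::value, "");

// By contrast, this would be a hard error if uncommented:
// using broken = eager_and<std::false_type, explodes<int>>;
```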
libstdc++-v3/ChangeLog:
* include/std/type_traits (__or_, __and_, __not_): Redefine as a
class template instead of as an alias template.
* testsuite/20_util/logical_traits/requirements/short_circuit.cc:
Add more tests for conjunction and disjunction. Add corresponding
tests for __and_ and __or_.
vect_optimize_slp_pass always treats the starting layout as valid,
to avoid having to "optimise" when every possible choice is invalid.
But it gives the starting layout a high cost if it seems like the
target might reject it, in the hope that this will encourage other
(valid) layouts.
The testcase for PR106787 showed that this was flawed, since it was
triggering even in cases where the number of input lanes is different
from the number of output lanes. Picking such a high cost could also
make costs for loop-invariant nodes overwhelm the costs for inner-loop
nodes.
This patch makes the costing less aggressive by (a) restricting
it to N-to-N permutations and (b) assigning the maximum cost of
a permute.
gcc/
* tree-vect-slp.cc (vect_optimize_slp_pass::internal_node_cost):
Reduce the fallback cost to 1. Only use it if the number of
input lanes is equal to the number of output lanes.
gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-20.c: New test.
vect: Ensure SLP nodes don't end up in multiple BB partitions [PR106787]
In the PR we have two REDUC_PLUS SLP instances that share a common
load of stride 4. Each instance also has a unique contiguous load.
Initially all three loads are out of order, so have a nontrivial
load permutation. The layout pass puts them in order instead.
For the two contiguous loads it is possible to do this by adjusting the
SLP_LOAD_PERMUTATION to be { 0, 1, 2, 3 }. But a SLP_LOAD_PERMUTATION
of { 0, 4, 8, 12 } is rejected as unsupported, so the pass creates a
separate VEC_PERM_EXPR instead.
Later the 4-stride load's initial SLP_LOAD_PERMUTATION is rejected too,
so that the load gets replaced by an external node built from scalars.
We then have an external node feeding a VEC_PERM_EXPR.
VEC_PERM_EXPRs created in this way do not have any associated
SLP_TREE_SCALAR_STMTS. This means that they do not affect the
decision about which nodes should be in which subgraph for costing
purposes. If the VEC_PERM_EXPR is fed by a vect_external_def,
then the VEC_PERM_EXPR's input doesn't affect that decision either.
The net effect is that a shared VEC_PERM_EXPR fed by an external def
can appear in more than one subgraph. This triggered an ICE in
vect_schedule_node, which (rightly) expects to be called no more
than once for the same internal def.
There seemed to be many possible fixes, including:
(1) Replace unsupported loads with external defs *before* doing
the layout optimisation. This would avoid the need for the
VEC_PERM_EXPR altogether.
(2) If the target doesn't support a load in its original layout,
stop the layout optimisation from checking whether the target
supports loads in any new candidate layout. In other words,
treat all layouts as if they were supported whenever the
original layout is not in fact supported.
I'd rather not do this. In principle, the layout optimisation
could convert an unsupported layout to a supported one.
Selectively ignoring target support would work against that.
We could try to look specifically for loads that will need
to be decomposed, but that just seems like admitting that
things are happening in the wrong order.
(3) Add SLP_TREE_SCALAR_STMTS to VEC_PERM_EXPRs.
That would be OK for this case, but wouldn't be possible
for external defs that represent existing vectors.
(4) Make vect_schedule_slp share SCC info between subgraphs.
It feels like that's working around the partitioning problem
rather than a real fix though.
(5) Directly ensure that internal def nodes belong to a single
subgraph.
(1) is probably the best long-term fix, but (5) is much simpler.
The subgraph partitioning code already has a hash set to record
which nodes have been visited; we just need to convert that to a
map from nodes to instances instead.
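The idea of replacing a visited set with a node-to-instance map can be sketched on a toy graph. Everything below is hypothetical (Partitioning, visit, the fixed-size arrays) and is not the code in tree-vect-slp.cc; it assumes that instances reaching the same node are merged into one partition:

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kNodes = 4;      // hypothetical tiny graph
constexpr std::size_t kInstances = 2;
constexpr int kNone = -1;

struct Partitioning
{
  std::array<int, kNodes> owner{};       // node -> first instance to reach it
  std::array<int, kInstances> leader{};  // instance -> representative instance
};

constexpr int find_leader(const Partitioning& p, int inst)
{
  while (p.leader[inst] != inst)
    inst = p.leader[inst];
  return inst;
}

// children[n] holds up to two child nodes of n; kNone means "no child".
using Children = std::array<std::array<int, 2>, kNodes>;

constexpr void visit(const Children& children, int node, int inst,
                     Partitioning& p)
{
  if (node == kNone)
    return;
  if (p.owner[node] != kNone)
    {
      // A set could only say "seen"; the map says by whom, so the two
      // instances can be merged and the node stays in one partition.
      p.leader[find_leader(p, inst)] = find_leader(p, p.owner[node]);
      return;
    }
  p.owner[node] = inst;
  for (int child : children[node])
    visit(children, child, inst, p);
}

constexpr Partitioning partition(const Children& children,
                                 const std::array<int, kInstances>& roots)
{
  Partitioning p{};
  for (auto& o : p.owner)
    o = kNone;
  for (std::size_t i = 0; i < kInstances; ++i)
    p.leader[i] = static_cast<int>(i);
  for (std::size_t i = 0; i < kInstances; ++i)
    visit(children, roots[i], static_cast<int>(i), p);
  return p;
}

// Nodes 0 and 1 are instance roots; node 2 is shared by both.
constexpr Children kExample{{ {{2, kNone}}, {{2, 3}},
                              {{kNone, kNone}}, {{kNone, kNone}} }};
constexpr std::array<int, kInstances> kRoots{{0, 1}};
constexpr Partitioning kResult = partition(kExample, kRoots);
```

In this toy run the shared node 2 keeps a single owner, and both instances end up with the same leader.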
gcc/
PR tree-optimization/106787
* tree-vect-slp.cc (vect_map_to_instance): New function, split out
from...
(vect_bb_partition_graph_r): ...here. Replace the visited set
with a map from nodes to instances. Ensure that a node only
appears in one partition.
(vect_bb_partition_graph): Update accordingly.
gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-19.c: New test.
Richard Biener [Fri, 2 Sep 2022 11:36:13 +0000 (13:36 +0200)]
tree-optimization/106809 - compile time hog in VN
The dominated_by_p_w_unex function is prone to high compile time.
With GCC 12 we introduced a VN run for uninit diagnostics which now
runs into a degenerate case with bison generated code. Fortunately
this case is easy to fix with a simple extra check - a more
general fix needs more work.
PR tree-optimization/106809
* tree-ssa-sccvn.cc (dominated_by_p_w_unex): Check we have
more than one successor before doing extra work.
Kito Cheng [Fri, 20 Nov 2020 07:55:58 +0000 (15:55 +0800)]
RISC-V: Implement TARGET_COMPUTE_MULTILIB
Use TARGET_COMPUTE_MULTILIB to search the multi-lib reuse for riscv*-*-elf*,
according to the following rules:
1. Check the ABI is the same.
2. Check both have the atomic extension or both don't have it.
- Because mixing soft and hard atomic operations doesn't make sense and
won't work as expected.
3. Check the current arch is a superset of the target multi-lib arch.
- It might result in slower performance or larger code size, but it
is safe to run.
4. Pick the best-matching multi-lib set if more than one multi-lib passes
the above checks.
Example for how to select multi-lib:
We build code with -march=rv32imaf and -mabi=ilp32, and we have
following 5 multi-lib set:
The first and second multi-libs are safe to link; the 3rd multi-lib can't
be re-used because it doesn't have the atomic extension, which is a mismatch
according to rule 2; the 4th multi-lib can't be re-used either due to the ABI
mismatch; and the last multi-lib can't be used since the current arch is not
a superset of the arch of the multi-lib.
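The four rules can be sketched as follows. This is not the hook's implementation: the bitmask encoding, the Multilib struct, the reading of rule 4 as "most extensions in common", and the candidate list kLibs are all hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical encoding: one bit per single-letter extension.
constexpr std::uint32_t kI = 1, kM = 2, kA = 4, kF = 8, kD = 16;

struct Multilib
{
  std::uint32_t arch;
  std::string_view abi;
};

constexpr int count_bits(std::uint32_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

constexpr bool can_reuse(const Multilib& lib, std::uint32_t arch,
                         std::string_view abi)
{
  const bool same_abi = lib.abi == abi;                    // rule 1
  const bool same_atomic = ((lib.arch ^ arch) & kA) == 0;  // rule 2
  const bool arch_is_superset = (lib.arch & ~arch) == 0;   // rule 3
  return same_abi && same_atomic && arch_is_superset;
}

// Returns the index of the chosen multi-lib, or -1 if none is suitable.
template<std::size_t N>
constexpr int pick_multilib(const std::array<Multilib, N>& libs,
                            std::uint32_t arch, std::string_view abi)
{
  int best = -1;
  for (std::size_t i = 0; i < N; ++i)
    {
      if (!can_reuse(libs[i], arch, abi))
        continue;
      // Rule 4, interpreted here as "most extensions in common".
      if (best < 0 || count_bits(libs[i].arch) > count_bits(libs[best].arch))
        best = static_cast<int>(i);
    }
  return best;
}

// Hypothetical candidate list, not the one from the commit message.
constexpr std::array<Multilib, 5> kLibs{{
  {kI | kA, "ilp32"},
  {kI | kM | kA, "ilp32"},
  {kI | kM, "ilp32"},
  {kI | kM | kA | kF, "ilp32f"},
  {kI | kM | kA | kF | kD, "ilp32"},
}};
```

With this list, pick_multilib(kLibs, kI | kM | kA | kF, "ilp32") selects index 1, and a result of -1 corresponds to the case where no suitable multi-lib set is found.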
An error is emitted if no suitable multi-lib set is found; the error message
is only emitted when linking with the standard libraries.
// No actual linking, so no error emitted.
$ riscv64-unknown-elf-gcc -print-multi-directory -march=rv32ia -mabi=ilp32
.
// Link to default libc and libgcc, so check the multi-lib, and emit
// error because not found suitable multilib.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c
riscv64-unknown-elf-gcc: fatal error: can't found suitable multilib set for '-march=rv32ia'/'-mabi=ilp32'
compilation terminated.
// No error emitted, because not link to stdlib.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c -nostdlib
// No error emitted, because compile only.
$ riscv64-unknown-elf-gcc -march=rv32ia -mabi=ilp32 ~/hello.c -c
Kito Cheng [Fri, 20 Nov 2020 07:52:53 +0000 (15:52 +0800)]
Add TARGET_COMPUTE_MULTILIB hook to override multi-lib result.
Create a new hook to let a target override the multi-lib result.
The motivation is that RISC-V might have very complicated multi-lib re-use
rules*, which are hard to maintain and use with the current multi-lib scripts;
we even hit the "argument list too long" error when we tried to add more
multi-lib reuse rules.
So I think it would be great to have a target-specific way to determine
the multi-lib re-use rules; then we could write those rules in C, instead
of expanding every possible case in MULTILIB_REUSE.
Gary Dismukes [Wed, 13 Jul 2022 22:06:47 +0000 (18:06 -0400)]
[Ada] Error on return of object whose full view has undefaulted discriminants
The compiler wrongly reports an error about the expected type not
matching the same-named found type in a return statement for a function
whose result type has unknown discriminants when the full type is tagged
and has an undefaulted discriminant, and the return expression is an object
initialized by a function call. The processing for return statements that
creates an actual subtype based on the return expression type's underlying
type when that type has discriminants, and converts the expression to
the actual subtype, should only be done when the underlying discriminated
type is mutable (i.e., has defaulted discriminants). Otherwise the
unchecked conversion to the actual subtype (of the underlying full type)
can lead to a resolution problem later within Expand_Simple_Function_Return
in the expansion of tag assignments (because the target type of the
conversion is a full view and does not match the partial view of
the function's result type).
gcc/ada/
* exp_ch6.adb (Expand_Simple_Function_Return): Bypass creation of an actual
subtype and unchecked conversion to that subtype when the underlying type
of the expression has discriminants without defaults.
Eric Botcazou [Thu, 7 Jul 2022 22:01:15 +0000 (00:01 +0200)]
[Ada] Fix crash on declaration of overaligned array with constraints
The semantic analyzer was setting the Is_Constr_Subt_For_UN_Aliased flag on
the actual subtype of the object, which is incorrect because the nominal
subtype is constrained. This also adjusts a recent related change.
gcc/ada/
* exp_util.adb (Expand_Subtype_From_Expr): Check for the presence
of the Is_Constr_Subt_For_U_Nominal flag instead of the absence
of the Is_Constr_Subt_For_UN_Aliased flag on the subtype of the
expression of an object declaration before reusing this subtype.
* sem_ch3.adb (Analyze_Object_Declaration): Do not incorrectly
set the Is_Constr_Subt_For_UN_Aliased flag on the actual subtype
of an array with definite nominal subtype. Remove useless test.
[Ada] Recover proof of Scaled_Divide in System.Arith_64
Proof of Scaled_Divide was impacted by changes in provers and Why3.
Recover it partially, leaving some unproved basic inferences to be
further investigated.
[Ada] Fix proof of runtime unit System.Value* and System.Image*
Refactor specification of the Value* and Image* units and fix proofs.
gcc/ada/
* libgnat/a-nbnbig.ads: Add Always_Return annotation.
* libgnat/s-vaispe.ads: New ghost unit for the specification of
System.Value_I. Restore proofs.
* libgnat/s-vauspe.ads: New ghost unit for the specification of
System.Value_U. Restore proofs.
* libgnat/s-valuei.adb: The specification only subprograms are
moved to System.Value_I_Spec. Restore proofs.
* libgnat/s-valueu.adb: The specification only subprograms are
moved to System.Value_U_Spec. Restore proofs.
* libgnat/s-valuti.ads
(Uns_Params): Generic unit used to bundle together the
specification functions of System.Value_U_Spec.
(Int_Params): Generic unit used to bundle together the
specification functions of System.Value_I_Spec.
* libgnat/s-imagef.adb: It is now possible to instantiate the
appropriate specification units instead of creating imported ghost
subprograms.
* libgnat/s-imagei.adb: Update to refactoring of specifications
and fix proofs.
* libgnat/s-imageu.adb: Likewise.
* libgnat/s-imgint.ads: Ghost parameters are grouped together in a
package now.
* libgnat/s-imglli.ads: Likewise.
* libgnat/s-imgllu.ads: Likewise.
* libgnat/s-imgllli.ads: Likewise.
* libgnat/s-imglllu.ads: Likewise.
* libgnat/s-imguns.ads: Likewise.
* libgnat/s-vallli.ads: Likewise.
* libgnat/s-valllli.ads: Likewise.
* libgnat/s-imagei.ads: Likewise.
* libgnat/s-imageu.ads: Likewise.
* libgnat/s-vaispe.adb: Likewise.
* libgnat/s-valint.ads: Likewise.
* libgnat/s-valuei.ads: Likewise.
* libgnat/s-valueu.ads: Likewise.
* libgnat/s-vauspe.adb: Likewise.
Simon Rainer [Wed, 31 Aug 2022 21:00:08 +0000 (23:00 +0200)]
ipa: Fix throw in multi-versioned functions [PR106627]
Any multi-versioned function was implicitly declared as noexcept, which
leads to an abort if an exception is thrown inside the function.
The reason for this is that the function declaration is replaced by a
newly created dispatcher declaration, which has TREE_NOTHROW always set
to 1. Instead we need to set TREE_NOTHROW to the value of the original
declaration.
PR ipa/106627
gcc/ChangeLog:
* config/i386/i386-features.cc (ix86_get_function_versions_dispatcher):
Set TREE_NOTHROW correctly for dispatcher declaration.
* config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher):
Likewise.
Jonathan Wakely [Thu, 1 Sep 2022 14:58:34 +0000 (15:58 +0100)]
libstdc++: Remove __is_referenceable helper
We only use the __is_referenceable helper in three places now:
add_pointer, add_lvalue_reference, and add_rvalue_reference. But lots of
other traits depend on add_[lr]value_reference, and decay depends on
add_pointer, so removing the instantiation of __is_referenceable helps
compile all those other traits slightly faster.
We can just use void_t<T&> to check for a referenceable type in the
add_[lr]value_reference traits.
Then we can specialize add_pointer for reference types, so that we don't
need to use remove_reference, and then use void_t<T*> for all
non-reference types to detect when we can form a pointer to the type.
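A sketch of the detection technique using the public std::void_t. The my_ names are hypothetical, and the reference specializations are folded into a single template here for brevity, which may differ from how the library arranges them:

```cpp
#include <type_traits>

// Primary template: T& cannot be formed, so leave T unchanged.
template<typename T, typename = void>
struct my_add_lvalue_reference { using type = T; };

// Chosen only when T& is a valid type, i.e. T is referenceable.
template<typename T>
struct my_add_lvalue_reference<T, std::void_t<T&>> { using type = T&; };

// Primary template: T* cannot be formed, so leave T unchanged.
template<typename T, typename = void>
struct my_add_pointer { using type = T; };

// Non-reference types for which a pointer can be formed.
template<typename T>
struct my_add_pointer<T, std::void_t<T*>> { using type = T*; };

// Reference types are handled directly, without remove_reference.
template<typename T>
struct my_add_pointer<T&, void> { using type = T*; };

template<typename T>
struct my_add_pointer<T&&, void> { using type = T*; };
```

For a reference type the void_t<T*> specialization drops out, because a pointer to a reference cannot be formed, so the two reference specializations are never ambiguous with it.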
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_referenceable): Remove.
(__add_lvalue_reference_helper, __add_rvalue_reference_helper):
Use __void_t instead of __is_referenceable.
(__add_pointer_helper): Likewise.
(add_pointer): Add partial specializations for reference types.
Jonathan Wakely [Thu, 1 Sep 2022 14:19:28 +0000 (15:19 +0100)]
libstdc++: Optimize is_constructible traits
We can replace some class template helpers with alias templates, which
are cheaper to instantiate.
For example, replace the __is_copy_constructible_impl class template
with an alias template that just evaluates the __is_constructible
built-in, using add_lvalue_reference<const T> to get the argument type
in a way that works for non-referenceable types. For a given
specialization of is_copy_constructible this results in the same number
of class templates being instantiated (for the common case of non-void,
non-function types), but the add_lvalue_reference instantiations are not
specific to the is_copy_constructible specialization and so can be
reused by other traits. Previously __is_copy_constructible_impl was a
distinct class template and its specializations were never used for
anything except is_copy_constructible.
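Roughly, the shape of the change looks like the sketch below. The my_ names are hypothetical, std::add_lvalue_reference_t stands in for the internal helper, and __is_constructible is a compiler built-in (GCC and Clang):

```cpp
#include <type_traits>

// Alias template: naming it yields a bool_constant specialization
// directly, with no dedicated helper class template in between.
template<typename T, typename... Args>
using my_is_constructible_impl
  = std::bool_constant<__is_constructible(T, Args...)>;

// add_lvalue_reference leaves non-referenceable types such as void
// unchanged, so the argument type is always well-formed.
template<typename T>
struct my_is_copy_constructible
  : my_is_constructible_impl<T, std::add_lvalue_reference_t<const T>>
{ };
```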
With the new definitions of these traits that don't depend on helper
classes, it becomes more practical to optimize the
is_xxx_constructible_v variable templates to avoid instantiations.
Previously doing so would have meant two entirely separate
implementation strategies for these traits.
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_constructible_impl): Replace
class template with alias template.
(is_default_constructible, is_nothrow_constructible)
(is_nothrow_constructible): Simplify base-specifier.
(__is_copy_constructible_impl, __is_move_constructible_impl)
(__is_nothrow_copy_constructible_impl)
(__is_nothrow_move_constructible_impl): Remove class templates.
(is_copy_constructible, is_move_constructible)
(is_nothrow_constructible, is_nothrow_default_constructible)
(is_nothrow_copy_constructible, is_nothrow_move_constructible):
Adjust base-specifiers to use __is_constructible_impl.
(__is_copy_assignable_impl, __is_move_assignable_impl)
(__is_nt_copy_assignable_impl, __is_nt_move_assignable_impl):
Remove class templates.
(__is_assignable_impl): New alias template.
(is_assignable, is_copy_assignable, is_move_assignable):
Adjust base-specifiers to use new alias template.
(is_nothrow_copy_assignable, is_nothrow_move_assignable):
Adjust base-specifiers to use existing alias template.
(__is_trivially_constructible_impl): New alias template.
(is_trivially_constructible, is_trivially_default_constructible)
(is_trivially_copy_constructible)
(is_trivially_move_constructible): Adjust base-specifiers to use
new alias template.
(__is_trivially_assignable_impl): New alias template.
(is_trivially_assignable, is_trivially_copy_assignable)
(is_trivially_move_assignable): Adjust base-specifier to use
new alias template.
(__add_lval_ref_t, __add_rval_ref_t): New alias templates.
(add_lvalue_reference, add_rvalue_reference): Use new alias
templates.
Jonathan Wakely [Thu, 1 Sep 2022 12:06:13 +0000 (13:06 +0100)]
libstdc++: Optimize std::decay
Define partial specializations of std::decay and its __decay_selector
helper so that remove_reference, is_array and is_function are not
instantiated for every type, and remove_extent is not instantiated for
arrays.
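A sketch of how such partial specializations can be arranged; decay_selector and my_decay are hypothetical names and the details are assumptions rather than the library's exact code:

```cpp
#include <cstddef>
#include <type_traits>

// Non-array case. is_const<const U> is false for function types, so a
// single test separates "strip cv-qualifiers" from "decay to pointer".
template<typename U>
struct decay_selector
  : std::conditional_t<std::is_const_v<const U>,
                       std::remove_cv<U>,
                       std::add_pointer<U>>
{ };

// Arrays are matched directly, so remove_extent is never needed.
template<typename U, std::size_t N>
struct decay_selector<U[N]> { using type = U*; };

template<typename U>
struct decay_selector<U[]> { using type = U*; };

template<typename T>
struct my_decay { using type = typename decay_selector<T>::type; };

// References are stripped by pattern matching, not by remove_reference.
template<typename T>
struct my_decay<T&> { using type = typename decay_selector<T>::type; };

template<typename T>
struct my_decay<T&&> { using type = typename decay_selector<T>::type; };
```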
libstdc++-v3/ChangeLog:
* include/std/type_traits (__decay_selector): Add partial
specializations for array types. Only check for function types
when not dealing with an array.
(decay): Add partial specializations for reference types.
Jonathan Wakely [Wed, 31 Aug 2022 14:00:24 +0000 (15:00 +0100)]
libstdc++: Use built-ins for some variable templates
This avoids having to instantiate a class template that just uses the
same built-in anyway.
None of the corresponding class templates have any type-completeness
static assertions, so we're not losing any diagnostics by using the
built-ins directly.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_enum_v, is_class_v, is_union_v)
(is_empty_v, is_polymorphic_v, is_abstract_v, is_final_v)
(is_base_of_v, is_aggregate_v): Use built-in directly instead of
instantiating class template.
Joseph Myers [Thu, 1 Sep 2022 19:10:59 +0000 (19:10 +0000)]
c: C2x removal of unprototyped functions
C2x has completely removed unprototyped functions, so that () now
means the same as (void) in both function declarations and
definitions, where previously that change had been made for
definitions only. Implement this accordingly.
This is a change where GNU/Linux distribution builders might wish to
try builds with a -std=gnu2x default to start early on getting old
code fixed that still has () declarations for functions taking
arguments, in advance of GCC moving to -std=gnu2x as default maybe in
GCC 14 or 15; I don't know how much such code is likely to be in
current use.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c/
* c-decl.cc (grokparms): Handle () in a function declaration the
same as (void) for C2X.
gcc/testsuite/
* gcc.dg/c11-unproto-3.c, gcc.dg/c2x-unproto-3.c,
gcc.dg/c2x-unproto-4.c: New tests.
* gcc.dg/c2x-old-style-definition-6.c, gcc.dg/c2x-unproto-1.c,
gcc.dg/c2x-unproto-2.c: Update for removal of unprototyped
functions.
vect: Try to remove single-vector permutes from SLP graph
This patch extends the SLP layout optimisation pass so that it
tries to remove layout changes that are brought about by permutes
of existing vectors. This fixes the bb-slp-pr54400.c regression on
x86_64 and also means that we can remove the permutes in cases like:
The new test is a simple adaption of bb-slp-pr54400.c, with the
same style of markup.
gcc/
* tree-vect-slp.cc (vect_build_slp_tree_2): When building a
VEC_PERM_EXPR of an existing vector, set the SLP_TREE_LANES
to the number of vector elements, if that's a known constant.
(vect_optimize_slp_pass::is_compatible_layout): Remove associated
comment about zero SLP_TREE_LANES.
(vect_optimize_slp_pass::start_choosing_layouts): Iterate over
all partition members when looking for potential layouts.
Handle existing permutes of fixed-length vectors.
gcc/testsuite/
* gcc.dg/vect/bb-slp-pr54400.c: Extend to aarch64.
* gcc.dg/vect/bb-slp-layout-18.c: New test.
We're starting to abuse the infinity endpoints in the frange code and
the associated range operators. Building infinities is rather cheap,
and we could even inline them, but I think it's best to just not
recalculate them all the time.
I see about 20 uses of real_inf in the source code, not including the
backends. And I'm about to add more :).
gcc/ChangeLog:
* emit-rtl.cc (init_emit_once): Initialize dconstinf and
dconstninf.
* real.h: Add dconstinf and dconstninf.
Jason Merrill [Wed, 24 Aug 2022 20:31:11 +0000 (16:31 -0400)]
c++: set TYPE_STRING_FLAG for char8_t
While looking at the DWARF handling of char8_t I wondered why we weren't
setting TYPE_STRING_FLAG on it. I hoped that setting that flag would be an
easy fix for PR102958, but it doesn't seem to be sufficient. But it still
seems correct.
I also tried setting the flag on char16_t and char32_t, but that broke
because braced_list_to_string assumes char-sized elements. Since we don't
set the flag on wchar_t, I abandoned that idea.
gcc/c-family/ChangeLog:
* c-common.cc (c_common_nodes_and_builtins): Set TYPE_STRING_FLAG on
char8_t.
(braced_list_to_string): Check for char-sized elements.
Aldy Hernandez [Wed, 31 Aug 2022 12:41:29 +0000 (14:41 +0200)]
Implement ranger folder for __builtin_signbit.
Now that we keep track of the signbit, we can use it to fold __builtin_signbit.
I am assuming I don't have to try too hard to get the actual signbit
number and 1 will do. Especially, since we're inconsistent in trunk whether
we fold the builtin or whether we calculate it at runtime.
This adds an frange property to keep track of the sign bit. We keep
it updated at all times, but we don't use it to make any decisions when
!HONOR_SIGNED_ZEROS.
With this property we can now query the range for the appropriate sign
with frange::get_signbit (). Possible values are yes, no, and unknown.
Jonathan Wakely [Wed, 31 Aug 2022 14:00:07 +0000 (15:00 +0100)]
libstdc++: Optimize array traits
Improve compile times by avoiding unnecessary class template
instantiations.
__is_array_known_bounds and __is_array_unknown_bounds can be defined
without instantiating extent, by providing partial specializations for
the true cases.
std::extent can avoid recursing down through a multidimensional array,
so it stops after providing the result. Previously extent<T[n][m], 0>
would instantiate extent<T[n], -1u> and extent<T, -2u> as well.
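A sketch of an extent-like trait that stops once the requested dimension is reached (my_extent is a hypothetical name; the specializations are an assumption about the general shape, not a copy of the library code):

```cpp
#include <cstddef>
#include <type_traits>

// Not an array, or the requested dimension does not exist.
template<typename T, unsigned I = 0>
struct my_extent : std::integral_constant<std::size_t, 0> { };

// Requested dimension reached: report N and do not look inside T.
template<typename T, std::size_t N>
struct my_extent<T[N], 0> : std::integral_constant<std::size_t, N> { };

// Not there yet: peel one dimension and count down.
template<typename T, unsigned I, std::size_t N>
struct my_extent<T[N], I> : my_extent<T, I - 1> { };

// Unknown bound at the requested dimension: the extent is 0.
template<typename T>
struct my_extent<T[], 0> : std::integral_constant<std::size_t, 0> { };

template<typename T, unsigned I>
struct my_extent<T[], I> : my_extent<T, I - 1> { };
```

With these specializations, my_extent<int[2][3], 0> is answered by the first partial specialization alone and never instantiates my_extent for int[3] or int.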
std::is_array_v can use partial specializations to avoid instantiating
std::is_array, and similarly for std::rank_v and std::extent_v.
std::is_bounded_array_v and std::is_unbounded_array_v can also use
partial specializations, and then the class templates can be defined in
terms of the variable templates. This makes sense for these traits,
because they are new in C++20 and so the variable templates are always
available, which isn't true in general for C++11 and C++14 traits.
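The arrangement described above, with the variable template as the primary definition and the class template built on top of it, might look like this sketch (the my_ names are hypothetical):

```cpp
#include <cstddef>
#include <type_traits>

// The variable templates carry the logic via partial specialization.
template<typename T>
inline constexpr bool my_is_bounded_array_v = false;

template<typename T, std::size_t N>
inline constexpr bool my_is_bounded_array_v<T[N]> = true;

template<typename T>
inline constexpr bool my_is_unbounded_array_v = false;

template<typename T>
inline constexpr bool my_is_unbounded_array_v<T[]> = true;

// The class templates are thin wrappers over the variable templates.
template<typename T>
struct my_is_bounded_array
  : std::bool_constant<my_is_bounded_array_v<T>> { };

template<typename T>
struct my_is_unbounded_array
  : std::bool_constant<my_is_unbounded_array_v<T>> { };
```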
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_array_known_bounds): Add partial
specialization instead of using std::extent.
(__is_array_unknown_bounds): Likewise.
(extent): Add partial specializations to stop recursion after
the result is found.
(is_array_v): Add partial specializations instead of
instantiating the class template.
(rank_v, extent_v): Likewise.
(is_bounded_array_v, is_unbounded_array_v): Likewise.
(is_bounded_array, is_unbounded_array): Define in terms of the
variable templates.
Jakub Jelinek [Thu, 1 Sep 2022 09:07:44 +0000 (11:07 +0200)]
Fix up dump_printf_loc format attribute and adjust uses [PR106782]
As discussed on IRC, the r13-2299-g68c61c2daa1f bug only got missed
because dump_printf_loc had incorrect format attribute and therefore
almost no -Wformat=* checking was performed on it.
The arguments 3, 0 are suitable for a function with
(whatever, whatever, const char *, va_list) arguments, not for
(whatever, whatever, const char *, ...); that one should use 3, 4.
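The same distinction can be shown with the ordinary printf format archetype, which GCC and Clang accept via __attribute__. The functions log_at and vlog_at are hypothetical, and the internal ATTRIBUTE_GCC_DUMP_PRINTF macro is not used in this sketch:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Variadic form: the format string is argument 3 and the arguments to
// check start at argument 4, so -Wformat can compare them.
__attribute__((format(printf, 3, 4)))
int log_at(int level, const char* where, const char* fmt, ...);

// va_list form: there are no arguments to check, hence the 0.
__attribute__((format(printf, 3, 0)))
int vlog_at(int level, const char* where, const char* fmt, va_list ap);

// Prints a prefix, then the formatted message; returns the number of
// characters written for the message part.
int vlog_at(int level, const char* where, const char* fmt, va_list ap)
{
  assert(fmt != nullptr);
  std::printf("[%d] %s: ", level, where);
  return std::vprintf(fmt, ap);
}

int log_at(int level, const char* where, const char* fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  const int written = vlog_at(level, where, fmt, ap);
  va_end(ap);
  return written;
}
```

Declaring log_at with (3, 0) instead would still compile, but a call such as log_at(1, "here", "%d lanes\n", "four") would then go undiagnosed.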
The following patch fixes that and adjusts all spots to fix warnings.
In many cases it is just through an ugly cast (for %G casts to gimple *
from gassign */gphi * and the like and for %p casts to void * from slp_node
etc.).
There are 3 spots where the mismatch was worse though, two using %u or %d
for unsigned HOST_WIDE_INT argument and one %T for enum argument (promoted
to int).
2022-09-01 Jakub Jelinek <jakub@redhat.com>
PR other/106782
* dumpfile.h (dump_printf_loc): Use ATTRIBUTE_GCC_DUMP_PRINTF (3, 4)
instead of ATTRIBUTE_GCC_DUMP_PRINTF (3, 0).
* tree-parloops.cc (parloops_is_slp_reduction): Cast pointers to
derived types of gimple to gimple * to avoid -Wformat warnings.
* tree-vect-loop-manip.cc (vect_set_loop_condition,
vect_update_ivs_after_vectorizer): Likewise.
* tree-vect-stmts.cc (vectorizable_load): Likewise.
* tree-vect-patterns.cc (vect_split_statement,
vect_recog_mulhs_pattern, vect_recog_average_pattern,
vect_determine_precisions_from_range,
vect_determine_precisions_from_users): Likewise.
* gimple-loop-versioning.cc
(loop_versioning::analyze_term_using_scevs): Likewise.
* tree-vect-slp.cc (vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree): Cast slp_tree to void * to avoid
-Wformat warnings.
(optimize_load_redistribution_1, vect_match_slp_patterns,
vect_build_slp_instance, vect_optimize_slp_pass::materialize,
vect_optimize_slp_pass::dump, vect_slp_convert_to_external,
vect_slp_analyze_node_operations, vect_bb_partition_graph): Likewise.
(vect_print_slp_tree): Likewise. Also use
HOST_WIDE_INT_PRINT_UNSIGNED instead of %u.
* tree-vect-loop.cc (vect_determine_vectorization_factor,
vect_analyze_scalar_cycles_1, vect_analyze_loop_operations,
vectorizable_induction, vect_transform_loop): Cast pointers to derived
types of gimple to gimple * to avoid -Wformat warnings.
(vect_analyze_loop_2): Cast slp_tree to void * to avoid
-Wformat warnings.
(vect_estimate_min_profitable_iters): Use HOST_WIDE_INT_PRINT_UNSIGNED
instead of %d.
* tree-vect-slp-patterns.cc (vect_pattern_validate_optab): Use %G
instead of %T and STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node))
instead of SLP_TREE_DEF_TYPE (node).
Jakub Jelinek [Thu, 1 Sep 2022 07:48:01 +0000 (09:48 +0200)]
libcpp: Add -Winvalid-utf8 warning [PR106655]
The following patch introduces a new warning - -Winvalid-utf8 similarly
to what clang now has - to diagnose invalid UTF-8 byte sequences in
comments, but not just in those, but also in string/character literals
and outside of them.
The warning is on by default when explicit -finput-charset=UTF-8 is
used and C++23 compilation is requested; if -{,W}pedantic or
-pedantic-errors is also given, it is actually a pedwarn.
The reason it is on by default only for -finput-charset=UTF-8 is
that the sources often are UTF-8, but sometimes could be some ASCII
compatible single byte encoding where non-ASCII characters only
appear in comments. So having the warning off by default
is IMO desirable. The C++23 pedantic mode for when the source code
is UTF-8 is -std=c++23 -pedantic-errors -finput-charset=UTF-8.
2022-09-01 Jakub Jelinek <jakub@redhat.com>
PR c++/106655
libcpp/
* include/cpplib.h (struct cpp_options): Implement C++23
P2295R6 - Support for UTF-8 as a portable source file encoding.
Add cpp_warn_invalid_utf8 and cpp_input_charset_explicit fields.
(enum cpp_warning_reason): Add CPP_W_INVALID_UTF8 enumerator.
* init.cc (cpp_create_reader): Initialize cpp_warn_invalid_utf8
and cpp_input_charset_explicit.
* charset.cc (_cpp_valid_utf8): Adjust function comment.
* lex.cc (UCS_LIMIT): Define.
(utf8_continuation): New const variable.
(utf8_signifier): Move earlier in the file.
(_cpp_warn_invalid_utf8, _cpp_handle_multibyte_utf8): New functions.
(_cpp_skip_block_comment): Handle -Winvalid-utf8 warning.
(skip_line_comment): Likewise.
(lex_raw_string, lex_string): Likewise.
(_cpp_lex_direct): Likewise.
gcc/
* doc/invoke.texi (-Winvalid-utf8): Document it.
gcc/c-family/
* c.opt (-Winvalid-utf8): New warning.
* c-opts.cc (c_common_handle_option) <case OPT_finput_charset_>:
Set cpp_opts->cpp_input_charset_explicit.
(c_common_post_options): If -finput-charset=UTF-8 is explicit
in C++23, enable -Winvalid-utf8 by default and if -pedantic
or -pedantic-errors, make it a pedwarn.
gcc/testsuite/
* c-c++-common/cpp/Winvalid-utf8-1.c: New test.
* c-c++-common/cpp/Winvalid-utf8-2.c: New test.
* c-c++-common/cpp/Winvalid-utf8-3.c: New test.
* g++.dg/cpp23/Winvalid-utf8-1.C: New test.
* g++.dg/cpp23/Winvalid-utf8-2.C: New test.
* g++.dg/cpp23/Winvalid-utf8-3.C: New test.
* g++.dg/cpp23/Winvalid-utf8-4.C: New test.
* g++.dg/cpp23/Winvalid-utf8-5.C: New test.
* g++.dg/cpp23/Winvalid-utf8-6.C: New test.
* g++.dg/cpp23/Winvalid-utf8-7.C: New test.
* g++.dg/cpp23/Winvalid-utf8-8.C: New test.
* g++.dg/cpp23/Winvalid-utf8-9.C: New test.
* g++.dg/cpp23/Winvalid-utf8-10.C: New test.
* g++.dg/cpp23/Winvalid-utf8-11.C: New test.
* g++.dg/cpp23/Winvalid-utf8-12.C: New test.
Aldy Hernandez [Wed, 31 Aug 2022 12:31:12 +0000 (14:31 +0200)]
Make frange selftests work on !HONOR_NANS systems.
I'm just shuffling the FP self tests here, with no change to existing
functionality.
If we agree that explicit NANs in the source code with !HONOR_NANS
should behave any differently, I'm happy to address whatever needs
fixing, but for now I'd like to unblock the !HONOR_NANS build systems.
I have added an adaptation of a test Jakub suggested we handle in the PR:
void funk(int cond)
{
  float x;
  if (cond)
    x = __builtin_nan ("");
  else
    x = 1.24;
  bar(x);
}
For !HONOR_NANS, the range for the PHI of x_1 is the union of 1.24 and
NAN which is really 1.24 with a maybe NAN. This reflects the IL-- the
presence of the actual NAN. However, VRP will propagate this because
it sees the 1.24 and ignores the possibility of a NAN, per
!HONOR_NANS. IMO, this is correct. OTOH, for HONOR_NANS the unknown
NAN property keeps us from propagating the value.
Is there a reason we don't warn for calls to __builtin_nan when
!HONOR_NANS? That makes no sense to me.
PR tree-optimization/106785
gcc/ChangeLog:
* value-range.cc (range_tests_nan): Adjust tests for !HONOR_NANS.
(range_tests_floats): Same.
Peter Bergner [Thu, 1 Sep 2022 02:14:36 +0000 (21:14 -0500)]
rs6000: Don't ICE when we disassemble an MMA variable [PR101322]
When we expand an MMA disassemble built-in with C++ using a pointer that
is cast to a valid MMA type, the type isn't passed down to the expand
machinery and we end up using the base type of the pointer which leads to
an ICE. This patch enforces we always use the correct MMA type regardless
of the pointer type being used.
2022-08-31 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/101322
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin):
Enforce the use of a valid MMA pointer type.
gcc/testsuite/
PR target/101322
* g++.target/powerpc/pr101322.C: New test.
Joseph Myers [Wed, 31 Aug 2022 22:22:07 +0000 (22:22 +0000)]
c: C2x attributes fixes and updates
Implement some changes to the currently supported C2x standard
attributes that have been made to the specification since they were
first implemented in GCC, and some consequent changes:
* maybe_unused is now supported on labels. In fact that was already
accidentally supported in GCC as a result of sharing the
implementation with __attribute__ ((unused)), but needed to be
covered in the tests.
* As part of the support for maybe_unused on labels, its
__has_c_attribute value changed.
* The issue of maybe_unused accidentally being already supported on
labels showed up the lack of tests for other standard attributes
being incorrectly applied to labels; add such tests.
* Use of fallthrough or nodiscard attributes on labels already
properly resulted in a pedwarn. For the deprecated attribute,
however, there was only a warning, and the wording "'deprecated'
attribute ignored for 'void'" included an unhelpful "for 'void'".
Arrange for the case of the deprecated attribute on a label to be
checked for separately and result in a pedwarn. As with
inappropriate uses of fallthrough (see commit 6c80b1b56dec2691436f3e2676e3d1b105b01b89), it seems reasonable for
this pedwarn to apply regardless of whether [[]] or __attribute__
was used and regardless of whether C or C++ is being compiled.
* Attributes on case or default labels (the standard syntax supports
attributes on all kinds of labels) were quietly ignored, whether or
not appropriate for use in such a context, because they weren't
passed to decl_attributes at all. (Note where I'm changing the
do_case prototype that such a function is actually only defined in
the C front end, not for C++, despite the declaration being in
c-common.h.)
* A recent change as part of the editorial review in preparation for
the C2x CD ballot has changed the __has_c_attribute value for
fallthrough to 201910 to reflect when that attribute was actually
voted into the working draft.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c-family/
* c-attribs.cc (handle_deprecated_attribute): Check and pedwarn
for LABEL_DECL.
* c-common.cc (c_add_case_label): Add argument ATTRS. Call
decl_attributes.
* c-common.h (do_case, c_add_case_label): Update declarations.
* c-lex.cc (c_common_has_attribute): For C, produce a result of
201910 for fallthrough and 202106 for maybe_unused.
gcc/c/
* c-parser.cc (c_parser_label): Pass attributes to do_case.
* c-typeck.cc (do_case): Add argument ATTRS. Pass it to
c_add_case_label.
gcc/testsuite/
* gcc.dg/c2x-attr-deprecated-2.c, gcc.dg/c2x-attr-fallthrough-2.c,
gcc.dg/c2x-attr-maybe_unused-1.c, gcc.dg/c2x-attr-nodiscard-2.c:
Add tests of attributes on labels.
* gcc.dg/c2x-has-c-attribute-2.c: Update expected results for
maybe_unused and fallthrough.
Patrick Palka [Wed, 31 Aug 2022 20:45:30 +0000 (16:45 -0400)]
libstdc++: A few more minor <ranges> cleanups
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (__advance_fn::operator()): Add
parentheses in assert condition to avoid -Wparentheses warning.
* include/std/ranges (take_view::take_view): Uglify 'base'.
(take_while_view::take_while_view): Likewise.
(elements_view::elements_view): Likewise.
(views::_Zip::operator()): Adjust position of [[nodiscard]] for
compatibility with -fconcepts-ts.
(zip_transform_view::_Sentinel): Uglify 'OtherConst'.
(views::_ZipTransform::operator()): Adjust position of
[[nodiscard]] for compatibility with -fconcepts-ts.
Martin Liska [Wed, 31 Aug 2022 19:55:45 +0000 (21:55 +0200)]
Support --disable-fixincludes.
Always install limits.h and syslimits.h header files
to include folder.
When --disable-fixincludes is used, then no system header files
are fixed by the tools in fixincludes. Moreover, the fixincludes
tools are not built any longer.
gcc/ChangeLog:
* Makefile.in: Always install limits.h and syslimits.h to
include folder.
* configure.ac: Assign STMP_FIXINC blank if
--disable-fixincludes is used.
* configure: Regenerate.
aarch64: Update sizeless tests for recent GNU C changes
The tests for sizeless SVE types include checks that the types
are handled for initialisation purposes in the same way as scalars.
GNU C and C2x now allow scalars to be initialised using empty braces,
so this patch updates the SVE tests to match.
Richard Biener [Wed, 31 Aug 2022 13:25:32 +0000 (15:25 +0200)]
Avoid fatal fails in predicate::init_from_control_deps
When processing USE predicates we can drop from the AND chain;
when processing DEF predicates we can drop from the OR chain. Do
that instead of giving up completely. This also removes cases
that should never trigger.
* gimple-predicate-analysis.cc (predicate::init_from_control_deps):
Assert the guard_bb isn't empty and has more than one successor.
Drop appropriate parts of the predicate when an edge fails to
register a predicate.
(predicate::dump): Dump empty predicate as TRUE.
Jonathan Wakely [Wed, 31 Aug 2022 12:57:34 +0000 (13:57 +0100)]
libstdc++: Add noexcept-specifier to std::reference_wrapper::operator()
This isn't required by the standard, but there's an LWG issue suggesting
to add it.
Also use __invoke_result instead of result_of, to match the spec in
recent standards.
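A minimal sketch of what a conditional noexcept-specifier on a forwarding call operator looks like, assuming C++17. toy_ref, quiet_fn, loud_fn and demo_toy_ref are hypothetical names for illustration only; this is not the libstdc++ implementation, and it uses the public std::invoke_result_t rather than the internal __invoke_result.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::reference_wrapper; only the call
// operator is modelled.
template <typename T>
class toy_ref {
  T *ptr_;

 public:
  explicit toy_ref(T &t) noexcept : ptr_(&t) {}

  // noexcept exactly when invoking the wrapped callable is noexcept.
  template <typename... Args>
  auto operator()(Args &&...args) const
      noexcept(std::is_nothrow_invocable_v<T &, Args...>)
      -> std::invoke_result_t<T &, Args...> {
    return std::invoke(*ptr_, std::forward<Args>(args)...);
  }
};

// Two hypothetical callables that differ only in their exception spec.
struct quiet_fn { int operator()(int x) noexcept { return x + 1; } };
struct loud_fn  { int operator()(int x) { return x + 1; } };

inline void demo_toy_ref() {
  quiet_fn q;
  loud_fn l;
  toy_ref<quiet_fn> rq(q);
  toy_ref<loud_fn> rl(l);
  static_assert(noexcept(rq(1)), "noexcept propagates from quiet_fn");
  static_assert(!noexcept(rl(1)), "loud_fn may throw");
  assert(rq(1) == 2 && rl(1) == 2);
}
```

The noexcept operator is unevaluated, so the two static_asserts only inspect the exception specification and never call the wrapped objects.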
libstdc++-v3/ChangeLog:
* include/bits/refwrap.h (reference_wrapper::operator()): Add
noexcept-specifier and use __invoke_result instead of result_of.
* testsuite/20_util/reference_wrapper/invoke-noexcept.cc: New test.
Richard Biener [Wed, 31 Aug 2022 12:04:46 +0000 (14:04 +0200)]
tree-optimization/90994 - fix uninit diagnostics with EH
r12-3640-g94c12ffac234b2 sneaked in a hack to avoid the diagnostic
for the testcase in PR90994 which sees non-call EH control flow
confusing predicate analysis. The following patch instead adjusts
the existing code handling EH to handle non-calls and do what I
think was intended.
PR tree-optimization/90994
* gimple-predicate-analysis.cc (predicate::init_from_control_deps):
Ignore exceptional control flow and skip the edge for the purpose of
predicate generation also for non-calls.
Richard Biener [Tue, 30 Aug 2022 08:31:26 +0000 (10:31 +0200)]
tree-optimization/65244 - include asserts in predicates for uninit
When uninit computes the actual predicates from the control dependence
edges it currently skips those that are assert-like (where one edge
leads to a block which ends in a noreturn call). That leads to
bogus uninit diagnostics when applied on the USE side.
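A sketch of the shape of code this concerns. It is a hypothetical example, not the testcase from the PR, and guarded_use and demo_guarded_use are made-up names. The use of x is only reachable when the assert-like branch is not taken, so the predicate guarding the USE has to include that condition for the analysis to see that x is always defined there.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical example, not the PR testcase.
inline int guarded_use(int cond) {
  int x;
  if (cond)
    x = 1;         // DEF: x is set only when cond is nonzero
  if (!cond)
    std::abort();  // assert-like: this edge leads to a noreturn call
  return x;        // USE: reached only when cond is nonzero
}

inline void demo_guarded_use() {
  // Only the nonzero path returns; guarded_use(0) would abort.
  assert(guarded_use(1) == 1);
}
```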
PR tree-optimization/65244
* gimple-predicate-analysis.h (predicate::init_from_control_deps):
Add argument to specify whether the predicate is for the USE.
* gimple-predicate-analysis.cc (predicate::init_from_control_deps):
Also include predicates of effective fallthru control edges when
the predicate is for the USE.
Richard Biener [Wed, 31 Aug 2022 06:52:58 +0000 (08:52 +0200)]
tree-optimization/73550 - more switch handling improvements for uninit
The following makes predicate analysis handle case labels with
a non-singleton contiguous range.
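For illustration, a hypothetical function (classify is a made-up name, not taken from the PR) that uses the GNU case-range extension. The label below covers three values, so it has both a low and a high bound, which is the CASE_HIGH situation named in the ChangeLog entry.

```cpp
#include <cassert>

// Hypothetical example; `case 1 ... 3` is a GNU extension (case range).
inline int classify(int v) {
  int x;
  switch (v) {
    case 1 ... 3:  // one label covering the contiguous range 1, 2, 3
      x = 10;
      break;
    default:
      return -1;   // x is never read on this path
  }
  return x;        // only reached via the case-range label
}

inline void demo_classify() {
  assert(classify(2) == 10);
  assert(classify(7) == -1);
}
```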
PR tree-optimization/73550
* gimple-predicate-analysis.cc (predicate::init_from_control_deps):
Sanitize debug dumping. Handle case labels with a CASE_HIGH.
(predicate::dump): Adjust for better readability.
Jakub Jelinek [Wed, 31 Aug 2022 08:22:36 +0000 (10:22 +0200)]
libcpp: Make static checkers happy about makeuname2c [PR106778]
The assertion ensures that we point within the image and at a byte
we haven't touched yet (or at least that it isn't the first byte
of an already stored tree), some static checker was unhappy about
first checking that it is zero and only afterwards checking that it
is within bounds.
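The ordering issue can be sketched as follows. slot_is_free is a hypothetical helper, not the code in makeuname2c.cc; it only shows why the bounds check belongs on the left of &&.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if `pos` lies inside the image and the byte
// there has not been written yet (is still zero).
inline bool slot_is_free(const unsigned char *image, std::size_t size,
                         std::size_t pos) {
  assert(image != nullptr);
  // Bounds first: && short-circuits, so image[pos] is only read when
  // pos < size holds.
  return pos < size && image[pos] == 0;
}
```

With the operands the other way round, image[pos] would be read before pos is known to be in range, which is what a static checker flags.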
2022-08-31 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/106778
* makeuname2c.cc (write_nodes): Reverse order of && operands in
assert.
vect: Fix stray argument in call to dump_printf_loc
One call to dump_printf_loc had a stray left-over argument
from an earlier version of the patch. This went unnoticed
on aarch64-linux-gnu and x86_64-linux-gnu since the parameters
that actually mattered were passed in FPRs rather than GPRs,
but I assume this is the reason for the i686-linux-gnu failures
that Jakub hit.
Aldy Hernandez [Tue, 30 Aug 2022 13:46:43 +0000 (15:46 +0200)]
Improve union of ranges containing NAN.
Previously [5,6] U NAN would just drop to VARYING. With this patch,
the resulting range becomes [5,6] with the NAN bit set to unknown.
[I still have yet to decide what to do with intersections. ISTM, the
intersection of a known NAN with anything else should be a NAN, but it
could also be undefined (the empty set). I'll have to run some tests
and see. Currently, we drop to VARYING cause well... it's always safe
to give up;-).]
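A toy model of the union described above, assuming a simplified range with closed bounds and a tri-state NaN property. toy_frange, nan_state, union_with_nan and demo_union_with_nan are hypothetical names; this is not the frange API in value-range.cc.

```cpp
#include <cassert>

// Hypothetical tri-state: is the value known not to be a NaN, known to
// be a NaN, or unknown?
enum class nan_state { no, yes, unknown };

// Hypothetical simplified range: closed bounds plus the NaN property.
struct toy_frange {
  double lo, hi;
  nan_state nan;
};

// Union of a range with a known NaN: keep the bounds and record that
// the value may or may not be a NaN, rather than giving up entirely.
inline toy_frange union_with_nan(const toy_frange &r) {
  return toy_frange{r.lo, r.hi, nan_state::unknown};
}

inline void demo_union_with_nan() {
  toy_frange r = union_with_nan({5.0, 6.0, nan_state::no});
  assert(r.lo == 5.0 && r.hi == 6.0);
  assert(r.nan == nan_state::unknown);
}
```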
gcc/ChangeLog:
* value-range.cc (early_nan_resolve): Change comment.
(frange::union_): Handle union when one side is a NAN.
(range_tests_nan): Add tests for NAN union.
Andrew Stubbs [Fri, 5 Aug 2022 12:28:50 +0000 (13:28 +0100)]
omp-simd-clone: Allow fixed-lane vectors
The vecsize_int/vecsize_float has an assumption that all arguments will use
the same bitsize, and vary the number of lanes according to the element size,
but this is inappropriate on targets where the number of lanes is fixed and
the bitsize varies (i.e. amdgcn).
With this change the vecsize can be left zero and the vectorization factor will
be the same for all types.
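A rough sketch of the two models, under the assumption that the number of lanes is either derived from a fixed vector bitsize or taken directly from simdlen. lanes_for is a hypothetical helper, not the code in omp-simd-clone.cc.

```cpp
#include <cassert>

// Hypothetical helper: number of lanes used for an element type of
// `elt_bits` bits.
//   vecsize_bits != 0: fixed-bitsize vectors, lanes vary with element size.
//   vecsize_bits == 0: fixed-lane target, every type uses `simdlen` lanes.
inline unsigned lanes_for(unsigned vecsize_bits, unsigned elt_bits,
                          unsigned simdlen) {
  assert(elt_bits != 0);
  if (vecsize_bits == 0)
    return simdlen;
  return vecsize_bits / elt_bits;
}
```

For example, with a 128-bit vecsize a 32-bit element gets 4 lanes and a 64-bit element gets 2, whereas with a zero vecsize and a simdlen of 64 both get 64 lanes.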
gcc/ChangeLog:
* doc/tm.texi: Regenerate.
* omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero
vecsize.
(simd_clone_adjust_argument_types): Likewise.
* target.def (compute_vecsize_and_simdlen): Document the new
vecsize_int and vecsize_float semantics.
store_bit_field_1 tries to convert a field assignment into a subreg
assignment. Normally it must check that the field occupies a full
word (or more specifically, a full REGMODE_NATURAL_SIZE chunk),
so that writing to the subreg doesn't clobber any other fields.
But it can skip that check if the structure is known to be in
an undefined state.
The idea was that, in the undefined case, we could rely on
simplify_gen_subreg to do the check for a valid subreg, rather
than having to repeat the required endianness logic in the caller.
Before the addition of the undefined case, the code could use
regnum * regsize to get the byte offset, where regnum came from
checking that the start was word-aligned. In the undefined case
we need to calculate the byte offset explicitly.
Currently SLP tries to force permute operations "down" the graph
from loads in the hope of reducing the total number of permutations
needed or (in the best case) removing the need for the permutations
entirely. This patch tries to extend it as follows:
- Allow loads to take a different permutation from the one they
started with, rather than choosing between "original permutation"
and "no permutation".
- Allow changes in both directions, if the target supports the
reverse permutation.
- Treat the placement of permutations as a two-way dataflow problem:
after propagating information from leaves to roots (as now), propagate
information back up the graph.
- Take execution frequency into account when optimising for speed,
so that (for example) permutations inside loops have a higher
cost than permutations outside loops.
- Try to reduce the total number of permutations when optimising for
size, even if that increases the number of permutations on a given
execution path.
See the big block comment above vect_optimize_slp_pass for
a detailed description.
The original motivation for doing this was to add a framework that would
allow other layout differences in future. The two main ones are:
- Make it easier to represent predicated operations, including
predicated operations with gaps. E.g.:
a[0] += 1;
a[1] += 1;
a[3] += 1;
could be a single load/add/store for SVE. We could handle this
by representing a layout such as { 0, 1, _, 2 } or { 0, 1, _, 3 }
(depending on what's being counted). We might need to move
elements between lanes at various points, like with permutes.
(This would first mean adding support for stores with gaps.)
- Make it easier to switch between an even/odd and unpermuted layout
when switching between wide and narrow elements. E.g. if a widening
operation produces an even vector and an odd vector, we should try
to keep operations on the wide elements in that order rather than
force them to be permuted back "in order".
To give some examples of what the patch does:
int f1(int *__restrict a, int *__restrict b, int *__restrict c,
int *__restrict d)
{
a[0] = (b[1] << c[3]) - d[1];
a[1] = (b[0] << c[2]) - d[0];
a[2] = (b[3] << c[1]) - d[3];
a[3] = (b[2] << c[0]) - d[2];
}
continues to produce the same code as before when optimising for
speed: b, c and d are permuted at load time. But when optimising
for size we instead permute c into the same order as b+d and then
permute the result of the arithmetic into the same order as a:
int f2(int *__restrict a, int *__restrict b, int *__restrict c,
int *__restrict d)
{
a[0] = (b[3] << c[3]) - d[3];
a[1] = (b[2] << c[2]) - d[2];
a[2] = (b[1] << c[1]) - d[1];
a[3] = (b[0] << c[0]) - d[0];
}
continues to push the reverse down to just before the store,
like the previous code did.
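The reason a permutation can be moved from the loads to the store in a case like f2 is that lane-wise arithmetic commutes with permutation. A small sketch of that property, using hypothetical helpers (vec4, permute, shift_sub) that are unrelated to the vectorizer's own data structures:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using vec4 = std::array<int, 4>;

// result[i] = v[perm[i]]
inline vec4 permute(const vec4 &v, const std::array<std::size_t, 4> &perm) {
  vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) {
    assert(perm[i] < 4);
    r[i] = v[perm[i]];
  }
  return r;
}

// Lane-wise (b << c) - d, mirroring the arithmetic in f2.
inline vec4 shift_sub(const vec4 &b, const vec4 &c, const vec4 &d) {
  vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = (b[i] << c[i]) - d[i];
  return r;
}
```

Permuting all three inputs with the same permutation and then computing gives the same vector as computing first and permuting the result once, so three permutes at the loads can be replaced by one before the store.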
In:
int f3(int *__restrict a, int *__restrict b, int *__restrict c,
int *__restrict d)
{
for (int i = 0; i < 100; ++i)
{
a[0] = (a[0] + c[3]);
a[1] = (a[1] + c[2]);
a[2] = (a[2] + c[1]);
a[3] = (a[3] + c[0]);
c += 4;
}
}
the loads of a are hoisted and the stores of a are sunk, so that
only the load from c happens in the loop. When optimising for
speed, we prefer to have the loop operate on the reversed layout,
changing on entry and exit from the loop:
int f4(int *__restrict a, int *__restrict b, int *__restrict c,
int *__restrict d)
{
int a0 = a[0];
int a1 = a[1];
int a2 = a[2];
int a3 = a[3];
for (int i = 0; i < 100; ++i)
{
a0 ^= c[0];
a1 ^= c[1];
a2 ^= c[2];
a3 ^= c[3];
c += 4;
for (int j = 0; j < 100; ++j)
{
a0 += d[1];
a1 += d[0];
a2 += d[3];
a3 += d[2];
d += 4;
}
b[0] = a0;
b[1] = a1;
b[2] = a2;
b[3] = a3;
b += 4;
}
a[0] = a0;
a[1] = a1;
a[2] = a2;
a[3] = a3;
}
the a vector in the inner loop maintains the order { 1, 0, 3, 2 },
even though it's part of an SCC that includes the outer loop.
In other words, this is a motivating case for not assigning
permutes at SCC granularity. The code we get is:
bb-slp-layout-17.c is a collection of compile tests for problems
I hit with earlier versions of the patch. The same problems might
show up elsewhere, but it seemed worth having the test anyway.
In slp-11b.c we previously pushed the permutation of the in[i*4]
group down from the load to just before the store. That didn't
reduce the number or frequency of the permutations (or increase
them either). But separating the permute from the load meant
that we could no longer use load/store lanes.
Whether load/store lanes are a good idea here is another question.
If there were two sets of loads, and if we could use a single
permutation instead of one per load, then avoiding load/store
lanes should be a good thing even under the current abstract
cost model. But I think under the current model we should
try to avoid splitting up potential load/store lanes groups
if there is no specific benefit to the split.
Preferring load/store lanes is still a source of missed optimisations
that we should fix one day...
gcc/
* params.opt (-param=vect-max-layout-candidates=): New parameter.
* doc/invoke.texi (vect-max-layout-candidates): Document it.
* tree-vectorizer.h (auto_lane_permutation_t): New typedef.
(auto_load_permutation_t): Likewise.
* tree-vect-slp.cc (vect_slp_node_weight): New function.
(slpg_layout_cost): New class.
(slpg_vertex): Replace perm_in and perm_out with partition,
out_degree, weight and out_weight.
(slpg_partition_info, slpg_partition_layout_costs): New classes.
(vect_optimize_slp_pass): Likewise, cannibalizing some part of
the previous vect_optimize_slp.
(vect_optimize_slp): Use it.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_var_shift):
Return true for aarch64.
* gcc.dg/vect/bb-slp-layout-1.c: New test.
* gcc.dg/vect/bb-slp-layout-2.c: New test.
* gcc.dg/vect/bb-slp-layout-3.c: New test.
* gcc.dg/vect/bb-slp-layout-4.c: New test.
* gcc.dg/vect/bb-slp-layout-5.c: New test.
* gcc.dg/vect/bb-slp-layout-6.c: New test.
* gcc.dg/vect/bb-slp-layout-7.c: New test.
* gcc.dg/vect/bb-slp-layout-8.c: New test.
* gcc.dg/vect/bb-slp-layout-9.c: New test.
* gcc.dg/vect/bb-slp-layout-10.c: New test.
* gcc.dg/vect/bb-slp-layout-11.c: New test.
* gcc.dg/vect/bb-slp-layout-13.c: New test.
* gcc.dg/vect/bb-slp-layout-14.c: New test.
* gcc.dg/vect/bb-slp-layout-15.c: New test.
* gcc.dg/vect/bb-slp-layout-16.c: New test.
* gcc.dg/vect/bb-slp-layout-17.c: New test.
* gcc.dg/vect/slp-11b.c: XFAIL SLP test for load-lanes targets.
(1) hashing and equality of integers
(2) using spare integer encodings to represent empty and deleted slots
(1) is really independent of (2), and could be useful in cases where
no spare integer encodings are available. This patch adds a base class
(int_hash_base) for (1) and makes int_hash inherit from it.
If we follow a similar style for future hashes, we can make
unbounded_hashmap_traits take the "base" hash for the key
as a template parameter, rather than requiring every type of
key to have a separate derivative of unbounded_hashmap_traits.
A later patch applies this to vector keys.
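The split between (1) and (2) can be sketched like this. toy_int_hash_base, toy_int_hash and toy_hash are simplified, hypothetical stand-ins; the real classes in hash-traits.h have a larger interface.

```cpp
#include <cassert>
#include <cstddef>

// (1) Hashing and equality only; needs no spare encodings.
template <typename T>
struct toy_int_hash_base {
  static std::size_t hash(T v) { return static_cast<std::size_t>(v); }
  static bool equal(T a, T b) { return a == b; }
};

// (2) Adds spare encodings for empty and deleted slots on top of (1).
template <typename T, T Empty, T Deleted>
struct toy_int_hash : toy_int_hash_base<T> {
  static bool is_empty(T v) { return v == Empty; }
  static bool is_deleted(T v) { return v == Deleted; }
};

// Example instantiation: -1 marks an empty slot, -2 a deleted one.
using toy_hash = toy_int_hash<int, -1, -2>;

inline void demo_toy_hash() {
  assert(toy_hash::equal(3, 3));
  assert(toy_hash::is_empty(-1) && !toy_hash::is_deleted(-1));
}
```

A map traits class can then be parameterized on the base alone when it tracks empty and deleted slots by some other means.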
No functional change intended.
gcc/
* hash-traits.h (int_hash_base): New struct, split out from...
(int_hash): ...this class, which now inherits from int_hash_base.
* hash-map-traits.h (unbounded_hashmap_traits): Take a template
parameter for the key that provides hash and equality functions.
(unbounded_int_hashmap_traits): Turn into a type alias of
unbounded_hashmap_traits.
Make graphds_scc pass the node order back to callers
As a side-effect, graphds_scc constructs a vector in which all
nodes in an SCC are listed consecutively. This can be useful
information, so this patch adds an optional pass-back parameter
for it. The interface is similar to the one for graphds_dfs.
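To illustrate the idea of an SCC routine that also hands back a node order in which each component is contiguous, here is a self-contained sketch. It uses Tarjan's algorithm as a stand-in and hypothetical names (scc_result, toy_scc); it is not the graphds_scc implementation or its interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct scc_result {
  std::vector<int> component;  // component[v] = SCC id of node v
  std::vector<int> order;      // all nodes, each SCC listed consecutively
  int count = 0;               // number of SCCs found
};

// adj[v] lists the successors of node v.
inline scc_result toy_scc(const std::vector<std::vector<int>> &adj) {
  const std::size_t n = adj.size();
  scc_result res;
  res.component.assign(n, -1);
  std::vector<int> index(n, -1), low(n, 0), stack;
  std::vector<bool> on_stack(n, false);
  int next = 0;

  std::function<void(int)> visit = [&](int v) {
    index[v] = low[v] = next++;
    stack.push_back(v);
    on_stack[v] = true;
    for (int w : adj[v]) {
      assert(w >= 0 && static_cast<std::size_t>(w) < n);
      if (index[w] < 0) {
        visit(w);
        low[v] = std::min(low[v], low[w]);
      } else if (on_stack[w]) {
        low[v] = std::min(low[v], index[w]);
      }
    }
    if (low[v] == index[v]) {
      // v is the root of an SCC: pop its members. They are appended to
      // `order` back to back, which is the side-effect being passed back.
      int w;
      do {
        w = stack.back();
        stack.pop_back();
        on_stack[w] = false;
        res.component[w] = res.count;
        res.order.push_back(w);
      } while (w != v);
      ++res.count;
    }
  };

  for (std::size_t v = 0; v < n; ++v)
    if (index[v] < 0)
      visit(static_cast<int>(v));
  return res;
}
```

A caller that wants to process one component at a time can walk order and start a new group whenever the component id changes.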
gcc/
* graphds.cc (graphds_scc): Add a pass-back parameter for the
final node order.
* graphds.h (graphds_scc): Update prototype accordingly.
Similarly to the previous vectorizable_slp_permutation patch,
this one splits out the main part of vect_transform_slp_perm_load
so that a later patch can test a permutation without constructing
a node for it.
Also fixes a lingering use of STMT_VINFO_VECTYPE.
gcc/
* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Split out from...
(vect_transform_slp_perm_load): ...here. Use SLP_TREE_VECTYPE instead
of STMT_VINFO_VECTYPE.