Roger Sayle [Wed, 16 Jun 2021 08:56:09 +0000 (09:56 +0100)]
[PATCH] PR rtl-optimization/46235: Improved use of bt for bit tests on x86_64.
This patch tackles PR46235 to improve the code generated for bit tests
on x86_64 by making more use of the bt instruction. Currently, GCC emits
bt instructions when followed by conditional jumps (thanks to Uros' splitters).
This patch adds splitters in i386.md, to catch the cases where bt is followed
by a conditional move (as in the original report), or by a setc/setnc (as in
comment 5 of the Bugzilla PR).
With this patch, the function in the original PR
int foo(int a, int x, int y) {
if (a & (1 << x))
return a;
return 1;
}
which with -O2 on mainline generates:
foo: movl %edi, %eax
movl %esi, %ecx
sarl %cl, %eax
testb $1, %al
movl $1, %eax
cmovne %edi, %eax
ret
now generates:
foo: btl %esi, %edi
movl $1, %eax
cmovc %edi, %eax
ret
After:
movzbl %dil, %edi
btl %esi, %edi
setc %al
ret
According to Agner Fog, SAR/SHR r,cl takes 2 cycles on skylake,
where BT r,r takes only one, so the performance improvements on
recent hardware may be more significant than implied by just
the reduced number of instructions. I've avoided transforming cases
(such as btsi_setcsi) where using bt sequences may not be a clear
win (over sarq/andl).
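For illustration, the source-level shapes these splitters target can be sketched as follows. foo is the function from the PR; bit_is_set and bit_is_clear are hypothetical names for the setc and setnc forms, and the sketch assumes a 32-bit unsigned int. Whether a particular GCC build emits bt for them depends on the target and options.

```c
#include <assert.h>

/* From the original report: a bit test feeding a conditional move.  */
int foo(int a, int x, int y) {
  (void)y;  /* unused; kept to match the signature in the PR */
  if (a & (1 << x))
    return a;
  return 1;
}

/* Hypothetical setc-style form: materialize the tested bit as 0 or 1.  */
int bit_is_set(unsigned int a, int x) {
  assert(x >= 0 && x < 32);  /* shifting by the width or more is undefined */
  return (a >> x) & 1;
}

/* Hypothetical setnc-style form: 1 when the tested bit is clear.  */
int bit_is_clear(unsigned int a, int x) {
  assert(x >= 0 && x < 32);
  return !((a >> x) & 1);
}
```

Compiling these with -O2 -S before and after the patch is one way to compare the generated sequences.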
2021-06-15 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/46235
* config/i386/i386.md: New define_split for bt followed by cmov.
(*bt<mode>_setcqi): New define_insn_and_split for bt followed by setc.
(*bt<mode>_setncqi): New define_insn_and_split for bt then setnc.
(*bt<mode>_setnc<mode>): New define_insn_and_split for bt followed
by setnc with zero extension.
gcc/testsuite/ChangeLog
PR rtl-optimization/46235
* gcc.target/i386/bt-5.c: New test.
* gcc.target/i386/bt-6.c: New test.
* gcc.target/i386/bt-7.c: New test.
Jakub Jelinek [Wed, 16 Jun 2021 08:45:27 +0000 (10:45 +0200)]
libffi: Fix up x86_64 classify_argument
As the following testcase shows, libffi's classify_argument didn't properly
handle structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
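A minimal sketch of the word-count computation described above, assuming 8-byte words as on x86_64. The helper name words_spanned is hypothetical, and byte_offset is taken here to be the offset within the first word; this is not the libffi code itself.

```c
#include <assert.h>
#include <stddef.h>

#define UNITS_PER_WORD 8  /* x86_64 eightbyte; assumed for this sketch */

/* Hypothetical helper: number of words covered by a struct of SIZE bytes
   that starts BYTE_OFFSET bytes into a word.  */
size_t words_spanned(size_t size, size_t byte_offset) {
  assert(byte_offset < UNITS_PER_WORD);
  return (size + byte_offset + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
}
```

An 8-byte struct at offset 0 covers one word, but the same struct at offset 4 straddles two, which is the case that counting only type->size bytes misses.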
Piotr Trojanek [Wed, 3 Mar 2021 19:15:56 +0000 (20:15 +0100)]
[Ada] Ignore volatile restrictions in preanalysis
gcc/ada/
* sem_util.adb (Is_OK_Volatile_Context): All references to
volatile objects are legal in preanalysis.
(Within_Volatile_Function): Previously it was wrongly called on
Empty entities; now it is only called on E_Return_Statement,
which allows the body to be greatly simplified.
Yannick Moy [Wed, 3 Mar 2021 13:54:09 +0000 (14:54 +0100)]
[Ada] Do not generate an Itype_Reference node for slices in GNATprove mode
gcc/ada/
* sem_res.adb (Set_Slice_Subtype): Revert special-case
introduced previously, which is not needed as Itypes created for
slices are precisely always used.
Eric Botcazou [Wed, 3 Mar 2021 19:15:42 +0000 (20:15 +0100)]
[Ada] Fix floating-point exponentiation with Integer'First exponent
gcc/ada/
* urealp.adb (Scale): Change first parameter to Uint and adjust.
(Equivalent_Decimal_Exponent): Pass U.Den directly to Scale.
* libgnat/s-exponr.adb (Negative): Rename to...
(Safe_Negative): ...this and change its lower bound.
(Exponr): Adjust to above renaming and deal with Integer'First.
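The underlying hazard is that the most negative integer has no positive counterpart, so negating the exponent overflows. A C analogue, assuming nothing about the libgnat implementation (pow_int is a hypothetical helper), takes the magnitude in unsigned arithmetic instead:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not the libgnat routine: raise BASE to the integer
   power EXP, including EXP == INT_MIN.  */
double pow_int(double base, int exp) {
  /* Zero raised to a negative power would divide by zero.  */
  assert(!(base == 0.0 && exp < 0));

  /* -INT_MIN is not representable in int, but the unsigned subtraction
     below is well defined and yields the magnitude.  */
  unsigned int n = exp < 0 ? 0u - (unsigned int)exp : (unsigned int)exp;

  double result = 1.0;
  double factor = base;
  while (n != 0) {            /* exponentiation by squaring */
    if (n & 1u)
      result *= factor;
    n >>= 1;
    if (n != 0)
      factor *= factor;
  }
  return exp < 0 ? 1.0 / result : result;
}
```

With this shape, a call such as pow_int(1.0, INT_MIN) goes through the same path as any other negative exponent.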
Piotr Trojanek [Mon, 1 Mar 2021 15:39:31 +0000 (16:39 +0100)]
[Ada] Fix detection of volatile expressions in restricted contexts
gcc/ada/
* sem_res.adb (Flag_Effectively_Volatile_Objects): Detect also
allocators within restricted contexts and not just entity names.
(Resolve_Actuals): Remove duplicated code for detecting
restricted contexts; it is now exclusively done in
Is_OK_Volatile_Context.
(Resolve_Entity_Name): Adapt to new parameter of
Is_OK_Volatile_Context.
* sem_util.ads, sem_util.adb (Is_OK_Volatile_Context): Adapt to
handle contexts both inside and outside of subprogram call
actual parameters.
(Within_Subprogram_Call): Remove; now handled by
Is_OK_Volatile_Context itself and its parameter.
Piotr Trojanek [Mon, 1 Mar 2021 13:01:25 +0000 (14:01 +0100)]
[Ada] Fix aliasing check for actual parameters passed by reference
gcc/ada/
* checks.adb (Apply_Scalar_Range_Check): Fix handling of check depending
on the parameter passing mechanism. Grammar adjustment ("has"
=> "have").
(Parameter_Passing_Mechanism_Specified): Add a hyphen in a comment.
Piotr Trojanek [Mon, 1 Mar 2021 15:36:08 +0000 (16:36 +0100)]
[Ada] Cleanup related to volatile objects in restricted contexts
gcc/ada/
* sem_res.adb (Is_Assignment_Or_Object_Expression): Whitespace
cleanup.
(Is_Attribute_Expression): Prevent AST climbing from going to
the root of the compilation unit.
Bob Duff [Tue, 23 Feb 2021 20:50:21 +0000 (15:50 -0500)]
[Ada] Fix missing array bounds checking
gcc/ada/
* ghost.adb: Add another special case where full analysis is
needed. This bug is due to quirks in the way
Mark_And_Set_Ghost_Assignment works (it happens very early,
before name resolution is done).
Sergey Rybin [Wed, 17 Feb 2021 13:31:08 +0000 (16:31 +0300)]
[Ada] Clarify the documentation of -gnaty0 style check option
gcc/ada/
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Instead of referring to the formatting of the Ada examples in
Ada RM, use the list of checks that are actually performed.
* gnat_ugn.texi: Regenerate.
Eric Botcazou [Wed, 17 Feb 2021 08:00:19 +0000 (09:00 +0100)]
[Ada] Small cleanup in C header files
gcc/ada/
* initialize.c: Do not include vxWorks.h and fcntl.h from here.
(__gnat_initialize) [__MINGW32__]: Remove #ifdef and attribute.
(__gnat_initialize) [init_float]: Delete.
(__gnat_initialize) [VxWorks]: Likewise.
(__gnat_initialize) [PA-RISC HP-UX 10]: Likewise.
* runtime.h: Add comment about vxWorks.h include.
Richard Biener [Wed, 16 Jun 2021 06:56:21 +0000 (08:56 +0200)]
tree-optimization/101083 - fix ICE with SLP reassoc
This makes us pass down the vector type for the two-operand
SLP node build rather than picking that from operand one which,
when constant or external, could be NULL.
2021-06-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101083
* tree-vect-slp.c (vect_slp_build_two_operator_nodes): Get
vectype as argument.
(vect_build_slp_tree_2): Adjust.
gcc/analyzer/ChangeLog:
PR analyzer/99212
PR analyzer/101082
* engine.cc: Include "target.h".
(impl_run_checkers): Log BITS_BIG_ENDIAN, BYTES_BIG_ENDIAN, and
WORDS_BIG_ENDIAN.
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Move support for masking
via ARG0 & CST into...
(region_model_manager::maybe_undo_optimize_bit_field_compare):
...this new function. Flatten by converting from nested
conditionals to a series of early return statements to reject
failures. Reject if type is not unsigned_char_type_node.
Handle BYTES_BIG_ENDIAN when determining which bits are bound
in the binding_map.
* region-model.h
(region_model_manager::maybe_undo_optimize_bit_field_compare):
New decl.
* store.cc (bit_range::dump): New function.
* store.h (bit_range::dump): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan Wakely [Tue, 15 Jun 2021 14:07:25 +0000 (15:07 +0100)]
libstdc++: Remove precondition checks from ranges::subrange
The assertion in the subrange constructor causes semantic changes,
because the call to ranges::distance performs additional operations that
are not part of the constructor's specification. That will fail to
compile if the iterator is move-only, because the argument to
ranges::distance is passed by value. It will modify the subrange if the
iterator is not a forward iterator, because incrementing the copy also
affects the _M_begin member. Those problems could be prevented by using
if-constexpr to only do the assertion for copyable forward iterators,
but the call to ranges::distance can also prevent the constructor being
usable in constant expressions. If the member initializers are usable in
constant expressions, but iterator increments or equality comparisons
are not, then the checks done by __glibcxx_assert might
make constant evaluation fail.
This change removes the assertion. Additionally, a new typedef is
introduced to simplify the declarations using __make_unsigned_like_t on
the iterator's difference type.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (subrange): Add __size_type typedef
and use it to simplify declarations.
(subrange(i, s, n)): Remove assertion.
* testsuite/std/ranges/subrange/constexpr.cc: New test.
Jonathan Wakely [Tue, 15 Jun 2021 13:39:02 +0000 (14:39 +0100)]
libstdc++: Use function object for __decay_copy helper
By changing __cust_access::__decay_copy from a function template to a
function object we avoid ADL. That means it's fine to call it
unqualified (the compiler won't waste time doing ADL in associated
namespaces, and won't try to complete associated types).
This also makes some other minor simplifications to other concepts for the
[range.access] CPOs.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/iterator_concepts.h (__cust_access::__decay_copy):
Replace with function object.
(__cust_access::__member_begin, __cust_access::__adl_begin): Use
__decay_copy unqualified.
* include/bits/ranges_base.h (__member_end, __adl_end):
Likewise. Use __range_iter_t for type of ranges::begin.
(__member_rend): Use correct value category for rbegin argument.
(__member_data): Use __decay_copy unqualified.
(__begin_data): Use __range_iter_t for type of ranges::begin.
Carl Love [Thu, 10 Jun 2021 22:36:25 +0000 (17:36 -0500)]
Fix for vcmpequt builtin
The vcmpequt builtin define eqvv1ti3 points to the eqv define instruction
for the eqv instruction. The vcmpequt builtin define should point to the
vector_eqv1ti instruction definition for the vcmpequq instruction.
2021-06-15 Carl Love <cel@us.ibm.com>
gcc/ChangeLog
PR target/101022
* config/rs6000/rs6000-builtin.def (VCMPEQUT): Fix the ICODE for the
enum definition.
(VRLQ, VSLQ, VSRQ, VSRAQ): Remove unused BU_P10_OVERLOAD_2
definitions.
David Malcolm [Tue, 15 Jun 2021 13:31:26 +0000 (09:31 -0400)]
analyzer: track dynamic extents of regions
This patch extends region_model to add tracking of the sizes of
dynamically-allocated regions, both on the heap (via malloc etc) and
stack (via alloca). It adds enough purging of this state to avoid
blowing up any existing analyzer test cases.
The state can be queried via a new "__analyzer_dump_capacity" for use
in DejaGnu tests but other than that doesn't do anything - I have
various followup experiments that make use of this.
gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::on_stmt): Handle __analyzer_dump_capacity.
(exploded_node::on_stmt): Drop m_sm_changes from on_stmt_flags.
(state_change_requires_new_enode_p): New function...
(exploded_graph::process_node): Call it, rather than querying
flags.m_sm_changes, so that dynamic-extent differences can also
trigger the splitting of nodes.
* exploded-graph.h (struct on_stmt_flags): Drop field m_sm_changes.
* program-state.cc (program_state::detect_leaks): Purge dead
heap-allocated regions from dynamic extents.
(selftest::test_program_state_1): Fix type of "size_in_bytes".
(selftest::test_program_state_merging): Likewise.
* region-model-impl-calls.cc
(region_model::impl_call_analyzer_dump_capacity): New.
(region_model::impl_call_free): Remove dynamic extents from the
freed region.
* region-model-reachability.h
(reachable_regions::begin_mutable_base_regs): New.
(reachable_regions::end_mutable_base_regs): New.
* region-model.cc: Include "tree-object-size.h".
(region_model::region_model): Support new field m_dynamic_extents.
(region_model::operator=): Likewise.
(region_model::operator==): Likewise.
(region_model::dump_to_pp): Dump sizes of dynamic regions.
(region_model::handle_unrecognized_call): Purge dynamic extents
from any regions that have escaped mutably.
(region_model::get_capacity): New function.
(region_model::add_constraint): Unset dynamic extents when a
heap-allocated region's address is NULL.
(region_model::unbind_region_and_descendents): Purge dynamic
extents of unbound regions.
(region_model::can_merge_with_p): Call
m_dynamic_extents.can_merge_with_p.
(region_model::create_region_for_heap_alloc): Assert that
size_in_bytes's type is compatible with size_type_node. Update
for renaming of record_dynamic_extents to set_dynamic_extents.
(region_model::create_region_for_alloca): Likewise.
(region_model::record_dynamic_extents): Rename to...
(region_model::set_dynamic_extents): ...this. Assert that
size_in_bytes's type is compatible with size_type_node. Add it
to the m_dynamic_extents map.
(region_model::get_dynamic_extents): New.
(region_model::unset_dynamic_extents): New.
(selftest::test_state_merging): Fix type of "size".
(selftest::test_malloc_constraints): Likewise.
(selftest::test_malloc): Verify dynamic extents.
(selftest::test_alloca): Likewise.
* region-model.h (region_to_value_map::is_empty): New.
(region_model::dynamic_extents_t): New typedef.
(region_model::impl_call_analyzer_dump_capacity): New decl.
(region_model::get_dynamic_extents): New function.
(region_model::get_dynamic_extents): New decl.
(region_model::set_dynamic_extents): New decl.
(region_model::unset_dynamic_extents): New decl.
(region_model::get_capacity): New decl.
(region_model::record_dynamic_extents): Rename to set_dynamic_extents.
(region_model::m_dynamic_extents): New field.
gcc/ChangeLog:
* doc/analyzer.texi
(Special Functions for Debugging the Analyzer): Add
__analyzer_dump_capacity.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/analyzer-decls.h (__analyzer_dump_capacity): New decl.
* gcc.dg/analyzer/capacity-1.c: New test.
* gcc.dg/analyzer/capacity-2.c: New test.
* gcc.dg/analyzer/capacity-3.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
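A sketch of how the new debugging hook might be used in a test. The prototype is an assumption based on analyzer-decls.h, and the no-op definition is added only so that the sketch links and runs outside the analyzer; heap_example is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed prototype, modelled on gcc.dg/analyzer/analyzer-decls.h.  */
extern void __analyzer_dump_capacity(const void *ptr);

/* No-op definition for this sketch only; the DejaGnu tests rely on
   -fanalyzer recognising the call rather than on any definition.  */
void __analyzer_dump_capacity(const void *ptr) { (void)ptr; }

/* Hypothetical example: allocate N bytes and ask the analyzer what it has
   recorded as the capacity of the new region.  */
int heap_example(size_t n) {
  assert(n > 0);
  void *p = malloc(n);
  if (!p)
    return -1;
  __analyzer_dump_capacity(p);  /* analyzer reports the tracked size here */
  free(p);
  return 0;
}
```

Under -fanalyzer the call is expected to produce a diagnostic describing the tracked capacity; its exact wording is not reproduced here.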
David Malcolm [Tue, 15 Jun 2021 13:30:18 +0000 (09:30 -0400)]
analyzer: add class region_to_value_map
Add a class for associating symbolic values with regions, for use
initially for recording the sizes of dynamically-allocated regions,
though this also could potentially be used for e.g. tracking strlen()
values.
David Malcolm [Tue, 15 Jun 2021 13:29:23 +0000 (09:29 -0400)]
analyzer testsuite: add explode-2a.c [PR101068]
Due to a bug (PR analyzer/101068), the analyzer only explores a limited
subset of the possible paths through gcc.dg/analyzer/explode-2.c,
and this artificially helps stop this testcase from exploding.
I intend to fix this at some point, but for now, this patch adds a
revised test case which captures the effective CFG due to the bug, so
that we explicitly have test coverage for that CFG.
gcc/testsuite/ChangeLog:
PR analyzer/101068
* gcc.dg/analyzer/explode-2a.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Bob Duff [Thu, 11 Feb 2021 22:57:53 +0000 (17:57 -0500)]
[Ada] No_Task_Parts aspect
gcc/ada/
* aspects.ads (No_Task_Parts): New aspect.
* snames.ads-tmpl: Add the aspect name.
* exp_ch6.adb (Might_Have_Tasks): Return False if this is a
class-wide type whose specific type has No_Task_Parts.
* freeze.adb (Check_No_Parts_Violations): This is an adaptation
of the procedure formerly known as
Check_No_Controlled_Parts_Violations, which now supports both
No_Controlled_Parts and No_Task_Parts. It takes a parameter
indicating which aspect is being checked.
(Freeze_Entity): Call Check_No_Parts_Violations for both
aspects.
* sem_ch13.adb (Analyze_Aspect_Specifications): The code for
Aspect_No_Controlled_Parts already works as is with
Aspect_No_Task_Parts.
* libgnat/a-iteint.ads: Add No_Task_Parts aspect to the two
iterator interfaces.
* doc/gnat_rm/implementation_defined_aspects.rst: Add
documentation for the No_Task_Parts aspect.
* gnat_rm.texi: Regenerate.
Arnaud Charlet [Wed, 13 Jan 2021 13:49:15 +0000 (08:49 -0500)]
[Ada] Use runtime from base compiler during stage1
gcc/ada/
* Make-generated.in: Add rule to copy runtime files needed
during stage1.
* raise.c: Remove obsolete symbols used during bootstrap.
* gcc-interface/Make-lang.in: Do not use libgnat sources during
stage1.
(GNAT_ADA_OBJS, GNATBIND_OBJS): Split in two parts, the common
part and the part only used outside of stage1.
(ADA_GENERATED_FILES): Add runtime files needed during bootstrap
when recent APIs are needed.
(ada/b_gnatb.adb): Remove prerequisite.
* gcc-interface/system.ads: Remove obsolete entries.
Ed Schonberg [Wed, 10 Feb 2021 15:52:04 +0000 (10:52 -0500)]
[Ada] AI12-0138: Iterators and other nonoverridable aspects
gcc/ada/
* sem_util.adb (Is_Confirming): Separate the handling of
Implicit_Dereference, for which no pragma is generated but which
is already checked for legality in Sem_Ch13, including renamed
discriminants in a derived type.
(Is_Confirming, Same_Name): For expanded names, only check
matching of selector, because prefix may correspond to original
and derived types with different names and/or scopes. Semantic
checks on aspect expression have already verified its legality.
Add comments regarding possible gaps in RM description of the
feature.
Gary Dismukes [Wed, 10 Feb 2021 01:18:47 +0000 (20:18 -0500)]
[Ada] Error when passing subprogram'Access to null-defaulted formal subprogram
gcc/ada/
* freeze.adb (Freeze_Subprogram): Don't propagate conventions
Intrinsic or Entry to anonymous access-to-subprogram types
associated with subprograms having those conventions. Update
related comment.
* sem_attr.adb (Resolve_Attribute, Attribute_*Access): Remove
special-case warning code for cases where a called subprogram
has convention Intrinsic as well as its formal's type (the
expected type for the Access attribute), since this case can no
longer occur.
Yannick Moy [Fri, 5 Feb 2021 14:19:57 +0000 (15:19 +0100)]
[Ada] Clarify the semantics of signed intrinsic shift operations
gcc/ada/
* doc/gnat_rm/intrinsic_subprograms.rst: More details on shift
operations for signed types. Also add the missing Import and
Convention on the example.
* gnat_rm.texi: Regenerate.
Eric Botcazou [Mon, 8 Feb 2021 11:00:19 +0000 (12:00 +0100)]
[Ada] Remove const qualifier on a couple of pointed-to types
gcc/ada/
* argv.c: Add include of <stdlib.h> for the runtime.
(gnat_argv): Change type to char ** and initialize to NULL.
(gnat_envp): Likewise.
* argv-lynxos178-raven-cert.c: Add include of <stdlib.h>.
(gnat_argv): Change type to char ** and initialize to NULL.
(gnat_envp): Likewise.
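The resulting declarations can be sketched as below. The two variables follow the ChangeLog entry; save_args is a hypothetical helper added only to show that main's argument vectors can be stored without a cast once the pointed-to type is plain char *.

```c
#include <assert.h>
#include <stdlib.h>

/* As described above: plain char ** initialized to NULL.  */
char **gnat_argv = NULL;
char **gnat_envp = NULL;

/* Hypothetical helper: record the argument and environment vectors.  */
void save_args(char **argv, char **envp) {
  assert(argv != NULL);
  gnat_argv = argv;
  gnat_envp = envp;
}
```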
Arnaud Charlet [Sat, 30 Jan 2021 16:52:54 +0000 (11:52 -0500)]
[Ada] Add support for folding more and/or expressions
gcc/ada/
* sem_eval.adb (Eval_Logical_Op, Test_Expression_Is_Foldable):
Add support for folding more "and"/"or" expressions.
* exp_util.adb (Side_Effect_Free): Fix handling of membership
tests.
* gen_il-gen.adb (To_Bit_Offset): Use 'Base to avoid overflow in
computations in Last_Bit when Offset = 'Last.
(Choose_Offset): Give a better error message when we run out of
fields. In particular, point out that
Gen_IL.Internals.Bit_Offset'Last needs to be increased.
Bob Duff [Thu, 25 Feb 2021 15:38:55 +0000 (10:38 -0500)]
[Ada] Variable-sized node types -- cleanup
gcc/ada/
* atree.ads, einfo-utils.ads, einfo-utils.adb, fe.h, gen_il.adb,
gen_il.ads, gen_il-gen-gen_entities.adb,
gen_il-gen-gen_nodes.adb, sem_ch12.adb, sem_ch3.adb,
sem_util.adb, sinfo-utils.ads, treepr.adb, types.ads: Clean up
??? comments and other comments.
* atree.adb: Clean up ??? comments and other comments.
(Validate_Node): Fix bug: "Off_0 (N) < Off_L (N)"
should be "Off_0 (N) <= Off_L (N)".
* gen_il-gen.adb, gen_il-gen.ads: Clean up ???
comments and other comments. Add support for getter-specific
and setter-specific preconditions. Detect the error of putting
a field in the wrong subrange. Misc cleanup.
(Node_Field vs. Entity_Field): Clean up Nmake. Improve
comments.
* gen_il-utils.ads: Misc cleanup. Move...
* gen_il-internals.ads: ... here.
* gen_il-utils.adb: Misc cleanup. Move...
* gen_il-internals.adb: ... here.
* gen_il-fields.ads: Move Was_Default_Init_Box_Association,
which was in the wrong subrange. Add comments. Misc cleanup.
* gen_il-types.ads: Add Named_Access_Kind.
* sinfo-cn.adb: Clean up ??? comments and other comments.
Remove redundant assertions.
* einfo.ads, sinfo.ads: Clean up ??? comments and other
comments. Remove all the comments indicating field offsets.
These are obsolete now that Gen_IL computes the offsets
automatically.
Steve Baird [Thu, 18 Feb 2021 01:54:53 +0000 (17:54 -0800)]
[Ada] Avoid inappropriate error messages regarding aggregates and variant parts
gcc/ada/
* sem_util.adb (Gather_Components): Factor the test that was
already being used to govern emitting a pre-Ada_2020 error
message into an expression function,
OK_Scope_For_Discrim_Value_Error_Messages. Call that new
function in two places: the point where the same test was being
performed previously, and in governing emission of a newer
Ada_2020 error message. In both cases, the out-mode parameter
Gather_Components.Report_Errors is set to True even if no error
messages are generated within Gather_Components.
* sem_util.ads: Correct a comment.
Bob Duff [Sat, 13 Feb 2021 21:43:22 +0000 (16:43 -0500)]
[Ada] Fix bug in subtype of private type with invariants
gcc/ada/
* sem_util.adb (Propagate_Invariant_Attributes): Call
Set_Has_Own_Invariants on the base type, because these are
Base_Type_Only. The problem is that the base type of a type is
indeed a base type when Set_Base_Type is called, but then the
type is mutated into a subtype in rare cases.
* atree.ads, atree.adb (Is_Entity): Export. Correct subtype of
parameter in body.
* gen_il-gen.adb: Improve getters so that "Pre => ..." can refer
to the value of the field. Put Warnings (Off) on some with
clauses that are not currently used, but might be used by such
Pre's.
Piotr Trojanek [Wed, 24 Feb 2021 18:39:21 +0000 (19:39 +0100)]
[Ada] Robust switching from incomplete to access types
gcc/ada/
* sem_ch3.adb (Access_Type_Declaration): Add comments to explain
the ordering of Mutate_Kind and Set_Directly_Designated_Type;
remove temporary setting of Ekind to E_Access_Type for building
_master objects, since now the Ekind is already set to its final
value. Move repeated code into Setup_Access_Type routine and use
it so that Process_Subtype is executed before mutating the kind
of the type entity.
* gen_il-gen-gen_entities.adb (Gen_Entities): Remove
Directly_Designated_Type from E_Void, E_Private_Record,
E_Limited_Private_Type and Incomplete_Kind; now it only belongs
to Access_Kind entities.
* sem_util.adb: Minor reformatting.
Jakub Jelinek [Tue, 15 Jun 2021 09:36:47 +0000 (11:36 +0200)]
expr: Fix up VEC_PACK_TRUNC_EXPR expansion [PR101046]
The following testcase ICEs, because we have a mode mismatch.
VEC_PACK_TRUNC_EXPR's operands have different modes from the result
(same vector mode size but twice as large element),
but we were passing non-NULL subtarget with the mode of the result
to the expansion of its arguments, so the VEC_PERM_EXPR in one of the
operands which had V8SImode operands and result had V16HImode target.
Fixed by clearing the subtarget if we are changing mode.
2021-06-15 Jakub Jelinek <jakub@redhat.com>
PR target/101046
* expr.c (expand_expr_real_2) <case VEC_PACK_FIX_TRUNC_EXPR,
case VEC_PACK_TRUNC_EXPR>: Clear subtarget when changing mode.
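A scalar model of the operation may help show why the operand and result modes differ. The function name pack_trunc and the lane ordering (first operand in the low lanes) are assumptions for illustration, not GCC's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model: truncate two vectors of 8 x 32-bit elements into one
   vector of 16 x 16-bit elements.  Each operand occupies the same number
   of bytes as the result, but with elements twice as wide.  */
void pack_trunc(const uint32_t lo[8], const uint32_t hi[8], uint16_t out[16]) {
  assert(lo != NULL && hi != NULL && out != NULL);
  for (int i = 0; i < 8; i++) {
    out[i] = (uint16_t)lo[i];      /* keep the low 16 bits */
    out[i + 8] = (uint16_t)hi[i];
  }
}
```

A target register suited to the 16-element result is therefore the wrong shape for evaluating either 8-element operand, which is the mismatch the fix avoids.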
Richard Biener [Mon, 14 Jun 2021 13:36:57 +0000 (15:36 +0200)]
Handle multiple latches in irreducible region mark
The following makes irreducible region discovery handle multiple latches.
2021-06-14 Richard Biener <rguenther@suse.de>
* cfgloopanal.c (mark_irreducible_loops): Use a dominance
check to identify loop latches.
* cfgloop.c (verify_loop_structure): Likewise.
* loop-init.c (apply_loop_flags): Allow marked irreducible
regions even with multiple latches.
* predict.c (rebuild_frequencies): Simplify.
Robin Dapp [Tue, 15 Jun 2021 07:06:15 +0000 (09:06 +0200)]
testsuite: Fix Wattributes test cases for s390 and add new tests.
There are several FAILs because we have an s390-specific check for a
warning which is not necessary anymore. Remove it.
Add a new test case that expects a warning about conflicting function
alignment. This would fail on s390 before but most likely on other
targets as well so it can be a target-independent test.
Also, add a test to verify that we do not emit a note when specifying
conflicting alignment for the same declaration. Need to explicitly
handle every dg-note because handling one disables dg-note pruning.
gcc/testsuite/ChangeLog:
* c-c++-common/Wattributes.c: Remove s390-specific check and add
new tests.
* gcc.dg/Wattributes-6.c: Likewise.
Robin Dapp [Tue, 15 Jun 2021 07:06:02 +0000 (09:06 +0200)]
c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.
When re-declaring a function with differing attributes DECL_USER_ALIGN
is usually not merged/copied when DECL_ALIGN is similar. On s390 this
will cause a warning message not to be shown. Similarly, we warned
about the wrong alignment when short-circuiting an alignment initialization in
common_handle_aligned_attribute ().
Fix this by copying DECL_USER_ALIGN even if DECL_ALIGN is similar as
well as getting rid of the short-circuited initialization.
gcc/c-family/ChangeLog:
* c-attribs.c (common_handle_aligned_attribute): Remove short
circuit and dead code.
gcc/c/ChangeLog:
* c-decl.c (merge_decls): Copy DECL_USER_ALIGN if DECL_ALIGN is
similar.
Martin Sebor [Mon, 14 Jun 2021 22:34:48 +0000 (16:34 -0600)]
Teach compute_objsize about placement new [PR100876].
Resolves:
PR c++/100876 - -Wmismatched-new-delete should understand placement new when it's not inlined
gcc/ChangeLog:
PR c++/100876
* builtins.c (gimple_call_return_array): Check for attribute fn spec.
Handle calls to placement new.
(ndecl_dealloc_argno): Avoid placement delete.
gcc/testsuite/ChangeLog:
PR c++/100876
* g++.dg/warn/Wmismatched-new-delete-4.C: New test.
* g++.dg/warn/Wmismatched-new-delete-5.C: New test.
* g++.dg/warn/Wstringop-overflow-7.C: New test.
* g++.dg/warn/Wfree-nonheap-object-6.C: New test.
* g++.dg/analyzer/placement-new.C: Prune out expected warning.
Peter Bergner [Mon, 14 Jun 2021 21:55:18 +0000 (16:55 -0500)]
rs6000: MMA builtin usage ICEs when used in a #pragma omp parallel and using -fopenmp [PR100777]
Using an MMA builtin within an OpenMP parallel code block leads to an SSA
verification ICE on the temporaries we create while expanding the MMA builtins
at gimple time. The solution is to use create_tmp_reg_or_ssa_name(), which
knows when to create either an SSA or register temporary.
2021-06-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/100777
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Use
create_tmp_reg_or_ssa_name().
gcc/testsuite/
PR target/100777
* gcc.target/powerpc/pr100777.c: New test.
Andrew MacLeod [Mon, 14 Jun 2021 19:33:59 +0000 (15:33 -0400)]
Limit new value calculations to first order effects.
When utilizing poor values during propagation, we mostly care about values that
were undefined/processed directly used in calculating the SSA_NAME being
processed. 2nd level derivations of such poor values rarely affect the
initial calculation. Leave them to when they are directly encountered.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust.
(ranger_cache::enable_new_values): Set to specified value and
return the old value.
(ranger_cache::disable_new_values): Delete.
(ranger_cache::fill_block_cache): Disable non 1st order derived
poor values.
* gimple-range-cache.h (ranger_cache): Adjust prototypes.
* gimple-range.cc (gimple_ranger::range_of_expr): Adjust.
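The "set to specified value and return the old value" interface lends itself to a save-and-restore pattern. A minimal C sketch follows; struct cache is a hypothetical stand-in and not GCC's ranger_cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cache state; only the flag is modelled.  */
struct cache {
  bool new_values;
};

/* Set the flag to VALUE and return the previous setting, so the caller
   can restore it later.  */
bool enable_new_values(struct cache *c, bool value) {
  assert(c != NULL);
  bool old = c->new_values;
  c->new_values = value;
  return old;
}
```

A caller would save the result of enable_new_values(c, false), do its work, and then pass the saved value back in. That restores the prior state whatever it was, which a separate disable function cannot do.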
Jonathan Wakely [Mon, 14 Jun 2021 19:31:00 +0000 (20:31 +0100)]
libstdc++: Fix common_reference for non-reference results [PR100894]
The result of COMMON-REF(A&, B&&) where they have no common reference
type should not be a reference. The implementation of COMMON-REF fails
to check that the result is a reference, so is well-formed when it
shouldn't be. This means that common_reference uses that result when it
shouldn't.
The fix is to reject the result of COMMON-REF(A, B) if it's not a
reference, so that common_reference falls through to the next case,
which uses COND-RES, which yields a non-reference result.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100894
* include/std/type_traits (__common_ref_impl<X&, Y&>): Only
use the type if it's a reference.
* testsuite/20_util/common_reference/100894.cc: New test.
Uros Bizjak [Mon, 14 Jun 2021 18:56:18 +0000 (20:56 +0200)]
i386: Split V2HImode *punpckwd to SSE instruction [PR101058]
V2HImode *punpckwd should not be split to the insn that depends on
TARGET_MMX_WITH_SSE, since the latter is disabled on 32bit targets.
Also return true early from ix86_vectorize_vec_perm_const when testing
with V2HI mode. *punpckwd can be used to implement all permutations.
2021-06-14 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/101058
* config/i386/i386-expand.c (ix86_vectorize_vec_perm_const):
Return true early when testing with V2HImode.
* config/i386/mmx.md (*punpckwd): Split to sse2_pshuflw_1.
gcc/testsuite/
PR target/101058
* gcc.target/i386/pr101058.c: New test.
Christophe Lyon [Thu, 3 Jun 2021 14:35:50 +0000 (14:35 +0000)]
arm: Auto-vectorization for MVE: add pack/unpack patterns
This patch adds vec_unpack<US>_hi_<mode>, vec_unpack<US>_lo_<mode>,
vec_pack_trunc_<mode> patterns for MVE.
It does so by moving the unpack patterns from neon.md to
vec-common.md, while adding MVE support to them. The pack expander is
derived from the Neon one (which in turn is renamed into
neon_quad_vec_pack_trunc_<mode>).
The patch introduces mve_vec_unpack<US>_lo_<mode> and
mve_vec_unpack<US>_hi_<mode> which are similar to their Neon
counterparts, except for the assembly syntax.
The patch introduces mve_vec_pack_trunc_lo_<mode> to avoid the need for a
zero-initialized temporary, which is needed if the
vec_pack_trunc_<mode> expander calls @mve_vmovn[bt]q_<supf><mode>
instead.
With this patch, we can now vectorize the 16 and 8-bit versions of
vclz and vshl, although the generated code could still be improved.
For test_clz_s16, we now generate
vldrh.16 q3, [r1]
vmovlb.s16 q2, q3
vmovlt.s16 q3, q3
vclz.i32 q2, q2
vclz.i32 q3, q3
vmovnb.i32 q1, q2
vmovnt.i32 q1, q3
vstrh.16 q1, [r0]
which could be improved to
vldrh.16 q3, [r1]
vclz.i16 q1, q3
vstrh.16 q1, [r0]
if we could avoid the need for unpack/pack steps.
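For context, a loop of roughly the following shape is the kind of code being vectorized. This is a hypothetical reconstruction assuming a 32-bit int, and the real testcase may differ. Because each 16-bit element is promoted before the count, the operation is performed on 32-bit values, which is one plausible reason the unpack/pack steps appear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical test_clz_s16-style loop over one 128-bit vector's worth of
   16-bit elements.  */
void test_clz_s16(int16_t *restrict dest, const int16_t *restrict a) {
  for (int i = 0; i < 8; i++) {
    /* __builtin_clz (0) is undefined; the check is for this sketch only
       and would be omitted from a vectorization testcase.  */
    assert(a[i] != 0);
    /* a[i] is widened to 32 bits, so leading zeros are counted over 32
       bits rather than 16 before the result is narrowed again.  */
    dest[i] = (int16_t)__builtin_clz((unsigned int)a[i]);
  }
}
```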
Richard Biener [Mon, 14 Jun 2021 12:57:26 +0000 (14:57 +0200)]
tree-optimization/100934 - properly mark irreducible regions for DOM
The jump threading code requires marked irreducible regions for the
purpose of validating jump threading paths but DOM fails to provide
that, resulting in missed number of iteration upper bounds clearing.
2021-06-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/100934
* tree-ssa-dom.c (pass_dominator::execute): Properly
mark irreducible regions.
Jonathan Wakely [Mon, 14 Jun 2021 12:17:40 +0000 (13:17 +0100)]
libstdc++: Add explicit -std=gnu++17 option to test
This test has no -std option so when the testsuite is run with
-std=gnu++20 or later, this test will use that. The recent addition of
no_unique_address will cause it to FAIL, because that's a reserved word
after C++17. Add an explicit option, so that this test always uses
exactly C++17.