dje [Wed, 5 Jun 2019 16:45:57 +0000 (16:45 +0000)]
* config/rs6000/aix-unwind.h (LR_REGNO): Rename to R_LR.
(CR2_REGNO): Rename to R_CR2.
(XER_REGNO): Rename to R_XER.
(FIRST_ALTIVEC_REGNO): Rename to R_FIRST_ALTIVEC.
(VRSAVE_REGNO): Rename to R_VRSAVE.
(VSCR_REGNO): Rename to R_VSCR.
ebotcazou [Wed, 5 Jun 2019 14:14:40 +0000 (14:14 +0000)]
* fold-const.c (extract_muldiv_1) <PLUS_EXPR>: Do not distribute a
multiplication by a power-of-two value.
(fold_plusminus_mult_expr): Use pow2p_hwi to spot a power-of-two value
and turn the modulo operation into a masking operation.
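A minimal sketch of the identity behind the second item, assuming unsigned operands: when the divisor is a power of two, a modulo can be replaced by a mask. The helpers `is_pow2` and `mod_pow2` are hypothetical stand-ins written for this note, not GCC's `pow2p_hwi` or the code in fold-const.c.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit of x is set, i.e. x is a power of two.
// Hypothetical helper; it only mirrors the idea of a power-of-two test.
constexpr bool is_pow2(std::uint64_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// For unsigned n and a power-of-two d, n % d equals n & (d - 1).
inline std::uint64_t mod_pow2(std::uint64_t n, std::uint64_t d) {
  assert(is_pow2(d));  // the mask form is only valid for powers of two
  return n & (d - 1);
}
```

For example, `mod_pow2(37, 8)` gives the same result as `37 % 8`, and a divisibility test `n % d == 0` becomes `(n & (d - 1)) == 0`.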
samtebbs [Wed, 5 Jun 2019 11:06:56 +0000 (11:06 +0000)]
[PATCH][GCC][AARCH64] Add tests for pointer authentication B-key
gcc/testsuite/ChangeLog
* gcc.target/aarch64/return_address_sign_b_1.c: New file.
* gcc.target/aarch64/return_address_sign_b_2.c: New file.
* gcc.target/aarch64/return_address_sign_b_3.c: New file.
* gcc.target/aarch64/return_address_sign_builtin.c: New file.
* g++.target/aarch64/return_address_sign_ab_exception.C: New file.
* g++.target/aarch64/return_address_sign_b_exception.C: New file.
jakub [Wed, 5 Jun 2019 07:37:40 +0000 (07:37 +0000)]
* omp-low.c (lower_rec_input_clauses): For lastprivate conditional
references, look up the MEM_REF operand in the hash map instead of the
MEM_REF itself.
(lower_omp_1): When looking for lastprivate conditional assignments,
handle MEM_REFs with REFERENCE_TYPE operands.
* testsuite/libgomp.c++/lastprivate-conditional-1.C: New test.
* testsuite/libgomp.c++/lastprivate-conditional-2.C: New test.
jakub [Wed, 5 Jun 2019 07:36:30 +0000 (07:36 +0000)]
* omp-low.c (lower_rec_input_clauses): Force max_vf if is_simd and
on privatization clauses OMP_CLAUSE_DECL is privatized by reference
and references a VLA. Handle references to non-VLAs, if is_simd, on
all privatization clauses the way reductions are handled.
(lower_rec_input_clauses) <case do_private, case do_firstprivate>:
If omp_is_reference, always use omp simd arrays and set
DECL_VALUE_EXPR in that case; if lower_rec_simd_input_clauses
fails, emit reference initialization.
ian [Wed, 5 Jun 2019 00:18:17 +0000 (00:18 +0000)]
compiler: statically allocate constant interface data
When converting a constant to interface, such as interface{}(42)
or interface{}("hello"), if the interface escapes, we currently
generate a heap allocation to hold the constant value.
This CL changes it to generate a static allocation instead, as
the gc compiler does. This reduces allocations in such cases.
segher [Tue, 4 Jun 2019 23:38:35 +0000 (23:38 +0000)]
rs6000: Update direct-move* testcases
This fixes some testcases that the last fifteen or so patches broke.
In all these cases we no longer need to set VSX_REG_ATTR: the default
value of "wa" is correct.
segher [Tue, 4 Jun 2019 23:30:43 +0000 (23:30 +0000)]
rs6000: Remove Ftrad, Fvsx, Fs; add s and sd
This removes the <Ftrad>, <Fvsx>, and <Fs> mode attributes, and creates
new <sd> and <s> mode attributes instead. <sd> is either "s" or "d",
depending on whether the mode is single-precision or double-precision
floating point; and <s> is either "s" or nothing.
* config/rs6000/rs6000.md (SFDF, SFDF2): Adjust comments.
(define_mode_attr sd): New.
(define_mode_attr s): New.
(define_mode_attr Ftrad): Delete.
(define_mode_attr Fvsx): Delete.
(define_mode_attr Fs): Delete.
(rest of file): Use the new mode attributes.
* config/rs6000/vsx.md: Use the new mode attributes.
segher [Tue, 4 Jun 2019 23:27:57 +0000 (23:27 +0000)]
rs6000: Simplify VS[ra]* for VSX_[BDF]
When used in VSX_B, VSX_D, or VSX_F, both <VSr> and <VSa> are always
just "wa" now. Similarly <VSr2> and <VSr3>. The former of those is
always "wa", so we can remove the mode attribute completely.
* config/rs6000/vsx.md (define_mode_attr VSr2): Delete.
(rest of file): Replace all <VSa>, <VSr>, <VSr2>, and <VSr3> that are
used with VSX_B, VSX_D, or VSX_F, with just "wa".
segher [Tue, 4 Jun 2019 16:28:46 +0000 (16:28 +0000)]
rs6000: wv -> v+p7v
"wv" is "v", but only if VSX is enabled (otherwise it's NO_REGS). So
this patch sets "isa" "p7v" to all alternatives that used "wv" before
(and that do not already need a later ISA), and changes the constraint.
nsz [Tue, 4 Jun 2019 16:16:52 +0000 (16:16 +0000)]
aarch64: fix asm visibility for extern symbols
Commit r271869 broke visibility declarations in asm for extern symbols, because
the new ASM_OUTPUT_EXTERNAL hook failed to call the default hook for elf.
jason [Tue, 4 Jun 2019 14:48:38 +0000 (14:48 +0000)]
PR c++/60531 - Wrong error about unresolved overloaded function
For PR60531, GCC wrongly rejects function templates with explicitly
specified template arguments as overloaded. They are resolved by
resolve_nondeduced_context, which is normally called by
cp_default_conversion through decay_conversion, but the latter have
extra effects making them unusable here. Calling the former directly
does work.
* typeck.c (cp_build_binary_op): See if overload can be resolved.
(cp_build_unary_op): Ditto.
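A small illustration of the kind of code this entry is about, as far as the description above goes: a function template named with explicit template arguments, used as an operand of a binary and of a unary operator. The template `zero` and both wrapper functions are hypothetical names, and this is not the PR's testcase. Compilers without the fix may reject it.

```cpp
#include <cassert>

// Hypothetical function template, used only for illustration.
template <class T> T zero() { return T(); }

// Binary operator: each operand names the single specialization zero<int>,
// so both decay to the same function pointer.
inline bool same_specialization() { return zero<int> == zero<int>; }

// Unary operator: '+' applies the function-to-pointer conversion.
inline int call_through_pointer() {
  int (*fp)() = +zero<int>;
  assert(fp != nullptr);
  return fp();
}
```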
jason [Tue, 4 Jun 2019 14:47:40 +0000 (14:47 +0000)]
Reduce accumulated garbage in constexpr evaluation.
We want to evaluate the arguments to a call before looking into the cache so
that we have constant values, but if we then find the call in the cache we
end up with a TREE_LIST that we don't end up using; in highly recursive
constexpr evaluation this ends up being a large proportion of the garbage
generated.
The cxx_eval_increment_expression hunk is less important, but it's an easy
tweak; we only use the MODIFY_EXPR to evaluate it, so after that it's
garbage.
* constexpr.c (cxx_eval_call_expression): ggc_free any bindings we
don't save.
(cxx_eval_increment_expression): ggc_free the MODIFY_EXPR after
evaluating it.
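For reference, a sketch of what "highly recursive constexpr evaluation" can look like in user code. The function `fib` is a hypothetical example chosen for this note; how many of its calls are actually served from the evaluator's cache is an assumption, not something measured here.

```cpp
// Naive recursive Fibonacci. Every call evaluates its constant argument and
// then makes two further constexpr calls, many of which repeat earlier ones.
constexpr unsigned long fib(unsigned n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Forces the evaluation to happen at compile time.
static_assert(fib(20) == 6765, "fib(20) is evaluated by the constexpr evaluator");
```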
jakub [Tue, 4 Jun 2019 12:49:03 +0000 (12:49 +0000)]
* gimplify.c (gimplify_scan_omp_clauses): Don't sorry_at on lastprivate
conditional on combined for simd.
* omp-low.c (struct omp_context): Add combined_into_simd_safelen0
member.
(lower_rec_input_clauses): For gimple_omp_for_combined_into_p max_vf 1
constructs, don't remove lastprivate_conditional_map, but instead set
ctx->combined_into_simd_safelen0 and adjust hash_map, so that it points
to parent construct temporaries.
(lower_lastprivate_clauses): Handle ctx->combined_into_simd_safelen0
like !ctx->lastprivate_conditional_map.
(lower_omp_1) <case GIMPLE_ASSIGN>: If up->combined_into_simd_safelen0,
use up->outer context instead of up.
* omp-expand.c (expand_omp_for_generic): Perform cond_var bump even if
gimple_omp_for_combined_p.
(expand_omp_for_static_nochunk): Likewise.
(expand_omp_for_static_chunk): Add forgotten cond_var bump that was
probably moved over into expand_omp_for_generic rather than being copied
there.
gcc/cp/
* cp-tree.h (CP_OMP_CLAUSE_INFO): Allow for any clauses up to _condvar_
instead of only up to linear.
gcc/testsuite/
* c-c++-common/gomp/lastprivate-conditional-2.c (foo): Don't expect
a sorry_at on any of the clauses.
libgomp/
* testsuite/libgomp.c-c++-common/lastprivate-conditional-7.c: New test.
* testsuite/libgomp.c-c++-common/lastprivate-conditional-8.c: New test.
* testsuite/libgomp.c-c++-common/lastprivate-conditional-9.c: New test.
* testsuite/libgomp.c-c++-common/lastprivate-conditional-10.c: New test.
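A minimal sketch of the clause these entries deal with, assuming a compiler with OpenMP 5.0 `lastprivate(conditional:)` support. The function name `last_negative_index` is hypothetical, and the sketch assumes the input has at least one negative element. Without OpenMP enabled, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Returns the index of the last negative element of v.
// Sketch only: assumes v is non-empty and contains a negative element.
int last_negative_index(const std::vector<int>& v) {
  assert(!v.empty());
  int last = -1;
  const int n = static_cast<int>(v.size());
  // After the loop, 'last' holds the value written by the sequentially last
  // iteration that assigned it, not merely by the last iteration executed.
  #pragma omp parallel for lastprivate(conditional: last)
  for (int i = 0; i < n; ++i) {
    if (v[i] < 0)
      last = i;  // conditional assignment
  }
  return last;
}
```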
rguenth [Tue, 4 Jun 2019 08:09:16 +0000 (08:09 +0000)]
2019-06-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/90738
Revert
2019-06-03 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Get original
full reference tree and record in ref->ref.
(vn_reference_lookup_3): Pass in original ref to
ao_ref_init_from_vn_reference.
(vn_reference_lookup): Likewise.
* tree-ssa-sccvn.h (ao_ref_init_from_vn_reference): Adjust prototype.
* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p):
Handle non-decl bases in the original reference.
marxin [Tue, 4 Jun 2019 07:53:08 +0000 (07:53 +0000)]
IPA ICF: use fibonacci heap instead of list as a worklist.
2019-06-04 Martin Liska <mliska@suse.cz>
* ipa-icf.c (sem_item_optimizer::add_item_to_class): Count
number of references.
(sem_item_optimizer::do_congruence_step):
(sem_item_optimizer::worklist_push): Dump how many references
a class has.
(sem_item_optimizer::worklist_pop): Use heap.
(sem_item_optimizer::process_cong_reduction): Likewise.
* ipa-icf.h: Use fibonacci_heap instead of std::list.
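A rough sketch of a heap-ordered worklist. It uses `std::priority_queue` as a stand-in for GCC's `fibonacci_heap`, and it assumes the key is the reference count with the smallest key popped first. The types `CongruenceClass`, `MoreReferences` and `Worklist` are hypothetical, not the ones in ipa-icf.h.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for a congruence class; only what the sketch needs.
struct CongruenceClass {
  int id;
  unsigned referenced_by_count;
};

// Comparator that puts the class with the fewest references on top.
struct MoreReferences {
  bool operator()(const CongruenceClass* a, const CongruenceClass* b) const {
    return a->referenced_by_count > b->referenced_by_count;
  }
};

class Worklist {
 public:
  void push(const CongruenceClass* cls) { heap_.push(cls); }
  bool empty() const { return heap_.empty(); }

  // Removes and returns the class with the smallest reference count.
  const CongruenceClass* pop() {
    assert(!heap_.empty());
    const CongruenceClass* top = heap_.top();
    heap_.pop();
    return top;
  }

 private:
  std::priority_queue<const CongruenceClass*,
                      std::vector<const CongruenceClass*>,
                      MoreReferences> heap_;
};
```

Unlike a plain list, the pop order here depends on the key rather than on insertion order.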
marxin [Tue, 4 Jun 2019 07:52:51 +0000 (07:52 +0000)]
IPA ICF: rewrite references into a hash_map.
2019-06-04 Martin Liska <mliska@suse.cz>
* ipa-icf.h (struct sem_usage_pair_hash): New.
(sem_usage_pair_hash::hash): Likewise.
(sem_usage_pair_hash::equal): Likewise.
(struct sem_usage_hash): Likewise.
* ipa-icf.c (sem_item::sem_item): Initialize
referenced_by_count.
(sem_item::add_reference): Register a reference
in ref_map and not in target->usages.
(sem_item::setup): Remove initialization of
dead vectors.
(sem_item::~sem_item): Remove usage of dead vectors.
(sem_item::dump): Remove dump of references.
(sem_item_optimizer::sem_item_optimizer): Initialize
m_references.
(sem_item_optimizer::read_section): Remove useless
dump.
(sem_item_optimizer::parse_funcs_and_vars): Likewise here.
(sem_item_optimizer::build_graph): Pass m_references
to ::add_reference.
(sem_item_optimizer::verify_classes): Remove usage of dead
vectors.
(sem_item_optimizer::traverse_congruence_split): Return true
when a class is split.
(sem_item_optimizer::do_congruence_step_for_index): Use
hash_map for look up of (sem_item *, index). That brings
significant speed up.
(sem_item_optimizer::do_congruence_step): Return true
when a split is done.
(congruence_class::is_class_used): Use referenced_by_count.
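A sketch of a lookup keyed by (item, index), using `std::unordered_map` as a stand-in for GCC's `hash_map`. All names here (`Item`, `UsageKey`, `UsageKeyHash`, `ReferenceMap`, `add_reference`, `users_of`) are hypothetical and only illustrate the data structure, not the actual ipa-icf code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

struct Item { int id; };  // hypothetical stand-in for sem_item

// Key: the referenced item plus the operand index of the reference.
using UsageKey = std::pair<const Item*, unsigned>;

struct UsageKeyHash {
  std::size_t operator()(const UsageKey& k) const {
    std::size_t h1 = std::hash<const Item*>()(k.first);
    std::size_t h2 = std::hash<unsigned>()(k.second);
    return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));  // simple hash mix
  }
};

using ReferenceMap =
    std::unordered_map<UsageKey, std::vector<const Item*>, UsageKeyHash>;

// Records that 'source' references 'target' at operand position 'index'.
void add_reference(ReferenceMap& refs, const Item* source, const Item* target,
                   unsigned index) {
  assert(source != nullptr && target != nullptr);
  refs[UsageKey(target, index)].push_back(source);
}

// Returns the users of (target, index), or nullptr when there are none.
const std::vector<const Item*>* users_of(const ReferenceMap& refs,
                                         const Item* target, unsigned index) {
  ReferenceMap::const_iterator it = refs.find(UsageKey(target, index));
  return it == refs.end() ? nullptr : &it->second;
}
```

Each query is a single hash lookup on the pair, rather than a scan over per-item usage vectors.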
2019-06-04 Martin Liska <mliska@suse.cz>
ian [Mon, 3 Jun 2019 23:37:04 +0000 (23:37 +0000)]
compiler, runtime, reflect: generate unique type descriptors
Currently, the compiler already generates common symbols for type
descriptors, so the type descriptors are unique. However, when a
type is created through reflection, it is not deduplicated with
compiler-generated types. As a consequence, we cannot assume type
descriptors are unique, and cannot use pointer equality to
compare them. Also, when constructing a reflect.Type, it has to
go through a canonicalization map, which introduces overhead to
reflect.TypeOf and lock contention in concurrent programs.
In order for the reflect package to deduplicate types with
compiler-created types, we register all the compiler-created type
descriptors at startup time. The reflect package, when it needs
to create a type, looks up the registry of compiler-created types
before creating a new one. There is no lock contention since the
registry is read-only after initialization.
This lets us get rid of the canonicalization map, and also makes
it possible to compare type descriptors with pointer equality.
ian [Mon, 3 Jun 2019 23:07:54 +0000 (23:07 +0000)]
libgo: delay applying profile stack-frame skip until fixup
When the runtime collects a stack trace to associate it with some
profiling event (mem alloc, mutex, etc) there is a skip count passed
to runtime.Callers (or equivalent) to skip some known count of frames
in order to get to the "interesting" frame corresponding to the
profile event. Now that the profiling mechanism uses lazy fixup (when
removing compiler artifacts like thunks, morestack calls etc), we also
need to move the frame skipping logic after the fixup, so as to ensure
that the skip count isn't thrown off by these artifacts.
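The ordering issue can be sketched outside the runtime. This is a C++ model, not libgo code; `Frame`, `fixup` and `fixup_then_skip` are hypothetical names, and the frame names in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Frame {
  std::string function;
  bool compiler_artifact;  // e.g. a thunk or a morestack call
};

// Step 1 (fixup): drop compiler-generated frames from the raw trace.
std::vector<Frame> fixup(const std::vector<Frame>& raw) {
  std::vector<Frame> out;
  for (const Frame& f : raw)
    if (!f.compiler_artifact) out.push_back(f);
  return out;
}

// Step 2: apply the skip count to the cleaned trace, so that artifacts
// cannot shift which frame ends up first.
std::vector<Frame> fixup_then_skip(const std::vector<Frame>& raw,
                                   std::size_t skip) {
  std::vector<Frame> cleaned = fixup(raw);
  assert(skip <= cleaned.size());
  cleaned.erase(cleaned.begin(),
                cleaned.begin() + static_cast<std::ptrdiff_t>(skip));
  return cleaned;
}
```

With a raw trace of `callers`, `thunk` (artifact), `profile_hook`, `user_function` and a skip count of 2, skipping first would leave `profile_hook` at the top once the artifact is removed. Skipping after the fixup leaves `user_function`.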
ian [Mon, 3 Jun 2019 23:04:23 +0000 (23:04 +0000)]
compiler: permit inlining references to global variables
This requires tracking all references to unexported variables, so that
we can make them global symbols in the object file, and can export
them so that other compilations can see the right definition for their
own inline bodies.
This introduces a syntax for referencing names defined in other
packages: a <pNN> prefix, where NN is the package index. This will
need to be added to gccgoimporter, but I didn't do it yet since it
isn't yet possible to create an object for which gccgoimporter will
see a <pNN> prefix.
This increases the number of inlinable functions in the standard
library from 181 to 215, adding functions like context.Background.
segher [Mon, 3 Jun 2019 22:33:11 +0000 (22:33 +0000)]
rs6000: Delete wg
The "wg" constraint is used for the floating point side on mfpgpr
instructions. Those instructions do not exist on any relevant
hardware. This patch deletes the constraint and the insns using it.
wilco [Mon, 3 Jun 2019 13:55:15 +0000 (13:55 +0000)]
Fix PR64242 - Longjmp expansion incorrect
Improve the fix for PR64242. Various optimizations can change a memory
reference into a frame access. Given there are multiple virtual frame pointers
which may be replaced by multiple hard frame pointers, there are no checks for
writes to the various frame pointers. So updates to a frame pointer tend to
generate incorrect code. Improve the previous fix to also add clobbers of
several frame pointers and add a scheduling barrier. This should work in most
cases until GCC supports a generic "don't optimize across this instruction"
feature.
Bootstrap OK. Testcase passes on AArch64 and x86-64. Inspected x86, Arm,
Thumb-1 and Thumb-2 assembler which looks correct.
nsz [Mon, 3 Jun 2019 13:50:53 +0000 (13:50 +0000)]
aarch64: emit .variant_pcs for aarch64_vector_pcs symbol references
A dynamic linker with lazy binding support may need to handle vector PCS
function symbols specially, so an ELF symbol table marking was
introduced for such symbols.
Function symbol references and definitions that follow the vector PCS
are marked in the generated assembly with .variant_pcs and then the
STO_AARCH64_VARIANT_PCS st_other flag is set on the symbol in the object
file. The marking is propagated to the dynamic symbol table by the
static linker so a dynamic linker can handle such symbols specially.
For this to work, the assembler, the static linker and the dynamic
linker all have to be updated on a system. An old assembler does not support
the new .variant_pcs directive, so a toolchain with old binutils won't
be able to compile code that references vector PCS symbols.
redi [Mon, 3 Jun 2019 13:22:59 +0000 (13:22 +0000)]
Enforce allocator::value_type consistency for containers in C++2a
In previous standards it is undefined for a container and its allocator
to have a different value_type. Libstdc++ has traditionally allowed it
as an extension, automatically rebinding the allocator to the
container's value_type. Since GCC 8.1 that extension has been disabled
for C++11 and later when __STRICT_ANSI__ is defined (i.e. for
-std=c++11, -std=c++14, -std=c++17 and -std=c++2a).
Since the acceptance of P1463R1 into the C++2a draft an incorrect
allocator::value_type now requires a diagnostic. This patch implements
that by enabling the static_assert for -std=gnu++2a as well.
* doc/xml/manual/status_cxx2020.xml: Document P1463R1 status.
* include/bits/forward_list.h [__cplusplus > 201703]: Enable
allocator::value_type assertion for C++2a.
* include/bits/hashtable.h: Likewise.
* include/bits/stl_deque.h: Likewise.
* include/bits/stl_list.h: Likewise.
* include/bits/stl_map.h: Likewise.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_vector.h: Likewise.
* testsuite/23_containers/deque/48101-3_neg.cc: New test.
* testsuite/23_containers/forward_list/48101-3_neg.cc: New test.
* testsuite/23_containers/list/48101-3_neg.cc: New test.
* testsuite/23_containers/map/48101-3_neg.cc: New test.
* testsuite/23_containers/multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/set/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_map/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_set/48101-3_neg.cc: New test.
* testsuite/23_containers/vector/48101-3_neg.cc: New test.
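A sketch of the condition being enforced, written as a standalone trait. `allocator_matches` and `IntVector` are hypothetical names; this is not the static_assert used inside the library headers.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical trait: does the allocator's value_type match the element type?
template <class T, class Alloc>
struct allocator_matches
    : std::is_same<typename std::allocator_traits<Alloc>::value_type, T> {};

static_assert(allocator_matches<int, std::allocator<int>>::value,
              "matching value_type is fine");
static_assert(!allocator_matches<int, std::allocator<long>>::value,
              "mismatched value_type is what the containers now diagnose");

// Well-formed: container and allocator agree on value_type.
using IntVector = std::vector<int, std::allocator<int>>;

// Per the description above, instantiating a mismatched container requires a
// diagnostic in C++2a mode (left commented out so the example compiles):
// std::vector<int, std::allocator<long>> bad;
```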
ktkachov [Mon, 3 Jun 2019 11:20:58 +0000 (11:20 +0000)]
[AArch64] Emit TARGET_DOTPROD-specific sequence for <us>sadv16qi
Wilco pointed out that when the Dot Product instructions are available we can use them
to generate an even more efficient expansion for the [us]sadv16qi optab.
Instead of the current:
uabdl2 v0.8h, v1.16b, v2.16b
uabal v0.8h, v1.8b, v2.8b
uadalp v3.4s, v0.8h
we can generate:
(1) mov v4.16b, 1
(2) uabd v0.16b, v1.16b, v2.16b
(3) udot v3.4s, v0.16b, v4.16b
Instruction (1) can be CSEd across multiple such expansions and even hoisted outside of loops,
so when this sequence appears frequently back-to-back (like in x264_r) we essentially only have 2 instructions
per sum. Also, the UDOT instruction does the byte-to-word accumulation in one step, which allows us to use
the much simpler UABD instruction before it.
This makes it a shorter and lower-latency sequence overall for targets that support it.
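A scalar model of the new sequence, to show why it computes a sum of absolute differences. The functions `uabd`, `udot`, `sad_dotprod`, `sad_reference`, `sum_lanes` and `check_sad` are hypothetical helpers that model the instructions on plain arrays. They are not intrinsics and not the expander's code, and the sketch compares only the total across lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;
using Words4 = std::array<std::uint32_t, 4>;

// Model of UABD: per-byte absolute difference.
Bytes16 uabd(const Bytes16& a, const Bytes16& b) {
  Bytes16 r{};
  for (std::size_t i = 0; i < 16; ++i)
    r[i] = static_cast<std::uint8_t>(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
  return r;
}

// Model of UDOT: each 32-bit lane accumulates the dot product of four
// adjacent byte pairs.
Words4 udot(Words4 acc, const Bytes16& x, const Bytes16& y) {
  for (std::size_t lane = 0; lane < 4; ++lane)
    for (std::size_t j = 0; j < 4; ++j)
      acc[lane] += static_cast<std::uint32_t>(x[4 * lane + j]) *
                   static_cast<std::uint32_t>(y[4 * lane + j]);
  return acc;
}

// Steps (1) to (3): a vector of ones, UABD, then UDOT against the ones.
Words4 sad_dotprod(const Words4& acc, const Bytes16& a, const Bytes16& b) {
  Bytes16 ones;
  ones.fill(1);
  return udot(acc, uabd(a, b), ones);
}

// Reference: plain sum of absolute differences over all 16 bytes.
std::uint32_t sad_reference(const Bytes16& a, const Bytes16& b) {
  std::uint32_t sum = 0;
  for (std::size_t i = 0; i < 16; ++i)
    sum += static_cast<std::uint32_t>(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
  return sum;
}

std::uint32_t sum_lanes(const Words4& w) { return w[0] + w[1] + w[2] + w[3]; }

// Debug helper: the lane total of the dot-product form equals the plain SAD.
void check_sad(const Bytes16& a, const Bytes16& b) {
  assert(sum_lanes(sad_dotprod(Words4{}, a, b)) == sad_reference(a, b));
}
```

Multiplying by a vector of ones turns the dot product into a plain widening sum, which is why the byte-wise UABD result can feed UDOT directly.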
rguenth [Mon, 3 Jun 2019 10:45:38 +0000 (10:45 +0000)]
2019-06-03 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Get original
full reference tree and record in ref->ref.
(vn_reference_lookup_3): Pass in original ref to
ao_ref_init_from_vn_reference.
(vn_reference_lookup): Likewise.
* tree-ssa-sccvn.h (ao_ref_init_from_vn_reference): Adjust prototype.
* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p):
Handle non-decl bases in the original reference.
alejandro [Mon, 3 Jun 2019 09:13:32 +0000 (09:13 +0000)]
Fix ICE in vect_slp_analyze_node_operations_1
This patch fixes bug 90681. It was caused by trying to SLP vectorize a
non-grouped load. We've fixed it by tweaking the implementation a bit: mark
masked loads as not vectorizable, but support them as a special case. We
then detect them in the test for normal non-grouped loads that was already
there.
tkoenig [Sun, 2 Jun 2019 15:18:22 +0000 (15:18 +0000)]
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* trans-expr.c (gfc_conv_subref_array_arg): If the size of the
expression can be determined to be one, treat it as contiguous.
Set likelihood of presence of an actual argument according to
PRED_FORTRAN_ABSENT_DUMMY and likelihood of being contiguous
according to PRED_FORTRAN_CONTIGUOUS.
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* predict.def (PRED_FORTRAN_CONTIGUOUS): New predictor.
2019-06-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90539
* gfortran.dg/internal_pack_24.f90: New test.
iains [Sat, 1 Jun 2019 19:59:30 +0000 (19:59 +0000)]
Darwin, x86, testsuite - adjust tests for Darwin PR90698.
We don't have support for -mcmodel={medium, large, kernel} so don't
expect tests for those things to work.
For now mark them as xfail where possible and skip them where it isn't.
These changes will be logged onto the PR and therefore can be backed
out when the facility is implemented.
gcc/testsuite/ChangeLog:
2019-06-01 Iain Sandoe <iain@sandoe.co.uk>
PR target/90698
* gcc.target/i386/pr49866.c: XFAIL for Darwin.
* gcc.target/i386/pr63538.c: Likewise.
* gcc.target/i386/pr61599-1.c: Skip for Darwin.
hubicka [Sat, 1 Jun 2019 16:36:49 +0000 (16:36 +0000)]
* alias.c: Include ipa-utils.h.
(get_alias_set): Try to complete ODR type via ODR type hash lookup.
* ipa-devirt.c (prevailing_odr_type): New.
* ipa-utils.h (prevailing_odr_type): Declare.
* g++.dg/lto/alias-1_0.C: New testcase.
* g++.dg/lto/alias-1_1.C: New testcase.
hjl [Fri, 31 May 2019 23:59:16 +0000 (23:59 +0000)]
i386: Don't insert ENDBR after NOTE_INSN_DELETED_LABEL
NOTE_INSN_DELETED_LABEL marks what used to be a 'code_label' that was
used for nothing other than taking its address, and that address cannot
be used as a target for indirect jumps.
Tested on Linux/x86-64 with -fcf-protection.
For x86-64 libc.so on glibc master branch (commit f43b8dd55588c3),
jakub [Fri, 31 May 2019 21:38:35 +0000 (21:38 +0000)]
* tree.h (OMP_CLAUSE__CONDTEMP__ITER): Define.
* gimplify.c (gimplify_scan_omp_clauses): Allow lastprivate conditional
on OMP_SIMD if not nested inside of worksharing loop that also has
lastprivate conditional clause for the same decl.
(gimplify_omp_for): Add _condtemp_ clauses to OMP_SIMD if needed.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE__CONDTEMP_ also
on simd.
(lower_rec_input_clauses): Likewise. Handle lastprivate conditional
on simd construct.
(lower_lastprivate_conditional_clauses): Handle lastprivate conditional
on simd construct.
(lower_lastprivate_clauses): Likewise.
(lower_omp_sections): Call lower_lastprivate_conditional_clauses before
calling lower_rec_input_clauses.
(lower_omp_for): Likewise.
(lower_omp_1): Use first rather than second OMP_CLAUSE__CONDTEMP_
clause on simd construct.
* omp-expand.c (expand_omp_simd): Initialize cond_var if
OMP_CLAUSE__CONDTEMP_ clause is present.
* c-c++-common/gomp/lastprivate-conditional-2.c (foo): Don't expect
a sorry on lastprivate conditional on simd construct.
* gcc.dg/vect/vect-simd-6.c: New test.
* gcc.dg/vect/vect-simd-7.c: New test.