Tom de Vries [Tue, 12 Feb 2019 14:00:59 +0000 (14:00 +0000)]
[libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc
The call to bsearch in dwarf_lookup_pc can have NULL as base argument when
the nmemb argument is 0. The base argument is required to be pointing to the
initial member of an array of nmemb objects. It is not specified what
constitutes a valid pointer to an array of 0 objects, but glibc declares base
with attribute non-null, so the NULL will trigger a sanitizer runtime error.
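One way to avoid the problem is to skip the call when the table is empty. The sketch below is illustrative only: unit_addrs, unit_addrs_search and lookup_pc are hypothetical names, not libbacktrace's actual definitions, and the real patch may be structured differently.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical address-range entry; not libbacktrace's real struct.
struct unit_addrs {
  unsigned long low;   // inclusive start of the range
  unsigned long high;  // exclusive end of the range
};

// bsearch callback: the key is an address, the entry is a [low, high) range.
static int unit_addrs_search(const void *vkey, const void *ventry) {
  const unsigned long *key = static_cast<const unsigned long *>(vkey);
  const unit_addrs *entry = static_cast<const unit_addrs *>(ventry);
  if (*key < entry->low)
    return -1;
  if (*key >= entry->high)
    return 1;
  return 0;
}

// Returns the entry containing pc, or nullptr if there is none.
// bsearch is never called with count == 0, so a null base never reaches it.
const unit_addrs *lookup_pc(const unit_addrs *addrs, std::size_t count,
                            unsigned long pc) {
  if (count == 0)
    return nullptr;
  assert(addrs != nullptr);  // a non-empty table must have storage
  return static_cast<const unit_addrs *>(
      std::bsearch(&pc, addrs, count, sizeof(unit_addrs), unit_addrs_search));
}
```

With this guard, an empty table yields "not found" without ever handing a null base pointer to bsearch.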
Fix splay tree KEY leak detected in GDB test gdb.base/macscp.exp
When a node is removed from a splay tree, the splay tree was
not using the function splay_tree_delete_key_fn to release the key.
This was causing a leak, fixed by Tom Tromey.
This patch fixes another key leak, that happens when a key equal to
a key already present is inserted. In such a case, we have to release
the old KEY.
Note that this is based on the assumption that the caller always
allocates a new KEY when doing an insert.
Also, clarify the documentation about when the release functions are
called.
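The ownership rule can be sketched with a much simpler container. Everything below (node, table, table_insert, release_key, dup_key) is a hypothetical stand-in, not libiberty's splay tree API; a linked list is used only to keep the example short. As in the note above, it assumes the caller allocates a fresh key for every insert.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical container node; keys are heap-allocated and owned by the table.
struct node {
  char *key;
  int value;
  node *next;
};

struct table {
  node *head = nullptr;
  void (*delete_key)(char *) = nullptr;  // release callback for keys
};

static int released_keys = 0;  // counts calls to release_key, for illustration

static void release_key(char *key) {
  ++released_keys;
  std::free(key);
}

// Allocates a fresh copy of s, as the caller is assumed to do on every insert.
static char *dup_key(const char *s) {
  char *copy = static_cast<char *>(std::malloc(std::strlen(s) + 1));
  assert(copy != nullptr);
  std::strcpy(copy, s);
  return copy;
}

void table_insert(table &t, char *key, int value) {
  for (node *n = t.head; n != nullptr; n = n->next) {
    if (std::strcmp(n->key, key) == 0) {
      // An equal key is already present: release the old key before
      // adopting the new one, otherwise the old allocation leaks.
      if (t.delete_key != nullptr)
        t.delete_key(n->key);
      n->key = key;
      n->value = value;
      return;
    }
  }
  t.head = new node{key, value, t.head};
}
```

Inserting the same key text twice through dup_key leaves one node and triggers exactly one call to the release callback.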
libiberty/ChangeLog
2019-02-11 Philippe Waroquiers <philippe.waroquiers@skynet.be>
* splay-tree.c (splay_tree_insert): Also release old KEY in case
of insertion of a key equal to an already present key.
(splay_tree_new_typed_alloc): Update comment.
Jan Hubicka [Tue, 12 Feb 2019 11:25:11 +0000 (12:25 +0100)]
re PR target/88777 (Out-of-range offsets building glibc test-tgmath2.c for hppa-linux-gnu)
PR lto/88777
* cgraphunit.c (analyze_functions): Clear READONLY flag for external
types that need constructing.
* tree.h (may_be_aliased): Do not check TYPE_NEEDS_CONSTRUCTING.
David Malcolm [Tue, 12 Feb 2019 01:09:31 +0000 (01:09 +0000)]
linemap_line_start: protect against location_t overflow (PR lto/88147)
PR lto/88147 reports an assertion failure due to a bogus location_t value
when adding a line to a pre-existing line map, when there's a large
difference between the two line numbers.
For some "large differences", this leads to a location_t value that exceeds
LINE_MAP_MAX_LOCATION, in which case linemap_line_start returns 0. This
isn't ideal, but at least should lead to safe degradation of location
information.
However, if the difference is very large, it's possible for the line
number offset (relative to the start of the map) to be sufficiently large
that overflow occurs when left-shifted by the column-bits, and hence
the check against the LINE_MAP_MAX_LOCATION limit fails, leading to
a seemingly-valid location_t value, but encoding the wrong location. This
triggers the assertion failure:
linemap_assert (SOURCE_LINE (map, r) == to_line);
The fix (thanks to Martin) is to check for overflow when determining
whether to reuse an existing map, and to not reuse it if it would occur.
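The failure mode and the kind of check involved can be sketched as follows. This is an illustration under the assumption of a 32-bit unsigned location value; shift_would_overflow is a hypothetical helper, not the code added to line-map.c.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when (line_offset << column_bits) would lose high bits in a
// 32-bit unsigned value. Hypothetical helper for illustration only.
bool shift_would_overflow(std::uint32_t line_offset, unsigned column_bits) {
  assert(column_bits < 32);
  if (column_bits == 0)
    return false;
  // Overflow happens exactly when any of the top column_bits bits are set.
  return (line_offset >> (32 - column_bits)) != 0;
}
```

For example, with 8 column bits an offset of 0x01000001 shifts to 0x00000100 after wrap-around. That value is small, so a later "is it above the maximum?" comparison would pass even though the encoded line is wrong. Checking before the shift catches the case.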
gcc/ChangeLog: David Malcolm <dmalcolm@redhat.com>
PR lto/88147
* input.c (selftest::test_line_offset_overflow): New selftest.
(selftest::input_c_tests): Call it.
libcpp/ChangeLog: Martin Liska <mliska@suse.cz>
PR lto/88147
* line-map.c (linemap_line_start): Don't reuse the existing line
map if the line offset is sufficiently large to cause overflow
when computing location_t values.
PR tree-optimization/88771
* gimple-ssa-warn-restrict.c (pass_wrestrict::gate): Also enable
when -Wstringop-overflow is set.
(builtin_memref::builtin_memref): Adjust excessive upper bound
only when lower bound is not excessive.
(maybe_diag_overlap): Detect and diagnose excessive bounds via
-Wstringop-overflow.
(maybe_diag_offset_bounds): Rename...
(maybe_diag_access_bounds): ...to this.
(check_bounds_or_overlap): Adjust for name change above.
gcc/testsuite/ChangeLog:
PR tree-optimization/88771
* gcc.dg/Wstringop-overflow-8.c: New test.
* gcc.dg/Wstringop-overflow-9.c: New test.
* gcc.dg/Warray-bounds-40.c: New test.
* gcc.dg/builtin-stpncpy.c: Adjust.
* gcc.dg/builtin-stringop-chk-4.c: Adjust.
* g++.dg/opt/memcpy1.C: Adjust.
PR c++/87996
* c-common.c (invalid_array_size_error): New function.
(valid_array_size_p): Call it. Handle size as well as type.
* c-common.h (valid_constant_size_p): New function.
(enum cst_size_error): New type.
gcc/cp/ChangeLog:
PR c++/87996
* decl.c (compute_array_index_type_loc): Preserve signed sizes
for diagnostics. Call valid_array_size_p instead of error.
* init.c (build_new_1): Compute size for diagnostic. Call
invalid_array_size_error
(build_new): Call valid_array_size_p instead of error.
Tamar Christina [Mon, 11 Feb 2019 16:54:18 +0000 (16:54 +0000)]
Arm: Update tests after register allocation changes. (PR/target 88560)
After the register allocator changes of r268705 we need to update a few tests
with new output.
In all cases the compiler is now generating the expected code. Since the tests
are all float16 testcases using a hard-float ABI, we expect actual fp16
instructions to be used rather than integer loads and stores. Because of
this we also save on some mov.f16s that were previously emitted to move between
the two.
The aapcs cases now match the f32 cases in using floating point operations.
Bill Schmidt [Mon, 11 Feb 2019 16:50:33 +0000 (16:50 +0000)]
rs6000.c (rs6000_gimple_fold_builtin): Shift-right and shift-left vector built-ins need to include a TRUNC_MOD_EXPR...
[gcc]
2019-02-11 Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
for correct semantics.
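A scalar model of one lane may help show why the modulo matters. This is a sketch under the assumption that the vector shift instructions use the shift count modulo the element width, while a plain shift by the element width or more is undefined in the folded form. The function names and the 32-bit lane are hypothetical choices for illustration.

```cpp
#include <cstdint>

// Scalar model of one 32-bit lane: the count is reduced modulo the lane
// width before shifting, so the shift itself is always well defined.
std::uint32_t lane_shift_left(std::uint32_t value, std::uint32_t count) {
  return value << (count % 32u);
}

std::uint32_t lane_shift_right(std::uint32_t value, std::uint32_t count) {
  return value >> (count % 32u);
}
```

With the reduction in place, a count of 33 behaves like a count of 1 instead of invoking undefined behavior.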
Alan Modra [Mon, 11 Feb 2019 15:19:59 +0000 (01:49 +1030)]
[RS6000] No inline PLT for V4 bss-plt, implement -mno-pltseq
Inline PLT calls need PLT to be an array of addresses. PowerPC 32-bit
bss-plt works differently, so this patch disables inline PLT calls
when -mbss-plt. The patch also adds support for a new -mno-pltseq
option, which may be required when linking with -mbss-plt code.
* doc/invoke.texi (man page RS/6000 and PowerPC Options): Mention
-mlongcall and -mpltseq.
(RS/6000 and PowerPC Options <-mlongcall>): Mention inline PLT calls.
(RS/6000 and PowerPC Options <-mpltseq>): Document.
* config/rs6000/rs6000.h (TARGET_PLTSEQ): Define.
* config/rs6000/sysv4.opt (mpltseq): New option.
* config/rs6000/sysv4.h (TARGET_PLTSEQ): Redefine.
(SUBTARGET_OVERRIDE_OPTIONS): Error if given -mpltseq when assembler
support is lacking. Don't allow -mpltseq with -mbss-plt.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Warn if
-mpltseq given for ELFv1.
* config/rs6000/rs6000.c (rs6000_call_aix): Comment on UNSPEC_PLTSEQ.
Only use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_call_sysv, rs6000_sibcall_sysv): Expand comments. Only
use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_indirect_call_template_1, rs6000_longcall_ref),
(rs6000_call_aix, rs6000_call_sysv, rs6000_sibcall_sysv): Replace
uses of HAVE_AS_PLTSEQ with TARGET_PLTSEQ, simplifying.
* config/rs6000/rs6000.md (pltseq_tocsave_<mode>),
(pltseq_plt16_ha_<mode>, pltseq_plt16_lo_<mode>),
(pltseq_mtctr_<mode>): Likewise.
Jonathan Wakely [Mon, 11 Feb 2019 12:56:59 +0000 (12:56 +0000)]
PR libstdc++/89023 fix test that fails when <omp.h> not available
Instead of a single test that only checks whether <regex> can be
included in Parallel Mode, add tests for each of C++11/C++14/C++17 that
check whether <bits/extc++.h> is compatible with _GLIBCXX_PARALLEL.
This increases the coverage to (almost) all headers.
If <omp.h> is not available then the tests will trivially pass, because
we don't care about compatibility with _GLIBCXX_PARALLEL in that case.
PR libstdc++/89023
* testsuite/17_intro/headers/c++2011/parallel_mode.cc: New test.
* testsuite/17_intro/headers/c++2014/parallel_mode.cc: New test.
* testsuite/17_intro/headers/c++2017/parallel_mode.cc: New test.
* testsuite/28_regex/headers/regex/parallel_mode.cc: Remove.
function.c (assign_parm_setup_block): Use the stored size...
* function.c (assign_parm_setup_block): Use the stored
size, not the passed size, when allocating stack-space,
also for a parameter with alignment larger than
MAX_SUPPORTED_STACK_ALIGNMENT.
Thomas Koenig [Sun, 10 Feb 2019 15:56:41 +0000 (15:56 +0000)]
re PR fortran/71723 ([F08] ICE on invalid pointer initialization)
2019-02-10 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/71723
* expr.c (gfc_check_assign): Add argument is_init_expr. If we are
looking at an init expression, issue error if the target is not a
TARGET and we are not looking at a procedure pointer.
* gfortran.h (gfc_check_assign): Add optional argument
is_init_expr.
Thomas Koenig [Sun, 10 Feb 2019 15:52:38 +0000 (15:52 +0000)]
re PR fortran/71723 ([F08] ICE on invalid pointer initialization)
2019-02-10 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/71723
* expr.c (gfc_check_assign): Add argument is_init_expr. If we are
looking at an init expression, issue error if the target is not a
TARGET and we are not looking at a procedure pointer.
* gfortran.h (gfc_check_assign): Add optional argument
is_init_expr.
Jakub Jelinek [Sat, 9 Feb 2019 08:55:39 +0000 (09:55 +0100)]
re PR middle-end/89246 (LTO produces references to cloned symbols which the compiler failed to clone)
PR middle-end/89246
* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
If !node->definition and TYPE_ARG_TYPES is non-NULL, use
TYPE_ARG_TYPES instead of DECL_ARGUMENTS.
* gcc.dg/gomp/pr89246-1.c: New test.
* gcc.dg/gomp/pr89246-2.c: New test.
Jonathan Wakely [Sat, 9 Feb 2019 00:25:39 +0000 (00:25 +0000)]
Add noexcept to filesystem::path query functions
In the standard these member functions are specified in terms of the
potentially-throwing path decompositions functions, but we implement
them without constructing any new paths or doing anything else that can
throw.
// -m32 -fpic -msecure-plt
extern int foo (int);
int f1 (int x) { return foo (x); }
These are both caused by save_reg_p returning false when the pic
offset table reg (r30 for ABI_V4) was used, due to the logic not
exactly matching that in rs6000_emit_prologue to set up r30.
I also noticed that save_reg_p isn't following the comment regarding
calls_eh_return (since svn 267049, git 0edf78b1b2a0), and the comment
needs tweaking too. For why the revised comment is correct, grep for
saves_all_registers in lra.c, and yes, we do want to save the pic
offset table reg for eh_return.
PR target/88343
* config/rs6000/rs6000.c (save_reg_p): Correct calls_eh_return
case. Match logic in rs6000_emit_prologue emitting pic_offset_table
setup.
Richard Biener [Fri, 8 Feb 2019 13:21:36 +0000 (13:21 +0000)]
re PR tree-optimization/89247 (ICE in expand_LOOP_VECTORIZED, at internal-fn.c:2409)
2019-02-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/89247
* tree-if-conv.c: Include tree-cfgcleanup.h.
(version_loop_for_if_conversion): Record LOOP_VECTORIZED call.
(tree_if_conversion): Pass through predicate vector.
(pass_if_conversion::execute): Do CFG cleanup and SSA update
inline, see if any if-converted loops we reference in
LOOP_VECTORIZED calls vanished and fixup.
* tree-if-conv.h (tree_if_conversion): Adjust prototype.
Implementation of section anchors in S/390 back-end added in r266741
broke jump labels in S/390 Linux kernel [1]. Currently jump labels
pass global variable addresses to .quad directive in inline assembly
using "X" constraint. In the past this used to produce regular symbol
references, however, after r266741 we sometimes get values like
(plus (reg) (const_int)), where (reg) points to a section anchor.
Strictly speaking, this is still correct, since "X" accepts anything.
Thus, now we need another way to support jump labels.
The existing "i" constraint cannot be used, since with -fPIC it must
not accept non-local symbols, however, jump labels do require that,
e.g. __tracepoint_xdp_exception from kernel proper might be referenced
from kernel modules.
The existing "ZL" constraint cannot be used for the same reason.
The existing "b" constraint cannot be used because of the way
expand_asm_stmt works. It deduces whether the constraint allows
regs, subregs or mems, and processes asm operands differently based on
that. "b" is supposed to accept values like (mem (symbol_ref)), and
there appears to be no way to explain to expand_asm_stmt that for "b"
mem's address must not be in a register.
This patch introduces the new machine-specific constraint named "jdd" -
"j" prefix is already used for constants, and "d" stands for "data".
It accepts anything that fits into the data section, whether or not
this might require a relocation, that is, anything that passes
CONSTANT_P check.
[1] https://lkml.org/lkml/2019/1/23/346
gcc/ChangeLog:
2019-02-08 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/constraints.md (jdd): New constraint.
Eric Botcazou [Fri, 8 Feb 2019 11:37:40 +0000 (11:37 +0000)]
trans.c (gnat_to_gnu): Minor tweak.
* gcc-interface/trans.c (gnat_to_gnu) <N_Aggregate>: Minor tweak.
* gcc-interface/utils.c (convert): Do not pad when doing an unchecked
conversion here. Use TREE_CONSTANT throughout the function.
(unchecked_convert): Also pad if the source is a CONSTRUCTOR and the
destination is a more aligned array type or a larger aggregate type,
but not between original and packable versions of a type.
H.J. Lu [Fri, 8 Feb 2019 11:30:53 +0000 (11:30 +0000)]
i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL
OImode and TImode moves must be done in XImode to access upper 16
vector registers without AVX512VL. With AVX512VL, we can access
upper 16 vector registers in OImode and TImode.
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
upper 16 vector registers without TARGET_AVX512VL.
(*movti_internal): Likewise.
Eric Botcazou [Fri, 8 Feb 2019 11:07:08 +0000 (11:07 +0000)]
trans.c (Regular_Loop_to_gnu): Replace tests on individual flag_unswitch_loops and flag_tree_loop_vectorize...
* gcc-interface/trans.c (Regular_Loop_to_gnu): Replace tests on
individual flag_unswitch_loops and flag_tree_loop_vectorize switches
with test on global optimize switch.
(Raise_Error_to_gnu): Likewise.
Jakub Jelinek [Fri, 8 Feb 2019 10:26:33 +0000 (11:26 +0100)]
re PR rtl-optimization/89234 (ICE in get_eh_region_and_lp_from_rtx at gcc/except.c:1824)
PR rtl-optimization/89234
* except.c (copy_reg_eh_region_note_forward): Return if note_or_insn
is a NOTE, CODE_LABEL etc. - rtx_insn * other than INSN_P.
(copy_reg_eh_region_note_backward): Likewise.
The backtrace functions backtrace_full, backtrace_print and backtrace_simple
walk the call stack, but make sure to skip the first entry, in order to skip
over the functions themselves, and start the backtrace at the caller of the
functions.
When compiling with -flto, the functions may be inlined, causing them to skip
over the caller instead.
Fix this by declaring the functions with __attribute__((noinline)).
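A minimal sketch of the attribute, assuming a GCC-compatible compiler. caller_return_address is a hypothetical function, not part of libbacktrace, and __builtin_return_address is a GCC extension used here only to give the function something frame-dependent to do.

```cpp
// Hypothetical stand-in for a backtrace entry point. Because it is never
// inlined, it always has its own frame, so "skip one frame" reliably means
// "start at whoever called this function".
__attribute__((noinline)) void *caller_return_address() {
  // Address in the caller to which this function will return.
  return __builtin_return_address(0);
}
```

Without the attribute, link-time inlining could merge this function into its caller, and frame-relative logic would then be off by one.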
Richard Biener [Fri, 8 Feb 2019 08:18:09 +0000 (08:18 +0000)]
re PR tree-optimization/89223 (internal compiler error: in int_cst_value, at tree.c:11226)
2019-02-08 Richard Biener <rguenther@suse.de>
PR middle-end/89223
* tree-data-ref.c (initialize_matrix_A): Fail if constant
doesn't fit in HWI.
(analyze_subscript_affine_affine): Handle failure from
initialize_matrix_A.
David Malcolm [Thu, 7 Feb 2019 23:00:18 +0000 (23:00 +0000)]
Fix more ICEs in -fsave-optimization-record (PR tree-optimization/89235)
PR tree-optimization/89235 reports an ICE inside -fsave-optimization-record
whilst reporting the inlining chain of the location_t in the
vect_location global.
This is very similar to PR tree-optimization/86637, fixed in r266821.
The issue is that the inlining chains are read from the location_t's
ad-hoc data, referencing GC-managed tree blocks, but the former are
not GC roots; it's simply assumed that old locations referencing dead
blocks never get used again.
The fix is to reset the "vect_location" global in more places. Given
that this is a somewhat subtle detail, the patch adds a sentinel class to
reset vect_location at the end of a scope. Doing it as a class
simplifies the task of ensuring that the global is reset on every
exit path from a function, and also gives a good place to signpost
the above subtlety (in the documentation for the class).
The patch also adds test cases for both of the PRs mentioned above.
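The sentinel idea can be sketched with a plain global. The names below (current_location, auto_reset_location, process) are hypothetical; this is not the auto_purge_vect_location class itself, only the same scope-exit pattern.

```cpp
#include <cassert>

// Hypothetical global standing in for vect_location.
static const char *current_location = nullptr;

// Resets the global when the enclosing scope ends, on every exit path.
class auto_reset_location {
public:
  auto_reset_location() = default;
  ~auto_reset_location() { current_location = nullptr; }

  // Non-copyable: exactly one reset per guarded scope.
  auto_reset_location(const auto_reset_location &) = delete;
  auto_reset_location &operator=(const auto_reset_location &) = delete;
};

bool process(const char *where, bool bail_out_early) {
  assert(where != nullptr);
  auto_reset_location sentinel;
  current_location = where;
  if (bail_out_early)
    return false;  // the destructor still resets the global here
  return true;     // ... and here
}
```

Whichever return statement is taken, current_location is null again once process returns, so no stale value survives to be read later.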
gcc/testsuite/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* gcc.c-torture/compile/pr86637-1.c: New test.
* gcc.c-torture/compile/pr86637-2.c: New test.
* gcc.c-torture/compile/pr86637-3.c: New test.
* gcc.c-torture/compile/pr89235.c: New test.
gcc/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* tree-vect-loop.c (optimize_mask_stores): Add an
auto_purge_vect_location sentinel to ensure that vect_location is
purged on exit.
* tree-vectorizer.c
(auto_purge_vect_location::~auto_purge_vect_location): New dtor.
(try_vectorize_loop_1): Add an auto_purge_vect_location sentinel
to ensure that vect_location is purged on exit.
(pass_slp_vectorize::execute): Likewise, replacing the manual
reset.
* tree-vectorizer.h (class auto_purge_vect_location): New class.
Kyrylo Tkachov [Thu, 7 Feb 2019 18:18:16 +0000 (18:18 +0000)]
[AArch64] Change representation of SABD in RTL
Richard raised a concern about the RTL we use to represent the AdvSIMD SABD
(vector signed absolute difference) instruction.
We currently represent it as ABS (MINUS op1 op2).
This isn't exactly what SABD does. ABS treats its input as a signed value
and returns the absolute of that.
For example:
(sabd:QI 64 -128) == 192 (unsigned) aka -64 (signed)
whereas
(minus:QI 64 -128) == 192 (unsigned) aka -64 (signed), (abs ...) of that is 64.
A better way to describe the instruction is with MINUS (SMAX (op1 op2) SMIN (op1 op2)).
This patch implements that, and also implements similar semantics for the UABD instruction
that uses UMAX and UMIN.
That way for the example above we'll have:
(minus:QI (smax:QI (64 -128)) (smin:QI (64 -128))) == (minus:QI 64 -128) == 192 (or -64 signed) which matches
what SABD does.
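The arithmetic in the example can be checked with a small scalar model of 8-bit (QImode) values. The helper names are hypothetical; results are returned as unsigned bytes so the wrap-around is well defined.

```cpp
#include <algorithm>
#include <cstdint>

// Truncates to 8 bits with well-defined modular wrap-around.
std::uint8_t wrap8(int x) { return static_cast<std::uint8_t>(x); }

// Old representation: ABS (MINUS op1 op2), evaluated in the 8-bit mode.
std::uint8_t abs_of_minus(std::int8_t a, std::int8_t b) {
  std::uint8_t diff = wrap8(a - b);      // MINUS, truncated to 8 bits
  bool negative = (diff & 0x80) != 0;    // ABS reads the result as signed
  return negative ? wrap8(-static_cast<int>(diff)) : diff;
}

// New representation: MINUS (SMAX op1 op2) (SMIN op1 op2).
std::uint8_t max_minus_min(std::int8_t a, std::int8_t b) {
  return wrap8(std::max(a, b) - std::min(a, b));
}
```

For the operands 64 and -128, abs_of_minus gives 64 while max_minus_min gives 192, which is the value the text attributes to SABD. For operands whose difference fits in the signed range, such as 5 and 3, both forms agree.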
* config/aarch64/iterators.md (max_opp): New code_attr.
(USMAX): New code iterator.
* config/aarch64/predicates.md (aarch64_smin): New predicate.
(aarch64_smax): Likewise.
* config/aarch64/aarch64-simd.md (abd<mode>_3): Rename to...
(*aarch64_<su>abd<mode>_3): ... Change RTL representation to
MINUS (MAX MIN).
* gcc.target/aarch64/abd_1.c: New test.
* gcc.dg/sabd_1.c: Likewise.
H.J. Lu [Thu, 7 Feb 2019 17:58:19 +0000 (17:58 +0000)]
i386: Fix typo in *movoi_internal_avx/movti_internal
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to OI
for TARGET_AVX512VL.
(*movti_internal): Set mode to TI for TARGET_AVX512VL.
Andreas Krebbel [Thu, 7 Feb 2019 15:53:38 +0000 (15:53 +0000)]
S/390: Fix the vec_xl / vec_xst style builtins
This patch fixes several problems with the vec_xl/vec_xst builtins:
- vec_xl/vec_xst needs to use the alignment of the scalar memory
operand for the vector type reference. This is required to emit the
proper vl/vst alignment hints.
- vec_xl / vec_xld2 / vec_xlw4 should accept const pointer source operands
- vec_xlw4 / vec_xstw4 needs to accept float memory operands
gcc/ChangeLog:
2019-02-07 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390-builtin-types.def: Add new types.
* config/s390/s390-builtins.def: (s390_vec_xl, s390_vec_xld2)
(s390_vec_xlw4): Make the memory operand into a const pointer.
(s390_vec_xld2, s390_vec_xlw4): Add a variant for single precision
float.
* config/s390/s390-c.c (s390_expand_overloaded_builtin): Generate
a new vector type with the alignment of the scalar memory operand.
gcc/testsuite/ChangeLog:
2019-02-07 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/zvector/xl-xst-align-1.c: New test.
* gcc.target/s390/zvector/xl-xst-align-2.c: New test.
These peepholes match a pair of SImode loads or stores that can be
implemented with a single LDRD or STRD instruction.
When compiling for TARGET_ARM, these peepholes originally created a set
pattern in DI mode to be caught by movdi patterns.
This approach failed to take into account the possibility that the two
matched insns operated on memory with different aliasing information.
The peepholes lost the aliasing information on one of the insns, which
could then cause the scheduler to make an invalid transformation.
This patch changes the peepholes so they generate a PARALLEL expression
of the two relevant loads or stores, which means the aliasing
information of both is kept. Such a PARALLEL pattern is what the
peepholes currently produce for TARGET_THUMB2.
In order to match these new insn patterns, we add two new define_insn's. These
define_insn's use the same checks as the peepholes to find valid insns.
Note that the patterns now created by the peepholes for LDRD and STRD
are very similar to those created by the peepholes for LDM and STM.
Many patterns could be matched by the LDM and STM define_insns, which
means we rely on the order the define_insn patterns are defined in the
machine description, with those for LDRD/STRD defined before those for
LDM/STM.
The difference between the peepholes for LDRD/STRD and those for LDM/STM
is mainly that those for LDRD/STRD have some logic to ensure that the
two registers are consecutive and the first one is even.
Bootstrapped and regtested on arm-none-linux-gnu.
Demonstrated fix of bug 88714 by bootstrapping on armv7l.
gcc/ChangeLog:
2019-02-07 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR bootstrap/88714
* config/arm/arm-protos.h (valid_operands_ldrd_strd,
arm_count_ldrdstrd_insns): New declarations.
* config/arm/arm.c (mem_ok_for_ldrd_strd): Remove broken handling of
MINUS.
(valid_operands_ldrd_strd): New function.
(arm_count_ldrdstrd_insns): New function.
* config/arm/ldrdstrd.md: Change peepholes to generate PARALLEL SImode
sets instead of single DImode set and define new insns to match this.
gcc/testsuite/ChangeLog:
2019-02-07 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR bootstrap/88714
* gcc.c-torture/execute/pr88714.c: New test.
* gcc.dg/rtl/arm/ldrd-peepholes.c: New test.
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
From-SVN: r268644
Tamar Christina [Thu, 7 Feb 2019 11:05:22 +0000 (11:05 +0000)]
AArch64: Fix initializer for array so it's a C initializer instead of C++.
This fixes a missing = that would cause the array initializer to be a C++
initializer instead of a C one, causing a warning when building with a
pre-C++11 compiler.
Committed under the GCC obvious rules.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_fcmla_lane_builtin_data):
Make it a C initializer.
Tamar Christina [Thu, 7 Feb 2019 10:05:57 +0000 (10:05 +0000)]
Arm: Fix NEON REG to REG reload failures. (PR/target 88850)
We currently return cost 2 for NEON REG to REG moves, which would be incorrect
for 64-bit moves. We currently don't have a pattern for this in the neon_move
alternatives because this is a bit of a special case. We would almost never
want it to use this r -> r pattern unless it really has no choice.
As such we add a new neon r -> r move pattern but also hide it from being used
to determine register preferences and also disparage it during LRA.
gcc/ChangeLog:
PR target/88850
* config/arm/neon.md (*neon_mov<mode>): Add r -> r case.
gcc/testsuite/ChangeLog:
PR target/88850
* gcc.target/arm/pr88850.c: New test.
Kyrylo Tkachov [Thu, 7 Feb 2019 09:32:46 +0000 (09:32 +0000)]
[arm] Use neon_dot_q type for 128-bit V[US]DOT instructions where appropriate
For the Dot Product instructions we have the scheduling types neon_dot and neon_dot_q for the 128-bit versions.
It seems that we're only using the former though, not assigning the neon_dot_q type anywhere.
This patch fixes that by adding the <q> mode attribute suffix to the type, similar to how we do it for other
types in neon.md.
* config/arm/neon.md (neon_<sup>dot<vsi2qi>):
Use neon_dot<q> for type.
(neon_<sup>dot_lane<vsi2qi>): Likewise.
Kyrylo Tkachov [Thu, 7 Feb 2019 09:31:33 +0000 (09:31 +0000)]
[AArch64] Use neon_dot_q type for 128-bit [US]DOT instructions where appropriate
For the Dot Product instructions we have the scheduling types neon_dot and neon_dot_q for the 128-bit versions.
It seems that we're only using the former though, not assigning the neon_dot_q type anywhere.
This patch fixes that by adding the <q> mode attribute suffix to the type, similar to how we do it for other
types in aarch64-simd.md.
* config/aarch64/aarch64-simd.md (aarch64_<sur>dot<vsi2qi>):
Use neon_dot<q> for type.
(aarch64_<sur>dot_lane<vsi2qi>): Likewise.
(aarch64_<sur>dot_laneq<vsi2qi>): Likewise.
Alexandre Oliva [Thu, 7 Feb 2019 07:50:42 +0000 (07:50 +0000)]
[PR86218] handle ck_aggr in compare_ics in both and either conversion
Because of rank compares, and checks for ck_list, we know that if we
see user_conv_p or ck_list in ics1, we'll also see it in ics2. This
reasoning does not extend to ck_aggr, however, so we might have
ck_aggr conversions starting both ics1 and ics2, which we handle
correctly, or either, which we likely handle by crashing on whatever
path we take depending on whether ck_aggr is in ics1 or ics2.
We crash because, as we search the conversion sequences, we may very
well fail to find what we are looking for, and reach the end of the
sequence, which is unexpected in all paths.
This patch arranges for us to take the same path when ck_aggr is only in
ics2 as we would if it were in ics1 (regardless of ics2), and it
deals with not finding the kind of conversion we look for there.
I've changed the type of the literal constant in the testcase, so as
to hopefully make it well-formed. We'd fail to reject the narrowing
conversion in the original testcase, but that's a separate bug.
for gcc/cp/ChangeLog
PR c++/86218
* call.c (compare_ics): Deal with ck_aggr in either cs.
re PR go/89199 (libgo regression in implementation of CompareAndSwap functions resulting in intermittent testcase failures on ppc64le power9 after r268458)
PR go/89199
sync/atomic: use strong form of atomic_compare_exchange_n
In the recent change to use atomic_compare_exchange_n I thought we
could use the weak form, which can spuriously fail. But that is not
how it is implemented in the gc library, and it is not what the rest
of the library expects.
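A sketch of a compare-and-swap helper using the strong form, assuming the GCC __atomic_compare_exchange_n builtin is what is meant above. compare_and_swap is a hypothetical name and the memory orders are an illustrative choice, not necessarily what libgo uses.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if *addr held old_value and was replaced by new_value.
// The fourth argument selects the strong form (weak == false), which does
// not fail spuriously: a false result means the values really differed.
bool compare_and_swap(std::int32_t *addr, std::int32_t old_value,
                      std::int32_t new_value) {
  assert(addr != nullptr);
  return __atomic_compare_exchange_n(addr, &old_value, new_value,
                                     /*weak=*/false,
                                     __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}
```

With the weak form, callers would need a retry loop to tolerate spurious failures. A single-shot API whose callers treat false as "the value did not match" needs the strong form.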