jakub [Wed, 13 Feb 2019 13:32:00 +0000 (13:32 +0000)]
2019-02-13 Jakub Jelinek <jakub@redhat.com>
PR middle-end/89303
* tree-ssa-structalias.c (set_uids_in_ptset): Or in vi->is_heap_var
into pt->vars_contains_escaped_heap instead of setting
pt->vars_contains_escaped_heap to it.
2019-02-13 Jonathan Wakely <jwakely@redhat.com>
Jakub Jelinek <jakub@redhat.com>
PR middle-end/89303
* g++.dg/torture/pr89303.C: New test.
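To illustrate the set_uids_in_ptset change above: when a flag is computed per variable inside a loop, plain assignment lets the last variable visited decide the result, while OR-ing accumulates it. The following is a minimal sketch only; struct var_info, struct pt_flags and both collect_* functions are hypothetical stand-ins, not GCC's data structures or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real GCC types.  */
struct var_info { bool is_heap_var; };
struct pt_flags { bool vars_contains_escaped_heap; };

/* Overwriting form: the last variable visited decides the flag.  */
void
collect_assign (struct pt_flags *pt, const struct var_info *vars, size_t n)
{
  assert (pt != NULL && (vars != NULL || n == 0));
  for (size_t i = 0; i < n; i++)
    pt->vars_contains_escaped_heap = vars[i].is_heap_var;
}

/* Accumulating form: once any variable sets the flag, it stays set.  */
void
collect_or (struct pt_flags *pt, const struct var_info *vars, size_t n)
{
  assert (pt != NULL && (vars != NULL || n == 0));
  for (size_t i = 0; i < n; i++)
    pt->vars_contains_escaped_heap |= vars[i].is_heap_var;
}
```

With a heap variable followed by a non-heap variable, the overwriting form ends with the flag cleared, while the accumulating form keeps it set.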
marxin [Wed, 13 Feb 2019 13:04:56 +0000 (13:04 +0000)]
Fix -fdec simplification (PR fortran/88649).
2019-02-13 Martin Liska <mliska@suse.cz>
PR fortran/88649
* resolve.c (resolve_operator): Initialize 't' right
after function entry. Skip switch (e->value.op.op)
for -fdec operands that become function calls.
jakub [Wed, 13 Feb 2019 12:12:09 +0000 (12:12 +0000)]
PR middle-end/89281
* optabs.c (prepare_cmp_insn): Use UINTVAL (size) instead of
INTVAL (size), compare it to GET_MODE_MASK instead of
1 << GET_MODE_BITSIZE.
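As background for the prepare_cmp_insn entry above: in C, an expression such as 1 << n is undefined once n reaches the width of the promoted left operand, and reading a large size as a signed value can misorder it. A minimal sketch of a mask-based range check follows; mode_mask and size_fits_mode are hypothetical helpers written for this illustration, not GCC's GET_MODE_MASK or UINTVAL macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: all-ones mask for a mode that is BITS wide.
   Handles BITS == 64 without ever shifting by the full type width.  */
uint64_t
mode_mask (unsigned int bits)
{
  assert (bits >= 1 && bits <= 64);
  return bits == 64 ? UINT64_MAX : (((uint64_t) 1 << bits) - 1);
}

/* True if the unsigned SIZE is representable in a BITS-wide mode.  */
bool
size_fits_mode (uint64_t size, unsigned int bits)
{
  return size <= mode_mask (bits);
}
```

Comparing against the mask keeps the check well defined for every supported width, including the widest one.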
paolo [Wed, 13 Feb 2019 10:34:49 +0000 (10:34 +0000)]
/cp
2019-02-13 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/88986
* decl.c (make_typename_type): Allow for TYPE_PACK_EXPANSION as
context (the first argument).
* pt.c (tsubst, case TYPENAME_TYPE): Handle TYPE_PACK_EXPANSION
as context.
/testsuite
2019-02-13 Paolo Carlini <paolo.carlini@oracle.com>
jakub [Wed, 13 Feb 2019 08:45:37 +0000 (08:45 +0000)]
PR target/89290
* config/i386/predicates.md (x86_64_immediate_operand): Allow
TLS UNSPECs offsetted by signed 32-bit CONST_INT even with
-mcmodel=large.
ibuclaw [Wed, 13 Feb 2019 07:14:46 +0000 (07:14 +0000)]
libphobos: Fallback on UnwindBacktrace if LibBacktrace not defined.
In the gcc.backtrace module, either LibBacktrace or UnwindBacktrace will
always be defined. Give UnwindBacktrace precedence over the libc
backtrace as the default handler, because the latter depends on an
rt.backtrace module that is not compiled in.
libphobos/ChangeLog:
* libdruntime/core/runtime.d (defaultTraceHandler): Give
UnwindBacktrace handler precedence over backtrace.
marxin [Wed, 13 Feb 2019 06:57:38 +0000 (06:57 +0000)]
Remove a barrier when EDGE_CROSSING is removed (PR lto/88858).
2019-02-13 Martin Liska <mliska@suse.cz>
PR lto/88858
* cfgrtl.c (remove_barriers_from_footer): New function.
(try_redirect_by_replacing_jump): Use it.
(cfg_layout_redirect_edge_and_branch): Likewise.
luoxhu [Wed, 13 Feb 2019 06:31:01 +0000 (06:31 +0000)]
rs6000: Add support for the vec_sbox_be, vec_cipher_be etc. builtins.
The 5 new builtins vec_sbox_be, vec_cipher_be, vec_cipherlast_be, vec_ncipher_be
and vec_ncipherlast_be only support vector unsigned char type parameters.
Add new instructions crypto_vsbox_<mode> and crypto_<CR_insn>_<mode> to handle
them accordingly, where the new mode CR_vqdi can be expanded to vector unsigned
long long for the builtins without the _be postfix or vector unsigned char for
the builtins with it.
jason [Tue, 12 Feb 2019 21:18:51 +0000 (21:18 +0000)]
PR c++/89144 - link error with constexpr initializer_list.
In this PR, we were unnecessarily rejecting a constexpr initializer_list
with no elements. This seems like a fairly useless degenerate case, but it
makes sense to avoid allocating an underlying array at all if there are no
elements and instead use a null pointer, like the initializer_list default
constructor.
If the (automatic storage duration) list does have initializer elements, we
continue to reject the declaration, because the initializer_list ends up
referring to an automatic storage duration temporary array, which is not a
suitable constant initializer. If we make it static, it should be OK
because we refer to a static array. The second hunk fixes that case. It
also means we won't diagnose some real errors in templates, but those
diagnostics aren't required, and we'll get them when the template is
instantiated.
* call.c (convert_like_real) [ck_list]: Don't allocate a temporary
array for an empty list.
* typeck2.c (store_init_value): Don't use cxx_constant_init in a
template.
hjl [Tue, 12 Feb 2019 19:00:35 +0000 (19:00 +0000)]
i386: Revert revision 268678 and revision 268657
i386 backend has
INT_MODE (OI, 32);
INT_MODE (XI, 64);
So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
in case of const_1, all 512 bits set.
We can load zeros with a narrower instruction (e.g. 256 bits by inherent
zeroing of the high part in the case of a 128-bit xor), so TImode in this case.
Some targets prefer V4SF mode, so they will emit float xorps for zeroing.
Then the introduction of AVX512F fubared everything by overloading the
meaning of insn mode.
How should we use INSN mode, MODE_XI, in standard_sse_constant_opcode
and patterns which use standard_sse_constant_opcode? 2 options:
1. MODE_XI should only be used to check if EXT_REX_SSE_REG_P is true
in any register operand. The operand size must be determined by the operand
itself, not by MODE_XI. The operand encoding size should be determined
by the operand size, EXT_REX_SSE_REG_P and AVX512VL.
2. MODE_XI should be used to determine the operand encoding size.
EXT_REX_SSE_REG_P and AVX512VL should be checked for encoding
instructions.
is correctly recognized by LRA as RIL alternative of extendsidi2
define_insn. However, when recognition runs after LRA, it returns RXY
alternative, which is incorrect, since the offset 16 points past the
end of the *.LC0 literal pool entry. Such addresses are normally
rejected by s390_decompose_address ().
This inconsistency confuses annotate_constant_pool_refs: the selected
alternative makes it proceed with annotation, only to find that the
annotated address is invalid, causing ICE.
This patch fixes the root cause, namely, that s390_check_qrst_address ()
behaves differently during and after LRA.
gcc/ChangeLog:
2019-02-12 Ilya Leoshkevich <iii@linux.ibm.com>
PR target/89233
* config/s390/s390.c (s390_decompose_address): Update comment.
(s390_check_qrst_address): Reject invalid address forms after
LRA.
gcc/testsuite/ChangeLog:
2019-02-12 Ilya Leoshkevich <iii@linux.ibm.com>
PR target/89233
* gcc.target/s390/pr89233.c: New test.
vries [Tue, 12 Feb 2019 14:00:59 +0000 (14:00 +0000)]
[libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc
The call to bsearch in dwarf_lookup_pc can have NULL as base argument when
the nmemb argument is 0. The base argument is required to be pointing to the
initial member of an array of nmemb objects. It is not specified what
constitutes a valid pointer to an array of 0 objects, but glibc declares base
with attribute non-null, so the NULL will trigger a sanitizer runtime error.
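One way to avoid handing a NULL base to bsearch is to short-circuit the empty case before the call. The sketch below shows that idea under stated assumptions; bsearch_guarded and cmp_int are hypothetical names for this illustration and are not taken from libbacktrace, whose actual fix may be shaped differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints, in the form bsearch expects.  */
int
cmp_int (const void *a, const void *b)
{
  int x = *(const int *) a;
  int y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Hypothetical wrapper: an empty array can never contain KEY, so
   return NULL without calling bsearch with a NULL base.  */
void *
bsearch_guarded (const void *key, const void *base, size_t nmemb,
                 size_t size, int (*compar) (const void *, const void *))
{
  assert (compar != NULL);
  if (base == NULL || nmemb == 0)
    return NULL;
  return bsearch (key, base, nmemb, size, compar);
}
```

The wrapper behaves like bsearch for non-empty arrays and simply reports "not found" for an empty one.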
tromey [Tue, 12 Feb 2019 13:02:48 +0000 (13:02 +0000)]
Fix splay tree KEY leak detected in GDB test gdb.base/macscp.exp
When a node is removed from a splay tree, the splay tree was
not using the function splay_tree_delete_key_fn to release the key.
This was causing a leak, fixed by Tom Tromey.
This patch fixes another key leak, that happens when a key equal to
a key already present is inserted. In such a case, we have to release
the old KEY.
Note that this is based on the assumption that the caller always
allocates a new KEY when doing an insert.
Also, clarify the documentation about when the release functions are
called.
2019-02-11 Philippe Waroquiers <philippe.waroquiers@skynet.be>
libiberty/ChangeLog
2019-02-11 Philippe Waroquiers <philippe.waroquiers@skynet.be>
* splay-tree.c (splay_tree_insert): Also release old KEY in case
of insertion of a key equal to an already present key.
(splay_tree_new_typed_alloc): Update comment.
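The ownership rule described above (the caller always allocates a new KEY, so inserting an equal key must release the old one) can be sketched with a much simpler container. The code below uses a plain linked list instead of a splay tree, and every name in it (struct table, table_insert, counting_release, copy_key) is hypothetical and chosen for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*release_key_fn) (char *);

struct entry { char *key; int value; struct entry *next; };
struct table { struct entry *head; release_key_fn release_key; };

/* Counts releases so the behaviour can be observed.  */
int released_keys;

void
counting_release (char *key)
{
  free (key);
  released_keys++;
}

/* Return a freshly allocated copy of S; the table takes ownership.  */
char *
copy_key (const char *s)
{
  size_t len = strlen (s) + 1;
  char *p = malloc (len);
  if (p == NULL)
    abort ();
  memcpy (p, s, len);
  return p;
}

/* Insert KEY/VALUE.  If an equal key is already present, release the
   old key before storing the new one, so it is not leaked.  */
void
table_insert (struct table *t, char *key, int value)
{
  assert (t != NULL && key != NULL);
  for (struct entry *e = t->head; e != NULL; e = e->next)
    if (strcmp (e->key, key) == 0)
      {
        if (t->release_key != NULL)
          t->release_key (e->key);
        e->key = key;
        e->value = value;
        return;
      }
  struct entry *e = malloc (sizeof *e);
  if (e == NULL)
    abort ();
  e->key = key;
  e->value = value;
  e->next = t->head;
  t->head = e;
}
```

Inserting the same key twice leaves one entry with the newer value and releases exactly one key, the old one.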
hubicka [Tue, 12 Feb 2019 11:25:11 +0000 (11:25 +0000)]
PR lto/88777
* cgraphunit.c (analyze_functions): Clear READONLY flag for external
types that need constructing.
* tree.h (may_be_aliased): Do not check TYPE_NEEDS_CONSTRUCTING.
dmalcolm [Tue, 12 Feb 2019 01:09:31 +0000 (01:09 +0000)]
linemap_line_start: protect against location_t overflow (PR lto/88147)
PR lto/88147 reports an assertion failure due to a bogus location_t value
when adding a line to a pre-existing line map, when there's a large
difference between the two line numbers.
For some "large differences", this leads to a location_t value that exceeds
LINE_MAP_MAX_LOCATION, in which case linemap_line_start returns 0. This
isn't ideal, but at least should lead to safe degradation of location
information.
However, if the difference is very large, it's possible for the line
number offset (relative to the start of the map) to be sufficiently large
that overflow occurs when left-shifted by the column-bits, and hence
the check against the LINE_MAP_MAX_LOCATION limit fails, leading to
a seemingly-valid location_t value, but encoding the wrong location. This
triggers the assertion failure:
linemap_assert (SOURCE_LINE (map, r) == to_line);
The fix (thanks to Martin) is to check for overflow when determining
whether to reuse an existing map, and to not reuse it if it would occur.
gcc/ChangeLog: David Malcolm <dmalcolm@redhat.com>
PR lto/88147
* input.c (selftest::test_line_offset_overflow): New selftest.
(selftest::input_c_tests): Call it.
libcpp/ChangeLog: Martin Liska <mliska@suse.cz>
PR lto/88147
* line-map.c (linemap_line_start): Don't reuse the existing line
map if the line offset is sufficiently large to cause overflow
when computing location_t values.
mpolacek [Mon, 11 Feb 2019 20:03:43 +0000 (20:03 +0000)]
PR c++/89212 - ICE converting nullptr to pointer-to-member-function.
* pt.c (tsubst_copy_and_build) <case CONSTRUCTOR>: Return early for
null member pointer value.
* g++.dg/cpp0x/nullptr40.C: New test.
* g++.dg/cpp0x/nullptr41.C: New test.
PR tree-optimization/88771
* gimple-ssa-warn-restrict.c (pass_wrestrict::gate): Also enable
when -Wstringop-overflow is set.
(builtin_memref::builtin_memref): Adjust excessive upper bound
only when lower bound is not excessive.
(maybe_diag_overlap): Detect and diagnose excessive bounds via
-Wstringop-overflow.
(maybe_diag_offset_bounds): Rename...
(maybe_diag_access_bounds): ...to this.
(check_bounds_or_overlap): Adjust for name change above.
gcc/testsuite/ChangeLog:
PR tree-optimization/88771
* gcc.dg/Wstringop-overflow-8.c: New test.
* gcc.dg/Wstringop-overflow-9.c: New test.
* gcc.dg/Warray-bounds-40.c: New test.
* gcc.dg/builtin-stpncpy.c: Adjust.
* gcc.dg/builtin-stringop-chk-4.c: Adjust.
* g++.dg/opt/memcpy1.C: Adjust.
PR c++/87996
* c-common.c (invalid_array_size_error): New function.
(valid_array_size_p): Call it. Handle size as well as type.
* c-common.h (valid_constant_size_p): New function.
(enum cst_size_error): New type.
gcc/cp/ChangeLog:
PR c++/87996
* decl.c (compute_array_index_type_loc): Preserve signed sizes
for diagnostics. Call valid_array_size_p instead of error.
* init.c (build_new_1): Compute size for diagnostic. Call
invalid_array_size_error.
(build_new): Call valid_array_size_p instead of error.
tnfchris [Mon, 11 Feb 2019 16:54:18 +0000 (16:54 +0000)]
Arm: Update tests after register allocation changes. (PR/target 88560)
After the register allocator changes of r268705 we need to update a few tests
with new output.
In all cases the compiler is now generating the expected code. Since the tests
are all float16 testcases using a hard-float ABI, we expect that actual fp16
instructions are used rather than integer loads and stores. Because of
this we also save on some mov.f16s that were previously emitted to move between
the two.
The aapcs cases now match the f32 cases in using floating point operations.
wschmidt [Mon, 11 Feb 2019 16:50:33 +0000 (16:50 +0000)]
[gcc]
2019-02-11 Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
for correct semantics.
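The entry above suggests that the vector shift built-ins reduce each shift count modulo the element width, whereas a generic shift by a count at or above that width has no defined result. A minimal scalar sketch of those semantics for 32-bit elements follows; vec_shl_u32 is a hypothetical name, and the code models the behaviour element by element rather than reproducing anything from rs6000.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element-wise left shift of N 32-bit elements.  Each count is
   reduced modulo the element width before shifting, so the shift
   is always well defined.  */
void
vec_shl_u32 (uint32_t *dst, const uint32_t *src, const uint32_t *count,
             size_t n)
{
  assert (n == 0 || (dst != NULL && src != NULL && count != NULL));
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] << (count[i] % 32);
}
```

With this reduction, a count of 33 behaves like a count of 1 and a count of 32 leaves the element unchanged.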
amodra [Mon, 11 Feb 2019 15:19:59 +0000 (15:19 +0000)]
[RS6000] No inline PLT for V4 bss-plt, implement -mno-pltseq
Inline PLT calls need PLT to be an array of addresses. PowerPC 32-bit
bss-plt works differently, so this patch disables inline PLT calls
when -mbss-plt. The patch also adds support for a new -mno-pltseq
option, which may be required when linking with -mbss-plt code.
* doc/invoke.texi (man page RS/6000 and PowerPC Options): Mention
-mlongcall and -mpltseq.
(RS/6000 and PowerPC Options <-mlongcall>): Mention inline PLT calls.
(RS/6000 and PowerPC Options <-mpltseq>): Document.
* config/rs6000/rs6000.h (TARGET_PLTSEQ): Define.
* config/rs6000/sysv4.opt (mpltseq): New option.
* config/rs6000/sysv4.h (TARGET_PLTSEQ): Redefine.
(SUBTARGET_OVERRIDE_OPTIONS): Error if given -mpltseq when assembler
support is lacking. Don't allow -mpltseq with -mbss-plt.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Warn if
-mpltseq given for ELFv1.
* config/rs6000/rs6000.c (rs6000_call_aix): Comment on UNSPEC_PLTSEQ.
Only use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_call_sysv, rs6000_sibcall_sysv): Expand comments. Only
use UNSPEC_PLTSEQ for inline PLT calls.
(rs6000_indirect_call_template_1, rs6000_longcall_ref),
(rs6000_call_aix, rs6000_call_sysv, rs6000_sibcall_sysv): Replace
uses of HAVE_AS_PLTSEQ with TARGET_PLTSEQ, simplifying.
* config/rs6000/rs6000.md (pltseq_tocsave_<mode>),
(pltseq_plt16_ha_<mode>, pltseq_plt16_lo_<mode>),
(pltseq_mtctr_<mode>): Likewise.
redi [Mon, 11 Feb 2019 12:56:59 +0000 (12:56 +0000)]
PR libstdc++/89023 fix test that fails when <omp.h> not available
Instead of a single test that only checks whether <regex> can be
included in Parallel Mode, add tests for each of C++11/C++14/C++17 that
check whether <bits/extc++.h> is compatible with _GLIBCXX_PARALLEL.
This increases the coverage to (almost) all headers.
If <omp.h> is not available then the tests will trivially pass, because
we don't care about compatibility with _GLIBCXX_PARALLEL in that case.
PR libstdc++/89023
* testsuite/17_intro/headers/c++2011/parallel_mode.cc: New test.
* testsuite/17_intro/headers/c++2014/parallel_mode.cc: New test.
* testsuite/17_intro/headers/c++2017/parallel_mode.cc: New test.
* testsuite/28_regex/headers/regex/parallel_mode.cc: Remove.
hp [Mon, 11 Feb 2019 09:03:51 +0000 (09:03 +0000)]
* function.c (assign_parm_setup_block): Use the stored
size, not the passed size, when allocating stack-space,
also for a parameter with alignment larger than
MAX_SUPPORTED_STACK_ALIGNMENT.
tkoenig [Sun, 10 Feb 2019 15:56:41 +0000 (15:56 +0000)]
2019-02-10 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/71723
* expr.c (gfc_check_assign): Add argument is_init_expr. If we are
looking at an init expression, issue error if the target is not a
TARGET and we are not looking at a procedure pointer.
* gfortran.h (gfc_check_assign): Add optional argument
is_init_expr.
tkoenig [Sun, 10 Feb 2019 15:52:38 +0000 (15:52 +0000)]
2019-02-10 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/71237
* expr.c (gfc_check_assign): Add argument is_init_expr. If we are
looking at an init expression, issue error if the target is not a
TARGET and we are not looking at a procedure pointer.
* gfortran.h (gfc_check_assign): Add optional argument
is_init_expr.
jakub [Sat, 9 Feb 2019 08:55:39 +0000 (08:55 +0000)]
PR middle-end/89246
* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
If !node->definition and TYPE_ARG_TYPES is non-NULL, use
TYPE_ARG_TYPES instead of DECL_ARGUMENTS.
* gcc.dg/gomp/pr89246-1.c: New test.
* gcc.dg/gomp/pr89246-2.c: New test.
redi [Sat, 9 Feb 2019 00:25:39 +0000 (00:25 +0000)]
Add noexcept to filesystem::path query functions
In the standard these member functions are specified in terms of the
potentially-throwing path decomposition functions, but we implement
them without constructing any new paths or doing anything else that can
throw.
// -m32 -fpic -msecure-plt
extern int foo (int);
int f1 (int x) { return foo (x); }
These are both caused by save_reg_p returning false when the pic
offset table reg (r30 for ABI_V4) was used, due to the logic not
exactly matching that in rs6000_emit_prologue to set up r30.
I also noticed that save_reg_p isn't following the comment regarding
calls_eh_return (since svn 267049, git 0edf78b1b2a0), and the comment
needs tweaking too. For why the revised comment is correct, grep for
saves_all_registers in lra.c, and yes, we do want to save the pic
offset table reg for eh_return.
PR target/88343
* config/rs6000/rs6000.c (save_reg_p): Correct calls_eh_return
case. Match logic in rs6000_emit_prologue emitting pic_offset_table
setup.
rguenth [Fri, 8 Feb 2019 13:21:36 +0000 (13:21 +0000)]
2019-02-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/89247
* tree-if-conv.c: Include tree-cfgcleanup.h.
(version_loop_for_if_conversion): Record LOOP_VECTORIZED call.
(tree_if_conversion): Pass through predicate vector.
(pass_if_conversion::execute): Do CFG cleanup and SSA update
inline, see if any if-converted loops we reference in
LOOP_VECTORIZED calls vanished and fixup.
* tree-if-conv.h (tree_if_conversion): Adjust prototype.
iii [Fri, 8 Feb 2019 12:39:27 +0000 (12:39 +0000)]
S/390: Introduce jdd constraint
Implementation of section anchors in S/390 back-end added in r266741
broke jump labels in S/390 Linux kernel [1]. Currently jump labels
pass global variable addresses to .quad directive in inline assembly
using "X" constraint. In the past this used to produce regular symbol
references, however, after r266741 we sometimes get values like
(plus (reg) (const_int)), where (reg) points to a section anchor.
Strictly speaking, this is still correct, since "X" accepts anything.
Thus, now we need another way to support jump labels.
The existing "i" constraint cannot be used, since with -fPIC it must
not accept non-local symbols, however, jump labels do require that,
e.g. __tracepoint_xdp_exception from kernel proper might be referenced
from kernel modules.
The existing "ZL" constraint cannot be used for the same reason.
The existing "b" constraint cannot be used because of the way
expand_asm_stmt works. It deduces whether the constraint allows
regs, subregs or mems, and processes asm operands differently based on
that. "b" is supposed to accept values like (mem (symbol_ref)), and
there appears to be no way to explain to expand_asm_stmt that for "b"
mem's address must not be in a register.
This patch introduces the new machine-specific constraint named "jdd" -
"j" prefix is already used for constants, and "d" stands for "data".
It accepts anything that fits into the data section, whether or not
this might require a relocation, that is, anything that passes
CONSTANT_P check.
[1] https://lkml.org/lkml/2019/1/23/346
gcc/ChangeLog:
2019-02-08 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/constraints.md (jdd): New constraint.
ebotcazou [Fri, 8 Feb 2019 11:37:40 +0000 (11:37 +0000)]
* gcc-interface/trans.c (gnat_to_gnu) <N_Aggregate>: Minor tweak.
* gcc-interface/utils.c (convert): Do not pad when doing an unchecked
conversion here. Use TREE_CONSTANT throughout the function.
(unchecked_convert): Also pad if the source is a CONSTRUCTOR and the
destination is a more aligned array type or a larger aggregate type,
but not between original and packable versions of a type.
hjl [Fri, 8 Feb 2019 11:30:53 +0000 (11:30 +0000)]
i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL
OImode and TImode moves must be done in XImode to access upper 16
vector registers without AVX512VL. With AVX512VL, we can access
upper 16 vector registers in OImode and TImode.
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
upper 16 vector registers without TARGET_AVX512VL.
(*movti_internal): Likewise.
ebotcazou [Fri, 8 Feb 2019 11:07:08 +0000 (11:07 +0000)]
* gcc-interface/trans.c (Regular_Loop_to_gnu): Replace tests on
individual flag_unswitch_loops and flag_tree_loop_vectorize switches
with test on global optimize switch.
(Raise_Error_to_gnu): Likewise.
jakub [Fri, 8 Feb 2019 10:26:33 +0000 (10:26 +0000)]
PR rtl-optimization/89234
* except.c (copy_reg_eh_region_note_forward): Return if note_or_insn
is a NOTE, CODE_LABEL etc. - rtx_insn * other than INSN_P.
(copy_reg_eh_region_note_backward): Likewise.
The backtrace functions backtrace_full, backtrace_print and backtrace_simple
walk the call stack, but make sure to skip the first entry, in order to skip
over the functions themselves, and start the backtrace at the caller of the
functions.
When compiling with -flto, the functions may be inlined, causing them to skip
over the caller instead.
Fix this by declaring the functions with __attribute__((noinline)).
rguenth [Fri, 8 Feb 2019 08:18:09 +0000 (08:18 +0000)]
2019-02-08 Richard Biener <rguenther@suse.de>
PR middle-end/89223
* tree-data-ref.c (initialize_matrix_A): Fail if constant
doesn't fit in HWI.
(analyze_subscript_affine_affine): Handle failure from
initialize_matrix_A.
dmalcolm [Thu, 7 Feb 2019 23:00:18 +0000 (23:00 +0000)]
Fix more ICEs in -fsave-optimization-record (PR tree-optimization/89235)
PR tree-optimization/89235 reports an ICE inside -fsave-optimization-record
whilst reporting the inlining chain of the location_t in the
vect_location global.
This is very similar to PR tree-optimization/86637, fixed in r266821.
The issue is that the inlining chains are read from the location_t's
ad-hoc data, referencing GC-managed tree blocks, but the former are
not GC roots; it's simply assumed that old locations referencing dead
blocks never get used again.
The fix is to reset the "vect_location" global in more places. Given
that is a somewhat subtle detail, the patch adds a sentinel class to
reset vect_location at the end of a scope. Doing it as a class
simplifies the task of ensuring that the global is reset on every
exit path from a function, and also gives a good place to signpost
the above subtlety (in the documentation for the class).
The patch also adds test cases for both of the PRs mentioned above.
gcc/testsuite/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* gcc.c-torture/compile/pr86637-1.c: New test.
* gcc.c-torture/compile/pr86637-2.c: New test.
* gcc.c-torture/compile/pr86637-3.c: New test.
* gcc.c-torture/compile/pr89235.c: New test.
gcc/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* tree-vect-loop.c (optimize_mask_stores): Add an
auto_purge_vect_location sentinel to ensure that vect_location is
purged on exit.
* tree-vectorizer.c
(auto_purge_vect_location::~auto_purge_vect_location): New dtor.
(try_vectorize_loop_1): Add an auto_purge_vect_location sentinel
to ensure that vect_location is purged on exit.
(pass_slp_vectorize::execute): Likewise, replacing the manual
reset.
* tree-vectorizer.h (class auto_purge_vect_location): New class.
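The patch uses a C++ class whose destructor resets the global. The same "reset on every exit path" idea can be sketched in C with the GNU cleanup variable attribute, which is a GCC/Clang extension rather than standard C. Everything below is hypothetical and for illustration only: current_location stands in for vect_location, and AUTO_PURGE_LOCATION, purge_location and process are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a global that must not outlive the data it refers to.  */
const char *current_location;

/* Cleanup handler: runs when the guarded variable goes out of scope.  */
static void
purge_location (int *guard)
{
  (void) guard;
  current_location = NULL;
}

/* Declare a guard whose cleanup resets the global (GNU extension).  */
#define AUTO_PURGE_LOCATION \
  int purge_guard __attribute__ ((cleanup (purge_location))) = 0

int
process (int x)
{
  /* No stale location should be live on entry.  */
  assert (current_location == NULL);
  AUTO_PURGE_LOCATION;
  current_location = "loop.c:12";
  if (x < 0)
    return -1;      /* early exit: the guard still resets the global */
  return x * 2;     /* normal exit: likewise */
}
```

Both the early return and the normal return leave current_location reset, without a manual reset at each exit.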
ktkachov [Thu, 7 Feb 2019 18:18:16 +0000 (18:18 +0000)]
[AArch64] Change representation of SABD in RTL
Richard raised a concern about the RTL we use to represent the AdvSIMD SABD
(vector signed absolute difference) instruction.
We currently represent it as ABS (MINUS op1 op2).
This isn't exactly what SABD does. ABS treats its input as a signed value
and returns the absolute of that.
For example:
(sabd:QI 64 -128) == 192 (unsigned) aka -64 (signed)
whereas
(minus:QI 64 -128) == 192 (unsigned) aka -64 (signed), (abs ...) of that is 64.
A better way to describe the instruction is with MINUS (SMAX (op1 op2) SMIN (op1 op2)).
This patch implements that, and also implements similar semantics for the UABD instruction
that uses UMAX and UMIN.
That way for the example above we'll have:
(minus:QI (smax:QI (64 -128)) (smin:QI (64 -128))) == (minus:QI 64 -128) == 192 (or -64 signed) which matches
what SABD does.
* config/aarch64/iterators.md (max_opp): New code_attr.
(USMAX): New code iterator.
* config/aarch64/predicates.md (aarch64_smin): New predicate.
(aarch64_smax): Likewise.
* config/aarch64/aarch64-simd.md (abd<mode>_3): Rename to...
(*aarch64_<su>abd<mode>_3): ... Change RTL representation to
MINUS (MAX MIN).
* gcc.target/aarch64/abd_1.c: New test.
* gcc.dg/sabd_1.c: Likewise.
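The difference between the two representations can be checked with ordinary 8-bit arithmetic. The sketch below models QImode by wrapping to 8 bits by hand; abs_minus_qi and max_minus_min_qi are hypothetical helper names for this illustration and do not correspond to anything in the AArch64 back end.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap V to 8 bits and reinterpret the result as a signed value.  */
static int
wrap_qi (int v)
{
  int w = (int) (uint8_t) v;
  return w >= 128 ? w - 256 : w;
}

/* Old representation: ABS (MINUS a b) evaluated in 8 bits.  The
   difference wraps first and is then treated as signed.  */
uint8_t
abs_minus_qi (int8_t a, int8_t b)
{
  int d = wrap_qi (a - b);
  assert (d >= -128 && d <= 127);
  return (uint8_t) (d < 0 ? -d : d);
}

/* New representation: MINUS (SMAX a b) (SMIN a b) in 8 bits.  */
uint8_t
max_minus_min_qi (int8_t a, int8_t b)
{
  int hi = a > b ? a : b;
  int lo = a > b ? b : a;
  return (uint8_t) (hi - lo);
}
```

For the inputs 64 and -128 from the example above, the first function yields 64 and the second yields 192, while for small differences the two agree.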
hjl [Thu, 7 Feb 2019 17:58:19 +0000 (17:58 +0000)]
i386: Fix typo in *movoi_internal_avx/movti_internal
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to OI
for TARGET_AVX512VL.
(*movti_internal): Set mode to TI for TARGET_AVX512VL.