David Malcolm [Wed, 17 Jan 2018 15:56:07 +0000 (15:56 +0000)]
Fix failure building LLVM with location wrapper nodes (PR c++/83799)
PR c++/83799 reports a failure building LLVM due to a bogus
"no matching function for call to" error at a callsite like this:
TLI->getTypeLegalizationCost(DL);
where "DL" is from:
using TargetTransformInfoImplBase::DL;
The root cause is that type_dependent_expression_p on a USING_DECL
should return true when processing a template, but after r256448 the
argument at the callsite is a location wrapper around the USING_DECL,
and type_dependent_expression_p erroneously returns false for it, as
it is comparing tree codes, and failing a match, then looking at types.
This prevents cp_parser_postfix_expression from using the
"build_min_nt_call_vec" path for handling the call, instead erroneously
handling it via build_new_method_call (which fails for this case).
This patch fixes the problem by stripping any location wrappers before
the various tree code tests in type_dependent_expression_p. It fixes
the reduced test case, and the full BasicTargetTransformInfo.ii; after
this patch, the assembly generated for that latter case is identical to
that generated before r256448.
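The failure mode described above can be sketched in isolation. This is a minimal model, not GCC code: `Code`, `Node`, `strip_wrappers` and both predicates are hypothetical stand-ins for the tree codes, tree nodes and type_dependent_expression_p.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's tree codes; not the real definitions.
enum class Code { UsingDecl, LocationWrapper, Other };

struct Node {
  Code code;
  const Node *operand;     // wrapped expression, used only by LocationWrapper
  bool type_is_dependent;  // what a type-based fallback would conclude
};

// Peel off any location wrappers to reach the underlying expression.
const Node *strip_wrappers(const Node *n) {
  assert(n != nullptr);
  while (n->code == Code::LocationWrapper && n->operand != nullptr)
    n = n->operand;
  return n;
}

// Pre-fix shape: the code test sees the wrapper, misses the USING_DECL,
// and falls through to the type-based answer.
bool dependent_without_strip(const Node *n) {
  if (n->code == Code::UsingDecl)
    return true;
  return n->type_is_dependent;
}

// Post-fix shape: strip first, then run the same code test.
bool dependent_with_strip(const Node *n) {
  return dependent_without_strip(strip_wrappers(n));
}
```

With a wrapper around a `UsingDecl` node, the first predicate answers false and the second answers true, which mirrors the difference the patch makes.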
gcc/cp/ChangeLog:
PR c++/83799
* pt.c (type_dependent_expression_p): Strip any location wrapper
before testing tree codes.
(selftest::test_type_dependent_expression_p): New function.
(selftest::cp_pt_c_tests): Call it.
gcc/testsuite/ChangeLog:
PR c++/83799
* g++.dg/wrappers/pr83799.C: New test case.
Nathan Sidwell [Wed, 17 Jan 2018 15:39:35 +0000 (15:39 +0000)]
[C++/83739] bogus error tsubsting range for in generic lambda
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01554.html
PR c++/83739
* pt.c (tsubst_expr) <case RANGE_FOR_STMT>: Rebuild a range_for if
this is not a final instantiation.
Kyrylo Tkachov [Wed, 17 Jan 2018 11:30:35 +0000 (11:30 +0000)]
[arm] Convert gcc.target/arm/stl-cond.c into an RTL test
This is an awkward testsuite failure. The original bug was that we were failing to put out
the conditional code in the conditional form of the STL instruction (oops!).
So we wanted to output STLNE, but instead output STL.
The testcase relies on if-conversion to conditionalise the insn for STL.
However, ever since r251643 the expansion of a non-relaxed atomic store
always includes a compiler barrier. That blocks if-conversion in all cases.
So there's no easy way to get to a conditional STL instruction from a C program.
But we do want to test for the original bug fix that if the RTL insn for STL is conditionalised
it should output the conditional code.
The solution in this patch is to convert the test into an RTL test with the COND_EXEC form
of the STL insn and scan the assembly output there.
This seems to work fine, and gives us an opportunity to create a gcc.dg/rtl/arm directory
in the RTL tests.
This now makes the gcc.target/arm/stl-cond.c disappear (as the test is deleted) and
the new test in gcc.dg/rtl/arm/stl-cond.c passes.
* gcc.dg/rtl/arm/stl-cond.c: New test.
* gcc.target/arm/stl-cond.c: Delete.
Kyrylo Tkachov [Wed, 17 Jan 2018 11:24:52 +0000 (11:24 +0000)]
[arm] Fix gcc.target/arm/pr40887.c directives
This patch converts gcc.target/arm/pr40887.c to use the proper effective target check and dg-add-options for armv5te
so that we avoid situations where we end up trying to compile the test with a Thumb1 hard-float ABI, which makes the
compiler complain.
This allows the test to pass gracefully for me for my compiler configured with:
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb
* gcc.target/arm/pr40887.c: Add armv5te effective target checks and
directives.
Kyrylo Tkachov [Wed, 17 Jan 2018 11:13:05 +0000 (11:13 +0000)]
[arm] Fix gcc.target/arm/xor-and.c
This test is naughty because it doesn't use the proper effective target checks
and add-options mechanisms for setting a Thumb1 target, which leads to Thumb1 hard-float errors
when testing a toolchain configured with --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb.
This patch fixes that in the obvious way.
* gcc.target/arm/xor-and.c: Fix armv6 effective target checks
and options.
to an aligned 32-bit integer. The strict-alignment handling of
this case creates an aligned temporary slot, moves the operand
into the slot in the operand's original mode, then accesses the
slot in the more-aligned result mode.
Previously the size of the temporary slot was calculated using:
HOST_WIDE_INT temp_size
= MAX (int_size_in_bytes (inner_type),
(HOST_WIDE_INT) GET_MODE_SIZE (mode));
int_size_in_bytes would return -1 for the variable-length type,
so we'd use the size of the result mode for the slot. r256152 replaced
int_size_in_bytes with tree_to_poly_uint64, which triggered an ICE.
If op0 has BLKmode we do a block copy of GET_MODE_SIZE (mode) bytes
and then convert the slot to "mode":
so I think in that case just the size of "mode" is enough, even if op0
is a fixed-size type. For non-BLKmode op0 we first move in op0's mode
and then convert the slot to "mode":
emit_move_insn (new_with_op0_mode, op0);
op0 = new_rtx;
}
}
op0 = adjust_address (op0, mode, 0);
so I think we want the maximum of the two mode sizes in that case.
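The sizing rule argued for above can be written as a small function. This is only a sketch under simplifying assumptions: sizes are plain byte counts rather than GCC's poly-int values, and `temp_slot_size` and its parameters are hypothetical names, not part of expr.c.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: choose the temporary slot size in bytes.
// op0_is_blk marks an operand that has BLKmode; op0_mode_size is ignored then.
std::int64_t temp_slot_size(bool op0_is_blk, std::int64_t op0_mode_size,
                            std::int64_t result_mode_size) {
  assert(result_mode_size > 0);
  if (op0_is_blk)
    return result_mode_size;  // block copy of exactly the result mode's size
  // Otherwise the slot is written in op0's mode and read in the result mode,
  // so it has to hold the larger of the two.
  return std::max(op0_mode_size, result_mode_size);
}
```

Neither branch consults the size of the type, which is why a variable-length inner type is no longer a problem in this model.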
2018-01-17 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR middle-end/83884
* expr.c (expand_expr_real_1): Use the size of GET_MODE (op0)
rather than the size of inner_type to determine the stack slot size
when handling VIEW_CONVERT_EXPRs on strict-alignment targets.
Ian Lance Taylor [Wed, 17 Jan 2018 01:39:05 +0000 (01:39 +0000)]
elf.c (codes): Fix size to be 288.
* elf.c (codes) [GENERATE_FIXED_HUFFMAN_TABLE]: Fix size to be
288.
(main) [GENERATE_FIXED_HUFFMAN_TABLE]: Pass 288 to
elf_zlib_inflate_table. Generate elf_zlib_default_dist_table.
(elf_zlib_default_table): Update.
(elf_zlib_default_dist_table): New static array.
(elf_zlib_inflate): Use elf_zlib_default_dist_table for dist table
for block type 1.
* ztest.c (struct zlib_test): Add uncompressed_len.
(tests): Initialize uncompressed_len field. Add new test case.
(test_samples): Use uncompressed_len field.
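For context on the number 288: RFC 1951 defines the fixed literal/length alphabet used by block type 1 as 288 codes with lengths 8, 9, 7 and 8 over four ranges. The sketch below builds that length table and checks that it forms a complete prefix code. It assumes, without confirmation from the entry, that `codes` holds this alphabet; the function names are hypothetical and this is not the elf.c generator.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Code lengths for DEFLATE's fixed literal/length alphabet (RFC 1951, 3.2.6).
std::array<unsigned char, 288> fixed_litlen_lengths() {
  std::array<unsigned char, 288> len{};
  for (std::size_t i = 0; i < len.size(); ++i) {
    if (i <= 143)
      len[i] = 8;
    else if (i <= 255)
      len[i] = 9;
    else if (i <= 279)
      len[i] = 7;
    else
      len[i] = 8;
  }
  return len;
}

// Kraft sum scaled by 2^9: a complete prefix code sums to exactly 512.
unsigned kraft_sum_x512(const std::array<unsigned char, 288> &len) {
  unsigned sum = 0;
  for (unsigned char l : len) {
    assert(l >= 1 && l <= 9);
    sum += 1u << (9 - l);
  }
  return sum;
}
```

The sum reaches 512 only when all 288 entries are counted, which suggests why a table sized for fewer symbols would not describe the fixed code.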
Michael Meissner [Wed, 17 Jan 2018 01:06:34 +0000 (01:06 +0000)]
config.gcc (powerpc*-linux*-*): Add support for 64-bit little endian Linux systems to optionally enable...
2018-01-16 Michael Meissner <meissner@linux.vnet.ibm.com>
* config.gcc (powerpc*-linux*-*): Add support for 64-bit little
endian Linux systems to optionally enable multilibs for selecting
the long double type if the user configured an explicit type.
* config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Indicate we
have no long double multilibs if not defined.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Do not
warn if the user used -mabi={ieee,ibm}longdouble and we built
multilibs for long double.
* config/rs6000/linux64.h (MULTILIB_DEFAULTS_IEEE): Define as the
appropriate multilib option.
(MULTILIB_DEFAULTS): Add MULTILIB_DEFAULTS_IEEE to the default
multilib options.
* config/rs6000/t-ldouble-linux64le-ibm: New configuration files
for building long double multilibs.
* config/rs6000/t-ldouble-linux64le-ieee: Likewise.
pa.h (MALLOC_ABI_ALIGNMENT): Set 32-bit alignment default to 64 bits.
* config/pa.h (MALLOC_ABI_ALIGNMENT): Set 32-bit alignment default to
64 bits.
* config/pa/pa32-linux.h (MALLOC_ABI_ALIGNMENT): Set alignment to
128 bits.
Eric Botcazou [Tue, 16 Jan 2018 21:21:29 +0000 (21:21 +0000)]
patchable_function_entry-decl.c: Use 3 NOPs on Visium.
* c-c++-common/patchable_function_entry-decl.c: Use 3 NOPs on Visium.
* c-c++-common/patchable_function_entry-default.c: 4 NOPs on Visium.
* c-c++-common/patchable_function_entry-definition.c: 2 NOPs on Visium.
Bill Schmidt [Tue, 16 Jan 2018 16:49:39 +0000 (16:49 +0000)]
rs6000.c (rs6000_opt_vars): Add entry for -mspeculate-indirect-jumps.
[gcc]
2018-01-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
-mspeculate-indirect-jumps.
* config/rs6000/rs6000.md (*call_indirect_elfv2<mode>): Disable
for -mno-speculate-indirect-jumps.
(*call_indirect_elfv2<mode>_nospec): New define_insn.
(*call_value_indirect_elfv2<mode>): Disable for
-mno-speculate-indirect-jumps.
(*call_value_indirect_elfv2<mode>_nospec): New define_insn.
(indirect_jump): Emit different RTL for
-mno-speculate-indirect-jumps.
(*indirect_jump<mode>): Disable for
-mno-speculate-indirect-jumps.
(*indirect_jump<mode>_nospec): New define_insn.
(tablejump): Emit different RTL for
-mno-speculate-indirect-jumps.
(tablejumpsi): Disable for -mno-speculate-indirect-jumps.
(tablejumpsi_nospec): New define_expand.
(tablejumpdi): Disable for -mno-speculate-indirect-jumps.
(tablejumpdi_nospec): New define_expand.
(*tablejump<mode>_internal1): Disable for
-mno-speculate-indirect-jumps.
(*tablejump<mode>_internal1_nospec): New define_insn.
* config/rs6000/rs6000.opt (mspeculate-indirect-jumps): New
option.
[gcc/testsuite]
2018-01-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* gcc.target/powerpc/safe-indirect-jump-1.c: New file.
* gcc.target/powerpc/safe-indirect-jump-2.c: New file.
* gcc.target/powerpc/safe-indirect-jump-3.c: New file.
* gcc.target/powerpc/safe-indirect-jump-4.c: New file.
* gcc.target/powerpc/safe-indirect-jump-5.c: New file.
* gcc.target/powerpc/safe-indirect-jump-6.c: New file.
Jakub Jelinek [Tue, 16 Jan 2018 15:18:24 +0000 (16:18 +0100)]
re PR libgomp/83590 ([nvptx] openacc reduction C regressions)
PR libgomp/83590
* gimplify.c (gimplify_one_sizepos): For is_gimple_constant (expr)
return early, inline manually is_gimple_sizepos. Make sure if we
call gimplify_expr we don't end up with a gimple constant.
* tree.c (variably_modified_type_p): Don't return true for
is_gimple_constant (_t). Inline manually is_gimple_sizepos.
* gimplify.h (is_gimple_sizepos): Remove.
Co-Authored-By: Richard Biener <rguenther@suse.de>
From-SVN: r256748
vect_analyze_loop_operations was calling vectorizable_live_operation
for all live-out phis, which led to a bogus ncopies calculation in
the pure SLP case. I think v_a_l_o should only be passing phis
that are vectorised using normal loop vectorisation, since
vect_slp_analyze_node_operations handles the SLP side (and knows
the correct slp_index and slp_node arguments to pass in, via
vect_analyze_stmt).
With that fixed we hit an older bug that vectorizable_live_operation
didn't handle live-out SLP inductions. Fixed by using gimple_phi_result
rather than gimple_get_lhs for phis.
2018-01-16 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83857
* tree-vect-loop.c (vect_analyze_loop_operations): Don't call
vectorizable_live_operation for pure SLP statements.
(vectorizable_live_operation): Handle PHIs.
gcc/testsuite/
PR tree-optimization/83857
* gcc.dg/vect/pr83857.c: New test.
Jakub Jelinek [Tue, 16 Jan 2018 15:08:32 +0000 (16:08 +0100)]
re PR c/83844 (ICE with warn_if_not_aligned attribute)
PR c/83844
* stor-layout.c (handle_warn_if_not_align): Use byte_position and
multiple_of_p instead of unchecked tree_to_uhwi and UHWI check.
If off is not INTEGER_CST, issue a may not be aligned warning
rather than isn't aligned. Use isn%'t rather than isn't.
* fold-const.c (multiple_of_p) <case BIT_AND_EXPR>: Don't fall through
into MULT_EXPR.
<case MULT_EXPR>: Improve the case when bottom and one of the
MULT_EXPR operands are INTEGER_CSTs and bottom is multiple of that
operand, in that case check if the other operand is multiple of
bottom divided by the INTEGER_CST operand.
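The MULT_EXPR rule can be illustrated on plain integers. For example, x * 4 is a multiple of 16 whenever x is a multiple of 16 / 4 = 4. In this sketch `product_is_multiple_of` is a hypothetical model, not the fold-const.c code: the non-constant operand is represented only by a value it is known to be a multiple of.

```cpp
#include <cassert>

// Hypothetical model of the rule for x * c, where c is a constant and x is
// only known to be a multiple of x_multiple_of. Returns true when the
// product is provably a multiple of bottom, false when this rule cannot
// prove it (which does not mean the product is not a multiple).
bool product_is_multiple_of(long x_multiple_of, long c, long bottom) {
  assert(x_multiple_of > 0 && c > 0 && bottom > 0);
  if (bottom % c != 0)
    return false;                  // rule does not apply
  long remaining = bottom / c;     // what x itself still has to supply
  return x_multiple_of % remaining == 0;
}
```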
is used by code in pa.c and by ASM_DECLARE_FUNCTION_NAME in som.h.
Treating GET_MODE_SIZE as a constant is OK for the former but not
the latter, which is used in target-independent code. This caused
a build failure on hppa2.0w-hp-hpux11.11.
2018-01-16 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR target/83858
* config/pa/pa.h (FUNCTION_ARG_SIZE): Delete.
* config/pa/pa-protos.h (pa_function_arg_size): Declare.
* config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Use
pa_function_arg_size instead of FUNCTION_ARG_SIZE.
* config/pa/pa.c (pa_function_arg_advance): Likewise.
(pa_function_arg, pa_arg_partial_bytes): Likewise.
(pa_function_arg_size): New function.
tree t = fold_vec_perm (type, arg1, arg2,
vec_perm_indices (sel, 2, nelts));
where fold_vec_perm takes a const vec_perm_indices &. GCC 4.1 apparently
required a public copy constructor:
gcc/vec-perm-indices.h:85: error: 'vec_perm_indices::vec_perm_indices(const vec_perm_indices&)' is private
gcc/fold-const.c:11410: error: within this context
even though no copy should be made here. This patch tries to work
around that by constructing the vec_perm_indices separately.
2018-01-16 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* fold-const.c (fold_ternary_loc): Construct the vec_perm_indices
in a separate statement.
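The workaround pattern looks roughly like this. Older C++ rules required an accessible copy constructor when a temporary was bound to a const reference, even if no copy was made; binding a named object avoids the question. `Indices`, `count` and `call_with_named_object` are hypothetical stand-ins, not the GCC declarations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-copyable type standing in for vec_perm_indices.
class Indices {
public:
  explicit Indices(std::size_t n) : n_(n) {}
  std::size_t size() const { return n_; }

private:
  Indices(const Indices &);             // private and never defined
  Indices &operator=(const Indices &);  // likewise
  std::size_t n_;
};

// Takes its argument by const reference, like fold_vec_perm does.
std::size_t count(const Indices &idx) { return idx.size(); }

std::size_t call_with_named_object(std::size_t n) {
  assert(n > 0);
  Indices idx(n);     // constructed in its own statement
  return count(idx);  // reference binds to an lvalue; no copy is involved
}
```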
In the testcase we were trying to group two gather loads, even though
that isn't supported. Fixed by explicitly disallowing grouping of
gathers and scatters.
This problem didn't show up on SVE because there we convert to
IFN_GATHER_LOAD/IFN_SCATTER_STORE pattern statements, which fail
the can_group_stmts_p check.
2018-01-16 Richard Sandiford <richard.sandiford@linaro.org>
Jakub Jelinek [Tue, 16 Jan 2018 08:55:14 +0000 (09:55 +0100)]
re PR rtl-optimization/83620 (ICE: in assign_by_spills, at lra-assigns.c:1470: unable to find a register to spill with -flive-range-shrinkage --param=max-sched-ready-insns=0)
PR rtl-optimization/83620
* params.def (max-sched-ready-insns): Bump minimum value to 1.
* gcc.dg/pr64935-2.c: Use --param=max-sched-ready-insns=1
instead of --param=max-sched-ready-insns=0.
* gcc.target/i386/pr83620.c: New test.
* gcc.dg/pr83620.c: New test.
Jakub Jelinek [Tue, 16 Jan 2018 08:44:48 +0000 (09:44 +0100)]
re PR c++/83817 (internal compiler error: tree check: expected call_expr, have aggr_init_expr in tsubst_copy_and_build, at cp/pt.c:17822)
PR c++/83817
* pt.c (tsubst_copy_and_build) <case CALL_EXPR>: If function
is AGGR_INIT_EXPR rather than CALL_EXPR, set AGGR_INIT_FROM_THUNK_P
instead of CALL_FROM_THUNK_P.
Jakub Jelinek [Tue, 16 Jan 2018 08:43:31 +0000 (09:43 +0100)]
re PR c++/83825 (ICE on invalid C++ code with shadowed identifiers: in operator[], at vec.h:826)
PR c++/83825
* name-lookup.c (member_vec_dedup): Return early if len is 0.
(resort_type_member_vec, set_class_bindings,
insert_late_enum_def_bindings): Use vec qsort method instead of
calling qsort directly.
"delayed_cr" is just "cr_logical" with the second source operand not
equal to the destination operand. This patch changes it to be
expressed as type "cr_logical", with a new boolean attribute
"cr_logical_3op" added. This simplifies code.
H.J. Lu [Mon, 15 Jan 2018 22:35:36 +0000 (22:35 +0000)]
Don't check ix86_indirect_branch_register for GOT operand
Since GOT_memory_operand and GOT32_symbol_operand are simple pattern
matches, don't check ix86_indirect_branch_register here. If needed,
-mindirect-branch= will convert indirect branch via GOT slot to a call
and return thunk.
Jakub Jelinek [Mon, 15 Jan 2018 21:47:11 +0000 (22:47 +0100)]
re PR middle-end/83837 (libgomp.fortran/pointer[12].f90 FAIL)
PR middle-end/83837
* omp-expand.c (expand_omp_atomic_pipeline): Use loaded_val
type rather than type addr's type points to.
(expand_omp_atomic_mutex): Likewise.
(expand_omp_atomic): Likewise.
Ian Lance Taylor [Mon, 15 Jan 2018 19:13:47 +0000 (19:13 +0000)]
compiler: make sure variables captured by defer closure live
Local variables captured by the deferred closure need to be live
until the function finishes, especially when the deferred
function runs. In Function::build, for a function that has a defer,
we wrap the function body in a try block. So the backend sees
the local variables only live in the try block, without knowing
that they are needed also in the finally block where we invoke
the deferred function. Fix this by creating top-level
declarations for non-escaping address-taken locals when there
is a defer.
With escape analysis turned on, at optimization level -O1 or -O2,
the store "didPanic = false" is elided by the backend's
optimizer, presumably because it thinks "didPanic" is not live
after the store, so the store is useless.
Thomas Koenig [Mon, 15 Jan 2018 18:35:13 +0000 (18:35 +0000)]
re PR fortran/54613 ([F08] Add FINDLOC plus support MAXLOC/MINLOC with KIND=/BACK=)
2018-01-15 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/54613
* gfortran.h (gfc_check_f): Rename f4ml to f5ml.
(gfc_logical_4_kind): New macro.
* intrinsic.h (gfc_simplify_minloc): Add a gfc_expr *argument.
(gfc_simplify_maxloc): Likewise.
(gfc_resolve_maxloc): Likewise.
(gfc_resolve_minloc): Likewise.
* check.c (gfc_check_minloc_maxloc): Add checking for "back"
argument; also raise error if it is used (for now). Add it
if it isn't present.
* intrinsic.c (add_sym_4ml): Rename to
(add_sym_5ml), adjust for extra argument.
(add_functions): Add "back" constant. Adjust maxloc and minloc
for back argument.
* iresolve.c (gfc_resolve_maxloc): Add back argument. If back is
not of gfc_logical_4_kind, convert.
(gfc_resolve_minloc): Likewise.
* simplify.c (gfc_simplify_minloc): Add back argument.
(gfc_simplify_maxloc): Likewise.
* trans-intrinsic.c (gfc_conv_intrinsic_minmaxloc): Rename last
argument to %VAL to ensure passing by value.
(gfc_conv_intrinsic_function): Call gfc_conv_intrinsic_minmaxloc
also for library calls.
H.J. Lu [Mon, 15 Jan 2018 18:16:01 +0000 (18:16 +0000)]
i386: Don't use ASM_OUTPUT_DEF for TARGET_MACHO
ASM_OUTPUT_DEF isn't defined for TARGET_MACHO. Use ASM_OUTPUT_LABEL to
generate the __x86_return_thunk label, instead of the set directive.
Update testcase to remove the __x86_return_thunk label check. Since
-fno-pic is ignored on Darwin, update testcases to scan for "push" only
on Linux.
gcc/
PR target/83839
* config/i386/i386.c (output_indirect_thunk_function): Use
ASM_OUTPUT_LABEL, instead of ASM_OUTPUT_DEF, for TARGET_MACHO
for __x86_return_thunk.
Kyrylo Tkachov [Mon, 15 Jan 2018 11:56:03 +0000 (11:56 +0000)]
[arm] PR target/83687: Fix invalid combination of VSUB + VABS into VABD
In this wrong-code bug we combine a VSUB.I8 and a VABS.S8
into a VABD.S8 instruction. This combination is not valid
for integer operands because in the VABD instruction the semantics
are that the difference is computed in notionally infinite precision
and the absolute difference is computed on that, whereas for a
VSUB.I8 + VABS.S8 sequence the VSUB operation will perform any
wrapping that's needed for the 8-bit signed type before the VABS
gets its hands on it.
This leads to the wrong-code in the PR where the expected
sequence from the intrinsics:
VSUB + VABS of two vectors {-100, -100, -100...}, {100, 100, 100...}
gives a result of {56, 56, 56...} (-100 - 100)
but GCC optimises it into a single
VABD of {-100, -100, -100...}, {100, 100, 100...}
which produces a result of {200, 200, 200...}
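The arithmetic above can be checked on a single lane. This is a scalar sketch, not the NEON intrinsics: `wrap_s8`, `vsub_then_vabs` and `vabd` are hypothetical helpers that model one 8-bit element with ordinary integers.

```cpp
#include <cassert>
#include <cstdlib>

// Reduce a value to the signed 8-bit range with two's-complement wrapping.
int wrap_s8(int v) {
  int low = ((v % 256) + 256) % 256;  // 0..255
  return low >= 128 ? low - 256 : low;
}

// One lane of VSUB.I8 followed by VABS.S8: the subtraction wraps to 8 bits
// before the absolute value is taken, and the result is 8 bits again.
int vsub_then_vabs(int a, int b) {
  assert(a >= -128 && a <= 127 && b >= -128 && b <= 127);
  return wrap_s8(std::abs(wrap_s8(a - b)));
}

// One lane of VABD.S8: the difference is taken without intermediate
// wrapping; the result (0..255) is shown as an unsigned 8-bit value.
int vabd(int a, int b) {
  assert(a >= -128 && a <= 127 && b >= -128 && b <= 127);
  return std::abs(a - b);
}
```

For the operands in the PR, -100 and 100, the first function gives 56 and the second gives 200; for operands whose difference fits in 8 bits, such as 3 and 5, the two agree.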
The transformation is still valid for floating-point operands,
which is why it was added in the first place I believe (r178817)
but this patch disables it for integer operands.
The HFmode variants though only exist for TARGET_NEON_FP16INST, so
this patch adds the appropriate guards to the new mode iterator.
Bootstrapped and tested on arm-none-linux-gnueabihf.
PR target/83687
* config/arm/iterators.md (VF): New mode iterator.
* config/arm/neon.md (neon_vabd<mode>_2): Use the above.
Remove integer-related logic from pattern.
(neon_vabd<mode>_3): Likewise.
* gcc.target/arm/neon-combine-sub-abs-into-vabd.c: Delete integer
tests.
* gcc.target/arm/pr83687.c: New test.
Make optional conditionally trivially_{copy,move}_{constructible,assignable}
* include/std/optional (_Optional_payload): Fix the comment in
the class head and turn into a primary and one specialization.
(_Optional_payload::_M_engaged): Strike the NSDMI.
(_Optional_payload<_Tp, false>::operator=(const _Optional_payload&)):
New.
(_Optional_payload<_Tp, false>::operator=(_Optional_payload&&)):
Likewise.
(_Optional_payload<_Tp, false>::_M_get): Likewise.
(_Optional_payload<_Tp, false>::_M_reset): Likewise.
(_Optional_base_impl): Likewise.
(_Optional_base): Turn into a primary and three specializations.
(optional(nullopt)): Change the base init.
* testsuite/20_util/optional/assignment/8.cc: New.
* testsuite/20_util/optional/cons/trivial.cc: Likewise.
* testsuite/20_util/optional/cons/value_neg.cc: Adjust.
Jonathan Wakely [Mon, 15 Jan 2018 11:13:53 +0000 (11:13 +0000)]
PR libstdc++/80276 fix template argument handling in type printers
PR libstdc++/80276
* python/libstdcxx/v6/printers.py (strip_inline_namespaces): New.
(get_template_arg_list): New.
(StdVariantPrinter._template_args): Remove, use get_template_arg_list
instead.
(TemplateTypePrinter): Rewrite to work with gdb.Type objects instead
of strings and regular expressions.
(add_one_template_type_printer): Adapt to new TemplateTypePrinter.
(FilteringTypePrinter): Add docstring. Match using startswith. Use
strip_inline_namespaces instead of strip_versioned_namespace.
(add_one_type_printer): Prepend namespace to match argument.
(register_type_printers): Add type printers for char16_t and char32_t
string types and for types using cxx11 ABI. Update calls to
add_one_template_type_printer to provide default argument dicts.
* testsuite/libstdc++-prettyprinters/80276.cc: New test.
* testsuite/libstdc++-prettyprinters/whatis.cc: Remove tests for
basic_string<unsigned char> and basic_string<signed char>.
* testsuite/libstdc++-prettyprinters/whatis2.cc: Duplicate whatis.cc
to test local variables, without overriding _GLIBCXX_USE_CXX11_ABI.
Jakub Jelinek [Mon, 15 Jan 2018 09:05:59 +0000 (10:05 +0100)]
re PR middle-end/82694 (Linux kernel miscompiled since r250765)
PR middle-end/82694
* common.opt (fstrict-overflow): No longer an alias.
(fwrapv-pointer): New option.
* tree.h (TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED): Define
also for pointer types based on flag_wrapv_pointer.
* opts.c (common_handle_option) <case OPT_fstrict_overflow>: Set
opts->x_flag_wrap[pv] to !value, clear opts->x_flag_trapv if
opts->x_flag_wrapv got set.
* fold-const.c (fold_comparison, fold_binary_loc): Revert 2017-08-01
changes, just use TYPE_OVERFLOW_UNDEFINED on pointer type instead of
POINTER_TYPE_OVERFLOW_UNDEFINED.
* match.pd: Likewise in address comparison pattern.
* doc/invoke.texi: Document -fwrapv and -fstrict-overflow.
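The opts.c entry above amounts to a small state update. The sketch below is one reading of it, assuming "x_flag_wrap[pv]" abbreviates the two flags for -fwrapv and -fwrapv-pointer; `OverflowFlags` and `apply_fstrict_overflow` are hypothetical names, not GCC's option structures.

```cpp
#include <cassert>

// Hypothetical mirror of the three option flags involved.
struct OverflowFlags {
  bool wrapv;          // -fwrapv
  bool wrapv_pointer;  // -fwrapv-pointer
  bool trapv;          // -ftrapv
};

// value is true for -fstrict-overflow and false for -fno-strict-overflow.
OverflowFlags apply_fstrict_overflow(OverflowFlags f, bool value) {
  f.wrapv = !value;
  f.wrapv_pointer = !value;
  if (f.wrapv)
    f.trapv = false;  // wrapping and trapping are not both left enabled
  assert(!(f.wrapv && f.trapv));
  return f;
}
```

Under this reading, -fno-strict-overflow turns on both wrapping flags and switches trapping off, while -fstrict-overflow turns both wrapping flags off and leaves the trapping flag alone.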
Richard Biener [Mon, 15 Jan 2018 08:57:28 +0000 (08:57 +0000)]
re PR lto/83804 ([meta] LTO memory consumption)
2018-01-15 Richard Biener <rguenther@suse.de>
PR lto/83804
* tree.c (free_lang_data_in_type): Always unlink TYPE_DECLs
from TYPE_FIELDS. Free TYPE_BINFO if not used by devirtualization.
Reset type names to their identifier if their TYPE_DECL doesn't
have linkage (and thus is used for ODR and devirt).
(save_debug_info_for_decl): Remove.
(save_debug_info_for_type): Likewise.
(add_tree_to_fld_list): Adjust.
* tree-pretty-print.c (dump_generic_node): Make dumping of
type names more robust.
Jakub Jelinek [Sun, 14 Jan 2018 16:19:14 +0000 (17:19 +0100)]
config.gcc (i[34567]86-*-*): Remove one duplicate gfniintrin.h entry from extra_headers.
* config.gcc (i[34567]86-*-*): Remove one duplicate gfniintrin.h
entry from extra_headers.
(x86_64-*-*): Remove two duplicate gfniintrin.h entries from
extra_headers, make the list bitwise identical to the i?86-*-* one.
H.J. Lu [Sun, 14 Jan 2018 14:43:10 +0000 (14:43 +0000)]
x86: Disallow -mindirect-branch=/-mfunction-return= with -mcmodel=large
Since the thunk function may not be reachable in large code model,
-mcmodel=large is incompatible with -mindirect-branch=thunk,
-mindirect-branch=thunk-extern, -mfunction-return=thunk and
-mfunction-return=thunk-extern. Issue an error when they are used with
-mcmodel=large.
gcc/
* config/i386/i386.c (ix86_set_indirect_branch_type): Disallow
-mcmodel=large with -mindirect-branch=thunk,
-mindirect-branch=thunk-extern, -mfunction-return=thunk and
-mfunction-return=thunk-extern.
* doc/invoke.texi: Document -mcmodel=large is incompatible with
-mindirect-branch=thunk, -mindirect-branch=thunk-extern,
-mfunction-return=thunk and -mfunction-return=thunk-extern.
foo:
movq func_p(%rip), %rax
call __x86_indirect_thunk_rax
ret
gcc/
* config/i386/i386.c (print_reg): Print the name of the full
integer register without '%'.
(ix86_print_operand): Handle 'V'.
* doc/extend.texi: Document 'V' modifier.
gcc/testsuite/
* gcc.target/i386/indirect-thunk-register-4.c: New test.
H.J. Lu [Sun, 14 Jan 2018 14:40:01 +0000 (14:40 +0000)]
x86: Add -mindirect-branch-register
Add -mindirect-branch-register to force indirect branch via register.
This is implemented by disabling patterns of indirect branch via memory,
similar to TARGET_X32.
-mindirect-branch= and -mfunction-return= tests are updated with
-mno-indirect-branch-register to avoid false test failures when
-mindirect-branch-register is added to RUNTESTFLAGS for "make check".
gcc/
* config/i386/constraints.md (Bs): Disallow memory operand for
-mindirect-branch-register.
(Bw): Likewise.
* config/i386/predicates.md (indirect_branch_operand): Likewise.
(GOT_memory_operand): Likewise.
(call_insn_operand): Likewise.
(sibcall_insn_operand): Likewise.
(GOT32_symbol_operand): Likewise.
* config/i386/i386.md (indirect_jump): Call convert_memory_address
for -mindirect-branch-register.
(tablejump): Likewise.
(*sibcall_memory): Likewise.
(*sibcall_value_memory): Likewise.
Disallow peepholes of indirect call and jump via memory for
-mindirect-branch-register.
(*call_pop): Replace m with Bw.
(*call_value_pop): Likewise.
(*sibcall_pop_memory): Replace m with Bs.
* config/i386/i386.opt (mindirect-branch-register): New option.
* doc/invoke.texi: Document -mindirect-branch-register option.
H.J. Lu [Sun, 14 Jan 2018 14:37:39 +0000 (14:37 +0000)]
x86: Add -mfunction-return=
Add -mfunction-return= option to convert function return to call and
return thunks. The default is 'keep', which keeps function return
unmodified. 'thunk' converts function return to call and return thunk.
'thunk-inline' converts function return to inlined call and return thunk.
'thunk-extern' converts function return to external call and return
thunk provided in a separate object file. You can control this behavior
for a specific function by using the function attribute function_return.
Function return thunk is the same as memory thunk for -mindirect-branch=
where the return address is at the top of the stack:
-mindirect-branch= tests are updated with -mfunction-return=keep to
avoid false test failures when -mfunction-return=thunk is added to
RUNTESTFLAGS for "make check".
gcc/
* config/i386/i386-protos.h (ix86_output_function_return): New.
* config/i386/i386.c (ix86_set_indirect_branch_type): Also
set function_return_type.
(indirect_thunk_name): Add ret_p to indicate thunk for function
return.
(output_indirect_thunk_function): Pass false to
indirect_thunk_name.
(ix86_output_indirect_branch): Likewise.
(output_indirect_thunk_function): Create alias for function
return thunk if regno < 0.
(ix86_output_function_return): New function.
(ix86_handle_fndecl_attribute): Handle function_return.
(ix86_attribute_table): Add function_return.
* config/i386/i386.h (machine_function): Add
function_return_type.
* config/i386/i386.md (simple_return_internal): Use
ix86_output_function_return.
(simple_return_internal_long): Likewise.
* config/i386/i386.opt (mfunction-return=): New option.
(indirect_branch): Mention -mfunction-return=.
* doc/extend.texi: Document function_return function attribute.
* doc/invoke.texi: Document -mfunction-return= option.
H.J. Lu [Sun, 14 Jan 2018 14:35:19 +0000 (14:35 +0000)]
x86: Add -mindirect-branch=
Add -mindirect-branch= option to convert indirect call and jump to call
and return thunks. The default is 'keep', which keeps indirect call and
jump unmodified. 'thunk' converts indirect call and jump to call and
return thunk. 'thunk-inline' converts indirect call and jump to inlined
call and return thunk. 'thunk-extern' converts indirect call and jump to
external call and return thunk provided in a separate object file. You
can control this behavior for a specific function by using the function
attribute indirect_branch.
Two kinds of thunks are generated. Memory thunk where the function address
is at the top of the stack:
After inlining A into B, inline_small_functions updates the information
for (most) callees and callers of the new B:
update_callee_keys (&edge_heap, where, updated_nodes);
[...]
/* Our profitability metric can depend on local properties
such as number of inlinable calls and size of the function body.
After inlining these properties might change for the function we
inlined into (since it's body size changed) and for the functions
called by function we inlined (since number of it inlinable callers
might change). */
update_caller_keys (&edge_heap, where, updated_nodes, NULL);
These functions in turn call can_inline_edge_p for most of the associated
edges:
can_inline_edge_p indirectly calls estimate_calls_size_and_time
on the caller node, which seems to recursively process all callee
edges rooted at the node. It looks from this like the algorithm
can be at least quadratic in the worst case.
Maybe there's something we can do to make can_inline_edge_p cheaper, but
since neither of these two calls is responsible for reporting an inline
failure reason, it seems cheaper to test want_inline_small_function_p
first, so that we don't calculate an estimate for something that we
already know isn't a "small function". I think the only change
needed to make that work is to check for CIF_FINAL_ERROR in
want_inline_small_function_p; at the moment we rely on can_inline_edge_p
to make that check.
This cuts the time to build optabs.ii by over 4% with an
--enable-checking=release compiler on x86_64-linux-gnu. I've seen more
dramatic wins on aarch64-linux-gnu due to the NUM_POLY_INT_COEFFS==2
thing. The patch doesn't affect the output code.
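The reordering can be sketched as ordinary short-circuit evaluation: run the cheap filter first so the expensive estimate is skipped for edges that would be rejected anyway. Everything here is a hypothetical stand-in (`Edge`, `Stats`, the size limit and the three functions), not the ipa-inline.c code.

```cpp
#include <cassert>

// Hypothetical call-graph edge: whether inlining already failed for good,
// and a size figure for the callee.
struct Edge {
  bool final_error;
  int callee_size;
};

// Counts how often the expensive check runs.
struct Stats {
  int expensive_calls = 0;
};

// Stand-in for the cheap check; it also rejects edges with a final error,
// as the patch makes want_inline_small_function_p do.
bool want_small(const Edge &e, int limit) {
  return !e.final_error && e.callee_size <= limit;
}

// Stand-in for the expensive check, which would compute size/time estimates.
bool can_inline(const Edge &e, Stats &s) {
  ++s.expensive_calls;
  return !e.final_error;
}

// Cheap test first; && skips the expensive one when the cheap test fails.
bool worth_keeping(const Edge &e, int limit, Stats &s) {
  assert(limit >= 0);
  return want_small(e, limit) && can_inline(e, s);
}
```

In this model an oversized callee is rejected without the counter moving, which is the saving the patch is after.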
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* ipa-inline.c (want_inline_small_function_p): Return false if
inlining has already failed with CIF_FINAL_ERROR.
(update_caller_keys): Call want_inline_small_function_p before
can_inline_edge_p.
(update_callee_keys): Likewise.