* bb-reorder.c (find_traces_1_round): Fix off-by-one index.
Move comment around. Do not reset best_edge for a copiable
destination if the copy would cause a partition change.
(better_edge_p): Remove redundant check.
Michael Meissner [Fri, 27 Oct 2017 18:15:38 +0000 (18:15 +0000)]
builtins.c (CASE_MATHFN_FLOATN): New helper macro to add cases for math functions that have _Float<N> and...
[gcc]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* builtins.c (CASE_MATHFN_FLOATN): New helper macro to add cases
for math functions that have _Float<N> and _Float<N>X variants.
(mathfn_built_in_2): Add support for math functions that have
_Float<N> and _Float<N>X variants.
(DEF_INTERNAL_FLT_FLOATN_FN): New helper macro.
(expand_builtin_mathfn_ternary): Add support for fma with
_Float<N> and _Float<N>X variants.
(expand_builtin): Likewise.
(fold_builtin_3): Likewise.
* builtins.def (DEF_EXT_LIB_FLOATN_NX_BUILTINS): New macro to
create math function _Float<N> and _Float<N>X variants as external
library builtins.
(BUILT_IN_COPYSIGN _Float<N> and _Float<N>X variants) Use
DEF_EXT_LIB_FLOATN_NX_BUILTINS to make built-in functions using
the __builtin_ prefix and if not strict ansi, without the prefix.
(BUILT_IN_FABS _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMA _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMAX _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMIN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_NAN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_SQRT _Float<N> and _Float<N>X variants): Likewise.
* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16): New
function signatures for fma _Float<N> and _Float<N>X variants.
(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
* gencfn-macros.c (print_case_cfn): Add support for math functions
that have _Float<N> and _Float<N>X variants.
(print_define_operator_list): Likewise.
(fltfn_suffixes): Likewise.
(main): Likewise.
* internal-fn.def (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro
for math functions that have _Float<N> and _Float<N>X variants.
(SQRT): Add support for sqrt, copysign, fmin and fmax _Float<N>
and _Float<N>X variants.
(COPYSIGN): Likewise.
(FMIN): Likewise.
(FMAX): Likewise.
* fold-const.c (tree_call_nonnegative_warnv_p): Add support for
copysign, fma, fmax, fmin, and sqrt _Float<N> and _Float<N>X
variants.
(integer_valued_read_call_p): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
(fold_const_call_sss): Add support for copysign, fmin, and fmax
_Float<N> and _Float<N>X variants.
(fold_const_call_ssss): Add support for fma _Float<N> and
_Float<N>X variants.
* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Add
support for copysign and fma _Float<N> and _Float<N>X variants.
(backprop::process_builtin_call_use): Likewise.
* tree-call-cdce.c (can_test_argument_range); Add support for
sqrt _Float<N> and _Float<N>X variants.
(edom_only_function): Likewise.
(get_no_error_domain): Likewise.
* tree-ssa-math-opts.c (internal_fn_reciprocal): Likewise.
* tree-ssa-reassoc.c (attempt_builtin_copysign): Add support for
copysign _Float<N> and _Float<N>X variants.
* config/rs6000/rs6000-builtin.def (SQRTF128): Delete, this is now
handled by machine independent code.
(FMAF128): Likewise.
* doc/cpp.texi (Common Predefined Macros): Document defining
__FP_FAST_FMAF<N> and __FP_FAST_FMAF<N>X if the backend supports
fma _Float<N> and _Float<N>X variants.
[gcc/c]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-decl.c (header_for_builtin_fn): Add support for copysign, fma,
fmax, fmin, and sqrt _Float<N> and _Float<N>X variants.
[gcc/c-family]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC KFmode.
(c_cpp_builtins): If a machine has a fast fma _Float<N> and
_Float<N>X variant, define __FP_FAST_FMA<N> and/or
__FP_FAST_FMA<N>X.
[gcc/testsuite]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/float128-hw.c: Add support for all 4 FMA
variants. Check various conversions to/from float128. Check
negation. Use {\m...\M} in the tests.
* gcc.target/powerpc/float128-hw2.c: New test for implicit
_Float128 math functions.
* gcc.target/powerpc/float128-hw3.c: New test for strict ansi mode
not implicitly adding the _Float128 math functions.
* gcc.target/powerpc/float128-fma2.c: Delete, test is no longer
valid.
* gcc.target/powerpc/float128-sqrt2.c: Likewise.
Jeff Law [Fri, 27 Oct 2017 16:54:49 +0000 (10:54 -0600)]
gimple-ssa-sprintf.c: Include domwalk.h.
* gimple-ssa-sprintf.c: Include domwalk.h.
(class sprintf_dom_walker): New class, derived from dom_walker.
(sprintf_dom_walker::before_dom_children): New function.
(struct call_info): Moved into sprintf_dom_walker class
(compute_formath_length, handle_gimple_call): Likewise.
(sprintf_length::execute): Call the dominator walker rather
than walking the statements.
Jeff Law [Fri, 27 Oct 2017 15:35:37 +0000 (09:35 -0600)]
tree-vrp.c (check_all_array_refs): Do not use wi->info to smuggle gimple statement locations.
* tree-vrp.c (check_all_array_refs): Do not use wi->info to smuggle
gimple statement locations.
(check_array_bounds): Corresponding changes. Get the statement's
location directly from wi->stmt.
Palmer Dabbelt [Fri, 27 Oct 2017 15:22:43 +0000 (15:22 +0000)]
RISC-V: Correct and improve the "-mabi" documentation
The documentation for the "-mabi" argument on RISC-V was incorrect. We
chose to treat this as a documentation bug rather than a code bug, and
to make the documentation match what GCC currently does. In the
process, I also improved the documentation a bit.
* cgraph.h (set_malloc_flag): Declare.
* cgraph.c (set_malloc_flag_1): New function.
(set_malloc_flag): Likewise.
* ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
* ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
false.
(read_ipa_call_summary): Add support for reading is_return_callee.
(write_ipa_call_summary): Stream is_return_callee.
* ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
* ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
ipa-prop.h, ipa-fnsummary.h.
(pure_const_names): Change to static.
(malloc_state_e): Define.
(malloc_state_names): Define.
(funct_state_d): Add field malloc_state.
(varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
(check_retval_uses): New function.
(malloc_candidate_p): Likewise.
(analyze_function): Add support for malloc attribute.
(pure_const_write_summary): Stream malloc_state.
(pure_const_read_summary): Add support for reading malloc_state.
(dump_malloc_lattice): New function.
(propagate_malloc): New function.
(warn_function_malloc): New function.
(ipa_pure_const::execute): Call propagate_malloc and
ipa_free_fn_summary.
(pass_local_pure_const::execute): Add support for malloc attribute.
* ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
* doc/invoke.texi: Document Wsuggest-attribute=malloc.
Michael Collison [Fri, 27 Oct 2017 06:05:58 +0000 (06:05 +0000)]
aarch64.md (<optab>_trunc><vf><GPI:mode>2): New pattern.
2017-10-26 Michael Collison <michael.collison@arm.com>
* config/aarch64/aarch64.md(<optab>_trunc><vf><GPI:mode>2):
New pattern.
(<optab>_trunchf<GPI:mode>2: New pattern.
(<optab>_trunc<vgp><GPI:mode>2: New pattern.
* config/aarch64/iterators.md (wv): New mode attribute.
(vf, VF): New mode attributes.
(vgp, VGP): New mode attributes.
(s): Update attribute with SImode and DImode prefixes.
* testsuite/gcc.target/aarch64/fix_trunc1.c: New testcase.
* testsuite/gcc.target/aarch64/vect-vcvt.c: Fix scan-assembler
directives to allow float or integer destination registers for
fcvtz[su].
Michael Meissner [Thu, 26 Oct 2017 17:33:38 +0000 (17:33 +0000)]
aix.h (TARGET_IEEEQUAD_DEFAULT): Set long double default to IBM.
[gcc]
2017-10-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/aix.h (TARGET_IEEEQUAD_DEFAULT): Set long double
default to IBM.
* config/rs6000/darwin.h (TARGET_IEEEQUAD_DEFAULT): Likewise.
* config/rs6000/rs6000.opt (-mabi=ieeelongdouble): Move the
warning to rs6000.c. Remove the Undocumented flag, since it has
been documented.
(-mabi=ibmlongdouble): Likewise.
* config/rs6000/rs6000.c (TARGET_IEEEQUAD_DEFAULT): If it is not
already set, set the default format for long double.
(rs6000_debug_reg_global): Print whether long double is IBM or
IEEE.
(rs6000_option_override_internal): Rework setting long double
format. Only warn if the user is changing the long double default
and they did not use -Wno-psabi.
* doc/invoke.texi (PowerPC options): Update the documentation for
-mabi=ieeelongdouble and -mabi=ibmlongdouble.
This patch adds helper functions that say which of the two modes
involved in a subreg is the larger, preferring the outer mode in
the event of a tie. It also converts IRA and reload to track modes
instead of byte sizes, since this is slightly more convenient when
variable-sized modes are added later.
2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (wider_subreg_mode): New function.
* ira.h (ira_sort_regnos_for_alter_reg): Take a machine_mode *
rather than an unsigned int *.
* ira-color.c (regno_max_ref_width): Replace with...
(regno_max_ref_mode): ...this new variable.
(coalesced_pseudo_reg_slot_compare): Update accordingly.
Use wider_subreg_mode.
(ira_sort_regnos_for_alter_reg): Likewise. Take a machine_mode *
rather than an unsigned int *.
* lra-constraints.c (uses_hard_regs_p): Use wider_subreg_mode.
(process_alt_operands): Likewise.
(invariant_p): Likewise.
* lra-spills.c (assign_mem_slot): Likewise.
(add_pseudo_to_slot): Likewise.
* lra.c (collect_non_operand_hard_regs): Likewise.
(add_regs_to_insn_regno_info): Likewise.
* reload1.c (regno_max_ref_width): Replace with...
(regno_max_ref_mode): ...this new variable.
(reload): Update accordingly. Update call to
ira_sort_regnos_for_alter_reg.
(alter_reg): Update to use regno_max_ref_mode. Call wider_subreg_mode.
(init_eliminable_invariants): Update to use regno_max_ref_mode.
(scan_paradoxical_subregs): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254115
Wilco Dijkstra [Thu, 26 Oct 2017 16:51:37 +0000 (16:51 +0000)]
Introduce emit_frame_chain
The current frame code combines the separate concepts of a frame chain
(saving old FP,LR in a record and pointing new FP to it) and a frame
pointer used to access locals. Add emit_frame_chain to the aarch64_frame
descriptor and use it in the prolog and epilog code. For now just
initialize it as before, so generated code is identical.
Also correctly set EXIT_IGNORE_STACK. The current AArch64 epilog code
restores SP from FP if alloca is used. If a frame pointer is used but
there is no alloca, SP must remain valid for the epilog to work correctly.
gcc/
* config/aarch64/aarch64.h (EXIT_IGNORE_STACK): Set if alloca is used.
(aarch64_frame): Add emit_frame_chain boolean.
* config/aarch64/aarch64.c (aarch64_frame_pointer_required)
Move eh_return case to aarch64_layout_frame.
(aarch64_layout_frame): Initialize emit_frame_chain.
(aarch64_expand_prologue): Use emit_frame_chain.
Wilco Dijkstra [Thu, 26 Oct 2017 16:40:25 +0000 (16:40 +0000)]
Simplify frame layout for stack probing
This patch makes some changes to the frame layout in order to simplify
stack probing. We want to use the save of LR as a probe in any non-leaf
function. With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves. So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.
Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer. Doing so allows the store of LR to be used as
a valid stack probe in all frames.
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame):
Ensure LR is always stored at the bottom of the callee-saves.
Remove rarely used frame layout which saves callee-saves at top of
frame, so the store of LR can be used as a valid probe in all cases.
Wilco Dijkstra [Thu, 26 Oct 2017 16:34:03 +0000 (16:34 +0000)]
Improve addressing of TI/TFmode
In https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01125.html Jiong
pointed out some addressing inefficiencies due to a recent change in
regcprop (https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00775.html).
This patch improves aarch64_legitimize_address_displacement to split
unaligned offsets of TImode and TFmode accesses. The resulting code
is better and no longer relies on the original regcprop optimization.
This patch uses df_read_modify_subreg_p to check whether writing
to a subreg would preserve some of the existing contents.
This has the effect of putting more emphasis on the
REGMODE_NATURAL_SIZE-based definition of whether something can be
partially modified, instead of using UNITS_PER_WORD unconditionally.
This becomes important for SVE, where UNITS_PER_WORD has no
significance for subregs of multi-register LD2/ST2, LD3/ST3 and
LD4/ST4 tuples.
2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
Stop print_hex from printing bits above the precision
2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* wide-int-print.cc (print_hex): Loop based on extract_uhwi.
Don't print any bits outside the precision of the value.
* wide-int.cc (test_printing): Add some new tests.
James Greenhalgh [Thu, 26 Oct 2017 14:17:40 +0000 (14:17 +0000)]
[obvious][arm testsuite] Fixup expected location in require-pic-register-loc.c
After r254010 we now add -gcolumn-info by default, that means the tests
in gcc.target/arm/require-pic-register-loc.c need adjusting to not expect
to see column zero.
gcc/testsuite/
* gcc.target/arm/require-pic-register-loc.c: Use wider regex for
column information.
David Malcolm [Wed, 25 Oct 2017 23:53:41 +0000 (23:53 +0000)]
C: detect more missing semicolons (PR c/7356)
c_parser_declaration_or_fndef has logic for parsing what might be
either a declaration or a function definition.
This patch adds a test to detect cases where a semicolon would have
terminated the decls as a declaration, where the token that follows
would start a new declaration specifier, and updates the error message
accordingly, with a fix-it hint.
This addresses PR c/7356, fixing the case of a stray token before a
#include that previously gave inscrutable output, and improving e.g.:
int i
int j;
from:
t.c:2:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'int'
int j;
^~~
to:
t.c:1:6: error: expected ';' before 'int'
int i
^
;
int j;
~~~
gcc.dg/noncompile/920923-1.c needs a slight update, as the output for
the first line changes from:
gcc/testsuite/ChangeLog:
PR c/7356
PR c/44515
* c-c++-common/pr44515.c: New test case.
* gcc.dg/pr7356-2.c: New test case.
* gcc.dg/pr7356.c: New test case.
* gcc.dg/spellcheck-typenames.c: Update the "singed" char "TODO"
case to reflect changes to output.
* gcc.dg/noncompile/920923-1.c: Add dg-warning to reflect changes
to output.
Palmer Dabbelt [Wed, 25 Oct 2017 22:45:55 +0000 (22:45 +0000)]
RISC-V: Add Sign/Zero extend patterns for PIC loads
Loads on RISC-V are sign-extending by default, but we weren't telling
GCC this in our PIC load patterns. This corrects the problem, and adds
a zero-extending pattern as well.
gcc/ChangeLog
2017-10-25 Palmer Dabbelt <palmer@dabbelt.com>
* config/riscv/riscv.md (ZERO_EXTEND_LOAD): Define.
* config/riscv/pic.md (local_pic_load): Rename to local_pic_load_s,
mark as a sign-extending load.
(local_pic_load_u): Define.
Eric Botcazou [Wed, 25 Oct 2017 21:53:21 +0000 (21:53 +0000)]
re PR middle-end/82062 (simple conditional expressions no longer folded)
PR middle-end/82062
* fold-const.c (operand_equal_for_comparison_p): Also return true
if ARG0 is a simple variant of ARG1 with narrower precision.
(fold_ternary_loc): Always pass unstripped operands to the predicate.
Richard Biener [Wed, 25 Oct 2017 10:20:37 +0000 (10:20 +0000)]
tree-ssa-sccvn.h (vn_eliminate): Declare.
2017-10-25 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.h (vn_eliminate): Declare.
* tree-ssa-pre.c (class eliminate_dom_walker, eliminate,
class pass_fre): Move to ...
* tree-ssa-sccvn.c (class eliminate_dom_walker, vn_eliminate,
class pass_fre): ... here and adjust for statistics.
Jakub Jelinek [Wed, 25 Oct 2017 08:05:58 +0000 (10:05 +0200)]
re PR libstdc++/81706 (std::sin vectorization bug)
PR libstdc++/81706
* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
(duplicate_one_attribute, copy_attributes_to_builtin): New functions.
* attribs.h (duplicate_one_attribute, copy_attributes_to_builtin): New
declarations.
* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
newdecl to corresponding __builtin_ if any.
* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
newdecl to corresponding __builtin_ if any.
* gcc.target/i386/pr81706.c: New test.
* g++.dg/ext/pr81706.C: New test.
* tree-ssa-pre.c (need_eh_cleanup, need_ab_cleanup, el_to_remove,
el_to_fixup, el_todo, el_avail, el_avail_stack, eliminate_avail,
eliminate_push_avail, eliminate_insert): Move inside...
(class eliminate_dom_walker): ... this class in preparation
of move.
(fini_eliminate): Remove by merging with ...
(eliminate): ... this function. Adjust for class changes.
(pass_pre::execute): Remove fini_eliminate call.
(pass_fre::execute): Likewise.
Jakub Jelinek [Tue, 24 Oct 2017 19:35:37 +0000 (21:35 +0200)]
re PR target/82460 (AVX512: choose between vpermi2d and vpermt2d to save mov instructions. Also, fails to optimize away shifts before shuffle)
PR target/82460
* config/i386/sse.md (UNSPEC_VPERMI2, UNSPEC_VPERMI2_MASK): Remove.
(VPERMI2, VPERMI2I): New mode iterators.
(<avx512>_vpermi2var<mode>3_maskz): Remove 3 define_expand patterns.
(<avx512>_vpermi2var<mode>3<sd_maskz_name>): Remove 3 define_insn
patterns.
(<avx512>_vpermi2var<mode>3_mask): New define_expand using VPERMI2
mode iterator. Remove 3 old define_insn patterns.
(*<avx512>_vpermi2var<mode>3_mask): 2 new define_insn patterns.
(<avx512>_vpermt2var<mode>3_maskz): Adjust 1 define_expand to use
VPERMI2 mode iterator, remove the other two expanders.
(<avx512>_vpermt2var<mode>3<sd_maskz_name>): Adjust 1 define_insn
to use VPERMI2 mode iterator, add another alternative for vpermi2*
instructions, remove the other two patterns.
(<avx512>_vpermt2var<mode>3_mask): Adjust 1 define_insn to use VPERMI2
mode iterator, remove the other two patterns.
* config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Renamed to ...
(ix86_expand_vec_perm_vpermt2): ... this. Swap mask and op0
arguments, use gen_*vpermt2* expanders instead of gen_*vpermi2*
and adjust argument order accordingly.
(ix86_expand_vec_perm): Adjust caller.
(expand_vec_perm_1): Likewise.
(expand_vec_perm_vpermi2_vpshub2): Rename to ...
(expand_vec_perm_vpermt2_vpshub2): ... this.
(ix86_expand_vec_perm_const_1): Adjust caller.
(ix86_vectorize_vec_perm_const_ok): Adjust comments.
* gcc.target/i386/pr82460-1.c: New test.
* gcc.target/i386/pr82460-2.c: New test.
* gcc.target/i386/avx512f-vpermt2pd-1.c: Adjust scan-assembler*
regexps to allow vpermt2* to vpermi2* replacement or vice versa
where possible.
* gcc.target/i386/avx512vl-vpermt2pd-1.c: Likewise.
* gcc.target/i386/avx512f-vpermt2d-1.c: Likewise.
* gcc.target/i386/vect-pack-trunc-2.c: Likewise.
* gcc.target/i386/avx512vl-vpermt2ps-1.c: Likewise.
* gcc.target/i386/avx512vl-vpermt2q-1.c: Likewise.
* gcc.target/i386/avx512f-vpermt2ps-1.c: Likewise.
* gcc.target/i386/avx512vl-vpermt2d-1.c: Likewise.
* gcc.target/i386/avx512bw-vpermt2w-1.c: Likewise.
* gcc.target/i386/avx512vbmi-vpermt2b-1.c: Likewise.
* gcc.target/i386/avx512f-vpermt2q-1.c: Likewise.
Wilco Dijkstra [Tue, 24 Oct 2017 17:21:19 +0000 (17:21 +0000)]
Cleanup autopref scheduling
r253236 broke AArch64 bootstrap. Earlier revision r253071 changed scheduling
behaviour on AArch64 as autopref scheduling no longer checks the base.
This patch fixes the bootstrap failure and cleans up autopref scheduling.
The code is greatly simplified. Sort accesses on the offset first, and
only if the offsets are the same fall back to other comparisons in
rank_for_schedule. This doesn't at all restore the original behaviour
since we no longer compare the base address, but it now defines a total
sorting order. More work will be required to improve the sorting so
that only loads/stores with the same base are affected.
Wilco Dijkstra [Tue, 24 Oct 2017 16:58:02 +0000 (16:58 +0000)]
PR60580: Fix frame pointer option magic
To fix PR60580 simplify the logic in aarch64_override_options_after_change_1 ().
If the frame pointer is enabled, set it to a special value that behaves similar
to frame pointer omission. If we don't do this all leaf functions will get a
frame pointer even if flag_omit_leaf_frame_pointer is set.
If flag_omit_frame_pointer has this special value, we must force the frame
pointer if not in a leaf function. We also need to force it in a leaf function
if flag_omit_frame_pointer is not set or if LR is used.
Doing this allows both -fomit-frame-pointer and -fomit-leaf-frame-pointer to be
independently set and changed in each function with the expected behaviour.
gcc/
PR middle-end/60580
* config/aarch64/aarch64.c (aarch64_frame_pointer_required)
Check special value of flag_omit_frame_pointer.
(aarch64_can_eliminate): Likewise.
(aarch64_override_options_after_change_1): Simplify handling of
-fomit-frame-pointer and -fomit-leaf-frame-pointer.
Alan Modra [Tue, 24 Oct 2017 12:45:01 +0000 (23:15 +1030)]
PR82687, g++.dg/asan/default-options-1.C fails with PR82575 fix
The problem with making discarded symbols hidden is that the
non-default visibility is sticky. When symbols other than the
__gnu_lto ones are discarded that turns out to be a bad idea.
PR lto/82687
PR lto/82575
* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
Only make __gnu_lto symbols hidden. Delete outdated comment.
Silence ISO C warning.
H.J. Lu [Tue, 24 Oct 2017 10:52:50 +0000 (10:52 +0000)]
i386: Don't insert ENDBR at function entrance when called directly
There is no need to insert ENDBR instruction at function entrance if
function is only called directly.
gcc/
PR target/82659
* config/i386/i386.c (rest_of_insert_endbranch): Don't insert
ENDBR instruction at function entrance if function is only
called directly.
Jakub Jelinek [Tue, 24 Oct 2017 10:44:56 +0000 (12:44 +0200)]
re PR rtl-optimization/82628 (wrong code at -Os on x86_64-linux-gnu in the 32-bit mode)
PR target/82628
* config/i386/i386.md (addcarry<mode>, subborrow<mode>): Change
patterns to better describe from which operation the CF is computed.
(addcarry<mode>_0, subborrow<mode>_0): New patterns.
* config/i386/i386.c (ix86_expand_builtin) <case handlecarry>: Pass
one LTU with [DT]Imode and another one with [SD]Imode. If arg0
is 0, use _0 suffixed expanders instead of emitting a comparison
before it.