Bill Seurer [Tue, 7 Jun 2016 20:18:09 +0000 (20:18 +0000)]
This patch adds support for the missing versions of the vec_mul altivec...
This patch adds support for the missing versions of the vec_mul altivec
builtins from the Power Architecture 64-Bit ELF V2 ABI OpenPOWER ABI for
Linux Supplement (16 July 2015 Version 1.1). Many of the builtins are
missing, and this is part of a series of patches to add them.
There aren't instructions for the {un}signed char, {un}signed short, and
{un}signed int versions of vec_mul so the output code is built from other
built-ins and operations that do have instructions.
The new test case is an executable test which verifies that the generated
code produces expected values. C macros were used so that the same
test case could be used for all the various supported types.
Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu with no regressions. Is this ok for trunk?
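As a rough illustration of what the integer vec_mul variants compute, here is
a scalar model in plain C. It assumes the usual lane-wise semantics where each
product wraps to the element width; model_vec_mul_u8 and LANES are hypothetical
names, and this is not the PowerPC sequence the patch generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of unsigned char lanes in a 128-bit vector.  */
enum { LANES = 16 };

/* Scalar model of an element-wise multiply on LANES unsigned char
   lanes.  Each product is reduced modulo 2^8, i.e. only the low byte
   is kept.  This models the result only; it is an assumption about
   the semantics, not the instruction sequence the compiler emits.  */
void
model_vec_mul_u8 (const uint8_t *a, const uint8_t *b, uint8_t *out)
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < LANES; i++)
    out[i] = (uint8_t) ((unsigned int) a[i] * (unsigned int) b[i]);
}
```

An executable test along the lines of the one described above could compare
the output of the real builtin against a model like this one.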
[gcc]
2016-06-07 Bill Seurer <seurer@linux.vnet.ibm.com>
* config/rs6000/altivec.h: Add __builtin_vec_mul.
* config/rs6000/rs6000-builtin.def (vec_mul): Change vec_mul to a
special case Altivec builtin.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
VSX_BUILTIN_VEC_MUL (replaced with special case code).
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Add
code for ALTIVEC_BUILTIN_VEC_MUL.
* config/rs6000/rs6000.c (altivec_init_builtins): Add definition
for __builtin_vec_mul.
[gcc/testsuite]
2016-06-07 Bill Seurer <seurer@linux.vnet.ibm.com>
David Malcolm [Tue, 7 Jun 2016 15:04:22 +0000 (15:04 +0000)]
C: add fixit hint to misspelled field names
gcc/c/ChangeLog:
* c-parser.c (c_parser_postfix_expression): In __builtin_offsetof
and structure element reference, capture the location of the
element name token and pass it to build_component_ref.
(c_parser_postfix_expression_after_primary): Likewise for
structure element dereference.
(c_parser_omp_variable_list): Likewise for
OMP_CLAUSE_{_CACHE, MAP, FROM, TO}.
* c-tree.h (build_component_ref): Add location_t param.
* c-typeck.c (build_component_ref): Add location_t param
COMPONENT_LOC. Use it, if available, when issuing hints about
misspelled member names to provide a fixit replacement hint.
gcc/objc/ChangeLog:
* objc-act.c (objc_build_component_ref): Update call
to build_component_ref for added param, passing UNKNOWN_LOCATION.
gcc/testsuite/ChangeLog:
* gcc.dg/spellcheck-fields-2.c: New test case.
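A hint for a misspelled field needs a "closest name" lookup. The sketch below
shows one way to do that with Levenshtein edit distance, which is what
spellcheck.c in GCC is built around. The names edit_distance, closest_field
and MAX_NAME_LEN are hypothetical and this is not the code in c-typeck.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest identifier this sketch accepts (an arbitrary limit).  */
enum { MAX_NAME_LEN = 63 };

/* Levenshtein distance between S and T, computed with one rolling
   row of the dynamic-programming table.  */
size_t
edit_distance (const char *s, const char *t)
{
  size_t len_s = strlen (s), len_t = strlen (t);
  size_t row[MAX_NAME_LEN + 1];

  assert (len_s <= MAX_NAME_LEN && len_t <= MAX_NAME_LEN);
  for (size_t j = 0; j <= len_t; j++)
    row[j] = j;
  for (size_t i = 1; i <= len_s; i++)
    {
      size_t diag = row[0];        /* Entry [i-1][j-1].  */
      row[0] = i;
      for (size_t j = 1; j <= len_t; j++)
        {
          size_t above = row[j];   /* Entry [i-1][j].  */
          size_t best = diag + (s[i - 1] == t[j - 1] ? 0 : 1);
          if (above + 1 < best)
            best = above + 1;      /* Deletion.  */
          if (row[j - 1] + 1 < best)
            best = row[j - 1] + 1; /* Insertion.  */
          row[j] = best;
          diag = above;
        }
    }
  return row[len_t];
}

/* Return the entry of FIELDS closest to TYPED, or NULL if N is 0.  */
const char *
closest_field (const char *typed, const char *const *fields, size_t n)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      size_t d = edit_distance (typed, fields[i]);
      if (d < best_dist)
        {
          best_dist = d;
          best = fields[i];
        }
    }
  return best;
}
```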
Richard Biener [Tue, 7 Jun 2016 12:41:46 +0000 (12:41 +0000)]
re PR c++/61564 (#pragma GCC optimize ("-fno-lto") causes the compiler to crash)
2016-06-07 Richard Biener <rguenther@suse.de>
PR c/61564
* c-common.c (parse_optimize_options): Only apply CL_OPTIMIZATION
options and warn about others.
* common.opt (ffast-math): Make Optimization.
* gcc.dg/Wpragmas-1.c: New testcase.
* gcc.dg/Wattributes-4.c: Likewise.
* gcc.dg/ipa/pr70646.c: Drop optimize pragma in favor of dg-option
entry.
Jakub Jelinek [Mon, 6 Jun 2016 19:48:22 +0000 (21:48 +0200)]
re PR c++/70847 (exponential time in cp_fold for chained virtual function calls)
PR c++/70847
PR c++/71330
PR c++/71393
* cp-gimplify.c (cp_fold_r): Set *walk_subtrees = 0 and return NULL
right after cp_fold call if cp_fold has returned the same stmt
already in some earlier cp_fold_r call.
(cp_fold_function): Add pset automatic variable, pass its address
to cp_walk_tree.
* g++.dg/opt/pr70847.C: New test.
* g++.dg/ubsan/pr70847.C: New test.
* g++.dg/ubsan/pr71393.C: New test.
Co-Authored-By: Patrick Palka <ppalka@gcc.gnu.org>
From-SVN: r237151
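Why remembering already-folded statements matters can be seen on a small DAG
with shared subtrees. The sketch below is an illustration only: it uses a
per-node flag where the patch uses a pointer set, and struct node, walk_naive,
walk_once and build_shared_chain are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node
{
  struct node *left, *right;
  bool seen;
};

/* Walk every path from N and return the number of visits.  Shared
   subtrees are walked once per path that reaches them.  */
unsigned long
walk_naive (const struct node *n)
{
  if (n == NULL)
    return 0;
  return 1 + walk_naive (n->left) + walk_naive (n->right);
}

/* Same walk, but stop at nodes that were already visited.  */
unsigned long
walk_once (struct node *n)
{
  if (n == NULL || n->seen)
    return 0;
  n->seen = true;
  return 1 + walk_once (n->left) + walk_once (n->right);
}

/* Link NODES[0..DEPTH-1] so that both children of each node point to
   the next node, giving 2^DEPTH - 1 paths over only DEPTH nodes.  */
void
build_shared_chain (struct node *nodes, size_t depth)
{
  assert (depth > 0);
  for (size_t i = 0; i < depth; i++)
    {
      struct node *next = (i + 1 < depth) ? &nodes[i + 1] : NULL;
      nodes[i].left = next;
      nodes[i].right = next;
      nodes[i].seen = false;
    }
}
```

With a chain of 10 nodes the naive walk makes 1023 visits, the walk with the
visited flag makes 10.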
Jakub Jelinek [Mon, 6 Jun 2016 18:35:03 +0000 (20:35 +0200)]
re PR tree-optimization/71259 (GCC trunk emits wrong code)
PR tree-optimization/71259
* tree-vect-slp.c (vect_get_constant_vectors): For
VECTOR_BOOLEAN_TYPE_P, return all ones constant instead of
one for constant op, and use COND_EXPR for non-constant.
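The difference between "one" and "all ones" for a boolean vector element can
be modelled in scalar C. This is a sketch under the assumption that a true
lane is used as a bit mask in a select; select_lanes, LANE_TRUE and
MASK_LANES are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MASK_LANES = 4 };

/* Canonical "true" for a 32-bit boolean lane: every bit set.  */
#define LANE_TRUE UINT32_MAX

/* Lane-wise select: where MASK is all ones take A, where it is zero
   take B.  Any other mask value mixes bits of A and B.  */
void
select_lanes (const uint32_t *mask, const uint32_t *a,
              const uint32_t *b, uint32_t *out)
{
  assert (mask != NULL && a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < MASK_LANES; i++)
    out[i] = (a[i] & mask[i]) | (b[i] & ~mask[i]);
}
```

A mask element of 1 instead of LANE_TRUE keeps only bit 0 of A, which is the
kind of wrong result a constant "one" would give.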
David Malcolm [Mon, 6 Jun 2016 17:11:30 +0000 (17:11 +0000)]
Selftest framework
gcc/ChangeLog:
* Makefile.in (OBJS): Add function-tests.o,
hash-map-tests.o, hash-set-tests.o, rtl-tests.o,
selftest-run-tests.o.
(OBJS-libcommon): Add selftest.o.
(OBJS-libcommon-target): Add selftest.o.
(all.internal): Add "selftest".
(all.cross): Likewise.
(selftest): New phony target.
(s-selftest): New target.
(selftest-gdb): New phony target.
(COLLECT2_OBJS): Add selftest.o.
* bitmap.c: Include "selftest.h".
(selftest::test_gc_alloc): New function.
(selftest::test_set_range): New function.
(selftest::test_clear_bit_in_middle): New function.
(selftest::test_copying): New function.
(selftest::test_bitmap_single_bit_set_p): New function.
(selftest::bitmap_c_tests): New function.
* common.opt (fself-test): New.
* diagnostic-show-locus.c: Include "selftest.h".
(make_range): New function.
(test_range_contains_point_for_single_point): New function.
(test_range_contains_point_for_single_line): New function.
(test_range_contains_point_for_multiple_lines): New function.
(assert_eq): New function.
(test_get_line_width_without_trailing_whitespace): New function.
(selftest::diagnostic_show_locus_c_tests): New function.
* et-forest.c: Include "selftest.h".
(selftest::test_single_node): New function.
(selftest::test_simple_tree): New function.
(selftest::test_disconnected_nodes): New function.
(selftest::et_forest_c_tests): New function.
* fold-const.c: Include "selftest.h".
(selftest::assert_binop_folds_to_const): New function.
(selftest::assert_binop_folds_to_nonlvalue): New function.
(selftest::test_arithmetic_folding): New function.
(selftest::fold_const_c_tests): New function.
* function-tests.c: New file.
* gimple.c: Include "selftest.h".
Include "gimple-pretty-print.h".
(selftest::verify_gimple_pp): New function.
(selftest::test_assign_single): New function.
(selftest::test_assign_binop): New function.
(selftest::test_nop_stmt): New function.
(selftest::test_return_stmt): New function.
(selftest::test_return_without_value): New function.
(selftest::gimple_c_tests): New function.
* hash-map-tests.c: New file.
* hash-set-tests.c: New file.
* input.c: Include "selftest.h".
(selftest::assert_loceq): New function.
(selftest::test_accessing_ordinary_linemaps): New function.
(selftest::test_unknown_location): New function.
(selftest::test_builtins): New function.
(selftest::test_reading_source_line): New function.
(selftest::input_c_tests): New function.
* rtl-tests.c: New file.
* selftest-run-tests.c: New file.
* selftest.c: New file.
* selftest.h: New file.
* spellcheck.c: Include "selftest.h".
(selftest::levenshtein_distance_unit_test_oneway): New function,
adapted from testsuite/gcc.dg/plugin/levenshtein_plugin.c.
(selftest::levenshtein_distance_unit_test): Likewise.
(selftest::spellcheck_c_tests): Likewise.
* toplev.c: Include selftest.h.
(toplev::run_self_tests): New.
(toplev::main): Handle -fself-test.
* toplev.h (toplev::run_self_tests): New.
* tree.c: Include "selftest.h".
(selftest::test_integer_constants): New function.
(selftest::test_identifiers): New function.
(selftest::test_labels): New function.
(selftest::tree_c_tests): New function.
* tree-cfg.c: Include "selftest.h".
(selftest::push_fndecl): New function.
(selftest::test_linear_chain): New function.
(selftest::test_diamond): New function.
(selftest::test_fully_connected): New function.
(selftest::tree_cfg_c_tests): New function.
* vec.c: Include "selftest.h".
(selftest::safe_push_range): New function.
(selftest::test_quick_push): New function.
(selftest::test_safe_push): New function.
(selftest::test_truncate): New function.
(selftest::test_safe_grow_cleared): New function.
(selftest::test_pop): New function.
(selftest::test_safe_insert): New function.
(selftest::test_ordered_remove): New function.
(selftest::test_unordered_remove): New function.
(selftest::test_block_remove): New function.
(selftest::reverse_cmp): New function.
(selftest::test_qsort): New function.
(selftest::vec_c_tests): New function.
* wide-int.cc: Include selftest.h and wide-int-print.h.
(selftest::from_int <wide_int>): New function.
(selftest::from_int <offset_int>): New function.
(selftest::from_int <widest_int>): New function.
(selftest::assert_deceq): New function.
(selftest::assert_hexeq): New function.
(selftest::test_printing <VALUE_TYPE>): New function template.
(selftest::test_ops <VALUE_TYPE>): New function template.
(selftest::test_comparisons <VALUE_TYPE>): New function template.
(selftest::run_all_wide_int_tests <VALUE_TYPE>): New function
template.
(selftest::wide_int_cc_tests): New function.
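The general shape of such in-file self-tests can be sketched in plain C. The
real framework is C++ and has its own assertion macros in selftest.h; the
sketch below only borrows two test names from the bitmap.c entry above, and
tiny_bitmap and everything operating on it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 64-bit bitmap standing in for a real data structure.  */
typedef uint64_t tiny_bitmap;

tiny_bitmap
bitmap_set_range (tiny_bitmap b, unsigned start, unsigned count)
{
  assert (count > 0 && start + count <= 64);
  for (unsigned i = start; i < start + count; i++)
    b |= (uint64_t) 1 << i;
  return b;
}

tiny_bitmap
bitmap_clear_bit (tiny_bitmap b, unsigned bit)
{
  assert (bit < 64);
  return b & ~((uint64_t) 1 << bit);
}

bool
bitmap_bit_p (tiny_bitmap b, unsigned bit)
{
  assert (bit < 64);
  return (b >> bit) & 1;
}

/* Each test checks one behaviour and aborts on failure.  */
static void
test_set_range (void)
{
  tiny_bitmap b = bitmap_set_range (0, 7, 5);
  assert (!bitmap_bit_p (b, 6));
  assert (bitmap_bit_p (b, 7));
  assert (bitmap_bit_p (b, 11));
  assert (!bitmap_bit_p (b, 12));
}

static void
test_clear_bit_in_middle (void)
{
  tiny_bitmap b = bitmap_set_range (0, 0, 3);
  b = bitmap_clear_bit (b, 1);
  assert (bitmap_bit_p (b, 0));
  assert (!bitmap_bit_p (b, 1));
  assert (bitmap_bit_p (b, 2));
}

/* Per-file entry point: run every test, return how many were run.  */
int
run_tiny_bitmap_tests (void)
{
  test_set_range ();
  test_clear_bit_in_middle ();
  return 2;
}
```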
Jonathan Wakely [Mon, 6 Jun 2016 15:50:01 +0000 (16:50 +0100)]
libstdc++/71320 Add or remove file permissions correctly
PR libstdc++/71320
* src/filesystem/ops.cc (permissions(const path&, perms, error_code&)):
Add or remove permissions according to perms argument.
* testsuite/experimental/filesystem/operations/permissions.cc: New
test.
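The intended behaviour can be modelled on plain mode bits. This is a sketch,
not the code in ops.cc: apply_perms is a hypothetical helper, and the flag
values are assumed to match the add_perms and remove_perms enumerators of the
Filesystem TS.

```c
#include <assert.h>

enum
{
  PERM_MASK = 07777,     /* Permission bits proper.  */
  PERM_ADD = 0x10000,    /* Assumed value of perms::add_perms.  */
  PERM_REMOVE = 0x20000  /* Assumed value of perms::remove_perms.  */
};

/* Return the mode that results from applying REQUEST to CURRENT:
   OR the bits in for PERM_ADD, mask them out for PERM_REMOVE, and
   replace the mode otherwise.  */
unsigned
apply_perms (unsigned current, unsigned request)
{
  unsigned bits = request & PERM_MASK;

  /* Asking for both at once is treated as a caller error here.  */
  assert (!((request & PERM_ADD) && (request & PERM_REMOVE)));
  if (request & PERM_ADD)
    return (current | bits) & PERM_MASK;
  if (request & PERM_REMOVE)
    return current & ~bits & PERM_MASK;
  return bits;
}
```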
Aaron Conole [Mon, 6 Jun 2016 15:24:24 +0000 (15:24 +0000)]
re PR bootstrap/71400 (profiledbootstrap failed)
PR libgcc/71400
* libgcov-driver-system.c (__gcov_error_file): Disable if IN_GCOV_TOOL.
(get_gcov_error_file): Check __gcov_error_file before trying to
initialize it.
(gcov_error): Always use get_gcov_error_file.
2016-06-06 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/sparc.md (cpu): Add niagara7 cpu type.
Include the M7 SPARC DFA scheduler.
New attribute v3pipe.
Annotate insns with v3pipe where appropriate.
Define cpu_feature vis4.
Add lzd instruction type and set it on clzdi_sp64 and clzsi_sp64.
Add (V8QI "8") to vbits.
Add insns {add,sub}v8qi3
Add insns ss{add,sub}v8qi3
Add insns us{add,sub}{v8qi,v4hi}3
Add insns {min,max}{v8qi,v4hi,v2si}3
Add insns {minu,maxu}{v8qi,v4hi,v2si}3
Add insns fpcmp{le,gt,ule,ugt}{8,16,32}_vis.
* config/sparc/niagara4.md: Add a comment explaining the
discrepancy between the documented latency numbers and the
implemented ones.
* config/sparc/niagara7.md: New file.
* configure.ac (HAVE_AS_SPARC5_VIS4): Define if the assembler
supports SPARC5 and VIS 4.0 instructions.
* configure: Regenerate.
* config.in: Likewise.
* config.gcc: niagara7 is a supported cpu in sparc*-*-* targets.
* config/sparc/sol2.h (ASM_CPU32_DEFAULT_SPEC): Set for
TARGET_CPU_niagara7.
(ASM_CPU64_DEFAULT_SPEC): Likewise.
(CPP_CPU_SPEC): Handle niagara7.
(ASM_CPU_SPEC): Likewise.
* config/sparc/sparc-opts.h (processor_type): Add
PROCESSOR_NIAGARA7.
(mvis4): New option.
* config/sparc/sparc.h (TARGET_CPU_niagara7): Define.
(AS_NIAGARA7_FLAG): Define.
(ASM_CPU64_DEFAULT_SPEC): Set for niagara7.
(CPP_CPU64_DEFAULT_SPEC): Likewise.
(CPP_CPU_SPEC): Handle niagara7.
(ASM_CPU_SPEC): Likewise.
* config/sparc/sparc.c (niagara7_costs): Define.
(sparc_option_override): Handle niagara7 and adjust cache-related
parameters with better values for niagara cpus. Also support VIS4.
(sparc32_initialize_trampoline): Likewise.
(sparc_use_sched_lookahead): Likewise.
(sparc_issue_rate): Likewise.
(sparc_register_move_cost): Likewise.
(dump_target_flag_bits): Support VIS4.
(sparc_vis_init_builtins): Likewise.
(sparc_builtins): Likewise.
* config/sparc/sparc-c.c (sparc_target_macros): Define __VIS__ for
VIS 4.0.
* config/sparc/driver-sparc.c (cpu_names): Add SPARC-M7 and
UltraSparc M7.
* config/sparc/sparc.opt (sparc_processor_type): New value
niagara7.
* config/sparc/visintrin.h (__attribute__): Prototypes for the
VIS4 builtins.
* doc/invoke.texi (SPARC Options): Document -mcpu=niagara7 and
-mvis4.
* doc/extend.texi (SPARC VIS Built-in Functions): Document the
VIS4 builtins.
gcc/testsuite/ChangeLog:
2016-06-06 Jose E. Marchesi <jose.marchesi@oracle.com>
* gcc.target/sparc/vis4misc.c: New file.
* gcc.target/sparc/fpcmp.c: Likewise.
* gcc.target/sparc/fpcmpu.c: Likewise.
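For reference, the ss and us prefixes in the insn names above denote signed
and unsigned saturating arithmetic. A scalar model of the 8-bit case follows.
It is a sketch with hypothetical names and describes the arithmetic only, not
the machine description patterns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 8-bit lanes in a V8QI vector.  */
enum { V8QI_LANES = 8 };

/* Signed saturating add: clamp to [INT8_MIN, INT8_MAX].  */
int8_t
sat_add_s8 (int8_t a, int8_t b)
{
  int sum = a + b;
  if (sum > INT8_MAX)
    return INT8_MAX;
  if (sum < INT8_MIN)
    return INT8_MIN;
  return (int8_t) sum;
}

/* Unsigned saturating add: clamp to UINT8_MAX.  */
uint8_t
sat_add_u8 (uint8_t a, uint8_t b)
{
  unsigned int sum = (unsigned int) a + b;
  return sum > UINT8_MAX ? UINT8_MAX : (uint8_t) sum;
}

/* Lane-wise model of a signed saturating V8QI add.  */
void
ssadd_v8qi_model (const int8_t *a, const int8_t *b, int8_t *out)
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < V8QI_LANES; i++)
    out[i] = sat_add_s8 (a[i], b[i]);
}
```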
Eric Botcazou [Mon, 6 Jun 2016 10:03:14 +0000 (10:03 +0000)]
decl.c (Gigi_Equivalent_Type): Make sure equivalent types are present before returning them.
* gcc-interface/decl.c (Gigi_Equivalent_Type): Make sure equivalent
types are present before returning them. Remove final assertion.
(gnat_to_gnu_entity) <E_Access_Protected_Subprogram_Type>: Adjust to
above change.
<E_Protected_Type>: Likewise.
Eric Botcazou [Mon, 6 Jun 2016 09:51:33 +0000 (09:51 +0000)]
utils.c (gnat_internal_attribute_table): Add support for noinline and noclone attributes.
* gcc-interface/utils.c (gnat_internal_attribute_table): Add support
for noinline and noclone attributes.
(handle_noinline_attribute): New handler.
(handle_noclone_attribute): Likewise.
Eric Botcazou [Mon, 6 Jun 2016 09:44:11 +0000 (09:44 +0000)]
utils2.c (build_call_alloc_dealloc): Do not substitute placeholder expressions here but...
* gcc-interface/utils2.c (build_call_alloc_dealloc): Do not substitute
placeholder expressions here but...
* gcc-interface/trans.c (gnat_to_gnu) <N_Free_Statement>: ...here.
Make an exception to the protection of a CALL_EXPR result with an
unconstrained type only in the same cases as Call_to_gnu.
Eric Botcazou [Mon, 6 Jun 2016 09:31:13 +0000 (09:31 +0000)]
trans.c (gnat_to_gnu): Rework special code dealing with boolean rvalues and set the location directly.
* gcc-interface/trans.c (gnat_to_gnu): Rework special code dealing
with boolean rvalues and set the location directly. Do not set the
location in the other cases for a simple name.
(gnat_to_gnu_external): Clear the location on the expression.
Eric Botcazou [Mon, 6 Jun 2016 09:26:07 +0000 (09:26 +0000)]
decl.c (gnat_to_gnu_entity): Remove useless 'else' statements and tidy up.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Component>: Remove
useless 'else' statements and tidy up.
<E_Array_Subtype>: Fully deal with the declaration here.
<E_Incomplete_Type>: Use properly-typed constant.
Assert that we don't apply the special type treatment to dummy types.
Separate this treatment from the final back-annotation and simplify
the condition for the RM size.
(gnat_to_gnu_param): Add GNU_PARAM_TYPE parameter and adjust.
(gnat_to_gnu_subprog_type): Adjust call to gnat_to_gnu_param.
* gcc-interface/trans.c (gnat_to_gnu) <N_Subprogram_Declaration>: Add
comment.
(process_freeze_entity): Remove obsolete code.
(process_type): Minor tweaks.
Eric Botcazou [Mon, 6 Jun 2016 09:18:41 +0000 (09:18 +0000)]
einfo.ads (Returns_Limited_View): Remove.
* einfo.ads (Returns_Limited_View): Remove.
(Set_Returns_Limited_View): Likewise.
* einfo.adb (Returns_Limited_View): Likewise.
(Set_Returns_Limited_View): Likewise.
* freeze.adb (Late_Freeze_Subprogram): Remove.
(Freeze_Entity): Do not defer the freezing of functions returning an
incomplete type coming from a limited context.
* gcc-interface/gigi.h (finish_subprog_decl): Add ASM_NAME parameter.
* gcc-interface/decl.c (gnu_ext_name_for_subprog): New function.
(gnat_to_gnu_entity) <E_Subprogram_Type>: Do not check compatibility
of profiles for builtins here... Call gnu_ext_name_for_subprog.
Also update profiles if pointers to limited_with'ed types are
updated.
(gnat_to_gnu_param): Restore the correct source location information
for vector ABI warnings.
(associate_subprog_with_dummy_type): Add comment about AI05-019.
Set TYPE_DUMMY_IN_PROFILE_P flag unconditionally.
(update_profile): Deal with builtin declarations.
Call gnu_ext_name_for_subprog. Adjust call to finish_subprog_decl.
(update_profiles_with): Add comment.
(gnat_to_gnu_subprog_type): Reuse the return type if it is complete.
Likewise for parameter declarations in most cases. Do not change
the return type for the CICO mechanism if the profile is incomplete.
...but here instead. Always reset the slot for the parameters.
* gcc-interface/utils.c (create_subprog_decl): Call
gnu_ext_name_for_subprog. Do not set the assembler name here but...
(finish_subprog_decl): ...here instead. Add ASM_NAME parameter.
Eric Botcazou [Mon, 6 Jun 2016 08:46:33 +0000 (08:46 +0000)]
exp_ch9.adb (Expand_N_Protected_Type_Declaration): Insert the declaration of the corresponding record type before that of the...
* exp_ch9.adb (Expand_N_Protected_Type_Declaration): Insert the
declaration of the corresponding record type before that of the
unprotected version of the subprograms that operate on it.
(Expand_Access_Protected_Subprogram_Type): Declare the Equivalent_Type
just before the original type.
* sem_ch3.adb (Handle_Late_Controlled_Primitive): Point the current
declaration to the newly created declaration for the primitive.
(Analyze_Subtype_Declaration): Remove obsolete code forcing the
freezing of the subtype before its declaration.
(Replace_Anonymous_Access_To_Protected_Subprogram): Insert the new
declaration in the nearest enclosing scope for formal parameters too.
(Build_Derived_Access_Type): Restore the status of the created Itype
after it is erased by Copy_Node.
* sem_ch6.adb (Exchange_Limited_Views): Remove guard on entry.
(Analyze_Subprogram_Body_Helper): Call Exchange_Limited_Views only if
the specification is present.
Move around the code changing the designated view of the return type
and save the original view. Restore it on exit.
* sem_ch13.adb (Build_Predicate_Function_Declaration): Always insert
the declaration right after that of the type.
Jerry DeLisle [Sun, 5 Jun 2016 19:49:59 +0000 (19:49 +0000)]
re PR fortran/71404 (416.gamess in SPEC CPU 2006 failed to build)
2016-06-05 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR fortran/71404
* io.c (match_io): For READ, commit pending symbols in the
current statement before trying to match an expression so that
if the match fails and we undo symbols we don't toss good symbols.
re PR fortran/69659 (ICE on using option -frepack-arrays, in gfc_conv_descriptor_data_get)
gcc/testsuite/ChangeLog:
2016-06-05 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/69659
* gfortran.dg/class_array_22.f03: New test.
gcc/fortran/ChangeLog:
2016-06-05 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/69659
* trans-array.c (gfc_trans_dummy_array_bias): For class arrays use
the address of the _data component to reference the array's data
component.
Jan Hubicka [Sun, 5 Jun 2016 16:43:19 +0000 (18:43 +0200)]
predict.c (predicted_by_loop_heuristics_p): New function.
* predict.c (predicted_by_loop_heuristics_p): New function.
(predict_iv_comparison): Use it.
(predict_loops): Walk from innermost loops; do not predict edges
leaving multiple loops multiple times; implement
PRED_LOOP_ITERATIONS_MAX heuristics.
* predict.def (PRED_LOOP_ITERATIONS_MAX): New predictor.
* gcc.dg/predict-9.c: Update template.
Jan Hubicka [Sat, 4 Jun 2016 20:19:46 +0000 (22:19 +0200)]
tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Do not check aux; dump reasons of decisions.
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Do not check
aux; dump reasons of decisions.
(should_duplicate_loop_header_p): Likewise.
(do_while_loop_p): Likewise.
(ch_base::copy_headers): Dump also num insns duplicated.
Jakub Jelinek [Sat, 4 Jun 2016 14:50:57 +0000 (16:50 +0200)]
re PR tree-optimization/71405 (ICE on valid C++ code at -Os and above on x86_64-linux-gnu: verify_gimple failed)
PR tree-optimization/71405
* tree-ssa.c (execute_update_addresses_taken): For clobber with
incompatible type, build a new clobber with the right type instead
of building a VIEW_CONVERT_EXPR around it.
Patrick Palka [Fri, 3 Jun 2016 20:42:08 +0000 (20:42 +0000)]
re PR c++/27100 (ICE with multiple friend declarations)
Fix PR c++/27100
gcc/cp/ChangeLog:
PR c++/27100
* decl.c (duplicate_decls): Properly copy the
DECL_PENDING_INLINE_P, DECL_PENDING_INLINE_INFO and
DECL_SAVED_FUNCTION_DATA fields from OLDDECL to NEWDECL.
Bill Schmidt [Fri, 3 Jun 2016 18:40:26 +0000 (18:40 +0000)]
rs6000-c.c (c/c-tree.h): Add #include.
[gcc]
2016-06-03 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* rs6000-c.c (c/c-tree.h): Add #include.
(altivec_resolve_overloaded_builtin): Handle ARRAY_TYPE arguments
in C++ when found in the base position of vec_ld or vec_st.
[gcc/testsuite]
2016-06-03 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
Jan Hubicka [Fri, 3 Jun 2016 17:00:19 +0000 (19:00 +0200)]
tree-ssa-loop-niter.c (estimate_numbers_of_iterations_loop): Avoid use of profile unless profile status is PROFILE_READ.
* tree-ssa-loop-niter.c (estimate_numbers_of_iterations_loop): Avoid
use of profile unless profile status is PROFILE_READ.
* profile.c (compute_branch_probabilities): Set profile status
only after reporting predictor hitrates.
Joseph Myers [Fri, 3 Jun 2016 15:49:04 +0000 (16:49 +0100)]
Add option for whether ceil etc. can raise "inexact", adjust x86 conditions.
In ISO C99/C11, the ceil, floor, round and trunc functions may or may
not raise the "inexact" exception for noninteger arguments. Under TS
18661-1:2014, the C bindings for IEEE 754-2008, these functions are
prohibited from raising "inexact", in line with the general rule that
"inexact" is raised only when the mathematical infinite precision result of a
function differs from the result after rounding to the target type.
GCC has no option to select TS 18661 requirements for not raising
"inexact" when expanding built-in versions of these functions inline.
Furthermore, even given such requirements, the conditions on the x86
insn patterns for these functions are unnecessarily restrictive. I'd
like to make the out-of-line glibc versions follow the TS 18661
requirements; in the cases where this slows them down (the cases using
x87 floating point), that makes it more important for inline versions
to be used when the user does not care about "inexact".
This patch fixes these issues. A new option
-fno-fp-int-builtin-inexact is added to request TS 18661 rules for
these functions; the default -ffp-int-builtin-inexact reflects that
such exceptions are allowed by C99 and C11. (The intention is that if
C2x incorporates TS 18661-1, then the default would change in C2x
mode.)
The x86 built-ins for rint (x87, SSE2 and SSE4.1) are made
unconditionally available (no longer depending on
-funsafe-math-optimizations or -fno-trapping-math); "inexact" is
correct for noninteger arguments to rint. For floor, ceil and trunc,
the x87 and SSE2 built-ins are OK if -ffp-int-builtin-inexact or
-fno-trapping-math (they may raise "inexact" for noninteger
arguments); the SSE4.1 built-ins are made to use ROUND_NO_EXC so that
they do not raise "inexact" and so are OK unconditionally.
Now, while there was no semantic reason for depending on
-funsafe-math-optimizations, the insn patterns had such a dependence
because of use of gen_truncxf<mode>2_i387_noop to truncate back to
SFmode or DFmode after using frndint in XFmode. In this case a no-op
truncation is safe because rounding to integer always produces an
exactly representable value (the same reason why IEEE semantics say it
shouldn't produce "inexact") - but of course that insn pattern isn't
safe because it would also match cases where the truncation is not in
fact a no-op. To allow frndint to be used for SFmode and DFmode
without that unsafe pattern, the relevant frndint patterns are
extended to SFmode and DFmode or new SFmode and DFmode patterns added,
so that the frndint operation can be represented in RTL as an
operation acting directly on SFmode or DFmode without the extension
and the problematic truncation.
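The claim that narrowing is exact after rounding to integer can be checked
with a small C model. This is a sketch: the casts through long long stand in
for frndint with round-toward-zero, the input range is restricted so the cast
is defined, and narrowing_is_exact_after_rounding is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Widen X to long double, round it to an integer (toward zero) in the
   wide type, narrow the result back to double, and report whether the
   narrowing step left the value unchanged.  */
bool
narrowing_is_exact_after_rounding (double x)
{
  /* Keep the long long conversion below well defined.  */
  assert (x > -1e15 && x < 1e15);

  long double wide = (long double) x;
  long double rounded = (long double) (long long) wide;
  double narrowed = (double) rounded;
  return (long double) narrowed == rounded;
}
```

The result is true for every input in range because an integer no larger in
magnitude than a double value always fits in the double significand. An
arbitrary long double value carries no such guarantee, which is why a pattern
that treats every truncation as a no-op is unsafe.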
A generic test of the new option is added, as well as x86-specific
tests, both execution tests including the generic test with different
x86 options and scan-assembler tests verifying that functions that
should be inlined with different options are indeed inlined.
I think other architectures are OK for TS 18661-1 semantics already.
Considering those defining "ceil" patterns: aarch64, arm, rs6000, s390
use instructions that do not raise "inexact"; nvptx does not support
floating-point exceptions. (This does mean the -f option in fact only
affects one architecture, but I think it should still be a -f option;
it's logically architecture-independent and is expected to be affected
by future -std options, so is similar to e.g. -fexcess-precision=,
which also does nothing on most architectures but is implied by -std
options.)
Bootstrapped with no regressions on x86_64-pc-linux-gnu. OK to
commit?
PR target/71276
PR target/71277
gcc:
* common.opt (ffp-int-builtin-inexact): New option.
* doc/invoke.texi (-fno-fp-int-builtin-inexact): Document.
* doc/md.texi (floor@var{m}2, btrunc@var{m}2, round@var{m}2)
(ceil@var{m}2): Document dependence on this option.
* ipa-inline-transform.c (inline_call): Handle
flag_fp_int_builtin_inexact.
* ipa-inline.c (can_inline_edge_p): Likewise.
* config/i386/i386.md (rintxf2): Do not test
flag_unsafe_math_optimizations.
(rint<mode>2_frndint): New define_insn.
(rint<mode>2): Do not test flag_unsafe_math_optimizations for 387
or !flag_trapping_math for SSE. Just use gen_rint<mode>2_frndint
for 387 instead of extending and truncating.
(frndintxf2_<rounding>): Test flag_fp_int_builtin_inexact ||
!flag_trapping_math instead of flag_unsafe_math_optimizations.
Change to frndint<mode>2_<rounding>.
(frndintxf2_<rounding>_i387): Likewise. Change to
frndint<mode>2_<rounding>_i387.
(<rounding_insn>xf2): Likewise.
(<rounding_insn><mode>2): Test flag_fp_int_builtin_inexact ||
!flag_trapping_math instead of flag_unsafe_math_optimizations for
x87. Test TARGET_ROUND || !flag_trapping_math ||
flag_fp_int_builtin_inexact instead of !flag_trapping_math for
SSE. Use ROUND_NO_EXC in constant operand of
gen_sse4_1_round<mode>2. Just use gen_frndint<mode>2_<rounding>
for 387 instead of extending and truncating.
H.J. Lu [Fri, 3 Jun 2016 15:08:00 +0000 (15:08 +0000)]
Implement x86 interrupt attribute
The interrupt and exception handlers are called by x86 processors. X86
hardware pushes information onto stack and calls the handler. The
requirements are
1. Both interrupt and exception handlers must use the 'IRET' instruction,
instead of the 'RET' instruction, to return from the handlers.
2. All registers are callee-saved in interrupt and exception handlers.
3. The difference between interrupt and exception handlers is the
exception handler must pop 'ERROR_CODE' off the stack before the 'IRET'
instruction.
The design goals of interrupt and exception handlers for x86 processors
are:
1. Support both 32-bit and 64-bit modes.
2. Flexible for compilers to optimize.
3. Easy to use by programmers.
To implement interrupt and exception handlers for x86 processors, a
compiler should support:
'interrupt' attribute
Use this attribute to indicate that the specified function with
mandatory arguments is an interrupt or exception handler. The compiler
generates function entry and exit sequences suitable for use in an
interrupt handler when this attribute is present. The 'IRET' instruction,
instead of the 'RET' instruction, is used to return from interrupt or
exception handlers. All registers, except for the EFLAGS register which
is restored by the 'IRET' instruction, are preserved by the compiler.
Since GCC doesn't preserve MPX, SSE, MMX nor x87 states, the GCC option,
-mgeneral-regs-only, should be used to compile interrupt and exception
handlers.
Note for compiler implementers: If the compiler generates MPX, SSE, MMX
or x87 instructions in an interrupt or exception handler, or functions
called from an interrupt or exception handler may contain MPX, SSE, MMX
or x87 instructions, the compiler must save and restore the corresponding
state.
Since the direction flag in the FLAGS register in interrupt (exception)
handlers is undetermined, the cld instruction must be emitted in the function
prologue if rep string instructions are used in the interrupt (exception)
handler or the interrupt (exception) handler isn't a leaf function.
Any interruptible-without-stack-switch code must be compiled with
-mno-red-zone since interrupt handlers can and will, because of the
hardware design, touch the red zone.
1. interrupt handler must be declared with a mandatory pointer argument:
struct interrupt_frame;
__attribute__ ((interrupt))
void
f (struct interrupt_frame *frame)
{
...
}
and the user must properly define the structure the pointer points to.
2. exception handler:
The exception handler is very similar to the interrupt handler with
a different mandatory function signature:
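typedef unsigned int uword_t __attribute__ ((mode (__word__)));
struct interrupt_frame;
__attribute__ ((interrupt))
void
f (struct interrupt_frame *frame, uword_t error_code)
{
...
}
and the compiler pops the error code off the stack before the 'IRET' instruction.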
The exception handler should only be used for exceptions which push an
error code and all other exceptions must use the interrupt handler.
The system will crash if the wrong handler is used.
'no_caller_saved_registers' attribute
Use this attribute to indicate that the specified function has no
caller-saved registers. That is, all registers are callee-saved.
The compiler generates proper function entry and exit sequences to
save and restore any modified registers, except for the EFLAGS register.
Since GCC doesn't preserve MPX, SSE, MMX nor x87 states, the GCC option,
-mgeneral-regs-only, should be used to compile functions with
'no_caller_saved_registers' attribute.
Note for compiler implementers: If the compiler generates MPX, SSE,
MMX or x87 instructions in a function with 'no_caller_saved_registers'
attribute or functions called from a function with
'no_caller_saved_registers' attribute may contain MPX, SSE, MMX or x87
instructions, the compiler must save and restore the corresponding state.
The user can call functions specified with 'no_caller_saved_registers'
attribute from an interrupt handler without saving and restoring all
call clobbered registers.
On x86, interrupt handlers are only called by processors which push
interrupt data onto stack at the address where the normal return address
is. Interrupt handlers must access interrupt data via pointers so that
they can update interrupt data.
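Updating interrupt data through the frame pointer can be sketched as follows.
The structure layout is an assumption (the order in which a 64-bit x86
processor pushes its state), since the definition is left to the user, and
skip_faulting_insn is a hypothetical helper. It is an ordinary function
without the 'interrupt' attribute so that it can be run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word-sized unsigned type; stands in for the mode (__word__) typedef.  */
typedef uintptr_t uword_t;

/* Assumed layout of the data pushed by the processor in 64-bit mode,
   lowest address first.  */
struct interrupt_frame
{
  uword_t ip;
  uword_t cs;
  uword_t flags;
  uword_t sp;
  uword_t ss;
};

/* Body of a handler that resumes after the interrupted instruction.
   Because FRAME points at the pushed data itself, the new ip is the
   one the interrupt return would pick up.  */
void
skip_faulting_insn (struct interrupt_frame *frame, uword_t insn_len)
{
  assert (frame != NULL);
  frame->ip += insn_len;
}
```

A real handler would carry the 'interrupt' attribute and be compiled with
-mgeneral-regs-only as described above.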
gcc/
PR target/66960
PR target/67630
PR target/67634
PR target/67841
PR target/68037
PR target/68618
PR target/68661
PR target/69575
PR target/69596
PR target/69734
* config/i386/i386-protos.h (ix86_epilogue_uses): New prototype.
* config/i386/i386.c (ix86_conditional_register_usage): Preserve
all registers, except for function return registers if there are
no caller-saved registers.
(ix86_set_func_type): New function.
(ix86_set_current_function): Call ix86_set_func_type to set
no_caller_saved_registers and func_type. Call reinit_regs if
caller-saved registers are changed. Don't allow MPX, SSE, MMX
nor x87 instructions in interrupt handler nor function with
no_caller_saved_registers attribute.
(ix86_function_ok_for_sibcall): Return false if there are no
caller-saved registers.
(type_natural_mode): Don't warn ABI change for MMX in interrupt
handler.
(ix86_function_arg_advance): Skip for callee in interrupt
handler.
(ix86_function_arg): Return special arguments in interrupt
handler.
(ix86_promote_function_mode): Promote pointer to word_mode only
for normal functions.
(ix86_can_use_return_insn_p): Don't use `ret' instruction in
interrupt handler.
(ix86_epilogue_uses): New function.
(ix86_hard_regno_scratch_ok): Likewise.
(ix86_save_reg): Preserve all registers in interrupt handler
after reload. Preserve all registers, except for function
return registers, if there are no caller-saved registers after
reload.
(find_drap_reg): Always use callee-saved register if there are
no caller-saved registers.
(ix86_minimum_incoming_stack_boundary): Return MIN_STACK_BOUNDARY
for interrupt handler.
(ix86_expand_prologue): Don't allow DRAP in interrupt handler.
Emit cld instruction if stringops are used in interrupt handler
or interrupt handler isn't a leaf function.
(ix86_expand_epilogue): Generate interrupt return for interrupt
handler and pop the 'ERROR_CODE' off the stack before interrupt
return in exception handler.
(ix86_expand_call): Disallow calling interrupt handler directly.
If there are no caller-saved registers, mark all registers that
are clobbered by the call which returns as clobbered.
(ix86_handle_no_caller_saved_registers_attribute): New function.
(ix86_handle_interrupt_attribute): Likewise.
(ix86_attribute_table): Add interrupt and no_caller_saved_registers
attributes.
(TARGET_HARD_REGNO_SCRATCH_OK): Likewise.
* config/i386/i386.h (ACCUMULATE_OUTGOING_ARGS): Use argument
accumulation in interrupt function if stack may be realigned to
avoid DRAP.
(EPILOGUE_USES): New.
(function_type): New enum.
(machine_function): Add func_type and no_caller_saved_registers.
* config/i386/i386.md (UNSPEC_INTERRUPT_RETURN): New.
(interrupt_return): New pattern.
* doc/extend.texi: Document x86 interrupt and
no_caller_saved_registers attributes.
Bernd Schmidt [Fri, 3 Jun 2016 14:20:53 +0000 (14:20 +0000)]
re PR tree-optimization/52171 (memcmp/strcmp/strncmp can be optimized when the result is tested for [in]equality with 0)
PR tree-optimization/52171
* builtins.c (expand_cmpstrn_or_cmpmem): Delete, moved elsewhere.
(expand_builtin_memcmp): New arg RESULT_EQ. All callers changed.
Look for constant strings. Move some code to emit_block_cmp_hints
and use it.
* builtins.def (BUILT_IN_MEMCMP_EQ): New.
* defaults.h (COMPARE_MAX_PIECES): New macro.
* expr.c (move_by_pieces_d, store_by_pieces_d): Remove old structs.
(move_by_pieces_1, store_by_pieces_1, store_by_pieces_2): Remove.
(clear_by_pieces_1): Don't declare. Move definition before use.
(can_do_by_pieces): New static function.
(can_move_by_pieces): Use it. Return bool.
(by_pieces_ninsns): Renamed from move_by_pieces_ninsns. New arg
OP. All callers changed. Handle COMPARE_BY_PIECES.
(class pieces_addr): New.
(pieces_addr::pieces_addr, pieces_addr::decide_autoinc,
pieces_addr::adjust, pieces_addr::increment_address,
pieces_addr::maybe_predec, pieces_addr::maybe_postinc): New member
functions for it.
(class op_by_pieces_d): New.
(op_by_pieces_d::op_by_pieces_d, op_by_pieces_d::run): New member
functions for it.
(class move_by_pieces_d, class compare_by_pieces_d,
class store_by_pieces_d): New subclasses of op_by_pieces_d.
(move_by_pieces_d::prepare_mode, move_by_pieces_d::generate,
move_by_pieces_d::finish_endp, store_by_pieces_d::prepare_mode,
store_by_pieces_d::generate, store_by_pieces_d::finish_endp,
compare_by_pieces_d::generate, compare_by_pieces_d::prepare_mode,
compare_by_pieces_d::finish_mode): New member functions.
(compare_by_pieces, emit_block_cmp_via_cmpmem): New static
functions.
(expand_cmpstrn_or_cmpmem): Moved here from builtins.c.
(emit_block_cmp_hints): New function.
(move_by_pieces, store_by_pieces, clear_by_pieces): Rewrite to just
use the newly defined classes.
* expr.h (by_pieces_constfn): New typedef.
(can_store_by_pieces, store_by_pieces): Use it in arg declarations.
(emit_block_cmp_hints, expand_cmpstrn_or_cmpmem): Declare.
(move_by_pieces_ninsns): Don't declare.
(can_move_by_pieces): Change return value to bool.
* target.def (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Update docs.
(compare_by_pieces_branch_ratio): New hook.
* target.h (enum by_pieces_operation): Add COMPARE_BY_PIECES.
(by_pieces_ninsns): Declare.
* targhooks.c (default_use_by_pieces_infrastructure_p): Handle
COMPARE_BY_PIECES.
(default_compare_by_pieces_branch_ratio): New function.
* targhooks.h (default_compare_by_pieces_branch_ratio): Declare.
* doc/tm.texi.in (STORE_MAX_PIECES, COMPARE_MAX_PIECES): Document.
* doc/tm.texi: Regenerate.
* tree-ssa-strlen.c: Include "builtins.h".
(handle_builtin_memcmp): New static function.
(strlen_optimize_stmt): Call it for BUILT_IN_MEMCMP.
* tree.c (build_common_builtin_nodes): Create __builtin_memcmp_eq.
testsuite/
PR tree-optimization/52171
* gcc.dg/pr52171.c: New test.
* gcc.target/i386/pr52171.c: New test.
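As a rough illustration of the case PR 52171 is about, the sketch below contrasts a memcmp call whose result is only tested against zero with one whose sign is used. The helper names same_header and order_keys are hypothetical, and whether a given GCC version and target expands the first call inline by pieces is an assumption, not something this log states.

```c
#include <string.h>

/* Hypothetical helper: only equality matters here, so the result of
   memcmp is tested against zero and its sign is never used.  */
int
same_header (const void *a, const void *b)
{
  return memcmp (a, b, 8) == 0;
}

/* Hypothetical helper: the sign of the result is used, so this call
   needs the full three-way ordering.  */
int
order_keys (const void *a, const void *b)
{
  int r = memcmp (a, b, 8);
  return (r > 0) - (r < 0);
}
```

Only the first form is a candidate for an equality-only comparison; the second must still report which buffer sorts first.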
Jan Hubicka [Fri, 3 Jun 2016 13:47:15 +0000 (15:47 +0200)]
pred-1.C: New testcase
* g++.dg/tree-ssa/pred-1.C: New testcase
* gcc.dg/tree-ssa/pred-1.c: New testcase
* cp-gimplify.c (genericize_continue_stmt): Force addition of
predict stmt.
Alan Hayward [Fri, 3 Jun 2016 13:04:01 +0000 (13:04 +0000)]
[3/3] No need to vectorize simple only-live stmts
2016-06-03 Alan Hayward <alan.hayward@arm.com>
gcc/
* tree-vect-stmts.c (vect_stmt_relevant_p): Do not vectorize non live
relevant stmts which are simple and invariant.
* tree-vect-loop.c (vectorizable_live_operation): Check relevance
instead of simple and invariant.
Alan Hayward [Fri, 3 Jun 2016 13:00:06 +0000 (13:00 +0000)]
[2/3] Vectorize inductions that are live after the loop
2016-06-03 Alan Hayward <alan.hayward@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop_operations): Allow live stmts.
(vectorizable_reduction): Check for new relevant state.
(vectorizable_live_operation): Vectorize live stmts using
BIT_FIELD_REF. Remove special case for gimple assign stmts.
* tree-vect-stmts.c (is_simple_and_all_uses_invariant): New function.
(vect_stmt_relevant_p): Check for stmts which are only used live.
(process_use): Use of a stmt does not inherit its live value.
(vect_mark_stmts_to_be_vectorized): Simplify relevance inheritance.
(vect_analyze_stmt): Check for new relevant state.
* tree-vectorizer.h (vect_relevant): New entry for a stmt which is used
outside the loop, but not inside it.
testsuite/
* gcc.dg/tree-ssa/pr64183.c: Ensure test does not vectorize.
* gcc.dg/vect/no-scevccp-vect-iv-2.c: Remove xfail.
* gcc.dg/vect/vect-live-1.c: New test.
* gcc.dg/vect/vect-live-2.c: New test.
* gcc.dg/vect/vect-live-3.c: New test.
* gcc.dg/vect/vect-live-4.c: New test.
* gcc.dg/vect/vect-live-5.c: New test.
* gcc.dg/vect/vect-live-slp-1.c: New test.
* gcc.dg/vect/vect-live-slp-2.c: New test.
* gcc.dg/vect/vect-live-slp-3.c: New test.
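A minimal sketch of the kind of loop this entry describes, assuming a hypothetical function fill_from: the induction value v is updated on every iteration and is also read after the loop ends, so it is "live" outside the loop. This is not one of the listed test cases, and whether a particular compiler version vectorizes it is not claimed here.

```c
/* Hypothetical example: `v` is an induction that is stored on each
   iteration and is also returned after the loop, so its final value
   is live outside the loop.  */
int
fill_from (int start, int n, int *x)
{
  int v = start;
  for (int j = 0; j < n; j++)
    {
      v += 1;
      x[j] = v;
    }
  return v;
}
```

Calling fill_from (10, 4, x) stores 11, 12, 13, 14 into x and returns 14, the value of the induction after the last iteration.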
Alan Hayward [Fri, 3 Jun 2016 12:48:21 +0000 (12:48 +0000)]
[1/3] Split vect_get_vec_def_for_operand into two
2016-06-03 Alan Hayward <alan.hayward@arm.com>
gcc/
* tree-vectorizer.h (vect_get_vec_def_for_operand_1): New.
* tree-vect-stmts.c (vect_get_vec_def_for_operand_1): New.
(vect_get_vec_def_for_operand): Split out code.
These peepholes replace two mfcr;mask sequences by one mfcr;mask;mask
sequence. On modern cpus, the original mfcr's were actually mfocrf,
but the new insn is an actual heavy-weight mfcr. This is very bad
for performance.
The comment says there is a three cycle delay between two consecutive
mfcr insns. This may have been true on rios, and it's true on 604,
but on 603, 750, 7400 it is just a single cycle (on 7450 it is two).
This is also a define_peephole, and we should get rid of those.
So this patch just removes the peepholes; the benefit is marginal at
best, and they hurt badly in other cases.
* config/rs6000/rs6000.md (define_peepholes for two mfcr's): Delete.
Jakub Jelinek [Fri, 3 Jun 2016 08:03:11 +0000 (10:03 +0200)]
re PR middle-end/71387 (ICE in emit_move_insn, at expr.c:3418 with -Og)
PR middle-end/71387
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): If redirecting
to noreturn e->callee->decl that has void return type and void
arguments, adjust gimple_call_fntype and remove lhs even if it had
previously addressable type.
Jeff Law [Fri, 3 Jun 2016 05:20:16 +0000 (23:20 -0600)]
re PR tree-optimization/71328 (ice in verify_jump_thread)
PR tree-optimization/71328
* tree-ssa-threadupdate.c (duplicate_thread_path): Fix off-by-one
error when checking for a jump back onto the copied path.
PR tree-optimization/71328
* gcc.c-torture/compile/pr71328.c: New test.
Jakub Jelinek [Thu, 2 Jun 2016 16:36:04 +0000 (18:36 +0200)]
re PR c++/71372 (C++ FE drops TREE_THIS_VOLATILE in cp_fold on all tcc_reference trees)
PR c++/71372
* cp-gimplify.c (cp_fold): For INDIRECT_REF, if the folded expression
is INDIRECT_REF or MEM_REF, copy over TREE_READONLY, TREE_SIDE_EFFECTS
and TREE_THIS_VOLATILE flags. For ARRAY_REF and ARRAY_RANGE_REF, copy
over TREE_READONLY, TREE_SIDE_EFFECTS and TREE_THIS_VOLATILE flags
to the newly built tree.
H.J. Lu [Thu, 2 Jun 2016 13:46:20 +0000 (13:46 +0000)]
Update TARGET_FUNCTION_INCOMING_ARG documentation
On x86, interrupt handlers are called only by the processor, which pushes
interrupt data onto the stack at the address where the normal return address
would be. Since interrupt handlers must access interrupt data via pointers so
that they can update it, the pointer argument is passed as
"argument pointer - word".
TARGET_FUNCTION_INCOMING_ARG defines how the callee sees its argument.
Normally it returns REG, NULL, or CONST_INT. This patch adds arbitrary
address computation based on a hard register, which can be forced into a
register, to the list.
When copying an incoming argument onto the stack, assign_parm_setup_stack
has:
  if (argument in memory)
    copy argument in memory to stack
  else
    move argument to stack
Since an arbitrary address computation may be passed as an argument, we
change it to:
  if (argument in memory)
    copy argument in memory to stack
  else
    {
      if (argument isn't in register)
        force argument into a register
      move argument to stack
    }
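The updated decision can be modelled with a small toy in C. This is only a sketch: the enums arg_kind and setup_step and the function choose_setup_step are hypothetical names invented for illustration, not GCC's RTL codes or the real assign_parm_setup_stack code.

```c
/* Toy stand-ins for where an incoming argument lives.  These are
   hypothetical and do not correspond to GCC's RTL codes.  */
enum arg_kind
{
  ARG_IN_MEMORY,
  ARG_IN_REGISTER,
  ARG_ADDRESS_COMPUTATION
};

/* Toy stand-ins for the action taken to place the argument.  */
enum setup_step
{
  STEP_COPY_MEMORY,
  STEP_MOVE,
  STEP_FORCE_REG_THEN_MOVE
};

/* Mirror the updated logic: memory is copied, a register is moved
   directly, and anything else is first forced into a register.  */
enum setup_step
choose_setup_step (enum arg_kind kind)
{
  if (kind == ARG_IN_MEMORY)
    return STEP_COPY_MEMORY;
  if (kind != ARG_IN_REGISTER)
    return STEP_FORCE_REG_THEN_MOVE;
  return STEP_MOVE;
}
```

The only behavioural difference from the earlier logic is the middle branch, which covers an argument that is neither in memory nor already in a register.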
* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.
Co-Authored-By: Julia Koval <julia.koval@intel.com>
From-SVN: r237037