Uros Bizjak [Wed, 7 Nov 2012 12:15:59 +0000 (13:15 +0100)]
i386.c (enum upper_128bits_state): Remove.
* config/i386/i386.c (enum upper_128bits_state): Remove.
(check_avx256_store): Use bool pointer argument.
(ix86_avx_u128_mode_needed): Use note_stores also for CALL insns.
* config/i386/predicates.md (vzeroupper_operation): Use match_test.
Yufeng Zhang [Wed, 7 Nov 2012 11:01:46 +0000 (11:01 +0000)]
aarch64.c (aarch64_expand_prologue): For the load-pair with writeback instruction...
gcc/ChangeLog
2012-11-07 Yufeng Zhang <yufeng.zhang@arm.com>
* config/aarch64/aarch64.c (aarch64_expand_prologue): For the
load-pair with writeback instruction, replace
aarch64_set_frame_expr with add_reg_note (REG_CFA_ADJUST_CFA);
add new local variable 'cfa_reg' and use it.
* ipa-inline-analysis.c (true_predicate, single_cond_predicate,
reset_inline_edge_summary): Fix formatting.
(account_size_time): Bump up the limit on number of size/time entries to
256.
(estimate_function_body_sizes): Work in reverse postorder.
Jakub Jelinek [Wed, 7 Nov 2012 07:50:01 +0000 (08:50 +0100)]
re PR debug/54693 (VTA guality issues with loops)
PR debug/54693
* tree-flow.h (propagate_threaded_block_debug_into): New prototype.
* tree-ssa-threadedge.c (propagate_threaded_block_debug_into): No
longer static.
* tree-ssa-loop-ch.c (copy_loop_headers): Use it.
Arithmetic sample for timevar log files
"Log0/*perf"
and selecting lines containing "TOTAL" with desired confidence 95 is
trial count is 4, mean is 443.022 (95% confidence in 440.234 to 445.811),
std.deviation is 1.75264, std.error is 0.876322
Arithmetic sample for timevar log files
"Log3/*perf"
and selecting lines containing "TOTAL" with desired confidence 95 is
trial count is 4, mean is 441.302 (95% confidence in 436.671 to 445.934),
std.deviation is 2.91098, std.error is 1.45549
The first sample appears to be 0.39% larger,
with 60% confidence of being larger.
To reach 95% confidence, you need roughly 14 trials,
assuming the standard deviation is stable, which is iffy.
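The interval arithmetic in the report above can be checked with a short sketch. It assumes the tool uses Student's t distribution with (trials - 1) degrees of freedom, which is consistent with the printed numbers but is not stated in the report; the names Sample, Interval and kT95Dof3 are illustrative only. The "60% confidence of being larger" and "14 trials" figures are not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Summary of one timing sample, as printed in the report above.
struct Sample {
    int trials;
    double mean;
    double std_dev;
};

struct Interval {
    double low;
    double high;
};

// Standard error of the mean: s / sqrt(n).
double std_error(const Sample& s) {
    assert(s.trials > 1);
    return s.std_dev / std::sqrt(static_cast<double>(s.trials));
}

// Two-sided confidence interval: mean +/- t * standard error.
// t_crit must match the confidence level and (trials - 1) degrees of freedom.
Interval confidence_interval(const Sample& s, double t_crit) {
    const double half_width = t_crit * std_error(s);
    return {s.mean - half_width, s.mean + half_width};
}

// Relative difference of the means, in percent of the second sample.
double percent_larger(const Sample& a, const Sample& b) {
    return 100.0 * (a.mean - b.mean) / b.mean;
}

// Student's t critical value for 95% two-sided confidence and
// 3 degrees of freedom (4 trials). Assumption: the tool uses this value.
const double kT95Dof3 = 3.182446;
```

Calling confidence_interval with the Log0 sample (4 trials, mean 443.022, std.deviation 1.75264) and kT95Dof3 gives an interval that agrees with the reported one to about three decimal places.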
* dwarf2.h (dwarf_location_list_entry_type): New enum with fields
DW_LLE_GNU_end_of_list_entry, DW_LLE_GNU_base_address_selection_entry,
DW_LLE_GNU_start_end_entry and DW_LLE_GNU_start_length_entry.
generic-morestack.c (__generic_morestack): Align the returned stack pointer to a 32 byte boundary.
* generic-morestack.c (__generic_morestack): Align the returned
stack pointer to a 32 byte boundary.
* config/i386/morestack.S (__morestack_non_split) [32-bit]: Don't
increment the return address until we have decided that we don't
have a varargs function.
(__morestack) [32-bit]: Align stack correctly when calling C
functions.
(__morestack) [64-bit]: Likewise.
Jan Hubicka [Tue, 6 Nov 2012 16:22:45 +0000 (17:22 +0100)]
cfgloopanal.c (get_loop_hot_path): New function.
* cfgloopanal.c (get_loop_hot_path): New function.
* tree-ssa-loop-ivcanon.c (struct loop_size): Add CONSTANT_IV,
NUM_NON_PURE_CALLS_ON_HOT_PATH, NUM_PURE_CALLS_ON_HOT_PATH,
NUM_BRANCHES_ON_HOT_PATH.
(tree_estimate_loop_size): Compute the new values.
(try_unroll_loop_completely): Disable unrolling of loops with only
calls or too many branches.
(tree_unroll_loops_completely): Deal also with outer loops of hot loops.
* cfgloop.h (get_loop_hot_path): Declare.
* params.def (PARAM_MAX_PEEL_BRANCHES): New parameter.
* invoke.texi (max-peel-branches): Document.
* gcc.dg/tree-ssa/loop-1.c: Make it still look like a good unrolling candidate.
* gcc.dg/tree-ssa/loop-23.c: Likewise.
* gcc.dg/tree-ssa/cunroll-1.c: Unrolling now happens early.
* gcc.dg/tree-prof/unroll-1.c: Remove confused dg-options.
Oleg Endo [Tue, 6 Nov 2012 11:55:43 +0000 (11:55 +0000)]
re PR target/54089 ([SH] Refactor shift patterns)
PR target/54089
* config/sh/sh.c (and_xor_ior_costs, addsubcosts): Double the costs for
ops larger than SImode.
* config/sh/sh.md (rotcl, *rotcl): New insns and splits.
(ashldi3_k): Convert to insn_and_split and use new rotcl insn.
Ed Schonberg [Tue, 6 Nov 2012 10:47:21 +0000 (10:47 +0000)]
sem_res.adb (Preanalyze_And_Resolve): In Alfa mode do not disable checks...
2012-11-06 Ed Schonberg <schonberg@adacore.com>
* sem_res.adb (Preanalyze_And_Resolve): In Alfa mode do not
disable checks, so that flags can be properly set on expressions
that are not further expanded.
Gary Dismukes [Tue, 6 Nov 2012 10:22:42 +0000 (10:22 +0000)]
exp_attr.adb (Expand_N_Attribute_Reference): Apply a predicate check when evaluating the attribute Valid...
2012-11-06 Gary Dismukes <dismukes@adacore.com>
* exp_attr.adb (Expand_N_Attribute_Reference): Apply a predicate
check when evaluating the attribute Valid, and issue a warning
about infinite recursion when the check occurs within the
predicate function of the prefix's subtype.
* exp_ch4.adb (Expand_N_In): Remove test for Is_Discrete_Type
when we're checking that there's no predicate check function as a
condition for substituting a Valid check for a scalar membership
test (substitution should be suppressed for any kind of scalar
subtype with a predicate check). Also, don't emit a predicate
check when the right operand is a range.
* einfo.adb: Add Loop_Entry_Attributes to the list of
Node/List/Elist10 usage.
(Loop_Entry_Attributes): New routine.
(Set_Loop_Entry_Attributes): New routine.
(Write_Field10_Name): Add an output string for Loop_Entry_Attributes.
* einfo.ads: Define new attribute Loop_Entry_Attributes along
with its usage in nodes.
(Loop_Entry_Attributes): New routine and dedicated pragma Inline.
(Set_Loop_Entry_Attributes): New routine and dedicated pragma Inline.
* exp_attr.adb (Expand_N_Attribute_Reference): Do not expand
Attribute_Loop_Entry here.
* exp_ch5.adb: Add with and use clause for Elists;
(Expand_Loop_Entry_Attributes): New routine.
(Expand_N_Loop_Statement): Add a call to Expand_Loop_Entry_Attributes.
* exp_prag.adb (Expand_Pragma_Loop_Assertion): Specialize the
search to include multiple nested loops produced by the expansion
of Ada 2012 array iterator.
* sem_attr.adb: Add with and use clause for Elists.
(Analyze_Attribute): Check the legality of attribute Loop_Entry.
(Resolve_Attribute): Nothing to do for Loop_Entry.
(S14_Attribute): New routine.
* snames.ads-tmpl: Add a comment on entries marked with
HiLite. Add new name Name_Loop_Entry. Add new attribute
Attribute_Loop_Entry.
Arnaud Charlet [Tue, 6 Nov 2012 10:14:13 +0000 (11:14 +0100)]
[multiple changes]
2012-11-06 Geert Bosch <bosch@adacore.com>
* eval_fat.adb (Machine, Succ): Fix front end to support static
evaluation of attributes on targets with both VAX and IEEE float.
* sem_util.ads, sem_util.adb (Has_Denormals, Has_Signed_Zeros):
New type-specific functions. Previously we used Denorm_On_Target
and Signed_Zeros_On_Target directly, but that doesn't work well
for OpenVMS where a single target supports both floating point
with and without signed zeros.
* sem_attr.adb (Attribute_Denorm, Attribute_Signed_Zeros): Use
new Has_Denormals and Has_Signed_Zeros functions to support both
IEEE and VAX floating point on a single target.
2012-11-06 Tristan Gingold <gingold@adacore.com>
* bindgen.adb (System_Interrupts_Used): New variable.
(Gen_Adainit): Declare and call
Install_Restricted_Handlers_Sequential if System.Interrupts is
used when elaboration policy is sequential.
Arnaud Charlet [Tue, 6 Nov 2012 10:11:20 +0000 (11:11 +0100)]
[multiple changes]
2012-11-06 Tristan Gingold <gingold@adacore.com>
* fe.h (Get_Vax_Real_Literal_As_Signed): Declare.
* eval_fat.adb, eval_fat.ads (Decompose_Int): Move spec in package spec.
* exp_vfpt.adb, exp_vfpt.ads (Vax_Real_Literal_As_Signed): New function.
(Expand_Vax_Real_Literal): Remove.
* exp_ch2.adb (Expand_N_Real_Literal): Do nothing.
* sem_eval.adb (Expr_Value_R): Remove special Vax float case,
as this is no longer a special case.
2012-11-06 Yannick Moy <moy@adacore.com>
* uintp.ads: Minor correction of typo in comment.
2012-11-06 Ed Schonberg <schonberg@adacore.com>
* sem_prag.adb (Analyze_Pragma, case Unchecked_Union): Remove
requirement that discriminants of an unchecked_union must have
defaults.
* exp_prag.adb (Expand_Pragma_Loop_Assertion): Update the comment
on intended expansion. Reimplement the logic which expands the
termination variants.
(Process_Increase_Decrease): Update the parameter profile and the
comment related to it. Accommodate the new aggregate-like appearance of
the termination variants.
* sem_prag.adb (Analyze_Pragma): Update the syntax of pragma
Loop_Assertion. Reimplement the semantic analysis of the pragma
to accommodate the new aggregate-like variant.
(Check_Variant): New routine.
* snames.ads-tmpl: Change names Name_Decreases and Name_Increases
to Name_Decreasing and Name_Increasing respectively. Add name
Variant.
2012-11-06 Ed Schonberg <schonberg@adacore.com>
* sem_eval.adb: Static evaluation of case expressions.
Arnaud Charlet [Tue, 6 Nov 2012 09:44:51 +0000 (10:44 +0100)]
[multiple changes]
2012-11-06 Thomas Quinot <quinot@adacore.com>
* s-oscons-tmplt.c: Interfaces.C now needs to be WITH'd even
on platforms that do not support sockets (for the benefit of
subtype IOCTL_Req_T).
2012-11-06 Ed Schonberg <schonberg@adacore.com>
* par-ch4.adb (P_Primary): if-expressions, case-expressions,
and quantified expressions are legal if surrounded by parentheses
from an enclosing context, such as a call or an instantiation.
2012-11-06 Yannick Moy <moy@adacore.com>
* impunit.adb (Get_Kind_Of_Unit): Return appropriate kind for
predefined implementation files, instead of returning
Not_Predefined_Unit on all .adb files.
2012-11-06 Tristan Gingold <gingold@adacore.com>
* exp_ch9.adb (Build_Activation_Chain_Entity): Return immediately if
partition elaboration policy is sequential.
(Build_Task_Activation_Call): Likewise. Use
Activate_Restricted_Tasks on restricted profile.
(Make_Task_Create_Call): Do not use the _Chain
parameter if elaboration policy is sequential. Call
Create_Restricted_Task_Sequential in that case.
* exp_ch3.adb (Build_Initialization_Call): Change condition to
support concurrent elaboration policy.
(Build_Record_Init_Proc): Likewise.
(Init_Formals): Likewise.
* bindgen.adb (Gen_Adainit): Declare Partition_Elaboration_Policy
and set it in generated code if the elaboration policy is
sequential. The procedure called to activate all tasks is now
named __gnat_activate_all_tasks.
* rtsfind.adb (RE_Activate_Restricted_Task,
RE_Create_Restricted_Task_Sequential): New RE_Id literals.
* s-tarest.adb (Create_Restricted_Task): Added to create a task without
adding it on an activation chain.
(Activate_Tasks): Now has a Chain parameter.
(Activate_All_Tasks_Sequential): Added. Called by the binder to
activate all tasks.
(Activate_Restricted_Tasks): Added. Called during elaboration to
activate tasks of the units.
* s-tarest.ads: Remove pragma Partition_Elaboration_Policy.
(Partition_Elaboration_Policy): New variable (set by the binder).
(Create_Restricted_Task): Revert removal of the chain parameter.
(Create_Restricted_Task_Sequential): New procedure.
(Activate_Restricted_Tasks): Revert removal.
(Activate_All_Tasks_Sequential): New procedure.
Bernard Banner [Tue, 6 Nov 2012 09:41:56 +0000 (09:41 +0000)]
2012-11-06 Bernard Banner <banner@adacore.com>
* adaint.c: Add file macro definitions missing on Android.
* adaint.h: Avoid definitions related to task affinity and CPU
sets since this functionality is missing on Android.
* errno.c (__set_errno): Android already contains such a named
procedure so do not include it again.
* gsocket.h: Sockets not supported on Android.
* init.c: Avoid linux related code not supported on Android.
* sysdep.c (sigismember, sigaddset, sigdelset, sigemptyset,
sigfillset): Add wrapper functions, since the sig routines are
defined as inline macros on Android.
* terminals.c: Add stubs for terminal related functions not
supported on Android.
* exp_prag.adb: Add with and use clause for Sem_Ch8.
(Expand_N_Pragma): Add a new variant to expand pragma Loop_Assertion.
(Expand_Pragma_Loop_Assertion): New routine.
* par-prag.adb (Prag): The semantic analysis of pragma
Loop_Assertion is carried out by Analyze_Pragma. No need for
checks in the parser.
* sem_prag.adb: Add a reference position value for pragma
Loop_Assertion in Sig_Flags.
(Analyze_Pragma): Add semantic analysis for pragma Loop_Assertion.
* snames.ads-tmpl: Add the following new names:
Name_Decreases Name_Increases Name_Loop_Assertion.
Add new pragma id Pragma_Loop_Assertion.
2012-11-06 Ed Schonberg <schonberg@adacore.com>
* exp_ch5.adb: Identifier in iterator must have debug
information.
Sriraman Tallam [Tue, 6 Nov 2012 02:35:17 +0000 (02:35 +0000)]
Function Multiversioning
========================
Sriraman Tallam, tmsriram@google.com
Overview of the patch which adds support to specify function versions. This is
only enabled for target i386.
Example:
int foo (); /* Default version */
int foo () __attribute__ ((target("avx,popcnt"))); /* Specialized for avx and popcnt */
int foo () __attribute__ ((target("arch=core2,ssse3"))); /* Specialized for core2 and ssse3 */
int main ()
{
int (*p)() = &foo;
return foo () + (*p)();
}
int foo ()
{
return 0;
}
int __attribute__ ((target("avx,popcnt")))
foo ()
{
return 0;
}
int __attribute__ ((target("arch=core2,ssse3")))
foo ()
{
return 0;
}
The above example defines foo three times, but all three definitions are
different versions of the same function. The calls to foo in main, directly and
via a pointer, are calls to the multi-versioned function foo, which is
dispatched to the right foo at run-time.
Front-end changes:
The front-end changes are calls at appropriate places to target hooks that
determine the following:
* Determine if two function decls with the same signature are versions.
* Determine the assembler name of a function version.
* Generate the dispatcher function for a set of function versions.
* Compare versions to see if one has a higher priority over the other.
All the implementation happens in the target-specific config/i386/i386.c.
What does the patch do?
* Tracking decls that correspond to function versions of function
name, say "foo":
When the front-end sees more than one decl for "foo", it calls a target hook to
determine if they are versions. To prevent duplicate definition errors with
other versions of "foo", the "decls_match" function in cp/decl.c is made to
return false when two decls are deemed versions by the target. This causes all
function versions of "foo" to be added to the overload list of "foo".
* Change the assembler names of the function versions.
For i386, the target changes the assembler names of the function versions by
appending the sorted list of arguments of the "target" attribute to the
function name.
For example, the assembler name of
"void foo () __attribute__ ((target ("sse4")))" will
become _Z3foov.sse4. The target hook mangle_decl_assembler_name is used
for this.
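A minimal sketch of this naming scheme follows. It is not the i386.c code: versioned_assembler_name is a hypothetical helper, and the '_' separator between multiple arguments is an assumption (the text above only confirms the '.' before a single argument, as in _Z3foov.sse4).

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not the i386.c code: build a versioned assembler name
// by sorting the comma-separated "target" arguments and appending them to
// the already-mangled base name after a '.'.
std::string versioned_assembler_name(const std::string& base,
                                     const std::string& target_args) {
    assert(!base.empty());

    // Split the attribute string on commas, skipping empty pieces.
    std::vector<std::string> args;
    std::stringstream in(target_args);
    std::string arg;
    while (std::getline(in, arg, ','))
        if (!arg.empty())
            args.push_back(arg);

    // Sorting makes "avx,popcnt" and "popcnt,avx" name the same version.
    std::sort(args.begin(), args.end());

    std::string name = base;
    for (std::size_t i = 0; i < args.size(); ++i) {
        // Assumption: '.' before the first argument, '_' between arguments.
        name += (i == 0 ? '.' : '_');
        name += args[i];
    }
    return name;
}
```

With these assumptions, versioned_assembler_name("_Z3foov", "sse4") yields "_Z3foov.sse4", matching the example above, and the argument order in the attribute does not affect the result.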
* Overload resolution:
Function "build_over_call" in cp/call.c sees a call to function
"foo", which is multi-versioned. The overload resolution happens in
function "joust" in "cp/call.c". Here, the call to "foo" has all
possible versions of "foo" as candidates. All the candidates of "foo" are
stored in the cgraph side data structure. Each version of foo is chained in a
doubly-linked list with the default function as the first element. This allows
any pass to access all the semantically identical versions. A call to a
multi-versioned function will be replaced by a call to a dispatcher function,
determined by a target hook, to execute the right function version at run-time.
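The version chain described above can be pictured with a small sketch. VersionInfo and its fields are hypothetical stand-ins and are not taken from cgraph.h; the sketch only illustrates a doubly-linked list whose head is the default version and which can be walked from any member.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-version side data; the field names are
// illustrative and are not taken from cgraph.h.
struct VersionInfo {
    std::string assembler_name;
    VersionInfo* prev = nullptr;
    VersionInfo* next = nullptr;
};

// Append a version at the tail of the chain that starts at the default version.
void append_version(VersionInfo& default_version, VersionInfo& added) {
    assert(default_version.prev == nullptr);  // the default must be the head
    VersionInfo* tail = &default_version;
    while (tail->next != nullptr)
        tail = tail->next;
    tail->next = &added;
    added.prev = tail;
}

// From any version, walk back to the head and then list the whole chain.
std::vector<std::string> all_versions(const VersionInfo& any) {
    const VersionInfo* node = &any;
    while (node->prev != nullptr)
        node = node->prev;
    std::vector<std::string> names;
    for (; node != nullptr; node = node->next)
        names.push_back(node->assembler_name);
    return names;
}
```

Because every node links both ways, a pass holding any one version can reach the default version and every other semantically identical version.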
Optimization to directly call a version when possible:
Also, in joust, where overload resolution happens, a multiversioned function
resolution is made to return the most specialized version. This is the version
that will be checked for dispatching first and is determined by the target.
Now, if the caller can inline this function version then a direct call is made
to this function version rather than go through the dispatcher. When a direct
call cannot be made, a call to the dispatcher function is created.
* Creating the dispatcher body.
The dispatcher body, called the resolver, is created only when there is a call
to a multiversioned function dispatcher or the address of a function is taken.
It is generated during cgraph_analyze_function by another target hook.
* Dispatch ordering.
The order in which the function versions are checked during dispatch is based
on a priority value assigned to the ISA that is catered for. More specialized
versions are checked for dispatching first. This is to mitigate the ambiguity
that can arise when more than one function version is valid for execution on
a particular platform. This is not a perfect solution, and in future the user
should be allowed to assign a dispatching priority value to each version.
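The priority-ordered check can be sketched as follows. This is an illustration under stated assumptions, not the dispatcher the patch generates: Version, resolve and the function-pointer typedefs are hypothetical, and the run-time CPU check is modelled as a plain callback.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef int (*VersionFn)();
typedef bool (*CpuCheck)();

// Hypothetical description of one function version; in the patch the
// priority and the run-time check both come from the target.
struct Version {
    int priority;        // larger means more specialized
    CpuCheck supported;  // true if this version can run on the current CPU
    VersionFn fn;
};

// Pick the version to run: check candidates from highest to lowest priority
// and fall back to the default version when none of them is usable.
VersionFn resolve(std::vector<Version> versions, VersionFn default_fn) {
    assert(default_fn != nullptr);
    std::stable_sort(versions.begin(), versions.end(),
                     [](const Version& a, const Version& b) {
                         return a.priority > b.priority;
                     });
    for (const Version& v : versions)
        if (v.supported())
            return v.fn;
    return default_fn;
}
```

When two versions are both valid on the running CPU, the one with the higher priority wins, which is how the ordering resolves the ambiguity mentioned above.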
Function MV in the Intel compiler:
The Intel compiler supports function multiversioning and the syntax is
similar to the patch proposed here. Here is an example of how to
generate multiple function versions with the Intel compiler.
/* Create a stub function to specify the various versions of function that
will be created, using declspec attribute cpu_dispatch. */
__declspec (cpu_dispatch (core_i7_sse4_2, atom, generic))
void foo () {};
/* The generic or the default version. */
__declspec (cpu_specific(generic))
void foo ()
{
printf ("This is generic");
}
A new function version is generated by defining a new function with the same
signature but with a different cpu_specific declspec attribute string. The
set of cpu_specific strings that are allowed is the following:
Comparison with the GCC MV implementation in this patch:
* Version creation syntax:
The implementation in this patch also has a similar syntax to specify function
versions. The first stub function is not needed. Here is the code to generate
the function versions with this patch:
* doc/tm.texi.in (TARGET_OPTION_FUNCTION_VERSIONS): New hook
description.
* (TARGET_COMPARE_VERSION_PRIORITY): New hook description.
* (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New hook description.
* (TARGET_GENERATE_VERSION_DISPATCHER_BODY): New hook description.
* doc/tm.texi: Regenerate.
* target.def (compare_version_priority): New target hook.
* (generate_version_dispatcher_body): New target hook.
* (get_function_versions_dispatcher): New target hook.
* (function_versions): New target hook.
* cgraph.c (cgraph_fnver_htab): New htab.
(cgraph_fn_ver_htab_hash): New function.
(cgraph_fn_ver_htab_eq): New function.
(version_info_node): New pointer.
(insert_new_cgraph_node_version): New function.
(get_cgraph_node_version): New function.
(delete_function_version): New function.
(record_function_versions): New function.
* cgraph.h (cgraph_node): New bitfield dispatcher_function.
(cgraph_function_version_info): New struct.
(get_cgraph_node_version): New function.
(insert_new_cgraph_node_version): New function.
(record_function_versions): New function.
(delete_function_version): New function.
(init_lowered_empty_function): Expose function.
* tree.h (DECL_FUNCTION_VERSIONED): New macro.
(tree_function_decl): New bit-field versioned_function.
* cgraphunit.c (cgraph_analyze_function): Generate body of multiversion
function dispatcher.
(cgraph_analyze_functions): Analyze dispatcher function.
(init_lowered_empty_function): Make non-static. New parameter in_ssa.
(assemble_thunk): Add parameter to call to init_lowered_empty_function.
* config/i386/i386.c (add_condition_to_bb): New function.
(get_builtin_code_for_version): New function.
(ix86_compare_version_priority): New function.
(feature_compare): New function.
(dispatch_function_versions): New function.
(ix86_function_versions): New function.
(attr_strcmp): New function.
(ix86_mangle_function_version_assembler_name): New function.
(ix86_mangle_decl_assembler_name): New function.
(make_name): New function.
(make_dispatcher_decl): New function.
(is_function_default_version): New function.
(ix86_get_function_versions_dispatcher): New function.
(make_attribute): New function.
(make_resolver_func): New function.
(ix86_generate_version_dispatcher_body): New function.
(fold_builtin_cpu): Return integer for cpu builtins.
(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New macro.
(TARGET_COMPARE_VERSION_PRIORITY): New macro.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New macro.
(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New macro.
(TARGET_OPTION_FUNCTION_VERSIONS): New macro.
* class.c (add_method): Change assembler names of function versions.
(mark_versions_used): New static function.
(resolve_address_of_overloaded_function): Create dispatcher decl and
return address of dispatcher instead.
* decl.c (decls_match): Make decls unmatched for versioned
functions.
(duplicate_decls): Remove ambiguity for versioned functions.
Delete versioned function data for merged decls.
* decl2.c (check_classfn): Check attributes of versioned functions
for match.
* call.c (get_function_version_dispatcher): New function.
(mark_versions_used): New static function.
(build_over_call): Make calls to multiversioned functions
to call the dispatcher.
(joust): For calls to multi-versioned functions, make the most
specialized function version win.
* testsuite/g++.dg/mv1.C: New test.
* testsuite/g++.dg/mv2.C: New test.
* testsuite/g++.dg/mv3.C: New test.
* testsuite/g++.dg/mv4.C: New test.
* testsuite/g++.dg/mv5.C: New test.
* testsuite/g++.dg/mv6.C: New test.
Eric Botcazou [Mon, 5 Nov 2012 21:39:02 +0000 (21:39 +0000)]
re PR tree-optimization/54986 (segfault on constant initialized to object address at -O)
PR tree-optimization/54986
* gimple-fold.c (canonicalize_constructor_val): Strip again all no-op
conversions on entry but add them back on exit if needed.
François Dumont [Mon, 5 Nov 2012 20:58:35 +0000 (20:58 +0000)]
throw_allocator.h (__throw_value_base): Add move semantic, not throwing.
2012-10-05 François Dumont <fdumont@gcc.gnu.org>
* include/ext/throw_allocator.h (__throw_value_base): Add move
semantic, not throwing.
(__throw_value_limit): Likewise.
(__throw_value_random): Likewise.
* testsuite/util/exception/safety.h: Add validation of C++11
methods emplace/emplace_front/emplace_back/emplace_hint.
* testsuite/util/testsuite_container_traits.h: Signal emplace
support on deque, forward_list, list and vector.
* testsuite/23_containers/deque/requirements/exception/
propagation_consistent.cc: Remove dg-do run fail.
re PR target/55204 (ICE: in extract_insn, at recog.c:2140 (unrecognizable insn) with -O --param loop-invariant-max-bbs-in-loop=0)
gcc/
PR target/55204
* config/i386/i386.c (ix86_address_subreg_operand): Remove stack
pointer check.
(print_reg): Use true_regnum rather than REGNO.
(ix86_print_operand_address): Remove SUBREG handling.