From: Uros Bizjak
Date: Wed, 12 Sep 2012 15:23:01 +0000 (+0200)
Subject: i386.md: Comments on fma4 instruction selection reflect requirement on register press...
X-Git-Tag: misc/gccgo-go1_1_2~915
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=ed56b7f958ffab960ccad757c510d718bc85b0b6;p=thirdparty%2Fgcc.git

i386.md: Comments on fma4 instruction selection reflect requirement on register pressure...

2012-09-12 Ganesh Gopalasubramanian

	* config/i386/i386.md : Comments on fma4 instruction
	selection reflect requirement on register pressure based
	cost model.

	* config/i386/driver-i386.c (host_detect_local_cpu): fma4
	flag is set-reset as informed by the cpuid flag.

	* config/i386/i386.c (processor_alias_table): fma4
	flag is enabled for bdver2.

From-SVN: r191226
---

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fe066e7aa6a3..345ea6a6287b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2012-09-12 Ganesh Gopalasubramanian
+
+	* config/i386/i386.md : Comments on fma4 instruction
+	selection reflect requirement on register pressure based
+	cost model.
+
+	* config/i386/driver-i386.c (host_detect_local_cpu): fma4
+	flag is set-reset as informed by the cpuid flag.
+
+	* config/i386/i386.c (processor_alias_table): fma4
+	flag is enabled for bdver2.
+
 2012-09-12 Richard Guenther
 
 	PR tree-optimization/54489
@@ -93,13 +105,11 @@
 
 2012-09-11 Diego Novillo
 
-	* var-tracking.c (vt_add_function_parameter): Adjust for VEC
-	changes.
+	* var-tracking.c (vt_add_function_parameter): Adjust for VEC changes.
 
 2012-09-11 Dominique Dhumieres
 
-	* config/darwin.c (darwin_asm_named_section): Adjust for
-	VEC changes.
+	* config/darwin.c (darwin_asm_named_section): Adjust for VEC changes.
 	(darwin_asm_dwarf_section): Likewise.
 
 2012-09-11 Martin Jambor
@@ -158,12 +168,11 @@
 
 2012-09-11 Richard Guenther
 
-	* graphite-scop-detection.c (move_sd_regions): Adjust for VEC
-	changes.
+	* graphite-scop-detection.c (move_sd_regions): Adjust for VEC changes.
 	(scopdet_basic_block_info): Likewise.
 	(build_scops_1): Likewise.
 	(limit_scops): Likewise.
-
+
 2012-09-11 Richard Guenther
 
 	PR middle-end/54515
@@ -266,7 +275,7 @@
 2012-09-09 Mark Kettenis
 
 	* config/openbsd-stdint.h (INTMAX_TYPE, UINTMAX_TYPE): Define.
-
+
 2012-09-09 Jan Hubicka
 
 	* passes.c (ipa_write_summaries_1): Set state;
@@ -302,7 +311,8 @@
 	(lto_symtab_encoder_delete_node): New function.
 	(lto_symtab_encoder_encode_body_p, lto_set_symtab_encoder_encode_body,
 	lto_symtab_encoder_encode_initializer_p,
-	lto_set_symtab_encoder_encode_initializer, lto_symtab_encoder_in_partition_p,
+	lto_set_symtab_encoder_encode_initializer,
+	lto_symtab_encoder_in_partition_p,
 	lto_symtab_encoder_in_partition_p): Update.
 	(compute_ltrans_boundary): Take encoder as an input.
 	* passes.c (ipa_write_summaries_1): Update.
@@ -323,12 +333,12 @@
 
 2012-09-08 John David Anglin
 
-	* config/pa/pa.c (hppa_rtx_costs): Update costs for large integer modes.
+	* config/pa/pa.c (hppa_rtx_costs): Update costs for large
+	integer modes.
 
 2012-09-08 Andi Kleen
 
-	* gcc/lto/lto.c (do_whole_program_analysis):
-	Fix last broken patch
+	* gcc/lto/lto.c (do_whole_program_analysis): Fix last broken patch.
 
 2012-09-08 Andi Kleen
 
@@ -392,8 +402,8 @@
 	PR tree-optimization/53986
 	* tree-vrp.c (extract_range_from_multiplicative_op_1): Allow
 	LSHIFT_EXPR.
-	(extract_range_from_binary_expr_1): Handle LSHIFT with constant range as
-	shift amount.
+	(extract_range_from_binary_expr_1): Handle LSHIFT with constant
+	range as shift amount.
 
 2012-09-07 Segher Boessenkool
 
@@ -416,7 +426,7 @@
 	(call_value_nonlocal_aix32): Ditto.
 	(call_value_nonlocal_aix64): Ditto.
 
-2012-09-06 Andi Kleen
+2012-09-06 Andi Kleen
 
 	* doc/invoke.texi (-ffat-lto-objects): Clarify that gcc-ar
 	et.al. should be used.
@@ -498,7 +508,7 @@
 
 2012-09-06 Uros Bizjak
 
-	* configure.ac (hle prefixes): Remove .code64.
+	* configure.ac (hle prefixes): Remove .code64 directive.
 	* configure: Regenerated.
 
 2012-09-06 Kyrylo Tkachov
@@ -922,7 +932,7 @@
 	* config/sh/sh.md (cbranchsi4): Remove TARGET_CBRANCHDI4 check and
 	always invoke expand_cbranchsi4.
 
-2012-09-03 Andi Kleen
+2012-09-03 Andi Kleen
 
 	* tree-ssa-sccvn.c (vn_reference_fold_indirect): Initialize
 	addr_offset always.
@@ -15115,7 +15125,7 @@
 	* cgraphunit.c (cgraph_analyze_function): Use gimple_has_body_p.
 
 2012-05-02 Kirill Yukhin
-	    Andi Kleen
+	    Andi Kleen
 
 	* coretypes.h (MEMMODEL_MASK): New.
 	* builtins.c (get_memmodel): Add val. Call target.memmodel_check
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 79bf75ffaeb7..bda4e0222776 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -472,8 +472,6 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_abm = ecx & bit_ABM;
       has_lwp = ecx & bit_LWP;
       has_fma4 = ecx & bit_FMA4;
-      if (vendor == signature_AMD_ebx && has_fma4 && has_fma)
-	has_fma4 = 0;
       has_xop = ecx & bit_XOP;
       has_tbm = ecx & bit_TBM;
       has_lzcnt = ecx & bit_LZCNT;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 62d3a8c990b6..69a3377e1501 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3164,7 +3164,7 @@ ix86_option_override_internal (bool main_args_p)
       {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
 	PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 	| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-	| PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
+	| PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
 	| PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
 	| PTA_FMA},
       {"btver1", PROCESSOR_BTVER1, CPU_GENERIC64,
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 898e01562488..05d22ddb3dc0 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -659,9 +659,11 @@
 	 (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
 	 (eq_attr "isa" "bmi2") (symbol_ref "TARGET_BMI2")
 	 (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
-	 ;; Disable generation of FMA4 instructions for generic code
-	 ;; since FMA3 is preferred for targets that implement both
-	 ;; instruction sets.
+	 ;; Fma instruction selection has to be done based on
+	 ;; register pressure. For generating fma4, a cost model
+	 ;; based on register pressure is required. Till then,
+	 ;; fma4 instruction is disabled for targets that implement
+	 ;; both fma and fma4 instruction sets.
 	 (eq_attr "isa" "fma4")
 	   (symbol_ref "TARGET_FMA4 && !TARGET_FMA")
 	]
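
Note on the combined effect of the three hunks, as a minimal sketch in C rather than actual GCC code. It assumes the two CPUID bit positions that GCC's <cpuid.h> uses for bit_FMA and bit_FMA4; the names cpu_flags, detect_flags, fma4_patterns_enabled and check_examples are hypothetical and exist only for illustration. After the patch the driver reports the fma4 flag exactly as cpuid does, and the decision not to emit FMA4 when FMA is also available is left to the "isa" attribute test in i386.md.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions assumed to match GCC's <cpuid.h>. */
#define BIT_FMA  (1u << 12)  /* CPUID leaf 1, ECX */
#define BIT_FMA4 (1u << 16)  /* CPUID leaf 0x80000001, ECX */

/* Hypothetical container for the two flags discussed in the patch. */
struct cpu_flags {
  bool has_fma;
  bool has_fma4;
};

/* Post-patch detection: each flag follows its cpuid bit, with no
   vendor-specific reset of has_fma4 when has_fma is also set. */
struct cpu_flags detect_flags(unsigned leaf1_ecx, unsigned ext_ecx)
{
  struct cpu_flags f;
  f.has_fma = (leaf1_ecx & BIT_FMA) != 0;
  f.has_fma4 = (ext_ecx & BIT_FMA4) != 0;
  return f;
}

/* Mirrors the i386.md condition "TARGET_FMA4 && !TARGET_FMA". */
bool fma4_patterns_enabled(struct cpu_flags f)
{
  return f.has_fma4 && !f.has_fma;
}

/* Small self-check covering an FMA4-only CPU and a CPU with both sets. */
void check_examples(void)
{
  struct cpu_flags fma4_only = detect_flags(0u, BIT_FMA4);
  struct cpu_flags both = detect_flags(BIT_FMA, BIT_FMA4);

  assert(fma4_patterns_enabled(fma4_only));
  assert(both.has_fma4);                 /* flag is no longer cleared */
  assert(!fma4_patterns_enabled(both));  /* but FMA4 patterns stay off */
}
```

For a CPU that advertises both instruction sets, such as one matching the bdver2 entry, the flag is now visible to the compiler while the FMA4 patterns remain disabled until a register-pressure-based cost model exists.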