PR middle-end/86528
* builtins.c (check_access): Bail out if range[0] is not an INTEGER_CST.
* expr.c (string_constant): Fix the element size of ARRAY_TYPE.
The standard doesn't specify this partial specialization (it was
required after the changes in N2637 but then should have been removed
following LWG 1262). Its presence is observable because it causes
different results when operator< has been overloaded for a shared_ptr
specialization.
* doc/extend.texi (PowerPC AltiVec Built-in Functions):
Alphabetize prototypes of built-in functions, separating out
built-in functions that are listed in this section but should be
described elsewhere.
PR target/86511
* expmed.c (emit_store_flag): Do not emit setcc followed by a
conditional move when trapping comparison was split to a
non-trapping one (and vice versa).
On i386 the profiler call sequence always consists of 1 call
instruction, so -mnop-mcount generates a single nop with the same
length as a call. For S/390 longer sequences may be used in some
cases, so -mnop-mcount generates the corresponding amount of nops.
2018-07-16 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390.c (s390_function_profiler): Generate nops
instead of profiler call sequences.
* config/s390/s390.opt: Add the new option.
2018-07-16 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/mnop-mcount-m31-fpic.c: New testcase.
* gcc.target/s390/mnop-mcount-m31-mzarch.c: New testcase.
* gcc.target/s390/mnop-mcount-m31.c: New testcase.
* gcc.target/s390/mnop-mcount-m64-mfentry.c: New testcase.
* gcc.target/s390/mnop-mcount-m64.c: New testcase.
S/390: Add direct support for Linux kernel __fentry__ patching.
On i386, the difference between mcount and fentry is that fentry
comes before the prolog. On s390 mcount already comes before the
prolog, but takes 4 instructions. This patch introduces the more
efficient implementation (just 1 instruction) and puts it under
the -mfentry flag.
The produced code is compatible only with newer glibc versions,
which provide the __fentry__ symbol and do not clobber %r0 when
resolving lazily bound functions. Because 31-bit PLT stubs assume
%r12 contains the GOT address, which is not the case when the code runs
before the prolog, -mfentry is allowed only for 64-bit code.
Also, -mfentry instrumentation cannot be applied to nested C
functions, since both use %r0. In this case instrumentation is
not inserted, and a new warning is issued for each affected nested
function.
2018-07-16 Ilya Leoshkevich <iii@linux.ibm.com>
* common.opt: Add the new warning.
* config/s390/s390.c (s390_function_profiler): Emit "brasl
%r0,__fentry__" when -mfentry is specified.
(s390_option_override_internal): Disallow -mfentry for 31-bit
CPUs.
* config/s390/s390.opt: Add the new option.
[Ada] Missing error on hidden state in instantiation
This patch modifies the analysis of package contracts to split processing
which is specific to package instantiations on its own. As a result, the
lack of indicator Part_Of can now be properly assessed.
------------
-- Source --
------------
-- gen_pack.ads
generic
package Gen_Pack is
Pack_Var : Integer := 1;
end Gen_Pack;
-- gen_wrap.ads
with Gen_Pack;
generic
package Gen_Wrap is
Wrap_Var : Integer := 1;
package Inst is new Gen_Pack;
end Gen_Wrap;
-- pack.ads
with Gen_Pack;
with Gen_Wrap;
package Pack
with SPARK_Mode => On,
Abstract_State => State
is
procedure Force_Body;
private
package OK_Inst_1 is new Gen_Pack -- OK
with Part_Of => State; -- OK
package OK_Inst_2 is new Gen_Pack; -- OK
pragma Part_Of (State); -- OK
package OK_Inst_3 is new Gen_Wrap -- OK
with Part_Of => State; -- OK
package OK_Inst_4 is new Gen_Wrap; -- OK
pragma Part_Of (State);
package Error_Inst_1 is new Gen_Pack; -- Error
package Error_Inst_2 is new Gen_Wrap; -- Error
end Pack;
-- pack.adb
package body Pack
with SPARK_Mode => On,
Refined_State =>
(State => (OK_Inst_1.Pack_Var, OK_Inst_2.Pack_Var,
OK_Inst_3.Wrap_Var, OK_Inst_3.Inst.Pack_Var,
OK_Inst_4.Wrap_Var, OK_Inst_4.Inst.Pack_Var))
is
procedure Force_Body is null;
end Pack;
----------------------------
-- Compilation and output --
----------------------------
$ gcc -c pack.adb
pack.ads:23:12: indicator Part_Of is required in this context (SPARK RM
7.2.6(2))
pack.ads:23:12: "Error_Inst_1" is declared in the private part of package
"Pack"
pack.ads:24:12: indicator Part_Of is required in this context (SPARK RM
7.2.6(2))
pack.ads:24:12: "Error_Inst_2" is declared in the private part of package
"Pack"
* contracts.adb (Analyze_Contracts): Add specialized processing for
package instantiation contracts.
(Analyze_Package_Contract): Remove the verification of a missing
Part_Of indicator.
(Analyze_Package_Instantiation_Contract): New routine.
* contracts.ads (Analyze_Package_Contract): Update the comment on
usage.
* sem_prag.adb (Check_Missing_Part_Of): Ensure that the entity of the
instance is being examined when trying to determine whether a package
instantiation needs a Part_Of indicator.
[Ada] Fix expansion of blocks in loops inside elaboration code
2018-07-16 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* exp_ch7.adb (Check_Unnesting_Elaboration_Code): Handle loops that
contain blocks in the elaboration code for a package body. Create the
elaboration subprogram wrapper only if there is a subprogram
declaration in a block or loop.
[Ada] Deep copy operands of membership operations for unnesting
2018-07-16 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* exp_ch4.adb (Expand_Set_Membership): Use New_Copy_Tree to perform a
deep copy of the left operand when building each conjunct of the
expanded membership operation, to avoid sharing nodes between them.
This sharing interferes with the unnesting machinery and is generally
undesirable.
[Ada] Fix Default_Storage_Pool aspect handling in generic instantiations
2018-07-16 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* sem_ch12.adb (Analyze_Package_Instantiation): Handle properly an
instance that carries an aspect Default_Storage_Pool that overrides a
default storage pool that applies to the generic unit. The aspect in
the generic unit was removed before copying it in the instance, rather
than removing it from the copy of the aspects that are appended to the
aspects in the instance.
* einfo.adb (Set_Is_Uplevel_Referenced_Entity): Flag can appear on
loop parameters.
* exp_ch7.adb (Check_Unnesting_Elaboration_Code): Handle subprogram
bodies.
* exp_ch9.adb (Reset_Scopes_To): Set the scopes of entities local to an
entry body to be the corresponding generated subprogram, for correct
analysis of uplevel references.
* exp_unst.adb (Visit_Node): Handle properly binary and unary operators.
Ignore pragmas, fix component associations.
(Register_Subprograms): Subprograms in synchronized types must be
treated as reachable.
This patch corrects the mechanism which ensures that a package with a null
Abstract_State does not introduce hidden state, by ignoring internal states
and variables because they do not represent the "source" hidden state.
[Ada] Deconstruct unused Withed_Body field of N_With_Clause node
The Withed_Body field was added to N_With_Clause node to help the
Walk_Library_Items routine, which was created for the CodePeer backend
and later adopted by GNATprove.
This routine is meant to traverse all library units, such that declarations
are visited before references. However, for complex units (in particular,
with generics and child packages) it never worked reliably and backends
developed their own workarounds. This patch deconstructs the field, as it
hasn't been used for years.
[Ada] Avoid crash when traversing units with -gnatd.WW debug switch
The debug switch -gnatd.WW enables extra info when traversing library units
with Walk_Library_Items, which is used in CodePeer and GNATprove. This
routine was crashing when trying to print info about a unit with configuration
pragmas (typically an .adc file). Now fixed.
No test, as the crash only happens when a GNATprove backend is manually called
with -gnatd.WW switch. Frontend is not affected.
2018-07-16 Piotr Trojanek <trojanek@adacore.com>
gcc/ada/
* sem.adb (Walk_Library_Items): Skip units with configuration pragmas
when printing debug info.
[Ada] Trivial simplifications in Walk_Library_Items
Cleanup only; semantics unaffected.
2018-07-16 Piotr Trojanek <trojanek@adacore.com>
gcc/ada/
* sem.adb (Walk_Library_Items): Reuse local constant.
(Is_Subunit_Of_Main): Turn condition to positive and flip the
IF-THEN-ELSE branches; avoid potentially ineffective assignment to the
Lib variable.
[Ada] Deconstruct always-false calls to Withed_Body in Walk_Library_Items
We previously removed the calls to Set_Withed_Body; this commit deconstructs
calls to Withed_Body, which always returned False.
The Set_Withed_Body/Withed_Body were helping the Walk_Library_Items routine
traverse the AST of several compilation units such that declarations are
visited before references. However, this never worked as it should and there is
no point in keeping the code more complicated than necessary.
No test provided, because the removed code was ineffective (and only used in
the non-compiler backends, i.e. CodePeer and GNATprove).
2018-07-16 Piotr Trojanek <trojanek@adacore.com>
gcc/ada/
* sem.adb (Walk_Library_Items): Deconstruct dead code.
[Ada] Crash on Indefinite_Hashed_Maps with -gnata -gnateV
This patch corrects the generation of helper functions which verify the
validity of record type scalar discriminants and scalar components when
switches -gnata (assertions enabled) and -gnateV (validity checks on
subprogram parameters) are in effect.
[Ada] Spurious possible constraint error warning with No_Exception_Propagation
This patch corrects an issue whereby spurious unhandled exception warnings on
integer literals within static if and case expressions would be emitted when
the restriction No_Exception_Propagation is enabled.
procedure Filter (Ret : out Integer) is
Val : constant Nat :=
(if User_Override in Nat then
User_Override
else
Default);
begin
Ret := Val;
end Filter;
end Pack;
----------------------------
-- Compilation and output --
----------------------------
$ gcc -c -gnatp -gnatwa pack.adb
2018-07-16 Justin Squirek <squirek@adacore.com>
gcc/ada/
* sem_eval.adb (Eval_Integer_Literal): Add exception for avoiding
checks on expanded literals within if and case expressions.
[Ada] Segmentation_Fault with Integer'Wide_Wide_Value
This patch updates the routines which produce Wide_String and Wide_Wide_String
from a String to construct a result of the proper maximum size which is later
sliced.
This patch is preventive: it improves checks on inline functions that
return unconstrained type. It does not change the functionality of
the compiler.
2018-07-16 Javier Miranda <miranda@adacore.com>
gcc/ada/
* inline.adb (Build_Body_To_Inline): Minor code reorganization that
ensures that calls to function Has_Single_Return() pass a decorated
tree.
(Has_Single_Return.Check_Return): Perform checks on entities (instead of
relying on their characters).
[Ada] Crash processing sources under GNATprove debug mode
Processing sources under -gnatd.F the frontend may crash on
an iterator of the form 'for X of ...' over an array if the
iterator is located in an inlined subprogram.
2018-07-16 Javier Miranda <miranda@adacore.com>
gcc/ada/
* exp_ch5.adb (Expand_Iterator_Loop_Over_Array): Code cleanup. Required
to avoid generating an ill-formed tree that confuses gnatprove, causing
it to blow up.
gcc/testsuite/
* gnat.dg/iter2.adb, gnat.dg/iter2.ads: New testcase.
[Ada] Adjust inlining in GNATprove mode for predicate/invariant/DIC
The frontend generates special functions for checking subtype predicates,
type invariants and Default_Initial_Condition aspect. These are translated
as predicates in GNATprove, and as such no call inside these
functions should be inlined. This is similar to the existing handling of
calls inside expression functions.
There is no impact on compilation.
2018-07-16 Yannick Moy <moy@adacore.com>
gcc/ada/
* sem_res.adb (Resolve_Call): Do not inline calls inside
compiler-generated functions translated as predicates in GNATprove.
[Ada] Violation of No_Standard_Allocators_After_Elaboration not detected
The compiler fails to generate a call to detect allocators executed after
elaboration in cases where the allocator is associated with Global_Pool_Object.
The fix is to test for this associated storage pool as part of the condition
for generating a call to System.Elaboration_Allocators.Check_Standard_Allocator.
Also, the exception Storage_Error is now generated instead of Program_Error
for such a run-time violation, as required by the Ada RM in D.7.
The following test must compile and execute quietly:
-- Put the pragma in gnat.adc:
pragma Restrictions (No_Standard_Allocators_After_Elaboration);
procedure Allocate
(Use_Global_Allocator : Boolean;
During_Elaboration : Boolean)
is
type Local_Acc is access Rec;
Local_Ptr : Local_Acc;
begin
if Use_Global_Allocator then
Ptr := new Rec; -- Raise Storage_Error if after elaboration
Ptr.Int := 1;
else
Local_Ptr := new Rec; -- Raise Storage_Error if after elaboration
Local_Ptr.Int := 1;
end if;
if not During_Elaboration then
raise Program_Error; -- No earlier exception: FAIL
end if;
exception
when Storage_Error =>
if During_Elaboration then
raise Program_Error; -- No exception expected: FAIL
else
null; -- Expected Storage_Error: PASS
end if;
when others =>
raise Program_Error; -- Unexpected exception: FAIL
end Allocate;
begin
Allocate (Use_Global_Allocator => True, During_Elaboration => True);
Allocate (Use_Global_Allocator => False, During_Elaboration => True);
end Pkg_With_Allocators;
with Pkg_With_Allocators;
procedure Alloc_Restriction_Main is
begin
Pkg_With_Allocators.Allocate
(Use_Global_Allocator => True,
During_Elaboration => False);
Pkg_With_Allocators.Allocate
(Use_Global_Allocator => False,
During_Elaboration => False);
end Alloc_Restriction_Main;
2018-07-16 Gary Dismukes <dismukes@adacore.com>
gcc/ada/
* exp_ch4.adb (Expand_N_Allocator): Test for Storage_Pool being RTE in
addition to the existing test for no Storage_Pool as a condition
enabling generation of the call to Check_Standard_Allocator when the
restriction No_Standard_Allocators_After_Elaboration is active.
* libgnat/s-elaall.ads (Check_Standard_Allocator): Correct comment to
say that Storage_Error will be raised (rather than Program_Error).
* libgnat/s-elaall.adb (Check_Standard_Allocator): Raise Storage_Error
rather than Program_Error when Elaboration_In_Progress is False.
* sem_eval.adb (Compile_Time_Known_Value): Add a guard which prevents
the compiler from entering infinite recursion when trying to determine
whether a deferred constant has a compile time known value, and the
initialization expression of the constant is a reference to the
constant itself.
PR tree-optimization/86514
* tree-ssa-reassoc.c (init_range_entry) <CASE_CONVERT>: Return for a
conversion to a boolean type from a type with greater precision.
We advertise Og as the optimization level of choice for the standard
edit-compile-debug cycle, but do not run the guality tests for Og with the
default torture options.
This patch ensures that we test -Og in the guality tests.
F.i., for gcc.dg/guality there are 45 fails for Og (while there are none for
O1), in these test-cases:
...
gcc.dg/guality/pr54200.c
gcc.dg/guality/pr54970.c
gcc.dg/guality/pr56154-1.c
gcc.dg/guality/pr59776.c
gcc.dg/guality/sra-1.c
...
2018-07-15 Tom de Vries <tdevries@suse.de>
* lib/gcc-gdb-test.exp (guality_minimal_options): New proc.
* lib/gfortran-dg.exp (gfortran-dg-runtest): Don't call torture-init if
already called.
* g++.dg/guality/guality.exp: Ensure Og is part of torture options.
* gcc.dg/guality/guality.exp: Same.
* gfortran.dg/guality/guality.exp: Same.
ian [Fri, 13 Jul 2018 20:39:02 +0000 (20:39 +0000)]
runtime: skip zero-sized fields in structs when converting to FFI
The libffi library doesn't understand zero-sized objects.
When we see a zero-sized field in a struct, just skip it when
converting to the FFI data structures. There is no value to pass in
any case, so not telling libffi about the field doesn't affect
anything.
The test case for this is https://golang.org/cl/123316.
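The filtering step described above can be sketched in C as follows. This is only an illustration under stated assumptions: `struct field_desc` and `skip_zero_sized_fields` are hypothetical names, and the real runtime code works on Go type descriptors and libffi's own data structures, which are not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field descriptor used only for this sketch.  */
struct field_desc
{
  const char *name;
  size_t size;
};

/* Copy the fields of IN that occupy storage into OUT, dropping the
   zero-sized ones.  OUT must have room for N entries.  Return the
   number of entries written.  */
size_t
skip_zero_sized_fields (const struct field_desc *in, size_t n,
                        struct field_desc *out)
{
  assert (in != NULL && out != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (in[i].size != 0)
      out[kept++] = in[i];
  return kept;
}
```

Since a zero-sized field carries no value, the filtered list describes the same argument data as the original one.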
x86: Tune Skylake, Cannonlake and Icelake as Haswell
r259399, which added PROCESSOR_SKYLAKE, disabled many x86 optimizations
which are enabled by PROCESSOR_HASWELL. As a result, -mtune=skylake
generates slower code on Skylake than before. The same also applies
to Cannonlake and Icelake tuning.
This patch changes -mtune={skylake|cannonlake|icelake} to tune like
-mtune=haswell until their tuning is properly adjusted. It also
enables -mprefer-vector-width=256 for -mtune=haswell, which has no
impact on codegen when AVX512 isn't enabled.
Performance impacts on SPEC CPU 2017 rate with 1 copy using
This patch improves -march=native performance on Skylake up to 60% and
leaves -march=native performance unchanged on Haswell.
gcc/
2018-07-13 H.J. Lu <hongjiu.lu@intel.com>
Sunil K Pandey <sunil.k.pandey@intel.com>
PR target/84413
* config/i386/i386.c (m_CORE_AVX512): New.
(m_CORE_AVX2): Likewise.
(m_CORE_ALL): Add m_CORE_AVX2.
* config/i386/x86-tune.def: Replace m_HASWELL with m_CORE_AVX2.
Replace m_SKYLAKE_AVX512 with m_CORE_AVX512 on avx256_optimal
and remove the rest of m_SKYLAKE_AVX512.
gcc/testsuite/
2018-07-13 H.J. Lu <hongjiu.lu@intel.com>
Sunil K Pandey <sunil.k.pandey@intel.com>
* lto.c (do_stream_out): Add PART parameter; open dump file.
(stream_out): Add PART parameter; pass it to do_stream_out.
(lto_wpa_write_files): Update call of stream_out.
* lto-streamer-out.c (copy_function_or_variable): Dump info about
copying section.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings
The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
This patch is the third part of the change for PR78809.
C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
if the result is NOT used to do simple equality test against zero, one of
"s1" or "s2" is a small constant string, n is a constant, and the Min value of
the length of the constant string and "n" is smaller than a predefined
threshold T,
inline the call by a byte-to-byte comparison sequence to avoid calling
overhead.
adding test case strcmpopt_5.c into gcc.dg for part C of PR78809.
adding test case strcmpopt_6.c into gcc.dg to test the following case:
When the specified length exceeds one of the arguments of the call to memcmp,
the call to memcmp should NOT be inlined.
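As a rough illustration of part C, a call such as strcmp (s, "ab") whose result is used for ordering could be replaced by a comparison sequence like the one below. The function name `compare_with_ab` is hypothetical and the exact sequence GCC emits is not shown in this message; the sketch only demonstrates that a byte-by-byte expansion preserves the sign of the result without reading past the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written equivalent of strcmp (s, "ab"): compare one byte at a
   time as unsigned char and stop at the first difference, so no byte
   after the terminating NUL of S is ever read.  */
int
compare_with_ab (const char *s)
{
  assert (s != NULL);
  const unsigned char *p = (const unsigned char *) s;
  int diff = p[0] - 'a';
  if (diff != 0)
    return diff;
  diff = p[1] - 'b';
  if (diff != 0)
    return diff;
  return p[2] - '\0';
}
```

The result is negative, zero or positive in the same cases as strcmp, although the magnitude is not guaranteed to match a library implementation.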
arm - Add vendor and CPU id information to arm-cpus.in
This patch moves the vendor and CPU id data from driver-arm.c to the
main table of CPU data in arm-cpus.in. It then adds rules to
parsecpu.awk to build data tables that can be used by the driver for
automatic CPU detection when running natively. This is the last major
bit of CPU-specific data that can be usefully moved to the CPU data
tables (I don't think it is sensible to move the per-cpu tuning data
from its current location).
The syntax and parser can support revision ranges, but at present
nothing is done with that data: no supported cpu currently needs that
capability.
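A minimal sketch of the kind of lookup such generated tables allow is shown below. The type, table and function names are hypothetical and do not reflect the actual contents or layout of arm-native.h; the two entries are examples only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table layout: one list of parts per vendor, each list
   terminated by an entry with a NULL name.  */
struct part_entry
{
  unsigned part;
  const char *cpu;
};

struct vendor_entry
{
  unsigned vendor;
  const struct part_entry *parts;
};

static const struct part_entry example_arm_parts[] = {
  { 0xd03, "cortex-a53" },
  { 0xd07, "cortex-a57" },
  { 0, NULL }
};

const struct vendor_entry example_vendors[] = {
  { 0x41, example_arm_parts },
  { 0, NULL }
};

/* Return the CPU name for VENDOR/PART, or NULL if it is not listed.  */
const char *
lookup_cpu (const struct vendor_entry *table, unsigned vendor, unsigned part)
{
  assert (table != NULL);
  for (const struct vendor_entry *v = table; v->parts != NULL; v++)
    if (v->vendor == vendor)
      for (const struct part_entry *p = v->parts; p->cpu != NULL; p++)
        if (p->part == part)
          return p->cpu;
  return NULL;
}
```

Revision ranges are left out of the sketch, matching the note above that nothing is done with that data at present.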
* config/arm/driver-arm.c: Include arm-native.h.
(host_detect_local_cpu): Use auto-generated data tables.
(vendors, arm_cpu_table): Delete. Move part information to ...
* config/arm/arm-cpus.in: ... here.
* config/arm/parsecpu.awk (gen_native): New function.
(vendor, part): New CPU fields.
(END): Add support for building the native CPU detection tables.
* config/arm/t-arm (arm-native.h): Add build rule.
(driver-arm.o): Add dependency on arm-native.h.
[testsuite, guality] Add -fno-ipa-icf in gcc.dg/guality
Optimization fipa-icf breaks debug info (as is noted in PR63572 - "ICF
breaks user debugging experience"), which makes guality tests clztest.c,
ctztest.c and sra-1.c unsupported for option combination "-O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects". F.i., in clztest.c foo and bar are
merged, and gdb can set a breakpoint on a line in foo, but trying to set a
breakpoint on a line in bar results in a breakpoint in main instead.
This patch works around the problem by adding -fno-ipa-icf (as is already done
in csttest.c and pr43077-1.c) to those testcases:
...
-UNSUPPORTED: gcc.dg/guality/clztest.c ... line . g == f
+PASS: gcc.dg/guality/clztest.c ... line . g == f
-UNSUPPORTED: gcc.dg/guality/ctztest.c ... line . g == f
+PASS: gcc.dg/guality/ctztest.c ... line . g == f
-UNSUPPORTED: gcc.dg/guality/sra-1.c ... line .+1 a[0] == 4
+PASS: gcc.dg/guality/sra-1.c ... line .+1 a[0] == 4
-UNSUPPORTED: gcc.dg/guality/sra-1.c ... line . a[1] == 14
+PASS: gcc.dg/guality/sra-1.c ... line . a[1] == 14
...
[debug] Reuse debug exprs generated in remap_ssa_name
When compiling gcc.dg/vla-1.c with -O3 -g, vla a and b in f1 are optimized
away, and f1 is cloned to a version f1.constprop with no parameters, eliminating
parameter i. Debug info is generated to describe the sizes of a and b, but
that process generates debug expressions that are not reused.
Fix the duplication by saving and reusing the generated debug expressions in
remap_ssa_name. Concretely: reuse D#7 here instead of generating D#8:
...
__attribute__((noinline))
f1.constprop ()
{
intD.6 iD.1935;
wilson [Thu, 12 Jul 2018 19:59:09 +0000 (19:59 +0000)]
RISC-V: Error if function declared with different interrupt modes.
gcc/
2018-07-06 Kito Cheng <kito.cheng@gmail.com>
* config/riscv/riscv.c (enum riscv_privilege_levels): Add UNKNOWN_MODE.
(riscv_expand_epilogue): Add assertion to check interrupt mode.
(riscv_set_current_function): Extract getting interrupt type to new
function.
(riscv_get_interrupt_type): New function.
(riscv_merge_decl_attributes): New function, checking interrupt type is
same.
(TARGET_MERGE_DECL_ATTRIBUTES): Define.
gcc/testsuite/
2018-07-06 Kito Cheng <kito.cheng@gmail.com>
* gcc.target/riscv/interrupt-conflict-mode.c: New.
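The check implied by the entries above can be modelled as follows. This is a sketch under assumptions: only UNKNOWN_MODE is named in the ChangeLog, the other enumerators and the function name `interrupt_modes_conflict` are hypothetical, and treating UNKNOWN_MODE as "no interrupt mode recorded for this declaration" is an assumption as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the privilege-level enum.  */
enum privilege_level
{
  UNKNOWN_MODE,
  USER_MODE,
  SUPERVISOR_MODE,
  MACHINE_MODE
};

/* Return true if two declarations of the same function carry different
   interrupt modes.  A declaration without a recorded mode never
   conflicts with anything.  */
bool
interrupt_modes_conflict (enum privilege_level old_mode,
                          enum privilege_level new_mode)
{
  assert (old_mode <= MACHINE_MODE && new_mode <= MACHINE_MODE);
  if (old_mode == UNKNOWN_MODE || new_mode == UNKNOWN_MODE)
    return false;
  return old_mode != new_mode;
}
```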
* c-decl.c (c_decl_attributes): Don't diagnose vars without mappable
type here, instead add "omp declare target implicit" attribute.
(finish_decl): Diagnose vars without mappable type here.
* decl2.c (cplus_decl_attributes): Don't diagnose vars without mappable
type here, instead add "omp declare target implicit" attribute. Add
that attribute instead of "omp declare target" also when
processing_template_decl.
* decl.c (cp_finish_decl): Diagnose vars without mappable type here,
and before calling cp_omp_mappable_type call complete_type.
* c-c++-common/gomp/declare-target-3.c: New test.
* g++.dg/gomp/declare-target-2.C: New test.
Use conditional internal functions in if-conversion
This patch uses IFN_COND_* to vectorise conditionally-executed,
potentially-trapping arithmetic, such as most floating-point
ops with -ftrapping-math. E.g.:
if (cond) { ... x = a + b; ... }
becomes:
...
x = .COND_ADD (cond, a, b, else_value);
...
When this transformation is done on its own, the value of x for
!cond isn't important, so else_value is simply the target's
preferred_else_value (i.e. the value it can handle the most
efficiently).
However, the patch also looks for the equivalent of:
y = cond ? x : c;
in which the "then" value is the result of the conditionally-executed
operation and the "else" value "c" is some value that is available at x.
In that case we can instead use:
x = .COND_ADD (cond, a, b, c);
and replace uses of y with uses of x.
The patch also looks for:
y = !cond ? c : x;
which can be transformed in the same way. This involved adding a new
utility function inverse_conditions_p, which was already open-coded
in a more limited way in match.pd.
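The folds described above can be checked with a scalar model. The function names below are hypothetical, and the C conditional operator merely stands in for the internal function's semantics.

```c
#include <assert.h>
#include <stdbool.h>

/* Scalar model of .COND_ADD (cond, a, b, else_value): the addition is
   performed only when COND is true.  */
double
cond_add (bool cond, double a, double b, double else_value)
{
  return cond ? a + b : else_value;
}

/* Unfolded form: x uses an arbitrary else value (0.0 here) and a
   separate select then picks between x and c.  */
double
select_after_add (bool cond, double a, double b, double c)
{
  double x = cond_add (cond, a, b, 0.0);
  double y = cond ? x : c;
  return y;
}

/* Folded form: c becomes the else value, so the select disappears.  */
double
folded_add (bool cond, double a, double b, double c)
{
  return cond_add (cond, a, b, c);
}

/* Both forms must agree for any inputs.  */
void
check_fold (bool cond, double a, double b, double c)
{
  assert (select_after_add (cond, a, b, c) == folded_add (cond, a, b, c));
}
```

The y = !cond ? c : x variant selects the same values with the condition inverted, so it reduces to the same folded form.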
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* fold-const.h (inverse_conditions_p): Declare.
* fold-const.c (inverse_conditions_p): New function.
* match.pd: Use inverse_conditions_p. Add folds of view_converts
that test the inverse condition of a conditional internal function.
* internal-fn.h (vectorized_internal_fn_supported_p): Declare.
* internal-fn.c (internal_fn_mask_index): Handle conditional
internal functions.
(vectorized_internal_fn_supported_p): New function.
* tree-if-conv.c: Include internal-fn.h and fold-const.h.
(any_pred_load_store): Replace with...
(need_to_predicate): ...this new variable.
(redundant_ssa_names): New variable.
(ifcvt_can_use_mask_load_store): Move initial checks to...
(ifcvt_can_predicate): ...this new function. Handle tree codes
for which a conditional internal function exists.
(if_convertible_gimple_assign_stmt_p): Use ifcvt_can_predicate
instead of ifcvt_can_use_mask_load_store. Update after variable
name change.
(predicate_load_or_store): New function, split out from
predicate_mem_writes.
(check_redundant_cond_expr): New function.
(value_available_p): Likewise.
(predicate_rhs_code): Likewise.
(predicate_mem_writes): Rename to...
(predicate_statements): ...this. Use predicate_load_or_store
and predicate_rhs_code.
(combine_blocks, tree_if_conversion): Update after above name changes.
(ifcvt_local_dce): Handle redundant_ssa_names.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Handle
general conditional functions.
* tree-vect-stmts.c (vectorizable_call): Likewise.
Support fused multiply-adds in fully-masked reductions
This patch adds support for fusing a conditional add or subtract
with a multiplication, so that we can use fused multiply-add and
multiply-subtract operations for fully-masked reductions. E.g.
for SVE we vectorise:
double res = 0.0;
for (int i = 0; i < n; ++i)
res += x[i] * y[i];
using a fully-masked loop in which the loop body has the form:
res_1 = PHI<0(preheader), res_2(latch)>;
avec = .MASK_LOAD (loop_mask, a)
bvec = .MASK_LOAD (loop_mask, b)
prod = avec * bvec;
res_2 = .COND_ADD (loop_mask, res_1, prod, res_1);
where the last statement does the equivalent of:
res_2 = loop_mask ? res_1 + prod : res_1;
(operating elementwise). The point of the patch is to convert the last
two statements into:
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.h (can_interpret_as_conditional_op_p): Declare.
* internal-fn.c (can_interpret_as_conditional_op_p): New function.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional
plus and minus and convert them into IFN_COND_FMA-based sequences.
(convert_mult_to_fma): Handle conditional plus and minus.
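For readers unfamiliar with fully-masked loops, the reduction from the commit message can be modelled in plain C as below. VL and `masked_dot` are hypothetical, the inner loop stands in for one vector operation, and the sketch shows only the unfused multiply followed by a conditional add, which is the pair of statements the patch targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VL 4 /* hypothetical vector length in elements */

/* Dot product of X and Y computed the way a fully-masked loop would:
   every iteration processes VL lanes, and lanes past the end of the
   arrays are switched off by the loop mask instead of being handled
   by a scalar epilogue.  */
double
masked_dot (const double *x, const double *y, size_t n)
{
  assert (n == 0 || (x != NULL && y != NULL));
  double res[VL] = { 0.0 }; /* one accumulator per lane */
  for (size_t base = 0; base < n; base += VL)
    for (size_t lane = 0; lane < VL; lane++)
      {
        bool loop_mask = base + lane < n;
        /* Models prod = x * y; res = .COND_ADD (loop_mask, res, prod, res):
           inactive lanes keep their old value and load nothing.  */
        if (loop_mask)
          res[lane] = res[lane] + x[base + lane] * y[base + lane];
      }
  /* Final reduction across lanes.  */
  double total = 0.0;
  for (size_t lane = 0; lane < VL; lane++)
    total += res[lane];
  return total;
}
```

Accumulating per lane changes the order of the floating-point additions compared with the scalar loop, which is inherent to vectorising a reduction.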