Martin Liska [Thu, 9 Nov 2017 09:11:17 +0000 (10:11 +0100)]
GCOV: support multiple functions per a line (PR gcov-profile/48463)
2017-11-09 Martin Liska <mliska@suse.cz>
PR gcov-profile/48463
* coverage.c (coverage_begin_function): Output also end locus
of a function and information whether the function is
artificial.
* gcov-dump.c (tag_function): Parse and print the information.
* gcov.c (INCLUDE_MAP): Add include.
(INCLUDE_SET): Likewise.
(struct line_info): Move earlier in the source file because
of vector<line_info> in function_info structure.
(line_info::line_info): Likewise.
(line_info::has_block): Likewise.
(struct source_info): Add new member index.
(source_info::get_functions_at_location): New function.
(function_info::group_line_p): New function.
(output_intermediate_line): New function.
(output_intermediate_file): Use the mentioned function.
(struct function_start): New.
(struct function_start_pair_hash): Likewise.
(process_file): Add code that identifies group functions.
Assign lines either to global or function scope.
(generate_results): Skip artificial functions.
(find_source): Assign index for each source file.
(read_graph_file): Read new flag artificial and end_line.
(add_line_counts): Assign it either to global of function scope.
(accumulate_line_counts): Isolate core of the function to
accumulate_line_info and call it for both function and global
scope lines.
(accumulate_line_info): New function.
(output_line_beginning): Fix GNU coding style.
(print_source_line): New function.
(output_line_details): Likewise.
(output_function_details): Likewise.
(output_lines): Iterate both source (global) scope and function
scope.
(struct function_line_start_cmp): New class.
* doc/gcov.texi: Reflect changes in documentation.
Jakub Jelinek [Thu, 9 Nov 2017 08:54:19 +0000 (09:54 +0100)]
re PR debug/82837 (ICE in output_operand: invalid expression as operand)
PR debug/82837
* dwarf2out.c (const_ok_for_output_1): Reject NEG in addition to NOT.
(mem_loc_descriptor): Handle (const (neg (...))) as (neg (const (...)))
and similarly for not instead of neg.
Andi Kleen [Thu, 9 Nov 2017 05:42:43 +0000 (05:42 +0000)]
Add option to force indirect calls for x86
This patch adds a -mforce-indirect-call option to force all calls
or tail calls on x86_64 between functions to indirect. This is similar to the
large code model, but doesn't affect jumps inside functions, so has much
less run time overhead.
This is useful with Intel Processor Trace (PT). PT has precise timing
for indirect calls/jumps, but not for direct ones. So if we can force
them to indirect it allows to time every function relatively accurately
(minus the overhead of the indirect branch)
Without this short functions often don't see a timing update and cannot
be measured.
The timing requires at least Skylake or Goldmont based CPUs.
I made it an option. Originally I tried to make it a new code model,
but since it can be combined with other code models (medium, pic, kernel
etc.) this turned out to be too many combinations.
For example with gcc. This first column is a ns time stamp for the
functions.
* gcc.target/i386/force-indirect-call-1.c: New test.
* gcc.target/i386/force-indirect-call-2.c: New test.
* gcc.target/i386/force-indirect-call-3.c: New test.
Jakub Jelinek [Wed, 8 Nov 2017 20:15:42 +0000 (21:15 +0100)]
re PR target/82855 (AVX512: replace OP+movemask with OP_mask+ktest)
PR target/82855
* config/i386/sse.md (<avx512>_eq<mode>3<mask_scalar_merge_name>,
<avx512>_eq<mode>3<mask_scalar_merge_name>_1): Use
nonimmediate_operand predicate for operand 1 instead of
register_operand.
Kyrylo Tkachov [Wed, 8 Nov 2017 18:32:09 +0000 (18:32 +0000)]
[AArch64] Add STP pattern to store a vec_concat of two 64-bit registers
On top of the previous vec_merge simplifications [1] we can add this pattern to perform
a store of a vec_concat of two 64-bit values in distinct registers as an STP.
This avoids constructing such a vector explicitly in a register and storing it as
a Q register.
This way for the code in the testcase we can generate:
* config/aarch64/aarch64-simd.md (store_pair_lanes<mode>):
New pattern.
* config/aarch64/constraints.md (Uml): New constraint.
* config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
predicate.
* gcc.target/aarch64/store_v2vec_lanes.c: New test.
Kyrylo Tkachov [Wed, 8 Nov 2017 18:29:38 +0000 (18:29 +0000)]
[simplify-rtx] Simplify vec_merge of vec_duplicates into vec_concat
Another vec_merge simplification that's missing from simplify-rtx.c is transforming
a vec_merge of two vec_duplicates. For example:
(set (reg:V2DF 80)
(vec_merge:V2DF (vec_duplicate:V2DF (reg:DF 84))
(vec_duplicate:V2DF (reg:DF 81))
(const_int 2)))
Can be transformed into the simpler:
(set (reg:V2DF 80)
(vec_concat:V2DF (reg:DF 81)
(reg:DF 84)))
I believe this should always be beneficial.
I'm still looking into finding a small testcase demonstrating this, but on aarch64 SPEC
I've seen this eliminate some really bizzare codegen where GCC was generating nonsense like:
ldr q18, [sp, 448]
ins v18.d[0], v23.d[0]
ins v18.d[1], v22.d[0]
With q18 being pushed and popped off the stack in the prologue and epilogue of the function!
These are large files from SPEC that I haven't been able to analyse yet as to why GCC even attempts
to do that, but with this patch it doesn't try to load a register and overwrite all its lanes.
This patch shaves off about 5k of code size from zeusmp on aarch64 at -O3, so I believe it's a good
thing to do.
* simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge
of two vec_duplicates into a vec_concat.
Another vec_merge simplification that's missing is transforming:
(vec_merge (vec_duplicate x) (vec_concat (y) (z)) (const_int N))
into
(vec_concat x z) if N == 1 (0b01) or
(vec_concat y x) if N == 2 (0b10)
For the testcase in this patch on aarch64 this allows us to try matching during combine the pattern:
(set (reg:V2DI 78 [ x ])
(vec_concat:V2DI
(mem:DI (reg/v/f:DI 76 [ y ]) [1 *y_4(D)+0 S8 A64])
(mem:DI (plus:DI (reg/v/f:DI 76 [ y ])
(const_int 8 [0x8])) [1 MEM[(long long int *)y_4(D) + 8B]+0 S8 A64])))
rather than the more complex:
(set (reg:V2DI 78 [ x ])
(vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (plus:DI (reg/v/f:DI 76 [ y ])
(const_int 8 [0x8])) [1 MEM[(long long int *)y_4(D) + 8B]+0 S8 A64]))
(vec_duplicate:V2DI (mem:DI (reg/v/f:DI 76 [ y ]) [1 *y_4(D)+0 S8 A64]))
(const_int 2 [0x2])))
We don't actually have an aarch64 pattern for the simplified version above, but it's a simple enough
form to add, so this patch adds such a pattern that performs a concatenated load of two 64-bit vectors
in adjacent memory locations as a single Q-register LDR. The new aarch64 pattern is needed to demonstrate
the effectiveness of the simplify-rtx change, so I've kept them together as one patch.
Now for the testcase in the patch we can generate:
construct_lanedi:
ldr q0, [x0]
ret
construct_lanedf:
ldr q0, [x0]
ret
instead of:
construct_lanedi:
ld1r {v0.2d}, [x0]
ldr x0, [x0, 8]
ins v0.d[1], x0
ret
construct_lanedf:
ld1r {v0.2d}, [x0]
ldr d1, [x0, 8]
ins v0.d[1], v1.d[0]
ret
The new memory constraint Utq is needed because we need to allow only the Q-register addressing modes but
the MEM expressions in the RTL pattern have 64-bit vector modes, and if we don't constrain them they will
allow the D-register addressing modes during register allocation/address mode selection, which will produce
invalid assembly.
Bootstrapped and tested on aarch64-none-linux-gnu.
* simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
Simplify vec_merge of vec_duplicate and vec_concat.
* config/aarch64/constraints.md (Utq): New constraint.
* config/aarch64/aarch64-simd.md (load_pair_lanes<mode>): New
define_insn.
* gcc.target/aarch64/load_v2vec_lanes_1.c: New test.
Kyrylo Tkachov [Wed, 8 Nov 2017 18:23:35 +0000 (18:23 +0000)]
Simplify vec_merge of vec_duplicate with const_vector
I'm trying to improve some of the RTL-level handling of vector lane operations on aarch64 and that
involves dealing with a lot of vec_merge operations. One simplification that I noticed missing
from simplify-rtx are combinations of vec_merge with vec_duplicate.
In this particular case:
(vec_merge (vec_duplicate (X)) (const_vector [A, B]) (const_int N))
which can be replaced with
(vec_concat (X) (B)) if N == 1 (0b01) or
(vec_concat (A) (X)) if N == 2 (0b10).
For the aarch64 testcase in this patch this simplifications allows us to try to combine:
(set (reg:V2DI 77 [ x ])
(vec_concat:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64])
(const_int 0 [0])))
instead of the more complex:
(set (reg:V2DI 77 [ x ])
(vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64]))
(const_vector:V2DI [
(const_int 0 [0])
(const_int 0 [0])
])
(const_int 1 [0x1])))
For the simplified form above we already have an aarch64 pattern: *aarch64_combinez<mode> which
is missing a DI/DFmode version due to an oversight, so this patch extends that pattern as well to
use the VDC mode iterator that includes DI and DFmode (as well as V2HF which VD_BHSI was missing).
The aarch64 hunk is needed to see the benefit of the simplify-rtx.c hunk, so I didn't split them
into separate patches.
Before this for the testcase we'd generate:
construct_lanedi:
movi v0.4s, 0
ldr x0, [x0]
ins v0.d[0], x0
ret
construct_lanedf:
movi v0.2d, 0
ldr d1, [x0]
ins v0.d[0], v1.d[0]
ret
but now we can generate:
construct_lanedi:
ldr d0, [x0]
ret
construct_lanedf:
ldr d0, [x0]
ret
Bootstrapped and tested on aarch64-none-linux-gnu.
* simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
Simplify vec_merge of vec_duplicate and const_vector.
* config/aarch64/predicates.md (aarch64_simd_or_scalar_imm_zero):
New predicate.
* config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Use VDC
mode iterator. Update predicate on operand 1 to
handle non-const_vec constants. Delete constraints.
(*aarch64_combinez_be<mode>): Likewise for operand 2.
* gcc.target/aarch64/construct_lane_zero_1.c: New test.
* exp_spark.adb (Expand_SPARK_N_Object_Renaming_Declaration): Obtain
the type of the renaming from its defining entity, rather then the
subtype mark as there may not be a subtype mark.
2017-11-08 Jerome Lambourg <lambourg@adacore.com>
* adaint.c, s-oscons-tmplt.c, init.c, libgnat/system-qnx-aarch64.ads,
libgnarl/a-intnam__qnx.ads, libgnarl/s-intman__qnx.adb,
libgnarl/s-osinte__qnx.ads, libgnarl/s-qnx.ads,
libgnarl/s-taprop__qnx.adb, s-oscons-tmplt.c, sigtramp-qnx.c,
terminals.c: Initial port of GNAT for aarch64-qnx
* sem_res.adb (Resolve_Allocator): Add info messages corresponding to
the owner and corresponding coextension.
2017-11-08 Ed Schonberg <schonberg@adacore.com>
* sem_aggr.adb (Resolve_Delta_Aggregate): Divide into the
following separate procedures.
(Resolve_Delta_Array_Aggregate): Previous code form
Resolve_Delta_Aggregate.
(Resolve_Delta_Record_Aggregate): Extend previous code to cover latest
ARG decisions on the legality rules for delta aggregates for records:
in the case of a variant record, components from different variants
cannot be specified in the delta aggregate, and this must be checked
statically.
* spark_xrefs.ads (SPARK_Scope_Record): Remove Spec_File_Num and
Spec_Scope_Num components.
* spark_xrefs.adb (dspark): Skip pretty-printing to removed components.
* lib-xref-spark_specific.adb (Add_SPARK_Scope): Skip initialization of
removed components.
(Collect_SPARK_Xrefs): Skip setting proper values of removed
components.
2017-11-08 Gary Dismukes <dismukes@adacore.com>
* exp_ch4.adb (Expand_N_Type_Conversion): Add test that the selector
name is a discriminant in check for unconditional accessibility
violation within instances.
* exp_ch9.adb, sem_disp.adb, sem_util.adb: Minor reformatting.
2017-11-08 Piotr Trojanek <trojanek@adacore.com>
* spark_xrefs.ads (Rtype): Remove special-casing of constants for SPARK
cross-references.
(dspark): Remove hardcoded table bound.
2017-11-08 Ed Schonberg <schonberg@adacore.com>
* sem_ch4.adb (Analyze_Aggregate): For Ada2020 delta aggregates, use
the type of the base of the construct to determine the type (or
candidate interpretations) of the delta aggregate. This allows the
construct to appear in a context that expects a private extension.
* sem_res.adb (Resolve): Handle properly a delta aggregate with an
overloaded base.
* spark_xrefs.ads (SPARK_Xref_Record): Replace file and scope indices
with Entity_Id of the reference.
* spark_xrefs.adb (dspark): Adapt pretty-printing routine.
* lib-xref-spark_specific.adb (Add_SPARK_Xrefs): Store Entity_Id of the
reference, not the file and scope indices.
2017-11-08 Arnaud Charlet <charlet@adacore.com>
* errout.ads (Current_Node): New.
* errout.adb (Error_Msg): Use Current_Node.
* par-ch6.adb, par-ch7.adb, par-ch9.adb, par-util.adb: Set Current_Node
when relevant.
* style.adb: Call Error_Msg_N when possible.
Piotr Trojanek [Wed, 8 Nov 2017 16:25:03 +0000 (16:25 +0000)]
spark_xrefs.ads (SPARK_Scope_Record): Rename Scope_Id component to Entity.
2017-11-08 Piotr Trojanek <trojanek@adacore.com>
* spark_xrefs.ads (SPARK_Scope_Record): Rename Scope_Id component to
Entity.
* lib-xref-spark_specific.adb, spark_xrefs.adb: Propagate renaming of
the Scope_Id record component.
* spark_xrefs.ads (SPARK_File_Record): Remove string components.
* spark_xrefs.adb (dspark): Remove pretty-printing of removed
SPARK_File_Record components.
* lib-xref-spark_specific.adb (Add_SPARK_File): Do not store string
representation of files/units.
* spark_xrefs.ads (SPARK_Xref_Record): Referenced object is now
represented by Entity_Id.
(SPARK_Scope_Record): Referenced scope (e.g. subprogram) is now
represented by Entity_Id; this information is not repeated as
Scope_Entity.
(Heap): Moved from lib-xref-spark_specific.adb, to reside next to
Name_Of_Heap_Variable.
* spark_xrefs.adb (dspark): Adapt debug routine to above changes in
data types.
* lib-xref-spark_specific.adb: Adapt routines for populating SPARK
scope and xrefs tables to above changes in data types.
2017-11-08 Justin Squirek <squirek@adacore.com>
* sem_ch8.adb (Mark_Use_Clauses): Add condition to always mark the
primitives of generic actuals.
(Mark_Use_Type): Add recursive call to properly mark class-wide type's
base type clauses as per ARM 8.4 (8.2/3).
2017-11-08 Ed Schonberg <schonberg@adacore.com>
* sem_ch6.adb (Analyze_Generic_Subprobram_Body): Validate
categorization dependency of the body, as is done for non-generic
units.
(New_Overloaded_Entity, Visible_Part_Type): Remove linear search
through declarations (Simple optimization, no behavior change).
Jakub Jelinek [Wed, 8 Nov 2017 15:46:58 +0000 (16:46 +0100)]
re PR tree-optimization/78821 (GCC7: Copying whole 32 bits structure field by field not optimised into copying whole 32 bits at once)
PR tree-optimization/78821
* gimple-ssa-store-merging.c (struct store_operand_info): Add bit_not_p
data member.
(store_operand_info::store_operand_info): Initialize it to false.
(pass_store_merging::terminate_all_aliasing_chains): Rewritten to use
ref_maybe_used_by_stmt_p and stmt_may_clobber_ref_p on lhs of each
store in the group, and if chain_info is non-NULL, to ignore altogether
that chain.
(compatible_load_p): Fail if bit_not_p does not match.
(imm_store_chain_info::output_merged_store): Handle bit_not_p loads.
(handled_load): Fill in bit_not_p. Handle BIT_NOT_EXPR.
(pass_store_merging::process_store): Adjust
terminate_all_aliasing_chains calls to pass NULL in all current spots,
call terminate_all_aliasing_chains newly when adding a store into
a chain with non-NULL chain_info.
* gcc.dg/store_merging_2.c: Expect 3 store mergings instead of 2.
* gcc.dg/store_merging_13.c (f7, f8, f9, f10, f11, f12, f13): New
functions.
(main): Test also those. Expect 13 store mergings instead of 6.
* gcc.dg/store_merging_14.c (f7, f8, f9): New functions.
(main): Test also those. Expect 9 store mergings instead of 6.
Wilco Dijkstra [Wed, 8 Nov 2017 15:36:34 +0000 (15:36 +0000)]
[AArch64] Simplify aarch64_can_eliminate
Simplify aarch64_can_eliminate - if we need a frame pointer, we must
eliminate to HARD_FRAME_POINTER_REGNUM. Rather than hardcoding all
combinations from the ELIMINABLE_REGS list, just do the correct check.
Wilco Dijkstra [Wed, 8 Nov 2017 15:34:36 +0000 (15:34 +0000)]
[AArch64] Remove aarch64_frame_pointer_required
To implement -fomit-leaf-frame-pointer, there are 2 places where we need
to check whether we have to use a frame chain (since register allocation
may allocate LR in a leaf function that omits the frame pointer, but if
LR is spilled we must emit a frame chain). To simplify this do not force
frame_pointer_needed via aarch64_frame_pointer_required, but enable the
frame chain in aarch64_layout_frame. Now aarch64_frame_pointer_required
can be removed and aarch64_can_eliminate is simplified.
* einfo.adb: Elist36 is now used as Nested_Scenarios.
(Nested_Scenarios): New routine.
(Set_Nested_Scenarios): New routine.
(Write_Field36_Name): New routine.
* einfo.ads: Add new attribute Nested_Scenarios along with occurrences
in entities.
(Nested_Scenarios): New routine along with pragma Inline.
(Set_Nested_Scenarios): New routine along with pragma Inline.
* sem_elab.adb (Find_And_Process_Nested_Scenarios): New routine.
(Process_Nested_Scenarios): New routine.
(Traverse_Body): When a subprogram body is traversed for the first
time, find, save, and process all suitable scenarios found within.
Subsequent traversals of the same subprogram body utilize the saved
scenarios.
2017-11-08 Piotr Trojanek <trojanek@adacore.com>
* lib-xref-spark_specific.adb (Add_SPARK_Scope): Remove detection of
protected operations.
(Add_SPARK_Xrefs): Simplify detection of empty entities.
* get_spark_xrefs.ads, get_spark_xrefs.adb, put_spark_xrefs.ads,
put_spark_xrefs.adb, spark_xrefs_test.adb: Remove code for writing,
reading and testing SPARK cross-references stored in the ALI files.
* lib-xref.ads (Output_SPARK_Xrefs): Remove.
* lib-writ.adb (Write_ALI): Do not write SPARK cross-references to the
ALI file.
* spark_xrefs.ads, spark_xrefs.adb (pspark): Remove, together
with description of the SPARK xrefs ALI format.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Remove get_spark_refs.o
and put_spark_refs.o.
* exp_ch4.adb (Apply_Accessibility_Check): Do not finalize the object
when the associated access type is subject to pragma
No_Heap_Finalization.
* exp_intr.adb (Expand_Unc_Deallocation): Use the available view of the
designated type in case it comes from a limited withed unit.
gcc/testsuite/
2017-11-08 Javier Miranda <miranda@adacore.com>
* gnat.dg/overriding_ops2.adb, gnat.dg/overriding_ops2.ads,
gnat.dg/overriding_ops2_pkg.ads, gnat.dg/overriding_ops2_pkg-high.ads:
New testcase.
* exp_ch3.adb (Expand_N_Object_Declaration): Save and restore relevant
SPARK-related flags. Add ??? comment.
* exp_util.adb (Insert_Actions): Add an entry for node
N_Variable_Reference_Marker.
* sem.adb (Analyze): Add an entry for node N_Variable_Reference_Marker.
* sem_ch8.adb (Find_Direct_Name): Add constant Is_Assignment_LHS. Build
and record a variable reference marker for the current name.
(Find_Expanded_Name): Add constant Is_Assignment_LHS. Build and record
a variable reference marker for the current name.
* sem_elab.adb (Build_Variable_Reference_Marker): New routine.
(Extract_Variable_Reference_Attributes): Reimplemented.
(Info_Scenario): Add output for variable references and remove output
for variable reads.
(Info_Variable_Read): Removed.
(Info_Variable_Reference): New routine.
(Is_Suitable_Scenario): Variable references are now suitable scenarios
while variable reads are not.
(Output_Active_Scenarios): Add output for variable references and
remove output for variable reads.
(Output_Variable_Read): Removed.
(Output_Variable_Reference): New routine.
(Process_Variable_Read): Removed.
(Process_Variable_Reference): New routine.
(Process_Variable_Reference_Read): New routine.
* sem_elab.ads (Build_Variable_Reference_Marker): New routine.
* sem_res.adb (Resolve_Actuals): Build and record a variable reference
marker for the current actual.
* sem_spark.adb (Check_Node): Add an entry for node
N_Variable_Reference_Marker.
* sem_util.adb (Within_Subprogram_Call): Moved to the library level.
* sem_util.ads (Within_Subprogram_Call): Moved to the library level.
* sinfo.adb (Is_Read): New routine.
(Is_Write): New routine.
(Target): Updated to handle variable reference markers.
(Set_Is_Read): New routine.
(Set_Is_Write): New routine.
(Set_Target): Updated to handle variable reference markers.
* sinfo.ads: Add new attributes Is_Read and Is_Write along with
occurrences in nodes. Update attribute Target. Add new node
kind N_Variable_Reference_Marker.
(Is_Read): New routine along with pragma Inline.
(Is_Write): New routine along with pragma Inline.
(Set_Is_Read): New routine along with pragma Inline.
(Set_Is_Write): New routine along with pragma Inline.
* sprint.adb (Sprint_Node_Actual): Add an entry for node
N_Variable_Reference_Marker.
* sem_elab.adb (Ensure_Prior_Elaboration): Add new parameter
In_Partial_Fin along with a comment on its usage. Do not guarantee the
prior elaboration of a unit when the need came from a partial
finalization context.
(In_Initialization_Context): Relocated to Process_Call.
(Is_Partial_Finalization_Proc): New routine.
(Process_Access): Add new parameter In_Partial_Fin along with a comment
on its usage.
(Process_Activation_Call): Add new parameter In_Partial_Fin along with
a comment on its usage.
(Process_Activation_Conditional_ABE_Impl): Add new parameter
In_Partial_Fin along with a comment on its usage. Do not emit any ABE
diagnostics when the activation occurs in a partial finalization
context.
(Process_Activation_Guaranteed_ABE_Impl): Add new parameter
In_Partial_Fin along with a comment on its usage.
(Process_Call): Add new parameter In_Partial_Fin along with a comment
on its usage. A call is within a partial finalization context when it
targets a finalizer or primitive [Deep_]Finalize, and the call appears
in initialization actions. Pass this information down to the recursive
steps of the Processing phase.
(Process_Call_Ada): Add new parameter In_Partial_Fin along with a
comment on its usage. Remove the guard which suppresses the generation
of implicit Elaborate[_All] pragmas. This is now done in
Ensure_Prior_Elaboration.
(Process_Call_Conditional_ABE): Add new parameter In_Partial_Fin along
with a comment on its usage. Do not emit any ABE diagnostics when the
call occurs in a partial finalization context.
(Process_Call_SPARK): Add new parameter In_Partial_Fin along with a
comment on its usage.
(Process_Instantiation): Add new parameter In_Partial_Fin along with a
comment on its usage.
(Process_Instantiation_Ada): Add new parameter In_Partial_Fin along
with a comment on its usage.
(Process_Instantiation_Conditional_ABE): Add new parameter
In_Partial_Fin along with a comment on its usage. Do not emit any ABE
diagnostics when the instantiation occurs in a partial finalization
context.
(Process_Instantiation_SPARK): Add new parameter In_Partial_Fin along
with a comment on its usage.
(Process_Scenario): Add new parameter In_Partial_Fin along with a
comment on its usage.
(Process_Single_Activation): Add new parameter In_Partial_Fin along
with a comment on its usage.
(Traverse_Body): Add new parameter In_Partial_Fin along with a comment
on its usage.
2017-11-08 Arnaud Charlet <charlet@adacore.com>
* sem_ch13.adb: Add optional parameter to Error_Msg.
2017-11-08 Jerome Lambourg <lambourg@adacore.com>
* fname.adb (Is_Internal_File_Name): Do not check the 8+3 naming schema
for the Interfaces.* hierarchy as longer unit names are now allowed.
2017-11-08 Arnaud Charlet <charlet@adacore.com>
* sem_util.adb (Subprogram_Name): Emit sloc for the enclosing
subprogram as well. Support more cases of entities.
(Append_Entity_Name): Add some defensive code.
Janne Blomqvist [Wed, 8 Nov 2017 11:51:00 +0000 (13:51 +0200)]
PR 82869 Introduce logical_type_node and use it
Earlier GFortran used to redefine boolean_type_node, which in the rest
of the compiler means the C/C++ _Bool/bool type, to the Fortran
default logical type. When this redefinition was removed, a few
issues surfaced. Namely,
1) PR 82869, where we created a boolean tmp variable, and passed it to
the runtime library as a Fortran logical variable of a different size.
2) Fortran specifies that logical operations should be done with the
default logical kind, not in any other kind.
3) Using 8-bit variables have some issues, such as
- on x86, partial register stalls and length prefix changes.
- s390 has a compare with immediate and jump instruction which
works with 32-bit but not 8-bit quantities.
This patch addresses these issues by introducing a type
logical_type_node which is a Fortran LOGICAL variable of default
kind. It is then used in places were the Fortran standard mandates, as
well as for compiler generated temporary variables.
For x86-64, using the Polyhedron benchmark suite, no performance or
code size difference worth mentioning was observed.
Boris Kolpackov [Tue, 7 Nov 2017 22:49:25 +0000 (22:49 +0000)]
[PATCH] Install cp/operators.def as part of plugin headers
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00498.html
2017-11-07 Boris Kolpackov <boris@codesynthesis.com>
* Make-lang.in (CP_PLUGIN_HEADERS): Add operators.def since included
in cp-tree.h.
We currently generate (sometimes pretty long) sequences of integer
insns to implement the various cstore patterns. If the CPU has a fast
isel, we can use that at the same latency as of just two integer insns
(you also get a load immediate of 1, and sometimes one of 0 as well,
but those are not in the critical path: they don't depend on any other
instruction).
There are a few patterns that already are implemented with just two
instructions; so don't use isel in that case (I still need to check
all lt/gt/ltu/gtu/le/leu/ge/geu patterns with all SI/DI combinations,
one or two might be better without isel).
This introduces a new GPR2 mode iterator, for those patterns that use
two independent integer modes.
* config/rs6000/rs6000.md (GPR2): New mode_iterator.
("cstore<mode>4"): Don't always expand with rs6000_emit_int_cmove for
eq and ne if TARGET_ISEL.
(cmp): New code_iterator.
(UNS, UNSU_, UNSIK): New code_attrs.
(<code><GPR:mode><GPR2:mode>2_isel): New define_insn_and_split.
("eq<mode>3"): New define_expand, rename the define_insn_and_split
to...
("eq<mode>3"): ... this.
("ne<mode>3"): New define_expand, rename the define_insn_and_split
to...
("ne<mode>3"): ... this.
compiler: don't double count "." in nested_function_num
Nested functions are named "outerfunc.$nestedN", where N is a
number. nested_function_num extracts that number. The name is
first passed to unpack_hidden_name, which handles the "." and
should result "$nestedN". Don't expect the "." again.
This fixes assertion failure when escape analysis is enabled
and -fgo-debug-escape is on. The failure looks
go1: internal compiler error: in nested_function_num, at go/gofrontend/names.cc:241
0x7bd7d3 Gogo::nested_function_num(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
Eric Botcazou [Tue, 7 Nov 2017 17:37:29 +0000 (17:37 +0000)]
re PR c/53037 (warn_if_not_aligned(X))
PR c/53037
* stor-layout.c: Include attribs.h.
(handle_warn_if_not_align): Replace test on TYPE_USER_ALIGN with
explicit lookup of "aligned" attribute.
Michael Clark [Tue, 7 Nov 2017 16:58:37 +0000 (16:58 +0000)]
RISC-V: Define MUSL_DYNAMIC_LINKER
Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the RISC-V ABI names.
[AArch64] Use aarch64_reg_or_imm instead of nonmemory_operand
Some of the shift expanders accepted nonmemory_operands but were only
able to handle register_operands or CONST_INTs. This is probably
academic without SVE, since we're not likely to see shifts by other
types of constant (const_wide_ints, consts, etc). But for SVE,
it's possible for a vectorised shift induction to have a CONST_POLY_INT
shift amount.
This patch makes the expanders use aarch64_reg_or_imm instead.
2017-11-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.md (ashl<mode>3, ashr<mode>3, lshr<mode>3)
(rotr<mode>3, rotl<mode>3): Use aarch64_reg_or_imm instead of
nonmmory_operand.
Sudakshina Das [Tue, 7 Nov 2017 12:23:38 +0000 (12:23 +0000)]
PR80131: Simplification of 1U << (31 - x)
Currently the code A << (B - C) is not simplified.
However at least a more specific case of 1U << (C -x) where
C = precision(type) - 1 can be simplified to (1 << C) >> x.
This is done by adding a new simplification rule in match.pd.
2017-11-07 Sudakshina Das <sudi.das@arm.com>
gcc/
PR middle-end/80131
* match.pd: Simplify 1 << (C - x) where C = precision (x) - 1.
testsuite/
PR middle-end/80131
* testsuite/gcc.dg/pr80131-1.c: New Test.
Alan Modra [Tue, 7 Nov 2017 04:54:01 +0000 (15:24 +1030)]
Require ngettext in test of system gettext implementation
gcc currently uses ngettext in a number of places (gcc/cp/pt.c,
gcc/diagnostic.c, gcc/collect2.c). Apparently there are (or used to
be) gettext implementations that lack ngettext. See config/gettext.m4.
This patch arranges for intl/ to be compiled when the system gettext
lacks ngettext.
* configure.ac: Invoke AM_GNU_GETTEXT with need_ngettext.
* configure: Regenerate.
We want to actually use isel, so we shouldn't disable it. It is
already not set by default on CPUs that don't have it, or where we
do not want to use it.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
disable isel if it was not set explicitly.
James Bowman [Tue, 7 Nov 2017 01:10:18 +0000 (01:10 +0000)]
FT32 makes use of multiple address spaces.
FT32 makes use of multiple address spaces. When trying to inspect
objects in GDB, GDB was treating them as a straight "const". The cause
seems to be in GCC DWARF2 output.
This output is handled in gcc/gcc/dwarf2out.c, where modified_type_die()
checks that TYPE has qualifiers CV_QUALS. However while TYPE has
ADDR_SPACE qualifiers, the modified_type_die() explicitly discards the
ADDR_SPACE qualifiers.
This patch retains the ADDR_SPACE qualifiers as modified_type_die()
outputs the DWARF type tree. This allows the types to match, and correct
type information for the object is emitted.
[gcc]
2017-11-06 James Bowman <james.bowman@ftdichip.com>
"make check" runs make recursively to check each package. Pass
the flags through. So it is possible to run "make check" with
different settings easily.
[AArch64] Pass number of units to aarch64_expand_vec_perm(_const)
This patch passes the number of units to aarch64_expand_vec_perm
and aarch64_expand_vec_perm_const, which avoids a to_constant ()
once GET_MODE_NUNITS is variable.
2017-11-06 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm)
(aarch64_expand_vec_perm_const): Take the number of units too.
* config/aarch64/aarch64.c (aarch64_expand_vec_perm)
(aarch64_expand_vec_perm_const): Likewise.
* config/aarch64/aarch64-simd.md (vec_perm_const<mode>)
(vec_perm<mode>): Update accordingly.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254469
[AArch64] Pass number of units to aarch64_simd_vect_par_cnst_half
This patch passes the number of units to aarch64_simd_vect_par_cnst_half,
which avoids a to_constant () once GET_MODE_NUNITS is variable.
2017-11-06 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_simd_vect_par_cnst_half):
Take the number of units too.
* config/aarch64/aarch64.c (aarch64_simd_vect_par_cnst_half): Likewise.
(aarch64_simd_check_vect_par_cnst_half): Update call accordingly,
but check for a vector mode before rather than after the call.
* config/aarch64/aarch64-simd.md (aarch64_split_simd_mov<mode>)
(move_hi_quad_<mode>, vec_unpack<su>_hi_<mode>)
(vec_unpack<su>_lo_<mode, vec_widen_<su>mult_lo_<mode>)
(vec_widen_<su>mult_hi_<mode>, vec_unpacks_lo_<mode>)
(vec_unpacks_hi_<mode>, aarch64_saddl2<mode>, aarch64_uaddl2<mode>)
(aarch64_ssubl2<mode>, aarch64_usubl2<mode>, widen_ssum<mode>3)
(widen_usum<mode>3, aarch64_saddw2<mode>, aarch64_uaddw2<mode>)
(aarch64_ssubw2<mode>, aarch64_usubw2<mode>, aarch64_sqdmlal2<mode>)
(aarch64_sqdmlsl2<mode>, aarch64_sqdmlal2_lane<mode>)
(aarch64_sqdmlal2_laneq<mode>, aarch64_sqdmlsl2_lane<mode>)
(aarch64_sqdmlsl2_laneq<mode>, aarch64_sqdmlal2_n<mode>)
(aarch64_sqdmlsl2_n<mode>, aarch64_sqdmull2<mode>)
(aarch64_sqdmull2_lane<mode>, aarch64_sqdmull2_laneq<mode>)
(aarch64_sqdmull2_n<mode>): Update accordingly.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254468
[AArch64] Pass number of units to aarch64_reverse_mask
This patch passes the number of units to aarch64_reverse_mask,
which avoids a to_constant () once GET_MODE_NUNITS is variable.
2017-11-06 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_reverse_mask): Take
the number of units too.
* config/aarch64/aarch64.c (aarch64_reverse_mask): Likewise.
* config/aarch64/aarch64-simd.md (vec_load_lanesoi<mode>)
(vec_store_lanesoi<mode>, vec_load_lanesci<mode>)
(vec_store_lanesci<mode>, vec_load_lanesxi<mode>)
(vec_store_lanesxi<mode>): Update accordingly.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254467
Later patches turn the number of vector units into a poly_int.
We deliberately don't support applying GEN_INT to those (except
in target code that doesn't distinguish between poly_ints and normal
constants); gen_int_mode needs to be used instead.
2017-11-06 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_endian_lane_rtx): Declare.
* config/aarch64/aarch64.c (aarch64_endian_lane_rtx): New function.
* config/aarch64/aarch64.h (ENDIAN_LANE_N): Take the number
of units rather than the mode.
* config/aarch64/iterators.md (nunits): New mode attribute.
* config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args):
Use aarch64_endian_lane_rtx instead of GEN_INT (ENDIAN_LANE_N ...).
* config/aarch64/aarch64-simd.md (aarch64_dup_lane<mode>)
(aarch64_dup_lane_<vswap_width_name><mode>, *aarch64_mul3_elt<mode>)
(*aarch64_mul3_elt_<vswap_width_name><mode>): Likewise.
(*aarch64_mul3_elt_to_64v2df, *aarch64_mla_elt<mode>): Likewise.
(*aarch64_mla_elt_<vswap_width_name><mode>, *aarch64_mls_elt<mode>)
(*aarch64_mls_elt_<vswap_width_name><mode>, *aarch64_fma4_elt<mode>)
(*aarch64_fma4_elt_<vswap_width_name><mode>):: Likewise.
(*aarch64_fma4_elt_to_64v2df, *aarch64_fnma4_elt<mode>): Likewise.
(*aarch64_fnma4_elt_<vswap_width_name><mode>): Likewise.
(*aarch64_fnma4_elt_to_64v2df, reduc_plus_scal_<mode>): Likewise.
(reduc_plus_scal_v4sf, reduc_<maxmin_uns>_scal_<mode>): Likewise.
(reduc_<maxmin_uns>_scal_<mode>): Likewise.
(*aarch64_get_lane_extend<GPI:mode><VDQQH:mode>): Likewise.
(*aarch64_get_lane_zero_extendsi<mode>): Likewise.
(aarch64_get_lane<mode>, *aarch64_mulx_elt_<vswap_width_name><mode>)
(*aarch64_mulx_elt<mode>, *aarch64_vgetfmulx<mode>): Likewise.
(aarch64_sq<r>dmulh_lane<mode>, aarch64_sq<r>dmulh_laneq<mode>)
(aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_lane<mode>): Likewise.
(aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_laneq<mode>): Likewise.
(aarch64_sqdml<SBINQOPS:as>l_lane<mode>): Likewise.
(aarch64_sqdml<SBINQOPS:as>l_laneq<mode>): Likewise.
(aarch64_sqdml<SBINQOPS:as>l2_lane<mode>_internal): Likewise.
(aarch64_sqdml<SBINQOPS:as>l2_laneq<mode>_internal): Likewise.
(aarch64_sqdmull_lane<mode>, aarch64_sqdmull_laneq<mode>): Likewise.
(aarch64_sqdmull2_lane<mode>_internal): Likewise.
(aarch64_sqdmull2_laneq<mode>_internal): Likewise.
(aarch64_vec_load_lanesoi_lane<mode>): Likewise.
(aarch64_vec_store_lanesoi_lane<mode>): Likewise.
(aarch64_vec_load_lanesci_lane<mode>): Likewise.
(aarch64_vec_store_lanesci_lane<mode>): Likewise.
(aarch64_vec_load_lanesxi_lane<mode>): Likewise.
(aarch64_vec_store_lanesxi_lane<mode>): Likewise.
(aarch64_simd_vec_set<mode>): Update use of ENDIAN_LANE_N.
(aarch64_simd_vec_setv2di): Likewise.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254466
Wilco Dijkstra [Mon, 6 Nov 2017 19:26:27 +0000 (19:26 +0000)]
[Arm] Cleanup IT attributes
A recent change to remove the movdi_vfp_cortexa8 meant that ldrd was used in
ITs block even when arm_restrict_it was enabled. Rather than just fixing this
latent issue, change the default of predicable_short_it to "no" so that only
16-bit instructions need to be marked with it. As a result there are far fewer
patterns that need the attribute, and omitting predicable_short_it is no longer
causing issues.
* config/arm/arm.md (predicable_short_it): Change default to "no",
improve documentation, remove uses that are identical to the default.
(enabled_for_depr_it): Rename to enabled_for_short_it.
* gcc/config/arm/arm-fixed.md (predicable_short_it): Remove default uses.
* gcc/config/arm/ldmstm.md (predicable_short_it): Likewise.
* gcc/config/arm/sync.md (predicable_short_it): Likewise.
* gcc/config/arm/thumb2.md (predicable_short_it): Likewise.
* gcc/config/arm/vfp.md (predicable_short_it): Likewise.
re PR target/82748 (ICE with __builtin_fabsq and __float128 in copy_to_mode_reg, at explow.c:612)
[gcc]
2017-11-06 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/82748
* config/rs6000/rs6000-builtin.def (BU_FLOAT128_1): Delete
float128 helper macros, which are no longer used after deleting
the old 'q' built-in functions, and moving the round to odd
built-in functions to being special built-in functions.
(BU_FLOAT128_2): Likewise.
(BU_FLOAT128_1_HW): Likewise.
(BU_FLOAT128_2_HW): Likewise.
(BU_FLOAT128_3_HW): Likewise.
(FABSQ): Delete old 'q' built-in functions.
(COPYSIGNQ): Likewise.
(SQRTF128_ODD): Move round to odd built-in functions to be
special built-in functions, so that we can handle
-mabi=ieeelongdouble.
(TRUNCF128_ODD): Likewise.
(ADDF128_ODD): Likewise.
(SUBF128_ODD): Likewise.
(MULF128_ODD): Likewise.
(DIVF128_ODD): Likewise.
(FMAF128_ODD): Likewise.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Map old 'q'
built-in names to 'f128'.
* config/rs6000/rs6000.c (rs6000_fold_builtin): Remove folding the
old 'q' built-in functions, as the machine independent code for
'f128' built-in functions handles this.
(rs6000_expand_builtin): Add expansion for float128 round to odd
functions, keying off on -mabi=ieeelongdouble of whether to use
the KFmode or TFmode variant.
(rs6000_init_builtins): Initialize the _Float128 round to odd
built-in functions.
* doc/extend.texi (PowerPC Built-in Functions): Document the old
_Float128 'q' built-in functions are now mapped into the new
'f128' built-in functions.
[gcc/testsuite]
2017-11-06 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/82748
* gcc.target/powerpc/pr82748-1.c: New test.
* gcc.target/powerpc/pr82748-2.c: Likewise.
Jakub Jelinek [Mon, 6 Nov 2017 16:29:11 +0000 (17:29 +0100)]
re PR tree-optimization/82838 (ICE in verify_ssa failed w/ store-merging)
PR tree-optimization/82838
* gimple-ssa-store-merging.c
(imm_store_chain_info::output_merged_store): Call force_gimple_operand_1
on a separate gimple_seq which is then appended to seq.
In this PR we tried to create a widening multiply of two 3-bit numbers,
but that isn't a widening multiply at the optab/rtl level, since both
the input and output still have the same mode.
We could trap this either in is_widening_mult_p or (as the patch does)
in the routines that actually ask for an optab. The latter seemed
more natural since is_widening_mult_p doesn't otherwise care about modes.
2017-11-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
PR tree-optimization/82816
* tree-ssa-math-opts.c (convert_mult_to_widen): Return false
if the modes of the two types are the same.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.c-torture/compile/pr82816.c: New test.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254454
Bill Schmidt [Mon, 6 Nov 2017 13:47:46 +0000 (13:47 +0000)]
[gcc]
2017-11-06 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* config/rs6000/altivec.md (*p9_vadu<mode>3) Rename to
p9_vadu<mode>3.
(usadv16qi): New define_expand.
(usadv8hi): New define_expand.
[gcc/testsuite]
2017-11-06 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* gcc.target/powerpc/sad-vectorize-1.c: New file.
* gcc.target/powerpc/sad-vectorize-2.c: New file.
* gcc.target/powerpc/sad-vectorize-3.c: New file.
* gcc.target/powerpc/sad-vectorize-4.c: New file.
* testsuite/libgomp.c++/loop-2.C: Return a value
for functions with non-void return type, or change type to void,
or add -Wno-return-type for test.
* testsuite/libgomp.c++/loop-4.C: Likewise.
* testsuite/libgomp.c++/parallel-1.C: Likewise.
* testsuite/libgomp.c++/shared-1.C: Likewise.
* testsuite/libgomp.c++/single-1.C: Likewise.
* testsuite/libgomp.c++/single-2.C: Likewise.
2017-11-06 Martin Liska <mliska@suse.cz>
* testsuite/27_io/basic_fstream/cons/char/path.cc (main):
Return a value for functions with non-void return type,
or change type to void, or add -Wno-return-type for test.
* testsuite/27_io/basic_ifstream/cons/char/path.cc (main):
Likewise.
* testsuite/27_io/basic_ofstream/open/char/path.cc (main):
Likewise.