Javier Miranda [Mon, 26 Jul 2021 08:55:39 +0000 (04:55 -0400)]
[Ada] Wrappers of access-to-subprograms with pre/post conditions
gcc/ada/
* sem_ch3.adb (Build_Access_Subprogram_Wrapper): Decorate the
wrapper with attribute Is_Wrapper, and move its declaration to
the freezing actions of its type declaration; done to facilitate
identifying it at later stages to avoid handling it as a
primitive operation of a tagged type; otherwise it may be
handled as a dispatching operation and erroneously registered in
a dispatch table.
(Make_Index): Add missing decoration of field Parent.
* sem_disp.adb (Check_Dispatching_Operation): Complete
decoration of late-overriding dispatching primitives.
(Is_Access_To_Subprogram_Wrapper): New subprogram.
(Inherited_Subprograms): Prevent cascaded errors; add missing
support for private types.
* sem_type.adb (Add_One_Interp): Add missing support for the
first interpretation of a primitive of an immediate ancestor
interface.
* sem_util.adb (Check_Result_And_Post_State_In_Pragma): Do not
report missing reference in postcondition placed in internally
built wrappers.
* exp_disp.adb (Expand_Dispatching_Call): Add assertion.
Ed Schonberg [Tue, 27 Jul 2021 14:55:07 +0000 (10:55 -0400)]
[Ada] Ada2022: implementation of AI12-0212 : iterator specs in array aggregates
gcc/ada/
* sem_aggr.adb (Resolve_Array_Aggregate): Check the validity of
an array aggregate all of whose components are iterated
component associations.
* exp_aggr.adb (Expand_Array_Aggregate,
Two_Pass_Aggregate_Expansion): Implement two-pass algorithm and
replace original aggregate with resulting temporary, to ensure
that a proper length check is performed if context is
constrained. Use attributes Pos and Val to handle index types of
any discrete type.
Bob Duff [Fri, 30 Jul 2021 20:49:37 +0000 (16:49 -0400)]
[Ada] Follow-on efficiency improvements
gcc/ada/
* gen_il-gen.adb: Set the number of concrete nodes that have the
Homonym field to a higher number than any other field. This
isn't true, but it forces Homonym's offset to be chosen first,
so it will be at offset zero and hence slot zero.
Bob Duff [Thu, 29 Jul 2021 15:15:46 +0000 (11:15 -0400)]
[Ada] Cleanup and efficiency improvements
gcc/ada/
* gen_il-gen.adb: Generate getters and setters with much of the
code inlined. Generate code for storing a few fields in the node
header, to avoid the extra level of indirection for those
fields. We generate the header type, so we don't have to
duplicate hand-written Ada and C code to depend on the number of
header fields. Declare constants for slot size. Use short names
because these are used all over. Remove
Put_Low_Level_Accessor_Instantiations, Put_Low_Level_C_Getter,
which are no longer needed. Rename
Put_High_Level_C_Getter-->Put_C_Getter.
* atree.ads, atree.adb: Take into account the header slots.
Take into account the single Node_Or_Entity_Field type. Remove
"pragma Assertion_Policy (Ignore);", because the routines in
this package are no longer efficiency critical.
* atree.h: Remove low-level getters, which are no longer used by
sinfo.h and einfo.h.
* einfo-utils.adb: Avoid crash in Known_Alignment.
* live.adb, sem_eval.adb: Remove code that prevents Node_Id from
having a predicate. We don't actually add a predicate to
Node_Id, but we want to be able to for temporary debugging.
* sinfo-utils.adb: Remove code that prevents Node_Id from having
a predicate. Take into account the single Node_Or_Entity_Field
type.
* sinfo-utils.ads: Minor.
* table.ads (Table_Type): Make the components aliased, because
low-level setters in Atree need to take 'Access.
* treepr.adb: Take into account the single Node_Or_Entity_Field
type. Make some code more robust, so we can print out
half-baked nodes.
* types.ads: Move types here for visibility purposes.
* gcc-interface/gigi.h, gcc-interface/trans.c: Take into account
the Node_Header change in the GNAT front end.
* gcc-interface/cuintp.c, gcc-interface/targtyps.c: Add because
gigi.h now refers to type Node_Header, which is in sinfo.h.
Steve Baird [Fri, 23 Jul 2021 18:09:05 +0000 (11:09 -0700)]
[Ada] Update "Implementation Defined Characteristics" documentation.
gcc/ada/
* doc/gnat_rm/implementation_defined_characteristics.rst: Update
this section to reflect the current version of Ada RM M.2.
* gnat_rm.texi: Regenerate.
Bill Schmidt [Thu, 23 Sep 2021 12:35:42 +0000 (07:35 -0500)]
rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change
Previously zero-width bit fields were removed from structs, so that otherwise
homogeneous aggregates were treated as such and passed in FPRs and VSRs.
This was incorrect behavior per the ELFv2 ABI. Now that these fields are no
longer being removed, we generate the correct parameter passing code. Alert
the unwary user in the rare cases where this behavior changes.
2021-09-23 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/102024
* config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Detect
zero-width bit fields and return indicator.
(rs6000_discover_homogeneous_aggregate): Diagnose when the
presence of a zero-width bit field changes parameter passing in
GCC 12.
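
For illustration, the kind of struct this diagnostic concerns might look like the minimal sketch below. The type and function names are hypothetical and are not taken from the patch or the PR. Whether a given aggregate is actually affected depends on the ELFv2 homogeneous-aggregate rules for the target.

```cpp
#include <cassert>

// Two floats and nothing else: a candidate homogeneous aggregate.
struct Plain { float x; float y; };

// The same two members, separated by a zero-width bit field.
struct WithZeroWidth { float x; int : 0; float y; };

// Both functions take their argument by value, which is where the
// parameter-passing convention comes into play.
float sum_plain(Plain p) { return p.x + p.y; }
float sum_zero_width(WithZeroWidth p) { return p.x + p.y; }
```

Both functions compute the same value. Per the description above, what can differ between GCC versions on the affected target is how the by-value argument is passed, and that is what the new diagnostic reports.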
Richard Biener [Thu, 23 Sep 2021 08:27:01 +0000 (10:27 +0200)]
tree-optimization/102448 - clear copied alignment info from vect
This fixes the previous change which removed setting alignment info
from the vectorizer's idea of how a pointer is being used but left
in place the copied info from DR_PTR_INFO without realizing that this
is in fact _not_ the alignment of the access but the alignment of
a base pointer contained in it.
The following makes sure to not use that info.
2021-09-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/102448
* tree-vect-data-refs.c (vect_duplicate_ssa_name_ptr_info):
Clear alignment info copied from DR_PTR_INFO.
Hongyu Wang [Mon, 12 Jul 2021 06:02:10 +0000 (14:02 +0800)]
AVX512FP16: add truncmn2/extendmn2 expanders
gcc/ChangeLog:
* config/i386/sse.md (extend<ssePHmodelower><mode>2):
New expander.
(extendv4hf<mode>2): Likewise.
(extendv2hfv2df2): Likewise.
(trunc<mode><ssePHmodelower>2): Likewise.
(avx512fp16_vcvt<castmode>2ph_<mode>): Rename to ...
(trunc<mode>v4hf2): ... this, and drop constraints.
(avx512fp16_vcvtpd2ph_v2df): Rename to ...
(truncv2dfv2hf2): ... this, and likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-trunc-extendvnhf.c: New test.
Hongyu Wang [Mon, 12 Jul 2021 01:45:33 +0000 (09:45 +0800)]
AVX512FP16: Add float(uns)?mn2 expander
gcc/ChangeLog:
* config/i386/sse.md (float<floatunssuffix><mode><ssePHmodelower>2):
New expander.
(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>):
Rename to ...
(float<floatunssuffix><mode>v4hf2): ... this, and drop constraints.
(avx512fp16_vcvt<floatsuffix>qq2ph_v2di): Rename to ...
(float<floatunssuffix>v2div2hf2): ... this, and likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-floatvnhf.c: New test.
Hongyu Wang [Thu, 1 Jul 2021 05:17:32 +0000 (13:17 +0800)]
AVX512FP16: Add fix(uns)?_truncmn2 for HF scalar and vector modes
NB: 64-bit/32-bit vectorization for HFmode is not supported for now; this
patch will be adjusted once V2HF/V4HF operations are supported.
gcc/ChangeLog:
* config/i386/i386.md (fix<fixunssuffix>_trunchf<mode>2): New expander.
(fixuns_trunchfhi2): Likewise.
(*fixuns_trunchfsi2zext): New define_insn.
* config/i386/sse.md (ssePHmodelower): New mode_attr.
(fix<fixunssuffix>_trunc<ssePHmodelower><mode>2):
New expander for same element vector fix_truncate.
(fix<fixunssuffix>_trunc<ssePHmodelower><mode>2):
Likewise for V4HF to V4SI/V4DI fix_truncate.
(fix<fixunssuffix>_truncv2hfv2di2):
Likewise for V2HF to V2DI fix_truncate.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-trunchf.c: New test.
* gcc.target/i386/avx512fp16-truncvnhf.c: Ditto.
* config/i386/sse.md (FMAMODEM): Extend to handle FP16.
(VFH_SF_AVX512VL): Extend to handle HFmode.
(VF_SF_AVX512VL): Deleted.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-fma-1.c: New test.
* gcc.target/i386/avx512fp16vl-fma-1.c: New test.
* gcc.target/i386/avx512fp16vl-fma-vectorize-1.c: New test.
Jakub Jelinek [Thu, 23 Sep 2021 08:07:49 +0000 (10:07 +0200)]
openmp: Diagnose omp::directive attribute without balanced token argument [PR102413]
If omp::directive attribute argument starting with the opening ( is not a balanced
token sequence, then cp_parser_skip_balanced_tokens (parser, 1) returns 1,
but the code was subtracting 2 from it and iterating until it was 0, so for the
non-balanced case it iterated from (size_t) -1 down to 0.
The following patch just diagnoses that as an error.
2021-09-23 Jakub Jelinek <jakub@redhat.com>
PR c++/102413
* parser.c (cp_parser_omp_directive_args): Diagnose if omp::directive
is not followed by a balanced token sequence starting with open paren.
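
The wraparound described above can be sketched in isolation. This is not the parser code: `tokens_between_parens` and `checked_tokens_between_parens` are hypothetical names, and the only fact taken from the message is that an index of 1 had 2 subtracted from it as a `size_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the buggy arithmetic: `end` is the index
// returned by the token skipper, which is 1 for an unbalanced sequence.
constexpr std::size_t tokens_between_parens(std::size_t end) {
  return end - 2;  // unsigned arithmetic wraps around when end < 2
}

// Guarded variant: report failure instead of wrapping, so the caller
// can diagnose the unbalanced sequence rather than iterate.
inline bool checked_tokens_between_parens(std::size_t end, std::size_t& count) {
  if (end < 2) return false;
  count = end - 2;
  return true;
}

static_assert(tokens_between_parens(1) == SIZE_MAX, "1 - 2 wraps to SIZE_MAX");
```

The guarded variant only mirrors the idea of the fix, which is to treat the unbalanced case as an error before any count is derived from it.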
I've been pulling state from across the forward jump threader into the
jt_state class, but it still didn't feel right. The ultimate goal
was to keep track of candidate threading paths so that the simplifier
could simplify statements with the path as context. This patch completes
the transition, while cleaning up a lot of things in the process.
I've revamped both state and the simplifier such that a base state class
contains only the blocks as they're registered, and any pass specific
knowledge is where it belongs... in the pass. This allows VRP to keep
its const and copies business, and DOM to keep this as well as its evrp
client. This makes the threader cleaner, as it will now have no knowledge
of either const/copies or evrp.
This also paves the way for the upcoming hybrid threader, which will
just derive the state class and provide almost nothing, since the ranger
doesn't need to register any equivalences or ranges as it folds.
There is some code duplication in the simplifier, since both the DOM and
VRP clients use a vr_values based simplifier, but this is temporary as
the VRP client is about to be replaced with a hybrid ranger.
For a better view of what this patch achieves, here are the base
classes:
The compiler was failing to diagnose the error required by F2018 C838
when passing an assumed-rank array argument to a non-assumed-rank dummy.
It was also incorrectly giving an error for calls to the 2-argument form
of the ASSOCIATED intrinsic, which is supposed to be permitted by C838.
gcc/fortran/
* check.c (gfc_check_associated): Allow an assumed-rank
array for the pointer argument.
* interface.c (compare_parameter): Also give rank mismatch
error on assumed-rank array.
Fortran: Fix testcases that violate C838, + revealed ICE
The three test cases fixed in this patch violated F2018 C838, which
only allows passing an assumed-rank argument to an assumed-rank dummy.
Wrapping the call in "select rank" revealed a null pointer dereference
which is fixed by guarding the use of the result of
GFC_DECL_SAVED_DESCRIPTOR similar to what is already done elsewhere.
gcc/fortran/
* trans-stmt.c (trans_associate_var): Check that result of
GFC_DECL_SAVED_DESCRIPTOR is not null before using it.
gcc/testsuite/
* gfortran.dg/assumed_rank_18.f90 (g): Wrap call to h in
select rank.
* gfortran.dg/assumed_type_10.f90 (test_array): Likewise for
call to test_lib.
* gfortran.dg/assumed_type_11.f90 (test_array): Likewise.
It turned out that enabling -Wmissing-include-dirs for libcpp produced
too many warnings, at least when run with -B and similar options during the
GCC build, and warned about internal include dirs like finclude, which are
unlikely to be relevant to a real-world user.
This patch now only warns for -I and -J by default but still allows getting
the full warnings, including the libcpp ones, with -Wmissing-include-dirs.
It additionally documents this in the manual.
With that change, the -Wno-missing-include-dirs could be removed
from libgfortran's configure and libgomp's testsuite always cflags.
This reverts those bits of the previous
commit r12-3722-g417ea5c02cef7f000e66d1af22b066c2c1cda047
Additionally, it turned out that all calls to load_file called exit
explicitly, except for the main file via gfc_init -> gfc_new_file. The
latter also output a file-not-found fatal error, so that two errors
were printed. Now exit is called in line with the other users of
load_file.
Finally, when compiling with "nonexisting/file.f90", first a warning that
"nonexisting" does not exist as include path was printed before the file
not found error was printed. Now the directory in which the physical file
is located is added silently, relying on the file-not-found diagnostic for
those.
* cpp.c (gfc_cpp_register_include_paths, gfc_cpp_post_options):
Add new bool verbose_missing_dir_warn argument.
* cpp.h (gfc_cpp_post_options): Update prototype.
* f95-lang.c (gfc_init): Remove duplicated file-not found diag.
* gfortran.h (gfc_check_include_dirs): Takes bool
verbose_missing_dir_warn arg.
(gfc_new_file): Returns now void.
* options.c (gfc_post_options): Update to warn for -I and -J,
only, by default but for all when user requested.
* scanner.c (gfc_do_check_include_dir):
(gfc_do_check_include_dirs, gfc_check_include_dirs): Take bool
verbose warn arg and update to avoid printing the same message
twice or never.
(load_file): Fix indent.
(gfc_new_file): Return void and exit when load_file failed
as all other load_file users do.
Roger Sayle [Wed, 22 Sep 2021 18:17:49 +0000 (19:17 +0100)]
More NEGATE_EXPR folding in match.pd
As observed by Jakub in comment #2 of PR 98865, the expression -(a>>63)
is optimized in GENERIC but not in GIMPLE. Investigating further it
turns out that this is one of a few transformations performed by
fold_negate_expr in fold-const.c that aren't yet performed by match.pd.
This patch moves/duplicates them there, and should be relatively safe
as these transformations are already performed by the compiler, but
just in different passes.
This revised patch adds a Boolean simplify argument to tree-ssa-sccvn.c's
vn_nary_build_or_lookup_1 to control whether simplification should be
performed before value numbering, updating the callers, but then
avoiding simplification when constructing/value-numbering NEGATE_EXPR.
This avoids the regression of gcc.dg/tree-ssa/ssa-free-88.c, and enables
the new test case(s) to pass.
2021-09-22 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
* match.pd (negation simplifications): Implement some negation
folding transformations from fold-const.c's fold_negate_expr.
* tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Add a SIMPLIFY
argument, to control whether the op should be simplified prior
to looking up/assigning a value number.
(vn_nary_build_or_lookup): Update call to vn_nary_build_or_lookup_1.
(vn_nary_simplify): Likewise.
(visit_nary_op): Likewise, but when constructing a NEGATE_EXPR
now call vn_nary_build_or_lookup_1 disabling simplification.
gcc/testsuite/ChangeLog
* gcc.dg/fold-negate-1.c: New test case.
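
As a sketch of why folding -(a>>63) is worthwhile: for a signed 64-bit value it equals a logical shift of the same bits. The function names below are hypothetical, and the code assumes that right-shifting a negative signed value is an arithmetic shift (true for GCC, and required from C++20 on). It illustrates the identity only, not the match.pd pattern itself.

```cpp
#include <cassert>
#include <cstdint>

// -(a >> 63): the arithmetic shift yields 0 or -1, the negation 0 or 1.
constexpr std::int64_t negated_sign(std::int64_t a) { return -(a >> 63); }

// A logical shift of the same bits yields 0 or 1 directly.
constexpr std::int64_t logical_sign(std::int64_t a) {
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(a) >> 63);
}

static_assert(negated_sign(-5) == logical_sign(-5), "negative input");
static_assert(negated_sign(7) == logical_sign(7), "non-negative input");
```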
H.J. Lu [Mon, 20 Sep 2021 14:48:05 +0000 (07:48 -0700)]
x86: Clean up gcc.target/i386/auto-init-* tests
1. Replace ia32 with { ! lp64 } to enable ILP32 tests for -mx32.
2. Replace lp64 with { ! ia32 } to enable x86-64 ISA tests for -mx32.
3. For auto-init-3.c, add -msse and -mfpmath=387 for ia32.
* gcc.target/i386/auto-init-2.c: Replace ia32 with { ! lp64 }.
* gcc.target/i386/auto-init-3.c (dg-options): Add -msse.
(dg-additional-options): Add -mfpmath=387 for ia32.
Replace lp64 with { ! ia32 }. Add a space after ia32.
* gcc.target/i386/auto-init-4.c: Replace lp64 with { ! ia32 }.
* gcc.target/i386/auto-init-5.c: Likewise.
* gcc.target/i386/auto-init-padding-3.c: Likewise.
* gcc.target/i386/auto-init-padding-7.c: Likewise.
* gcc.target/i386/auto-init-padding-8.c: Likewise.
* gcc.target/i386/auto-init-padding-9.c: Likewise.
Patrick Palka [Wed, 22 Sep 2021 15:16:53 +0000 (11:16 -0400)]
c++: concept-ids and value-dependence [PR102412]
The problem here is that uses_template_parms returns true for all
concept-ids (even those with non-dependent arguments), so when a concept-id
is used as a default template argument then during deduction the default
argument is considered dependent even after substituting into it, which
leads to deduction failure (from type_unification_real).
This patch fixes this by implementing the resolution of CWG 2446 which
says a concept-id is dependent only if its arguments are.
DR 2446
PR c++/102412
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Check value_dependent_expression_p
instead of processing_template_decl.
* pt.c (value_dependent_expression_p) <case TEMPLATE_ID_EXPR>:
Return true only if any_dependent_template_arguments_p.
(instantiation_dependent_r) <case CALL_EXPR>: Remove this case.
<case TEMPLATE_ID_EXPR>: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-nondep2.C: New test.
* g++.dg/cpp2a/concepts-nondep3.C: New test.
[Ada] Contracts written for the Ada.Strings.Bounded library
gcc/ada/
* libgnat/a-strbou.adb: Turn SPARK_Mode on.
* libgnat/a-strbou.ads: Write contracts.
* libgnat/a-strfix.ads (Index): Fix grammar error in a comment.
* libgnat/a-strsea.ads (Index): Likewise.
* libgnat/a-strsup.adb: Rewrite the body to take into account
the new definition of Super_String using Relaxed_Initialization
and a predicate.
(Super_Replicate, Super_Translate, Times): Added loop
invariants, and ghost lemmas for Super_Replicate and Times.
(Super_Trim): Rewrite the body using search functions to
determine the cutting points.
(Super_Element, Super_Length, Super_Slice, Super_To_String):
Remove (now written as expression functions in a-strsup.ads).
* libgnat/a-strsup.ads: Added contracts.
(Super_Element, Super_Length, Super_Slice, Super_To_String):
Rewrite as expression functions.
[Ada] VxWorks inconsistent use of return type (BOOL)
gcc/ada/
* libgnarl/s-vxwext.ads (BOOL): New int type.
(Interrupt_Context): Change return type to BOOL.
* libgnarl/s-vxwext__kernel.ads: Likewise.
* libgnarl/s-vxwext__rtp-smp.adb: Likewise.
* libgnarl/s-vxwext__rtp.adb: Likewise.
* libgnarl/s-vxwext__rtp.ads: Likewise.
* libgnarl/s-osinte__vxworks.adb (Interrupt_Context): Change
return type to BOOL.
* libgnarl/s-osinte__vxworks.ads (BOOL): New subtype.
(taskIsSuspended): Change return type to BOOL.
(Interrupt_Context): Change return type to BOOL. Adjust comments
accordingly.
* libgnarl/s-taprop__vxworks.adb (System.VxWorks.Ext.BOOL):
use type.
(Is_Task_Context): Test Interrupt_Context against 0.
* libgnat/i-vxwork.ads (BOOL): New int.
(intContext): Change return type to BOOL. Adjust comments.
* libgnat/i-vxwork__x86.ads: Likewise.
[Ada] More precise analysis of function renamings in GNATprove
gcc/ada/
* freeze.adb (Build_Renamed_Body): Special case for GNATprove.
* sem_ch6.adb (Analyze_Expression_Function): Remove useless test
for a node to come from source, which becomes harmful otherwise.
Steve Baird [Wed, 14 Jul 2021 23:55:28 +0000 (16:55 -0700)]
[Ada] Improve performance for case-insensitive regular expressions
gcc/ada/
* libgnat/s-regpat.adb (Match): Handle the case where Self.First
is not NUL (so we know the first character we are looking for),
but case-insensitive matching has been specified.
(Optimize): In the case of an EXACTF Op, set Self.First as is
done in the EXACT case, except with the addition of a call to
Lower_Case.
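
The idea behind this optimization can be sketched outside Ada: when the first character of any match is known, positions whose lower-cased character differs can be skipped before the full matcher runs. The helper below is a hypothetical illustration, not a translation of s-regpat.adb, and it assumes `first_lower` has already been lower-cased, as the entry above says Optimize now does for EXACTF.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the first index in `text` at or after `from` whose lower-cased
// character equals `first_lower`, or std::string::npos if there is none.
// Only these positions would need to be handed to the full matcher.
inline std::size_t next_candidate(const std::string& text, std::size_t from,
                                  char first_lower) {
  for (std::size_t i = from; i < text.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(text[i]);
    if (static_cast<char>(std::tolower(c)) == first_lower) return i;
  }
  return std::string::npos;
}
```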
Eric Botcazou [Thu, 15 Jul 2021 09:18:02 +0000 (11:18 +0200)]
[Ada] Generate temporary for if-expression with -fpreserve-control-flow
gcc/ada/
* exp_ch4.adb (Expand_N_If_Expression): Generate an intermediate
temporary when the expression is a condition in an outer decision
and control-flow optimizations are suppressed.
Steve Baird [Fri, 9 Jul 2021 19:04:09 +0000 (12:04 -0700)]
[Ada] Add -gnatX support for casing on array values
gcc/ada/
* exp_ch5.adb (Expand_General_Case_Statement.Pattern_Match): Add
new function Indexed_Element to handle array element
comparisons. Handle case choices that are array aggregates,
string literals, or names denoting constants.
* sem_case.adb (Composite_Case_Ops.Array_Case_Ops): New package
providing utilities needed for casing on arrays.
(Composite_Case_Ops.Choice_Analysis): If necessary, include
array length as a "component" (like a discriminant) when
traversing components. We do not (yet) partition choice analysis
to deal with unequal length choices separately. Instead, we
embed everything in the minimum-dimensionality Cartesian product
space needed to handle all choices properly; this is determined
by the length of the longest choice pattern.
(Composite_Case_Ops.Choice_Analysis.Traverse_Discrete_Parts):
Include length as a "component" in the traversal if necessary.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice.Traverse_Choice):
Add support for case choices that are string literals or names
denoting constants.
(Composite_Case_Ops.Choice_Analysis): Include length as a
"component" in the analysis if necessary.
(Check_Choices.Check_Case_Pattern_Choices.Ops.Value_Sets.Value_Index_Count):
Improve error message when capacity exceeded.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation to reflect current implementation status.
* gnat_rm.texi: Regenerate.
Eric Botcazou [Tue, 13 Jul 2021 09:23:38 +0000 (11:23 +0200)]
[Ada] Fix imprecise wording for error on scalar storage order
gcc/ada/
* freeze.adb (Check_Component_Storage_Order): Give a specific error
message for non-byte-aligned component in the packed case. Replace
"composite" with "record" in both cases.
In patch r12-3136, niter->control, niter->bound and niter->cmp are
derived from number_of_iterations_lt. For the 'until wrap' condition,
however, the calculation in number_of_iterations_lt does not match how
those fields are defined, nor the requirements of determine_exit_conditions.
This patch calculates niter->control, niter->bound and niter->cmp in
number_of_iterations_until_wrap instead.
gcc/ChangeLog:
2021-09-22 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/102087
* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
Update bound/cmp/control for niter.
gcc/testsuite/ChangeLog:
2021-09-22 Jiufu Guo <guojiufu@linux.ibm.com>
* gcc.dg/pr102087.c: New test.
PR tree-optimization/102087
path solver: Use range_on_path_entry instead of looking at equivalences.
Cycling through equivalences to improve a range is nowhere near as
efficient as asking the ranger what the range on entry is.
Testing on a hybrid VRP threader shows that this improves our VRP
threading benefit from 14.5% to 18.5% and our overall jump threads from
0.85% to 1.28%.
Alan Modra [Wed, 1 Sep 2021 23:35:05 +0000 (09:05 +0930)]
obstack.h __PTR_ALIGN vs. ubsan
Current ubsan complains on every use of __PTR_ALIGN (when ptrdiff_t is
as large as a pointer), due to making calculations relative to a NULL
pointer. This patch avoids the problem by extracting out and
simplifying __BPTR_ALIGN for the usual case. I've continued to use
ptrdiff_t here, where it might be better to throw away __BPTR_ALIGN
entirely and just assume uintptr_t exists.
* obstack.h (__PTR_ALIGN): Expand and simplify __BPTR_ALIGN
rather than calculating relative to a NULL pointer.
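
The two alignment styles discussed above can be sketched as follows. This is not the obstack.h macro: the function names are hypothetical, `mask` is assumed to be the alignment minus one, and the first function uses `uintptr_t`, which is the alternative the message mentions rather than what the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round p up to the next address whose low bits under `mask` are zero,
// working on the integer value of the pointer.
inline char* align_up(char* p, std::uintptr_t mask) {
  assert((mask & (mask + 1)) == 0);  // mask must have the form 2^k - 1
  std::uintptr_t v = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<char*>((v + mask) & ~mask);
}

// Base-relative form: only offsets from `base` are computed, so no
// arithmetic is ever done on a null pointer. The result is aligned
// only if `base` itself is aligned.
inline char* align_up_from_base(char* base, char* p, std::ptrdiff_t mask) {
  return base + ((p - base + mask) & ~mask);
}
```

Neither form offsets a null pointer, which is the construct ubsan was reported to complain about.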
Andreas Krebbel [Wed, 22 Sep 2021 07:32:21 +0000 (09:32 +0200)]
IBM Z: Fix PR102222
Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.
gcc/ChangeLog:
PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.
Jakub Jelinek [Wed, 22 Sep 2021 07:32:32 +0000 (09:32 +0200)]
openmp: Fix OpenMP expansion of scope with non-fallthru body [PR102415]
I've used the function for omp single expansion also for omp scope. That is
mostly ok, but as the testcase shows, there is one important difference.
The omp single expansion always has a fallthru body, because during omp
lowering it expands the body as if wrapped in an if, to simulate that
one thread runs the body and the others wait (unless nowait) until it
completes and then continue. omp scope is invoked by all threads and so if
the body is non-fallthru, the barrier (unless nowait) at the end will not be
reached by any of the threads.
The following patch fixes that by gracefully handling the case where the cfg
pass optimizes away its exit bb.
2021-09-22 Jakub Jelinek <jakub@redhat.com>
PR middle-end/102415
* omp-expand.c (expand_omp_single): If region->exit is NULL,
assert region->entry is GIMPLE_OMP_SCOPE region and return.
Jakub Jelinek [Wed, 22 Sep 2021 07:29:13 +0000 (09:29 +0200)]
openmp: Add support for allocator and align modifiers on allocate clauses
As the allocate-2.c testcase shows, this change isn't 100% backwards compatible:
one could have allocator and/or align functions that return an OpenMP allocator
handle, and previously it would call those functions, whereas now it uses those
names as keywords for the modifiers. But it allows specifying extra alignment
requirements for the allocations.
2021-09-22 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_ALLOCATE_ALIGN): Define.
* tree.c (omp_clause_num_ops): Change number of OMP_CLAUSE_ALLOCATE
arguments from 2 to 3.
* tree-pretty-print.c (dump_omp_clause): Print allocator() around
allocate clause allocator and print align if present.
* omp-low.c (scan_sharing_clauses): Force allocate_map entry even
for omp_default_mem_alloc if align modifier is present. If align
modifier is present, use TREE_LIST to encode both allocator and
align.
(lower_private_allocate, lower_rec_input_clauses, create_task_copyfn):
Handle align modifier on allocator clause if present.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Copy over OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/c/
* c-parser.c (c_parser_omp_clause_allocate): Parse allocate clause
modifiers.
gcc/cp/
* parser.c (cp_parser_omp_clause_allocate): Parse allocate clause
modifiers.
* semantics.c (finish_omp_clauses) <OMP_CLAUSE_ALLOCATE>: Perform
semantic analysis of OMP_CLAUSE_ALLOCATE_ALIGN.
* pt.c (tsubst_omp_clauses) <case OMP_CLAUSE_ALLOCATE>: Handle
also OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/testsuite/
* c-c++-common/gomp/allocate-6.c: New test.
* c-c++-common/gomp/allocate-7.c: New test.
* g++.dg/gomp/allocate-4.C: New test.
libgomp/
* testsuite/libgomp.c-c++-common/allocate-2.c: New test.
* testsuite/libgomp.c-c++-common/allocate-3.c: New test.
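
A minimal sketch of the align modifier in use, assuming a compiler that includes this change and is run with -fopenmp. Without OpenMP the pragma is ignored and the loop runs serially with the same result. The function name is hypothetical, and the example is not taken from the new tests.

```cpp
#include <cassert>

// Sums 0 .. n-1. With OpenMP enabled, each thread's private copy of
// `scratch` is requested with 64-byte alignment via the align modifier.
int sum_with_aligned_private(int n) {
  assert(n >= 0);
  int total = 0;
  int scratch = 0;
#pragma omp parallel for private(scratch) allocate(align(64) : scratch) reduction(+ : total)
  for (int i = 0; i < n; ++i) {
    scratch = i;
    total += scratch;
  }
  return total;
}
```

Since no allocator modifier is given, the sketch leaves the choice of allocator to the implementation and only asks for the extra alignment.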
* gcc.target/i386/pr92658-avx512f.c: Refine testcase.
* gcc.target/i386/pr92658-avx512vl.c: Adjust scan-assembler,
only v2di->v2qi truncate is not supported, v4di->v4qi should
be supported.
AVX512FP16: Add expander for ceil/floor/trunc/roundeven.
gcc/ChangeLog:
* config/i386/i386.md (<rounding_insn>hf2): New expander.
(sse4_1_round<mode>2): Extend from MODEF to MODEFH.
* config/i386/sse.md (*sse4_1_round<ssescalarmodesuffix>):
Extend from VF_128 to VFH_128.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-builtin-round-1.c: New test.
* gcc.target/i386/avx-1.c: Add test for new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Add test for new intrinsics.
* gcc.target/i386/sse-22.c: Ditto.
Kewen Lin [Wed, 22 Sep 2021 03:25:54 +0000 (22:25 -0500)]
rs6000: Parameterize some const values for density test
This patch follows the discussion here[1], where Segher suggested
parameterizing those exact magic constants for density heuristics,
to make them easier to tweak if needed.
The change here should be "No Functional Change". I nevertheless verified
it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8;
the result is neutral, as expected.
c++: fix template instantiation comparison in redeclarations
This change fixes a primordial c++11 frontend defect where function template
redeclarations with trailing return types that used dependent
sizeof/alignof/noexcept expressions in template value arguments failed to
compare as equivalent to the identical primary template declaration. By
forcing structural AST comparison of the template arguments, we no longer
require TYPE_CANONICAL to match in this case. The new canon-type-{15..18}.C
tests failed with all prior GCC versions, where the redeclarations were
incorrectly reported as ambiguous overloads. The new dependent-name{15,16}.C
tests are regression tests for sneaky problems encountered during
development of this fix. Note that this fix does not address the use of parm
objects' constexpr members as template arguments within a declaration (a
superficially similar longstanding defect).
gcc/cp/ChangeLog:
* pt.c (find_parm_usage_r): New walk_tree callback to find func
parms.
(any_template_arguments_need_structural_equality_p): New special
case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-52830.C: Remove unwanted dg-ice.
* g++.dg/template/canon-type-15.C: New test.
* g++.dg/template/canon-type-16.C: New test.
* g++.dg/template/canon-type-17.C: New test.
* g++.dg/template/canon-type-18.C: New test.
* g++.dg/template/dependent-name15.C: New regression test.
* g++.dg/template/dependent-name16.C: New regression test.
The default behavior for the path solver is to resort to VARYING when
the range for an unknown SSA is outside the given path. This is both
cheap and fast, but fails to get a significant amount of ranges that
traditionally the DOM and VRP threaders could get.
This patch uses the ranger to resolve any unknown names upon entry to
the path. It also uses equivalences to improve ranges.
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::defined_outside_path):
New.
(path_range_query::range_on_path_entry): New.
(path_range_query::internal_range_of_expr): Resolve unknowns
with ranger.
(path_range_query::improve_range_with_equivs): New.
(path_range_query::ssa_range_in_phi): Resolve unknowns with
ranger.
* gimple-range-path.h (class path_range_query): Add
defined_outside_path, range_on_path_entry, and
improve_range_with_equivs.
The path solver takes an initial set of SSA names which are deemed
interesting. These are then solved along the path. Adding any copies
of said SSA names to the list of interesting names yields significantly
better results. This patch adds said copies to the already provided
list.
Currently this code is guarded by "m_resolve", which is the more
expensive mode, but it would be reasonable to make it available always,
especially since adding more imports usually has minimal impact on the
processing time. I will investigate and make it universally available
if this is indeed the case.
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::add_to_imports): New.
(path_range_query::add_copies_to_imports): New.
(path_range_query::precompute_ranges): Call
add_copies_to_imports.
* gimple-range-path.h (class path_range_query): Add prototypes
for add_copies_to_imports and add_to_imports.
This patch adds relational support to the path solver. It uses a
path_oracle that keeps track of relations within a path which are
augmented by relations on entry to the path. With it, range_of_stmt,
range_of_expr, and friends can give relation aware answers.