Richard Biener [Tue, 23 Nov 2021 09:11:41 +0000 (10:11 +0100)]
tree-optimization/103361 - fix unroll-and-jam direction vector handling
This properly uses lambda_int instead of truncating the direction
vector to int which leads to false unexpected negative values.
2021-11-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/103361
* gimple-loop-jam.c (adjust_unroll_factor): Use lambda_int
for the dependence distance.
* tree-data-ref.c (print_lambda_vector): Properly print a lambda_int.
Richard Biener [Tue, 7 Dec 2021 10:13:39 +0000 (11:13 +0100)]
tree-optimization/103596 - fix missed propagation into switches
may_propagate_copy unnecessarily restricts propagating non-abnormals
into places that currently contain an abnormal SSA name but are
not the PHI argument for an abnormal edge. This causes VN to
not elide a CFG path that it assumes is elided, resulting in
released SSA names in the IL.
The fix is to enhance the may_propagate_copy API to specify the
destination is _not_ a PHI argument. I chose to not update only
the relevant caller in VN and the may_propagate_copy_into_stmt API
at this point because this is a regression and needs backporting.
2021-12-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/103596
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Note we are not propagating into a PHI argument to may_propagate_copy.
* tree-ssa-propagate.h (may_propagate_copy): Add
argument specifying whether we propagate into a PHI arg.
* tree-ssa-propagate.c (may_propagate_copy): Likewise.
When not doing so we can replace an abnormal with
something else.
(may_propagate_into_stmt): Update may_propagate_copy calls.
(replace_exp_1): Move propagation checking code to
propagate_value and rename to ...
(replace_exp): ... this and elide previous wrapper.
(propagate_value): Perform checking with adjusted
may_propagate_copy call and dispatch to replace_exp.
Richard Biener [Thu, 3 Feb 2022 10:20:59 +0000 (11:20 +0100)]
debug/104337 - avoid messing with the abstract origin chain in NRV
The following avoids NRV from massaging DECL_ABSTRACT_ORIGIN after
variable creation since NRV runs _after_ the function was inlined and thus
affects the inlined variables copy indirectly. We may adjust the abstract
origin of a variable only at the point we create it, not further along the
path since otherwise the (new) invariant that the abstract origin is always
the ultimate origin cannot be maintained.
The intent of what NRV does is OK I guess and it may improve the debug
experience. But I also notice we do
Richard Biener [Wed, 9 Mar 2022 09:55:49 +0000 (10:55 +0100)]
middle-end/104786 - ICE with asm and VLA
The following fixes an ICE observed with a MEM_REF allows_mem asm
operand referencing a VLA. The following makes sure to not attempt
to go the temporary creation way when we cannot.
2022-03-09 Richard Biener <rguenther@suse.de>
PR middle-end/104786
* cfgexpand.c (expand_asm_stmt): Do not generate a copy
for VLAs without an upper size bound.
Richard Biener [Thu, 10 Mar 2022 12:35:46 +0000 (13:35 +0100)]
ada/104861 - use target_noncanonial for Target_Name
The following arranges for s-oscons.ads to record target_noncanonical
for Target_Name, matching the install directory layout and what
gcc -dumpmachine says. This fixes build issues with gprbuild.
Vectorizer loop versioning tries to version outer loops if possible
but fails to check whether it can actually split the single exit
edge as it will do.
2022-04-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/105226
* tree-vect-loop-manip.c (vect_loop_versioning): Verify
we can split the exit of an outer loop we choose to version.
Richard Biener [Mon, 28 Mar 2022 08:07:53 +0000 (10:07 +0200)]
tree-optimization/105070 - annotate bit cluster tests with locations
The following makes sure to annotate the tests generated by
switch lowering bit-clustering with locations which otherwise
can be completely lost even at -O0.
2022-03-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/105070
* tree-switch-conversion.h
(bit_test_cluster::hoist_edge_and_branch_if_true): Add location
argument.
* tree-switch-conversion.c
(bit_test_cluster::hoist_edge_and_branch_if_true): Annotate
cond with location.
(bit_test_cluster::emit): Annotate all generated expressions
with location.
The testcase was missing dg- before require-effective-target.
While at that, I'm also pruning the excess-error warning I got when
the test failed to be disabled because of the above. I suppose it
might be useful for some target variants.
for gcc/testsuite/ChangeLog
PR target/104253
* gcc.target/powerpc/pr104253.c: Add missing dg- before
require-effective-target. Prune warning about -mfloat128
possibly not being fully supported.
IPA_JF_ANCESTOR jump functions are constructed also when the formal
parameter of the caller is first checked whether it is NULL and left
as it is if it is NULL, to accommodate C++ casts to an ancestor class.
The jump function type was invented for devirtualization and IPA-CP
propagation of tree constants is also careful to apply it only to
existing DECLs(*) but as PR 103083 shows, the part propagating "known
bits" was not careful about this, which can lead to miscompilations.
This patch introduces a flag to the ancestor jump functions which
tells whether a NULL-check was elided when creating it and makes the
bits propagation behave accordingly, masking any bits otherwise would
be known to be one. This should safely preserve alignment info, which
is the primary ifnormation that we keep in bits for pointers.
(*) There still may remain problems when a DECL resides on address
zero (with -fno-delete-null-pointer-checks ...I hope it cannot happen
otherwise). I am looking into that now but I think it will be easier
for everyone if I do so in a follow-up patch.
gcc/ChangeLog:
2022-02-11 Martin Jambor <mjambor@suse.cz>
PR ipa/103083
* ipa-prop.h (ipa_ancestor_jf_data): New flag keep_null;
(ipa_get_jf_ancestor_keep_null): New function.
* ipa-prop.c (ipa_set_ancestor_jf): Initialize keep_null field of the
ancestor function.
(compute_complex_assign_jump_func): Pass false to keep_null
parameter of ipa_set_ancestor_jf.
(compute_complex_ancestor_jump_func): Pass true to keep_null
parameter of ipa_set_ancestor_jf.
(update_jump_functions_after_inlining): Carry over keep_null from the
original ancestor jump-function or merge them.
(ipa_write_jump_function): Stream keep_null flag.
(ipa_read_jump_function): Likewise.
(ipa_print_node_jump_functions_for_edge): Print the new flag.
* ipa-cp.c (class ipcp_bits_lattice): Make various getters const. New
member function known_nonzero_p.
(ipcp_bits_lattice::known_nonzero_p): New.
(ipcp_bits_lattice::meet_with_1): New parameter drop_all_ones,
observe it.
(ipcp_bits_lattice::meet_with): Likewise.
(propagate_bits_across_jump_function): Simplify. Pass true in
drop_all_ones when it is necessary.
(propagate_aggs_across_jump_function): Take care of keep_null
flag.
(ipa_get_jf_ancestor_result): Propagate NULL accross keep_null
jump functions.
gcc/testsuite/ChangeLog:
2021-11-25 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/pr103083-1.c: New test.
* gcc.dg/ipa/pr103083-2.c: Likewise.
H.J. Lu [Fri, 25 Mar 2022 04:41:12 +0000 (21:41 -0700)]
x86: Use x constraint on SSSE3 patterns with MMX operands
Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
have no AVX512 version, replace the "Yv" register constraint with the
"x" register constraint.
Peter Bergner [Fri, 18 Mar 2022 19:09:26 +0000 (14:09 -0500)]
rs6000: Allow -mlong-double-64 after -mabi={ibm,ieee}longdouble [PR104208, PR87496]
The glibc build is showing a build error due to extra "error" checking from my
PR87496 fix. That checking was overeager, disallowing setting the long double
size to 64-bits if the 128-bit long double ABI had already been specified.
Now we only emit an error if we specify a 128-bit long double ABI if our
long double size is not 128 bits. This also fixes an erroneous error when
-mabi=ieeelongdouble is used and ISA 2.06 is not enabled, but the long double
size has been changed to 64 bits.
2022-03-04 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/87496
PR target/104208
* config/rs6000/rs6000.c (rs6000_option_override_internal): Make the
ISA 2.06 requirement for -mabi=ieeelongdouble conditional on
-mlong-double-128.
Move the -mabi=ieeelongdouble and -mabi=ibmlongdouble error checking
from here...
* common/config/rs6000/rs6000-common.c (rs6000_handle_option):
... to here.
gcc/testsuite/
PR target/87496
PR target/104208
* gcc.target/powerpc/pr104208-1.c: New test.
* gcc.target/powerpc/pr104208-2.c: Likewise.
* gcc.target/powerpc/pr87496-2.c: Swap long double options to trigger
the expected error.
* gcc.target/powerpc/pr87496-3.c: Likewise.
Richard Biener [Mon, 14 Feb 2022 09:09:10 +0000 (10:09 +0100)]
tree-optimization/104511 - avoid FP to DFP conversion for VEC_PACK_TRUNC
This avoids forwprop from matching DFP <-> FP vector conversions
using VEC_[UN]PACK{_TRUNC,_LO,_HI}. Maybe DFP vectors shouldn't be
a thing, but they appearantly are. Re-using CONVERT/NOP_EXPR for
DFP <-> FP conversions was probably a mistake.
Use correct names for __ibm128 if long double is IEEE 128-bit.
If you are on a PowerPC system where the default long double is IEEE
128-bit (either through the compiler option -mabi=ieeelongdouble or via
the configure option --with-long-double-format=ieee), GCC used the wrong
names for some of the conversion functions for the __ibm128 type.
Internally, GCC uses IFmode for __ibm128 if long double is IEEE 128-bit,
instead of TFmode when long double is IBM 128-bit. This patch adds the
missing conversions to prevent the 'if' name from being used.
In particular, before the patch, the conversions used were:
IFmode to DImode signed: __fixifdi instead of __fixtfdi
IFmode to DImode unsigned __fixunsifti instead of __fixunstfti
DImode to IFmode signed: __floatdiif instead of __floatditf
DImode to IFmode unsigned: __floatundiif instead of __floatunditf
2022-03-05 Michael Meissner <meissner@the-meissners.org>
gcc/
PR target/104253
* config/rs6000/rs6000.c (init_float128_ibm): Update the
conversion functions used to convert IFmode types. Backport
change made to the master branch on 2022-02-14.
gcc/testsuite/
PR target/104253
* gcc.target/powerpc/pr104253.c: New test. Backport change made
to the master branch on 2022-02-14.
Define the sizes of the PowerPC specific types __float128 and __ibm128 if those
types are enabled.
This patch will define __SIZEOF_IBM128__ and __SIZEOF_FLOAT128__ if their
respective types are created in the compiler. Currently, this means both of
these will be defined if float128 support is enabled. But at some point in
the future, __ibm128 could be enabled without enabling float128 support and
__SIZEOF_IBM128__ would be defined.
2022-03-05 Michael Meissner <meissner@the-meissners.org>
gcc/
PR target/99708
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
__SIZEOF_IBM128__ if the IBM 128-bit long double type is created.
Define __SIZEOF_FLOAT128__ if the IEEE 128-bit floating point type
is created. Backport change to master branch on 2022-02-17.
gcc/testsuite/
PR target/99708
* gcc.target/powerpc/pr99708.c: New test. Backport change to
master branch on 2022-02-17.
Patrick Palka [Thu, 21 Oct 2021 16:13:35 +0000 (12:13 -0400)]
libstdc++: missing constexpr for __[nm]iter_base [PR102358]
PR libstdc++/102358
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (__niter_base): Make constexpr
for C++20.
(__miter_base): Likewise.
* testsuite/25_algorithms/move/constexpr.cc: New test.
Double reductions which have multiple LC PHIs in the inner loop
are not handled correctly during transformation since those PHIs
are not properly classified as reduction. The following disables
vectorizing them.
2021-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/103237
* tree-vect-loop.c (vect_is_simple_reduction): Fail for
double reductions with multiple inner loop LC PHI nodes.
Richard Biener [Thu, 11 Nov 2021 08:40:36 +0000 (09:40 +0100)]
middle-end/103181 - fix operation_could_trap_p for vector division
For integer vector division we only checked for all zero vector
constants rather than checking whether any element in the constant
vector is zero.
It also fixes the adjustment to operation_could_trap_helper_p
where I failed to realize that RDIV_EXPR is also used for
fixed-point types. It also fixes that handling by properly
checking for a fixed_zerop divisor.
2021-11-11 Richard Biener <rguenther@suse.de>
PR middle-end/103181
PR middle-end/103248
* tree-eh.c (operation_could_trap_helper_p): Properly
check vector constants for a zero element for integer
division. Separate floating point and integer division code.
Properly handle fixed-point RDIV_EXPR.
* gcc.dg/torture/pr103181.c: New testcase.
* gcc.dg/pr103248.c: Likewise.
Richard Biener [Mon, 18 Oct 2021 07:10:43 +0000 (09:10 +0200)]
tree-optimization/102798 - avoid copying PTA info to old SSA names
The vectorizer duplicates pointer-info to created pointer bases
but it has to avoid changing points-to info on existing SSA names
because there's now flow-sensitive info in there (pt->pt_null as
set from VRP).
2021-10-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/102798
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref):
Only copy points-to info to newly generated SSA names.
Richard Biener [Tue, 8 Jun 2021 10:52:12 +0000 (12:52 +0200)]
tree-optimization/100923 - fix alias-ref construction wrt availability
This PR shows that building an ao_ref from value-numbers is prone to
expose bogus contextual alias info to the oracle. The following makes
sure to construct ao_refs from SSA names available at the program point
only.
On the way it modifies the awkward valueize_refs[_1] API.
2021-06-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/100923
* tree-ssa-sccvn.c (valueize_refs_1): Take a pointer to
the operand vector to be valueized.
(valueize_refs): Likewise.
(valueize_shared_reference_ops_from_ref): Adjust.
(valueize_shared_reference_ops_from_call): Likewise.
(vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise. Re-valueize
with honoring availability when we are about to create
the ao_ref and valueized before.
(vn_reference_lookup): Likewise.
(vn_reference_insert_pieces): Adjust.
Kewen Lin [Mon, 7 Feb 2022 03:30:02 +0000 (21:30 -0600)]
rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]
The modified hunk can update VSX and ALTIVEC flag, we have some codes
to check/warn for some flags related to VSX and ALTIVEC sitting where
the hunk is proprosed to be moved to. Without this adjustment, the
VSX and ALTIVEC update is too late, it can cause the incompatibility
and result in unexpected behaviors, the associated test case is one
typical case.
Since we already have the code which sets TARGET_FLOAT128_TYPE and lays
after the moved place, and OPTION_MASK_FLOAT128_KEYWORD will rely on
TARGET_FLOAT128_TYPE, so it just simply remove them.
gcc/ChangeLog:
PR target/103627
* config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
hunk affecting VSX and ALTIVEC to appropriate place.
gcc/testsuite/ChangeLog:
PR target/103627
* gcc.target/powerpc/pr103627-3.c: New test.
Kewen Lin [Mon, 7 Feb 2022 03:29:32 +0000 (21:29 -0600)]
rs6000: Disable MMA if no VSX support [PR103627]
As PR103627 shows, there is an unexpected case where !TARGET_VSX
and TARGET_MMA co-exist. As ISA3.1 claims, SIMD is a requirement
for MMA. By looking into the ICE, I noticed that the current
MMA implementation depends on vector pairs load/store which use
VSX register, but we don't have a separated option to control
Power10 vector support and Segher pointed out "-mpower9-vector is
a workaround that should go away" and more explanations in [1].
So this patch makes MMA require VSX instead.
Alan Modra [Tue, 29 Jun 2021 04:01:45 +0000 (13:31 +0930)]
[POWER10] __morestack calls from pcrel code
Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10
and linking with a non-pcrel libgcc results in crashes due to the
power10 pcrel code not having r2 set for the generic-morestack.c
functions called from __morestack. There is also a problem when
non-pcrel code calls a pcrel libgcc. See the patch comments.
A similar situation theoretically occurs with ELFv1 multi-toc
executables, when __morestack might be located in a different toc
group to its caller. This patch makes no attempt to fix that, since
the gold linker does not support multi-toc (gold is needed for proper
support of -fsplit-stack code) nor does gcc emit __morestack calls
that support multi-toc.
* config/rs6000/morestack.S (R2_SAVE): Define.
(__morestack): Save and restore r2. Set up r2 for called
functions.
Alan Modra [Mon, 28 Sep 2020 07:12:33 +0000 (16:42 +0930)]
[RS6000] Adjust gcc asm for power10
Generate assembly with .localentry,1 functions using @notoc calls.
This patch makes libgcc.a asm look the same as power10 pcrel as far as
toc/notoc is concerned.
Otherwise calling between functions that advertise as using the TOC
and those that don't, will require linker call stubs in statically
linked code.
gcc/
* config/rs6000/ppc-asm.h: Support __PCREL__ code.
libgcc/
* config/rs6000/morestack.S,
* config/rs6000/tramp.S: Support __PCREL__ code.
libitm/
* config/powerpc/sjlj.S: Support __PCREL__ code.