Jonathan Wakely [Thu, 4 Nov 2021 15:51:37 +0000 (15:51 +0000)]
libstdc++: Support getentropy and arc4random in std::random_device
This adds additional "getentropy" and "arc4random" tokens to
std::random_device. The former is supported on Glibc and OpenBSD (and
apparently wasm), and the latter is supported on various BSDs.
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_CHECK_GETENTROPY, GLIBCXX_CHECK_ARC4RANDOM):
Define.
* configure.ac (GLIBCXX_CHECK_GETENTROPY, GLIBCXX_CHECK_ARC4RANDOM):
Use them.
* config.h.in: Regenerate.
* configure: Regenerate.
* src/c++11/random.cc (random_device): Add getentropy and
arc4random as sources.
* testsuite/26_numerics/random/random_device/cons/token.cc:
Check new tokens.
* testsuite/26_numerics/random/random_device/entropy.cc:
Likewise.
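A minimal sketch of how a caller might probe the new tokens. The helper names token_supported and draw_one are hypothetical. The sketch assumes that an unsupported token makes the std::random_device constructor throw, as libstdc++ does; other standard libraries may treat the token differently.

```cpp
#include <exception>
#include <initializer_list>
#include <random>
#include <string>

// Hypothetical helper: report whether this standard library accepts TOKEN.
// Assumption: an unsupported token makes the constructor throw.
bool token_supported(const std::string& token)
{
  try
    {
      std::random_device rd(token);
      (void) rd();  // draw one value to confirm the source works
      return true;
    }
  catch (const std::exception&)
    {
      return false;
    }
}

// Hypothetical helper: prefer the new tokens when available, otherwise
// fall back to the default source.
unsigned draw_one()
{
  for (const char* token : {"getentropy", "arc4random"})
    if (token_supported(token))
      return std::random_device(token)();
  return std::random_device()();
}
```

Probing at run time keeps the same binary usable on targets where only one of the two sources (or neither) was configured.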
Jonathan Wakely [Tue, 9 Nov 2021 10:31:18 +0000 (10:31 +0000)]
libstdc++: Make spurious std::random_device FAIL less likely
It's possible that independent reads from /dev/random and /dev/urandom
could produce the same value by chance. Retry if that happens. The
chances of it happening twice are minuscule.
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/random/random_device/cons/token.cc:
Retry if random devices produce the same value.
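The retry idea can be sketched as follows. This is not the testsuite code; sources_differ and the max_attempts parameter are hypothetical names used only for illustration.

```cpp
// Hypothetical helper: return true if two independent sources produce
// different values, retrying a bounded number of times when they happen
// to collide by chance.
template<typename SourceA, typename SourceB>
bool sources_differ(SourceA& a, SourceB& b, int max_attempts = 3)
{
  for (int attempt = 0; attempt < max_attempts; ++attempt)
    if (a() != b())
      return true;
  return false;  // every attempt collided
}
```

With 32-bit results, each extra attempt multiplies the chance of a spurious failure by roughly 2^-32, assuming the two sources really are independent.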
Jakub Jelinek [Tue, 9 Nov 2021 14:29:36 +0000 (15:29 +0100)]
c++: Fix ICE on complex constant with -frounding-math [PR103114]
The FE uses build_complex which assumes that fold_convert will fold
value to a constant. With -frounding-math that isn't guaranteed though.
So, the patch instead fold_build2s COMPLEX_EXPR, which will result
in build_complex if both arguments are constants, and otherwise
will build COMPLEX_EXPR.
build_zero_cst is an optimization for fold_convert (type, integer_zero_node).
2021-11-09 Jakub Jelinek <jakub@redhat.com>
PR c++/103114
* parser.c (cp_parser_userdef_numeric_literal): Use fold_build2
with COMPLEX_EXPR arg instead of build_complex, use build_zero_cst
instead of fold_convert from integer_zero_node.
Patrick Palka [Tue, 9 Nov 2021 14:09:43 +0000 (09:09 -0500)]
c++: bogus error w/ tentative type parse of concept-id [PR98394]
Here when tentatively parsing the if condition as a declaration, we try
to treat C<1> as the start of a constrained placeholder type, which we
quickly reject because C doesn't accept a type as its first argument.
But since we're parsing tentatively, we shouldn't emit an error in this
case.
In passing, also fix PR85846 by only overriding 'tentative' to false when
given a concept-name, and not also when given a concept-id that has an empty
argument list.
PR c++/98394
PR c++/85846
gcc/cp/ChangeLog:
* parser.c (cp_parser_placeholder_type_specifier): Declare
static. Don't override tentative to false when tmpl is a
concept-id with empty argument list. Don't emit a "does not
constrain a type" error when tentative.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-pr98394.C: New test.
* g++.dg/cpp2a/concepts-pr85846.C: New test.
Patrick Palka [Tue, 9 Nov 2021 14:09:12 +0000 (09:09 -0500)]
c++: unexpanded pack in var tmpl partial spec [PR100652]
Here we're failing to spot a bare parameter pack appearing in the argument
list of a variable template partial specialization because we only look for
them within the decl's TREE_TYPE, which is sufficient for class templates
but not for variable templates.
PR c++/100652
gcc/cp/ChangeLog:
* pt.c (push_template_decl): Check for bare parameter packs in
the argument list of a variable template partial specialization.
Thomas Schwinge [Tue, 31 Aug 2021 21:30:25 +0000 (23:30 +0200)]
Generalize 'gcc/input.h:struct location_hash'
This is currently only used here ('gcc/input.h:class string_concat_db'), but is
actually generally useful, so advertise it as such.
Per the rationale given, we may use 'BUILTINS_LOCATION' as spare value for
'Deleted', in addition to the existing use of 'UNKNOWN_LOCATION' as spare value
for 'Empty'.
gcc/
* input.h (location_hash): Use 'BUILTINS_LOCATION' as spare value
for 'Deleted'. Turn into a '#define'.
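Why a hash table wants two spare key values can be sketched with a tiny open-addressing set. Everything here (location_set, the lower-case constants and their values 0 and 1) is a hypothetical stand-in, not the GCC hash traits themselves.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for the two reserved location values.
using location = unsigned;
constexpr location unknown_location = 0;   // marks an Empty slot
constexpr location builtins_location = 1;  // marks a Deleted slot

// Fixed-size open-addressing set; the two reserved keys are rejected.
class location_set
{
  static constexpr std::size_t capacity = 16;
  std::array<location, capacity> slots{};  // zero-initialized: all Empty

  static std::size_t home(location key) { return key % capacity; }
  static bool reserved(location key)
  { return key == unknown_location || key == builtins_location; }

public:
  bool contains(location key) const
  {
    if (reserved(key))
      return false;
    for (std::size_t i = 0; i < capacity; ++i)
      {
        location slot = slots[(home(key) + i) % capacity];
        if (slot == key)
          return true;
        if (slot == unknown_location)
          return false;  // an Empty slot ends the probe sequence
        // Deleted slots are skipped so later entries stay reachable.
      }
    return false;
  }

  bool insert(location key)
  {
    if (reserved(key))
      return false;
    if (contains(key))
      return true;
    for (std::size_t i = 0; i < capacity; ++i)
      {
        location& slot = slots[(home(key) + i) % capacity];
        if (reserved(slot))  // Empty or Deleted slots can be reused
          {
            slot = key;
            return true;
          }
      }
    return false;  // table full
  }

  void erase(location key)
  {
    if (reserved(key))
      return;
    for (std::size_t i = 0; i < capacity; ++i)
      {
        location& slot = slots[(home(key) + i) % capacity];
        if (slot == key)
          {
            slot = builtins_location;  // leave a Deleted marker behind
            return;
          }
        if (slot == unknown_location)
          return;
      }
  }
};
```

Without a distinct Deleted marker, erasing an entry would cut the probe chain and hide any key stored after it.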
Aldy Hernandez [Tue, 9 Nov 2021 09:14:25 +0000 (10:14 +0100)]
Remove TDF_THREADING flag in favor of param.
I am returning a TDF_* flag to the queue of available entries as I am
unconvinced that we need to burn an entire flag for internal debugging
constructs, especially since we seem to be running out of them.
I've added a --param=threader-debug entry similar to the one we use for
ranger debugging. Currently this only affects the backward threader,
but since the DOM threader is an outlier and on the chopping block, I
avoided using the "backward" name.
Martin Jambor [Tue, 9 Nov 2021 10:32:20 +0000 (11:32 +0100)]
ipa: Fix segfault when remapping debug_binds with expressions (PR 103132)
My initial implementation of the method
ipa_param_body_adjustments::remap_with_debug_expressions was based on
the assumption that if it was asked to remap an expression (as opposed
to a simple SSA_NAME), the expression would not contain an SSA_NAME
operand which is to be debug-reset. While that is true for when
called from ipa_param_body_adjustments::prepare_debug_expressions, it
turns out it is not true when invoked from remap_gimple_stmt in
tree-inline.c. This patch adds simple logic to handle such cases
by mapping the entire value to NULL_TREE.
gcc/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103132
* ipa-param-manipulation.c (replace_with_mapped_expr): Early
return with error_mark_node when part of expression is mapped to
NULL.
(ipa_param_body_adjustments::remap_with_debug_expressions): Set
mapped value to NULL if walk_tree returns error_mark_node.
Eric Botcazou [Wed, 27 Oct 2021 21:51:07 +0000 (23:51 +0200)]
[Ada] Tidy up implementation of Has_Compatible_Type
gcc/ada/
* sem_ch4.adb (Analyze_Membership_Op) <Find_Interpretation>: Handle
both overloaded and non-overloaded cases.
<Try_One_Interp>: Do a reversed call to Covers if the outcome of the
call to Has_Compatible_Type is false.
Simplify implementation after change to Find_Interpretation.
(Analyze_User_Defined_Binary_Op): Be prepared for previous errors.
(Find_Comparison_Types) <Try_One_Interp>: Do a reversed call to
Covers if the outcome of the call to Has_Compatible_Type is false.
(Find_Equality_Types) <Try_One_Interp>: Likewise.
* sem_type.adb (Has_Compatible_Type): Remove the reversed calls to
Covers. Add explicit return on all paths.
Eric Botcazou [Mon, 1 Nov 2021 09:30:51 +0000 (10:30 +0100)]
[Ada] Print Storage_Pool and Procedure_To_Call fields
gcc/ada/
* sprint.adb (Sprint_Node_Actual) <N_Allocator>: Also print the
Procedure_To_Call field if it is present.
<N_Extended_Return_Statement>: Also print the Storage_Pool and
Procedure_To_Call fields if they are present.
<N_Free_Statement>: Likewise.
<N_Simple_Return_Statement>: Likewise.
Alexandre Oliva [Wed, 27 Oct 2021 21:26:27 +0000 (18:26 -0300)]
[Ada] Improve integration of strub with type systems
gcc/ada/
* strub.adb, strub.ads: New files.
* exp_attr.adb (Access_Cases): Copy strub mode to subprogram type.
* exp_disp.adb (Expand_Dispatching_Call): Likewise.
* freeze.adb (Check_Inherited_Conditions): Check that strub modes
match overridden subprograms and interfaces.
(Freeze_All): Renaming declarations too.
* sem_attr.adb (Resolve_Attribute): Reject 'Access to
strub-annotated data object.
* sem_ch3.adb (Derive_Subprogram): Copy strub mode to
inherited subprogram.
* sem_prag.adb (Analyze_Pragma): Propagate Strub Machine_Attribute
from access-to-subprogram to subprogram type when required,
but not from access-to-data to data type. Mark the entity that
got the pragma as having a gigi rep item.
* sem_res.adb (Resolve): Reject implicit conversions that
would change strub modes.
(Resolve_Type_Conversions): Reject checked conversions
between incompatible strub modes.
* doc/gnat_rm/security_hardening_features.rst: Update.
* gnat_rm.texi: Regenerate.
* libgnat/a-except.ads (Raise_Exception): Revert strub-callable
annotation in public subprogram.
* libgnat/s-arit128.ads (Multiply_With_Ovflo_Check128): Likewise.
* libgnat/s-arit64.ads (Multiply_With_Ovflo_Check64): Likewise.
* libgnat/s-secsta.ads (SS_Allocate): Likewise.
(SS_Mark, SS_Release): Likewise.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Add ada/strub.o.
Piotr Trojanek [Wed, 27 Oct 2021 11:33:53 +0000 (13:33 +0200)]
[Ada] Use atomics in runtime on ARM and Aarch64 VxWorks
gcc/ada/
* Makefile.rtl (ARM and Aarch64 VxWorks): Use atomic variants of
runtime units.
* libgnat/a-strunb__shared.ads: Mention AARCH64 and ARM as
supported.
* libgnat/s-atocou.ads: Likewise.
[Ada] Add gcc specs with vxworks7 base addresses for cert
gcc/ada/
* vxworks7-cert-rtp-link.spec: Replace the definition of
__wrs_rtp_base with the base_link spec.
* vxworks7-cert-rtp-base-link.spec: Add base_link spec with
__wrs_rtp_base definition for all architectures.
* vxworks7-cert-rtp-base-link__ppc64.spec: Add base_link spec
with __wrs_rtp_base definition for ppc64.
* vxworks7-cert-rtp-base-link__x86.spec: Add base_link spec with
__wrs_rtp_base definition for x86.
* vxworks7-cert-rtp-base-link__x86_64.spec: Add base_link spec
with __wrs_rtp_base definition for x86_64.
Piotr Trojanek [Wed, 27 Oct 2021 18:43:24 +0000 (20:43 +0200)]
[Ada] Cleanup building of renamed equality
gcc/ada/
* exp_ch8.adb (Build_Body_For_Renaming): Remove unnecessary
calls to Sloc; set Handled_Statement_Sequence when building
subprogram body; whitespace cleanup.
Piotr Trojanek [Wed, 27 Oct 2021 08:33:32 +0000 (10:33 +0200)]
[Ada] Reference in Unbounded_String is almost never null
gcc/ada/
* libgnat/a-strunb.adb (Deallocate): Rename Reference_Copy to
Old, to make the code similar to other routines in this package.
(Realloc_For_Chunk): Use a temporary, deallocate the previous
string using a null-allowing copy of the string reference.
Gary Dismukes [Tue, 26 Oct 2021 00:45:50 +0000 (20:45 -0400)]
[Ada] Errors on globals in expressions of predicate aspects in generic bodies
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Analyze the expression of
a pragma Predicate associated with an aspect at the freeze point
of the type, to ensure that references to globals get saved when
the aspect occurs within a generic body. Also, add
Aspect_Static_Predicate to the choices of the membership test of
the enclosing guard.
Piotr Trojanek [Tue, 26 Oct 2021 15:57:59 +0000 (17:57 +0200)]
[Ada] Tune comment about expansion of array equality
gcc/ada/
* exp_ch4.adb (Arr_Attr): Refine type of the parameter from Int
to Pos; refine name of the parameter from Num to Dim; fix
reference to "Expr" in comment.
* libgnat/s-regexp.adb (Compile.Check_Well_Formed_Pattern): When
a "|" operator is encountered in a pattern, check that it is not
the last character of the pattern.
Piotr Trojanek [Mon, 25 Oct 2021 19:15:58 +0000 (21:15 +0200)]
[Ada] Fix detection of array aggregates with single others associations
gcc/ada/
* checks.adb (Apply_Constraint_Check): Guard against calling
Choices when the first association in an array aggregate is a
N_Iterated_Component_Association node.
Piotr Trojanek [Mon, 25 Oct 2021 14:33:24 +0000 (16:33 +0200)]
[Ada] Guard against illegal items in Global but not Depends
gcc/ada/
* sem_prag.adb (Check_Usage): Guard against calling Usage_Error
with illegal Item_Id. The intention to do this was already
described in the comment but not implemented.
Aldy Hernandez [Fri, 8 Oct 2021 13:54:23 +0000 (15:54 +0200)]
Convert strlen pass from evrp to ranger.
The following patch converts the strlen pass from evrp to ranger,
leaving DOM as the last remaining user.
No additional cleanups have been done. For example, the strlen pass
still has uses of VR_ANTI_RANGE, and the sprintf still passes around
pairs of integers instead of using a proper range. Fixing this
could further improve these passes.
Basically the entire patch is just adjusting the calls to range_of_expr
to include context. The previous context of si->stmt was mostly
empty, so not really useful ;-).
With ranger we are now able to remove the range calculation from
before_dom_children entirely. Just working with the ranger on-demand
catches all the strlen and sprintf testcases with the exception of
builtin-sprintf-warn-22.c which is due to a limitation of the sprintf
code. I have XFAILed the test and documented what the problem is.
On a positive note, these changes found two possible sprintf overflow
bugs in the C++ and Fortran front-ends which I have fixed below.
libstdc++: only define _GLIBCXX_HAVE_TLS for VxWorks >= 6.6
According to
https://gcc.gnu.org/legacy-ml/gcc-patches/2008-03/msg01698.html, the
TLS support, including the __tls_lookup function, was added to VxWorks
in 6.6.
It certainly doesn't exist on our VxWorks 5 platform, but the fallback
code in eh_globals.cc using __gthread_key_create() etc. used to work
just fine.
libstdc++-v3/ChangeLog:
* config/os/vxworks/os_defines.h (_GLIBCXX_HAVE_TLS): Only
define for VxWorks >= 6.6.
Eric Botcazou [Mon, 8 Nov 2021 21:09:16 +0000 (22:09 +0100)]
Fix couple of issues in large PIC model on x86-64/VxWorks
The first issue is that the !gotoff_operand path of legitimize_pic_address
in the large PIC model does not make use of REG when it is available, which
breaks for thunks because new pseudo-registers can no longer be created.
And the second issue is that the system compiler (LLVM) generates @GOTOFF
in large model even for RTP, so we do the same.
gcc/
* config/i386/i386.c (legitimize_pic_address): Adjust comment and
use the REG argument on the CM_LARGE_PIC code path as well.
* config/i386/predicates.md (gotoff_operand): Do not treat VxWorks
specially with the large code models.
Andrew MacLeod [Mon, 8 Nov 2021 14:32:42 +0000 (09:32 -0500)]
Don't calculate new values when using the private context callback.
When using ranger's private callback mechanism to provide context
to fold_stmt calls, we are only supposed to use the cache in
read-only mode and never calculate new values.
gcc/
PR tree-optimization/103122
* gimple-range.cc (gimple_ranger::range_of_expr): Request the cache
entry with "calculate new values" set to false.
Jan Hubicka [Mon, 8 Nov 2021 17:40:17 +0000 (18:40 +0100)]
Improve handling of some builtins.
For nested functions we output a call to builtin_dwarf_cfa which
initializes a frame entry used only for debugging. This however
prevents us from detecting functions containing nested functions
as const/pure or analyzing side effects in modref.
builtin_dwarf_cfa is not documented and I wonder if it should be turned into an
internal function. But I think we could consider functions using it const even
if in theory one can do things like test the return address and see the
difference between different frame addresses.
While doing so I also noticed that special_builtin_state handles quite a few
builtins that are not special cased by ipa-modref. They do not make
user visible loads/stores and thus I think they should be annotated by
".c" to make this explicit for both modref and PTA.
Finally I added dwarf_cfa and the similar return_address to the list of simple
builtins since it compiles to a simple stack frame load (and we consider
other builtins doing so simple).
Jan Hubicka [Mon, 8 Nov 2021 17:38:09 +0000 (18:38 +0100)]
Move uncprop after modref
Moves uncprop after the modref and pure/const passes and adds a comment that
this pass should always be last since it is only supposed to help PHI lowering.
The pass replaces constants by SSA names that are known to be constant at that
place, which hardly helps other passes.
gcc/ChangeLog:
PR tree-optimization/103177
* passes.def: Move uncprop after pure/const and modref.
Martin Jambor [Mon, 8 Nov 2021 16:49:54 +0000 (17:49 +0100)]
ipa: Unshare expressions before putting them into debug statements (PR 103099, PR 103107)
My recent patch to improve debug experience when there are removed
parameters (by ipa-sra or ipa-split) was not careful to unshare the
expressions that were then put into debug statements, which manifests
itself as PR 103099. This patch adds unsharing them using
unshare_expr_without_location which is a bit more careful with stripping
locations than what we were doing manually and so also fixes PR 103107.
gcc/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103099
PR ipa/103107
* tree-inline.c (remap_gimple_stmt): Unshare the expression without
location before invoking remap_with_debug_expressions on it.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
David Edelsohn [Mon, 8 Nov 2021 16:46:47 +0000 (11:46 -0500)]
powerpc: Fix vsx_splat_v4si_di breakage on Power8.
The vsx_splat_v4si_di pattern uses a Power8 and a Power9 instruction.
The final condition of TARGET_DIRECT_MODE_64BIT implicitly requires Power8.
The "we" constraint requires Power9, but also requires 64 bit. Because
the DImode pattern already requires 64 bit mode, this isn't horrible,
but it would be best to remove all uses of "we" constraint. The
mtvsrws instruction itself does not require 64 bit mode.
This patch reverts the previous change to fix the breakage.
gcc/ChangeLog:
* config/rs6000/vsx.md (vsx_splat_v4si_di): Revert "wa"
constraint to "we".
Richard Biener [Mon, 8 Nov 2021 14:21:08 +0000 (15:21 +0100)]
Fix spurious valgrind errors in irred loop verification
The sbitmap bitmap_{set,clear}_bit changes trigger spurious
uninitialized value use reports from valgrind since we now
read the old value before setting/clearing a bit, so the
verify_loop_structure optimization of not clearing the sbitmap is reported.
Fixed by using a temporary BB flag which should also be more
efficient in terms of cache re-use.
2021-11-08 Richard Biener <rguenther@suse.de>
* cfgloop.c (verify_loop_structure): Use a temporary BB flag
instead of an sbitmap to cache irreducible state.
The problem here is that both value_17 and value_20 are in the set of
imports we must pre-calculate. The value_17 name occurs first in the
bitmap, so we try to resolve it first, which causes us to recursively
solve the value_20 range. We do so correctly and put them both in the
cache. However, when we try to solve value_20 from the bitmap, we
ignore that it already has a cached entry and try to resolve the PHI
with the wrong value of value_17:
# value_20 = PHI <value_17(19), value_7(D)(17)>
The right thing to do is to avoid recalculating definitions already
solved.
Regstrapped and checked for # threads before and after on x86-64 Linux.
gcc/ChangeLog:
PR tree-optimization/103120
* gimple-range-path.cc (path_range_query::range_defined_in_block):
Bail if there's a cache entry.
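The "bail if there's a cache entry" rule can be sketched like this. The types range and range_cache and the function resolve_once are hypothetical and much simpler than the real path solver.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical cache of solved ranges, keyed by SSA name.
struct range { int lo; int hi; };
using range_cache = std::map<std::string, range>;

// Resolve NAME only if it has no cached entry yet. Returns whether the
// solver was actually invoked.
bool resolve_once(range_cache& cache, const std::string& name,
                  const std::function<range()>& solve)
{
  if (cache.count(name))
    return false;  // already solved, e.g. while resolving another import
  cache[name] = solve();
  return true;
}
```

In the scenario above, value_20 would already be cached after resolving value_17, so the second visit from the bitmap leaves the entry untouched.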
Bill Schmidt [Mon, 8 Nov 2021 14:34:03 +0000 (08:34 -0600)]
rs6000: Miscellaneous uses of rs6000_builtin_decls_x
There are a few leftover places where we use the old rs6000_builtin_decls
array, but we need to use rs6000_builtin_decls_x instead when the new
builtins infrastructure is in play.
2021-11-08 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Use
rs6000_builtin_decls_x when appropriate.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.
Thomas Schwinge [Fri, 22 Oct 2021 13:54:42 +0000 (15:54 +0200)]
Fix 'contrib/update-copyright.py': 'TypeError: exceptions must derive from BaseException'
Running 'contrib/update-copyright.py' currently fails:
[...]
Traceback (most recent call last):
File "contrib/update-copyright.py", line 365, in update_copyright
canon_form = self.canonicalise_years (dir, filename, filter, years)
File "contrib/update-copyright.py", line 270, in canonicalise_years
(min_year, max_year) = self.year_range (years)
File "contrib/update-copyright.py", line 253, in year_range
year_list = [self.parse_year (year)
File "contrib/update-copyright.py", line 253, in <listcomp>
year_list = [self.parse_year (year)
File "contrib/update-copyright.py", line 250, in parse_year
raise self.BadYear (string)
TypeError: exceptions must derive from BaseException
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "contrib/update-copyright.py", line 796, in <module>
GCCCmdLine().main()
File "contrib/update-copyright.py", line 527, in main
self.copyright.process_tree (dir, filter)
File "contrib/update-copyright.py", line 458, in process_tree
self.process_file (dir, filename, filter)
File "contrib/update-copyright.py", line 421, in process_file
res = self.update_copyright (dir, filename, filter,
File "contrib/update-copyright.py", line 366, in update_copyright
except self.BadYear as e:
TypeError: catching classes that do not inherit from BaseException is not allowed
aarch64: LD3/LD4 post-modify costs for struct modes
The LD3/ST3 and LD4/ST4 address cost code had no test coverage (oops).
This patch fixes that and updates it for the new structure modes.
The test only covers Advanced SIMD because SVE doesn't have
post-increment forms.
gcc/
* config/aarch64/aarch64.c (aarch64_ldn_stn_vectors): New function.
(aarch64_address_cost): Use it instead of testing for CImode and
XImode directly.
gcc/testsuite/
* gcc.target/aarch64/neoverse_v1_1.c: New test.
I was working on a patch that needed to calculate the number of
modes in a particular class. It seemed better to have genmodes
generate this directly rather than do the kind of dance that
expmed.h had.
gcc/
* genmodes.c (emit_insn_modes_h): Define NUM_MODE_* macros.
* expmed.h (NUM_MODE_INT): Delete in favor of genmodes definitions.
(NUM_MODE_PARTIAL_INT, NUM_MODE_VECTOR_INT): Likewise.
* real.h (real_format_for_mode): Use NUM_MODE_FLOAT and
NUM_MODE_DECIMAL_FLOAT.
(REAL_MODE_FORMAT): Likewise.
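A rough sketch of the kind of per-class count being discussed. The enum below is a hypothetical, heavily reduced mode list, and deriving the count from MIN/MAX bounds is only one plausible way to compute it; the exact form genmodes emits is not shown in this log.

```cpp
// Hypothetical, heavily reduced mode list; the real one is generated.
enum machine_mode_sketch
{
  QImode, HImode, SImode, DImode,  // integer class
  SFmode, DFmode,                  // float class
  MIN_MODE_INT = QImode, MAX_MODE_INT = DImode,
  MIN_MODE_FLOAT = SFmode, MAX_MODE_FLOAT = DFmode
};

// Per-class counts defined once, next to the mode list, so that other
// headers do not each have to derive them.
constexpr int NUM_MODE_INT = MAX_MODE_INT - MIN_MODE_INT + 1;
constexpr int NUM_MODE_FLOAT = MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1;
```

Defining the counts in one generated place lets several headers size their per-class tables from the same definitions.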
Richard Biener [Mon, 8 Nov 2021 08:08:12 +0000 (09:08 +0100)]
tree-optimization/103102 - fix error in vectorizer refactoring
This fixes an oversight that caused vectorized epilogues to have
versioning for niters applied.
2021-11-08 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_create_loop_vinfo): Add main_loop_info
parameter.
* tree-vect-loop.c (vect_create_loop_vinfo): Likewise. Set
LOOP_VINFO_ORIG_LOOP_INFO and conditionalize set of
LOOP_VINFO_NITERS_ASSUMPTIONS.
(vect_analyze_loop_1): Adjust.
(vect_analyze_loop): Move loop constraint setting and
SCEV/niter reset here from vect_create_loop_vinfo to perform
it only once.
(vect_analyze_loop_form): Move dumping of symbolic niters
here from vect_create_loop_vinfo.
Jan Hubicka [Mon, 8 Nov 2021 06:52:45 +0000 (07:52 +0100)]
Add loads/stores relative to static chain in ipa-modref
Adds tracking of accesses relative to static chain into modref
load/stores analysis. This helps some Fortran benchmarks; however, it is still
quite limited. One problem is that we never discover functions with nested
functions as const, pure or not accessing global memory because they contain a
__builtin_dwarf_cfa call which we believe to be non-pure.
Bootstrapped/regtested x86_64-linux. Plan to commit it tomorrow if there are
no complaints and once periodic testers pick up today's modref changes.
Honza
gcc/ChangeLog:
* ipa-modref-tree.h (enum modref_special_parms): New enum.
(struct modref_access_node): Update for special parms.
(struct modref_ref_node): Likewise.
(struct modref_parm_map): Likewise.
(struct modref_tree): Likewise.
* ipa-modref.c (dump_access): Likewise.
(get_access): Detect static chain.
(parm_map_for_arg): Take tree as arg instead of
stmt and index.
(merge_call_side_effects): Compute map for static chain.
(process_fnspec): Update.
(struct escape_point): Remove retslot_arg and static_chain_arg.
(analyze_parms): Update.
(compute_parm_map): Update.
(propagate_unknown_call): Update.
(modref_propagate_in_scc): Update.
(modref_merge_call_site_flags): Update.
(ipa_merge_modref_summary_after_inlining): Update.
* tree-ssa-alias.c (modref_may_conflict): Handle static chain.
* ipa-modref-tree.c (test_merge): Update.
Haochen Gui [Tue, 2 Nov 2021 06:09:32 +0000 (14:09 +0800)]
Disables gimple folding for VSX_BUILTIN_XVMINDP, VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMINFP and ALTIVEC_BUILTIN_VMAXFP when fast-math is not set.
gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Disable
gimple fold for VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP,
VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP when fast-math is not
set.
gcc/testsuite/
* gcc.target/powerpc/vec-minmax-1.c: New test.
* gcc.target/powerpc/vec-minmax-2.c: Likewise.
liuhongt [Fri, 5 Nov 2021 02:41:22 +0000 (10:41 +0800)]
Update documentation for -ftree-loop-vectorize and -ftree-slp-vectorize which are enabled by default at -O2.
gcc/ChangeLog:
PR tree-optimization/103077
* doc/invoke.texi (Options That Control Optimization):
Update documentation for -ftree-loop-vectorize and
-ftree-slp-vectorize which are enabled by default at -O2.
liuhongt [Mon, 8 Nov 2021 01:32:17 +0000 (09:32 +0800)]
Add !HONOR_SNANS to simplification: (trunc)copysign((extend)a, (extend)b) to copysign (a, b).
> Note that this is not safe with -fsignaling-nans, so needs to be disabled
> for that option (if there isn't already logic somewhere with that effect),
> because the extend will convert a signaling NaN to quiet (raising
> "invalid"), but copysign won't, so this transformation could result in a
> signaling NaN being wrongly returned when the original code would never
> have returned a signaling NaN.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
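The two forms related by this simplification can be written out as below. The function names are hypothetical. The sketch only shows that the forms agree for ordinary values; it does not try to demonstrate the signaling NaN difference described above, which depends on the target and on -fsignaling-nans.

```cpp
#include <cmath>

// The original form: widen both operands, apply copysign, narrow the result.
float copysign_via_double(float a, float b)
{
  return static_cast<float>(std::copysign(static_cast<double>(a),
                                          static_cast<double>(b)));
}

// The simplified form: copysign applied directly in the narrow type.
float copysign_direct(float a, float b)
{
  return std::copysign(a, b);
}
```

For non-NaN inputs widening to double is exact, so both functions return the same float.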
Thomas Koenig [Sun, 7 Nov 2021 14:38:35 +0000 (15:38 +0100)]
Fix keyword name for co_reduce.
gcc/fortran/ChangeLog:
* intrinsic.c (add_subroutines): Change keyword "operator"
to the correct one, "operation".
* check.c (gfc_check_co_reduce): Change OPERATOR to
OPERATION in error messages.
* intrinsic.texi: Change OPERATOR to OPERATION in
documentation.
gcc/testsuite/ChangeLog:
* gfortran.dg/co_reduce_2.f90: New test.
* gfortran.dg/coarray_collectives_14.f90: Change OPERATOR
to OPERATION.
* gfortran.dg/coarray_collectives_16.f90: Likewise.
* gfortran.dg/coarray_collectives_9.f90: Likewise.
Aldy Hernandez [Mon, 1 Nov 2021 14:50:38 +0000 (15:50 +0100)]
Remove VRP threader.
Now that things have stabilized, we can remove the old code.
I have left the hybrid threader in tree-ssa-threadedge, even though the
VRP threader was the only user, because we may need it as an interim
step for DOM threading removal.
Jan Hubicka [Sun, 7 Nov 2021 17:20:45 +0000 (18:20 +0100)]
Fix inter-procedural EAF flags propagation with respect to !binds_to_current_def_p
While proofreading the code for handling EAF flags of !binds_to_current_def_p I
noticed that the interprocedural dataflow actually ignores the flag, possibly
introducing wrong code on quite complex interposable functions in non-trivial
recursion cycles (or at ltrans partition boundary).
This patch unifies the flag changes in a single place (remove_useless_eaf_flags)
and extends modref_merge_call_site_flags to do the right thing.
lto-bootstrapped/regtested x86_64-linux. Plan to commit it today after a bit
more testing (firefox/clang build).
gcc/ChangeLog:
* gimple.c (gimple_call_arg_flags): Use interposable_eaf_flags.
(gimple_call_retslot_flags): Likewise.
(gimple_call_static_chain_flags): Likewise.
* ipa-modref.c (remove_useless_eaf_flags): Do not remove everything for
NOVOPS.
(modref_summary::useful_p): Likewise.
(modref_summary_lto::useful_p): Likewise.
(analyze_parms): Do not give up on NOVOPS.
(analyze_function): When dumping, report changes in EAF flags
between IPA and local pass.
(modref_merge_call_site_flags): Compute implicit eaf flags
based on callee ecf_flags and fnspec; if the function does not
bind to current defs use interposable_eaf_flags.
(modref_propagate_flags_in_scc): Update.
* ipa-modref.h (interposable_eaf_flags): New function.
Bill Schmidt [Sun, 7 Nov 2021 13:56:07 +0000 (07:56 -0600)]
rs6000: Replace the builtin expansion machinery
This patch forms the meat of the improvements for this patch series.
We develop a replacement for rs6000_expand_builtin and its supporting
functions, which are inefficient and difficult to maintain.
Differences between the old and new support in this patch include:
- Make use of the new builtin data structures, directly looking up
a function's information rather than searching for the function
multiple times;
- Test for enablement of builtins at expand time, to support #pragma
target changes within a compilation unit;
- Use the builtin function attributes (e.g., bif_is_cpu) to control
special handling;
- Refactor common code into one place; and
- Provide common error handling in one place for operands that are
restricted to specific values or ranges.
Jan Hubicka [Sun, 7 Nov 2021 08:35:16 +0000 (09:35 +0100)]
Implement intra-procedural dataflow in ipa-modref flags propagation.
Implement the (long promised) intraprocedural dataflow for
propagating eaf flags, so we can handle parameters that participate
in loops in SSA graphs. Typical examples are accessors that walk linked
lists.
I implemented dataflow using the standard iteration over BBs in RPO some time
ago, but did not like it because it had measurable compile time impact with
very small code quality effect. This is why I kept mainline to do the DFS walk
instead. The reason is that we care about flags of SSA names that correspond
to parameters and those can often be determined from a small fraction of the
SSA graph, so solving dataflow for all SSA names in a function is a waste.
This patch implements dataflow more carefully. The DFS walk is kept in place to
solve acyclic cases and discover the relevant part of the SSA graph into a new graph
(which is similar to the one used for inter-procedural dataflow - we only need to
know the edges and if the access is direct or dereferenced). The RPO iterative
dataflow then works on this simplified graph.
This seems to be fast in practice. For GCC linktime we do dataflow for 4881
functions. Out of that 4726 finish in one iteration, 144 in two and 10 in 3.
Overall 31979 functions are analysed, so we do dataflow only for a bit over
10% of cases. 131123 edges are visited by the solver. I measured no compile
time impact of this.
gcc/ChangeLog:
* ipa-modref.c (modref_lattice): Add do_dataflow,
changed and propagate_to fields.
(modref_lattice::release): Free propagate_to
(modref_lattice::merge): Do not give up early on unknown
lattice values.
(modref_lattice::merge_deref): Likewise.
(modref_eaf_analysis): Update toplevel comment.
(modref_eaf_analysis::analyze_ssa_name): Record postponed ssa names;
do optimistic dataflow initialization.
(modref_eaf_analysis::merge_with_ssa_name): Build dataflow graph.
(modref_eaf_analysis::propagate): New member function.
(analyze_parms): Update to new API of modref_eaf_analysis.
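The iterative step on the simplified graph can be sketched as a fixpoint loop. This is not the ipa-modref implementation. flag_graph and propagate are hypothetical names, flags are plain bitmasks, and merging is assumed to be a bitwise AND, so a node keeps a flag only if every node it depends on keeps it too.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simplified graph: edges[i] lists the nodes whose flags
// node i must be merged with.
struct flag_graph
{
  std::vector<unsigned> flags;                  // current lattice values
  std::vector<std::vector<std::size_t>> edges;  // edges[i] = sources of i
};

// Iterate until nothing changes. Returns the number of passes over the
// graph, including the final pass that finds no change.
int propagate(flag_graph& g)
{
  int iterations = 0;
  bool changed = true;
  while (changed)
    {
      changed = false;
      ++iterations;
      for (std::size_t i = 0; i < g.flags.size(); ++i)
        for (std::size_t src : g.edges[i])
          {
            unsigned merged = g.flags[i] & g.flags[src];
            if (merged != g.flags[i])
              {
                g.flags[i] = merged;  // lattice value only ever shrinks
                changed = true;
              }
          }
    }
  return iterations;
}
```

Starting cyclic nodes with all flags set (the optimistic initialization) and only ever clearing bits guarantees termination, since each node can change at most once per flag bit.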
David Edelsohn [Sat, 6 Nov 2021 00:33:45 +0000 (20:33 -0400)]
powerpc: Fix vsx_splat_v4si in 32 bit mode
Tamar's recent patch to teach CSE to perform vector extract exercises
VSX splat more frequently, which exposed a constraint error for the
vsx_splat patterns. The pattern could be created for Power9, but
the "we" constraint only provided alternatives in 64 bit mode. The
instructions are valid in 32 bit mode and SImode is allowed in VSX
registers. This patch updates the constraints from "we" to "wa" to
allow the pattern and fix the failing testcases.
gcc/ChangeLog:
* config/rs6000/vsx.md (vsx_splat_v4si): Change constraints to "wa".
(vsx_splat_v4si_di): Change constraint to "wa".
First, lb_17 == _134 because of the PHI.
Second, _134 > M.10_120 because of _134 = M.10_120 + 1.
We then assume that lb_75 > M.10_120, but this is incorrect because
M.10_120 was killed along the path.
This incorrect thread causes the miscompilation in 527.cam4_r.
Tested on x86-64 and ppc64le Linux.
gcc/ChangeLog:
PR tree-optimization/103061
* value-relation.cc (path_oracle::path_oracle): Initialize
m_killed_defs.
(path_oracle::killing_def): Set m_killed_defs.
(path_oracle::query_relation): Do not look at the root oracle for
killed defs.
* value-relation.h (class path_oracle): Add m_killed_defs.
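The effect of tracking killed defs can be sketched with a miniature oracle. mini_path_oracle and its members are hypothetical. It records only "greater than" facts and is far simpler than the real path_oracle.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical miniature of a root oracle plus a per-path oracle.
struct mini_path_oracle
{
  using relation = std::pair<std::string, std::string>;  // first > second

  std::set<relation> root_greater;    // facts that hold globally
  std::set<relation> path_greater;    // facts registered along the path
  std::set<std::string> killed_defs;  // names redefined along the path

  // NAME is redefined on the path: older facts about it are stale.
  void killing_def(const std::string& name)
  {
    killed_defs.insert(name);
    for (auto it = path_greater.begin(); it != path_greater.end(); )
      if (it->first == name || it->second == name)
        it = path_greater.erase(it);
      else
        ++it;
  }

  bool greater(const std::string& a, const std::string& b) const
  {
    if (path_greater.count(std::make_pair(a, b)))
      return true;
    // Do not consult the root oracle for names killed along the path.
    if (killed_defs.count(a) || killed_defs.count(b))
      return false;
    return root_greater.count(std::make_pair(a, b)) > 0;
  }
};
```

In the case described above, the root fact _134 > M.10_120 stops being reported once M.10_120 has been redefined on the path.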
Aldy Hernandez [Thu, 4 Nov 2021 18:44:15 +0000 (19:44 +0100)]
Cleanup back_threader::find_paths_to_names.
The main path discovery function was due for a cleanup. First,
there's a nagging goto and second, my bitmap use was sloppy. Hopefully
this makes the code easier for others to read.
Regstrapped on x86-64 Linux. I also made sure there were no differences
in the number of threads with this patch.
No functional changes.
gcc/ChangeLog:
* tree-ssa-threadbackward.c (back_threader::find_paths_to_names):
Remove gotos and other cleanups.