Richard Biener [Fri, 11 Jun 2021 11:33:17 +0000 (13:33 +0200)]
tree-optimization/101026 - fix SLP re-association
Since we cannot yet encode the operation in the SLP node itself
but need a representative stmt, require an existing one for now
to avoid the need to build a fake GIMPLE stmt.
2021-06-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/101026
* tree-vect-slp.c (vect_build_slp_tree_2): Make sure we
have a representative for the associated chain nodes.
Jakub Jelinek [Fri, 11 Jun 2021 10:59:43 +0000 (12:59 +0200)]
simplify-rtx: Fix up simplify_logical_relational_operation for vector IOR [PR101008]
simplify_relational_operation callees typically return just const0_rtx
or const_true_rtx and then simplify_relational_operation attempts to fix
that up if the comparison result has vector mode, or floating mode,
or punt if it has scalar mode and vector mode operands (it doesn't know how
exactly to deal with the scalar masks).
But, simplify_logical_relational_operation has a special case, where
it attempts to fold (x < y) | (x >= y) etc. and if it determines it is
always true, it just returns const_true_rtx, without doing the dances that
simplify_relational_operation does.
That results in an ICE on the following testcase, where such folding happens
during expansion (of debug stmts into DEBUG_INSNs) and we ICE because
all of a sudden a VOIDmode rtx appears where it expects a vector (V4SImode)
rtx.
The following patch fixes that by moving the adjustment into a separate
helper routine and using it from both simplify_relational_operation and
simplify_logical_relational_operation.
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101008
* simplify-rtx.c (relational_result): New function.
(simplify_logical_relational_operation,
simplify_relational_operation): Use it.
This regressed the following testcase with -msse -mno-sse2.
The define_insn_and_split splits the permutation into *vec_concat<mode>_0
or *vec_concatv2di_0 insns which both have TARGET_SSE2 in their
conditions (for the former you can see it above), but the
define_insn_and_split matches whenever the V mode's condition does,
which for V16QI/V8HI/V4SI/V2DI/V4SF modes is always (well, when those
modes are valid, which is TARGET_SSE).
Uros Bizjak [Fri, 11 Jun 2021 10:31:42 +0000 (12:31 +0200)]
i386: Try to avoid variable permutation instruction [PR101021]
Some permutations can be implemented without costly PSHUFB instruction, e.g.:
{ 8,9,10,11,12,13,14,15, 0,1,2,3,4,5,6,7 } with PALIGNR,
{ 0,1,2,3, 4,5,6,7, 4,5,6,7, 12,13,14,15 } with PSHUFD,
{ 0,1, 2,3, 2,3, 6,7, 8,9,10,11,12,13,14,15 } with PSHUFLW and
{ 0,1,2,3,4,5,6,7, 8,9, 10,11, 10,11, 14,15 } with PSHUFHW.
All these instructions have constant shuffle control mask and do not
need to load shuffle mask from a memory to a temporary XMM register.
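The PSHUFD example above works because each group of four byte indices selects one whole 32-bit element. A minimal sketch of that check, assuming a hypothetical helper named widen_perm (this is not GCC's canonicalize_vector_int_perm, only an illustration of the idea):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not GCC code: try to re-express a permutation over
// N narrow elements as a permutation over N/2 elements of twice the width.
// This succeeds only when every adjacent pair of indices is (2k, 2k+1),
// i.e. the pair selects both halves of one wider element, in order.
bool widen_perm(const std::vector<int>& perm, std::vector<int>& wide) {
    wide.clear();
    if (perm.empty() || perm.size() % 2 != 0)
        return false;
    for (std::size_t i = 0; i < perm.size(); i += 2) {
        if (perm[i] < 0 || perm[i] % 2 != 0 || perm[i + 1] != perm[i] + 1)
            return false;
        wide.push_back(perm[i] / 2);
    }
    return true;
}
```

Applying widen_perm twice to the 16-byte PSHUFD permutation listed above gives a 4-element permutation over 32-bit elements, which is the form a constant-control shuffle can encode.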
2021-06-11 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/101021
* config/i386/i386-expand.c (expand_vec_perm_pshufb): Return
false if the permutation can be implemented with constant
permutation instruction in wider mode.
(canonicalize_vector_int_perm): Move above expand_vec_perm_pshufb.
Handle V8QImode and V4HImode.
gcc/testsuite/
PR target/101021
* gcc.target/i386/pr101021-1.c: New test.
* gcc.target/i386/pr101021-2.c: Ditto.
Martin Liska [Tue, 1 Jun 2021 13:13:18 +0000 (15:13 +0200)]
Introduce -Wcoverage-invalid-line-number
PR gcov-profile/100788
gcc/ChangeLog:
* common.opt: Add new option.
* coverage.c (coverage_begin_function): Emit warning instead of
the internal compiler error.
* doc/invoke.texi: Document the option.
* toplev.c (process_options): Enable it by default.
Richard Biener [Fri, 11 Jun 2021 07:33:58 +0000 (09:33 +0200)]
middle-end/101009 - fix distance vector recording
This fixes recording of distance vectors in case the DDR has just
constant equal indexes. In that case we expect distance vectors
with zero distances to be recorded which is what was done when
any distance was computed for affine indexes.
2021-06-11 Richard Biener <rguenther@suse.de>
PR middle-end/101009
* tree-data-ref.c (build_classic_dist_vector_1): Make sure
to set *init_b to true when we encounter a constant equal
index pair.
(compute_affine_dependence): Also dump the actual DR_REF.
Kewen Lin [Fri, 11 Jun 2021 07:43:40 +0000 (02:43 -0500)]
rs6000: Support more short/char to float conversion
In some cases, when we load unsigned char/short values from
the appropriate unsigned char/short memories and convert them to
double/single precision floating point values, there are
implicit conversions to int first. This keeps GCC from leveraging the
P9 instructions lxsibzx/lxsihzx. This patch adds the related
define_insn_and_split to support this kind of scenario.
Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.
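For illustration, the kind of source the paragraph above describes might look like the following. The function names are made up for this sketch, and the intermediate int cast is written out explicitly to show the implicit conversion the message refers to:

```cpp
// Hypothetical examples, not taken from the patch or its testsuite.
// The narrow unsigned value is loaded from memory, widened to int,
// and only then converted to floating point.
double load_uchar_as_double(const unsigned char* p) {
    return static_cast<double>(static_cast<int>(*p));
}

float load_ushort_as_float(const unsigned short* p) {
    return static_cast<float>(static_cast<int>(*p));
}
```

Since the loaded values are zero-extended and always non-negative, the intermediate int step does not change the result.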
gcc/ChangeLog:
* config/rs6000/rs6000.md
(floatsi<SFDF:mode>2_lfiwax_<QHI:mode>_mem_zext): New
define_insn_and_split.
Marek Polacek [Wed, 9 Jun 2021 19:18:39 +0000 (15:18 -0400)]
c++: Extend std::is_constant_evaluated in if warning [PR100995]
Jakub pointed me at
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1938r3.html#compiler-warnings>
which shows that our existing warning could be extended to handle more
cases. This patch implements that.
A minor annoyance was handling macros, in libstdc++ we have
During deduction, when the template of the argument for a bound ttp
is a template template parameter, we need to consider the
TEMPLATE_TEMPLATE_PARAMETER for matching rather than the TEMPLATE_DECL
thereof, because the canonical form of a template template parameter as
a template argument is the former tree, not the latter.
PR c++/67829
gcc/cp/ChangeLog:
* pt.c (unify) <case BOUND_TEMPLATE_TEMPLATE_PARM>: When
the TEMPLATE_DECL of a BOUND_TEMPLATE_TEMPLATE_PARM argument is
a template template parameter, adjust to the
TEMPLATE_TEMPLATE_PARAMETER before falling through.
gcc/testsuite/ChangeLog:
* g++.dg/template/ttp34.C: New test.
* g++.dg/template/ttp34a.C: New test.
* g++.dg/template/ttp34b.C: New test.
Patrick Palka [Thu, 10 Jun 2021 22:31:18 +0000 (18:31 -0400)]
c++: normalization of non-templated return-type-req [PR100946]
Here the satisfaction cache is conflating the satisfaction value of the
two return-type-requirements because the corresponding constrained
'auto's have level 2, but they capture an empty current_template_parms.
This ultimately causes the satisfaction cache to think the type
constraint doesn't depend on the deduced type of the expression.
When normalizing the constraints on an 'auto', the assumption made by
normalize_placeholder_type_constraints is that the level of the 'auto'
is one greater than the depth of the captured current_template_parms, an
assumption which is not holding here. So this patch just makes n_p_t_c
adjust the normalization context appropriately in this situation.
PR c++/100946
gcc/cp/ChangeLog:
* constraint.cc (normalize_placeholder_type_constraints): When
normalizing a non-templated return-type-requirement, add a dummy
level to initial_parms.
Peter Bergner [Thu, 10 Jun 2021 18:54:12 +0000 (13:54 -0500)]
rs6000: Add new __builtin_vsx_build_pair and __builtin_mma_build_acc built-ins
The __builtin_vsx_assemble_pair and __builtin_mma_assemble_acc built-ins
currently assign their first source operand to the first VSX register
in a pair/quad, their second operand to the second register in a pair/quad, etc.
This is not endian friendly and forces the user to generate different calls
depending on endianness. In agreement with the POWER LLVM team, we've
decided to lightly deprecate the assemble built-ins and replace them with
"build" built-ins that automatically handle endianness so the same built-in
call can be used for both little-endian and big-endian compiles. We are not
removing the assemble built-ins, since there is code in the wild that uses
them, but we are removing their documentation to encourage the use of the
new "build" variants.
gcc/
* config/rs6000/rs6000-builtin.def (build_pair): New built-in.
(build_acc): Likewise.
* config/rs6000/rs6000-call.c (mma_expand_builtin): Swap assemble
source operands in little-endian mode.
(rs6000_gimple_fold_mma_builtin): Handle VSX_BUILTIN_BUILD_PAIR.
(mma_init_builtins): Likewise.
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Handle endianness
ordering for the MMA assemble and build source operands.
* doc/extend.texi (__builtin_vsx_build_pair, __builtin_mma_build_acc):
Document.
(__builtin_mma_assemble_acc, __builtin_vsx_assemble_pair): Remove
documentation.
Iain Buclaw [Thu, 10 Jun 2021 17:59:23 +0000 (19:59 +0200)]
d: Fix ICE in TypeInfoDeclaration, at dmd/declaration.c (PR100967)
Generate a stub TypeInfo class even if the root Object class is missing.
The front-end will take care of issuing an error and aborting the
compilation when running semantic on constructed TypeInfo objects.
The errors issued by the code generation pass relating to missing or
disabled RTTI have been consolidated into a single function, so that a
meaningful error will be emitted before the front-end terminates.
gcc/d/ChangeLog:
PR d/100967
* d-frontend.cc (getTypeInfoType): Move TypeInfo checks to
check_typeinfo_type and call new function.
* d-tree.h (check_typeinfo_type): Declare.
* typeinfo.cc: Include dmd/scope.h.
(create_frontend_tinfo_types): Generate front-end types even if Object
is missing.
(build_typeinfo): Move TypeInfo checks to check_typeinfo_type and call
new function.
(check_typeinfo_type): New function.
Aldy Hernandez [Thu, 10 Jun 2021 07:20:30 +0000 (09:20 +0200)]
Use auto_vec in ssa_equiv_stack.
There is a mismatch between the new and the delete for the
ssa_equiv_stack class. The correct idiom should have been delete[].
It has been pointed out that perhaps a better alternative is to use
an auto_vec which does everything automatically. Plus, it is more
consistent with m_stack which is already an auto_vec.
This patch fixes the issue in PR100984.
Tested on x86-64 Linux.
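As a rough illustration of the idiom, here is a toy table that owns its storage through a container instead of a raw new[] allocation. std::vector stands in for GCC's internal auto_vec, and the class and member names are invented for this sketch:

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in, not the real ssa_equiv_stack.  Because the storage is a
// container member, no user-written destructor is needed and there is no
// new[]/delete pairing to get wrong.
class equiv_table {
public:
    explicit equiv_table(std::size_t n) : m_replacements(n, -1) {}

    void set(std::size_t i, int v) { m_replacements.at(i) = v; }
    int get(std::size_t i) const { return m_replacements.at(i); }

private:
    std::vector<int> m_replacements;  // released automatically
};
```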
gcc/ChangeLog:
PR tree-optimization/100984
* gimple-ssa-evrp.c (ssa_equiv_stack): Use auto_vec for
replacements table.
(ssa_equiv_stack::~ssa_equiv_stack): Remove.
Andrew Stubbs [Fri, 25 Sep 2020 15:22:47 +0000 (16:22 +0100)]
OpenACC: Separate enter/exit data ABIs
Move the OpenACC enter and exit data directives from using a single builtin to
having one each. For most purposes it was easy to tell which was which, from
the clauses given, but it's overhead we can easily avoid, and there may be
future uses where that isn't possible.
Aldy Hernandez [Thu, 10 Jun 2021 11:03:33 +0000 (13:03 +0200)]
Adjust variable names and comments in value-query.*
Now that range_of_expr can take arbitrary tree expressions, not just
SSA names or constants, the method names and comments are slightly out
of date. This patch adjusts them to reflect reality.
Jakub Jelinek [Thu, 10 Jun 2021 07:46:08 +0000 (09:46 +0200)]
testsuite: Uncomment __cpp_consteval test
The __cpp_consteval macro and corresponding test have been initially
commented out because the consteval support didn't have virtual consteval
method support. The r11-1789-ge6321c4508b2a85c21246c1c06a8208e2a151e48
change enabled the macro but didn't enable the corresponding test.
Jakub Jelinek [Thu, 10 Jun 2021 07:28:27 +0000 (09:28 +0200)]
ifcvt: Fix -fcompare-debug bug [PR100852]
The following testcase fails -fcompare-debug, because it is ifcvt optimized
into umin only with -g0 and not with -g - the function(s) use
prev_nonnote_insn, which without -g finds a real insn the code is looking
for, while with -g finds a DEBUG_INSN.
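The difference can be modelled with a toy instruction list. The types and functions below are simplified stand-ins for illustration, not the RTL API:

```cpp
#include <vector>

enum class insn_kind { real, note, debug };

// Toy model: index of the nearest earlier insn that is not a note, or -1.
int prev_nonnote(const std::vector<insn_kind>& insns, int from) {
    for (int i = from - 1; i >= 0; --i)
        if (insns[i] != insn_kind::note)
            return i;
    return -1;
}

// Toy model: same walk, but debug insns are skipped as well, so the
// result does not depend on whether debug info is being generated.
int prev_nonnote_nondebug(const std::vector<insn_kind>& insns, int from) {
    for (int i = from - 1; i >= 0; --i)
        if (insns[i] != insn_kind::note && insns[i] != insn_kind::debug)
            return i;
    return -1;
}
```

With a debug insn sitting between two real insns, the first walk stops at the debug insn while the second reaches the real one, which is the -g versus -g0 divergence described above.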
2021-06-10 Jakub Jelinek <jakub@redhat.com>
PR debug/100852
* ifcvt.c (noce_get_alt_condition, noce_try_abs): Use
prev_nonnote_nondebug_insn instead of prev_nonnote_insn.
Andrew Pinski [Sun, 6 Jun 2021 04:25:58 +0000 (21:25 -0700)]
Fix PR 100925: Limit some a?CST1:CST2 optimizations to integral types only
The problem here is that with offset (and pointer) types we produce
a negative expression when this optimization hits.
It is easier to disable this optimization for all non-integral types
instead of finding an integer type which has the same precision as the
type to do the negative expression on it.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
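One instance of such a transformation, written out by hand for an integral type. This is a sketch of the equivalence only, with invented function names, not the match.pd pattern itself:

```cpp
// For an integral result type, selecting between -1 and 0 on a boolean
// is the same as negating the boolean's 0/1 value.
int select_form(bool a) { return a ? -1 : 0; }
int negated_form(bool a) { return -static_cast<int>(a); }
```

The negation is well-defined for int; for pointer and offset types there is no directly usable negation, which is why the rewritten form is restricted to integral types.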
gcc/ChangeLog:
PR tree-optimization/100925
* match.pd (a ? CST1 : CST2): Limit transformations
that would produce a negative to integral types only.
Change !POINTER_TYPE_P to INTEGRAL_TYPE_P also.
Iain Buclaw [Wed, 9 Jun 2021 17:37:22 +0000 (19:37 +0200)]
d: Respect explicit align(N) type alignment (PR100935)
It was previously the natural type alignment, defined as the maximum of
the field alignments for an aggregate. Make sure an explicit align(N)
overrides it.
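The same rule can be shown with a C++ analogue, using alignas in place of D's align(N). This is only an analogy in another language, not the D front-end code:

```cpp
// Natural alignment: at least that of the most strictly aligned field.
struct natural_align {
    char c;
    int i;
};

// An explicit alignment request overrides the natural alignment.
struct alignas(16) explicit_align {
    char c;
    int i;
};

static_assert(alignof(natural_align) >= alignof(int),
              "natural alignment follows the fields");
static_assert(alignof(explicit_align) == 16,
              "explicit alignment takes precedence");
```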
Paul Eggert [Wed, 9 Jun 2021 16:25:26 +0000 (12:25 -0400)]
Document that -fno-trampolines is for Ada only [PR100735]
gcc/
PR other/100735
* doc/invoke.texi (Code Gen Options); Document that -fno-trampolines
and -ftrampolines work only with Ada.
* doc/tm.texi.in (Trampolines): Likewise.
* doc/tm.texi: Regenerated.
* gcc.target/powerpc/int_128bit-runnable.c (extsd2q): Update expected
count.
Add tests for vec_signextq.
* gcc.target/powerpc/p9-sign_extend-runnable.c: New test case.
Carl Love [Wed, 21 Apr 2021 22:07:39 +0000 (18:07 -0400)]
Conversions between 128-bit integer and floating point values.
The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of
fixkfti.c and fixunskfti.c respectively to do the conversions in software.
The function names in the files were updated with the rename as well as
some white spaces fixes. The file float128-p10.c contains the functions
for using the ISA 3.1 hardware instructions to perform the conversions.
Carl Love [Mon, 7 Jun 2021 21:06:04 +0000 (16:06 -0500)]
rs6000, Fix arguments in altivec_vrlwmi and altivec_rlwdi builtins
2021-06-07 Carl Love <cel@us.ibm.com>
gcc/
* config/rs6000/altivec.md (altivec_vrl<VI_char>mi): Fix
bug in argument generation.
gcc/testsuite/
* gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
New runnable test case.
* gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
for xxlor instruction.
Christophe Lyon [Wed, 9 Jun 2021 16:07:43 +0000 (16:07 +0000)]
arm: Auto-vectorization for MVE: vclz
This patch adds support for auto-vectorization of clz for MVE.
It does so by removing the unspec from mve_vclzq_<supf><mode> and uses
'clz' instead. It moves the neon_vclz<mode> expander from neon.md to
vec-common.md and renames it into the standard name clz<mode>2.
Christophe Lyon [Wed, 9 Jun 2021 16:00:01 +0000 (16:00 +0000)]
arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd
This patch adds support for auto-vectorization of average value
computation using vhadd or vrhadd, for both MVE and Neon.
The patch adds the needed [u]avg<mode>3_[floor|ceil] patterns to
vec-common.md, I'm not sure how to factorize them without introducing
an unspec iterator?
It also adds tests for 'floor' and for 'ceil', each for MVE and Neon.
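Loops of roughly the following shape are what such patterns are meant to catch. This is a sketch with invented function names, not the contents of the new tests:

```cpp
#include <cstddef>
#include <cstdint>

// Floor average: the sum is formed in int, so it cannot overflow the
// 8-bit element type before the shift.
void avg_floor(const std::uint8_t* a, const std::uint8_t* b,
               std::uint8_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>((a[i] + b[i]) >> 1);
}

// Ceil (rounding) average: add one before halving.
void avg_ceil(const std::uint8_t* a, const std::uint8_t* b,
              std::uint8_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>((a[i] + b[i] + 1) >> 1);
}
```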
gcc/testsuite/
* gcc.target/arm/simd/mve-vhadd-1.c: New test.
* gcc.target/arm/simd/mve-vhadd-2.c: New test.
* gcc.target/arm/simd/neon-vhadd-1.c: New test.
* gcc.target/arm/simd/neon-vhadd-2.c: New test.
Aaron Sawdey [Mon, 24 May 2021 19:51:05 +0000 (14:51 -0500)]
Fix p10 fusion test cases for -m32
The counts of fusion insns are slightly different for 32-bit compiles
so we need different scan-assembler-times counts for 32 and 64 bit
in the test cases for p10 fusion.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/fusion-p10-2logical.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-addadd.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update fused insn
counts to test 32 and 64 bit separately.
* gcc.target/powerpc/fusion-p10-logadd.c: Update fused insn
counts to test 32 and 64 bit separately.
The following fixes the SLP FMA patterns to preserve reduction
info and the reduction vectorization to consider internal function
call defs for the reduction stmt.
2021-06-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/100981
gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
gimple_get_lhs to also handle calls.
* tree-vect-slp-patterns.c (complex_pattern::build): Transfer
reduction info.
gcc/testsuite/
* gfortran.dg/vect/pr100981-1.f90: New testcase.
libgomp/
* testsuite/libgomp.fortran/pr100981-2.f90: New testcase.
Richard Biener [Wed, 18 Nov 2020 13:17:34 +0000 (14:17 +0100)]
tree-optimization/97832 - handle associatable chains in SLP discovery
This makes SLP discovery handle associatable (including mixed
plus/minus) chains better by swapping operands across the whole
chain. To make this work, this adds caching of the 'matches' lanes for
failed SLP discovery attempts, thereby fixing a failed SLP
discovery for the slp-pr98855.cc testcase which results in
building an operand from scalars as expected. Unfortunately
this makes us trip over the cost threshold so I'm XFAILing the
testcase for now.
For BB vectorization all this doesn't work because we have no way
to distinguish good from bad associations as we eventually build
operands from scalars and thus not fail in the classical sense.
2021-05-31 Richard Biener <rguenther@suse.de>
PR tree-optimization/97832
* tree-vectorizer.h (_slp_tree::failed): New.
* tree-vect-slp.c (_slp_tree::_slp_tree): Initialize
failed member.
(_slp_tree::~_slp_tree): Free failed.
(vect_build_slp_tree): Retain failed nodes and record
matches in them, copying that back out when running
into a cached fail. Dump start and end of discovery.
(dt_sort_cmp): New.
(vect_build_slp_tree_2): Handle associatable chains
together doing more aggressive operand swapping.
* config/obj-elf.c (special_section): Add .init_array,
.fini_array and .preinit_array.
* config/tc-ia64.h (ELF_TC_SPECIAL_SECTIONS): Remove
.init_array and .fini_array.
gcc/
PR target/100896
* config.gcc (gcc_cv_initfini_array): Set to yes for Linux and
GNU targets.
* doc/install.texi: Require glibc 2.1 and binutils 2.12 for
Linux and GNU targets.
Jonathan Wakely [Wed, 9 Jun 2021 10:03:15 +0000 (11:03 +0100)]
libstdc++: Fix constraint on std::optional assignment [PR 100982]
libstdc++-v3/ChangeLog:
PR libstdc++/100982
* include/std/optional (optional::operator=(const optional<U>&)):
Fix value category used in is_assignable check.
* testsuite/20_util/optional/assignment/100982.cc: New test.
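Why the value category matters can be seen with a type that accepts assignment from a const lvalue but not from an rvalue. The types below are illustrative and are not claimed to be the contents of the new test:

```cpp
#include <type_traits>

struct U {};

struct T {
    explicit T(const U&) {}
    T& operator=(const U&) { return *this; }
    T& operator=(U&&) = delete;  // assignment from an rvalue U is rejected
};

// Asking about an rvalue U gives a different answer than asking about
// const U&, which is what assigning from a const optional<U>& would use.
static_assert(!std::is_assignable<T&, U>::value,
              "rvalue U selects the deleted overload");
static_assert(std::is_assignable<T&, const U&>::value,
              "const lvalue U is assignable");
```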
The ARC processor can use the LP instruction to implement zero overhead loops.
The current implementation doesn't handle the unlikely situation when
the loop iterator is located in memory. Refurbish the loop_end insn
pattern into a define_insn_and_split pattern.
Rework the (u)maddhisi4 patterns and use VMAC2H(U) instruction instead
of the 64bit MAC(U) instruction.
This fixes the next execute.exp failures:
arith-rand-ll.c -O2 execution test
arith-rand-ll.c -O3 execution test
pr78726.c -O2 execution test
pr78726.c -O3 execution test
ARCv2HS can use a limited number of instructions to implement 64bit
moves. The VADD2 is used as a 64bit move, the LDD/STD are 64 bit loads
and stores. None of those instructions are baseline, hence we need to
provide alternatives when they are not available or cannot be generated
due to instruction restrictions.
This patch is cleaning up those move patterns, and updates splits
instruction lengths.
Uros Bizjak [Wed, 9 Jun 2021 07:46:00 +0000 (09:46 +0200)]
i386: Do not emit segment overrides for %p and %P [PR100936]
Using %p to move the address of a symbol using LEA:
asm ("lea %p1, %0" : "=r"(addr) : "m"(var));
emits an assembler warning when VAR is declared in a non-generic address space:
Warning: segment override on `lea' is ineffectual
The problem is with %p operand modifier, which should emit raw symbol name:
p -- print raw symbol name.
A similar problem exists with the %P modifier, trying to CALL or JMP to an
overridden symbol, e.g.:
call %gs:zzz
jmp %gs:zzz
emits assembler warnings:
Warning: skipping prefixes on `call'
Warning: skipping prefixes on `jmp'
Ensure that %p and %P never emit segment overrides.
2021-06-08 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/100936
* config/i386/i386.c (print_operand_address_as): Rename "no_rip"
argument to "raw". Do not emit segment overrides when "raw" is true.
gcc/testsuite/
PR target/100936
* gcc.target/i386/pr100936.c: New test.
Andrew MacLeod [Tue, 8 Jun 2021 19:43:03 +0000 (15:43 -0400)]
Virtualize fur_source and turn it into a proper API.
No more accessing the local info. Also add fur_source/fold_stmt variants where
ranges are specified explicitly or provided via a vector, to replace gimple_fold_range.
* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Use a
fur_stmt source record.
* gimple-range.cc (fur_source::get_operand): Generic range query.
(fur_source::get_phi_operand): New.
(fur_source::register_dependency): New.
(fur_source::query): New.
(class fur_edge): New. Edge source for operands.
(fur_edge::fur_edge): New.
(fur_edge::get_operand): New.
(fur_edge::get_phi_operand): New.
(fur_edge::query): New.
(fur_stmt::fur_stmt): New.
(fur_stmt::get_operand): New.
(fur_stmt::get_phi_operand): New.
(fur_stmt::query): New.
(class fur_depend): New. Statement source and process dependencies.
(fur_depend::fur_depend): New.
(fur_depend::register_dependency): New.
(class fur_list): New. List source for operands.
(fur_list::fur_list): New.
(fur_list::get_operand): New.
(fur_list::get_phi_operand): New.
(fold_range): New. Instantiate appropriate fur_source class and fold.
(fold_using_range::range_of_range_op): Use new API.
(fold_using_range::range_of_address): Ditto.
(fold_using_range::range_of_phi): Ditto.
(gimple_ranger::fold_range_internal): Use fur_depend class.
(fold_using_range::range_of_ssa_name_with_loop_info): Use new API.
* gimple-range.h (class fur_source): Now a base class.
(class fur_stmt): New.
(fold_range): New prototypes.
(fur_source::fur_source): Delete.
Jason Merrill [Tue, 8 Jun 2021 21:48:49 +0000 (17:48 -0400)]
c++: remove redundant warning [PR100879]
Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.
PR c++/100879
gcc/c-family/ChangeLog:
* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.
Thomas Rodgers [Tue, 8 Jun 2021 22:51:53 +0000 (15:51 -0700)]
libstdc++: Fix wrong param type in atomic_ref<_Tp*>::wait [PR100889]
libstdc++-v3/ChangeLog:
PR libstdc++/100889
* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend
coverage of types tested.
Marek Polacek [Mon, 7 Jun 2021 20:06:00 +0000 (16:06 -0400)]
c++: explicit() ignored on deduction guide [PR100065]
When we have explicit() with a value-dependent argument, we can't
evaluate it at parsing time, so cp_parser_function_specifier_opt stashes
the argument into the decl-specifiers and grokdeclarator then stores it
into explicit_specifier_map, which is then used when substituting the
function decl. grokdeclarator stores it for constructors and conversion
functions, but we also need to do it for deduction guides, otherwise
we'll forget that we've seen an explicit-specifier as in the attached
test.
PR c++/100065
gcc/cp/ChangeLog:
* decl.c (grokdeclarator): Store a value-dependent
explicit-specifier even for deduction guides.
Andrew Pinski [Tue, 1 Jun 2021 06:48:05 +0000 (06:48 +0000)]
Improve match_simplify_replacement in phi-opt
This improves match_simplify_replacement in phi-opt to handle the
case where there is one cheap (non-call) preparation statement in the
middle basic block, similar to xor_replacement and others.
This allows removing xor_replacement, which this patch also does.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
Changes since v1:
v3 - Just minor changes to using gimple_assign_lhs
instead of gimple_lhs and fixing a comment.
v2 - change the check on the preparation statement to
allow only assignments and no calls and only assignments
that feed into the phi.
gcc/ChangeLog:
PR tree-optimization/25290
* tree-ssa-phiopt.c (xor_replacement): Delete.
(tree_ssa_phiopt_worker): Delete use of xor_replacement.
(match_simplify_replacement): Allow one cheap preparation
statement that can be moved to before the if.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~
happens on the outside of the bit_xor.
David Malcolm [Tue, 8 Jun 2021 18:45:57 +0000 (14:45 -0400)]
analyzer: bitfield fixes [PR99212]
This patch verifies the previous fix for bitfield sizes by implementing
enough support for bitfields in the analyzer to get the test cases to pass.
The patch implements support in the analyzer for reading from a
BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
the cases generated in tests.
The existing bitfields tests in data-model-1.c turned out to rely on
undefined behavior, in that they were assigning values to a signed
bitfield that were outside of the valid range of values. I believe that
that's why we were seeing target-specific differences in the test
results (PR analyzer/99212). The patch updates the test to remove the
undefined behaviors.
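To give a feel for what folding a BIT_AND_EXPR with a mask involves, here is a minimal sketch of turning a mask into a bit range. The struct and function names are hypothetical and this is not the analyzer's bit_range::from_mask:

```cpp
#include <cstdint>

struct bit_run {
    int start;  // index of the lowest set bit
    int size;   // number of consecutive set bits
};

// Hypothetical helper: succeeds only if the mask is a single contiguous
// run of set bits, and reports where that run starts and how long it is.
bool run_from_mask(std::uint64_t mask, bit_run& out) {
    if (mask == 0)
        return false;
    int start = 0;
    while ((mask & 1) == 0) { mask >>= 1; ++start; }
    int size = 0;
    while ((mask & 1) != 0) { mask >>= 1; ++size; }
    if (mask != 0)
        return false;  // a second run of set bits follows
    out.start = start;
    out.size = size;
    return true;
}
```

For instance, a mask of 0x30 describes a 2-bit field starting at bit 4, while 0x5 does not describe a single field.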
gcc/analyzer/ChangeLog:
PR analyzer/99212
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add support for folding
BIT_AND_EXPR of compound_svalue and a mask constant.
* region-model.cc (region_model::get_rvalue_1): Implement
BIT_FIELD_REF in terms of...
(region_model::get_rvalue_for_bits): New function.
* region-model.h (region_model::get_rvalue_for_bits): New decl.
* store.cc (bit_range::from_mask): New function.
(selftest::test_bit_range_intersects_p): New selftest.
(selftest::assert_bit_range_from_mask_eq): New.
(ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
(selftest::assert_no_bit_range_from_mask_eq): New.
(ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
(selftest::test_bit_range_from_mask): New selftest.
(selftest::analyzer_store_cc_tests): Call the new selftests.
* store.h (bit_range::intersects_p): New.
(bit_range::from_mask): New decl.
(concrete_binding::get_bit_range): New accessor.
(store_manager::get_concrete_binding): New overload taking
const bit_range &.
gcc/testsuite/ChangeLog:
PR analyzer/99212
* gcc.dg/analyzer/bitfields-1.c: New test.
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
explicitly signed.
(test_44): Update test values assigned to the bits to ones that
fit in the range of the bitfield type. Remove xfails.
(test_45): Remove xfails.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 8 Jun 2021 18:45:07 +0000 (14:45 -0400)]
analyzer: fix region::get_bit_size for bitfields
gcc/analyzer/ChangeLog:
* analyzer.h (int_size_in_bits): New decl.
* region.cc (int_size_in_bits): New function.
(region::get_bit_size): Reimplement in terms of the above.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 8 Jun 2021 18:43:48 +0000 (14:43 -0400)]
analyzer: split out struct bit_range from class concrete_binding
gcc/analyzer/ChangeLog:
* store.cc (concrete_binding::dump_to_pp): Move bulk of
implementation to...
(bit_range::dump_to_pp): ...this new function.
(bit_range::cmp): New.
(concrete_binding::overlaps_p): Update for use of bit_range.
(concrete_binding::cmp_ptr_ptr): Likewise.
* store.h (struct bit_range): New.
(class concrete_binding): Replace fields m_start_bit_offset and
m_size_in_bits with new field m_bit_range.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jason Merrill [Tue, 8 Jun 2021 13:19:58 +0000 (09:19 -0400)]
c++: braced-list overload resolution [PR100963]
My PR969626 patch made us ignore template candidates when there's a perfect
non-template candidate. In this case, we were considering B(int) a perfect
match for B({0}), but the brace elision makes it imperfect.
Jeff Law [Tue, 8 Jun 2021 14:10:23 +0000 (10:10 -0400)]
Further improve redundant test/compare removal on the H8
gcc/
* config/h8300/logical.md (andqi3_1): Move BCLR case into define_insn_and_split.
Create length attribute on define_insn_and_split. Only split for cases which we
know will use AND.
(andqi3_1<cczn>): Renamed from andqi3_1_clobber_flags. Only handle AND here and
fix length computation.
(b<code><mode>msx): Combine QImode and HImode H8/SX patterns using iterator.
Richard Biener [Tue, 8 Jun 2021 10:52:12 +0000 (12:52 +0200)]
tree-optimization/100923 - fix alias-ref construction wrt availability
This PR shows that building an ao_ref from value-numbers is prone to
expose bogus contextual alias info to the oracle. The following makes
sure to construct ao_refs from SSA names available at the program point
only.
On the way it modifies the awkward valueize_refs[_1] API.
2021-06-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/100923
* tree-ssa-sccvn.c (valueize_refs_1): Take a pointer to
the operand vector to be valueized.
(valueize_refs): Likewise.
(valueize_shared_reference_ops_from_ref): Adjust.
(valueize_shared_reference_ops_from_call): Likewise.
(vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise. Re-valueize
with honoring availability when we are about to create
the ao_ref and valueized before.
(vn_reference_lookup): Likewise.
(vn_reference_insert_pieces): Adjust.
Aldy Hernandez [Fri, 4 Jun 2021 18:25:20 +0000 (20:25 +0200)]
Implement a context aware pointer equivalency class for use in evrp.
The substitute_and_fold_engine which evrp uses is expecting symbolics
from value_of_expr / value_on_edge / etc, which ranger does not provide.
In some cases, these provide important folding cues, as in the case of
aliases for pointers. For example, legacy evrp may return [&foo, &foo]
for the value of "bar" where bar is on an edge where bar == &foo, or
when bar has been globally set to &foo. This information is then used
by the subst & fold engine to propagate the known value of bar.
Currently this is a major source of discrepancies between evrp and
ranger. Of the 284 cases legacy evrp is getting over ranger, 237 are
for pointer equality as discussed above.
This patch implements a context aware pointer equivalency class which
ranger-evrp can use to query what an SSA pointer is currently
equivalent to. With it, we reduce the 284 cases legacy evrp is getting
to 47.
The API for the pointer equivalency analyzer is the following:
class pointer_equiv_analyzer
{
public:
pointer_equiv_analyzer (gimple_ranger *r);
~pointer_equiv_analyzer ();
void enter (basic_block);
void leave (basic_block);
void visit_stmt (gimple *stmt);
tree get_equiv (tree ssa) const;
...
};
The enter(), leave(), and visit_stmt() methods are meant to be called
from a DOM walk. At any point throughout the walk, one can call
get_equiv() to get whatever an SSA is equivalent to.
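The enter/leave discipline can be pictured with a toy scoped table. Strings stand in for trees, and all names here are invented for the sketch; this is not the pointer_equiv_analyzer implementation:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model: equivalences recorded after enter() are undone by the
// matching leave(), mimicking entering and leaving a block in a DOM walk.
class scoped_equiv {
public:
    void enter() { m_marks.push_back(m_undo.size()); }

    void leave() {
        if (m_marks.empty())
            return;
        std::size_t mark = m_marks.back();
        m_marks.pop_back();
        while (m_undo.size() > mark) {
            const undo_entry& u = m_undo.back();
            if (u.had_old)
                m_equiv[u.name] = u.old_value;  // restore outer value
            else
                m_equiv.erase(u.name);          // nothing known before
            m_undo.pop_back();
        }
    }

    void set_equiv(const std::string& name, const std::string& value) {
        std::map<std::string, std::string>::const_iterator it = m_equiv.find(name);
        undo_entry u;
        u.name = name;
        u.had_old = (it != m_equiv.end());
        if (u.had_old)
            u.old_value = it->second;
        m_undo.push_back(u);
        m_equiv[name] = value;
    }

    // Returns the empty string when nothing is known about NAME.
    std::string get_equiv(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = m_equiv.find(name);
        return it == m_equiv.end() ? std::string() : it->second;
    }

private:
    struct undo_entry {
        std::string name;
        std::string old_value;
        bool had_old;
    };
    std::map<std::string, std::string> m_equiv;
    std::vector<undo_entry> m_undo;
    std::vector<std::size_t> m_marks;
};
```

A query made inside a block sees the equivalences recorded on the path to that block, and they disappear again once the walk leaves it.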
Tested on x86-64 Linux with a regular bootstrap/tests and by comparing
EVRP folds over ranger before and after this patch.
Thomas Schwinge [Fri, 4 Jun 2021 13:31:53 +0000 (15:31 +0200)]
Fix 'libgomp.oacc-fortran/parallel-dims.f90' for 'acc_device_radeon'
..., by simplifying 'libgomp.oacc-c-c++-common/parallel-dims.c', and updating
the former correspondingly. '__builtin_goacc_parlevel_id' does the right thing
for all 'acc_device_*'.