David Malcolm [Fri, 2 Jul 2021 19:19:44 +0000 (15:19 -0400)]
diagnostic-show-locus: tweak rejection logic
gcc/ChangeLog:
* diagnostic-show-locus.c (diagnostic_show_locus): Don't reject
printing the same location twice if there are fix-it hints,
multiple locations, or a label.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 2 Jul 2021 19:19:43 +0000 (15:19 -0400)]
analyzer: fix missing leak after call to strsep [PR100615]
PR analyzer/100615 reports a missing leak diagnostic.
The issue is that the code calls strsep, which the analyzer has no
special knowledge of; it therefore conservatively assumes that the call
could free the pointer, and drops the malloc state for it.
Properly "teaching" the analyzer about strsep would require it
to support bifurcating state at a call, which is currently fiddly to
do, so for now this patch notes that strsep doesn't affect the
malloc state machine, allowing the analyzer to correctly detect the leak.
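For illustration only (a minimal sketch, not the testcase from the PR; first_field_length is a hypothetical helper and strdup/strsep are assumed to be available as on glibc), the buffer below is still allocated after strsep returns, so dropping the final free would be the kind of leak the analyzer can now report:

```cpp
#include <cassert>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper, not taken from the PR: returns the length of the
// first comma-separated field of INPUT.
size_t first_field_length(const char *input)
{
  char *buf = strdup(input);      // heap allocation tracked by the malloc state machine
  if (!buf)
    return 0;
  char *cursor = buf;
  // strsep writes a NUL into the buffer and advances CURSOR, but it never
  // frees BUF, so BUF is still allocated after the call.
  char *tok = strsep(&cursor, ",");
  size_t len = tok ? strlen(tok) : 0;
  free(buf);                      // omitting this line would leak BUF
  return len;
}
```

Calling first_field_length("ab,cd") yields 2; the point of the sketch is only that the allocation outlives the strsep call.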
gcc/analyzer/ChangeLog:
PR analyzer/100615
* sm-malloc.cc: Include "analyzer/function-set.h".
(malloc_state_machine::on_stmt): Call unaffected_by_call_p and
bail on the functions it recognizes.
(malloc_state_machine::unaffected_by_call_p): New.
gcc/testsuite/ChangeLog:
PR analyzer/100615
* gcc.dg/analyzer/pr100615.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Fri, 2 Jul 2021 19:19:43 +0000 (15:19 -0400)]
analyzer: fix ICE on NULL change.m_expr [PR100244]
PR analyzer/100244 reports an ICE on a -Wanalyzer-free-of-non-heap
due to a case where free_of_non_heap::describe_state_change can be
passed a NULL change.m_expr for a suitably complicated symbolic value.
Bulletproof it by checking for change.m_expr being NULL before
dereferencing it.
gcc/analyzer/ChangeLog:
PR analyzer/100244
* sm-malloc.cc (free_of_non_heap::describe_state_change):
Bulletproof against change.m_expr being NULL.
gcc/testsuite/ChangeLog:
PR analyzer/100244
* g++.dg/analyzer/pr100244.C: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Eric Botcazou [Fri, 2 Jul 2021 08:21:11 +0000 (10:21 +0200)]
Change EH pointer encodings to PC relative on Windows
A big difference between ELF and PE-COFF is that, with the latter, you can
build position-independent executables or DLLs without generating PIC; as
a matter of fact, flag_pic has historically been forced to 0 for 32-bit:
  /* Don't allow flag_pic to propagate since gas may produce invalid code
     otherwise.  */
  do {                                  \
    flag_pic = TARGET_64BIT ? 1 : 0;    \
  } while (0)
The reason is that the linker builds a .reloc section that collects the
absolute relocations in the generated binary, and the loader uses them to
relocate it at load time if need be (e.g. if --dynamicbase is enabled).
Up to binutils 2.35, the GNU linker didn't build the .reloc section for
executables and defaulted to --enable-auto-image-base for DLLs, which means
that DLLs had an essentially unique load address and, therefore, need not
be relocated by the loader in most cases.
With binutils 2.36 and later, the GNU linker builds a .reloc section for
executables (thus making them PIE), --enable-auto-image-base is disabled
and --dynamicbase is enabled by default, which means that essentially all
the binaries are relocated at load time.
This badly breaks the 32-bit compiler configured to use DWARF-2 EH because
the loader corrupts the .eh_frame section when processing the relocations
contained in the .reloc section.
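As a rough sketch of why a PC-relative encoding sidesteps load-time relocation (the DW_EH_PE_* values are the usual .eh_frame pointer-encoding constants; encoded_value and survives_rebase are hypothetical helpers, and the addresses in the usage note are made up): a PC-relative field stores the difference between two addresses in the image, which does not change when the whole image moves, whereas an absolute field does.

```cpp
#include <cassert>
#include <cstdint>

// DW_EH_PE_* pointer-encoding values as used in .eh_frame.
constexpr std::uint8_t DW_EH_PE_absptr = 0x00;
constexpr std::uint8_t DW_EH_PE_sdata4 = 0x0b;
constexpr std::uint8_t DW_EH_PE_pcrel  = 0x10;

// Hypothetical model: the value stored in a pointer field located at
// FIELD_ADDR that refers to TARGET, for the given encoding.
std::int64_t encoded_value(std::uint8_t encoding, std::uint64_t target,
                           std::uint64_t field_addr)
{
  if ((encoding & 0x70) == DW_EH_PE_pcrel)
    return static_cast<std::int64_t>(target - field_addr);
  return static_cast<std::int64_t>(target);
}

// True if the stored value stays valid when the whole image moves by DELTA.
bool survives_rebase(std::uint8_t encoding, std::uint64_t target,
                     std::uint64_t field_addr, std::uint64_t delta)
{
  return encoded_value(encoding, target, field_addr)
         == encoded_value(encoding, target + delta, field_addr + delta);
}
```

With a target at 0x401000 and a field at 0x403000, the PC-relative value is -0x2000 for any load base, while the absolute value changes with the base and so needs an entry in .reloc.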
gcc/
* config/i386/i386.c (asm_preferred_eh_data_format): Always use the
PIC encodings for PE-COFF targets.
Ian Lance Taylor [Tue, 29 Jun 2021 18:00:13 +0000 (11:00 -0700)]
compiler: in composite literals use temps only for interfaces
For a composite literal we only need to introduce a temporary variable
if we may be converting to an interface type, so only do it then.
This saves over 80% of compilation time when using gccgo to compile
cmd/internal/obj/x86, as the GCC middle-end spends a lot of time
pointlessly computing interactions between temporary variables.
Marek Polacek [Tue, 8 Jun 2021 21:44:13 +0000 (17:44 -0400)]
c++: Failure to delay noexcept parsing with ptr-operator [PR100752]
We weren't passing 'flags' to the recursive call to cp_parser_declarator
in the ptr-operator case and, as a result, delayed parsing of noexcept
didn't work as advertised. The following change passes more than just
CP_PARSER_FLAGS_DELAY_NOEXCEPT but that doesn't seem to break anything.
I'm now also passing member_p and static_p; as a consequence, two tests
needed small tweaks.
PR c++/100752
gcc/cp/ChangeLog:
* parser.c (cp_parser_declarator): Pass flags down to
cp_parser_declarator. Also pass static_p/member_p.
Kewen Lin [Wed, 23 Jun 2021 04:09:30 +0000 (23:09 -0500)]
rs6000: Fix typos in float128 ISA3.1 support
The recent float128 ISA3.1 support (r12-1340) has some typos that
make the libgcc build fail when the binutils assembler doesn't
support Power10 insns. The error
looks like:
This patch does the following:
- fix test target typo libgcc_cv_powerpc_3_1_float128_hw
(wrongly written as libgcc_cv_powerpc_float128_hw, so the ISA3.1
code was built whenever ISA3.0 support was detected).
- fix test used for libgcc_cv_powerpc_3_1_float128_hw check.
- fix test option used for libgcc_cv_powerpc_3_1_float128_hw
check.
- remove the ISA3.1 related contents from t-float128-hw.
- add new macro FLOAT128_HW_INSNS_ISA3_1 to differentiate
ISA3.1 content from ISA3.0 part in ifunc support.
Bootstrapped/regtested on:
- powerpc64le-linux-gnu P10
- powerpc64le-linux-gnu P9 (with and without an assembler supporting p10)
- powerpc64-linux-gnu P8 (with and without an assembler supporting p10)
libgcc/ChangeLog:
PR target/101235
* configure: Regenerate.
* configure.ac (test for libgcc_cv_powerpc_3_1_float128_hw): Fix
typos among the name, CFLAGS and the test.
* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs, fp128_3_1_hw_src,
fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj, fp128_3_1_hw_obj):
Remove.
* config/rs6000/t-float128-p10-hw (FLOAT128_HW_INSNS): Append
macro FLOAT128_HW_INSNS_ISA3_1.
(FP128_3_1_CFLAGS_HW): Fix option typo.
* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): Guard this with
FLOAT128_HW_INSNS_ISA3_1.
(__floattikf_resolve): Likewise.
(__floatuntikf_resolve): Likewise.
(__fixkfti_resolve): Likewise.
(__fixunskfti_resolve): Likewise.
(__floattikf): Likewise.
(__floatuntikf): Likewise.
(__fixkfti): Likewise.
(__fixunskfti): Likewise.
Richard Biener [Thu, 24 Jun 2021 08:47:18 +0000 (10:47 +0200)]
Fix SLP permute propagation error
This fixes SLP permute propagation to not propagate across operations
that have different semantics on different lanes like for example
the recently added COMPLEX_ADD_ROT90.
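A small model of the problem (a sketch under assumptions: Vec, swap_pairs, add and complex_add_rot90 are hypothetical names, and the rot90 operation is modelled as a + i*b on interleaved (re, im) pairs): a lane permutation can be moved across a lane-wise add, but not across an operation whose even and odd lanes compute different things.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec = std::array<int, 4>;   // two complex numbers as (re, im) pairs

// Swap the two lanes of every pair.
Vec swap_pairs(const Vec &v) { return {v[1], v[0], v[3], v[2]}; }

// Lane-wise addition: every lane has the same semantics.
Vec add(const Vec &a, const Vec &b)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Model of a complex add with the second operand rotated by 90 degrees
// (a + i*b): even lanes subtract, odd lanes add.
Vec complex_add_rot90(const Vec &a, const Vec &b)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size(); i += 2)
    {
      r[i] = a[i] - b[i + 1];
      r[i + 1] = a[i + 1] + b[i];
    }
  return r;
}
```

For a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, permuting the inputs of add gives the same result as permuting its output, while for complex_add_rot90 the two results differ, which is why the permute must not be propagated across it.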
2021-06-24 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_optimize_slp): Do not propagate
across operations that have different semantics on different
lanes.
Richard Biener [Tue, 22 Jun 2021 08:14:02 +0000 (10:14 +0200)]
tree-optimization/101151 - fix irreducible region check for sinking
The check for whether two blocks are in the same irreducible region,
which makes post-dominance checks unreliable, was incomplete:
an irreducible region can contain reducible sub-regions, but
if one block is in the irreducible part and the other is not, the
post-dominance check still doesn't work as expected.
2021-06-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/101151
* tree-ssa-sink.c (statement_sink_location): Expand irreducible
region check.
Richard Biener [Wed, 23 Jun 2021 10:43:03 +0000 (12:43 +0200)]
tree-optimization/101105 - fix runtime alias test optimization
We were ignoring DR_STEP for VF == 1 which is OK only in case
the scalar order is preserved or both DR steps are the same.
2021-06-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/101105
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Only ignore steps when they are equal or scalar order is preserved.
Eric Botcazou [Thu, 24 Jun 2021 10:55:27 +0000 (12:55 +0200)]
Emit .file 0 directive earlier in DWARF 5
When the assembler supports it, the compiler automatically passes --gdwarf-5
to it, which has an interesting side effect: any assembly instruction prior
to the first .file directive defines a new line associated with .file 0 in
the .debug_line section and of course the numbering of these implicit lines
has nothing to do with that of the source code. This can be problematic in
Ada when we do not generate .file/.loc directives for compiler-generated
functions to avoid too jumpy a debugging experience.
Eric Botcazou [Thu, 24 Jun 2021 10:53:24 +0000 (12:53 +0200)]
Fix --gdwarf-5 configure tests for Windows
The issues are that 1) they use readelf instead of objdump and 2) they use
ELF syntax in the assembly code.
gcc/
* configure.ac (--gdwarf-5 option): Use objdump instead of readelf.
(working --gdwarf-4/--gdwarf-5 for all sources): Likewise.
(--gdwarf-4 not refusing generated .debug_line): Adjust for Windows.
* configure: Regenerate.
Aaron Sawdey [Tue, 22 Jun 2021 21:02:15 +0000 (16:02 -0500)]
Do not enable pcrel-opt by default
Backported from trunk.
SPEC2017 testing on p10 shows that this optimization does not have a
positive impact on performance. So we are no longer going to enable it
by default. The test cases for it needed to be updated so they always
enable it to test it.
gcc/
* config/rs6000/rs6000-cpus.def: Take OPTION_MASK_PCREL_OPT out
of OTHER_POWER10_MASKS so it will not be enabled by default.
gcc/testsuite/
* gcc.target/powerpc/pcrel-opt-inc-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-df.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-si.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: Enable -mpcrel-opt to
test it.
* gcc.target/powerpc/pcrel-opt-st-df.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-di.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-hi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-qi.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-sf.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-si.c: Enable -mpcrel-opt to test it.
* gcc.target/powerpc/pcrel-opt-st-vector.c: Enable -mpcrel-opt to
test it.
Michael Meissner [Wed, 23 Jun 2021 19:00:16 +0000 (15:00 -0400)]
Backport patch from master branch.
Add IEEE 128-bit min/max support on PowerPC.
This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions. The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.
This patch does not try to re-use the code used for SF/DF min/max
support. It defines a separate insn for the IEEE 128-bit support. It
uses the code iterator <minmax> to simplify adding both operations.
GCC will not convert ternary operations into using min/max instructions
provided in this patch unless the user uses -Ofast. The next patch that adds
conditional move instructions will enable the ternary conversion in many cases.
gcc/
2021-06-23 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
3.1 IEEE 128-bit floating point xsmaxcqp/xsmincqp instructions.
* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
New insns.
gcc/testsuite/
2021-06-23 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/float128-minmax-2.c: New test.
Uros Bizjak [Wed, 23 Jun 2021 10:50:53 +0000 (12:50 +0200)]
i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
The current RTX pattern for BSR allows the combine pass to convert an LZCNT
insn to BSR. Note that LZCNT has defined behavior when the operand is zero
(it returns the operand size), whereas BSR does not.
Add a BSR-specific setting of the zero flag to the RTX pattern of the BSR insn
in order to avoid matching unwanted combinations.
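The semantic difference can be modelled in software (a sketch only; lzcnt32 and bsr32 are hypothetical helpers, not compiler built-ins): the two instructions agree up to the usual 31 - n mapping for nonzero inputs, but only LZCNT produces a defined result for zero.

```cpp
#include <cassert>
#include <cstdint>

// Software model of 32-bit LZCNT: defined for every input, returns 32 for 0.
unsigned lzcnt32(std::uint32_t x)
{
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
    ++n;
  return n;
}

// Software model of 32-bit BSR: for a zero input the hardware sets ZF and
// leaves the destination undefined, so this model returns false and does
// not touch INDEX.
bool bsr32(std::uint32_t x, unsigned &index)
{
  if (x == 0)
    return false;
  index = 31 - lzcnt32(x);
  return true;
}
```

Replacing lzcnt32 by an expression built on bsr32 is therefore only valid when the operand is known to be nonzero.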
Jakub Jelinek [Wed, 23 Jun 2021 08:03:28 +0000 (10:03 +0200)]
openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167]
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.
2021-06-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101167
* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
Jakub Jelinek [Mon, 21 Jun 2021 11:30:42 +0000 (13:30 +0200)]
inline-asm: Fix ICE with bitfields in "m" operands [PR100785]
Bitfields, while they live in memory, aren't something inline-asm can easily
operate on.
For C and "=m" or "+m", we were diagnosing bitfields in the past in the
FE, where c_mark_addressable had:
case COMPONENT_REF:
if (DECL_C_BIT_FIELD (TREE_OPERAND (x, 1)))
{
error
("cannot take address of bit-field %qD", TREE_OPERAND (x, 1));
return false;
}
but that check got moved in GCC 6 to build_unary_op instead and now we
emit an error during expansion and ICE afterwards (i.e. error-recovery).
For "m" it used to be diagnosed in c_mark_addressable too, but since
GCC 6 it is ice-on-invalid.
For C++, this was never diagnosed in the FE, but used to be diagnosed
in the gimplifier and/or during expansion before 4.8.
The following patch does multiple things:
1) diagnoses it in the FEs
2) simplifies the inline asm during expansion if any errors have been
reported (similarly to how e.g. the vregs pass, if it detects errors on
inline-asm, either deletes them or simplifies them to the bare minimum -
just labels), so that we don't have error-recovery ICEs there
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR inline-asm/100785
gcc/
* cfgexpand.c (expand_asm_stmt): If errors are emitted,
remove all inputs, outputs and clobbers from the asm and
set template to "".
gcc/c/
* c-typeck.c (c_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/cp/
* typeck.c (cxx_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/testsuite/
* c-c++-common/pr100785.c: New test.
Thomas Rodgers [Tue, 22 Jun 2021 17:59:07 +0000 (10:59 -0700)]
libstdc++: Fix for deadlock in std::counting_semaphore [PR100806]
libstdc++-v3/ChangeLog:
PR libstdc++/100806
* include/bits/semaphore_base.h (__atomic_semaphore::_M_release):
Force _M_release() to wake all waiting threads.
* testsuite/30_threads/semaphore/100806.cc: New test.
* gcc.target/powerpc/int_128bit-runnable.c (extsd2q): Update expected
count.
Add tests for vec_signextq.
* gcc.target/powerpc/p9-sign_extend-runnable.c: New test case.
Carl Love [Wed, 21 Apr 2021 22:07:39 +0000 (18:07 -0400)]
Conversions between 128-bit integer and floating point values.
The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of
fixkfti.c and fixunskfti.c respectively to do the conversions in software.
The function names in the files were updated with the rename, along with
some whitespace fixes. The file float128-p10.c contains the functions
for using the ISA 3.1 hardware instructions to perform the conversions.
Carl Love [Tue, 15 Jun 2021 16:24:56 +0000 (11:24 -0500)]
rs6000, Fix arguments in altivec_vrlwmi and altivec_rlwdi builtins
2021-06-07 Carl Love <cel@us.ibm.com>
gcc/
* config/rs6000/altivec.md (altivec_vrl<VI_char>mi): Fix
bug in argument generation.
gcc/testsuite/
* gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
New runnable test case.
* gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
for xxlor instruction.
Jason Merrill [Thu, 17 Jun 2021 19:31:15 +0000 (15:31 -0400)]
c++: deleted after first declaration [PR101106]
An explicitly deleted function must be deleted on its first declaration. We
were diagnosing this error only with -Wpedantic, but always giving the
"previous declaration" note. For GCC 11, keep the -Wpedantic dependency,
just make the note depend on the previous diagnostic.
PR c++/101106
gcc/cp/ChangeLog:
* decl.c (duplicate_decls): Condition note on return value of pedwarn.
Jason Merrill [Tue, 8 Jun 2021 21:48:49 +0000 (17:48 -0400)]
c++: remove redundant warning [PR100879]
Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.
PR c++/100879
gcc/c-family/ChangeLog:
* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.
Aaron Sawdey [Fri, 18 Jun 2021 17:47:03 +0000 (12:47 -0500)]
Fix p10 fusion regtests
Backported from trunk.
Update the count of matches for the fusion combine patterns after
the recent changes to them. At Segher's request, used \m and \M
in the match patterns. Also I have grouped together all alternatives of
each fusion insn, which should hopefully make this test a little less
fragile.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/fusion-p10-2logical.c: Update pattern
match counts.
* gcc.target/powerpc/fusion-p10-addadd.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-logadd.c: Update pattern match
counts.
Jonathan Wakely [Fri, 18 Jun 2021 13:46:58 +0000 (14:46 +0100)]
libstdc++: Replace incorrect static assertion in std::reduce [PR95833]
The standard does not require the iterator's value type to be
convertible to the result type; it only requires that the result of
dereferencing the iterator can be passed to the binary function.
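A sketch of a call that relies on this (not the testcase from the PR; len and total_length are hypothetical helpers, and the example assumes C++17 and a libstdc++ that contains this fix): std::string is not convertible to std::size_t, yet every argument combination can be passed to the binary operation and its result converts to the result type.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helpers: the "length" of an accumulated count or of a string.
inline std::size_t len(std::size_t n) { return n; }
inline std::size_t len(const std::string &s) { return s.size(); }

// Sums the lengths of all strings. The binary operation accepts any mix of
// std::size_t and std::string arguments and always returns std::size_t.
std::size_t total_length(const std::vector<std::string> &words)
{
  return std::reduce(words.begin(), words.end(), std::size_t{0},
                     [](const auto &x, const auto &y) { return len(x) + len(y); });
}
```

With the old static assertion such a call was rejected at compile time even though it meets the requirements described above.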
libstdc++-v3/ChangeLog:
PR libstdc++/95833
* include/std/numeric (reduce(Iter, Iter, T, BinaryOp)): Replace
incorrect static_assert with ones matching the 'Mandates'
conditions in the standard.
* testsuite/26_numerics/reduce/95833.cc: New test.
arm: Fix multilib mapping for CDE extensions [PR100856].
When a +cdecp[0-7] extension is passed in the -march string on the command line,
multilib linking fails as described in PR100856. This patch fixes the issue by
generating a separate canonical string, which drops from the march string the compiler
options that are not required for multilib linking, and assigning it to the new mlibarch
option. The mlibarch string is then used for multilib comparison.
PR target/100856
* common/config/arm/arm-common.c (arm_canon_arch_option_1): New function
derived from arm_canon_arch.
(arm_canon_arch_option): Call it.
(arm_canon_arch_multilib_option): New function.
* config/arm/arm-cpus.in (IGNORE_FOR_MULTILIB): New fgroup.
* config/arm/arm.h (arm_canon_arch_multilib_option): New prototype.
(CANON_ARCH_MULTILIB_SPEC_FUNCTION): New macro.
(MULTILIB_ARCH_CANONICAL_SPECS): New macro.
(DRIVER_SELF_SPECS): Add MULTILIB_ARCH_CANONICAL_SPECS.
* config/arm/arm.opt (mlibarch): New option.
* config/arm/t-rmprofile (MULTILIB_MATCHES): For armv8*-m, replace use
of march on RHS with mlibarch.
arm: Fix polymorphic variants failing with undefined reference to `__ARM_undef` error.
This patch fixes the issue described in PR101016: MVE polymorphic variants
fail at link time with an undefined reference to "__ARM_undef".
arm: Fix the mve multilib for the broken cmse support (pr99939).
The current CMSE support in the multilib build for
"-march=armv8.1-m.main+mve -mfloat-abi=hard -mfpu=auto" is broken
as specified in PR99939 and this patch fixes the issue.
PR target/99939
* gcc.target/arm/cmse/cmse-18.c: Add separate scan-assembler
directives that check whether the target is v8.1-m.main+mve
before comparing the assembly output.
* gcc.target/arm/cmse/cmse-20.c: New test.
PR target/99939
* config/arm/cmse_nonsecure_call.S: Add __ARM_FEATURE_MVE
macro.
* config/arm/t-arm: Link cmse.o and cmse_nonsecure_call.o
when the -mcmse option is passed.
Jonathan Wakely [Fri, 18 Jun 2021 10:08:19 +0000 (11:08 +0100)]
libstdc++: Suppress -Wstringop-overread warning in test
When compiled with -m32 -O2 -D_GLIBCXX_USE_CXX11_ABI=0 we get a warning
for 21_strings/basic_string/cons/char/1.cc:
bits/char_traits.h:409:56: warning: ‘void* __builtin_memcpy(void*, const void*, unsigned int)’ reading 1073741821 bytes from a region of size 19 [-Wstringop-overread]
The warning is legitimate, even if that line cannot be reached because
we throw std::length_error before getting there. Since the invalid
length is deliberate (and mentioned in a comment) just suppress the
warning, so that the test can verify we get the exception.
Also remove an unused typedef that produces another warning.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/char/1.cc: Use
diagnostic pragma to suppress -Wstringop-overread warning.
Jonathan Wakely [Thu, 17 Jun 2021 13:11:22 +0000 (14:11 +0100)]
libstdc++: Simplify constexpr checks in std::char_traits [PR 91488]
This removes the 'static' keyword from the helper functions added by
r8-1294 to detect whether the char_traits member functions can be
evaluated at compile time. This prevents the "inlining failed" error
reported in the PR.
The new testcase from the PR is added to the libitm testsuite, because
that's where we can be sure it's OK to use the -fgnu-tm option.
As a drive-by fix, the feature test macros for C++20 P0980R1 support are
made to depend on whether __cpp_lib_is_constant_evaluated is defined.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/91488
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (__cpp_lib_constexpr_string): Only
define C++20 value when std::is_constant_evaluated is available.
* include/bits/char_traits.h (__cpp_lib_constexpr_char_traits):
Likewise.
(__constant_string_p, __constant_array_p): Give external
linkage.
* include/std/version (__cpp_lib_constexpr_char_traits)
(__cpp_lib_constexpr_string): Only define C++20 values when
is_constant_evaluated is available.
libitm/ChangeLog:
* testsuite/libitm.c++/libstdc++-pr91488.C: New test.
Jakub Jelinek [Fri, 18 Jun 2021 09:20:40 +0000 (11:20 +0200)]
stor-layout: Don't create DECL_BIT_FIELD_REPRESENTATIVE for QUAL_UNION_TYPE [PR101062]
> The following patch does create them, but treats all such bitfields as if
> they were in a structure where the particular bitfield is the only field.
While the patch passed bootstrap/regtest on the trunk, when trying to
backport it to 11 branch the bootstrap failed with
atree.ads:3844:34: size for "Node_Record" too small
errors. Turns out the error is not about size being too small, but actually
about size being non-constant, and comes from:
/* In a FIELD_DECL of a RECORD_TYPE, this is a pointer to the storage
representative FIELD_DECL. */
#define DECL_BIT_FIELD_REPRESENTATIVE(NODE) \
(FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
/* For a FIELD_DECL in a QUAL_UNION_TYPE, records the expression, which
if nonzero, indicates that the field occupies the type. */
#define DECL_QUALIFIER(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
so by setting up DECL_BIT_FIELD_REPRESENTATIVE in QUAL_UNION_TYPE we
actually set or modify DECL_QUALIFIER and then construct size as COND_EXPRs
with those bit field representatives (e.g. with array type) as conditions
which doesn't fold into constant.
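The aliasing of the two accessors can be modelled as follows (a simplified sketch; field_decl_model and the two accessor functions are hypothetical stand-ins for the tree.h macros quoted above, which both expand to the same field_decl.qualifier slot):

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of a FIELD_DECL: one slot
// that both accessors below read and write.
struct field_decl_model
{
  const void *qualifier = nullptr;
};

// Models DECL_BIT_FIELD_REPRESENTATIVE (meaningful for RECORD_TYPE fields).
inline const void *&bit_field_representative(field_decl_model &f)
{
  return f.qualifier;
}

// Models DECL_QUALIFIER (meaningful for QUAL_UNION_TYPE fields).
inline const void *&decl_qualifier(field_decl_model &f)
{
  return f.qualifier;
}
```

Storing a representative through the first accessor changes what the second one returns, which is how a bit-field representative ended up being used as a qualifier condition.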
The following patch fixes it by not creating DECL_BIT_FIELD_REPRESENTATIVEs
for QUAL_UNION_TYPE as there is nowhere to store them.
Shall we change tree.h to document that DECL_BIT_FIELD_REPRESENTATIVE
is valid also on UNION_TYPE?
I see:
tree-ssa-alias.c- if (TREE_CODE (type1) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field1))
tree-ssa-alias.c: field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
tree-ssa-alias.c- if (TREE_CODE (type2) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field2))
tree-ssa-alias.c: field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
Shall we change that to || == UNION_TYPE or do we assume all fields
are overlapping in a UNION_TYPE already?
At other spots (asan, ubsan, expr.c) it is unclear what will happen
if they see a QUAL_UNION_TYPE with a DECL_QUALIFIER (or does the Ada FE
lower that somehow)?
Jakub Jelinek [Wed, 16 Jun 2021 10:17:55 +0000 (12:17 +0200)]
stor-layout: Create DECL_BIT_FIELD_REPRESENTATIVE even for bitfields in unions [PR101062]
The following testcase is miscompiled on x86_64-linux, the bitfield store
is implemented as a RMW 64-bit operation at d+24 when the d variable has
size of only 28 bytes and scheduling moves in between the R and W part
a store to a different variable that happens to be right after the d
variable.
The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.
The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.
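The arithmetic behind the miscompilation can be checked with a tiny helper (bytes_past_end is a hypothetical name; the sizes 28, 24 and 8 come from the description above, while the 4-byte access in the usage note is only an illustrative in-bounds width):

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes that an access of WIDTH bytes at OFFSET touches beyond
// an object of OBJECT_SIZE bytes (0 if the access stays inside the object).
constexpr std::size_t bytes_past_end(std::size_t object_size,
                                     std::size_t offset, std::size_t width)
{
  return offset + width > object_size ? offset + width - object_size : 0;
}
```

A 64-bit read-modify-write at d+24 on a 28-byte object gives bytes_past_end(28, 24, 8) == 4, i.e. it rewrites 4 bytes of whatever follows d, whereas a 4-byte access at the same offset stays within the object.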
2021-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_representative): For fields in unions
assume nextf is always NULL.
(finish_bitfield_layout): Compute bit field representatives also in
unions, but handle it as if each bitfield was the only field in the
aggregate.
Patrick Palka [Thu, 17 Jun 2021 13:46:07 +0000 (09:46 -0400)]
libstdc++: Non-triv-copyable extra args aren't simple [PR100940]
This force-enables perfect forwarding call wrapper semantics whenever
the extra arguments of a partially applied range adaptor aren't all
trivially copyable, so as to avoid incurring unnecessary copies of
potentially expensive-to-copy objects (such as std::function objects)
when invoking the adaptor.
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_Partial): For the "simple"
forwarding partial specializations, also require that
the extra arguments are trivially copyable.
* testsuite/std/ranges/adaptors/100577.cc (test04): New test.
Patrick Palka [Thu, 17 Jun 2021 13:46:04 +0000 (09:46 -0400)]
libstdc++: Refine range adaptors' "simple extra args" mechanism [PR100940]
The _S_has_simple_extra_args mechanism is used to simplify forwarding
of range adaptor's extra arguments when perfect forwarding call wrapper
semantics isn't required for correctness, on a per-adaptor basis.
Both views::take and views::drop are flagged as such, but it turns out
perfect forwarding semantics are needed for these adaptors in some
contrived cases, e.g. when their extra argument is a move-only class
that's implicitly convertible to an integral type.
To fix this, we could just clear the flag for views::take/drop as with
views::split, but that'd come at the cost of acceptable diagnostics
for ill-formed uses of these adaptors (see PR100577).
This patch instead allows adaptors to parameterize their
_S_has_simple_extra_args flag according to the types of the captured extra
arguments, so that we could conditionally disable perfect forwarding
semantics only when the types of the extra arguments permit it. We
then use this finer-grained mechanism to safely disable perfect
forwarding semantics for views::take/drop when the extra argument is
integer-like, rather than incorrectly always disabling it. Similarly,
for views::split, rather than always enabling perfect forwarding
semantics we now safely disable it when the extra argument is a scalar
or a view, and recover good diagnostics for these common cases.
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_RangeAdaptor): Document the
template form of _S_has_simple_extra_args.
(__adaptor::__adaptor_has_simple_extra_args): Add _Args template
parameter pack. Try to treat _S_has_simple_extra_args as a
variable template parameterized by _Args.
(__adaptor::_Partial): Pass _Arg/_Args to the constraint
__adaptor_has_simple_extra_args.
(views::_Take::_S_has_simple_extra_args): Templatize according
to the type of the extra argument.
(views::_Drop::_S_has_simple_extra_args): Likewise.
(views::_Split::_S_has_simple_extra_args): Define.
* testsuite/std/ranges/adaptors/100577.cc (test01, test02):
Adjust after changes to _S_has_simple_extra_args mechanism.
(test03): Define.
Peter Bergner [Mon, 14 Jun 2021 21:55:18 +0000 (16:55 -0500)]
rs6000: MMA builtin usage ICEs when used in a #pragma omp parallel and using -fopenmp [PR100777]
Using an MMA builtin within an OpenMP parallel code block leads to an SSA
verification ICE on the temporaries we create while expanding the MMA builtins
at gimple time. The solution is to use create_tmp_reg_or_ssa_name(), which
knows when to create either an SSA or register temporary.
2021-06-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/100777
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Use
create_tmp_reg_or_ssa_name().
gcc/testsuite/
PR target/100777
* gcc.target/powerpc/pr100777.c: New test.
Peter Bergner [Thu, 10 Jun 2021 18:54:12 +0000 (13:54 -0500)]
rs6000: Add new __builtin_vsx_build_pair and __builtin_mma_build_acc built-ins
The __builtin_vsx_assemble_pair and __builtin_mma_assemble_acc built-ins
currently assign their first source operand to the first VSX register
in a pair/quad, their second operand to the second register in a pair/quad, etc.
This is not endian friendly and forces the user to generate different calls
depending on endianness. In agreement with the POWER LLVM team, we've
decided to lightly deprecate the assemble built-ins and replace them with
"build" built-ins that automatically handle endianness so the same built-in
call can be used for both little-endian and big-endian compiles. We are not
removing the assemble built-ins, since there is code in the wild that uses
them, but we are removing their documentation to encourage the use of the
new "build" variants.
gcc/
* config/rs6000/rs6000-builtin.def (build_pair): New built-in.
(build_acc): Likewise.
* config/rs6000/rs6000-call.c (mma_expand_builtin): Swap assemble
source operands in little-endian mode.
(rs6000_gimple_fold_mma_builtin): Handle VSX_BUILTIN_BUILD_PAIR.
(mma_init_builtins): Likewise.
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Handle endianness
ordering for the MMA assemble and build source operands.
* doc/extend.texi (__builtin_mma_build_acc, __builtin_vsx_build_pair):
Document.
(__builtin_mma_assemble_acc, __builtin_vsx_assemble_pair): Remove
documentation.
Peter Bergner [Mon, 31 May 2021 03:45:55 +0000 (22:45 -0500)]
rs6000: MMA test case ICEs using -O3 [PR99842]
The mma_assemble_input_operand predicate does not accept reg+reg indexed
addresses which can lead to ICEs. The lxv and lxvp instructions have
indexed forms (lxvx and lxvpx), so the simple solution is to just allow
indexed addresses in the predicate.
2021-05-30 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/99842
* config/rs6000/predicates.md (mma_assemble_input_operand): Allow
indexed form addresses.
Martin Sebor [Thu, 17 Jun 2021 18:18:53 +0000 (12:18 -0600)]
Backported from trunk:
Teach compute_objsize about placement new [PR100876].
Resolves:
PR c++/100876 - -Wmismatched-new-delete should understand placement new when it's not inlined
gcc/ChangeLog:
PR c++/100876
* builtins.c (gimple_call_return_array): Check for attribute fn spec.
Handle calls to placement new.
(fndecl_dealloc_argno): Avoid placement delete.
gcc/testsuite/ChangeLog:
PR c++/100876
* g++.dg/warn/Wmismatched-new-delete-4.C: New test.
* g++.dg/warn/Wmismatched-new-delete-5.C: New test.
* g++.dg/warn/Wstringop-overflow-7.C: New test.
* g++.dg/warn/Wfree-nonheap-object-6.C: New test.
* g++.dg/analyzer/placement-new.C: Prune out expected warning.
Martin Sebor [Thu, 17 Jun 2021 18:08:15 +0000 (12:08 -0600)]
Backported from trunk:
PR middle-end/100732 - ICE on sprintf %s with integer argument
gcc/ChangeLog:
PR middle-end/100732
* gimple-fold.c (gimple_fold_builtin_sprintf): Avoid folding calls
with either source or destination argument of invalid type.
* tree-ssa-uninit.c (maybe_warn_pass_by_reference): Avoid checking
calls with arguments of invalid type.
gcc/testsuite/ChangeLog:
PR middle-end/100732
* gcc.dg/tree-ssa/builtin-snprintf-11.c: New test.
* gcc.dg/tree-ssa/builtin-snprintf-12.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-28.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-29.c: New test.
* gcc.dg/uninit-pr100732.c: New test.
Aaron Sawdey [Wed, 3 Mar 2021 00:06:37 +0000 (18:06 -0600)]
Backported from trunk:
Fusion patterns for add-logical/logical-add
This patch modifies the function in genfusion.pl for generating
the logical-logical patterns so that it can also generate the
add-logical and logical-add patterns which are very similar.
Also backported from trunk and combined with the add-logical patch
because that revealed problems on gcc-11.
Add needed earlyclobber to fusion patterns
The add-logical and add-add fusion patterns all have constraint
alternatives "=0,1,&r,r" for the output (3). The inputs 0 and 1
are used in the first fusion instruction and then either may be
reused as a temp for the output of the first insn which is
input to the second. However, if input 2 is the same as 0 or 1,
it gets clobbered unexpectedly. So the first 2 alts need to be
"=&0,&1,&r,r" instead to indicate that in alts 0 and 1, the
register used for 3 is earlyclobber, hence can't be the same as
input 2.
This was actually encountered in the backport of the add-logical
fusion patch to gcc-11. Some code in Go hit this case:
<runtime.fillAligned+520>: andc r30,r30,r9
r30 now (~(x|((x&c)+c)))&(~c) --> this is new x
<runtime.fillAligned+524>: b <runtime.fillAligned+288>
<runtime.fillAligned+288>: addi r31,r31,-1
r31 now m-1
<runtime.fillAligned+292>: srd r31,r30,r31
r31 now x>>(m-1)
<runtime.fillAligned+296>: subf r30,r31,r30
r30 now x-(x>>(m-1))
<runtime.fillAligned+300>: or r30,r30,r30 # mdoom
nop
<runtime.fillAligned+304>: not r3,r30
r3 now ~(x-(x>>(m-1))) -- WHOOPS
The or r30,r30,r30 was meant to be or-ing in the earlier value
of r30 which was overwritten by the output of the subf.
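The clobbering described above can be modelled with a toy register file.
This is only an illustrative sketch, not GCC or hardware code: regs,
fused_subf_or and run_pair are hypothetical names, and the fused pair is
simplified to OUT = (IN0 - IN1) | IN2. When OUT is the same register as
IN2, the first instruction overwrites the value the second one still
needs, which is the situation the earlyclobber modifier rules out.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file; purely illustrative.  */
static uint64_t regs[32];

/* Model a fused subf/or pair computing OUT = (IN0 - IN1) | IN2.
   The first insn parks its result in OUT, the second then reads IN2.  */
static uint64_t
fused_subf_or (int out, int in0, int in1, int in2)
{
  assert (out >= 0 && out < 32);
  regs[out] = regs[in0] - regs[in1];	/* subf out,in1,in0 */
  regs[out] = regs[out] | regs[in2];	/* or out,out,in2 */
  return regs[out];
}

/* Put x in r30 and s in r31, then run the pair with the given output
   register.  Input 0 and input 2 are both r30, as in the dump above.  */
static uint64_t
run_pair (int out, uint64_t x, uint64_t s)
{
  regs[30] = x;
  regs[31] = s;
  return fused_subf_or (out, 30, 31, 30);
}
```

With x = 0x10 and s = 0x1, run_pair (29, ...) yields 0x1f, while
run_pair (30, ...) yields 0x0f because r30 no longer holds x when the
"or" executes.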
Combined ChangeLog (needed for the scripts to understand):
gcc/ChangeLog
* config/rs6000/genfusion.pl (gen_logical_addsubf): Refactor to
add generation of logical-add and add-logical fusion pairs. Add
earlyclobber to alts 0/1.
(gen_addadd): Add earlyclobber to alts 0/1.
* config/rs6000/rs6000-cpus.def: Add new fusion to ISA 3.1 mask
and powerpc mask.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Turn on
logical-add and add-logical fusion by default.
* config/rs6000/rs6000.opt: Add -mpower10-fusion-logical-add and
-mpower10-fusion-add-logical options.
* config/rs6000/fusion.md: Regenerate file.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-logadd.c: New file.
Jakub Jelinek [Wed, 16 Jun 2021 11:10:48 +0000 (13:10 +0200)]
testsuite: Use noipa attribute instead of noinline, noclone
I've noticed this test now on various arches sometimes FAILs, sometimes
PASSes (the line 12 test in particular).
The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.
Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).
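A minimal sketch of the shape being discussed, assuming a compiler that
may or may not know noipa (the NO_IPA macro and the caller function are
made up for this illustration and are not the contents of pr49888.c).
Because f is opaque to interprocedural analysis, the caller has to store
a = 0 before the call.

```c
#include <assert.h>

/* Use noipa where the compiler knows it (GCC 8 and later); otherwise fall
   back to noinline so the sketch still builds.  */
#if defined(__has_attribute)
# if __has_attribute(noipa)
#  define NO_IPA __attribute__ ((noipa))
# endif
#endif
#ifndef NO_IPA
# define NO_IPA __attribute__ ((noinline))
#endif

static NO_IPA int
f (int *p)
{
  assert (p != 0);
  int c = *p;	/* c mirrors *p only until the store below */
  *p = 1;
  return c;
}

static int
caller (void)
{
  int a = 0;	/* must really be stored before the call, since f is opaque */
  int c = f (&a);
  return c * 10 + a;
}
```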
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
noinline, noclone.
Jakub Jelinek [Wed, 16 Jun 2021 08:45:27 +0000 (10:45 +0200)]
libffi: Fix up x86_64 classify_argument
As the following testcase shows, libffi's classify_argument didn't properly
handle structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
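The word-count adjustment can be illustrated with a small arithmetic
sketch. The helper names and WORD_BYTES are assumptions made for this
example, and byte_offset is taken to be the offset within the first word.
This is not the libffi source.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_BYTES 8	/* assumed eightbyte size on x86_64 */

/* Old behaviour as described: count words from the size alone.  */
static size_t
words_from_size (size_t size)
{
  return (size + WORD_BYTES - 1) / WORD_BYTES;
}

/* New behaviour as described: also count the bytes skipped at the start.
   BYTE_OFFSET is assumed to be the offset within the first word.  */
static size_t
words_from_size_and_offset (size_t size, size_t byte_offset)
{
  assert (byte_offset < WORD_BYTES);
  return (size + byte_offset + WORD_BYTES - 1) / WORD_BYTES;
}
```

An 8-byte structure starting at byte offset 4 straddles two words, so the
size-only count of 1 is too small and the offset-aware count is 2.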
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
Jakub Jelinek [Tue, 15 Jun 2021 09:36:47 +0000 (11:36 +0200)]
expr: Fix up VEC_PACK_TRUNC_EXPR expansion [PR101046]
The following testcase ICEs, because we have a mode mismatch.
VEC_PACK_TRUNC_EXPR's operands have different modes from the result
(same vector mode size but twice as large element),
but we were passing non-NULL subtarget with the mode of the result
to the expansion of its arguments, so the VEC_PERM_EXPR in one of the
operands, which had V8SImode operands and result, had a V16HImode target.
Fixed by clearing the subtarget if we are changing mode.
2021-06-15 Jakub Jelinek <jakub@redhat.com>
PR target/101046
* expr.c (expand_expr_real_2) <case VEC_PACK_FIX_TRUNC_EXPR,
case VEC_PACK_TRUNC_EXPR>: Clear subtarget when changing mode.
Jakub Jelinek [Fri, 11 Jun 2021 10:59:43 +0000 (12:59 +0200)]
simplify-rtx: Fix up simplify_logical_relational_operation for vector IOR [PR101008]
simplify_relational_operation callees typically return just const0_rtx
or const_true_rtx and then simplify_relational_operation attempts to fix
that up if the comparison result has vector mode, or floating mode,
or punt if it has scalar mode and vector mode operands (it doesn't know how
exactly to deal with the scalar masks).
But, simplify_logical_relational_operation has a special case, where
it attempts to fold (x < y) | (x >= y) etc. and if it determines it is
always true, it just returns const_true_rtx, without doing the dances that
simplify_relational_operation does.
That results in an ICE on the following testcase, where such folding happens
during expansion (of debug stmts into DEBUG_INSNs) and we ICE because
all of a sudden a VOIDmode rtx appears where it expects a vector (V4SImode)
rtx.
The following patch fixes that by moving the adjustment into a separate
helper routine and using it from both simplify_relational_operation and
simplify_logical_relational_operation.
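At the source level, the difference between a scalar "true" and a vector
"true" can be seen with the GNU C vector extensions. This is only a
sketch of why a bare scalar constant cannot stand in for a V4SImode
result. It does not touch the RTL simplifier, and the function names are
invented for the example.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* For integer lanes, (x < y) | (x >= y) is true in every lane.  */
static v4si
lt_or_ge (v4si x, v4si y)
{
  return (x < y) | (x >= y);
}

/* A vector comparison yields -1 (all bits set) per true lane, not 1.  */
static int
lanes_all_ones (v4si m)
{
  for (int i = 0; i < 4; i++)
    if (m[i] != -1)
      return 0;
  return 1;
}

static void
demo (void)
{
  v4si x = { 1, 5, -3, 7 };
  v4si y = { 2, 5, -4, 0 };
  assert (lanes_all_ones (lt_or_ge (x, y)));
}
```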
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101008
* simplify-rtx.c (relational_result): New function.
(simplify_logical_relational_operation,
simplify_relational_operation): Use it.
Jakub Jelinek [Thu, 10 Jun 2021 07:28:27 +0000 (09:28 +0200)]
ifcvt: Fix -fcompare-debug bug [PR100852]
The following testcase fails -fcompare-debug, because it is ifcvt optimized
into umin only with -g0 and not with -g - the function(s) use
prev_nonnote_insn, which without -g finds a real insn the code is looking
for, while with -g finds a DEBUG_INSN.
2021-06-10 Jakub Jelinek <jakub@redhat.com>
PR debug/100852
* ifcvt.c (noce_get_alt_condition, noce_try_abs): Use
prev_nonnote_nondebug_insn instead of prev_nonnote_insn.
Jakub Jelinek [Mon, 7 Jun 2021 07:28:31 +0000 (09:28 +0200)]
fold-const: Fix up fold_read_from_vector [PR100887]
The callers of fold_read_from_vector expect that the index they pass is
an index of an element in the vector and the function does that most of the
time. But we allow CONSTRUCTORs with VECTOR_TYPE to have VECTOR_TYPE
elements and in that case every CONSTRUCTOR element represents not just one
index (with the exception of V1 vectors), but multiple.
So returning zero vector if i >= CONSTRUCTOR_NELTS or returning some
CONSTRUCTOR_ELT's value might not be what the callers expect.
Fixed by punting if the first element has vector type.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
In theory we could instead recurse (and assert that for CONSTRUCTORs of
vector elements we have always all elements specified like tree-cfg.c
verifies?) after adjusting the index appropriately.
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR target/100887
* fold-const.c (fold_read_from_vector): Return NULL if trying to
read from a CONSTRUCTOR with vector type elements.
Jakub Jelinek [Mon, 7 Jun 2021 07:25:37 +0000 (09:25 +0200)]
tree-inline: Fix up __builtin_va_arg_pack handling [PR100898]
The following testcase ICEs, because gimple_call_arg_ptr (..., 0)
asserts that there is at least one argument, while we were using
it even if we didn't copy anything just to get a pointer from/to which
the zero arguments should be copied.
Fixed by guarding the memcpy calls. Also, the code was calling
gimple_call_num_args too many times - 5 times instead of 2, so the patch
adds two temporaries for those.
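The guard can be sketched outside of GCC as follows. struct call,
call_arg_ptr and copy_args are hypothetical stand-ins; call_arg_ptr only
imitates the "index must be in range" assertion attributed above to
gimple_call_arg_ptr.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a call statement's argument array.  */
struct call
{
  unsigned nargs;
  int args[4];
};

/* Asserts that argument I exists, so it must not be used with I == 0
   on a call that has no arguments.  */
static int *
call_arg_ptr (struct call *c, unsigned i)
{
  assert (i < c->nargs);
  return &c->args[i];
}

static void
copy_args (struct call *dst, struct call *src)
{
  unsigned n = src->nargs;	/* query the count once, keep it in a temporary */
  assert (n <= 4);
  dst->nargs = n;
  if (n)	/* only ask for the argument pointers when there is something to copy */
    memcpy (call_arg_ptr (dst, 0), call_arg_ptr (src, 0), n * sizeof (int));
}
```

Without the if (n) guard, copying from a call with zero arguments would
trip the assertion in call_arg_ptr even though memcpy would copy nothing.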
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100898
* tree-inline.c (copy_bb): Only use gimple_call_arg_ptr if memcpy
should copy any arguments. Don't call gimple_call_num_args
on id->call_stmt or call_stmt more than once.
Jakub Jelinek [Sun, 6 Jun 2021 17:37:06 +0000 (19:37 +0200)]
openmp: Call c_omp_adjust_map_clauses even for combined target [PR100902]
When looking at in_reduction support for target, I've noticed that
c_omp_adjust_map_clauses is not called for the combined target case.
The following patch fixes it.
Unfortunately, there are other issues.
One is (also mentioned in the PR) that currently the pointer attachment
stuff seems to be clause ordering dependent (the standard says that clause
ordering on the same construct does not matter), the baz and qux cases
in the PR are rejected, while when swapped they are accepted.
Note, the order of clauses in GCC really is treated as insignificant
initially and only later on the compiler can adjust the ordering (e.g. when
we sort map clauses based on what they refer to etc.) and in particular,
the order of clauses from parsing is the reverse of the order in user code, while
c_omp_split_clauses performed for combined/composite constructs typically
reverses that ordering, i.e. makes it follow the user code ordering.
And another one is I'm slightly afraid c_omp_adjust_map_clauses might
misbehave in templates, though haven't tried to verify it with testcases.
When processing_template_decl, the non-dependent clauses will be handled
usually the same as when not in a template, but dependent clauses aren't
processed or only limited processing is done there, and rest is deferred
till later. From quick skimming of c_omp_adjust_map_clauses, it seems
it might not be very happy about non-processed map clauses that might
still have the TREE_LIST representation of array sections, or might
not have finalized decls or base decls etc.
So, for this I wonder if cp_parser_omp_target (and other cp/parser.c
callers of c_omp_adjust_map_clauses) shouldn't call it only
if (!processing_template_decl) - perhaps you could add
cp_omp_adjust_map_clauses wrapper that would be
if (!processing_template_decl)
c_omp_adjust_map_clauses (...);
- and call c_omp_adjust_map_clauses from within pt.c after the clauses
are tsubsted and finish_omp_clauses is called again.
2021-06-06 Jakub Jelinek <jakub@redhat.com>
PR c/100902
* c-parser.c (c_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.
* parser.c (cp_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.
Jakub Jelinek [Fri, 4 Jun 2021 09:20:02 +0000 (11:20 +0200)]
x86: Fix ix86_expand_vector_init for V*TImode [PR100887]
We have vec_initv4tiv2ti and vec_initv2titi patterns which call
ix86_expand_vector_init and assume it works for those modes. For the
case of construction from two half-sized vectors, the code assumes it
will always succeed, but we have only insn patterns with SImode and DImode
element types. QImode and HImode element types are already handled
by performing it with same sized vectors with SImode elements and the
following patch extends that to V*TImode vectors.
2021-06-04 Jakub Jelinek <jakub@redhat.com>
PR target/100887
* config/i386/i386-expand.c (ix86_expand_vector_init): Handle
concatenation from half-sized modes with TImode elements.
Jason Merrill [Wed, 16 Jun 2021 20:09:59 +0000 (16:09 -0400)]
c++: static memfn from non-dependent base [PR101078]
After my patch for PR91706, or before that with the qualified call,
tsubst_baselink returned a BASELINK with BASELINK_BINFO indicating a base of
a still-dependent derived class. We need to look up the relevant base binfo
in the substituted class.
PR c++/101078
gcc/cp/ChangeLog:
* pt.c (tsubst_baselink): Update binfos in non-dependent case.
Jason Merrill [Mon, 14 Jun 2021 21:37:43 +0000 (17:37 -0400)]
libcpp: location comparison within macro [PR100796]
The patch for 96391 changed linemap_compare_locations to give up on
comparing locations from macro expansions if we don't have column
information. But in this testcase, the BOILERPLATE macro is multiple lines
long, so we do want to compare locations within the macro. So this patch
moves the LINE_MAP_MAX_LOCATION_WITH_COLS check inside the block, to use it
for failing gracefully.
PR c++/100796
PR preprocessor/96391
libcpp/ChangeLog:
* line-map.c (linemap_compare_locations): Only use comparison with
LINE_MAP_MAX_LOCATION_WITH_COLS to avoid abort.
gcc/testsuite/ChangeLog:
* g++.dg/plugin/location-overflow-test-pr100796.c: New test.
* g++.dg/plugin/plugin.exp: Run it.
Jason Merrill [Fri, 11 Jun 2021 20:55:30 +0000 (16:55 -0400)]
c++: constexpr and array[0] [PR101029]
build_vec_init_elt exits early if we're initializing a zero-element array,
so build_vec_init needs to do the same to avoid trying to instantiate things
after we've already started throwing important bits away.
Richard Biener [Fri, 11 Jun 2021 07:33:58 +0000 (09:33 +0200)]
middle-end/101009 - fix distance vector recording
This fixes recording of distance vectors in case the DDR has just
constant equal indexes. In that case we expect distance vectors
with zero distances to be recorded which is what was done when
any distance was computed for affine indexes.
2021-06-11 Richard Biener <rguenther@suse.de>
PR middle-end/101009
* tree-data-ref.c (build_classic_dist_vector_1): Make sure
to set *init_b to true when we encounter a constant equal
index pair.
(compute_affine_dependence): Also dump the actual DR_REF.
The following fixes the SLP FMA patterns to preserve reduction
info and the reduction vectorization to consider internal function
call defs for the reduction stmt.
2021-06-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/100981
gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
gimple_get_lhs to also handle calls.
* tree-vect-slp-patterns.c (complex_pattern::build): Transfer
reduction info.
gcc/testsuite/
* gfortran.dg/vect/pr100981-1.f90: New testcase.
libgomp/
* testsuite/libgomp.fortran/pr100981-2.f90: New testcase.
Richard Biener [Mon, 14 Jun 2021 12:57:26 +0000 (14:57 +0200)]
tree-optimization/100934 - properly mark irreducible regions for DOM
The jump threading code requires marked irreducible regions for the
purpose of validating jump threading paths but DOM fails to provide
that, resulting in missed number of iteration upper bounds clearing.
2021-06-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/100934
* tree-ssa-dom.c (pass_dominator::execute): Properly
mark irreducible regions.