Jakub Jelinek [Sat, 26 Sep 2020 08:07:41 +0000 (10:07 +0200)]
powerpc, libcpp: Fix gcc build with clang on power8 [PR97163]
libcpp has two specialized altivec implementations of search_line_fast,
one for power8+ and the other one otherwise.
Both use __attribute__((altivec(vector))) and the GCC builtins rather than
altivec.h and the APIs from there, which is fine, but should be restricted
to when libcpp is built with GCC, so that it can be relied on.
The second elif is
and thus e.g. when built with clang it isn't picked, but the first one was
just guarded with
and so according to the bugreporter clang fails miserably on that.
The following patch fixes that by adding the same GCC_VERSION requirement
as the second version. I don't know where the 4.5 in there comes from and
the exact version doesn't matter that much, as long as it is above 4.2 that
clang pretends to be and smaller or equal to 4.8 as the oldest gcc we
support as bootstrap compiler ATM.
Furthermore, the patch fixes the comment, the version it is talking about is
not pre-GCC 5, but actually the GCC 5+ one.
2020-09-26 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/97163
* lex.c (search_line_fast): Only use _ARCH_PWR8 Altivec version
for GCC >= 4.5.
Jakub Jelinek [Tue, 22 Sep 2020 19:06:32 +0000 (21:06 +0200)]
c++: Ignore __sanitizer_ptr_{sub,cmp} builtin calls during constant expression evaluation [PR97145]
These two builtin calls are added already during parsing before pointer
subtractions or comparisons, normally they perform runtime verification
of whether the pointers point to the same object or different objects,
but during constant expressione valuation we don't really need those
builtins for anything.
2020-09-22 Jakub Jelinek <jakub@redhat.com>
PR c++/97145
* constexpr.c (cxx_eval_builtin_function_call): Return void_node for
calls to __sanitize_ptr_{sub,cmp} builtins.
Kyrylo Tkachov [Fri, 2 Oct 2020 14:23:19 +0000 (15:23 +0100)]
AArch64: Add neoversev1_tunings struct
This patch adds a Neoverse V1-specific tuning struct that currently is
just a deduplication of the N1 struct it was using before and specifying
the SVE width.
This will allow us to tweak Neoverse V1 things in the future as needed.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.
Before the change gcc did not stream correctly TOPN counters
if counters belonged to a non-local shared object.
As a result zero-section optimization generated TOPN sections
in a form not recognizable by '__gcov_merge_topn'.
The problem happens because in a case of multiple shared objects
'__gcov_merge_topn' function is present in address space multiple
times (once per each object).
The fix is to never rely on function address and predicate on TOPN
counter types.
libgcc/ChangeLog:
PR gcov-profile/96913
* libgcov-driver.c (write_one_data): Avoid function pointer
comparison in TOP streaming decision.
Martin Liska [Thu, 24 Sep 2020 11:34:13 +0000 (13:34 +0200)]
switch conversion: make a rapid speed up
gcc/ChangeLog:
PR tree-optimization/96979
* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
Make a fast bail out.
(bit_test_cluster::can_be_handled): Likewise here.
* tree-switch-conversion.h (get_range): Use wi::to_wide instead
of a folding.
gcc/testsuite/ChangeLog:
PR tree-optimization/96979
* g++.dg/tree-ssa/pr96979.C: New test.
config/i386/t-rtems: Change from mtune to march for multilibs
* config/i386/t-rtems: Change from mtune to march when building
multilibs. The mtune argument tunes or optimizes for a specific
CPU model but does not ensure the generated code is appropriate
for the CPU model. Prior to this patch, i386 compatible code
was always generated but tuned for later models.
Jakub Jelinek [Thu, 1 Oct 2020 09:04:56 +0000 (11:04 +0200)]
s390: Fix up s390_atomic_assign_expand_fenv
The following patch fixes
-FAIL: gcc.dg/pr94780.c (internal compiler error)
-FAIL: gcc.dg/pr94780.c (test for excess errors)
-FAIL: gcc.dg/pr94842.c (internal compiler error)
-FAIL: gcc.dg/pr94842.c (test for excess errors)
on s390x-linux. The fix is essentially the same as has been applied to many
other targets (i386, aarch64, arm, rs6000, alpha, riscv).
2020-10-01 Jakub Jelinek <jakub@redhat.com>
* config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and old_fpc. Formatting fixes.
Joel Hutton [Wed, 30 Sep 2020 15:20:55 +0000 (16:20 +0100)]
[SLP][VECT] Add check to fix 96827
The following patch adds a simple check to prevent slp stmts from
vector constructors being rearranged. vect_attempt_slp_rearrange_stmts
tries to rearrange to avoid a load permutation.
This fixes PR target/96827
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827
gcc/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96827
* tree-vect-slp.c (vect_analyze_slp): Do not call
vect_attempt_slp_rearrange_stmts for vector constructors.
gcc/testsuite/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96827
* gcc.dg/vect/bb-slp-49.c: New test.
Jim Wilson [Wed, 30 Sep 2020 20:06:28 +0000 (13:06 -0700)]
Fix build failure with zstd versio9n 1.2.0 or older.
Extends the configure check for zstd.h to also verify the zstd version,
since gcc requires features that only exist in 1.3.0 and newer. Without
this patch we get a build error for lto-compress.c when using an old zstd
version.
Backported from master:
2020-09-29 Jim Wilson <jimw@sifive.com>
The Linux kernel has defined the cpuinfo string for the +rng feature, so
this patch adds that to GCC so that -march=native can pick it up.
Bootstrapped and tested on aarch64-none-linux-gnu.
H.J. Lu [Wed, 23 Sep 2020 19:11:45 +0000 (12:11 -0700)]
x86: Use SET operation in MOVDIRI and MOVDIR64B
Since MOVDIRI and MOVDIR64B write to memory, similar to UNSPEC_MOVNT,
use SET operation in MOVDIRI and MOVDIR64B patterns with UNSPEC instead
of UNSPECV.
gcc/
PR target/97184
* config/i386/i386.md (UNSPECV_MOVDIRI): Renamed to ...
(UNSPEC_MOVDIRI): This.
(UNSPECV_MOVDIR64B): Renamed to ...
(UNSPEC_MOVDIR64B): This.
(movdiri<mode>): Use SET operation.
(@movdir64b_<mode>): Likewise.
ira: Fix elimination for global hard FPs [PR97054]
If the hard frame pointer is being used as a global register,
we should skip the usual handling for eliminations. As the
comment says, the register cannot in that case be eliminated
(or eliminated to) and is already marked live where appropriate.
Doing this removes the duplicate error for gcc.target/i386/pr82673.c.
The “cannot be used in 'asm' here” message is meant to be for asm
statements rather than register asms, and the function that the
error is reported against doesn't use asm.
gcc/
2020-09-18 Richard Sandiford <richard.sandiford@arm.com>
PR middle-end/97054
* ira.c (ira_setup_eliminable_regset): Skip the special elimination
handling of the hard frame pointer if the hard frame pointer is fixed.
gcc/testsuite/
2020-09-18 H.J. Lu <hjl.tools@gmail.com>
Richard Sandiford <richard.sandiford@arm.com>
PR middle-end/97054
* g++.target/i386/pr97054.C: New test.
* gcc.target/i386/pr82673.c: Remove redundant extra message.
Here, operands[1] is the address of the canary (&__stack_chk_guard)
and operands[2] is the register that we want to move that address into.
However, the code above instead sets operands[2] to the address of a
constant pool entry that contains &__stack_chk_guard, rather than to
&__stack_chk_guard itself. The sequence therefore does one less
pointer indirection than it should.
The net effect was to use &__stack_chk_guard for stack-smash detection,
instead of using __stack_chk_guard itself.
gcc/
* config/arm/arm.md (*stack_protect_combined_set_insn): For non-PIC,
load the address of the canary rather than the address of the
constant pool entry that points to it.
(*stack_protect_combined_test_insn): Likewise.
gcc/testsuite/
* gcc.target/arm/stack-protector-3.c: New test.
* gcc.target/arm/stack-protector-4.c: Likewise.
aarch64: Prevent canary address being spilled to stack
This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886
for aarch64: under high register pressure, the -fstack-protector
code might spill the address of the canary onto the stack and
reload it at the test site, giving an attacker the opportunity
to change the expected canary value.
This would happen in two cases:
- when generating PIC for -mstack-protector-guard=global
(tested by stack-protector-6.c). This is a direct analogue
of PR85434, which was also about PIC for the global case.
- when using -mstack-protector-guard=sysreg.
The two problems were really separate bugs and caused by separate code,
but it was more convenient to fix them together.
The post-patch code still spills _GLOBAL_OFFSET_TABLE_ for
stack-protector-6.c, which is a more general problem. However,
it no longer spills the canary address itself.
The patch also fixes an ICE when using -mstack-protector-guard=sysreg
with ILP32: even if the register read is SImode, the address
calculation itself should still be DImode.
gcc/
* config/aarch64/aarch64-protos.h (aarch64_salt_type): New enum.
(aarch64_stack_protect_canary_mem): Declare.
* config/aarch64/aarch64.md (UNSPEC_SALT_ADDR): New unspec.
(stack_protect_set): Forward to stack_protect_combined_set.
(stack_protect_combined_set): New pattern. Use
aarch64_stack_protect_canary_mem.
(reg_stack_protect_address_<mode>): Add a salt operand.
(stack_protect_test): Forward to stack_protect_combined_test.
(stack_protect_combined_test): New pattern. Use
aarch64_stack_protect_canary_mem.
* config/aarch64/aarch64.c (strip_salt): New function.
(strip_offset_and_salt): Likewise.
(tls_symbolic_operand_type): Use strip_offset_and_salt.
(aarch64_stack_protect_canary_mem): New function.
(aarch64_cannot_force_const_mem): Use strip_offset_and_salt.
(aarch64_classify_address): Likewise.
(aarch64_symbolic_address_p): Likewise.
(aarch64_print_operand): Likewise.
(aarch64_output_addr_const_extra): New function.
(aarch64_tls_symbol_p): Use strip_salt.
(aarch64_classify_symbol): Likewise.
(aarch64_legitimate_pic_operand_p): Use strip_offset_and_salt.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_mov_operand_p): Use strip_salt.
(TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA): Override.
aarch64: Tweaks to the handling of fixed-length SVE types
This patch is really four things rolled into one, since separating
them seemed artificial:
- Update the mangling of the fixed-length SVE ACLE types to match
the upcoming spec. The idea is to mangle:
VLAT __attribute__((arm_sve_vector_bits(N)))
as an instance __SVE_VLS<VLAT, N> of the template:
__SVE_VLS<typename, unsigned>
- Give the fixed-length types their own TYPE_DECL. This is needed
to make the above mangling fix work, but should also be a minor
QoI improvement for error reporting. Unfortunately, the names are
quite verbose, e.g.:
Previously we would complain that the type isn't an SVE type;
now we complain that it isn't a vector type.
- Don't allow arm_sve_vector_bits(N) to be applied to existing
fixed-length SVE types.
gcc/
* config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute):
Take the ACLE name of the type as a parameter and add it as fourth
argument to the "SVE type" attribute.
(register_builtin_types): Update call accordingly.
(register_tuple_type): Likewise. Construct the name of the type
earlier in order to do this.
(get_arm_sve_vector_bits_attributes): New function.
(handle_arm_sve_vector_bits_attribute): Report a more sensible
error message if the attribute is applied to an SVE tuple type.
Don't allow the attribute to be applied to an existing fixed-length
SVE type. Mangle the new type as __SVE_VLS<type, vector-bits>.
Add a dummy TYPE_DECL to the new type.
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: New test.
* g++.target/aarch64/sve/acle/general-c++/mangle_6.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_7.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_8.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_9.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_10.C: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Check the
error messages reported when arm_sve_vector_bits is applied to
SVE tuple types or to existing fixed-length SVE types.
aarch64: Update the mangling of single SVE vectors and predicates
GCC was implementing an old mangling scheme for single SVE
vectors and predicates (based on the Advanced SIMD one).
The final definition instead put them in the vendor built-in
namespace via the "u" prefix.
gcc/
* config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE): Add a
leading "u" to each mangled name.
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Add a leading
"u" to the mangling of each SVE vector and predicate type.
* g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_3.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_5.C: Likewise.
vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no linkage
vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no linkage
vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no linkage
This is because r11-3402 now always declares float64x2 variables for
aarch64, leading to a duplicate declaration in these testcases.
The fix is simply to remove these now useless declarations.
These tests are skipped on arm*, so there is no impact on that target.
This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.
Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.
This patch implements the missing vrndns_f32 intrinsic. This operates on a scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.
This patch does that.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use BUILTIN_VHSDF_HSDF
for modes. Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.
AArch64: Implement missing _p64 intrinsics for vector permutes
This patch implements some missing vector permute intrinsics operating on poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.
Bootstrapped and tested on aarch64-none-linux-gnu.
This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the unsigned types.
Bootstrapped and tested on aarch64-none-linux-gnu.
Eric Botcazou [Mon, 28 Sep 2020 07:00:46 +0000 (09:00 +0200)]
Fix bogus alignment warning on address clause
The compiler gives a bogus alignment warning on an address clause and
a discriminated record type with variable size.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (maybe_saturate_size): Add ALIGN parameter
and round down the result to ALIGN.
(gnat_to_gnu_entity): Adjust calls to maybe_saturate_size.
gcc/testsuite/ChangeLog:
* gnat.dg/addr16.adb: New test.
* gnat.dg/addr16_pkg.ads: New helper.
Jakub Jelinek [Sun, 27 Sep 2020 21:18:26 +0000 (23:18 +0200)]
optabs: Don't reuse target for multi-word expansions if it overlaps operand(s) [PR97073]
The following testcase is miscompiled on i686-linux, because
we try to expand a double-word bitwise logic operation with op0
being a (mem:DI u) and target (mem:DI u+4), i.e. partial overlap, and
thus end up with:
movl 4(%esp), %eax
andl u, %eax
movl %eax, u+4
! movl u+4, %eax optimized out
andl 8(%esp), %eax
movl %eax, u+8
rather than with the desired:
movl 4(%esp), %edx
movl 8(%esp), %eax
andl u, %edx
andl u+4, %eax
movl %eax, u+8
movl %edx, u+4
because the store of the first word to target overwrites the second word of
the operand.
expand_binop for this (and several similar places) already check for target
== op0 or target == op1, this patch just adds reg_overlap_mentioned_p calls
next to it.
Pedantically, at least for some of these it might be sufficient to force
a different target if there is overlap but target is not rtx_equal_p to
the operand (e.g. in this bitwise logical case, but e.g. not in the shift
cases where there is reordering), though that would go against the
preexisting target == op? checks and the rationale that REG_EQUAL notes in
that case isn't correct.
2020-09-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97073
* optabs.c (expand_binop, expand_absneg_bit, expand_unop,
expand_copysign_bit): Check reg_overlap_mentioned_p between target
and operand(s) and if it returns true, force a pseudo as target.
Mark Eggleston [Thu, 11 Jun 2020 13:33:51 +0000 (14:33 +0100)]
Fortran : ICE in build_field PR95614
Local identifiers can not be the same as a module name. Original
patch by Steve Kargl resulted in name clashes between common block
names and local identifiers. A local identifier can be the same as
a global identier if that identifier represents a common. The patch
was modified to allow global identifiers that represent a common
block.
2020-09-27 Steven G. Kargl <kargl@gcc.gnu.org>
Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran/
PR fortran/95614
* decl.c (gfc_get_common): Use gfc_match_common_name instead
of match_common_name.
* decl.c (gfc_bind_idents): Use gfc_match_common_name instead
of match_common_name.
* match.c : Rename match_common_name to gfc_match_common_name.
* match.c (gfc_match_common): Use gfc_match_common_name instead
of match_common_name.
* match.h : Rename match_common_name to gfc_match_common_name.
* resolve.c (resolve_common_vars): Check each symbol in a
common block has a global symbol. If there is a global symbol
issue an error if the symbol type is known as is not a common
block name.
2020-09-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95614
* gfortran.dg/pr95614_1.f90: New test.
* gfortran.dg/pr95614_2.f90: New test.
Add processing STRICT_LOW_PART for matched reloads.
2020-06-04 Vladimir Makarov <vmakarov@redhat.com>
PR middle-end/95464
* lra.c (lra_emit_move): Add processing STRICT_LOW_PART.
* lra-constraints.c (match_reload): Use STRICT_LOW_PART in output
reload if the original insn has it too.
Joe Ramsay [Wed, 19 Aug 2020 12:34:06 +0000 (12:34 +0000)]
arm: Require MVE memory operand for destination of vst1q intrinsic
Previously, the machine description patterns for vst1q accepted a generic memory
operand for the destination, which could lead to an unrecognised builtin when
expanding vst1q* intrinsics. This change fixes the pattern to only accept MVE
memory operands.
PR target/96683
* gcc.target/arm/mve/intrinsics/vst1q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s8.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u8.c: New test.
H.J. Lu [Mon, 14 Sep 2020 15:52:27 +0000 (08:52 -0700)]
rtl_data: Add sp_is_clobbered_by_asm
Add sp_is_clobbered_by_asm to rtl_data to inform backends that the stack
pointer is clobbered by asm statement.
gcc/
PR target/97032
* cfgexpand.c (asm_clobber_reg_kind): Set sp_is_clobbered_by_asm
to true if the stack pointer is clobbered by asm statement.
* emit-rtl.h (rtl_data): Add sp_is_clobbered_by_asm.
* config/i386/i386.c (ix86_get_drap_rtx): Set need_drap to true
if the stack pointer is clobbered by asm statement.
gcc/testsuite/
PR target/97032
* gcc.target/i386/pr97032.c: New test.
Alan Modra [Fri, 18 Sep 2020 13:51:05 +0000 (23:21 +0930)]
[RS6000] Power10 libffi fixes
Power10 pc-relative code doesn't use or preserve r2 as a TOC pointer.
That means calling between pc-relative and TOC using code can't be
done without intervening linker stubs, and a call from TOC code to
pc-relative code must have a nop after the bl in order to restore r2.
Now the PowerPC libffi assembly code doesn't use r2 except for the
implicit use when making calls back to C, ffi_closure_helper_LINUX64
and ffi_prep_args64. So changing the assembly to interoperate with
pc-relative code without stubs is easily done.
PR target/97166
* src/powerpc/linux64.S (ffi_call_LINUX64): Don't emit global
entry when __PCREL__. Call using @notoc. Add nops.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Likewise.
(ffi_go_closure_linux64): Likewise.
Jonathan Wakely [Tue, 22 Sep 2020 19:02:58 +0000 (20:02 +0100)]
libstdc++: Fix out-of-bounds string_view access in filesystem::path [PR 97167]
libstdc++-v3/ChangeLog:
PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.
David Faust [Tue, 22 Sep 2020 18:31:35 +0000 (20:31 +0200)]
bpf: use xBPF signed div, mod insns when available
The 'mod' and 'div' operators in eBPF are unsigned, with no signed
counterpart. xBPF adds two new ALU operations, sdiv and smod, for
signed division and modulus, respectively. Update bpf.md with
'define_insn' blocks for signed div and mod to use them when targetting
xBPF, and add new tests to ensure they are used appropriately.
2020-09-17 David Faust <david.faust@oracle.com>
gcc/
* config/bpf/bpf.md: Add defines for signed div and mod operators.
gcc/testsuite/
* gcc.target/bpf/diag-sdiv.c: New test.
* gcc.target/bpf/diag-smod.c: New test.
* gcc.target/bpf/xbpf-sdiv-1.c: New test.
* gcc.target/bpf/xbpf-smod-1.c: New test.
Jonathan Wakely [Tue, 22 Sep 2020 07:42:18 +0000 (08:42 +0100)]
libstdc++: Use correct argument type for __use_alloc, again [PR 96803]
While backporting 5494edae83ad33c769bd1ebc98f0c492453a6417 I noticed
that it's still not correct. I made the allocator-extended constructor
use the right type for the uses-allocator construction detection, but I
used an rvalue when it should be a const lvalue.
This should fix it properly this time.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Use correct value category in __use_alloc call.
* testsuite/20_util/tuple/cons/96803.cc: Check with constructors
that require correct value category to be used.
Jonathan Wakely [Wed, 26 Aug 2020 18:32:30 +0000 (19:32 +0100)]
libstdc++: Use correct argument type for __use_alloc [PR 96803]
The _Tuple_impl constructor for allocator-extended construction from a
different tuple type uses the _Tuple_impl's own _Head type in the
__use_alloc test. That is incorrect, because the argument tuple could
have a different type. Using the wrong type might select the
leading-allocator convention when it should use the trailing-allocator
convention, or vice versa.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Replace parameter pack with a type parameter and a pack and pass
the first type to __use_alloc.
* testsuite/20_util/tuple/cons/96803.cc: New test.
Jonathan Wakely [Sun, 20 Sep 2020 23:17:02 +0000 (00:17 +0100)]
libstdc++: Fix noexcept-specifier for std::bind_front [PR 97101]
libstdc++-v3/ChangeLog:
PR libstdc++/97101
* include/std/functional (bind_front): Fix order of parameters
in is_nothrow_constructible_v specialization.
* testsuite/20_util/function_objects/bind_front/97101.cc: New test.
Jonathan Wakely [Thu, 10 Sep 2020 14:39:15 +0000 (15:39 +0100)]
libstdc++: handle small max_blocks_per_chunk in pool resources [PR 94160]
When a pool resource is constructed with max_blocks_per_chunk=1 it ends
up creating a pool with blocks_per_chunk=0 which means it never
allocates anything. Instead it returns null pointers, which should be
impossible.
To avoid this problem, round the max_blocks_per_chunk value to a
multiple of four, so it's never smaller than four.
libstdc++-v3/ChangeLog:
PR libstdc++/94160
* src/c++17/memory_resource.cc (munge_options): Round
max_blocks_per_chunk to a multiple of four.
(__pool_resource::_M_alloc_pools()): Simplify slightly.
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc:
Check that valid pointers are returned when small values are
used for max_blocks_per_chunk.
2020-09-20 John David Anglin < danglin@gcc.gnu.org>
gcc/ChangeLog
* config/pa/pa-hpux11.h (LINK_GCC_C_SEQUENCE_SPEC): Delete.
* config/pa/pa64-hpux.h (LINK_GCC_C_SEQUENCE_SPEC): Likewise.
(ENDFILE_SPEC): Link with libgcc_stub.a and mill.a.
* config/pa/pa32-linux.h (ENDFILE_SPEC): Link with libgcc.a.
Marek Polacek [Wed, 16 Sep 2020 13:27:29 +0000 (09:27 -0400)]
preprocessor: Fix ICE with too long line in fmtwarn [PR96935]
Here we ICE in char_span::subspan because the offset it gets is -1.
It's -1 because get_substring_ranges_for_loc gets a location whose
column was 0. That only happens in testcases like the attached where
we're dealing with extremely long lines (at least 4065 chars it seems).
This does happen in practice, though, so it's not just a theoretical
problem (e.g. when building the SU2 suite).
Fixed by checking that the column get_substring_ranges_for_loc gets is
sane, akin to other checks in that function.
gcc/ChangeLog:
PR preprocessor/96935
* input.c (get_substring_ranges_for_loc): Return if start.column
is less than 1.
gcc/testsuite/ChangeLog:
PR preprocessor/96935
* gcc.dg/format/pr96935.c: New test.
Jakub Jelinek [Wed, 16 Sep 2020 07:42:33 +0000 (09:42 +0200)]
store-merging: Consider also overlapping stores earlier in the by bitpos sorting [PR97053]
As the testcases show, if we have something like:
MEM <char[12]> [&b + 8B] = {};
MEM[(short *) &b] = 5;
_5 = *x_4(D);
MEM <long long unsigned int> [&b + 2B] = _5;
MEM[(char *)&b + 16B] = 88;
MEM[(int *)&b + 20B] = 1;
then in sort_by_bitpos the stores are almost like in the given order,
except the first store is after the = _5; store.
We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
while the former INTEGER_CST, and we can't coalesce the = _5 store with
the = {} store because the former is MEM_REF, the latter INTEGER_CST.
But we happily coalesce the remaining 3 stores, which is wrong, because the
= _5; store overlaps those and is in between them in the program order.
We already have code to deal with similar cases in check_no_overlap, but we
deal only with the following stores in sort_by_bitpos order, not the earlier
ones.
The following patch checks also the earlier ones. In coalesce_immediate_stores
it computes the first one that needs to be checked (all the ones whose
bitpos + bitsize is smaller or equal to merged_store->start don't need to be
checked and don't need to be checked even for any following attempts because
of the sort_by_bitpos sorting) and the end of that (that is the first store
in the merged_store).
2020-09-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/97053
* gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
START, FIRST_EARLIER and LAST_EARLIER arguments. Return false if
any stores between FIRST_EARLIER inclusive and LAST_EARLIER exclusive
has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
be merged store.
(imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER argument.
Adjust check_no_overlap caller.
(imm_store_chain_info::coalesce_immediate_stores): Add first_earlier
and last_earlier variables, adjust them during iterations. Adjust
check_no_overlap callers, call check_no_overlap even when extending
overlapping stores by extra INTEGER_CST stores.
* gcc.dg/store_merging_31.c: New test.
* gcc.dg/store_merging_32.c: New test.
Will Schmidt [Wed, 9 Sep 2020 15:59:38 +0000 (10:59 -0500)]
[PATCH,rs6000] Testsuite fixup pr96139 tests
Hi,
As reported, the recently added pr96139 tests will fail on older targets
because the tests are missing the appropriate -mvsx or -maltivec options.
This adds the options and clarifies the dg-require statements.
The pr96139-c.c test needs -maltivec to work, but does not actually use
vectors, so does not require -mvsx like the others.
Sniff-regtested OK when specifying older targets on a power7 host.
--target_board=unix/'{-mcpu=power4,-mcpu=power5,-mcpu=power6,-mcpu=power7,
-mcpu=power8,-mcpu=power9}''{-m64,-m32}'"
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: Specify -mvsx option and update the
dg-require stanza to match.
* gcc.target/powerpc/pr96139-b.c: Same.
* gcc.target/powerpc/pr96139-c.c: Specify -maltivec option and update
the dg-require stanza to match.
Will Schmidt [Mon, 20 Jul 2020 15:51:37 +0000 (10:51 -0500)]
[PATCH, rs6000] Fix vector long long subtype (PR96139)
Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.
When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.
PR target/96139
2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com>
gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node
and unsigned_V2DI_type_node definitions.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.
Richard Biener [Mon, 24 Aug 2020 12:12:01 +0000 (14:12 +0200)]
debug/96690 - mangle symbols eventually used by late dwarf output
The following makes sure to, at early debug generation time, mangle
symbols we eventually end up outputting during late finish.
2020-08-24 Richard Biener <rguenther@suse.de>
PR debug/96690
* dwarf2out.c (reference_to_unused): Make FUNCTION_DECL
processing more consistent with respect to
symtab->global_info_ready.
(tree_add_const_value_attribute): Unconditionally call
rtl_for_decl_init to do all mangling early but throw
away the result if early_dwarf.
Jose E. Marchesi [Mon, 14 Sep 2020 18:35:22 +0000 (20:35 +0200)]
bpf: use the expected instruction for NOPs
The BPF ISA doesn't have a no-operation instruction, but in practice
the Linux kernel verifier performs some optimizations that rely on
these instructions to be encoded in a particular way. As it turns
out, we were using the "wrong" instruction in GCC.
This patch makes GCC to generate the expected instruction for NOP (a
`ja 0') and also adds a test to make sure this is the case.
Tested in bpf-unknown-none targets.
2020-09-14 Jose E. Marchesi <jose.marchesi@oracle.com>
Richard Biener [Thu, 27 Aug 2020 09:48:15 +0000 (11:48 +0200)]
tree-optimization/96522 - transfer of flow-sensitive info in copy_ref_info
This removes the bogus tranfer of flow-sensitive info in copy_ref_info
plus fixes one oversight in FRE when flow-sensitive non-NULLness was added to
points-to info.
2020-08-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/96522
* tree-ssa-address.c (copy_ref_info): Reset flow-sensitive
info of the copied points-to. Transfer bigger alignment
via the access type.
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Reset all flow-sensitive info.
Richard Biener [Mon, 14 Sep 2020 09:25:04 +0000 (11:25 +0200)]
tree-optimization/97043 - fix latent wrong-code with SLP vectorization
When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned. Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.
The following removes the SLP load permutation eliding based on
the still premature vectorization factor.
2020-09-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.
Add new shrpsi and shrpdi instruction variants to gcc/config/pa/pa.md.
2020-09-12 Roger Sayle <roger@nextmovesoftware.com>
John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog
* config/pa/pa.md (shrpsi4_1, shrpsi4_2): New define_insns split
out from previous shrpsi4 providing two commutitive variants using
plus_xor_ior_operator as a predicate.
(shrpdi4_1, shrpdi4_2, shrpdi_3, shrpdi_4): Likewise DImode versions
where _1 and _2 take register shifts, and _3 and _4 for integers.
(rotlsi3_internal): Name this anonymous instruction.
(rotrdi3): New DImode insn copied from rotrsi3.
(rotldi3): New DImode expander copied from rotlsi3.
(rotldi4_internal): New DImode insn copied from rotsi3_internal.