git.ipfire.org Git - thirdparty/gcc.git/log

aarch64: Prevent canary address being spilled to stack

This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886
for aarch64: under high register pressure, the -fstack-protector
code might spill the address of the canary onto the stack and
reload it at the test site, giving an attacker the opportunity
to change the expected canary value.

This would happen in two cases:

- when generating PIC for -mstack-protector-guard=global
  (tested by stack-protector-6.c).  This is a direct analogue
  of PR85434, which was also about PIC for the global case.

- when using -mstack-protector-guard=sysreg.

The two problems were really separate bugs and caused by separate code,
but it was more convenient to fix them together.

The post-patch code still spills _GLOBAL_OFFSET_TABLE_ for
stack-protector-6.c, which is a more general problem.  However,
it no longer spills the canary address itself.

The patch also fixes an ICE when using -mstack-protector-guard=sysreg
with ILP32: even if the register read is SImode, the address
calculation itself should still be DImode.

gcc/
* config/aarch64/aarch64-protos.h (aarch64_salt_type): New enum.
(aarch64_stack_protect_canary_mem): Declare.
* config/aarch64/aarch64.md (UNSPEC_SALT_ADDR): New unspec.
(stack_protect_set): Forward to stack_protect_combined_set.
(stack_protect_combined_set): New pattern.  Use
aarch64_stack_protect_canary_mem.
(reg_stack_protect_address_<mode>): Add a salt operand.
(stack_protect_test): Forward to stack_protect_combined_test.
(stack_protect_combined_test): New pattern.  Use
aarch64_stack_protect_canary_mem.
* config/aarch64/aarch64.c (strip_salt): New function.
(strip_offset_and_salt): Likewise.
(tls_symbolic_operand_type): Use strip_offset_and_salt.
(aarch64_stack_protect_canary_mem): New function.
(aarch64_cannot_force_const_mem): Use strip_offset_and_salt.
(aarch64_classify_address): Likewise.
(aarch64_symbolic_address_p): Likewise.
(aarch64_print_operand): Likewise.
(aarch64_output_addr_const_extra): New function.
(aarch64_tls_symbol_p): Use strip_salt.
(aarch64_classify_symbol): Likewise.
(aarch64_legitimate_pic_operand_p): Use strip_offset_and_salt.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_mov_operand_p): Use strip_salt.
(TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA): Override.

gcc/testsuite/
* gcc.target/aarch64/stack-protector-5.c: New test.
* gcc.target/aarch64/stack-protector-6.c: Likewise.
* gcc.target/aarch64/stack-protector-7.c: Likewise.

(cherry picked from commit 74b27d8eedc7a4c0e8276345107790e6b3c023cb)

aarch64: Update feature macro name

GCC used the name __ARM_FEATURE_SVE_VECTOR_OPERATIONS, but in the
final spec it was renamed to__ARM_FEATURE_SVE_VECTOR_OPERATORS.

gcc/
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Rename
__ARM_FEATURE_SVE_VECTOR_OPERATIONS to
__ARM_FEATURE_SVE_VECTOR_OPERATORS.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/attributes_1.c: Rename
__ARM_FEATURE_SVE_VECTOR_OPERATIONS to
__ARM_FEATURE_SVE_VECTOR_OPERATORS.

(cherry picked from commit ef4af9eddea5a658eb7d6dc29fcb58aa54c9dd9f)

aarch64: Tweaks to the handling of fixed-length SVE types

This patch is really four things rolled into one, since separating
them seemed artificial:

- Update the mangling of the fixed-length SVE ACLE types to match
  the upcoming spec.  The idea is to mangle:

    VLAT __attribute__((arm_sve_vector_bits(N)))

  as an instance __SVE_VLS<VLAT, N> of the template:

    __SVE_VLS<typename, unsigned>

- Give the fixed-length types their own TYPE_DECL.  This is needed
  to make the above mangling fix work, but should also be a minor
  QoI improvement for error reporting.  Unfortunately, the names are
  quite verbose, e.g.:

    svint8_t __attribute__((arm_sve_vector_bits(512)))

  but anything shorter would be ad-hoc syntax and so might be more
  confusing.

- Improve the error message reported when arm_sve_vector_bits is
  applied to tuples, such as:

    svint32x2_t __attribute__((arm_sve_vector_bits(N)))

  Previously we would complain that the type isn't an SVE type;
  now we complain that it isn't a vector type.

- Don't allow arm_sve_vector_bits(N) to be applied to existing
  fixed-length SVE types.

gcc/
* config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute):
Take the ACLE name of the type as a parameter and add it as fourth
argument to the "SVE type" attribute.
(register_builtin_types): Update call accordingly.
(register_tuple_type): Likewise.  Construct the name of the type
earlier in order to do this.
(get_arm_sve_vector_bits_attributes): New function.
(handle_arm_sve_vector_bits_attribute): Report a more sensible
error message if the attribute is applied to an SVE tuple type.
Don't allow the attribute to be applied to an existing fixed-length
SVE type.  Mangle the new type as __SVE_VLS<type, vector-bits>.
Add a dummy TYPE_DECL to the new type.

gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: New test.
* g++.target/aarch64/sve/acle/general-c++/mangle_6.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_7.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_8.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_9.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_10.C: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Check the
error messages reported when arm_sve_vector_bits is applied to
SVE tuple types or to existing fixed-length SVE types.

(cherry picked from commit 9ded41a39c1bb29f356485a9ec3a573fb75ded12)

aarch64: Update the mangling of single SVE vectors and predicates

GCC was implementing an old mangling scheme for single SVE
vectors and predicates (based on the Advanced SIMD one).
The final definition instead put them in the vendor built-in
namespace via the "u" prefix.

gcc/
* config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE): Add a
leading "u" to each mangled name.

gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Add a leading
"u" to the mangling of each SVE vector and predicate type.
* g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_3.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_5.C: Likewise.

(cherry picked from commit dcb043351307001a85fc1e7d56669f5adc9628f7)

arm: Add support for Neoverse V1 CPU

This patch backports the AArch32 support for Arm's Neoverse V1 CPU to
GCC 10.

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-v1): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Document AArch32 support for Neoverse V1.

Daily bump.

testsuite: [aarch64] Fix aarch64/advsimd-intrinsics/v{trn,uzp,zip}_half.c

Since r11-3402 (g:65c9878641cbe0ed898aa7047b7b994e9d4a5bb1), the
vtrn_half, vuzp_half and vzip_half started failing with

vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no linkage
vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no linkage
vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no linkage

This is because r11-3402 now always declares float64x2 variables for
aarch64, leading to a duplicate declaration in these testcases.

The fix is simply to remove these now useless declarations.

These tests are skipped on arm*, so there is no impact on that target.

2020-09-25 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Remove
declarations of vector, vector2, vector_res for float64x2 type.
* gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise.

(cherry picked from commit 8c775bf447e190024fa08c55e38db94dd013a393)

AArch64: Implement missing p128<->f64 reinterpret intrinsics

This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.

Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vreinterpretq_f64_p128,
vreinterpretq_p128_f64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(clean_results): Add float64x2_t cleanup.
(DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add
testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64.

(cherry picked from commit 65c9878641cbe0ed898aa7047b7b994e9d4a5bb1)

AArch64: Implement missing vrndns_f32 intrinsic

This patch implements the missing vrndns_f32 intrinsic. This operates on a scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.

This patch does that.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use BUILTIN_VHSDF_HSDF
for modes. Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.

(cherry picked from commit 02b5377b3766804059b7824330d33d0e1cef2e5b)

AArch64: Implement missing _p64 intrinsics for vector permutes

This patch implements some missing vector permute intrinsics operating on poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64,
vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/trn_zip_p64_1.c: New test.

(cherry picked from commit e8e818399d70c5a5a3d30a54d305c6e2b92e2c66)

AArch64: Implement vldrq_p128 intrinsic

This patch implements the missing vldrq_p128 intrinsic that just loads from the appropriate pointer.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vldrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vldrq_p128_1.c: New test.

(cherry picked from commit f2868e4bcff2c7b882d01231f039459c00e59d7b)

AArch64: Implement vstrq_p128 intrinsic

This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.

(cherry picked from commit d23ea1e865301cd45f14ccbdb0bca49251fde9e1)

AArch64: Implement missing vcls intrinsics on unsigned types

This patch implements some missing intrinsics that perform a CLS on unsigned SIMD types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.

(cherry picked from commit 30957092db46d8798e632feefb5df634488dbb33)

AArch64: Implement missing vceq*_p* intrinsics

This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64): Define.

gcc/testsuite/

PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.

(cherry picked from commit d4703be185b422f637deebd3bb9222a41c8023d6)

AArch64: Implement poly-type vadd intrinsics

This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the relevant unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.

(cherry picked from commit fa9ad35dae03dcb20c4ccb50ba1b351a8ab77970)

Revert "Fortran : ICE in build_field PR95614"

This reverts commit 4a67941a956003dcce8866604ba25f5a0bfd16cf.

Fix bogus alignment warning on address clause

The compiler gives a bogus alignment warning on an address clause and
a discriminated record type with variable size.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (maybe_saturate_size): Add ALIGN parameter
and round down the result to ALIGN.
(gnat_to_gnu_entity): Adjust calls to maybe_saturate_size.

gcc/testsuite/ChangeLog:
* gnat.dg/addr16.adb: New test.
* gnat.dg/addr16_pkg.ads: New helper.

Daily bump.

optabs: Don't reuse target for multi-word expansions if it overlaps operand(s) [PR97073]

The following testcase is miscompiled on i686-linux, because
we try to expand a double-word bitwise logic operation with op0
being a (mem:DI u) and target (mem:DI u+4), i.e. partial overlap, and
thus end up with:
movl 4(%esp), %eax
andl u, %eax
movl %eax, u+4
! movl u+4, %eax optimized out
andl 8(%esp), %eax
movl %eax, u+8
rather than with the desired:
movl 4(%esp), %edx
movl 8(%esp), %eax
andl u, %edx
andl u+4, %eax
movl %eax, u+8
movl %edx, u+4
because the store of the first word to target overwrites the second word of
the operand.
expand_binop for this (and several similar places) already check for target
== op0 or target == op1, this patch just adds reg_overlap_mentioned_p calls
next to it.
Pedantically, at least for some of these it might be sufficient to force
a different target if there is overlap but target is not rtx_equal_p to
the operand (e.g. in this bitwise logical case, but e.g. not in the shift
cases where there is reordering), though that would go against the
preexisting target == op? checks and the rationale that REG_EQUAL notes in
that case isn't correct.

2020-09-27 Jakub Jelinek <jakub@redhat.com>

PR middle-end/97073
* optabs.c (expand_binop, expand_absneg_bit, expand_unop,
expand_copysign_bit): Check reg_overlap_mentioned_p between target
and operand(s) and if it returns true, force a pseudo as target.

* gcc.c-torture/execute/pr97073.c: New test.

(cherry picked from commit a4b31d5807f2bc67c8999b3d53369cf2a5c6e1ec)

Fortran  :  ICE in build_field PR95614

Local identifiers can not be the same as a module name.  Original
patch by Steve Kargl resulted in name clashes between common block
names and local identifiers.  A local identifier can be the same as
a global identier if that identifier represents a common.  The patch
was modified to allow global identifiers that represent a common
block.

2020-09-27  Steven G. Kargl  <kargl@gcc.gnu.org>
    Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/fortran/

PR fortran/95614
* decl.c (gfc_get_common): Use gfc_match_common_name instead
of match_common_name.
* decl.c (gfc_bind_idents): Use gfc_match_common_name instead
of match_common_name.
* match.c : Rename match_common_name to gfc_match_common_name.
* match.c (gfc_match_common): Use gfc_match_common_name instead
of match_common_name.
* match.h : Rename match_common_name to gfc_match_common_name.
* resolve.c (resolve_common_vars): Check each symbol in a
common block has a global symbol.  If there is a global symbol
issue an error if the symbol type is known as is not a common
block name.

2020-09-27  Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/testsuite/

PR fortran/95614
* gfortran.dg/pr95614_1.f90: New test.
* gfortran.dg/pr95614_2.f90: New test.

(cherry picked from commit e5a76af3a2f3324efc60b4b2778ffb29d5c377bc)

Daily bump.

Add test for PR95464.c.

2020-06-04 Vladimir Makarov <vmakarov@redhat.com>

PR middle-end/95464
* gcc.target/i386/pr95464.c: New.

(cherry picked from commit e7ef9a40cd0c688cd331bc26224d1fbe360c1fe6)

Add processing STRICT_LOW_PART for matched reloads.

2020-06-04 Vladimir Makarov <vmakarov@redhat.com>

PR middle-end/95464
* lra.c (lra_emit_move): Add processing STRICT_LOW_PART.
* lra-constraints.c (match_reload): Use STRICT_LOW_PART in output
reload if the original insn has it too.

(cherry picked from commit 5261cf8ce824bfc75eb6f12ad5e3716c085b6f9a)

arm: Require MVE memory operand for destination of vst1q intrinsic

Previously, the machine description patterns for vst1q accepted a generic memory
operand for the destination, which could lead to an unrecognised builtin when
expanding vst1q* intrinsics. This change fixes the pattern to only accept MVE
memory operands.

gcc/ChangeLog:

PR target/96683
* config/arm/mve.md (mve_vst1q_f<mode>): Require MVE memory operand for
destination.
(mve_vst1q_<supf><mode>): Likewise.

gcc/testsuite/ChangeLog:

PR target/96683
* gcc.target/arm/mve/intrinsics/vst1q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s8.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u8.c: New test.

(cherry picked from commit 91d206adfe39ce063f6a5731b92a03c05e82e94a)

Daily bump.

rtl_data: Add sp_is_clobbered_by_asm

Add sp_is_clobbered_by_asm to rtl_data to inform backends that the stack
pointer is clobbered by asm statement.

gcc/

PR target/97032
* cfgexpand.c (asm_clobber_reg_kind): Set sp_is_clobbered_by_asm
to true if the stack pointer is clobbered by asm statement.
* emit-rtl.h (rtl_data): Add sp_is_clobbered_by_asm.
* config/i386/i386.c (ix86_get_drap_rtx): Set need_drap to true
if the stack pointer is clobbered by asm statement.

gcc/testsuite/

PR target/97032
* gcc.target/i386/pr97032.c: New test.

(cherry picked from commit 453a20c65722719b9e2d84339f215e7ec87692dc)

[RS6000] Power10 libffi fixes

Power10 pc-relative code doesn't use or preserve r2 as a TOC pointer.
That means calling between pc-relative and TOC using code can't be
done without intervening linker stubs, and a call from TOC code to
pc-relative code must have a nop after the bl in order to restore r2.

Now the PowerPC libffi assembly code doesn't use r2 except for the
implicit use when making calls back to C, ffi_closure_helper_LINUX64
and ffi_prep_args64. So changing the assembly to interoperate with
pc-relative code without stubs is easily done.

PR target/97166
* src/powerpc/linux64.S (ffi_call_LINUX64): Don't emit global
entry when __PCREL__. Call using @notoc. Add nops.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Likewise.
(ffi_go_closure_linux64): Likewise.

(cherry picked from commit 08cd8d5929eac84b27788d8483fd75ab7ad13129)
(cherry picked from commit fff56af6421a1a3e357bcaad99f2ea084d72a7a8)

[RS6000] Built-in __PCREL__ define

Useful in assembly to know details of power10 function calls.

PR target/97166
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Conditionally define __PCREL__.

(cherry picked from commit 677b9150f54a0483d3de1182ac40717b7c4431a5)

aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-17 Andrea Corallo <andrea.corallo@arm.com>

* config/aarch64/aarch64-builtins.c
(aarch64_general_expand_builtin): Use expand machinery not to
alter the value of an rtx returned by force_reg.

(cherry picked from commit 2c62952f8160bdc8d4111edb34a4bc75096c1e05)

aarch64: Add support for Neoverse V1 CPU

This patch backports the AArch64 support for Arm's Neoverse V1 CPU to
GCC 10.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def: Add Neoverse V1.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse V1.

Daily bump.

libstdc++: Fix out-of-bounds string_view access in filesystem::path [PR 97167]

libstdc++-v3/ChangeLog:

PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.

(cherry picked from commit 49ff88bd0d8a36a9e903f01ce05685cfe07dee5d)

bpf: use xBPF signed div, mod insns when available

The 'mod' and 'div' operators in eBPF are unsigned, with no signed
counterpart. xBPF adds two new ALU operations, sdiv and smod, for
signed division and modulus, respectively. Update bpf.md with
'define_insn' blocks for signed div and mod to use them when targetting
xBPF, and add new tests to ensure they are used appropriately.

2020-09-17 David Faust <david.faust@oracle.com>

gcc/
* config/bpf/bpf.md: Add defines for signed div and mod operators.

gcc/testsuite/
* gcc.target/bpf/diag-sdiv.c: New test.
* gcc.target/bpf/diag-smod.c: New test.
* gcc.target/bpf/xbpf-sdiv-1.c: New test.
* gcc.target/bpf/xbpf-smod-1.c: New test.

(cherry picked from commit 7c8ba5da80d5d95a8521010d6731d0d83036145d)

libstdc++: Use correct argument type for __use_alloc, again [PR 96803]

While backporting 5494edae83ad33c769bd1ebc98f0c492453a6417 I noticed
that it's still not correct. I made the allocator-extended constructor
use the right type for the uses-allocator construction detection, but I
used an rvalue when it should be a const lvalue.

This should fix it properly this time.

libstdc++-v3/ChangeLog:

PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Use correct value category in __use_alloc call.
* testsuite/20_util/tuple/cons/96803.cc: Check with constructors
that require correct value category to be used.

(cherry picked from commit 7825399092d572ce8ea82c4aa8dfeb65076b0e52)

libstdc++: Use correct argument type for __use_alloc [PR 96803]

The _Tuple_impl constructor for allocator-extended construction from a
different tuple type uses the _Tuple_impl's own _Head type in the
__use_alloc test. That is incorrect, because the argument tuple could
have a different type. Using the wrong type might select the
leading-allocator convention when it should use the trailing-allocator
convention, or vice versa.

libstdc++-v3/ChangeLog:

PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Replace parameter pack with a type parameter and a pack and pass
the first type to __use_alloc.
* testsuite/20_util/tuple/cons/96803.cc: New test.

(cherry picked from commit 5494edae83ad33c769bd1ebc98f0c492453a6417)

Daily bump.

libstdc++: Fix build for targets without lstat [PR 94681]

libstdc++-v3/ChangeLog:

PR libstdc++/94681
* src/c++17/fs_ops.cc (read_symlink): Use posix::lstat instead
of calling ::lstat directly.
* src/filesystem/ops.cc (read_symlink): Likewise.

(cherry picked from commit 5b065f0563262a0d6cd1fea8426913bfdd841301)

libstdc++: Make C++17 ignore --disable-libstdcxx-filesystem-ts [PR 94681]

The configure switch should only affect the optional Filesystem TS, not
the std::filesystem features of C++17.

libstdc++-v3/ChangeLog:

PR libstdc++/94681
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Do not depend on
$enable_libstdcxx_filesystem_ts.
* configure: Regenerate.

(cherry picked from commit 90f7636bf8df50940e0f749af60a6b374a8f09b4)

libstdc++: Fix noexcept-specifier for std::bind_front [PR 97101]

libstdc++-v3/ChangeLog:

PR libstdc++/97101
* include/std/functional (bind_front): Fix order of parameters
in is_nothrow_constructible_v specialization.
* testsuite/20_util/function_objects/bind_front/97101.cc: New test.

(cherry picked from commit 3c755b428e188228d0bad90625c995fd25a02322)

libgo: don't put golang.org packages in zstdpkglist.go

This ensures that internal/goroot.IsStandardPackage does not treat
golang.org packages as being in the standard library.

For golang/go#41368
Fixes golang/go#41499

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/256319

libstdc++: handle small max_blocks_per_chunk in pool resources [PR 94160]

When a pool resource is constructed with max_blocks_per_chunk=1 it ends
up creating a pool with blocks_per_chunk=0 which means it never
allocates anything. Instead it returns null pointers, which should be
impossible.

To avoid this problem, round the max_blocks_per_chunk value to a
multiple of four, so it's never smaller than four.

libstdc++-v3/ChangeLog:

PR libstdc++/94160
* src/c++17/memory_resource.cc (munge_options): Round
max_blocks_per_chunk to a multiple of four.
(__pool_resource::_M_alloc_pools()): Simplify slightly.
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc:
Check that valid pointers are returned when small values are
used for max_blocks_per_chunk.

(cherry picked from commit 30b41cfbb2dade63e52465234a725d1d02fe70aa)

Daily bump.

Fix linkage with -nodefaultlibs option.

2020-09-20 John David Anglin < danglin@gcc.gnu.org>

gcc/ChangeLog
* config/pa/pa-hpux11.h (LINK_GCC_C_SEQUENCE_SPEC): Delete.
* config/pa/pa64-hpux.h (LINK_GCC_C_SEQUENCE_SPEC): Likewise.
(ENDFILE_SPEC): Link with libgcc_stub.a and mill.a.
* config/pa/pa32-linux.h (ENDFILE_SPEC): Link with libgcc.a.

Daily bump.

Fortran: Avoid double-free with parse error (PR96041, PR93423)

gcc/fortran/

PR fortran/96041
PR fortran/93423
* decl.c (gfc_match_submod_proc): Avoid later double-free
in the error case.

(cherry picked from commit c12facd22881517127ebbe213d7ecc7fc1fcea4e)

PR fortran/93423 - ICE on invalid with argument list for module procedure

When recovering from an error, a NULL pointer dereference could occur.
Check for that situation and punt.

gcc/fortran/
PR fortran/93423
* resolve.c (resolve_symbol): Avoid NULL pointer dereference.

(cherry picked from commit b88744905a46be44ffa3c57d46080f601ae832b8)

Daily bump.

preprocessor: Fix ICE with too long line in fmtwarn [PR96935]

Here we ICE in char_span::subspan because the offset it gets is -1.
It's -1 because get_substring_ranges_for_loc gets a location whose
column was 0. That only happens in testcases like the attached where
we're dealing with extremely long lines (at least 4065 chars it seems).
This does happen in practice, though, so it's not just a theoretical
problem (e.g. when building the SU2 suite).

Fixed by checking that the column get_substring_ranges_for_loc gets is
sane, akin to other checks in that function.

gcc/ChangeLog:

PR preprocessor/96935
* input.c (get_substring_ranges_for_loc): Return if start.column
is less than 1.

gcc/testsuite/ChangeLog:

PR preprocessor/96935
* gcc.dg/format/pr96935.c: New test.

(cherry picked from commit 31dd5cd6344bfbbe122fb512993b128e11236d35)

If -mavx implies -mxsave, then -mno-xsave should imply -mno-avx.

Current status is -mno-avx implies -mno-xsave which should be wrong.

gcc/ChangeLog

* common/config/i386/i386-common.c
(OPTION_MASK_ISA_AVX_UNSET): Remove OPTION_MASK_ISA_XSAVE_UNSET.
(OPTION_MASK_ISA_XSAVE_UNSET): Add OPTION_MASK_ISA_AVX_UNSET.

gcc/testsuite/ChangeLog

* gcc.target/i386/xsave-avx-1.c: New test.

Daily bump.

store-merging: Consider also overlapping stores earlier in the by bitpos sorting [PR97053]

As the testcases show, if we have something like:
  MEM <char[12]> [&b + 8B] = {};
  MEM[(short *) &b] = 5;
  _5 = *x_4(D);
  MEM <long long unsigned int> [&b + 2B] = _5;
  MEM[(char *)&b + 16B] = 88;
  MEM[(int *)&b + 20B] = 1;
then in sort_by_bitpos the stores are almost like in the given order,
except the first store is after the = _5; store.
We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
while the former INTEGER_CST, and we can't coalesce the = _5 store with
the = {} store because the former is MEM_REF, the latter INTEGER_CST.
But we happily coalesce the remaining 3 stores, which is wrong, because the
= _5; store overlaps those and is in between them in the program order.
We already have code to deal with similar cases in check_no_overlap, but we
deal only with the following stores in sort_by_bitpos order, not the earlier
ones.

The following patch checks also the earlier ones.  In coalesce_immediate_stores
it computes the first one that needs to be checked (all the ones whose
bitpos + bitsize is smaller or equal to merged_store->start don't need to be
checked and don't need to be checked even for any following attempts because
of the sort_by_bitpos sorting) and the end of that (that is the first store
in the merged_store).

2020-09-16  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/97053
* gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
any stores between FIRST_EARLIER inclusive and LAST_EARLIER exclusive
has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
be merged store.
(imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER argument.
Adjust check_no_overlap caller.
(imm_store_chain_info::coalesce_immediate_stores): Add first_earlier
and last_earlier variables, adjust them during iterations.  Adjust
check_no_overlap callers, call check_no_overlap even when extending
overlapping stores by extra INTEGER_CST stores.

* gcc.dg/store_merging_31.c: New test.
* gcc.dg/store_merging_32.c: New test.

(cherry picked from commit bd909071ac04e94f4b6f0baab64d0687ec55681d)

Daily bump.

[PATCH,rs6000] Testsuite fixup pr96139 tests

Hi,
  As reported, the recently added pr96139 tests will fail on older targets
  because the tests are missing the appropriate -mvsx or -maltivec options.
  This adds the options and clarifies the dg-require statements.

  The pr96139-c.c test needs -maltivec to work, but does not actually use
  vectors, so does not require -mvsx like the others.

  Sniff-regtested OK when specifying older targets on a power7 host.
  --target_board=unix/'{-mcpu=power4,-mcpu=power5,-mcpu=power6,-mcpu=power7,
  -mcpu=power8,-mcpu=power9}''{-m64,-m32}'"

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: Specify -mvsx option and update the
dg-require stanza to match.
* gcc.target/powerpc/pr96139-b.c: Same.
* gcc.target/powerpc/pr96139-c.c: Specify -maltivec option and update
the dg-require stanza to match.

(cherry picked from commit 2fda9e9badbd78d1033075a44a7d6c1b33de239c)

[PATCH, rs6000] Fix vector long long subtype (PR96139)

Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.

When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.

PR target/96139

2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com>

gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node
and unsigned_V2DI_type_node definitions.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.

(cherry picked from commit d8f3474ff81b07fd2e758337957711db17eb801e)

i386: Fix up vector mul and div with broadcasts in -masm=intel mode

These patterns printed bogus <>s around the {1to16} and similar strings.

2020-09-15 Jakub Jelinek <jakub@redhat.com>

PR target/97028
* config/i386/sse.md (mul<mode>3<mask_name>_bcs,
<avx512>_div<mode>3<mask_name>_bcst): Use <avx512bcst> instead of
<<avx512bcst>>.

* gcc.target/i386/avx512f-pr97028.c: Untested fix.

(cherry picked from commit 0f079e104a8d1994b6b47169a6b45737615eb2d7)

debug/96690 - mangle symbols eventually used by late dwarf output

The following makes sure to, at early debug generation time, mangle
symbols we eventually end up outputting during late finish.

2020-08-24 Richard Biener <rguenther@suse.de>

PR debug/96690
* dwarf2out.c (reference_to_unused): Make FUNCTION_DECL
processing more consistent with respect to
symtab->global_info_ready.
(tree_add_const_value_attribute): Unconditionally call
rtl_for_decl_init to do all mangling early but throw
away the result if early_dwarf.

* g++.dg/lto/pr96690_0.C: New testcase.

(cherry picked from commit 7fe2cec41bb2ccb499b6b6c513e00da1a270370f)

Daily bump.

doc: fix spelling of -fprofile-reproducibility

gcc/ChangeLog:

* doc/invoke.texi: fix '-fprofile-reproducibility' option
spelling in manual.

(cherry picked from commit 0620f4d79e270f1a455a7ec099504d44dc6180e6)

bpf: use the expected instruction for NOPs

The BPF ISA doesn't have a no-operation instruction, but in practice
the Linux kernel verifier performs some optimizations that rely on
these instructions to be encoded in a particular way. As it turns
out, we were using the "wrong" instruction in GCC.

This patch makes GCC to generate the expected instruction for NOP (a
`ja 0') and also adds a test to make sure this is the case.

Tested in bpf-unknown-none targets.

2020-09-14 Jose E. Marchesi <jose.marchesi@oracle.com>

gcc/

* config/bpf/bpf.md ("nop"): Re-define as `ja 0'.

gcc/testsuite/

* gcc.target/bpf/nop-1.c: New test.

(cherry picked from commit 5bcc0fa05ef713594f6c6d55d5c837e13a9c9803)

tree-optimization/96522 - transfer of flow-sensitive info in copy_ref_info

This removes the bogus tranfer of flow-sensitive info in copy_ref_info
plus fixes one oversight in FRE when flow-sensitive non-NULLness was added to
points-to info.

2020-08-27 Richard Biener <rguenther@suse.de>

PR tree-optimization/96522
* tree-ssa-address.c (copy_ref_info): Reset flow-sensitive
info of the copied points-to. Transfer bigger alignment
via the access type.
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Reset all flow-sensitive info.

* gcc.dg/torture/pr96522.c: New testcase.

(cherry picked from commit eb68d9d828f94d28afa5900fbf3072bbcd64ba8a)

tree-optimization/97043 - fix latent wrong-code with SLP vectorization

When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned. Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.

The following removes the SLP load permutation eliding based on
the still premature vectorization factor.

2020-09-14 Richard Biener <rguenther@suse.de>

PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.

i386: Fix array index in expander

I noticed a compiler warning about out-of-bound access. Fixed thusly.

gcc/
* config/i386/sse.md (mov<mode>): Fix operand indices.

Daily bump.

Improve costs for DImode shifts of interger constants.

2020-09-13 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/pa/pa.c (hppa_rtx_costs) [ASHIFT, ASHIFTRT, LSHIFTRT]:
Provide accurate costs for DImode shifts of integer constants.

Daily bump.

Add new shrpsi and shrpdi instruction variants to gcc/config/pa/pa.md.

2020-09-12 Roger Sayle <roger@nextmovesoftware.com>
John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog
* config/pa/pa.md (shrpsi4_1, shrpsi4_2): New define_insns split
out from previous shrpsi4 providing two commutitive variants using
plus_xor_ior_operator as a predicate.
(shrpdi4_1, shrpdi4_2, shrpdi_3, shrpdi_4): Likewise DImode versions
where _1 and _2 take register shifts, and _3 and _4 for integers.
(rotlsi3_internal): Name this anonymous instruction.
(rotrdi3): New DImode insn copied from rotrsi3.
(rotldi3): New DImode expander copied from rotlsi3.
(rotldi4_internal): New DImode insn copied from rotsi3_internal.

Daily bump.

testsuite: gimplefe-44 requires exceptions

This avoids an ICE on amdgcn.

gcc/testsuite/ChangeLog:

* gcc.dg/gimplefe-44.c: Require exceptions.

(cherry picked from commit 2c1d809e93e983fa9dde6d4a14c23193608fed68)

amdgcn: align TImode registers

This prevents execution failures caused by partially overlapping input and
output registers. This is the same solution already used for DImode.

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align TImode registers.
* config/gcn/gcn.md: Assert that TImode registers do not early clobber.

(cherry picked from commit 8ae0de5621120b16295fe6b73ca044d4c576af6d)

testsuite: Run gcc.dg/pr96579.c only on targets with dfp support.

gcc.dg/pr96579.c includes gcc.dg/pr96370.c which needs target dfp.

2020-08-28 Christophe Lyon <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.dg/pr96579.c: Compile only with target dfp.

(cherry picked from commit 3ba43155d2b27978589b2c1b0c4fdbaf8d4bba4d)

tree-optimization/96579 - another special-operands fix in reassoc

This makes sure to put special-ops expanded rhs left where
expression rewrite expects it.

2020-08-27 Richard Biener <rguenther@suse.de>

PR tree-optimization/96579
* tree-ssa-reassoc.c (linearize_expr_tree): If we expand
rhs via special ops make sure to swap operands.

* gcc.dg/pr96579.c: New testcase.

(cherry picked from commit ff7463172e564c5dd2432d7af8eaa0124cbd4af7)

tree-optimization/96370 - make reassoc expr rewrite more robust

In the face of the more complex tricks in reassoc with respect
to negate processing it can happen that the expression rewrite
is fooled to recurse on a leaf and pick up a bogus expression
code. The following patch makes the expression rewrite more
robust in providing the expression code to it directly since
it is the same for all operations in a chain.

2020-07-30 Richard Biener <rguenther@suse.de>

PR tree-optimization/96370
* tree-ssa-reassoc.c (rewrite_expr_tree): Add operation
code parameter and use it instead of picking it up from
the stmt that is being rewritten.
(reassociate_bb): Pass down the operation code.

* gcc.dg/pr96370.c: New testcase.

(cherry picked from commit 2c558d2655cb22f472c83e8296b5cd2a92365cd3)

tree-optimization/96514 - avoid if-converting control-altering calls

This avoids if-converting when encountering control-altering calls.

2020-08-07 Richard Biener <rguenther@suse.de>

PR tree-optimization/96514
* tree-if-conv.c (if_convertible_bb_p): If the last stmt
is a call that is control-altering, fail.

* gcc.dg/pr96514.c: New testcase.

(cherry picked from commit c3f94f5786a014515c09c7852db228c74adf51e5)

lto/96385 - avoid unused global UNDEFs in debug objects

Unused global UNDEFs can have side-effects in some circumstances so
the following patch avoids them by treating them the same as other
to be discarded DEFs - make them local.

2020-08-03 Richard Biener <rguenther@suse.de>

PR lto/96385
libiberty/
* simple-object-elf.c
(simple_object_elf_copy_lto_debug_sections): Localize global
UNDEFs and reuse the prevailing name.

(cherry picked from commit b32c5d0b72fda2588b4e170e75a9c64e4bf266c7)

middle-end/96369 - fix missed short-circuiting during range folding

This makes the special case of constant evaluated LHS for a
short-circuiting or/and explicit rather than doing range
merging and eventually exposing a side-effect that shouldn't be
evaluated.

2020-07-31 Richard Biener <rguenther@suse.de>

PR middle-end/96369
* fold-const.c (fold_range_test): Special-case constant
LHS for short-circuiting operations.

* c-c++-common/pr96369.c: New testcase.

(cherry picked from commit 10231958fcfb13bc4847729eba21470c101b4a88)

tree-optimization/96349 - avoid abnormal coalescing issues in loop split

This avoids splitting a loop when the entry value of a loop PHI is
involved with abnormal coalescing.

2020-07-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/96349
* tree-ssa-loop-split.c (stmt_semi_invariant_p_1): When the
condition runs into a loop PHI with an abnormal entry value give up.

* gcc.dg/torture/pr96349.c: New testcase.

(cherry picked from commit 2b2f3867c09c8977268b8ffbd646ac242188b335)

LTO: Add -fcf-protection=check

Mixing -fcf-protection and -fcf-protection=none objects are allowed.
Linker just merges -fcf-protection values from all input objects.

Add -fcf-protection=check for the final link with LTO.  An error is
issued if LTO object files are compiled with different -fcf-protection
values.  Otherwise, -fcf-protection=check is ignored at the compile
time.  Without explicit -fcf-protection at link time, -fcf-protection
values from LTO object files are merged at the final link.

gcc/

PR bootstrap/96203
* common.opt: Add -fcf-protection=check.
* flag-types.h (cf_protection_level): Add CF_CHECK.
* lto-wrapper.c (merge_and_complain): Issue an error for
mismatching -fcf-protection values with -fcf-protection=check.
Otherwise, merge -fcf-protection values.
* doc/invoke.texi: Document -fcf-protection=check.

gcc/testsuite/

PR bootstrap/96203
* gcc.target/i386/pr96203-1.c: New test.
* gcc.target/i386/pr96203-2.c: Likewise.

(cherry picked from commit c4c22e830251e1961c6ebec78d28d039eb2e6017)

LTO: pick up -fcf-protection flag for the link step

2020-07-14 Matthias Klose <doko@ubuntu.com>

PR lto/95604
* lto-wrapper.c (merge_and_complain): Add decoded options as parameter,
error on different values for -fcf-protection.
(append_compiler_options): Pass -fcf-protection option.
(find_and_merge_options): Add decoded options as parameter,
pass decoded_options to merge_and_complain.
(run_gcc): Pass decoded options to find_and_merge_options.
* lto-opts.c (lto_write_options): Pass -fcf-protection option.

(cherry picked from commit 6a48d12475cdb7375b98277f8bc089715feeeafe)

Fix ICE on nested packed variant record type

This is a regression present on the mainline and 10 branch: the compiler
aborts on code accessing a component of a packed record type whose type
is a packed discriminated record type with variant part.

gcc/ada/ChangeLog:
* gcc-interface/utils.c (type_has_variable_size): New function.
(create_field_decl): In the packed case, also force byte alignment
when the type of the field has variable size.

gcc/testsuite/ChangeLog:
* gnat.dg/pack27.adb: New test.
* gnat.dg/pack27_pkg.ads: New helper.

Fix crash on array component with nonstandard index type

This is a regression present on mainline, 10 and 9 branches: the compiler
goes into an infinite recursion eventually exhausting the stack for the
declaration of a discriminated record type with an array component having
a discriminant as bound and an index type that is an enumeration type with
a non-standard representation clause.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Only
create extra subtypes for discriminants if the RM size of the base
type of the index type is lower than that of the index type.

gcc/testsuite/ChangeLog:
* gnat.dg/specs/discr7.ads: New test.

Adjust email address

lto: Fix up lto BLOCK tree streaming

When I've tried to backport recent LTO changes of mine, I've ran into
FAIL: g++.dg/ubsan/align-3.C   -O2 -flto -fno-use-linker-plugin -flto-partition=none  output pattern test
FAIL: g++.dg/ubsan/align-3.C   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
regressions that don't for some reason show up on the trunk.

I've tracked it down to input_location_and_block being called recursively,
first on some UBSAN_NULL ifn call's location which needs to stream a BLOCK
that hasn't been streamed yet, which in turn needs to stream some locations
for the decls in the BLOCK.

Looking at that align-3.C testcase on the trunk, I also see the recursive
lto_output_location_1 calls as well.  First on block:
$18 = <block 0x7fffea738480>
which is in the block tree from what I can see just fine (see the end of
mail).

Now, output_function has code that should stream the BLOCK tree leafs:
  /* Output DECL_INITIAL for the function, which contains the tree of
     lexical scopes.  */
  stream_write_tree (ob, DECL_INITIAL (function), true);
  /* As we do not recurse into BLOCK_SUBBLOCKS but only BLOCK_SUPERCONTEXT
     collect block tree leafs and stream those.  */
  auto_vec<tree> block_tree_leafs;
  if (DECL_INITIAL (function))
    collect_block_tree_leafs (DECL_INITIAL (function), block_tree_leafs);
  streamer_write_uhwi (ob, block_tree_leafs.length ());
  for (unsigned i = 0; i < block_tree_leafs.length (); ++i)
    stream_write_tree (ob, block_tree_leafs[i], true);

static void
collect_block_tree_leafs (tree root, vec<tree> &leafs)
{
  for (root = BLOCK_SUBBLOCKS (root); root; root = BLOCK_CHAIN (root))
    if (! BLOCK_SUBBLOCKS (root))
      leafs.safe_push (root);
    else
      collect_block_tree_leafs (BLOCK_SUBBLOCKS (root), leafs);
}

but the problem is that it is broken, it doesn't cover all block leafs,
but only leafs with an odd depth from DECL_INITIAL (and only some of those).

The following patch fixes that, but I guess we are going to stream at that
point significantly more blocks than before (though I guess most of the time
we'd stream them later on when streaming the gimple_locations that refer to
them).

2020-09-10  Jakub Jelinek  <jakub@redhat.com>

* lto-streamer-out.c (collect_block_tree_leafs): Recurse on
root rather than BLOCK_SUBBLOCKS (root).

(cherry picked from commit a8f9b4c54cc350625accbb2c6b4005acb42d4c2e)

* lto-streamer.h (LTO_minor_version): Bump.

lto: Stream current working directory for first streamed relative filename and adjust relative paths [PR93865]

If the gcc -c -flto ... commands to compile some or all objects are run in a
different directory (or in different directories) from the directory in
which the gcc -flto link line is invoked, then the .debug_line will be
incorrect if there are any relative filenames, it will use those relative
filenames while .debug_info will contain a different DW_AT_comp_dir.

The following patch streams (at most once after each clear_line_info)
the current working directory (what we record in DW_AT_comp_dir) when
encountering the first relative pathname, and when reading the location info
reads it back and if the current working directory at that point is
different from the saved one, adjusts relative paths by adding a relative
prefix how to go from the current working directory to the previously saved
path (with a fallback e.g. for DOS e:\\foo vs. d:\\bar change to use
absolute directory).

2020-09-10  Jakub Jelinek  <jakub@redhat.com>

PR debug/93865
* lto-streamer.h (struct output_block): Add emit_pwd member.
* lto-streamer-out.c: Include toplev.h.
(clear_line_info): Set emit_pwd.
(lto_output_location_1): Encode the ob->current_file != xloc.file
bit directly into the location number.  If changing file, emit
additionally a bit whether pwd is emitted and emit it before the
first relative pathname since clear_line_info.
(output_function, output_constructor): Don't call clear_line_info
here.
* lto-streamer-in.c (struct string_pair_map): New type.
(struct string_pair_map_hasher): New type.
(string_pair_map_hasher::hash): New method.
(string_pair_map_hasher::equal): New method.
(path_name_pair_hash_table, string_pair_map_allocator): New variables.
(relative_path_prefix, canon_relative_path_prefix,
canon_relative_file_name): New functions.
(canon_file_name): Add relative_prefix argument, if non-NULL
and string is a relative path, return canon_relative_file_name.
(lto_location_cache::input_location_and_block): Decode file change
bit from the location number.  If changing file, unpack bit whether
pwd is streamed and stream in pwd.  Adjust canon_file_name caller.
(lto_free_file_name_hash): Delete path_name_pair_hash_table
and string_pair_map_allocator.

(cherry picked from commit 3d0af0c997fe42a7f0963d970a9c495b81041206)

lto: Stream edge goto_locus [PR94235]

The following patch adds streaming of edge goto_locus (both LOCATION_LOCUS
and LOCATION_BLOCK from it), the PR shows a testcase (inappropriate for
gcc testsuite) where the lack of streaming of goto_locus results in worse
debug info.
Earlier version of the patch (without the output_function changes) failed
miserably, because on the order mismatch - input_function would
first input_cfg, then input_eh_regions and then input_bb (all of which now
have locations), while output_function used output_eh_regions, then output_bb
and then output_cfg. *_cfg went to a separate stream...
Now, is there a reason why the order is different?

If the intent is that the cfg could be read separately from the rest of
function or vice versa, alternatively we'd need to clear_line_info ();
before output_eh_regions and before/after output_cfg to make them
independent.

2020-09-07 Jakub Jelinek <jakub@redhat.com>

PR debug/94235
* lto-streamer-out.c (output_cfg): Also stream goto_locus for edges.
Use bp_pack_var_len_unsigned instead of streamer_write_uhwi to stream
e->dest->index and e->flags.
(output_function): Call output_cfg before output_ssa_name, rather than
after streaming all bbs.
* lto-streamer-in.c (input_cfg): Stream in goto_locus for edges.
Use bp_unpack_var_len_unsigned instead of streamer_read_uhwi to stream
in dest_index and edge_flags.

(cherry picked from commit fea13fcd0da0353520eb2675ad24c2f296611b85)

lto: Remove stream_input_location_now

As discussed yesterday, stream_input_location_now has been used in 3
remaining places. For ERT_MUST_NOT_THROW, I believe the failure_loc
location is stable at least until the apply_cache after the bbs are all
read, and the locations do not include BLOCK, so we can use normal
stream_input_location, and the two input_struct_function_base also
shouldn't include BLOCK and are stable at least until that same apply_cache
after reading all bbs, so again we can use the location cache.

2020-09-04 Jakub Jelinek <jakub@redhat.com>

* lto-streamer.h (stream_input_location_now): Remove declaration.
* lto-streamer-in.c (stream_input_location_now): Remove.
(input_eh_region, input_struct_function_base): Use
stream_input_location instead of stream_input_location_now.

(cherry picked from commit b898878032a5bbba0d1a981db6399664181531e9)

lto: Ensure we force a change for file/line/column after clear_line_info

As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored.  But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
  ob->current_file = NULL;
  ob->current_line = 0;
  ob->current_col = 0;
  ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info.  But then we do:
      bp_pack_value (bp, ob->current_file != xloc.file, 1);
      bp_pack_value (bp, ob->current_line != xloc.line, 1);
      bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true.  If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero?  Either by oring something into those
tests, or perhaps:
  if (ob->current_reset)
    {
      if (xloc.file == NULL)
        ob->current_file = "";
      if (xloc.line == 0)
        ob->current_line = 1;
      if (xloc.column == 0)
        ob->current_column = 1;
      ob->current_reset = false;
    }
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?

2020-09-04  Jakub Jelinek  <jakub@redhat.com>

* lto-streamer.h (struct output_block): Add reset_locus member.
* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
(lto_output_location_1): If reset_locus, clear it and ensure
current_{file,line,col} is different from xloc members.

(cherry picked from commit 70d8d9bd93f7912e56a27e64abc9e1e895fe143a)

c++: Fix another PCH hash_map issue [PR96901]

The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hasing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed.  Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch.  But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as that the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR c++/96901
* tree.h (struct decl_tree_traits): New type.
(decl_tree_map): New typedef.

* constexpr.c (fundef_copies_table): Change type from
hash_map<tree, tree> * to decl_tree_map *.

(cherry picked from commit ba6730bd18371a3dff1e37d2c2ee27233285b597)

c++: Disable -frounding-math during manifestly constant evaluation [PR96862]

As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned on.
      /* Don't constant fold this floating point operation if the
         result may dependent upon the run-time rounding mode and
         flag_rounding_math is set, or if GCC's software emulation
         is unable to accurately represent the result.  */
      if ((flag_rounding_math
           || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
          && (inexact || !real_identical (&result, &value)))
        return NULL_TREE;
Jonathan said that we should be evaluating them anyway, e.g. conceptually
as if they are done with the default rounding mode before user had a chance
to change that, and e.g. in C in initializers it is also ignored.
In fact, fold-const.c for C initializers turns off various other options:

/* Perform constant folding and related simplification of initializer
   expression EXPR.  These behave identically to "fold_buildN" but ignore
   potential run-time traps and exceptions that fold must preserve.  */

  int saved_signaling_nans = flag_signaling_nans;\
  int saved_trapping_math = flag_trapping_math;\
  int saved_rounding_math = flag_rounding_math;\
  int saved_trapv = flag_trapv;\
  int saved_folding_initializer = folding_initializer;\
  flag_signaling_nans = 0;\
  flag_trapping_math = 0;\
  flag_rounding_math = 0;\
  flag_trapv = 0;\
  folding_initializer = 1;

  flag_signaling_nans = saved_signaling_nans;\
  flag_trapping_math = saved_trapping_math;\
  flag_rounding_math = saved_rounding_math;\
  flag_trapv = saved_trapv;\
  folding_initializer = saved_folding_initializer;

So, shall cxx_eval_outermost_constant_expr instead turn off all those
options (then warning_sentinel wouldn't be the right thing to use, but given
the 8 or how many return stmts in cxx_eval_outermost_constant_expr, we'd
need a RAII class for this.  Not sure about the folding_initializer, that
one is affecting complex multiplication and division constant evaluation
somehow.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR c++/96862
* constexpr.c (cxx_eval_outermost_constant_expr): Temporarily disable
flag_rounding_math during manifestly constant evaluation.

* g++.dg/cpp1z/constexpr-96862.C: New test.

(cherry picked from commit 6641d6d3fe79113f8d9f3ced355aea79bffda822)

lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]

As mentioned in the PR, when compiling valgrind even on fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and 8000-ish line number in the same file
we very quickly run out of all possible location_t numbers and because of
that emit non-sensical line numbers in .debug_line.
There are ways how to decrease speed of depleting location_t numbers
in libcpp, but the main reason of this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations.  libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those were the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR lto/94311
* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
functions.
* streamer-hooks.h (struct streamer_hooks): Add
output_location_and_block callback.  Fix up formatting for
output_location.
(stream_output_location_and_block): Define.
* lto-streamer.h (class lto_location_cache): Fix comment typo.  Add
current_block member.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::lto_location_cache): Initialize current_block.
(lto_location_cache::cached_location): Add block member.
(struct output_block): Add current_block member.
(lto_output_location): Formatting fix.
(lto_output_location_and_block): Declare.
* lto-streamer.c (lto_streamer_hooks_init): Initialize
streamer_hooks.output_location_and_block.
* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
block members.
(lto_location_cache::apply_location_cache): Handle blocks.
(lto_location_cache::accept_location_cache,
lto_location_cache::revert_location_cache): Fix up function comments.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::input_location): Implement using
input_location_and_block.
(input_function): Invoke apply_location_cache after streaming in all
bbs.
* lto-streamer-out.c (clear_line_info): Set current_block.
(lto_output_location_1): New function, moved from lto_output_location,
added block handling.
(lto_output_location): Implement using lto_output_location_1.
(lto_output_location_and_block): New function.
* gimple-streamer-in.c (input_phi): Use input_location_and_block
to input and cache both location and block.
(input_gimple_stmt): Likewise.
* gimple-streamer-out.c (output_phi): Use
stream_output_location_and_block.
(output_gimple_stmt): Likewise.

(cherry picked from commit 3536ff2de8317c430546fd97574d44c5146cef2b)

fortran: Fix o'...' boz to integer/real conversions [PR96859]

The standard says that excess digits from boz are truncated.
For hexadecimal or binary, the routines copy just the number of digits
that will be needed, but for octal we copy number of digits that
contain one extra bit (for 8-bit, 32-bit or 128-bit, i.e. kind 1, 4 and 16)
or two extra bits (for 16-bit or 64-bit, i.e. kind 2 and 8).
The clearing of the first bit is done correctly by changing the first digit
if it is 4-7 to one smaller by 4 (i.e. modulo 4).
The clearing of the first two bits is done by changing 4 or 6 to 0
and 5 or 7 to 1, which is incorrect, because we really want to change the
first digit to 0 if it was even, or to 1 if it was odd, so digits
2 and 3 are mishandled by keeping them as is, rather than changing 2 to 0
and 3 to 1.

2020-09-02 Jakub Jelinek <jakub@redhat.com>

PR fortran/96859
* check.c (gfc_boz2real, gfc_boz2int): When clearing first two bits,
change also '2' to '0' and '3' to '1' rather than just handling '4'
through '7'.

* gfortran.dg/pr96859.f90: New test.

(cherry picked from commit b567d3bd302933adb253aba9069fd8120c485441)

dwarf2out: Fix up dwarf2out_next_real_insn caching [PR96729]

The addition of NOTE_INSN_BEGIN_STMT and NOTE_INSN_INLINE_ENTRY notes
reintroduced quadratic behavior into dwarf2out_var_location.
This function needs to know the next real instruction to which the var
location note applies, but the way final_scan_insn is called outside of
final.c main loop doesn't make it easy to look up the next real insn in
there (and for non-dwarf it is even useless).  Usually next real insn is
only a few notes away, but we can have hundreds of thousands of consecutive
notes only followed by a real insn.  dwarf2out_var_location to avoid the
quadratic behavior contains a cache, it remembers the next note and when it
is called again on that loc_note, it can use the previously computed
dwarf2out_next_real_insn result, rather than walking the insn chain once
again.  But, for NOTE_INSN_{BEGIN_STMT,INLINE_ENTRY} dwarf2out_var_location
is not called while the code puts into the cache those notes, which means if
we have e.g. in the worst case NOTE_INSN_VAR_LOCATION and
NOTE_INSN_BEGIN_STMT notes alternating, the cache is not really used.

The following patch fixes it by looking up the next NOTE_INSN_VAR_LOCATION
if any.  While the lookup could be perhaps done together with looking for
the next real insn once (e.g. in dwarf2out_next_real_insn or its copy),
there are other dwarf2out_next_real_insn callers which don't need/want that
behavior and if there are more than two NOTE_INSN_VAR_LOCATION notes
followed by the same real insn, we need to do that "find next
NOTE_INSN_VAR_LOCATION" walk anyway.

On the testcase from the PR this patch speeds it 2.8times, from 0m0.674s
to 0m0.236s (why it takes for the reporter more than 60s is unknown).

2020-08-26  Jakub Jelinek  <jakub@redhat.com>

PR debug/96729
* dwarf2out.c (dwarf2out_next_real_insn): Adjust function comment.
(dwarf2out_var_location): Look for next_note only if next_real is
non-NULL, in that case look for the first non-deleted
NOTE_INSN_VAR_LOCATION between loc_note and next_real, if any.

(cherry picked from commit ca1afa261d03c9343dff1208325f87d9ba69ec7a)

Daily bump.

Fix bogus error on Value_Size clause for variant record type

This is a regression present on the mainline and 10 branch: the compiler
rejects a Value_Size clause on a discriminated record type with variant.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (set_rm_size): Do not take into account the
Value_Size clause if it is not for the entity itself.

gcc/testsuite/ChangeLog:
* gnat.dg/specs/size_clause5.ads: New test.

Fix uninitialized variable with nested variant record types

This fixes a wrong code issue with nested variant record types: the
compiler generates move instructions that depend on an uninitialized
variable, which was initially a SAVE_EXPR not instantiated early enough.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (build_subst_list): For a definition, make
sure to instantiate the SAVE_EXPRs generated by the elaboration of
the constraints in front of the elaboration of the type itself.

gcc/testsuite/ChangeLog:
* gnat.dg/discr59.adb: New test.
* gnat.dg/discr59_pkg1.ads: New helper.
* gnat.dg/discr59_pkg2.ads: Likewise.

Daily bump.

c++: Fix ICE in reshape_init with init-list [PR95164]

This patch fixes a long-standing bug in reshape_init_r.  Since r209314
we implement DR 1467 which handles list-initialization with a single
initializer of the same type as the target.  In this test this causes
a crash in reshape_init_r when we're processing a constructor that has
undergone the DR 1467 transformation.

Take e.g. the

  foo({{1, {H{k}}}});

line in the attached test.  {H{k}} initializes the field b of H in I.
H{k} is a functional cast, so has TREE_HAS_CONSTRUCTOR set, so is
COMPOUND_LITERAL_P.  We perform the DR 1467 transformation and turn
{H{k}} into H{k}.  Then we attempt to reshape H{k} again and since
first_initializer_p is null and it's COMPOUND_LITERAL_P, we go here:

           else if (COMPOUND_LITERAL_P (stripped_init))
             gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));

then complain about the missing braces, go to reshape_init_class and ICE
on
               gcc_checking_assert (d->cur->index
                                    == get_class_binding (type, id));

because due to the missing { } we're looking for 'b' in H, but that's
not found.

So we have to be prepared to handle an initializer whose outer braces
have been removed due to DR 1467.

gcc/cp/ChangeLog:

PR c++/95164
* decl.c (reshape_init_r): When initializing an aggregate member
with an initializer from an initializer-list, also consider
COMPOUND_LITERAL_P.

gcc/testsuite/ChangeLog:

PR c++/95164
* g++.dg/cpp0x/initlist123.C: New test.

(cherry picked from commit acbe30bbc884899da72df47d023ebde89f8f47f1)