testsuite: Update testcase for PR96078 fix [PR99363]

My fix for PR96078 made us stop warning about flatten on an alias if the
target has the alias, which is exactly the case tested here. So let's
remove the expected warning and add a similar case which does warn.

gcc/testsuite/ChangeLog:

PR c/99363
* gcc.dg/attr-flatten-1.c: Adjust.

Daily bump.

aarch64: Fix status return logic in RNG intrinsics

There is a bug with the RNG intrinsics in their return code. The definition says:

"Stores a 64-bit random number into the object pointed to by the argument and returns zero.
If the implementation could not generate a random number within a reasonable period of time
the object pointed to by the input is set to zero and a non-zero value is returned."

This means we should be testing whether to return non-zero with:
CSET W0, EQ
rather than NE.

This patch fixes that.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.c (aarch64_expand_rng_builtin): Use EQ
to compare against CC_REG rather than NE.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/rng_2.c: New test.

(cherry picked from commit f7581eb38eeaa8af64f3cdfe2faf764f5883f16f)
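
The choice between EQ and NE in the entry above can be illustrated with a small portable model. This is a sketch, not the code from the patch: it assumes (from the Arm architecture description of RNDR, not from the commit message) that the instruction leaves NZCV as 0b0000 on success and 0b0100 (Z set) on failure. All `model_*` names and the packed flag layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative packing of NZCV into 4 bits: N = bit 3, Z = bit 2,
   C = bit 1, V = bit 0.  */
#define MODEL_FLAG_Z 0x4u

/* Hypothetical stand-in for RNDR.  Assumption: success yields the random
   value with NZCV = 0b0000, failure yields zero with NZCV = 0b0100.  */
static unsigned model_rndr(int hw_ok, uint64_t random_bits, uint64_t *value)
{
  assert(value != NULL);
  *value = hw_ok ? random_bits : 0;
  return hw_ok ? 0u : MODEL_FLAG_Z;
}

/* CSET W0, EQ: produces 1 when Z is set, otherwise 0.  */
static int model_cset_eq(unsigned nzcv)
{
  return (nzcv & MODEL_FLAG_Z) != 0;
}

/* CSET W0, NE: the inverted choice that the entry describes as the bug.  */
static int model_cset_ne(unsigned nzcv)
{
  return (nzcv & MODEL_FLAG_Z) == 0;
}

/* Model of the documented contract: store the number and return zero on
   success, store zero and return non-zero on failure.  */
static int model_rng_intrinsic(int hw_ok, uint64_t random_bits, uint64_t *out)
{
  unsigned nzcv = model_rndr(hw_ok, random_bits, out);
  return model_cset_eq(nzcv);
}
```

Under that assumption, `model_cset_eq` returns non-zero exactly when generation failed, which matches the quoted definition, while `model_cset_ne` would report failure on every successful call.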

rs6000: Fix disassembling a vector pair in gcc-10 in little-endian mode

In gcc-10, we don't handle disassembling a vector pair in little-endian mode
correctly. The solution is to make use of the disassemble accumulator code
that is endian friendly.

gcc/

2021-03-17 Peter Bergner <bergner@linux.ibm.com>

* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Handle
disassembling a vector pair vector by vector in little-endian mode.

Daily bump.

ipa: Fix resolving speculations through cgraph_edge::set_call_stmt

In the PR 98078 testcase, speculative call-graph edges which were
created by IPA-CP are confirmed during inlining but
cgraph_edge::set_call_stmt does not take it very well.

The function enters the update_speculative branch and updates the
edges in the speculation bundle separately (by a recursive call), but
when it processes the first direct edge, most of the bundle actually
ceases to exist because it is devirtualized.  It nevertheless goes on
to attempt to update the indirect edge (that has just been removed),
which surprisingly gets as far as adding the edge to the
call_site_hash, the same devirtualized edge for the second time, and
that triggers an assert.

Fixed by this patch which makes the function aware that it is about to
resolve a speculation and do so instead of updating components of
speculation.  Also, it does so before dealing with the hash because
the speculation resolution code needs the hash to point to the first
speculative direct edge and also cleans the hash up by calling
update_call_stmt_hash_for_removing_direct_edge.

Bootstrapped and tested on x86_64-linux, also profile-LTO-bootstrapped
on the same system.

gcc/ChangeLog:

2021-01-20  Martin Jambor  <mjambor@suse.cz>

PR ipa/98078
* cgraph.c (cgraph_edge::set_call_stmt): Do not update all
corresponding speculative edges if we are about to resolve
speculation.  Make edge direct (and so resolve speculations) before
removing it from call_site_hash.
(cgraph_edge::make_direct): Relax the initial assert to allow calling
the function on speculative direct edges.

(cherry picked from commit b8188b7d7382e4a74af5dd6a125e76e8d43a68a5)

c/99224 - avoid ICEing on invalid __builtin_next_arg

This avoids crashes with __builtin_next_arg on non-parameters. For
the specific testcase we arrive with an anonymous SSA_NAME so that
SSA_NAME_VAR becomes NULL and we crash.

2021-02-24 Richard Biener <rguenther@suse.de>

PR c/99224
* builtins.c (fold_builtin_next_arg): Avoid NULL arg.

* gcc.dg/pr99224.c: New testcase.

(cherry picked from commit 084963dcaca2f0836366fdb001561e29ecbfb483)

tree-optimization/99253 - fix reduction path check

This fixes an ordering problem with verifying that no intermediate
computations in a reduction path are used outside of the chain.  The
check was disabled for value-preserving conversions at the tail
but whether a stmt was a conversion or not was only computed after
the first use.  The following fixes this by re-ordering things
accordingly.

2021-02-25  Richard Biener  <rguenther@suse.de>

PR tree-optimization/99253
* tree-vect-loop.c (check_reduction_path): First compute
code, then verify out-of-loop uses.

* gcc.dg/vect/pr99253.c: New testcase.

(cherry picked from commit 1193d05465acd39b6e3c7095274d8351a1e2cd44)

Daily bump.

coroutines : Avoid a C++11ism.

The master version of the code uses a defaulted CTOR, which had
been inadvertently backported to gcc-10. Fixed thus.

gcc/cp/ChangeLog:

* coroutines.cc (struct var_nest_node): Provide a default
CTOR.

tree-nested: Update assert for Fortran module vars [PR97927]

gcc/ChangeLog:

PR fortran/97927
* tree-nested.c (convert_local_reference_stmt): Avoid calling
lookup_field_for_decl for Fortran module (= namespace context).

gcc/testsuite/ChangeLog:

PR fortran/97927
* gfortran.dg/module_variable_3.f90: New test.

(cherry picked from commit 8a6a62614a8ae4544770420416d1632d6c3d3f6e)

ira: Make sure allocno copies are ordered [PR98791]

gcc/ChangeLog:
2021-02-22 Andre Vieira <andre.simoesdiasvieira@arm.com>

PR rtl-optimization/98791
* ira-conflicts.c (process_regs_for_copy): Don't create allocno copies
for unordered modes.

gcc/testsuite/ChangeLog:
2021-02-22 Andre Vieira <andre.simoesdiasvieira@arm.com>

PR rtl-optimization/98791
* gcc.target/aarch64/sve/pr98791.c: New test.

(cherry picked from commit 4c31a3a6d31b6214ea774d403bf8ab7ebe1ea862)

Fortran: Fix problem with allocate initialization [PR99545].

2021-03-15 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran/ChangeLog

PR fortran/99545
* trans-stmt.c (gfc_trans_allocate): Mark the initialization
assignment by setting init_flag.

gcc/testsuite/ChangeLog

PR fortran/99545
* gfortran.dg/pr99545.f90: New test.

(cherry picked from commit 21ced2776a117924e52f6aab8b41afb613fef0e7)

Daily bump.

aarch64: Set AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC for Neoverse N2

This patch tweaks the Neoverse N2 tuning on the GCC 10 branch to have it
in line with GCC 8 and 9 to prefer AdvancedSIMD over SVE for
auto-vectorisation.

gcc/ChangeLog:

* config/aarch64/aarch64.c (neoversen2_tunings): Set
AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC tune_flags.

Daily bump.

aarch64: Add missing error_mark_node check [PR99381]

We were missing a check in function_resolver::require_vector_type to see
if the argument type was already invalid. This was causing us to attempt
to emit a diagnostic and subsequently ICE in print_type. Fixed thusly.

gcc/ChangeLog:

PR target/99381
* config/aarch64/aarch64-sve-builtins.cc
(function_resolver::require_vector_type): Handle error_mark_node.

gcc/testsuite/ChangeLog:

PR target/99381
* gcc.target/aarch64/pr99381.c: New test.

(cherry picked from commit a6bc1680a493de356d6a381718021c6a44401201)

Daily bump.

rs6000: Fix pr98959 testcase

It needs the int128 selector because it uses __int128, and the lp64
selector is the best we can do for -mcmodel=.

2021-03-10 Segher Boessenkool <segher@kernel.crashing.org>

gcc/testsuite/
* gcc.target/powerpc/pr98959.c: Add int128 and lp64 selectors.

(cherry picked from commit 8f316f41ce0fd90570f4d4444c29c639a322a0be)

rs6000: Fix invalid splits when using Altivec style addresses [PR98959]

The rs6000_emit_le_vsx_* functions assume they are not passed an Altivec
style "& ~16" address.  However, some of our expanders and splitters do
not verify we do not have an Altivec style address before calling those
functions, leading to an ICE.  The solution here is to guard the expanders
and splitters to ensure we do not call them if we're given an Altivec style
address.

2021-03-08  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/98959
* config/rs6000/rs6000.c (rs6000_emit_le_vsx_permute): Add an assert
to ensure we do not have an Altivec style address.
* config/rs6000/vsx.md (*vsx_le_perm_load_<mode>): Disable if passed
an Altivec style address.
(*vsx_le_perm_store_<mode>): Likewise.
(splitters after *vsx_le_perm_store_<mode>): Likewise.
(vsx_load_<mode>): Disable special expander if passed an Altivec
style address.
(vsx_store_<mode>): Likewise.

gcc/testsuite/
PR target/98959
* gcc.target/powerpc/pr98959.c: New test.

(cherry picked from commit cb25dea3ef2c7768007bffc56f0e31e1c42b44e2)

rs6000: Fix ICE in rs6000_init_builtins when compiling with -mcpu=440 [PR99279]

The initialization of compat builtins assumes the builtin we are creating
a compatible builtin for exists and ICEs if it doesn't. However, there are
valid reasons why some builtins are disabled for a particular compile.
In this case, the MMA builtins are disabled for -mcpu=440 (and other cpus),
so instead of ICEing, we should just skip adding the MMA compat builtin.

2021-02-25 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/99279
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Replace assert
with an "if" test.

(cherry picked from commit 0159535adb0e7400f2c6922f14a7602f6b90cf69)

rs6000: Add support for compatibility built-ins

The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and
__builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and
__builtin_vsx_disassemble_pair respectively. It's too late to remove the
old names, so this patch renames the built-ins to the new names and then
adds support for creating compatibility built-ins (i.e., multiple built-in
functions generate the same code) and then creates compatibility built-ins
using the old names.

2021-02-23 Peter Bergner <bergner@linux.ibm.com>

gcc/
* config/rs6000/mma.md (mma_assemble_pair): Rename from this...
(vsx_assemble_pair): ...to this.
* config/rs6000/rs6000-builtin.def (BU_MMA_V2, BU_MMA_V3,
BU_COMPAT): New macros.
(mma_assemble_pair): Rename from this...
(vsx_assemble_pair): ...to this.
(mma_disassemble_pair): Rename from this...
(vsx_disassemble_pair): ...to this.
(mma_assemble_pair): New compatibility built-in.
(mma_disassemble_pair): Likewise.
* config/rs6000/rs6000-call.c (struct builtin_compatibility): New.
(RS6000_BUILTIN_COMPAT): Define.
(bdesc_compat): New.
(rs6000_gimple_fold_mma_builtin): Use VSX_BUILTIN_ASSEMBLE_PAIR.
(rs6000_init_builtins): Register compatibility built-ins.
(mma_init_builtins): Use VSX_BUILTIN_ASSEMBLE_PAIR,
and VSX_BUILTIN_DISASSEMBLE_PAIR.
* doc/extend.texi (__builtin_mma_assemble_pair): Rename from this...
(__builtin_vsx_assemble_pair): ...to this.
(__builtin_mma_disassemble_pair): Rename from this...
(__builtin_vsx_disassemble_pair): ...to this.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-4.c: Add tests for
__builtin_vsx_assemble_pair and __builtin_vsx_disassemble_pair.
Add __has_builtin tests for built-ins.
Update expected instruction counts.

(cherry picked from commit 77ef995c1fbcab76a2a69b9f4700bcfd005d8e62)

rs6000: Fix invalid address used in MMA built-in function

The mma_assemble_input_operand predicate is too lenient on the memory
operands it will accept, leading to an ICE when illegitimate addresses
are passed in.  The solution is to only accept memory operands with
addresses that are valid for quad word memory accesses.  The test case
is a minimized test case from the Eigen library.  The creduced test case
is very noisy with respect to warnings, so the test case adds -w to
silence them.

2021-02-11  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/99041
* config/rs6000/predicates.md (mma_assemble_input_operand): Restrict
memory addresses that are legal for quad word accesses.

gcc/testsuite/
PR target/99041
* g++.target/powerpc/pr99041.C: New test.

(cherry picked from commit 2432c47970024db6410708b582a901259dabaae1)

Fix Ada bootstrap on Cygwin64

gcc/ada/
PR bootstrap/94918
* raise-gcc.c: On Cygwin include mingw32.h to prevent
windows.h from including x86intrin.h or emmintrin.h.

Fix ICE on atomic enumeration type with LTO

This is a strange regression whereby an enumeration type declared as
atomic (or volatile) incorrectly triggers the ODR machinery for its
values in LTO mode.

gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity): Build a TYPE_STUB_DECL
for the main variant of an enumeration type declared as volatile.
gcc/testsuite/
* gnat.dg/specs/lto25.ads: New test.

Daily bump.

Fix internal error on lambda function

This boils down to the RTL expander trying to take the address of a DECL
whose RTX is a register.

gcc/
PR c++/90448
* calls.c (initialize_argument_information): When the argument
is passed by reference, do not make a copy in a thunk only if
the argument is already in memory. Remove redundant test for
the case of callee copy.

Daily bump.

runtime: cast SIGSTKSZ to uintptr

PR go/99458
* libgo/runtime/proc.c: cast SIGSTKSZ to uintptr
In newer versions of glibc it is long, which causes a signed
comparison warning.

aarch64: Add internal tune flag to minimise VL-based scalar ops

This is a backport of the cse_sve_vl_constants tuning param to GCC 10.

Bootstrapped and tested on the branch on aarch64-none-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64-tuning-flags.def (cse_sve_vl_constants):
Define.
* config/aarch64/aarch64.md (add<mode>3): Force CONST_POLY_INT immediates
into a register when the above is enabled.
* config/aarch64/aarch64.c (neoversev1_tunings):
AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.
(aarch64_rtx_costs): Use AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.

gcc/testsuite/

* gcc.target/aarch64/sve/cse_sve_vl_constants_1.c: New test.

Daily bump.

PR libfortran/99218 - matmul on temporary array accesses invalid memory

Do not invoke tuned rank-2 times rank-2 matmul if rank(b) == 1.

libgfortran/ChangeLog:

PR libfortran/99218
* m4/matmul_internal.m4: Invoke tuned matmul only for rank(b)>1.
* generated/matmul_c10.c: Regenerated.
* generated/matmul_c16.c: Likewise.
* generated/matmul_c4.c: Likewise.
* generated/matmul_c8.c: Likewise.
* generated/matmul_i1.c: Likewise.
* generated/matmul_i16.c: Likewise.
* generated/matmul_i2.c: Likewise.
* generated/matmul_i4.c: Likewise.
* generated/matmul_i8.c: Likewise.
* generated/matmul_r10.c: Likewise.
* generated/matmul_r16.c: Likewise.
* generated/matmul_r4.c: Likewise.
* generated/matmul_r8.c: Likewise.
* generated/matmulavx128_c10.c: Likewise.
* generated/matmulavx128_c16.c: Likewise.
* generated/matmulavx128_c4.c: Likewise.
* generated/matmulavx128_c8.c: Likewise.
* generated/matmulavx128_i1.c: Likewise.
* generated/matmulavx128_i16.c: Likewise.
* generated/matmulavx128_i2.c: Likewise.
* generated/matmulavx128_i4.c: Likewise.
* generated/matmulavx128_i8.c: Likewise.
* generated/matmulavx128_r10.c: Likewise.
* generated/matmulavx128_r16.c: Likewise.
* generated/matmulavx128_r4.c: Likewise.
* generated/matmulavx128_r8.c: Likewise.

gcc/testsuite/ChangeLog:

PR libfortran/99218
* gfortran.dg/matmul_21.f90: New test.

(cherry picked from commit b1bee29167df6b0fbc9a4c8d06e2acbf3367af47)

OpenACC: C/C++ - fix async parsing [PR99137]

gcc/c/ChangeLog:

PR c/99137
* c-parser.c (c_parser_oacc_clause_async): Reject comma expressions.

gcc/cp/ChangeLog:

PR c/99137
* parser.c (cp_parser_oacc_clause_async): Reject comma expressions.

gcc/testsuite/ChangeLog:

PR c/99137
* c-c++-common/goacc/asyncwait-1.c: Update dg-error; add
additional test.

(cherry picked from commit 6ddedd3efa3fe482f76a4037521a06b3ac9f2a8b)

Daily bump.

Fix build breakage with latest glibc release

gcc/ada/
PR ada/99264
* init.c (__gnat_alternate_sta) [Linux]: Remove preprocessor test on
MINSIGSTKSZ and bump size to 32KB.
* libgnarl/s-osinte__linux.ads (Alternate_Stack_Size): Bump to 32KB.

Daily bump.

c++: Normalization and deduction guide rewriting [PR96199]

This is a subset of r11-2748; we don't want to rewrite all deduction guides
in GCC 10, but we do need the push_nested_class in normalization to avoid
breaking cmcstl2.

gcc/cp/ChangeLog:

PR c++/96199
* cp-tree.h (struct push_nested_class_guard): New.
* constraint.cc (get_normalized_constraints_from_decl): Use it.

c++: C++17 and decltype of multi-operator expression [PR95675]

A call that is the immediate operand of decltype has special semantics: no
temporary is produced, so it's OK for the return type to be e.g. incomplete.
But we were treating (e | f) the same way, which confused overload
resolution when we then tried to evaluate ... | g.

Fixed by making build_temp do what its name says, and force the C++17
temporary materialization conversion.

gcc/cp/ChangeLog:

PR c++/95675
* call.c (build_temp): Wrap a CALL_EXPR in a TARGET_EXPR
if it didn't get one before.

gcc/testsuite/ChangeLog:

PR c++/95675
* g++.dg/cpp0x/decltype-call5.C: New test.
* g++.dg/cpp0x/decltype-call6.C: New test.

cgraph: flatten and same_body aliases [PR96078]

The patch for PR92372 made us start warning about a flatten attribute on an
alias. But in the case of C++ 'tor base/complete variants, the user didn't
create the alias. If the alias target also has the attribute, the alias
points to a flattened function, so we shouldn't warn.

gcc/ChangeLog:

PR c++/96078
* cgraphunit.c (process_function_and_variable_attributes): Don't
warn about flatten on an alias if the target also has it.
* cgraph.h (symtab_node::get_alias_target_tree): New.

gcc/testsuite/ChangeLog:

PR c++/96078
* g++.dg/ext/attr-flatten1.C: New test.

c++: Fix class NTTP constness handling [PR98810]

Here, when substituting still-dependent args into an alias template, we see
a non-const type because the default argument is non-const, and is not a
template parm object because it's still dependent.

gcc/cp/ChangeLog:

PR c++/98810
* pt.c (tsubst_copy) [VIEW_CONVERT_EXPR]: Add const
to a class non-type template argument that needs it.

gcc/testsuite/ChangeLog:

PR c++/98810
* g++.dg/cpp2a/nontype-class-defarg1.C: New test.

Daily bump.

d: Fix heap-buffer-overflow in checkModFileAlias [PR 99337]

The code wrongly assumed memcmp did not read past the mismatch.

gcc/d/ChangeLog:

PR d/99337
* dmd/dmodule.c (checkModFileAlias): Don't read past buffer in
comparison.

(cherry picked from commit d6177870dd2696501e3b8d3930fd5549d4acaeae)
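
The pitfall named in the entry above can be sketched in C as follows. This is not the dmd code: `has_prefix` and its parameters are hypothetical, and the sketch only shows the general pattern of checking the available length before calling memcmp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the buffer `buf` of `buflen` bytes start with
   the `prefixlen` bytes at `prefix`?  */
static int has_prefix(const char *buf, size_t buflen,
                      const char *prefix, size_t prefixlen)
{
  assert(buf != NULL && prefix != NULL);

  /* memcmp is allowed to read all `prefixlen` bytes of both operands even
     if an early byte already differs, so the length has to be checked
     before the call rather than relying on an early mismatch.  */
  if (buflen < prefixlen)
    return 0;

  return memcmp(buf, prefix, prefixlen) == 0;
}
```

With the length check in place, a buffer shorter than the prefix is rejected without memcmp ever being given a size larger than the buffer.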

Fix ICE with pathologically large frames

gcc/
PR target/99234
* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
point back the hard frame pointer to its default location when the
frame is larger than SEH_MAX_FRAME_SIZE.

tree-optimization/98758 - fix integer arithmetic in data-ref analysis

This fixes some int arithmetic issues and a bogus truncation.

2021-01-20 Richard Biener <rguenther@suse.de>

PR tree-optimization/98758
* tree-data-ref.c (int_divides_p): Use lambda_int arguments.
(lambda_matrix_right_hermite): Avoid undefinedness with
signed integer abs and multiplication.
(analyze_subscript_affine_affine): Use lambda_int.

* gcc.dg/torture/pr98758.c: New testcase.

(cherry picked from commit 34599780d0de72faf5719ea08d11a061722b9d19)
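
As background for the "undefinedness with signed integer abs and multiplication" mentioned above: in C, negating the most negative value of a signed type and overflowing a signed multiplication are both undefined. The sketch below shows one common way to avoid that by doing the arithmetic in an unsigned type. It is not the code from tree-data-ref.c, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Absolute value returned as an unsigned quantity.  Defined for every
   input, including INT64_MIN, because the negation happens in uint64_t,
   where arithmetic wraps modulo 2^64.  */
static uint64_t abs_unsigned(int64_t v)
{
  uint64_t u = (uint64_t)v;
  return v < 0 ? (uint64_t)0 - u : u;
}

/* Multiplication carried out in unsigned arithmetic, so overflow wraps
   instead of being undefined.  Returns the low 64 bits of the product.  */
static uint64_t mul_wrapping(int64_t a, int64_t b)
{
  return (uint64_t)a * (uint64_t)b;
}
```

Returning the result as `uint64_t` leaves it to the caller to decide how to interpret values above INT64_MAX, which keeps the helpers themselves free of implementation-defined conversions.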

tree-optimization/98640 - fix bogus sign-extension with VN

VN tried to express a sign extension from int to long of
a truncated quantity with a plain conversion but that loses the
truncation. Since there's no single operand doing truncate plus
sign extend (there was a proposed SEXT_EXPR to do that at some
point mapping to RTL sign_extract) don't bother to appropriately
model this with two ops (which the VN insert machinery doesn't
handle and which is unlikely to CSE fully).

2021-01-13 Richard Biener <rguenther@suse.de>

PR tree-optimization/98640
* tree-ssa-sccvn.c (visit_nary_op): Do not try to
handle plus or minus from a truncated operand to be
sign-extended.

* gcc.dg/torture/pr98640.c: New testcase.

(cherry picked from commit ffd28c265e6d611983cd27e9332dc799039a3f04)
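
The difference between "truncate, then sign-extend" and a plain conversion, as described in the entry above, can be shown with a small C sketch. This is an illustration of the arithmetic only, not of the value-numbering code; both function names are hypothetical, and fixed-width types stand in for int and long.

```c
#include <stdint.h>

/* Truncate a 64-bit value to its low 32 bits, then sign-extend the result
   back to 64 bits, using only well-defined unsigned arithmetic.  */
static int64_t trunc_then_sext(int64_t x)
{
  uint32_t low = (uint32_t)(uint64_t)x;
  if (low & UINT32_C(0x80000000))
    return (int64_t)low - INT64_C(4294967296);
  return (int64_t)low;
}

/* What a plain conversion computes when the value is already 64 bits
   wide: nothing is truncated, so the upper bits survive.  */
static int64_t plain_conversion(int64_t x)
{
  return x;
}
```

For an input of 2147483648 the first function yields -2147483648 while the second yields 2147483648, which is the kind of discrepancy that arises when the truncation step is dropped.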

tree-optimization/98526 - fix vectorizer reduction cost

This fixes a double-counting in the reduction cost when vectorizing
the reduction through the regular vectorizable_* functions.

2021-01-11 Richard Biener <rguenther@suse.de>

PR tree-optimization/98526
* tree-vect-loop.c (vect_model_reduction_cost): Remove costing
of the actual reduction op for the regular case.
(vectorizable_reduction): Cost the stmts
vect_transform_reduction produces here.

(cherry picked from commit 04bff1bbfc11a974342c0eb0c0d65d902e36e82e)

tree-optimization/97897 - complex lowering on abnormal edges

This fixes complex lowering to not put constants into abnormal
edge PHI values by making sure abnormally used SSA names are
VARYING in its propagation lattice.

2020-11-19 Richard Biener <rguenther@suse.de>

PR tree-optimization/97897
* tree-complex.c (complex_propagate::visit_stmt): Make sure
abnormally used SSA names are VARYING.
(complex_propagate::visit_phi): Likewise.

* gcc.dg/pr97897.c: New testcase.

debug: fix switch lowering debug info

gcc/ChangeLog:

PR debug/98656
* tree-switch-conversion.c (jump_table_cluster::emit): Add loc
argument.
(bit_test_cluster::emit): Reuse location_t for newly created
gswitch statement.
(switch_decision_tree::try_switch_expansion): Preserve
location_t.
* tree-switch-conversion.h: Change function signatures.

(cherry picked from commit 4ede02a5f2af1205434f0e05aaaeff762b24e329)

Daily bump.

Fix PR ada/99095

This is a regression present on the mainline and 10 branch, where we fail
to make the bounds explicit for the return value of a function returning
an unconstrained array of a limited record type.

gcc/ada/
PR ada/99095
* sem_ch8.adb (Check_Constrained_Object): Restrict again the special
optimization for limited types to non-array types except in the case
of an extended return statement.
gcc/testsuite/
* gnat.dg/limited5.adb: New test.

Fix ICE in compute_fn_summary

PR ipa/98338
* ipa-fnsummary.c (compute_fn_summary): Fix sanity check.

(cherry picked from commit 150bde36c119eff4b8a74667c9d728d6a8a5e8a1)

RISC-V: Implement __builtin_thread_pointer

RISC-V has a dedicated register for the thread pointer, as specified in the
psABI doc, so we can support __builtin_thread_pointer in a straightforward way.

Note: clang/llvm recently added support for __builtin_thread_pointer in its
RISC-V port.
- https://reviews.llvm.org/rGaabc24acf0d5f8677bd22fe9c108581e07c3e180

gcc/ChangeLog:

* config/riscv/riscv.md (get_thread_pointer<mode>): New.
(TP_REGNUM): Ditto.
* doc/extend.texi (Target Builtins): Add RISC-V built-in section.
Document __builtin_thread_pointer.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/read-thread-pointer.c: New.

(cherry picked from commit 1073b500e5d33af8b75567108a8c04fe2598df2b)

Daily bump.

arm: force use of r4 for __gnu_cmse_nonsecure_call when !FPCXT [PR99271]

Commit r10-6017 relaxed the constraint on thumb2 calls to
__gnu_cmse_nonsecure_call to allow any register for the call address.
Although the initial code expansion continues to use r4 when the FPCXT
extension is not enabled, the change was unsafe because subsequent
optimizations could use the additional freedom to change which
register was being used.

To fix this we need to split the output patterns in the machine
description to use distinct recognizers: one with the additional
freedom when FPCXT is enabled and another that retains the original
restrictions when the extension is not available.

gcc:
PR target/99271
* config/arm/thumb2.md (nonsecure_call_reg_thumb2_fpcxt): New pattern.
(nonsecure_call_value_reg_thumb2_fpcxt): Likewise.
(nonsecure_call_reg_thumb2): Restrict to using r4 for the callee
address and disable when the FPCXT is not available.
(nonsecure_call_value_reg_thumb2): Likewise.

gcc/testsuite:
* gcc.target/arm/cmse/cmse-18.c: New test.

Fix wrong result for 1.0/3.0 at -O2 -fno-omit-frame-pointer -frounding-math

This wrong-code PR for the C++ compiler on x86-64/Windows is a regression
in GCC 9 and later, but the underlying issue has probably been there since
SEH was implemented and is exposed by this comment in config/i386/winnt.c:

  /* SEH records saves relative to the "current" stack pointer, whether
     or not there's a frame pointer in place.  This tracks the current
     stack pointer offset from the CFA.  */
  HOST_WIDE_INT sp_offset;

That's not what the (current) Microsoft documentation says; instead it says:

  /* SEH records offsets relative to the lowest address of the fixed stack
     allocation.  If there is no frame pointer, these offsets are from the
     stack pointer; if there is a frame pointer, these offsets are from the
     value of the stack pointer when the frame pointer was established, i.e.
     the frame pointer minus the offset in the .seh_setframe directive.  */

That's why the implementation is correct only under the condition that the
frame pointer be established *after* the fixed stack allocation; as a matter
of fact, that's clearly the model underpinning SEH, but is the opposite of
what is done e.g. on Linux.

However the issue is mostly papered over in practice because:

  1. SEH forces use_fast_prologue_epilogue to false, which in turn forces
save_regs_using_mov to false, so the general regs are always pushed when
they need to be saved, which eliminates the offset computation for them.

  2. As soon as a frame is larger than 240 bytes, the frame pointer is fixed
arbitrarily to 128 bytes above the stack pointer, which of course requires
that it be established after the fixed stack allocation.

So you need a small frame clobbering one of the call-saved XMM registers in
order to generate wrong SEH unwind info.

The attached fix makes sure that the frame pointer is always established
after the fixed stack allocation by pointing it at or below the lowest used
register save area, i.e. the SSE save area, and removing the special early
saves in the prologue; the end result is a uniform prologue sequence for
SEH whatever the frame size.  And it avoids a discrepancy between cases
where the number of saved general regs is even and cases where it is odd.

gcc/
PR target/99234
* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
point the hard frame pointer to the SSE register save area instead
of the general register save area.  Perform only minimal adjustment
for small frames if it is initially not correctly aligned.
(ix86_expand_prologue): Remove early saves for a SEH target.
* config/i386/winnt.c (struct seh_frame_state): Document constraint.
gcc/testsuite/
* g++.dg/eh/seh-xmm-unwind.C: New test.

Daily bump.

c++: Allow GNU attributes before lambda -> [PR90333]

In my 9.3/10 patch for 90333 I allowed attributes between [] and (), and
after the trailing return type, but not in the place that GCC 8 expected
them, and we've gotten several bug reports about that. So let's allow them
there, as well.

gcc/cp/ChangeLog:

PR c++/90333
* parser.c (cp_parser_lambda_declarator_opt): Accept GNU attributes
between () and ->.

gcc/testsuite/ChangeLog:

PR c++/90333
* g++.dg/ext/attr-lambda3.C: New test.

c++: variadic lambda template and empty pack [PR97246]

In get<0>, Is is empty, so the first parameter pack of the lambda is empty,
but after the fix for PR94546 we were wrongly associating it with the
partial instantiation of 'v'.

gcc/cp/ChangeLog:

PR c++/97246
PR c++/94546
* pt.c (extract_fnparm_pack): Check DECL_PACK_P here.
(register_parameter_specializations): Not here.

gcc/testsuite/ChangeLog:

PR c++/97246
* g++.dg/cpp2a/lambda-generic-variadic21.C: New test.

Daily bump.

PR fortran/93340 - fix missed substring simplifications

Substrings were not reduced early enough for use in initializations,
such as DATA statements. Add an early simplification for substrings
with constant starting and ending points.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_resolve_substring): Add prototype.
* primary.c (match_string_constant): Simplify substrings with
constant starting and ending points.
* resolve.c: Rename resolve_substring to gfc_resolve_substring.
(gfc_resolve_ref): Use renamed function gfc_resolve_substring.

gcc/testsuite/ChangeLog:

* substr_10.f90: New test.
* substr_9.f90: New test.

(cherry picked from commit bdd1b1f55529da00b867ef05a53a08fbfc3d1c2e)

Daily bump.

arm: Fix CMSE support detection in libgcc (PR target/99157)

As discussed in the PR, the Makefile fragment lacks a double '$' to
get the return-code from GCC invocation, resulting in CMSE support
missing from multilibs.

I checked that the simple patch proposed in the PR fixes the problem.

2021-02-23 Christophe Lyon <christophe.lyon@linaro.org>
Hau Hsu <hsuhau617@gmail.com>

PR target/99157
libgcc/
* config/arm/t-arm: Fix cmse support detection.

(cherry picked from commit be30dd89926d5dd19d72f90c1586b0e2557fde43)

Fortran: Fix for class defined operators [PR99124].

2021-02-23 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/99124
* resolve.c (resolve_fl_procedure): Include class results in
the test for F2018, C15100.
* trans-array.c (get_class_info_from_ss): Do not use the saved
descriptor to obtain the class expression for variables. Use
gfc_get_class_from_expr instead.

gcc/testsuite/
PR fortran/99124
* gfortran.dg/class_defined_operator_2.f03 : New test.
* gfortran.dg/elemental_result_2.f90 : New test.
* gfortran.dg/class_assign_4.f90: Correct the non-conforming
elemental function with an allocatable result with an operator
interface with array dummies and result.

(cherry picked from commit 29a5298955f777c539c628f51e78b75d8e586c44)

Daily bump.

PR fortran/99169 - Do not clobber allocatable intent(out) dummy argument

gcc/fortran/ChangeLog:

* trans-expr.c (gfc_conv_procedure_call): Do not add clobber to
allocatable intent(out) argument.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_3.f90: New test.

(cherry picked from commit 2df374b337a5f6cf5528e91718e4e12e4006b7ae)

aarch64: Add cpu cost tables for A64FX

This is a backport of adding cost tables for A64FX.
Bootstrapped and tested on aarch64-none-linux-gnu.

2021-02-23 Qian Jianhua <qianjh@cn.fujitsu.com>

gcc/ChangeLog:

* config/aarch64/aarch64-cost-tables.h (a64fx_extra_costs): New.
* config/aarch64/aarch64.c (a64fx_addrcost_table): New.
(a64fx_regmove_cost, a64fx_vector_cost): New.
(a64fx_tunings): Use the newly added cost tables.

Daily bump.

Add mi_thunk support for vcalls on hppa.

gcc/ChangeLog:

PR target/85074
* config/pa/pa.c (TARGET_ASM_CAN_OUTPUT_MI_THUNK): Define as
hook_bool_const_tree_hwi_hwi_const_tree_true.
(pa_asm_output_mi_thunk): Add support for nonzero vcall_offset.

Fortran/OpenMP: Fix optional dummy procedures [PR99171]

gcc/fortran/ChangeLog:

PR fortran/99171
* trans-openmp.c (gfc_omp_is_optional_argument): Regard optional
dummy procs as nonoptional as no special treatment is needed.

libgomp/ChangeLog:

PR fortran/99171
* testsuite/libgomp.fortran/dummy-procs-1.f90: New test.

(cherry picked from commit e9b34037cdd196ab912a7ac3358f8a8d3e307e92)

Fortran: Fix ubound simplification [PR99027]

gcc/fortran/ChangeLog:

PR fortran/99027
* simplify.c (simplify_bound_dim): Honor DIMEN_ELEMENT
when using dim=.

gcc/testsuite/ChangeLog:

PR fortran/99027
* gfortran.dg/ubound_1.f90: New test.

(cherry picked from commit f600f271b10d0214b111f2aa52a3d5740477e139)

Daily bump.

aarch64: Introduce prefer_advsimd_autovec to GCC 10

This patch introduces the prefer_advsimd_autovec internal tune flag that's already available on the GCC 8 and 9 branches.
It allows a CPU tuning to specify that it prefers Advanced SIMD for autovectorisation rather than SVE.
In GCC 10 onwards this can be easily adjusted through the aarch64_autovec_preference param in the options override hook.
The neoversev1_tunings struct makes use of this tuning flag.

Bootstrapped and tested on aarch64-none-linux-gnu.
Confirmed that an --param aarch64-autovec-preference can override the CPU setting if the user really wishes to.

gcc/ChangeLog:

* config/aarch64/aarch64-tuning-flags.def (prefer_advsimd_autovec): Define.
* config/aarch64/aarch64.c (neoversev1_tunings): Use it.
(aarch64_override_options_internal): Adjust aarch64_autovec_preference
param when prefer_advsimd_autovec is enabled.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd_autovec_only_1.c: New test.

Fortran: Reject DT as fmt in I/O statements [PR99111]

gcc/fortran/ChangeLog:

PR fortran/99111
* io.c (resolve_tag_format): Reject BT_DERIVED/CLASS/VOID
as (array-valued) FORMAT tag.

gcc/testsuite/ChangeLog:

PR fortran/99111
* gfortran.dg/fmt_nonchar_1.f90: New test.
* gfortran.dg/fmt_nonchar_2.f90: New test.

(cherry picked from commit ebf9b6c13f0847ddcc22e540a5fcdbf644e85a9c)

Daily bump.

c++: Revert EXPR_LOCATION change to build_aggr_init_expr [PR96997]

My change in r10-7718 to make build_aggr_init_expr set EXPR_LOCATION
(mimicking build_target_expr) causes the debuginfo regression PR96997.
Given that this change is mostly independent of the rest of the commit,
and that the only fallout of reverting it is a less accurate error
message location in a testcase introduced in the same commit, it seems
the best way forward is to just revert this part of the commit.

gcc/cp/ChangeLog:

PR debug/96997
PR c++/94034
* tree.c (build_aggr_init_expr): Revert r10-7718 change.

gcc/testsuite/ChangeLog:

PR debug/96997
PR c++/94034
* g++.dg/cpp1y/constexpr-nsdmi7b.C: Adjust expected location of
"call to non-'constexpr' function" error message.

(cherry picked from commit 78a6d0e30d7950216dc0c5be5d65d0cbed13924c)

Daily bump.

Fix cast in df_worklist_dataflow_doublequeue

The existing cast to float gives weird results in the RTL dump files
on x86 when the compiler is configured with --with-fpmath=sse.
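
The message does not quote the changed expression. As a minimal sketch, assuming the value in question is a ratio of two integer counts that ends up in a dump file through a floating-point format, the difference between the two casts looks like this. The helper names are hypothetical and this is not the code from df-core.c.

```cpp
#include <cassert>

// Hypothetical helper (not the GCC code): average visits per basic block,
// with the division carried out in double, the type a printf-style "%g"
// conversion receives anyway.
inline double visits_per_block(int visits, int n_blocks) {
  assert(n_blocks > 0);
  return static_cast<double>(visits) / n_blocks;
}

// The same ratio with a float cast, shown only for comparison: the
// division is carried out in float before any later promotion to double.
inline float visits_per_block_float(int visits, int n_blocks) {
  assert(n_blocks > 0);
  return static_cast<float>(visits) / n_blocks;
}
```

Casting to double up front means the printed value does not depend on how a float intermediate is evaluated on a given floating-point unit.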

gcc/
* df-core.c (df_worklist_dataflow_doublequeue): Use proper cast.

Daily bump.

libgfortran: Fix PR95647 by changing the interfaces of operators .eq. and .ne.

The front end converts the old-style .eq. to == and then tracks only
the ==.  The module starts with ==, so it does not properly overload
.eq.; reversing the interfaces fixes this.

2021-02-12  Steve Kargl <sgk@troutmask.apl.washington.edu>

libgfortran/ChangeLog:

PR libfortran/95647
* ieee/ieee_arithmetic.F90: Flip interfaces of operators .eq. to
== and .ne. to /= .

gcc/testsuite/ChangeLog:

PR libfortran/95647
* gfortran.dg/ieee/ieee_12.f90: New test.

Fortran: Fix rank of assumed-rank array [PR99043]

gcc/fortran/ChangeLog:

PR fortran/99043
* trans-expr.c (gfc_conv_procedure_call): Don't reset
rank of assumed-rank array.

gcc/testsuite/ChangeLog:

PR fortran/99043
* gfortran.dg/assumed_rank_20.f90: New test.

(cherry picked from commit f699e0b16578cdc1be8b90691ef8b0964af32d2f)

c++: consteval and explicit instantiation [PR96905]

Normally, an explicit instantiation means we want to write out the
instantiation. But not for a consteval function.

gcc/cp/ChangeLog:

PR c++/96905
* pt.c (mark_decl_instantiated): Exit early if consteval.

gcc/testsuite/ChangeLog:

PR c++/96905
* g++.dg/cpp2a/consteval-expinst1.C: New test.

c++: generic lambda, fn* conv, empty class [PR98326]

Here, in the thunk returned from the captureless lambda conversion to
pointer-to-function, we try to pass through invisible reference parameters
by reference, without doing a copy. The empty class copy optimization was
messing that up.
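
The situation can be sketched as follows. This is an assumed reduction, not the contents of lambda-generic-empty1.C; the names Empty and call_through_pointer are hypothetical. The class has no data members, yet its user-provided copy constructor makes it non-trivial for calls, so under the Itanium C++ ABI it is passed by invisible reference.

```cpp
#include <cassert>

// Hypothetical example type: empty, but with a user-provided copy
// constructor, so it is not trivially copyable.
struct Empty {
  Empty() = default;
  Empty(const Empty &) {}
};

// A captureless generic lambda converted to a plain function pointer.
// The conversion yields a thunk that forwards its parameter to the
// lambda's operator().
inline int call_through_pointer() {
  int (*fp)(Empty) = [](auto) { return 42; };
  assert(fp != nullptr);  // the conversion yields a callable pointer
  return fp(Empty{});
}
```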

gcc/cp/ChangeLog:

PR c++/98326
PR c++/20408
* cp-gimplify.c (simple_empty_class_p): Don't touch an invisiref
parm.

gcc/testsuite/ChangeLog:

PR c++/98326
* g++.dg/cpp1y/lambda-generic-empty1.C: New test.

Daily bump.

c++: Endless loop with targ deduction in member tmpl [PR95888]

My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C<T>.  C is a member template; its
most general template is

  template<class U> template<class V> struct B<U>::C

we figure out (tsubst_template_args) that the template argument list
is <int, int>.  They come from different levels, one comes from B<int>,
the other one from fn<int>.

So now we lookup_template_class to see if we have C<int, int>.  We
do the
  /* This is a full instantiation of a member template.  Find
     the partial instantiation of which this is an instance.  */
  TREE_VEC_LENGTH (arglist)--;
  // arglist is now <int>, not <int, int>
  found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
  TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template<class V> struct B<int>::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute <int> into <U, V>) returns <int, V>, but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

  // t = template_decl C
  // full_args = <int, V>
  spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B<int> b; in the test, because V's levels don't match.  Whereupon
tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
the infinite looping problem.

Fixed by using tf_none when looking for an existing partial instantiation.

It also occurred to me that I should be able to trigger a similar
problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
I constructed deduce10.C which exhibits the same issue with pre-r10-7007
compilers.  This patch fixes that problem as well.  I'm ecstatic.
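
The shape of the testcase described above can be sketched as follows. This is a hypothetical reduction, not the contents of deduce9.C: the names B, C and fn come from the message, while the parameter names T and W, the member levels() and the wrapper call_fn() are assumptions added for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical reduction (assumed, not the actual testcase).
template <class U>
struct B {
  // Member template; its most general form is
  //   template<class U> template<class V> struct B<U>::C
  template <class V>
  struct C {
    // C takes its template arguments from two levels: U and V.
    int levels() const { return 2; }
  };

  // Two template parameters: T is given explicitly at the call site,
  // W has to be deduced from the function argument.
  template <class T, class W>
  C<T> fn(W) { return C<T>(); }
};

// Calls fn with one explicit and one deduced template argument.
inline int call_fn() {
  B<int> b;
  auto c = b.fn<int>(0L);  // T = int (explicit), W = long (deduced)
  static_assert(std::is_same<decltype(c), B<int>::C<int>>::value,
                "the return type is C<int> inside B<int>");
  return c.levels();
}
```

Here the full argument list for the return type is <int, int>, with one argument coming from B<int> and the other from the explicit argument to fn.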

gcc/cp/ChangeLog:

PR c++/95888
* pt.c (lookup_template_class_1): Pass tf_none to tsubst when looking
for the partial instantiation.

gcc/testsuite/ChangeLog:

PR c++/95888
* g++.dg/template/deduce10.C: New test.
* g++.dg/template/deduce9.C: New test.

(cherry picked from commit 88cfd531c69b3c1fe7a3c183d83cfeacc8f69402)

Fix -freorder-blocks-and-partition glitch with Windows SEH

Since GCC 8, the -freorder-blocks-and-partition pass can split a function
into hot and cold parts, thus generating 2 CIEs for a single function in
DWARF for exception purposes and doing an equivalent trick for Windows SEH.

Now the Windows system unwinder is picky when it comes to the boundary
between an active EH region and the end of the function and, therefore,
a nop may need to be added in specific cases.

gcc/
* config/i386/winnt.c (i386_pe_seh_unwind_emit): When switching to
the cold section, emit a nop before the directive if the previous
active instruction can throw.

Fortran: Fix calls to associate name typebound subroutines [PR98897].

2021-02-11 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/98897
* match.c (gfc_match_call): Include associate names as possible
entities with typebound subroutines. The target needs to be
resolved for the type.

gcc/testsuite/
PR fortran/98897
* gfortran.dg/typebound_call_32.f90: New test.

(cherry picked from commit ff6903288d96aa1d28ae4912b1270985475f3ba8)

Fortran: Fix ICE after error regression [PR99060].

2021-02-11 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/99060
* primary.c (gfc_match_varspec): Test for non-null 'previous'
before using its name in the error message.

gcc/testsuite/
PR fortran/99060
* gfortran.dg/pr99060.f90: New test.

(cherry picked from commit 5ee5415af8691640b0f7a5332b78d04ba309f4f0)

Daily bump.