git.ipfire.org Git - thirdparty/gcc.git/log

Darwin, config: Revise host config fragment.

There were two uses for the Darwin host config fragment:

The first is to arrange for targets that support mdynamic-no-pic
to be built with that enabled (since it makes a significant
difference to the compiler performance). We can be more specific
in the application of this, since it only applies to 32b hosts
plus powerpc64-darwin9.

The second was to work around a tool bug where -fno-PIE was not
propagated to the link stage. This second use is redundant,
since the buggy toolchain cannot bootstrap current GCC sources
anyway.

This makes the host fragment more specific and reduces the number
of toolchains for which it is included, which in turn reduces clutter
in configure lines.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
config/ChangeLog:

* mh-darwin: Make this specific to handling the
mdynamic-no-pic case.

ChangeLog:

* configure: Regenerate.
* configure.ac: Adjust cases for which it is necessary to
include the Darwin host config fragment.

(cherry picked from commit 54258e22b0846aaa6bd3265f592feb161eecda75)

Darwin, PPC : Fix R13 for PPC64.

We have a somewhat unusual situation in that for PPC64, R13 is
both reserved and callee-saved (it is used internally by the
pthreads implementation to contain pthread_self).

So add R13 to the fixed regs, but also keep it in the callee-
saved set.

gcc/ChangeLog:

* config/rs6000/darwin.h (FIXED_R13): Add for PPC64.
(FIRST_SAVED_GP_REGNO): Save from R13 even when it is one
of the fixed regs.

(cherry picked from commit b12d6e79899fd27833c53ffc3c973538244f62e1)

Darwin: Future-proof and homogenise detection of darwin versions

The current GCC branch will become 12.1.0, which will be the stable
version of GCC when the next macOS version is released. There are some
places in GCC that don’t handle darwin22 as a version, so we need to
future-proof it (gcc/config.gcc and gcc/config/darwin-driver.c). We
align that code with what Apple clang does, i.e. accept all potential
major macOS versions until 99.

This patch also homogenises the handling of darwin version numbers,
where the majority of places use darwin2*, but some used darwin2[0-9]*.
Since there never was a darwin2.x version, the two are equivalent, and
we prefer the simpler darwin2*.

gcc/ChangeLog:

* config/darwin-driver.c: Make version code more future-proof.
* config.gcc: Homogenise darwin versions.
* configure.ac: Homogenise darwin versions.
* configure: Regenerate.

gcc/testsuite/ChangeLog:

* gcc.dg/darwin-minversion-link.c: Test darwin21.
* obj-c++.dg/cxx-ivars-3.mm: Homogenise darwin versions.
* obj-c++.dg/objc-gc-3.mm: Homogenise darwin versions.
* objc.dg/objc-gc-4.m: Homogenise darwin versions.

(cherry picked from commit f18cbc1ee1f421a0dd79dc389bef9a23dd4a761d)

Darwin, config: Amend for Darwin 21 / macOS 12.

It seems that the OS major version is now tracking the kernel
major version - 9. The minor version has been set to the kernel
minor version - 1.
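
The mapping described above can be sketched as a small function. This is only an illustration and not the code in darwin-driver.c: the function name is hypothetical, the clamp of the minor version at 0 is an assumption, and the pre-Darwin-20 branch relies on the older 10.(kernel major - 4) scheme.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not GCC's darwin_find_version_from_kernel): map a
// Darwin kernel version to the macOS version it corresponds to.
std::string macos_version_from_kernel(int kernel_major, int kernel_minor) {
  if (kernel_major >= 20) {
    // Darwin 20 and later: OS major = kernel major - 9,
    // OS minor = kernel minor - 1 (clamped at 0; the clamp is an assumption).
    int minor = kernel_minor > 0 ? kernel_minor - 1 : 0;
    return std::to_string(kernel_major - 9) + "." + std::to_string(minor);
  }
  // Older scheme: Mac OS X / macOS 10.(kernel major - 4).
  return "10." + std::to_string(kernel_major - 4);
}
```

Under these assumptions a Darwin 21.1 kernel maps to macOS 12.0, and Darwin 19 maps to macOS 10.15.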

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Signed-off-by: Saagar Jha <saagar@saagarjha.com>
gcc/ChangeLog:

* config.gcc: Adjust for Darwin21.
* config/darwin-c.c (macosx_version_as_macro): Likewise.
* config/darwin-driver.c (validate_macosx_version_min):
Likewise.
(darwin_find_version_from_kernel): Likewise.

(cherry picked from commit 11b967577483e51f97d540e9c2c9d1ea76da8122)

Darwin, X86, config: Adjust 'as' command lines [PR100340].

Versions of the assembler using clang from XCode 12.5/12.5.1
have a bug which produces different code layout between debug and
non-debug input, leading to a compare fail for default configure
parameters.

This is a workaround fix to disable the optimisation that is
responsible for the bug.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR target/100340 - Bootstrap fails with Clang 12.0.5 (XCode 12.5)

PR target/100340

gcc/ChangeLog:

* config.in: Regenerate.
* config/i386/darwin.h (EXTRA_ASM_OPTS): New.
(ASM_SPEC): Pass options to disable branch shortening where
needed.
* configure: Regenerate.
* configure.ac: Detect versions of 'as' that support the
optimisation which has the bug.

(cherry picked from commit 743b8dd6fd757e997eb060d70fd4ae8e04fb56cd)

Darwin: Fix a type mismatch warning for a non-GCC bootstrap compiler.

DECL_MD_FUNCTION_CODE() returns an int; with one particular compiler the
code in darwin_fold_builtin() triggers a type mismatch warning.

Fixed thus.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* config/darwin.c (darwin_fold_builtin): Make fcode an int to
avoid a mismatch with DECL_MD_FUNCTION_CODE().

(cherry picked from commit 25587472ccd223c861fe77cfeca4ba33c3f6cd99)

Daily bump.

testsuite: Skip pr105250.c for powerpc and s390 [PR105266]

This test case pr105250.c is like its related pr105140.c, which
suffers the error with message like "{AltiVec,vector} argument
passed to unprototyped" on powerpc and s390. So like commits
r12-8025 and r12-8039, this fix is to add the dg-skip-if for
powerpc*-*-* and s390*-*-*.

gcc/testsuite/ChangeLog:

PR testsuite/105266
* gcc.dg/pr105250.c: Skip for powerpc*-*-* and s390*-*-*.

(cherry picked from commit 021b51814d67bedd8f41ac07edfd05654140c6e5)

tree-optimization/105250 - adjust fold_convertible_p PR105140 fix

The following reverts the original PR105140 fix and instead applies
the additional fold_convert constraint for VECTOR_TYPE conversions
to fold_convertible_p as well. I did not try sanitizing all of this
at this point.

2022-04-13 Richard Biener <rguenther@suse.de>

PR tree-optimization/105250
* fold-const.c (fold_convertible_p): Revert
r12-7979-geaaf77dd85c333, instead check for size equality
of the vector types involved.

* gcc.dg/pr105250.c: New testcase.

(cherry picked from commit 4e892de6774f86540d36385701aa7b0a2bba5155)

IBM zSystems/testsuite: PR105147: Skip pr105140.c

pr105140.c fails on IBM zSystems with "vector argument passed to
unprototyped function". s390_invalid_arg_for_unprototyped_fn in
s390.cc is triggered by that.

gcc/testsuite/ChangeLog:

PR target/105147
* gcc.dg/pr105140.c: Skip for s390*-*-*.

(cherry picked from commit 176df4ccb58689aae29511b99d60a448558ede94)

rs6000/testsuite: Skip pr105140.c

This test fails with error "AltiVec argument passed to unprototyped
function", but the code (in rs6000.c:invalid_arg_for_unprototyped_fn,
from 2005) actually tests for any vector type argument. It also does
not fail on Darwin, though that is not reflected here.

2022-04-06 Segher Boessenkool <segher@kernel.crashing.org>

gcc/testsuite/
PR target/105147
* gcc.dg/pr105140.c: Skip for powerpc*-*-*.

(cherry picked from commit c65d15d40738f3691ff1a39907a4b93e9fe5c5ae)

middle-end/105140 - fix bogus recursion in fold_convertible_p

fold_convertible_p expects an operand and a type to convert to
but recurses with two vector component types. Fixed by allowing
types instead of an operand as well.

2022-04-04 Richard Biener <rguenther@suse.de>

PR middle-end/105140
* fold-const.c (fold_convertible_p): Allow a TYPE_P arg.

* gcc.dg/pr105140.c: New testcase.

(cherry picked from commit eaaf77dd85c333b116111bb1ae6c080154a4e411)

tree-optimization/105163 - abnormal SSA coalescing and reassoc

The negate propagation optimizations in reassoc did not look out for
abnormal SSA coalescing issues. The following fixes that.

2022-04-06 Richard Biener <rguenther@suse.de>

PR tree-optimization/105163
* tree-ssa-reassoc.c (repropagate_negates): Avoid propagating
negated abnormals.

* gcc.dg/torture/pr105163.c: New testcase.

(cherry picked from commit 44fe49401725055a740ce47e80561b6932b8cd01)

tree-optimization/105173 - fix insertion logic in reassoc

The find_insert_point logic around deciding whether to insert
before or after the found insertion point does not handle
the case of _12 = ..;, _12, 1.0 well. The following puts the
logic into find_insert_point itself instead.

2022-04-06 Richard Biener <rguenther@suse.de>

PR tree-optimization/105173
* tree-ssa-reassoc.c (find_insert_point): Get extra
insert_before output argument and compute it.
(insert_stmt_before_use): Adjust.
(rewrite_expr_tree): Likewise.

* gcc.dg/pr105173.c: New testcase.

(cherry picked from commit e1a5e7562d53a8d2256f754714b06595bea72196)

tree-optimization/105431 - another overflow in powi handling

This avoids undefined signed overflow when calling powi_as_mults_1.

2022-04-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/105431
* tree-ssa-math-opts.c (powi_as_mults_1): Make n unsigned.
(powi_as_mults): Use absu_hwi.
(gimple_expand_builtin_powi): Remove now pointless n != -n
check.

(cherry picked from commit 44b09adb9bad99dd7e3017c5ecefed7f7c9a1590)

tree-optimization/105368 - avoid overflow in powi_cost

The following avoids undefined signed overflow when computing
the absolute of the exponent in powi_cost.
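
As an illustration of why the unsigned absolute value matters: negating the most negative signed value overflows, which is undefined behaviour, whereas converting to unsigned first and negating there is well defined. The helper below is a hypothetical stand-in for that idea and is not GCC's absu_hwi.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for an "absolute value as unsigned" helper.
unsigned long long abs_as_unsigned(long long n) {
  // Convert first, then negate in unsigned arithmetic, which wraps modulo
  // 2^N and is therefore defined even for LLONG_MIN.
  unsigned long long u = static_cast<unsigned long long>(n);
  return n < 0 ? 0ULL - u : u;
}
```

With this form, abs_as_unsigned(LLONG_MIN) yields LLONG_MAX + 1 as an unsigned value, where a plain -n on the signed operand would overflow.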

2022-04-25 Richard Biener <rguenther@suse.de>

PR tree-optimization/105368
* tree-ssa-math-opts.c (powi_cost): Use absu_hwi.

(cherry picked from commit f0e170f72f8bfaa2a64e1d09ebdfd48f917420f1)

rtl-optimization/105559 - avoid quadratic behavior in delete_insn_and_edges

When the insn to delete is a debug insn there's no point in figuring
out whether it might be the last real insn and thus we have to purge
dead edges.

2022-05-11 Richard Biener <rguenther@suse.de>

PR rtl-optimization/105559
* cfgrtl.c (delete_insn_and_edges): Only perform search to BB_END
for non-debug insns.

(cherry picked from commit 37a8220fa9188470c677abfef50c1b120c0b6c76)

Daily bump.

c++: constexpr ref to array of array [PR102307]

The problem here is that first check_initializer calls
build_aggr_init_full_exprs, which does overload resolution, but then in the
case of failed constexpr throws away the result and does it again in
build_functional_cast.  But in the first overload resolution,
reshape_init_array_1 decided to reuse the inner CONSTRUCTORs because
tf_error is set, so we know we're committed.  But the second pass gets
confused by the CONSTRUCTORs with non-init-list types.

Fixed by avoiding a second pass: instead, pass the call from build_aggr_init
to build_cplus_new, which will turn it into a TARGET_EXPR.  I don't bother
to change the object argument because it will be replaced later in
simplify_aggr_init_expr.

PR c++/102307

gcc/cp/ChangeLog:

* decl.c (check_initializer): Use build_cplus_new in case of
constexpr failure.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-array2.C: New test.

Revert "c++: pack init-capture of unresolved overload [PR102629]"

PR c++/105722

This reverts commit 93ec7bf22530610ef697fd3a64a28bebd589c790.

Daily bump.

rs6000: __Uglify non-uglified local variables in headers

Properly prefix (with "__")  all local variables in shipped headers for x86
compatibility intrinsics implementations.  This avoids possible problems with
usages like:
```
#define result foo()
#include <emmintrin.h>
```
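
A minimal sketch of why the prefix helps, using a hypothetical header-style helper rather than anything taken from emmintrin.h: the user macro rewrites the token `result`, but `__result` is a different token and is left alone. Names containing a double underscore are reserved for the implementation, which is why shipped headers can use them safely; ordinary user code should not.

```cpp
#include <cassert>

// A user macro with a common name, defined before the "header" is seen.
#define result not_a_variable()

// Header-style inline helper (hypothetical). Had the local been spelled
// "result", the macro above would have expanded inside this function and
// broken the build; the reserved "__" spelling is immune to that.
static inline int add_clamped_to_100(int __a, int __b) {
  int __result = __a + __b;
  return __result > 100 ? 100 : __result;
}
```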

2022-05-23  Paul A. Clarke  <pc@us.ibm.com>

gcc
PR target/104257
* config/rs6000/bmi2intrin.h: Uglify local variables.
* config/rs6000/emmintrin.h: Likewise.
* config/rs6000/mm_malloc.h: Likewise.
* config/rs6000/mmintrin.h: Likewise.
* config/rs6000/pmmintrin.h: Likewise.
* config/rs6000/tmmintrin.h: Likewise.
* config/rs6000/xmmintrin.h: Likewise.

Daily bump.

Fortran: fix error recovery on invalid array section

gcc/fortran/ChangeLog:

PR fortran/105230
* expr.c (find_array_section): Correct logic to avoid NULL
pointer dereference on invalid array section.

gcc/testsuite/ChangeLog:

PR fortran/105230
* gfortran.dg/pr105230.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit 0acdbe29f66017fc5cca40dcbd72a0dd41491d07)

Fortran: improve error recovery on invalid array section

gcc/fortran/ChangeLog:

PR fortran/104849
* expr.c (find_array_section): Avoid NULL pointer dereference on
invalid array section.

gcc/testsuite/ChangeLog:

PR fortran/104849
* gfortran.dg/pr104849.f90: New test.

(cherry picked from commit 22015e77d3e45306077396b9de8a8a28bb67fb20)

Fortran: a RECURSIVE procedure cannot be an INTRINSIC

gcc/fortran/ChangeLog:

PR fortran/105138
* intrinsic.c (gfc_is_intrinsic): When a symbol refers to a
RECURSIVE procedure, it cannot be an INTRINSIC.

gcc/testsuite/ChangeLog:

PR fortran/105138
* gfortran.dg/recursive_reference_3.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit d46685b04071a485b56de353d997a866bfc8caba)

add barriers to ool __sync builtins

2022-05-13 Sebastian Pop <spop@amazon.com>

gcc/
PR target/105162
* config/aarch64/aarch64-protos.h (atomic_ool_names): Increase dimension
of str array.
* config/aarch64/aarch64.c (aarch64_atomic_ool_func): Call
memmodel_from_int and handle MEMMODEL_SYNC_*.
(DEF0): Add __aarch64_*_sync functions.

gcc/testsuite/
PR target/105162
* gcc.target/aarch64/sync-comp-swap-ool.c: New.
* gcc.target/aarch64/sync-op-acquire-ool.c: New.
* gcc.target/aarch64/sync-op-full-ool.c: New.
* gcc.target/aarch64/target_attr_20.c: Update check.
* gcc.target/aarch64/target_attr_21.c: Same.

libgcc/
PR target/105162
* config/aarch64/lse.S: Define BARRIER and handle memory MODEL 5.
* config/aarch64/t-lse: Add a 5th memory model for _sync functions.

libstdc++: Fix hyperlink in docs

libstdc++-v3/ChangeLog:

* doc/xml/manual/prerequisites.xml: Fix attributes for external
hyperlink.
* doc/html/manual/setup.html: Regenerate.

(cherry picked from commit 682e587f1021241758f7dfe0b22651008622a312)

libstdc++: Fix status docs for <bit> support

libstdc++-v3/ChangeLog:

* doc/html/manual/status.html: Regenerate.
* doc/xml/manual/status_cxx2020.xml: Fix supported version for
C++20 bit operations.

(cherry picked from commit 64648821f151b0f86e16185bef7f0be5635fd737)

Daily bump.

c++: static memfn from non-dependent base [PR101078]

After my patch for PR91706, or before that with the qualified call,
tsubst_baselink returned a BASELINK with BASELINK_BINFO indicating a base of
a still-dependent derived class. We need to look up the relevant base binfo
in the substituted class.

PR c++/101078

gcc/cp/ChangeLog:

* pt.c (tsubst_baselink): Update binfos in non-dependent case.

gcc/testsuite/ChangeLog:

* g++.dg/template/access39.C: New test.

c++: alignment of local typedef in template [PR65211]

Because common_handle_aligned_attribute only applies the alignment to the
TREE_TYPE of a typedef, not the DECL_ORIGINAL_TYPE, we need to copy it
explicitly in tsubst.

PR c++/65211

gcc/cp/ChangeLog:

* pt.c (tsubst_decl) [TYPE_DECL]: Copy TYPE_ALIGN.

gcc/testsuite/ChangeLog:

* g++.target/i386/vec-tmpl1.C: New test.

c++: template conversion op [PR101698]

Asking for conversion to a dependent type also makes a BASELINK dependent.

PR c++/101698

gcc/cp/ChangeLog:

* pt.c (tsubst_baselink): Also check dependent optype.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv19.C: New test.

c++: NRV and ref-extended temps [PR101442]

This issue goes back to r83221, where the cleanup for extended ref temps
changed from being unconditional to being tied to the declaration they
formed part of the initializer for.

The named return value optimization changes the cleanup for the NRV variable
to only run on the EH path; we don't want that change to affect temporary
cleanups. The perform_member_init change isn't necessary (there 'decl' is a
COMPONENT_REF), it's just for consistency.

PR c++/101442

gcc/cp/ChangeLog:

* decl.c (cp_finish_decl): Don't pass decl to push_cleanup.
* init.c (perform_member_init): Likewise.
* semantics.c (push_cleanup): Adjust comment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-nrv1.C: New test.

c++: extern template and static data member [PR99066]

'extern template' should mean that the relevant symbols are never emitted.
But in this case we were assuming that DECL_EXTERNAL was already set on the
variable, so we just needed to clear DECL_NOT_REALLY_EXTERN. Since
DECL_EXTERNAL was not set, we emitted a definition of npos.

gcc/cp/ChangeLog:

PR c++/99066
* pt.c (mark_decl_instantiated): Set DECL_EXTERNAL.

gcc/testsuite/ChangeLog:

PR c++/99066
* g++.dg/cpp0x/extern_template-6.C: New test.

c++: mangling of lambdas in default args [PR91241]

In this testcase, the parms remembered in LAMBDA_EXPR_EXTRA_SCOPE are no
longer the parms of the FUNCTION_DECL they have as their DECL_CONTEXT, so we
were mangling both lambdas as parm #0. But since the parms are numbered
from right to left we don't need to find them in the FUNCTION_DECL,
we can measure their own DECL_CHAIN.

gcc/cp/ChangeLog:

PR c++/91241
* mangle.c (write_compact_number): Add sanity check.
(write_local_name): Use list_length for parm number.

gcc/testsuite/ChangeLog:

PR c++/91241
* g++.dg/abi/lambda-defarg1.C: New test.

c++: argument pack with expansion [PR86355]

This testcase revealed that we were using PACK_EXPANSION_EXTRA_ARGS a lot
more than necessary; use_pack_expansion_extra_args_p meant to use it in the
case of corresponding arguments in different argument packs differing in
whether they are pack expansions, but it was mistakenly also returning true
for the case of a single argument pack containing both expansion and
non-expansion elements.

Surprisingly, just disabling that didn't lead to any regressions in the
testsuite; it seems other changes have prevented us getting to this point
for code that used to exercise it. So this patch limits the check to
arguments in the same position in the packs, and asserts that we never
actually see a mismatch.

PR c++/86355

gcc/cp/ChangeLog:

* pt.c (use_pack_expansion_extra_args_p): Don't compare
args from the same argument pack.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-variadic2.C: New test.

Daily bump.

Revert "c++: designator and anon struct [PR101767]"

101767 wasn't broken on the 10 branch, so it doesn't need the fix, just the
test.

PR c++/101767

This reverts commit 846bff4d4659d9b2026da574194599f38a00cc79.

Revert "c++: friend with redundant qualification [PR41723]"

PR c++/102300
PR c++/41723

The patch for PR41723 caused PR102300 on trunk; let's just back it out on
the 10 branch.

This reverts commit e41d610696b81e72d1d06db176b281424e32fc23.

gcc/testsuite/ChangeLog:

* g++.dg/template/nested7.C: New test.

c++: rodata and defaulted ctor [PR104142]

Trivial initialization shouldn't bump a variable out of .rodata; if the
result of build_aggr_init is an empty STATEMENT_LIST, throw it away.

PR c++/104142

gcc/cp/ChangeLog:

* decl.c (check_initializer): Check TREE_SIDE_EFFECTS.

gcc/testsuite/ChangeLog:

* g++.dg/opt/const7.C: New test.

c++: low -faligned-new [PR102071]

This test ICEd after the constexpr new patch (r10-3661) because alloc_call
had a NOP_EXPR around it; fixed by moving the NOP_EXPR to alloc_expr. And
the PR pointed out that the size_t cookie might need more alignment, so I
fix that as well.

PR c++/102071

gcc/cp/ChangeLog:

* init.c (build_new_1): Include cookie in alignment. Omit
constexpr wrapper from alloc_call.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/aligned-new9.C: New test.

c++: missing dtor with -fno-elide-constructors [PR100838]

tf_no_cleanup only applies to the outermost TARGET_EXPR, and we already
clear it for nested calls in build_over_call, but in this case both
constructor calls came from convert_like, so we need to clear it in the
recursive call as well. This revealed that we were adding an extra
ck_rvalue in direct-initialization cases where it was wrong.

PR c++/100838
PR c++/105265

gcc/cp/ChangeLog:

* call.c (convert_like_internal): Clear tf_no_cleanup when
recursing.
(build_user_type_conversion_1): Only add ck_rvalue if
LOOKUP_ONLYCONVERTING.

gcc/testsuite/ChangeLog:

* g++.dg/init/no-elide2.C: New test.
* g++.dg/cpp0x/initlist-new6.C: New test.

c++: lambda and the current instantiation [PR82980]

When a captured variable is type-dependent, we've expressed the type of the
capture field and proxy with a decltype variant.  But if the type is "the
current instantiation", we need to be able to see that so that we can do
lookup inside it just like we could with the captured variable itself.

I also tried looking through lambda capture in
cp_parser_postfix_dot_deref_expression, but this way seems cleaner.  I plan
to treat more types as deducible in stage 1.

I considered also using this in do_auto_deduction, but think that would be
wrong: [temp.dep.expr] says an id-expression is type-dependent if it is
"associated by name lookup with a variable declared with a type that
contains a placeholder type where the initializer is type-dependent".  That
doesn't clearly exclude deducing a dependent type from the initializer, but
it seems like a barrier, and other implementations agree.

PR c++/82980

gcc/cp/ChangeLog:

* lambda.c (type_deducible_expression_p): New.
(lambda_capture_field_type): Check it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-current-inst1.C: New test.

c++: constexpr trivial -fno-elide-ctors [PR104646]

The constexpr constructor checking code got confused by the expansion of a
trivial copy constructor; we don't need to do that checking for defaulted
ctors, anyway.

PR c++/104646

gcc/cp/ChangeLog:

* constexpr.c (maybe_save_constexpr_fundef): Don't do extra
checks for defaulted ctors.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-fno-elide-ctors1.C: New test.

c++: pack init-capture of unresolved overload [PR102629]

Here we were failing to diagnose that the initializer for the capture pack
is an unresolved overload. It turns out that the reason we didn't recognize
the deduction failure in do_auto_deduction was that the individual 'auto' in
the expansion of the capture pack was still marked as a parameter pack, so
we were deducing it to an empty pack instead of failing.

PR c++/102629

gcc/cp/ChangeLog:

* pt.c (gen_elem_of_pack_expansion_instantiation): Clear
TEMPLATE_TYPE_PARAMETER_PACK on auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-pack-init7.C: New test.

c++: assignment to temporary [PR59950]

Given build_this of a TARGET_EXPR, cp_build_fold_indirect_ref returns the
TARGET_EXPR. But that's the wrong value category for the result of the
defaulted class assignment operator, which returns an lvalue, so we need to
actually build the INDIRECT_REF.

PR c++/59950

gcc/cp/ChangeLog:

* call.c (build_over_call): Use cp_build_indirect_ref.

gcc/testsuite/ChangeLog:

* g++.dg/init/assign2.C: New test.

c++: empty base constexpr -fno-elide-ctors [PR105245]

The patch for 100111 extended our handling of empty base elision to the case
where the derived class has no other fields, but we still need to make sure
that there's some initializer for the derived object.

PR c++/105245
PR c++/100111

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_store_expression): Build a CONSTRUCTOR
as needed in empty base handling.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-empty2.C: Add -fno-elide-constructors.

c++: nested generic lambda in DMI [PR101717]

We were already checking COMPLETE_TYPE_P to recognize instantiation of a
generic lambda, but didn't consider that we might be nested in a non-generic
lambda.

PR c++/101717

gcc/cp/ChangeLog:

* lambda.c (lambda_expr_this_capture): Check all enclosing
lambdas for completeness.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-this4.C: New test.

c++: -Wshadow=compatible-local type vs var [PR100608]

The patch for PR92024 changed -Wshadow=compatible-local to warn if either
new or old decl was a type, but the rationale only talked about the case
where both are types. If only one is, they aren't compatible.
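
A small illustration of the "exactly one is a type" case, with hypothetical names. Going by the ChangeLog entry below, shadowing like this would be expected to be reported under -Wshadow=local rather than -Wshadow=compatible-local, since a type and a variable cannot have compatible types.

```cpp
#include <cassert>

// Hypothetical example: the outer declaration is a variable and the inner
// declaration with the same name is a type.
int shadow_type_vs_variable() {
  int width = 3;
  int total = width;
  {
    typedef long width;  // a type now hides the outer variable
    width w = 4;         // "width" names the typedef in this scope
    total += static_cast<int>(w);
  }
  return total;
}
```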

PR c++/100608

gcc/cp/ChangeLog:

* name-lookup.c (check_local_shadow): Use -Wshadow=local
if exactly one of 'old' and 'decl' is a type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wshadow-compatible-local-3.C: New test.

c++: operator new lookup [PR98249]

The standard says, as we quote in the comment just above, that if we don't
find operator new in the allocated type, it should be looked up in the
global scope. This is specifically ::, not just any namespace, and we
already give an error for an operator new declared in any other namespace.
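
The lookup rule can be illustrated as follows, with hypothetical types: a class that declares its own operator new uses it, and any other type falls back to the global `::operator new`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static int g_class_scope_news = 0;

// Hypothetical type with a class-scope allocation function.
struct Pooled {
  int value = 0;
  static void* operator new(std::size_t size) {
    ++g_class_scope_news;
    return ::operator new(size);  // defer to the global allocator
  }
  static void operator delete(void* p) { ::operator delete(p); }
};

// No member operator new: the new-expression looks in the global scope.
struct Plain {
  int value = 0;
};

int count_class_scope_allocations() {
  g_class_scope_news = 0;
  Pooled* a = new Pooled;  // uses Pooled::operator new
  Plain* b = new Plain;    // uses ::operator new
  delete a;
  delete b;
  return g_class_scope_news;
}
```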

PR c++/98249

gcc/cp/ChangeLog:

* call.c (build_operator_new_call): Just look in ::.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/new3.C: New test.

c++: designator and anon struct [PR101767]

We found .x in the anonymous struct, but then didn't find .y there; we
should decide that means we're done with the struct rather than that the
code is wrong.

PR c++/101767

gcc/cp/ChangeLog:

* decl.c (reshape_init_class): Back out of anon struct
if a designator doesn't match.

gcc/testsuite/ChangeLog:

* g++.dg/ext/anon-struct10.C: New test.

Daily bump.

c++: cxx_eval_array_reference and empty elem type [PR101194]

Here the initializer for x is represented as an empty CONSTRUCTOR due to
its empty element type. So during constexpr evaluation of the ARRAY_REF
x[0], we end up trying to value initialize the omitted element at index 0,
which fails because the element type is not default constructible.

This patch makes cxx_eval_array_reference specifically handle the case
where the element type is an empty type.

PR c++/101194

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_array_reference): When the element type
is an empty type and the corresponding element is omitted, just
return an empty CONSTRUCTOR instead of attempting value
initialization.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-empty16.C: New test.

(cherry picked from commit a688c284dd3848b6c4ea553035f0f9769fb4fbc9)

Daily bump.

[committed] Fix more problems with new linker warnings

gcc/testsuite
* lib/prune.exp (prune_gcc_output): Prune new linker warning.

(cherry picked from commit d993c6dea7c664aa26ee04210c471cfcb4e7d0e0)

g++.dg/gomp/clause-3.C: Fix - missing in r12-438-g1580fc7 [PR100422]

gcc/testsuite/
PR testsuite/100422
* g++.dg/gomp/clause-3.C: Use 'reduction(&:..)' instead of '...(&&:..)'.

(cherry picked from commit af4e4d35f0b84d7c2f57a7b682a09116e9911142)

asan: Fix up asan_redzone_buffer::emit_redzone_byte [PR105396]

On the following testcase, we have in main's frame 3 variables,
some red zone padding, 4 byte d, followed by 12 bytes of red zone padding, then
8 byte b followed by 24 bytes of red zone padding, then 40 bytes c followed
by some red zone padding.
The intended content of shadow memory for that is (note, each byte describes
8 bytes of memory):
f1 f1 f1 f1 04 f2 00 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3
left red    d  mr b  middle r c              right red zone

f1 is left red zone magic
f2 is middle red zone magic
f3 is right red zone magic
00 when all 8 bytes are accessible
01-07 when only 1 to 7 bytes are accessible followed by inaccessible bytes

The -fdump-rtl-expand-details dump makes it clear that it misbehaves:
Flushing rzbuffer at offset -160 with: f1 f1 f1 f1
Flushing rzbuffer at offset -128 with: 04 f2 00 00
Flushing rzbuffer at offset -128 with: 00 00 00 f2
Flushing rzbuffer at offset -96 with: f2 f2 00 00
Flushing rzbuffer at offset -64 with: 00 00 00 f3
Flushing rzbuffer at offset -32 with: f3 f3 f3 f3
In the end we end up with
f1 f1 f1 f1 00 00 00 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3
shadow bytes because at offset -128 there are 2 overlapping stores
as asan_redzone_buffer::emit_redzone_byte has flushed the temporary 4 byte
buffer in the middle.

The function is called with an offset and value.  If the passed offset is
consecutive with the prev_offset + buffer size (off == offset), then
we handle it correctly, similarly if the new offset is far enough from the
old one (we then flush whatever was in the buffer and if needed add up to 3
bytes of 00 before actually pushing value).

But what isn't handled correctly is when the offset isn't consecutive to
what has been added last time, but it is in the same 4 byte word of shadow
memory (32 bytes of actual memory), like the above case where
we have consecutive 04 f2 and then skip one shadow memory byte (aka 8 bytes
of real memory) and then want to emit f2.  Emitting that as a store
of little-endian 0x0000f204 followed by a store of 0xf2000000 to the same
address doesn't work, we want to emit 0xf200f204.

The following patch does that by pushing 1 or 2 00 bytes.
Additionally, as a small cleanup, instead of using
      m_shadow_bytes.safe_push (value);
      flush_if_full ();
in all of if, else if and else bodies it sinks those 2 stmts to the end
of function as all do the same thing.
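
The buffering described above can be modelled roughly as follows. This is a simplified sketch, not the code in asan.c: the class and constant names are hypothetical, stores are recorded in a vector instead of being emitted as RTL, and the sequence of emitted bytes for the example frame is an assumption read off the dump above (fully accessible 00 bytes are not emitted explicitly).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified model; names are illustrative only.
constexpr long kGranularity = 8;       // bytes of memory per shadow byte
constexpr std::size_t kWordBytes = 4;  // shadow bytes written by one store
constexpr long kWordSpan = 32;         // bytes of memory covered by one store

using ShadowStore = std::pair<long, std::vector<std::uint8_t>>;

class RedzoneBuffer {
public:
  explicit RedzoneBuffer(long base) : prev_offset_(base) {}

  // Offsets must be multiples of 8 and must not decrease between calls.
  void emit(long offset, std::uint8_t value) {
    long next = prev_offset_ + kGranularity * static_cast<long>(bytes_.size());
    if (next == offset) {
      // Consecutive shadow byte: nothing to pad.
    } else if (!bytes_.empty() && offset < prev_offset_ + kWordSpan) {
      // Small gap inside the current word: pad with 00 rather than flush.
      for (; next < offset; next += kGranularity)
        bytes_.push_back(0x00);
    } else {
      // Far enough away: flush, then start a new aligned word.
      flush();
      long misalign = (offset - prev_offset_) % kWordSpan;
      prev_offset_ = offset - misalign;
      for (long i = 0; i < misalign / kGranularity; ++i)
        bytes_.push_back(0x00);
    }
    // Common tail, sunk to the end of the function as in the cleanup.
    bytes_.push_back(value);
    if (bytes_.size() == kWordBytes)
      flush();
  }

  void flush() {
    if (bytes_.empty())
      return;
    bytes_.resize(kWordBytes, 0x00);  // fill the word with zeros
    stores_.emplace_back(prev_offset_, bytes_);
    bytes_.clear();
  }

  const std::vector<ShadowStore>& stores() const { return stores_; }

private:
  long prev_offset_;
  std::vector<std::uint8_t> bytes_;
  std::vector<ShadowStore> stores_;
};

// Replays the example frame from the description (assumed emit sequence).
std::vector<ShadowStore> stores_for_example_frame() {
  RedzoneBuffer buf(-160);
  for (long off = -160; off < -128; off += 8) buf.emit(off, 0xf1);
  buf.emit(-128, 0x04);  // d: 4 accessible bytes
  buf.emit(-120, 0xf2);
  buf.emit(-104, 0xf2);  // skips the shadow byte at -112 (b, all accessible)
  buf.emit(-96, 0xf2);
  buf.emit(-88, 0xf2);
  for (long off = -40; off < 0; off += 8) buf.emit(off, 0xf3);
  buf.flush();
  return buf.stores();
}
```

In this model the store at offset -128 comes out as the single word 04 f2 00 f2, rather than two overlapping stores to the same address.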

2022-04-27  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/105396
* asan.c (asan_redzone_buffer::emit_redzone_byte): Handle the case
where offset is bigger than off but smaller than m_prev_offset + 32
bits by pushing one or more 0 bytes.  Sink the
m_shadow_bytes.safe_push (value); flush_if_full (); statements from
all cases to the end of the function.

* gcc.dg/asan/pr105396.c: New test.

(cherry picked from commit 9715f10c0651c9549b479b69d67be50ac4bd98a6)

rtlanal: Fix up replace_rtx [PR105333]

The following testcase FAILs, because replace_rtx replaces a REG with
CONST_WIDE_INT inside of a SUBREG, which is an invalid transformation
because a SUBREG relies on SUBREG_REG having non-VOIDmode but
CONST_WIDE_INT has VOIDmode.

replace_rtx already has code to deal with it, but it was doing
it only for CONST_INTs. The following patch does it also for
VOIDmode CONST_DOUBLE or CONST_WIDE_INT.

2022-04-22 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/105333
* rtlanal.c (replace_rtx): Use simplify_subreg or
simplify_unary_operation if CONST_SCALAR_INT_P rather than just
CONST_INT_P.

* gcc.dg/pr105333.c: New test.

(cherry picked from commit 7092b7aea122a91824d048aeb23834cf1d19b1a1)

sparc: Preserve ORIGINAL_REGNO in epilogue_renumber [PR105257]

The following testcase ICEs, because the pic register is
(reg:DI 24 %i0 [109]) and is used in the delay slot of a return.
We invoke epilogue_renumber and that changes it to
(reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p
predicate, so we don't recognize the insn anymore.

The following patch fixes that by preserving ORIGINAL_REGNO if
specified, so we get (reg:DI 8 %o0 [109]) instead.

2022-04-19 Jakub Jelinek <jakub@redhat.com>

PR target/105257
* config/sparc/sparc.c (epilogue_renumber): If ORIGINAL_REGNO,
use gen_raw_REG instead of gen_rtx_REG and copy over also
ORIGINAL_REGNO. Use return 0; instead of /* fallthrough */.

* gcc.dg/pr105257.c: New test.

(cherry picked from commit eeca2b8bd03f57c59c6cf48bf6b9bd6dc86924f6)

c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256]

The CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit is supposed to separate
PLACEHOLDER_EXPRs that should be replaced by one object or subobjects of it
(variable, TARGET_EXPR slot, ...) from other PLACEHOLDER_EXPRs that should
be replaced by different objects or subobjects.
The bit is set when finding PLACEHOLDER_EXPRs inside of a CONSTRUCTOR, not
looking into nested CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctors, and we prevent
elision of TARGET_EXPRs (through TARGET_EXPR_NO_ELIDE) whose initializer
is a CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctor.  The following testcase ICEs
though, we don't replace the placeholders in there at all, because
CONSTRUCTOR_PLACEHOLDER_BOUNDARY isn't set on the TARGET_EXPR_INITIAL
ctor, but on a ctor nested in such a ctor.  replace_placeholders should be
run on the whole TARGET_EXPR slot.

So, the following patch fixes it by moving the CONSTRUCTOR_PLACEHOLDER_BOUNDARY
bit from nested CONSTRUCTORs to the CONSTRUCTOR containing those (but only
if it is closely nested, if there is some other tree sandwiched in between,
it doesn't do it).

2022-04-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/105256
* typeck2.c (process_init_constructor_array,
process_init_constructor_record, process_init_constructor_union): Move
CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the
containing CONSTRUCTOR.

* g++.dg/cpp0x/pr105256.C: New test.

(cherry picked from commit eb03e424598d30fed68801af6d6ef6236d32e32e)

i386: Fix ICE caused by ix86_emit_i387_log1p [PR105214]

The following testcase ICEs, because ix86_emit_i387_log1p attempts to
emit something like
  if (cond)
    some_code1;
  else
    some_code2;
and emits a conditional jump using emit_jump_insn (standard way in
the file) and an unconditional jump using emit_jump.
The problem with that is that if there is pending stack adjustment,
it isn't emitted before the conditional jump, but is before the
unconditional jump and therefore stack is adjusted only conditionally
(at the end of some_code1 above), which makes dwarf2 pass unhappy about it
but is a serious wrong-code even if it doesn't ICE.

This can be fixed either by emitting pending stack adjust before the
conditional jump as the following patch does, or by not using
  emit_jump (label2);
and instead hand inlining what that function does except for the
pending stack adjustment, like:
  emit_jump_insn (targetm.gen_jump (label2));
  emit_barrier ();
In that case there will be no stack adjustment in the sequence and
it will be done later on somewhere else.

2022-04-12  Jakub Jelinek  <jakub@redhat.com>

PR target/105214
* config/i386/i386-expand.c (ix86_emit_i387_log1p): Call
do_pending_stack_adjust.

* gcc.dg/asan/pr105214.c: New test.

(cherry picked from commit d481d13786cb84f6294833538133dbd6f39d2e55)

builtins: Fix up expand_builtin_int_roundingfn_2 [PR105211]

The expansion of __builtin_iround{,f,l} etc. builtins in some cases
emits calls to a different fallback builtin. To locate the right builtin
it uses mathfn_built_in_1 with the type of the first argument.
If its TYPE_MAIN_VARIANT is {float,double,long_double}_type_node, all is
fine, but on the following testcase, because GIMPLE considers scalar
float conversions between types with the same mode as useless,
TYPE_MAIN_VARIANT of the arg's type is float32_type_node and because there
isn't __builtin_lroundf32 returns NULL and we ICE.

This patch will first try the type of the first argument of the builtin's
prototype (so that say on sizeof(double)==sizeof(long double) target it honors
whether it was a *l or non-*l call; though even that can't be 100% trusted,
user could incorrectly prototype it) and as fallback the type argument.
If neither works, it doesn't use a fallback builtin and emits the call normally.

2022-04-11 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/105211
* builtins.c (expand_builtin_int_roundingfn_2): If mathfn_built_in_1
fails for TREE_TYPE (arg), retry it with
TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl))) and if even that
fails, emit call normally.

* gcc.dg/pr105211.c: New test.

(cherry picked from commit 91a38e8a848c61b2e23ee277306dc8cd194d135b)

c-family: Initialize ridpointers for __int128 etc. [PR105186]

The following testcase ICEs with C++ and is incorrectly rejected with C.
The reason is that both FEs use ridpointers identifiers for CPP_KEYWORD
and value or u.value for CPP_NAME e.g. when parsing attributes or OpenMP
directives etc., like:
         /* Save away the identifier that indicates which attribute
            this is.  */
         identifier = (token->type == CPP_KEYWORD)
           /* For keywords, use the canonical spelling, not the
              parsed identifier.  */
           ? ridpointers[(int) token->keyword]
           : id_token->u.value;

         identifier = canonicalize_attr_name (identifier);
I've tried to change those to use ridpointers only if non-NULL and otherwise
use the value/u.value even for CPP_KEYWORDS, but that was a large 10 hunks
patch.

The following patch instead just initializes ridpointers for the __intNN
keywords.  It can't be done earlier before we record_builtin_type as there
are 2 different spellings and if we initialize those ridpointers early, the
second record_builtin_type fails miserably.

2022-04-11  Jakub Jelinek  <jakub@redhat.com>

PR c++/105186
* c-common.c (c_common_nodes_and_builtins): After registering __int%d
and __int%d__ builtin types, initialize corresponding ridpointers
entry.

* c-c++-common/pr105186.c: New test.

(cherry picked from commit 083e8e66d2e90992fa83a53bfc3553dfa91abda1)

fold-const: Fix up make_range_step [PR105189]

The following testcase is miscompiled, because fold_truth_andor
incorrectly folds
(unsigned) foo () >= 0U && 1
into
foo () >= 0
For the unsigned comparison (which is useless in this case,
as >= 0U is always true, but hasn't been folded yet), previous
make_range_step derives exp (unsigned) foo () and +[0U, -]
range for it.  Next we process the NOP_EXPR.  We have special code
for unsigned to signed casts, already earlier punt if low or high
aren't representable in arg0_type or if it is a narrowing conversion.
For the signed to unsigned casts, I think if high is specified we
are still fine, as we punt for non-representable values in arg0_type,
n_high is then still representable and so was smaller or equal to
signed maximum and either low is not present (equivalent to 0U), or
low must be smaller or equal to high and so for unsigned exp
+[low, high] the signed exp +[n_low, n_high] will be correct.
Similarly, if both low and high aren't specified (always true or
always false), it is ok too.
But if we have for unsigned exp +[low, -] or -[low, -], using
+[n_low, -] or -[n_high, -] is incorrect.  Because low is smaller
or equal to signed maximum and high is unspecified (i.e. unsigned
maximum), when signed that range is a union of +[n_low, -] and
+[-, -1] which is equivalent to -[0, n_low-1], unless low
is 0, in that case we can treat it as [-, -].
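
To make the miscompilation concrete, here is a minimal sketch (not the
testcase from the PR; foo and its return value are made up for
illustration).  The expression as written is always true, while the
signed comparison produced by the incorrect fold is false for any
negative result:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the function in the PR; any negative
   return value shows the difference.  */
static int foo (void) { return -1; }

/* What the source says: an unsigned value is always >= 0U.  */
static bool as_written (void) { return (unsigned) foo () >= 0U && 1; }

/* What the incorrect fold produced: a signed comparison.  */
static bool as_misfolded (void) { return foo () >= 0; }
```

With foo returning -1, as_written yields true and as_misfolded yields
false, which is the wrong-code difference described above.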

2022-04-08  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/105189
* fold-const.c (make_range_step): Fix up handling of
(unsigned) x +[low, -] ranges for signed x if low fits into
typeof (x).

* g++.dg/torture/pr105189.C: New test.

(cherry picked from commit 5e6597064b0c7eb93b8f720afc4aa970eefb0628)

combine: Don't record for UNDO_MODE pointers into regno_reg_rtx array [PR104985]

The testcase in the PR fails under valgrind on mips64 (but only Martin
can reproduce, I couldn't).
But the problem reported there is that SUBST_MODE remembers addresses
into the regno_reg_rtx array, then some splitter needs a new pseudo
and calls gen_reg_rtx, which reallocates the regno_reg_rtx array
and finally undo operation is done and dereferences the old regno_reg_rtx
entry.
The rtx values stored in regno_reg_rtx array seem to be created
by gen_reg_rtx only and since then aren't modified, all we do for it
is adjusting its fields (e.g. adjust_reg_mode that SUBST_MODE does).

So, I think it is useless to use where.r for UNDO_MODE and store
&regno_reg_rtx[regno] in struct undo, we can store just
regno_reg_rtx[regno] (i.e. pointer to the REG itself instead of
pointer to pointer to REG) or could also store just the regno.

The following patch does the latter, and because SUBST_MODE no longer
needs to be a macro, changes all SUBST_MODE uses to subst_mode.
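
The idea can be sketched with a toy model (assumption: the names regs,
new_reg, undo_mode and the struct layouts below are simplified
stand-ins, not the real combine.c code).  An undo record that stores
the register number stays valid when the array is reallocated, whereas
a stored &regs[regno] would dangle:

```c
#include <stdlib.h>

/* Toy model of regno_reg_rtx: a growable array of pointers.  */
struct reg { int mode; };
static struct reg **regs;
static int nregs;

/* Toy gen_reg_rtx: growing the array may move it in memory.  */
static int new_reg (int mode)
{
  struct reg **grown = realloc (regs, (nregs + 1) * sizeof *regs);
  if (!grown)
    abort ();
  regs = grown;
  regs[nregs] = malloc (sizeof **regs);
  if (!regs[nregs])
    abort ();
  regs[nregs]->mode = mode;
  return nregs++;
}

/* Undo record storing the register number, not &regs[regno].  */
struct undo { int regno; int old_mode; };

static struct undo subst_mode (int regno, int new_mode)
{
  struct undo u = { regno, regs[regno]->mode };
  regs[regno]->mode = new_mode;
  return u;
}

static void undo_mode (struct undo u)
{
  /* Index into the current array, which is valid even after realloc.  */
  regs[u.regno]->mode = u.old_mode;
}
```

Creating further registers between subst_mode and undo_mode is
harmless here, because the record is resolved against the array as it
exists at undo time.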

2022-04-06  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/104985
* combine.c (struct undo): Add where.regno member.
(do_SUBST_MODE): Rename to ...
(subst_mode): ... this.  Change first argument from rtx * into int,
operate on regno_reg_rtx[regno] and save regno into where.regno.
(SUBST_MODE): Remove.
(try_combine): Use subst_mode instead of SUBST_MODE, change first
argument from regno_reg_rtx[whatever] to whatever.  For UNDO_MODE, use
regno_reg_rtx[undo->where.regno] instead of *undo->where.r.
(undo_to_marker): For UNDO_MODE, use regno_reg_rtx[undo->where.regno]
instead of *undo->where.r.
(simplify_set): Use subst_mode instead of SUBST_MODE, change first
argument from regno_reg_rtx[whatever] to whatever.

(cherry picked from commit 61bee6aed26eb30b798c75b9a595c9d51e080442)

i386: Fix up ix86_expand_vector_init_general [PR105123]

The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
  vector(4) short unsigned int _1;
  short unsigned int u.0_3;
...
  _1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
   10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   14: clobber r110:V4HI
   15: r110:V4HI#0=r83:SI
   16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
   10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
   12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
   14: clobber r114:V4HI
   15: r114:V4HI#0=r111:SI
   16: r114:V4HI#4=r113:SI
instead.
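
The same effect can be reproduced in plain C (a sketch only; the
function names are made up and uint32_t stands in for the SImode
pseudo).  Reusing one variable as source and destination for both
halves loses the value, while fresh temporaries give the intended
broadcast:

```c
#include <stdint.h>

/* Buggy shape: every operation writes back into the same variable,
   mirroring the use of word as the target of both ASHIFT and IOR.  */
static void broadcast_in_place (uint16_t elt, uint32_t out[2])
{
  uint32_t word = elt;
  for (int i = 0; i < 2; i++)
    {
      word = word << 16;   /* shifts the previous contents away */
      word = word | word;  /* no-op */
    }
  out[0] = word;
  out[1] = word;
}

/* Fixed shape: each result goes to a fresh temporary.  */
static void broadcast_fresh (uint16_t elt, uint32_t out[2])
{
  uint32_t word = elt;
  for (int i = 0; i < 2; i++)
    {
      uint32_t hi = word << 16;
      out[i] = hi | word;
    }
}
```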

Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.

2022-04-03  Jakub Jelinek  <jakub@redhat.com>

PR target/105123
* config/i386/i386-expand.c (ix86_expand_vector_init_general): Avoid
using word as target for expand_simple_binop when doing ASHIFT and
IOR.

* gcc.target/i386/pr105123.c: New test.

(cherry picked from commit e1a74058b784c845e84a0cf1997b54b984df483d)

ubsan: Fix ICE due to -fsanitize=object-size [PR105093]

The following testcase ICEs, because for a volatile X & RESULT_DECL
ubsan wants to take address of that reference.  instrument_object_size
is called with x, so the base is equal to the access and the var
is automatic, so there is no risk of an out of bounds access for it.
Normally we wouldn't instrument those because we fold address of the
t - address of inner to 0, add constant size of the decl and it is
equal to what __builtin_object_size computes.  But the volatile
results in the subtraction not being folded.

The first hunk fixes it by punting if we access the whole automatic
decl, so that even volatile won't cause a problem.
The second hunk (not strictly needed for this testcase) is similar
to what has been added to asan.cc recently, if we actually take
address of a decl and keep it in the IL, we better mark it addressable.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/105093
* ubsan.c (instrument_object_size): If t is equal to inner and
is a decl other than global var, punt.  When emitting call to
UBSAN_OBJECT_SIZE ifn, make sure base is addressable.

* g++.dg/ubsan/pr105093.C: New test.

(cherry picked from commit e3e68fa59ead502c24950298b53c637bbe535a74)

store-merging: Avoid ICEs on roughly ~0ULL/8 sized stores [PR105094]

On the following testcase on 64-bit targets, store-merging sees
a MEM_REF store from {} ctor with "negative" bitsize where bitoff + bitsize
wraps around to very small end offset.  This later confuses the code
so that it allocates just a few bytes of memory but fills in huge amounts of
it.  Later on there is a param_store_merging_max_size size check but due to
the wrap-around we pass that.

The following patch punts on such large bitsizes.
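
A rough illustration of the wrap-around, assuming 64-bit arithmetic
(the helper names below are hypothetical and not taken from
gimple-ssa-store-merging.c): a store of about ~0ULL/8 bytes has a bit
size whose top bit is set, so it reads as negative when treated as
signed, and adding a small bit offset wraps to a small end offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bits of a store of nbytes bytes, with 64-bit wrap-around.  */
static uint64_t bits_for (uint64_t nbytes) { return nbytes * 8; }

/* Would the value be negative when read as a signed 64-bit bitsize?  */
static bool looks_negative (uint64_t bits) { return (bits >> 63) != 0; }

/* Hypothetical check in the spirit of the patch: reject zero and
   "negative" sizes alike.  */
static bool bitsize_ok (uint64_t bits)
{
  return bits != 0 && !looks_negative (bits);
}
```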

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/105094
* gimple-ssa-store-merging.c (mem_valid_for_store_merging): Punt if
bitsize <= 0 rather than just == 0.

* gcc.dg/pr105094.c: New test.

(cherry picked from commit 387e818cda0ffde86f624228c3da1ab28f453685)

c++: Fix template-introduction tentative parsing in class bodies clear colon_corrects_to_scope_p [PR105061]

The concepts support (in particular template introductions from concepts TS)
broke the following testcase, valid unnamed bitfields with dependent
types (or even just typedefs) were diagnosed as typos (: instead of correct
::) in template introduction during their tentative parsing.
The following patch fixes that by not doing this : to :: correction when
member_p is true.

2022-03-30 Jakub Jelinek <jakub@redhat.com>

PR c++/105061
* parser.c (cp_parser_template_introduction): If member_p, temporarily
clear parser->colon_corrects_to_scope_p around tentative parsing of
nested name specifier.

* g++.dg/concepts/pr105061.C: New test.

(cherry picked from commit 4f2795218a6ba6a7b7b9b18ca7a6e390661e1608)

c++: Fix up __builtin_convertvector parsing

Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix expression
suffix handling loop.

2022-03-26 Jakub Jelinek <jakub@redhat.com>

* parser.c (cp_parser_postfix_expression)
<case RID_BUILTIN_CONVERTVECTOR>: Don't
return cp_build_vec_convert result right away, instead
set postfix_expression to it and break.

* c-c++-common/builtin-convertvector-3.c: New test.

(cherry picked from commit 1806829e08f14e4cacacec43d7845cc2dad2ddc8)

c++: extern thread_local declarations in constexpr [PR104994]

C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, except perhaps in sizeof/decltype).
P2242 changed that for C++23 to passing through declaration but
https://cplusplus.github.io/CWG/issues/2552.html
has been filed for it yesterday.

2022-03-24 Jakub Jelinek <jakub@redhat.com>

PR c++/104994
* constexpr.c (potential_constant_expression_1): Don't diagnose extern
thread_local declarations.
* decl.c (start_decl): Likewise.

* g++.dg/cpp2a/constexpr-nonlit7.C: New test.

(cherry picked from commit 72124f487ccb5c8065dd5f7b8fba254600b7e611)

i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]

__builtin_ia32_readeflags_u* aren't marked const or pure I think
intentionally, so that they aren't CSEd from different regions of a function
etc. because we don't and can't easily track all dependencies between
it and surrounding code (if somebody looks at the condition flags, it is
dependent on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point to emit anything.

There is a LRA bug that miscompiles the testcase which this patch makes
latent, which is certainly worth fixing too, but IMHO this change
(and maybe ix86_gimple_fold_builtin too which would fold it even earlier
when it loses lhs) is worth it as well.

2022-03-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/104971
* config/i386/i386-expand.c
(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
don't push/pop anything and just return const0_rtx.

* gcc.target/i386/pr104971.c: New test.

(cherry picked from commit b60bc913cca7439d29a7ec9e9a7f448d8841b43c)

c++: Fix up constexpr evaluation of new with zero sized types [PR104568]

The new expression constant expression evaluation right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts should its trailing array have if it has
cookie at the start).  As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special cases those cases so that we don't divide
by zero and also recover the original outer_nelts from the expression
by forcing the size not to be folded in that case but be explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.
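
The arithmetic can be sketched as follows (assumption: heap_nelts and
its parameters are hypothetical and only model the computation
described above, not the actual constexpr.c code).  For a zero-sized
element type the division is skipped and the separately recovered
outer_nelts is used instead:

```c
#include <stddef.h>

/* Recover the element count from the size passed to operator new.
   outer_nelts is the count recovered from the unfolded size expression
   and is only used when the element size is zero.  */
static size_t heap_nelts (size_t size, size_t cookie_size,
                          size_t elt_size, size_t outer_nelts)
{
  if (elt_size == 0)
    return outer_nelts;   /* size - cookie_size is 0 here, no division */
  return (size - cookie_size) / elt_size;
}
```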

Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
  auto p = new int[2];
  auto q1 = &p[0];
  auto q2 = &p[1];
  auto q3 = &p[2];
  auto q4 = &p[3];
  delete[] p;
  return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems.  Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.
clang++ seems to handle it fine even in the zero sized cases or with
new expressions.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/104568
* init.c (build_new_constexpr_heap_type): Remove FULL_SIZE
argument and its handling, instead add ITYPE2 argument.  Only
support COOKIE_SIZE != NULL.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL.  Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.c (build_new_constexpr_heap_type): New function.
(cxx_eval_constant_expression) <case CONVERT_EXPR>:
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
arg_size to build_new_constexpr_heap_type.  Pass ctx,
non_constant_p and overflow_p to that call too.

* g++.dg/cpp2a/constexpr-new22.C: New test.

(cherry picked from commit 0a0c2c3f06227d46b5e9542dfdd4e0fd2d67d894)

aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]

We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.

The following patch fixes that by copying it if it can't be shared.

2022-03-16 Jakub Jelinek <jakub@redhat.com>

PR target/104910
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Copy
imm rtx.

* gcc.dg/pr104910.c: New test.

(cherry picked from commit 952155629ca1a4dfe7c7b26e53d118a9b853ed4a)

ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]

find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below on s390x has there a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.

The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.

Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.

2022-03-15 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/104814
* ifcvt.c (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
end with onlyjump_p. Assume BB_END (test_bb) is always non-NULL.

* gcc.c-torture/execute/pr104814.c: New test.

(cherry picked from commit a2645cd8fb33b36d737b310e26f4c47401305c7b)

c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]

As mentioned in the PR, different standards have different definitions
of what counts as a UB left shift.  They all agree on out of bounds
(including negative) shift count.
The rules used by ubsan are:
C99-C2x ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17 x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early
exit for -Wshift-overflow* warning so that it never warns, but apparently
-Wshift-negative-value remained as is.  As it is well defined in C++20,
the following patch doesn't enable -Wshift-negative-value from -Wextra
anymore for C++20 and later; users who want the warning for compatibility
with C++17 and earlier can still get it by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, that is an extension to the standards, so it is up
to us how exactly we define that case.  Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only
diagnosing out of bounds shift count and nothing else and IMHO it is most
sensical to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values.  The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.
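
The two ubsan rules quoted above can be written out as predicates (a
sketch assuming a 32-bit int, so uprecm1 is 31, and a shift count y
already known to be in range; the function names are made up):

```c
#include <stdbool.h>

/* Assumes a 32-bit int: precision minus one is 31.  The shift count y
   must already be within [0, 31].  */
enum { UPRECM1 = 31 };

/* C99-C2x rule: UB if any set bit would reach or pass the sign bit,
   which includes every negative x.  */
static bool c99_shift_is_ub (int x, int y)
{
  return ((unsigned) x >> (UPRECM1 - y)) != 0;
}

/* C++11-C++17 rule: negative x is UB, but a single 1 bit may land in
   the sign bit.  */
static bool cxx11_shift_is_ub (int x, int y)
{
  return x < 0 || ((unsigned) x >> (UPRECM1 - y)) > 1;
}
```

For example, 1 << 31 is flagged by the C99 rule but not by the
C++11-C++17 rule, while -1 << 1 is flagged by both.  Under C++20 and
under -fwrapv as described above, neither case is UB.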

While this change isn't a regression, I'd think it is useful for GCC 12,
it doesn't add new warnings, but just removes warnings that aren't
appropriate.

2022-03-09  Jakub Jelinek  <jakub@redhat.com>

PR c/104711
gcc/
* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
is enabled by it only for C++11 to C++17 rather than for C++03 or
later.
(-Wshift-negative-value): Similarly (except here we stated
that it is enabled for C++11 or later).
gcc/c-family/
* c-opts.c (c_common_post_options): Don't enable
-Wshift-negative-value from -Wextra for C++20 or later.
* c-ubsan.c (ubsan_instrument_shift): Adjust comments.
* c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
gcc/c/
* c-fold.c (c_fully_fold_internal): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
* c-typeck.c (build_binary_op): Likewise.
gcc/cp/
* constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
* typeck.c (cp_build_binary_op): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
* c-c++-common/Wshift-negative-value-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-negative-value-2.c: Likewise.
* c-c++-common/Wshift-negative-value-3.c: Likewise.
* c-c++-common/Wshift-negative-value-4.c: Likewise.
* c-c++-common/Wshift-negative-value-7.c: New test.
* c-c++-common/Wshift-negative-value-8.c: New test.
* c-c++-common/Wshift-negative-value-9.c: New test.
* c-c++-common/Wshift-negative-value-10.c: New test.
* c-c++-common/Wshift-overflow-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-overflow-2.c: Likewise.
* c-c++-common/Wshift-overflow-5.c: Likewise.
* c-c++-common/Wshift-overflow-6.c: Likewise.
* c-c++-common/Wshift-overflow-7.c: Likewise.
* c-c++-common/Wshift-overflow-8.c: New test.
* c-c++-common/Wshift-overflow-9.c: New test.
* c-c++-common/Wshift-overflow-10.c: New test.
* c-c++-common/Wshift-overflow-11.c: New test.
* c-c++-common/Wshift-overflow-12.c: New test.

(cherry picked from commit d76511138dc816ef66fd16f71531f48c37dac3b4)

c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806]

On the following testcase, we emit "did you mean '__dt '?" in the error
message. "__dt " shows there because it is dtor_identifier, but we
shouldn't suggest those to the user, they are purely internal and can't
be really typed by the user because of the final space in it.
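
The filter amounts to a check like the following (a sketch on plain C
strings; is_suggestable is a hypothetical name, and the real code
works on identifier nodes rather than char pointers):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Internal identifiers such as "__dt " end in a space and can never
   be typed by the user, so they make no sense as spelling hints.  */
static bool is_suggestable (const char *name)
{
  size_t len = strlen (name);
  return len > 0 && name[len - 1] != ' ';
}
```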

2022-03-08 Jakub Jelinek <jakub@redhat.com>

PR c++/104806
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore
identifiers with space at the end.

* g++.dg/spellcheck-pr104806.C: New test.

(cherry picked from commit e480c3c06d20874fd7504bfdcca0b829f8000389)

s390: Fix up *cmp_and_trap_unsigned_int<mode> constraints [PR104775]

The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
that other formats use for an index register instead holds M3, which encodes
the kind of comparison, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).

2022-03-07 Jakub Jelinek <jakub@redhat.com>

PR target/104775
* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
S constraint instead of T in the last alternative.

* gcc.target/s390/pr104775.c: New test.

(cherry picked from commit 2472dcaa8cb9e02e902f83d419c3ee7e0f3d9041)

match.pd: Further complex simplification fixes [PR104675]

Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 material) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.
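
For reference, the two simplifications are plain identities on
unsigned integers, which is why restricting them to integral types is
safe.  A small sketch (function names are made up for illustration):

```c
/* t * 2U / 2: the multiplication drops the top bit.  */
static unsigned mul_then_div (unsigned t) { return t * 2U / 2; }
static unsigned masked_top (unsigned t) { return t & (~0U / 2); }

/* t / 2U * 2: the division drops the bottom bit.  */
static unsigned div_then_mul (unsigned t) { return t / 2U * 2; }
static unsigned masked_low (unsigned t) { return t & ~1U; }
```

Each pair returns the same value for every unsigned t.  Nothing
comparable exists for complex operands, where bit_and is not
available at all.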

2022-02-25 Jakub Jelinek <jakub@redhat.com>
Marc Glisse <marc.glisse@inria.fr>

PR tree-optimization/104675
* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
Restrict simplifications to INTEGRAL_TYPE_P.

* gcc.dg/pr104675-3.c: New test.

(cherry picked from commit f62115c9b770a66c5378f78a2d5866243d560573)

rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681]

The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE. For normal mov<mode> expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign<mode> nothing enforced this.
The middle-end documents that movmisalign<mode> shouldn't fail, so we can't
force that through predicates or condition on the expander.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR target/104681
* config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move.

* g++.dg/opt/pr104681.C: New test.

(cherry picked from commit 3885a122f817a1b6dca4a84ba9e020d5ab2060af)

match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]

We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.

2022-02-25 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/104675
* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
COMPLEX_TYPE.

* gcc.dg/pr104675-1.c: New test.
* gcc.dg/pr104675-2.c: New test.

(cherry picked from commit 758671b88b78d7629376b118ec6ca6bcfbabbd36)

libiberty: Fix up debug.temp.o creation if *.o has 64K+ sections [PR104617]

On
#define A(n) int foo1##n(void) { return 1##n; }
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325)
B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642)
testcase with
./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto  -O0 -o foo1.o foo1.c -ffunction-sections
./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o
/tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized
(testcase too slow to be included into testsuite).
The problem is clearly reported by readelf:
readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323
readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section.
because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and
sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE
inclusive.  Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE
range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym
where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be
used instead and .symtab_shndx section should contain the real section
index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >=
SHN_LORESERVE value is needed it should put those into
Shdr[0].sh_{size,link}.  But, sh_{link,info} are 32-bit fields which can
contain any section index.
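
The difference can be sketched like this (assumption: remap_old,
remap_new and the sh_map parameter are simplified, hypothetical
stand-ins for the remapping done in
simple_object_elf_copy_lto_debug_sections; the two constants are the
ELF values of SHN_LORESERVE and SHN_HIRESERVE):

```c
#include <stdint.h>

/* ELF values of SHN_LORESERVE and SHN_HIRESERVE.  */
enum { LORESERVE = 0xff00, HIRESERVE = 0xffff };

/* Old behaviour as described: an index inside the reserved range was
   left untouched, which is wrong for 32-bit sh_link/sh_info.  */
static uint32_t remap_old (uint32_t idx, const uint32_t *sh_map)
{
  if (idx >= LORESERVE && idx <= HIRESERVE)
    return idx;
  return sh_map[idx];
}

/* New behaviour: sh_link and sh_info can hold any section index, so
   always remap.  */
static uint32_t remap_new (uint32_t idx, const uint32_t *sh_map)
{
  return sh_map[idx];
}
```

An sh_link of 65321, as in the readelf warnings above, falls inside
the reserved range and was therefore copied unchanged into the output
object.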

Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before
2011) used to mishandle the > 63.75K sections case and assumed there is a
hole in between the sections, but what
simple_object_elf_copy_lto_debug_sections does wouldn't help in that case
for the debug temp object creation, we'd need to detect the case also in
that routine and take it into account in the remapping etc.  I think
it is not worth it given that it is over 10 years, if somebody needs
63.75K or more sections, better use more recent binutils.

2022-02-22  Jakub Jelinek  <jakub@redhat.com>

PR lto/104617
* simple-object-elf.c (simple_object_elf_match): Fix up URL
in comment.
(simple_object_elf_copy_lto_debug_sections): Remap sh_info and
sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE
range (inclusive).

(cherry picked from commit 2f59f067610f22c3f2ec9b1516e24b85836676ed)

asan: Mark instrumented vars addressable [PR102656]

We ICE on the following testcase, because the asan1 pass decides to
instrument
  <retval>.x = 0;
and does that by
  _13 = &<retval>.x;
  .ASAN_CHECK (7, _13, 4, 4);
  <retval>.x = 0;
and later sanopt pass turns that into:
  _39 = (unsigned long) &<retval>.x;
  _40 = _39 >> 3;
  _41 = _40 + 2147450880;
  _42 = (signed char *) _41;
  _43 = *_42;
  _44 = _43 != 0;
  _45 = _39 & 7;
  _46 = (signed char) _45;
  _47 = _46 + 3;
  _48 = _47 >= _43;
  _49 = _44 & _48;
  if (_49 != 0)
    goto <bb 10>; [0.05%]
  else
    goto <bb 9>; [99.95%]

  <bb 10> [local count: 536864]:
  __builtin___asan_report_store4 (_39);

  <bb 9> [local count: 1073741824]:
  <retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.
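
For readers unfamiliar with the sanopt output, the check in the dump
corresponds roughly to the following C (a sketch; the function names
are made up, and the shadow byte is passed in rather than loaded from
memory so that the logic can be run in isolation):

```c
#include <stdbool.h>
#include <stdint.h>

/* Shadow address for addr, using the shift and offset from the dump
   (_39 >> 3, then + 2147450880).  */
static uint64_t shadow_address (uint64_t addr)
{
  return (addr >> 3) + 2147450880u;
}

/* Decide whether a 4-byte store at addr must be reported, given the
   shadow byte that covers it.  Mirrors _43 through _49 in the dump.  */
static bool store4_is_bad (uint64_t addr, signed char shadow)
{
  signed char last = (signed char) ((addr & 7) + 3);
  return shadow != 0 && last >= shadow;
}
```

The first step of all this is taking the address of <retval>.x, which
is what requires the decl to be addressable.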

Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.

I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does.  I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars.  But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame.  It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.

And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.

2022-02-19  Jakub Jelinek  <jakub@redhat.com>

PR sanitizer/102656
* asan.c (instrument_derefs): If inner is a RESULT_DECL and access is
known to be within bounds, treat it like automatic variables.
If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
it addressable.

(cherry picked from commit 9e3bbb4a8024121eb0fa675cb1f074218c1345a6)

valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557]

After the recent r12-7240 simplify_immed_subreg changes, we bail on more
simplify_subreg calls than before; e.g. for decimal modes we apparently
almost never preserve anything in the NaN representations except the
canonical {q,s}NaNs.
simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode
is not valid, but debug_lowpart_subreg wants to try even harder: even
if e.g. the target indicates certain mode combinations aren't valid for the
backend, dwarf2out can still handle them.  But a SUBREG of a VOIDmode
operand is just too much, because the inner mode is lost there.  We'd need
some new rtx that would be able to represent those cases.
For now, just punt in those cases.

2022-02-17  Jakub Jelinek  <jakub@redhat.com>

PR debug/104557
* valtrack.c (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG
if expr has VOIDmode.

* gcc.dg/dfp/pr104557.c: New test.

(cherry picked from commit 1c2b44b52364cb5661095b346de794bc7ff02866)

combine: Fix up -fcompare-debug issue in the combiner [PR104544]

On the following testcase on aarch64-linux, we behave differently
with -g and -g0.

The problem is that on:
(insn 10011 10010 10012 2 (set (reg:CC 66 cc)
        (compare:CC (reg:DI 105)
            (const_int 0 [0]))) "pr104544.c":18:3 407 {cmpdi}
     (expr_list:REG_DEAD (reg:DI 105)
        (nil)))
(insn 10012 10011 10013 2 (set (reg:SI 109)
        (eq:SI (reg:CC 66 cc)
            (const_int 0 [0]))) "pr104544.c":18:3 444 {aarch64_cstoresi}
     (expr_list:REG_DEAD (reg:CC 66 cc)
        (nil)))
(insn 10013 10012 10016 2 (set (reg:DI 110)
        (zero_extend:DI (reg:SI 109))) "pr104544.c":18:3 111 {*zero_extendsidi2_aarch64}
     (expr_list:REG_DEAD (reg:SI 109)
        (nil)))
(insn 10016 10013 10017 2 (parallel [
            (set (reg:CC 66 cc)
                (compare:CC (const_int 0 [0])
                    (reg:DI 110)))
            (set (reg:DI 111)
                (neg:DI (reg:DI 110)))
        ]) "pr104544.c":18:3 281 {negdi_carryout}
     (expr_list:REG_DEAD (reg:DI 110)
        (nil)))
...
(debug_insn 6 5 7 2 (var_location:SI y (debug_expr:SI D#5)) "pr104544.c":18:3 -1
     (nil))
(debug_insn 7 6 10033 2 (debug_marker) "pr104544.c":11:3 -1
     (nil))
(insn 10033 7 10034 2 (set (reg:DI 117 [ _14 ])
        (ior:DI (reg:DI 111)
            (reg:DI 112))) "pr104544.c":11:6 496 {iordi3}
     (expr_list:REG_DEAD (reg:DI 112)
        (expr_list:REG_DEAD (reg:DI 111)
            (nil))))
we successfully split 3 insns into two:

Trying 10011, 10013 -> 10016:
10011: cc:CC=cmp(r105:DI,0)
      REG_DEAD r105:DI
10013: r110:DI=cc:CC==0
      REG_DEAD cc:CC
10016: {cc:CC=cmp(0,r110:DI);r111:DI=-r110:DI;}
      REG_DEAD r110:DI
Failed to match this instruction:
(parallel [
        (set (reg:CC 66 cc)
            (compare:CC (reg:DI 105)
                (const_int 0 [0])))
        (set (reg:DI 111)
            (neg:DI (eq:DI (reg:DI 105)
                    (const_int 0 [0]))))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:CC 66 cc)
            (compare:CC (reg:DI 105)
                (const_int 0 [0])))
        (set (reg:DI 111)
            (neg:DI (eq:DI (reg:DI 105)
                    (const_int 0 [0]))))
    ])
Successfully matched this instruction:
(set (reg:DI 111)
    (neg:DI (eq:DI (reg:DI 105)
            (const_int 0 [0]))))
Successfully matched this instruction:
(set (reg:CC 66 cc)
    (compare:CC (reg:DI 105)
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg:DI 112)
    (neg:DI (eq:DI (reg:CC 66 cc)
            (const_int 0 [0]))))
allowing combination of insns 10011, 10013 and 10016
original costs 4 + 4 + 4 = 16
replacement costs 4 + 4 = 12
deferring deletion of insn with uid = 10011.

but the code that searches forward for insns to update their log
links (before the change there is a link from insn 10033 to insn 10016
for pseudo 111) only finds insn 10033 and updates the log link if
-g isn't enabled, otherwise it stops earlier because there are debug insns
in between.  So, with -g LOG_LINKS of 10033 isn't updated, points eventually
to NOTE_INSN_DELETED and so we do not attempt to combine 10033 with other
insns, while with -g0 we do.

The following patch fixes that by skipping over debug insns during the
search instead.  We can still check BLOCK_FOR_INSN (insn) on those, because
if we notice a DEBUG_INSN in a following basic block, there necessarily
won't be any further normal insns in the current block after it.
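
As a rough illustration of why stopping at debug insns makes the result
depend on -g, here is a toy model of the forward search.  It is not the
code from combine.c: struct toy_insn, its fields and find_next_use are
hypothetical simplifications, with is_debug standing in for DEBUG_INSN_P
and block for BLOCK_FOR_INSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an RTL insn.  */
struct toy_insn
{
  int uid;
  int block;              /* Index of the containing basic block.  */
  bool is_debug;          /* Models DEBUG_INSN_P.  */
  bool uses_reg;          /* Whether the insn uses the pseudo of interest.  */
  struct toy_insn *next;
};

/* Search forward from START, staying in START's basic block, for the
   first insn that uses the pseudo.  With SKIP_DEBUG false this models
   the old behaviour, which gave up at the first debug insn; with
   SKIP_DEBUG true it models the fix, which steps over debug insns.  */
static struct toy_insn *
find_next_use (struct toy_insn *start, bool skip_debug)
{
  assert (start != NULL);
  for (struct toy_insn *insn = start->next;
       insn != NULL && insn->block == start->block;
       insn = insn->next)
    {
      if (insn->is_debug)
        {
          if (skip_debug)
            continue;
          return NULL;
        }
      if (insn->uses_reg)
        return insn;
    }
  return NULL;
}
```

With debug insns between the combined insn and the next use, the old
variant finds nothing while the new one finds the same insn that is found
when no debug insns are present, so -g and -g0 agree.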

2022-02-16  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/104544
* combine.c (try_combine): When looking for insn whose links
should be updated from i3 to i2, don't stop on debug insns, instead
skip over them.

* gcc.dg/pr104544.c: New test.

(cherry picked from commit f997eef5654f782bedb985c9285862c4d76b3209)

c-family: Fix up shorten_compare for decimal vs. non-decimal float comparison [PR104510]

The comment in shorten_compare says:
/* If either arg is decimal float and the other is float, fail. */
but the callers of shorten_compare don't expect failure as a possible
outcome; they require that the function promote the operands to the same
type, either the originally selected *restype_ptr or some shortened type.
So, if we choose not to shorten, we should still promote to the original
*restype_ptr.

2022-02-16 Jakub Jelinek <jakub@redhat.com>

PR c/104510
* c-common.c (shorten_compare): Convert original arguments to
the original *restype_ptr when mixing binary and decimal float.

* gcc.dg/dfp/pr104510.c: New test.

(cherry picked from commit 6e74122f0de6748b3fd0ed9183090cd7c61fb53e)

sanitizer: Use glibc _thread_db_sizeof_pthread symbol if present

I've cherry-picked the following fix from llvm-project. Recent glibcs
have a _thread_db_sizeof_pthread variable which contains the
size of struct pthread, so that sanitizers don't need to guess it
and risk that it will change again.

2022-02-15 Jakub Jelinek <jakub@redhat.com>

* sanitizer_common/sanitizer_linux_libcdep.cpp: Cherry-pick
llvm-project revision ef14b78d9a144ba81ba02083fe21eb286a88732b.

(cherry picked from commit c4c0aa60891daeb4ea5a7c265bd681038f6d8271)

openmp: Make finalize_task_copyfn order reproduceable [PR104517]

The following testcase fails -fcompare-debug, because finalize_task_copyfn
was invoked from splay tree destruction, whose order can in some cases
depend on -g/-g0. The fix is to queue the task stmts that need copyfn
in a vector and run finalize_task_copyfn on elements of that vector.
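
The idea of the fix can be sketched with a toy queue.  This is only an
illustration under simplifying assumptions, not the omp-low.c code: struct
toy_task, queue_task and finalize_queued_tasks are hypothetical names, and
a fixed-size array stands in for the task_cpyfns vector.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16

/* Hypothetical stand-in for a task statement that needs a copy function.  */
struct toy_task
{
  int id;
};

/* Tasks recorded in the order they were created.  */
static struct toy_task *pending[MAX_TASKS];
static size_t n_pending;

/* Called at the point where the copy function is created.  */
static void
queue_task (struct toy_task *task)
{
  assert (n_pending < MAX_TASKS);
  pending[n_pending++] = task;
}

/* Process the queued tasks in creation order, storing their ids in ORDER,
   then empty the queue.  Returns the number of tasks processed.  */
static size_t
finalize_queued_tasks (int *order, size_t capacity)
{
  size_t n = n_pending;
  assert (n <= capacity);
  for (size_t i = 0; i < n; i++)
    order[i] = pending[i]->id;   /* Stands in for finalize_task_copyfn.  */
  n_pending = 0;                 /* Stands in for releasing the vector.  */
  return n;
}
```

Because the processing order is the queueing order, it no longer depends
on how some other container happens to be torn down.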

2022-02-15 Jakub Jelinek <jakub@redhat.com>

PR debug/104517
* omp-low.c (task_cpyfns): New variable.
(delete_omp_context): Don't call finalize_task_copyfn from here.
(create_task_copyfn): Push task_stmt into task_cpyfns.
(execute_lower_omp): Call finalize_task_copyfn here on entries from
task_cpyfns vector and release the vector.

(cherry picked from commit 6a0d6e7ca9b9e338e82572db79c26168684a7441)