git.ipfire.org Git - thirdparty/gcc.git/log

d: Suboptimal codegen for __builtin_expect(cond, false)

Since PR96435, both boolean objects and expressions have been evaluated
in the following way.

(*(ubyte*)&obj_or_expr) & 1

It has been noted that sometimes this can cause the back-end to optimize
in non-obvious ways - in particular with __builtin_expect.

This @safe feature is now restricted to just when reading the value of a
bool field that comes from a union.

PR d/110359

gcc/d/ChangeLog:

* d-convert.cc (convert_for_rvalue): Only apply the @safe boolean
conversion to boolean fields of a union.
(convert_for_condition): Call convert_for_rvalue in the default case.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110359.d: New test.

(cherry picked from commit ab98db1e8c1b997414539f41b7fb814019497d8d)

d: Fix crash in d/dmd/root/aav.d:127 dmd_aaGetRvalue from DsymbolTable::lookup

Backports patch from upstream dmd mainline for fixing PR110113.

The data being Mem.xrealloc'd contains many Array(T) fields, some of
which have self references in their data.ptr field thanks to the
smallarray optimization used by Array.

Naturally then, the memcpy from old GC data to new retains those self
referenced addresses, and the GC marks the old data as "free". Some time
later GC.malloc will return a pointer to said "free" data. So now we
have two GC references to the same memory. One that is treating the data
as an Array(VarDeclaration) in dmd.escape.escapeByStorage, and the other
as an AA in the symtab of a dmd.dsymbol.ScopeDsymbol.

Fix this memory corruption by not storing the data in a global variable
for reuse. If there are no more live references, the GC will free it.

PR d/110113

gcc/d/ChangeLog:

* dmd/escape.d (checkMutableArguments): Always allocate new buffer for
computing escapeBy.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/test23978.d: New test.

Reviewed-on: https://github.com/dlang/dmd/pull/15302
(cherry picked from commit ae3a4cefd855512b10b833a56f275b701bacdb34)

Daily bump.

compiler, libgo: support bootstrapping gc compiler

In the Go 1.21 release the package internal/profile imports
internal/lazyregexp.  That works when bootstrapping with Go 1.17,
because that compiler has internal/lazyregep and permits importing it.
We also have internal/lazyregexp in libgo, but since it is not installed
it is not available for importing.  This CL adds internal/lazyregexp
to the list of internal packages that are installed for bootstrapping.

The Go 1.21, and earlier, releases have a couple of functions in
the internal/abi package that are always fully intrinsified.
The gofrontend recognizes and intrinsifies those functions as well.
However, the gofrontend was also building function descriptors
for references to the functions without calling them, which
failed because there was nothing to refer to.  That is OK for the
gc compiler, which guarantees that the functions are only called,
not referenced.  This CL arranges to not generate function descriptors
for these functions.

For golang/go#60913

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504798

libstdc++: Document removal of implicit allocator rebinding extensions

Traditionally libstdc++ allowed containers to be
instantiated with allocator's that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.

(cherry picked from commit 8cbaf679a3c1875c5475bd1cb0fb86fb9d03b2d4)

tree-optimization/110298 - CFG cleanup and stale nb_iterations

When unrolling we eventually kill nb_iterations info since it may
refer to removed SSA names. But we do this only after cleaning
up the CFG which in turn can end up accessing it. Fixed by
swapping the two.

PR tree-optimization/110298
* tree-ssa-loop-ivcanon.cc (tree_unroll_loops_completely):
Clear number of iterations info before cleaning up the CFG.

* gcc.dg/torture/pr110298.c: New testcase.

(cherry picked from commit 916add3bf6e46467e4391e358b11ecfbc4daa275)

middle-end/110182 - TYPE_PRECISION on VECTOR_TYPE causes wrong-code

When folding two conversions in a row we use TYPE_PRECISION but
that's invalid for VECTOR_TYPE. The following fixes this by
using element_precision instead.

PR middle-end/110182
* match.pd (two conversions in a row): Use element_precision
to DTRT for VECTOR_TYPE.

(cherry picked from commit 3e12669a0eb968cfcbe9242b382fd8020935edf8)

Daily bump.

aarch64: Allow compiler to define ls64 builtins [PR110132]

This patch refactors the ls64 builtins to allow the compiler to define them
directly instead of having wrapper functions in arm_acle.h. This should be not
only easier to maintain, but it makes two important correctness fixes:
- It fixes PR110132, where the builtins ended up getting declared with
invisible bindings in the C FE, so the FE ended up synthesizing
incompatible implicit definitions for these builtins.
- It allows the builtins to be used with LTO, which didn't work previously.

We also take the opportunity to add test coverage from C++ for these
builtins.

gcc/ChangeLog:

PR target/110132
* config/aarch64/aarch64-builtins.cc (aarch64_general_simulate_builtin):
New. Use it ...
(aarch64_init_ls64_builtins): ... here. Switch to declaring public ACLE
names for builtins.
(aarch64_general_init_builtins): Ensure we invoke the arm_acle.h
setup if in_lto_p, just like we do for SVE.
* config/aarch64/arm_acle.h: (__arm_ld64b): Delete.
(__arm_st64b): Delete.
(__arm_st64bv): Delete.
(__arm_st64bv0): Delete.

gcc/testsuite/ChangeLog:

PR target/110132
* lib/target-supports.exp (check_effective_target_aarch64_asm_FUNC_ok):
Extend to ls64.
* g++.target/aarch64/acle/acle.exp: New.
* g++.target/aarch64/acle/ls64.C: New test.
* g++.target/aarch64/acle/ls64_lto.C: New test.
* gcc.target/aarch64/acle/ls64_lto.c: New test.
* gcc.target/aarch64/acle/pr110132.c: New test.

(cherry picked from commit 9963029a24f2d2510b82e7106fae3f364da33c5d)

aarch64: Fix wrong code with st64b builtin [PR110100]

The st64b pattern incorrectly had an output constraint on the register
operand containing the destination address for the store, leading to
wrong code. This patch fixes that.

gcc/ChangeLog:

PR target/110100
* config/aarch64/aarch64-builtins.cc (aarch64_expand_builtin_ls64):
Use input operand for the destination address.
* config/aarch64/aarch64.md (st64b): Fix constraint on address
operand.

gcc/testsuite/ChangeLog:

PR target/110100
* gcc.target/aarch64/acle/pr110100.c: New test.

(cherry picked from commit 737a0b749a7bc3e7cb904ea2d4b18dc130514b85)

aarch64: Fix whitespace in ls64 builtin implementation [PR110100]

The ls64 builtin code was using incorrect GNU style with eight spaces where
there should be a tab. Fixed thusly.

gcc/ChangeLog:

PR target/110100
* config/aarch64/aarch64-builtins.cc (aarch64_init_ls64_builtins_types):
Replace eight consecutive spaces with tabs.
(aarch64_init_ls64_builtins): Likewise.
(aarch64_expand_builtin_ls64): Likewise.
* config/aarch64/aarch64.md (ld64b): Likewise.
(st64b): Likewise.
(st64bv): Likewise
(st64bv0): Likewise.

(cherry picked from commit 713613541254039a34e1dd8fd4a613a299af1fd6)

Daily bump.

libstdc++: avoid bogus -Wrestrict [PR105651]

PR tree-optimization/105651

libstdc++-v3/ChangeLog:

* include/bits/basic_string.tcc (_M_replace): Add an assert
to avoid -Wrestrict false positive.

Daily bump.

testsuite: Check int128 effective target for pr109932-{1,2}.c [PR110230]

This patch is to make newly added test cases pr109932-{1,2}.c
check int128 effective target to avoid unsupported type error
on 32-bit. I did hit this failure during testing and fixed
it, but made a stupid mistake not updating the local formatted
patch which was actually out of date.

PR testsuite/110230
PR target/109932

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr109932-1.c: Adjust with int128 effective target.
* gcc.target/powerpc/pr109932-2.c: Ditto.

(cherry picked from commit 16eb9d69079d769b2aa2c07ce54aca20f5547c14)

rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]

As PR109932 shows, builtins __builtin_{un,}pack_vector_int128
should be guarded under vsx rather than power7, as their
corresponding bif patterns have the conditions TARGET_VSX
and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode). This patch is to
move __builtin_{un,}pack_vector_int128 to stanza vsx to ensure
their supports.

PR target/109932

gcc/ChangeLog:

* config/rs6000/rs6000-builtins.def (__builtin_pack_vector_int128,
__builtin_unpack_vector_int128): Move from stanza power7 to vsx.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr109932-1.c: New test.
* gcc.target/powerpc/pr109932-2.c: New test.

(cherry picked from commit ff83d1b47aadcdaf80a4fda84b0dc00bb2cd3641)

rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]

As PR110011 shows, when encoding 128 bits fp constant into
toc, we adopts REAL_VALUE_TO_TARGET_LONG_DOUBLE which is
to find the first float mode with LONG_DOUBLE_TYPE_SIZE
bits of precision, it would be TFmode here. But the 128
bits fp constant can be with mode IFmode or KFmode, which
doesn't necessarily have the same underlying float format
as the one of TFmode, like this PR exposes, with option
-mabi=ibmlongdouble TFmode has ibm_extended_format while
KFmode has ieee_quad_format, mixing up the formats (the
encoding/decoding ways) would cause unexpected results.

This patch is to make it use constant's own mode instead
of TFmode for real_to_target call.

PR target/110011

gcc/ChangeLog:

* config/rs6000/rs6000.cc (output_toc): Use the mode of the 128-bit
floating constant itself for real_to_target call.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr110011.c: New test.

(cherry picked from commit 388809f2afde874180da0669c669e241037eeba0)

Daily bump.

aarch64: testsuite: disable stack protector for tests relying on stack offset

Stack protector needs a guard value on the stack and change the stack
layout. So we need to disable it for those tests, to avoid test failure
with --enable-default-ssp.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrink_wrap_1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-2.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/test_frame_17.c (dg-options): Add
-fno-stack-protector.

(cherry picked from commit 59a72acbccf4c81a04b4d09760fc8b16992de106)

aarch64: testsuite: disable stack protector for pr104005.c

Storing stack guarding variable need one stp instruction, breaking the
scan-assembler-not pattern in the test. Disable stack protector to
avoid a test failure with --enable-default-ssp.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr104005.c (dg-options): Add
-fno-stack-protector.

(cherry picked from commit 5937cfb981debb2aeb72a1ed255fc3ed5a5835c4)

aarch64: testsuite: disable stack protector for auto-init-7.c

The test scans for "const_int 0" in the RTL dump, but stack protector
can produce more "const_int 0". To avoid a failure with
--enable-default-ssp, disable stack protector for this.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/auto-init-7.c (dg-options): Add
-fno-stack-protector.

(cherry picked from commit 4c59cfc4a4da579d60bfd82404e3ff51c72aca79)

aarch64: testsuite: disable stack protector for pr103147-10 tests

Stack protector influence code generation and cause function body checks
fail.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr103147-10.c (dg-options): Add
-fno-stack-protector.
* g++.target/aarch64/pr103147-10.C: Likewise.

(cherry picked from commit 2fa31207ea602d48b8f69982e0bde1d143e54ecb)

aarch64: testsuite: disable stack protector for sve-pcs tests

If GCC is configured with --enable-default-ssp, the stack protector can
make many sve-pcs tests fail.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp (sve_flags):
Add -fno-stack-protector.

(cherry picked from commit edb336cc575a82cfc2a883e4c3453c36267482c1)

aarch64: testsuite: disable PIE for fuse_adrp_add_1.c [PR70150]

In PIE, symbol "fixed_regs" is addressed via GOT. It will break the
scan-assembler pattern and cause test failure with --enable-default-pie.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/aarch64/fuse_adrp_add_1.c (dg-options): Add
-fno-pie.

(cherry picked from commit 7e8a3dbbb26f66ce8ea60be48962022b5fb2ef55)

aarch64: testsuite: disable PIE for tests with large code model [PR70150]

These tests set large code model with -mcmodel=large or target pragma for
AArch64. But if GCC is configured with --enable-default-pie, it triggers
"sorry: unimplemented: code model large with -fpic". Disable PIE to make
avoid the issue.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.dg/tls/pr78796.c (dg-additional-options): Add -fno-pie
-no-pie for aarch64-*-*.
* gcc.target/aarch64/pr63304_1.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr70120-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr78733.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr79041-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94530.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94577.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/reload-valid-spoff.c (dg-options): Add
-fno-pie.

(cherry picked from commit a1ccb4583dfaa267648110aa7da7275acc3000f8)

aarch64: testsuite: disable PIE for aapcs64 tests [PR70150]

If GCC is built with --enable-default-pie, a lot of aapcs64 tests fail
because relocation unsupported in PIE is used.

gcc/testsuite/ChangeLog:

PR testsuite/70150
* gcc.target/aarch64/aapcs64/aapcs64.exp (additional_flags):
Add -fno-pie -no-pie.

(cherry picked from commit f30f04b1fbd4b4e13a7535fad8e698c7b24db9b8)

LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
hence doing "jr $ra" would interfere with both subroutine return prediction and
the more general indirect branch prediction.

Therefore, a problem like PR110136 can cause a significant increase in branch error
prediction rate and affect performance. The same problem exists with "indirect_jump".

gcc/ChangeLog:

PR target/110136
* config/loongarch/loongarch.md: Modify the register constraints for template
"jumptable" and "indirect_jump" from "r" to "e".

Co-authored-by: Andrew Pinski <apinski@marvell.com>
(cherry picked from commit 5430c86e71927492399129f3df80824c6c334ddf)

Daily bump.

middle-end/110200 - genmatch force-leaf and convert interaction

The following fixes code GENERIC generation for (convert! ...)
which currently generates

  if (TREE_TYPE (_o1[0]) != type)
    _r1 = fold_build1_loc (loc, NOP_EXPR, type, _o1[0]);
    if (EXPR_P (_r1))
      goto next_after_fail867;
  else
    _r1 = _o1[0];

where obviously braces are missing.

PR middle-end/110200
* genmatch.cc (expr::gen_transform): Put braces around
the if arm for the (convert ...) short-cut.

(cherry picked from commit 820d1aec89c43dbbc70d3d0b888201878388454c)

Daily bump.

target/109650: Fix wrong code after cc0 -> CCmode transition.

This patch fixes a wrong-code bug in the wake of PR92729, the transition that
turned the AVR backend from cc0 to CCmode.  In cc0, the insn that uses cc0 like
a conditional branch always follows the cc0 setter, which is no more the case
with CCmode where set and use of REG_CC might be in different basic blocks.

This patch removes the machine-dependent reorg pass in avr_reorg entirely.

It is replaced by a new, AVR specific mini-pass that runs prior to split2.
Canonicalization of comparisons away from the "difficult" codes GT[U] and LE[U]
is now mostly performed by implementing TARGET_CANONICALIZE_COMPARISON.

Moreover:

* Text peephole conditions get "dead_or_set_regno_p (*, REG_CC)" as needed.

* RTL peephole conditions get "peep2_regno_dead_p (*, REG_CC)" as needed.

* Conditional branches no more clobber REG_CC.

* insn output for compares looks ahead to determine the branch mode in use.
  This needs also "dead_or_set_regno_p (*, REG_CC)".

* Add RTL peepholes for decrement-and-branch detection.

* Some of the patterns like "*cmphi.zero-extend.0" lost their
  combine-ational part wit PR92729.  Restore them.

Finally, it fixes some of the many indentation glitches left over from PR92729.

gcc/
PR target/109650
PR target/92729
Backport from 2023-05-10 master r14-1688.
* config/avr/avr-passes.def (avr_pass_ifelse): Insert new pass.
* config/avr/avr.cc (avr_pass_ifelse): New RTL pass.
(avr_pass_data_ifelse): New pass_data for it.
(make_avr_pass_ifelse, avr_redundant_compare, avr_cbranch_cost)
(avr_canonicalize_comparison, avr_out_plus_set_ZN)
(avr_out_cmp_ext): New functions.
(compare_condtition): Make sure REG_CC dies in the branch insn.
(avr_rtx_costs_1): Add computation of cbranch costs.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_ZN, ADJUST_LEN_CMP_ZEXT]:
[ADJUST_LEN_CMP_SEXT]Handle them.
(TARGET_CANONICALIZE_COMPARISON): New define.
(avr_simplify_comparison_p, compare_diff_p, avr_compare_pattern)
(avr_reorg_remove_redundant_compare, avr_reorg): Remove functions.
(TARGET_MACHINE_DEPENDENT_REORG): Remove define.
* config/avr/avr-protos.h (avr_simplify_comparison_p): Remove proto.
(make_avr_pass_ifelse, avr_out_plus_set_ZN, cc_reg_rtx)
(avr_out_cmp_zext): New Protos
* config/avr/avr.md (branch, difficult_branch): Don't split insns.
(*cbranchhi.zero-extend.0", *cbranchhi.zero-extend.1")
(*swapped_tst<mode>, *add.for.eqne.<mode>): New insns.
(*cbranch<mode>4): Rename to cbranch<mode>4_insn.
(define_peephole): Add dead_or_set_regno_p(insn,REG_CC) as needed.
(define_deephole2): Add peep2_regno_dead_p(*,REG_CC) as needed.
Add new RTL peepholes for decrement-and-branch and *swapped_tst<mode>.
Rework signtest-and-branch peepholes for *sbrx_branch<mode>.
(adjust_len) [add_set_ZN, cmp_zext]: New.
(QIPSI): New mode iterator.
(ALLs1, ALLs2, ALLs4, ALLs234): New mode iterators.
(gelt): New code iterator.
(gelt_eqne): New code attribute.
(rvbranch, *rvbranch, difficult_rvbranch, *difficult_rvbranch)
(branch_unspec, *negated_tst<mode>, *reversed_tst<mode>)
(*cmpqi_sign_extend): Remove insns.
(define_c_enum "unspec") [UNSPEC_IDENTITY]: Remove.
* config/avr/avr-dimode.md (cbranch<mode>4): Canonicalize comparisons.
* config/avr/predicates.md (scratch_or_d_register_operand): New.
* config/avr/constraints.md (Yxx): New constraint.

gcc/testsuite/
PR target/109650
Backport from 2023-05-10 master r14-1688.
* gcc.target/avr/torture/pr109650-1.c: New test.
* gcc.target/avr/torture/pr109650-2.c: New test.

Daily bump.

rs6000: Remove duplicate expression [PR106907]

PR106907 has few warnings spotted from cppcheck. In that addressing duplicate
expression issue here. Here the same expression is used twice in logical
AND(&&) operation which result in same result so removing that.

2023-06-06 Jeevitha Palanisamy <jeevitha@linux.ibm.com>

gcc/
PR target/106907
* config/rs6000/rs6000.cc (vec_const_128bit_to_bytes): Remove
duplicate expression.

(cherry picked from commit c4deccd44655c5d748dfed200a37f2b678c32fe8)

Darwin, PPC: Fix struct layout with pragma pack [PR110044].

This bug was essentially that darwin_rs6000_special_round_type_align()
was ignoring externally-imposed capping of field alignment.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR target/110044

gcc/ChangeLog:

* config/rs6000/rs6000.cc (darwin_rs6000_special_round_type_align):
Make sure that we do not have a cap on field alignment before altering
the struct layout based on the type alignment of the first entry.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/darwin-abi-13-0.c: New test.
* gcc.target/powerpc/darwin-abi-13-1.c: New test.
* gcc.target/powerpc/darwin-abi-13-2.c: New test.
* gcc.target/powerpc/darwin-structs-0.h: New test.

(cherry picked from commit 84d080a29a780973bef47171ba708ae2f7b4ee47)

fortran: Fix ICE on pr96024.f90 on big-endian hosts [PR96024]

The pr96024.f90 testcase ICEs on big-endian hosts. The problem is
that length->val.integer is accessed after checking
length->expr_type == EXPR_CONSTANT, but it is a CHARACTER constant
which uses length->val.character union member instead and on big-endian
we end up reading constant 0x100000000 rather than some small number
on little-endian and if target doesn't have enough memory for 4 times
that (i.e. 16GB allocation), it ICEs.

2023-06-09 Jakub Jelinek <jakub@redhat.com>

PR fortran/96024
* primary.cc (gfc_convert_to_structure_constructor): Only do
constant string ctor length verification and truncation/padding
if constant length has INTEGER type.

(cherry picked from commit 4cf6e322adc19f927859e0a5edfa93cec4b8c844)

Explicitly view_convert_expr mask to signed type when folding pblendvb builtins.

Since mask < 0 will be always false for vector char when
-funsigned-char, but vpblendvb needs to check the most significant
bit. The patch explicitly VCE to vector signed char.

gcc/ChangeLog:

PR target/110108
* config/i386/i386.cc (ix86_gimple_fold_builtin): Explicitly
view_convert_expr mask to signed type when folding pblendvb
builtins.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110108-2.c: New test.

Daily bump.

arm: Fix ICE due to infinite splitting [PR109800]

In r11-966-g9a182ef9ee011935d827ab5c6c9a7cd8e22257d8 we introduce a
simplification to emit_move_insn that attempts to simplify moves of the form:

(set (subreg:M1 (reg:M2 ...)) (constant C))

where M1 and M2 are of equal mode size. That is problematic for the splitter
vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an
lvalue DFmode pseudo into DImode and assign a constant to it with
emit_move_insn, as the new transformation simply undoes this, and we end up
splitting indefinitely.

This patch changes things around in the arm backend so that we use a
DImode temporary (instead of DFmode) and first load the DImode constant
into the pseudo, and then pun the pseudo into DFmode as an rvalue in a
reg -> reg move. I believe this should be semantically equivalent but
avoids the pathalogical behaviour seen in the PR.

gcc/ChangeLog:

PR target/109800
* config/arm/arm.md (movdf): Generate temporary pseudo in DImode
instead of DFmode.
* config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an
lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into
DFmode as an rvalue.

gcc/testsuite/ChangeLog:

PR target/109800
* gcc.target/arm/pure-code/pr109800.c: New test.

(cherry picked from commit f5298d9969b4fa34ff3aecd54b9630e22b2984a5)

arm: PR target/109939 Correct signedness of return type of __ssat intrinsics

As the PR says we shouldn't be using qualifier_unsigned for the return type of the __ssat intrinsics.
UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS already exists for that.
This was just a thinko.
This patch fixes this and the warning with -Wconversion goes away.

Bootstrapped and tested on arm-none-linux-gnueabihf.

gcc/ChangeLog:

PR target/109939
* config/arm/arm-builtins.cc (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS): Use
qualifier_none for the return operand.

gcc/testsuite/ChangeLog:

PR target/109939
* gcc.target/arm/pr109939.c: New test.

(cherry picked from commit 95542a6ec4b350c653b793b7c36a8210b0e9a89d)

Daily bump.

d: Merge upstream dmd 316b89f1e3, phobos 8e8aaae50.

Updates D language version to v2.100.2.

Phobos changes:

    - Fix instantiating std.container.array.Array!T where T is a
      shared class.
    - Fix calling toString on a const std.typecons.Nullable type.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 316b89f1e3.
* dmd/VERSION: Bump version to v2.100.2.

libphobos/ChangeLog:

* src/MERGE: Merge upstream phobos 8e8aaae50.

Daily bump.

Fortran: fix diagnostics for SELECT RANK [PR100607]

gcc/fortran/ChangeLog:

PR fortran/100607
* resolve.cc (resolve_select_rank): Remove duplicate error.
(resolve_fl_var_and_proc): Prevent NULL pointer dereference and
suppress error message for temporary.

gcc/testsuite/ChangeLog:

PR fortran/100607
* gfortran.dg/select_rank_6.f90: New test.

(cherry picked from commit fae09dfc0e6bf4cfe35d817558827aea78c6426f)

Daily bump.

target/110088: Improve operation of l-reg with const after move from d-reg.

After reload, there may be sequences like
   lreg = dreg
   lreg = lreg <op> const
with an LD_REGS dreg, non-LD_REGS lreg, and <op> in PLUS, IOR, AND.
If dreg dies after the first insn, it is possible to use
   dreg = dreg <op> const
   lreg = dreg
instead which is more efficient.

gcc/
PR target/110088
* config/avr/avr.md: Add an RTL peephole to optimize operations on
non-LD_REGS after a move from LD_REGS.
(piaop): New code iterator.

Daily bump.

doc: Fix description of x86 -m32 option [PR109954]

This option does not imply -march=i386 so it's incorrect to say it
generates code that will run on "any i386 system".

gcc/ChangeLog:

PR target/109954
* doc/invoke.texi (x86 Options): Fix description of -m32 option.

(cherry picked from commit eeb92704967875411416b0b9508aa6f49e8192fd)

Daily bump.

[libstdc++] [testsuite] xfail double-prec from_chars for x86_64 ldbl

When long double is wider than double, but from_chars is implemented
in terms of double, tests that involve the full precision of long
double are expected to fail. Mark them as such on x86_64-*-vxworks*.

for libstdc++-v3/ChangeLog

* testsuite/20_util/from_chars/4.cc: Skip long double test06
on x86_64-vxworks.
* testsuite/20_util/to_chars/long_double.cc: Xfail run on
x86_64-vxworks.

(cherry picked from commit 282e4e745981c5c6e3edaae315e1f499a45402df)

[libstdc++] [testsuite] xfail to_chars/long_double on x86-vxworks

Just as on aarch64, x86's wider long double experiences loss of
precision with from_chars implemented in terms of double. Expect the
execution fail.

for libstdc++-v3/ChangeLog

* testsuite/20_util/to_chars/long_double.cc: Expect execution
fail on x86-vxworks.

(cherry picked from commit 7daa166fe89fca4ff1baa063c00a5a690f7e462f)

[libstdc++] [testsuite] xfail double-prec from_chars for ldbl

When long double is wider than double, but from_chars is implemented
in terms of double, tests that involve the full precision of long
double are expected to fail. Mark them as such on aarch64-*-vxworks.

for libstdc++-v3/ChangeLog

* testsuite/20_util/from_chars/4.cc: Skip long double test06
on aarch64-vxworks.
* testsuite/20_util/to_chars/long_double.cc: Xfail run on
aarch64-vxworks.

(cherry picked from commit e383fc69d2a3eab37319ea41543ee09c8cdd6e57)

libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

(cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)

libstdc++: Simplify calculation of expected value in simd test

This avoids a failure on PR109964.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/integer_operators.cc:
Compute expected value differently to avoid getting turned into
a vector shift.

(cherry picked from commit 3e2689e568425f14d6728504ad6f5d32b90320ad)

libstdc++: Fix test assumptions on long and long double

Expect that long might not fit into the long double mantissa bits.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/operator_cvt.cc: Make long
double <-> (u)long conversion tests conditional on sizeof(long
double) and sizeof(long).

(cherry picked from commit 291549d43e823f163fa9961e42a751b5ce0d57fb)

libstdc++: Resolve -Wsign-compare issue

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_ppc.h (_S_bit_shift_left):
Negative __y is UB, so prefer signed compare.

(cherry picked from commit 1a1abec1d618cde709c585fcce89330bb33b07ac)

testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic

Glibc defines int32_t as 'int' while newlib defines it as 'long int'.

Although these correspond to the same size, g++ complains when using the
'wrong' version:
  invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
or
  invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]

when calling vst1q(int32*, int32x4_t) with a first parameter of type
'long int *' (resp. 'int *')

To make this test pass with any type of toolchain, this patch defines
'word_type' according to which libc is in use.

2023-05-23  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c:
Support both definitions of int32_t.

(cherry picked from commit d12d2aa4fccc76a9a08c8120c5e37d9cab8683e8)

riscv: update riscv_asan_shadow_offset

gcc/
PR target/110036
* config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to
match libsanitizer.

Daily bump.

target/104327: Allow more inlining between different optimization levels.

avr-common.cc introduces the following options that are set depending
on optimization level: -mgas-isr-prologues, -mmain-is-OS-task and
-fsplit-wide-types-early. The inliner thinks that different options
disallow cross-optimization inlining, so provide can_inline_p.

gcc/
PR target/104327
* config/avr/avr.cc (avr_can_inline_p): New static function.
(TARGET_CAN_INLINE_P): Define to that function.

target/82931: Make a pattern more generic to match more bit-transfers.

There is already a pattern in avr.md that matches single-bit transfers
from one register to another one, but it only handled bit 0 of 8-bit
registers. This change makes that pattern more generic so it matches
more of similar single-bit transfers.

gcc/
PR target/82931
* config/avr/avr.md (*movbitqi.0): Rename to *movbit<mode>.0-6.
Handle any bit position and use mode QISI.
* config/avr/avr.cc (avr_rtx_costs_1) [IOR]: Return a cost
of 2 insns for bit-transfer of respective style.

gcc/testsuite/
PR target/82931
* gcc.target/avr/pr82931.c: New test.

Daily bump.

libstdc++: Fix type of first argument to vec_cntm call

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.

(cherry picked from commit efd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9)

libstdc++: Fix SFINAE for __is_intrinsic_type on ARM

On ARM NEON doesn't support double, so __is_intrinsic_type_v<double,
whatever> should say false (instead of being ill-formed).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.

(cherry picked from commit aa8b363171a95b8f867a74f29c75f9577e9087e1)

libstdc++: Add missing constexpr to simd_neon

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.

(cherry picked from commit b0a483b0a011f9cbc8b25053eae809c77dae2a12)

Daily bump.

Improve cost computation for single-bit bit insertions.

Some miscomputation of rtx_costs lead to sub-optimal code for
single-bit bit insertions. This patch implements TARGET_INSN_COST,
which has a chance to see the whole insn during insn combination;
in particular the SET_DEST of (set (zero_extract (...) ...)).

gcc/
* config/avr/avr.cc (avr_insn_cost): New static function.
(TARGET_INSN_COST): Define to that function.

libstdc++: Add missing constexpr to simd

The constexpr API is only available with -std=gnu++XX (and proposed for
C++26). The proposal is to have the complete simd API usable in constant
expressions.

This patch resolves several issues with using simd in constant
expressions.

Issues why constant_evaluated branches are necessary:
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd.h (_SimdWrapper::_M_set):
Avoid vector builtin subscripting in constant expressions.
(resizing_simd_cast): Avoid memcpy if constant_evaluated.
(const_where_expression, where_expression, where)
(__extract_part, simd_mask, _SimdIntOperators, simd): Add either
_GLIBCXX_SIMD_CONSTEXPR (on public APIs), or constexpr (on
internal APIs).
* include/experimental/bits/simd_builtin.h (__vector_permute)
(__vector_shuffle, __extract_part, _GnuTraits::_SimdCastType1)
(_GnuTraits::_SimdCastType2, _SimdImplBuiltin)
(_MaskImplBuiltin::_S_store): Add constexpr.
(_CommonImplBuiltin::_S_store_bool_array)
(_SimdImplBuiltin::_S_load, _SimdImplBuiltin::_S_store)
(_SimdImplBuiltin::_S_reduce, _MaskImplBuiltin::_S_load): Add
constant_evaluated case.
* include/experimental/bits/simd_fixed_size.h
(_S_masked_load): Reword comment.
(__tuple_element_meta, __make_meta, _SimdTuple::_M_apply_r)
(_SimdTuple::_M_subscript_read, _SimdTuple::_M_subscript_write)
(__make_simd_tuple, __optimize_simd_tuple, __extract_part)
(__autocvt_to_simd, _Fixed::__traits::_SimdBase)
(_Fixed::__traits::_SimdCastType, _SimdImplFixedSize): Add
constexpr.
(_SimdTuple::operator[], _M_set): Add constexpr and add
constant_evaluated case.
(_MaskImplFixedSize::_S_load): Add constant_evaluated case.
* include/experimental/bits/simd_scalar.h: Add constexpr.

* include/experimental/bits/simd_x86.h (_CommonImplX86): Add
constexpr and add constant_evaluated case.
(_SimdImplX86::_S_equal_to, _S_not_equal_to, _S_less)
(_S_less_equal): Value-initialize to satisfy constexpr
evaluation.
(_MaskImplX86::_S_load): Add constant_evaluated case.
(_MaskImplX86::_S_store): Add constexpr and constant_evaluated
case. Value-initialize local variables.
(_MaskImplX86::_S_logical_and, _S_logical_or, _S_bit_not)
(_S_bit_and, _S_bit_or, _S_bit_xor): Add constant_evaluated
case.
* testsuite/experimental/simd/pr109261_constexpr_simd.cc: New
test.

(cherry picked from commit da579188807ede4ee9466d0b5bf51559c96a0b51)

libstdc++: Resolve -Wunused-variable warnings in stdx::simd and tests

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_builtin.h (_S_fpclassify): Move
__infn into #ifdef'ed block.
* testsuite/experimental/simd/tests/fpclassify.cc: Declare
constants only when used.
* testsuite/experimental/simd/tests/frexp.cc: Likewise.
* testsuite/experimental/simd/tests/logarithm.cc: Likewise.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc:
Likewise.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
Move totest and expect1 into #ifdef'ed block.

(cherry picked from commit a7129e82bed1bd4f513fc3c3f401721e2c96a865)

libstdc++: Add missing trait is_simd_flag_type

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (is_simd_flag_type): New.
(_IsSimdFlagType): New.
(copy_from, copy_to, load ctors): Constrain _Flags using
_IsSimdFlagType.

(cherry picked from commit 97383b4116ea63486eb5bfb0a7140871bed75fb4)

libstdc++: Fix operator% implementation for Clang

This resolves a regression of my previous fix where Clang would ICE on
_S_divides.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_SimdImplX86): Use
_Base::_S_divides if the optimized _S_divides function is hidden
via the preprocessor.

(cherry picked from commit 1a62008123694b2ac07f28e25fc6e5ff371925f5)

libstdc++: Fix simd compilation with Clang

Clang fails to compile some constant expressions involving simd.
Therefore, just disable this non-conforming extension for clang.

Fix AVX512 blend implementation for Clang. It was converting the bitmask
to bool before, which is obviously wrong. Instead use a Clang builtin to
convert the bitmask to vector-mask before using a vector blend ?:. A
similar change is required for the masked unary implementation, because
the GCC builtins do not exist on Clang.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_detail.h: Don't declare the
simd API as constexpr with Clang.
* include/experimental/bits/simd_x86.h (__movm): New.
(_S_blend_avx512): Resolve FIXME. Implement blend using __movm
and ?:.
(_SimdImplX86::_S_masked_unary): Clang does not implement the
same builtins. Implement the function using __movm, ?:, and -
operators on vector_size types instead.

(cherry picked from commit 8ff3ca2d94721fab78f167d435d4ea4fa4fdca6a)

libstdc++: Fix formatting

Whitespace changes only.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Line breaks and indenting
fixed to follow the libstdc++ standard.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_fixed_size.h: Likewise.
* include/experimental/bits/simd_neon.h: Likewise.
* include/experimental/bits/simd_ppc.h: Likewise.
* include/experimental/bits/simd_scalar.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.

(cherry picked from commit b31186e589caee43ac5720a538d9a41ebf514e81)

libstdc++: Always-inline most of non-cmath fixed_size implementation

For simd, the inlining behavior should be similar to builtin types. (No
operator on buitin types is ever translated into a function call.)
Therefore, always_inline is the right choice (i.e. inline on -O0 as
well).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_fixed_size.h
(_SimdImplFixedSize::_S_broadcast): Replace inline with
_GLIBCXX_SIMD_INTRINSIC.
(_SimdImplFixedSize::_S_generate): Likewise.
(_SimdImplFixedSize::_S_load): Likewise.
(_SimdImplFixedSize::_S_masked_load): Likewise.
(_SimdImplFixedSize::_S_store): Likewise.
(_SimdImplFixedSize::_S_masked_store): Likewise.
(_SimdImplFixedSize::_S_min): Likewise.
(_SimdImplFixedSize::_S_max): Likewise.
(_SimdImplFixedSize::_S_complement): Likewise.
(_SimdImplFixedSize::_S_unary_minus): Likewise.
(_SimdImplFixedSize::_S_plus): Likewise.
(_SimdImplFixedSize::_S_minus): Likewise.
(_SimdImplFixedSize::_S_multiplies): Likewise.
(_SimdImplFixedSize::_S_divides): Likewise.
(_SimdImplFixedSize::_S_modulus): Likewise.
(_SimdImplFixedSize::_S_bit_and): Likewise.
(_SimdImplFixedSize::_S_bit_or): Likewise.
(_SimdImplFixedSize::_S_bit_xor): Likewise.
(_SimdImplFixedSize::_S_bit_shift_left): Likewise.
(_SimdImplFixedSize::_S_bit_shift_right): Likewise.
(_SimdImplFixedSize::_S_remquo): Add inline keyword (to be
explicit about not always-inline, yet).
(_SimdImplFixedSize::_S_isinf): Likewise.
(_SimdImplFixedSize::_S_isfinite): Likewise.
(_SimdImplFixedSize::_S_isnan): Likewise.
(_SimdImplFixedSize::_S_isnormal): Likewise.
(_SimdImplFixedSize::_S_signbit): Likewise.

(cherry picked from commit e37b04328ae68f91efe1fb2c5de9122be34bc74a)

libstdc++: More efficient masked inc-/decrement implementation

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108856
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_masked_unary): More efficient
implementation of masked inc-/decrement for integers and floats
without AVX2.
* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
builtins for masked inc-/decrement.

(cherry picked from commit 6ce55180d494b616e2e3e68ffedfe9007e42ca06)

libstdc++: Test that integral simd reductions are precise

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/reductions.cc: Introduce
max_distance as the type-dependent max error.

(cherry picked from commit 8fda668e0919af9ceda9435f02a1708b375b2913)

libstdc++: Fix simd build failure on clang

Clang does not support __attribute__ on lambdas. Therefore, only set
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA if __clang__ is not defined.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_detail.h
(_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA): Define as empty for
__clang__.

(cherry picked from commit 92c47b15d5af3e7f93d11ad69a45b6d1cb8661c5)

libstdc++: Annotate most lambdas with always_inline

All of the annotated lambdas are simply a necessary means for
implementing these functions and should never result in an actual
function call. Many of these lambdas would go away if C++ had better
language support for packs.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_detail.h: Define
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd.h: Annotate lambdas with
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd_builtin.h: Ditto.
* include/experimental/bits/simd_converter.h: Ditto.
* include/experimental/bits/simd_fixed_size.h: Ditto.
* include/experimental/bits/simd_math.h: Ditto.
* include/experimental/bits/simd_neon.h: Ditto.
* include/experimental/bits/simd_x86.h: Ditto.

(cherry picked from commit 53b55701aed6896f456cdec7997ac6bbef1d6074)

Daily bump.

Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the removal
of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions.  In particular, generating
these instructions seems to break Eigen on big endian systems.

I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double).  I have also done the builds and test on a power8
big endian system (testing both 32-bit and 64-bit code generation).  Chip has
verified that it fixes the problem that Eigen encountered.  Can I check this
into the master GCC branch?  After a burn-in period, can I check this patch
into the active GCC branches?

Thanks in advance.

2023-05-22   Michael Meissner  <meissner@linux.ibm.com>

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.  Back port from master
04/10/2023 change.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.  Back port from master
04/10/2023 change.

atch.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
   operands are another bit-wise operation with a common input.  If so,
   distribute the bit operations to save an operation and possibly two if
   constants are involved.  For example, convert
     (A | B) & (A | C) into A | (B & C)
   Further simplification will occur if B and C are constants.  */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.

So, the following patch adds to those a ! modifier to ensure that,
even at GENERIC that modifier means !EXPR_P which is exactly what we want
IMHO.

2023-05-21  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.

* gcc.target/aarch64/sve/pr109505.c: New test.

(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)

vect: Don't retry if the previous analysis fails

When working on a cost tweaking patch, I found that a newly
added test case has different dumpings with stage-1 and
bootstrapped gcc.  By looking into it, the apparent reason
is vect_analyze_loop_2 doesn't get slp_done_for_suggested_uf
set expectedly, the following retrying will use the garbage
slp_done_for_suggested_uf instead.  In fact, the setting of
slp_done_for_suggested_uf only happens when the previous
analysis succeeds, for the mentioned test case, its previous
analysis does fail, it's unexpected to use the value of
slp_done_for_suggested_uf any more.

In function vect_analyze_loop_1, we only return success when
res is true, which is the result of 1st analysis.  It means
we never try to vectorize with unroll_vinfo if the previous
analysis fails.  So this patch shouldn't break anything, and
just stop some useless analysis early.

gcc/ChangeLog:

* tree-vect-loop.cc (vect_analyze_loop_1): Don't retry analysis with
suggested unroll factor once the previous analysis fails.

(cherry picked from commit a04bf39f61ce7814d197d712760f08c206daf4f1)

Daily bump.

Darwin, libgcc : Adjust min version supported for the OS.

Tools from later versions of the OS deprecate or fail to support
earlier OS revisions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:

* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.

(cherry picked from commit 20b8779ea9bd82b26eeb195b30f695168cd7ae1d)