git.ipfire.org Git - thirdparty/gcc.git/log

pru: Enable section anchoring by default

Loading an arbitrary constant address in a register is expensive for
PRU. So enable section anchoring by default to utilize the unsigned
byte constant offset operand of load/store instructions.
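
A minimal sketch of the kind of source that benefits, assuming three
hypothetical globals (counter_a, counter_b, counter_c) that end up in the
same section.  With section anchors the compiler can materialize one base
address and reach each variable through a small constant offset, rather
than loading three separate constant addresses.  The names are
illustrative only and are not taken from the new tests.

```cpp
// Hypothetical globals; the names are illustrative, not from the patch.
int counter_a = 1;
int counter_b = 2;
int counter_c = 3;

// With -fsection-anchors, all three accesses can share one anchor
// address plus a small constant offset per variable.
int sum_counters()
{
  return counter_a + counter_b + counter_c;
}
```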

gcc/ChangeLog:

* common/config/pru/pru-common.cc
(TARGET_OPTION_OPTIMIZATION_TABLE): New definition.
* config/pru/pru.cc (TARGET_MIN_ANCHOR_OFFSET): Set minimal
anchor offset.
(TARGET_MAX_ANCHOR_OFFSET): Set maximum anchor offset.

gcc/testsuite/ChangeLog:

* gcc.target/pru/section-anchors-1.c: New test.
* gcc.target/pru/section-anchors-2.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

testsuite: Fix for targets not passing argc/argv [PR116154]

PRU and other simulator targets do not pass any argv arguments
to main.  Rather than erroneously relying on argc==0, use a volatile
variable.

I reverted the fix for PR67947 in r6-3891-g8a18fcf4aa1d5c, and made sure
that the updated test case still fails for x86_64:

  $ make check-gcc-c RUNTESTFLAGS="dg-torture.exp=pr67947.c"
  ...
  FAIL: gcc.dg/torture/pr67947.c   -O1  execution test
  ...
  # of expected passes            8
  # of unexpected failures        8

Fix was suggested by Andrew Pinski in PR116154.  Committed as obvious.
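
A minimal sketch of the technique, not the actual pr67947.c change.  It
assumes a hypothetical variable runtime_one and helper scaled().  Reading a
volatile object gives a value the optimizer cannot fold at compile time,
which is the property tests usually want from argc, without depending on
how the target starts main.

```cpp
// Hypothetical name; the compiler must reload this on every read,
// so it cannot be treated as a compile-time constant.
volatile int runtime_one = 1;

// Stands in for test code that previously derived its input from argc.
int scaled(int x)
{
  return x * runtime_one;
}
```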

PR testsuite/116154

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr67947.c: Use volatile variable instead of
argc.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

libstdc++: drop bogus 'dg_do run' directive

We already have a valid 'dg-do run' (- vs _) directive, so drop the bogus
one.

libstdc++-v3/ChangeLog:
* testsuite/28_regex/traits/char/translate.cc: Drop bogus 'dg_do run'.

[PR rtl-optimization/116136] Fix previously latent SUBREG simplification bug

This fixes a testsuite regression seen on m68k after some of the recent ext-dce
changes.  Ultimately Richard S and I have concluded the bug was a latent issue
in subreg simplification.

Essentially when simplifying something like

(set (target:M1) (subreg:M1 (subreg:M2 (reg:M1) 0) 0))

Where M1 > M2.  We'd simplify to:

(set (target:M1) (reg:M1))

The problem is on a big endian target that's wrong.   Consider if M1 is DI and
M2 is SI.    The original should extract bits 32..63 from the source register
and store them into bits 0..31 of the target register. In the simplified form
it's just a copy, so bits 0..63 of the source end up bits 0..63 of the target.

This shows up as the following regressions on the m68k:

> Tests that now fail, but worked before (3 tests):
>
> gcc: gcc.c-torture/execute/960416-1.c   -O2  execution test
> gcc: gcc.c-torture/execute/960416-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> gcc: gcc.c-torture/execute/960416-1.c   -Os  execution test

The fix is pretty trivial: instead of hardcoding "0" as the byte offset in the
test for the simplification, we need to use the subreg_lowpart_offset.
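
The difference between byte 0 and the lowpart offset can be sketched with a
simplified model.  This is not GCC code: lowpart_offset and subreg_si_of_di
are hypothetical helpers, and the model assumes byte and word endianness
agree, with M1 = DI (8 bytes) and M2 = SI (4 bytes).

```cpp
#include <cstdint>

// Simplified model of a lowpart byte offset: 0 on little endian,
// (inner size - outer size) on big endian.
constexpr unsigned lowpart_offset(unsigned outer_bytes, unsigned inner_bytes,
                                  bool big_endian)
{
  return big_endian ? inner_bytes - outer_bytes : 0;
}

// Model of (subreg:SI (reg:DI) byte_offset): pick 4 bytes out of an
// 8-byte register value, counting bytes in memory order.
constexpr std::uint32_t subreg_si_of_di(std::uint64_t reg, unsigned byte_offset,
                                        bool big_endian)
{
  // On big endian byte 0 is the most significant byte.
  const unsigned shift = big_endian ? (8 - 4 - byte_offset) * 8
                                    : byte_offset * 8;
  return static_cast<std::uint32_t>(reg >> shift);
}

// Byte 0 on big endian selects bits 32..63, not the low part.
static_assert(subreg_si_of_di(0x1122334455667788ull, 0, true) == 0x11223344u,
              "byte 0 is the high half on big endian");
static_assert(subreg_si_of_di(0x1122334455667788ull,
                              lowpart_offset(4, 8, true), true) == 0x55667788u,
              "the lowpart offset selects the low half");
```

In this model only the lowpart offset makes the nested subreg equivalent to
a plain copy of the low bits, which is why the check must not hardcode 0.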

Anyway, bootstrapped and regression tested on m68k and x86_64 and tested on the
other embedded targets as well without regressions.  Naturally it fixes the
regression noted above.  I haven't seen other testsuite improvements when I spot
checked some of the big endian crosses.

PR rtl-optimization/116136
gcc/
* simplify-rtx.cc (simplify_context::simplify_subreg): Check
that we're working with the lowpart offset rather than byte 0.

libstdc++: Handle strerror returning null

The Linux man page for strerror says that some systems return NULL for
an unknown error number. That violates the C and POSIX standards, but we
can easily handle it to avoid a segfault.
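
A minimal sketch of the idea, not the code in system_error.cc.  The helper
names and the fallback text "Unknown error" are assumptions made for
illustration.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: substitute a fallback when the C library
// hands back a null pointer instead of a message.
constexpr const char* message_or_fallback(const char* msg)
{
  return msg ? msg : "Unknown error";  // fallback text is an assumption
}

// Sketch of a strerror wrapper that never constructs a std::string
// from a null pointer.
std::string strerror_string_sketch(int err)
{
  return message_or_fallback(std::strerror(err));
}
```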

libstdc++-v3/ChangeLog:

* src/c++11/system_error.cc (strerror_string): Handle
non-conforming NULL return from strerror.

libstdc++: Only append "@euro" to locale names for Glibc testing

The testsuite automatically appends "@euro" to "xx.ISO8859-15" locale
names on all targets except FreeBSD, DragonflyBSD, and NetBSD. It should
only be for Glibc, not all non-BSD targets.

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
Only append "@euro" to ".ISO8859-15" locales for Glibc.

libstdc++: Bump __cpp_lib_format value for std::runtime_format

We already supported this feature, but couldn't set the feature test
macro accordingly because we were missing support for older features.
Now that we support all the older <format> changes, we can set this to
the correct value.

libstdc++-v3/ChangeLog:

* include/bits/version.def (format): Update value for C++26.
* include/bits/version.h: Regenerate.
* include/std/format (runtime_format, wruntime_format): Check
__cpp_lib_format instead of __cplusplus.
* testsuite/std/format/functions/format.cc: Update expected
value of macro for C++26 mode.

libstdc++: Define C++26 member visit for std::basic_format_arg [PR110356]

Implement the std::format changes from P2637R3. This adds visit member
functions to std::basic_format_arg and deprecates the non-member
function std::visit_format_arg.

libstdc++-v3/ChangeLog:

PR libstdc++/110356
* include/bits/c++config (_GLIBCXX26_DEPRECATED): Define.
(_GLIBCXX26_DEPRECATED_SUGGEST): Define.
* include/bits/version.def (format): Update for C++26.
* include/bits/version.h: Regenerate.
* include/std/format (basic_format_arg::visit): New member
functions.
(visit_format_arg): Add deprecated attribute.
* testsuite/std/format/arguments/args.cc: Expect deprecated
warnings. Check member visit.
* testsuite/std/format/functions/format.cc: Update expected
value for __cpp_lib_format macro.
* testsuite/std/format/parse_ctx.cc: Add dg-warning for
deprecation.

libstdc++: Define C++26 member visit for std::variant [PR110356]

Implement the std::variant changes from P2637R3.
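
A minimal usage sketch, assuming the member function is advertised by
__cpp_lib_variant >= 202306L (my reading of the paper's feature test value).
The Classify visitor and classify function are hypothetical names.  The
fallback branch uses the non-member std::visit so the sketch also builds
in C++17 mode.

```cpp
#include <variant>

// Hypothetical visitor: returns 1 for int and 2 for double.
struct Classify
{
  constexpr int operator()(int) const { return 1; }
  constexpr int operator()(double) const { return 2; }
};

constexpr int classify(const std::variant<int, double>& v)
{
#if defined(__cpp_lib_variant) && __cpp_lib_variant >= 202306L
  return v.visit(Classify{});        // C++26 member visit
#else
  return std::visit(Classify{}, v);  // pre-C++26 spelling
#endif
}
```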

libstdc++-v3/ChangeLog:

PR libstdc++/110356
* include/bits/version.def (variant): Update for C++26.
* include/bits/version.h: Regenerate.
* include/std/variant (variant::visit): New member functions.
* testsuite/20_util/variant/visit.cc: Check second alternative.
* testsuite/20_util/variant/visit_member.cc: New test.

libstdc++: Implement C++26 type checking for std::format args [PR115776]

Implement the changes from P2757R3, which enhance the parse context to
be able to do type checking on format arguments, and to use that to
ensure that args used for width and precisions are integral types.

libstdc++-v3/ChangeLog:

PR libstdc++/115776
* include/bits/version.def (format): Update for C++26.
* include/bits/version.h: Regenerate.
* include/std/format (basic_format_parse_context): Remove
default argument from constructor and split into two
constructors. Make the constructor taking size_t private for
C++26 and later.
(basic_format_parse_context::check_dynamic_spec): New member
function template.
(basic_format_parse_context::check_dynamic_spec_integral): New
member function.
(basic_format_parse_context::check_dynamic_spec_string):
Likewise.
(__format::_Spec::_S_parse_width_or_precision): Use
check_dynamic_spec_integral.
(__format::__to_arg_t_enum): New helper function.
(basic_format_arg): Declare __to_arg_t_enum as friend.
(__format::_Scanner): Define and use a derived parse context
type.
(__format::_Checking_scanner): Make arg types available to parse
context.
* testsuite/std/format/functions/format.cc: Check for new values
of __cpp_lib_format macro.
* testsuite/std/format/parse_ctx.cc: Check all members of
basic_format_parse_context.
* testsuite/std/format/parse_ctx_neg.cc: New test.
* testsuite/std/format/string.cc: Add more checks for dynamic
width and precision args.

libstdc++: Support P2510R3 "Formatting pointers" as a DR for C++20

We already enable this for -std=gnu++20 but we can do it for -std=c++20
too. Both libc++ and MSVC also treat this change as a DR for C++20.

Now that the previous change to the value of __cpp_lib_format is
supported, we can finally update it to 202304 to indicate support for
this feature too.

libstdc++-v3/ChangeLog:

* include/bits/version.def (format): Update value for P2510R3.
* include/bits/version.h: Regenerate.
* include/std/format (_GLIBCXX_P2518R3): Remove misspelled macro
and check __glibcxx_format instead.
* testsuite/std/format/functions/format.cc: Check value of the
__cpp_lib_format macro for formatting pointers support.
* testsuite/std/format/parse_ctx.cc: Likewise.

libstdc++: Handle encodings in localized chrono formatting [PR109162]

This implements the C++23 paper P2419R2 (Clarify handling of encodings
in localized formatting of chrono types). The requirement is that when
the literal encoding is "a Unicode encoding form" and the formatting
locale uses a different encoding, any locale-specific strings such as
"août" for std::chrono::August should be converted to the literal
encoding.

Using the recently-added std::locale::encoding() function we can check
the locale's encoding and then use iconv if a conversion is needed.
Because nl_langinfo_l and iconv_open both allocate memory, a naive
implementation would perform multiple allocations and deallocations for
every snippet of locale-specific text that needs to be converted to
UTF-8. To avoid that, a new internal locale::facet is defined to store
the text_encoding and an iconv_t descriptor, which are then cached in
the formatting locale. This requires access to the internals of a
std::locale object in src/c++20/format.cc, so that new file needs to be
compiled with -fno-access-control, as well as -std=gnu++26 in order to
use std::text_encoding.

Because the new std::text_encoding and std::locale::encoding() symbols
are only in the libstdc++exp.a archive, we need to include
src/c++26/text_encoding.cc in the main library, but not export its
symbols yet. This means they can be used by the two new functions which
are exported from the main library.

The encoding conversions are done for C++20, treating it as a DR that
resolves LWG 3656.

With this change we can increase the value of the __cpp_lib_format macro
for C++23. The value should be 202207 for P2419R2, but we already
implement P2510R3 (Formatting pointers) so can use the value 202304.

libstdc++-v3/ChangeLog:

PR libstdc++/109162
* acinclude.m4 (libtool_VERSION): Update to 6:34:0.
* config/abi/pre/gnu.ver: Disambiguate old patterns. Add new
GLIBCXX_3.4.34 symbol version and new exports.
* configure: Regenerate.
* include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific):
Add new accessor functions to use a reserved bit in _Spec.
(__formatter_chrono::_M_parse): Use _M_locale_specific(true)
when chrono-specs contains locale-dependent conversion
specifiers.
(__formatter_chrono::_M_format): Open iconv descriptor if
conversion to UTF-8 will be needed.
(__formatter_chrono::_M_write): New function to write a
localized string with possible character conversion.
(__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B)
(__formatter_chrono::_M_p, __formatter_chrono::_M_r)
(__formatter_chrono::_M_x, __formatter_chrono::_M_X)
(__formatter_chrono::_M_locale_fmt): Use _M_write.
* include/bits/version.def (format): Update value.
* include/bits/version.h: Regenerate.
* include/std/format (_GLIBCXX_P2518R3): Check feature test
macro instead of __cplusplus.
(basic_format_context): Declare __formatter_chrono as friend.
* src/c++20/Makefile.am: Add new file.
* src/c++20/Makefile.in: Regenerate.
* src/c++20/format.cc: New file.
* testsuite/std/time/format_localized.cc: New test.
* testsuite/util/testsuite_abi.cc: Add new symbol version.

testsuite: fix dg-require-* order vs dg-additional-sources

Per gccint, 'dg-require-*' must come before any
'dg-additional-sources' directives. Fix a handful of deviant cases.

* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: Fix dg-require-profiling
directive order.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: Likewise.

testsuite: fix dg-require-effective-target order vs dg-additional-sources

Per gccint, 'dg-require-effective-target' must come before any
'dg-additional-sources' directives. Fix a handful of deviant cases.
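
A sketch of the expected ordering in a test header.  The effective-target
keyword, the file name "extra-unit.c" and the function are placeholders
chosen for illustration, not taken from the fixed tests.

```cpp
/* { dg-do run } */
/* The requirement comes first ...  */
/* { dg-require-effective-target lto } */
/* ... and only then the additional sources (hypothetical file name).  */
/* { dg-additional-sources "extra-unit.c" } */

constexpr int directive_example_value() { return 42; }
```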

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/aapcs64/func-ret-3.c: Fix dg-require-effective-target directive order.
* gcc.target/aarch64/aapcs64/func-ret-4.c: Likewise.
* gfortran.dg/PR100914.f90: Likewise.

libgomp/ChangeLog:
* testsuite/libgomp.c++/pr24455.C: Fix dg-require-effective-target directive order.
* testsuite/libgomp.c/pr24455.c: Likewise.

testsuite: fix 'dg-do-preprocess' typo

We want 'dg-do preprocess', not 'dg-do-preprocess'. Fix that.

PR target/106828
* g++.target/loongarch/pr106828.C: Fix 'dg-do compile' typo.

testsuite: fix 'dg-do-compile' typos

We want 'dg-do compile', not 'dg-do-compile'. Fix that.

PR target/69194
PR c++/92024
PR c++/110057
* c-c++-common/Wshadow-1.c: Fix 'dg-do compile' typo.
* g++.dg/tree-ssa/devirt-array-destructor-1.C: Likewise.
* g++.dg/tree-ssa/devirt-array-destructor-2.C: Likewise.
* gcc.target/arm/pr69194.c: Likewise.

testsuite: libgomp: fix dg-do run typo

'dg-run' is not a valid dejagnu directive, 'dg-do run' is needed here
for the test to be executed.

That said, it actually seems to be executed for me anyway, presumably
a default in the directory, but let's fix it to be consistent with
other uses in the tree and in that test directory even.

libgomp/ChangeLog:
* testsuite/libgomp.c++/declare-target-indirect-1.C: Fix 'dg-run' typo.

aarch64: Add fpm register helper functions.

The ACLE declares several helper types and functions to facilitate construction
of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h,
or arm_sme.h headers is included. These helpers don't map to specific FP8
instructions and there's no expectation that they will produce a given code
sequence; they're just an abstraction and an aid to the programmer. Thus they are
implemented in a new header file, arm_private_fp8.h.
Users are not expected to include this file, as it is a mere implementation detail,
subject to change. A check is included to guard against direct inclusion.

gcc/ChangeLog:

* config.gcc (extra_headers): Install arm_private_fp8.h.
* config/aarch64/arm_neon.h: Include arm_private_fp8.h.
* config/aarch64/arm_sve.h: Likewise.
* config/aarch64/arm_private_fp8.h: New file
(fpm_t): New type representing fpmr values.
(enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats.
(enum __ARM_FPM_OVERFLOW): New enum representing how some fp8
calculations work.
(__arm_fpm_init): New.
(__arm_set_fpm_src1_format): Likewise.
(__arm_set_fpm_src2_format): Likewise.
(__arm_set_fpm_dst_format): Likewise.
(__arm_set_fpm_overflow_cvt): Likewise.
(__arm_set_fpm_overflow_mul): Likewise.
(__arm_set_fpm_lscale): Likewise.
(__arm_set_fpm_lscale2): Likewise.
(__arm_set_fpm_nscale): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/fp8-helpers-neon.c: New test of fpmr helper
functions.
* gcc.target/aarch64/acle/fp8-helpers-sve.c: New test of fpmr helper
functions presence.
* gcc.target/aarch64/acle/fp8-helpers-sme.c: New test of fpmr helper
functions presence.

aarch64: Add support for moving fpm system register

Unlike most system registers, fpmr can be heavily written to in code that
exercises the fp8 functionality. That is because every fp8 intrinsic call
can potentially change the value of fpmr.
Rather than just use an unspec, we treat the fpmr system register like
all other registers and use a move operation to read and write to it.

We introduce a new class of moveable system registers that, currently,
only accepts fpmr and a new constraint, Umv, that allows us to
selectively use mrs and msr instructions when expanding rtl for them.
Given that there is code that depends on "real" registers coming before
"fake" ones, we introduce a new constant FPM_REGNUM that uses an
existing value and renumber registers below that.
This requires us to update the bitmaps that describe which registers
belong to each register class.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_hard_regno_nregs): Add
support for MOVEABLE_SYSREGS class.
(aarch64_hard_regno_mode_ok): Allow reads and writes to fpmr.
(aarch64_regno_regclass): Support MOVEABLE_SYSREGS class.
(aarch64_class_max_nregs): Likewise.
* config/aarch64/aarch64.h (FIXED_REGISTERS): add fpmr.
(CALL_REALLY_USED_REGISTERS): Likewise.
(REGISTER_NAMES): Likewise.
(enum reg_class): Add MOVEABLE_SYSREGS class.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Update class bitmaps to deal with fpmr,
the new MOVEABLE_SYSREGS class and renumbering of registers.
* config/aarch64/aarch64.md: (FPM_REGNUM): added new register
number, reusing old value.
(FFR_REGNUM): Renumber.
(FFRT_REGNUM): Likewise.
(LOWERING_REGNUM): Likewise.
(TPIDR2_BLOCK_REGNUM): Likewise.
(SME_STATE_REGNUM): Likewise.
(TPIDR2_SETUP_REGNUM): Likewise.
(ZA_FREE_REGNUM): Likewise.
(ZA_SAVED_REGNUM): Likewise.
(ZA_REGNUM): Likewise.
(ZT0_REGNUM): Likewise.
(*mov<mode>_aarch64): Add support for moveable sysregs.
(*movsi_aarch64): Likewise.
(*movdi_aarch64): Likewise.
* config/aarch64/constraints.md (MOVEABLE_SYSREGS): New constraint.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/fp8.c: New tests.

aarch64: Add march flags for +fp8 arch extensions

This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently.

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def (fp8): New.
* config/aarch64/aarch64.h (TARGET_FP8): Likewise.
* doc/invoke.texi (AArch64 Options): Document new -march flags
and extensions.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/fp8.c: New test.

c++: array new with value-initialization, again [PR115645]

Unfortunately, my r15-1946 fix broke the attached testcase; the
constexpr evaluation reported an error about not being able to
evaluate the code emitted by build_vec_init. Jason figured out
it's because we were wrongly setting try_const to false, where
in fact it should have been true. Value-initialization of scalars
is constexpr, so we should check that alongside of
type_has_constexpr_default_constructor.
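
A minimal sketch of the kind of code involved; this is not the contents of
constexpr-new23.C.  The function name and the SKETCH_CONSTEXPR macro are
hypothetical, and the constant-evaluation check is only enabled when the
compiler advertises constexpr dynamic allocation.

```cpp
#if defined(__cpp_constexpr_dynamic_alloc)
#  define SKETCH_CONSTEXPR constexpr
#else
#  define SKETCH_CONSTEXPR inline
#endif

// Array new with value-initialization of a scalar element type:
// every element must be zero.
SKETCH_CONSTEXPR int sum_value_initialized()
{
  int *p = new int[3]();
  int sum = p[0] + p[1] + p[2];
  delete[] p;
  return sum;
}

#if defined(__cpp_constexpr_dynamic_alloc)
// Requires the initialization to be usable in a constant expression.
static_assert(sum_value_initialized() == 0, "elements are zero-initialized");
#endif
```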

PR c++/115645

gcc/cp/ChangeLog:

* init.cc (build_vec_init): When initializing a scalar type, try to
create a constant initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-new23.C: New test.

testsuite: Adjust switch-exp-transform-3.c for 32bit

32bit x86 CPUs won't natively support the FFS operation on a 64 bit
type. Therefore, I'm setting the long long int part of the
switch-exp-transform-3.c test to only execute with 64bit targets.

gcc/testsuite/ChangeLog:

* gcc.target/i386/switch-exp-transform-3.c: Set the long long
int test to only execute with 64bit targets.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

LoongArch: Rework bswap{hi,si,di}2 definition

Per a gcc-help thread we are generating sub-optimal code for
__builtin_bswap{32,64}.  To fix it:

- Use a single revb.d instruction for bswapdi2.
- Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
  revb.2h + rotri.w for !TARGET_64BIT.
- Use a single revb.2h instruction for bswapsi2 (x) r>> 16, and a single
  revb.2w instruction for bswapdi2 (x) r>> 32.
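
The identity behind the last item can be checked with a portable model.
The helper names below are hypothetical, and the sketch describes only the
arithmetic (a byte swap followed by a rotate by half the width equals a
byte swap within each half), not the instructions themselves.

```cpp
#include <cstdint>

constexpr std::uint32_t bswap32(std::uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

// Rotate right; n must be in the range 1..31.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

// Swap the two bytes inside each 16-bit half of a 32-bit value.
constexpr std::uint32_t swap_bytes_in_halfwords(std::uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

static_assert(rotr32(bswap32(0x11223344u), 16) == 0x22114433u,
              "bswap then rotate by 16");
static_assert(swap_bytes_in_halfwords(0x11223344u) == 0x22114433u,
              "per-halfword byte swap gives the same result");
```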

Unfortunately I cannot figure out a way to make the compiler generate
revb.4h or revh.{2w,d} instructions.

gcc/ChangeLog:

* config/loongarch/loongarch.md (UNSPEC_REVB_2H, UNSPEC_REVB_4H,
UNSPEC_REVH_D): Remove UNSPECs.
(revb_4h, revh_d): Remove define_insn.
(revb_2h): Define as (rotatert:SI (bswap:SI x) 16) instead of
an UNSPEC.
(revb_2h_extend, revb_2w, *bswapsi2, bswapdi2): New define_insn.
(bswapsi2): Change to define_expand.  Only expand to revb.2h +
rotri.w if !TARGET_64BIT.
(bswapdi2): Change to define_insn of which the output is just a
revb.d instruction.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/revb.c: New test.

LoongArch: Relax ins_zero_bitmask_operand and remove and<mode>3_align

In r15-1207 I was too stupid to realize we just need to relax
ins_zero_bitmask_operand to allow using bstrins for aligning, instead of
adding a new split. And, "> 12" in ins_zero_bitmask_operand also makes
no sense: it rejects bstrins for things like "x & ~4l" with no good
reason.
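
As a rough model only (the real conditions live in predicates.md and are
not reproduced here), an AND mask can be expressed as "insert zeros into
one bit field" when its cleared bits form a single contiguous run.  The
function name is hypothetical.

```cpp
#include <cstdint>

// True if the zero bits of MASK form exactly one contiguous run,
// i.e. "x & MASK" clears a single bit field of x.
constexpr bool clears_single_bit_run(std::uint64_t mask)
{
  const std::uint64_t zeros = ~mask;
  if (zeros == 0)
    return false;                                      // nothing is cleared
  const std::uint64_t lowest = zeros & (~zeros + 1);   // lowest set bit
  // Adding the lowest bit carries through one contiguous run of ones;
  // any overlap left over means there was a second run.
  return ((zeros + lowest) & zeros) == 0;
}

static_assert(clears_single_bit_run(~std::uint64_t{4}),
              "x & ~4l clears one bit");
static_assert(clears_single_bit_run(~std::uint64_t{0xfff}),
              "aligning to 4096 clears one low field");
static_assert(!clears_single_bit_run(~std::uint64_t{5}),
              "two separate cleared bits are not one field");
```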

So fix my errors now.

gcc/ChangeLog:

* config/loongarch/predicates.md (ins_zero_bitmask_operand):
Cover more cases that bstrins can benefit.
(high_bitmask_operand): Remove.
* config/loongarch/constraints.md (Yy): Remove.
* config/loongarch/loongarch.md (and<mode>3_align): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/bstrins-4.c: New test.

middle-end/101478 - ICE with degenerate address during gimplification

When we gimplify &MEM[0B + 4] we are re-folding the address in case
types are not canonical which ends up with a constant address that
recompute_tree_invariant_for_addr_expr ICEs on. Properly guard
that call.

PR middle-end/101478
* gimplify.cc (gimplify_addr_expr): Check we still have an
ADDR_EXPR before calling recompute_tree_invariant_for_addr_expr.

* gcc.dg/pr101478.c: New testcase.

i386: Mark target option with optimization when enabled with opt level [PR116065]

When munroll-only-small-loops was introduced, the option was marked as
Target Save and added to the -O2 defaults.  This makes attribute(optimize)
reset the target option, causing an error when the command line has O1 and
the function attribute has O2 plus other target options.  Mark this option
as Optimization to fix it.

gcc/ChangeLog

PR target/116065
* config/i386/i386.opt (munroll-only-small-loops): Mark as
Optimization instead of Save.

gcc/testsuite/ChangeLog

PR target/116065
* gcc.target/i386/pr116065.c: New test.

recog: Disallow subregs in mode-punned value [PR115881]

In g:9d20529d94b23275885f380d155fe8671ab5353a, I'd extended
insn_propagation to handle simple cases of hard-reg mode punning.
The punned "to" value was created using simplify_subreg rather
than simplify_gen_subreg, on the basis that hard-coded subregs
aren't generally useful after RA (where hard-reg propagation is
expected to happen).

This PR is about a case where the subreg gets pushed into the
operands of a plus, but the subreg on one of the operands
cannot be simplified. Specifically, we have to generate
(subreg:SI (reg:DI sp) 0) rather than (reg:SI sp), since all
references to the stack pointer must be via stack_pointer_rtx.

However, code in x86 (reasonably) expects no subregs of registers
to appear after RA, except for special cases like strict_low_part.
This leads to an awkward situation where we can't ban subregs of sp
(because of the strict_low_part use), can't allow direct references
to sp in other modes (because of the stack_pointer_rtx requirement),
and can't allow rvalue uses of the subreg (because of the "no subregs
after RA" assumption). It all seems a bit of a mess...

I sat on this for a while in the hope that a clean solution might
become apparent, but in the end, I think we'll just have to check
manually for nested subregs and punt on them.

gcc/
PR rtl-optimization/115881
* recog.cc: Include rtl-iter.h.
(insn_propagation::apply_to_rvalue_1): Check that the result
of simplify_subreg does not include nested subregs.

gcc/testsuite/
PR rtl-optimization/115881
* gcc.c-torture/compile/pr115881.c: New test.

rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]

As PR105359 shows, we disable some FLOAT128 expanders for
64-bit long double, but in fact IEEE float128 types like
__ieee128 are only guarded with TARGET_FLOAT128_TYPE and
TARGET_LONG_DOUBLE_128 is only checked when determining if
we can reuse long_double_type_node.  So this patch is to
relax all affected FLOAT128 expander conditions for
FLOAT128_IEEE_P.  By the way, currently IBM double double
type __ibm128 is guarded by TARGET_LONG_DOUBLE_128, so we
have to use TARGET_LONG_DOUBLE_128 for it.  IMHO, it's not
necessary and can be enhanced later.

Btw, for all test cases mentioned in PR105359, I removed
the xfails and tested them with explicit -mlong-double-64,
both pr79004.c and float128-hw.c are tested well and
float128-hw4.c isn't tested (unsupported due to 64 bit
long double conflicts with -mabi=ieeelongdouble).

PR target/105359

gcc/ChangeLog:

* config/rs6000/rs6000.md (@extenddf<FLOAT128:mode>2): Don't check
TARGET_LONG_DOUBLE_128 for FLOAT128_IEEE_P modes.
(extendsf<FLOAT128:mode>2): Likewise.
(trunc<FLOAT128:mode>df2): Likewise.
(trunc<FLOAT128:mode>sf2): Likewise.
(floatsi<FLOAT128:mode>2): Likewise.
(fix_trunc<FLOAT128:mode>si2): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr79004.c: Remove xfails.

rs6000: Use standard name uabd for absdu insns

r14-1832 added a recognition pattern, ifn and optab for ABD
(ABsolute Difference).  We have had some vector absolute
difference unsigned instructions since ISA 3.0 but, as the
associated test cases show, they are not exploited well
because we don't define them with a standard name.  So this
patch renames them to the standard name first.  It also
merges the define_expand and define_insn, as a separate
define_expand isn't needed.  Besides, it adjusts the RTL
pattern to use generic umax and umin rather than
UNSPEC_VADU, which is more meaningful and can catch umin/umax
opportunities.
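
The umax/umin formulation can be sketched per element in scalar code.
The function name uabd below mirrors the optab name for readability but is
otherwise a hypothetical helper, shown for 8-bit elements.

```cpp
#include <algorithm>
#include <cstdint>

// Unsigned absolute difference expressed as umax (a, b) - umin (a, b).
constexpr std::uint8_t uabd(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(std::max(a, b) - std::min(a, b));
}

static_assert(uabd(3, 10) == 7, "smaller minus larger still gives 7");
static_assert(uabd(10, 3) == 7, "order of operands does not matter");
// A plain truncated subtraction would wrap instead.
static_assert(static_cast<std::uint8_t>(3 - 10) == 249, "wraps modulo 256");
```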

gcc/ChangeLog:

* config/rs6000/altivec.md (p9_vadu<mode>3): Rename to ...
(uabd<mode>3): ... this.  Update RTL pattern with umin and umax rather
than UNSPEC_VADU.
(vadu<mode>3): Remove.
(UNSPEC_VADU): Remove.
(usadv16qi): Replace gen_p9_vaduv16qi3 with gen_uabdv16qi3.
(usadv8hi): Replace gen_p9_vaduv8hi3 with gen_uabdv8hi3.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vadub): Replace
expander with uabdv16qi3.
(__builtin_altivec_vaduh): Adjust expander with uabdv8hi3.
(__builtin_altivec_vaduw): Adjust expander with uabdv4si3.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/abd-vectorize-1.c: New test.
* gcc.target/powerpc/abd-vectorize-2.c: New test.

LoongArch: Expand some SImode operations through "si3_extend" instructions if TARGET_64BIT

We already had "si3_extend" insns and we hoped the fwprop or combine
passes can use them to remove unnecessary sign extensions.  But this
does not always work: for cases like x << 1 | y, the compiler
tends to do

    (sign_extend:DI
      (ior:SI (ashift:SI (reg:SI $r4)
                         (const_int 1))
              (reg:SI $r5)))

instead of

    (ior:DI (sign_extend:DI (ashift:SI (reg:SI $r4) (const_int 1)))
            (sign_extend:DI (reg:SI $r5)))

So we cannot match the ashlsi3_extend instruction here and we get:

    slli.w $r4,$r4,1
    or     $r4,$r5,$r4
    slli.w $r4,$r4,0    # <= redundant
    jr    $r1

To eliminate this redundant extension we need to turn SImode shift etc.
to DImode "si3_extend" operations earlier, when we expand the SImode
operation.  We are already doing this for addition, now do it for
shifts, rotates, subtraction, multiplication, division, and modulo as
well.
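
For reference, a sketch of source that produces the "x << 1 | y" shape
discussed above.  The function name is hypothetical and this is not claimed
to be the content of the new test.

```cpp
// 32-bit shift followed by a 32-bit OR; on a 64-bit target the result
// must end up sign-extended in a 64-bit register.
constexpr int shift_or(int x, int y)
{
  return x << 1 | y;
}
```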

The bytepick.w definition for TARGET_64BIT needs to be adjusted so it
won't be undone by the shift expanding.

gcc/ChangeLog:

* config/loongarch/loongarch.md (optab): Add (rotatert "rotr").
(<optab:any_shift><mode>3, <optab:any_div><mode>3,
sub<mode>3, rotr<mode>3, mul<mode>3): Add a "*" to the insn name
so we can redefine the names with define_expand.
(*<optab:any_shift>si3_extend): Remove "*" so we can use them
in expanders.
(*subsi3_extended, *mulsi3_extended): Likewise, also remove the
trailing "ed" for consistency.
(*<optab:any_div>si3_extended): Add mode for sign_extend to
prevent an ICE using it in expanders.
(shift_w, arith_w): New define_code_iterator.
(<optab:any_w><mode>3): New define_expand.  Expand with
<optab:any_w>si3_extend for SImode if TARGET_64BIT.
(<optab:arith_w><mode>3): Likewise.
(mul<mode>3): Expand to mulsi3_extended for SImode if
TARGET_64BIT and ISA_HAS_DIV32.
(<optab:any_div><mode>3): Expand to <optab:any_div>si3_extended
for SImode if TARGET_64BIT.
(rotl<mode>3): Expand to rotrsi3_extend for SImode if
TARGET_64BIT.
(bytepick_w_<bytepick_imm>): Add mode for lshiftrt and ashift.
(bitsize, bytepick_imm, bytepick_w_ashift_amount): New
define_mode_attr.
(bytepick_w_<bytepick_imm>_extend): Adjust for the RTL change
caused by 32-bit shift expanding.  Now bytepick_imm only covers
2 and 3, separate one remaining case to ...
(bytepick_w_1_extend): ... here, new define_insn.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/bitwise_extend.c: New test.

Daily bump.

libstdc++: Fix formatter for low-resolution chrono::zoned_time (LWG 4124)

This implements the proposed resolution of LWG 4124, so that
low-resolution chrono::zoned_time objects can be formatted. The
formatter for zoned_time<D, P> needs to account for get_local_time
returning local_time<common_type_t<D, seconds>> not local_time<D>.
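
The type mismatch can be seen with plain duration types.  A small sketch,
assuming a hypothetical alias local_time_duration that computes the
duration get_local_time would use for a given D:

```cpp
#include <chrono>
#include <type_traits>

// Hypothetical alias: the duration of the local_time returned for a
// zoned_time whose duration is Duration.
template<typename Duration>
using local_time_duration
  = std::common_type_t<Duration, std::chrono::seconds>;

// A minutes-based zoned_time yields a seconds-based local_time ...
static_assert(std::is_same<local_time_duration<std::chrono::minutes>,
                           std::chrono::seconds>::value,
              "coarser than seconds becomes seconds");
// ... while a finer duration is kept as it is.
static_assert(std::is_same<local_time_duration<std::chrono::milliseconds>,
                           std::chrono::milliseconds>::value,
              "finer than seconds is unchanged");
```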

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__local_time_fmt_for): New alias
template.
(formatter<zoned_time<D, P>>): Use __local_time_fmt_for.
* testsuite/std/time/zoned_time/io.cc: Check zoned_time<minutes>
can be formatted.

libstdc++: Fix std::format output for std::chrono::zoned_time

When formatting a chrono::zoned_time with an empty chrono-specs, we were
only formatting its _M_time member, but the ostream insertion operator
uses the format "{:L%F %T %Z}" which includes the time zone
abbreviation. The %Z should also be used when formatting with an empty
chrono-specs.

This commit makes _M_format_to_ostream handle __local_time_fmt
specializations directly, rather than calling itself recursively to
format the _M_time member. We need to be able to customize the output of
_M_format_to_ostream for __local_time_fmt, because we use that type for
gps_time and tai_time as well as for zoned_time and __local_time_fmt.
When formatting gps_time and tai_time we don't want to include the time
zone abbreviation in the "{}" output, but for zoned_time we do want to.
We can reuse the __is_neg flag passed to _M_format_to_ostream (via
_M_format) to say that we want the time zone abbreviation. Currently
the __is_neg flag is only used for duration specializations, so it's
available for __local_time_fmt to use.

In addition to fixing the zoned_time output to use %Z, this commit also
changes the __local_time_fmt output to use %Z. Previously it didn't use
it, just like zoned_time. The standard doesn't actually say how to
format local-time-format-t for an empty chrono-specs, but this behaviour
seems sensible and is what I'm proposing as part of LWG 4124.

While testing this I noticed that some chrono types were not being
tested with empty chrono-specs, so this adds more tests. I also noticed
that std/time/clock/local/io.cc was testing tai_time instead of
local_time, which was completely wrong. That's fixed now too.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__local_fmt_t): Remove unused
declaration.
(__formatter_chrono::_M_format_to_ostream): Add explicit
handling for specializations of __local_time_fmt, including the
time zone abbreviation in the output if __is_neg is true.
(formatter<chrono::tai_time<D>>::format): Add comment.
(formatter<chrono::gps_time<D>>::format): Likewise.
(formatter<chrono::__detail::__local_time_fmt::format): Call
_M_format with true for the __is_neg flag.
* testsuite/std/time/clock/gps/io.cc: Remove unused variable.
* testsuite/std/time/clock/local/io.cc: Fix test error that
checked tai_time instead of local_time. Add tests for
local-time-format-t formatting.
* testsuite/std/time/clock/system/io.cc: Check empty
chrono-specs.
* testsuite/std/time/clock/tai/io.cc: Likewise.
* testsuite/std/time/zoned_time/io.cc: Likewise.

libstdc++: Implement LWG 3886 for std::optional and std::expected

This uses remove_cv_t<T> for the default template argument used for
deducing a type for a braced-init-list used with std::optional and
std::expected.

libstdc++-v3/ChangeLog:

* include/std/expected (expected(U&&), operator=(U&&))
(value_or): Use remove_cv_t on default template argument, as per
LWG 3886.
* include/std/optional (optional(U&&), operator=(U&&))
(value_or): Likewise.
* testsuite/20_util/expected/lwg3886.cc: New test.
* testsuite/20_util/optional/cons/lwg3886.cc: New test.

testsuite: fix 'dg-compile' typos

'dg-compile' is not a thing, replace it with 'dg-do compile'.

PR target/68015
PR c++/83979
* c-c++-common/goacc/loop-shape.c: Fix 'dg-compile' typo.
* g++.dg/pr83979.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-7.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-8.c: Likewise.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-char.c: Likewise.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-short.c: Likewise.
* gcc.target/s390/20181024-1.c: Likewise.
* gcc.target/s390/addr-constraints-1.c: Likewise.
* gcc.target/s390/arch12/aghsghmgh-1.c: Likewise.
* gcc.target/s390/arch12/mul-1.c: Likewise.
* gcc.target/s390/arch13/bitops-1.c: Likewise.
* gcc.target/s390/arch13/bitops-2.c: Likewise.
* gcc.target/s390/arch13/fp-signedint-convert-1.c: Likewise.
* gcc.target/s390/arch13/fp-unsignedint-convert-1.c: Likewise.
* gcc.target/s390/arch13/popcount-1.c: Likewise.
* gcc.target/s390/pr68015.c: Likewise.
* gcc.target/s390/vector/fp-signedint-convert-1.c: Likewise.
* gcc.target/s390/vector/fp-unsignedint-convert-1.c: Likewise.
* gcc.target/s390/vector/reverse-elements-1.c: Likewise.
* gcc.target/s390/vector/reverse-elements-2.c: Likewise.
* gcc.target/s390/vector/reverse-elements-3.c: Likewise.
* gcc.target/s390/vector/reverse-elements-4.c: Likewise.
* gcc.target/s390/vector/reverse-elements-5.c: Likewise.
* gcc.target/s390/vector/reverse-elements-6.c: Likewise.
* gcc.target/s390/vector/reverse-elements-7.c: Likewise.
* gnat.dg/alignment15.adb: Likewise.
* gnat.dg/debug4.adb: Likewise.
* gnat.dg/inline21.adb: Likewise.
* gnat.dg/inline22.adb: Likewise.
* gnat.dg/opt37.adb: Likewise.
* gnat.dg/warn13.adb: Likewise.

libstdc++: Fix name of source file in comment

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc: Fix file name in comment.

i386/testsuite: Add testcase for fixed PR [PR51492]

PR target/51492

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr51492.c: New test.

RISC-V: Add configure check for B extension support

Binutils 2.42 and earlier don't recognize the b extension in the march
strings even though they support zba_zbb_zbs. Add a configure check to
ignore the b in the march string if found.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::to_string):
Skip b in march string
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Add B assembler check

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

testsuite: fix whitespace in dg-require-effective-target directives

PR middle-end/54400
PR target/98161
* gcc.dg/vect/bb-slp-layout-18.c: Fix whitespace in dg directive.
* gcc.dg/vect/bb-slp-pr54400.c: Likewise.
* gcc.target/i386/pr98161.c: Likewise.

gimple ssa: Teach switch conversion to optimize powers of 2 switches

Sometimes a switch has case numbers that are powers of 2.  Switch
conversion usually isn't able to optimize these switches.  This patch
adds "exponential index transformation" to switch conversion.  After
switch conversion applies this transformation on the switch the index
variable of the switch becomes the exponent instead of the whole value.
For example:

switch (i)
  {
    case (1 << 0): return 0;
    case (1 << 1): return 1;
    case (1 << 2): return 2;
    ...
    case (1 << 30): return 30;
    default: return 31;
  }

gets transformed roughly into

switch (log2(i))
  {
    case 0: return 0;
    case 1: return 1;
    case 2: return 2;
    ...
    case 30: return 30;
    default: return 31;
  }

This enables switch conversion to further optimize the switch.

This patch only enables this transformation if there are optabs for FFS
so that the base 2 logarithm can be computed efficiently at runtime.
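
The equivalence between the two switches above can be sketched in plain C++. This is only an illustration with four cases: `by_value`, `by_exponent`, `is_pow2` and `exact_log2` are hypothetical names, and the portable loop in `exact_log2` stands in for the FFS-based code the pass actually requires.

```cpp
#include <cassert>

// Original form: every case label is a power of 2.
inline int by_value(unsigned i)
{
  switch (i)
    {
    case 1u << 0: return 0;
    case 1u << 1: return 1;
    case 1u << 2: return 2;
    case 1u << 3: return 3;
    default: return -1;
    }
}

// True if exactly one bit of I is set.
inline bool is_pow2(unsigned i)
{ return i != 0 && (i & (i - 1)) == 0; }

// Exact base 2 logarithm of a power of 2 (portable stand-in for FFS).
inline unsigned exact_log2(unsigned i)
{
  unsigned e = 0;
  while ((i >>= 1) != 0)
    ++e;
  return e;
}

// Transformed form: the switch index is the exponent, so the case labels
// become the dense range 0..3.
inline int by_exponent(unsigned i)
{
  if (!is_pow2(i))
    return -1;          // values that are not powers of 2 take the default
  switch (exact_log2(i))
    {
    case 0: return 0;
    case 1: return 1;
    case 2: return 2;
    case 3: return 3;
    default: return -1;
    }
}
```

The dense case range is what lets the rest of switch conversion treat the second form like any other small, contiguous switch.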

gcc/ChangeLog:

* tree-switch-conversion.cc (can_log2): New static function to
check if gen_log2 can be used on current target.
(gen_log2): New static function to generate efficient GIMPLE
code for taking an exact base 2 log.
(gen_pow2p): New static function to generate efficient GIMPLE
code for checking if a value is a power of 2.
(switch_conversion::switch_conversion): Track if the
transformation happened.
(switch_conversion::is_exp_index_transform_viable): New function
to decide whether the transformation should be applied.
(switch_conversion::exp_index_transform): New function to
execute the transformation.
(switch_conversion::gen_inbound_check): Don't remove the default
BB if the transformation happened.
(switch_conversion::expand): Execute the transform if it is
viable.  Skip the "sufficiently small case range" test if the
transformation is going to be executed.
* tree-switch-conversion.h: Add is_exp_index_transform_viable
and exp_index_transform.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/switch-3.c: Disable switch conversion.
* gcc.target/i386/switch-exp-transform-1.c: New test.
* gcc.target/i386/switch-exp-transform-2.c: New test.
* gcc.target/i386/switch-exp-transform-3.c: New test.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

libbacktrace: fix syntax of Windows registration functions

Adjust the syntax to keep MSVC happy.

Fixes https://github.com/ianlancetaylor/libbacktrace/issues/131

* pecoff.c (LDR_DLL_NOTIFICATION): Put function modifier
inside parentheses.
(LDR_REGISTER_FUNCTION): Likewise.

testsuite: fix whitespace in dg-do assemble directive

* gcc.target/aarch64/simd/vmmla.c: Fix whitespace in dg directive.

testsuite: fix whitespace in dg-do preprocess directive

PR preprocessor/90581
* c-c++-common/cpp/fmax-include-depth.c: Fix whitespace in dg directive.

testsuite: fix whitespace in dg-do compile directives

Nothing seems to change here in reality at least on x86_64-pc-linux-gnu,
but important to fix nonetheless in case people copy it.

PR rtl-optimization/48633
PR tree-optimization/83072
PR tree-optimization/83073
PR tree-optimization/96542
PR tree-optimization/96707
PR tree-optimization/97567
PR target/69225
PR target/89929
PR target/96562
* g++.dg/pr48633.C: Fix whitespace in dg directive.
* g++.dg/pr96707.C: Likewise.
* g++.target/i386/mv28.C: Likewise.
* gcc.dg/Warray-bounds-flex-arrays-1.c: Likewise.
* gcc.dg/pr83072-2.c: Likewise.
* gcc.dg/pr83073.c: Likewise.
* gcc.dg/pr96542.c: Likewise.
* gcc.dg/pr97567-2.c: Likewise.
* gcc.target/i386/avx512fp16-11a.c: Likewise.
* gcc.target/i386/avx512fp16-13.c: Likewise.
* gcc.target/i386/avx512fp16-14.c: Likewise.
* gcc.target/i386/avx512fp16-conjugation-1.c: Likewise.
* gcc.target/i386/avx512fp16-neg-1a.c: Likewise.
* gcc.target/i386/avx512fp16-set1-pch-1a.c: Likewise.
* gcc.target/i386/avx512fp16vl-conjugation-1.c: Likewise.
* gcc.target/i386/avx512fp16vl-neg-1a.c: Likewise.
* gcc.target/i386/avx512fp16vl-set1-pch-1a.c: Likewise.
* gcc.target/i386/avx512vlfp16-11a.c: Likewise.
* gcc.target/i386/pr69225-1.c: Likewise.
* gcc.target/i386/pr69225-2.c: Likewise.
* gcc.target/i386/pr69225-3.c: Likewise.
* gcc.target/i386/pr69225-4.c: Likewise.
* gcc.target/i386/pr69225-5.c: Likewise.
* gcc.target/i386/pr69225-6.c: Likewise.
* gcc.target/i386/pr69225-7.c: Likewise.
* gcc.target/i386/pr96562-1.c: Likewise.
* gcc.target/riscv/rv32e_stack.c: Likewise.
* gfortran.dg/c-interop/removed-restrictions-3.f90: Likewise.
* gnat.dg/renaming1.adb: Likewise.

RISC-V: Add basic support for the Zacas extension

This patch adds support for amocas.{b|h|w|d}. Support for amocas.q
(64/128 bit cas for rv32/64) will be added in a future patch.

Extension: https://github.com/riscv/riscv-zacas
Ratification: https://jira.riscv.org/browse/RVS-680

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zacas extension.
* config/riscv/arch-canonicalize: Make zacas imply zaamo.
* config/riscv/riscv.opt: Add zacas.
* config/riscv/sync.md (zacas_atomic_cas_value<mode>): New pattern.
(atomic_compare_and_swap<mode>): Use new pattern for compare-and-swap ops.
(zalrsc_atomic_cas_value_strong<mode>): Rename atomic_cas_value_strong.
* doc/sourcebuild.texi: Add Zacas documentation.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add zacas testsuite infra support.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c:
Remove zacas to continue to test the lr/sc pairs.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zabha-zacas-preferred-over-zalrsc.c: New test.
* gcc.target/riscv/amo/zacas-char-requires-zabha.c: New test.
* gcc.target/riscv/amo/zacas-char-requires-zacas.c: New test.
* gcc.target/riscv/amo/zacas-preferred-over-zalrsc.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping-no-fence.c:
New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping.cc: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping-no-fence.c:
New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping.cc: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short.c: New test.

Co-authored-by: Patrick O'Neill <patrick@rivosinc.com>
Tested-by: Andrea Parri <andrea@rivosinc.com>
Signed-Off-By: Gianluca Guida <gianluca@rivosinc.com>

RISC-V: Remove configure check for zabha

This patch removes the zabha configure check since it's not a breaking change
and updates the existing zaamo/zalrsc comment.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::to_string): Remove zabha configure check
handling and clarify zaamo/zalrsc comment.
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Remove zabha configure check.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

libstdc++: Fix overwriting files with fs::copy_file on Windows

There are no inode numbers on Windows filesystems, so stat_type::st_ino
is always zero and the check for equivalent files in do_copy_file was
incorrectly identifying distinct files as equivalent. This caused
copy_file to incorrectly report errors when trying to overwrite existing
files.

The fs::equivalent function already does the right thing on Windows, so
factor that logic out into a new function that can be reused by
fs::copy_file.
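
A minimal sketch of the user-visible scenario, using only standard `std::filesystem` calls. The helper names and the temporary file names are assumptions made for illustration. It overwrites one existing file with another, distinct file, which is the case that was failing on Windows.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Replace the contents of P with TEXT.
inline void write_text(const fs::path& p, const std::string& text)
{
  std::ofstream out(p, std::ios::trunc);
  out << text;
}

// Read the first line of P.
inline std::string read_text(const fs::path& p)
{
  std::ifstream in(p);
  std::string s;
  std::getline(in, s);
  return s;
}

// Copy SRC over an existing DST.  Returns true if the copy happened.
inline bool overwrite_copy(const fs::path& src, const fs::path& dst)
{
  std::error_code ec;
  if (fs::equivalent(src, dst, ec))   // distinct files must not match here
    return false;
  return fs::copy_file(src, dst, fs::copy_options::overwrite_existing, ec)
	 && !ec;
}

// Creates two temporary files (hypothetical names) and overwrites one.
inline bool copy_file_sketch_holds()
{
  const fs::path dir = fs::temp_directory_path();
  const fs::path src = dir / "gr_copy_file_src.txt";
  const fs::path dst = dir / "gr_copy_file_dst.txt";
  write_text(src, "new");
  write_text(dst, "old");
  const bool ok = overwrite_copy(src, dst) && read_text(dst) == "new";
  fs::remove(src);
  fs::remove(dst);
  return ok;
}
```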

The tests for fs::copy_file were quite inadequate, so this also adds
checks for that function's error conditions.

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (auto_win_file_handle): Change constructor
parameter from const path& to const wchar_t*.
(fs::equiv_files): New function.
(fs::equivalent): Use equiv_files.
* src/filesystem/ops-common.h (fs::equiv_files): Declare.
(do_copy_file): Use equiv_files.
* src/filesystem/ops.cc (fs::equiv_files): Define.
(fs::copy, fs::equivalent): Use equiv_files.
* testsuite/27_io/filesystem/operations/copy.cc: Test
overwriting directory contents recursively.
* testsuite/27_io/filesystem/operations/copy_file.cc: Test
overwriting existing files.

libstdc++: Fix fs::hard_link_count behaviour on MinGW [PR113663]

std::filesystem::hard_link_count() always returns 1 on
mingw-w64ucrt-11.0.1-r3 on Windows 10 19045.

hard_link_count() queries _wstat64() on MinGW-w64.
The MSFT documentation claims _wstat64() will always return 1 on *non*-NTFS volumes:
https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2013/14h5k7ff(v=vs.120)

My tests suggest that is not always true -
hard_link_count()/_wstat64() still returns 1 on NTFS.
GetFileInformationByHandle does return the correct result of 2.
Please see the PR for a minimal repro.

This patch changes the Windows implementation to always call
GetFileInformationByHandle.
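
The expected behaviour can be checked with a short, portable sketch (this is not the minimal repro from the PR). The function name and file names are hypothetical, and the sketch assumes the temporary directory lives on a filesystem that supports hard links. It reports 0 when the link cannot be created.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Create a file plus one extra hard link to it and return the link count
// reported for the file.  Returns 0 if any step fails.
inline std::uintmax_t link_count_after_one_link()
{
  std::error_code ec;
  const fs::path dir = fs::temp_directory_path();
  const fs::path file = dir / "gr_hard_link_file.txt";
  const fs::path link = dir / "gr_hard_link_link.txt";
  fs::remove(file, ec);               // start from a clean state
  fs::remove(link, ec);
  {
    std::ofstream out(file);
    out << "x";
  }
  fs::create_hard_link(file, link, ec);
  std::uintmax_t n = 0;
  if (!ec)
    {
      n = fs::hard_link_count(file, ec);
      if (ec)
	n = 0;
    }
  fs::remove(link, ec);               // clean up; errors are ignored here
  fs::remove(file, ec);
  return n;
}
```

On a filesystem with hard-link support the function should return 2, the value the commit message says GetFileInformationByHandle reports.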

PR libstdc++/113663

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (fs::equivalent): Moved helper class
auto_handle to anonymous namespace as auto_win_file_handle.
(fs::hard_link_count): Changed Windows implementation to use
information provided by GetFileInformationByHandle which is more
reliable.
* testsuite/27_io/filesystem/operations/hard_link_count.cc: New
test.

Signed-off-by: "Lennox" Shou Hao Ho <lennoxhoe@gmail.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

c++: diagnose usage of co_await and co_yield in default args [PR115906]

This is a partial fix for PR115906. Per [expr.await] 2s3, "An
await-expression shall not appear in a default argument
([dcl.fct.default])". This patch introduces the diagnostic in that
case, and in the case of a co_yield (as co_yield is defined in terms of
co_await, so prerequisites of co_await hold).

PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration

gcc/cp/ChangeLog:

PR c++/115906
* parser.cc (cp_parser_unary_expression): Reject await
expressions if use of local variables is currently forbidden.
(cp_parser_yield_expression): Reject yield expressions if use of
local variables is currently forbidden.

gcc/testsuite/ChangeLog:

PR c++/115906
* g++.dg/coroutines/pr115906-yield.C: New test.
* g++.dg/coroutines/pr115906.C: New test.
* g++.dg/coroutines/co-await-syntax-02-outside-fn.C: Don't rely
on default arguments.
* g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: Ditto.

c++: fix ICE on FUNCTION_DECLs inside coroutines [PR115906]

When register_local_var_uses iterates a BIND_EXPR's BIND_EXPR_VARS, it
fails to account for the fact that FUNCTION_DECLs might be present, and
later passes it to DECL_HAS_VALUE_EXPR_P.  This leads to a tree check
failure in DECL_HAS_VALUE_EXPR_P:

  tree check: expected var_decl or parm_decl or result_decl, have
  function_decl in register_local_var_uses

We only care about PARM_DECL and VAR_DECL, so select only those.

PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration

gcc/cp/ChangeLog:

PR c++/115906
* coroutines.cc (register_local_var_uses): Only process
PARM_DECL and VAR_DECLs.

gcc/testsuite/ChangeLog:

PR c++/115906
* g++.dg/coroutines/coro-function-decl.C: New test.

SVE intrinsics: Add strength reduction for division by constant.

This patch folds SVE division where all divisor elements are the same
power of 2 to svasrd (signed) or svlsr (unsigned).
Tests were added to check
1) whether the transform is applied (existing test harness was amended), and
2) correctness using runtime tests for all input types of svdiv; for signed
and unsigned integers, several corner cases were covered.
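
A scalar sketch of the arithmetic behind this fold. It models one element only and is not the SVE intrinsic code: `div_pow2_signed` and `div_pow2_unsigned` are hypothetical names, and the signed version assumes that `>>` on a negative value is an arithmetic shift (true for GCC, guaranteed since C++20).

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^shift, rounding toward zero like C's '/'.
// SHIFT must be in the range 0..30.
inline std::int32_t div_pow2_signed(std::int32_t x, unsigned shift)
{
  // A plain arithmetic shift rounds toward minus infinity, so negative
  // inputs get a bias of 2^shift - 1 first.
  const std::int32_t bias = x < 0 ? (std::int32_t{1} << shift) - 1 : 0;
  return (x + bias) >> shift;
}

// Unsigned division by 2^shift is a plain logical shift right.
inline std::uint32_t div_pow2_unsigned(std::uint32_t x, unsigned shift)
{ return x >> shift; }
```

The bias step is what distinguishes a divide-style shift from a plain arithmetic shift: -7 / 2 is -3, while -7 >> 1 is -4.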

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/

* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
Implement strength reduction.

gcc/testsuite/

* gcc.target/aarch64/sve/div_const_run.c: New test.
* gcc.target/aarch64/sve/acle/asm/div_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.

c++: make source_location follow DECL_RAMP_FN

This fixes the value of current_function in compiler generated coroutine
code.

PR c++/110855 - std::source_location doesn't work with C++20 coroutine

gcc/cp/ChangeLog:

PR c++/110855
* cp-gimplify.cc (fold_builtin_source_location): Use the name of
the DECL_RAMP_FN of the current function if present.

gcc/testsuite/ChangeLog:

PR c++/110855
* g++.dg/coroutines/pr110855.C: New test.

testsuite: fix dg-do run whitespace

This caused the tests to not be run. I may do further passes for non-run
next.

Tested on x86_64-pc-linux-gnu and checked test logs before/after.

PR c/53548
PR target/101529
PR tree-optimization/102359
* c-c++-common/fam-in-union-alone-in-struct-1.c: Fix whitespace in dg directive.
* c-c++-common/fam-in-union-alone-in-struct-2.c: Likewise.
* c-c++-common/torture/builtin-shufflevector-2.c: Likewise.
* g++.dg/pr102359_2.C: Likewise.
* g++.target/i386/mvc1.C: Likewise.

Fix warnings for tree formats in gfc_error

This enables proper warnings for formats like %qD.

gcc/c-family/ChangeLog:

* c-format.cc (gcc_gfc_char_table): Add formats for tree objects.

gfortran.dg/compiler-directive_2.f: Update dg-error

This is a fallout of commit r15-2378-g29b1587e7d3466
OpenMP/Fortran: Fix handling of 'declare target' with 'link' clause [PR115559]
where the '!GCC$' attributes were added in reverse order.
Result: The error diagnostic for the stdcall/fastcall was reversed.
Solution: Swap the order in dg-error.

gcc/testsuite/ChangeLog:

* gfortran.dg/compiler-directive_2.f: Update dg-error.

AVR: Propose to use attribute signal(n) via AVR-LibC's ISR_N.

gcc/
* doc/extend.texi (AVR Function Attributes): Propose to use
attribute signal(n) via AVR-LibC's ISR_N from avr/interrupt.h

RISC-V: Take Xmode instead of Pmode for ussub expanding

Pmode is designed for pointers, so use Xmode instead when expanding
ussub.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_ussub): Promote to Xmode
instead of Pmode.

Signed-off-by: Pan Li <pan2.li@intel.com>

xtensa: Add missing speed cost for TYPE_FARITH in TARGET_INSN_COST

According to the implemented pipeline model, this cost can be assumed to be
1 clock cycle.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_insn_cost):
Add a case statement for TYPE_FARITH.

xtensa: Fix suboptimal loading of pooled constant value into hardware single-precision FP register

We would like to implement the following to store a single-precision FP
constant in a hardware FP register:

- Load the bit-exact integer image of the pooled single-precision FP
  constant into an address (integer) register
- Then, assign from that address register to a hardware single-precision
  FP register

.literal_position
.literal .LC1, 0x3f800000
...
l32r a9, .LC1
wfr f0, a9

However, it was emitted as follows:

- Load the address of the FP constant entry in litpool into an address
  register
- Then, dereference the address via that address register into a hardware
  single-precision FP register

.literal_position
.literal .LC1, 0x3f800000
.literal .LC2, .LC1
...
l32r a9, .LC2
lsi f0, a9, 0

It is obviously inefficient to read the pool twice.
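
For reference, the "bit-exact integer image" used above can be reproduced on the host with a small sketch. `float_to_bits` and `bits_to_float` are hypothetical helper names, and the sketch assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
	      "sketch assumes a 32-bit float");

// Integer image of F: the same 32 bits, reinterpreted (what L32R loads).
inline std::uint32_t float_to_bits(float f)
{
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// The reverse direction: move the bits into an FP value unchanged
// (the role WFR plays in the listing).
inline float bits_to_float(std::uint32_t u)
{
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}
```

Under that assumption the literal 0x3f800000 in the listings is the image of 1.0f.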

gcc/ChangeLog:

* config/xtensa/xtensa.md (movsf_internal):
Reorder alternative that corresponds to L32R machine instruction,
and prefix alternatives that correspond to LSI/SSI instructions
with the constraint character '^' so that they are disparaged by
reload/LRA.

xtensa: Fix the regression introduced by r15-959-gbe9b3f4375e7

Specifying in the RTX generation pass that sibcalls require register A0
is not wrong, but it is not optimal either, because it misleads DFA into
thinking A0 is being used in the function body.
It would be better to specify this in pro_and_epilogue, as is done for the
'return' insn, to keep subsequent passes from incorrectly removing the load
that restores A0.  Since it is not possible to modify each sibcall
there, as a workaround the sibcall is prefaced with a 'use' as before.

This patch effectively reverts commit r15-959-gbe9b3f4375e7

gcc/ChangeLog:

* config/xtensa/xtensa-protos.h (xtensa_expand_call):
Remove the third argument.
* config/xtensa/xtensa.cc (xtensa_expand_call):
Remove the third argument and the code that uses it.
* config/xtensa/xtensa.md (call, call_value, sibcall, sibcall_value):
Remove each Boolean constant specified in the third argument of
xtensa_expand_call.
(sibcall_epilogue): Add emitting '(use A0_REG)' after calling
xtensa_expand_epilogue.

Refine constraint "Bk" to define_special_memory_constraint.

For the pattern below, RA may still allocate r162 as a v/k register and
then try to reload the address with
leaq __libc_tsd_CTYPE_B@gottpoff(%rip), %rsi, which results in a linker error.

(set (reg:DI 162)
     (mem/u/c:DI
       (const:DI (unspec:DI
[(symbol_ref:DI ("a") [flags 0x60]  <var_decl 0x7f621f6e1c60 a>)]
UNSPEC_GOTNTPOFF))

Quote from H.J. on why the linker issues an error:
>What do these do:
>
>        leaq    __libc_tsd_CTYPE_B@gottpoff(%rip), %rax
>        vmovq   (%rax), %xmm0
>
>From x86-64 TLS psABI:
>
>The assembler generates for the x@gottpoff(%rip) expressions a
>R_X86_64_GOTTPOFF relocation for the symbol x which requests the linker to
>generate a GOT entry with a R_X86_64_TPOFF64 relocation. The offset of
>the GOT entry relative to the end of the instruction is then used in
>the instruction. The R_X86_64_TPOFF64 relocation is processed at
>program startup time by the dynamic linker by looking up the symbol x
>in the modules loaded at that point. The offset is written in the GOT
>entry and later loaded by the addq instruction.
>
>The above code sequence looks wrong to me.

gcc/ChangeLog:

PR target/116043
* config/i386/constraints.md (Bk): Refine to
define_special_memory_constraint.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr116043.c: New test.

i386: Add non-optimize prefetchi intrins

Under -O0, with the newly introduced intrinsics, the variable is
transformed into a mem instead of the original symbol_ref. The compiler
then treats the operand as invalid and turns the operation into a nop, which
is not expected. Use a macro for the non-optimized case to keep the variable
as a symbol_ref, just as the prefetch intrinsic does.

gcc/ChangeLog:

* config/i386/prfchiintrin.h
(_m_prefetchit0): Add macro for non-optimized option.
(_m_prefetchit1): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/prefetchi-1b.c: New test.

Daily bump.

xtensa: Make use of scaled [U]FLOAT/TRUNC.S instructions

The [U]FLOAT.S machine instruction in the Xtensa ISA, which converts an
integer into a single-precision value in a hardware FP register, can also
divide the result by a power of two (2^0 to 2^15).

Similarly, the [U]TRUNC.S instruction, which truncates single-precision FP
to integer, can multiply the source value by a power of two in advance.
Neither instruction currently uses this ability (both are always emitted with
the 0th power of two, i.e. a scaling factor of 1).

This patch unleashes the scaling ability of the above instructions.

     /* example */
     float test0(int a) {
       return a / 2.f;
     }
     float test1(unsigned int a) {
       return a / 32768.f;
     }
     int test2(float a) {
       return a * 2;
     }
     unsigned int test3(float a) {
       return a * 32768;
     }

     ;; before
     test0:
      movi.n a9, 0x3f
      float.s f0, a2, 0
      slli a9, a9, 24
      wfr f1, a9
      mul.s f0, f0, f1
      rfr a2, f0
      ret.n
     test1:
      movi.n a9, 7
      ufloat.s f0, a2, 0
      slli a9, a9, 27
      wfr f1, a9
      mul.s f0, f0, f1
      rfr a2, f0
      ret.n
     test2:
      wfr f1, a2
      add.s f0, f1, f1
      trunc.s a2, f0, 0
      ret.n
     test3:
      movi.n a9, 0x47
      slli a9, a9, 24
      wfr f1, a2
      wfr f2, a9
      mul.s f0, f1, f2
      utrunc.s a2, f0, 0
      ret.n

     ;; after
     test0:
      float.s f0, a2, 1
      rfr a2, f0
      ret.n
     test1:
      ufloat.s f0, a2, 15
      rfr a2, f0
      ret.n
     test2:
      wfr f0, a2
      trunc.s a2, f0, 1
      ret.n
     test3:
      wfr f0, a2
      utrunc.s a2, f0, 15
      ret.n
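
As a host-side cross-check of the listings above, the scaled forms can be modelled with `std::ldexp`. This is a sketch based only on the description in this message: the four function names are hypothetical, the second argument plays the role of the immediate operand, and overflow or out-of-range conversions are not modelled.

```cpp
#include <cassert>
#include <cmath>

// Model of "float.s fr, ar, n": convert to float, then divide by 2^n.
inline float float_scaled(int a, int n)
{ return std::ldexp(static_cast<float>(a), -n); }

// Model of "ufloat.s fr, ar, n": the unsigned variant.
inline float ufloat_scaled(unsigned a, int n)
{ return std::ldexp(static_cast<float>(a), -n); }

// Model of "trunc.s ar, fs, n": multiply by 2^n, then truncate toward zero.
inline int trunc_scaled(float a, int n)
{ return static_cast<int>(std::ldexp(a, n)); }

// Model of "utrunc.s ar, fs, n": the unsigned variant.
inline unsigned utrunc_scaled(float a, int n)
{ return static_cast<unsigned>(std::ldexp(a, n)); }
```

In this model test0 and test2 correspond to a scale of 1 (a factor of 2), and test1 and test3 to a scale of 15 (a factor of 32768), matching the immediates in the "after" listing.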

gcc/ChangeLog:

* config/xtensa/predicates.md
(fix_scaling_operand, float_scaling_operand): New predicates.
* config/xtensa/xtensa.md
(any_fix/m_fix/s_fix, any_float/m_float/s_float):
New code iterators and their attributes.
(fix<s_fix>_truncsfsi2): Change from "fix_truncsfsi2".
(*fix<s_fix>_truncsfsi2_2x, *fix<s_fix>_truncsfsi2_scaled):
New insn definitions.
(float<s_float>sisf2): Change from "floatsisf2".
(*float<s_float>sisf2_scaled): New insn definition.

xtensa: Make use of std::swap where appropriate

No functional changes.

gcc/ChangeLog:

* config/xtensa/xtensa.cc
(gen_int_relational, gen_float_relational): Replace tempvar-based
value-swapping codes with std::swap.
* config/xtensa/xtensa.md (movdi_internal, movdf_internal):
Ditto.

[target/116104] Fix test guarding UINTVAL to extract shift count

Minor oversight in the ext-dce bits.  If the shift count is a constant vector,
then we shouldn't be extracting values with [U]INTVAL.  We guarded that test
with CONSTANT_P, when it should have been CONST_INT_P.

Shows up on gcn, but I wouldn't be terribly surprised if it could be triggered
elsewhere.

Verified the testcase compiles on gcn.  Haven't done a libgcc build for gcn
though.  Also verified x86 bootstraps and regression tests cleanly.

Pushing to the trunk.

PR target/116104
gcc/
* ext-dce.cc (carry_backpropagate): Fix test guarding UINTVAL
extraction of shift count.

Polish libstdc++ 'dg-final' action 'file-io-diff'

Follow-up to recent commit 515da03a838db05443ebcc4c543a405bed764188
"libstdc++: Add file-io-diff to replace @diff@ markup in I/O tests".

Currently, if a 'dg-final' action 'file-io-diff' passes, we print nothing
(should: 'PASS: [...]'), but if it fails, we just print: 'FAIL: files differ',
for example ('*.log' file):

    [...]
    FAIL: 27_io/basic_ostream/inserters_other/wchar_t/2.cc  -std=gnu++17 (test for excess errors)
    [...]
    UNRESOLVED: 27_io/basic_ostream/inserters_other/wchar_t/2.cc  -std=gnu++17 compilation failed to produce executable
    diff: wostream_inserter_other_in.txt: No such file or directory
    diff: wostream_inserter_other_out.txt: No such file or directory
    FAIL: files differ
    diff: wostream_inserter_other_in.txt: No such file or directory
    diff: wostream_inserter_other_out.txt: No such file or directory

When later the '*.sum' files get sorted, these 'FAIL: files differ' instances
aren't grouped anymore with the other test cases' results, but they appear en
bloc, lexically sorted between ('e[...]' and 's[...]'), for example:

    [...]
    PASS: ext/vstring/types/23767.cc  -std=gnu++17 (test for excess errors)
    FAIL: files differ
    FAIL: files differ
    FAIL: files differ
    PASS: special_functions/01_assoc_laguerre/check_nan.cc  -std=gnu++17 (test for excess errors)
    [...]

Also, we shouldn't emit the actual 'diff' into the '*.sum' file, but just into
the '*.log' file, and there's no need for 'spawn'/'expect', as we're not
matching any specific messages.

libstdc++-v3/
* testsuite/lib/libstdc++.exp (file-io-diff): Polish.

testsuite: fix PR111613 test

PR ipa/111613
* gcc.c-torture/pr111613.c: Rename to..
* gcc.c-torture/execute/pr111613.c: ...this.

c++: generic lambda in default template argument [PR88313]

Here we're rejecting the generic lambda inside the default template
argument ultimately because auto_is_implicit_function_template_parm_p
doesn't get set during parsing of the lambda's parameter list, due
to the !processing_template_parmlist restriction.  But when parsing a
lambda parameter list we should always set that flag regardless of where
the lambda appears.  This patch makes sure of this via a local lambda_p
flag.

PR c++/88313

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_declarator_opt): Pass
lambda_p=true to cp_parser_parameter_declaration_clause.
(cp_parser_direct_declarator): Pass lambda_p=false to
cp_parser_parameter_declaration_clause.
(cp_parser_parameter_declaration_clause): Add bool lambda_p
parameter.  Consider lambda_p instead of current_class_type
when setting parser->auto_is_implicit_function_template_parm_p.
Don't consider processing_template_parmlist.
(cp_parser_requirement_parameter_list): Pass lambda_p=false
to cp_parser_parameter_declaration_clause.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ6.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

doc: Improve punctuation and grammar in -fdiagnostics-format docs

The hyphen can be misunderstood to mean "emitted to -" i.e. stdout.
Refer to both forms by name, rather than using "the former" for one and
referring to the other by name.

gcc/ChangeLog:

* doc/invoke.texi (Diagnostic Message Formatting Options):
Replace hyphen with a new sentence. Replace "the former" with
the actual value.

gcc: xtensa: disable late-combine by default

gcc/
* config/xtensa/xtensa.cc (xtensa_option_override_after_change):
New function.
(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Define as
xtensa_option_override_after_change.
(xtensa_option_override): Call
xtensa_option_override_after_change.

Revert "PR116080: Fix tail call dejagnu checks"

This reverts commit ee41cd863b7c38ee3bc415ea7154954aa6facca3.

testsuite: make PR115277 test an execute one

PR middle-end/115277
* gcc.c-torture/compile/pr115277.c: Rename to...
* gcc.c-torture/execute/pr115277.c: ...this.

AVR: avr.cc - Fix a typo in a diagnostic.

gcc/
* config/avr/avr.cc (avr_set_current_function): Fix typo in
error message.

libgomp.texi: Update 'Device Information Routines' section

Update 'OpenMP Runtime Library Routines' by adding a note that invoking
inside a target region might invoke unspecified behavior. Additionally,
update omp_{get,set}_default_device for omp_{initial,invalid}_device
named constants.

libgomp/ChangeLog:

* libgomp.texi (OpenMP Runtime Library Routines): Add missing
title to some commented still undocumented items.
(Device Information Routines): Update.

rs6000, add comment to VEC_IC definition

This patch adds a comment to the VEC_IC definition to clarify
the V1TI "TARGET_POWER10" mode that was added.

gcc/ChangeLog:
* config/rs6000/vector.md: Add comment for the VEC_IC
define_mode_iterator.

Widening-Mul: Try .SAT_SUB for PLUS_EXPR when one op is IMM

After adding the matching for .SAT_SUB when one op is IMM, there
will be a new root PLUS_EXPR for the .SAT_SUB pattern.  For example,

Form 3:
  #define DEF_SAT_U_SUB_IMM_FMT_3(T, IMM) \
  T __attribute__((noinline))             \
  sat_u_sub_imm##IMM##_##T##_fmt_3 (T x)  \
  {                                       \
    return x >= IMM ? x - IMM : 0;        \
  }

DEF_SAT_U_SUB_IMM_FMT_3(uint64_t, 11)

The gimple before widening-mul then looks as below.  Thus, try
.SAT_SUB for the PLUS_EXPR.

  __attribute__((noinline))
  uint64_t sat_u_sub_imm11_uint64_t_fmt_3 (uint64_t x)
  {
    long unsigned int _1;
    uint64_t _3;

    <bb 2> [local count: 1073741824]:
    _1 = MAX_EXPR <x_2(D), 11>;
    _3 = _1 + 18446744073709551605;
    return _3;

  }
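
The relation between the source form and the gimple can be checked with a small sketch. `kImm`, `sat_sub_branch` and `sat_sub_max_plus` are hypothetical names. The constant 18446744073709551605 is 2^64 - 11, i.e. -11 modulo 2^64, so adding it is an unsigned subtraction of 11.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kImm = 11;   // IMM from the example

// Form 3 as written in the source.
inline std::uint64_t sat_sub_branch(std::uint64_t x)
{ return x >= kImm ? x - kImm : 0; }

// Shape of the gimple: MAX_EXPR followed by a PLUS_EXPR whose constant
// operand is the wrapped-around negation of IMM.
inline std::uint64_t sat_sub_max_plus(std::uint64_t x)
{
  const std::uint64_t t = std::max(x, kImm);
  return t + 18446744073709551605ull;   // same as t - 11 modulo 2^64
}
```

Because `t` is never below 11, the wrapped addition never underflows the saturated result: inputs below 11 give 0, all others give x - 11.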

The following test suites passed for this patch.
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

gcc/ChangeLog:

* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Try .SAT_SUB for PLUS_EXPR case.

Signed-off-by: Pan Li <pan2.li@intel.com>

OpenMP/Fortran: Fix handling of 'declare target' with 'link' clause [PR115559]

Contrary to a normal 'declare target', the 'declare target link' attribute
also needs to set node->offloadable and push the offload_vars in the front end.

Linked variables require that the data is mapped. For module variables, this
can happen anywhere. For variables in an external subprogram or the main
program, this can only happen either in that program itself or in an
internal subprogram. Whether a variable is just normally mapped or linked then
becomes relevant if a device routine exists that can access that variable,
i.e. an internal procedure then has to be marked as declare target.

PR fortran/115559

gcc/fortran/ChangeLog:

* trans-common.cc (build_common_decl): Add 'omp declare target' and
'omp declare target link' variables to offload_vars.
* trans-decl.cc (add_attributes_to_decl): Likewise; update args and
call decl_attributes.
(get_proc_pointer_decl, gfc_get_extern_function_decl,
build_function_decl): Update calls.
(gfc_get_symbol_decl): Likewise; move after 'DECL_STATIC (t)=1'
to avoid errors with symtab_node::get_create.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/declare-target-link.f90: New test.

libgomp: Fix declare target link with offset array-section mapping [PR116107]

Assume that 'int var[100]' is 'omp declare target link(var)'. When now
mapping an array section with offset such as 'map(to:var[20:10])',
the device-side link pointer has to store &<device-storage-data>[0] minus
the offset such that var[20] will access <device-storage-data>[0]. But
the offset calculation was missed, so the device-side 'var' pointed
to the first element of the mapped data, and var[20] pointed beyond it at
some invalid memory.
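
The pointer adjustment can be illustrated with a host-only sketch. It is not the libgomp code: device_mem, DEVICE_SLOT and map_section are hypothetical stand-ins, and plain host memory plays the role of device storage.

```cpp
#include <cstddef>

// Section from the commit message: map(to:var[20:10]).
const std::size_t SECTION_START = 20;
const std::size_t SECTION_LEN = 10;

// Hypothetical stand-in for device memory. The mapped data is placed at
// DEVICE_SLOT so that the biased pointer below stays inside this array.
const std::size_t DEVICE_SLOT = 64;
int device_mem[128];

// Copies the section and returns the value the device-side link pointer
// for 'var' should hold: &<device-storage-data>[0] minus the offset.
int* map_section(const int* host_var)
{
    int* dev_data = device_mem + DEVICE_SLOT;
    for (std::size_t i = 0; i < SECTION_LEN; ++i)
        dev_data[i] = host_var[SECTION_START + i];
    return dev_data - SECTION_START;
}
```

With the returned pointer, var[20] reads element 0 of the mapped data. Returning dev_data unchanged would make var[20] read past the ten mapped elements.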

PR middle-end/116107

libgomp/ChangeLog:

* target.c (gomp_map_vars_internal): Honor array mapping offsets
with declare-target 'link' variables.
* testsuite/libgomp.c-c++-common/target-link-2.c: New test.

Fix ICE with -fdump-tree-modref

gcc/ChangeLog:

PR ipa/116055
* ipa-modref.cc (analyze_function): Do not ICE when flags regress.

testsuite: Fix up consteval-prop21.C for 32-bit targets [PR115986]

The test fails on 32-bit targets (which don't support __int128 type).
Using unsigned long long instead still ICEs before the fix and passes
after it on those targets.
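
One way to express such a fallback is sketched below. This is not the actual testcase: wide_t and to_wide are hypothetical names, and the real test may select the type differently. __SIZEOF_INT128__ is the macro GCC predefines when __int128 is available.

```cpp
// Pick the widest unsigned type the target offers.
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 wide_t;
#else
typedef unsigned long long wide_t;   // targets without __int128
#endif

// Hypothetical helper; it only shows that code using wide_t builds either way.
constexpr wide_t to_wide(unsigned long long v) { return v; }
```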

2024-07-29 Jakub Jelinek <jakub@redhat.com>

PR c++/115986
* g++.dg/cpp2a/consteval-prop21.C (operator "" _c): Use
unsigned long long rather than __uint128_t for return type if int128
is unsupported.

vect: Fix single_imm_use in tree_vect_patterns

Since a pattern statement coexists with normal statements without being
linked into the function body, we should not invoke utility procedures that
depend on the def/use graph on a pattern statement, such as counting uses of a
pseudo value defined by a pattern statement. This patch fixes a bug of
this type in vect pattern formation.

2024-06-14 Feng Xue <fxue@os.amperecomputing.com>

gcc/
* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Only call
single_imm_use if statement is not generated from pattern recognition.

i386: Fix AVX512 intrin macro typo

There are several typos in the AVX512 intrinsic macro definitions. Correct
them to fix errors when compiling with -O0.

gcc/ChangeLog:

* config/i386/avx512dqintrin.h
(_mm_mask_fpclass_ss_mask): Correct operand order.
(_mm_mask_fpclass_sd_mask): Ditto.
(_mm256_maskz_reduce_round_ss): Use __builtin_ia32_reducess_mask_round
instead of __builtin_ia32_reducesd_mask_round.
(_mm_reduce_round_sd): Use -1 as mask since it is non-mask.
(_mm_reduce_round_ss): Ditto.
* config/i386/avx512vlbwintrin.h
(_mm256_mask_alignr_epi8): Correct operand usage.
(_mm_mask_alignr_epi8): Ditto.
* config/i386/avx512vlintrin.h (_mm_mask_alignr_epi64): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512bw-vpalignr-1b.c: New test.
* gcc.target/i386/avx512dq-vfpclasssd-1b.c: Ditto.
* gcc.target/i386/avx512dq-vfpclassss-1b.c: Ditto.
* gcc.target/i386/avx512dq-vreducesd-1b.c: Ditto.
* gcc.target/i386/avx512dq-vreducess-1b.c: Ditto.
* gcc.target/i386/avx512vl-valignq-1b.c: Ditto.

Daily bump.

testsuite: fix dg-add-options vs. dg-options ordering

Per gccint, dg-add-options must be placed after all dg-options directives.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cmpmem-2.c: Fix dg-add-options order.

testsuite: fix dg-do ordering wrt dg-require-*

Per gccint, dg-do must precede dg-require-effective-target or
dg-require-support. Fix a handful of deviant cases.
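
For illustration, a test header following that rule could look like the sketch below. The test body is hypothetical; int128 is one of the effective-target keywords the testsuite provides.

```cpp
// { dg-do compile }
// { dg-require-effective-target int128 }

// Hypothetical body; only the directive order above matters here.
unsigned __int128 wide_mul(unsigned long long a, unsigned long long b)
{
    return static_cast<unsigned __int128>(a) * b;
}
```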

gcc/testsuite/ChangeLog:
PR middle-end/25521
PR debug/93122
* gcc.dg/pr25521.c: Fix dg-do directive order.
* gcc.dg/vect/vect-simd-clone-19.c: Likewise.
* gcc.target/arm/stack-protector-7.c: Likewise.
* gcc.target/arm/stack-protector-8.c: Likewise.
* gcc.target/powerpc/pr93122.c: Likewise.

libstdc++-v3/ChangeLog:
PR libstdc++/110572
* testsuite/18_support/type_info/110572.cc: Fix dg-do directive order.

c++: if consteval and consteval propagation [PR115583]

During speculative constant folding of an if consteval, we take the false
branch, but the true branch is an immediate function context, so we don't
want to cp_fold_immediate it. So we could check IF_STMT_CONSTEVAL_P
here. But beyond that, we don't want to do this inside a call, only when
first parsing a function.

PR c++/115583

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_conditional_expression): Don't
cp_fold_immediate for if consteval.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/consteval-if13.C: New test.

gcc: Make exec-tool.in handle missing Binutils more gracefully

When users try to build a cross-compiler without first installing
binutils they get confusing errors like:
/tmp/gcc-obj/./gcc/as: line 114: exec: -m: invalid option

This is an incredibly common source of questions on gcc-help and IRC,
and bogus bug reports e.g. see PR 116119 for the latest example.

This change adds an explicit check for an empty $original variable and
exits with a user-friendly error.

gcc/ChangeLog:

* exec-tool.in: Exit with an error if $original is empty.

AVR target 116056 - Support attribute signal(n), interrupt(n) and noblock.

This patch adds support for arguments to the signal and interrupt
function attributes.  It allows specifying the ISR by means of the
associated IRQ number, as an extension to the current attributes that
require specifying the ISR name like "__vector_1" as the (assembly) name
for the function.  The new feature is more convenient, e.g. when the
ISR is implemented by a class method or in a namespace.  There is no
requirement that the ISR is externally visible.  The syntax is like:

__attribute__((signal(1, 2, ...), signal(3, 4, ...)))
[static] void isr_function (void)
{
    // Code
}

Moreover, this patch adds support for the "noblock" function attribute
to let an ISR start with a SEI instruction.  Attribute "signal" together
with "noblock" behaves like "interrupt" but without imposing a specific
function name or visibility like "interrupt" does.

PR target/116056
gcc/
* config/avr/avr.h (machine_function) <is_noblock>: New field.
* config/avr/avr-c.cc (avr_cpu_cpp_builtins) <__HAVE_SIGNAL_N__>: New
built-in macro.
* config/avr/avr.cc (avr_declare_function_name): New function.
(avr_attribute_table) <noblock>: New function attribute.
<signal, interrupt>: Allow any number of args.
(avr_insert_attributes): Check validity of "signal" and "interrupt"
arguments.
(avr_foreach_function_attribute, avr_interrupt_signal_function)
(avr_isr_number, avr_asm_isr_alias, avr_handle_isr_attribute)
(avr_noblock_function_p): New static functions.
(avr_interrupt_function): New from avr_interrupt_function_p.
Adjust callers.
(avr_signal_function): New from avr_signal_function_p.
Adjust callers.
(avr_set_current_function): Only diagnose non-__vector ISR names
when "signal" or "interrupt" attribute has no args. Set
cfun->machine->is_noblock.  Warn about "noblock" in non-ISR functions.
(struct avr_fun_cookie): New.
(avr_expand_prologue, avr_asm_function_end_prologue): Handle "noblock".
* config/avr/elf.h (ASM_DECLARE_FUNCTION_NAME): New define.
* config/avr/avr-protos.h (avr_declare_function_name): New proto.
* doc/extend.texi (AVR Function Attributes): Document
signal(num) and interrupt(num).
* doc/invoke.texi (AVR Built-in Macros) <__HAVE_SIGNAL_N__>: Document.
gcc/testsuite/
* gcc.target/avr/torture/signal_n-1.c: New test.
* gcc.target/avr/torture/signal_n-2.c: New test.
* gcc.target/avr/torture/signal_n-3.c: New test.
* gcc.target/avr/torture/signal_n-4.cpp: New test.

PR modula2/115823 Wrong expansion of isnormal optab

This patch corrects the function declaration of a builtin
(using the libname rather than the source name).

gcc/m2/ChangeLog:

PR modula2/115823
* gm2-gcc/m2builtins.cc (define_builtin): Build
the function decl using the libname.

gcc/testsuite/ChangeLog:

PR modula2/115823
* gm2/builtins/run/pass/testisnormal.mod: Change to an
implementation module.
* gm2/builtins/run/pass/testisnormal.def: New test.
* gm2/builtins/run/pass/testsinl.def: New test.
* gm2/builtins/run/pass/testsinl.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

testsuite: Fix unaligned accesses in ipa-sra-8.c and ipa-sra-9.c

2024-07-28 John David Anglin <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

PR testsuite/92550
* gcc.dg/ipa/ipa-sra-8.c: Change get_a argument type to SSS.
* gcc.dg/ipa/ipa-sra-9.c: Likewise.

Add config file so b4 uses inbox.sourceware.org automatically

This makes b4 use inbox.sourceware.org instead of the default host
lore.kernel.org, so that every b4 user doesn't have to configure this
themselves.

ChangeLog:

* .b4-config: New file.

Daily bump.

c++: consteval propagation and templates [PR115986]

Here the call to e() makes us decide to check d() for escalation at EOF, but
while checking it we try to fold_immediate 0_c, and get confused by the
template trees. Let's not mess with escalation for function templates.

PR c++/115986

gcc/cp/ChangeLog:

* cp-gimplify.cc (remember_escalating_expr): Skip function
templates.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/consteval-prop21.C: New test.

c++: ICE with concept, local class, and lambda [PR115561]

Here when we want to synthesize methods for foo()::B maybe_push_to_top_level
calls push_function_context, which sets cfun to a dummy value; later
finish_call_expr tries to set something in
cp_function_chain (i.e. cfun->language), which isn't set. Many places in
the compiler check cfun && cp_function_chain to avoid this problem; here we
also want to check !cp_unevaluated_operand, like set_flags_from_callee does.

PR c++/115561

gcc/cp/ChangeLog:

* semantics.cc (finish_call_expr): Check cp_unevaluated_operand.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-lambda21.C: New test.

c++: improve C++ testsuite default versions

I wanted to add more cases to the setting of std_list in g++-dg.exp, but
didn't want to do a full scan through the file for each case.  So this patch
improves that in two ways: first, by extracting all interesting lines on a
single pass; second, by generating the list more flexibly: now we test every
version mentioned explicitly in the testcase, plus a few more if fewer than
three are mentioned.

This also lowers the number of versions tested from four to three for most
testcases: the current default and the earliest and latest versions.  This
will reduce testing of C++14 and C++20 modes, and increase testing of C++26
mode.  C++ front-end developers are encouraged to set the
GXX_TESTSUITE_STDS environment variable to test more modes.

gcc/testsuite/ChangeLog:

* lib/gcc-dg.exp (get_matching_lines): New.
* lib/g++-dg.exp: Improve std_list selection.

Fold ctz(-x) and ctz(abs(x)) as ctz(x) in match.pd.

The subject line pretty much says it all; the count-trailing-zeros function
of -X and of abs(X) produces the same result as count-trailing-zeros of X.
This transformation replaces a negation which may potentially overflow
with an equivalent expression that doesn't [much like the analogous
abs(-X) simplification in match.pd].

I'd noticed this -X equivalence, which isn't mentioned in Hacker's Delight,
investigating whether ranger's non_zero_bits can help determine whether
an integer variable may be converted to a floating point type exactly
(without raising FE_INEXACT), but it turns out this observation isn't
novel, as (disappointingly) LLVM already performs this same folding.
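
The identity is easy to check on the host. The sketch below is not the match.pd pattern; the helper names are made up, and negation is done in unsigned arithmetic so that the check itself cannot overflow. __builtin_ctz is undefined for zero, so the helpers require a nonzero argument.

```cpp
#include <cassert>

// ctz (X); requires x != 0.
int ctz_plain(int x)
{
    assert(x != 0);
    return __builtin_ctz(static_cast<unsigned>(x));
}

// ctz (-X), negating in unsigned arithmetic, which wraps instead of overflowing.
int ctz_negated(int x)
{
    assert(x != 0);
    return __builtin_ctz(0u - static_cast<unsigned>(x));
}

// ctz (abs (X)), again computed without signed overflow for the most negative int.
int ctz_abs(int x)
{
    assert(x != 0);
    unsigned ux = static_cast<unsigned>(x);
    return __builtin_ctz(x < 0 ? 0u - ux : ux);
}
```

Two's complement negation leaves the lowest set bit in place, so all three helpers return the same count for any nonzero input.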

2024-07-27 Roger Sayle <roger@nextmovesoftware.com>
Andrew Pinski <quic_apinski@quicinc.com>

gcc/ChangeLog
* match.pd (ctz (-X) => ctz (X)): New simplification.
(ctz (abs (X)) => ctz (X)): Likewise.

gcc/testsuite/ChangeLog
* gcc.dg/fold-ctz-1.c: New test case.
* gcc.dg/fold-ctz-2.c: Likewise.

libstdc++: Fix -Wsign-compare warning in <charconv>

Cast ptrdiff_t to size_t to avoid a -Wsign-compare warning. We can check
in __to_chars_i that the ptrdiff_t won't be negative, so that we know
the cast is safe.
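
The pattern can be sketched as follows. This is not the libstdc++ source: fits is a hypothetical helper that only shows the order of the range check and the cast.

```cpp
#include <cstddef>

// Hypothetical helper: true if `needed` characters fit in [first, last).
bool fits(const char* first, const char* last, std::size_t needed)
{
    // Reject empty and inverted ranges up front, like a `first >= last` check.
    if (first >= last)
        return false;
    // last - first is positive here, so converting it to size_t cannot wrap
    // and the comparison no longer mixes signed and unsigned operands.
    return needed <= static_cast<std::size_t>(last - first);
}
```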

libstdc++-v3/ChangeLog:

* include/std/charconv (__to_chars_16, __to_chars_10)
(__to_chars_8, __to_chars_2, __to_chars): Cast ptrdiff_t to
size_t for comparison.
(__to_chars_i): Check for first >= last instead of first == last
for initial sanity check.

libstdc++: Add comment noting LWG 3617 support

The resolution was implemented in r14-8752-g6f75149488b74a but I didn't
add the usual _GLIBCXX_RESOLVE_LIB_DEFECTS comment.

libstdc++-v3/ChangeLog:

* include/bits/std_function.h: Add comment about LWG 3617 being
supported.

libstdc++: Remove __find_if unrolling for random access iterators

As the numbers in PR libstdc++/88545 show, the manual loop unrolling in
std::__find_if doesn't actually help these days, and it prevents the
compiler from auto-vectorizing.

Remove the dispatching on iterator_category and just use the simple loop
for all iterator categories.
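
A minimal sketch of such a simple loop is shown below. It is written for illustration and is not the libstdc++ implementation; simple_find_if is a hypothetical name.

```cpp
// One loop for every iterator category: no dispatch and no manual unrolling,
// which leaves the optimizer free to vectorize where it can.
template<typename Iterator, typename Predicate>
Iterator simple_find_if(Iterator first, Iterator last, Predicate pred)
{
    while (first != last && !pred(*first))
        ++first;
    return first;
}
```

The function returns the first position whose element satisfies pred, or last if there is none.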

libstdc++-v3/ChangeLog:

* include/bits/stl_algobase.h (__find_if): Remove overloads for
dispatching on iterator_category. Do not unroll loop manually.
* include/bits/stl_algo.h (__find_if_not): Remove
iterator_category argument from __find_if call.