git.ipfire.org Git - thirdparty/gcc.git/log

i386: Add AVX10.1 related macros

gcc/ChangeLog:

PR target/113288
* config/i386/i386-c.cc (ix86_target_macros_internal):
Add __AVX10_1__, __AVX10_1_256__ and __AVX10_1_512__.
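
A minimal sketch of how user code might test the new macros. The helper
name avx10_1_vector_bits is hypothetical, and the mapping from macro to
vector width is an assumption drawn from the macro names, not from the patch.

```cpp
#include <cassert>

// Hypothetical helper: reports which AVX10.1 macro the compiler predefined.
// Assumption: __AVX10_1_512__ signals 512-bit vector support, while
// __AVX10_1_256__ or plain __AVX10_1__ signal the 256-bit baseline.
constexpr int avx10_1_vector_bits()
{
#if defined(__AVX10_1_512__)
  return 512;
#elif defined(__AVX10_1_256__) || defined(__AVX10_1__)
  return 256;
#else
  return 0;  // AVX10.1 is not enabled for this translation unit
#endif
}

// Usage: the value is fixed at compile time by the target options in effect.
void avx10_1_example()
{
  constexpr int bits = avx10_1_vector_bits();
  assert(bits == 0 || bits == 256 || bits == 512);
}
```

On a build without any AVX10.1 target option the helper simply returns 0.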

RISC-V: Enhance a testcase

This test should pass no matter how we adjust the cost model.

Remove -fno-vect-cost-model.

Committed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/fold-min-poly.c: Remove -fno-vect-cost-model

target/112280 - properly guard permute query

The following adds guards avoiding code generation to
expand_perm_as_a_vlbr_vstbr_candidate when d.testing_p.

PR target/112280
* config/s390/s390.cc (expand_perm_as_a_vlbr_vstbr_candidate):
Do not generate code when d.testing_p.

libgcc, nios2: Fix exception handling on nios2 with -fpic

Exception handling on nios2-linux-gnu with -fpic has been broken since
revision 790854ea7670f11c14d431c102a49181d2915965, "Use _dl_find_object
in _Unwind_Find_FDE".  For whatever reason, this doesn't work on nios2.

Nios2 uses the GOT address as the base for DW_EH_PE_datarel
relocations in PIC; see my previous fix to make this work, revision
2d33dcfe9f0494c9b56a8d704c3d27c5a4329ebc, "Support for GOT-relative
DW_EH_PE_datarel encoding".  So this may be a horrible bug in the ABI
or in my interpretation of it or just glibc's implementation of
_dl_find_object for this target, but there's existing code out there
that does things this way; and realistically, nobody is going to
re-engineer this now that the vendor has EOL'ed the nios2
architecture.  So, just skip over the code trying to use
_dl_find_object on this target and fall back to the way that works.

I plan to backport this patch to the GCC 12 and GCC 13 branches as well.

libgcc/ChangeLog
* unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Do not try to use
_dl_find_object on nios2; it doesn't work.

Update documents for fcf-protection=

After r14-2692-g1c6231c05bdcca, the option is defined as EnumSet and
-fcf-protection=branch won't unset any other bits since they're in
different groups. So to override -fcf-protection, an explicit
-fcf-protection=none needs to be added, followed by
-fcf-protection=XXX.
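
The behaviour described above can be pictured with a toy model. This is
not GCC's option machinery: the enum, the function name and the exact
handling of "full" are assumptions made for illustration only.

```cpp
#include <cassert>
#include <initializer_list>

// Toy model only; names are hypothetical.
enum CfBits : unsigned
{
  CF_NONE = 0,
  CF_BRANCH = 1u << 0,
  CF_RETURN = 1u << 1,
  CF_FULL = CF_BRANCH | CF_RETURN
};

// Apply a sequence of -fcf-protection=<value> options from left to right.
// Assumption: "branch" and "return" only set their own bit, "full" sets
// both, and "none" is the only value that clears anything.
unsigned apply_cf_options(std::initializer_list<CfBits> opts)
{
  unsigned state = CF_NONE;
  for (CfBits opt : opts)
    {
      if (opt == CF_NONE)
        state = CF_NONE;
      else
        state |= opt;
    }
  return state;
}

void cf_protection_example()
{
  // A later =branch does not drop the return bit set earlier.
  assert(apply_cf_options({CF_FULL, CF_BRANCH}) == CF_FULL);
  // An explicit =none first makes the override effective.
  assert(apply_cf_options({CF_FULL, CF_NONE, CF_BRANCH}) == CF_BRANCH);
}
```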

gcc/ChangeLog:

PR target/113039
* doc/invoke.texi (fcf-protection=): Update documents.

RISC-V: Update the comments of riscv_v_ext_mode_p [NFC]

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_ext_mode_p): Update the
comments of predicate func riscv_v_ext_mode_p.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Modify ABI-name length of vfloat16m8_t

The length of vfloat16m8_t ABI-name should be 17.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.def (vfloat16m8_t):
Modify ABI-name length of vfloat16m8_t
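
As a quick check of the arithmetic, assuming the ABI name in question is
the string "__rvv_float16m8_t" (this spelling is an assumption, it is not
quoted in the commit message), its length can be counted at compile time:

```cpp
#include <cassert>
#include <cstddef>

// Assumed ABI name for vfloat16m8_t; not taken from the commit message.
constexpr char assumed_abi_name[] = "__rvv_float16m8_t";

// Length without the terminating NUL character.
constexpr std::size_t assumed_abi_name_length = sizeof(assumed_abi_name) - 1;

void abi_name_length_example()
{
  assert(assumed_abi_name_length == 17);
}
```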

LoongArch: Redundant sign extension elimination optimization 2.

Eliminate the redundant sign extension that exists after the conditional
move when the target register is SImode.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_expand_conditional_move):
Adjust.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend-2.c: Adjust.

LoongArch: Redundant sign extension elimination optimization.

We found that the current combine optimization pass in gcc cannot handle
the following redundant sign extension situations:

(insn 77 76 78 5 (set (reg:SI 143)
        (plus:SI (subreg/s/u:SI (reg/v:DI 104 [ len ]) 0)
            (const_int 1 [0x1]))) {addsi3}
    (expr_list:REG_DEAD (reg/v:DI 104 [ len ])
        (nil)))
(insn 78 77 82 5 (set (reg/v:DI 104 [ len ])
        (sign_extend:DI (reg:SI 143))) {extendsidi2}
        (nil))

Because reg:SI 143 neither dies nor is set in insn 78, no replacement merge will
be performed for the insn sequence. We adjusted the add template to eliminate
redundant sign extensions during the expand pass.
Adjusted based on upstream comments:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641988.html

gcc/ChangeLog:

* config/loongarch/loongarch.md (add<mode>3): Removed.
(*addsi3): New.
(addsi3): Ditto.
(adddi3): Ditto.
(*addsi3_extended): Removed.
(addsi3_extended): New.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend.c: Moved to...
* gcc.target/loongarch/sign-extend-1.c: ...here.
* gcc.target/loongarch/sign-extend-2.c: New test.

Daily bump.

OpenMP: lvalue parsing for map/to/from clauses (C)

This patch adds support for parsing general lvalues ("locator list item
types") for OpenMP "map", "to" and "from" clauses to the C front-end,
similar to the previously-posted patch for C++.  Such syntax is permitted
for OpenMP 5.0 and above.  It was previously posted for mainline here:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609038.html

and for the og13 branch here:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623355.html

2024-01-11  Julian Brown  <julian@codesourcery.com>

gcc/c-family/
* c-pretty-print.cc (c_pretty_printer::postfix_expression,
c_pretty_printer::expression): Add OMP_ARRAY_SECTION support.

gcc/c/
* c-parser.cc (c_parser_braced_init, c_parser_conditional_expression):
Don't allow OpenMP array section.
(c_parser_postfix_expression): Don't allow array section in statement
expression.
(c_parser_postfix_expression_after_primary): Add support for OpenMP
array section parsing.
(c_parser_expr_list): Don't allow OpenMP array section here.
(c_parser_omp_variable_list): Change ALLOW_DEREF parameter to
MAP_LVALUE.  Support parsing of general lvalues in "map", "to" and
"from" clauses.
(c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to
MAP_LVALUE.  Update call to c_parser_omp_variable_list.
(c_parser_oacc_data_clause): Update calls to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_reduction): Use OMP_ARRAY_SECTION tree node
instead of TREE_LIST for array sections.
(c_parser_omp_target): Allow GOMP_MAP_ATTACH.
* c-tree.h (c_omp_array_section_p): Add extern declaration.
(build_omp_array_section): Add prototype.
* c-typeck.cc (c_omp_array_section_p): Add flag.
(mark_exp_read): Support OMP_ARRAY_SECTION.
(build_omp_array_section): Add function.
(build_external_ref): Tweak error path for OpenMP array sections.
(handle_omp_array_sections_1): Use OMP_ARRAY_SECTION tree code instead
of TREE_LIST.  Handle more kinds of expressions.
(c_oacc_check_attachments): Use OMP_ARRAY_SECTION instead of TREE_LIST
for array sections.
(c_finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST.
Check for supported expression types.

gcc/testsuite/
* gcc.dg/gomp/bad-array-section-c-1.c: New test.
* gcc.dg/gomp/bad-array-section-c-2.c: New test.
* gcc.dg/gomp/bad-array-section-c-3.c: New test.
* gcc.dg/gomp/bad-array-section-c-4.c: New test.
* gcc.dg/gomp/bad-array-section-c-5.c: New test.
* gcc.dg/gomp/bad-array-section-c-6.c: New test.
* gcc.dg/gomp/bad-array-section-c-7.c: New test.
* gcc.dg/gomp/bad-array-section-c-8.c: New test.

libgomp/
* libgomp.texi: C/C++ lvalues are supported now for map/to/from.
* testsuite/libgomp.c-c++-common/ind-base-4.c: New test.
* testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test.

c++: corresponding object parms [PR113191]

As discussed, our handling of corresponding object parameters needed to
handle the using-declaration case better. And I took the opportunity to
share code between the add_method and cand_parms_match uses.

This patch specifically doesn't compare reversed parameters, but a follow-up
patch will.

PR c++/113191

gcc/cp/ChangeLog:

* class.cc (xobj_iobj_parameters_correspond): Add context parm.
(object_parms_correspond): Factor out of...
(add_method): ...here.
* method.cc (defaulted_late_check): Use it.
* call.cc (class_of_implicit_object): New.
(object_parms_correspond): Overload taking two candidates.
(cand_parms_match): Use it.
(joust): Check reversed before comparing constraints.
* cp-tree.h (object_parms_correspond): Declare.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-memfun4.C: New test.

libstdc++: Fix spelling mistake in new doc addition

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Fix spelling.
* doc/html/manual/api.html: Regenerate.

libstdc++: Document addition of libstdc++exp.a

The API Evolution section of the manual should mention when the
libstdc++exp.a library was added.

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Document addition of
libstdc++exp.a.
* doc/html/*: Regenerate.

libstdc++: use updated type for __unexpected_handler

Commit f4130a3eb545ab1aaf3ecb44f3d06b43e3751e04 changed the type of
__unexpected_handler in libsupc++/unwind-cxx.h to be a
std::terminate_handler to avoid a deprecation warning. However, the
definition in eh_unex_handler.cc still used the old type
(std::unexpected_handler) and thus causes a warning when compiling
libstdc++ with -Wdeprecated-declarations (which is the default, for
example, for clang).

Adapt the definition to match the declaration.

libstdc++-v3/ChangeLog:

* libsupc++/eh_unex_handler.cc: Adjust definition type to
declaration.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Removed a duplicate define directive for __glibcxx_want_ranges_iota

libstdc++-v3/ChangeLog:

* include/std/ranges (__glibcxx_want_ranges_iota): Remove
duplicate definition.

Signed-off-by: Michael Levine <mlevine55@bloomberg.net>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

RISC-V: THEAD: Fix ICE caused by split optimizations for XTheadFMemIdx.

Due to the premature split optimizations for XTheadFMemIdx, a GPR
is allocated when reload allocates registers, resulting in the
following insn.

(insn 66 21 64 5 (set (reg:DF 14 a4 [orig:136 <retval> ] [136])
        (mem:DF (plus:SI (reg/f:SI 15 a5 [141])
                (ashift:SI (reg/v:SI 10 a0 [orig:137 i ] [137])
                    (const_int 3 [0x3]))) [0  S8 A64])) 218 {*movdf_hardfloat_rv32}
     (nil))

We currently do not support adjustments to th_m_mir/th_m_miu, so this
triggers an ICE. It is therefore recommended to place the split
optimizations after reload to ensure an FPR is used when registers are allocated.

gcc/ChangeLog:

* config/riscv/thead.md: Add limits for splits.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadfmemidx-medany.c: New test.

libstdc++: [_GLIBCXX_DEBUG] Fix assignment of value-initialized iterator [PR112477]

Now that _M_Detach does not reset the iterator's _M_version value, we need to reset it
when the iterator is attached to a new sequence, even if this sequence is null, as when
assigning a value-initialized iterator. In this case _M_version shall be reset to 0.
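
A small sketch of the situation described above, written for illustration
and not copied from the new 112477.cc test. The function name is
hypothetical.

```cpp
#include <cassert>
#include <map>

// Assign a value-initialized iterator over one that is attached to a
// container, then compare it with another value-initialized iterator.
bool reassigned_iterator_compares_equal()
{
  using Map = std::map<int, int>;
  Map m{{1, 2}};
  Map::iterator it = m.begin();  // attached to m
  it = Map::iterator{};          // now attached to no sequence
  return it == Map::iterator{};  // both sides are value-initialized
}

void value_initialized_iterator_example()
{
  assert(reassigned_iterator_compares_equal());
}
```

The comparison is expected to hold in a normal build; the commit concerns
the same code being accepted when _GLIBCXX_DEBUG is defined.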

libstdc++-v3/ChangeLog:

PR libstdc++/112477
* src/c++11/debug.cc
(_Safe_iterator_base::_M_attach): Reset _M_version to 0 if attaching to null
sequence.
(_Safe_iterator_base::_M_attach_single): Likewise.
(_Safe_local_iterator_base::_M_attach): Likewise.
(_Safe_local_iterator_base::_M_attach_single): Likewise.
* testsuite/23_containers/map/debug/112477.cc: New test case.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++/ranges: Use C++23 deducing this in _Pipe and _Partial

This simplifies the operator() of the _Pipe and _Partial range adaptor
closure objects using C++23 deducing this, allowing us to condense
multiple operator() overloads into one.

The new __like_t alias template is similar to the expositional one from
P0847R6 except it's implemented in terms of forward_like instead of vice
versa, and thus ours always yields a reference so e.g. __like_t<A, char>
is char&& instead of char. For our purposes (forwarding) this shouldn't
make a difference, I think.
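
To make the difference concrete, here is a stand-in written in plain
C++17 so that it does not depend on C++23 library support. The names like
and like_t are hypothetical; the rules follow what forward_like is
specified to return, and this is not the definition added to move.h.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an alias that always yields a reference:
// copy const from T onto U, then take T's value category.
template<typename T, typename U>
struct like
{
  using bare = std::remove_reference_t<U>;
  using cv = std::conditional_t<
    std::is_const_v<std::remove_reference_t<T>>, const bare, bare>;
  using type = std::conditional_t<
    std::is_lvalue_reference_v<T>, cv&, cv&&>;
};

template<typename T, typename U>
using like_t = typename like<T, U>::type;

struct A { };

// A non-reference T yields an rvalue reference, never a plain char.
static_assert(std::is_same_v<like_t<A, char>, char&&>);
static_assert(std::is_same_v<like_t<A&, char>, char&>);
static_assert(std::is_same_v<like_t<const A&, char>, const char&>);

void like_t_example()
{
  assert((std::is_same_v<like_t<const A, char>, const char&&>));
}
```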

libstdc++-v3/ChangeLog:

* include/bits/move.h (__like_t): Define in C++23 mode.
* include/std/ranges (views::__adaptor::_Partial::operator()):
Implement using C++23 deducing this when available.
(views::__adaptor::_Pipe::operator()): Likewise.
* testsuite/std/ranges/adaptors/100577.cc: Adjust testcase to
accept new "no match for call" errors issued in C++23 mode.
* testsuite/std/ranges/adaptors/lazy_split_neg.cc: Likewise.

expr: Limit the store flag optimization for single bit to non-vectors [PR113322]

The problem here is that, after the recent vectorizer improvements, we end up
with a comparison against a vector bool 0, which is then handed to expand_single_bit_test,
which does not expect vector comparisons at all.

The IR was:
  vector(4) <signed-boolean:1> mask_patt_5.13;
  _Bool _12;

  mask_patt_5.13_44 = vect_perm_even_41 != { 0.0, 1.0e+0, 2.0e+0, 3.0e+0 };
  _12 = mask_patt_5.13_44 == { 0, 0, 0, 0 };

and we tried to call expand_single_bit_test for the last comparison.
Rejecting the vector comparison is needed.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR middle-end/113322

gcc/ChangeLog:

* expr.cc (do_store_flag): Don't try single bit tests with
comparison on vector types.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr113322-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

match: Delay folding of 1/x into `(x+1u)<2u?x:0` until late [PR113301]

Ranger currently cannot cope with the complexity of COND_EXPR in some
cases, so delaying the simplification of `1/x` for signed types helps
code generation.
tree-ssa/divide-8.c is a new testcase where this can help.
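
For readers unfamiliar with the fold, a sketch of the identity for a
signed int follows. The function names are hypothetical. Note that x in
{-1, 0, 1} maps to {0, 1, 2} after the unsigned increment, so the sketch
uses an inclusive bound of 2 to cover x == 1; the shorthand in the
subject line is kept as written.

```cpp
#include <cassert>
#include <climits>

// Replacement for 1 / x on a signed int, valid for x != 0:
// the quotient is x itself when x is -1 or 1, and 0 otherwise.
int one_over_folded(int x)
{
  return (static_cast<unsigned>(x) + 1u) <= 2u ? x : 0;
}

// Compare the folded form against real division on a few samples.
bool fold_matches_division()
{
  const int samples[] = { INT_MIN, -1000, -2, -1, 1, 2, 1000, INT_MAX };
  for (int x : samples)
    if (1 / x != one_over_folded(x))
      return false;
  return true;
}

void one_over_x_example()
{
  assert(fold_matches_division());
}
```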

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/113301

gcc/ChangeLog:

* match.pd (`1/x`): Delay signed case until late.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/divide-8.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: Add GDB printer for std::integral_constant

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdIntegralConstantPrinter):
Add printer for std::integral_constant.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Test it.

libstdc++: Prefer posix_memalign for aligned-new [PR113258]

As described in PR libstdc++/113258 there are old versions of tcmalloc
which replace malloc and related APIs, but do not replace aligned_alloc
because it didn't exist at the time they were released. This means that
when operator new(size_t, align_val_t) uses aligned_alloc to obtain
memory, it comes from libc's aligned_alloc not from tcmalloc. But when
operator delete(void*, size_t, align_val_t) uses free to deallocate the
memory, that goes to tcmalloc's replacement version of free, which
doesn't know how to free it.

If we give preference to the older posix_memalign instead of
aligned_alloc then we're more likely to use a function that will be
compatible with the replacement version of free. Because posix_memalign
has been around for longer, it's more likely that old third-party malloc
replacements will also replace posix_memalign alongside malloc and free.
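
A minimal sketch of the allocate/free pairing discussed above, assuming
a POSIX system. The helper names are hypothetical and this is not the
code in new_opa.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free (POSIX)

// Hypothetical helper: obtain memory via posix_memalign so that a
// replacement malloc library providing posix_memalign also owns the block.
// Assumption: align is a power of two.
void* aligned_alloc_sketch(std::size_t size, std::size_t align)
{
  // posix_memalign requires the alignment to be a multiple of sizeof(void*).
  if (align < sizeof(void*))
    align = sizeof(void*);
  void* p = nullptr;
  if (posix_memalign(&p, align, size) != 0)
    return nullptr;
  return p;
}

// The matching release goes through free, as in the description above.
void aligned_free_sketch(void* p)
{
  free(p);
}

void aligned_new_example()
{
  void* p = aligned_alloc_sketch(100, 64);
  assert(p != nullptr);
  assert(reinterpret_cast<std::uintptr_t>(p) % 64 == 0);
  aligned_free_sketch(p);
}
```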

libstdc++-v3/ChangeLog:

PR libstdc++/113258
* libsupc++/new_opa.cc: Prefer to use posix_memalign if
available.

dg-extract-results.py: Ignore case in header line

DejaGNU changed its header line from "Test Run By" to "Test run by"
around 2016. This patch makes it so that both alternatives are
correctly detected.
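
The script itself is Python; the idea of a case-insensitive header match
can be sketched in C++ as follows. The pattern and the function name are
hypothetical, and the real test_run regex in dg-extract-results.py may
look different.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical pattern: accept the header line regardless of letter case.
bool is_test_run_header(const std::string& line)
{
  static const std::regex test_run(
    "^Test run by (\\S+) on (.*)$",
    std::regex::ECMAScript | std::regex::icase);
  return std::regex_match(line, test_run);
}

void test_run_header_example()
{
  assert(is_test_run_header("Test Run By alice on Thu Jan 11 2024"));
  assert(is_test_run_header("Test run by alice on Thu Jan 11 2024"));
  assert(!is_test_run_header("Running target unix"));
}
```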

contrib/ChangeLog:

* dg-extract-results.py: Make the test_run regex case
insensitive.

testsuite: remove xfail

These two lines have been getting XPASS since the test was added.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-diagnostics7.C: Remove xfail.

AVR: invoke.texi: Put internal options in their own @subsubsection.

gcc/
* doc/invoke.texi (AVR Options): Move -mrmw, -mn-flash, -mshort-calls
and -msp8 to...
(AVR Internal Options): ...this new @subsubsection.

testsuite: remove -save-temps from many tests [PR113319]

This removes -save-temps from the tests I've introduced to fix the LTO
mismatches.

gcc/testsuite/ChangeLog:

PR testsuite/113319
* gcc.dg/bic-bitmask-13.c: Remove -save-temps.
* gcc.dg/bic-bitmask-14.c: Likewise.
* gcc.dg/bic-bitmask-15.c: Likewise.
* gcc.dg/bic-bitmask-16.c: Likewise.
* gcc.dg/bic-bitmask-17.c: Likewise.
* gcc.dg/bic-bitmask-18.c: Likewise.
* gcc.dg/bic-bitmask-19.c: Likewise.
* gcc.dg/bic-bitmask-20.c: Likewise.
* gcc.dg/bic-bitmask-21.c: Likewise.
* gcc.dg/bic-bitmask-22.c: Likewise.
* gcc.dg/bic-bitmask-7.c: Likewise.
* gcc.dg/vect/vect-early-break-run_1.c: Likewise.
* gcc.dg/vect/vect-early-break-run_10.c: Likewise.
* gcc.dg/vect/vect-early-break-run_2.c: Likewise.
* gcc.dg/vect/vect-early-break-run_3.c: Likewise.
* gcc.dg/vect/vect-early-break-run_4.c: Likewise.
* gcc.dg/vect/vect-early-break-run_5.c: Likewise.
* gcc.dg/vect/vect-early-break-run_6.c: Likewise.
* gcc.dg/vect/vect-early-break-run_7.c: Likewise.
* gcc.dg/vect/vect-early-break-run_8.c: Likewise.
* gcc.dg/vect/vect-early-break-run_9.c: Likewise.

[PR112918][LRA]: Fixing IRA ICE on m68k

Some GCC tests on the m68k port of LRA fail with `maximum number of
generated reload insns per insn achieved`.  The problem is that for a
subreg reload LRA cannot narrow the reg class further, from ALL_REGS to
GENERAL_REGS and then to data regs or address regs.  The patch permits
narrowing the reg class from reload insns if this results in successful
matching of the reg operand.  This is the second version of the patch to
fix the PR.  This version adds matching with and without narrowing the reg
class and prefers a match without narrowing classes.

gcc/ChangeLog:

PR rtl-optimization/112918
* lra-constraints.cc (SMALL_REGISTER_CLASS_P): Move before in_class_p.
(in_class_p): Restrict condition for narrowing class in case of
allow_all_reload_class_changes_p.
(process_alt_operands): Try to match operand without and with
narrowing reg class.  Discourage narrowing the class.  Finish insn
matching only if there is no class narrowing.
(curr_insn_transform): Pass true to in_class_p for reg operand win.

tree-optimization/112505 - bit-precision induction vectorization

Vectorization of bit-precision inductions isn't implemented but we
don't check this, instead we ICE during transform.

PR tree-optimization/112505
* tree-vect-loop.cc (vectorizable_induction): Reject
bit-precision induction.

* gcc.dg/vect/pr112505.c: New testcase.

tree-optimization/113126 - vector extension compare optimization

The following makes sure the resulting boolean type is the same
when eliding a float extension.

PR tree-optimization/113126
* match.pd ((double)float CMP (double)float -> float CMP float):
Make sure the boolean type is the same.
* fold-const.cc (fold_binary_loc): Likewise.

* gcc.dg/torture/pr113126.c: New testcase.

tree-optimization/112636 - estimate niters before header copying

The following avoids a mismatch between an early query for maximum
number of iterations of a loop and a late one when through ranger
we'd get iterations estimated. Instead make sure we compute niters
before querying the iteration bound.

PR tree-optimization/112636
* tree-ssa-loop-ch.cc (ch_base::copy_headers): Call
estimate_numbers_of_iterations before querying
get_max_loop_iterations_int.
(pass_ch::execute): Initialize SCEV and loops appropriately.

* gcc.dg/pr112636.c: New testcase.

AVR: Some minor improvements to the TEXI documentation.

gcc/
* config/avr/avr-devices.cc (avr_texinfo): Adjust documentation for
Reduced Tiny.
* config/avr/gen-avr-mmcu-texi.cc (main): Add @anchor for each core.
* doc/extend.texi (AVR Variable Attributes): Improve documentation
of io, io_low and address attributes.
* doc/invoke.texi (AVR Options): Add some anchors for external refs.
* doc/avr-mmcu.texi: Rebuild.

libstdc++: Use using instead of typedef in ops-common.h

libstdc++-v3/ChangeLog:

* src/filesystem/ops-common.h (stat_type): Use using.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Fix error handling in filesystem::equivalent [PR113250]

This patch makes std::filesystem::equivalent correctly throw an exception
when either path does not exist as per [fs.op.equivalent]/4.
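
A small sketch using the non-throwing overload. The helper name and the
path names are hypothetical, and the paths are assumed not to exist in
the working directory. The assertions only cover behaviour that should
hold with or without this fix; with the fix, a single missing path is
expected to be reported as an error as well.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Return the error code reported by the non-throwing overload.
std::error_code equivalent_error(const fs::path& a, const fs::path& b)
{
  std::error_code ec;
  (void) fs::equivalent(a, b, ec);
  return ec;
}

void equivalent_example()
{
  // Hypothetical names, assumed to be absent from the working directory.
  const fs::path missing_a = "gcc-log-example-missing-a";
  const fs::path missing_b = "gcc-log-example-missing-b";

  // Both paths missing: an error is reported.
  assert(equivalent_error(missing_a, missing_b).value() != 0);

  // One path missing: the two can never be equivalent.
  std::error_code ec;
  assert(!fs::equivalent(".", missing_a, ec));
}
```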

PR libstdc++/113250

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (fs::equivalent): Use || instead of &&.
* src/filesystem/ops.cc (fs::equivalent): Likewise.
* testsuite/27_io/filesystem/operations/equivalent.cc: Handle
error codes.
* testsuite/experimental/filesystem/operations/equivalent.cc:
Likewise.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

LoongArch: Implement option save/restore

LTO option streaming and target attributes both require per-function
target configuration, which is achieved via option save/restore.

We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target
context in addition to other automatically maintained option states
(via the "Save" option property in the .opt files).

Tested on loongarch64-linux-gnu without regression.

PR target/113233

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Mark options with
the "Save" property.
* config/loongarch/loongarch.opt: Same.
* config/loongarch/loongarch-opts.cc: Refresh -mcmodel= state
according to la_target.
* config/loongarch/loongarch.cc: Implement TARGET_OPTION_{SAVE,
RESTORE} for the la_target structure; Rename option conditions
to have the same "la_" prefix.
* config/loongarch/loongarch.h: Same.

LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

insert_var_expansion_initialization depends on HONOR_SIGNED_ZEROS
and initializes the unrolling variables to +0.0f instead of -0.0f
when signed zeros are not honored. However, we should always keep
-0.0f here because:

* The -0.0f is always the correct initial value.
* We need to support targets that always honor signed zeros.

Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initializing
instead of HONOR_SIGNED_ZEROS. Then the target/backend can
decide whether to honor no-signed-zeros or not.
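
Why -0.0f is the safe seed for an additive accumulator can be seen in a
small sketch. The function name is hypothetical, and the code assumes
default floating-point semantics (no -ffast-math or similar).

```cpp
#include <cassert>
#include <cmath>

// Add n copies of -0.0f to an accumulator that starts at init.
float sum_negative_zeros(float init, int n)
{
  volatile float term = -0.0f;  // volatile keeps the additions at run time
  float acc = init;
  for (int i = 0; i < n; ++i)
    acc += term;
  return acc;
}

void signed_zero_seed_example()
{
  // Seeding with -0.0f preserves the sign of an all-negative-zero sum.
  assert(std::signbit(sum_negative_zeros(-0.0f, 4)));
  // Seeding with +0.0f loses it: (+0.0f) + (-0.0f) is +0.0f.
  assert(!std::signbit(sum_negative_zeros(0.0f, 4)));
}
```

For any other addend the two seeds give the same result, which is why
-0.0f never changes a sum that +0.0f would leave intact.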

We also removed the testcase pr30957-1.c, since whether its return value is
positive or negative is undefined behavior.

The tests below pass with this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

* loop-unroll.cc (insert_var_expansion_initialization): Leverage
MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Remove.

Signed-off-by: Pan Li <pan2.li@intel.com>

aarch64: Fix dwarf2cfi ICEs due to recent CFI note changes [PR113077]

In r14-6604-gd7ee988c491cde43d04fe25f2b3dbad9d85ded45 we changed the CFI notes
attached to callee saves (in aarch64_save_callee_saves).  That patch changed
the ldp/stp representation to use unspecs instead of PARALLEL moves.  This meant
that we needed to attach CFI notes to all frame-related pair saves such that
dwarf2cfi could still emit the appropriate CFI (it cannot interpret the unspecs
directly).  The patch also attached REG_CFA_OFFSET notes to individual saves so
that the ldp/stp pass could easily preserve them when forming stps.

In that change I chose to use REG_CFA_OFFSET, but as the PR shows, that
choice was problematic in that REG_CFA_OFFSET requires the attached
store to be expressed in terms of the current CFA register at all times.
This means that even scheduling of frame-related insns can break this
invariant, leading to ICEs in dwarf2cfi.

The old behaviour (before that change) allowed dwarf2cfi to interpret the RTL
directly for sp-relative saves.  This change restores that behaviour by using
REG_FRAME_RELATED_EXPR instead of REG_CFA_OFFSET.  REG_FRAME_RELATED_EXPR
effectively just gives a different pattern for dwarf2cfi to look at instead of
the main insn pattern.  That allows us to attach the old-style PARALLEL move
representation in a REG_FRAME_RELATED_EXPR note and means we are free to always
express the save addresses in terms of the stack pointer.

Since the ldp/stp fusion pass can combine frame-related stores, this patch also
updates it to preserve REG_FRAME_RELATED_EXPR notes, and additionally gives it
the ability to synthesize those notes when combining sp-relative saves into an
stp (the latter always needs a note due to the unspec representation, the former
does not).

gcc/ChangeLog:

PR target/113077
* config/aarch64/aarch64-ldp-fusion.cc (filter_notes): Add
fr_expr param to extract REG_FRAME_RELATED_EXPR notes.
(combine_reg_notes): Handle REG_FRAME_RELATED_EXPR notes, and
synthesize these if needed.  Update caller ...
(ldp_bb_info::fuse_pair): ... here.
(ldp_bb_info::try_fuse_pair): Punt if either insn has writeback
and either insn is frame-related.
(find_trailing_add): Punt on frame-related insns.
* config/aarch64/aarch64.cc (aarch64_save_callee_saves): Use
REG_FRAME_RELATED_EXPR instead of REG_CFA_OFFSET.

gcc/testsuite/ChangeLog:

PR target/113077
* gcc.target/aarch64/pr113077.c: New test.

MIPS: Add ATTRIBUTE_UNUSED to mips_start_function_definition

Fix build warning:
mips.cc: warning: unused parameter 'decl'.

gcc
* config/mips/mips.cc (mips_start_function_definition):
Add ATTRIBUTE_UNUSED.

tree-optimization/111003 - new testcase

Testcase for fixed PR.

PR tree-optimization/111003
gcc/testsuite/
* gcc.dg/tree-ssa/pr111003.c: New testcase.

middle-end/112740 - vector boolean CTOR expansion issue

The optimization to expand uniform boolean vectors by sign-extension
works only for dense masks but it failed to check that.

PR middle-end/112740
* expr.cc (store_constructor): Check the integer vector
mask has a single bit per element before using sign-extension
to expand a uniform vector.

* gcc.dg/pr112740.c: New testcase.

RISC-V: VLA preempts VLS on unknown NITERS loop

This patch fixes the known issues on SLP cases:

ble a2,zero,.L11
addiw t1,a2,-1
li a5,15
bleu t1,a5,.L9
srliw a7,t1,4
slli a7,a7,7
lui t3,%hi(.LANCHOR0)
lui a6,%hi(.LANCHOR0+128)
addi t3,t3,%lo(.LANCHOR0)
li a4,128
addi a6,a6,%lo(.LANCHOR0+128)
add a7,a7,a0
addi a3,a1,37
mv a5,a0
vsetvli zero,a4,e8,m8,ta,ma
vle8.v v24,0(t3)
vle8.v v16,0(a6)
.L4:
li a6,128
vle8.v v0,0(a3)
vrgather.vv v8,v0,v24
vadd.vv v8,v8,v16
vse8.v v8,0(a5)
add a5,a5,a6
add a3,a3,a6
bne a5,a7,.L4
andi a5,t1,-16
mv t1,a5
.L3:
subw a2,a2,a5
li a4,1
beq a2,a4,.L5
slli a5,a5,32
srli a5,a5,32
addiw a2,a2,-1
slli a5,a5,3
csrr a4,vlenb
slli a6,a2,32
addi t3,a5,37
srli a3,a6,29
slli a4,a4,2
add t3,a1,t3
add a5,a0,a5
mv t5,a3
bgtu a3,a4,.L14
.L6:
li a4,50790400
addi a4,a4,1541
li a6,67633152
addi a6,a6,513
slli a4,a4,32
add a4,a4,a6
vsetvli t4,zero,e64,m4,ta,ma
vmv.v.x v16,a4
vsetvli a6,zero,e16,m8,ta,ma
vid.v v8
vsetvli zero,t5,e8,m4,ta,ma
vle8.v v20,0(t3)
vsetvli a6,zero,e16,m8,ta,ma
csrr a7,vlenb
vand.vi v8,v8,-8
vsetvli zero,zero,e8,m4,ta,ma
slli a4,a7,2
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,t5,e8,m4,ta,ma
vse8.v v4,0(a5)
bgtu a3,a4,.L15
.L7:
addw t1,a2,t1
.L5:
slliw a5,t1,3
add a1,a1,a5
lui a4,%hi(.LC2)
add a0,a0,a5
lbu a3,37(a1)
addi a5,a4,%lo(.LC2)
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a3
vle8.v v2,0(a5)
vadd.vv v1,v1,v2
vse8.v v1,0(a0)
.L11:
ret
.L15:
sub a3,a3,a4
bleu a3,a4,.L8
mv a3,a4
.L8:
li a7,50790400
csrr a4,vlenb
slli a4,a4,2
addi a7,a7,1541
li t4,67633152
add t3,t3,a4
vsetvli zero,a3,e8,m4,ta,ma
slli a7,a7,32
addi t4,t4,513
vle8.v v20,0(t3)
add a4,a5,a4
add a7,a7,t4
vsetvli a5,zero,e64,m4,ta,ma
vmv.v.x v16,a7
vsetvli a6,zero,e16,m8,ta,ma
vid.v v8
vand.vi v8,v8,-8
vsetvli zero,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,a3,e8,m4,ta,ma
vse8.v v4,0(a4)
j .L7
.L14:
mv t5,a4
j .L6
.L9:
li a5,0
li t1,0
j .L3

The vectorized codegen is quite inefficient since we choose a VLS mode to vectorize the loop body
and a VLA mode for the epilogue.

cost.c:6:21: note:  ***** Choosing vector mode V128QI
cost.c:6:21: note:  ***** Choosing epilogue vector mode RVVM4QI

As we know, on the RVV side we have VLA modes and VLS modes. VLA modes support partial vectors whereas
VLS modes support full vectors.  The goal of adding VLS modes is to improve the codegen of known NITERS
or SLP code.

If NITERS is unknown, that is, the bound n in i < n is unknown, we will always have partial-vector vectorization,
either in the loop body or in the epilogue. In this case, it is always more efficient to apply VLA partial vectorization
to the loop body, which then needs no epilogue.

After this patch:

f:
ble a2,zero,.L7
li a5,1
beq a2,a5,.L5
li a6,50790400
addi a6,a6,1541
li a4,67633152
addi a4,a4,513
csrr a5,vlenb
addiw a2,a2,-1
slli a6,a6,32
add a6,a6,a4
slli a5,a5,2
slli a4,a2,32
vsetvli t1,zero,e64,m4,ta,ma
srli a3,a4,29
neg t4,a5
addi a7,a1,37
mv a4,a0
vmv.v.x v12,a6
vsetvli t3,zero,e16,m8,ta,ma
vid.v v16
vand.vi v16,v16,-8
.L4:
minu a6,a3,a5
vsetvli zero,a6,e8,m4,ta,ma
vle8.v v8,0(a7)
vsetvli t3,zero,e8,m4,ta,ma
mv t1,a3
vrgatherei16.vv v4,v8,v16
vsetvli zero,a6,e8,m4,ta,ma
vadd.vv v4,v4,v12
vse8.v v4,0(a4)
add a7,a7,a5
add a4,a4,a5
add a3,a3,t4
bgtu t1,a5,.L4
.L3:
slliw a2,a2,3
add a1,a1,a2
lui a5,%hi(.LC0)
lbu a4,37(a1)
add a0,a0,a2
addi a5,a5,%lo(.LC0)
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a4
vle8.v v2,0(a5)
vadd.vv v1,v1,v2
vse8.v v1,0(a0)
.L7:
ret

Tested on both RV32 and RV64 with no regressions. Ok for trunk?

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): VLA
preempt VLS on unknown NITERS loop.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Remove xfail.
* gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.

libstdc++: Optimize std::is_compound compilation performance

This patch optimizes the compilation performance of std::is_compound.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_compound): Do not use __not_.
(is_compound_v): Use is_fundamental_v instead.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>

Add -mevex512 into invoke.texi

Hi Richard,

It seems that I sent out an outdated patch. This patch should be what
I wanted to send.

Thx,
Haochen

gcc/ChangeLog:

* doc/invoke.texi: Add -mevex512.

LoongArch: Optimized some of the sign extension instructions generated during bitwise operations.

There are two mode iterators defined in loongarch.md:
(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
and
(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
Replace the mode in the bitwise operations from GPR to X.

Since the bitwise operation instructions do not distinguish between 64-bit,
32-bit, etc., it is necessary to perform sign extension if the bitwise
operation is narrower than 64 bits.
The original definition would have generated a lot of redundant sign
extension instructions. This problem is optimized with reference to the
RISC-V implementation.

With this patch, SPEC2017 500.perlbench performance improves by 1.8%.

gcc/ChangeLog:

* config/loongarch/loongarch.md (one_cmpl<mode>2): Replace GPR with X.
(*nor<mode>3): Likewise.
(nor<mode>3): Likewise.
(*negsi2_extended): New template.
(*<optab>si3_internal): Likewise.
(*one_cmplsi2_internal): Likewise.
(*norsi3_internal): Likewise.
(*<optab>nsi_internal): Likewise.
(bytepick_w_<bytepick_imm>_extend): Modify this template according to the
modified bit operation to make the optimization work.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend-bitwise.c: New test.

Optimize A < B ? A : B to MIN_EXPR.

Similarly for A < B ? B : A to MAX_EXPR.
There is code in the frontend to optimize such patterns, but it fails to
handle the testcase in the PR since the pattern is only exposed at gimple
level when folding backend builtins.

pr95906 can now be optimized to MAX_EXPR, as noted in the comment in the
testcase.

// FIXME: this should further optimize to a MAX_EXPR
typedef signed char v16i8 __attribute__((vector_size(16)));
v16i8 f(v16i8 a, v16i8 b)
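
Separately from the testcase quoted above, the source-level shape of the
pattern can be sketched with the GNU vector extension. This assumes GCC
(or a compatible compiler) in C++ mode, where the ternary operator
accepts a vector condition; the type v4si and the function names are
hypothetical and use int elements rather than signed char.

```cpp
#include <cassert>

// GNU vector extension: four 32-bit ints in one 16-byte vector.
typedef int v4si __attribute__((vector_size(16)));

// A < B ? A : B, the shape that can become MIN_EXPR.
v4si vec_min(v4si a, v4si b)
{
  return a < b ? a : b;
}

// A < B ? B : A, the shape that can become MAX_EXPR.
v4si vec_max(v4si a, v4si b)
{
  return a < b ? b : a;
}

void vec_min_max_example()
{
  v4si a = { 1, 5, -3, 7 };
  v4si b = { 2, 4, -3, 0 };
  v4si lo = vec_min(a, b);
  v4si hi = vec_max(a, b);
  assert(lo[0] == 1 && lo[1] == 4 && lo[2] == -3 && lo[3] == 0);
  assert(hi[0] == 2 && hi[1] == 5 && hi[2] == -3 && hi[3] == 7);
}
```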

gcc/ChangeLog:

PR target/104401
* match.pd (VEC_COND_EXPR: A < B ? A : B -> MIN_EXPR): New pattern match.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104401.c: New test.
* gcc.dg/tree-ssa/pr95906.c: Adjust testcase.

PR modula2/112946 set expression type checking

This patch adds type checking for binary set operators.
It also checks the IN operator and improves the := type checking.

gcc/m2/ChangeLog:

PR modula2/112946
* gm2-compiler/M2GenGCC.mod (IsExpressionCompatible): Import.
(ExpressionTypeCompatible): Import.
(CodeStatement): Remove op1, op2, op3 parameters from CodeSetOr,
CodeSetAnd, CodeSetSymmetricDifference, CodeSetLogicalDifference.
(checkArrayElements): Rename op1 to des and op3 to expr.
Use despos and exprpos instead of CurrentQuadToken.
(checkRecordTypes): Rename op1 to des and op2 to expr.
Use virtpos instead of CurrentQuadToken.
(checkIncorrectMeta): Ditto.
(checkBecomes): Rename op1 to des and op3 to expr.
Use virtpos instead of CurrentQuadToken.
(NoWalkProcedure): New procedure stub.
(CheckBinaryExpressionTypes): New procedure function.
(CheckElementSetTypes): New procedure function.
(CodeBinarySet): Re-write.
(FoldBinarySet): Re-write.
(CodeSetOr): Remove parameters op1, op2 and op3.
(CodeSetAnd): Ditto.
(CodeSetLogicalDifference): Ditto.
(CodeSetSymmetricDifference): Ditto.
(CodeIfIn): Call CheckBinaryExpressionTypes and
CheckElementSetTypes.
* gm2-compiler/M2Quads.mod (BuildRotateFunction): Correct
parameters to MakeVirtualTok to reflect parameter block
passed to Rotate.

gcc/testsuite/ChangeLog:

PR modula2/112946
* gm2/pim/fail/badbecomes.mod: New test.
* gm2/pim/fail/badexpression.mod: New test.
* gm2/pim/fail/badexpression2.mod: New test.
* gm2/pim/fail/badifin.mod: New test.
* gm2/pim/pass/goodifin.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

config: delete unused CYG_AC_PATH_LIBERTY macro

Nothing uses this, so delete it to avoid confusion.

config/ChangeLog:

* acinclude.m4 (CYG_AC_PATH_LIBERTY): Delete.

Daily bump.

libstdc++: Use _GLIBCXX_USE_BUILTIN_TRAIT for _Nth_type

Since _Nth_type has a fallback native implementation, use
_GLIBCXX_USE_BUILTIN_TRAIT when checking for __type_pack_element
so that we can easily toggle which implementation to use.

libstdc++-v3/ChangeLog:

* include/bits/utility.h (_Nth_type): Use
_GLIBCXX_USE_BUILTIN_TRAIT instead of __has_builtin.

RISC-V: Switch RVV cost model.

This is a preparatory patch for the cost model tweaks that follow.

Since we don't have a vector cost model in the default tune info (rocket),
we make the generic vector cost model the default.

The reason we want to switch to the generic vector cost model is that the
default cost model generates inferior codegen for various benchmarks.

For example, in PR113247 we have a performance bug where SHA256 performance
drops by over 70%. Currently, no matter how we adapt the cost model,
we are not able to fix the performance bug, since we always use the default cost model.

Also, tweak the generic cost model back to the default cost model, since we have some FAILs in
the current tests.

After this patch, we (Robin and I) can work on cost model tuning together to improve performance
in various benchmarks.

Tested on both RV32 and RV64, ok for trunk ?

gcc/ChangeLog:

* config/riscv/riscv.cc (get_common_costs): Switch RVV cost model.
(get_vector_costs): Ditto.
(riscv_builtin_vectorization_cost): Ditto.

RISC-V: Minor tweak dynamic cost model

v2 update: Robustify tests.

While working on the cost model, I noticed one case where the dynamic LMUL cost doesn't work well.

Before this patch:

foo:
        lui     a4,%hi(.LANCHOR0)
        li      a0,1953
        li      a1,63
        addi    a4,a4,%lo(.LANCHOR0)
        li      a3,64
        vsetvli a2,zero,e32,mf2,ta,ma
        vmv.v.x v5,a0
        vmv.v.x v4,a1
        vid.v   v3
.L2:
        vsetvli a5,a3,e32,mf2,ta,ma
        vadd.vi v2,v3,1
        vadd.vv v1,v3,v5
        mv      a2,a5
        vmacc.vv        v1,v2,v4
        slli    a1,a5,2
        vse32.v v1,0(a4)
        sub     a3,a3,a5
        add     a4,a4,a1
        vsetvli a5,zero,e32,mf2,ta,ma
        vmv.v.x v1,a2
        vadd.vv v3,v3,v1
        bne     a3,zero,.L2
        li      a0,0
        ret

Unexpected: this uses scalable vectors and LMUL = MF2, which wastes computation resources.

Ideally, we should use LMUL = M8 VLS modes.

The root cause is the dynamic LMUL heuristic dominates the VLS heuristic.
Adapt the cost model heuristic.

After this patch:

foo:
        lui     a4,%hi(.LANCHOR0)
        addi    a4,a4,%lo(.LANCHOR0)
        li      a3,4096
        li      a5,32
        li      a1,2016
        addi    a2,a4,128
        addiw   a3,a3,-32
        vsetvli zero,a5,e32,m8,ta,ma
        li      a0,0
        vid.v   v8
        vsll.vi v8,v8,6
        vadd.vx v16,v8,a1
        vadd.vx v8,v8,a3
        vse32.v v16,0(a4)
        vse32.v v8,0(a2)
        ret

Tested on both RV32/RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): Minor tweak.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c: Fix test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c: Ditto.

libgccjit: Fix GGC segfault when using -flto

gcc/ChangeLog:
PR jit/111396
* ipa-fnsummary.cc (ipa_fnsummary_cc_finalize): Call
ipa_free_size_summary.
* ipa-icf.cc (ipa_icf_cc_finalize): New function.
* ipa-profile.cc (ipa_profile_cc_finalize): New function.
* ipa-prop.cc (ipa_prop_cc_finalize): New function.
* ipa-prop.h (ipa_prop_cc_finalize): New function.
* ipa-sra.cc (ipa_sra_cc_finalize): New function.
* ipa-utils.h (ipa_profile_cc_finalize, ipa_icf_cc_finalize,
ipa_sra_cc_finalize): New functions.
* toplev.cc (toplev::finalize): Call ipa_icf_cc_finalize,
ipa_prop_cc_finalize, ipa_profile_cc_finalize and
ipa_sra_cc_finalize.
Include ipa-utils.h.

gcc/testsuite/ChangeLog:
PR jit/111396
* jit.dg/all-non-failing-tests.h: Add note about test-ggc-bugfix.
* jit.dg/test-ggc-bugfix.c: New test.

RISC-V: T-HEAD: Add support for the XTheadInt ISA extension

The XTheadInt ISA extension provides the following instructions
to accelerate interrupt processing:
* th.ipush
* th.ipop

Ref:
https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf

gcc/ChangeLog:

* config/riscv/riscv-protos.h (th_int_get_mask): New prototype.
(th_int_get_save_adjustment): Likewise.
(th_int_adjust_cfi_prologue): Likewise.
* config/riscv/riscv.cc (BITSET_P): Moved away from here.
(TH_INT_INTERRUPT): New macro.
(riscv_expand_prologue): Add the processing of XTheadInt.
(riscv_expand_epilogue): Likewise.
* config/riscv/riscv.h (BITSET_P): Moved to here.
* config/riscv/riscv.md: New unspec.
* config/riscv/thead.cc (th_int_get_mask): New function.
(th_int_get_save_adjustment): Likewise.
(th_int_adjust_cfi_prologue): Likewise.
* config/riscv/thead.md (th_int_push): New pattern.
(th_int_pop): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadint-push-pop.c: New test.

middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

Currently GCC does not treat IFN_COPYSIGN the same as the copysign tree expr.
The latter has a libcall fallback and the IFN can only do optabs.

Because of this the change I made to optimize copysign only works if the
target has implemented the optab, but it should work for those that have the
libcall too.

More annoyingly if a target has vector versions of ABS and NEG but not COPYSIGN
then the change made them lose vectorization.

The proper fix for this is to treat the IFN the same as the tree EXPR and to
enhance expand_COPYSIGN to also support vector calls.

I have such a patch for GCC 15 but it's quite big and too invasive for stage-4.
As such this is a minimal fix, just don't apply the transformation and leave
targets which don't have the optab unoptimized.

Targets list for check_effective_target_ifn_copysign was gotten by grepping for
copysign and looking at the optab.

gcc/ChangeLog:

PR tree-optimization/112468
* doc/sourcebuild.texi: Document ifn_copysign.
* match.pd: Only apply transformation if target supports the IFN.

gcc/testsuite/ChangeLog:

PR tree-optimization/112468
* gcc.dg/fold-copysign-1.c: Modify tests based on if target supports
IFN_COPYSIGN.
* gcc.dg/pr55152-2.c: Likewise.
* gcc.dg/tree-ssa/abs-4.c: Likewise.
* gcc.dg/tree-ssa/backprop-6.c: Likewise.
* gcc.dg/tree-ssa/copy-sign-2.c: Likewise.
* gcc.dg/tree-ssa/mult-abs-2.c: Likewise.
* lib/target-supports.exp (check_effective_target_ifn_copysign): New.

reassoc vs uninitialized variable [PR112581]

Like r14-2293-g11350734240dba and r14-2289-gb083203f053f16,
reassociation can combine operations across a few basic blocks, and one of
the uses can be an uninitialized variable; going from a conditional
use to an unconditional use can then cause wrong code.
This uses maybe_undef_p like other passes where this can happen.

Note if-to-switch uses the function (init_range_entry) provided
by reassociation so we need to call mark_ssa_maybe_undefs there;
otherwise we assume almost all ssa names are uninitialized.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/112581
* gimple-if-to-switch.cc (pass_if_to_switch::execute): Call
mark_ssa_maybe_undefs.
* tree-ssa-reassoc.cc (can_reassociate_op_p): Uninitialized
variables can not be reassociated.
(init_range_entry): Check for uninitialized variables too.
(init_reassoc): Call mark_ssa_maybe_undefs.

gcc/testsuite/ChangeLog:

PR tree-optimization/112581
* gcc.c-torture/execute/pr112581-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

RISC-V/testsuite: Fix comment termination in pr105314.c

Add terminating `/' character missing from one of the test harness
command clauses in pr105314.c. This causes no issue with compilation
owing to another comment immediately following, but would cause a:

pr105314.c:3:1: warning: "/*" within comment [-Wcomment]

message if warnings were enabled.

gcc/testsuite/
* gcc.target/riscv/pr105314.c: Fix comment termination.

RISC-V: Also handle sign extension in branch costing

Complement commit c1e8cb3d9f94 ("RISC-V: Rework branch costing model for
if-conversion") and also handle extraneous sign extend operations that
are sometimes produced by `noce_try_cmove_arith' instead of zero extend
operations, making branch costing consistent. It is unclear what the
condition is for the middle end to choose between the zero extend and
sign extend operation, but the test case included uses sign extension
with 64-bit targets, preventing if-conversion from triggering across all
the architectural variants.

There are further anomalies revealed by the test case, specifically the
exceedingly high branch cost of 6 required for the `-mmovcc' variant
despite that the final branchless sequence only uses 4 instructions, the
missed conversion at -O1 for 32-bit targets even though code is machine
word size agnostic, and the missed conversion at -Os and -Oz for 32-bit
Zicond targets even though the branchless sequence would be shorter than
the branched one. These will have to be handled separately.

gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p):
Also handle sign extension.

gcc/testsuite/
* gcc.target/riscv/cset-sext-sfb.c: New test.
* gcc.target/riscv/cset-sext-thead.c: New test.
* gcc.target/riscv/cset-sext-ventana.c: New test.
* gcc.target/riscv/cset-sext-zicond.c: New test.
* gcc.target/riscv/cset-sext.c: New test.

testsuite: Add testcase for already fixed PR [PR112734]

This test was already fixed by r14-6051 aka PR112770 fix.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/112734
* gcc.dg/bitint-64.c: New test.

aarch64: Make ldp/stp pass off by default

As discussed on IRC, this makes the aarch64 ldp/stp pass off by default. This
should stabilize the trunk and give some time to address the P1 regressions.

gcc/ChangeLog:

* config/aarch64/aarch64.opt (-mearly-ldp-fusion): Set default
to 0.
(-mlate-ldp-fusion): Likewise.

middle-end: correctly identify the edge taken when condition is true. [PR113287]

The vectorizer needs to know during early break vectorization whether the edge
that will be taken if the condition is true stays or leaves the loop.

This is because the code assumes that if you take the true branch you exit the
loop. If you don't exit the loop it has to generate a different condition.

Basically it uses this information to decide whether it's generating a
"any element" or an "all element" check.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues with --enable-lto --with-build-config=bootstrap-O3
--enable-checking=release,yes,rtl,extra.

gcc/ChangeLog:

PR tree-optimization/113287
* tree-vect-stmts.cc (vectorizable_early_exit): Check the flags on edge
instead of using BRANCH_EDGE to determine true edge.

gcc/testsuite/ChangeLog:

PR tree-optimization/113287
* gcc.dg/vect/vect-early-break_100-pr113287.c: New test.
* gcc.dg/vect/vect-early-break_99-pr113287.c: New test.

tree-optimization/113078 - conditional subtraction reduction vectorization

When if-conversion was changed to use .COND_ADD/SUB for conditional
reduction it was forgotten to update reduction path handling to
canonicalize .COND_SUB to .COND_ADD for vectorizable_reduction
similar to what we do for MINUS_EXPR. The following adds this
and testcases exercising this at runtime and looking for the
appropriate masked subtraction in the vectorized code on x86.

PR tree-optimization/113078
* tree-vect-loop.cc (check_reduction_path): Canonicalize
.COND_SUB to .COND_ADD.

* gcc.dg/vect/vect-reduc-cond-sub.c: New testcase.
* gcc.target/i386/vect-pr113078.c: Likewise.
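
For readers unfamiliar with the loop shape involved in the conditional
subtraction commit above, the following is a minimal sketch. The function
names are hypothetical. The second function is only an assumed source-level
illustration of why a subtraction reduction can be treated like an addition
reduction; it does not reproduce GCC's internal representation or the new
testcases.

```c
#include <stddef.h>

/* Conditional subtraction reduction: the accumulator is only updated
   for elements whose condition is true.  */
int cond_sub_reduce(const int *a, const int *cond, size_t n, int init)
{
  int sum = init;
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      sum -= a[i];
  return sum;
}

/* The same reduction written as a conditional addition of the negated
   operand.  Assumes no element equals INT_MIN, so negation is safe.  */
int cond_add_reduce(const int *a, const int *cond, size_t n, int init)
{
  int sum = init;
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      sum += -a[i];
  return sum;
}
```

Both functions return the same value for inputs that do not overflow.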

c++ frontend: initialize ivdep value

Should control enter the switch from one of the cases other than
the IVDEP one then the variable remains uninitialized.

This fixes it by initializing it to false.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_pragma): Initialize to false.
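
The class of bug fixed by the ivdep commit above can be sketched as follows.
This is a hypothetical stand-alone example: the enum, its enumerators and the
function name are invented for illustration and are not the code in parser.cc.

```c
#include <stdbool.h>

/* Hypothetical pragma kinds standing in for the cases of the switch.  */
enum pragma_kind { PRAGMA_IVDEP, PRAGMA_UNROLL, PRAGMA_NOVECTOR };

bool pragma_sets_ivdep(enum pragma_kind kind)
{
  /* Initialize up front: only one case below assigns the flag, so
     without this the other paths would read an indeterminate value.  */
  bool ivdep = false;

  switch (kind)
    {
    case PRAGMA_IVDEP:
      ivdep = true;
      break;
    case PRAGMA_UNROLL:
    case PRAGMA_NOVECTOR:
      /* These paths never touch ivdep.  */
      break;
    }
  return ivdep;
}
```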

gcc-urlifier: handle option prefixes such as '-fno-'

Given e.g. this misspelled option (omitting the trailing 's'):
$ LANG=C ./xgcc -B. -fno-inline-small-function
xgcc: error: unrecognized command-line option '-fno-inline-small-function'; did you mean '-fno-inline-small-functions'?

we weren't providing a documentation URL for the suggestion.

The issue is the URLification code uses find_opt, which doesn't consider
the various '-fno-' prefixes.

This patch adds a way to find the pertinent prefix remapping and uses it
when determining URLs.
With this patch, the suggestion '-fno-inline-small-functions' now gets a
documentation link (to that of '-finline-small-functions').

gcc/ChangeLog:
* gcc-urlifier.cc (gcc_urlifier::get_url_suffix_for_option):
Handle prefix mappings before calling find_opt.
(selftest::gcc_urlifier_cc_tests): Add example of urlifying a
"-fno-"-prefixed command-line option.
* opts-common.cc (get_option_prefix_remapping): New.
* opts.h (get_option_prefix_remapping): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
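
The idea of remapping a '-fno-' prefix before looking up documentation, as
described in the gcc-urlifier commit above, can be sketched roughly as
follows. This is not the GCC implementation: remap_negative_prefix is a
hypothetical helper, and it handles only the single '-fno-' prefix, whereas
the commit mentions various prefixes and the real get_option_prefix_remapping
interface is not reproduced here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite "-fno-foo" as "-ffoo" into OUT so that a
   documentation lookup can use the positive form.  Other options are
   copied unchanged.  Returns 1 if a remapping was applied, else 0.  */
int remap_negative_prefix(const char *opt, char *out, size_t out_size)
{
  static const char neg[] = "-fno-";
  const size_t neg_len = sizeof neg - 1;

  if (strncmp(opt, neg, neg_len) == 0)
    {
      snprintf(out, out_size, "-f%s", opt + neg_len);
      return 1;
    }
  snprintf(out, out_size, "%s", opt);
  return 0;
}
```

With this sketch, "-fno-inline-small-functions" maps to
"-finline-small-functions", matching the documentation target named in the
commit message.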

pretty-print: support urlification in phase 3

TL;DR: for the case when the user misspells a command-line option
and we suggest one, with this patch we now provide a documentation URL
for the suggestion.

In r14-5118-gc5db4d8ba5f3de I added a mechanism to automatically add
URLs to quoted strings in diagnostics, and in r14-6920-g9e49746da303b8
through r14-6923-g4ded42c2c5a5c9 wired this up so that any time
we mention a command-line option in a diagnostic message in quotes,
the user gets a URL to the HTML documentation for that option.

However this only worked for quoted strings that were fully within
a single "chunk" within the pretty-printer implementation, such as:

* "%<-foption%>" (handled in phase 1)
* "%qs", "-foption" (handled in phase 2)

but not where the quoted string straddled multiple chunks, in
particular for this important case in gcc.cc:

  error ("unrecognized command-line option %<-%s%>;"
         " did you mean %<-%s%>?",
         switches[i].part1, hint);

e.g. for:
$ LANG=C ./xgcc -B. -finling-small-functions
xgcc: error: unrecognized command-line option '-finling-small-functions'; did you mean '-finline-small-functions'?

which within pp_format becomes these chunks:

* chunk 0: "unrecognized command-line option `-"
* chunk 1: switches[i].part1  (e.g. "finling-small-functions")
* chunk 2: "'; did you mean `-"
* chunk 3: hint (e.g. "finline-small-functions")
* chunk 4: "'?"

where the first quoted run is in chunks 0-2 and the second in
chunks 2-4.

Hence we were not attempting to provide a URL for the two quoted runs,
and, in particular not for the hint.

This patch refactors the urlification mechanism in pretty-print.cc so
that it checks for quoted runs that appear in phase 3 (as well as in
phases 1 and 2, as before).  With this, the quoted text runs
"-finling-small-functions" and "-finline-small-functions" are passed
to the urlifier, which successfully finds a documentation URL for
the latter.

As before, the urlification code is only run if the URL escapes are
enabled, and only for messages from diagnostic.cc (error, warn, inform,
etc), not for all pretty_printer usage.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::report_diagnostic): Pass
m_urlifier to pp_output_formatted_text.
* pretty-print.cc: Add #define of INCLUDE_VECTOR.
(obstack_append_string): New overload, taking a length.
(urlify_quoted_string): Pass in an obstack ptr, rather than using
that of the pp's buffer.  Generalize to handle trailing text in
the buffer beyond the run of quoted text.
(class quoting_info): New.
(on_begin_quote): New.
(on_end_quote): New.
(pp_format): Refactor phase 1 and phase 2 quoting support, moving
it to calls to on_begin_quote and on_end_quote.
(struct auto_obstack): New.
(quoting_info::handle_phase_3): New.
(pp_output_formatted_text): Add urlifier param.  Use it if there
is deferred urlification.  Delete m_quotes.
(selftest::pp_printf_with_urlifier): Pass urlifier to
pp_output_formatted_text.
(selftest::test_urlification): Update results for the existing
case of quoted text straddling chunks; add more such test cases.
* pretty-print.h (class quoting_info): New forward decl.
(chunk_info::m_quotes): New field.
(pp_output_formatted_text): Add optional urlifier param.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

pretty-print: add selftest coverage for numbered args

No functional change intended.

gcc/ChangeLog:
* pretty-print.cc (selftest::test_pp_format): Add selftest
coverage for numbered args.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

OpenMP: Fix g++.dg/gomp/bad-array-section-10.C for C++23 and up

This patch adjusts diagnostic output for C++23 and above for the test
case mentioned in the commit title.

2024-01-10 Julian Brown <julian@codesourcery.com>

gcc/testsuite/
* g++.dg/gomp/bad-array-section-10.C: Adjust diagnostics for C++23 and
up.

OpenMP: Fix new lvalue-parsing map/to/from tests for 32-bit targets

This patch fixes several tests introduced by the commit
r14-7033-g1413af02d62182 for 32-bit targets.

2024-01-10 Julian Brown <julian@codesourcery.com>

gcc/testsuite/
* g++.dg/gomp/array-section-1.C: Fix scan output for 32-bit target.
* g++.dg/gomp/array-section-2.C: Likewise.
* g++.dg/gomp/bad-array-section-4.C: Adjust error output for 32-bit
target.

middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

When we peel at_exit we are moving the new loop at the exit of the previous
loop.  This means that the blocks outside the loop that the previous loop used to
dominate are no longer being dominated by it.

The new dominators however are hard to predict since if the loop has multiple
exits and all the exits are an "early" one then we always execute the scalar
loop.  In this case the scalar loop can completely dominate the new loop.

If we later have skip_vector then there's an additional skip edge added that
might change the dominators.

The previous patch would force an update of all blocks reachable from the new
exits.  This one updates *only* blocks that we know the scalar exits dominated.

For the examples this reduces the blocks to update from 18 to 3.

gcc/ChangeLog:

PR tree-optimization/113144
PR tree-optimization/113145
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Update all BB that the original exits dominated.

gcc/testsuite/ChangeLog:

PR tree-optimization/113144
PR tree-optimization/113145
* gcc.dg/vect/vect-early-break_94-pr113144.c: New test.

testsuite: Fix PR number [PR113297]

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113297
* gcc.dg/bitint-63.c: Fix PR number.

libgomp: Fix up FLOCK fallback handling [PR113192]

My earlier change broke Solaris testing, because @FLOCK@ isn't substituted
just into libgomp/Makefile where it worked, but also the
testsuite/libgomp-site-extra.exp file where Make variables aren't present
and can't be substituted.

The following patch instead computes the absolute srcdir path and uses it
for FLOCK.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR libgomp/113192
* configure.ac (FLOCK): Use $libgomp_abs_srcdir/testsuite/flock
instead of \$(abs_top_srcdir)/testsuite/flock.
* configure: Regenerated.

Fix debug info for enumeration types with reverse Scalar_Storage_Order

This implements the support of DW_AT_endianity for enumeration types because
they are scalar and therefore, reverse Scalar_Storage_Order is supported for
them, but only when the -gstrict-dwarf switch is not passed because this is
an extension.

There is an associated GDB patch to be submitted to grok the new DWARF.

gcc/
* dwarf2out.cc (modified_type_die): Extend the support of reverse
storage order to enumeration types if -gstrict-dwarf is not passed.
(gen_enumeration_type_die): Add REVERSE parameter and generate the
DIE immediately after the existing one if it is true.
(gen_tagged_type_die): Add REVERSE parameter and pass it in the
call to gen_enumeration_type_die.
(gen_type_die_with_usage): Add REVERSE parameter and pass it in the
first recursive call as well as the call to gen_tagged_type_die.
(gen_type_die): Add REVERSE parameter and pass it in the call to
gen_type_die_with_usage.

LoongArch: testsuite: Add loongarch support to slp-21.c.

The function of this test is to check that the compiler supports vectorization
using SLP and vec_{load/store/*}_lanes. However, vec_{load/store/*}_lanes are
not supported on LoongArch, which has no counterpart to, for example, the
"st4/ld4" instructions on aarch64.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-21.c: Add loongarch.

LoongArch: testsuite: Fixed a bug that added a target check error.

After the code is committed in r14-6948, GCC regression testing on some
architectures will produce the following error:

"error executing dg-final: unknown effective target keyword `loongarch*-*-*'"

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Removed an issue with "target keyword"
checking errors on LoongArch architecture.

sra: Partial fix for BITINT_TYPEs [PR113120]

As changed in other parts of the compiler, using
build_nonstandard_integer_type is not appropriate for arbitrary precisions,
especially if the precision comes from a BITINT_TYPE or something based on
that, build_nonstandard_integer_type relies on some integral mode being
supported that can support the precision.

The following patch uses build_bitint_type instead for BITINT_TYPE
precisions.

Note, it would be good if we were able to punt on the optimization
(but this code doesn't seem to be able to punt, so it needs to be done
somewhere earlier) at least in cases where building it would be invalid.
E.g. right now BITINT_TYPE can support precisions up to 65535 (inclusive),
but 65536 will not work anymore (we can't have > 16-bit TYPE_PRECISION).
I've tried to replace 513 with 65532 in the testcase and it didn't ICE,
so maybe it ran into some other SRA limit.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113120
* tree-sra.cc (analyze_access_subtree): For BITINT_TYPE
with root->size TYPE_PRECISION don't build anything new.
Otherwise, if root->type is a BITINT_TYPE, use build_bitint_type
rather than build_nonstandard_integer_type.

* gcc.dg/bitint-63.c: New test.

i386: [APX] Document inline asm behavior and new switch for APX

For APX, the inline asm behavior was not mentioned in any document
before. Add description for it.

gcc/ChangeLog:

* config/i386/i386.opt: Adjust document.
* doc/invoke.texi: Add description for
-mapx-inline-asm-use-gpr32.

RISC-V: Refine unsigned avg_floor/avg_ceil

This patch is inspired by LLVM patches:
https://github.com/llvm/llvm-project/pull/76550
https://github.com/llvm/llvm-project/pull/77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        csrwi   vxrm,0
        vle8.v  v1,0(a1)
        vle8.v  v2,0(a2)
        vaaddu.vv       v1,v1,v2
        vse8.v  v1,0(a0)
        ret

Note on signed averaging addition

Based on the RVV spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter which rounding mode is used, we cannot achieve the semantics of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
https://github.com/riscv/riscv-v-spec/issues/935
https://github.com/riscv/riscv-v-spec/issues/934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
(avg<v_double_trunc>3_floor): New pattern.
(<u>avg<v_double_trunc>3_ceil): Remove.
(avg<v_double_trunc>3_ceil): New pattern.
(uavg<mode>3_floor): Ditto.
(uavg<mode>3_ceil): Ditto.
* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
(enum insn_type): Ditto.
* config/riscv/riscv-v.cc: Ditto.
* config/riscv/vector-iterators.md (ashiftrt): Remove.
(ASHIFTRT): Ditto.
* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
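
The per-element arithmetic behind unsigned avg_floor/avg_ceil, as discussed
in the commit above, can be sketched in scalar C. The function names are
hypothetical. The sketch mirrors the "before" sequence (widening add,
optional add of 1, narrowing shift right by 1) and makes no claim about the
exact code in the adapted tests.

```c
#include <stdint.h>

/* Unsigned average rounding down: widen, add, shift right by one.  */
uint8_t avg_floor_u8(uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + b) >> 1);
}

/* Unsigned average rounding up: widen, add, add 1, shift right by one.  */
uint8_t avg_ceil_u8(uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + b + 1) >> 1);
}
```

Widening before the addition is what keeps the intermediate sum from
wrapping at 8 bits.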

testsuite, rs6000: Adjust pcrel-sibcall-1.c with noipa [PR112751]

As PR112751 shows, commit r14-5628 caused pcrel-sibcall-1.c
to fail as it enables ipa-vrp which makes return values of
functions {x,y,xx} as known and propagated. This patch is
to adjust it with noipa to make it not fragile.

PR testsuite/112751

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-sibcall-1.c: Replace noinline with noipa.

rs6000: Eliminate zext fed by vclzlsbb [PR111480]

As PR111480 shows, commit r14-4079 only optimizes the case
of vctzlsbb but not for the similar vclzlsbb. This patch
is to consider vclzlsbb as well and avoid the failure on
the reported test case. It also simplifies the patterns
with iterator and attribute.

PR target/111480

gcc/ChangeLog:

* config/rs6000/vsx.md (VCZLSBB): New int iterator.
(vczlsbb_char): New int attribute.
(vclzlsbb_<mode>, vctzlsbb_<mode>): Merge to ...
(vc<vczlsbb_char>zlsbb_<mode>): ... this.
(*vctzlsbb_zext_<mode>): Rename to ...
(*vc<vczlsbb_char>zlsbb_zext_<mode>): ... this, and extend it to
cover vclzlsbb.

rs6000: Make copysign (x, -1) back to -abs (x) for IEEE128 float [PR112606]

I noticed that commit r14-6192 can't help PR112606 #c3 as
it only takes care of SF/DF but TF/KF can still suffer the
issue. Similar to commit r14-6192, this patch is to take
care of copysign<mode>3 with IEEE128 as well.

PR target/112606

gcc/ChangeLog:

* config/rs6000/rs6000.md (copysign<mode>3 IEEE128): Change predicate
of the last argument from altivec_register_operand to any_operand. If
operands[2] is CONST_DOUBLE, emit abs or neg abs depending on its sign
otherwise if it doesn't satisfy altivec_register_operand, force it to
REG using copy_to_mode_reg.
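
The equivalence of copysign (x, -1) and -abs (x) that the commit above relies
on can be checked at the bit level. The sketch below is an illustration under
stated assumptions: it uses double rather than an IEEE128 type for
portability, assumes double is IEEE 754 binary64, and all function names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_BIT (UINT64_C(1) << 63)

/* Reinterpret a double as its 64-bit pattern and back.  */
static uint64_t to_bits(double x)
{
  uint64_t u;
  memcpy(&u, &x, sizeof u);
  return u;
}

static double from_bits(uint64_t u)
{
  double x;
  memcpy(&x, &u, sizeof x);
  return x;
}

/* copysign (x, y): magnitude of x combined with the sign bit of y.  */
double copysign_bits(double x, double y)
{
  return from_bits((to_bits(x) & ~SIGN_BIT) | (to_bits(y) & SIGN_BIT));
}

/* -abs (x): clear the sign bit (abs), then set it (negate).  */
double neg_abs_bits(double x)
{
  return from_bits((to_bits(x) & ~SIGN_BIT) | SIGN_BIT);
}
```

Because the sign bit of -1 is set, copysign_bits (x, -1.0) and
neg_abs_bits (x) produce identical bit patterns for every x, including
zeros and NaNs.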

strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

As PR113100 shows, the unbiasing introduced by r14-6737 can
cause the scrubbing to overrun and clobber critical data on the
stack, such as the saved TOC base, and consequently cause a segfault.

By checking PR112917, IMHO we should keep this unbiasing
guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_ARCH64 &&
TARGET_STACK_BIAS), similar to some existing code special
treating SPARC stack bias.

PR middle-end/113100

gcc/ChangeLog:

* builtins.cc (expand_builtin_stack_address): Guard stack point
adjustment with SPARC_STACK_BOUNDARY_HACK.

LoongArch: Simplify -mexplicit-reloc definitions

Since we do not need printing or manual parsing of this option,
(whether in the driver or for target attributes to be supported later)
it can be handled in the .opt file framework.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Remove explicit-reloc
argument string definitions.
* config/loongarch/loongarch-str.h: Same.
* config/loongarch/genopts/loongarch.opt.in: Mark -m[no-]explicit-relocs
as aliases to -mexplicit-relocs={always,none}
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc: Same.

LoongArch: Use enums for constants

Target features constants from loongarch-def.h are currently defined as macros.
Switch to enums for better look in the debugger.

gcc/ChangeLog:

* config/loongarch/loongarch-def.h: Define constants with
enums instead of Macros.

LoongArch: Rename ISA_BASE_LA64V100 to ISA_BASE_LA64

LoongArch ISA manual v1.10 suggests that software should not depend on
the ISA version number for marking processor features. The ISA version
number is now defined as a collective name of individual ISA evolutions.
Since there is an independent ISA evolution mask now, we can drop the
version information from the base ISA.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Rename.
* config/loongarch/genopts/loongarch.opt.in: Same.
* config/loongarch/loongarch-cpu.cc: Same.
* config/loongarch/loongarch-def.cc: Same.
* config/loongarch/loongarch-def.h: Same.
* config/loongarch/loongarch-opts.cc: Same.
* config/loongarch/loongarch-opts.h: Same.
* config/loongarch/loongarch-str.h: Same.
* config/loongarch/loongarch.opt: Same.

LoongArch: Handle ISA evolution switches along with other options

gcc/ChangeLog:

* config/loongarch/genopts/genstr.sh: Prepend the isa_evolution
variable with the common la_ prefix.
* config/loongarch/genopts/loongarch.opt.in: Mark ISA evolution
flags as saved using TargetVariable.
* config/loongarch/loongarch.opt: Same.
* config/loongarch/loongarch-def.h: Define evolution_set to
mark changes to the -march default.
* config/loongarch/loongarch-driver.cc: Same.
* config/loongarch/loongarch-opts.cc: Same.
* config/loongarch/loongarch-opts.h: Define and use ISA evolution
conditions around the la_target structure.
* config/loongarch/loongarch.cc: Same.
* config/loongarch/loongarch.md: Same.
* config/loongarch/loongarch-builtins.cc: Same.
* config/loongarch/loongarch-c.cc: Same.
* config/loongarch/lasx.md: Same.
* config/loongarch/lsx.md: Same.
* config/loongarch/sync.md: Same.

RISC-V: Robustify dynamic lmul test

While working on refining the cost model, I noticed this test will generate unexpected
scalar xor instructions if we don't tune the cost model carefully.

Add more assembler checks to avoid future regressions.

Committed.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Add assembler-not check.

Daily bump.

libstdc++: Fix Unicode property detection functions

Fix some copy & pasted logic in __is_extended_pictographic. This
function should yield false for the values before the first edge, not
true. Also add a missing boundary condition check in __incb_property.

Also fix an off-by-one error in _Utf_iterator::operator++() that would
make dereferencing a past-the-end iterator undefined (where the intended
design is that the iterator is always incrementable and dereferenceable,
for better memory safety).

Also simplify the grapheme view iterator, which still contained some
remnants of an earlier design I was experimenting with.

Slightly tweak the gen_libstdcxx_unicode_data.py script so that the
_Gcb_property enumerators are in the order we encounter them in the data
file, instead of sorting them alphabetically. Start with the "Other"
property at value 0, because that's the default property for anything
not in the file. This makes no practical difference, but seems cleaner.
It causes the values in the __gcb_edges table to change, so can only be
done now before anybody is using this code yet. The enumerator values
and table entries become ABI artefacts for the function using them.

contrib/ChangeLog:

* unicode/gen_libstdcxx_unicode_data.py: Print out Gcb_property
enumerators in the order they're seen, not alphabetical order.

libstdc++-v3/ChangeLog:

* include/bits/unicode-data.h: Regenerate.
* include/bits/unicode.h (_Utf_iterator::operator++()): Fix off
by one error.
(__incb_property): Add missing check for values before the
first edge.
(__is_extended_pictographic): Invert return values to fix
copy&pasted logic.
(_Grapheme_cluster_view::_Iterator): Remove second iterator
member and find end of cluster lazily.
* testsuite/ext/unicode/grapheme_view.cc: New test.
* testsuite/ext/unicode/properties.cc: New test.
* testsuite/ext/unicode/view.cc: New test.

Fix spurious match in extract_symvers

Tighten the regex to find the start of the .dynsym symtab in the readelf
output to avoid matching the section symbol in the normal symtab.
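
The effect of the tighter pattern can be sketched with std::regex. Both patterns and the two sample lines below are assumptions modelled on typical readelf symbol-table output; they are not copied from extract_symvers.in, which is a shell script.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Loose pattern (assumed): any line that mentions .dynsym.
inline bool loose_match(const std::string& line)
{
  static const std::regex re(R"(\.dynsym)");
  return std::regex_search(line, re);
}

// Tighter pattern (assumed): .dynsym followed by a colon at the very end
// of the line, as in the header of the dynamic symbol table.
inline bool tight_match(const std::string& line)
{
  static const std::regex re(R"(\.dynsym.*:$)");
  return std::regex_search(line, re);
}
```

A header line such as "Symbol table '.dynsym' contains 12 entries:" matches both patterns, while a section symbol line that merely ends in ".dynsym" matches only the loose one.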

libstdc++-v3:
* scripts/extract_symvers.in: Require final colon to only match
.dynsym in the header of the dynamic symtab.

c++: adjust accessor fixits for explicit object parm

In a couple of places in the xobj patch I noticed that is_this_parameter
probably wanted to change to is_object_parameter; this implements that and
does the additional adjustments needed to make the accessor fixits handle
xobj parms.

gcc/cp/ChangeLog:

* semantics.cc (is_object_parameter): New.
* cp-tree.h (is_object_parameter): Declare.
* call.cc (maybe_warn_class_memaccess): Use it.
* search.cc (field_access_p): Use it.
(class_of_object_parm): New.
(field_accessor_p): Adjust for explicit object parms.

gcc/testsuite/ChangeLog:

* g++.dg/torture/accessor-fixits-9-xobj.C: New test.

c++: explicit object cleanups

The FIXME in xobj_iobj_parameters_correspond was due to expecting
TYPE_MAIN_VARIANT to be the same for all equivalent types, which is not the
case. And I adjusted some comments that I disagree with; the iobj parameter
adjustment only applies to overload resolution, we can handle that in
cand_parms_match (and I have WIP for that).

gcc/cp/ChangeLog:

* call.cc (build_over_call): Refactor handle_arg lambda.
* class.cc (xobj_iobj_parameters_correspond): Fix FIXME.
* method.cc (defaulted_late_check): Adjust comments.

c++: P0847R7 (deducing this) - CWG2586 [PR102609]

This adds support for defaulted comparison operators and copy/move
assignment operators, as well as allowing user defined xobj copy/move
assignment operators. It turns out defaulted comparison operators already
worked though, so this just adds a test for them. Defaulted assignment
operators were not so nice and required a bit of a hack. Should work fine
though!

The diagnostics leave something to be desired, and there are some things
that could be improved with more extensive design changes. There are a few
notes left indicating where I think we could make improvements.

Aside from some small bugs, with this commit xobj member functions should be
feature complete.

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - CWG2586.
* decl.cc (copy_fn_p): Accept xobj copy assignment functions.
(move_signature_fn_p): Accept xobj move assignment functions.
* method.cc (do_build_copy_assign): Handle defaulted xobj member
functions.
(defaulted_late_check): Comment.
(defaultable_fn_check): Comment.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - CWG2586.
* g++.dg/cpp23/explicit-obj-basic6.C: New test.
* g++.dg/cpp23/explicit-obj-default1.C: New test.
* g++.dg/cpp23/explicit-obj-default2.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - xobj lambdas. [PR102609]

This implements support for xobj lambdas.  There are extensive tests
included, but not exhaustive.  Dependent lambdas should work and have been
tested lightly, but we need more exhaustive tests for them.
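
As an illustration of what an xobj lambda allows, here is a small sketch of a recursive lambda. The function name factorial and the pre-C++23 fallback branch are assumptions added so the sketch also compiles where explicit object parameters are unavailable; it is not taken from the testsuite.

```cpp
#include <cassert>

// Computes n! with a recursive lambda.
inline unsigned long factorial(unsigned n)
{
#if defined(__cpp_explicit_this_parameter) && __cpp_explicit_this_parameter >= 202110L
  // C++23: the lambda refers to itself through its explicit object
  // parameter, so no helper argument is needed.
  auto fact = [](this auto const& self, unsigned k) -> unsigned long {
    return k <= 1 ? 1UL : k * self(k - 1);
  };
  return fact(n);
#else
  // Fallback for older dialects: pass the lambda to itself explicitly.
  auto fact = [](auto const& self, unsigned k) -> unsigned long {
    return k <= 1 ? 1UL : k * self(self, k - 1);
  };
  return fact(fact, n);
#endif
}
```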

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - xobj lambdas.
* lambda.cc (build_capture_proxy): Don't fold direct object types.
* parser.cc (cp_parser_lambda_declarator_opt): Handle xobj lambdas,
diagnostics.  Comments also updated.
* pt.cc (tsubst_function_decl): Handle xobj lambdas.  Check object
type of xobj lambda call operator, diagnose incorrect types.
(tsubst_lambda_expr): Update comment.
* semantics.cc (finish_decltype_type): Also consider by-value object
parameter qualifications.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - xobj lambdas.
* g++.dg/cpp23/explicit-obj-diagnostics8.C: New test.
* g++.dg/cpp23/explicit-obj-lambda1.C: New test.
* g++.dg/cpp23/explicit-obj-lambda10.C: New test.
* g++.dg/cpp23/explicit-obj-lambda11.C: New test.
* g++.dg/cpp23/explicit-obj-lambda12.C: New test.
* g++.dg/cpp23/explicit-obj-lambda13.C: New test.
* g++.dg/cpp23/explicit-obj-lambda2.C: New test.
* g++.dg/cpp23/explicit-obj-lambda3.C: New test.
* g++.dg/cpp23/explicit-obj-lambda4.C: New test.
* g++.dg/cpp23/explicit-obj-lambda5.C: New test.
* g++.dg/cpp23/explicit-obj-lambda6.C: New test.
* g++.dg/cpp23/explicit-obj-lambda7.C: New test.
* g++.dg/cpp23/explicit-obj-lambda8.C: New test.
* g++.dg/cpp23/explicit-obj-lambda9.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - diagnostics. [PR102609]

Diagnostics for xobj member functions. Also includes some diagnostics for
xobj lambdas which are not implemented here. CWG2554 is also implemented
here: we explicitly error when an xobj member function overrides a virtual
function.

PR c++/102609

gcc/c-family/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* c-cppbuiltin.cc (c_cpp_builtins): Define
__cpp_explicit_this_parameter=202110L feature test macro.
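
User code can test for the feature with the macro in the usual way. A minimal sketch follows; the names has_explicit_this and explicit_this_value are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Record whether the compiler advertises explicit object parameters,
// and with which value, so that ordinary code can branch on it.
#ifdef __cpp_explicit_this_parameter
constexpr bool has_explicit_this = true;
constexpr long explicit_this_value = __cpp_explicit_this_parameter;
#else
constexpr bool has_explicit_this = false;
constexpr long explicit_this_value = 0L;
#endif
```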

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* class.cc (resolve_address_of_overloaded_function): Diagnostics.
* cp-tree.h (TFF_XOBJ_FUNC): Define.
* decl.cc (grokfndecl): Diagnostics.
(grokdeclarator): Diagnostics.
* error.cc (dump_aggr_type): Pass TFF_XOBJ_FUNC.
(dump_lambda_function): Formatting for xobj lambda.
(dump_function_decl): Pass TFF_XOBJ_FUNC.
(dump_parameters): Formatting for xobj member functions.
(function_category): Formatting for xobj member functions.
* parser.cc (cp_parser_decl_specifier_seq): Diagnostics.
(cp_parser_parameter_declaration): Diagnostics.
* search.cc (look_for_overrides_here): Make xobj member functions
override.
(look_for_overrides_r): Reject an overriding xobj member function
and diagnose it.
* semantics.cc (finish_this_expr): Diagnostics.
* typeck.cc (cp_build_addr_expr_1): Diagnostics.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* g++.dg/cpp23/feat-cxx2b.C: Test existence and value of
__cpp_explicit_this_parameter feature test macro.
* g++.dg/cpp26/feat-cxx26.C: Likewise.
* g++.dg/cpp23/explicit-obj-cxx-dialect-A.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-B.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-C.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-D.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-E.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics1.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics2.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics3.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics4.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics5.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics6.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics7.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - initial functionality. [PR102609]

This implements the initial functionality for P0847R7.  CWG2789 is
implemented, but instead of "same type" for the object parameters we take
correspondence into account.  Without this alteration, the behavior
here would be slightly different than the behavior with constrained member
function templates, which I believe would be undesirable.

There are a few outstanding issues related to xobj member functions
overloading iobj member functions that are introduced by using declarations.
Unfortunately, fixing this will be a little more involved and will have to
be pushed back until later.

Most diagnostics have been split out into another patch to improve its
clarity and allow all the strictly functional changes to be more
distinct. Explicit object lambdas and CWG2586 are addressed in a follow up
patch.
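
For readers unfamiliar with the feature, a small sketch of an explicit object member function may help. The Widget type and get_name accessor are hypothetical, and the fallback branch with three ref-qualified overloads is included only so the sketch compiles where the feature is unavailable.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {
  std::string name;

#if defined(__cpp_explicit_this_parameter) && __cpp_explicit_this_parameter >= 202110L
  // One explicit object member function template: Self is deduced from
  // the value category and constness of the object expression.
  template <typename Self>
  auto&& get_name(this Self&& self) { return std::forward<Self>(self).name; }
#else
  // Equivalent set of implicit object member functions for older dialects.
  std::string& get_name() & { return name; }
  const std::string& get_name() const& { return name; }
  std::string&& get_name() && { return std::move(name); }
#endif
};
```

Either branch lets a caller write w.get_name() = "b" on a non-const lvalue, read through a const object, or move the name out of a temporary.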

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - initial functionality.
* class.cc (xobj_iobj_parameters_correspond): New function, checks
for corresponding object parameters between xobj and iobj member
functions.
(add_method): Handle object parameters of xobj member functions, use
xobj_iobj_parameters_correspond.
* call.cc (build_over_call): Refactor, handle xobj member functions.
(cand_parms_match): Handle object parameters of xobj and iobj member
functions, use xobj_iobj_parameters_correspond.
* cp-tree.h (enum cp_decl_spec): Add ds_this, add comments.
* decl.cc (grokfndecl): Add xobj_func_p parameter.  For xobj member
functions, set xobj_flag, don't set static_function flag.
(grokdeclarator): Handle xobj member functions, tell grokfndecl.
(grok_op_properties): Don't error for xobj operators.
* parser.cc (cp_parser_decl_specifier_seq): Handle this specifier.
(cp_parser_parameter_declaration): Set default argument to
"this_identifier" for xobj parameters.
(set_and_check_decl_spec_loc): Add "this", add comments.
* tree.cc (build_min_non_dep_op_overload): Handle xobj operators.
* typeck.cc (cp_build_addr_expr_1): Handle address-of xobj member
functions.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - initial functionality.
* g++.dg/cpp23/explicit-obj-basic1.C: New test.
* g++.dg/cpp23/explicit-obj-basic2.C: New test.
* g++.dg/cpp23/explicit-obj-basic3.C: New test.
* g++.dg/cpp23/explicit-obj-basic4.C: New test.
* g++.dg/cpp23/explicit-obj-basic5.C: New test.
* g++.dg/cpp23/explicit-obj-by-value1.C: New test.
* g++.dg/cpp23/explicit-obj-by-value2.C: New test.
* g++.dg/cpp23/explicit-obj-by-value3.C: New test.
* g++.dg/cpp23/explicit-obj-by-value4.C: New test.
* g++.dg/cpp23/explicit-obj-constraints.C: New test.
* g++.dg/cpp23/explicit-obj-constraints2.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-arrow.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-assignment.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-call.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-subscript.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem-dep.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem-non-dep.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem.h: New test.
* g++.dg/cpp23/explicit-obj-ops-requires-mem.C: New test.
* g++.dg/cpp23/explicit-obj-ops-requires-non-mem.C: New test.
* g++.dg/cpp23/explicit-obj-redecl.C: New test.
* g++.dg/cpp23/explicit-obj-redecl2.C: New test.
* g++.dg/cpp23/explicit-obj-redecl3.C: New test.
* g++.dg/cpp23/explicit-obj-redecl4.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - prerequisite changes. [PR102609]

Adds the xobj_flag member to lang_decl_fn and a corresponding member access
macro and predicate to support the addition of explicit object member
functions. Additionally, since explicit object member functions are also
non-static member functions, we need to change uses of
DECL_NONSTATIC_MEMBER_FUNCTION_P to clarify whether they intend to include
or exclude them.

PR c++/102609

gcc/cp/ChangeLog:

* cp-tree.h (struct lang_decl_fn): New data member.
(DECL_NONSTATIC_MEMBER_FUNCTION_P): Poison.
(DECL_IOBJ_MEMBER_FUNCTION_P): Define.
(DECL_FUNCTION_XOBJ_FLAG): Define.
(DECL_XOBJ_MEMBER_FUNCTION_P): Define.
(DECL_OBJECT_MEMBER_FUNCTION_P): Define.
(DECL_FUNCTION_MEMBER_P): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.
(DECL_CONST_MEMFUNC_P): Likewise.
(DECL_VOLATILE_MEMFUNC_P): Likewise.
(DECL_NONSTATIC_MEMBER_P): Likewise.
* module.cc (trees_out::lang_decl_bools): Handle xobj_flag.
(trees_in::lang_decl_bools): Handle xobj_flag.
* call.cc (build_this_conversion)
(add_function_candidate)
(add_template_candidate_real)
(add_candidates)
(maybe_warn_class_memaccess)
(cand_parms_match)
(joust)
(do_warn_dangling_reference)
* class.cc (finalize_literal_type_property)
(finish_struct)
(resolve_address_of_overloaded_function)
* constexpr.cc (is_valid_constexpr_fn)
(cxx_bind_parameters_in_call)
* contracts.cc (build_contract_condition_function)
* cp-objcp-common.cc (cp_decl_dwarf_attribute)
* cxx-pretty-print.cc (cxx_pretty_printer::postfix_expression)
(cxx_pretty_printer::declaration_specifiers)
(cxx_pretty_printer::direct_declarator)
* decl.cc (cp_finish_decl)
(grok_special_member_properties)
(start_preparsed_function)
(record_key_method_defined)
* decl2.cc (cp_handle_deprecated_or_unavailable)
* init.cc (find_uninit_fields_r)
(build_offset_ref)
* lambda.cc (lambda_expr_this_capture)
(maybe_generic_this_capture)
(nonlambda_method_basetype)
* mangle.cc (write_nested_name)
* method.cc (early_check_defaulted_comparison)
(skip_artificial_parms_for)
(num_artificial_parms_for)
* pt.cc (is_specialization_of_friend)
(determine_specialization)
(copy_default_args_to_explicit_spec)
(check_explicit_specialization)
(tsubst_contract_attribute)
(check_non_deducible_conversions)
(more_specialized_fn)
(maybe_instantiate_noexcept)
(register_parameter_specializations)
(value_dependent_expression_p)
* search.cc (shared_member_p)
(lookup_member)
(field_access_p)
* semantics.cc (finish_omp_declare_simd_methods)
* tree.cc (lvalue_kind)
* typeck.cc (invalid_nonstatic_memfn_p): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.

libcc1/ChangeLog:

* libcp1plugin.cc (plugin_pragma_push_user_expression): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>
Co-authored-by: Jason Merrill <jason@redhat.com>

[committed] Adding missing prototype for __clzhi2 to xstormy port

xstormy16 has failed since the c99 transition due to a missing prototype for
__clzhi2 in the implementation of stormy16_count_leading_zeros.

This fixes the missing prototype. Pushed to the trunk.

include/
* longlong.h (__stormy16_count_leading_zeros): Add prototype for
__clzhi2.

libstdc++: Simplify some chrono formatters

I don't remember exactly why I made these bits of code reserve space in
a COW string and append to it, rather than just use the string returned
from std::format (which will undergo copy elision). The _Str_sink type
used by std::format means the string only performs a single allocation
for the formatted output, and the returned string's reference count will
be one, so won't reallocate when indexing into it. We can remove these
non-optimizations.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_F): Simplify
handling of string returned from std::format.
(__formatter_chrono::_M_R_T): Likewise.

[committed] Fix minor bug in epiphany port

So I consider this port dead as it semi-randomly fails in reload due to
unrelated changes earlier in the gimple and RTL pipelines.  Regardless Richard
S's late-combine work did show a very obvious error in the port that we should
go ahead and fix as long as the port is in-tree.

The epiphany add-with-immediate instruction allows an 11-bit signed immediate.
That gives the instruction an immediate range of -1024..1023.
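
The arithmetic behind that range can be checked with a short sketch. The helper names below are hypothetical and unrelated to the constraint definition in the port; the code assumes a two's-complement immediate with a width between 1 and 63 bits.

```cpp
#include <cassert>
#include <cstdint>

struct ImmRange { std::int64_t lo, hi; };

// Range of a two's-complement signed immediate that is `bits` wide.
// Assumes 1 <= bits <= 63.
constexpr ImmRange signed_imm_range(unsigned bits)
{
  return ImmRange{ -(std::int64_t{1} << (bits - 1)),
                   (std::int64_t{1} << (bits - 1)) - 1 };
}

// True if v is representable as a signed immediate of the given width.
constexpr bool fits_signed_imm(std::int64_t v, unsigned bits)
{
  return v >= signed_imm_range(bits).lo && v <= signed_imm_range(bits).hi;
}

// An 11-bit signed field covers -1024..1023.
static_assert(signed_imm_range(11).lo == -1024, "lower bound");
static_assert(signed_imm_range(11).hi == 1023, "upper bound");
```

By the same formula, a range of -8192..8191 corresponds to a 14-bit signed field.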

The port actually allowed -8192..8191 due to the uber-weird constraint
definition.  I've simplified the constraint to match the hardware documentation
I was able to find.  That was enough to get the epiphany port to build
libgcc/newlib with Richard S's late-combine work.

The testsuite is so flakey on that port (due to the reload failures) that my
tester doesn't run it.  So no comparisons are available.

gcc/
* config/epiphany/constraints.md (Car): Allow -1024..1023, no more,
no less.

[committed] Fix minor bug on mn103 port

Richard Sandiford debugged a failure on the mn103 port with his late-combine
patches down to the subdi3 pattern not specifying the isa on alternatives which
required newer variants of the chip family.

This patch adds the missing isa attribute and the port now works with his
late-combine patch. I'm pushing this to the trunk on his behalf.

gcc/
* config/mn10300/mn10300.md (subdi3_degenerate): Add isa attribute.

middle-end: removed unused variable in vectorizable_live_operation_1

It looks like the previous patch had an unused variable.
It's odd that my bootstrap didn't catch it (I'm assuming
-Werror is still on for O3 bootstraps) but this fixes it.

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_live_operation_1): Drop unused
restart_loop.
(vectorizable_live_operation): Likewise.