]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
4 months agodiagnostics: reimplement html_token_printer in terms of xml::printer
David Malcolm [Thu, 29 May 2025 20:57:52 +0000 (16:57 -0400)] 
diagnostics: reimplement html_token_printer in terms of xml::printer

No functional change intended.

gcc/ChangeLog:
* diagnostic-format-html.cc
(html_builder::make_element_for_diagnostic::html_token_printer):
Reimplement in terms of xml::printer.
(html_builder::make_element_for_diagnostic): Create an
xml::printer and use it with the html_token_printer.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
4 months agodiagnostics: bulletproof html_builder::make_metadata_element
David Malcolm [Thu, 29 May 2025 20:57:52 +0000 (16:57 -0400)] 
diagnostics: bulletproof html_builder::make_metadata_element

gcc/ChangeLog:
* diagnostic-format-html.cc (html_builder::make_metadata_element):
Gracefully handle the case where "url" is null.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
4 months agodiagnostics: use unique_ptr for m_format_postprocessor
David Malcolm [Thu, 29 May 2025 20:57:52 +0000 (16:57 -0400)] 
diagnostics: use unique_ptr for m_format_postprocessor

No functional change intended.

gcc/cp/ChangeLog:
* error.cc (cxx_format_postprocessor::clone): Update to use
unique_ptr.
(cxx_dump_pretty_printer::cxx_dump_pretty_printer): Likewise.
(cxx_initialize_diagnostics): Likewise.

gcc/ChangeLog:
* pretty-print.cc (pretty_printer::pretty_printer): Use "nullptr"
rather than "NULL".  Remove explicit delete of
m_format_postprocessor.
* pretty-print.h (format_postprocessor::clone): Use unique_ptr.
(pretty_printer::set_format_postprocessor): New.
(pretty_printer::m_format_postprocessor): Use unique_ptr.
(pp_format_postprocessor): Update for use of unique_ptr, removing
reference from return type.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
4 months agolibgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]
Tobias Burnus [Thu, 29 May 2025 20:47:06 +0000 (22:47 +0200)] 
libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]

libgomp/ChangeLog:

PR libgomp/93226
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
prototype.
* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
(gomp_copy_dev2dev): New prototype.
* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
* libgomp.texi (acc_memcpy_device): New.
* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
device boolean; use memcpy not memmove; add early return if
size == 0 or same device + same ptr.
(acc_memcpy_to_device, acc_memcpy_to_device_async,
acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
(acc_memcpy_device, acc_memcpy_device_async): New.
* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
Add interface.
* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
Likewise.
* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
prototype.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
Update comment.
(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
(GOMP_OFFLOAD_dev2dev): Call it.
(GOMP_OFFLOAD_openacc_async_dev2dev): New.
* target.c (gomp_copy_dev2dev): New.
(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.

4 months agoc++: xobj lambda 'this' capture [PR113563]
Jason Merrill [Thu, 29 May 2025 16:36:23 +0000 (12:36 -0400)] 
c++: xobj lambda 'this' capture [PR113563]

Various places were still making assumptions that we could get to the 'this'
capture through current_class_ref in a lambda op(), which is incorrect for
an explicit object op().

PR c++/113563

gcc/cp/ChangeLog:

* lambda.cc (build_capture_proxy): Check pointerness of the
member, not the proxy type.
(lambda_expr_this_capture): Don't assume current_class_ref.
(nonlambda_method_basetype): Likewise.
* semantics.cc (finish_non_static_data_member): Don't assume
TREE_TYPE (object) is set.
(finish_this_expr): Check current_class_type for lambda,
not current_class_ref.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-lambda16.C: New test.

4 months agoc++, coroutines: Make a check more specific [PR109283].
Iain Sandoe [Thu, 29 May 2025 14:45:29 +0000 (15:45 +0100)] 
c++, coroutines: Make a check more specific [PR109283].

The check was intended to assert that we had visited contained
ternary expressions with embedded co_awaits, but had been made
too general - and therefore was ICEing on code that was actually
OK.  Fixed by checking specifically that no co_awaits embedded.

PR c++/109283

gcc/cp/ChangeLog:

* coroutines.cc (find_any_await): Only save the statement
pointer if the caller passes a place for it.
(flatten_await_stmt): When checking that ternary expressions
have been handled, also check that they contain a co_await.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr109283.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
4 months agoc++: C++17 constexpr lambda and goto/static
Jason Merrill [Thu, 29 May 2025 16:21:28 +0000 (12:21 -0400)] 
c++: C++17 constexpr lambda and goto/static

We only want the error for these cases for functions explicitly declared
constexpr, but we still want to set invalid_constexpr on C++17 lambdas so
maybe_save_constexpr_fundef doesn't make them implicitly constexpr.

The potential_constant_expression_1 change isn't necessary for this test,
but still seems correct.

gcc/cp/ChangeLog:

* decl.cc (start_decl): Also set invalid_constexpr
for maybe_constexpr_fn.
* parser.cc (cp_parser_jump_statement): Likewise.
* constexpr.cc (potential_constant_expression_1): Ignore
goto to an artificial label.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-lambda29.C: New test.

4 months agoFortran: Make minor adjustment to error message.
Jerry DeLisle [Thu, 29 May 2025 17:02:00 +0000 (10:02 -0700)] 
Fortran: Make minor adjustment to error message.

PR fortran/120049

gcc/fortran/ChangeLog:

* check.cc(check_c_ptr_2): Rephrase error message
for clarity.

gcc/testsuite/ChangeLog:

* gfortran.dg/c_f_pointer_tests_6.f90: Adjust dg-error
directive.

4 months agoi386: Add x86 FMV symbol tests
Alice Carlotti [Tue, 28 Jan 2025 15:17:33 +0000 (15:17 +0000)] 
i386: Add x86 FMV symbol tests

This is for testing the x86 mangling of FMV versioned function
assembly names.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: New test.
* g++.target/i386/mv-symbols2.C: New test.
* g++.target/i386/mv-symbols3.C: New test.
* g++.target/i386/mv-symbols4.C: New test.
* g++.target/i386/mv-symbols5.C: New test.
* g++.target/i386/mvc-symbols1.C: New test.
* g++.target/i386/mvc-symbols2.C: New test.
* g++.target/i386/mvc-symbols3.C: New test.
* g++.target/i386/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards <alfie.richards@arm.com>
4 months agoppc: Add PowerPC FMV symbol tests.
Alice Carlotti [Tue, 28 Jan 2025 15:16:31 +0000 (15:16 +0000)] 
ppc: Add PowerPC FMV symbol tests.

This tests the mangling of function assembly names when annotated with
target_clones attributes.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/mvc-symbols1.C: New test.
* g++.target/powerpc/mvc-symbols2.C: New test.
* g++.target/powerpc/mvc-symbols3.C: New test.
* g++.target/powerpc/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards <alfie.richards@arm.com>
4 months agoOpenMP: Fix ICE and other issues in C/C++ metadirective error recovery.
Sandra Loosemore [Mon, 26 May 2025 19:21:48 +0000 (19:21 +0000)] 
OpenMP: Fix ICE and other issues in C/C++ metadirective error recovery.

The new testcase included in this patch used to ICE in gcc after
diagnosing the first error, and in g++ it only diagnosed the error in
the first metadirective, ignoring the second one.  The solution is to
make error recovery in the C front end more like that in the C++ front
end, and remove the code in both front ends that previously tried to
skip all the way over the following statement (instead of just to the
end of the metadirective pragma) after an error.

gcc/c/ChangeLog
* c-parser.cc (c_parser_skip_to_closing_brace): New, copied from
the equivalent function in the C++ front end.
(c_parser_skip_to_end_of_block_or_statement): Pass false to
the error flag.
(c_parser_omp_context_selector): Immediately return error_mark_node
after giving an error that the integer trait property is invalid,
similarly to C++ front end.
(c_parser_omp_context_selector_specification): Likewise handle
error return from c_parser_omp_context_selector similarly to C++.
(c_parser_omp_metadirective): Do not call
c_parser_skip_to_end_of_block_or_statement after an error.

gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_metadirective): Do not call
cp_parser_skip_to_end_of_block_or_statement after an error.

gcc/testsuite/ChangeLog
* c-c++-common/gomp/declare-variant-2.c: Adjust patterns now that
C and C++ now behave similarly.
* c-c++-common/gomp/metadirective-error-recovery.c: New.

4 months agoOpenMP: Fix ICE in metadirective recovery after error [PR120180]
Sandra Loosemore [Sat, 24 May 2025 03:21:18 +0000 (03:21 +0000)] 
OpenMP: Fix ICE in metadirective recovery after error [PR120180]

It's not clear whether a metadirective in a loop nest is supposed to
be valid, but GCC certainly shouldn't be ICE'ing after diagnosing it
as an error.

gcc/c/ChangeLog
PR c/120180
* c-parser.cc (c_parser_omp_metadirective): Only consume the
token if it is the expected close paren.

gcc/cp/ChangeLog
PR c/120180
* parser.cc (cp_parser_omp_metadirective): Only consume the
token if it is the expected close paren.

gcc/testsuite/ChangeLog
PR c/120180
* c-c++-common/gomp/pr120180.c: New.

4 months agoc++, coroutines: Delete now unused code for parm guards.
Iain Sandoe [Sun, 25 May 2025 11:14:13 +0000 (12:14 +0100)] 
c++, coroutines: Delete now unused code for parm guards.

Since r16-775-g18df4a10bc9694 we use nested cleanups to
handle parameter copy destructors in the ramp (and pass
a list of cleanups required to the actor which will only
be invoked if the parameter copies were all correctly
built - and therefore does not need to guard destructors
either.

This deletes the provisions for frame parameter copy
destructor guards.

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): No longer
create a parameter copy guard var.
* coroutines.h (struct param_info): Remove the
entry for the parameter copy destructor guard.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
4 months agoc++, coroutines: Fix identification of coroutine ramps [PR120453].
Iain Sandoe [Thu, 29 May 2025 10:00:18 +0000 (11:00 +0100)] 
c++, coroutines: Fix identification of coroutine ramps [PR120453].

The existing implementation, incorrectly, tried to use DECL_RAMP_FN
in check_return_expr to determine if we are handling a ramp func.
However, that query is only set for the resume/destroy functions.

Replace the use of DECL_RAMP_FN with a new query.

PR c++/120453

gcc/cp/ChangeLog:

* cp-tree.h (DECL_RAMP_P): New.
* typeck.cc (check_return_expr): Use DECL_RAMP_P instead
of DECL_RAMP_FN.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr120453.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
4 months agoipa: When inlining, don't combine PT JFs changing signedness (PR120295)
Martin Jambor [Thu, 29 May 2025 14:32:04 +0000 (16:32 +0200)] 
ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

In GCC 15 we allowed jump-function generation code to skip over a
type-cast converting one integer to another as long as the latter can
hold all the values of the former or has at least the same precision.
This works well for IPA-CP where we do then evaluate each jump
function as we propagate values and value-ranges.  However, the
test-case in PR 120295 shows a problem with inlining, where we combine
pass-through jump-functions so that they are always relative to the
function which is the root of the inline tree.  Unfortunately, we are
happy to combine also those with type-casts to a different signedness
which makes us use sign zero extension for the expected value ranges
where we should have used sign extension.  When the value-range which
then leads to wrong insertion of a call to builtin_unreachable is
being computed, the information about an existence of a intermediary
signed type has already been lost during previous inlining.

This patch simply blocks combining such jump-functions so that it is
back-portable to GCC 15.  Once we switch pass-through jump functions
to use a vector of operations rather than having room for just one, we
will be able to address this situation with adding an extra conversion
instead.

gcc/ChangeLog:

2025-05-19  Martin Jambor  <mjambor@suse.cz>

PR ipa/120295
* ipa-prop.cc (update_jump_functions_after_inlining): Do not
combine pass-through jump functions with type-casts changing
signedness.

gcc/testsuite/ChangeLog:

2025-05-19  Martin Jambor  <mjambor@suse.cz>

PR ipa/120295
* gcc.dg/ipa/pr120295.c: New test.

4 months agoipa: Fix whitespace when dumping VR in jump_functions
Martin Jambor [Thu, 29 May 2025 14:32:04 +0000 (16:32 +0200)] 
ipa: Fix whitespace when dumping VR in jump_functions

Lack of white space breakes the tree-visualisation structure and makes
the dump unnecessarily difficult to read.

gcc/ChangeLog:

2025-05-19  Martin Jambor  <mjambor@suse.cz>

* ipa-prop.cc (ipa_dump_jump_function): Fix whitespace when
dumping IPA VRs.

4 months agolibstdc++: Compare keys and values separately in flat_map::operator==
Patrick Palka [Thu, 29 May 2025 14:12:23 +0000 (10:12 -0400)] 
libstdc++: Compare keys and values separately in flat_map::operator==

Instead of effectively doing a zipped comparison of the keys and values,
compare them separately to leverage the underlying containers' optimized
equality implementations.

libstdc++-v3/ChangeLog:

* include/std/flat_map (_Flat_map_impl::operator==): Compare
keys and values separately.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
4 months agolibstdc++: Fix tuple/pair confusion with std::erase_if(flat_map) [PR120465]
Patrick Palka [Thu, 29 May 2025 14:11:57 +0000 (10:11 -0400)] 
libstdc++: Fix tuple/pair confusion with std::erase_if(flat_map) [PR120465]

std::erase_if for flat_map/multimap is implemented via ranges::erase_if
over a zip_view of the keys and values, the value_type of which is a
tuple, but the given predicate needs to be called with a pair (flat_map's
value_type).  So use a projection to convert the tuple into a suitable
pair.

PR libstdc++/120465

libstdc++-v3/ChangeLog:

* include/std/flat_map (_Flat_map_impl::_M_erase_if): Use a
projection with ranges::remove_if to pass a pair instead of
a tuple to the predicate.
* testsuite/23_containers/flat_map/1.cc (test07): Strengthen
to expect the argument passed to the predicate is a pair.
* testsuite/23_containers/flat_multimap/1.cc (test07): Likewise.

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
4 months agolibstdc++: Fix another 17_intro/names.cc failure on AIX
Jonathan Wakely [Tue, 27 May 2025 20:50:24 +0000 (21:50 +0100)] 
libstdc++: Fix another 17_intro/names.cc failure on AIX

FAIL: 17_intro/names.cc  -std=gnu++98 (test for excess errors)

Also fix typo in experimental/names.cc where I did #undef for the wrong
name in r16-901-gd1ced2a5ea6b09.

libstdc++-v3/ChangeLog:

* testsuite/17_intro/names.cc [_AIX] (a): Undefine.
* testsuite/experimental/names.cc [_AIX] (ptr): Undefine.

4 months agolibstdc++: Fix lwg4084.cc test FAIL on AIX
Jonathan Wakely [Wed, 28 May 2025 21:02:58 +0000 (22:02 +0100)] 
libstdc++: Fix lwg4084.cc test FAIL on AIX

On AIX printf formats a quiet NaN as "NaNQ" and it doesn't matter
whether %f or %F is used. Similarly, it always prints "INF" for
infinity, even when %f is used. Adjust a test that currently fails due
to this AIX-specific (and non-conforming) behaviour.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/num_put/put/char/lwg4084.cc [_AIX]: Adjust
expected output for NaN and infinity.

4 months agolibstdc++: Re-enable some XPASS tests for AIX
Jonathan Wakely [Wed, 28 May 2025 22:04:42 +0000 (23:04 +0100)] 
libstdc++: Re-enable some XPASS tests for AIX

The deque shrink_to_fit.cc test always passes on AIX, I think it should
not have been disabled.

The 96088.cc tests pass for C++20 and later (I don't know why) so make
them require C++20, as they fail otherwise.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/deque/capacity/shrink_to_fit.cc:
Remove dg-xfail-run-if for AIX.
* testsuite/23_containers/unordered_map/96088.cc: Replace
dg-xfail-run-if with dg-require-effective-target c++20.
* testsuite/23_containers/unordered_multimap/96088.cc: Likewise.
* testsuite/23_containers/unordered_multiset/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/96088.cc: Likewise.

4 months agoi386: Use Shuffles instead of shifts for Reduction in AMD znver4/5
Pranav Gorantla [Thu, 29 May 2025 13:02:24 +0000 (15:02 +0200)] 
i386: Use Shuffles instead of shifts for Reduction in AMD znver4/5

In AMD znver4, znver5 targets vpshufd, vpsrldq have latencies 1,2 and
throughput 4 (2 for znver4),2 respectively. It is better to generate
shuffles instead of shifts wherever possible. In this patch we try to
generate appropriate shuffle instruction to copy higher half to lower
half instead of a simple right shift during horizontal vector reduction.

gcc/ChangeLog:

* config/i386/i386-expand.cc (emit_reduc_half): Use shuffles to
generate reduc half for V4SI, similar modes.
* config/i386/i386.h (TARGET_SSE_REDUCTION_PREFER_PSHUF): New Macro.
* config/i386/x86-tune.def (X86_TUNE_SSE_REDUCTION_PREFER_PSHUF):
New tuning.

gcc/testsuite/ChangeLog:

* gcc.target/i386/reduc-pshuf.c: New test.

4 months agolibstdc++: Disable -Wlong-long warnings in boost_concept_check.h
Jonathan Wakely [Tue, 20 May 2025 14:14:42 +0000 (15:14 +0100)] 
libstdc++: Disable -Wlong-long warnings in boost_concept_check.h

The _IntegerConcept, _SignedIntegerConcept and _UnsignedIntegerConcept
class template are specialized for long long, which gives warnings with
-Wsystem-headers in C++98 mode.

libstdc++-v3/ChangeLog:

* include/bits/boost_concept_check.h: Disable -Wlong-long
warnings.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error
line number.

4 months agolibstdc++: Document that -std cannot be used in --target_board now
Jonathan Wakely [Tue, 27 May 2025 16:52:37 +0000 (17:52 +0100)] 
libstdc++: Document that -std cannot be used in --target_board now

Only using GLIBCXX_TESTSUITE_STDS or v3_std_list works now.

libstdc++-v3/ChangeLog:

* doc/xml/manual/test.xml: Remove outdated documentation on
testing with -std options in --target_board.
* doc/html/manual/test.html: Regenerate.

4 months agoggc-page: Fix up build on non-USING_MMAP hosts [PR120464]
Jakub Jelinek [Thu, 29 May 2025 07:58:30 +0000 (09:58 +0200)] 
ggc-page: Fix up build on non-USING_MMAP hosts [PR120464]

The r16-852 "Use optimize free lists for alloc_pages" change broke build
on non-USING_MMAP hosts.
I don't have access to one, so I've just added #undef USING_MMAP
before first use of that macro after the definitions.

There were 2 problems.  One was one missed G.free_pages
to free_list->free_pages replacement in #ifdef USING_MALLOC_PAGE_GROUPS
guarded code which resulted in obvious compile error.

Once fixed, there was an ICE during self-test and without self-test pretty
much on any garbage collection.
The problem is that the patch moved all of release_pages into new
do_release_pages and runs it for each freelist from the new release_pages
wrapper.  The #ifdef USING_MALLOC_PAGE_GROUPS code had two loops, one
which walked the entries in the freelist and freed the ones which had
unused group there and another which walked all the groups (regardless of
which freelist they belong to) and freed the unused ones.
With the change the first call to do_release_pages would free freelist
entries from the first freelist with unused groups, then free all unused
groups and then second and following would access already freed groups,
crashing there, and then walk again all groups looking for unused ones (but
there are guaranteed to be none).

So, this patch fixes it by moving the unused group freeing to the caller,
release_pages after all freelists are freed, and while at it, moves there
the statistics printout as well, we don't need to print separate info
for each of the freelist, previously we were emitting just one.

2025-05-29  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/120464
* ggc-page.cc (struct ggc_globals): Fix up comment formatting.
(find_free_list): Likewise.
(alloc_page): For defined(USING_MALLOC_PAGE_GROUPS) use
free_list->free_pages instead of G.free_pages.
(do_release_pages): Add n1 and n2 arguments, make them used.
Move defined(USING_MALLOC_PAGE_GROUPS) page group freeing to
release_pages and dumping of statistics as well.  Formatting fixes.
(release_pages): Adjust do_release_pages caller, move here
defined(USING_MALLOC_PAGE_GROUPS) page group freeing and dumping
of statistics.
(ggc_handle_finalizers): Fix up comment formatting and typo.

4 months agoRISC-V: Add minimal support of double trap extension 1.0
Jerry Zhang Jian [Wed, 28 May 2025 02:17:36 +0000 (10:17 +0800)] 
RISC-V: Add minimal support of double trap extension 1.0

Add support of double trap extension [1], enabling GCC
to recognize the following extensions at compile time.

New extensions:
    - ssdbltrp
    - smdbltrp

[1] https://github.com/riscv/riscv-double-trap/releases/download/v1.0/riscv-double-trap.pdf

gcc/ChangeLog:

* config/riscv/riscv-ext.def: New extensions
* config/riscv/riscv-ext.opt: Auto re-generated
* doc/riscv-ext.texi: Auto re-generated

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-57.c: New test
* gcc.target/riscv/arch-58.c: New test

Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
4 months agoFortran: Fix ChangeLog.
Jerry DeLisle [Thu, 29 May 2025 04:04:13 +0000 (21:04 -0700)] 
Fortran: Fix ChangeLog.

PR fortran/119856

gcc/fortran/ChangeLog:

* ChangeLog: Fix PR number in log.

4 months agoRISC-V: Add test for vec_duplicate + vmul.vv combine case 1 with GR2VR cost 0, 1...
Pan Li [Wed, 28 May 2025 08:22:04 +0000 (16:22 +0800)] 
RISC-V: Add test for vec_duplicate + vmul.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 1 and 2.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm
check for vmul.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Add test for vec_duplicate + vmul.vv combine case 0 with GR2VR cost 0, 2...
Pan Li [Wed, 28 May 2025 08:20:32 +0000 (16:20 +0800)] 
RISC-V: Add test for vec_duplicate + vmul.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 2 and 15.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vmul.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vmul run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmul-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmul-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmul-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmul-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost
Pan Li [Wed, 28 May 2025 08:16:49 +0000 (16:16 +0800)] 
RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

This patch would like to combine the vec_duplicate + vmul.vv to the
vmul.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, |)

Before this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma
  13   │     vmv.v.x v2,a2
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vmul.vv v1,v1,v2
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vmul.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add
new case for MULT op.
(expand_vx_binary_vec_vec_dup): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op mult to no_shift_vx_ops.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoc++: add __is_*destructible builtins [PR107600]
Jason Merrill [Wed, 28 May 2025 15:42:00 +0000 (11:42 -0400)] 
c++: add __is_*destructible builtins [PR107600]

Typically "does this class have a trivial destructor" is the wrong question
to ask, we rather want "can I destroy this class trivially", thus the
std::is_trivially_destructible standard trait.  Let's provide a builtin for
it, and complain about asking whether a deleted destructor is trivial.

Clang and MSVC also have these traits.

PR c++/107600

gcc/cp/ChangeLog:

* cp-trait.def (IS_DESTRUCTIBLE, IS_NOTHROW_DESTRUCTIBLE)
(IS_TRIVIALLY_DESTRUCTIBLE): New.
* constraint.cc (diagnose_trait_expr): Explain them.
* method.cc (destructible_expr): New.
(is_xible_helper): Use it.
* semantics.cc (finish_trait_expr): Handle new traits.
(trait_expr_value): Likewise.  Complain about asking
whether a deleted dtor is trivial.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_destructible1.C: New test.

4 months agoDaily bump.
GCC Administrator [Thu, 29 May 2025 00:18:19 +0000 (00:18 +0000)] 
Daily bump.

4 months ago[AUTOFDO] Fix autogen remake issue
Kugan Vivekanandarajah [Wed, 28 May 2025 22:47:19 +0000 (08:47 +1000)] 
[AUTOFDO] Fix autogen remake issue

Fix autogen issue introduced by commit
commit 86dc974cf30f926a014438a5fccdc9d41e26282b

ChangeLog:

* Makefile.def: Fix typo in cpu_type
* Makefile.tpl: Add cpu_type

Signed-off-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>
4 months agoSet znver5 addss cost to 2 again
Jan Hubicka [Wed, 28 May 2025 21:43:51 +0000 (23:43 +0200)] 
Set znver5 addss cost to 2 again

since uses of addss for other purposes then modelling FP addition/subtraction should
be gone now, this patch sets addss cost back to 2.

gcc/ChangeLog:

PR target/119298
* config/i386/x86-tune-costs.h (struct processor_costs): Set addss cost
back to 2.

4 months agoFortran: gfc_simplify_{cospi,sinpi} - fix for MPFR < 4.2.0
Tobias Burnus [Wed, 28 May 2025 20:35:03 +0000 (22:35 +0200)] 
Fortran: gfc_simplify_{cospi,sinpi} - fix for MPFR < 4.2.0

gcc/fortran/ChangeLog:

PR fortran/113152
* simplify.cc (gfc_simplify_cospi, gfc_simplify_sinpi): Avoid using
mpfr_fmod_ui in the MPFR < 4.2.0 version.

4 months agoFortran: Adjust handling of optional comma in FORMAT.
Jerry DeLisle [Wed, 28 May 2025 14:56:12 +0000 (07:56 -0700)] 
Fortran: Adjust handling of optional comma in FORMAT.

This change adjusts the error messages for optional commas
in format strings to give a warning at compile time unless
-std=legacy is used. This is more consistant with the
runtime library. A missing comma separator should not be
encouraged as it is non-standard fortran.

PR fortran/119586

gcc/fortran/ChangeLog:

* io.cc: Set missing comma error checks to STD_STD_LEGACY.

gcc/testsuite/ChangeLog:

* gfortran.dg/comma_format_extension_1.f: Update dg-options to
"-std=legacy".
* gfortran.dg/comma_format_extension_3.f: Likewise.
* gfortran.dg/continuation_13.f90: Likewise.

4 months agofortran: add constant input support for trig functions with half-revolutions
Yuao Ma [Wed, 28 May 2025 15:13:45 +0000 (23:13 +0800)] 
fortran: add constant input support for trig functions with half-revolutions

This patch introduces constant input support for trigonometric functions,
including those involving half-revolutions. Both valid and invalid inputs have
been thoroughly tested, as have mpfr versions greater than or equal to 4.2 and
less than 4.2.

Inspired by Steve's previous work, this patch also fixes subtle bugs revealed
by newly added test cases.

If this patch is merged, I plan to work on middle-end optimization support for
previously added GCC built-ins and libgfortran intrinsics.

PR fortran/113152

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_isym_id): Add new enum.
* intrinsic.cc (add_functions): Register new intrinsics. Changing the call
from gfc_resolve_trigd{,2} to gfc_resolve_trig{,2}.
* intrinsic.h (gfc_simplify_acospi, gfc_simplify_asinpi,
gfc_simplify_asinpi, gfc_simplify_atanpi, gfc_simplify_atan2pi,
gfc_simplify_cospi, gfc_simplify_sinpi, gfc_simplify_tanpi): New.
(gfc_resolve_trig): Rename from gfc_resolve_trigd.
(gfc_resolve_trig2): Rename from gfc_resolve_trigd2.
* iresolve.cc (gfc_resolve_trig): Rename from gfc_resolve_trigd.
(gfc_resolve_trig2): Rename from gfc_resolve_trigd2.
* mathbuiltins.def: Add 7 new math builtins and re-align.
* simplify.cc (gfc_simplify_acos, gfc_simplify_asin,
gfc_simplify_acosd, gfc_simplify_asind): Revise error message.
(gfc_simplify_acospi, gfc_simplify_asinpi,
gfc_simplify_asinpi, gfc_simplify_atanpi, gfc_simplify_atan2pi,
gfc_simplify_cospi, gfc_simplify_sinpi, gfc_simplify_tanpi): New.

gcc/testsuite/ChangeLog:

* gfortran.dg/dec_math_3.f90: Test invalid input.
* gfortran.dg/dec_math_5.f90: Test valid output.
* gfortran.dg/dec_math_6.f90: New test.

Signed-off-by: Yuao Ma <c8ef@outlook.com>
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
4 months agovect: Remove non-SLP paths in strided slp/elementwise.
Robin Dapp [Tue, 20 May 2025 09:23:34 +0000 (11:23 +0200)] 
vect: Remove non-SLP paths in strided slp/elementwise.

This patch removes non-SLP paths in the
VMAT_STRIDED_SLP/VMAT_ELEMENTWISE part of vectorizable_load.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_load): Remove non-SLP paths.

4 months agoRISC-V: Avoid division by zero in check_builtin_call [PR120436].
Robin Dapp [Mon, 26 May 2025 14:16:36 +0000 (16:16 +0200)] 
RISC-V: Avoid division by zero in check_builtin_call [PR120436].

In check_builtin_call we eventually perform a division by zero when no
vector modes are present.  This patch just avoids the division in that
case.

PR target/120436

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (struct vset_def):
Avoid division by zero.
(struct vget_def): Ditto.
* config/riscv/riscv-vector-builtins.h (struct function_group_info):
Use required_extensions_specified instead of duplicating code.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr120436.c: New test.

4 months agolibgomp.fortran/metadirective-1.f90: Expect 'error:' for nvptx compile [PR118694]
Tobias Burnus [Wed, 28 May 2025 13:14:14 +0000 (15:14 +0200)] 
libgomp.fortran/metadirective-1.f90: Expect 'error:' for nvptx compile [PR118694]

This should have been part of commit r16-838-gb3d07ec7ac2ccd or
r16-883-g5d6ed6d604ff94 - all showing the same issue:
'!$omp target' followed by a metadirective with 'teams'; if
the metadirective cannot be early resolved, a diagnostic
error is shown about using directives between 'target' and
'teams'.

While the message is misleading, the problem is that the
host invokes 'target' differently when 'teams' is present;
in this case, host fallback + amdgcn offload require the
no-teams case, nvptx offload the teams case such that it
only can be resolved at runtime.

Mark the error as 'dg-bogus + xfail' to silence the FAIL,
when nvptx offloading is compiled for. (If not, the
metadirective can be resolved early during compilation.)

libgomp/ChangeLog:

PR middle-end/118694
* testsuite/libgomp.fortran/metadirective-1.f90: xfail when
compiling (also) for nvptx offloading as an error is then expected.

4 months agoc++: modules and using-directives
Jason Merrill [Wed, 20 Nov 2024 22:46:54 +0000 (23:46 +0100)] 
c++: modules and using-directives

We weren't representing 'using namespace' at all in modules, which broke
some of the <chrono> literals tests.

This only represents exported using-declarations; others should be
irrelevant to importers, as any name lookup in the imported module that
would have cared about them was done while compiling the header unit.

I experimented with various approaches to representing them; this patch
handles them in read/write_namespaces, after the namespaces themselves.  I
spent a while pondering how to deal with the depset code in order to connect
them, but then realized it would be simpler to refer to them based on their
index in the array of namespaces.

Any using-directives from an indirect import are ignored, so in an export
import, any imported using-directives are exported again.

gcc/cp/ChangeLog:

* module.cc (module_state::write_namespaces): Write
using-directives.
(module_state::read_namespaces): And read them.
* name-lookup.cc (add_using_namespace): Add overload.  Build a
USING_DECL for modules.
(name_lookup::search_usings, name_lookup::queue_usings)
(using_directives_contain_std_p): Strip the USING_DECL.
* name-lookup.h: Declare it.
* parser.cc (cp_parser_import_declaration): Set MK_EXPORTING
for export import.

gcc/testsuite/ChangeLog:

* g++.dg/modules/namespace-8_a.C: New test.
* g++.dg/modules/namespace-8_b.C: New test.
* g++.dg/modules/namespace-9_a.C: New test.
* g++.dg/modules/namespace-9_b.C: New test.
* g++.dg/modules/namespace-10_a.C: New test.
* g++.dg/modules/namespace-10_b.C: New test.
* g++.dg/modules/namespace-10_c.C: New test.
* g++.dg/modules/namespace-11_a.C: New test.
* g++.dg/modules/namespace-11_b.C: New test.
* g++.dg/modules/namespace-11_c.C: New test.

4 months agoRISC-V: Add test cases for avg_floor vaadd implementation
Pan Li [Tue, 27 May 2025 02:27:01 +0000 (10:27 +0800)] 
RISC-V: Add test cases for avg_floor vaadd implementation

Add asm and run testcase for avg_floor vaadd implementation.

The below test suites are passed for this patch series.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/avg.h: New test.
* gcc.target/riscv/rvv/autovec/avg_data.h: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i16-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i16-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i32-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i8-from-i16.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i8-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-1-i8-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i16-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i16-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i32-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i8-from-i16.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i8-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_floor-run-1-i8-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_run.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Reconcile the existing test for avg_floor
Pan Li [Tue, 27 May 2025 02:24:56 +0000 (10:24 +0800)] 
RISC-V: Reconcile the existing test for avg_floor

Some existing avg_floor test need updated due to change to
leverage vaadd.vv directly.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Leverage vaadd.vv for signed standard name avg_floor
Pan Li [Tue, 27 May 2025 01:53:56 +0000 (09:53 +0800)] 
RISC-V: Leverage vaadd.vv for signed standard name avg_floor

The signed avg_floor totally match the sematics of fixed point
rvv insn vaadd, within round down.  Thus, leverage it directly
to implement the avf_floor.

The spec of RVV is somehow not that clear about the difference
between the float point and fixed point for the rounding that
discard least-significant information.

For float point which is not two's complement, the "discard
least-significant information" indicates truncation round.  For
example as below:

* 3.5 -> 3
* -2.3 -> -2

For fixed point which is two's complement, the "discard
least-significant information" indicates round down.  For
example as below:

* 3.5 -> 3
* -2.3 -> -3

And the vaadd takes the round down which is totally matching
the sematics of the avf_floor.

The below test suites are passed for this patch series.
* The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (avg<v_double_trunc>3_floor): Add insn
expand to leverage vaadd directly.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoHandle auto-fdo 0 more carefully
Jan Hubicka [Wed, 28 May 2025 12:26:11 +0000 (14:26 +0200)] 
Handle auto-fdo 0 more carefully

This patch fixes few other places where auto-fdo 0 should be be treated as
actual 0 (i.e. probably never executed).  Overall I think we should end up
combining static profile with auto-fdo profile where auto-fdo has 0 counts,
but that is something that should be benchmarked and first it is neccessary to
get something benchmarkeable out of auto-FDO.

gcc/ChangeLog:

* cgraph.cc (cgraph_edge::maybe_hot_p): For auto-fdo turn 0
to non-zero.
* ipa-cp.cc (cs_interesting_for_ipcp_p): Do not trust
auto-fdo 0.
* profile-count.cc (profile_count::adjust_for_ipa_scaling): Likewise.
(profile_count::from_gcov_type): Fix formating.

4 months agoDo not recompute profile when entry block has afdo count of 0
Jan Hubicka [Wed, 28 May 2025 12:18:39 +0000 (14:18 +0200)] 
Do not recompute profile when entry block has afdo count of 0

With normal profile feedback checking entry block count to be non-zero is quite
reliable check for presence of non-0 profile in the body since the function
body can only be executed if the entry block was executed.  With autofdo this
is not true, since the entry block may just execute too few times to be
recorded.  As a consequence we currently drop AFDO profile quite often.  This
patch fixes it.

gcc/ChangeLog:

* predict.cc (rebuild_frequencies): look harder for presence
of profile feedback.

4 months agoaarch64: Enable newly implemented features for FUJITSU-MONAKA
Yuta Mukai [Fri, 23 May 2025 04:51:11 +0000 (04:51 +0000)] 
aarch64: Enable newly implemented features for FUJITSU-MONAKA

This patch enables newly implemented features in GCC (FAMINMAX, FP8FMA,
FP8DOT2, FP8DOT4, LUT) for FUJITSU-MONAKA
processor (-mcpu=fujitsu-monaka).

2025-05-23  Yuta Mukai  <mukai.yuta@fujitsu.com>

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (fujitsu-monaka): Update ISA
features.

4 months agoFix profile_probability quality of switch
Jan Hubicka [Wed, 28 May 2025 10:23:48 +0000 (12:23 +0200)] 
Fix profile_probability quality of switch

This fixes ages old bug I noticed only now where switch cases, in situation
prediction is completely missing, gets all equal probability that should be
GUESSED instead of ADJUSTED.

gcc/ChangeLog:

* predict.cc (set_even_probabilities): Set quality to guessed.

4 months agoDo not erase static profile by 0 autofdo profile
Jan Hubicka [Wed, 28 May 2025 10:15:32 +0000 (12:15 +0200)] 
Do not erase static profile by 0 autofdo profile

This patch makes auto-fdo more careful about keeping info we have
from static profile prediction.

If all counters in function are 0, we can keep original auto-fdo profile.
Having all 0 profile is not very useful especially becuase 0 in autofdo is not
very informative and the code still may have been executed in the train run.
I added comment about adding GUESSED_GLOBAL0_AFDO which would still preserve
info that the function is not hot in the profile, but I would like to do this
incrementally.

If function has non-zero counters, we can still keep info about zero being
reliable from static prediction (i.e. after EH or with cold attribute).

gcc/ChangeLog:

* auto-profile.cc (update_count_by_afdo_count): New function.
(afdo_set_bb_count): Add debug output; only set count if it is
non-zero.
(afdo_find_equiv_class): Add debug output.
(afdo_calculate_branch_prob): Fix formating.
(afdo_annotate_cfg): Add debug output; do not erase static
profile if autofdo profile is all 0.

4 months agodoc: Fix extend.texi menu
Haochen Jiang [Wed, 28 May 2025 02:36:34 +0000 (10:36 +0800)] 
doc: Fix extend.texi menu

commit 517c9487f8fdc4e4e90252a9365e5823259dc783
Author: Alejandro Colomar <alx@kernel.org>
Date:   Thu May 22 01:15:36 2025 +0200

    c: Add _Countof operator [PR117025]

broke gcc build on RHEL 9 when building texi files (with bundled
makeinfo 6.7):

gcc/doc/extend.texi:6: node `C Extensions' lacks menu item for
`_Countof' despite being its Up target

The same fail will happen for makeinfo <= 6.7, while won't fail
when makeinfo >= 6.8.

Fixed by adding the missing menu entires.

gcc/ChangeLog:

* doc/extend.texi (C Extensions): Add missing menu items.

4 months agoFor datarefs with big gap, split them into different groups.
liuhongt [Wed, 12 Mar 2025 01:40:07 +0000 (18:40 -0700)] 
For datarefs with big gap, split them into different groups.

The patch tries to solve miss vectorization for below case.

void
foo (int* a, int* restrict b)
{
    b[0] = a[0] * a[64];
    b[1] = a[65] * a[1];
    b[2] = a[2] * a[66];
    b[3] = a[67] * a[3];
    b[4] = a[68] * a[4];
    b[5] = a[69] * a[5];
    b[6] = a[6] * a[70];
    b[7] = a[7] * a[71];
}

In vect_analyze_data_ref_accesses, a[0], a[1], .. a[7], a[64], ...,
a[71] are in same group with size of 71. It caused vectorization
unprofitable.

gcc/ChangeLog:

PR tree-optimization/119181
* tree-vect-data-refs.cc (vect_analyze_data_ref_accesses):
Split datarefs when there's a gap bigger than
MAX_BITSIZE_MODE_ANY_MODE.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-pr119181.c: New test.

4 months agoDaily bump.
GCC Administrator [Wed, 28 May 2025 00:18:43 +0000 (00:18 +0000)] 
Daily bump.

4 months agoc: Fix up a pasto in countof diagnostics [PR117025]
Jakub Jelinek [Tue, 27 May 2025 21:27:23 +0000 (23:27 +0200)] 
c: Fix up a pasto in countof diagnostics [PR117025]

The C23 in there looks like pasto, should be C2Y.

2025-05-27  Jakub Jelinek  <jakub@redhat.com>

PR c/117025
* c-parser.cc (c_parser_sizeof_or_countof_expression): Use
C2Y rather than C23 in pedwarn_c23.

4 months agoFortran: fix regression introduced by commit r16-914-g787a8dec1acedf
Harald Anlauf [Tue, 27 May 2025 21:08:54 +0000 (23:08 +0200)] 
Fortran: fix regression introduced by commit r16-914-g787a8dec1acedf

A last-minute cleanup before patch submission reordered a change
that should not have happened.  This fixes it.

PR fortran/101735

gcc/fortran/ChangeLog:

* primary.cc (gfc_match_varspec): Correct order of logic.

4 months agolibgcc: Add DPD support + fix big-endian support of _BitInt <-> dfp conversions
Jakub Jelinek [Tue, 27 May 2025 21:10:08 +0000 (23:10 +0200)] 
libgcc: Add DPD support + fix big-endian support of _BitInt <-> dfp conversions

The following patch fixes
FAIL: gcc.dg/dfp/bitint-1.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-2.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-3.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-4.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-5.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-6.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-8.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-1.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-2.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-4.c (test for excess errors)
on s390x-linux (with the 3 not yet posted patches).

The patch does multiple things:
1) the routines were written for the DFP BID (binary integer decimal)
   format which is used on all arches but powerpc*/s390* (those use
   DPD - densely packed decimal format); as most of the code is actually
   the same for both BID and DPD formats, I haven't copied the sources
   + slightly modified them, but added the DPD support directly, + renaming
   of the exported symbols from __bid_* prefixed to __dpd_* prefixed that
   GCC expects on the DPD targets
2) while testing that I've found some big-endian issues in the existing
   support
3) testing also revealed that in some cases __builtin_clzll (~msb) was
   called with msb set to all ones, so invoking UB; apparently on aarch64
   and x86 we were lucky and got some value that happened to work well,
   but that wasn't the case on s390x

For 1), the patch uses two ~ 2KB tables to speed up the decoding/encoding.
I haven't found such tables in what is added into libgcc.a, though they
are in libdecnumber/bid/bid2dpd_dpd2bid.h, but there they are just huge
and next to other huge tables - there is d2b which is like __dpd_d2bbitint
in the patch but it uses 64-bit entries rather than 16-bit, then there is
d2b2 with 64-bit entries like in d2b all multiplied by 1000, then d2b3
similarly multiplied by 1000000, then d2b4 similarly multiplied by
1000000000, then d2b5 similarly multiplied by 1000000000000ULL and
d2b6 similarly multipled by 1000000000000000ULL.  Arguably it can
save some of the multiplications, but on the other side accesses memory
which is unlikely in the caches, and the 2048 bytes in the patch vs.
24 times more for d2b is IMHO significant.
For b2d, libdecnumber/bid/bid2dpd_dpd2bid.h has again b2d table like
__dpd_b2dbitint in the patch, except that it has 64-bit entries rather
than 16-bit (this time 1000 entries), but then has b2d2 which has the
same entries shifted left by 10, then b2d3 shifted left by 20, b2d4 shifted
left by 30 and b2d5 shifted left by 40.  I can understand for d2b paying
memory cost to speed up multiplications, but don't understand paying
extra 4 * 8 * 1000 bytes (+ 6 * 1000 bytes for b2d not using ushort)
just to avoid shifts.

2025-05-27  Jakub Jelinek  <jakub@redhat.com>

* config/t-softfp (softfp_bid_list): Don't guard with
$(enable_decimal_float) == bid.
* soft-fp/bitint.h (__bid_pow10bitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_pow10bitint.
(__dpd_d2bbitint, __dpd_b2dbitint): Declare.
* soft-fp/bitintpow10.c (__dpd_d2bbitint, __dpd_b2dbitint): New
variables.
* soft-fp/fixsdbitint.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixddbitint.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixtdbitint.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixsdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixsdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixsdti.
* soft-fp/fixddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixddti.
* soft-fp/fixtdti.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
(__bid_fixtdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixtdti.
* soft-fp/fixunssdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixunssdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunssdti.
* soft-fp/fixunsddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixunsddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunsddti.
* soft-fp/fixunstdti.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
(__bid_fixunstdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunstdti.
* soft-fp/floatbitintsd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
Add DPD support.  Avoid calling __builtin_clzll with 0 argument.  Fix
big-endian support.
* soft-fp/floatbitintdd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
Add DPD support.  Avoid calling __builtin_clzll with 0 argument.  Fix
big-endian support.
* soft-fp/floatbitinttd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
Add DPD support.  Avoid calling __builtin_clzll with 0 argument.  Fix
big-endian support.
* soft-fp/floattisd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
(__bid_floattisd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattisd.
* soft-fp/floattidd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
(__bid_floattidd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattidd.
* soft-fp/floattitd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
(__bid_floattitd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattitd.
* soft-fp/floatuntisd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
(__bid_floatuntisd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntisd.
* soft-fp/floatuntidd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
(__bid_floatuntidd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntidd.
* soft-fp/floatuntitd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
(__bid_floatuntitd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntitd.

4 months agofold: DECL_VALUE_EXPR isn't simple [PR120400]
Jason Merrill [Fri, 23 May 2025 13:41:25 +0000 (09:41 -0400)] 
fold: DECL_VALUE_EXPR isn't simple [PR120400]

This PR noted that fold_truth_andor was wrongly changing && to & where the
RHS is a VAR_DECL with DECL_VALUE_EXPR; we can't assume that such can be
evaluated unconditionally.

To be more precise we could recurse into DECL_VALUE_EXPR, but that doesn't
seem worth bothering with since typical uses involve a COMPONENT_REF, which
is not simple.

PR c++/120400

gcc/ChangeLog:

* fold-const.cc (simple_operand_p): False for vars with
DECL_VALUE_EXPR.

4 months agoc: Add -Wpedantic diagnostic for _Countof [PR117025]
Alejandro Colomar [Wed, 21 May 2025 23:15:43 +0000 (01:15 +0200)] 
c: Add -Wpedantic diagnostic for _Countof [PR117025]

It has been standardized in C2y.

PR c/117025

gcc/c/ChangeLog:

* c-parser.cc (c_parser_sizeof_or_countof_expression):
Add -Wpedantic diagnostic for _Countof in <= C23 mode.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compat.c: New test.
* gcc.dg/countof-no-compat.c: New test.
* gcc.dg/countof-pedantic.c: New test.
* gcc.dg/countof-pedantic-errors.c: New test.

Signed-off-by: Alejandro Colomar <alx@kernel.org>
4 months agoc: Add <stdcountof.h> [PR117025]
Alejandro Colomar [Wed, 21 May 2025 23:15:40 +0000 (01:15 +0200)] 
c: Add <stdcountof.h> [PR117025]

PR c/117025

gcc/ChangeLog:

* Makefile.in (USER_H): Add <stdcountof.h>.
* ginclude/stdcountof.h: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-stdcountof.c: New test.

Signed-off-by: Alejandro Colomar <alx@kernel.org>
4 months agoc: Add _Countof operator [PR117025]
Alejandro Colomar [Wed, 21 May 2025 23:15:36 +0000 (01:15 +0200)] 
c: Add _Countof operator [PR117025]

This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.

Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3550.pdf>
Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117025>
Link: <https://inbox.sourceware.org/gcc/M8S4oQy--3-2@tutanota.com/T/>
Link: <https://inbox.sourceware.org/gcc-patches/20240728141547.302478-1-alx@kernel.org/T/#t>
Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3313.pdf>
Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3325.pdf>
Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3369.pdf>
Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3469.htm>
Link: <https://github.com/llvm/llvm-project/issues/102836>
Link: <https://thephd.dev/the-big-array-size-survey-for-c>
Link: <https://thephd.dev/the-big-array-size-survey-for-c-results>
Link: <https://stackoverflow.com/questions/37538/#57537491>

PR c/117025

gcc/ChangeLog:

* doc/extend.texi: Document _Countof operator.

gcc/c-family/ChangeLog:

* c-common.h (enum rid): Add RID_COUNTOF.
(c_countof_type): New function prototype.
* c-common.def (COUNTOF_EXPR): New tree.
* c-common.cc (c_common_reswords): Add RID_COUNTOF entry.
(c_countof_type): New function.

gcc/c/ChangeLog:

* c-tree.h (in_countof): Add global variable declaration.
(c_expr_countof_expr): Add function prototype.
(c_expr_countof_type): Add function prototype.
* c-decl.cc (start_struct, finish_struct): Add support for
_Countof.
(start_enum, finish_enum): Add support for _Countof.
* c-parser.cc (c_parser_sizeof_expression): New macro.
(c_parser_countof_expression): New macro.
(c_parser_sizeof_or_countof_expression): Rename function and add
support for _Countof.
(c_parser_unary_expression): Add RID_COUNTOF entry.
* c-typeck.cc (in_countof): Add global variable.
(build_external_ref): Add support for _Countof.
(record_maybe_used_decl): Add support for _Countof.
(pop_maybe_used): Add support for _Countof.
(is_top_array_vla): New function.
(c_expr_countof_expr, c_expr_countof_type): New functions.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c: New test.
* gcc.dg/countof-vla.c: New test.
* gcc.dg/countof-vmt.c: New test.
* gcc.dg/countof-zero-compile.c: New test.
* gcc.dg/countof-zero.c: New test.
* gcc.dg/countof.c: New test.

Suggested-by: Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Co-authored-by: Martin Uecker <uecker@tugraz.at>
Acked-by: "James K. Lowden" <jklowden@schemamania.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
4 months agoFortran: Fix c_associated argument checks.
Jerry DeLisle [Tue, 20 May 2025 02:41:16 +0000 (19:41 -0700)] 
Fortran: Fix c_associated argument checks.

PR fortran/120049

gcc/fortran/ChangeLog:

* check.cc (gfc_check_c_associated): Use new helper functions.
Only call check_c_ptr_1 if optional c_ptr_2 tests succeed.
(check_c_ptr_1): Handle only c_ptr_1 checks.
(check_c_ptr_2): Expand checks for c_ptr_2 and handle cases
where there is no derived pointer in the gfc_expr and check
the inmod_sym_id only if it exists.
* misc.cc (gfc_typename): Handle the case for BT_VOID rather
than throw an internal error.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr120049_a.f90: Update test directives.
* gfortran.dg/pr120049_b.f90: Update test directives
* gfortran.dg/pr120049_2.f90: New test.

Co-Authored-By: Steve Kargl <kargl@gcc.gnu.org>
4 months agoFortran: fix parsing of type parameter inquiries of substrings [PR101735]
Harald Anlauf [Tue, 27 May 2025 17:23:16 +0000 (19:23 +0200)] 
Fortran: fix parsing of type parameter inquiries of substrings [PR101735]

Handling of type parameter inquiries of substrings failed to due either
parsing issues or not following or handling reference chains properly.

PR fortran/101735

gcc/fortran/ChangeLog:

* expr.cc (find_inquiry_ref): If an inquiry reference applies to
a substring, use that, and calculate substring length if needed.
* primary.cc (extend_ref): Also handle attaching to end of
reference chain for appending.
(gfc_match_varspec): Discrimate between arrays of character and
substrings of them.  If a substring is taken from a character
component of a derived type, get the proper typespec so that
inquiry references work correctly.
(gfc_match_rvalue): Handle corner case where we hit a seemingly
dangling '%' and missed an inquiry reference. Try another match.

gcc/testsuite/ChangeLog:

* gfortran.dg/inquiry_type_ref_7.f90: New test.

4 months agoc++, coroutines: Fix typos in TRUTH_ANDIF_EXPRs.
Iain Sandoe [Thu, 22 May 2025 12:24:12 +0000 (13:24 +0100)] 
c++, coroutines: Fix typos in TRUTH_ANDIF_EXPRs.

These were typoed to TRUTH_AND_EXPR (and then that got copied).

gcc/cp/ChangeLog:

* coroutines.cc (cp_coroutine_transform::build_ramp_function):
Replace TRUTH_AND_EXPR with TRUTH_ANDIF_EXPR in three places.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
4 months agoEnable afdo testing on AMD Zen3+
Jan Hubicka [Tue, 27 May 2025 17:53:50 +0000 (19:53 +0200)] 
Enable afdo testing on AMD Zen3+

contrib/ChangeLog:

* gen_autofdo_event.py: Add support for AMD Zen 3 and
later CPUs.

gcc/ChangeLog:

* config/i386/gcc-auto-profile: regenerate.

4 months agoRemove dead code in auto-profile.cc
Jan Hubicka [Tue, 27 May 2025 17:14:21 +0000 (19:14 +0200)] 
Remove dead code in auto-profile.cc

This code to track what locations were used when reading auto-fdo profile
seems dead since the initial commit. Removed thus.

gcc/ChangeLog:

* auto-profile.cc (function_instance::mark_annotated): Remove.
(function_instance::total_annotated_count): Remove.
(autofdo_source_profile::mark_annotated): Remove.
(afdo_set_bb_count): Do not mark annotated locations.
(afdo_annotate_cfg): Likewise.

4 months agoFix IPA-SRA issue with reverse SSO on specific pattern
Eric Botcazou [Tue, 27 May 2025 17:42:17 +0000 (19:42 +0200)] 
Fix IPA-SRA issue with reverse SSO on specific pattern

IPA-SRA generally works fine in the presence of reverse Scalar_Storage_Order
by propagating the relevant flag onto the newly generated MEM_REFs.  However
we have been recently faced with a specific Ada pattern that it does not
handle correctly: the 'Valid attribute applied to a floating-point component
of an aggregate type with reverse Scalar_Storage_Order.

The attribute is implemented by a call to a specific routine of the runtime
that expects a pointer to the object so, in the case of a component with
reverse SSO, the compiler first loads it from the aggregate to get back the
native storage order, but it does the load using an array of bytes instead
of the floating-point type to prevent the FPU from fiddling with the value,
which yields in the .original dump file:

  *(character[1:4] *) &F2b = VIEW_CONVERT_EXPR<character[1:4]>(item.f);

Of course that's a bit convoluted, but it does not seem that another method
would be simpler or even work, and using VIEW_CONVERT_EXPR to toggle the SSO
is supposed to be supported in any case (unlike aliasing or type punning).

The attached patch makes it work.  While the call to storage_order_barrier_p
from IPA-SRA is quite natural (the regular SRA has it too), the tweak to the
predicate itself is needed to handle the scalar->aggregate conversion, which
is admittedly awkward but again without clear alternative.

gcc/
* ipa-sra.cc (scan_expr_access): Also disqualify storage order
barriers from splitting.
* tree.h (storage_order_barrier_p): Also return false if the
operand of the VIEW_CONVERT_EXPR has reverse storage order.

gcc/testsuite/
* gnat.dg/sso19.adb: New test.
* gnat.dg/sso19_pkg.ads, gnat.dg/sso19_pkg.adb: New helper.

4 months agodiagnostics: rework experimental-html output [PR116792]
David Malcolm [Tue, 27 May 2025 16:22:26 +0000 (12:22 -0400)] 
diagnostics: rework experimental-html output [PR116792]

This patch reworks the HTML output from the the option
  -fdiagnostics-add-output=experimental-html
so that for source quoting and path printing, rather than simply adding
the textual output inside <pre> elements, it breaks up the output
into HTML tags reflecting the structure of the output, using CSS, SVG
and a little javascript to help the user navigate the diagnostics and
the events within any paths.

This uses ideas from the patch I posted in:
  https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558603.html
but reworks the above patch so that:
* rather than printing source to a pretty_printer, the HTML is created
  by building a DOM tree in memory, using a new xml::printer class.  This
  should be less error-prone than the pretty_printer approach, since it
  ought to solve escaping issues.  Instead of a text vs html boolean,
  the code is generalized via templates with to_text vs to_html sinks.
  This templatization applies both to path-printing and
  diagnostic-show-locus.cc.
* the HTML output can have multiple diagnostics and multiple paths rather
  than just a single path.  The javascript keyboard controls now cycle
  through all diagnostics and all events within them

An example of the output can be seen at:
  https://dmalcolm.fedorapeople.org/gcc/2025-05-27/malloc-1.c.html
where the keys "j" and "k" cycle through diagnostics and events
within them.

gcc/ChangeLog:
PR other/116792
* diagnostic-format-html.cc: Define INCLUDE_STRING.
Include "xml.h", "xml-printer.h", and "json.h".
(html_generation_options::html_generation_options): New.
(namespace xml): Move decls to xml.h and convert from using
label_text to std::string.
(xml::text::write_as_xml): Reimplement indentation so it is done
by this node, rather than the parent.
(xml::node_with_children::add_text): Convert from label_text to
std::string.  Consolidate runs of text into a single node.
(xml::document::write_as_xml): Reimplement indentation.
(xml::element::write_as_xml): Reimplement indentation so it is
done by this node, rather than the parent.  Convert from
label_text to std::string.  Update attribute-printing to new
representation to preserve insertion order.
(xml::element::set_attr): Convert from label_text to std::string.
Record insertion order.
(xml::raw::write_as_xml): New.
(xml::printer::printer): New.
(xml::printer::push_tag): New.
(xml::printer::push_tag_with_class): New.
(xml::printer::pop_tag): New.
(xml::printer::set_attr): New.
(xml::printer::add_text): New.
(xml::printer::add_raw): New.
(xml::printer::push_element): New.
(xml::printer::append): New.
(xml::printer::get_insertion_point): New.
(html_builder::add_focus_id): New.
(html_builder::m_html_gen_opts): New field.
(html_builder::m_head_element): New field.
(html_builder::m_next_diag_id): New field.
(html_builder::m_ui_focus_ids): New field.
(make_div): Convert from label_text to std::string.
(make_span): Likewise.
(HTML_STYLE): New.
(HTML_SCRIPT): New.
(html_builder::html_builder): Fix indentation.  Add
"html_gen_opts" param.  Initialize new fields.  Reimplement
using xml::printer.  Optionally add style and script tags.
(class html_path_label_writer): New.
(html_builder::make_element_for_diagnostic): Convert from
label_text to std::string. Set "id" on "gcc-diagnostic" and
"gcc-message" <div> elements; add the latter to the focus ids.
Use diagnostic_context::maybe_show_locus_as_html rather than
html_builder::make_element_for_source.  Use print_path_as_html
rather than html_builder::make_element_for_path.
(html_builder::make_element_for_source): Drop.
(html_builder::make_element_for_path): Drop.
(html_builder::make_element_for_patch): Convert from label_text to
std::string.
(html_builder::make_metadata_element): Likewise.  Use
xml::printer.
(html_builder::make_element_for_metadata): Convert from label_text
to std::string.
(html_builder::emit_diagram): Expand comment.
(html_builder::flush_to_file): Write out initializer for
"focus_ids" into javascript.
(html_output_format::html_output_format): Add param
"html_gen_opts" and use it to initialize m_builder.
(html_file_output_format::html_file_output_format): Likewise, to
initialize base class.
(make_html_sink): Likewise, to pass to ctor.
(selftest::test_html_diagnostic_context::test_html_diagnostic_context):
Set up html_generation_options.
(selftest::html_buffered_output_format::html_buffered_output_format):
Add html_gen_opts param.
(selftest::test_simple_log): Add id attributes to expected text
for "gcc-diagnostic" and "gcc-message" elements.  Update
whitespace for indentation fixes.
(selftest::test_metadata): Update whitespace for indentation
fixes.
(selftest::test_printer): New selftest.
(selftest::test_attribute_ordering): New selftest.
(selftest::diagnostic_format_html_cc_tests): Call the new
selftests.
* diagnostic-format-html.h (struct html_generation_options): New.
(make_html_sink): Add "html_gen_opts" param.
(print_path_as_html): New decl.
* diagnostic-path-output.cc: Define INCLUDE_MAP.  Add includes of
"diagnostic-format-html.h", "xml.h", and "xml-printer.h".
(path_print_policy::path_print_policy): Add ctor.
(path_print_policy::get_diagram_theme): Fix whitespace.
(struct stack_frame): New.
(begin_html_stack_frame): New function.
(end_html_stack_frame): New function.
(emit_svg_arrow): New function.
(event_range::print): Rename to...
(event_range::print_as_text): ...this.  Update call to
diagnostic_start_span.
(event_range::print_as_html): New, based on the above, but ported
from pretty_printer to xml::printer.
(thread_event_printer::print_swimlane_for_event_range): Rename
to...
(thread_event_printer::print_swimlane_for_event_range_as_text):
...this.  Update for renaming of event_range::print to
event_range::print_as_text.
(thread_event_printer::print_swimlane_for_event_range_as_html):
New.
(print_path_summary_as_text): Update for "_as_text" renaming.
(print_path_summary_as_html): New.
(print_path_as_html): New.
* diagnostic-show-locus.cc: Add defines of INCLUDE_MAP and
INCLUDE_STRING.  Add includes of "xml.h" and "xml-printer.h".
(struct char_display_policy): Replace "m_print_cb" with
"m_print_text_cb" and "m_print_html_cb".
(struct to_text): New.
(struct to_html): New.
(get_printer): New.
(default_diagnostic_start_span_fn<to_text>): New.
(default_diagnostic_start_span_fn<to_html>): New.
(class layout): Update "friend class layout_printer;" for
template.
(enum class margin_kind): New.
(class layout_printer): Convert into a template.
(layout_printer::m_pp): Replace field with...
(layout_printer::m_sink): ...this.
(layout_printer::m_colorizer): Drop field in favor of a pointer
in the "to_text" sink.
(default_print_decoded_ch): Convert into a template.
(escape_as_bytes_print): Likewise.
(escape_as_unicode_print): Likewise.
(make_char_policy): Update to use both text and html callbacks.
(layout_printer::print_gap_in_line_numbering): Replace with...
(layout_printer<to_text>::print_gap_in_line_numbering): ...this
(layout_printer<to_html>::print_gap_in_line_numbering): ...and
this.
(layout_printer::print_source_line): Convert to template, using
m_sink.
(layout_printer::print_leftmost_column): Likewise.
(layout_printer::start_annotation_line): Likewise.
(layout_printer<to_text>::end_line): New.
(layout_printer<to_html>::end_line): New.
(layout_printer::print_annotation_line): Convert to template,
using m_sink.
(class line_label): Add field m_original_range_idx.
(layout_printer<to_text>::begin_label): New.
(layout_printer<to_html>::begin_label): New.
(layout_printer<to_text>::end_label): New.
(layout_printer<to_html>::end_label): New.
(layout_printer::print_any_labels): Convert to template, using
m_sink.
(layout_printer::print_leading_fixits): Likewise.
(layout_printer::print_trailing_fixits): Likewise.
(layout_printer::print_newline): Drop.
(layout_printer::move_to_column): Convert to template, using
m_sink.
(layout_printer::show_ruler): Likewise.
(layout_printer::print_line): Likewise.
(layout_printer::print_any_right_to_left_edge_lines): Likewise.
(layout_printer::layout_printer): Likewise.
(diagnostic_context::maybe_show_locus_as_html): New.
(diagnostic_source_print_policy::diagnostic_source_print_policy):
Update for split of start_span_cb into text vs html variants.
(diagnostic_source_print_policy::print): Update for use of
templates; use to_text.
(diagnostic_source_print_policy::print_as_html): New.
(layout_printer::print): Convert to template, using m_sink.
(selftest::make_element_for_locus): New.
(selftest::make_raw_html_for_locus): New.
(selftest::test_layout_x_offset_display_utf8): Update for use of
templates.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_one_liner_caret_and_range): Add test coverage of
HTML output.
(selftest::test_one_liner_labels): Likewise.
* diagnostic.cc (diagnostic_context::initialize): Update for split
of start_span_cb into text vs html variants.
(default_diagnostic_start_span_fn): Move to
diagnostic-show-locus.cc, converting to template.
* diagnostic.h (class xml::printer): New forward decl.
(diagnostic_start_span_fn): Replace typedef with "using",
converting to a template.
(struct to_text): New forward decl.
(struct to_html): New forward decl.
(get_printer): New decl.
(diagnostic_location_print_policy::print_text_span_start): New
decl.
(diagnostic_location_print_policy::print_html_span_start): New
decl.
(class html_label_writer): New.
(diagnostic_source_print_policy::print_as_html): New decl.
(diagnostic_source_print_policy::get_start_span_fn): Replace
with...
(diagnostic_source_print_policy::get_text_start_span_fn): ...this
(diagnostic_source_print_policy::get_html_start_span_fn): ...and
this
(diagnostic_source_print_policy::m_start_span_cb): Replace with...
(diagnostic_source_print_policy::m_text_start_span_cb): ...this
(diagnostic_source_print_policy::m_html_start_span_cb): ...and
this.
(diagnostic_context::maybe_show_locus_as_html): New decl.
(diagnostic_context::m_text_callbacks::m_start_span): Replace
with...
(diagnostic_context::m_text_callbacks::m_text_start_span): ...this
(diagnostic_context::m_text_callbacks::m_html_start_span): ...and
this.
(diagnostic_start_span): Update for template change.
(diagnostic_show_locus_as_html): New inline function.
(default_diagnostic_start_span_fn): Convert to template.
* doc/invoke.texi (experimental-html): Add "css" and "javascript"
keys.
* opts-diagnostic.cc (html_scheme_handler::make_sink): Likewise.
* selftest-diagnostic.cc
(selftest::test_diagnostic_context::start_span_cb): Update for
template changes.
* selftest-diagnostic.h
(selftest::test_diagnostic_context::start_span_cb): Likewise.
* xml-printer.h: New file.
* xml.h: New file, based on material in diagnostic-format-html.cc,
but using std::string rather than label_text.
(xml::element::m_key_insertion_order): New field.
(struct xml::raw): New.

gcc/fortran/ChangeLog
PR other/116792
* error.cc (gfc_diagnostic_start_span): Update for diagnostic.h
changes.

gcc/testsuite/ChangeLog:
PR other/116792
* gcc.dg/html-output/missing-semicolon.c: Add ":javascript=no" to
html output.
* gcc.dg/html-output/missing-semicolon.py: Move repeated
definitions into lib/htmltest.py.
* gcc.dg/plugin/diagnostic_group_plugin.cc: Update for template
changes.
* gcc.dg/plugin/diagnostic-test-metadata-html.c: Add
":javascript=no" to html output.  Add
"-fdiagnostics-show-line-numbers".
* gcc.dg/plugin/diagnostic-test-metadata-html.py: Move repeated
definitions into lib/htmltest.py.  Add checks of annotated source.
* gcc.dg/plugin/diagnostic-test-paths-2.c: Add ":javascript=no" to
html output.
* gcc.dg/plugin/diagnostic-test-paths-2.py: Move repeated
definitions into lib/htmltest.py.  Add checks of execution path.
* gcc.dg/plugin/diagnostic-test-paths-4.c: Add
-fdiagnostics-add-output=experimental-html:javascript=no.  Add
invocation ot diagnostic-test-paths-4.py.
* gcc.dg/plugin/diagnostic-test-paths-4.py: New test script.
* gcc.dg/plugin/diagnostic-test-show-locus-bw-line-numbers.c: Add
-fdiagnostics-add-output=experimental-html:javascript=no.  Add
invocation of diagnostic-test-show-locus.py.
* gcc.dg/plugin/diagnostic-test-show-locus.py: New test script.
* lib/htmltest.py: New test support script.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
4 months agodiagnostics: split path-printing into a new diagnostic-path-output.cc
David Malcolm [Tue, 27 May 2025 16:22:25 +0000 (12:22 -0400)] 
diagnostics: split path-printing into a new diagnostic-path-output.cc

No functional change intended.

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add diagnostic-path-output.o.
* diagnostic-path-output.cc: New file, taken from material in
diagnostic-path.cc.
* diagnostic-path.cc: Drop includes of
"diagnostic-macro-unwinding.h", "intl.h", "gcc-rich-location.h",
"diagnostic-color.h", "diagnostic-event-id.h",
"diagnostic-label-effects.h", "pretty-print-markup.h",
"selftest.h", "selftest-diagnostic.h",
"selftest-diagnostic-path.h", "text-art/theme.h", and
"diagnostic-format-text.h".
(class path_print_policy): Move to diagnostic-path-output.cc.
(class path_label): Likewise.
(can_consolidate_events): Likewise.
(class per_thread_summary): Likewise.
(struct event_range): Likewise.
(struct path_summary): Likewise.
(per_thread_summary::interprocedural_p): Likewise.
(path_summary::path_summary): Likewise.
(write_indent): Likewise.
(base_indent): Likewise.
(per_frame_indent): Likewise.
(class thread_event_printer): Likewise.
(print_path_summary_as_text): Likewise.
(class element_event_desc): Likewise.
(diagnostic_text_output_format::print_path): Likewise.
(selftest::path_events_have_column_data_p): Likewise.
(selftest::test_empty_path): Likewise.
(selftest::test_intraprocedural_path): Likewise.
(selftest::test_interprocedural_path_1): Likewise.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.
(class selftest::control_flow_test): Likewise.
(selftest::test_control_flow_1): Likewise.
(selftest::test_control_flow_2): Likewise.
(selftest::test_control_flow_3): Likewise.
(selftest::assert_cfg_edge_path_streq): Likewise.
(ASSERT_CFG_EDGE_PATH_STREQ): Likewise.
(selftest::test_control_flow_4): Likewise.
(selftest::test_control_flow_5): Likewise.
(selftest::test_control_flow_6): Likewise.
(selftest::control_flow_tests): Likewise.
(selftest::diagnostic_path_cc_tests): Likewise, renaming
accordingly.
* selftest-run-tests.cc (selftest::run_tests): Update for
move of path-printing selftests.
* selftest.h (selftest::diagnostic_path_cc_tests): Replace decl
with...
(selftest::diagnostic_path_output_cc_tests): ...this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
4 months agos390x: Fix bootstrap.
Juergen Christ [Tue, 27 May 2025 13:42:10 +0000 (15:42 +0200)] 
s390x: Fix bootstrap.

A typo in the mnemonic attribute caused a failed bootstrap.  Not sure
how that passed the bootstrap done before committing.

gcc/ChangeLog:

* config/s390/vector.md(*vec_extract<mode>): Fix mnemonic.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
4 months agotree-optimization/117965 - phiprop validity checking is too strict
Richard Biener [Tue, 10 Dec 2024 08:55:37 +0000 (09:55 +0100)] 
tree-optimization/117965 - phiprop validity checking is too strict

The PR shows that when using std::clamp from the C++ standard library
and there is surrounding code using exceptions then phiprop can fail
to simplify the code so phiopt can turn the clamping into efficient
min/max operations.

The validation code is needlessly complicated, steming from the
time we had memory-SSA with multiple virtual operands.  The following
simplifies this, thereby fixing this issue.

PR tree-optimization/117965
* tree-ssa-phiprop.cc (phivn_valid_p): Remove.
(propagate_with_phi): Pass in virtual PHI node from BB,
rewrite load motion validity check to require the same
virtual use along all paths.

* g++.dg/tree-ssa/pr117965-1.C: New testcase.
* g++.dg/tree-ssa/pr117965-2.C: Likewise.

4 months agoasf: Fix calling of emit_move_insn on registers of different modes [PR119884]
Konstantinos Eleftheriou [Mon, 19 May 2025 11:00:05 +0000 (13:00 +0200)] 
asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

This patch uses `lowpart_subreg` for the base register initialization,
instead of zero-extending it. We had tried this solution before, but
we were leaving undefined bytes in the upper part of the register.
This shouldn't be happening as we are supposed to write the whole
register when the load is eliminated. This was occurring when having
multiple stores with the same offset as the load, generating a
register move for all of them, overwriting the bit inserts that were
inserted before them.

In order to overcome this, we are removing redundant stores from the
sequence, i.e. stores that write to addresses that will be overwritten
by stores that come after them in the sequence. We are using the same
bitmap that is used for the load elimination check, to keep track of
the bytes that are written by each store.

Also, we are now allowing the load to be eliminated even when there
are overlaps between the stores, as there is no obvious reason why we
shouldn't do that, we just want the stores to cover all of the load's
bytes.

Bootstrapped/regtested on AArch64 and x86_64.

PR rtl-optimization/119884

gcc/ChangeLog:

* avoid-store-forwarding.cc (process_store_forwarding):
Use `lowpart_subreg` for the base register initialization
and remove redundant stores from the store/load sequence.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr119884.c: New test.

4 months agosbitmap: Add bitmap_all_bits_in_range_p function
Konstantinos Eleftheriou [Mon, 19 May 2025 11:00:04 +0000 (13:00 +0200)] 
sbitmap: Add bitmap_all_bits_in_range_p function

This patch adds the `bitmap_all_bits_in_range_p` function in sbitmap,
which checks if all the bits in a range are set.

Helper function `bitmap_bit_in_range_p` has also been added, in order
to be used by `bitmap_all_bits_in_range_p` and
`bitmap_any_bit_in_range_p`.  When the function's `any_inverted`
parameter is true, the function checks if any of the bits in the range
is unset, otherwise it checks if any of them is set.

Function `bitmap_any_bit_in_range_p` has been updated to call
`bitmap_bit_in_range_p` with the `any_inverted` parameter set to
false, retaining its previous functionality.

Function `bitmap_all_bits_in_range_p` calls `bitmap_bit_in_range_p`
with `any_inverted` set to true and returns the negation of the
result, i.e. true if all the bits in the range are set.

gcc/ChangeLog:

* sbitmap.cc (bitmap_any_bit_in_range_p):
Call and return the result of `bitmap_bit_in_range_p` with the
`any_inverted` parameter set to false.
(bitmap_bit_in_range_p): New function.
(bitmap_all_bits_in_range_p): New function.
* sbitmap.h (bitmap_all_bits_in_range_p): New function.

4 months agosbitmap: Rename bitmap_bit_in_range_p to bitmap_any_bit_in_range_p
Konstantinos Eleftheriou [Tue, 20 May 2025 15:01:14 +0000 (17:01 +0200)] 
sbitmap: Rename bitmap_bit_in_range_p to bitmap_any_bit_in_range_p

This patch renames `bitmap_bit_in_range_p` to
`bitmap_any_bit_in_range_p` to better reflect its purpose.

gcc/ChangeLog:

* sbitmap.cc (bitmap_bit_in_range_p): Renamed the function.
(bitmap_any_bit_in_range_p): New function name.
(bitmap_bit_in_range_p_checking): Renamed the function.
(bitmap_any_bit_in_range_p_checking): New function name.
(test_set_range): Updated function calls to use the new name.
(test_bit_in_range): Likewise.
* sbitmap.h (bitmap_bit_in_range_p): Renamed the function.
(bitmap_any_bit_in_range_p): New function name.
* tree-ssa-dse.cc (live_bytes_read):
Updated function call to use the new name.

4 months ago[RISC-V] Add andi+bclr synthesis
Shreya Munnangi [Tue, 27 May 2025 12:43:29 +0000 (06:43 -0600)] 
[RISC-V] Add andi+bclr synthesis

So this patch from Shreya adds the ability to use andi + a series of bclr insns
to synthesize a logical AND, much like we're doing for IOR/XOR using ori+bset
or their xor equivalents.

This would regress from a code quality standpoint if we didn't make some
adjustments to a handful of define_insn_and_split patterns in the riscv backend
which support the same kind of idioms.

Essentially we turn those define_insn_and_split patterns into the simple
define_splits they always should have been.  That's been the plan since we
started down this path -- now is the time to make that change for a subset of
patterns.  It may be the case that when we're finished we may not even need
those patterns.  That's still TBD.

I'm aware of one minor regression in xalan.  As seen elsewhere, combine
reconstructs the mask value, uses mvconst_internal to load it into a reg then
an and instruction.  That looks better than the operation synthesis, but only
because of the mvconst_internal little white lie.

This patch does help in a variety of places.  It's fairly common in gimple.c
from 502.gcc to see cases where we'd use bclr to clear a bit, then set the
exact same bit a few instructions later.  That was an artifact of using a
define_insn_and_split -- it wasn't obvious to combine that we had two
instructions manipulating the same bit.  Now that is obvious to combine and the
redundant operation gets removed.

This has spun in my tester with no regressions on riscv32-elf and riscv64-elf.
Hopefully the baseline for the tester as stepped forward 🙂

gcc/
* config/riscv/bitmanip.md (andi+bclr splits): Simplified from
prior define_insn_and_splits.
* config/riscv/riscv.cc (synthesize_and): Add support for andi+bclr
sequences.

Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
4 months agolibstdc++: Fix some names.cc test failures on AIX
Jonathan Wakely [Tue, 27 May 2025 11:54:14 +0000 (12:54 +0100)] 
libstdc++: Fix some names.cc test failures on AIX

libstdc++-v3/ChangeLog:

* testsuite/17_intro/names.cc [_AIX] (n): Undefine.
* testsuite/experimental/names.cc [_AIX] (ptr): Undefine.

4 months agolibstdc++: Fix test failures for 32-bit AIX
Jonathan Wakely [Tue, 27 May 2025 11:06:01 +0000 (12:06 +0100)] 
libstdc++: Fix test failures for 32-bit AIX

With -maix32 (the default) we only have 16-bit wchar_t so these tests
fail. The debug.cc one is because we use -fwide-exec-charset=UTF-32BE
which tries to encode each wide character as four bytes in a 2-byte
wchar_t. The format.cc one is because the clown face character can't be
encoded in a single 16-bit wchar_t.

libstdc++-v3/ChangeLog:

* testsuite/std/format/debug.cc: Disable for targets with 16-bit
wchar_t.
* testsuite/std/format/functions/format.cc: Use -DUNICODE for
targets with 32-bit wchar_t.
(test_unicode) [UNICODE]: Only run checks when UNICODE is
defined.

4 months agolibstdc++: Regenerate include/Makefile.in
Jonathan Wakely [Tue, 27 May 2025 12:28:56 +0000 (13:28 +0100)] 
libstdc++: Regenerate include/Makefile.in

libstdc++-v3/ChangeLog:

* include/Makefile.in: Regenerate.

4 months agoRISC-V: Add test for vec_duplicate + vxor.vv combine case 1 with GR2VR cost 0, 1...
Pan Li [Sun, 25 May 2025 09:17:34 +0000 (17:17 +0800)] 
RISC-V: Add test for vec_duplicate + vxor.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx,
with the GR2VR cost is 0, 1 and 2.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vxor.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Add test for vec_duplicate + vxor.vv combine case 0 with GR2VR cost 0, 2...
Pan Li [Sun, 25 May 2025 09:16:09 +0000 (17:16 +0800)] 
RISC-V: Add test for vec_duplicate + vxor.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx,
with the GR2VR cost is 0, 2 and 15.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vxor.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vxor run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vxor-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agoRISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost
Pan Li [Sun, 25 May 2025 09:13:09 +0000 (17:13 +0800)] 
RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

This patch would like to combine the vec_duplicate + vxor.vv to the
vxor.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, |)

Before this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma
  13   │     vmv.v.x v2,a2
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vxor.vv v1,v1,v2
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vxor.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add
new case for XOR op.
(expand_vx_binary_vec_vec_dup): Diito.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op or to no_shift_vx_ops.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 months agos390: Floating point vector lane handling
Juergen Christ [Thu, 8 May 2025 12:11:00 +0000 (14:11 +0200)] 
s390: Floating point vector lane handling

Since floating point and vector registers overlap on s390, more
efficient code can be generated to extract FPRs from VRs.
Additionally, for double vectors, more efficient code can be generated
to load specific lanes.

gcc/ChangeLog:

* config/s390/vector.md (VF): New mode iterator.
(VEC_SET_NONFLOAT): New mode iterator.
(VEC_SET_SINGLEFLOAT): New mode iterator.
(*vec_set<mode>): Split pattern in two.
(*vec_setv2df): Extract special handling for V2DF mode.
(*vec_extract<mode>): Split pattern in two.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-extract-1.c: New test.
* gcc.target/s390/vector/vec-set-1.c: New test.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
4 months agolibstdc++: Support std::abs for 128-bit integers and floats [PR96710]
Jonathan Wakely [Fri, 16 May 2025 12:05:51 +0000 (13:05 +0100)] 
libstdc++: Support std::abs for 128-bit integers and floats [PR96710]

Currently we only provide std::abs(__int128) and std::abs(__float128)
for non-strict modes, i.e. -std=gnu++NN but not -std=c++NN.

This defines those overloads for strict modes too, as a small step
towards resolving PR 96710 (which will eventually mean that __int128
satisfies the std::integral concept).

libstdc++-v3/ChangeLog:

PR libstdc++/96710
* include/bits/std_abs.h [__SIZEOF_INT128__] (abs(__int128)):
Define.
[_GLIBCXX_USE_FLOAT128] (abs(__float128)): Enable definition for
strict modes.
* testsuite/26_numerics/headers/cmath/82644.cc: Use strict_std
instead of defining __STRICT_ANSI__.
* testsuite/26_numerics/headers/cstdlib/abs128.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
4 months agodoc: Fix typo in description of nonstring attribute
Jonathan Wakely [Tue, 27 May 2025 08:11:58 +0000 (09:11 +0100)] 
doc: Fix typo in description of nonstring attribute

gcc/ChangeLog:

* doc/extend.texi (Common Variable Attributes): Fix typo in
description of nonstring.

4 months agoAVR: target/120442 - Support f7_fdim / fdiml in LibF7.
Georg-Johann Lay [Tue, 27 May 2025 10:03:13 +0000 (12:03 +0200)] 
AVR: target/120442 - Support f7_fdim / fdiml in LibF7.

Add Support for fdiml.
PR target/120442
libgcc/config/avr/libf7/
* libf7-common.mk (LIBF_C_PARTS, m_ddd): Add fdim.
* libf7.h (f7_fdim): New proto.
* libf7.c (f7_fdim): New function.
* f7renames.sh (f7_fdim): Add rename.
* f7-wraps.h: Rebuild
* f7-renames.h: Rebuild

4 months agolibstdc++: Fix bug in default ctor of extents.
Luc Grosheintz [Sat, 24 May 2025 11:26:55 +0000 (13:26 +0200)] 
libstdc++: Fix bug in default ctor of extents.

The array that stores the dynamic extents used to be default
initialized. The standard requires value intialization. This
commit fixes the bug and adds a test.

libstdc++-v3/ChangeLog:

* include/std/mdspan: Value initialize the array storing the
dynamic extents.
* testsuite/23_containers/mdspan/extents/ctor_default.cc: New
test.

Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
4 months agoAVR: target/120441 - Fix f7_exp for |x| ≥ 512.
Georg-Johann Lay [Tue, 27 May 2025 07:43:57 +0000 (09:43 +0200)] 
AVR: target/120441 - Fix f7_exp for |x| ≥ 512.

f7_exp limited exponents to 512, but 1023 * ln2 ≈ 709,
hence 1024 is a correct limit.

libgcc/config/avr/libf7/
PR target/120441
* libf7.c (f7_exp): Limit aa->expo to 10 (not to 9).

4 months agodriver: Fix multilib_os_dir and multiarch_dir for those target use TARGET_COMPUTE_MUL...
Kito Cheng [Mon, 10 Mar 2025 08:26:04 +0000 (16:26 +0800)] 
driver: Fix multilib_os_dir and multiarch_dir for those target use TARGET_COMPUTE_MULTILIB

This patch fixes the multilib_os_dir and multiarch_dir for those targets
that use TARGET_COMPUTE_MULTILIB, since the TARGET_COMPUTE_MULTILIB hook
only update/fix the multilib_dir but not the multilib_os_dir and multiarch_dir,
so the multilib_os_dir and multiarch_dir are not set correctly for those targets.

Use RISC-V linux target (riscv64-unknown-linux-gnu) as an example:

```
$ riscv64-unknown-linux-gnu-gcc -print-multi-lib
.;
lib32/ilp32;@march=rv32imac@mabi=ilp32
lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d
lib64/lp64;@march=rv64imac@mabi=lp64
lib64/lp64d;@march=rv64imafdc@mabi=lp64d
```

If we use the exactly same -march and -mabi options to compile a source file,
the multilib_os_dir and multiarch_dir are set correctly:

```
$ riscv64-unknown-linux-gnu-gcc -print-multi-os-directory -march=rv64imafdc -mabi=lp64d
../lib64/lp64d
$ riscv64-unknown-linux-gnu-gcc -print-multi-directory -march=rv64imafdc -mabi=lp64d
lib64/lp64d
```

However if we use the -march=rv64imafdcv -mabi=lp64d option to compile a source
file, the multilib_os_dir and multiarch_dir are not set correctly:
```
$ riscv64-unknown-linux-gnu-gcc -print-multi-os-directory -march=rv64imafdc -mabi=lp64d
lib64/lp64d
$ riscv64-unknown-linux-gnu-gcc -print-multi-directory -march=rv64imafdc -mabi=lp64d
lib64/lp64d
```

That's because the TARGET_COMPUTE_MULTILIB hook only update/fix the multilib_dir
but not the multilib_os_dir, so the multilib_os_dir is blank and will use same
value as multilib_dir, but that is not correct.

So we introduce second chance to fix the multilib_os_dir if it's not set, we do
also try to fix the multiarch_dir, because it may also not set correctly if
multilib_os_dir is not set.

Changes since v1:
- Fix non-multilib build.
- Fix fix indentation.

gcc/ChangeLog:

* gcc.cc (find_multilib_os_dir_by_multilib_dir): New.
(set_multilib_dir): Fix multilib_os_dir and multiarch_dir
if multilib_os_dir is not set.

4 months agoRISC-V: Add testcases for signed vector SAT_ADD IMM form 1
xuli [Thu, 26 Dec 2024 09:39:08 +0000 (09:39 +0000)] 
RISC-V: Add testcases for signed vector SAT_ADD IMM form 1

This patch adds testcase for form1, as shown below:

void __attribute__((noinline))                                       \
vec_sat_s_add_imm_##T##_fmt_1##_##INDEX (T *out, T *op_1, unsigned limit) \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      T x = op_1[i];                                                 \
      T sum = (UT)x + (UT)IMM;                                       \
      out[i] = (x ^ IMM) < 0                                         \
        ? sum                                                        \
        : (sum ^ x) >= 0                                             \
          ? sum                                                      \
          : x < 0 ? MIN : MAX;                                       \
    }                                                                \
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: add signed vec SAT_ADD IMM form1.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: add sat_s_add_imm data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i8.c: New test.

4 months agoMatch:Support signed vector SAT_ADD IMM form 1
xuli [Thu, 26 Dec 2024 09:06:57 +0000 (09:06 +0000)] 
Match:Support signed vector SAT_ADD IMM form 1

This patch would like to support vector SAT_ADD when one of the op
is singed IMM.

void __attribute__((noinline))                                       \
vec_sat_s_add_imm_##T##_fmt_1##_##INDEX (T *out, T *op_1, unsigned limit) \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      T x = op_1[i];                                                 \
      T sum = (UT)x + (UT)IMM;                                       \
      out[i] = (x ^ IMM) < 0                                         \
        ? sum                                                        \
        : (sum ^ x) >= 0                                             \
          ? sum                                                      \
          : x < 0 ? MIN : MAX;                                       \
    }                                                                \
}

Take below form1 as example:
DEF_VEC_SAT_S_ADD_IMM_FMT_1(0, int8_t, uint8_t, 9, INT8_MIN, INT8_MAX)

Before this patch:
__attribute__((noinline))
void vec_sat_s_add_imm_int8_t_fmt_1_0 (int8_t * restrict out, int8_t * restrict op_1, unsigned int limit)
{
  vector([16,16]) signed char * vectp_out.28;
  vector([16,16]) signed char vect_iftmp.27;
  vector([16,16]) <signed-boolean:1> mask__28.26;
  vector([16,16]) <signed-boolean:1> mask__29.25;
  vector([16,16]) <signed-boolean:1> mask__19.19;
  vector([16,16]) <signed-boolean:1> mask__31.18;
  vector([16,16]) signed char vect__6.17;
  vector([16,16]) signed char vect__5.16;
  vector([16,16]) signed char vect_sum_15.15;
  vector([16,16]) unsigned char vect__4.14;
  vector([16,16]) unsigned char vect_x.13;
  vector([16,16]) signed char vect_x_14.12;
  vector([16,16]) signed char * vectp_op_1.10;
  vector([16,16]) <signed-boolean:1> _78;
  vector([16,16]) unsigned char _79;
  vector([16,16]) unsigned char _80;
  unsigned long _92;
  unsigned long ivtmp_93;
  unsigned long ivtmp_94;
  unsigned long _95;

  <bb 2> [local count: 118111598]:
  if (limit_12(D) != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 3> [local count: 105119322]:
  _92 = (unsigned long) limit_12(D);

  <bb 4> [local count: 955630226]:
  # vectp_op_1.10_62 = PHI <vectp_op_1.10_63(4), op_1_13(D)(3)>
  # vectp_out.28_89 = PHI <vectp_out.28_90(4), out_16(D)(3)>
  # ivtmp_93 = PHI <ivtmp_94(4), _92(3)>
  _95 = .SELECT_VL (ivtmp_93, POLY_INT_CST [16, 16]);
  vect_x_14.12_64 = .MASK_LEN_LOAD (vectp_op_1.10_62, 8B, { -1, ... }, _95, 0);
  vect_x.13_65 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_14.12_64);
  vect__4.14_67 = vect_x.13_65 + { 9, ... };
  vect_sum_15.15_68 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__4.14_67);
  vect__5.16_70 = vect_x_14.12_64 ^ { 9, ... };
  vect__6.17_71 = vect_x_14.12_64 ^ vect_sum_15.15_68;
  mask__31.18_73 = vect__5.16_70 >= { 0, ... };
  mask__19.19_75 = vect_x_14.12_64 < { 0, ... };
  mask__29.25_85 = vect__6.17_71 < { 0, ... };
  mask__28.26_86 = mask__31.18_73 & mask__29.25_85;
  _78 = ~mask__28.26_86;
  _79 = .VCOND_MASK (mask__19.19_75, { 128, ... }, { 127, ... });
  _80 = .COND_ADD (_78, vect_x.13_65, { 9, ... }, _79);
  vect_iftmp.27_87 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(_80);
  .MASK_LEN_STORE (vectp_out.28_89, 8B, { -1, ... }, _95, 0, vect_iftmp.27_87);
  vectp_op_1.10_63 = vectp_op_1.10_62 + _95;
  vectp_out.28_90 = vectp_out.28_89 + _95;
  ivtmp_94 = ivtmp_93 - _95;
  if (ivtmp_94 != 0)
    goto <bb 4>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 5> [local count: 118111600]:
  return;

}

After this patch:
__attribute__((noinline))
void vec_sat_s_add_imm_int8_t_fmt_1_0 (int8_t * restrict out, int8_t * restrict op_1, unsigned int limit)
{
  vector([16,16]) signed char * vectp_out.12;
  vector([16,16]) signed char vect_patt_10.11;
  vector([16,16]) signed char vect_x_14.10;
  vector([16,16]) signed char D.2852;
  vector([16,16]) signed char * vectp_op_1.8;
  vector([16,16]) signed char _73(D);
  unsigned long _80;
  unsigned long ivtmp_81;
  unsigned long ivtmp_82;
  unsigned long _83;

  <bb 2> [local count: 118111598]:
  if (limit_12(D) != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 3> [local count: 105119322]:
  _80 = (unsigned long) limit_12(D);

  <bb 4> [local count: 955630226]:
  # vectp_op_1.8_71 = PHI <vectp_op_1.8_72(4), op_1_13(D)(3)>
  # vectp_out.12_77 = PHI <vectp_out.12_78(4), out_16(D)(3)>
  # ivtmp_81 = PHI <ivtmp_82(4), _80(3)>
  _83 = .SELECT_VL (ivtmp_81, POLY_INT_CST [16, 16]);
  vect_x_14.10_74 = .MASK_LEN_LOAD (vectp_op_1.8_71, 8B, { -1, ... }, _73(D), _83, 0);
  vect_patt_10.11_75 = .SAT_ADD (vect_x_14.10_74, { 9, ... });
  .MASK_LEN_STORE (vectp_out.12_77, 8B, { -1, ... }, _83, 0, vect_patt_10.11_75);
  vectp_op_1.8_72 = vectp_op_1.8_71 + _83;
  vectp_out.12_78 = vectp_out.12_77 + _83;
  ivtmp_82 = ivtmp_81 - _83;
  if (ivtmp_82 != 0)
    goto <bb 4>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 5> [local count: 118111600]:
  return;

}

The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

* match.pd: add singned vector SAT_ADD IMM form1 matching.

4 months agoRISC-V:Add testcases for signed .SAT_ADD IMM form 1 with IMM = -1.
xuli [Fri, 27 Dec 2024 07:59:31 +0000 (07:59 +0000)] 
RISC-V:Add testcases for signed .SAT_ADD IMM form 1 with IMM = -1.

This patch adds testcase for form1, as shown below:

T __attribute__((noinline))                  \
sat_s_add_imm_##T##_fmt_1##_##INDEX (T x)             \
{                                            \
  T sum = (UT)x + (UT)IMM;                     \
  return (x ^ IMM) < 0                         \
    ? sum                                    \
    : (sum ^ x) >= 0                         \
      ? sum                                  \
      : x < 0 ? MIN : MAX;                   \
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add_imm-2.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-3.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-4.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i64.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i8.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-2.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-3.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-4.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i64.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i8.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-2-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-3-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-1-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i8.c: ...here.

4 months agoMatch:Support IMM=-1 for signed scalar SAT_ADD IMM form1
xuli [Fri, 27 Dec 2024 07:03:14 +0000 (07:03 +0000)] 
Match:Support IMM=-1 for signed scalar SAT_ADD IMM form1

This patch would like to support .SAT_ADD when IMM=-1.

Form1:
T __attribute__((noinline))                  \
sat_s_add_imm_##T##_fmt_1##_##INDEX (T x)             \
{                                            \
  T sum = (UT)x + (UT)IMM;                     \
  return (x ^ IMM) < 0                         \
    ? sum                                    \
    : (sum ^ x) >= 0                         \
      ? sum                                  \
      : x < 0 ? MIN : MAX;                   \
}

Take below form1 as example:
DEF_SAT_S_ADD_IMM_FMT_1(0, int8_t, uint8_t, -1, INT8_MIN, INT8_MAX)

Before this patch:
__attribute__((noinline))
int8_t sat_s_add_imm_int8_t_fmt_1_0 (int8_t x)
{
  unsigned char x.0_1;
  unsigned char _2;
  unsigned char _3;
  int8_t iftmp.1_4;
  signed char _8;
  unsigned char _9;
  signed char _10;

  <bb 2> [local count: 1073741824]:
  x.0_1 = (unsigned char) x_5(D);
  _3 = -x.0_1;
  _10 = (signed char) _3;
  _8 = x_5(D) & _10;
  if (_8 < 0)
    goto <bb 4>; [1.40%]
  else
    goto <bb 3>; [98.60%]

  <bb 3> [local count: 434070867]:
  _2 = x.0_1 + 255;

  <bb 4> [local count: 1073741824]:
  # _9 = PHI <_2(3), 128(2)>
  iftmp.1_4 = (int8_t) _9;
  return iftmp.1_4;

}

After this patch:
__attribute__((noinline))
int8_t sat_s_add_imm_int8_t_fmt_1_0 (int8_t x)
{
  int8_t _4;

  <bb 2> [local count: 1073741824]:
  gimple_call <.SAT_ADD, _4, x_5(D), 255> [tail call]
  gimple_return <_4>

}

The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

* match.pd: Add signed scalar SAT_ADD IMM form1 with IMM=-1 matching.
* tree-ssa-math-opts.cc (match_unsigned_saturation_add): Adapt function name.
(match_saturation_add_with_assign): Match signed and unsigned SAT_ADD with assign.
(math_opts_dom_walker::after_dom_children): Match imm=-1 signed SAT_ADD with NOP_EXPR case.

4 months agoDaily bump.
GCC Administrator [Tue, 27 May 2025 00:19:27 +0000 (00:19 +0000)] 
Daily bump.

4 months agoc-c++-common/gomp/{attrs-,}metadirective-3.c: Fix expected result [PR118694]
Tobias Burnus [Mon, 26 May 2025 17:50:40 +0000 (19:50 +0200)] 
c-c++-common/gomp/{attrs-,}metadirective-3.c: Fix expected result [PR118694]

With compilation for nvptx enabled, two issues showed up:
(a) "error: 'target' construct with nested 'teams' construct contains
     directives outside of the 'teams' construct"
    See PR comment 9 why this is difficult to fix.
Solution: Add dg-bogus and accept/expect the error for 'target offload_nvptx'.

(b) The assumptions about the dump for 'target offload_nvptx' were wrong
    as the metadirective was already expanded to a OMP_NEXT_VARIANT
    construct such that no 'omp metadirective' was left in either case.
Solution: Check that no 'omp metadirective' is left; additionally, expect
either OMP_NEXT_VARIANT (when offload_nvptx is available) or no 'teams'
directive at all (if not).

gcc/testsuite/ChangeLog:

PR middle-end/118694
* c-c++-common/gomp/attrs-metadirective-3.c: Change to never
expect 'omp metadirective' in the dump. If !offload_nvptx, check
that no 'teams' shows up in the dump; for offload_nvptx, expect
OMP_NEXT_VARIANT and an error about directive between 'target'
and 'teams'.
* c-c++-common/gomp/metadirective-3.c: Likewise.

4 months agolibstdc++: Run in_place constructor test for std::indirect [PR119152]
Tomasz Kamiński [Mon, 26 May 2025 15:35:08 +0000 (17:35 +0200)] 
libstdc++: Run in_place constructor test for std::indirect [PR119152]

In indirect/ctor.cc test_inplace_ctor function was defined, but never
called.

PR libstdc++/119152

libstdc++-v3/ChangeLog:

* testsuite/std/memory/indirect/ctor.cc: Run test_inplace_ctor.

4 months agoOpenMP/C++: Avoid ICE for BIND_EXPR with empty BIND_EXPR_BLOCK [PR120413]
Tobias Burnus [Mon, 26 May 2025 15:58:07 +0000 (17:58 +0200)] 
OpenMP/C++: Avoid ICE for BIND_EXPR with empty BIND_EXPR_BLOCK [PR120413]

PR c++/120413

gcc/cp/ChangeLog:

* semantics.cc (finish_omp_target_clauses_r): Handle
BIND_EXPR with empty BIND_EXPR_BLOCK.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-4.C: New test.

4 months agoc++: add -fdump-lang-tinst
Jason Merrill [Fri, 18 Apr 2025 13:50:04 +0000 (09:50 -0400)] 
c++: add -fdump-lang-tinst

This patch adds a dump with a trace of template instantiations, indented
based on the depth of recursive instantiation. -lineno adds the location
that triggered the instantiation, -details adds non-instantiation
sbustitutions.

The instantiate_pending_templates change is to avoid a bunch of entries for
reopening tinst scopes that we then don't instantiate anything with; it also
seems a bit cleaner this way.

gcc/cp/ChangeLog:

* cp-tree.h: Declare tinst_dump_id.
* cp-objcp-common.cc (cp_register_dumps): Set it.
* pt.cc (push_tinst_level_loc): Dump it.
(reopen_tinst_level): Here too.
(tinst_complete_p): New.
(instantiate_pending_templates): Don't reopen_tinst_level for
already-complete instantiations.

gcc/ChangeLog:

* doc/invoke.texi: Move C++ -fdump-lang to C++ section.
Add -fdump-lang-tinst.

4 months agoc++: add cxx_dump_pretty_printer
Jason Merrill [Thu, 17 Apr 2025 20:29:49 +0000 (16:29 -0400)] 
c++: add cxx_dump_pretty_printer

A class to simplify implementation of -fdump-lang-foo with support for
pp_printf using %D and such.

gcc/cp/ChangeLog:

* cp-tree.h (class cxx_dump_pretty_printer): New.
* error.cc (cxx_dump_pretty_printer): Ctor/dtor definitions.

4 months agolibstdc++: Implement C++26 std::indirect [PR119152]
Jonathan Wakely [Thu, 21 Mar 2024 23:07:56 +0000 (23:07 +0000)] 
libstdc++: Implement C++26 std::indirect [PR119152]

This patch implements C++26 std::indirect as specified
in P3019 with amendment to move assignment from LWG 4251.

PR libstdc++/119152

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc: Added indirect.h file.
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/indirect.h: New file.
* include/bits/version.def (indirect): Define.
* include/bits/version.h: Regenerate.
* include/std/memory: Include new header.
* testsuite/std/memory/indirect/copy.cc
* testsuite/std/memory/indirect/copy_alloc.cc
* testsuite/std/memory/indirect/ctor.cc
* testsuite/std/memory/indirect/incomplete.cc
* testsuite/std/memory/indirect/invalid_neg.cc
* testsuite/std/memory/indirect/move.cc
* testsuite/std/memory/indirect/move_alloc.cc
* testsuite/std/memory/indirect/relops.cc

Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
4 months agolibstdc++: Implement C++26 function_ref [PR119126]
Tomasz Kamiński [Wed, 14 May 2025 10:04:24 +0000 (12:04 +0200)] 
libstdc++: Implement C++26 function_ref [PR119126]

This patch implements C++26 function_ref as specified in P0792R14,
with correction for constraints for constructor accepting nontype_t
parameter from LWG 4256.

As function_ref may store a pointer to the const object, __Ptrs::_M_obj is
changed to const void*, so again we do not cast away const from const
objects. To help with necessary casts, a __polyfunc::__cast_to helper is
added, that accepts reference to or target type direclty.

The _Invoker now defines additional call methods used by function_ref:
_S_ptrs() for invoking target passed by reference, and __S_nttp, _S_bind_ptr,
_S_bind_ref for handling constructors accepting nontype_t. The existing
_S_call_storage is changed to thin wrapper, that initialies _Ptrs, and forwards
to _S_call_ptrs.

This reduced the most uses of _Storage::_M_ptr and _Storage::_M_ref,
so this functions was removed, and _Manager uses were adjusted.

Finally we make function_ref available in freestanding mode, as
move_only_function and copyable_function are currently only available in hosted,
so we define _Manager and _Mo_base only if either __glibcxx_move_only_function
or __glibcxx_copyable_function is defined.

PR libstdc++/119126

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc: Added funcref_impl.h file.
* include/Makefile.am: Added funcref_impl.h file.
* include/Makefile.in: Added funcref_impl.h file.
* include/bits/funcref_impl.h: New file.
* include/bits/funcwrap.h: (_Ptrs::_M_obj): Const-qualify.
(_Storage::_M_ptr, _Storage::_M_ref): Remove.
(__polyfunc::__cast_to) Define.
(_Base_invoker::_S_ptrs, _Base_invoker::_S_nttp)
(_Base_invoker::_S_bind_ptrs, _Base_invoker::_S_bind_ref)
(_Base_invoker::_S_call_ptrs): Define.
(_Base_invoker::_S_call_storage): Foward to _S_call_ptrs.
(_Manager::_S_local, _Manager::_S_ptr): Adjust for _M_obj being
const qualified.
(__polyfunc::_Manager, __polyfunc::_Mo_base): Guard with
__glibcxx_move_only_function || __glibcxx_copyable_function.
(__polyfunc::__skip_first_arg, __polyfunc::__deduce_funcref)
(std::function_ref) [__glibcxx_function_ref]: Define.
* include/bits/utility.h (std::nontype_t, std::nontype)
(__is_nontype_v) [__glibcxx_function_ref]: Define.
* include/bits/version.def: Define function_ref.
* include/bits/version.h: Regenerate.
* include/std/functional: Define __cpp_lib_function_ref.
* src/c++23/std.cc.in (std::nontype_t, std::nontype)
(std::function_ref) [__cpp_lib_function_ref]: Export.
* testsuite/20_util/function_ref/assign.cc: New test.
* testsuite/20_util/function_ref/call.cc: New test.
* testsuite/20_util/function_ref/cons.cc: New test.
* testsuite/20_util/function_ref/cons_neg.cc: New test.
* testsuite/20_util/function_ref/conv.cc: New test.
* testsuite/20_util/function_ref/deduction.cc: New test.
* testsuite/20_util/function_ref/mutation.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
4 months agoFixup gcc.target/i386/vect-epilogues-5.c
Richard Biener [Mon, 26 May 2025 09:21:19 +0000 (11:21 +0200)] 
Fixup gcc.target/i386/vect-epilogues-5.c

The following adjusts the expected messages after -fopt-info-vec
was improved for (masked) epilogues.

* gcc.target/i386/vect-epilogues-5.c: Adjust.

4 months ago[AUTOFDO][AARCH64] Add support for profilebootstrap
Kugan Vivekanandarajah [Mon, 26 May 2025 01:41:59 +0000 (11:41 +1000)] 
[AUTOFDO][AARCH64] Add support for profilebootstrap

Add support for autoprofiledbootstrap in aarch64.
This is similar to what is done for i386. Added
gcc/config/aarch64/gcc-auto-profile for aarch64 profile
creation.

How to run:
configure --with-build-config=bootstrap-lto
make autoprofiledbootstrap

ChangeLog:

* Makefile.def: AUTO_PROFILE based on cpu_type.
* Makefile.in: Likewise.
* configure: Regenerate.
* configure.ac: Set autofdo_target.

gcc/ChangeLog:

* config/aarch64/gcc-auto-profile: New file.

Signed-off-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>