Jakub Jelinek [Fri, 28 Mar 2025 14:45:03 +0000 (15:45 +0100)]
srcextra fixes
Here is a patch which uses sed to fix up the copies of the generated
files by flex/bison in the source directories (i.e. what we ship in
release tarballs).
In that case the generated files are in the same directory as the
files they are generated from, so there should be no absolute or relative
directories, just the filenames.
Furthermore, c.srcextra was duplicating the work of gcc.srcextra, there is
nothing C FE specific on gengtype-lex.l.
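The effect of the fix-up on a #line directive can be sketched in Python.
This is only an illustration: the patch itself uses sed from the Makefiles,
the regular expression below is an assumption rather than the expression
from the patch, and strip_line_dirs is a hypothetical name. Unlike the
patch, which targets specific file names, this sketch strips any directory
prefix.

```python
import re

# Hypothetical pattern: '#line N "', an optional directory prefix, the file name.
_LINE_DIRECTIVE = re.compile(r'^(#line \d+ ")(?:[^"]*/)?([^"/]+")', re.MULTILINE)


def strip_line_dirs(text):
    """Drop any absolute or relative directory from #line directives."""
    return _LINE_DIRECTIVE.sub(r"\1\2", text)
```

Lines that are not #line directives, or that already name a bare file, are
left unchanged.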
2025-03-28 Jakub Jelinek <jakub@redhat.com>
gcc/
* Makefile.in (gcc.srcextra): Use sed to turn .../gcc/gengtype-lex.l
in #line directives into just gengtype-lex.l.
gcc/c/
* Make-lang.in (c.srcextra): Don't depend on anything and don't copy
anything.
gcc/cobol/
* Make-lang.in (cobol.srcextra): Use sed to turn
.../gcc/cobol/*.{y,l,h,cc} and cobol/*.{y,l,h,cc} in #line directives
into just *.{y,l,h,cc}.
Richard Biener [Fri, 28 Mar 2025 14:20:16 +0000 (15:20 +0100)]
other/119510 - use --enable-languages=default,cobol for release tarballs
The following adds cobol to the set of languages built during release
tarball building so the bison and flex generated sources for cobol
are included in the tarball.
PR other/119510
maintainer-scripts/
* gcc_release: Use --enable-languages=default,cobol
when building generated files.
Andrew MacLeod [Wed, 26 Mar 2025 14:34:42 +0000 (10:34 -0400)]
If the LHS does not contain zero, neither do multiply operands.
Given ~[0,0] = op1 * op2, range-ops should determine that neither op1 nor
op2 is zero. Add this to the operator_mult for op1_range. op2_range
simply invokes op1_range, so both will be covered.
PR tree-optimization/110992.c
PR tree-optimization/119471.c
gcc/
* range-op.cc (operator_mult::op1_range): If the LHS does not
contain zero, return non-zero.
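The property behind this change can be checked exhaustively for a small
integer width. The following Python sketch is not the range-op code; it
assumes an 8-bit unsigned type with wrapping multiplication, and both
function names are hypothetical.

```python
def wrapping_mul(a, b, bits=8):
    """Multiply as an unsigned `bits`-wide integer type would (modulo 2**bits)."""
    return (a * b) % (1 << bits)


def operands_nonzero_when_product_nonzero(bits=8):
    """Exhaustively check: product != 0 implies op1 != 0 and op2 != 0."""
    limit = 1 << bits
    return all(
        a != 0 and b != 0
        for a in range(limit)
        for b in range(limit)
        if wrapping_mul(a, b, bits) != 0
    )
```

The implication only goes one way: with wrapping arithmetic two non-zero
operands can still multiply to zero (16 * 16 in 8 bits), so a zero LHS says
nothing about the operands.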
Richard Biener [Fri, 28 Mar 2025 12:48:36 +0000 (13:48 +0100)]
bootstrap/119513 - fix cobol bootstrap with --enable-generated-files-in-srcdir
This adds gcc/cobol/parse.o to compare_exclusions and makes sure to
ignore errors when copying generated files, like it's done when
copying gengtype-lex.cc.
Jakub Jelinek [Fri, 28 Mar 2025 09:49:40 +0000 (10:49 +0100)]
tailc: Handle musttail noreturn calls [PR119483]
The following (first) testcase is accepted by clang (if clang::musttail)
and rejected by gcc, because we discover the call is noreturn and then bail
out because we don't want noreturn tailcalls.
The general reason not to support noreturn tail calls is for cases like
abort where we want nicer backtrace, but if user asks explicitly to
musttail a call which either is explicitly noreturn or is implicitly
determined to be noreturn, I don't see a reason why we couldn't do that.
Both for tail calls and tail recursions.
An alternative would be to keep rejecting musttail to explicit noreturn,
but not actually implicitly mark anything as noreturn if it has any musttail
calls. But it is unclear how we could do that, such marking is I think done
typically before IPA and e.g. for LTO we won't know whether some other TU
could have musttail calls to it. And keeping around both explicit and
implicit noreturn bits would be ugly. Well, I guess we could differentiate
between presence of noreturn/_Noreturn attributes and just ECF_NORETURN
without those, but then tailc would still need to support it, just error out
if it was explicit.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119483
* tree-tailcall.cc (find_tail_calls): Handle noreturn musttail
calls.
(eliminate_tail_call): Likewise.
(tree_optimize_tail_calls_1): If cfun->has_musttail and
diag_musttail, handle also basic blocks with no successors
with noreturn musttail calls.
* calls.cc (can_implement_as_sibling_call_p): Allow ECF_NORETURN
calls if they are musttail calls.
* c-c++-common/pr119483-1.c: New test.
* c-c++-common/pr119483-2.c: New test.
Jakub Jelinek [Fri, 28 Mar 2025 09:48:31 +0000 (10:48 +0100)]
ipa-sra: Don't change return type to void if there are musttail calls [PR119484]
The following testcase is rejected, because IPA-SRA decides to
turn bar.constprop call into bar.constprop.isra which returns void.
While there is no explicit lhs on the call, as it is a musttail call
the tailc pass checks if IPA-VRP returns singleton from that function
and the function returns the same value and in that case it still turns
it into a tail call. This can't work with IPA-SRA changing it into
void returning function though.
The following patch fixes this by forcing returning the original type
if there are musttail calls.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
PR ipa/119484
* ipa-sra.cc (isra_analyze_call): Don't set m_return_ignored if
gimple_call_must_tail_p even if it doesn't have lhs.
Richard Biener [Fri, 21 Mar 2025 18:30:31 +0000 (19:30 +0100)]
Export native_encode_real operating on REAL_VALUE_TYPE
The following exports the native_encode_real worker, and makes it
take a scalar float mode and REAL_VALUE_TYPE data instead of a tree
for use in the COBOL frontend, avoiding creating of a temporary tree.
* fold-const.h (native_encode_real): Export.
* fold-const.cc (native_encode_real): Change API to take
mode and REAL_VALUE_TYPE.
(native_encode_expr): Adjust.
Iain Sandoe [Thu, 20 Mar 2025 17:08:57 +0000 (17:08 +0000)]
cobol: Do not include <cmath> (no longer needed)
Several of the enumerators in parse.y conflict with ones declared in at
least some versions of <cmath> .. e.g. "OVERFLOW". The header is no
longer needed since the FE is not trying to do host arithmetic.
David Malcolm [Thu, 27 Mar 2025 23:46:20 +0000 (19:46 -0400)]
contrib: add dg-lint and libgdiagnostics.py [PR116163]
Changed in v2:
- eliminated COMMON_MISSPELLINGS in favor of retesting with a regexp
that adds underscores
- add a list of KNOWN_DIRECTIVES, and complain if we see a directive
that isn't in the list
- various refactorings to reduce the nesting within the script
- skip more kinds of file ('README', 'Makefile.am', 'Makefile.in',
'gen_directive_tests')
- keep track of the number of files scanned and report it at the end
with a note
This patch adds a new dg-lint subdirectory below contrib, containing
a "dg-lint" script for detecting common mistakes made in our DejaGnu
tests.
Specifically, DejaGnu's dg.exp's dg-get-options has a regexp for
detecting dg- directives
https://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=blob;f=lib/dg.exp
here's the current:
set tmp [grep $prog "{\[ \t\]\+dg-\[-a-z\]\+\[ \t\]\+.*\[ \t\]\+}" line]
which if I'm reading it right requires a "{", then one or more tab/space
chars, then a "dg-" directive name, then one or more tab/space
characters, then anything (for arguments to the directive), then one or
more tab/space characters, then a "}".
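To make that reading concrete, here is a Python transcription of the
pattern. It is a sketch under the assumption that Python's re module and
Tcl's regexp agree on these simple character classes; DG_DIRECTIVE and
matches_dg_regexp are hypothetical names and are not taken from dg-lint.

```python
import re

# Python spelling of the Tcl pattern quoted above (same character classes).
DG_DIRECTIVE = re.compile(r"\{[ \t]+dg-[-a-z]+[ \t]+.*[ \t]+\}")


def matches_dg_regexp(line):
    """True if `line` contains something dg.exp would treat as a directive."""
    return DG_DIRECTIVE.search(line) is not None
```

A well-formed directive such as `/* { dg-do compile } */` matches, while
`{dg-output-file "m4.out"}` (no space after the brace) and
`{ dg-output-file, "m4.out" }` (comma after the name) do not.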
There are numerous places in our testsuite which look like attempts to
use a directive, but which don't match this regexp.
The script warns about such places, along with a list of misspelled
directives (currently just "dg_options" for "dg-options"), and a warning
if a dg-do appears after a dg-require-* (as per
https://gcc.gnu.org/onlinedocs/gccint/Directives.html
"This directive must appear after any dg-do directive in the test
and before any dg-additional-sources directive." for
dg-require-effective-target).
dg-lint uses libgdiagnostics to report its results; the patch adds a
new libgdiagnostics.py script below contrib/dg-lint. This uses Python's
ctypes module to expose libgdiagnostics.so to Python via FFI. Hence
the warnings have colorization, quote the pertinent parts of the tested
file, can have fix-it hints, etc. Here's the output from the tests, run
from the top-level directory:
$ LD_LIBRARY_PATH=../build/gcc/ ./contrib/dg-lint/dg-lint contrib/dg-lint/test-*.c
contrib/dg-lint/test-1.c:6:6: warning: misspelled directive: 'dg_final'; did you mean 'dg-final'?
6 | /* { dg_final { scan_assembler_times "vmsumudm" 2 } } */
| ^~~~~~~~
| dg-final
contrib/dg-lint/test-1.c:15:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
15 | dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:18:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
18 | dg-output-file "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:21:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
21 | { dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:24:5: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
24 | {dg-output-file "m4.out"}
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:27:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
27 | { dg-output-file, "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-2.c:4:6: warning: 'dg-do' after 'dg-require-effective-target'
4 | /* { dg-do compile } */
| ^~~~~
contrib/dg-lint/test-2.c:3:6: note: 'dg-require-effective-target' was here
3 | /* { dg-require-effective-target c++11 } */
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
I don't yet have a way to verify these tests (clearly we can't use
DejaGnu for this).
These Python bindings could be used by other projects, but so far I only
implemented what I needed for dg-lint.
Running the test on the GCC source tree finds dozens of issues, which
followup patches address.
Tested with Python 3.8
contrib/ChangeLog:
PR testsuite/116163
* dg-lint/dg-lint: New file.
* dg-lint/libgdiagnostics.py: New file.
* dg-lint/test-1.c: New file.
* dg-lint/test-2.c: New file.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Bob Dubner [Thu, 27 Mar 2025 21:55:53 +0000 (17:55 -0400)]
cobol: Incorporate new testcases from the cobolworx UAT tests.
The author notes that some of the file names are regrettably lengthy,
which is because they are derived from the descriptive names of the
autom4te tests.
Jakub Jelinek [Thu, 27 Mar 2025 20:21:48 +0000 (21:21 +0100)]
testsuite: Fix up strub-internal-pr112938.C test for C++2{0,3,6}
On Thu, Mar 27, 2025 at 12:05:21AM +0000, Sam James wrote:
> The test was being ignored because dg.exp looks for .C in g++.dg/.
>
> gcc/testsuite/ChangeLog:
> PR middle-end/112938
>
> * g++.dg/strub-internal-pr112938.cc: Move to...
> * g++.dg/strub-internal-pr112938.C: ...here.
This regressed the test for C++20 and higher:
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++20 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++23 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++26 (test for excess errors)
Here is a fix.
2025-03-27 Jakub Jelinek <jakub@redhat.com>
* g++.dg/strub-internal-pr112938.C: Add dg-warning for c++20.
Eric Botcazou [Thu, 27 Mar 2025 19:29:51 +0000 (20:29 +0100)]
Ada: Fix too late initialization of tasking runtime with standalone library
The Tasking_Runtime_Initialize routine installs the tasking version of the
RTS_Lock manipulation routines and thus needs to be called very early before
the elaboration of all the Ada units of the program, including those of the
runtime itself.
This is guaranteed by the binder when the tasking runtime is explicitly
dragged into the link. However, for a standalone dynamic library that
does not depend on the tasking runtime and is auto-initialized, no such
guarantee holds, even though the library might be later dragged into a
link that contains the tasking runtime.
This change causes the routine to be called even earlier, in particular
at load time when a (standalone) dynamic library is involved in the link,
so as to meet the requirements. It will cause the routine to be called
twice if the main subprogram is generated by the binder, but this is
harmless since the routine is idempotent.
ada/
* libgnarl/s-tasini.adb (Tasking_Runtime_Initialize): Add pragma
Linker_Constructor for the procedure.
Sam James [Thu, 27 Mar 2025 17:52:19 +0000 (17:52 +0000)]
testsuite: revert Fortran change
Revert part of my change from r15-8973-g1307de1b4e7d5e; as Harald points
out, the comment explains why this is there. It's a hack but it needs to
stay for now. (I did have this marked as a TODO in my branch and didn't
leave a proper note as to why, so it's my fault.)
Jonathan Wakely [Wed, 26 Mar 2025 11:47:05 +0000 (11:47 +0000)]
libstdc++: Replace use of std::min in ranges::uninitialized_xxx algos [PR101587]
Because ranges can have any signed integer-like type as difference_type,
it's not valid to use std::min(diff1, diff2). Instead of calling
std::min with an explicit template argument, this adds a new __mindist
helper that determines the common type and uses that with std::min.
libstdc++-v3/ChangeLog:
PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
New function object.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
__mindist instead of std::min.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check ranges with different difference types.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
Jonathan Wakely [Wed, 26 Mar 2025 17:45:06 +0000 (17:45 +0000)]
libstdc++: Use const_cast to workaround tm_zone being non-const
Iain reported that he's seeing this on Darwin:
include/bits/chrono_io.h:914: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
This is because the BSD definition of tm::tm_zone is a char* (and has
been since 1987) rather than const char* as in Glibc and POSIX.1-2024.
We can fix it by using const_cast<char*> when setting the tm_zone
member. This should be safe because libc doesn't actually write anything
to tm_zone; it's only non-const because the BSD definition predates the
addition of the const keyword to C.
For targets where it's a const char* the cast won't matter because it
will be converted back to const char* on assignment anyway.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Use
const_cast when setting tm.tm_zone.
Richard Biener [Thu, 27 Mar 2025 07:21:10 +0000 (08:21 +0100)]
target/119010 - more DFmode handling in zn4zn5 reservations
The following adds DFmode where V1DFmode and SFmode were handled.
This resolves missing reservations for adds, subs [with memory]
and for FMAs for the testcase I'm looking at. Resolved cases are
-;; 16--> b 0: i 237 xmm3=xmm3+[r9*0x8+si] :nothing
-;; 29--> b 0: i 246 xmm3=xmm3+xmm1 :nothing
-;; 46--> b 0: i 296 xmm1=xmm1-xmm3 :nothing
I've done search-and-replace for this; the caught cases look reasonable
though I'm of course not sure all of them can actually happen.
This also fixes the matched type for the znver{4,5}_sse_muladd_load
reservations from sseshuf to ssemuladd, resolving
-;; 1--> b 0: i 161 xmm0={-xmm0*xmm27+[cx+ax]} :nothing
-;; 22--> b 0: i 229 xmm11={-xmm11*xmm7+[di*0x8+dx]} :nothing
Sam James [Thu, 27 Mar 2025 13:19:51 +0000 (13:19 +0000)]
testsuite: harmless dg-* whitespace fixes
These just fix inconsistent/unusual style to avoid noise when grepping
and also people picking up bad habits when they see it (as similar
mistakes can be harmful).
Tobias Burnus [Thu, 27 Mar 2025 13:09:20 +0000 (14:09 +0100)]
OpenMP: Fix C++ template handling with append_args' prefer_type modifier
It is possible but not very sensible to use C++ templates in the
prefer_type modifier to the 'append_args' clause of 'declare variant'.
The commit r15-6336-g12dd892b1a3ad7 added substitution support in pt.cc,
but failed to update the actual data in decl.cc afterward.
As gimplification support was only added in r15-8898-gf016ee89955ab4,
this could not be tested back then. The latter commit added a sorry
for it in gimplify.cc and the existing testcase, which this commit now removes.
gcc/cp/ChangeLog:
* cp-tree.h (cp_finish_omp_init_prefer_type): Add.
* decl.cc (omp_declare_variant_finalize_one): Call it.
* pt.cc (tsubst_attribute): Minor robustification for OpenMP
append_args handling.
* semantics.cc (cp_omp_init_prefer_type_update): Rename to ...
(cp_finish_omp_init_prefer_type): ... this; remove static attribute
and return modified tree. Move clause handling to ...
(finish_omp_clauses): ... the caller.
libstdc++: re-bump the feature-test macro for P2562R1 [PR119488]
Now that the algorithms have been merged we can advertise full support
for P2562R1. This effectively reverts r15-8933-ga264c270fde292.
libstdc++-v3/ChangeLog:
PR libstdc++/119488
* include/bits/version.def (constexpr_algorithms): Bump
the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: Test the
bumped value for the feature-testing macro.
This completes the implementation of P2562R1 for C++26.
Unlike the other constexpr algorithms of the same family,
stable_partition does not have a constexpr-friendly version "ready to
use" during constant evaluation. In fact, it is not even available on
freestanding, because it always allocates a temporary memory buffer.
This commit implements the simplest possible strategy: during constant
evaluation allocate a buffer of length 1 on the stack, and use that as
a working area.
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (stable_partition): Mark it
as constexpr for C++26.
* include/bits/ranges_algo.h (__stable_partition_fn): Likewise.
* include/bits/stl_algo.h (stable_partition): Mark it as
constexpr for C++26; during constant evaluation use a new
codepath where a temporary buffer of 1 element is used.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(stable_partition): Add constexpr.
* testsuite/25_algorithms/stable_partition/constexpr.cc: New test.
This commit adds support for constexpr inplace_merge, added by P2562R1
for C++26. The implementation strategy is the same as for constexpr
stable_sort: use if consteval to detect if we're in constant evaluation,
and dispatch to a suitable path (same one as freestanding).
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (inplace_merge): Mark it as
constexpr for C++26.
* include/bits/ranges_algo.h (__inplace_merge_fn): Likewise.
* include/bits/stl_algo.h (inplace_merge): Mark it as constexpr;
during constant evaluation, dispatch to the non-allocating
codepath.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(inplace_merge): Add constexpr.
* testsuite/25_algorithms/inplace_merge/constexpr.cc: New test.
Nathaniel Shead [Wed, 26 Mar 2025 12:43:36 +0000 (23:43 +1100)]
c++/modules: Handle conflicting ABI tags [PR118920]
The ICE in the linked PR is caused because out_ptr_t inherits an ABI tag
in a module that it does not in the importing module. When we try to
build a qualified 'const out_ptr_t' during stream-in, we find the
existing 'const out_ptr_t' variant type that has been built, but discard
it due to having a mismatching attribute list. This causes us to build
a new copy of this variant, and ultimately fail a checking assertion due
to this being an identical type with different TYPE_CANONICAL.
This patch adds checking that ABI tags between an imported and existing
declaration match, and errors if they are incompatible. We make use of
'equal_abi_tags' from mangle.cc to determine if we should error; in the
case in the PR, because the ABI tag was an implicit tag that doesn't
affect name mangling, we don't need to error. To fix the ICE we ensure
that (regardless of whether we errored or not) later processing
considers the ABI tags as equivalent.
PR c++/118920
gcc/cp/ChangeLog:
* cp-tree.h (equal_abi_tags): Declare.
* mangle.cc (equal_abi_tags): Make external, fix comparison.
(tree_string_cmp): Make internal.
* module.cc (trees_in::check_abi_tags): New function.
(trees_in::decl_value): Use it.
(trees_in::is_matching_decl): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/attrib-3_a.H: New test.
* g++.dg/modules/attrib-3_b.C: New test.
* g++.dg/modules/pr118920.h: New test.
* g++.dg/modules/pr118920_a.H: New test.
* g++.dg/modules/pr118920_b.H: New test.
* g++.dg/modules/pr118920_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Wed, 26 Mar 2025 13:23:24 +0000 (00:23 +1100)]
c++/modules: Fix tsubst of global module friend classes [PR118920]
When doing tsubst_friend_class, we need to first check if any imported
module has already created a (hidden) declaration for the class so that
we don't end up with conflicting declarations. Currently we do this
using DECL_MODULE_IMPORT_P, but this is not set in cases where the class
is in the global module and matches an existing GM declaration we've
seen (via an include, for example).
This patch fixes this by checking DECL_MODULE_ENTITY_P instead, which is
set on all entities that have been seen from a module import. We also
use the 'for_mangle' version of get_originating_module so that we don't
treat imported GM entities as attached to the module we imported them
from. And rename that parameter to something more general.
And dump_module_suffix is another place where we want to treat global module
entities as not coming from a module.
PR c++/118920
gcc/cp/ChangeLog:
* name-lookup.cc (lookup_imported_hidden_friend): Check for
module entity rather than just module import.
* module.cc (get_originating_module): Rename for_mangle parm to
global_m1.
* error.cc (dump_module_suffix): Don't decorate global module decls.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-17.h: New test.
* g++.dg/modules/tpl-friend-17_a.C: New test.
* g++.dg/modules/tpl-friend-17_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Co-authored-by: Jason Merrill <jason@redhat.com>
Jonathan Wakely [Wed, 26 Mar 2025 11:21:32 +0000 (11:21 +0000)]
libstdc++: Fix std::ranges::iter_move for function references [PR119469]
The result of std::move (or a cast to an rvalue reference) on a function
reference is always an lvalue. Because std::ranges::iter_move was using
the type std::remove_reference_t<X>&& as the result of std::move, it was
giving the wrong type for function references. Use a decltype-specifier
with declval<remove_reference_t<X>>() instead of just using the
remove_reference_t<X>&& type directly. This gives the right result,
while still avoiding the cost of doing overload resolution for
std::move.
libstdc++-v3/ChangeLog:
PR libstdc++/119469
* include/bits/iterator_concepts.h (_IterMove::__result): Use
decltype-specifier instead of an explicit type.
* testsuite/24_iterators/customization_points/iter_move.cc:
Check results for function references.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Earnshaw [Wed, 26 Mar 2025 15:56:18 +0000 (15:56 +0000)]
arm: don't vectorize fmaxf() unless unsafe math opts are enabled
This test has presumably been failing since vectorization was enabled
at -O2. I suspect part of the reason this wasn't picked up sooner is
that the test is a hybrid execution/scan-assembler test and the
execution part requires appropriate hardware.
The problem is that we are vectorizing an expansion of fmaxf() when
the vector version of the instruction does not preserve denormal
values. This means we should only apply this optimization when
-funsafe-math-optimizations is enabled.
This fix does a few things:
- Moves the expand pattern to vec-common.md. Although I haven't changed
its behaviour (beyond fixing the bug), this should really be enabled for
MVE as well (but that will need to wait for gcc-16 since the MVE code
needs some additional changes first).
- Adds support for HF mode vectors.
- splits the test that was exposing the bug into two parts: an executable
test and a scan-assembler test. The scan-assembler version is more
widely enabled, since it does not require a suitable executable environment.
gcc/ChangeLog:
* config/arm/neon.md (<fmaxmin><mode>3): Move pattern from here...
* config/arm/vec-common.md (<fmaxmin><mode>3): ... to here. Convert
to define_expand and disable the pattern when denormal values might
get truncated to zero. Iterate on VF to add V4HF and V8HF variants.
gcc/testsuite/ChangeLog:
* gcc.target/arm/fmaxmin.c: Move scan-assembler checks to ...
* gcc.target/arm/fmaxmin-2.c: ... here. New test.
Martin Uecker [Sun, 16 Mar 2025 09:54:17 +0000 (10:54 +0100)]
c: Fix tagname confusion for typedef redefinitions [PR118765]
When we redefine a typedef for a tagged type that has just been
redefined, merge_decls may produce invalid TYPE_DECLS that are not
consistent with what set_underlying_type produces. This is fixed
by updating DECL_ORIGINAL_TYPE.
PR c/118765
gcc/c/ChangeLog:
* c-decl.cc (merge_decls): For TYPE_DECLS copy
DECL_ORIGINAL_TYPE from the old declaration.
* c-typeck.cc (tagged_types_tu_compatible_p): Add
checking assertions.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118765-2.c: New test.
* gcc.dg/pr118765-3.c: New test.
* gcc.dg/typedef-redecl3.c: New test.
Sandra Loosemore [Thu, 27 Mar 2025 00:59:37 +0000 (00:59 +0000)]
OpenMP: Fix declaration in append-args-interop.c test case
I ran into this while backporting my declare variant/dispatch/interop
patch f016ee89955ab4da5fe7ef89368e9437bb5ffb13 to the og14 development
branch. In C dialects prior to C23 (the default on mainline),
functions declared "float f()" and "float g(void)" aren't considered
equivalent for the purpose of the C front end code that checks whether
a type of a variant matches the base function after accounting for the
added interop arguments. Using "(void)" instead of "()" works in all
C dialects as well as C++, so do that.
gcc/testsuite/ChangeLog
* c-c++-common/gomp/append-args-interop.c: Fix declaration of base
function to be correct for pre-C23 dialects.
Jørgen Kvalsvik [Wed, 26 Mar 2025 21:15:26 +0000 (22:15 +0100)]
Add coverage_instrumentation_p
Provide a helper for checking if any coverage (arc, conditions, paths)
is enabled, rather than manually checking all the flags. This should
make the intent clearer, and make it easier to maintain the checks when
more flags are added.
The function is forward declared in two header files as different passes
tend to include different headers (profile.h vs value-prof.h). This
could maybe be merged at some point, but profiling related symbols are
already a bit spread out and should probably be handled in a targeted
effort.
Jørgen Kvalsvik [Tue, 4 Jun 2024 12:13:22 +0000 (14:13 +0200)]
Add prime path coverage to gcc/gcov
This patch adds prime path coverage to gcc/gcov. First, a quick
introduction to path coverage, before I explain a bit on the pieces of
the patch.
PRIME PATHS
Path coverage is recording the paths taken through the program. Here is
a simple example:
if (cond1) BB 1
then1 () BB 2
else
else1 () BB 3
if (cond2) BB 4
then2 () BB 5
else
else2 () BB 6
_ BB 7
To cover all paths you must run {then1 then2}, {then1 else2}, {else1
then2}, {else1 else2}. This is in contrast with line/statement coverage
where it is sufficient to execute then2, and it does not matter if it
was reached through then1 or else1.
1 2 4 5 7
1 2 4 6 7
1 3 4 5 7
1 3 4 6 7
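The four paths listed above can be reproduced with a short depth-first
enumeration. This is a sketch in Python, not code from the patch; the CFG
dictionary encodes the basic block numbers of the example, and all_paths is
a hypothetical helper that only handles acyclic graphs.

```python
# Successors of each basic block in the example above.
CFG = {1: [2, 3], 2: [4], 3: [4], 4: [5, 6], 5: [7], 6: [7], 7: []}


def all_paths(cfg, entry, exit_):
    """Yield every entry-to-exit path of an acyclic CFG, depth first."""
    stack = [(entry,)]
    while stack:
        path = stack.pop()
        if path[-1] == exit_:
            yield path
            continue
        for succ in cfg[path[-1]]:
            stack.append(path + (succ,))
```

Sorting the result of all_paths(CFG, 1, 7) gives the same four block
sequences as the list above.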
This gets more complicated with loops, because 0, 1, 2, ..., N
iterations are all different paths. There are different ways of
addressing this, a promising one being prime paths. A prime path is a
maximal simple path (a path with no repeated vertices) or simple cycle
(no repeated vertices except for the first/last). Prime paths strike a
decent balance between number of tests, path growth, and loop coverage,
requiring loops to be both taken and skipped. Of course, the number of
paths still grows very fast with program complexity - for example, this
program has 14 prime paths:
while (a)
{
if (b)
return;
while (c--)
a++;
}
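The definition of a prime path can be made concrete with a brute-force
sketch in Python. It is an illustration only and not the algorithm used by
the patch: LOOP is a hypothetical single-loop CFG (not the CFG of the
program above), and the function names are made up for this example.

```python
def simple_paths(graph):
    """All simple paths and simple cycles of `graph`, as vertex tuples."""
    found = set()

    def extend(path):
        found.add(path)
        for succ in graph[path[-1]]:
            if succ == path[0]:
                found.add(path + (succ,))  # simple cycle, cannot grow further
            elif succ not in path:
                extend(path + (succ,))

    for vertex in graph:
        extend((vertex,))
    return found


def is_proper_subpath(short, long):
    """True if `short` occurs contiguously inside the longer path `long`."""
    n = len(short)
    return n < len(long) and any(
        long[i:i + n] == short for i in range(len(long) - n + 1)
    )


def prime_paths(graph):
    """Keep only the paths that are not contained in another path."""
    paths = simple_paths(graph)
    return sorted(
        p for p in paths if not any(is_proper_subpath(p, q) for q in paths)
    )


# Hypothetical CFG: 0 enters loop header 1, 2 is the loop body, 3 is the exit.
LOOP = {0: [1], 1: [2, 3], 2: [1], 3: []}
```

For LOOP this yields five prime paths: (0, 1, 3) skips the loop, (0, 1, 2)
and (2, 1, 3) enter and leave it, and the cycles (1, 2, 1) and (2, 1, 2)
require it to iterate.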
--
ALGORITHM
Since the number of paths grows so fast, we need a good algorithm. The
naive approach of generating all paths and discarding redundancies (see
reference_prime_paths in the diff) simply doesn't complete for even
pretty simple functions with a few ten thousand paths (granted, the
implementation is also poor, but only serves as a reference). Fazli &
Afsharchi in their paper "Time and Space-Efficient Compositional Method
for Prime and Test Paths Generation" describe a neat algorithm which
drastically improves on it for most programs, and brings complexity down to
something manageable. This patch implements that algorithm with a few
minor tweaks.
The algorithm first finds the strongly connected components (SCC) of the
graph and creates a new graph where the vertices are the SCCs of the
CFG. Within these vertices different paths are found - regular prime
paths, paths that start in the SCCs entries, and paths that end in the
SCCs exits. These per-SCC paths are combined with paths through the CFG
which greatly reduces the number of paths that need to be evaluated just to
be thrown away.
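The first step, collapsing the CFG into a graph of SCCs, can be sketched as
follows. Tarjan's algorithm is used here because it is a well-known way to
find SCCs; the text does not say which SCC algorithm the patch uses. GRAPH
is a hypothetical single-loop CFG and the function names are illustrative.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; returns a list of frozensets of vertices."""
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.append(frozenset(component))

    for v in graph:
        if v not in index:
            visit(v)
    return result


def condense(graph):
    """Return (components, edges) of the graph whose vertices are the SCCs."""
    components = strongly_connected_components(graph)
    owner = {v: comp for comp in components for v in comp}
    edges = {
        (owner[v], owner[w])
        for v in graph
        for w in graph[v]
        if owner[v] != owner[w]
    }
    return components, edges


# Hypothetical CFG: blocks 1 and 2 form a loop between entry 0 and exit 3.
GRAPH = {0: [1], 1: [2, 3], 2: [1], 3: []}
```

For GRAPH the loop {1, 2} becomes a single vertex, so the condensed graph is
the chain {0} -> {1, 2} -> {3}; prime paths would then be searched inside
{1, 2} and combined with paths through the chain.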
Using this algorithm we can find the prime paths for somewhat
complicated functions in a reasonable time. Please note that some
programs don't benefit from this at all. We need to find the prime paths
within a SCC, so if a single SCC is very large the function degenerates
to the naive implementation. This can probably be much improved on, but
is an exercise for later.
--
OVERALL ARCHITECTURE
Like the other coverages in gcc, this operates on the CFG in the
profiling phase, just after branch and condition coverage, in phases:
1. All prime paths are generated, counted, and enumerated from the CFG
2. The paths are evaluated and counter instructions and accumulators are
emitted
3. gcov reads the CFG and computes the prime paths (same as step 1)
4. gcov prints a report
Simply writing out all the paths in the .gcno file is not really viable,
the files would be too big. Additionally, there are limits to the
practicality of measuring (and reporting) on millions of paths, so for
most programs where coverage is feasible, computing paths should be
plenty fast. As a result, path coverage really only adds 1 bit to the
counter, rounded up to nearest 64 ("bucket"), so 64 paths takes up 8
bytes, 65 paths take up 16 bytes.
Recording paths is really just massaging large bitsets. Per function,
ceil(paths/64 or 32) buckets (gcov_type) are allocated. Paths are
sorted, so the first path maps to the lowest bit, the second path to the
second lowest bit, and so on. On taking an edge and entering a basic
block, a few bitmasks are applied to unset the bits corresponding to the
paths outside the block and set the bits of the paths that start in that
block. Finally, the right buckets are masked and written to the global
accumulators for the paths that end in the block. Full coverage is
achieved when all bits are set.
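The bucket arithmetic described above can be sketched in Python. This only
models the layout (one bit per path, paths numbered in sorted order); it
does not model the per-edge mask updates, it assumes 64-bit buckets, and
all names are hypothetical.

```python
BUCKET_BITS = 64  # assumed bucket width; the text allows 32 as well


def bucket_count(paths, bits=BUCKET_BITS):
    """Number of buckets needed for `paths` one-bit counters (rounded up)."""
    return (paths + bits - 1) // bits


def bit_position(path_index, bits=BUCKET_BITS):
    """Map a sorted path index to (bucket index, mask within that bucket)."""
    return path_index // bits, 1 << (path_index % bits)


def fully_covered(buckets, paths, bits=BUCKET_BITS):
    """True when the bit of every one of the `paths` paths is set."""
    for i in range(paths):
        bucket, mask = bit_position(i, bits)
        if not buckets[bucket] & mask:
            return False
    return True
```

With 8-byte buckets, bucket_count(64) * 8 is 8 bytes and bucket_count(65) * 8
is 16 bytes, matching the figures in the text.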
gcc does not really inform gcov of abnormal paths, so paths with
abnormal edges are ignored. This is probably possible, but requires some
changes to the graph gcc writes to the .gcno file.
--
IMPLEMENTATION
In order to remove non-prime paths (subpaths) we use a suffix tree.
Fazli & Afsharchi do not discuss how duplicates or subpaths are removed,
and using the suffix tree works really well -- insertion time is a function
of the length of the (longest) paths, not the number of paths. The paths
are usually quite short, but there are many of them. The same
prime_paths function is used both in gcc and in gcov.
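The idea of filtering subpaths with a suffix structure can be sketched in
Python. This is a simplification and not the data structure from the patch:
it uses an uncompressed suffix trie built from nested dictionaries, assumes
non-empty paths, and remove_subpaths is a hypothetical name.

```python
def remove_subpaths(paths):
    """Return the paths that are not a proper subpath of another path."""
    unique = set(map(tuple, paths))  # duplicates collapse here
    root = {"children": {}, "proper_suffix": False}

    # Insert every suffix of every path; the cost depends on path length only.
    for path in unique:
        for start in range(len(path)):
            node = root
            for vertex in path[start:]:
                node = node["children"].setdefault(
                    vertex, {"children": {}, "proper_suffix": False}
                )
            if start > 0:
                node["proper_suffix"] = True  # ends a proper suffix of `path`

    def is_prime(path):
        node = root
        for vertex in path:
            node = node["children"][vertex]
        # Children mean some suffix continues past `path`: an inner subpath.
        return not node["children"] and not node["proper_suffix"]

    return sorted(p for p in unique if is_prime(p))
```

A path is dropped either because a longer suffix continues past it (its trie
node has children) or because it is itself a proper suffix of another path.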
As for speed, I would say that it is acceptable. Path coverage is a
problem that is exponential in its very nature, so if you enable this
feature you can reasonably expect it to take a while. To combat the
effects of path explosion there is a limit at which point gcc will give
up and refuse to instrument a function, set with -fpath-coverage-limit.
Since estimating the number of prime paths pretty much is counting
them, gcc maintains a pessimistic running count which slightly
overestimates the number of paths found so far. When that count exceeds
the limit, the function aborts and gcc prints a warning. This is a
blunt instrument meant to not get stuck on the occasional large
function, not fine-grained control over when to skip instrumentation.
My main benchmark has been tree-2.1.3/tree.c which generates approx 2M
paths across the 20 functions or so in it. Most functions have less than
1500 paths, and 2 around a million each. Finding the paths takes
3.5-4s, but the instrumentation phase takes approx. 2.5 minutes and
generates a 32M binary. Not bad for a 1429 line source file.
There are some selftests which deconstruct the algorithm, so it can be
easily cross-referenced with the Fazli & Afsharchi paper. I hope that
including them helps to catch regressions, clarifies the assumptions, and
aids in understanding the algorithm by breaking up the phases.
DEMO
This is the denser line-aware (grep-friendlier) output. Every missing
path is summarized as the lines you need to run in what order, annotated
with the true/false/throw decision.
$ gcc -fpath-coverage --coverage bs.c -c -o bs
$ gcov -e --prime-paths-lines bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0 not covered: lines 6 6(true) 11(true) 12
path 1 not covered: lines 6 6(true) 11(false) 13(true) 14
path 2 not covered: lines 6 6(true) 11(false) 13(false) 16
path 3 not covered: lines 6 6(false) 18
path 4 not covered: lines 11(true) 12 6(true) 11
path 5 not covered: lines 11(true) 12 6(false) 18
path 6 not covered: lines 11(false) 13(true) 14 6(true) 11
path 7 not covered: lines 11(false) 13(true) 14 6(false) 18
path 8 not covered: lines 12 6(true) 11(true) 12
path 9 not covered: lines 12 6(true) 11(false) 13(true) 14
path 10 not covered: lines 12 6(true) 11(false) 13(false) 16
path 11 not covered: lines 13(true) 14 6(true) 11(true) 12
path 12 not covered: lines 13(true) 14 6(true) 11(false) 13
path 13 not covered: lines 14 6(true) 11(false) 13(true) 14
path 14 not covered: lines 14 6(true) 11(false) 13(false) 16
path 15 not covered: lines 6(true) 11(true) 12 6
path 16 not covered: lines 6(true) 11(false) 13(true) 14 6
#####: 1:int binary_search(int a[], int len, int from, int to, int key)
-: 2:{
#####: 3: int low = from;
#####: 4: int high = to - 1;
-: 5:
#####: 6: while (low <= high)
-: 7: {
#####: 8: int mid = (low + high) >> 1;
#####: 9: long midVal = a[mid];
-: 10:
#####: 11: if (midVal < key)
#####: 12: low = mid + 1;
#####: 13: else if (midVal > key)
#####: 14: high = mid - 1;
-: 15: else
#####: 16: return mid; // key found
-: 17: }
#####: 18: return -1;
-: 19:}
Then there's the human-oriented source mode. Because it is so verbose I
have limited the demo to 2 paths. In this mode gcov will print the
sequence of *lines* through the program and in what order to cover the
path, including what basic block the line is a part of. Like its denser
sibling, this also prints the true/false/throw decision, if there is
one.
$ gcov -t --prime-paths-source bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (true) 11: if (midVal < key)
BB 4: 12: low = mid + 1;
path 1:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (false) 11: if (midVal < key)
BB 5: (true) 13: else if (midVal > key)
BB 6: 14: high = mid - 1;
The listing is also aware of inlining:
hello.c:
#include <stdio.h>
#include "hello.h"
int notmain(const char *entity)
{
return hello (entity);
}
hello.h:
#include <stdio.h>
inline __attribute__((always_inline))
int hello (const char *s)
{
if (s)
printf ("hello, %s!\n", s);
else
printf ("hello, world!\n");
return 0;
}
--prime-paths-{lines,source} take an optional argument type, which can
be 'covered', 'uncovered', or 'both', which defaults to 'uncovered'.
The flag controls if the covered or uncovered paths are printed, and
while uncovered is generally the most useful one, it is sometimes nice
to be able to see only the covered paths.
And finally, JSON (abbreviated). It is quite sparse and very nested, but
is mostly a JSON version of the source listing. It has to be this nested
in order to consistently capture multiple locations. It always
includes the file name per location for consistency, even though this is
very much redundant in almost all cases. This format is in no way set in
stone, and without targeting it with other tooling I am not sure if it
does the job well.
* lib/gcov.exp: Add prime paths test function.
* g++.dg/gcov/gcov-22.C: New test.
* g++.dg/gcov/gcov-23-1.h: New test.
* g++.dg/gcov/gcov-23-2.h: New test.
* g++.dg/gcov/gcov-23.C: New test.
* gcc.misc-tests/gcov-29.c: New test.
* gcc.misc-tests/gcov-30.c: New test.
* gcc.misc-tests/gcov-31.c: New test.
* gcc.misc-tests/gcov-32.c: New test.
* gcc.misc-tests/gcov-33.c: New test.
* gcc.misc-tests/gcov-34.c: New test.
Jørgen Kvalsvik [Wed, 7 Aug 2024 15:33:31 +0000 (17:33 +0200)]
gcov: branch, conds, calls in function summaries
The gcov function summaries only output the covered lines, not the
branches and calls. Since the function summaries are opt-in, it
probably makes sense to also include branch coverage, calls, and
condition coverage.
Before:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 4
Function 'fn'
Lines executed:100.00% of 7
File 'hello.c'
Lines executed:100.00% of 11
Creating 'hello.c.gcov'
After:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'
Lines executed:100.00% of 10
With conditions:
$ gcov -fg hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
No conditions
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
Condition outcomes covered:100.00% of 8
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'
Jonathan Wakely [Tue, 18 Mar 2025 21:21:28 +0000 (21:21 +0000)]
libgcobol: Use auto for container iterator types
libgcobol/ChangeLog:
* charmaps.cc (__gg__raw_to_ascii): Use auto for complicated
nested type.
(__gg__raw_to_ebcdic): Likewise.
(__gg__console_to_ascii): Likewise.
(__gg__console_to_ebcdic): Likewise.
Jonathan Wakely [Tue, 18 Mar 2025 21:17:03 +0000 (21:17 +0000)]
libgcobol: Fix uses of tolower and toupper with std::transform
As explained in the libstdc++ manual[1] and elsewhere[2], using tolower
and toupper in std::transform directly is wrong. If char is signed then
non-ASCII characters with negative values lead to undefined behaviour.
Also, tolower and toupper are overloaded names in C++ so expecting them
to resolve to a unique function pointer is unreliable. Finally, the
<cctype> header was included, not <ctype.h>, so they should have been
qualified as std::tolower and std::toupper.
* intrinsic.cc (is_zulu_format): Qualify toupper and cast
argument to unsigned char.
(fill_cobol_tm): Likewise.
(iscasematch): Likewise for tolower.
(numval): Qualify calls to tolower.
(__gg__lower_case): Use lambda expression for
tolower call.
(__gg__upper_case): Likewise for toupper call.
* libgcobol.cc (mangler_core): Cast tolower argument to unsigned
char.
* valconv.cc (__gg__string_to_numeric_edited): Cast toupper
arguments to unsigned char.
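For reference, the pattern described above can be sketched as follows.
This is an illustration, not the libgcobol code; to_lower_copy and
to_upper_copy are hypothetical helper names. The lambda takes its
argument as unsigned char, so negative char values never reach
std::tolower or std::toupper, and it sidesteps taking the address of an
overloaded name.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lower-case a string without passing negative char values to tolower.
std::string to_lower_copy(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c)
                 { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Same pattern for toupper.
std::string to_upper_copy(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c)
                 { return static_cast<char>(std::toupper(c)); });
  return s;
}
```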
Bob Dubner [Wed, 26 Mar 2025 20:07:44 +0000 (16:07 -0400)]
cobol: Bring trunk in line with Dubner's test system.
gcc/cobol
* genapi.cc: (parser_display_internal): Adjust for E vs e exponent notation.
* parse.y: (literal_refmod_valid): Display correct value in error message.
Jakub Jelinek [Wed, 26 Mar 2025 19:07:09 +0000 (20:07 +0100)]
cobol: Get rid of __int128 uses in the COBOL FE [PR119242]
The following patch changes some remaining __int128 uses in the FE
into FIXED_WIDE_INT(128), i.e. emulated 128-bit integral type.
The use of wide_int_to_tree directly from that rather than going through
build_int_cst_type means we don't throw away the upper 64 bits of the
values, so the emitting of constants needing full 128 bits can be greatly
simplified.
Plus all the #pragma GCC diagnostic ignored "-Wpedantic" spots aren't
needed, we don't use the _Float128/__int128 types directly in the FE
anymore.
Note, PR119241/PR119242 bugs are still not fully fixed, I think the
remaining problem is that several FE sources include
../../libgcobol/libgcobol.h and that header declares various APIs with
__int128 and _Float128 types, so trying to build a cross-compiler on a host
without __int128 and _Float128 will still fail miserably.
I believe none of those APIs are actually used by the FE, so the question is
what the FE needs from libgcobol.h and whether the rest could be wrapped
with #ifndef IN_GCC or #ifndef IN_GCC_FRONTEND or something similar
(those 2 macros are predefined when compiling the FE files).
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR cobol/119242
* genutil.h (get_power_of_ten): Remove #pragma GCC diagnostic
around declaration.
* genapi.cc (psa_FldLiteralN): Change type of value from
__int128 to FIXED_WIDE_INT(128). Remove #pragma GCC diagnostic
around the declaration. Use wi::min_precision to determine
minimum unsigned precision of the value. Use wi::neg_p instead
of value < 0 tests and wi::set_bit_in_zero<FIXED_WIDE_INT(128)>
to build sign bit. Handle field->data.capacity == 16 like
1, 2, 4 and 8, use wide_int_to_tree instead of build_int_cst.
(mh_source_is_literalN): Remove #pragma GCC diagnostic around
the definition.
(binary_initial_from_float128): Likewise.
* genutil.cc (get_power_of_ten): Remove #pragma GCC diagnostic
before the definition.
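To illustrate why the upper 64 bits matter, here is a small standalone
sketch of an emulated 128-bit unsigned value built from two 64-bit
halves. It does not use GCC's wide_int or FIXED_WIDE_INT APIs; U128,
add_u128, shl_u128, times10 and pow10_u128 are hypothetical names chosen
for this example only.

```cpp
#include <cstdint>

// Emulated 128-bit unsigned integer: two 64-bit halves.
struct U128
{
  std::uint64_t hi;
  std::uint64_t lo;
};

// Addition with carry from the low half into the high half.
U128 add_u128(U128 a, U128 b)
{
  U128 r{a.hi + b.hi, a.lo + b.lo};
  if (r.lo < a.lo)
    ++r.hi;
  return r;
}

// Left shift by n bits; this sketch only supports 0 < n < 64.
U128 shl_u128(U128 a, unsigned n)
{
  return U128{(a.hi << n) | (a.lo >> (64 - n)), a.lo << n};
}

// x * 10 computed as (x << 3) + (x << 1).
U128 times10(U128 a)
{
  return add_u128(shl_u128(a, 3), shl_u128(a, 1));
}

// 10^n as an emulated 128-bit value (valid while the result fits).
U128 pow10_u128(unsigned n)
{
  U128 r{0, 1};
  while (n--)
    r = times10(r);
  return r;
}
```

10^19 still fits in the low half, but 10^20 does not: its high half is
5, so any path that keeps only 64 bits of such a constant silently
changes its value.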
Iain Buclaw [Tue, 25 Mar 2025 18:37:34 +0000 (19:37 +0100)]
d: import __stdin causes compilation to pause while reading from stdin
Moves the special handling of reading from stdin out of the language
semantic routines. All references to `__stdin.d` have also been removed
from the front-end implementation.
Jonathan Wakely [Wed, 26 Mar 2025 10:10:19 +0000 (10:10 +0000)]
c++: Fix FAIL: g++.dg/tree-ssa/initlist-opt1.C
My r15-8904-ge200f53a555651 changed the std::vector initializer-list
constructor so that it calls a new _M_range_initialize_n function
instead of _M_range_initialize. Change the scan-tree-dump pattern in
this g++.dg test to match the new gimple output.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt1.C: Match _M_range_initialize_n
instead of _M_range_initialize.
P2562R1 ("constexpr Stable Sorting") adds constexpr to stable_sort,
stable_partition and inplace_merge. However only the first is already
implemented in libstdc++, so we shouldn't bump the feature-testing
macro to the bumped C++26 value. This commit sets it to one less
than the final value.
* include/bits/version.def (constexpr_algorithms): Change
the value of the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: Amend the
check of the feature-testing macro.
The problem is two-fold: restricting a test to target x86_64-*-* is
always wrong: an i?86-*-* compiler can produce 64-bit code with -m64
just as well, so it should always be both.
In addition, the -mx32 failure shows that the test seems to be 64-bit
only.
To fix both issues, this patch uses the new x86 effective-target keyword
and restricts the tests to lp64 instead of ! ia32.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
gcc/testsuite:
* c-c++-common/gomp/metadirective-device.c
(dg-additional-options): Use on all x86 targets. Restrict to lp64.
* c-c++-common/gomp/metadirective-target-device-1.c: Likewise.
Jakub Jelinek [Wed, 26 Mar 2025 13:41:15 +0000 (14:41 +0100)]
testsuite: Fix up append-args-interop.f90 test
The gcc/testsuite/*/gomp/ tests aren't compiled with include or module
paths pointing to libgomp, so shouldn't be using omp.h nor use omp_lib
etc.
The following patch adjusts the test to define it locally, like
e.g. recently in interop-5.f90 test or many other tests which have
their own definitions of types or enumerators they need.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
* gfortran.dg/gomp/append-args-interop.f90: Don't use omp_lib,
instead use iso_c_binding and define omp_interop_kind parameter
locally.
Thomas Schwinge [Tue, 19 Jul 2022 13:42:17 +0000 (15:42 +0200)]
driver: Forward '-lstdc++' to offloading compilation [PR101544]
..., so that users don't manually need to specify '-foffload-options=-lstdc++'
in addition to '-lstdc++' (specified manually, or implicitly by the driver).
Do like commit 4bcb46b3ade1796c5a57b294f5cca25f00671cac
"driver: Forward '-lgfortran', '-lm' to offloading compilation".
Thomas Schwinge [Wed, 19 Mar 2025 11:18:26 +0000 (12:18 +0100)]
C++: Adjust implicit '__cxa_bad_typeid' prototype to reality
In 2001 Subversion r40924 (Git commit 52a11cbfcf0cfb32628b6953588b6af4037ac0b6)
"IA-64 ABI Exception Handling", '__cxa_bad_typeid' changed from
'std::type_info const &' to 'void' return type:
--- libstdc++-v3/libsupc++/exception_support.cc
+++ /dev/null
@@ -1,388 +0,0 @@
-[...]
-// Helpers for rtti. Although these don't return, we give them return types so
-// that the type system is not broken.
-[...]
-extern "C" std::type_info const &
-__cxa_bad_typeid ()
-{
- [...]
-}
-[...]
The implicit prototype in the C++ front end however wasn't likewise adjusted,
and so for nvptx we generate code for 'std::type_info const &' return type:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.extern .func (.param .u64 %value_out) __cxa_bad_typeid;
..., which is in conflict with the library code with 'void' return type:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.visible .func __cxa_bad_typeid;
// BEGIN GLOBAL FUNCTION DEF: __cxa_bad_typeid
.visible .func __cxa_bad_typeid
{
[...]
}
..., and we thus get execution test FAILs for 'g++.dg/rtti/typeid11.C', for
example:
error : Prototype doesn't match for '__cxa_bad_typeid' in 'input file 4 at offset 22204', first defined in 'input file 4 at offset 22204'
nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)
With this patched, we get the expected:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
-.extern .func (.param .u64 %value_out) __cxa_bad_typeid;
+.extern .func __cxa_bad_typeid;
Jakub Jelinek [Wed, 26 Mar 2025 13:03:50 +0000 (14:03 +0100)]
widening_mul: Fix up further r14-8680 widening mul issues [PR119417]
The following testcase is miscompiled since r14-8680 PR113560 changes.
I've already tried to fix some of the issues caused by that change in
r14-8823 PR113759, but apparently didn't get it right.
The problem is that the r14-8680 changes sometimes set *type_out to
a narrower type than the *new_rhs_out actually has (because it will
handle stuff like _1 = rhs1 & 0xffff; and imply from that HImode type_out).
Now, if in convert_mult_to_widen or convert_plusminus_to_widen we actually
get optab for the modes we've asked for (i.e. with from_mode and to_mode),
everything works fine, if the operands don't have the expected types,
they are converted to those (for INTEGER_CSTs with fold_convert,
otherwise with build_and_insert_cast).
On the following testcase on aarch64 that is not the case, we ask
for from_mode HImode and to_mode DImode, but get actual_mode SImode.
The mult_rhs1 operand already has SImode and we change type1 to unsigned int
and so no cast is actually done, except that the & 0xffff is lost that way.
The following patch ensures that if we change typeN because of wider
actual_mode (or because of a sign change), we first cast to the old
typeN (if the r14-8680 code was encountered, otherwise it would have the
same precision) and only then change it, and then perhaps cast again.
On the testcase on aarch64-linux the patch results in the expected
- add x19, x19, w0, uxtw 1
+ add x19, x19, w0, uxth 1
difference.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119417
* tree-ssa-math-opts.cc (convert_mult_to_widen): Before changing
typeN because actual_precision/from_unsignedN differs cast rhsN
to typeN if it has a different type.
(convert_plusminus_to_widen): Before changing
typeN because actual_precision/from_unsignedN differs cast mult_rhsN
to typeN if it has a different type.
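The effect of losing the & 0xffff can be shown with a small sketch. This
is not the PR testcase; add_masked and add_unmasked are hypothetical
functions that only mirror the semantics of the two aarch64 forms quoted
above (uxth extends the low 16 bits, uxtw the low 32 bits, and both
shift left by 1).

```cpp
#include <cstdint>

// Semantics of "add x, x, w, uxth 1": only the low 16 bits of v count.
std::uint64_t add_masked(std::uint64_t acc, std::uint32_t v)
{
  return acc + static_cast<std::uint64_t>(v & 0xffffu) * 2;
}

// Semantics of "add x, x, w, uxtw 1": all 32 bits of v count, which is
// what the miscompiled code effectively computed.
std::uint64_t add_unmasked(std::uint64_t acc, std::uint32_t v)
{
  return acc + static_cast<std::uint64_t>(v) * 2;
}
```

The two agree whenever v fits in 16 bits and diverge as soon as any
higher bit is set, which is why dropping the mask is a wrong-code bug
rather than a missed optimization.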
Jakub Jelinek [Wed, 26 Mar 2025 11:19:14 +0000 (12:19 +0100)]
i386: Fix up pr55583.c testcase [PR119465]
In r15-4289 H.J. fixed up the pr55583.c testcase to use unsigned long long
or long long instead of unsigned long or long. That change looks correct to
me because the
void test64r () { b = ((u64)b >> n) | (a << (64 - n)); }
etc. functions otherwise aren't really 64-bit rotates, but something that
triggers UB all the time (at least one of the shifts is out of bounds).
I assume that change fixed the FAILs on -mx32, but it caused
FAIL: gcc.target/i386/pr55583.c scan-assembler-times (?n)shldl?[\\\\t ]*\\\\\$2 1
FAIL: gcc.target/i386/pr55583.c scan-assembler-times (?n)shrdl?[\\\\t ]*\\\\\$2 2
regression on i686-linux (but just for -m32 without defaulting to SSE2 or
what). The difference is that for say -m32 -march=x86-64 the stv pass
handles some of the rotates in SSE and so we get different sh[rl]dl
instruction counts from the case when SSE isn't enabled and stv pass isn't
done.
The following patch fixes that by disabling SSE for ia32 and always testing
for the same number of instructions.
Tested with all of
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-march=x86-64,-m32/-march=i686,-mx32,-m64\} i386.exp=pr55583.c'
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR target/55583
PR target/119465
* gcc.target/i386/pr55583.c: Add -mno-sse -mno-mmx to
dg-additional-options. Expect 4 shrdl and 2 shldl instructions on
ia32.
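As a side note on the UB argument above, here is a minimal sketch of why
the operand type must really be 64 bits wide. The names rotr64 and
shrd64 are hypothetical and this is not the testcase source. With a
64-bit unsigned type both shift counts stay in range for 0 < n < 64; if
the operands were only 32 bits wide (as long is on ia32 and x32), a
shift by 64 - n would always be out of bounds.

```cpp
#include <cstdint>

// 64-bit rotate right; precondition: 0 < n < 64 so both shifts are valid.
std::uint64_t rotr64(std::uint64_t b, unsigned n)
{
  return (b >> n) | (b << (64 - n));
}

// Double-word shift in the style of test64r: the low bits of `a` are
// shifted in at the top of `b`. Same precondition on n.
std::uint64_t shrd64(std::uint64_t a, std::uint64_t b, unsigned n)
{
  return (b >> n) | (a << (64 - n));
}
```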
Tomasz Kamiński [Wed, 26 Mar 2025 06:34:37 +0000 (07:34 +0100)]
libstdc++: Check presence of iterator_category for flat_sets insert_range [PR119415]
As pointed out by Hewill Kang (reporter) in the issue, checking whether the iterator
of the incoming range satisfies __cpp17_input_iterator may still lead
to hard errors inside of insert_range for iterators that satisfy
that concept but specialize iterator_traits without an iterator_category
typedef (std::common_iterator specializes iterator_traits without
iterator_category in some cases).
To address that we instead check that iterator_traits<It>::iterator_category
is present and denotes at least input_iterator_tag, using the existing __has_input_iter_cat.
PR libstdc++/119415
libstdc++-v3/ChangeLog:
* include/std/flat_set (_Flat_set_impl:insert_range):
Replace __detail::__cpp17_input_iterator with __has_input_iter_cat.
* testsuite/23_containers/flat_multiset/1.cc: New tests
* testsuite/23_containers/flat_set/1.cc: New tests
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
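The kind of check described above can be sketched in portable C++17 as
follows. This is only an approximation for illustration:
has_input_iter_cat_sketch and NoCategoryIter are hypothetical names, and
the real __has_input_iter_cat in libstdc++ may be written differently.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Primary template: no iterator_category member, so the answer is false.
template<typename It, typename = void>
struct has_input_iter_cat_sketch : std::false_type {};

// Chosen only when iterator_traits<It>::iterator_category exists; then
// require that it is input_iterator_tag or derived from it.
template<typename It>
struct has_input_iter_cat_sketch<
    It, std::void_t<typename std::iterator_traits<It>::iterator_category>>
  : std::is_base_of<std::input_iterator_tag,
                    typename std::iterator_traits<It>::iterator_category> {};

// An iterator-like type whose traits deliberately omit iterator_category.
struct NoCategoryIter {};

namespace std {
template<>
struct iterator_traits<NoCategoryIter>
{
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = void;
  using reference = int;
};
}

static_assert(has_input_iter_cat_sketch<std::vector<int>::iterator>::value,
              "vector iterators have a category derived from input");
static_assert(!has_input_iter_cat_sketch<NoCategoryIter>::value,
              "traits without iterator_category are rejected");
```

Because a missing typedef simply selects the primary template, the check
yields false instead of producing a hard error.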
gcc/testsuite:
* gcc.target/i386/pr117946.c: Require dfp support.
* gcc.target/i386/pr118017.c: Likewise. Use
dg-require-effective-target for both this and int128.
Tobias Burnus [Wed, 26 Mar 2025 10:27:56 +0000 (11:27 +0100)]
libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn
Note that this commit also updates the API interface to OpenMP 6.0;
while 5.1 and 5.2 use 'int *' for the ret_code argument,
OpenMP 6.0 changed this to omp_interop_rc_t *; this enum also exists in
OpenMP 5.1. However, C++ does not like this change such that unless NULL
is passed (i.e. the argument is ignored), OpenMP 5.x and 6.x are not
compatible.
Note that GCC's omp.h already follows OpenMP 6.0 and is now in sync with
the documentation.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
for 'interop'.
(OpenMP 6.0): Mark dispatch's interop clause as implemented.
(omp_get_interop_int, omp_get_interop_str,
omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
Offload-Target Specifics; change ret_code argument type to
'omp_interop_rc_t *'.
(Offload-Target Specifics): Document the supported OpenMP
interop foreign runtimes on AMD and Nvidia GPUs.
Tomasz Kamiński [Fri, 21 Mar 2025 08:03:54 +0000 (09:03 +0100)]
libstdc++: Add P1206R7 range operations to std::deque [PR111055]
This is another piece of P1206R7, adding from_range constructor, append_range,
prepend_range, insert_range, and assign_range members to std::deque.
For append_front of input non-sized range, we are emplacing element at the front and
then reverse inserted elements. This does not existing elements, and properly handle
aliasing ranges.
For insert_range, the handling of insertion in the middle of input-only ranges
that are sized could be optimized; we still insert nodes one-by-one in such a case.
For forward and stronger ranges, we reduce them to the common_range case by computing
the iterator when computing the distance. This is slightly suboptimal, as it requires
the range to be iterated for non-common forward ranges that are sized, but reduces
the number of instantiations.
This patch extracts _M_range_prepend and _M_range_append helper functions that accept
an (iterator, sentinel) pair. These are used in all standard modes.
PR libstdc++/111055
libstdc++-v3/ChangeLog:
* include/bits/deque.tcc (deque::prepend_range, deque::append_range)
(deque::insert_range, __advance_dist): Define.
(deque::_M_range_prepend, deque::_M_range_append):
Extract from _M_range_insert_aux for _ForwardIterator(s).
* include/bits/stl_deque.h (deque::assign_range): Define.
(deque::prepend_range, deque::append_range, deque::insert_range):
Declare.
(deque(from_range_t, _Rg&&, const allocator_type&)): Define constructor
and deduction guide.
* include/debug/deque (deque::prepend_range, deque::append_range)
(deque::assign_range): Define.
(deque(from_range_t, _Rg&&, const allocator_type&)): Define constructor
and deduction guide.
* testsuite/23_containers/deque/cons/from_range.cc: New test.
* testsuite/23_containers/deque/modifiers/append_range.cc: New test.
* testsuite/23_containers/deque/modifiers/assign/assign_range.cc:
New test.
* testsuite/23_containers/deque/modifiers/prepend_range.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
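The "emplace at the front, then reverse" idea for single-pass input can
be sketched with the public std::deque interface. This is not the
libstdc++ implementation; prepend_by_reversal is a hypothetical helper,
and it makes no attempt to handle a source range that aliases the deque
itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Prepend a single-pass range of unknown size while keeping its order.
// Each element goes to the front (which reverses the run), then the
// newly inserted prefix is reversed back into the original order.
template<typename T, typename InputIt, typename Sentinel>
void prepend_by_reversal(std::deque<T>& d, InputIt first, Sentinel last)
{
  std::size_t n = 0;
  for (; first != last; ++first, ++n)
    d.emplace_front(*first);
  std::reverse(d.begin(), d.begin() + static_cast<std::ptrdiff_t>(n));
}
```

Prepending 1, 2, 3 to a deque holding 4, 5 first yields 3, 2, 1, 4, 5
and, after the reversal of the first three elements, 1, 2, 3, 4, 5.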
Jakub Jelinek [Wed, 26 Mar 2025 07:47:20 +0000 (08:47 +0100)]
i386: Require in peephole2 that memory is offsettable [PR119450]
The following testcase ICEs because a peephole2 attempts to offset
memory which is not offsettable (in particular address is a ZERO_EXTEND
in this case).
Because peephole2s don't support constraints, I've added a check for this
in the peephole2's condition.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR target/119450
* config/i386/i386.md (narrow test peephole2): Test for
offsettable_memref_p in condition.
Richard Biener [Tue, 25 Mar 2025 14:18:14 +0000 (15:18 +0100)]
target/119010 - add missing DF load/store reservations for znver4 and znver5
The following resolves missing reservations for DFmode *movdf_internal
loads and stores, visible as 'nothing' in -fsched-verbose=2 dumps.
PR target/119010
* config/i386/zn4zn5.md (znver4_sse_mov_fp, znver4_sse_mov_fp_load,
znver5_sse_mov_fp_load, znver4_sse_mov_fp_store,
znver5_sse_mov_fp_store): Also match V1SF and DF.
Richard Biener [Tue, 25 Mar 2025 12:45:36 +0000 (13:45 +0100)]
middle-end/118795 - fix can_vec_perm_const_p query in match.pd
When expanding to RTL we always use vec_perm_indices with two operands
which can make a difference with respect to supported vs. unsupported.
So the following adjusts a query in match.pd for target support which
got this "wrong" and was using 1 for an equal-operand permute.
PR middle-end/118795
* match.pd (vec_perm <vec_perm <a, b>> -> vec_perm <a, b>):
Use the appropriate check to see whether the original
outer permute was supported.
Hu, Lin1 [Fri, 21 Mar 2025 02:43:10 +0000 (10:43 +0800)]
i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.
This patch aims to add "s_" after 'cvt' to represent saturation.
gcc/ChangeLog:
* config/i386/avx10_2-512convertintrin.h (_mm512_mask_cvtx2ps_ph): Formatting fixes
(_mm512_mask_cvtx_round2ps_ph): Ditto
(_mm512_maskz_cvtx_round2ps_ph): Ditto
(_mm512_cvtbiassph_bf8): Rename to _mm512_cvts_biasph_bf8.
(_mm512_mask_cvtbiassph_bf8): Rename to _mm512_mask_cvts_biasph_bf8.
(_mm512_maskz_cvtbiassph_bf8): Rename to _mm512_maskz_cvts_biasph_bf8.
(_mm512_cvtbiassph_hf8): Rename to _mm512_cvts_biasph_hf8.
(_mm512_mask_cvtbiassph_hf8): Rename to _mm512_mask_cvts_biasph_hf8.
(_mm512_maskz_cvtbiassph_hf8): Rename to _mm512_maskz_cvts_biasph_hf8.
(_mm512_cvts2ph_bf8): Rename to _mm512_cvts_2ph_bf8.
(_mm512_mask_cvts2ph_bf8): Rename to _mm512_mask_cvts_2ph_bf8.
(_mm512_maskz_cvts2ph_bf8): Rename to _mm512_maskz_cvts_2ph_bf8.
(_mm512_cvts2ph_hf8): Rename to _mm512_cvts_2ph_hf8.
(_mm512_mask_cvts2ph_hf8): Rename to _mm512_mask_cvts_2ph_hf8.
(_mm512_maskz_cvts2ph_hf8): Rename to _mm512_maskz_cvts_2ph_hf8.
(_mm512_cvtsph_bf8): Rename to _mm512_cvts_ph_bf8.
(_mm512_mask_cvtsph_bf8): Rename to _mm512_mask_cvts_ph_bf8.
(_mm512_maskz_cvtsph_bf8): Rename to _mm512_maskz_cvts_ph_bf8.
(_mm512_cvtsph_hf8): Rename to _mm512_cvts_ph_hf8.
(_mm512_mask_cvtsph_hf8): Rename to _mm512_mask_cvts_ph_hf8.
(_mm512_maskz_cvtsph_hf8): Rename to _mm512_maskz_cvts_ph_hf8.
* config/i386/avx10_2convertintrin.h
(_mm_cvtbiassph_bf8): Rename to _mm_cvts_biasph_bf8.
(_mm_mask_cvtbiassph_bf8): Rename to _mm_mask_cvts_biasph_bf8.
(_mm_maskz_cvtbiassph_bf8): Rename to _mm_maskz_cvts_biasph_bf8.
(_mm256_cvtbiassph_bf8): Rename to _mm256_cvts_biasph_bf8.
(_mm256_mask_cvtbiassph_bf8): Rename to _mm256_mask_cvts_biasph_bf8.
(_mm256_maskz_cvtbiassph_bf8): Rename to _mm256_maskz_cvts_biasph_bf8.
(_mm_cvtbiassph_hf8): Rename to _mm_cvts_biasph_hf8.
(_mm_mask_cvtbiassph_hf8): Rename to _mm_mask_cvts_biasph_hf8.
(_mm_maskz_cvtbiassph_hf8): Rename to _mm_maskz_cvts_biasph_hf8.
(_mm256_cvtbiassph_hf8): Rename to _mm256_cvts_biasph_hf8.
(_mm256_mask_cvtbiassph_hf8): Rename to _mm256_mask_cvts_biasph_hf8.
(_mm256_maskz_cvtbiassph_hf8): Rename to _mm256_maskz_cvts_biasph_hf8.
(_mm_cvts2ph_bf8): Rename to _mm_cvts_2ph_bf8.
(_mm_mask_cvts2ph_bf8): Rename to _mm_mask_cvts_2ph_bf8.
(_mm_maskz_cvts2ph_bf8): Rename to _mm_maskz_cvts_2ph_bf8.
(_mm256_cvts2ph_bf8): Rename to _mm256_cvts_2ph_bf8.
(_mm256_mask_cvts2ph_bf8): Rename to _mm256_mask_cvts_2ph_bf8.
(_mm256_maskz_cvts2ph_bf8): Rename to _mm256_maskz_cvts_2ph_bf8.
(_mm_cvts2ph_hf8): Rename to _mm_cvts_2ph_hf8.
(_mm_mask_cvts2ph_hf8): Rename to _mm_mask_cvts_2ph_hf8.
(_mm_maskz_cvts2ph_hf8): Rename to _mm_maskz_cvts_2ph_hf8.
(_mm256_cvts2ph_hf8): Rename to _mm256_cvts_2ph_hf8.
(_mm256_mask_cvts2ph_hf8): Rename to _mm256_mask_cvts_2ph_hf8.
(_mm256_maskz_cvts2ph_hf8): Rename to _mm256_maskz_cvts_2ph_hf8.
(_mm_cvtsph_bf8): Rename to _mm_cvts_ph_bf8.
(_mm_mask_cvtsph_bf8): Rename to _mm_mask_cvts_ph_bf8.
(_mm_maskz_cvtsph_bf8): Rename to _mm_maskz_cvts_ph_bf8.
(_mm256_cvtsph_bf8): Rename to _mm256_cvts_ph_bf8.
(_mm256_mask_cvtsph_bf8): Rename to _mm256_mask_cvts_ph_bf8.
(_mm256_maskz_cvtsph_bf8): Rename to _mm256_maskz_cvts_ph_bf8.
(_mm_cvtsph_hf8): Rename to _mm_cvts_ph_hf8.
(_mm_mask_cvtsph_hf8): Rename to _mm_mask_cvts_ph_hf8.
(_mm_maskz_cvtsph_hf8): Rename to _mm_maskz_cvts_ph_hf8.
(_mm256_cvtsph_hf8): Rename to _mm256_cvts_ph_hf8.
(_mm256_mask_cvtsph_hf8): Rename to _mm256_mask_cvts_ph_hf8.
(_mm256_maskz_cvtsph_hf8): Rename to _mm256_maskz_cvts_ph_hf8.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-convert-1.c: Modify function name
to follow the latest version.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
Bob Dubner [Tue, 25 Mar 2025 19:38:38 +0000 (15:38 -0400)]
cobol: Changes to eliminate _Float128 from the front end [PR119241]
These changes switch _Float128 types to REAL_VALUE_TYPE in the front end.
Some __int128 variables and function return values are changed to
FIXED_WIDE_INT(128)
gcc/cobol
PR cobol/119241
* cdf.y: (cdfval_base_t::operator()): Return const.
* cdfval.h: (struct cdfval_base_t): Add const cdfval_base_t&
operator().
(struct cdfval_t): Add cdfval_t constructor. Change cdf_value
definitions.
* gcobolspec.cc (lang_specific_driver): Formatting fix.
* genapi.cc: Include fold-const.h and realmpfr.h.
(initialize_variable_internal): Use real_to_decimal instead of
strfromf128.
(get_binary_value_from_float): Use wide_int_to_tree instead of
build_int_cst_type.
(psa_FldLiteralN): Use fold_convert instead of strfromf128,
real_from_string and build_real.
(parser_display_internal): Rewritten to work on REAL_VALUE_TYPE
rather than _Float128.
(mh_source_is_literalN): Use FIXED_WIDE_INT(128) rather than
__int128, wide_int_to_tree rather than build_int_cst_type,
fold_convert rather than build_string_literal.
(real_powi10): New function.
(binary_initial_from_float128): Change type of last argument from
_Float128 to REAL_VALUE_TYPE, process it using real.cc and mpfr
APIs.
(digits_from_float128): Likewise.
(initial_from_float128): Make static. Remove value argument, add
local REAL_VALUE_TYPE value variable instead, process it using
real.cc and native_encode_expr APIs.
(parser_symbol_add): Adjust initial_from_float128 caller.
* genapi.h (initial_from_float128): Remove declaration.
* genutil.cc (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128), ditto for retval type, change type of pos
from __int128 to unsigned long long.
(scale_by_power_of_ten_N): Use wide_int_to_tree instead of
build_int_cst_type. Use FIXED_WIDE_INT(128) instead of __int128
as power_of_ten variable type.
(copy_little_endian_into_place): Likewise.
* genutil.h (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128).
* parse.y (%union): Change type of float128 from _Float128 to
REAL_VALUE_TYPE.
(string_of): Change argument type from _Float128 to
const REAL_VALUE_TYPE &, use real_to_decimal rather than
strfromf128. Add another overload with tree argument type.
(field: cdf): Use real_zerop rather than comparison against 0.0.
(occurs_clause, const_value): Use real_to_integer.
(value78): Use build_real and real_to_integer.
(data_descr1): Use real_to_integer.
(count): Use real_to_integer, real_from_integer and real_identical
instead of direct comparison.
(value_clause): Use real_from_string3 instead of num_str2i. Use
real_identical instead of direct comparison. Use build_real.
(allocate): Use real_isneg and real_iszero instead of <= 0 comparison.
(move_tgt): Use real_to_integer, real_value_truncate,
real_from_integer and real_identical instead of comparison of casts.
(cce_expr): Use real_arithmetic and real_convert or real_value_negate
instead of direct arithmetics on _Float128.
(cce_factor): Use real_from_string3 instead of numstr2i.
(literal_refmod_valid): Use real_to_integer.
* symbols.cc (symbol_table_t::registers_t::registers_t): Formatting
fix.
(ERROR_FIELD): Likewise.
(extend_66_capacity): Likewise.
(cbl_occurs_t::subscript_ok): Use real_to_integer, real_from_integer
and real_identical.
* symbols.h (cbl_field_data_t::etc_t::value): Change type from
_Float128 to tree.
(cbl_field_data_t::etc_t::etc_t): Adjust defaulted argument value.
(cbl_field_data_t::cbl_field_data_t): Formatting fix. Use etc()
rather than etc(0).
(cbl_field_data_t::value_of): Change return type from _Float128 to
tree.
(cbl_field_data_t::operator=): Change return and argument type from
_Float128 to tree.
(cbl_field_data_t::valify): Use real_from_string, real_value_truncate
and build_real.
(cbl_field_t::same_as): Use build_zero_cst instead of _Float128(0.0).
gcc/testsuite
* cobol.dg/literal1.cob: New testcase.
* cobol.dg/output1.cob: Likewise
Co-authored-by: Richard Biener <rguenth@suse.de>
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Co-authored-by: James K. Lowden <jklowden@cobolworx.com>
Co-authored-by: Robert Dubner <rdubher@symas.com>
We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
did not go compile something that old, and identified this change via
git blame, so might be wrong)
=== cut here ===
struct Foo { int x; };
Foo& get (Foo &v) { return v; }
void bar () {
Foo v; v.x = 1;
(true ? get (v) : get (v)).*(&Foo::x) = 2;
// v.x still equals 1 here...
}
=== cut here ===
The problem lies in build_m_component_ref, that computes the address of
the COND_EXPR using build_address to build the representation of
(true ? get (v) : get (v)).*(&Foo::x);
and gets something like
&(true ? get (v) : get (v)) // #1
instead of
(true ? &get (v) : &get (v)) // #2
and the write does not go where we want it to, hence the miscompile.
This patch replaces the call to build_address by a call to
cp_build_addr_expr, which gives #2, that is properly handled.
PR c++/114525
gcc/cp/ChangeLog:
* typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
instead of build_address.