David Malcolm [Thu, 27 Mar 2025 23:46:20 +0000 (19:46 -0400)]
contrib: add dg-lint and libgdiagnostics.py [PR116163]
Changed in v2:
- eliminated COMMON_MISSPELLINGS in favor of retesting with a regexp
that adds underscores
- add a list of KNOWN_DIRECTIVES, and complain if we see a directive
that isn't in the list
- various refactorings to reduce the nesting within the script
- skip more kinds of file ('README', 'Makefile.am', 'Makefile.in',
'gen_directive_tests')
- keep track of the number of files scanned and report it and the end
with a note
This patch adds a new dg-lint subdirectory below contrib, containing
a "dg-lint" script for detecting common mistakes made in our DejaGnu
tests.
Specifically, DejaGnu's dg.exp's dg-get-options has a regexp for
detecting dg- directives
https://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=blob;f=lib/dg.exp
here's the current:
set tmp [grep $prog "{\[ \t\]\+dg-\[-a-z\]\+\[ \t\]\+.*\[ \t\]\+}" line]
which if I'm reading it right requires a "{", then one or more tab/space
chars, then a "dg-" directive name, then one of more tab/space
characters, then anything (for arguments to the directive), then one of
more tab/space character, then a "}".
There are numerous places in our testsuite which look like attempts to
use a directive, but which don't match this regexp.
The script warns about such places, along with a list of misspelled
directives (currently just "dg_options" for "dg-options"), and a warning
if a dg-do appears after a dg-require-* (as per
https://gcc.gnu.org/onlinedocs/gccint/Directives.html
"This directive must appear after any dg-do directive in the test
and before any dg-additional-sources directive." for
dg-require-effective-target.
dg-lint uses libgdiagnostics to report its results; the patch adds a
new libgdiagnostics.py script below contrib/dg-lint. This uses Python's
ctypes module to expose libgdianostics.so to Python via FFI. Hence
the warnings have colorization, quote the pertinent parts of the tested
file, can have fix-it hints, etc. Here's the output from the tests, run
from the top-level directory:
$ LD_LIBRARY_PATH=../build/gcc/ ./contrib/dg-lint/dg-lint contrib/dg-lint/test-*.c
contrib/dg-lint/test-1.c:6:6: warning: misspelled directive: 'dg_final'; did you mean 'dg-final'?
6 | /* { dg_final { scan_assembler_times "vmsumudm" 2 } } */
| ^~~~~~~~
| dg-final
contrib/dg-lint/test-1.c:15:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
15 | dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:18:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
18 | dg-output-file "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:21:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
21 | { dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:24:5: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
24 | {dg-output-file "m4.out"}
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:27:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
27 | { dg-output-file, "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-2.c:4:6: warning: 'dg-do' after 'dg-require-effective-target'
4 | /* { dg-do compile } */
| ^~~~~
contrib/dg-lint/test-2.c:3:6: note: 'dg-require-effective-target' was here
3 | /* { dg-require-effective-target c++11 } */
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
I don't yet have a way to verify these tests (clearly we can't use
DejaGnu for this).
These Python bindings could be used by other projects, but so far I only
implemented what I needed for dg-lint.
Running the test on the GCC source tree finds dozens of issues, which
followup patches address.
Tested with Python 3.8
contrib/ChangeLog:
PR testsuite/116163
* dg-lint/dg-lint: New file.
* dg-lint/libgdiagnostics.py: New file.
* dg-lint/test-1.c: New file.
* dg-lint/test-2.c: New file.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Bob Dubner [Thu, 27 Mar 2025 21:55:53 +0000 (17:55 -0400)]
cobol: Incorporate new testcases from the cobolworx UAT tests.
The author notes that some of the file names are regrettably lengthy,
which is because they are derived from the descriptive names of the
autom4te tests.
Jakub Jelinek [Thu, 27 Mar 2025 20:21:48 +0000 (21:21 +0100)]
testsuite: Fix up strub-internal-pr112938.C test for C++2{0,3,6}
On Thu, Mar 27, 2025 at 12:05:21AM +0000, Sam James wrote:
> The test was being ignored because dg.exp looks for .C in g++.dg/.
>
> gcc/testsuite/ChangeLog:
> PR middle-end/112938
>
> * g++.dg/strub-internal-pr112938.cc: Move to...
> * g++.dg/strub-internal-pr112938.C: ...here.
This regressed the test for C++20 and higher:
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++20 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++23 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++26 (test for excess errors)
Here is a fix.
2025-03-27 Jakub Jelinek <jakub@redhat.com>
* g++.dg/strub-internal-pr112938.C: Add dg-warning for c++20.
Eric Botcazou [Thu, 27 Mar 2025 19:29:51 +0000 (20:29 +0100)]
Ada: Fix too late initialization of tasking runtime with standalone library
The Tasking_Runtime_Initialize routine installs the tasking version of the
RTS_Lock manipulation routines and thus needs to be called very early before
the elaboration of all the Ada units of the program, including those of the
runtime itself.
This is guaranteed by the binder when the tasking runtime is explicitly
dragged into the link. However, for a standalone dynamic library that
does not depend on the tasking runtime and is auto-initialized, no such
guarantee holds, even though the library might be later dragged into a
link that contains the tasking runtime.
This change causes the routine to be called even earlier, in particular
at load time when a (standalone) dynamic library is involved in the link,
so as to meet the requirements. It will cause the routine to be called
twice if the main subprogram is generated by the binder, but this is
harmless since the routine is idempotent.
ada/
* libgnarl/s-tasini.adb (Tasking_Runtime_Initialize): Add pragma
Linker_Constructor for the procedure.
Sam James [Thu, 27 Mar 2025 17:52:19 +0000 (17:52 +0000)]
testsuite: revert Fortran change
Revert part of my change from r15-8973-g1307de1b4e7d5e; as Harald points
out, the comment explains why this is there. It's a hack but it needs to
stay for now. (I did have this marked as a TODO in my branch and didn't
leave a proper note as to why, so it's my fault.)
Jonathan Wakely [Wed, 26 Mar 2025 11:47:05 +0000 (11:47 +0000)]
libstdc++: Replace use of std::min in ranges::uninitialized_xxx algos [PR101587]
Because ranges can have any signed integer-like type as difference_type,
it's not valid to use std::min(diff1, diff2). Instead of calling
std::min with an explicit template argument, this adds a new __mindist
helper that determines the common type and uses that with std::min.
libstdc++-v3/ChangeLog:
PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
New function object.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
__mindist instead of std::min.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check ranges with difference difference types.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
Jonathan Wakely [Wed, 26 Mar 2025 17:45:06 +0000 (17:45 +0000)]
libstdc++: Use const_cast to workaround tm_zone being non-const
Iain reported that he's seeing this on Darwin:
include/bits/chrono_io.h:914: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
This is because the BSD definition ot tm::tm_zone is a char* (and has
been since 1987) rather than const char* as in Glibc and POSIX.1-2024.
We can fix it by using const_cast<char*> when setting the tm_zone
member. This should be safe because libc doesn't actually write anything
to tm_zone; it's only non-const because the BSD definition predates the
addition of the const keyword to C.
For targets where it's a const char* the cast won't matter because it
will be converted back to const char* on assignment anyway.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Use
const_cast when setting tm.tm_zone.
Richard Biener [Thu, 27 Mar 2025 07:21:10 +0000 (08:21 +0100)]
target/119010 - more DFmode handling in zn4zn5 reservations
The following adds DFmode where V1DFmode and SFmode were handled.
This resolves missing reservations for adds, subs [with memory]
and for FMAs for the testcase I'm looking at. Resolved cases are
-;; 16--> b 0: i 237 xmm3=xmm3+[r9*0x8+si] :nothing
-;; 29--> b 0: i 246 xmm3=xmm3+xmm1 :nothing
-;; 46--> b 0: i 296 xmm1=xmm1-xmm3 :nothing
I've done search-and-replace for this, the catched cases look reasonable
though I'm of course not sure all of them can actually happen.
This also fixes the matched type for the znver{4,5}_sse_muladd_load
reservations from sseshuf to ssemuladd, resolving
-;; 1--> b 0: i 161 xmm0={-xmm0*xmm27+[cx+ax]} :nothing
-;; 22--> b 0: i 229 xmm11={-xmm11*xmm7+[di*0x8+dx]} :nothing
Sam James [Thu, 27 Mar 2025 13:19:51 +0000 (13:19 +0000)]
testsuite: harmless dg-* whitespace fixes
These just fix inconsistent/unusual style to avoid noise when grepping
and also people picking up bad habits when they see it (as similar
mistakes can be harmful).
Tobias Burnus [Thu, 27 Mar 2025 13:09:20 +0000 (14:09 +0100)]
OpenMP: Fix C++ template handling with append_args' prefer_type modifier
It is possible but not very sensible to use C++ templates with in the
prefer_type modifier to the 'append_args' clause of 'declare variant'.
The commit r15-6336-g12dd892b1a3ad7 added substitution support in pt.cc,
but missed to update afterward the actual data in decl.cc.
As gimplification support was only added in r15-8898-gf016ee89955ab4,
this could not be tested back then. The latter commit added a sorry
for it gimplify.cc and the existing testcase, which this commit now removes.
gcc/cp/ChangeLog:
* cp-tree.h (cp_finish_omp_init_prefer_type): Add.
* decl.cc (omp_declare_variant_finalize_one): Call it.
* pt.cc (tsubst_attribute): Minor rebustification for OpenMP
append_args handling.
* semantics.cc (cp_omp_init_prefer_type_update): Rename to ...
(cp_finish_omp_init_prefer_type): ... this; remove static attribute
and return modified tree. Move clause handling to ...
(finish_omp_clauses): ... the caller.
libstdc++: re-bump the feature-test macro for P2562R1 [PR119488]
Now that the algorithms have been merged we can advertise full support
for P2562R1. This effectively reverts r15-8933-ga264c270fde292.
libstdc++-v3/ChangeLog:
PR libstdc++/119488
* include/bits/version.def (constexpr_algorithms): Bump
the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: Test the
bumped value for the feature-testing macro.
This completes the implementation of P2562R1 for C++26.
Unlike the other constexpr algorithms of the same family,
stable_partition does not have a constexpr-friendly version "ready to
use" during constant evaluation. In fact, it is not even available on
freestanding, because it always allocates a temporary memory buffer.
This commit implements the simplest possible strategy: during constant
evaluation allocate a buffer of length 1 on the stack, and use that as
a working area.
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (stable_partition): Mark it
as constexpr for C++26.
* include/bits/ranges_algo.h (__stable_partition_fn): Likewise.
* include/bits/stl_algo.h (stable_partition): Mark it as
constexpr for C++26; during constant evaluation use a new
codepath where a temporary buffer of 1 element is used.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(stable_partition): Add constexpr.
* testsuite/25_algorithms/stable_partition/constexpr.cc: New test.
This commit adds support for constexpr inplace_merge, added by P2562R1
for C++26. The implementation strategy is the same as for constexpr
stable_sort: use if consteval to detect if we're in constant evaluation,
and dispatch to a suitable path (same one as freestanding).
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (inplace_merge): Mark it as
constexpr for C++26.
* include/bits/ranges_algo.h (__inplace_merge_fn): Likewise.
* include/bits/stl_algo.h (inplace_merge): Mark it as constexpr;
during constant evaluation, dispatch to the non-allocating
codepath.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(inplace_merge): Add constexpr.
* testsuite/25_algorithms/inplace_merge/constexpr.cc: New test.
Nathaniel Shead [Wed, 26 Mar 2025 12:43:36 +0000 (23:43 +1100)]
c++/modules: Handle conflicting ABI tags [PR118920]
The ICE in the linked PR is caused because out_ptr_t inherits an ABI tag
in a module that it does not in the importing module. When we try to
build a qualified 'const out_ptr_t' during stream-in, we find the
existing 'const out_ptr_t' variant type that has been built, but discard
it due to having a mismatching attribute list. This causes us to build
a new copy of this variant, and ultimately fail a checking assertion due
to this being an identical type with different TYPE_CANONICAL.
This patch adds checking that ABI tags between an imported and existing
declaration match, and errors if they are incompatible. We make use of
'equal_abi_tags' from mangle.cc to determine if we should error; in the
case in the PR, because the ABI tag was an implicit tag that doesn't
affect name mangling, we don't need to error. To fix the ICE we ensure
that (regardless of whether we errored or not) later processing
considers the ABI tags as equivalent.
PR c++/118920
gcc/cp/ChangeLog:
* cp-tree.h (equal_abi_tags): Declare.
* mangle.cc (equal_abi_tags): Make external, fix comparison.
(tree_string_cmp): Make internal.
* module.cc (trees_in::check_abi_tags): New function.
(trees_in::decl_value): Use it.
(trees_in::is_matching_decl): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/attrib-3_a.H: New test.
* g++.dg/modules/attrib-3_b.C: New test.
* g++.dg/modules/pr118920.h: New test.
* g++.dg/modules/pr118920_a.H: New test.
* g++.dg/modules/pr118920_b.H: New test.
* g++.dg/modules/pr118920_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Wed, 26 Mar 2025 13:23:24 +0000 (00:23 +1100)]
c++/modules: Fix tsubst of global module friend classes [PR118920]
When doing tsubst_friend_class, we need to first check if any imported
module has already created a (hidden) declaration for the class so that
we don't end up with conflicting declarations. Currently we do this
using DECL_MODULE_IMPORT_P, but this is not set in cases where the class
is in the global module and matches an existing GM declaration we've
seen (via an include, for example).
This patch fixes this by checking DECL_MODULE_ENTITY_P instead, which is
set on all entities that have been seen from a module import. We also
use the 'for_mangle' version of get_originating_module so that we don't
treat imported GM entities as attached to the module we imported them
from. And rename that parameter to something more general.
And dump_module_suffix is another place where we want to treat global module
entities as not coming from a module.
PR c++/118920
gcc/cp/ChangeLog:
* name-lookup.cc (lookup_imported_hidden_friend): Check for
module entity rather than just module import.
* module.cc (get_originating_module): Rename for_mangle parm to
global_m1.
* error.cc (dump_module_suffix): Don't decorate global module decls.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-17.h: New test.
* g++.dg/modules/tpl-friend-17_a.C: New test.
* g++.dg/modules/tpl-friend-17_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Co-authored-by: Jason Merrill <jason@redhat.com>
Jonathan Wakely [Wed, 26 Mar 2025 11:21:32 +0000 (11:21 +0000)]
libstdc++: Fix std::ranges::iter_move for function references [PR119469]
The result of std::move (or a cast to an rvalue reference) on a function
reference is always an lvalue. Because std::ranges::iter_move was using
the type std::remove_reference_t<X>&& as the result of std::move, it was
giving the wrong type for function references. Use a decltype-specifier
with declval<remove_reference_t<X>>() instead of just using the
remove_reference_t<X>&& type directly. This gives the right result,
while still avoiding the cost of doing overload resolution for
std::move.
libstdc++-v3/ChangeLog:
PR libstdc++/119469
* include/bits/iterator_concepts.h (_IterMove::__result): Use
decltype-specifier instead of an explicit type.
* testsuite/24_iterators/customization_points/iter_move.cc:
Check results for function references.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Earnshaw [Wed, 26 Mar 2025 15:56:18 +0000 (15:56 +0000)]
arm: don't vectorize fmaxf() unless unsafe math opts are enabled
This test has presumably been failing since vectorization was enabled
at -O2. I suspect part of the reason this wasn't picked up sooner is
that the test is a hybrid execution/scan-assembler test and the
execution part requires appropriate hardware.
The problem is that we are vectorizing an expansion of fmaxf() when
the vector version of the instruction does not preserve denormal
values. This means we should only apply this optimization when
-funsafe-math-optimizations is enabled.
This fix does a few things:
- Moves the expand pattern to vec-common.md. Although I haven't changed
its behaviour (beyond fixing the bug), this should really be enabled for
MVE as well (but that will need to wait for gcc-16 since the MVE code
needs some additional changes first).
- Adds support for HF mode vectors.
- splits the test that was exposing the bug into two parts: an executable
test and a scan-assembler test. The scan-assembler version is more
widely enabled, since it does not require a suitable executable environment.
gcc/ChangeLog:
* config/arm/neon.md (<fmaxmin><mode>3): Move pattern from here...
* config/arm/vec-common.md (<fmaxmin><mode>3): ... to here. Convert
to define_expand and disable the pattern when denormal values might
get truncated to zero. Iterate on VF to add V4HF and V8HF variants.
gcc/testsuite/ChangeLog:
* gcc.target/arm/fmaxmin.c: Move scan-assembler checks to ...
* gcc.target/arm/fmaxmin-2.c: ... here. New test.
Martin Uecker [Sun, 16 Mar 2025 09:54:17 +0000 (10:54 +0100)]
c: Fix tagname confusion for typedef redefinitions [PR118765]
When we redefine a typedef for a tagged type that has just been
redefined, merge_decls may produce invalid TYPE_DECLS that are not
consistent with what set_underlying_type produces. This is fixed
by updating DECL_ORIGINAL_TYPE.
PR c/118765
gcc/c/ChangeLog:
* c-decl.cc (merge_decls): For TYPE_DECLS copy
DECL_ORIGINAL_TYPE from the old declaration.
* c-typeck.cc (tagged_types_tu_compatible_p): Add
checking assertions.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118765-2.c: New test.
* gcc.dg/pr118765-3.c: New test.
* gcc.dg/typedef-redecl3.c: New test.
Sandra Loosemore [Thu, 27 Mar 2025 00:59:37 +0000 (00:59 +0000)]
OpenMP: Fix declaration in append-args-interop.c test case
I ran into this while backporting my declare variant/dispatch/interop
patch f016ee89955ab4da5fe7ef89368e9437bb5ffb13 to the og14 development
branch. In C dialects prior to C23 (the default on mainline),
functions declared "float f()" and "float g(void)" aren't considered
equivalent for the purpose of the C front end code that checks whether
a type of a variant matches the base function after accounting for the
added interop arguments. Using "(void)" instead of "()" works in all
C dialects as well as C++, so do that.
gcc/testsuite/ChangeLog
* c-c++-common/gomp/append-args-interop.c: Fix declaration of base
function to be correct for pre-C23 dialects.
Jørgen Kvalsvik [Wed, 26 Mar 2025 21:15:26 +0000 (22:15 +0100)]
Add coverage_instrumentation_p
Provide a helper for checking if any coverage (arc, conditions, paths)
is enabled, rather than manually checking all the flags. This should
make the intent clearer, and make it easier to maintain the checks when
more flags are added.
The function is forward declared in two header files as different passes
tend to include different headers (profile.h vs value-prof.h). This
could maybe be merged at some points, but profiling related symbols are
already a bit spread out and should probably be handled in a targeted
effort.
Jørgen Kvalsvik [Tue, 4 Jun 2024 12:13:22 +0000 (14:13 +0200)]
Add prime path coverage to gcc/gcov
This patch adds prime path coverage to gcc/gcov. First, a quick
introduction to path coverage, before I explain a bit on the pieces of
the patch.
PRIME PATHS
Path coverage is recording the paths taken through the program. Here is
a simple example:
if (cond1) BB 1
then1 () BB 2
else
else1 () BB 3
if (cond2) BB 4
then2 () BB 5
else
else2 () BB 6
_ BB 7
To cover all paths you must run {then1 then2}, {then1 else2}, {else1
then1}, {else1 else2}. This is in contrast with line/statement coverage
where it is sufficient to execute then2, and it does not matter if it
was reached through then1 or else1.
1 2 4 5 7
1 2 4 6 7
1 3 4 5 7
1 3 4 6 7
This gets more complicated with loops, because 0, 1, 2, ..., N
iterations are all different paths. There are different ways of
addressing this, a promising one being prime paths. A prime path is a
maximal simple path (a path with no repeated vertices) or simple cycle
(no repeated vertices except for the first/last) Prime paths strike a
decent balance between number of tests, path growth, and loop coverage,
requiring loops to be both taken and skipped. Of course, the number of
paths still grows very fast with program complexity - for example, this
program has 14 prime paths:
while (a)
{
if (b)
return;
while (c--)
a++;
}
--
ALGORITHM
Since the numbers of paths grows so fast, we need a good algorithm. The
naive approach of generating all paths and discarding redundancies (see
reference_prime_paths in the diff) simply doesn't complete for even
pretty simple functions with a few ten thousand paths (granted, the
implementation is also poor, but only serves as a reference). Fazli &
Afsharchi in their paper "Time and Space-Efficient Compositional Method
for Prime and Test Paths Generation" describe a neat algorithm which
drastically improves on for most programs, and brings complexity down to
something managable. This patch implements that algorithm with a few
minor tweaks.
The algorithm first finds the strongly connected components (SCC) of the
graph and creates a new graph where the vertices are the SCCs of the
CFG. Within these vertices different paths are found - regular prime
paths, paths that start in the SCCs entries, and paths that end in the
SCCs exits. These per-SCC paths are combined with paths through the CFG
which greatly reduces of paths needed to be evaluated just to be thrown
away.
Using this algorithm we can find the prime paths for somewhat
complicated functions in a reasonable time. Please note that some
programs don't benefit from this at all. We need to find the prime paths
within a SCC, so if a single SCC is very large the function degenerates
to the naive implementation. This can probably be much improved on, but
is an exercise for later.
--
OVERALL ARCHITECTURE
Like the other coverages in gcc, this operates on the CFG in the
profiling phase, just after branch and condition coverage, in phases:
1. All prime paths are generated, counted, and enumerated from the CFG
2. The paths are evaluted and counter instructions and accumulators are
emitted
3. gcov reads the CFG and computes the prime paths (same as step 1)
4. gcov prints a report
Simply writing out all the paths in the .gcno file is not really viable,
the files would be too big. Additionally, there are limits to the
practicality of measuring (and reporting) on millions of paths, so for
most programs where coverage is feasible, computing paths should be
plenty fast. As a result, path coverage really only adds 1 bit to the
counter, rounded up to nearest 64 ("bucket"), so 64 paths takes up 8
bytes, 65 paths take up 16 bytes.
Recording paths is really just massaging large bitsets. Per function,
ceil(paths/64 or 32) buckets (gcov_type) are allocated. Paths are
sorted, so the first path maps to the lowest bit, the second path to the
second lowest bit, and so on. On taking an edge and entering a basic
block, a few bitmasks are applied to unset the bits corresponding to the
paths outside the block and set the bits of the paths that start in that
block. Finally, the right buckets are masked and written to the global
accumulators for the paths that end in the block. Full coverage is
achieved when all bits are set.
gcc does not really inform gcov of abnormal paths, so paths with
abnormal edges are ignored. This probably possible, but requires some
changes to the graph gcc writes to the .gcno file.
--
IMPLEMENTATION
In order to remove non-prime paths (subpaths) we use a suffix tree.
Fazli & Afsharchi do not discuss how duplicates or subpaths are removed,
and using the suffix works really well -- insertion time is a function
of the length of the (longest) paths, not the number of paths. The paths
are usually quite short, but there are many of them. The same
prime_paths function is used both in gcc and in gcov.
As for speed, I would say that it is acceptable. Path coverage is a
problem that is exponential in its very nature, so if you enable this
feature you can reasonably expect it to take a while. To combat the
effects of path explosion there is a limit at which point gcc will give
up and refuse to instrument a function, set with -fpath-coverage-limit.
Since estimating the number of prime paths is pretty much is counting
them, gcc maintains a pessimistic running count which slightly
overestimates the number of paths found so far. When that count exceeds
the limit, the function aborts and gcc prints a warning. This is a
blunt instrument meant to not get stuck on the occasional large
function, not fine-grained control over when to skip instrumentation.
My main benchmark has been tree-2.1.3/tree.c which generates approx 2M
paths across the 20 functions or so in it. Most functions have less than
1500 paths, and 2 around a million each. Finding the paths takes
3.5-4s, but the instrumentation phase takes approx. 2.5 minutes and
generates a 32M binary. Not bad for a 1429 line source file.
There are some selftests which deconstruct the algorithm, so it can be
easily referenced with the Fazli & Afsharchi. I hope that including them
both help to catch regression, clarify the assumptions, and help
understanding the algorithm by breaking up the phases.
DEMO
This is the denser line-aware (grep-friendlier) output. Every missing
path is summarized as the lines you need to run in what order, annotated
with the true/false/throw decision.
$ gcc -fpath-coverage --coverage bs.c -c -o bs
$ gcov -e --prime-paths-lines bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0 not covered: lines 6 6(true) 11(true) 12
path 1 not covered: lines 6 6(true) 11(false) 13(true) 14
path 2 not covered: lines 6 6(true) 11(false) 13(false) 16
path 3 not covered: lines 6 6(false) 18
path 4 not covered: lines 11(true) 12 6(true) 11
path 5 not covered: lines 11(true) 12 6(false) 18
path 6 not covered: lines 11(false) 13(true) 14 6(true) 11
path 7 not covered: lines 11(false) 13(true) 14 6(false) 18
path 8 not covered: lines 12 6(true) 11(true) 12
path 9 not covered: lines 12 6(true) 11(false) 13(true) 14
path 10 not covered: lines 12 6(true) 11(false) 13(false) 16
path 11 not covered: lines 13(true) 14 6(true) 11(true) 12
path 12 not covered: lines 13(true) 14 6(true) 11(false) 13
path 13 not covered: lines 14 6(true) 11(false) 13(true) 14
path 14 not covered: lines 14 6(true) 11(false) 13(false) 16
path 15 not covered: lines 6(true) 11(true) 12 6
path 16 not covered: lines 6(true) 11(false) 13(true) 14 6
#####: 1:int binary_search(int a[], int len, int from, int to, int key)
-: 2:{
#####: 3: int low = from;
#####: 4: int high = to - 1;
-: 5:
#####: 6: while (low <= high)
-: 7: {
#####: 8: int mid = (low + high) >> 1;
#####: 9: long midVal = a[mid];
-: 10:
#####: 11: if (midVal < key)
#####: 12: low = mid + 1;
#####: 13: else if (midVal > key)
#####: 14: high = mid - 1;
-: 15: else
#####: 16: return mid; // key found
-: 17: }
#####: 18: return -1;
-: 19:}
Then there's the human-oriented source mode. Because it is so verbose I
have limited the demo to 2 paths. In this mode gcov will print the
sequence of *lines* through the program and in what order to cover the
path, including what basic block the line is a part of. Like its denser
sibling, this also prints the true/false/throw decision, if there is
one.
$ gcov -t --prime-paths-source bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (true) 11: if (midVal < key)
BB 4: 12: low = mid + 1;
path 1:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (false) 11: if (midVal < key)
BB 5: (true) 13: else if (midVal > key)
BB 6: 14: high = mid - 1;
The listing is also aware of inlining:
hello.c:
#include <stdio.h>
#include "hello.h"
int notmain(const char *entity)
{
return hello (entity);
}
#include <stdio.h>
inline __attribute__((always_inline))
int hello (const char *s)
{
if (s)
printf ("hello, %s!\n", s);
else
printf ("hello, world!\n");
return 0;
}
--prime-paths-{lines,source} take an optional argument type, which can
be 'covered', 'uncovered', or 'both', which defaults to 'uncovered'.
The flag controls if the covered or uncovered paths are printed, and
while uncovered is generally the most useful one, it is sometimes nice
to be able to see only the covered paths.
And finally, JSON (abbreviated). It is quite sparse and very nested, but
is mostly a JSON version of the source listing. It has to be this nested
in order to consistently capture multiple locations. It is always
includes the file name per location for consistency, even though this is
very much redundant in almost all cases. This format is in no way set in
stone, and without targeting it with other tooling I am not sure if it
does the job well.
* lib/gcov.exp: Add prime paths test function.
* g++.dg/gcov/gcov-22.C: New test.
* g++.dg/gcov/gcov-23-1.h: New test.
* g++.dg/gcov/gcov-23-2.h: New test.
* g++.dg/gcov/gcov-23.C: New test.
* gcc.misc-tests/gcov-29.c: New test.
* gcc.misc-tests/gcov-30.c: New test.
* gcc.misc-tests/gcov-31.c: New test.
* gcc.misc-tests/gcov-32.c: New test.
* gcc.misc-tests/gcov-33.c: New test.
* gcc.misc-tests/gcov-34.c: New test.
Jørgen Kvalsvik [Wed, 7 Aug 2024 15:33:31 +0000 (17:33 +0200)]
gcov: branch, conds, calls in function summaries
The gcov function summaries only output the covered lines, not the
branches and calls. Since the function summaries is an opt-in it
probably makes sense to also include branch coverage, calls, and
condition coverage.
Before:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 4
Function 'fn'
Lines executed:100.00% of 7
File 'hello.c'
Lines executed:100.00% of 11
Creating 'hello.c.gcov'
After:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'
Lines executed:100.00% of 10
With conditions:
$ gcov -fg hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
No conditions
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
Condition outcomes covered:100.00% of 8
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'
Jonathan Wakely [Tue, 18 Mar 2025 21:21:28 +0000 (21:21 +0000)]
libgcobol: Use auto for container iterator types
libgcobol/ChangeLog:
* charmaps.cc (__gg__raw_to_ascii): Use auto for complicated
nested type.
(__gg__raw_to_ebcdic): Likewise.
(__gg__console_to_ascii): Likewise.
(__gg__console_to_ebcdic): Likewise.
Jonathan Wakely [Tue, 18 Mar 2025 21:17:03 +0000 (21:17 +0000)]
libgcobol: Fix uses of tolower and toupper with std::transform
As explained in the libstdc++ manual[1] and elsewhere[2], using tolower
and toupper in std::transform directly is wrong. If char is signed then
non-ASCII characters with negative values lead to undefined behaviour.
Also, tolower and toupper are overloaded names in C++ so expecting them
to resolve to a unique function pointer is unreliable. Finally, the
<cctype> header was included, not <ctype.h>, so they should have been
qualified as std::tolower and std::toupper.
* intrinsic.cc (is_zulu_format): Qualify toupper and cast
argument to unsigned char.
(fill_cobol_tm): Likewise.
(iscasematch): Likewise for to lower.
(numval): Qualify calls to tolower.
(__gg__lower_case): Use lambda expression for
tolower call.
(__gg__upper_case): Likewise for toupper call.
* libgcobol.cc (mangler_core): Cast tolower argument to unsigned
char.
* valconv.cc (__gg__string_to_numeric_edited): Cast to upper
arguments to unsigned char.
Bob Dubner [Wed, 26 Mar 2025 20:07:44 +0000 (16:07 -0400)]
cobol: Bring trunk in line with Dubner's test system.
gcc/cobol
* genapi.cc: (parser_display_internal): Adjust for E vs e exponent notation.
* parse.y: (literal_refmod_valid): Display correct value in error message.
Jakub Jelinek [Wed, 26 Mar 2025 19:07:09 +0000 (20:07 +0100)]
cobol: Get rid of __int128 uses in the COBOL FE [PR119242]
The following patch changes some remaining __int128 uses in the FE
into FIXED_WIDE_INT(128), i.e. emulated 128-bit integral type.
The use of wide_int_to_tree directly from that rather than going through
build_int_cst_type means we don't throw away the upper 64 bits of the
values, so the emitting of constants needing full 128 bits can be greatly
simplied.
Plus all the #pragma GCC diagnostic ignored "-Wpedantic" spots aren't
needed, we don't use the _Float128/__int128 types directly in the FE
anymore.
Note, PR119241/PR119242 bugs are still not fully fixed, I think the
remaining problem is that several FE sources include
../../libgcobol/libgcobol.h and that header declares various APIs with
__int128 and _Float128 types, so trying to build a cross-compiler on a host
without __int128 and _Float128 will still fail miserably.
I believe none of those APIs are actually used by the FE, so the question is
what the FE needs from libgcobol.h and whether the rest could be wrapped
with #ifndef IN_GCC or #ifndef IN_GCC_FRONTEND or something similar
(those 2 macros are predefined when compiling the FE files).
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR cobol/119242
* genutil.h (get_power_of_ten): Remove #pragma GCC diagnostic
around declaration.
* genapi.cc (psa_FldLiteralN): Change type of value from
__int128 to FIXED_WIDE_INT(128). Remove #pragma GCC diagnostic
around the declaration. Use wi::min_precision to determine
minimum unsigned precision of the value. Use wi::neg_p instead
of value < 0 tests and wi::set_bit_in_zero<FIXED_WIDE_INT(128)>
to build sign bit. Handle field->data.capacity == 16 like
1, 2, 4 and 8, use wide_int_to_tree instead of build_int_cst.
(mh_source_is_literalN): Remove #pragma GCC diagnostic around
the definition.
(binary_initial_from_float128): Likewise.
* genutil.cc (get_power_of_ten): Remove #pragma GCC diagnostic
before the definition.
Iain Buclaw [Tue, 25 Mar 2025 18:37:34 +0000 (19:37 +0100)]
d: import __stdin causes compilation to pause while reading from stdin
Moves the special handling of reading from stdin out of the language
semantic routines. All references to `__stdin.d` have also been removed
from the front-end implementation.
Jonathan Wakely [Wed, 26 Mar 2025 10:10:19 +0000 (10:10 +0000)]
c++: Fix FAIL: g++.dg/tree-ssa/initlist-opt1.C
My r15-8904-ge200f53a555651 changed the std::vector initializer-list
constructor so that it calls a new _M_range_initialize_n function
instead of _M_range_initialize. Change the scan-tree-dump pattern in
this g++.dg test to match the new gimple output.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt1.C: Match _M_range_initialize_n
instead of _M_range_initialize.
P2562R1 ("constexpr Stable Sorting") adds constexpr to stable_sort,
stable_partition and inplace_merge. However only the first is already
implemented in libstdc++, so we shouldn't bump the feature-testing
macro to the bumped C++26 value. This commit sets it to one less
than the final value.
* include/bits/version.def (constexpr_algorithms): Change
the value of the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: Amend the
check of the feature-testing macro.
The problem is two-fold: restricting a test to target x86_64-*-* is
always wrong: an i?86-*-* compiler can produce 64-bit code with -m64
just as well, so it should always be both.
In addition, the -mx32 failure shows that the test seems to be 64-bit
only.
To fix both issues, this patch uses the new x86 effective-target keyword
and restricts the tests to lp64 instead of ! ia32.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
gcc/testsuite:
* c-c++-common/gomp/metadirective-device.c
(dg-additional-options): Use on all x86 targets. Restrict to lp64.
* c-c++-common/gomp/metadirective-target-device-1.c: Likewise.
Jakub Jelinek [Wed, 26 Mar 2025 13:41:15 +0000 (14:41 +0100)]
testsuite: Fix up append-args-interop.f90 test
The gcc/testsuite/*/gomp/ tests aren't compiled with include or module
paths pointing to libgomp, so shouldn't be using omp.h nor use omp_lib
etc.
The following patch adjusts the test to define it locally, like
e.g. recently in interop-5.f90 test or many other tests which have
their own definitions of types or enumerators they need.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
* gfortran.dg/gomp/append-args-interop.f90: Don't use omp_lib,
instead use iso_c_binding and define omp_interop_kind parameter
locally.
Thomas Schwinge [Tue, 19 Jul 2022 13:42:17 +0000 (15:42 +0200)]
driver: Forward '-lstdc++' to offloading compilation [PR101544]
..., so that users don't manually need to specify '-foffload-options=-lstdc++'
in addition to '-lstdc++' (specified manually, or implicitly by the driver).
Do like commit 4bcb46b3ade1796c5a57b294f5cca25f00671cac
"driver: Forward '-lgfortran', '-lm' to offloading compilation".
Thomas Schwinge [Wed, 19 Mar 2025 11:18:26 +0000 (12:18 +0100)]
C++: Adjust implicit '__cxa_bad_typeid' prototype to reality
In 2001 Subversion r40924 (Git commit 52a11cbfcf0cfb32628b6953588b6af4037ac0b6)
"IA-64 ABI Exception Handling", '__cxa_bad_typeid' changed from
'std::type_info const &' to 'void' return type:
--- libstdc++-v3/libsupc++/exception_support.cc
+++ /dev/null
@@ -1,388 +0,0 @@
-[...]
-// Helpers for rtti. Although these don't return, we give them return types so
-// that the type system is not broken.
-[...]
-extern "C" std::type_info const &
-__cxa_bad_typeid ()
-{
- [...]
-}
-[...]
The implicit prototype in the C++ front end however wasn't likewise adjusted,
and so for nvptx we generate code for 'std::type_info const &' return type:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.extern .func (.param .u64 %value_out) __cxa_bad_typeid;
..., which is in conflict with the library code with 'void' return type:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.visible .func __cxa_bad_typeid;
// BEGIN GLOBAL FUNCTION DEF: __cxa_bad_typeid
.visible .func __cxa_bad_typeid
{
[...]
}
..., and we thus get execution test FAILs for 'g++.dg/rtti/typeid11.C', for
example:
error : Prototype doesn't match for '__cxa_bad_typeid' in 'input file 4 at offset 22204', first defined in 'input file 4 at offset 22204'
nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)
With this patched, we get the expected:
// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
-.extern .func (.param .u64 %value_out) __cxa_bad_typeid;
+.extern .func __cxa_bad_typeid;
Jakub Jelinek [Wed, 26 Mar 2025 13:03:50 +0000 (14:03 +0100)]
widening_mul: Fix up further r14-8680 widening mul issues [PR119417]
The following testcase is miscompiled since r14-8680 PR113560 changes.
I've already tried to fix some of the issues caused by that change in
r14-8823 PR113759, but apparently didn't get it right.
The problem is that the r14-8680 changes sometimes set *type_out to
a narrower type than the *new_rhs_out actually has (because it will
handle stuff like _1 = rhs1 & 0xffff; and imply from that HImode type_out.
Now, if in convert_mult_to_widen or convert_plusminus_to_widen we actually
get optab for the modes we've asked for (i.e. with from_mode and to_mode),
everything works fine, if the operands don't have the expected types,
they are converted to those (for INTEGER_CSTs with fold_convert,
otherwise with build_and_insert_cast).
On the following testcase on aarch64 that is not the case, we ask
for from_mode HImode and to_mode DImode, but get actual_mode SImode.
The mult_rhs1 operand already has SImode and we change type1 to unsigned int
and so no cast is actually done, except that the & 0xffff is lost that way.
The following patch ensures that if we change typeN because of wider
actual_mode (or because of a sign change), we first cast to the old
typeN (if the r14-8680 code was encountered, otherwise it would have the
same precision) and only then change it, and then perhaps cast again.
On the testcase on aarch64-linux the patch results in the expected
- add x19, x19, w0, uxtw 1
+ add x19, x19, w0, uxth 1
difference.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119417
* tree-ssa-math-opts.cc (convert_mult_to_widen): Before changing
typeN because actual_precision/from_unsignedN differs cast rhsN
to typeN if it has a different type.
(convert_plusminus_to_widen): Before changing
typeN because actual_precision/from_unsignedN differs cast mult_rhsN
to typeN if it has a different type.
Jakub Jelinek [Wed, 26 Mar 2025 11:19:14 +0000 (12:19 +0100)]
i386: Fix up pr55583.c testcase [PR119465]
In r15-4289 H.J. fixed up the pr55583.c testcase to use unsigned long long
or long long instead of unsigned long or long. That change looks correct to
me because the
void test64r () { b = ((u64)b >> n) | (a << (64 - n)); }
etc. functions otherwise aren't really 64-bit rotates, but something that
triggers UB all the time (at least one of the shifts is out of bounds).
I assume that change fixed the FAILs on -mx32, but it caused
FAIL: gcc.target/i386/pr55583.c scan-assembler-times (?n)shldl?[\\\\t ]*\\\\\$2 1
FAIL: gcc.target/i386/pr55583.c scan-assembler-times (?n)shrdl?[\\\\t ]*\\\\\$2 2
regression on i686-linux (but just for -m32 without defaulting to SSE2 or
what). The difference is that for say -m32 -march=x86-64 the stv pass
handles some of the rotates in SSE and so we get different sh[rl]dl
instruction counts from the case when SSE isn't enabled and stv pass isn't
done.
The following patch fixes that by disabling SSE for ia32 and always testing
for the same number of instructions.
Tested with all of
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-march=x86-64,-m32/-march=i686,-mx32,-m64\} i386.exp=pr55583.c'
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR target/55583
PR target/119465
* gcc.target/i386/pr55583.c: Add -mno-sse -mno-mmx to
dg-additional-options. Expect 4 shrdl and 2 shldl instructions on
ia32.
Tomasz Kamiński [Wed, 26 Mar 2025 06:34:37 +0000 (07:34 +0100)]
libstdc++: Check presence of iterator_category for flat_sets insert_range [PR119415]
As pointed out by Hewill Kang (reporter) in the issue, checking if iterator
of the incoming range satisfies __cpp17_input_iterator, may still lead
to hard errors inside of insert_range for iterators that satisfies
that concept, but specialize iterator_traits without iterator_category
typedef (std::common_iterator specialize iterator_traits without
iterator_category in some cases).
To address that we instead check if the iterator_traits<It>::iterator_category
is present and denote at least input_iterator_tag, using existing __has_input_iter_cat.
PR libstdc++/119415
libstdc++-v3/ChangeLog:
* include/std/flat_set (_Flat_set_impl:insert_range):
Replace __detail::__cpp17_input_iterator with __has_input_iter_cat.
* testsuite/23_containers/flat_multiset/1.cc: New tests
* testsuite/23_containers/flat_set/1.cc: New tests
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
gcc/testsuite:
* gcc.target/i386/pr117946.c: Require dfp support.
* gcc.target/i386/pr118017.c: Likewise. Use
dg-require-effective-target for both this and int128.
Tobias Burnus [Wed, 26 Mar 2025 10:27:56 +0000 (11:27 +0100)]
libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn
Note that this commit also updates the API interface to OpenMP 6.0;
while 5.1 and 5.2 use 'int *' for the the ret_code argument,
OpenMP 6.0 changed this to omp_interop_rc_t *; this enum also exists in
OpenMP 5.1. However, C++ does not like this change such that unless NULL
is passed (i.e. the argument is ignored), OpenMP 5.x and 6.x are not
compatible.
Note that GCC's omp.h already follows OpenMP 6.0 and is now in sync with
the documentation.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
for 'interop'.
(OpenMP 6.0): Mark dispatch's interop clause as implemented.
(omp_get_interop_int, omp_get_interop_str,
omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
Offload-Target Specifics; change ret_code argument type to
'omp_interop_rc_t *'.
(Offload-Target Specifics): Document the supported OpenMP
interop foreign runtimes on AMD and Nvidia GPUs.
Tomasz Kamiński [Fri, 21 Mar 2025 08:03:54 +0000 (09:03 +0100)]
libstdc++: Add P1206R7 range operations to std::deque [PR111055]
This is another piece of P1206R7, adding from_range constructor, append_range,
prepend_range, insert_range, and assign_range members to std::deque.
For append_front of input non-sized range, we are emplacing element at the front and
then reverse inserted elements. This does not existing elements, and properly handle
aliasing ranges.
For insert_range, the handling of insertion in the middle of input-only ranges
that are sized could be optimized, we still insert nodes one-by-one in such case.
For forward and stronger ranges, we reduce them to common_range case, by computing
the iterator when computing the distance. This is slightly suboptimal, as it require
range to be iterated for non-common forward ranges that are sized, but reduces
number of instantiations.
This patch extract _M_range_prepend, _M_range_append helper functions that accepts
(iterator, sentinel) pair. This all used in all standard modes.
PR libstdc++/111055
libstdc++-v3/ChangeLog:
* include/bits/deque.tcc (deque::prepend_range, deque::append_range)
(deque::insert_range, __advance_dist): Define.
(deque::_M_range_prepend, deque::_M_range_append):
Extract from _M_range_insert_aux for _ForwardIterator(s).
* include/bits/stl_deque.h (deque::assign_range): Define.
(deque::prepend_range, deque::append_range, deque::insert_range):
Declare.
(deque(from_range_t, _Rg&&, const allocator_type&)): Define constructor
and deduction guide.
* include/debug/deque (deque::prepend_range, deque::append_range)
(deque::assign_range): Define.
(deque(from_range_t, _Rg&&, const allocator_type&)): Define constructor
and deduction guide.
* testsuite/23_containers/deque/cons/from_range.cc: New test.
* testsuite/23_containers/deque/modifiers/append_range.cc: New test.
* testsuite/23_containers/deque/modifiers/assign/assign_range.cc:
New test.
* testsuite/23_containers/deque/modifiers/prepend_range.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jakub Jelinek [Wed, 26 Mar 2025 07:47:20 +0000 (08:47 +0100)]
i386: Require in peephole2 that memory is offsettable [PR119450]
The following testcase ICEs because a peephole2 attempts to offset
memory which is not offsettable (in particular address is a ZERO_EXTEND
in this case).
Because peephole2s don't support constraints, I've added a check for this
in the peephole2's condition.
2025-03-26 Jakub Jelinek <jakub@redhat.com>
PR target/119450
* config/i386/i386.md (narrow test peephole2): Test for
offsettable_memref_p in condition.
Richard Biener [Tue, 25 Mar 2025 14:18:14 +0000 (15:18 +0100)]
target/119010 - add missing DF load/store reservations for znver4 and znver5
The following resolves missing reservations for DFmode *movdf_internal
loads and stores, visible as 'nothing' in -fsched-verbose=2 dumps.
PR target/119010
* config/i386/zn4zn5.md (znver4_sse_mov_fp, znver4_sse_mov_fp_load,
znver5_sse_mov_fp_load, znver4_sse_mov_fp_store,
znver5_sse_mov_fp_store): Also match V1SF and DF.
Richard Biener [Tue, 25 Mar 2025 12:45:36 +0000 (13:45 +0100)]
middle-end/118795 - fix can_vec_perm_const_p query in match.pd
When expanding to RTL we always use vec_perm_indices with two operands
which can make a difference with respect to supported vs. unsupported.
So the following adjusts a query in match.pd for target support which
got this "wrong" and using 1 for a equal operand permute.
PR middle-end/118795
* match.pd (vec_perm <vec_perm <a, b>> -> vec_perm <a, b>):
Use the appropriate check to see whether the original
outer permute was supported.
Hu, Lin1 [Fri, 21 Mar 2025 02:43:10 +0000 (10:43 +0800)]
i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.
This patch aims to add "s_" after 'cvt' represent saturation.
gcc/ChangeLog:
* config/i386/avx10_2-512convertintrin.h (_mm512_mask_cvtx2ps_ph): Formatting fixes
(_mm512_mask_cvtx_round2ps_ph): Ditto
(_mm512_maskz_cvtx_round2ps_ph): Ditto
(_mm512_cvtbiassph_bf8): Rename to _mm512_cvts_biasph_bf8.
(_mm512_mask_cvtbiassph_bf8): Rename to _mm512_mask_cvts_biasph_bf8.
(_mm512_maskz_cvtbiassph_bf8): Rename to _mm512_maskz_cvts_biasph_bf8.
(_mm512_cvtbiassph_hf8): Rename to _mm512_cvts_biasph_hf8.
(_mm512_mask_cvtbiassph_hf8): Rename to _mm512_mask_cvts_biasph_hf8.
(_mm512_maskz_cvtbiassph_hf8): Rename to _mm512_maskz_cvts_biasph_hf8.
(_mm512_cvts2ph_bf8): Rename to _mm512_cvts_2ph_bf8.
(_mm512_mask_cvts2ph_bf8): Rename to _mm512_mask_cvts_2ph_bf8.
(_mm512_maskz_cvts2ph_bf8): Rename to _mm512_maskz_cvts_2ph_bf8.
(_mm512_cvts2ph_hf8): Rename to _mm512_cvts_2ph_hf8.
(_mm512_mask_cvts2ph_hf8): Rename to _mm512_mask_cvts_2ph_hf8.
(_mm512_maskz_cvts2ph_hf8): Rename to _mm512_maskz_cvts_2ph_hf8.
(_mm512_cvtsph_bf8): Rename to _mm512_cvts_ph_bf8.
(_mm512_mask_cvtsph_bf8): Rename to _mm512_mask_cvts_ph_bf8.
(_mm512_maskz_cvtsph_bf8): Rename to _mm512_maskz_cvts_ph_bf8.
(_mm512_cvtsph_hf8): Rename to _mm512_cvts_ph_hf8.
(_mm512_mask_cvtsph_hf8): Rename to _mm512_mask_cvts_ph_hf8.
(_mm512_maskz_cvtsph_hf8): Rename to _mm512_maskz_cvts_ph_hf8.
* config/i386/avx10_2convertintrin.h
(_mm_cvtbiassph_bf8): Rename to _mm_cvts_biasph_bf8.
(_mm_mask_cvtbiassph_bf8): Rename to _mm_mask_cvts_biasph_bf8.
(_mm_maskz_cvtbiassph_bf8): Rename to _mm_maskz_cvts_biasph_bf8.
(_mm256_cvtbiassph_bf8): Rename to _mm256_cvts_biasph_bf8.
(_mm256_mask_cvtbiassph_bf8): Rename to _mm256_mask_cvts_biasph_bf8.
(_mm256_maskz_cvtbiassph_bf8): Rename to _mm256_maskz_cvts_biasph_bf8.
(_mm_cvtbiassph_hf8): Rename to _mm_cvts_biasph_hf8.
(_mm_mask_cvtbiassph_hf8): Rename to _mm_mask_cvts_biasph_hf8.
(_mm_maskz_cvtbiassph_hf8): Rename to _mm_maskz_cvts_biasph_hf8.
(_mm256_cvtbiassph_hf8): Rename to _mm256_cvts_biasph_hf8.
(_mm256_mask_cvtbiassph_hf8): Rename to _mm256_mask_cvts_biasph_hf8.
(_mm256_maskz_cvtbiassph_hf8): Rename to _mm256_maskz_cvts_biasph_hf8.
(_mm_cvts2ph_bf8): Rename to _mm_cvts_2ph_bf8.
(_mm_mask_cvts2ph_bf8): Rename to _mm_mask_cvts_2ph_bf8.
(_mm_maskz_cvts2ph_bf8): Rename to _mm_maskz_cvts_2ph_bf8.
(_mm256_cvts2ph_bf8): Rename to _mm256_cvts_2ph_bf8.
(_mm256_mask_cvts2ph_bf8): Rename to _mm256_mask_cvts_2ph_bf8.
(_mm256_maskz_cvts2ph_bf8): Rename to _mm256_maskz_cvts_2ph_bf8.
(_mm_cvts2ph_hf8): Rename to _mm_cvts_2ph_hf8.
(_mm_mask_cvts2ph_hf8): Rename to _mm_mask_cvts_2ph_hf8.
(_mm_maskz_cvts2ph_hf8): Rename to _mm_maskz_cvts_2ph_hf8.
(_mm256_cvts2ph_hf8): Rename to _mm256_cvts_2ph_hf8.
(_mm256_mask_cvts2ph_hf8): Rename to _mm256_mask_cvts_2ph_hf8.
(_mm256_maskz_cvts2ph_hf8): Rename to _mm256_maskz_cvts_2ph_hf8.
(_mm_cvtsph_bf8): Rename to _mm_cvts_ph_bf8.
(_mm_mask_cvtsph_bf8): Rename to _mm_mask_cvts_ph_bf8.
(_mm_maskz_cvtsph_bf8): Rename to _mm_maskz_cvts_ph_bf8.
(_mm256_cvtsph_bf8): Rename to _mm256_cvts_ph_bf8.
(_mm256_mask_cvtsph_bf8): Rename to _mm256_mask_cvts_ph_bf8.
(_mm256_maskz_cvtsph_bf8): Rename to _mm256_maskz_cvts_ph_bf8.
(_mm_cvtsph_hf8): Rename to _mm_cvts_ph_hf8.
(_mm_mask_cvtsph_hf8): Rename to _mm_mask_cvts_ph_hf8.
(_mm_maskz_cvtsph_hf8): Rename to _mm_maskz_cvts_ph_hf8.
(_mm256_cvtsph_hf8): Rename to _mm256_cvts_ph_hf8.
(_mm256_mask_cvtsph_hf8): Rename to _mm256_mask_cvts_ph_hf8.
(_mm256_maskz_cvtsph_hf8): Rename to _mm256_maskz_cvts_ph_hf8.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-convert-1.c: Modify function name
to follow the latest version.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
Bob Dubner [Tue, 25 Mar 2025 19:38:38 +0000 (15:38 -0400)]
cobol: Changes to eliminate _Float128 from the front end [PR119241]
These changes switch _Float128 types to REAL_VALUE_TYPE in the front end.
Some __int128 variables and function return values are changed to
FIXED_WIDE_INT(128)
gcc/cobol
PR cobol/119241
* cdf.y: (cdfval_base_t::operator()): Return const.
* cdfval.h: (struct cdfval_base_t): Add const cdfval_base_t&
operator().
(struct cdfval_t): Add cdfval_t constructor. Change cdf_value
definitions.
* gcobolspec.cc (lang_specific_driver): Formatting fix.
* genapi.cc: Include fold-const.h and realmpfr.h.
(initialize_variable_internal): Use real_to_decimal instead of
strfromf128.
(get_binary_value_from_float): Use wide_int_to_tree instead of
build_int_cst_type.
(psa_FldLiteralN): Use fold_convert instead of strfromf128,
real_from_string and build_real.
(parser_display_internal): Rewritten to work on REAL_VALUE_TYPE
rather than _Float128.
(mh_source_is_literalN): Use FIXED_WIDE_INT(128) rather than
__int128, wide_int_to_tree rather than build_int_cst_type,
fold_convert rather than build_string_literal.
(real_powi10): New function.
(binary_initial_from_float128): Change type of last argument from
_Float128 to REAL_VALUE_TYPE, process it using real.cc and mpfr
APIs.
(digits_from_float128): Likewise.
(initial_from_float128): Make static. Remove value argument, add
local REAL_VALUE_TYPE value variable instead, process it using
real.cc and native_encode_expr APIs.
(parser_symbol_add): Adjust initial_from_float128 caller.
* genapi.h (initial_from_float128): Remove declaration.
* genutil.cc (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128), ditto for retval type, change type of pos
from __int128 to unsigned long long.
(scale_by_power_of_ten_N): Use wide_int_to_tree instead of
build_int_cst_type. Use FIXED_WIDE_INT(128) instead of __int128
as power_of_ten variable type.
(copy_little_endian_into_place): Likewise.
* genutil.h (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128).
* parse.y (%union): Change type of float128 from _Float128 to
REAL_VALUE_TYPE.
(string_of): Change argument type from _Float128 to
const REAL_VALUE_TYPE &, use real_to_decimal rather than
strfromf128. Add another overload with tree argument type.
(field: cdf): Use real_zerop rather than comparison against 0.0.
(occurs_clause, const_value): Use real_to_integer.
(value78): Use build_real and real_to_integer.
(data_descr1): Use real_to_integer.
(count): Use real_to_integer, real_from_integer and real_identical
instead of direct comparison.
(value_clause): Use real_from_string3 instead of num_str2i. Use
real_identical instead of direct comparison. Use build_real.
(allocate): Use real_isneg and real_iszero instead of <= 0 comparison.
(move_tgt): Use real_to_integer, real_value_truncate,
real_from_integer and real_identical instead of comparison of casts.
(cce_expr): Use real_arithmetic and real_convert or real_value_negate
instead of direct arithmetics on _Float128.
(cce_factor): Use real_from_string3 instead of numstr2i.
(literal_refmod_valid): Use real_to_integer.
* symbols.cc (symbol_table_t::registers_t::registers_t): Formatting
fix.
(ERROR_FIELD): Likewise.
(extend_66_capacity): Likewise.
(cbl_occurs_t::subscript_ok): Use real_to_integer, real_from_integer
and real_identical.
* symbols.h (cbl_field_data_t::etc_t::value): Change type from
_Float128 to tree.
(cbl_field_data_t::etc_t::etc_t): Adjust defaulted argument value.
(cbl_field_data_t::cbl_field_data_t): Formatting fix. Use etc()
rather than etc(0).
(cbl_field_data_t::value_of): Change return type from _Float128 to
tree.
(cbl_field_data_t::operator=): Change return and argument type from
_Float128 to tree.
(cbl_field_data_t::valify): Use real_from_string, real_value_truncate
and build_real.
(cbl_field_t::same_as): Use build_zero_cst instead of _Float128(0.0).
gcc/testsuite
* cobol.dg/literal1.cob: New testcase.
* cobol.dg/output1.cob: Likewise
Co-authored-by: Richard Biener <rguenth@suse.de> Co-authored-by: Jakub Jelinek <jakub@redhat.com> Co-authored-by: James K. Lowden <jklowden@cobolworx.com> Co-authored-by: Robert Dubner <rdubher@symas.com>
We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
did not go compile something that old, and identified this change via
git blame, so might be wrong)
=== cut here ===
struct Foo { int x; };
Foo& get (Foo &v) { return v; }
void bar () {
Foo v; v.x = 1;
(true ? get (v) : get (v)).*(&Foo::x) = 2;
// v.x still equals 1 here...
}
=== cut here ===
The problem lies in build_m_component_ref, that computes the address of
the COND_EXPR using build_address to build the representation of
(true ? get (v) : get (v)).*(&Foo::x);
and gets something like
&(true ? get (v) : get (v)) // #1
instead of
(true ? &get (v) : &get (v)) // #2
and the write does not go where want it to, hence the miscompile.
This patch replaces the call to build_address by a call to
cp_build_addr_expr, which gives #2, that is properly handled.
PR c++/114525
gcc/cp/ChangeLog:
* typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
instead of build_address.
Iain Sandoe [Tue, 25 Mar 2025 16:20:58 +0000 (16:20 +0000)]
toplevel, libcobol: Add dependency on libquadmath build [PR119244].
For the configuration of libgcobol to be correct for targets that need
to use libquadmath for 128b FP support, we must be able to find the
quadmath library (or not, for targets that have the support in libc).
Iain Sandoe [Thu, 13 Mar 2025 17:28:55 +0000 (17:28 +0000)]
gcc, gcov: Use 'lbasename' consistently.
The 'basename' implementation can vary with the host platform (e.g. POSIX
c.f. Linux). This is the only current uses of basename() in the source
so convert them to use lbasename() as most other cases do.
gcc/ChangeLog:
* gcov.cc (get_gcov_intermediate_filename): Use lbasename().
Iain Sandoe [Thu, 13 Mar 2025 17:23:33 +0000 (17:23 +0000)]
gcc, configure: When checking for basename, use the same process as libiberty [PR119250].
We need the configure result from the decl check for basename() in the GCC
configuration to match that obtained when configuring libiberty or we get
conflicts when <libgen.h> is included in any TU that also includes "system.h"
or "libiberty.h" directly.
PR other/119250
gcc/ChangeLog:
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Match the configure test in libiberty when checking
the basename decl.
Iain Sandoe [Wed, 12 Mar 2025 15:04:31 +0000 (15:04 +0000)]
libiberty: Append <libgen.h> to AC_CHECK_DECLS [PR119218].
Darwin and Solaris, at least, provide basename() in libc, but only
declare it in <libgen.h>. That library is not one of the set in
AC_INCLUDES_DEFAULT and so we fail the config test and fall back
to the libiberty-provided version. In itself, this is not an
issue; however, if we include <libgen.h> and libiberty.h in the same
TU we do then get a decl conflict.
PR other/119218
libiberty/ChangeLog:
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Append <libgen.h> to AC_INCLUDES_DEFAULT
when checking for the 'basename' decl.
Thomas Koenig [Tue, 25 Mar 2025 17:42:30 +0000 (18:42 +0100)]
C prototypes for functions returning C function pointers.
This patch handles dumping prototypes for C functions returning
function pointers. For the test case
MODULE test
USE, INTRINSIC :: ISO_C_BINDING
CONTAINS
FUNCTION lookup(idx) BIND(C)
type(C_FUNPTR) :: lookup
integer(C_INT), VALUE :: idx
lookup = C_FUNLOC(x1)
END FUNCTION lookup
subroutine x1()
end subroutine x1
END MODULE test
the prototype is
void (*lookup (int idx)) ();
Regression-tested. Again no test case because I don't know how.
During testing, I also found that vtabs were dumped, this is
also corrected.
gcc/fortran/ChangeLog:
PR fortran/119419
* dump-parse-tree.cc (write_funptr_fcn): New function.
(write_type): Invoke it for C_FUNPTR.
(write_interop_decl): Do not dump vtabs.
Jonathan Wakely [Tue, 25 Mar 2025 00:27:52 +0000 (00:27 +0000)]
libstdc++: Allow std::ranges::to to create unions
LWG 4229 points out that the std::ranges::to wording refers to class
types, but I added an assertion using std::is_class_v which only allows
non-union class types. LWG consensus is that unions should be allowed,
so this additionally uses std::is_union_v.
libstdc++-v3/ChangeLog:
* include/std/ranges (ranges::to): Allow unions as well as
non-union class types.
* testsuite/std/ranges/conv/lwg4229.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 25 Mar 2025 13:24:08 +0000 (13:24 +0000)]
libstdc++: Optimize std::vector construction from input iterators [PR108487]
LWG 3291 make std::ranges::iota_view's iterator have input_iterator_tag
as its iterator_category, even though it satisfies the C++20
std::forward_iterator concept. This means that the traditional
std::vector::vector(InputIterator, InputIterator) constructor treats
iota_view iterators as input iterators, because it only understands the
C++17 iterator requirements, not the C++20 iterator concepts. This
results in a loop that calls emplace_back for each individual element of
the iota_view, requiring the vector to reallocate repeatedly as the
values are inserted. This makes it unnecessarily slow to construct a
vector from an iota_view.
This change adds a new _M_range_initialize_n function for initializing a
vector from a range (which doesn't have to be common) and a size. This
new function can be used by vector(InputIterator, InputIterator) and
vector(from_range_t, R&&) when std::ranges::distance can be used to get
the size. It can also be used by the _M_range_initialize overload that
gets the size for a Cpp17ForwardIterator pair using std::distance, and
by the vector(initializer_list) constructor.
With this new function constructing a std::vector from iota_view does
a single allocation of the correct size and so doesn't need to
reallocate in a loop.
Previously the _M_range_initialize overload for Cpp17ForwardIterator was
using a local RAII _Guard_alloc object to own the storage, but that was
redundant. The _Vector_base can own the storage right away, and its
destructor will deallocate it if _M_range_initialize exits via an
exception.
libstdc++-v3/ChangeLog:
PR libstdc++/108487
* include/bits/stl_vector.h (vector(initializer_list)): Call
_M_range_initialize_n instead of _M_range_initialize.
(vector(InputIterator, InputIterator)): Use _M_range_initialize_n
for C++20 sized sentinels and forward iterators.
(vector(from_range_t, R&&)): Use _M_range_initialize_n for sized
ranges and forward ranges.
(vector::_M_range_initialize(FwIt, FwIt, forward_iterator_tag)):
Likewise.
(vector::_M_range_initialize_n): New function.
* testsuite/23_containers/vector/cons/108487.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Mon, 24 Mar 2025 21:21:02 +0000 (21:21 +0000)]
libstdc++: Adjust how __gnu_debug::vector detects invalidation
The new C++23 member functions assign_range, insert_range and
append_range were checking whether the begin() iterator changed after
calling the base class member. That works, but is technically undefined
when the original iterator has been invalidated by a change in capacity.
We can just check the capacity directly, because reallocation only
occurs if a change in capacity is required.
N.B. we can't use data() either because std::vector<bool> doesn't have
it.
libstdc++-v3/ChangeLog:
* include/debug/vector (vector::assign_range): Use change in
capacity to detect reallocation.
(vector::insert_range, vector::append_range): Likewise. Remove
unused variables.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Fri, 21 Mar 2025 23:21:42 +0000 (23:21 +0000)]
libstdc++: Fix std::vector::append_range for overlapping ranges
Unlike insert_range and assign_range, the append_range function does not
have a precondition that the range doesn't overlap *this. That means we
need to avoid relocating the existing elements until after copying from
the range. This means I need to revert r15-8488-g3e1d760bf49d0e which
made the from_range_t constructor use append_range, because the
constructor can avoid the additional complexity needed by append_range.
When relocating the existing elements in append_range we can use
std::__relocate_a to do it more efficiently, if that's valid.
std::vector<bool>::append_range needs similar treatment, although it's a
bit simpler as we know that the elements are trivially copyable and so
we don't need to worry about them throwing. assign_range doesn't allow
overlapping ranges, so can be rewritten to be more efficient than
calling append_range for the forward or sized range case.
libstdc++-v3/ChangeLog:
* include/bits/stl_bvector.h (vector::assign_range): More
efficient implementation for forward/sized ranges.
(vector::append_range): Handle potentially overlapping range.
* include/bits/stl_vector.h (vector(from_range_t, R&&, Alloc)):
Do not use append_range for non-sized input range case.
(vector::append_range): Handle potentially overlapping range.
* include/bits/vector.tcc (vector::insert_range): Forward range
instead of moving it.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
Test overlapping ranges.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Eric Botcazou [Tue, 25 Mar 2025 17:11:08 +0000 (18:11 +0100)]
Add support of --enable-host-pie to the native Ada compiler
gcc/ada/
PR ada/119440
* gcc-interface/Make-lang.in (GCC_LINK): Filter out -pie in stage 1
(GCC_LLINK): Likewise.
* gcc-interface/Makefile.in (COMPILER): Delete and replace by CC.
(COMPILER_FLAGS): Delete.
(ALL_COMPILERFLAGS): Delete and replace by ALL_CFLAGS.
(ALL_ADAFLAGS): Move around.
(enable_host_pie): New substituted variable.
(LD_PICFLAG): Likewise. Do not add it to TOOLS_LIBS.
(LIBIBERTY): Test enable_host_pie.
(LIBGNAT): Likewise and use libgnat_pic.a if yes.
(TOOLS_FLAGS_TO_PASS): Pass $(PICFLAG) under CFLAGS & $(LD_PICFLAG)
under LDFLAGS. Also pass through ADA_CFLAGS.
(common-tools): Add $(ALL_CFLAGS) $(ADA_CFLAGS) to the --GCC string
of $(GNATLINK) commands.
(../../gnatdll$(exeext)): Likewise.
(gnatmake-re): Likewise.
(gnatlink-re): Likewise.
(gnatlib-shared-dual): Remove all the object files at the end.
gnattools/
PR ada/119440
* configure.ac (host-pie): New switch.
(host-bind-now): Likewise.
Substitute PICFLAG and LD_PICFLAG.
* configure: Regenerate.
* Makefile.in (PICFLAG): New substituted variable.
(LD_PICFLAG): Likewise.
(TOOLS_FLAGS_TO_PASS): Pass $(PICFLAG) under CFLAGS & $(LD_PICFLAG)
under LDFLAGS. Do not pass -I- under ADA_INCLUDES.
(TOOLS_FLAGS_TO_PASS_RE): Likewise.
This patch would like to avoid the ICE when template lambdas call with
default parameters in unevaluated context. The bug is the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example:
Sandra Loosemore [Tue, 25 Mar 2025 15:55:45 +0000 (15:55 +0000)]
OpenMP: Create additional interop objects with append_args.
This patch adds support for the case where #pragma omp declare variant
with append_args is used inside a #pragma omp dispatch interop that
specifies fewer interop args than required by the variant; new interop
objects are implicitly created and then destroyed around the call to the
variant, using the GOMP_interop builtin.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Remove accidental
redeclaration of pref.
gcc/ChangeLog
* gimplify.cc (modify_call_for_omp_dispatch): Adjust arguments.
Remove the "sorry" for the case where new interop objects must be
constructed, and add code to make it work instead.
(expand_variant_call_expr): Adjust arguments and call to
modify_call_for_omp_dispatch.
(gimplify_variant_call_expr): Simplify logic for calling
expand_variant_call_expr.
Richard Earnshaw [Tue, 25 Mar 2025 15:36:02 +0000 (15:36 +0000)]
arm: testsuite: adjust ftest tests
The ftest-*.c tests for Arm check certain ACLE mandated macros to ensure
they are correctly defined based on the selected architecture. ACLE
states that the macro should be defined if the operation exists in
the hardware, but it doesn't have to exist in the current ISA because
and interworking call to the library function will still result in using
the hardware operation (both GCC and Clang agree on this). So adjust
the tests accordingly.
Whilst cleaning this up, also remove the now redundant dg-skip-if operations
that were testing for incompatible command-line options. That should now
be a thing of the past as the framework will clean this up more thoroughly
before running the test, or detect incompatible option combinations.
Jakub Jelinek [Tue, 25 Mar 2025 15:55:24 +0000 (16:55 +0100)]
i386: Fix up combination of -2 r<<= (x & 7) into btr [PR119428]
The following patch is miscompiled from r15-8478 but latently already
since my r11-5756 and r11-6631 changes.
The r11-5756 change was
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html
which changed the splitters to immediately throw away the masking.
And the r11-6631 change was an optimization to recognize
(set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1)
as btr.
The problem is their interaction. x86 is not a SHIFT_COUNT_TRUNCATED
target, so the masking needs to be explicit in the IL.
And combine.cc (make_field_assignment) has since 1992 optimizations
which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0.
Now, such an optimization is fine if y has not been masked or if the
chosen zero_extract has the same mode as the rotate (or it recognizes
something with a left shift too). IMHO such optimization is invalid
for SHIFT_COUNT_TRUNCATED targets because we explicitly say that
the masking of the shift/rotate counts are redundant there and don't
need to be part of the IL (I have a patch for that, but because it
is just latent, I'm not sure it needs to be posted for gcc 15 (and
also am not sure if it should punt or add operand masking just in case)).
x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine
not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't
have QImode insv, so it is optimized into HImode insertions. Now,
if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into
HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just
decided to throw away the masking, using btr changes the behavior on it
and causes e2fsprogs and sqlite miscompilations.
So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings
explicit in the IL, either at least for the duration of the combine pass
as does the following patch (where combine is the only known pass to have
such transformation), or even keep it until final pass in case there are
some later optimizations that would also need to know whether there was
explicit masking or not and with what mask. The latter change would be
much larger.
The following patch just reverts the r11-5756 change and adds a testcase.
2025-03-25 Jakub Jelinek <jakub@redhat.com>
PR target/96226
PR target/119428
* config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask,
splitter after *<rotate_insn><mode>3_mask_1): Revert 2020-12-05
changes.