Sandra Loosemore [Tue, 25 Mar 2025 05:25:34 +0000 (05:25 +0000)]
Doc: Move builtin documentation to a new chapter [PR42270]
This is part of an incremental effort to make the documentation for
GCC extensions better organized by grouping/rearranging sections by
topic.
I was originally intending to consolidate all the sections documenting
builtins as subsections of a new container section within the C
extensions chapter, but I ran into a technical limitation of Texinfo:
it only supports sectioning depth up to @subsubsection, and we already
had quite a few of those in the target-specific builtins sections. So
instead I have pulled all the existing sections out into a new
chapter. This actually makes sense since some of the builtins are
specific to C++ anyway and are not C language extensions at all.
Subsequent patches in this series will move things around within the
new chapter; this one just adds the new container node and adjusts
the menus.
gcc/ChangeLog
PR other/42270
* doc/extend.texi (C Extensions): Move menu items for
builtin-related sections to...
(Built-in Functions): New chapter.
* doc/gcc.texi (Introduction): Add menu entry for new chapter.
Sandra Loosemore [Tue, 25 Mar 2025 02:56:48 +0000 (02:56 +0000)]
Doc: Add a container section to consolidate attribute documentation [PR42270]
This is part of an incremental effort to make the chapter on GCC
extensions better organized by grouping/rearranging sections by topic.
Note that this patch does not address the restructuring/rewrite
suggested by PR88472 or PR102397, beyond adding a very short
introduction to the new container section that is more explicit about
both syntaxes being accepted as a GNU extension.
gcc/ChangeLog
PR other/42270
* doc/extend.texi (Attributes): New section.
(Function Attributes): Make it a subsection of the new section.
(Variable Attributes): Likewise.
(Type Attributes): Likewise.
(Label Attributes): Likewise.
(Enumerator Attributes): Likewise.
(Attribute Syntax): Likewise.
Sandra Loosemore [Tue, 25 Mar 2025 05:22:38 +0000 (05:22 +0000)]
Doc: Remove separate "Target Format Checks" section [PR42270]
This is part of an incremental effort to make the chapter on GCC
extensions better organized by grouping/rearranging sections by topic.
Following the last round of patches, there's a leftover section
"Target Format Checks" that didn't fit into any category. It seems best
to merge this material into the main discussion of the "format" attribute,
in particular because that discussion already contains similar discussion
for mingw/Windows targets.
gcc/ChangeLog
PR other/42270
* doc/extend.texi (Function Attributes): Merge text from "Target
Format Checks" into the main discussion of the format and
format_arg attributes.
(Target Format Checks): Delete section.
Jakub Jelinek [Sun, 30 Mar 2025 18:11:05 +0000 (20:11 +0200)]
testsuite: Fix up atomic-inst-ldlogic.c
r15-8956 changed in the test:
-/* { dg-final { scan-assembler-times "ldclr\t" 16} */
+/* { dg-final { scan-assembler-times "ldclr\t" 16 } */
which made it even worse than before, when the directive has
been silently ignored because it didn't match the regex for
directives. Now it matches it but is unbalanced.
The following patch fixes it and adds space after all the
other scan-assembler-times counts in the file.
2025-03-30 Jakub Jelinek <jakub@redhat.com>
* gcc.target/aarch64/atomic-inst-ldlogic.c: Fix another
unbalanced {} directive problem. Add space after all
scan-assembler-times counts.
Alpha: Add option to avoid data races for partial writes [PR117759]
Similarly to data races with 8-bit byte or 16-bit word quantity memory
writes on non-BWX Alpha implementations we have the same problem even on
BWX implementations with partial memory writes produced for unaligned
stores as well as block memory move and clear operations. This happens
at the boundaries of the area written where we produce unprotected RMW
sequences, such as for example:
ldbu $1,0($3)
stw $31,8($3)
stq $1,0($3)
to zero a 9-byte member at the byte offset of 1 of a quadword-aligned
struct, happily clobbering a 1-byte member at the beginning of said
struct if concurrent write happens while executing on the same CPU such
as in a signal handler or a parallel write happens while executing on
another CPU such as in another thread or via a shared memory segment.
To guard against these data races with partial memory write accesses
introduce the `-msafe-partial' command-line option that instructs the
compiler to protect boundaries of the data quantity accessed by instead
using a longer code sequence composed of narrower memory writes where
suitable machine instructions are available (i.e. with BWX targets) or
atomic RMW access sequences where byte and word memory access machine
instructions are not available (i.e. with non-BWX targets).
Owing to the desire of branch avoidance there are redundant overlapping
writes in unaligned cases where STQ_U operations are used in the middle
of a block so as to make sure no part of data to be written has been
lost regardless of run-time alignment. For the non-BWX case it means
that with blocks whose size is not a multiple of 8 there are additional
atomic RMW sequences issued towards the end of the block in addition to
the always required pair enclosing the block from each end.
Only one such additional atomic RMW sequence is actually required, but
code currently issues two for the sake of simplicity. An improvement
might be added to `alpha_expand_unaligned_store_words_safe_partial' in
the future, by folding `alpha_expand_unaligned_store_safe_partial' code
for handling multi-word blocks whose size is not a multiple of 8 (i.e.
with a trailing partial-word part). It would improve performance a bit,
but current code is correct regardless.
Update test cases with `-mno-safe-partial' where required and add new
ones accordingly.
In some cases GCC chooses to open-code block memory write operations, so
with non-BWX targets `-msafe-partial' will in the usual case have to be
used together with `-msafe-bwa'.
Credit to Magnus Lindholm <linmag7@gmail.com> for sharing hardware for
the purpose of verifying the BWX side of this change.
gcc/
PR target/117759
* config/alpha/alpha-protos.h
(alpha_expand_unaligned_store_safe_partial): New prototype.
* config/alpha/alpha.cc (alpha_expand_movmisalign)
(alpha_expand_block_move, alpha_expand_block_clear): Handle
TARGET_SAFE_PARTIAL.
(alpha_expand_unaligned_store_safe_partial)
(alpha_expand_unaligned_store_words_safe_partial)
(alpha_expand_clear_safe_partial_nobwx): New functions.
* config/alpha/alpha.md (insvmisaligndi): Handle
TARGET_SAFE_PARTIAL.
* config/alpha/alpha.opt (msafe-partial): New option.
* config/alpha/alpha.opt.urls: Regenerate.
* doc/invoke.texi (Option Summary, DEC Alpha Options): Document
the new option.
gcc/testsuite/
PR target/117759
* gcc.target/alpha/memclr-a2-o1-c9-ptr.c: Add
`-mno-safe-partial'.
* gcc.target/alpha/memclr-a2-o1-c9-ptr-safe-partial.c: New file.
* gcc.target/alpha/memcpy-di-unaligned-dst.c: New file.
* gcc.target/alpha/memcpy-di-unaligned-dst-safe-partial.c: New
file.
* gcc.target/alpha/memcpy-di-unaligned-dst-safe-partial-bwx.c:
New file.
* gcc.target/alpha/memcpy-si-unaligned-dst.c: New file.
* gcc.target/alpha/memcpy-si-unaligned-dst-safe-partial.c: New
file.
* gcc.target/alpha/memcpy-si-unaligned-dst-safe-partial-bwx.c:
New file.
* gcc.target/alpha/stlx0.c: Add `-mno-safe-partial'.
* gcc.target/alpha/stlx0-safe-partial.c: New file.
* gcc.target/alpha/stlx0-safe-partial-bwx.c: New file.
* gcc.target/alpha/stqx0.c: Add `-mno-safe-partial'.
* gcc.target/alpha/stqx0-safe-partial.c: New file.
* gcc.target/alpha/stqx0-safe-partial-bwx.c: New file.
* gcc.target/alpha/stwx0.c: Add `-mno-safe-partial'.
* gcc.target/alpha/stwx0-bwx.c: Add `-mno-safe-partial'. Refer
to stwx0.c rather than copying its code and also verify no LDQ_U
or STQ_U instructions have been produced.
* gcc.target/alpha/stwx0-safe-partial.c: New file.
* gcc.target/alpha/stwx0-safe-partial-bwx.c: New file.
Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]
With non-BWX Alpha implementations we have a problem of data races where
a 8-bit byte or 16-bit word quantity is to be written to memory in that
in those cases we use an unprotected RMW access of a 32-bit longword or
64-bit quadword width. If contents of the longword or quadword accessed
outside the byte or word to be written are changed midway through by a
concurrent write executing on the same CPU such as by a signal handler
or a parallel write executing on another CPU such as by another thread
or via a shared memory segment, then the concluding write of the RMW
access will clobber them. This is especially important for the safety
of RCU algorithms, but is otherwise an issue anyway.
To guard against these data races with byte and aligned word quantities
introduce the `-msafe-bwa' command-line option (standing for Safe Byte &
Word Access) that instructs the compiler to instead use an atomic RMW
access sequence where byte and word memory access machine instructions
are not available. There is no change to code produced for BWX targets.
It would be sufficient for the secondary reload handle to use a pair of
scratch registers, as requested by `reload_out<mode>', but it would end
with poor code produced as one of the scratches would be occupied by
data retrieved and the other one would have to be reloaded with repeated
calculations, all within the LL/SC sequence.
Therefore I chose to add a dedicated `reload_out<mode>_safe_bwa' handler
and ask for more scratches there by defining a 256-bit OI integer mode.
While reload is documented in our manual to support an arbitrary number
of scratches in reality it hasn't been implemented for IRA:
/* ??? It would be useful to be able to handle only two, or more than
three, operands, but for now we can only handle the case of having
exactly three: output, input and one temp/scratch. */
and it seems to be the case for LRA as well. Do what everyone else does
then and just have one wide multi-register scratch.
I note that the atomic sequences emitted are suboptimal performance-wise
as the looping branch for the unsuccessful completion of the sequence
points backwards, which means it will be predicted as taken despite that
in most cases it will fall through. I do not see it as a deficiency of
this change proposed as it takes care of recording that the branch is
unlikely to be taken, by calling `alpha_emit_unlikely_jump'. Therefore
generic code elsewhere should instead be investigated and adjusted
accordingly for the arrangement to actually take effect.
Add test cases accordingly.
There are notable regressions between a plain `-mno-bwx' configuration
and a `-mno-bwx -msafe-bwa' one:
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O0 execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O1 execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O2 execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O3 -g execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -Os execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: g++.dg/init/array25.C -std=c++17 execution test
FAIL: g++.dg/init/array25.C -std=c++98 execution test
FAIL: g++.dg/init/array25.C -std=c++26 execution test
They come from the fact that these test cases play tricks with alignment
and end up calling code that expects a reference to aligned data but is
handed one to unaligned data.
This doesn't cause a visible problem with plain `-mno-bwx' code, because
the resulting alignment exception is fixed up by Linux. There's no such
handling currently implemented for LDL_L or LDQ_L instructions (which
are first in the sequence) and consequently the offender is issued with
SIGBUS instead. Suitable handling will be added to Linux to complement
this change that will emulate the trapping instructions[1], so these
interim regressions are seen as harmless and expected.
References:
[1] "Alpha: Emulate unaligned LDx_L/STx_C for data consistency",
<https://lore.kernel.org/r/alpine.DEB.2.21.2502181912230.65342@angie.orcam.me.uk/>
gcc/
PR target/117759
* config/alpha/alpha-modes.def (OI): New integer mode.
* config/alpha/alpha-protos.h (alpha_expand_mov_safe_bwa): New
prototype.
* config/alpha/alpha.cc (alpha_expand_mov_safe_bwa): New
function.
(alpha_secondary_reload): Handle TARGET_SAFE_BWA.
* config/alpha/alpha.md (aligned_store_safe_bwa)
(unaligned_store<mode>_safe_bwa, reload_out<mode>_safe_bwa)
(reload_out<mode>_unaligned_safe_bwa): New expanders.
(mov<mode>, movcqi, reload_out<mode>_aligned): Handle
TARGET_SAFE_BWA.
(reload_out<mode>): Guard against TARGET_SAFE_BWA.
* config/alpha/alpha.opt (msafe-bwa): New option.
* config/alpha/alpha.opt.urls: Regenerate.
* doc/invoke.texi (Option Summary, DEC Alpha Options): Document
the new option.
gcc/testsuite/
PR target/117759
* gcc.target/alpha/stb.c: New file.
* gcc.target/alpha/stb-bwa.c: New file.
* gcc.target/alpha/stb-bwx.c: New file.
* gcc.target/alpha/stba.c: New file.
* gcc.target/alpha/stba-bwa.c: New file.
* gcc.target/alpha/stba-bwx.c: New file.
* gcc.target/alpha/stw.c: New file.
* gcc.target/alpha/stw-bwa.c: New file.
* gcc.target/alpha/stw-bwx.c: New file.
* gcc.target/alpha/stwa.c: New file.
* gcc.target/alpha/stwa-bwa.c: New file.
* gcc.target/alpha/stwa-bwx.c: New file.
IRA+LRA: Let the backend request to split basic blocks
The next change for Alpha will produce extra labels and branches in
reload, which in turn requires basic blocks to be split at completion.
We do this already for functions that can trap, so just extend the
arrangement with a flag for the backend to use whenever it finds it
necessary.
Alpha: Export `emit_unlikely_jump' for a subsequent change to use
Rename `emit_unlikely_jump' function to `alpha_emit_unlikely_jump', so
as to avoid namespace pollution, updating callers accordingly and export
it for use in the machine description. Make it return the insn emitted.
LIU Hao [Sat, 29 Mar 2025 14:47:54 +0000 (22:47 +0800)]
gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit targets
Windows only requires sections to be aligned on a 4-byte boundary. This used
to work because in binutils the `.rdata` section is over-aligned to a 16-byte
boundary, which will be fixed in the future.
This matches the output of Clang.
Signed-off-by: LIU Hao <lh_mouse@126.com> Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/ChangeLog:
* config/mingw/winnt.cc (mingw_pe_file_end): Add `.p2align`.
Jason Merrill [Sat, 29 Mar 2025 12:56:09 +0000 (08:56 -0400)]
c++/modules: unexported friend template
Here we were failing to match the injected friend declaration to the
definition because the latter isn't exported. But the friend is attached to
the module, so we need to look for any reachable declaration in that module,
not just the exports.
The duplicate_decls change is to avoid clobbering DECL_MODULE_IMPORT_P on
the imported definition; matching an injected friend doesn't change that
it's imported. I considered checking get_originating_module == 0 or
!decl_defined_p instead of DECL_UNIQUE_FRIEND_P there, but I think this
situation is specific to friends.
I removed an assert because we have a test for the same condition a few
lines above.
gcc/cp/ChangeLog:
* decl.cc (duplicate_decls): Don't clobber DECL_MODULE_IMPORT_P with
an injected friend.
* name-lookup.cc (check_module_override): Look at all reachable
decls in decl's originating module.
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-9_a.C: New test.
* g++.dg/modules/friend-9_b.C: New test.
Jason Merrill [Mon, 24 Mar 2025 19:28:04 +0000 (15:28 -0400)]
c++: optimize push_to_top_level [PR64500]
Profiling showed that the loop to save away IDENTIFIER_BINDINGs from open
binding levels was taking 5% of total compilation time in the PR116285
testcase. This turned out to be because we were unnecessarily trying to do
this for namespaces, whose bindings are found through
DECL_NAMESPACE_BINDINGS, not IDENTIFIER_BINDING.
As a result we would frequently loop through everything in std::, checking
whether it needs to be stored, and never storing anything.
This change actually appears to speed up compilation for the PR116285
testcase by ~20%.
The replaced comments referred either to long-replaced handling of classes
and templates, or to wanting b to point to :: when the loop exits.
PR c++/64500
PR c++/116285
gcc/cp/ChangeLog:
* name-lookup.cc (push_to_top_level): Don't try to store_bindings
for namespace levels.
Nathaniel Shead [Fri, 28 Mar 2025 12:38:26 +0000 (23:38 +1100)]
c++/modules: Fix modules and LTO with header units [PR118961]
This patch makes some adjustments required to get a simple modules
testcase working with LTO. There are two main issues fixed.
Firstly, modules only streams the maybe-in-charge constructor, and any
clones are recreated on stream-in. These clones are copied from the
existing function decl and then adjusted. This caused issues because
the clones were getting incorrectly marked as abstract, since after
clones have been created (in the imported file) the maybe-in-charge decl
gets marked as abstract. So this patch just ensures that clones are
always created as non-abstract.
The second issue is that we need to explicitly tell cgraph that explicit
instantiations need to be emitted, otherwise LTO will elide them (as
they don't necessarily appear to be used directly) and cause link
errors. Additionally, expand_or_defer_fn doesn't setup comdat groups
for explicit instantiations, so we need to do that here as well.
Currently this is all handled in 'mark_decl_instantiated'; this patch
splits out the linkage handling into a separate function that we can
call from modules code, maybe in GCC16 we could move this somewhere more
central.
PR c++/118961
gcc/cp/ChangeLog:
* class.cc (copy_fndecl_with_name): Mark clones as non-abstract.
* cp-tree.h (setup_explicit_instantiation_definition_linkage):
Declare new function.
* module.cc (trees_in::read_var_def): Use it.
(module_state::read_cluster): Likewise.
* pt.cc (setup_explicit_instantiation_definition_linkage): New
function.
(mark_decl_instantiated): Use it.
gcc/testsuite/ChangeLog:
* g++.dg/modules/lto-1.h: New test.
* g++.dg/modules/lto-1_a.H: New test.
* g++.dg/modules/lto-1_b.C: New test.
* g++.dg/modules/lto-1_c.C: New test.
* g++.dg/modules/lto-2_a.H: New test.
* g++.dg/modules/lto-2_b.C: New test.
* g++.dg/modules/lto-3_a.H: New test.
* g++.dg/modules/lto-3_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Jakub Jelinek [Fri, 28 Mar 2025 23:49:27 +0000 (00:49 +0100)]
testsuite: Fix up musttail2.C test
On Wed, Mar 26, 2025 at 10:10:07AM -0700, Andi Kleen wrote:
> I think this needs to be target external_tailcall, otherwise you will
> fail on targets that don't support that.
>
> Or alternatively make this not extern.
You're right (although I don't remember which targets are
non-external_musttail).
Here is a patch to define the function.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
* g++.dg/opt/musttail2.C (foo): Define the function instead of
just declaring it, add [[gnu::noipa]] attribute to it.
Jakub Jelinek [Fri, 28 Mar 2025 23:47:57 +0000 (00:47 +0100)]
cobol: Fix up cobol/{charmaps,valconv}.cc rules
sed -i is not portable, it is supported by GNU sed and perhaps some BSDs,
but not elsewhere.
Furthermore, I think it is far better to always use
#include "../../libgcobol/something.h"
paths rather than something depending on the build directory.
And because we require GNU make, we don't have to have two different
rules for those, can use just one for both.
The l variable in there is just to make it fit into 80 columns.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
* Make-lang.in (cobol/charmaps.cc, cobol/valconv.cc): Used sed -e
instead of cp and multiple sed -i commands. Always prefix libgcobol
header names in #include directives with ../../libgcobol/ rather than
something depending on $(LIB_SOURCE).
Jeremy Bettis [Fri, 28 Mar 2025 07:54:27 +0000 (00:54 -0700)]
libcpp: Fix incorrect line numbers in large files [PR108900]
This patch addresses an issue in the C preprocessor where incorrect
line number information is generated when processing files with a
large number of lines. The problem arises from improper handling
of location intervals in the line map, particularly when locations
exceed LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES.
By ensuring that the highest location is not decremented if it
would move to a different ordinary map, this fix resolves
the line number discrepancies observed in certain test cases.
This change improves the accuracy of line number reporting, benefiting
users relying on precise code coverage and debugging information.
libcpp/ChangeLog:
PR preprocessor/108900
* files.cc (_cpp_stack_file): Do not decrement highest_location
across distinct maps.
Signed-off-by: Jeremy Bettis <jbettis@google.com> Signed-off-by: Yash Shinde <Yash.Shinde@windriver.com>
Jakub Jelinek [Fri, 28 Mar 2025 19:29:31 +0000 (20:29 +0100)]
testsuite: Don't cycle through option list for gfortran.dg and libgomp.fortran dg-do run tests with -O in dg*options
Here is a new version of the patch.
The current behavior in gfortran.dg/ and libgomp.fortran/libgomp.oacc-fortran
is that tests without any dg-do directive are implicitly dg-do compile
and tests with dg-do compile or without dg-do don't cycle through options
(-O is implicitly added but can be overridden), while test with dg-do run
cycle through the optimization options.
The following patch modifies this, so that even tests with dg-do run
with -O in dg-options or dg-additional-options (after [ \t"{]) don't cycle
either and also get implicit -O which is overridden by that
-O{,0,1,2,3,s,z,g,fast} in dg-{,additional-}options. Previously we were
mostly wasting test time on those, because e.g.
-O0 -O2
-O1 -O2
-O2 -O2
-Os -O2
are still effectively -O2 and so the same thing, while
-O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -O2
and
-O3 -g -O2
are not the same thing (effectively
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -O2
and
-g -O2) I think it isn't worth to test those combinations (especially when
with e.g. -O0 in dg-options it mostly doesn't do much).
Tested with make check-gfortran where this results in slight decrease of
tests:
# of expected passes 73809
# of expected failures 343
# of unsupported tests 78
with unmodified trunk vs.
# of expected passes 72734
# of expected failures 343
# of unsupported tests 73
with the patch, and on the libgomp side
# of expected passes 11162
# of expected failures 238
# of unsupported tests 274
to
# of expected passes 11092
# of expected failures 238
# of unsupported tests 274
(when counting just fortran.exp tests).
Before the patch I see
grep -- '-O[^ ].*-O' testsuite/gfortran/gfortran.log | grep -v '/vect/\|/graphite/' | wc -l
1008
and with the patch
grep -- '-O[^ ].*-O' testsuite/gfortran/gfortran.log | grep -v '/vect/\|/graphite/' | wc -l
0
(vect and graphite have a few occurrences, but not too much).
2025-03-28 Jakub Jelinek <jakub@redhat.com>
* lib/gfortran-dg.exp: Don't cycle through the option list if
dg-options or dg-additional-options contains -O after space, tab,
double quote or open curly bracket.
* gfortran.dg/cray_pointers_2.f90: Remove extraneous space between
dg-do and run and remove comment about it.
Gaius Mulley [Fri, 28 Mar 2025 15:25:55 +0000 (15:25 +0000)]
PR modula2/119504: ICE when attempting to access an element of a constant string
This patch prevents an ICE and generates an error if an array access to a
constant string is attempted. The patch also allows HIGH ("string").
gcc/m2/ChangeLog:
PR modula2/119504
* gm2-compiler/M2Quads.mod (BuildHighFunction): Defend against
Type = NulSym and fall into BuildConstHighFromSym.
(BuildDesignatorArray): Rewrite to detect an array access to
a constant string.
(BuildDesignatorArrayStaticDynamic): New procedure.
gcc/testsuite/ChangeLog:
PR modula2/119504
* gm2/iso/fail/conststrarray2.mod: New test.
* gm2/iso/run/pass/constarray2.mod: New test.
* gm2/pim/pass/hexstring.mod: New test.
Jakub Jelinek [Fri, 28 Mar 2025 14:45:03 +0000 (15:45 +0100)]
srcextra fixes
Here is a patch which uses sed to fix up the copies of the generated
files by flex/bison in the source directories (i.e. what we ship in
release tarballs).
In that case the generated files are in the same directory as the
files they are generated from, so there should be no absolute or relative
directories, just the filenames.
Furthermore, c.srcextra was duplicating the work of gcc.srcextra, there is
nothing C FE specific on gengtype-lex.l.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
gcc/
* Makefile.in (gcc.srcextra): Use sed to turn .../gcc/gengtype-lex.l
in #line directives into just gengtype-lex.l.
gcc/c/
* Make-lang.in (c.srcextra): Don't depend on anything and don't copy
anything.
gcc/cobol/
* Make-lang.in (cobol.srcextra): Use sed to turn
.../gcc/cobol/*.{y,l,h,cc} and cobol/*.{y,l,h,cc} in #line directives
into just *.{y,l,h,cc}.
Richard Biener [Fri, 28 Mar 2025 14:20:16 +0000 (15:20 +0100)]
other/119510 - use --enable-languages=default,cobol for release tarballs
The following adds cobol to the set of languages built during release
tarball building so the bison and flex generated sources for cobol
are included in the tarball.
PR other/119510
maintainer-scripts/
* gcc_release: Use --enable-languages=default,cobol
when building generated files.
Andrew MacLeod [Wed, 26 Mar 2025 14:34:42 +0000 (10:34 -0400)]
If the LHS does not contain zero, neither do multiply operands.
Given ~[0,0] = op1 * op2, range-ops should determine that neither op1 nor
op2 is zero. Add this to the operator_mult for op1_range. op2_range
simply invokes op1_range, so both will be covered.
PR tree-optimzation/110992.c
PR tree-optimzation/119471.c
gcc/
* range-op.cc (operator_mult::op1_range): If the LHS does not
contain zero, return non-zero.
Richard Biener [Fri, 28 Mar 2025 12:48:36 +0000 (13:48 +0100)]
bootstrap/119513 - fix cobol bootstrap with --enable-generated-files-in-srcdir
This adds gcc/cobol/parse.o to compare_exclusions and makes sure to
ignore errors when copying generated files, like it's done when
copying gengtype-lex.cc.
Jakub Jelinek [Fri, 28 Mar 2025 09:49:40 +0000 (10:49 +0100)]
tailc: Handle musttail noreturn calls [PR119483]
The following (first) testcase is accepted by clang (if clang::musttail)
and rejected by gcc, because we discover the call is noreturn and then bail
out because we don't want noreturn tailcalls.
The general reason not to support noreturn tail calls is for cases like
abort where we want nicer backtrace, but if user asks explicitly to
musttail a call which either is explicitly noreturn or is implicitly
determined to be noreturn, I don't see a reason why we couldn't do that.
Both for tail calls and tail recursions.
An alternative would be to keep rejecting musttail to explicit noreturn,
but not actually implicitly mark anything as noreturn if it has any musttail
calls. But it is unclear how we could do that, such marking is I think done
typically before IPA and e.g. for LTO we won't know whether some other TU
could have musttail calls to it. And keeping around both explicit and
implicit noreturn bits would be ugly. Well, I guess we could differentiate
between presence of noreturn/_Noreturn attributes and just ECF_NORETURN
without those, but then tailc would still need to support it, just error out
if it was explicit.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119483
* tree-tailcall.cc (find_tail_calls): Handle noreturn musttail
calls.
(eliminate_tail_call): Likewise.
(tree_optimize_tail_calls_1): If cfun->has_musttail and
diag_musttail, handle also basic blocks with no successors
with noreturn musttail calls.
* calls.cc (can_implement_as_sibling_call_p): Allow ECF_NORETURN
calls if they are musttail calls.
* c-c++-common/pr119483-1.c: New test.
* c-c++-common/pr119483-2.c: New test.
Jakub Jelinek [Fri, 28 Mar 2025 09:48:31 +0000 (10:48 +0100)]
ipa-sra: Don't change return type to void if there are musttail calls [PR119484]
The following testcase is rejected, because IPA-SRA decides to
turn bar.constprop call into bar.constprop.isra which returns void.
While there is no explicit lhs on the call, as it is a musttail call
the tailc pass checks if IPA-VRP returns singleton from that function
and the function returns the same value and in that case it still turns
it into a tail call. This can't work with IPA-SRA changing it into
void returning function though.
The following patch fixes this by forcing returning the original type
if there are musttail calls.
2025-03-28 Jakub Jelinek <jakub@redhat.com>
PR ipa/119484
* ipa-sra.cc (isra_analyze_call): Don't set m_return_ignored if
gimple_call_must_tail_p even if it doesn't have lhs.
Richard Biener [Fri, 21 Mar 2025 18:30:31 +0000 (19:30 +0100)]
Export native_encode_real operating on REAL_VALUE_TYPE
The following exports the native_encode_real worker, and makes it
take a scalar float mode and REAL_VALUE_TYPE data instead of a tree
for use in the COBOL frontend, avoiding creating of a temporary tree.
* fold-const.h (native_encode_real): Export.
* fold-const.cc (native_encode_real): Change API to take
mode and REAL_VALUE_TYPE.
(native_encode_expr): Adjust.
Iain Sandoe [Thu, 20 Mar 2025 17:08:57 +0000 (17:08 +0000)]
cobol: Do not include <cmath> (no longer needed)
Several of enumerators in parse.y conflict with ones declared in at
least some versions of <cmath> .. e.g. "OVERFLOW". The header is no
longer needed since the FE is not trying to do host arithmetic.
David Malcolm [Thu, 27 Mar 2025 23:46:20 +0000 (19:46 -0400)]
contrib: add dg-lint and libgdiagnostics.py [PR116163]
Changed in v2:
- eliminated COMMON_MISSPELLINGS in favor of retesting with a regexp
that adds underscores
- add a list of KNOWN_DIRECTIVES, and complain if we see a directive
that isn't in the list
- various refactorings to reduce the nesting within the script
- skip more kinds of file ('README', 'Makefile.am', 'Makefile.in',
'gen_directive_tests')
- keep track of the number of files scanned and report it and the end
with a note
This patch adds a new dg-lint subdirectory below contrib, containing
a "dg-lint" script for detecting common mistakes made in our DejaGnu
tests.
Specifically, DejaGnu's dg.exp's dg-get-options has a regexp for
detecting dg- directives
https://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=blob;f=lib/dg.exp
here's the current:
set tmp [grep $prog "{\[ \t\]\+dg-\[-a-z\]\+\[ \t\]\+.*\[ \t\]\+}" line]
which if I'm reading it right requires a "{", then one or more tab/space
chars, then a "dg-" directive name, then one of more tab/space
characters, then anything (for arguments to the directive), then one of
more tab/space character, then a "}".
There are numerous places in our testsuite which look like attempts to
use a directive, but which don't match this regexp.
The script warns about such places, along with a list of misspelled
directives (currently just "dg_options" for "dg-options"), and a warning
if a dg-do appears after a dg-require-* (as per
https://gcc.gnu.org/onlinedocs/gccint/Directives.html
"This directive must appear after any dg-do directive in the test
and before any dg-additional-sources directive." for
dg-require-effective-target.
dg-lint uses libgdiagnostics to report its results; the patch adds a
new libgdiagnostics.py script below contrib/dg-lint. This uses Python's
ctypes module to expose libgdianostics.so to Python via FFI. Hence
the warnings have colorization, quote the pertinent parts of the tested
file, can have fix-it hints, etc. Here's the output from the tests, run
from the top-level directory:
$ LD_LIBRARY_PATH=../build/gcc/ ./contrib/dg-lint/dg-lint contrib/dg-lint/test-*.c
contrib/dg-lint/test-1.c:6:6: warning: misspelled directive: 'dg_final'; did you mean 'dg-final'?
6 | /* { dg_final { scan_assembler_times "vmsumudm" 2 } } */
| ^~~~~~~~
| dg-final
contrib/dg-lint/test-1.c:15:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
15 | dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:18:4: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
18 | dg-output-file "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:21:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
21 | { dg-output-file "m4.out"
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:24:5: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
24 | {dg-output-file "m4.out"}
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-1.c:27:6: warning: directive 'dg-output-file' appears not to match dg.exp's regexp
27 | { dg-output-file, "m4.out" }
| ^~~~~~~~~~~~~~
contrib/dg-lint/test-2.c:4:6: warning: 'dg-do' after 'dg-require-effective-target'
4 | /* { dg-do compile } */
| ^~~~~
contrib/dg-lint/test-2.c:3:6: note: 'dg-require-effective-target' was here
3 | /* { dg-require-effective-target c++11 } */
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
I don't yet have a way to verify these tests (clearly we can't use
DejaGnu for this).
These Python bindings could be used by other projects, but so far I only
implemented what I needed for dg-lint.
Running the test on the GCC source tree finds dozens of issues, which
followup patches address.
Tested with Python 3.8
contrib/ChangeLog:
PR testsuite/116163
* dg-lint/dg-lint: New file.
* dg-lint/libgdiagnostics.py: New file.
* dg-lint/test-1.c: New file.
* dg-lint/test-2.c: New file.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Bob Dubner [Thu, 27 Mar 2025 21:55:53 +0000 (17:55 -0400)]
cobol: Incorporate new testcases from the cobolworx UAT tests.
The author notes that some of the file names are regrettably lengthy,
which is because they are derived from the descriptive names of the
autom4te tests.
Jakub Jelinek [Thu, 27 Mar 2025 20:21:48 +0000 (21:21 +0100)]
testsuite: Fix up strub-internal-pr112938.C test for C++2{0,3,6}
On Thu, Mar 27, 2025 at 12:05:21AM +0000, Sam James wrote:
> The test was being ignored because dg.exp looks for .C in g++.dg/.
>
> gcc/testsuite/ChangeLog:
> PR middle-end/112938
>
> * g++.dg/strub-internal-pr112938.cc: Move to...
> * g++.dg/strub-internal-pr112938.C: ...here.
This regressed the test for C++20 and higher:
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++20 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++23 (test for excess errors)
FAIL: g++.dg/strub-internal-pr112938.C -std=gnu++26 (test for excess errors)
Here is a fix.
2025-03-27 Jakub Jelinek <jakub@redhat.com>
* g++.dg/strub-internal-pr112938.C: Add dg-warning for c++20.
Eric Botcazou [Thu, 27 Mar 2025 19:29:51 +0000 (20:29 +0100)]
Ada: Fix too late initialization of tasking runtime with standalone library
The Tasking_Runtime_Initialize routine installs the tasking version of the
RTS_Lock manipulation routines and thus needs to be called very early before
the elaboration of all the Ada units of the program, including those of the
runtime itself.
This is guaranteed by the binder when the tasking runtime is explicitly
dragged into the link. However, for a standalone dynamic library that
does not depend on the tasking runtime and is auto-initialized, no such
guarantee holds, even though the library might be later dragged into a
link that contains the tasking runtime.
This change causes the routine to be called even earlier, in particular
at load time when a (standalone) dynamic library is involved in the link,
so as to meet the requirements. It will cause the routine to be called
twice if the main subprogram is generated by the binder, but this is
harmless since the routine is idempotent.
ada/
* libgnarl/s-tasini.adb (Tasking_Runtime_Initialize): Add pragma
Linker_Constructor for the procedure.
Sam James [Thu, 27 Mar 2025 17:52:19 +0000 (17:52 +0000)]
testsuite: revert Fortran change
Revert part of my change from r15-8973-g1307de1b4e7d5e; as Harald points
out, the comment explains why this is there. It's a hack but it needs to
stay for now. (I did have this marked as a TODO in my branch and didn't
leave a proper note as to why, so it's my fault.)
Jonathan Wakely [Wed, 26 Mar 2025 11:47:05 +0000 (11:47 +0000)]
libstdc++: Replace use of std::min in ranges::uninitialized_xxx algos [PR101587]
Because ranges can have any signed integer-like type as difference_type,
it's not valid to use std::min(diff1, diff2). Instead of calling
std::min with an explicit template argument, this adds a new __mindist
helper that determines the common type and uses that with std::min.
libstdc++-v3/ChangeLog:
PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
New function object.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
__mindist instead of std::min.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check ranges with difference difference types.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
Jonathan Wakely [Wed, 26 Mar 2025 17:45:06 +0000 (17:45 +0000)]
libstdc++: Use const_cast to workaround tm_zone being non-const
Iain reported that he's seeing this on Darwin:
include/bits/chrono_io.h:914: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
This is because the BSD definition ot tm::tm_zone is a char* (and has
been since 1987) rather than const char* as in Glibc and POSIX.1-2024.
We can fix it by using const_cast<char*> when setting the tm_zone
member. This should be safe because libc doesn't actually write anything
to tm_zone; it's only non-const because the BSD definition predates the
addition of the const keyword to C.
For targets where it's a const char* the cast won't matter because it
will be converted back to const char* on assignment anyway.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Use
const_cast when setting tm.tm_zone.
Richard Biener [Thu, 27 Mar 2025 07:21:10 +0000 (08:21 +0100)]
target/119010 - more DFmode handling in zn4zn5 reservations
The following adds DFmode where V1DFmode and SFmode were handled.
This resolves missing reservations for adds, subs [with memory]
and for FMAs for the testcase I'm looking at. Resolved cases are
-;; 16--> b 0: i 237 xmm3=xmm3+[r9*0x8+si] :nothing
-;; 29--> b 0: i 246 xmm3=xmm3+xmm1 :nothing
-;; 46--> b 0: i 296 xmm1=xmm1-xmm3 :nothing
I've done search-and-replace for this, the catched cases look reasonable
though I'm of course not sure all of them can actually happen.
This also fixes the matched type for the znver{4,5}_sse_muladd_load
reservations from sseshuf to ssemuladd, resolving
-;; 1--> b 0: i 161 xmm0={-xmm0*xmm27+[cx+ax]} :nothing
-;; 22--> b 0: i 229 xmm11={-xmm11*xmm7+[di*0x8+dx]} :nothing
Sam James [Thu, 27 Mar 2025 13:19:51 +0000 (13:19 +0000)]
testsuite: harmless dg-* whitespace fixes
These just fix inconsistent/unusual style to avoid noise when grepping
and also people picking up bad habits when they see it (as similar
mistakes can be harmful).
Tobias Burnus [Thu, 27 Mar 2025 13:09:20 +0000 (14:09 +0100)]
OpenMP: Fix C++ template handling with append_args' prefer_type modifier
It is possible but not very sensible to use C++ templates with in the
prefer_type modifier to the 'append_args' clause of 'declare variant'.
The commit r15-6336-g12dd892b1a3ad7 added substitution support in pt.cc,
but missed to update afterward the actual data in decl.cc.
As gimplification support was only added in r15-8898-gf016ee89955ab4,
this could not be tested back then. The latter commit added a sorry
for it gimplify.cc and the existing testcase, which this commit now removes.
gcc/cp/ChangeLog:
* cp-tree.h (cp_finish_omp_init_prefer_type): Add.
* decl.cc (omp_declare_variant_finalize_one): Call it.
* pt.cc (tsubst_attribute): Minor rebustification for OpenMP
append_args handling.
* semantics.cc (cp_omp_init_prefer_type_update): Rename to ...
(cp_finish_omp_init_prefer_type): ... this; remove static attribute
and return modified tree. Move clause handling to ...
(finish_omp_clauses): ... the caller.
libstdc++: re-bump the feature-test macro for P2562R1 [PR119488]
Now that the algorithms have been merged we can advertise full support
for P2562R1. This effectively reverts r15-8933-ga264c270fde292.
libstdc++-v3/ChangeLog:
PR libstdc++/119488
* include/bits/version.def (constexpr_algorithms): Bump
the feature-testing macro.
* include/bits/version.h: Regenerate.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: Test the
bumped value for the feature-testing macro.
This completes the implementation of P2562R1 for C++26.
Unlike the other constexpr algorithms of the same family,
stable_partition does not have a constexpr-friendly version "ready to
use" during constant evaluation. In fact, it is not even available on
freestanding, because it always allocates a temporary memory buffer.
This commit implements the simplest possible strategy: during constant
evaluation allocate a buffer of length 1 on the stack, and use that as
a working area.
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (stable_partition): Mark it
as constexpr for C++26.
* include/bits/ranges_algo.h (__stable_partition_fn): Likewise.
* include/bits/stl_algo.h (stable_partition): Mark it as
constexpr for C++26; during constant evaluation use a new
codepath where a temporary buffer of 1 element is used.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(stable_partition): Add constexpr.
* testsuite/25_algorithms/stable_partition/constexpr.cc: New test.
This commit adds support for constexpr inplace_merge, added by P2562R1
for C++26. The implementation strategy is the same as for constexpr
stable_sort: use if consteval to detect if we're in constant evaluation,
and dispatch to a suitable path (same one as freestanding).
libstdc++-v3/ChangeLog:
* include/bits/algorithmfwd.h (inplace_merge): Mark it as
constexpr for C++26.
* include/bits/ranges_algo.h (__inplace_merge_fn): Likewise.
* include/bits/stl_algo.h (inplace_merge): Mark it as constexpr;
during constant evaluation, dispatch to the non-allocating
codepath.
* testsuite/25_algorithms/headers/algorithm/synopsis.cc
(inplace_merge): Add constexpr.
* testsuite/25_algorithms/inplace_merge/constexpr.cc: New test.
Nathaniel Shead [Wed, 26 Mar 2025 12:43:36 +0000 (23:43 +1100)]
c++/modules: Handle conflicting ABI tags [PR118920]
The ICE in the linked PR is caused because out_ptr_t inherits an ABI tag
in a module that it does not in the importing module. When we try to
build a qualified 'const out_ptr_t' during stream-in, we find the
existing 'const out_ptr_t' variant type that has been built, but discard
it due to having a mismatching attribute list. This causes us to build
a new copy of this variant, and ultimately fail a checking assertion due
to this being an identical type with different TYPE_CANONICAL.
This patch adds checking that ABI tags between an imported and existing
declaration match, and errors if they are incompatible. We make use of
'equal_abi_tags' from mangle.cc to determine if we should error; in the
case in the PR, because the ABI tag was an implicit tag that doesn't
affect name mangling, we don't need to error. To fix the ICE we ensure
that (regardless of whether we errored or not) later processing
considers the ABI tags as equivalent.
PR c++/118920
gcc/cp/ChangeLog:
* cp-tree.h (equal_abi_tags): Declare.
* mangle.cc (equal_abi_tags): Make external, fix comparison.
(tree_string_cmp): Make internal.
* module.cc (trees_in::check_abi_tags): New function.
(trees_in::decl_value): Use it.
(trees_in::is_matching_decl): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/attrib-3_a.H: New test.
* g++.dg/modules/attrib-3_b.C: New test.
* g++.dg/modules/pr118920.h: New test.
* g++.dg/modules/pr118920_a.H: New test.
* g++.dg/modules/pr118920_b.H: New test.
* g++.dg/modules/pr118920_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Wed, 26 Mar 2025 13:23:24 +0000 (00:23 +1100)]
c++/modules: Fix tsubst of global module friend classes [PR118920]
When doing tsubst_friend_class, we need to first check if any imported
module has already created a (hidden) declaration for the class so that
we don't end up with conflicting declarations. Currently we do this
using DECL_MODULE_IMPORT_P, but this is not set in cases where the class
is in the global module and matches an existing GM declaration we've
seen (via an include, for example).
This patch fixes this by checking DECL_MODULE_ENTITY_P instead, which is
set on all entities that have been seen from a module import. We also
use the 'for_mangle' version of get_originating_module so that we don't
treat imported GM entities as attached to the module we imported them
from. And rename that parameter to something more general.
And dump_module_suffix is another place where we want to treat global module
entities as not coming from a module.
PR c++/118920
gcc/cp/ChangeLog:
* name-lookup.cc (lookup_imported_hidden_friend): Check for
module entity rather than just module import.
* module.cc (get_originating_module): Rename for_mangle parm to
global_m1.
* error.cc (dump_module_suffix): Don't decorate global module decls.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-17.h: New test.
* g++.dg/modules/tpl-friend-17_a.C: New test.
* g++.dg/modules/tpl-friend-17_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Co-authored-by: Jason Merrill <jason@redhat.com>
Jonathan Wakely [Wed, 26 Mar 2025 11:21:32 +0000 (11:21 +0000)]
libstdc++: Fix std::ranges::iter_move for function references [PR119469]
The result of std::move (or a cast to an rvalue reference) on a function
reference is always an lvalue. Because std::ranges::iter_move was using
the type std::remove_reference_t<X>&& as the result of std::move, it was
giving the wrong type for function references. Use a decltype-specifier
with declval<remove_reference_t<X>>() instead of just using the
remove_reference_t<X>&& type directly. This gives the right result,
while still avoiding the cost of doing overload resolution for
std::move.
libstdc++-v3/ChangeLog:
PR libstdc++/119469
* include/bits/iterator_concepts.h (_IterMove::__result): Use
decltype-specifier instead of an explicit type.
* testsuite/24_iterators/customization_points/iter_move.cc:
Check results for function references.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Earnshaw [Wed, 26 Mar 2025 15:56:18 +0000 (15:56 +0000)]
arm: don't vectorize fmaxf() unless unsafe math opts are enabled
This test has presumably been failing since vectorization was enabled
at -O2. I suspect part of the reason this wasn't picked up sooner is
that the test is a hybrid execution/scan-assembler test and the
execution part requires appropriate hardware.
The problem is that we are vectorizing an expansion of fmaxf() when
the vector version of the instruction does not preserve denormal
values. This means we should only apply this optimization when
-funsafe-math-optimizations is enabled.
This fix does a few things:
- Moves the expand pattern to vec-common.md. Although I haven't changed
its behaviour (beyond fixing the bug), this should really be enabled for
MVE as well (but that will need to wait for gcc-16 since the MVE code
needs some additional changes first).
- Adds support for HF mode vectors.
- splits the test that was exposing the bug into two parts: an executable
test and a scan-assembler test. The scan-assembler version is more
widely enabled, since it does not require a suitable executable environment.
gcc/ChangeLog:
* config/arm/neon.md (<fmaxmin><mode>3): Move pattern from here...
* config/arm/vec-common.md (<fmaxmin><mode>3): ... to here. Convert
to define_expand and disable the pattern when denormal values might
get truncated to zero. Iterate on VF to add V4HF and V8HF variants.
gcc/testsuite/ChangeLog:
* gcc.target/arm/fmaxmin.c: Move scan-assembler checks to ...
* gcc.target/arm/fmaxmin-2.c: ... here. New test.
Martin Uecker [Sun, 16 Mar 2025 09:54:17 +0000 (10:54 +0100)]
c: Fix tagname confusion for typedef redefinitions [PR118765]
When we redefine a typedef for a tagged type that has just been
redefined, merge_decls may produce invalid TYPE_DECLS that are not
consistent with what set_underlying_type produces. This is fixed
by updating DECL_ORIGINAL_TYPE.
PR c/118765
gcc/c/ChangeLog:
* c-decl.cc (merge_decls): For TYPE_DECLS copy
DECL_ORIGINAL_TYPE from the old declaration.
* c-typeck.cc (tagged_types_tu_compatible_p): Add
checking assertions.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118765-2.c: New test.
* gcc.dg/pr118765-3.c: New test.
* gcc.dg/typedef-redecl3.c: New test.
Sandra Loosemore [Thu, 27 Mar 2025 00:59:37 +0000 (00:59 +0000)]
OpenMP: Fix declaration in append-args-interop.c test case
I ran into this while backporting my declare variant/dispatch/interop
patch f016ee89955ab4da5fe7ef89368e9437bb5ffb13 to the og14 development
branch. In C dialects prior to C23 (the default on mainline),
functions declared "float f()" and "float g(void)" aren't considered
equivalent for the purpose of the C front end code that checks whether
a type of a variant matches the base function after accounting for the
added interop arguments. Using "(void)" instead of "()" works in all
C dialects as well as C++, so do that.
gcc/testsuite/ChangeLog
* c-c++-common/gomp/append-args-interop.c: Fix declaration of base
function to be correct for pre-C23 dialects.
Jørgen Kvalsvik [Wed, 26 Mar 2025 21:15:26 +0000 (22:15 +0100)]
Add coverage_instrumentation_p
Provide a helper for checking if any coverage (arc, conditions, paths)
is enabled, rather than manually checking all the flags. This should
make the intent clearer, and make it easier to maintain the checks when
more flags are added.
The function is forward declared in two header files as different passes
tend to include different headers (profile.h vs value-prof.h). This
could maybe be merged at some points, but profiling related symbols are
already a bit spread out and should probably be handled in a targeted
effort.
Jørgen Kvalsvik [Tue, 4 Jun 2024 12:13:22 +0000 (14:13 +0200)]
Add prime path coverage to gcc/gcov
This patch adds prime path coverage to gcc/gcov. First, a quick
introduction to path coverage, before I explain a bit on the pieces of
the patch.
PRIME PATHS
Path coverage is recording the paths taken through the program. Here is
a simple example:
if (cond1) BB 1
then1 () BB 2
else
else1 () BB 3
if (cond2) BB 4
then2 () BB 5
else
else2 () BB 6
_ BB 7
To cover all paths you must run {then1 then2}, {then1 else2}, {else1
then1}, {else1 else2}. This is in contrast with line/statement coverage
where it is sufficient to execute then2, and it does not matter if it
was reached through then1 or else1.
1 2 4 5 7
1 2 4 6 7
1 3 4 5 7
1 3 4 6 7
This gets more complicated with loops, because 0, 1, 2, ..., N
iterations are all different paths. There are different ways of
addressing this, a promising one being prime paths. A prime path is a
maximal simple path (a path with no repeated vertices) or simple cycle
(no repeated vertices except for the first/last) Prime paths strike a
decent balance between number of tests, path growth, and loop coverage,
requiring loops to be both taken and skipped. Of course, the number of
paths still grows very fast with program complexity - for example, this
program has 14 prime paths:
while (a)
{
if (b)
return;
while (c--)
a++;
}
--
ALGORITHM
Since the numbers of paths grows so fast, we need a good algorithm. The
naive approach of generating all paths and discarding redundancies (see
reference_prime_paths in the diff) simply doesn't complete for even
pretty simple functions with a few ten thousand paths (granted, the
implementation is also poor, but only serves as a reference). Fazli &
Afsharchi in their paper "Time and Space-Efficient Compositional Method
for Prime and Test Paths Generation" describe a neat algorithm which
drastically improves on for most programs, and brings complexity down to
something managable. This patch implements that algorithm with a few
minor tweaks.
The algorithm first finds the strongly connected components (SCC) of the
graph and creates a new graph where the vertices are the SCCs of the
CFG. Within these vertices different paths are found - regular prime
paths, paths that start in the SCCs entries, and paths that end in the
SCCs exits. These per-SCC paths are combined with paths through the CFG
which greatly reduces of paths needed to be evaluated just to be thrown
away.
Using this algorithm we can find the prime paths for somewhat
complicated functions in a reasonable time. Please note that some
programs don't benefit from this at all. We need to find the prime paths
within a SCC, so if a single SCC is very large the function degenerates
to the naive implementation. This can probably be much improved on, but
is an exercise for later.
--
OVERALL ARCHITECTURE
Like the other coverages in gcc, this operates on the CFG in the
profiling phase, just after branch and condition coverage, in phases:
1. All prime paths are generated, counted, and enumerated from the CFG
2. The paths are evaluted and counter instructions and accumulators are
emitted
3. gcov reads the CFG and computes the prime paths (same as step 1)
4. gcov prints a report
Simply writing out all the paths in the .gcno file is not really viable,
the files would be too big. Additionally, there are limits to the
practicality of measuring (and reporting) on millions of paths, so for
most programs where coverage is feasible, computing paths should be
plenty fast. As a result, path coverage really only adds 1 bit to the
counter, rounded up to nearest 64 ("bucket"), so 64 paths takes up 8
bytes, 65 paths take up 16 bytes.
Recording paths is really just massaging large bitsets. Per function,
ceil(paths/64 or 32) buckets (gcov_type) are allocated. Paths are
sorted, so the first path maps to the lowest bit, the second path to the
second lowest bit, and so on. On taking an edge and entering a basic
block, a few bitmasks are applied to unset the bits corresponding to the
paths outside the block and set the bits of the paths that start in that
block. Finally, the right buckets are masked and written to the global
accumulators for the paths that end in the block. Full coverage is
achieved when all bits are set.
gcc does not really inform gcov of abnormal paths, so paths with
abnormal edges are ignored. This probably possible, but requires some
changes to the graph gcc writes to the .gcno file.
--
IMPLEMENTATION
In order to remove non-prime paths (subpaths) we use a suffix tree.
Fazli & Afsharchi do not discuss how duplicates or subpaths are removed,
and using the suffix works really well -- insertion time is a function
of the length of the (longest) paths, not the number of paths. The paths
are usually quite short, but there are many of them. The same
prime_paths function is used both in gcc and in gcov.
As for speed, I would say that it is acceptable. Path coverage is a
problem that is exponential in its very nature, so if you enable this
feature you can reasonably expect it to take a while. To combat the
effects of path explosion there is a limit at which point gcc will give
up and refuse to instrument a function, set with -fpath-coverage-limit.
Since estimating the number of prime paths is pretty much is counting
them, gcc maintains a pessimistic running count which slightly
overestimates the number of paths found so far. When that count exceeds
the limit, the function aborts and gcc prints a warning. This is a
blunt instrument meant to not get stuck on the occasional large
function, not fine-grained control over when to skip instrumentation.
My main benchmark has been tree-2.1.3/tree.c which generates approx 2M
paths across the 20 functions or so in it. Most functions have less than
1500 paths, and 2 around a million each. Finding the paths takes
3.5-4s, but the instrumentation phase takes approx. 2.5 minutes and
generates a 32M binary. Not bad for a 1429 line source file.
There are some selftests which deconstruct the algorithm, so it can be
easily referenced with the Fazli & Afsharchi. I hope that including them
both help to catch regression, clarify the assumptions, and help
understanding the algorithm by breaking up the phases.
DEMO
This is the denser line-aware (grep-friendlier) output. Every missing
path is summarized as the lines you need to run in what order, annotated
with the true/false/throw decision.
$ gcc -fpath-coverage --coverage bs.c -c -o bs
$ gcov -e --prime-paths-lines bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0 not covered: lines 6 6(true) 11(true) 12
path 1 not covered: lines 6 6(true) 11(false) 13(true) 14
path 2 not covered: lines 6 6(true) 11(false) 13(false) 16
path 3 not covered: lines 6 6(false) 18
path 4 not covered: lines 11(true) 12 6(true) 11
path 5 not covered: lines 11(true) 12 6(false) 18
path 6 not covered: lines 11(false) 13(true) 14 6(true) 11
path 7 not covered: lines 11(false) 13(true) 14 6(false) 18
path 8 not covered: lines 12 6(true) 11(true) 12
path 9 not covered: lines 12 6(true) 11(false) 13(true) 14
path 10 not covered: lines 12 6(true) 11(false) 13(false) 16
path 11 not covered: lines 13(true) 14 6(true) 11(true) 12
path 12 not covered: lines 13(true) 14 6(true) 11(false) 13
path 13 not covered: lines 14 6(true) 11(false) 13(true) 14
path 14 not covered: lines 14 6(true) 11(false) 13(false) 16
path 15 not covered: lines 6(true) 11(true) 12 6
path 16 not covered: lines 6(true) 11(false) 13(true) 14 6
#####: 1:int binary_search(int a[], int len, int from, int to, int key)
-: 2:{
#####: 3: int low = from;
#####: 4: int high = to - 1;
-: 5:
#####: 6: while (low <= high)
-: 7: {
#####: 8: int mid = (low + high) >> 1;
#####: 9: long midVal = a[mid];
-: 10:
#####: 11: if (midVal < key)
#####: 12: low = mid + 1;
#####: 13: else if (midVal > key)
#####: 14: high = mid - 1;
-: 15: else
#####: 16: return mid; // key found
-: 17: }
#####: 18: return -1;
-: 19:}
Then there's the human-oriented source mode. Because it is so verbose I
have limited the demo to 2 paths. In this mode gcov will print the
sequence of *lines* through the program and in what order to cover the
path, including what basic block the line is a part of. Like its denser
sibling, this also prints the true/false/throw decision, if there is
one.
$ gcov -t --prime-paths-source bs.o
bs.gcda:cannot open data file, assuming not executed
-: 0:Source:bs.c
-: 0:Graph:bs.gcno
-: 0:Data:-
-: 0:Runs:0
paths covered 0 of 17
path 0:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (true) 11: if (midVal < key)
BB 4: 12: low = mid + 1;
path 1:
BB 2: 1:int binary_search(int a[], int len, int from, int to, int key)
BB 2: 3: int low = from;
BB 2: 4: int high = to - 1;
BB 2: 6: while (low <= high)
BB 8: (true) 6: while (low <= high)
BB 3: 8: int mid = (low + high) >> 1;
BB 3: 9: long midVal = a[mid];
BB 3: (false) 11: if (midVal < key)
BB 5: (true) 13: else if (midVal > key)
BB 6: 14: high = mid - 1;
The listing is also aware of inlining:
hello.c:
#include <stdio.h>
#include "hello.h"
int notmain(const char *entity)
{
return hello (entity);
}
#include <stdio.h>
inline __attribute__((always_inline))
int hello (const char *s)
{
if (s)
printf ("hello, %s!\n", s);
else
printf ("hello, world!\n");
return 0;
}
--prime-paths-{lines,source} take an optional argument type, which can
be 'covered', 'uncovered', or 'both', which defaults to 'uncovered'.
The flag controls if the covered or uncovered paths are printed, and
while uncovered is generally the most useful one, it is sometimes nice
to be able to see only the covered paths.
And finally, JSON (abbreviated). It is quite sparse and very nested, but
is mostly a JSON version of the source listing. It has to be this nested
in order to consistently capture multiple locations. It is always
includes the file name per location for consistency, even though this is
very much redundant in almost all cases. This format is in no way set in
stone, and without targeting it with other tooling I am not sure if it
does the job well.
* lib/gcov.exp: Add prime paths test function.
* g++.dg/gcov/gcov-22.C: New test.
* g++.dg/gcov/gcov-23-1.h: New test.
* g++.dg/gcov/gcov-23-2.h: New test.
* g++.dg/gcov/gcov-23.C: New test.
* gcc.misc-tests/gcov-29.c: New test.
* gcc.misc-tests/gcov-30.c: New test.
* gcc.misc-tests/gcov-31.c: New test.
* gcc.misc-tests/gcov-32.c: New test.
* gcc.misc-tests/gcov-33.c: New test.
* gcc.misc-tests/gcov-34.c: New test.
Jørgen Kvalsvik [Wed, 7 Aug 2024 15:33:31 +0000 (17:33 +0200)]
gcov: branch, conds, calls in function summaries
The gcov function summaries only output the covered lines, not the
branches and calls. Since the function summaries is an opt-in it
probably makes sense to also include branch coverage, calls, and
condition coverage.
Before:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 4
Function 'fn'
Lines executed:100.00% of 7
File 'hello.c'
Lines executed:100.00% of 11
Creating 'hello.c.gcov'
After:
$ gcov -f hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'
Lines executed:100.00% of 10
With conditions:
$ gcov -fg hello
Function 'main'
Lines executed:100.00% of 3
No branches
Calls executed:100.00% of 1
No conditions
Function 'fn'
Lines executed:100.00% of 7
Branches executed:100.00% of 4
Taken at least once:50.00% of 4
Condition outcomes covered:100.00% of 8
No calls
File 'hello.c'
Lines executed:100.00% of 10
Creating 'hello.c.gcov'