Jan Hubicka [Wed, 15 May 2024 12:14:27 +0000 (14:14 +0200)]
Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA
While building more testcases for ipa-icf I noticed that there are two places
in aliasing code where we still compare TYPE_MAIN_VARIANT for pointer equality.
This is not a good idea for LTO since type merging may not happen, for example
when in one unit the pointed-to type is forward declared while in the other it
is fully defined. We have same_type_for_tbaa for that.
Bootstrapped/regtested x86_64-linux, OK?
gcc/ChangeLog:
* alias.cc (reference_alias_ptr_type_1): Use view_converted_memref_p.
* alias.h (view_converted_memref_p): Declare.
* tree-ssa-alias.cc (view_converted_memref_p): Export.
(ao_compare::compare_ao_refs): Use same_type_for_tbaa.
As it turns out, this only happens when the Solaris linker is used; with
GNU ld the test PASSes just fine. In fact, that happens because gld
supports the lto-plugin while ld does not: in a Solaris build with gld,
the test FAILs the same way as with ld when -fno-use-linker-plugin is
passed, so this patch requires linker_plugin.
Tested on i386-pc-solaris2.11 (ld and gld) and x86_64-pc-linux-gnu.
Jonathan Wakely [Fri, 3 May 2024 19:00:08 +0000 (20:00 +0100)]
libstdc++: Fix data race in std::basic_ios::fill() [PR77704]
The lazy caching in std::basic_ios::fill() updates a mutable member
without synchronization, which can cause a data race if two threads both
call fill() on the same stream object when _M_fill_init is false.
To avoid this we can just cache the _M_fill member and set _M_fill_init
early in std::basic_ios::init, instead of doing it lazily. As explained
by the comment in init, there's a good reason for doing it lazily. When
char_type is neither char nor wchar_t, the locale might not have a
std::ctype<char_type> facet, so getting the fill character would throw
an exception. The current lazy init allows using unformatted I/O with
such a stream, because the fill character is never needed and so it
doesn't matter if the locale doesn't have a ctype<char_type> facet. We
can maintain this property by only setting the fill character in
std::basic_ios::init if the ctype facet is present at that time. If
fill() is called later and the fill character wasn't set by init, we can
get it from the stream's current locale at the point when fill() is
called (and not try to cache it without synchronization). If the stream
hasn't been imbued with a locale that includes the facet when we need
the fill() character, then throw bad_cast at that point.
This causes a change in behaviour for the following program:
std::ostringstream out;
out.imbue(loc);
auto fill = out.fill();
Previously the fill character would have been set when fill() is called,
and so would have used the new locale. This commit changes it so that
the fill character is set on construction and isn't affected by the new
locale being imbued later. This new behaviour seems to be what the
standard requires, and matches MSVC.
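The scheme described above can be modelled with a small stand-alone sketch. This is not the libstdc++ code: FakeLocale, FakeStream and their members are hypothetical names, and a locale "without a ctype facet" is modelled by a plain flag. It only illustrates the idea of caching eagerly when possible and otherwise computing the value on demand without writing to shared state from a const member.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical stand-in for a locale. has_facet == false models a locale
// that lacks a ctype facet for the stream's character type.
struct FakeLocale {
    bool has_facet;
    char space;
};

// Hypothetical stand-in for basic_ios, reduced to the fill handling.
class FakeStream {
public:
    explicit FakeStream(FakeLocale loc) : loc_(loc) {
        // Eager caching at init time, but only if the facet is present.
        if (loc_.has_facet) {
            fill_ = loc_.space;
            fill_init_ = true;
        }
    }

    void imbue(FakeLocale loc) { loc_ = loc; }

    // The getter performs no writes, so concurrent calls do not race.
    char fill() const {
        if (fill_init_)
            return fill_;
        if (!loc_.has_facet)
            throw std::bad_cast();
        return loc_.space;  // computed on demand, deliberately not cached
    }

    // The setter is a non-const operation, so it may update the cache.
    void fill(char c) {
        fill_ = c;
        fill_init_ = true;
    }

private:
    FakeLocale loc_;
    char fill_ = '\0';
    bool fill_init_ = false;
};
```

In this model a stream built with a usable locale keeps its original fill character after a later imbue(), while a stream built without the facet picks the character up from whatever locale is current when fill() is first needed.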
The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
possible to use a std::basic_ios without the ctype<char_type> facet
being present at construction.
libstdc++-v3/ChangeLog:
PR libstdc++/77704
* include/bits/basic_ios.h (basic_ios::fill()): Do not modify
_M_fill and _M_fill_init in a const member function.
(basic_ios::fill(char_type)): Use _M_fill directly instead of
calling fill(). Set _M_fill_init to true.
* include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
_M_fill_init here instead.
* testsuite/27_io/basic_ios/fill/char/1.cc: New test.
* testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
Rainer Orth [Wed, 15 May 2024 11:13:48 +0000 (13:13 +0200)]
testsuite: i386: Fix g++.target/i386/pr97054.C on Solaris
g++.target/i386/pr97054.C currently FAILs on 64-bit Solaris/x86:
FAIL: g++.target/i386/pr97054.C -std=gnu++14 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C -std=gnu++14 compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C -std=gnu++17 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C -std=gnu++17 compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C -std=gnu++2a (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C -std=gnu++2a compilation failed to produce executable
FAIL: g++.target/i386/pr97054.C -std=gnu++98 (test for excess errors)
UNRESOLVED: g++.target/i386/pr97054.C -std=gnu++98 compilation failed to produce executable
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/g++.target/i386/pr97054.C:49:20: error: frame pointer required, but reserved
Since Solaris/x86 defaults to -fno-omit-frame-pointer, this patch
explicitly builds with -fomit-frame-pointer as is the default on other
x86 targets.
Tested on i386-pc-solaris2.11 (32 and 64-bit) and x86_64-pc-linux-gnu.
RISC-V: Allow by-pieces to do overlapping accesses in block_move_straight
The current implementation of riscv_block_move_straight() emits a couple
of loads/stores with maximum width (e.g. 8-byte for RV64).
The remainder is handed over to move_by_pieces().
The by-pieces framework utilizes target hooks to decide about the emitted
instructions (e.g. unaligned accesses or overlapping accesses).
Since the current implementation will always request less than XLEN bytes
to be handled by the by-pieces infrastructure, it is impossible that
overlapping memory accesses can ever be emitted (the by-pieces code does
not know of any previous instructions that were emitted by the backend).
This patch changes the implementation of riscv_block_move_straight()
such that it utilizes the by-pieces framework if the remaining data
is less than 2*XLEN bytes, which is sufficient to enable overlapping
memory accesses (if the requirements for them are given).
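To illustrate what an overlapping access buys, here is a minimal sketch in portable C++ (not the backend code). It copies between one and two words' worth of bytes with exactly two word-sized loads and two word-sized stores, letting the second access overlap the first. The name copy_two_words and the constant kWord (standing in for XLEN/8 on RV64) are assumptions made for this example, and the source and destination buffers are assumed not to overlap each other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kWord = 8;  // models XLEN/8 on RV64 (assumption)

// Copy n bytes, kWord <= n <= 2 * kWord, using two word-sized accesses.
// The second access starts at n - kWord, so it overlaps the first one
// whenever n is not exactly 2 * kWord.
inline void copy_two_words(unsigned char* dst, const unsigned char* src,
                           std::size_t n) {
    assert(n >= kWord && n <= 2 * kWord);
    std::uint64_t lo, hi;
    std::memcpy(&lo, src, kWord);              // first load
    std::memcpy(&hi, src + n - kWord, kWord);  // second, overlapping load
    std::memcpy(dst, &lo, kWord);              // first store
    std::memcpy(dst + n - kWord, &hi, kWord);  // second, overlapping store
}
```

A 15-byte copy therefore needs two loads and two stores instead of the 8 + 4 + 2 + 1 byte sequence that non-overlapping accesses would require.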
The changes in the expansion can be seen in the adjustments of the
cpymem-NN-ooo test cases. The changes in the cpymem-NN tests are
caused by the different instruction ordering of the code emitted
by the by-pieces infrastructure, which emits alternating load/store
sequences.
gcc/ChangeLog:
* config/riscv/riscv-string.cc (riscv_block_move_straight):
Hand over up to 2xXLEN bytes to move_by_pieces().
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cpymem-32-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-32.c: Adjustments for code emitted by
by-pieces.
* gcc.target/riscv/cpymem-64-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-64.c: Adjustments for code emitted by
by-pieces.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
A recent patch added the field overlap_op_by_pieces to the struct
riscv_tune_param, which is used by the TARGET_OVERLAP_OP_BY_PIECES_P()
hook. This hook is used by the by-pieces infrastructure to decide
if overlapping memory accesses should be emitted.
The changes in the expansion can be seen in the adjustments of the
cpymem test cases. These tests also reveal a limitation in the
RISC-V cpymem expansion that prevents this optimization as only
by-pieces cpymem expansions emit overlapping memory accesses.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cpymem-32-ooo.c: Adjust for overlapping
access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
RISC-V: Allow unaligned accesses in cpymemsi expansion
The RISC-V cpymemsi expansion is called whenever the by-pieces
infrastructure will not take care of the builtin expansion.
The by-pieces infrastructure may emit code that includes
unaligned accesses if riscv_slow_unaligned_access_p
is false.
The RISC-V cpymemsi expansion is handled via riscv_expand_block_move().
The current implementation of this function does not check
riscv_slow_unaligned_access_p and never emits unaligned accesses.
Since by-pieces emits unaligned accesses, it is reasonable to implement
the same behaviour in the cpymemsi expansion. And that's what this patch
is doing.
The patch checks riscv_slow_unaligned_access_p at the entry and sets
the allowed alignment accordingly. This alignment is then propagated
down to the routines that emit the actual instructions.
The changes introduced by this patch can be seen in the adjustments
of the cpymem tests.
gcc/ChangeLog:
* config/riscv/riscv-string.cc (riscv_block_move_straight): Add
parameter align.
(riscv_adjust_block_mem): Replace parameter length by align.
(riscv_block_move_loop): Add parameter align.
(riscv_expand_block_move_scalar): Set alignment properly if the
target has fast unaligned access.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cpymem-32-ooo.c: Adjust for unaligned access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
We have two mechanisms in the RISC-V backend that expand
cpymem pattern: a) by-pieces, b) riscv_expand_block_move()
in riscv-string.cc. The by-pieces framework has higher priority
and emits a sequence of up to 15 instructions
(see use_by_pieces_infrastructure_p() for more details).
As a rule of thumb, by-pieces emits alternating load/store sequences
and the cpymem expansion in the backend emits a sequence of loads
followed by a sequence of stores.
Let's add some test cases to document the current behaviour
and to have tests to identify regressions.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cpymem-32-ooo.c: New test.
* gcc.target/riscv/cpymem-32.c: New test.
* gcc.target/riscv/cpymem-64-ooo.c: New test.
* gcc.target/riscv/cpymem-64.c: New test.
Aldy Hernandez [Tue, 14 May 2024 14:21:50 +0000 (16:21 +0200)]
[prange] Default pointers_handled_p() to false.
The pointers_handled_p() method is an internal range-op helper to help
catch dispatch type mismatches for pointer operands. This is what
caught the IPA mismatch in PR114985.
This method is only a temporary measure to catch any incompatibilities
in the current pointer range-op entries. This patch returns true for
any *new* entries in the range-op table, as the current ones are
already fleshed out. This keeps us from having to implement this
boilerplate function for any new range-op entries.
PR tree-optimization/114995
* range-op-ptr.cc (range_operator::pointers_handled_p): Default to true.
Jonathan Wakely [Thu, 11 Apr 2024 18:12:48 +0000 (19:12 +0100)]
libstdc++: Give std::memory_order a fixed underlying type [PR89624]
Prior to C++20 this enum type doesn't have a fixed underlying type,
which means it can be modified by -fshort-enums, which then means the
HLE bits are outside the range of valid values for the type.
As it has a fixed type of int in C++20 and later, do the same for
earlier standards too. This is technically a change for C++17 down,
because the implicit underlying type (without -fshort-enums) was
unsigned before. I doubt it matters in practice. That incompatibility
already exists between C++17 and C++20 and nobody has noticed or
complained. Now at least the underlying type will be int for all -std
modes.
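A minimal sketch of why a fixed underlying type matters here. The enum below is hypothetical and only mirrors the shape of the situation: a handful of small named enumerators plus an extra modifier bit far above them. The names order, hle_bit and the mask value are assumptions for illustration, not the library's definitions.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical enum with a fixed underlying type of int.
enum order : int {
    mo_relaxed, mo_consume, mo_acquire, mo_release, mo_acq_rel, mo_seq_cst
};

// Hypothetical modifier bit that lies well above the named enumerators.
constexpr int hle_bit = 0x10000;
constexpr int order_mask = 0xffff;

// With a fixed underlying type, every int value is a valid value of the
// enum, regardless of options such as -fshort-enums.
static_assert(std::is_same<std::underlying_type<order>::type, int>::value,
              "underlying type is fixed to int");

inline order with_hle(order o) {
    return static_cast<order>(static_cast<int>(o) | hle_bit);
}

inline order base_order(order o) {
    return static_cast<order>(static_cast<int>(o) & order_mask);
}
```

Without the ": int", the range of valid values would be derived from the enumerators alone, so a value carrying the extra bit could fall outside it.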
libstdc++-v3/ChangeLog:
PR libstdc++/89624
* include/bits/atomic_base.h (memory_order): Use int as
underlying type.
* testsuite/29_atomics/atomic/89624.cc: New test.
Andrew Pinski [Tue, 14 May 2024 13:29:18 +0000 (06:29 -0700)]
tree-cfg: Move the returns_twice check to be last statement only [PR114301]
While checking that all of the bugs dealing with the case where
gimple_can_duplicate_bb_p would return false were fixed, I noticed that
the code which checks whether a call statement is returns_twice was
checking all call statements rather than just the last statement. Since
calling gimple_call_flags has a small non-zero overhead due to a few
string comparisons, removing the uses of it can give a small performance
improvement. A call to a returns_twice function will always end the
basic block due to the check in stmt_can_terminate_bb_p (and others), so
checking only the last statement is a small optimization and is safe.
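The argument can be sketched with a toy model. Stmt and can_duplicate_block are hypothetical names, and the real gimple_can_duplicate_bb_p checks more than this; the sketch only shows that, under the stated invariant, inspecting the last statement is equivalent to scanning all of them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified statement record.
struct Stmt {
    bool is_call;
    bool returns_twice;
};

// Assumed invariant: a returns_twice call always terminates its basic
// block, so it can only ever appear as the last statement.
inline bool can_duplicate_block(const std::vector<Stmt>& block) {
    if (block.empty())
        return true;
    const Stmt& last = block.back();
    return !(last.is_call && last.returns_twice);
}
```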
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/114301
gcc/ChangeLog:
* tree-cfg.cc (gimple_can_duplicate_bb_p): Check returns_twice
only on the last call statement rather than all.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Jeff Law [Wed, 15 May 2024 04:50:15 +0000 (22:50 -0600)]
[committed] Fix rv32 issues with recent zicboz work
I should have double-checked the CI system before pushing Christoph's patches
for memset-zero. While I thought I'd checked CI state, I must have been
looking at the wrong patch from Christoph.
Anyway, this fixes the rv32 ICEs and disables one of the tests for rv32.
The test would need a revamp for rv32 as the expected output is all rv64 code
using "sd" instructions. I'm just not vested deeply enough into rv32 to adjust
the test to work in that environment though it should be fairly trivial to copy
the test and provide new expected output if someone cares enough.
Verified this fixes the rv32 failures in my tester:
> New tests that FAIL (6 tests):
>
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O1 (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O1 (test for excess errors)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O2 (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O2 (test for excess errors)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O3 -g (internal compiler error: in extract_insn, at recog.cc:2812)
> unix/-march=rv32gcv: gcc: gcc.target/riscv/cmo-zicboz-zic64-1.c -O3 -g (test for excess errors)
And after the ICE is fixed, these are eliminated by only running the test for
rv64:
Levy Hsu [Thu, 9 May 2024 08:50:56 +0000 (16:50 +0800)]
x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563]
Hi All
We've introduced a new subroutine in ix86_expand_vec_perm_const_1
to optimize vector shifting for the V16QI type on x86.
This patch uses a three-instruction sequence psrlw, psllw, and por
to handle specific vector shuffle operations more efficiently.
The change aims to improve assembly code generation for configurations
supporting SSE2.
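The message does not spell out which shuffle is matched. Assuming it is the permutation that swaps the two bytes of every 16-bit lane, the three-instruction sequence can be modelled in scalar C++ as below. Bytes16 and swap_adjacent_bytes are hypothetical names, and each loop iteration stands for one lane of the vector operation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of a shift/shift/or byte swap on eight 16-bit lanes.
inline Bytes16 swap_adjacent_bytes(const Bytes16& in) {
    std::array<std::uint16_t, 8> lanes;
    std::memcpy(lanes.data(), in.data(), 16);
    for (auto& w : lanes) {
        std::uint16_t hi = static_cast<std::uint16_t>(w >> 8);  // psrlw by 8
        std::uint16_t lo = static_cast<std::uint16_t>(w << 8);  // psllw by 8
        w = static_cast<std::uint16_t>(hi | lo);                // por
    }
    Bytes16 out;
    std::memcpy(out.data(), lanes.data(), 16);
    return out;
}
```

Swapping the halves of each lane swaps the two bytes in memory on either byte order, so the model does not depend on endianness.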
Bootstrapped and tested on x86_64-linux-gnu, OK for trunk?
Best
Levy
gcc/ChangeLog:
PR target/107563
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por): New
subroutine.
(ix86_expand_vec_perm_const_1): Call expand_vec_perm_psrlw_psllw_por.
gcc/testsuite/ChangeLog:
PR target/107563
* g++.target/i386/pr107563-a.C: New test.
* g++.target/i386/pr107563-b.C: New test.
Patrick Palka [Wed, 15 May 2024 02:55:16 +0000 (22:55 -0400)]
c++: lvalueness of non-dependent assignment expr [PR114994]
r14-4111-g6e92a6a2a72d3b made us check non-dependent simple assignment
expressions ahead of time and give them a type, as was already done for
compound assignments. Unlike for compound assignments however, if a
simple assignment resolves to an operator overload we represent it as a
(typed) MODOP_EXPR instead of a CALL_EXPR to the selected overload.
(I reckoned this was at worst a pessimization -- we'll just have to repeat
overload resolution at instantiation time.)
But this turns out to break the below testcase ultimately because
MODOP_EXPR (of non-reference type) is always treated as an lvalue
according to lvalue_kind, which is incorrect for the MODOP_EXPR
representing x=42.
We can fix this by representing such class assignment expressions as
CALL_EXPRs as well, but this turns out to require some tweaking of our
-Wparentheses warning logic and may introduce other fallout making it
unsuitable for backporting.
So this patch instead fixes lvalue_kind to consider the type of a
MODOP_EXPR representing a class assignment.
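The language rule at stake can be shown with a short sketch; it is not the testcase from the PR, and ByValue, ByRef and assignment_is_lvalue are names invented for the example. An assignment that resolves to an overloaded operator= has the value category of that call, so it is an lvalue only if the operator returns an lvalue reference.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// operator= returning by value: "x = 42" is a prvalue.
struct ByValue {
    ByValue operator=(int) { return *this; }
};

// operator= returning a reference: "x = 42" is an lvalue.
struct ByRef {
    ByRef& operator=(int) { return *this; }
};

// decltype((e)) is T& when e is an lvalue and T when e is a prvalue.
template <class T>
constexpr bool assignment_is_lvalue() {
    return std::is_lvalue_reference<
        decltype((std::declval<T&>() = 42))>::value;
}

static_assert(!assignment_is_lvalue<ByValue>(), "class assignment, prvalue");
static_assert(assignment_is_lvalue<ByRef>(), "class assignment, lvalue");
static_assert(assignment_is_lvalue<int>(), "built-in assignment, lvalue");
```

The PR concerns the same kind of expression when it appears inside a template without depending on a template parameter; the sketch uses a dependent expression only to keep the check generic.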
PR c++/114994
gcc/cp/ChangeLog:
* tree.cc (lvalue_kind) <case MODOP_EXPR>: For a class
assignment, consider the result type.
Jeff Law [Wed, 15 May 2024 00:17:59 +0000 (18:17 -0600)]
[to-be-committed,RISC-V] Remove redundant AND in shift-add sequence
So this patch allows us to eliminate a redundant AND in some shift-add
style sequences. I think the testcase was reduced from xz by the RAU
team, but I'm not highly confident of that.
Specifically the AND is masking off the upper 32 bits of the un-shifted
value and there's an outer SIGN_EXTEND from SI to DI. However in the
RTL it's working on the post-shifted value, so the constant is left
shifted, so we have to account for that in the pattern's condition.
We can just drop the AND in this case. So instead we do a 64bit shift,
then a sign extending ADD utilizing the low part of that 64bit shift result.
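The redundancy can be checked with plain integer arithmetic. This is a sketch of the reasoning, not of the RTL pattern: the function names are invented, and the shift amount k is assumed to be small (1 to 3, as in shift-add instructions). Only the low 32 bits of the shifted value feed the sign-extending add, and the shifted mask leaves exactly those bits untouched.

```cpp
#include <cassert>
#include <cstdint>

// Shift, mask with the left-shifted 32-bit constant, then add in 32 bits
// and sign-extend the result to 64 bits.
inline std::int64_t shift_add_with_and(std::uint64_t x, std::uint64_t y,
                                       unsigned k) {
    std::uint64_t shifted = (x << k) & (0xffffffffULL << k);
    std::uint32_t sum = static_cast<std::uint32_t>(shifted) +
                        static_cast<std::uint32_t>(y);
    return static_cast<std::int32_t>(sum);
}

// Same computation with the AND dropped.
inline std::int64_t shift_add_without_and(std::uint64_t x, std::uint64_t y,
                                          unsigned k) {
    std::uint64_t shifted = x << k;
    std::uint32_t sum = static_cast<std::uint32_t>(shifted) +
                        static_cast<std::uint32_t>(y);
    return static_cast<std::int32_t>(sum);
}
```

Bits 0 to k-1 of the shifted value are zero either way, and bits k to 31 are covered by the mask, so both functions agree for every input.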
This has run through Ventana's CI as well as my own. I'll wait for it
to run through the larger CI system before pushing.
Jeff
gcc/
* config/riscv/riscv.md: Add pattern for sign extended shift-add
sequence with a masked input.
Simon Martin [Mon, 6 May 2024 13:20:10 +0000 (15:20 +0200)]
c++: ICE in build_deduction_guide for invalid template [PR105760]
We currently ICE upon the following invalid snippet because we fail to
properly handle tsubst_arg_types returning error_mark_node in
build_deduction_guide.
== cut ==
template<class... Ts, class>
struct A { A(Ts...); };
A a;
== cut ==
This patch fixes this, and has been successfully tested on x86_64-pc-linux-gnu.
PR c++/105760
gcc/cp/ChangeLog:
* pt.cc (build_deduction_guide): Check for error_mark_node
result from tsubst_arg_types.
Dimitar Dimitrov [Mon, 13 May 2024 16:24:14 +0000 (19:24 +0300)]
pru: Implement TARGET_CLASS_LIKELY_SPILLED_P to fix PR115013
Commit r15-436-g44e7855e did not fix PR115013 for PRU because
SMALL_REGISTER_CLASS_P is not returning an accurate value for the PRU
backend.
Word mode for PRU backend is defined as 8-bit, yet all ALU operations
are preferred in 32-bit mode. Thus checking whether a register class
contains a single word_mode register would not classify the actually
single SImode register classes as small. This affected the
multiplication source and destination register classes.
Fix by implementing TARGET_CLASS_LIKELY_SPILLED_P to treat all register
classes with SImode or smaller size as likely spilled. This in turn
corrects the behaviour of SMALL_REGISTER_CLASS_P for PRU.
PR rtl-optimization/115013
gcc/ChangeLog:
* config/pru/pru.cc (pru_class_likely_spilled_p): Implement
to mark classes containing one SImode register as likely
spilled.
(TARGET_CLASS_LIKELY_SPILLED_P): Define.
Vineet Gupta [Mon, 13 May 2024 18:45:55 +0000 (11:45 -0700)]
RISC-V: avoid LUI based const materialization ... [part of PR/106265]
... if the constant can be represented as sum of two S12 values.
The two S12 values could instead be fused with subsequent ADD insn.
This helps
- avoid an additional LUI insn
- side benefits of not clobbering a reg
e.g.
                          w/o patch              w/ patch
long                    |                      |
plus(unsigned long i)   | li   a5,4096         |
{                       | addi a5,a5,-2032     | addi a0, a0, 2047
    return i + 2064;    | add  a0,a0,a5        | addi a0, a0, 17
}                       | ret                  | ret
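The split of 2064 into 2047 + 17 shown above can be sketched as follows. The helper names are hypothetical and the actual macros and predicate in the patch may be written differently; the sketch assumes the first addend is always the S12 extreme on the same side as the constant.

```cpp
#include <cassert>

constexpr long kS12Min = -2048;  // smallest signed 12-bit immediate
constexpr long kS12Max = 2047;   // largest signed 12-bit immediate

constexpr bool is_s12(long v) { return v >= kS12Min && v <= kS12Max; }

// True for constants that do not fit one S12 immediate but can be written
// as the sum of two: [-4096, -2049] and [2048, 4094].
constexpr bool is_sum_of_two_s12(long v) {
    return !is_s12(v) && v >= 2 * kS12Min && v <= 2 * kS12Max;
}

struct S12Pair {
    long first;
    long second;
};

// Assumed splitting strategy: saturate the first addend, put the rest in
// the second. Only meaningful when is_sum_of_two_s12(v) holds.
constexpr S12Pair split_s12(long v) {
    return v > 0 ? S12Pair{kS12Max, v - kS12Max}
                 : S12Pair{kS12Min, v - kS12Min};
}
```

For every constant accepted by is_sum_of_two_s12, both halves returned by split_s12 are themselves valid S12 immediates, which is what allows two ADDI instructions to replace the LUI-based sequence.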
NOTE: In theory not having const in a standalone reg might seem less
CSE friendly, but for workloads in consideration these materializations
are from very late LRA reloads and follow-on GCSE is not doing much
currently.
The real benefit however is seen in base+offset computation for array
accesses and especially for stack accesses which are finalized late in
optim pipeline, during LRA register allocation. Often the finalized
offsets trigger LRA reloads resulting in mind boggling repetition of
exact same insn sequence including LUI based constant materialization.
This shaves off 290 billion dynamic instructions (QEMU icounts) in
SPEC 2017 Cactu benchmark which is over 10% of workload. In the rest of
suite, there are an additional 10 billion shaved, with both gains and losses
in individual workloads as is usual with compiler changes.
This should still be considered damage control as the real/deeper fix
would be to reduce number of LRA reloads or CSE/anchor those during
LRA constraint sub-pass (re)runs (that's a different PR/114729).
Implementation Details (for posterity)
--------------------------------------
- basic idea is to have a splitter selected via a new predicate for constant
being possible sum of two S12 and provide the transform.
This is however a 2 -> 2 transform which combine can't handle.
So we specify it using a define_insn_and_split.
- the initial loose "i" constraint caused LRA to accept invalid insns thus
needing a tighter new constraint as well.
- An additional fallback alternate with catch-all "r" register
constraint also needed to allow any "reloads" that LRA might
require for ADDI with const larger than S12.
Testing
--------
This is testsuite clean (rv64 only).
I'll rely on post-commit CI multilib run for any possible fallout for
other setups such as rv32.
I also threw this into a buildroot run, it obviously boots Linux to
userspace. bloat-o-meter on glibc and kernel show overall decrease in
static instruction counts with some minor spot increases.
These are generally in the category of
- LUI + ADDI are 2 bytes each vs. two ADD being 4 bytes each.
- Knock on effects due to inlining changes.
- Sometimes the slightly shorter 2-insn seq in a multi-exit function
can cause in-place epilogue duplication (vs. a jump back).
This is slightly larger but more efficient in execution.
In summary nothing to fret about.
Caveat:
------
Jeff noted during v2 review that the operand0 constraint !riscv_reg_frame_related
could potentially cause issues with hard reg cprop in future. If that
trips things up we will have to loosen the constraint while dialing down
the const range to (-2048 to 2032) as opposed to full S12 range of
(-2048 to 2047) to keep stack regs aligned.
gcc/ChangeLog:
* config/riscv/riscv.h: New macros to check for sum of two S12
range.
* config/riscv/constraints.md: New constraint.
* config/riscv/predicates.md: New Predicate.
* config/riscv/riscv.md: New splitter.
* config/riscv/riscv.cc (riscv_reg_frame_related): New helper.
* config/riscv/riscv-protos.h: New helper prototype.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sum-of-two-s12-const-1.c: New test: checks
for new patterns output.
* gcc.target/riscv/sum-of-two-s12-const-2.c: Ditto.
* gcc.target/riscv/sum-of-two-s12-const-3.c: New test: should not
ICE.
Richard Biener [Tue, 14 May 2024 09:13:51 +0000 (11:13 +0200)]
tree-optimization/99954 - redo loop distribution memcpy recognition fix
The following revisits the fix for PR99954 which was observed as
causing missed memcpy recognition and instead using memmove for
non-aliasing copies. While the original fix mitigated bogus
recognition of memcpy the root cause was not properly identified.
The root cause is dr_analyze_indices "failing" to handle union
references and leaving the DRs indices in a state that's not correctly
handled by dr_may_alias. The following mitigates this there
appropriately, restoring memcpy recognition for non-aliasing copies.
This makes us run into a latent issue in ptr_deref_may_alias_decl_p
when the pointer is something like &MEM[0].a in which case we fail
to handle non-SSA name pointers. Add code similar to what we have
in ptr_derefs_may_alias_p.
PR tree-optimization/99954
* tree-data-ref.cc (dr_may_alias_p): For bases that are
not completely analyzed fall back to TBAA and points-to.
* tree-loop-distribution.cc
(loop_distribution::classify_builtin_ldst): When there
is no dependence again classify as memcpy.
* tree-ssa-alias.cc (ptr_deref_may_alias_decl_p): Verify
the pointer is an SSA name.
[PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero
The Zicboz extension offers the cbo.zero instruction, which can be used
to clear a memory region corresponding to a cache block.
The Zic64b extension defines the cache block size to be 64 bytes.
If both extensions are available, it is possible to use cbo.zero
to clear memory, if the alignment and size constraints are met.
This patch implements this.
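The message does not list the exact alignment and size constraints. As a sketch under the assumption that the destination must be aligned to a full cache block and the length must be a non-zero multiple of the block size, the eligibility check could look like this. cbo_zero_count and kCacheBlockBytes are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCacheBlockBytes = 64;  // block size fixed by Zic64b

// Returns how many cbo.zero instructions would clear the region, or 0 if
// the region does not qualify under the assumed constraints.
inline std::size_t cbo_zero_count(std::size_t align_bytes,
                                  std::size_t length_bytes) {
    if (align_bytes < kCacheBlockBytes)
        return 0;  // destination not known to be block aligned
    if (length_bytes == 0 || length_bytes % kCacheBlockBytes != 0)
        return 0;  // would leave a partial block to clear
    return length_bytes / kCacheBlockBytes;
}
```

A real expansion could also handle a trailing partial block by other means; the sketch simply declines such cases.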
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_expand_block_clear): New prototype.
* config/riscv/riscv-string.cc (riscv_expand_block_clear_zicboz_zic64b):
New function to expand a block-clear with cbo.zero.
(riscv_expand_block_clear): New RISC-V block-clear expansion function.
* config/riscv/riscv.md (setmem<mode>): New setmem expansion.
Rainer Orth [Tue, 14 May 2024 14:23:14 +0000 (16:23 +0200)]
testsuite: analyzer: Fix fd-glibc-byte-stream-connection-server.c on Solaris [PR107750]
gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c currently FAILs
on Solaris:
FAIL: gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c (test for
excess errors)
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:91:3:
error: implicit declaration of function 'memset'
[-Wimplicit-function-declaration]
Solaris <sys/select.h> has
but no declaration of memset. While one can argue that this should be
fixed, it's easy enough to just include <string.h> instead, which is
what this patch does.
Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.
Tom de Vries [Mon, 13 May 2024 16:10:15 +0000 (18:10 +0200)]
[debug] Fix dwarf v4 .debug_macro.dwo
Consider a hello world, compiled with -gsplit-dwarf and dwarf version 4, and
-g3:
...
$ gcc -gdwarf-4 -gsplit-dwarf /data/vries/hello.c -g3 -save-temps -dA
...
In section .debug_macro.dwo, we have:
...
.Ldebug_macro0:
.value 0x4 # DWARF macro version number
.byte 0x2 # Flags: 32-bit, lineptr present
.long .Lskeleton_debug_line0
.byte 0x3 # Start new file
.uleb128 0 # Included from line number 0
.uleb128 0x1 # file /data/vries/hello.c
.byte 0x5 # Define macro strp
.uleb128 0 # At line number 0
.uleb128 0x1d0 # The macro: "__STDC__ 1"
...
Given that we use a DW_MACRO_define_strp, we'd expect 0x1d0 to be an
offset into a .debug_str.dwo section.
But in fact, 0x1d0 is an index into the string offset table in
section .debug_str_offsets.dwo:
...
.long 0x34f0 # indexed string 0x1d0: __STDC__ 1
...
Add asserts that catch this inconsistency, and fix this by using
DW_MACRO_define_strx instead.
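The difference between the two operand kinds can be modelled with a toy string section. This is an illustration only: the function names are invented, the section contents are made up, and real consumers also have to deal with offset sizes and section headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// strp-style operand: a byte offset directly into the string section.
inline std::string read_by_offset(const std::string& str_section,
                                  std::size_t offset) {
    assert(offset < str_section.size());
    // Reads up to the next NUL terminator inside the section.
    return std::string(str_section.c_str() + offset);
}

// strx-style operand: an index into the string offsets table, whose entry
// is in turn a byte offset into the string section.
inline std::string read_by_index(const std::vector<std::uint32_t>& str_offsets,
                                 const std::string& str_section,
                                 std::size_t index) {
    return read_by_offset(str_section, str_offsets.at(index));
}
```

Interpreting an index as if it were an offset (or the reverse) yields the wrong string or runs off the section, which is the inconsistency the added asserts are meant to catch.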
Tested on x86_64.
gcc/ChangeLog:
2024-05-14 Tom de Vries <tdevries@suse.de>
PR debug/115066
* dwarf2out.cc (output_macinfo_op): Fix DW_MACRO_define_strx/strp
choice for v4 .debug_macro.dwo. Add asserts to check that choice.
Jan Hubicka [Tue, 14 May 2024 10:58:56 +0000 (12:58 +0200)]
Reduce recursive inlining of always_inline functions
This patch tames down the inliner on (multiply) self-recursive always_inline functions.
While we already have caps on recursive inlining, the testcase combines the early inliner
and the late inliner to get a very wide recursive inlining tree. The basic idea is to
ignore DISREGARD_INLINE_LIMITS when deciding on inlining self-recursive functions
(so we cut on the function being large) and clear the flag once it is detected.
I did not include the testcase since it still produces a lot of code and would
slow down testing. It also outputs many "inlining failed" messages, which is not
very nice, but it is hard to detect self-recursion cycles in full generality
when indirect calls and other tricks may happen.
gcc/ChangeLog:
PR ipa/113291
* ipa-inline.cc (enum can_inline_edge_by_limits_flags): New enum.
(can_inline_edge_by_limits_p): Take flags instead of multiple bools; add flag
for forcing inline limits.
(can_early_inline_edge_p): Update.
(want_inline_self_recursive_call_p): Update; use FORCE_LIMITS mode.
(check_callers): Update.
(update_caller_keys): Update.
(update_callee_keys): Update.
(recursive_inlining): Update.
(add_new_edges_to_heap): Update.
(speculation_useful_p): Update.
(inline_small_functions): Clear DECL_DISREGARD_INLINE_LIMITS on self recursion.
(flatten_function): Update.
(inline_to_all_callers_1): Update.
Haochen Gui [Tue, 14 May 2024 08:37:06 +0000 (16:37 +0800)]
rs6000: Enable overlapped by-pieces operations
This patch enables overlapped by-piece operations by defining
TARGET_OVERLAP_OP_BY_PIECES_P to true. On rs6000, default move/set/clear
ratio is 2. So the overlap is only enabled with compare by-pieces.
Piotr Trojanek [Mon, 19 Feb 2024 08:46:04 +0000 (09:46 +0100)]
ada: Fix classification of SPARK Boolean aspects
The implementation of User_Aspect_Definition uses subtype
Boolean_Aspects to decide which existing aspects can be used to define
new aspects. This subtype didn't include many of the SPARK aspects,
notably the Always_Terminates.
gcc/ada/
* aspects.ads (Aspect_Id, Boolean_Aspect): Change categorization
of Boolean-valued SPARK aspects.
* sem_ch13.adb (Analyze_Aspect_Specification): Adapt CASE
statements to new classification of Boolean-valued SPARK
aspects.
Eric Botcazou [Fri, 16 Feb 2024 09:30:17 +0000 (10:30 +0100)]
ada: Fix small inaccuracy in previous change
The call to Build_Allocate_Deallocate_Proc must occur before the special
accessibility check for class-wide allocation is generated, because this
check comes with cleanup code.
gcc/ada/
* exp_ch4.adb (Expand_Allocator_Expression): Move the first call to
Build_Allocate_Deallocate_Proc up to before the accessibility check.
A recent change broke pragma Warnings when -gnatD is enabled in some
cases. This patch fixes this by caching more slocs at times when it's
known that they haven't been modified by -gnatD.
gcc/ada/
* errout.adb (Validate_Specific_Warnings): Adapt to record
definition change.
* erroutc.adb (Set_Specific_Warning_On, Set_Specific_Warning_Off,
Warning_Specifically_Suppressed): Likewise.
* erroutc.ads: Change record definition.
Eric Botcazou [Thu, 15 Feb 2024 15:02:51 +0000 (16:02 +0100)]
ada: Decouple attachment from dynamic allocation for controlled objects
This decouples the attachment to the appropriate finalization collection of
dynamically allocated objects that need finalization from their allocation.
The current implementation immediately attaches them after allocating them,
which means that they will be finalized even if their initialization does
not complete successfully. The new implementation instead generates the
same sequence as the one generated for (statically) declared objects, that
is to say, allocation, initialization and attachment in this order.
gcc/ada/
* exp_ch3.adb (Build_Default_Initialization): Do not generate the
protection for finalization collections.
(Build_Heap_Or_Pool_Allocator): Set the No_Initialization flag on
the declaration of the temporary.
* exp_ch4.adb (Build_Aggregate_In_Place): Do not build an allocation
procedure here.
(Expand_Allocator_Expression): Build an allocation procedure, if it
is required, only just before rewriting the allocator.
(Expand_N_Allocator): Do not build an allocation procedure if the
No_Initialization flag is set on the allocator, except for those
generated for special return objects. In other cases, build an
allocation procedure, if it is required, only before rewriting
the allocator.
* exp_ch7.ads (Make_Address_For_Finalize): New function declaration.
* exp_ch7.adb (Finalization Management): Update description for
dynamically allocated objects.
(Make_Address_For_Finalize): Remove declaration.
(Find_Last_Init): Change to function and move to...
(Process_Object_Declaration): Adjust to above change.
* exp_util.ads (Build_Allocate_Deallocate_Proc): Add Mark parameter
with Empty default and document it.
(Find_Last_Init): New function declaration.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Add Mark parameter
with Empty default and pass it in recursive call. Deal with type
conversions created for interface types. Adjust call sequence to
Allocate_Any_Controlled by changing Collection to In/Out parameter
and removing Finalize_Address parameter. For a controlled object,
generate a conditional call to Attach_Object_To_Collection for an
allocation and to Detach_Object_From_Collection for a deallocation.
(Find_Last_Init): ...here. Compute the initialization type for an
allocator whose designating type is class wide specifically and also
handle concurrent types.
* rtsfind.ads (RE_Id): Add RE_Attach_Object_To_Collection and
RE_Detach_Object_From_Collection.
(RE_Unit_Table): Add entries for RE_Attach_Object_To_Collection and
RE_Detach_Object_From_Collection.
* libgnat/s-finpri.ads (Finalization_Started): Delete.
(Attach_Node_To_Collection): Likewise.
(Detach_Node_From_Collection): Move to...
(Attach_Object_To_Collection): New procedure declaration.
(Detach_Object_From_Collection): Likewise.
(Finalization_Collection): Remove Atomic for Finalization_Started.
Add pragma Inline for Initialize.
* libgnat/s-finpri.adb: Add clause for Ada.Unchecked_Conversion.
(To_Collection_Node_Ptr): New instance of Ada.Unchecked_Conversion.
(Detach_Node_From_Collection): ...here.
(Attach_Object_To_Collection): New procedure.
(Detach_Object_From_Collection): Likewise.
(Finalization_Started): Delete.
(Finalize): Replace allocation with attachment in comments.
* libgnat/s-stposu.ads (Allocate_Any_Controlled): Rename parameter
Context_Subpool into Named_Subpool, parameter Context_Collection
into Collection and change it to In/Out, and remove Fin_Address.
* libgnat/s-stposu.adb: Remove clause for Ada.Unchecked_Conversion
and Finalization_Primitives.
(To_Collection_Node_Ptr): Delete.
(Allocate_Any_Controlled): Rename parameter Context_Subpool into
Named_Subpool, parameter Context_Collection into Collection and
change it to In/Out, and remove Fin_Address. Do not lock/unlock
and do not attach the object, instead only displace its address.
(Deallocate_Any_Controlled): Do not lock/unlock and do not detach
the object.
(Header_Size_With_Padding): Use qualified name for Header_Size.
Steve Baird [Thu, 15 Feb 2024 00:27:59 +0000 (16:27 -0800)]
ada: Follow up fixes for Put_Image/streaming regressions
A recent change to reduce duplication of compiler-generated Put_Image and
streaming subprograms introduced two regressions. One is yet another of the
many cases where generating these routines "on demand" (as opposed to at the
point of the associated type declaration) requires loosening the compiler's
enforcement of privacy. The other is a use-before-definition issue that
occurs because the declaration of a Put_Image procedure is not hoisted far
enough.
gcc/ada/
* exp_attr.adb (Build_And_Insert_Type_Attr_Subp): If a subprogram
associated with a (library-level) type declared in another unit is
to be inserted somewhere in a list, then insert it at the head of
the list.
* sem_ch5.adb (Analyze_Assignment): Normally a limited-type
assignment is illegal. Relax this rule if Comes_From_Source is
False and the type is not immutably limited.
ada: Fix pragma Compile_Time_Error and -gnatdJ crash
This patch makes it so the diagnostics coming from occurrences of
pragma Compile_Time_Error and Compile_Time_Warning are emitted with
a node parameter so they don't cause a crash when -gnatdJ is enabled.
gcc/ada/
* errout.ads (Error_Msg): Add node parameter.
* errout.adb (Error_Msg): Add parameter and pass it to
the underlying call.
* sem_prag.adb (Validate_Compile_Time_Warning_Or_Error): Pass
pragma node when emitting errors.
This patch makes -gnatyz style check reports specify a node
ID. That is required since those checks are sometimes made during
semantic analysis of short-circuit operators, where the Current_Node
mechanism that -gnatdJ uses is not operational.
Check_Xtra_Parens_Precedence is moved from Styleg to Style to make
this possible.
gcc/ada/
* styleg.ads (Check_Xtra_Parens_Precedence): Moved ...
* style.ads (Check_Xtra_Parens_Precedence): ... here. Also
replace corresponding renaming.
* styleg.adb (Check_Xtra_Parens_Precedence): Moved ...
* style.adb (Check_Xtra_Parens_Precedence): ... here. Also use
Errout.Error_Msg and pass it a node parameter.
Eric Botcazou [Wed, 14 Feb 2024 00:22:49 +0000 (01:22 +0100)]
ada: Small cleanup about allocators and aggregates
This eliminates a few oddities present in the expander for allocators and
aggregates present in allocators:
- Convert_Aggr_In_Allocator takes both a Decl and Alloc parameters,
and inserts new code before Alloc for records and after Decl for arrays
through Convert_Array_Aggr_In_Allocator. Now, for the 3 (duplicated)
calls to the procedure, that's the same place. It also creates a new
list that it does not use in most cases.
- Expand_Allocator_Expression uses the same code sequence in 3 places
when the expression is an aggregate to build in place.
- Build_Allocate_Deallocate_Proc takes an Is_Allocate parameter that is
entirely determined by the N parameter: if N is an allocator, it must
be true; if N is a free statement, it must be false. Barring that,
the procedure either raises an assertion or Program_Error. It also
contains useless pattern matching code in the second part.
No functional changes.
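The third point can be illustrated with a small Python model (not Ada, and
not the expander's actual code): the allocate/deallocate decision is derived
from the kind of node N instead of being passed in separately. The function
name and the use of strings for node kinds are assumptions made for this
sketch only.

```python
def is_allocate_for(nkind: str) -> bool:
    """Derive the former Is_Allocate flag from the kind of node N (toy model)."""
    if nkind == "N_Allocator":
        return True
    if nkind == "N_Free_Statement":
        return False
    # Any other node kind is a caller error in this model.
    raise ValueError(f"unexpected node kind: {nkind}")
```

Because the flag is a pure function of N, a separate parameter can only ever
agree with it or be wrong, which is why it can be dropped.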
gcc/ada/
* exp_aggr.ads (Convert_Aggr_In_Allocator): Rename Alloc into N,
replace Decl with Temp and adjust description.
(Convert_Aggr_In_Object_Decl): Alphabetize.
(Is_Delayed_Aggregate): Likewise.
* exp_aggr.adb (Convert_Aggr_In_Allocator): Rename Alloc into N
and replace Decl with Temp. Allocate a list only when needed.
(Convert_Array_Aggr_In_Allocator): Replace N with Decl and insert
new code before it.
* exp_ch4.adb (Build_Aggregate_In_Place): New procedure nested in
Expand_Allocator_Expression.
(Expand_Allocator_Expression): Call it to build aggregates in place.
Remove second parameter in calls to Build_Allocate_Deallocate_Proc.
(Expand_N_Allocator): Likewise.
* exp_ch13.adb (Expand_N_Free_Statement): Likewise.
* exp_util.ads (Build_Allocate_Deallocate_Proc): Remove Is_Allocate
parameter.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Remove Is_Allocate
parameter and replace it with local variable of same name. Delete
useless pattern matching.
Before this patch, the default status of -gnatw.i and -gnatw.d is
reported incorrectly in the usage string used throughout GNAT tools.
This patch fixes this.
The parameters should be swapped to fit Fileapi.h documentation.
BOOL LocalFileTimeToFileTime(
[in] const FILETIME *lpLocalFileTime,
[out] LPFILETIME lpFileTime
);
This patch tweaks the calls made to Errout subprograms to report
violations of dependence restrictions, in order to fix a crash that
occurred with -gnatdJ and -fdiagnostics-format=json.
This patch fixes a crash when -gnatdJ is enabled and a warning
must be emitted about an ineffective pragma Warnings clause.
Some modifications are made to the specific warnings machinery so
that warnings carry the ID of the pragma node they're about, so the
-gnatdJ mechanism can find an appropriate enclosing subprogram.
gcc/ada/
* sem_prag.adb (Analyze_Pragma): Adapt call to new signature.
* erroutc.ads (Set_Specific_Warning_Off): Change signature
and update documentation.
(Validate_Specific_Warnings): Move ...
* errout.adb: ... here and change signature. Also move body
of Validate_Specific_Warnings from erroutc.adb.
(Finalize): Adapt call.
* errout.ads (Set_Specific_Warning_Off): Adapt signature of
renaming.
* erroutc.adb (Set_Specific_Warning_Off): Adapt signature and
body.
(Validate_Specific_Warnings): Move to the body of Errout.
(Warning_Specifically_Suppressed): Adapt body.
Eric Botcazou [Mon, 12 Feb 2024 14:23:41 +0000 (15:23 +0100)]
ada: Restore default size for dynamic allocations of discriminated type
The allocation strategy for objects of a discriminated type with defaulted
discriminants is not the same when the allocation is dynamic as when it is
static (i.e. a declaration): in the former case, the compiler allocates the
default size whereas, in the latter case, it allocates the maximum size.
This restores the default size, which was dropped during the refactoring.
gcc/ada/
* exp_aggr.adb (Build_Array_Aggr_Code): Pass N in the call to
Build_Initialization_Call.
(Build_Record_Aggr_Code): Likewise.
(Convert_Aggr_In_Object_Decl): Likewise.
(Initialize_Discriminants): Likewise.
* exp_ch3.ads (Build_Initialization_Call): Replace Loc with N.
* exp_ch3.adb (Build_Array_Init_Proc): Pass N in the call to
Build_Initialization_Call.
(Build_Default_Initialization): Likewise.
(Expand_N_Object_Declaration): Likewise.
(Build_Initialization_Call): Replace Loc with N parameter and add
Loc local variable. Build a default subtype for an allocator of
a discriminated type with defaulted discriminants.
(Build_Record_Init_Proc): Pass the declaration of components in the
call to Build_Initialization_Call.
* exp_ch6.adb (Make_CPP_Constructor_Call_In_Allocator): Pass the
allocator in the call to Build_Initialization_Call.
Gary Dismukes [Mon, 12 Feb 2024 20:18:36 +0000 (20:18 +0000)]
ada: Compiler crash or errors on if_expression in container aggregate
The compiler may either crash or incorrectly report errors when
a component association in a container aggregate is an if_expression
with an elsif part whose dependent expression is a call to a function
returning a result that requires finalization. The compiler complains
that a private type is expected, but a package or procedure name was
found. This is due to the compiler improperly associating expanded
calls to Finalize_Object with the aggregate, rather than the enclosing
object declaration being initialized by the aggregate, which can result
in the Finalize_Object procedure call being passed as an actual to
the Add_Unnamed operation of the container type and leading to a type
mismatch and the confusing error message. This is fixed by adjusting
the code that locates the proper context for insertion of Finalize_Object
calls to locate the enclosing declaration or statement rather than
stopping at the aggregate.
gcc/ada/
* exp_util.adb (Find_Hook_Context): Exclude N_*Aggregate Nkinds
of Parent (Par) from the early return in the second loop of the
In_Cond_Expr case, to prevent returning an aggregate from this
function rather than the enclosing declaration or statement.
Eric Botcazou [Mon, 12 Feb 2024 18:25:39 +0000 (19:25 +0100)]
ada: Follow-up adjustment after fix to Default_Initialize_Object
Now that Default_Initialize_Object honors the No_Initialization flag in all
cases, objects of an access type declared without initialization expression
can no longer be considered as being automatically initialized to null.
gcc/ada/
* exp_ch3.adb (Expand_N_Object_Declaration): Examine the Expression
field after the call to Default_Initialize_Object in order to set
Is_Known_Null, as well as Is_Known_Non_Null, on an access object.
Steve Baird [Thu, 21 Dec 2023 21:58:51 +0000 (13:58 -0800)]
ada: Reduce generated code duplication for streaming and Put_Image subprograms
In the case of an untagged composite type, the compiler does not generate
streaming-related subprograms or a Put_Image procedure when the type is
declared. Instead, these subprograms are declared "on demand" when a
corresponding attribute reference is encountered. In this case, hoist the
declaration of the implicitly declared subprogram out as far as possible
in order to maximize the chances that it can be reused (as opposed to
generating an identical second subprogram) in the case where a second
reference to the same attribute is encountered. Also relax some
privacy-related rules to allow these procedures to do what they need to do
even when constructed in a scope where some of those actions would
normally be illegal.
gcc/ada/
* exp_attr.adb: Change name of package Cached_Streaming_Ops to
reflect the fact that it is now also used for Put_Image
procedures. Similarly change other "Streaming_Op" names therein.
Add Validate_Cached_Candidate procedure to detect case where a
subprogram found in the cache cannot be reused. Add new generic
procedure Build_And_Insert_Type_Attr_Subp; the "Build" part is
handled by just calling a formal procedure; the bulk of this
(generic) procedure's code has to do with deciding where in the tree
to insert the newly-constructed subprogram. Replace each later
"Build" call (and the following Insert_Action or
Compile_Stream_Body_In_Scope call) with a declare block that
instantiates and then calls this generic procedure. Delete the
now-unused procedure Compile_Stream_Body_In_Scope. A constructed
subprogram is entered in the appropriate cache if the
corresponding type is untagged; this replaces more complex tests.
A new function Interunit_Ref_OK is added to determine whether an
attribute reference occurring in one unit can safely refer to a
cached subprogram declared in another unit.
* exp_ch3.adb (Build_Predefined_Primitive_Bodies): A formal
parameter was deleted, so delete the corresponding actual in a
call.
* exp_put_image.adb (Build_Array_Put_Image_Procedure): Because the
procedure being built may be referenced more than once, the
generated procedure takes its source position info from the type
declaration instead of the (first) attribute reference.
(Build_Record_Put_Image_Procedure): Likewise.
* exp_put_image.ads (Build_Array_Put_Image_Procedure): Eliminate
now-unused Nod parameter.
(Build_Record_Put_Image_Procedure): Eliminate now-unused Loc parameter.
* sem_ch3.adb (Constrain_Discriminated_Type): For declaring a
subtype with a discriminant constraint, ignore privacy if
Comes_From_Source is false (as is already done if Is_Instance is
true).
* sem_res.adb (Resolve): When passed two type entities that have
the same underlying base type, Sem_Type.Covers may return False in
some cases because of privacy. [This can happen even if
Is_Private_Type returns False both for Etype (N) and for Typ;
Covers calls Base_Type, which can take a non-private argument and
yield a private result.] If Comes_From_Source (N) is False
(e.g., for a compiler-generated Put_Image or streaming subprogram), then
avoid that scenario by not calling Covers. Covers already has tests for
doing this sort of thing (see the calls therein to Full_View_Covers),
but the Comes_From_Source test is too coarse to apply there. So instead
we handle the problem here at the call site.
(Original_Implementation_Base_Type): A new function. Same as
Implementation_Base_Type except if the Original_Node attribute of
a non-derived type declaration indicates that it once was a derived
type declaration. Needed for looking through privacy.
(Valid_Conversion): Ignore privacy when converting between different views
of the same type if Comes_From_Source is False for the conversion.
(Valid_Tagged_Conversion): An ancestor-to-descendant conversion is not an
illegal downward conversion if there is no type extension involved
(because the derivation was from an untagged view of the parent type).
Steve Baird [Fri, 9 Feb 2024 23:08:51 +0000 (15:08 -0800)]
ada: Better error message for bad general case statements
If -gnatX0 is specified, we allow case statements with a selector
expression of a record or array type, but not of a private type.
If the selector expression is of a private type then we should generate
an appropriate error message instead of a bugbox.
gcc/ada/
* sem_ch5.adb (Analyze_Case_Statement): Emit a message and return
early in the case where general case statements are allowed but
the selector expression is of a private type. This is done to
avoid a bugbox.
Justin Squirek [Mon, 12 Feb 2024 15:50:24 +0000 (15:50 +0000)]
ada: Spurious unreferenced warning on selected component
This patch fixes an error in the compiler whereby a selected component on the
left hand side of an assignment statement may not get marked as referenced -
leading to spurious unreferenced warnings on such objects.
gcc/ada/
* sem_util.adb (Set_Referenced_Modified): Use Original_Node to
avoid recursive calls on expanded / internal objects such that
source nodes get appropriately marked as referenced.
Before this patch, some warnings about overlapping actuals were
emitted regardless of the value of
Warnsw.Warnings_Package.Warn_On_Overlap. This patch fixes this.
Eric Botcazou [Sun, 11 Feb 2024 18:05:08 +0000 (19:05 +0100)]
ada: Follow-up adjustment to earlier fix in Build_Allocate_Deallocate_Proc
The profile of the procedure built for an allocation on the secondary stack
now includes the alignment parameter, so the parameter can just be forwarded
in the call to Allocate_Any_Controlled.
gcc/ada/
* exp_util.adb (Build_Allocate_Deallocate_Proc): Pass the alignment
parameter in the inner call for a secondary stack allocation too.
Javier Miranda [Sun, 11 Feb 2024 16:22:28 +0000 (16:22 +0000)]
ada: Missing support for consistent assertion policy
Add missing support for RM 10.2/5: the region for a pragma
Assertion_Policy given as a configuration pragma is the
declarative region for the entire compilation unit (or units)
to which it applies.
gcc/ada/
* sem_ch10.adb (Install_Inherited_Policy_Pragmas): New subprogram.
(Remove_Inherited_Policy_Pragmas): New subprogram.
(Analyze_Compilation_Unit): Call the new subprograms to
install and remove inherited assertion policy pragmas.
Eric Botcazou [Fri, 9 Feb 2024 23:03:42 +0000 (00:03 +0100)]
ada: Fix double finalization for dependent expression of case expression
The recent fix to Default_Initialize_Object, which has ensured that the
No_Initialization flag set on an object declaration, for example for the
temporary created by Expand_N_Case_Expression, is honored in all cases,
has also uncovered a latent issue in the machinery responsible for the
finalization of transient objects.
More specifically, the answer returned by the Is_Finalizable_Transient
predicate for an object of an access type is different when it is left
uninitialized (true) than when it is initialized to null (false), which
is incorrect; it must return false in both cases, because the only case
where an object can be finalized by the machinery through an access value
is when this value is a reference (N_Reference node) to the object.
This was already more or less the current state of the evolution of the
predicate, but this now explicitly states it in the code.
The change also sets the No_Initialization flag for the temporary created
by Expand_N_If_Expression for the sake of consistency.
gcc/ada/
* exp_ch4.adb (Expand_N_If_Expression): Set No_Initialization on the
declaration of the temporary in the by-reference case.
* exp_util.adb (Initialized_By_Access): Delete.
(Is_Allocated): Likewise.
(Initialized_By_Reference): New predicate.
(Is_Finalizable_Transient): If the transient object is of an access
type, do not return true unless it is initialized by a reference.
Steve Baird [Wed, 7 Feb 2024 21:52:58 +0000 (13:52 -0800)]
ada: Error in determining accumulator subtype for a reduction expression
There was an earlier bug in determining the accumulator subtype for a
reduction expression in the case where the reducer subprogram is overloaded.
The fix for that bug introduced a recently-discovered
regression. Redo accumulator subtype computation in order to address
this regression while preserving the benefits of the earlier fix.
gcc/ada/
* exp_attr.adb: Move computation of Accum_Typ entirely into the
function Build_Stat.
Steve Baird [Wed, 7 Feb 2024 19:47:22 +0000 (11:47 -0800)]
ada: Rtsfind should not trash state used in analyzing instantiations.
During analysis of an instantiation, Sem_Ch12 manages formal/actual binding
information in package state (see Sem_Ch12.Generic_Renamings_HTable).
A call to rtsfind can cause another unit to be loaded and compiled.
If this occurs during the analysis of an instantiation, and if the loaded
unit contains a second instantiation, then the Sem_Ch12 state needed for
analyzing the first instantiation can be trashed during the analysis of the
second instantiation. Rtsfind calls that can include the analysis of an
instantiation need to save and restore Sem_Ch12's state.
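The hazard and the save/restore remedy can be modeled in a few lines of
Python. This is only a sketch under stated assumptions: `Sem12State`,
`STATE`, `load_unit` and `nested_instantiation` are hypothetical stand-ins,
and a plain dict plays the role of Sem_Ch12.Generic_Renamings_HTable.

```python
class Sem12State:
    """Stand-in for the package state kept by Sem_Ch12 (hypothetical model)."""

    def __init__(self):
        self.renamings = {}  # models Generic_Renamings_HTable


STATE = Sem12State()


def save_and_reset():
    """Return the current table and install a fresh, empty one."""
    saved = STATE.renamings
    STATE.renamings = {}
    return saved


def restore(saved):
    """Reinstall a table previously returned by save_and_reset."""
    STATE.renamings = saved


def load_unit(analyze):
    """Model of a load that may analyze a nested instantiation."""
    saved = save_and_reset()
    try:
        analyze()
    finally:
        # Restore even if the nested analysis fails.
        restore(saved)


def nested_instantiation():
    # The nested analysis freely overwrites the shared table.
    STATE.renamings["Formal_T"] = "Float"
```

With the outer binding `Formal_T -> Integer` installed before the call,
`load_unit(nested_instantiation)` leaves that binding intact, whereas calling
`nested_instantiation()` directly would overwrite it.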
gcc/ada/
* sem_ch12.ads: Declare new Instance_Context package, which
declares a private type Context with operations Save_And_Reset and
Restore.
* sem_ch12.adb: Provide body for new Instance_Context package.
* rtsfind.adb (Load_RTU): Wrap an Instance_Context Save/Restore
call pair around the call to Semantics.
* table.ads: Add initial value for Last_Val (because
Save_And_Reset expects Last_Val to be initialized).
Eric Botcazou [Tue, 6 Feb 2024 18:31:09 +0000 (19:31 +0100)]
ada: Factor out implementation of default initialization for objects
As written down in a comment, "There is a *huge* amount of code duplication"
in the implementation of default initialization for objects in the front-end,
between the (static) declaration case and the dynamic allocation case.
This change factors out the implementation of the (static) declaration case
and uses it for the dynamic allocation case, with the following benefits:
1. getting rid of the duplication and reducing total line count,
2. bringing optimizations implemented for the (static) declaration case
to the dynamic allocation case,
3. performing the missing abort deferral prescribed by RM 9.8(9) in the
dynamic allocation case.
gcc/ada/
* exp_aggr.adb (Build_Record_Aggr_Code): Replace reference to
Build_Task_Allocate_Block_With_Init_Stmts in comment with reference
to Build_Task_Allocate_Block.
(Convert_Aggr_In_Allocator): Likewise for the call in the code.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Allocator): Likewise.
* exp_ch3.ads: Alphabetize clauses.
(Build_Default_Initialization): New function declaration.
(Build_Default_Simple_Initialization): Likewise.
(Build_Initialization_Call): Add Target_Ref parameter with default.
* exp_ch3.adb (Build_Default_Initialization): New function extracted
from...
(Build_Default_Simple_Initialization): Likewise.
(Build_Initialization_Call): Add Target_Ref parameter with default.
(Expand_N_Object_Declaration): ...here.
(Default_Initialize_Object): Call Build_Default_Initialization and
Build_Default_Simple_Initialization.
* exp_ch4.adb (Expand_Allocator_Expression): Minor comment tweaks.
(Expand_N_Allocator): Call Build_Default_Initialization and
Build_Default_Simple_Initialization to implement the default
initialization of the allocated object.
* exp_ch9.ads (Build_Task_Allocate_Block): Delete.
(Build_Task_Allocate_Block_With_Init_Stmts): Rename into...
(Build_Task_Allocate_Block): ...this.
* exp_ch9.adb: Remove clauses for Exp_Tss.
(Build_Task_Allocate_Block): Delete.
(Build_Task_Allocate_Block_With_Init_Stmts): Rename into...
(Build_Task_Allocate_Block): ...this.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Remove unnecessary
initialization expression, adjust commentary and replace early exit
with assertion.
* sem_ch4.adb (Analyze_Allocator): In the null-exclusion case, call
Apply_Compile_Time_Constraint_Error to insert the raise.
The crash this patch fixes happened because calling the Errout.Error_Msg
procedures that don't have an N parameter is not allowed when not
parsing and -gnatdJ is on. And -gnatyB style checks are not emitted during
parsing but during semantic analysis.
This commit moves Check_Boolean_Operator from Styleg to Style so it can
call Errout.Error_Msg with a Node_Id parameter. This change of package
makes sense because:
1. The compiler is currently the only user of Check_Boolean_Operator.
2. Other tools don't do semantic analysis, and so cannot possibly
know when to use Check_Boolean_Operator anyway.
gcc/ada/
* styleg.ads (Check_Boolean_Operator): Moved ...
* style.ads (Check_Boolean_Operator): ... here.
* styleg.adb (Check_Boolean_Operator): Moved ...
* style.adb (Check_Boolean_Operator): ... here. Also add node
parameter to call to Errout.Error_Msg.
Yannick Moy [Fri, 2 Feb 2024 17:20:06 +0000 (18:20 +0100)]
ada: Update of SPARK RM legality rules on ghost code
Update checking of ghost code after a small change in SPARK RM
rules 6.9(15) and 6.9(20), so that the Ghost assertion policy
that matters when checking the validity of a reference to a ghost entity
in an assertion expression is the Ghost assertion policy at the point
of declaration of the entity.
Also fix references to SPARK RM rules in comments, which were off by two
in many cases after the insertion of rules 13 and 14 regarding generic
instantiations.
gcc/ada/
* contracts.adb: Fix references to SPARK RM rules.
* freeze.adb: Same.
* ghost.adb: Fix references to SPARK RM rules.
(Check_Ghost_Context): Update checking of references to
ghost entities in assertion expressions.
* sem_ch6.adb: Fix references to SPARK RM rules.
* sem_prag.adb: Same.
Yannick Moy [Thu, 8 Feb 2024 10:47:20 +0000 (11:47 +0100)]
ada: Fix ghost policy in use for generic instantiation
The Ghost assertion policy relevant for analyzing a generic instantiation
is the Ghost policy at the point of instantiation, not the one applicable
for the generic itself.
gcc/ada/
* ghost.adb (Mark_And_Set_Ghost_Instantiation): Fix the current
Ghost policy for the instantiation.
Eric Botcazou [Sun, 4 Feb 2024 10:16:18 +0000 (11:16 +0100)]
ada: Small fix to Default_Initialize_Object
Unlike what is assumed in other parts of the front-end, some objects created
with No_Initialization set on their declaration may end up being initialized
with a default value.
gcc/ada/
* exp_ch3.adb (Default_Initialize_Object): Return immediately when
either Has_Init_Expression or No_Initialization is set on the node.
Tidy up the rest of the code accordingly.
(Simple_Initialization_OK): Do not test Has_Init_Expression here.
Jeff Law [Mon, 13 May 2024 23:37:46 +0000 (17:37 -0600)]
[to-be-committed,RISC-V] Improve AND with some constants
If we have an AND with a constant operand and the constant operand
requires synthesis, then we may be able to generate more efficient code
than we do now.
Essentially the need for constant synthesis gives us a budget for
alternative ways to clear bits, which zext.w can do for bits 32..63
trivially. So if we clear 32..63 via zext.w, the constant for the
remaining bits to clear may be simple enough to use with andi or bseti.
That will save us an instruction.
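As a rough illustration of the identity being exploited (a Python model of
the arithmetic, not the splitter itself): once zext.w has cleared bits
32..63, the remaining mask may be treated as if its upper bits were all set,
which can turn it into a value that fits andi's sign-extended 12-bit
immediate. The helper names and the constant 0xFFFFFFF0 are chosen for this
sketch and are assumptions, not taken from the patch.

```python
MASK64 = (1 << 64) - 1


def zext_w(x):
    """Model of zext.w: keep bits 0..31, clear bits 32..63."""
    return x & 0xFFFFFFFF


def andi(x, imm12):
    """Model of andi: AND with a sign-extended 12-bit immediate."""
    if not -2048 <= imm12 <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    return x & (imm12 & MASK64)


def and_via_zext_andi(x, c):
    """Compute x & c for a constant c whose bits 32..63 are clear."""
    if c >> 32:
        raise ValueError("constant must have bits 32..63 clear")
    # zext.w already cleared the upper half of x, so the immediate is
    # free to have its upper 32 bits set; that may make it a small
    # negative number.
    imm = (c | 0xFFFFFFFF00000000) - (1 << 64)
    return andi(zext_w(x), imm)
```

For example, `x & 0xFFFFFFF0` becomes `andi(zext_w(x), -16)` in this model:
two instructions, with no separate constant to materialize.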
This has been tested in Ventana's CI system as well as my own. I'll wait for
the upstream CI tester to report success before committing.
Jeff
gcc/
* config/riscv/bitmanip.md: Add new splitter for AND with
a constant that masks off bits 32..63 and needs synthesis.
Sergei Lewis [Mon, 13 May 2024 23:32:24 +0000 (17:32 -0600)]
[PATCH v2 1/3] RISC-V: movmem for RISCV with V extension
This patchset permits generation of inlined vectorised code for movmem,
setmem and cmpmem, if and only if the operation size is
at least one and at most eight vector registers' worth of data.
Further vectorisation rapidly becomes debatable due to code size concerns;
however, for these simple cases we do have an unambiguous performance win
without sacrificing too much code size compared to a libc call.
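The size window described above can be written as a one-line predicate. This
is a sketch only: the function name and the `vlen_bytes` parameter (bytes
held by one vector register) are assumptions, and the real expander checks
further conditions that are not modeled here.

```python
def inline_vector_move_ok(size_bytes, vlen_bytes, max_lmul=8):
    """True when the size is between one and max_lmul vector registers of data."""
    return vlen_bytes <= size_bytes <= max_lmul * vlen_bytes
```

With 16-byte vector registers, sizes from 16 to 128 bytes qualify in this
model; anything smaller or larger falls back to the non-vectorised path.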
Changes in v2:
* run clang-format over the code in addition to the
contrib/check_GNU_style.sh that was used for v1
* remove string.h include and refer to __builtin_* memory functions
in multilib tests
* respect stringop_strategy (don't vectorise if it doesn't include VECTOR)
* use an integer constraint for movmem length parameter
* use TARGET_MAX_LMUL unless riscv-autovec-lmul=dynamic
to ensure we respect the user's wishes if they request specific lmul
* add new unit tests to check that riscv-autovec-lmul is respected
* PR target/112109 added to changelog for patch 1/3 as requested
Sergei Lewis (3):
RISC-V: movmem for RISCV with V extension
RISC-V: setmem for RISCV with V extension
RISC-V: cmpmem for RISCV with V extension
gcc/ChangeLog
* config/riscv/riscv.md (movmem<mode>): Use riscv_vector::expand_block_move,
if and only if we know the entire operation can be performed using one vector
load followed by one vector store
gcc/testsuite/ChangeLog
PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test
Patrick Palka [Mon, 13 May 2024 19:46:55 +0000 (15:46 -0400)]
c++: replace tf_norm with a local flag
The tf_norm flag controlling whether to build diagnostic information
during constraint normalization doesn't need to be a global tsubst flag,
and is confusingly named. This patch replaces it with a boolean flag
local to normalization.
gcc/cp/ChangeLog:
* constraint.cc (norm_info::norm_info): Take a bool instead of
tsubst_flags_t.
(norm_info::generate_diagnostics): Turn this predicate function
into a bool data member.
(normalize_logical_operation): Adjust after norm_info changes.
(normalize_concept_check): Likewise.
(normalize_atom): Likewise.
(get_normalized_constraints_from_info): Likewise.
(normalize_concept_definition): Likewise.
(normalize_constraint_expression): Likewise.
(normalize_placeholder_type_constraints): Likewise.
(satisfy_nondeclaration_constraints): Likewise.
* cp-tree.h (enum tsubst_flags): Remove tf_norm.
My recent patch to recognize reg starvation resulted in a few GCC test
failures. The following patch fixes this by using a more accurate
starvation calculation and ignoring small reg classes.
gcc/ChangeLog:
PR rtl-optimization/115013
* lra-constraints.cc (process_alt_operands): Update all_used_nregs
only for winreg. Ignore reg starvation for small reg classes.
Pan Li [Sat, 11 May 2024 07:25:28 +0000 (15:25 +0800)]
RISC-V: Bugfix ICE for RVV intrinsic vfw on _Float16 scalar
For the vfw vx format RVV intrinsic, the scalar type _Float16 also
requires the zvfh extension. Unfortunately, we only check the
vector tree type and miss the scalar _Float16 type checking. For
example:
It should report an error message such as "zvfh extension is required"
instead of an ICE for an unrecognizable insn.
This patch adds the missing validation for _Float16 in the RVV
intrinsic API. It reports an error like the one below when zvfh is
not enabled.
error: built-in function '__riscv_vfwsub_wf_f32mf2(vs2, rs1, vl)'
requires the zvfhmin or zvfh ISA extension
Passed the rv64gcv full regression tests, including c/c++/fortran.
PR target/114988
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins.cc
(validate_instance_type_required_extensions): New func impl to
validate the intrinsic func type ops.
(expand_builtin): Validate instance type before expand.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr114988-1.c: New test.
* gcc.target/riscv/rvv/base/pr114988-2.c: New test.
During maybe_aggr_guide with a nested class template and paren init,
like with list init we need to consider the generic template type rather
than the partially instantiated type since partial instantiations don't
have (partially instantiated) TYPE_FIELDS. In turn we need to partially
substitute PARMs in the paren init case as well. As a drive-by improvement
it seems better to use outer_template_args instead of DECL_TI_ARGS during
this partial substitution so that we lower instead of substitute the
innermost template parameters, which is generally more robust.
And during alias_ctad_tweaks with a nested class template, even though
the guides may be already partially instantiated we still need to
substitute the outermost arguments into its constraints.
PR c++/114974
PR c++/114901
PR c++/114903
gcc/cp/ChangeLog:
* pt.cc (maybe_aggr_guide): Fix obtaining TYPE_FIELDS in
the paren init case. Hoist out partial substitution logic
to apply to the paren init case as well.
(alias_ctad_tweaks): Substitute outer template arguments into
a guide's constraints.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-aggr14.C: New test.
* g++.dg/cpp2a/class-deduction-alias20.C: New test.
* g++.dg/cpp2a/class-deduction-alias21.C: New test.
Jeff Law [Mon, 13 May 2024 13:14:08 +0000 (07:14 -0600)]
[to-be-committed,RISC-V] Improve single inverted bit extraction - v3
So this patch fixes a minor code generation inefficiency that (IIRC) the
RAU team discovered a while ago in spec.
If we want the inverted value of a single bit we can use bext to extract
the bit, then seq to invert the value (if viewed as a 0/1 truth value).
The RTL is fairly convoluted, but it's basically a right shift to get
the bit into position, bitwise-not then masking off all but the low bit.
So it's a 3->2 combine, hidden by the fact that and-not is a
define_insn_and_split, so it actually looks like a 2->2 combine.
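The equivalence between the shift/not/mask form and the bext + seqz pair can
be checked with a small Python model. The helpers below imitate the
semantics of the two instructions; they are illustrative stand-ins, not the
machine description pattern.

```python
def bext(x, n):
    """Model of bext: extract bit n of x as 0 or 1."""
    return (x >> n) & 1


def seqz(x):
    """Model of seqz: 1 if x is zero, else 0."""
    return 1 if x == 0 else 0


def inverted_bit_shift_not_mask(x, n):
    """The RTL shape: right shift, bitwise-not, mask off all but the low bit."""
    return ~(x >> n) & 1


def inverted_bit_bext_seqz(x, n):
    """The two-instruction sequence the new pattern aims for."""
    return seqz(bext(x, n))
```

Both functions return the logical negation of bit n, so the three-operation
form can be replaced by the two-instruction one.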
We've run this through Ventana's internal CI (which includes
zba_zbb_zbs) and I've run it in my own tester (rv64gc, rv32gcv). I'll
wait for the upstream CI to finish with positive results before pushing.
gcc/
* config/riscv/bitmanip.md (bextseqzdisi): New patterns.
gcc/testsuite/
* gcc.target/riscv/zbs-bext-2.c: New test.
* gcc.target/riscv/zbs-bext.c: Fix one of the possible expected sequences.
Richard Biener [Thu, 9 Nov 2023 10:30:22 +0000 (11:30 +0100)]
PR60276 fix for single-lane SLP
When enabling single-lane SLP and not splitting groups, the fix for
PR60276 is no longer effective since, for an unknown reason, it exempted
pure SLP. The following removes this exemption, making
gcc.dg/vect/pr60276.c PASS even with --param vect-single-lane-slp=1
PR tree-optimization/60276
* tree-vect-stmts.cc (vectorizable_load): Do not exempt
pure_slp grouped loads from the STMT_VINFO_MIN_NEG_DIST
restriction.
PR libstdc++/114958
* include/experimental/bits/simd.h (__as_vector): Return scalar
simd as one-element vector. Return vector from single-vector
fixed_size simd.
(__vec_shuffle): New.
(__extract_part): Adjust return type signature.
(split): Use __extract_part for any split into non-fixed_size
simds.
(concat): If the return type stores a single vector, use
__vec_shuffle (which calls __builtin_shufflevector) to produce
the return value.
* include/experimental/bits/simd_builtin.h
(__shift_elements_right): Removed.
(__extract_part): Return single elements directly. Use
__vec_shuffle (which calls __builtin_shufflevector) for all
non-trivial cases.
* include/experimental/bits/simd_fixed_size.h (__extract_part):
Return single elements directly.
* testsuite/experimental/simd/pr114958.cc: New test.
Richard Biener [Fri, 1 Mar 2024 11:08:36 +0000 (12:08 +0100)]
Refactor SLP reduction group discovery
The following refactors a bit how we perform SLP reduction group
discovery possibly making it easier to have multiple reduction
groups later, esp. with single-lane SLP.
* tree-vect-slp.cc (vect_analyze_slp_instance): Remove
slp_inst_kind_reduc_group handling.
(vect_analyze_slp): Add the meat here.
Jakub Jelinek [Mon, 13 May 2024 09:15:27 +0000 (11:15 +0200)]
tree-ssa-math-opts: Pattern recognize yet another .ADD_OVERFLOW pattern [PR113982]
We already pattern recognize many different patterns, the closest to the
requested one being
yc = (type) y;
zc = (type) z;
x = yc + zc;
w = (typeof_y) x;
if (x > max)
where y/z has the same unsigned type and type is a wider unsigned type
and max is maximum value of the narrower unsigned type.
But apparently people are creative in writing this in different ways,
this requests
yc = (type) y;
zc = (type) z;
x = yc + zc;
w = (typeof_y) x;
if (x >> narrower_type_bits)
The following patch implements that.
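For concreteness, here is a minimal C sketch of the two shapes, assuming a 32-bit narrower type and a 64-bit wider type. The function names are hypothetical and chosen only for this example; the sketch shows the source-level idioms, not what the pass emits for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shape that was already recognized: compare the wide sum against
   the maximum value of the narrower type. */
bool add_overflows_cmp(uint32_t y, uint32_t z, uint32_t *w)
{
    uint64_t x = (uint64_t)y + (uint64_t)z;
    *w = (uint32_t)x;               /* truncated result */
    return x > UINT32_MAX;
}

/* Shape requested in the PR: shift the wide sum right by the
   precision of the narrower type and test the result. */
bool add_overflows_shift(uint32_t y, uint32_t z, uint32_t *w)
{
    uint64_t x = (uint64_t)y + (uint64_t)z;
    *w = (uint32_t)x;               /* truncated result */
    return (x >> 32) != 0;
}

/* Sanity helper: both shapes must agree for any pair of inputs. */
void check_shapes_agree(uint32_t y, uint32_t z)
{
    uint32_t w1, w2;
    bool o1 = add_overflows_cmp(y, z, &w1);
    bool o2 = add_overflows_shift(y, z, &w2);
    assert(o1 == o2 && w1 == w2);
}
```

Since the sum of two 32-bit values fits in 33 bits, the shifted value is either 0 or 1, which is why testing it is equivalent to the comparison against the narrow maximum.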
2024-05-13 Jakub Jelinek <jakub@redhat.com>
PR middle-end/113982
* tree-ssa-math-opts.cc (arith_overflow_check_p): Also return 1
for RSHIFT_EXPR by precision of maxval if shift result is only
used in a cast or comparison against zero.
(match_arith_overflow): Handle the RSHIFT_EXPR use case.
Piotr Trojanek [Fri, 2 Feb 2024 12:24:45 +0000 (13:24 +0100)]
ada: Revert recent change for Put_Image and Object_Size attributes
Recent change for attribute Object_Size caused spurious errors when
restriction No_Implementation_Attributes is active and attribute
Object_Size is introduced by expansion of dispatching operations.
Temporarily revert that change for a further investigation.
Eric Botcazou [Thu, 1 Feb 2024 14:30:28 +0000 (15:30 +0100)]
ada: Rename finalization scope masters into finalization masters
Now that what was previously called "finalization master" has been renamed
into "finalization collection" in the front-end, we can also rename what was
initially called "finalization scope master" into "finalization master".
These entities indeed drive the finalization of all the objects that require
it, directly for (statically) declared objects or indirectly for dynamically
allocated objects (that is to say, through finalization collections).
gcc/ada/
* exp_ch7.adb: Adjust the description of finalization management.
(Build_Finalizer): Rename scope master into master throughout.
* rtsfind.ads (RE_Id): Replace RE_Finalization_Scope_Master with
RE_Finalization_Master.
(RE_Unit_Table): Replace entry for RE_Finalization_Scope_Master with
entry for RE_Finalization_Master.
* libgnat/s-finpri.ads (Finalization_Scope_Master): Rename into...
(Finalization_Master): ...this.
(Attach_Object_To_Master): Adjust to above renaming.
(Chain_Node_To_Master): Likewise.
(Finalize_Master): Likewise.
* libgnat/s-finpri.adb (Attach_Object_To_Master): Likewise.
(Chain_Node_To_Master): Likewise.
(Finalize_Master): Likewise.
Eric Botcazou [Thu, 1 Feb 2024 13:38:47 +0000 (14:38 +0100)]
ada: Remove dynamic frame in System.Image_D and document it in System.Image_F
The former can easily be removed while the latter cannot.
gcc/ada/
* libgnat/s-imaged.ads (System.Image_D): Add Uns formal parameter.
* libgnat/s-imaged.adb: Add with clauses for System.Image_I,
System.Value_I_Spec and System.Value_U_Spec.
(Uns_Spec): New instance of System.Value_U_Spec.
(Int_Spec): New instance of System.Value_I_Spec.
(Image_I): New instance of System.Image_I.
(Set_Image_Integer): New renaming.
(Set_Image_Decimal): Replace 'Image with call to Set_Image_Integer.
* libgnat/s-imde32.ads (Uns32): New subtype.
(Impl): Pass Uns32 as second actual parameter to Image_D.
* libgnat/s-imde64.ads (Uns64): New subtype.
(Impl): Pass Uns64 as second actual parameter to Image_D.
* libgnat/s-imde128.ads (Uns128): New subtype.
(Impl): Pass Uns128 as second actual parameter to Image_D.
* libgnat/s-imagef.adb (Set_Image_Fixed): Document bounds for the
A, D and AF local constants.
Piotr Trojanek [Thu, 1 Feb 2024 12:15:27 +0000 (13:15 +0100)]
ada: Attributes Put_Image and Object_Size are defined by Ada 2022
Recognize references to attributes Put_Image and Object_Size as
language-defined in Ada 2022 and implementation-defined in earlier
versions of Ada. Other attributes listed in Ada 2022 RM, K.2 and
currently implemented in GNAT are correctly categorized.
This change only affects code with restriction
No_Implementation_Attributes.
Piotr Trojanek [Wed, 31 Jan 2024 14:32:22 +0000 (15:32 +0100)]
ada: Remove guards against traversal of empty list of aspects
When iterating over Aspect_Specifications, we can use First/Next
directly even if the Aspect_Specifications returns a No_List or
the list has no items.
Code cleanup.
gcc/ada/
* aspects.adb (Copy_Aspects): Style fix.
* contracts.adb (Analyze_Contracts): Style fix.
(Save_Global_References_In_Contract): Remove extra guards.
* par_sco.adb (Traverse_Aspects): Move guard to the caller and
make it consistent with Save_Global_References_In_Contract.
* sem_ch12.adb (Has_Contracts): Remove extra guards.
* sem_ch3.adb (Delayed_Aspect_Present, Get_Partial_View_Aspect,
Check_Duplicate_Aspects): Likewise.
* sem_disp.adb (Check_Dispatching_Operation): Likewise.
Bob Duff [Wed, 31 Jan 2024 14:30:06 +0000 (09:30 -0500)]
ada: Fix crash on Compile_Time_Warning in dead code
If a pragma Compile_Time_Warning triggers, and the pragma
is later removed because it is dead code, then the compiler
can return a bad exit code. This causes gprbuild to report
"*** compilation phase failed".
This is because Total_Errors_Detected, which is declared as Nat,
goes negative, causing Constraint_Error. In assertions-off mode,
the Constraint_Error is not detected, but the compiler nonetheless
reports a bad exit code.
This patch prevents that negative count.
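The general idea of such a guard can be pictured with a small sketch, written here in C rather than Ada. This is not the code in errout.adb: the struct, the function name and the clamping strategy are all assumptions made for illustration. It only shows a non-negative counter being reduced without ever dropping below zero.

```c
#include <assert.h>

/* Hypothetical stand-in for a counter that must stay non-negative,
   in the spirit of an Ada Nat. */
typedef struct {
    int total_errors;
} error_counts;

/* Subtract `removed` messages from the total, clamping at zero
   instead of letting the count go negative. */
void discount_removed_messages(error_counts *c, int removed)
{
    assert(removed >= 0);
    if (removed > c->total_errors)
        c->total_errors = 0;        /* would have gone negative */
    else
        c->total_errors -= removed;
}
```

With this shape, discounting more messages than were counted leaves the total at zero, so a later check of the total cannot see a negative value.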
gcc/ada/
* errout.adb (Output_Messages): Protect against the total going
negative.
Piotr Trojanek [Tue, 30 Jan 2024 15:12:16 +0000 (16:12 +0100)]
ada: Move splitting of pre/post aspect expressions to expansion
We split expressions of pre/post aspects into individual conjuncts and
emit messages with their precise location when they fail at runtime.
This was done when processing the aspects and caused inefficiency when
the original expression had to be recovered to detect uses of 'Old that
changed in Ada 2022. This patch moves splitting to expansion.
Conceptually, splitting in expansion is easy, but we need to take care
of locations for inherited pre/post contracts. Previously the location
string was generated while splitting the aspect into pragmas and then
it was manipulated when inheriting the pragmas. Now the location string
is built when installing the Pre'Class check and when splitting the
expression in expansion.
gcc/ada/
* exp_ch6.adb (Append_Message): Build the location string from
scratch and not rely on the one produced while splitting the
aspect into pragmas.
* exp_prag.adb (Expand_Pragma_Check): Split pre/post checks in
expansion.
* sem_ch13.adb (Analyze_Aspect_Specification): Don't split
pre/post expressions into conjuncts; don't add message with
location to the corresponding pragma.
* sem_prag.adb (Build_Pragma_Check_Equivalent): Inherited
pragmas no longer have messages that would need to be updated.
* sinput.adb (Build_Location_String): Adjust to keep previous
messages while using with inherited pragmas.