Steve Baird [Wed, 24 Apr 2024 02:10:34 +0000 (19:10 -0700)]
ada: Reject too-strict alignment specifications.
In some cases the compiler incorrectly concludes that a package body is
required for a package specification that includes the implicit declaration
of one or more inherited subprograms for an explicitly declared derived type.
Spurious error messages (e.g., "cannot generate code for file") may result.
gcc/ada/
* sem_ch7.adb
(Requires_Completion_In_Body): Modify the Comes_From_Source test so that
the implicit declaration of an inherited subprogram does not cause
an incorrect result of True.
Eric Botcazou [Tue, 23 Apr 2024 17:54:32 +0000 (19:54 +0200)]
ada: Fix fallout of previous finalization change
Now that Is_Finalizable_Transient only looks at the renamings coming from
nontransient objects serviced by transient scopes, it must find the object
ultimately renamed by them through a chain of renamings.
gcc/ada/
PR ada/114710
* exp_util.adb (Find_Renamed_Object): Recurse if the renamed object
is itself a renaming.
Javier Miranda [Tue, 23 Apr 2024 17:30:23 +0000 (17:30 +0000)]
ada: Missing support for 'Old with overloaded function
The compiler reports an error when the prefix of 'Old is
a call to an overloaded function that has no parameters.
gcc/ada/
* sem_attr.adb (Analyze_Attribute): Enhance support for
using 'Old with a prefix that references an overloaded
function that has no parameters; add missing support
for the use of 'Old within qualified expressions.
* sem_util.ads (Preanalyze_And_Resolve_Without_Errors):
New subprogram.
* sem_util.adb (Preanalyze_And_Resolve_Without_Errors):
New subprogram.
Eric Botcazou [Mon, 22 Apr 2024 14:52:14 +0000 (16:52 +0200)]
ada: Add support for symbolic backtraces with DLLs on Windows
This puts Windows on par with Linux as far as backtraces are concerned.
gcc/ada/
* libgnat/s-tsmona__linux.adb (Get): Move down descriptive comment.
* libgnat/s-tsmona__mingw.adb: Add with clause and use clause for
System.Storage_Elements.
(Get): Pass GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT in the call
to GetModuleHandleEx and remove the subsequent call to FreeLibrary.
Upon success, set Load_Addr to the base address of the module.
* libgnat/s-win32.ads (GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS): Use
shorter literal.
(GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT): New constant.
Eric Botcazou [Mon, 22 Apr 2024 07:35:44 +0000 (09:35 +0200)]
ada: Fix too late finalization of temporary object
The problem is that Is_Finalizable_Transient returns false when a transient
object is subject to a renaming by another transient object present in the
same transient scope, thus forcing its finalization to be deferred to the
enclosing scope. That's not necessary, as only renamings by nontransient
objects serviced by transient scopes need to be rejected by the predicate.
The change also removes now dead code in the finalization machinery.
gcc/ada/
PR ada/114710
* exp_ch7.adb (Build_Finalizer.Process_Declarations): Remove dead
code dealing with renamings.
* exp_util.ads (Is_Finalizable_Transient): Rename Rel_Node to N.
* exp_util.adb (Is_Finalizable_Transient): Likewise.
(Is_Aliased): Remove obsolete code dealing with EWA nodes and only
consider renamings present in N itself.
(Requires_Cleanup_Actions): Remove dead code dealing with renamings.
Javier Miranda [Mon, 22 Apr 2024 16:36:58 +0000 (16:36 +0000)]
ada: Missing dynamic predicate checks
The compiler does not generate dynamic predicate checks when
they are enabled for one type declaration and ignored for
other type declarations defined in the same scope.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Set the applicable policy
of a type declaration when its aspect Dynamic_Predicate is
analyzed.
* sem_prag.adb (Handle_Dynamic_Predicate_Check): New subprogram
that enables or ignores dynamic predicate checks depending on
whether dynamic checks are enabled in the context where the
associated type declaration is defined; used in the analysis
of pragma check. In addition, for pragma Predicate, do not
disable it when the aspect was internally built as part of
processing a dynamic predicate aspect.
Jonathan Wakely [Tue, 11 Jun 2024 10:08:12 +0000 (11:08 +0100)]
libstdc++: Improve diagnostics for invalid std::hash specializations [PR115420]
When using a key type without a valid std::hash specialization the
unordered containers give confusing diagnostics about the default
constructor being deleted. Add a static_assert that will fail for
disabled std::hash specializations (and for a subset of custom hash
functions).
libstdc++-v3/ChangeLog:
PR libstdc++/115420
* include/bits/hashtable.h (_Hashtable): Add static_assert to
check that hash function is copy constructible.
* testsuite/23_containers/unordered_map/115420.cc: New test.
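As an illustration of the situation this commit diagnoses, the sketch below contrasts a key type with no std::hash specialization against one that provides it. The types `Point` and `Unhashable`, the hash formula, and `lookup_demo` are hypothetical and are not taken from the new test.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Hypothetical key type used only for illustration.
struct Point {
    int x;
    int y;
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// Hypothetical type for which no std::hash specialization is provided.
struct Unhashable {};

// A disabled std::hash specialization is neither default constructible nor
// copy constructible, so it cannot serve as a container's hash function.
static_assert(!std::is_default_constructible<std::hash<Unhashable>>::value,
              "disabled specialization");
static_assert(!std::is_copy_constructible<std::hash<Unhashable>>::value,
              "disabled specialization");

// Providing an enabled specialization makes Point usable as a key.
namespace std {
template <>
struct hash<Point> {
    size_t operator()(const Point& p) const noexcept {
        return hash<int>{}(p.x) * 31u + hash<int>{}(p.y);
    }
};
}  // namespace std

static_assert(std::is_copy_constructible<std::hash<Point>>::value,
              "enabled specialization");

// Uses Point as a key now that std::hash<Point> is enabled.
std::string lookup_demo() {
    std::unordered_map<Point, std::string> names;
    names[Point{1, 2}] = "a";
    names[Point{3, 4}] = "b";
    assert(names.size() == 2);
    return names.at(Point{3, 4});
}
```

Declaring `std::unordered_map<Unhashable, int>` and using it is the kind of code that previously produced the confusing deleted-constructor diagnostics.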
Jonathan Wakely [Wed, 12 Jun 2024 15:47:17 +0000 (16:47 +0100)]
libstdc++: Fix unwanted #pragma messages from PSTL headers [PR113376]
When we rebased the PSTL on upstream, in r14-2109-g3162ca09dbdc2e, a
change to how _PSTL_USAGE_WARNINGS is set was missed out, but the change
to how it's tested was included. This means that the macro is always
defined, so testing it with #ifdef (instead of using #if to test its
value) doesn't work as intended.
Revert the test to use #if again, since that part of the upstream change
was unnecessary in the first place (the macro is always defined, so
there's no need to use #ifdef to avoid -Wundef warnings).
libstdc++-v3/ChangeLog:
PR libstdc++/113376
* include/pstl/pstl_config.h: Use #if instead of #ifdef to test
the _PSTL_USAGE_WARNINGS macro.
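The difference between the two preprocessor tests can be seen in a minimal sketch. `DEMO_USAGE_WARNINGS` is a hypothetical stand-in for a macro that is always defined, and the value 0 is an assumption chosen to mean "off".

```cpp
// Hypothetical configuration macro: always defined, 0 means "off".
#define DEMO_USAGE_WARNINGS 0

// #ifdef only asks whether the macro is defined, so this branch is taken
// even though the value is 0.
#ifdef DEMO_USAGE_WARNINGS
constexpr bool enabled_per_ifdef = true;
#else
constexpr bool enabled_per_ifdef = false;
#endif

// #if evaluates the value, so a macro defined as 0 leaves the feature off.
#if DEMO_USAGE_WARNINGS
constexpr bool enabled_per_if = true;
#else
constexpr bool enabled_per_if = false;
#endif

static_assert(enabled_per_ifdef, "#ifdef sees the macro as present");
static_assert(!enabled_per_if, "#if honours the value 0");
```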
This patch optimizes the compilation performance of
std::is_nothrow_invocable by dispatching to the new
__is_nothrow_invocable built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_nothrow_invocable): Use
__is_nothrow_invocable built-in trait.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
Handle the new error from __is_nothrow_invocable.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc:
Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
This patch optimizes the compilation performance of std::is_invocable
by dispatching to the new __is_invocable built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_invocable): Use __is_invocable
built-in trait.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: Handle
the new error from __is_invocable.
* testsuite/20_util/is_invocable/incomplete_neg.cc: Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
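For readers unfamiliar with the two traits touched by these patches, here is a short sketch of what they report from the user's side. The functions `may_throw` and `never_throws` are hypothetical and exist only to have something to query; the built-in traits themselves are not used directly.

```cpp
#include <type_traits>

// Hypothetical callables; declarations are enough for the traits.
int may_throw(int);
int never_throws(int) noexcept;

// is_invocable asks whether a call with the given argument types is well-formed.
constexpr bool call_ok =
    std::is_invocable_v<decltype(&may_throw), int>;
constexpr bool wrong_arity =
    std::is_invocable_v<decltype(&may_throw), int, int>;
constexpr bool wrong_type =
    std::is_invocable_v<decltype(&may_throw), const char*>;

// is_nothrow_invocable additionally requires the call to be known not to throw.
constexpr bool nothrow_plain =
    std::is_nothrow_invocable_v<decltype(&may_throw), int>;
constexpr bool nothrow_noexcept =
    std::is_nothrow_invocable_v<decltype(&never_throws), int>;

static_assert(call_ok && !wrong_arity && !wrong_type);
static_assert(!nothrow_plain && nothrow_noexcept);
```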
The testcase extracts one arm_neon.h vector from a pair (one subreg)
and then reinterprets the result as an SVE vector (another subreg).
Each subreg makes sense individually, but we can't fold them together
into a single subreg: it's 32 bytes -> 16 bytes -> 16*N bytes,
but the interpretation of 32 bytes -> 16*N bytes depends on
whether N==1 or N>1.
Since the second subreg makes sense individually, simplify_subreg
should bail out rather than ICE on it. simplify_gen_subreg will
then do the same (because it already checks validate_subreg).
This leaves simplify_gen_subreg returning null, requiring the
caller to take appropriate action.
I think this is relatively likely to occur elsewhere, so the patch
adds a helper for forcing a subreg, allowing a temporary pseudo to
be created where necessary.
I'll follow up by using force_subreg in more places. This patch
is intended to be a minimal backportable fix for the PR.
gcc/
PR target/115464
* simplify-rtx.cc (simplify_context::simplify_subreg): Don't try
to fold two subregs together if their relationship isn't known
at compile time.
* explow.h (force_subreg): Declare.
* explow.cc (force_subreg): New function.
* config/aarch64/aarch64-sve-builtins-base.cc
(svset_neonq_impl::expand): Use it instead of simplify_gen_subreg.
gcc/testsuite/
PR target/115464
* gcc.target/aarch64/sve/acle/general/pr115464.c: New test.
The vec_extract pattern uses ZVFHMIN as the mode iterator for the
VLS modes (V_VLS), but it expands to the pred_extract_first pattern,
which uses ZVFH as the mode iterator for the VLS modes (V_VLSF).
The mismatch results in an ICE similar to the one below:
This patch fixes the issue by aligning the mode iterator
restriction with ZVFH.
The following tests passed for this patch:
1. The rv64gcv full regression test.
2. The rv64gcv build with glibc.
PR target/115456
gcc/ChangeLog:
* config/riscv/autovec.md: Take ZVFH mode iterator instead of
the ZVFHMIN for the alignment.
* config/riscv/vector-iterators.md: Add 2 new iterators
V_VLS_ZVFH and VLS_ZVFH.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr115456-1.c: New test.
Hongyu Wang [Thu, 9 May 2024 02:12:16 +0000 (10:12 +0800)]
[APX CCMP] Use ctestcc when comparing to const 0
For CTEST, we don't have conditional AND so there's no optimization
opportunity to write a new ctest pattern. Emit ctest when ccmp compares
against const 0, which saves bytes.
gcc/ChangeLog:
* config/i386/i386.md (@ccmp<mode>): Add new alternative
<r>,C and adjust output templates. Also adjust UNSPEC mode
to CCmode.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-ccmp-1.c: Adjust output to scan ctest.
* gcc.target/i386/apx-ccmp-2.c: Adjust some condition to
compare with 0.
Gerald Pfeifer [Thu, 13 Jun 2024 07:49:29 +0000 (09:49 +0200)]
doc: Streamline requirements on the build compiler
No need to talk about potential implementation bugs in older versions
than what we require. And no need to talk about building GCC 3.3 and
earlier at this point.
gcc:
PR other/69374
* doc/install.texi (Prerequisites): Simplify note on the C++
compiler required. Drop requirements for versions of GCC prior
to 3.4. Fix grammar.
Richard Biener [Mon, 10 Jun 2024 13:31:35 +0000 (15:31 +0200)]
Improve code generation of strided SLP loads
This avoids falling back to elementwise accesses for strided SLP
loads when the group size is not a multiple of the vector element
size. Instead we can use a smaller vector or integer type for the load.
For stores we can do the same, though restrictions on the stores we handle
and the fact that store-merging covers up for them make this mostly
effective for cost modeling. This shows for gcc.target/i386/vect-strided-3.c,
which we now vectorize with V4SI vectors rather than just V2SI ones.
For all of this there's still the opportunity to use non-uniform
accesses, say for a 6-element group with a VF of two do
V4SI, { V2SI, V2SI }, V4SI. But that's for a possible followup.
* tree-vect-stmts.cc (get_group_load_store_type): Consistently
use VMAT_STRIDED_SLP for strided SLP accesses and not
VMAT_ELEMENTWISE.
(vectorizable_store): Adjust VMAT_STRIDED_SLP handling to
allow not only half-size but also smaller accesses.
(vectorizable_load): Likewise.
Richard Biener [Fri, 7 Jun 2024 12:47:12 +0000 (14:47 +0200)]
tree-optimization/115385 - handle more gaps with peeling of a single iteration
The following makes peeling of a single scalar iteration handle more
gaps, including non-power-of-two cases. This can be done by rounding
up the remaining access to the next power-of-two which ensures that
the next scalar iteration will pick at least the number of excess
elements we access.
I've added a correctness testcase and one x86 specific scanning for
the optimization.
PR tree-optimization/115385
* tree-vect-stmts.cc (get_group_load_store_type): Peeling
of a single scalar iteration is sufficient if we can narrow
the access to the next power of two of the bits in the last
access.
(vectorizable_load): Ensure that the last access is narrowed.
* gcc.dg/vect/pr115385.c: New testcase.
* gcc.target/i386/vect-pr115385.c: Likewise.
Richard Biener [Fri, 7 Jun 2024 09:29:05 +0000 (11:29 +0200)]
tree-optimization/114107 - avoid peeling for gaps in more cases
The following refactors the code to detect necessary peeling for
gaps, in particular the PR103116 case when there is no gap but
the group size is smaller than the vector size. The testcase in
PR114107 shows we fail to SLP
for (int i=0; i<n; i++)
for (int k=0; k<4; k++)
data[4*i+k] *= factor[i];
because peeling one scalar iteration isn't enough to cover a gap
of 3 elements of factor[i]. But the code detecting this is placed
after the logic that detects cases we handle properly already as
we'd code generate { factor[i], 0., 0., 0. } for V4DFmode vectorization
already. In fact the check to detect when peeling a single iteration
isn't enough seems improperly guarded as it should apply to all cases.
I'm not sure we correctly handle VMAT_CONTIGUOUS_REVERSE but I
checked that VMAT_STRIDED_SLP and VMAT_ELEMENTWISE correctly avoid
touching excess elements.
With this change we can use SLP for the above testcase and the
PR103116 testcases no longer require an epilogue on x86-64. It
might be different on other targets, so I made those testcases
runtime-FAIL only instead of relying on dump scanning, which there's
currently no easy way to properly constrain.
PR tree-optimization/114107
PR tree-optimization/110445
* tree-vect-stmts.cc (get_group_load_store_type): Refactor
contiguous access case. Make sure peeling for gap constraints
is always tested and consistently relaxed when we know we can
avoid touching excess elements during code generation. Also
rewrite the check to be poly-int aware.
* gcc.dg/vect/pr114107.c: New testcase.
* gcc.dg/vect/pr103116-1.c: Adjust.
* gcc.dg/vect/pr103116-2.c: Likewise.
Peter Bergner [Thu, 13 Jun 2024 02:05:34 +0000 (21:05 -0500)]
rs6000: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]
Jeff's commit r15-831-g05daf617ea22e1 changed the instruction we expected
for this test case into an equivalent instruction. Modify the test case
so it will accept any of three instructions we could get depending on the
options used.
2024-06-12 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR testsuite/115262
* gcc.target/powerpc/pr66144-3.c (dg-do): Compile for all targets.
(dg-options): Add -fno-unroll-loops and remove -mvsx.
(scan-assembler): Change from this...
(scan-assembler-times): ...to this. Tweak regex to accept multiple
allowable instructions.
YunQiang Su [Mon, 10 Jun 2024 06:31:12 +0000 (14:31 +0800)]
MIPS: Use FPU-enabled tune for mips32/mips64/mips64r2/mips64r3/mips64r5
Currently, the default tune value of mips32 is PROCESSOR_4KC, and
the default tune value of mips64/mips64r2/mips64r3/mips64r5 is
PROCESSOR_5KC. PROCESSOR_4KC and PROCESSOR_5KC are both FPU-less.
Let's use PROCESSOR_24KF1_1 for mips32, and PROCESSOR_5KF for mips64/
mips64r2/mips64r3/mips64r5.
We found this problem while trying to fix gcc.target/mips/movcc-3.c.
gcc:
* config/mips/mips-cpus.def: Use PROCESSOR_24KF1_1 for mips32;
Use PROCESSOR_5KF for mips64/mips64r2/mips64r3/mips64r5.
YunQiang Su [Sat, 8 Jun 2024 03:31:19 +0000 (11:31 +0800)]
MIPS: Use signaling fcmp instructions for LT/LE/LTGT
LT/LE: c.lt.fmt/c.le.fmt on pre-R6 and cmp.lt.fmt/cmp.le.fmt on R6 have
different semantics:
c.lt.fmt signals for all NaNs, including qNaN;
cmp.lt.fmt signals only for sNaN, not for qNaN;
cmp.slt.fmt has the same semantics as c.lt.fmt;
lt/le in RTL signal for qNaN.
In `s<code>_<SCALARF:mode>_using_<FPCC:mode>`, the RTL operations
`lt`/`le` are converted to c/cmp's lt/le, which is correct for C.cond.fmt
but not for CMP.cond.fmt. Let's convert them to slt/sle if ISA_HAS_CCF.
For LTGT, which signals for qNaN, `sne` of R6 has the same semantics, while
pre-R6 has only the inverse one, `ngl`. Thus for RTL we have to use `uneq` as
the operator, and introduce a new CC mode, CCEmode, to mark it as signaling.
This patch can fix
gcc.dg/torture/pr91323.c for pre-R6;
gcc.dg/torture/builtin-iseqsig-* for R6.
gcc:
* config/mips/mips-modes.def: New CC_MODE CCE.
* config/mips/mips-protos.h (mips_output_compare): New function.
* config/mips/mips.cc (mips_allocate_fcc): Set CCEmode count=1.
(mips_emit_compare): Use CCEmode for LTGT/LT/LE for pre-R6.
(mips_output_compare): New function. Convert lt/le to slt/sle
for R6; convert ueq to ngl for CCEmode.
(mips_hard_regno_mode_ok_uncached): Mention CCEmode.
* config/mips/mips.h: Mention CCEmode for LOAD_EXTEND_OP.
* config/mips/mips.md (FPCC): Add CCE.
(define_mode_iterator MOVECC): Mention CCE.
(define_mode_attr reg): Add CCE with "z".
(define_mode_attr fpcmp): Add CCE with "c".
(define_code_attr fcond): ltgt should use sne instead of ne.
(s<code>_<SCALARF:mode>_using_<FPCC:mode>): Call mips_output_compare.
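The distinction between signaling and quiet comparisons that this commit is about is also visible at the source level. The sketch below is target-independent and uses only standard <cfenv> and <cmath> facilities; the helper names are hypothetical. The relational operator `<` corresponds to the signaling comparison (RTL `lt`), while `std::isless` is the quiet one.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Returns true if the quiet comparison isless(a, b) raised FE_INVALID.
bool quiet_less_raises_invalid(double a, double b) {
    volatile double va = a, vb = b;  // keep the comparison at run time
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile bool result = std::isless(va, vb);
    (void)result;
    return std::fetestexcept(FE_INVALID) != 0;
}

// Returns true if the relational operator a < b raised FE_INVALID.
// On implementations that follow IEEE 754 here, this is expected to be true
// when an operand is a quiet NaN. It is not asserted below because some
// toolchains may emit a quiet compare instead.
bool ordered_less_raises_invalid(double a, double b) {
    volatile double va = a, vb = b;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile bool result = va < vb;
    (void)result;
    return std::fetestexcept(FE_INVALID) != 0;
}

void check_quiet_comparison() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    // A quiet comparison with a quiet NaN is false and raises nothing.
    assert(!std::isless(qnan, 1.0));
    assert(!quiet_less_raises_invalid(qnan, 1.0));
}
```

Comparing the results of the two helpers for a quiet NaN operand on a given target is one way to observe whether `<` is compiled as a signaling comparison there.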
Patrick Palka [Thu, 13 Jun 2024 00:05:05 +0000 (20:05 -0400)]
c++: visibility wrt concept-id as targ [PR115283]
Like with alias templates, it seems we don't maintain visibility flags
for concepts either, so min_vis_expr_r should ignore them for now.
Otherwise after r14-6789 we may incorrectly give a function template that
uses a concept-id in its signature internal linkage.
Alexandre Oliva [Wed, 12 Jun 2024 22:48:06 +0000 (19:48 -0300)]
[libstdc++] [testsuite] require cmath for c++23 cmath tests
Some c++23 tests fail on targets that don't satisfy dg-require-cmath,
because referenced math functions don't get declared in std. Add the
missing requirement.
Alexandre Oliva [Wed, 12 Jun 2024 22:48:04 +0000 (19:48 -0300)]
[libstdc++] [testsuite] xfail double-prec from_chars for float128_t
Tests involving float128_t were xfailed or otherwise worked around for
vxworks on aarch64. The same issue came up on rtems. This patch
adjusts them similarly.
for libstdc++-v3/ChangeLog
* testsuite/20_util/from_chars/8.cc: Skip float128_t testing
on aarch64-rtems*.
* testsuite/20_util/to_chars/float128_c++23.cc: Xfail run on
aarch64-rtems*.
Jason Merrill [Wed, 12 Jun 2024 12:06:47 +0000 (08:06 -0400)]
c++: repeated export using
A sample implementation of module std was breaking because the exports
included 'using std::operator&' twice. Since Nathaniel's r15-964 for
PR114867, the first using added an extra instance of each function that was
revealed/exported by that using, resulting in duplicates for
lookup_maybe_add to dedup. But if the duplicate is the first thing in the
list, lookup_add doesn't make an OVERLOAD, so trying to set OVL_USING_P
crashes. Fixed by using ovl_make in the case where we want to set the flag.
gcc/cp/ChangeLog:
* tree.cc (lookup_maybe_add): Use ovl_make when setting OVL_USING_P.
Jason Merrill [Wed, 12 Jun 2024 04:13:45 +0000 (00:13 -0400)]
c++: module std and exception_ptr
exception_ptr.h contains
namespace __exception_ptr
{
class exception_ptr;
}
using __exception_ptr::exception_ptr;
so when module std tries to 'export using std::exception_ptr', it names
another using-declaration rather than the class directly, so __exception_ptr
is never explicitly opened in module purview.
gcc/cp/ChangeLog:
* module.cc (depset::hash::add_binding_entity): Set
DECL_MODULE_PURVIEW_P instead of asserting.
David Malcolm [Wed, 12 Jun 2024 18:24:47 +0000 (14:24 -0400)]
pretty_printer: unbreak build on aarch64 [PR115465]
I missed this target-specific usage of pretty_printer::buffer when
making the fields private in r15-1209-gc5e3be456888aa; sorry.
gcc/ChangeLog:
PR bootstrap/115465
* config/aarch64/aarch64-early-ra.cc (early_ra::process_block):
Update for fields of pretty_printer becoming private in r15-1209-gc5e3be456888aa.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Patrick O'Neill [Tue, 11 Jun 2024 00:00:38 +0000 (17:00 -0700)]
RISC-V: Allow any temp register to be used in amo tests
We artificially restrict the temp registers to be a[0-9]+ when other
registers like t[0-9]+ are valid too. Update to make the regex
accept any register for the temp value.
Andrew Pinski [Tue, 11 Jun 2024 20:36:34 +0000 (20:36 +0000)]
aarch64: Use bitreverse rtl code instead of unspec [PR115176]
Bitreverse rtl code was added with r14-1586-g6160572f8d243c. So let's
use it instead of an unspec. This is just a small cleanup, but it also
fixes the rtx costs, which didn't handle vector modes correctly for the
UNSPEC.
This is part of the first step in adding the __builtin_bitreverse builtins,
but it is independent of them.
Bootstrapped and tested on aarch64-linux-gnu with no regressions.
gcc/ChangeLog:
PR target/115176
* config/aarch64/aarch64-simd.md (aarch64_rbit<mode><vczle><vczbe>): Use
bitreverse instead of unspec.
* config/aarch64/aarch64-sve-builtins-base.cc (svrbit): Convert over to using
rtx_code_function instead of unspec_based_function.
* config/aarch64/aarch64-sve.md: Update comment where RBIT is included.
* config/aarch64/aarch64.cc (aarch64_rtx_costs): Handle BITREVERSE like BSWAP.
Remove UNSPEC_RBIT support.
* config/aarch64/aarch64.md (unspec): Remove UNSPEC_RBIT.
(aarch64_rbit<mode>): Use bitreverse instead of unspec.
* config/aarch64/iterators.md (SVE_INT_UNARY): Add bitreverse.
(optab): Likewise.
(sve_int_op): Likewise.
(SVE_INT_UNARY): Remove UNSPEC_RBIT.
(optab): Likewise.
(sve_int_op): Likewise.
(min_elem_bits): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Wed, 12 Jun 2024 00:16:42 +0000 (17:16 -0700)]
match: Improve gimple_bitwise_equal_p and gimple_bitwise_inverted_equal_p for truncating casts [PR115449]
As mentioned by Jeff in r15-831-g05daf617ea22e1d818295ed2d037456937e23530, we don't handle
`(X | Y) & ~Y` -> `X & ~Y` at the gimple level when matching `~Y` with the `Y` part
involves types of different signedness (but the same precision). This
improves both gimple_bitwise_equal_p and gimple_bitwise_inverted_equal_p to
be able to say `(truncate)a` and `(truncate)a` are bitwise_equal and
that `~(truncate)a` and `(truncate)a` are bitwise_invert_equal.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/115449
gcc/ChangeLog:
* gimple-match-head.cc (gimple_maybe_truncate): New declaration.
(gimple_bitwise_equal_p): Match truncations that differ only
in types with the same precision.
(gimple_bitwise_inverted_equal_p): For matching after bit_not_with_nop
call gimple_bitwise_equal_p.
* match.pd (maybe_truncate): New match pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bitops-10.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
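The algebraic identity behind this change can be checked directly. The sketch below is a hypothetical source-level shape, not the contents of bitops-10.c: `Y` reaches the expression through two truncating casts of the same precision but different signedness, and the result still equals `X & ~Y`.

```cpp
#include <cstdint>

// (X | Y) & ~Y, where Y is truncated once to a signed and once to an
// unsigned 8-bit type.
constexpr std::uint8_t before_simplification(std::uint32_t x, std::uint32_t y) {
    const std::int8_t y_signed = static_cast<std::int8_t>(y);
    const std::uint8_t y_unsigned = static_cast<std::uint8_t>(y);
    return static_cast<std::uint8_t>(
        (static_cast<std::uint8_t>(x) | y_unsigned) & ~y_signed);
}

// X & ~Y on the truncated values.
constexpr std::uint8_t after_simplification(std::uint32_t x, std::uint32_t y) {
    return static_cast<std::uint8_t>(
        static_cast<std::uint8_t>(x) & ~static_cast<std::uint8_t>(y));
}

// 0xCD & ~0xF0 == 0x0D for the low byte.
static_assert(before_simplification(0xABCDu, 0xF0u) == 0x0D, "");
static_assert(after_simplification(0xABCDu, 0xF0u) == 0x0D, "");
static_assert(before_simplification(0xFFFFFFFFu, 0x12345678u) ==
                  after_simplification(0xFFFFFFFFu, 0x12345678u), "");
```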
Andi Kleen [Wed, 12 Jun 2024 13:59:37 +0000 (06:59 -0700)]
Move cexpr_stree tree string build into utility function
No semantics changes.
gcc/cp/ChangeLog:
* cp-tree.h (extract): Add new overload to return tree.
* parser.cc (cp_parser_asm_string_expression): Use tree extract.
* semantics.cc (cexpr_str::extract): Add new overload to return
tree.
The shift operations for dynamic_bitset fail to zero out words where the
non-zero bits were shifted to a completely different word.
For a right shift we don't need to sanitize the unused bits in the high
word, because we know they were already clear and a right shift doesn't
change that.
libstdc++-v3/ChangeLog:
PR libstdc++/115399
* include/tr2/dynamic_bitset (operator>>=): Remove redundant
call to _M_do_sanitize.
* include/tr2/dynamic_bitset.tcc (_M_do_left_shift): Zero out
low bits in words that should no longer be populated.
(_M_do_right_shift): Likewise for high bits.
* testsuite/tr2/dynamic_bitset/pr115399.cc: New test.
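To make the left-shift half of the fix concrete, here is a simplified model of shifting a multi-word bitset. It is a sketch under assumptions, not the libstdc++ implementation: `left_shift_words` and its word layout (least significant word first, 64-bit words) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model: words[0] holds the least significant 64 bits.
void left_shift_words(std::vector<std::uint64_t>& words, std::size_t shift) {
    constexpr std::size_t bits_per_word = 64;
    const std::size_t n = words.size();
    if (n == 0 || shift == 0)
        return;

    const std::size_t word_shift = shift / bits_per_word;
    const std::size_t offset = shift % bits_per_word;
    if (word_shift >= n) {
        std::fill(words.begin(), words.end(), std::uint64_t{0});
        return;
    }

    // Walk from the high end so each source word is read before it is
    // overwritten.
    for (std::size_t i = n - 1; i > word_shift; --i) {
        const std::uint64_t high = words[i - word_shift] << offset;
        const std::uint64_t low =
            offset ? words[i - word_shift - 1] >> (bits_per_word - offset) : 0;
        words[i] = high | low;
    }
    words[word_shift] = words[0] << offset;

    // Words below word_shift no longer hold any bits and must be cleared;
    // skipping this step leaves stale bits behind.
    std::fill(words.begin(),
              words.begin() + static_cast<std::ptrdiff_t>(word_shift),
              std::uint64_t{0});
}

void check_left_shift_words() {
    std::vector<std::uint64_t> w{0x1, 0x0};
    left_shift_words(w, 65);  // the set bit moves entirely into the next word
    assert(w[0] == 0x0);
    assert(w[1] == 0x2);
}
```

Without the final clearing step, the example in `check_left_shift_words` would leave `w[0] == 0x1`, which is the class of error the commit message describes.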
middle-end: Drop __builtin_prefetch calls in autovectorization [PR114061]
At present the autovectorizer fails to vectorize simple loops
involving calls to `__builtin_prefetch'. A simple example of such
loop is given below:
void foo(double * restrict a, double * restrict b, int n){
int i;
for(i=0; i<n; ++i){
a[i] = a[i] + b[i];
__builtin_prefetch(&(b[i+8]));
}
}
The failure stems from two issues:
1. Given that it is typically not possible to fully reason about a
function call due to the possibility of side effects, the
autovectorizer does not attempt to vectorize loops which make such
calls.
Given the memory reference passed to `__builtin_prefetch', in the
absence of assurances about its effect on the passed memory
location the compiler deems the function unsafe to vectorize,
marking it as clobbering memory in `vect_find_stmt_data_reference'.
This leads to the failure in autovectorization.
2. Notwithstanding the above issue, though the prefetch statement
would be classed as `vect_unused_in_scope', the induction variable that
is used in the address of the prefetch is the scalar loop's and not
the vector loop's IV. That is, it still uses `i' and not `vec_iv'
because the instruction wasn't vectorized, causing DCE to think the
value is live, such that we now have both the vector and scalar
induction variables actively used in the loop.
This patch addresses both of these:
1. About the issue regarding the memory clobber, data prefetch does
not generate faults if its address argument is invalid and does not
write to memory. Therefore, it does not alter the internal state
of the program or its control flow under any circumstance. As
such, it is reasonable that the function be marked as not affecting
memory contents.
To achieve this, we add the necessary logic to
`get_references_in_stmt' to ensure that builtin functions are given
the same treatment as internal functions. If the gimple call
is to a builtin function and its function code is
`BUILT_IN_PREFETCH', we mark `clobbers_memory' as false.
2. Finding precedent in the way clobber statements are handled,
whereby the vectorizer drops these from both the scalar and
vectorized versions of a given loop, we choose to drop prefetch
hints in a similar fashion. This seems appropriate given how
software prefetch hints are typically ignored by processors across
architectures, as they seldom lead to performance gain over their
hardware counterparts.
gcc/ChangeLog:
PR tree-optimization/114061
* tree-data-ref.cc (get_references_in_stmt): Set
`clobbers_memory' to false for __builtin_prefetch.
* tree-vect-loop.cc (vect_transform_loop): Drop all
__builtin_prefetch calls from loops.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-prefetch-drop.c: New test.
* gcc.target/aarch64/vect-prefetch-drop.c: Likewise.
David Malcolm [Wed, 12 Jun 2024 13:15:09 +0000 (09:15 -0400)]
pretty_printer: convert chunk_info into a class
No functional change intended.
gcc/cp/ChangeLog:
* error.cc (append_formatted_chunk): Move part of body into
chunk_info::append_formatted_chunk.
gcc/ChangeLog:
* dumpfile.cc (dump_pretty_printer::emit_items): Update for
changes to chunk_info.
* pretty-print.cc (chunk_info::append_formatted_chunk): New, based
on code in cp/error.cc's append_formatted_chunk.
(chunk_info::pop_from_output_buffer): New, based on code in
pp_output_formatted_text and dump_pretty_printer::emit_items.
(on_begin_quote): Convert to...
(chunk_info::on_begin_quote): ...this.
(on_end_quote): Convert to...
(chunk_info::on_end_quote): ...this.
(pretty_printer::format): Update for chunk_info becoming a class
and its fields gaining "m_" prefixes. Update for on_begin_quote
and on_end_quote moving to chunk_info.
(quoting_info::handle_phase_3): Update for changes to chunk_info.
(pp_output_formatted_text): Likewise. Move cleanup code to
chunk_info::pop_from_output_buffer.
* pretty-print.h (class output_buffer): New forward decl.
(class urlifier): New forward decl.
(struct chunk_info): Convert to...
(class chunk_info): ...this. Add friend class pretty_printer.
(chunk_info::get_args): New accessor.
(chunk_info::get_quoting_info): New accessor.
(chunk_info::append_formatted_chunk): New decl.
(chunk_info::pop_from_output_buffer): New decl.
(chunk_info::on_begin_quote): New decl.
(chunk_info::on_end_quote): New decl.
(chunk_info::prev): Rename to...
(chunk_info::m_prev): ...this.
(chunk_info::args): Rename to...
(chunk_info::m_args): ...this.
(output_buffer::cur_chunk_array): Drop "struct" from decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Xi Ruoyao [Sun, 9 Jun 2024 06:43:48 +0000 (14:43 +0800)]
LoongArch: Use bstrins for "value & (-1u << const)"
A move/bstrins pair is as fast as a (addi.w|lu12i.w|lu32i.d|lu52i.d)/and
pair, and twice as fast as a srli/slli pair. When the src reg and the dst
reg happen to be the same, the move instruction can be optimized away.
gcc/ChangeLog:
* config/loongarch/predicates.md (high_bitmask_operand): New
predicate.
* config/loongarch/constraints.md (Yy): New constraint.
* config/loongarch/loongarch.md (and<mode>3_align): New
define_insn_and_split.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/bstrins-1.c: New test.
* gcc.target/loongarch/bstrins-2.c: New test.
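At the source level, the expression in the commit title clears the low bits of a value. The sketch below shows that form next to the equivalent shift pair; the function names are hypothetical, and a 32-bit `unsigned int` is assumed.

```cpp
#include <cstdint>

// Clear the low `bits` bits using the mask form from the commit title.
// Requires bits < 32.
constexpr std::uint32_t clear_low_with_mask(std::uint32_t value, unsigned bits) {
    return value & (-1u << bits);
}

// The same result written as a right shift followed by a left shift.
constexpr std::uint32_t clear_low_with_shifts(std::uint32_t value, unsigned bits) {
    return (value >> bits) << bits;
}

static_assert(clear_low_with_mask(0x12345678u, 12) == 0x12345000u, "");
static_assert(clear_low_with_shifts(0x12345678u, 12) == 0x12345000u, "");
static_assert(clear_low_with_mask(0xFFFFFFFFu, 0) == 0xFFFFFFFFu, "");
```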
Xi Ruoyao [Wed, 12 Jun 2024 03:01:53 +0000 (11:01 +0800)]
LoongArch: Fix mode size comparison in loongarch_expand_conditional_move
We were comparing a mode size with word_mode, but word_mode is an enum
value thus this does not really make any sense. (Un)luckily E_DImode
happens to be 8 so this seemed to work, but let's make it correct so it
won't blow up when we add LA32 support or add another machine mode...
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_expand_conditional_move): Compare mode size with
UNITS_PER_WORD instead of word_mode.
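The class of bug described here, comparing a size in bytes with an enumerator value, can be sketched as follows. Every name, the `<` operator, and the numeric values are assumptions for illustration and are not GCC's definitions.

```cpp
// Hypothetical word size in bytes, standing in for UNITS_PER_WORD.
constexpr unsigned demo_units_per_word = 8;

// Buggy shape: the size in bytes is compared with the enumerator value of
// the word mode, which is only a position in a list of modes.
constexpr bool narrower_than_word_buggy(unsigned size_in_bytes,
                                        unsigned word_mode_enum_value) {
    return size_in_bytes < word_mode_enum_value;
}

// Fixed shape: the size in bytes is compared with the word size in bytes.
constexpr bool narrower_than_word_fixed(unsigned size_in_bytes) {
    return size_in_bytes < demo_units_per_word;
}

// With an enumerator value of 8, both versions agree by coincidence.
static_assert(narrower_than_word_buggy(4, 8) == narrower_than_word_fixed(4), "");

// If the enumeration were renumbered (3 is an arbitrary example value), the
// buggy version would give the wrong answer for a 4-byte mode.
static_assert(!narrower_than_word_buggy(4, 3), "");
static_assert(narrower_than_word_fixed(4), "");
```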
Libatomic: Clean up AArch64 `atomic_16.S' implementation file
At present, `atomic_16.S' groups different implementations of the
same functions together in the file. Therefore, as an example,
the LSE2 implementation of `load_16' follows on immediately from its
core implementation, as does the `store_16' LSE2 implementation.
Such architectural extension-dependent implementations are dependent
on ifunc support, such that they are guarded by the relevant
preprocessor macro, i.e. `#if HAVE_IFUNC'.
Having to apply these guards on a per-function basis adds unnecessary
clutter to the file and makes its maintenance more error-prone.
We therefore reorganize the layout of the file in such a way that all
core implementations needing no `#ifdef's are placed first, followed
by all ifunc-dependent implementations, which can all be guarded by a
single `#if HAVE_IFUNC', greatly reducing the overall number of
required `#ifdef' macros.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S: Reorganize functions in
file.
(HAVE_FEAT_LSE2): Delete.
Libatomic: Make ifunc selector behavior contingent on importing file
By querying previously-defined file-identifier macros, `host-config.h'
is able to get information about its environment and, based on this
information, select more appropriate function-specific ifunc
selectors. This reduces the number of unnecessary feature tests that
need to be carried out in order to find the best atomic implementation
for a function at run-time.
An immediate benefit of this is that we can further fine-tune the
architectural requirements for each atomic function without risk of
incurring the maintenance and runtime-performance penalties of having
to maintain an ifunc selector with a huge number of alternatives, most
of which are irrelevant for any particular function. Consequently,
for AArch64 targets, we relax the architectural requirements of
`compare_exchange_16', which now requires only LSE as opposed to the
newer LSE2.
The new flexibility provided by this approach also means that certain
functions can now be called directly, doing away with ifunc selectors
altogether when only a single implementation is available for them on a
given target. As per the macro expansion framework laid out in
`libatomic_i.h', such functions should have their names prefixed with
`__atomic_' as opposed to `libat_'. This is the same prefix applied
to function names when Libatomic is configured with
`--disable-gnu-indirect-function'.
To achieve this, these functions unconditionally apply the aliasing
rule that at present is conditionally applied only when libatomic is
built without ifunc support, which ensures that the default
`libat_##NAME' is accessible via the equivalent `__atomic_##NAME' too.
This is ensured by using the new `ENTRY_ALIASED' macro.
Finally, this means we are able to do away with a whole set of
function aliases that were needed until now, thus considerably
cleaning up the implementation.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S: Remove unnecessary
aliasing.
(LSE): New.
(ENTRY_ALIASED): Likewise.
* config/linux/aarch64/host-config.h (LSE_ATOP): New.
(LSE2_ATOP): Likewise.
(LSE128_ATOP): Likewise.
(IFUNC_COND_1): Make its definition conditional on above 3
macros.
(IFUNC_NCOND): Likewise.
In order to facilitate the fine-tuning of how `libatomic_i.h' and
`host-config.h' headers are used by different atomic functions, we
define distinct identifier macros for each file which, in implementing
atomic operations, imports these headers.
The idea is that different parts of these headers could then be
conditionally defined depending on the macros set by the file that
`#include'd them.
Given how it is possible that some file names are generic enough that
using them as-is for macro names (e.g. flag.c -> FLAG) may potentially
lead to name clashes with other macros, all file names first have LAT_
prepended to them such that, for example, flag.c is assigned the
LAT_FLAG macro.
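As a rough illustration of the scheme described above, the following
single-file sketch shows a file identifying itself and a shared header
section reacting to it. Only `LAT_FLAG' and `flag.c' come from this
message; `lat_uses_flag_section' is a hypothetical name, and the header
part is inlined here rather than living in `libatomic_i.h' or
`host-config.h'.

```c
/* Sketch only: in libatomic this macro would be set by flag.c before it
   includes the shared headers.  */
#define LAT_FLAG 1

/* What a shared header could do with it (hypothetical contents).  */
#if defined(LAT_FLAG)
/* Section compiled only for the file that identified itself as flag.c.  */
static int lat_uses_flag_section(void) { return 1; }
#else
/* Generic section for every other includer.  */
static int lat_uses_flag_section(void) { return 0; }
#endif
```

Because the selection happens in the preprocessor, each includer only
ever sees the section meant for it.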
Libatomic: AArch64: Convert all lse128 assembly to .inst directives
Given the lack of support for the LSE128 instructions in all but the
most up-to-date version of Binutils (2.42), having the build-time
test for assembler support for these instructions often leads to the
building of Libatomic without support for LSE128-dependent atomic
function implementations. This ultimately leads to different people
having different versions of Libatomic on their machines, depending on
which assembler was available at compilation time.
Furthermore, the conditional inclusion of these atomic function
implementations predicated on assembler support leads to a series of
`#if HAVE_FEAT_LSE128' guards scattered throughout the codebase and
the need for a series of aliases when the feature flag evaluates
to false. The preprocessor macro guards, together with the
conditional aliasing, lead to code that is cumbersome to understand
and maintain.
Both of the issues highlighted above will only get worse with the
coming support for LRCPC3 atomics which under the current scheme will
also require build-time checks.
Consequently, a better option for both consistency across builds and
code cleanliness is to resort to the `.inst' directive. By
replacing all novel assembly instructions with their hexadecimal
encodings within `.inst's, we ensure that the Libatomic code is
considerably cleaner and that all machines build the same binary,
irrespective of the binutils version available at compile time.
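To make the `.inst' technique concrete, here is a minimal sketch that
emits an instruction from its encoding instead of its mnemonic. It
deliberately uses NOP (encoding 0xd503201f) rather than any LSE128
instruction, since the encodings used in `atomic_16.S' are not
reproduced in this log; `emit_nop_by_encoding' is a hypothetical name.

```c
/* Emit an AArch64 instruction from its 32-bit encoding.  The assembler
   only sees a number, so it needs no knowledge of the mnemonic.
   0xd503201f is the encoding of NOP.  */
static inline int emit_nop_by_encoding(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
  __asm__ volatile(".inst 0xd503201f");
  return 1;   /* Instruction was emitted via .inst.  */
#else
  return 0;   /* Not an AArch64 build: nothing is emitted.  */
#endif
}
```

The same source assembles identically with old and new assemblers,
which is the property the patch relies on.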
This patch therefore removes all configure checks for LSE128-support
in the assembler and all the guards and aliases that were associated
with `HAVE_FEAT_LSE128'.
libatomic/ChangeLog:
* acinclude.m4 (LIBAT_TEST_FEAT_AARCH64_LSE128): Delete.
* auto-config.h.in (HAVE_FEAT_LSE128): Likewise.
* config/linux/aarch64/atomic_16.S: Replace all LSE128
instructions with equivalent `.inst' directives.
(HAVE_FEAT_LSE128): Remove all references.
* configure: Regenerate.
* configure.ac: Remove call to LIBAT_TEST_FEAT_AARCH64_LSE128.
arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]
Properly handle zero and sign extension for Armv8-M.baseline as
Cortex-M23 can have the security extension active.
Currently, there is an internal compiler error on Cortex-M23 for the
epilog processing of sign extension.
This patch addresses CVE-2024-0151 for Armv8-M.baseline.
Pan Li [Tue, 11 Jun 2024 13:39:43 +0000 (21:39 +0800)]
Widening-Mul: Take gsi after_labels instead of start_bb for gcall insertion
We inserted the gcall of .SAT_ADD at gsi_start_bb to avoid the
SSA def-after-use ICE. Unfortunately, this can still ICE when the
first stmt of the bb is a label, because we cannot insert the gcall
before a label. Thus, we take gsi_after_labels to locate the first
'real' stmt, before which the gcall is inserted.
The existing test cases pr115387-1.c and pr115387-2.c cover this change.
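For context, .SAT_ADD is the internal function that replaces a
recognized saturating-addition idiom. A minimal sketch of one unsigned
form of that idiom in plain C follows; `sat_add_u64' is a hypothetical
name and the code is not taken from the test cases named above.

```c
#include <stdint.h>

/* Unsigned saturating add: returns x + y, or UINT64_MAX on overflow.  */
static uint64_t sat_add_u64(uint64_t x, uint64_t y)
{
  uint64_t sum = x + y;               /* Wraps modulo 2^64 on overflow.  */
  return sum | -(uint64_t)(sum < x);  /* All-ones mask if it wrapped.  */
}
```

The saturated value 18446744073709551615 is UINT64_MAX, the constant
that typically appears as the overflow arm of such a PHI.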
The below test suites are passed for this patch.
* The rv64gcv full regression test with newlib.
* The x86 regression test.
* The x86 bootstrap test.
gcc/ChangeLog:
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Leverage gsi_after_labels instead of gsi_start_bb to skip the
leading labels of bb.
Alexandre Oliva [Wed, 12 Jun 2024 03:16:27 +0000 (00:16 -0300)]
[tree-prof] skip if errors were seen [PR113681]
ipa_tree_profile asserts that the symtab is in IPA_SSA state, but we
don't reach that state and ICE if e.g. ipa-strub passes report errors.
Skip this pass if errors were seen.
for gcc/ChangeLog
PR tree-optimization/113681
* tree-profile.cc (pass_ipa_tree_profile::gate): Skip if
seen_errors.
Alexandre Oliva [Wed, 12 Jun 2024 03:16:24 +0000 (00:16 -0300)]
[testsuite] [arm] test board cflags in multilib.exp
multilib.exp tests for multilib-altering flags in a board's
multilib_flags and skips the test, but if such flags appear in the
board's cflags, with the same distorting effects on tested multilibs,
we fail to skip the test.
Extend the skipping logic to board's cflags as well.
for gcc/testsuite/ChangeLog
* gcc.target/arm/multilib.exp: Skip based on board cflags too.
Alexandre Oliva [Wed, 12 Jun 2024 03:16:22 +0000 (00:16 -0300)]
map packed field type to unpacked for debug info
We create a distinct type for each field in a packed record with a
gnu_size, but there is no distinct debug information for them. Use
the same unpacked type for debug information.
for gcc/ada/ChangeLog
* gcc-interface/decl.cc (gnat_to_gnu_field): Use unpacked type
as the debug type for packed fields.
for gcc/testsuite/ChangeLog
* gnat.dg/bias1.adb: Count occurrences of -7.*DW_AT_GNU_bias.
Alexandre Oliva [Wed, 12 Jun 2024 03:16:20 +0000 (00:16 -0300)]
[libstdc++] drop workaround for clang<=7
In response to a request in the review of the patch that introduced
_GLIBCXX_CLANG, this patch removes from std/variant an obsolete
workaround for clang 7 and earlier.
liuhongt [Tue, 11 Jun 2024 02:23:27 +0000 (10:23 +0800)]
Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P
The patch adds an extra check to make sure the component of CONST_VECTOR
is CONST_INT_P.
gcc/ChangeLog:
PR target/115384
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Only do the simplification of (AND (ASHIFTRT A imm) mask)
to (LSHIFTRT A imm) when the component of const_vector is
CONST_INT_P.
Joseph Myers [Tue, 11 Jun 2024 23:00:04 +0000 (23:00 +0000)]
c: Add -std=c2y, -std=gnu2y, -Wc23-c2y-compat, C2Y _Generic with type operand
The first new C2Y feature, _Generic where the controlling operand is a
type name rather than an expression (as defined in N3260), was voted
into C2Y today. (In particular, this form of _Generic allows
distinguishing qualified and unqualified versions of a type.) This
feature also includes allowing the generic associations to specify
incomplete and function types.
Add this feature to GCC, along with the -std=c2y, -std=gnu2y and
-Wc23-c2y-compat options to control when and how it is diagnosed. As
usual, the feature is allowed by default in older standards modes,
subject to diagnosis with -pedantic, -pedantic-errors or
-Wc23-c2y-compat.
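The motivation can be seen with the existing expression form, sketched
below in plain C11 so that it builds without this patch. The macro and
function names are hypothetical. The controlling expression undergoes
lvalue conversion, so the qualifier of a `const int' object is lost
before matching.

```c
/* C11 form: the controlling operand is an expression.  Lvalue
   conversion drops top-level qualifiers, so a `const int' object
   matches the `int' association.  */
#define IS_PLAIN_INT_EXPR(e) _Generic((e), int: 1, default: 0)

static int const_int_object_matches_int(void)
{
  const int ci = 0;
  return IS_PLAIN_INT_EXPR(ci);   /* Selects `int', so this is 1.  */
}

/* With the C2Y type-operand form one would instead expect
     _Generic(const int, int: 1, const int: 2)
   to select the `const int' association.  It is left in a comment
   because it needs a compiler that implements N3260.  */
```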
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c/
* c-errors.cc (pedwarn_c23): New.
* c-parser.cc (disable_extension_diagnostics)
(restore_extension_diagnostics): Save and restore
warn_c23_c2y_compat.
(c_parser_generic_selection): Handle type name as controlling
operand. Allow incomplete and function types subject to
pedwarn_c23 calls.
* c-tree.h (pedwarn_c23): New.
gcc/testsuite/
* gcc.dg/c23-generic-1.c, gcc.dg/c23-generic-2.c,
gcc.dg/c23-generic-3.c, gcc.dg/c23-generic-4.c,
gcc.dg/c2y-generic-1.c, gcc.dg/c2y-generic-2.c,
gcc.dg/c2y-generic-3.c, gcc.dg/gnu2y-generic-1.c: New tests.
* gcc.dg/c23-tag-6.c: Use -pedantic-errors.
libcpp/
* include/cpplib.h (CLK_GNUC2Y, CLK_STDC2Y): New.
* init.cc (lang_defaults): Add GNUC2Y and STDC2Y entries.
(cpp_init_builtins): Define __STDC_VERSION__ to 202500L for GNUC2Y
and STDC2Y.
Robin Dapp [Fri, 7 Jun 2024 12:36:41 +0000 (14:36 +0200)]
vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382].
Currently we discard the cond-op mask when the loop is fully masked
which causes wrong code in
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
when compiled with
-O3 -march=cascadelake --param vect-partial-vector-usage=2.
This patch ANDs both masks.
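A scalar model of the fix, as a sketch only: each lane of an in-order
reduction contributes when both the loop mask and the cond_op mask
enable it. The function name and the byte-per-lane mask layout (each
byte 0 or 1) are assumptions made for illustration, not the vectorizer's
representation.

```c
#include <stddef.h>

/* In-order (fold-left) reduction over `n' lanes.  A lane is added only
   if both masks enable it; mask bytes are expected to be 0 or 1.  */
static double fold_left_masked(double acc, const double *v,
                               const unsigned char *loop_mask,
                               const unsigned char *cond_mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (loop_mask[i] & cond_mask[i])   /* The AND of both masks.  */
      acc += v[i];
  return acc;
}
```

Dropping `cond_mask' from the test would add lanes the source condition
excluded, which is the wrong-code pattern described above.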
gcc/ChangeLog:
PR tree-optimization/115382
* tree-vect-loop.cc (vectorize_fold_left_reduction): Use
prepare_vec_mask.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Remove static of prepare_vec_mask.
* tree-vectorizer.h (prepare_vec_mask): Export.
Patrick O'Neill [Thu, 8 Feb 2024 00:30:30 +0000 (16:30 -0800)]
RISC-V: Add Zalrsc amo-op patterns
All amo<op> patterns can be represented with lrsc sequences.
Add these patterns as a fallback when Zaamo is not enabled.
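LR/SC itself cannot be written in portable C, but the shape of the
fallback can be sketched with a C11 compare-exchange retry loop, which
is an analogous construct rather than the sequence these patterns emit.
`fetch_add_retry' is a hypothetical name.

```c
#include <stdatomic.h>

/* Fetch-and-add built from a retry loop instead of a single AMO.  */
static int fetch_add_retry(_Atomic int *p, int v)
{
  int old = atomic_load_explicit(p, memory_order_relaxed);
  /* On failure `old' is refreshed with the current value and we retry,
     much like a failed store-conditional branches back to the load.  */
  while (!atomic_compare_exchange_weak_explicit(p, &old, old + v,
                                                memory_order_seq_cst,
                                                memory_order_relaxed))
    ;
  return old;   /* Value observed before the update.  */
}
```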
gcc/ChangeLog:
* config/riscv/sync.md (atomic_<atomic_optab><mode>): New expand pattern.
(amo_atomic_<atomic_optab><mode>): Rename amo pattern.
(atomic_fetch_<atomic_optab><mode>): New lrsc sequence pattern.
(lrsc_atomic_<atomic_optab><mode>): New expand pattern.
(amo_atomic_fetch_<atomic_optab><mode>): Rename amo pattern.
(lrsc_atomic_fetch_<atomic_optab><mode>): New lrsc sequence pattern.
(atomic_exchange<mode>): New expand pattern.
(amo_atomic_exchange<mode>): Rename amo pattern.
(lrsc_atomic_exchange<mode>): New lrsc sequence pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/amo-zaamo-preferred-over-zalrsc.c: New test.
* gcc.target/riscv/amo-zalrsc-amo-add-1.c: New test.
* gcc.target/riscv/amo-zalrsc-amo-add-2.c: New test.
* gcc.target/riscv/amo-zalrsc-amo-add-3.c: New test.
* gcc.target/riscv/amo-zalrsc-amo-add-4.c: New test.
* gcc.target/riscv/amo-zalrsc-amo-add-5.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Edwin Lu [Thu, 8 Feb 2024 00:30:28 +0000 (16:30 -0800)]
RISC-V: Add basic Zaamo and Zalrsc support
There is a proposal to split the A extension into two parts: Zaamo and Zalrsc.
This patch adds basic support by making the A extension imply Zaamo and
Zalrsc.
* common/config/riscv/riscv-common.cc: Add Zaamo and Zalrsc.
* config/riscv/arch-canonicalize: Make A imply Zaamo and Zalrsc.
* config/riscv/riscv.opt: Add Zaamo and Zalrsc.
* config/riscv/sync.md: Convert TARGET_ATOMIC to TARGET_ZAAMO and
TARGET_ZALRSC.
Pengxuan Zheng [Sat, 8 Jun 2024 02:52:00 +0000 (19:52 -0700)]
aarch64: Add vector floating point trunc pattern
This patch is a follow-up of r15-1079-g230d62a2cdd16c to add vector floating
point trunc pattern for V2DF->V2SF and V4SF->V4HF conversions by renaming the
existing aarch64_float_truncate_lo_<mode><vczle><vczbe> pattern to the standard
optab one, i.e., trunc<Vwide><mode>2<vczle><vczbe>. This allows the vectorizer
to vectorize certain floating point narrowing operations for the aarch64 target.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc (VAR1): Remap float_truncate_lo_
builtin codes to standard optab ones.
* config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_<mode><vczle><vczbe>):
Rename to...
(trunc<Vwide><mode>2<vczle><vczbe>): ... This.
Andi Kleen [Wed, 24 Jan 2024 12:27:13 +0000 (04:27 -0800)]
C++: Support constexpr strings for asm statements
Some programming styles use a lot of inline assembler, and it is common
to use very complex preprocessor macros to generate the assembler
strings for the asm statements. In C++ there would be a typesafe alternative
using templates and constexpr to generate the assembler strings, but
unfortunately the asm statement requires plain string literals, so this
doesn't work.
This patch modifies the C++ parser to accept strings generated by
constexpr instead of just plain strings. This requires new syntax
because e.g. asm("..." : "r" (expr)) would be ambiguous with a function
call. I chose () to make it unique. For example now you can write
The constexpr strings are allowed for the asm template, the
constraints and the clobbers (everywhere the current asm accepts a string).
This version allows the same constexprs as C++26 static_assert,
following Jakub's suggestion.
The drawback of this scheme is that the constexpr doesn't have
full control over the input/output/clobber lists, but that can be
usually handled with a switch statement. One could imagine
more flexible ways to handle that, for example supporting constexpr
vectors for the clobber list, or similar. But even without
that it is already useful.
* parser.cc (cp_parser_asm_string_expression): New function
to handle constexpr strings for asm.
(cp_parser_asm_definition): Use cp_parser_asm_string_expression.
(cp_parser_yield_expression): Ditto.
(cp_parser_asm_specification_opt): Ditto.
(cp_parser_asm_operand_list): Ditto.
(cp_parser_asm_clobber_list): Ditto.
gcc/ChangeLog:
* doc/extend.texi: Document constexpr asm.
gcc/testsuite/ChangeLog:
* g++.dg/ext/asm11.C: Adjust to new error message.
* g++.dg/ext/asm9.C: Ditto.
* g++.dg/parse/asm1.C: Ditto.
* g++.dg/parse/asm2.C: Ditto.
* g++.dg/parse/asm3.C: Ditto.
* g++.dg/cpp1z/constexpr-asm-1.C: New test.
* g++.dg/cpp1z/constexpr-asm-2.C: New test.
* g++.dg/cpp1z/constexpr-asm-3.C: New test.
Jonathan Wakely [Mon, 10 Jun 2024 20:10:29 +0000 (21:10 +0100)]
libstdc++: Add test for chrono::leap_seconds ostream insertion
Also add a comment to the three-way comparison operator for
chrono::leap_seconds, noting the deviation from the spec (which is
functionally equivalent). What we implement is the originally proposed
resolution to LWG 3383, which should compile slightly more efficiently
than the final accepted resolution.
libstdc++-v3/ChangeLog:
* include/std/chrono (leap_seconds): Add comment.
* testsuite/std/time/leap_seconds/io.cc: New test.
Arthur Cohen [Fri, 12 Apr 2024 11:52:18 +0000 (13:52 +0200)]
rust: Do not link with libdl and libpthread unconditionally
ChangeLog:
* Makefile.tpl: Add CRAB1_LIBS variable.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check if -ldl and -lpthread are needed, and if so, add
them to CRAB1_LIBS.
gcc/rust/ChangeLog:
* Make-lang.in: Remove overzealous LIBS = -ldl -lpthread line, link
crab1 against CRAB1_LIBS.
Gaius Mulley [Tue, 11 Jun 2024 09:01:12 +0000 (10:01 +0100)]
PR modula2/114529 Avoid ODR violations in bootstrap translated sources
This patch changes the bootstrap tool mc to avoid redefining any data
types and therefore preventing ODR violations. All exported opaque type
usages are implemented as void *. Local opaque type usages (static
functions containing opaque type parameters) use the full declaration.
mc casts usages between void * and full opaque type as necessary.
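The split between exported and local usages can be sketched as follows.
This is a hand-written C illustration of the idea, not output of mc
(which generates .cc files); every name in it is hypothetical.

```c
#include <stdlib.h>

/* Exported view, as a generated .h could expose it: an opaque handle.
   No struct is redeclared here, so there is nothing to clash with.  */
typedef void *Counter;

/* Full declaration, visible only inside the implementation.  */
struct counter_rec { int value; };

/* Local helper: static, so it can use the full type directly.  */
static void bump(struct counter_rec *c) { c->value++; }

/* Exported functions take and return void *, casting at the boundary.  */
Counter counter_new(void)
{
  struct counter_rec *c = malloc(sizeof *c);
  if (c != NULL)
    c->value = 0;
  return (Counter) c;
}

int counter_next(Counter h)
{
  struct counter_rec *c = (struct counter_rec *) h;
  bump(c);
  return c->value;
}

void counter_free(Counter h) { free(h); }
```

Since every translation unit sees the same `void *' typedef, no two
units can disagree about the layout of the type.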
The --extended-opaque option in mc has been disabled, as this generated
ODR violations. The extended-opaque option inlined all declarations in
the translated implementation module. As this is no longer used there
is now a .h file for each .def file and a .cc file for every .mod file.
This results in more Makefile rules for the ppg tool in Make-maintainer.in.
gcc/m2/ChangeLog:
PR modula2/114529
* Make-lang.in (MC_EXTENDED_OPAQUE): Assign to nothing.
* Make-maintainer.in (mc-basetest): New rule.
(mc-devel-basetest): New rule.
(mc-clean): Remove mc.
(m2/mc-boot-gen/$(SRC_PREFIX)decl.cc): Replace --extended-opaque
with $(EXTENDED_OPAQUE).
(PG-SRC): Move define before generic rules.
(PGE-DEF): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)%.h): New rule.
(m2/gm2-ppg-boot/$(SRC_PREFIX)libc.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)mcrts.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)UnixArgs.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)Selective.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)termios.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)SysExceptions.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)ldtoa.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)wrapc.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)SYSTEM.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)errno.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)M2RTS.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)SymbolKey.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)SymbolKey.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)NameKey.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)NameKey.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)Lists.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)Lists.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)Output.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)bnflex.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)bnflex.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)RTco.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)RTentity.h): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)RTco.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)RTentity.o): Ditto.
(m2/gm2-ppg-boot/$(SRC_PREFIX)%.o): Ditto.
(m2/ppg$(exeext)): Ditto.
(m2/gm2-ppg-boot/main.o): Ditto.
(m2/gm2-auto): Ditto.
(c-family/m2pp.o): Ditto.
(BUILD-BOOT-PG-H): Correct macro definition.
(m2/gm2-pg-boot/$(SRC_PREFIX)%.h): New rule.
(m2/gm2-pg-boot/$(SRC_PREFIX)NameKey.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)NameKey.o): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)Lists.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)Lists.o): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)Output.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)Output.o): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)bnflex.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)bnflex.o): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)RTco.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)RTentity.h): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)RTco.o): Ditto.
(m2/gm2-pg-boot/$(SRC_PREFIX)RTentity.o): Ditto.
(BUILD-BOOT-PGE-H): Correct macro definition.
(m2/gm2-pge-boot/$(SRC_PREFIX)SymbolKey.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)SymbolKey.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)NameKey.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)NameKey.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)Lists.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)Lists.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)Output.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)Output.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)bnflex.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)bnflex.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)RTco.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)RTentity.h): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)RTco.o): Ditto.
(m2/gm2-pge-boot/$(SRC_PREFIX)RTentity.o): Ditto.
(mc-basetest): Ditto.
(mc-devel-basetest): Ditto.
* gm2-compiler/M2Options.def (SetM2Dump): Add BOOLEAN return.
* gm2-compiler/M2Quads.def (BuildAlignment): Add tokno parameter.
(BuildBitLength): Ditto.
* gm2-compiler/P3Build.bnf (ByteAlignment): Move tokpos assignment
to the start of the block.
* gm2-compiler/PCBuild.bnf (ConstSetOrQualidentOrFunction): Ditto.
(SetOrDesignatorOrFunction): Ditto.
* gm2-compiler/PHBuild.bnf (ConstSetOrQualidentOrFunction): Ditto.
(SetOrDesignatorOrFunction): Ditto.
(ByteAlignment): Ditto.
* gm2-libs/dtoa.def (dtoa): Change mode to INTEGER.
* gm2-libs/ldtoa.def (ldtoa): Ditto.
* mc-boot-ch/GSYSTEM.c (_M2_SYSTEM_init): Correct parameter list.
(_M2_SYSTEM_fini): Ditto.
* mc-boot-ch/Gdtoa.cc (dtoa_calcsign): Return bool.
(dtoa_dtoa): Return void * and use bool in the fifth parameter.
(_M2_dtoa_init): Correct parameter list.
(_M2_dtoa_fini): Ditto.
* mc-boot-ch/Gerrno.cc (_M2_errno_init): Ditto.
(_M2_errno_fini): Ditto.
* mc-boot-ch/Gldtoa.cc (dtoa_calcsign): Return bool.
(ldtoa_ldtoa): Return void * and use bool in the fifth parameter.
(_M2_ldtoa_init): Correct parameter list.
(_M2_ldtoa_fini): Ditto.
* mc-boot-ch/Glibc.c (tracedb_zresult): New function.
(libc_read): Return size_t and use size_t in parameter three.
(libc_write): Return size_t and use size_t in parameter three.
(libc_printf): Add const to the format specifier.
Change declaration of c to use const.
(libc_snprintf): Add const to the format specifier.
Change declaration of c to use const.
(libc_malloc): Use size_t.
(libc_memcpy): Ditto.
* mc-boot/GASCII.cc: Regenerate.
* mc-boot/GArgs.cc: Ditto.
* mc-boot/GAssertion.cc: Ditto.
* mc-boot/GBreak.cc: Ditto.
* mc-boot/GCmdArgs.cc: Ditto.
* mc-boot/GDebug.cc: Ditto.
* mc-boot/GDynamicStrings.cc: Ditto.
* mc-boot/GEnvironment.cc: Ditto.
* mc-boot/GFIO.cc: Ditto.
* mc-boot/GFormatStrings.cc: Ditto.
* mc-boot/GFpuIO.cc: Ditto.
* mc-boot/GIO.cc: Ditto.
* mc-boot/GIndexing.cc: Ditto.
* mc-boot/GM2Dependent.cc: Ditto.
* mc-boot/GM2EXCEPTION.cc: Ditto.
* mc-boot/GM2RTS.cc: Ditto.
* mc-boot/GMemUtils.cc: Ditto.
* mc-boot/GNumberIO.cc: Ditto.
* mc-boot/GPushBackInput.cc: Ditto.
* mc-boot/GRTExceptions.cc: Ditto.
* mc-boot/GRTint.cc: Ditto.
* mc-boot/GSArgs.cc: Ditto.
* mc-boot/GSFIO.cc: Ditto.
* mc-boot/GStdIO.cc: Ditto.
* mc-boot/GStorage.cc: Ditto.
* mc-boot/GStrCase.cc: Ditto.
* mc-boot/GStrIO.cc: Ditto.
* mc-boot/GStrLib.cc: Ditto.
* mc-boot/GStringConvert.cc: Ditto.
* mc-boot/GSysStorage.cc: Ditto.
* mc-boot/GTimeString.cc: Ditto.
* mc-boot/Galists.cc: Ditto.
* mc-boot/Gdecl.cc: Ditto.
* mc-boot/Gkeyc.cc: Ditto.
* mc-boot/Glists.cc: Ditto.
* mc-boot/GmcComment.cc: Ditto.
* mc-boot/GmcComp.cc: Ditto.
* mc-boot/GmcDebug.cc: Ditto.
* mc-boot/GmcError.cc: Ditto.
* mc-boot/GmcFileName.cc: Ditto.
* mc-boot/GmcLexBuf.cc: Ditto.
* mc-boot/GmcMetaError.cc: Ditto.
* mc-boot/GmcOptions.cc: Ditto.
* mc-boot/GmcPreprocess.cc: Ditto.
* mc-boot/GmcPretty.cc: Ditto.
* mc-boot/GmcPrintf.cc: Ditto.
* mc-boot/GmcQuiet.cc: Ditto.
* mc-boot/GmcReserved.cc: Ditto.
* mc-boot/GmcSearch.cc: Ditto.
* mc-boot/GmcStack.cc: Ditto.
* mc-boot/GmcStream.cc: Ditto.
* mc-boot/Gmcp1.cc: Ditto.
* mc-boot/Gmcp2.cc: Ditto.
* mc-boot/Gmcp3.cc: Ditto.
* mc-boot/Gmcp4.cc: Ditto.
* mc-boot/Gmcp5.cc: Ditto.
* mc-boot/GnameKey.cc: Ditto.
* mc-boot/GsymbolKey.cc: Ditto.
* mc-boot/Gvarargs.cc: Ditto.
* mc-boot/Gwlists.cc: Ditto.
* mc-boot/Gdecl.h: Ditto.
* mc-boot/Gldtoa.h: Ditto.
* mc-boot/Glibc.h: Ditto.
* mc/decl.def (putTypeOpaque): New procedure.
(isTypeOpaque): New procedure function.
* mc/decl.mod (debugOpaque): New constant.
(nodeT): New enumeration field opaquecast.
(node): New record field opaquecastF.
(opaqueCastState): New record.
(opaquecastT): New record.
(typeT): New field isOpaque.
(varT): New field opaqueState.
(arrayT): Ditto.
(varparamT): Ditto.
(paramT): Ditto.
(pointerT): Ditto.
(recordfieldT): Ditto.
(componentrefT): Ditto.
(pointerrefT): Ditto.
(arrayrefT): Ditto.
(procedureT): Ditto.
(proctypeT): Ditto.
(makeType): Initialize field isOpaque.
(makeTypeImp): Initialize field isOpaque.
(putVar): Call initNodeOpaqueCastState.
(putReturnType): Ditto.
(makeProcType): Ditto.
(putProcTypeReturn): Ditto.
(makeVarParameter): Ditto.
(makeNonVarParameter): Ditto.
(makeFuncCall): Ditto.
(putTypeOpaque): New procedure.
(isTypeOpaque): New procedure function.
(doMakeComponentRef): Call initNodeOpaqueCastState.
(makePointerRef): Call initNodeOpaqueCastState.
(doGetFuncType): Call initNodeOpaqueCastState.
(doBinary): Add FALSE parameter to doExprCup.
(doDeRefC): Rewrite.
(doComponentRefC): Call flushOpaque.
(doPointerRefC): Call flushOpaque.
(doArrayRefC): Add const_cast for unbounded array.
(doExprCup): Rewrite.
(doTypeAliasC): Remove.
(isDeclType): New procedure function.
(doEnumerationC): New procedure function.
(doParamTypeEmit): Ditto.
(doParamTypeNameModifier): Ditto.
(initOpaqueCastState): Ditto.
(initNodeOpaqueCastState): Ditto.
(setOpaqueCastState): Ditto.
(setNodeOpaqueVoidStar): Ditto.
(nodeUsesOpaque): Ditto.
(getNodeOpaqueVoidStar): Ditto.
(getOpaqueFlushNecessary): Ditto.
(makeOpaqueCast): Ditto.
(flushOpaque): Ditto.
(castOpaque): Ditto.
(isTypeOpaqueDefImp): Ditto.
(isParamVoidStar): Ditto.
(isRefVoidStar): Ditto.
(isReturnVoidStar): Ditto.
(isVarVoidStar): Ditto.
(initNodeOpaqueState): Ditto.
(assignNodeOpaqueCastState): Ditto.
(assignNodeOpaqueCastFalse): Ditto.
(dumpOpaqueState): Ditto.
(doProcTypeC): Rewrite.
(isDeclInImp): New procedure function.
(doTypeNameModifier): Ditto.
(doTypeC): Emit typedef if enum is declared in this module.
(doCompletePartialProcType): Rewrite.
(outputCompletePartialProcType): New procedure.
(doOpaqueModifier): Ditto.
(doVarC): Ditto.
(doProcedureHeadingC): Add opaque modifier to return type if
necessary.
(doReturnC): Cast opaque type for return if necessary.
(forceCastOpaque): New procedure.
(forceReintCastOpaque): New procedure.
(doUnConstCastUnbounded): New procedure.
(doAssignmentC): Cast opaque for both des and expr if necessary.
(doAdrExprC): Use static_cast for void * casting.
(doFuncVarParam): New procedure.
(doFuncParamC): Rewrite.
(doAdrArgC): Rewrite.
(getFunction): New procedure function.
(stop): Rename to ...
(localstop): ... this.
(dupFunccall): Call assignNodeOpaqueCastState.
(dbg): Rewrite.
(addDone): Rewrite.
(addDoneDef): Do not add opaque types to the doneQ when declared in
the definition module.
* mc/mc.flex (openSource): Return bool.
(_M2_mcflex_init): Correct parameter list.
(_M2_mcflex_fini): Ditto.
* mc/mcComment.h (stdbool.h): Include.
(mcComment_initComment): Change unsigned int to bool.
* mc/mcOptions.mod (handleOption): Disable --extended-opaque
and issue warning.
* mc/mcp1.bnf (DefTypeDeclaration): Call putTypeOpaque.
gcc/testsuite/ChangeLog:
PR modula2/114529
* gm2/base-lang/pass/SYSTEM.def: New test.
* gm2/base-lang/pass/base-lang-test.sh: New test.
* gm2/base-lang/pass/globalproctype.def: New test.
* gm2/base-lang/pass/globalproctype.mod: New test.
* gm2/base-lang/pass/globalvar.def: New test.
* gm2/base-lang/pass/globalvar.mod: New test.
* gm2/base-lang/pass/globalvarassign.def: New test.
* gm2/base-lang/pass/globalvarassign.mod: New test.
* gm2/base-lang/pass/localproctype.def: New test.
* gm2/base-lang/pass/localproctype.mod: New test.
* gm2/base-lang/pass/localvar.def: New test.
* gm2/base-lang/pass/localvar.mod: New test.
* gm2/base-lang/pass/localvarassign.def: New test.
* gm2/base-lang/pass/localvarassign.mod: New test.
* gm2/base-lang/pass/opaquefield.def: New test.
* gm2/base-lang/pass/opaquefield.mod: New test.
* gm2/base-lang/pass/opaquenew.def: New test.
* gm2/base-lang/pass/opaquenew.mod: New test.
* gm2/base-lang/pass/opaqueparam.def: New test.
* gm2/base-lang/pass/opaqueparam.mod: New test.
* gm2/base-lang/pass/opaquestr.def: New test.
* gm2/base-lang/pass/opaqueuse.def: New test.
* gm2/base-lang/pass/opaqueuse.mod: New test.
* gm2/base-lang/pass/opaqueusestr.def: New test.
* gm2/base-lang/pass/opaqueusestr.mod: New test.
* gm2/base-lang/pass/opaquevariant.def: New test.
* gm2/base-lang/pass/opaquevariant.mod: New test.
* gm2/base-lang/pass/opaquevarparam.def: New test.
* gm2/base-lang/pass/opaquevarparam.mod: New test.
* gm2/base-lang/pass/simplelist.def: New test.
* gm2/base-lang/pass/simplelist.mod: New test.
* gm2/base-lang/pass/simplelistiter.def: New test.
* gm2/base-lang/pass/simplelistiter.mod: New test.
* gm2/base-lang/pass/simpleopaque.def: New test.
* gm2/base-lang/pass/simpleopaque.mod: New test.
* gm2/base-lang/pass/straddress.def: New test.
* gm2/base-lang/pass/straddress.mod: New test.
* gm2/base-lang/pass/straddressexport.def: New test.
* gm2/base-lang/pass/straddressexport.mod: New test.
* gm2/base-lang/pass/unboundedarray.def: New test.
* gm2/base-lang/pass/unboundedarray.mod: New test.
Roger Sayle [Tue, 11 Jun 2024 08:31:34 +0000 (09:31 +0100)]
i386: PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool.
This patch fixes PR target/115397, a recent regression caused by my
ternlog patch that results in an ICE (building numpy) with -m32 -fPIC.
The problem is that ix86_broadcast_from_constant, which calls
get_pool_constant, doesn't handle the UNSPEC_GOTOFF that's created by
calling validize_mem when using -fPIC on i686. The logic here is a bit
convoluted (and my future patches will clean some of this up), but the
simplest fix is to call ix86_broadcast_from_constant between the call
to force_const_mem and the call to validize_mem.
Perhaps a better solution might be to call targetm.delegitimize_address
from the middle-end's get_pool_constant, but ultimately the best approach
would be to not place things in the constant pool if we don't need to.
My plans to move (broadcast) constant handling from expand to split1
should simplify this.
2024-06-11 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/115397
* config/i386/i386-expand.cc (ix86_expand_ternlog): Move call to
ix86_broadcast_from_constant before call to validize_mem, but after
call to force_const_mem.
gcc/testsuite/ChangeLog
PR target/115397
* gcc.target/i386/pr115397.c: New test case.
After this patch:
...
vsetvli a5,a3,e64,m1,ta,ma
slli a4,a5,3
vle64.v v1,0(a1)
vle64.v v2,0(a2)
vssubu.vv v1,v1,v2
vse64.v v1,0(a0)
...
The below test suites are passed for this patch.
* The rv64gcv full regression test.
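The vssubu.vv instruction in the listing above performs an unsigned
saturating subtract per element. A scalar sketch of that operation, and
of a loop that applies it element by element, is given below; the
function names are hypothetical and the code is not copied from
sat_arith.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned saturating subtract: x - y, clamped to 0 when y > x.  */
static uint64_t sat_sub_u64(uint64_t x, uint64_t y)
{
  return (x - y) & -(uint64_t)(x >= y);   /* Mask is 0 on underflow.  */
}

/* Element-wise form over two arrays of `n' elements.  */
static void vec_sat_sub_u64(uint64_t *out, const uint64_t *a,
                            const uint64_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = sat_sub_u64(a[i], b[i]);
}
```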
gcc/ChangeLog:
* config/riscv/autovec.md (ussub<mode>3): Add new pattern impl
for the unsigned vector modes.
* config/riscv/riscv-protos.h (expand_vec_ussub): Add new func
decl to expand .SAT_SUB for vector mode.
* config/riscv/riscv-v.cc (emit_vec_saddu): Add new func impl
to expand .SAT_SUB for vector mode.
(emit_vec_binary_alu): Add new helper func to emit binary alu.
(expand_vec_ussub): Leverage above helper func.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add helper macros for test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-4.c: New test.
Jeff Law [Tue, 11 Jun 2024 04:39:40 +0000 (22:39 -0600)]
[committed] [RISC-V] Drop dead round_32 test
This test is no longer useful. It doesn't test what it was originally intended
to test and there's really no way to recover it sanely.
We agreed in the patchwork meeting last week that if we want to test Zfa that
we'll write a new test for that. Similarly if we want to do deeper testing of
the non-Zfa sequences in this space that we'd write new tests for those as well
(execution tests in particular).
Andrew MacLeod [Wed, 5 Jun 2024 19:12:27 +0000 (15:12 -0400)]
Move array_bounds warnings into a separate pass.
Array bounds checking is currently tied to VRP. This causes issues with
using alternate VRP algorithms as well as experimenting with moving
the location of the warnings later. This moves it to its own pass
and cleans up the vrp_pass object.
* gimple-array-bounds.cc (array_bounds_checker::array_bounds_checker):
Always use current range_query.
(pass_data_array_bounds): New.
(pass_array_bounds): New.
(make_pass_array_bounds): New.
* gimple-array-bounds.h (array_bounds_checker): Adjust prototype.
* passes.def (pass_array_bounds): New. Add after VRP1.
* timevar.def (TV_TREE_ARRAY_BOUNDS): New timevar.
* tree-pass.h (make_pass_array_bounds): Add prototype.
* tree-vrp.cc (execute_ranger_vrp): Remove warning param and do
not invoke array bounds warning pass.
(pass_vrp::pass_vrp): Adjust params.
(pass_vrp::close): Adjust parameters.
(pass_vrp::warn_array_bounds_p): Remove.
(make_pass_vrp): Remove warning param.
(make_pass_early_vrp): Remove warning param.
(make_pass_fast_vrp): Remove warning param.
Raphael Zinsly [Mon, 10 Jun 2024 20:16:16 +0000 (14:16 -0600)]
[to-be-committed] [RISC-V] Use bext for extracting a bit into a SImode object
bext is defined as (src >> n) & 1. With that formulation, particularly the
"&1" means the result is implicitly zero extended. So we can safely use it on
SI objects for rv64 without the need to do any explicit extension.
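The zero-extension argument can be checked with a small C model.
`bext32' is a hypothetical helper, not the new pattern, and the index
is masked to 0..31 only to keep the C shift well defined.

```c
#include <stdint.h>

/* Model of bext on a 32-bit value: extract bit `n' of `src'.  */
static uint32_t bext32(uint32_t src, unsigned n)
{
  return (src >> (n & 31)) & 1u;   /* Result is always 0 or 1.  */
}

/* Widening the result to 64 bits cannot set any upper bit, so no
   explicit extension step is needed.  */
static uint64_t bext32_widened(uint32_t src, unsigned n)
{
  return (uint64_t) bext32(src, n);
}
```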
This patch adds the obvious pattern and a few testcases. I think one of the
tests is derived from coremark, the other two from spec2017.
This has churned through Ventana's CI system repeatedly since it was first
written. Assuming pre-commit CI doesn't complain, I'll commit it on Raphael's
behalf later today or Monday.
gcc/
* config/riscv/bitmanip.md (*bextdisi): New pattern.
Pan Li [Mon, 10 Jun 2024 20:13:38 +0000 (14:13 -0600)]
[PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match
When the PHI handling for COND_EXPR is enabled, we need to insert the gcall
to replace the PHI node. Unfortunately, I made the mistake of inserting
the gcall before the last stmt of the bb. See the gimple below: the PHI
is located at no.1 but we insert the gcall (aka no.9) at the end of
the bb. Then the use of _9 in no.2 will have no def and will trigger
an ICE in verify_ssa.
1. # _9 = PHI <_3(4), 18446744073709551615(3)> // The PHI node to be deleted.
2. prephitmp_36 = (char *) _9;
3. buf.write_base = string_13(D);
4. buf.write_ptr = string_13(D);
5. buf.write_end = prephitmp_36;
6. buf.written = 0;
7. buf.mode = 3;
8. _7 = buf.write_end;
9. _9 = .SAT_ADD (string.0_2, maxlen_15(D)); // gcall inserted at the end of the bb by mistake
This patch instead inserts the gcall at the start of the bb, which ensures
that every possible use of the PHI result has a def.
After this patch the above gimple will be:
The below test suites are passed for this patch:
* The rv64gcv full regression test with newlib.
* The rv64gcv build with glibc.
* The x86 regression test with newlib.
* The x86 bootstrap test with newlib.
PR target/115387
gcc/ChangeLog:
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children): Take
the gsi of start_bb instead of last_bb.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr115387-1.c: New test.
* gcc.target/riscv/pr115387-2.c: New test.