Unfortunately this led to numerous g++ testsuite failures on Solaris,
tracked as PR analyzer/111475.
Almost all of the failures are due to standard library differences where
including a C standard library header in C++ (e.g. <stdlib.h>) leads to the
plain symbols referencing the "std::" symbols via "using" declarations,
whereas I had written the code expecting them to use symbols in the root
namespace.
The analyzer has special-case handling of many functions by name.
This patch generalizes such handling to also match against functions
in "std::" for all of the cases I found in the testsuite (via manual
inspection of the preprocessed test cases against Solaris headers).
This fixes cases where the analyzer was failing to "know about" the
behavior of such functions.
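As a rough illustration of the idea (not the analyzer's actual code, which
works on tree decls rather than strings), name-based matching can be
generalized with an opt-in flag. The type "callee", the function
"is_known_call_p" and the set "known_names" are hypothetical names made up
for this sketch; only the notion of a "look_in_std" parameter comes from
the ChangeLog below.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a called function: its enclosing namespace
// (empty for the root namespace) plus its unqualified name.
struct callee
{
  std::string ns;
  std::string name;
};

// Illustrative subset of names that get special-case handling.
const std::set<std::string> known_names
  = {"malloc", "free", "realloc", "signal"};

// Return true if C names a known function.  Root-namespace matches are
// always accepted; a match inside "std" is accepted only when
// LOOK_IN_STD is true.
bool
is_known_call_p (const callee &c, bool look_in_std)
{
  if (known_names.count (c.name) == 0)
    return false;
  if (c.ns.empty ())
    return true;
  return look_in_std && c.ns == "std";
}
```

With the flag off, a call that resolves to std::malloc is not recognized,
which corresponds to the Solaris failures described above.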
Other such failures are due to "std::" prefixes appearing in names of
functions in the output, leading to mismatches against expected output.
The patch adds regexes to some cases, and moves some other cases back
from c-c++-common to gcc.dg where the dg-multiline syntax isn't
expressive enough.
Various "fd-*.c" failures relate to Solaris's socket-handling functions
not being marked with "noexcept", where due to PR analyzer/97111 we
mishandle the exception-handling edges in the CFG, leading to leak
false positives. The patch works around this by adding -fno-exceptions
to these cases, pending a proper fix for PR analyzer/97111.
gcc/analyzer/ChangeLog:
PR analyzer/111475
* analyzer.cc (is_special_named_call_p): Add "look_in_std" param.
(is_std_function_p): Make non-static.
* analyzer.h (is_special_named_call_p): Add optional "look_in_std"
param.
(is_std_function_p): New decl.
* engine.cc (stmt_requires_new_enode_p): Look for both "signal"
and "std::signal".
* kf.cc (register_known_functions): Add various "std::" copies
of the known functions.
* known-function-manager.cc
(known_function_manager::~known_function_manager): Clean up
m_std_ns_map_id_to_kf.
(known_function_manager::add_std_ns): New.
(known_function_manager::get_match): Also look for known "std::"
functions.
(known_function_manager::get_by_identifier_in_std_ns): New.
* known-function-manager.h
(known_function_manager::add_std_ns): New decl.
(known_function_manager::get_by_identifier_in_std_ns): New decl.
(known_function_manager::m_std_ns_map_id_to_kf): New field.
* sm-file.cc (register_known_file_functions): Add various "std::"
copies of the known functions.
* sm-malloc.cc (malloc_state_machine::on_stmt): Handle
"std::realloc".
* sm-signal.cc (signal_unsafe_p): Consider "std::" copies of the
functions as also being async-signal-unsafe.
(signal_state_machine::on_stmt): Consider "std::signal".
gcc/testsuite/ChangeLog:
PR analyzer/111475
* c-c++-common/analyzer/fd-glibc-byte-stream-socket.c: Add
-fno-exceptions for now.
* c-c++-common/analyzer/fd-manpage-getaddrinfo-client.c: Likewise.
* c-c++-common/analyzer/fd-mappage-getaddrinfo-server.c: Rename to...
* c-c++-common/analyzer/fd-manpage-getaddrinfo-server.c: ...this, and
add -fno-exceptions for now.
* c-c++-common/analyzer/fd-socket-meaning.c: Add -fno-exceptions
for now.
* c-c++-common/analyzer/fd-symbolic-socket.c: Likewise.
* c-c++-common/analyzer/flexible-array-member-1.c: Use regexp to
handle C vs C++ differences in spelling of function name, which
could have a "std::" prefix on some targets.
* c-c++-common/analyzer/pr106539.c: Likewise.
* c-c++-common/analyzer/malloc-ipa-8-unchecked.c: Move back to...
* gcc.dg/analyzer/malloc-ipa-8-unchecked.c: ...here, dropping
attempt to generalize output for C vs C++.
* c-c++-common/analyzer/signal-4a.c: Move back to...
* gcc.dg/analyzer/signal-4a.c: ...here, dropping attempt to
generalize output for C vs C++.
* c-c++-common/analyzer/signal-4b.c: Move back to...
* gcc.dg/analyzer/signal-4b.c: ...here, dropping attempt to
generalize output for C vs C++.
This patch fixes libgm2/libm2iso/wraptime.cc:InitTM so that
it does not always return NULL. The incorrect autoconf macro
was used (inside InitTM) and the function short circuited
to return NULL. The fix is to use HAVE_SYS_TIME_H and use
AC_HEADER_TIME in libgm2/configure.ac.
libgm2/ChangeLog:
PR modula2/115276
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use AC_HEADER_TIME.
* libm2iso/wraptime.cc (InitTM): Check HAVE_SYS_TIME_H
before using struct tm to obtain the size.
gcc/testsuite/ChangeLog:
PR modula2/115276
* gm2/isolib/run/pass/testinittm.mod: New test.
Gaius Mulley [Wed, 20 Nov 2024 00:17:23 +0000 (00:17 +0000)]
[PATCH] modula2: use groups in the type resolver of the bootstrap tool mc
This patch introduces groups to maintain the lists used when resolving
types in the bootstrap tool mc. The groups and type resolver are very
similar to those used in cc1gm2. Specifically the resolver uses the group
to detect any change to any element in any list within a group. This is
much cleaner and safer than the previous list length comparisons.
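A minimal sketch of why whole-group comparison is safer than comparing
list lengths, written in C++ rather than Modula-2. The struct "group" and
the helpers "equal_group" and "same_lengths" are hypothetical models, not
the code in mc/decl.mod, and the integer elements stand in for declaration
nodes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a group: the three work lists kept together.
struct group
{
  std::vector<int> todo_q;
  std::vector<int> partial_q;
  std::vector<int> done_q;
};

// Two groups are equal only if every list matches element by element.
bool
equal_group (const group &a, const group &b)
{
  return a.todo_q == b.todo_q
	 && a.partial_q == b.partial_q
	 && a.done_q == b.done_q;
}

// The weaker check: only the list lengths are compared.
bool
same_lengths (const group &a, const group &b)
{
  return a.todo_q.size () == b.todo_q.size ()
	 && a.partial_q.size () == b.partial_q.size ()
	 && a.done_q.size () == b.done_q.size ();
}
```

A resolver pass that replaces one element with another leaves every length
unchanged, so same_lengths reports "no change" while equal_group on a
snapshot taken before the pass still detects the difference.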
gcc/m2/ChangeLog:
* Make-lang.in (MC_EXTENDED_OPAQUE): New definition.
* mc-boot/GDynamicStrings.cc: Rebuild.
* mc-boot/GDynamicStrings.h: Rebuild.
* mc-boot/Galists.cc: Rebuild.
* mc-boot/Galists.h: Rebuild.
* mc-boot/Gdecl.cc: Rebuild.
* mc/alists.def (equalList): New procedure.
* mc/alists.mod (equalList): New procedure implementation.
* mc/decl.mod (group): New type.
(freeGroup): New variable.
(globalGroup): Ditto.
(todoQ): Remove declaration and prefix all occurrences with globalGroup^.
(partialQ): Ditto.
(doneQ): Ditto.
(newGroup): New procedure.
(initGroup): Ditto.
(killGroup): Ditto.
(dupGroup): Ditto.
(equalGroup): Ditto.
(topologicallyOut): Rewrite.
Gaius Mulley [Tue, 19 Nov 2024 19:33:18 +0000 (19:33 +0000)]
[PATCH] PR modula2/115164 initial test code highlighting the problem
This patch includes some trivial testcode which highlights
PR 115164. Expect future test code to perform runtime checks
for a series of trailing zeros.
gcc/testsuite/ChangeLog:
PR modula2/115164
* gm2/isolib/run/pass/testlowread.mod: New test.
* gm2/isolib/run/pass/testwritereal.mod: New test.
Gaius Mulley [Tue, 19 Nov 2024 18:30:10 +0000 (18:30 +0000)]
[PATCH] PR modula2/115057 TextIO.ReadRestLine raises an exception when buffer is exceeded
TextIO.ReadRestLine will raise an "attempting to read beyond end of file"
exception if the buffer is exceeded. This bug is caused by
TextIO.ReadRestLine calling IOChan.Skip without a preceding IOChan.Look.
The Look procedure will update the status result whereas
Skip always sets read result to allRight.
gcc/m2/ChangeLog:
PR modula2/115057
* gm2-libs-iso/TextIO.mod (ReadRestLine): Use ReadChar to
skip unwanted characters as this calls IOChan.Look and updates
the cid result status. A Skip without a Look does not update
the status. Skip always sets read result to allRight.
* gm2-libs-iso/TextUtil.def (SkipSpaces): Improve comments.
(CharAvailable): Improve comments.
* gm2-libs-iso/TextUtil.mod (SkipSpaces): Improve comments.
(CharAvailable): Improve comments.
gcc/testsuite/ChangeLog:
PR modula2/115057
* gm2/isolib/run/pass/testrestline.mod: New test.
* gm2/isolib/run/pass/testrestline2.mod: New test.
* gm2/isolib/run/pass/testrestline3.mod: New test.
Gaius Mulley [Tue, 19 Nov 2024 15:32:02 +0000 (15:32 +0000)]
[PATCH] PR modula2/115003 exporting a symbol to outer scope with a name clash causes ICE
An ICE will occur if an unknown symbol is exported and causes a name
clash. The error mechanism attempts to find the scope of an unknown
symbol. This patch adds a missing case clause to GetScope and returns
NulSym if the scope is an unknown symbol.
gcc/m2/ChangeLog:
PR modula2/115003
* gm2-compiler/SymbolTable.mod (GetScope): Add UndefinedSym
case clause and return NulSym.
Uros Bizjak [Mon, 18 Nov 2024 21:38:46 +0000 (22:38 +0100)]
i386: Enable *rsqrtsf2_sse without TARGET_SSE_MATH [PR117357]
__builtin_ia32_rsqrtsf2 expander generates UNSPEC_RSQRT insn pattern
also when TARGET_SSE_MATH is not set. Enable *rsqrtsf2_sse without
TARGET_SSE_MATH to avoid ICE with unrecognizable insn.
PR target/117357
gcc/ChangeLog:
* config/i386/i386.md (*rsqrtsf2_sse):
Also enable for !TARGET_SSE_MATH.
Paul Thomas [Sun, 3 Nov 2024 18:02:16 +0000 (18:02 +0000)]
Fortran: Fix associate_69.f90 that fails on some platforms [PR115700]
2024-11-03 Paul Thomas <pault@gcc.gnu.org>
gcc/testsuite/
PR fortran/115700
* gfortran.dg/associate_69.f90: Remove the test that produces a
variable string length because the optimized count depends on
the platform. This is tested in associate_70.f90.
Paul Thomas [Fri, 1 Nov 2024 07:45:00 +0000 (07:45 +0000)]
Fortran: Fix problems with substring selectors in ASSOCIATE [PR115700]
2024-11-01 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/115700
* resolve.cc (resolve_assoc_var): Extract a substring reference
with missing as well as non-constant start or end.
gcc/testsuite/
PR fortran/115700
* gfortran.dg/associate_69.f90: Activate commented out tests.
* gfortran.dg/associate_70.f90: Test correct functioning of
references in associate_69.f90 tests.
Paul Thomas [Thu, 31 Oct 2024 07:22:36 +0000 (07:22 +0000)]
Fortran: Fix problem with substring selectors in ASSOCIATE [PR115700]
2024-10-31 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/115700
* resolve.cc (resolve_variable): The typespec of an expression,
which is not a substring, can be shared with a deferred length
associate name.
(resolve_assoc_var): Extract a substring reference with non-
constant start or end. Use it to flag up the need for array
associate name to be a pointer.
(resolve_block_construct): Change comment from past to future
tense.
gcc/testsuite/
PR fortran/115700
* gfortran.dg/associate_70.f90: New test.
Eric Botcazou [Wed, 30 Oct 2024 10:22:12 +0000 (11:22 +0100)]
ada: Fix spurious error on iterated component association with large index type
This is only for the Ada 2022 form of the iterated component association.
gcc/ada/ChangeLog:
PR ada/117328
* exp_aggr.adb (Two_Pass_Aggregate_Expansion): Use a type sized
from the index type to compute the length. Simplify and remove
useless calls to New_Copy_Tree for this computation.
Andrew Carlotti [Fri, 1 Nov 2024 17:27:38 +0000 (17:27 +0000)]
testsuite: Adjust jump threading test expectation
This test started failing on aarch64 after 0cfc9c95 in 2023 ("Phi
analyzer - Initialize with range instead of a tree.").
The only change visible in the pass dumps prior to thread2 is that the
upper bounds of some ranges are reduced from +INF to 7, consistent with
the bitmask information. After thread2, there are changes in the control
flow, but only affecting edges that are obviously never taken (from
basic blocks 6 through 12). These are cleaned up in the following pass,
but the final codegen remains different.
There isn't anything obviously wrong with the change in dump output, so
let's just update the test expectations (as has happened previously
here).
Jonathan Wakely [Mon, 11 Nov 2024 11:54:00 +0000 (11:54 +0000)]
libstdc++: Fix typos in iterator increment for std::text_encoding [PR117520]
The intended behaviour for std::text_encoding::aliases_view's iterator
is that incrementing or decrementing it too far sets it to a
value-initialized state, or fails an assertion when those are enabled.
There were typos that used == instead of = which meant that instead of
becoming singular or aborting, an out-of-range increment just did
nothing. This meant erroneous operations were well-defined and didn't
produce any undefined behaviour, but were not diagnosed with assertions
enabled, as had been intended.
This change fixes the bugs and adds more tests to verify the intended
behaviour.
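The class of typo described above can be sketched as follows. The struct
"cursor" and its members are hypothetical and far simpler than the real
aliases_view::_Iterator; the point is only that a comparison written where
an assignment was intended compiles and silently does nothing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bounded cursor; not the libstdc++ iterator.
struct cursor
{
  const int *base = nullptr;
  std::size_t pos = 0;
  std::size_t len = 0;

  // Buggy variant: "==" compares and discards the result, so an
  // out-of-range advance leaves the cursor unchanged.
  void advance_buggy (std::size_t n)
  {
    if (pos + n <= len)
      pos += n;
    else
      (void) (base == nullptr);	// typo: comparison, no effect
  }

  // Intended variant: an out-of-range advance resets the cursor to a
  // value-initialized state.
  void advance_fixed (std::size_t n)
  {
    if (pos + n <= len)
      pos += n;
    else
      *this = cursor ();
  }
};
```

The cast to void only silences the "statement has no effect" warning that
would otherwise flag the typo in this sketch.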
libstdc++-v3/ChangeLog:
PR libstdc++/117520
* include/std/text_encoding (aliases_view::_Iterator::operator+=):
Fix typos that caused == to be used instead of =.
(aliases_view::_Iterator): Fix friend declaration.
* testsuite/std/text_encoding/members.cc: Adjust expected
behaviour of invalid subscript. Add tests for other erroneous
operations on iterators.
Jonathan Wakely [Fri, 8 Nov 2024 13:58:23 +0000 (13:58 +0000)]
libstdc++: Do not define _Insert_base::try_emplace before C++17
This is not a reserved name in C++11 and C++14, so must not be defined.
Also use the appropriate feature test macros for the try_emplace members
of the Debug Mode maps.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_Insert_base::try_emplace):
Do not define for C++11 and C++14.
* include/debug/map.h (try_emplace): Use feature test macro.
* include/debug/unordered_map (try_emplace): Likewise.
* testsuite/17_intro/names.cc: Define try_emplace before C++17.
arm: Fix ICE on arm_mve.h pragma without MVE types [PR117408]
Starting with r14-435-g00d97bf3b5a, doing `#pragma arm "arm_mve.h"
false` or `#pragma arm "arm_mve.h" true` without first doing
`#pragma arm "arm_mve_types.h"` causes GCC to ICE.
gcc/ChangeLog:
PR target/117408
* config/arm/arm-mve-builtins.cc (handle_arm_mve_h): Detect if MVE
types are missing and if so, return error.
gcc/testsuite/ChangeLog:
PR target/117408
* gcc.target/arm/mve/pr117408-1.c: New test.
* gcc.target/arm/mve/pr117408-2.c: Likewise.
hppa: Fix handling of secondary reloads involving a SUBREG
This is fairly subtle.
When handling spills for SUBREG arguments in pa_emit_move_sequence,
alter_subreg may be called. It in turn calls adjust_address_1 and
change_address_1. change_address_1 calls pa_legitimate_address_p
to validate the new spill address. change_address_1 generates an
internal compiler error if the address is not valid. We need to
allow 14-bit displacements for all modes when reload_in_progress
is true and strict is false to prevent the internal compiler error.
SUBREGs are only used with the general registers, so the spill
should result in an integer access. 14-bit displacements are okay
for integer loads and stores but not for floating-point loads and
stores.
Potentially, the change could break the handling of spills for the
floating point-registers but I believe these are handled separately
in pa_emit_move_sequence.
This change fixes the build of symmetrica-3.0.1+ds.
2024-11-08 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/117443
* config/pa/pa.cc (pa_legitimate_address_p): Allow any
14-bit displacement when reload is in progress and strict
is false.
Tamar Christina [Fri, 8 Nov 2024 18:12:32 +0000 (18:12 +0000)]
AArch64: backport Neoverse and Cortex CPU definitions
This is a conservative backport of a few core definitions, mapping each
to the closest cost model that exists on the
branches.
aarch64: Make PSEL dependent on SME rather than SME2
The svpsel_lane intrinsics were wrongly classified as SME2+ only,
rather than as base SME intrinsics. They should always be available
in streaming mode.
gcc/
* config/aarch64/aarch64-sve2.md (@aarch64_sve_psel<BHSD_BITS>)
(*aarch64_sve_psel<BHSD_BITS>_plus): Require TARGET_STREAMING
rather than TARGET_STREAMING_SME2.
There are two sets of patterns for FCLAMP: one set for single registers
and one set for multiple registers. The multiple-register set was
correctly gated on SME2, but the single-register set only required SME.
This doesn't matter for ACLE usage, since the intrinsic definitions
are correctly gated. But it does matter for automatic generation of
FCLAMP from separate minimum and maximum operations (either ACLE
intrinsics or autovectorised code).
gcc/
* config/aarch64/aarch64-sve2.md (@aarch64_sve_fclamp<mode>)
(*aarch64_sve_fclamp<mode>_x): Require TARGET_STREAMING_SME2
rather than TARGET_STREAMING_SME.
gcc/testsuite/
* gcc.target/aarch64/sme/clamp_3.c: Force sme2
* gcc.target/aarch64/sme/clamp_4.c: Likewise.
* gcc.target/aarch64/sme/clamp_5.c: New test.
aarch64: Fix folding of degenerate svwhilele case [PR117045]
The svwhilele folder mishandled the degenerate case in which
the second argument is the maximum integer. In that case,
the result is all-true regardless of the first parameter:
If the second scalar operand is equal to the maximum signed integer
value then a condition which includes an equality test can never fail
and the result will be an all-true predicate.
This is because the conceptual "increment the first operand
by 1 after each element" is done modulo the range of the operand.
The GCC code was instead treating it as infinite precision.
whilele_5.c even had a test for the incorrect behaviour.
The easiest fix seemed to be to handle that case specially before
doing constant folding. This also copes with variable first operands.
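The difference between the two interpretations can be modelled in scalar
C++. This is only a sketch for 32-bit signed operands: "whilele_model" and
"wrap_to_int32" are hypothetical helpers, not the folder in
aarch64-sve-builtins-base.cc, and the lane count is an arbitrary parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Reduce V modulo 2^32 into the range of std::int32_t.
std::int32_t
wrap_to_int32 (std::int64_t v)
{
  const std::int64_t two32 = std::int64_t{1} << 32;
  v %= two32;
  if (v > std::numeric_limits<std::int32_t>::max ())
    v -= two32;
  else if (v < std::numeric_limits<std::int32_t>::min ())
    v += two32;
  return static_cast<std::int32_t> (v);
}

// Model a "while <=" predicate over LANES lanes.  If WRAP is true the
// counter is incremented modulo the operand range; otherwise it is
// treated as having infinite precision.  A lane is active only if the
// comparison held for it and for every earlier lane.
std::vector<bool>
whilele_model (std::int32_t first, std::int32_t second,
	       unsigned lanes, bool wrap)
{
  std::vector<bool> pred (lanes, false);
  bool active = true;
  for (unsigned i = 0; i < lanes; ++i)
    {
      std::int64_t counter = std::int64_t{first} + i;
      if (wrap)
	counter = wrap_to_int32 (counter);
      active = active && counter <= second;
      pred[i] = active;
    }
  return pred;
}
```

With the second operand equal to the maximum integer, the wrapping model
yields an all-true predicate for any first operand, whereas the
infinite-precision model turns lanes off once the counter passes the
maximum.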
gcc/
PR target/116999
PR target/117045
* config/aarch64/aarch64-sve-builtins-base.cc
(svwhilelx_impl::fold): Check for WHILELTs of the minimum value
and WHILELEs of the maximum value. Fold them to all-false and
all-true respectively.
aarch64: Fix SVE ACLE gimple folds for C++ LTO [PR116629]
The SVE ACLE code has two ways of handling overloaded functions.
One, used by C, is to define a single dummy function for each unique
overloaded name, with resolve_overloaded_builtin then resolving calls
to real non-overloaded functions. The other, used by C++, is to
define a separate function for each individual overload.
The builtins harness assigns integer function codes programmatically.
However, LTO requires it to use the same assignment for every
translation unit, regardless of language. This means that C++ TUs
need to create (unused) slots for the C overloads and that C TUs
need to create (unused) slots for the C++ overloads.
In many ways, it doesn't matter whether the LTO frontend itself
uses the C approach or the C++ approach to defining overloaded
functions, since the LTO frontend never has to resolve source-level
overloading. However, the C++ approach of defining a separate
function for each overload means that C++ calls never need to
be redirected to a different function. Calls to an overload
can appear in the LTO dump and survive until expand. In contrast,
calls to C's dummy overload functions are resolved by the front
end and never survive to LTO (or expand).
Some optimisations work by moving between sibling functions, such as _m
to _x. If the source function is an overload, the expected destination
function is too. The LTO frontend needs to define C++ overloads if it
wants to do this optimisation properly for C++.
The PR is about a tree checking failure caused by trying to use a
stubbed-out C++ overload in LTO. Dealing with that by detecting the
stub (rather than changing which overloads are defined) would have
turned this from an ice-on-valid to a missed optimisation.
In future, it would probably make sense to redirect overloads to
non-overloaded functions during gimple folding, in case that exposes
more CSE opportunities. But it'd probably be of limited benefit, since
it should be rare for code to mix overloaded and non-overloaded uses of
the same operation. It also wouldn't be suitable for backports.
gcc/
PR target/116629
* config/aarch64/aarch64-sve-builtins.cc
(function_builder::function_builder): Use direct overloads for LTO.
gcc/testsuite/
PR target/116629
* gcc.target/aarch64/sve/acle/general/pr106326_2.c: New test.
testsuite: arm: Use effective-target for pr84556.cc test
Using "dg-do run" with a selector overrides the default selector set by
vect.exp that picks between "dg-do run" and "dg-do compile" based on the
target's support for simd operations for Arm targets.
The actual selection of default operation is performed in
check_vect_support_and_set_flags.
gcc/testsuite/ChangeLog:
* g++.dg/vect/pr84556.cc: Change from "dg-do run" with selector
to instead use dg-require-effective-target with the same
selector.
Hu, Lin1 [Thu, 7 Nov 2024 02:13:15 +0000 (10:13 +0800)]
i386: Modify regexp of pr117304-1.c
Since the test doesn't care if the hint is correct,
modify the regexp of the hint part to avoid future
changes to the hint that would cause the test to fail.
=== cut here ===
struct Base {
  unsigned int *intarray;
};
template <typename T> struct Sub : public Base {
  bool Get(int i) {
    return (Base::intarray[++i] == 0);
  }
};
=== cut here ===
The problem is that from C++17 on, we use -fstrong-eval-order and need
to wrap the array access expression into a SAVE_EXPR. We do so at
template declaration time, and end up calling contains_placeholder_p
with a SCOPE_REF, which it does not handle well.
This patch fixes this by deferring the wrapping into SAVE_EXPR to
instantiation time for templates, when the SCOPE_REF will have been
turned into a COMPONENT_REF.
PR c++/117158
gcc/cp/ChangeLog:
* typeck.cc (cp_build_array_ref): Only wrap array expression
into a SAVE_EXPR at template instantiation time.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/eval-order13.C: New test.
* g++.dg/parse/crash77.C: New test.
Patrick Palka [Tue, 5 Nov 2024 20:18:26 +0000 (15:18 -0500)]
c++: reference variable as default targ [PR101463]
Here during default template argument substitution we wrongly consider
the (substituted) default arguments v and vt<int> as value-dependent
which ultimately leads to deduction failure for the calls.
The bogus value_dependent_expression_p result aside, I noticed
type_unification_real during default targ substitution keeps track of
whether all previous targs are known and non-dependent, as is the case
for these calls. And in such cases it should be safe to avoid checking
dependence of the substituted default targ and just assume it's not.
This patch implements this optimization for GCC 14, which lets us accept
both testcases by sidestepping the value_dependent_expression_p issue
altogether. (Note that for GCC 15 we fixed this differently, see r15-3038-g5348e3cb9bc99d.)
PR c++/101463
gcc/cp/ChangeLog:
* pt.cc (type_unification_real): Avoid checking dependence of
a substituted default template argument if we can assume it's
non-dependent.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/nontype6.C: New test.
* g++.dg/cpp1z/nontype6a.C: New test.
testsuite: arm: Use effective-target for pr98636.c test
The test case assumes that -mfp16-format=alternative is accepted for the
target, but not all targets support this flag. One such target is
Cortex-M85 that does support FP16, but not the alternative format.
gcc/testsuite/ChangeLog:
* gcc.target/arm/pr98636.c: Use effective-target
arm_fp16_alternative.
Jason Merrill [Mon, 4 Nov 2024 22:48:46 +0000 (17:48 -0500)]
c++: allow array mem-init with -fpermissive [PR116634]
We've accidentally accepted this forever (at least as far back as 4.7), but
it's always been ill-formed; this was PR59465. And we didn't accept it for
scalar types. But rather than switch to a hard error for this code, let's
give a permerror so affected code can continue to work with -fpermissive.
PR c++/116634
gcc/cp/ChangeLog:
* init.cc (can_init_array_with_p): Allow PR59465 case with
permerror.
Paul Thomas [Tue, 5 Nov 2024 15:54:45 +0000 (15:54 +0000)]
Fortran: Fix regressions with intent(out) class[PR115070, PR115348].
2024-11-05 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/115070
PR fortran/115348
* trans-expr.cc (gfc_trans_class_init_assign): If all the
components of the default initializer are null for a scalar,
build an empty statement to prevent prior declarations from
disappearing.
gcc/testsuite/
PR fortran/115070
* gfortran.dg/ieee/pr115070.f90: New test.
PR fortran/115348
* gfortran.dg/pr115348.f90: New test.
Andrew MacLeod [Sat, 2 Nov 2024 14:26:24 +0000 (10:26 -0400)]
Don't call invert on VARYING.
When all cases go to one label and result in a VARYING value, we can't
invert that value to remove all values from the default case. Simply
check for this case and set the default to UNDEFINED.
PR tree-optimization/117398
gcc/
* gimple-range-edge.cc (gimple_outgoing_range::calc_switch_ranges):
Check for VARYING and don't call invert () on it.
The test case is written in a way that it should be using the hard float
ABI, but the use of -mfloat-abi=hard could be overridden by
dg-add-options arm_neon. Ensure that -mfloat-abi=hard always comes after it.
gcc/testsuite/ChangeLog:
* gcc.target/arm/pr51534.c: Ensure -mfloat-abi=hard is used.
Andrew MacLeod [Fri, 1 Nov 2024 14:56:54 +0000 (10:56 -0400)]
Make fur_edge accessible.
Move the decl of fur_edge out of the source file into the header file.
* gimple-range-fold.cc (class fur_edge): Relocate from here.
(fur_edge::fur_edge): Also move to:
* gimple-range-fold.h (class fur_edge): Relocate to here.
(fur_edge::fur_edge): Likewise.
Jakub Jelinek [Mon, 4 Nov 2024 11:29:01 +0000 (12:29 +0100)]
libstdc++: Fix up 117406.cc test [PR117406]
Christophe mentioned in bugzilla that the test FAILs on aarch64 because
it uses INT_MAX without including <climits>.
Apparently during my testing I got it because the tests preinclude
-include bits/stdc++.h
and that includes <climits>, dunno why that didn't happen on aarch64.
In any case, I could add #include <climits>, but because the
test already has #include <limits> I've replaced uses of INT_MAX
with std::numeric_limits<int>::max(), which should be the same thing.
But if you prefer
#include <climits>
I can surely add that instead.
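For reference, the equivalence of the two spellings can be checked at
compile time. This is a standalone sketch, not part of the test, and
"int_max_from_limits" is a hypothetical helper name.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Both spellings name the same value; <climits> is included here only so
// that the two can be compared.
static_assert (std::numeric_limits<int>::max () == INT_MAX,
	       "numeric_limits<int>::max() and INT_MAX agree");

// The <limits> spelling is usable without <climits>.
int
int_max_from_limits ()
{
  return std::numeric_limits<int>::max ();
}
```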
2024-11-04 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/117406
* testsuite/26_numerics/headers/cmath/117406.cc: Use
std::numeric_limits<int>::max() instead of INT_MAX.
Jakub Jelinek [Sat, 2 Nov 2024 17:48:54 +0000 (18:48 +0100)]
libstdc++: Fix up std::{,b}float16_t std::{ilogb,l{,l}r{ound,int}} [PR117406]
These overloads incorrectly cast the result of the float __builtin_*
to _Float16 or __gnu_cxx::__bfloat16_t. For std::ilogb that changes
behavior for the INT_MAX return because that isn't representable in
either of the floating point formats, for the others it is I think
just a very inefficient hop from int/long/long long to std::{,b}float16_t
and back. I mean for the round/rint cases, either the argument is small
and then the return value should be representable in the floating point
format too, or it is so large that the argument is already integral
and then it should just return the argument even with the round trips.
The result for too large a value is unspecified, unlike for ilogb.
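The representability problem can be sketched with float standing in for
the 16-bit types, since _Float16 is not available with every compiler.
This is an analogy rather than the exact case: it assumes a 32-bit int and
a float with fewer than 31 significand bits, and "int_max_via_float" is a
hypothetical helper.

```cpp
#include <cassert>
#include <climits>

// Push INT_MAX through a narrower floating-point type and return what
// comes out, widened to double so the values can be compared exactly.
// Under the stated assumptions INT_MAX is not representable in float,
// so the value is rounded on the way in.
double
int_max_via_float ()
{
  float narrowed = static_cast<float> (INT_MAX);
  return static_cast<double> (narrowed);
}
```

The returned value differs from INT_MAX, which is the kind of change an
extra cast through the floating-point type introduces for the ilogb
return value.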
2024-11-02 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/117406
* include/c_global/cmath (std::ilogb(_Float16), std::llrint(_Float16),
std::llround(_Float16), std::lrint(_Float16), std::lround(_Float16)):
Don't cast __builtin_* return to _Float16.
(std::ilogb(__gnu_cxx::__bfloat16_t),
std::llrint(__gnu_cxx::__bfloat16_t),
std::llround(__gnu_cxx::__bfloat16_t),
std::lrint(__gnu_cxx::__bfloat16_t),
std::lround(__gnu_cxx::__bfloat16_t)): Don't cast __builtin_* return to
__gnu_cxx::__bfloat16_t.
* testsuite/26_numerics/headers/cmath/117406.cc: New test.
Jakub Jelinek [Thu, 31 Oct 2024 09:52:56 +0000 (10:52 +0100)]
expand: Fix up expansion of VIEW_CONVERT_EXPR to BITINT_TYPE [PR117354]
The following testcase ICEs, because when trying to expand the
VIEW_CONVERT_EXPR operand which is SSA_NAME defined to
V32QI or V4DI MEM_REF which is aligned just to 8 bytes we force
it as unaligned into a register, but then try to call extract_bit_field
from the V32QI or V4DI register to BLKmode. extract_bit_field obviously
doesn't support BLKmode extraction and so ICEs.
The second hunk fixes the ICE by not calling extract_bit_field when
it can't handle it, the last if will handle it properly by storing
it to memory and using BLKmode access to the copy.
The first hunk is an optimization, if mode is BLKmode, by setting
inner_reference_p argument to expand_expr_real we avoid the
expand_misaligned_mem_ref calls which load it from memory into a register.
2024-10-31 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117354
* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Pass
true as inner_reference_p argument to expand_expr_real if
mode is BLKmode. Don't call extract_bit_field if mode is BLKmode.
Jakub Jelinek [Wed, 30 Oct 2024 08:59:22 +0000 (09:59 +0100)]
function: Call do_pending_stack_adjust in assign_parms [PR117296]
Functions called by assign_parms call emit_block_move in two places,
so on some targets can be expanded as calls and can result in pending
stack adjustment.
Now, during expansion we normally call do_pending_stack_adjust at the end
of expansion of each basic block or before emitting code that will branch
and/or has labels, and when emitting labels we assert that there are no
pending stack adjustments.
assign_parms is expanded before the first basic block and if the first
basic block starts with a label and at least one of those emit_block_move
calls resulted in the need of pending stack adjustments, we ICE when
emitting that label.
The following patch fixes that by calling do_pending_stack_adjust after
the assign_parms potential emit_block_move calls.
Jakub Jelinek [Tue, 29 Oct 2024 10:14:12 +0000 (11:14 +0100)]
libstdc++: Use if consteval rather than if (std::__is_constant_evaluated()) for {,b}float16_t nextafter [PR117321]
The nextafter_c++23.cc testcase fails to link at -O0.
The problem is that even though std::__is_constant_evaluated() has
always_inline attribute, that at -O0 just means that we inline the
call, but its result is still assigned to a temporary which is tested
later, nothing at -O0 propagates that false into the if and optimizes
away the if body. And the __builtin_nextafterf16{,b} calls are meant
to be used solely for constant evaluation, the C libraries don't
define nextafterf16 these days.
As __STDCPP_FLOAT16_T__ and __STDCPP_BFLOAT16_T__ are predefined right
now only by GCC, not by clang which doesn't implement the extended floating
point types paper, and as they are predefined in C++23 and later modes only,
I think we can just use if consteval which is folded already during the FE
and the body isn't included even at -O0. I've added a feature test for
that just in case clang implements those and implements those in some weird
way. Note, if (__builtin_is_constant_evaluated()) would work correctly too,
that is also folded to false at gimplification time and the corresponding
if block not emitted at all. But for -O0 it can't be wrapped into a helper
inline function.
2024-10-29 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/117321
* include/c_global/cmath (nextafter(_Float16, _Float16)): Use
if consteval rather than if (std::__is_constant_evaluated()) around
the __builtin_nextafterf16 call.
(nextafter(__gnu_cxx::__bfloat16_t, __gnu_cxx::__bfloat16_t)): Use
if consteval rather than if (std::__is_constant_evaluated()) around
the __builtin_nextafterf16b call.
* testsuite/26_numerics/headers/cmath/117321.cc: New test.
Eric Botcazou [Fri, 16 Aug 2024 14:03:30 +0000 (16:03 +0200)]
ada: Fix internal error on concatenation of discriminant-dependent component
This only occurs with optimization enabled, but the expanded code is always
wrong because it reuses the formal parameter of an initialization procedure
associated with a discriminant (a discriminal in GNAT parlance) outside of
the initialization procedure.
gcc/ada/
* checks.adb (Selected_Length_Checks.Get_E_Length): For a
component of a record with discriminants and if the expression is
a selected component, try to build an actual subtype from its
prefix instead of from the discriminal.
Paul Thomas [Fri, 25 Oct 2024 16:59:03 +0000 (17:59 +0100)]
Fortran: Fix ICE with structure constructor in data statement [PR79685]
2024-10-25 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/79685
* decl.cc (match_data_constant): Find the symtree instead of
the symbol so that use-renamed symbols are found. Pass this and
the derived type to gfc_match_structure_constructor.
* match.h: Update prototype of gfc_match_structure_constructor.
* primary.cc (gfc_match_structure_constructor): Remove call to
gfc_get_ha_sym_tree and use caller supplied symtree instead.
gcc/testsuite/
PR fortran/79685
* gfortran.dg/use_rename_13.f90: New test.
Hongyu Wang [Wed, 7 Feb 2024 06:42:58 +0000 (14:42 +0800)]
[APX PPX] Avoid generating unmatched pushp/popp in pro/epilogue
According to the APX spec, pushp/popp pairs should be matched,
otherwise the PPX hint cannot take effect, causing performance loss.
In ix86_expand_epilogue, there are several optimizations that may
cause the epilogue to use mov to restore the regs. Check whether PPX was
applied and prevent the use of mov/leave in the epilogue. Also do not use
PPX for eh_return.
gcc/ChangeLog:
* config/i386/i386.cc (ix86_expand_prologue): Set apx_ppx_used
flag in m.fs with TARGET_APX_PPX && !crtl->calls_eh_return.
(ix86_emit_save_regs): Emit ppx only when
TARGET_APX_PPX && !crtl->calls_eh_return.
(ix86_expand_epilogue): Don't restore reg using mov when
apx_ppx_used flag is true.
* config/i386/i386.h (struct machine_frame_state):
Add apx_ppx_used flag.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-ppx-2.c: New test.
* gcc.target/i386/apx-ppx-3.c: Likewise.
The current code was based on an early version of the SME spec,
which allowed the .Q forms of TRN1, TRN2, UZP1, UZP2, ZIP1, and ZIP2
to be used in streaming mode. We should now forbid them instead;
see https://developer.arm.com/documentation/ddi0602/2024-09/SVE-Instructions/TRN1--TRN2--vectors---Interleave-even-or-odd-elements-from-two-vectors-?lang=en
and the corresponding entries for the others.
Yangyu Chen [Thu, 31 Oct 2024 19:52:45 +0000 (19:52 +0000)]
Fix function multiversioning dispatcher link error with LTO
We forgot to set DECL_EXTERNAL on the __init_cpu_features_resolver decl. When
building with LTO, the linker cannot find the
__init_cpu_features_resolver.lto_priv* symbol, causing a link error.
This patch fixes this by setting DECL_EXTERNAL on the decl. To avoid a "used
but never defined" warning for this symbol, we also set TREE_PUBLIC on the decl.
We should also mark the decl as having hidden visibility, and fix the attributes
in the same way for the __aarch64_cpu_features identifier.
* config/aarch64/aarch64.cc (dispatch_function_versions): Add
DECL_EXTERNAL, TREE_PUBLIC and hidden DECL_VISIBILITY to
__init_cpu_features_resolver and __aarch64_cpu_features.
David Malcolm [Wed, 30 Oct 2024 20:11:41 +0000 (16:11 -0400)]
jit: fix leak of pending_assemble_externals_set [PR117275]
My recent r15-4580-g779c0390e3b57d fix for resetting state in
varasm.cc introduced some noise to "make selftest-valgrind" and,
presumably, a memory leak in libgccjit:
==2462086== 160 (56 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 248 of 352
==2462086== at 0x5270E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==2462086== by 0x1D1EB89: init_varasm_once() (varasm.cc:6806)
==2462086== by 0x181C845: backend_init() (toplev.cc:1826)
==2462086== by 0x181D41A: do_compile() (toplev.cc:2193)
==2462086== by 0x181D99C: toplev::main(int, char**) (toplev.cc:2371)
==2462086== by 0x378391D: main (main.cc:39)
Fixed thusly.
gcc/ChangeLog:
PR jit/117275
* varasm.cc (process_pending_assemble_externals): Reset
pending_assemble_externals_set to nullptr after deleting it.
(varasm_cc_finalize): Delete pending_assemble_externals_set.
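The two ChangeLog entries above can be sketched with a toy model. This is
not the varasm.cc code: the toy_leak namespace, extern_set, live_sets and
simulate_compile are hypothetical names, and std::set stands in for
whatever container the real pending_assemble_externals_set uses. The
sketch assumes the set is allocated once per compile, as the valgrind
trace suggests.

```cpp
#include <set>
#include <string>

namespace toy_leak {

int live_sets = 0;  // number of sets currently allocated

// Counts its own allocations so a leak shows up as live_sets != 0.
struct extern_set : std::set<std::string> {
  extern_set() { ++live_sets; }
  ~extern_set() { --live_sets; }
};

extern_set *pending_set = nullptr;

// Allocates the set at the start of a compile.
void init_once() { pending_set = new extern_set; }

// Consumes the set. Resetting the pointer to nullptr is what makes a
// later finalize() safe (no double delete).
void process_pending() {
  delete pending_set;
  pending_set = nullptr;
}

// Runs at the end of every compile, whether or not process_pending() ran.
void finalize() {
  delete pending_set;  // deleting nullptr is a no-op
  pending_set = nullptr;
}

// Returns the number of sets still allocated after one simulated compile.
int simulate_compile(bool run_process) {
  init_once();
  if (run_process)
    process_pending();
  finalize();
  return live_sets;
}

}  // namespace toy_leak
```

Both simulate_compile (true) and simulate_compile (false) leave no set
allocated in this model; without the delete in finalize() the second case
would leak, and without the nullptr reset the first would free twice.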
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
(cherry picked from commit 7f41203f08b9948c1c636dc9d66571121c6c7793) Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 30 Oct 2024 20:11:40 +0000 (16:11 -0400)]
jit: reset state in varasm.cc [PR117275]
PR jit/117275 reports various jit test failures seen on
powerpc64le-unknown-linux-gnu due to hitting this assertion
in varasm.cc on the 2nd compilation in a process:
#2 0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8)
at ../../src/gcc/varasm.cc:2650
2650 gcc_assert (!pending_assemble_externals_processed);
(gdb) p pending_assemble_externals_processed
$1 = true
We're not properly resetting state in varasm.cc after a compile
for libgccjit.
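A minimal sketch of the failure mode, as a toy model rather than the
varasm.cc code: the toy_reset namespace and its functions are
hypothetical, and assert stands in for gcc_assert. The point is only that
file-scope state set during one compile must be restored before the next
compile in the same process.

```cpp
#include <cassert>

namespace toy_reset {

// Models pending_assemble_externals_processed.
bool externals_processed = false;

// Models assemble_external_libcall: must not run once processing is done.
void assemble_external() { assert(!externals_processed); }

// Models the point in a compile at which the flag becomes true.
void process_pending() { externals_processed = true; }

// Hypothetical per-compile cleanup that restores the initial state.
void finalize() { externals_processed = false; }

// One in-process compile; returns true if it ran to completion.
bool compile_once() {
  assemble_external();
  process_pending();
  finalize();
  return true;
}

}  // namespace toy_reset
```

Calling compile_once () twice succeeds here only because finalize ()
clears the flag; dropping that call makes the second compile hit the
assertion, as in the gdb session above.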
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
(cherry picked from commit 779c0390e3b57d1eebd41bbfe43d1f329c91de6c) Signed-off-by: David Malcolm <dmalcolm@redhat.com>
jit.dg/test-error-pr63969-missing-driver.c tries to break PATH and
verify that an error is generated when using an external driver.
However it does this by unsetting PATH, and so the test could
accidentally find the driver if the system supplies a default and the
driver happens to be installed in that path (reported as rhbz#2318021).
Fix the test by instead setting PATH to a bogus value.
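A minimal sketch of the approach, assuming a POSIX system. The toy_path
namespace, the helper names and the directory name are made up for
illustration and are not taken from the test itself.

```cpp
#include <cstdlib>   // std::getenv
#include <cstring>   // std::strcmp
#include <stdlib.h>  // setenv (POSIX, not ISO C++)

namespace toy_path {

// Made-up directory name; any path that cannot contain the driver would do.
const char *const bogus_path = "/this/path/does/not/exist";

// Break PATH by overwriting it, rather than by unsetting it, so that the
// system cannot substitute a default search path.
bool break_path() {
  return setenv("PATH", bogus_path, /*overwrite=*/1) == 0;
}

// True if PATH is set and equal to the bogus value.
bool path_is_bogus() {
  const char *path = std::getenv("PATH");
  return path != nullptr && std::strcmp(path, bogus_path) == 0;
}

}  // namespace toy_path
```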
gcc/testsuite/ChangeLog:
* jit.dg/test-error-pr63969-missing-driver.c (create_code): When
breaking PATH, use setenv with a bogus value, rather than
unsetenv, in case the system uses a default path that contains
the driver binary.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
(cherry picked from commit f8dcb559e615dbb4557a23363f9532a3544a7241) Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Alex Coplan [Wed, 30 Oct 2024 13:46:12 +0000 (13:46 +0000)]
aarch64: Assume alias conflict if common address reg changes [PR116783]
As the PR shows, pair fusion was tricking memory_modified_in_insn_p into
returning false when a common base register (in this case, x1) was
modified between the mem and the store insn. This led to wrong code as
the accesses really did alias.
To avoid this sort of problem, this patch avoids invoking RTL alias
analysis altogether (and assumes an alias conflict) if the two insns to
be compared share a common address register R, and the insns see different
definitions of R (i.e. it was modified in between).
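The check described above can be sketched as a toy model. This is not the
aarch64-ldp-fusion.cc code: the mem_access struct and its integer fields
are hypothetical simplifications, and the alias query is passed in as a
callback so the sketch stays self-contained. Only the function names echo
the ChangeLog below.

```cpp
#include <functional>

namespace toy_alias {

// Hypothetical summary of one memory access: the register used in its
// address, and an id for the definition of that register the access sees.
struct mem_access {
  int addr_reg;
  int addr_reg_def;
};

using alias_query =
    std::function<bool(const mem_access &, const mem_access &)>;

// True if both accesses use the same address register but see different
// definitions of it, i.e. the register was modified in between.
bool addr_reg_conflict_p(const mem_access &a, const mem_access &b) {
  return a.addr_reg == b.addr_reg && a.addr_reg_def != b.addr_reg_def;
}

// Conservative check: assume a conflict when the address register changed,
// and consult alias analysis only otherwise.
bool conflict_p(const mem_access &a, const mem_access &b,
                const alias_query &alias_conflict_p) {
  if (addr_reg_conflict_p(a, b))
    return true;
  return alias_conflict_p(a, b);
}

}  // namespace toy_alias
```

With this ordering, an alias query that would wrongly report "no
conflict" is never reached when the shared address register was redefined
between the two accesses.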
PR rtl-optimization/116783
* config/aarch64/aarch64-ldp-fusion.cc
(def_walker::cand_addr_uses): New.
(def_walker::def_walker): Add parameter for candidate address
uses.
(def_walker::alias_conflict_p): Declare.
(def_walker::addr_reg_conflict_p): New.
(def_walker::conflict_p): New.
(store_walker::store_walker): Add parameter for candidate
address uses and pass to base ctor.
(store_walker::conflict_p): Rename to ...
(store_walker::alias_conflict_p): ... this.
(load_walker::load_walker): Add parameter for candidate
address uses and pass to base ctor.
(load_walker::conflict_p): Rename to ...
(load_walker::alias_conflict_p): ... this.
(ldp_bb_info::try_fuse_pair): Collect address register
uses for candidate insns and pass down to alias walkers.
gcc/testsuite/ChangeLog:
PR rtl-optimization/116783
* g++.dg/torture/pr116783.C: New test.