Richard Biener [Wed, 31 Jan 2024 10:28:50 +0000 (11:28 +0100)]
tree-optimization/113630 - invalid code hoisting
The following avoids code hoisting (but also PRE insertion) of
expressions that got value-numbered to another one that is not
a valid replacement (but still computes the same value). This time
because the access path ends in a structure with different size,
meaning we consider a related access as not trapping because of the
size of the base of the access.
PR tree-optimization/113630
* tree-ssa-pre.cc (compute_avail): Avoid registering a
reference with a representation with not matching base
access size.
Paul Thomas [Tue, 2 Apr 2024 14:53:29 +0000 (15:53 +0100)]
Fortran: Add error for subroutine passed to a variable dummy [PR106999]
2024-04-02 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/106999
* interface.cc (gfc_compare_interfaces): Add error for a
subroutine proc pointer passed to a variable formal.
(compare_parameter): If a procedure pointer is being passed to
a non-procedure formal arg, and there is an interface, use
gfc_compare_interfaces to check and provide a more useful error
message.
gcc/testsuite/
PR fortran/106999
* gfortran.dg/pr106999.f90: New test.
Paul Thomas [Tue, 2 Apr 2024 13:19:09 +0000 (14:19 +0100)]
Fortran: Fix wrong recursive errors and class initialization [PR112407]
2024-04-02 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/112407
* resolve.cc (resolve_procedure_expression): Change the test for
recursion in the case of hidden procedures from modules.
(resolve_typebound_static): Add warning for possible recursive
calls to typebound procedures.
* trans-expr.cc (gfc_trans_class_init_assign): Do not apply
default initializer to class dummy where component initializers
are all null.
gcc/testsuite/
PR fortran/112407
* gfortran.dg/pr112407a.f90: New test.
* gfortran.dg/pr112407b.f90: New test.
Paul Thomas [Fri, 29 Mar 2024 07:23:19 +0000 (07:23 +0000)]
Fortran: Fix a gimplifier ICE/wrong result with finalization [PR36337]
2024-03-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/36337
PR fortran/110987
PR fortran/113885
* trans-expr.cc (gfc_trans_assignment_1): Place finalization
block before rhs post block for elemental rhs.
* trans.cc (gfc_finalize_tree_expr): Check directly if a type
has no components, rather than the zero components attribute.
Treat elemental zero component expressions in the same way as
scalars.
gcc/testsuite/
PR fortran/113885
* gfortran.dg/finalize_54.f90: New test.
* gfortran.dg/finalize_55.f90: New test.
gcc/testsuite/
PR fortran/110987
* gfortran.dg/finalize_56.f90: New test.
Paul Thomas [Mon, 6 May 2024 07:21:14 +0000 (08:21 +0100)]
Fortran: Fix ICE and clear incorrect error messages [PR114739]
2024-05-06 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/114739
* primary.cc (gfc_match_varspec): Check for default type before
checking for derived types with the right component name.
gcc/testsuite/
PR fortran/114739
* gfortran.dg/pr114739.f90: New test.
* gfortran.dg/derived_comp_array_ref_8.f90: Add 'implicit none'
for consistency with expected error message.
* gfortran.dg/nullify_4.f90: ditto
* gfortran.dg/pointer_init_6.f90: ditto
* gfortran.dg/pr107397.f90: ditto
* gfortran.dg/pr88138.f90: ditto
Objective-C, NeXT, v2: Correct a regression in code-gen.
There have been several changes in the ABI of Objective-C which
depend on the OS version targeted. In this case Protocols and
LabelProtocols should be made weak/hidden/extern from macOS 10.7;
however, there was a mistake in the code causing this to occur
from macOS 10.6. Fixed thus.
gcc/objc/ChangeLog:
* objc-next-runtime-abi-02.cc (WEAK_PROTOCOLS_AFTER): New.
(next_runtime_abi_02_protocol_decl): Use WEAK_PROTOCOLS_AFTER
to determine this ABI change.
(build_v2_protocol_list_address_table): Likewise.
Richard Biener [Wed, 17 Apr 2024 08:40:04 +0000 (10:40 +0200)]
tree-optimization/114749 - reset partial vector decision for no-SLP retry
The following makes sure to reset LOOP_VINFO_USING_PARTIAL_VECTORS_P
to its default of false when re-trying without SLP as otherwise
analysis may run into bogus asserts.
PR tree-optimization/114749
* tree-vect-loop.cc (vect_analyze_loop_2): Reset
LOOP_VINFO_USING_PARTIAL_VECTORS_P when re-trying without SLP.
Richard Biener [Tue, 16 Apr 2024 09:33:48 +0000 (11:33 +0200)]
tree-optimization/114736 - SLP DFS walk issue
The following fixes a DFS walk issue when identifying latch edges
to be ignored. We have (bogus) SLP_TREE_REPRESENTATIVEs for VEC_PERM
nodes so those have to be explicitly ignored as possibly being PHIs.
PR tree-optimization/114736
* tree-vect-slp.cc (vect_optimize_slp_pass::is_cfg_latch_edge):
Do not consider VEC_PERM_EXPRs as PHI use.
Richard Biener [Mon, 15 Apr 2024 09:09:17 +0000 (11:09 +0200)]
gcov-profile/114715 - missing coverage for switch
The following avoids missing coverage for the line of a switch statement
which happens when gimplification emits a BIND_EXPR wrapping the switch
as that prevents us from setting locations on the containing statements
via annotate_all_with_location. Instead set the location of the GIMPLE
switch directly.
PR gcov-profile/114715
* gimplify.cc (gimplify_switch_expr): Set the location of the
GIMPLE switch.
Richard Biener [Tue, 9 Apr 2024 12:25:57 +0000 (14:25 +0200)]
lto/114655 - -flto=4 at link time doesn't override -flto=auto at compile time
The following adjusts -flto option processing in lto-wrapper to have
link-time -flto override any compile time setting.
PR lto/114655
* lto-wrapper.cc (merge_flto_options): Add force argument.
(merge_and_complain): Do not force here.
(run_gcc): But here to make the link-time -flto option override
any compile-time one.
Richard Biener [Thu, 4 Apr 2024 08:00:51 +0000 (10:00 +0200)]
tree-optimization/114485 - neg induction with partial vectors
We can't use vect_update_ivs_after_vectorizer for partial vectors,
the following fixes vect_can_peel_nonlinear_iv_p accordingly.
PR tree-optimization/114485
* tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p):
vect_step_op_neg isn't OK for partial vectors but only
for unknown niter.
Andre Vieira [Fri, 20 Oct 2023 16:02:32 +0000 (17:02 +0100)]
ifcvt: Don't lower bitfields with non-constant offsets [PR 111882]
This patch stops lowering of bitfields by ifcvt when they have non-constant
offsets as we are not likely to be able to do anything useful with those during
vectorization. That also fixes the issue reported in PR 111882, which was
being caused by an offset with a side-effect being lowered, but constants have
no side-effects so we will no longer run into that problem.
gcc/ChangeLog:
PR tree-optimization/111882
* tree-if-conv.cc (get_bitfield_rep): Return NULL_TREE for bitfields
with non-constant offsets.
Richard Biener [Wed, 10 Apr 2024 08:33:40 +0000 (10:33 +0200)]
tree-optimization/114672 - WIDEN_MULT_PLUS_EXPR type mismatch
The following makes sure to restrict WIDEN_MULT*_EXPR to a mode
precision final compute type as the mode is used to find the optab
and type checking chokes when seeing bit-precisions later which
would likely also not be properly expanded to RTL.
PR tree-optimization/114672
* tree-ssa-math-opts.cc (convert_plusminus_to_widen): Only
allow mode-precision results.
Jonathan Wakely [Thu, 4 Apr 2024 09:33:33 +0000 (10:33 +0100)]
libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672]
A negative delim value passed to std::istream::ignore can never match
any character in the stream, because the comparison is done using
traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns
negative values (except at EOF). The optimized version of ignore for the
std::istream specialization uses traits_type::find to locate the delim
character in the streambuf, which _can_ match a negative delim on
platforms where char is signed, but then we do another comparison using
eq_int_type which fails. The code then keeps looping forever, with
traits_type::find locating the character and traits_type::eq_int_type
saying it's not a match, so traits_type::find is used again and finds
the same character again.
A possible fix would be to check with eq_int_type after a successful
find, to see whether we really have a match. However, that would be
suboptimal since we know that a negative delimiter will never match
using eq_int_type. So a better fix is to adjust the check at the top of
the function that handles delim==eof(), so that we treat all negative
delim values as equivalent to EOF. That way we don't bother using find
to search for something that will never match with eq_int_type.
The version of ignore in the primary template doesn't need a change,
because it doesn't use traits_type::find, instead characters are
extracted one-by-one and always matched using eq_int_type. That avoids
the inconsistency between find and eq_int_type. The specialization for
std::wistream does use traits_type::find, but traits_type::to_int_type
is equivalent to an implicit conversion from wchar_t to wint_t, so
passing a wchar_t directly to ignore without using to_int_type works.
libstdc++-v3/ChangeLog:
PR libstdc++/93672
* src/c++98/istream.cc (istream::ignore(streamsize, int_type)):
Treat all negative delimiter values as eof().
* testsuite/27_io/basic_istream/ignore/char/93672.cc: New test.
* testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New
test.
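As a hedged illustration of the delimiter issue described above, the sketch
below converts a char delimiter with traits_type::to_int_type before calling
ignore, so that a character with the high bit set is compared as the same
non-negative value that sgetc() returns. The helper name skip_past is
hypothetical and is not part of libstdc++ or of this patch.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: discard input up to and including `delim`,
// then return the rest of the first line.
std::string skip_past(const std::string& text, char delim)
{
  using traits = std::char_traits<char>;
  std::istringstream in(text);
  // to_int_type maps the char to the non-negative int_type value that the
  // stream buffer reports, even on platforms where plain char is signed.
  in.ignore(std::numeric_limits<std::streamsize>::max(),
            traits::to_int_type(delim));
  std::string rest;
  std::getline(in, rest);
  return rest;
}
```

Passing the raw char instead (for example '\xF1' where char is signed) is the
case the commit message covers: the value converts to a negative int_type that
eq_int_type can never match.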
Will Schmidt [Fri, 12 Apr 2024 19:55:16 +0000 (14:55 -0500)]
rs6000: Add OPTION_MASK_POWER8 [PR101865]
The bug in PR101865 is that the _ARCH_PWR8 predefined macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used. The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.
Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it. For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.
2024-04-12 Will Schmidt <will_schmidt@linux.ibm.com>
Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment. Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.
gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.
Peter Bergner [Tue, 9 Apr 2024 20:24:39 +0000 (15:24 -0500)]
rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
This is a cleanup patch in preparation for fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
and delete the now dead mask.
2024-04-09 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete redundant
OPTION_MASK_DIRECT_MOVE usage. Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_P8_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.
Jason Merrill [Tue, 2 Apr 2024 14:52:28 +0000 (10:52 -0400)]
c++: binding reference to comma expr [PR114561]
We represent a reference binding where the referent type is more qualified
by a ck_ref_bind around a ck_qual. We performed the ck_qual and then tried
to undo it with STRIP_NOPS, but that doesn't work if the conversion is
buried in COMPOUND_EXPR. So instead let's avoid performing that fake
conversion in the first place.
PR c++/114561
PR c++/114562
gcc/cp/ChangeLog:
* call.cc (convert_like_internal): Avoid adding qualification
conversion in direct reference binding.
gcc/testsuite/ChangeLog:
* g++.dg/conversion/ref10.C: New test.
* g++.dg/conversion/ref11.C: New test.
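The kind of source this concerns can be sketched as follows. This is an
illustrative example, not the testcase added by the patch: a const reference is
initialized from a comma expression whose right-hand operand is a non-const
lvalue, so the referent type is more qualified than the operand and the
reference should still bind directly. The names touch, touch_count and
binds_directly are hypothetical.

```cpp
// Counts how often the left-hand operand of the comma expression ran.
int touch_count = 0;

// Side-effecting helper used as the left-hand operand.
int& touch(int& v)
{
  ++touch_count;
  return v;
}

// Returns true if the reference binds directly to `value` rather than to
// a temporary copy.
bool binds_directly()
{
  int value = 42;
  // The referent type (const int) is more qualified than the operand (int).
  const int& ref = (static_cast<void>(touch(value)), value);
  return &ref == &value;
}
```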
Jason Merrill [Wed, 27 Mar 2024 20:14:01 +0000 (16:14 -0400)]
c++: __is_constructible ref binding [PR100667]
The requirement that a type argument be complete is excessive in the case of
direct reference binding to the same type, which does not rely on any
properties of the type. This is LWG 2939.
PR c++/100667
gcc/cp/ChangeLog:
* semantics.cc (same_type_ref_bind_p): New.
(finish_trait_expr): Use it.
Jonathan Wakely [Fri, 26 Apr 2024 10:42:26 +0000 (11:42 +0100)]
libstdc++: Do not apply localized formatting to NaN and inf [PR114863]
We don't want to add grouping to strings like "-inf", and there is no
radix character to replace either.
libstdc++-v3/ChangeLog:
PR libstdc++/114863
* include/std/format (__formatter_fp::format): Only use
_M_localized for finite values.
* testsuite/std/format/functions/format.cc: Check localized
formatting of NaN and infinity.
Since match.pd transforms (zero_one == 0) ? y : z <op> y
into ((typeof(y))zero_one * z) <op> y, add splitters to recognize
this expression and generate SFB instructions.
gcc/ChangeLog:
PR target/113095
* config/riscv/riscv.md: New splitters to rewrite single bit
sign extension as the condition to SFB instructions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sfb.c: New test.
* gcc.target/riscv/pr113095.c: New test.
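The equivalence behind the match.pd transform mentioned above can be checked
with a small sketch. It assumes zero_one is 0 or 1 and that 0 is an identity
for the operator on its left operand, which holds for +, | and ^; which
operators match.pd actually handles is not stated here. The name forms_agree
is hypothetical.

```cpp
#include <cstdint>

// Compares the two forms for both possible values of zero_one:
//   branchy:    (zero_one == 0) ? y : (z OP y)
//   branchless: ((T) zero_one * z) OP y
template <typename Op>
bool forms_agree(Op op, std::uint32_t y, std::uint32_t z)
{
  for (std::uint32_t zero_one = 0; zero_one <= 1; ++zero_one)
    {
      std::uint32_t branchy = (zero_one == 0) ? y : op(z, y);
      std::uint32_t branchless = op(zero_one * z, y);
      if (branchy != branchless)
        return false;
    }
  return true;
}
```

When zero_one is 0 the product is 0 and the result is y; when it is 1 the
product is z, so both forms compute z OP y.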
Richard Biener [Thu, 28 Sep 2023 09:51:30 +0000 (11:51 +0200)]
target/111600 - avoid deep recursion in access diagnostics
pass_waccess::check_dangling_stores uses recursion to traverse the CFG.
The following changes this to use a heap allocated worklist to avoid
blowing the stack.
Instead of using a better iteration order it tries hard to preserve
the current iteration order to avoid new false positives popping up,
since the set of stores we keep track of isn't properly modeling flow,
so what is diagnosed and what not is quite random. We are also
lacking the ideal RPO compute on the inverted graph that would just
ignore reverse unreachable code (as the current iteration scheme does).
PR target/111600
* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
Use a heap allocated worklist for CFG traversal instead of
recursion.
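The general technique, replacing a recursive graph walk with a heap-allocated
worklist while keeping the visiting order, can be sketched as below. This is a
toy model over a successor list, not the code in gimple-ssa-warn-access.cc;
the Cfg alias and visit_order are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG: succs[b] lists the successor block indices of block b.
using Cfg = std::vector<std::vector<std::size_t>>;

// Visit every block reachable from `entry` in depth-first pre-order,
// using a heap-allocated worklist instead of recursion.
std::vector<std::size_t> visit_order(const Cfg& succs, std::size_t entry)
{
  std::vector<std::size_t> order;
  std::vector<bool> visited(succs.size(), false);
  std::vector<std::size_t> worklist{entry};
  while (!worklist.empty())
    {
      std::size_t bb = worklist.back();
      worklist.pop_back();
      if (visited[bb])
        continue;
      visited[bb] = true;
      order.push_back(bb);
      // Push successors in reverse so the first successor is popped first,
      // which preserves the order a recursive walk would have used.
      for (auto it = succs[bb].rbegin(); it != succs[bb].rend(); ++it)
        if (!visited[*it])
          worklist.push_back(*it);
    }
  return order;
}
```

The depth of the walk is now bounded by heap memory rather than by the call
stack.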
Yang Yujie [Fri, 8 Dec 2023 10:01:18 +0000 (18:01 +0800)]
LoongArch: Fix eh_return epilogue for normal returns.
On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved
and restored in the function prologue and epilogue if the given function calls
__builtin_eh_return. This causes the return value to be overwritten on normal
return paths and breaks a rare case of libgcc's _Unwind_RaiseException.
gcc/ChangeLog:
PR target/114848
* config/loongarch/loongarch.cc: Do not restore the saved eh_return
data registers ($r4-$r7) for a normal return of a function that calls
__builtin_eh_return elsewhere.
* config/loongarch/loongarch-protos.h: Same.
* config/loongarch/loongarch.md: Same.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/eh_return-normal-return.c: New test.
This patch updates the Solaris baselines for the GLIBCXX_3.4.32 version
added in GCC 13.2.
Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (32 and 64-bit
each) on the gcc-13 branch and (together with the GLIBCXX_3.4.33 update)
on both gcc-14 branch and trunk.
gfortran: Allow ref'ing PDT's len() in parameter-initializer.
Fix declaring a parameter initialized using a pdt_len reference
not simplifying the reference to a constant.
2023-07-12 Andre Vehreschild <vehre@gcc.gnu.org>
gcc/fortran/ChangeLog:
PR fortran/102003
* expr.cc (find_inquiry_ref): Replace len of pdt_string by
constant.
(simplify_ref_chain): Ensure input to find_inquiry_ref is
NULL.
(gfc_match_init_expr): Prevent PDT analysis for function calls.
(gfc_pdt_find_component_copy_initializer): Get the initializer
value for given component.
* gfortran.h (gfc_pdt_find_component_copy_initializer): New
function.
* simplify.cc (gfc_simplify_len): Replace len() of PDT with pdt
component ref or constant.
Jonathan Wakely [Thu, 18 Apr 2024 16:26:55 +0000 (17:26 +0100)]
libstdc++: Add libstdc++_libbacktrace.a to libstdc++exp
This completes the fixes to put all experimental symbols into
libstdc++exp.a.
On trunk the libstdc++_libbacktrace.a was removed completely and its
contents added to libstdc++exp.a instead. We don't want to remove it on
the gcc-13 branch because that would break makefiles using it. We can
add the contents to libstdc++exp.a and then install a symlink so that
using -lstdc++_libbacktrace still works, but links to libstdc++exp.a
instead.
The libstdc++_libbacktrace.la libtool control file is removed by this
change, because I'm pretty sure it's not actually useful, and I don't
know whether it should be a symlink to libstdc++exp.la or a regular file
that refers to libstdc++_libbacktrace.a.
libstdc++-v3/ChangeLog:
* src/experimental/Makefile.am (install-exec-local): New target.
(uninstall-local): New target.
* src/experimental/Makefile.in: Regenerate.
* src/libbacktrace/Makefile.am: Build libstdc++_libbacktrace as
noinst_LTLIBRARIES so it's only a convenience library.
* src/libbacktrace/Makefile.in: Regenerate.
Jonathan Wakely [Fri, 2 Feb 2024 12:07:09 +0000 (12:07 +0000)]
libstdc++: Fix libstdc++exp.a so it really does contain Filesystem TS symbols
In r14-3812-gb96b554592c5cb I claimed that libstdc++exp.a now contains
all the symbols from libstdc++fs.a as well as libstdc++_libbacktrace.a,
but that wasn't true. Only the symbols from the latter were added to
libstdc++exp.a, the Filesystem TS ones weren't. This seems to be because
libtool won't combine static libs that are going to be installed
separately. Because libstdc++fs.a is still installed, libtool decides it
shouldn't be included in libstdc++exp.a.
The solution is similar to what we already do for libsupc++.a: build two
static libs, libstdc++fs.a and libstdc++fsconvenience.a, where the
former is installed and the latter isn't. If we then tell libtool to
include the latter in libstdc++exp.a it will do as it's told.
libstdc++-v3/ChangeLog:
* src/experimental/Makefile.am: Use libstdc++fsconvenience.a
instead of libstdc++fs.a.
* src/experimental/Makefile.in: Regenerate.
* src/filesystem/Makefile.am: Build libstdc++fsconvenience.a as
well.
* src/filesystem/Makefile.in: Regenerate.
Richard Ball [Thu, 25 Apr 2024 14:30:42 +0000 (15:30 +0100)]
arm: Zero/Sign extends for CMSE security
Co-Authored by: Andre Simoes Dias Vieira <Andre.SimoesDiasVieira@arm.com>
This patch makes the following changes:
1) When calling a secure function from non-secure code then any arguments
smaller than 32-bits that are passed in registers are zero- or sign-extended.
2) After a non-secure function returns into secure code then any return value
smaller than 32-bits that is passed in a register is zero- or sign-extended.
Kewen Lin [Tue, 9 Apr 2024 02:01:36 +0000 (21:01 -0500)]
rs6000: Fix wrong align passed to build_aligned_type [PR88309]
As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass align in bytes to
build_aligned_type, which actually requires align in
bits. This causes an unexpected ICE or a hang in function
is_miss_rate_acceptable due to a zero align_unit value.
This patch fixes them by converting bytes to bits, adds
an assertion on a positive align_unit value, and notes in
the function comment that build_aligned_type requires align
measured in bits.
PR target/88309
Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.
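A minimal sketch of the unit mix-up, assuming 8-bit bytes and using CHAR_BIT
as a stand-in for GCC's BITS_PER_UNIT. Both helper names are hypothetical, and
the way align_unit is derived here is an assumption made for illustration.

```cpp
#include <climits>

// Convert an alignment expressed in bytes to one expressed in bits.
constexpr unsigned align_bytes_to_bits(unsigned align_bytes)
{
  return align_bytes * CHAR_BIT;
}

// Convert an alignment expressed in bits back to whole bytes.
constexpr unsigned align_unit_from_bits(unsigned align_bits)
{
  return align_bits / CHAR_BIT;
}

// A 4-byte alignment mistakenly passed as "4 bits" truncates to zero bytes.
static_assert(CHAR_BIT != 8 || align_unit_from_bits(4) == 0,
              "byte value misread as bits yields a zero unit");
// Converting to bits first round-trips correctly.
static_assert(align_unit_from_bits(align_bytes_to_bits(4)) == 4,
              "bytes -> bits -> bytes is lossless");
```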
Kito Cheng [Wed, 24 Apr 2024 08:54:44 +0000 (16:54 +0800)]
RISC-V: Fix recursive vsetvli checking [PR114172]
extract_single_source recursively checks the sources to
make sure there is a single source; however, this may cause infinite
recursion when the source comes from itself, so it should just skip
the first source to prevent that.
NOTE: This logic already exists on trunk/GCC 14, but it was included in a big
vsetvli improvement patch, which has not been backported to GCC 13.
Jerry DeLisle [Mon, 22 Apr 2024 03:50:26 +0000 (20:50 -0700)]
libfortran: Fix handling of formatted separators.
Backport from mainline.
PR libfortran/114304
PR libfortran/105473
libgfortran/ChangeLog:
* io/list_read.c (eat_separator): Add logic to handle spaces
preceding a comma or semicolon such that a 'null' read
occurs without error at the end of comma or semicolon
terminated input lines. Add check and error message for ';'.
Accept tab as alternative to space.
(list_formatted_read_scalar): Treat comma as a decimal point
when specified by the decimal mode on the first item.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr105473.f90: Modify for revised checks.
* gfortran.dg/pr114304-2.f90: New test.
* gfortran.dg/pr114304.f90: New test.
Jakub Jelinek [Fri, 19 Apr 2024 22:12:36 +0000 (00:12 +0200)]
c-family: Allow arguments with NULLPTR_TYPE as sentinels [PR114780]
While in C++ the ellipsis argument conversions include
"An argument that has type cv std::nullptr_t is converted to type void*"
in C23 a nullptr_t argument is not promoted in any way, but va_arg
description says:
"the type of the next argument is nullptr_t and type is a pointer type that has the same
representation and alignment requirements as a pointer to a character type."
So, while in C++ check_function_sentinel will never see NULLPTR_TYPE, for
C23 it can see that and currently we incorrectly warn about those.
The only question is whether we should warn on any argument with
nullptr_t type or just about nullptr (nullptr_t argument with integer_zerop
value). Through undefined behavior guess one could pass non-NULL pointer
that way, say by union { void *p; nullptr_t q; } u; u.p = &whatever;
and pass u.q to ..., but valid code should always pass something that will
read as (char *) 0 when read using va_arg (ap, char *), so I think it is
better not to warn rather than warn in those cases.
Note, clang seems to pass (void *)0 rather than expression of nullptr_t
type to ellipsis in C23 mode as if it did the C++ ellipsis argument
conversions, in that case guess not warning about that would be even safer,
but what GCC does I think follows the spec more closely, even when in a
valid program one shouldn't be able to observe the difference.
2024-04-20 Jakub Jelinek <jakub@redhat.com>
PR c/114780
* c-common.cc (check_function_sentinel): Allow as sentinel any
argument of NULLPTR_TYPE.
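For context, a sentinel-terminated variadic function looks roughly like the
sketch below. It is written in C++, where a nullptr argument passed through the
ellipsis is converted to void*, and it relies on the GNU sentinel attribute,
which is a compiler extension. The function name count_strings is hypothetical.

```cpp
#include <cstdarg>
#include <cstddef>

// Counts the strings passed before the terminating null pointer.
// The sentinel attribute asks the compiler to warn when a call does not
// end its variable arguments with a null pointer.
__attribute__((sentinel))
std::size_t count_strings(const char* first, ...)
{
  std::size_t n = 0;
  va_list ap;
  va_start(ap, first);
  for (const char* s = first; s != nullptr; s = va_arg(ap, const char*))
    ++n;
  va_end(ap);
  return n;
}
```

A call such as count_strings("a", "b", nullptr) ends with the sentinel that the
warning checks for.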
Jakub Jelinek [Fri, 19 Apr 2024 06:47:53 +0000 (08:47 +0200)]
rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]
On the following testcase, combine propagates the mem/v load into mem store
with the same address and then removes it, because noop_move_p says it is a
no-op move. If it was the other way around, i.e. mem/v store and mem load,
or both would be mem/v, it would be kept.
The problem is that rtx_equal_p never checks any kind of flags on the rtxes
(and I think it would be quite dangerous to change it at this point), and
set_noop_p checks side_effects_p on just one of the operands, not both.
In the MEM <- MEM set, it only checks it on the destination; in a
store to ZERO_EXTRACT it only checks it on the source.
The following patch adds the missing side_effects_p checks.
2024-04-19 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/114768
* rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
sets if src has side-effects or for stores into ZERO_EXTRACT
if ZERO_EXTRACT operand has side-effects.
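The rule for the MEM <- MEM case can be modeled with a toy predicate. This is
only a sketch of the idea that both operands must be free of side effects; it
does not use GCC's RTL types, and ToyMem and toy_set_is_noop are hypothetical
names.

```cpp
// Toy stand-in for an RTL MEM: an address plus a volatile flag.
struct ToyMem
{
  int address;
  bool is_volatile;
};

// A MEM <- MEM copy is a no-op only when both sides name the same location
// and neither the destination nor the source access has side effects.
bool toy_set_is_noop(const ToyMem& dest, const ToyMem& src)
{
  bool same_location = dest.address == src.address;
  return same_location && !dest.is_volatile && !src.is_volatile;
}
```

Checking only the destination would wrongly treat a volatile load stored back
to the same non-volatile location as removable.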
Jakub Jelinek [Thu, 18 Apr 2024 07:45:14 +0000 (09:45 +0200)]
internal-fn: Temporarily disable flag_trapv during .{ADD,SUB,MUL}_OVERFLOW etc. expansion [PR114753]
__builtin_{add,sub,mul}_overflow{,_p} builtins are well defined
for all inputs even for -ftrapv, and the -fsanitize=signed-integer-overflow
ifns shouldn't abort in libgcc but emit the desired ubsan diagnostics
or abort depending on -fsanitize* setting regardless of -ftrapv.
The expansion of these internal functions uses expand_expr* in various
places (e.g. MULT_EXPR at least in 2 spots), so temporarily disabling
flag_trapv in all those spots would be hard.
The following patch disables it around the bodies of 3 functions
which can do the expand_expr calls.
If it was in the C++ FE, I'd use some RAII sentinel, but I don't think
we have one in the middle-end.
2024-04-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114753
* internal-fn.cc (expand_mul_overflow): Save flag_trapv and
temporarily clear it for the duration of the function, then
restore previous value.
(expand_vector_ubsan_overflow): Likewise.
(expand_arith_overflow): Likewise.
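The RAII sentinel alluded to above could look roughly like this. It is a
generic sketch and not code from GCC: toy_flag_trapv stands in for the real
flag, and ScopedFlagClear and flag_during_expansion are hypothetical names.

```cpp
// Hypothetical stand-in for a global option flag such as flag_trapv.
int toy_flag_trapv = 1;

// Clears the flag on construction and restores the saved value on
// destruction, so every exit path from the scope restores it.
class ScopedFlagClear
{
public:
  explicit ScopedFlagClear(int& flag) : flag_(flag), saved_(flag) { flag_ = 0; }
  ~ScopedFlagClear() { flag_ = saved_; }
  ScopedFlagClear(const ScopedFlagClear&) = delete;
  ScopedFlagClear& operator=(const ScopedFlagClear&) = delete;

private:
  int& flag_;
  int saved_;
};

// Returns the flag value observed while the sentinel is alive.
int flag_during_expansion()
{
  ScopedFlagClear guard(toy_flag_trapv);
  return toy_flag_trapv;
}
```

The manual save, clear and restore described in the ChangeLog achieves the same
effect without such a helper class.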
.ABNORMAL_DISPATCHER is currently the only internal function with
ECF_NORETURN, and asan likes to instrument ECF_NORETURN calls by adding
some builtin call before them, which breaks the .ABNORMAL_DISPATCHER
discovery added in gsi_safe_*.
The following patch fixes asan not to instrument .ABNORMAL_DISPATCHER
calls, like it doesn't instrument a couple of specific builtin calls
as well.
2024-04-17 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114743
* asan.cc (maybe_instrument_call): Don't instrument calls to
.ABNORMAL_DISPATCHER.
* gcc.dg/asan/pr112709-2.c (freddy): New function from
gcc.dg/ubsan/pr112709-2.c version of the test.
Jakub Jelinek [Mon, 15 Apr 2024 08:25:22 +0000 (10:25 +0200)]
attribs: Don't crash on NULL TREE_TYPE in diag_attr_exclusions [PR114634]
The enumerator still doesn't have TREE_TYPE set but diag_attr_exclusions
assumes that all decls must have types.
I think it is better in something as unimportant as diag_attr_exclusions
to be more robust, if there is no type, it can just diagnose exclusions
on the DECL_ATTRIBUTES, like for types it only diagnoses it on
TYPE_ATTRIBUTES.
2024-04-15 Jakub Jelinek <jakub@redhat.com>
PR c++/114634
* attribs.cc (diag_attr_exclusions): Set attrs[1] to NULL_TREE for
decls with NULL TREE_TYPE.
Jakub Jelinek [Fri, 12 Apr 2024 18:53:10 +0000 (20:53 +0200)]
c++: Fix bogus warnings about ignored annotations [PR114691]
The middle-end warns about the ANNOTATE_EXPR added for while/for loops
if they declare a var inside of the loop condition.
This is because the assumption is that ANNOTATE_EXPR argument is used
immediately in a COND_EXPR (later GIMPLE_COND), but simplify_loop_decl_cond
wraps the ANNOTATE_EXPR inside of a TRUTH_NOT_EXPR, so it no longer
holds.
The following patch fixes that by adding the TRUTH_NOT_EXPR inside of the
ANNOTATE_EXPR argument if any.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR c++/114691
* semantics.cc (simplify_loop_decl_cond): Use cp_build_unary_op with
TRUTH_NOT_EXPR on ANNOTATE_EXPR argument (if any) rather than
ANNOTATE_EXPR itself.
Jakub Jelinek [Fri, 12 Apr 2024 08:59:54 +0000 (10:59 +0200)]
Limit special asan/ubsan/bitint returns_twice handling to calls in bbs with abnormal pred [PR114687]
The tree-cfg.cc verifier only diagnoses returns_twice calls preceded
by non-label/debug stmts if it is in a bb with abnormal predecessor.
The following testcase shows that if a user lies in the attributes
(a function which never returns can't be pure, and can't return
twice when it doesn't ever return at all), when we figure it out,
we can remove the abnormal edges to the "returns_twice" call and perhaps
whole .ABNORMAL_DISPATCHER etc.
edge_before_returns_twice_call then ICEs because it can't find such
an edge.
The following patch limits the special handling to calls in bbs where
the verifier requires that.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114687
* gimple-iterator.cc (gsi_safe_insert_before): Only use
edge_before_returns_twice_call if bb_has_abnormal_pred.
(gsi_safe_insert_seq_before): Likewise.
Jakub Jelinek [Thu, 11 Apr 2024 09:12:11 +0000 (11:12 +0200)]
asan, v3: Fix up handling of > 32 byte aligned variables with -fsanitize=address -fstack-protector* [PR110027]
On Tue, Mar 26, 2024 at 02:08:02PM +0800, liuhongt wrote:
> > > So, try to add some other variable with larger size and smaller alignment
> > > to the frame (and make sure it isn't optimized away).
> > >
> > > alignb above is the alignment of the first partition's var, if
> > > align_frame_offset really needs to depend on the var alignment, it probably
> > > should be the maximum alignment of all the vars with alignment
> > > alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT
> > >
>
> In asan_emit_stack_protection, when it allocated fake stack, it assume
> bottom of stack is also aligned to alignb. And the place violated this
> is the first var partition. which is 32 bytes offsets, it should be
> BIGGEST_ALIGNMENT / BITS_PER_UNIT.
> So I think we need to use MAX (BIGGEST_ALIGNMENT /
> BITS_PER_UNIT, ASAN_RED_ZONE_SIZE) for the first var partition.
Your first patch aligned offsets[0] to maximum of alignb and
ASAN_RED_ZONE_SIZE. But as I wrote in the reply to that mail, alignb there
is the alignment of just a single variable which is the first one to appear
in the sorted list and is placed in the highest spot in the stack frame.
That is not necessarily the largest alignment, the sorting ensures that it
is a variable with the largest size in the frame (and only if several of
them have equal size, largest alignment from the same sized ones). Your
second patch used maximum of BIGGEST_ALIGNMENT / BITS_PER_UNIT and
ASAN_RED_ZONE_SIZE. That doesn't change anything at all when using
-mno-avx512f - offsets[0] is still just 32-byte aligned in that case
relative to top of frame, just changes the -mavx512f case to be 64-byte
aligned offsets[0] (aka offsets[0] is then either 0 or -64 instead of either
0 or -32). That will not help if any variable in the frame needs 128-byte,
256-byte, 512-byte ... 4096-byte alignment. If you want to fix the bug in
the spot you've touched, you'd need to walk all the
stack_vars[stack_vars_sorted[si2]] for si2 [si + 1, n - 1] and for those
where the loop would do anything (i.e.
stack_vars[i2].representative == i2
&& TREE_CODE (decl2) == SSA_NAME
? SA.partition_to_pseudo[var_to_partition (SA.map, decl2)] == NULL_RTX
: DECL_RTL (decl2) == pc_rtx
and the pred applies (but that means also walking the earlier ones!
because with -fstack-protector* the vars can be processed in several calls) and
alignb2 * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT
and compute maximum of those alignments.
That maximum is already computed,
data->asan_alignb = MAX (data->asan_alignb, alignb);
computes that, but you get the final result only after you do all the
expand_stack_vars calls. You'd need to compute it before.
Though, that change would be still in the wrong place.
The thing is, it would be a waste of the precious stack space when it isn't
needed at all (e.g. when asan will not at compile time do the use after
return checking, or if it won't do it at runtime, or even if it will do at
runtime it will waste the space on the stack).
The following patch fixes it solely for the __asan_stack_malloc_N
allocations, doesn't enlarge unnecessarily further the actual stack frame.
Because asan is only supported on FRAME_GROWS_DOWNWARD architectures
(mips, rs6000 and xtensa are conditional FRAME_GROWS_DOWNWARD arches, which
for -fsanitize=address or -fstack-protector* use FRAME_GROWS_DOWNWARD 1,
otherwise 0, others supporting asan always just use 1), the assumption for
the dynamic stack realignment is that the top of the stack frame (aka offset
0) is aligned to alignb passed to the function (which is the maximum of alignb
of all the vars in the frame). As checked by the assertion in the patch,
offsets[0] is 0 most of the time and so that assumption is correct, the only
case when it is not 0 is if -fstack-protector* is on together with
-fsanitize=address and cfgexpand.cc (create_stack_guard) created a stack
guard. That is the only variable which is allocated in the stack frame
right away, for all others with -fsanitize=address defer_stack_allocation
(or -fstack-protector*) returns true and so they aren't allocated
immediately but handled during the frame layout phases. So, the original
frame_offset of 0 is changed because of the stack guard to
-pointer_size_in_bytes and later at the
if (data->asan_vec.is_empty ())
{
align_frame_offset (ASAN_RED_ZONE_SIZE);
prev_offset = frame_offset.to_constant ();
}
to -ASAN_RED_ZONE_SIZE. The asan_emit_stack_protection code wasn't
taking this into account though, so essentially assumed in the
__asan_stack_malloc_N allocated memory it needs to align it such that
pointer corresponding to offsets[0] is alignb aligned. But that isn't
correct if alignb > ASAN_RED_ZONE_SIZE, in that case it needs to ensure that
pointer corresponding to frame offset 0 is alignb aligned.
The following patch fixes that. Unlike the previous case where
we knew that asan_frame_size + base_align_bias falls into the same bucket
as asan_frame_size, this isn't in some cases true anymore, so the patch
recomputes which bucket to use and if going to bucket 11 (because there is
no __asan_stack_malloc_11 function in the library) disables the after return
sanitization.
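As a rough illustration of the bucket recomputation described above, the sketch below maps a frame size to a use-after-return size class. It assumes that the bucket for class N holds up to 64 << N bytes and that classes 0 through 10 exist (the text itself only says there is no __asan_stack_malloc_11). The function names are hypothetical and this is not the code in asan.cc.

```cpp
#include <cstdint>

// Hypothetical helper, not GCC's code: pick the smallest class N such that
// (64 << N) >= size. Returns -1 when the size would need class 11 or
// higher, meaning the after-return sanitization is disabled.
int use_after_return_class(std::uint64_t size) {
    const int kMaxClass = 10;      // assumption: classes 0..10 exist
    std::uint64_t bucket = 64;     // assumption: class 0 holds 64 bytes
    for (int n = 0; n <= kMaxClass; ++n) {
        if (size <= bucket)
            return n;
        bucket <<= 1;
    }
    return -1;
}

// Adding the alignment bias can push a frame into the next bucket, so the
// class is recomputed from the sum rather than from the frame size alone.
int class_with_bias(std::uint64_t frame_size, std::uint64_t base_align_bias) {
    return use_after_return_class(frame_size + base_align_bias);
}
```

Under these assumptions a frame that exactly fills the largest bucket no longer fits once any bias is added, which is the case where the patch falls back to -1.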
2024-04-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/110027
* asan.cc (asan_emit_stack_protection): Assert offsets[0] is
zero if there is no stack protect guard, otherwise
-ASAN_RED_ZONE_SIZE. If alignb > ASAN_RED_ZONE_SIZE and there is
stack pointer guard, take the ASAN_RED_ZONE_SIZE bytes allocated at
the top of the stack into account when computing base_align_bias.
Recompute use_after_return_class from asan_frame_size + base_align_bias
and set to -1 if that would overflow to 11.
Jakub Jelinek [Tue, 9 Apr 2024 07:31:42 +0000 (09:31 +0200)]
c++: Fix up maybe_warn_for_constant_evaluated calls [PR114580]
When looking at maybe_warn_for_constant_evaluated for the trivial
infinite loops patch, I've noticed that it can emit weird diagnostics
for if constexpr in templates, first warn that std::is_constant_evaluated()
always evaluates to false (because the function template is not constexpr)
and then during instantiation warn that std::is_constant_evaluated()
always evaluates to true (because it is used in if constexpr condition).
Now, only the latter is actually true, even when the if constexpr
is in a non-constexpr function, it will still always evaluate to true.
So, the following patch fixes it to call maybe_warn_for_constant_evaluated
always with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument rather than
true if it is if constexpr with non-dependent condition etc.
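A small sketch of the behaviour being described: the condition of if constexpr is always constant-evaluated, even inside a function template that is not constexpr. To keep the example compilable before C++20 it uses the GCC/Clang builtin __builtin_is_constant_evaluated(), which is an assumption of this sketch; C++20 code would normally spell it std::is_constant_evaluated(). The function names are illustrative only.

```cpp
// Not a constexpr function template, yet the condition of `if constexpr`
// is itself a constant expression, so the first branch is always chosen.
template <typename T>
int probe(T) {
    if constexpr (__builtin_is_constant_evaluated())
        return 1;
    else
        return 0;
}

// For contrast: a plain `if` in a non-constexpr function called at run
// time is not constant-evaluated, so the check yields false.
int in_plain_if() {
    if (__builtin_is_constant_evaluated())
        return 1;
    return 0;
}
```

Compilers may warn about both functions; the point of the fix above is that only the "always true" warning is appropriate for the if constexpr form.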
2024-04-09 Jakub Jelinek <jakub@redhat.com>
PR c++/114580
* semantics.cc (finish_if_stmt_cond): Call
maybe_warn_for_constant_evaluated with IF_STMT_CONSTEXPR_P (if_stmt)
as the second argument, rather than true/false depending on if
it is if constexpr with non-dependent constant expression with
bool type.
* g++.dg/cpp2a/is-constant-evaluated15.C: New test.
Jakub Jelinek [Fri, 5 Apr 2024 12:56:14 +0000 (14:56 +0200)]
vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]
The following testcase is miscompiled, because in the vectorized
epilogue the vectorizer assumes it can use aligned loads/stores
(if the base decl gets alignment increased), but it actually doesn't
increase that.
This is because r10-4203-g97c1460367 added the hunk the following
patch removes. The explanation feels reasonable, but actually it
is not true as the testcase proves.
The thing is, we vectorize the main loop with 64-byte vectors
and the corresponding data refs have base_alignment 16 (the
a array has DECL_ALIGN 128) and offset_alignment 32. Now, because
of the offset_alignment 32 rather than 64, we need to use unaligned
loads/stores in the main loop (and ditto in the first load/store
in vectorized epilogue). But the second load/store in the vectorized
epilogue uses only 32-byte vectors and because it is a multiple
of offset_alignment, it checks if we could increase alignment of the
a VAR_DECL, the function returns true, sets base_misaligned = true
and says the access is then aligned.
But when update_epilogue_loop_vinfo clears base_misaligned with the
assumption that the var had to have the alignment increased already,
the update of DECL_ALIGN doesn't happen anymore.
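The alignment reasoning above can be modelled roughly as follows. This is a deliberately simplified sketch, not the vectorizer's logic: all names are hypothetical, sizes are in bytes, and it only captures the three outcomes discussed in the text (unaligned, already aligned, aligned provided the base declaration's alignment is actually raised later).

```cpp
#include <cstddef>

enum class Access { Unaligned, Aligned, AlignedIfBaseRaised };

// Simplified model (assumption, not GCC code).
//   base_align:   alignment currently guaranteed for the base object
//   offset_align: the offset from the base is a known multiple of this
//   vec_size:     size and required alignment of the vector access
//   can_raise:    whether the base declaration's alignment may be increased
Access classify(std::size_t base_align, std::size_t offset_align,
                std::size_t vec_size, bool can_raise) {
    if (offset_align % vec_size != 0)
        return Access::Unaligned;       // the offset may break alignment
    if (base_align % vec_size == 0)
        return Access::Aligned;         // the base is already good enough
    // Aligned only if someone really increases the base alignment later;
    // this is the state the text describes with base_misaligned = true.
    return can_raise ? Access::AlignedIfBaseRaised : Access::Unaligned;
}
```

With the numbers from the text, classify(16, 32, 64, true) is Unaligned for the main loop, while classify(16, 32, 32, true) is AlignedIfBaseRaised for the 32-byte epilogue access, so skipping the later alignment increase makes the aligned access invalid.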
Now, I'd think this base_misaligned = false was needed before the
r10-4030-gd2db7f7901 change was committed where it incorrectly
overwrote DECL_ALIGN even if it was already larger, rather than
just always increasing it. But with that change in, it doesn't
make sense to me anymore.
Note, the testcase is latent on the trunk, but reproduces on the 13
branch.
Jakub Jelinek [Fri, 5 Apr 2024 07:31:28 +0000 (09:31 +0200)]
c++: Fix ICE with weird copy assignment operator [PR114572]
While ctors/dtors don't return anything (undeclared void or this pointer
on arm) and copy assignment operators normally return a reference to *this,
it isn't invalid to return uselessly some class object which might need
destructing, but the OpenMP clause handling code wasn't expecting that.
The following patch fixes that.
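For readers unfamiliar with the pattern, here is a minimal sketch of such a "weird" copy assignment operator: it returns a class object by value, so every assignment produces a temporary that must be destroyed. The type and counter names are made up for illustration and the OpenMP side is not modelled.

```cpp
// Counts destructor calls so the temporary returned by operator= shows up.
int g_destroyed = 0;

struct Weird {
    int value = 0;
    Weird() = default;
    Weird(const Weird&) = default;
    // Unusual but valid: returns a copy instead of the usual Weird&.
    Weird operator=(const Weird& other) {
        value = other.value;
        return *this;               // copy-constructs the returned object
    }
    ~Weird() { ++g_destroyed; }
};

// Returns how many objects were destroyed by the assignment statement
// itself, before the locals a and b go out of scope.
int temporaries_destroyed_by_assignment() {
    Weird a, b;
    b.value = 7;
    int before = g_destroyed;
    a = b;                          // returned temporary dies at the ';'
    return g_destroyed - before;
}
```

Any code that synthesizes a call to such an operator has to arrange for that temporary's destructor to run, which is what the fix takes care of in the clause handling.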
2024-04-05 Jakub Jelinek <jakub@redhat.com>
PR c++/114572
* cp-gimplify.cc (cxx_omp_clause_apply_fn): Call build_cplus_new
on build_call_a result if it has class type.
Jakub Jelinek [Thu, 4 Apr 2024 08:47:52 +0000 (10:47 +0200)]
fold-const: Handle NON_LVALUE_EXPR in native_encode_initializer [PR114537]
The following testcase is incorrectly rejected. The problem is that
for bit-fields native_encode_initializer expects the corresponding
CONSTRUCTOR elt value must be INTEGER_CST, but that isn't the case
here, it is wrapped into NON_LVALUE_EXPR by maybe_wrap_with_location.
We could STRIP_ANY_LOCATION_WRAPPER as well, but as all we are looking for
is INTEGER_CST inside, just looking through NON_LVALUE_EXPR seems easier.
2024-04-04 Jakub Jelinek <jakub@redhat.com>
PR c++/114537
* fold-const.cc (native_encode_initializer): Look through
NON_LVALUE_EXPR if val is INTEGER_CST.
Jakub Jelinek [Wed, 3 Apr 2024 08:02:35 +0000 (10:02 +0200)]
libquadmath: Don't assume the storage for __float128 arguments is aligned [PR114533]
With the register_printf_type/register_printf_modifier/register_printf_specifier
APIs the C library is just told the size of the argument and is provided with
a callback to fetch the argument from va_list using va_arg into C library provided
memory. The C library isn't told what alignment requirement it has, but we were
using direct load of a __float128 value from that memory which assumes
__alignof (__float128) alignment.
The following patch fixes that by using memcpy instead.
I haven't been able to reproduce an actual crash, tried
#include <quadmath.h>
#include <stdlib.h>
#include <stdio.h>
int main ()
{
__float128 r;
int prec = 20;
int width = 46;
char buf[128];
r = 2.0q;
r = sqrtq (r);
int n = quadmath_snprintf (buf, sizeof buf, "%+-#*.20Qe", width, r);
if ((size_t) n < sizeof buf)
printf ("%s\n", buf);
/* Prints: +1.41421356237309504880e+00 */
n = quadmath_snprintf (buf, sizeof buf, "%Qa", r);
if ((size_t) n < sizeof buf)
printf ("%s\n", buf);
/* Prints: 0x1.6a09e667f3bcc908b2fb1366ea96p+0 */
n = quadmath_snprintf (NULL, 0, "%+-#46.*Qe", prec, r);
if (n > -1)
{
char *str = malloc (n + 1);
if (str)
{
quadmath_snprintf (str, n + 1, "%+-#46.*Qe", prec, r);
printf ("%s\n", str);
/* Prints: +1.41421356237309504880e+00 */
}
free (str);
}
printf ("%+-#*.20Qe\n", width, r);
printf ("%Qa\n", r);
printf ("%+-#46.*Qe\n", prec, r);
printf ("%d %Qe %d %Qe %d %Qe\n", 1, r, 2, r, 3, r);
return 0;
}
In any case, I think memcpy for loading from it is right.
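The memcpy approach can be sketched like this. To avoid depending on libquadmath the example uses a hypothetical 16-byte, 16-byte-aligned struct named Wide as a stand-in for __float128; the helper names are also illustrative.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for __float128 (assumption): 16 bytes with 16-byte alignment.
struct alignas(16) Wide {
    unsigned char bytes[16];
};

// `storage` may have any alignment, so it must not be dereferenced as a
// Wide*. memcpy places no alignment requirement on its source.
Wide load_wide(const void* storage) {
    Wide w;
    std::memcpy(&w, storage, sizeof w);
    return w;
}

// Demonstration: read a value that starts one byte into a byte buffer.
unsigned char first_byte_from_offset_one() {
    unsigned char buf[sizeof(Wide) + 1];
    for (std::size_t i = 0; i < sizeof buf; ++i)
        buf[i] = static_cast<unsigned char>(i);
    return load_wide(buf + 1).bytes[0];
}
```

Compilers typically turn such a fixed-size memcpy into the appropriate unaligned load, so the safer form usually costs little.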
2024-04-03 Simon Chopin <simon.chopin@canonical.com>
Jakub Jelinek <jakub@redhat.com>
PR libquadmath/114533
* printf/printf_fp.c (__quadmath_printf_fp): Use memcpy to copy
__float128 out of args.
* printf/printf_fphex.c (__quadmath_printf_fphex): Likewise.
Jakub Jelinek [Wed, 3 Apr 2024 07:59:45 +0000 (09:59 +0200)]
expr: Fix up emit_push_insn [PR114552]
r13-990 added optimizations in multiple spots to optimize during
expansion storing of constant initializers into targets.
In the load_register_parameters and expand_expr_real_1 cases,
it checks it has a tree as the source and so knows we are reading
that whole decl's value, so the code is fine as is, but in the
emit_push_insn case it checks for a MEM from which something
is pushed and checks for SYMBOL_REF as the MEM's address, but
still assumes the whole object is copied, which as the following
testcase shows might not always be the case. In the testcase,
k is 6 bytes, then 2 bytes of padding, then another 4 bytes,
while the emit_push_insn wants to store just the 6 bytes.
The following patch simply verifies it is the whole initializer
that is being stored, I think that is the best thing to do so late
in the GCC 14 cycle, as well as for backporting.
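To make the layout issue concrete, here is a hypothetical struct with the shape described above (6 bytes, padding, then 4 more bytes). It is not the PR's testcase, and the 2-byte padding assumes a common ABI where int is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>

// Hypothetical layout: 6 bytes of data, then padding up to the alignment
// of the following int member.
struct K {
    unsigned char head[6];
    int tail;               // assumption: 4-byte int, 4-byte alignment
};

// Bytes of padding between the end of `head` and the start of `tail`.
constexpr std::size_t padding_after_head() {
    return offsetof(K, tail) - sizeof(K::head);
}

// The check the fix adds, in spirit: only treat the store as a store of
// the whole initializer when the sizes really match.
constexpr bool copies_whole_object(std::size_t bytes_copied) {
    return bytes_copied == sizeof(K);
}
```

Pushing only the leading 6 bytes therefore fails the size check, and the whole-object shortcut must not be used for it.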
For GCC 15, perhaps the code could stop requiring it must be at offset zero,
nor that the size is equal, but could use
get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually
extract just part of the initializer if we e.g. push just some subset
(of course, still verify that it is a subset). For sizes which are power
of two bytes and we have some integer modes, we could use as type for
fold_ctor_reference corresponding integral types, otherwise dunno, punt
or use some structure (e.g. try to find one in the initializer?), whatever.
But even in the other spots it could perhaps handle loading of
COMPONENT_REFs or MEM_REFs from the .rodata vars.
2024-04-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114552
* expr.cc (emit_push_insn): Only use store_constructor for
immediate_const_ctor_p if int_expr_size matches size.
Jakub Jelinek [Tue, 2 Apr 2024 11:40:27 +0000 (13:40 +0200)]
Fix up postboot dependencies [PR106472]
On Wed, Mar 13, 2024 at 10:13:37AM +0100, Jakub Jelinek wrote:
> While the first Makefile.tpl hunk looks obviously ok, the others look
> completely wrong to me.
> There is nothing special about libgo vs. libbacktrace/libatomic
> compared to any other target library which is not bootstrapped vs. any
> of its dependencies which are in the bootstrapped set.
> So, Makefile.tpl shouldn't hardcode such dependencies.
Here is my version of the fix.
The dependencies in the toplevel Makefile simply didn't take into account
that some target modules could be in a bootstrapped build built in some
configurations as bootstrap modules (typically as dependencies of other
target bootstrap modules), while in other configurations just as
dependencies of non-bootstrap target modules and so not built during the
bootstrap, but after it.
Makefile.tpl arranges for those postboot target module -> target module
dependencies to be emitted only inside of an @unless gcc-bootstrap block,
while for @if gcc-bootstrap it just emits
configure-target-whatever: stage_last
dependencies which ensure those postbootstrap target modules are only built
after everything that is bootstrapped has been.
Now, the libbacktrace/libatomic target modules have bootstrap=true
target_modules = { module= libbacktrace; bootstrap=true; };
target_modules = { module= libatomic; bootstrap=true; lib_path=.libs; };
because those modules are dependencies of libphobos target module, so
when d is included among bootstrapped languages, those are all bootstrapped
and everything works correctly.
While if d is not included, libphobos target module is disabled,
libbacktrace/libatomic target modules aren't bootstrapped, nothing during
bootstrap needs them, but post bootstrap libgo target module depends on
the libatomic and libbacktrace target modules, libgfortran target module
depends on the libbacktrace target module and libgm2 target module depends
on the libatomic target module, but those dependencies were emitted only
@unless gcc-bootstrap. There is a similar theoretical problem for zlib
target module if GCJ were resurrected, libphobos as bootstrap target
module depends on the zlib target module, but if d is not configured,
fastjar also depends on it.
The following patch arranges for the @if gcc-bootstrap case to also emit
target module -> target module dependencies, but only when the module
being depended on is not bootstrapped.
In the generated Makefile.in you can see what the Makefile.tpl change
produces and that it just adds extra dependencies which weren't there
before in the @if gcc-bootstrap case.
I've bootstrapped without this patch with
../configure --enable-languages=c,c++,go; make
on x86_64-linux (note, make -j2 or higher usually worked) which failed
as described in the PR, then with this patch with the same command which
built fine and the Makefile difference between the two builds being
diff -up obj40{a,b}/Makefile
--- obj40a/Makefile 2024-03-31 00:35:22.243791499 +0100
+++ obj40b/Makefile 2024-03-31 22:40:38.143299144 +0200
@@ -29376,6 +29376,14 @@ configure-bison: stage_last
configure-flex: stage_last
configure-m4: stage_last
# Dependencies for target modules on other target modules are
# described by lang_env_dependencies; the defaults apply to anything
which I believe are exactly the extra dependencies we want.
Plus I've done normal x86_64-linux and i686-linux bootstraps/regtests
which in my case include --enable-languages=default,ada,obj-c++,lto,go,d,rust,m2
for x86_64 and the same except ada for i686; those with my usual make -j32.
The Makefile difference in those builds vs. unpatched case
is just an extra empty line.
2024-04-02 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/106472
* Makefile.tpl (make-postboot-target-dep): New lambda.
Use it to add --enable-bootstrap dependencies of target modules
on other target modules if the latter aren't bootstrapped.
* Makefile.in: Regenerate.
Martin Jambor [Fri, 19 Apr 2024 14:48:12 +0000 (16:48 +0200)]
ipa: Force args obtained through pass-through maps to the expected type (PR 113964)
Interactions of IPA-CP and IPA-SRA on the same data are a rather big
source of issues, I'm afraid. PR 113964 is a situation where IPA-CP
propagates an unsigned short in a union parameter into a function
which itself calls a different function which has the same union
parameter and both these union parameters are split with IPA-SRA. The
leaf function however uses a signed short member of the union.
In the calling function, we get the unsigned constant as the
replacement for the union and it is then passed in the call without
any type compatibility checks. Apparently on riscv64 it matters
whether the parameter is signed or unsigned short and so the leaf
function can see different values.
Fixed by using useless_type_conversion_p at the appropriate place and
if it fails, use force_value_to_type as elsewhere in similar
situations.
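Why the signedness matters can be seen in a few lines: the same 16 bits widen to different register-sized values depending on whether they are treated as unsigned or signed. This is only an illustration of the value mismatch, with made-up function names; it does not model IPA-CP or IPA-SRA.

```cpp
#include <cstdint>

// Treat the 16 bits as unsigned short and widen: zero-extension.
std::int64_t widen_as_unsigned(std::uint16_t bits) {
    return static_cast<std::int64_t>(bits);
}

// Treat the same 16 bits as signed short and widen: sign-extension.
// The narrowing cast wraps modulo 2^16 (guaranteed since C++20, and what
// GCC does in earlier modes as well).
std::int64_t widen_as_signed(std::uint16_t bits) {
    return static_cast<std::int64_t>(static_cast<std::int16_t>(bits));
}
```

For 0xFFFF the two functions give 65535 and -1 respectively, so a callee reading the signed member can observe a different value unless the constant is first converted to the expected type.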
gcc/ChangeLog:
2024-04-04 Martin Jambor <mjambor@suse.cz>
PR ipa/113964
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
Force values obtained through pass-through maps to the expected
split type.
gcc/testsuite/ChangeLog:
2024-04-04 Patrick O'Neill <patrick@rivosinc.com>
Martin Jambor <mjambor@suse.cz>
Martin Jambor [Fri, 19 Apr 2024 14:48:12 +0000 (16:48 +0200)]
ipa: Avoid duplicate replacements in IPA-SRA transformation phase
When the analysis part of IPA-SRA figures out that it would split out
a scalar part of an aggregate which is known by IPA-CP to contain a
known constant, it skips it knowing that the transformation part looks
at IPA-CP aggregate results too and does the right thing (which can
include doing the propagation in GIMPLE because that is the last
moment the parameter exists).
However, when IPA-SRA wants to split out a smaller aggregate out
of an aggregate, which happens to be of the same size as a known
scalar constant at the same offset, the transformation bit fails to
recognize the situation, tries to do both splitting and constant
propagation and in PR 111571 testcase creates a nonsensical call
statement on which the call redirection then ICEs.
Fixed by making sure we don't try to do two replacements of the same
part of the same parameter.
The look-up among replacements requires these are sorted and this
patch just sorts them if they are not already sorted before each new
look-up. The worst-case number of sortings that can happen is the number of
parameters which are both split and have aggregate constants times
param_ipa_max_agg_items (default 16). I don't think complicating the
source code to optimize for this unlikely case is worth it but if need
be, it can of course be done.
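The "sort if needed before each look-up" strategy can be sketched as below. The record type, class, and function names are hypothetical and much simpler than the real replacement descriptors; the sketch only shows lazy sorting combined with a binary search to reject a second replacement of the same part of the same parameter.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a replacement record, keyed by parameter
// index and offset within that parameter.
struct Replacement {
    int param;
    long offset;
};

inline bool before(const Replacement& a, const Replacement& b) {
    return a.param != b.param ? a.param < b.param : a.offset < b.offset;
}

class ReplacementSet {
public:
    void add(Replacement r) {
        items_.push_back(r);
        sorted_ = false;            // a new entry may break the order
    }
    // Binary search needs sorted data, so sort first if anything was
    // added since the last look-up.
    bool contains(int param, long offset) {
        if (!sorted_) {
            std::sort(items_.begin(), items_.end(), before);
            sorted_ = true;
        }
        return std::binary_search(items_.begin(), items_.end(),
                                  Replacement{param, offset}, before);
    }
    // Refuse a second replacement of the same part of the same parameter.
    bool add_unique(Replacement r) {
        if (contains(r.param, r.offset))
            return false;
        add(r);
        return true;
    }
private:
    std::vector<Replacement> items_;
    bool sorted_ = true;
};

// Small demonstration: the duplicate (0, 8) is rejected.
int count_accepted() {
    ReplacementSet set;
    int accepted = 0;
    accepted += set.add_unique({1, 0});
    accepted += set.add_unique({0, 8});
    accepted += set.add_unique({0, 8});   // same part, same parameter
    accepted += set.add_unique({0, 0});
    return accepted;
}
```

Interleaving additions and look-ups causes a re-sort each time, which matches the worst case discussed above; it stays cheap as long as the number of entries is small.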
When partially substituting a requires-expr, we don't want to perform
any additional checks beyond the substitution itself so as to minimize
checking requirements out of order. So don't check the return-type-req
of a compound-requirement during partial substitution. And don't check
the noexcept condition either since we can't do that on templated trees.
PR c++/113966
gcc/cp/ChangeLog:
* constraint.cc (tsubst_compound_requirement): Don't check
the noexcept condition or the return-type-requirement when
partially substituting.
Patrick Palka [Sat, 3 Feb 2024 00:07:08 +0000 (19:07 -0500)]
c++: requires-exprs and partial constraint subst [PR110006]
In r11-3261-gb28b621ac67bee we made tsubst_requires_expr never partially
substitute into a requires-expression so as to avoid checking its
requirements out of order during e.g. generic lambda regeneration.
These PRs however illustrate that we still sometimes do need to
partially substitute into a requires-expression, in particular when it
appears in associated constraints that we're directly substituting for
sake of declaration matching or dguide constraint rewriting. In these
cases we're being called from tsubst_constraint during which
processing_constraint_expression_p is true, so this patch checks this
predicate to control whether we defer substitution or partially
substitute.
In turn, we now need to propagate semantic tsubst flags through
tsubst_requires_expr rather than just using tf_none, notably for sake of
dguide constraint rewriting which sets tf_dguide.
PR c++/110006
PR c++/112769
gcc/cp/ChangeLog:
* constraint.cc (subst_info::quiet): Accommodate non-diagnostic
tsubst flags.
(tsubst_valid_expression_requirement): Likewise.
(tsubst_simple_requirement): Return a substituted _REQ node when
processing_template_decl.
(tsubst_type_requirement_1): Accommodate non-diagnostic tsubst
flags.
(tsubst_type_requirement): Return a substituted _REQ node when
processing_template_decl.
(tsubst_compound_requirement): Likewise. Accommodate non-diagnostic
tsubst flags.
(tsubst_nested_requirement): Likewise.
(tsubst_requires_expr): Don't defer partial substitution when
processing_constraint_expression_p is true, in which case return
a substituted REQUIRES_EXPR.
* pt.cc (tsubst_expr) <case REQUIRES_EXPR>: Accommodate
non-diagnostic tsubst flags.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias18.C: New test.
* g++.dg/cpp2a/concepts-friend16.C: New test.
H.J. Lu [Sun, 14 Apr 2024 19:57:39 +0000 (12:57 -0700)]
tree-profile: Disable indirect call profiling for IFUNC resolvers
We can't profile indirect calls to IFUNC resolvers nor their callees as
it requires TLS which hasn't been set up yet when the dynamic linker is
resolving IFUNC symbols.
Add an IFUNC resolver caller marker to cgraph_node and set it if the
function is called by an IFUNC resolver. Disable indirect call profiling
for IFUNC resolvers and their callees.
Tested with profiledbootstrap on Fedora 39/x86-64.
Tamar Christina [Mon, 15 Apr 2024 11:12:30 +0000 (12:12 +0100)]
AArch64: remove ls64 from being mandatory on armv8.7-a
The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64)
shows that ls64 is an optional extension and should not be enabled by default
for Armv8.7-a.
This drops it from the mandatory bits for the architecture and brings GCC in line
with LLVM and the architecture.
Note that we will not be changing binutils to preserve compatibility with older
released compilers.
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (AARCH64_ARCH): Remove LS64 from
Armv8.7-a.
Jakub Jelinek [Thu, 11 Apr 2024 13:55:53 +0000 (15:55 +0200)]
libstdc++: Regenerate baseline_symbols.txt files for Linux
The following patch regenerates the ABI files for 13 branch (I've only changed
the Linux files which were updated in r13-7289, all but m68k, riscv64 and
powerpc64 are from actual Fedora 39 gcc builds, the rest hand edited).
We've added one symbol very early in the 13.2 cycle, but then added 2
further ones very soon afterwards, quite a long time before 13.2 release
and haven't regenerated. The patch applies cleanly to trunk as well.
Kito Cheng [Wed, 28 Feb 2024 08:01:52 +0000 (16:01 +0800)]
RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
atomic_compare_and_swapsi will use lr.w to obtain the original value,
which sign extends to DI. RV64 only has DI comparisons, so we also need
to sign extend the expected value to DI as otherwise the comparison will
fail when the expected value has the 32nd bit set.
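The failure mode can be reproduced with plain integer arithmetic. The sketch below models a 32-bit load that sign-extends into a 64-bit register and compares it against an expected value that is either zero-extended (the buggy case) or sign-extended (the fixed case). The function names are illustrative and nothing here is RISC-V specific code.

```cpp
#include <cstdint>

// Model of a 32-bit load that sign-extends into a 64-bit register, as the
// text describes for lr.w on RV64. The narrowing cast wraps modulo 2^32
// (guaranteed since C++20, and what GCC does in earlier modes as well).
std::int64_t load_sign_extended(std::uint32_t mem) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(mem));
}

// Buggy comparison: the expected value is left zero-extended.
bool equal_zero_extended(std::uint32_t mem, std::uint32_t expected) {
    return load_sign_extended(mem) == static_cast<std::int64_t>(expected);
}

// Fixed comparison: the expected value is sign-extended the same way.
bool equal_sign_extended(std::uint32_t mem, std::uint32_t expected) {
    return load_sign_extended(mem) == load_sign_extended(expected);
}
```

With both arguments equal to 0x80000000 the zero-extended comparison reports a mismatch even though the 32-bit values are identical, while the sign-extended comparison succeeds.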
gcc/ChangeLog:
PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
extend the expected value if needed.
Martin Jambor [Mon, 8 Apr 2024 15:34:33 +0000 (17:34 +0200)]
ipa: Self-DCE of uses of removed call LHSs (PR 108007)
PR 108007 is another manifestation where we rely on DCE to clean-up
after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA
can leave behind statements which are fed uninitialized values and
trap, even though their results are themselves never used.
I have already fixed this for unused parameters in callees, this bug
shows that almost the same thing can happen for removed returns, on
the side of callers. This means that the issue has to be fixed
elsewhere, in call redirection. This patch adds a function which
looks for (and through, using a work-list) uses of operations fed
specific SSA names and removes them all.
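The work-list traversal mentioned above follows a familiar pattern, sketched here on a toy def-use graph. Values are plain integers and the map from a value to the values computed from it is a hypothetical simplification; the real function operates on GIMPLE statements and SSA names and also has to deal with debug statements.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical miniature of a def-use graph: each value maps to the
// values that are computed from it.
using Value = int;
using UseMap = std::map<Value, std::vector<Value>>;

// Starting from `seeds`, collect every value reachable through uses.
// Everything in the result would be removed by the purge.
std::set<Value> purge_all_uses_sketch(const UseMap& users,
                                      const std::vector<Value>& seeds) {
    std::set<Value> killed(seeds.begin(), seeds.end());
    std::vector<Value> worklist(seeds.begin(), seeds.end());
    while (!worklist.empty()) {
        Value v = worklist.back();
        worklist.pop_back();
        auto it = users.find(v);
        if (it == users.end())
            continue;                         // no uses to follow
        for (Value user : it->second)
            if (killed.insert(user).second)   // first time seen
                worklist.push_back(user);
    }
    return killed;
}

// Demonstration: 1 feeds 2 and 3, both feed 4; 5 and 6 are unrelated.
std::size_t demo_killed_count() {
    UseMap users{{1, {2, 3}}, {2, {4}}, {3, {4}}, {5, {6}}};
    return purge_all_uses_sketch(users, {1}).size();
}
```

The visited set doubles as the record of killed values, which loosely corresponds to the hash of killed SSA names that the patch passes around.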
That would have been easy if it wasn't for debug statements during
tree-inline (from which call redirection is also invoked). Debug
statements are decoupled from the rest at this point and iterating
over uses of SSAs does not bring them up. During tree-inline they are
handled especially at the end, I assume in order to make sure that
relative ordering of UIDs is the same with and without debug info.
This means that during tree-inline we need to make a hash of killed
SSAs, that we already have in copy_body_data, available to the
function making the purging. So the patch duly does also that, making
the interface slightly ugly. Moreover, all newly unused SSA names
need to be freed and as PR 112616 showed, it must be done in a defined
order, which is what newly added ipa_release_ssas_in_hash does.
PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to
redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a
parameter to modify_call.
(ipa_release_ssas_in_hash): Declare.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
parameter killed_ssas, pass it to padjs->modify_call.
* ipa-param-manipulation.cc (purge_all_uses): New function.
(ipa_param_adjustments::modify_call): New parameter killed_ssas.
Instead of substituting uses, invoke purge_all_uses. If
hash of killed SSAs has not been provided, create a temporary one
and release SSAs that have been added to it.
(compare_ssa_versions): New function.
(ipa_release_ssas_in_hash): Likewise.
* tree-inline.cc (redirect_all_calls): Create
id->killed_new_ssa_names earlier, pass it to edge redirection,
adjust a comment.
(copy_body): Release SSAs in id->killed_new_ssa_names.