Jonathan Wakely [Fri, 19 Jan 2024 12:28:30 +0000 (12:28 +0000)]
libstdc++: Fix P2255R2 dangling checks for std::tuple in C++17 [PR108822]
I accidentally used && in a fold-expression instead of || which meant
that in C++17 the tuple(UElements&&...) constructor only failed its
debug assertion if all tuple elements were dangling references. Some
missing tests (noted as "TODO") meant this wasn't tested.
This fixes the fold expression and adds the missing tests.
libstdc++-v3/ChangeLog:
PR libstdc++/108822
* include/std/tuple (__glibcxx_no_dangling_refs) [C++17]: Fix
wrong fold-operator.
* testsuite/20_util/tuple/dangling_ref.cc: Check tuples with one
element and three elements. Check allocator-extended
constructors.
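The effect of the wrong fold operator can be seen in a small sketch. This is not the libstdc++ code: `DanglingTag`, `dangles`, `all_dangle` and `any_dangles` are hypothetical names used only for illustration, with a tag type standing in for the real per-element dangling check.

```cpp
#include <type_traits>

// Hypothetical stand-in for a per-element "would dangle" check: an
// element counts as dangling when its type is DanglingTag.
struct DanglingTag {};

template <typename T>
constexpr bool dangles = std::is_same_v<T, DanglingTag>;

// Buggy form: a fold over && is true only if every element dangles.
template <typename... Ts>
constexpr bool all_dangle = (dangles<Ts> && ...);

// Fixed form: a fold over || is true if any element dangles.
template <typename... Ts>
constexpr bool any_dangles = (dangles<Ts> || ...);

// With three elements and one dangling, only the || fold reports it.
static_assert(!all_dangle<int, DanglingTag, long>);
static_assert(any_dangles<int, DanglingTag, long>);

// With a single element the two folds agree, so a one-element test
// alone cannot tell them apart.
static_assert(all_dangle<DanglingTag> == any_dangles<DanglingTag>);
```

The single-element case suggests why tests with more than one element are needed to catch this kind of mistake.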
Jason Merrill [Wed, 17 Jan 2024 22:29:33 +0000 (17:29 -0500)]
c++: alias template argument conversion [PR112632]
We've had a problem with lost conversions to template parameter types for a
while now; looking at this PR, it occurred to me that the problem is really
with alias (and concept) templates, since we do substitution of dependent
arguments into them in a way that we don't for other templates. And fixing
that specific problem is a lot simpler than adding IMPLICIT_CONV_EXPR around
all dependent template arguments the way I gave up on for 111357.
The other part of the fix was changing tsubst_expr to actually call
convert_nontype_argument instead of assuming it will eventually happen.
I waffled about stripping the forced conversion when !force_conv
vs. skipping them in iterative_hash_template_arg and
template_args_equal (like we already do for some other conversions) and
decided to go with the former, but that isn't a strong preference if it
turns out to be somehow problematic.
* g++.dg/cpp0x/alias-decl-nontype1.C: New test.
* g++.dg/cpp2a/concepts-narrowing1.C: New test.
* g++.dg/cpp2a/nontype-class63.C: New test.
* g++.dg/cpp2a/nontype-class63a.C: New test.
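As a rough illustration of the kind of code involved (not one of the new tests; `Holder` and `FromInt` are hypothetical names), an alias template can forward a dependent argument of one type to a template parameter of another type, so a conversion has to be applied when the alias is substituted:

```cpp
#include <type_traits>

template <unsigned N>
struct Holder {
  static constexpr unsigned value = N;
};

// The alias takes an int but Holder expects an unsigned, so the
// dependent argument I must be converted to the parameter type.
template <int I>
using FromInt = Holder<I>;

// 3 converts to unsigned without narrowing, so this is well-formed and
// names the same specialization as Holder<3u>.
static_assert(FromInt<3>::value == 3u);
static_assert(std::is_same_v<FromInt<3>, Holder<3u>>);
```

A use such as `FromInt<-1>` would need a narrowing conversion in a converted constant expression and so should be rejected, which only happens if the conversion is not lost.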
This patch removes unused parameters and local variables from
M2GenGCC.mod. It required ForeachScopeBlockDo2 to be implemented and
exported affecting any module indirectly calling ConvertQuadsToTree.
gcc/m2/ChangeLog:
* gm2-compiler/M2BasicBlock.mod (InitBasicBlocks): Rename
ForeachScopeBlockDo to ForeachScopeBlockDo3.
* gm2-compiler/M2Code.mod: Import ForeachScopeBlockDo2.
(OptimizeScopeBlock): Call ForeachScopeBlockDo3 for
procedures with three parameters and ForeachScopeBlockDo2
for two parameters.
(CodeBlock): Ditto.
* gm2-compiler/M2GCCDeclare.mod (DeclareTypesConstantsProcedures):
Rename ForeachScopeBlockDo to ForeachScopeBlockDo3.
* gm2-compiler/M2GenGCC.def (ConvertQuadsToTree): Remove Scope
parameter.
* gm2-compiler/M2GenGCC.mod (ConvertQuadsToTree): Remove Scope
parameter.
(MaybeDebugBuiltinMemcpy): Remove parameter tok.
(MaybeDebugBuiltinMemset): Remove.
(MakeCopyUse): Remove tokenno from call to
MaybeDebugBuiltinMemcpy.
(PerformFoldBecomes): Remove desloc and exprloc.
(checkArrayElements): Remove location. Remove virtpos
as a parameter to MaybeDebugBuiltinMemcpy.
(NoWalkProcedure): Add attribute unused.
(CheckElementSetTypes): Remove parameter p.
Remove CurrentQuadToken in call to MaybeDebugBuiltinMemcpy.
Remove NoWalkProcedure from call to CheckElementSetTypes.
Remove tokenno from call to MaybeDebugBuiltinMemcpy.
* gm2-compiler/M2Optimize.mod (RemoveProcedures): Replace
two parameter indirect procedure iterator with
ForeachScopeBlockDo2.
* gm2-compiler/M2SSA.mod: Remove ForeachScopeBlockDo.
* gm2-compiler/M2Scope.def (ForeachScopeBlockDo2): New
declaration.
(ForeachScopeBlockDo): Rename ...
(ForeachScopeBlockDo3): ... to this.
(ScopeProcedure2): New declaration.
* gm2-compiler/M2Scope.mod (ForeachScopeBlockDo2): New
procedure.
(ForeachScopeBlockDo): Rename ...
(ForeachScopeBlockDo3): ... to this.
Kito Cheng [Mon, 8 Jan 2024 13:26:52 +0000 (21:26 +0800)]
RISC-V: Document the list of supported extensions
Try to list all supported extensions, giving the name, version and a short
description of each extension.
v2 changes:
- Fix several typos.
- Add expansion info for vector crypto extensions.
- Drop zvl8192b, zvl16384b, zvl32768b and zvl65536b.
- Add zicntr and zihpm.
gcc/ChangeLog:
* doc/invoke.texi (RISC-V Options): Add list of supported
extensions.
Juzhe-Zhong [Fri, 19 Jan 2024 08:34:25 +0000 (16:34 +0800)]
RISC-V: Fix RVV_VLMAX
This patch fixes a memory hog found in the SPEC2017 wrf benchmark. It is caused by
RVV_VLMAX, which generates a brand new rtx via gen_rtx_REG (Pmode, X0_REGNUM)
every time it is used, so we are always generating garbage and redundant
(reg:DI 0 zero) rtx.
With this patch, the memory hog is gone.
Time variable               usr            sys           wall            GGC
machine dep reorg  :   1.99 (  9%)   0.35 ( 56%)   2.33 ( 10%)    939M ( 80%)  [Before this patch]
machine dep reorg  :   1.71 (  6%)   0.16 ( 27%)   3.77 (  6%)    659k (  0%)  [After this patch]
Time variable               usr            sys           wall            GGC
machine dep reorg  :  75.93 ( 18%)  14.23 ( 88%)  90.15 ( 21%)  33383M ( 95%)  [Before this patch]
machine dep reorg  :  56.00 ( 14%)   7.92 ( 77%)  63.93 ( 15%)   4361k (  0%)  [After this patch]
Testing is in progress. OK for trunk if it passes with no regressions?
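The general pattern behind this kind of fix can be sketched outside of GCC. The following is only an analogy under stated assumptions: `Node`, `make_fresh` and `shared_node` are hypothetical names, and ordinary heap allocation stands in for GCC's garbage-collected rtx allocation.

```cpp
#include <array>
#include <cstddef>
#include <memory>

struct Node {
  int regno;
};

// Counts how many Node objects have been created.
std::size_t allocations = 0;

// Creates a brand new object on every call, like a macro that expands
// to a constructor call each time it is used.
std::unique_ptr<Node> make_fresh(int regno) {
  ++allocations;
  return std::unique_ptr<Node>(new Node{regno});
}

// Returns one shared object per register number, creating it lazily on
// first use and reusing it afterwards.
const Node& shared_node(int regno) {
  static std::array<std::unique_ptr<Node>, 32> table;
  std::unique_ptr<Node>& slot = table.at(static_cast<std::size_t>(regno));
  if (!slot)
    slot = make_fresh(regno);
  return *slot;
}
```

Calling `shared_node(0)` a thousand times allocates once, whereas a thousand calls to `make_fresh(0)` allocate a thousand objects that are identical in content.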
Richard Biener [Fri, 19 Jan 2024 08:50:43 +0000 (09:50 +0100)]
debug/113488 - DW_AT_abstract_origin to self
The new sanity check that avoids creating DIE refs to self triggers
on the PR's testcase when using -g1 and -ffat-lto-objects. While
early DWARF with -g1 doesn't contain any DIEs for LABEL_DECLs, later
cloning will still mark DECLs as if they had one, via
dwarf2out_abstract_function calling set_block_origin_self.
Instead of messing with the delicate setup of dwarf2out at this stage,
the following simply rectifies things after the fact during LTO
streaming: when the decl indicates there's an early DIE but there
isn't one, fix up that indication.
PR debug/113488
* lto-streamer-in.cc (lto_read_tree_1): When there isn't
an early DIE but there should be, do not pretend there is.
Richard Biener [Fri, 19 Jan 2024 08:23:48 +0000 (09:23 +0100)]
tree-optimization/113494 - Fix two observed regressions with r14-8206
The following handles the situation where we lack a loop-closed
PHI for a virtual operand because a loop exit goes to a code
region not having any virtual use (an endless loop). It also
handles the situation of edge redirection re-allocating a PHI node
in the destination block so we have to re-lookup that before
populating the new PHI argument.
Daniel Cederman [Tue, 16 Jan 2024 13:57:15 +0000 (14:57 +0100)]
libsanitizer: Replace memcpy with internal version in sanitizer_common
When GCC is configured with --enable-target-optspace the compiler generates
a memcpy call in the Symbolizer constructor in sanitizer_symbolizer.cpp
when compiling for SPARC V8. Add HAVE_AS_SYM_ASSIGN to replace it with a
call to __sanitizer_internal_memcpy.
Jakub Jelinek [Fri, 19 Jan 2024 09:01:43 +0000 (10:01 +0100)]
lower-bitint: Don't use m_loads for loads used in GIMPLE_ASM [PR113464]
Like for GIMPLE_PHIs or calls, even for GIMPLE_ASMs we want
a corresponding VAR_DECL assigned for lhs SSA_NAMEs of loads
from memory, as even GIMPLE_ASM relies on those VAR_DECLs to exist.
2024-01-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113464
* gimple-lower-bitint.cc (gimple_lower_bitint): Don't try to
optimize loads into GIMPLE_ASM stmts.
Jakub Jelinek [Fri, 19 Jan 2024 09:00:51 +0000 (10:00 +0100)]
gimple-ssa-warn-restrict: Only use type range from NOP_EXPR for non-narrowing conversions [PR113463]
builtin_memref::extend_offset_range when it sees a NOP_EXPR from
INTEGRAL_TYPE (to INTEGRAL_TYPE of sizetype/ptrdifftype precision
given the callers) uses wi::to_offset on TYPE_{MIN,MAX}_VALUE
of the rhs1 type. This ICEs with large BITINT_TYPEs - to_offset
is only supported for precisions up to the offset_int precision
- but it doesn't even make sense to do this for narrowing
conversions: their range is the whole sizetype/ptrdifftype range,
so the normal handling done later on (largest sized supported object)
is the way to go in that case.
So, the following patch just restricts this to non-narrowing conversions.
2024-01-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113463
* gimple-ssa-warn-restrict.cc (builtin_memref::extend_offset_range):
Only look through NOP_EXPRs if rhs1 doesn't have wider type than
lhs.
Jakub Jelinek [Fri, 19 Jan 2024 09:00:16 +0000 (10:00 +0100)]
sccvn: Don't use SCALAR_INT_TYPE_MODE on BLKmode BITINT_TYPEs [PR113459]
sccvn uses GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (type)) for INTEGER_TYPEs,
most likely because that is what native_{interpret,encode}_int used.
This obviously doesn't work for larger BITINT_TYPEs which have BLKmode
and the above ICEs on those. native_{interpret,encode}_int checks whether
the BITINT_TYPE is medium/large/huge (i.e. an array of 2+ ABI limbs)
and uses TYPE_SIZE_UNIT for that case, otherwise SCALAR_INT_TYPE_MODE like
for the INTEGER_TYPE case.
The following patch instead just uses SCALAR_INT_TYPE_MODE for non-BLKmode
TYPE_MODE and TYPE_SIZE_UNIT otherwise.
2024-01-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113459
* tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def): Use
TREE_INT_CST_LOW of TYPE_SIZE_UNIT rather than GET_MODE_SIZE
of SCALAR_INT_TYPE_MODE if type has BLKmode.
(vn_reference_lookup_3): Likewise. Formatting fix.
Jakub Jelinek [Fri, 19 Jan 2024 08:31:42 +0000 (09:31 +0100)]
expansion: Fix ICEs with BLKmode VIEW_CONVERT_EXPR around non-BLKmode VAR_DECLs
On aarch64 the backend decides to use non-BLKmode for some arrays
like unsigned long[4] - OImode in that case, but the corresponding
BITINT_TYPEs have BLKmode (like structures containing that many limb
elements).
This later causes ICEs during expansion when expanding VIEW_CONVERT_EXPR
from non-BLKmode VAR_DECL to BLKmode BITINT_TYPE.
The following fix contains two parts: the discover_nonconstant_array_refs_r
change makes sure we force such variables into memory, and the expand_expr_real_1
change makes sure we don't try to extract a bitfield or something similar
which doesn't really work for BLKmode - as op0 is a MEM, all we need is
the op0 = adjust_address (op0, mode, 0); at the end to change the MEM's mode
to BLKmode.
2024-01-19 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
* cfgexpand.cc (discover_nonconstant_array_refs_r): Force non-BLKmode
VAR_DECLs referenced in BLKmode VIEW_CONVERT_EXPRs into memory.
* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Do nothing
but adjust_address also for BLKmode mode and MEM op0.
Palmer Dabbelt [Wed, 9 Nov 2022 03:00:36 +0000 (19:00 -0800)]
RISC-V: Add the Zihpm and Zicntr extensions
These extensions were recently frozen [1]. As per Andrew's post [2]
we're meant to ignore these in software, this just adds them to the list
of allowed extensions and otherwise ignores them. I added these under
SPEC_CLASS_NONE even though the PDF lists them as 20190614 because it
seems pointless to add another spec class just to accept two extensions
we then ignore.
Kito Cheng [Fri, 5 Jan 2024 14:08:34 +0000 (22:08 +0800)]
RISC-V: Relax the -march string to accept any order
-march previously required canonical order, but that is not easy for
most users now that we have so many extensions. This patch relaxes the
constraint so that -march accepts the ISA string in any order, with only a few
requirements:
1. Must start with rv[32|64][e|i|g].
2. Multi-letter and single-letter extensions must be separated by
at least one underscore(`_`).
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_single_std_ext): New parameter.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::parse): Relax the order for the input of ISA
string.
* config/riscv/riscv-subset.h
(riscv_subset_list::parse_single_std_ext): New parameter.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
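A minimal sketch of what the relaxed rules mean for users follows. It is not the parser in riscv-common.cc: `has_valid_base`, `underscore_groups` and `same_extensions` are hypothetical helpers, the sketch checks only the first requirement, and it compares the underscore-separated groups as an unordered set rather than parsing individual extensions.

```cpp
#include <set>
#include <sstream>
#include <string>

// Requirement 1: the string must start with rv32 or rv64 followed by
// one of the base letters e, i or g.
bool has_valid_base(const std::string& arch) {
  if (arch.size() < 5)
    return false;
  const std::string width = arch.substr(0, 4);
  if (width != "rv32" && width != "rv64")
    return false;
  const char base = arch[4];
  return base == 'e' || base == 'i' || base == 'g';
}

// Splits on '_' and collects the non-empty groups, so repeated
// underscores are tolerated and the order of groups is discarded.
std::set<std::string> underscore_groups(const std::string& arch) {
  std::set<std::string> groups;
  std::istringstream in(arch);
  std::string group;
  while (std::getline(in, group, '_'))
    if (!group.empty())
      groups.insert(group);
  return groups;
}

// Simplified notion of equivalence: both strings have a valid base and
// the same set of underscore-separated groups.
bool same_extensions(const std::string& a, const std::string& b) {
  return has_valid_base(a) && has_valid_base(b) &&
         underscore_groups(a) == underscore_groups(b);
}
```

Under this simplified model, `rv64gc_zba_zbb` and `rv64gc_zbb_zba` describe the same set of extensions, while a string such as `zba_rv64i` is rejected because it does not start with the base ISA.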
Kito Cheng [Fri, 5 Jan 2024 13:33:35 +0000 (21:33 +0800)]
RISC-V: Extract part of the base ISA parsing logic into a standalone function [NFC]
Minor refactor, preparation for further change.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_base_ext): New.
(riscv_subset_list::parse): Extract part of logic into
riscv_subset_list::parse_base_ext.
* config/riscv/riscv-subset.h (riscv_subset_list::parse_base_ext):
New.
liuhongt [Fri, 19 Jan 2024 01:22:39 +0000 (09:22 +0800)]
Fix testcase failure on many platforms which don't support vect_int_max.
After r14-7124-g6686e16fda4190, the testcase can be optimized to
MAX_EXPR if the backend supports that. So I adjusted the testcase to
scan for MAX_EXPR, but it failed on many platforms which don't support
that.
As pinski mentioned, target vect_no_int_min_max is only available
under the vect directory, so for simplicity, I adjust the testcase to scan for
either MAX_EXPR or the original VEC_COND_EXPR.
gcc/testsuite/ChangeLog:
PR testsuite/113437
* gcc.dg/tree-ssa/pr95906.c: Scan either MAX_EXPR or
VEC_COND_EXPR.
Sandra Loosemore [Fri, 19 Jan 2024 02:06:55 +0000 (02:06 +0000)]
More precise documentation for cleanup attribute [PR110029]
gcc/ChangeLog
PR c/110029
* doc/extend.texi (Common Variable Attributes): Explain what
happens when multiple variables with cleanups are in the same scope.
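As a small illustration of the documented behaviour (cleanups in the same scope run in reverse order of definition), the sketch below records the order in which two cleanup functions fire. It relies on the GNU `cleanup` attribute, which GCC and Clang accept; `record`, `cleanup_order` and `run_scope` are hypothetical names.

```cpp
#include <string>

// Collects the tags of variables in the order their cleanups run.
static std::string cleanup_order;

// Cleanup functions receive a pointer to the variable going out of scope.
static void record(char* tag) { cleanup_order += *tag; }

void run_scope() {
  char first __attribute__((cleanup(record))) = 'A';
  char second __attribute__((cleanup(record))) = 'B';
  (void)first;
  (void)second;
  // On leaving the scope, second is cleaned up before first.
}
```

After one call to `run_scope()`, `cleanup_order` holds the tags in last-defined, first-cleaned order.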
Sandra Loosemore [Thu, 18 Jan 2024 23:19:39 +0000 (23:19 +0000)]
Improve documentation of noinline and noipa attributes [PR108470]
gcc/ChangeLog
PR ipa/108470
* doc/extend.texi (Common Function Attributes): Document that
noinline also disables some interprocedural optimizations and
improve flow to the part about using inline asm instead to
disable calls from being optimized away completely. Remove the
sentence that says noipa is mainly for internal compiler testing.
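The inline asm idiom mentioned in this entry can be sketched as follows. `add_one` is a hypothetical function; the attribute spelling is the GNU one, accepted by GCC and Clang.

```cpp
// noinline keeps the body out of its callers, but a call to a function
// with no visible side effects may still be optimized away entirely.
__attribute__((noinline)) int add_one(int x) {
  // An empty asm statement acts as a side effect the optimizer does
  // not see through, so calls to this function are kept.
  asm("");
  return x + 1;
}
```

The function still computes its normal result; the empty asm only changes what the optimizer is allowed to assume about the call.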
Sandra Loosemore [Thu, 18 Jan 2024 18:28:22 +0000 (18:28 +0000)]
Restore documentation for const/volatile functions [PR107942]
In r5-7698-g8648c55f3b703a I accidentally removed the documentation of
GCC's special interpretation of const/volatile qualifiers on functions
from the function attributes section, thinking this was just a
bit-rotten leftover from old versions of GCC. PR107942 points out
that this functionality is still present even though the docs are now gone.
I decided this material didn't really belong in the function
attributes discussion, but a new subsection in the general list of GCC
extensions to the C language. And I agree with the comment in the
issue that we shouldn't really recommend this usage any more.
gcc/ChangeLog
PR c/107942
* doc/extend.texi (C Extensions): Add new section to menu.
(Function Attributes): Move dangling index entries to....
(Const and Volatile Functions): New section.
David Malcolm [Thu, 18 Jan 2024 17:11:57 +0000 (12:11 -0500)]
analyzer: fix ICE on strlen ((char *)&VECTOR_CST) [PR111361]
gcc/analyzer/ChangeLog:
PR analyzer/111361
* region-model.cc (svalue_byte_range_has_null_terminator_1): The
initial byte of an all-zeroes SVAL is a zero byte. Remove
gcc_unreachable from SK_CONSTANT for constants that aren't
STRING_CST or INTEGER_CST.
gcc/testsuite/ChangeLog:
PR analyzer/111361
* c-c++-common/analyzer/strlen-pr111361.c: New test.
* c-c++-common/analyzer/strncpy-1.c (test_zero_fill): Remove fixed
xfail.
* c-c++-common/analyzer/strncpy-pr111361.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Thu, 18 Jan 2024 17:11:57 +0000 (12:11 -0500)]
analyzer: fix offsets in has_null_terminator [PR112811]
PR analyzer/112811 reports an ICE attempting to determine whether a
string is null-terminated.
The root cause is confusion in the code about whether byte offsets are
relative to the start of the base region, or relative to the bound
fragment within the region.
This patch rewrites the code to enforce a clearer separation between
the kinds of offset, fixing the ICE, and adds logging to help track
down future issues in this area of the code.
gcc/analyzer/ChangeLog:
PR analyzer/112811
* region-model.cc (fragment::dump_to_pp): New.
(fragment::has_null_terminator): Convert to...
(svalue_byte_range_has_null_terminator_1): ...this new function,
updating to use a byte_range relative to the start of the svalue.
(svalue_byte_range_has_null_terminator): New.
(fragment::string_cst_has_null_terminator): Convert to...
(string_cst_has_null_terminator): ...this, updating to use a
byte_range relative to the start of the svalue.
(iterable_cluster::dump_to_pp): New.
(region_model::scan_for_null_terminator): Add logging, moving body
to...
(region_model::scan_for_null_terminator_1): ...this new function,
adding more logging, and updating to use
svalue_byte_range_has_null_terminator.
* region-model.h (region_model::scan_for_null_terminator_1): New
decl.
gcc/testsuite/ChangeLog:
PR analyzer/112811
* c-c++-common/analyzer/strlen-pr112811.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
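The two kinds of offset described in the has_null_terminator fix can be kept apart with an explicit conversion step. The sketch below is not the analyzer's code: `ByteRange` and `to_fragment_relative` are hypothetical, and plain `std::size_t` values stand in for the analyzer's own offset types.

```cpp
#include <cstddef>
#include <optional>

struct ByteRange {
  std::size_t start;  // offset of the first byte
  std::size_t size;   // number of bytes
};

// Converts a range expressed relative to the start of the base region
// into one relative to the start of a fragment bound within that
// region. frag itself is expressed relative to the base region.
// Returns nullopt when the range is not fully inside the fragment.
std::optional<ByteRange> to_fragment_relative(ByteRange in_base,
                                              ByteRange frag) {
  if (in_base.start < frag.start)
    return std::nullopt;
  const std::size_t offset = in_base.start - frag.start;
  if (offset > frag.size || in_base.size > frag.size - offset)
    return std::nullopt;
  return ByteRange{offset, in_base.size};
}
```

For a fragment occupying bytes 8 to 11 of the base region, the base-relative range starting at byte 10 with size 2 becomes the fragment-relative range starting at byte 2 with size 2.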
David Malcolm [Thu, 18 Jan 2024 17:11:57 +0000 (12:11 -0500)]
Fix ICE in -fdiagnostics-generate-patch [PR112684]
gcc/ChangeLog:
PR middle-end/112684
* toplev.cc (toplev::main): Don't ICE in
-fdiagnostics-generate-patch when exiting after options,
since no edit context will have been created.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Patrick Palka [Thu, 18 Jan 2024 15:36:07 +0000 (10:36 -0500)]
libstdc++/debug: Fix constexpr _Safe_iterator in C++20 mode [PR109536]
Some _Safe_iterator member functions define a variable of non-literal
type __gnu_cxx::__scoped_lock, which automatically disqualifies them from
being constexpr in C++20 mode even if that code path is never constant
evaluated. This restriction was lifted by P2242R3 for C++23, but we
need to work around it in C++20 mode. To that end this patch defines
a pair of macros that encapsulate the lambda-based workaround mentioned
in that paper and uses it to make these functions valid C++20 constexpr
functions. The augmented std::vector test element_access/constexpr.cc
now successfully compiles in C++20 mode with -D_GLIBCXX_DEBUG (and it
should test all member functions modified by this patch).
PR libstdc++/109536
libstdc++-v3/ChangeLog:
* include/debug/safe_base.h (_Safe_sequence_base::_M_swap):
Remove _GLIBCXX20_CONSTEXPR from non-inline member function.
* include/debug/safe_iterator.h
(_GLIBCXX20_CONSTEXPR_NON_LITERAL_SCOPE_BEGIN): Define.
(_GLIBCXX20_CONSTEXPR_NON_LITERAL_SCOPE_END): Define.
(_Safe_iterator::operator=): Use them around the code path that
defines a variable of type __gnu_cxx::__scoped_lock.
(_Safe_iterator::operator++): Likewise.
(_Safe_iterator::operator--): Likewise.
(_Safe_iterator::operator+=): Likewise.
(_Safe_iterator::operator-=): Likewise.
* testsuite/23_containers/vector/element_access/constexpr.cc
(test_iterators): Test more iterator operations.
* testsuite/23_containers/vector/bool/element_access/constexpr.cc
(test_iterators): Likewise.
* testsuite/std/ranges/adaptors/all.cc (test08) [_GLIBCXX_DEBUG]:
Remove.
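The lambda-based workaround can be sketched in isolation. This is not the libstdc++ code or its macros: `ScopedLock`, `increment`, `lock_depth` and `locked_increments` are hypothetical, and the sketch uses the `__builtin_is_constant_evaluated()` builtin (available in GCC and Clang) to separate the constant-evaluated path from the runtime path.

```cpp
// Non-literal type: it has a user-provided, non-constexpr destructor.
struct ScopedLock {
  explicit ScopedLock(int& depth) : depth_(depth) { ++depth_; }
  ~ScopedLock() { --depth_; }
  int& depth_;
};

int lock_depth = 0;         // nesting depth of the pretend lock
int locked_increments = 0;  // how many increments ran under the lock

constexpr int& increment(int& value) {
  if (__builtin_is_constant_evaluated()) {
    // Constant evaluation: no locking is needed or possible.
    ++value;
    return value;
  }
  // Runtime path: the ScopedLock variable lives in the lambda body,
  // not directly in the constexpr function body.
  [&] {
    ScopedLock guard(lock_depth);
    ++locked_increments;
    ++value;
  }();
  return value;
}

// The function remains usable in constant expressions.
constexpr int bump_twice(int v) {
  increment(v);
  increment(v);
  return v;
}
static_assert(bump_twice(1) == 3);
```

Defining `guard` directly in the body of `increment` would make the function ill-formed as a C++20 constexpr function even though that path is never constant evaluated; moving it into an immediately invoked lambda avoids that.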
Iain Sandoe [Thu, 18 Jan 2024 09:56:42 +0000 (09:56 +0000)]
Darwin, configure: Handle a missing substitution.
The configure substitution for enable_darwin_at_rpath has been
omitted, which leads to a failure to set ENABLE_DARWIN_AT_RPATH in
the testsuite site.exp (which leads to failure to add -B options
in some cases, breaking uninstalled testing there).
Since we already have substitutions for ENABLE_DARWIN_AT_RPATH_TRUE
we can use that instead, which is what this patch does.
gcc/ChangeLog:
* Makefile.in: Emit ENABLE_DARWIN_AT_RPATH into site.exp
when ENABLE_DARWIN_AT_RPATH_TRUE is not '#'.
Jun Sha (Joshua) [Fri, 12 Jan 2024 08:44:20 +0000 (16:44 +0800)]
RISC-V: Rewrite some instructions using ASM targethook
Some xtheadvector instructions differ from RVV1.0 by more than the
added "th." prefix. For example, RVV1.0 load/store instructions
carry SEW while the xtheadvector ones do not; RVV1.0 uses "o" for
indexed-ordered store instructions while xtheadvector does not;
and xtheadvector and RVV1.0 use different
vnsrl/vnsra/vfncvt suffixes (vv/vx/vi vs wv/wx/wi).
To address this issue without duplicating patterns, we use ASM
targethook to rewrite the whole string of the instructions. We
identify different instructions from the corresponding attribute.
gcc/ChangeLog:
* config/riscv/thead.cc
(th_asm_output_opcode): Rewrite some instructions.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
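One of the rewrites described in this commit, the narrowing-shift suffix, can be sketched as a plain string transformation. This is not the code in thead.cc: `rewrite_narrowing` is a hypothetical helper that handles only vnsrl and vnsra and leaves every other mnemonic unchanged.

```cpp
#include <string>

// Rewrites an RVV1.0 narrowing shift such as "vnsrl.wv" into the
// xtheadvector spelling "th.vnsrl.vv": add the "th." prefix and turn
// the leading 'w' of the operand suffix into 'v'.
std::string rewrite_narrowing(const std::string& insn) {
  static const std::string stems[] = {"vnsrl.", "vnsra."};
  for (const std::string& stem : stems) {
    const bool matches = insn.size() == stem.size() + 2 &&
                         insn.compare(0, stem.size(), stem) == 0 &&
                         insn[stem.size()] == 'w';
    if (matches)
      return "th." + stem + "v" + insn.substr(stem.size() + 1);
  }
  return insn;  // not a narrowing shift handled by this sketch
}
```

For example, `rewrite_narrowing("vnsra.wi")` yields `th.vnsra.vi`.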
Jun Sha (Joshua) [Fri, 12 Jan 2024 03:23:21 +0000 (11:23 +0800)]
RISC-V: Fix register overlap issue for some xtheadvector instructions
For th.vmadc/th.vmsbc as well as narrowing arithmetic instructions
and floating-point compare instructions, an illegal instruction
exception will be raised if the destination vector register overlaps
a source vector register group.
To handle this issue, we add an attribute "spec_restriction" to disable
some alternatives for xtheadvector.
gcc/ChangeLog:
* config/riscv/riscv.md (none,thv,rvv): New attribute.
(no,yes): Add an attribute to disable alternative
for xtheadvector or RVV1.0.
* config/riscv/vector.md:
Disable alternatives in which the destination register overlaps
a source register group for xtheadvector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
Jun Sha (Joshua) [Fri, 12 Jan 2024 03:22:41 +0000 (11:22 +0800)]
RISC-V: Add support for xtheadvector-specific intrinsics.
This patch only involves the generation of xtheadvector
special load/store instructions and vext instructions.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc
(class th_loadstore_width): Define new builtin bases.
(class th_extract): Define new builtin bases.
(BASE): Define new builtin bases.
* config/riscv/riscv-vector-builtins-bases.h:
Define new builtin class.
* config/riscv/riscv-vector-builtins-shapes.cc
(struct th_loadstore_width_def): Define new builtin shapes.
(struct th_indexed_loadstore_width_def):
Define new builtin shapes.
(struct th_extract_def): Define new builtin shapes.
(SHAPE): Define new builtin shapes.
* config/riscv/riscv-vector-builtins-shapes.h:
Define new builtin shapes.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_FUNCTION):
Redefine DEF_RVV_FUNCTION for XTheadVector special intrinsics.
* config/riscv/riscv-vector-builtins.h
(enum required_ext): Add new XTheadVector member.
(struct function_group_info): Likewise.
* config/riscv/t-riscv:
Add thead-vector-builtins-functions.def
* config/riscv/thead-vector.md
(@pred_mov_width<vlmem_op_attr><mode>): Add new patterns.
(*pred_mov_width<vlmem_op_attr><mode>): Likewise.
(@pred_store_width<vlmem_op_attr><mode>): Likewise.
(@pred_strided_load_width<vlmem_op_attr><mode>): Likewise.
(@pred_strided_store_width<vlmem_op_attr><mode>): Likewise.
(@pred_indexed_load_width<vlmem_op_attr><mode>): Likewise.
(@pred_th_extract<mode>): Likewise.
(*pred_th_extract<mode>): Likewise.
* config/riscv/thead-vector-builtins-functions.def: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/vlb-vsb.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlbu-vsb.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlh-vsh.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlhu-vsh.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlw-vsw.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlwu-vsw.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
Jun Sha (Joshua) [Fri, 12 Jan 2024 03:22:10 +0000 (11:22 +0800)]
RISC-V: Handle differences between XTheadvector and Vector
This patch handles the differences in instruction generation
between Vector and XTheadVector. In this version, we only support
the subset of xtheadvector instructions that can reuse the current
RVV1.0 patterns by simply adding the "th." prefix. For xtheadvector
instructions that have different names but share the same patterns as
RVV1.0 instructions, we will use ASM targethook to rewrite the whole
string of the instructions in the following patches.
For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in vector.md in order
not to generate instructions that xtheadvector does not support,
like vmv1r.
gcc/ChangeLog:
* config.gcc: Add files for XTheadVector intrinsics.
* config/riscv/autovec.md: Guard XTheadVector.
* config/riscv/predicates.md: Disable immediate vl
for XTheadVector.
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic):
Add pragma for XTheadVector.
* config/riscv/riscv-string.cc (riscv_expand_block_move):
Guard XTheadVector.
* config/riscv/riscv-v.cc (vls_mode_valid_p):
Avoid autovec.
* config/riscv/riscv-vector-builtins-bases.cc:
Do not normalize vsetvl instructions for XTheadVector.
* config/riscv/riscv-vector-builtins-shapes.cc (check_type):
New check type function.
(build_one): Adjust for XTheadVector.
* config/riscv/riscv-vector-switch.def (ENTRY):
Disable fractional mode for the XTheadVector extension.
(TUPLE_ENTRY): Likewise.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
Guard XTheadVector.
(riscv_preferred_simd_mode): Likewise.
(riscv_autovectorize_vector_modes): Likewise.
(riscv_vector_mode_supported_any_target_p): Likewise.
(TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise.
* config/riscv/thead.cc (th_asm_output_opcode):
Rewrite vsetvl instructions.
* config/riscv/vector.md:
Include thead-vector.md and change fractional LMUL
into 1 for vbool.
* config/riscv/riscv_th_vector.h: New file.
* config/riscv/thead-vector.md: New file.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
Jun Sha (Joshua) [Fri, 12 Jan 2024 08:34:21 +0000 (16:34 +0800)]
RISC-V: Adds the prefix "th." for the instructions of XTheadVector.
This patch adds the "th." prefix to all XTheadVector instructions by
implementing new assembly output functions. We only check that the
opcode starts with 'v', so no extra attribute is needed.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_asm_output_opcode):
Add new function to add assembler insn code prefix/suffix.
(th_asm_output_opcode):
Add Thead function to add assembler insn code prefix/suffix.
* config/riscv/riscv.cc (riscv_asm_output_opcode):
Implement function to add assembler insn code prefix/suffix.
* config/riscv/riscv.h (ASM_OUTPUT_OPCODE):
Add new function to add assembler insn code prefix/suffix.
* config/riscv/thead.cc (th_asm_output_opcode):
Implement Thead function to add assembler insn code
prefix/suffix.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/prefix.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
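The prefix rule described in this commit (only the first character of the opcode is inspected) can be sketched as a string function. This is an illustration, not GCC's th_asm_output_opcode: `add_th_prefix` is a hypothetical helper that returns a new string instead of writing to the assembler output stream.

```cpp
#include <string>

// Adds the "th." prefix to any opcode that starts with 'v' and leaves
// all other opcodes unchanged.
std::string add_th_prefix(const std::string& opcode) {
  if (!opcode.empty() && opcode.front() == 'v')
    return "th." + opcode;
  return opcode;
}
```

With this rule `vadd.vv` becomes `th.vadd.vv`, while a scalar opcode such as `addi` is emitted as is.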
Jun Sha (Joshua) [Fri, 12 Jan 2024 03:20:29 +0000 (11:20 +0800)]
RISC-V: Introduce XTheadVector as a subset of V1.0.0
This patch is to introduce basic XTheadVector support
(march string parsing and a test for __riscv_xtheadvector)
according to https://github.com/T-head-Semi/thead-extension-spec/
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse): Add new vendor extension.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Add test macro.
* config/riscv/riscv.opt: Add new mask.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-__riscv_th_v_intrinsic.c: New test.
* gcc.target/riscv/rvv/xtheadvector.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
Iain Sandoe [Mon, 8 Jan 2024 17:00:18 +0000 (17:00 +0000)]
Darwin: Suppress adding embedded rpaths for earlier OS versions.
When we have @rpath support by virtue of the OS version we're hosting on
we still need to omit those rpath entries when targeting < 10.5 (or the
linker will complain). To do this we (maybe ab-)use a property of the
spec function expansion that a non-null return value can be used as the
true input to a second spec (whereas, unfortunately, we cannot pass specs
to the version function at present).
gcc/ChangeLog:
* config/darwin.h (DARWIN_RPATH_SPEC): Arrange for the %P spec
to be conditional on macosx-version-min.
Iain Sandoe [Mon, 8 Jan 2024 16:17:04 +0000 (16:17 +0000)]
Darwin: Fix a typo in Objective-C meta-data.
We have a typo in the metadata for assigning NSStrings to a specific
section for the V1 (32b) ABI. When that is fixed we should never see
the case where the section needs to be deduced from the properties of
the DECLs.
gcc/ChangeLog:
* config/darwin.cc (darwin_objc1_section): Use the correct
meta-data version for constant strings.
(machopic_select_section): Assert if we fail to handle CFString
sections as Objective-C meta-data or directly.
Marek Polacek [Thu, 18 Jan 2024 00:16:32 +0000 (19:16 -0500)]
c++: ICE when xobj is not the first parm [PR113389]
In grokdeclarator/cdk_function the comment says that the find_xobj_parm
lambda clears TREE_PURPOSE so that we can correctly detect an xobj that
is not the first parameter. That's all good, but we should also clear
the TREE_PURPOSE once we've given the error, otherwise we crash later in
check_default_argument because the 'this' TREE_PURPOSE lacks a type.
PR c++/113389
gcc/cp/ChangeLog:
* decl.cc (grokdeclarator) <case cdk_function>: Set TREE_PURPOSE to
NULL_TREE when emitting an error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/explicit-obj-diagnostics10.C: New test.
Iain Sandoe [Tue, 16 Jan 2024 08:45:26 +0000 (08:45 +0000)]
lto, Darwin: Fix offload section names.
Currently, these section names have the wrong syntax for Mach-O.
Although they were added some time ago, recently added tests are
now emitting them, leading to new failures on Darwin.
This adds a Mach-O variant for each.
gcc/ChangeLog:
* lto-section-names.h (OFFLOAD_SECTION_NAME_PREFIX,
OFFLOAD_VAR_TABLE_SECTION_NAME, OFFLOAD_FUNC_TABLE_SECTION_NAME,
OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): Provide Mach-O syntax
versions when the object format is Mach-O.
Iain Sandoe [Sat, 13 Jan 2024 13:30:08 +0000 (13:30 +0000)]
testsuite, jit: Allow for target-specific assembler scans.
If we want to support multiple object formats and to allow for
scan-assembler tests, we need to make it possible to adjust the
tests on a per-target basis.
This adds similar mechanisms to jit-verify-assembler-output{,-not}
to those used for the general scan-assembler dg directives.
As an aside, it would perhaps be possible to integrate this more
with scanasm.exp (which would also give access to function body
scanning) but I did not attempt that for this patch.
After this, we can accept things like:
... { jit-verify-assembler-output-not "......" { target { ! *-*-darwin* } } } }
or
... { jit-verify-assembler-output "......" { target *-*-darwin* } } }
gcc/testsuite/ChangeLog:
* jit.dg/jit.exp: Accept target clauses in jit-verify-assembler
handling.
Iain Sandoe [Sat, 13 Jan 2024 12:49:28 +0000 (12:49 +0000)]
testsuite, jit: Handle whitespace in test-link-section-assembler.c.
Darwin has a different .section directive that has more fields and
uses different whitespace. Amend the whitespace in the scan-asm to
be more flexible.
gcc/testsuite/ChangeLog:
* jit.dg/test-link-section-assembler.c: Accept any whitespace
between the .section directive and its arguments.
Although this only fires for one of the Darwin sub-ports, it is latent
elsewhere; it is also a regression c.f. the Darwin system compiler.
In the code we imported from an earlier branch, CFString objects (which
are constant aggregates) are constructed as CONST_DECLs. Although our
current documentation suggests that these are reserved for enumeration
values, in fact they are used elsewhere in the compiler for constants.
This includes Objective-C where they are used to form NSString constants.
In the particular case, we take the address of the constant and that
triggers varasm.cc:decode_addr_constant, which does not currently support
CONST_DECL.
If there is a general intent to allow/encourage wider use of CONST_DECL,
then we should fix decode_addr_constant to look through these and evaluate
the initializer (a two-line patch, but I'm not suggesting it for stage-4).
We also need to update the GCC internals documentation to allow for the
additional uses.
This patch is Darwin-local and fixes the problem by turning the CFString
constants into regular variables that are TREE_CONSTANT+TREE_READONLY. I plan
to back-port this to the open branches once it has baked a while on trunk.
Since, for Darwin, the Objective-C default is to construct constant
NSString objects as CFStrings, this will also cover the majority of cases
there (this patch does not make any changes to Objective-C NSStrings).
PR target/105522
gcc/ChangeLog:
* config/darwin.cc (machopic_select_section): Handle C and C++
CFStrings.
(darwin_rename_builtins): Move this out of the CFString code.
(darwin_libc_has_function): Likewise.
(darwin_build_constant_cfstring): Create an anonymous var to
hold each CFString.
* config/darwin.h (ASM_OUTPUT_LABELREF): Handle constant
CFStrings.
Martin Jambor [Thu, 18 Jan 2024 13:24:15 +0000 (14:24 +0100)]
sra: Disqualify bases of operands of asm gotos
PR 110422 shows that SRA can ICE assuming there is a single edge
outgoing from a block terminated with an asm goto. We need that for
BB-terminating statements so that any adjustments they make to the
aggregates can be copied over to their replacements. Because we can't
have that after ASM gotos, we need to punt.
gcc/ChangeLog:
2024-01-17 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/110422
* tree-sra.cc (scan_function): Disqualify bases of operands of asm
gotos.
gcc/testsuite/ChangeLog:
2024-01-17 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/110422
* gcc.dg/torture/pr110422.c: New test.
Richard Biener [Thu, 18 Jan 2024 12:23:27 +0000 (13:23 +0100)]
tree-optimization/113475 - fix memory leak in phi_analyzer
phi_analyzer leaks all phi_group objects it allocates. The following
fixes this by maintaining a vector of allocated objects and releasing
them when the phi_analyzer object is destroyed.
PR tree-optimization/113475
* gimple-range-phi.h (phi_analyzer::m_phi_groups): New.
* gimple-range-phi.cc (phi_analyzer::phi_analyzer): Initialize.
(phi_analyzer::~phi_analyzer): Deallocate and free collected
phi_groups.
(phi_analyzer::process_phi): Record allocated phi_groups.
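The ownership pattern described above can be sketched as follows. This
is a simplified illustration, not the code in gimple-range-phi.cc: the
names group, analyzer, make_group and live_groups_after are hypothetical,
and std::vector stands in for GCC's own vec.

```cpp
#include <vector>

// Hypothetical stand-in for an allocated analysis object.  It counts
// live instances so that a leak would be observable.
struct group
{
  static int live;
  group () { ++live; }
  ~group () { --live; }
  group (const group &) = delete;
  group &operator= (const group &) = delete;
};
int group::live = 0;

// Hypothetical analyzer: every group it allocates is recorded in
// m_groups and freed when the analyzer itself is destroyed.
class analyzer
{
public:
  analyzer () = default;
  analyzer (const analyzer &) = delete;
  analyzer &operator= (const analyzer &) = delete;

  ~analyzer ()
  {
    for (group *g : m_groups)
      delete g;
  }

  group *make_group ()
  {
    group *g = new group ();
    m_groups.push_back (g);   // record the allocation for later release
    return g;
  }

private:
  std::vector<group *> m_groups;
};

// Returns the number of groups still alive after an analyzer that
// allocated N of them has gone out of scope.
int live_groups_after (int n)
{
  {
    analyzer a;
    for (int i = 0; i < n; ++i)
      a.make_group ();
  }
  return group::live;
}
```

Without the bookkeeping vector and the loop in the destructor, every
object returned by make_group would stay allocated after the analyzer
is gone.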
Richard Biener [Thu, 18 Jan 2024 12:04:17 +0000 (13:04 +0100)]
Fix memory leak in vectorizable_store
The following fixes a memory leak in vectorizable_store which happens
because the functions populating gvec_oprnds[i] will call .create ()
on the incoming vector, leaking what we've previously allocated.
* tree-vect-stmts.cc (vectorizable_store): Do not allocate
storage for gvec_oprnds elements.
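A toy model of this kind of leak, assuming a container whose create ()
allocates fresh storage without releasing what it held before. The
names toy_vec, populate and lost_buffers are hypothetical and this is
not GCC's vec; the toy frees the overwritten buffer only so that the
demonstration itself stays leak-free.

```cpp
#include <cstddef>

// Counts buffers whose only pointer was overwritten by a later create ().
static int lost_buffers = 0;

struct toy_vec
{
  int *data = nullptr;

  // Allocates new storage without considering existing storage.
  void create (std::size_t n)
  {
    if (data)
      {
        ++lost_buffers;   // a container that just overwrites would leak here
        delete[] data;    // freed only to keep this demo leak-free
      }
    data = new int[n] ();
  }

  void release ()
  {
    delete[] data;
    data = nullptr;
  }
};

// Hypothetical callee that always creates the storage itself.
void populate (toy_vec &v) { v.create (4); }

// Problematic caller: pre-allocates, then lets the callee create again.
int lost_with_preallocation ()
{
  lost_buffers = 0;
  toy_vec v;
  v.create (4);
  populate (v);
  v.release ();
  return lost_buffers;
}

// Fixed caller: leaves allocation to the callee.
int lost_without_preallocation ()
{
  lost_buffers = 0;
  toy_vec v;
  populate (v);
  v.release ();
  return lost_buffers;
}
```

In this model the first caller loses one buffer and the second loses
none, which is the shape of the fix: do not allocate storage that the
populating function is going to create anyway.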
Gaius Mulley [Thu, 18 Jan 2024 13:06:30 +0000 (13:06 +0000)]
PR modula2/111956 Many powerpc platforms do _not_ have support for IEEE754
This patch corrects commit r14-4149-g81d5ca0b9b8431f1bd7a5ec8a2c94f04bb0cf032, which assumed
all powerpc platforms would have IEEE754 long double. The patch
ensures that cc1gm2 obtains the default IEEE754 long double availability
from the configure generated tm_defines. The user command
line switches -mabi=ibmlongdouble and -mabi=ieeelongdouble are implemented
to override the configuration defaults.
Bootstrapped on power8 and power9 machines.
gcc/m2/ChangeLog:
PR modula2/111956
* Make-lang.in (host_mc_longreal): Remove.
* configure: Regenerate.
* configure.ac (M2C_LONGREAL_FLOAT128): Remove.
(M2C_LONGREAL_PPC64LE): Remove.
* gm2-compiler/M2Options.def (SetIBMLongDouble): New procedure.
(GetIBMLongDouble): New procedure function.
(SetIEEELongDouble): New procedure.
(GetIEEELongDouble): New procedure function.
* gm2-compiler/M2Options.mod (SetIBMLongDouble): New procedure.
(GetIBMLongDouble): New procedure function.
(SetIEEELongDouble): New procedure.
(GetIEEELongDouble): New procedure function.
(InitializeLongDoubleFlags): New procedure called during
module block initialization.
* gm2-gcc/m2configure.cc: Remove duplicate includes.
(m2configure_M2CLongRealFloat128): Remove.
(m2configure_M2CLongRealIBM128): Remove.
(m2configure_M2CLongRealLongDouble): Remove.
(m2configure_M2CLongRealLongDoublePPC64LE): Remove.
(m2configure_TargetIEEEQuadDefault): New function.
* gm2-gcc/m2configure.def (M2CLongRealFloat128): Remove.
(M2CLongRealIBM128): Remove.
(M2CLongRealLongDouble): Remove.
(M2CLongRealLongDoublePPC64LE): Remove.
(TargetIEEEQuadDefault): New function.
* gm2-gcc/m2configure.h (m2configure_M2CLongRealFloat128): Remove.
(m2configure_M2CLongRealIBM128): Remove.
(m2configure_M2CLongRealLongDouble): Remove.
(m2configure_M2CLongRealLongDoublePPC64LE): Remove.
(m2configure_TargetIEEEQuadDefault): New function.
* gm2-gcc/m2options.h (M2Options_SetIBMLongDouble): New prototype.
(M2Options_GetIBMLongDouble): New prototype.
(M2Options_SetIEEELongDouble): New prototype.
(M2Options_GetIEEELongDouble): New prototype.
* gm2-gcc/m2type.cc (build_m2_long_real_node): Re-implement using
results of M2Options_GetIBMLongDouble and M2Options_GetIEEELongDouble.
* gm2-lang.cc (gm2_langhook_handle_option): Add case
OPT_mabi_ibmlongdouble and call M2Options_SetIBMLongDouble.
Add case OPT_mabi_ieeelongdouble and call M2Options_SetIEEELongDouble.
* gm2config.aci.in: Regenerate.
* gm2spec.cc (lang_specific_driver): Remove block defined by
M2C_LONGREAL_PPC64LE.
Remove case OPT_mabi_ibmlongdouble.
Remove case OPT_mabi_ieeelongdouble.
H.J. Lu [Tue, 9 Jan 2024 16:46:59 +0000 (08:46 -0800)]
hwasan: Check if Intel LAM_U57 is enabled
When -fsanitize=hwaddress is used, libhwasan will try to enable LAM_U57
in the startup code. Update the target check to enable hwaddress tests
if LAM_U57 is enabled. Also compile hwaddress tests with -mlam=u57 on
x86-64 since hwasan requires LAM_U57 on x86-64.
* lib/hwasan-dg.exp (check_effective_target_hwaddress_exec):
Return 1 if Intel LAM_U57 is enabled.
(hwasan_init): Add -mlam=u57 on x86-64.
Jonathan Wakely [Thu, 18 Jan 2024 12:40:52 +0000 (12:40 +0000)]
libstdc++: Avoid -Wmaybe-uninitialized warnings in text_encoding.cc
These variables are only read from if we haven't reached the end of
either range, in which case they're guaranteed to be initialized to the
next alphanumeric character. But we can just initialize them to make the
compiler happy.
libstdc++-v3/ChangeLog:
* include/bits/unicode.h (__charset_alias_match): Initialize
__var_a and __var_b.
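The situation can be sketched with a simplified comparison loop. This is
not the libstdc++ implementation of __charset_alias_match; loose_match is
a hypothetical function that only ignores case and non-alphanumeric
characters. The variables ca and cb play the role described above: they
are only read when neither range is exhausted, but initializing them
avoids relying on the compiler to prove that.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified comparison: two names match if their
// alphanumeric characters are equal ignoring case.
bool loose_match (const std::string &a, const std::string &b)
{
  std::size_t i = 0, j = 0;
  while (true)
    {
      // Only read below when both ranges still have characters left.
      // Initialized anyway so no path can appear to read garbage.
      unsigned char ca = 0, cb = 0;

      // Advance to the next alphanumeric character in A, if any.
      while (i < a.size ())
        {
          ca = static_cast<unsigned char> (a[i]);
          if (std::isalnum (ca))
            break;
          ++i;
        }
      // Likewise for B.
      while (j < b.size ())
        {
          cb = static_cast<unsigned char> (b[j]);
          if (std::isalnum (cb))
            break;
          ++j;
        }

      const bool end_a = i == a.size ();
      const bool end_b = j == b.size ();
      if (end_a || end_b)
        return end_a && end_b;   // ca and cb are not read on this path

      if (std::tolower (ca) != std::tolower (cb))
        return false;
      ++i;
      ++j;
    }
}
```

For example, loose_match ("UTF-8", "utf8") is true and
loose_match ("UTF-8", "UTF-16") is false.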
Jonathan Wakely [Wed, 17 Jan 2024 21:40:25 +0000 (21:40 +0000)]
libstdc++: Fix std::format test for Solaris [PR113450]
When int8_t is a typedef for char (rather than signed char) this test
fails because it tries to format a char, which is treated differently
from formatting other integral types (including signed char).
Use signed char explicitly so the result doesn't depend on the
non-portable definition of int8_t.
libstdc++-v3/ChangeLog:
PR libstdc++/113450
* testsuite/std/format/functions/format.cc: Use signed char
instead of int8_t.
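To illustrate why the typedef matters, here is a toy overload set that
mimics the distinction. It deliberately does not use std::format: render
and int8_is_char are hypothetical names, and the only assumption is that
a formatter prints char as a character and other integer types as
numbers.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Toy formatter: char is rendered as a character ...
std::string render (char c) { return std::string (1, c); }

// ... while signed char and int are rendered as numbers.
std::string render (signed char v)
{
  return std::to_string (static_cast<int> (v));
}
std::string render (int v) { return std::to_string (v); }

// True on targets where int8_t is a typedef for char rather than
// signed char.
constexpr bool int8_is_char = std::is_same<std::int8_t, char>::value;
```

Here render (std::int8_t (65)) yields "65" where int8_t is signed char
but "A" where it is char, whereas render (static_cast<signed char> (65))
yields "65" everywhere. Spelling the type as signed char removes the
dependency on the typedef.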
Richard Biener [Thu, 18 Jan 2024 10:22:34 +0000 (11:22 +0100)]
Fix memory leak in vect_analyze_loop_form
The following fixes a memory leak in vect_analyze_loop_form which fails
to free the loop body it gets. It also allows more countable exits,
matching what we can handle later, when we decide which exit to use
as main exit. Finally, some comments that no longer apply are adjusted.
* tree-vect-loop.cc (vec_init_loop_exit_info): Adjust comment,
prefer all later exits we can handle.
(vect_analyze_loop_form): Free the allocated loop body.
Adjust comments.
Juzhe-Zhong [Thu, 18 Jan 2024 09:53:24 +0000 (17:53 +0800)]
RISC-V: Support vi variant for vec_cmp
While running various benchmarks, I noticed that we lack vi variant support for integer comparison.
That is, we can vectorize code into vadd.vi but we can't vectorize into vmseq.vi.
Consider the following case:
void
foo (int n, int **__restrict a)
{
  int b;
  int c;
  int d;
  for (b = 0; b < n; b++)
    for (long e = 8; e > 0; e--)
      a[b][e] = a[b][e] == 15;
}
This missing feature is an oversight on our part; support the vi variant for vec_cmp like the other patterns (add, sub, etc.).
Tested with no regressions; OK for trunk?
gcc/ChangeLog:
* config/riscv/autovec.md: Support vi variant.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-3.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-4.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-5.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-6.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-7.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-8.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-9.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/macro.h: New test.
Jakub Jelinek [Thu, 18 Jan 2024 09:24:25 +0000 (10:24 +0100)]
riscv: Remove Bool keywords from riscv.opt
As I wrote recently, Bool is an undocumented unsupported keyword, as
can be seen by
grep Bool doc/options.texi *.awk
The option parsing just parses and ignores all keywords it doesn't handle.
But, because it isn't a supported keyword, I think we shouldn't have it in
*.opt files, because that just means people copy it over to other places
even when it doesn't have any effect.
Tested with a cross to riscv64-linux, none of the generated
options.{h,cc} options-{save,urls}.cc
files change with the patch, only optionlist does (but that is just
used as a source for those files).
Jakub Jelinek [Thu, 18 Jan 2024 09:21:12 +0000 (10:21 +0100)]
i386: Add -masm=intel profiling support [PR113122]
x86_function_profiler emits assembly directly into the file and only emits
AT&T syntax. The following patch adjusts it to emit MASM syntax
if -masm=intel is used.
As it doesn't use asm_fprintf, I can't use {|} syntax for the dialects.
I've tested using
for i in -mcmodel=large "-mcmodel=large -fpic" "" -fpic "-m32 -fpic" "-m32"; do
./xgcc -B ./ -c -O2 -fprofile $i -masm=att pr113122.c -o pr113122.o1;
./xgcc -B ./ -c -O2 -fprofile $i -masm=intel pr113122.c -o pr113122.o2;
objdump -dr pr113122.o1 > /tmp/1; objdump -dr pr113122.o2 > /tmp/2;
diff -up /tmp/1 /tmp/2; done
that the emitted sequences are identical after assembly.
2024-01-18 Jakub Jelinek <jakub@redhat.com>
PR target/113122
* config/i386/i386.cc (x86_function_profiler): Add -masm=intel
support. Add missing space after , in emitted assembly in some
cases. Formatting fixes.
* gcc.target/i386/pr113122-1.c: New test.
* gcc.target/i386/pr113122-2.c: New test.
* gcc.target/i386/pr113122-3.c: New test.
* gcc.target/i386/pr113122-4.c: New test.
Georg-Johann Lay [Thu, 18 Jan 2024 08:59:38 +0000 (09:59 +0100)]
AVR: Fix typo in device-specs generation. Reuse -m[no-]rodata-in-ram checker.
gcc/
* config/avr/gen-avr-mmcu-specs.cc (diagnose_rodata_in_ram): Fix typo
in the diagnostic, and capitalize the device name.
(print_mcu): Generate specs such that:
<*check_rodata_in_ram>: New.
<*cc1_misc>: Use check_rodata_in_ram instead of cc1_rodata_in_ram.
<*link_misc>: Use check_rodata_in_ram instead of link_rodata_in_ram.
<*cc1_rodata_in_ram, *link_rodata_in_ram>: Remove.
Jakub Jelinek [Thu, 18 Jan 2024 07:51:53 +0000 (08:51 +0100)]
testsuite: Fix up scev-16.c test [PR113446]
This test FAILs on i686-linux or e.g. sparc*-solaris*, because
it uses vect_int effective target outside of */vect/ testsuite.
That is wrong: vect_int assumes the extra flags that vect.exp adds
by default, which aren't added in other testsuites.
The following patch fixes that by moving the test into gcc.dg/vect/
and doing small tweaks.
2024-01-18 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112774
PR testsuite/113446
* gcc.dg/tree-ssa/scev-16.c: Move test ...
* gcc.dg/vect/pr112774.c: ... here. Add PR comment line, use
dg-additional-options instead of dg-options and drop
-fdump-tree-vect-details.
Jakub Jelinek [Thu, 18 Jan 2024 07:46:15 +0000 (08:46 +0100)]
testsuite: Fix up gcc.target/i386/sse4_1-stv-1.c test [PR113452]
From what I can see, this test has been written for a backend fix and
assumes the loop isn't vectorized (at least, it wasn't when the test was
added, it contains an early exit), but that is no longer true and because
of the vectorization it now contains an instruction which the test scans
for not being present.
I think we should just disable vectorization here.
2024-01-18 Jakub Jelinek <jakub@redhat.com>
PR testsuite/113452
* gcc.target/i386/sse4_1-stv-1.c: Add -fno-tree-vectorize to
dg-options.
Jakub Jelinek [Thu, 18 Jan 2024 07:45:09 +0000 (08:45 +0100)]
opts: Fix up -ffold-mem-offsets option keywords
While the option was originally meant to be a Target option for a single
target, it is an option for all targets, so should be Common rather than
Target, and because it is an optimization option which could be different
in between different LTO TUs, I've added Optimization keyword too.
From what I can see, Bool is a non-documented non-existing keyword (at
least, grep Bool *.awk shows nothing), so I've dropped that too. It seems
that the option parsing simply parses and ignores any non-existing keywords.
I guess we should drop the Bool keywords from the gcc/config/riscv/riscv.opt
file eventually, so that people don't copy this around.
2024-01-18 Jakub Jelinek <jakub@redhat.com>
PR other/113399
* common.opt (ffold-mem-offsets): Remove Target and Bool keywords, add
Common and Optimization.
Richard Biener [Wed, 17 Jan 2024 13:05:42 +0000 (14:05 +0100)]
tree-optimization/113431 - wrong dependence with invariant load
The vectorizer dependence analysis is confused by invariant loads
when working out whether the circumstances are such that we preserve
scalar stmt execution order. The following rectifies this.
PR tree-optimization/113431
* tree-vect-data-refs.cc (vect_preserves_scalar_order_p):
When there is an invariant load we might not preserve
scalar order.
Richard Biener [Wed, 17 Jan 2024 12:24:22 +0000 (13:24 +0100)]
tree-optimization/113374 - early break vect and virtual operands
The following fixes wrong virtual operands being used for peeled
early breaks where we can have different live ones and for multiple
exits it makes sure to update the correct PHI arguments.
I've introduced SET_PHI_ARG_DEF_ON_EDGE so we can avoid using
a wrong edge to compute the PHI arg index from.
I've taken the liberty to understand the code again and refactor
and comment it a bit differently. The main functional change
is that we preserve the live virtual operand on all exits.
chenxiaolong [Sat, 13 Jan 2024 07:28:34 +0000 (15:28 +0800)]
LoongArch: testsuite:Fix fail in gen-vect-{2,25}.c file.
1. Added dg-do compile on LoongArch.
When binutils does not support vector instruction sets, an error occurs
because the assembler does not recognize vector instructions.
2. Added "-mlsx" option for vectorization on LoongArch.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/gen-vect-2.c: Added detection of compilation
behavior and "-mlsx" option on LoongArch.
* gcc.dg/tree-ssa/gen-vect-25.c: Ditto.
As PR101169 comment #c4 shows, previously the addi count
update on fold-vec-extract-char.p7.c masked a sub-optimal
code gen issue. On trunk, pass fold-mem-offsets helps to
recover the best code sequence, so this patch reverts
the count back to the original, which matches the
optimal addi count.
PR testsuite/111850
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fold-vec-extract-char.p7.c: Update the
checking count of addi to 6.
Sandra Loosemore [Thu, 18 Jan 2024 00:12:40 +0000 (00:12 +0000)]
Re-alphabetize attribute tables in extend.texi.
These sections used to be alphabetized, but when I was working on the
fix for PR111659 I noticed documentation for some newer attributes had
been inserted at random places in the tables instead of maintaining
alphabetical order. There's no change to content here, just moving
blocks of text around.
gcc/ChangeLog
* doc/extend.texi (Common Function Attributes): Re-alphabetize
the table.
(Common Variable Attributes): Likewise.
(Common Type Attributes): Likewise.
Nathaniel Shead [Sat, 16 Dec 2023 10:34:45 +0000 (21:34 +1100)]
c++: Prevent overwriting arguments when merging duplicates [PR112588]
When merging duplicate instantiations of function templates, currently
read_function_def overwrites the arguments with those of the existing
duplicate. This is problematic, however, since this means that the
PARM_DECLs in the body of the function definition no longer match with
the PARM_DECLs in the argument list, which causes issues when it comes
to generating RTL.
There doesn't seem to be any reason to do this replacement, so this
patch removes that logic.
Sandra Loosemore [Wed, 17 Jan 2024 21:37:19 +0000 (21:37 +0000)]
Clean up documentation for -Wstrict-flex-arrays [PR111659]
gcc/ChangeLog
PR middle-end/111659
* doc/extend.texi (Common Variable Attributes): Fix long lines
in documentation of strict_flex_array + other minor copy-editing.
Add a cross-reference to -Wstrict-flex-arrays.
* doc/invoke.texi (Option Summary): Fix whitespace in tables
before -fstrict-flex-arrays and -Wstrict-flex-arrays.
(C Dialect Options): Combine the docs for the two
-fstrict-flex-arrays forms into a single entry. Note this option
is for C/C++ only. Add a cross-reference to -Wstrict-flex-arrays.
(Warning Options): Note -Wstrict-flex-arrays is for C/C++ only.
Minor copy-editing. Add cross references to the strict_flex_array
attribute and -fstrict-flex-arrays option. Add note that this
option depends on -ftree-vrp.
Andrew Pinski [Tue, 16 Jan 2024 23:37:49 +0000 (15:37 -0800)]
aarch64: Fix aarch64_ldp_reg_operand predicate not to allow all subreg [PR113221]
So the problem here is that aarch64_ldp_reg_operand will allow all subregs, even a subreg of lo_sum.
When LRA tries to fix that up, everything breaks. So the fix is to change the check to only
allow reg and subreg of regs.
Note the tendency here is to use register_operand, but that checks the mode of the register,
and we need to allow mismatched modes for this predicate for now.
Built and tested for aarch64-linux-gnu with no regressions
(Also tested with the LD/ST pair pass back on).
PR target/113221
gcc/ChangeLog:
* config/aarch64/predicates.md (aarch64_ldp_reg_operand): For subreg,
only allow REG operands instead of allowing all.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr113221-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Vineet Gupta [Tue, 16 Jan 2024 21:23:42 +0000 (13:23 -0800)]
RISC-V: fix some vsetvl debug info in pass's Phase 2 code [NFC]
While staring at the VSETVL pass for PR/113429, I spotted some minor
improvements.
1. For readability, remove some redundant condition checks in the Phase 2
function earliest_fuse_vsetvl_info ().
2. Add the iteration count to debug prints in the same function.
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (earliest_fuse_vsetvl_info):
Remove redundant checks in else condition for readability.
(earliest_fuse_vsetvl_info): Print iteration count in debug
prints.
(earliest_fuse_vsetvl_info): Fix misleading vsetvl info
dump details in certain cases.