Eric Botcazou [Mon, 29 Apr 2024 15:46:20 +0000 (17:46 +0200)]
Minor tweaks to code computing modular multiplicative inverse
This removes the last parameter of choose_multiplier, which is unused, adds
another assertion and more details to the description and various comments.
Likewise to the closely related invert_mod2n, except for the last parameter.
[changelog]
* expmed.h (choose_multiplier): Tweak description and remove last
parameter.
* expmed.cc (choose_multiplier): Likewise. Add assertion for the
third parameter and adds details to various comments.
(invert_mod2n): Tweak description and add assertion for the first
parameter.
(expand_divmod): Adjust calls to choose_multiplier.
* tree-vect-generic.cc (expand_vector_divmod): Likewise.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Likewise.
konglin1 [Wed, 8 May 2024 07:46:10 +0000 (15:46 +0800)]
x86: Fix cmov cost model issue [PR109549]
(if_then_else:SI (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg/v:SI 101 [ e ])
(reg:SI 102))
The cost is 8 for the rtx, the cost for
(eq (reg:CCZ 17 flags) (const_int 0 [0])) is 4,
but this is just an operator do not need to compute it's cost in cmov.
gcc/ChangeLog:
PR target/109549
* config/i386/i386.cc (ix86_rtx_costs): The XEXP (x, 0) for cmov
is an operator do not need to compute cost.
Aldy Hernandez [Tue, 7 May 2024 12:05:50 +0000 (14:05 +0200)]
Enable prange support.
This throws the switch on prange. After this patch, it is no longer
valid to store a pointer in an irange (or vice versa). Instead, they
must go in prange, which is faster and more memory efficient.
I will push this now, so I have time to do any follow-up bugfixing
before going on paternity leave.
There are various cleanups we plan on doing after this patch (faster
intersect/union, remove range-op-mixed.h, remove value_range in favor
of int_range_max, reclaim the name for the Value_Range temporary,
clean up range-ops, etc etc). But we will hold off on those for now
to make it easier to revert this patch, if for some reason we need to
do so while I'm away.
c++/modules: Stream unmergeable temporaries by value again [PR114856]
In r14-9266-g2823b4d96d9ec4 I gave all temporary vars a DECL_CONTEXT,
including those at namespace or global scope, so that they could be
properly merged across importers. However, not all of these temporary
vars are actually supposed to be mergeable.
For instance, in the attached testcase we have an unnamed temporary var
used in the NSDMI of a class member, which cannot properly merged -- but
it also doesn't need to be, as it'll be thrown away when the class type
itself is merged anyway.
This patch reverts the change made above and instead makes a weaker
adjustment that only causes temporary vars with linkage have a
DECL_CONTEXT to merge from. This way these unnamed, "unmergeable"
temporaries are properly streamed by value again.
PR c++/114856
gcc/cp/ChangeLog:
* call.cc (make_temporary_var_for_ref_to_temp): Set context for
temporaries with linkage.
* init.cc (create_temporary_var): Revert to only set context
when in a function decl.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr114856.h: New test.
* g++.dg/modules/pr114856_a.H: New test.
* g++.dg/modules/pr114856_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com> Reviewed-by: Patrick Palka <ppalka@redhat.com>
Andrew Pinski [Tue, 20 Feb 2024 21:38:28 +0000 (13:38 -0800)]
c++/c-common: Fix convert_vector_to_array_for_subscript for qualified vector types [PR89224]
After r7-987-gf17a223de829cb, the access for the elements of a vector type would lose the qualifiers.
So if we had `constvector[0]`, the type of the element of the array would not have const on it.
This was due to a missing build_qualified_type for the inner type of the vector when building the array type.
We need to add back the call to build_qualified_type and now the access has the correct qualifiers. So the
overloads and even if it is a lvalue or rvalue is correctly done.
Note we correctly now reject the testcase gcc.dg/pr83415.c which was incorrectly accepted after r7-987-gf17a223de829cb.
Built and tested for aarch64-linux-gnu.
PR c++/89224
gcc/c-family/ChangeLog:
* c-common.cc (convert_vector_to_array_for_subscript): Call build_qualified_type
for the inner type.
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_array_reference): Compare main variants
for the vector/array types instead of the types directly.
gcc/testsuite/ChangeLog:
* g++.dg/torture/vector-subaccess-1.C: New test.
* gcc.dg/pr83415.c: Change warning to error.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Fri, 15 Mar 2024 23:34:22 +0000 (16:34 -0700)]
DCE __cxa_atexit calls where the function is pure/const [PR19661]
In C++ sometimes you have a deconstructor function which is "empty", like for an
example with unions or with arrays. The front-end might not know it is empty either
so this should be done on during optimization.o
To implement it I added it to DCE where we mark if a statement is necessary or not.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Changes since v1:
* v2: Add support for __aeabi_atexit for arm-*eabi. Add extra comments.
Add cxa_atexit-5.C testcase for -fPIC case.
* v3: Fix testcases for the __aeabi_atexit (forgot to do in the v2).
PR tree-optimization/19661
gcc/ChangeLog:
* tree-ssa-dce.cc (is_cxa_atexit): New function.
(is_removable_cxa_atexit_call): New function.
(mark_stmt_if_obviously_necessary): Don't mark removable
cxa_at_exit calls.
(mark_all_reaching_defs_necessary_1): Likewise.
(propagate_necessity): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/cxa_atexit-1.C: New test.
* g++.dg/tree-ssa/cxa_atexit-2.C: New test.
* g++.dg/tree-ssa/cxa_atexit-3.C: New test.
* g++.dg/tree-ssa/cxa_atexit-4.C: New test.
* g++.dg/tree-ssa/cxa_atexit-5.C: New test.
* g++.dg/tree-ssa/cxa_atexit-6.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Tue, 30 Apr 2024 21:45:26 +0000 (14:45 -0700)]
MATCH: Add some more value_replacement simplifications (a != 0 ? expr : 0) to match
This adds a few more of what is currently done in phiopt's value_replacement
to match. I noticed this when I was hooking up phiopt's value_replacement
code to use match and disabling the old code. But this can be done
independently from the hooking up phiopt's value_replacement as phiopt
is already hooked up for simplified versions already.
/* a != 0 ? a / b : 0 -> a / b iff b is nonzero. */
/* a != 0 ? a * b : 0 -> a * b */
/* a != 0 ? a & b : 0 -> a & b */
We prefer the `cond ? a : 0` forms to allow optimization of `a * cond` which
uses that form.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/114894
gcc/ChangeLog:
* match.pd (`a != 0 ? a / b : 0`): New pattern.
(`a != 0 ? a * b : 0`): New pattern.
(`a != 0 ? a & b : 0`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-value-5.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
[committed] [RISC-V] Allow uarchs to set TARGET_OVERLAP_OP_BY_PIECES_P
This is almost exclusively work from the VRULL team.
As we've discussed in the Tuesday meeting in the past, we'd like to have a knob
in the tuning structure to indicate that overlapped stores during
move_by_pieces expansion of memcpy & friends are acceptable.
This patch adds the that capability in our tuning structure. It's off for all
the uarchs upstream, but we have been using it inside Ventana for our uarch
with success. So technically it's NFC upstream, but puts in the infrastructure
multiple organizations likely need.
gcc/
* config/riscv/riscv.cc (struct riscv_tune_param): Add new
"overlap_op_by_pieces" field.
(rocket_tune_info, sifive_7_tune_info): Set it.
(sifive_p400_tune_info, sifive_p600_tune_info): Likewise.
(thead_c906_tune_info, xiangshan_nanhu_tune_info): Likewise.
(generic_ooo_tune_info, optimize_size_tune_info): Likewise.
(riscv_overlap_op_by_pieces): New function.
(TARGET_OVERLAP_OP_BY_PIECES_P): define.
gcc/testsuite/
* gcc.target/riscv/memcpy-nonoverlapping.c: New test.
* gcc.target/riscv/memset-nonoverlapping.c: New test.
The following patch imeplements the C++26 P2893R3 - Variadic friends
paper. The paper allows for the friend type declarations to specify
more than one friend type specifier and allows to specify ... at
the end of each. The patch doesn't introduce tentative parsing of
friend-type-declaration non-terminal, but rather just extends existing
parsing where it is a friend declaration which ends with ; after the
declaration specifiers to the cases where it ends with ...; or , or ...,
In that case it pedwarns for cxx_dialect < cxx26, handles the ... and
if there is , continues in a loop to parse the further friend type
specifiers.
2024-05-07 Jakub Jelinek <jakub@redhat.com>
PR c++/114459
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_variadic_friend=202403L for C++26.
gcc/cp/
* parser.cc (cp_parser_member_declaration): Implement C++26
P2893R3 - Variadic friends. Parse friend type declarations
with ... or with more than one friend type specifier.
* friend.cc (make_friend_class): Allow TYPE_PACK_EXPANSION.
* pt.cc (instantiate_class_template): Handle PACK_EXPANSION_P
in friend classes.
gcc/testsuite/
* g++.dg/cpp26/feat-cxx26.C (__cpp_variadic_friend): Add test.
* g++.dg/cpp26/variadic-friend1.C: New test.
Jakub Jelinek [Tue, 7 May 2024 19:30:21 +0000 (21:30 +0200)]
expansion: Use __trunchfbf2 calls rather than __extendhfbf2 [PR114907]
The HF and BF modes have the same size/precision and neither is
a subset nor superset of the other.
So, using either __extendhfbf2 or __trunchfbf2 is weird.
The expansion apparently emits __extendhfbf2, but on the libgcc side
we apparently have __trunchfbf2 implemented.
I think it is easier to switch to using what is available rather than
adding new entrypoints to libgcc, even alias, because this is backportable.
2024-05-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114907
* expr.cc (convert_mode_scalar): Use trunc_optab rather than
sext_optab for HF->BF conversions.
* optabs-libfuncs.cc (gen_trunc_conv_libfunc): Likewise.
Jakub Jelinek [Tue, 7 May 2024 19:29:14 +0000 (21:29 +0200)]
tree-inline: Remove .ASAN_MARK calls when inlining functions into no_sanitize callers [PR114956]
In r9-5742 we've started allowing to inline always_inline functions into
functions which have disabled e.g. address sanitization even when the
always_inline function is implicitly from command line options sanitized.
This mostly works fine because most of the asan instrumentation is done only
late after ipa, but as the following testcase the .ASAN_MARK ifn calls
gimplifier adds can result in ICEs.
Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.
2024-05-07 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114956
* tree-inline.cc: Include asan.h.
(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has asan/hwasan
sanitization disabled.
Gaius Mulley [Tue, 7 May 2024 18:24:08 +0000 (19:24 +0100)]
PR modula2/114133 bugfix constants must be cast prior to vararg call
This bug fix corrects the test codes below by converting the constant
literals to the type required by C. In the testcases below the values, 1
etc were converted into the INTEGER type before being passed to a C
vararg function. By default in modula2 constant literal ordinals are
represented as the ZTYPE (the largest GCC integer type node).
gcc/testsuite/ChangeLog:
PR modula2/114133
* gm2/extensions/run/pass/callingc10.mod: Convert constant
literal numbers into INTEGER.
* gm2/extensions/run/pass/callingc11.mod: Ditto.
* gm2/extensions/run/pass/vararg2.mod: Ditto.
* gm2/iso/run/pass/packed.mod: Emit a printf as a runtime
diagnostic.
Jeff Law [Tue, 7 May 2024 17:43:09 +0000 (11:43 -0600)]
[RISC-V] [PATCH v2] Enable inlining str* by default
So with Chrstoph's patches from late 2022 we've had the ability to inline
strlen, and str[n]cmp (scalar). However, we never actually turned this
capability on by default!
This patch flips the those default to allow inlinining by default. It also
fixes one bug exposed by our internal testing when NBYTES is zero for strncmp.
I don't think that case happens enough to try and optimize it, we just disable
inline expansion for that instance.
This has been bootstrapped and regression tested on rv64gc at various times as
well as cross tested on rv64gc more times than I can probably count (we've have
this patch internally for a while). More importantly, I just successfully
tested it on rv64gc and rv32gcv elf configurations with the trunk
gcc/
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Do not inline
strncmp with zero size.
(emit_strcmp_scalar_compare_subword): Adjust rotation for rv32 vs rv64.
* config/riscv/riscv.opt (var_inline_strcmp): Enable by default.
(vriscv_inline_strncmp, riscv_inline_strlen): Likewise.
gcc/testsuite
* gcc.target/riscv/zbb-strlen-disabled-2.c: Turn off inlining.
Zac Walker [Mon, 12 Feb 2024 14:22:47 +0000 (15:22 +0100)]
Add aarch64-w64-mingw32 target to libgcc
Reuse MinGW definitions from i386 for libgcc. Move reused files to
libgcc/config/mingw folder.
libgcc/ChangeLog:
* config.host: Add aarch64-w64-mingw32 target. Adjust targets
after moving MinGW files.
* config/i386/t-gthr-win32: Move to...
* config/mingw/t-gthr-win32: ...here.
* config/i386/t-mingw-pthread: Move to...
* config/mingw/t-mingw-pthread: ...here.
* config/aarch64/t-no-eh: New file. EH is not yet implemented for
the target, and the default definition should be disabled.
aarch64: Add Cygwin and MinGW environments for AArch64
Define Cygwin and MinGW environment such as types, SEH definitions,
shared libraries, etc.
gcc/ChangeLog:
* config.gcc: Add Cygwin and MinGW difinitions.
* config/aarch64/aarch64-protos.h
(mingw_pe_maybe_record_exported_symbol): Declare functions
which are used in Cygwin and MinGW environment.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_unique_section): Likewise.
(mingw_pe_encode_section_info): Likewise.
* config/aarch64/cygming.h: New file.
This patch defines TARGET_AARCH64_MS_ABI in config.gcc and uses it to
exclude i386 functionality from aarch64 build and adjust MinGW headers
for AArch64 MS ABI.
gcc/ChangeLog:
* config.gcc: Define TARGET_AARCH64_MS_ABI.
* config/mingw/mingw-stdint.h (INTPTR_TYPE): Use
TARGET_AARCH64_MS_ABI to adjust MinGW headers for
AArch64 MS ABI.
(UINTPTR_TYPE): Likewise.
(defined): Likewise.
* config/mingw/mingw32.h (DEFAULT_ABI): Likewise.
(defined): Likewise.
* config/mingw/winnt.cc (defined): Use TARGET_ARM64_MS_ABI to
exclude ix86_get_callcvt.
(i386_pe_maybe_mangle_decl_assembler_name): Likewise.
(i386_pe_mangle_decl_assembler_name): Likewise.
aarch64: Mark x18 register as a fixed register for MS ABI
Define the MS ABI for aarch64-w64-mingw32.
Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
The X18 register is reserved on Windows for the TEB.
gcc/ChangeLog:
* config.gcc: Define TARGET_AARCH64_MS_ABI when
AArch64 MS ABI is used.
* config/aarch64/aarch64.h (FIXED_X18): Adjust
FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
(CALL_USED_X18): Likewise.
(FIXED_REGISTERS): Likewise.
* config/aarch64/aarch64-abi-ms.h: New file.
Jonathan Wakely [Wed, 1 May 2024 16:09:39 +0000 (17:09 +0100)]
libstdc++: Fix handling of incomplete UTF-8 sequences in _Unicode_view
Eddie Nolan reported to me that _Unicode_view was not correctly
implementing the substitution of ill-formed subsequences with U+FFFD,
due to failing to increment the counter when the iterator reaches the
end of the sequence before a multibyte sequence is complete. As a
result, the incomplete sequence was not completely consumed, and then
the remaining character was treated as another ill-formed sequence,
giving two U+FFFD characters instead of one.
To avoid similar mistakes in future, this change introduces a lambda
that increments the iterator and the counter together. This ensures the
counter is always incremented when the iterator is incremented, so that
we always know how many characters have been consumed.
libstdc++-v3/ChangeLog:
* include/bits/unicode.h (_Unicode_view::_M_read_utf8): Ensure
count of characters consumed is correct when the end of the
input is reached unexpectedly.
* testsuite/ext/unicode/view.cc: Test incomplete UTF-8
sequences.
Alex Coplan [Wed, 10 Apr 2024 15:30:36 +0000 (16:30 +0100)]
aarch64: Preserve mem info on change of base for ldp/stp [PR114674]
The ldp/stp fusion pass can change the base of an access so that the two
accesses end up using a common base register. So far we have been using
adjust_address_nv to do this, but this means that we don't preserve
other properties of the mem we're replacing. It seems better to use
replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
mem whose address we're changing.
The PR shows that by adjusting the other mem we lose alignment
information about the original access and therefore end up rejecting an
otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
This patch fixes that by using replace_equiv_address_nv instead.
Notably this is the same approach as taken by
aarch64_check_consecutive_mems when a change of base is required, so
this at least makes things more consistent between the ldp fusion pass
and the peepholes.
gcc/ChangeLog:
PR target/114674
* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
Use replace_equiv_address_nv on a change of base instead of
adjust_address_nv on the other access.
gcc/testsuite/ChangeLog:
PR target/114674
* gcc.target/aarch64/pr114674.c: New test.
Jonathan Wakely [Wed, 27 Mar 2024 15:24:05 +0000 (15:24 +0000)]
libstdc++: Constrain equality ops for std::pair, std::tuple, std::variant
Implement the changes from P2944R3 which add constraints to the
comparison operators of std::pair, std::tuple, and std::variant.
The paper also changes std::optional, but we already constrain its
comparisons using SFINAE on the return type. However, we need some
additional constraints on the [optional.comp.with.t] operators that
compare an optional with a value. The paper doesn't say to do that, but
I think it's needed because otherwise when the comparison for two
optional objects fails its constraints, the two overloads that are
supposed to be for comparing to a non-optional become the best overload
candidates, but are ambiguous (and we don't even get as far as checking
the constraints for satisfaction). I reported LWG 4072 for this.
The paper does not change std::expected, but probably should have done.
I'll submit an LWG issue about that and implement it separately.
Also add [[nodiscard]] to all these comparison operators.
libstdc++-v3/ChangeLog:
* include/bits/stl_pair.h (operator==): Add constraint.
* include/bits/version.def (constrained_equality): Define.
* include/bits/version.h: Regenerate.
* include/std/optional: Define feature test macro.
(__optional_rep_op_t): Use is_convertible_v instead of
is_convertible.
* include/std/tuple: Define feature test macro.
(operator==, __tuple_cmp, operator<=>): Reimplement C++20
comparisons using lambdas. Add constraints.
* include/std/utility: Define feature test macro.
* include/std/variant: Define feature test macro.
(_VARIANT_RELATION_FUNCTION_TEMPLATE): Add constraints.
(variant): Remove unnecessary friend declarations for comparison
operators.
* testsuite/20_util/optional/relops/constrained.cc: New test.
* testsuite/20_util/pair/comparison_operators/constrained.cc:
New test.
* testsuite/20_util/tuple/comparison_operators/constrained.cc:
New test.
* testsuite/20_util/variant/relops/constrained.cc: New test.
* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
Disable for C++20 and later.
* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
Remove dg-error line for target c++20.
Jonathan Wakely [Thu, 11 Apr 2024 14:35:11 +0000 (15:35 +0100)]
libstdc++: Update ABI test to disallow adding to released symbol versions
If we update the list of "active" symbols versions now, rather than when
adding a new symbol version, we will notice if new symbols get added to
the wrong version (as in PR 114692).
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_abi.cc: Update latest versions to
new versions that should be used in future.
Richard Biener [Tue, 20 Feb 2024 10:47:03 +0000 (11:47 +0100)]
middle-end/27800 - avoid unnecessary temporary during gimplification
This avoids a tempoary when gimplifying reg = a ? b : c, re-using
the LHS of an assignment if that's a register.
PR middle-end/27800
* gimplify.cc (gimplify_modify_expr_rhs): For a COND_EXPR
avoid a temporary from gimplify_cond_expr when the LHS is
a register by pushing the assignment into the COND_EXPR arms.
tree-optimization/110490 - bitcount for narrow modes
Bitcount operations popcount, clz, and ctz are emulated for narrow modes
in case an operation is only supported for wider modes. Beside that ctz
may be emulated via clz in expand_ctz. Reflect this in
expression_expensive_p.
I considered the emulation of ctz via clz as not expensive since this
basically reduces to ctz (x) = c - (clz (x & ~x)) where c is the mode
precision minus 1 which should be faster than a loop.
gcc/ChangeLog:
PR tree-optimization/110490
* tree-scalar-evolution.cc (expression_expensive_p): Also
consider mode widening for popcount, clz, and ctz.
Richard Biener [Thu, 4 Apr 2024 11:01:10 +0000 (13:01 +0200)]
Use unsigned for stack var indexes during RTL expansion
We're currently using size_t but at the same time storing them into
bitmaps which only support unsigned int index. The following makes
it unsigned int throughout, saving memory as well.
Rainer Orth [Tue, 7 May 2024 11:14:05 +0000 (13:14 +0200)]
build: Derive object names in make_sunver.pl
The recent move of libgfortran object files to subdirs and the resulting
breakage of libgfortran.so symbol exports demonstrated how fragile
deriving object and archive names from their libtool counterparts in the
Makefiles is. Therefore, this patch moves that step into
make_sunver.pl, considerably simplifying the Makefile rules to create
the version scripts.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11, verifying that the version scripts are identical
except for the input filenames.
Richard Biener [Fri, 3 May 2024 08:44:50 +0000 (10:44 +0200)]
middle-end/114931 - type_hash_canon and structual equality types
TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
to make sure to include that into the type unification done via
type_hash_canon. This requires the flag to be set before querying
the hash which is the biggest part of the patch.
PR middle-end/114931
gcc/
* tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
(type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
(build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
probing with type_hash_canon.
(build_function_type): Likewise.
(build_method_type_directly): Likewise.
(build_offset_type): Likewise.
(build_complex_type): Likewise.
* attribs.cc (build_type_attribute_qual_variant): Likewise.
gcc/c-family/
* c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
before probing with type_hash_canon.
Aldy Hernandez [Tue, 19 Mar 2024 16:55:58 +0000 (17:55 +0100)]
Minor range type fixes for IPA in preparation for prange.
The polymorphic Value_Range object takes a tree type at construction
so it can determine what type of range to use (currently irange or
frange). It seems a few of the types are slightly off. This isn't a
problem now, because IPA only cares about integers and pointers, which
can both live in an irange. However, with prange coming about, we
need to get the type right, because you can't store an integer in a
pointer range or vice versa.
Also, in preparation for prange, the irange::supports_p() idiom will become:
irange::supports_p () || prange::supports_p()
To avoid changing all these places, I've added an inline function we
can later change and change everything at once.
Finally, there's a Value_Range::supports_type_p() &&
irange::supports_p() in the code. The latter is a subset of the
former, so there's no need to check both.
Rainer Orth [Tue, 7 May 2024 08:45:55 +0000 (10:45 +0200)]
Remove obsolete Solaris 11.3 support
Support for Solaris 11.3 had already been obsoleted in GCC 13. However,
since the only Solaris system in the cfarm was running 11.3, I've kept
it in tree until now when both Solaris 11.4/SPARC and x86 systems have
been added.
This patch actually removes the Solaris 11.3 support. Apart from
several minor simplifications, there are two more widespread changes:
* In Solaris 11.4, libsocket and libnsl were folded into libc, so
there's no longer a need to link them explictly.
* Since Solaris 11.4, Solaris includes all crts needed by gcc (like
crt1.o and gcrt1.o) with the base system. All workarounds to provide
fallbacks can thus go.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11 (as/ld, gas/ld, and gas/gld) as well as Solaris
11.3/x86 to ascertain that version is actually rejected.
Piotr Trojanek [Wed, 10 Jan 2024 17:48:04 +0000 (18:48 +0100)]
ada: Fix calculation of tasks in null arrays
Fix handling of null arrays when calculating the secondary stack size
for the binder.
gcc/ada/
* sem_util.adb (Number_Of_Elements_In_Array): Fix counting of
elements in null arrays; remove redundant parenthesis; avoid
run-time conversion of 1 to universal integer.
Piotr Trojanek [Wed, 10 Jan 2024 12:14:34 +0000 (13:14 +0100)]
ada: Cleanup calculation of task stacks
Code cleanup; semantics is unaffected.
gcc/ada/
* exp_ch3.adb (Count_Default_Sized_Task_Stacks): Do not look for
tasks inside record discriminants; remove avoid repeated call to
Has_Task that happened for record components.
(Expand_N_Object_Declaration): Use high-level routine to detect
array types and subtypes; remove unused initial values.
Piotr Trojanek [Sun, 21 Aug 2022 18:56:26 +0000 (20:56 +0200)]
ada: Remove redundant guard against empty list of actions
Code cleanup.
gcc/ada/
* exp_ch4.adb (Useful): Remove redundant check for empty list,
because iteration with First works also for empty list; rename
local variable from L to Action.
Piotr Trojanek [Tue, 16 Jan 2024 12:08:18 +0000 (13:08 +0100)]
ada: Cleanup detection of per-object constraints in inlining for SPARK
In GNATprove mode we didn't inline subprograms whose formal parameters
was of a record type with constraints depending on discriminants. Now
this is extended to formal parameters with per-object constraints,
regardless if they come from references to discriminants or from
attributes prefixed by the current type instance.
gcc/ada/
* inline.adb (Has_Formal_With_Per_Object_Constrained_Component):
Use flag Has_Per_Object_Constraint which is set by analysis;
rename for consistency.
Piotr Trojanek [Tue, 16 Jan 2024 11:55:24 +0000 (12:55 +0100)]
ada: Fix detection of components with per-object constraints
Routine Contains_POC (where POC means "per-object constraint") was
failing to detect expressions of the form "Current_Type'Access", because
it was comparing prefix (typically an N_Identifier) with a scope
(typically an N_Definining_Entity). This was harmless, because these
expressions are detected anyway in Analyze_Access_Attribute, together
with uses of 'Unconstrained_Access and 'Unchecked_Access.
Also, this routine was failing to detect the use of discriminants in
array types with constrained subtype indication, e.g.:
type T (D : Integer) is record
C : array (Integer range 1 .. D);
end record;
It is simpler to just reuse Has_Discriminant_Dependent_Constraint and
leave detection of access attributes to Analyze_Access_Attribute.
gcc/ada/
* sem_attr.adb (Analyze_Access_Attribute): Prevent search from
going too far.
* sem_ch3.adb (Analyze_Component_Declaration): Remove
Contains_POC; reuse Has_Discriminant_Dependent_Constraint.
Eric Botcazou [Tue, 16 Jan 2024 08:18:15 +0000 (09:18 +0100)]
ada: Fix bad interaction between homogeneous finalization master and BIP protocol
Dynamically-allocated objects that require finalization are attached to a
finalization master, which is of a (limited) controlled type declared in
the System.Finalization_Masters unit. Now there are two kinds of them:
homogeneous and heterogeneous; for the former, all the objects attached
to the master share the same Finalize_Address primitive whereas, for the
latter, they may have different Finalize_Address primitives.
There is a problem in this scheme with the BIP protocol, because this
protocol forwards the finalization master from callers to callees and it
does so even if the result types are distinct, so it is possible for a
homogeneous finalization master to end up containing objects with different
Finalize_Address primitives; in that case, the object attached last wins
and sets the common Finalize_Address, which is then used to finalize other
objects with unpredictable outcome (and very loud valgrind report).
Therefore, this change gets rid of homogeneous finalization masters and
also streamlines the implementation of heterogeneous ones by storing the
Finalize_Address primitive on a per object basis in the FM_Node record.
gcc/ada/
* einfo.ads (Pending_Access_Types): Delete.
* exp_ch3.adb (Freeze_Type.Process_Pending_Access_Types): Likewise.
(Freeze_Type): Do not call Process_Pending_Access_Types.
* exp_ch7.ads (Make_Set_Finalize_Address_Call): Delete.
* exp_ch7.adb (Build_Finalization_Master.Add_Pending_Access_Type):
Delete.
(Build_Finalization_Master): Do not set Finalize_Address on the
master or call Add_Pending_Access_Type.
(Make_Set_Finalize_Address_Call): Delete.
* gen_il-fields.ads (Opt_Field_Enum): Remove Pending_Access_Types.
* gen_il-gen-gen_entities.adb (Type_Kind): Likewise.
* rtsfind.ads (RE_Id): Remove RE_Set_Finalize_Address.
(RE_Unit_Table): Likewise.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Do not deal with
pending access types.
* libgnat/s-finmas.ads (Attach_Unprotected): Add Finalize_Address
second parameter.
(Delete_Finalize_Address_Unprotected): Delete.
(Finalize_Address): Likewise.
(Finalize_Address_Unprotected): Likewise.
(Is_Homogeneous): Likewise.
(Set_Finalize_Address): Likewise.
(Set_Finalize_Address_Unprotected): Likewise.
(Set_Heterogeneous_Finalize_Address_Unprotected): Likewise.
(Set_Is_Heterogeneous): Likewise.
(FM_Node): Add Finalize_Address component.
(Finalization_Master): Remove Is_Homogeneous and Finalize_Address
components.
* libgnat/s-finmas.adb: Remove with & use clauses for System.HTable.
(Finalize_Address_Table): Delete.
(Attach_Unprotected): Add Finalize_Address second parameter and save
its value in the Finalize_Address field of the node.
(Delete_Finalize_Address_Unprotected): Delete.
(Finalize): Call Finalize_Address saved in the nodes.
(Finalize_Address): Delete.
(Finalize_Address_Unprotected): Likewise.
(Hash): Likewise.
(Is_Homogeneous): Likewise.
(Print_Master): Adjust.
(Set_Finalize_Address): Delete.
(Set_Finalize_Address_Unprotected): Likewise.
(Set_Heterogeneous_Finalize_Address_Unprotected): Likewise.
(Set_Is_Heterogeneous): Likewise.
* libgnat/s-stposu.adb (Finalize_Address_Table_In_Use): Likewise.
(Allocate_Any_Controlled): Pass Fin_Address to Attach_Unprotected
and remove obsolete processing.
(Deallocate_Any_Controlled): Remove obsolete processing.
(Set_Pool_Of_Subpool): Do not call Set_Is_Heterogeneous.
Joffrey Huguet [Mon, 15 Jan 2024 16:20:47 +0000 (17:20 +0100)]
ada: Add Global contracts to Ada.Numerics.Generic_Elementary_Functions
GNATprove raised warnings about unspecified Global contracts when
using functions from an instance of
Ada.Numerics.Generic_Elementary_Functions. This patch adds null Global
contracts to all subprograms.
Bob Duff [Fri, 12 Jan 2024 13:12:27 +0000 (08:12 -0500)]
ada: Fix crash on body postcondition
This patch fixes a bug where the compiler could crash on a postcondition
on a subprogram body (i.e. a body that "acts as spec"), if the
postcondition contains 'Old attributes that use the Ada 2022 feature
that allows certain conditionals (see RM-6.1.1).
The main bug fix here is in exp_attr.adb to set Ins_Node properly in the
Acts_As_Spec case. Otherwise, the initialization of the 'Old temp would
occur before the declaration, which gigi does not like.
gcc/ada/
* exp_attr.adb (Attribute_Old): The 'Old attribute we are
processing here is in a postcondition, which cannot be inside the
"Wrapped_Statements" of the subprogram with that postcondition. So
remove the loop labeled "Climb the parent chain looking for
subprogram _Wrapped_Statements". The only way this loop could find
a Subp is if we are nested inside a subprogram that also has a
postcondition, and in that case we would find the wrong (outer)
one. In any case, Subp is set to Empty after the loop, so all
subsequent tests for Present (Subp) are necessarily False; remove
them and the corresponding code. Set Ins_Node unconditionally (to
the right thing). Remove obsolete comments.
* sem_util.adb (Determining_Expressions): Fix assertion;
Pragma_Test_Case was missing.
(Eligible_For_Conditional_Evaluation): Fix assert that could fail
in case of errors.
* libgnat/s-valspe.ads: Remove pragma Unevaluated_Use_Of_Old;
there are no uses of 'Old in this package.
Steve Baird [Tue, 19 Dec 2023 00:17:40 +0000 (16:17 -0800)]
ada: Improve pragma No_Return's pre-Ada2022 handling of functions
Ada 2022 allows pragma No_Return to apply to a function (or a generic function).
For earlier Ada versions, if a No_Return pragma argument's possible
resolutions include a function (or a generic function) then we want to ignore
that candidate if a non-function candidate is also available and otherwise
to generate an error message mentioning that this is an Ada 2022 feature.
gcc/ada/
* sem_prag.adb (Analyze_Pragma): Restructure the loop over
possible resolutions of a No_Return pragma's argument so that
functions (and generic functions) are not processed until after it
is known whether there is a non-function candidate resolution. For
a pre-2022 Ada version, terminate the iteration before processing
functions if a non-function resolution is found.
Eric Botcazou [Tue, 9 Jan 2024 15:25:09 +0000 (16:25 +0100)]
ada: Fix LTO type mismatches in GNAT.Sockets.Thin
The default implementation of GNAT.Sockets.Thin is mainly used on Linux and
the socklen_t type used in various routines of the BSD sockets C API is a
typedef for unsigned int there, so importing it as Interface.C.int will be
flagged as a type mismatch during LTO compilation.
gcc/ada/
* libgnat/g-socthi.ads (C_Bind): Turn into inline function.
(C_Getpeername): Likewise.
(C_Getsockname): Likewise.
(C_Getsockopt): Likewise.
(C_Setsockopt): Likewise.
(Nonreentrant_Gethostbyaddr): Likewise.
* libgnat/g-socthi.adb (Syscall_Accept): Adjust profile.
(Syscall_Connect): Likewise.
(Syscall_Recvfrom): Likewise.
(Syscall_Sendto): Likewise.
(C_Bind): New function.
(C_Accept): Adjust to above change for profiles.
(C_Connect): Likewise.
(C_Getpeername): New function.
(C_Getsockname): Likewise.
(C_Getsockopt): Likewise.
(C_Recvfrom): Adjust to above change for profiles.
(C_Setsockopt): New function.
(Nonreentrant_Gethostbyaddr): Likewise.
Bob Duff [Tue, 9 Jan 2024 12:59:22 +0000 (07:59 -0500)]
ada: Aspects on multiple component declarations
This patch fixes a bug where aspect specifications were ignored
on all but the last of multiple component declarations.
For example, in a record type with components "X, Y: T with Volatile;"
only Y was marked Volatile; X was not. Both should be marked Volatile.
The fix is in Par.Ch3.P_Component_Items, where P_Aspect_Specifications
needs to be called each time through the loop.
In addition, various minor cleanups.
gcc/ada/
* par-ch3.adb (P_Component_Items): Move P_Aspect_Specifications
into the loop, so aspects can be attached to multiple component
declarations.
(P_Type_Declaration, P_Subtype_Declaration)
(P_Known_Discriminant_Part_Opt): Remove default for Semicolon in
calls to P_Aspect_Specifications.
* gen_il-gen-gen_nodes.adb (N_Discriminant_Specification): Add
Aspect_Specifications field to N_Discriminant_Specification, which
was missing.
* aspects.adb (Has_Aspect_Specifications_Flag): Make it True for
N_Discriminant_Specification.
* par-ch13.adb: Remove default for Semicolon in calls to
P_Aspect_Specifications.
(Get_Aspect_Specifications): Misc cleanup.
(P_Aspect_Specifications): Remove comment. It's not clear what
"the flag" is referring to, but anyway the first part of the
comment is obvious, and the second part is apparently obsolete.
Misc cleanup.
* par.adb (P_Aspect_Specifications, Get_Aspect_Specifications):
Remove default for Semicolon; calls are more readable that way.
Improve comments.
* par-ch12.adb: Remove default for Semicolon in calls to
P_Aspect_Specifications.
* par-ch6.adb: Likewise.
* par-ch7.adb: Likewise.
* par-ch9.adb: Likewise.
* par-endh.adb: Likewise.
Justin Squirek [Tue, 9 Jan 2024 15:08:08 +0000 (15:08 +0000)]
ada: Bad internal naming when using pragma Compile_Time_Error
This patch fixes an error in the compiler whereby the presence of a condition
which tests the size of a type not known at compile time within an instance
of pragma Compile_Time_Error causes incorrect internal names to be generated
for said type during expansion.
gcc/ada/
* sem_prag.adb (Defer_Compile_Time_Warning_Error_To_BE): Better
handle itypes such that the tree copy required for the expansion
of the pragma doesn't cause ordering problems with internal names.
Yannick Moy [Thu, 4 Jan 2024 13:18:04 +0000 (14:18 +0100)]
ada: Fix missing flag for GNATprove
GNATprove expects the frontend to position correctly range check
flags, on expressions which might lead to a range check failure.
This was missing on in-out parameters of calls. Now fixed.
There is no impact on compilation.
gcc/ada/
* sem_res.adb (Resolve_Actuals): Add range check flag.
Yannick Moy [Mon, 8 Jan 2024 08:53:58 +0000 (09:53 +0100)]
ada: Fix spurious error on generic state in SPARK
The public state of a generic package needs not be part of the state of
the enclosing unit, only the state of instantiations need to be accounted
for in the enclosing package. Now fixed.
gcc/ada/
* sem_util.adb (Find_Placement_In_State_Space): Stop search for
placement when reaching the public state of a generic package.
Add missing check of RM 6.5(5.3/5): when the result subtype of the
function is defined by a subtype mark, the subtype defined by the
subtype indication of the extended return statement shall be
statically compatible with the result subtype of the function.
gcc/ada/
* sem_ch3.adb (Check_Return_Subtype_Indication): Add missing check
on statically compatible subtypes.
* sem_eval.adb (Subtypes_Statically_Compatible): Ensure that both
types are either scalar types or access types to evaluate this
predicate.
Bob Duff [Fri, 5 Jan 2024 15:40:00 +0000 (10:40 -0500)]
ada: Fix bug in overloaded selected_components in aspect_specifications
This patch fixes a bug where if a selected_component X.Y appears in an
aspect_specification, and there are two or more overloaded Y's in X,
then it can choose the wrong one, leading to subsequent type errors.
It was always picking the last declaration of Y, and leaving Entity
set to that. We now reset Entity (as for the already-existing code
for N_Identifier just below).
Note that Resolve_Aspect_Expressions is called only for
aspect_specifications, and not even all aspect_specifications,
so the bug didn't occur for other names. For example,
Resolve_Aspect_Expressions is not called for aspect_specifications
in the visible part of a library package if there is no private part.
gcc/ada/
* sem_ch13.adb (Resolve_Name): This is called only for names in
aspect_specifications. If the name is an overloaded
selected_component, reset the Entity. Note that this was already
done for N_Identifier in the code just below.
Andrew Pinski [Mon, 6 May 2024 21:14:41 +0000 (14:14 -0700)]
Mention that some options are turned on by `-Ofast` in their descriptions [PR97263]
Like was done for -ffast-math in r0-105946-ga570fc16fa8056, we should
document that -Ofast enables -fmath-errno, -funsafe-math-optimizations,
-finite-math-only, -fno-trapping-math in their documentation.
Note this changes the stronger "must not" to be "is not" for -fno-trapping-math
since we do enable it for -Ofast already.
gcc/ChangeLog:
PR middle-end/97263
* doc/invoke.texi(fmath-errno): Document it is turned on
with -Ofast.
(funsafe-math-optimizations): Likewise.
(ffinite-math-only): Likewise.
(fno-trapping-math): Likewise and use less strong language.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
liuhongt [Wed, 20 Dec 2023 03:54:43 +0000 (11:54 +0800)]
Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf.
gcc/ChangeLog:
PR target/113090
* config/i386/i386-expand.cc
(expand_vec_perm_punpckldq_pshuf): New function.
(ix86_expand_vec_perm_const_1): Try
expand_vec_perm_punpckldq_pshuf for sequence of 2
instructions.
Add a new pru-specific pass to validate that the assumptions for the
minimal C runtime are not violated by the user program.
gcc/ChangeLog:
* config/pru/pru-passes.cc (class pass_pru_minrt_check): New
pass.
(pass_pru_minrt_check::execute): New method.
(make_pru_minrt_check): New function.
* config/pru/pru-passes.def (INSERT_PASS_AFTER): Register the
minrt check pass.
* config/pru/pru-protos.h (make_pru_minrt_check): Add
declaration.
gcc/testsuite/ChangeLog:
* g++.target/pru/minrt-1.cc: New test.
* g++.target/pru/minrt-2.cc: New test.
* g++.target/pru/minrt-3.cc: New test.
* g++.target/pru/pru.exp: New test.
* gcc.target/pru/minrt-1.c: New test.
* gcc.target/pru/minrt-2.c: New test.
* gcc.target/pru/minrt-3.c: New test.
Switch to using a passes definition file instead of explicitly
registering the PRU-specific passes in pru.cc. This would make it
cleaner to add new PRU-specific passes.
Dimitar Dimitrov [Sat, 13 Jan 2024 20:29:57 +0000 (22:29 +0200)]
pru: Skip register save if function will not return
There is no need to store callee-saved registers in prologue if the
function would never return. Size optimization is paramount for the
microcontroller-class PRU.
Some backends save some registers for noreturn functions. But for PRU
debuggability is a less concern because GDB has not been ported yet
for PRU.
gcc/ChangeLog:
* config/pru/pru.cc (prologue_saved_reg_p): Skip saving
if function will not return.
gcc/testsuite/ChangeLog:
* gcc.target/pru/noreturn-prologue-1.c: New test.
* gcc.target/pru/noreturn-prologue-2.c: New test.
Dimitar Dimitrov [Fri, 29 Dec 2023 19:22:58 +0000 (21:22 +0200)]
pru: Add pattern variants for zero extending destination
The higher bits in the result of some ALU operations are inherently
always zero when all input operands are smaller than 32-bits.
Add pattern variants to match when the resulting value is zero
extended, so that all operations can be effectively executed in a
single instruction. For PRU it simply means to use a wider register for
destination.
ALU operations which cannot be presented as zero-extending their
destination are addition, subtraction and logical shift left. The PRU
ALU performs all operations in 32-bit mode, so the carry-out and
shifted-out bits would violate the assumption that ALU operation was
performed in 16-bit or 8-bit mode, and result was zero-extended.
gcc/ChangeLog:
* config/pru/alu-zext.md (_noz0): New subst attribute.
(<code>_impl): Allow zero-extending the destination.
(<shift_op>): Remove unified pattern
(ashl_impl): New distinct pattern.
(lshr_impl): Ditto.
(alu3_zext_op0_subst): New subst iterator to zero-extend the
destination register.
gcc/testsuite/ChangeLog:
* gcc.target/pru/extzv-1.c: Update to mark the new more
efficient generated code sequence.
* gcc.target/pru/extzv-2.c: Ditto.
* gcc.target/pru/extzv-3.c: Ditto.
* gcc.target/pru/zero_extend-op0.c: New test.
Dimitar Dimitrov [Sun, 19 Nov 2023 10:20:08 +0000 (12:20 +0200)]
pru: Optimize the extzv and insv patterns
Optimize the generated code for the bit-field extract and insert
patterns:
- Use bit-set and bit-clear instructions for 1-bit fields.
- Expand to SImode operations instead of relying on the default
expansion to word (QI) mode.
gcc/ChangeLog:
* config/pru/pru.md (extzv<mode>): Make it an expand pattern,
handle efficiently zero-positioned bit-fields.
(insv<mode>): New expand pattern.
gcc/testsuite/ChangeLog:
* gcc.target/pru/ashiftrt.c: Minor update due to new (but
equivalent) generated code sequence.
* gcc.target/pru/extzv-1.c: New test.
* gcc.target/pru/extzv-2.c: New test.
* gcc.target/pru/extzv-3.c: New test.
* gcc.target/pru/insv-1.c: New test.
* gcc.target/pru/insv-2.c: New test.
* gcc.target/pru/insv-3.c: New test.
* gcc.target/pru/insv-4.c: New test.
Dimitar Dimitrov [Sun, 22 Oct 2023 11:44:29 +0000 (14:44 +0300)]
pru: Implement TARGET_ADDRESS_COST
Stop relying on the default fallback to TARGET_RTX_COST for PRU's
addressing costs. Implement TARGET_ADDRESS_COST, in order to allow RTX
cost refactoring in the future without affecting the addressing costs.
No code generation changes are expected by this patch. No changes were
detected when running embench-iot and building a few real-world firmware
examples.
The following further strengthens the check which convert expressions
we allow to vectorize as simple copy by resorting to
tree_nop_conversion_p on the vector components.
PR tree-optimization/114921
* tree-vect-stmts.cc (vectorizable_assignment): Use
tree_nop_conversion_p to identify converts we can vectorize
with a simple assignment.
Roger Sayle [Tue, 7 May 2024 06:14:40 +0000 (07:14 +0100)]
PR target/106060: Improved SSE vector constant materialization on x86.
This patch resolves PR target/106060 by providing efficient methods for
materializing/synthesizing special "vector" constants on x86. Currently
there are three methods of materializing a vector constant; the most
general is to load a vector from the constant pool, secondly "duplicated"
constants can be synthesized by moving an integer between units and
broadcasting (of shuffling it), and finally the special cases of the
all-zeros vector and all-ones vectors can be loaded via a single SSE
instruction. This patch handle additional cases that can be synthesized
in two instructions, loading an all-ones vector followed by another SSE
instruction. Following my recent patch for PR target/112992, there's
conveniently a single place in i386-expand.cc where these special cases
can be handled.
Two examples are given in the original bugzilla PR for 106060.
vpcmpeqd %ymm0, %ymm0, %ymm0
vpaddb %ymm0, %ymm0, %ymm0
ret
2024-05-07 Roger Sayle <roger@nextmovesoftware.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog
PR target/106060
* config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New.
(struct ix86_vec_bcast_map_simode_t): New type for table below.
(ix86_vec_bcast_map_simode): Table of SImode constants that may
be efficiently synthesized by a ix86_vec_bcast_alg method.
(ix86_vec_bcast_map_simode_cmp): New comparator for bsearch.
(ix86_vector_duplicate_simode_const): Efficiently synthesize
V4SImode and V8SImode constants that duplicate special constants.
(ix86_vector_duplicate_value): Attempt to synthesize "special"
vector constants using ix86_vector_duplicate_simode_const.
* config/i386/i386.cc (ix86_rtx_costs) <case ABS>: ABS of a
vector integer mode costs with a single SSE instruction.
gcc/testsuite/ChangeLog
PR target/106060
* gcc.target/i386/auto-init-8.c: Update test case.
* gcc.target/i386/avx512fp16-13.c: Likewise.
* gcc.target/i386/pr100865-9a.c: Likewise.
* gcc.target/i386/pr101796-1.c: Likewise.
* gcc.target/i386/pr106060-1.c: New test case.
* gcc.target/i386/pr106060-2.c: Likewise.
* gcc.target/i386/pr106060-3.c: Likewise.
* gcc.target/i386/pr70314.c: Update test case.
* gcc.target/i386/vect-shiftv4qi.c: Likewise.
* gcc.target/i386/vect-shiftv8qi.c: Likewise.
Nathaniel Shead [Fri, 3 May 2024 09:36:17 +0000 (19:36 +1000)]
c++/modules: Fix dangling pointer with imported_temploid_friends
I got notified by Linaro CI and by checking testresults that there seems
to be some occasional failures in tpl-friend-4_b.C on some architectures
and standards modes since r15-59-gb5f6a56940e708. I haven't been able
to reproduce but looking at the backtrace I suspect the issue is that
we're adding to the 'imported_temploid_friend' map a decl that is
ultimately discarded, which then has its address reused by a later decl
causing a failure in the assert in 'set_originating_module'.
This patch fixes the problem by ensuring 'imported_temploid_friends' is
correctly marked as a GTY root, and that 'duplicate_decls' properly
removes entries from the map for declarations that it frees.
PR c++/114275
gcc/cp/ChangeLog:
* cp-tree.h (remove_defining_module): Declare.
* decl.cc (duplicate_decls): Call remove_defining_module on
to-be-freed newdecl.
* module.cc (imported_temploid_friends): Mark as GTY root...
(init_modules): ...and allocate from ggc.
(trees_in::decl_value): Only track for declarations that won't
be discarded.
(remove_defining_module): New function.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com> Reviewed-by: Patrick Palka <ppalka@redhat.com>
Xiao Zeng [Mon, 6 May 2024 21:57:37 +0000 (15:57 -0600)]
[PATCH 1/1] RISC-V: Add Zfbfmin extension to the -march= option
This patch would like to add new sub extension (aka Zfbfmin) to the
-march= option. It introduces a new data type BF16.
1 The Zfbfmin extension depend on 'F', and the FLH, FSH, FMV.X.H, and
FMV.H.X instructions as defined in the Zfh extension.
2 The Zfhmin extension includes the following instructions from the
Zfh extension: FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.
3 Zfhmin extension depend on 'F'.
4 Simply put, just make Zfbfmin dependent on Zfhmin.
Perhaps in the future, we could propose making the FLH, FSH, FMV.X.H, and
FMV.H.X instructions an independent extension to achieve precise dependency
relationships for the Zfbfmin.
You can locate more information about Zfbfmin from below spec doc.
gcc/testsuite/
* gcc.target/riscv/arch-35.c: New test.
* gcc.target/riscv/arch-36.c: New test.
* gcc.target/riscv/predef-34.c: New test.
* gcc.target/riscv/predef-35.c: New test.
David Edelsohn [Sun, 5 May 2024 20:17:51 +0000 (16:17 -0400)]
aix: Fix building fat library for AIX
With the change in subdirectories, the code for libgfortran fat libraries
needs to be adjusted to explicitly reference the subdirectory. AIX
creates fat library archives and the compiler itself can be built as
either 32 bit or 64 bit application and default code generation. For
the two, alternate versions of the compiler to interoperate, GCC needs
to construct the fat libraries manually.
The Makefile fragment had been trying to leverage as much of the existing
targets and macros as possible. With the subdirectory change, the
location of single.o is more obscured and cannot be determined without
libtool. This patch references the location of the real object file
more explicitly.
Utilizing subst seems like overkill and unnecessary obscuration for a single
object file. Either way, it's digging below the libtool abstraction layer.
Xiao Zeng [Mon, 6 May 2024 21:39:12 +0000 (15:39 -0600)]
[RISC-V] Add support for _Bfloat16
1 At point <https://github.com/riscv/riscv-bfloat16>,
BF16 has already been completed "post public review".
2 LLVM has also added support for RISCV BF16 in
<https://reviews.llvm.org/D151313> and
<https://reviews.llvm.org/D150929>.
3 According to the discussion <https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/367>,
this use __bf16 and use DF16b in riscv_mangle_type like x86.
Below test are passed for this patch
* The riscv fully regression test.
gcc/ChangeLog:
* config/riscv/iterators.md: New mode iterator HFBF.
* config/riscv/riscv-builtins.cc (riscv_init_builtin_types):
Initialize data type _Bfloat16.
* config/riscv/riscv-modes.def (FLOAT_MODE): New.
(ADJUST_FLOAT_FORMAT): New.
* config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_init_libfuncs): Set the conversion method for BFmode and
HFmode.
(riscv_block_arith_comp_libfuncs_for_mode): Set the arithmetic
and comparison libfuncs for the mode.
* config/riscv/riscv.md (mode" ): Add BF.
(movhf): Support for BFmode.
(mov<mode>): Ditto.
(*movhf_softfloat): Ditto.
(*mov<mode>_softfloat): Ditto.
libgcc/ChangeLog:
* config/riscv/sfp-machine.h (_FP_NANFRAC_B): New.
(_FP_NANSIGN_B): Ditto.
* config/riscv/t-softfp32: Add support for BF16 libfuncs.
* config/riscv/t-softfp64: Ditto.
* soft-fp/floatsibf.c: For si -> bf16.
* soft-fp/floatunsibf.c: For unsi -> bf16.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/bf16_arithmetic.c: New test.
* gcc.target/riscv/bf16_call.c: New test.
* gcc.target/riscv/bf16_comparison.c: New test.
* gcc.target/riscv/bf16_float_libcall_convert.c: New test.
* gcc.target/riscv/bf16_integer_libcall_convert.c: New test.