ada: Use the same warning character in continuation messages
For consitency sake the main and continuation messages should
use the same warning characters.
gcc/ada/
* exp_aggr.adb (Expand_Range_Component): Remove extra warning
character. Use same conditional warning char.
* freeze.adb (Warn_Overlay): Use named warning character.
* restrict.adb (Id_Case): Use named warning character.
* sem_prag.adb (Rewrite_Assertion_Kind): Use default warning
character.
ada: Restructure continuation message for pretty printing
Continuation messages should have the same location
as the main message. If the goal is to point to a different
location then Error_Msg_Sloc should be used to change
the location of the continuation message.
gcc/ada/
* par-ch4.adb (P_Name): Use Error_Msg_Sloc for the location of the
continuation message.
Viljar Indus [Wed, 14 Aug 2024 12:24:10 +0000 (15:24 +0300)]
ada: Avoid creating continuation messages without an intended parent
The messages modified in this patch do not have a clear intended
parent. This causes a lot of issues when grouping continuation
messages together with their parent. This can be confusing as it
is not obvious what was the parent message that caused this
problem or in worst case scenarios the message not being printed
alltogether.
These modified messages do not seem to be related to any concrete
error message and thus should be treated as independent messages.
gcc/ada/
* sem_ch12.adb (Abandon_Instantiation): Remove continuation
characters from the error message.
* sem_ch13.adb (Check_False_Aspect_For_Derived_Type): Remove
continuation characters from the error message.
* sem_ch6.adb (Assert_False): Avoid creating a continuation
message without a parent. If no primary message is created then
the message is considered as primary.
Viljar Indus [Fri, 21 Jun 2024 10:28:40 +0000 (13:28 +0300)]
ada: Parse the attributes of continuation messages correctly
Currently unless pretty printing is enabled we avoid parsing
the message strings for continuation messages. This leads
to inconsistent state for the Error_Msg_Object-s that are
being created.
gcc/ada/
* erroutc.adb (Prescan_Message): Avoid not parsing all of the
message attributes.
* erroutc.ads: Update the documentation.
Viljar Indus [Fri, 21 Jun 2024 10:19:10 +0000 (13:19 +0300)]
ada: Use consistent type continuations messages
Avoid cases where the main message is an error and the
continuation is a warning.
gcc/ada/
* freeze.adb: Remove warning insertion characters from a
continuation message.
* sem_util.adb: Remove warning insertion characters from a
continuation message.
* sem_warn.adb: Use same warning character as the main message.
Piotr Trojanek [Fri, 9 Aug 2024 15:52:51 +0000 (17:52 +0200)]
ada: Ensure validity checks for private scalar types
To check validity of data values, we must strip privacy from their
types.
gcc/ada/
* checks.adb (Expr_Known_Valid): Use Validated_View, which strips
type derivation and privacy.
* exp_ch3.adb (Simple_Init_Private_Type): Kill checks inside
unchecked conversions, just like in Simple_Init_Scalar_Type.
Gary Dismukes [Mon, 12 Aug 2024 22:50:57 +0000 (22:50 +0000)]
ada: Proper handling for iterator associations in array aggregates
The compiler was flagging type-mismatch errors on iterated component
associations in array aggregates of form "for C in <iterator_name>",
improperly requiring the type of the iterator to be the array index
type. The parser can't distinguish whether the association is one
involving an actual discrete choice vs. an iterator specification,
and creates an N_Iterated_Component_Association with a Defining_Identifer
and Discrete_Choices, and the analysis phase has to disambiguate this,
determining whether to create an N_Iterator_Specification node for
the association. A related change is to revise the similar code for
iterated associations of container aggregates, to allow forms of
iterator objects other than just function calls.
gcc/ada/
* sem_aggr.adb (Resolve_Array_Aggregate): Add loop over associations to locate
N_Iterated_Component_Associations that do not have an Iterator_Specification,
and if their Discrete_Choices list consists of a single choice, analyze it and
if it's the name of an iterator object, then create an Iterator_Specification
and associate it with the iterated component association.
(Resolve_Iterated_Association): Replace test for function call with test of
Is_Object_Reference, to handle other forms of iterator objects in container
aggregates.
Justin Squirek [Mon, 12 Aug 2024 18:31:39 +0000 (18:31 +0000)]
ada: Update documentation for conditional when constructs
This patch moves the documentation for conditional when constructs out of the
curated set (e.g. into -gnatX0).
gcc/ada/
* doc/gnat_rm/gnat_language_extensions.rst: Move conditional when
constructs out of the curated set.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Before the previous patch, a lax SUBREG check meant that we would
treat the subreg as a base and reload it into a base register.
But that wasn't what the target was expecting. Instead we should
treat "foo" as a constant displacement, to match:
leal foo, <dest>
After the patch, we recognised that "foo" isn't a base register,
but ICEd on it rather than handling it as a displacement.
With or without the recent patches, if the address had instead been:
then we would have treated "foo" as the displacement and R as the base
or index, as expected. The problem was that the code that does this was
rejecting all subregs of objects, rather than just subregs of variable
objects.
Make some smallest_int_mode_for_size calls cope with failure
smallest_int_mode_for_size now returns an optional mode rather
than aborting on failure. This patch adjusts a couple of callers
so that they fail gracefully when no mode exists.
There should be no behavioural change, since anything that triggers
the new return paths would previously have aborted. I just think
this is how the code would have been written if the option had been
available earlier.
AVR: target/115830 - Make better use of SREG.N and SREG.Z.
This patch adds new CC modes CCN and CCZN for operations that
set SREG.N, resp. SREG.Z and SREG.N. Add peephole2 patterns
to generate new compute + branch insns that make use
of the Z and N flags. Most of these patterns need their own
asm output routines that don't do all the micro-optimizations
that the ordinary outputs may perform, as the latter have no
requirement to set CC in a usable way.
We don't use cmpelim because it cannot provide scratch regs
(which peephole2 can), and some of the patterns require a
scratch reg, whereas the same operations that don't set REG_CC
don't require a scratch. See the comments in avr.md for details.
The existing add.for.cc* patterns are simplified as they no
more cover QImode, which is handled in a separate QImode case.
Apart from that, it adds 3 patterns for subtractions and one
pattern for shift left, all for multi-byte cases (HI, PSI, SI).
The add.for.cc* patterns now use CC[Z]Nmode, instead of the
formerly abuse of CCmode.
PR target/115830
gcc/
* config/avr/avr-modes.def (CCN, CCZN): New CC_MODEs.
* config/avr/avr-protos.h (avr_cond_branch): New from
ret_cond_branch.
(avr_out_plus_set_N, avr_op8_ZN_operator, avr_cmp0_code)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New protos.
(ccn_reg_rtx, cczn_reg_rtx): New declarations.
* config/avr/avr.cc (avr_cond_branch): New from ret_cond_branch.
(avr_cond_string): Add bool cc_overflow_unusable argument.
(avr_print_operand) ['L']: Like 'j' but overflow unusable.
['K']: Like 'k' but overflow unusable.
(avr_out_plus_set_ZN): Remove handling of QImode.
(avr_out_plus_set_N, avr_op8_ZN_operator, avr_cmp0_code)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New functions.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_N]: Hande case.
(avr_class_max_nregs): All MODE_CCs occupy one hard reg.
(avr_hard_regno_nregs): Same.
(avr_hard_regno_mode_ok) [REG_CC]: Allow all MODE_CC.
(pass_manager.h, context.h, tree-pass.h): Include them.
(ccn_reg_rtx, cczn_reg_rtx): New GTY variables.
(avr_init_expanders): Initialize them.
(avr_option_override): Run peephole2 a second time.
* config/avr/avr.md (adjust_len) [add_set_N]: New attr value.
(ALLCC, HI_SI): New mode iterators.
(CCname): New mode attribute.
(eqnegtle, cmp_signed, op8_ZN): New code iterators.
(swap, SWAP): New code attributes.
(branch): Handle CCNmode and CCZNmode. Assimilate...
(difficult_branch): ...this insn.
(p1m1): Remove.
(gen_add_for_<code>_<mode>): Adjust to CCNmode and CCZNmode. Use
HISI as mode iterator. Extend peephole2s that produce them.
(*add.for.eqne.<mode>): Extend to *add.for.cc[z]n.<mode>.
(*ashift.for.ccn.<mode>): New insn and peephole2 to make them.
(*sub.for.cczn.<mode>, *sub-extend<mode>.for.cczn.<mode>):
New insns and peephole2s to make them.
(*op8.for.cczn.<code>): New insn and peephole2 to make them.
* config/avr/predicates.md (const_1_to_3_operand)
(abs1_abs2_operand, signed_comparison_operator)
(op8_ZN_operator): New predicates.
gcc/testsuite/
* gcc.target/avr/pr115830-add.c: New test.
* gcc.target/avr/pr115830-add-c.c: New test.
* gcc.target/avr/pr115830-add-i.c: New test.
* gcc.target/avr/pr115830-and.c: New test.
* gcc.target/avr/pr115830-asl.c: New test.
* gcc.target/avr/pr115830-asr.c: New test.
* gcc.target/avr/pr115830-ior.c: New test.
* gcc.target/avr/pr115830-lsr.c: New test.
* gcc.target/avr/pr115830-asl32.c: New test.
* gcc.target/avr/pr115830-sub.c: New test.
* gcc.target/avr/pr115830-sub-ext.c: New test.
Arsen Arsenović [Fri, 16 Aug 2024 17:07:01 +0000 (19:07 +0200)]
c++: don't remove labels during coro-early-expand-ifns [PR105104]
In some scenarios, it is possible for the CFG cleanup to cause one of
the labels mentioned in CO_YIELD, which coro-early-expand-ifns intends
to remove, to become part of some statement. As a result, when that
label is removed, the statement it became part of becomes invalid,
crashing the compiler.
There doesn't appear to be a reason to remove the labels (anymore, at
least), so let's not do that.
PR c++/105104
gcc/ChangeLog:
* coroutine-passes.cc (execute_early_expand_coro_ifns): Don't
remove any labels.
Alex Coplan [Thu, 29 Aug 2024 10:31:40 +0000 (11:31 +0100)]
testsuite: Fix up refactored scanltranstree.exp functions [PR116522]
When adding RTL variants of the scan-ltrans-tree* functions in: r15-3254-g3f51f0dc88ec21c1ec79df694200f10ef85915f4
I messed up the name of the underlying scan function to invoke. The
code currently attempts to invoke functions named
scan{,-not,-dem,-dem-not} but should instead be invoking
scan-dump{,-not,-dem,-dem-not}. This patch fixes that.
gcc/testsuite/ChangeLog:
PR testsuite/116522
* lib/scanltranstree.exp: Fix name of underlying scan function
used for scan-ltrans-{tree,rtl}-dump{,-not,-dem,-dem-not}.
Robin Dapp [Tue, 27 Aug 2024 08:25:34 +0000 (10:25 +0200)]
RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].
When the source mode is potentially larger than one vector (e.g. an
LMUL2 mode for VLEN=128) we don't know which vector the subreg actually
refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI))
could actually be the a full (high) vector register of a two-register
group (at VLEN=128) or the higher part of a single register (at VLEN>128).
As the subreg is statically ambiguous we prevent such situations in
can_change_mode_class.
The culprit in PR116086 is
_12 = BIT_FIELD_REF <vect_cst__42, 128, 128>;
which can be expanded with a vector-vector extract (from V4DI to V2DI).
This patch adds a VLS-mode vector-vector extract that handles "halving"
cases like this one by sliding down the source vector, thus making sure
the correct part is used.
PR target/116086
gcc/ChangeLog:
* config/riscv/autovec.md (vec_extract<mode><v_half>): Add
vector-vector extract for VLS modes.
* config/riscv/riscv.cc (riscv_can_change_mode_class): Forbid
VLS modes larger than one vector.
* config/riscv/vector-iterators.md: Add vector-vector extract
iterators.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add effective target checks for
zvl256b and zvl512b.
* gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086.c: New test.
Roger Sayle [Thu, 29 Aug 2024 03:19:28 +0000 (21:19 -0600)]
i386: Support wide immediate constants in STV.
This patch provides more accurate costs/gains for (wide) immediate
constants in STV, suitably adjusting the costs/gains when the highpart
and lowpart words are the same.
2024-08-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (timode_immed_const_gain): New
function to determine the gain/cost on a CONST_WIDE_INT.
(timode_scalar_chain::compute_convert_gain): Fix whitespace.
<case CONST_WIDE_INT>: Provide more accurate estimates using
timode_immed_const_gain.
<case AND>: Handle CONSTANT_SCALAR_INT_P (src).
Mark Harmstone [Mon, 26 Aug 2024 21:16:11 +0000 (22:16 +0100)]
Write LF_MFUNC_ID types for CodeView struct member functions
If recording the definition of a struct member function, write an
LF_MFUNC_ID type rather than an LF_FUNC_ID. This links directly to the
struct type, rather than to an LF_STRING_ID with its name.
gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_MFUNC_ID.
(write_lf_mfunc_id): New function.
(add_lf_func_id): New function.
(add_lf_mfunc_id): New function.
(add_function): Call add_lf_func_id or add_lf_mfunc_id.
Mark Harmstone [Mon, 26 Aug 2024 21:40:56 +0000 (22:40 +0100)]
Record member functions in CodeView struct definitions
CodeView has two ways of recording struct member functions.
Non-overloaded functions have an LF_ONEMETHOD sub-type in the field
list, which records the name and the function type (LF_MFUNCTION).
Overloaded functions have an LF_METHOD instead, which points to an
LF_METHODLIST, which is an array of links to various LF_MFUNCTION types.
gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_MFUNCTION,
LF_METHODLIST, LF_METHOD, and LF_ONEMETHOD.
(struct codeview_subtype): Add lf_onemethod and lf_method to union.
(struct lf_methodlist_entry): New type.
(struct codeview_custom_type): Add lf_mfunc_id, lf_mfunction, and
lf_methodlist to union.
(struct codeview_method): New type.
(struct method_hasher): New type.
(get_type_num_subroutine_type): Add forward declaration.
(write_lf_fieldlist): Handle LF_ONEMETHOD and LF_METHOD.
(write_lf_mfunction): New function.
(write_lf_methodlist): New function.
(write_custom_types): Handle LF_MFUNCTION and LF_METHODLIST.
(add_struct_function): New function.
(get_mfunction_type): New function.
(is_templated_func): New function.
(get_type_num_struct): Handle DW_TAG_subprogram child DIEs.
(get_type_num_subroutine_type): Add containing_class_type, this_type,
and this_adjustment params, and handle creating LF_MFUNCTION types as
well as LF_PROCEDURE.
(get_type_num): New params for get_type_num_subroutine_type.
(add_function): New params for get_type_num_subroutine_type.
* dwarf2codeview.h (CV_METHOD_VANILLA, CV_METHOD_VIRTUAL): Define.
(CV_METHOD_STATIC, CV_METHOD_FRIEND, CV_METHOD_INTRO): Likewise.
(CV_METHOD_PUREVIRT, CV_METHOD_PUREINTRO): Likewise.
Mark Harmstone [Mon, 26 Aug 2024 20:34:46 +0000 (21:34 +0100)]
Record static data members in CodeView structs
Record LF_STMEMBER field list subtypes to represent static data members
in structs.
gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_STMEMBER.
(struct codeview_subtype): Add lf_static_member to union.
(write_lf_fieldlist): Handle LF_STMEMBER.
(add_struct_member): New function.
(add_struct_static_member): New function.
(get_accessibility): New function.
(get_type_num_struct): Split out into add_struct_member and
get_accessibility, and handle static members.
Mark Harmstone [Mon, 26 Aug 2024 20:19:51 +0000 (21:19 +0100)]
Handle scoping in CodeView LF_FUNC_ID types
If a function is in a namespace, create an LF_STRING_ID type for the
name of its parent, and record this in the LF_FUNC_ID type we create
for the function.
gcc/
* dwarf2codeview.cc (enum cf_leaf_type): Add LF_STRING_ID.
(struct codeview_custom_type): Add lf_string_id to union.
(struct string_id_hasher): New type.
(string_id_htab): New global variable.
(write_lf_string_id): New function.
(write_custom_types): Call write_lf_string_id.
(codeview_debug_finish): Free string_id_htab.
(add_string_id): New function.
(get_scope_string_id): New function.
(add_function): Call get_scope_string_id and set scope.
Marek Polacek [Wed, 28 Aug 2024 19:45:49 +0000 (15:45 -0400)]
c++: wrong error due to std::initializer_list opt [PR116476]
Here maybe_init_list_as_array gets elttype=field, init={NON_LVALUE_EXPR <2>}
and it tries to convert the init's element type (int) to field
using implicit_conversion, which works, so overall maybe_init_list_as_array
is successful.
But it constifies init_elttype so we end up with "const int". Later,
when we actually perform the conversion and invoke field::field(T&&),
we end up with this error:
error: binding reference of type 'int&&' to 'const int' discards qualifiers
So I think maybe_init_list_as_array should try to perform the conversion,
like it does below with fc.
PR c++/116476
gcc/cp/ChangeLog:
* call.cc (maybe_init_list_as_array): Try convert_like and see if it
worked.
Andrew Pinski [Mon, 26 Aug 2024 22:14:24 +0000 (15:14 -0700)]
expand: Add debug dump on the cost for `popcount==1` expand
While working on PR 114224, I found it would be useful to dump the
different costs of the expansion to make easier to understand why one
was chosen over the other.
Changes since v1:
* v2: make the dump a single line
Bootstrapped and tested on x86_64-linux-gnu.
Build and tested for aarch64-linux-gnu.
gcc/ChangeLog:
* internal-fn.cc (expand_POPCOUNT): Dump the costs for
the two choices.
Jonathan Wakely [Wed, 28 Aug 2024 11:38:18 +0000 (12:38 +0100)]
libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h>
I misused the AC_CHECK_DECL macro, assuming that it behaved like
AC_CHECK_DECLS and always defined a HAVE_xxx macro if the decl was
found. Instead, the [action-if-found] shell commands are needed to
defined HAVE_O_NONBLOCK explicitly.
Georg-Johann Lay [Fri, 23 Aug 2024 09:34:43 +0000 (11:34 +0200)]
AVR: Overhaul the avr-ifelse RTL optimization pass.
Mini-pass avr-ifelse realizes optimizations that replace two cbranch
insns with one comparison and two branches. This patch adds the
following improvements:
- The right operand of the comparisons may also be REGs.
Formerly only CONST_INT was handled.
- The RTX code of the first comparison in no more restricted
to (effectively) EQ.
- When the second cbranch is located in the fallthrough path
of the first cbranch, then difficult (expensive) comparisons
can always be avoided. This may require to swap the branch
targets. (When the second cbranch is located after the target
label of the first one, then getting rid of difficult branches
would require to reorder blocks.)
- The code has been cleaned up: avr_rest_of_handle_ifelse() now
just scans the insn stream for optimization candidates. The code
that actually performs the transformation has been outsourced to
the new function avr_optimize_2ifelse().
- The code to find a better representation for reg-const_int comparisons
has been split into two parts: First try to find codes such that the
right-hand sides of the comparisons are the same (avr_2comparisons_rhs).
When this succeeds then one comparison can serve two branches, and
that function tries to get rid of difficult branches. This is always
possible when the second cbranch is located in the fallthrough path
of the first one.
Some final notes on why we don't use compare-elim: 1) The two cbranch
insns may come with different scratch operands depending on the chosen
constraint alternatives. There are cases where the outgoing comparison
requires a scratch but only one incoming cbranch has one. 2) Avoiding
difficult branches can be achieved by rewiring basic blocks.
compare-elim doesn't do that; it doesn't even know the costs of the
branch codes. 3) avr_2comparisons_rhs() may de-canonicalize a
comparison to achieve its goal. compare-elim doesn't know how to do
that. 4) There are more reasons, see for example the commit message
and discussion for PR115830.
avr_2comparisons_rhs tries to decompose the interval as given by some
[u]intN_t into three intervals using the new Ranges struct that
implemens set operations on finite unions of intervals.
Sadly, value-range.h is not well suited for that, and writing a
wrapper around it that avoids all corner case ICEs would be more
laborious than struct Ranges.
gcc/
* config/avr/avr.cc (INCLUDE_VECTOR): Define it.
(cfganal.h): Include it.
(Ranges): New struct.
(avr_2comparisons_rhs, avr_redundant_compare_regs)
(avr_strict_signed_p, avr_strict_unsigned_p): New static functions.
(avr_redundant_compare): Overhaul: Allow more cases.
(avr_optimize_2ifelse): New static function, outsourced from...
(avr_rest_of_handle_ifelse): ...this method.
gcc/testsuite/
* gcc.target/avr/torture/ifelse-c.h: New file.
* gcc.target/avr/torture/ifelse-d.h: New file.
* gcc.target/avr/torture/ifelse-q.h: New file.
* gcc.target/avr/torture/ifelse-r.h: New file.
* gcc.target/avr/torture/ifelse-c-i8.c: New test.
* gcc.target/avr/torture/ifelse-d-i8.c: New test.
* gcc.target/avr/torture/ifelse-q-i8.c: New test.
* gcc.target/avr/torture/ifelse-r-i8.c: New test.
* gcc.target/avr/torture/ifelse-c-i16.c: New test.
* gcc.target/avr/torture/ifelse-d-i16.c: New test.
* gcc.target/avr/torture/ifelse-q-i16.c: New test.
* gcc.target/avr/torture/ifelse-r-i16.c: New test.
* gcc.target/avr/torture/ifelse-c-u16.c: New test.
* gcc.target/avr/torture/ifelse-d-u16.c: New test.
* gcc.target/avr/torture/ifelse-q-u16.c: New test.
* gcc.target/avr/torture/ifelse-r-u16.c: New test.
We cannot elide the TARGET_EXPR because we're taking its address.
It is set as eliding in massage_init_elt. I've tried to not set
TARGET_EXPR_ELIDING_P when the context is not direct-initialization.
That didn't work: even when it's not direct-initialization now, it
can become one later, for instance, after split_nonconstant_init.
One problem is that replace_placeholders_for_class_temp_r will replace
placeholders in non-eliding TARGET_EXPRs with the slot, but if we then
elide the TARGET_EXPR, we end up with a "stray" VAR_DECL and crash.
(Only some TARGET_EXPRs are handled by replace_decl.)
I thought I'd have to go back to
<https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651163.html> but
then I realized that this problem occurrs only with ()-init but not
{}-init. With {}-init, there is no problem, because we are clearing
TARGET_EXPR_ELIDING_P in process_init_constructor_record:
/* We can't actually elide the temporary when initializing a
potentially-overlapping field from a function that returns by
value. */
if (ce->index
&& TREE_CODE (next) == TARGET_EXPR
&& unsafe_copy_elision_p (ce->index, next))
TARGET_EXPR_ELIDING_P (next) = false;
But that does not happen for ()-init because we have no ce->index.
()-init doesn't allow brace elision so we don't really reshape them.
But I can just move the clearing a few lines down and then it handles
both ()-init and {}-init.
PR c++/116424
gcc/cp/ChangeLog:
* typeck2.cc (process_init_constructor_record): Move the clearing of
TARGET_EXPR_ELIDING_P down.
aarch64: Assume zero gather/scatter set-up cost for -mtune=generic
generic_vector_cost is not currently used by any SVE target
by default; it has to be specifically selected by -mtune=generic.
Its SVE costing has historically been somewhat idealised, since
it predated any actual SVE cores. This seems like a useful
tradition to continue, at least for testing purposes.
The ideal case is that gathers and scatters do not induce a specific
one-off overhead. This patch therefore sets the gather/scatter init
costs to zero.
This patch is necessary to switch -mtune=generic over to the
"new" vector costs.
gcc/
* config/aarch64/tuning_models/generic.h (generic_sve_vector_cost):
Set gather_load_x32_init_cost and gather_load_x64_init_cost to 0.
The SVE gather and scatter costs are classified based on whether
they do 4 loads per 128 bits (x32) or 2 loads per 128 bits (x64).
The number after the "x" refers to the number of bits in each
"container".
However, the test for which to use was based on the element size
rather than the container size. This meant that we'd use the
overly conservative x32 costs for VNx2SI gathers. VNx2SI gathers
are really .D gathers in which the upper half of each extension
result is ignored.
This patch is necessary to switch -mtune=generic over to the
"new" vector costs.
gcc/
* config/aarch64/aarch64.cc (aarch64_detect_vector_stmt_subtype)
(aarch64_vector_costs::add_stmt_cost): Use the x64 cost rather
than x32 cost for all VNx2 modes.
The documentation of ASM_INPUT_P implied that the flag has no
effect on ASM_EXPRs that have operands (and which therefore must be
extended asms). In fact we require ASM_INPUT_P to be false for all
extended asms.
Filip Kastl [Wed, 28 Aug 2024 13:47:44 +0000 (15:47 +0200)]
gimple ssa: switchconv: Use __builtin_popcount and support more types in exp transform [PR116355]
The gen_pow2p function generates (a & -a) == a as a fallback for
POPCOUNT (a) == 1. Not only is the bitmagic not equivalent to
POPCOUNT (a) == 1 but it also introduces UB (consider signed
a = INT_MIN).
This patch rewrites gen_pow2p to always use __builtin_popcount instead.
This means that what the end result GIMPLE code is gets decided by an
already existing machinery in a later pass. That is a cleaner solution
I think. This existing machinery also uses a ^ (a - 1) > a - 1 which is
the correct bitmagic.
While rewriting gen_pow2p I had to add logic for converting the
operand's type to a type that __builtin_popcount accepts. I naturally
also added this logic to gen_log2. Thanks to this, exponential index
transform gains the capability to handle all operand types with
precision at most that of long long int.
gcc/ChangeLog:
PR tree-optimization/116355
* tree-switch-conversion.cc (can_log2): Add capability to
suggest converting the operand to a different type.
(gen_log2): Add capability to generate a conversion in case the
operand is of a type incompatible with the logarithm operation.
(can_pow2p): New function.
(gen_pow2p): Rewrite to use __builtin_popcount instead of
manually inserting an internal fn call or bitmagic. Also add
capability to generate a conversion.
(switch_conversion::is_exp_index_transform_viable): Call
can_pow2p. Store types suggested by can_log2 and gen_log2.
(switch_conversion::exp_index_transform): Params of gen_pow2p
and gen_log2 changed so update their calls.
* tree-switch-conversion.h: Add m_exp_index_transform_log2_type
and m_exp_index_transform_pow2p_type to switch_conversion class
to track type conversions needed to generate the "is power of 2"
and logarithm operations.
gcc/testsuite/ChangeLog:
PR tree-optimization/116355
* gcc.target/i386/switch-exp-transform-1.c: Don't test for
presence of POPCOUNT internal fn after switch conversion. Test
for it after __builtin_popcount has had a chance to get
expanded.
* gcc.target/i386/switch-exp-transform-3.c: Also test char and
short.
Jason Merrill [Tue, 27 Aug 2024 17:17:20 +0000 (13:17 -0400)]
libstdc++: avoid -Wsign-compare
-Wsign-compare complained about these comparisons between (unsigned) size_t
and (signed) streamsize, or between (unsigned) native_handle_type
and (signed) -1. Fixed by adding casts to unify the types.
libstdc++-v3/ChangeLog:
* include/std/istream: Add cast to avoid -Wsign-compare.
* include/std/stacktrace: Likewise.
Richard Biener [Wed, 28 Aug 2024 09:04:07 +0000 (11:04 +0200)]
Fix leak of SLP nodes when building store interleaving
The following fixes a leak of the discovered single-lane store
SLP nodes from which we only use their children. This uncovers
a latent reference counting issue in the interleaving build where
we fail to increment their reference count.
Jason Merrill [Tue, 27 Aug 2024 17:14:45 +0000 (13:14 -0400)]
c++: add missing -Wc++??-extensions checks
The pedwarns for each of these features should be silenced by
the appropriate -Wno-c++??-extensions.
The handle_pragma_diagnostic_impl change is necessary so that we handle
-Wc++23-extensions early so it's available to interpret_float while lexing.
gcc/c-family/ChangeLog:
* c-pragma.cc (handle_pragma_diagnostic_impl): Also handle
-Wc++23-extensions early.
* c-lex.cc (interpret_float): Use -Wc++23-extensions for extended
floating point literal pedwarn.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_simple_type_specifier): Use
-Wc++20-extensions for auto parameter pedwarn.
* pt.cc (do_decl_instantiation, do_type_instantiation): Use
-Wc++11-extensions for 'extern template'.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/extern_template-7.C: New test.
* g++.dg/cpp23/ext-floating19.C: New test.
* g++.dg/cpp2a/abbrev-fn1.C: New test.
Tobias Burnus [Wed, 28 Aug 2024 09:50:43 +0000 (11:50 +0200)]
libgomp: Add interop types and routines to OpenMP's headers and module
This commit adds OpenMP 5.1+'s interop enumeration, type and routine
declarations to the C/C++ header file and, new in OpenMP TR13, also to
the Fortran module and omp_lib.h header file.
While a stub implementation is provided, only with foreign runtime
support by the libgomp GPU plugins and with the 'interop' directive,
this becomes really useful.
libgomp/ChangeLog:
* fortran.c (omp_get_interop_str_, omp_get_interop_name_,
omp_get_interop_type_desc_, omp_get_interop_rc_desc_): Add.
* libgomp.map (GOMP_5.1.3): New; add interop routines.
* omp.h.in: Add interop typedefs, enum and prototypes.
(__GOMP_DEFAULT_NULL): Define.
(omp_target_memcpy_async, omp_target_memcpy_rect_async):
Use it for the optional depend argument.
* omp_lib.f90.in: Add paramters and interfaces for interop.
* omp_lib.h.in: Likewise; move F90 '&' to column 81 for
-ffree-length-80.
* target.c (omp_get_num_interop_properties, omp_get_interop_int,
omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name,
omp_get_interop_type_desc, omp_get_interop_rc_desc): Add.
* config/gcn/target.c (omp_get_num_interop_properties,
omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
omp_get_interop_name, omp_get_interop_type_desc,
omp_get_interop_rc_desc): Add.
* config/nvptx/target.c (omp_get_num_interop_properties,
omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
omp_get_interop_name, omp_get_interop_type_desc,
omp_get_interop_rc_desc): Add.
* testsuite/libgomp.c-c++-common/interop-routines-1.c: New test.
* testsuite/libgomp.c-c++-common/interop-routines-2.c: New test.
* testsuite/libgomp.fortran/interop-routines-1.F90: New test.
* testsuite/libgomp.fortran/interop-routines-2.F90: New test.
* testsuite/libgomp.fortran/interop-routines-3.F: New test.
* testsuite/libgomp.fortran/interop-routines-4.F: New test.
* testsuite/libgomp.fortran/interop-routines-5.F: New test.
* testsuite/libgomp.fortran/interop-routines-6.F: New test.
* testsuite/libgomp.fortran/interop-routines-7.F90: New test.
Pan Li [Mon, 19 Aug 2024 02:02:46 +0000 (10:02 +0800)]
Test: Move pr116278 run test to dg/torture [NFC]
Move the run test of pr116278 to dg/torture and leave the risc-v the
asm check under risc-v part.
PR target/116278
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr116278-run-1.c: Take compile instead of run.
* gcc.target/riscv/pr116278-run-2.c: Ditto.
* gcc.dg/torture/pr116278-run-1.c: New test.
* gcc.dg/torture/pr116278-run-2.c: New test.
Pan Li [Tue, 27 Aug 2024 07:01:02 +0000 (15:01 +0800)]
Vect: Reconcile the const_int operand type of unsigned .SAT_ADD
The .SAT_ADD has 2 operand, when one of the operand may be INTEGER_CST.
For example _1 = .SAT_ADD (_2, 9) comes from below sample code.
Form 3:
#define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM) \
T __attribute__((noinline)) \
vec_sat_u_add_imm##IMM##_##T##_fmt_3 (T *out, T *in, unsigned limit) \
{ \
unsigned i; \
T ret; \
for (i = 0; i < limit; i++) \
{ \
out[i] = __builtin_add_overflow (in[i], IMM, &ret) ? -1 : ret; \
} \
}
DEF_VEC_SAT_U_ADD_IMM_FMT_3(uint64_t, 9)
It will fail to vectorize as the vectorizable_call will check the
operands is type_compatiable but the imm will be (const_int 9) with
the SImode, which is different from _2 (DImode). Aka:
uint64_t _1;
uint64_t _2;
_1 = .SAT_ADD (_2, 9);
This patch would like to reconcile the imm operand to the operand type
mode of _2 by fold_convert to make the vectorizable_call happy.
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_add_pattern): Add fold
convert for const_int to the type of operand 0.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-10.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-11.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-12.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-13.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-14.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-15.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-9.c: New test.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* match.pd: Add the matching for signed .SAT_ADD.
* tree-ssa-math-opts.cc (gimple_signed_integer_sat_add): Add new
matching func decl.
(match_unsigned_saturation_add): Try signed .SAT_ADD and rename
to ...
(match_saturation_add): ... here.
(math_opts_dom_walker::after_dom_children): Update the above renamed
func from caller.
gcc/testsuite:
PR testsuite/116271
* gcc.dg/vect/tsvc/vect-tsvc-s176.c [TRUNCATE_TEST]: Make sure
that m stays the same as the loop bound of the middle loop.
* gcc.dg/vect/tsvc/tsvc.h (get_expected_result) <s176> [TRUNCATE_TEST]:
Adjust expected value.
Pan Li [Tue, 27 Aug 2024 07:14:40 +0000 (15:14 +0800)]
RISC-V: Add testcases for unsigned scalar .SAT_SUB IMM form 4
This patch would like to add test cases for the unsigned scalar
.SAT_SUB IMM form 4. Aka:
Form 4:
#define DEF_SAT_U_SUB_IMM_FMT_4(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_4 (T x) \
{ \
return x > (T)IMM ? x - (T)IMM : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_4(uint64_t, 23)
The below test is passed for this patch.
* The rv64gcv regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_sub_imm-13.c: New test.
* gcc.target/riscv/sat_u_sub_imm-13_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-13_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-14.c: New test.
* gcc.target/riscv/sat_u_sub_imm-14_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-14_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-15.c: New test.
* gcc.target/riscv/sat_u_sub_imm-15_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-15_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-16.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-13.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-14.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-15.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-16.c: New test.
Pan Li [Tue, 27 Aug 2024 06:37:01 +0000 (14:37 +0800)]
RISC-V: Add testcases for unsigned scalar .SAT_SUB IMM form 3
This patch would like to add test cases for the unsigned scalar
.SAT_SUB IMM form 3. Aka:
Form 3:
#define DEF_SAT_U_SUB_IMM_FMT_3(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_3 (T y) \
{ \
return (T)IMM > y ? (T)IMM - y : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_3(uint64_t, 23)
The below test is passed for this patch.
* The rv64gcv regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_sub_imm-10.c: New test.
* gcc.target/riscv/sat_u_sub_imm-10_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-10_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-11.c: New test.
* gcc.target/riscv/sat_u_sub_imm-11_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-11_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-12.c: New test.
* gcc.target/riscv/sat_u_sub_imm-9.c: New test.
* gcc.target/riscv/sat_u_sub_imm-9_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-9_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-10.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-11.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-12.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-9.c: New test.
c++/coroutines: fix actor cases not being added to the current switch [PR109867]
Previously, we were building and inserting case_labels manually, which
led to them not being added into the currently running switch via
c_add_case_label. This led to false diagnostics that the user could not
act on.
PR c++/109867
gcc/cp/ChangeLog:
* coroutines.cc (expand_one_await_expression): Replace uses of
build_case_label with finish_case_label.
(build_actor_fn): Ditto.
(create_anon_label_with_ctx): Remove now-unused function.
Simon Martin [Mon, 26 Aug 2024 12:09:46 +0000 (14:09 +0200)]
c++: Don't show constructor internal name in error message [PR105483]
We mention 'X::__ct' instead of 'X::X' in the "names the constructor,
not the type" error for this invalid code:
=== cut here ===
struct X {};
void g () {
X::X x;
}
=== cut here ===
The problem is that we use %<%T::%D%> to build the error message, while
%qE does exactly what we need since we have DECL_CONSTRUCTOR_P. This is
what this patch does.
It also skips until the end of the statement and returns error_mark_node
for this and the preceding if block, to avoid emitting extra (useless)
errors.
PR c++/105483
gcc/cp/ChangeLog:
* parser.cc (cp_parser_expression_statement): Use %qE instead of
incorrect %<%T::%D%>. Skip to end of statement and return
error_mark_node in case of error.
gcc/testsuite/ChangeLog:
* g++.dg/parse/error36.C: Adjust test expectation.
* g++.dg/tc1/dr147.C: Likewise.
* g++.old-deja/g++.other/typename1.C: Likewise.
* g++.dg/diagnostic/pr105483.C: New test.
Patrick O'Neill [Tue, 20 Aug 2024 19:50:51 +0000 (12:50 -0700)]
RISC-V: Allow non-duplicate bool patterns in expand_const_vector
Currently we assert when encountering a non-duplicate boolean vector.
This patch allows non-duplicate vectors to fall through to the
gcc_unreachable and assert there.
This will be useful when adding a catch-all pattern to emit costs and
handle arbitary vectors.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Allow non-duplicate
to fall through other patterns before asserting.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Patrick O'Neill [Tue, 20 Aug 2024 18:51:50 +0000 (11:51 -0700)]
RISC-V: Emit costs for bool and stepped const vectors
These cases are handled in the expander
(riscv-v.cc:expand_const_vector). We need the vector builder to detect
these cases so extract that out into a new riscv-v.h header file.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (class rvv_builder): Move to riscv-v.h.
* config/riscv/riscv.cc (riscv_const_insns): Emit placeholder costs for
bool/stepped const vectors.
* config/riscv/riscv-v.h: New file.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Patrick O'Neill [Tue, 20 Aug 2024 18:29:12 +0000 (11:29 -0700)]
RISC-V: Handle case when constant vector construction target rtx is not a register
This manifests in RTL that is optimized away which causes runtime failures
in the testsuite. Update all patterns to use a temp result register if required.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Use tmp register if
needed.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Patrick O'Neill [Tue, 20 Aug 2024 18:38:20 +0000 (11:38 -0700)]
RISC-V: Reorder insn cost match order to match corresponding expander match order
The corresponding expander (riscv-v.cc:expand_const_vector) matches
const_vec_duplicate_p before const_vec_series_p. Reorder to match this
behavior when calculating costs.
Christophe Lyon [Wed, 21 Aug 2024 13:58:08 +0000 (13:58 +0000)]
arm: Always use vmov.f64 instead of vmov.f32 with MVE
With MVE, vmov.f64 is always supported (no need for +fp.dp extension).
This patch updates two patterns:
- in movdi_vfp, we incorrectly checked
TARGET_VFP_SINGLE || TARGET_HAVE_MVE instead of
TARGET_VFP_SINGLE && !TARGET_HAVE_MVE, and didn't take into account
these two possibilities when computing the length attribute.
- in thumb2_movdf_vfp, we checked only TARGET_VFP_SINGLE.
No need to update movdf_vfp, since it is enabled only for TARGET_ARM
(which is not the case when MVE is enabled).
The patch also updates gcc.target/arm/armv8_1m-fp64-move-1.c, to
accept only vmov.f64 instead of vmov.f32.
Tested on arm-none-eabi with:
qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto
qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve
qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve.fp
qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve.fp+fp.dp
H.J. Lu [Tue, 27 Aug 2024 14:03:22 +0000 (07:03 -0700)]
Extend check-function-bodies to allow label and directives
As PR target/116174 shown, we may need to verify labels and the directive
order. Extend check-function-bodies to support matched output lines to
allow label and directives.
gcc/
* doc/sourcebuild.texi (check-function-bodies): Add an optional
argument for matched output lines.
gcc/testsuite/
* gcc.target/i386/pr116174.c: Use check-function-bodies.
* lib/scanasm.exp (parse_function_bodies): Append the line if
$up_config(matched) matches the line.
(check-function-bodies): Add an argument for matched. Set
up_config(matched) to $matched. Append the expected line without
$config(line_prefix) to function_regexp if it starts with ".L".
Michael Matz [Thu, 22 Aug 2024 15:21:42 +0000 (17:21 +0200)]
LRA: Fix setup_sp_offset
This is part of making m68k work with LRA. See PR116429.
In short: setup_sp_offset is internally inconsistent. It wants to
setup the sp_offset for newly generated instructions. sp_offset for
an instruction is always the state of the sp-offset right before that
instruction. For that it starts at the (assumed correct) sp_offset
of the instruction right after the given (new) sequence, and then
iterates that sequence forward simulating its effects on sp_offset.
That can't ever be right: either it needs to start at the front
and simulate forward, or start at the end and simulate backward.
The former seems to be the more natural way. Funnily the local
variable holding that instruction is also called 'before'.
This changes it to the first variant: start before the sequence,
do one simulation step to get the sp-offset state in front of the
sequence and then continue simulating.
More details: in the problematic testcase we start with this
situation (sp_off before 550 is 0):
The call insn sp_off remains at the correct -16, but internally it's already
inconsistent here. If the sp_off before an insn is -16, and that insn
pre_decs sp, then the after-insn sp_off should be -20.
PR target/116429
* lra.cc (setup_sp_offset): Start with sp_offset from
before the new sequence, not from after.
Michael Matz [Thu, 22 Aug 2024 15:09:11 +0000 (17:09 +0200)]
LRA: Don't use 0 as initialization for sp_offset
this is part of making m68k work with LRA. See PR116374.
m68k has the property that sometimes the elimation offset
between %sp and %argptr is zero. During setting up elimination
infrastructure it's changes between sp_offset and previous_offset
that feed into insns_with_changed_offsets that ultimately will
setup looking at the instructions so marked.
But the initial values for sp_offset and previous_offset are
also zero. So if the targets INITIAL_ELIMINATION_OFFSET (called
in update_reg_eliminate) is zero then nothing changes, the
instructions in question don't get into the list to consider and
the sp_offset tracking goes wrong.
Solve this by initializing those member with -1 instead of zero.
An initial offset of that value seems very unlikely, as it's
in word-sized increments. This then also reveals a problem in
eliminate_regs_in_insn where it always uses sp_offset-previous_offset
as offset adjustment, even in the first_p pass. That was harmless
when previous_offset was uninitialized as zero. But all the other
code uses a different idiom of checking for first_p (or rather
update_p which is !replace_p&&!first_p), and using sp_offset directly.
So use that as well in eliminate_regs_in_insn.
PR target/116374
* lra-eliminations.cc (init_elim_table): Use -1 as initializer.
(update_reg_eliminate): Accept -1 as not-yet-used marker.
(eliminate_regs_in_insn): Use previous_sp_offset only when
not first_p.
Michael Matz [Thu, 22 Aug 2024 15:03:56 +0000 (17:03 +0200)]
final: go down ASHIFT in walk_alter_subreg
when experimenting with m68k plus LRA one of the
changes in the backend is to accept ASHIFTs (not only
MULT) as scale code for address indices. When then not
turning on LRA but using reload those addresses are
presented to it which chokes on them. While reload is
going away the change to make them work doesn't really hurt
(and generally seems useful, as MULT and ASHIFT really are
no different). So just add it.
PR target/116413
* final.cc (walk_alter_subreg): Recurse on AHIFT.
Jonathan Wakely [Tue, 27 Aug 2024 12:30:42 +0000 (13:30 +0100)]
libstdc++: Do not use std::vector<bool>::reference default ctor [PR115098]
This default constructor was made private by r15-3124-gb25b101bc38000 so
the pretty printer tests need a fix to stop using it. There's no
conforming way to get a default-constructed 'reference' now, e.g. trying
to access an element of a default-constructed std::vector<bool> will
trigger an assertion. Remove the tests, but leave a comment in the
printer code about handling it.
libstdc++-v3/ChangeLog:
PR libstdc++/115098
* python/libstdcxx/v6/printers.py (StdBitReferencePrinter): Add
comment.
* testsuite/libstdc++-prettyprinters/simple.cc: Do not default
construct std::vector<bool>::reference.
* testsuite/libstdc++-prettyprinters/simple11.cc: Likewise.
Jonathan Wakely [Tue, 27 Aug 2024 11:19:47 +0000 (12:19 +0100)]
c++: Add most missing C++20 and C++23 names to cxxapi-data.csv
This includes uncommenting the atomic_flag non-member functions, which
were added by PR libstdc++/103934.
Also generate a hint for std::ignore, which was recently tweaked to be
more generally useful by P2968R2, which r15-2324 implemented.
gcc/cp/ChangeLog:
* cxxapi-data.csv: Add C++20 and C++23 names from <chrono>,
<format>, <generator>, <iterator>, <print>, and <stdfloat>.
Set cxx11 dialect for std::ignore in <tuple>. Uncomment
atomic_flag functions from <atomic>.
* std-name-hint.gperf: Regenerate.
* std-name-hint.h: Regenerate.
Handle arithmetic on eliminated address indices [PR116413]
This patch fixes gcc.c-torture/compile/opout.c for m68k with LRA
enabled. The test has:
...
z (a, b)
{
return (int) &a + (int) &b + (int) x + (int) z;
}
so it adds the address of two incoming arguments. This ends up
being treated as an LEA in which the "index" is the incoming
argument pointer, which the LEA multiplies by 2. The incoming
argument pointer is then eliminated, leading to:
lra: Don't apply eliminations to allocated registers [PR116321]
The sequence of events in this PR is that:
- the function has many addresses in which only a single hard base
register is acceptable. Let's call the hard register H.
- IRA allocates that register to one of the pseudo base registers.
Let's call the pseudo register P.
- Some of the other addresses that require H occur when P is still live.
- LRA therefore has to spill P.
- When it reallocates P, LRA chooses to use FRAME_POINTER_REGNUM,
which has been eliminated to the stack pointer. (This is ok,
since the frame register is free.)
- Spilling P causes LRA to reprocess the instruction that uses P.
- When reprocessing the address that has P as its base, LRA first
applies the new allocation, to get FRAME_POINTER_REGNUM,
and then applies the elimination, to get the stack pointer.
The last step seems wrong: the elimination should only apply to
pre-existing uses of FRAME_POINTER_REGNUM, not to uses that result
from allocating pseudos. Applying both means that we get the wrong
register number, and therefore the wrong class.
The PR is about an existing testcase that fails with LRA on m86k.
gcc/
PR middle-end/116321
* lra-constraints.cc (get_hard_regno): Only apply eliminations
to existing hard registers.
(get_reg_class): Likewise.
Iain Sandoe [Mon, 26 Aug 2024 13:09:40 +0000 (14:09 +0100)]
c++, coroutines: The frame pointer is used in the helpers [PR116482].
We have a bogus warning about the coroutine state frame pointers
being apparently unused in the resume and destroy functions. Fixed
by making the parameters DECL_ARTIFICIAL.
PR c++/116482
gcc/cp/ChangeLog:
* coroutines.cc
(coro_build_actor_or_destroy_function): Make the parameter
decls DECL_ARTIFICIAL.
Richard Biener [Mon, 26 Aug 2024 11:50:00 +0000 (13:50 +0200)]
tree-optimization/116460 - ICE with DCE in forwprop
The following avoids removing stmts with defs that might still have
uses in the IL before calling simple_dce_from_worklist which might
remove those as that will wreck debug stmt generation. Instead first
perform use-based DCE and then remove stmts which may have uses in
code that CFG cleanup will remove. This requires tracking stmts
in to_remove by their SSA def so we can check whether it was removed
before without running into the issue that PHIs can be ggc_free()d
upon removal. So this adds to_remove_defs in addition to to_remove
which has to stay to track GIMPLE_NOPs we want to elide.
PR tree-optimization/116460
* tree-ssa-forwprop.cc (pass_forwprop::execute): First do
simple_dce_from_worklist and then remove stmts in to_remove.
Track defs to be removed in to_remove_defs.
Bernd Edlinger [Mon, 26 Aug 2024 16:06:52 +0000 (18:06 +0200)]
Fix another inline7.c test failure on sparc targets
This new test was reported to be still failing on sparc targets.
Here the number of DW_AT_ranges dropped to zero.
The test should pass on this architecture with -Os, -O2 and -O3.
I tried to improve also different known problematic targets,
where only one subroutine had DW_AT_ranges:
Those are armhf (arm with hard float), powerpc and powerpc64.
The best option is to use -Os: So far the only one, where
all two inline instances in this test had two DW_AT_ranges.
gcc/testsuite/ChangeLog:
PR other/116462
* gcc.dg/debug/dwarf2/inline7.c: Switch to -Os optimization.
Pan Li [Mon, 26 Aug 2024 07:58:52 +0000 (15:58 +0800)]
RISC-V: Support IMM for operand 1 of ussub pattern
This patch would like to allow IMM for the operand 1 of ussub pattern.
Aka .SAT_SUB(x, 22) as the below example.
Form 2:
#define DEF_SAT_U_SUB_IMM_FMT_2(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
{ \
return x >= (T)IMM ? x - (T)IMM : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_2(uint64_t, 1022)
It is almost the as support imm for operand 0 of ussub pattern, but
allow the second operand to be imm insted of the first operand.
The below test suites are passed for this patch:
1. The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_ussub): Gen xmode for the
second operand, aka y in parameter.
* config/riscv/riscv.md (ussub<mode>3): Allow const_int for operand 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_sub_imm-5.c: New test.
* gcc.target/riscv/sat_u_sub_imm-5_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-5_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-8.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-5.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-6.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-7.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-8.c: New test.
Nathaniel Shead [Thu, 22 Aug 2024 10:41:54 +0000 (20:41 +1000)]
c++/modules: Fix include translation for already-seen headers [PR99243]
After importing a header unit we learn about and setup any header
modules that we transitively depend on. However, this causes
'set_filename' to fail an assertion if we then come across this header
as an #include and attempt to translate it into a module. We still need
to do this translation so that libcpp learns that this is a header unit,
but we shouldn't error just because we've already seen it as an import.
Instead this patch merely checks and errors to handle the case of a
broken mapper implementation which supplies a different CMI path from
the one we already got.
As a drive-by fix, also make failing to find the CMI for a module be a
fatal error: any further errors in the TU are unlikely to be helpful.
PR c++/99243
gcc/cp/ChangeLog:
* module.cc (module_state::set_filename): Handle repeated calls
to 'set_filename' as long as the CMI path matches.
(maybe_translate_include): Adjust comment.
gcc/testsuite/ChangeLog:
* g++.dg/modules/map-2.C: Prune additional fatal error message.
* g++.dg/modules/inc-xlate-4_a.H: New test.
* g++.dg/modules/inc-xlate-4_b.H: New test.
* g++.dg/modules/inc-xlate-4_c.H: New test.
Nathaniel Shead [Thu, 22 Aug 2024 11:04:11 +0000 (21:04 +1000)]
c++/modules: Clean up include translation [PR110980]
Currently the handling of include translation is confusing to read,
using a tri-state integer without much clarity on what different states
mean. This patch cleans this up to use explicit enumerators indicating
the different possible states instead, and fixes a bug where the option
'-flang-info-include-translate' ended being accidentally unusable.
PR c++/110980
gcc/cp/ChangeLog:
* module.cc (maybe_translate_include): Clean up.
gcc/testsuite/ChangeLog:
* g++.dg/modules/inc-xlate-2_a.H: New test.
* g++.dg/modules/inc-xlate-2_b.H: New test.
* g++.dg/modules/inc-xlate-3.h: New test.
* g++.dg/modules/inc-xlate-3_a.H: New test.
combine.cc (make_more_copies): Copy attributes from the original pseudo, PR115883
The first of the late-combine passes, propagates some of the copies
made during the (in-time-)combine pass in make_more_copies into the
users of the "original" pseudo registers and removes the "old"
pseudos. That effectively removes attributes such as REG_POINTER,
which matter to LRA. The quoted PR is for an ICE-manifesting bug that
was exposed by the late-combine pass and went back to hiding with this
patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when
it was actually fixed. To wit, this patch is only incidentally
related to that bug.
In other words, the REG_POINTER attribute should not be required for
LRA to work correctly. This patch merely corrects state for those
propagated register-uses to ante late-combine.
For reasons not investigated, this fixes a failing test
"FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3"
for x86_64-linux-gnu.
PR middle-end/115883
* combine.cc (make_more_copies): Copy attributes from the original
pseudo to the new copy.
Arsen Arsenović [Fri, 23 Aug 2024 18:19:05 +0000 (20:19 +0200)]
c++/coros: do not assume coros don't nest [PR113457]
In the testcase presented in the PR, during template expansion, an
tsubst of an operand causes a lambda coroutine to be processed, causing
it to get an initial suspend and final suspend. The code for assigning
awaitable var names (get_awaitable_var) assumed that the sequence Is ->
Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one
coroutine before closing it at a time), and reset the counter used for
unique numbering each time a final suspend occured. This assumption is
false in a few cases, usually when lambdas are involved.
Instead of storing this counter in a static-storage variable, we can
store it in coroutine_info. This struct is local to each function, so
we don't need to worry about "cross-contamination" nor resetting.
PR c++/113457
gcc/cp/ChangeLog:
* coroutines.cc (struct coroutine_info): Add integer field
awaitable_number. This is a counter used for assigning unique
names to awaitable temporaries.
(get_awaitable_var): Use awaitable_number from coroutine_info
instead of the static int awn.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr113457-1.C: New test.
* g++.dg/coroutines/pr113457.C: New test.