Tobias Burnus [Tue, 6 Aug 2024 08:34:28 +0000 (10:34 +0200)]
libgomp: Device load_image - improve minor num-funcs/vars check
The run time library loads the offload functions and variables and optionally
the ICV variable and returns the number of loaded items, which has to match
the host side. The plugin returns "+1" (since GCC 12) for the ICV variable
entry, independently of whether it was loaded or not, but the var's value
(start == end == 0) can be used to detect when this failed.
Thus, we can tighten the assert check - which this commit does together with
making the output less surprising - and simplify the condition further below.
libgomp/ChangeLog:
* target.c (gomp_load_image_to_device): Extend fatal-error message;
simplify a condition.
Richard Biener [Fri, 2 Aug 2024 11:49:34 +0000 (13:49 +0200)]
middle-end/111821 - compile-time/memory-hog with large copy
The following fixes a compile-time/memory-hog when performing a
large aggregate copy to a small object allocated to a register.
While store_bit_field_1 called by store_integral_bit_field will
do nothing for accesses outside of the target, the loop over the
source in store_integral_bit_field will still code-generate
the read parts for all words in the source. The following copies
the noop condition from store_bit_field_1 and terminates the
loop when it runs forward or avoids code-generating the read parts
when not.
PR middle-end/111821
* expmed.cc (store_integral_bit_field): Terminate the
word-wise copy loop when we get out of the destination
and do a forward copy. Skip the word if it would be
outside of the destination in case of a backward copy.
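The loop change described above can be modelled with a small counting sketch. This is not the code in expmed.cc; the function name words_read and its parameters are hypothetical, and the bit-offset computation is simplified to wordnum * word_bits. It only illustrates why a forward copy can stop at the first out-of-range word while a backward copy has to skip words individually.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: counts how many source words a word-wise copy would
// read when the destination holds only `dest_bits` bits.
std::size_t words_read(std::size_t nwords, std::size_t word_bits,
                       std::size_t dest_bits, bool backwards)
{
  assert(word_bits > 0);
  std::size_t reads = 0;
  for (std::size_t i = 0; i < nwords; ++i)
    {
      // Forward copies visit words in ascending order, backward copies in
      // descending order.
      std::size_t wordnum = backwards ? nwords - i - 1 : i;
      std::size_t bit_offset = wordnum * word_bits;
      if (bit_offset >= dest_bits)
        {
          if (!backwards)
            break;    // Forward: every later word is out of range as well.
          continue;   // Backward: a later iteration may still be in range.
        }
      ++reads;        // Only now would the read of this word be generated.
    }
  return reads;
}
```

With 1000 source words of 64 bits and a 128-bit destination, the model reads two words in either direction instead of all 1000.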
Paul Thomas [Tue, 6 Aug 2024 05:42:27 +0000 (06:42 +0100)]
Fortran: Fix class transformational intrinsic calls [PR102689]
2024-08-06 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/102689
* trans-array.cc (get_array_ref_dim_for_loop_dim): Use the arg1
class container carried in ss->info as the seed for a lhs in
class valued transformational intrinsic calls that are not the
rhs of an assignment. Otherwise, the lhs variable expression is
taken from the loop chain. For this latter case, the _vptr and
_len fields are set.
(gfc_trans_create_temp_array): Use either the lhs expression
seeds to build a class variable that will take the returned
descriptor as its _data field. In the case that the arg1 expr.
is used, a class typespec must be built with the correct rank
and the _vptr and _len fields set. The element size is provided
for the temporary allocation and to set the descriptor span.
(gfc_array_init_size): When an intrinsic type scalar expr3 is
used in allocation of a class array, use its element size in
the descriptor dtype.
* trans-expr.cc (gfc_conv_class_to_class): Class valued
transformational intrinsics return the pointer to the array
descriptor as the _data field of a class temporary. Extract
directly and return the address of the class temporary.
(gfc_conv_procedure_call): store the expression for the first
argument of a class valued transformational intrinsic function
in the ss info class_container field. Later, use its type as
the element type in the call to gfc_trans_create_temp_array.
(fcncall_realloc_result): Add a dtype argument and use it in
the descriptor, when available.
(gfc_trans_arrayfunc_assign): For class lhs, build a dtype with
the lhs rank and the rhs element size and use it in the call to
fcncall_realloc_result.
gcc/testsuite/
PR fortran/102689
* gfortran.dg/class_transformational_1.f90: New test for class-
valued reshape.
* gfortran.dg/class_transformational_2.f90: New test for other
class_valued transformational intrinsics.
Feng Xue [Mon, 5 Aug 2024 07:53:19 +0000 (15:53 +0800)]
vect: Add missed opcodes in vect_get_smallest_scalar_type [PR115228]
Some opcodes are missed when determining the smallest scalar type for a
vectorizable statement. Currently, this bug does not cause any problem,
because vect_get_smallest_scalar_type is only used to compute the max nunits
vectype, and even if a statement with a missed opcode is incorrectly bypassed,
the max nunits vectype can still be correctly deduced from the def statements
of the statement's operands.
In the future, if this function is called for other purposes, we may
get something wrong. So fix it in this patch.
Feng Xue [Mon, 5 Aug 2024 07:23:56 +0000 (15:23 +0800)]
vect: Allow unsigned-to-signed promotion in vect_look_through_possible_promotion [PR115707]
The function fails to figure out the root definition if the casts involve more
than two promotions with a sign change, as in:
long a = (long)b; // promotion cast
-> int b = (int)c; // promotion cast, sign change
-> unsigned short c = ...;
For this case, the function thinks the 2nd cast has a different sign from the
1st, so it stops looking through, while "unsigned short -> integer" is a natural
sign extension.
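A minimal sketch of the cast chain from the message, assuming a target where int is wider than unsigned short. The helper names widen_twice and widen_once are hypothetical; the point is only that going through the signed intermediate type yields the same value as widening the unsigned short directly.

```cpp
#include <cassert>

static_assert(sizeof(int) > sizeof(unsigned short),
              "sketch assumes int is wider than unsigned short");

// Hypothetical helper mirroring the cast chain in the commit message.
long widen_twice(unsigned short c)
{
  int b = (int) c;    // promotion cast, sign change: the value is preserved
  long a = (long) b;  // promotion cast: b is non-negative, so a == c
  assert(a >= 0);     // an unsigned short can never widen to a negative value
  return a;
}

// Direct widening, for comparison.
long widen_once(unsigned short c)
{
  return (long) c;
}
```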
Andrew Pinski [Sat, 3 Aug 2024 16:30:57 +0000 (09:30 -0700)]
sh: Don't call make_insn_raw in sh_recog_treg_set_expr [PR116189]
This was an interesting compare-debug failure to debug. The first symptom
was in gcse, which would create pseudo-registers in a different order. This
was caused by a different order of a hashtable walk, due to the hash table having a different
number of entries, which in turn was due to the max insn uid being different between
the 2 runs. The max insn uid difference came from sh_recog_treg_set_expr, which is called
via rtx_costs, and fwprop would call rtx_costs in some cases for debug-insn-related stuff.
Built and tested for sh4-linux-gnu.
PR target/116189
gcc/ChangeLog:
* config/sh/sh.cc (sh_recog_treg_set_expr): Don't call make_insn_raw,
make the insn with a fake uid.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/pr116189-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Marek Polacek [Thu, 1 Aug 2024 19:39:10 +0000 (15:39 -0400)]
c++: remove function/var concepts code
This patch removes vestigial Concepts TS code as discussed in
<https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657937.html>.
In particular, it removes code related to function/variable concepts.
That includes variable_concept_p and function_concept_p, which then
cascades into removing DECL_DECLARED_CONCEPT_P etc. So I think we
no longer need to say "standard concept" since there are no non-standard
ones anymore.
I've added two new errors saying that "variable/function concepts are
no longer supported".
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_constant_expression): Don't call
unpack_concept_check. Add a concept_check_p assert. Remove
function_concept_p code.
* constraint.cc (check_constraint_atom): Remove function concepts code.
(unpack_concept_check): Remove.
(get_concept_check_template): Remove Concepts TS code.
(resolve_function_concept_overload): Remove.
(resolve_function_concept_check): Remove.
(resolve_concept_check): Remove Concepts TS code.
(get_returned_expression): Remove.
(get_variable_initializer): Remove.
(get_concept_definition): Remove Concepts TS code.
(normalize_concept_check): Likewise.
(build_function_check): Remove.
(build_variable_check): Remove.
(build_standard_check): Use concept_definition_p instead of
standard_concept_p.
(build_concept_check): Remove variable_concept_p/function_concept_p
code.
(build_concept_id): Simplify.
(build_type_constraint): Likewise.
(placeholder_extract_concept_and_args): Likewise.
(satisfy_nondeclaration_constraints): Likewise.
(check_function_concept): Remove.
(get_constraint_error_location): Remove Concepts TS code.
* cp-tree.h (DECL_DECLARED_CONCEPT_P): Remove.
(check_function_concept): Remove.
(unpack_concept_check): Remove.
(standard_concept_p): Remove.
(variable_concept_p): Remove.
(function_concept_p): Remove.
(concept_definition_p): Simplify.
(concept_check_p): Don't check for CALL_EXPR.
* decl.cc (check_concept_refinement): Remove.
(duplicate_decls): Remove check_concept_refinement code.
(is_concept_var): Remove.
(cp_finish_decl): Remove is_concept_var.
(check_concept_fn): Remove.
(grokfndecl): Give an error about function concepts not being supported
anymore. Remove unused code.
(grokvardecl): Give an error about variable concepts not being
supported anymore.
(finish_function): Remove DECL_DECLARED_CONCEPT_P code.
* decl2.cc (min_vis_expr_r): Use concept_definition_p instead of
standard_concept_p.
(maybe_instantiate_decl): Remove DECL_DECLARED_CONCEPT_P check.
(mark_used): Likewise.
* error.cc (dump_simple_decl): Use concept_definition_p instead of
standard_concept_p.
(dump_function_decl): Remove DECL_DECLARED_CONCEPT_P code.
(print_concept_check_info): Don't call unpack_concept_check. Simplify.
* mangle.cc (write_type_constraint): Likewise.
* parser.cc (cp_parser_nested_name_specifier_opt): Remove
function_concept_p code. Only check concept_definition_p, not
variable_concept_p/standard_concept_p.
(add_debug_begin_stmt): Remove DECL_DECLARED_CONCEPT_P code.
(cp_parser_template_declaration_after_parameters): Remove a stale
comment.
* pt.cc (check_explicit_specialization): Remove
DECL_DECLARED_CONCEPT_P code.
(process_partial_specialization): Remove variable_concept_p code.
(lookup_template_variable): Likewise.
(tsubst_expr) <case CALL_EXPR>: Remove Concepts TS code and simplify.
(do_decl_instantiation): Remove DECL_DECLARED_CONCEPT_P code.
(instantiate_decl): Likewise.
(placeholder_type_constraint_dependent_p): Don't call
unpack_concept_check. Add a concept_check_p assert.
(convert_generic_types_to_packs): Likewise.
* semantics.cc (finish_call_expr): Remove Concepts TS code and simplify.
compiler: panic arguments are empty interface type
After CL 536643 passing NULL as the expected type permitted an untyped
constant expression to remain untyped. Change to passing the empty
interface type.
The panic and print/println functions are the only builtin functions
that turn an untyped constant expression into a regular function call,
and we already handled print/println specially.
c++, coroutines: Simplify separation of the user function body and ramp.
We need to separate the original user-authored function body from the
definition of the ramp function (which is what is called instead).
The function body tree is either in DECL_SAVED_TREE or the first operand
of current_eh_spec_block (for functions with an EH spec).
This version simplifies the process by extracting the second case directly
instead of inspecting the DECL_SAVED_TREE trees to discover it.
gcc/cp/ChangeLog:
* coroutines.cc (split_coroutine_body_from_ramp): New.
(morph_fn_to_coro): Use split_coroutine_body_from_ramp().
* cp-tree.h (use_eh_spec_block): New.
* decl.cc (use_eh_spec_block): Make non-static.
Mark Harmstone [Sun, 4 Aug 2024 22:26:53 +0000 (23:26 +0100)]
Fix handling of const or volatile void pointers in CodeView
DWARF represents voids in DW_TAG_const_type and DW_TAG_volatile_type
DIEs by the absence of a DW_AT_type attribute, which we weren't handling
correctly.
This fixes another false positive. When a function is taking a
temporary of scalar type that couldn't be bound to the return type
of the function, don't warn, such a program would be ill-formed.
Thanks to Jonathan for reporting the problem.
PR c++/115987
gcc/cp/ChangeLog:
* call.cc (do_warn_dangling_reference): Don't consider a
temporary with a scalar type that cannot bind to the return type.
gcc/testsuite/ChangeLog:
* g++.dg/ext/attr-no-dangling6.C: Adjust.
* g++.dg/ext/attr-no-dangling7.C: Likewise.
* g++.dg/warn/Wdangling-reference22.C: New test.
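A hedged illustration of the kind of call the -Wdangling-reference fix above concerns. The type Widget and the functions lookup and use_lookup are hypothetical and are not taken from the new test; whether GCC warned on exactly this code before the change is an assumption. The int temporary passed to lookup cannot bind to the const Widget& return type, so the returned reference cannot refer to it.

```cpp
#include <cassert>

struct Widget { int id; };

static Widget global_widget{7};

// Takes a scalar by const reference but returns a reference to a Widget.
// The parameter's referent is an int, which cannot bind to `const Widget&`.
const Widget& lookup(const int& key)
{
  (void) key;
  return global_widget;
}

int use_lookup()
{
  const Widget& w = lookup(42);   // 42 materialises a temporary int
  assert(&w == &global_widget);   // w refers to the global, not the temporary
  return w.id;
}
```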
bpf: do not emit BPF non-fetching atomic instructions
When GCC finds a call to one of the __atomic_OP_fetch built-ins in
which the return value is not used it optimizes it into the
corresponding non-fetching atomic operation. Up to now we had
definitions in gcc/config/bpf/atomic.md to implement both atomic_OP
and atomic_fetch_OP sets of insns:
This was not correct, because as it happens the non-fetching BPF
atomic instructions imply different memory ordering semantics than the
fetching BPF atomic instructions, so they cannot be used
interchangeably, as might otherwise be expected.
This patch modifies config/bpf/atomic.md in order to not define the
atomic_{add,and,or,xor} insns. This makes GCC implement them in
terms of the corresponding fetching operations; this is less
efficient, but correct. It also updates the expected results in the
corresponding tests, which are also updated to cover cases where the
value resulting from the __atomic_fetch_* operations is actually used.
Tested in bpf-unknown-none target in x86_64-linux-gnu host.
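The two situations the message distinguishes look roughly like this at the source level. This is a sketch using the generic __atomic built-ins; the function names bump and bump_and_read_old are hypothetical, and which instructions are actually emitted depends on the target and compiler version.

```cpp
#include <cassert>

// Result unused: on targets that define non-fetching atomic insns, GCC may
// select one of those here. With this patch, BPF no longer offers that choice.
void bump(long *p)
{
  assert(p != nullptr);
  __atomic_add_fetch(p, 1, __ATOMIC_SEQ_CST);
}

// Result used: a fetching atomic operation is needed in any case.
long bump_and_read_old(long *p)
{
  assert(p != nullptr);
  return __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);
}
```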
Jennifer Schmitz [Mon, 29 Jul 2024 14:59:33 +0000 (07:59 -0700)]
AArch64: Set instruction attribute of TST to logics_imm
As suggested in
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658249.html,
this patch changes the instruction attribute of "*and<mode>_compare0" (TST) from
alus_imm to logics_imm.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
Filip Kastl [Mon, 5 Aug 2024 12:39:06 +0000 (14:39 +0200)]
gimple ssa: Fix a typo in gimple-ssa-sccopy.cc
Fixes a misplaced comment in gimple-ssa-sccopy.cc. The comment belongs
to a bitmap definition but was instead placed before the beginning of a
namespace block.
Kyrylo Tkachov [Fri, 2 Aug 2024 13:21:16 +0000 (06:21 -0700)]
tree-reassoc.cc: PR tree-optimization/116139 Don't assert when forming fully-pipelined FMAs on wide MULT targets
The code in get_reassociation_width that forms FMAs aggressively when
they are fully pipelined expects the FMUL reassociation width in the
target to be less than for FMAs. This doesn't hold for all target
tunings.
This code shouldn't ICE, just avoid forming these FMAs here.
This patch does that.
Feng Xue [Mon, 5 Aug 2024 10:13:55 +0000 (18:13 +0800)]
vect: Fix dot-product slp testcases [PR116000]
These testcases were added by the patch supporting multiple lane-reducing
operations. For targets that have no dot-product instruction, we should add
a matching condition to skip them.
Thomas Schwinge [Wed, 28 Feb 2024 22:06:25 +0000 (23:06 +0100)]
Inline 'gcc/rust/Make-lang.in:RUST_LIBDEPS'
..., also fixing up an apparently mis-merged
commit 2340894554334a310b891a1d9e9d5e3f502357ac
"gccrs: Add 'gcc/rust/Make-lang.in:LIBFORMAT_PARSER'", which was adding a bogus
second definition of 'RUST_LIBDEPS'.
gcc/rust/
* Make-lang.in (RUST_LIBDEPS): Inline into all users.
Thomas Schwinge [Mon, 5 Aug 2024 08:06:05 +0000 (10:06 +0200)]
Don't override 'LIBS' if '--enable-languages=rust'; use 'CRAB1_LIBS'
Recent commit 6fef4d6ffcab0fec8518adcb05458cba5dbeac25
"gccrs: libgrust: Add format_parser library", added a general override of
'LIBS += -ldl -lpthread' if '--enable-languages=rust'. This is wrong
conceptually, and will make the build fail on systems not providing such
libraries. Instead, 'CRAB1_LIBS', added a while ago in
commit 75299e4fe50aa8d9b3ff529e48db4ed246083e64
"rust: Do not link with libdl and libpthread unconditionally", should be used,
and not generally, but for 'crab1' only.
gcc/rust/
* Make-lang.in (LIBS): Don't override.
(crab1$(exeext):): Use 'CRAB1_LIBS'.
Alex Coplan [Mon, 5 Aug 2024 07:45:58 +0000 (08:45 +0100)]
gdbhooks: Add attempt to invoke on-gcc-hooks-load
This extends GCC's GDB hooks to attempt invoking the user-defined
command "on-gcc-hooks-load". The idea is that users can define the
command in their .gdbinit to override the default values of parameters
defined by GCC's GDB extensions.
For example, together with the previous patch, I can add the following
fragment to my .gdbinit:
define on-gcc-hooks-load
set gcc-dot-cmd xdot
end
which means, once the GCC extensions get loaded, whenever I invoke
dot-fn then the graph will be rendered using xdot.
The try/except should make this patch a no-op for users that don't
currently define this command. I looked for a way to test explicitly
for whether a GDB command exists but didn't find one.
This is needed because the user's .gdbinit is sourced before GCC's GDB
extensions are loaded, and GCC-specific parameters can't be configured
before they are defined.
gcc/ChangeLog:
* gdbhooks.py: Add attempted call to "on-gcc-hooks-load" once
we've finished loading the hooks.
Alex Coplan [Mon, 5 Aug 2024 07:45:29 +0000 (08:45 +0100)]
gdbhooks: Make dot viewer configurable
This adds a new GDB parameter 'gcc-dot-cmd' which allows the user to
configure the command used to render the CFG within dot-fn.
E.g. with this patch the user can change their dot viewer like so:
(gdb) show gcc-dot-cmd
The current value of 'gcc-dot-cmd' is "dot -Tx11".
(gdb) set gcc-dot-cmd xdot
(gdb) dot-fn # opens in xdot
The second patch in this series adds a hook which users can define in
their .gdbinit in order to be called when the GCC extensions have
finished loading, thus allowing users to automatically configure
gcc-dot-cmd as desired in their .gdbinit.
gcc/ChangeLog:
* gdbhooks.py (GCCDotCmd): New.
(gcc_dot_cmd): New. Use it ...
(DotFn.invoke): ... here.
Tobias Burnus [Mon, 5 Aug 2024 07:18:29 +0000 (09:18 +0200)]
libgomp.texi: Add OpenMP TR13 routines to @menu (commented out)
To keep track of missing routine documentation (both implemented and not),
the libgomp.texi file contains all non-OMPT routines as commented items
in @menu. This commit adds the routines added in TR13 as commented fixme
items.
Andrew Pinski [Fri, 2 Aug 2024 17:04:40 +0000 (10:04 -0700)]
IRA: Ignore debug insns for uses in split_live_ranges_for_shrink_wrap. [PR116179]
Late_combine exposed this latent bug in split_live_ranges_for_shrink_wrap.
What it did was copy-prop regno 151 from regno 119 from:
```
(insn 2 264 3 2 (set (reg/f:DI 119 [ thisD.3697 ])
(reg:DI 151)) "/app/example.cpp":19:13 70 {*movdi_aarch64}
(expr_list:REG_DEAD (reg:DI 151)
(nil)))
```
Both are valid things to do. The problem is that split_live_ranges_for_shrink_wrap looks at the
uses of reg 151, and with and without debugging reg 151 has different uses in different BBs.
The function is trying to find a splitting point for reg 151 and the results differ. In the end
this causes a register allocation difference.
The fix is for split_live_ranges_for_shrink_wrap to ignore uses that were in debug insns.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR rtl-optimization/116179
gcc/ChangeLog:
* ira.cc (split_live_ranges_for_shrink_wrap): For the uses loop,
only look at non-debug insns.
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr116179-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Again, sensible. The pattern requires op0 and op1 to match, so we try to
figure out if d0 & a0 are the same underlying register. So we get into this
code in operands_match_p:
> if (code == SUBREG)
> {
> i = REGNO (SUBREG_REG (x));
> if (i >= FIRST_PSEUDO_REGISTER)
> goto slow;
> i += subreg_regno_offset (REGNO (SUBREG_REG (x)),
> GET_MODE (SUBREG_REG (x)),
> SUBREG_BYTE (x),
> GET_MODE (x));
> }
> else
> i = REGNO (x);
There's a similar fragment for the other operand. The key is that
subreg_regno_offset call. That call assumes the subreg is representable. But
in the case of (subreg:DI (reg:SI d0)) we're going to get -1 (remember, m68k is
a big endian target). That -1 gets passed to hard_regno_nregs via this code
(again, just showing one of the two copies of this fragment):
> if (REG_WORDS_BIG_ENDIAN
> && is_a <scalar_int_mode> (GET_MODE (x), &xmode)
> && GET_MODE_SIZE (xmode) > UNITS_PER_WORD
> && i < FIRST_PSEUDO_REGISTER)
> i += hard_regno_nregs (i, xmode) - 1;
That triggers the reported ICE. It appears this has been broken since the
conversion to SUBREG_BYTE way back in 2001, though possibly some minor changes
around this code circa 2005 contributed as well; it didn't seem worth
putting under the debugger to be sure. Certainly the code from 2001 looks
suspicious to me.
Anyway, the fix here is pretty simple. The routine "simplify_subreg_regno" is
meant to be used to determine if we can simplify the subreg expression and will
explicitly return -1 if it can't be represented for one reason or another. It
checks a variety of conditions that aren't worth listing here.
Bootstrapped and regression tested on x86 (after reverting an unrelated patch
from Richard S that's causing multiple unrelated failures), which of course
doesn't really test the code as x86 is an LRA target. Also built & tested the
crosses, none of which show issues (and some of which are reload targets).
m68k will bootstrap & regression test tomorrow, but I don't think there's any
point in waiting for that.
Pushing to the trunk.
PR rtl-optimization/116199
gcc/
* reload.cc (operands_match_p): Verify subreg is expressible before
trying to simplify and match it to another operand.
gcc/testsuite/
* gcc.dg/torture/pr116199.c: New test.
Jakub Jelinek [Sat, 3 Aug 2024 18:37:54 +0000 (20:37 +0200)]
libquadmath: Fix up libquadmath/math/sqrtq.c compilation in some powerpc* configurations [PR116007]
My PR114623 change started using soft-fp.h and quad.h for the sqrtq implementation.
Unfortunately, that seems to fail building in some powerpc* configurations, where
TFmode isn't available.
quad.h has:
#ifndef TFtype
typedef float TFtype __attribute__ ((mode (TF)));
#endif
and uses TFtype. quadmath.h has:
/* Define the complex type corresponding to __float128
("_Complex __float128" is not allowed) */
#if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
typedef _Complex float __attribute__((mode(TC))) __complex128;
#else
typedef _Complex float __attribute__((mode(KC))) __complex128;
#endif
with the conditional and KCmode use added during porting of libquadmath
to powerpc*, so I've just defined TFtype for powerpc when __LONG_DOUBLE_IEEE128__
isn't defined; I could define it to float __attribute__ ((mode (KF))) but it
seemed easier to just define it to __float128 which should do the same thing.
2024-08-03 Jakub Jelinek <jakub@redhat.com>
PR target/116007
* math/sqrtq.c (TFtype): For PowerPC without __LONG_DOUBLE_IEEE128__
define to __float128 before including soft-fp.h and quad.h.
Patrick Palka [Sat, 3 Aug 2024 13:05:05 +0000 (09:05 -0400)]
libstdc++: use concrete return type for std::forward_like
Inspired by https://github.com/llvm/llvm-project/issues/101614 this
inverts the relationship between forward_like and __like_t so that
forward_like is defined in terms of __like_t and with a concrete return
type. __like_t in turn is defined via partial specializations that
pattern match on the const- and reference-ness of T.
This turns out to be more SFINAE friendly and significantly cheaper
to compile than the previous implementation.
libstdc++-v3/ChangeLog:
* include/bits/move.h (__like_impl): New metafunction.
(__like_t): Redefine in terms of __like_impl.
(forward_like): Redefine in terms of __like_t.
* testsuite/20_util/forward_like/2_neg.cc: Don't expect
error outside the immediate context anymore.
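The partial-specialization approach described above can be sketched as follows. The names like_impl, like_t and forward_like_sketch are stand-ins chosen for illustration (the ChangeLog names the real ones as __like_impl, __like_t and forward_like), and the details of the libstdc++ implementation may differ.

```cpp
#include <cassert>
#include <type_traits>

// Primary template left undefined: T must be a reference and U an lvalue
// reference, which the alias below guarantees.
template<typename T, typename U> struct like_impl;

// Pattern match on the const- and reference-ness of T.
template<typename T, typename U>
struct like_impl<T&, U&> { using type = U&; };
template<typename T, typename U>
struct like_impl<const T&, U&> { using type = const U&; };
template<typename T, typename U>
struct like_impl<T&&, U&> { using type = U&&; };
template<typename T, typename U>
struct like_impl<const T&&, U&> { using type = const U&&; };

template<typename T, typename U>
using like_t = typename like_impl<T&&, U&>::type;

// Concrete return type instead of a deduced one.
template<typename T, typename U>
constexpr like_t<T, U> forward_like_sketch(U&& x) noexcept
{
  return static_cast<like_t<T, U>>(x);
}

static_assert(std::is_same<like_t<int&, long>, long&>::value, "");
static_assert(std::is_same<like_t<const int&, long>, const long&>::value, "");
static_assert(std::is_same<like_t<int, long>, long&&>::value, "");
static_assert(std::is_same<like_t<const int, long&>, const long&&>::value, "");

// Forwarding "like" a const lvalue yields a const lvalue to the same object.
inline void check_forward_like_sketch()
{
  long v = 3;
  const long& r = forward_like_sketch<const int&>(v);
  assert(&r == &v);
}
```

Because the return type is spelled out through like_t, a substitution failure happens in the declaration itself, which is consistent with the message's remark about SFINAE friendliness.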
Marek Polacek [Thu, 1 Aug 2024 14:35:38 +0000 (10:35 -0400)]
c++: Move -Wdangling-reference to -Wextra
Despite a number of mitigations (don't warn for std::span-like classes,
lambdas, adding [[gnu::no_dangling]], etc.), the warning still seems to
cause some grief. Let's move the warning to -Wextra, then.
gcc/c-family/ChangeLog:
* c.opt (Wdangling-reference): Move from -Wall to -Wextra.
gcc/ChangeLog:
* doc/invoke.texi: Document that -Wdangling-reference is
enabled by -Wextra.
c++/coroutines: check for members we use in handle_types [PR105475]
Currently, it is possible to ICE GCC by giving it sufficiently broken
code, where sufficiently broken means a std::coroutine_handle missing a
default on the promise_type template argument, and missing members.
As the code generator relies on lookups in the coroutine_handle never
failing (and has no way to signal that error), let's do it ahead of time,
save the result, and use that. This saves us some lookups and allows us
to propagate an error.
PR c++/105475 - coroutines: ICE in coerce_template_parms, at cp/pt.cc:9183
gcc/cp/ChangeLog:
PR c++/105475
* coroutines.cc (struct coroutine_info): Add from_address.
Carries the from_address member we looked up earlier.
(coro_resume_identifier): Remove. Unused.
(coro_init_identifiers): Do not initialize the above.
(void_coro_handle_address): New variable. Contains the baselink
for the std::coroutine_handle<void>::address() instance method.
(get_handle_type_address): New function. Looks up and validates
handle_type::address in a given handle_type.
(get_handle_type_from_address): New function. Looks up and
validates handle_type::from_address in a given handle_type.
(coro_promise_type_found_p): Remove reliance on
coroutine_handle<> defaulting the promise type to void. Store
get_handle_type_* results where appropriate.
(get_coroutine_from_address): New helper. Gets the
handle_type::from_address BASELINK from a coroutine_info.
(build_actor_fn): Use the get_coroutine_from_address helper and
void_coro_handle_address.
gcc/testsuite/ChangeLog:
PR c++/105475
* g++.dg/coroutines/pr103868.C: Add std::coroutine_handle
members we check for now.
* g++.dg/coroutines/pr105287.C: Ditto.
* g++.dg/coroutines/pr105301.C: Ditto.
* g++.dg/coroutines/pr94528.C: Ditto.
* g++.dg/coroutines/pr94879-folly-1.C: Ditto.
* g++.dg/coroutines/pr94883-folly-2.C: Ditto.
* g++.dg/coroutines/pr98118.C: Ditto.
* g++.dg/coroutines/pr105475.C: New test.
* g++.dg/coroutines/pr105475-1.C: New test.
* g++.dg/coroutines/pr105475-2.C: New test.
* g++.dg/coroutines/pr105475-3.C: New test.
* g++.dg/coroutines/pr105475-4.C: New test.
* g++.dg/coroutines/pr105475-5.C: New test.
* g++.dg/coroutines/pr105475-6.C: New test.
* g++.dg/coroutines/pr105475-broken-spec.C: New test.
* g++.dg/coroutines/pr105475-broken-spec-2.C: New test.
Mikael Morin [Fri, 2 Aug 2024 12:24:34 +0000 (14:24 +0200)]
fortran: Support optional dummy as BACK argument of MINLOC/MAXLOC.
Protect the evaluation of BACK with a check that the reference is non-null
in case the expression is an optional dummy, in the inline code generated
for MINLOC and MAXLOC.
This change contains a revert of the non-testsuite part of commit r15-1994-ga55d24b3cf7f4d07492bb8e6fcee557175b47ea3, which factored the
evaluation of BACK out of the loop using the scalarizer. It was a bad idea,
because delegating the argument evaluation to the scalarizer makes it
cumbersome to add a null pointer check next to the evaluation.
Instead, evaluate BACK at the beginning, before scalarization, add a check
that the argument is present if necessary, and evaluate the resulting
expression to a variable, before using the variable in the inline code.
gcc/fortran/ChangeLog:
* trans-intrinsic.cc (maybe_absent_optional_variable): New function.
(gfc_conv_intrinsic_minmaxloc): Remove BACK from scalarization and
evaluate it before. Add a check that BACK is not null if the
expression is an optional dummy. Save the resulting expression to a
variable. Use the variable in the generated inline code.
gcc/testsuite/ChangeLog:
* gfortran.dg/maxloc_6.f90: New test.
* gfortran.dg/minloc_7.f90: New test.
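The evaluation order described above (evaluate BACK once before the loop, guarded by a presence check, then use the saved value) can be modelled outside Fortran. This is a C++ analogy, not the code gfortran generates: the function minloc_sketch is hypothetical and a null pointer stands in for an absent optional dummy, which Fortran treats as BACK=.false.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the 1-based position of the minimum, or 0 for an empty array.
// `back` may be null, modelling an absent optional dummy argument.
std::size_t minloc_sketch(const std::vector<int>& a, const bool* back)
{
  // Evaluate BACK once, up front, and only dereference it when present.
  const bool back_val = back != nullptr && *back;

  std::size_t pos = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    {
      // With BACK true, ties move to the later element; otherwise the first
      // minimum is kept.
      if (pos == 0
          || (back_val ? a[i] <= a[pos - 1] : a[i] < a[pos - 1]))
        pos = i + 1;
    }
  assert(pos <= a.size());
  return pos;
}
```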
Andrew Pinski [Thu, 1 Aug 2024 21:22:36 +0000 (14:22 -0700)]
genemit: Fix handling of explicit parallels for clobbers [PR116058]
In a define_insn, you can either use an explicit parallel for
the insns or let genrecog/genemit add one for you.
The problem is that when genemit is processing the pattern for clobbers
(to create the function add_clobbers), genemit hasn't added the implicit
parallel yet, but at the same time it forgot to account for the possibility
of an explicit parallel there.
This means in some cases (like in the sh backend), add_clobbers
and recog had different ideas about whether there were clobbers on the insn.
This fixes the problem by looking through the explicit parallel
for the instruction in genemit.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/116058
gcc/ChangeLog:
* genemit.cc (struct clobber_pat): Change pattern to be rtvec.
Add code field.
(gen_insn): Look through an explicit parallel if there was one.
Update store to new clobber_pat.
(output_add_clobbers): Update call to gen_exp for the changed
clobber_pat.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andre Vieira [Fri, 2 Aug 2024 15:39:34 +0000 (16:39 +0100)]
arm: Fix testism with mve/ivopts-3.c testcase
This patch ensures this testcase is run for armv8.1-m.main+mve as this is
testing that doloops with function calls that aren't intrinsics get rejected
as potential doloop targets during ivopts. For other targets this loop gets
rejected for different reasons.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/ivopts-3.c: Add require target and options.
AArch64: Fuse CMP+CSEL and CMP+CSET for -mcpu=neoverse-v2
According to the Neoverse V2 Software Optimization Guide (section 4.14), the
instruction pairs CMP+CSEL and CMP+CSET can be fused, which had not been
implemented so far. This patch implements and tests the two fusion pairs.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
There was also no non-noise impact on SPEC CPU2017 benchmark.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Implement
fusion logic.
* config/aarch64/aarch64-fusion-pairs.def (cmp+csel): New entry.
(cmp+cset): Likewise.
* config/aarch64/tuning_models/neoversev2.h: Enable logic in
field fusible_ops.
gcc/testsuite/
* gcc.target/aarch64/fuse_cmp_csel.c: New test.
* gcc.target/aarch64/fuse_cmp_cset.c: Likewise.
Make may_trap_p_1 return false for constant pool references [PR116145]
The testcase contains the constant:
arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f));
which was initially hoisted by hand, but which gimple optimisers later
propagated to each use (as expected). The constant was then expanded
as a load-and-duplicate from the constant pool. Normally that load
should then be hoisted back out of the loop, but may_trap_or_fault_p
stopped that from happening in this case.
The code responsible was:
if (/* MEM_NOTRAP_P only relates to the actual position of the memory
reference; moving it out of context such as when moving code
when optimizing, might cause its address to become invalid. */
code_changed
|| !MEM_NOTRAP_P (x))
{
poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
GET_MODE (x), code_changed);
}
where code_changed is true. (Arguably it doesn't need to be true in
this case, if we inserted invariants on the preheader edge, but it
would still need to be true for conditionally executed loads.)
Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1
would recognise that the address refers to the constant pool.
However, the SVE load-and-replicate instructions have a limited
offset range, so it isn't possible for them to have a LO_SUM address.
All we have is a plain pseudo base register.
MEM_READONLY_P is defined as:
/* 1 if RTX is a mem that is statically allocated in read-only memory. */
#define MEM_READONLY_P(RTX) \
(RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging)
and so I think it should be safe to move memory references if both
MEM_READONLY_P and MEM_NOTRAP_P are true.
The testcase isn't a minimal reproducer, but I think it's good
to have a realistic full routine in the testsuite.
gcc/
PR rtl-optimization/116145
* rtlanal.cc (may_trap_p_1): Trust MEM_NOTRAP_P even for code
movement if MEM_READONLY_P is also true.
gcc/testsuite/
PR rtl-optimization/116145
* gcc.target/aarch64/sve/acle/general/pr116145.c: New test.
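The ChangeLog entry for may_trap_p_1 amounts to a small change in a boolean condition. The sketch below models it with plain flags; the struct MemFlags and the function names are hypothetical and this is not the rtlanal.cc code, only the logic implied by "trust MEM_NOTRAP_P even for code movement if MEM_READONLY_P is also true".

```cpp
#include <cassert>

// Hypothetical stand-in for the two MEM flags discussed above.
struct MemFlags
{
  bool notrap;    // models MEM_NOTRAP_P
  bool readonly;  // models MEM_READONLY_P
};

// Old behaviour: any code movement forces the address to be re-analysed.
bool needs_address_check_old(MemFlags m, bool code_changed)
{
  return code_changed || !m.notrap;
}

// New behaviour: code movement only forces the re-analysis when the memory
// is not known to be statically allocated read-only memory.
bool needs_address_check(MemFlags m, bool code_changed)
{
  return (code_changed && !m.readonly) || !m.notrap;
}

// The new predicate never asks for a check that the old one would skip.
inline void check_no_new_checks(MemFlags m, bool code_changed)
{
  assert(!needs_address_check(m, code_changed)
         || needs_address_check_old(m, code_changed));
}
```

The only case whose answer changes is a moved reference with both flags set, which matches the constant-pool load in the testcase.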
c++, coroutines: Remove unused suspend point state [NFC].
We maintain state on the progress of await analysis in an object that
is passed to the various tree walks used. Some of the state had become
stale (i.e. unused members). Remove those and provide a CTOR so that
updates are localised.
Remove the file scope hash_map used to collect the final state for the
actor function and make that part of the suspend point state.
gcc/cp/ChangeLog:
* coroutines.cc (struct susp_frame_data): Remove unused members,
provide a CTOR.
(morph_fn_to_coro): Use susp_frame_data CTOR, and make the suspend
state hash map local to the morph function.
Andrew Pinski [Thu, 1 Aug 2024 17:33:34 +0000 (10:33 -0700)]
forwprop: Don't add uses to dce list if debug statement [PR116156]
The problem here is that when forwprop does a copy prop into a statement,
we mark the uses of that statement as possibly needing to be removed. But if
that statement happens to be a debug statement, the result differs between
compiling with debugging info turned on and off; this is not expected.
So the fix is not to add the old use to the dce list if the statement was a
debug statement.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/116156
gcc/ChangeLog:
* tree-ssa-forwprop.cc (pass_forwprop::execute): Don't add
uses if the statement was a debug statement.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/pr116156-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Martin Uecker [Sun, 23 Jun 2024 21:14:33 +0000 (23:14 +0200)]
c: Add support for byte arrays in C2Y
Correct aliasing behavior requires that structures and unions
that contain a byte array, i.e. an array of non-atomic character
type (N3254), are marked with TYPE_TYPELESS_STORAGE. This change
also affects earlier language modes.
gcc/c/
* c-decl.cc (grokdeclarator, finish_struct): Set and
propagate TYPE_TYPELESS_STORAGE.
gcc/testsuite/
* gcc.dg/c2y-byte-alias-1.c: New test.
* gcc.dg/c2y-byte-alias-2.c: New test.
* gcc.dg/c2y-byte-alias-3.c: New test.
Lingling Kong [Fri, 2 Aug 2024 08:52:33 +0000 (16:52 +0800)]
i386: Fix comment/naming for APX NDD constraints
gcc/ChangeLog:
* config/i386/constraints.md: Fixed the comment/naming for je/jM/jO.
* config/i386/predicates.md (apx_ndd_memory_operand): Renamed and
fixed the comment.
(apx_evex_memory_operand): New name.
(apx_ndd_add_memory_operand): Ditto.
(apx_evex_add_memory_operand): Ditto.
ada: Fix handling of SPARK_Mode on standalone child subprogram
SPARK_Mode aspect was not properly propagated to the body of
a standalone child subprogram from the generated spec for that subprogram,
leading GNATprove to not analyze this body. Now fixed.
gcc/ada/
* aspects.adb (Find_Aspect): Take into account the case of a node
of kind N_Defining_Program_Unit_Name.
* sem_ch10.adb (Analyze_Compilation_Unit): Copy the SPARK aspect
from the spec to the body. Delay semantic analysis after that
point to ensure that SPARK_Mode is properly analyzed.
Piotr Trojanek [Tue, 16 Jul 2024 14:07:59 +0000 (16:07 +0200)]
ada: Fix handling of iterated component associations with sub-aggregates
Fix a number of problems in handling of actions generated for a
2-dimensional array aggregate where the outer aggregate has iterated
component association and the inner aggregate involves run-time checks.
gcc/ada/
* exp_aggr.adb (Add_Loop_Actions): Actions are now attached to
iterated component association just like they are attached to
ordinary component association.
(Build_Array_Aggr_Code): If resolution of the array aggregate
generated some actions, e.g. for run-time checks, then we must
keep them; same for the Other_Clause.
* sem_aggr.adb (Resolve_Iterated_Component_Association): Unset
references to iterator variable in loop actions (which might come
from run-time check), just as these references are unset in the
expression itself.
Gary Dismukes [Mon, 15 Jul 2024 23:57:43 +0000 (23:57 +0000)]
ada: Errors on legal container aggregates with iterated_element_associations
The compiler rejects various cases of container aggregates with
iterated_element_associations that include a loop_parameter_subtype_indication
or that include the "reverse" keyword. The fixes are in the parser, for
accepting the syntax for these cases, as well as for properly accounting
for reverse iterators in the analyzer and expander.
gcc/ada/
* exp_aggr.adb
(Expand_Container_Aggregate.Expand_Iterated_Component): Set the
Reverse_Present flag when creating the loop's iteration_scheme.
* gen_il-gen-gen_nodes.adb: Add flag Reverse_Present to
N_Iterated_Component_Association nodes.
* par-ch3.adb (P_Constraint_Op): Remove testing for and ignoring
of Tok_In following a constraint. It's allowed for "in" to follow
a constraint of loop_parameter_subtype_indication of an
iterator_specification, so it shouldn't be ignored.
* par-ch4.adb (P_Iterated_Component_Association): Account for
"reverse" following the "in" in an iterated_component_association,
and set the Reverse_Present flag on the
N_Iterated_Component_Association node. Add handling for a ":"
following the identifier in an iterator_specification of an
iterated_element_association, sharing the code with the "of" case
(which backs up to the identifier at the beginning of the
iterator_specification). Fix incorrect trailing comment following
the call to Scan.
(Build_Iterated_Element_Association): Set the Reverse_Present flag
on an N_Loop_Parameter_Specification node of an
N_Iterated_Element_Association.
* par-ch5.adb (P_Iterator_Specification): Remove error-recovery
and error code that reports "subtype indication is only legal on
an element iterator", as that error can no longer be emitted (and
was formerly only reported on one fixedbugs test).
* sem_aggr.adb
(Resolve_Container_Aggregate.Resolve_Iterated_Association): When
creating an N_Iterator_Specification for an
N_Iterated_Component_Association, set the Reverse_Present flag of
the N_Iterator_Specification from the flag on the latter.
* sinfo.ads: Add comments for the Reverse_Present flag, which is
now allowed on nodes of kind N_Iterated_Component_Association.
ada: Add contracts to Ada.Strings.Unbounded and adapt implementation
Add complete functional contracts to all subprograms in
Ada.Strings.Unbounded, except Count, following the specification from
Ada RM A.4.5. These contracts are similar to the contracts found in
Ada.Strings.Fixed and Ada.Strings.Bounded.
A difference is that type Unbounded_String is controlled, thus we avoid
performing copies of a parameter Source with Source'Old, and instead
apply 'Old attribute on the enclosing call, such as Length(Source)'Old.
As Unbounded_String is controlled, the implementation is not in SPARK.
Instead, we have separately proved a slightly different implementation
for which Unbounded_String is not controlled, against the same
specification. This ensures that the specification is consistent.
To minimize differences between this test from the SPARK testsuite and
the actual implementation (the one in a-strunb.adb), and to avoid
overflows in the actual implementation, some code is slightly rewritten.
Delete and Insert are modified to return the correct result in all
cases allowed by the standard.
The same contracts are added to the version in a-strunb__shared.ads and
similar implementation patches are applied to the body
a-strunb__shared.adb. In particular, tests are added to avoid overflows
on strings for which the last index is Natural'Last, and the computations
that involve Sum to guarantee that an exception is raised in case of
overflow are rewritten to guarantee correct detection and no intermediate
overflows (and such tests are applied consistently between the procedure
and the function when both exist).
gcc/ada/
* libgnat/a-strunb.adb (Sum, Saturated_Sum, Saturated_Mul): Adapt
function signatures to more precise types that allow proof.
(function "&"): Conditionally assign a slice to avoid possible
overflow which only occurs when the assignment is a noop (because
the slice is empty in that case).
(Append): Same.
(function "*"): Retype K to avoid a possible overflow. Add early
return on null length for proof.
(Delete): Fix implementation to return the correct result in all
cases allowed by the Ada standard.
(Insert): Same. Also avoid possible overflows.
(Length): Rewrite as expression function for proof.
(Overwrite): Avoid possible overflows.
(Slice): Same.
(To_String): Rewrite as expression function for proof.
* libgnat/a-strunb.ads: Extend Assertion_Policy to new contracts
used. Add complete functional contracts to all subprograms of the
public API except Count.
* libgnat/a-strunb__shared.adb (Sum): Adapt function signature to
more precise types that allow proof.
(function "&"): Conditionally assign a slice to avoid possible
overflow.
(function "*"): Retype K to avoid a possible overflow.
(Delete): Fix implementation to return the correct result in all
cases allowed by the Ada standard.
(Insert): Avoid possible overflows.
(Overwrite): Avoid possible overflows.
(Replace_Slice): Same.
(Slice): Same.
(To_String): Rewrite as expression function for proof.
* libgnat/a-strunb__shared.ads: Extend Assertion_Policy to new
contracts used. Add complete functional contracts to all
subprograms of the public API except Count. Mark public part of
spec as in SPARK.
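The overflow-avoidance idea mentioned above (detect the overflow before
computing the sum, so that no intermediate value overflows) can be
sketched in C. This is an illustration only, not the Ada code in
a-strunb.adb; checked_sum and saturated_sum are hypothetical names, and
INT_MAX plays the role of Natural'Last.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store LEFT + RIGHT in *RESULT and return true,
   or return false if an operand is negative or the sum would exceed
   INT_MAX.  */
bool
checked_sum (int left, int right, int *result)
{
  /* Compare against the remaining headroom instead of computing
     LEFT + RIGHT first, so no intermediate value can overflow.  */
  if (left < 0 || right < 0 || left > INT_MAX - right)
    return false;

  *result = left + right;
  return true;
}

/* Hypothetical helper for non-negative operands: like checked_sum,
   but clamp the result to INT_MAX instead of failing.  */
int
saturated_sum (int left, int right)
{
  int result;

  return checked_sum (left, right, &result) ? result : INT_MAX;
}
```

A caller that must raise an error on overflow would test the return
value of checked_sum; a caller that only needs an upper bound can use
saturated_sum.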
ada: Add leap second support to conversion of Unix_Time
Unix timestamp jumps one second back when a leap second
is applied and doesn't count cumulative leap seconds.
This was not taken into account in conversions between
Unix time and Ada time. Now fixed.
gcc/ada/
* libgnat/a-calend.adb: Modify unix time handling.
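To illustrate why Unix time needs a correction, here is a minimal
sketch, assuming a leap second table that holds only the two most
recent insertions (a complete table starts in 1972). It is not the
a-calend.adb implementation; leap_after, leap_seconds_up_to and
elapsed_seconds are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Unix timestamps of 00:00:00 UTC immediately after an inserted leap
   second.  Illustrative excerpt only.  */
static const int64_t leap_after[] = {
  1435708800, /* 2015-07-01 */
  1483228800, /* 2017-01-01 */
};

/* Number of table entries at or before Unix time T.  */
int
leap_seconds_up_to (int64_t t)
{
  int count = 0;

  for (size_t i = 0; i < sizeof leap_after / sizeof leap_after[0]; i++)
    if (leap_after[i] <= t)
      count++;
  return count;
}

/* Seconds that really elapsed between FROM and TO (FROM <= TO),
   adding back the leap seconds that Unix time does not count.  */
int64_t
elapsed_seconds (int64_t from, int64_t to)
{
  return (to - from) + (leap_seconds_up_to (to) - leap_seconds_up_to (from));
}
```

For example, the Unix timestamps for 2016-12-31 23:59:59 and
2017-01-01 00:00:00 differ by one, but two seconds elapsed between
them because 23:59:60 was inserted.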
Javier Miranda [Sat, 6 Jul 2024 19:07:16 +0000 (19:07 +0000)]
ada: Reject ambiguous function calls in interpolated string expressions
This patch enhances support for this language feature by rejecting
more ambiguous function calls. In terms of name resolution, the
analysis of interpolated expressions is now treated as an expression
of any type, as required by the documentation. Additionally, support
for nested interpolated strings has been removed.
gcc/ada/
* gen_il-fields.ads (Is_Interpolated_String_Literal): New field.
* gen_il-gen-gen_nodes.adb (Is_Interpolated_String_Literal): The
new field is a flag handled by the parser (syntax flag).
* par-ch2.adb (P_Interpolated_String_Literal): Decorate the new
flag.
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Improve code
detecting and reporting ambiguous function calls.
* sem_res.adb (Resolve_Interpolated_String_Literal): Restrict
resolution imposed by the context type to string literals that
have the new flag.
* sinfo.ads (Is_Interpolated_String_Literal): New field defined in
string literals. Fix documentation of the syntax rule of
interpolated string literal.
Steve Baird [Tue, 9 Jul 2024 23:54:59 +0000 (16:54 -0700)]
ada: Compiler accepts illegal assignment to reference type target.
An assignment statement whose LHS is of a reference type is never legal. If
no other legality rule is violated, then it is ambiguous. In some cases this
ambiguity was not correctly detected.
gcc/ada/
* sem_ch5.adb (Analyze_Assignment): Delete code that was
incorrectly implementing a preference rule.
Eric Botcazou [Tue, 9 Jul 2024 20:44:40 +0000 (22:44 +0200)]
ada: Finish up support for relaxed finalization
This adds a variant of the System.Finalization_Primitives unit that supports
only controlled types with relaxed finalization, and adds the description of
its implementation to Exp_Ch7.
gcc/ada/
* exp_ch7.adb (Relaxed Finalization): New paragraph in head
comment.
* sem_ch13.adb (Validate_Finalizable_Aspect): Give an error
message if strict finalization is required but not supported by
the runtime.
Steve Baird [Mon, 8 Jul 2024 21:02:15 +0000 (14:02 -0700)]
ada: Reject illegal uses of type/subtype current instance
The current instance of a type or subtype (see RM 8.6) is an object or
value, not a type or subtype. So a name denoting such a current instance is
illegal in any context that requires a name denoting a type or subtype.
In some cases this error was not detected.
gcc/ada/
* sem_ch8.adb (Find_Type): If Is_Current_Instance returns True for
N (and Comes_From_Source (N) is also True) then flag an error.
Call Is_Current_Instance (twice) instead of duplicating (twice)
N_Access_Definition-related code in Is_Current_Instance.
* sem_util.adb (Is_Current_Instance): Implement
access-type-related clauses of the RM 8.6 current instance rule.
For pragmas Predicate and Predicate_Failure, distinguish between
the first and subsequent pragma arguments.
Steve Baird [Mon, 8 Jul 2024 21:45:55 +0000 (14:45 -0700)]
ada: Type conversion in instance incorrectly rejected.
In some cases, a legal type conversion in a generic package is correctly
accepted but the corresponding type conversion in an instance of the generic
is incorrectly rejected.
gcc/ada/
* sem_res.adb (Valid_Conversion): Test In_Instance instead of
In_Instance_Body.
Eric Botcazou [Wed, 3 Jul 2024 16:24:37 +0000 (18:24 +0200)]
ada: Implement No_Raise aspect & pragma on subprograms
The new aspect is automatically set on the Adjust and Finalize primitives of
finalizable types, unless Relaxed_Finalization is explicitly set to False,
but it can also be specified directly on subprograms. It is also available
in earlier versions of the language by means of the associated pragma.
gcc/ada/
* aspects.ads (Aspect_Id): Add Aspect_No_Raise identifier.
(Implementation_Defined_Aspect): Add True for Aspect_No_Raise.
(Is_Representation_Aspect): Add False for Aspect_No_Raise.
(Aspect_Names): Add Name_No_Raise for Aspect_No_Raise.
(Aspect_Delay): Add Always_Delay for Aspect_No_Raise.
* checks.ads (Raise_Checks_Suppressed): New function.
(Apply_Raise_Check): New procedure.
* checks.adb (Apply_Raise_Check): New procedure.
(Raise_Checks_Suppressed): New function.
* doc/gnat_rm/gnat_language_extensions.rst (Generalized
Finalization): Update.
* doc/gnat_rm/implementation_defined_aspects.rst (No_Raise): New.
* doc/gnat_rm/implementation_defined_characteristics.rst (Check
names): Document Raise_Check and alphabetize others.
* doc/gnat_rm/implementation_defined_pragmas.rst (No_Raise): New.
* einfo.ads (No_Raise): New flag defined in subprograms and
generic subprograms.
* exp_ch6.adb (Expand_N_Subprogram_Body): Call Apply_Raise_Check
at the end of the processing.
* exp_ch11.adb (Get_RT_Exception_Name): Add alternative for
PE_Raise_Check_Failed to case statement.
* gen_il-fields.ads (Opt_Field_Enum): Add No_Raise identifier.
* gen_il-gen-gen_entities.adb (Subprogram_Kind): Add No_Raise as
semantical flag.
(Generic_Subprogram_Kind): Likewise.
* par-prag.adb (Prag): Add alternative for Pragma_No_Raise to case
statement.
* sem_ch13.adb (Validate_Finalizable_Aspect): Set No_Raise on the
Adjust and Finalize primitives if Relaxed_Finalization is set.
* sem_prag.adb (Analyze_Pragma): Add alternative for
Pragma_No_Raise to case statement.
(Sig_Flag): Add 0 for Pragma_No_Raise.
* snames.ads-tmpl (Remaining pragma names): Add Name_No_Raise.
(Names of recognized checks): Add Name_Raise_Check.
(Pragma_Id): Add Pragma_No_Raise identifier.
* types.ads (Raise_Check): New named number.
(All_Checks): Adjust.
(RT_Exception_Code): Add PE_Raise_Check_Failed identifier.
(Rkind): Add PE_Reason for PE_Raise_Check_Failed and alphabetize.
* types.h (RT_Exception_Code): Add PE_Raise_Check_Failed as 38.
(LAST_REASON_CODE): Adjust.
* libgnat/a-except.adb (Rcheck_PE_Raise_Check): New procedure with
pragmas Export, No_Return and Machine_Attributes.
(Rmsg_38): New string constant.
* gnat_rm.texi: Regenerate.
The pseudo random number generators used in GNAT are not
suitable for applications that require cryptographic
security. While this was mentioned in some places, others
did not have a corresponding note, leading to these
generators being used in an unsuitable context.
gcc/ada/
* doc/gnat_rm/standard_library_routines.rst: Add note to section
of Ada.Numerics.Discrete_Random and Ada.Numerics.Float_Random.
* doc/gnat_rm/the_gnat_library.rst: Add note to section about
GNAT.Random_Numbers.
* libgnat/a-nudira.ads: Add note about cryptographic properties.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Eric Botcazou [Thu, 4 Jul 2024 21:35:01 +0000 (23:35 +0200)]
ada: Fix crash on expression function returning tagged type in nested package
This happens when the expression is a reference to a formal parameter of
the function, or a conditional expression with such a reference as one of
its dependent expressions, because the RM 6.5(8/5) subclause prescribes a
tag reassignment in this case, which requires freezing the tagged type in
the GNAT freezing model, although the language says there is no freezing.
In other words, it's another occurrence of the discrepancy between this
model tailored to Ada 95 and the freezing rules introduced in Ada 2012,
that is papered over by Should_Freeze_Type and the associated processing.
gcc/ada/
* exp_util.ads (Is_Conversion_Or_Reference_To_Formal): New
function declaration.
* exp_util.adb (Is_Conversion_Or_Reference_To_Formal): New
function body.
* exp_ch6.adb (Expand_Simple_Function_Return): Call the predicate
Is_Conversion_Or_Reference_To_Formal in order to decide whether a
tag check or reassignment is needed.
* freeze.adb (Should_Freeze_Type): Move declaration and body to
the appropriate places. Also return True for tagged results
subject to the expansion done in Expand_Simple_Function_Return
that is guarded by the predicate
Is_Conversion_Or_Reference_To_Formal.
This patch fixes an assertion failure in some cases in the code to
warn about possible misuse of range attributes in loop. The root of
the problem is that this code failed to consider the case where the
outer loop is a while loop.
Lingling Kong [Fri, 2 Aug 2024 02:31:39 +0000 (10:31 +0800)]
i386: Fix memory constraint for APX NF
The je constraint should be used for APX NDD ADD with register source
operand. The jM is for APX NDD patterns with immediate operand.
gcc/ChangeLog:
* config/i386/i386.md (nf_mem_constraint): Fixed the constraint
for the define_subst_attr.
(nf_mem_constraint): Added new define_subst_attr.
(*add<mode>_1<nf_name>): Fixed the constraint.
Pengxuan Zheng [Thu, 1 Aug 2024 00:00:01 +0000 (17:00 -0700)]
aarch64: Improve Advanced SIMD popcount expansion by using SVE [PR113860]
This patch improves the Advanced SIMD popcount expansion by using SVE if
available.
For example, GCC currently generates the following code sequence for V2DI:
cnt v31.16b, v31.16b
uaddlp v31.8h, v31.16b
uaddlp v31.4s, v31.8h
uaddlp v31.2d, v31.4s
However, by using SVE, we can generate the following sequence instead:
ptrue p7.b, all
cnt z31.d, p7/m, z31.d
Similar improvements can be made for V4HI, V8HI, V2SI and V4SI too.
The scalar popcount expansion can also be improved similarly by using SVE and
those changes will be included in a separate patch.
PR target/113860
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (popcount<mode>2): Add TARGET_SVE
support.
* config/aarch64/aarch64-sve.md (@aarch64_pred_<optab><mode>): Use new
iterator SVE_VDQ_I.
* config/aarch64/iterators.md (SVE_VDQ_I): New mode iterator.
(VPRED): Add V8QI, V16QI, V4HI, V8HI and V2SI.
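What the Advanced SIMD sequence above computes can be modelled in
scalar C. This is only a sketch to show why three uaddlp steps are
needed after the byte-wise cnt; the function names are hypothetical and
__builtin_popcount/__builtin_popcountll are GCC builtins.

```c
#include <stdint.h>

/* Reference: population count of each 64-bit lane of a V2DI value.  */
void
popcount_v2di_ref (const uint64_t src[2], uint64_t dst[2])
{
  for (int i = 0; i < 2; i++)
    dst[i] = (uint64_t) __builtin_popcountll (src[i]);
}

/* Scalar model of the cnt + 3 x uaddlp sequence.  */
void
popcount_v2di_model (const uint64_t src[2], uint64_t dst[2])
{
  uint64_t lanes[16];

  /* cnt v.16b: population count of each of the 16 bytes.  */
  for (int i = 0; i < 16; i++)
    {
      unsigned byte = (unsigned) ((src[i / 8] >> (8 * (i % 8))) & 0xff);
      lanes[i] = (uint64_t) __builtin_popcount (byte);
    }

  /* Three pairwise widening adds: 16 -> 8 -> 4 -> 2 lanes.  */
  for (int n = 16; n > 2; n /= 2)
    for (int i = 0; i < n / 2; i++)
      lanes[i] = lanes[2 * i] + lanes[2 * i + 1];

  dst[0] = lanes[0];
  dst[1] = lanes[1];
}
```

The SVE cnt on .d elements produces the per-lane result directly,
which is what popcount_v2di_ref models, so the widening adds are not
needed.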
Kewen Lin [Fri, 2 Aug 2024 00:29:22 +0000 (19:29 -0500)]
testsuite: Adjust fam-in-union-alone-in-struct-2.c to support BE [PR116148]
As Andrew pointed out in PR116148, fam-in-union-alone-in-struct-2.c
was designed for little-endian; the recent commit r15-2403 made it
run on BE as well, which exposed PR116148.
This patch is to adjust the expected data for members in with_fam_2_v
and with_fam_3_v by considering endianness, also update with_fam_3_v.b[1]
from 0x5f6f7f7f to 0x5f6f7f8f to avoid two "7f"s.
PR testsuite/116148
gcc/testsuite/ChangeLog:
* c-c++-common/fam-in-union-alone-in-struct-2.c: Define macros
WITH_FAM_2_V_B[03] and WITH_FAM_3_V_A[07] according to endianness, update the
checking with these macros and initialize with_fam_3_v.b[1] with
0x5f6f7f8f instead of 0x5f6f7f7f.
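The general technique of selecting expected data by endianness can be
sketched as follows. This is not the test's actual code:
FIRST_BYTE_EXPECTED and first_byte are hypothetical names, and the
sketch relies on the __BYTE_ORDER__ macros predefined by GCC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical macro: the first byte in memory of a 32-bit object
   holding 0x5f6f7f8f, which depends on the byte order.  */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define FIRST_BYTE_EXPECTED 0x8f
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define FIRST_BYTE_EXPECTED 0x5f
#else
# error "unsupported byte order"
#endif

/* Return the byte stored at the lowest address of V.  */
unsigned char
first_byte (uint32_t v)
{
  unsigned char b;

  memcpy (&b, &v, 1);
  return b;
}
```

A check written as first_byte (0x5f6f7f8f) == FIRST_BYTE_EXPECTED
passes on both little-endian and big-endian targets, whereas a
hard-coded 0x8f would only pass on little-endian ones.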
Jonathan Wakely [Sat, 23 Mar 2024 11:11:17 +0000 (11:11 +0000)]
libstdc++: Remove unused helper traits
These are not used anywhere, we have more efficient variable templates
for them instead. They're not documented as extensions, and are easy for
users to write if they need them.
Jonathan Wakely [Thu, 7 Dec 2023 12:13:59 +0000 (12:13 +0000)]
libstdc++: Remove unnecessary uses of <stdint.h>
We don't need to include all of <stdint.h> when we only need uintptr_t
from it. By using GCC's internal macro we avoid unnecessarily declaring
everything in <stdint.h>. This helps users to avoid accidentally relying
on those names being declared without explicitly including the header.
libstdc++-v3/ChangeLog:
* include/bits/align.h (align, assume_aligned): Use
__UINTPTR_TYPE__ instead of uintptr_t. Do not include
<stdint.h>.
* include/bits/atomic_base.h (__atomic_ref): Likewise.
* include/bits/atomic_wait.h (__waiter_pool_base::_S_for):
Likewise.
* include/std/atomic: Include <cstdint>.
This is needed to avoid errors outside the immediate context when
evaluating is_default_constructible_v<basic_string<C, T, A>> when A is
not default constructible.
This change is not sufficient to solve the problem because there are a
large number of member functions which have a default argument that
constructs an allocator.
libstdc++-v3/ChangeLog:
PR libstdc++/113841
* include/bits/basic_string.h (basic_string::basic_string()):
Constrain so that it's only present if the allocator is default
constructible.
* include/bits/cow_string.h (basic_string::basic_string()):
Likewise.
* testsuite/21_strings/basic_string/cons/113841.cc: New test.
Jonathan Wakely [Wed, 27 Mar 2024 11:07:17 +0000 (11:07 +0000)]
libstdc++: Remove noexcept from non-const std::basic_string::data() [PR99942]
The C++17 non-const overload of data() allows modifying the string
contents directly, so for the COW string we must do a copy-on-write to
unshare it. That means allocating, which can throw, so it shouldn't be
noexcept.
libstdc++-v3/ChangeLog:
PR libstdc++/99942
* include/bits/cow_string.h (data()): Change to noexcept(false).
Robin Dapp [Wed, 31 Jul 2024 14:54:03 +0000 (16:54 +0200)]
RISC-V: Correct mode_idx attribute for viwalu wx variants [PR116149].
In PR116149 we choose a wrong vector length which causes wrong values in
a reduction. The problem happens in avlprop where we choose the
number of units in the instruction's mode as vector length. For the
non-scalar variants the respective operand has the correct non-widened
mode. For the scalar variants, however, the same operand has a scalar
mode which obviously only has one unit. This makes us choose VL = 1
leaving three elements undisturbed (so potentially -1). Those end up
in the reduction causing the wrong result.
This patch adjusts the mode_idx just for the scalar variants of the
affected instruction patterns.
gcc/ChangeLog:
PR target/116149
* config/riscv/vector.md: Fix mode_idx attribute of scalar
widen add/sub variants.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr116149.c: New test.
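The effect of a too-small vector length on a following reduction can
be shown with a toy model. This is only an illustration under
simplifying assumptions (four elements, stale destination content of
-1); the function names are hypothetical and do not correspond to RVV
intrinsics.

```c
#include <stddef.h>
#include <stdint.h>

#define NUNITS 4 /* illustrative number of elements per vector */

/* Model of a widening add of vector A and scalar X that writes only
   the first VL elements of DEST and leaves the rest undisturbed.  */
void
widen_add_scalar (int32_t dest[NUNITS], const int16_t a[NUNITS],
                  int16_t x, size_t vl)
{
  for (size_t i = 0; i < vl && i < NUNITS; i++)
    dest[i] = (int32_t) a[i] + x;
}

/* Sum reduction over all NUNITS elements.  */
int32_t
reduce_sum (const int32_t v[NUNITS])
{
  int32_t sum = 0;

  for (size_t i = 0; i < NUNITS; i++)
    sum += v[i];
  return sum;
}

/* Add with the given VL, then reduce.  DEST starts as all -1 to model
   stale register content.  */
int32_t
add_then_reduce (const int16_t a[NUNITS], int16_t x, size_t vl)
{
  int32_t dest[NUNITS] = { -1, -1, -1, -1 };

  widen_add_scalar (dest, a, x, vl);
  return reduce_sum (dest);
}
```

With a = {1, 2, 3, 4} and x = 10, a vector length of 4 gives 50, while
a vector length of 1 gives 11 - 3 = 8 because three stale elements
enter the reduction.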
Jakub Jelinek [Thu, 1 Aug 2024 18:23:15 +0000 (20:23 +0200)]
fortran: Fix up pasto in gfc_get_array_descr_info
A static analyzer found a pasto in gfc_get_array_descr_info.
The code does
t = base_decl;
if (!integer_zerop (dtype_off))
t = fold_build_pointer_plus (t, dtype_off);
dtype = TYPE_MAIN_VARIANT (get_dtype_type_node ());
field = gfc_advance_chain (TYPE_FIELDS (dtype), GFC_DTYPE_RANK);
rank_off = byte_position (field);
if (!integer_zerop (dtype_off))
t = fold_build_pointer_plus (t, rank_off);
i.e. uses the same !integer_zerop check between both, while it should
be checking rank_off in the latter case.
This actually doesn't change anything in the generated code, because
neither dtype_off nor rank_off is zero:
typedef struct dtype_type
{
size_t elem_len;
int version;
signed char rank;
signed char type;
signed short attribute;
}
dtype_type;
struct {
type *base_addr;
size_t offset;
dtype_type dtype;
index_type span;
descriptor_dimension dim[];
};
dtype_off is 16 on 64-bit arches and 8 on 32-bit ones and rank_off is
12 on 64-bit arches and 8 on 32-bit arches, so this patch is just to
pacify those static analyzers or be prepared if the ABI changes in the
future. Because in the current ABI both of those are actually non-zero,
doing
if (!integer_zerop (something)) t = fold_build_pointer_plus (t, something);
actually isn't an optimization, it will consume more compile time. If
the ABI changes and we forget to readd it, nothing bad happens,
fold_build_pointer_plus handles 0 addends fine, just takes some compile
time to handle that.
I've kept this if (!integer_zerop (data_off)) guard earlier because
data_off is 0 in the current ABI, so it is an optimization there.
2024-08-01 Jakub Jelinek <jakub@redhat.com>
* trans-types.cc (gfc_get_array_descr_info): Don't test if
!integer_zerop (dtype_off), use fold_build_pointer_plus
unconditionally.
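The offsets quoted above can be checked with offsetof. In this sketch
the two structures are copied from the message, while index_type, the
contents of descriptor_dimension, the tag struct descriptor and the
three helper functions are assumptions made only to get a compilable
example.

```c
#include <stddef.h>

typedef ptrdiff_t index_type; /* assumption: stand-in type */

typedef struct descriptor_dimension /* assumption: simplified */
{
  index_type stride;
  index_type lower_bound;
  index_type upper_bound;
}
descriptor_dimension;

typedef struct dtype_type
{
  size_t elem_len;
  int version;
  signed char rank;
  signed char type;
  signed short attribute;
}
dtype_type;

struct descriptor
{
  void *base_addr;
  size_t offset;
  dtype_type dtype;
  index_type span;
  descriptor_dimension dim[];
};

/* Offset of the data pointer within the descriptor.  */
size_t data_off (void) { return offsetof (struct descriptor, base_addr); }

/* Offset of the dtype member within the descriptor.  */
size_t dtype_off (void) { return offsetof (struct descriptor, dtype); }

/* Offset of the rank field within dtype_type.  */
size_t rank_off (void) { return offsetof (dtype_type, rank); }
```

On a typical 64-bit target these return 0, 16 and 12 respectively, so
only the data_off guard can ever skip the pointer addition.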
Jakub Jelinek [Thu, 1 Aug 2024 16:49:39 +0000 (18:49 +0200)]
c++: Fix up error recovery of invalid structured bindings used in conditions [PR116113]
The following testcase ICEs, because for structured binding error recovery
DECL_DECOMP_BASE is kept NULL and the newly added code to pick up saved
value from the base assumes that on structured binding bases the
TARGET_EXPR will be always there (that is the case if there are no errors).
The following patch fixes it by testing DECL_DECOMP_BASE before
dereferencing it; another option would be not to do that if
error_operand_p (cond).
2024-08-01 Jakub Jelinek <jakub@redhat.com>
PR c++/116113
* semantics.cc (maybe_convert_cond): Check DECL_DECOMP_BASE
is non-NULL before dereferencing it.
(finish_switch_cond): Likewise.
Tamar Christina [Thu, 1 Aug 2024 15:55:10 +0000 (16:55 +0100)]
AArch64: Add Cortex-X925 core definition and cost model
This adds a cost model and core definition for Cortex-X925.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (cortex-x925): New.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/tuning_models/cortexx925.h: New file.
* config/aarch64/aarch64.cc: Use it.
* doc/invoke.texi: Document it.