Piotr Trojanek [Tue, 15 Oct 2024 08:53:47 +0000 (10:53 +0200)]
ada: Move special case for null string literal from frontend to backend
Previously the lower bound of string literals indexed by non-static
integer types was artificially set to 1 in the frontend. This was to
avoid an overflow in calculation of a null string size by the GCC
backend, which was causing an excessively large binary object file.
However, setting the lower bound to 1 was problematic for GNATprove,
which could not easily retrieve the lower bound of string literals.
This patch avoids the overflow in GCC by recognizing null string literal
subtypes in Gigi.
gcc/ada/ChangeLog:
* gcc-interface/decl.cc (gnat_to_gnu_entity): Recognize null
string literal subtypes and set their bounds to 1 .. 0.
Bob Duff [Thu, 17 Oct 2024 16:04:45 +0000 (12:04 -0400)]
ada: Split Library_Unit using multiple wrappers
The Library_Unit field was used for all sorts of different purposes,
which led to confusing code.
This patch splits Library_Unit into much more specific wrapper
subprograms that should be called instead of [Set_]Library_Unit.
Predicates and pragmas Assert are used to catch misuses of these.
We document the semantics, especially "surprising" cases (e.g.
internally-generated with clauses can refer to package bodies).
This change does not fix gigi, codepeer, spark, or llvm
to use the new wrappers; so far, they are used only in
the GNAT front end.
gcc/ada/ChangeLog:
* sinfo.ads (Library_Unit): Rewrite documentation. Note that
the "??? not (always) true..." comment was not true;
the Subunit_Parent never points to the spec.
(N_Compilation_Unit): Improve documentation. The Aux_ node
was not created to solve the mentioned problems; it was
created because the size of nodes was limited.
Misc doc improvements.
* sinfo-utils.ads: Add new wrappers for Library_Unit field.
Use subtypes with predicates for the parameters.
(First_Real_Statement): Still used in codepeer.
* sinfo-utils.adb: Add new wrappers for Library_Unit field,
with suitable assertions.
* sem_prag.adb: Use new field wrapper names.
(Matching_Name): New name for Same_Name to avoid
potential confusion with the other function with the
same name (Sem_Util.Same_Name), which is also called
in this same file.
(Matching_Convention): Change Same_Convention to match
Matching_Name.
* sem_util.ads (Same_Name): Improve comments; the old comment
implied that it works for all names, which was not true.
* sem_util.adb: Use new field wrapper names.
* gen_il-gen.adb: Rename N_Unit_Body to be N_Lib_Unit_Body.
Plain "unit" is ambiguous in Ada (library unit, compilation
unit, program unit, etc).
Add new union types N_Lib_Unit_Declaration and
N_Lib_Unit_Renaming_Declaration.
* gen_il-gen-gen_nodes.adb (Compute_Ranges): Raise exception
earlier (it is already raised later, in Verify_Type_Table).
Add a comment explaining why it might be raised.
* gen_il-types.ads: Rename N_Unit_Body to be N_Lib_Unit_Body, and add
new N_Lib_Unit_Declaration and N_Lib_Unit_Renaming_Declaration.
* einfo.ads: Fix obsolete comment (was left over from before
the "variable-sized nodes").
* exp_ch7.adb: Use new field wrapper names.
* exp_disp.adb: Use new field wrapper names.
* exp_unst.adb: Use new field wrapper names.
* exp_util.adb: Use new field wrapper names.
* fe.h: Add new field wrapper names. These are currently not
used in gigi, but this change prepares for using them in
gigi.
* inline.adb: Use new field wrapper names.
* lib.adb: Use new field wrapper names.
Comment improvements.
* lib-load.adb: Use new field wrapper names.
Minor cleanup.
* lib-writ.adb: Use new field wrapper names.
* live.adb: Use new field wrapper names.
* par-load.adb: Use new field wrapper names.
Comment improvements. Minor cleanup.
* rtsfind.adb: Use new field wrapper names.
* sem.adb: Use new field wrapper names.
* sem_ch10.adb: Use new field wrapper names.
Comment improvements. Minor cleanup.
* sem_ch12.adb: Use new field wrapper names.
* sem_ch7.adb: Use new field wrapper names.
* sem_ch8.adb: Use new field wrapper names.
* sem_elab.adb: Use new field wrapper names.
Comment improvements.
* errout.adb (Output_Source_Line): Fix blowup in some
obscure cases, where List_Pragmas is not fully set up.
Nicolas Roche [Wed, 16 Oct 2024 09:56:35 +0000 (11:56 +0200)]
ada: Improve Unbounded_Wide_String performance
Improve performance of iteration using Element function.
Improve performance of Append.
gcc/ada/ChangeLog:
* libgnat/a-stwiun__shared.adb: Restructure code to inline only
the most common cases. Remove runtime checks whenever possible.
* libgnat/a-stwiun__shared.ads: Add Inline => True to Append
variants and Element.
Nicolas Roche [Wed, 25 Sep 2024 11:21:04 +0000 (13:21 +0200)]
ada: Improve performance of Unbounded_Wide_Wide_String
Improve performance of iteration using Element function.
Improve performance of Append.
gcc/ada/ChangeLog:
* libgnat/a-stzunb__shared.adb: Restructure code to inline only
the most common cases. Remove runtime checks whenever possible.
* libgnat/a-stzunb__shared.ads: Add Inline => True to Append
variants and Element.
Nicolas Roche [Wed, 25 Sep 2024 10:31:14 +0000 (12:31 +0200)]
ada: Improve Unbounded_String performance
Improve performance of iteration using Element function.
Improve performance of Append.
gcc/ada/ChangeLog:
* libgnat/a-strunb__shared.adb: Restructure code to inline only
the most common cases. Remove runtime checks whenever possible.
* libgnat/a-strunb__shared.ads: Add Inline => True to Append
variants and Element.
Steve Baird [Wed, 17 Jul 2024 22:21:01 +0000 (15:21 -0700)]
ada: Initial implementation of Extended_Access aspect (FE portion only)
The Extended_Access aspect can be specified to be True for certain
access-to-unconstrained-array-subtype types. Such extended access types
can designate objects that a normal general access type (with the same
designated subtype) cannot, such as a slice of an aliased array object
or an object that is represented without contiguous bounds information.
gcc/ada/ChangeLog:
* aspects.ads: Add Aspect_Extended_Access to Aspect_Id
enumeration.
* par-prag.adb: Add Pragma_Extended_Access to list of pragmas that
get no interesting processing in the parser.
* sem_attr.adb: Relax legality checks on Access/Unchecked_Access
attribute references if access type is Extended_Access.
* sem_ch12.adb (Validate_Access_Type_Instance): For an instance of
a generic with a formal access type, check that formal and actual
agree with respect to the Extended_Access aspect.
* sem_prag.adb (Analyze_Pragma): Add analysis code for pragma
Extended_Access. Set Pragma_Extended_Access element in Sig_Flags
aggregate.
* sem_prag.ads: Set Pragma_Extended_Access element in
Aspect_Specifying_Pragma aggregate.
* sem_res.adb (Valid_Conversion): Disallow
extended-to-not-extended access conversion.
* sem_util.adb (Is_Extended_Access_Type): Implement new
function.
(Is_Aliased_View): If (and only if) the new Boolean For_Extended
parameter is True, then a slice of an aliased non-bitpacked array
is aliased, a constrained nominal subtype does not force a result
of False, and a dereference of an extended access value is
aliased. The last point is somewhat subtle. This is how we prevent
covert fat-to-nonfat type conversions via things like
"Not_Extended_Type'(Extended_Ptr.all'Access)" or passing
Extended_Ptr.all as an actual parameter corresponding to an
explicitly aliased formal parameter.
* sem_util.ads (Is_Extended_Access_Type): Declare new function.
(Is_Aliased_View): Add new defaults-False parameter For_Extended.
* snames.ads-tmpl: Declare Name_Extended_Access Name_Id constant
and Pragma_Extended_Access Pragma_Id enumeration literal.
Viljar Indus [Wed, 16 Oct 2024 09:01:38 +0000 (12:01 +0300)]
ada: Avoid unused with warning with Extend_System
When the Extend_System pragma is used then we are supposed
to check the extended system for referenced entities. Otherwise
we would get an incorrect unused with warning.
This was previously done for body files, but it should be
done for specs as well.
gcc/ada/ChangeLog:
* sem_warn.adb (Check_One_Unit): When a system extension is
present always check entities from that unit before marking
the unit unreferenced.
Eric Botcazou [Tue, 15 Oct 2024 19:41:45 +0000 (21:41 +0200)]
ada: Propagate resolution status from Resolve_Iterated_Component_Association
The resolution status of Resolve_Aggr_Expr is lost when the routine is
invoked indirectly from Resolve_Iterated_Component_Association.
gcc/ada/ChangeLog:
* sem_aggr.adb (Resolve_Iterated_Component_Association): Change to
function returning Boolean and return the result of the call made
to Resolve_Aggr_Expr.
(Resolve_Array_Aggregate): Return failure status if the call to
Resolve_Iterated_Component_Association returns false.
Viljar Indus [Tue, 15 Oct 2024 10:49:07 +0000 (13:49 +0300)]
ada: Update documentation for -gnatVxx switches
Improve the wording to explicitly state which options are turned on
by -gnatVa and that -gnatVd is enabled by default.
It can be somewhat hard to decipher that information from the old
wording, especially when compared to -gnatWxx switches, where there
is an elaborate scheme for describing those properties.
gcc/ada/ChangeLog:
* usage.adb: Update the wording for -gnatVa and -gnatVd.
Javier Miranda [Tue, 15 Oct 2024 09:32:43 +0000 (09:32 +0000)]
ada: Missing runtime check in interpolated string
When the type imposed by the context for an interpolated string is
constrained, the compiler silently omits adding a runtime check.
gcc/ada/ChangeLog:
* exp_ch2.adb (Expand_N_Interpolated_String_Literal): Use the
base type of the type imposed by the context for building the
interpolated string image; required to allow the expander adding
the missing runtime check when the target type is constrained.
(Apply_Static_Length_Check): New subprogram.
Viljar Indus [Fri, 11 Oct 2024 13:34:36 +0000 (16:34 +0300)]
ada: Add Invocation node to the SARIF report
Add an invocation node to the SARIF report that contains the
command line used to invoke gnat and whether the execution was
successful or not.
gcc/ada/ChangeLog:
* diagnostics-sarif_emitter.adb (Print_Runs): Add printing for
the invocation node, which consists of a single invocation that
is composed of the commandLine and executionSuccessful attributes.
The primary motivation for this change is making the taskset command
line tool work as expected for tasking programs that don't use features
from section D.16 of the Ada reference manual. A couple of components
are added to the ATCB record to make it possible to tell values that
come from explicit aspects and subprogram calls from values that are
inherited from activating tasks.
gcc/ada/ChangeLog:
* libgnarl/s-mudido__affinity.adb (Unchecked_Set_Affinity): Set new
ATCB component.
* libgnarl/s-taprop__linux.adb (Create_Task): Only set CPU affinity
when required.
(Requires_Affinity_Change): New subprogram.
(Set_Task_Affinity): Likewise.
* libgnarl/s-tarest.adb (Create_Restricted_Task): Adapt to
Initialize_ATCB change.
* libgnarl/s-taskin.adb (Initialize_ATCB): Update parameter list.
Record whether aspects were explicitly specified.
* libgnarl/s-taskin.ads (Common_ATCB): Add component.
* libgnarl/s-tassta.adb (Create_Task): Update call to Initialize_ATCB.
* libgnarl/s-tporft.adb (Register_Foreign_Thread): Likewise.
Daniel King [Fri, 13 Sep 2024 15:15:32 +0000 (16:15 +0100)]
ada: Build and runtime support for CheriBSD
SIGPROT is a new signal on CheriBSD that signals a CHERI protection violation.
The full runtime converts these to the appropriate Ada exception declared in
Interfaces.CHERI.Exceptions.
gcc/ada/ChangeLog:
* Makefile.rtl: Build support for Morello CheriBSD.
* libgnarl/s-intman__cheribsd.adb: New file for CheriBSD.
* libgnarl/s-osinte__cheribsd.ads: New file for CheriBSD.
Daniel King [Mon, 23 Sep 2024 09:50:13 +0000 (10:50 +0100)]
ada: Refactor exception declarations from Interfaces.CHERI to separate package
Exception declarations require elaboration on the full run-time to
register the exceptions. The package Interfaces.CHERI, however, is
used on bare-metal targets during early initialization, before
elaboration and is therefore marked No_Elaboration_Code_All.
Refactoring the exception declarations to a separate package allows
the common CHERI bindings to be used in such contexts.
gcc/ada/ChangeLog:
* libgnat/i-cheri.ads: Remove exception declarations.
* libgnat/i-cheri-exceptions.ads: New file.
Daniel King [Fri, 13 Sep 2024 15:16:52 +0000 (16:16 +0100)]
ada: Fix alignment of pthread_mutex_t
On most targets the alignment of unsigned long is the same as pointer
alignment, but on CHERI targets pointers have larger alignment (16 bytes
compared to 8 bytes). pthread_mutex_t needs the same alignment as
System.Address to account for CHERI targets.
gcc/ada/ChangeLog:
* libgnat/s-oslock__posix.ads: Fix alignment of pthread_mutex_t
for CHERI targets.
Claire Dross [Thu, 10 Oct 2024 14:51:13 +0000 (16:51 +0200)]
ada: Move formal hash tables from gnat repository to the SPARK library
The formal containers have been part of the SPARK library for some
time now. However, some units used only by these containers are still
part of the gnat repository. Move them to the SPARK library.
Bob Duff [Wed, 9 Oct 2024 11:25:14 +0000 (07:25 -0400)]
ada: Correction to disable self-referential with_clauses
Follow-on to previous change "Disable self-referential with_clauses",
which caused some regressions. Remove useless use clauses referring
to useless self-referential with'ed packages. This is necessary
because in some cases, such use clauses cause the compiler to
crash or give spurious errors.
In addition, enable the warning on self-referential with_clauses.
gcc/ada/ChangeLog:
* sem_ch10.adb (Analyze_With_Clause): In the case of a
self-referential with clause, if there is a subsequent use clause
for the same package (which is necessarily useless), remove it from
the context clause. Reenable the warning.
Javier Miranda [Tue, 8 Oct 2024 18:33:37 +0000 (18:33 +0000)]
ada: Missing precondition runtime check in inherited primitive
When a derived tagged type implements interface types in addition
to deriving from its parent type, and a primitive inherited from
its parent type corresponds to an inherited primitive that has
class-wide preconditions, then the generated code fails to check
the class-wide preconditions inherited from the interface primitive.
gcc/ada/ChangeLog:
* einfo.ads (Is_Dispatch_Table_Wrapper): Complete documentation.
* exp_ch6.adb (Install_Class_Preconditions_Check): Dispatch table
wrappers do not require installing the check since it is performed
by the caller.
(Class_Preconditions_Subprogram): Use new predicate Is_LSP_Wrapper.
* freeze.adb (Check_Inherited_Conditions): Rename Postcond_Wrappers to
Condition_Wrappers to handle implicitly inherited subprograms that
implement pre-/postconditions inherited from interface primitives.
Use new predicate Is_LSP_Wrapper.
* sem_disp.adb (Check_Dispatching_Operation): Complete assertion to
handle functions returning class-wide types.
* exp_util.ads (Is_LSP_Wrapper): New subprogram.
* exp_util.adb (Is_LSP_Wrapper): New subprogram.
* contracts.adb (Process_Spec_Postconditions): Use Is_LSP_Wrapper.
(Process_Inherited_Conditions): Use Is_LSP_Wrapper.
* sem_ch6.adb (New_Overloaded_Entity): Use Is_LSP_Wrapper.
* sem_util.adb (Nearest_Class_Condition_Subprogram): Use Is_LSP_Wrapper.
Piotr Trojanek [Tue, 8 Oct 2024 20:52:38 +0000 (22:52 +0200)]
ada: Fix visibility of Taft amendment types
When uninstalling private package declarations we must mark Taft
amendment types hidden, just like we mark other types.
Looking at previous revisions of this code, it is quite clear that this
bug comes from a code evolution and marking types should happen in all
ELSE branches of the enclosing IF statement.
gcc/ada/ChangeLog:
* sem_ch7.adb (Uninstall_Declarations): Mark Taft amendment
types like we mark other types declared in private package
declarations.
Piotr Trojanek [Fri, 27 Sep 2024 14:56:37 +0000 (16:56 +0200)]
ada: Add null exclusion to avoid run-time checks
Declaring the access parameter with a null exclusion should let the
compiler avoid generating run-time checks in debug builds, resulting in
a tiny performance improvement.
Piotr Trojanek [Fri, 27 Sep 2024 08:47:29 +0000 (10:47 +0200)]
ada: Resolve intrinsic operators without homonyms
Intrinsic operators are resolved by rewriting into a corresponding
operator from the Standard package. Traversing homonyms just to find the
corresponding operator was not particularly efficient; also, for the
binary "-" it was finding the unary "-".
There appears to be no difference in compiler behavior, but the new code
should be more efficient and finding the correct operator seems to make
more sense.
gcc/ada/ChangeLog:
* sem_res.adb (Resolve_Intrinsic_Operator)
(Resolve_Intrinsic_Unary_Operator): Replace traversals of
homonyms with a direct lookup.
Piotr Trojanek [Thu, 26 Sep 2024 06:40:28 +0000 (08:40 +0200)]
ada: Fix asymmetry in resolution of unary intrinsic operators
Resolution of binary and unary intrinsic operators differed when
expansion was inactive. In particular, this affected GNATprove
handling of Ada.Real_Time."abs" operator. This patch makes unary
resolution behave like binary resolution.
gcc/ada/ChangeLog:
* sem_res.adb (Resolve_Intrinsic_Unary_Operator): Disable when
expansion is inactive.
This is, in effect, a reapplication of de2bc6a7367aca2eecc925ebb64cfb86998d89f3
fixing the compile-time hog in var-tracking due to calling simplify_rtx
on the two arms of the rotation before detecting the ROTATE.
That is not necessary.
simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
C1 + C2 == mode-width. But the transformation is also valid for PLUS and XOR.
Indeed GIMPLE can also do the fold. Let's teach RTL to do it too.
The motivating testcase for this is in AArch64 intrinsics:
uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64(a, b);
  return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}
which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
GCC was failing to detect the rotate operation for two reasons:
1) The combination of the two arms of the expression is done under XOR rather
than IOR that simplify-rtx currently supports.
2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
detected as the LHS of the two arms we require.
The patch fixes both issues. The analysis of the two arms of the rotation
expression is factored out into a common helper simplify_rotate_op which is
then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.
The check-assembly testcase for this is added in the following patch because
it needs some extra AArch64 backend work, but I've added self-tests in this
patch to validate the transformation.
Bootstrapped and tested on aarch64-none-linux-gnu
Signed-off-by: Kyrylo Tkachov <ktachov@nvidia.com>
PR target/117048
* simplify-rtx.cc (extract_ashift_operands_p): Define.
(simplify_rotate_op): Likewise.
(simplify_context::simplify_binary_operation_1): Use the above in
the PLUS, IOR, XOR cases.
(test_vector_rotate): Define.
(test_vector_ops): Use the above.
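The identity behind this change can be checked on scalars. The sketch below
is only an illustration, not code from simplify-rtx.cc; the function names
are hypothetical. It relies on the fact that, when C1 + C2 equals the width,
the two shifted arms occupy disjoint bit positions, so IOR, XOR and PLUS all
produce the same result. The last function mirrors the motivating testcase,
where x + x stands in for x << 1.

```c
#include <assert.h>
#include <stdint.h>

/* Reference rotate-left for 64-bit values.  c must be in 1..63 so that
   neither shift count reaches the width of the type.  */
static uint64_t
rotl64 (uint64_t x, unsigned c)
{
  return (x << c) | (x >> (64 - c));
}

/* Same two arms, joined with XOR instead of IOR.  */
static uint64_t
rotl64_xor (uint64_t x, unsigned c)
{
  return (x << c) ^ (x >> (64 - c));
}

/* Same two arms, joined with PLUS.  No carries can occur because the
   arms have no set bits in common.  */
static uint64_t
rotl64_plus (uint64_t x, unsigned c)
{
  return (x << c) + (x >> (64 - c));
}

/* Scalar analogue of the testcase: the left shift by 1 is written as
   x + x and the arms are joined with XOR.  */
static uint64_t
rotl1_via_plus_xor (uint64_t x)
{
  return (x + x) ^ (x >> 63);
}
```

All four functions agree for any input when the shift count is 1, which is
the case the vector testcase exercises.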
Andrew MacLeod [Sat, 2 Nov 2024 14:26:24 +0000 (10:26 -0400)]
Don't call invert on VARYING.
When all cases go to one label and result in a VARYING value, we can't
invert that value to remove all values from the default case. Simply
check for this case and set the default to UNDEFINED.
PR tree-optimization/117398
gcc/
* gimple-range-edge.cc (gimple_outgoing_range::calc_switch_ranges):
Check for VARYING and don't call invert () on it.
As Yuta Mukai pointed out, the manual wrongly said that LS64 is
enabled by default for Armv8.7-A and above, and for Armv9.2-A
and above. LS64 is not mandatory at any architecture level
(and the code correctly implemented that).
I think this was a leftover from an early version of the spec.
gcc/
* doc/invoke.texi: Fix documentation of LS64 so that it's
not implied by Armv8.7-A or Armv9.2-A.
Jakub Jelinek [Mon, 4 Nov 2024 11:29:01 +0000 (12:29 +0100)]
libstdc++: Fix up 117406.cc test [PR117406]
Christophe mentioned in bugzilla that the test FAILs on aarch64
because I use INT_MAX without including <climits>.
Apparently during my testing I got it because the test preinclude
-include bits/stdc++.h
and that includes <climits>, dunno why that didn't happen on aarch64.
In any case, I could either add #include <climits> or, because the
test already has #include <limits>, replace uses of INT_MAX
with std::numeric_limits<int>::max(), which should be the same thing.
I've done the latter.
But if you prefer
#include <climits>
I can surely add that instead.
2024-11-04 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/117406
* testsuite/26_numerics/headers/cmath/117406.cc: Use
std::numeric_limits<int>::max() instead of INT_MAX.
Richard Biener [Thu, 31 Oct 2024 10:51:16 +0000 (11:51 +0100)]
Preserve ->move_dr behavior when adjusting epilogue info
When update_epilogue_loop_vinfo relates the shared loop DRs with
the epilogue stmts and infos it should not fiddle with how
pattern recognition applied move_dr.
* tree-vect-loop.cc (update_epilogue_loop_vinfo): A DR's
main stmt vinfo dr_aux should refer to a pattern stmt,
which is how move_dr sets this up. We shouldn't undo this.
Richard Biener [Thu, 31 Oct 2024 08:55:35 +0000 (09:55 +0100)]
Move updated versioning threshold compute
The following moves computing the combined main + epilogue loop
versioning threshold until we figured the epilogues to use rather
than incrementally updating it with the chance to joust candidates
after the fact.
* tree-vect-loop.cc (vect_analyze_loop): Move lowest_th
compute until after epilogue_vinfos is final.
Kyrylo Tkachov [Thu, 17 Oct 2024 13:39:57 +0000 (06:39 -0700)]
simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)
With the recent patch to improve detection of vector rotates at RTL level,
combine now tries matching a V8HImode rotate by 8 in the example in the
testcase. We can teach AArch64 to emit a REV16 instruction for such a rotate
but really this operation corresponds to the RTL code BSWAP, for which we
already have the right patterns. BSWAP is arguably a simpler representation
than ROTATE here because it has only one operand, so let's teach simplify-rtx
to generate it.
With this patch the testcase now generates the simplest form:
.L2:
        ldr     q31, [x1, x0]
        rev16   v31.16b, v31.16b
        str     q31, [x0, x2]
        add     x0, x0, 16
        cmp     x0, 2048
        bne     .L2
IMO ideally the bswap detection would have been done during vectorisation
time and used the expanders for that, but teaching simplify-rtx to do this
transformation is fairly straightforward and, unlike at tree level, we have
the native RTL BSWAP code. This change is not enough to generate the
equivalent sequence in SVE, but that is something that should be tackled
separately.
Bootstrapped and tested on aarch64-none-linux-gnu.
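The equivalence the simplification relies on is easy to see on a scalar. This
is a minimal sketch with hypothetical helper names, not the simplify-rtx
code: rotating a 16-bit value by 8 moves the low byte to the high position
and the high byte to the low position, which is exactly a byte swap.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits.  */
static uint16_t
rotl16_by_8 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Byte-swap a 16-bit value by moving each byte explicitly.  */
static uint16_t
bswap16 (uint16_t x)
{
  uint16_t lo = x & 0xff;
  uint16_t hi = (x >> 8) & 0xff;
  return (uint16_t) ((lo << 8) | hi);
}
```

Because 8 is half of 16, rotating left or right by 8 gives the same result,
so the direction of the rotate does not matter for this case.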
Kyrylo Tkachov [Tue, 22 Oct 2024 14:52:36 +0000 (07:52 -0700)]
aarch64: Emit XAR for vector rotates where possible
We can make use of the integrated rotate step of the XAR instruction
to implement most vector integer rotates, as long we zero out one
of the input registers for it. This allows for a lower-latency sequence
than the fallback SHL+USRA, especially when we can hoist the zeroing operation
away from loops and hot parts. This should be safe to do for 64-bit vectors
as well even though the XAR instructions operate on 128-bit values, as the
bottom 64-bit result is later accessed through the right subregs.
This strategy is used whenever we have XAR instructions; the logic
in aarch64_emit_opt_vec_rotate is adjusted to resort to
expand_rotate_as_vec_perm only when it's expected to generate a single REV*
instruction or when XAR instructions are not present.
With this patch we can generate for the input:
v4si
G1 (v4si r)
{
  return (r >> 23) | (r << 9);
}
v8qi
G2 (v8qi r)
{
  return (r << 3) | (r >> 5);
}
the assembly for +sve2:
G1:
        movi    v31.4s, 0
        xar     z0.s, z0.s, z31.s, #23
        ret
G2:
        movi    v31.4s, 0
        xar     z0.b, z0.b, z31.b, #5
        ret
instead of the current:
G1:
        shl     v31.4s, v0.4s, 9
        usra    v31.4s, v0.4s, 23
        mov     v0.16b, v31.16b
        ret
G2:
        shl     v31.8b, v0.8b, 3
        usra    v31.8b, v0.8b, 5
        mov     v0.8b, v31.8b
        ret
Bootstrapped and tested on aarch64-none-linux-gnu.
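To see why zeroing one input turns XAR into a plain rotate, here is a scalar
model of a single 32-bit lane. It is a sketch under the assumption that XAR
computes a XOR followed by a rotate right by the immediate; the names xar32
and g1_lane are hypothetical and the immediate is restricted to 1..31 to keep
the C shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 32-bit XAR lane: XOR the inputs, then rotate the
   result right by imm.  imm is assumed to be in 1..31 here.  */
static uint32_t
xar32 (uint32_t a, uint32_t b, unsigned imm)
{
  uint32_t t = a ^ b;
  return (t >> imm) | (t << (32 - imm));
}

/* One lane of G1 from the message, written with plain shifts.  */
static uint32_t
g1_lane (uint32_t r)
{
  return (r >> 23) | (r << 9);
}
```

With the second operand set to zero, a ^ 0 is just a, so xar32 (r, 0, 23)
matches g1_lane (r) for every r, which corresponds to the movi + xar sequence
shown above.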
Kyrylo Tkachov [Wed, 16 Oct 2024 11:10:08 +0000 (04:10 -0700)]
aarch64: Optimize vector rotates as vector permutes where possible
Some vector rotate operations can be implemented in a single instruction
rather than using the fallback SHL+USRA sequence.
In particular, when the rotate amount is half the bitwidth of the element
we can use a REV64,REV32,REV16 instruction.
More generally, rotates by a byte amount can be implemented using vector
permutes.
This patch adds such a generic routine in expmed.cc called
expand_rotate_as_vec_perm that calculates the required permute indices
and uses the expand_vec_perm_const interface.
On aarch64 this ends up generating the single-instruction sequences above
where possible and can use LDR+TBL sequences too, which are a good choice.
With help from Richard, the routine should be VLA-safe.
However, the only use of expand_rotate_as_vec_perm introduced in this patch
is in aarch64-specific code that for now only handles fixed-width modes.
A runtime aarch64 test is added to ensure the permute indices are not messed
up.
Bootstrapped and tested on aarch64-none-linux-gnu.
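The idea of computing permute indices for a byte-granular rotate can be
sketched on plain byte arrays. This is not the expand_rotate_as_vec_perm
implementation; the function names are hypothetical, and the sketch assumes
little-endian byte numbering within each element (byte 0 is the least
significant) and a rotate amount smaller than the element size.

```c
#include <assert.h>
#include <stdint.h>

/* Fill sel[0..nbytes-1] with byte-permute indices that rotate every
   elt_bytes-wide element left by rot_bytes bytes.  Byte 0 of an element
   is taken to be its least significant byte, and rot_bytes must be
   smaller than elt_bytes.  */
static void
rotate_perm_indices (unsigned elt_bytes, unsigned rot_bytes,
                     unsigned nbytes, unsigned char *sel)
{
  for (unsigned i = 0; i < nbytes; i++)
    {
      unsigned lane = i % elt_bytes;  /* byte position inside the element */
      unsigned base = i - lane;       /* first byte of the element */
      sel[i] = (unsigned char)
        (base + (lane + elt_bytes - rot_bytes) % elt_bytes);
    }
}

/* Apply a byte permute: dst[i] = src[sel[i]].  */
static void
apply_perm (const unsigned char *src, const unsigned char *sel,
            unsigned nbytes, unsigned char *dst)
{
  for (unsigned i = 0; i < nbytes; i++)
    dst[i] = src[sel[i]];
}

/* Rotate one 32-bit value left by rot_bytes bytes through the permute.
   The value is packed and unpacked in little-endian order explicitly, so
   the result does not depend on the host byte order.  */
static uint32_t
rotl32_by_bytes_via_perm (uint32_t x, unsigned rot_bytes)
{
  unsigned char src[4], sel[4], dst[4];
  for (unsigned i = 0; i < 4; i++)
    src[i] = (unsigned char) (x >> (8 * i));
  rotate_perm_indices (4, rot_bytes, 4, sel);
  apply_perm (src, sel, 4, dst);
  return (uint32_t) dst[0] | ((uint32_t) dst[1] << 8)
         | ((uint32_t) dst[2] << 16) | ((uint32_t) dst[3] << 24);
}
```

For a 32-bit element rotated by two bytes, the selector is {2, 3, 0, 1}, i.e.
the two halfwords swap places; that is the half-width case the message says
can map to a single REV* instruction.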
Kyrylo Tkachov [Tue, 15 Oct 2024 13:33:11 +0000 (06:33 -0700)]
PR 117048: aarch64: Add define_insn_and_split for vector ROTATE
The ultimate goal in this PR is to match the XAR pattern that is represented
as a (ROTATE (XOR X Y) VCST) from the ACLE intrinsics code in the testcase.
The first blocker for this was the missing recognition of ROTATE in
simplify-rtx, which is fixed in the previous patch.
The next problem is that once the ROTATE has been matched from the shifts
and orr/xor/plus, it will try to match it in an insn before trying to combine
the XOR into it. But as we don't have a backend pattern for a vector ROTATE
this recog fails and combine does not try the followup XOR+ROTATE combination
which would have succeeded.
This patch solves that by introducing a sort of "scaffolding" pattern for
vector ROTATE, which allows it to be combined into the XAR.
If it fails to be combined into anything the splitter will break it back
down into the SHL+USRA sequence that it would have emitted.
By having this splitter we can special-case some rotate amounts in the future
to emit more specialised instructions e.g. from the REV* family.
This can be done if the ROTATE is not combined into something else.
This optimisation is done in the next patch in the series.
Bootstrapped and tested on aarch64-none-linux-gnu.
Kyrylo Tkachov [Tue, 22 Oct 2024 10:27:47 +0000 (03:27 -0700)]
aarch64: Use canonical RTL representation for SVE2 XAR and extend it to fixed-width modes
The MD pattern for the XAR instruction in SVE2 is currently expressed with
non-canonical RTL by using a ROTATERT code with a constant rotate amount.
Fix it by using the left ROTATE code. This necessitates splitting out the
expander separately to translate the immediate coming from the intrinsic
from a right-rotate to a left-rotate immediate.
Additionally, as the SVE2 XAR instruction is unpredicated and can handle all
element sizes from .b to .d, it is a good fit for implementing the XOR+ROTATE
operation for Advanced SIMD modes where the TARGET_SHA3 cannot be used
(that can only handle V2DImode operands). Therefore let's extend the accepted
modes of the SVE2 pattern to include the Advanced SIMD integer modes.
This leads to some tests for the svxar* intrinsics to fail because they now
simplify to a plain EOR when the rotate amount is the width of the element.
This simplification is desirable (EOR instructions have better or equal
throughput than XAR, and they are non-destructive of their input) so the
tests are adjusted.
For V2DImode XAR operations we should prefer the Advanced SIMD version when
it is available (TARGET_SHA3) because it is non-destructive, so restrict the
SVE2 pattern accordingly. Tests are added to confirm this.
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for mainline?
* config/aarch64/iterators.md (SVE_ASIMD_FULL_I): New mode iterator.
* config/aarch64/aarch64-sve2.md (@aarch64_sve2_xar<mode>):
Use SVE_ASIMD_FULL_I modes. Use ROTATE code for the rotate step.
Adjust output logic.
* config/aarch64/aarch64-sve-builtins-sve2.cc (svxar_impl): Define.
(svxar): Use the above.
gcc/testsuite/
* gcc.target/aarch64/xar_neon_modes.c: New test.
* gcc.target/aarch64/xar_v2di_nonsve.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_s16.c: Scan for EOR rather than
XAR.
* gcc.target/aarch64/sve2/acle/asm/xar_s32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_s64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_s8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_u16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_u32.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_u64.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/xar_u8.c: Likewise.
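The right-to-left immediate translation, and the reason a rotate by the full
element width degenerates to a plain EOR, can be sketched on 16-bit scalars.
This is an illustration only, not the expander from aarch64-sve2.md. The
helper names are hypothetical, and the immediate range of 1 up to the element
width is an assumption consistent with the message.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a right-rotate amount into the equivalent left-rotate amount
   for elements that are width bits wide.  right is assumed to be in
   1..width.  */
static unsigned
left_rotate_amount (unsigned right, unsigned width)
{
  return (width - right) % width;
}

/* Rotate a 16-bit value left by c bits, treating c modulo 16.  A count
   of zero returns x unchanged and avoids a shift by the full width.  */
static uint16_t
rotl16 (uint16_t x, unsigned c)
{
  c %= 16;
  if (c == 0)
    return x;
  return (uint16_t) ((x << c) | (x >> (16 - c)));
}

/* Scalar model of a 16-bit XAR lane expressed with a left rotate: XOR
   the inputs, then rotate left by the translated amount.  */
static uint16_t
xar16 (uint16_t a, uint16_t b, unsigned right_imm)
{
  return rotl16 ((uint16_t) (a ^ b), left_rotate_amount (right_imm, 16));
}
```

When right_imm equals 16 the translated left amount is 0, so xar16 returns
a ^ b unchanged, which is why the adjusted tests scan for EOR instead of XAR.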
simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
C1 + C2 == mode-width. But the transformation is also valid for PLUS and XOR.
Indeed GIMPLE can also do the fold. Let's teach RTL to do it too.
The motivating testcase for this is in AArch64 intrinsics:
uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64(a, b);
  return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}
which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
GCC was failing to detect the rotate operation for two reasons:
1) The combination of the two arms of the expression is done under XOR rather
than IOR that simplify-rtx currently supports.
2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
detected as the LHS of the two arms we require.
The patch fixes both issues. The analysis of the two arms of the rotation
expression is factored out into a common helper simplify_rotate_op which is
then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.
The check-assembly testcase for this is added in the following patch because
it needs some extra AArch64 backend work, but I've added self-tests in this
patch to validate the transformation.
Bootstrapped and tested on aarch64-none-linux-gnu
Signed-off-by: Kyrylo Tkachov <ktachov@nvidia.com>
PR target/117048
* simplify-rtx.cc (extract_ashift_operands_p): Define.
(simplify_rotate_op): Likewise.
(simplify_context::simplify_binary_operation_1): Use the above in
the PLUS, IOR, XOR cases.
(test_vector_rotate): Define.
(test_vector_ops): Use the above.
Andrew Pinski [Sun, 3 Nov 2024 06:34:00 +0000 (23:34 -0700)]
docs: Document that __builtin_assoc_barrier also can be used for FMAs [PR115023]
I noticed that __builtin_assoc_barrier makes a difference for FMA formation
but it was not documented. This adds that documentation, along with a small example.
Built the HTML documents to make sure everything looks correct.
gcc/ChangeLog:
PR middle-end/115023
* doc/extend.texi (__builtin_assoc_barrier): Document ffp-contract=fast
and FMA usage.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
There are a couple of things wrong with this pattern which
I missed during the review. First, each nop_convert should
be nop_convert1 or nop_convert2.
Second, we need to do the minus in the same type as the minus
was originally so we don't introduce extra undefined behavior
(signed integer overflow). And we need a convert into the new
type too.
pr117363-1.c tests not introducing extra undefined behavior.
pr117363-2.c tests the casting to the correct final type, ldist
introduces the cond_expr here.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/117363
gcc/ChangeLog:
* match.pd (`a != 0 ? a - 1 : 0`): Fix type handling
and nop_convert handling.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117363-1.c: New test.
* gcc.dg/torture/pr117363-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Paul Thomas [Sun, 3 Nov 2024 18:02:16 +0000 (18:02 +0000)]
Fortran: Fix associate_69.f90 that fails on some platforms [PR115700]
2024-11-03 Paul Thomas <pault@gcc.gnu.org>
gcc/testsuite/
PR fortran/115700
* gfortran.dg/associate_69.f90: Remove the test that produces a
variable string length because the optimized count depends on
the platform. This is tested in associate_70.f90.
Thomas Koenig [Tue, 29 Oct 2024 20:08:59 +0000 (21:08 +0100)]
Add UMASKR and UMASKL intrinsics.
gcc/fortran/ChangeLog:
* check.cc (gfc_check_mask): Handle BT_UNSIGNED.
* gfortran.h (enum gfc_isym_id): Add GFC_ISYM_UMASKL and
GFC_ISYM_UMASKR.
* gfortran.texi: List UMASKL and UMASKR, remove future
unsigned arguments for MASKL and MASKR.
* intrinsic.cc (add_functions): Add UMASKL and UMASKR.
* intrinsic.h (gfc_simplify_umaskl): New function.
(gfc_simplify_umaskr): New function.
(gfc_resolve_umasklr): New function.
* intrinsic.texi: Document UMASKL and UMASKR.
* iresolve.cc (gfc_resolve_umasklr): New function.
* simplify.cc (gfc_simplify_umaskr): New function.
(gfc_simplify_umaskl): New function.
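For readers unfamiliar with the mask intrinsics, here is a rough C model. It
assumes UMASKL and UMASKR produce the same bit patterns as MASKL and MASKR
(leftmost and rightmost I bits set) but as unsigned values, and it fixes the
width at 32 bits. The function names and the clamping of out-of-range counts
are choices made for this sketch, not gfortran behavior.

```c
#include <stdint.h>

/* Rightmost I bits set, remaining bits clear.  */
uint32_t umaskr32 (unsigned i)
{
  if (i == 0)
    return 0;
  if (i >= 32)
    return UINT32_MAX;          /* avoid an undefined full-width shift */
  return (UINT32_C (1) << i) - 1;
}

/* Leftmost I bits set, remaining bits clear.  */
uint32_t umaskl32 (unsigned i)
{
  if (i == 0)
    return 0;
  if (i >= 32)
    return UINT32_MAX;
  return UINT32_MAX << (32 - i);
}
```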
Jakub Jelinek [Sat, 2 Nov 2024 17:48:54 +0000 (18:48 +0100)]
libstdc++: Fix up std::{,b}float16_t std::{ilogb,l{,l}r{ound,int}} [PR117406]
These overloads incorrectly cast the result of the float __builtin_*
to _Float16 or __gnu_cxx::__bfloat16_t. For std::ilogb that changes
behavior for the INT_MAX return because that isn't representable in
either of the floating point formats; for the others it is, I think,
just a very inefficient hop from int/long/long long to std::{,b}float16_t
and back. I mean for the round/rint cases, either the argument is small
and then the return value should be representable in the floating point
format too, or it is so large that the argument is already integral
and then it should just return the argument even with the round trips.
For too large values the result is unspecified, unlike for ilogb.
2024-11-02 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/117406
* include/c_global/cmath (std::ilogb(_Float16), std::llrint(_Float16),
std::llround(_Float16), std::lrint(_Float16), std::lround(_Float16)):
Don't cast __builtin_* return to _Float16.
(std::ilogb(__gnu_cxx::__bfloat16_t),
std::llrint(__gnu_cxx::__bfloat16_t),
std::llround(__gnu_cxx::__bfloat16_t),
std::lrint(__gnu_cxx::__bfloat16_t),
std::lround(__gnu_cxx::__bfloat16_t)): Don't cast __builtin_* return to
__gnu_cxx::__bfloat16_t.
* testsuite/26_numerics/headers/cmath/117406.cc: New test.
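The representability problem can be shown by analogy in C with plain float,
since _Float16 support varies between targets. The helper name is hypothetical.
float has a 24-bit significand, so INT_MAX already does not survive the
conversion; _Float16 has only an 11-bit significand and a largest finite value
of 65504, so it is further out of range there.

```c
#include <limits.h>

/* Return 1 if V is exactly representable as a float, 0 otherwise.  */
int survives_float_round_trip (int v)
{
  float f = (float) v;          /* may round to the nearest float */
  return (double) f == (double) v;
}
```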
Jakub Jelinek [Sat, 2 Nov 2024 17:47:27 +0000 (18:47 +0100)]
gimplify: Fix up RAW_DATA_CST related ICE [PR117384]
Apparently tree_output_constant_def doesn't strictly guarantee that the
returned VAR_DECL will have the same or uselessly convertible type as
the type of the constant passed to it, compare_constants says:
/* For arrays, check that mode, size and storage order match. */
/* For record and union constructors, require exact type equality. */
The older use of tree_output_constant_def in gimplify.cc was already
handling this right:
  ctor = tree_output_constant_def (ctor);
  if (!useless_type_conversion_p (type, TREE_TYPE (ctor)))
    ctor = build1 (VIEW_CONVERT_EXPR, type, ctor);
but the spot I've added for RAW_DATA_CST missed this.
So, the following patch adds that.
2024-11-02 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117384
* gimplify.cc (gimplify_init_ctor_eval): Add VIEW_CONVERT_EXPR around
rctor if it doesn't have expected type.
Nathaniel Shead [Thu, 31 Oct 2024 09:05:16 +0000 (20:05 +1100)]
c++/modules: Propagate TYPE_CANONICAL for partial specialisations [PR113814]
In some cases, when we go to import a partial specialisation there might
already be an incomplete implicit instantiation in the specialisation
table. This causes ICEs described in the linked PR as we now have two
separate matching specialisations for the same arguments with different
TYPE_CANONICAL.
We already support multiple specialisations with the same args however,
as they may be differently constrained. So we can solve this by simply
ensuring that the TYPE_CANONICAL of the new partial specialisation
matches the existing specialisation.
* g++.dg/modules/partial-6.h: New test.
* g++.dg/modules/partial-6_a.H: New test.
* g++.dg/modules/partial-6_b.H: New test.
* g++.dg/modules/partial-6_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Co-authored-by: Jason Merrill <jason@redhat.com>
In cases like the linked PR we sometimes get mutually recursive
dependencies that both rely on the other to have been streamed as part
of their merge key information. In the linked PR, this causes an ICE.
The root cause is that 'sort_cluster' is not correctly ordering the
dependencies; both the element_t specialisation and the
reverse_adaptor::first function decl depend on each other, but by
streaming element_t first it ends up trying to stream itself recursively
as part of calculating its own merge key, which apart from the checking
ICE will also cause issues on stream-in, as the merge key will not
properly stream.
There is a comment already in 'sort_cluster' describing this issue, but
it says:
Finding the single cluster entry dep is very tricky and
expensive. Let's just not do that. It's harmless in this case
anyway.
However, in this case it was not harmless: it was just somewhat lucky that
the sorting happened to work for the existing cases in the testsuite.
This patch solves the issue by noting any declarations that rely on deps
first seen within their own merge key. This declaration gets marked as
an "entry" dep; any of these deps that end up recursively referring back
to that entry dep as part of their own merge key do not.
Then within sort_cluster we can ensure that the entry dep is written to
be streamed first of its cluster; this will ensure that any other deps
are just emitted as back-references, and the mergeable dep itself will
structurally decompose.
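The ordering step can be pictured with a small stand-alone sketch in C. It is
only a model of the idea "put the entry dep first in its cluster"; the struct,
its fields and the function name are hypothetical and do not mirror the depset
class or sort_cluster in module.cc.

```c
#include <stddef.h>

/* Hypothetical stand-in for a dependency in a cluster.  */
struct dep
{
  const char *name;
  int is_entry;                 /* nonzero if marked as the entry dep */
};

/* Move the first entry dep to index 0, keeping the relative order of the
   other deps.  Returns the index where the entry dep was found, or N if
   the cluster has none.  */
size_t move_entry_first (struct dep *deps, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (deps[i].is_entry)
      {
        struct dep entry = deps[i];
        for (size_t j = i; j > 0; j--)
          deps[j] = deps[j - 1];
        deps[0] = entry;
        return i;
      }
  return n;
}
```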
PR c++/116317
gcc/cp/ChangeLog:
* module.cc
(depset::DB_MAYBE_RECURSIVE_BIT): New flag.
(depset::DB_ENTRY_BIT): New flag.
(depset::is_maybe_recursive): New accessor.
(depset::is_entry): New accessor.
(depset::hash::writing_merge_key): New field.
(trees_out::decl_value): Inform dep_hash while we're writing the
merge key information for a decl.
(depset::hash::add_dependency): Find recursive deps and mark the
entry point.
(sort_cluster): Ensure that the entry dep is streamed first.
gcc/testsuite/ChangeLog:
* g++.dg/modules/late-ret-4_a.H: New test.
* g++.dg/modules/late-ret-4_b.C: New test.
Jeff Law [Sat, 2 Nov 2024 02:28:07 +0000 (20:28 -0600)]
[committed] Make LRA default for ft32 and remove -mlra option
I was looking to clean up an old patch I'm carrying in my tester. My first
thought was that ft32 was likely going to be deprecated because it wasn't using
LRA -- which in turn would mean the patch in question could just be removed.
But then I checked, ft32 has an LRA option and if turned on it gets the exact
same test results as with reload. While the port mentions a failure with
sieve.c, that's been there since the port was introduced in 2015.
It's working well enough that I think just converting it is the right thing to
do. The testsuite patch which precipitated this one will follow separately.
I've kept the -mlra option for compatibility's sake, but it's ignored.
Pushing to the trunk.
gcc/
* config/ft32/ft32.cc (ft32_lra_p): Remove.
(TARGET_LRA_P): Likewise.
* config/ft32/ft32.opt: Make -mlra ignored.
* doc/invoke.texi: Adjust documentation for -mlra on ft32.
Jakub Jelinek [Fri, 1 Nov 2024 22:03:48 +0000 (23:03 +0100)]
builtins: Fix expand_builtin_prefetch [PR117407]
On Fri, Nov 01, 2024 at 04:47:35PM +0800, Haochen Jiang wrote:
> * builtins.cc (expand_builtin_prefetch): Use IN_RANGE to
> avoid second usage of INTVAL.
I doubt this has been actually tested.
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1297,7 +1297,7 @@ expand_builtin_prefetch (tree exp)
> else
> op1 = expand_normal (arg1);
> /* Argument 1 must be 0, 1 or 2. */
> - if (INTVAL (op1) < 0 || INTVAL (op1) > 2)
> + if (IN_RANGE (INTVAL (op1), 0, 2))
> {
> warning (0, "invalid second argument to %<__builtin_prefetch%>;"
> " using zero");
> @@ -1315,7 +1315,7 @@ expand_builtin_prefetch (tree exp)
> else
> op2 = expand_normal (arg2);
> /* Argument 2 must be 0, 1, 2, or 3. */
> - if (INTVAL (op2) < 0 || INTVAL (op2) > 3)
> + if (IN_RANGE (INTVAL (op2), 0, 3))
> {
> warning (0, "invalid third argument to %<__builtin_prefetch%>; using zero");
> op2 = const0_rtx;
because it inverts the tests, previously it was warning when op1 wasn't
0, 1, 2, now it warns when it is 0, 1 or 2, previously it was warning
when op2 wasn't 0, 1, 2 or 3, now it warns when it is 0, 1, 2, or 3.
Fixed thusly.
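The inversion is easy to reproduce outside of GCC. The sketch below uses a
simplified stand-in for the IN_RANGE macro (unsigned long long instead of
unsigned HOST_WIDE_INT) and three hypothetical helper functions for the
original condition, the broken one and the fixed one, here for the 0..2 range
of the second argument.

```c
/* Simplified stand-in for GCC's IN_RANGE: true if LOWER <= VALUE <= UPPER.  */
#define IN_RANGE(VALUE, LOWER, UPPER) \
  ((unsigned long long) (VALUE) - (unsigned long long) (LOWER) \
   <= (unsigned long long) (UPPER) - (unsigned long long) (LOWER))

/* Original condition: warn when the value is outside 0..2.  */
int should_warn_old (long long v)
{
  return v < 0 || v > 2;
}

/* Broken rewrite: warns exactly when the value is valid.  */
int should_warn_broken (long long v)
{
  return IN_RANGE (v, 0, 2);
}

/* Fixed rewrite: equivalent to the original condition.  */
int should_warn_fixed (long long v)
{
  return !IN_RANGE (v, 0, 2);
}
```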
2024-11-01 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/117407
* builtins.cc (expand_builtin_prefetch): Use !IN_RANGE rather
than IN_RANGE.
Andrew MacLeod [Thu, 31 Oct 2024 19:44:15 +0000 (15:44 -0400)]
Make fur_edge accessible.
Move the decl of fur_edge out of the source file into the header file.
* gimple-range-fold.cc (class fur_edge): Relocate from here.
(fur_edge::fur_edge): Also move to:
* gimple-range-fold.h (class fur_edge): Relocate to here.
(fur_edge::fur_edge): Likewise.
Jakub Jelinek [Fri, 1 Nov 2024 18:50:28 +0000 (19:50 +0100)]
c++: Adjust docs and option descriptions for the publishing of C++23
Now that C++23 has been finally published, the following patch attempts
to mention it in the option descriptions and documentation.
Given that it has been published about 1.5 years after being finalized
and has the 14882:2024 document number pair rather than :2023, I wasn't
sure when exactly to use 2023 (as informal name) and when 2024 (as year
of publishing), so I've tried to use 2024 in standards.texi which talks
more formally about the standards and a note that it has been published
in 2024 when it is talked about more informally.
I remember at least one older edition was published in January too,
but the ISO pages pretend it was still published in December of the previous
year; in this case they don't.
2024-11-01 Jakub Jelinek <jakub@redhat.com>
gcc/
* doc/standards.texi (C++ Language): Mention also the 2024
revision and -std=gnu++23 option.
* doc/invoke.texi (-std=): Adjust description of c++23, c++2b,
gnu++23 and gnu++2b now that ISO C++ 14882:2024 is published.
gcc/c-family/
* c.opt (std=c++2b, std=c++23, std=gnu++2b, std=gnu++23): Adjust
description now that ISO C++ 14882:2024 is published.
Jakub Jelinek [Fri, 1 Nov 2024 18:42:28 +0000 (19:42 +0100)]
c++: Attempt to implement C++26 P3034R1 - Module Declarations Shouldn't be Macros [PR114461]
This is an attempt to implement the https://wg21.link/p3034r1 paper,
but I'm afraid the wording in the paper is bad for multiple reasons.
I think I understand the intent, that the module name and partition
if any shouldn't come from macros so that they can be scanned for
without preprocessing, but on the other side doesn't want to disable
macro expansion in pp-module altogether, because e.g. the optional
attribute in module-declaration would be nice to come from macros
as which exact attribute is needed might need to be decided based on
preprocessor checks.
The paper added https://eel.is/c++draft/cpp.module#2
which uses partly the wording from https://eel.is/c++draft/cpp.module#1
The first issue I see is that using that "defined as an object-like macro"
from there means IMHO something very different in those 2 paragraphs.
As per https://eel.is/c++draft/cpp.pre#7.sentence-1 preprocessing tokens
in preprocessing directives aren't subject to macro expansion unless
otherwise stated, and so the export and module tokens aren't expanded
and so the requirement that they aren't defined as an object-like macro
makes perfect sense. The problem with the new paragraph is that
https://eel.is/c++draft/cpp.module#3.sentence-1 says that the rest of
the tokens are macro expanded and after macro expansion none of the
tokens can be defined as an object-like macro, if they would be, they'd
be expanded to that. So, I think either the wording needs to change
such that not all preprocessing tokens after module are macro expanded,
only those which are after the pp-module-name and if any pp-module-partition
tokens, or all tokens after module are macro expanded but none of the tokens in
pp-module-name and pp-module-partition if any must come from macro
expansion. The patch below implements it as if the former would be
specified (but see later), so essentially scans the preprocessing tokens
after module without expansion, if the first one is an identifier, it
disables expansion for it and then if followed by . or : expects another
such identifier (again with disabled expansion), but stops after second
: is seen.
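The scanning rule just described can be modelled with a small C function over
an array of token spellings. This is a simplified sketch, not the code in
lex.cc: the function names are hypothetical, tokens are plain strings, and it
only counts how many leading tokens would be treated as module name and
partition (and so not macro expanded), without issuing diagnostics or handling
the private-module-fragment.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Crude identifier check: starts with a letter or underscore.  */
static int is_identifier (const char *s)
{
  return isalpha ((unsigned char) s[0]) || s[0] == '_';
}

/* Count the tokens after "module" that form the name and partition:
   identifier, then any number of ". identifier" or ": identifier" pairs,
   stopping when a second colon is seen.  */
size_t module_name_token_count (const char *const *tok, size_t n)
{
  size_t i;
  int colons = 0;

  if (n == 0 || !is_identifier (tok[0]))
    return 0;
  i = 1;
  while (i + 1 < n
         && (strcmp (tok[i], ".") == 0 || strcmp (tok[i], ":") == 0)
         && is_identifier (tok[i + 1]))
    {
      if (tok[i][0] == ':' && ++colons > 1)
        break;                  /* second colon ends the scan */
      i += 2;
    }
  return i;
}
```

For the tokens foo . bar : baz qux this returns 5, leaving qux as the first
token that may still be macro expanded.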
Second issue is that while the global-module-fragment start is fine, matches
the syntax of the new paragraph where the pp-tokens[opt] aren't present,
there is also private-module-fragment in the syntax where module is
followed by : private ; and in that case the colon doesn't match the
pp-module-name grammar and appears now to be invalid. I think the
https://eel.is/c++draft/cpp.module#2
paragraph needs to change so that it allows also that pp-tokens of
a pp-module may also be : pp-tokens[opt] (and in that case, I think
the colon shouldn't come from a macro and private and/or ; can).
Third issue is that there are too many pp-tokens in
https://eel.is/c++draft/cpp.module , one is all the tokens between
module keyword and the semicolon and one is the optional extra tokens
after pp-module-partition (if any, if missing, after pp-module).
Perhaps introducing some other non-terminal would help talking about it?
So in "where the pp-tokens (if any) shall not begin with a ( preprocessing
token" it isn't obvious which pp-tokens it is talking about (my assumption
is the latter) and also whether ( can't appear there just before macro
expansion or also after expansion. The patch expects only before expansion,
so
#define F ();
export module foo F
would be valid during preprocessing but obviously invalid during
compilation, but
#define foo(n) n;
export module foo (3)
would be invalid already during preprocessing.
The last issue applies only if the first issue is resolved to allow
expansion of tokens after : if first token, or after pp-module-partition
if present or after pp-module-name if present. When non-preprocessing
scanner sees
export module foo.bar:baz.qux;
it knows nothing can come from preprocessing macros and is ok, but if it
sees
export module foo.bar:baz qux
then it can't know whether it will be
export module foo.bar:baz;
or
export module foo.bar:baz [[]];
or
export module foo.bar:baz.freddy.garply;
because qux could be validly a macro, which expands to ; or [[]];
or .freddy.garply; etc. So, either the non-preprocessing scanner would
need to note it as possible export of foo.bar:baz* module partitions
and preprocess if it needs to know the details or just compile, or if that
is not ok, the wording would need to require that the expansion of (the
second) pp-tokens, if any, can't start with . or : (a colon would only be
problematic if it isn't already present in the tokens before it).
So, if e.g. defining qux above as . whatever is invalid, then the scanner
can rely on seeing the whole module name and partition.
The patch below implements what is above described as the first variant
of the first issue resolution, i.e. disables expansion of as many tokens
as could be in the valid module name and module partition syntax, but
as soon as it e.g. sees two adjacent identifiers, the second one can be
macro expanded. If it is macro expanded though, the expansion can't
start with . or :, and if it expands to nothing, tokens after it (whether
they come from macro expansion or not) can't start with . or :.
So, effectively:
#define SEMI ;
export module SEMI
used to be valid and isn't anymore,
#define FOO bar
export module FOO;
isn't valid,
#define COLON :
export module COLON private;
isn't valid,
#define BAR baz
export module foo.bar:baz.qux.BAR;
isn't valid,
#define BAZ .qux
export module foo BAZ;
isn't valid,
#define FREDDY :garply
export module foo FREDDY;
isn't valid,
while
#define QUX [[]]
export module foo QUX;
or
#define GARPLY private
module : GARPLY;
etc. is.
2024-11-01 Jakub Jelinek <jakub@redhat.com>
PR c++/114461
libcpp/
* include/cpplib.h: Implement C++26 P3034R1
- Module Declarations Shouldn’t be Macros (or more precisely
its expected intent).
(NO_DOT_COLON): Define.
* internal.h (struct cpp_reader): Add diagnose_dot_colon_from_macro_p
member.
* lex.cc (cpp_maybe_module_directive): For pp-module, if
module keyword is followed by CPP_NAME, ensure all CPP_NAME
tokens possibly matching module name and module partition
syntax aren't expanded and aren't defined as object-like macros.
Verify first token after that doesn't start with open paren.
If the next token after module name/partition is CPP_NAME defined
as macro, set NO_DOT_COLON flag on it.
* macro.cc (cpp_get_token_1): Set
pfile->diagnose_dot_colon_from_macro_p if token to be expanded has
NO_DOT_COLON bit set in flags. Before returning, if
pfile->diagnose_dot_colon_from_macro_p is true and not returning
CPP_PADDING or CPP_COMMENT and not during macro expansion preparation,
set pfile->diagnose_dot_colon_from_macro_p to false and diagnose
if returning CPP_DOT or CPP_COLON.
gcc/testsuite/
* g++.dg/modules/cpp-7.C: New test.
* g++.dg/modules/cpp-8.C: New test.
* g++.dg/modules/cpp-9.C: New test.
* g++.dg/modules/cpp-10.C: New test.
* g++.dg/modules/cpp-11.C: New test.
* g++.dg/modules/cpp-12.C: New test.
* g++.dg/modules/cpp-13.C: New test.
* g++.dg/modules/cpp-14.C: New test.
* g++.dg/modules/cpp-15.C: New test.
* g++.dg/modules/cpp-16.C: New test.
* g++.dg/modules/cpp-17.C: New test.
* g++.dg/modules/cpp-18.C: New test.
* g++.dg/modules/cpp-19.C: New test.
* g++.dg/modules/cpp-20.C: New test.
* g++.dg/modules/pmp-4.C: New test.
* g++.dg/modules/pmp-5.C: New test.
* g++.dg/modules/pmp-6.C: New test.
* g++.dg/modules/token-6.C: New test.
* g++.dg/modules/token-7.C: New test.
* g++.dg/modules/token-8.C: New test.
* g++.dg/modules/token-9.C: New test.
* g++.dg/modules/token-10.C: New test.
* g++.dg/modules/token-11.C: New test.
* g++.dg/modules/token-12.C: New test.
* g++.dg/modules/token-13.C: New test.
* g++.dg/modules/token-14.C: New test.
* g++.dg/modules/token-15.C: New test.
* g++.dg/modules/token-16.C: New test.
* g++.dg/modules/dir-only-3.C: Expect an error.
* g++.dg/modules/dir-only-4.C: Expect an error.
* g++.dg/modules/dir-only-5.C: New test.
* g++.dg/modules/atom-preamble-2_a.C: In export module malcolm;
replace malcolm with kevin. Don't define malcolm macro.
* g++.dg/modules/atom-preamble-4.C: Expect an error.
* g++.dg/modules/atom-preamble-5.C: New test.
Xi Ruoyao [Fri, 1 Nov 2024 16:05:44 +0000 (00:05 +0800)]
testsuite: Fix up builtin-prefetch-1.c tests
How can you use "read-shared" as an identifier? It's not allowed by any
C standard version.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/builtin-prefetch-1.c (rws): Use
"read_shared" instead of "read-shared" as the identifier for
enum value.
* gcc.dg/builtin-prefetch-1.c (rws): Likewise.
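A short C illustration of the problem, with hypothetical names apart from
read_shared: a hyphen is never part of an identifier, so read-shared lexes as
three tokens (read, minus, shared) and cannot name an enumerator.

```c
/* read_shared is a single identifier and a valid enumerator name.  */
enum rw_kind { rw_read, rw_write, read_shared };

/* Here "read-shared" is an expression: the variable read minus the
   variable shared.  */
int minus_demo (void)
{
  int read = 5, shared = 3;
  return read-shared;
}
```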
Xi Ruoyao [Thu, 10 Oct 2024 18:44:27 +0000 (02:44 +0800)]
Always set SECTION_RELRO for .data.rel.ro{,.local} [PR116887]
At least two ports (hppa and loongarch) need to set SECTION_RELRO for
.data.rel.ro{,.local} in section_type_flags (PR52999 and PR116887), and
I cannot see a reason not to just set it in the generic code.
With this applied we can also remove the hppa-specific
pa_section_type_flags in a future patch.
gcc/ChangeLog:
PR target/116887
* varasm.cc (default_section_type_flags): Always set
SECTION_RELRO if name is .data.rel.ro{,.local}.
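A stand-alone sketch of the name check, assuming a plain string comparison. The
flag value and the function name are hypothetical and are not taken from
varasm.cc or output.h.

```c
#include <string.h>

/* Hypothetical flag value, for illustration only.  */
#define SECTION_RELRO 0x2u

/* Return SECTION_RELRO for the two relro data section names, else 0.  */
unsigned relro_flag_for (const char *name)
{
  if (strcmp (name, ".data.rel.ro") == 0
      || strcmp (name, ".data.rel.ro.local") == 0)
    return SECTION_RELRO;
  return 0;
}
```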
Jakub Jelinek [Fri, 1 Nov 2024 10:57:32 +0000 (11:57 +0100)]
openmp: Return error_mark_node from tsubst_attribute for erroneous varid
We incorrectly accept some invalid declare variant cases as if declare
variant wasn't there, in particular if a function template has some dependent
arguments and variant name lookup fails, because that is during
fn_type_unification with complain=tf_none, it just sets it to error_mark_node
and doesn't complain further, because it doesn't know the substitution failed
(we don't return error_mark_node from tsubst_attribute, just create TREE_LIST
with error_mark_node TREE_PURPOSE).
The following patch fixes it by returning error_mark_node in that case, then
fn_type_unification caller can see it failed and can redo it with explain_p
so that errors are reported.
2024-11-01 Jakub Jelinek <jakub@redhat.com>
* pt.cc (tsubst_attribute): For "omp declare variant base" attribute
if varid is error_mark_node, set val to error_mark_node rather than
creating a TREE_LIST with error_mark_node TREE_PURPOSE.
Paul Thomas [Fri, 1 Nov 2024 07:45:00 +0000 (07:45 +0000)]
Fortran: Fix problems with substring selectors in ASSOCIATE [PR115700]
2024-11-01 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/115700
* resolve.cc (resolve_assoc_var): Extract a substring reference
with missing as well as non-constant start or end.
gcc/testsuite/
PR fortran/115700
* gfortran.dg/associate_69.f90: Activate commented out tests.
* gfortran.dg/associate_70.f90: Test correct functioning of
references in associate_69.f90 tests.