Julian Brown [Tue, 23 May 2023 09:37:00 +0000 (09:37 +0000)]
OpenMP/OpenACC: Reorganise OMP map clause handling in gimplify.cc
This patch has been separated out from the C++ "declare mapper"
support patch. It contains just the gimplify.cc rearrangement
work, mostly moving gimplification from gimplify_scan_omp_clauses
to gimplify_adjust_omp_clauses for map clauses.
The motivation for doing this was that we don't know if we need to
instantiate mappers implicitly until the body of an offload region has
been scanned, i.e. in gimplify_adjust_omp_clauses, but we also need the
un-gimplified form of clauses to sort by base-pointer dependencies after
mapper instantiation has taken place.
The patch also reimplements the "present" clause sorting code to avoid
another sorting pass on mapping nodes.
This version of the patch is based on the version posted for og13, and
additionally incorporates a follow-on fix for DECL_VALUE_EXPR handling
in gimplify_adjust_omp_clauses:
"OpenMP/OpenACC: Reorganise OMP map clause handling in gimplify.cc"
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622223.html
Parts of:
"OpenMP: OpenMP 5.2 semantics for pointers with unmapped target"
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623351.html
2023-12-16 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.cc (omp_segregate_mapping_groups): Handle "present" groups.
(gimplify_scan_omp_clauses): Use mapping group functionality to
iterate through mapping nodes. Remove most gimplification of
OMP_CLAUSE_MAP nodes from here, but still populate ctx->variables
splay tree.
(gimplify_adjust_omp_clauses): Move most gimplification of
OMP_CLAUSE_MAP nodes here.
Alex Coplan [Thu, 21 Dec 2023 10:52:44 +0000 (10:52 +0000)]
aarch64: Prevent moving throwing accesses in ldp/stp pass [PR113093]
As the PR shows, there was nothing to prevent the ldp/stp pass from
trying to move throwing insns, which led to an RTL verification
failure.
This patch fixes that.
gcc/ChangeLog:
PR target/113093
* config/aarch64/aarch64-ldp-fusion.cc (latest_hazard_before):
If the insn is throwing, record the previous insn as a hazard to
prevent moving it from the end of the BB.
I notice we get much better codegen and a performance improvement with --param=riscv-autovec-lmul=dynamic,
which is able to pick the best LMUL (M2).
Add a test so that future changes do not accidentally regress performance on X264.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: New test.
Jakub Jelinek [Thu, 21 Dec 2023 10:20:17 +0000 (11:20 +0100)]
Fix -Wcalloc-transposed-args warning in collect2.cc and work around -Walloc-size warning
This fixes one warning and works around another one where we allocate less than
what we cast to.
2023-12-21 Jakub Jelinek <jakub@redhat.com>
* gimple-fold.cc (maybe_fold_comparisons_from_match_pd):
Use unsigned char buffers for lhs1 and lhs2 instead of allocating
them through XALLOCA.
* collect2.cc (maybe_run_lto_and_relink): Swap xcalloc arguments.
aarch64: Fix early RA handling of deleted insns [PR113094]
The testcase constructs a sequence of insns that are fully dead
and yet (due to forced options) are not removed as such. This
triggered a case where we would emit a meaningless reload for a
to-be-deleted insn.
We can't delete the insns first because that might disrupt the
iteration ranges. So this patch turns them into notes before
the walk and then continues to delete them properly afterwards.
gcc/
PR target/113094
* config/aarch64/aarch64-early-ra.cc (apply_allocation): Stub
out instructions that are going to be deleted before iterating
over the rest.
gcc/testsuite/
PR target/113094
* gcc.target/aarch64/pr113094.c: New test.
Jakub Jelinek [Thu, 21 Dec 2023 10:17:08 +0000 (11:17 +0100)]
c++: Enable -Walloc-size and -Wcalloc-transposed-args warnings for C++
The following patch enables the -Walloc-size and -Wcalloc-transposed-args
warnings for C++ as well.
Tracking just 6 arguments for SIZEOF_EXPR for the calloc purposes
is because I see alloc_size 1,2, 2,3 and 3,4 pairs used in the wild,
so we need at least 5 to cover that rather than 3, and don't want to waste
too much compile time/memory for that.
2023-12-21 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c.opt (Walloc-size): Enable also for C++ and ObjC++.
gcc/cp/
* cp-gimplify.cc (cp_genericize_r): If warn_alloc_size, call
warn_for_alloc_size for -Walloc-size diagnostics.
* semantics.cc (finish_call_expr): If warn_calloc_transposed_args,
call warn_for_calloc for -Wcalloc-transposed-args diagnostics.
gcc/testsuite/
* g++.dg/warn/Walloc-size-1.C: New test.
* g++.dg/warn/Wcalloc-transposed-args-1.C: New test.
Jakub Jelinek [Thu, 21 Dec 2023 10:14:55 +0000 (11:14 +0100)]
ubsan: Add workaround for missing bitint libubsan support for shifts [PR113092]
libubsan still doesn't support bitints, so ubsan contains a workaround and
emits value 0 and TK_Unknown kind for those. If the second shift operand has
a large/huge _BitInt type, though, this results in internal errors in libubsan,
so the following patch provides a temporary workaround for that:
in the rare case where the last operand has _BitInt type wider than
__int128 (or long long on 32-bit arches), it will pretend the shift count
has that type saturated to its range. IMHO better than crashing in
the library. If the value fits into the __int128 (or long long) range,
it will be printed correctly (except that it is reported as having __int128/long long
type rather than, say, _BitInt(255)); if it doesn't, the user will at least
know that it is a very large negative or very large positive value.
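To illustrate the saturation idea only (this is not the code in c-ubsan.cc), the sketch below clamps a wide signed value to the range of a narrower type. int64_t and int32_t are stand-ins chosen for portability where the patch deals with a wide _BitInt and __int128 (or long long), and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: clamp WIDE to the range of int32_t.  int64_t and
   int32_t are stand-ins for a wide _BitInt and __int128 respectively.  */
int32_t
saturate_to_narrow (int64_t wide)
{
  if (wide > INT32_MAX)
    return INT32_MAX;       /* Very large positive value.  */
  if (wide < INT32_MIN)
    return INT32_MIN;       /* Very large negative value.  */
  return (int32_t) wide;    /* Fits: the value is preserved exactly.  */
}
```

A value that fits is reported unchanged; anything else collapses to one of the two range limits, which is what lets the user tell "very large positive" from "very large negative".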
2023-12-21 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/113092
* c-ubsan.cc (ubsan_instrument_shift): Workaround for missing
ubsan _BitInt support for the shift count.
Jakub Jelinek [Thu, 21 Dec 2023 10:13:42 +0000 (11:13 +0100)]
lower-bitint: Avoid nested casts in muldiv/float operands [PR112941]
Multiplication/division/modulo/float operands are handled by libgcc calls
and so need to be passed as array of limbs with precision argument,
using handle_operand_addr. That code can't deal with more than one cast,
so the following patch avoids merging those cases.
.MUL_OVERFLOW calls use the same code, but we don't actually try to merge
the operands in that case already.
2023-12-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112941
* gimple-lower-bitint.cc (gimple_lower_bitint): Disallow merging
a cast with multiplication, division or conversion to floating point
if rhs1 of the cast is result of another single use cast in the same
bb.
* gcc.dg/bitint-56.c: New test.
* gcc.dg/bitint-57.c: New test.
chenxiaolong [Tue, 19 Dec 2023 08:43:17 +0000 (16:43 +0800)]
LoongArch: Fix builtin function prototypes for LASX in doc.
gcc/ChangeLog:
* doc/extend.texi: Fix two problems found in the previously submitted
documentation: wrong function return types, and variable names used
where the actual parameter types belong.
chenxiaolong [Wed, 13 Dec 2023 01:31:07 +0000 (09:31 +0800)]
LoongArch: Modify the check type of the vector builtin function.
Regression testing with the latest GCC 14 on LoongArch shows FAIL
entries for the test cases in the vector directory, caused by
mismatched pointer types. To fix this, the types of the variables
in the result check are changed to match the parameter types defined
for the vector builtin functions.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/simd_correctness_check.h: Change the
variable types in the result check to match the parameter types
defined for the vector builtin functions.
Jiahao Xu [Thu, 14 Dec 2023 12:49:04 +0000 (20:49 +0800)]
LoongArch: Fix incorrect code generation for sad pattern
When I attempt to enable the vect_usad_char effective target for LoongArch, the slp-reduc-sad.c
and vect-reduc-sad*.c tests fail because the sad pattern generates bad
code. This patch fixes them: for sad patterns, use zero extension instead of sign
extension for the reduction.
Currently, we are fixing failing vectorized tests; in the future, we will
enable more "vect" tests for LoongArch.
gcc/ChangeLog:
* config/loongarch/lasx.md: Use zero extension instruction.
* config/loongarch/lsx.md: Ditto.
Martin Uecker [Tue, 15 Aug 2023 20:38:14 +0000 (22:38 +0200)]
c23: aliasing of compatible tagged types
Tell the backend which types are equivalent by setting
TYPE_CANONICAL to one struct in the set of equivalent
structs. Structs are considered equivalent by ignoring
all sizes of arrays nested in types below field level.
The following two structs are incompatible and lvalues
with these types can be assumed not to alias:
struct foo { int a[3]; };
struct foo { int a[4]; };
The following two structs are also incompatible, but
will get the same TYPE_CANONICAL and it is then not
exploited that lvalues with those types can not alias:
struct bar { int (*p)[3]; };
struct bar { int (*p)[4]; };
The reason is that both are compatible with
struct bar { int (*p)[]; };
and therefore are in the same equivalence class. For
the same reason all enums with the same underlying type
are in the same equivalence class. Tests are added
for the expected aliasing behavior with optimization.
gcc/c:
* c-decl.cc (c_struct_hasher): Hash stable for struct
types.
(c_struct_hasher::hash, c_struct_hasher::equal): New
functions.
(finish_struct): Set TYPE_CANONICAL to first struct in
equivalence class.
* c-objc-common.cc (c_get_alias_set): Let structs or
unions with variable size alias anything.
* c-tree.h (comptypes_equiv): New prototype.
* c-typeck.cc (comptypes_equiv): New function.
(comptypes_internal): Implement equivalence mode.
(tagged_types_tu_compatible_p): Implement equivalence mode.
gcc/testsuite:
* gcc.dg/c23-tag-2.c: Activate.
* gcc.dg/c23-tag-5.c: Activate.
* gcc.dg/c23-tag-alias-1.c: New test.
* gcc.dg/c23-tag-alias-2.c: New test.
* gcc.dg/c23-tag-alias-3.c: New test.
* gcc.dg/c23-tag-alias-4.c: New test.
* gcc.dg/c23-tag-alias-5.c: New test.
* gcc.dg/gnu23-tag-alias-1.c: New test.
* gcc.dg/gnu23-tag-alias-2.c: New test.
* gcc.dg/gnu23-tag-alias-3.c: New test.
* gcc.dg/gnu23-tag-alias-4.c: New test.
* gcc.dg/gnu23-tag-alias-5.c: New test.
* gcc.dg/gnu23-tag-alias-6.c: New test.
* gcc.dg/gnu23-tag-alias-7.c: New test.
Martin Uecker [Tue, 15 Aug 2023 21:16:35 +0000 (23:16 +0200)]
c23: tag compatibility rules for enums
Allow redefinition of enum types and enumerators. Diagnose
nested redefinitions including redefinitions in the enum
specifier for enum types with fixed underlying type.
gcc/c:
* c-tree.h (c_parser_enum_specifier): Add parameter.
* c-decl.cc (start_enum): Allow redefinition.
(finish_enum): Diagnose conflicts.
(build_enumerator): Set context.
(diagnose_mismatched_decls): Diagnose conflicting enumerators.
(push_decl): Preserve context for enumerators.
* c-typeck.cc (tagged_types_tu_compatible_p): Adapt.
* c-parser.cc (c_parser_enum_specifier): Remember when
seen is from an enum type which is not yet defined.
gcc/testsuite:
* gcc.dg/c23-tag-enum-1.c: New test.
* gcc.dg/c23-tag-enum-2.c: New test.
* gcc.dg/c23-tag-enum-3.c: New test.
* gcc.dg/c23-tag-enum-4.c: New test.
* gcc.dg/c23-tag-enum-5.c: New test.
* gcc.dg/gnu23-tag-enum-1.c: New test.
Martin Uecker [Tue, 15 Aug 2023 12:58:32 +0000 (14:58 +0200)]
c23: tag compatibility rules for struct and unions
Implement redeclaration and compatibility rules for
structures and unions in C23.
gcc/c/:
* c-decl.cc (previous_tag): New function.
(parser_xref_tag): Find earlier definition.
(get_parm_info): Turn off warning for C23.
(start_struct): Allow redefinitions.
(finish_struct): Diagnose conflicts.
* c-tree.h (comptypes_same_p): Add prototype.
* c-typeck.cc (comptypes_same_p): New function.
(comptypes_internal): Activate comparison of tagged types.
(convert_for_assignment): Ignore qualifiers.
(digest_init): Add error.
(initialized_elementwise_p): Allow compatible types.
gcc/testsuite/:
* gcc.dg/c23-enum-7.c: Remove warning.
* gcc.dg/c23-tag-1.c: New test.
* gcc.dg/c23-tag-2.c: New deactivated test.
* gcc.dg/c23-tag-3.c: New test.
* gcc.dg/c23-tag-4.c: New test.
* gcc.dg/c23-tag-5.c: New deactivated test.
* gcc.dg/c23-tag-6.c: New test.
* gcc.dg/c23-tag-7.c: New test.
* gcc.dg/c23-tag-8.c: New test.
* gcc.dg/gnu23-tag-1.c: New test.
* gcc.dg/gnu23-tag-2.c: New test.
* gcc.dg/gnu23-tag-3.c: New test.
* gcc.dg/gnu23-tag-4.c: New test.
* gcc.dg/pr112488-2.c: Remove warning.
Kewen Lin [Thu, 21 Dec 2023 05:20:19 +0000 (23:20 -0600)]
sel-sched: Verify change before replacing dest in EXPR_INSN_RTX [PR112995]
PR112995 exposed an issue in the current try_replace_dest_reg:
the insn resulting from replace_dest_with_reg_in_expr
may be unable to match any constraints. Although there
are some checks on the changes to the dest or src of orig_insn,
none is performed on the EXPR_INSN_RTX.
The dest (reg 64) is a VR (also a VSX REG); the update turns
it into (reg 32), which is an FPR (also a VSX REG). We have
an alternative that matches "VR,VR" but none that matches "FPR/VSX,
VR/VSX", so it fails with an ICE.
This patch adds the check before actually replacing dest
in expr with reg.
PR rtl-optimization/112995
gcc/ChangeLog:
* sel-sched.cc (try_replace_dest_reg): Check the validity of the
replaced insn before actually replacing dest in expr.
Alexandre Oliva [Wed, 20 Dec 2023 05:31:57 +0000 (02:31 -0300)]
compare_tests: distinguish c-c++-common results by tool
When compare_tests compares both C and C++ tests in c-c++-common, they
get the same identifier, so expected differences in results across
languages become undesirably noisy.
This patch adds tool identifiers to tests, so that runs by different
tools are not confused by the compare logic.
It also fixes a bug in reporting differences, that would attempt to
print an undefined fname (the definitions are in subshell loops), and
adjusts the target insertion to match tabs in addition to blanks after
colons.
for contrib/ChangeLog
* compare_tests: Add tool to test lines. Match tabs besides
blanks to insert tool and target. Don't print undefined fname.
Kewen Lin [Thu, 21 Dec 2023 03:21:54 +0000 (21:21 -0600)]
sched: Remove debug counter sched_block
Currently the debug counter sched_block doesn't work well:
we create dependencies for some insns and expect those
dependencies to be resolved while scheduling
insns, but they can be left unresolved when a
block is skipped because of the sched_block debug counter.
For example, for the below test case:
--
int a, b, c, e, f;
float d;
void
g ()
{
float h, i[1];
for (; f;)
if (c)
{
d *e;
if (b)
{
float *j = i;
j[0] = 0;
}
h = d;
}
if (h)
a = i[0];
}
--
An ICE occurs with the options "-O2 -fdbg-cnt=sched_block:1".
As discussed in [1], the consensus seems to be that this debug
counter is useless and can be removed. It was also implied
that if it were useful and used often, the above issue would
have been noticed and resolved earlier. So this patch
removes this debug counter.
Jason Merrill [Tue, 19 Dec 2023 20:33:07 +0000 (15:33 -0500)]
opts: -Werror=foo always implies -Wfoo [PR106213]
-Werror=foo implying -Wfoo wasn't working for -Wdeprecated-copy-dtor,
because it is specified as the value 2 of warn_deprecated_copy, which shows
up as CLVC_EQUAL, which is not one of the three var_types handled by
control_warning_option. It seems to me that we can just unconditionally
call handle_generated_option, and only have special argument handling for those
types.
PR c++/106213
gcc/ChangeLog:
* opts-common.cc (control_warning_option): Call
handle_generated_option for all cl_var_types.
Jason Merrill [Tue, 12 Dec 2023 00:38:32 +0000 (19:38 -0500)]
contrib: add git gcc-style alias
I thought it could be easier to use check_GNU_style.py. With this alias,
'git gcc-style' will take a git revision as argument instead of a file, or
check HEAD if no argument is given.
Julian Brown [Wed, 23 Aug 2023 23:46:29 +0000 (23:46 +0000)]
OpenMP, NVPTX: memcpy[23]D bias correction
This patch works around behaviour of the 2D and 3D memcpy operations in
the CUDA driver runtime. Particularly in Fortran, the "base pointer"
of an array (used for either source or destination of a host/device copy)
may lie outside of data that is actually stored on the device. The fix
is to make sure that we use the first element of data to be transferred
instead, and adjust parameters accordingly.
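A minimal sketch of the bias correction described above, not the code in plugin-nvptx.c: the struct, its field names and the function names are hypothetical, and addresses are modelled as plain integers. The offsets are folded into the base so that the base addresses the first element actually transferred.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one side (source or destination) of a 2D
   copy; the field names are illustrative, not the CUDA driver's.  */
struct copy2d_side
{
  uintptr_t base;     /* Base address; may lie outside the mapped data.  */
  size_t x_in_bytes;  /* Byte offset of the first column.  */
  size_t y;           /* Index of the first row.  */
  size_t pitch;       /* Bytes between the starts of consecutive rows.  */
};

/* Fold the offsets into the base so that it addresses the first element
   actually transferred, then zero the offsets.  */
void
fold_bias (struct copy2d_side *s)
{
  s->base += s->y * s->pitch + s->x_in_bytes;
  s->x_in_bytes = 0;
  s->y = 0;
}

/* Example: base 1000, row 3, column byte 8, pitch 64.  */
uintptr_t
example_biased_base (void)
{
  struct copy2d_side s = { 1000, 8, 3, 64 };
  fold_bias (&s);
  return s.base;
}
```

After fold_bias the same bytes are addressed, but the base no longer points outside the transferred region. The 3D case would add a plane offset in the same way.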
2023-10-02 Julian Brown <julian@codesourcery.com>
libgomp/
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d): Adjust parameters to
avoid out-of-bounds array checks in CUDA runtime.
(GOMP_OFFLOAD_memcpy3d): Likewise.
* testsuite/libgomp.c-c++-common/memcpyxd-bias-1.c: New test.
Fortran: Use non conflicting file extensions for intermediates [PR81615]
gcc/ChangeLog:
* doc/invoke.texi: Document the new file extensions.
gcc/fortran/ChangeLog:
PR fortran/81615
* lang-specs.h (F951_CPP_OPTIONS): Do not hardcode ".f90" extension.
(F951_CPP_EXTENSION): Use .fi/.fii for fixed/free form sources.
* options.cc (form_from_filename): Handle the new extensions.
cse: Fix handling of fake vec_select sets [PR111702]
If cse sees:
(set (reg R) (const_vector [A B ...]))
it creates fake sets of the form:
(set R[0] A)
(set R[1] B)
...
(with R[n] replaced by appropriate rtl) and then adds them to the tables
in the same way as for normal sets. This allows a sequence like:
(set (reg R2) A)
...(reg R2)...
to try to use R[0] instead of (reg R2).
But the pass was taking the analogy too far, and was trying to simplify
these fake sets based on costs. That is, if there was an earlier:
(set (reg T) A)
the pass would go to considerable effort trying to work out whether:
(set R[0] A)
or:
(set R[0] (reg T))
was more profitable. This included running validate*_change on the sets,
which has no meaning given that the sets are not part of the insn.
In this example, the equivalence A == T is already known, and the
purpose of the fake sets is to add A == T == R[0]. We can do that
just as easily (or, as the PR shows, more easily) if we keep the
original form of the fake set, with A instead of T.
The problem in the PR occurred if we had:
(1) something that establishes an equivalence between a vector V1 of
M-bit scalar integers and a hard register H
(2) something that establishes an equivalence between a vector V2 of
N-bit scalar integers, where N<M and where V2 contains at least 2
instances of V1[0]
(1) established an equivalence between V1[0] and H in M bits.
(2) then triggered a search for an equivalence of V1[0] in N bits.
This included:
/* See if we have a CONST_INT that is already in a register in a
wider mode. */
which (correctly) found that the low N bits of H contain the right value.
But because it came from a wider mode, this equivalence between N-bit H
and N-bit V1[0] was not yet in the hash table. It therefore survived
the purge in:
/* At this point, ELT, if nonzero, points to a class of expressions
equivalent to the source of this SET and SRC, SRC_EQV, SRC_FOLDED,
and SRC_RELATED, if nonzero, each contain additional equivalent
expressions. Prune these latter expressions by deleting expressions
already in the equivalence class.
And since more than 1 set found the same N-bit equivalence between
H and V1[0], the pass tried to add it more than once.
Things were already wrong at this stage, but an ICE was only triggered
later when trying to merge this N-bit equivalence with another one.
We could avoid the double registration by adding:
for (elt = classp; elt; elt = elt->next_same_value)
if (rtx_equal_p (elt->exp, x))
return elt;
to insert_with_costs, or by making cse_insn check whether previous
sets have recorded the same equivalence. The latter seems more
appealing from a compile-time perspective. But in this case,
doing that would be adding yet more spurious work to the handling
of fake sets.
The handling of fake sets therefore seems like the more fundamental bug.
While there, the patch also makes sure that we don't apply REG_EQUAL
notes to these fake sets. They only describe the "real" (first) set.
gcc/
PR rtl-optimization/111702
* cse.cc (set::mode): Move earlier.
(set::src_in_memory, set::src_volatile): Convert to bitfields.
(set::is_fake_set): New member variable.
(add_to_set): Add an is_fake_set parameter.
(find_sets_in_insn): Update calls accordingly.
(cse_insn): Do not apply REG_EQUAL notes to fake sets. Do not
try to optimize them either, or validate changes to them.
gcc/testsuite/
PR rtl-optimization/111702
* gcc.dg/rtl/aarch64/pr111702.c: New test.
Jason Merrill [Wed, 20 Dec 2023 16:06:27 +0000 (11:06 -0500)]
c++: throwing dtor and empty try [PR113088]
maybe_splice_retval_cleanup assumed that the function body can't be empty if
there's a throwing cleanup, but when I added cleanups to try blocks in r12-6333-gb10e031458d541 I didn't adjust that assumption.
PR c++/113088
PR c++/33799
gcc/cp/ChangeLog:
* except.cc (maybe_splice_retval_cleanup): Handle an empty block.
Jason Merrill [Tue, 19 Dec 2023 21:12:02 +0000 (16:12 -0500)]
c++: xvalue array subscript [PR103185]
Normally we handle xvalue array subscripting with ARRAY_REF, but in this
case we weren't doing that because the operands were reversed. Handle that
case better.
Andre Vieira [Wed, 20 Dec 2023 16:41:52 +0000 (16:41 +0000)]
veclower: improve selection of vector mode when lowering [PR 112787]
This patch addresses the issue reported in PR target/112787 by improving the
compute type selection. We do this by not considering types with more elements
than the type we are lowering since we'd reject such types anyway.
gcc/ChangeLog:
PR target/112787
* tree-vect-generic.cc (type_for_widest_vector_mode): Change function to
use original vector type and check widest vector mode has at most the
same number of elements.
(get_compute_type): Pass original vector type rather than the element
type to type_for_widest_vector_mode and remove now obsolete check for
the number of elements.
Narrow down scope of the unknowns bitmap so that it is only accessible
within the reexamination process. This also removes any role of unknown
propagation from object_sizes_set, thus simplifying that code path a
bit.
gcc/ChangeLog:
* tree-object-size.cc (object_size_info): Remove UNKNOWNS.
Drop all references to it.
(object_sizes_set): Move unknowns propagation code to...
(gimplify_size_expressions): ... here. Also free reexamine
bitmap.
(propagate_unknowns): New parameter UNKNOWNS. Update callers.
Based on commit 5f1bed2a7af828103ca23a3546466a23e8dd2f30 (2023-12-16), there
are a ton of progressions (for test cases not actually depending on libstdc++
symbols, obviously):
=== g++ Summary ===
# of expected passes [-178369-]{+189226+}
# of unexpected failures [-19880-]{+14089+}
# of unexpected successes 14
# of expected failures [-1684-]{+1685+}
# of unresolved testcases [-9820-]{+4837+}
# of unsupported tests [-11971-]{+11968+}
..., and only two benign "regressions":
[-UNSUPPORTED:-]{+FAIL:+} g++.dg/init/array54.C -std=c++14 {+(test for excess errors)+}
{+UNRESOLVED: g++.dg/init/array54.C -std=c++14 compilation failed to produce executable+}
[Etc.]
[...]/g++.dg/init/array54.C:5:10: fatal error: atomic: No such file or directory
That's similar to a lot of other test cases intending to '#include' standard
C++/libstdc++ headers; to be addressed in due time.
PASS: g++.old-deja/g++.pt/const2.C -std=c++98 at line 5 (test for warnings, line )
[-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/const2.C -std=c++98 (test for excess errors)
[Etc.]
ld: error: undefined symbol: A<int>::i
>>> referenced by /tmp/ccqXWCSh.o:(p)
The 'error: undefined symbol' is expected here; maybe the test case should
simply use 'dg-prune-output "referenced by"'? (This PASSed before, as the
'dg-message "i"' was satisfied by 'ld: error: unable to find library -lstdc++',
eh...)
gcc/
* config/gcn/gcn.h (LIBSTDCXX): Define to "gcc".
Richard Biener [Wed, 20 Dec 2023 12:18:51 +0000 (13:18 +0100)]
Improve DCE of dead parts of a permute chain
gcc.dg/vect/bb-slp-pr78205.c is reported to have regressed with
the PR113073 change and in the end it's due to the DCE performed
by vect_transform_slp_perm_load_1 being imperfect. The following
enhances it to also cover the CTOR and VIEW_CONVERT operations that
might be involved.
* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Also handle
CTOR and VIEW_CONVERT up to the load when performing chain DCE.
Xi Ruoyao [Mon, 18 Dec 2023 21:02:42 +0000 (05:02 +0800)]
LoongArch: Clean up vec_init expander
No functional change; just clean up the code.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_expand_vector_init_same): Remove "temp2" and reuse
"temp" instead.
(loongarch_expand_vector_init): Use gcc_unreachable () instead
of gcc_assert (0), and fix the comment for it.
Xi Ruoyao [Mon, 18 Dec 2023 20:48:03 +0000 (04:48 +0800)]
LoongArch: Use force_reg instead of gen_reg_rtx + emit_move_insn in vec_init expander [PR113033]
Jakub says:
Then that seems like a bug in the loongarch vec_init pattern(s).
Those really don't have a predicate in any of the backends on the
input operand, so they need to force_reg it if it is something it
can't handle. I've looked e.g. at i386 vec_init and that is exactly
what it does, see the various tests + force_reg calls in
ix86_expand_vector_init*.
So replace gen_reg_rtx + emit_move_insn with force_reg to fix PR 113033.
For every RTX code for which the LSX/LASX code is different from the
scalar code, the scalar code is correct and the LSX/LASX code is wrong.
Most seriously, the RTX code NE should be mapped to "cneq", not "cne".
Rewrite <x>vfcmp define_insns in simd.md using the same mapping as
scalar fcmp.
Note that GAS does not support [x]vfcmp.{c/s}[u]{ge/gt} (pseudo)
instruction (although fcmp.{c/s}[u]{ge/gt} is supported), so we need to
switch the order of inputs and use [x]vfcmp.{c/s}[u]{le/lt} instead.
The <x>vfcmp.{sult/sule/clt/cle}.{s/d} instructions do not have a single
RTX code, but they can be modeled as the inverse RTX code wrapped in a
"not" operation. Doing so allows the compiler to optimize vectorized
__builtin_isless etc. to a single instruction. This optimization should
be added for scalar code too and I'll do it later.
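As an illustration of the "inverse code plus not" modelling, here is a sketch in plain C using the standard <math.h> comparison macros; the function names are hypothetical and nothing here is taken from simd.md. A quiet "less than" is the negation of "unordered or greater-or-equal".

```c
#include <math.h>

/* isless (a, b) expressed as the negation of "unordered or a >= b".  */
int
less_via_inverted_unge (double a, double b)
{
  return !(isunordered (a, b) || isgreaterequal (a, b));
}

/* A loop of the kind a vectorizer could map onto a vector compare;
   illustrative only.  */
void
isless_array (const double *a, const double *b, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = isless (a[i], b[i]) ? 1 : 0;
}

/* Count how many of four sample pairs compare as "less".  */
int
example_isless_count (void)
{
  const double a[4] = { 1.0, 5.0, NAN, 2.0 };
  const double b[4] = { 2.0, 5.0, 1.0, 3.0 };
  int out[4];
  int count = 0;
  isless_array (a, b, out, 4);
  for (int i = 0; i < 4; i++)
    count += out[i];
  return count;
}
```

The NaN pair compares as "not less" in both formulations, and neither raises an invalid-operand exception for quiet NaNs, which is the property the quiet comparisons are meant to provide.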
Tests are added for mapping between C code, IEC 60559 operations, and
vfcmp instructions.
PR target/113034
* config/loongarch/lasx.md (UNSPEC_LASX_XVFCMP_*): Remove.
(lasx_xvfcmp_caf_<flasxfmt>): Remove.
(lasx_xvfcmp_cune_<FLASX:flasxfmt>): Remove.
(FSC256_UNS): Remove.
(fsc256): Remove.
(lasx_xvfcmp_<vfcond:fcc>_<FLASX:flasxfmt>): Remove.
(lasx_xvfcmp_<fsc256>_<FLASX:flasxfmt>): Remove.
* config/loongarch/lsx.md (UNSPEC_LSX_XVFCMP_*): Remove.
(lsx_vfcmp_caf_<flsxfmt>): Remove.
(lsx_vfcmp_cune_<FLSX:flsxfmt>): Remove.
(vfcond): Remove.
(fcc): Remove.
(FSC_UNS): Remove.
(fsc): Remove.
(lsx_vfcmp_<vfcond:fcc>_<FLSX:flsxfmt>): Remove.
(lsx_vfcmp_<fsc>_<FLSX:flsxfmt>): Remove.
* config/loongarch/simd.md
(fcond_simd): New define_code_iterator.
(<simd_isa>_<x>vfcmp_<fcond:fcond_simd>_<simdfmt>):
New define_insn.
(fcond_simd_rev): New define_code_iterator.
(fcond_rev_asm): New define_code_attr.
(<simd_isa>_<x>vfcmp_<fcond:fcond_simd_rev>_<simdfmt>):
New define_insn.
(fcond_inv): New define_code_iterator.
(fcond_inv_rev): New define_code_iterator.
(fcond_inv_rev_asm): New define_code_attr.
(<simd_isa>_<x>vfcmp_<fcond_inv>_<simdfmt>): New define_insn.
(<simd_isa>_<x>vfcmp_<fcond_inv:fcond_inv_rev>_<simdfmt>):
New define_insn.
(UNSPEC_SIMD_FCMP_CAF, UNSPEC_SIMD_FCMP_SAF,
UNSPEC_SIMD_FCMP_SEQ, UNSPEC_SIMD_FCMP_SUN,
UNSPEC_SIMD_FCMP_SUEQ, UNSPEC_SIMD_FCMP_CNE,
UNSPEC_SIMD_FCMP_SOR, UNSPEC_SIMD_FCMP_SUNE): New unspecs.
(SIMD_FCMP): New define_int_iterator.
(fcond_unspec): New define_int_attr.
(<simd_isa>_<x>vfcmp_<fcond_unspec>_<simdfmt>): New define_insn.
* config/loongarch/loongarch.cc (loongarch_expand_lsx_cmp):
Remove unneeded special cases.
gcc/testsuite/ChangeLog:
PR target/113034
* gcc.target/loongarch/vfcmp-f.c: New test.
* gcc.target/loongarch/vfcmp-d.c: New test.
* gcc.target/loongarch/xvfcmp-f.c: New test.
* gcc.target/loongarch/xvfcmp-d.c: New test.
* gcc.target/loongarch/vector/lasx/lasx-vcond-2.c: Scan for cune
instead of cne.
* gcc.target/loongarch/vector/lsx/lsx-vcond-2.c: Likewise.
demin.han [Wed, 20 Dec 2023 08:15:37 +0000 (16:15 +0800)]
RISC-V: Fix calculation of max live vregs
For the stmt _1 = _2 + _3, assume that _2 or _3 is not used after this stmt.
_1 can then use the same register as _2 or _3, provided there is no early clobber.
Two registers are needed, but the current calculation gives three.
This patch reserves point 0 for the bb entry and excludes a point's own def when
calculating the live regs at that point.
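The counting rule can be sketched as follows. This is a simplified model, not the RISC-V cost model code: the struct, the function names and the "old" inclusive counting are assumptions made for illustration, and both operands are assumed to die at the statement.

```c
/* Hypothetical live range: a value defined at program point DEF and last
   used at program point LAST_USE.  Point 0 is the block entry.  */
struct live_range
{
  int def;
  int last_use;
};

/* Count the values live at POINT.  If COUNT_DEF is zero, a value defined
   at POINT itself is excluded, so it may share a register with an operand
   that dies at POINT.  */
int
live_at (const struct live_range *r, int n, int point, int count_def)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    {
      int started = count_def ? r[i].def <= point : r[i].def < point;
      if (started && point <= r[i].last_use)
        count++;
    }
  return count;
}

/* Maximum number of simultaneously live values for _1 = _2 + _3, with
   _2 and _3 live from the block entry (point 0) and dying at point 1,
   and _1 defined at point 1 and used at point 2.  */
int
example_max_live (int count_def)
{
  const struct live_range r[3] = { { 0, 1 }, { 0, 1 }, { 1, 2 } };
  int max = 0;
  for (int p = 0; p <= 2; p++)
    {
      int live = live_at (r, 3, p, count_def);
      if (live > max)
        max = live;
    }
  return max;
}
```

Counting the def at its own point gives three for this statement; excluding it gives two, matching the number of registers actually needed.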
Jakub Jelinek [Wed, 20 Dec 2023 11:01:57 +0000 (12:01 +0100)]
i386: Make most MD builtins nothrow, leaf [PR112962]
The following patch makes most of x86 MD builtins nothrow,leaf
(like most middle-end builtins are). With -fnon-call-exceptions it
doesn't add nothrow; it might be better to still add it if the builtins
don't read or write memory and can't raise floating point exceptions,
but we don't have such information readily available, so the patch
uses just !flag_non_call_exceptions for now.
Not sure if we shouldn't have some exceptions for the leaf attribute,
e.g. wonder about EMMS/FEMMS and the various xsave/xrstor etc. builtins,
pedantically none of those builtins do anything that leaf functions
are forbidden to do (having callbacks, calling functions from current TU,
longjmp into the current TU), but sometimes non-leaf is also used on
really complex functions to prevent some unwanted optimizations.
That said, haven't run into any problems as is with the patch.
2023-12-20 Jakub Jelinek <jakub@redhat.com>
PR target/112962
* config/i386/i386-builtins.cc (ix86_builtins): Increase by one
element.
(def_builtin): If not -fnon-call-exceptions, set TREE_NOTHROW on
the builtin FUNCTION_DECL. Add leaf attribute to DECL_ATTRIBUTES.
(ix86_add_new_builtins): Likewise.
Jakub Jelinek [Wed, 20 Dec 2023 10:32:52 +0000 (11:32 +0100)]
lower-bitint: Fix up handling of nested casts in mergeable stmt handling [PR112941]
The following patch fixes 2 issues in handling of casts for mergeable
stmts.
The first hunk fixes the case when we have two nested casts (typically
after optimization that is zero-extension of a sign-extension because
everything else should have been folded into a single cast). If
the lowering of the outer cast needs to make the code conditional
(e.g.
for (...)
{
if (idx <= 32)
{
if (idx < 32)
{ ... handle_operand (idx); ... }
else
{ ... handle_operand (32); ... }
}
...
}
) and the lowering of the inner one as well, right now it creates invalid
SSA form, because even for the inner cast we need a PHI on the loop
and the PHI argument from the latch edge is a SSA_NAME initialized in
the conditionally executed bb. The hunk fixes that by detecting such
a case and adding further PHI nodes at the end of the ifs such that
the right value propagates to the next loop iteration. We can use
0 arguments for the other edges because the inner operand handling
is only done for the first set of iterations and then the other ifs take
over.
The rest fixes a case of again invalid SSA form, when for a sign extension
we need to use the 0 or -1 value initialized by earlier iteration in
a constant idx case, the code was using the value of the loop PHI argument
from latch edge rather than result; that is correct for cases expanded
in straight line code after the loop, but not inside of the loop for the
cases of handle_cast conditionals, there we should use PHI result. This
is done in the second hunk and supported by the remaining hunks, where
it clears m_bb to tell the code we aren't in the loop anymore.
Note, this patch doesn't deal with similar problems during multiplication,
division, floating casts etc. where we just emit a library call. I'll
need to make sure in that case we don't merge more than one cast per
operand.
2023-12-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112941
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): If
save_cast_conditional, instead of adding assignment of t4 to
m_data[save_data_cnt + 1] before m_gsi, add phi nodes such that
t4 propagates to m_bb loop. For constant idx, use
m_data[save_data_cnt] rather than m_data[save_data_cnt + 1] if inside
of the m_bb loop.
(bitint_large_huge::lower_mergeable_stmt): Clear m_bb when no longer
expanding inside of that loop.
(bitint_large_huge::lower_comparison_stmt): Likewise.
(bitint_large_huge::lower_addsub_overflow): Likewise.
(bitint_large_huge::lower_mul_overflow): Likewise.
(bitint_large_huge::lower_bit_query): Likewise.
Jakub Jelinek [Wed, 20 Dec 2023 10:31:18 +0000 (11:31 +0100)]
c: Split -Wcalloc-transposed-args warning from -Walloc-size, -Walloc-size fixes
The following patch changes -Walloc-size warning to no longer warn
about int *p = calloc (1, sizeof (int));, because as discussed earlier,
the size is IMNSHO sufficient in that case; for alloc_size with 2
arguments it warns only if the product of the 2 arguments is too small.
It also warns for explicit casts of malloc/calloc etc. calls
rather than just implicit ones, so not just
int *p = malloc (1);
but also
int *p = (int *) malloc (1);
It also fixes some ICEs where the code didn't verify the alloc_size
arguments properly (Walloc-size-5.c testcase ICEs with vanilla trunk).
And lastly, it introduces a coding style warning, -Wcalloc-transposed-args
to warn for calloc (sizeof (struct S), 1) and similar calls (regardless
of what they are cast to, warning whenever first argument is sizeof and
the second is not).
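The call shapes discussed above can be sketched as follows. This is only an illustration: struct S and the helper function names are hypothetical, and the comments paraphrase the behaviour described in this commit message rather than quote actual diagnostics.

```c
#include <assert.h>
#include <stdlib.h>

struct S { int a; int b; };

/* Conventional order: element count first, element size second.
   Per the description above, this shape is not warned about.  */
struct S *
alloc_conventional (void)
{
  return (struct S *) calloc (1, sizeof (struct S));
}

/* Transposed order: sizeof is the first argument and the second is not
   a sizeof.  This is the style -Wcalloc-transposed-args is described as
   flagging, regardless of the cast.  */
struct S *
alloc_transposed (void)
{
  return (struct S *) calloc (sizeof (struct S), 1);
}

/* One byte allocated for an int: the explicit-cast form that
   -Walloc-size is described as covering in addition to the implicit
   one.  The example only makes sense if int is wider than one byte.  */
int *
alloc_too_small (void)
{
  assert (sizeof (int) > 1);
  return (int *) malloc (1);
}

/* Both argument orders request the same number of zeroed bytes, so the
   transposition is a style issue rather than a size bug.  Returns 1 if
   both allocations succeeded and are zero-filled.  */
int
both_orders_zeroed (void)
{
  struct S *a = alloc_conventional ();
  struct S *b = alloc_transposed ();
  int ok = a != NULL && b != NULL
           && a->a == 0 && a->b == 0 && b->a == 0 && b->b == 0;
  free (a);
  free (b);
  return ok;
}
```

The point of keeping the two calloc helpers side by side is that they behave identically at run time; only the argument order differs.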
2023-12-20 Jakub Jelinek <jakub@redhat.com>
gcc/
* doc/invoke.texi (-Walloc-size): Add to the list of
warning options, remove unnecessary line-break.
(-Wcalloc-transposed-args): Document new warning.
gcc/c-family/
* c.opt (Wcalloc-transposed-args): New warning.
* c-common.h (warn_for_calloc, warn_for_alloc_size): Declare.
* c-warn.cc (warn_for_calloc, warn_for_alloc_size): New functions.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Grow
sizeof_arg and sizeof_arg_loc arrays to 6 elements. Call
warn_for_calloc if warn_calloc_transposed_args for functions with
alloc_size type attribute with 2 arguments.
(c_parser_expr_list): Use 6 instead of 3.
* c-typeck.cc (build_c_cast): Call warn_for_alloc_size for casts
of calls to functions with alloc_size type attribute.
(convert_for_assignment): Likewise.
gcc/testsuite/
* gcc.dg/Walloc-size-4.c: New test.
* gcc.dg/Walloc-size-5.c: New test.
* gcc.dg/Wcalloc-transposed-args-1.c: New test.
Alex Coplan [Wed, 20 Dec 2023 09:39:29 +0000 (09:39 +0000)]
aarch64: Validate register operands early in ldp fusion pass [PR113062]
We were missing validation of the candidate register operands in the
ldp/stp pass. I was relying on recog rejecting such cases when we
formed the final pair insn, but the testcase shows that with
-fharden-conditionals we attempt to combine two insns with asm_operands,
both containing mem rtxes. This then trips the assert:
gcc_assert (change->new_uses.is_valid ());
in the stp case as we aren't expecting to have (distinct) uses of mem in
the candidate stores.
While doing this I noticed that it seems more natural to have the
initial definition of mem_size closer to its first use in track_access,
so I moved that down.
gcc/ChangeLog:
PR target/113062
* config/aarch64/aarch64-ldp-fusion.cc
(ldp_bb_info::track_access): Punt on accesses with invalid
register operands, move definition of mem_size closer to its
first use.
Pan Li [Wed, 20 Dec 2023 02:19:39 +0000 (10:19 +0800)]
RISC-V: Bugfix for the const vector in single steps
This patch would like to fix the below execution failure when building with
"-march=rv64gcv_zvl512b -mabi=lp64d -mcmodel=medlow --param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3"
FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test
There will be one single step const vector like { -4, 4, -3, 5, -2, 6, -1, 7, ...}.
For such const vector generation with single step, we will generate vid
+ diff here. For example as below, given npatterns = 4.
Unfortunately, that cannot work well for { -4, 4, -3, 5, -2, 6, -1, 7, ...}
because it has one implicit requirement for the diff: the diff
sequence must repeat every npatterns elements, as the v2 (diff) above does.
The diff between { -4, 4, -3, 5, -2, 6, -1, 7, ...} and vid does not
repeat with period npatterns, and then we have wrong code here. We implement
a new code gen for sequences like { -4, 4, -3, 5, -2, 6, -1, 7, ...}.
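The repetition requirement on the diff can be sketched as a standalone check. This is not the GCC implementation of rvv_builder::npatterns_vid_diff_repeated_p; the function name, the plain-array interface and the first test vector mentioned in the note below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone check: return 1 if the element-wise
   difference between V and the index vector { 0, 1, 2, ... } repeats
   with period NPATTERNS over the first N elements, else 0.  */
int
sketch_vid_diff_repeated_p (const long *v, size_t n, size_t npatterns)
{
  assert (npatterns > 0);
  for (size_t i = npatterns; i < n; i++)
    {
      long diff = v[i] - (long) i;
      long prev = v[i - npatterns] - (long) (i - npatterns);
      /* Comparing each diff with the one a full period earlier is
         enough to establish that the whole sequence repeats.  */
      if (diff != prev)
        return 0;
    }
  return 1;
}
```

With an assumed input { 3, 2, 1, 0, 7, 6, 5, 4 } and npatterns = 4 the diffs are { 3, 1, -1, -3, 3, 1, -1, -3 }, which repeat, so vid + diff would be usable. For { -4, 4, -3, 5, -2, 6, -1, 7 } with npatterns = 2 the diffs are { -4, 3, -5, 2, -6, 1, -7, 0 }, which do not.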
The below tests are passed for this patch.
* The RV64 regression test with rv64gcv configuration.
* The run test gcc.dg/vect/pr92420.c for below configurations.
* config/riscv/riscv-v.cc (rvv_builder::npatterns_vid_diff_repeated_p):
New function to test whether the diff to vid is repeated or not.
(expand_const_vector): Add restriction
for the vid-diff code gen and implement general one.
The solution is quite simple, we just need to extract index = 1 element to the highpart of the DImode register on RV32 system
since DImode register consists of 2 scalar registers.
Alexandre Oliva [Thu, 14 Dec 2023 07:50:45 +0000 (04:50 -0300)]
strub: sparc64: unbias the stack address [PR112917]
The stack pointer is biased by 2047 bytes on sparc64, so the range it
delimits is way off. Unbias the addresses returned by
__builtin_stack_address (), so that the strub builtins, inlined or
not, can function correctly. I've considered introducing a new target
macro, but using STACK_POINTER_OFFSET seems safe, and it enables the
register save areas to be scrubbed as well.
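A minimal sketch of the unbiasing arithmetic, assuming only the 2047-byte bias quoted above. The macro and function names are hypothetical, and whatever else STACK_POINTER_OFFSET covers on sparc is deliberately not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Bias quoted in the commit message for sparc64.  The macro name is
   hypothetical; the patch itself is described as using
   STACK_POINTER_OFFSET.  */
#define SKETCH_STACK_BIAS 2047

/* Convert a biased stack-pointer value into the address it denotes.  */
uintptr_t
sketch_unbias_stack_address (uintptr_t biased_sp)
{
  /* Guard against wrap-around at the top of the address space.  */
  assert (biased_sp <= UINTPTR_MAX - SKETCH_STACK_BIAS);
  return biased_sp + SKETCH_STACK_BIAS;
}
```

Without this adjustment, a range delimited by the raw register value would be off by 2047 bytes at each end, which is the problem described above.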
Because of the large fixed-size outgoing args area next to the
register save area on sparc, we still need __strub_leave to not
allocate its own frame, otherwise it won't be able to clear part of
the frame it should.
Alexandre Oliva [Thu, 14 Dec 2023 06:21:37 +0000 (03:21 -0300)]
strub: sparc: omit frame in strub_leave [PR112917]
If we allow __strub_leave to allocate a frame on sparc, it will
overlap with a lot of the stack range we're supposed to scrub, because
of the large fixed-size outgoing args and register save area.
Unfortunately, setting up the PIC register seems to prevent the frame
pointer from being omitted.
Since the strub runtime doesn't issue calls or use global variables,
at least on sparc, disabling PIC to compile strub.c seems to do the
right thing.
for libgcc/ChangeLog
PR middle-end/112917
* config.host (sparc, sparc64): Enable...
* config/sparc/t-sparc: ... this new fragment.
Alexandre Oliva [Wed, 20 Dec 2023 01:17:42 +0000 (22:17 -0300)]
-finline-stringops: allow expansion into edges [PR113002]
Builtin expanders for memset and memcpy may involve conditionals and
loops, but their sequences may end up emitted in edges. Alas,
commit_one_edge_insertion rejects sequences that end with a jump, a
requirement that makes sense for insertions after expand, but not so
much during expand.
During expand, jumps may appear in the middle of the insert sequence
as much as in the end, and it's only after committing edge insertions
out of PHI nodes that we go through the entire function splitting
blocks where needed, so relax the assert in commit_one_edge_insertion
so that jumps are accepted during expand even at the end of the
sequence.
for gcc/ChangeLog
PR rtl-optimization/113002
* cfgrtl.cc (commit_one_edge_insertion): Tolerate jumps in the
inserted sequence during expand.
Jeff Law [Wed, 20 Dec 2023 04:05:25 +0000 (21:05 -0700)]
[committed] Stop forcing unsigned bitfields on mcore
The GCC manual has a whole section on signedness of bitfields with the ultimate
conclusion that the property really isn't an ABI issue, but instead a C dialect
issue (agreed). Furthermore it concludes that all targets should behave the
same by default.
So it was a mistake for the mcore port to force bitfields to be unsigned and
that never should have been included. This patch rectifies that problem.
I should have remembered this -- I went down this path once in the 90s. I
don't recall which port anymore, but once Joseph mentioned this policy, bits and
pieces did start to come back to me.
Restoring the proper default happens to also fix 170 tests in the GCC
testsuite, some of which would go into infinite loops when bitfields were
treated as unsigned values (pr88621 for example). Essentially the testing time
is cut in half, which was actually the point of digging into pr88621 to begin
with.
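To illustrate how bitfield signedness can turn a terminating loop into an endless one, here is a small sketch. It is not taken from pr88621; the struct and function names are hypothetical, and the fields are declared explicitly signed and unsigned so that the result does not depend on the compiler's default for plain int bitfields.

```c
#include <assert.h>

struct sketch_counter_s { signed int n : 3; };    /* holds -4 .. 3 */
struct sketch_counter_u { unsigned int n : 3; };  /* holds 0 .. 7 */

/* Count down from 3 while the field is non-negative.  Returns the
   number of iterations, or -1 if LIMIT iterations pass without the
   loop terminating.  */
int
sketch_countdown_signed (int limit)
{
  struct sketch_counter_s c;
  int iters = 0;
  for (c.n = 3; c.n >= 0; c.n--)
    if (++iters >= limit)
      return -1;
  return iters;
}

/* Same loop with an unsigned field: decrementing 0 wraps to 7, so the
   condition is always true and only LIMIT stops the loop.  */
int
sketch_countdown_unsigned (int limit)
{
  struct sketch_counter_u c;
  int iters = 0;
  for (c.n = 3; c.n >= 0; c.n--)
    if (++iters >= limit)
      return -1;
  return iters;
}
```

The iteration limit exists only so that the sketch itself terminates; a test written without one would hang when its bitfields are forced to be unsigned.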
gcc/
* config/mcore/mcore.h (CC1_SPEC): Do not set -funsigned-bitfields.
Alexandre Oliva [Wed, 20 Dec 2023 04:05:45 +0000 (01:05 -0300)]
-finline-stringops: copy timeout factor from memcmp-1.c test
I added some -finline-stringops tests that included memcmp-1.c, but
carried over the timeout factor onto only one such test. Jeff Law
kindly pointed that out (thanks!), so here's the fix.
for gcc/testsuite/ChangeLog
* gcc.dg/torture/inline-mem-cmp-1.c: Copy timeout factor from
memcmp-1.c.
* gcc.dg/torture/inline-mem-cpy-1.c: Likewise.
tree-object-size: Always set computed bit for bdos [PR113012]
It is always safe to set the computed bit for dynamic object sizes at
the end of collect_object_sizes_for because even in case of a dependency
loop encountered in nested calls, we have an SSA temporary to actually
finish the object size expression. The reexamine pass for dynamic
object sizes is only for propagation of unknowns and gimplification of
the size expressions, not for loop resolution as in the case of static
object sizes.
gcc/ChangeLog:
PR tree-optimization/113012
* tree-object-size.cc (compute_builtin_object_size): Expand
comment for dynamic object sizes.
(collect_object_sizes_for): Always set COMPUTED bitmap for
dynamic object sizes.
gcc/testsuite/ChangeLog:
PR tree-optimization/113012
* gcc.dg/ubsan/pr113012.c: New test case.
Alexandre Oliva [Wed, 20 Dec 2023 00:06:22 +0000 (21:06 -0300)]
strub: avoid lto inlining
The strub builtins are not suited for cross-unit inlining, they should
only be inlined by the builtin expanders, if at all. While testing on
sparc64, it occurred to me that, if libgcc was built with LTO enabled,
lto1 might inline them, and that would likely break things. So, make
sure they're clearly marked as not inlinable.
for libgcc/ChangeLog
* strub.c (ATTRIBUTE_NOINLINE): New.
(ATTRIBUTE_STRUB_CALLABLE): Add it.
(__strub_dummy_force_no_leaf): Drop it.
Alexandre Oliva [Wed, 20 Dec 2023 00:06:17 +0000 (21:06 -0300)]
hardened: use LD_PIE_SPEC only if defined
sol2.h may define LINK_PIE_SPEC and leave LD_PIE_SPEC undefined, but
gcc.cc will only provide a LD_PIE_SPEC definition if LINK_PIE_SPEC is
not defined, and then it uses LD_PIE_SPEC guarded by #ifdef HAVE_LD_PIE
only. Add LD_PIE_SPEC to the guard.
gcc/ChangeLog
* gcc.cc (process_command): Use LD_PIE_SPEC only if defined.
Patrick Palka [Tue, 19 Dec 2023 21:33:55 +0000 (16:33 -0500)]
c++: local class memfn synth from uneval context [PR113063]
Here we first use and therefore synthesize the local class operator<=>
from an unevaluated context, which inadvertently affects synthesization
by preventing functions used within the definition (such as the copy
constructor of std::strong_ordering) from getting marked as odr-used.
This patch fixes this by using maybe_push_to_top_level in synthesize_method
which ensures cp_unevaluated_operand gets cleared even in the function-local
case.
PR c++/113063
gcc/cp/ChangeLog:
* method.cc (synthesize_method): Use maybe_push_to_top_level
and maybe_pop_from_top_level.
Nathaniel Shead [Sun, 17 Dec 2023 01:46:02 +0000 (12:46 +1100)]
c++: Check null pointer deref when calling memfn in constexpr [PR102420]
Calling a non-static member function on a null pointer is undefined
behaviour (see [expr.ref] p8) and should error in constant evaluation,
even if the 'this' pointer is never actually accessed within that
function.
One catch is that currently, the function pointer conversion operator
for lambdas passes a null pointer as the 'this' pointer to the
underlying 'operator()', so for now we ignore such calls.
PR c++/102420
gcc/cp/ChangeLog:
* constexpr.cc (cxx_bind_parameters_in_call): Check for calling
non-static member functions with a null pointer.
The linking of libgcc is already present in %(liborig), so the current
situation duplicates libraries. This was not an issue until macOS's new
linker started giving warnings for such cases.
Sandra Loosemore [Mon, 18 Dec 2023 23:16:53 +0000 (23:16 +0000)]
OpenMP: Use enumerators for names of trait-sets and traits
This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags. The tags
are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or
OMP_TS_ID) and are encoded there as integer constants.
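The general idea of replacing name strings with enumerators plus a lookup table can be sketched like this. All identifiers are hypothetical; the real enumerators live in omp-selectors.h and may be named, ordered and populated differently. The four set names are taken from the per-set selector tables this patch deletes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trait-set codes.  */
enum sketch_tss_code
{
  SKETCH_TSS_CONSTRUCT,
  SKETCH_TSS_DEVICE,
  SKETCH_TSS_IMPLEMENTATION,
  SKETCH_TSS_USER,
  SKETCH_TSS_LAST,
  SKETCH_TSS_INVALID = -1
};

/* Table indexed by code, so name lookup and printing share one source
   of truth.  */
static const char *const sketch_tss_names[SKETCH_TSS_LAST] = {
  "construct", "device", "implementation", "user"
};

/* Map a name to its code, or SKETCH_TSS_INVALID if it is unknown.  */
enum sketch_tss_code
sketch_lookup_tss_code (const char *name)
{
  for (int i = 0; i < SKETCH_TSS_LAST; i++)
    if (strcmp (name, sketch_tss_names[i]) == 0)
      return (enum sketch_tss_code) i;
  return SKETCH_TSS_INVALID;
}

/* Map a valid code back to its name.  */
const char *
sketch_tss_name (enum sketch_tss_code code)
{
  assert (code >= 0 && code < SKETCH_TSS_LAST);
  return sketch_tss_names[code];
}

/* Once names are codes, dispatch becomes a switch rather than a chain
   of string comparisons.  */
int
sketch_tss_is_user (enum sketch_tss_code code)
{
  switch (code)
    {
    case SKETCH_TSS_USER:
      return 1;
    default:
      return 0;
    }
}
```

Returning a distinguished invalid code for unknown names leaves it to the caller to decide whether that is a warning or an error.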
gcc/ChangeLog
* omp-selectors.h: New file.
* omp-general.h: Include omp-selectors.h.
(OMP_TSS_CODE, OMP_TSS_NAME): New.
(OMP_TS_CODE, OMP_TS_NAME): New.
(make_trait_set_selector, make_trait_selector): Adjust declarations.
(omp_construct_traits_to_codes): Likewise.
(omp_context_selector_set_compare): Likewise.
(omp_get_context_selector): Likewise.
(omp_get_context_selector_list): New.
* omp-general.cc (omp_construct_traits_to_codes): Pass length in
as argument instead of returning it. Make it table-driven.
(omp_tss_map): New.
(kind_properties, vendor_properties, extension_properties): New.
(atomic_default_mem_order_properties): New.
(omp_ts_map): New.
(omp_check_context_selector): Simplify lookup and dispatch logic.
(omp_mark_declare_variant): Ignore variants with unknown construct
selectors. Adjust for new representation.
(make_trait_set_selector, make_trait_selector): Adjust for new
representations.
(omp_context_selector_matches): Simplify dispatch logic. Avoid
fixed-sized buffers and adjust call to omp_construct_traits_to_codes.
(omp_context_selector_props_compare): Adjust for new representations
and simplify dispatch logic.
(omp_context_selector_set_compare): Likewise.
(omp_context_selector_compare): Likewise.
(omp_get_context_selector): Adjust for new representations, and split
out...
(omp_get_context_selector_list): New function.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
(omp_context_compute_score): Adjust for new representations. Avoid
fixed-sized buffers and magic numbers. Adjust call to
omp_construct_traits_to_codes.
* gimplify.cc (omp_construct_selector_matches): Avoid use of
fixed-size buffer. Adjust call to omp_construct_traits_to_codes.
gcc/c/ChangeLog
* c-parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(c_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic. Uniformly warn instead of sometimes
error when an unknown selector is found. Adjust error messages
for extraneous/incorrect score.
(c_parser_omp_context_selector_specification): Likewise.
(c_finish_omp_declare_variant): Adjust for new representations.
gcc/cp/ChangeLog
* decl.cc (omp_declare_variant_finalize_one): Adjust for new
representations.
* parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(cp_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic. Uniformly warn instead of sometimes
error when an unknown selector is found. Adjust error messages
for extraneous/incorrect score.
(cp_parser_omp_context_selector_specification): Likewise.
* pt.cc (tsubst_attribute): Adjust for new representations.
gcc/fortran/ChangeLog
* gfortran.h: Include omp-selectors.h.
(enum gfc_omp_trait_property_kind): Delete, and replace all
references with equivalent omp_tp_type enumerators.
(struct gfc_omp_trait_property): Update for omp_tp_type.
(struct gfc_omp_selector): Replace string name with new enumerator.
(struct gfc_omp_set_selector): Likewise.
* openmp.cc (gfc_free_omp_trait_property_list): Update for
omp_tp_type.
(omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(gfc_ignore_trait_property_extension): New.
(gfc_ignore_trait_property_extension_list): New.
(gfc_match_omp_selector): Adjust for new representations and simplify
dispatch logic. Uniformly warn instead of sometimes error when an
unknown selector is found.
(gfc_match_omp_context_selector): Adjust for new representations.
Adjust error messages for extraneous/incorrect score.
(gfc_match_omp_context_selector_specification): Likewise.
* trans-openmp.cc (gfc_trans_omp_declare_variant): Adjust for
new representations.
gcc/testsuite/
* c-c++-common/gomp/declare-variant-1.c: Expect warning on
unknown selectors.
* c-c++-common/gomp/declare-variant-2.c: Likewise. Also adjust
messages for score errors.
* c-c++-common/gomp/declare-variant-no-score.c: New.
* gfortran.dg/gomp/declare-variant-1.f90: Expect warning on
unknown selectors.
* gfortran.dg/gomp/declare-variant-2.f90: Likewise. Also adjust
messages for score errors.
* gfortran.dg/gomp/declare-variant-no-score.f90: New.
Sandra Loosemore [Sun, 19 Nov 2023 05:31:41 +0000 (05:31 +0000)]
OpenMP: Unify representation of name-list properties.
Previously, name-list properties specified as identifiers were stored
in the TREE_PURPOSE/OMP_TP_NAME slot, while those specified as strings
were stored in the TREE_VALUE/OMP_TP_VALUE slot. This patch puts both
representations in OMP_TP_VALUE with a magic cookie in OMP_TP_NAME.
gcc/ChangeLog
* omp-general.h (OMP_TP_NAMELIST_NODE): New.
* omp-general.cc (omp_context_name_list_prop): Move earlier
in the file, and adjust for new representation.
(omp_check_context_selector): Adjust this too.
(omp_context_selector_props_compare): Likewise.
gcc/c/ChangeLog
* c-parser.cc (c_parser_omp_context_selector): Adjust for new
namelist property representation.
gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_context_selector): Adjust for new
namelist property representation.
* pt.cc (tsubst_attribute): Likewise.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Adjust for
new namelist property representation.
Sandra Loosemore [Sun, 19 Nov 2023 00:56:47 +0000 (00:56 +0000)]
OpenMP: Introduce accessor macros and constructors for context selectors.
This patch hides the underlying nested TREE_LIST structure of context
selectors behind accessor macros that have more meaningful names than
the generic TREE_PURPOSE/TREE_VALUE accessors. There is a slight
change to the representation in that the score expression in
trait-selectors has a distinguished tag and is separated from the
ordinary properties, although internally it is still represented as
the first item in the TREE_VALUE of the selector. This patch also renames
some local variables with slightly more descriptive names so it is easier
to track whether something is a selector-set, selector, or property.
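The idea of layering role-specific accessors over generic list slots can be sketched with a toy list type. Nothing here is GCC's tree API: struct sketch_list stands in for a TREE_LIST node, all SKETCH_* names are hypothetical, and which slot each real OMP_* macro maps to is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a TREE_LIST node.  */
struct sketch_list
{
  const void *purpose;
  const void *value;
  const struct sketch_list *chain;
};

/* Generic accessors, analogous in spirit to TREE_PURPOSE/TREE_VALUE.  */
#define SKETCH_PURPOSE(N) ((N)->purpose)
#define SKETCH_VALUE(N)   ((N)->value)

/* Role-specific names over the same slots: a selector set has an id
   and a list of trait selectors; a trait selector has an id.  */
#define SKETCH_TSS_ID(N) SKETCH_PURPOSE (N)
#define SKETCH_TSS_TRAIT_SELECTORS(N) \
  ((const struct sketch_list *) SKETCH_VALUE (N))
#define SKETCH_TS_ID(N)  SKETCH_PURPOSE (N)

/* Count the trait selectors in one selector set.  */
int
sketch_count_selectors (const struct sketch_list *set)
{
  int n = 0;
  assert (set != NULL);
  for (const struct sketch_list *ts = SKETCH_TSS_TRAIT_SELECTORS (set);
       ts != NULL; ts = ts->chain)
    n++;
  return n;
}
```

Code written against the role-specific names says which level of the nesting it is touching, which the generic slot names cannot convey.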
gcc/ChangeLog
* omp-general.h (OMP_TS_SCORE_NODE): New.
(OMP_TSS_ID, OMP_TSS_TRAIT_SELECTORS): New.
(OMP_TS_ID, OMP_TS_SCORE, OMP_TS_PROPERTIES): New.
(OMP_TP_NAME, OMP_TP_VALUE): New.
(make_trait_set_selector): Declare.
(make_trait_selector): Declare.
(make_trait_property): Declare.
(omp_constructor_traits_to_codes): Rename to
omp_construct_traits_to_codes.
* omp-general.cc (omp_constructor_traits_to_codes): Rename
to omp_construct_traits_to_codes. Use new accessors.
(omp_check_context_selector): Use new accessors.
(make_trait_set_selector): New.
(make_trait_selector): New.
(make_trait_property): New.
(omp_context_name_list_prop): Use new accessors.
(omp_context_selector_matches): Use new accessors.
(omp_context_selector_props_compare): Use new accessors.
(omp_context_selector_set_compare): Use new accessors.
(omp_get_context_selector): Use new accessors.
(omp_context_compute_score): Use new accessors.
* gimplify.cc (omp_construct_selector_matches): Adjust for renaming
of omp_constructor_traits_to_codes.
gcc/c/ChangeLog
* c-parser.cc (c_parser_omp_context_selector): Use new constructors.
gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_context_selector): Use new constructors.
* pt.cc: Include omp-general.h.
(tsubst_attribute): Use new context selector accessors and
constructors.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Use new
constructors.
David Faust [Tue, 12 Dec 2023 21:55:59 +0000 (13:55 -0800)]
btf: change encoding of forward-declared enums [PR111735]
The BTF specification does not formally define a representation for
forward-declared enum types such as:
enum Foo;
Forward-declarations for struct and union types are represented by
BTF_KIND_FWD, which has a 1-bit flag distinguishing the two.
The de-facto standard format used by other tools like clang and pahole
is to represent forward-declared enums as BTF_KIND_ENUM with vlen=0,
i.e. as a regular enum type with no enumerators. This patch changes
GCC to adopt that format, and makes a couple of minor cleanups in
btf_asm_type ().
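The difference between the two encodings can be sketched by packing the 'info' word of a BTF type record. The kind numbers and bit layout follow the BTF format (vlen in bits 0-15, kind in bits 24-28, kind_flag in bit 31); the SKETCH_* names and the helper function are hypothetical, and passing kind_flag = 0 for the forward enum is an assumption, not something this commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Kind numbers from the BTF format.  */
#define SKETCH_BTF_KIND_ENUM 6
#define SKETCH_BTF_KIND_FWD  7

/* Pack the 'info' word of a BTF type record.  */
uint32_t
sketch_btf_info (uint32_t kind, uint32_t vlen, uint32_t kind_flag)
{
  assert (kind <= 0x1f && vlen <= 0xffff && kind_flag <= 1);
  return (kind_flag << 31) | (kind << 24) | vlen;
}

/* Forward-declared enum in the de-facto format described above: a
   regular BTF_KIND_ENUM with no enumerators.  */
uint32_t
sketch_btf_enum_forward_info (void)
{
  return sketch_btf_info (SKETCH_BTF_KIND_ENUM, 0, 0);
}

/* Forward-declared struct (is_union = 0) or union (is_union = 1):
   BTF_KIND_FWD, with the 1-bit flag distinguishing the two.  */
uint32_t
sketch_btf_struct_forward_info (uint32_t is_union)
{
  return sketch_btf_info (SKETCH_BTF_KIND_FWD, 0, is_union);
}
```

A consumer can therefore recognise a forward-declared enum as an enum record whose vlen is zero, without needing a third state in the BTF_KIND_FWD flag.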
gcc/
PR debug/111735
* btfout.cc (btf_fwd_to_enum_p): New.
(btf_asm_type_ref): Special case references to enum forwards.
(btf_asm_type): Special case enum forwards. Rename btf_size_type to
btf_size, and change chained ifs switching on btf_kind into else ifs.
gcc/testsuite/
PR debug/111735
* gcc.dg/debug/btf/btf-forward-2.c: New test.
Patrick Palka [Tue, 19 Dec 2023 16:40:15 +0000 (11:40 -0500)]
c++: partial ordering and dep alias tmpl specs [PR90679]
During partial ordering, we want to look through dependent alias
template specializations within template arguments and otherwise
treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108
and r11-7011-g6e0a231a4aa240). To that end template_args_equal was
given a partial_order flag that controls this behavior. This flag
does the right thing when a dependent alias template specialization
appears as template argument of the partial specialization, e.g. in
we correctly consider #2 to be more specialized than #1. But if the
alias specialization appears as a nested template argument of another
class template specialization, e.g. in
then we incorrectly consider #1 and #2 to be unordered. This is because
1. we don't propagate the flag to recursive template_args_equal calls
2. we don't use structural equality for class template specializations
written in terms of dependent alias template specializations
This patch fixes the first issue by turning the partial_order flag into
a global. This patch fixes the second issue by making us propagate
structural equality appropriately when building a class template
specialization. In passing this patch also improves hashing of
specializations that use structural equality.
PR c++/90679
gcc/cp/ChangeLog:
* cp-tree.h (comp_template_args): Remove partial_order parameter.
(template_args_equal): Likewise.
* pt.cc (comparing_for_partial_ordering): New global flag.
(iterative_hash_template_arg) <case tcc_type>: Hash the template
and arguments for specializations that use structural equality.
(template_args_equal): Remove partial order parameter and
use comparing_for_partial_ordering instead.
(comp_template_args): Likewise.
(comp_template_args_porder): Set comparing_for_partial_ordering
instead. Make static.
(any_template_arguments_need_structural_equality_p): Return true
for an argument that's a dependent alias template specialization
or a class template specialization that itself needs structural
equality.
* tree.cc (cp_tree_equal) <case TREE_VEC>: Adjust call to
comp_template_args.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alias-decl-75a.C: New test.
* g++.dg/cpp0x/alias-decl-75b.C: New test.
For a (complex) alias template-id, dependent_alias_template_spec_p
returns true if any template argument of the template-id is dependent.
This predicate indicates that substitution into the template-id may
behave differently with respect to SFINAE than substitution into the
expanded alias, and so the alias is in a way non-transparent.
is such an alias template-id since first_t doesn't use its second
template parameter and so the substitution into the expanded alias would
discard the SFINAE effects of the corresponding (dependent) argument 'T&'.
But this predicate is overly conservative since what really matters for
sake of SFINAE equivalence is whether a template argument corresponding
to an _unused_ template parameter is dependent. So the predicate should
return false for e.g. 'first_t<T&, int>'.
This patch refines the predicate appropriately. We need to be able to
efficiently determine which template parameters of a complex alias
template are unused, so to that end we add a new out parameter to
complex_alias_template_p and cache its result in an on-the-side hash_map
that replaces the existing TEMPLATE_DECL_COMPLEX_ALIAS_P flag.
PR c++/90679
gcc/cp/ChangeLog:
* cp-tree.h (TEMPLATE_DECL_COMPLEX_ALIAS_P): Remove.
(most_general_template): Constify parameter.
* pt.cc (push_template_decl): Adjust after removing
TEMPLATE_DECL_COMPLEX_ALIAS_P.
(complex_alias_tmpl_info): New hash_map.
(uses_all_template_parms_data::seen): Change type to
tree* from bool*.
(complex_alias_template_r): Adjust accordingly.
(complex_alias_template_p): Add 'seen_out' out parameter.
Call most_general_template and check PRIMARY_TEMPLATE_P.
Use complex_alias_tmpl_info to cache the result and set
'*seen_out' accordingly.
(dependent_alias_template_spec_p): Add !processing_template_decl
early exit test. Consider dependence of only template arguments
corresponding to seen template parameters as per
Eric Botcazou [Thu, 28 Sep 2023 13:53:36 +0000 (15:53 +0200)]
ada: Fix internal error on call with parameter of predicated subtype
The problem is that the predicated subtype does not inherit all the required
attributes of a string subtype with a static predicate.
gcc/ada/
* sem_ch3.adb (Analyze_Subtype_Declaration): Remove a short-circuit
for subtypes without aspects when it comes to predicates.
* sem_util.adb (Inherit_Predicate_Flags): Deal with private subtypes
whose full view is an Itype.
Gary Dismukes [Sat, 2 Dec 2023 00:11:31 +0000 (00:11 +0000)]
ada: Missing error on positional container aggregates for types with Add_Named
The compiler fails to reject a container aggregate written using positional
notation when the container type specifies an Add_Named operation in its
Aggregate aspect. Container aggregates for such types must be written using
named associations. The compiler ignores the positional associations and
produces an empty aggregate object. An error check is added to catch such
illegal container aggregates.
gcc/ada/
* sem_aggr.adb (Resolve_Container_Aggregate): In the Add_Named
case, issue an error if the container aggregate is written as a
positional aggregate, since such an aggregate must have named
associations.
Sheri Bernstein [Fri, 1 Dec 2023 01:14:22 +0000 (01:14 +0000)]
ada: Remove GNATcheck violations
Remove GNATcheck violations by refactoring code and also using
pragma Annotate to exempt them.
gcc/ada/
* libgnat/a-comlin.adb (Argument_Count): Rewrite code so there is
only one return, to remove Improper_Returns violation.
(Command_Name): Add pragma to exempt Improper_Returns violation.
Gary Dismukes [Thu, 30 Nov 2023 19:28:42 +0000 (19:28 +0000)]
ada: Compiler hangs on container aggregate with function call as key expression
The compiler hangs (or may crash, if assertions are enabled) when compiling
an iterated association of a container aggregate that has a key expression
given by a function call. The resolution of the call leads to a blowup in
Build_Call_Marker, because the temporary copy of the expression that's
analyzed has an Empty parent, causing insertion of the call marker to fail.
The fix for this is to preanalyze, rather than analyze, the copy of the key
expression (Build_Call_Marker will return without creating a call marker in
the case of preanalysis).
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Association): Call
Preanalyze_And_Resolve instead of Analyze_And_Resolve on a key
expression of an iterated association.
Routine Get_Logical_Line_Number_Img was introduced for splitting of
Pre/Post contracts, but subsequent patch for that feature removed its
only use. It was then used by GNATprove, but that use is now removed
as well.
Patrick Bernardi [Wed, 29 Nov 2023 12:14:03 +0000 (07:14 -0500)]
ada: gnatbind: Do not generate Ada.Command_Line references when not used
It was previously assumed that configurable runtimes could not return exit
statuses, however this assumption no longer holds. Instead, only import
the required symbols from Ada.Command_Line's support packages if
Ada.Command_Line is in the closure of the partition when a configurable
runtime is used.
gcc/ada/
* bindgen.adb (Command_Line_Used): New object.
(Gen_Main): Only generate references to symbols used by
Ada.Command_Line if the package is used by the partition.
(Gen_Output_File_Ada): Ditto.
(Resolve_Binder_Options): Check if Ada.Command_Line is in the
closure of the partition.
Piotr Trojanek [Tue, 28 Nov 2023 22:01:31 +0000 (23:01 +0100)]
ada: Ignore unconstrained components as inputs for Depends
The current wording of SPARK RM 6.1.5(5) about the inputs for the
Depends contract doesn't mention "a record with at least one
unconstrained component".
gcc/ada/
* sem_prag.adb (Is_Unconstrained_Or_Tagged_Item): Update comment
and body.
Eric Botcazou [Tue, 28 Nov 2023 19:46:00 +0000 (20:46 +0100)]
ada: Rename Is_Constr_Subt_For_UN_Aliased flag
The flag is set on the constructed subtype of an object with unconstrained
nominal subtype that is aliased and is used by the code generator to adjust
the layout of the object.
But it is actually only used for array subtypes, where it determines whether
the object is allocated with its bounds, and this usage could be extended to
other cases than the original case.
gcc/ada/
* einfo.ads (Is_Constr_Subt_For_UN_Aliased): Rename into...
(Is_Constr_Array_Subt_With_Bounds): ...this.
* exp_ch3.adb (Expand_N_Object_Declaration): Adjust to above
renaming and remove now redundant test.
* sem_ch3.adb (Analyze_Object_Declaration): Likewise, but set
Is_Constr_Array_Subt_With_Bounds only on arrays.
* gen_il-fields.ads (Opt_Field_Enum): Apply same renaming.
* gen_il-gen-gen_entities.adb (Entity_Kind): Likewise.
* gen_il-internals.adb (Image): Remove specific processing for
Is_Constr_Subt_For_UN_Aliased.
* treepr.adb (Image): Likewise.
* gcc-interface/decl.cc (gnat_to_gnu_entity): Adjust to renaming
and remove now redundant tests.
* gcc-interface/trans.cc (Identifier_to_gnu): Likewise.
(Call_to_gnu): Likewise.
ada: Remove No_Dynamic_Priorities from Restricted_Tasking
Some of our restricted runtimes support dynamic priorities. The binder
needs to generate code for a restricted runtime even if the restriction
No_Dynamic_Priorities is not in place.
gcc/ada/
* libgnat/s-rident.ads: Remove No_Dynamic_Priorities from
Restricted_Tasking.
Patrick Bernardi [Sat, 18 Nov 2023 00:39:47 +0000 (19:39 -0500)]
ada: Adapt Ada.Command_Line to work on configurable runtimes
The behaviour of the binder when handling command line arguments and exit
codes is simplified so that references to the corresponding runtime symbols
are always generated when the runtime is configured with command line
argument and exit code support. This allows Ada.Command_Line to work with
all runtimes, which was not the case previously.
As a result of this change, configurable runtimes that do not include
Ada.Command_Line and its support files, but are configured with
Command_Line_Args and/or Exit_Status_Supported set to True will need to
provide the symbols required by the binder, as these symbols will no longer
be defined in the binder generated file.
argv.c includes a small change to exclude adaint.h when compiling for a
light runtime, since this header is not required.
gcc/ada/
* argv.c: Do not include adaint.h if LIGHT_RUNTIME is defined.
* bindgen.adb (Gen_Main): Simplify command line argument and exit
handling by requiring the runtime to always provide the required
symbols if command line argument and exit code is enabled.
* targparm.ads: Update comments to reflect changes to gnatbind.
Before this patch, the compiler would fail to examine the corresponding
record types of concurrent types when building aggregate components.
This patch fixes this, and adds a precondition and additional documentation
on the subprogram that triggered the crash, as it never makes sense
to call it with a concurrent type.
gcc/ada/
* exp_aggr.adb (Initialize_Component): Use corresponding record
types of concurrent types.
* exp_util.ads (Make_Tag_Assignment_From_Type): Add precondition
and extend documentation.
Co-authored-by: Javier Miranda <miranda@adacore.com>
Eric Botcazou [Fri, 24 Nov 2023 15:26:00 +0000 (16:26 +0100)]
ada: Further cleanup in finalization machinery
This removes the setting of the Is_Ignored_Transient flag on the temporaries
needing finalization created by Expand_Ctrl_Function_Call when invoked from
within the dependent expressions of conditional expressions.
This flag tells the general finalization machinery to disregard the object.
But temporaries needing finalization present in action lists of dependent
expressions are picked up by Process_Transients_In_Expression, which deals
with their finalization and sets the Is_Finalized_Transient flag on them.
Now this latter flag has exactly the same effect as Is_Ignored_Transient
as far as the general finalization machinery is concerned, so setting the
flag is unnecessary. In the end, the flag can be decoupled entirely from
transient objects and renamed into Is_Ignored_For_Finalization.
This also moves around the declaration of a local variable and turns a
library-level procedure into a nested procedure.
gcc/ada/
* einfo.ads (Is_Ignored_Transient): Rename into...
(Is_Ignored_For_Finalization): ...this.
* gen_il-fields.ads (Opt_Field_Enum): Adjust to above renaming.
* gen_il-gen-gen_entities.adb (Object_Kind): Likewise.
* exp_aggr.adb (Expand_Array_Aggregate): Likewise.
* exp_ch7.adb (Build_Finalizer.Process_Declarations): Likewise.
* exp_util.adb (Requires_Cleanup_Actions): Likewise.
* exp_ch4.adb (Expand_N_If_Expression): Move down declaration of
variable Optimize_Return_Stmt.
(Process_Transient_In_Expression): Turn procedure into a child of...
(Process_Transients_In_Expression): ...this procedure.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Remove obsolete setting
of Is_Ignored_Transient flag on the temporary if within a dependent
expression of a conditional expression.
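The redundancy argument made in this commit message can be pictured with a
toy model. The type and function below are hypothetical and are not the GNAT
implementation; they only illustrate that once either flag is set, the general
finalization machinery leaves the object alone, so setting both adds nothing.

```cpp
#include <cassert>

// Hypothetical stand-in for an object entity; only the two flags matter here.
struct ObjectEntity {
  bool is_ignored_for_finalization = false;  // formerly Is_Ignored_Transient
  bool is_finalized_transient = false;       // set by Process_Transients_In_Expression
};

// Hypothetical predicate: does the general finalization machinery take
// charge of this object?  Either flag alone is enough to make it step back.
bool handled_by_general_machinery(const ObjectEntity &obj) {
  return !obj.is_ignored_for_finalization && !obj.is_finalized_transient;
}
```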
Yannick Moy [Wed, 22 Nov 2023 15:43:08 +0000 (16:43 +0100)]
ada: Fix SPARK expansion of container aggregates
GNATprove supports container aggregates, except for indexed aggregates.
It needs all expressions to have suitable target types and Do_Range_Check
flags, which are added by the special expansion for GNATprove.
There is no impact on code generation.
gcc/ada/
* exp_spark.adb (Expand_SPARK_N_Aggregate): New procedure for the
special expansion.
(Expand_SPARK): Call the new expansion procedure.
* sem_util.adb (Is_Container_Aggregate): Implement missing test.
Eric Botcazou [Wed, 22 Nov 2023 15:29:01 +0000 (16:29 +0100)]
ada: Fix spurious visibility error on parent's component in instance
This occurs for an aggregate of a derived tagged type in the body of the
instance, because the full view of the parent type, which was visible in
the generic construct (otherwise the aggregate would have been illegal),
is not restored in the body of the instance.
Copy_Generic_Node already contains code to restore the full view in this
case, but it works only if the derived tagged type is itself global to
the generic construct, and not if the derived tagged type is local but
the parent type global, as is the case here.
gcc/ada/
* gen_il-fields.ads (Aggregate_Bounds): Rename to
Aggregate_Bounds_Or_Ancestor_Type.
* gen_il-gen-gen_nodes.adb (Aggregate_Bounds): Likewise.
* sem_aggr.adb (Resolve_Record_Aggregate): Remove obsolete bypass.
* sem_ch12.adb (Check_Generic_Actuals): Add decoration.
(Copy_Generic_Node): For an extension aggregate, restore only the
full view, if any. For a full aggregate, restore the full view as
well as that of its Ancestor_Type, if any, and up to the root type.
(Save_References_In_Aggregate): For a full aggregate of a local
derived tagged type with a global ancestor, set Ancestor_Type to
this ancestor. For a full aggregate of a global derived tagged
type, set Ancestor_Type to the parent type.
* sinfo-utils.ads (Aggregate_Bounds): New function renaming.
(Ancestor_Type): Likewise.
(Set_Aggregate_Bounds): New procedure renaming.
(Set_Ancestor_Type): Likewise.
* sinfo.ads (Ancestor_Type): Document new field.
Eric Botcazou [Wed, 22 Nov 2023 22:04:33 +0000 (23:04 +0100)]
ada: Plug small loophole in finalization machinery
The path in Expand_N_If_Expression implementing the special optimization for
an unidimensional array type and dependent expressions with static bounds
fails to call Process_Transients_In_Expression on their list of actions.
gcc/ada/
* exp_ch4.adb (Expand_N_If_Expression): Also add missing calls to
Process_Transients_In_Expression on the code path implementing the
special optimization for an unidimensional array type and
dependent expressions with static bounds.
Steve Baird [Tue, 21 Nov 2023 19:00:37 +0000 (11:00 -0800)]
ada: Cope with Sem_Util.Enclosing_Declaration oddness.
Sem_Util.Enclosing_Declaration can return a non-empty result which is not
a declaration; clients may need to compensate for the case where an
N_Subprogram_Specification node is returned. One such client is the function
Is_Actual_Subp_Of_Inst.
gcc/ada/
* sem_ch8.adb (Is_Actual_Subp_Of_Inst): After calling
Enclosing_Declaration, add a check for the case where one more
Parent call is needed to get the enclosing declaration.
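The compensation described in the sem_ch8.adb entry can be sketched with a
simplified tree model. Everything below is hypothetical (the node kinds, the
Node type and both functions are illustrative stand-ins, not the GNAT
sources); it only shows the shape of the check: if the lookup stops at a
subprogram specification, one more parent step reaches the declaration.

```cpp
#include <cassert>

// Hypothetical, much-simplified stand-ins for GNAT tree node kinds.
enum class NodeKind {
  SubprogramDeclaration,
  SubprogramSpecification,
  Other
};

struct Node {
  NodeKind kind;
  const Node *parent;  // nullptr at the root
};

// Models a lookup that may stop at a specification node instead of the
// declaration that owns it.
const Node *enclosing_declaration(const Node *n) {
  while (n != nullptr && n->kind == NodeKind::Other)
    n = n->parent;
  return n;
}

// Client-side compensation: take one more parent step when the lookup
// returned a subprogram specification.
const Node *enclosing_declaration_fixed(const Node *n) {
  const Node *decl = enclosing_declaration(n);
  if (decl != nullptr && decl->kind == NodeKind::SubprogramSpecification)
    decl = decl->parent;
  return decl;
}
```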
This patch relaxes the requirement that discriminant values should be
known at compile time for a particular optimization to be applied. That
optimization is the one that treats an unconstrained object as constrained
when the object is of a limited type, in order to reduce the size of the
object.
What makes it possible to relax this requirement is that the set of
cases where the optimization is applied was narrowed in a previous
patch.
Yannick Moy [Tue, 21 Nov 2023 11:06:52 +0000 (12:06 +0100)]
ada: Do not issue SPARK legality error if SPARK_Mode ignored
When pragma Ignore_Pragma(SPARK_Mode) is used, do not issue error
messages related to SPARK legality checking. This facilitates the
instrumentation of code by GNATcoverage.
gcc/ada/
* doc/gnat_rm/implementation_defined_pragmas.rst: Fix doc for
pragma Ignore_Pragma, in the case where it follows another
configuration pragma that it names, which causes the preceding
pragma to be ignored after parsing.
* errout.adb (Should_Ignore_Pragma_SPARK_Mode): New query.
(SPARK_Msg_N): Do nothing if SPARK_Mode is ignored.
(SPARK_Msg_NE): Same.
* gnat-style.texi: Regenerate.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Yannick Moy [Tue, 21 Nov 2023 15:48:20 +0000 (16:48 +0100)]
ada: Cleanup SPARK legality checking
Move one SPARK legality check from GNAT to GNATprove, and clean up
other uses of SPARK_Mode for legality checking.
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): Check correct mode
variable for GNATprove.
* sem_prag.adb (Refined_State): Call SPARK_Msg_NE which checks
value of SPARK_Mode before issuing a message.
* sem_res.adb (Resolve_Entity_Name): Remove legality check for
SPARK RM 6.1.9(1), moved to GNATprove.