Jeff Law [Thu, 28 Mar 2024 22:56:53 +0000 (16:56 -0600)]
[committed] Provide suitable output template for zero_extendqihi2 on H8
Segher's recent combine change, quite unexpectedly, triggered a regression on
the H8 port. It failed to build newlib.
The zero_extendqihi2 pattern provided two alternatives. One where the source
and destination matched. That turns into a suitable instruction trivially.
The second alternative was actually meant to capture cases where the value is
coming from memory.
What was missing here was the reg->reg case where the source and destination do
not match. That fell into the second case which was requested to be split by
the pattern's output template.
The splitter had a suitable condition to make sure it only triggered in the
right cases. Unfortunately with the pattern requiring a split in a case where
the splitter was going to fail led to the fault.
So regardless of what's going on in the combiner, this code was just wrong.
Fixed thusly by providing a suitable output template for the reg->reg case.
Regression tested on h8300-elf. Pushing to the trunk.
gcc/
* config/h8300/extensions.md (zero_extendqihi*): Add output
template for reg->reg case where the regs don't match.
Jason Merrill [Wed, 27 Mar 2024 20:14:01 +0000 (16:14 -0400)]
c++: __is_constructible ref binding [PR100667]
The requirement that a type argument be complete is excessive in the case of
direct reference binding to the same type, which does not rely on any
properties of the type. This is LWG 2939.
PR c++/100667
gcc/cp/ChangeLog:
* semantics.cc (same_type_ref_bind_p): New.
(finish_trait_expr): Use it.
Harald Anlauf [Wed, 27 Mar 2024 20:18:04 +0000 (21:18 +0100)]
Fortran: fix DATA and derived types with pointer components [PR114474]
When matching actual arguments in match_actual_arg, these are initially
treated as a possible dummy procedure, assuming that the correct type is
determined later. This resolution could fail when the procedure is a
derived type constructor with a pointer component and appears in a DATA
statement, where the pointer shall be associated with an initial data
target. Check for those cases where the type obviously has not been
resolved yet, and which were missed because there was no component
reference.
gcc/fortran/ChangeLog:
PR fortran/114474
* primary.cc (gfc_variable_attr): Catch variables used in structure
constructors within DATA statements that are still tagged with a
temporary type BT_PROCEDURE from match_actual_arg and which have the
target attribute, and fix their typespec.
gcc/testsuite/ChangeLog:
PR fortran/114474
* gfortran.dg/data_pointer_3.f90: New test.
Vineet Gupta [Wed, 27 Mar 2024 21:55:04 +0000 (14:55 -0700)]
RISC-V: testsuite: ensure vtype is call clobbered
Per classic Vector calling convention ABI, vtype is call clobbered,
so ensure gcc regenerates a VSETVLI in following cases:
- after a function call.
- after an inline asm fragment which clobbers vtype.
ATM gcc seems to be doing the right thing, but a test can never hurt.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vtype-call-clobbered.c: New Test.
Gaius Mulley [Thu, 28 Mar 2024 16:49:44 +0000 (16:49 +0000)]
PR modula2/114520 Incorrect ordering of import/export statements cause confusion
The error recovery causes misleading error messages to appear if an
EXPORT and IMPORT statement are in the wrong order. This patch
detects the incorrect order and issues an error message and prevents
error recovery. The fix should be improved and made more general if
another similar case is required.
gcc/m2/ChangeLog:
PR modula2/114520
* gm2-compiler/P0SyntaxCheck.bnf (DetectImport): New
procedure.
(EnableImportCheck): New boolean.
(Expect): Call DetectImport.
(Export): Set EnableImportCheck TRUE before ';' and FALSE
afterwards.
Gaius Mulley [Thu, 28 Mar 2024 14:57:49 +0000 (14:57 +0000)]
PR modula2/114517 gm2 does not allow comparison operator hash in column one
This patch allows -fno-cpp to be supplied to gm2. Without this patch
it causes an ICE. The patch allows -fno-cpp to turn off cpp flags.
These are tested in m2.flex to decide whether a change of state is
allowed (enabling handling of #line directives).
gcc/ChangeLog:
PR modula2/114517
* doc/gm2.texi: Mention gm2 treats a # in the first column
as a preprocessor directive unless -fno-cpp is supplied.
gcc/m2/ChangeLog:
PR modula2/114517
* gm2-compiler/M2Options.def (SetCpp): Add comment.
(GetCpp): Move after SetCpp.
(GetLineDirectives): New procedure function.
* gm2-compiler/M2Options.mod (GetLineDirectives): New
procedure function.
* gm2-gcc/m2options.h (M2Options_GetLineDirectives): New
prototype.
* gm2-lang.cc (gm2_langhook_init_options): OPT_fcpp only
assert if !value.
* m2.flex: Test GetLineDirectives before changing to LINE0
state.
gcc/testsuite/ChangeLog:
PR modula2/114517
* gm2/cpp/fail/hashfirstcolumn2.mod: New test.
* gm2/imports/fail/imports-fail.exp: New test.
* gm2/imports/fail/localmodule2.mod: New test.
* gm2/imports/run/pass/localmodule.mod: New test.
Jakub Jelinek [Thu, 28 Mar 2024 14:00:44 +0000 (15:00 +0100)]
profile-count: Avoid overflows into uninitialized [PR112303]
The testcase in the patch ICEs with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)
/* Propagate constants immediately, but leave an unused initialization
around to avoid invalidating the SCEV cache. */
- if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+ if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
replace_uses_by (rslt, def);
/* Create the replacement statements. */
(the addition of the above made the ICE latent), because profile_count
addition doesn't check for overflows and if unlucky, we can even overflow
into the uninitialized value.
Getting really huge profile counts is very easy even when not using
recursive inlining in loops, e.g.
__attribute__((noipa)) void
bar (void)
{
__builtin_exit (0);
}
__attribute__((noipa)) void
foo (void)
{
for (int i = 0; i < 1000; ++i)
for (int j = 0; j < 1000; ++j)
for (int k = 0; k < 1000; ++k)
for (int l = 0; l < 1000; ++l)
for (int m = 0; m < 1000; ++m)
for (int n = 0; n < 1000; ++n)
for (int o = 0; o < 1000; ++o)
for (int p = 0; p < 1000; ++p)
for (int q = 0; q < 1000; ++q)
for (int r = 0; r < 1000; ++r)
for (int s = 0; s < 1000; ++s)
for (int t = 0; t < 1000; ++t)
for (int u = 0; u < 1000; ++u)
for (int v = 0; v < 1000; ++v)
for (int w = 0; w < 1000; ++w)
for (int x = 0; x < 1000; ++x)
for (int y = 0; y < 1000; ++y)
for (int z = 0; z < 1000; ++z)
for (int a = 0; a < 1000; ++a)
for (int b = 0; b < 1000; ++b)
bar ();
}
int
main ()
{
foo ();
}
reaches the maximum count already on the 11th loop.
Some other methods of profile_count like apply_scale already
do use MIN (val, max_count) before assignment to m_val, this patch
just extends that to operator{+,+=} methods.
Furthermore, one overload of apply_probability wasn't using
safe_scale_64bit and so could very easily overflow as well
- prob is required to be [0, 10000] and if m_val is near the max_count,
it can overflow even with multiplications by 8.
2024-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112303
* profile-count.h (profile_count::operator+): Perform
addition in uint64_t variable and set m_val to MIN of that
val and max_count.
(profile_count::operator+=): Likewise.
(profile_count::operator-=): Formatting fix.
(profile_count::apply_probability): Use safe_scale_64bit
even in the int overload.
Maxim Kuvyrkov [Wed, 13 Mar 2024 06:48:47 +0000 (06:48 +0000)]
[testsuite] Fixup dg-options in {gcc,g++,gfortran}.dg/vect.exp tests
Testsuites driven by vect.exp rely on check_vect_support_and_set_flags
to set appropriate DEFAULT_VECTFLAGS for a given target (e.g., add
-mfpu=neon for arm-linux-gnueabi). Unfortunately, these flags are
overwritten by dg-options directive, which can cause tests to fail.
Behavior of dg-options is documented in vect.exp files, but not
all developers look at the .exp file when adding a new testcase.
This caused a few dg-options directives to be used instead of
the more appropriate dg-additional-options.
This patch changes target-independent dg-options into
dg-additional-options. This patch does not touch target-specific
dg-options and target-specific tests to avoid disturbing the gentle
balance of target-specific vectorization.
This patch also removes a couple of unneeded "dg-do run" directives
to avoid failures on compile-only targets. Default action is, again,
set by check_vect_support_and_set_flags.
Lastly, I avoided renaming tests that use -O<n> options to O<n>-*
filename format because this support is not consistent between
gcc.dg/vect/, g++.dg/vect/, and gfortran.dg/vect/ testsuites.
It seems dg-additional-options is cleaner.
This patch does the following,
1. do not change target-specific tests, e.g., gcc.dg/vect/costmodel/riscv/*;
2. do not change { dg-options FOO { target { target-*-pattern } } };
3. do not remove { dg-do run { target { target-*-pattern } } };
4. change { dg-options FOO } to { dg-additional-options FOO };
5. remove { dg-do run } in several tests, where it is clearly not needed.
Mikael Morin [Wed, 27 Mar 2024 15:30:42 +0000 (16:30 +0100)]
fortran: Fix specification expression check in submodules [PR114475]
The patch fixing PR111781 made the check of specification expressions more
restrictive, disallowing local variables in specification expressions of
dummy arguments. PR114475 showed an example where that change regressed,
disallowing in submodules expressions that had been allowed in the parent
module. In submodules indeed, the hierarchy of namespaces inherited from
the parent module is not reproduced so the host-association of symbols
can't be recognized by checking the nesting of namespaces.
This change fixes the problem by allowing in specification expressions
all the symbols in a submodule that are inherited from the parent module.
PR fortran/111781
PR fortran/114475
gcc/fortran/ChangeLog:
* expr.cc (check_restricted): In submodules, allow variables host-
associated from the parent module.
Palmer Dabbelt [Wed, 27 Mar 2024 19:54:04 +0000 (12:54 -0700)]
RISC-V: Add vxsat as a register
We aren't doing anything with vxsat right now, but I'd like to add it as
an accepted register to the clobber list. If we get this into GCC-14
then we'll avoid some preprocessor-based twiddling if we ever start
using vxsat in the future.
David Malcolm [Wed, 27 Mar 2024 22:26:51 +0000 (18:26 -0400)]
analyzer: fix ICE due to type mismatch when replaying call summary [PR114473]
gcc/analyzer/ChangeLog:
PR analyzer/114473
* call-summary.cc
(call_summary_replay::convert_svalue_from_summary): Assert that
the types match.
(call_summary_replay::convert_region_from_summary): Likewise.
(call_summary_replay::convert_region_from_summary_1): Add missing
cast for the deref of RK_SYMBOLIC case.
gcc/testsuite/ChangeLog:
PR analyzer/114473
* gcc.dg/analyzer/call-summaries-pr114473.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jakub Jelinek [Wed, 27 Mar 2024 19:22:02 +0000 (20:22 +0100)]
btf: Fix up btf-datasec-1.c test on x86
> -/* The offset entry for each variable in a DATSEC should be 0 at compile time. */
> -/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 7 } } */
> +/* The offset entry for each variable in a DATSEC should contain a label. */
> +/* { dg-final { scan-assembler-times ".4byte\[\t \]\[a-e\]\[\t \]+\[^\n\]*bts_offset" 5 } } */
4byte is used only on some targets, what exact assembler directive is used
for 4byte unaligned data is heavily target dependent.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Use dg-additional-options
instead of multiple dg-options.
* gcc.dg/debug/btf/btf-datasec-1.c: Likewise. Accept all supported
unaligned 4 byte assembler directives rather than assuming it must
be .4byte.
Jakub Jelinek [Wed, 27 Mar 2024 18:38:06 +0000 (19:38 +0100)]
c-family: Cast __atomic_load_*/__atomic_exchange_* result to _BitInt rather then VCE it [PR114469]
As written in the PR, torture/bitint-64.c test fails with -O2 -flto
and the reason is that on _BitInt arches where the padding bits
are undefined, the padding bits in the _Atomic vars are also undefined,
but when __atomic_load or __atomic_exchange on a _BitInt _Atomic variable
with some padding bits is lowered into __atomic_load_{1,2,4,8,16} or
__atomic_exchange_*, the mode precision unsigned result is VIEW_CONVERT_EXPR
converted to _BitInt and because of the VCE nothing actually sign/zero
extends it as needed for later uses - the var is no longer addressable and
expansion assumes such automatic vars are properly extended.
The following patch fixes that by using NOP_EXPR on it (the
VIEW_CONVERT_EXPR after it will then be optimized away during
gimplification, didn't want to repeat it in the code as else result = build1
(VIEW_CONVERT_EXPR, ...); twice.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114469
* c-common.cc (resolve_overloaded_builtin): For _BitInt result
on !extended targets convert result to the _BitInt type before
using VIEW_CONVERT_EXPR.
In some cases combine will "combine" an I2 and I3, but end up putting
exactly the same thing back as I2 as was there before. This is never
progress, so we shouldn't do it, it will lead to oscillating behaviour
and the like.
If we want to canonicalise things, that's fine, but this is not the
way to do it.
Jakub Jelinek [Wed, 27 Mar 2024 14:41:59 +0000 (15:41 +0100)]
docs: Use @var{S} etc. in Spec File invoke.texi documentation
We got internally a question about the Spec File syntax, misunderstanding
what is the literal syntax and what are the placeholder variables in
the syntax descriptions.
The following patch attempts to use @var{S} etc. instead of just S
to clarify it stands for any option (or start of option etc.) rather
than literal S, say in %{S:X}. At least in HTML documentation it
then uses italics.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (Spec Files): Use @var{S} instead of S,
@var{X} instead of X etc. for other placeholders.
* include/experimental/bits/simd_x86.h (_S_masked_unary):
Cast inputs < 16 bytes to 16 byte vectors before calling the
right subtraction builtin. Before returning, truncate to the
return vector type.
* include/experimental/bits/simd_x86.h (_S_masked_unary): Call
the 4- and 8-byte variants of __builtin_ia32_subp[ds] without
rounding direction argument.
libstdc++: add ARM SVE support to std::experimental::simd
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add simd_sve.h.
* include/Makefile.in: Add simd_sve.h.
* include/experimental/bits/simd.h: Add new SveAbi.
* include/experimental/bits/simd_builtin.h: Use
__no_sve_deduce_t to support existing Neon Abi.
* include/experimental/bits/simd_converter.h: Convert
sequentially when sve is available.
* include/experimental/bits/simd_detail.h: Define sve
specific macro.
* include/experimental/bits/simd_math.h: Fallback frexp
to execute sequntially when sve is available, to handle
fixed_size_simd return type that always uses sve.
* include/experimental/simd: Include bits/simd_sve.h.
* testsuite/experimental/simd/tests/bits/main.h: Enable
testing for sve128, sve256, sve512.
* include/experimental/bits/simd_sve.h: New file.
Richard Biener [Wed, 27 Mar 2024 10:37:16 +0000 (11:37 +0100)]
tree-optimization/114057 - handle BB reduction remain defs as LIVE
The following makes sure to record the scalars we add to the BB
reduction vectorization result as scalar uses for the purpose of
computing live lanes. This restores vectorization in the
bondfree.c TU of 435.gromacs.
PR tree-optimization/114057
* tree-vect-slp.cc (vect_bb_slp_mark_live_stmts): Mark
BB reduction remain defs as scalar uses.
Jakub Jelinek [Wed, 27 Mar 2024 11:00:58 +0000 (12:00 +0100)]
testsuite: Fix up ext-floating{3,12}.C on i686-linux
These tests FAIL for quite a while on i686-linux since July last year,
likely r14-2628 . Since that patch gcc claims _Float16 and __bf16
support even without -msse2 because some functions could be using
target attribute.
Later r14-2691 added -msse2 to add_options_for_float16, but didn't do that
for bfloat16, plus ext-floating{3,12}.C tests need the added dg-add-options,
so that float16 and bfloat16 effective targets match the __STDCPP_FLOAT16_T__
or __STDCPP_BFLOAT16_T__ macros.
Fixes
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 25)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 25)
on the latter and changes nothing on the former.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* lib/target-supports.exp (add_options_for_bfloat16): Add -msse2 on
i?86/x86_64.
* g++.dg/cpp23/ext-floating3.C: Add dg-add-options float16.
* g++.dg/cpp23/ext-floating12.C: Add dg-add-options float16 and
bfloat16.
aarch64: Align lrcpc3 FEAT_STRING with /proc/cpuinfo 'Features' entry
Due to the Linux kernel exposing the lrcpc3 architectural feature as
"lrcpc3", this patch corrects the relevant FEATURE_STRING entry in the
"rcpc3" AARCH64_OPT_FMV_EXTENSION macro, such that the feature can be
correctly detected when doing native compilation on rcpc3-enabled
targets.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (rcpc3):
Fix FEATURE_STRING field to "lrcpc3".
aarch64: Add +lse128 architectural extension command-line flag
Given how, at present, the choice of using LSE128 atomic instructions
by the toolchain is delegated to run-time selection in the form of
Libatomic ifuncs, responsible for querying target support, the
`+lse128' target architecture compile-time flag is absent from GCC.
This, however, contrasts with the Binutils implementation, which gates
LSE128 instructions behind the `+lse128' flag. This can lead to
problems in GCC for certain use-cases. One such example is in the use
of inline assembly, whereby the inability of enabling the feature in
the command-line prevents the compiler from automatically issuing the
necessary LSE128 `.arch' directive.
This patch therefore brings GCC into alignment with LLVM and Binutils
in adding support for the `+lse128' architectural extension flag.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def: Add LSE128
AARCH64_OPT_EXTENSION, adding it as a dependency for the D128
feature.
* doc/invoke.texi (AArch64 Options): Document +lse128.
For targets where LOGICAL_OP_NON_SHORT_CIRCUIT evaluates to false, two
conditional jumps are emitted instead of a combined conditional which
this test is all about. Thus, set it to true.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/copy-headers-8.c: Set
LOGICAL_OP_NON_SHORT_CIRCUIT to true.
GCC was defining bts_offset entry to always contain 0.
When comparing with clang, the same entry is instead a label to the
respective variable or function. The assembler emits relocations for
those labels.
Jakub Jelinek [Tue, 26 Mar 2024 15:40:53 +0000 (16:40 +0100)]
testsuite: Fix up pr111151.c testcase [PR114486]
Apparently I've somehow screwed up the adjustments of the originally tested
testcase, tweaked it so that in the second/third cases it actually see
a MAX_EXPR rather than COND_EXPR the MAX_EXPR has been optimized into,
and didn't update the expected value.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
PR testsuite/114486
* gcc.c-torture/execute/pr111151.c (main): Fix up expected value for
f.
PR modula2/114478
* gm2/builtins/run/pass/builtins-run-pass.exp: New test.
* gm2/builtins/run/pass/testcomparisons.mod: New test.
* gm2/builtins/run/pass/testisnormal.mod: New test.
* gm2/pimlib/run/pass/testchar.mod: New test.
Richard Ball [Tue, 26 Mar 2024 13:54:31 +0000 (13:54 +0000)]
aarch64: Fix SCHEDULER_IDENT for Cortex-A510 and Cortex-A520
The SCHEDULER_IDENT for these two CPUs
was incorrectly set to cortexa55.
This can cause sub-optimal asm
to be generated.
gcc/ChangeLog:
PR target/114272
* config/aarch64/aarch64-cores.def (AARCH64_CORE):
Change SCHEDULER_IDENT from cortexa55 to cortexa53
for Cortex-A510 and Cortex-A520.
Jonathan Wakely [Fri, 22 Mar 2024 22:26:10 +0000 (22:26 +0000)]
libstdc++: Replace stacktrace effective target with feature test
Rmove the dejagnu code for checking whether std::stacktrace is supported
and just use the new dg-require-cpp-feature-test directive to check for
__cpp_lib_stacktrace instead.
Jonathan Wakely [Fri, 22 Mar 2024 22:01:50 +0000 (22:01 +0000)]
libstdc++: Add dg-require-cpp-feature-test to test feature test macros
This adds a new dejagnu directive which can be used to make a test
depend on a feature test macro such as __cpp_lib_text_encoding. This is
mroe flexible than writing a new dg-require-xxx for each feature.
libstdc++-v3/ChangeLog:
* testsuite/lib/dg-options.exp (dg-require-cpp-feature-test):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_cpp_feature_test):
New proc.
* testsuite/std/text_encoding/cons.cc: Use new directive to skip
the test if the __cpp_lib_text_encoding feature test macro is
not defined.
* testsuite/std/text_encoding/requirements.cc: Likewise.
Jakub Jelinek [Tue, 26 Mar 2024 10:25:15 +0000 (11:25 +0100)]
testsuite: Add -Wno-psabi to pr113126.c test
I've missed
FAIL: gcc.dg/torture/pr113126.c -O0 (test for excess errors)
etc. regressions on i686-linux since January. The problem is obvious
Excess errors:
.../gcc/testsuite/gcc.dg/torture/pr113126.c:11:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi]
and I've added -Wno-psabi to dg-additional-options to fix that up.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/torture/pr113126.c: Add -Wno-psabi as dg-additional-options.
Jakub Jelinek [Tue, 26 Mar 2024 10:21:38 +0000 (11:21 +0100)]
fold-const: Punt on MULT_EXPR in extract_muldiv MIN/MAX_EXPR case [PR111151]
As I've tried to explain in the comments, the extract_muldiv_1
MIN/MAX_EXPR optimization is wrong for code == MULT_EXPR.
If the multiplication is done in unsigned type or in signed
type with -fwrapv, it is fairly obvious that max (a, b) * c
in many cases isn't equivalent to max (a * c, b * c) (or min if c is
negative) due to overflows, but even for signed with undefined overflow,
the optimization could turn something without UB in it (where
say a * c invokes UB, but max (or min) picks the other operand where
b * c doesn't).
As for division/modulo, I think it is in most cases safe, except if
the problematic INT_MIN / -1 case could be triggered, but we can
just punt for MAX_EXPR because for MIN_EXPR if one operand is INT_MIN,
we'd pick that operand already. It is just for completeness, match.pd
already has an optimization which turns x / -1 into -x, so the division
by zero is mostly theoretical. That is also why in the testcase the
i case isn't actually miscompiled without the patch, while the c and f
cases are.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
* fold-const.cc (extract_muldiv_1) <case MAX_EXPR>: Punt for
MULT_EXPR altogether, or for MAX_EXPR if c is -1.
Jakub Jelinek [Tue, 26 Mar 2024 10:06:15 +0000 (11:06 +0100)]
tsan: Don't instrument non-generic AS accesses [PR111736]
Similar to the asan and ubsan changes, we shouldn't instrument non-generic
address space accesses with tsan, because we just have library functions
which take address of the objects as generic address space pointers, so they
can't handle anything else.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/111736
* tsan.cc (instrument_expr): Punt on non-generic address space
accesses.
Richard Biener [Tue, 26 Mar 2024 08:11:00 +0000 (09:11 +0100)]
tree-optimization/114471 - ICE with mismatching vector types
The following fixes too lax verification of vector type compatibility
in vectorizable_operation. When we only have a single vector size then
comparing the number of elements is enough but with SLP we mix those
and thus for operations like BIT_AND_EXPR we need to verify compatible
element types as well. Allow sign changes for ABSU_EXPR though.
PR tree-optimization/114471
* tree-vect-stmts.cc (vectorizable_operation): Verify operand
types are compatible with the result type.
Richard Biener [Tue, 26 Mar 2024 08:39:30 +0000 (09:39 +0100)]
tree-optimization/114464 - verify types in recurrence vectorization
The following adds missing verification of vector type compatibility
to recurrence vectorization.
PR tree-optimization/114464
* tree-vect-loop.cc (vectorizable_recurr): Verify the latch
vector type is compatible with what we chose for the recurrence.
Jakub Jelinek [Tue, 26 Mar 2024 09:03:27 +0000 (10:03 +0100)]
c-family, c++: Handle EXCESS_PRECISION_EXPR in pretty printers [PR112724]
I've noticed that the c-c++-common/gomp/depobj-3.c test FAILs on i686-linux:
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 at line 17 (test for warnings, line 15)
FAIL: c-c++-common/gomp/depobj-3.c -std=c++17 at line 39 (test for warnings, line 37)
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 at line 43 (test for errors, line 41)
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 (test for warnings, line 45)
FAIL: c-c++-common/gomp/depobj-3.c -std=c++17 (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/c-c++-common/gomp/depobj-3.c:37:38: warning: the 'destroy' expression ''excess_precision_expr' not supported by dump_expr<expression error>' should
+be the same as the 'depobj' argument 'obj' [-Wopenmp]
The following patch replaces that 'excess_precision_expr' not supported by dump_expr<expression error>
with (float)(((long double)a) + (long double)5)
Still ugly and doesn't actually fix the FAIL (will deal with that
incrementally), but at least valid C/C++ and shows the excess precision
handling in action.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR c++/112724
gcc/c-family/
* c-pretty-print.cc (pp_c_cast_expression,
c_pretty_printer::expression): Handle EXCESS_PRECISION_EXPR like
NOP_EXPR.
gcc/cp/
* error.cc (dump_expr): Handle EXCESS_PRECISION_EXPR like NOP_EXPR.
Marek Polacek [Fri, 15 Mar 2024 13:23:28 +0000 (09:23 -0400)]
c++: ICE with noexcept and local specialization, again [PR114349]
Patrick noticed that my r14-9339-gdc6c3bfb59baab patch is wrong;
we're dealing with a noexcept-spec there, not a noexcept-expr, so
setting cp_noexcept_operand et al is incorrect. Back to the drawing
board then.
To fix noexcept84.C, we should probably avoid doing push_to_top_level
in certain cases. maybe_push_to_top_level didn't work here as-is, so
I changed it to not push to top level if decl_function_context is
non-null, when we are not dealing with a lambda.
This also fixes c++/114349, introduced by r14-9339.
PR c++/114349
gcc/cp/ChangeLog:
* name-lookup.cc (maybe_push_to_top_level): For a non-lambda,
don't push to top level if decl_function_context is non-null.
* pt.cc (maybe_instantiate_noexcept): Use maybe_push_to_top_level.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept85.C: New test.
* g++.dg/cpp0x/noexcept86.C: New test.
Marek Polacek [Mon, 25 Mar 2024 19:32:20 +0000 (15:32 -0400)]
c++: broken direct-init with trailing array member [PR114439]
can_init_array_with_p is wrongly saying that the init for 's' here:
struct S {
int *list = arr;
int arr[];
};
struct A {
A() {}
S s[2]{};
};
is invalid. But as process_init_constructor_array says, for "non-constant
initialization of trailing elements with no explicit initializers" we use
a VEC_INIT_EXPR wrapped in a TARGET_EXPR, built in process_init_constructor.
Unfortunately we didn't have a test for this scenario so I didn't
realize can_init_array_with_p must handle it.
PR c++/114439
gcc/cp/ChangeLog:
* init.cc (can_init_array_with_p): Return true for a VEC_INIT_EXPR
wrapped in a TARGET_EXPR.
Richard Biener [Mon, 25 Mar 2024 10:24:08 +0000 (11:24 +0100)]
amdgcn: Add gfx1036 target
Add support for the gfx1036 RDNA2 APU integrated graphics devices. The ROCm
documentation warns that these may not be supported, but it seems to work
at least partially.
build with -march=rv64gc -mabi=lp64d -O3, we will have asm like below:
test_1:
.option push
.option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\
zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0
vsetvli zero,a0,e32,m1,ta,ma
vadd.vv v8,v8,v9
ret
The riscv_vector.h must be included when leverage intrinisc type(s) and
API(s). And the scope of this attribute should not excced the function
body. Meanwhile, to make rvv types and API(s) available for this attribute,
include riscv_vector.h will not report error for now if v is not present
in march.
Below test are passed for this patch:
* The riscv fully regression test.
gcc/ChangeLog:
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error
when V is disabled and init the RVV types and intrinic APIs.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Report
error if V ext is disabled.
* config/riscv/riscv.cc (riscv_return_value_is_vector_type_p):
Ditto.
(riscv_arguments_is_vector_type_p): Ditto.
(riscv_vector_cc_function_p): Ditto.
* config/riscv/riscv_vector.h: Remove error if V is disable.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Remove.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: New test.
David Malcolm [Sat, 23 Mar 2024 13:52:38 +0000 (09:52 -0400)]
analyzer: fix ICE and false positive with -Wanalyzer-deref-before-check [PR114408]
gcc/analyzer/ChangeLog:
PR analyzer/114408
* engine.cc (impl_run_checkers): Free up any dominance info that
we may have created.
* kf.cc (class kf_ubsan_handler): New.
(register_sanitizer_builtins): New.
(register_known_functions): Call register_sanitizer_builtins.
gcc/testsuite/ChangeLog:
PR analyzer/114408
* c-c++-common/analyzer/deref-before-check-pr114408.c: New test.
* c-c++-common/ubsan/analyzer-ice-pr114408.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This was just approved in Tokyo as a DR for C++23. It doesn't affect us
yet, because we don't implement the __cpp_lib_format_ranges features. We
can add the disabled specializations and add a testcase now though.
libstdc++-v3/ChangeLog:
* include/std/format (formatter): Disable specializations that
would allow sequences of narrow characters to be formatted as
wchar_t without conversion, as per LWG 3944.
* testsuite/std/format/formatter/lwg3944.cc: New test.
Jonathan Wakely [Fri, 22 Mar 2024 11:47:44 +0000 (11:47 +0000)]
libstdc++: Add __is_in_place_index_v helper and use it in <variant>
We already have __is_in_place_type_v for in_place_type_t so adding an
equivalent for in_place_index_t allows us avoid a class template
instantiation for the __not_in_place_tag constraint on the most
commonly-used std::variant::variant(T&&) constructor.
For in_place_type_t we also have a __is_in_place_type class template
defined in terms of the variable template, but that isn't actually used
anywhere. I'm not adding an equivalent for the new variable template,
because that wouldn't be used either.
For GCC 15 we should remove the unused __is_in_place_tag and
__is_in_place_type class templates.
libstdc++-v3/ChangeLog:
* include/bits/utility.h (__is_in_place_index_v): New variable
template.
* include/std/variant (__not_in_place_tag): Define in terms of
variable templates not a class template.
Jonathan Wakely [Wed, 20 Mar 2024 11:07:56 +0000 (11:07 +0000)]
libstdc++: Use std::type_identity_t in <string_view> as per LWG 3950 [PR114400]
The difference between __type_identity_t and std::type_identity_t is
observable, as demonstrated in the PR. Nobody in LWG seems to think this
an example we should really care about, but it seems easy and harmless
to change this.
libstdc++-v3/ChangeLog:
PR libstdc++/114400
* include/std/string_view (operator==): Use std::type_identity_t
in C++20 instead of our own __type_identity_t.
Jakub Jelinek [Sat, 23 Mar 2024 10:20:00 +0000 (11:20 +0100)]
bitint: Fix bitfield loads in handle_cast [PR114433]
We ICE on the following testcase, because handle_cast was incorrectly
testing !m_first to see whether it should use m_data[m_bitfld_load + 1]
or fresh SSA_NAME for a PHI result.
Now, m_first is in the routine sometimes temporarily cleared in between
doing prepare_data_in_out and the !m_first check and only before returning
restored from the save_first copy.
Without this patch, we try to use the same SSA_NAME (_12 here) in 2
different PHI results which is obviously invalid IL and ICEs very quickly.
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114433
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): For
m_bitfld_load check save_first rather than m_first.
Jakub Jelinek [Sat, 23 Mar 2024 10:19:09 +0000 (11:19 +0100)]
bitint: Handle complex types in build_bitint_stmt_ssa_conflicts [PR114425]
The task of the build_bitint_stmt_ssa_conflicts hook for
tree-ssa-coalesce.cc next to special casing the
multiplication/division/modulo is to ignore statements with
large/huge _BitInt lhs which isn't in names bitmap and on the
other side pretend all uses of the stmt are used in a later stmt
(single user of that SSA_NAME or perhaps single user of lhs of
the single user etc.) where the lowering will actually emit the
code.
Unfortunately the function wasn't handling COMPLEX_TYPE of the large/huge
BITINT_TYPE, while the FE doesn't really support such types, they are
used under the hood for __builtin_{add,sub,mul}_overflow{,_p}, they are
also present or absent from the names bitmap and should be treated the same.
Without this patch, the operands of .ADD_OVERFLOW were incorrectly pretended
to be used right in that call statement rather than on the cast stmt from
IMAGPART_EXPR of .ADD_OVERFLOW return value to some integral type.
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114425
* gimple-lower-bitint.cc (build_bitint_stmt_ssa_conflicts): Handle
_Complex large/huge _BitInt types like the large/huge _BitInt types.
Jakub Jelinek [Sat, 23 Mar 2024 10:17:44 +0000 (11:17 +0100)]
predcom: Punt for steps which aren't multiples of access size [PR111683]
On the following testcases, there is no overlap between data references
within a single iteration, but the data references have size which is twice
as large as the step, which means the data references overlap with the next
iteration which predcom doesn't take into account.
As discussed in the PR, even if the reference size is smaller than step,
if step isn't a multiple of the reference size, there could be overlaps with
some other iteration later on.
The initial version of the patch regressed (test still passed, but predcom
didn't optimize anymore) pr71083.c which has a packed char, short structure
and was reading/writing the short 2 bytes in there with step 3.
The following patch deals with that by retrying for COMPONENT_REFs also the
aggregate sizes etc., so that it then compares 3 bytes against step 3.
In make check-gcc/check-g++ this patch I believe affects code generation
for only the 2 new testcases according to statistics I've gathered.
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111683
* tree-predcom.cc (pcom_worker::suitable_component_p): If has_write
and comp_step is RS_NONZERO, return false if any reference in the
component doesn't have DR_STEP a multiple of access size.
* gcc.dg/pr111683-1.c: New test.
* gcc.dg/pr111683-2.c: New test.
xtensa: Add supplementary split pattern for "*addsubx"
int test(int a) {
return a * 4 + 30000;
}
In the example above, since Xtensa has instructions to add register value
scaled by 2, 4 or 8 (and corresponding define_insns), we would expect them
to be used but not, because it is transformed before reaching the RTL
generation pass as below:
int test(int a) {
return (a + 7500) * 4;
}
Fortunately, the RTL combination pass tries a splitting pattern that matches
the first example, so it is easy to solve by defining that pattern.
gcc/ChangeLog:
* config/xtensa/xtensa.md: Add new split pattern described above.
Jonathan Wakely [Thu, 21 Mar 2024 13:25:15 +0000 (13:25 +0000)]
libstdc++: Destroy allocators in re-inserted container nodes [PR114401]
The allocator objects in container node handles were not being destroyed
after the node was re-inserted into a container. They are stored in a
union and so need to be explicitly destroyed when the node becomes
empty. The containers were zeroing the node handle's pointer, which
makes it empty, causing the handle's destructor to think there's nothign
to clean up.
Add a new member function to the node handle which destroys the
allocator and zeros the pointer. Change the containers to call that
instead of just changing the pointer manually.
We can also remove the _M_empty member of the union which is not
necessary.
libstdc++-v3/ChangeLog:
PR libstdc++/114401
* include/bits/hashtable.h (_Hashtable::_M_reinsert_node): Call
release() on node handle instead of just zeroing its pointer.
(_Hashtable::_M_reinsert_node_multi): Likewise.
(_Hashtable::_M_merge_unique): Likewise.
(_Hashtable::_M_merge_multi): Likewise.
* include/bits/node_handle.h (_Node_handle_common::release()):
New member function.
(_Node_handle_common::_Optional_alloc::_M_empty): Remove
unnecessary union member.
(_Node_handle_common): Declare _Hashtable as a friend.
* include/bits/stl_tree.h (_Rb_tree::_M_reinsert_node_unique):
Call release() on node handle instead of just zeroing its
pointer.
(_Rb_tree::_M_reinsert_node_equal): Likewise.
(_Rb_tree::_M_reinsert_node_hint_unique): Likewise.
(_Rb_tree::_M_reinsert_node_hint_equal): Likewise.
* testsuite/23_containers/multiset/modifiers/114401.cc: New test.
* testsuite/23_containers/set/modifiers/114401.cc: New test.
* testsuite/23_containers/unordered_multiset/modifiers/114401.cc: New test.
* testsuite/23_containers/unordered_set/modifiers/114401.cc: New test.
This is needed to avoid errors outside the immediate context when
evaluating is_default_constructible_v<vector<T, A>> when A is not
default constructible.
To avoid diagnostic regressions for 23_containers/vector/48101_neg.cc we
need to make the std::allocator<cv T> partial specializations default
constructible, which they probably should have been anyway.
libstdc++-v3/ChangeLog:
PR libstdc++/113841
* include/bits/allocator.h (allocator<cv T>): Add default
constructor to partial specializations for cv-qualified types.
* include/bits/stl_vector.h (_Vector_impl::_Vector_impl()):
Constrain so that it's only present if the allocator is default
constructible.
* include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()):
Likewise.
* testsuite/23_containers/vector/cons/113841.cc: New test.
Jonathan Wakely [Mon, 18 Mar 2024 13:09:52 +0000 (13:09 +0000)]
libstdc++: Use feature test macros in <bits/stl_construct.h>
The preprocessor checks for __cplusplus in <bits/stl_construct.h> should
use the appropriate feature test macros instead of __cplusplus, namely
__glibcxx_raw_memory_algorithms and __cpp_constexpr_dynamic_alloc.
For the latter, we want to check the compiler macro not the library's
__cpp_lib_constexpr_dynamic_alloc, because the latter is not defined for
freestanding but std::construct_at needs to be.
libstdc++-v3/ChangeLog:
* include/bits/stl_construct.h (destroy_at, construct_at): Guard
with feature test macros instead of just __cplusplus.
Jonathan Wakely [Tue, 19 Mar 2024 14:02:06 +0000 (14:02 +0000)]
libstdc++: Replace std::result_of with __invoke_result_t [PR114394]
Replace std::result_of with std::invoke_result, as specified in the
standard since C++17, to avoid deprecated warnings for std::result_of.
We don't have __invoke_result_t in C++11 mode, so add it as an alias
template for __invoke_result<>::type (which is what std::result_of uses
as its base class, so there's no change in functionality).
This fixes warnings given by Clang 18.
libstdc++-v3/ChangeLog:
PR libstdc++/114394
* include/std/functional (bind): Use __invoke_result_t instead
of result_of::type.
* include/std/type_traits (__invoke_result_t): New alias
template.
* testsuite/20_util/bind/ref_neg.cc: Adjust prune pattern.
openmp: Change to using a hashtab to lookup offload target addresses for indirect function calls
A splay-tree was previously used to lookup equivalent target addresses
for a given host address on offload targets. However, as splay-trees can
modify their structure on lookup, they are not suitable for concurrent
access from separate teams/threads without some form of locking. This
patch changes the lookup data structure to a hashtab instead, which does
not have these issues.
The call to build_indirect_map to initialize the data structure is now
called from just the first thread of the first team to avoid redundant
calls to this function.
Andrew Stubbs [Wed, 6 Mar 2024 15:54:46 +0000 (15:54 +0000)]
amdgcn: Adjust GFX10/GFX11 cache coherency
The RDNA devices have different cache architectures to the CDNA devices, and
the differences go deeper than just the assembler mnemonics.
I believe this patch is correct according to the documentation in the LLVM
AMDGPU user guide (the ISA manual is less instructive), but I hadn't observed
any real problems before (or after).
gcc/ChangeLog:
* config/gcn/gcn.md (*memory_barrier): Split into RDNA and !RDNA.
(atomic_load<mode>): Adjust RDNA cache settings.
(atomic_store<mode>): Likewise.
(atomic_exchange<mode>): Likewise.
Andrew Stubbs [Thu, 22 Feb 2024 11:41:19 +0000 (11:41 +0000)]
amdgcn: Prefer V32 on RDNA devices
We run these devices in wavefrontsize64 for compatibility, but they actually
only have 32-lane vectors, natively. If the upper part of a V64 is masked
off (as it is in V32) then RDNA devices will skip execution of the upper part
for most operations, so this adjustment shouldn't leave too much performance on
the table. One exception is memory instructions, so full wavefrontsize32
support would be better.
The advantage is that we avoid the missing V64 operations (such as permute and
vec_extract).
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
RDNA devices.
David Malcolm [Fri, 22 Mar 2024 14:57:25 +0000 (10:57 -0400)]
analyzer: look through casts in taint sanitization [PR112974,PR112975]
PR analyzer/112974 and PR analyzer/112975 record false positives
from the analyzer's taint detection where sanitization of the form
if (VALUE CMP VALUE-OF-WIDER-TYPE)
happens, but wasn't being "noticed" by the taint checker, due to the
test being:
(WIDER_TYPE)VALUE CMP VALUE-OF-WIDER-TYPE
at the gimple level, and thus taint_state_machine recording
sanitization of (WIDER_TYPE)VALUE, but not of VALUE.
Fix by stripping casts in taint_state_machine::on_condition so that
the state machine records sanitization of the underlying value.
gcc/analyzer/ChangeLog:
PR analyzer/112974
PR analyzer/112975
* sm-taint.cc (taint_state_machine::on_condition): Strip away
casts before considering LHS and RHS, to increase the chance of
detecting places where sanitization of a value may have happened.
gcc/testsuite/ChangeLog:
PR analyzer/112974
PR analyzer/112975
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
taint-pr112974.c and taint-pr112975.c to analyzer_kernel_plugin.c.
* gcc.dg/plugin/taint-pr112974.c: New test.
* gcc.dg/plugin/taint-pr112975.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Stubbs [Fri, 15 Mar 2024 14:26:15 +0000 (14:26 +0000)]
amdgcn: Add gfx1103 target
Add support for the gfx1103 RDNA3 APU integrated graphics devices. The ROCm
documentation warns that these may not be supported, but it seems to work
at least partially.
Marek Polacek [Thu, 22 Feb 2024 23:49:08 +0000 (18:49 -0500)]
c++: direct-init of an array of class type [PR59465]
...from another array in a mem-initializer should not be accepted.
We already reject
struct string {} a[1];
string x[1](a);
but
struct pair {
string s[1];
pair() : s(a) {}
};
is wrongly accepted.
It started to be accepted with r0-110915-ga034826198b771:
<https://gcc.gnu.org/pipermail/gcc-patches/2011-August/320236.html>
which was supposed to be a cleanup, not a deliberate change to start
accepting the code. The build_vec_init_expr code was added in r165976:
<https://gcc.gnu.org/pipermail/gcc-patches/2010-October/297582.html>.
It appears that we do the magic copy array when we have a defaulted
constructor and we generate code for its mem-initializer which
initializes an array. I also see that we go that path for compound
literals. So when initializing an array member, we can limit building
up a VEC_INIT_EXPR to those special cases.
PR c++/59465
gcc/cp/ChangeLog:
* init.cc (can_init_array_with_p): New.
(perform_member_init): Check it.
gcc/testsuite/ChangeLog:
* g++.dg/init/array62.C: New test.
* g++.dg/init/array63.C: New test.
* g++.dg/init/array64.C: New test.
Andrew Stubbs [Fri, 15 Mar 2024 14:21:15 +0000 (14:21 +0000)]
vect: more oversized bitmask fixups
These patches fix up a failure in testcase vect/tsvc/vect-tsvc-s278.c when
configured to use V32 instead of V64 (I plan to do this for RDNA devices).
The problem was that a "not" operation on the mask inadvertently enabled
inactive lanes 31-63 and corrupted the output. The fix is to adjust the mask
when calling internal functions (in this case COND_MINUS), when doing masked
loads and stores, and when doing conditional jumps (some cases were already
handled).
gcc/ChangeLog:
* dojump.cc (do_compare_rtx_and_jump): Clear excess bits in vector
bitmasks.
(do_compare_and_jump): Remove now-redundant similar code.
* internal-fn.cc (expand_fn_using_insn): Clear excess bits in vector
bitmasks.
(add_mask_and_len_args): Likewise.
Thomas Neumann [Mon, 11 Mar 2024 13:35:20 +0000 (14:35 +0100)]
handle unwind tables that are embedded within unwinding code [PR111731]
Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731
The unwinding mechanism registers both the code range and the unwind
table itself within a b-tree lookup structure. That data structure
assumes that is consists of non-overlappping intervals. This
becomes a problem if the unwinding table is embedded within the
code itself, as now the intervals do overlap.
To fix this problem we now keep the unwind tables in a separate
b-tree, which prevents the overlap.
libgcc/ChangeLog:
PR libgcc/111731
* unwind-dw2-fde.c: Split unwind ranges if they contain the
unwind table.
Mikael Morin [Thu, 21 Mar 2024 16:27:54 +0000 (17:27 +0100)]
fortran: Ignore use statements on error [PR107426]
This fixes an access to freed memory on the testcase from the PR.
The problem comes from an invalid subroutine statement in an interface,
which is ignored and causes the following statements forming the procedure
body to be rejected. One of them use-associates the intrinsic ISO_C_BINDING
module, which imports new symbols in a namespace that is freed at the time
the statement is rejected. However, this creates dangling pointers as
ISO_C_BINDING is special and its import creates a reference to the imported
C_PTR symbol in the return type of the global intrinsic symbol for C_LOC
(see the function create_intrinsic_function).
This change saves and restores the list of use statements, so that rejected
use statements are removed before they have a chance to be applied to the
current namespace and create dangling pointers.
PR fortran/107426
gcc/fortran/ChangeLog:
* gfortran.h (gfc_save_module_list, gfc_restore_old_module_list):
New declarations.
* module.cc (old_module_list_tail): New global variable.
(gfc_save_module_list, gfc_restore_old_module_list): New functions.
(gfc_use_modules): Set module_list and old_module_list_tail.
* parse.cc (next_statement): Save module_list before doing any work.
(reject_statement): Restore module_list to its saved value.
Mikael Morin [Fri, 22 Mar 2024 11:32:34 +0000 (12:32 +0100)]
fortran: Fix specification expression error with dummy procedures [PR111781]
This fixes a spurious invalid variable in specification expression error.
The error was caused on the testcase from the PR by two different bugs.
First, the call to is_parent_of_current_ns was unable to recognize
correct host association and returned false. Second, an ad-hoc
condition coming next was using a global variable previously improperly
restored to false (instead of restoring it to its initial value). The
latter happened on the testcase because one dummy argument was a procedure,
and checking that argument what causing a check of all its arguments with
the (improper) reset of the flag at the end, and that preceded the check of
the next argument.
For the first bug, the wrong result of is_parent_of_current_ns is fixed by
correcting the namespaces that function deals with, both the one passed
as argument and the current one tracked in the gfc_current_ns global. Two
new functions are introduced to select the right namespace.
Regarding the second bug, the problematic condition is removed, together
with the formal_arg_flag associated with it. Indeed, that condition was
(wrongly) allowing local variables to be used in array bounds of dummy
arguments.
PR fortran/111781
gcc/fortran/ChangeLog:
* symbol.cc (gfc_get_procedure_ns, gfc_get_spec_ns): New functions.
* gfortran.h (gfc_get_procedure_ns, gfc_get_spec ns): Declare them.
(gfc_is_formal_arg): Remove.
* expr.cc (check_restricted): Remove special case allowing local
variable in dummy argument bound expressions. Use gfc_get_spec_ns
to get the right namespace.
* resolve.cc (gfc_is_formal_arg, formal_arg_flag): Remove.
(gfc_resolve_formal_arglist): Set gfc_current_ns. Quit loop and
restore gfc_current_ns instead of early returning.
(resolve_symbol): Factor common array spec resolution code to...
(resolve_symbol_array_spec): ... this new function. Additionnally
set and restore gfc_current_ns.
gcc/testsuite/ChangeLog:
* gfortran.dg/spec_expr_8.f90: New test.
* gfortran.dg/spec_expr_9.f90: New test.